[
  {
    "path": ".clang-format",
    "content": "---\nBasedOnStyle: Google\n---\nLanguage:               Cpp\nCpp11BracedListStyle:   true\nStandard:               Cpp11\nDerivePointerAlignment: false\nPointerAlignment:       Right\n---\nLanguage: Java\nJavaImportGroups: [ 'java', 'javax', 'javafx', 'org', 'io', 'com', 'de.gsi' ]\nAccessModifierOffset: -4\nAlignAfterOpenBracket: DontAlign\nAlignConsecutiveAssignments: false\nAlignConsecutiveDeclarations: false\nAlignEscapedNewlines: DontAlign\nAlignTrailingComments: false\nAllowAllParametersOfDeclarationOnNextLine: true\nAllowShortLambdasOnASingleLine: None\nAllowShortCaseLabelsOnASingleLine: false\nAllowShortFunctionsOnASingleLine: None\nAllowShortIfStatementsOnASingleLine: Never\nAllowShortLoopsOnASingleLine: false\nAlwaysBreakAfterReturnType: None\nAlwaysBreakBeforeMultilineStrings: false\nAlwaysBreakTemplateDeclarations: Yes\nBinPackArguments: true\nBinPackParameters: true\nBraceWrapping:\n  AfterClass: false\n  AfterControlStatement: Never\n  AfterEnum: false\n  AfterFunction: false\n  AfterNamespace: false\n  AfterObjCDeclaration: false\n  AfterStruct: false\n  AfterUnion: false\n  BeforeCatch: false\n  BeforeElse: false\n  IndentBraces: false\n  SplitEmptyFunction: true\n  SplitEmptyRecord: true\n  SplitEmptyNamespace: true\nBreakBeforeBinaryOperators: All\nBreakBeforeBraces: Custom\nBreakBeforeInheritanceComma: false\nBreakBeforeTernaryOperators: true\nBreakConstructorInitializersBeforeComma: false\nBreakConstructorInitializers: BeforeComma\nBreakAfterJavaFieldAnnotations: true\nBreakStringLiterals: true\nColumnLimit: 0\nCommentPragmas: '^ IWYU pragma:'\nCompactNamespaces: false\nConstructorInitializerAllOnOneLineOrOnePerLine: false\nConstructorInitializerIndentWidth: 4\nContinuationIndentWidth: 8\nCpp11BracedListStyle: false\nDerivePointerAlignment: false\nDisableFormat: false\nExperimentalAutoDetectBinPacking: false\nFixNamespaceComments: true\nForEachMacros:\n  - forever # avoids { wrapped to next line\n  - foreach\n  - Q_FOREACH\n  - 
BOOST_FOREACH\nIncludeCategories:\n  - Regex: '^<Q.*'\n    Priority: 200\nIncludeIsMainRegex: '(Test)?$'\nIndentCaseLabels: false\nIndentWidth: 4\nIndentWrappedFunctionNames: false\nJavaScriptQuotes: Leave\nJavaScriptWrapImports: true\nKeepEmptyLinesAtTheStartOfBlocks: false\n# Do not add QT_BEGIN_NAMESPACE/QT_END_NAMESPACE as this will indent lines in between.\nMacroBlockBegin: \"\"\nMacroBlockEnd: \"\"\nMaxEmptyLinesToKeep: 1\nNamespaceIndentation: None\nObjCBlockIndentWidth: 4\nObjCSpaceAfterProperty: false\nObjCSpaceBeforeProtocolList: true\nPenaltyBreakAssignment: 150\nPenaltyBreakBeforeFirstCallParameter: 300\nPenaltyBreakComment: 500\nPenaltyBreakFirstLessLess: 400\nPenaltyBreakString: 600\nPenaltyExcessCharacter: 50\nPenaltyReturnTypeOnItsOwnLine: 300\nPointerAlignment: Right\nReflowComments: true\nSortIncludes: true\nSortUsingDeclarations: true\nSpaceAfterCStyleCast: true\nSpaceAfterTemplateKeyword: false\nSpaceBeforeAssignmentOperators: true\nSpaceBeforeParens: ControlStatements\nSpaceInEmptyParentheses: false\nSpacesBeforeTrailingComments: 1\nSpacesInAngles: false\nSpacesInContainerLiterals: false\nSpacesInCStyleCastParentheses: false\nSpacesInParentheses: false\nSpacesInSquareBrackets: false\nStandard: c++17\nTabWidth: 4\nUseTab: Never"
  },
  {
    "path": ".clang-tidy",
    "content": "---\n# NOTE there must be no spaces before the '-', so put the comma last.\n# The check bugprone-unchecked-optional-access is also turned off atm\n# because it causes clang-tidy to hang randomly. The tracking issue\n# can be found at https://github.com/llvm/llvm-project/issues/69369.\n#\n# Modified from\n# https://github.com/pytorch/pytorch/blob/main/.clang-tidy\nInheritParentConfig: true\nChecks: '\nbugprone-*,\n-bugprone-easily-swappable-parameters,\n-bugprone-forward-declaration-namespace,\n-bugprone-implicit-widening-of-multiplication-result,\n-bugprone-macro-parentheses,\n-bugprone-lambda-function-name,\n-bugprone-narrowing-conversions,\n-bugprone-reserved-identifier,\n-bugprone-swapped-arguments,\n-bugprone-unchecked-optional-access,\nclang-diagnostic-missing-prototypes,\ncppcoreguidelines-*,\n-cppcoreguidelines-avoid-const-or-ref-data-members,\n-cppcoreguidelines-avoid-do-while,\n-cppcoreguidelines-avoid-magic-numbers,\n-cppcoreguidelines-avoid-non-const-global-variables,\n-cppcoreguidelines-interfaces-global-init,\n-cppcoreguidelines-macro-usage,\n-cppcoreguidelines-narrowing-conversions,\n-cppcoreguidelines-owning-memory,\n-cppcoreguidelines-pro-bounds-array-to-pointer-decay,\n-cppcoreguidelines-pro-bounds-constant-array-index,\n-cppcoreguidelines-pro-bounds-pointer-arithmetic,\n-cppcoreguidelines-pro-type-const-cast,\n-cppcoreguidelines-pro-type-cstyle-cast,\n-cppcoreguidelines-pro-type-reinterpret-cast,\n-cppcoreguidelines-pro-type-static-cast-downcast,\n-cppcoreguidelines-pro-type-union-access,\n-cppcoreguidelines-pro-type-vararg,\n-cppcoreguidelines-special-member-functions,\n-cppcoreguidelines-non-private-member-variables-in-classes,\n-facebook-hte-RelativeInclude,\nhicpp-exception-baseclass,\nhicpp-avoid-goto,\nmisc-*,\n-misc-const-correctness,\n-misc-include-cleaner,\n-misc-use-anonymous-namespace,\n-misc-unused-parameters,\n-misc-no-recursion,\n-misc-non-private-member-variables-in-classes,\n-misc-confusable-identifiers,\nmodernize-*
,\n-modernize-macro-to-enum,\n-modernize-pass-by-value,\n-modernize-return-braced-init-list,\n-modernize-use-auto,\n-modernize-use-default-member-init,\n-modernize-use-using,\n-modernize-use-trailing-return-type,\n-modernize-use-nodiscard,\nperformance-*,\nreadability-container-size-empty,\nreadability-delete-null-pointer,\nreadability-duplicate-include,\nreadability-misplaced-array-index,\nreadability-redundant-function-ptr-dereference,\nreadability-redundant-smartptr-get,\nreadability-simplify-subscript-expr,\nreadability-string-compare,\n'\nWarningsAsErrors: '*'\n...\n"
  },
  {
    "path": ".flake8",
    "content": "[flake8]\nshow-source=true\nstatistics=true\nmax-line-length = 120\n\nexclude =\n  .git,\n  ./cmake,\n"
  },
  {
    "path": ".github/scripts/.gitignore",
    "content": "Makefile\n*.jar\nhs_err_pid*.log\n"
  },
  {
    "path": ".github/scripts/as-cmake-sub-project/CMakeLists.txt",
    "content": "cmake_minimum_required(VERSION 3.13 FATAL_ERROR)\n\nproject(use-of-sherpa-onnx-as-a-sub-project)\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/sherpa-onnx/setup.py\")\n  message(FATAL_ERROR \"Please download the source code of sherpa-onnx and put it inside this directory\")\nendif()\n\nset(CMAKE_ARCHIVE_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/lib\")\nset(CMAKE_LIBRARY_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/lib\")\nset(CMAKE_RUNTIME_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/bin\")\n\ninclude_directories(./sherpa-onnx)\nadd_subdirectory(./sherpa-onnx)\n\nadd_executable(main main.cc)\ntarget_link_libraries(main sherpa-onnx-core)\n"
  },
  {
    "path": ".github/scripts/as-cmake-sub-project/main.cc",
    "content": "#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nint main(int32_t argc, char *argv[]) {\n  sherpa_onnx::ParseOptions po(\"help info\");\n  sherpa_onnx::OfflineRecognizerConfig config;\n  config.Register(&po);\n  po.PrintUsage();\n  return 0;\n}\n"
  },
  {
    "path": ".github/scripts/export-ascend/__init__.py",
    "content": ""
  },
  {
    "path": ".github/scripts/export-ascend/generate_paraformer.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport itertools\nimport json\nfrom dataclasses import asdict, dataclass\n\nfrom generate_zipformer_ctc_20250703 import get_cann_version, get_image, get_soc_version\n\n\n@dataclass\nclass Config:\n    # 7.0, 8.0, 8.1, 8.2\n    cann: str\n\n    # 910B, 910B2, 910B3, 310P3\n    soc_version: str\n\n    # FunASR, WSChuan-ASR\n    framework: str\n\n    image: str = \"\"\n\n    def __post_init__(self):\n        self.image = get_image(self.cann, soc_version=self.soc_version)\n\n\ndef main():\n    cann_version = get_cann_version()\n    soc_version = get_soc_version()\n    framework_list = [\"FunASR\", \"WSChuan-ASR\"]\n\n    configs = [\n        Config(cann=cann, soc_version=soc, framework=framework)\n        for cann, soc, framework in itertools.product(\n            cann_version, soc_version, framework_list\n        )\n    ]\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-ascend/generate_sense_voice.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport itertools\nimport json\nfrom dataclasses import asdict, dataclass\n\nfrom generate_zipformer_ctc_20250703 import get_image, get_soc_version, get_cann_version\n\n\n@dataclass\nclass Config:\n    # 7.0, 8.0, 8.1, 8.2\n    cann: str\n\n    # 910B, 910B2, 910B3, 310P3\n    soc_version: str\n\n    # FunASR, WSYue-ASR\n    framework: str\n\n    image: str = \"\"\n\n    def __post_init__(self):\n        self.image = get_image(self.cann, soc_version=self.soc_version)\n\n\ndef main():\n    cann_version = get_cann_version()\n    soc_version = get_soc_version()\n    framework_list = [\"FunASR\", \"WSYue-ASR\"]\n\n    configs = [\n        Config(cann=cann, soc_version=soc, framework=framework)\n        for cann, soc, framework in itertools.product(\n            cann_version, soc_version, framework_list\n        )\n    ]\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-ascend/generate_whisper.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2026  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport itertools\nimport json\nfrom dataclasses import asdict, dataclass\n\nfrom generate_zipformer_ctc_20250703 import get_image, get_soc_version, get_cann_version\n\n\n@dataclass\nclass Config:\n    # 7.0, 8.0, 8.1, 8.2\n    cann: str\n\n    # 910B, 910B2, 910B3, 310P3\n    soc_version: str\n\n    model: str\n\n    image: str = \"\"\n\n    def __post_init__(self):\n        self.image = get_image(self.cann, soc_version=self.soc_version)\n\n\ndef main():\n    cann_version = get_cann_version()\n    soc_version = get_soc_version()\n    model_list = [\n        \"turbo\",\n        \"distil-medium.en\",\n        \"distil-small.en\",\n        \"tiny.en\",\n        \"base.en\",\n        \"small.en\",\n        \"medium.en\",\n        \"tiny\",\n        \"base\",\n        \"small\",\n        \"medium\",\n        \"medium-aishell\",\n    ]\n\n    configs = [\n        Config(cann=cann, soc_version=soc, model=model)\n        for cann, soc, model in itertools.product(cann_version, soc_version, model_list)\n    ]\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-ascend/generate_zipformer_ctc_20250703.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport itertools\nimport json\nfrom dataclasses import asdict, dataclass\n\n\n# image: ascendai/cann:latest\n# image: ascendai/cann:8.1.rc1-910b-ubuntu22.04-py3.10\n# see https://hub.docker.com/r/gpustack/ascendai-cann/tags?name=8.0\n# see https://hub.docker.com/r/gpustack/devel-ascendai-cann/tags?name=310p\n# and\n# https://quay.io/repository/ascend/cann?tab=tags\ndef get_image(cann: str, soc_version: str):\n    cann2image_910 = {\n        \"7.0\": \"quay.io/ascend/cann:7.0.1.beta1-910b-ubuntu22.04-py3.8\",\n        \"8.0\": \"gpustack/ascendai-cann:8.0.RC3-910b-ubuntu20.04-py3.9\",\n        \"8.1\": \"gpustack/devel-ascendai-cann:8.1.rc1.beta1-910b-ubuntu20.04-v2\",\n        \"8.2\": \"gpustack/devel-ascendai-cann:8.2.rc1-910b-ubuntu20.04-v2\",\n        \"8.3\": \"quay.io/ascend/cann:8.3.rc2-910b-ubuntu22.04-py3.11\",\n        \"8.5\": \"quay.io/ascend/cann:8.5.0-910b-ubuntu22.04-py3.11\",\n    }\n\n    cann2image_310 = {\n        \"7.0\": \"quay.io/ascend/cann:7.0.1-310p-ubuntu22.04-py3.9\",\n        \"8.0\": \"gpustack/devel-ascendai-cann:8.0.rc3.beta1-310p-ubuntu20.04-v2\",\n        \"8.1\": \"gpustack/devel-ascendai-cann:8.1.rc1.beta1-310p-ubuntu20.04-v2\",\n        \"8.2\": \"gpustack/devel-ascendai-cann:8.2.rc1-310p-ubuntu20.04-v2\",\n        \"8.3\": \"quay.io/ascend/cann:8.3.rc2-310p-ubuntu22.04-py3.11\",\n        \"8.5\": \"quay.io/ascend/cann:8.5.0-310p-ubuntu22.04-py3.11\",\n    }\n\n    if \"910\" in soc_version:\n        return cann2image_910[cann]\n    elif \"310\" in soc_version:\n        return cann2image_310[cann]\n    else:\n        raise ValueError(f\"Unsupported soc_version {soc_version}\")\n\n\ndef get_soc_version():\n    soc_version = [\"910B\", \"910B2\", \"910B3\", \"910B4\", \"310P3\"]\n    return soc_version\n\n\ndef get_cann_version():\n    cann_version = [\"7.0\", \"8.0\", \"8.1\", \"8.2\", \"8.3\", \"8.5\"]\n    return 
cann_version\n\n\n@dataclass\nclass Config:\n    # 7.0, 8.0, 8.1, 8.2, 8.3, 8.5\n    cann: str\n\n    # 910B, 910B2, 910B3, 910B4, 310P3\n    soc_version: str\n\n    num_seconds: str\n\n    image: str = \"\"\n\n    def __post_init__(self):\n        self.image = get_image(self.cann, soc_version=self.soc_version)\n\n\ndef main():\n    cann_version = get_cann_version()\n    soc_version = get_soc_version()\n    input_in_seconds = [\"5\", \"8\", \"10\", \"13\", \"15\", \"18\", \"20\", \"23\", \"25\", \"28\", \"30\"]\n\n    configs = [\n        Config(cann=cann, soc_version=soc, num_seconds=sec)\n        for cann, soc, sec in itertools.product(\n            cann_version, soc_version, input_in_seconds\n        )\n    ]\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-qnn/__init__.py",
    "content": ""
  },
  {
    "path": ".github/scripts/export-qnn/generate_paraformer.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport json\n\nfrom device_info import soc_info_dict\nfrom dataclasses import asdict, dataclass\nimport itertools\n\n\n@dataclass\nclass Config:\n    soc: str  # SM8850\n    soc_id: int  # 87\n    arch: str  # v81\n    input_in_seconds: str\n    framework: str\n\n\ndef main():\n\n    input_in_seconds = [\"5\", \"8\", \"10\", \"13\", \"15\", \"18\", \"20\", \"23\", \"25\", \"28\", \"30\"]\n    framework_list = [\"FunASR\", \"WSChuan-ASR\"]\n\n    configs = []\n\n    for name, soc in soc_info_dict.items():\n        for num_seconds, framework in itertools.product(\n            input_in_seconds, framework_list\n        ):\n            configs.append(\n                Config(\n                    soc=name,\n                    soc_id=soc.model.value,\n                    arch=soc.info.arch.name,\n                    input_in_seconds=num_seconds,\n                    framework=framework,\n                )\n            )\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-qnn/generate_sense_voice.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport json\n\nfrom device_info import soc_info_dict\nfrom dataclasses import asdict, dataclass\nimport itertools\n\n\n@dataclass\nclass Config:\n    soc: str  # SM8850\n    soc_id: int  # 87\n    arch: str  # v81\n    input_in_seconds: str\n    framework: str\n\n\ndef main():\n\n    input_in_seconds = [\"5\", \"8\", \"10\", \"13\", \"15\", \"18\", \"20\", \"23\", \"25\", \"28\", \"30\"]\n    framework_list = [\"FunASR\", \"WSYue-ASR\"]\n\n    configs = []\n\n    for name, soc in soc_info_dict.items():\n        for num_seconds, framework in itertools.product(\n            input_in_seconds, framework_list\n        ):\n            configs.append(\n                Config(\n                    soc=name,\n                    soc_id=soc.model.value,\n                    arch=soc.info.arch.name,\n                    input_in_seconds=num_seconds,\n                    framework=framework,\n                )\n            )\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/export-qnn/generate_zipformer.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport json\n\nfrom device_info import soc_info_dict\nfrom dataclasses import asdict, dataclass\nimport itertools\n\n\n@dataclass\nclass Config:\n    soc: str  # SM8850\n    soc_id: int  # 87\n    arch: str  # v81\n    input_in_seconds: str\n    model_name: str\n\n\ndef main():\n\n    input_in_seconds = [\"5\", \"8\", \"10\", \"13\", \"15\", \"18\", \"20\", \"23\", \"25\", \"28\", \"30\"]\n    model_name_list = [\"20250703\", \"20251222\"]\n\n    configs = []\n\n    for name, soc in soc_info_dict.items():\n        for num_seconds, model_name in itertools.product(\n            input_in_seconds, model_name_list\n        ):\n            if model_name == \"20251222\":\n                if num_seconds not in [\"5\"]:\n                    # TODO(fangjun): We only upload model-5-seconds.onnx right now\n                    continue\n\n            configs.append(\n                Config(\n                    soc=name,\n                    soc_id=soc.model.value,\n                    arch=soc.info.arch.name,\n                    input_in_seconds=num_seconds,\n                    model_name=model_name,\n                )\n            )\n\n    ans = [asdict(c) for c in configs]\n\n    print(json.dumps({\"include\": ans}))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": ".github/scripts/node-addon/README-optional.md",
    "content": "# Introduction\n\nPlease see [sherpa-onnx-node](https://www.npmjs.com/package/sherpa-onnx-node)\n"
  },
  {
    "path": ".github/scripts/node-addon/README.md",
    "content": "# Introduction\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/nodejs-addon-examples/README.md\nfor usages.\n\n\n||Method|Support multiple threads|Minimum required node version|\n|---|---|---|---|\n|this package| https://github.com/nodejs/node-addon-api | Yes | v16|\n|https://www.npmjs.com/package/sherpa-onnx| WebAssembly | No | v18|\n"
  },
  {
    "path": ".github/scripts/node-addon/index.js",
    "content": "module.exports = require('./sherpa-onnx.node');\n"
  },
  {
    "path": ".github/scripts/node-addon/notes.md",
    "content": "# Introduction\n\nSee also\n\n  - https://github.com/WonderInventions/node-webrtc/blob/develop/package.json\n  - https://stackoverflow.com/questions/15176082/npm-package-json-os-specific-dependency\n  - https://github.com/WonderInventions/node-webrtc/blob/develop/lib/binding.js\n  - cross-compiling https://github.com/nodejs/node-gyp/issues/829#issuecomment-665527032\n  - https://nodejs.github.io/node-addon-examples/build-tools/cmake-js\n"
  },
  {
    "path": ".github/scripts/node-addon/package-optional.json",
    "content": "{\n  \"name\": \"sherpa-onnx-PLATFORM2-ARCH\",\n  \"version\": \"SHERPA_ONNX_VERSION\",\n  \"description\": \"Speech-to-text, text-to-speech, speaker diarization, and speech enhancement using Next-gen Kaldi without internet connection\",\n  \"main\": \"index.js\",\n  \"scripts\": {\n    \"test\": \"echo \\\"Error: no test specified\\\" && exit 1\"\n  },\n  \"repository\": {\n    \"type\": \"git\",\n    \"url\": \"git+https://github.com/k2-fsa/sherpa-onnx.git\"\n  },\n  \"keywords\": [\n    \"speech to text\",\n    \"text to speech\",\n    \"transcription\",\n    \"real-time speech recognition\",\n    \"without internet connection\",\n    \"locally\",\n    \"local\",\n    \"embedded systems\",\n    \"open source\",\n    \"diarization\",\n    \"speaker diarization\",\n    \"speaker recognition\",\n    \"speaker\",\n    \"speaker segmentation\",\n    \"speaker verification\",\n    \"spoken language identification\",\n    \"sherpa\",\n    \"zipformer\",\n    \"asr\",\n    \"tts\",\n    \"stt\",\n    \"c++\",\n    \"onnxruntime\",\n    \"onnx\",\n    \"ai\",\n    \"next-gen kaldi\",\n    \"offline\",\n    \"privacy\",\n    \"open source\",\n    \"streaming speech recognition\",\n    \"speech\",\n    \"recognition\",\n    \"vad\",\n    \"node-addon-api\",\n    \"speaker id\",\n    \"language id\",\n    \"speech enhancement\",\n    \"denoising\"\n  ],\n  \"author\": \"The next-gen Kaldi team\",\n  \"license\": \"Apache-2.0\",\n  \"bugs\": {\n    \"url\": \"https://github.com/k2-fsa/sherpa-onnx/issues\"\n  },\n  \"homepage\": \"https://github.com/k2-fsa/sherpa-onnx#readme\",\n   \"os\": [\n    \"PLATFORM\"\n  ],\n  \"cpu\": [\n    \"ARCH\"\n  ]\n}\n"
  },
  {
    "path": ".github/scripts/node-addon/package.json",
    "content": "{\n  \"name\": \"sherpa-onnx-node\",\n  \"version\": \"SHERPA_ONNX_VERSION\",\n  \"description\": \"Speech-to-text, text-to-speech, speaker diarization, and speech enhancement using Next-gen Kaldi without internet connection\",\n  \"main\": \"sherpa-onnx.js\",\n  \"scripts\": {\n    \"test\": \"echo \\\"Error: no test specified\\\" && exit 1\"\n  },\n  \"repository\": {\n    \"type\": \"git\",\n    \"url\": \"git+https://github.com/k2-fsa/sherpa-onnx.git\"\n  },\n  \"keywords\": [\n    \"speech to text\",\n    \"text to speech\",\n    \"transcription\",\n    \"real-time speech recognition\",\n    \"without internet connection\",\n    \"locally\",\n    \"local\",\n    \"embedded systems\",\n    \"open source\",\n    \"diarization\",\n    \"speaker diarization\",\n    \"speaker recognition\",\n    \"speaker\",\n    \"speaker segmentation\",\n    \"speaker verification\",\n    \"spoken language identification\",\n    \"sherpa\",\n    \"zipformer\",\n    \"asr\",\n    \"tts\",\n    \"stt\",\n    \"c++\",\n    \"onnxruntime\",\n    \"onnx\",\n    \"ai\",\n    \"next-gen kaldi\",\n    \"offline\",\n    \"privacy\",\n    \"open source\",\n    \"streaming speech recognition\",\n    \"speech\",\n    \"recognition\",\n    \"vad\",\n    \"node-addon-api\",\n    \"speaker id\",\n    \"language id\",\n    \"speech enhancement\",\n    \"denoising\"\n  ],\n  \"author\": \"The next-gen Kaldi team\",\n  \"license\": \"Apache-2.0\",\n  \"bugs\": {\n    \"url\": \"https://github.com/k2-fsa/sherpa-onnx/issues\"\n  },\n  \"homepage\": \"https://github.com/k2-fsa/sherpa-onnx#readme\",\n  \"optionalDependencies\": {\n    \"sherpa-onnx-darwin-arm64\": \"^SHERPA_ONNX_VERSION\",\n    \"sherpa-onnx-darwin-x64\": \"^SHERPA_ONNX_VERSION\",\n    \"sherpa-onnx-linux-x64\": \"^SHERPA_ONNX_VERSION\",\n    \"sherpa-onnx-linux-arm64\": \"^SHERPA_ONNX_VERSION\",\n    \"sherpa-onnx-win-x64\": \"^SHERPA_ONNX_VERSION\",\n    \"sherpa-onnx-win-ia32\": \"^SHERPA_ONNX_VERSION\"\n  }\n}\n"
  },
  {
    "path": ".github/scripts/test-audio-tagging.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run zipformer for audio tagging                             \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrepo=sherpa-onnx-zipformer-audio-tagging-2024-04-09\nls -lh $repo\n\nfor w in 1.wav 2.wav 3.wav 4.wav; do\n  $EXE \\\n    --zipformer-model=$repo/model.onnx \\\n    --labels=$repo/class_labels_indices.csv \\\n    $repo/test_wavs/$w\ndone\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-c-api.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"SLID_EXE is $SLID_EXE\"\necho \"SID_EXE is $SID_EXE\"\necho \"AT_EXE is $AT_EXE\"\necho \"PUNCT_EXE is $PUNCT_EXE\"\necho \"PATH: $PATH\"\n\nlog \"------------------------------------------------------------\"\nlog \"Test adding punctuations                                    \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nls -lh\ntar xf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nls -lh sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n$PUNCT_EXE\nrm -rf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\n\nlog \"------------------------------------------------------------\"\nlog \"Test audio tagging                                          \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n\n$AT_EXE\n\nrm -rf sherpa-onnx-zipformer-audio-tagging-2024-04-09\n\n\nlog \"------------------------------------------------------------\"\nlog \"Download whisper tiny for spoken language identification    \"\nlog \"------------------------------------------------------------\"\n\nrm -rf sherpa-onnx-whisper-tiny*\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\ntar xvf 
sherpa-onnx-whisper-tiny.tar.bz2\nrm sherpa-onnx-whisper-tiny.tar.bz2\n\n$SLID_EXE\n\nrm -rf sherpa-onnx-whisper-tiny*\n\nlog \"------------------------------------------------------------\"\nlog \"Download file for speaker identification and verification   \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\ngit clone https://github.com/csukuangfj/sr-data\n\n$SID_EXE\n\nrm -fv *.onnx\nrm -rf sr-data\n"
  },
  {
    "path": ".github/scripts/test-cxx-api.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"CXX_STREAMING_ZIPFORMER_EXE is $CXX_STREAMING_ZIPFORMER_EXE\"\necho \"CXX_WHISPER_EXE is $CXX_WHISPER_EXE\"\necho \"CXX_SENSE_VOICE_EXE is $CXX_SENSE_VOICE_EXE\"\necho \"PATH: $PATH\"\n\nlog \"------------------------------------------------------------\"\nlog \"Test streaming zipformer CXX API\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n$CXX_STREAMING_ZIPFORMER_EXE\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\nlog \"------------------------------------------------------------\"\nlog \"Test Whisper CXX API\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n$CXX_WHISPER_EXE\nrm -rf sherpa-onnx-whisper-tiny.en\n\nlog \"------------------------------------------------------------\"\nlog \"Test SenseVoice CXX API\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n$CXX_SENSE_VOICE_EXE\nrm -rf sherpa-onnx-sense-voice-*\n"
  },
  {
    "path": ".github/scripts/test-dart.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ncd dart-api-examples\n\npushd speech-enhancement-gtcrn\necho \"speech enhancement with gtcrn models\"\n./run.sh\nls -lh\npopd\n\npushd speech-enhancement-dpdfnet\necho \"speech enhancement with dpdfnet models\"\n./run.sh\nls -lh\npopd\n\npushd streaming-speech-enhancement-gtcrn\necho \"streaming speech enhancement with gtcrn models\"\n./run.sh\nls -lh\npopd\n\npushd streaming-speech-enhancement-dpdfnet\necho \"streaming speech enhancement with dpdfnet models\"\n./run.sh\nls -lh\npopd\n\npushd non-streaming-asr\n\necho '----------Moonshine v2----------'\n./run-moonshine-v2.sh\nrm -rf sherpa-onnx-*\n\necho '----------FireRedASR CTC----------'\n./run-fire-red-asr-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------FunASR Nano----------'\n./run-funasr-nano.sh\nrm -rf sherpa-onnx-*\n\necho '----------MedASR CTC----------'\n./run-medasr-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------Omnilingual ASR CTC----------'\n./run-omnilingual-asr-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------Wenet CTC----------'\n./run-wenet-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------Zipformer CTC----------'\n./run-zipformer-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------SenseVoice----------'\n./run-sense-voice-with-hr.sh\n./run-sense-voice.sh\nrm -rf sherpa-onnx-*\n\necho '----------FireRedAsr----------'\n./run-fire-red-asr.sh\nrm -rf sherpa-onnx-fire-red-asr-*\n\necho '----------NeMo transducer----------'\n./run-nemo-transducer.sh\nrm -rf sherpa-onnx-*\n\necho '----------Dolphin CTC----------'\n./run-dolphin-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------NeMo CTC----------'\n./run-nemo-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------TeleSpeech CTC----------'\n./run-telespeech-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------moonshine----------'\n./run-moonshine.sh\nrm -rf sherpa-onnx-*\n\necho '----------whisper----------'\n./run-whisper.sh\nrm -rf sherpa-onnx-*\n\necho '----------zipformer 
transducer----------'\n./run-zipformer-transducer.sh\nrm -rf sherpa-onnx-*\n\necho '----------paraformer itn----------'\n./run-paraformer-itn.sh\n\necho '----------paraformer----------'\n./run-paraformer.sh\nrm -rf sherpa-onnx-*\n\necho '----------VAD with paraformer----------'\n./run-vad-with-paraformer.sh\nrm -rf sherpa-onnx-*\n\npopd # non-streaming-asr\n\npushd tts\n\necho '----------tts----------'\n./run-pocket-en.sh\n./run-kitten-en.sh\n./run-supertonic-en.sh\n./run-kokoro-zh-en.sh\n./run-kokoro-en.sh\n./run-matcha-zh.sh\n./run-matcha-en.sh\n./run-zipvoice-zh-en.sh\nls -lh *.wav\nrm -rf matcha-icefall-*\nrm -rf sherpa-onnx-zipvoice-*\nrm *.onnx\n\necho '----------piper tts----------'\n./run-piper.sh\nrm -rf vits-piper-*\n\necho '----------coqui tts----------'\n./run-coqui.sh\nrm -rf vits-coqui-*\n\necho '----------zh tts----------'\n./run-vits-zh.sh\nrm -rf sherpa-onnx-*\n\nls -lh *.wav\n\npopd # tts\n\npushd spoken-language-identification\n./run-whisper.sh\npopd\n\npushd streaming-asr\n\necho '----------streaming T-one ctc----------'\n./run-t-one-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------streaming zipformer ctc HLG----------'\n./run-zipformer-ctc-hlg.sh\nrm -rf sherpa-onnx-*\n\necho '----------streaming zipformer ctc----------'\n./run-zipformer-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------streaming zipformer transducer----------'\n./run-zipformer-transducer-itn.sh\n./run-zipformer-transducer.sh\nrm -f itn*\nrm -rf sherpa-onnx-*\n\necho '----------streaming NeMo transducer----------'\n./run-nemo-transducer.sh\nrm -rf sherpa-onnx-*\n\necho '----------streaming paraformer----------'\n./run-paraformer.sh\nrm -rf sherpa-onnx-*\n\npopd # streaming-asr\n\npushd vad\n./run-ten-vad.sh\n./run.sh\nrm *.onnx\npopd\n\npushd speaker-diarization\necho '----------speaker diarization----------'\n./run.sh\npopd\n\npushd speaker-identification\necho '----------3d speaker----------'\n./run-3d-speaker.sh\npopd\n\npushd add-punctuations\necho '----------CT 
Transformer----------'\n./run-ct-transformer.sh\npopd\n\npushd audio-tagging\necho '----------zipformer----------'\n./run-zipformer.sh\n\necho '----------ced----------'\n./run-ced.sh\npopd\n\npushd vad-with-non-streaming-asr\n\necho '----------Zipformer CTC----------'\n./run-zipformer-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------Dolphin CTC----------'\n./run-dolphin-ctc.sh\nrm -rf sherpa-onnx-*\n\necho '----------TeleSpeech CTC----------'\n./run-telespeech-ctc.sh\nrm -rf sherpa-onnx-*\n\necho \"----zipformer transducer----\"\n./run-zipformer-transducer.sh\nrm -rf sherpa-onnx-*\n\necho \"----moonshine----\"\n./run-moonshine.sh\nrm -rf sherpa-onnx-*\n\necho \"----whisper----\"\n./run-whisper.sh\nrm -rf sherpa-onnx-*\n\necho \"----paraformer----\"\n./run-paraformer.sh\nrm -rf sherpa-onnx-*\n\necho \"----SenseVoice zh----\"\n./run-sense-voice-zh-2.sh\n./run-sense-voice-zh.sh\nrm -rf sherpa-onnx-*\n\necho \"----SenseVoice en----\"\n./run-sense-voice-en.sh\nrm -rf sherpa-onnx-*\n\npopd\n\npushd keyword-spotter\n./run-zh.sh\npopd\n"
  },
  {
    "path": ".github/scripts/test-dot-net.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ncd dotnet-examples/\n\ncd ./supertonic-tts\n./run.sh\nls -lh\nrm -rf sherpa-onnx-supertonic-*\n\ncd ../non-streaming-moonshine-v2-decode-files\n./run.sh\nrm -rf sherpa-onnx-moonshine-*\n\ncd ../offline-decode-files\n\n./run-fire-red-asr-ctc.sh\nrm -rf sherpa-onnx-fire-*\n\n./run-medasr-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-omnilingual-asr-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-wenet-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-zipformer-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-dolphin-ctc.sh\nrm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\n./run-fire-red-asr.sh\nrm -rf sherpa-onnx-fire-red-asr-*\n\n./run-moonshine.sh\nrm -rf sherpa-onnx-*\n\n./run-sense-voice-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-paraformer-itn.sh\nrm -rf sherpa-onnx-*\n\n./run-telespeech-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-nemo-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-paraformer.sh\nrm -rf sherpa-onnx-*\n\n./run-zipformer.sh\nrm -rf sherpa-onnx-*\n\n./run-hotwords.sh\nrm -rf sherpa-onnx-*\n\n./run-whisper.sh\nrm -rf sherpa-onnx-*\n\n# ./run-whisper-large-v3.sh\n# rm -rf sherpa-onnx-*\n\n./run-tdnn-yesno.sh\nrm -rf sherpa-onnx-*\n\ncd ../pocket-tts-zero-shot\n./run.sh\nls -lh\nrm -rf sherpa-onnx-pocket-*\n\ncd ../zipvoice-tts\n./run.sh\nls -lh\nrm -rf sherpa-onnx-zipvoice-*\nrm -f vocos_24khz.onnx\n\ncd ../vad-non-streaming-funasr-nano\n./run-ten-vad.sh\nrm -fv *.onnx\n\n./run.sh\nrm -fv *.onnx\n\ncd ../non-streaming-funasr-nano-decode-files\n./run.sh\nls -lh\nrm -rf sherpa-onnx-funasr-*\n\ncd ../version-test\n./run.sh\nls -lh\n\ncd ../offline-audio-tagging\n./run.sh\nls -lh\nrm -rf sherpa-onnx-*\n\ncd ../kitten-tts\n./run-kitten.sh\nls -lh\nrm -rf kitten-nano-en-v0_1-fp16\n\ncd ../vad-non-streaming-asr-paraformer\n./run-ten-vad.sh\nrm -fv *.onnx\n\n./run.sh\nrm -fv *.onnx\n\ncd ../non-streaming-canary-decode-files\n\n./run.sh\nls -lh\nrm -rf sherpa-onnx-nemo-*\n\n\n\ncd ../speech-enhancement-gtcrn\n./run.sh\nls -lh\n\ncd 
../speech-enhancement-dpdfnet\n./run.sh\nls -lh\n\ncd ../streaming-speech-enhancement-gtcrn\n./run.sh\nls -lh\n\ncd ../streaming-speech-enhancement-dpdfnet\n./run.sh\nls -lh\n\ncd ../kokoro-tts\n./run-kokoro.sh\nls -lh\n\ncd ../offline-tts\n./run-matcha-zh.sh\nls -lh *.wav\n./run-matcha-en.sh\nls -lh *.wav\n./run-aishell3.sh\nls -lh *.wav\n./run-piper.sh\nls -lh *.wav\n./run-hf-fanchen.sh\nls -lh *.wav\nls -lh\n\npushd ../..\n\nmkdir tts\n\ncp -v dotnet-examples/kokoro-tts/*.wav ./tts\ncp -v dotnet-examples/offline-tts/*.wav ./tts\ncp -v dotnet-examples/supertonic-tts/*.wav ./tts\ncp -v dotnet-examples/zipvoice-tts/*.wav ./tts\npopd\n\ncd ../offline-speaker-diarization\n./run.sh\nrm -rfv *.onnx\nrm -fv *.wav\nrm -rfv sherpa-onnx-pyannote-*\n\ncd ../keyword-spotting-from-files\n./run.sh\n\ncd ../online-decode-files\n./run-t-one-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-transducer-itn.sh\nrm -rf sherpa-onnx-*\n\n./run-zipformer2-ctc.sh\nrm -rf sherpa-onnx-*\n\n./run-transducer.sh\nrm -rf sherpa-onnx-*\n\n./run-paraformer.sh\nrm -rf sherpa-onnx-*\n\ncd ../offline-punctuation\n./run.sh\nrm -rf sherpa-onnx-*\n\ncd ../speaker-identification\n./run.sh\n\ncd ../streaming-hlg-decoding/\n./run.sh\nrm -rf sherpa-onnx-*\n\ncd ../spoken-language-identification\n./run.sh\nrm -rf sherpa-onnx-*\n"
  },
  {
    "path": ".github/scripts/test-kws.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run Chinese keyword spotting (Wenetspeech）\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/pkufool/keyword-spotting-models/releases/download/v0.1/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz\nlog \"Start testing ${repo_url}\"\nrepo=sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n\nlog \"Download pretrained model and test-data from $repo_url\"\ncurl -SL -O $repo_url\ntar jxvf ${repo}.tar.bz\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --decoder=$repo/decoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --joiner=$repo/joiner-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --keywords-file=$repo/test_wavs/test_keywords.txt \\\n  --max-active-paths=4 \\\n  --num-threads=4 \\\n  $repo/test_wavs/3.wav $repo/test_wavs/4.wav $repo/test_wavs/5.wav $repo/test_wavs/6.wav\n\nrm -rf $repo\nrm -rf ${repo}.tar.bz\n\nlog \"------------------------------------------------------------\"\nlog \"Run English keyword spotting (Gigaspeech）\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/pkufool/keyword-spotting-models/releases/download/v0.1/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01.tar.bz\nlog \"Start testing ${repo_url}\"\nrepo=sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01\n\nlog \"Download pretrained model and test-data from $repo_url\"\ncurl -SL -O $repo_url\ntar jxvf ${repo}.tar.bz\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  
--decoder=$repo/decoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --joiner=$repo/joiner-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --keywords-file=$repo/test_wavs/test_keywords.txt \\\n  --max-active-paths=4 \\\n  --num-threads=4 \\\n  $repo/test_wavs/0.wav $repo/test_wavs/1.wav\n\nrm -rf $repo\nrm -rf ${repo}.tar.bz\n"
  },
  {
    "path": ".github/scripts/test-nodejs-addon-npm.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nd=nodejs-addon-examples\necho \"dir: $d\"\ncd $d\n\narch=$(node -p \"require('os').arch()\")\nplatform=$(node -p \"require('os').platform()\")\nnode_version=$(node -p \"process.versions.node.split('.')[0]\")\n\necho \"----------Moonshine v2----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\nnode ./test_asr_non_streaming_moonshine_v2.js\n\nrm -rf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\n\necho \"----------FireRedAsr CTC----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\nnode ./test_asr_non_streaming_fire_red_asr_ctc.js\nnode ./test_asr_non_streaming_fire_red_asr_ctc_async.js\n\nrm -rf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n\necho \"----------PocketTTS----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nnode ./test_tts_non_streaming_pocket_en.js\nnode ./test_tts_non_streaming_pocket_en_async.js\n\nrm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n\necho \"----------ZipVoice----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\nnode ./test_tts_non_streaming_zipvoice_zh_en.js\nnode ./test_tts_non_streaming_zipvoice_zh_en_async.js\n\nrm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\nrm -f vocos_24khz.onnx\n\necho \"----------non-streaming ASR FunASR Nano----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\ntar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nrm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\nnode ./test_asr_non_streaming_funasr_nano.js\nnode ./test_asr_non_streaming_funasr_nano_async.js\n\nrm -rf sherpa-onnx-funasr-nano-int8-2025-12-30\n\necho \"----------non-streaming ASR Google MedASR CTC----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\nnode ./test_asr_non_streaming_medasr_ctc.js\n\nrm -rf sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n\necho \"----------non-streaming ASR Omnilingual ASR CTC----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\nnode ./test_asr_non_streaming_omnilingual_asr_ctc.js\n\nrm -rf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n\necho \"----------non-streaming ASR WeNet CTC----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\ntar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nrm 
sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\nnode ./test_asr_non_streaming_wenet_ctc.js\nrm -rf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\n\necho \"----------streaming ASR T-one CTC----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\nnode ./test_asr_streaming_t_one_ctc.js\n\nrm -rf sherpa-onnx-streaming-t-one-russian-2025-09-08\n\necho \"----------KittenTTS----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\nnode ./test_tts_non_streaming_kitten_en.js\n\nrm -rf kitten-nano-en-v0_1-fp16\n\necho \"----------SupertonicTTS----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\nnode ./test_tts_non_streaming_supertonic_en.js\nnode ./test_tts_non_streaming_supertonic_en_async.js\n\nrm -rf sherpa-onnx-supertonic-tts-int8-2026-03-06\n\necho \"----------non-streaming ASR NeMo Canary----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nrm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\nnode ./test_asr_non_streaming_nemo_canary.js\n\nrm -rf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n\necho \"----------non-streaming ASR Zipformer CTC----------\"\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\ntar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nrm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\nnode ./test_asr_non_streaming_zipformer_ctc.js\nrm -rf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\n\necho \"----------non-streaming ASR NeMo parakeet tdt----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\nrm sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n\nnode ./test_asr_non_streaming_nemo_parakeet_tdt_v2.js\nrm -rf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\n\necho \"----------non-streaming ASR dolphin CTC----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\ntar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nrm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\nnode ./test_asr_non_streaming_dolphin_ctc.js\n\nrm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\necho \"----------non-streaming speech denoiser----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\nnode ./test_offline_speech_enhancement_gtcrn.js\nnode ./test_offline_speech_enhancement_dpdfnet.js\nnode ./test_online_speech_enhancement_gtcrn.js\nnode ./test_online_speech_enhancement_dpdfnet.js\nrm gtcrn_simple.onnx\nrm dpdfnet_baseline.onnx\nls -lh *.wav\n\necho \"----------non-streaming asr FireRedAsr----------\"\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\nrm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\nnode ./test_asr_non_streaming_fire_red_asr.js\nrm -rf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n\necho \"----------non-streaming asr moonshine + vad----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test_vad_with_non_streaming_asr_moonshine.js\nrm -rf sherpa-onnx-*\nrm *.wav\nrm *.onnx\n\necho \"----------non-streaming speaker diarization----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nnode ./test_offline_speaker_diarization.js\n\nrm -rfv *.onnx *.wav sherpa-onnx-pyannote-*\n\necho \"----------non-streaming asr whisper + vad----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test_vad_with_non_streaming_asr_whisper.js\nrm -rf sherpa-onnx-whisper*\nrm *.wav\nrm *.onnx\n\necho \"----------asr----------\"\n\nif [[ $arch != \"ia32\" && $platform != \"win32\" ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  tar xvf sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  rm sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n\n  node ./test_asr_non_streaming_nemo_ctc.js\n  rm -rf sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n  node ./test_asr_non_streaming_sense_voice.js\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n  tar xf dict.tar.bz2\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\n  node ./test_asr_non_streaming_sense_voice_with_hr.js\n\n  rm -rf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n  rm -rf dict replace.fst test-hr.wav lexicon.txt\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  node ./test_asr_non_streaming_paraformer.js\n\n  rm -f itn*\n\n  curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\n  node ./test_asr_non_streaming_paraformer_itn.js\n\n  rm -rf sherpa-onnx-paraformer-zh-2023-09-14\nfi\n\necho \"----------tts----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\nnode ./test_tts_non_streaming_kokoro_zh_en.js\nls -lh *.wav\nrm -rf kokoro-multi-lang-v1_0\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\nnode ./test_tts_non_streaming_kokoro_en.js\nls -lh *.wav\nrm -rf kokoro-en-v0_19\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test_tts_non_streaming_matcha_icefall_en.js\nrm vocos-22khz-univ.onnx\nrm -rf matcha-icefall-en_US-ljspeech\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test_tts_non_streaming_matcha_icefall_zh.js\nrm vocos-22khz-univ.onnx\nrm -rf matcha-icefall-zh-baker\nls -lh *.wav\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2\ntar xf vits-piper-en_GB-cori-medium.tar.bz2\nrm vits-piper-en_GB-cori-medium.tar.bz2\n\nnode ./test_tts_non_streaming_vits_piper_en.js\nrm -rf 
vits-piper-en_GB-cori-medium\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\ntar xvf vits-coqui-de-css10.tar.bz2\nrm vits-coqui-de-css10.tar.bz2\n\nnode ./test_tts_non_streaming_vits_coqui_de.js\nrm -rf vits-coqui-de-css10\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\ntar xvf sherpa-onnx-vits-zh-ll.tar.bz2\nrm sherpa-onnx-vits-zh-ll.tar.bz2\n\nnode ./test_tts_non_streaming_vits_zh_ll.js\nrm -rf sherpa-onnx-vits-zh-ll\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\ntar xvf vits-icefall-zh-aishell3.tar.bz2\nrm vits-icefall-zh-aishell3.tar.bz2\n\nnode ./test_tts_non_streaming_vits_zh_aishell3.js\nrm -rf vits-icefall-zh-aishell3\n\necho \"----------keyword spotting----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\ntar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nrm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n\nnode ./test_keyword_spotter_transducer.js\nrm -rf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n\nif [[ $arch != \"ia32\" && $platform != \"win32\" && $node_version != 21 ]]; then\n  # The punctuation model is so large that it causes a memory allocation failure on Windows x86\n  # 2024-07-17 03:24:34.2388391 [E:onnxruntime:, inference_session.cc:1981\n  # onnxruntime::InferenceSession::Initialize::<lambda_d603a8c74863bd6b58a1c7996295ed04>::operator ()]\n  # Exception during initialization: bad allocation\n  # Error: Process completed with exit code 127.\n  #\n  # Node 21 does not have such an issue\n  echo \"----------add punctuations----------\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf 
sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n\n  node ./test_offline_punctuation.js\n  rm -rf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\n\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\n  node ./test_online_punctuation.js\n  rm -rf sherpa-onnx-online-punct-en-2024-08-06\nfi\n\necho \"----------audio tagging----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\ntar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\nrm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n\nnode ./test_audio_tagging_zipformer.js\nrm -rf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\ntar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\nrm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n\nnode ./test_audio_tagging_ced.js\nrm -rf sherpa-onnx-ced-mini-audio-tagging-2024-04-19\n\necho \"----------speaker identification----------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ngit clone https://github.com/csukuangfj/sr-data\n\nnode ./test_speaker_identification.js\n\nrm *.onnx\nrm -rf sr-data\n\necho \"----------spoken language identification----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.tar.bz2\nrm sherpa-onnx-whisper-tiny.tar.bz2\n\ncurl 
-SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\ntar xvf spoken-language-identification-test-wavs.tar.bz2\nrm spoken-language-identification-test-wavs.tar.bz2\n\nnode ./test_spoken_language_identification.js\nrm -rf sherpa-onnx-whisper-tiny\nrm -rf spoken-language-identification-test-wavs\n\necho \"----------streaming asr----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\nrm -f itn*\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nif [[ $arch != \"ia32\" && $platform != \"win32\" ]]; then\n  node test_asr_streaming_transducer_itn.js\n  node test_asr_streaming_transducer.js\n  node test_asr_streaming_transducer_with_hr.js\nfi\n\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\nrm -rf dict lexicon.txt replace.fst test-hr.wav\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\nnode ./test_asr_streaming_ctc.js\n\n# To decode with HLG.fst\nnode 
./test_asr_streaming_ctc_hlg.js\nrm -rf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\nnode ./test_asr_streaming_paraformer.js\nrm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\necho \"----------non-streaming asr----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\ntar xvf sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nrm sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n\nnode ./test_asr_non_streaming_transducer.js\nrm -rf sherpa-onnx-zipformer-en-2023-04-01\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\nnode ./test_asr_non_streaming_whisper.js\nrm -rf sherpa-onnx-whisper-tiny.en\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nnode ./test_asr_non_streaming_moonshine.js\nrm -rf sherpa-onnx-*\n\nls -lh\n"
  },
  {
    "path": ".github/scripts/test-nodejs-npm.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\necho \"dir: $d\"\ncd $d\nnpm install\ngit status\nls -lh\nls -lh node_modules\n\necho \"---test moonshine v2---\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\nnode ./test-offline-moonshine-v2.js\n\nrm -rf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\n\necho \"---test FireRedASR CTC---\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\nnode ./test-offline-fire-red-asr-ctc.js\n\nrm -rf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\ntar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nrm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\nnode ./test-offline-funasr-nano.js\n\nrm -rf sherpa-onnx-funasr-nano-int8-2025-12-30\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\nnode ./test-offline-medasr-ctc.js\n\nrm -rf sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\nnode ./test-offline-omnilingual-asr-ctc.js\n\nrm -rf 
sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\ntar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nrm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\nnode ./test-offline-wenet-ctc.js\nrm -rf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nnode ./test-online-t-one-ctc.js\n\nrm -rf sherpa-onnx-streaming-t-one-russian-2025-09-08\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\nnode ./test-offline-tts-kitten-en.js\nls -lh *.wav\nrm -rf kitten-nano-en-v0_1-fp16\n\n# online asr\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nnode ./test-online-paraformer.js\nrm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\nrm -f itn*\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n\nnode ./test-online-transducer-itn.js\n\nnode ./test-online-transducer.js\n\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n\nnode ./test-online-zipformer2-ctc.js\nrm -rf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nnode ./test-online-zipformer2-ctc-hlg.js\nrm -rf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\n\necho \"----------keyword spotting----------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\ntar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nrm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n\nnode ./test-keyword-spotter-transducer.js\nrm -rf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n\n# asr with offline nemo canary\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nrm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\nnode ./test-offline-nemo-canary.js\nrm -rf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n\n# asr with offline zipformer ctc\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\ntar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nrm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\nnode ./test-offline-zipformer-ctc.js\nrm -rf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\n\n# asr with offline dolphin ctc\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\ntar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nrm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nnode ./test-offline-dolphin-ctc.js\nrm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\n# speech enhancement\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nnode ./test-offline-speech-enhancement-gtcrn.js\nnode ./test-offline-speech-enhancement-dpdfnet.js\nnode ./test-online-speech-enhancement-gtcrn.js\nnode ./test-online-speech-enhancement-dpdfnet.js\nls -lh *.wav\nrm gtcrn_simple.onnx\nrm dpdfnet_baseline.onnx\nrm -fv inp_16k.wav\nrm -fv enhanced*.wav\n\n# offline tts\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\nnode ./test-offline-tts-zipvoice-zh-en.js\nls -lh *.wav\nrm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\nrm -f vocos_24khz.onnx\n\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\nnode ./test-offline-tts-kokoro-zh-en.js\nls -lh *.wav\nrm -rf kokoro-multi-lang-v1_0\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\nnode ./test-offline-tts-kokoro-en.js\nrm -rf kokoro-en-v0_19\n\nls -lh\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test-offline-tts-matcha-zh.js\n\nrm -rf matcha-icefall-zh-baker\nrm vocos-22khz-univ.onnx\n\n\necho \"---\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test-offline-tts-matcha-en.js\n\nrm -rf matcha-icefall-en_US-ljspeech\nrm vocos-22khz-univ.onnx\n\necho \"---\"\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\nnode ./test-offline-tts-vits-en.js\nrm -rf vits-piper-en_US-amy-low*\n\necho \"---\"\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\ntar xvf vits-icefall-zh-aishell3.tar.bz2\nnode ./test-offline-tts-vits-zh.js\nrm -rf vits-icefall-zh-aishell3*\n\nls -lh *.wav\n\necho '-----speaker diarization----------'\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nnode ./test-offline-speaker-diarization.js\nrm -rfv *.wav *.onnx sherpa-onnx-pyannote-*\n\necho '-----vad+moonshine----------'\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nnode ./test-vad-with-non-streaming-asr-whisper.js\nrm Obama.wav\nrm silero_vad.onnx\nrm -rf sherpa-onnx-moonshine-*\n\necho '-----vad+whisper----------'\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nnode ./test-vad-with-non-streaming-asr-whisper.js\nrm Obama.wav\nrm silero_vad.onnx\nrm -rf sherpa-onnx-whisper-tiny.en\n\n# offline asr\n#\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n\nnode ./test-offline-sense-voice.js\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nnode ./test-offline-sense-voice-with-hr.js\n\nrm -rf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\nrm -rf dict replace.fst test-hr.wav lexicon.txt\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nls -lh\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\nrm -f itn*\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nnode ./test-offline-paraformer-itn.js\nrm -rf sherpa-onnx-paraformer-zh-2023-09-14\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2\nls -lh\ntar xvf sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2\nrm sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2\nnode ./test-offline-nemo-ctc.js\nrm -rf sherpa-onnx-nemo-ctc-en-conformer-small\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nls -lh\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nnode ./test-offline-paraformer.js\nrm -rf 
sherpa-onnx-paraformer-zh-2023-09-14\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\nls -lh\ntar xvf sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\nrm sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\nnode ./test-offline-transducer.js\nrm -rf sherpa-onnx-zipformer-en-2023-06-26\n\ncurl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\nnode ./test-offline-whisper.js\nrm -rf sherpa-onnx-whisper-tiny.en\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nnode ./test-offline-moonshine.js\nrm -rf sherpa-onnx-moonshine-*\n"
  },
  {
    "path": ".github/scripts/test-offline-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nfor type in base small; do\n  log \"------------------------------------------------------------\"\n  log \"Run Dolphin CTC models ($type int8)\"\n  log \"------------------------------------------------------------\"\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\n  $EXE \\\n    --dolphin-model=./sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \\\n    --tokens=./sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02/tokens.txt \\\n    --debug=1 \\\n    ./sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\n\n  rm -rf sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02\n\n  log \"------------------------------------------------------------\"\n  log \"Run Dolphin CTC models ($type)\"\n  log \"------------------------------------------------------------\"\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02.tar.bz2\n\n  $EXE \\\n    --dolphin-model=./sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02/model.onnx \\\n    --tokens=./sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02/tokens.txt \\\n    --debug=1 \\\n    ./sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02/test_wavs/0.wav\n\n  rm -rf 
sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02\ndone\n\nlog \"------------------------------------------------------------\"\nlog \"Run NeMo GigaAM Russian models v2\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19.tar.bz2\ntar xvf sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19.tar.bz2\nrm sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19.tar.bz2\n\n$EXE \\\n  --nemo-ctc-model=./sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19/model.int8.onnx \\\n  --tokens=./sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19/tokens.txt \\\n  --debug=1 \\\n  ./sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19/test_wavs/example.wav\n\nrm -rf sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19\n\nlog \"------------------------------------------------------------\"\nlog \"Run NeMo GigaAM Russian models v1\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\ntar xvf sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\nrm sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\n\n$EXE \\\n  --nemo-ctc-model=./sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24/model.int8.onnx \\\n  --tokens=./sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24/tokens.txt \\\n  --debug=1 \\\n  ./sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24/test_wavs/example.wav\n\nrm -rf sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24\n\nlog \"------------------------------------------------------------\"\nlog \"Run SenseVoice models\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\ntar xvf 
sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\nrepo=sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\n\nfor m in model.int8.onnx; do\n  for w in zh en yue ja ko; do\n    for use_itn in 0 1; do\n      echo \"$m $w $use_itn\"\n      time $EXE \\\n        --tokens=$repo/tokens.txt \\\n        --sense-voice-model=$repo/$m \\\n        --sense-voice-use-itn=$use_itn \\\n        $repo/test_wavs/$w.wav\n    done\n  done\ndone\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\nrm dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nfor m in model.int8.onnx; do\n  for use_itn in 0 1; do\n    echo \"$m $w $use_itn\"\n    time $EXE \\\n      --tokens=$repo/tokens.txt \\\n      --sense-voice-model=$repo/$m \\\n      --sense-voice-use-itn=$use_itn \\\n      --hr-lexicon=./lexicon.txt \\\n      --hr-rule-fsts=./replace.fst \\\n      ./test-hr.wav\n  done\ndone\n\nrm -rf dict replace.fst test-hr.wav lexicon.txt\n\n# test wav reader for non-standard wav files\nwaves=(\n  naudio.wav\n  junk-padding.wav\n  int8-1-channel-zh.wav\n  int8-2-channel-zh.wav\n  int8-4-channel-zh.wav\n  int16-1-channel-zh.wav\n  int16-2-channel-zh.wav\n  int32-1-channel-zh.wav\n  int32-2-channel-zh.wav\n  float32-1-channel-zh.wav\n  float32-2-channel-zh.wav\n)\nfor w in ${waves[@]}; do\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$w\n\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --sense-voice-model=$repo/model.int8.onnx \\\n    $w\n  rm -v $w\ndone\n\nrm -rf $repo\n\nif true; then\n  # It has problems with onnxruntime 1.18\n  log 
\"------------------------------------------------------------\"\n  log \"Run Wenet models\"\n  log \"------------------------------------------------------------\"\n  wenet_models=(\n  sherpa-onnx-zh-wenet-aishell\n  # sherpa-onnx-zh-wenet-aishell2\n  # sherpa-onnx-zh-wenet-wenetspeech\n  # sherpa-onnx-zh-wenet-multi-cn\n  sherpa-onnx-en-wenet-librispeech\n  # sherpa-onnx-en-wenet-gigaspeech\n  )\n  for name in ${wenet_models[@]}; do\n    repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$name.tar.bz2\n    log \"Start testing ${repo_url}\"\n    repo=$name\n    log \"Download pretrained model and test-data from $repo_url\"\n    curl -SL -O $repo_url\n    tar xvf $name.tar.bz2\n    rm $name.tar.bz2\n\n    log \"test float32 models\"\n    time $EXE \\\n      --tokens=$repo/tokens.txt \\\n      --wenet-ctc-model=$repo/model.onnx \\\n      $repo/test_wavs/0.wav \\\n      $repo/test_wavs/1.wav \\\n      $repo/test_wavs/8k.wav\n\n    log \"test int8 models\"\n    time $EXE \\\n      --tokens=$repo/tokens.txt \\\n      --wenet-ctc-model=$repo/model.int8.onnx \\\n      $repo/test_wavs/0.wav \\\n      $repo/test_wavs/1.wav \\\n      $repo/test_wavs/8k.wav\n\n    rm -rf $repo\n  done\nfi\n\n\nlog \"test offline TeleSpeech CTC\"\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nname=$(basename $url)\nrepo=$(basename -s .tar.bz2 $name)\n\ncurl -SL -O $url\ntar xvf $name\nrm $name\nls -lh $repo\n\ntest_wavs=(\n3-sichuan.wav\n4-tianjin.wav\n5-henan.wav\n)\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --telespeech-ctc=$repo/model.int8.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --telespeech-ctc=$repo/model.int8.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/3-sichuan.wav \\\n  $repo/test_wavs/4-tianjin.wav \\\n  $repo/test_wavs/5-henan.wav\n\nrm -rf $repo\n\nlog 
\"-----------------------------------------------------------------\"\nlog \"Run Nemo fast conformer hybrid transducer ctc models (CTC branch)\"\nlog \"-----------------------------------------------------------------\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"test $repo\"\ntest_wavs=(\nde-german.wav\nes-spanish.wav\nhr-croatian.wav\npo-polish.wav\nuk-ukrainian.wav\nen-english.wav\nfr-french.wav\nit-italian.wav\nru-russian.wav\n)\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --nemo-ctc-model=$repo/model.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-en-24500.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"Test $repo\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc-model=$repo/model.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/en-english.wav\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-es-1424.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"test $repo\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc-model=$repo/model.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/es-spanish.wav\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"Test 
$repo\"\n\ntest_wavs=(\nen-english.wav\nde-german.wav\nfr-french.wav\nes-spanish.wav\n)\n\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --nemo-ctc-model=$repo/model.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\n\nrm -rf $repo\n\n\n\nlog \"------------------------------------------------------------\"\nlog \"Run tdnn yesno (Hebrew)\"\nlog \"------------------------------------------------------------\"\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-tdnn-yesno.tar.bz2\ncurl -SL -O $url\ntar xvf sherpa-onnx-tdnn-yesno.tar.bz2\nrm sherpa-onnx-tdnn-yesno.tar.bz2\nlog \"Start testing ${url}\"\nrepo=sherpa-onnx-tdnn-yesno\nlog \"Download pretrained model and test-data from $url\"\n\nlog \"test float32 models\"\ntime $EXE \\\n  --sample-rate=8000 \\\n  --feat-dim=23 \\\n  \\\n  --tokens=$repo/tokens.txt \\\n  --tdnn-model=$repo/model-epoch-14-avg-2.onnx \\\n  $repo/test_wavs/0_0_0_1_0_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_0_0_0_1_0.wav \\\n  $repo/test_wavs/0_0_1_0_0_1_1_1.wav \\\n  $repo/test_wavs/0_0_1_0_1_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_1_0_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_1_0_1_1_0.wav\n\nlog \"test int8 models\"\ntime $EXE \\\n  --sample-rate=8000 \\\n  --feat-dim=23 \\\n  \\\n  --tokens=$repo/tokens.txt \\\n  --tdnn-model=$repo/model-epoch-14-avg-2.int8.onnx \\\n  $repo/test_wavs/0_0_0_1_0_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_0_0_0_1_0.wav \\\n  $repo/test_wavs/0_0_1_0_0_1_1_1.wav \\\n  $repo/test_wavs/0_0_1_0_1_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_1_0_0_0_1.wav \\\n  $repo/test_wavs/0_0_1_1_0_1_1_0.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run Citrinet (stt_en_citrinet_512, English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\ncurl -SL -O $repo_url\ntar xvf 
sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\nrm sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\nlog \"Start testing ${repo_url}\"\nrepo=sherpa-onnx-nemo-ctc-en-citrinet-512\nlog \"Download pretrained model and test-data from $repo_url\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc-model=$repo/model.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc-model=$repo/model.int8.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run Librispeech zipformer CTC H/HL/HLG decoding (English)   \"\nlog \"------------------------------------------------------------\"\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-en-2023-10-02.tar.bz2\ncurl -SL -O $repo_url\nlog \"Start testing ${repo_url}\"\ntar xvf sherpa-onnx-zipformer-ctc-en-2023-10-02.tar.bz2\nrm sherpa-onnx-zipformer-ctc-en-2023-10-02.tar.bz2\nrepo=sherpa-onnx-zipformer-ctc-en-2023-10-02\nlog \"Download pretrained model and test-data from $repo_url\"\n\ngraphs=(\n$repo/H.fst\n$repo/HL.fst\n$repo/HLG.fst\n)\n\nfor graph in ${graphs[@]}; do\n  log \"test float32 models with $graph\"\n  time $EXE \\\n    --model-type=zipformer2_ctc \\\n    --ctc.graph=$graph \\\n    --zipformer-ctc-model=$repo/model.onnx \\\n    --tokens=$repo/tokens.txt \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/2.wav\n\n  log \"test int8 models with $graph\"\n  time $EXE \\\n    --model-type=zipformer2_ctc \\\n    --ctc.graph=$graph \\\n    --zipformer-ctc-model=$repo/model.int8.onnx \\\n    --tokens=$repo/tokens.txt \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/2.wav\ndone\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-offline-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\nfor w in 0.wav 1.wav 2.wav 3-sichuan.wav 3.wav 4-tianjin.wav 5-henan.wav 8k.wav; do\n  $EXE \\\n    --fire-red-asr-ctc=./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx \\\n    --tokens=./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt \\\n    ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/$w\ndone\n\nrm -rf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n"
  },
  {
    "path": ".github/scripts/test-offline-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nnames=(\ntiny\nbase\n)\n\nfor name in ${names[@]}; do\n  log \"------------------------------------------------------------\"\n  log \"Run Moonshine $name\"\n  log \"------------------------------------------------------------\"\n\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-$name-en-int8.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf sherpa-onnx-moonshine-$name-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-$name-en-int8.tar.bz2\n  repo=sherpa-onnx-moonshine-$name-en-int8\n  log \"Start testing ${repo_url}\"\n\n  log \"test int8 onnx\"\n\n  time $EXE \\\n    --moonshine-preprocessor=$repo/preprocess.onnx \\\n    --moonshine-encoder=$repo/encode.int8.onnx \\\n    --moonshine-uncached-decoder=$repo/uncached_decode.int8.onnx \\\n    --moonshine-cached-decoder=$repo/cached_decode.int8.onnx \\\n    --tokens=$repo/tokens.txt \\\n    --num-threads=2 \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  rm -rf $repo\ndone\n"
  },
  {
    "path": ".github/scripts/test-offline-punctuation.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Download the punctuation model                             \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\ntar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrepo=sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\nls -lh $repo\n\n$EXE \\\n --debug=1 \\\n --ct-transformer=$repo/model.onnx \\\n \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\"\n\n$EXE \\\n --debug=1 \\\n --ct-transformer=$repo/model.onnx \\\n \"我们都是木头人不会说话不会动\"\n\n$EXE \\\n --debug=1 \\\n --ct-transformer=$repo/model.onnx \\\n \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\"\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-offline-source-separation.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nif [ -z \"$EXE\" ]; then\n  EXE=./build/bin/sherpa-onnx-offline-source-separation\nfi\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run spleeter\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/sherpa-onnx-spleeter-2stems-fp16.tar.bz2\ntar xvf sherpa-onnx-spleeter-2stems-fp16.tar.bz2\nrm sherpa-onnx-spleeter-2stems-fp16.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/qi-feng-le-zh.wav\n\n$EXE \\\n  --spleeter-vocals=sherpa-onnx-spleeter-2stems-fp16/vocals.fp16.onnx \\\n  --spleeter-accompaniment=sherpa-onnx-spleeter-2stems-fp16/accompaniment.fp16.onnx \\\n  --num-threads=2 \\\n  --debug=1 \\\n  --input-wav=./qi-feng-le-zh.wav \\\n  --output-vocals-wav=spleeter_output_vocals.wav \\\n  --output-accompaniment-wav=spleeter_output_accompaniment.wav\n\nrm -rf sherpa-onnx-spleeter-2stems-fp16\n\nlog \"------------------------------------------------------------\"\nlog \"Run UVR\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/UVR-MDX-NET-Voc_FT.onnx\n\n$EXE \\\n  --debug=1 \\\n  --num-threads=2 \\\n  --uvr-model=./UVR-MDX-NET-Voc_FT.onnx \\\n  --input-wav=./qi-feng-le-zh.wav \\\n  --output-vocals-wav=uvr_output_vocals.wav \\\n  --output-accompaniment-wav=uvr_output_non_vocals.wav\n\nrm ./UVR-MDX-NET-Voc_FT.onnx\n\nmkdir -p source-separation-wavs\nmv qi-feng-le-zh.wav source-separation-wavs\nmv spleeter_*.wav ./source-separation-wavs\nmv uvr_*.wav ./source-separation-wavs\n"
  },
  {
    "path": ".github/scripts/test-offline-speech-denoiser.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nif [ -z \"$EXE\" ]; then\n  EXE=./build/bin/sherpa-onnx-offline-denoiser\nfi\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run gtcrn\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav\n\n$EXE \\\n  --debug=1 \\\n  --speech-denoiser-gtcrn-model=./gtcrn_simple.onnx \\\n  --input-wav=./speech_with_noise.wav \\\n  --output-wav=./enhanced_speech_16k.wav\n\nrm ./gtcrn_simple.onnx\n"
  },
  {
    "path": ".github/scripts/test-offline-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run NeMo GigaAM Russian models v2\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19.tar.bz2\ntar xvf sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19.tar.bz2\nrm sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19.tar.bz2\n\n$EXE \\\n  --encoder=./sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/encoder.int8.onnx \\\n  --decoder=./sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/decoder.onnx \\\n  --joiner=./sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/joiner.onnx \\\n  --tokens=./sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/tokens.txt \\\n  --model-type=nemo_transducer \\\n  ./sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/test_wavs/example.wav\n\nrm -rf sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19\n\n\nlog \"------------------------------------------------------------------------\"\nlog \"Run zipformer transducer models (Russian)                              \"\nlog \"------------------------------------------------------------------------\"\nfor type in small-zipformer zipformer; do\n  url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-$type-ru-2024-09-18.tar.bz2\n  name=$(basename $url)\n  curl -SL -O $url\n  tar xvf $name\n  rm $name\n  repo=$(basename -s .tar.bz2 $name)\n  ls -lh $repo\n\n  log \"test $repo\"\n  test_wavs=(\n  0.wav\n  1.wav\n  )\n\n  for w in ${test_wavs[@]}; do\n    time 
$EXE \\\n      --tokens=$repo/tokens.txt \\\n      --encoder=$repo/encoder.onnx \\\n      --decoder=$repo/decoder.onnx \\\n      --joiner=$repo/joiner.onnx \\\n      --debug=1 \\\n      $repo/test_wavs/$w\n  done\n\n  for w in ${test_wavs[@]}; do\n    time $EXE \\\n      --tokens=$repo/tokens.txt \\\n      --encoder=$repo/encoder.int8.onnx \\\n      --decoder=$repo/decoder.onnx \\\n      --joiner=$repo/joiner.int8.onnx \\\n      --debug=1 \\\n      $repo/test_wavs/$w\n  done\n  rm -rf $repo\ndone\n\nlog \"------------------------------------------------------------------------\"\nlog \"Run zipformer transducer models (Japanese from ReazonSpeech)                              \"\nlog \"------------------------------------------------------------------------\"\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\ncat $repo/test_wavs/*.txt\n\nlog \"test $repo\"\ntest_wavs=(\n1.wav\n2.wav\n3.wav\n4.wav\n5.wav\n)\n\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n    --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n    --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\n\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n    --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n    --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\nrm -rf $repo\n\nlog \"------------------------------------------------------------------------\"\nlog \"Run Nemo fast conformer hybrid transducer ctc models (transducer branch)\"\nlog 
\"------------------------------------------------------------------------\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"test $repo\"\ntest_wavs=(\nde-german.wav\nes-spanish.wav\nhr-croatian.wav\npo-polish.wav\nuk-ukrainian.wav\nen-english.wav\nfr-french.wav\nit-italian.wav\nru-russian.wav\n)\nfor w in ${test_wavs[@]}; do\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --encoder=$repo/encoder.onnx \\\n    --decoder=$repo/decoder.onnx \\\n    --joiner=$repo/joiner.onnx \\\n    --debug=1 \\\n    $repo/test_wavs/$w\ndone\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-en-24500.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"Test $repo\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/en-english.wav\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-es-1424.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"test $repo\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/es-spanish.wav\n\nrm -rf $repo\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s 
.tar.bz2 $name)\nls -lh $repo\n\nlog \"Test $repo\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --debug=1 \\\n  $repo/test_wavs/en-english.wav \\\n  $repo/test_wavs/de-german.wav \\\n  $repo/test_wavs/fr-french.wav \\\n  $repo/test_wavs/es-spanish.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run Conformer transducer (English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-en-2023-03-18.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-conformer-en-2023-03-18.tar.bz2\nrm sherpa-onnx-conformer-en-2023-03-18.tar.bz2\nlog \"Start testing ${repo_url}\"\nrepo=sherpa-onnx-conformer-en-2023-03-18\nlog \"Download pretrained model and test-data from $repo_url\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run Zipformer transducer (English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-03-30.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-zipformer-en-2023-03-30.tar.bz2\nrm 
sherpa-onnx-zipformer-en-2023-03-30.tar.bz2\nrepo=sherpa-onnx-zipformer-en-2023-03-30\nlog \"Start testing ${repo_url}\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nlm_repo_url=https://huggingface.co/ezerhouni/icefall-librispeech-rnn-lm\nlog \"Download pre-trained RNN-LM model from ${lm_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $lm_repo_url\nlm_repo=$(basename $lm_repo_url)\npushd $lm_repo\ngit lfs pull --include \"exp/no-state-epoch-99-avg-1.onnx\"\npopd\n\nbigram_repo_url=https://huggingface.co/vsd-vector/librispeech_bigram_sherpa-onnx-zipformer-large-en-2023-06-26\nlog \"Download bi-gram LM from ${bigram_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $bigram_repo_url\nbigramlm_repo=$(basename $bigram_repo_url)\npushd $bigramlm_repo\ngit lfs pull --include \"2gram.fst\"\npopd\n\nlog \"Start testing with LM and bi-gram LODR\"\n# TODO: find test examples that change with the LODR\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding_method=\"modified_beam_search\" \\\n  --lm=$lm_repo/exp/no-state-epoch-99-avg-1.onnx \\\n  --lodr-fst=$bigramlm_repo/2gram.fst \\\n  --lodr-scale=-0.5  \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo $lm_repo $bigramlm_repo\n\nlog 
\"------------------------------------------------------------\"\nlog \"Run Paraformer (Chinese)\"\nlog \"------------------------------------------------------------\"\n# For onnxruntime 1.18.0, sherpa-onnx-paraformer-zh-2023-03-28 throws the following error\n# libc++abi: terminating with uncaught exception of type Ort::Exception: Node (Loop_5471)\n# Op (Loop) [TypeInferenceError] Graph attribute inferencing failed: Node (Concat_5490)\n# Op (Concat) [ShapeInferenceError] All inputs to Concat must have same rank. Input 1 has rank 2 != 1\n#\n# See https://github.com/microsoft/onnxruntime/issues/8115\n# We need to re-export this model using a recent version of onnxruntime and onnx\n\nlog \"------------------------------------------------------------\"\nlog \"Run Paraformer (Chinese) with timestamps\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrepo=sherpa-onnx-paraformer-zh-2023-09-14\n\nlog \"Start testing ${repo_url}\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --paraformer=$repo/model.int8.onnx \\\n  --num-threads=2 \\\n  --decoding-method=greedy_search \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/2.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run NeMo transducer (modified_beam_search + hotwords)\"\nlog \"------------------------------------------------------------\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-en-24500.tar.bz2\nname=$(basename $url)\ncurl -SL -O $url\ntar xvf $name\nrm $name\nrepo=$(basename -s .tar.bz2 $name)\nls -lh $repo\n\nlog \"Test NeMo transducer with modified_beam_search (no 
hotwords)\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --model-type=nemo_transducer \\\n  --decoding-method=modified_beam_search \\\n  --debug=1 \\\n  $repo/test_wavs/en-english.wav\n\nlog \"Test NeMo transducer with modified_beam_search and hotwords\"\n\n# Create hotwords file (BPE tokens for common English words)\ncat > $repo/hotwords.txt << EOF\n▁THE\n▁AND\n▁THAT\nEOF\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --model-type=nemo_transducer \\\n  --decoding-method=modified_beam_search \\\n  --hotwords-file=$repo/hotwords.txt \\\n  --hotwords-score=1.5 \\\n  --debug=1 \\\n  $repo/test_wavs/en-english.wav\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-offline-tts.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\n# test waves are saved in ./tts\nmkdir ./tts\n\nlog \"------------------------------------------------------------\"\nlog \"sherpa-onnx-pocket-tts-int8-2026-01-26\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n$EXE \\\n  --pocket-lm-flow=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx \\\n  --pocket-lm-main=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx \\\n  --pocket-encoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx \\\n  --pocket-decoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx \\\n  --pocket-text-conditioner=./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx \\\n  --pocket-vocab-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json \\\n  --pocket-token-scores-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json \\\n  --reference-audio=./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n  --num-threads=2 \\\n  --debug=1 \\\n  --num-steps=5 \\\n  --output-filename=\"./tts/pocket-tts-out-bria.wav\" \\\n    \"I am happy to join with you today in what will go down in history as the greatest demonstration for freedom in the history of our nation. Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation. 
This momentous decree came as a great beacon light of hope to millions of Negro slaves who had been seared in the flames of withering injustice. It came as a joyous daybreak to end the long night of their captivity. But one hundred years later, the Negro still is not free. One hundred years later, the life of the Negro is still sadly crippled by the manacles of segregation and the chains of discrimination. One hundred years later, the Negro lives on a lonely island of poverty in the midst of a vast ocean of material prosperity. One hundred years later, the Negro is still languished in the corners of American society and finds himself an exile in his own land. And so we've come here today to dramatize a shameful condition. In a sense we've come to our nation's capital to cash a check. When the architects of our republic wrote the magnificent words of the Constitution and the Declaration of Independence, they were signing a promissory note to which every American was to fall heir. This note was a promise that all men, yes, black men as well as white men, would be guaranteed the unalienable Rights of Life, Liberty and the pursuit of Happiness. It is obvious today that America has defaulted on this promissory note, insofar as her citizens of color are concerned. 
Instead of honoring this sacred obligation, America has given the Negro people a bad check, a check which has come back marked insufficient funds.\"\n\nrm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n\nlog \"------------------------------------------------------------\"\nlog \"kokoro-en-v0_19\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\n# mapping of sid to voice name\n# 0->af, 1->af_bella, 2->af_nicole, 3->af_sarah, 4->af_sky, 5->am_adam\n# 6->am_michael, 7->bf_emma, 8->bf_isabella, 9->bm_george, 10->bm_lewis\n\nfor sid in $(seq 0 10); do\n  $EXE \\\n    --debug=1 \\\n    --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n    --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n    --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n    --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n    --num-threads=2 \\\n    --sid=$sid \\\n    --output-filename=\"./tts/kokoro-$sid.wav\" \\\n    \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\ndone\nrm -rf kokoro-en-v0_19\n\nlog \"------------------------------------------------------------\"\nlog \"matcha-tts-fa_en-musa\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-tts-fa_en-musa.tar.bz2\ntar xvf matcha-tts-fa_en-musa.tar.bz2\nrm matcha-tts-fa_en-musa.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n\n$EXE \\\n  --matcha-acoustic-model=./matcha-tts-fa_en-musa/model.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-tts-fa_en-musa/tokens.txt \\\n  --matcha-data-dir=./matcha-tts-fa_en-musa/espeak-ng-data \\\n  --output-filename=./tts/test-matcha-fa-en-musa.wav \\\n  --num-threads=2 \\\n  \"How are you doing today?  این یک نمونه ی تست فارسی است. 
This is a test.\"\n\nrm -rf matcha-tts-fa_en-musa\nrm vocos-22khz-univ.onnx\nls -lh tts/*.wav\n\nlog \"------------------------------------------------------------\"\nlog \"matcha-icefall-en_US-ljspeech\"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n\n$EXE \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --num-threads=2 \\\n  --output-filename=./tts/matcha-ljspeech-1.wav \\\n  --debug=1 \\\n \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nrm vocos-22khz-univ.onnx\nrm -rf matcha-icefall-en_US-ljspeech\nls -lh tts/*.wav\n\nlog \"------------------------------------------------------------\"\nlog \"matcha-icefall-zh-baker\"\nlog \"------------------------------------------------------------\"\ncurl -O -SL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n$EXE \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --num-threads=2 \\\n  --debug=1 \\\n  --output-filename=./tts/matcha-baker-zh-1.wav \\\n  '小米的使命是，始终坚持做\"感动人心、价格厚道\"的好产品，让全球每个人都能享受科技带来的美好生活'\n\n$EXE \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --num-threads=2 \\\n  --debug=1 \\\n  --output-filename=./tts/matcha-baker-zh-2.wav \\\n  \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔。\"\n\nrm vocos-22khz-univ.onnx\nrm -rf matcha-icefall-zh-baker\n\nlog \"------------------------------------------------------------\"\nlog \"vits-piper-en_US-amy-low\"\nlog \"------------------------------------------------------------\"\ncurl -O -SL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\nrm vits-piper-en_US-amy-low.tar.bz2\n\n$EXE \\\n  
--vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n  --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n  --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./tts/amy.wav \\\n  \"“Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.” The sun shone bleakly in the sky, its meager light struggling to penetrate the thick foliage of the forest. Birds sang their songs up in the crowns of the trees, fluttering from one branch to the other. A blanket of total tranquility lied over the forest. The peace was only broken by the steady gallop of the horses of the soldiers who were traveling to their upcoming knighting the morrow at Camelot, and rowdy conversation. “Finally we will get what we deserve,” “It’s been about time,” Perceval agreed. “We’ve been risking our arses for the past two years. It’s the least they could give us.” Merlin remained ostensibly silent, refusing to join the verbal parade of self-aggrandizing his fellow soldiers have engaged in. 
He found it difficult to happy about anything, when even if they had won the war, he had lost everything else in the process.\"\n\nfile ./tts/amy.wav\nrm -rf vits-piper-en_US-amy-low\n\nlog \"------------------------------------------------------------\"\nlog \"vits-ljs test\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-ljs.tar.bz2\ncurl -SL -O $repo_url\ntar xvf vits-ljs.tar.bz2\nrm vits-ljs.tar.bz2\nrepo=vits-ljs\n\nlog \"Start testing ${repo_url}\"\n\n$EXE \\\n  --vits-model=$repo/vits-ljs.onnx \\\n  --vits-lexicon=$repo/lexicon.txt \\\n  --vits-tokens=$repo/tokens.txt \\\n  --output-filename=./tts/vits-ljs.wav \\\n  'liliana, the most beautiful and lovely assistant of our team!'\n\nls -lh ./tts\n\nrm -rfv $repo\n\nlog \"------------------------------------------------------------\"\nlog \"vits-vctk test\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-vctk.tar.bz2\ncurl -SL -O $repo_url\ntar xvf vits-vctk.tar.bz2\nrm vits-vctk.tar.bz2\nrepo=vits-vctk\n\nlog \"Start testing ${repo_url}\"\n\nfor sid in 0 10 90; do\n  $EXE \\\n    --vits-model=$repo/vits-vctk.onnx \\\n    --vits-lexicon=$repo/lexicon.txt \\\n    --vits-tokens=$repo/tokens.txt \\\n    --sid=$sid \\\n    --output-filename=./tts/vits-vctk-${sid}.wav \\\n    'liliana, the most beautiful and lovely assistant of our team!'\ndone\n\nrm -rfv $repo\n\nls -lh tts/\n\nlog \"------------------------------------------------------------\"\nlog \"vits-zh-aishell3\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-aishell3.tar.bz2\ncurl -SL -O $repo_url\ntar xvf vits-zh-aishell3.tar.bz2\nrm vits-zh-aishell3.tar.bz2\nrepo=vits-zh-aishell3\n\nlog \"Start testing ${repo_url}\"\n\nfor sid in 0 10 
90; do\n  $EXE \\\n    --vits-model=$repo/vits-aishell3.onnx \\\n    --vits-lexicon=$repo/lexicon.txt \\\n    --vits-tokens=$repo/tokens.txt \\\n    --sid=$sid \\\n    --output-filename=./tts/vits-aishell3-${sid}.wav \\\n    '林美丽最美丽'\ndone\n\nrm -rfv $repo\n\nls -lh ./tts/\n"
  },
  {
    "path": ".github/scripts/test-offline-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nnames=(\ntiny.en\nbase.en\nsmall.en\nmedium.en\ntiny\nbase\nsmall\nmedium\ndistil-medium.en\ndistil-small.en\n)\n\nfor name in ${names[@]}; do\n  log \"------------------------------------------------------------\"\n  log \"Run $name\"\n  log \"------------------------------------------------------------\"\n\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-$name.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf sherpa-onnx-whisper-$name.tar.bz2\n  rm sherpa-onnx-whisper-$name.tar.bz2\n  repo=sherpa-onnx-whisper-$name\n  log \"Start testing ${repo_url}\"\n\n  log \"test fp32 onnx\"\n\n  time $EXE \\\n    --tokens=$repo/${name}-tokens.txt \\\n    --whisper-encoder=$repo/${name}-encoder.onnx \\\n    --whisper-decoder=$repo/${name}-decoder.onnx \\\n    --whisper-tail-paddings=500 \\\n    --num-threads=2 \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  log \"test int8 onnx\"\n\n  time $EXE \\\n    --tokens=$repo/${name}-tokens.txt \\\n    --whisper-encoder=$repo/${name}-encoder.int8.onnx \\\n    --whisper-decoder=$repo/${name}-decoder.int8.onnx \\\n    --whisper-tail-paddings=500 \\\n    --num-threads=2 \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  rm -rf $repo\ndone\n"
  },
  {
    "path": ".github/scripts/test-online-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming NeMo CTC                                      \"\nlog \"------------------------------------------------------------\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms.tar.bz2\nname=$(basename $url)\nrepo=$(basename -s .tar.bz2 $name)\n\ncurl -SL -O $url\ntar xvf $name\nrm $name\nls -lh $repo\n\n$EXE \\\n  --nemo-ctc-model=$repo/model.onnx \\\n  --tokens=$repo/tokens.txt \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Zipformer2 CTC HLG decoding                   \"\nlog \"------------------------------------------------------------\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrepo=$PWD/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\nls -lh $repo\necho \"pwd: $PWD\"\n\n$EXE \\\n  --zipformer2-ctc-model=$repo/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  --ctc-graph=$repo/HLG.fst \\\n  --tokens=$repo/tokens.txt \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Zipformer2 CTC                                \"\nlog 
\"------------------------------------------------------------\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nrepo=$(basename -s .tar.bz2 $url)\ncurl -SL -O $url\ntar xvf $repo.tar.bz2\nrm $repo.tar.bz2\n\nlog \"test fp32\"\n\ntime $EXE \\\n  --debug=1 \\\n  --zipformer2-ctc-model=$repo/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n  --tokens=$repo/tokens.txt \\\n  $repo/test_wavs/DEV_T0000000000.wav \\\n  $repo/test_wavs/DEV_T0000000001.wav \\\n  $repo/test_wavs/DEV_T0000000002.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Conformer CTC from WeNet\"\nlog \"------------------------------------------------------------\"\nwenet_models=(\nsherpa-onnx-zh-wenet-aishell\n# sherpa-onnx-zh-wenet-aishell2\n# sherpa-onnx-zh-wenet-wenetspeech\n# sherpa-onnx-zh-wenet-multi-cn\nsherpa-onnx-en-wenet-librispeech\n# sherpa-onnx-en-wenet-gigaspeech\n)\nfor name in ${wenet_models[@]}; do\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$name.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf $name.tar.bz2\n  rm $name.tar.bz2\n  repo=$name\n  log \"Start testing ${repo_url}\"\n\n  log \"test float32 models\"\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --wenet-ctc-model=$repo/model-streaming.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  log \"test int8 models\"\n  time $EXE \\\n    --tokens=$repo/tokens.txt \\\n    --wenet-ctc-model=$repo/model-streaming.int8.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  rm -rf $repo\ndone\n"
  },
  {
    "path": ".github/scripts/test-online-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Paraformer\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrepo=sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\nlog \"Start testing ${repo_url}\"\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --paraformer-encoder=$repo/encoder.onnx \\\n  --paraformer-decoder=$repo/decoder.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/2.wav \\\n  $repo/test_wavs/3.wav \\\n  $repo/test_wavs/8k.wav\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --paraformer-encoder=$repo/encoder.int8.onnx \\\n  --paraformer-decoder=$repo/decoder.int8.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/2.wav \\\n  $repo/test_wavs/3.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-online-punctuation.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\necho \"TODO(fangjun): Skip this test since the sanitizer test is failed. We need to fix it\"\nexit 0\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Download the punctuation model                             \"\nlog \"------------------------------------------------------------\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\ntar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrepo=sherpa-onnx-online-punct-en-2024-08-06\nls -lh $repo\n\nfor m in model.onnx model.int8.onnx; do\n  $EXE \\\n   --debug=1 \\\n   --cnn-bilstm=$repo/$m \\\n   --bpe-vocab=$repo/bpe.vocab \\\n   \"How are you i am fine thank you\"\n\n  $EXE \\\n   --debug=1 \\\n   --cnn-bilstm=$repo/$m \\\n   --bpe-vocab=$repo/bpe.vocab \\\n   \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\"\ndone\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-online-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nlog \"------------------------------------------------------------\"\nlog \"Run NeMo transducer (English)\"\nlog \"------------------------------------------------------------\"\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\nrm sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\nrepo=sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms\n\nlog \"Start testing ${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\ntime $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder.onnx \\\n  --decoder=$repo/decoder.onnx \\\n  --joiner=$repo/joiner.onnx \\\n  --num-threads=2 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run LSTM transducer (English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-lstm-en-2023-02-17.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-lstm-en-2023-02-17.tar.bz2\nrm sherpa-onnx-lstm-en-2023-02-17.tar.bz2\nrepo=sherpa-onnx-lstm-en-2023-02-17\n\nlog \"Start testing 
${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run LSTM transducer (Chinese)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-lstm-zh-2023-02-20.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-lstm-zh-2023-02-20.tar.bz2\nrm sherpa-onnx-lstm-zh-2023-02-20.tar.bz2\nrepo=sherpa-onnx-lstm-zh-2023-02-20\n\nlog \"Start testing ${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-11-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-11-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-11-avg-1.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-11-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-11-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-11-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Zipformer transducer (English)\"\nlog 
\"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-2023-02-21.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-streaming-zipformer-en-2023-02-21.tar.bz2\nrm sherpa-onnx-streaming-zipformer-en-2023-02-21.tar.bz2\nrepo=sherpa-onnx-streaming-zipformer-en-2023-02-21\n\nlog \"Start testing ${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\n# test int8\n#\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nlm_repo_url=https://huggingface.co/vsd-vector/icefall-librispeech-rnn-lm\nlog \"Download pre-trained RNN-LM model from ${lm_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $lm_repo_url\nlm_repo=$(basename $lm_repo_url)\npushd $lm_repo\ngit lfs pull --include \"with-state-epoch-99-avg-1.onnx\"\npopd\n\nbigram_repo_url=https://huggingface.co/vsd-vector/librispeech_bigram_sherpa-onnx-zipformer-large-en-2023-06-26\nlog \"Download bi-gram LM from ${bigram_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $bigram_repo_url\nbigramlm_repo=$(basename $bigram_repo_url)\npushd $bigramlm_repo\ngit lfs pull --include \"2gram.fst\"\npopd\n\nlog \"Start testing LODR\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  
--joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding_method=\"modified_beam_search\" \\\n  --lm=$lm_repo/with-state-epoch-99-avg-1.onnx \\\n  --lodr-fst=$bigramlm_repo/2gram.fst \\\n  --lodr-scale=-0.5  \\\n  $wave\ndone\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding_method=\"modified_beam_search\" \\\n  --lm=$lm_repo/with-state-epoch-99-avg-1.onnx \\\n  --lodr-fst=$bigramlm_repo/2gram.fst \\\n  --lodr-scale=-0.5  \\\n  --lm-shallow-fusion=true \\\n  $wave\ndone\n\nrm -rf $repo $bigramlm_repo $lm_repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Zipformer transducer (Bilingual, Chinese + English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrepo=sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\nlog \"Start testing ${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/2.wav\n$repo/test_wavs/3.wav\n$repo/test_wavs/8k.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  
--joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\n# Decode a URL\nif [ $EXE == \"sherpa-onnx-ffmpeg\" ]; then\n  time $EXE \\\n  $repo/tokens.txt \\\n  $repo/encoder-epoch-99-avg-1.onnx \\\n  $repo/decoder-epoch-99-avg-1.onnx \\\n  $repo/joiner-epoch-99-avg-1.onnx \\\n  https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/resolve/main/test_wavs/4.wav \\\n  2\nfi\n\nif [ $EXE == \"sherpa-onnx-ffmpeg\" ]; then\n  time $EXE \\\n  $repo/tokens.txt \\\n  $repo/encoder-epoch-99-avg-1.int8.onnx \\\n  $repo/decoder-epoch-99-avg-1.onnx \\\n  $repo/joiner-epoch-99-avg-1.int8.onnx \\\n  https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/resolve/main/test_wavs/4.wav \\\n  2\nfi\n\nrm -rf $repo\n\nlog \"------------------------------------------------------------\"\nlog \"Run streaming Conformer transducer (English)\"\nlog \"------------------------------------------------------------\"\n\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-conformer-en-2023-05-09.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-streaming-conformer-en-2023-05-09.tar.bz2\nrm sherpa-onnx-streaming-conformer-en-2023-05-09.tar.bz2\nrepo=sherpa-onnx-streaming-conformer-en-2023-05-09\n\nlog \"Start testing ${repo_url}\"\n\nwaves=(\n$repo/test_wavs/0.wav\n$repo/test_wavs/1.wav\n$repo/test_wavs/2.wav\n)\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  $wave\ndone\n\nfor wave in ${waves[@]}; do\n  time $EXE \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  --num-threads=2 \\\n  
$wave\ndone\n\nrm -rf $repo\n"
  },
  {
    "path": ".github/scripts/test-python.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nlog \"test Supertonic TTS\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\npython3 python-api-examples/supertonic-tts.py\n\nrm -rf sherpa-onnx-supertonic-tts-int8-2026-03-06\n\nmkdir -p tts\ncp supertonic-en.wav tts/\nls -lh tts\n\nlog \"test Moonshine v2\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\nls -lh sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\n\npython3 ./python-api-examples/offline-moonshine-decode-files-v2.py\n\nrm -rf  sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\n\nlog \"test FireRedASR CTC\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\npython3 ./python-api-examples/offline-fire-red-asr-ctc-decode-files.py\n\nrm -rf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n\nlog \"test FireRedASR AED\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\nrm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\npython3 ./python-api-examples/offline-fire-red-asr-decode-files.py\n\nrm -rf 
sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n\nlog \"test PocketTTS\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\npython3 ./python-api-examples/pocket-tts.py\n\nrm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n\nlog \"test ZipVoice TTS\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\npython3 ./python-api-examples/zipvoice-tts.py\n\ncp generated-zipvoice-zh-en-python.wav tts/\n\nrm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\nrm -f vocos_24khz.onnx\n\nlog \"test Google MedASR\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nls -lh sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n\nls -lh sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs\n\npython3 ./python-api-examples/offline-medasr-ctc-decode-files.py\nrm -rf sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n\nlog \"test omnilingual ASR\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nls -lh sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n\npython3 ./python-api-examples/offline-omnilingual-asr-ctc-decode-files.py\n\nrm -rf 
sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n\nlog \"test T-one\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\npython3 ./python-api-examples/online-t-one-ctc-decode-files.py\n\nrm -rf sherpa-onnx-streaming-t-one-russian-2025-09-08\n\nlog \"test nemo canary\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nrm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\npython3 ./python-api-examples/offline-nemo-canary-decode-files.py\nrm -rf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n\nlog \"test spleeter\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/sherpa-onnx-spleeter-2stems-fp16.tar.bz2\ntar xvf sherpa-onnx-spleeter-2stems-fp16.tar.bz2\nrm sherpa-onnx-spleeter-2stems-fp16.tar.bz2\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/qi-feng-le-zh.wav\n./python-api-examples/offline-source-separation-spleeter.py\nrm -rf sherpa-onnx-spleeter-2stems-fp16\nrm qi-feng-le-zh.wav\n\nlog \"test UVR\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/UVR_MDXNET_9482.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/qi-feng-le-zh.wav\n./python-api-examples/offline-source-separation-uvr.py\nrm UVR_MDXNET_9482.onnx\nrm qi-feng-le-zh.wav\n\nmkdir source-separation\n\nmv spleeter-*.wav source-separation\nmv uvr-*.wav source-separation\n\nls -lh source-separation\n\n\nlog \"test offline dolphin ctc\"\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\ntar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nrm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\npython3 ./python-api-examples/offline-dolphin-ctc-decode-files.py\n\nrm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\nlog \"test offline speech enhancement (GTCRN)\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav\npython3 ./python-api-examples/offline-speech-enhancement-gtcrn.py\npython3 ./python-api-examples/offline-speech-enhancement-dpdfnet.py\npython3 ./python-api-examples/online-speech-enhancement-gtcrn.py\npython3 ./python-api-examples/online-speech-enhancement-dpdfnet.py\nls -lh *.wav\n\nlog \"test offline zipformer (byte-level bpe, Chinese+English)\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\ntar xvf sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\nrm sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\n\nrepo=sherpa-onnx-zipformer-zh-en-2023-11-22\n\n./python-api-examples/offline-decode-files.py  \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-34-avg-19.int8.onnx \\\n  --decoder=$repo/decoder-epoch-34-avg-19.onnx \\\n  --joiner=$repo/joiner-epoch-34-avg-19.int8.onnx \\\n  --num-threads=2 \\\n  --decoding-method=greedy_search \\\n  --debug=true \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/2.wav\n\nrm -rf sherpa-onnx-zipformer-zh-en-2023-11-22\n\nlog \"test offline Moonshine\"\n\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\npython3 ./python-api-examples/offline-moonshine-decode-files.py\n\nrm -rf sherpa-onnx-moonshine-tiny-en-int8\n\nlog \"test offline speaker diarization\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\npython3 ./python-api-examples/offline-speaker-diarization.py\n\nrm -rf *.wav *.onnx ./sherpa-onnx-pyannote-segmentation-3-0\n\n\nlog \"test_clustering\"\npushd /tmp/\nmkdir test-cluster\ncd test-cluster\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\ngit clone https://github.com/csukuangfj/sr-data\npopd\n\npython3 ./sherpa-onnx/python/tests/test_fast_clustering.py\n\nrm -rf /tmp/test-cluster\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\nlog \"test offline SenseVoice CTC\"\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nname=$(basename $url)\nrepo=$(basename -s .tar.bz2 $name)\n\ncurl -SL -O $url\ntar xvf $name\nrm $name\nls -lh $repo\npython3 ./python-api-examples/offline-sense-voice-ctc-decode-files.py\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\nrm dict.tar.bz2\n\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\npython3 ./python-api-examples/offline-sense-voice-ctc-decode-files-with-hr.py\n\nrm -rf dict replace.fst test-hr.wav lexicon.txt\n\nif [[ $(uname) == Linux ]]; then\n  # It needs ffmpeg\n  log  \"generate subtitles (Chinese)\"\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n\n  python3 ./python-api-examples/generate-subtitles.py \\\n    --silero-vad-model=./silero_vad.onnx \\\n    --sense-voice=$repo/model.onnx \\\n    --tokens=$repo/tokens.txt \\\n    --num-threads=2 \\\n    ./lei-jun-test.wav\n\n  cat lei-jun-test.srt\n\n  rm lei-jun-test.wav\n\n  log  \"generate subtitles (English)\"\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n\n  python3 ./python-api-examples/generate-subtitles.py \\\n    --silero-vad-model=./silero_vad.onnx \\\n    --sense-voice=$repo/model.onnx \\\n    --tokens=$repo/tokens.txt \\\n    --num-threads=2 \\\n    ./Obama.wav\n\n  cat Obama.srt\n  rm Obama.wav\n  rm silero_vad.onnx\nfi\nrm -rf $repo\n\nlog \"test offline TeleSpeech CTC\"\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nname=$(basename $url)\nrepo=$(basename -s .tar.bz2 $name)\n\ncurl -SL -O $url\ntar xvf $name\nrm $name\nls -lh $repo\npython3 ./python-api-examples/offline-telespeech-ctc-decode-files.py\nrm -rf $repo\n\nlog \"test online NeMo CTC\"\n\nurl=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms.tar.bz2\nname=$(basename $url)\nrepo=$(basename -s .tar.bz2 $name)\n\ncurl -SL -O 
$url\ntar xvf $name\nrm $name\nls -lh $repo\npython3 ./python-api-examples/online-nemo-ctc-decode-files.py\nrm -rf $repo\n\nlog \"test offline punctuation\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\ntar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrepo=sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\nls -lh $repo\n\npython3 ./python-api-examples/add-punctuation.py\n\nrm -rf $repo\n\nlog \"test online punctuation\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\ntar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrepo=sherpa-onnx-online-punct-en-2024-08-06\nls -lh $repo\n\npython3 ./python-api-examples/add-punctuation-online.py\n\nrm -rf $repo\n\nlog \"test audio tagging\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\npython3 ./python-api-examples/audio-tagging-from-a-file.py\nrm -rf sherpa-onnx-zipformer-audio-tagging-2024-04-09\n\n\nlog \"test streaming zipformer2 ctc HLG decoding\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrepo=sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\n\npython3 ./python-api-examples/online-zipformer-ctc-hlg-decode-file.py \\\n  --debug 1 \\\n  --tokens 
./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt \\\n  --graph ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst \\\n  --model ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/0.wav\n\nrm -rf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\n\n\nmkdir -p /tmp/icefall-models\ndir=/tmp/icefall-models\n\npushd $dir\n\nrepo=$dir/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18\nmkdir -p $repo\ncd $repo\nmkdir exp-ctc-rnnt-small\ncd exp-ctc-rnnt-small\ncurl -LS -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/exp-ctc-rnnt-small/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\ncd ..\nmkdir -p data/lang_bpe_500\ncd data/lang_bpe_500\ncurl -LS -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/data/lang_bpe_500/tokens.txt\ncd ../..\nmkdir test_wavs\ncd test_wavs\n\ncurl -LS -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/0.wav\ncurl -LS -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/1.wav\ncurl -LS -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/8k.wav\npopd\n\npython3 ./python-api-examples/online-decode-files.py \\\n  --tokens=$repo/data/lang_bpe_500/tokens.txt \\\n  --zipformer2-ctc=$repo/exp-ctc-rnnt-small/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nrm -rf $repo\n\npython3 sherpa-onnx/python/tests/test_offline_recognizer.py --verbose\n\nwenet_models=(\n# sherpa-onnx-zh-wenet-aishell\n# sherpa-onnx-zh-wenet-aishell2\n# sherpa-onnx-zh-wenet-wenetspeech\n# 
sherpa-onnx-zh-wenet-multi-cn\nsherpa-onnx-en-wenet-librispeech\n# sherpa-onnx-en-wenet-gigaspeech\n)\n\nfor name in ${wenet_models[@]}; do\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$name.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf $name.tar.bz2\n  rm $name.tar.bz2\n  repo=$name\n  log \"Start testing ${repo_url}\"\n\n  if false; then\n    # offline wenet ctc models are not supported by onnxruntime >= 1.18\n    python3 ./python-api-examples/offline-decode-files.py \\\n      --tokens=$repo/tokens.txt \\\n      --wenet-ctc=$repo/model.onnx \\\n      $repo/test_wavs/0.wav \\\n      $repo/test_wavs/1.wav \\\n      $repo/test_wavs/8k.wav\n  fi\n\n  python3 ./python-api-examples/online-decode-files.py \\\n    --tokens=$repo/tokens.txt \\\n    --wenet-ctc=$repo/model-streaming.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/8k.wav\n\n  python3 sherpa-onnx/python/tests/test_offline_recognizer.py --verbose\n\n  python3 sherpa-onnx/python/tests/test_online_recognizer.py --verbose\n\n  rm -rf $repo\ndone\n\nlog \"Offline TTS test\"\n# test waves are saved in ./tts\nmkdir -p ./tts\n\nlog \"test kitten tts\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kitten-model=./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --kitten-voices=./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --kitten-tokens=./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --kitten-data-dir=./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=0 \\\n  --output-filename=\"./tts/kitten-0.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nrm -rf kitten-nano-en-v0_1-fp16\n\nlog \"kokoro-multi-lang-v1_0 test\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-multi-lang-v1_0/model.onnx \\\n  --kokoro-voices=./kokoro-multi-lang-v1_0/voices.bin \\\n  --kokoro-tokens=./kokoro-multi-lang-v1_0/tokens.txt \\\n  --kokoro-data-dir=./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --kokoro-lexicon=./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --num-threads=2 \\\n  --sid=18 \\\n  --output-filename=\"./tts/kokoro-18-zh-en.wav\" \\\n  \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？\"\n\nrm -rf kokoro-multi-lang-v1_0\n\nlog \"kokoro-en-v0_19 test\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n  --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n  --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n  --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=10 \\\n  --output-filename=\"./tts/kokoro-10.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nrm -rf kokoro-en-v0_19\n\nlog \"matcha-ljspeech-en test\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts.py \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --output-filename=./tts/test-matcha-ljspeech-en.wav \\\n  --num-threads=2 \\\n \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nrm vocos-22khz-univ.onnx\nrm -rf matcha-icefall-en_US-ljspeech\n\nlog \"matcha-baker-zh test\"\n\ncurl -O -SL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts.py \\\n --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n --matcha-vocoder=./vocos-22khz-univ.onnx \\\n --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n --output-filename=./tts/test-matcha-baker-zh.wav 
\\\n \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n\nrm -rf matcha-icefall-zh-baker\nrm vocos-22khz-univ.onnx\n\nlog \"vits-ljs test\"\n\ncurl -LS -O https://huggingface.co/csukuangfj/vits-ljs/resolve/main/vits-ljs.onnx\ncurl -LS -O https://huggingface.co/csukuangfj/vits-ljs/resolve/main/lexicon.txt\ncurl -LS -O https://huggingface.co/csukuangfj/vits-ljs/resolve/main/tokens.txt\n\npython3 ./python-api-examples/offline-tts.py \\\n  --vits-model=./vits-ljs.onnx \\\n  --vits-lexicon=./lexicon.txt \\\n  --vits-tokens=./tokens.txt \\\n  --output-filename=./tts/vits-ljs.wav \\\n  'liliana, the most beautiful and lovely assistant of our team!'\n\nls -lh ./tts\n\nrm -v vits-ljs.onnx ./lexicon.txt ./tokens.txt\n\nlog \"vits-vctk test\"\ncurl -LS -O https://huggingface.co/csukuangfj/vits-vctk/resolve/main/vits-vctk.onnx\ncurl -LS -O https://huggingface.co/csukuangfj/vits-vctk/resolve/main/lexicon.txt\ncurl -LS -O https://huggingface.co/csukuangfj/vits-vctk/resolve/main/tokens.txt\n\nfor sid in 0 10 90; do\n  python3 ./python-api-examples/offline-tts.py \\\n    --vits-model=./vits-vctk.onnx \\\n    --vits-lexicon=./lexicon.txt \\\n    --vits-tokens=./tokens.txt \\\n    --sid=$sid \\\n    --output-filename=./tts/vits-vctk-${sid}.wav \\\n    'liliana, the most beautiful and lovely assistant of our team!'\ndone\n\nrm -v vits-vctk.onnx ./lexicon.txt ./tokens.txt\n\nif [[ x$OS != x'windows-latest' ]]; then\n  echo \"OS: $OS\"\n\n  log \"vits-zh-aishell3\"\n\n  curl -LS -O https://huggingface.co/csukuangfj/vits-zh-aishell3/resolve/main/vits-aishell3.onnx\n  curl -LS -O https://huggingface.co/csukuangfj/vits-zh-aishell3/resolve/main/lexicon.txt\n  curl -LS -O https://huggingface.co/csukuangfj/vits-zh-aishell3/resolve/main/tokens.txt\n\n  for sid in 0 10 90; do\n    python3 ./python-api-examples/offline-tts.py \\\n      --vits-model=./vits-aishell3.onnx \\\n      --vits-lexicon=./lexicon.txt \\\n      --vits-tokens=./tokens.txt \\\n      
--sid=$sid \\\n      --output-filename=./tts/vits-aishell3-${sid}.wav \\\n      '林美丽最美丽'\n  done\n\n  rm -v vits-aishell3.onnx ./lexicon.txt ./tokens.txt\nfi\n\nmkdir -p /tmp/icefall-models\ndir=/tmp/icefall-models\n\nlog \"Test streaming transducer models\"\n\nif [[ x$OS != x'windows-latest' ]]; then\n  echo \"OS: $OS\"\n  pushd $dir\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  repo=sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\n  log \"Start testing ${repo_url}\"\n  repo=$dir/$repo\n\n  python3 -c \"import sherpa_onnx; print(sherpa_onnx.__file__)\"\n  sherpa_onnx_version=$(python3 -c \"import sherpa_onnx; print(sherpa_onnx.__version__)\")\n\n  echo \"sherpa_onnx version: $sherpa_onnx_version\"\n\n  pwd\n  ls -lh\n\n  ls -lh $repo\n  popd\n\n  python3 ./python-api-examples/online-decode-files.py \\\n    --tokens=$repo/tokens.txt \\\n    --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n    --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n    --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/2.wav \\\n    $repo/test_wavs/3.wav \\\n    $repo/test_wavs/8k.wav\n\n  python3 ./python-api-examples/online-decode-files.py \\\n    --tokens=$repo/tokens.txt \\\n    --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n    --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n    --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/2.wav \\\n    $repo/test_wavs/3.wav \\\n    $repo/test_wavs/8k.wav\n\n  ln -s $repo $PWD/\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n  curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\n  python3 ./python-api-examples/inverse-text-normalization-online-asr.py\n\n  python3 sherpa-onnx/python/tests/test_online_recognizer.py --verbose\n\n  rm -rfv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\n  rm -rf $repo\nfi\n\nlog \"Test non-streaming transducer models\"\n\npushd $dir\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nlog \"Download pretrained model and test-data from $repo_url\"\n\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nrm sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nrepo=$dir/sherpa-onnx-zipformer-en-2023-04-01\n\npopd\n\nls -lh $repo\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.int8.onnx \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\nlm_repo_url=https://huggingface.co/ezerhouni/icefall-librispeech-rnn-lm\nlog \"Download pre-trained RNN-LM model from ${lm_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $lm_repo_url\nlm_repo=$(basename $lm_repo_url)\npushd $lm_repo\ngit lfs pull --include \"exp/no-state-epoch-99-avg-1.onnx\"\npopd\n\nbigram_repo_url=https://huggingface.co/vsd-vector/librispeech_bigram_sherpa-onnx-zipformer-large-en-2023-06-26\nlog \"Download bi-gram LM from ${bigram_repo_url}\"\nGIT_LFS_SKIP_SMUDGE=1 git clone $bigram_repo_url\nbigramlm_repo=$(basename $bigram_repo_url)\npushd 
$bigramlm_repo\ngit lfs pull --include \"2gram.fst\"\npopd\n\nlog \"Perform offline decoding with RNN-LM and LODR\"\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=$repo/tokens.txt \\\n  --encoder=$repo/encoder-epoch-99-avg-1.onnx \\\n  --decoder=$repo/decoder-epoch-99-avg-1.onnx \\\n  --joiner=$repo/joiner-epoch-99-avg-1.onnx \\\n  --decoding-method=modified_beam_search \\\n  --lm=$lm_repo/exp/no-state-epoch-99-avg-1.onnx \\\n  --lodr-fst=$bigramlm_repo/2gram.fst \\\n  --lodr-scale=-0.5 \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\npython3 sherpa-onnx/python/tests/test_offline_recognizer.py --verbose\n\nrm -rf $repo $lm_repo $bigramlm_repo\n\nlog \"Test non-streaming paraformer models\"\n\nif [[ x$OS != x'windows-latest' ]]; then\n  echo \"OS: $OS\"\n  pushd $dir\n  repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  log \"Start testing ${repo_url}\"\n  repo=$dir/sherpa-onnx-paraformer-zh-2023-09-14\n\n  ls -lh $repo\n  popd\n\n  python3 ./python-api-examples/offline-decode-files.py \\\n    --tokens=$repo/tokens.txt \\\n    --paraformer=$repo/model.int8.onnx \\\n    $repo/test_wavs/0.wav \\\n    $repo/test_wavs/1.wav \\\n    $repo/test_wavs/2.wav \\\n    $repo/test_wavs/8k.wav\n\n  python3 sherpa-onnx/python/tests/test_offline_recognizer.py --verbose\n\n  ln -s $repo $PWD/\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\n  python3 ./python-api-examples/inverse-text-normalization-offline-asr.py\n\n  rm -rfv sherpa-onnx-paraformer-zh-2023-09-14\n\n  rm -rf $repo\nfi\n\nlog \"Test non-streaming NeMo CTC models\"\n\npushd 
$dir\nrepo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\ncurl -SL -O $repo_url\ntar xvf sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\nrm sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\n\nlog \"Start testing ${repo_url}\"\nrepo=$dir/sherpa-onnx-nemo-ctc-en-citrinet-512\n\nls -lh $repo\npopd\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc=$repo/model.onnx \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=$repo/tokens.txt \\\n  --nemo-ctc=$repo/model.int8.onnx \\\n  $repo/test_wavs/0.wav \\\n  $repo/test_wavs/1.wav \\\n  $repo/test_wavs/8k.wav\n\npython3 sherpa-onnx/python/tests/test_offline_recognizer.py --verbose\n\nrm -rf $repo\n\n# test text2token\ngit clone https://github.com/pkufool/sherpa-test-data /tmp/sherpa-test-data\n\npython3 sherpa-onnx/python/tests/test_text2token.py --verbose\n\nrm -rf /tmp/sherpa-test-data\n\ndir=/tmp/onnx-models\nmkdir -p $dir\n\nlog \"Test keyword spotting models\"\n\npython3 -c \"import sherpa_onnx; print(sherpa_onnx.__file__)\"\nsherpa_onnx_version=$(python3 -c \"import sherpa_onnx; print(sherpa_onnx.__version__)\")\n\necho \"sherpa_onnx version: $sherpa_onnx_version\"\n\npwd\nls -lh\n\nif [[ x$OS != x'windows-latest' ]]; then\n  echo \"OS: $OS\"\n\n  repo=sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n  log \"Start testing ${repo}\"\n\n  curl -LS -O https://github.com/pkufool/keyword-spotting-models/releases/download/v0.1/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz\n  tar xf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz\n\n  ls -lh $repo\n\n  python3 ./python-api-examples/keyword-spotter.py\n\n  python3 sherpa-onnx/python/tests/test_keyword_spotter.py --verbose\n\n  rm -rf $repo\nfi\n\nrm -r 
$dir\n"
  },
  {
    "path": ".github/scripts/test-rust.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ncd rust-api-examples\n\n./run-audio-tagging-zipformer.sh\nrm -rf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15\n\n./run-audio-tagging-ced.sh\nrm -rf sherpa-onnx-ced-mini-audio-tagging-2024-04-19\n\n./run-speaker-embedding-extractor.sh\n./run-speaker-embedding-manager.sh\nrm -f 3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\nrm -rf sr-data\n\n./run-speaker-embedding-cosine-similarity.sh\nrm -f wespeaker_zh_cnceleb_resnet34.onnx fangjun-sr-1.wav fangjun-sr-2.wav leijun-sr-1.wav\n\n./run-offline-speaker-diarization.sh\nrm -rf sherpa-onnx-pyannote-segmentation-3-0\nrm -f 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx 0-four-speakers-zh.wav\n\n./run-vits-en.sh\nrm -rf vits-piper-en_US-amy-low\n\n./run-vits-de.sh\nrm -rf vits-piper-de_DE-glados-high\n\n./run-matcha-tts-en.sh\n./run-matcha-tts-zh.sh\nrm -rf matcha-icefall-en_US-ljspeech matcha-icefall-zh-baker\nrm -f vocos-22khz-univ.onnx\n\n./run-kokoro-tts-en.sh\nrm -rf kokoro-en-v0_19\n\n./run-kokoro-tts-zh-en.sh\nrm -rf kokoro-multi-lang-v1_0\n\n./run-kitten-tts-en.sh\nrm -rf kitten-nano-en-v0_1-fp16\n\n./run-pocket-tts.sh\nrm -rf sherpa-onnx-pocket-*\n\n./run-supertonic-tts.sh\nrm -rf sherpa-onnx-supertonic-*\n\n./run-zipvoice-tts.sh\nrm -rf sherpa-onnx-zipvoice-*\nrm -f vocos_24khz.onnx\n\n./run-online-punctuation.sh\nrm -rf sherpa-onnx-online-punct-*\n\n./run-keyword-spotter.sh\nrm -rf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile\n\n./run-spoken-language-identification.sh\nrm -rf sherpa-onnx-whisper-tiny spoken-language-identification-test-wavs\n\n./run-offline-punctuation.sh\nrm -rf 
sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8\n\n./run-version.sh\n\n./run-moonshine-v2.sh\n\n./run-fire-red-asr-ctc.sh\n\n./run-silero-vad-remove-silence.sh\n\n./run-nemo-parakeet-en.sh\n./run-zipformer-vi.sh\n./run-zipformer-zh-en.sh\n./run-zipformer-en.sh\n\n./run-sense-voice.sh\n\n./run-streaming-zipformer-en.sh\n./run-streaming-zipformer-zh-en.sh\n\n./run-offline-speech-enhancement-gtcrn.sh\n./run-offline-speech-enhancement-dpdfnet.sh\n./run-streaming-speech-enhancement-gtcrn.sh\n./run-streaming-speech-enhancement-dpdfnet.sh\n"
  },
  {
    "path": ".github/scripts/test-speaker-diarization.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nlog \"specify number of clusters\"\n$EXE \\\n  --clustering.num-clusters=4 \\\n  --segmentation.pyannote-model=./sherpa-onnx-pyannote-segmentation-3-0/model.onnx \\\n  --embedding.model=./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx \\\n  ./0-four-speakers-zh.wav\n\nlog \"specify threshold for clustering\"\n\n$EXE \\\n  --clustering.cluster-threshold=0.90 \\\n  --segmentation.pyannote-model=./sherpa-onnx-pyannote-segmentation-3-0/model.onnx \\\n  --embedding.model=./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx \\\n  ./0-four-speakers-zh.wav\n\nrm -rf sherpa-onnx-pyannote-*\nrm -fv *.onnx\nrm -fv *.wav\n"
  },
  {
    "path": ".github/scripts/test-speaker-recognition-python.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nd=/tmp/sr-models\nmkdir -p $d\n\npushd $d\nlog \"Download test waves\"\ngit clone https://github.com/csukuangfj/sr-data\npopd\n\nlog \"Download wespeaker models\"\nmodel_dir=$d/wespeaker\nmkdir -p $model_dir\npushd $model_dir\nmodels=(\nwespeaker_en_voxceleb_CAM++.onnx\nwespeaker_en_voxceleb_CAM++_LM.onnx\nwespeaker_en_voxceleb_resnet152_LM.onnx\nwespeaker_en_voxceleb_resnet221_LM.onnx\nwespeaker_en_voxceleb_resnet293_LM.onnx\nwespeaker_en_voxceleb_resnet34.onnx\nwespeaker_en_voxceleb_resnet34_LM.onnx\nwespeaker_zh_cnceleb_resnet34.onnx\nwespeaker_zh_cnceleb_resnet34_LM.onnx\n)\nfor m in ${models[@]}; do\n  curl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/$m\n  curl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_en_voxceleb_CAM++_LM.onnx\ndone\nls -lh\npopd\n\nlog \"Download 3d-speaker models\"\nmodel_dir=$d/3dspeaker\nmkdir -p $model_dir\npushd $model_dir\nmodels=(\n3dspeaker_speech_campplus_sv_en_voxceleb_16k.onnx\n3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n3dspeaker_speech_eres2net_base_200k_sv_zh-cn_16k-common.onnx\n3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n3dspeaker_speech_eres2net_large_sv_zh-cn_3dspeaker_16k.onnx\n3dspeaker_speech_eres2net_sv_en_voxceleb_16k.onnx\n3dspeaker_speech_eres2net_sv_zh-cn_16k-common.onnx\n)\nfor m in ${models[@]}; do\n  curl -LS -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/$m\ndone\nls -lh\npopd\n\nlog \"Download NeMo models\"\nmodel_dir=$d/nemo\nmkdir -p $model_dir\npushd $model_dir\nmodels=(\nnemo_en_titanet_large.onnx\nnemo_en_titanet_small.onnx\nnemo_en_speakerverification_speakernet.onnx\n)\nfor m in ${models[@]}; do\n  curl -LS -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/$m\ndone\nls -lh\npopd\n\npython3 sherpa-onnx/python/tests/test_speaker_recognition.py --verbose\n"
  },
  {
    "path": ".github/scripts/test-spoken-language-identification.sh",
    "content": "#!/usr/bin/env bash\n\nset -e\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\necho \"EXE is $EXE\"\necho \"PATH: $PATH\"\n\nwhich $EXE\n\nnames=(\ntiny\nbase\nsmall\nmedium\n)\n\n# all_language_codes=bo,ml,tt,fa,sl,bg,sn,sr,tl,km,ln,mr,hr,eu,ro,ba,bs,pl,as,nn,sk,ko,oc,ar,uz,pa,tg,mk,kk,hi,ha,uk,is,de,el,ja,yo,be,so,tk,id,sa,ru,yi,en,am,cs,ne,la,sv,su,pt,mi,ca,sd,hy,haw,fi,et,kn,da,lt,it,nl,he,mg,ur,tr,af,br,bn,ta,no,my,si,mt,th,gl,sw,mn,jw,ms,ps,fo,ka,hu,zh,ht,az,fr,lo,sq,gu,cy,lv,es,lb,te,vi\n\nlog \"Download test waves\"\nwaves=(\nar-arabic.wav\nbg-bulgarian.wav\ncs-czech.wav\nda-danish.wav\n# de-german.wav\n# el-greek.wav\n# en-english.wav\n# es-spanish.wav\n# fa-persian.wav\n# fi-finnish.wav\n# fr-french.wav\n# hi-hindi.wav\n# hr-croatian.wav\n# id-indonesian.wav\n# it-italian.wav\n# ja-japanese.wav\n# ko-korean.wav\n# nl-dutch.wav\n# no-norwegian.wav\n# po-polish.wav\n# pt-portuguese.wav\n# ro-romanian.wav\n# ru-russian.wav\n# sk-slovak.wav\n# sv-swedish.wav\n# ta-tamil.wav\n# tl-tagalog.wav\n# tr-turkish.wav\n# uk-ukrainian.wav\n# zh-chinese.wav\n)\n\nfor wav in ${waves[@]}; do\n  echo \"Downloading $wav\"\n  curl -SL -O https://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/resolve/main/test_wavs/$wav\n  ls -lh *.wav\ndone\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\ntar xvf spoken-language-identification-test-wavs.tar.bz2\nrm spoken-language-identification-test-wavs.tar.bz2\ndata=spoken-language-identification-test-wavs\n\nfor name in ${names[@]}; do\n  log \"------------------------------------------------------------\"\n  log \"Run $name\"\n  log \"------------------------------------------------------------\"\n  
repo_url=https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-$name.tar.bz2\n  curl -SL -O $repo_url\n  tar xvf sherpa-onnx-whisper-$name.tar.bz2\n  rm sherpa-onnx-whisper-$name.tar.bz2\n\n  log \"Start testing ${repo_url}\"\n  repo=sherpa-onnx-whisper-$name\n\n  for wav in ${waves[@]}; do\n    log \"test fp32 onnx\"\n\n    time $EXE \\\n      --whisper-encoder=$repo/${name}-encoder.onnx \\\n      --whisper-decoder=$repo/${name}-decoder.onnx \\\n      $data/$wav\n\n    log \"test int8 onnx\"\n\n    time $EXE \\\n      --whisper-encoder=$repo/${name}-encoder.int8.onnx \\\n      --whisper-decoder=$repo/${name}-decoder.int8.onnx \\\n      $data/$wav\n  done\n  rm -rf $repo\ndone\n"
  },
  {
    "path": ".github/scripts/test-swift.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\necho \"pwd: $PWD\"\n\ncd swift-api-examples\nls -lh\n\n./run-test-version.sh\n\n./run-moonshine-v2-asr.sh\nrm -rf sherpa-onnx-moonshine-*\n\n./run-fire-red-asr-ctc.sh\nrm -rf sherpa-onnx-fire-red-*\n\n./run-tts-pocket-en.sh\nls -lh\nrm -rf sherpa-onnx-pocket-*\n\n./run-tts-supertonic-en.sh\nls -lh\nrm -rf sherpa-onnx-supertonic-*\n\n./run-medasr-ctc-asr.sh\nrm -rf sherpa-onnx-medasr-*\n\n./run-funasr-nano-asr.sh\nrm -rf sherpa-onnx-funasr-nano-*\n\n./run-omnilingual-asr-ctc-asr.sh\nrm -rf sherpa-onnx-omnilingual-*\n\n./run-decode-file-t-one-streaming.sh\nrm -rf sherpa-onnx-streaming-*\n\n./run-compute-speaker-embeddings.sh\nrm -fv *.wav *.onnx\n\n./run-tts-kitten-en.sh\nls -lh\nrm -rf kitten-*\n\n./run-wenet-ctc-asr.sh\nrm -rf sherpa-onnx-*\n\n./run-zipformer-ctc-asr.sh\nrm -rf sherpa-onnx-zipformer-*\n\n./run-decode-file-sense-voice-with-hr.sh\nrm -rf sherpa-onnx-sense-voice-*\nrm -rf dict lexicon.txt replace.fst test-hr.wav\n\n./run-dolphin-ctc-asr.sh\nrm -rf sherpa-onnx-dolphin-*\n\n./run-speech-enhancement-gtcrn.sh\n./run-speech-enhancement-dpdfnet.sh\n./run-online-speech-enhancement-gtcrn.sh\n./run-online-speech-enhancement-dpdfnet.sh\nls -lh *.wav\n\n./run-fire-red-asr.sh\nrm -rf sherpa-onnx-fire-red-asr-*\n\n./run-tts-vits.sh\nls -lh\nrm -rf vits-piper-*\n\n./run-tts-kokoro-zh-en.sh\nls -lh\nrm -rf kokoro-multi-*\n\n./run-tts-kokoro-en.sh\nls -lh\nrm -rf kokoro-en-*\n\n./run-tts-matcha-zh.sh\nls -lh\nrm -rf matcha-icefall-*\n\n./run-tts-matcha-en.sh\nls -lh\nrm -rf matcha-icefall-*\n\n./run-tts-zipvoice.sh\nls -lh\nrm -rf sherpa-onnx-zipvoice-*\nrm -f vocos_24khz.onnx\n\n./run-speaker-diarization.sh\nrm -rf *.onnx\nrm -rf sherpa-onnx-pyannote-segmentation-3-0\nrm -fv *.wav\n\n./run-add-punctuations.sh\nrm ./add-punctuations\nrm -rf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\n\n./run-keyword-spotting-from-file.sh\nrm ./keyword-spotting-from-file\nrm -rf 
sherpa-onnx-kws-*\n\n./run-streaming-hlg-decode-file.sh\nrm ./streaming-hlg-decode-file\nrm -rf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18\n\n./run-spoken-language-identification.sh\nrm -rf sherpa-onnx-whisper*\n\nmkdir -p /Users/fangjun/Desktop\npushd /Users/fangjun/Desktop\ncurl -SL -O https://huggingface.co/csukuangfj/test-data/resolve/main/Obama.wav\nls -lh\npopd\n\n./run-generate-subtitles-ten-vad.sh\nrm -rf *.onnx\n\n./run-generate-subtitles.sh\nrm -rf *.onnx\n\nls -lh /Users/fangjun/Desktop\ncat /Users/fangjun/Desktop/Obama.srt\n\nrm -rf sherpa-onnx-whisper*\nrm -f *.onnx\nrm /Users/fangjun/Desktop/Obama.wav\n\n./run-decode-file.sh\nrm decode-file\nsed -i.bak  '20d' ./decode-file.swift\n./run-decode-file.sh\n\n./run-decode-file-non-streaming.sh\n\nls -lh\n"
  },
  {
    "path": ".github/workflows/.gitignore",
    "content": "!*.yaml\n"
  },
  {
    "path": ".github/workflows/aarch64-linux-gnu-shared.yaml",
    "content": "# Modified from https://github.com/Tencent/ncnn/blob/master/.github/workflows/linux-arm-cpu-gcc.yml\nname: aarch64-linux-gnu-shared\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/aarch64-linux-gnu-shared.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: aarch64-linux-gnu-shared-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  aarch64_linux_gnu_shared:\n    runs-on: ${{ matrix.os }}\n    name: aarch64 shared GPU ${{ matrix.gpu }} ${{ matrix.onnxruntime_version }}\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - os: ubuntu-22.04-arm\n            gpu: ON\n            onnxruntime_version: \"1.11.0\"\n          - os: ubuntu-22.04-arm\n            gpu: ON\n            onnxruntime_version: \"1.16.0\"\n          - os: ubuntu-22.04-arm\n            gpu: ON\n            onnxruntime_version: \"1.18.0\"\n          - os: ubuntu-22.04-arm\n            gpu: ON\n            onnxruntime_version: \"1.18.1\"\n          - os: ubuntu-22.04-arm\n            gpu: OFF\n            onnxruntime_version: \"\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        if: matrix.gpu == 'ON'\n        shell: bash\n        run: |\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n\n          git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n          pushd alsa-lib\n          ./gitcompile\n          popd\n\n          p=$PWD\n\n          export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n          export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n     
     export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n          mkdir build\n          cd build\n          cmake \\\n            -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n            -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DSHERPA_ONNX_ENABLE_GPU=ON \\\n            -DSHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION=$onnxruntime_version \\\n            ..\n          make -j4 install\n\n          cp -v bin/sense-voice-simulate-streaming-alsa-cxx-api install/bin\n          cp -v bin/zipformer-ctc-simulate-streaming-alsa-cxx-api install/bin\n\n          rm -rf install/lib/pkgconfig\n          rm -fv install/lib/cargs.h\n          rm -fv install/lib/libcargs.so\n\n      - name: Build sherpa-onnx\n        if: matrix.gpu == 'OFF'\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              echo \"config: ${{ matrix.config }}\"\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n              echo \"pwd\"\n\n              ls -lh\n\n              cd /k2-fsa/sherpa-onnx/\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              p=$PWD\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                
-DBUILD_SHARED_LIBS=ON \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j4 install\n\n              cp -v bin/sense-voice-simulate-streaming-alsa-cxx-api install/bin\n              cp -v bin/zipformer-ctc-simulate-streaming-alsa-cxx-api install/bin\n\n              rm -rf install/lib/pkgconfig\n              rm -fv install/lib/cargs.h\n              rm -fv install/lib/libcargs.so\n            '\n\n      - name: Display system info\n        shell: bash\n        run: |\n          uname -a\n          gcc --version\n          g++ --version\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          cd build/install\n\n          ls -lh bin\n\n          echo \"---\"\n\n          ls -lh lib\n\n          file bin/sherpa-onnx\n\n          readelf -d bin/sherpa-onnx\n\n          ldd bin/sherpa-onnx\n\n          ./bin/sherpa-onnx --help\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-aarch64-shared\n          if [[ ${{ matrix.gpu }} == OFF ]]; then\n            dst=${dst}-cpu\n          else\n            dst=${dst}-gpu-onnxruntime-${{ matrix.onnxruntime_version }}\n          fi\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n\n          ls -lh build/install/lib\n          ls -lh build/install/bin\n\n          ls -lh $dst/bin/\n          echo \"strip\"\n          strip $dst/bin/*\n\n          echo \"after strip\"\n          ls -lh $dst/bin/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-linux-aarch64-shared-gpu-${{ matrix.gpu }}-onnxruntime-${{ matrix.onnxruntime_version }}\n          path: sherpa-onnx-*linux-aarch64-shared*.tar.bz2\n\n      # 
https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=aarch64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*-shared*.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-aarch64-shared.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for aarch64 linux\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for aarch64 linux\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: 
svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n      - name: Test offline Moonshine\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/install/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          ls -lh build/bin/sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          strings build/bin/sherpa-onnx-offline | grep ^GLIBC\n\n          .github/scripts/test-offline-moonshine.sh\n"
  },
  {
    "path": ".github/workflows/aarch64-linux-gnu-static.yaml",
    "content": "# Modified from https://github.com/Tencent/ncnn/blob/master/.github/workflows/linux-arm-cpu-gcc.yml\nname: aarch64-linux-gnu-static\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/aarch64-linux-gnu-static.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: aarch64-linux-gnu-static-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  aarch64_linux_gnu_static:\n    runs-on: ${{ matrix.os }}\n    name: aarch64 static lib test\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              ghcr.io/csukuangfj/manylinux2014-aarch64-gcc11:latest \\\n            bash -c '\n              echo \"config: ${{ matrix.config }}\"\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n              ldd --version\n\n              export GCC_ROOT=/opt/gcc-11.4.0\n              export CC=$GCC_ROOT/bin/gcc\n              export CXX=$GCC_ROOT/bin/g++\n              export PATH=$GCC_ROOT/bin:$PATH\n\n              gcc --version\n              which gcc\n\n              g++ --version\n              which g++\n\n              ldd --version\n\n              echo \"pwd\"\n\n              ls -lh\n\n              cd /k2-fsa/sherpa-onnx/\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              
./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              p=$PWD\n\n              mkdir build\n              cd build\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -DBUILD_SHARED_LIBS=OFF \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j 4\n\n              make install\n\n              cp bin/sense-voice-simulate-streaming-alsa-cxx-api install/bin\n              cp bin/zipformer-ctc-simulate-streaming-alsa-cxx-api install/bin\n\n              ls -lh install/lib\n\n              rm -rf install/lib/pkgconfig\n              rm -fv install/lib/cargs.h\n            '\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-aarch64-static\n          mkdir $dst\n\n          ls -lh build/install/lib\n\n          cp -a build/install/bin $dst/\n          ls -lh $dst/bin/\n          echo \"strip\"\n          strip $dst/bin/*\n          ls -lh $dst/bin/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-linux-aarch64-static\n          path: sherpa-onnx-*linux-aarch64-static.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ 
secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=aarch64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*-static.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-aarch64-static.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for aarch64 linux\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for aarch64 linux\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.11.5\n\n      - name: Test 
offline Moonshine\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          ls -lh build/bin/sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          strings build/bin/sherpa-onnx-offline | grep ^GLIBC\n\n          .github/scripts/test-offline-moonshine.sh\n"
  },
  {
    "path": ".github/workflows/add-new-asr-models.yaml",
    "content": "name: add-new-asr-models\n\non:\n  # push:\n  #   branches:\n  #     - new-asr-models\n  workflow_dispatch:\n\nconcurrency:\n  group: add-new-asr-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  add-new-asr-models:\n    runs-on: ${{ matrix.os }}\n    name: New asr models\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Download icefall-asr-zipformer-multi-zh-en-2023-11-22\n        shell: bash\n        run: |\n          d=sherpa-onnx-zipformer-zh-en-2023-11-22\n          mkdir $d\n          pushd $d\n\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/data/lang_bbpe_2000/tokens.txt\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/data/lang_bbpe_2000/bbpe.model\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/exp/decoder-epoch-34-avg-19.onnx\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/exp/encoder-epoch-34-avg-19.int8.onnx\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/exp/encoder-epoch-34-avg-19.onnx\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/exp/joiner-epoch-34-avg-19.int8.onnx\n          wget -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/exp/joiner-epoch-34-avg-19.onnx\n\n          mkdir test_wavs\n          cd test_wavs\n          wget -O 0.wav -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/test_wavs/_1634_210_2577_1_1525157964032_3712259_29.wav\n          wget -O 1.wav -q 
https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/test_wavs/_1634_210_2577_1_1525157964032_3712259_55.wav\n\n          wget -O 2.wav -q https://huggingface.co/zrjin/icefall-asr-zipformer-multi-zh-en-2023-11-22/resolve/main/test_wavs/_1634_210_2577_1_1525157964032_3712259_75.wav\n          popd\n          tar cvjf $d.tar.bz2 $d\n          ls -lh $d\n          rm -rf $d\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/android-rknn.yaml",
    "content": "name: android-rknn\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/android-rknn.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/jni/*'\n      - 'build-android*.sh'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: android-rknn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build-android-rknn-libs:\n    name: Android rknn libs\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android-rknn\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: build android arm64-v8a\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_RKNN=ON\n          ./build-android-arm64-v8a.sh\n          mkdir -p jniLibs/arm64-v8a/\n          cp -v ./build-android-arm64-v8a/install/lib/*.so ./jniLibs/arm64-v8a/\n          cp -v ./build-android-arm64-v8a/install/lib/README.md ./jniLibs/arm64-v8a/\n          rm -rf  ./build-android-arm64-v8a/\n\n      - name: build android armv7-eabi\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export 
ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_RKNN=ON\n          ./build-android-armv7-eabi.sh\n          mkdir -p ./jniLibs/armeabi-v7a/\n          cp -v ./build-android-armv7-eabi/install/lib/*.so ./jniLibs/armeabi-v7a/\n          cp -v ./build-android-armv7-eabi/install/lib/README.md ./jniLibs/armeabi-v7a/\n          rm -rf ./build-android-armv7-eabi\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          filename=sherpa-onnx-${SHERPA_ONNX_VERSION}-android-rknn.tar.bz2\n\n          tar cjvf $filename ./jniLibs\n\n          ls -lh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs-rknn\n          path: ./jniLibs\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n\n            cp -v ../sherpa-onnx-*-android-rknn.tar.bz2 ./\n\n            git status\n            git lfs track 
\"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-android.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release android libs\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-android-rknn.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n  build-android-aar-rknn:\n    needs: [build-android-rknn-libs]\n    name: Android rknn AAR\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Retrieve artifact\n        uses: actions/download-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs-rknn\n          path: /tmp/jniLibs\n\n      - name: Show jni libs\n        shell: bash\n        run: |\n          ls -lh /tmp/jniLibs\n\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 arm64-v8a\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 armeabi-v7a\n\n      - name: Copy 
libs\n        shell: bash\n        run: |\n          for arch in arm64-v8a armeabi-v7a; do\n            cp -v /tmp/jniLibs/$arch/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/$arch/\n          done\n\n          rm -rf android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86\n          rm -rf android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86_64\n\n      - name: Check libs\n        shell: bash\n        run: |\n          ls -lh android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/*\n\n      - name: Build aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ./gradlew :sherpa_onnx:assembleRelease\n\n      - name: Display aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ls -lh ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar\n          cp ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar ../../\n\n\n      - name: Rename aar\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          mv sherpa_onnx-release.aar sherpa-onnx-${SHERPA_ONNX_VERSION}-rknn.aar\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-aar\n          path: ./*.aar\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n    
        du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=android/aar\n            mkdir -p $dst\n\n            cp -v ../*.aar $dst\n\n            git status\n            git lfs track \"*.aar\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-rknn.aar\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n      - name: Release android aar\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.aar\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n      - name: Release android aar\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.aar\n"
  },
  {
    "path": ".github/workflows/android-static.yaml",
    "content": "# static means we link onnxruntime statically\n# but we still have libsherpa-onnx-jni.so\nname: android-static\n\non:\n  push:\n    branches:\n      - master\n      - android-link-onnxruntime-statically\n    paths:\n      - '.github/workflows/android-static.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/jni/*'\n      - 'build-android*.sh'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: android-static-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build-android-static-libs:\n    name: Android static libs\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android-jni-static\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: build android arm64-v8a\n        shell: bash\n        run: |\n          export BUILD_SHARED_LIBS=OFF\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-android-arm64-v8a.sh\n          mkdir -p jniLibs/arm64-v8a/\n          cp -v ./build-android-arm64-v8a-static/install/lib/*.so ./jniLibs/arm64-v8a/\n          rm -rf  ./build-android-arm64-v8a-static/\n\n      - name: build android armv7-eabi\n        shell: bash\n        run: |\n          export BUILD_SHARED_LIBS=OFF\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export 
PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-android-armv7-eabi.sh\n          mkdir -p ./jniLibs/armeabi-v7a/\n          cp -v ./build-android-armv7-eabi-static/install/lib/*.so ./jniLibs/armeabi-v7a/\n          rm -rf ./build-android-armv7-eabi-static\n\n      - name: build android x86_64\n        shell: bash\n        run: |\n          export BUILD_SHARED_LIBS=OFF\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-android-x86-64.sh\n          mkdir -p ./jniLibs/x86_64\n          cp -v ./build-android-x86-64-static/install/lib/*.so ./jniLibs/x86_64\n          rm -rf ./build-android-x86-64-static\n\n      - name: build android x86\n        shell: bash\n        run: |\n          export BUILD_SHARED_LIBS=OFF\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-android-x86.sh\n          mkdir -p ./jniLibs/x86\n          cp -v ./build-android-x86/install/lib/*.so ./jniLibs/x86\n          rm -rf ./build-android-x86\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          filename=sherpa-onnx-${SHERPA_ONNX_VERSION}-android-static-link-onnxruntime.tar.bz2\n\n          tar cjvf $filename ./jniLibs\n\n          ls -lh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs-static\n          path: ./jniLibs\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n    
    if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*-android*.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-android.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release android libs\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-android*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n  build-android-aar-static:\n    needs: [build-android-static-libs]\n    name: Android AAR\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: 
false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Retrieve artifact\n        uses: actions/download-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs-static\n          path: /tmp/jniLibs\n\n      - name: Show jni libs\n        shell: bash\n        run: |\n          ls -lh /tmp/jniLibs\n\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 arm64-v8a\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 armeabi-v7a\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 x86\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 x86_64\n          #\n      - name: Copy libs\n        shell: bash\n        run: |\n          for arch in arm64-v8a armeabi-v7a x86 x86_64; do\n            cp -v /tmp/jniLibs/$arch/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/$arch/\n          done\n\n      - name: Check libs\n        shell: bash\n        run: |\n          ls -lh android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/*\n\n      - name: Build aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ./gradlew :sherpa_onnx:assembleRelease\n\n      - name: Display aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ls -lh ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar\n          cp ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar 
../../\n\n      - name: Rename aar\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          mv sherpa_onnx-release.aar sherpa-onnx-static-link-onnxruntime-${SHERPA_ONNX_VERSION}.aar\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-aar-static\n          path: ./*.aar\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=android/aar/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../*.aar $dst\n\n            git status\n            git lfs track \"*.aar\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}.aar\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release android aar\n        if: 
(github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.aar\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n"
  },
  {
    "path": ".github/workflows/android.yaml",
    "content": "name: android\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/android.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/jni/*'\n      - 'build-android*.sh'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: android-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build-android-libs:\n    name: Android libs\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android-jni\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: build android arm64-v8a\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          ./build-android-arm64-v8a.sh\n\n          readelf -l ./build-android-arm64-v8a/install/lib/*.so\n\n          mkdir -p jniLibs/arm64-v8a/\n          cp -v ./build-android-arm64-v8a/install/lib/*.so ./jniLibs/arm64-v8a/\n          cp -v ./build-android-arm64-v8a/install/lib/README.md ./jniLibs/arm64-v8a/\n          rm -rf  ./build-android-arm64-v8a/\n\n      - name: build android armv7-eabi\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export 
ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          ./build-android-armv7-eabi.sh\n          mkdir -p ./jniLibs/armeabi-v7a/\n\n          readelf -l ./build-android-armv7-eabi/install/lib/*.so\n\n          cp -v ./build-android-armv7-eabi/install/lib/*.so ./jniLibs/armeabi-v7a/\n          cp -v ./build-android-armv7-eabi/install/lib/README.md ./jniLibs/armeabi-v7a/\n          rm -rf ./build-android-armv7-eabi\n\n      - name: build android x86_64\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          ./build-android-x86-64.sh\n\n          readelf -l ./build-android-x86-64/install/lib/*.so\n\n          mkdir -p ./jniLibs/x86_64\n          cp -v ./build-android-x86-64/install/lib/*.so ./jniLibs/x86_64\n          cp -v ./build-android-x86-64/install/lib/README.md ./jniLibs/x86_64\n          rm -rf ./build-android-x86-64\n\n      - name: build android x86\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          ./build-android-x86.sh\n\n          readelf -l ./build-android-x86/install/lib/*.so\n\n          mkdir -p ./jniLibs/x86\n          cp -v ./build-android-x86/install/lib/*.so ./jniLibs/x86\n          cp -v ./build-android-x86/install/lib/README.md ./jniLibs/x86\n          rm -rf ./build-android-x86\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          
filename=sherpa-onnx-${SHERPA_ONNX_VERSION}-android.tar.bz2\n\n          tar cjvf $filename ./jniLibs\n\n          ls -lh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs\n          path: ./jniLibs\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n\n            cp -v ../sherpa-onnx-*-android.tar.bz2 ./\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-android.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release android libs\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-android.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n      - name: Release android 
libs\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-android.tar.bz2\n\n  build-android-aar:\n    needs: [build-android-libs]\n    name: Android AAR\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Retrieve artifact\n        uses: actions/download-artifact@v4\n        with:\n          name: sherpa-onnx-android-libs\n          path: /tmp/jniLibs\n\n      - name: Show jni libs\n        shell: bash\n        run: |\n          ls -lh /tmp/jniLibs\n\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 arm64-v8a\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 armeabi-v7a\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 x86\n          # drwxr-xr-x 2 runner docker 4.0K Dec 12 06:56 x86_64\n          #\n      - name: Copy libs\n        shell: bash\n        run: |\n          for arch in arm64-v8a armeabi-v7a x86 x86_64; do\n            cp -v /tmp/jniLibs/$arch/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/$arch/\n          done\n\n      - name: Check libs\n        shell: bash\n        run: |\n          ls -lh 
android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/*\n\n      - name: Build aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ./gradlew :sherpa_onnx:assembleRelease\n\n      - name: Display aar\n        shell: bash\n        run: |\n          cd android/SherpaOnnxAar\n\n          ls -lh ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar\n          cp ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar ../../\n\n\n      - name: Rename aar\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          mv sherpa_onnx-release.aar sherpa-onnx-${SHERPA_ONNX_VERSION}.aar\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-android-aar\n          path: ./*.aar\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            du -h -d1 .\n            ls -lh\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=android/aar\n            mkdir -p $dst\n\n            cp -v ../*.aar $dst\n\n            git status\n            git lfs track 
\"*.aar\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}.aar\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release android aar\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.aar\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.17\n\n      - name: Release android aar\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.aar\n"
  },
  {
    "path": ".github/workflows/apk-asr-2pass.yaml",
    "content": "name: apk-asr-2pass\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-asr-2pass-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_asr_2pass:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for asr ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"16\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo 
\"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-asr-2pass-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-asr-2pass.sh\n          mv -v ./build-apk-asr-2pass.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-asr-2pass.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh 
./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=asr-2pass/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-asr.yaml",
    "content": "name: apk-asr\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-asr-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_asr:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for asr ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"15\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo 
\"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-asr-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-asr.sh\n          mv -v ./build-apk-asr.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-asr.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 
.\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=asr/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-audio-tagging-wearos.yaml",
    "content": "name: apk-audio-tagging-wearos\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-audio-tagging-wearos-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_audio_tagging_wearos:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for WearOS ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last 
build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-audio-tagging-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-audio-tagging-wearos.sh\n          mv -v ./build-apk-audio-tagging-wearos.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-audio-tagging-wearos.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK for audio tagging after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK for audio tagging after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n    
      du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=audio-tagging-wearos/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-audio-tagging.yaml",
    "content": "name: apk-audio-tagging\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-audio-tagging-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_audio_tagging:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for audio tagging ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool 
version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-audio-tagging-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-audio-tagging.sh\n          mv -v ./build-apk-audio-tagging.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-audio-tagging.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK for audio tagging after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK for audio tagging after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - 
name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p audio-tagging\n            d=audio-tagging/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk ./$d\n\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-kws.yaml",
    "content": "name: apk-kws\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-kws-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_kws:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for kws ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      
- name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          # Assumption: the generator script follows the naming used by the sibling apk workflows.\n          ./generate-kws-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-kws.sh\n          mv -v ./build-apk-kws.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-kws.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: 
nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=kws/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-qnn-vad-asr-simulated-streaming.yaml",
    "content": "name: apk-qnn-vad-asr-simulated-streaming\n\non:\n  push:\n    branches:\n      - apk\n      - zipformer-ctc-qnn-2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-qnn-vad-asr-simulated-streaming-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  simulated_streaming_asr:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"10\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android-qnn\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail 
-n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-qnn-vad-asr-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-qnn-vad-asr-simulate-streaming.sh\n          mv -v ./build-apk-qnn-vad-asr-simulate-streaming.sh ../..\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: build-script-${{ matrix.total }}-${{ matrix.index }}\n          path: ./build-apk-qnn-vad-asr-simulate-streaming.sh\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-qnn-vad-asr-simulate-streaming.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          
all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            du -h -d1 .\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=qnn-vad-asr-simulated-streaming/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks for qnn\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-speaker-diarization.yaml",
    "content": "name: apk-speaker-diarization\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-speaker-diarization-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_speaker_identification:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for speaker diarization ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n         
 echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          python3 ./generate-speaker-diarization-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-speaker-diarization.sh\n          mv -v ./build-apk-speaker-diarization.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-speaker-diarization.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h 
-d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=speaker-diarization/$SHERPA_ONNX_VERSION\n            mkdir -p $d/\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-speaker-identification.yaml",
    "content": "name: apk-speaker-identification\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-speaker-identification-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_speaker_identification:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for speaker identification ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"10\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          
echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-speaker-identification-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-speaker-identification.sh\n          mv -v ./build-apk-speaker-identification.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-speaker-identification.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n  
        done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=speaker-identification/$SHERPA_ONNX_VERSION\n            mkdir -p $d/\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-spoken-language-identification.yaml",
    "content": "name: apk-slid\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-slid-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_slid:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for slid ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n  
    - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-slid-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-slid.sh\n          mv -v ./build-apk-slid.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-slid.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK for slid after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK for slid after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n 
         du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=slid/$SHERPA_ONNX_VERSION\n            mkdir -p $d/\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-tts-engine.yaml",
    "content": "name: apk-tts-engine\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-tts-engine-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_tts_engine:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for tts engine ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"40\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\", \"20\", \"21\", \"22\", \"23\", \"24\", \"25\", \"26\", \"27\", \"28\", \"29\", \"30\", \"31\", \"32\", \"33\", \"34\", \"35\", \"36\", \"37\", \"38\", \"39\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2 iso639-lang\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n   
       echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-tts-engine.sh\n          mv -v ./build-apk-tts-engine.sh ../..\n\n      - name: build APK for TTS engine\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-tts-engine.sh\n\n      - name: Display APK for TTS engine\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK for TTS engine after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK for TTS engine after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 
*-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - uses: actions/upload-artifact@v4\n        if: false\n        with:\n          name: tts-engine-apk-${{ matrix.index }}\n          path: ./apks/*.apk\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=tts-engine-new/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more tts engine apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-tts.yaml",
    "content": "name: apk-tts\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-tts-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_tts:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for tts ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"40\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\", \"20\", \"21\", \"22\", \"23\", \"24\", \"25\", \"26\", \"27\", \"28\", \"29\", \"30\", \"31\", \"32\", \"33\", \"34\", \"35\", \"36\", \"37\", \"38\", \"39\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2 iso639-lang\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n     
     ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-tts.sh\n          mv -v ./build-apk-tts.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-tts.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK for TTS engine after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK for TTS engine after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          
echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - uses: actions/upload-artifact@v4\n        if: false\n        with:\n          name: tts-apk-${{ matrix.index }}\n          path: ./apks/*.apk\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=tts-new/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-vad-asr-simulated-streaming.yaml",
    "content": "name: apk-vad-asr-simulated-streaming\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-vad-asr-simulated-streaming-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  simulated_streaming_asr:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"25\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\", \"20\", \"21\", \"22\", \"23\", \"24\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n  
        BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-vad-asr-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-vad-asr-simulate-streaming.sh\n          mv -v ./build-apk-vad-asr-simulate-streaming.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-vad-asr-simulate-streaming.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in 
${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            du -h -d1 .\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=vad-asr-simulated-streaming/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-vad-asr.yaml",
    "content": "name: apk-vad-asr\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-vad-asr-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_vad_asr:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for asr ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"25\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\", \"20\", \"21\", \"22\", \"23\", \"24\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls 
/usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-vad-asr-apk-script.py --total $total --index $index\n\n          chmod +x build-apk-vad-asr.sh\n          mv -v ./build-apk-vad-asr.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-vad-asr.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk 
$n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            du -h -d1 .\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=vad-asr/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/apk-vad.yaml",
    "content": "name: apk-vad\n\non:\n  push:\n    branches:\n      - apk\n\n  workflow_dispatch:\n\nconcurrency:\n  group: apk-vad-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  apk_vad:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: apk for vad ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"1\"]\n        index: [\"0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-android\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      
- name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/apk\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          mv -v ./build-apk-vad.sh ../..\n\n      - name: build APK\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export ANDROID_NDK=$ANDROID_NDK_LATEST_HOME\n          ./build-apk-vad.sh\n\n      - name: Display APK\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: 
nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=vad/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../apks/*.apk $d/\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more apks\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-apk main\n"
  },
  {
    "path": ".github/workflows/arm-linux-gnueabihf.yaml",
    "content": "# Modified from https://github.com/Tencent/ncnn/blob/master/.github/workflows/linux-arm-cpu-gcc.yml\nname: arm-linux-gnueabihf\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/arm-linux-gnueabihf.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/arm-linux-gnueabihf.toolchain.cmake'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: arm-linux-gnueabihf-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  arm_linux_gnueabihf:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.lib_type }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        lib_type: [static, shared]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-arm-${{ matrix.lib_type }}\n\n      - name: cache-toolchain\n        id: cache-toolchain\n        uses: actions/cache@v4\n        with:\n          path: toolchain\n          key: gcc-arm-11.2-2022.02-x86_64-arm-none-linux-gnueabihf\n\n      - name: Download toolchain\n        if: steps.cache-toolchain.outputs.cache-hit != 'true'\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/arm-linux-gcc/resolve/main/gcc-arm-11.2-2022.02-x86_64-arm-none-linux-gnueabihf.tar.xz\n          mkdir $GITHUB_WORKSPACE/toolchain\n          tar xvf ./gcc-arm-11.2-2022.02-x86_64-arm-none-linux-gnueabihf.tar.xz --strip-components 1 -C $GITHUB_WORKSPACE/toolchain\n          rm -v gcc-arm-11.2-2022.02-x86_64-arm-none-linux-gnueabihf.tar.xz\n\n      - name: Display toolchain info\n        shell: bash\n        run: |\n          export 
PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          arm-none-linux-gnueabihf-gcc --version\n\n      - name: build arm-linux-gnueabihf\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cmake --version\n\n          lib_type=${{ matrix.lib_type }}\n\n          if [[ $lib_type == \"shared\" ]]; then\n            export BUILD_SHARED_LIBS=ON\n          else\n            export BUILD_SHARED_LIBS=OFF\n          fi\n\n          ./build-arm-linux-gnueabihf.sh\n\n          ls -lh build-arm-linux-gnueabihf/bin\n          ls -lh build-arm-linux-gnueabihf/lib\n\n          file build-arm-linux-gnueabihf/bin/sherpa-onnx\n\n          strings build-arm-linux-gnueabihf/bin/sherpa-onnx | grep ^GLIBC\n\n      - name: Copy files\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          arm-none-linux-gnueabihf-strip --version\n\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-arm-gnueabihf-${{ matrix.lib_type }}\n          mkdir $dst\n\n          ls -lh build-arm-linux-gnueabihf/install/lib\n\n          cp -a build-arm-linux-gnueabihf/install/bin $dst/\n          ls -lh $dst/bin/*\n          arm-none-linux-gnueabihf-strip $dst/bin/*\n          ls -lh $dst\n\n          lib_type=${{ matrix.lib_type }}\n          if [[ $lib_type == \"shared\" ]]; then\n            cp -a build-arm-linux-gnueabihf/install/lib $dst/\n            rm -v $dst/lib/libasound.so\n          fi\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'shared'\n        with:\n          name: sherpa-onnx-linux-arm-gnueabihf-shared\n          path: 
sherpa-onnx-*linux-arm-gnueabihf-shared.tar.bz2\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'static'\n        with:\n          name: sherpa-onnx-linux-arm-gnueabihf-static\n          path: sherpa-onnx-*linux-arm-gnueabihf-static.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=arm32/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for arm linux gnueabihf ${{ matrix.lib_type }}\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: 
sherpa-onnx-*linux-arm-gnueabihf*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.0\n"
  },
  {
    "path": ".github/workflows/as_cmake_sub_project.yaml",
    "content": "name: as_cmake_sub_project\n\non:\n  push:\n    branches:\n      - master\n\n  workflow_dispatch:\n\nconcurrency:\n  group: as-cmake-sub-project-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  as_cmake_sub_project:\n    name: ${{ matrix.os }} shared ${{ matrix.shared_lib }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        shared_lib: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.shared_lib }}-cmake-sub-project\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build\n        shell: bash\n        run: |\n          mv .github/scripts/as-cmake-sub-project ..\n          cd ../as-cmake-sub-project\n          ln -s $PWD/../sherpa-onnx .\n          mkdir build\n          cd build\n          cmake -DBUILD_SHARED_LIBS=${{ matrix.shared_lib }} ..\n          make -j2 main\n\n      - name: Test\n        shell: bash\n        run: |\n          cd ../as-cmake-sub-project\n\n          cd build\n          ls -lh lib\n          echo \"----\"\n          ls -lh bin\n\n          readelf -d ./bin/main\n          ./bin/main\n"
  },
  {
    "path": ".github/workflows/ascend.yaml",
    "content": "name: ascend\n\non:\n  push:\n    branches:\n      - master\n\n  workflow_dispatch:\n\nconcurrency:\n  group: ascend-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux:\n    name: ascend\n    runs-on: ubuntu-latest\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          # - image: \"gpustack/ascendai-cann:8.0.RC3-910b-ubuntu20.04-py3.9\"\n          #   name: \"8.0.0-10b\"\n          - image: \"gpustack/devel-ascendai-cann:8.0.rc3.beta1-310p-ubuntu20.04-v2\"\n            name: \"8.0.0-310p\"\n    container:\n      # image: ascendai/cann:latest\n      # image: ascendai/cann:8.1.rc1-910b-ubuntu22.04-py3.10\n      # see https://hub.docker.com/r/gpustack/ascendai-cann/tags?name=8.0\n      # see https://hub.docker.com/r/gpustack/devel-ascendai-cann/tags?name=310p\n      # and\n      # https://quay.io/repository/ascend/cann?tab=tags\n      image: ${{ matrix.image }}\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Setup Python 3.8\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.8\"\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          apt-get update && apt-get install -y git curl cmake gcc g++\n\n      - name: Show GCC version\n        shell: bash\n        run: |\n          gcc --version\n          g++ --version\n          which gcc\n          which g++\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          ls -lh /usr/local/Ascend/ascend-toolkit/set_env.sh\n          find /usr/local/Ascend -name \"libascend*.so\" 2>/dev/null\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          mkdir build\n          cd build\n          cmake -DSHERPA_ONNX_ENABLE_ASCEND_NPU=ON ..\n\n          make -j2\n\n      - name: Show results\n       
 shell: bash\n        run: |\n          cd build\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          ldd ./bin/sherpa-onnx-offline\n"
  },
  {
    "path": ".github/workflows/axcl-linux-aarch64.yaml",
    "content": "name: axcl-linux-aarch64\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/axcl-linux-aarch64.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/csrc/axcl/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: axcl-linux-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  axcl_linux_aarch64:\n    runs-on: ubuntu-22.04-arm\n    name: axcl npu\n    strategy:\n      fail-fast: false\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Download SDK\n        shell: bash\n        run: |\n          git clone --depth 1 https://github.com/Abandon-ht/axcl_bsp_sdk\n          mv axcl_bsp_sdk/out sdk_dir\n\n          ls -lh sdk_dir/include\n          echo \"---\"\n          ls -lh sdk_dir/bsp\n          echo \"---\"\n          ls -lh sdk_dir/lib\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: axcl-linux-aarch64\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              quay.io/pypa/manylinux_2_28_aarch64 \\\n            bash -c '\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n              cmake --version\n\n\n              cd /k2-fsa/sherpa-onnx/\n\n              export AXCL_SDK_ROOT=$PWD/sdk_dir\n              echo \"AXCL_SDK_ROOT: $AXCL_SDK_ROOT\"\n              export CPLUS_INCLUDE_PATH=\"$AXCL_SDK_ROOT/include:$AXCL_SDK_ROOT/bsp:$CPLUS_INCLUDE_PATH\"\n              export SHERPA_ONNX_AXCL_LIB_DIR=\"$AXCL_SDK_ROOT/lib\"\n\n              
echo \"pwd\"\n\n              ls -lh\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              ls -lh $PWD/alsa-lib/src/.libs\n\n              strings $PWD/alsa-lib/src/.libs/libasound.so.2.0.0 | grep \"^GLIBC\"\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n              p=$PWD\n\n              export SHERPA_ONNX_ENABLE_ALSA=1\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -DBUILD_SHARED_LIBS=ON \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                -DSHERPA_ONNX_ENABLE_AXCL=ON \\\n                ..\n\n              make -j4 install\n\n              rm -rf install/lib/pkgconfig\n              rm -fv install/lib/cargs.h\n              rm -fv install/lib/libcargs.so\n            '\n\n      - name: Display system info\n        shell: bash\n        run: |\n          uname -a\n          gcc --version\n          g++ --version\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          export AXCL_SDK_ROOT=$PWD/sdk_dir\n          export LD_LIBRARY_PATH=$AXCL_SDK_ROOT/lib:$LD_LIBRARY_PATH\n\n          ls -lh $AXCL_SDK_ROOT/lib/\n\n          cd build/install\n\n          ls -lh bin\n\n          echo \"---\"\n\n          ls -lh lib\n\n          file bin/sherpa-onnx\n\n          readelf -d bin/sherpa-onnx\n\n          ldd bin/sherpa-onnx\n\n          echo \"---\"\n          strings bin/sherpa-onnx | grep \"^GLIBC\"\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" 
./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          suffix=shared\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-axcl-linux-aarch64-$suffix\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n\n          mkdir -p $dst/lib\n          cp -v build/install/lib/lib*.so $dst/lib/\n\n          ls -lh build/install/lib\n          ls -lh build/install/bin\n\n          ls -lh $dst/bin/\n          echo \"strip\"\n          strip $dst/bin/*\n\n          echo \"after strip\"\n          ls -lh $dst/bin/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-axcl-linux-aarch64-shared\n          path: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=axcl-linux-aarch64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*axcl*-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n        
    git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-axcl-linux-aarch64.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for linux aarch64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for linux aarch64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.19\n"
  },
  {
    "path": ".github/workflows/axera-linux-aarch64.yaml",
    "content": "name: axera-linux-aarch64\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/axera-linux-aarch64.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/csrc/axera/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: axera-linux-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  axera_linux_aarch64:\n    runs-on: ubuntu-22.04-arm\n    name: axera npu\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - soc: ax650\n          - soc: ax630c\n          - soc: ax620q\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Download SDK\n        shell: bash\n        run: |\n          soc=${{ matrix.soc }}\n          if [[ $soc == ax650 ]]; then\n            version=1.45.0_p39\n            curl -SL -O https://github.com/AXERA-TECH/ax650n_bsp_sdk/archive/refs/tags/v$version.zip\n            unzip -qq v$version.zip\n\n            mv $PWD/ax650n_bsp_sdk-$version/msp/out sdk_dir\n          elif [[ $soc == ax630c || $soc == ax620q ]]; then\n            version=2.0.0_P7\n            curl -SL -O https://github.com/AXERA-TECH/ax620e_bsp_sdk/archive/refs/tags/v2.0.0_P7.zip\n            unzip -qq v$version.zip\n            mv $PWD/ax620e_bsp_sdk-$version/msp/out/arm64_glibc sdk_dir\n\n          fi\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: axera-${{ matrix.soc }}-linux-aarch64\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              quay.io/pypa/manylinux_2_28_aarch64 \\\n        
    bash -c '\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n              cmake --version\n\n\n              cd /k2-fsa/sherpa-onnx/\n\n              export AXERA_SDK_ROOT=$PWD/sdk_dir\n              echo \"AXERA_SDK_ROOT: $AXERA_SDK_ROOT\"\n              export CPLUS_INCLUDE_PATH=\"$AXERA_SDK_ROOT/include:$CPLUS_INCLUDE_PATH\"\n              export SHERPA_ONNX_AXERA_LIB_DIR=\"$AXERA_SDK_ROOT/lib\"\n\n              echo \"pwd\"\n\n              ls -lh\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              ls -lh $PWD/alsa-lib/src/.libs\n\n              strings $PWD/alsa-lib/src/.libs/libasound.so.2.0.0 | grep \"^GLIBC\"\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n              p=$PWD\n\n              export SHERPA_ONNX_ENABLE_ALSA=1\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -DBUILD_SHARED_LIBS=ON \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                -DSHERPA_ONNX_ENABLE_AXERA=ON \\\n                ..\n\n              make -j4 install\n\n              rm -rf install/lib/pkgconfig\n              rm -fv install/lib/cargs.h\n              rm -fv install/lib/libcargs.so\n            '\n\n      - name: Display system info\n        shell: bash\n        run: |\n          uname -a\n          gcc --version\n          g++ --version\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          export AXERA_SDK_ROOT=$PWD/sdk_dir\n          export 
LD_LIBRARY_PATH=$AXERA_SDK_ROOT/lib:$LD_LIBRARY_PATH\n\n          ls -lh $AXERA_SDK_ROOT/lib/\n\n          cd build/install\n\n          ls -lh bin\n\n          echo \"---\"\n\n          ls -lh lib\n\n          file bin/sherpa-onnx\n\n          readelf -d bin/sherpa-onnx\n\n          ldd bin/sherpa-onnx\n\n          echo \"---\"\n          strings bin/sherpa-onnx | grep \"^GLIBC\"\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          suffix=shared\n\n          soc=${{ matrix.soc }}\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-axera-$soc-linux-aarch64-$suffix\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n\n          mkdir -p $dst/lib\n          cp -v build/install/lib/lib*.so $dst/lib/\n\n          ls -lh build/install/lib\n          ls -lh build/install/bin\n\n          ls -lh $dst/bin/\n          echo \"strip\"\n          strip $dst/bin/*\n\n          echo \"after strip\"\n          ls -lh $dst/bin/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-axera-${{ matrix.soc }}-linux-aarch64-shared\n          path: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email 
\"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=axera-linux-aarch64/$SHERPA_ONNX_VERSION/${{ matrix.soc }}\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*axera*-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-axera-${{ matrix.soc }}-linux-aarch64.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for linux aarch64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for linux aarch64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.19\n"
  },
  {
    "path": ".github/workflows/build-wheels-aarch64-cuda.yaml",
    "content": "name: build-wheels-aarch64-cuda\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-aarch64-cuda-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_wheels_aarch64_cuda:\n    name: ${{ matrix.manylinux }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04-arm]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n        # manylinux: [manylinux2014] #, manylinux_2_28]\n        manylinux: [manylinux_2_28] #, manylinux_2_28]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # see https://cibuildwheel.readthedocs.io/en/stable/changelog/\n      # for a list of versions\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BEFORE_ALL: |\n            git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n            cd alsa-lib\n            ./gitcompile\n            cd ..\n            echo \"PWD\"\n            ls -lh /project/alsa-lib/src/.libs\n\n          CIBW_ENVIRONMENT: >\n            CPLUS_INCLUDE_PATH=/project/alsa-lib/include:$CPLUS_INCLUDE_PATH\n            C_INCLUDE_PATH=/project/alsa-lib/include:$C_INCLUDE_PATH\n            SHERPA_ONNX_ALSA_LIB_DIR=/project/alsa-lib/src/.libs\n            LD_LIBRARY_PATH=/project/build/bdist.linux-x86_64/wheel/sherpa_onnx/lib:$SHERPA_ONNX_ALSA_LIB_DIR:$LD_LIBRARY_PATH\n            SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1\"\n            SHERPA_ONNX_ENABLE_ALSA=1\n            SHERPA_ONNX_ENABLE_GPU=ON\n            SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_GPU=ON -DALSA_INCLUDE_DIR=/project/alsa-lib/include -DALSA_LIBRARY=/project/alsa-lib/src/.libs/libasound.so\"\n          
CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_SKIP: \"cp27-* cp35-* cp36-* *-win32 pp* *-musllinux* *-manylinux_i686\"\n          CIBW_BUILD_VERBOSITY: 3\n          CIBW_ARCHS_LINUX: aarch64\n          CIBW_MANYLINUX_AARCH64_IMAGE: quay.io/pypa/${{ matrix.manylinux }}_aarch64\n          #  Don't repair Linux wheels\n          CIBW_REPAIR_WHEEL_COMMAND_LINUX: \"\"\n          # From onnxruntime >= 1.17.0, it drops support for CentOS 7.0 and it supports only manylinux_2_28.\n          # manylinux_2_24 is no longer supported\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./wheels\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./wheelhouse --out-dir ./wheels\n\n          ls -lh ./wheels/\n          rm -rf ./wheelhouse\n          mv ./wheels ./wheelhouse\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cuda/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n          
  git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}-${{ matrix.manylinux }}\n          path: ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-aarch64-rknn.yaml",
    "content": "name: build-wheels-aarch64-rknn\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n\nconcurrency:\n  group: build-wheels-aarch64-rknn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_wheels_aarch64_rknn:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.python-version }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04-arm]\n        python-version: [\"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Download rknn-toolkit2\n        shell: bash\n        run: |\n          git clone --depth 1 https://github.com/airockchip/rknn-toolkit2\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              quay.io/pypa/manylinux_2_28_aarch64 \\\n            bash -c '\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n              find /opt -name \"python*\"\n\n              py=${{ matrix.python-version }}\n\n              for v in $(seq 0 99); do\n                if [ -f /opt/_internal/cpython-$py.$v/bin/python3 ]; then\n                  py=/opt/_internal/cpython-$py.$v/bin/python3\n                  break\n                fi\n              done\n\n              # there is\n              # py=/opt/_internal/cpython-3.13.3-nogil/bin/python3\n              #\n              echo \"py: $py\"\n\n              $py --version\n\n              $py -m venv my-py\n\n              python3 --version\n              which 
python3\n\n              source ./my-py/bin/activate\n\n              python3 --version\n              which python3\n\n              python3 -m pip install wheel twine setuptools\n\n              echo \"pwd\"\n\n              cd /k2-fsa/sherpa-onnx/\n\n              ls -lh\n\n              cmake --version\n\n              uname -a\n              echo \"pwd\"\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              ls -lh $PWD/alsa-lib/src/.libs\n\n              strings $PWD/alsa-lib/src/.libs/libasound.so.2.0.0 | grep \"^GLIBC\"\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              export SHERPA_ONNX_RKNN_TOOLKIT2_PATH=$PWD/rknn-toolkit2\n              export SHERPA_ONNX_RKNN_TOOLKIT2_LIB_DIR=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/aarch64\n              export CPLUS_INCLUDE_PATH=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/include:$CPLUS_INCLUDE_PATH\n\n              export SHERPA_ONNX_ENABLE_ALSA=1\n\n              p=$PWD\n\n              export SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_RKNN=ON -DALSA_INCLUDE_DIR=$p/alsa-lib/include -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so\"\n              python3 setup.py bdist_wheel\n\n              mv dist wheelhouse\n            '\n\n      - name: Display results\n        shell: bash\n        run: |\n          ls -lh wheelhouse\n\n      - name: Fix wheel name\n        shell: bash\n        run: |\n          python3 -m pip install auditwheel\n\n          auditwheel show ./wheelhouse/*.whl\n\n          auditwheel repair --help\n\n          auditwheel --verbose repair --plat manylinux_2_27_aarch64 \\\n            --exclude librknnrt.so \\\n            
--exclude libasound.so.2 \\\n            -w ./dist ./wheelhouse/*.whl\n\n          ls -lh dist/*.whl\n\n      - name: Show glibc versions\n        shell: bash\n        run: |\n          mkdir t\n          cp dist/*.whl t\n          cd t\n          unzip ./*.whl\n          strings sherpa_onnx-*.data/data/bin/sherpa-onnx | grep GLIBC\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=rknn/$SHERPA_ONNX_VERSION/\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../dist/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}\n          path: ./dist/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-aarch64.yaml",
    "content": "name: build-wheels-aarch64\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    name: core\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-24.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n              quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              find /opt -name \"python*\"\n\n              echo \"--------------------\"\n              PY_PATH=$(echo /opt/_internal/cpython-3.10*/bin)\n              export PATH=$PY_PATH:$PATH\n              which python3\n              python3 --version\n\n              python3 -m venv my\n\n              source ./my/bin/activate\n\n              python3 -m pip install setuptools wheel twine\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              
export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              mkdir build\n              pushd build\n\n              cmake \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n\n              echo \"----\"\n              ls -lh install/lib\n\n              rm -fv install/lib/libcargs.so\n\n              echo \"----\"\n              ls -lh install/bin\n\n              echo \"sherpa-onnx-core\"\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n              cp -v ./install/lib/lib*.so ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n              cp -v ./install/include/sherpa-onnx/c-api/*.h ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n              pushd ../scripts/wheel/sherpa-onnx-core\n              python3 setup.py bdist_wheel --plat-name=manylinux2014_aarch64\n\n              ls -lh dist\n\n              popd\n\n              echo \"sherpa-onnx-bin\"\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-bin/bin\n              cp -v ./install/bin/sherpa-onnx* ../scripts/wheel/sherpa-onnx-bin/bin\n\n              pushd ../scripts/wheel/sherpa-onnx-bin\n              python3 setup.py bdist_wheel --plat-name=manylinux2014_aarch64\n\n              ls -lh dist\n\n              popd\n            '\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          mkdir wheelhouse\n          cp -v 
./scripts/wheel/sherpa-onnx-core/dist/*.whl ./wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-aarch64\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          ls -lh ./scripts/wheel/sherpa-onnx-core/dist\n          ls -lh ./scripts/wheel/sherpa-onnx-bin/dist\n\n          unzip -l ./scripts/wheel/sherpa-onnx-core/dist/*.whl\n          echo \"---\"\n          unzip -l ./scripts/wheel/sherpa-onnx-bin/dist/*.whl\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./wheels\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./wheelhouse --out-dir ./wheels\n\n          ls -lh ./wheels/\n          rm -rf ./wheelhouse\n          mv ./wheels ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-aarch64-patched\n          path: ./wheelhouse/*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-24.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from Linux x64\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-linux-aarch64-patched\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install 
/tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          echo \"---\"\n\n          ls -lh $(which sherpa-onnx)\n          file $(which sherpa-onnx)\n          readelf -d $(which sherpa-onnx)\n\n          ldd $(which sherpa-onnx)\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ 
secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n\n  build_wheels_aarch64:\n    needs: [core, test]\n    name: ${{ matrix.manylinux }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        # see https://github.com/pypa/cibuildwheel/issues/2257\n        # we don't use qemu from now on\n        os: [ubuntu-24.04-arm]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n        manylinux: [manylinux2014] #, manylinux_2_28]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # see https://cibuildwheel.readthedocs.io/en/stable/changelog/\n      # for a list of versions\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BEFORE_ALL: |\n            git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n            cd alsa-lib\n            ./gitcompile\n            cd ..\n            echo \"PWD\"\n            ls -lh /project/alsa-lib/src/.libs\n\n          CIBW_ENVIRONMENT: >\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            C_INCLUDE_PATH=/project/alsa-lib/include:$C_INCLUDE_PATH\n            CPLUS_INCLUDE_PATH=/project/alsa-lib/include:$CPLUS_INCLUDE_PATH\n            SHERPA_ONNX_ALSA_LIB_DIR=/project/alsa-lib/src/.libs\n            LD_LIBRARY_PATH=/project/build/bdist.linux-aarch64/wheel/sherpa_onnx/lib:$SHERPA_ONNX_ALSA_LIB_DIR\n            SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1\"\n            SHERPA_ONNX_ENABLE_ALSA=1\n            SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF 
-DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF -DALSA_INCLUDE_DIR=/project/alsa-lib/include -DALSA_LIBRARY=/project/alsa-lib/src/.libs/libasound.so\"\n\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_SKIP: \"cp27-* cp35-* cp36-* *-win32 pp* *-musllinux* *-manylinux_i686\"\n          CIBW_BUILD_VERBOSITY: 3\n          CIBW_ARCHS_LINUX: aarch64\n          # https://quay.io/repository/pypa/manylinux2014_aarch64?tab=tags\n          CIBW_MANYLINUX_AARCH64_IMAGE: quay.io/pypa/${{ matrix.manylinux }}_aarch64\n          # From onnxruntime >= 1.17.0, it drops support for CentOS 7.0 and it supports only manylinux_2_28.\n          # manylinux_2_24 is no longer supported\n          CIBW_REPAIR_WHEEL_COMMAND: >\n            auditwheel repair -w {dest_dir}\n            --exclude libonnxruntime.so\n            {wheel}\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}-${{ matrix.manylinux }}-linux-aarch64\n          path: ./wheelhouse/*.whl\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          ls -lh wheelhouse/*.whl\n\n          unzip -l wheelhouse/*.whl\n\n          echo \"---\"\n\n          mkdir t\n          cp wheelhouse/*.whl ./t\n          cd ./t\n          unzip ./*.whl\n          ls -lh\n          echo \"---\"\n\n          readelf -d sherpa_onnx/lib/*.so\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export 
GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-armv7l.yaml",
    "content": "name: build-wheels-armv7l\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-armv7-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    name: core\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v2\n        with:\n          platforms: arm\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --platform linux/arm/v7 \\\n            --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n            quay.io/pypa/manylinux_2_35_armv7l \\\n            bash -c '\n              find / -name \"*gcc*\" 2>/dev/null\n\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              # find /opt -name \"python*\"\n\n              echo \"--------------------\"\n              PY_PATH=$(echo /opt/_internal/cpython-3.10*/bin)\n              export PATH=$PY_PATH:$PATH\n              which python3\n              python3 --version\n\n              python3 -m venv my\n\n              source ./my/bin/activate\n\n              python3 -m pip install setuptools wheel 
twine\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              mkdir build\n              pushd build\n\n              cmake \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n\n              echo \"----\"\n              ls -lh install/lib\n\n              file install/lib/*\n\n              rm -fv install/lib/libcargs.so\n\n              echo \"----\"\n              ls -lh install/bin\n\n              file install/bin/*\n\n              ./install/bin/sherpa-onnx --help\n\n              echo \"sherpa-onnx-core\"\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n              cp -v ./install/lib/lib*.so ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n              cp -v ./install/include/sherpa-onnx/c-api/*.h ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n              pushd ../scripts/wheel/sherpa-onnx-core\n              python3 setup.py bdist_wheel --plat-name=manylinux_2_35_armv7l\n\n              ls -lh dist\n\n              popd\n\n              echo \"sherpa-onnx-bin\"\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-bin/bin\n              cp -v ./install/bin/sherpa-onnx* ../scripts/wheel/sherpa-onnx-bin/bin\n\n              pushd ../scripts/wheel/sherpa-onnx-bin\n              python3 
setup.py bdist_wheel --plat-name=manylinux_2_35_armv7l\n\n              ls -lh dist\n\n              popd\n            '\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          mkdir wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl ./wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-armv7l\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          ls -lh ./scripts/wheel/sherpa-onnx-core/dist\n          ls -lh ./scripts/wheel/sherpa-onnx-bin/dist\n\n          unzip -l ./scripts/wheel/sherpa-onnx-core/dist/*.whl\n          echo \"---\"\n          unzip -l ./scripts/wheel/sherpa-onnx-bin/dist/*.whl\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./wheels\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./wheelhouse --out-dir ./wheels\n\n          ls -lh ./wheels/\n          rm -rf ./wheelhouse\n          mv ./wheels ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-armv7l-patched\n          path: ./wheelhouse/*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from Linux\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-linux-armv7l-patched\n          path: /tmp/wheels\n\n      - name: Setup Python\n       
 uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n\n  build_wheels_armv7l:\n    name: ${{ matrix.manylinux }} ${{ 
matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        # see https://github.com/pypa/cibuildwheel/issues/2257\n        # we don't use qemu from now on\n        os: [ubuntu-latest]\n        python-version: [\"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]\n        manylinux: [manylinux_2_35] #, manylinux_2_28]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v2\n        with:\n          platforms: arm\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --platform linux/arm/v7 \\\n            --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n            quay.io/pypa/manylinux_2_35_armv7l \\\n            bash -c '\n              find / -name \"*gcc*\" 2>/dev/null\n\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              python_version=${{ matrix.python-version }}\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              # find /opt -name \"python*\"\n\n              echo \"--------------------\"\n              # Construct glob pattern\n              PY_GLOB=\"/opt/_internal/cpython-${python_version}*/bin\"\n\n              # Expand the glob safely\n              shopt -s nullglob  # Avoid literal string if no match\n              matches=($PY_GLOB)\n              shopt -u nullglob\n\n              if [[ ${#matches[@]} -eq 0 ]]; then\n                echo \"No Python installation found for version $python_version\"\n                exit 1\n              elif [[ ${#matches[@]} -gt 1 ]]; then\n                echo \"Multiple Python installations found for version 
$python_version:\"\n                printf \"  %s\\n\" \"${matches[@]}\"\n                echo \"Using the first one: ${matches[0]}\"\n              fi\n\n              PY_PATH=\"${matches[0]}\"\n\n              echo \"$PY_PATH\"\n              export PATH=\"$PY_PATH:$PATH\"\n              echo $PY_PATH\n              export PATH=$PY_PATH:$PATH\n              which python3\n              python3 --version\n\n              python3 -m venv my\n\n              source ./my/bin/activate\n\n              python3 -m pip install setuptools wheel twine\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n              echo \"SHERPA_ONNX_ALSA_LIB_DIR: $SHERPA_ONNX_ALSA_LIB_DIR\"\n\n              export LD_LIBRARY_PATH=$PWD/build/bdist.linux-aarch64/wheel/sherpa_onnx/lib:$SHERPA_ONNX_ALSA_LIB_DIR:$LD_LIBRARY_PATH\n              export LIBRARY_PATH=$PWD/build/bdist.linux-aarch64/wheel/sherpa_onnx/lib:$SHERPA_ONNX_ALSA_LIB_DIR:$LIBRARY_PATH\n\n              echo \"LD_LIBRARY_PATH: $LD_LIBRARY_PATH\"\n              echo \"LIBRARY_PATH: $LIBRARY_PATH\"\n\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              export SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1\"\n              export SHERPA_ONNX_ENABLE_ALSA=1\n              export SHERPA_ONNX_CMAKE_ARGS=\"-DCMAKE_C_FLAGS=\\\"-march=armv7-a -mfloat-abi=hard -mfpu=neon\\\" -DCMAKE_CXX_FLAGS=\\\"-march=armv7-a -mfloat-abi=hard -mfpu=neon\\\" -DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=ON -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF -DALSA_INCLUDE_DIR=$PWD/alsa-lib/include -DALSA_LIBRARY=$PWD/alsa-lib/src/.libs/libasound.so\"\n  
            python3 setup.py bdist_wheel\n              ls -lh dist\n\n              mkdir wheelhouse\n              cp -v dist/* wheelhouse/\n            '\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}-${{ matrix.manylinux }}-linux-armv7l\n          path: ./wheelhouse/*.whl\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          ls -lh wheelhouse/*.whl\n\n          unzip -l wheelhouse/*.whl\n\n          echo \"---\"\n\n          mkdir t\n          cp wheelhouse/*.whl ./t\n          cd ./t\n          unzip ./*.whl\n          ls -lh\n          echo \"---\"\n\n          readelf -d sherpa_onnx/lib/*.so\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-linux-cuda.yaml",
    "content": "name: build-wheels-linux-cuda\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-linux-cuda-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_wheels_linux_cuda:\n    name: ${{ matrix.manylinux }} ${{ matrix.python-version }} ${{ matrix.onnxruntime_version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04]\n        python-version: [\"3.7\", \"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]\n        onnxruntime_version: [\"1.17.1\", \"1.23.2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          if [[ ${{ matrix.python-version }} == \"3.7\" ]]; then\n            pip install -U pip wheel setuptools twine\n          else\n            pip install -U pip wheel setuptools twine==5.0.0\n          fi\n\n      - name: Build alsa-lib\n        shell: bash\n        run: |\n          git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n          cd alsa-lib\n          ./gitcompile\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n          export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n          export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n          export LD_LIBRARY_PATH=$SHERPA_ONNX_ALSA_LIB_DIR:$LD_LIBRARY_PATH\n\n          echo \"CPLUS_INCLUDE_PATH: $CPLUS_INCLUDE_PATH\"\n          ls -lh $PWD/alsa-lib/include\n          echo \"---\"\n       
   ls -lh $PWD/alsa-lib/src/.libs\n\n          p=$PWD\n\n          export SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1\"\n          export SHERPA_ONNX_ENABLE_ALSA=1\n          export SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_GPU=ON\"\n\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n          curl -SL -O https://github.com/csukuangfj/onnxruntime-libs/releases/download/v$onnxruntime_version/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched.zip\n          unzip  onnxruntime-linux-x64-gpu-$onnxruntime_version-patched.zip\n\n          export SHERPA_ONNXRUNTIME_LIB_DIR=$PWD/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched/lib\n          export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$PWD/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched/include\n\n          if [[ $onnxruntime_version == \"1.23.2\" ]]; then\n            export SHERPA_ONNX_CUDA_VERSION=\"12.cudnn9\"\n          fi\n\n          python3 setup.py bdist_wheel\n\n          ls -lh dist\n\n          mv dist wheelhouse\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n\n          unzip -l ./wheelhouse/*.whl\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./wheels\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./wheelhouse --out-dir ./wheels\n\n          ls -lh ./wheels/\n          rm -rf ./wheelhouse\n          mv ./wheels ./wheelhouse\n\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-cuda-${{ matrix.python-version }}-${{ matrix.onnxruntime_version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 
200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cuda/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n"
  },
  {
    "path": ".github/workflows/build-wheels-linux.yaml",
    "content": "name: build-wheels-linux\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-linux-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    name: core\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build sherpa-onnx (docker manually)\n        shell: bash\n        run: |\n          docker run --rm \\\n            -v ${{ github.workspace }}:/workspace \\\n            -w /workspace \\\n            quay.io/pypa/manylinux2014_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /workspace\n\n              echo pwd\n              echo $PWD\n\n              find /opt -name \"python*\"\n\n              echo \"--------------------\"\n              PY_PATH=$(echo /opt/_internal/cpython-3.10*/bin)\n              echo \"PY_PATH: $PY_PATH\"\n\n              export PATH=$PY_PATH:$PATH\n\n              echo \"path $PATH\"\n\n              which python3\n              python3 --version\n\n              python3 -m venv my\n\n              source ./my/bin/activate\n\n              python3 -m pip install setuptools wheel twine\n\n              git clone --depth 1 --branch v1.2.12 
https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              mkdir build\n              pushd build\n\n              cmake \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n\n              echo \"----\"\n              ls -lh install/lib\n\n              rm -fv install/lib/libcargs.so\n\n              echo \"----\"\n              ls -lh install/bin\n\n              echo sherpa-onnx-core\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n              cp -v ./install/lib/lib*.so ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n              cp -v ./install/include/sherpa-onnx/c-api/*.h ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n              pushd ../scripts/wheel/sherpa-onnx-core\n              python3 setup.py bdist_wheel --plat-name=manylinux2014_x86_64\n\n              ls -lh dist\n              unzip -l dist/*.whl\n\n              popd\n\n              echo \"sherpa-onnx-bin\"\n\n              mkdir -p ../scripts/wheel/sherpa-onnx-bin/bin\n              cp -v ./install/bin/sherpa-onnx* ../scripts/wheel/sherpa-onnx-bin/bin\n\n              pushd ../scripts/wheel/sherpa-onnx-bin\n              python3 setup.py bdist_wheel --plat-name=manylinux2014_x86_64\n\n              ls -lh 
dist\n              unzip -l dist/*.whl\n\n              popd\n            '\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          mkdir wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl ./wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-x64\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          sudo chown -R $USER ./scripts/wheel\n          ls -lh ./scripts/wheel/sherpa-onnx-core/dist\n          ls -lh ./scripts/wheel/sherpa-onnx-bin/dist\n\n          unzip -l ./scripts/wheel/sherpa-onnx-core/dist/*.whl\n          echo \"---\"\n          unzip -l ./scripts/wheel/sherpa-onnx-bin/dist/*.whl\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./wheels\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./wheelhouse --out-dir ./wheels\n\n          ls -lh ./wheels/\n          rm -rf ./wheelhouse\n          mv ./wheels ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-linux-x64-patched\n          path: ./wheelhouse/*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from Linux x64\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-linux-x64-patched\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n  
        python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          echo \"---\"\n\n          ls -lh $(which sherpa-onnx)\n          file $(which sherpa-onnx)\n          readelf -d $(which sherpa-onnx)\n\n          ldd $(which sherpa-onnx)\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish 
wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n  build_wheels_linux:\n    needs: [core, test]\n    name: ${{ matrix.manylinux }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n        manylinux: [manylinux2014] #, manylinux_2_28]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # see https://cibuildwheel.readthedocs.io/en/stable/changelog/\n      # for a list of versions\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BEFORE_ALL: |\n            git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n            cd alsa-lib\n            ./gitcompile\n            cd ..\n            echo \"PWD\"\n            ls -lh /project/alsa-lib/src/.libs\n\n          CIBW_ENVIRONMENT: >\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            CPLUS_INCLUDE_PATH=/project/alsa-lib/include:$CPLUS_INCLUDE_PATH\n            C_INCLUDE_PATH=/project/alsa-lib/include:$C_INCLUDE_PATH\n            SHERPA_ONNX_ALSA_LIB_DIR=/project/alsa-lib/src/.libs\n            LD_LIBRARY_PATH=/project/build/bdist.linux-x86_64/wheel/sherpa_onnx/lib:$SHERPA_ONNX_ALSA_LIB_DIR\n            SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1\"\n            SHERPA_ONNX_ENABLE_ALSA=1\n            
SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF -DALSA_INCLUDE_DIR=/project/alsa-lib/include -DALSA_LIBRARY=/project/alsa-lib/src/.libs/libasound.so\"\n\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_SKIP: \"cp27-* cp35-* cp36-* *-win32 pp* *-musllinux* *-manylinux_i686\"\n          CIBW_BUILD_VERBOSITY: 3\n          CIBW_MANYLINUX_X86_64_IMAGE: quay.io/pypa/${{ matrix.manylinux }}_x86_64\n          CIBW_REPAIR_WHEEL_COMMAND: >\n            auditwheel repair -w {dest_dir}\n            --exclude libonnxruntime.so\n            {wheel}\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}-${{ matrix.manylinux }}\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          ls -lh wheelhouse/*.whl\n          unzip -l wheelhouse/*.whl\n\n          echo \"---\"\n\n          mkdir t\n          cp wheelhouse/*.whl ./t\n          cd ./t\n          unzip ./*.whl\n          ls -lh\n          echo \"---\"\n\n          readelf -d sherpa_onnx/lib/*.so\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n\n      - name: Build sdist\n        if: matrix.python-version == 'cp38' && matrix.manylinux == 'manylinux2014'\n        shell: bash\n        run: |\n          python3 setup.py sdist\n          ls -l dist/*\n\n      - name: Publish sdist to PyPI\n        if: matrix.python-version == 'cp38' && matrix.manylinux == 'manylinux2014'\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          twine upload dist/sherpa*.tar.gz\n"
  },
  {
    "path": ".github/workflows/build-wheels-macos-arm64.yaml",
    "content": "name: build-wheels-macos-arm64\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-macos-arm64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    runs-on: ${{ matrix.os }}\n    name: core\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n      - name: Set up Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install deps\n        shell: bash\n        run: |\n          python3 -m pip install setuptools wheel twine\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: macos-latest-sherpa-onnx-core-arm64\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -DSHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D CMAKE_OSX_ARCHITECTURES='arm64' \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n          make 
install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -fv ./install/include/cargs.h\n          rm -fv ./install/lib/cargs.h\n          rm -fv ./install/lib/libcargs.dylib\n          rm -fv ./install/lib/libcargs.a\n          rm -rfv ./install/lib/pkgconfig\n\n      - name: Copy files\n        shell: bash\n        run: |\n          echo 'sherpa-onnx-core'\n          mkdir -p scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./build/install/lib/lib* ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n          cp -v ./build/install/include/sherpa-onnx/c-api/*.h ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n          echo 'sherpa-onnx-bin'\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-bin/bin\n          cp -v ./build/install/bin/sherpa-onnx* ./scripts/wheel/sherpa-onnx-bin/bin\n\n      - name: Build sherpa-onnx-core\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-core\n          python3 setup.py bdist_wheel --plat-name=macosx_11_0_arm64\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Build sherpa-onnx-bin\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-bin\n          python3 setup.py bdist_wheel --plat-name=macosx_11_0_arm64\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl .\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl .\n\n          ls -lh *.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-macos-arm64\n          path: ./*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        
os: [macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from macos arm64\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-macos-arm64\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          ls -lh $(which sherpa-onnx)\n          file $(which sherpa-onnx)\n\n          otool -L $(which sherpa-onnx)\n          otool -l $(which sherpa-onnx)\n\n          echo \"---\"\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git 
fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          opts='--break-system-packages'\n\n          python3 -m pip install $opts wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n  build_wheels_macos_arm64:\n    needs: [core, test]\n    name: ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_ENVIRONMENT: >\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            SHERPA_ONNX_CMAKE_ARGS=\"-DCMAKE_OSX_ARCHITECTURES='arm64' -DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF\"\n          CIBW_ARCHS: \"arm64\"\n          CIBW_BUILD_VERBOSITY: 3\n\n          #  Don't repair macOS wheels\n          CIBW_REPAIR_WHEEL_COMMAND_MACOS: \"\"\n\n      - name: Display wheels\n        shell: bash\n        run: |\n     
     ls -lh ./wheelhouse/\n          unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          opts='--break-system-packages'\n\n          python3 -m pip install $opts wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-macos-universal2.yaml",
    "content": "name: build-wheels-macos-universal2\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-macos-universal2-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    runs-on: ${{ matrix.os }}\n    name: core\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n      - name: Set up Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install deps\n        shell: bash\n        run: |\n          python3 -m pip install setuptools wheel twine\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: macos-latest-sherpa-onnx-core-universal2\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -DSHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D CMAKE_OSX_ARCHITECTURES='arm64;x86_64' \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n  
        make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -fv ./install/include/cargs.h\n          rm -fv ./install/lib/cargs.h\n          rm -fv ./install/lib/libcargs.dylib\n          rm -fv ./install/lib/libcargs.a\n          rm -rfv ./install/lib/pkgconfig\n\n      - name: Copy files\n        shell: bash\n        run: |\n          echo 'sherpa-onnx-core'\n          mkdir -p scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./build/install/lib/lib* ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n          cp -v ./build/install/include/sherpa-onnx/c-api/*.h ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n          echo 'sherpa-onnx-bin'\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-bin/bin\n          cp -v ./build/install/bin/sherpa-onnx* ./scripts/wheel/sherpa-onnx-bin/bin\n\n      - name: Build sherpa-onnx-core\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-core\n          python3 setup.py bdist_wheel --plat-name=macosx_10_15_universal2\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Build sherpa-onnx-bin\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-bin\n          python3 setup.py bdist_wheel --plat-name=macosx_10_15_universal2\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl .\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl .\n\n          ls -lh *.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-macos-universal\n          path: ./*.whl\n\n  test:\n    name: test ${{ matrix.os }}\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n  
    fail-fast: false\n      matrix:\n        os: [macos-latest, macos-15-intel]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from macos universal\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-macos-universal\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          ls -lh $(which sherpa-onnx)\n          file $(which sherpa-onnx)\n\n          otool -L $(which sherpa-onnx)\n          otool -l $(which sherpa-onnx)\n\n          echo \"---\"\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        if: matrix.os == 'macos-latest'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ matrix.os == 'macos-latest' && (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          opts='--break-system-packages'\n\n          python3 -m pip install $opts wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n  build_wheels_macos_universal2:\n    needs: [core, test]\n    name: ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set macOS deployment target\n        run: echo \"MACOSX_DEPLOYMENT_TARGET=10.15\" >> $GITHUB_ENV\n\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_ENVIRONMENT: >\n            MACOSX_DEPLOYMENT_TARGET=10.15\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            SHERPA_ONNX_CMAKE_ARGS=\"-DCMAKE_OSX_ARCHITECTURES='arm64;x86_64' -DSHERPA_ONNX_ENABLE_BINARY=OFF 
-DCMAKE_OSX_DEPLOYMENT_TARGET='10.15' -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF\"\n          CIBW_ARCHS: \"universal2\"\n          CIBW_BUILD_VERBOSITY: 3\n\n          #  Don't repair macOS wheels\n          CIBW_REPAIR_WHEEL_COMMAND_MACOS: \"\"\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n          unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ 
secrets.PYPI_PASSWORD }}\n        run: |\n          opts='--break-system-packages'\n\n          python3 -m pip install $opts wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-macos-x64.yaml",
    "content": "name: build-wheels-macos-x64\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-macos-x64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    runs-on: ${{ matrix.os }}\n    name: core\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n      - name: Set up Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install deps\n        shell: bash\n        run: |\n          python3 -m pip install setuptools wheel twine\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: macos-latest-sherpa-onnx-core-x64\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -DSHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON \\\n            -DCMAKE_OSX_DEPLOYMENT_TARGET=10.15 \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D CMAKE_OSX_ARCHITECTURES='x86_64' \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd 
build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -fv ./install/include/cargs.h\n          rm -fv ./install/lib/cargs.h\n          rm -fv ./install/lib/libcargs.dylib\n          rm -fv ./install/lib/libcargs.a\n          rm -rfv ./install/lib/pkgconfig\n\n      - name: Copy files\n        shell: bash\n        run: |\n          echo 'sherpa-onnx-core'\n          mkdir -p scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./build/install/lib/lib* ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n          cp -v ./build/install/include/sherpa-onnx/c-api/*.h ./scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n          echo 'sherpa-onnx-bin'\n\n          mkdir -p ./scripts/wheel/sherpa-onnx-bin/bin\n          cp -v ./build/install/bin/sherpa-onnx* ./scripts/wheel/sherpa-onnx-bin/bin\n\n      - name: Build sherpa-onnx-core\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-core\n          python3 setup.py bdist_wheel --plat-name=macosx_10_15_x86_64\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Build sherpa-onnx-bin\n        shell: bash\n        run: |\n          pushd ./scripts/wheel/sherpa-onnx-bin\n          python3 setup.py bdist_wheel --plat-name=macosx_10_15_x86_64\n\n          ls -lh dist\n          unzip -l dist/*.whl\n\n          popd\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl .\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl .\n\n          ls -lh *.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-macos-x64\n          path: ./*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n    
  fail-fast: false\n      matrix:\n        os: [macos-15-intel]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from macos x64\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-macos-x64\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          ls -lh $(which sherpa-onnx)\n          file $(which sherpa-onnx)\n          otool -L $(which sherpa-onnx)\n          otool -l $(which sherpa-onnx)\n\n          echo \"---\"\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n       
     cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /tmp/wheels/*.whl\n\n  build_wheels_macos_x64:\n    needs: [core, test]\n    name: ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set macOS deployment target\n        run: echo \"MACOSX_DEPLOYMENT_TARGET=10.15\" >> $GITHUB_ENV\n\n      - name: Build wheels\n        uses: pypa/cibuildwheel@v3.3.1\n        env:\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_ENVIRONMENT: >\n            MACOSX_DEPLOYMENT_TARGET=10.15\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            SHERPA_ONNX_CMAKE_ARGS=\"-DCMAKE_OSX_ARCHITECTURES='x86_64' -DSHERPA_ONNX_ENABLE_BINARY=OFF -DCMAKE_OSX_DEPLOYMENT_TARGET='10.15' -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF\"\n\n          CIBW_ARCHS: \"x86_64\"\n          
CIBW_BUILD_VERBOSITY: 3\n\n          #  Don't repair macOS wheels\n          CIBW_REPAIR_WHEEL_COMMAND_MACOS: \"\"\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n          unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-macos-x64-${{ matrix.python-version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          opts='--break-system-packages'\n\n          python3 -m pip install $opts wheel twine==5.0.0 setuptools\n\n          twine upload 
./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-win32.yaml",
    "content": "name: build-wheels-win32\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-win32-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    name: core\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n      - name: Install sccache\n        run: choco install sccache -y\n\n      - name: Cache sccache\n        uses: actions/cache@v3\n        with:\n          path: C:\\Users\\runneradmin\\AppData\\Local\\Mozilla\\sccache\n          key: ${{ matrix.os }}-sccache-core-win32\n          restore-keys: |\n            ${{ matrix.os }}-sccache-core-win32\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -D CMAKE_C_COMPILER_LAUNCHER=sccache \\\n            -D CMAKE_CXX_COMPILER_LAUNCHER=sccache \\\n            -A Win32 \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . 
--config Release  -- -m:2\n          cmake --build . --config Release --target install -- -m:2\n\n          ls -lh ./bin/Release/sherpa-onnx.exe\n\n      - name: Show sccache stats\n        run: sccache --show-stats\n\n      - name: Show\n        shell: bash\n        run: |\n          echo \"---bin---\"\n          ls -lh build/install/bin\n          echo \"---lib---\"\n          ls -lh build/install/lib\n          echo \"---include---\"\n          ls -lh build/install/include\n\n      - name: Copy files\n        shell: bash\n        run: |\n          cd build\n          echo 'sherpa-onnx-core'\n          mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./install/lib/onnxruntime.dll ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./install/lib/sherpa-onnx-*.dll ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          # keep the *.lib file so users can write code to link with our dll\n          cp -v ./install/lib/sherpa-onnx-*.lib ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n          mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n          cp -v ./install/include/sherpa-onnx/c-api/*.h ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n          pushd ../scripts/wheel/sherpa-onnx-core\n          python3 setup.py bdist_wheel --plat-name=win32\n\n          ls -lh dist\n\n          popd\n\n          echo 'sherpa-onnx-bin'\n\n          mkdir -p ../scripts/wheel/sherpa-onnx-bin/bin\n          cp -v ./install/bin/sherpa-onnx* ../scripts/wheel/sherpa-onnx-bin/bin\n\n          pushd ../scripts/wheel/sherpa-onnx-bin\n          python3 setup.py bdist_wheel --plat-name=win32\n\n          ls -lh dist\n\n          popd\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          mkdir wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl ./wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-bin/dist/*.whl ./wheelhouse\n\n   
   - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-win-x86\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          ls -lh ./scripts/wheel/sherpa-onnx-core/dist\n          ls -lh ./scripts/wheel/sherpa-onnx-bin/dist\n\n          unzip -l ./scripts/wheel/sherpa-onnx-core/dist/*.whl\n          echo \"---\"\n          unzip -l ./scripts/wheel/sherpa-onnx-bin/dist/*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from Windows x86\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-win-x86\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n          architecture: x86\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /d/tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /d/tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n          which sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          echo \"---\"\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm 
-rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /d/tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /d/tmp/wheels/*.whl\n\n  build_wheels_win32:\n    needs: [core, test]\n    name: ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        python-version: [\"cp38\", \"cp39\", \"cp310\", \"cp311\", \"cp312\", \"cp313\", \"cp314\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # see https://cibuildwheel.readthedocs.io/en/stable/changelog/\n      # for a list of versions\n      - name: Build wheels (cibuildwheel)\n        uses: 
pypa/cibuildwheel@v3.1.4\n        env:\n          CIBW_BUILD: \"${{ matrix.python-version}}-* \"\n          CIBW_SKIP: \"*-win_amd64\"\n          CIBW_BUILD_VERBOSITY: 3\n          CIBW_ENVIRONMENT: >\n            SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n            SHERPA_ONNX_CMAKE_ARGS=\"-A Win32 -DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF\"\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n\n          unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: 
Publish wheels to PyPI\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-wheels-win64-cuda.yaml",
    "content": "name: build-wheels-win64-cuda\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-win64-cuda-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_wheels_win64_cuda:\n    name: ${{ matrix.python-version }} ${{ matrix.onnxruntime_version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        python-version: [\"3.7\", \"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]\n        onnxruntime_version: [\"1.17.1\", \"1.23.2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Build wheels\n        shell: bash\n        run: |\n          pip install setuptools wheel\n\n          export SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_GPU=ON\"\n\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n          curl -SL -O https://github.com/microsoft/onnxruntime/releases/download/v$onnxruntime_version/onnxruntime-win-x64-gpu-$onnxruntime_version.zip\n          unzip onnxruntime-win-x64-gpu-$onnxruntime_version.zip\n\n          export SHERPA_ONNXRUNTIME_LIB_DIR=$PWD/onnxruntime-win-x64-gpu-$onnxruntime_version/lib\n          export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$PWD/onnxruntime-win-x64-gpu-$onnxruntime_version/include\n\n          if [[ $onnxruntime_version == \"1.23.2\" ]]; then\n            export SHERPA_ONNX_CUDA_VERSION=\"12.cudnn9\"\n          fi\n\n          python3 setup.py bdist_wheel\n\n          ls -lh ./dist/\n\n          mv dist wheelhouse\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n        
  unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}-${{ matrix.onnxruntime_version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cuda/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n"
  },
  {
    "path": ".github/workflows/build-wheels-win64.yaml",
    "content": "name: build-wheels-win64\n\non:\n  push:\n    branches:\n      - wheel\n  workflow_dispatch:\n    inputs:\n      publish_sherpa_onnx_bin:\n        description: \"Publish sherpa-onnx-bin\"\n        required: false\n        default: \"true\"\n        type: boolean\n\nenv:\n  SHERPA_ONNX_IS_IN_GITHUB_ACTIONS: 1\n\nconcurrency:\n  group: build-wheels-win64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  core:\n    name: core\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n      - name: Install sccache\n        run: choco install sccache -y\n\n      - name: Cache sccache\n        uses: actions/cache@v3\n        with:\n          path: C:\\Users\\runneradmin\\AppData\\Local\\Mozilla\\sccache\n          key: ${{ matrix.os }}-sccache-core\n          restore-keys: |\n            ${{ matrix.os }}-sccache-core-\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -D CMAKE_C_COMPILER_LAUNCHER=sccache \\\n            -D CMAKE_CXX_COMPILER_LAUNCHER=sccache \\\n            -A x64 \\\n            -D SHERPA_ONNX_ENABLE_TTS=ON \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n 
         cmake --build . --config Release  -- -m:2\n          cmake --build . --config Release --target install -- -m:2\n\n          ls -lh ./bin/Release/sherpa-onnx.exe\n\n      - name: Show sccache stats\n        run: sccache --show-stats\n\n      - name: Show\n        shell: bash\n        run: |\n          echo \"---bin---\"\n          ls -lh build/install/bin\n          echo \"---lib---\"\n          ls -lh build/install/lib\n          echo \"---include---\"\n          ls -lh build/install/include\n\n      - name: Copy files\n        shell: bash\n        run: |\n          cd build\n          echo 'sherpa-onnx-core'\n          mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./install/lib/onnxruntime.dll ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          cp -v ./install/lib/sherpa-onnx-*.dll ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n          # keep the *.lib file so users can write code to link with our dll\n          cp -v ./install/lib/sherpa-onnx-*.lib ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/lib\n\n          mkdir -p ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n          cp -v ./install/include/sherpa-onnx/c-api/*.h ../scripts/wheel/sherpa-onnx-core/sherpa_onnx/include/sherpa-onnx/c-api\n\n          pushd ../scripts/wheel/sherpa-onnx-core\n          python3 setup.py bdist_wheel --plat-name=win_amd64\n\n          ls -lh dist\n\n          popd\n\n          echo 'sherpa-onnx-bin'\n\n          mkdir -p ../scripts/wheel/sherpa-onnx-bin/bin\n          cp -v ./install/bin/sherpa-onnx* ../scripts/wheel/sherpa-onnx-bin/bin\n\n          pushd ../scripts/wheel/sherpa-onnx-bin\n          python3 setup.py bdist_wheel --plat-name=win_amd64\n\n          ls -lh dist\n\n          popd\n\n      - name: Collect wheels\n        shell: bash\n        run: |\n          mkdir wheelhouse\n          cp -v ./scripts/wheel/sherpa-onnx-core/dist/*.whl ./wheelhouse\n          cp -v 
./scripts/wheel/sherpa-onnx-bin/dist/*.whl ./wheelhouse\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheels-core-win-x64\n          path: ./wheelhouse/*.whl\n\n      - name: Show wheels\n        shell: bash\n        run: |\n          ls -lh ./scripts/wheel/sherpa-onnx-core/dist\n          ls -lh ./scripts/wheel/sherpa-onnx-bin/dist\n\n          unzip -l ./scripts/wheel/sherpa-onnx-core/dist/*.whl\n          echo \"---\"\n          unzip -l ./scripts/wheel/sherpa-onnx-bin/dist/*.whl\n\n  test:\n    name: test\n    needs: [core]\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Retrieve artifact from Windows x64\n        uses: actions/download-artifact@v4\n        with:\n          name: wheels-core-win-x64\n          path: /tmp/wheels\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh /d/tmp/wheels\n\n      - name: Install\n        shell: bash\n        run: |\n          python3 -m pip install /d/tmp/wheels/*.whl\n\n      - name: Show version\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n          which sherpa-onnx-version\n\n      - name: Show help\n        shell: bash\n        run: |\n          sherpa-onnx --help\n\n          echo \"---\"\n\n          sherpa-onnx-offline --help\n\n          echo \"---\"\n\n          sherpa-onnx-vad --help\n\n          which sherpa-onnx-vad\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n          
  git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v /d/tmp/wheels/*.whl $d/\n\n            git status\n            git add .\n            git commit -m \"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI ${{ github.event.inputs.publish_sherpa_onnx_bin }}\n        if: ${{ (github.event.inputs.publish_sherpa_onnx_bin || 'true') == 'true' }}\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine==5.0.0 setuptools\n\n          twine upload /d/tmp/wheels/*.whl\n\n  build_wheels_win64:\n    needs: [core, test]\n    name: ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        python-version: [\"3.7\", \"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n  
      with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Build wheels (cmd)\n        shell: bash\n        run: |\n          python3 -m pip install setuptools wheel twine\n\n          export SHERPA_ONNX_SPLIT_PYTHON_PACKAGE=ON\n\n          export SHERPA_ONNX_CMAKE_ARGS=\"-DSHERPA_ONNX_ENABLE_BINARY=OFF -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF -DSHERPA_ONNX_ENABLE_C_API=OFF -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF\"\n\n          python3 setup.py bdist_wheel\n\n          ls -lh ./dist/\n\n          mv dist wheelhouse\n\n      - name: Display wheels\n        shell: bash\n        run: |\n          ls -lh ./wheelhouse/\n          unzip -l ./wheelhouse/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.python-version }}\n          path: ./wheelhouse/*.whl\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            d=cpu/$SHERPA_ONNX_VERSION\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            mkdir -p $d\n\n            cp -v ../wheelhouse/*.whl $d/\n\n            git status\n            git add .\n            git commit -m 
\"add more wheels\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-wheels main\n\n      - name: Publish wheels to PyPI\n        shell: bash\n        env:\n          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}\n          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}\n        run: |\n          python3 -m pip install --upgrade pip\n          if [[ ${{ matrix.python-version }} == \"3.7\" ]]; then\n            python3 -m pip install wheel twine setuptools\n          else\n            python3 -m pip install wheel twine==5.0.0 setuptools\n          fi\n\n          twine upload ./wheelhouse/*.whl\n"
  },
  {
    "path": ".github/workflows/build-xcframework.yaml",
    "content": "name: build-xcframework\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - './build-ios.sh'\n      - '.github/workflows/build-xcframework.yaml'\n      - 'CMakeLists.txt'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: build-xcframework-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_xcframework:\n    name: tts-${{ matrix.with_tts }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        with_tts: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build iOS shared\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          export CMAKE_VERBOSE_MAKEFILE=ON\n          ./build-ios-shared.sh\n\n      - name: Build iOS\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          ./build-ios.sh\n\n      - name: Build iOS (No tts)\n        if: matrix.with_tts == 'OFF'\n        shell: bash\n        run: |\n          ./build-ios-no-tts.sh\n\n      - name: Display artifacts\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          brew install tree\n          tree -L 2 ./build-ios\n\n      - name: Display artifacts\n        if: matrix.with_tts == 'OFF'\n        shell: bash\n        run: |\n          brew install tree\n          tree -L 2 ./build-ios-no-tts\n\n      - name: Package artifacts\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          rm -rf build-ios/build\n          rm -rf 
build-ios/install\n          rm -rf build-ios/ios-onnxruntime/.git\n\n          tree build-ios\n\n          filename=sherpa-onnx-${SHERPA_ONNX_VERSION}-ios.tar.bz2\n\n          tar cjvf $filename ./build-ios\n\n          ls -lh\n\n      - name: Package artifacts\n        if: matrix.with_tts == 'OFF'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          rm -rf build-ios-no-tts/build\n          rm -rf build-ios-no-tts/install\n          rm -rf build-ios-no-tts/ios-onnxruntime/.git\n\n          tree build-ios-no-tts\n\n          filename=sherpa-onnx-${SHERPA_ONNX_VERSION}-ios-no-tts.tar.bz2\n\n          tar cjvf $filename ./build-ios-no-tts\n\n          ls -lh\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.with_tts == 'ON'\n        with:\n          name: sherpa-onnx-ios-libs\n          path: ./build-ios\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.with_tts == 'OFF'\n        with:\n          name: sherpa-onnx-ios-libs-no-tts\n          path: ./build-ios-no-tts\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n\n            cp -v ../sherpa-onnx-*.tar.bz2 ./\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-ios.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release xcframework\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/c-api-from-buffer.yaml",
    "content": "name: c-api-from-memory\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/c-api-from-buffer.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'c-api-examples/**'\n      - 'ffmpeg-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: c-api-from-buffer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  c_api_from_buffer:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-c-api-shared\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          make -j2 install\n\n          ls -lh install/lib\n          ls -lh install/include\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./install/lib/libsherpa-onnx-c-api.so\n            echo \"---\"\n            readelf -d ./install/lib/libsherpa-onnx-c-api.so\n          fi\n\n          if [[ ${{ matrix.os }} == macos-latest ]]; then\n            otool -L ./install/lib/libsherpa-onnx-c-api.dylib\n          fi\n\n      - name: Test streaming zipformer with tokens and hotwords 
loaded from buffers\n        shell: bash\n        run: |\n          gcc -o streaming-zipformer-buffered-tokens-hotwords-c-api ./c-api-examples/streaming-zipformer-buffered-tokens-hotwords-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-zipformer-buffered-tokens-hotwords-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./streaming-zipformer-buffered-tokens-hotwords-c-api\n            echo \"----\"\n            readelf -d ./streaming-zipformer-buffered-tokens-hotwords-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n          curl -SL -O https://huggingface.co/desh2608/icefall-asr-librispeech-pruned-transducer-stateless7-streaming-small/resolve/main/data/lang_bpe_500/bpe.model\n          cp bpe.model sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\n          rm bpe.model\n\n          printf \"▁A ▁T ▁P :1.5\\n▁A ▁B ▁C :3.0\" > hotwords.txt\n          mv hotwords.txt ./sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\n\n          ls -lh sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-zipformer-buffered-tokens-hotwords-c-api\n\n          rm -rf sherpa-onnx-streaming-zipformer-*\n\n      - name: Test streaming paraformer with tokens loaded from buffers\n        shell: bash\n        run: |\n          gcc -o streaming-paraformer-buffered-tokens-c-api 
./c-api-examples/streaming-paraformer-buffered-tokens-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-paraformer-buffered-tokens-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./streaming-paraformer-buffered-tokens-c-api\n            echo \"----\"\n            readelf -d ./streaming-paraformer-buffered-tokens-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-paraformer-bilingual-zh-en\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-paraformer-buffered-tokens-c-api\n\n          rm -rf sherpa-onnx-streaming-paraformer-*\n\n      - name: Test streaming ctc with tokens loaded from buffers\n        shell: bash\n        run: |\n          gcc -o streaming-ctc-buffered-tokens-c-api ./c-api-examples/streaming-ctc-buffered-tokens-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-ctc-buffered-tokens-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./streaming-ctc-buffered-tokens-c-api\n            echo \"----\"\n            readelf -d ./streaming-ctc-buffered-tokens-c-api\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-ctc-buffered-tokens-c-api\n\n          rm -rf sherpa-onnx-streaming-zipformer-ctc-*\n\n      - name: Test keywords spotting with tokens and keywords loaded from buffers\n        shell: bash\n        run: |\n          gcc -o keywords-spotter-buffered-tokens-keywords-c-api ./c-api-examples/keywords-spotter-buffered-tokens-keywords-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh keywords-spotter-buffered-tokens-keywords-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./keywords-spotter-buffered-tokens-keywords-c-api\n            echo \"----\"\n            readelf -d ./keywords-spotter-buffered-tokens-keywords-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n\n          ls -lh sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile\n          echo \"---\"\n          ls -lh 
sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./keywords-spotter-buffered-tokens-keywords-c-api\n\n          rm -rf sherpa-onnx-kws-zipformer-*\n"
  },
  {
    "path": ".github/workflows/c-api.yaml",
    "content": "name: c-api\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/c-api.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'c-api-examples/**'\n      - 'ffmpeg-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: c-api-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  c_api:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-c-api-shared\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          make -j2 install\n\n          ls -lh install/lib\n          ls -lh install/include\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./install/lib/libsherpa-onnx-c-api.so\n            echo \"---\"\n            readelf -d ./install/lib/libsherpa-onnx-c-api.so\n          fi\n\n          if [[ ${{ matrix.os }} == macos-latest ]]; then\n            otool -L ./install/lib/libsherpa-onnx-c-api.dylib\n          fi\n\n      - name: Test Moonshine v2\n        shell: bash\n        run: |\n          
name=moonshine-v2-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-moonshine-*\n\n      - name: Test FireRedASR CTC\n        shell: bash\n        run: |\n          name=fire-red-asr-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n          tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n          rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n        
  ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-fire-*\n\n      - name: Test online punctuation\n        shell: bash\n        run: |\n          name=add-punctuation-online-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n          tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n          rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-online-punct-en-2024-08-06\n\n      - name: Test PocketTTS\n        shell: bash\n        run: |\n          name=pocket-tts-en-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n          tar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n          rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n          export 
LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: pocket-tts-wavs-${{ matrix.os }}\n          path: ./generated-pocket-en.wav\n\n      - name: Test SupertonicTTS\n        shell: bash\n        run: |\n          name=supertonic-tts-en-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n          tar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n          rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-supertonic-tts-int8-2026-03-06\n\n          ls -lh ./generated-supertonic-en-c.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: supertonic-tts-wavs-${{ matrix.os }}\n          path: ./generated-supertonic-en-c.wav\n\n      - name: Test ZipVoiceTTS\n        shell: bash\n        run: |\n          name=zipvoice-tts-zh-en-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          
ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n          tar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n          rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n\n          ls -lh ./generated-zipvoice-zh-en-c.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: zipvoice-tts-wavs-${{ matrix.os }}\n          path: ./generated-zipvoice-zh-en-c.wav\n\n      - name: Test FunASR Nano\n        shell: bash\n        run: |\n          name=funasr-nano-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n          tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n          rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\n          export 
LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-funasr-*\n\n      - name: Test MedASR CTC\n        shell: bash\n        run: |\n          name=medasr-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n          tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n          rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-medasr-*\n\n      - name: Test Omnilingual ASR CTC\n        shell: bash\n        run: |\n          name=omnilingual-asr-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n          tar xvf 
sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n          rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-omnilingual-*\n\n      - name: Test Wenet CTC\n        shell: bash\n        run: |\n          name=wenet-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n          tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n          rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-wenetspeech-*\n\n      - name: Test T-one\n        shell: bash\n        run: |\n          name=streaming-t-one-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n   
         ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n          tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n          rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-streaming-t-one-russian-2025-09-08\n\n      - name: Test KittenTTS\n        shell: bash\n        run: |\n          name=kitten-tts-en-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n          tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n          rm kitten-nano-en-v0_1-fp16.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf kitten-nano-en-v0_1-fp16\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: kitten-tts-wavs-${{ matrix.os }}\n          path: ./generated-kitten-en.wav\n\n      - name: Test streaming zipformer with homophone replacer\n        shell: bash\n        run: |\n          name=streaming-zipformer-with-hr-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n  
          -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n          echo \"---\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n          tar xf dict.tar.bz2\n          rm dict.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-streaming-zipformer-*\n          rm -rf dict lexicon.txt test-hr.wav replace.fst\n          rm -v $name\n\n      - name: Test NeMo Canary\n        shell: bash\n        run: |\n          name=nemo-canary-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm 
]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n          tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n          rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n\n      - name: Test Dolphin CTC\n        shell: bash\n        run: |\n          name=dolphin-ctc-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n          tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n          rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm $name\n          rm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\n      - name: Test speech enhancement (GTCRN)\n        shell: bash\n        run: |\n          name=speech-enhancement-gtcrn-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I 
./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n          rm -rf denoised-wavs\n          mkdir -p denoised-wavs\n          cp -v inp_16k.wav denoised-wavs\n          cp -v enhanced.wav denoised-wavs/enhanced-gtcrn.wav\n          rm -fv *.onnx enhanced.wav\n\n          rm $name\n\n      - name: Test speech enhancement (DPDFNet)\n        shell: bash\n        run: |\n          name=speech-enhancement-dpdfnet-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          
mkdir -p denoised-wavs\n          cp -v enhanced.wav denoised-wavs/enhanced-dpdfnet.wav\n          rm -fv *.onnx enhanced.wav\n\n          rm $name\n\n      - name: Test online speech enhancement (GTCRN)\n        shell: bash\n        run: |\n          name=online-speech-enhancement-gtcrn-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n          mkdir -p denoised-wavs\n          cp -v enhanced-online-gtcrn.wav denoised-wavs/\n          rm -fv *.onnx enhanced-online-gtcrn.wav\n\n          rm $name\n\n      - name: Test online speech enhancement (DPDFNet)\n        shell: bash\n        run: |\n          name=online-speech-enhancement-dpdfnet-c-api\n          gcc -o $name ./c-api-examples/$name.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n          mkdir -p denoised-wavs\n          cp -v enhanced-online-dpdfnet.wav denoised-wavs/\n          rm -fv *.onnx enhanced-online-dpdfnet.wav\n\n          rm $name\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: denoised-wavs-${{ matrix.os }}\n          path: ./denoised-wavs/*.wav\n\n      - name: Test FireRedAsr\n        shell: bash\n        run: |\n          gcc -o fire-red-asr-c-api ./c-api-examples/fire-red-asr-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh fire-red-asr-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./fire-red-asr-c-api\n            echo \"----\"\n            readelf -d ./fire-red-asr-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n          tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n          rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\n          ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n          echo \"---\"\n          ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./fire-red-asr-c-api\n\n          rm -rf sherpa-onnx-fire-red-asr-*\n\n      - name: Test kws 
(zh)\n        shell: bash\n        run: |\n          gcc -o kws-c-api ./c-api-examples/kws-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kws-c-api\n\n          rm ./kws-c-api\n          rm -rf sherpa-onnx-kws-*\n\n      - name: Test Kokoro TTS (zh+en)\n        shell: bash\n        run: |\n          gcc -o kokoro-tts-zh-en-c-api ./c-api-examples/kokoro-tts-zh-en-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n          tar xf kokoro-multi-lang-v1_0.tar.bz2\n          rm kokoro-multi-lang-v1_0.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kokoro-tts-zh-en-c-api\n\n          rm ./kokoro-tts-zh-en-c-api\n          rm -rf kokoro-multi-lang-*\n\n      - name: Test Kokoro TTS (en)\n        shell: bash\n        run: |\n          gcc -o kokoro-tts-en-c-api ./c-api-examples/kokoro-tts-en-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n          tar xf kokoro-en-v0_19.tar.bz2\n          rm kokoro-en-v0_19.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kokoro-tts-en-c-api\n\n          rm ./kokoro-tts-en-c-api\n          rm -rf kokoro-en-*\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: kokoro-tts-${{ matrix.os }}\n          path: ./generated-kokoro-*.wav\n\n      - name: Test Matcha TTS (zh)\n        shell: bash\n        run: |\n          gcc -o matcha-tts-zh-c-api ./c-api-examples/matcha-tts-zh-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n          tar xvf matcha-icefall-zh-baker.tar.bz2\n          rm matcha-icefall-zh-baker.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./matcha-tts-zh-c-api\n\n          rm ./matcha-tts-zh-c-api\n          rm -rf matcha-icefall-*\n          rm vocos-22khz-univ.onnx\n\n      - name: Test Matcha TTS (en)\n        shell: bash\n        run: |\n          gcc -o matcha-tts-en-c-api ./c-api-examples/matcha-tts-en-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n          tar xvf 
matcha-icefall-en_US-ljspeech.tar.bz2\n          rm matcha-icefall-en_US-ljspeech.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./matcha-tts-en-c-api\n\n          rm ./matcha-tts-en-c-api\n          rm -rf matcha-icefall-*\n          rm vocos-22khz-univ.onnx\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: matcha-tts-${{ matrix.os }}\n          path: ./generated-matcha-*.wav\n\n      - name: Test silero-vad + Whisper tiny.en\n        shell: bash\n        run: |\n          gcc -o vad-whisper-c-api ./c-api-examples/vad-whisper-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n          tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n          rm sherpa-onnx-whisper-tiny.en.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-whisper-c-api\n\n          rm -rf sherpa-onnx-*\n          rm -rf *.onnx\n          rm *.wav\n\n      - name: Test ten-vad + Whisper tiny.en\n        shell: bash\n        run: |\n          gcc -o vad-whisper-c-api ./c-api-examples/vad-whisper-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n 
           -l onnxruntime\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n          tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n          rm sherpa-onnx-whisper-tiny.en.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-whisper-c-api\n\n          rm -rf sherpa-onnx-*\n          rm -rf *.onnx\n          rm *.wav\n\n      - name: Test silero-vad + Moonshine\n        shell: bash\n        run: |\n          gcc -o vad-moonshine-c-api ./c-api-examples/vad-moonshine-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-moonshine-c-api\n\n          rm -rf sherpa-onnx-*\n          rm -rf *.onnx\n          rm *.wav\n\n      - name: Test ten-vad + Moonshine\n        shell: bash\n        run: |\n          gcc -o vad-moonshine-c-api 
./c-api-examples/vad-moonshine-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-moonshine-c-api\n\n          rm -rf sherpa-onnx-*\n          rm -rf *.onnx\n          rm *.wav\n\n      - name: Test Moonshine\n        shell: bash\n        run: |\n          gcc -o moonshine-c-api ./c-api-examples/moonshine-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./moonshine-c-api\n\n          rm -rf sherpa-onnx-*\n\n      - name: Test ffmpeg\n        # if: matrix.os == 'macos-latest'\n        if: false\n        shell: bash\n        run: |\n          brew install ffmpeg\n\n          cd ffmpeg-examples\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n          make\n          ls -lh\n          ./run.sh\n          rm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n\n      - name: Test silero-vad + sense-voice\n        shell: bash\n        run: |\n          gcc -o vad-sense-voice-c-api ./c-api-examples/vad-sense-voice-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh vad-sense-voice-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./vad-sense-voice-c-api\n            echo \"----\"\n            readelf -d ./vad-sense-voice-c-api\n          fi\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n          echo \"---\"\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-sense-voice-c-api\n\n          rm -rf sherpa-onnx-sense-voice-*\n 
         rm -rf *.onnx\n          rm *.wav\n\n      - name: Test ten-vad + sense-voice\n        shell: bash\n        run: |\n          gcc -o vad-sense-voice-c-api ./c-api-examples/vad-sense-voice-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh vad-sense-voice-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./vad-sense-voice-c-api\n            echo \"----\"\n            readelf -d ./vad-sense-voice-c-api\n          fi\n\n          # Now download models\n          #\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n          echo \"---\"\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./vad-sense-voice-c-api\n\n          rm -rf sherpa-onnx-sense-voice-*\n          rm -rf *.onnx\n          rm *.wav\n\n      - name: Test sense-voice\n        shell: bash\n        run: |\n          gcc -o sense-voice-c-api ./c-api-examples/sense-voice-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh sense-voice-c-api\n\n          if [[ 
${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./sense-voice-c-api\n            echo \"----\"\n            readelf -d ./sense-voice-c-api\n          fi\n\n          # Now download models\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n          echo \"---\"\n          ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./sense-voice-c-api\n\n          rm -rf sherpa-onnx-sense-voice-*\n\n      - name: Test whisper\n        shell: bash\n        run: |\n          gcc -o whisper-c-api ./c-api-examples/whisper-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh whisper-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./whisper-c-api\n            echo \"----\"\n            readelf -d ./whisper-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n          tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n          rm sherpa-onnx-whisper-tiny.tar.bz2\n\n          ls -lh sherpa-onnx-whisper-tiny\n          echo \"---\"\n          ls -lh sherpa-onnx-whisper-tiny/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          
./whisper-c-api\n\n          rm -rf sherpa-onnx-whisper-*\n\n      - name: Test non-streaming zipformer\n        shell: bash\n        run: |\n          gcc -o zipformer-c-api ./c-api-examples/zipformer-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh zipformer-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./zipformer-c-api\n            echo \"----\"\n            readelf -d ./zipformer-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n          tar xvf sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n          rm sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n\n          ls -lh sherpa-onnx-zipformer-small-en-2023-06-26\n          echo \"---\"\n          ls -lh sherpa-onnx-zipformer-small-en-2023-06-26/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./zipformer-c-api\n\n          rm -rf sherpa-onnx-zipformer-*\n\n      - name: Test streaming zipformer\n        shell: bash\n        run: |\n          gcc -o streaming-zipformer-c-api ./c-api-examples/streaming-zipformer-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-zipformer-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./streaming-zipformer-c-api\n            echo \"----\"\n            readelf -d ./streaming-zipformer-c-api\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-zipformer-c-api\n\n          rm -rf sherpa-onnx-streaming-zipformer-*\n\n      - name: Test non-streaming paraformer\n        shell: bash\n        run: |\n          gcc -o paraformer-c-api ./c-api-examples/paraformer-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh paraformer-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./paraformer-c-api\n            echo \"----\"\n            readelf -d ./paraformer-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n          tar xvf sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n          rm sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n\n          ls -lh sherpa-onnx-paraformer-zh-small-2024-03-09\n          echo \"---\"\n          ls -lh sherpa-onnx-paraformer-zh-small-2024-03-09/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./paraformer-c-api\n\n          rm -rf sherpa-onnx-paraformer-*\n\n      - name: Test streaming paraformer\n        shell: bash\n        
run: |\n          gcc -o streaming-paraformer-c-api ./c-api-examples/streaming-paraformer-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-paraformer-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./streaming-paraformer-c-api\n            echo \"----\"\n            readelf -d ./streaming-paraformer-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-paraformer-bilingual-zh-en\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-paraformer-c-api\n\n          rm -rf sherpa-onnx-streaming-paraformer-*\n\n      - name: Test telespeech\n        shell: bash\n        run: |\n          gcc -o telespeech-c-api ./c-api-examples/telespeech-c-api.c \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh telespeech-c-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            ldd ./telespeech-c-api\n            echo \"----\"\n            readelf -d ./telespeech-c-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n          tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n          rm 
sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n\n          ls -lh sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\n          echo \"---\"\n          ls -lh sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./telespeech-c-api\n\n          rm -rf sherpa-onnx-telespeech-*\n"
  },
  {
    "path": ".github/workflows/checksum.yaml",
    "content": "name: Create checksum\n\non:\n  schedule:\n    - cron: \"0 1 * * *\" # Runs at 1:00 AM UTC daily\n  workflow_dispatch:\n\njobs:\n  checksum:\n    if: github.repository_owner == 'k2-fsa'\n    runs-on: macos-latest\n    strategy:\n      matrix:\n        tag: [null, asr-models, tts-models, kws-models, speaker-recongition-models, audio-tagging-models, punctuation-models]\n    steps:\n      - name: Run checksum action\n        uses: thewh1teagle/checksum@v1\n        with:\n          tag: ${{ matrix.tag }}\n        env:\n          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}\n"
  },
  {
    "path": ".github/workflows/clang-tidy.yaml",
    "content": "name: clang-tidy\n\non:\n  push:\n    branches:\n      - master\n      - clang-tidy\n    paths:\n      - 'sherpa-onnx/csrc/**'\n\n\n  workflow_dispatch:\n\nconcurrency:\n  group: clang-tidy-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  clang-tidy:\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        python-version: [3.8]\n      fail-fast: false\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install clang-tidy\n        shell: bash\n        run: |\n          pip install clang-tidy\n\n      - name: Configure\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake -DSHERPA_ONNX_ENABLE_PYTHON=ON -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..\n\n      - name: Check with clang-tidy\n        shell: bash\n        run: |\n          cd build\n          make check\n"
  },
  {
    "path": ".github/workflows/cxx-api.yaml",
    "content": "name: cxx-api\n\non:\n  push:\n    branches:\n      - master\n      - cxx-api-asr-non-streaming\n    paths:\n      - '.github/workflows/cxx-api.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'cxx-api-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: cxx-api-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  cxx_api:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-cxx-api-shared\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          make -j2 install\n\n          ls -lh install/lib\n          ls -lh install/include\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./install/lib/libsherpa-onnx-c-api.so\n            ldd ./install/lib/libsherpa-onnx-cxx-api.so\n            echo \"---\"\n            readelf -d ./install/lib/libsherpa-onnx-c-api.so\n            readelf -d ./install/lib/libsherpa-onnx-cxx-api.so\n          fi\n\n          if [[ ${{ matrix.os }} == macos-latest ]]; then\n            otool -L 
./install/lib/libsherpa-onnx-c-api.dylib\n            otool -L ./install/lib/libsherpa-onnx-cxx-api.dylib\n          fi\n\n      - name: Test Moonshine v2\n        shell: bash\n        run: |\n          name=moonshine-v2-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-moonshine-*\n          rm -v ./$name\n\n      - name: Test FireRedASR CTC\n        shell: bash\n        run: |\n          name=fire-red-asr-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n          tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n          rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-fire-red-*\n          rm -v ./$name\n\n      - name: Test PocketTTS\n        shell: bash\n        run: |\n          name=pocket-tts-en-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n          tar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n          rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n          rm -v ./$name\n\n          ls -lh ./generated-pocket-en-cxx.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: pocket-tts-wavs-${{ matrix.os }}\n          path: ./generated-pocket-en-cxx.wav\n\n      - name: Test SupertonicTTS\n        shell: bash\n        run: |\n       
   name=supertonic-tts-en-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n          tar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n          rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-supertonic-tts-int8-2026-03-06\n          rm -v ./$name\n\n          ls -lh ./generated-supertonic-en-cxx.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: supertonic-tts-wavs-${{ matrix.os }}\n          path: ./generated-supertonic-en-cxx.wav\n\n      - name: Test ZipVoiceTTS\n        shell: bash\n        run: |\n          name=zipvoice-tts-zh-en-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n          tar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n          rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n          rm -v ./$name\n\n          ls -lh ./generated-zipvoice-zh-en-cxx.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: zipvoice-tts-wavs-${{ matrix.os }}\n          path: ./generated-zipvoice-zh-en-cxx.wav\n\n      - name: Test FunASR Nano\n        shell: bash\n        run: |\n          name=funasr-nano-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n          tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n          rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          
./$name\n\n          rm -rf sherpa-onnx-funasr-*\n          rm -v ./$name\n\n      - name: Test MedASR CTC\n        shell: bash\n        run: |\n          name=medasr-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n          tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n          rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-medasr-*\n          rm -v ./$name\n\n      - name: Test Omnilingual ASR CTC\n        shell: bash\n        run: |\n          name=omnilingual-asr-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n          tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n          rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-omnilingual-*\n          rm -v ./$name\n\n      - name: Test Online punctuation\n        shell: bash\n        run: |\n          name=online-punctuation-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n          tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n          rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-online-punct-*\n          rm -v ./$name\n\n      - name: Test Offline punctuation\n        shell: bash\n        run: |\n          name=offline-punctuation-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I 
./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n          tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n          rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-punct-*\n          rm -v ./$name\n\n      - name: Test CED audio tagging\n        shell: bash\n        run: |\n          name=audio-tagging-ced-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n          tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n          rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n\n          echo 
\"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-ced-*\n          rm -v ./$name\n\n      - name: Test Zipformer audio tagging\n        shell: bash\n        run: |\n          name=audio-tagging-zipformer-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n          tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n          rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-zipformer-*\n          rm -v ./$name\n\n      - name: Test Wenet CTC\n        shell: bash\n        run: |\n          name=wenet-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            
ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n          tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n          rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-wenetspeech-*\n          rm -v ./$name\n\n      - name: Test T-one\n        shell: bash\n        run: |\n          name=streaming-t-one-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n          tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n          rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-streaming-t-one-russian-2025-09-08\n          rm -v ./$name\n\n      - name: Test KittenTTS\n        shell: bash\n    
    run: |\n          name=kitten-tts-en-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ls -lh ./$name\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n          tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n          rm kitten-nano-en-v0_1-fp16.tar.bz2\n\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf kitten-nano-en-v0_1-fp16\n          rm -v ./$name\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: kitten-tts-wavs-${{ matrix.os }}\n          path: ./generated-kitten-en-cxx.wav\n\n      - name: Test NeMo Canary\n        shell: bash\n        run: |\n          name=nemo-canary-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n          tar xvf 
sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n          rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\n          ls -lh sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n          echo \"---\"\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-nemo-canary-*\n          rm -v ./$name\n\n      - name: Test streaming zipformer with Homophone replacer\n        shell: bash\n        run: |\n          name=streaming-zipformer-with-hr-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n          echo \"---\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n          tar xf dict.tar.bz2\n          rm dict.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./$name\n\n          rm -rf sherpa-onnx-streaming-zipformer-*\n          rm -rf dict lexicon.txt test-hr.wav replace.fst\n          rm -v ./$name\n\n      - name: Test Dolphin CTC\n        shell: bash\n        run: |\n          name=dolphin-ctc-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n          tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n          rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\n          ./$name\n\n          rm -rf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\n\n          rm $name\n\n      - name: Test silero-vad\n        shell: bash\n        run: |\n          name=vad-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          export 
LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n          ./$name\n\n          mkdir vad-test-silero-vad\n          cp -v lei-jun-test*.wav vad-test-silero-vad\n\n          ls -lh vad-test-silero-vad\n\n          rm $name\n          rm -fv *.onnx\n          rm -fv *.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: silero-vad-test-wavs-cxx-${{ matrix.os }}\n          path: ./vad-test-silero-vad/*.wav\n\n      - name: Test ten-vad\n        shell: bash\n        run: |\n          name=vad-cxx-api\n          g++ -std=c++17 -o $name ./cxx-api-examples/$name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $name\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./$name\n            echo \"----\"\n            readelf -d ./$name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n\n          ./$name\n\n          mkdir vad-test-ten-vad\n          cp -v lei-jun-test*.wav vad-test-ten-vad\n\n          
ls -lh vad-test-ten-vad\n\n          rm $name\n          rm -fv *.onnx\n          rm -rf *.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ten-vad-test-wavs-cxx-${{ matrix.os }}\n          path: ./vad-test-ten-vad/*.wav\n\n      - name: Test Speech Enhancement\n        shell: bash\n        run: |\n          gtcrn_name=speech-enhancement-gtcrn-cxx-api\n          g++ -std=c++17 -o $gtcrn_name ./cxx-api-examples/$gtcrn_name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          dpdfnet_name=speech-enhancement-dpdfnet-cxx-api\n          g++ -std=c++17 -o $dpdfnet_name ./cxx-api-examples/$dpdfnet_name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          online_gtcrn_name=online-speech-enhancement-gtcrn-cxx-api\n          g++ -std=c++17 -o $online_gtcrn_name ./cxx-api-examples/$online_gtcrn_name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          online_dpdfnet_name=online-speech-enhancement-dpdfnet-cxx-api\n          g++ -std=c++17 -o $online_dpdfnet_name ./cxx-api-examples/$online_dpdfnet_name.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh $gtcrn_name $dpdfnet_name $online_gtcrn_name $online_dpdfnet_name\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os 
}} == ubuntu-22.04-arm ]]; then\n            ldd ./$gtcrn_name\n            echo \"----\"\n            readelf -d ./$gtcrn_name\n            echo \"----\"\n            ldd ./$dpdfnet_name\n            echo \"----\"\n            readelf -d ./$dpdfnet_name\n            echo \"----\"\n            ldd ./$online_gtcrn_name\n            echo \"----\"\n            readelf -d ./$online_gtcrn_name\n            echo \"----\"\n            ldd ./$online_dpdfnet_name\n            echo \"----\"\n            readelf -d ./$online_dpdfnet_name\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\n          ./$gtcrn_name\n          ./$dpdfnet_name\n          ./$online_gtcrn_name\n          ./$online_dpdfnet_name\n\n          mkdir denoised-wavs\n          cp -v inp_16k.wav denoised-wavs\n          cp -v enhanced-gtcrn.wav denoised-wavs\n          cp -v enhanced-dpdfnet.wav denoised-wavs\n          cp -v enhanced-online-gtcrn.wav denoised-wavs\n          cp -v enhanced-online-dpdfnet.wav denoised-wavs\n\n          rm $gtcrn_name $dpdfnet_name $online_gtcrn_name $online_dpdfnet_name\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: denoised-wavs-cxx-${{ matrix.os }}\n          path: ./denoised-wavs/*.wav\n\n      - name: Test FireRedAsr\n        shell: bash\n        run: |\n          g++ -std=c++17 -o fire-red-asr-cxx-api ./cxx-api-examples/fire-red-asr-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh fire-red-asr-cxx-api\n\n          export 
LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./fire-red-asr-cxx-api\n            echo \"----\"\n            readelf -d ./fire-red-asr-cxx-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n          tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n          rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\n          ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n          echo \"---\"\n          ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs\n\n          ./fire-red-asr-cxx-api\n\n          rm -rf sherpa-onnx-fire-red-asr-*\n\n      - name: Test KWS (zh)\n        shell: bash\n        run: |\n          g++ -std=c++17 -o kws-cxx-api ./cxx-api-examples/kws-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n          rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kws-cxx-api\n\n          rm kws-cxx-api\n          rm -rf sherpa-onnx-kws-*\n\n      - name: Test Kokoro TTS (zh+en)\n        shell: bash\n        run: |\n          g++ -std=c++17 -o kokoro-tts-zh-en-cxx-api ./cxx-api-examples/kokoro-tts-zh-en-cxx-api.cc 
\\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n          tar xf kokoro-multi-lang-v1_0.tar.bz2\n          rm kokoro-multi-lang-v1_0.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kokoro-tts-zh-en-cxx-api\n\n          rm kokoro-tts-zh-en-cxx-api\n          rm -rf kokoro-*\n\n      - name: Test Kokoro TTS (en)\n        shell: bash\n        run: |\n          g++ -std=c++17 -o kokoro-tts-en-cxx-api ./cxx-api-examples/kokoro-tts-en-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n          tar xf kokoro-en-v0_19.tar.bz2\n          rm kokoro-en-v0_19.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./kokoro-tts-en-cxx-api\n\n          rm kokoro-tts-en-cxx-api\n          rm -rf kokoro-en-*\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: kokoro-tts-${{ matrix.os }}\n          path: ./generated-kokoro-*.wav\n\n      - name: Test Matcha TTS (zh)\n        shell: bash\n        run: |\n          g++ -std=c++17 -o matcha-tts-zh-cxx-api ./cxx-api-examples/matcha-tts-zh-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n       
   curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n          tar xvf matcha-icefall-zh-baker.tar.bz2\n          rm matcha-icefall-zh-baker.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./matcha-tts-zh-cxx-api\n\n          rm -rf matcha-icefall-*\n          rm vocos-22khz-univ.onnx\n          rm matcha-tts-zh-cxx-api\n\n      - name: Test Matcha TTS (en)\n        shell: bash\n        run: |\n          g++ -std=c++17 -o matcha-tts-en-cxx-api ./cxx-api-examples/matcha-tts-en-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n          tar xvf matcha-icefall-en_US-ljspeech.tar.bz2\n          rm matcha-icefall-en_US-ljspeech.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./matcha-tts-en-cxx-api\n\n          rm matcha-tts-en-cxx-api\n          rm -rf matcha-icefall-*\n          rm vocos-22khz-univ.onnx\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: matcha-tts-${{ matrix.os }}\n          path: ./generated-matcha-*.wav\n\n      - name: Test Moonshine tiny\n        shell: bash\n        run: |\n          g++ -std=c++17 -o moonshine-cxx-api ./cxx-api-examples/moonshine-cxx-api.cc \\\n            -I ./build/install/include 
\\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n          rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./moonshine-cxx-api\n\n          rm -rf sherpa-onnx-*\n          rm ./moonshine-cxx-api\n\n      - name: Test whisper\n        shell: bash\n        run: |\n          g++ -std=c++17 -o whisper-cxx-api ./cxx-api-examples/whisper-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh whisper-cxx-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./whisper-cxx-api\n            echo \"----\"\n            readelf -d ./whisper-cxx-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n          tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n          rm sherpa-onnx-whisper-tiny.en.tar.bz2\n\n          ls -lh sherpa-onnx-whisper-tiny.en\n          echo \"---\"\n          ls -lh sherpa-onnx-whisper-tiny.en/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./whisper-cxx-api\n\n          rm -rf sherpa-onnx-whisper-*\n          rm ./whisper-cxx-api\n\n      - name: Test SenseVoice\n        shell: bash\n        run: |\n          g++ -std=c++17 -o sense-voice-cxx-api 
./cxx-api-examples/sense-voice-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh sense-voice-cxx-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./sense-voice-cxx-api\n            echo \"----\"\n            readelf -d ./sense-voice-cxx-api\n          fi\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n          rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n          ls -lh sherpa-onnx-sense-voice-*\n          echo \"---\"\n          ls -lh sherpa-onnx-sense-voice-*/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./sense-voice-cxx-api\n\n          rm -rf sherpa-onnx-sense-voice-*\n          rm ./sense-voice-cxx-api\n\n      - name: Test streaming zipformer\n        shell: bash\n        run: |\n          g++ -std=c++17 -o streaming-zipformer-cxx-api ./cxx-api-examples/streaming-zipformer-cxx-api.cc \\\n            -I ./build/install/include \\\n            -L ./build/install/lib/ \\\n            -l sherpa-onnx-cxx-api \\\n            -l sherpa-onnx-c-api \\\n            -l onnxruntime\n\n          ls -lh streaming-zipformer-cxx-api\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            ldd ./streaming-zipformer-cxx-api\n            echo \"----\"\n            readelf -d ./streaming-zipformer-cxx-api\n          fi\n\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n          ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n          echo \"---\"\n          ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs\n\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/build/install/lib:$DYLD_LIBRARY_PATH\n\n          ./streaming-zipformer-cxx-api\n\n          rm -rf sherpa-onnx-streaming-zipformer-*\n          rm ./streaming-zipformer-cxx-api\n"
  },
  {
    "path": ".github/workflows/dot-net.yaml",
    "content": "name: release-nuget-package\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: release-nuget-package\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  build-libs:\n    name: ${{ matrix.os }} ${{ matrix.arch }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        arch: [x64, x86, arm64]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          arch=${{ matrix.arch }}\n          opts=\"\"\n          if [ $arch == x86 ]; then\n            opts=\"-A Win32\"\n          elif [ $arch == arm64 ]; then\n            opts=\"-A ARM64\"\n          fi\n\n          mkdir build\n          cd build\n          cmake \\\n            $opts \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF  \\\n            -DSHERPA_ONNX_ENABLE_BINARY=ON \\\n            ..\n\n          cmake --build . 
--target install --config Release\n          rm -rf install/pkgconfig\n\n      - name: Create tar file\n        shell: bash\n        run: |\n          arch=${{ matrix.arch }}\n\n          cd build\n\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ../CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-$SHERPA_ONNX_VERSION-win-$arch\n          mv install/lib $dst\n          tar cjvf $dst.tar.bz2 $dst\n          ls -lh *.tar.bz2\n          mv *.tar.bz2 ../\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: windows-${{ matrix.arch }}\n          path: ./*.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            export GIT_LFS_SKIP_SMUDGE=1\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            git fetch\n            git pull\n            dst=windows-for-dotnet/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"add more files\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n            rm -rf huggingface\n\n  release-nuget-package:\n    runs-on: ${{ matrix.os }}\n    needs: [build-libs]\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Setup .NET\n        uses: actions/setup-dotnet@v4\n        with:\n          dotnet-version: 8.0.x\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip Jinja2\n\n      - name: Retrieve artifact from windows x64\n        uses: actions/download-artifact@v4\n        with:\n          name: windows-x64\n          path: /tmp/windows-x64\n\n      - name: Retrieve artifact from windows x86\n        uses: actions/download-artifact@v4\n        with:\n          name: windows-x86\n          path: /tmp/windows-x86\n\n      - name: Retrieve artifact from windows arm64\n        uses: actions/download-artifact@v4\n        with:\n          name: windows-arm64\n          path: /tmp/windows-arm64\n\n      - name: Check dotnet\n        run: dotnet --info\n\n      - name: Inspect artifacts and remove .lib files\n        shell: bash\n        run: |\n          sudo apt-get install -y tree\n          ls -lh /tmp/\n\n          tree /tmp/windows*\n          echo \"----\"\n\n          rm -fv /tmp/windows*/*.lib\n          tree /tmp/windows*\n\n      - name: Build\n        shell: bash\n        run: |\n          cd scripts/dotnet\n          ./run.sh\n\n          ls -lh /tmp/packages\n\n      - name: Publish .NET packages to nuget.org\n        if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n        shell: bash\n        env:\n          API_KEY: ${{ secrets.NUGET_API_KEY }}\n        run: |\n          # API_KEY is valid until 2025.04.26\n          cd /tmp/packages\n          dotnet nuget push ./org.k2fsa.sherpa.onnx.*.nupkg --skip-duplicate 
--api-key $API_KEY --source https://api.nuget.org/v3/index.json\n"
  },
  {
    "path": ".github/workflows/export-3dspeaker-to-onnx.yaml",
    "content": "name: export-3dspeaker-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-3dspeaker-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-3dspeaker-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export 3d-speaker to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/3dspeaker\n          ./run.sh\n\n          mv -v *.onnx ../..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speaker-recongition-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=speaker-embedding-models\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            mv -v ./*.onnx ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git status\n            git add .\n            git status\n            git commit -m \"add 
models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n"
  },
  {
    "path": ".github/workflows/export-ced-to-onnx.yaml",
    "content": "name: export-ced-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-ced-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-ced-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export ced\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/ced\n          ./run.sh\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: audio-tagging-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              tiny\n              mini\n              small\n              base\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              d=sherpa-onnx-ced-$m-audio-tagging-2024-04-19\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/$d huggingface\n              mv -v $d/* huggingface\n              cd huggingface\n        
      git lfs track \"*.onnx\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/$d main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-dophin-ctc-to-onnx.yaml",
    "content": "name: export-dolphin-ctc-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-dolphin-ctc-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-dolphin-ctc-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.model_type }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        model_type: [small, base]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Download ${{ matrix.model_type }}\n        shell: bash\n        run: |\n          git lfs install\n          type=${{ matrix.model_type }}\n\n          git clone https://huggingface.co/csukuangfj/sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02\n          git clone https://huggingface.co/csukuangfj/sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02\n\n          rm -rf sherpa-onnx-dolphin-*/.git*\n\n          ls -lha sherpa-onnx-dolphin-*/\n\n          tar cjfv sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02.tar.bz2 sherpa-onnx-dolphin-$type-ctc-multi-lang-int8-2025-04-02\n          tar cjfv sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02.tar.bz2 sherpa-onnx-dolphin-$type-ctc-multi-lang-2025-04-02\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-fire-red-asr.yaml",
    "content": "name: export-fire-red-asr-to-onnx\n\non:\n  push:\n    branches:\n      - export-fire-red-asr\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-fire-red-asr-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-fire-red-asr-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export FireRedAsr\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1\n\n      - name: Download exported ONNX model from ModelScope\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        shell: bash\n        run: |\n          git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.git ms\n          ls -lh ms\n\n\n      - name: Collect results\n        shell: bash\n        run: |\n          src=ms\n          d=sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n          mkdir $d\n\n          mv -v $src/*.onnx $d\n          cp -v $src/README.md $d\n          cp -v $src/tokens.txt $d\n          cp -av $src/test_wavs $d\n\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n          rm -rf ms\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global 
user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            src=sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            cp -av ../$src/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src main || true\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-gtcrn.yaml",
    "content": "name: export-gtcrn-to-onnx\n\non:\n  push:\n    branches:\n      - export-gtcrn\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-gtcrn-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-gtcrn-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export gtcrn\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1 librosa soundfile torch==2.6.0+cpu -f https://download.pytorch.org/whl/torch \"kaldi-native-fbank>=1.21.1\"\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/gtcrn\n          ./run.sh\n          ./test.py\n          ls -lh\n\n      - name: Collect results\n        shell: bash\n        run: |\n          src=scripts/gtcrn\n          cp -v $src/*.onnx ./\n          ls -lh *.onnx\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/speech-enhancement-models huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n        
    cp -v ../gtcrn_simple.onnx ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/speech-enhancement-models main || true\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speech-enhancement-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          tag: speech-enhancement-models\n"
  },
  {
    "path": ".github/workflows/export-kitten.yaml",
    "content": "name: export-kitten-to-onnx\n\non:\n  push:\n    branches:\n      - kitten-0.2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-kitten-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-kitten-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export kitten ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        version: [\"nano_v0_1\", \"nano_v0_2\", \"mini_v0_1\"]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1 librosa soundfile piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\n\n      - name: Run\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          cd scripts/kitten-tts/${{ matrix.version }}\n          ./run.sh\n\n      - name: Collect results\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n\n          version=${{ matrix.version }}\n\n          src=scripts/kitten-tts/$version\n\n          if [[ $version == \"nano_v0_1\" ]]; then\n            d=kitten-nano-en-v0_1-fp16\n          elif [[ $version == \"nano_v0_2\" ]]; then\n            d=kitten-nano-en-v0_2-fp16\n          elif [[ $version == \"mini_v0_1\" ]]; then\n            d=kitten-mini-en-v0_1-fp16\n          else\n            echo \"version $version\"\n            exit 1\n          fi\n\n          mkdir $d\n   
       cp -a LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/model.fp16.onnx $d/model.fp16.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/../README.md $d/README.md\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            dirs=(\n              kitten-nano-en-v0_1-fp16\n              kitten-nano-en-v0_2-fp16\n              kitten-mini-en-v0_1-fp16\n            )\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            for d in ${dirs[@]}; do\n              echo \"d $d\"\n              if [[ ! 
-d $d ]]; then\n                echo \"$d does not exist\"\n                continue\n              fi\n\n              echo \"$d exists\"\n              rm -rf huggingface\n\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cd huggingface\n              rm -rf ./*\n\n              git lfs track \"*.onnx\"\n              git lfs track af_dict\n              git lfs track ar_dict\n              git lfs track cmn_dict\n              git lfs track da_dict en_dict fa_dict hu_dict ia_dict it_dict lb_dict phondata ru_dict ta_dict\n              git lfs track ur_dict yue_dict\n\n              cp -a ../$d/* ./\n\n              git add .\n\n              ls -lh\n\n              git status\n\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main || true\n\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-kokoro.yaml",
    "content": "name: export-kokoro-to-onnx\n\non:\n  push:\n    branches:\n      - refactor-kokoro-2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-kokoro-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-kokoro-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export kokoro ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        version: [\"0.19\", \"1.0\", \"1.1-zh\"]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install kokoro \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1 librosa soundfile piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html misaki[en] misaki[zh] torch==2.6.0+cpu -f https://download.pytorch.org/whl/torch sherpa-onnx\n\n      - name: Run\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n          cp -a ./espeak-ng-data ./scripts/kokoro/v0.19\n          cp -a ./espeak-ng-data ./scripts/kokoro/v1.0\n          cp -a ./espeak-ng-data ./scripts/kokoro/v1.1-zh\n\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          cd scripts/kokoro\n          v=${{ matrix.version }}\n          if [[ $v = \"0.19\" ]]; then\n            cd v0.19\n            ./run.sh\n\n            if false; then\n              # generate samples\n              git 
clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n              mkdir -p hf/kokoro/v0.19/mp3\n              ./generate_samples.py\n              pushd hf\n              git pull\n              git add .\n              git commit -m 'add kokoro samples for v0.19'\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n              popd\n              rm -rf hf\n            fi\n\n          elif [[ $v == \"1.0\" ]]; then\n            cd v1.0\n            ./run.sh\n\n            if false; then\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n              mkdir -p hf/kokoro/v1.0/mp3\n\n              curl -SL -O https://github.com/csukuangfj/cppjieba/releases/download/sherpa-onnx-2024-04-19/dict.tar.bz2\n              tar xvf dict.tar.bz2\n              rm dict.tar.bz2\n\n              curl -SL -o date-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n              curl -SL -o number-zh.fst  https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n              curl -SL -o phone-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n\n              ./generate_samples.py\n              pushd hf\n              git pull\n              git add .\n              git commit -m 'add kokoro samples for v1.0'\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n              popd\n              rm -rf hf\n            fi\n\n          elif [[ $v == \"1.1-zh\" ]]; then\n            cd v1.1-zh\n            ./run.sh\n\n            if false; then\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n              mkdir -p hf/kokoro/v1.1-zh/mp3\n\n              curl 
-SL -O https://github.com/csukuangfj/cppjieba/releases/download/sherpa-onnx-2024-04-19/dict.tar.bz2\n              tar xvf dict.tar.bz2\n              rm dict.tar.bz2\n\n              curl -SL -o date-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n              curl -SL -o number-zh.fst  https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n              curl -SL -o phone-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n\n              ./generate_samples.py\n              pushd hf\n              git pull\n              git add .\n              git commit -m 'add kokoro samples for v1.1-zh'\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n              popd\n              rm -rf hf\n            fi\n          else\n            echo \"Unknown version $v\"\n            exit 1\n          fi\n\n      - name: Collect results 0.19\n        if: matrix.version == '0.19'\n        shell: bash\n        run: |\n          src=scripts/kokoro/v0.19\n\n          d=kokoro-en-v0_19\n\n          mkdir $d\n          cp -a LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/model.onnx $d/model.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/../README.md $d/README.md\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n      - name: Collect results 0.19 (int8)\n        if: matrix.version == '0.19'\n        shell: bash\n        run: |\n          src=scripts/kokoro/v0.19\n\n          d=kokoro-int8-en-v0_19\n\n          mkdir $d\n          cp -a LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/model.int8.onnx $d/model.int8.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v 
$src/../README.md $d/README.md\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n      - name: Collect results 1.0\n        if: matrix.version == '1.0'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/csukuangfj/cppjieba/releases/download/sherpa-onnx-2024-04-19/dict.tar.bz2\n          tar xvf dict.tar.bz2\n          rm dict.tar.bz2\n\n          curl -SL -o date-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n          curl -SL -o number-zh.fst  https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n          curl -SL -o phone-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n\n          src=scripts/kokoro/v1.0\n\n          d=kokoro-multi-lang-v1_0\n          mkdir $d\n          cp -v LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/kokoro.onnx $d/model.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/lexicon*.txt $d/\n          cp -v $src/README.md $d/README.md\n          cp -av dict $d/\n          cp -v ./*.fst $d/\n          ls -lh $d/\n          echo \"---\"\n          ls -lh $d/dict\n\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n\n          ls -lh $d.tar.bz2\n\n          d=kokoro-int8-multi-lang-v1_0\n          mkdir $d\n          cp -v LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/kokoro.int8.onnx $d/model.int8.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/lexicon*.txt $d/\n          cp -v $src/README.md $d/README.md\n          cp -av dict $d/\n          cp -v ./*.fst $d/\n          ls -lh $d/\n          echo \"---\"\n          ls -lh $d/dict\n\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n\n          ls -lh $d.tar.bz2\n\n      - name: 
Collect results 1.1-zh\n        if: matrix.version == '1.1-zh'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/csukuangfj/cppjieba/releases/download/sherpa-onnx-2024-04-19/dict.tar.bz2\n          tar xvf dict.tar.bz2\n          rm dict.tar.bz2\n\n          curl -SL -o date-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n          curl -SL -o number-zh.fst  https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n          curl -SL -o phone-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n\n          src=scripts/kokoro/v1.1-zh\n\n          d=kokoro-multi-lang-v1_1\n          mkdir $d\n          cp -v LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/kokoro.onnx $d/model.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/lexicon*.txt $d/\n          cp -v $src/README.md $d/README.md\n          cp -av dict $d/\n          cp -v ./*.fst $d/\n          ls -lh $d/\n          echo \"---\"\n          ls -lh $d/dict\n\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n          ls -lh $d.tar.bz2\n\n          d=kokoro-int8-multi-lang-v1_1\n          mkdir $d\n          cp -v LICENSE $d/LICENSE\n          cp -a espeak-ng-data $d/\n          cp -v $src/kokoro.int8.onnx $d/model.int8.onnx\n          cp -v $src/voices.bin $d/\n          cp -v $src/tokens.txt $d/\n          cp -v $src/lexicon*.txt $d/\n          cp -v $src/README.md $d/README.md\n          cp -av dict $d/\n          cp -v ./*.fst $d/\n          ls -lh $d/\n          echo \"---\"\n          ls -lh $d/dict\n\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n          ls -lh $d.tar.bz2\n\n          echo \"---\"\n          ls -lh *.tar.bz2\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: 
svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n\n      - name: Publish to huggingface 0.19\n        if: matrix.version == '0.19'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            dirs=(\n              kokoro-en-v0_19\n              # kokoro-int8-en-v0_19\n            )\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            for d in ${dirs[@]}; do\n              rm -rf huggingface\n\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-en-v0_19 huggingface\n              cd huggingface\n              rm -rf ./*\n\n              git lfs track \"*.onnx\"\n              git lfs track af_dict\n              git lfs track ar_dict\n              git lfs track cmn_dict\n              git lfs track da_dict en_dict fa_dict hu_dict ia_dict it_dict lb_dict phondata ru_dict ta_dict\n              git lfs track ur_dict yue_dict\n\n\n              cp -a ../$d/* ./\n\n              git add .\n\n              ls -lh\n\n              git status\n\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-en-v0_19 main || true\n            done\n\n      - name: 
Publish to huggingface 1.0 float32\n        if: matrix.version == '1.0'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-multi-lang-v1_0 huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\"\n            git lfs track \"*.wav\"\n            git lfs track \"lexicon*.txt\"\n\n            cp -a ../espeak-ng-data ./\n\n            cp -v ../scripts/kokoro/v1.0/kokoro.onnx ./model.onnx\n\n\n            cp -v ../scripts/kokoro/v1.0/tokens.txt .\n            cp -v ../scripts/kokoro/v1.0/voices.bin .\n            cp -v ../scripts/kokoro/v1.0/lexicon*.txt .\n            cp -v ../scripts/kokoro/v1.0/README.md ./README.md\n            cp -v ../LICENSE ./\n            cp -av ../dict ./\n            cp -v ../*.fst ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-multi-lang-v1_0 main || true\n\n      - name: Publish to huggingface 1.0 int8\n        if: matrix.version == '1.0'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email 
\"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-int8-multi-lang-v1_0 huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\"\n            git lfs track \"af_dict\"\n            git lfs track \"ar_dict\"\n            git lfs track \"da_dict\"\n            git lfs track \"en_dict\"\n            git lfs track \"fa_dict\"\n            git lfs track \"hu_dict\"\n            git lfs track \"ia_dict\"\n            git lfs track \"it_dict\"\n            git lfs track \"lb_dict\"\n            git lfs track \"phondata\"\n            git lfs track \"ta_dict\"\n            git lfs track \"ur_dict\"\n            git lfs track \"yue_dict\"\n            git lfs track \"*.wav\"\n            git lfs track \"lexicon*.txt\"\n\n            cp -a ../espeak-ng-data ./\n\n            cp -v ../scripts/kokoro/v1.0/kokoro.int8.onnx ./model.int8.onnx\n\n            cp -v ../scripts/kokoro/v1.0/tokens.txt .\n            cp -v ../scripts/kokoro/v1.0/voices.bin .\n            cp -v ../scripts/kokoro/v1.0/lexicon*.txt .\n            cp -v ../scripts/kokoro/v1.0/README.md ./README.md\n            cp -v ../LICENSE ./\n            cp -av ../dict ./\n            cp -v ../*.fst ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-int8-multi-lang-v1_0 main || true\n\n      - name: Publish to huggingface 1.1-zh\n        if: matrix.version == '1.1-zh'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: 
nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-multi-lang-v1_1 huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\"\n            git lfs track \"*.wav\"\n            git lfs track \"lexicon*.txt\"\n\n            cp -a ../espeak-ng-data ./\n\n            cp -v ../scripts/kokoro/v1.1-zh/kokoro.onnx ./model.onnx\n\n            cp -v ../scripts/kokoro/v1.1-zh/tokens.txt .\n            cp -v ../scripts/kokoro/v1.1-zh/voices.bin .\n            cp -v ../scripts/kokoro/v1.1-zh/lexicon*.txt .\n            cp -v ../scripts/kokoro/v1.1-zh/README.md ./README.md\n            cp -v ../LICENSE ./\n            cp -av ../dict ./\n            cp -v ../*.fst ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-multi-lang-v1_1 main || true\n\n      - name: Publish to huggingface 1.1-zh-int8\n        if: matrix.version == '1.1-zh'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export 
GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-int8-multi-lang-v1_1 huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\"\n            git lfs track \"*.wav\"\n            git lfs track \"lexicon*.txt\"\n\n            cp -a ../espeak-ng-data ./\n\n            cp -v ../scripts/kokoro/v1.1-zh/kokoro.int8.onnx ./model.int8.onnx\n\n            cp -v ../scripts/kokoro/v1.1-zh/tokens.txt .\n            cp -v ../scripts/kokoro/v1.1-zh/voices.bin .\n            cp -v ../scripts/kokoro/v1.1-zh/lexicon*.txt .\n            cp -v ../scripts/kokoro/v1.1-zh/README.md ./README.md\n            cp -v ../LICENSE ./\n            cp -av ../dict ./\n            cp -v ../*.fst ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/kokoro-int8-multi-lang-v1_1 main || true\n"
  },
  {
    "path": ".github/workflows/export-libriheavy.yaml",
    "content": "name: export-libriheavy-to-onnx\n\non:\n  push:\n    branches:\n      - libriheavy-model\n  workflow_dispatch:\n\nconcurrency:\n  group: export-libriheavy-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-libriheavy-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export libriheavy\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/icefall\n          ./run-libriheavy.sh\n          ./run-libriheavy-punct-case.sh\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            for m in large medium small; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              src=sherpa-onnx-zipformer-en-libriheavy-20230926-$m\n              echo \"Process $src\"\n\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src huggingface\n              cd huggingface\n              git fetch\n              git pull\n\n              cp -av ../scripts/icefall/$src/* .\n\n              git lfs track \"*.onnx\"\n              git add .\n\n              git commit -m \"add $m\"\n              git push 
https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src main || true\n\n              cd ..\n\n              rm -rf huggingface/.git*\n\n              mv huggingface $src\n\n              tar cjvf $src.tar.bz2 $src\n              rm -rf $src\n              ls -lh\n            done\n\n      - name: Publish to huggingface (case and punct)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            for m in large medium small; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              src=sherpa-onnx-zipformer-en-libriheavy-20230830-$m-punct-case\n              echo \"Process $src\"\n\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src huggingface\n              cd huggingface\n              git fetch\n              git pull\n\n              cp -av ../scripts/icefall/$src/* .\n\n              git lfs track \"*.onnx\"\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src main || true\n\n              cd ..\n\n              rm -rf huggingface/.git*\n\n              mv huggingface $src\n\n              tar cjvf $src.tar.bz2 $src\n              rm -rf $src\n              ls -lh\n            done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n"
  },
  {
    "path": ".github/workflows/export-matcha-fa-en.yaml",
    "content": "name: export-matcha-fa-en-to-onnx\n\non:\n  push:\n    branches:\n      - tts-matcha-samples\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-matcha-fa-en-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-matcha-fa-en-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export matcha fa-en ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1 soundfile piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html sherpa-onnx\n\n      - name: Run\n        if: false\n        shell: bash\n        run: |\n          cd scripts/matcha-tts/fa-en\n          ./run.sh\n\n      - name: Generate samples\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          cd scripts/matcha-tts/zh\n\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n          tar xvf matcha-icefall-zh-baker.tar.bz2\n          rm matcha-icefall-zh-baker.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n          mkdir -p ./hf/matcha/icefall-zh/mp3\n\n          ./generate_samples.py\n\n       
   pushd hf\n          git pull\n          git add .\n          git commit -m 'add kokoro samples for matcha tts zh'\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n          popd\n          rm -rf hf\n\n          ls -lh\n\n      - name: Collect results ${{ matrix.version }}\n        if: false\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n\n          src=scripts/matcha-tts/fa-en\n          dst1=matcha-tts-fa_en-musa # male\n          dst2=matcha-tts-fa_en-khadijah # female\n\n          mkdir $dst1 $dst2\n\n          cp -a espeak-ng-data $dst1/\n          cp -a espeak-ng-data $dst2/\n\n          cp -v $src/male/* $dst1\n          cp -v $src/female/* $dst2\n\n          cp -v $src/README.md $dst1/\n          cp -v $src/README.md $dst2/\n\n          ls -lh $dst1/\n          echo \"---\"\n          ls -lh $dst2/\n          tar cjfv $dst1.tar.bz2 $dst1\n          tar cjfv $dst2.tar.bz2 $dst2\n\n          ls -lh $dst1.tar.bz2\n          ls -lh $dst2.tar.bz2\n\n      - name: Publish to huggingface male (musa)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-tts-fa_en-musa huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track 
\"cmn_dict\"\n            git lfs track \"ru_dict\"\n\n            cp -a ../matcha-tts-fa_en-musa/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-tts-fa_en-musa main || true\n\n      - name: Publish to huggingface female (khadijah)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-tts-fa_en-khadijah huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\"\n\n            cp -a ../matcha-tts-fa_en-khadijah/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-tts-fa_en-khadijah main || true\n\n      - name: Release\n        # if: github.repository_owner == 'csukuangfj'\n        if: false\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        # if: github.repository_owner == 
'k2-fsa'\n        if: false\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n"
  },
  {
    "path": ".github/workflows/export-matcha-zh-en.yaml",
    "content": "name: export-matcha-zh-en-to-onnx\n\non:\n  push:\n    branches:\n      - matcha-zh-en\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-matcha-zh-en-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-matcha-zh-en-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export matcha zh-en ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" pypinyin soundfile \\\n            sherpa-onnx -f https://k2-fsa.github.io/sherpa/onnx/cpu.html\n\n      - name: Generate samples\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          cd scripts/matcha-tts/zh-en\n\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-en.tar.bz2\n          tar xvf matcha-icefall-zh-en.tar.bz2\n          rm matcha-icefall-zh-en.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-16khz-univ.onnx\n\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n          mkdir -p ./hf/matcha/icefall-zh-en/mp3\n\n          ./generate_samples.py\n\n          pushd hf\n          git pull\n          git add .\n          git commit -m 'add samples for matcha tts zh en'\n          git push 
https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n          popd\n          rm -rf hf\n\n          ls -lh\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/matcha-tts/zh-en\n          curl -SL -O https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010/resolve/master/model-steps-3.onnx\n          curl -SL -O https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010/resolve/master/vocab_tts.txt\n\n          ./generate_tokens.py\n          ./generate_lexicon.py\n\n          curl -SL -o date-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n          curl -SL -o number-zh.fst  https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n          curl -SL -o phone-zh.fst https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n\n      - name: Collect results ${{ matrix.version }}\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n\n          src=scripts/matcha-tts/zh-en\n          dst=matcha-icefall-zh-en\n\n          mkdir $dst\n\n          cp -a espeak-ng-data $dst/\n\n          cp -v $src/tokens.txt $dst\n          cp -v $src/lexicon.txt $dst\n          cp -v $src/model-steps-3.onnx $dst\n          cp -v $src/README.md $dst\n          cp -v $src/*.fst $dst\n\n          tar cjfv $dst.tar.bz2 $dst\n\n          ls -lh $dst.tar.bz2\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git 
config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-icefall-zh-en huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            git lfs track \"cmn_dict\"\n            git lfs track \"ru_dict\" af_dict ar_dict da_dict en_dict fa_dict hu_dict ia_dict it_dict lb_dict phondata ta_dict ur_dict yue_dict\n\n            cp -a ../matcha-icefall-zh-en/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/matcha-icefall-zh-en main || true\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n"
  },
  {
    "path": ".github/workflows/export-medasr-ctc-to-onnx.yaml",
    "content": "name: export-medasr-ctc-to-onnx\n\non:\n  push:\n    branches:\n      - cpp-medasr-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-medasr-ctc-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-medasr-ctc-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export medasr ctc\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          cd scripts/medasr\n          ./run.sh\n\n      - name: Download test data\n        shell: bash\n        run: |\n          cd scripts/medasr\n\n          for i in $(seq 0 5); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-medasr-ctc-en-int8-2025-12-25/resolve/main/test_wavs/$i.wav\n          done\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-medasr-ctc-en-int8-2025-12-25/resolve/main/test_wavs/transcript.txt\n\n          ls -lh\n\n      - name: Test fp32\n        shell: bash\n        run: |\n          cd scripts/medasr\n\n          for i in $(seq 0 5); do\n            python3 test_onnx.py --model ./model.onnx --tokens ./tokens.txt --wav ./$i.wav\n          done\n\n          cat transcript.txt\n\n      - name: Test int8\n        shell: bash\n        run: |\n          cd scripts/medasr\n\n          for i in $(seq 0 5); do\n            python3 test_onnx.py --model ./model.int8.onnx --tokens ./tokens.txt --wav ./$i.wav\n          done\n\n          cat transcript.txt\n\n      - name: Collect fp32 files\n        shell: bash\n        run: |\n          cd 
scripts/medasr\n\n          d=sherpa-onnx-medasr-ctc-en-2025-12-25\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v model.onnx $d\n          cp -v README.md $d\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs/\n          cp -v transcript.txt $d/test_wavs/\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh $d\n          ls -lh *.tar.bz2\n\n          mv $d ../..\n          mv $d.tar.bz2 ../..\n\n      - name: Collect int8 files\n        shell: bash\n        run: |\n          cd scripts/medasr\n\n          d=sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v model.int8.onnx $d\n          cp -v README.md $d\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs/\n          cp -v transcript.txt $d/test_wavs/\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh $d\n          ls -lh *.tar.bz2\n\n          mv $d ../..\n          mv $d.tar.bz2 ../..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            names=(\n             sherpa-onnx-medasr-ctc-en-2025-12-25\n             sherpa-onnx-medasr-ctc-en-int8-2025-12-25\n            )\n            for d in ${names[@]}; do\n              if [ ! 
-d $d ]; then\n                echo \"$d does not exist - skip it\"\n                continue;\n              fi\n\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -av $d/* ./huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-melo-tts-to-onnx.yaml",
    "content": "name: export-melo-tts-to-onnx\n\non:\n  push:\n    branches:\n      - export-melo-tts-onnx\n  workflow_dispatch:\n\nconcurrency:\n  group: export-melo-tts-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-melo-tts-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export melo-tts\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/melo-tts\n          ./run.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: test.wav\n          path: scripts/melo-tts/test.wav\n\n      - name: Publish to huggingface (Chinese + English)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/vits-melo-tts-zh_en huggingface\n            cd huggingface\n            git fetch\n            git pull\n            echo \"pwd: $PWD\"\n            ls -lh ../scripts/melo-tts/zh_en\n\n            rm -rf ./\n\n            cp -v ../scripts/melo-tts/zh_en/*.onnx .\n            cp -v ../scripts/melo-tts/zh_en/lexicon.txt .\n            cp -v ../scripts/melo-tts/zh_en/tokens.txt .\n            cp -v 
../scripts/melo-tts/zh_en/README.md .\n\n            curl -SL -O https://raw.githubusercontent.com/myshell-ai/MeloTTS/main/LICENSE\n\n            curl -SL -O https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/new_heteronym.fst\n            curl -SL -O https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/date.fst\n            curl -SL -O https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/number.fst\n            curl -SL -O https://huggingface.co/csukuangfj/icefall-tts-aishell3-vits-low-2024-04-06/resolve/main/data/phone.fst\n            curl -SL -O https://github.com/csukuangfj/cppjieba/releases/download/sherpa-onnx-2024-04-19/dict.tar.bz2\n            tar xvf dict.tar.bz2\n            rm dict.tar.bz2\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git diff\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/vits-melo-tts-zh_en main || true\n\n            cd ..\n\n            rm -rf huggingface/.git*\n            dst=vits-melo-tts-zh_en\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (English)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/vits-melo-tts-en huggingface\n            cd huggingface\n            
git fetch\n            git pull\n            echo \"pwd: $PWD\"\n            ls -lh ../scripts/melo-tts/en\n\n            rm -rf ./\n\n            cp -v ../scripts/melo-tts/en/*.onnx .\n            cp -v ../scripts/melo-tts/en/lexicon.txt .\n            cp -v ../scripts/melo-tts/en/tokens.txt .\n            cp -v ../scripts/melo-tts/en/README.md .\n\n            curl -SL -O https://raw.githubusercontent.com/myshell-ai/MeloTTS/main/LICENSE\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git diff\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/vits-melo-tts-en main || true\n\n            cd ..\n\n            rm -rf huggingface/.git*\n            dst=vits-melo-tts-en\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n"
  },
  {
    "path": ".github/workflows/export-moonshine-to-onnx.yaml",
    "content": "name: export-moonshine-to-onnx\n\non:\n  push:\n    branches:\n      - jni-moonshine-v2-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-moonshine-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-moonshine-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export moonshine ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n        version: [v2]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install -q onnx onnxruntime librosa tokenizers soundfile moonshine-voice\n\n      - name: Run v1\n        if: matrix.version == 'v1'\n        shell: bash\n        run: |\n          pushd scripts/moonshine\n          ./run.sh\n          popd\n\n          mv -v scripts/moonshine/*.tar.bz2 .\n          mv -v scripts/moonshine/sherpa-onnx-* ./\n\n      - name: Run v2\n        if: matrix.version == 'v2'\n        shell: bash\n        run: |\n          pushd scripts/moonshine/v2\n          ./run.sh\n          popd\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git 
config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            models=(\n              sherpa-onnx-moonshine-tiny-en-int8\n              sherpa-onnx-moonshine-base-en-int8\n              sherpa-onnx-moonshine-base-ar-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-en-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-es-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-ja-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-uk-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-vi-quantized-2026-02-27\n              sherpa-onnx-moonshine-base-zh-quantized-2026-02-27\n              sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\n              sherpa-onnx-moonshine-tiny-ja-quantized-2026-02-27\n              sherpa-onnx-moonshine-tiny-ko-quantized-2026-02-27\n            )\n            for d in ${models[@]}; do\n              if [ ! -d $d ]; then\n                continue;\n              fi\n\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d huggingface\n\n              rm -rf huggingface/*.onnx\n              rm -rf huggingface/*/*.wav\n\n              cp -av $d/* huggingface\n\n              pushd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.ort\"\n              git lfs track \"*.data\"\n              git lfs track \"*.weights\"\n              git lfs track \"bpe.model\"\n              git lfs track \"*.wav\"\n              git lfs track \"*.json\"\n              git status\n              git add .\n\n              git commit -m \"add models\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d main\n\n              popd\n            done\n\n            rm -rf huggingface\n\n      - name: Publish to 
modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            rm -rf ms\n            git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n            cp -av *.tar.bz2 ms/\n\n            pushd ms\n            git lfs track \"*.tar.bz2\"\n            git status\n            ls -lh\n            git add .\n\n            git commit -m \"add models\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n            popd\n\n            rm -rf ms\n"
  },
  {
    "path": ".github/workflows/export-nemo-canary-180m-flash.yaml",
    "content": "name: export-nemo-canary-180m-flash\n\non:\n  push:\n    branches:\n      - export-nemo-canary\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-canary-180m-flash-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-canary-180m-flash:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: parakeet nemo canary 180m flash\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/canary\n          ./run_180m_flash.sh\n\n          ls -lh *.onnx\n          mv -v *.onnx ../../..\n          mv -v tokens.txt ../../..\n          mv de.wav ../../../\n          mv en.wav ../../../\n\n      - name: Collect files (fp32)\n        shell: bash\n        run: |\n          d=sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr\n          mkdir -p $d\n          cp encoder.onnx $d\n          cp decoder.onnx $d\n          cp tokens.txt $d\n\n          mkdir $d/test_wavs\n          cp de.wav $d/test_wavs\n          cp en.wav $d/test_wavs\n\n          tar cjfv $d.tar.bz2 $d\n\n      - name: Collect files (int8)\n        shell: bash\n        run: |\n          d=sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n          mkdir -p $d\n          cp encoder.int8.onnx $d\n          cp decoder.int8.onnx $d\n          cp tokens.txt $d\n\n          mkdir $d/test_wavs\n          cp de.wav $d/test_wavs\n          cp en.wav $d/test_wavs\n\n          tar cjfv $d.tar.bz2 $d\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          
max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr\n              sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-nemo-fast-conformer-hybrid-transducer-ctc-non-streaming.yaml",
    "content": "name: export-nemo-fast-conformer-ctc-non-streaming\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-fast-conformer-hybrid-transducer-ctc-non-streaming-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-fast-conformer-hybrid-transducer-ctc-non-streaming:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: Hybrid ctc non-streaming\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install NeMo\n        shell: bash\n        run: |\n          BRANCH='main'\n          pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n          pip install onnxruntime ipython\n          pip install kaldi-native-fbank\n          pip install soundfile librosa\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/fast-conformer-hybrid-transducer-ctc\n          ./run-ctc-non-streaming-2.sh\n          ./run-ctc-non-streaming.sh\n\n          mv -v sherpa-onnx-nemo* ../../..\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-fast-conformer-ctc-en-24500\n              sherpa-onnx-nemo-fast-conformer-ctc-es-1424\n              sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\n              
sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\n              sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000\n              sherpa-onnx-nemo-fast-conformer-ctc-en-24500-int8\n              sherpa-onnx-nemo-fast-conformer-ctc-es-1424-int8\n              sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288-int8\n              sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\n              sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\n              sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc\n              sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc-int8\n              sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc\n              sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\" \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n              rm -rf huggingface\n            done\n\n      - name: Compress files\n        shell: bash\n        run: |\n          dirs=(\n            sherpa-onnx-nemo-fast-conformer-ctc-en-24500\n            sherpa-onnx-nemo-fast-conformer-ctc-es-1424\n            sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\n            sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\n            sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000\n            sherpa-onnx-nemo-fast-conformer-ctc-en-24500-int8\n            
sherpa-onnx-nemo-fast-conformer-ctc-es-1424-int8\n            sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288-int8\n            sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\n            sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\n            sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc\n            sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc-int8\n            sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc\n            sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc-int8\n          )\n          for d in ${dirs[@]}; do\n            tar cjvf ${d}.tar.bz2 ./$d\n          done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n\n"
  },
  {
    "path": ".github/workflows/export-nemo-fast-conformer-hybrid-transducer-ctc.yaml",
    "content": "name: export-nemo-fast-conformer-ctc-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-fast-conformer-hybrid-transducer-ctc-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-fast-conformer-hybrid-transducer-ctc-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: Hybrid ctc streaming\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install NeMo\n        shell: bash\n        run: |\n          BRANCH='main'\n          pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n          pip install onnxruntime ipython\n          pip install kaldi-native-fbank\n          pip install soundfile librosa\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/fast-conformer-hybrid-transducer-ctc\n          ./run-ctc.sh\n\n          mv -v sherpa-onnx-nemo* ../../..\n\n      - name: Download test waves\n        shell: bash\n        run: |\n          mkdir test_wavs\n          pushd test_wavs\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/0.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/1.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/8k.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/trans.txt\n          popd\n\n          names=(\n            
sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms\n            sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms\n            sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms\n            sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms-int8\n            sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms-int8\n            sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms-int8\n          )\n          for d in ${names[@]}; do\n            cp -av test_wavs $d/\n            tar cjvf $d.tar.bz2 $d\n          done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms-int8\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms-int8\n              sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp 
-av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-nemo-fast-conformer-hybrid-transducer-transducer-non-streaming.yaml",
    "content": "name: export-nemo-fast-conformer-transducer-non-streaming\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-fast-conformer-hybrid-transducer-transducer-non-streaming-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-fast-conformer-hybrid-transducer-transducer-non-streaming:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: Hybrid transducer non-streaming\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install NeMo\n        shell: bash\n        run: |\n          BRANCH='main'\n          pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n          pip install onnxruntime ipython\n          pip install kaldi-native-fbank\n          pip install soundfile librosa\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/fast-conformer-hybrid-transducer-ctc\n          ./run-transducer-non-streaming-2.sh\n          ./run-transducer-non-streaming.sh\n\n          mv -v sherpa-onnx-nemo* ../../..\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-fast-conformer-transducer-en-24500\n              sherpa-onnx-nemo-fast-conformer-transducer-es-1424\n              
sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288\n              sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k\n              sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000\n              sherpa-onnx-nemo-fast-conformer-transducer-en-24500-int8\n              sherpa-onnx-nemo-fast-conformer-transducer-es-1424-int8\n              sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288-int8\n              sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\n              sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000-int8\n              sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc\n              sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc-int8\n              sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc\n              sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\" \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n\n      - name: Compress files\n        shell: bash\n        run: |\n          dirs=(\n            sherpa-onnx-nemo-fast-conformer-transducer-en-24500\n            sherpa-onnx-nemo-fast-conformer-transducer-es-1424\n            sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288\n            sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k\n            
sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000\n            sherpa-onnx-nemo-fast-conformer-transducer-en-24500-int8\n            sherpa-onnx-nemo-fast-conformer-transducer-es-1424-int8\n            sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288-int8\n            sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\n            sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000-int8\n            sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc\n            sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc-int8\n            sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc\n            sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc-int8\n          )\n          for d in ${dirs[@]}; do\n            tar cjvf ${d}.tar.bz2 ./$d\n          done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n\n"
  },
  {
    "path": ".github/workflows/export-nemo-fast-conformer-hybrid-transducer-transducer.yaml",
    "content": "name: export-nemo-fast-conformer-transducer-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-fast-conformer-hybrid-transducer-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-fast-conformer-hybrid-transducer-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: Hybrid transducer streaming\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install NeMo\n        shell: bash\n        run: |\n          BRANCH='main'\n          pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n          pip install onnxruntime ipython\n          pip install kaldi-native-fbank\n          pip install soundfile librosa\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/fast-conformer-hybrid-transducer-ctc\n          ./run-transducer.sh\n\n          mv -v sherpa-onnx-nemo* ../../..\n\n      - name: Download test waves\n        shell: bash\n        run: |\n          mkdir test_wavs\n          pushd test_wavs\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/0.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/1.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/8k.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/trans.txt\n          popd\n\n          models=(\n            
sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms\n            sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-480ms\n            sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-1040ms\n            sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms-int8\n            sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-480ms-int8\n            sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-1040ms-int8\n          )\n          for m in ${models[@]}; do\n            cp -av test_wavs $m\n            tar cjvf $m.tar.bz2 $m\n          done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-480ms\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-1040ms\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms-int8\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-480ms-int8\n              sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-1040ms-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone 
https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-nemo-giga-am-to-onnx.yaml",
    "content": "name: export-nemo-giga-am-to-onnx\n\non:\n  push:\n    branches:\n      - export-giga-am-v3\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-giga-am-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-am-giga-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export nemo GigaAM models to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run CTC\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-ctc.sh\n          popd\n\n          d=sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24\n          mkdir $d\n          mkdir $d/test_wavs\n          rm scripts/nemo/GigaAM/model.onnx\n          mv -v scripts/nemo/GigaAM/*.int8.onnx $d/\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          mv -v scripts/nemo/GigaAM/*.pdf $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-ctc.sh $d/\n          mv -v scripts/nemo/GigaAM/export-onnx-ctc.py $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-ctc.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Run Transducer\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-rnnt.sh\n          popd\n\n          d=sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24\n          mkdir $d\n          mkdir $d/test_wavs\n\n          mv -v scripts/nemo/GigaAM/encoder.int8.onnx $d/\n          mv -v 
scripts/nemo/GigaAM/decoder.onnx $d/\n          mv -v scripts/nemo/GigaAM/joiner.onnx $d/\n\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          mv -v scripts/nemo/GigaAM/*.pdf $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-rnnt.sh $d/\n          mv -v scripts/nemo/GigaAM/export-onnx-rnnt.py $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-rnnt.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Run CTC v2\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-ctc-v2.sh\n          popd\n\n          d=sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19\n          mkdir $d\n          mkdir $d/test_wavs\n          rm scripts/nemo/GigaAM/v2_ctc.onnx\n          mv -v scripts/nemo/GigaAM/*.int8.onnx $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-ctc-v2.sh $d/\n          mv -v scripts/nemo/GigaAM/*-ctc-v2.py $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-ctc.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Run Transducer v2\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-rnnt-v2.sh\n          popd\n\n          d=sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19\n          mkdir $d\n          mkdir $d/test_wavs\n\n          mv -v scripts/nemo/GigaAM/encoder.int8.onnx $d/\n          mv -v scripts/nemo/GigaAM/decoder.onnx $d/\n          mv -v scripts/nemo/GigaAM/joiner.onnx $d/\n\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv 
-v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-rnnt-v2.sh $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-rnnt.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Run CTC v3\n        if: true\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-ctc-v3.sh\n          popd\n\n          d=sherpa-onnx-nemo-ctc-giga-am-v3-russian-2025-12-16\n          mkdir $d\n          mkdir $d/test_wavs\n          ls -lh scripts/nemo/GigaAM/v3_ctc.onnx\n          rm scripts/nemo/GigaAM/v3_ctc.onnx\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          mv -v scripts/nemo/GigaAM/*.int8.onnx $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-ctc-v3.sh $d/\n          mv -v scripts/nemo/GigaAM/*-ctc-v3.py $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-ctc.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: Run CTC v3 with punctuations\n        if: true\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-ctc-v3-punct.sh\n          popd\n\n          d=sherpa-onnx-nemo-ctc-punct-giga-am-v3-russian-2025-12-16\n          mkdir $d\n          mkdir $d/test_wavs\n          rm scripts/nemo/GigaAM/v3_e2e_ctc.onnx\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          mv -v scripts/nemo/GigaAM/*.int8.onnx $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-ctc-v3-punct.sh $d/\n          mv -v scripts/nemo/GigaAM/*-ctc-v3-punct.py $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-ctc.py $d/\n\n          ls 
-lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: Run Transducer v3\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-rnnt-v3.sh\n          popd\n\n          d=sherpa-onnx-nemo-transducer-giga-am-v3-russian-2025-12-16\n          mkdir $d\n          mkdir $d/test_wavs\n\n          mv -v scripts/nemo/GigaAM/encoder.int8.onnx $d/\n          mv -v scripts/nemo/GigaAM/decoder.onnx $d/\n          mv -v scripts/nemo/GigaAM/joiner.onnx $d/\n\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-rnnt-v3.sh $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-rnnt.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Run Transducer v3 with punctuations\n        if: false\n        shell: bash\n        run: |\n          pushd scripts/nemo/GigaAM\n          ./run-rnnt-v3-punct.sh\n          popd\n\n          d=sherpa-onnx-nemo-transducer-punct-giga-am-v3-russian-2025-12-16\n          mkdir $d\n          mkdir $d/test_wavs\n\n          mv -v scripts/nemo/GigaAM/encoder.int8.onnx $d/\n          mv -v scripts/nemo/GigaAM/decoder.onnx $d/\n          mv -v scripts/nemo/GigaAM/joiner.onnx $d/\n\n          cp -v scripts/nemo/GigaAM/*.md $d/\n          cp -v scripts/nemo/GigaAM/LICENSE $d/\n          mv -v scripts/nemo/GigaAM/tokens.txt $d/\n          mv -v scripts/nemo/GigaAM/*.wav $d/test_wavs/\n          mv -v scripts/nemo/GigaAM/run-rnnt-v3-punct.sh $d/\n          cp -v scripts/nemo/GigaAM/test-onnx-rnnt.py $d/\n\n          ls -lh scripts/nemo/GigaAM/\n\n          ls -lh $d\n\n          tar cjvf ${d}.tar.bz2 $d\n\n      - name: Release\n        if: github.repository_owner == 
'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models\n\n      - name: Publish to huggingface (CTC)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24/\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            rm -rf huggingface\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -av $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git lfs track \"*.wav\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n\n      - name: Publish to huggingface (Transducer)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun 
Kuang\"\n\n            d=sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24/\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            rm -rf huggingface\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -av $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git lfs track \"*.wav\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n\n      - name: Publish v2 to huggingface (CTC)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19/\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            rm -rf huggingface\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -av $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git lfs track \"*.wav\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n\n      - name: Publish v2 to huggingface (Transducer)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config 
--global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19/\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            rm -rf huggingface\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -av $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git lfs track \"*.wav\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n\n      - name: Publish v3 to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            names=(\n             sherpa-onnx-nemo-ctc-giga-am-v3-russian-2025-12-16\n             sherpa-onnx-nemo-ctc-punct-giga-am-v3-russian-2025-12-16\n             sherpa-onnx-nemo-transducer-giga-am-v3-russian-2025-12-16\n             sherpa-onnx-nemo-transducer-punct-giga-am-v3-russian-2025-12-16\n            )\n            for d in ${names[@]}; do\n              if [ ! 
-d $d ]; then\n                echo \"$d does not exist - skip it\"\n                continue;\n              fi\n\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -av $d/* ./huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-nemo-parakeet-tdt-0.6b-v2.yaml",
    "content": "name: export-nemo-parakeet-tdt-0.6b\n\non:\n  push:\n    branches:\n      - export-nemo-parakeet-tdt-0.6b-v2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-parakeet-tdt-0.6b-v2-${{ github.ref }}\n  cancel-in-progress: true\n\nenv:\n  HF_HUB_ENABLE_HF_TRANSFER: \"0\"\n\njobs:\n  export-nemo-parakeet-tdt-0_6b:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: parakeet tdt 0.6b ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n        version: [\"v2\", \"v3\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Show disk space\n        run: |\n          df -h\n\n      # See https://github.com/vlayer-xyz/vlayer/pull/543/files\n      # Free up disk space as the macOS runners end up using most for Xcode\n      # versions we don't need and use iOS simulators.\n      - name: Free up disk space\n        run: |\n          echo '*** Delete iOS simulators and their caches'\n          xcrun simctl delete all\n          sudo rm -rf ~/Library/Developer/CoreSimulator/Caches/*\n\n      - name: Show disk space\n        run: |\n          df -h\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run ${{ matrix.version }}\n        if: matrix.version == 'v2'\n        shell: bash\n        run: |\n          cd scripts/nemo/parakeet-tdt-0.6b-v2\n          ./run.sh\n\n          ls -lh *.onnx\n          ls -lh *.weights\n\n          mv -v *.onnx ../../..\n          mv -v *.weights ../../..\n          mv -v tokens.txt ../../..\n          mv 2086-149220-0033.wav ../../../0.wav\n\n      - name: Run ${{ matrix.version }}\n        if: matrix.version == 'v3'\n        shell: bash\n        run: |\n          cd scripts/nemo/parakeet-tdt-0.6b-v3\n          
./run.sh\n\n          ls -lh *.onnx\n          mv -v *.onnx ../../..\n          mv -v *.weights ../../..\n          mv -v tokens.txt ../../..\n          mv *.wav ../../../\n\n      - name: Collect files (fp32)\n        shell: bash\n        run: |\n          version=${{ matrix.version }}\n          d=sherpa-onnx-nemo-parakeet-tdt-0.6b-$version\n          mkdir -p $d\n          cp -v encoder.onnx $d\n          cp -v encoder.weights $d\n          cp -v decoder.onnx $d\n          cp -v joiner.onnx $d\n          cp -v tokens.txt $d\n\n          mkdir $d/test_wavs\n          cp -v *.wav $d/test_wavs\n\n          # tar cjfv $d.tar.bz2 $d\n\n          # ls -lh *.tar.bz2\n\n      - name: Collect files (int8)\n        shell: bash\n        run: |\n          version=${{ matrix.version }}\n          d=sherpa-onnx-nemo-parakeet-tdt-0.6b-$version-int8\n          mkdir -p $d\n          cp -v encoder.int8.onnx $d\n          cp -v decoder.int8.onnx $d\n          cp -v joiner.int8.onnx $d\n          cp -v tokens.txt $d\n\n          mkdir $d/test_wavs\n          cp -v *.wav $d/test_wavs\n\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            version=${{ matrix.version }}\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemo-parakeet-tdt-0.6b-$version\n              sherpa-onnx-nemo-parakeet-tdt-0.6b-$version-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m 
huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git lfs track \"*.weights\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-nemo-parakeet-tdt.yaml",
    "content": "name: export-nemo-parakeet-tdt\n\non:\n  push:\n    branches:\n      - refactor-export-nemo\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-parakeet-tdt-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-parakeet-tdt-0_6b-v2:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: parakeet tdt\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install python dependencies\n        shell: bash\n        run: |\n          pip install \\\n            nemo_toolkit['asr'] \\\n            \"numpy<2\" \\\n            ipython \\\n            kaldi-native-fbank \\\n            librosa \\\n            onnx==1.17.0 \\\n            onnxmltools==1.13.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/parakeet-tdt_ctc-0.6b-ja\n          ./run-ctc.sh\n\n      - name: Collect files\n        shell: bash\n        run: |\n          models=(\n            sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8\n          )\n          for m in ${models[@]}; do\n            mv -v scripts/nemo/parakeet-tdt_ctc-0.6b-ja/$m .\n            tar cjfv $m.tar.bz2 $m\n          done\n\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n     
         sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8\n            )\n\n            for m in ${models[@]}; do\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-nemo-speaker-verification-to-onnx.yaml",
    "content": "name: export-nemo-speaker-verification-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemo-speaker-verification-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemo-speaker-verification-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export nemo speaker verification models to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/speaker-verification\n          ./run.sh\n\n          mv -v *.onnx ../../..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speaker-recongition-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=speaker-embedding-models\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            mv -v ./*.onnx ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git 
status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n"
  },
  {
    "path": ".github/workflows/export-nemotron-speech-streaming-en-0.6b.yaml",
    "content": "name: export-nemotron-speech-streaming-en-06b\n\non:\n  push:\n    branches:\n      - export-nemotron-streaming-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-nemotron-streaming-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-nemotron-speech-streaming-en-0-6b-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: nemotron-speech-streaming-en-0-6b-to-onnx\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install NeMo\n        shell: bash\n        run: |\n          BRANCH='main'\n          pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n          pip install onnxruntime ipython\n          pip install kaldi-native-fbank\n          pip install soundfile librosa\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/nemo/nemotron-speech-streaming-en-0.6b\n\n          python3 ./export_onnx.py\n\n          ls -lh *.onnx\n          echo \"---\"\n          ls -lh encoder.*\n\n      - name: Collect results\n        shell: bash\n        run: |\n          src=scripts/nemo/nemotron-speech-streaming-en-0.6b\n          d=sherpa-onnx-nemotron-speech-streaming-en-0.6b-2026-01-14\n          mkdir -p $d\n\n          cp -av $src/encoder.onnx $d/\n          cp -av $src/encoder.data $d/\n          cp -av $src/decoder.onnx $d/\n          cp -av $src/joiner.onnx $d/\n          cp -av $src/tokens.txt $d/\n          cat >$d/README.md <<EOF\n          # Introduction\n          This model is from https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b\n          EOF\n\n          ls -lh $d\n\n      
    d=sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14\n          mkdir -p $d\n\n          cp -av $src/encoder.int8.onnx $d/\n          cp -av $src/decoder.int8.onnx $d/\n          cp -av $src/joiner.int8.onnx $d/\n          cp -av $src/tokens.txt $d/\n          cat >$d/README.md <<EOF\n          # Introduction\n          This model is from https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b\n          EOF\n\n          ls -lh $d\n\n\n      - name: Download test waves\n        if: true\n        shell: bash\n        run: |\n          mkdir test_wavs\n          pushd test_wavs\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/0.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/1.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/8k.wav\n          curl -SL -O https://hf-mirror.com/csukuangfj/sherpa-onnx-nemo-ctc-en-conformer-small/resolve/main/test_wavs/trans.txt\n          popd\n\n          models=(\n            sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14\n            sherpa-onnx-nemotron-speech-streaming-en-0.6b-2026-01-14\n          )\n          for m in ${models[@]}; do\n            cp -av test_wavs $m\n            tar cjvf $m.tar.bz2 $m\n          done\n\n          ls -lh *.tar.bz2\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          # fp32 models is 2.2GB > 2GB\n          file: sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: 
nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            models=(\n              sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14\n              sherpa-onnx-nemotron-speech-streaming-en-0.6b-2026-01-14\n            )\n\n            for m in ${models[@]}; do\n              if [ ! -d $m ]; then\n                echo \"skip $m\"\n                continue\n              fi\n\n              rm -rf huggingface\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n              cp -av $m/* huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.data\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"first commit\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main\n              cd ..\n            done\n\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              
mkdir -p ms/nemo\n              cp -av $m ms/nemo\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh\n              git add .\n\n              git commit -m \"add models\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n            rm -rf ms\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n\n              d=asr-models/nemo\n              mkdir -p huggingface/$d\n\n              cp -v $m huggingface/$d/\n\n              pushd huggingface\n              git lfs track \"*.tar.bz2\"\n              ls -lh $d/$m\n\n              ls -lh $d\n\n              pushd $d\n              git lfs track \"*.tar.bz2\"\n              popd\n\n              git status\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n              popd\n            done\n            rm -rf huggingface\n"
  },
  {
    "path": ".github/workflows/export-omnilingual-asr-to-onnx.yaml",
    "content": "name: export-omnilingual-asr-to-onnx\n\non:\n  push:\n    branches:\n      - omnilingual-1b\n  workflow_dispatch:\n\nconcurrency:\n  group: export-omnilingual-asr-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-omnilingual-asr-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.model_card }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n        model_card: [\"omniASR_CTC_300M\", \"omniASR_CTC_300M_v2\", \"omniASR_CTC_1B\", \"omniASR_CTC_1B_v2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          sudo apt install libsndfile1\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install fairseq2 \\\n            --extra-index-url https://fair.pkg.atmeta.com/fairseq2/whl/pt2.8.0/cpu \\\n            torch==2.8.0+cpu -f https://download.pytorch.org/whl/torch \\\n            torchaudio==2.8.0+cpu -f https://download.pytorch.org/whl/torchaudio \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile \\\n            librosa\n\n          pip install --no-deps omnilingual_asr\n\n          pip install retrying pandas polars pyarrow xxhash\n\n      - name: Setup tmate session\n        if: false\n        uses: mxschmitt/action-tmate@v3\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/omnilingual-asr\n          model_card=${{ matrix.model_card }}\n          python3 ./export-onnx.py --model-card $model_card\n\n          ls -lh *.onnx\n          ls -lh *.weights || true\n\n          rm README.md\n\n          curl -SL 
-O https://raw.githubusercontent.com/facebookresearch/omnilingual-asr/refs/heads/main/README.md\n          curl -SL -O https://raw.githubusercontent.com/facebookresearch/omnilingual-asr/refs/heads/main/LICENSE\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/en.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/es.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/fr.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/de.wav\n\n          echo \"---test----\"\n          python3 ./test.py\n\n          echo \"---collect files----\"\n\n          if [[ $model_card == omniASR_CTC_300M ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-2025-11-12\n          elif [[ $model_card == omniASR_CTC_300M_v2 ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-2026-02-05\n          elif [[ $model_card == omniASR_CTC_1B ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-2025-11-12\n          elif [[ $model_card == omniASR_CTC_1B_v2 ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-v2-2026-02-05\n          else\n            echo \"Unknown model: $model_card\"\n            exit 1\n          fi\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          mv -v model.onnx $d\n          mv -v model.weights $d || true\n          cp -v tokens.txt $d\n          cp -v README.md $d\n          cp -v LICENSE* $d\n          cp -v *.wav $d/test_wavs\n\n          ls -lh $d\n\n          tar cjfv $d.tar.bz2 $d\n          mv $d ../..\n\n          if [[ $model_card == omniASR_CTC_300M ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n          elif [[ $model_card == omniASR_CTC_300M_v2 ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05\n     
     elif [[ $model_card == omniASR_CTC_1B ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-int8-2025-11-12\n          elif [[ $model_card == omniASR_CTC_1B_v2 ]]; then\n            d=sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-v2-int8-2026-02-05\n          else\n            echo \"Unknown model: $model_card\"\n            exit 1\n          fi\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          mv -v model.int8.onnx $d\n          cp -v tokens.txt $d\n          cp -v README.md $d\n          cp -v LICENSE* $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n\n          tar cjfv $d.tar.bz2 $d\n\n          mv $d ../..\n\n          mv *.tar.bz2 ../../\n\n          cd ../..\n\n          ls -lh *.tar.bz2\n\n          df -h\n          rm -fv onnx_* model.encoder* model.final*\n\n          ls -lh ~/.cache/fairseq2/assets/*\n\n          rm -rf ~/.cache/fairseq2/assets/\n          rm -rf ~/.cache\n\n          df -h\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            model_card=${{ matrix.model_card }}\n\n            dirs=(\n              sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-2025-11-12\n              sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\n              sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-2026-02-05\n              sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05\n              sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-2025-11-12\n              
sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-int8-2025-11-12\n              sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-v2-2026-02-05\n              sherpa-onnx-omnilingual-asr-1600-languages-1B-ctc-v2-int8-2026-02-05\n            )\n\n            for d in ${dirs[@]}; do\n              if [[ ! -d $d ]]; then\n                continue;\n              fi\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d huggingface\n              pushd huggingface\n\n              git fetch\n              git pull\n              echo \"pwd: $PWD\"\n              rm -fv ./*.weights\n              mv -v ../$d/* .\n\n              git lfs track \"*.onnx\"\n              git lfs track \"*.weights\"\n              git lfs track \"*.wav\"\n              ls -lh\n              git add .\n\n              ls -lh\n\n              git status\n\n              git commit -m \"add models\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d main || true\n              popd\n            done\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in *.tar.bz2; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              cp -av $m ms/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh\n              git add .\n\n              git commit -m \"add models\"\n              git 
push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n\n      # List large files first (safe)\n      - name: List .tar.bz2 files larger than 2GB\n        run: |\n          ls -lh *.tar.bz2\n          echo \"----\"\n          find . -type f -name \"*.tar.bz2\" -size +2G -print\n\n      # Delete large files\n      - name: Delete .tar.bz2 files larger than 2GB\n        run: |\n          find . -type f -name \"*.tar.bz2\" -size +2G -delete\n\n          ls -lh *.tar.bz2\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-paraformer-to-ascend-npu.yaml",
    "content": "name: export-paraformer-to-ascend-npu\n\non:\n  push:\n    branches:\n      - fix-ascend-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-paraformer-to-ascend-nput-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-ascend/generate_paraformer.py\n          MATRIX=$(python3 .github/scripts/export-ascend/generate_paraformer.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-paraformer-to-rknn:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.soc_version }} ${{ matrix.cann }}\n    runs-on: ubuntu-latest\n\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    container:\n      image: ${{ matrix.image }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.8\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.8\"\n\n      - name: Show Python\n        shell: bash\n        run: |\n          python3 --version\n          which python3\n\n      - name: Install curl\n        shell: bash\n        run: apt-get update && apt-get install -y curl bzip2 git git-lfs\n\n      - name: Verify environment\n        shell: bash\n        run: |\n          ls -lh 
/usr/local/Ascend/ascend-toolkit/set_env.sh\n\n          find /usr/local/Ascend -name \"libascend*.so\" 2>/dev/null\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          echo \"CANN environment:\"\n          which atc || echo \"atc not found\"\n          atc --help\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install \"numpy<2\" \\\n                  onnx==1.17.0 \\\n                  torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n                  attrs psutil scipy decorator cloudpickle ml-dtypes tornado \\\n                  pyyaml\n\n      - name: Setup tmate session\n        if: false\n        uses: mxschmitt/action-tmate@v3\n\n      - name: Run Paraformer from FunAsr\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          cd scripts/paraformer/ascend-npu\n\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/am.mvn\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/config.yaml\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/tokens.txt\n\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/model.pt\n          mv model.pt model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/0.wav\n          curl -SL -O 
https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/2.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/3-sichuan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/4-tianjin.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/5-henan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/6-zh-en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/8k.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/README.md\n\n          echo \"export to onnx\"\n\n          python3 ./export_encoder_onnx.py\n          python3 ./export_decoder_onnx.py\n          python3 ./export_predictor_onnx.py\n\n          rm -v *.pt\n\n          ls -lh *.onnx\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          atc --model=./predictor.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=predictor \\\n            --input_format=ND \\\n            --input_shape=\"encoder_out:1,-1,512\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n  
        ls -lh *.om\n\n          atc --model=./decoder.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=decoder \\\n            --input_format=ND \\\n            --input_shape=\"encoder_out:1,-1,512;acoustic_embedding:1,-1,512\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          atc --model=./encoder.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=encoder \\\n            --input_format=ND \\\n            --input_shape=\"x:1,-1,560\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          rm -v *.onnx\n\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-$cann-paraformer-zh-2023-03-28\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v encoder_linux_aarch64.om $d/encoder.om\n          cp -v decoder_linux_aarch64.om $d/decoder.om\n          cp -v predictor_linux_aarch64.om $d/predictor.om\n          cp -v test_om.py $d/\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Run Paraformer from WSChuan-ASR\n        if: matrix.framework == 'WSChuan-ASR'\n        shell: bash\n        run: |\n          cd scripts/paraformer/ascend-npu\n\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/am.mvn\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/config.yaml\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/tokens.json\n        
  curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/tokens.txt\n\n\n          for i in $(seq 1 16); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/test_wavs/$i.wav\n          done\n\n          rm -f README.md || true\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/README.md\n\n          echo \"export to onnx\"\n\n          python3 ./export_encoder_onnx.py\n          python3 ./export_decoder_onnx.py\n          python3 ./export_predictor_onnx.py\n\n          ls -lh *.onnx\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          atc --model=./predictor.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=predictor \\\n            --input_format=ND \\\n            --input_shape=\"encoder_out:1,-1,512\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          atc --model=./decoder.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=decoder \\\n            --input_format=ND \\\n            --input_shape=\"encoder_out:1,-1,512;acoustic_embedding:1,-1,512\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          atc --model=./encoder.onnx \\\n            
--framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=encoder \\\n            --input_format=ND \\\n            --input_shape=\"x:1,-1,560\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          rm -v *.onnx\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-$cann-paraformer-zh-2025-10-07\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v encoder_linux_aarch64.om $d/encoder.om\n          cp -v decoder_linux_aarch64.om $d/decoder.om\n          cp -v predictor_linux_aarch64.om $d/predictor.om\n          cp -v test_om.py $d/\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-ascend\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models-ascend\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git 
config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n\n              d=asr-models/ascend-npu/paraformer\n              mkdir -p huggingface/$d\n\n              cp -v $m huggingface/$d/\n\n              pushd huggingface\n              git lfs track \"*.tar.bz2\"\n              ls -lh $d\n              pushd $d\n              git lfs track \"*.tar.bz2\"\n              popd\n\n              git status\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n              popd\n            done\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              d=ascend-npu/paraformer\n              mkdir -p ms/$d\n\n              cp -av $m ms/$d/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh $d/$m\n\n              ls -lh $d\n              git add .\n\n              git commit -m \"add $m\"\n              git push 
https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n            rm -rf ms\n"
  },
  {
    "path": ".github/workflows/export-paraformer-to-qnn.yaml",
    "content": "name: export-paraformer-to-qnn\n\non:\n  push:\n    branches:\n      - export-paraformer-qnn-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-paraformer-to-qnn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-qnn/generate_paraformer.py\n          MATRIX=$(python3 .github/scripts/export-qnn/generate_paraformer.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-paraformer-to-qnn:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.input_in_seconds }} ${{ matrix.soc }}\n    runs-on: ubuntu-22.04\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.10\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Create directories\n        shell: bash\n        run: |\n          mkdir so binary\n\n      - name: Create Python virtual environment\n        shell: bash\n        run: |\n          python3 -m venv py310\n          which 
python3\n          source py310/bin/activate\n          which python3\n\n      - name: Show ndk-build help\n        shell: bash\n        run: |\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          ndk-build --help\n\n      - name: Download toolkit\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/qnn-toolkit/resolve/main/v2.40.0.251030.zip\n          ls -lh v2.40.0.251030.zip\n\n      - name: Unzip toolkit\n        shell: bash\n        run: |\n          unzip v2.40.0.251030.zip\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---ls -lh qairt---\"\n\n          ls -lh qairt\n\n          echo \"---\"\n\n      - name: Install linux dependencies\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---\"\n\n          ls -lh qairt\n\n          cd qairt/2.40.0.251030/bin\n          source envsetup.sh\n\n          yes | sudo ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh || true\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          cd qairt/2.40.0.251030/bin\n          source envsetup.sh\n\n          python3 -m pip install \\\n            mock \\\n            numpy \\\n            opencv-python \\\n            optuna \\\n            packaging \\\n            pandas \\\n            paramiko \\\n            pathlib2 \\\n            pillow \\\n            plotly \\\n            protobuf \\\n            psutil \\\n            pydantic \\\n            pytest \\\n            pyyaml \\\n            rich \\\n            scikit-optimize \\\n            scipy \\\n            six \\\n            tabulate \\\n            typing-extensions \\\n            xlsxwriter\n\n          python3 \"${QNN_SDK_ROOT}/bin/check-python-dependency\" || true\n\n          which python3\n\n      - name: Install onnx dependencies\n        shell: bash\n        run: |\n          source py310/bin/activate\n  
        python3 -m pip install --upgrade \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            kaldi_native_fbank \\\n            pip \\\n            \"numpy<2\" \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile \\\n            librosa \\\n            onnxsim \\\n            sentencepiece \\\n            pyyaml\n\n          which python3\n\n      - name: Show qnn-onnx-converter help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-onnx-converter --help\n\n      - name: Show qnn-model-lib-generator help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-model-lib-generator --help\n\n      - name: Show qnn-net-run help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-net-run --help\n\n      - name: Run Paraformer from FunASR\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n\n          export t=${{ matrix.input_in_seconds }}\n          export soc=${{ matrix.soc }}\n\n          dir=$PWD\n\n          cd scripts/paraformer/qnn\n\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/am.mvn\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/config.yaml\n          curl 
-SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/tokens.txt\n\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/model.pt\n          mv model.pt model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/2.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/3-sichuan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/4-tianjin.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/5-henan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/6-zh-en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/8k.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/README.md\n\n\n          ./convert_decoder.sh\n\n          ./convert_predictor.sh\n\n          ./convert_encoder.sh\n\n          ls -lh model_libs/*/lib*.so\n\n          ls -lh binary\n\n          readelf -lW model_libs/*/lib*.so\n\n          echo \"collect results\"\n\n          d=sherpa-onnx-qnn-${{ matrix.soc}}-binary-$t-seconds-paraformer-zh-2023-03-28-int8\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v binary/encoder.bin $d/\n          cp -v binary/predictor.bin $d/\n    
      cp -v binary/decoder.bin $d/\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          mv *.tar.bz2 ../../../binary/\n\n\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == x86_64-linux-clang ]]; then\n\n              d=sherpa-onnx-qnn-$t-seconds-paraformer-zh-2023-03-28-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-paraformer-zh-2023-03-28-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              exit -1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v README.md $d\n\n            cp -v model_libs/$p/libencoder*.so $d/libencoder.so\n            cp -v model_libs/$p/libpredictor*.so $d/libpredictor.so\n            cp -v model_libs/$p/libdecoder*.so $d/libdecoder.so\n\n            cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../../so/\n\n\n      - name: Run Paraformer from WSChuan-ASR\n        if: matrix.framework == 'WSChuan-ASR'\n        shell: bash\n        run: |\n          dir=$PWD\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n          export t=${{ matrix.input_in_seconds }}\n          export soc=${{ matrix.soc }}\n\n          cd scripts/paraformer/qnn\n\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/am.mvn\n          curl -SL -O 
https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/config.yaml\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/tokens.json\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/tokens.txt\n\n\n          for i in $(seq 1 16); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/test_wavs/$i.wav\n          done\n\n          rm -f README.md || true\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/README.md\n\n          ./convert_decoder.sh\n\n          ./convert_predictor.sh\n\n          ./convert_encoder.sh\n\n          ls -lh model_libs/*/lib*.so\n\n          ls -lh binary\n\n          readelf -lW model_libs/*/lib*.so\n\n          echo \"collect results\"\n\n          d=sherpa-onnx-qnn-${{ matrix.soc}}-binary-$t-seconds-paraformer-zh-2025-10-07-int8\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v binary/encoder.bin $d/\n          cp -v binary/predictor.bin $d/\n          cp -v binary/decoder.bin $d/\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          mv *.tar.bz2 ../../../binary/\n\n\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == x86_64-linux-clang ]]; then\n\n              d=sherpa-onnx-qnn-$t-seconds-paraformer-zh-2025-10-07-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-paraformer-zh-2025-10-07-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              
exit -1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v README.md $d\n\n            cp -v model_libs/$p/libencoder*.so $d/libencoder.so\n            cp -v model_libs/$p/libpredictor*.so $d/libpredictor.so\n            cp -v model_libs/$p/libdecoder*.so $d/libdecoder.so\n\n            cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../../so/\n\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.framework }}-${{ matrix.soc }}-${{ matrix.input_in_seconds }}-seconds\n          path: ./scripts/paraformer/qnn/my-config*/*.json\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj' && matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn-binary\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa' && matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: 
svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn-binary\n"
  },
  {
    "path": ".github/workflows/export-paraformer-to-rknn.yaml",
    "content": "name: export-paraformer-to-rknn\n\non:\n  push:\n    branches:\n      - ci-paraformer-rknn\n  workflow_dispatch:\n\nconcurrency:\n  group: export-paraformer-to-rknn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-paraformer-to-rknn:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.platform }} ${{ matrix.input_in_seconds }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n        platform: [\"rk3562\", \"rk3566\", \"rk3568\", \"rk3576\", \"rk3588\"]\n        input_in_seconds: [\"5\", \"10\", \"15\", \"20\", \"25\", \"30\"]\n        framework: [\"FunASR\", \"WSChuan-ASR\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade \\\n            pip \\\n            \"numpy<2\" \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            librosa \\\n            soundfile \\\n            pyyaml \\\n            onnxsim \\\n            sentencepiece \\\n            kaldi_native_fbank\n\n          curl -SL -O https://huggingface.co/csukuangfj/rknn-toolkit2/resolve/main/rknn_toolkit2-2.1.0%2B708089d1-cp310-cp310-linux_x86_64.whl\n          pip install ./*.whl \"numpy<=1.26.4\"\n\n      - name: Run Paraformer from FunASR\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          cd scripts/paraformer/rknn\n\n          curl -SL -O 
https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/am.mvn\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/config.yaml\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/tokens.txt\n\n          curl -SL -O https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/resolve/master/model.pt\n          mv model.pt model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/2.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/3-sichuan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/4-tianjin.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/5-henan.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/6-zh-en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/test_wavs/8k.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-03-28/resolve/main/README.md\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n          p=${{ matrix.platform }}\n\n          export 
url=\"https://www.modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\"\n          export model_author=\"iic\"\n          export comment=\"iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch\"\n\n          echo \"----$t---\"\n          python3 ./export_encoder_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./encoder-$t-seconds.onnx --out-model ./encoder-$t-seconds.rknn >/dev/null 2>&1\n\n          python3 ./export_predictor_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./predictor-$t-seconds.onnx --out-model ./predictor-$t-seconds.rknn >/dev/null 2>&1\n\n          python3 ./export_decoder_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./decoder-$t-seconds.onnx --out-model ./decoder-$t-seconds.rknn >/dev/null 2>&1\n\n          ls -lh *.onnx\n          echo \"---\"\n          ls -lh *.rknn\n\n          echo \"collect results\"\n          d=sherpa-onnx-$p-$t-seconds-paraformer-zh-2023-03-28\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v encoder-$t-seconds.rknn $d/encoder.rknn\n          cp -v decoder-$t-seconds.rknn $d/decoder.rknn\n          cp -v predictor-$t-seconds.rknn $d/predictor.rknn\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Run Paraformer from WSChuan-ASR\n        if: matrix.framework == 'WSChuan-ASR'\n        shell: bash\n        run: |\n          cd scripts/paraformer/rknn\n\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/am.mvn\n          curl -SL -O 
https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/config.yaml\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/tokens.json\n          curl -SL -O https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/model_state_dict.pt\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/tokens.txt\n\n\n          for i in $(seq 1 16); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/test_wavs/$i.wav\n          done\n\n          rm -f README.md || true\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-int8-2025-10-07/resolve/main/README.md\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n          p=${{ matrix.platform }}\n\n          export model_author=\"ASLP-lab\"\n          export comment=\"ASLP-lab/WSChuan-ASR\"\n          export url=\"https://huggingface.co/ASLP-lab/WSChuan-ASR/tree/main/Paraformer-large-Chuan\"\n\n          echo \"----$t---\"\n          python3 ./export_encoder_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./encoder-$t-seconds.onnx --out-model ./encoder-$t-seconds.rknn >/dev/null 2>&1\n\n          python3 ./export_predictor_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./predictor-$t-seconds.onnx --out-model ./predictor-$t-seconds.rknn >/dev/null 2>&1\n\n          python3 ./export_decoder_onnx.py  --input-len-in-seconds $t\n          python3 ./export_rknn.py --target-platform $p --in-model ./decoder-$t-seconds.onnx --out-model ./decoder-$t-seconds.rknn >/dev/null 2>&1\n\n          ls -lh *.onnx\n          echo \"---\"\n          ls -lh *.rknn\n\n          echo \"collect results\"\n          
d=sherpa-onnx-$p-$t-seconds-paraformer-zh-2025-10-07\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v encoder-$t-seconds.rknn $d/encoder.rknn\n          cp -v decoder-$t-seconds.rknn $d/decoder.rknn\n          cp -v predictor-$t-seconds.rknn $d/predictor.rknn\n\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-peng-cheng-starling.yaml",
    "content": "name: export-peng-cheng-starling-to-onnx\n\non:\n  push:\n    branches:\n      - fix-ci-2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-peng-cheng-starling-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-peng-cheng-starling-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export peng cheng starling ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/peng-cheng-starling\n          ./run.sh\n          python3 ./quantize_models.py\n\n          ls -lh\n          rm encoder-epoch-75-avg-11-chunk-16-left-128.onnx\n          rm joiner-epoch-75-avg-11-chunk-16-left-128.onnx\n          echo \"----\"\n          ls -lh\n\n\n      - name: Collect results ${{ matrix.version }}\n        shell: bash\n        run: |\n          src=scripts/peng-cheng-starling\n          d=sherpa-onnx-streaming-zipformer-ar_en_id_ja_ru_th_vi_zh-2025-02-10\n          mkdir $d\n\n          mv -v $src/*.onnx $d\n          cp -v $src/README.md $d\n          cp -v $src/bpe.model $d\n          cp -v $src/tokens.txt $d\n          cp -av $src/test_wavs $d\n\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n      - name: Publish to huggingface ${{ matrix.version }}\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 
20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            src=sherpa-onnx-streaming-zipformer-ar_en_id_ja_ru_th_vi_zh-2025-02-10\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n\n            cp -av ../$src/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$src main || true\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-piper.yaml",
    "content": "name: export-piper\n\non:\n  push:\n    branches:\n      - export-piper\n  workflow_dispatch:\n\nconcurrency:\n  group: export-piper-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-piper:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n        total: [\"20\"]\n        index: [\n          \"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\",\n          \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\",\n           ]\n        # total: [\"2\"]\n        # index: [\"0\", \"1\"]\n        # total: [\"1\"]\n        # index: [\"0\"]\n        # total: [\"5\"]\n        # index: [\"0\", \"1\", \"2\", \"3\", \"4\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2 iso639-lang onnx==1.17.0 onnxruntime==1.17.1 sherpa-onnx onnxmltools==1.13.0\n          python3 -m pip install \"numpy<2\" soundfile\n\n      - name: Generate script\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          cd scripts/piper\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          export GIT_LFS_SKIP_SMUDGE=1\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n\n          python3 
./generate.py --total $total --index $index\n          chmod +x ./generate.sh\n          ls -lh\n\n      - name: Show script\n        shell: bash\n        run: |\n          cd scripts/piper\n          cat ./generate.sh\n\n      - name: Run script\n        shell: bash\n        run: |\n          cd scripts/piper\n          ./generate.sh\n\n      - name: Show generated mp3 files\n        shell: bash\n        run: |\n          cd scripts/piper\n          ls -lh hf/piper/mp3/*\n          echo \"----\"\n          ls -lh hf/piper/mp3/*/*\n\n      - name: Push generated mp3 files\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            cd scripts/piper/hf\n            git pull --rebase\n            git lfs track \"*.mp3\"\n            git status .\n            git add .\n            git commit -m 'Add mp3 files'\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n\n      - name: Show generated model files\n        shell: bash\n        run: |\n          cd scripts/piper\n          ls -lh *.tar.bz2\n\n      - name: Show generated model files(2)\n        shell: bash\n        run: |\n          cd scripts/piper\n          ls -lh release/\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            dirs=(\n              vits-piper-de_DE-glados-high\n              vits-piper-de_DE-glados-low\n              
vits-piper-de_DE-glados-medium\n              vits-piper-de_DE-glados_turret-high\n              vits-piper-de_DE-glados_turret-low\n              vits-piper-de_DE-glados_turret-medium\n              vits-piper-en_US-glados-high\n              vits-piper-fa_IR-ganji-medium\n              vits-piper-fa_IR-ganji_adabi-medium\n              vits-piper-fa_IR-reza_ibrahim-medium\n              vits-piper-hi_IN-pratham-medium\n              vits-piper-hi_IN-priyamvada-medium\n              vits-piper-es_AR-daniela-high\n              vits-piper-en_GB-miro-high\n              vits-piper-en_GB-dii-high\n              vits-piper-pt_PT-miro-high\n              vits-piper-pt_PT-dii-high\n              vits-piper-pt_BR-miro-high\n              vits-piper-pt_BR-dii-high\n              vits-piper-es_ES-miro-high\n              vits-piper-it_IT-miro-high\n              vits-piper-it_IT-dii-high\n              vits-piper-nl_NL-miro-high\n              vits-piper-nl_NL-dii-high\n              vits-piper-de_DE-miro-high\n              vits-piper-de_DE-dii-high\n              vits-piper-fr_FR-miro-high\n              vits-piper-en_US-miro-high\n              vits-piper-pl_PL-jarvis_wg_glos-medium\n              vits-piper-pl_PL-justyna_wg_glos-medium\n              vits-piper-pl_PL-meski_wg_glos-medium\n              vits-piper-pl_PL-zenski_wg_glos-medium\n              vits-piper-id_ID-news_tts-medium\n              vits-piper-hi_IN-rohan-medium\n              vits-piper-ar_JO-SA_miro-high-int8\n              vits-piper-ar_JO-SA_miro-high-fp16\n              vits-piper-ar_JO-SA_miro-high\n              vits-piper-ar_JO-SA_dii-high-int8\n              vits-piper-ar_JO-SA_dii-high-fp16\n              vits-piper-ar_JO-SA_dii-high\n              vits-piper-ar_JO-SA_miro_V2-high-int8\n              vits-piper-ar_JO-SA_miro_V2-high-fp16\n              vits-piper-ar_JO-SA_miro_V2-high\n            )\n            for d in ${dirs[@]}; do\n              src=scripts/piper/release/$d\n          
    if [ ! -d $src ]; then\n                continue;\n              fi\n\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -a $src/* ./huggingface\n              pushd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track af_dict\n              git lfs track ar_dict\n              git lfs track cmn_dict\n              git lfs track da_dict en_dict fa_dict hu_dict ia_dict it_dict lb_dict phondata ru_dict ta_dict\n              git lfs track ur_dict yue_dict\n\n              git status\n              git add .\n              git status\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n              popd\n\n            done\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./scripts/piper/vits-piper-*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./scripts/piper/vits-piper-*.tar.bz2\n          overwrite: true\n          tag: tts-models\n"
  },
  {
    "path": ".github/workflows/export-pocket-tts.yaml",
    "content": "name: export-pocket-to-onnx\n\non:\n  push:\n    branches:\n      - export-pocket-tts-2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-pocket-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-pocket-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export PocketTTS ${{ matrix.version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Install ffmpeg\n        shell: bash\n        run: brew install ffmpeg\n\n      - name: Verify ffmpeg\n        shell: bash\n        run: ffmpeg -version\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.17.0 onnxruntime==1.17.1 librosa soundfile \\\n            torch==2.8.0\n\n      - name: Run\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        shell: bash\n        run: |\n          git clone https://github.com/csukuangfj/pocket-tts-onnx-export\n          cd pocket-tts-onnx-export\n          pip install -r requirements.txt\n          pip install onnx==1.17.0 torch==2.8.0\n          pip list\n\n          git grep 'opset_version'\n          echo \"---\"\n          sed -i '' 's/opset_version=17/opset_version=14/g' scripts/*.py\n          echo \"---\"\n          git grep 'opset_version'\n\n          python export.py\n          python export.py --quantize\n\n          ls -lh onnx/\n\n          python3 ../scripts/pyannote//segmentation/show-onnx.py --filename ./onnx/flow_lm_flow.onnx\n          python3 ../scripts/pyannote//segmentation/show-onnx.py --filename ./onnx/flow_lm_main.onnx\n          python3 
../scripts/pyannote//segmentation/show-onnx.py --filename ./onnx/mimi_encoder.onnx\n          python3 ../scripts/pyannote//segmentation/show-onnx.py --filename ./onnx/mimi_decoder.onnx\n          python3 ../scripts/pyannote//segmentation/show-onnx.py --filename ./onnx/text_conditioner.onnx\n\n          cd onnx\n          mv flow_lm_flow_int8.onnx lm_flow.int8.onnx\n          mv flow_lm_flow.onnx lm_flow.onnx\n\n          mv flow_lm_main_int8.onnx lm_main.int8.onnx\n          mv flow_lm_main.onnx lm_main.onnx\n\n          mv mimi_encoder_int8.onnx encoder.int8.onnx\n          mv mimi_encoder.onnx encoder.onnx\n\n          mv mimi_decoder_int8.onnx decoder.int8.onnx\n          mv mimi_decoder.onnx decoder.onnx\n\n          mv text_conditioner_int8.onnx text_conditioner.int8.onnx\n          cd ..\n\n          mv onnx ..\n\n          # bash-3.2$ ls -lh onnx/\n          # total 1318368\n          # -rw-r--r--  1 runner  staff   9.5M Feb 10 09:29 flow_lm_flow_int8.onnx\n          # -rw-r--r--  1 runner  staff    37M Feb 10 09:29 flow_lm_flow.onnx\n          # -rw-r--r--  1 runner  staff    73M Feb 10 09:29 flow_lm_main_int8.onnx\n          # -rw-r--r--  1 runner  staff   289M Feb 10 09:29 flow_lm_main.onnx\n          # -rw-r--r--  1 runner  staff    22M Feb 10 09:29 mimi_decoder_int8.onnx\n          # -rw-r--r--  1 runner  staff    40M Feb 10 09:29 mimi_decoder.onnx\n          # -rw-r--r--  1 runner  staff    71M Feb 10 09:29 mimi_encoder_int8.onnx\n          # -rw-r--r--  1 runner  staff    71M Feb 10 09:29 mimi_encoder.onnx\n          # -rw-r--r--  1 runner  staff    16M Feb 10 09:29 text_conditioner_int8.onnx\n          # -rw-r--r--  1 runner  staff    16M Feb 10 09:29 text_conditioner.onnx\n\n      - name: Setup tmate session\n        # if: true\n        if: failure()\n        uses: mxschmitt/action-tmate@v3\n\n      - name: Generate json files\n        if: true\n        shell: bash\n        run: |\n          cp -v onnx/*.onnx scripts/pocket-tts\n\n          pushd 
scripts/pocket-tts\n          curl -SsL -O https://huggingface.co/KevinAHM/pocket-tts-onnx/resolve/main/onnx/LICENSE\n          curl -SsL -O https://huggingface.co/KevinAHM/pocket-tts-onnx/resolve/main/tokenizer.model\n\n          wget https://github.com/kyutai-labs/delayed-streams-modeling/raw/refs/heads/main/audio/bria.mp3\n          wget https://github.com/kyutai-labs/delayed-streams-modeling/raw/refs/heads/main/audio/loona.mp3\n          wget https://github.com/kyutai-labs/delayed-streams-modeling/raw/refs/heads/main/audio/sample_fr_hibiki_crepes.mp3\n          for f in *.mp3; do\n            ffmpeg -y -i \"$f\" -ac 1 -ar 24000 \"${f%.mp3}.wav\"\n          done\n          rm -v *.mp3\n\n          ls -lh\n\n          ./convert_tokenizer.py\n\n          ls -lh\n          rm README.md\n          cat >README.md <<EOF\n          # Introduction\n          See also https://github.com/kyutai-labs/pocket-tts\n          ONNX files are exported using https://github.com/KevinAHM/pocket-tts-onnx-export\n          Files in test_wavs are from https://github.com/kyutai-labs/delayed-streams-modeling/tree/main/audio\n          Before you use it, please read its [LICENSE](https://huggingface.co/KevinAHM/pocket-tts-onnx/blob/main/onnx/LICENSE)\n          It is for non-commercial use only.\n          EOF\n\n      - name: Collect results\n        if: true\n        shell: bash\n        run: |\n          d=sherpa-onnx-pocket-tts-2026-01-26\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n          src=scripts/pocket-tts\n          cp -v $src/*.onnx $d\n          rm $d/*int8.onnx\n          cp -v $src/README.md $d\n          cp -v $src/LICENSE $d\n          cp -v $src/*.json $d\n          cp -v $src/*.wav $d/test_wavs\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n          ls -lh $d.tar.bz2\n\n      - name: Collect results (int8)\n        if: true\n        shell: bash\n        run: |\n          d=sherpa-onnx-pocket-tts-int8-2026-01-26\n          mkdir -p $d\n          mkdir -p 
$d/test_wavs\n          src=scripts/pocket-tts\n          cp -v $src/*.onnx $d\n          rm $d/lm_flow.onnx\n          rm $d/lm_main.onnx\n          rm $d/decoder.onnx\n          rm $d/encoder.int8.onnx\n          rm $d/text_conditioner.int8.onnx\n          cp -v $src/README.md $d\n          cp -v $src/LICENSE $d\n          cp -v $src/*.json $d\n          cp -v $src/*.wav $d/test_wavs\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n          ls -lh $d.tar.bz2\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            dirs=(\n              sherpa-onnx-pocket-tts-2026-01-26\n              sherpa-onnx-pocket-tts-int8-2026-01-26\n            )\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            for d in ${dirs[@]}; do\n              echo \"d $d\"\n              if [[ ! 
-d $d ]]; then\n                echo \"$d does not exist\"\n                continue\n              fi\n\n              echo \"$d exists\"\n              rm -rf huggingface\n\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d huggingface\n              cd huggingface\n              rm -rf ./*\n\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n\n              cp -a ../$d/* ./\n\n              git add .\n\n              ls -lh\n\n              git status\n\n              git commit -m \"add models\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d main || true\n              cd ..\n            done\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in *.tar.bz2; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/tts-models.git ms\n\n              cp -av $m ms/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh\n              git add .\n\n              git commit -m \"add models\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/tts-models.git\n\n              popd\n            done\n"
  },
  {
    "path": ".github/workflows/export-pyannote-segmentation-to-onnx.yaml",
    "content": "name: export-pyannote-segmentation-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-pyannote-segmentation-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-pyannote-segmentation-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export Pyannote segmentation models to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install pyannote\n        shell: bash\n        run: |\n          pip install pyannote.audio onnx==1.15.0 onnxruntime==1.16.3\n\n      - name: Run\n        shell: bash\n        run: |\n          d=sherpa-onnx-pyannote-segmentation-3-0\n          src=$PWD/$d\n          mkdir -p $src\n\n          pushd scripts/pyannote/segmentation\n          ./run.sh\n          cp ./*.onnx $src/\n          cp ./README.md $src/\n          cp ./LICENSE $src/\n          cp ./run.sh $src/\n          cp ./*.py $src/\n\n          popd\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speaker-segmentation-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n      
      git config --global user.name \"Fangjun Kuang\"\n\n            d=sherpa-onnx-pyannote-segmentation-3-0\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -v $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n"
  },
  {
    "path": ".github/workflows/export-revai-segmentation-to-onnx.yaml",
    "content": "name: export-revai-segmentation-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-revai-segmentation-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-revai-segmentation-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export revai segmentation models to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install pyannote\n        shell: bash\n        run: |\n          pip install pyannote.audio onnx==1.15.0 onnxruntime==1.16.3\n\n      - name: Run\n        shell: bash\n        run: |\n          d=sherpa-onnx-reverb-diarization-v1\n          src=$PWD/$d\n          mkdir -p $src\n\n          pushd scripts/pyannote/segmentation\n          ./run-revai.sh\n          cp ./*.onnx $src/\n          cp ./README.md $src/\n          cp ./LICENSE $src/\n          cp ./run-revai.sh $src/run.sh\n          cp ./*.py $src/\n\n          popd\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speaker-segmentation-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n    
        git config --global user.name \"Fangjun Kuang\"\n\n            d=sherpa-onnx-reverb-diarization-v1\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            cp -v $d/* ./huggingface\n            cd huggingface\n            git lfs track \"*.onnx\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n"
  },
  {
    "path": ".github/workflows/export-russian-onnx-models.yaml",
    "content": "name: export-russian-onnx-models\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-russian-onnx-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-russian-onnx-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export Russian onnx models\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: vosk-model-ru (zipformer v1)\n        shell: bash\n        run: |\n          cat >README.md <<EOF\n          # Introduction\n          Models in this directory are from\n          https://huggingface.co/alphacep/vosk-model-ru/tree/main\n          EOF\n\n          cat README.md\n\n          d=sherpa-onnx-zipformer-ru-2024-09-18\n          mkdir $d\n          pushd $d\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/lang/bpe.model\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/lang/tokens.txt\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/encoder.int8.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/decoder.int8.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/joiner.int8.onnx\n\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/encoder.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/decoder.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-ru/resolve/main/am-onnx/joiner.onnx\n\n          mkdir test_wavs\n          cd test_wavs\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/russian-i-love-you.wav\n          curl -SL -O 
https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/test.wav\n\n          mv russian-i-love-you.wav 0.wav\n          mv test.wav 1.wav\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 $d\n          rm -rf $d\n\n      - name: vosk-model-ru-small (zipformer v1)\n        shell: bash\n        run: |\n          cat >README.md <<EOF\n          # Introduction\n          Models in this directory are from\n          https://huggingface.co/alphacep/vosk-model-small-ru/tree/main\n          EOF\n\n          cat README.md\n\n          d=sherpa-onnx-small-zipformer-ru-2024-09-18\n          mkdir $d\n          pushd $d\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/lang/bpe.model\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/lang/tokens.txt\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/encoder.int8.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/decoder.int8.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/joiner.int8.onnx\n\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/encoder.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/decoder.onnx\n          curl -SL -O https://huggingface.co/alphacep/vosk-model-small-ru/resolve/main/am/joiner.onnx\n\n          mkdir test_wavs\n          cd test_wavs\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/russian-i-love-you.wav\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/test.wav\n\n          mv russian-i-love-you.wav 0.wav\n          mv test.wav 1.wav\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 
$d\n          rm -rf $d\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-sense-voice-to-ascend-npu.yaml",
    "content": "name: export-sense-voice-to-ascend-npu\n\non:\n  push:\n    branches:\n      - fix-ascend-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-sense-voice-to-ascend-npu-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-ascend/generate_sense_voice.py\n          MATRIX=$(python3 .github/scripts/export-ascend/generate_sense_voice.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-sense-voice-to-ascend-npu:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.soc_version }} ${{ matrix.cann }}\n    runs-on: ubuntu-latest\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    container:\n      image: ${{ matrix.image }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.8\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.8\"\n\n      - name: Show Python\n        shell: bash\n        run: |\n          python3 --version\n          which python3\n\n      - name: Install curl\n        shell: bash\n        run: apt-get update && apt-get install -y curl bzip2 git git-lfs\n\n      - name: Verify environment\n        shell: bash\n        run: |\n          ls -lh 
/usr/local/Ascend/ascend-toolkit/set_env.sh\n\n          find /usr/local/Ascend -name \"libascend*.so\" 2>/dev/null\n\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          echo \"CANN environment:\"\n          which atc || echo \"atc not found\"\n          atc --help\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install \"numpy<2\" \\\n                  onnx==1.17.0 \\\n                  torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n                  attrs psutil scipy decorator cloudpickle ml-dtypes tornado \\\n                  sentencepiece \\\n                  pyyaml\n\n      - name: Run SenseVoice from FunAsr\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          cd scripts/sense-voice/ascend-npu\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/model.pt\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ja.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ko.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          
curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/README.md\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/LICENSE\n\n          echo \"export to onnx\"\n\n          python3 ./export_onnx.py\n          rm -v *.pt\n\n          ls -lh *.onnx\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          atc --model=./model.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=model \\\n            --input_format=ND \\\n            --input_shape=\"x:1,-1,560;prompt:4\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          rm -v *.onnx\n\n          ls -lh *.om\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-${cann}-sense-voice-zh-en-ja-ko-yue-2024-07-17\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v LICENSE $d\n          cp -v model_linux_aarch64.om $d/model.om\n          cp -v tokens.txt $d\n          cp -v test_om.py $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Run 
SenseVoice from WSYue-ASR\n        if: matrix.framework == 'WSYue-ASR'\n        shell: bash\n        run: |\n          cd scripts/sense-voice/ascend-npu\n\n          curl -SL -O https://huggingface.co/ASLP-lab/WSYue-ASR/resolve/main/sensevoice_small_yue/model.pt\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          for i in $(seq 0 17); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/test_wavs/yue-$i.wav\n          done\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/README.md\n\n          echo \"export to onnx\"\n          python3 ./export_onnx.py\n          rm -v *.pt\n\n          ls -lh *.onnx\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          atc --model=./model.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=model \\\n  
          --input_format=ND \\\n            --input_shape=\"x:1,-1,560;prompt:4\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          rm -v *.onnx\n          ls -lh *.om\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-${cann}-sense-voice-zh-en-ja-ko-yue-2025-09-09\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v model_linux_aarch64.om $d/model.om\n          cp -v tokens.txt $d\n          cp -v test_om.py $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-ascend\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models-ascend\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n           
   git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n\n              d=asr-models/ascend-npu/sense-voice\n              mkdir -p huggingface/$d\n\n              cp -v $m huggingface/$d/\n\n              pushd huggingface\n              git lfs track \"*.tar.bz2\"\n              ls -lh $d/$m\n\n              ls -lh $d\n\n              pushd $d\n              git lfs track \"*.tar.bz2\"\n              popd\n\n              git status\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n              popd\n            done\n\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              d=ascend-npu/sense-voice\n              mkdir -p ms/$d\n\n              cp -av $m ms/$d/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh $d/$m\n\n              ls -lh $d\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n            rm -rf ms\n"
  },
  {
    "path": ".github/workflows/export-sense-voice-to-onnx.yaml",
    "content": "name: export-sense-voice-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-sense-voice-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-sense-voice-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export sense-voice\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          sudo apt-get install -y -qq sox libsox-fmt-mp3\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile \\\n            kaldi-native-fbank \\\n            librosa\n\n          pip install  \"numpy<2\"\n\n      - name: Download test_wavs\n        shell: bash\n        run: |\n          sudo apt-get install -y -qq sox libsox-fmt-mp3\n\n          cd scripts/sense-voice\n\n          curl -SL -O https://huggingface.co/FunAudioLLM/SenseVoiceSmall/resolve/main/example/zh.mp3\n          curl -SL -O https://huggingface.co/FunAudioLLM/SenseVoiceSmall/resolve/main/example/en.mp3\n          curl -SL -O https://huggingface.co/FunAudioLLM/SenseVoiceSmall/resolve/main/example/ja.mp3\n          curl -SL -O https://huggingface.co/FunAudioLLM/SenseVoiceSmall/resolve/main/example/ko.mp3\n          curl -SL -O https://huggingface.co/FunAudioLLM/SenseVoiceSmall/resolve/main/example/yue.mp3\n\n          soxi *.mp3\n\n          sox zh.mp3 -r 16k zh.wav\n          sox en.mp3 -r 16k en.wav\n          sox 
ja.mp3 -r 16k ja.wav\n          sox ko.mp3 -r 16k ko.wav\n          sox yue.mp3 -r 16k yue.wav\n\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/sense-voice\n          curl -SL -O https://huggingface.co/csukuangfj/funasr-nano-with-ctc/resolve/main/model.pt\n          curl -SL -O https://huggingface.co/csukuangfj/funasr-nano-with-ctc/resolve/main/tokens.txt\n          ls -lh\n          ./export_onnx_nano.py\n\n          ls -lh\n\n          d=sherpa-onnx-sense-voice-funasr-nano-2025-12-17\n          d2=sherpa-onnx-sense-voice-funasr-nano-int8-2025-12-17\n          mkdir -p $d $d2\n\n          cp README-nano.md $d/README.md\n          cp README-nano.md $d2/README.md\n\n          mv model.onnx $d/\n          mv model.int8.onnx $d2/\n\n          for m in $d $d2; do\n            mkdir -p $m/test_wavs\n            cp -v *.wav $m/test_wavs\n            cp -v tokens.txt $m/\n\n            ls -lh $m\n\n            tar cjfv $m.tar.bz2 $m\n\n            ls -lh $m.tar.bz2\n            mv $m.tar.bz2 ../../\n            mv $m ../../\n          done\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 5\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            names=(\n              sherpa-onnx-sense-voice-funasr-nano-2025-12-17\n              sherpa-onnx-sense-voice-funasr-nano-int8-2025-12-17\n            )\n            for d in ${names[@]}; do\n              if [ ! 
-d $d ]; then\n                echo \"$d does not exist - skip it\"\n                continue;\n              fi\n\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -av $d/* ./huggingface\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n              git status\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n              cd ..\n            done\n\n      - name: Run\n        shell: bash\n        if: false\n        run: |\n          cd scripts/sense-voice\n          ./run.sh\n\n      - name: Publish to huggingface\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 huggingface\n            cd huggingface\n            git fetch\n            git pull\n            echo \"pwd: $PWD\"\n            ls -lh ../scripts/sense-voice\n\n            rm -rf ./*\n\n            cp -v ../scripts/sense-voice/*.onnx .\n            cp -v ../scripts/sense-voice/tokens.txt .\n            cp -v ../scripts/sense-voice/README.md .\n            cp -v ../scripts/sense-voice/export-onnx.py .\n\n            mkdir test_wavs\n            cp -v 
../*.wav ./test_wavs/\n\n            curl -SL -O https://raw.githubusercontent.com/FunAudioLLM/SenseVoice/main/LICENSE\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 main || true\n\n            cd ..\n\n            rm -rf huggingface/.git*\n            dst=sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-sense-voice-to-qnn.yaml",
    "content": "name: export-sense-voice-to-qnn\n\non:\n  push:\n    branches:\n      - qnn-binary-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-sense-voice-to-qnn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-qnn/generate_sense_voice.py\n          MATRIX=$(python3 .github/scripts/export-qnn/generate_sense_voice.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-sense-voice-to-qnn:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.input_in_seconds }} ${{ matrix.soc }}\n    runs-on: ubuntu-22.04\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.10\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Create directories\n        shell: bash\n        run: |\n          mkdir so binary\n\n      - name: Create Python virtual environment\n        shell: bash\n        run: |\n          python3 -m venv py310\n          which python3\n   
       source py310/bin/activate\n          which python3\n\n      - name: Show ndk-build help\n        shell: bash\n        run: |\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          ndk-build --help\n\n      - name: Download toolkit\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/qnn-toolkit/resolve/main/v2.40.0.251030.zip\n          ls -lh v2.40.0.251030.zip\n\n      - name: Unzip toolkit\n        shell: bash\n        run: |\n          unzip v2.40.0.251030.zip\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---ls -lh qairt---\"\n\n          ls -lh qairt\n\n          echo \"---\"\n\n      - name: Install linux dependencies\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---\"\n\n          ls -lh qairt\n\n          cd qairt/2.40.0.251030/bin\n          source envsetup.sh\n\n          yes | sudo ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh || true\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          cd qairt/2.40.0.251030/bin\n          source envsetup.sh\n\n          python3 -m pip install \\\n            mock \\\n            numpy \\\n            opencv-python \\\n            optuna \\\n            packaging \\\n            pandas \\\n            paramiko \\\n            pathlib2 \\\n            pillow \\\n            plotly \\\n            protobuf \\\n            psutil \\\n            pydantic \\\n            pytest \\\n            pyyaml \\\n            rich \\\n            scikit-optimize \\\n            scipy \\\n            six \\\n            tabulate \\\n            typing-extensions \\\n            xlsxwriter\n\n          python3 \"${QNN_SDK_ROOT}/bin/check-python-dependency\" || true\n\n          which python3\n\n      - name: Install onnx dependencies\n        shell: bash\n        run: |\n          source py310/bin/activate\n          
python3 -m pip install --upgrade \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            kaldi_native_fbank \\\n            pip \\\n            \"numpy<2\" \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile \\\n            librosa \\\n            onnxsim \\\n            sentencepiece \\\n            pyyaml\n\n          which python3\n\n      - name: Show qnn-onnx-converter help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-onnx-converter --help\n\n      - name: Show qnn-model-lib-generator help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-model-lib-generator --help\n\n      - name: Show qnn-net-run help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          qnn-net-run --help\n\n      - name: Run SenseVoice from FunAsr\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n          dir=$PWD\n\n          cd scripts/sense-voice/qnn\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/model.pt\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O 
https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ja.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ko.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/README.md\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/LICENSE\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n\n          echo \"----$t---\"\n          python3 ./export-onnx.py --input-len-in-seconds $t --opset-version 17\n\n          ls -lh *.onnx\n\n          python3 ../../pyannote/segmentation/show-onnx.py --filename ./model-$t-seconds.onnx\n\n          echo \"test exported onnx models\"\n\n          echo \"----------$t----------\"\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./en.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./ja.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./ko.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./zh.wav\n\n          echo \"export to qnn\"\n          echo \"----------$t----------\"\n          num_frames=$(python3 
-c \"print(int($t*100 / 6 + 0.5))\")\n\n          echo \"num_frames: $num_frames\"\n\n          ./generate_test_data.py  --num-frames $num_frames --wav ./zh.wav\n          mv input0.raw zh-input0.raw\n          mv input1.raw zh-input1.raw\n          echo \"zh-input0.raw zh-input1.raw\" > input_list.txt\n\n          for w in ja ko en yue; do\n            ./generate_test_data.py  --num-frames $num_frames --wav ./$w.wav\n            mv input0.raw $w-input0.raw\n            mv input1.raw $w-input1.raw\n            echo \"$w-input0.raw $w-input1.raw\" >> input_list.txt\n          done\n\n          cat ./input_list.txt\n\n          qnn-onnx-converter \\\n            --input_network model-$t-seconds.onnx \\\n            --output_path ./model-$t-seconds-quantized \\\n            --out_node logits \\\n            --input_list ./input_list.txt \\\n            --use_native_input_files  \\\n            --input_dtype x float32 \\\n            --input_dtype prompt int32 \\\n            --act_bitwidth 16 \\\n            --bias_bitwidth 32 \\\n            --input_layout x NTF\n          ls -lh\n          mv model-$t-seconds-quantized model-$t-seconds-quantized.cpp\n          echo \"----\"\n          ls -lh\n\n          python3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n            -c \"model-$t-seconds-quantized.cpp\" \\\n            -b \"model-$t-seconds-quantized.bin\" \\\n            -o model_libs > /dev/null 2>&1\n\n          ls -lh model_libs/*/\n\n          readelf -lW model_libs/*/lib*.so\n\n          echo \"Generate context binary\"\n\n          $dir/scripts/qnn/generate_config.py  \\\n            --soc ${{ matrix.soc }} \\\n            --graph-name \"model_${t}_seconds_quantized\" \\\n            --output-dir ./my-config \\\n            --qnn-sdk-root $QNN_SDK_ROOT\n\n          ls -lh my-config\n\n          head -n 1000 my-config/*.json\n\n          $QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n            --backend 
$QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n            --model ./model_libs/x86_64-linux-clang/libmodel-$t-seconds-quantized.so \\\n            --output_dir ./binary \\\n            --binary_file model \\\n            --config_file ./my-config/htp_backend_extensions.json\n\n          ls -lh binary/\n\n          echo \"collect results\"\n\n          d=sherpa-onnx-qnn-${{ matrix.soc}}-binary-$t-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v LICENSE $d\n          cp -v binary/model.bin $d/\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n\n          echo \"num_frames=$num_frames\" > $d/info.txt\n\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n          mv *.tar.bz2 ../../../binary/\n\n\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == x86_64-linux-clang ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              exit -1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v README.md $d\n            cp -v LICENSE $d\n            cp -v model_libs/$p/lib*.so $d/libmodel.so\n            cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n\n            echo \"num_frames=$num_frames\" > $d/info.txt\n            echo \"target=$p\" >> $d/info.txt\n\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../../so/\n\n\n      - name: Run SenseVoice from WSYue-ASR\n        
if: matrix.framework == 'WSYue-ASR'\n        shell: bash\n        run: |\n          dir=$PWD\n          source py310/bin/activate\n\n          pushd qairt/2.40.0.251030/bin\n          source envsetup.sh\n          popd\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n\n          cd scripts/sense-voice/qnn\n\n          curl -SL -O https://huggingface.co/ASLP-lab/WSYue-ASR/resolve/main/sensevoice_small_yue/model.pt\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          for i in $(seq 0 17); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/test_wavs/yue-$i.wav\n          done\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/README.md\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n\n          echo \"----$t---\"\n\n          export model_author=\"ASLP-lab\"\n          export comment=\"ASLP-lab/WSYue-ASR\"\n          export url=\"https://huggingface.co/ASLP-lab/WSYue-ASR/tree/main/sensevoice_small_yue\"\n\n          python3 ./export-onnx.py --input-len-in-seconds $t --opset-version 17\n\n          ls -lh *.onnx\n\n          python3 ../../pyannote/segmentation/show-onnx.py --filename 
./model-$t-seconds.onnx\n\n          echo \"test exported onnx models\"\n\n          echo \"----------$t----------\"\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./en.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./zh.wav\n\n          for i in $(seq 0 17); do\n            echo \"yue-$i.wav\"\n            python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue-$i.wav\n          done\n\n          echo \"export to qnn\"\n          echo \"----------$t----------\"\n          num_frames=$(python3 -c \"print(int($t*100 / 6 + 0.5))\")\n\n          echo \"num_frames: $num_frames\"\n\n          ./generate_test_data.py  --num-frames $num_frames --wav ./zh.wav\n          mv input0.raw zh-input0.raw\n          mv input1.raw zh-input1.raw\n          echo \"zh-input0.raw zh-input1.raw\" > input_list.txt\n\n          for w in en yue; do\n            ./generate_test_data.py  --num-frames $num_frames --wav ./$w.wav\n            mv input0.raw $w-input0.raw\n            mv input1.raw $w-input1.raw\n            echo \"$w-input0.raw $w-input1.raw\" >> input_list.txt\n          done\n\n          for i in $(seq 0 17); do\n            echo \"yue-$i.wav\"\n            ./generate_test_data.py  --num-frames $num_frames --wav ./yue-$i.wav\n            mv input0.raw $i-input0.raw\n            mv input1.raw $i-input1.raw\n            echo \"$i-input0.raw $i-input1.raw\" >> input_list.txt\n          done\n\n          cat ./input_list.txt\n\n          qnn-onnx-converter \\\n            --input_network model-$t-seconds.onnx \\\n            --output_path ./model-$t-seconds-quantized \\\n            --out_node logits \\\n            --input_list ./input_list.txt \\\n            --use_native_input_files  \\\n            --input_dtype x float32 \\\n            
--input_dtype prompt int32 \\\n            --act_bitwidth 16 \\\n            --bias_bitwidth 32 \\\n            --input_layout x NTF\n          ls -lh\n          mv model-$t-seconds-quantized model-$t-seconds-quantized.cpp\n          echo \"----\"\n          ls -lh\n\n          python3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n            -c \"model-$t-seconds-quantized.cpp\" \\\n            -b \"model-$t-seconds-quantized.bin\" \\\n            -o model_libs > /dev/null 2>&1\n\n          ls -lh model_libs/*/\n\n          readelf -lW model_libs/*/lib*.so\n\n          $dir/scripts/qnn/generate_config.py  \\\n            --soc ${{ matrix.soc }} \\\n            --graph-name \"model_${t}_seconds_quantized\" \\\n            --output-dir ./my-config \\\n            --qnn-sdk-root $QNN_SDK_ROOT\n\n          ls -lh my-config\n\n          head -n 1000 my-config/*.json\n\n          $QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n            --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n            --model ./model_libs/x86_64-linux-clang/libmodel-$t-seconds-quantized.so \\\n            --output_dir ./binary \\\n            --binary_file model \\\n            --config_file ./my-config/htp_backend_extensions.json\n\n          ls -lh binary/\n\n          echo \"collect results\"\n\n          d=sherpa-onnx-qnn-${{ matrix.soc }}-binary-$t-seconds-sense-voice-zh-en-ja-ko-yue-2025-09-09-int8\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v binary/model.bin $d/\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n\n          echo \"num_frames=$num_frames\" > $d/info.txt\n\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n          mv *.tar.bz2 ../../../binary/\n\n          echo \"collect results\"\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == 
x86_64-linux-clang ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-sense-voice-zh-en-ja-ko-yue-2025-09-09-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-sense-voice-zh-en-ja-ko-yue-2025-09-09-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              exit 1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v README.md $d\n            cp -v model_libs/$p/lib*.so $d/libmodel.so\n            cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n\n            echo \"num_frames=$num_frames\" > $d/info.txt\n            echo \"target=$p\" >> $d/info.txt\n\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../../so/\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.framework }}-${{ matrix.soc }}-${{ matrix.input_in_seconds }}-seconds\n          path: ./scripts/sense-voice/qnn/*.json\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj' && matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn-binary\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa' && 
matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn-binary\n"
  },
  {
    "path": ".github/workflows/export-sense-voice-to-rknn.yaml",
    "content": "name: export-sense-voice-to-rknn\n\non:\n  push:\n    branches:\n      - export-sense-voice-rknn-ci-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-sense-voice-to-rknn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-sense-voice-to-rknn:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.framework }} ${{ matrix.platform }} ${{ matrix.input_in_seconds }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n        platform: [\"rk3562\", \"rk3566\", \"rk3568\", \"rk3576\", \"rk3588\"]\n        input_in_seconds: [\"5\", \"10\", \"15\", \"20\", \"25\", \"30\"]\n        framework: [\"FunASR\", \"WSYue-ASR\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade \\\n            pip \\\n            \"numpy<2\" \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            librosa \\\n            soundfile \\\n            onnxsim \\\n            sentencepiece \\\n            kaldi_native_fbank\n\n          curl -SL -O https://huggingface.co/csukuangfj/rknn-toolkit2/resolve/main/rknn_toolkit2-2.1.0%2B708089d1-cp310-cp310-linux_x86_64.whl\n          pip install ./*.whl \"numpy<=1.26.4\"\n\n      - name: Run SenseVoice from FunAsr\n        if: matrix.framework == 'FunASR'\n        shell: bash\n        run: |\n          cd scripts/sense-voice/rknn\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O 
https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/model.pt\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ja.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/ko.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/README.md\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/LICENSE\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n          p=${{ matrix.platform }}\n\n          echo \"----$t---\"\n          python3 ./export-onnx.py --input-len-in-seconds $t\n\n          ls -lh *.onnx\n\n          echo \"test exported onnx models\"\n\n          echo \"----------$t----------\"\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./en.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./ja.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./ko.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens 
./tokens.txt --wave ./zh.wav\n\n          echo \"export to rknn\"\n          echo \"----------$t----------\"\n          echo \"----------$p----------\"\n          python3 export-rknn.py --target-platform $p --in-model model-$t-seconds.onnx --out-model model-$p-$t-seconds.rknn >/dev/null  2>&1\n\n          ls -lh *.rknn\n\n          echo \"collect results\"\n          d=sherpa-onnx-$p-$t-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v LICENSE $d\n          cp -v model-$p-$t-seconds.rknn $d/model.rknn\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Run SenseVoice from WSYue-ASR\n        if: matrix.framework == 'WSYue-ASR'\n        shell: bash\n        run: |\n          cd scripts/sense-voice/rknn\n\n          curl -SL -O https://huggingface.co/ASLP-lab/WSYue-ASR/resolve/main/sensevoice_small_yue/model.pt\n\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/am.mvn\n          curl -SL -O https://hf-mirror.com/FunAudioLLM/SenseVoiceSmall/resolve/main/chn_jpn_yue_eng_ko_spectok.bpe.model\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/en.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/yue.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n\n          for i in $(seq 0 17); do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/test_wavs/yue-$i.wav\n          done\n\n  
        rm -f README.md || true\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09/resolve/main/README.md\n\n          echo \"export to onnx\"\n          t=${{ matrix.input_in_seconds }}\n          p=${{ matrix.platform }}\n\n          echo \"----$t---\"\n\n          export model_author=\"ASLP-lab\"\n          export comment=\"ASLP-lab/WSYue-ASR\"\n          export url=\"https://huggingface.co/ASLP-lab/WSYue-ASR/tree/main/sensevoice_small_yue\"\n\n          python3 ./export-onnx.py --input-len-in-seconds $t\n\n          ls -lh *.onnx\n\n          echo \"test exported onnx models\"\n\n          echo \"----------$t----------\"\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./en.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue.wav\n          python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./zh.wav\n          for i in $(seq 0 17); do\n            echo \"yue-$i.wav\"\n            python3 ./test_onnx.py --model model-$t-seconds.onnx --tokens ./tokens.txt --wave ./yue-$i.wav\n          done\n\n          echo \"export to rknn\"\n          echo \"----------$t----------\"\n          echo \"----------$p----------\"\n          python3 export-rknn.py --target-platform $p --in-model model-$t-seconds.onnx --out-model model-$p-$t-seconds.rknn >/dev/null  2>&1\n\n          ls -lh *.rknn\n\n          echo \"collect results\"\n          d=sherpa-onnx-$p-$t-seconds-sense-voice-zh-en-ja-ko-yue-2025-09-09\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v README.md $d\n          cp -v model-$p-$t-seconds.rknn $d/model.rknn\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv 
*.tar.bz2 ../../..\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-silero-vad-rknn.yaml",
    "content": "name: export-silero-vad-to-rknn\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-silero-vad-to-rknn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-silero-vad-to-rknn:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export silero-vad to rknn\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade \\\n            pip \\\n            \"numpy<2\" \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            onnx \\\n            onnxruntime==1.17.1 \\\n            librosa \\\n            soundfile \\\n            onnxsim\n\n          curl -SL -O https://huggingface.co/csukuangfj/rknn-toolkit2/resolve/main/rknn_toolkit2-2.1.0%2B708089d1-cp310-cp310-linux_x86_64.whl\n          pip install ./*.whl \"numpy<=1.26.4\"\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/silero_vad/v4\n          curl -SL -O https://github.com/snakers4/silero-vad/raw/refs/tags/v4.0/files/silero_vad.jit\n          ./export-onnx.py\n          ./show.py\n\n          ls -lh m.onnx\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n          ./test-onnx.py  --model ./m.onnx --wav ./lei-jun-test.wav\n\n          for platform in rk3588 rk3576 rk3568 rk3566 rk3562; do\n          echo \"Platform: $platform\"\n            ./export-rknn.py --in-model ./m.onnx --out-model silero-vad-v4-$platform.rknn  --target-platform $platform\n            ls -lh 
silero-vad-v4-$platform.rknn\n          done\n\n      - name: Collect files\n        shell: bash\n        run: |\n          cd scripts/silero_vad/v4\n          ls -lh\n          mv *.rknn ../../..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.rknn\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Upload model to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n\n            git clone https://huggingface.co/csukuangfj/sherpa-onnx-rknn-models huggingface\n            cd huggingface\n\n            git fetch\n            git pull\n            git lfs track \"*.rknn\"\n            git merge -m \"merge remote\" --ff origin main\n            dst=vad\n            mkdir -p $dst\n            cp ../*.rknn $dst/ || true\n\n            ls -lh $dst\n            git add .\n            git status\n            git commit -m \"update models\"\n            git status\n\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-rknn-models main || true\n            rm -rf huggingface\n"
  },
  {
    "path": ".github/workflows/export-spleeter-to-onnx.yaml",
    "content": "name: export-spleeter-to-onnx\n\non:\n  push:\n    branches:\n      - spleeter-cpp-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-spleeter-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-spleeter-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export spleeter to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          pip install tensorflow torch \"numpy<2\" onnx==1.17.0 onnxruntime==1.17.1 onnxmltools\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/spleeter\n          ./run.sh\n\n          echo \"---\"\n          ls -lh 2stems\n          echo \"---\"\n          ls -lh 2stems/*.onnx\n          echo \"---\"\n\n          mv -v 2stems/*.onnx ../..\n\n      - name: Collect models\n        shell: bash\n        run: |\n          mkdir sherpa-onnx-spleeter-2stems\n          mkdir sherpa-onnx-spleeter-2stems-int8\n          mkdir sherpa-onnx-spleeter-2stems-fp16\n\n          mv -v vocals.onnx sherpa-onnx-spleeter-2stems/\n          mv -v accompaniment.onnx sherpa-onnx-spleeter-2stems/\n\n          mv -v vocals.int8.onnx sherpa-onnx-spleeter-2stems-int8/\n          mv -v accompaniment.int8.onnx sherpa-onnx-spleeter-2stems-int8/\n\n          mv -v vocals.fp16.onnx sherpa-onnx-spleeter-2stems-fp16/\n          mv -v accompaniment.fp16.onnx sherpa-onnx-spleeter-2stems-fp16/\n\n          tar cjvf sherpa-onnx-spleeter-2stems.tar.bz2 sherpa-onnx-spleeter-2stems\n          tar cjvf sherpa-onnx-spleeter-2stems-int8.tar.bz2 
sherpa-onnx-spleeter-2stems-int8\n          tar cjvf sherpa-onnx-spleeter-2stems-fp16.tar.bz2 sherpa-onnx-spleeter-2stems-fp16\n\n          ls -lh *.tar.bz2\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: source-separation-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            names=(\n              sherpa-onnx-spleeter-2stems\n              sherpa-onnx-spleeter-2stems-int8\n              sherpa-onnx-spleeter-2stems-fp16\n            )\n            for d in ${names[@]}; do\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -v $d/*onnx huggingface\n\n              cd huggingface\n              git lfs track \"*.onnx\"\n              git status\n              git add .\n              ls -lh\n              git status\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n              cd ..\n            done\n"
  },
  {
    "path": ".github/workflows/export-supertonic.yaml",
    "content": "name: export-supertonic-to-int8-onnx\n\non:\n  push:\n    branches:\n      - ci-supertonic\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-supertonic-to-int8-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-supertonic-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export supertonic int8\n    runs-on: macos-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          brew install git-xet\n          git xet install\n\n          pip install numpy onnx onnxruntime\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/supertonic\n          ./run.sh\n\n          wget https://raw.githubusercontent.com/supertone-inc/supertonic/refs/heads/main/LICENSE\n          rm README.md\n          wget https://raw.githubusercontent.com/supertone-inc/supertonic/refs/heads/main/README.md\n\n      - name: Collect results\n        shell: bash\n        run: |\n          src=scripts/supertonic\n          d=sherpa-onnx-supertonic-tts-int8-2026-03-06\n\n          mkdir $d\n          cp -a $src/LICENSE $d/\n          cp -a $src/README.md $d/\n          cp -v $src/onnx_int8/*.int8.onnx $d/\n          [ -f $src/assets/onnx/unicode_indexer.bin ] && cp -v $src/assets/onnx/unicode_indexer.bin $d/\n          [ -f $src/assets/onnx/tts.json ] && cp -v $src/assets/onnx/tts.json $d/\n          [ -f $src/assets/voice_styles/voice.bin ] && cp -v $src/assets/voice_styles/voice.bin $d/voice.bin\n          ls -lh $d/\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: 
./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: tts-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            dirs=(\n              sherpa-onnx-supertonic-tts-int8-2026-03-06\n            )\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            for d in ${dirs[@]}; do\n              echo \"d $d\"\n              if [[ ! 
-d $d ]]; then\n                echo \"$d does not exist\"\n                continue\n              fi\n\n              echo \"$d exists\"\n              rm -rf huggingface\n\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d huggingface\n              cd huggingface\n              rm -rf ./*\n\n              git lfs track \"*.onnx\"\n              git lfs track \"*.wav\"\n\n              cp -a ../$d/* ./\n\n              git add .\n\n              ls -lh\n\n              git status\n\n              git commit -m \"add models\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d main || true\n              cd ..\n            done\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in *.tar.bz2; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/tts-models.git ms\n\n              cp -av $m ms/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh\n              git add .\n\n              git commit -m \"add models\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/tts-models.git\n\n              popd\n            done\n"
  },
  {
    "path": ".github/workflows/export-t-one-to-onnx.yaml",
    "content": "name: export-t-one-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-t-one-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-t-one-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export t-one\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install onnx==1.17.0 onnxruntime==1.17.1 soundfile librosa kaldi_native_fbank \"numpy<2\"\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/t-one\n\n          wget https://raw.githubusercontent.com/voicekit-team/T-one/refs/heads/main/LICENSE\n          ./run.sh\n\n          d=sherpa-onnx-streaming-t-one-russian-2025-09-08\n          mkdir $d\n          cp -v ./tokens.txt $d\n          cp -v ./model.onnx $d\n          cp -v ./russian_test_short_from_t_one.wav $d/0.wav\n          cp -v ./LICENSE $d\n          cp -v ./README.md $d\n\n          ls -lh $d\n\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d.tar.bz2\n\n          mv $d.tar.bz2 ../..\n          mv $d ../..\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export 
GIT_CLONE_PROTECTION_ACTIVE=false\n\n            m=sherpa-onnx-streaming-t-one-russian-2025-09-08\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m huggingface\n            cd huggingface\n            git fetch\n            git pull\n            echo \"pwd: $PWD\"\n            ls -lh ../$m\n            git lfs track \"*.wav\"\n\n            rm -rf ./*\n\n            cp -v ../$m/* ./\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$m main || true\n\n            cd ..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-telespeech-ctc.yaml",
    "content": "name: export-telespeech-ctc-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-telespeech-ctc-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-telespeech-ctc-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: telespeech\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install onnx onnxruntime soundfile librosa numpy kaldi-native-fbank\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/tele-speech\n          ./run.sh\n\n          ./test.py\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish float32 model to huggingface\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          src=scripts/tele-speech/sherpa-onnx-telespeech-ctc-zh-2024-06-04\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n\n          GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-telespeech-ctc-zh-2024-06-04 hf\n          cp -a $src/* hf/\n          cd hf\n          git lfs track \"*.pdf\"\n          git lfs track \"*.onnx\"\n          git add .\n 
         git commit -m 'add model files' || true\n          git status\n          ls -lh\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-telespeech-ctc-zh-2024-06-04 main || true\n          rm -rf hf\n\n      - name: Publish int8 model to huggingface\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          src=scripts/tele-speech/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n\n          rm -rf hf\n          GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04 hf\n          cp -a $src/* hf/\n          cd hf\n          git lfs track \"*.pdf\"\n          git lfs track \"*.onnx\"\n          git add .\n          git commit -m 'add model files' || true\n          git status\n          ls -lh\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04 main || true\n"
  },
  {
    "path": ".github/workflows/export-uvr-to-onnx.yaml",
    "content": "name: export-uvr-to-onnx\n\non:\n  push:\n    branches:\n      - uvr\n  workflow_dispatch:\n\nconcurrency:\n  group: export-uvr-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-uvr-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export UVR to ONNX\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<2\" onnx==1.17.0 onnxruntime==1.17.1 onnxmltools kaldi-native-fbank librosa soundfile\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/uvr_mdx\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/audio_example.wav\n          ls -lh audio_example.wav\n          ./run.sh\n\n      - name: Collect mp3 files\n        shell: bash\n        run: |\n          mv -v scripts/uvr_mdx/*.mp3 ./\n          ls -lh *.mp3\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: generated-mp3\n          path: ./*.mp3\n\n      - name: Collect models\n        shell: bash\n        run: |\n          mv -v scripts/uvr_mdx/models/*.onnx ./\n          ls -lh *.onnx\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: source-separation-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        
uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            rm -rf huggingface\n            git clone https://huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n            cd huggingface\n            mkdir -p source-separation-models\n            cp -av ../*.onnx ./source-separation-models\n            git lfs track \"*.onnx\"\n            git status\n            git add .\n            ls -lh\n            git status\n            git commit -m \"add source separation models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n"
  },
  {
    "path": ".github/workflows/export-vits-ljspeech-to-onnx.yaml",
    "content": "name: export-vits-ljspeech-to-onnx\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - 'scripts/vits/**'\n      - '.github/workflows/export-vits-ljspeech-to-onnx.yaml'\n  pull_request:\n    paths:\n      - 'scripts/vits/**'\n      - '.github/workflows/export-vits-ljspeech-to-onnx.yaml'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-vits-ljspeech-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-vits-ljspeech-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: vits ljspeech\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        torch: [\"1.13.0\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          python3 -m pip install -qq torch==${{ matrix.torch }}+cpu -f https://download.pytorch.org/whl/torch_stable.html numpy\n          python3 -m pip install onnxruntime onnx soundfile\n          python3 -m pip install scipy cython unidecode phonemizer\n\n          # required by phonemizer\n          # See https://bootphon.github.io/phonemizer/install.html\n          # To fix the following error: RuntimeError: espeak not installed on your system\n          #\n          sudo apt-get install festival espeak-ng mbrola\n\n\n      - name: export vits ljspeech\n        shell: bash\n        run: |\n          cd scripts/vits\n\n          echo \"Downloading vits\"\n          git clone https://github.com/jaywalnut310/vits\n          pushd vits/monotonic_align\n          python3 setup.py build\n          ls -lh build/\n          ls -lh build/lib*/\n          ls -lh build/lib*/*/\n\n          cp build/lib*/monotonic_align/core*.so .\n          sed -i.bak s/.monotonic_align.core/.core/g ./__init__.py\n          git diff\n          popd\n\n          export PYTHONPATH=$PWD/vits:$PYTHONPATH\n\n          echo \"Download models\"\n\n       
   wget -qq https://huggingface.co/csukuangfj/vits-ljs/resolve/main/pretrained_ljs.pth\n          wget -qq https://huggingface.co/csukuangfj/vits-ljs/resolve/main/lexicon.txt\n          wget -qq https://huggingface.co/csukuangfj/vits-ljs/resolve/main/tokens.txt\n          wget -qq https://huggingface.co/csukuangfj/vits-ljs/resolve/main/test.py\n\n          python3 ./export-onnx-ljs.py --config vits/configs/ljs_base.json --checkpoint ./pretrained_ljs.pth\n          python3 ./test.py\n          ls -lh *.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: test-0.wav\n          path: scripts/vits/test-0.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: test-1.wav\n          path: scripts/vits/test-1.wav\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: test-2.wav\n          path: scripts/vits/test-2.wav\n"
  },
  {
    "path": ".github/workflows/export-vocos.yaml",
    "content": "name: export-vocos-to-onnx\n\non:\n  push:\n    branches:\n      - export-vocos\n\n  workflow_dispatch:\n\nconcurrency:\n  group: export-vocos-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-vocos-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export vocos\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" onnx==1.16.0 onnxruntime==1.17.1 soundfile piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html kaldi_native_fbank\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/vocos\n          ./run.sh\n          ls -lh\n\n      - name: Collect results\n        shell: bash\n        run: |\n          cp -v scripts/vocos/vocos-22khz-univ.onnx .\n          cp -v scripts/vocos/*.wav .\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: generated-waves\n          path: ./*.wav\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone 
https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            d=vocoder-models\n            mkdir -p $d\n\n            cp -a ../vocos-22khz-univ.onnx $d/\n\n            git lfs track \"*.onnx\"\n            git add .\n\n            ls -lh\n\n            git status\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main || true\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: vocoder-models\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          tag: vocoder-models\n\n"
  },
  {
    "path": ".github/workflows/export-wenet-to-onnx.yaml",
    "content": "name: export-wenet-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-wenet-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-wenet-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export wenet\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Run\n        shell: bash\n        run: |\n          sudo apt-get install tree sox\n          cd scripts/wenet\n          ./run.sh\n\n      - name: Publish to huggingface (aishell)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/aishell_u2pp_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/aishell_u2pp_conformer_exp/units.txt tokens.txt\n            cp -v ../scripts/wenet/aishell_u2pp_conformer_exp/README.md .\n\n            if [ ! 
-d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/8k.wav\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add aishell models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-zh-wenet-aishell\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (aishell2)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell2 huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/aishell2_u2pp_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/aishell2_u2pp_conformer_exp/units.txt tokens.txt\n            cp -v ../scripts/wenet/aishell2_u2pp_conformer_exp/README.md .\n\n            if [ ! 
-d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/8k.wav\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add aishell2 models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-aishell2 main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-zh-wenet-aishell2\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (multi_cn)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-multi-cn huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/multi_cn_unified_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/multi_cn_unified_conformer_exp/units.txt tokens.txt\n            cp -v ../scripts/wenet/multi_cn_unified_conformer_exp/README.md .\n\n            if [ ! 
-d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/8k.wav\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add multi_cn models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-multi-cn main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-zh-wenet-multi-cn\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (wenetspeech)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-wenetspeech huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/20220506_u2pp_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/20220506_u2pp_conformer_exp/units.txt tokens.txt\n            cp -v ../scripts/wenet/20220506_u2pp_conformer_exp/README.md .\n\n            if [ ! 
-d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/resolve/main/test_wavs/8k.wav\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add wenetspeech models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-zh-wenet-wenetspeech main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-zh-wenet-wenetspeech\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (librispeech)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-en-wenet-librispeech huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/librispeech_u2pp_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/librispeech_u2pp_conformer_exp/units.txt tokens.txt\n            cp -v ../scripts/wenet/librispeech_u2pp_conformer_exp/README.md .\n\n            if [ ! 
-d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/8k.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/trans.txt\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add librispeech models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-en-wenet-librispeech main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-en-wenet-librispeech\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Publish to huggingface (gigaspeech)\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-en-wenet-gigaspeech huggingface\n            cd huggingface\n            git fetch\n            git pull\n\n            cp -v ../scripts/wenet/20210728_u2pp_conformer_exp/*.onnx .\n            cp -v ../scripts/wenet/20210728_u2pp_conformer_exp/units.txt tokens.txt\n            
cp -v ../scripts/wenet/20210728_u2pp_conformer_exp/README.md .\n\n            if [ ! -d test_wavs ]; then\n              mkdir test_wavs\n              cd test_wavs\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/0.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/1.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/8k.wav\n              wget -q https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-02-21/resolve/main/test_wavs/trans.txt\n              cd ..\n            fi\n            git lfs track \"*.onnx\"\n            git add .\n\n            git commit -m \"add gigaspeech models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-en-wenet-gigaspeech main || true\n\n            cd ..\n\n            rm -rf huggingface/.git\n            dst=sherpa-onnx-en-wenet-gigaspeech\n\n            mv huggingface $dst\n\n            tar cjvf $dst.tar.bz2 $dst\n            rm -rf $dst\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/export-wespeaker-to-onnx.yaml",
    "content": "name: export-wespeaker-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: export-wespeaker-to-onnx-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  export-wespeaker-to-onnx:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: export wespeaker\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install kaldi-native-fbank numpy onnx onnxruntime\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/wespeaker\n          ./run.sh\n\n          mv -v *.onnx ../..\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speaker-recongition-models\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            d=speaker-embedding-models\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n            mv -v ./*.onnx ./huggingface\n            cd huggingface\n          
  git lfs track \"*.onnx\"\n            git status\n            git add .\n            git status\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n"
  },
  {
    "path": ".github/workflows/export-whisper-to-ascend-npu.yaml",
    "content": "name: export-whisper-to-ascend-npu\n\non:\n  push:\n    branches:\n      - fix-ascend-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-whisper-to-ascend-npu-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-ascend/generate_whisper.py\n          MATRIX=$(python3 .github/scripts/export-ascend/generate_whisper.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-whisper-to-ascend-npu:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.model }} ${{ matrix.soc_version }} ${{ matrix.cann }}\n    runs-on: ubuntu-latest\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    container:\n      image: ${{ matrix.image }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.8\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.8\"\n\n      - name: Show Python\n        shell: bash\n        run: |\n          python3 --version\n          which python3\n\n      - name: Install curl\n        shell: bash\n        run: |\n          apt-get update && apt-get install -y curl bzip2 git git-lfs\n\n      - name: Verify environment\n        shell: bash\n        run: |\n          ls -lh 
/usr/local/Ascend/ascend-toolkit/set_env.sh\n\n          find /usr/local/Ascend -name \"libascend*.so\" 2>/dev/null\n\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          echo \"CANN environment:\"\n          which atc || echo \"atc not found\"\n          atc --help\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install \"numpy<2\" \\\n                  onnx==1.17.0 \\\n                  onnxruntime==1.17.1 \\\n                  torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n                  torchaudio==2.0.0+cpu -f https://download.pytorch.org/whl/torchaudio \\\n                  openai-whisper \\\n                  attrs psutil scipy decorator cloudpickle ml-dtypes tornado \\\n                  sentencepiece \\\n                  pyyaml\n\n      - name: export ${{ matrix.model }} to ONNX\n        shell: bash\n        run: |\n          cd scripts/whisper/ascend-npu\n          model=${{ matrix.model }}\n          echo \"model: $model\"\n          if [[ $model == distil-medium.en ]]; then\n            curl -L -s -o distil-medium-en-original-model.bin https://huggingface.co/distil-whisper/distil-medium.en/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == distil-large-v2 ]]; then\n            curl -L -s -o distil-large-v2-original-model.bin https://huggingface.co/distil-whisper/distil-large-v2/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == distil-large-v3 ]]; then\n            curl -L -s -o distil-large-v3-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3-openai/resolve/main/model.bin\n            ls -lh\n          
elif [[ $model == distil-large-v3.5 ]]; then\n            curl -L -s -o distil-large-v3.5-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3.5-openai/resolve/main/model.bin\n            ls -lh\n          elif [[ $model == distil-small.en ]]; then\n            curl -L -s -o distil-small-en-original-model.bin https://huggingface.co/distil-whisper/distil-small.en/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == medium-aishell ]]; then\n            curl -L -s -o medium-aishell.pt https://huggingface.co/yuekai/icefall_asr_aishell_whisper/resolve/main/exp_medium/whisper-medium-aishell1-epoch-10-avg-4.pt\n            ls -lh\n          fi\n          python3 ./export_onnx.py --model ${{ matrix.model }}\n\n\n          ls -lh\n\n          ls -lh ~/.cache/whisper || true\n          ls -lh distil*original-model.bin || true\n          rm -rf ~/.cache/whisper\n          rm -f distil*original-model.bin\n          rm -f medium-aishell.pt\n\n      - name: export ${{ matrix.model }} ONNX to Ascend OM\n        shell: bash\n        run: |\n          cd scripts/whisper/ascend-npu\n          ls -lh *.onnx\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          model=${{ matrix.model }}\n\n          atc --model=./${model}-encoder.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=${model}-encoder \\\n            --input_format=ND \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          atc --model=./${model}-decoder.onnx \\\n            
--framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=${model}-decoder \\\n            --input_format=ND \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          ls -lh *.om\n\n          rm -v *.onnx\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-${cann}-whisper-$model\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          pushd $d/test_wavs\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/8k.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/trans.txt\n          popd\n\n          cp -v $model-encoder*.om $d/${model}-encoder.om\n          cp -v $model-decoder*.om $d/${model}-decoder.om\n          cp -v $model-tokens.txt $d/\n          cp -v test_om.py $d\n          ls -lh $d\n\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../../..\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-ascend\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: 
asr-models-ascend\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n\n              d=asr-models/ascend-npu/whisper\n              mkdir -p huggingface/$d\n\n              cp -v $m huggingface/$d/\n\n              pushd huggingface\n              git lfs track \"*.tar.bz2\"\n              ls -lh $d/$m\n\n              ls -lh $d\n\n              pushd $d\n              git lfs track \"*.tar.bz2\"\n              popd\n\n              git status\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n              popd\n            done\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            models=(\n              sherpa-onnx-ascend-${{ matrix.soc_version }}-cann-${{ matrix.cann }}-whisper-${{ matrix.model }}.tar.bz2\n            )\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export 
GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              d=ascend-npu/whisper\n              mkdir -p ms/$d\n\n              cp -av $m ms/$d/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh $d/$m\n\n              ls -lh $d\n\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n            rm -rf ms\n"
  },
  {
    "path": ".github/workflows/export-whisper-to-onnx.yaml",
    "content": "name: export-whisper-to-onnx\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: release-whisper-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  release-whisper-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.model }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        model: [\"turbo\", \"distil-medium.en\", \"distil-small.en\",  \"tiny.en\", \"base.en\", \"small.en\", \"medium.en\", \"tiny\", \"base\", \"small\", \"medium\", \"medium-aishell\", \"large\", \"large-v1\", \"large-v2\", \"large-v3\", \"distil-large-v2\", \"distil-large-v3\", \"distil-large-v3.5\"]\n        # model: [\"large\", \"large-v1\", \"large-v2\", \"large-v3\", \"distil-large-v2\"]\n        # model: [\"distil-large-v3.5\", \"distil-large-v3\"]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          python3 -m pip install torch==1.13.0 torchaudio==0.13.0 -f https://download.pytorch.org/whl/cpu/torch_stable.html\n          python3 -m pip install -U openai-whisper\n          python3 -m pip install onnxruntime onnx soundfile librosa\n\n      - name: export ${{ matrix.model }}\n        shell: bash\n        run: |\n          cd scripts/whisper\n          model=${{ matrix.model }}\n          echo \"model: $model\"\n          if [[ $model == distil-medium.en ]]; then\n            wget -q -O distil-medium-en-original-model.bin https://huggingface.co/distil-whisper/distil-medium.en/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == distil-large-v2 ]]; then\n            wget -q -O distil-large-v2-original-model.bin 
https://huggingface.co/distil-whisper/distil-large-v2/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == distil-large-v3 ]]; then\n            wget -q -O distil-large-v3-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3-openai/resolve/main/model.bin\n            ls -lh\n          elif [[ $model == distil-large-v3.5 ]]; then\n            wget -q -O distil-large-v3.5-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3.5-openai/resolve/main/model.bin\n            ls -lh\n          elif [[ $model == distil-small.en ]]; then\n            wget -q -O distil-small-en-original-model.bin https://huggingface.co/distil-whisper/distil-small.en/resolve/main/original-model.bin\n            ls -lh\n          elif [[ $model == medium-aishell ]]; then\n            wget -q -O medium-aishell.pt https://huggingface.co/yuekai/icefall_asr_aishell_whisper/resolve/main/exp_medium/whisper-medium-aishell1-epoch-10-avg-4.pt\n            ls -lh\n          fi\n          python3 ./export-onnx.py --model ${{ matrix.model }}\n          # python3 -m onnxruntime.tools.convert_onnx_models_to_ort --optimization_style=Fixed ./\n          #\n\n\n          ls -lh\n\n          ls -lh ~/.cache/whisper || true\n          ls -lh distil*original-model.bin || true\n          rm -rf ~/.cache/whisper\n          rm -f distil*original-model.bin\n          rm -f medium-aishell.pt\n\n          src=sherpa-onnx-whisper-${{ matrix.model }}\n\n          cd ..\n          mkdir $src\n          mv -v whisper/$model* $src/\n\n          echo \"------------------------------\"\n\n          cd $src\n          du -h -d1 .\n          ls -lh\n          mkdir -p test_wavs\n          cd test_wavs\n          wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/0.wav\n          wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/1.wav\n          wget -q 
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/8k.wav\n          wget -q https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en/resolve/main/test_wavs/trans.txt\n          cd ../..\n          mv $src ../\n          echo \"pwd: $PWD\"\n\n          cd ../\n          echo \"--------------------\"\n          ls -lh\n          ls -lh $src\n          echo \"--------------------\"\n\n          if [[ $model == medium-aishell ]]; then\n            ls -lh *.onnx # the float32 onnx model for medium-aishell is too large to be uploaded to GitHub\n            mkdir -p bak\n            mv -v $src/$model-encoder.onnx ./bak\n            mv -v $src/$model-decoder.onnx ./bak\n            ls -lh $src\n\n            tar cvjf $src.tar.bz2 $src\n            mv -v ./bak/* $src/\n            rm -rf bak\n          elif [[ -f $src/$model-encoder.weights ]]; then\n            # we only publish int8 models to GitHub for large Whisper models\n            mkdir -p bak\n            mv -v $src/*weights ./bak\n            mv -v $src/$model-encoder.onnx ./bak\n            mv -v $src/$model-decoder.onnx ./bak\n            ls -lh $src\n\n            tar cvjf $src.tar.bz2 $src\n            mv -v ./bak/* $src/\n            rm -rf bak\n          else\n            tar cvjf $src.tar.bz2 $src\n          fi\n\n          ls -lh *.tar.bz2\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar*\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Publish ${{ matrix.model }} to huggingface\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          src=sherpa-onnx-whisper-${{ matrix.model }}\n\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun 
Kuang\"\n\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n\n          export GIT_LFS_SKIP_SMUDGE=1\n\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-whisper-${{ matrix.model }} huggingface\n\n          rm -rf huggingface/*\n\n          cp -av $src/* ./huggingface/\n\n          cd huggingface\n\n          git status\n          ls -lh\n          git lfs track \"*.wav*\"\n          git lfs track \"*onnx*\"\n          git lfs track \"*weights*\"\n\n          git add .\n          git commit -m \"upload ${{ matrix.model }}\"\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-whisper-${{ matrix.model }} main\n\n      - name: Test float32 ${{ matrix.model }}\n        shell: bash\n        run: |\n          python3 -m pip install kaldi-native-fbank\n          model=${{ matrix.model }}\n          src=sherpa-onnx-whisper-$model\n          time python3 scripts/whisper/test.py \\\n            --encoder $src/$model-encoder.onnx \\\n            --decoder $src/$model-decoder.onnx \\\n            --tokens $src/$model-tokens.txt \\\n            $src/test_wavs/0.wav\n\n      - name: Test int8 ${{ matrix.model }}\n        shell: bash\n        run: |\n          model=${{ matrix.model }}\n          src=sherpa-onnx-whisper-$model\n          time python3 scripts/whisper/test.py \\\n            --encoder $src/$model-encoder.int8.onnx \\\n            --decoder $src/$model-decoder.int8.onnx \\\n            --tokens $src/$model-tokens.txt \\\n            $src/test_wavs/0.wav\n"
  },
  {
    "path": ".github/workflows/export-zipformer-ctc-to-ascend-20250703.yaml",
    "content": "name: export-zipformer-ctc-to-ascend-npu-20250703\n\non:\n  push:\n    branches:\n      - fix-ascend-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-zipformer-ctc-to-ascend-npu-20250703-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-ascend/generate_zipformer_ctc_20250703.py\n          MATRIX=$(python3 .github/scripts/export-ascend/generate_zipformer_ctc_20250703.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-zipformer-ctc-to-ascend-npu-20250703:\n    needs: generate_build_matrix\n    name: ${{ matrix.soc_version }} ${{ matrix.cann }} ${{ matrix.num_seconds }}\n    runs-on: ubuntu-latest\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    container:\n      image: ${{ matrix.image }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.8\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.8\"\n\n      - name: Show Python\n        shell: bash\n        run: |\n          python3 --version\n          which python3\n\n      - name: Install curl\n        shell: bash\n        run: apt-get update && apt-get install -y curl bzip2 git git-lfs\n\n      - name: Verify environment\n        shell: bash\n        run: |\n          ls -lh /usr/local/Ascend/ascend-toolkit/set_env.sh\n\n          
find /usr/local/Ascend -name \"libascend*.so\" 2>/dev/null\n\n\n          source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          echo \"CANN environment:\"\n          which atc || echo \"atc not found\"\n          atc --help\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install \"numpy<2\" \\\n                  onnx==1.17.0 \\\n                  torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n                  attrs psutil scipy decorator cloudpickle ml-dtypes tornado \\\n                  sentencepiece \\\n                  pyyaml\n\n      - name: Export ${{ matrix.num_seconds }}\n        shell: bash\n        run: |\n          mkdir tmp\n          cd tmp\n\n          t=${{ matrix.num_seconds }}\n          num_frames=$(($t*100))\n\n          echo \"num_frames: $num_frames\"\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/generate_test_data.py\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/test.py\n          chmod +x generate_test_data.py\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/8k.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/tokens.txt\n\n\n  
        source /usr/local/Ascend/ascend-toolkit/set_env.sh\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/linux/x86_64:$LD_LIBRARY_PATH\n\n          # for cann 7.0.0\n          export LD_LIBRARY_PATH=/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/x86_64:$LD_LIBRARY_PATH\n\n          soc_version=${{ matrix.soc_version }}\n          cann=${{ matrix.cann }}\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/model-$t-seconds.onnx\n          mv model-$t-seconds.onnx model.onnx\n\n          atc --model=./model.onnx \\\n            --framework=5 \\\n            --host_env_os=linux \\\n            --host_env_cpu=aarch64 \\\n            --output=model \\\n            --input_format=ND \\\n            --input_shape=\"x:1,${num_frames},80\" \\\n            --soc_version=\"Ascend${soc_version}\"\n\n          rm -v *.onnx\n\n          ls -lh *.om\n\n          echo \"collect results\"\n          d=sherpa-onnx-ascend-${soc_version}-cann-${cann}-$t-seconds-zipformer-ctc-zh-2025-07-03\n\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n\n          cp -v model_linux_aarch64.om $d/model.om || cp -v model.om $d/model.om\n          cp -v tokens.txt $d\n          cp -v ../scripts/zipformer-ctc/ascend/2025-07-03/onnx_test.py $d\n          cp -v ../scripts/zipformer-ctc/ascend/2025-07-03/test_om.py $d\n          cp -v *.wav $d/test_wavs\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n\n          rm -v *.om\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ 
secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-ascend\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          tag: asr-models-ascend\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in \"*.tar.bz2\"; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models huggingface\n\n              d=asr-models/ascend-npu/zipformer-ctc\n              mkdir -p huggingface/$d\n\n              cp -v $m huggingface/$d/\n\n              pushd huggingface\n              git lfs track \"*.tar.bz2\"\n              ls -lh $d\n              pushd $d\n              git lfs track \"*.tar.bz2\"\n              popd\n\n              git status\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/k2-fsa/sherpa-onnx-models main\n              popd\n            done\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global 
user.name \"Fangjun Kuang\"\n            for m in *.tar.bz2; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              d=ascend-npu/zipformer-ctc\n              mkdir -p ms/$d\n\n              cp -av \"$m\" ms/$d/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh \"$d/$m\"\n\n              ls -lh $d\n              git add .\n\n              git commit -m \"add $m\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n            rm -rf ms\n"
  },
  {
    "path": ".github/workflows/export-zipformer-ctc-to-qnn-20250703.yaml",
    "content": "name: export-zipformer-ctc-to-qnn-20250703\n\non:\n  push:\n    branches:\n      - zipformer-qnn-model-2\n  workflow_dispatch:\n\nconcurrency:\n  group: export-zipformer-ctc-to-qnn-20250703-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_build_matrix:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    # see https://github.com/pytorch/pytorch/pull/50633\n    runs-on: ubuntu-latest\n    outputs:\n      matrix: ${{ steps.set-matrix.outputs.matrix }}\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Generating build matrix\n        id: set-matrix\n        run: |\n          # outputting for debugging purposes\n          python3 .github/scripts/export-qnn/generate_zipformer.py\n          MATRIX=$(python3 .github/scripts/export-qnn/generate_zipformer.py)\n\n          # deprecated\n          # echo \"::set-output name=matrix::${MATRIX}\"\n          echo \"matrix=$MATRIX\" >> $GITHUB_OUTPUT\n\n  export-zipformer-ctc-to-qnn-20250703:\n    needs: generate_build_matrix\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: ${{ matrix.model_name }} ${{ matrix.input_in_seconds }} ${{ matrix.soc }}\n    runs-on: ubuntu-22.04\n    strategy:\n      fail-fast: false\n      matrix:\n        ${{ fromJson(needs.generate_build_matrix.outputs.matrix) }}\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Setup Python 3.10\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Create directories\n        shell: bash\n        run: |\n          mkdir so binary\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Create Python virtual environment\n        shell: bash\n        run: |\n          python3 -m 
venv py310\n          which python3\n          source py310/bin/activate\n          which python3\n\n      - name: Show ndk-build help\n        shell: bash\n        run: |\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          ndk-build --help\n\n      - name: Download toolkit\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/qnn-toolkit/resolve/main/v2.33.0.250327.zip\n          ls -lh v2.33.0.250327.zip\n\n      - name: Unzip toolkit\n        shell: bash\n        run: |\n          unzip v2.33.0.250327.zip\n\n      - name: Show\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---ls -lh qairt---\"\n\n          ls -lh qairt\n\n          echo \"---\"\n\n      - name: Install linux dependencies\n        shell: bash\n        run: |\n          ls -lh\n\n          echo \"---\"\n\n          ls -lh qairt\n\n          cd qairt/2.33.0.250327/bin\n          source envsetup.sh\n\n          yes | sudo ${QNN_SDK_ROOT}/bin/check-linux-dependency.sh || true\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          cd qairt/2.33.0.250327/bin\n          source envsetup.sh\n\n          python3 -m pip install \\\n            mock \\\n            numpy \\\n            opencv-python \\\n            optuna \\\n            packaging \\\n            pandas \\\n            paramiko \\\n            pathlib2 \\\n            pillow \\\n            plotly \\\n            protobuf \\\n            psutil \\\n            pydantic \\\n            pytest \\\n            pyyaml \\\n            rich \\\n            scikit-optimize \\\n            scipy \\\n            six \\\n            tabulate \\\n            typing-extensions \\\n            xlsxwriter\n\n          python3 \"${QNN_SDK_ROOT}/bin/check-python-dependency\" || true\n\n          which python3\n\n      - name: Install onnx dependencies\n        shell: bash\n        run: |\n          
source py310/bin/activate\n          python3 -m pip install --upgrade \\\n            torch==2.0.0+cpu -f https://download.pytorch.org/whl/torch \\\n            kaldi_native_fbank \\\n            pip \\\n            \"numpy<2\" \\\n            onnx==1.17.0 \\\n            onnxruntime==1.17.1 \\\n            soundfile \\\n            librosa \\\n            onnxsim \\\n            sentencepiece \\\n            pyyaml\n\n          which python3\n\n      - name: Show qnn-onnx-converter help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.33.0.250327/bin\n          source envsetup.sh\n          popd\n\n          qnn-onnx-converter --help\n\n      - name: Show qnn-model-lib-generator help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.33.0.250327/bin\n          source envsetup.sh\n          popd\n\n          qnn-model-lib-generator --help\n\n      - name: Show qnn-net-run help\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.33.0.250327/bin\n          source envsetup.sh\n          popd\n\n          qnn-net-run --help\n\n      - name: Run ${{ matrix.input_in_seconds }}\n        if: matrix.model_name == '20250703'\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.33.0.250327/bin\n          source envsetup.sh\n          popd\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n          dir=$PWD\n\n          mkdir tmp\n\n          cd tmp\n\n          t=${{ matrix.input_in_seconds }}\n          num_frames=$(($t*100))\n\n          echo \"num_frames: $num_frames\"\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/generate_test_data.py\n          curl -SL -O 
https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/test.py\n          chmod +x generate_test_data.py\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/8k.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/tokens.txt\n\n          ./generate_test_data.py --num-frames $num_frames --wav 0.wav\n          ./generate_test_data.py --num-frames $num_frames --wav 1.wav\n          ./generate_test_data.py --num-frames $num_frames --wav 8k.wav\n\n          echo -e \"0.raw\\n1.raw\\n8k.raw\" > input_list.txt\n\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/model-$t-seconds.onnx\n\n          python3 ../scripts/pyannote/segmentation/show-onnx.py --filename ./model-$t-seconds.onnx\n\n\n          echo \"export to qnn\"\n          echo \"----------$t----------\"\n\n          qnn-onnx-converter \\\n            --input_network model-$t-seconds.onnx \\\n            --output_path ./model-$t-seconds-quantized \\\n            --out_node log_probs \\\n            --input_list ./input_list.txt \\\n            --use_native_input_files  \\\n            --input_dtype x float32 \\\n            --act_bitwidth 16 \\\n            --bias_bitwidth 32 \\\n            --input_layout x NTF\n\n          ls -lh\n          mv model-$t-seconds-quantized model-$t-seconds-quantized.cpp\n          echo \"----\"\n          ls -lh\n\n          python3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n            -c \"model-$t-seconds-quantized.cpp\" \\\n       
     -b \"model-$t-seconds-quantized.bin\" \\\n            -o model_libs > /dev/null 2>&1\n\n          ls -lh model_libs/*/\n\n          readelf -lW model_libs/*/lib*.so\n\n          echo \"Generate context binary\"\n\n          $dir/scripts/qnn/generate_config.py  \\\n            --soc ${{ matrix.soc }} \\\n            --graph-name \"model_${t}_seconds_quantized\" \\\n            --output-dir ./my-config \\\n            --qnn-sdk-root $QNN_SDK_ROOT\n\n          ls -lh my-config\n\n          head -n 1000 my-config/*.json\n\n          $QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n            --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n            --model ./model_libs/x86_64-linux-clang/libmodel-$t-seconds-quantized.so \\\n            --output_dir ./binary \\\n            --binary_file model \\\n            --config_file ./my-config/htp_backend_extensions.json\n\n          ls -lh binary/\n\n          echo \"collect results\"\n\n          d=sherpa-onnx-qnn-${{ matrix.soc}}-binary-$t-seconds-zipformer-ctc-zh-2025-07-03-int8\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n          cp -v binary/model.bin $d/\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          echo \"num_frames=$num_frames\" > $d/info.txt\n\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n          mv *.tar.bz2 ../binary/\n\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == x86_64-linux-clang ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-zipformer-ctc-zh-2025-07-03-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              exit -1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v model_libs/$p/lib*.so $d/libmodel.so\n  
          cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n\n            echo \"num_frames=$num_frames\" > $d/info.txt\n            echo \"target=$p\" >> $d/info.txt\n\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../so\n\n      - name: Run ${{ matrix.input_in_seconds }}\n        if: matrix.model_name == '20251222'\n        shell: bash\n        run: |\n          source py310/bin/activate\n\n          pushd qairt/2.33.0.250327/bin\n          source envsetup.sh\n          popd\n\n          export PATH=${ANDROID_NDK_LATEST_HOME}:$PATH\n          export LDFLAGS=\"-Wl,-z,max-page-size=16384\"\n          dir=$PWD\n\n          mkdir tmp\n\n          cd tmp\n\n          t=${{ matrix.input_in_seconds }}\n          num_frames=$(($t*100))\n\n          echo \"num_frames: $num_frames\"\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/generate_test_data.py\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/test.py\n          chmod +x generate_test_data.py\n\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/0.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/1.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/8k.wav\n          curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-ctc-zh-2025-07-03-source-models/resolve/main/tokens.txt\n\n          ./generate_test_data.py --num-frames $num_frames --wav 0.wav\n          ./generate_test_data.py --num-frames $num_frames --wav 1.wav\n          ./generate_test_data.py 
--num-frames $num_frames --wav 8k.wav\n\n          echo -e \"0.raw\\n1.raw\\n8k.raw\" > input_list.txt\n\n          curl -SL -O https://huggingface.co/csukuangfj/2025-12-22/resolve/main/zipformer-ctc-models/model-$t-seconds.onnx\n\n          python3 ../scripts/pyannote/segmentation/show-onnx.py --filename ./model-$t-seconds.onnx\n\n\n          echo \"export to qnn\"\n          echo \"----------$t----------\"\n\n          qnn-onnx-converter \\\n            --input_network model-$t-seconds.onnx \\\n            --output_path ./model-$t-seconds-quantized \\\n            --out_node log_probs \\\n            --input_list ./input_list.txt \\\n            --use_native_input_files  \\\n            --input_dtype x float32 \\\n            --act_bitwidth 16 \\\n            --bias_bitwidth 32 \\\n            --input_layout x NTF\n\n          ls -lh\n          mv model-$t-seconds-quantized model-$t-seconds-quantized.cpp\n          echo \"----\"\n          ls -lh\n\n          python3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n            -c \"model-$t-seconds-quantized.cpp\" \\\n            -b \"model-$t-seconds-quantized.bin\" \\\n            -o model_libs > /dev/null 2>&1\n\n          ls -lh model_libs/*/\n\n          readelf -lW model_libs/*/lib*.so\n\n          echo \"Generate context binary\"\n\n          $dir/scripts/qnn/generate_config.py  \\\n            --soc ${{ matrix.soc }} \\\n            --graph-name \"model_${t}_seconds_quantized\" \\\n            --output-dir ./my-config \\\n            --qnn-sdk-root $QNN_SDK_ROOT\n\n          ls -lh my-config\n\n          head -n 1000 my-config/*.json\n\n          $QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n            --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n            --model ./model_libs/x86_64-linux-clang/libmodel-$t-seconds-quantized.so \\\n            --output_dir ./binary \\\n            --binary_file model \\\n            --config_file 
./my-config/htp_backend_extensions.json\n\n          ls -lh binary/\n\n          d=sherpa-onnx-qnn-${{ matrix.soc}}-binary-$t-seconds-zipformer-ctc-zh-2025-12-22-int8\n          mkdir -p $d\n          mkdir -p $d/test_wavs\n          cp -v binary/model.bin $d/\n          cp -v tokens.txt $d\n          cp -v *.wav $d/test_wavs\n          echo \"num_frames=$num_frames\" > $d/info.txt\n\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n          rm -rf $d\n          mv *.tar.bz2 ../binary/\n\n          echo \"collect results\"\n\n          for p in x86_64-linux-clang aarch64-android; do\n            if [[ $p == x86_64-linux-clang ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-zipformer-ctc-zh-2025-12-22-int8-linux-x64\n            elif [[ $p == aarch64-android ]]; then\n              d=sherpa-onnx-qnn-$t-seconds-zipformer-ctc-zh-2025-12-22-int8-android-aarch64\n            else\n              echo \"Unknown $p\"\n              exit -1\n            fi\n\n            mkdir -p $d\n            mkdir -p $d/test_wavs\n\n            cp -v model_libs/$p/lib*.so $d/libmodel.so\n            cp -v tokens.txt $d\n            cp -v *.wav $d/test_wavs\n\n            echo \"num_frames=$num_frames\" > $d/info.txt\n            echo \"target=$p\" >> $d/info.txt\n\n            ls -lh $d\n            tar cjfv $d.tar.bz2 $d\n            ls -lh *.tar.bz2\n            rm -rf $d\n          done\n\n          echo \"----show---\"\n          ls -lh *.tar.bz2\n\n          mv *.tar.bz2 ../so\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.model_name }}-${{ matrix.soc }}-${{ matrix.input_in_seconds }}-seconds\n          path: ./tmp/*.json\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj' && matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          repo_name: 
k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'csukuangfj'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models-qnn-binary\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa' && matrix.soc == 'SM8850'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./so/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./binary/*.tar.bz2\n          overwrite: true\n          tag: asr-models-qnn-binary\n"
  },
  {
    "path": ".github/workflows/flutter-android.yaml",
    "content": "name: flutter-android\n\non:\n  push:\n    branches:\n      - flutter\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: flutter-android-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  asr:\n    name: asr ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"3\"]\n        index: [\"0\", \"1\", \"2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Set up JDK 17\n        uses: actions/setup-java@v3\n        with:\n          distribution: 'temurin'\n          java-version: '17'\n\n      - name: Check Java version\n        run: |\n          java -version\n          echo $JAVA_HOME\n\n      - name: Set JAVA_HOME for Gradle\n        run: echo \"JAVA_HOME=$JAVA_HOME\" >> $GITHUB_ENV\n\n      - name: Check Java version\n        run: |\n          java -version\n          echo $JAVA_HOME\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n        
  python3 -m pip install --upgrade pip jinja2 iso639-lang\n\n      - name: Install deps\n        shell: bash\n        run: |\n          sudo apt-get update -y\n          sudo apt-get install -y build-essential jq git cmake\n          sudo apt-get install -y curl\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.29.0\n          cache: true\n\n      - name: Install ninja\n        shell: bash\n        run: |\n          sudo apt-get install -y ninja-build\n\n      - name: Display ninja version\n        shell: bash\n        run: |\n          ninja --version\n          ninja --help || true\n          which ninja\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n\n          git config --global --add safe.directory /__t/flutter-Linux-*/flutter || true\n\n          flutter --version\n\n          dart --version\n          flutter doctor\n\n      - name: Install libgtk-3-dev\n        shell: bash\n        run: |\n          sudo apt install -y libgtk-3-dev tree clang pkg-config\n\n      - name: Accept Android licenses\n        run: yes | $ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager --licenses\n\n      - name: Install Android SDK Components\n        run: |\n          $ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager \"platforms;android-35\" \"build-tools;35.0.0\"\n\n      - name: Install NDK 27\n        run: |\n          $ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager \"ndk;27.0.12077973\"\n\n      - name: Display flutter info (2)\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          
dart --version\n          flutter doctor\n\n          cd ..\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-streaming-asr.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-android-streaming-asr.sh\n\n          cd ../../\n\n          ls -lh *.apk\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.apk\n\n          mkdir apks\n\n          mv -v *.apk ./apks\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 
'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/asr/android/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../apks/*.apk $dst\n\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n\n  tts:\n    name: tts ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"15\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Set up JDK 17\n        uses: actions/setup-java@v3\n        with:\n          distribution: 'temurin'\n          java-version: '17'\n\n      - name: Check Java 
version\n        run: |\n          java -version\n          echo $JAVA_HOME\n\n      - name: Set JAVA_HOME for Gradle\n        run: echo \"JAVA_HOME=$JAVA_HOME\" >> $GITHUB_ENV\n\n      - name: Check Java version\n        run: |\n          java -version\n          echo $JAVA_HOME\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display NDK HOME\n        shell: bash\n        run: |\n          echo \"ANDROID_NDK_LATEST_HOME: ${ANDROID_NDK_LATEST_HOME}\"\n          ls -lh ${ANDROID_NDK_LATEST_HOME}\n\n      - name: Setup build tool version variable\n        shell: bash\n        run: |\n          echo \"---\"\n          ls -lh /usr/local/lib/android/\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk\n          echo \"---\"\n\n          ls -lh /usr/local/lib/android/sdk/build-tools\n          echo \"---\"\n\n          BUILD_TOOL_VERSION=$(ls /usr/local/lib/android/sdk/build-tools/ | tail -n 1)\n          echo \"BUILD_TOOL_VERSION=$BUILD_TOOL_VERSION\" >> $GITHUB_ENV\n          echo \"Last build tool version is: $BUILD_TOOL_VERSION\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2 iso639-lang\n\n      - name: Install deps\n        shell: bash\n        run: |\n          sudo apt-get update -y\n          sudo apt-get install -y build-essential jq git cmake\n          sudo apt-get install -y curl\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.29.0\n          cache: true\n\n      - name: Install ninja\n        shell: bash\n        run: |\n          sudo apt-get install -y ninja-build\n\n      - name: Display ninja version\n        shell: bash\n        run: |\n          ninja --version\n          ninja --help || true\n          which ninja\n\n      - name: Display PWD\n        shell: 
bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n\n          git config --global --add safe.directory /__t/flutter-Linux-*/flutter || true\n\n          flutter --version\n\n          dart --version\n          flutter doctor\n\n      - name: Install libgtk-3-dev\n        shell: bash\n        run: |\n          sudo apt install -y libgtk-3-dev tree clang pkg-config\n\n      - name: Accept Android licenses\n        run: yes | $ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager --licenses\n\n      - name: Install Android SDK Components\n        run: |\n          $ANDROID_SDK_ROOT/cmdline-tools/latest/bin/sdkmanager \"platforms;android-35\" \"build-tools;35.0.0\"\n\n      - name: Display flutter info (2)\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n          cd ..\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-android-tts.sh\n\n          cd ../../\n\n          ls -lh *.apk\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.apk\n\n          mkdir apks\n\n          mv -v *.apk ./apks\n\n      # https://github.com/marketplace/actions/sign-android-release\n      - uses: r0adkll/sign-android-release@v1\n        name: Sign app APK\n        with:\n          releaseDirectory: ./apks\n          signingKeyBase64: ${{ secrets.ANDROID_SIGNING_KEY }}\n          alias: ${{ secrets.ANDROID_SIGNING_KEY_ALIAS }}\n          
keyStorePassword: ${{ secrets.ANDROID_SIGNING_KEY_STORE_PASSWORD }}\n        env:\n          BUILD_TOOLS_VERSION: ${{ env.BUILD_TOOL_VERSION }}\n\n      - name: Display APK after signing\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Rename APK after signing\n        shell: bash\n        run: |\n          cd apks\n          rm -fv signingKey.jks\n          rm -fv *.apk.idsig\n          rm -fv *-aligned.apk\n\n          all_apks=$(ls -1 *-signed.apk)\n          echo \"----\"\n          echo $all_apks\n          echo \"----\"\n          for apk in ${all_apks[@]}; do\n            n=$(echo $apk | sed -e s/-signed//)\n            mv -v $apk $n\n          done\n\n          cd ..\n\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Display APK after rename\n        shell: bash\n        run: |\n          ls -lh ./apks/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            
cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/tts/android/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../apks/*.apk $dst\n\n            git status\n            git lfs track \"*.apk\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n"
  },
  {
    "path": ".github/workflows/flutter-linux.yaml",
    "content": "name: flutter-linux\n\non:\n  push:\n    branches:\n      - flutter\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: flutter-linux-${{ github.ref }}\n  cancel-in-progress: true\n\n# See https://github.com/actions/checkout/issues/1590#issuecomment-2207052044\n# and\n# https://github.blog/changelog/2023-06-13-github-actions-all-actions-will-run-on-node16-instead-of-node12-by-default/\nenv:\n  ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true\n\njobs:\n  asr:\n    name: asr ${{ matrix.arch }} ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        arch: [x86_64]\n        total: [\"3\"]\n        index: [\"0\", \"1\", \"2\"]\n\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --break-system-packages --upgrade pip jinja2\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.24.3\n          cache: true\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Install libgtk-3-dev\n        shell: bash\n        run: |\n          sudo apt install -y libgtk-3-dev tree clang pkg-config\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n    
      which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          export arch=${{ matrix.arch }}\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-streaming-asr.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-linux-streaming-asr.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/asr/linux/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track 
\"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n\n  tts:\n    name: tts ${{ matrix.arch }} ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        arch: [x86_64]\n        total: [\"20\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\"]\n\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --break-system-packages --upgrade pip jinja2\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.24.3\n          cache: true\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Install libgtk-3-dev\n        shell: bash\n        run: |\n          sudo apt install -y libgtk-3-dev tree clang pkg-config\n\n      - name: Install deps\n        shell: bash\n        run: |\n          sudo apt-get update -y\n          sudo apt-get install -y build-essential jq git python3-pip\n          sudo apt-get install -y curl\n          sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev 
libunwind-dev\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          export arch=${{ matrix.arch }}\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-linux-tts.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/tts/linux/$SHERPA_ONNX_VERSION\n            mkdir -p 
$dst\n\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n"
  },
  {
    "path": ".github/workflows/flutter-macos.yaml",
    "content": "name: flutter-macos\n\non:\n  push:\n    branches:\n      - flutter\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: flutter-macos-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  asr:\n    name: asr ${{ matrix.arch }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        arch: [x86_64, arm64]\n        total: [\"3\"]\n        index: [\"0\", \"1\", \"2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --break-system-packages --upgrade pip jinja2\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.24.3\n          cache: true\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          export arch=${{ matrix.arch }}\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-streaming-asr.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-macos-streaming-asr.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.tar.bz2\n\n      - 
name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/asr/macos/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n\n  tts:\n    name: tts ${{ matrix.arch }} ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        arch: [x86_64, arm64]\n        total: [\"10\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: 
Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --break-system-packages --upgrade pip jinja2 iso639-lang\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.24.3\n          cache: true\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          export arch=${{ matrix.arch }}\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-macos-tts.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n  
          rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n            dst=flutter/tts/macos/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n"
  },
  {
    "path": ".github/workflows/flutter-windows-x64.yaml",
    "content": "name: flutter-windows-x64\n\non:\n  push:\n    branches:\n      - flutter\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: flutter-windows-x64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  asr:\n    name: asr ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        total: [\"3\"]\n        index: [\"0\", \"1\", \"2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v4\n        with:\n          channel: stable\n          version: 3.24.3\n          cache: true\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-streaming-asr.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-windows-streaming-asr.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name 
== 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/asr/windows/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n\n  tts:\n    name: tts ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        total: [\"20\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"11\", \"12\", \"13\", \"14\", \"15\", \"16\", \"17\", \"18\", \"19\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n  
      run: |\n          python3 -m pip install --upgrade pip jinja2 iso639-lang\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n\n      - name: Display machine info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build flutter\n        shell: bash\n        run: |\n          cd scripts/flutter\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts.py --total $total --index $index\n\n          chmod +x *.sh\n          ./build-windows-tts.sh\n          cd ../../\n          ls -lh *.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa' || github.repository_owner == 'csu-fangjun') && ((github.event_name == 'push' || github.event_name == 'workflow_dispatch') || contains(github.ref, 'refs/tags/'))\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone 
https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            dst=flutter/tts/windows/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n            cp -v ../*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-flutter main\n"
  },
  {
    "path": ".github/workflows/generate-tts-samples.yaml",
    "content": "name: generate-tts-samples\n\non:\n  push:\n    branches:\n      - tts-samples-2\n\n  workflow_dispatch:\n\nconcurrency:\n  group: generate-tts-samples-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  generate_tts_samples:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install \"numpy<=1.26.4\" sherpa-onnx soundfile\n\n      - name: kitten\n        if: true\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          cd scripts/kitten-tts\n          pwd=$PWD\n\n          export GIT_LFS_SKIP_SMUDGE=1\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n          mkdir -p ./hf/kitten/v0.1-nano/mp3\n          mkdir -p ./hf/kitten/v0.2-nano/mp3\n          mkdir -p ./hf/kitten/v0.1-mini/mp3\n\n          for v in 1 2; do\n            pushd nano_v0_$v\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_$v-fp16.tar.bz2\n            tar xf kitten-nano-en-v0_$v-fp16.tar.bz2\n            rm kitten-nano-en-v0_$v-fp16.tar.bz2\n\n            ln -s ../hf .\n            python3 ./generate_samples.py\n            rm -rf kitten-nano-en-v0_$v-fp16\n            popd\n          done\n\n          for v in 1; do\n            pushd mini_v0_$v\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-mini-en-v0_$v-fp16.tar.bz2\n            tar xf kitten-mini-en-v0_$v-fp16.tar.bz2\n    
        rm kitten-mini-en-v0_$v-fp16.tar.bz2\n\n            ln -s ../hf .\n            python3 ./generate_samples.py\n            rm -rf kitten-mini-en-v0_$v-fp16\n            popd\n          done\n\n          pushd hf\n          git pull\n          git add .\n          git commit -m 'add kitten tts samples'\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n          popd\n          rm -rf hf\n\n      - name: matcha en (ljspeech)\n        if: false\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          cd scripts/matcha-tts/en/\n          pwd=$PWD\n\n          export GIT_LFS_SKIP_SMUDGE=1\n          export GIT_CLONE_PROTECTION_ACTIVE=false\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples hf\n\n          mkdir -p ./hf/matcha/icefall-en-ljspeech/mp3\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n          tar xvf matcha-icefall-en_US-ljspeech.tar.bz2\n          rm matcha-icefall-en_US-ljspeech.tar.bz2\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n          python3 ./generate_samples.py\n\n          pushd hf\n          git pull\n          git add .\n          git commit -m 'add matcha tts en (ljspeech) samples'\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-tts-samples main\n          popd\n\n          rm -rf hf\n"
  },
  {
    "path": ".github/workflows/hap-vad-asr.yaml",
    "content": "name: hap-vad-asr\n\non:\n  push:\n    branches:\n      - hap\n      - hap-ci\n\n  workflow_dispatch:\n\nconcurrency:\n  group: hap-vad-asr-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\n\njobs:\n  hap_vad_asr:\n    if: github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa'\n    runs-on: ${{ matrix.os }}\n    name: Haps for vad asr ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"10\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # https://github.com/actions/setup-java\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '17' # it requires jdk 17 to sigh the hap\n\n      - name: Show java version\n        shell: bash\n        run: |\n          which java\n          java --version\n\n      - name: cache-toolchain\n        id: cache-toolchain-ohos\n        uses: actions/cache@v4\n        with:\n          path: command-line-tools\n          key: commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Download toolchain\n        if: steps.cache-toolchain-ohos.outputs.cache-hit != 'true'\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/harmonyos-commandline-tools/resolve/main/commandline-tools-linux-x64-5.0.5.200.zip\n          unzip commandline-tools-linux-x64-5.0.5.200.zip\n          rm commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Set environment variable\n        shell: bash\n        run: |\n          echo 
\"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build-tools/cmake/bin\"  >> \"$GITHUB_PATH\"\n          which cmake\n\n          cmake --version\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/hap\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-vad-asr-hap-script.py --total $total --index $index\n          ls -lh\n\n          chmod +x build-hap-vad-asr.sh\n          mv -v ./build-hap-vad-asr.sh ../..\n\n      - name: Generate secrets\n        shell: bash\n        run: |\n          echo \"${{ secrets.HAP_SHERPA_ONNX_CER }}\" > /tmp/sherpa_onnx.cer\n          shasum -a 256 /tmp/sherpa_onnx.cer\n          ls -lh /tmp/sherpa_onnx.cer\n\n          # macos\n          # base64 -i sherpa_onnx_profileRelease.p7b -o sherpa_onnx_profileRelease.p7b.base64\n          #\n          # linux\n          # base64 -w 0 sherpa_onnx_profileRelease.p7b > sherpa_onnx_profileRelease.p7b.base64\n          #\n          # cat sherpa_onnx_profileRelease.p7b.base64 | base64 --decode > sherpa_onnx_profileRelease.p7b\n          #\n          echo \"${{ secrets.HAP_SHERPA_ONNX_PROFILE }}\"   | base64 --decode > /tmp/sherpa_onnx_profileRelease.p7b\n          echo \"${{ secrets.HAP_SHERPA_ONNX_KEY_STORE }}\" > ./sherpa_onnx_ohos_key.p12.base64\n          echo \"${{ secrets.HAP_SHERPA_ONNX_KEY_STORE }}\" | base64 --decode > /tmp/sherpa_onnx_ohos_key.p12\n\n          ls -l /tmp/sherpa_onnx_profileRelease.p7b\n          ls -l /tmp/sherpa_onnx_ohos_key.p12\n\n          ls -lh ./sherpa_onnx_ohos_key.p12.base64\n          shasum -a 256 ./sherpa_onnx_ohos_key.p12.base64\n          wc ./sherpa_onnx_ohos_key.p12.base64\n          rm ./sherpa_onnx_ohos_key.p12.base64\n\n          shasum -a 256 /tmp/sherpa_onnx_profileRelease.p7b\n          shasum -a 256 
/tmp/sherpa_onnx_ohos_key.p12\n\n      - name: build HAP\n        env:\n          HAP_KEY_ALIAS: ${{ secrets.HAP_KEY_ALIAS }}\n          HAP_KEY_PWD: ${{ secrets.HAP_KEY_PWD }}\n          HAP_KEY_STORE_PWD: ${{ secrets.HAP_KEY_STORE_PWD }}\n        shell: bash\n        run: |\n          export COMMANDLINE_TOOLS_DIR=$GITHUB_WORKSPACE/command-line-tools\n          ./build-hap-vad-asr.sh\n\n          # remove secrets\n          rm /tmp/sherpa_onnx.cer\n          rm /tmp/sherpa_onnx_profileRelease.p7b\n          rm /tmp/sherpa_onnx_ohos_key.p12\n\n      - name: Display HAPs\n        shell: bash\n        run: |\n          ls -lh ./haps/\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-harmony-os huggingface\n            cd huggingface\n            du -h -d1 .\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=hap/vad-asr/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n            cp -v ../haps/*.hap $d/\n            git status\n            git lfs track \"*.hap\"\n            git add .\n            git commit -m \"add more HAPs\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-harmony-os main\n"
  },
  {
    "path": ".github/workflows/har.yaml",
    "content": "name: har\n\non:\n  push:\n    branches:\n      - master\n      # - ohos-har\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: har-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  har:\n    name: Har\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: har-linux\n\n      - name: cache-toolchain\n        id: cache-toolchain-ohos\n        uses: actions/cache@v4\n        with:\n          path: command-line-tools\n          key: commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Download toolchain\n        if: steps.cache-toolchain-ohos.outputs.cache-hit != 'true'\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/harmonyos-commandline-tools/resolve/main/commandline-tools-linux-x64-5.0.5.200.zip\n          unzip commandline-tools-linux-x64-5.0.5.200.zip\n          rm commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Set environment variable\n        shell: bash\n        run: |\n          echo \"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build-tools/cmake/bin\"  >> \"$GITHUB_PATH\"\n          which cmake\n\n          cmake --version\n\n          ls -lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build/cmake/ohos.toolchain.cmake\n\n          echo \"====\"\n          cat $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build/cmake/ohos.toolchain.cmake\n          echo \"====\"\n\n          # echo \"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin\"  >> \"$GITHUB_PATH\"\n\n          ls 
-lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin/\n          echo \"--\"\n          ls -lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin/*unknown*\n\n          cat $GITHUB_PATH\n\n          # /home/runner/work/onnxruntime-libs/onnxruntime-libs/command-line-tools/sdk/default/openharmony/native/llvm/bin/aarch64-unknown-linux-ohos-clang -v || true\n          export PATH=$PWD/command-line-tools/sdk/default/openharmony/native/llvm/bin:$PATH\n          echo \"path: $PATH\"\n\n          which aarch64-unknown-linux-ohos-clang++ || true\n          which aarch64-unknown-linux-ohos-clang || true\n\n          aarch64-unknown-linux-ohos-clang++ --version || true\n          aarch64-unknown-linux-ohos-clang --version || true\n\n          which armv7-unknown-linux-ohos-clang++\n          which armv7-unknown-linux-ohos-clang\n\n          armv7-unknown-linux-ohos-clang++ --version\n          armv7-unknown-linux-ohos-clang --version\n\n          which x86_64-unknown-linux-ohos-clang++\n          which x86_64-unknown-linux-ohos-clang\n\n          x86_64-unknown-linux-ohos-clang++ --version\n          x86_64-unknown-linux-ohos-clang --version\n\n      - name: Install tree\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -y -q tree\n\n      - name: Build libraries\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export OHOS_SDK_NATIVE_DIR=\"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native\"\n\n          ./build-ohos-arm64-v8a.sh\n          ./build-ohos-x86-64.sh\n\n      - name: Build Har\n        shell: bash\n        run: |\n          export PATH=\"$GITHUB_WORKSPACE/command-line-tools/bin:$PATH\"\n\n          which hvigorw\n\n          pushd harmony-os/SherpaOnnxHar\n\n          cp -fv 
../../LICENSE ./sherpa_onnx\n          cp -fv ../../CHANGELOG.md ./sherpa_onnx\n\n          hvigorw --mode module -p product=default -p module=sherpa_onnx@default assembleHar --analyze=normal --parallel --incremental --no-daemon\n          ls -lh ./sherpa_onnx/build/default/outputs/default/sherpa_onnx.har\n          cp -v ./sherpa_onnx/build/default/outputs/default/sherpa_onnx.har ../../\n\n          popd\n\n          ls -lh *.har\n\n      - name: View Har\n        shell: bash\n        run: |\n          file sherpa_onnx.har\n          tar xvf sherpa_onnx.har\n\n          cd package\n          ls -lh\n\n          ls -lh libs\n          echo \"---libs/x86_64---\"\n          ls -lh libs/x86_64\n\n          echo \"---libs/arm64-v8a---\"\n          ls -lh libs/arm64-v8a\n\n          echo \"---src/main/ets/components---\"\n          ls -lh src/main/ets/components/\n\n          echo \"---src/main/cpp/types/libsherpa_onnx/---\"\n          ls -lh src/main/cpp/types/libsherpa_onnx/\n\n          tree .\n\n      - name: Collect result\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          mv sherpa_onnx.har sherpa_onnx-$SHERPA_ONNX_VERSION.har\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-har\n          path: ./sherpa_onnx*.har\n\n      - name: Release har\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.har\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.30\n\n      - name: Publish to huggingface\n        
if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-harmony-os huggingface\n            cd huggingface\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=har\n            mkdir -p $d\n            cp -v ../*.har $d/\n            git status\n            git lfs track \"*.har\"\n            git add .\n            git commit -m \"add more hars\"\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-harmony-os main\n"
  },
  {
    "path": ".github/workflows/harmony-os.yaml",
    "content": "name: harmony-os\n\non:\n  push:\n    branches:\n      - master\n      - ohos\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: harmony-os-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  harmony_os:\n    name: Harmony OS ${{ matrix.arch }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        arch: [arm64-v8a, armeabi-v7a, x86_64]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ohos-${{ matrix.arch }}\n\n      - name: cache-toolchain\n        id: cache-toolchain-ohos\n        uses: actions/cache@v4\n        with:\n          path: command-line-tools\n          key: commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Download toolchain\n        if: steps.cache-toolchain-ohos.outputs.cache-hit != 'true'\n        shell: bash\n        run: |\n          curl -SL -O https://huggingface.co/csukuangfj/harmonyos-commandline-tools/resolve/main/commandline-tools-linux-x64-5.0.5.200.zip\n          unzip commandline-tools-linux-x64-5.0.5.200.zip\n          rm commandline-tools-linux-x64-5.0.5.200.zip\n\n      - name: Set environment variable\n        shell: bash\n        run: |\n          echo \"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build-tools/cmake/bin\"  >> \"$GITHUB_PATH\"\n          which cmake\n\n          cmake --version\n\n          ls -lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build/cmake/ohos.toolchain.cmake\n\n          echo \"====\"\n          cat $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/build/cmake/ohos.toolchain.cmake\n          echo \"====\"\n\n          # echo 
\"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin\"  >> \"$GITHUB_PATH\"\n\n          ls -lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin/\n          echo \"--\"\n          ls -lh $GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native/llvm/bin/*unknown*\n\n          cat $GITHUB_PATH\n\n          # /home/runner/work/onnxruntime-libs/onnxruntime-libs/command-line-tools/sdk/default/openharmony/native/llvm/bin/aarch64-unknown-linux-ohos-clang -v || true\n          export PATH=$PWD/command-line-tools/sdk/default/openharmony/native/llvm/bin:$PATH\n          echo \"path: $PATH\"\n\n          which aarch64-unknown-linux-ohos-clang++ || true\n          which aarch64-unknown-linux-ohos-clang || true\n\n          aarch64-unknown-linux-ohos-clang++ --version || true\n          aarch64-unknown-linux-ohos-clang --version || true\n\n          which armv7-unknown-linux-ohos-clang++\n          which armv7-unknown-linux-ohos-clang\n\n          armv7-unknown-linux-ohos-clang++ --version\n          armv7-unknown-linux-ohos-clang --version\n\n          which x86_64-unknown-linux-ohos-clang++\n          which x86_64-unknown-linux-ohos-clang\n\n          x86_64-unknown-linux-ohos-clang++ --version\n          x86_64-unknown-linux-ohos-clang --version\n\n      - name: Build ${{ matrix.arch }}\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          arch=${{ matrix.arch }}\n\n          echo \"arch: $arch\"\n\n          export OHOS_SDK_NATIVE_DIR=\"$GITHUB_WORKSPACE/command-line-tools/sdk/default/openharmony/native\"\n\n          if [[ $arch == arm64-v8a ]]; then\n            ./build-ohos-arm64-v8a.sh\n          elif [[ $arch == armeabi-v7a ]]; then\n            ./build-ohos-armeabi-v7a.sh\n          elif [[ $arch == x86_64 ]]; then\n            ./build-ohos-x86-64.sh\n 
         else\n            echo \"Unknown arch $arch\"\n          fi\n\n      - name: Collect result for ${{ matrix.arch }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION=$SHERPA_ONNX_VERSION\" >> \"$GITHUB_ENV\"\n\n          arch=${{ matrix.arch }}\n          d=sherpa-onnx-$SHERPA_ONNX_VERSION-ohos-$arch\n          if [[ $arch == x86_64 ]]; then\n            cd ./build-ohos-x86-64\n          else\n            cd ./build-ohos-$arch\n          fi\n\n          mv install $d\n          tar cjfv $d.tar.bz2 $d\n\n          ls -lh $d/lib\n\n\n          file $d/lib/*\n\n          readelf -d $d/lib/libsherpa-onnx-c-api.so\n\n          mv $d.tar.bz2 ../\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-ohos-${{ matrix.arch }}\n          path: ./*.tar.bz2\n\n      - name: Release jar\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.10.23\n"
  },
  {
    "path": ".github/workflows/jar.yaml",
    "content": "name: jar\n\non:\n  push:\n    branches:\n      - refactor-jar\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: jar-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: write\njobs:\n  jar:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.arch }}\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - os: ubuntu-24.04-arm\n            arch: \"arm64\"\n\n          - os: ubuntu-latest\n            arch: \"x64\"\n\n          - os: macos-latest\n            arch: \"arm64\"\n\n          - os: macos-15-intel\n            arch: \"x64\"\n\n          - os: windows-2022\n            arch: \"x64\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Show java version\n        shell: bash\n        run: |\n          java --version\n\n      - name: Download libs ${{ matrix.os }} ${{ matrix.arch }}\n        if: ${{ matrix.os == 'ubuntu-24.04-arm' && matrix.arch == 'arm64' }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-aarch64-jni.tar.bz2\n          tar xvf ./*.tar.bz2\n\n          src=sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-aarch64-jni\n          dst=sherpa-onnx/java-api/resources/sherpa-onnx/native/linux-aarch64\n\n          mkdir -p $dst\n          cp -v $src/lib/libsherpa-onnx-jni.so $dst/\n          cp -v $src/lib/libonnxruntime.so $dst/\n\n          ls -lh $dst\n          rm -rf $src*\n\n      - name: Download libs ${{ matrix.os }} ${{ matrix.arch }}\n        if: ${{ 
matrix.os == 'ubuntu-latest' && matrix.arch == 'x64' }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-x64-jni.tar.bz2\n          tar xvf ./*.tar.bz2\n\n          src=sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-x64-jni\n          dst=sherpa-onnx/java-api/resources/sherpa-onnx/native/linux-x64\n\n          mkdir -p $dst\n          cp -v $src/lib/libsherpa-onnx-jni.so $dst/\n          cp -v $src/lib/libonnxruntime.so $dst/\n\n          ls -lh $dst\n          rm -rf $src*\n\n      - name: Download libs ${{ matrix.os }} ${{ matrix.arch }}\n        if: ${{ matrix.os == 'macos-latest' && matrix.arch == 'arm64' }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-arm64-jni.tar.bz2\n          tar xvf ./*.tar.bz2\n\n          src=sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-arm64-jni\n          dst=sherpa-onnx/java-api/resources/sherpa-onnx/native/osx-aarch64\n\n          mkdir -p $dst\n          cp -v $src/lib/libonnxruntime.1.23.2.dylib $dst/\n          cp -v $src/lib/libsherpa-onnx-jni.dylib $dst/\n\n          ls -lh $dst\n          rm -rf $src*\n\n      - name: Download libs ${{ matrix.os }} ${{ matrix.arch }}\n        if: ${{ matrix.os == 'macos-15-intel' && matrix.arch == 'x64' }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          curl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-x86_64-jni.tar.bz2\n          tar xvf ./*.tar.bz2\n\n          src=sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-x86_64-jni\n          dst=sherpa-onnx/java-api/resources/sherpa-onnx/native/osx-x64\n\n          mkdir -p $dst\n          cp -v $src/lib/libonnxruntime.1.23.2.dylib $dst/\n          cp -v $src/lib/libsherpa-onnx-jni.dylib $dst/\n\n          ls -lh $dst\n          rm -rf $src*\n\n      - name: Download libs ${{ matrix.os }} ${{ matrix.arch }}\n        if: ${{ matrix.os == 'windows-2022' && matrix.arch == 'x64' }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/sherpa-onnx-v$SHERPA_ONNX_VERSION-win-x64-jni.tar.bz2\n          tar xvf ./*.tar.bz2\n\n          src=sherpa-onnx-v$SHERPA_ONNX_VERSION-win-x64-jni\n          ls -lh $src\n          ls -lh $src/lib\n          dst=sherpa-onnx/java-api/resources/sherpa-onnx/native/win-x64\n\n          mkdir -p $dst\n          cp -v $src/lib/onnxruntime.dll $dst/\n          cp -v $src/lib/sherpa-onnx-jni.dll $dst/\n\n          ls -lh $dst\n          rm -rf $src*\n\n      - name: Create java jar (source code)\n        shell: bash\n        run: |\n          cd sherpa-onnx/java-api\n          make\n\n          ls -lh build\n\n      - name: Create java jar (native lib)\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          cd sherpa-onnx/java-api\n\n          ls -lh resources/sherpa-onnx/native\n\n          echo \"--\"\n\n          ls -lh resources/sherpa-onnx/native/*/\n\n          jar cfvm ./sherpa-onnx-native.jar MANIFEST.MF -C ./resources .\n\n          ls -lh *.jar\n\n          os=${{ 
matrix.os }}\n          arch=${{ matrix.arch }}\n\n          if [[ $os == \"ubuntu-24.04-arm\" && $arch == \"arm64\" ]]; then\n            mv -v sherpa-onnx-native.jar sherpa-onnx-native-lib-linux-aarch64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"ubuntu-latest\" && $arch == \"x64\" ]]; then\n            mv -v sherpa-onnx-native.jar sherpa-onnx-native-lib-linux-x64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"macos-latest\" && $arch == \"arm64\" ]]; then\n            mv -v sherpa-onnx-native.jar sherpa-onnx-native-lib-osx-aarch64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"macos-15-intel\" && $arch == \"x64\" ]]; then\n            mv -v sherpa-onnx-native.jar sherpa-onnx-native-lib-osx-x64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"windows-2022\" && $arch == \"x64\" ]]; then\n            mv -v sherpa-onnx-native.jar sherpa-onnx-native-lib-win-x64-$SHERPA_ONNX_VERSION.jar\n          else\n            echo \"Unknown os $os with arch $arch\"\n          fi\n\n      - name: Show java jar (source code)\n        shell: bash\n        run: |\n          cd sherpa-onnx/java-api\n\n          unzip -l build/sherpa-onnx.jar\n\n      - name: Show java jar (native lib)\n        shell: bash\n        run: |\n          cd sherpa-onnx/java-api\n\n          unzip -l sherpa-onnx*.jar\n\n      - name: Release jar\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./sherpa-onnx/java-api/sherpa-onnx-native-*.jar\n\n      - name: Release jar\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./sherpa-onnx/java-api/sherpa-onnx-native-*.jar\n          
repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.15\n\n      - name: Test KittenTTS\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          os=${{ matrix.os }}\n          arch=${{ matrix.arch }}\n\n          if [[ $os == \"ubuntu-24.04-arm\" && $arch == \"arm64\" ]]; then\n            native_jar=sherpa-onnx-native-lib-linux-aarch64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"ubuntu-latest\" && $arch == \"x64\" ]]; then\n            native_jar=sherpa-onnx-native-lib-linux-x64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"macos-latest\" && $arch == \"arm64\" ]]; then\n            native_jar=sherpa-onnx-native-lib-osx-aarch64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"macos-15-intel\" && $arch == \"x64\" ]]; then\n            native_jar=sherpa-onnx-native-lib-osx-x64-$SHERPA_ONNX_VERSION.jar\n          elif [[ $os == \"windows-2022\" && $arch == \"x64\" ]]; then\n            native_jar=sherpa-onnx-native-lib-win-x64-$SHERPA_ONNX_VERSION.jar\n          else\n            echo \"Unknown os $os with arch $arch\"\n          fi\n\n          echo \"native_jar: $native_jar\"\n          ls -lh sherpa-onnx/java-api/$native_jar\n\n          if [[ ${{ matrix.os }} == \"windows-2022\" ]]; then\n            SEP=\";\"\n          else\n            SEP=\":\"\n          fi\n          cd java-api-examples\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n          tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n          rm kitten-nano-en-v0_1-fp16.tar.bz2\n\n          java \\\n            -cp \"../sherpa-onnx/java-api/build/sherpa-onnx.jar${SEP}../sherpa-onnx/java-api/$native_jar\" \\\n            NonStreamingTtsKittenEn.java\n"
  },
  {
    "path": ".github/workflows/jni.yaml",
    "content": "name: jni\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/jni.yaml'\n      - 'cmake/**'\n      - 'kotlin-api-examples/**'\n      - 'sherpa-onnx/kotlin-api/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/jni/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: jni-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  jni:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, macos-15-intel]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}\n\n      - name: OS info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Display kotlin version\n        shell: bash\n        run: |\n          kotlinc -version\n\n      - name: Display java version\n        shell: bash\n        run: |\n          java -version\n          javac -help\n          echo \"JAVA_HOME is: ${JAVA_HOME}\"\n\n      - name:  Run JNI test\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          cd ./kotlin-api-examples\n          ./run.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-files-${{ matrix.os }}\n          path: kotlin-api-examples/test-*.wav\n"
  },
  {
    "path": ".github/workflows/lazarus.yaml",
    "content": "name: lazarus\n\non:\n  push:\n    branches:\n      - master\n      - lazarus\n    paths:\n      - '.github/workflows/lazarus.yaml'\n      - 'cmake/**'\n      - 'lazarus-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'sherpa-onnx/pascal-api/*'\n      - 'scripts/lazarus/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: lazarus-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  build:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04, macos-latest, macos-15-intel, windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}\n\n      # See https://github.com/gcarreno/setup-lazarus\n      - uses: gcarreno/setup-lazarus@v3.3.1\n        with:\n          lazarus-version: \"stable\"\n          with-cache: false\n\n      - name: Lazarus info\n        shell: bash\n        run: |\n          which lazbuild\n          lazbuild --help\n\n      - name: FPC info\n        shell: bash\n        run: |\n          which fpc\n          fpc -i\n\n      - name: OS info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Install patchelf for ubuntu\n        if: matrix.os == 'ubuntu-22.04'\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n\n      - name: Show Patchelf version (ubuntu)\n        if: matrix.os == 'ubuntu-22.04'\n        shell: bash\n        run: |\n          patchelf --version\n          patchelf --help\n          which patchelf\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export 
CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          os=${{ matrix.os }}\n\n          if [[ $os == 'windows-2022' || $os == 'ubuntu-22.04' ]]; then\n            BUILD_SHARED_LIBS=ON\n          else\n            BUILD_SHARED_LIBS=OFF\n          fi\n\n          cmake \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -D BUILD_SHARED_LIBS=$BUILD_SHARED_LIBS \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          cmake --build . --target install --config Release -j 2\n\n          ls -lh install/lib/\n\n          cp -v install/lib/*.dll ../lazarus-examples/generate_subtitles/ || true\n          cp -v install/lib/*.so* ../lazarus-examples/generate_subtitles/ || true\n\n      - name: Build generating subtitles\n        shell: bash\n        run: |\n          cd lazarus-examples/generate_subtitles\n          os=${{ matrix.os }}\n          if [[ $os == macos-15-intel ]]; then\n            lazbuild --verbose --build-mode=Release --widgetset=cocoa ./generate_subtitles.lpi\n          elif [[ $os == macos-latest ]]; then\n            lazbuild --verbose --build-mode=Release --widgetset=cocoa --cpu=aarch64 ./generate_subtitles.lpi\n          elif [[ $os == 'ubuntu-22.04' ]]; then\n            lazbuild --verbose --build-mode=Release-Linux ./generate_subtitles.lpi\n          else\n            lazbuild --verbose --build-mode=Release ./generate_subtitles.lpi\n          fi\n\n      - name: Display generating subtitles\n        shell: bash\n        run: |\n          cd lazarus-examples/generate_subtitles\n          ls -lh\n\n      - name: Collect generating subtitles (Ubuntu)\n        if: 
matrix.os == 'ubuntu-22.04'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          cd lazarus-examples/generate_subtitles\n          ls -lh\n          readelf -d ./generate_subtitles\n          echo '----------'\n          ldd ./generate_subtitles\n\n          d=generate_subtitles-linux-x64-$SHERPA_ONNX_VERSION\n          echo \"---before running patchelf---\"\n          readelf -d ./generate_subtitles\n\n          patchelf --set-rpath '$ORIGIN' ./generate_subtitles\n\n          echo \"---after running patchelf---\"\n          readelf -d ./generate_subtitles\n\n          mkdir -p $d\n          cp -v ./generate_subtitles $d/\n          cp -v *.so $d/\n\n          mv -v $d /tmp/linux-x64\n\n          ls -lh /tmp/linux-x64\n\n      - name: Collect generating subtitles (windows)\n        if: matrix.os == 'windows-2022'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          cd lazarus-examples/generate_subtitles\n          ls -lh\n\n          d=generate-subtitles-windows-x64-$SHERPA_ONNX_VERSION\n          mkdir -p $d\n          cp -v ./generate_subtitles.exe $d/\n          cp -v onnxruntime.dll $d/\n          cp -v sherpa-onnx-c-api.dll $d/\n          mv $d ../../windows-x64\n          cd ../..\n\n          ls -lh windows-x64\n\n      - name: Collect generating subtitles (macos)\n        if: matrix.os == 'macos-15-intel' || matrix.os == 'macos-latest'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          cd lazarus-examples/generate_subtitles\n          ls -lh\n          file ./generate_subtitles\n          echo '----------'\n          otool -L ./generate_subtitles\n          rm -v 
generate_subtitles.app/Contents/MacOS/generate_subtitles\n          cp -v ./generate_subtitles generate_subtitles.app/Contents/MacOS/generate_subtitles\n          chmod +x generate_subtitles.app/Contents/MacOS/generate_subtitles\n\n          if [[ ${{ matrix.os }} == 'macos-latest' ]]; then\n            mv generate_subtitles.app /tmp/macos-arm64\n          else\n            mv generate_subtitles.app /tmp/macos-x64\n            d=generate-subtitles-macos-x64-$SHERPA_ONNX_VERSION.app\n          fi\n\n          ls -lh /tmp\n          echo \"---\"\n          ls -lh /tmp/macos-*\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.os == 'ubuntu-22.04'\n        with:\n          name: linux-x64\n          path: /tmp/linux-x64\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.os == 'macos-latest'\n        with:\n          name: macos-arm64\n          path: /tmp/macos-arm64\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.os == 'macos-15-intel'\n        with:\n          name: macos-x64\n          path: /tmp/macos-x64\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.os == 'windows-2022'\n        with:\n          name: windows-x64\n          path: ./windows-x64\n\n  release:\n    runs-on: ${{ matrix.os }}\n    needs: [build]\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"2\"]\n        index: [\"0\", \"1\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Retrieve artifact from windows x64\n        uses: actions/download-artifact@v4\n        with:\n          name: windows-x64\n          path: /tmp/windows-x64\n\n      - name: Retrieve artifact from linux x64\n        uses: actions/download-artifact@v4\n        with:\n          name: linux-x64\n          path: /tmp/linux-x64\n\n      - name: 
Retrieve artifact from macos x64\n        uses: actions/download-artifact@v4\n        with:\n          name: macos-x64\n          path: /tmp/macos-x64\n\n      - name: Retrieve artifact from macos arm64\n        uses: actions/download-artifact@v4\n        with:\n          name: macos-arm64\n          path: /tmp/macos-arm64\n\n      - name: Display build files\n        shell: bash\n        run: |\n          ls -lh /tmp\n          echo \"---linux-x64---\"\n          ls -lh /tmp/linux-x64/\n          readelf -d /tmp/linux-x64/generate_subtitles\n          echo \"---\"\n          ldd /tmp/linux-x64/generate_subtitles\n\n          echo \"---macos-x64---\"\n          ls -lh /tmp/macos-x64/\n          mkdir -p /tmp/macos-x64/Contents/Resources\n          chmod +x /tmp/macos-x64/Contents/MacOS/generate_subtitles\n\n          echo \"---macos-arm64---\"\n          ls -lh /tmp/macos-arm64/\n          mkdir -p /tmp/macos-arm64/Contents/Resources\n          chmod +x /tmp/macos-arm64/Contents/MacOS/generate_subtitles\n\n          echo \"---windows-x64---\"\n          ls -lh /tmp/windows-x64/\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/lazarus\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-subtitles.py --total $total --index $index\n\n          chmod +x build-generate-subtitles.sh\n          mv -v ./build-generate-subtitles.sh ../..\n\n      - name: Generate tar files\n        shell: bash\n        run: |\n          ./build-generate-subtitles.sh\n\n      - name: Display tar files\n        shell: bash\n        run: |\n          ls -lh /tmp/out\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          
timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-bin huggingface\n            cd huggingface\n            git remote set-url origin https://csukuangfj:$HF_TOKEN@huggingface.co/sherpa-onnx-bin\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            d=generate-subtitles/$SHERPA_ONNX_VERSION\n            mkdir -p $d\n\n            cp -v /tmp/out/*.tar.bz2 $d/\n            git status\n            git lfs track \"*.tar.bz2\"\n            git add .\n            git commit -m \"add more files\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/sherpa-onnx-bin main\n"
  },
  {
    "path": ".github/workflows/linux-gpu.yaml",
    "content": "name: linux-gpu\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/linux-gpu.yaml'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'c-api-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: linux-gpu-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux_gpu:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.build_type }} ${{ matrix.onnxruntime_version }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        # build_type: [Release, Debug]\n        build_type: [Release]\n        onnxruntime_version: [\"1.17.1\", \"1.23.2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n              quay.io/pypa/manylinux_2_28_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              onnxruntime_version=${{ matrix.onnxruntime_version }}\n              curl -SL -O https://github.com/csukuangfj/onnxruntime-libs/releases/download/v$onnxruntime_version/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched.zip\n              unzip  
onnxruntime-linux-x64-gpu-$onnxruntime_version-patched.zip\n\n              export SHERPA_ONNXRUNTIME_LIB_DIR=$PWD/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched/lib\n              export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$PWD/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched/include\n\n              ls -lh /home/runner/work/sherpa-onnx/sherpa-onnx/onnxruntime-linux-x64-gpu-$onnxruntime_version-patched/lib/libonnxruntime.so\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              p=$PWD\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D SHERPA_ONNX_ENABLE_GPU=ON \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n\n              echo \"----\"\n              ls -lh install/lib\n\n              echo \"----\"\n              ls -lh install/bin\n            '\n\n      - name: Display dependencies of sherpa-onnx for linux\n        shell: bash\n        run: |\n          du -h -d1 .\n          sudo chown -R $USER ./build\n          ls -lh build/bin\n          ls -lh build/_deps/onnxruntime-src/lib/ || true\n\n          echo \"strip\"\n          strip build/bin/*\n          echo \"after strip\"\n          ls -lh build/bin\n\n          file build/bin/sherpa-onnx\n          file 
build/bin/sherpa-onnx\n          ls -lh build/bin/sherpa-onnx\n          readelf -d build/bin/sherpa-onnx\n\n          rm -fv build/install/include/cargs.h\n          rm -fv build/install/lib/cargs.h\n          rm -fv build/install/lib/libcargs.so\n          rm -rfv build/install/lib/pkgconfig\n\n          strings build/install/lib/*.so | grep \"^GLIBC_\"\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-x64-gpu\n\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n          if [[ $onnxruntime_version == \"1.23.2\" ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-cuda-12.x-cudnn-9.x-linux-x64-gpu\n          fi\n\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*gpu.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.13\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*gpu.tar.bz2\n\n      - name: Display dependencies of sherpa-onnx for linux\n        shell: bash\n        run: |\n        
  file build/bin/sherpa-onnx\n          readelf -d build/bin/sherpa-onnx\n\n      - name: Test spoken language identification\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-language-identification\n\n          .github/scripts/test-spoken-language-identification.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-ctc.sh\n\n      - name: Test offline TTS\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-tts\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test online paraformer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-paraformer.sh\n\n\n      - name: Test offline Whisper\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test offline CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test offline transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=decode-file-c-api\n\n          
.github/scripts/test-online-transducer.sh\n"
  },
  {
    "path": ".github/workflows/linux-jni-aarch64.yaml",
    "content": "name: linux-jni-aarch64\n\non:\n  push:\n    branches:\n      - jni\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n  workflow_dispatch:\n\nconcurrency:\n  group: linux-jni-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux-jni-aarch64:\n    name: linux jni aarch64\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04-arm]\n        # java-version: ['8', '11', '16', '17', '21']\n        java-version: ['21']\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: ${{ matrix.java-version }}\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build sherpa-onnx\n        if: matrix.java-version == '21'\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n              quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              yum install -y java-11-openjdk-devel\n              java -version\n              which java\n              ls -lh $(which java)\n              ls -lrt /etc/alternatives/java\n\n              export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.23.0.9-2.el7_9.aarch64\n              echo \"JAVA_HOME: $JAVA_HOME\"\n              find $JAVA_HOME -name jni.h\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              git clone --depth 1 --branch v1.2.12 
https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n              p=$PWD\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n                -D SHERPA_ONNX_ENABLE_JNI=ON \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              rm -rf ./install/lib/pkgconfig\n              rm -rf ./install/lib/share\n              rm -rf ./install/lib/cargs.h\n              rm -rf ./install/include/cargs.h\n              rm -rf ./install/lib/libcargs.so\n              rm -rf ./install/lib/libsherpa-onnx-c-api.so\n\n              echo \"----\"\n              ls -lh install/lib\n\n              echo \"----\"\n            '\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.java-version == '21'\n        with:\n          name: release-jni-linux-${{ matrix.java-version }}\n          path: build/install/*\n\n      - name: Copy files\n        if: matrix.java-version == '21'\n        shell: bash\n        run: |\n          du -h -d1 .\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-aarch64-jni\n          mkdir $dst\n\n          cp -a build/install/lib $dst/\n          cp -a build/install/include 
$dst/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch') && matrix.java-version == '21'\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=jni/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst/\n            cp -v ../*.jar $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"add more files\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for linux aarch64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/') && matrix.java-version == '21'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.25\n\n      - name: Release pre-compiled binaries and libs 
for linux aarch64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/') && matrix.java-version == '21'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/linux-jni.yaml",
    "content": "name: linux-jni\n\non:\n  push:\n    branches:\n      - jni\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n  workflow_dispatch:\n\nconcurrency:\n  group: linux-jni-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux-jni:\n    name: linux jni\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        java-version: ['24']\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: ${{ matrix.java-version }}\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build jar ${{ matrix.java-version }}\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          cd sherpa-onnx/java-api\n          make\n          ls -lh build/\n          cp build/sherpa-onnx.jar ../../sherpa-onnx-$SHERPA_ONNX_VERSION.jar\n          cd ../..\n          ls -lh *.jar\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-jni-linux-jar-${{ matrix.java-version }}\n          path: ./*.jar\n\n      - name: Release jar\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.jar\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.25\n\n      - name: Release jar\n        if: 
github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.jar\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n              quay.io/pypa/manylinux2014_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              yum install -y java-11-openjdk-devel\n              java -version\n              which java\n              ls -lh $(which java)\n              ls -lrt /etc/alternatives/java\n\n              export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.23.0.9-2.el7_9.x86_64\n              echo \"JAVA_HOME: $JAVA_HOME\"\n              find $JAVA_HOME -name jni.h\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              p=$PWD\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                -D SHERPA_ONNX_ENABLE_JNI=ON \\\n           
     ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n              rm -rf ./install/lib/pkgconfig\n              rm -rf ./install/lib/share\n              rm -rf ./install/lib/cargs.h\n              rm -rf ./install/include/cargs.h\n              rm -rf ./install/lib/libcargs.so\n              rm -rf ./install/lib/libsherpa-onnx-c-api.so\n              rm -rf ./install/lib/libsherpa-onnx-cxx-api.so\n\n              echo \"----\"\n              ls -lh install/lib\n\n              echo \"----\"\n              ls -lh install/bin\n            '\n\n      - name: Display dependencies of sherpa-onnx for linux\n        shell: bash\n        run: |\n          du -h -d1 .\n          sudo chown -R $USER ./build\n          ls -lh build/bin\n          ls -lh build/_deps/onnxruntime-src/lib/\n\n          echo \"strip\"\n          strip build/bin/*\n          echo \"after strip\"\n          ls -lh build/bin\n\n          file build/bin/sherpa-onnx\n          file build/bin/sherpa-onnx\n          ls -lh build/bin/sherpa-onnx\n          readelf -d build/bin/sherpa-onnx\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-jni-linux-${{ matrix.java-version }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          du -h -d1 .\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-x64-jni\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n          du -h -d1 .\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n       
 uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.11\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=jni/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n            git lfs track \"*.jar\"\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst/\n            cp -v ../*.jar $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"add more files\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n"
  },
  {
    "path": ".github/workflows/linux.yaml",
    "content": "name: linux\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/linux.yaml'\n      - '.github/scripts/test-kws.sh'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-offline-speech-denoiser.sh'\n      - '.github/scripts/test-offline-source-separation.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - '.github/scripts/test-audio-tagging.sh'\n      - '.github/scripts/test-offline-punctuation.sh'\n      - '.github/scripts/test-online-punctuation.sh'\n      - '.github/scripts/test-speaker-diarization.sh'\n      - '.github/scripts/test-c-api.sh'\n      - '.github/scripts/test-cxx-api.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'c-api-examples/**'\n  pull_request:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/linux.yaml'\n      - '.github/scripts/test-kws.sh'\n      - '.github/scripts/test-offline-speech-denoiser.sh'\n      - '.github/scripts/test-offline-source-separation.sh'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - '.github/scripts/test-audio-tagging.sh'\n      - '.github/scripts/test-offline-punctuation.sh'\n      - '.github/scripts/test-online-punctuation.sh'\n      - '.github/scripts/test-speaker-diarization.sh'\n      - '.github/scripts/test-c-api.sh'\n      - '.github/scripts/test-cxx-api.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n\n  
workflow_dispatch:\n\nconcurrency:\n  group: linux-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux:\n    name: ${{ matrix.build_type }} shared-${{ matrix.shared_lib }} tts-${{ matrix.with_tts }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        build_type: [Release, Debug]\n        shared_lib: [ON, OFF]\n        with_tts: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display PWD\n        shell: bash\n        run: |\n          echo \"pwd: $PWD\"\n          ls -lh\n          du -h -d1 .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n            quay.io/pypa/manylinux2014_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n\n              # use gcc 11. 
the default is gcc 10\n\n              # See https://github.com/nealef/clefos/issues/9\n              echo \"multilib_policy=best\" >> /etc/yum.conf\n              echo \"skip_missing_names_on_install=False\" >> /etc/yum.conf\n              sed -i \"/^override_install_langs=/d\" /etc/yum.conf\n              yum -y update\n              yum -y install yum-utils curl\n              yum-config-manager --enable extras\n              yum -y install centos-release-scl-rh\n              yum -y install devtoolset-11-binutils devtoolset-11-gcc devtoolset-11-gcc-c++ devtoolset-11-gcc-gfortran\n\n              # see https://stackoverflow.com/questions/72904802/can-not-find-required-gcc-version-after-devtoolset-installation\n              ls -lh /opt/rh/devtoolset-11\n\n              source /opt/rh/devtoolset-11/enable\n\n              echo \"which gcc\"\n              which gcc\n\n              echo \"gcc --version\"\n              gcc --version\n\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n\n              p=$PWD\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -D SHERPA_ONNX_ENABLE_TTS=${{ matrix.with_tts }} \\\n                -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n                -D BUILD_SHARED_LIBS=${{ matrix.shared_lib }} \\\n                -D 
CMAKE_INSTALL_PREFIX=./install \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh lib\n              ls -lh bin\n\n              echo \"----\"\n              ls -lh install/lib\n\n              echo \"----\"\n              ls -lh install/bin\n            '\n\n      - name: Display dependencies of sherpa-onnx for linux\n        shell: bash\n        run: |\n          du -h -d1 .\n          sudo chown -R $USER ./build\n          ls -lh build/bin\n          ls -lh build/_deps/onnxruntime-src/lib/\n\n          echo \"strip\"\n          strip build/bin/*\n          echo \"after strip\"\n          ls -lh build/bin\n\n          file build/bin/sherpa-onnx\n          file build/bin/sherpa-onnx\n          ls -lh build/bin/sherpa-onnx\n          readelf -d build/bin/sherpa-onnx\n\n          rm -fv build/install/include/cargs.h\n          rm -fv build/install/lib/cargs.h\n          rm -fv build/install/lib/libcargs.so\n          rm -rfv build/install/lib/pkgconfig\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-${{ matrix.build_type }}-with-shared-lib-${{ matrix.shared_lib }}-with-tts-${{ matrix.with_tts }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        if: matrix.build_type == 'Release'\n        run: |\n          du -h -d1 .\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          if [[ ${{ matrix.shared_lib }} == 'ON' ]]; then\n            suffix=shared\n          else\n            suffix=static\n          fi\n\n          if [[ ${{ matrix.with_tts }} == ON ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-x64-$suffix\n          else\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-x64-$suffix-no-tts\n          fi\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          if [[ ${{ matrix.shared_lib }} == ON ]]; then\n            mkdir 
$dst/lib\n            cp -av build/install/lib/*.so* $dst/lib/\n          fi\n          cp -a build/install/include $dst/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n          du -h -d1 .\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/') && matrix.build_type == 'Release'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.25\n\n      - name: Release pre-compiled binaries and libs for linux x64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/') && matrix.build_type == 'Release'\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n\n      - name: Test offline source separation\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-source-separation\n\n          .github/scripts/test-offline-source-separation.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: source-separation-${{ matrix.build_type }}-with-shared-lib-${{ matrix.shared_lib }}-with-tts-${{ matrix.with_tts }}\n          path: ./source-separation-wavs/*.wav\n\n      - name: Test offline CTC\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-ctc.sh\n          du -h -d1 .\n\n      - name: Test offline speech denoiser\n        shell: bash\n        run: |\n          du -h -d1 .\n          
export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-denoiser\n\n          .github/scripts/test-offline-speech-denoiser.sh\n\n      - name: Test offline TTS\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-tts\n\n          .github/scripts/test-offline-tts.sh\n          du -h -d1 .\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: speech-denoiser-${{ matrix.build_type }}-with-shared-lib-${{ matrix.shared_lib }}-with-tts-${{ matrix.with_tts }}\n          path: ./*speech*.wav\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.with_tts == 'ON'\n        with:\n          name: tts-generated-test-files-${{ matrix.build_type }}-${{ matrix.shared_lib }}-with-tts-${{ matrix.with_tts }}\n          path: tts\n\n      - name: Test offline FireRedASR\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          .github/scripts/test-offline-fire-red-asr.sh\n          du -h -d1 .\n\n      - name: Test offline Moonshine\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          .github/scripts/test-offline-moonshine.sh\n          du -h -d1 .\n\n      - name: Test C++ API\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export CXX_STREAMING_ZIPFORMER_EXE=streaming-zipformer-cxx-api\n          export CXX_WHISPER_EXE=whisper-cxx-api\n          export CXX_SENSE_VOICE_EXE=sense-voice-cxx-api\n\n          .github/scripts/test-cxx-api.sh\n          du 
-h -d1 .\n\n      - name: Test offline speaker diarization\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-speaker-diarization\n\n          .github/scripts/test-speaker-diarization.sh\n\n      - name: Test offline transducer\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-transducer.sh\n          du -h -d1 .\n\n      - name: Test online punctuation\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-online-punctuation\n\n          .github/scripts/test-online-punctuation.sh\n          du -h -d1 .\n\n      - name: Test online transducer\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-transducer.sh\n          du -h -d1 .\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=decode-file-c-api\n\n          .github/scripts/test-online-transducer.sh\n          du -h -d1 .\n\n      - name: Test spoken language identification (C++ API)\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-language-identification\n\n          .github/scripts/test-spoken-language-identification.sh\n          du -h -d1 .\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-ctc.sh\n          du -h -d1 .\n\n      - name: Test C API\n        shell: bash\n        run: |\n          du -h 
-d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export SLID_EXE=spoken-language-identification-c-api\n          export SID_EXE=speaker-identification-c-api\n          export AT_EXE=audio-tagging-c-api\n          export PUNCT_EXE=add-punctuation-c-api\n\n          .github/scripts/test-c-api.sh\n          du -h -d1 .\n\n      - name: Test offline punctuation\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-punctuation\n\n          .github/scripts/test-offline-punctuation.sh\n          du -h -d1 .\n\n      - name: Test Audio tagging\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-audio-tagging\n\n          .github/scripts/test-audio-tagging.sh\n          du -h -d1 .\n\n      - name: Test transducer kws\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-keyword-spotter\n\n          .github/scripts/test-kws.sh\n          du -h -d1 .\n\n      - name: Test offline Whisper\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          .github/scripts/test-offline-whisper.sh\n          du -h -d1 .\n\n      - name: Test online paraformer\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-paraformer.sh\n          du -h -d1 .\n"
  },
  {
    "path": ".github/workflows/macos-jni.yaml",
    "content": "name: macos-jni\n\non:\n  push:\n    branches:\n      - jni\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: macos-jni-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  macos_jni:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.arch }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        arch: [arm64, x86_64]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.arch }}\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          arch=${{ matrix.arch }}\n\n          cmake \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D CMAKE_OSX_ARCHITECTURES=$arch \\\n            -D SHERPA_ONNX_ENABLE_JNI=ON \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -rf ./install/lib/pkgconfig\n          rm -rf ./install/lib/share\n          rm -rf ./install/lib/cargs.h\n          rm -rf ./install/include/cargs.h\n       
   rm -rf ./install/lib/libcargs.dylib\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-jni-macos-${{ matrix.arch }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          du -h -d1 .\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          arch=${{ matrix.arch }}\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-osx-$arch-jni\n          mkdir -p $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          brew install tree\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n          du -h -d1 .\n\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=jni/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"add more files\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for macOS\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/macos.yaml",
    "content": "name: macos\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/scripts/test-offline-speech-denoiser.sh'\n      - '.github/workflows/macos.yaml'\n      - '.github/scripts/test-kws.sh'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-audio-tagging.sh'\n      - '.github/scripts/test-offline-punctuation.sh'\n      - '.github/scripts/test-online-punctuation.sh'\n      - '.github/scripts/test-speaker-diarization.sh'\n      - '.github/scripts/test-c-api.sh'\n      - '.github/scripts/test-cxx-api.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n  pull_request:\n    branches:\n      - master\n    paths:\n      - '.github/scripts/test-offline-speech-denoiser.sh'\n      - '.github/workflows/macos.yaml'\n      - '.github/scripts/test-kws.sh'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-audio-tagging.sh'\n      - '.github/scripts/test-offline-punctuation.sh'\n      - '.github/scripts/test-online-punctuation.sh'\n      - '.github/scripts/test-speaker-diarization.sh'\n      - '.github/scripts/test-c-api.sh'\n      - '.github/scripts/test-cxx-api.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: macos-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  macos:\n    runs-on: ${{ matrix.os }}\n    name: ${{ 
matrix.build_type }} ${{ matrix.lib_type }} tts-${{ matrix.with_tts }} ${{ matrix.os }} ${{ matrix.arch }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        build_type: [Release, Debug]\n        lib_type: [static, shared]\n        with_tts: [ON, OFF]\n        arch: [\"arm64;x86_64\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.build_type }}-${{ matrix.lib_type }}-tts-${{ matrix.with_tts }}\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          lib_type=${{ matrix.lib_type }}\n          if [[ $lib_type == \"static\" ]]; then\n            BUILD_SHARED_LIBS=OFF\n          else\n            BUILD_SHARED_LIBS=ON\n          fi\n\n          arch=\"${{ matrix.arch }}\"\n\n          cmake \\\n            -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.with_tts }} \\\n            -D BUILD_SHARED_LIBS=$BUILD_SHARED_LIBS \\\n            -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n            -D CMAKE_OSX_ARCHITECTURES=\"$arch\" \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -fv ./install/include/cargs.h\n          rm -fv ./install/lib/cargs.h\n          rm -fv 
./install/lib/libcargs.dylib\n          rm -fv ./install/lib/libcargs.a\n          rm -rfv ./install/lib/pkgconfig\n\n      - name: Display dependencies of sherpa-onnx for macos\n        shell: bash\n        run: |\n          file build/bin/sherpa-onnx\n          otool -L build/bin/sherpa-onnx\n          otool -l build/bin/sherpa-onnx\n\n      - name: Copy files\n        if: matrix.build_type == 'Release'\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          if [[ ${{ matrix.with_tts }} == ON ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-osx-universal2-${{ matrix.lib_type }}\n          else\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-osx-universal2-${{ matrix.lib_type }}-no-tts\n          fi\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          if [[ ${{ matrix.lib_type }} == shared ]]; then\n            mkdir $dst/lib\n            cp -a build/install/lib/*.dylib* $dst/lib/\n          else\n            cp -a build/install/lib $dst/\n          fi\n          cp -a build/install/include $dst/\n\n          brew install tree\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - name: Release pre-compiled binaries and libs for macOS\n        if: matrix.build_type == 'Release' && (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*osx-universal2*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.24\n\n      - name: Test offline FireRedASR\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n     
     export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-fire-red-asr.sh\n\n      - name: Test offline CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test offline speech denoiser\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-denoiser\n\n          .github/scripts/test-offline-speech-denoiser.sh\n\n      - name: Test offline TTS\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-tts\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test offline Moonshine\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-moonshine.sh\n\n      - name: Test C++ API\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export CXX_STREAMING_ZIPFORMER_EXE=streaming-zipformer-cxx-api\n          export CXX_WHISPER_EXE=whisper-cxx-api\n          export CXX_SENSE_VOICE_EXE=sense-voice-cxx-api\n\n          .github/scripts/test-cxx-api.sh\n          du -h -d1 .\n\n      - name: Test offline speaker diarization\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-speaker-diarization\n\n          .github/scripts/test-speaker-diarization.sh\n\n      - name: Test offline transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online punctuation\n        
shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-online-punctuation\n\n          .github/scripts/test-online-punctuation.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-ctc.sh\n\n      - name: Test offline punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-punctuation\n\n          .github/scripts/test-offline-punctuation.sh\n\n      - name: Test C API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export SLID_EXE=spoken-language-identification-c-api\n          export SID_EXE=speaker-identification-c-api\n          export AT_EXE=audio-tagging-c-api\n          export PUNCT_EXE=add-punctuation-c-api\n\n          .github/scripts/test-c-api.sh\n\n      - name: Test Audio tagging\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-audio-tagging\n\n          .github/scripts/test-audio-tagging.sh\n\n      - name: Test spoken language identification (C++ API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-language-identification\n\n          .github/scripts/test-spoken-language-identification.sh\n\n      - name: Test transducer kws\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-keyword-spotter\n\n          .github/scripts/test-kws.sh\n\n      - name: Test online paraformer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-paraformer.sh\n\n      - name: Test offline Whisper\n        if: matrix.build_type != 'Debug'\n        shell: 
bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test online transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=decode-file-c-api\n\n          .github/scripts/test-online-transducer.sh\n\n\n"
  },
  {
    "path": ".github/workflows/mfc.yaml",
    "content": "name: mfc\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/mfc.yaml'\n      - 'cmake/**'\n      - 'mfc-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: mfc-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  mfc:\n    name: MFC for ${{ matrix.arch }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        arch: [x64, x86]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display MSBuild info\n        shell: cmd\n        run: |\n          set path=\"C:\\Program Files\\Microsoft Visual Studio\\2022\\Enterprise\\MSBuild\\Current\\Bin\"\n          msbuild -help\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          arch=${{ matrix.arch }}\n          if [[ $arch == \"x86\" ]]; then\n            arch=Win32\n          fi\n          cmake -A $arch -D CMAKE_BUILD_TYPE=Release -D BUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=./install ..\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . --config Release -- -m:2\n          cmake --build . 
--config Release --target install -- -m:2\n\n          ls -lh install/*\n\n          ls -lh install/lib\n          ls -lh install/bin\n\n      - name: Build MFC\n        shell: cmd\n        run: |\n          set path=\"C:\\Program Files\\Microsoft Visual Studio\\2022\\Enterprise\\MSBuild\\Current\\Bin\"\n\n          cd mfc-examples\n\n          msbuild .\\mfc-examples.sln /property:Configuration=Release /property:Platform=${{ matrix.arch }}\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          arch=${{ matrix.arch }}\n          if [[ $arch == \"x86\" ]]; then\n            src=mfc-examples/Release\n            ls -h $src\n            dst=mfc-examples/$arch/Release\n\n            mkdir -p $dst\n            cp $src/* $dst\n          fi\n\n          cd mfc-examples/$arch/Release\n          ls -lh\n\n          cp -v StreamingSpeechRecognition.exe sherpa-onnx-streaming-asr-$arch-${SHERPA_ONNX_VERSION}.exe\n          cp -v NonStreamingSpeechRecognition.exe sherpa-onnx-non-streaming-asr-$arch-${SHERPA_ONNX_VERSION}.exe\n          cp -v NonStreamingTextToSpeech.exe ../sherpa-onnx-non-streaming-tts-$arch-${SHERPA_ONNX_VERSION}.exe\n          ls -lh\n\n      - name: Upload artifact tts\n        uses: actions/upload-artifact@v4\n        with:\n          name: non-streaming-tts-${{ matrix.arch }}\n          path: ./mfc-examples/${{ matrix.arch }}/sherpa-onnx-non-streaming-tts-*.exe\n\n      - name: Upload artifact\n        uses: actions/upload-artifact@v4\n        with:\n          name: streaming-speech-recognition-${{ matrix.arch }}\n          path: ./mfc-examples/${{ matrix.arch }}/Release/sherpa-onnx-streaming-asr-*.exe\n\n      - name: Upload artifact\n        uses: actions/upload-artifact@v4\n        with:\n          name: non-streaming-speech-recognition-${{ matrix.arch }}\n          path: ./mfc-examples/${{ matrix.arch 
}}/Release/sherpa-onnx-non-streaming-asr-*.exe\n\n      - name: Release pre-compiled binaries and libs for Windows ${{ matrix.arch }}\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./mfc-examples/${{ matrix.arch }}/Release/sherpa-onnx-streaming-*.exe\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.24\n\n      - name: Release pre-compiled binaries and libs for Windows ${{ matrix.arch }}\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./mfc-examples/${{ matrix.arch }}/Release/sherpa-onnx-non-streaming-*.exe\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.24\n\n      - name: Release pre-compiled binaries and libs for Windows ${{ matrix.arch }}\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./mfc-examples/${{ matrix.arch }}/sherpa-onnx-non-streaming-*.exe\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.24\n"
  },
  {
    "path": ".github/workflows/mobile-asr-models.yaml",
    "content": "name: mobile-asr-models\n\non:\n  push:\n    branches:\n      - asr-mobile\n\n  workflow_dispatch:\n\nconcurrency:\n  group: mobile-asr-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  mobile-asr-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj' || github.repository_owner == 'csu-fangjun'\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n        total: [\"11\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          python3 -m pip install onnxruntime==1.16.3 onnx==1.15.0 jinja2\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/mobile-asr-models\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-asr.py --total $total --index $index\n          chmod +x run2.sh\n          mv run2.sh run.sh\n          ls -lh\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/mobile-asr-models\n          ./run.sh\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n"
  },
  {
    "path": ".github/workflows/mobile-kws-models.yaml",
    "content": "name: mobile-kws-models\n\non:\n  push:\n    branches:\n      - asr-mobile\n\n  workflow_dispatch:\n\nconcurrency:\n  group: mobile-kws-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  mobile-kws-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj' || github.repository_owner == 'csu-fangjun'\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n        total: [\"2\"]\n        index: [\"0\", \"1\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install dependencies\n        shell: bash\n        run: |\n          python3 -m pip install onnxruntime==1.16.3 onnx==1.15.0 jinja2\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/mobile-asr-models\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-kws.py --total $total --index $index\n          chmod +x run2.sh\n          mv run2.sh run.sh\n          ls -lh\n\n      - name: Run\n        shell: bash\n        run: |\n          cd scripts/mobile-asr-models\n          ./run.sh\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./kws/*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: kws-models\n"
  },
  {
    "path": ".github/workflows/nightly-wheel-arm.yaml",
    "content": "name: nightly-wheel-arm\n\non:\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 23:50 UTC time every day\n    - cron: \"50 23 * * *\"\n\n  workflow_dispatch:\n\nconcurrency:\n  group: nightly-wheel-armv7l-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  nightly-wheel-arm:\n    name: ${{ matrix.python-version }}\n    # see https://github.com/actions/virtual-environments/blob/win19/20210525.0/images/win/Windows2019-Readme.md\n    runs-on: ${{ matrix.os}}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.7\", \"3.8\", \"3.9\", \"3.10\", \"3.11\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v2\n        with:\n          platforms: arm\n\n      - name: Run docker\n        shell: bash\n        run: |\n            docker run --rm \\\n              --platform linux/arm/v7 \\\n              --volume ${{ github.workspace }}/:/workspace \\\n              balenalib/raspberrypi3-python:${{ matrix.python-version }}-bullseye-build \\\n            bash -c '\n              uname -a\n              cd /workspace\n              ls -lh\n\n              v=${{ matrix.python-version }}\n              PYTHON_VERSION=${v/./}\n              echo PYTHON_VERSION=$PYTHON_VERSION >> $GITHUB_ENV\n              extra=\"\"\n              if [[ ${PYTHON_VERSION} == \"37\" ]]; then\n                extra=\"m\"\n              fi\n\n              # pip install -i https://www.piwheels.org/simple numpy sentencepiece click\n              pip install 
https://huggingface.co/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/sentencepiece-0.2.0-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}${extra}-linux_armv7l.whl\n              pip install --no-deps sherpa-onnx\n              python3 -c \"import sherpa_onnx; print(sherpa_onnx.__file__, sherpa_onnx.__version__); print(dir(sherpa_onnx)); print(help(sherpa_onnx))\"\n            '\n"
  },
  {
    "path": ".github/workflows/npm-addon-linux-aarch64.yaml",
    "content": "name: npm-addon-linux-aarch64\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-linux-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  npm-addon-linux-aarch64:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v2\n        with:\n          platforms: arm64\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Show .npmrc\n        shell: bash\n        run: |\n          echo $PWD\n          echo $HOME\n\n          find $HOME -name .npmrc\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Show .npmrc\n        shell: bash\n        run: |\n          echo $PWD\n          echo $HOME\n\n          find $HOME -name .npmrc\n\n          cat /home/runner/work/_temp/.npmrc\n          cp -v /home/runner/work/_temp/.npmrc ./\n\n      - name: Build sherpa-onnx (docker manually)\n        shell: bash\n        run: |\n          docker run --rm \\\n              --volume ${{ github.workspace }}/:/shared/ \\\n              -w /shared \\\n              --platform linux/arm64 \\\n              quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              cp /shared/.npmrc ~/\n\n              cat ~/.npmrc\n\n              echo $HOME\n              uname -a\n              cat /etc/*release\n              gcc --version\n              
cmake --version\n\n              curl -sL https://rpm.nodesource.com/setup_16.x | bash -\n              yum install -y nodejs\n\n              node --version\n\n              cd /shared\n\n              mkdir build\n              cd build\n              cmake \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                -DBUILD_SHARED_LIBS=ON \\\n                -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n                -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n                -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n                ..\n\n              make -j2\n              make install\n              cd ..\n\n              d=$PWD\n              export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n              ls -lh /shared/build\n\n              pushd scripts/node-addon-api/\n              npm i\n\n              ./node_modules/.bin/cmake-js compile --log-level verbose\n              popd\n\n              owner=${{ github.repository_owner }}\n              export owner\n\n              echo \"---\"\n              ls -lh build/install/lib/\n              sudo chown -R runner ./build\n              echo \"---\"\n              ls -lh build/install/lib/\n              echo \"---\"\n\n              .github/scripts/node-addon/run.sh\n\n              ls -lh ./sherpa-onnx-node\n\n              tar czvf sherpa-onnx-linux-arm64.tgz sherpa-onnx-node\n            '\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd sherpa-onnx-node\n          ls -lh\n          npm publish --access public\n\n          # cd ./sherpa-onnx-node\n          # cp -v /shared/.npmrc ./\n          # # https://docs.npmjs.com/trusted-publishers\n          # ls -lh\n          # npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm-addon-linux-x64.yaml",
    "content": "name: npm-addon-linux-x64\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-linux-x64-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  npm-addon-linux-x64:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --volume ${{ github.workspace }}/:/shared/ \\\n            quay.io/pypa/manylinux2014_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cd /shared\n\n              mkdir build\n              cd build\n              cmake \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                -DBUILD_SHARED_LIBS=ON \\\n                -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n                -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n                -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n                ..\n              make -j1 install\n            '\n\n      - name: Build sherpa-onnx node-addon\n        shell: bash\n        run: |\n          d=$PWD\n          export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n          sudo mkdir /shared\n  
        sudo ln -s $PWD/build /shared/\n\n          ls -lh /shared/build\n\n          cd scripts/node-addon-api/\n\n          npm i\n\n          ./node_modules/.bin/cmake-js compile --log-level verbose\n\n      - name: Prepare for publish\n        shell: bash\n        run: |\n          owner=${{ github.repository_owner }}\n          export owner\n\n          echo \"---\"\n          ls -lh build/install/lib/\n          sudo chown -R runner ./build\n          echo \"---\"\n          ls -lh build/install/lib/\n          echo \"---\"\n\n          # find build/install/lib/ -maxdepth 1 -type l\n          # find build/install/lib/ -maxdepth 1 -type l -delete\n          #\n          # echo \"---\"\n          # ls -lh build/install/lib/\n\n          .github/scripts/node-addon/run.sh\n\n      - name: Display files to be published\n        shell: bash\n        run: |\n          ls -lh ./sherpa-onnx-node\n          tar cjvf ./sherpa-onnx-node.tar.bz2 ./sherpa-onnx-node\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-linux-x64\n          path: ./sherpa-onnx-node.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd ./sherpa-onnx-node\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm-addon-macos.yaml",
    "content": "name: npm-addon-macos\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-macos-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  npm-addon-macos:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-15-intel, macos-14]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Update pip\n        shell: bash\n        run: |\n          pip install -U pip\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-release-shared\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          mkdir build\n          cd build\n          cmake \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n          make -j install\n\n      - name: Build sherpa-onnx node-addon\n        shell: bash\n        run: |\n          d=$PWD\n          export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n 
         cd scripts/node-addon-api/\n\n          npm i\n\n          ./node_modules/.bin/cmake-js compile --log-level verbose\n\n      - name: Prepare for publish\n        shell: bash\n        run: |\n          owner=${{ github.repository_owner }}\n          export owner\n\n          ls -lh build/install/lib/\n          echo \"---\"\n\n          # find build/install/lib/ -maxdepth 1 -type l\n          # find build/install/lib/ -maxdepth 1 -type l -delete\n\n          # echo \"---\"\n          # ls -lh build/install/lib/\n\n          .github/scripts/node-addon/run.sh\n\n      - name: Display files to be published\n        shell: bash\n        run: |\n          ls -lh ./sherpa-onnx-node\n          tar cjvf ./sherpa-onnx-node.tar.bz2 ./sherpa-onnx-node\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-${{ matrix.os }}\n          path: ./sherpa-onnx-node.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd ./sherpa-onnx-node\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm-addon-win-x64.yaml",
    "content": "name: npm-addon-win-x64\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-win-x64-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  npm-addon-win-x64:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF  \\\n            ..\n\n          ls -lh  _deps/onnxruntime-src/lib/\n\n          cmake --build . 
--config Release --target install -- -m:6\n\n          ls -lh install/lib\n\n          echo \"----------\"\n\n          cp -v  _deps/onnxruntime-src/lib/*.lib ./install/lib\n          cp -v  _deps/onnxruntime-src/lib/*.dll ./install/lib\n\n          echo \"----------\"\n\n          ls -lh install/lib\n\n      - name: Build sherpa-onnx node-addon\n        shell: bash\n        run: |\n          d=$PWD\n          export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n          cd scripts/node-addon-api/\n\n          npm i\n\n          ./node_modules/.bin/cmake-js compile --log-level verbose\n\n      - name: Prepare for publish\n        shell: bash\n        run: |\n          owner=${{ github.repository_owner }}\n          export owner\n\n          echo \"---\"\n          ls -lh build/install/lib/\n          echo \"---\"\n          ls -lh build/install/lib/\n          echo \"---\"\n\n          .github/scripts/node-addon/run.sh\n\n      - name: Display files to be published\n        shell: bash\n        run: |\n          ls -lh ./sherpa-onnx-node\n          tar cjvf ./sherpa-onnx-node.tar.bz2 ./sherpa-onnx-node\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-win-x64\n          path: ./sherpa-onnx-node.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd ./sherpa-onnx-node\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm-addon-win-x86.yaml",
    "content": "name: npm-addon-win-x86\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-win-x86-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  build:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n          architecture: 'x86'\n          node-version: '16'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Show node-addon\n        shell: bash\n        run: |\n          cd scripts/node-addon-api/\n\n          npm i || true\n          cat node_modules/node-addon-api/package.json\n          cd node_modules/\n          tar cjf node-addon-api.tar.bz2 ./node-addon-api\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: node-addon-api\n          path: ./scripts/node-addon-api/node_modules/node-addon-api.tar.bz2\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -A Win32 \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF 
\\\n            -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF  \\\n            ..\n\n          ls -lh  _deps/onnxruntime-src/lib/\n\n          cmake --build . --config Release --target install -- -m:6\n\n          ls -lh install/lib\n\n          echo \"----------\"\n\n          cp -v  _deps/onnxruntime-src/lib/*.lib ./install/lib\n          cp -v  _deps/onnxruntime-src/lib/*.dll ./install/lib\n\n          echo \"----------\"\n\n          ls -lh install/lib\n\n      - name: Build sherpa-onnx node-addon\n        shell: bash\n        run: |\n          d=$PWD\n          export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n          cd scripts/node-addon-api/\n\n          npm i\n\n          npm config set cmake_js_A \"Win32\"\n          ./node_modules/.bin/cmake-js compile -A Win32 --log-level verbose\n\n      - name: Prepare for publish\n        shell: bash\n        run: |\n          owner=${{ github.repository_owner }}\n          export owner\n\n          echo \"---\"\n          ls -lh build/install/lib/\n          echo \"---\"\n          ls -lh build/install/lib/\n          echo \"---\"\n\n          .github/scripts/node-addon/run.sh\n\n      - name: Display files to be published\n        shell: bash\n        run: |\n          ls -lh ./sherpa-onnx-node\n          tar cjvf ./sherpa-onnx-node.tar.bz2 ./sherpa-onnx-node\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-win-ia32\n          path: ./sherpa-onnx-node.tar.bz2\n\n  upload:\n    needs: [build]\n    name: upload\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 
'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Retrieve artifact\n        uses: actions/download-artifact@v4\n        with:\n          name: sherpa-onnx-win-ia32\n          path: /tmp/files/\n\n      - name: Unzip\n        shell: bash\n        run: |\n          cd /tmp/files\n          tar xvf sherpa-onnx-node.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd /tmp/files/sherpa-onnx-node\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm-addon.yaml",
    "content": "name: npm-addon\n\non:\n  push:\n    branches:\n      - node-addon\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-addon-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  npm-addon:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n          node-version: '24'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-release-shared\n\n      - name: Prepare for publish\n        shell: bash\n        run: |\n          owner=${{ github.repository_owner }}\n          export owner\n\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n          # SHERPA_ONNX_VERSION=1.0.30\n\n          src_dir=.github/scripts/node-addon\n          sed -i.bak s/SHERPA_ONNX_VERSION/$SHERPA_ONNX_VERSION/g $src_dir/package.json\n          sed -i.bak s/k2-fsa/$owner/g $src_dir/package.json\n\n          dst=sherpa-onnx-node\n          mkdir $dst\n          cp $src_dir/package.json $dst/\n          cp $src_dir/README.md $dst/\n          cp scripts/node-addon-api/lib/*.js $dst/\n\n      - name: Display files to be published\n        shell: bash\n        
run: |\n          ls -lh ./sherpa-onnx-node\n          tar cjvf ./sherpa-onnx-node.tar.bz2 ./sherpa-onnx-node\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-node\n          path: ./sherpa-onnx-node.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd ./sherpa-onnx-node\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --access public\n"
  },
  {
    "path": ".github/workflows/npm.yaml",
    "content": "name: npm\n\non:\n  push:\n    branches:\n      - npm\n  workflow_dispatch:\n\nconcurrency:\n  group: npm-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n  id-token: write\n\njobs:\n  nodejs:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.51\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '24'\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Build nodejs package\n        shell: bash\n        run: |\n          ./build-wasm-simd-nodejs.sh\n          cp -v build-wasm-simd-nodejs/install/bin/wasm/nodejs/*.js ./scripts/nodejs/\n          cp -v build-wasm-simd-nodejs/install/bin/wasm/nodejs/*.wasm ./scripts/nodejs/\n\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n          echo \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\n          cd scripts/nodejs\n\n          owner=${{ github.repository_owner }}\n          echo \"owner: $owner\"\n\n          sed -i.bak s/SHERPA_ONNX_VERSION/$SHERPA_ONNX_VERSION/g 
./package.json\n          sed -i.bak s/k2-fsa/$owner/g ./package.json\n\n          rm package.json.bak\n\n      - name: Collect files\n        shell: bash\n        run: |\n          dst=sherpa-onnx-wasm-nodejs\n          mkdir $dst\n          cp -v scripts/nodejs/* $dst\n          tar cvjf $dst.tar.bz2 $dst\n\n          echo \"---\"\n          ls -h $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-nodejs\n          path: ./*.tar.bz2\n\n      - name: Publish\n        shell: bash\n        run: |\n          cd scripts/nodejs\n\n          git diff\n\n          # https://docs.npmjs.com/trusted-publishers\n          npm publish --provenance --access public\n"
  },
  {
    "path": ".github/workflows/pascal.yaml",
    "content": "name: pascal\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/pascal.yaml'\n      - 'cmake/**'\n      - 'pascal-api-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'sherpa-onnx/pascal-api/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: pascal-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  pascal:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, macos-15-intel, windows-2022, ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}\n\n      - name: Install Free pascal compiler (ubuntu)\n        if: matrix.os == 'ubuntu-latest' || matrix.os == 'ubuntu-22.04-arm'\n        shell: bash\n        run: |\n          sudo apt-get update\n          sudo apt-get install -q -y fpc\n\n      - name: Install Free pascal compiler (macos)\n        if: matrix.os == 'macos-latest' || matrix.os == 'macos-15-intel'\n        shell: bash\n        run: |\n          brew install fpc\n          # brew install --cask lazarus\n          #\n      - name: Install Free pascal compiler (windows)\n        if: matrix.os == 'windows-2022'\n        shell: bash\n        run: |\n          choco install lazarus\n\n          ls -lh /c/lazarus/fpc/3.2.2/bin/x86_64-win64/\n\n      - name: FPC info\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n          which fpc\n          fpc -i\n\n      - name: OS info\n        shell: bash\n        run: |\n          uname -a\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export 
CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          cmake --build . --target install --config Release\n\n          ls -lh install/lib/\n\n          if [[ ${{ matrix.os }} == 'windows-2022' ]]; then\n            cp -v install/lib/*.dll ../pascal-api-examples/non-streaming-asr\n            cp -v install/lib/*.dll ../pascal-api-examples/read-wav\n            cp -v install/lib/*.dll ../pascal-api-examples/speaker-diarization\n            cp -v install/lib/*.dll ../pascal-api-examples/speech-enhancement-gtcrn\n            cp -v install/lib/*.dll ../pascal-api-examples/speech-enhancement-dpdfnet\n            cp -v install/lib/*.dll ../pascal-api-examples/streaming-asr\n            cp -v install/lib/*.dll ../pascal-api-examples/streaming-speech-enhancement-gtcrn\n            cp -v install/lib/*.dll ../pascal-api-examples/streaming-speech-enhancement-dpdfnet\n            cp -v install/lib/*.dll ../pascal-api-examples/tts\n            cp -v install/lib/*.dll ../pascal-api-examples/vad\n            cp -v install/lib/*.dll ../pascal-api-examples/vad-with-non-streaming-asr\n\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/non-streaming-asr\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/read-wav\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/speaker-diarization\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/speech-enhancement-gtcrn\n            
cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/speech-enhancement-dpdfnet\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/streaming-asr\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/streaming-speech-enhancement-gtcrn\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/streaming-speech-enhancement-dpdfnet\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/tts\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/vad\n            cp -v ../sherpa-onnx/pascal-api/*.pas ../pascal-api-examples/vad-with-non-streaming-asr\n          fi\n\n      - name:  Run Speech Enhancement test\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd speech-enhancement-gtcrn\n          ./run-gtcrn.sh\n          ls -lh\n          popd\n\n          pushd speech-enhancement-dpdfnet\n          ./run-dpdfnet.sh\n          ls -lh\n          popd\n\n          pushd streaming-speech-enhancement-gtcrn\n          ./run-gtcrn.sh\n          ls -lh\n          popd\n\n          pushd streaming-speech-enhancement-dpdfnet\n          ./run-dpdfnet.sh\n          ls -lh\n          popd\n\n      - name:  Run Pascal test (TTS)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n          pushd tts\n\n          ./run-pocket-en.sh\n          rm -rf sherpa-onnx-pocket-*\n\n          ./run-piper.sh\n          rm -rf vits-piper-*\n          rm piper\n          ls -lh\n          echo \"---\"\n\n          ./run-kokoro-zh-en.sh\n          rm -rf kokoro-multi-*\n          rm kokoro-zh-en\n          ls -lh\n          echo \"---\"\n\n          ./run-kokoro-en.sh\n          rm -rf kokoro-en-*\n          rm kokoro-en\n          ls -lh\n          echo \"---\"\n\n          ./run-matcha-zh.sh\n        
  rm -rf matcha-icefall-*\n          rm matcha-zh\n          ls -lh\n          echo \"---\"\n\n          ./run-matcha-en.sh\n          rm -rf matcha-icefall-*\n          rm matcha-en\n          ls -lh\n          echo \"---\"\n\n          ./run-supertonic-en.sh\n          rm -rf sherpa-onnx-supertonic-*\n          rm supertonic-en\n          ls -lh\n          echo \"---\"\n\n          ./run-zipvoice-zh-en.sh\n          rm -rf sherpa-onnx-zipvoice-*\n          rm -f vocos_24khz.onnx\n          rm zipvoice-zh-en\n          ls -lh\n          echo \"---\"\n\n          popd\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-${{ matrix.os }}\n          path: ./pascal-api-examples/tts/*.wav\n\n      - name:  Run Pascal test (Non Streaming ASR)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd non-streaming-asr\n\n          ./run-funasr-nano.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-wenet-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-medasr-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-omnilingual-asr-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-nemo-canary.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-zipformer-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-dolphin-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-zipformer-transducer.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-moonshine-v2.sh\n\n          ./run-moonshine.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-fire-red-asr-ctc.sh\n          rm -rf sherpa-onnx-fire-red-asr*\n          echo \"---\"\n\n          ./run-fire-red-asr.sh\n          rm -rf 
sherpa-onnx-fire-red-asr*\n          echo \"---\"\n\n          ./run-whisper.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-nemo-transducer.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-nemo-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-sense-voice.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-telespeech-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-paraformer.sh\n\n          ./run-paraformer-itn.sh\n\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ls -lh\n          popd\n\n      - name:  Run Pascal test (Streaming ASR)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd streaming-asr\n\n          ./run-t-one-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-zipformer-transducer.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ./run-nemo-transducer.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          if [[ ${{ matrix.os }} != 'windows-2022' ]]; then\n            ./run-paraformer.sh\n            rm -rf sherpa-onnx-*\n            echo \"---\"\n\n            ./run-zipformer-ctc.sh\n            echo \"---\"\n\n            ./run-zipformer-ctc-hlg.sh\n            rm -rf sherpa-onnx-*\n            echo \"---\"\n          fi\n\n          ls -lh\n          popd\n\n      - name:  Run Pascal test (VAD test)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd vad\n          ./run-circular-buffer.sh\n          echo \"---\"\n\n          time ./run-remove-silence-ten-vad.sh\n          echo \"---\"\n\n          time ./run-remove-silence.sh\n          echo \"---\"\n\n          ls -lh\n\n          popd\n\n      
- name:  Run Pascal test (Speaker diarization)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n          pushd speaker-diarization\n\n          ./run.sh\n          rm -rfv *.onnx *.wav sherpa-onnx-*\n          ls -lh\n          echo \"---\"\n\n          popd\n\n      - name:  Run Pascal test (VAD + non-streaming ASR)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd vad-with-non-streaming-asr\n\n          time ./run-vad-with-zipformer-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          time ./run-vad-with-dolphin-ctc.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          time ./run-vad-with-moonshine.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          time ./run-vad-with-whisper.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          time ./run-vad-with-sense-voice.sh\n          rm -rf sherpa-onnx-*\n          echo \"---\"\n\n          ls -lh\n\n          popd\n\n      - name:  Run Pascal test (Read wav test)\n        shell: bash\n        run: |\n          export PATH=/c/lazarus/fpc/3.2.2/bin/x86_64-win64:$PATH\n\n          cd ./pascal-api-examples\n\n          pushd read-wav\n          ./run.sh\n          echo \"---\"\n          ls -lh\n          popd\n"
  },
  {
    "path": ".github/workflows/pkg-config.yaml",
    "content": "name: pkg-config\n\non:\n  push:\n    branches:\n      - master\n      - pkg-config\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: pkg-config-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  pkg_config:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.build_type }} ${{ matrix.lib_type }} tts-${{ matrix.tts }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest]\n        build_type: [Release, Debug]\n        lib_type: [shared, static]\n        tts: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.build_type }}-lib-${{ matrix.lib_type }}\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          if [[ ${{ matrix.lib_type }} == \"shared\" ]]; then\n            cmake -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.tts }} -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} -DCMAKE_INSTALL_PREFIX=./install -DBUILD_SHARED_LIBS=ON ..\n          else\n            cmake -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.tts }} -DSHERPA_ONNX_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=${{ matrix.build_type }} -DCMAKE_INSTALL_PREFIX=./install -DBUILD_SHARED_LIBS=OFF ..\n          fi\n\n      - name: Build sherpa-onnx for ${{ matrix.os }} ${{ matrix.build_type }} ${{ matrix.lib_type }} tts-${{ matrix.tts }}\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd 
build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n      - name: Install tree\n        if: matrix.os == 'ubuntu-latest'\n        shell: bash\n        run: |\n          sudo apt-get install tree\n\n      - name: Install tree\n        if: matrix.os == 'macos-latest'\n        shell: bash\n        run: |\n          brew install tree\n\n      - name: Display generated files of sherpa-onnx for ${{ matrix.os }} ${{ matrix.build_type }} ${{ matrix.lib_type }}\n        shell: bash\n        run: |\n          tree build/install\n          ls -lh build/install\n\n          cat build/install/sherpa-onnx.pc\n\n      - name: Show pkg-config\n        shell: bash\n        run: |\n          export PKG_CONFIG_PATH=$PWD/build/install:$PKG_CONFIG_PATH\n          pkg-config --cflags sherpa-onnx\n          pkg-config --libs sherpa-onnx\n\n      - name: Build C API example\n        shell: bash\n        run: |\n          export PKG_CONFIG_PATH=$PWD/build/install:$PKG_CONFIG_PATH\n          cd c-api-examples\n\n          pkg-config --cflags sherpa-onnx\n\n          gcc -o decode-file-c-api $(pkg-config --cflags sherpa-onnx) ./decode-file-c-api.c $(pkg-config --libs sherpa-onnx)\n\n          ./decode-file-c-api --help\n\n      - name: Build C API example (tts)\n        if: matrix.tts == 'ON'\n        shell: bash\n        run: |\n          export PKG_CONFIG_PATH=$PWD/build/install:$PKG_CONFIG_PATH\n          cd c-api-examples\n\n          pkg-config --cflags sherpa-onnx\n\n          gcc -o offline-tts-c-api $(pkg-config --cflags sherpa-onnx) ./offline-tts-c-api.c $(pkg-config --libs sherpa-onnx)\n\n          ./offline-tts-c-api --help\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/c-api-examples:$PATH\n          export EXE=decode-file-c-api\n\n          .github/scripts/test-online-transducer.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: 
tts-generated-test-files-${{ matrix.os }}-${{ matrix.build_type }}-${{ matrix.lib_type }}-tts-${{ matrix.tts }}\n          path: tts\n"
  },
  {
    "path": ".github/workflows/release-dart-package.yaml",
    "content": "name: release-dart\n\non:\n  push:\n    branches:\n      - ci-pub-dart\n    tags:\n      - 'dart-v[0-9]+.[0-9]+.[0-9]+*' # tag-pattern on pub.dev: 'v{{version}}'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: release-dart-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  build_linux_libs_x64:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n            quay.io/pypa/manylinux2014_x86_64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                -D SHERPA_ONNX_ENABLE_JNI=OFF \\\n                -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh ./install/lib\n            '\n\n      - name: Create tar file\n        shell: bash\n        run: |\n          mkdir x64\n          dst=x64\n          cp -v build/install/lib/lib* $dst\n          tar cjvf $dst.tar.bz2 $dst\n          ls -lh *.tar.bz2\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: linux-x64\n          path: 
./*.tar.bz2\n\n  build_linux_libs_aarch64:\n    # if: false\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          docker run --rm \\\n            --volume ${{ github.workspace }}/:/home/runner/work/sherpa-onnx/sherpa-onnx \\\n            quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              uname -a\n              gcc --version\n              cmake --version\n              cat /etc/*release\n              id\n              pwd\n\n              cd /home/runner/work/sherpa-onnx/sherpa-onnx\n\n              mkdir build\n              cd build\n\n              cmake \\\n                -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n                -D SHERPA_ONNX_ENABLE_TTS=ON \\\n                -D CMAKE_BUILD_TYPE=Release \\\n                -D BUILD_SHARED_LIBS=ON \\\n                -D CMAKE_INSTALL_PREFIX=./install \\\n                -D SHERPA_ONNX_ENABLE_JNI=OFF \\\n                -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n                ..\n\n              make -j2\n              make install\n\n              ls -lh ./install/lib\n            '\n\n      - name: Create tar file\n        shell: bash\n        run: |\n          mkdir aarch64\n          dst=aarch64\n          cp -v build/install/lib/lib* $dst\n          tar cjvf $dst.tar.bz2 $dst\n          ls -lh *.tar.bz2\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: linux-aarch64\n          path: ./*.tar.bz2\n\n  sherpa_onnx_linux:\n    needs: [build_linux_libs_x64, build_linux_libs_aarch64]\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx_linux\n    runs-on: 
ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx_linux\n          pushd $src_dir\n          v=\"version: $SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Retrieve artifact from linux x64\n        uses: actions/download-artifact@v4\n        with:\n          name: linux-x64\n          path: /tmp\n\n      - name: Retrieve artifact from linux aarch64\n        uses: actions/download-artifact@v4\n        with:\n          name: linux-aarch64\n          path: /tmp\n\n      - name: Show files\n        shell: bash\n        run: |\n          cd /tmp\n          tar xvf x64.tar.bz2\n          tar xvf aarch64.tar.bz2\n\n          echo \"----x64---\"\n          ls -lh /tmp/x64/\n          echo \"----aarch64---\"\n          ls -lh /tmp/aarch64/\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          dst=flutter/sherpa_onnx_linux\n\n          mkdir $dst/example\n\n          cp -v flutter/sherpa_onnx/example/* $dst/example\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n\n          git status\n\n      - name: Copy pre-built libs\n        shell: bash\n        run: |\n          cp -v /tmp/x64/lib*.so* flutter/sherpa_onnx_linux/linux/x64\n          cp -v /tmp/aarch64/lib*.so* flutter/sherpa_onnx_linux/linux/aarch64\n\n          mv -v flutter/sherpa_onnx_linux /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published/linux\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: 
master\n          version: 3.24.0\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n\n  sherpa_onnx_macos:\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx_macos\n    runs-on: macos-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-flutter-release-package\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx_macos\n          pushd $src_dir\n          v=\"version: $SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          dst=flutter/sherpa_onnx_macos\n\n          mkdir $dst/example\n\n          cp -v flutter/sherpa_onnx/example/* $dst/example\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n\n          git status\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          cmake \\\n            -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -D SHERPA_ONNX_ENABLE_TTS=ON \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON 
\\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D SHERPA_ONNX_ENABLE_JNI=OFF \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -D CMAKE_OSX_ARCHITECTURES=\"x86_64;arm64\" \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          cd build\n          make -j2 install\n\n          ls -lh install/lib/libsherpa-onnx-c-api.dylib\n          file install/lib/libsherpa-onnx-c-api.dylib\n          rm -v install/lib/libonnxruntime.dylib\n\n      - name: Copy pre-built libs\n        shell: bash\n        run: |\n          cp -v build/install/lib/lib*.dylib* flutter/sherpa_onnx_macos/macos/\n\n          mv -v flutter/sherpa_onnx_macos /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published/macos\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          du -h -d1 .\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n\n  sherpa_onnx_windows:\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx_windows\n    runs-on: windows-2022\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx_windows\n          pushd $src_dir\n          v=\"version: 
$SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          dst=flutter/sherpa_onnx_windows\n\n          mkdir $dst/example\n\n          cp -v flutter/sherpa_onnx/example/* $dst/example\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n\n          git status\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -D SHERPA_ONNX_ENABLE_TTS=ON \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D SHERPA_ONNX_ENABLE_JNI=OFF \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          cd build\n          cmake --build . 
--target install --config Release -- -m:2\n\n          ls -lh install/lib/*.dll\n\n      - name: Copy pre-built libs\n        shell: bash\n        run: |\n          cp -v build/install/lib/*.dll flutter/sherpa_onnx_windows/windows/\n          mv -v flutter/sherpa_onnx_windows /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published/windows\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n\n  sherpa_onnx_android:\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx_android\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-flutter-release-package-android\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx_android\n          pushd $src_dir\n          v=\"version: $SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          dst=flutter/sherpa_onnx_android\n\n          mkdir $dst/example\n\n          cp -v flutter/sherpa_onnx/example/* $dst/example\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n\n          git 
status\n\n      - name: Build android-arm64-v8a\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_JNI=OFF\n          export SHERPA_ONNX_ENABLE_BINARY=OFF\n\n          ./build-android-arm64-v8a.sh\n\n      - name: Build android-armv7-eabi\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_JNI=OFF\n          export SHERPA_ONNX_ENABLE_BINARY=OFF\n\n          ./build-android-armv7-eabi.sh\n\n      - name: Build android-x86\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_JNI=OFF\n          export SHERPA_ONNX_ENABLE_BINARY=OFF\n\n          ./build-android-x86.sh\n\n      - name: Build android-x86-64\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export SHERPA_ONNX_ENABLE_C_API=ON\n          export SHERPA_ONNX_ENABLE_JNI=OFF\n          export SHERPA_ONNX_ENABLE_BINARY=OFF\n\n          ./build-android-x86-64.sh\n\n      - name: Copy pre-built libs\n        shell: bash\n        run: |\n          echo \"----arm64-v8a----\"\n          cp -v build-android-arm64-v8a/install/lib/lib*.so flutter/sherpa_onnx_android/android/src/main/jniLibs/arm64-v8a/\n\n          echo \"----armv7-eabi----\"\n          cp -v 
build-android-armv7-eabi/install/lib/lib*.so flutter/sherpa_onnx_android/android/src/main/jniLibs/armeabi-v7a\n\n          echo \"----x86----\"\n          cp -v build-android-x86/install/lib/lib*.so flutter/sherpa_onnx_android/android/src/main/jniLibs/x86\n\n          echo \"----x86_64----\"\n          cp -v build-android-x86-64/install/lib/lib*.so flutter/sherpa_onnx_android/android/src/main/jniLibs/x86_64\n\n          mv -v flutter/sherpa_onnx_android /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          du -h -d1 .\n\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n\n  sherpa_onnx_ios:\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx_ios\n    runs-on: macos-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-flutter-release-package-ios\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx_ios\n          pushd $src_dir\n          v=\"version: $SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          
dst=flutter/sherpa_onnx_ios\n\n          mkdir $dst/example\n\n          cp -v flutter/sherpa_onnx/example/* $dst/example\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n\n          git status\n\n      - name: Build ios\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n          ./build-ios-shared.sh\n\n      - name: Copy pre-built libs\n        shell: bash\n        run: |\n          echo \"----ios arm64 and arm64_x64_simulator----\"\n          cp -av build-ios-shared/sherpa_onnx.xcframework flutter/sherpa_onnx_ios/ios/\n\n          mv -v flutter/sherpa_onnx_ios /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          du -h -d1 .\n\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n\n  sherpa_onnx:\n    needs: [sherpa_onnx_linux, sherpa_onnx_macos, sherpa_onnx_windows, sherpa_onnx_android, sherpa_onnx_ios]\n    # if: false\n    permissions:\n      id-token: write # Required for authentication using OIDC\n    name: sherpa_onnx\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Flutter SDK\n        uses: flutter-actions/setup-flutter@v3\n        with:\n          channel: stable\n          version: latest\n\n      - uses: dart-lang/setup-dart@v1\n\n      - name: Fix version\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" 
./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          src_dir=$PWD/flutter/sherpa_onnx\n          pushd $src_dir\n          v=\"version: $SHERPA_ONNX_VERSION\"\n          echo \"v: $v\"\n          sed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\n          rm *.bak\n          git status\n          git diff\n\n      - name: Copy extra files\n        shell: bash\n        run: |\n          dst=flutter/sherpa_onnx\n\n          cp -v LICENSE $dst/\n          cp -v CHANGELOG.md $dst/\n          cp -v README.md $dst/\n\n          git status\n\n          mv -v flutter/sherpa_onnx /tmp/to_be_published\n\n          ls -lh /tmp/to_be_published\n\n      - name: Release\n        shell: bash\n        run: |\n          cd /tmp/to_be_published\n          du -h -d1 .\n\n          flutter pub get\n          flutter pub publish --dry-run\n          flutter pub publish --force\n"
  },
  {
    "path": ".github/workflows/release-go.yaml",
    "content": "name: release-go\n\non:\n  workflow_dispatch:\n\nconcurrency:\n  group: release-go-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  release_go:\n    name: Release go\n    runs-on: ubuntu-latest\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Add SSH key\n        run: |\n          mkdir -p ~/.ssh/\n          cp scripts/go/ssh_config ~/.ssh/config\n          echo \"${{ secrets.MY_GITHUB_SSH_KEY }}\" > ~/.ssh/github && chmod 600 ~/.ssh/github\n          ssh github.com || true\n\n      - name: Release\n        shell: bash\n        run: |\n          cd scripts/go\n          ./release.sh\n"
  },
  {
    "path": ".github/workflows/release-rust.yaml",
    "content": "name: Publish Rust Crates\n\non:\n  push:\n    branches:\n      - release-rust\n\n  workflow_dispatch:\n\njobs:\n  publish:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n        with:\n          toolchain: stable\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Update\n        shell: bash\n        run: |\n          cd sherpa-onnx/rust\n          ./publish.sh\n\n      - name: Login to crates.io\n        run: cargo login ${{ secrets.CARGO_REGISTRY_TOKEN }}\n\n      - name: Publish sherpa-onnx-sys\n        shell: bash\n        env:\n          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}\n        run: |\n          cargo publish --allow-dirty --manifest-path=sherpa-onnx/rust/sherpa-onnx-sys/Cargo.toml\n          sleep 30  # Wait for crates.io to index\n\n      - name: Publish sherpa-onnx\n        shell: bash\n        env:\n          CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}\n        run: |\n          cargo publish --allow-dirty --manifest-path=sherpa-onnx/rust/sherpa-onnx/Cargo.toml\n"
  },
  {
    "path": ".github/workflows/riscv64-linux.yaml",
    "content": "name: riscv64-linux\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/riscv64-linux.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/riscv64-linux-gnu.toolchain.cmake'\n      - 'build-riscv64-linux-gnu.sh'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: riscv64-linux-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  riscv64_linux:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.lib_type }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        lib_type: [shared] #, static]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-riscv64-${{ matrix.lib_type }}\n\n      - name: cache-qemu\n        id: cache-qemu\n        uses: actions/cache@v4\n        with:\n          path: qemu-install\n          key: qemu-riscv-xuantie-install-20240306\n\n      - name: qemu\n        if: steps.cache-qemu.outputs.cache-hit != 'true'\n        run: |\n          # https://pypi.org/project/xuantie-qemu/#files\n          wget -q https://files.pythonhosted.org/packages/21/f4/733f29c435987e8bb264a6504c7a4ea4c04d0d431b38a818ab63eef082b9/xuantie_qemu-20230825-py3-none-manylinux1_x86_64.whl\n          unzip xuantie_qemu-20230825-py3-none-manylinux1_x86_64.whl\n          mkdir -p qemu-install/bin\n\n          cp -v ./qemu/qemu-riscv64 ./qemu-install/bin\n\n      - name: cache-toolchain\n        id: cache-toolchain\n        uses: actions/cache@v4\n        with:\n          path: toolchain\n          key: Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz\n\n      - name: Download toolchain\n     
   if: steps.cache-toolchain.outputs.cache-hit != 'true'\n        shell: bash\n        run: |\n          wget -q https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1663142514282/Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz\n\n          mkdir $GITHUB_WORKSPACE/toolchain\n\n          tar xvf ./Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz --strip-components 1 -C $GITHUB_WORKSPACE/toolchain\n          ls -lh $GITHUB_WORKSPACE/toolchain/bin\n\n      - name: Display toolchain info\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          riscv64-unknown-linux-gnu-gcc --version\n\n      - name: Display qemu-riscv64 -h\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          qemu-riscv64 -h\n\n      - name: build riscv64-linux\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cmake --version\n\n          lib_type=${{ matrix.lib_type }}\n\n          if [[ $lib_type == \"shared\" ]]; then\n            export BUILD_SHARED_LIBS=ON\n          else\n            export BUILD_SHARED_LIBS=OFF\n          fi\n\n          ./build-riscv64-linux-gnu.sh\n\n          ls -lh build-riscv64-linux-gnu/bin\n          ls -lh build-riscv64-linux-gnu/lib\n\n          echo \"---install/lib---\"\n          ls -lh build-riscv64-linux-gnu/install/lib\n\n          echo \"---install/bin---\"\n          ls -lh build-riscv64-linux-gnu/install/bin\n\n          file build-riscv64-linux-gnu/bin/sherpa-onnx\n\n          readelf -d build-riscv64-linux-gnu/bin/sherpa-onnx\n\n      - name: Copy files\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n   
       riscv64-unknown-linux-gnu-strip --version\n\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-riscv64-${{ matrix.lib_type }}\n          mkdir $dst\n\n          cp -v $GITHUB_WORKSPACE/toolchain/sysroot/lib/ld-linux-riscv64xthead-lp64d.so.1 build-riscv64-linux-gnu/install/lib/\n\n          ls -lh build-riscv64-linux-gnu/install/lib\n\n          cp -a build-riscv64-linux-gnu/install/bin $dst/\n          ls -lh $dst/bin/*\n          riscv64-unknown-linux-gnu-strip $dst/bin/*\n          ls -lh $dst\n\n          lib_type=${{ matrix.lib_type }}\n          if [[ $lib_type == \"shared\" ]]; then\n            cp -a build-riscv64-linux-gnu/install/lib $dst/\n            rm -fv $dst/lib/libasound.so\n            rm -fv $dst/lib/libonnxruntime.so\n          fi\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'shared'\n        with:\n          name: sherpa-onnx-linux-riscv64-shared\n          path: sherpa-onnx-*linux-riscv64-shared.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export 
GIT_CLONE_PROTECTION_ACTIVE=false\n\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=riscv64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*-shared.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-riscv64-shared.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'static'\n        with:\n          name: sherpa-onnx-linux-riscv64-static\n          path: sherpa-onnx-*linux-riscv64-static.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for riscv64 linux ${{ matrix.lib_type }}\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-riscv64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.11\n\n      - name: Release pre-compiled binaries and libs for riscv64 linux ${{ matrix.lib_type }}\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-riscv64*.tar.bz2\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          
export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n\n          ls -lh ./build-riscv64-linux-gnu/bin\n\n          echo \"----------sherpa-onnx----------\"\n          qemu-riscv64 ./build-riscv64-linux-gnu/bin/sherpa-onnx --help\n          readelf -d ./build-riscv64-linux-gnu/bin/sherpa-onnx\n\n          echo \"----------sherpa-onnx-offline----------\"\n          qemu-riscv64 ./build-riscv64-linux-gnu/bin/sherpa-onnx-offline --help\n          readelf -d ./build-riscv64-linux-gnu/bin/sherpa-onnx-offline\n\n          echo \"----------sherpa-onnx-offline-tts----------\"\n          qemu-riscv64 ./build-riscv64-linux-gnu/bin/sherpa-onnx-offline-tts --help\n          readelf -d ./build-riscv64-linux-gnu/bin/sherpa-onnx-offline-tts\n\n      - name: Test streaming speech recognition\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n\n          qemu-riscv64 ./build-riscv64-linux-gnu/bin/sherpa-onnx \\\n            --tokens=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/tokens.txt \\\n            --encoder=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/encoder-epoch-99-avg-1.onnx \\\n            --decoder=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/decoder-epoch-99-avg-1.onnx \\\n            --joiner=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/joiner-epoch-99-avg-1.onnx \\\n            ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/test_wavs/0.wav\n\n      - name: Test offline tts\n        
shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2\n          tar xf vits-piper-en_US-lessac-medium.tar.bz2\n          rm vits-piper-en_US-lessac-medium.tar.bz2\n\n          qemu-riscv64 ./build-riscv64-linux-gnu/bin/sherpa-onnx-offline-tts \\\n            --vits-model=./vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx \\\n            --vits-data-dir=./vits-piper-en_US-lessac-medium/espeak-ng-data \\\n            --vits-tokens=./vits-piper-en_US-lessac-medium/tokens.txt \\\n            --output-filename=./liliana-piper-en_US-lessac-medium.wav \\\n            'liliana, the most beautiful and lovely assistant of our team!'\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'shared'\n        with:\n          name: wave\n          path: ./*.wav\n"
  },
  {
    "path": ".github/workflows/riscv64-spacemit-linux.yaml",
    "content": "name: riscv64-spacemit-linux\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/riscv64-spacemit-linux.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/riscv64-linux-gnu-spacemit.toolchain.cmake'\n      - 'build-riscv64-linux-gnu-spacemit.sh'\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: riscv64-spacemit-linux-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  riscv64_spacemit_linux:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.lib_type }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        lib_type: [shared] #, static]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-riscv64-spacemit-${{ matrix.lib_type }}\n\n      - name: cache-qemu\n        id: cache-qemu\n        uses: actions/cache@v4\n        with:\n          path: qemu-install\n          key: qemu-riscv-spacemit-install-20250818\n\n      - name: qemu\n        if: steps.cache-qemu.outputs.cache-hit != 'true'\n        run: |\n          wget -q https://archive.spacemit.com/spacemit-ai/qemu/jdsk-qemu-v10.0.2.tar.gz\n          tar -xf jdsk-qemu-v10.0.2.tar.gz\n          mkdir -p qemu-install/bin\n\n          cp -v ./jdsk-qemu/bin/qemu-riscv64 ./qemu-install/bin\n\n      - name: cache-toolchain\n        id: cache-toolchain\n        uses: actions/cache@v4\n        with:\n          path: toolchain\n          key: https://archive.spacemit.com/toolchain/spacemit-toolchain-linux-glibc-x86_64-v1.1.2.tar.xz\n\n      - name: Download toolchain\n        if: steps.cache-toolchain.outputs.cache-hit != 'true'\n        shell: 
bash\n        run: |\n          wget -q https://archive.spacemit.com/toolchain/spacemit-toolchain-linux-glibc-x86_64-v1.1.2.tar.xz\n\n          mkdir $GITHUB_WORKSPACE/toolchain\n\n          tar xvf spacemit-toolchain-linux-glibc-x86_64-v1.1.2.tar.xz --strip-components 1 -C $GITHUB_WORKSPACE/toolchain\n          ls -lh $GITHUB_WORKSPACE/toolchain/bin\n\n      - name: Display toolchain info\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          riscv64-unknown-linux-gnu-gcc --version\n\n      - name: Display qemu-riscv64 -h\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          qemu-riscv64 -h\n\n      - name: build riscv64-spacemit-linux\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cmake --version\n\n          lib_type=${{ matrix.lib_type }}\n\n          if [[ $lib_type == \"shared\" ]]; then\n            export BUILD_SHARED_LIBS=ON\n          else\n            export BUILD_SHARED_LIBS=OFF\n          fi\n\n          export RISCV_ROOT_PATH=$GITHUB_WORKSPACE/toolchain\n          ./build-riscv64-linux-gnu-spacemit.sh\n\n          ls -lh build-riscv64-linux-gnu-spacemit/bin\n          ls -lh build-riscv64-linux-gnu-spacemit/lib\n\n          echo \"---install/lib---\"\n          ls -lh build-riscv64-linux-gnu-spacemit/install/lib\n\n          echo \"---install/bin---\"\n          ls -lh build-riscv64-linux-gnu-spacemit/install/bin\n\n          file build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx\n\n          readelf -d build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx\n\n      - name: Copy files\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n     
     riscv64-unknown-linux-gnu-strip --version\n\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-riscv64-spacemit-${{ matrix.lib_type }}\n          mkdir $dst\n\n          cp -v $GITHUB_WORKSPACE/toolchain/sysroot/lib/ld-linux-riscv64-lp64d.so.1 build-riscv64-linux-gnu-spacemit/install/lib/\n\n          ls -lh build-riscv64-linux-gnu-spacemit/install/lib\n\n          cp -a build-riscv64-linux-gnu-spacemit/install/bin $dst/\n          ls -lh $dst/bin/*\n          riscv64-unknown-linux-gnu-strip $dst/bin/*\n          ls -lh $dst\n\n          lib_type=${{ matrix.lib_type }}\n          if [[ $lib_type == \"shared\" ]]; then\n            cp -a build-riscv64-linux-gnu-spacemit/install/lib $dst/\n            rm -fv $dst/lib/libasound.so\n          fi\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'shared'\n        with:\n          name: sherpa-onnx-linux-riscv64-spacemit-shared\n          path: sherpa-onnx-*linux-riscv64-spacemit-shared.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export 
GIT_CLONE_PROTECTION_ACTIVE=false\n\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=riscv64-spacemit/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*-shared.tar.bz2 $dst/\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-linux-riscv64-spacemit-shared.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'static'\n        with:\n          name: sherpa-onnx-linux-riscv64-spacemit-static\n          path: sherpa-onnx-*linux-riscv64-spacemit-static.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for riscv64 linux ${{ matrix.lib_type }}\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-riscv64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.11\n\n      - name: Release pre-compiled binaries and libs for riscv64 linux ${{ matrix.lib_type }}\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-riscv64*.tar.bz2\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export 
QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n          export QEMU_ARGS=\"-cpu max,vlen=256,elen=64,vext_spec=v1.0\"\n\n          ls -lh ./build-riscv64-linux-gnu-spacemit/bin\n\n          echo \"----------sherpa-onnx----------\"\n          qemu-riscv64 ${QEMU_ARGS} ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx --help\n          readelf -d ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx\n\n          echo \"----------sherpa-onnx-offline----------\"\n          qemu-riscv64 ${QEMU_ARGS} ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx-offline --help\n          readelf -d ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx-offline\n\n          echo \"----------sherpa-onnx-offline-tts----------\"\n          qemu-riscv64 ${QEMU_ARGS} ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx-offline-tts --help\n          readelf -d ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx-offline-tts\n\n      - name: Test streaming speech recognition\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n          export QEMU_ARGS=\"-cpu max,vlen=256,elen=64,vext_spec=v1.0\"\n          echo \"Some mistakes in ep graph partition, disable op Gather for spacemit-ep now, will be fixed soon.\"\n          export SPACEMIT_EP_DISABLE_OP_TYPE_FILTER=\"Gather\"\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n\n          qemu-riscv64 ${QEMU_ARGS} ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx \\\n            
--provider=spacemit \\\n            --tokens=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/tokens.txt \\\n            --encoder=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/encoder-epoch-99-avg-1.onnx \\\n            --decoder=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/decoder-epoch-99-avg-1.onnx \\\n            --joiner=./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/joiner-epoch-99-avg-1.onnx \\\n            ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/test_wavs/0.wav\n\n      - name: Test offline tts\n        shell: bash\n        run: |\n          export PATH=$GITHUB_WORKSPACE/toolchain/bin:$PATH\n          export PATH=$GITHUB_WORKSPACE/qemu-install/bin:$PATH\n          export QEMU_LD_PREFIX=$GITHUB_WORKSPACE/toolchain/sysroot\n          export LD_LIBRARY_PATH=$GITHUB_WORKSPACE/toolchain/sysroot/lib\n          export QEMU_ARGS=\"-cpu max,vlen=256,elen=64,vext_spec=v1.0\"\n          echo \"Some mistakes in ep graph partition, disable op Gather;Cast;ConvTranspose for spacemit-ep now, will be fixed soon.\"\n          export SPACEMIT_EP_DISABLE_OP_TYPE_FILTER=\"Gather;Cast;ConvTranspose\"\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2\n          tar xf vits-piper-en_US-lessac-medium.tar.bz2\n          rm vits-piper-en_US-lessac-medium.tar.bz2\n\n          qemu-riscv64 ${QEMU_ARGS} ./build-riscv64-linux-gnu-spacemit/bin/sherpa-onnx-offline-tts \\\n            --provider=spacemit \\\n            --vits-model=./vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx \\\n            --vits-data-dir=./vits-piper-en_US-lessac-medium/espeak-ng-data \\\n            --vits-tokens=./vits-piper-en_US-lessac-medium/tokens.txt \\\n            --output-filename=./liliana-piper-en_US-lessac-medium.wav \\\n            'liliana, the most beautiful and lovely assistant of our team!'\n\n      - uses: actions/upload-artifact@v4\n        if: matrix.lib_type == 'shared'\n      
  with:\n          name: wave\n          path: ./*.wav\n"
  },
  {
    "path": ".github/workflows/rknn-linux-aarch64.yaml",
    "content": "name: rknn-linux-aarch64\n\non:\n  push:\n    branches:\n      - master\n      - ci-rknn-bins\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/rknn-linux-aarch64.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/csrc/rknn/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n  pull_request:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/rknn-linux-aarch64.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/csrc/rknn/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'toolchains/aarch64-linux-gnu.toolchain.cmake'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: rknn-linux-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  rknn_linux_aarch64:\n    runs-on: ${{ matrix.os }}\n    name: rknn shared ${{ matrix.shared }}\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - os: ubuntu-22.04-arm\n            shared: ON\n          - os: ubuntu-22.04-arm\n            shared: OFF\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.shared }}-rknn-linux-aarch64\n\n      - name: Download rknn-toolkit2\n        shell: bash\n        run: |\n          git clone --depth 1 https://github.com/airockchip/rknn-toolkit2\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --volume ${{ github.workspace }}/:/k2-fsa/sherpa-onnx \\\n              quay.io/pypa/manylinux_2_28_aarch64 \\\n            bash -c '\n              uname -a\n              which gcc\n\n              gcc --version\n              g++ --version\n\n\n              cmake 
--version\n\n\n              cd /k2-fsa/sherpa-onnx/\n\n              echo \"pwd\"\n\n              ls -lh\n\n              git clone --depth 1 --branch v1.2.12 https://github.com/alsa-project/alsa-lib\n              pushd alsa-lib\n              ./gitcompile\n              popd\n\n              ls -lh $PWD/alsa-lib/src/.libs\n\n              strings $PWD/alsa-lib/src/.libs/libasound.so.2.0.0 | grep \"^GLIBC\"\n\n              export CPLUS_INCLUDE_PATH=$PWD/alsa-lib/include:$CPLUS_INCLUDE_PATH\n              export C_INCLUDE_PATH=$PWD/alsa-lib/include:$C_INCLUDE_PATH\n              export SHERPA_ONNX_ALSA_LIB_DIR=$PWD/alsa-lib/src/.libs\n              p=$PWD\n\n              export SHERPA_ONNX_RKNN_TOOLKIT2_PATH=$PWD/rknn-toolkit2\n              export SHERPA_ONNX_RKNN_TOOLKIT2_LIB_DIR=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/aarch64\n              export CPLUS_INCLUDE_PATH=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/include:$CPLUS_INCLUDE_PATH\n\n              export SHERPA_ONNX_ENABLE_ALSA=1\n\n              mkdir build\n              cd build\n\n              BUILD_SHARED_LIBS=${{ matrix.shared }}\n\n              cmake \\\n                -DALSA_INCLUDE_DIR=$p/alsa-lib/include \\\n                -DALSA_LIBRARY=$p/alsa-lib/src/.libs/libasound.so \\\n                -DCMAKE_INSTALL_PREFIX=./install \\\n                -DSHERPA_ONNX_ENABLE_RKNN=ON \\\n                -DBUILD_SHARED_LIBS=$BUILD_SHARED_LIBS \\\n                ..\n\n              make -j4 install\n\n              rm -rf install/lib/pkgconfig\n              rm -fv install/lib/cargs.h\n              rm -fv install/lib/libcargs.so\n            '\n\n      - name: Display system info\n        shell: bash\n        run: |\n          uname -a\n          gcc --version\n          g++ --version\n\n      - name: Display generated files\n        shell: bash\n        run: |\n          export 
SHERPA_ONNX_RKNN_TOOLKIT2_PATH=$PWD/rknn-toolkit2\n          export LD_LIBRARY_PATH=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/aarch64:$LD_LIBRARY_PATH\n\n          cd build/install\n\n          ls -lh bin\n\n          echo \"---\"\n\n          ls -lh lib\n\n          file bin/sherpa-onnx\n\n          readelf -d bin/sherpa-onnx\n\n          ldd bin/sherpa-onnx\n\n          ./bin/sherpa-onnx --help\n\n          echo \"---\"\n          strings bin/sherpa-onnx | grep \"^GLIBC\"\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          if [[ ${{ matrix.shared }} == ON ]]; then\n            suffix=shared\n          else\n            suffix=static\n          fi\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-rknn-linux-aarch64-$suffix\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n\n          if [[ ${{ matrix.shared }} == ON ]]; then\n            mkdir -p $dst/lib\n            cp -v build/install/lib/lib*.so $dst/lib/\n          fi\n\n          ls -lh build/install/lib\n          ls -lh build/install/bin\n\n          ls -lh $dst/bin/\n          echo \"strip\"\n          strip $dst/bin/*\n\n          echo \"after strip\"\n          ls -lh $dst/bin/\n\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-linux-linux-aarch64-shared-${{ matrix.shared }}\n          path: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          
max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=rknn-linux-aarch64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*rknn*-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}-rknn-linux-aarch64.tar.bz2\"\n\n            git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for rknn linux aarch64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n\n      - name: Release pre-compiled binaries and libs for rknn linux aarch64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*linux-aarch64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.13\n\n      - name: Test offline Moonshine\n        if: matrix.build_type != 'Debug'\n        
shell: bash\n        run: |\n          du -h -d1 .\n\n          export SHERPA_ONNX_RKNN_TOOLKIT2_PATH=$PWD/rknn-toolkit2\n          export LD_LIBRARY_PATH=$SHERPA_ONNX_RKNN_TOOLKIT2_PATH/rknpu2/runtime/Linux/librknn_api/aarch64:$LD_LIBRARY_PATH\n\n          export PATH=$PWD/build/install/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          readelf -d build/bin/sherpa-onnx-offline\n\n          .github/scripts/test-offline-moonshine.sh\n"
  },
  {
    "path": ".github/workflows/run-java-test.yaml",
    "content": "name: run-java-test\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/run-java-test.yaml'\n      - 'cmake/**'\n      - 'java-api-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/jni/*'\n      - 'sherpa-onnx/java-api/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: run-java-test-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  run_java_test:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, macos-14]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-java\n\n      - name: OS info\n        shell: bash\n        run: |\n          uname -a\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Display java version\n        shell: bash\n        run: |\n          java -version\n          java -help\n          echo \"----\"\n          javac -version\n          javac -help\n          echo \"JAVA_HOME is: ${JAVA_HOME}\"\n\n          cmake --version\n\n      - name:  Build sherpa-onnx (jar)\n        shell: bash\n        run: |\n          cd sherpa-onnx/java-api/\n          make\n          ls -lh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-jar-${{ matrix.os }}\n          path: sherpa-onnx/java-api/build\n\n      - name:  Build sherpa-onnx (C++)\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          mkdir build\n    
      cd build\n\n          cmake \\\n            -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n            -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n            -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_ENABLE_JNI=ON \\\n            ..\n\n            make -j4\n            ls -lh lib\n\n      - name:  Run java version test\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-version-test.sh\n\n      - name:  Run java test (Non-Streaming ASR)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n\n          ./run-non-streaming-decode-file-fire-red-asr-ctc.sh\n\n          ./run-non-streaming-decode-file-zipformer-ctc.sh\n          rm -rf sherpa-onnx-zipformer-ctc-*\n\n          ./run-non-streaming-decode-file-dolphin-ctc.sh\n          rm -rf sherpa-onnx-dolphin-*\n\n          ./run-non-streaming-decode-file-moonshine-v2.sh\n          ./run-non-streaming-decode-file-moonshine.sh\n          rm -rf sherpa-onnx-moonshine-*\n\n          ./run-non-streaming-decode-file-sense-voice.sh\n          rm -rf sherpa-onnx-sense-voice-*\n\n          ./run-inverse-text-normalization-paraformer.sh\n\n          ./run-non-streaming-decode-file-paraformer.sh\n          rm -rf sherpa-onnx-paraformer-zh-*\n\n          ./run-non-streaming-decode-file-transducer.sh\n          rm -rf sherpa-onnx-zipformer-*\n\n          ./run-non-streaming-decode-file-fire-red-asr.sh\n          rm -rf sherpa-onnx-fire-red-*\n\n          ./run-non-streaming-decode-file-whisper.sh\n\n          ./run-non-streaming-decode-file-whisper-multiple.sh\n          rm -rf sherpa-onnx-whisper-*\n\n          ./run-non-streaming-decode-file-nemo.sh\n          rm -rf sherpa-onnx-nemo-*\n\n      - name:  Run java test (FunASR Nano)\n        shell: bash\n        run: |\n          cd 
./java-api-examples\n          ./run-non-streaming-decode-file-funasr-nano.sh\n          rm -rf sherpa-onnx-funasr-*\n\n      - name:  Run java test (MedASR CTC)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-decode-file-medasr-ctc.sh\n          rm -rf sherpa-onnx-medasr-*\n\n      - name:  Run java test (Omnilingual ASR CTC)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-decode-file-omnilingual-asr-ctc.sh\n          rm -rf sherpa-onnx-omnilingual-*\n\n      - name:  Run java test (WeNet CTC)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-decode-file-wenet-ctc.sh\n          rm -rf sherpa-onnx-wenet*\n\n      - name:  Run java test (Streaming T-one)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-streaming-decode-file-tone-ctc.sh\n          rm -rf sherpa-onnx-streaming-t-one-*\n\n      - name:  Run java test (Nemo Canary)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-decode-file-nemo-canary.sh\n          rm -rf sherpa-onnx-nemo-*\n\n      - name:  Run java test (Non-streaming SenseVoice with homophone replacer)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-decode-file-sense-voice-with-hr.sh\n          rm -rf sherpa-onnx-sense-*\n          rm -rf dict lexicon.txt replace.fst\n\n      - name:  Run java test (VAD + Non-streaming Dolphin CTC)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-vad-non-streaming-dolphin-ctc.sh\n          rm *.onnx\n          ls -lh *.wav\n          rm *.wav\n          rm -rf sherpa-onnx-dolphin-*\n\n      - name:  Run speech enhancement\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-non-streaming-speech-enhancement-gtcrn.sh\n          
./run-non-streaming-speech-enhancement-dpdfnet.sh\n          ./run-streaming-speech-enhancement-gtcrn.sh\n          ./run-streaming-speech-enhancement-dpdfnet.sh\n          ls -lh *.wav\n\n          rm -fv gtcrn_simple.onnx dpdfnet_baseline.onnx *.wav\n\n      - name:  Run java test (Online add punctuations)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-online-add-punctuation-zh-en.sh\n          # Delete model files to save space\n          rm -rf sherpa-onnx-online-*\n\n      - name:  Run java test (Offline add punctuations)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-offline-add-punctuation-zh-en.sh\n          # Delete model files to save space\n          rm -rf sherpa-onnx-punct-*\n\n      - name:  Run java test (speaker diarization)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-offline-speaker-diarization.sh\n          rm -rfv *.onnx *.wav sherpa-onnx-pyannote-*\n\n      - name:  Run java test (kws)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-kws-from-file.sh\n          rm -rf sherpa-onnx-*\n\n      - name:  Run java test (VAD + Non-streaming SenseVoice)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-vad-non-streaming-sense-voice.sh\n          rm *.onnx\n          ls -lh *.wav\n          rm *.wav\n          rm -rf sherpa-onnx-*\n\n      - name:  Run java test (VAD + Non-streaming Paraformer)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-vad-non-streaming-paraformer.sh\n          rm *.onnx\n          ls -lh *.wav\n          rm *.wav\n          rm -rf sherpa-onnx-*\n\n      - name:  Run java test (ten-vad remove silence)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-ten-vad-remove-silence.sh\n          rm *.onnx\n          ls -lh *.wav\n          rm *.wav\n\n      - name: 
 Run java test (silero-vad remove silence)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-vad-remove-silence.sh\n          rm *.onnx\n          ls -lh *.wav\n          rm *.wav\n\n      - name:  Run java test (speaker identification)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-speaker-identification.sh\n          # Delete model files to save space\n          rm -rf *.onnx\n          rm -rf sr-data\n\n      - name:  Run java test (audio tagging)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-audio-tagging-zipformer-from-file.sh\n          # Delete model files to save space\n          rm -rf sherpa-onnx-zipformer-*\n\n          ./run-audio-tagging-ced-from-file.sh\n          rm -rf sherpa-onnx-ced-*\n\n\n      - name:  Run java test (Spoken language identification)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-spoken-language-identification-whisper.sh\n          # Delete model files to save space\n          rm -rf sherpa-onnx-whisper-*\n\n      - name:  Run java test (Streaming ASR)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n          ./run-inverse-text-normalization-transducer.sh\n          rm -rf sherpa-onnx-streaming-*\n\n          ./run-streaming-decode-file-ctc.sh\n          # Delete model files to save space\n          rm -rf sherpa-onnx-streaming-*\n\n          ./run-streaming-decode-file-ctc-hlg.sh\n          rm -rf sherpa-onnx-streaming-*\n\n          ./run-streaming-decode-file-paraformer.sh\n          rm -rf sherpa-onnx-streaming-*\n\n          ./run-streaming-decode-file-transducer.sh\n          rm -rf sherpa-onnx-streaming-*\n\n      - name:  Run java test (Non-Streaming TTS)\n        shell: bash\n        run: |\n          cd ./java-api-examples\n\n           ./run-pocket-tts.sh\n           ./run-zipvoice-tts.sh\n           ./run-supertonic-tts.sh\n           
./run-non-streaming-tts-kitten-en.sh\n           ./run-non-streaming-tts-kokoro-zh-en.sh\n           ./run-non-streaming-tts-kokoro-en.sh\n           ./run-non-streaming-tts-matcha-zh.sh\n          ./run-non-streaming-tts-matcha-en.sh\n          ls -lh\n\n           rm -rf sherpa-onnx-pocket-tts-*\n           rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n           rm -rf sherpa-onnx-supertonic-tts-*\n           rm -rf kitten-nano-en-*\n           rm -rf kokoro-multi-*\n           rm -rf kokoro-en-*\n\n           rm -rf matcha-icefall-*\n           rm vocos-22khz-univ.onnx\n           rm vocos_24khz.onnx\n\n          ./run-non-streaming-tts-piper-en.sh\n          rm -rf vits-piper-*\n\n          ./run-non-streaming-tts-coqui-de.sh\n          rm -rf vits-coqui-*\n\n          ./run-non-streaming-tts-vits-zh.sh\n          rm -rf vits-zh-*\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-wav-files-${{ matrix.os }}\n          path: java-api-examples/*.wav\n"
  },
  {
    "path": ".github/workflows/run-python-test-macos.yaml",
    "content": "name: run-python-test-macos\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/run-python-test-macos.yaml'\n      - '.github/scripts/test-python.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'python-api-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: run-python-test-macos-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  run-python-test:\n    name: ${{ matrix.os }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        # See https://github.com/actions/runner-images\n        # macos-14 is for arm64\n        # macos-14-large is for x64\n        include:\n          - os: macos-15-intel\n            python-version: \"3.8\"\n\n          - os: macos-15-intel\n            python-version: \"3.9\"\n          - os: macos-14\n            python-version: \"3.10\"\n          - os: macos-14\n            python-version: \"3.11\"\n\n          - os: macos-latest\n            python-version: \"3.12\"\n\n          - os: macos-latest\n            python-version: \"3.13\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display OS version\n        shell: bash\n        run: |\n          uname -a\n          sw_vers\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-python-${{ matrix.python-version }}\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip numpy pypinyin sentencepiece>=0.1.96 soundfile setuptools wheel librosa\n\n      - name: 
Install sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          python3 -m pip install .\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n          export OS=${{ matrix.os }}\n          .github/scripts/test-python.sh\n          .github/scripts/test-speaker-recognition-python.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: source-separation-${{ matrix.os }}-${{ matrix.python-version }}\n          path: ./source-separation\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-generated-test-files-${{ matrix.os }}-${{ matrix.python-version }}\n          path: tts\n"
  },
  {
    "path": ".github/workflows/run-python-test.yaml",
    "content": "name: run-python-test\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/run-python-test.yaml'\n      - '.github/scripts/test-python.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'python-api-examples/**'\n  pull_request:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/run-python-test.yaml'\n      - '.github/scripts/test-python.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'python-api-examples/**'\n  workflow_dispatch:\n\nconcurrency:\n  group: run-python-test-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  run-python-test:\n    name: ${{ matrix.os }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - os: ubuntu-24.04\n            python-version: \"3.8\"\n          - os: ubuntu-24.04\n            python-version: \"3.9\"\n\n          - os: ubuntu-24.04\n            python-version: \"3.10\"\n          - os: ubuntu-24.04\n            python-version: \"3.11\"\n          - os: ubuntu-24.04\n            python-version: \"3.12\"\n          - os: ubuntu-24.04\n            python-version: \"3.13\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display OS version\n        shell: bash\n        run: |\n          uname -a\n          find \"/etc\" -maxdepth 1 -type f -name \"*version\" -exec head -n 100 {} \\;\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-python-${{ matrix.python-version }}\n\n      - name: Setup Python\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: 
bash\n        run: |\n          python3 -m pip install --upgrade pip numpy pypinyin 'sentencepiece>=0.1.96' soundfile librosa\n          python3 -m pip install wheel twine setuptools\n\n      - uses: afoley587/setup-ffmpeg@main\n        id: setup-ffmpeg\n        with:\n          ffmpeg-version: release\n          architecture: ''\n          github-token: ${{ github.server_url == 'https://github.com' && github.token || '' }}\n\n      - name: Install ninja\n        shell: bash\n        run: |\n          sudo apt-get install ninja-build\n\n      - name: Display ninja version\n        shell: bash\n        run: |\n          ninja --version\n          ninja --help || true\n          which ninja\n\n      - name: Display site packages dir\n        shell: bash\n        run: |\n          python3 -c 'import site; print(site.getsitepackages())'\n          p=$(python3 -c 'import site; print(site.getsitepackages())')\n          echo \"p: $p\"\n\n      - name: Install patchelf\n        shell: bash\n        run: |\n          sudo apt-get update -q\n          sudo apt-get install -q -y patchelf\n          patchelf --help\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n          export SHERPA_ONNX_CMAKE_ARGS=\"-G Ninja -DCMAKE_BUILD_TYPE=Release\"\n          export SHERPA_ONNX_MAKE_ARGS=\"-j 6\"\n\n          python3 setup.py bdist_wheel\n\n      - name: Patch wheels\n        shell: bash\n        run: |\n          mkdir ./dist2\n          sudo ./scripts/wheel/patch_wheel.py --in-dir ./dist --out-dir ./dist2\n\n      - name: Install sherpa-onnx\n        shell: bash\n        run: |\n          ls -lh dist2\n\n          python3 -m pip install ./dist2/*.whl\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.os }}-${{ matrix.python-version }}-whl\n          path: ./dist\n\n   
   - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.os }}-${{ matrix.python-version }}-whl-patched\n          path: ./dist2\n\n      - name: Show dependencies\n        shell: bash\n        run: |\n          cd dist\n          mkdir t\n          cd t\n          unzip ../*.whl\n          readelf -d sherpa_onnx/lib/_sherpa_onnx*.so\n\n          echo \"----\"\n\n          readelf -d sherpa_onnx-*.data/data/bin/sherpa-onnx\n\n      - name: Show dependencies (patched)\n        shell: bash\n        run: |\n          cd dist2\n          mkdir t\n          cd t\n          unzip ../*.whl\n          readelf -d sherpa_onnx/lib/_sherpa_onnx*.so\n\n          echo \"----\"\n\n          readelf -d sherpa_onnx-*.data/data/bin/sherpa-onnx\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n          export OS=${{ matrix.os }}\n\n          p=$(python3 -c 'import site; print(site.getsitepackages()[0])')\n          echo \"p: $p\"\n          p=$p/sherpa_onnx/lib\n          echo \"p: $p\"\n          ls -lh $p\n\n          export LD_LIBRARY_PATH=$p:$LD_LIBRARY_PATH\n          echo \"LD_LIBRARY_PATH: $LD_LIBRARY_PATH\"\n\n          .github/scripts/test-python.sh\n          .github/scripts/test-speaker-recognition-python.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: source-separation-${{ matrix.os }}-${{ matrix.python-version }}-whl\n          path: ./source-separation\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-generated-test-files-${{ matrix.os }}-${{ matrix.python-version }}\n          path: tts\n"
  },
  {
    "path": ".github/workflows/sanitizer.yaml",
    "content": "name: sanitizer\n\non:\n  workflow_dispatch:\n\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 22:50 UTC time every day\n    - cron: \"50 22 * * *\"\n\nconcurrency:\n  group: sanitizer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  sanitizer:\n    runs-on: ${{ matrix.os }}\n    name: sanitizer\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-sanitizer\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -DSHERPA_ONNX_ENABLE_PYTHON=ON \\\n            -DSHERPA_ONNX_ENABLE_TESTS=ON \\\n            -DSHERPA_ONNX_ENABLE_JNI=ON \\\n            -DSHERPA_ONNX_ENABLE_SANITIZER=ON \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_BUILD_TYPE=Release \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n      - name: Display dependencies of sherpa-onnx for macos\n        shell: bash\n        run: |\n          file bin/sherpa-onnx\n          otool -L build/bin/sherpa-onnx\n          otool -l 
build/bin/sherpa-onnx\n\n      - name: Test C++ API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export CXX_STREAMING_ZIPFORMER_EXE=streaming-zipformer-cxx-api\n          export CXX_WHISPER_EXE=whisper-cxx-api\n\n          .github/scripts/test-cxx-api.sh\n\n      - name: Test online punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-online-punctuation\n\n          .github/scripts/test-online-punctuation.sh\n\n      - name: Test offline punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-punctuation\n\n          .github/scripts/test-offline-punctuation.sh\n\n      - name: Test offline transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-ctc.sh\n\n\n      - name: Test C API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export SLID_EXE=spoken-language-identification-c-api\n          export SID_EXE=speaker-identification-c-api\n          export AT_EXE=audio-tagging-c-api\n          export PUNCT_EXE=add-punctuation-c-api\n\n          .github/scripts/test-c-api.sh\n\n      - name: Test Audio tagging\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-audio-tagging\n\n          .github/scripts/test-audio-tagging.sh\n\n      - name: Test spoken language identification (C++ API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-language-identification\n\n          
.github/scripts/test-spoken-language-identification.sh\n\n      - name: Test transducer kws\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-keyword-spotter\n\n          .github/scripts/test-kws.sh\n\n      - name: Test offline TTS\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-tts\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test online paraformer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-paraformer.sh\n\n      - name: Test offline Whisper\n        if: matrix.build_type != 'Debug'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test offline CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test online transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=decode-file-c-api\n\n          .github/scripts/test-online-transducer.sh\n"
  },
  {
    "path": ".github/workflows/speaker-diarization.yaml",
    "content": "name: speaker-diarization\n\non:\n  push:\n    branches:\n      - speaker-diarization\n  workflow_dispatch:\n\nconcurrency:\n  group: speaker-diarization-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  linux:\n    name: speaker diarization\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-speaker-diarization\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install pyannote\n        shell: bash\n        run: |\n          pip install pyannote.audio onnx onnxruntime\n\n      - name: Install sherpa-onnx from source\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine setuptools\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cat sherpa-onnx/python/sherpa_onnx/__init__.py\n\n          python3 setup.py bdist_wheel\n          ls -lh dist\n          pip install ./dist/*.whl\n\n      - name: Run tests\n        shell: bash\n        run: |\n          pushd scripts/pyannote/segmentation\n\n          python3 -c \"import sherpa_onnx; print(sherpa_onnx.__file__)\"\n          python3 -c \"import sherpa_onnx; print(sherpa_onnx.__version__)\"\n          python3 -c \"import sherpa_onnx; print(dir(sherpa_onnx))\"\n\n          curl -SL -O 
https://huggingface.co/csukuangfj/pyannote-models/resolve/main/segmentation-3.0/pytorch_model.bin\n\n          test_wavs=(\n            0-four-speakers-zh.wav\n            1-two-speakers-en.wav\n            2-two-speakers-en.wav\n            3-two-speakers-en.wav\n          )\n\n          for w in ${test_wavs[@]}; do\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/$w\n          done\n\n          soxi *.wav\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          ls -lh sherpa-onnx-pyannote-segmentation-3-0\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\n          for w in ${test_wavs[@]}; do\n            echo \"---------test $w (onnx)----------\"\n            time ./speaker-diarization-onnx.py \\\n              --seg-model ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx \\\n              --speaker-embedding-model ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx \\\n              --wav $w\n\n            echo \"---------test $w (torch)----------\"\n            time ./speaker-diarization-torch.py  --wav $w\n          done\n"
  },
  {
    "path": ".github/workflows/style_check.yaml",
    "content": "# Copyright (c)  2022  Xiaomi Corporation (authors: Fangjun Kuang)\n#\n# See ../../LICENSE for clarification regarding multiple authors\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\nname: style_check\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/style_check.yaml'\n      - 'sherpa-onnx/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: style_check-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  style_check:\n    runs-on: ubuntu-latest\n    strategy:\n      matrix:\n        python-version: [3.8]\n      fail-fast: false\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Check style with cpplint\n        shell: bash\n        working-directory: ${{github.workspace}}\n        run: ./scripts/check_style_cpplint.sh\n"
  },
  {
    "path": ".github/workflows/swift.yaml",
    "content": "name: swift\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - './build-swift-macos.sh'\n      - '.github/workflows/swift.yaml'\n      - 'cmake/**'\n      - 'swift-api-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/**'\n      - '.github/scripts/test-swift.sh'\n\n  pull_request:\n    branches:\n      - master\n    paths:\n      - './build-swift-macos.sh'\n      - '.github/workflows/swift.yaml'\n      - 'cmake/**'\n      - 'swift-api-examples/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/**'\n      - '.github/scripts/test-swift.sh'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: swift-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  swift:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest, macos-15-intel]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-swift\n\n      - name: Build\n        shell: bash\n        run: |\n          sudo mkdir -p /Users/fangjun/Desktop\n          sudo chmod a=rwx /Users/fangjun/Desktop\n          ls -lhd /Users/fangjun/Desktop\n          ls -lh /Users/fangjun/Desktop\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          ./build-swift-macos.sh\n\n      - name: Copy files\n        if: matrix.os == 'macos-15-intel' && (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" 
./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-macos-xcframework-static\n          mkdir $dst\n\n          mv -v build-swift-macos/sherpa-onnx.xcframework $dst\n\n          brew install tree\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - name: Release pre-compiled binaries and libs for macOS\n        if: matrix.os == 'macos-15-intel' && (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*macos-xcframework-static.tar.bz2\n\n      - name: test\n        shell: bash\n        run: |\n          .github/scripts/test-swift.sh\n"
  },
  {
    "path": ".github/workflows/test-build-wheel.yaml",
    "content": "name: test-build-wheel\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - 'setup.py'\n      - '.github/workflows/test-build-wheel.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/python/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-build-wheel-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test-build-wheel:\n    name: ${{ matrix.os }} ${{ matrix.python-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        # See https://github.com/actions/runner-images\n        include:\n          - os: ubuntu-latest\n            python-version: \"3.8\"\n          - os: ubuntu-latest\n            python-version: \"3.9\"\n          - os: ubuntu-latest\n            python-version: \"3.10\"\n          - os: ubuntu-latest\n            python-version: \"3.11\"\n          - os: ubuntu-latest\n            python-version: \"3.12\"\n          - os: ubuntu-latest\n            python-version: \"3.13\"\n\n          - os: ubuntu-24.04-arm\n            python-version: \"3.8\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.9\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.10\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.11\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.12\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.13\"\n\n          - os: macos-15-intel\n            python-version: \"3.8\"\n\n          - os: macos-15-intel\n            python-version: \"3.9\"\n          - os: macos-15-intel\n            python-version: \"3.10\"\n          - os: macos-15-intel\n            python-version: \"3.11\"\n\n          - os: macos-latest\n            python-version: \"3.12\"\n          - os: macos-latest\n            python-version: \"3.13\"\n\n          - os: windows-2022\n            python-version: \"3.7\"\n          - os: windows-2022\n            
python-version: \"3.8\"\n          - os: windows-2022\n            python-version: \"3.9\"\n\n          - os: windows-2022\n            python-version: \"3.10\"\n          - os: windows-2022\n            python-version: \"3.11\"\n          - os: windows-2022\n            python-version: \"3.12\"\n          - os: windows-2022\n            python-version: \"3.13\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.python-version }}\n\n      - name: Install python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip\n          python3 -m pip install wheel twine setuptools\n\n      - name: Build\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          export SHERPA_ONNX_MAKE_ARGS=\"VERBOSE=1 -j2\"\n\n          python3 setup.py bdist_wheel\n          ls -lh dist\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wheel-${{ matrix.os }}-${{ matrix.python-version }}\n          path: ./dist/*.whl\n\n      - name: Display wheel\n        shell: bash\n        run: |\n          ls -lh dist\n          cd dist\n\n          mkdir t\n          cd t\n          unzip ../*.whl\n\n          ls -lh sherpa_onnx/lib\n\n          file sherpa_onnx/lib/*\n\n      - name: Install wheel\n        shell: bash\n        run: |\n          pip install --verbose ./dist/*.whl\n\n      - name: Test\n        shell: bash\n        run: |\n  
        which sherpa-onnx\n          sherpa-onnx --help\n"
  },
  {
    "path": ".github/workflows/test-dart-package.yaml",
    "content": "name: test-dart-package\n\non:\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 15:50 UTC time every day\n    - cron: \"50 15 * * *\"\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-dart-package-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test_dart_package:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, ubuntu-24.04-arm] #, windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      # see https://github.com/subosito/flutter-action/issues/345\n      - name: Set up Flutter\n        uses: subosito/flutter-action@v2\n        with:\n          channel: master\n          flutter-version: 3.24.0\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Display sherpa-onnx package info\n        shell: bash\n        run: |\n          cd dart-api-examples/vad\n          flutter pub get\n\n          if false; then\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-24.04-arm ]]; then\n            echo \"-----\"\n            ls -lh /home/runner/work/_temp/pub-cache/hosted/pub.dev\n\n            echo \"-----\"\n            ls -lh /home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx*\n\n            echo \"-----\"\n            ls -lh /home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx*/*\n\n            echo \"-----\"\n            ls -lh /home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_linux-*\n\n            # sudo mkdir 
/home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_linux-1.10.7/lib\n            # sudo touch /home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_linux-1.10.7/lib/.gitkeep\n\n            echo \"-----\"\n            ls -lh /home/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_linux-*/linux/*\n          elif [[ ${{ matrix.os }} == macos-latest ]]; then\n            echo \"-----\"\n            ls -lh /Users/runner/work/_temp/pub-cache/hosted/pub.dev\n\n            echo \"-----\"\n            ls -lh /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx*\n\n            echo \"-----\"\n            ls -lh /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx*/*\n\n            echo \"-----\"\n            ls -lh /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_macos-*/\n\n            echo \"-----\"\n            ls -lh /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_macos-*/macos\n\n            # sudo mkdir /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_macos-1.10.7/lib\n            # sudo touch /Users/runner/work/_temp/pub-cache/hosted/pub.dev/sherpa_onnx_macos-1.10.7/lib/.gitkeep\n          fi\n          fi\n\n\n      - name: Run tests\n        shell: bash\n        run: |\n          .github/scripts/test-dart.sh\n"
  },
  {
    "path": ".github/workflows/test-dart.yaml",
    "content": "name: test-dart\n\non:\n  push:\n    branches:\n      - master\n      - dart\n    paths:\n      - '.github/workflows/test-dart.yaml'\n      - '.github/scripts/test-dart.sh'\n      - 'dart-api-examples/**'\n      - 'flutter/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-dart-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test_dart:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, ubuntu-24.04-arm] #, windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-dart\n\n      # see https://github.com/subosito/flutter-action/issues/345\n      - name: Set up Flutter\n        uses: subosito/flutter-action@v2\n        with:\n          channel: master\n          flutter-version: 3.24.0\n\n      - name: Display flutter info\n        shell: bash\n        run: |\n          which flutter\n          which dart\n\n          flutter --version\n          dart --version\n          flutter doctor\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n          mkdir build\n\n          cd build\n\n          cmake \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n          cmake --build . 
--target install --config Release\n\n      - name: Copy libs\n        shell: bash\n        run: |\n          if [[ ${{ matrix.os }} == ubuntu-latest ]]; then\n            os=linux-x64\n          elif [[ ${{ matrix.os }} == ubuntu-24.04-arm ]]; then\n            os=linux-aarch64\n          elif [[ ${{ matrix.os }} == macos-latest ]]; then\n            os=macos\n          elif [[ ${{ matrix.os }} == windows-2022 ]]; then\n            os=windows\n          fi\n\n          echo \"os: $os\"\n\n          if [[ $os == windows ]]; then\n            cp -fv build/install/lib/*.dll ./flutter/sherpa_onnx_$os/$os\n          elif [[ $os == linux-x64 ]]; then\n            cp -fv build/install/lib/lib* ./flutter/sherpa_onnx_linux/linux/x64\n          elif [[ $os == linux-aarch64 ]]; then\n            cp -fv build/install/lib/lib* ./flutter/sherpa_onnx_linux/linux/aarch64\n          else\n            cp -fv build/install/lib/lib* ./flutter/sherpa_onnx_$os/$os\n          fi\n\n          echo \"--------------------\"\n\n          if [[ $os == linux-x64 || $os == linux-aarch64 ]]; then\n            ls -lh ./flutter/sherpa_onnx_linux/linux/*\n          else\n            ls -lh ./flutter/sherpa_onnx_$os/$os\n          fi\n\n      - name: Run tests\n        shell: bash\n        run: |\n          cp scripts/dart/vad-pubspec.yaml dart-api-examples/vad/pubspec.yaml\n          cp scripts/dart/non-streaming-asr-pubspec.yaml dart-api-examples/non-streaming-asr/pubspec.yaml\n          cp scripts/dart/streaming-asr-pubspec.yaml dart-api-examples/streaming-asr/pubspec.yaml\n          cp scripts/dart/tts-pubspec.yaml dart-api-examples/tts/pubspec.yaml\n          cp scripts/dart/kws-pubspec.yaml dart-api-examples/keyword-spotter/pubspec.yaml\n          cp scripts/dart/vad-non-streaming-asr-pubspec.yaml dart-api-examples/vad-with-non-streaming-asr/pubspec.yaml\n          cp scripts/dart/audio-tagging-pubspec.yaml dart-api-examples/audio-tagging/pubspec.yaml\n          cp 
scripts/dart/add-punctuations-pubspec.yaml dart-api-examples/add-punctuations/pubspec.yaml\n          cp scripts/dart/speaker-id-pubspec.yaml dart-api-examples/speaker-identification/pubspec.yaml\n          cp scripts/dart/speaker-diarization-pubspec.yaml dart-api-examples/speaker-diarization/pubspec.yaml\n          cp scripts/dart/speech-enhancement-gtcrn-pubspec.yaml dart-api-examples/speech-enhancement-gtcrn/pubspec.yaml\n          cp scripts/dart/speech-enhancement-dpdfnet-pubspec.yaml dart-api-examples/speech-enhancement-dpdfnet/pubspec.yaml\n          cp scripts/dart/streaming-speech-enhancement-gtcrn-pubspec.yaml dart-api-examples/streaming-speech-enhancement-gtcrn/pubspec.yaml\n          cp scripts/dart/streaming-speech-enhancement-dpdfnet-pubspec.yaml dart-api-examples/streaming-speech-enhancement-dpdfnet/pubspec.yaml\n          cp scripts/dart/slid-pubspec.yaml dart-api-examples/spoken-language-identification/pubspec.yaml\n\n          cp scripts/dart/sherpa-onnx-pubspec.yaml flutter/sherpa_onnx/pubspec.yaml\n\n\n          .github/scripts/test-dart.sh\n"
  },
  {
    "path": ".github/workflows/test-dot-net-nuget.yaml",
    "content": "name: test-dot-net-nuget\n\non:\n  workflow_dispatch:\n\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 23:50 UTC time every day\n    - cron: \"50 23 * * *\"\n\nconcurrency:\n  group: test-dot-net-nuget-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-dot-net-nuget:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Free space\n        if: matrix.os == 'ubuntu-latest'\n        shell: bash\n        run: |\n          df -h\n          rm -rf /opt/hostedtoolcache\n          df -h\n\n      - name: Free more space\n        if: matrix.os == 'ubuntu-latest'\n        shell: bash\n        run: |\n          # https://github.com/orgs/community/discussions/25678\n          cd /opt\n          find . -maxdepth 1 -mindepth 1 '!' -path ./containerd '!' -path ./actionarchivecache '!' -path ./runner '!' 
-path ./runner-cache -exec rm -rf '{}' ';'\n\n          sudo rm -rf /usr/share/dotnet\n          sudo rm -rf \"/usr/local/share/boost\"\n          sudo rm -rf \"$AGENT_TOOLSDIRECTORY\"\n\n      - name: Free Disk Space (Ubuntu)\n        if: matrix.os == 'ubuntu-latest'\n        uses: jlumbroso/free-disk-space@main\n        with:\n          # this might remove tools that are actually needed,\n          # if set to \"true\" but frees about 6 GB\n          tool-cache: false\n\n          # all of these default to true, but feel free to set to\n          # \"false\" if necessary for your workflow\n          android: true\n          dotnet: false\n          haskell: true\n          large-packages: true\n          docker-images: false\n          swap-storage: true\n\n      - name: Check space\n        if: matrix.os == 'ubuntu-latest'\n        shell: bash\n        run: |\n          df -h\n\n      - name: Setup .NET 8.0\n        uses: actions/setup-dotnet@v4\n        with:\n          dotnet-version: 8.0.x\n\n      - name: Check dotnet\n        run: dotnet --info\n\n      - name: Run tests\n        shell: bash\n        run: |\n          .github/scripts/test-dot-net.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: dot-net-tts-generated-test-files-${{ matrix.os }}\n          path: tts\n"
  },
  {
    "path": ".github/workflows/test-dot-net.yaml",
    "content": "name: test-dot-net\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-dot-net.yaml'\n      - '.github/scripts/test-dot-net.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'dotnet-examples/**'\n      - 'scripts/dotnet/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-dot-net-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  build-libs:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-dotnet-release-shared\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          cmake \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          cmake --build . 
--target install --config Release\n\n          rm -rf install/share\n          rm -rf install/lib/pkg*\n\n          ls -lh ./install/lib\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.os }}\n          path: ./build/install/lib/\n\n  test-dot-net:\n    runs-on: ${{ matrix.os }}\n    needs: [build-libs]\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.8\"]\n\n    steps:\n      - name: Check space\n        shell: bash\n        run: |\n          df -h\n\n      - name: Free space\n        if: false\n        shell: bash\n        run: |\n          df -h\n          rm -rf /opt/hostedtoolcache\n          df -h\n\n      - name: Free more space\n        if: false\n        shell: bash\n        run: |\n          # https://github.com/orgs/community/discussions/25678\n          cd /opt\n          find . -maxdepth 1 -mindepth 1 '!' -path ./containerd '!' -path ./actionarchivecache '!' -path ./runner '!' 
-path ./runner-cache -exec rm -rf '{}' ';'\n\n          sudo rm -rf /usr/share/dotnet\n          sudo rm -rf \"/usr/local/share/boost\"\n          sudo rm -rf \"$AGENT_TOOLSDIRECTORY\"\n\n      - name: Free Disk Space (Ubuntu)\n        if: true\n        uses: jlumbroso/free-disk-space@main\n        with:\n          # this might remove tools that are actually needed,\n          # if set to \"true\" but frees about 6 GB\n          tool-cache: false\n\n          # all of these default to true, but feel free to set to\n          # \"false\" if necessary for your workflow\n          android: true\n          dotnet: false\n          haskell: true\n          large-packages: true\n          docker-images: false\n          swap-storage: true\n\n      - name: Check space\n        if: true\n        shell: bash\n        run: |\n          df -h\n\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip Jinja2\n\n      - name: Retrieve artifact from ubuntu-latest\n        uses: actions/download-artifact@v4\n        with:\n          name: ubuntu-latest\n          path: /tmp/linux-x64\n\n      - name: Setup .NET\n        uses: actions/setup-dotnet@v4\n        with:\n          dotnet-version: 8.0.x\n\n      - name: Check dotnet\n        run: dotnet --info\n\n      - name: Display files\n        shell: bash\n        run: |\n          echo \"----------/tmp----------\"\n          ls -lh /tmp\n\n          echo \"----------/tmp/linux-x64----------\"\n          ls -lh /tmp/linux-x64\n          df -h\n\n      - name: Build\n        shell: bash\n        run: 
|\n          cd scripts/dotnet\n          ./run.sh\n          df -h\n\n          ls -lh /tmp/packages\n\n      - name: Copy files\n        shell: bash\n        run: |\n          cp -v scripts/dotnet/examples/Common.csproj dotnet-examples/Common/\n\n          ls -lh /tmp\n\n          df -h\n\n      - name: Run tests\n        shell: bash\n        run: |\n          dotnet nuget locals all --clear\n          df -h\n\n          .github/scripts/test-dot-net.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: dot-net-tts-generated-test-files-${{ matrix.os }}\n          path: tts\n"
  },
  {
    "path": ".github/workflows/test-go-package.yaml",
    "content": "name: test-go-package\n\non:\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 15:50 UTC time every day\n    - cron: \"50 15 * * *\"\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-go-package-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test-go-package:\n    name: ${{ matrix.os }} ${{matrix.arch }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        include:\n          - os: ubuntu-latest\n            arch: amd64\n          - os: ubuntu-22.04-arm\n            arch: arm64\n          - os: macos-15-intel\n            arch: amd64\n          - os: macos-14\n            arch: arm64\n          - os: windows-2022\n            arch: x64\n          - os: windows-2022\n            arch: x86 # use 386 for GOARCH\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n      - uses: actions/setup-go@v5\n        with:\n          go-version: '>=1.17'\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Display go version\n        shell: bash\n        run: |\n          go version\n          go env GOPATH\n          go env GOARCH\n\n      - name: Set up MinGW for x64\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        uses: csukuangfj/setup-mingw@v2.2.1\n        with:\n          platform: ${{ matrix.arch }}\n\n      - name: Set up MinGW for x86\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        uses: csukuangfj/setup-mingw@v2.2.1\n        with:\n          platform: ${{ matrix.arch }}\n          version: '12.2.0'\n\n      - name: Show gcc\n        if: matrix.os == 'windows-2022'\n        run: |\n          gcc --version\n\n      - name: Test NeMo Canary ASR\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n        
  cd go-api-examples/non-streaming-canary-decode-files\n          ./run.sh\n          rm -rf sherpa-onnx-nemo-*\n\n      - name: Test speech enhancement (GTCRN)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/speech-enhancement-gtcrn/\n          ./run.sh\n\n      - name: Test speech enhancement (DPDFNet)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/speech-enhancement-dpdfnet/\n          ./run.sh\n\n      - name: Test streaming speech enhancement (GTCRN)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-speech-enhancement-gtcrn/\n          ./run.sh\n\n      - name: Test streaming speech enhancement (DPDFNet)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-speech-enhancement-dpdfnet/\n          ./run.sh\n\n      - name: Test Keyword spotting\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/keyword-spotting-from-file/\n          ./run.sh\n\n      - name: Test adding punctuation\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/add-punctuation/\n          ./run.sh\n\n      - name: Test non-streaming speaker diarization (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-speaker-diarization/\n          ./run.sh\n\n      - name: Test non-streaming speaker diarization (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-speaker-diarization/\n          go mod tidy\n          cat go.mod\n          go build\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh 
/C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n\n          ./run.sh\n\n      - name: Test non-streaming speaker diarization (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-speaker-diarization/\n\n          go env GOARCH\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n\n          go mod tidy\n          cat go.mod\n          go build\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n\n          ./run.sh\n\n      - name: Test streaming HLG decoding (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-hlg-decoding/\n          ./run.sh\n\n      - name: Test speaker identification (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/speaker-identification\n          ./run.sh\n\n      - name: Test speaker identification (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          cd go-api-examples/speaker-identification\n          go mod tidy\n          cat go.mod\n          go build\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n          git clone https://github.com/csukuangfj/sr-data\n          ls -lh\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v 
/C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n          ls -lh\n          go mod tidy\n          go build\n          go run ./main.go\n\n      - name: Test speaker identification (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          cd go-api-examples/speaker-identification\n          go mod tidy\n          cat go.mod\n          ls -lh\n\n          go env GOARCH\n          go env\n          echo \"------------------------------\"\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n          go env\n\n          go clean\n          go build\n\n          echo $PWD\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n          git clone https://github.com/csukuangfj/sr-data\n          ls -lh\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n          ls -lh\n          go mod tidy\n          go build\n          go run ./main.go\n\n          rm -rf sr-data\n          rm -rf *.onnx\n\n      - name: Test non-streaming TTS (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          mkdir tts-waves\n          cd go-api-examples/non-streaming-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test kokoro zh+en\"\n          ./run-kokoro-zh-en.sh\n          rm -rf kokoro-multi-*\n          ls -lh\n\n          echo \"Test kokoro en\"\n          ./run-kokoro-en.sh\n          rm -rf kokoro-en-*\n          ls -lh\n\n          echo \"Test matcha zh\"\n          ./run-matcha-zh.sh\n          
rm -rf matcha-icefall-*\n\n          echo \"Test matcha en\"\n          ./run-matcha-en.sh\n          rm -rf matcha-icefall-*\n          ls -lh *.wav\n\n          echo \"Test vits-ljs\"\n          ./run-vits-ljs.sh\n          rm -rf vits-ljs\n\n          echo \"Test vits-vctk\"\n          ./run-vits-vctk.sh\n          rm -rf vits-vctk\n\n          echo \"Test vits-icefall-zh-aishell3\"\n          ./run-vits-zh-aishell3.sh\n          rm -rf vits-icefall-zh-aishell3\n\n          echo \"Test vits-piper-en_US-lessac-medium\"\n          ./run-vits-piper-en_US-lessac-medium.sh\n          rm -rf vits-piper-en_US-lessac-medium\n\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test zero-shot ZipVoice TTS (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          mkdir -p tts-waves\n          cd go-api-examples/zero-shot-zipvoice-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test non-streaming TTS (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          mkdir tts-waves\n          cd go-api-examples/non-streaming-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test matcha zh\"\n          ./run-matcha-zh.sh\n          rm -rf matcha-icefall-*\n\n          echo \"Test matcha en\"\n          ./run-matcha-en.sh\n      
    rm -rf matcha-icefall-*\n          ls -lh *.wav\n\n          echo \"Test vits-ljs\"\n          ./run-vits-ljs.sh\n          rm -rf vits-ljs\n\n          echo \"Test vits-vctk\"\n          ./run-vits-vctk.sh\n          rm -rf vits-vctk\n\n          echo \"Test vits-zh-aishell3\"\n          ./run-vits-zh-aishell3.sh\n          rm -rf vits-icefall-zh-aishell3\n\n          echo \"Test vits-piper-en_US-lessac-medium\"\n          ./run-vits-piper-en_US-lessac-medium.sh\n          rm -rf vits-piper-en_US-lessac-medium\n\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test zero-shot ZipVoice TTS (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          mkdir -p tts-waves\n          cd go-api-examples/zero-shot-zipvoice-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo $PWD\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test non-streaming TTS (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          mkdir tts-waves\n          cd go-api-examples/non-streaming-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          ls -lh\n\n          go env GOARCH\n          go env\n          echo \"------------------------------\"\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n          go env\n\n          go clean\n          go build\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          cp -v 
/C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test matcha zh\"\n          ./run-matcha-zh.sh\n          rm -rf matcha-icefall-*\n\n          echo \"Test matcha en\"\n          ./run-matcha-en.sh\n          rm -rf matcha-icefall-*\n          ls -lh *.wav\n\n          echo \"Test vits-ljs\"\n          ./run-vits-ljs.sh\n          rm -rf vits-ljs\n\n          echo \"Test vits-vctk\"\n          ./run-vits-vctk.sh\n          rm -rf vits-vctk\n\n          echo \"Test vits-zh-aishell3\"\n          ./run-vits-zh-aishell3.sh\n          rm -rf vits-icefall-zh-aishell3\n\n          echo \"Test vits-piper-en_US-lessac-medium\"\n          ./run-vits-piper-en_US-lessac-medium.sh\n          rm -rf vits-piper-en_US-lessac-medium\n\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test zero-shot ZipVoice TTS (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          mkdir -p tts-waves\n          cd go-api-examples/zero-shot-zipvoice-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          ls -lh\n\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n          go clean\n          go build\n\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n          ls -lh *.wav\n          cp *.wav ../../tts-waves/\n\n      - name: Test non-streaming decoding files (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test transducer\"\n          
./run-transducer.sh\n          rm -rf sherpa-onnx-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-paraformer-zh-2023-09-14\n\n          echo \"Test NeMo CTC\"\n          ./run-nemo-ctc.sh\n          rm -rf sherpa-onnx-nemo-ctc-en-conformer-medium\n\n          echo \"Test Whisper tiny.en\"\n          ./run-whisper.sh\n          rm -rf sherpa-onnx-whisper-tiny.en\n\n          echo \"Test Tdnn yesno\"\n          ./run-tdnn-yesno.sh\n          rm -rf sherpa-onnx-tdnn-yesno\n\n      - name: Test non-streaming decoding files (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-paraformer-zh-2023-09-14\n\n          echo \"Test NeMo CTC\"\n          ./run-nemo-ctc.sh\n          rm -rf sherpa-onnx-nemo-ctc-en-conformer-medium\n\n          echo \"Test Whisper tiny.en\"\n          ./run-whisper.sh\n          rm -rf sherpa-onnx-whisper-tiny.en\n\n          echo \"Test Tdnn yesno\"\n          ./run-tdnn-yesno.sh\n          rm -rf sherpa-onnx-tdnn-yesno\n\n      - name: Test non-streaming decoding files (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          cd go-api-examples/non-streaming-decode-files\n          
ls -lh\n          go mod tidy\n          cat go.mod\n          ls -lh\n\n          go env GOARCH\n          go env\n          echo \"------------------------------\"\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n          go env\n\n          go clean\n          go build\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-paraformer-zh-2023-09-14\n\n          echo \"Test NeMo CTC\"\n          ./run-nemo-ctc.sh\n          rm -rf sherpa-onnx-nemo-ctc-en-conformer-medium\n\n          echo \"Test Whisper tiny.en\"\n          ./run-whisper.sh\n          rm -rf sherpa-onnx-whisper-tiny.en\n\n          echo \"Test Tdnn yesno\"\n          ./run-tdnn-yesno.sh\n          rm -rf sherpa-onnx-tdnn-yesno\n\n      - name: Test audio tagging (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/audio-tagging\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n\n      - name: Test streaming decoding files (Linux/macOS)\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-streaming-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n      - name: 
Test streaming decoding files (Win64)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x64'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/x86_64-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-streaming-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n      - name: Test streaming decoding files (Win32)\n        if: matrix.os == 'windows-2022' && matrix.arch == 'x86'\n        shell: bash\n        run: |\n          cd go-api-examples/streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          ls -lh\n\n          go env GOARCH\n          go env\n          echo \"------------------------------\"\n          go env -w GOARCH=386\n          go env -w CGO_ENABLED=1\n          go env\n\n          go clean\n          go build\n\n          echo $PWD\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/\n          ls -lh /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/*\n          cp -v /C/Users/runneradmin/go/pkg/mod/github.com/k2-fsa/sherpa-onnx-go-windows*/lib/i686-pc-windows-gnu/*.dll .\n          ls -lh\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-streaming-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n      - uses: actions/upload-artifact@v4\n 
       with:\n          name: tts-waves-${{ matrix.os }}-${{ matrix.arch }}\n          path: tts-waves\n"
  },
  {
    "path": ".github/workflows/test-go.yaml",
    "content": "name: test-go\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-go.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'go-api-examples/**'\n      - 'scripts/go/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-go-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test-go:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest, macos-15-intel, ubuntu-latest, windows-2022, ubuntu-22.04-arm]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-go\n\n      - uses: actions/setup-go@v5\n        with:\n          go-version: '>=1.17'\n\n      - name: Display go version\n        shell: bash\n        run: |\n          go version\n          go env GOPATH\n          go env GOARCH\n          go env CGO_ENABLED\n\n      - name: Display go env\n        shell: bash\n        run: |\n          go env\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          upload_dir=$PWD/to-upload\n          mkdir -p $upload_dir\n          echo \"upload_dir\"\n\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          mkdir build\n          cd build\n          cmake \\\n            -DCMAKE_C_COMPILER_LAUNCHER=ccache \\\n            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n          if [[ ${{ matrix.os }} == windows-2022 ]]; then\n            cmake --build . 
--target install --config Release -- -m:2\n          else\n            make -j2 install\n          fi\n\n          if [[ ${{ matrix.os }} == ubuntu-latest || ${{ matrix.os }} == ubuntu-22.04-arm ]]; then\n            cp -v ./lib/*.so $upload_dir\n            cp -v _deps/onnxruntime-src/lib/libonnxruntime*so* $upload_dir\n\n            cp -v _deps/onnxruntime-src/lib/libonnxruntime*so* ./lib/\n\n            rm -v ./lib/*.a\n            ls -h ./lib\n          elif [[ ${{ matrix.os }} == windows-2022 ]]; then\n            cp -v ./install/lib/sherpa-onnx-c-api.dll ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/\n            cp -v ./install/lib/onnxruntime.dll ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/\n            ls -lh ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/\n\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/add-punctuation\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/add-punctuation-online\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/audio-tagging\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/keyword-spotting-from-file/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-canary-decode-files/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-decode-files/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-fire-red-asr-ctc-decode-files\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-funasr-nano-decode-files\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-medasr-ctc-decode-files\n            cp -v 
../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-moonshine-v2-decode-files\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-omnilingual-asr-ctc-decode-files\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-speaker-diarization/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/non-streaming-tts/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/speaker-identification/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/speech-enhancement-gtcrn\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/speech-enhancement-dpdfnet\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/streaming-decode-files/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/streaming-speech-enhancement-gtcrn\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/streaming-speech-enhancement-dpdfnet\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/streaming-hlg-decoding/\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/vad\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/vad-asr-paraformer\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/vad-asr-whisper\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/vad-speaker-identification\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll 
../scripts/go/_internal/vad-spoken-language-identification\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/zero-shot-pocket-tts\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll ../scripts/go/_internal/supertonic-tts\n\n            cp -v ../scripts/go/_internal/lib/x86_64-pc-windows-gnu/*.dll $upload_dir\n          else\n            cp -v _deps/onnxruntime-src/lib/libonnxruntime*dylib $upload_dir/\n            cp -v lib/*.dylib $upload_dir\n\n            cp -v _deps/onnxruntime-src/lib/libonnxruntime*dylib ./lib/\n            rm ./lib/*.a\n            rm ./lib/libonnxruntime.dylib\n            cd lib\n            ln -s libonnxruntime.1.23.2.dylib libonnxruntime.dylib\n            cd ..\n          fi\n\n          cd ../scripts/go/_internal/\n          ls -lh lib\n          echo \"-----\"\n          ls -lh lib/*/\n          echo \"-----\"\n\n          go mod tidy\n          go build\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: ${{ matrix.os }}-libs\n          path: to-upload/\n\n      - name: Test SupertonicTTS\n        shell: bash\n        run: |\n          cd scripts/go/_internal/supertonic-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-*\n          ls -lh *.wav\n\n      - name: Test non-streaming decoding files with Moonshine v2\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-moonshine-v2-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-moonshine-*\n\n      - name: Test non-streaming decoding files with FireRedAsrCtc\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-fire-red-asr-ctc-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n      
    go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-fire-red-*\n\n      - name: Test ZeroShot TTS with PocketTTS\n        shell: bash\n        run: |\n          cd scripts/go/_internal/zero-shot-pocket-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-*\n          ls -lh *.wav\n\n      - name: Test ZeroShot TTS with ZipVoice\n        shell: bash\n        run: |\n          cd scripts/go/_internal/zero-shot-zipvoice-tts\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          rm -f vocos_24khz.onnx\n          ls -lh *.wav\n\n      - name: Test non-streaming decoding files with FunASR Nano\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-funasr-nano-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-funasr-*\n\n      - name: Test non-streaming decoding files with MedASR\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-medasr-ctc-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-medasr-*\n\n      - name: Test non-streaming decoding files with Omnilingual ASR\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-omnilingual-asr-ctc-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-omnilingual-*\n\n      - name: Test non-streaming TTS\n        shell: bash\n        run: |\n          mkdir tts-waves\n\n          cd 
scripts/go/_internal/non-streaming-tts/\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test kitten en\"\n          ./run-kitten-en.sh\n          rm -rf kitten-*\n          ls -lh\n\n          echo \"Test kokoro zh+en\"\n          ./run-kokoro-zh-en.sh\n          rm -rf kokoro-multi-*\n          ls -lh\n\n          echo \"Test kokoro en\"\n          ./run-kokoro-en.sh\n          rm -rf kokoro-en-*\n          ls -lh\n\n          echo \"Test matcha zh\"\n          ./run-matcha-zh.sh\n          rm -rf matcha-icefall-*\n\n          echo \"Test matcha en\"\n          ./run-matcha-en.sh\n          rm -rf matcha-icefall-*\n          ls -lh *.wav\n\n          echo \"Test vits-ljs\"\n          ./run-vits-ljs.sh\n          rm -rf vits-ljs\n\n          echo \"Test vits-vctk\"\n          ./run-vits-vctk.sh\n          rm -rf vits-vctk\n\n          echo \"Test vits-zh-aishell3\"\n          ./run-vits-zh-aishell3.sh\n          rm -rf vits-icefall-zh-aishell3\n\n          echo \"Test vits-piper-en_US-lessac-medium\"\n          ./run-vits-piper-en_US-lessac-medium.sh\n          rm -rf vits-piper-en_US-lessac-medium\n\n          cp *.wav ../../../../tts-waves/\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: tts-waves-${{ matrix.os }}\n          path: tts-waves\n\n      - name: Test streaming decoding files\n        shell: bash\n        run: |\n          cd scripts/go/_internal/streaming-decode-files\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test T-one CTC\"\n          ./run-t-one-ctc.sh\n\n          echo \"Test zipformer2 CTC\"\n          ./run-zipformer2-ctc-with-hr.sh\n          ./run-zipformer2-ctc.sh\n          rm -rf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf 
sherpa-onnx-streaming-zipformer-en-2023-06-26\n\n          ./run-transducer-itn.sh\n          rm -rf sherpa-onnx-streaming-*\n\n          echo \"Test paraformer\"\n          ./run-paraformer.sh\n          rm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n      - name: Test non-streaming decoding files with NeMo Canary\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-canary-decode-files/\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          ./run.sh\n          rm -rf sherpa-onnx-nemo-*\n\n      - name: Test non-streaming decoding files\n        shell: bash\n        run: |\n          cd scripts/go/_internal/non-streaming-decode-files/\n          ls -lh\n          go mod tidy\n          cat go.mod\n          go build\n          ls -lh\n\n          echo \"Test Wenet CTC\"\n          ./run-wenet-ctc.sh\n          rm -rf sherpa-onnx-wenet*\n\n          echo \"Test Zipformer CTC\"\n          ./run-zipformer-ctc.sh\n          rm -rf sherpa-onnx-zipformer-*\n\n          echo \"Test SenseVoice ctc\"\n          ./run-sense-voice-small-with-hr.sh\n          ./run-sense-voice-small.sh\n          rm -rf sherpa-onnx-sense-*\n\n          echo \"Test Dolphin CTC\"\n          ./run-dolphin-ctc-base.sh\n          rm -rf sherpa-onnx-dolphin-*\n\n          echo \"Test FireRedAsr\"\n          ./run-fire-red-asr.sh\n          rm -rf sherpa-onnx-fire-red-asr-*\n\n          echo \"Test Moonshine\"\n          ./run-moonshine.sh\n          rm -rf sherpa-onnx-*\n\n          echo \"Test telespeech ctc\"\n          ./run-telespeech-ctc.sh\n          rm -rf sherpa-onnx-telespeech-ctc-*\n\n          echo \"Test transducer\"\n          ./run-transducer.sh\n          rm -rf sherpa-onnx-zipformer-en-2023-06-26\n\n          echo \"Test paraformer\"\n          
./run-paraformer.sh\n          ./run-paraformer-itn.sh\n          rm -rf sherpa-onnx-paraformer-zh-2023-09-14\n\n          echo \"Test NeMo CTC\"\n          ./run-nemo-ctc.sh\n          rm -rf sherpa-onnx-nemo-ctc-en-conformer-medium\n\n          echo \"Test Whisper tiny.en\"\n          ./run-whisper.sh\n          rm -rf sherpa-onnx-whisper-tiny.en\n\n          echo \"Test Tdnn yesno\"\n          ./run-tdnn-yesno.sh\n          rm -rf sherpa-onnx-tdnn-yesno\n\n      - name: Test speech enhancement (GTCRN)\n        shell: bash\n        run: |\n          cd scripts/go/_internal/speech-enhancement-gtcrn/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test speech enhancement (DPDFNet)\n        shell: bash\n        run: |\n          cd scripts/go/_internal/speech-enhancement-dpdfnet/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test streaming speech enhancement (GTCRN)\n        shell: bash\n        run: |\n          cd scripts/go/_internal/streaming-speech-enhancement-gtcrn/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test streaming speech enhancement (DPDFNet)\n        shell: bash\n        run: |\n          cd scripts/go/_internal/streaming-speech-enhancement-dpdfnet/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test audio tagging\n        shell: bash\n        run: |\n          cd scripts/go/_internal/audio-tagging/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test Keyword spotting\n        shell: bash\n        run: |\n          cd scripts/go/_internal/keyword-spotting-from-file/\n\n          ./run.sh\n\n          ls -lh\n\n      - name: Test adding punctuation\n        shell: bash\n        run: |\n          cd scripts/go/_internal/add-punctuation/\n          ./run.sh\n\n      - name: Test adding online punctuation\n        shell: bash\n        run: |\n          cd scripts/go/_internal/add-punctuation-online/\n          ./run.sh\n\n      - name: Test non-streaming speaker diarization\n        shell: 
bash\n        run: |\n          cd scripts/go/_internal/non-streaming-speaker-diarization/\n          ./run.sh\n\n      - name: Test speaker identification\n        shell: bash\n        run: |\n          cd scripts/go/_internal/speaker-identification/\n          ./run.sh\n\n      - name: Test streaming HLG decoding\n        shell: bash\n        run: |\n          cd scripts/go/_internal/streaming-hlg-decoding/\n          ./run.sh\n"
  },
  {
    "path": ".github/workflows/test-nodejs-addon-api.yaml",
    "content": "name: test-node-addon-api\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-nodejs-addon-api.yaml'\n      - '.github/scripts/test-nodejs-addon-npm.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'scripts/node-addon-api/**'\n      - 'nodejs-addon-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-node-addon-api-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-node-addon-api:\n    name: ${{ matrix.os }} ${{ matrix.node-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest, ubuntu-latest]\n        node-version: [\"16\", \"22\"]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          pip install ninja\n\n      - name: Show ninja help\n        shell: bash\n        run: |\n          ninja --help || true\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n          node-version: ${{ matrix.node-version }}\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Display npm help\n        shell: bash\n        run: |\n          npm help\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-release-shared\n\n      - name: Build sherpa-onnx\n        if: matrix.os == 'windows-2022'\n        shell: bash\n        run: |\n     
     export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          mkdir build\n          cd build\n          cmake \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          ls -lh  _deps/onnxruntime-src/lib/\n\n          cmake --build . --config Release --target install -- -m:6\n\n          ls -lh install/lib\n\n          echo \"----------\"\n\n          cp -v  _deps/onnxruntime-src/lib/*.lib ./install/lib\n\n          echo \"----------\"\n\n          ls -lh install/lib\n\n      - name: Build sherpa-onnx\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          mkdir build\n          cd build\n          cmake \\\n            -G Ninja \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            ..\n\n          cmake --build . 
--config Release --target install -- -j 6\n\n      - name: Build node-addon-api package\n        shell: bash\n        run: |\n          d=$PWD\n          export SHERPA_ONNX_INSTALL_DIR=$d/build/install\n\n          cd scripts/node-addon-api\n\n          echo $d/build/install\n\n          ls -lh $d/build/install\n\n          npm i\n\n          ./node_modules/.bin/cmake-js compile --log-level verbose\n\n      - name: Run tests\n        shell: bash\n        run: |\n          export PATH=$PWD/build/install/lib:$PATH\n          export LD_LIBRARY_PATH=$PWD/build/install/lib:$LD_LIBRARY_PATH\n          d=nodejs-addon-examples\n          cd $d\n          files=(*.js)\n          echo \"${files[@]}\"\n          for f in \"${files[@]}\"; do\n            echo \"$f\"\n            sed -i.bak s%sherpa-onnx-node%./sherpa-onnx% \"./$f\"\n          done\n          cd ..\n\n          cp -v scripts/node-addon-api/build/Release/sherpa-onnx.node $d/\n          cp -v scripts/node-addon-api/lib/*.js $d/\n          cp -v ./build/install/lib/lib*  $d/\n\n          .github/scripts/test-nodejs-addon-npm.sh\n"
  },
  {
    "path": ".github/workflows/test-nodejs-addon-npm-aarch64.yaml",
    "content": "name: test-node-addon-npm-aarch64\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-nodejs-addon-npm-aarch64.yaml'\n      - '.github/scripts/test-nodejs-addon-npm.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'scripts/node-addon-api/**'\n      - 'scripts/node-addon-api/*.js'\n      - 'nodejs-addon-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-node-addon-npm-aarch64-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-node-addon-npm-aarch64:\n    name: ${{ matrix.os }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up QEMU\n        uses: docker/setup-qemu-action@v2\n        with:\n          platforms: arm64\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n            docker run --rm \\\n              --platform linux/arm64 \\\n              --volume ${{ github.workspace }}/:/shared/ \\\n              quay.io/pypa/manylinux2014_aarch64 \\\n            bash -c '\n              git config --global --add safe.directory /shared\n\n              echo $HOME\n              uname -a\n              cat /etc/*release\n              cmake --version\n\n              curl -sL https://rpm.nodesource.com/setup_16.x | bash -\n              yum install -y nodejs\n\n              node --version\n\n              cd /shared\n\n              d=nodejs-addon-examples\n              echo \"dir: $d\"\n              cd $d\n              npm install --verbose\n              git status\n              ls -lh\n              ls -lh node_modules\n\n              export 
DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n              export DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n              export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-x64:$LD_LIBRARY_PATH\n              export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-arm64:$LD_LIBRARY_PATH\n\n              cd ../\n\n              .github/scripts/test-nodejs-addon-npm.sh\n            '\n"
  },
  {
    "path": ".github/workflows/test-nodejs-addon-npm-win-x86.yaml",
    "content": "name: test-node-addon-npm-win-x86\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-nodejs-addon-npm-win-x86.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'scripts/node-addon-api/**'\n      - 'scripts/node-addon-api/*.js'\n      - 'nodejs-addon-examples/**'\n      - '.github/scripts/test-nodejs-addon-npm.sh'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-node-addon-npm-win-x86-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-node-addon-npm-win-x86:\n    name: ${{ matrix.os }} node v${{ matrix.node-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        node-version: [\"16\", \"17\", \"18\", \"19\", \"21\", \"22\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n          node-version: ${{ matrix.node-version }}\n          architecture: 'x86'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Run tests\n        shell: bash\n        run: |\n          d=nodejs-addon-examples\n          echo \"dir: $d\"\n          cd $d\n          npm install --verbose\n          git status\n          ls -lh\n          ls -lh node_modules\n\n          export DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n 
         export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-x64:$LD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-arm64:$LD_LIBRARY_PATH\n\n          cd ../\n\n          .github/scripts/test-nodejs-addon-npm.sh\n"
  },
  {
    "path": ".github/workflows/test-nodejs-addon-npm.yaml",
    "content": "name: test-node-addon-npm\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-nodejs-addon-npm.yaml'\n      - '.github/scripts/test-nodejs-addon-npm.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'scripts/node-addon-api/**'\n      - 'scripts/node-addon-api/*.js'\n      - 'nodejs-addon-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-node-addon-npm-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-node-addon-npm:\n    name: ${{ matrix.os }} node v${{ matrix.node-version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest, macos-14, ubuntu-latest, ubuntu-22.04, windows-2022]\n        node-version: [\"16\", \"17\", \"18\", \"19\", \"21\", \"22\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n          node-version: ${{ matrix.node-version }}\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Run tests\n        shell: bash\n        run: |\n          d=nodejs-addon-examples\n          echo \"dir: $d\"\n          cd $d\n          npm install --verbose\n          git status\n          ls -lh\n          ls -lh node_modules\n\n          export DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n          export DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n          
export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-x64:$LD_LIBRARY_PATH\n          export LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-arm64:$LD_LIBRARY_PATH\n\n          cd ../\n\n          .github/scripts/test-nodejs-addon-npm.sh\n"
  },
  {
    "path": ".github/workflows/test-nodejs-npm.yaml",
    "content": "name: test-nodejs-npm\n\non:\n  workflow_dispatch:\n\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 23:50 UTC time every day\n    - cron: \"50 23 * * *\"\n\nconcurrency:\n  group: test-nodejs-npm-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-nodejs-npm:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, windows-2022]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n          npm --version\n\n      - name: Run tests\n        shell: bash\n        run: |\n          node --version\n          npm --version\n\n          export d=nodejs-examples\n          ./.github/scripts/test-nodejs-npm.sh\n"
  },
  {
    "path": ".github/workflows/test-nodejs.yaml",
    "content": "name: test-nodejs\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-nodejs.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/c-api/*'\n      - 'scripts/nodejs/**'\n      - 'nodejs-examples/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-nodejs-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test-nodejs:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest] #, macos-latest] #, windows-2022]\n        python-version: [\"3.8\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.build_type }}-wasm-nodejs\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.51\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - uses: actions/setup-node@v4\n        with:\n          registry-url: 'https://registry.npmjs.org'\n\n      - name: Display node version\n        shell: bash\n        run: |\n          node --version\n\n      - name: Build nodejs package\n        shell: bash\n        env:\n          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n     
     ./build-wasm-simd-nodejs.sh\n          cp -v build-wasm-simd-nodejs/install/bin/wasm/nodejs/*.js ./scripts/nodejs/\n          cp -v build-wasm-simd-nodejs/install/bin/wasm/nodejs/*.wasm ./scripts/nodejs/\n\n      - name: replace files\n        shell: bash\n        run: |\n          cd nodejs-examples\n          files=$(ls -1 *.js)\n          for f in ${files[@]}; do\n            echo $f\n            sed -i.bak s%\\'sherpa-onnx\\'%\\'./index.js\\'% $f\n            git status\n          done\n          git diff\n          cp *.js ../scripts/nodejs\n\n      - name: Run tests\n        shell: bash\n        run: |\n          node --version\n          npm --version\n          export d=scripts/nodejs\n          cat $d/index.js\n\n          pushd $d\n          npm install\n          npm install wav\n          popd\n\n          ./.github/scripts/test-nodejs-npm.sh\n"
  },
  {
    "path": ".github/workflows/test-onnxruntime-version.yaml",
    "content": "name: test-onnxruntime-version\n\non:\n  push:\n    branches:\n      - master\n      - test-onnxruntime\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-onnxrntime-version-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  macos:\n    runs-on: ${{ matrix.os }}\n    name: onnxruntime ${{ matrix.version }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [macos-latest]\n        version: [\"1.11.0\", \"1.11.1\", \"1.12.0\", \"1.12.1\", \"1.13.1\", \"1.14.0\", \"1.14.1\", \"1.15.0\", \"1.15.1\", \"1.16.1\", \"1.16.2\", \"1.17.0\", \"1.17.1\", \"1.17.3\", \"1.18.0\", \"1.18.1\", \"1.19.0\", \"1.19.2\", \"1.20.0\", \"1.20.1\", \"1.20.2\", \"1.21.0\", \"1.21.1\", \"1.22.0\", \"1.22.1\", \"1.22.2\", \"1.23.0\", \"1.23.1\", \"1.23.2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-onnxruntime-${{ matrix.version }}\n\n      - name: Download onnxruntime ${{ matrix.version }}\n        shell: bash\n        run: |\n          version=${{ matrix.version }}\n          curl -SL -O https://github.com/microsoft/onnxruntime/releases/download/v${version}/onnxruntime-osx-universal2-${version}.tgz\n          tar xvf onnxruntime-osx-universal2-${version}.tgz\n          ls -lh onnxruntime-osx-universal2-${version}\n\n          ls -lh onnxruntime-osx-universal2-${version}\n          echo \"---\"\n          ls -lh onnxruntime-osx-universal2-${version}/include\n          echo \"---\"\n          ls -lh onnxruntime-osx-universal2-${version}/lib\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          version=${{ matrix.version }}\n          onnxruntime_dir=$PWD/onnxruntime-osx-universal2-${version}\n       
   export SHERPA_ONNXRUNTIME_LIB_DIR=$onnxruntime_dir/lib/\n          export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$onnxruntime_dir/include/\n\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n\n          cmake \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_OSX_ARCHITECTURES='arm64;x86_64' \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx for macos\n        shell: bash\n        run: |\n          version=${{ matrix.version }}\n          onnxruntime_dir=$PWD/onnxruntime-osx-universal2-${version}\n          export SHERPA_ONNXRUNTIME_LIB_DIR=$onnxruntime_dir/lib/\n          export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$onnxruntime_dir/include/\n\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n\n          cd build\n          make -j2\n          make install\n\n          ls -lh lib\n          ls -lh bin\n\n          file ./bin/sherpa-onnx\n\n          rm -fv ./install/include/cargs.h\n          rm -fv ./install/lib/cargs.h\n          rm -fv ./install/lib/libcargs.dylib\n          rm -fv ./install/lib/libcargs.a\n          rm -rfv ./install/lib/pkgconfig\n\n      - name: Display dependencies of sherpa-onnx for macos\n        shell: bash\n        run: |\n          file bin/sherpa-onnx\n          otool -L build/bin/sherpa-onnx\n          otool -l build/bin/sherpa-onnx\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-onnxruntime-${{ matrix.version }}-osx-universal2-shared\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          mkdir $dst/lib\n          cp -a build/install/lib/*.dylib* $dst/lib/\n          cp -a 
build/install/include $dst/\n\n          brew install tree\n          tree $dst\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - name: Release pre-compiled binaries and libs for macOS\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*osx-universal2*.tar.bz2\n\n      - name: Test offline CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test offline speech denoiser\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-denoiser\n\n          .github/scripts/test-offline-speech-denoiser.sh\n\n      - name: Test offline TTS\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-tts\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test offline Moonshine\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-moonshine.sh\n\n      - name: Test C++ API\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export CXX_STREAMING_ZIPFORMER_EXE=streaming-zipformer-cxx-api\n          export CXX_WHISPER_EXE=whisper-cxx-api\n          export CXX_SENSE_VOICE_EXE=sense-voice-cxx-api\n\n          .github/scripts/test-cxx-api.sh\n          du -h -d1 .\n\n      - name: Test offline speaker diarization\n        shell: bash\n        run: |\n          du -h -d1 .\n          export PATH=$PWD/build/bin:$PATH\n          export 
EXE=sherpa-onnx-offline-speaker-diarization\n\n          .github/scripts/test-speaker-diarization.sh\n\n      - name: Test offline transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-online-punctuation\n\n          .github/scripts/test-online-punctuation.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-ctc.sh\n\n      - name: Test offline punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-punctuation\n\n          .github/scripts/test-offline-punctuation.sh\n\n      - name: Test C API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export SLID_EXE=spoken-language-identification-c-api\n          export SID_EXE=speaker-identification-c-api\n          export AT_EXE=audio-tagging-c-api\n          export PUNCT_EXE=add-punctuation-c-api\n\n          .github/scripts/test-c-api.sh\n\n      - name: Test Audio tagging\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-audio-tagging\n\n          .github/scripts/test-audio-tagging.sh\n\n      - name: Test spoken language identification (C++ API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline-language-identification\n\n          .github/scripts/test-spoken-language-identification.sh\n\n      - name: Test transducer kws\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export 
EXE=sherpa-onnx-keyword-spotter\n\n          .github/scripts/test-kws.sh\n\n      - name: Test online paraformer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-paraformer.sh\n\n      - name: Test offline Whisper\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx-offline\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test online transducer\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=sherpa-onnx\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin:$PATH\n          export EXE=decode-file-c-api\n\n          .github/scripts/test-online-transducer.sh\n"
  },
  {
    "path": ".github/workflows/test-pip-install.yaml",
    "content": "name: test-pip-install\n\non:\n  push:\n    branches:\n      - test-pip-install\n  schedule:\n    # minute (0-59)\n    # hour (0-23)\n    # day of the month (1-31)\n    # month (1-12)\n    # day of the week (0-6)\n    # nightly build at 23:50 UTC time every day\n    - cron: \"50 23 * * *\"\n  workflow_dispatch:\n\nconcurrency:\n  group: test-pip-install-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  test_pip_install:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.python-version }}\n    strategy:\n      fail-fast: false\n      matrix:\n        # See https://github.com/actions/runner-images\n        include:\n          - os: ubuntu-22.04\n            python-version: \"3.8\"\n          - os: ubuntu-22.04\n            python-version: \"3.9\"\n          - os: ubuntu-22.04\n            python-version: \"3.10\"\n          - os: ubuntu-latest\n            python-version: \"3.11\"\n          - os: ubuntu-24.04\n            python-version: \"3.12\"\n          - os: ubuntu-latest\n            python-version: \"3.13\"\n\n          - os: ubuntu-24.04-arm\n            python-version: \"3.8\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.9\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.10\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.11\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.12\"\n          - os: ubuntu-24.04-arm\n            python-version: \"3.13\"\n\n          - os: macos-15-intel\n            python-version: \"3.8\"\n\n          - os: macos-15-intel\n            python-version: \"3.9\"\n          - os: macos-15-intel\n            python-version: \"3.10\"\n          - os: macos-15-intel\n            python-version: \"3.11\"\n\n          - os: macos-14\n            python-version: \"3.12\"\n          - os: macos-14\n            python-version: \"3.13\"\n\n          - os: windows-2022\n            
python-version: \"3.8\"\n          - os: windows-2022\n            python-version: \"3.9\"\n\n          - os: windows-2022\n            python-version: \"3.10\"\n          - os: windows-2022\n            python-version: \"3.11\"\n          - os: windows-2022\n            python-version: \"3.12\"\n          - os: windows-2022\n            python-version: \"3.13\"\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install sherpa-onnx\n        shell: bash\n        run: |\n          pip install --verbose -U sherpa-onnx sherpa-onnx-core sherpa-onnx-bin\n\n      - name: Test sherpa-onnx-bin\n        shell: bash\n        run: |\n          sherpa-onnx-version\n\n          sherpa-onnx --help\n          sherpa-onnx-keyword-spotter --help\n          sherpa-onnx-offline --help\n          sherpa-onnx-offline-tts --help\n\n          sherpa-onnx-microphone --help\n          sherpa-onnx-microphone-offline --help\n\n          sherpa-onnx-offline-websocket-server --help\n\n          sherpa-onnx-online-websocket-server --help\n          sherpa-onnx-online-websocket-client --help\n\n      - name: Test sherpa-onnx-core\n        shell: bash\n        run: |\n          python3 -m sherpa_onnx --cflags\n          python3 -m sherpa_onnx --c-api-libs\n          python3 -m sherpa_onnx --c-api-libs-only-L\n          python3 -m sherpa_onnx --c-api-libs-only-l\n\n          python3 -m sherpa_onnx --cxx-api-libs\n          python3 -m sherpa_onnx --cxx-api-libs-only-L\n          python3 -m sherpa_onnx --cxx-api-libs-only-l\n\n      - name: Test sherpa-onnx\n        shell: bash\n        run: |\n          python3 -c \"import sherpa_onnx; 
print(sherpa_onnx.__file__)\"\n          python3 -c \"import sherpa_onnx; print(sherpa_onnx.__version__)\"\n          python3 -c \"import sherpa_onnx; print(sherpa_onnx.OnlineRecognizer)\"\n          python3 -c \"import sherpa_onnx; print(sherpa_onnx.OfflineRecognizer)\"\n"
  },
  {
    "path": ".github/workflows/test-piper-phonemize.yaml",
    "content": "name: test-piper-phonemize\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-piper-phonemize.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: test-piper-phonemize-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test_piper_phonemize:\n    name: ${{ matrix.os }} ${{ matrix.build_type }} ${{ matrix.shared_lib }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, windows-2022]\n        build_type: [Release, Debug]\n        shared_lib: [ON, OFF]\n        exclude:\n          - os: windows-2022\n            build_type: Debug\n            shared_lib: OFF\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-${{ matrix.build_type }}-shared-${{ matrix.shared_lib }}\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          cmake -DSHERPA_ONNX_ENABLE_EPSEAK_NG_EXE=ON -DBUILD_ESPEAK_NG_EXE=ON -DCMAKE_VERBOSE_MAKEFILE=ON -D SHERPA_ONNX_ENABLE_TESTS=ON -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} -D BUILD_SHARED_LIBS=${{ matrix.shared_lib }} -DCMAKE_INSTALL_PREFIX=./install ..\n\n      - name: Build\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          cd build\n          cmake --build . 
--target install --config ${{ matrix.build_type }}\n\n      - name: run test\n        if: matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          cd build\n\n          ls -lh install/\n          ls -lh install/share\n          ls -lh install/share/espeak-ng-data/\n\n          ./bin/piper-phonemize-test\n\n      - name: run test\n        if: matrix.os == 'windows-2022'\n        shell: bash\n        run: |\n          cd build\n\n          ls -lh install/\n          ls -lh install/share\n          ls -lh install/share/espeak-ng-data/\n\n          ./bin/${{ matrix.build_type }}/piper-phonemize-test\n"
  },
  {
    "path": ".github/workflows/test-python-offline-websocket-server.yaml",
    "content": "name: Python offline websocket server\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-python-offline-websocket-server.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/python/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: python-offline-websocket-server-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  python_offline_websocket_server:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.python-version }} ${{ matrix.model_type }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, ubuntu-22.04, windows-2022, macos-latest, macos-14]\n        python-version: [\"3.10\"]\n        model_type: [\"transducer\", \"paraformer\", \"nemo_ctc\", \"whisper\", \"tdnn\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-python-${{ matrix.python-version }}\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip numpy pypinyin sentencepiece setuptools wheel\n\n      - name: Install sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          python3 -m pip install .\n          python3 -m pip install websockets\n\n      - name: Start server for transducer models\n        if: matrix.model_type == 'transducer'\n        shell: bash\n      
  run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\n          tar xvf sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\n          rm sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\n\n          python3 ./python-api-examples/non_streaming_server.py \\\n            --encoder ./sherpa-onnx-zipformer-en-2023-06-26/encoder-epoch-99-avg-1.onnx \\\n            --decoder ./sherpa-onnx-zipformer-en-2023-06-26/decoder-epoch-99-avg-1.onnx \\\n            --joiner ./sherpa-onnx-zipformer-en-2023-06-26/joiner-epoch-99-avg-1.onnx \\\n            --tokens ./sherpa-onnx-zipformer-en-2023-06-26/tokens.txt &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for transducer models\n        if: matrix.model_type == 'transducer'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/0.wav \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/1.wav \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/8k.wav\n\n          python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/0.wav \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/1.wav \\\n            ./sherpa-onnx-zipformer-en-2023-06-26/test_wavs/8k.wav\n\n      - name: Start server for paraformer models\n        if: matrix.model_type == 'paraformer' && matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n          tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n          rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n          python3 
./python-api-examples/non_streaming_server.py \\\n            --paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n            --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for paraformer models\n        if: matrix.model_type == 'paraformer' && matrix.os != 'windows-2022'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/1.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/2.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/8k.wav\n\n          python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/1.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/2.wav \\\n            ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/8k.wav\n\n      - name: Start server for nemo_ctc models\n        if: matrix.model_type == 'nemo_ctc'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n          tar xvf sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n          rm sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n\n          python3 ./python-api-examples/non_streaming_server.py \\\n            --nemo-ctc ./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n            --tokens ./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for nemo_ctc models\n        if: 
matrix.model_type == 'nemo_ctc'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\n\n          python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\n            ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\n\n      - name: Start server for whisper models\n        if: matrix.model_type == 'whisper'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n          tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n          rm sherpa-onnx-whisper-tiny.en.tar.bz2\n\n          python3 ./python-api-examples/non_streaming_server.py \\\n            --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \\\n            --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \\\n            --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for whisper models\n        if: matrix.model_type == 'whisper'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \\\n            ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \\\n            ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \\\n            ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav\n\n          python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \\\n            
./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \\\n            ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \\\n            ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav\n\n      - name: Start server for tdnn models\n        if: matrix.model_type == 'tdnn'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-tdnn-yesno.tar.bz2\n          tar xvf sherpa-onnx-tdnn-yesno.tar.bz2\n          rm sherpa-onnx-tdnn-yesno.tar.bz2\n\n          python3 ./python-api-examples/non_streaming_server.py \\\n            --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n            --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n            --sample-rate=8000 \\\n            --feat-dim=23 &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for tdnn models\n        if: matrix.model_type == 'tdnn'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/offline-websocket-client-decode-files-paralell.py \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_1_1_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_1_0_0_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_0_0_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_1_1_0.wav\n\n          python3 ./python-api-examples/offline-websocket-client-decode-files-sequential.py \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_1_1_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_1_0_0_1.wav \\\n            ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_0_0_1.wav \\\n            
./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_1_1_0.wav\n"
  },
  {
    "path": ".github/workflows/test-python-online-websocket-server.yaml",
    "content": "name: Python online websocket server\n\non:\n  push:\n    branches:\n      - master\n    paths:\n      - '.github/workflows/test-python-online-websocket-server.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n      - 'sherpa-onnx/python/**'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: python-online-websocket-server-${{ github.ref }}\n  cancel-in-progress: true\n\npermissions:\n  contents: read\n\njobs:\n  python_online_websocket_server:\n    runs-on: ${{ matrix.os }}\n    name: ${{ matrix.os }} ${{ matrix.python-version }} ${{ matrix.model_type }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, ubuntu-22.04, windows-2022, macos-latest, macos-14]\n        python-version: [\"3.10\"]\n        model_type: [\"transducer\", \"paraformer\", \"zipformer2-ctc\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-python-${{ matrix.python-version }}\n\n      - name: Setup Python ${{ matrix.python-version }}\n        uses: actions/setup-python@v5\n        with:\n          python-version: ${{ matrix.python-version }}\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip numpy pypinyin sentencepiece setuptools wheel\n\n      - name: Install sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          python3 -m pip install .\n          python3 -m pip install websockets\n\n      - name: Start server for zipformer2 CTC models\n        if: matrix.model_type == 'zipformer2-ctc'\n        shell: bash\n        run: |\n   
       curl -O -L https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n\n          python3 ./python-api-examples/streaming_server.py \\\n            --zipformer2-ctc ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n            --tokens=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt &\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for zipformer2 CTC models\n        if: matrix.model_type == 'zipformer2-ctc'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav\n\n      - name: Start server for transducer models\n        if: matrix.model_type == 'transducer'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\n\n          python3 ./python-api-examples/streaming_server.py \\\n            --encoder ./sherpa-onnx-streaming-zipformer-en-2023-06-26/encoder-epoch-99-avg-1-chunk-16-left-128.onnx \\\n            --decoder ./sherpa-onnx-streaming-zipformer-en-2023-06-26/decoder-epoch-99-avg-1-chunk-16-left-128.onnx \\\n            --joiner ./sherpa-onnx-streaming-zipformer-en-2023-06-26/joiner-epoch-99-avg-1-chunk-16-left-128.onnx \\\n            --tokens ./sherpa-onnx-streaming-zipformer-en-2023-06-26/tokens.txt &\n          echo \"sleep 10 
seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for transducer models\n        if: matrix.model_type == 'transducer'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/0.wav\n\n      - name: Start server for paraformer models\n        if: matrix.model_type == 'paraformer'\n        shell: bash\n        run: |\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n          python3 ./python-api-examples/streaming_server.py \\\n            --tokens ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n            --paraformer-encoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\n            --paraformer-decoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx &\n\n          echo \"sleep 10 seconds to wait for the server to start\"\n          sleep 10\n\n      - name: Start client for paraformer models\n        if: matrix.model_type == 'paraformer'\n        shell: bash\n        run: |\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\n\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/1.wav\n\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/2.wav\n\n          python3 ./python-api-examples/online-websocket-client-decode-file.py \\\n            
./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/3.wav\n"
  },
  {
    "path": ".github/workflows/test-rust-package.yaml",
    "content": "name: Test rust package\n\non:\n  push:\n    branches:\n      - rust-api\n  workflow_dispatch:\n\nconcurrency:\n  group: test-rust-package-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test-rust-package:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, macos-15-intel, ubuntu-22.04-arm]\n\n    env:\n      # Placeholder, will be overwritten per OS\n      SHERPA_ONNX_LIB_DIR: \"\"\n      RUSTFLAGS: \"\"\n\n    steps:\n      # Checkout the repository\n      - uses: actions/checkout@v4\n\n      # Install Rust stable\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n        with:\n          toolchain: stable\n\n      # Download prebuilt libraries depending on OS\n      - name: Download prebuilt Sherpa-ONNX libraries\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          if [[ \"${{ matrix.os }}\" == \"macos-latest\" ]]; then\n            d=sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-universal2-shared\n          elif [[ \"${{ matrix.os }}\" == \"macos-15-intel\" ]]; then\n            d=sherpa-onnx-v$SHERPA_ONNX_VERSION-osx-universal2-shared\n          elif [[ \"${{ matrix.os }}\" == \"ubuntu-latest\" ]]; then\n            d=sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-x64-shared\n          elif [[ \"${{ matrix.os }}\" == \"ubuntu-22.04-arm\" ]]; then\n            d=sherpa-onnx-v$SHERPA_ONNX_VERSION-linux-aarch64-shared-cpu\n          else\n            echo \"Unknown ${{ matrix.os }}\"\n            exit 1\n          fi\n\n          LIB_URL=\"https://github.com/k2-fsa/sherpa-onnx/releases/download/v$SHERPA_ONNX_VERSION/$d.tar.bz2\"\n\n          curl -SsL -O  \"$LIB_URL\"\n          tar -xvf $d.tar.bz2\n\n          ls -lh $d/lib\n\n          # Export environment variables for this step\n          echo \"SHERPA_ONNX_LIB_DIR=$PWD/$d/lib\" >> $GITHUB_ENV\n       
   echo \"RUSTFLAGS=-C link-arg=-Wl,-rpath,$PWD/$d/lib\" >> $GITHUB_ENV\n\n      - name: Show libs\n        shell: bash\n        run: |\n          echo \"SHERPA_ONNX_LIB_DIR: $SHERPA_ONNX_LIB_DIR\"\n          ls -lh $SHERPA_ONNX_LIB_DIR\n\n          echo \"RUSTFLAGS: $RUSTFLAGS\"\n\n      - name: Run test\n        shell: bash\n        run: |\n          ./.github/scripts/test-rust.sh\n"
  },
  {
    "path": ".github/workflows/test-rust.yaml",
    "content": "name: Test rust\n\non:\n  push:\n    branches:\n      - rust-api\n  workflow_dispatch:\n\nconcurrency:\n  group: test-rust-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  test-rust:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest, macos-latest, macos-15-intel, ubuntu-22.04-arm]\n\n    env:\n      # Placeholder, will be overwritten per OS\n      SHERPA_ONNX_LIB_DIR: \"\"\n      RUSTFLAGS: \"\"\n\n    steps:\n      # Checkout the repository\n      - uses: actions/checkout@v4\n\n      - name: ccache\n        uses: hendrikmuhs/ccache-action@v1.2\n        with:\n          key: ${{ matrix.os }}-rust\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Configure sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          mkdir build\n          cd build\n          cmake \\\n            -D SHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -D BUILD_SHARED_LIBS=ON \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            ..\n\n      - name: Build sherpa-onnx\n        shell: bash\n        run: |\n          export CMAKE_CXX_COMPILER_LAUNCHER=ccache\n          export PATH=\"/usr/lib/ccache:/usr/local/opt/ccache/libexec:$PATH\"\n          cmake --version\n\n          cd build\n          make -j2\n          make install\n          ls -lh install/lib\n\n          echo \"SHERPA_ONNX_LIB_DIR=$PWD/install/lib\" >> $GITHUB_ENV\n          echo \"RUSTFLAGS=-C link-arg=-Wl,-rpath,$PWD/install/lib\" >> $GITHUB_ENV\n\n      # Install Rust stable\n      - uses: actions-rust-lang/setup-rust-toolchain@v1\n        with:\n          toolchain: stable\n\n      - name: Show libs\n        shell: bash\n        run: |\n          echo \"SHERPA_ONNX_LIB_DIR: 
$SHERPA_ONNX_LIB_DIR\"\n          ls -lh $SHERPA_ONNX_LIB_DIR\n\n          echo \"RUSTFLAGS: $RUSTFLAGS\"\n\n      - name: Test locally\n        shell: bash\n        run: |\n          cd rust-api-examples\n\n          sed -i.bak 's|^sherpa-onnx *=.*|sherpa-onnx = { path = \"../sherpa-onnx/rust/sherpa-onnx\" }|' Cargo.toml\n\n          git diff .\n\n          cargo clean\n          cargo run --example version\n\n      - name: Run test\n        shell: bash\n        run: |\n          ./.github/scripts/test-rust.sh\n"
  },
  {
    "path": ".github/workflows/upload-models.yaml",
    "content": "name: upload-models\n\non:\n  push:\n    branches:\n      - upload-models\n  workflow_dispatch:\n\nconcurrency:\n  group: upload-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  upload-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: upload models\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: Upload DPDFNet\n        shell: bash\n        run: |\n\n          models=(\n            baseline.onnx\n            dpdfnet2.onnx\n            dpdfnet2_48khz_hr.onnx\n            dpdfnet4.onnx\n            dpdfnet8.onnx\n          )\n          for m in ${models[@]}; do\n            wget https://huggingface.co/Ceva-IP/DPDFNet/resolve/main/onnx/$m\n          done\n\n          mv baseline.onnx dpdfnet_baseline.onnx\n\n      - name: Install ffmpeg\n        if: false\n        shell: bash\n        run: |\n          sudo apt-get update\n          sudo apt-get install -y ffmpeg\n\n      - name: Verify ffmpeg\n        if: false\n        shell: bash\n        run: |\n          ffmpeg -version\n\n      - name: git config\n        shell: bash\n        run: |\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n      - name: FireRedASR2 CTC (int8)\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n          mkdir $d\n\n          pushd $d\n\n          cat >README.md <<EOF\n          # Introduction\n          Model files are converted from\n          https://www.modelscope.cn/models/FireRedTeam/FireRedASR2-AED\n\n          We export only the encoder and the CTC branch. 
The attention decoder\n          is not used.\n          EOF\n\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/ctc/model.int8.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/ctc/tokens.txt\n          mkdir test_wavs\n          cd test_wavs\n          for w in 0.wav 1.wav 2.wav 3-sichuan.wav 3.wav 4-tianjin.wav 5-henan.wav 8k.wav; do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16/resolve/main/test_wavs/$w\n          done\n\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: FireRedASR2 CTC (fp32)\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-fire-red-asr2-ctc-zh_en-2026-02-25\n          mkdir $d\n\n          pushd $d\n\n          cat >README.md <<EOF\n          # Introduction\n          Model files are converted from\n          https://www.modelscope.cn/models/FireRedTeam/FireRedASR2-AED\n\n          We export only the encoder and the CTC branch. 
The attention decoder\n          is not used.\n          EOF\n\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/ctc/model.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/ctc/model.weights\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/ctc/tokens.txt\n          mkdir test_wavs\n          cd test_wavs\n          for w in 0.wav 1.wav 2.wav 3-sichuan.wav 3.wav 4-tianjin.wav 5-henan.wav 8k.wav; do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16/resolve/main/test_wavs/$w\n          done\n\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: FireRedASR2 AED (int8)\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-fire-red-asr2-zh_en-int8-2026-02-26\n          mkdir $d\n\n          pushd $d\n\n          cat >README.md <<EOF\n          # Introduction\n          Model files are converted from\n          https://www.modelscope.cn/models/FireRedTeam/FireRedASR2-AED\n          EOF\n\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/encoder.int8.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/decoder.int8.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/tokens.txt\n          mkdir test_wavs\n          cd test_wavs\n          for w in 0.wav 1.wav 2.wav 3-sichuan.wav 3.wav 4-tianjin.wav 5-henan.wav 8k.wav; do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16/resolve/main/test_wavs/$w\n          done\n\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n     
 - name: FireRedASR2 AED (fp32)\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-fire-red-asr2-zh_en-2026-02-26\n          mkdir $d\n\n          pushd $d\n\n          cat >README.md <<EOF\n          # Introduction\n          Model files are converted from\n          https://www.modelscope.cn/models/FireRedTeam/FireRedASR2-AED\n          EOF\n\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/encoder.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/encoder.weights\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/decoder.onnx\n          curl -SL -O https://www.modelscope.cn/models/csukuangfj/FireRedASR2-AED-onnx/resolve/master/aed/tokens.txt\n          mkdir test_wavs\n          cd test_wavs\n          for w in 0.wav 1.wav 2.wav 3-sichuan.wav 3.wav 4-tianjin.wav 5-henan.wav 8k.wav; do\n            curl -SL -O https://huggingface.co/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16/resolve/main/test_wavs/$w\n          done\n\n          popd\n\n          ls -lh $d\n\n          tar cjvf $d.tar.bz2 $d\n\n          ls -lh *.tar.bz2\n\n      - name: Zipformer-30M-RNNT-6000h\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          repo=Zipformer-30M-RNNT-6000h\n          git clone https://huggingface.co/hynt/$repo\n          pushd $repo\n          mkdir test_wavs\n          cd test_wavs\n          wget https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-vi-2025-04-20/resolve/main/test_wavs/0.wav\n          wget https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-vi-2025-04-20/resolve/main/test_wavs/1.wav\n          wget https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-vi-2025-04-20/resolve/main/test_wavs/2.wav\n          wget 
https://huggingface.co/csukuangfj/sherpa-onnx-zipformer-vi-2025-04-20/resolve/main/test_wavs/README.md\n          popd\n\n          d=sherpa-onnx-zipformer-vi-30M-2026-02-09\n          mkdir -p $d\n          cat >$d/README.md <<EOF\n          # Introduction\n          Model files are from\n          https://huggingface.co/hynt/Zipformer-30M-RNNT-6000h\n          EOF\n\n          cp -v $repo/encoder-epoch-20-avg-10.onnx $d/encoder.onnx\n          cp -v $repo/decoder-epoch-20-avg-10.onnx $d/decoder.onnx\n          cp -v $repo/joiner-epoch-20-avg-10.onnx $d/joiner.onnx\n          cp -v $repo/bpe.model $d/\n          cp -v $repo/config.json $d/tokens.txt\n          cp -av $repo/test_wavs $d/\n\n          tar cjfv $d.tar.bz2 $d\n\n          d=sherpa-onnx-zipformer-vi-30M-int8-2026-02-09\n          mkdir -p $d\n          cat >$d/README.md <<EOF\n          # Introduction\n          Model files are from\n          https://huggingface.co/hynt/Zipformer-30M-RNNT-6000h\n          EOF\n\n          cp -v $repo/encoder-epoch-20-avg-10.int8.onnx $d/encoder.int8.onnx\n          cp -v $repo/decoder-epoch-20-avg-10.onnx $d/decoder.onnx\n          cp -v $repo/joiner-epoch-20-avg-10.int8.onnx $d/joiner.int8.onnx\n          cp -v $repo/bpe.model $d/\n          cp -v $repo/config.json $d/tokens.txt\n          cp -av $repo/test_wavs $d/\n\n          tar cjfv $d.tar.bz2 $d\n\n      - name: vosk-model-small-streaming-bn\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          repo=vosk-model-small-streaming-bn\n          git clone https://huggingface.co/alphacep/$repo\n          cd $repo\n          mv test.wav 0.wav\n          wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/bn.wav\n          mv bn.wav 1.wav\n          cd ..\n\n          d=sherpa-onnx-streaming-zipformer-bn-vosk-2026-02-09\n          mkdir $d\n          cat >$d/README.md <<EOF\n          # Introduction\n          Model files are from\n          
https://huggingface.co/alphacep/vosk-model-small-streaming-bn\n          EOF\n\n          mv $repo/am-onnx/*.onnx $d/\n          mv $repo/lang/* $d/\n          mkdir $d/test_wavs\n          mv $repo/*.wav $d/test_wavs\n\n          tar cfjv $d.tar.bz2 $d\n\n      - name: WenetSpeech Wu\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          git clone https://huggingface.co/csukuangfj2/sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-int8-2026-02-03\n          git clone https://huggingface.co/csukuangfj2/sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-2026-02-03\n\n          d=sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-int8-2026-02-03\n          rm -rf $d/.git*\n\n          tar cjfv $d.tar.bz2 $d\n\n          rm -rf $d\n\n          d=sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-2026-02-03\n          rm -rf $d/.git*\n\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n\n          ls -lh *.tar.bz2\n\n      - name: Setup tmate session\n        if: false\n        uses: mxschmitt/action-tmate@v3\n\n      - name: Collect funasr-nano with LLM\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          models=(\n            sherpa-onnx-funasr-nano-int8-2025-12-30\n            sherpa-onnx-funasr-nano-fp16-2025-12-30\n            sherpa-onnx-funasr-nano-2025-12-30\n          )\n          for d in ${models[@]}; do\n            git clone https://huggingface.co/csukuangfj/$d\n            rm -rf $d/.git\n            tar cjfv $d.tar.bz2 $d\n            ls -lh $d.tar.bz2\n            ls -lh $d\n            rm -rf $d\n          done\n\n      - name: Collect funasr-nano with LLM int8\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-funasr-nano-int8-2025-12-30\n          mkdir $d\n          pushd $d\n\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/llm_int8/llm.int8.onnx\n          curl -SL -O 
https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/embedding.int8.onnx\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/encoder_adaptor.int8.onnx\n\n          mkdir Qwen3-0.6B\n          cd Qwen3-0.6B\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/merges.txt\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/tokenizer.json\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/vocab.json\n\n          ls -lh\n          cd ..\n          mkdir test_wavs\n          cd test_wavs\n\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_hunan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_minnan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_sh.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_yue.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_4.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_5.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja.wav\n          curl -SL -O 
https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja_en_codeswitch.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_en_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/noise_en.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_biochemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_chemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_history.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_math.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_medical.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_physics.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/yuenan.wav\n\n          mv yuenan.wav vietnamese.wav\n\n          for f in *.wav; do\n            ffmpeg -y -loglevel error -i \"$f\" \\\n              -ac 1 -ar 16000 -sample_fmt s16 \\\n              \"${f}.tmp.wav\" \\\n            && mv \"${f}.tmp.wav\" \"$f\"\n          done\n\n     
     curl -SL -O https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_1.wav\n          curl -SL -O https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_3.wav\n\n          cat >README.md <<EOF\n          Audio files in this directory are downloaded from\n          https://github.com/FunAudioLLM/FunAudioLLM.github.io/tree/master/funasr/static/audios\n\n          | Filename| Transcript|\n          |---------|----------|\n          |湖南方言[dia_hunan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/dia_hunan.wav)|但总来讲孙膑对兵法的理解运用比庞涓略胜一筹。|\n          |闽南语[dia_minnan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/dia_minnan.wav)|嗯，下摆若有机会吧，因为即久吼开了吼卷啊遮厉害，会倒贴钱啊。|\n          |上海话[dia_sh.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/dia_sh.wav)|人跟狗，包括人跟动物接触长了，全有感情。葛末随了阿拉社会个富裕。|\n          |粤语[dia_yue.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/dia_yue.wav)|啲身体好劲啊，跟住咧佢哋有一个人咧就突然可能就有高原反应啦，突然间就啊窒息咗，即系晕晕咗。|\n          |中文歌曲[lyrics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics.wav)|我看到我的身后盯着我的人群，喜欢或恨不一样的神情，我知道这可能就是所谓的成名，我知道必须往前一步也不能停。|\n          |中文歌曲[lyrics_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_2.wav)|明明那么远，为何却感觉离他那么近？闭上眼，你甚至能背出他所有押韵。虽然不听说唱了，但你已学会自信。我代表所有中文说唱歌手向你致敬。如今面对困难的你，早已不再抱怨。|\n          |中文歌曲[lyrics_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_3.wav)|你听啊秋末的落叶，你听它叹息着离别，只剩我独自领略海与山风和月，你听啊。|\n          
|英文歌曲[lyrics_en_1.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_1.wav)|When I was young I'd listen to the radio. Waiting for my favorite songs. When they played I'd sing along. It made me smile.|\n          |英文歌曲[lyrics_en_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_2.wav)|I see your monsters. I see your pain. Tell me your problems; I'll chase them away. I'll be your lighthouse. I'll make it okay. When I see your monsters, I'll stand there so brave and chase them all away.|\n          |英文歌曲[lyrics_en_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_3.wav)|An empty street, an empty house, a hole inside my heart. I'm all alone and the rooms are getting smaller. I wonder how, I wonder why, I wonder where they are. The days we had, the songs we sang together.|\n          |英文[noise_en.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/noise_en.wav)|So what's interesting here is I feel that you know brands knowing this when people sort of speak to the voice assistance at home and if you want to be the brand.|\n          |[far_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/far_2.wav)|然后被冠以了渣男线的称号，好了，不管这个，那么前方即将到达沈杜公路站，左边是8号线。|\n          |[far_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/far_3.wav)|周末要不要去露营，最近天气超舒服，露营？我怕虫子咬，而且晚上睡帐篷会不会很冷啊？放心，我借了专业装备还有暖宝宝，再带点火锅食材，边吃边看星星超惬意。|\n          |[far_4.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/far_4.wav)|<music>唯一的遗憾就是他那个八宝鸭还有烤鸭都没吃上 估计得提前预定吧 <impact_sounds></impact_sounds>只能怪我自己没有做好功课</music>|\n          
|[far_5.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/far_5.wav)|别紧张<breathing></breathing>我只是我是在这边逛街 然后看到你们在这边拍照 想跟你交个朋友<impact_sounds> 认识</impact_sounds>一下|\n          |日语[ja.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/ja.wav)|人民たちは、金欲しさに王をのけ者にしてしまって、何でもすべて商人のところへ持って行ってしまいました。|\n          |日英混合[ja_en_codeswitch.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/ja_en_codeswitch.wav)|このカフェのwi-fiがアン ステーブル 過ぎて、google meetでディスコネクトされて クライエントに悪い印象を与えてしまった。|\n          |越南语[vietnamese.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/vietnamese.wav)|Đi cùng với tiếp tục kêu gọi người dân đã qua lại các ổ dịch này, khai báo y tế và yêu cầu liên hệ để được xét nghiệm.|\n          |[rag_biochemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_biochemistry.wav)|利用三磷酸腺苷的水解所产生的能量来驱动其他化学反应|\n          |[rag_chemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_chemistry.wav)|比如说酯在当时被认为是一种含氧酸盐|\n          |[rag_history.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_history.wav)|由罗马皇帝钦点的犹地亚王大希律王统治期间|\n          |[rag_math.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_math.wav)|对微分形式的积分是微分几何中的基本概念|\n          |[rag_medical.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_medical.wav)|肾脏中肾小球囊上的细胞膜孔隙很小|\n          |[rag_physics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/rag_physics.wav)|根据碰撞理论月面样本缺少挥发性物质|\n          EOF\n          cd ..\n\n          cat >README.md <<EOF\n\n          # Introduction\n          Models 
in this directory are downloaded from\n          https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/files\n\n          Export script can be found at\n          https://github.com/Wasser1462/FunASR-nano-onnx\n\n          The author is https://github.com/Wasser1462\n          EOF\n\n          popd\n          ls -lh $d\n          tar cjvf $d.tar.bz2 $d\n\n      - name: Collect funasr-nano with LLM float32\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-funasr-nano-2025-12-30\n          mkdir $d\n          pushd $d\n\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/llm_fp32/llm.fp32.onnx\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/llm_fp32/llm.fp32.data\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/embedding.onnx\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/encoder_adaptor.onnx\n\n          mkdir Qwen3-0.6B\n          cd Qwen3-0.6B\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/merges.txt\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/tokenizer.json\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/vocab.json\n\n          ls -lh\n          cd ..\n          mkdir test_wavs\n          cd test_wavs\n\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_hunan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_minnan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_sh.wav\n          curl -SL -O 
https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_yue.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_4.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_5.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja_en_codeswitch.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_en_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/noise_en.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_biochemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_chemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_history.wav\n          curl -SL -O 
https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_math.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_medical.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_physics.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/yuenan.wav\n\n          mv yuenan.wav vietnamese.wav\n\n          for f in *.wav; do\n            ffmpeg -y -loglevel error -i \"$f\" \\\n              -ac 1 -ar 16000 -sample_fmt s16 \\\n              \"${f}.tmp.wav\" \\\n            && mv \"${f}.tmp.wav\" \"$f\"\n          done\n\n          curl -SL -O https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_1.wav\n          curl -SL -O https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_3.wav\n\n          cat >README.md <<EOF\n          Audio files in this directory are downloaded from\n          https://github.com/FunAudioLLM/FunAudioLLM.github.io/tree/master/funasr/static/audios\n\n          | Filename| Transcript|\n          |---------|----------|\n          |湖南方言[dia_hunan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/dia_hunan.wav)|但总来讲孙膑对兵法的理解运用比庞涓略胜一筹。|\n          |闽南语[dia_minnan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/dia_minnan.wav)|嗯，下摆若有机会吧，因为即久吼开了吼卷啊遮厉害，会倒贴钱啊。|\n          |上海话[dia_sh.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/dia_sh.wav)|人跟狗，包括人跟动物接触长了，全有感情。葛末随了阿拉社会个富裕。|\n          
|粤语[dia_yue.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/dia_yue.wav)|啲身体好劲啊，跟住咧佢哋有一个人咧就突然可能就有高原反应啦，突然间就啊窒息咗，即系晕晕咗。|\n          |中文歌曲[lyrics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/lyrics.wav)|我看到我的身后盯着我的人群，喜欢或恨不一样的神情，我知道这可能就是所谓的成名，我知道必须往前一步也不能停。|\n          |中文歌曲[lyrics_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/lyrics_2.wav)|明明那么远，为何却感觉离他那么近？闭上眼，你甚至能背出他所有押韵。虽然不听说唱了，但你已学会自信。我代表所有中文说唱歌手向你致敬。如今面对困难的你，早已不再抱怨。|\n          |中文歌曲[lyrics_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/lyrics_3.wav)|你听啊秋末的落叶，你听它叹息着离别，只剩我独自领略海与山风和月，你听啊。|\n          |英文歌曲[lyrics_en_1.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_1.wav)|When I was young I'd listen to the radio. Waiting for my favorite songs. When they played I'd sing along. It made me smile.|\n          |英文歌曲[lyrics_en_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/lyrics_en_2.wav)|I see your monsters. I see your pain. Tell me your problems; I'll chase them away. I'll be your lighthouse. I'll make it okay. When I see your monsters, I'll stand there so brave and chase them all away.|\n          |英文歌曲[lyrics_en_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_3.wav)|An empty street, an empty house, a hole inside my heart. I'm all alone and the rooms are getting smaller. I wonder how, I wonder why, I wonder where they are. 
The days we had, the songs we sang together.|\n          |英文[noise_en.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/noise_en.wav)|So what's interesting here is I feel that you know brands knowing this when people sort of speak to the voice assistance at home and if you want to be the brand.|\n          |[far_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/far_2.wav)|然后被冠以了渣男线的称号，好了，不管这个，那么前方即将到达沈杜公路站，左边是8号线。|\n          |[far_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/far_3.wav)|周末要不要去露营，最近天气超舒服，露营？我怕虫子咬，而且晚上睡帐篷会不会很冷啊？放心，我借了专业装备还有暖宝宝，再带点火锅食材，边吃边看星星超惬意。|\n          |[far_4.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/far_4.wav)|<music>唯一的遗憾就是他那个八宝鸭还有烤鸭都没吃上 估计得提前预定吧 <impact_sounds></impact_sounds>只能怪我自己没有做好功课</music>|\n          |[far_5.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/far_5.wav)|别紧张<breathing></breathing>我只是我是在这边逛街 然后看到你们在这边拍照 想跟你交个朋友<impact_sounds> 认识</impact_sounds>一下|\n          |日语[ja.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/ja.wav)|人民たちは、金欲しさに王をのけ者にしてしまって、何でもすべて商人のところへ持って行ってしまいました。|\n          |日英混合[ja_en_codeswitch.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/ja_en_codeswitch.wav)|このカフェのwi-fiがアン ステーブル 過ぎて、google meetでディスコネクトされて クライエントに悪い印象を与えてしまった。|\n          |越南语[vietnamese.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/vietnamese.wav)|Đi cùng với tiếp tục kêu gọi người dân đã qua lại các ổ dịch này, khai báo y tế và yêu cầu liên hệ để được xét nghiệm.|\n          |[rag_biochemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_biochemistry.wav)|利用三磷酸腺苷的水解所产生的能量来驱动其他化学反应|\n        
  |[rag_chemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_chemistry.wav)|比如说酯在当时被认为是一种含氧酸盐|\n          |[rag_history.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_history.wav)|由罗马皇帝钦点的犹地亚王大希律王统治期间|\n          |[rag_math.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_math.wav)|对微分形式的积分是微分几何中的基本概念|\n          |[rag_medical.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_medical.wav)|肾脏中肾小球囊上的细胞膜孔隙很小|\n          |[rag_physics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-2025-12-30/resolve/main/test_wavs/rag_physics.wav)|根据碰撞理论月面样本缺少挥发性物质|\n          EOF\n          cd ..\n\n          cat >README.md <<EOF\n\n          # Introduction\n          Models in this directory are downloaded from\n          https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/files\n\n          Export script can be found at\n          https://github.com/Wasser1462/FunASR-nano-onnx\n\n          The author is https://github.com/Wasser1462\n          EOF\n\n          popd\n          ls -lh $d\n          tar cjvf $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n\n      - name: Collect funasr-nano with LLM fp16\n        if: false\n        shell: bash\n        run: |\n          d=sherpa-onnx-funasr-nano-fp16-2025-12-30\n          mkdir $d\n          pushd $d\n\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/llm_fp16/llm.fp16.onnx\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/embedding.int8.onnx\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/encoder_adaptor.int8.onnx\n\n          mkdir Qwen3-0.6B\n          cd Qwen3-0.6B\n          curl -SL -O 
https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/merges.txt\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/tokenizer.json\n          curl -SL -O https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/resolve/master/Qwen3-0.6B/vocab.json\n\n          ls -lh\n          cd ..\n          mkdir test_wavs\n          cd test_wavs\n\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_hunan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_minnan.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_sh.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/dia_yue.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_4.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/far_5.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/ja_en_codeswitch.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics.wav\n          curl -SL -O 
https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_3.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/lyrics_en_2.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/noise_en.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_biochemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_chemistry.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_history.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_math.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_medical.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/rag_physics.wav\n          curl -SL -O https://github.com/FunAudioLLM/FunAudioLLM.github.io/raw/refs/heads/master/funasr/static/audios/yuenan.wav\n\n          mv yuenan.wav vietnamese.wav\n\n          for f in *.wav; do\n            ffmpeg -y -loglevel error -i \"$f\" \\\n              -ac 1 -ar 16000 -sample_fmt s16 \\\n              \"${f}.tmp.wav\" \\\n            && mv \"${f}.tmp.wav\" \"$f\"\n          done\n\n          curl -SL -O https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_1.wav\n          curl -SL -O 
https://modelscope.cn/models/csukuangfj/sherpa-doc-files/resolve/master/source/_static/fun-asr-nano-2025-12-30/lyrics_en_3.wav\n\n          cat >README.md <<EOF\n          Audio files in this directory are downloaded from\n          https://github.com/FunAudioLLM/FunAudioLLM.github.io/tree/master/funasr/static/audios\n\n          | Filename| Transcript|\n          |---------|----------|\n          |湖南方言[dia_hunan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/dia_hunan.wav)|但总来讲孙膑对兵法的理解运用比庞涓略胜一筹。|\n          |闽南语[dia_minnan.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/dia_minnan.wav)|嗯，下摆若有机会吧，因为即久吼开了吼卷啊遮厉害，会倒贴钱啊。|\n          |上海话[dia_sh.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/dia_sh.wav)|人跟狗，包括人跟动物接触长了，全有感情。葛末随了阿拉社会个富裕。|\n          |粤语[dia_yue.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/dia_yue.wav)|啲身体好劲啊，跟住咧佢哋有一个人咧就突然可能就有高原反应啦，突然间就啊窒息咗，即系晕晕咗。|\n          |中文歌曲[lyrics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/lyrics.wav)|我看到我的身后盯着我的人群，喜欢或恨不一样的神情，我知道这可能就是所谓的成名，我知道必须往前一步也不能停。|\n          |中文歌曲[lyrics_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/lyrics_2.wav)|明明那么远，为何却感觉离他那么近？闭上眼，你甚至能背出他所有押韵。虽然不听说唱了，但你已学会自信。我代表所有中文说唱歌手向你致敬。如今面对困难的你，早已不再抱怨。|\n          |中文歌曲[lyrics_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/lyrics_3.wav)|你听啊秋末的落叶，你听它叹息着离别，只剩我独自领略海与山风和月，你听啊。|\n          |英文歌曲[lyrics_en_1.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_1.wav)|When I was young I'd listen to the radio. Waiting for my favorite songs. When they played I'd sing along. 
It made me smile.|\n          |英文歌曲[lyrics_en_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/lyrics_en_2.wav)|I see your monsters. I see your pain. Tell me your problems; I'll chase them away. I'll be your lighthouse. I'll make it okay. When I see your monsters, I'll stand there so brave and chase them all away.|\n          |英文歌曲[lyrics_en_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-int8-2025-12-30/resolve/main/test_wavs/lyrics_en_3.wav)|An empty street, an empty house, a hole inside my heart. I'm all alone and the rooms are getting smaller. I wonder how, I wonder why, I wonder where they are. The days we had, the songs we sang together.|\n          |英文[noise_en.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/noise_en.wav)|So what's interesting here is I feel that you know brands knowing this when people sort of speak to the voice assistance at home and if you want to be the brand.|\n          |[far_2.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/far_2.wav)|然后被冠以了渣男线的称号，好了，不管这个，那么前方即将到达沈杜公路站，左边是8号线。|\n          |[far_3.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/far_3.wav)|周末要不要去露营，最近天气超舒服，露营？我怕虫子咬，而且晚上睡帐篷会不会很冷啊？放心，我借了专业装备还有暖宝宝，再带点火锅食材，边吃边看星星超惬意。|\n          |[far_4.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/far_4.wav)|<music>唯一的遗憾就是他那个八宝鸭还有烤鸭都没吃上 估计得提前预定吧 <impact_sounds></impact_sounds>只能怪我自己没有做好功课</music>|\n          |[far_5.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/far_5.wav)|别紧张<breathing></breathing>我只是我是在这边逛街 然后看到你们在这边拍照 想跟你交个朋友<impact_sounds> 认识</impact_sounds>一下|\n          
|日语[ja.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/ja.wav)|人民たちは、金欲しさに王をのけ者にしてしまって、何でもすべて商人のところへ持って行ってしまいました。|\n          |日英混合[ja_en_codeswitch.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/ja_en_codeswitch.wav)|このカフェのwi-fiがアン ステーブル 過ぎて、google meetでディスコネクトされて クライエントに悪い印象を与えてしまった。|\n          |越南语[vietnamese.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/vietnamese.wav)|Đi cùng với tiếp tục kêu gọi người dân đã qua lại các ổ dịch này, khai báo y tế và yêu cầu liên hệ để được xét nghiệm.|\n          |[rag_biochemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_biochemistry.wav)|利用三磷酸腺苷的水解所产生的能量来驱动其他化学反应|\n          |[rag_chemistry.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_chemistry.wav)|比如说酯在当时被认为是一种含氧酸盐|\n          |[rag_history.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_history.wav)|由罗马皇帝钦点的犹地亚王大希律王统治期间|\n          |[rag_math.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_math.wav)|对微分形式的积分是微分几何中的基本概念|\n          |[rag_medical.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_medical.wav)|肾脏中肾小球囊上的细胞膜孔隙很小|\n          |[rag_physics.wav](https://huggingface.co/csukuangfj/sherpa-onnx-funasr-nano-fp16-2025-12-30/resolve/main/test_wavs/rag_physics.wav)|根据碰撞理论月面样本缺少挥发性物质|\n          EOF\n          cd ..\n\n          cat >README.md <<EOF\n\n          # Introduction\n          Models in this directory are downloaded from\n          https://www.modelscope.cn/models/zengshuishui/FunASR-nano-onnx/files\n\n          Export script can be found at\n          https://github.com/Wasser1462/FunASR-nano-onnx\n\n      
    The author is https://github.com/Wasser1462\n          EOF\n\n          popd\n          ls -lh $d\n          tar cjvf $d.tar.bz2 $d\n          ls -lh *.tar.bz2\n\n      - name: Streaming zipformer from Banafo/Kroko-ASR\n        if: false\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          git lfs install\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/Banafo/Kroko-ASR src\n          pushd src\n          curl -SL -O https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python/resolve/main/de_encoder.onnx\n          curl -SL -O https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python/resolve/main/de_decoder.onnx\n          curl -SL -O https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python/resolve/main/de_joiner.onnx\n          curl -SL -O https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Python/resolve/main/de_tokens.txt\n          popd\n\n          for lang in en es fr de; do\n            repo=sherpa-onnx-streaming-zipformer-$lang-kroko-2025-08-06\n            git clone https://huggingface.co/csukuangfj/$repo\n            cp src/${lang}_encoder.onnx $repo/encoder.onnx\n            cp src/${lang}_decoder.onnx $repo/decoder.onnx\n            cp src/${lang}_joiner.onnx $repo/joiner.onnx\n            cp src/${lang}_tokens.txt $repo/tokens.txt\n\n            pushd $repo\n\n            echo \"See license at https://huggingface.co/Banafo/Kroko-ASR\" > README.md\n\n            mkdir -p test_wavs\n            pushd test_wavs\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$lang.wav\n            mv $lang.wav 0.wav\n            popd\n\n            git lfs track \"*.onnx\" \"*.wav\"\n            git status\n            ls -lh\n            git add .\n            git commit -m 'add model files' || true\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$repo main || true\n\n            popd\n\n            rm 
-rf $repo/.git*\n\n            tar cjfv $repo.tar.bz2 $repo\n\n            ls -lh *.tar.bz2\n          done\n\n      - name: FireRed ASR fp16\n        if: false\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16 hf\n\n          git lfs install\n          git clone https://www.modelscope.cn/csukuangfj/sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16.git ms\n\n          d=sherpa-onnx-fire-red-asr-large-zh_en-fp16-2025-02-16\n          git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d\n          mv -v hf/test_wavs $d\n          mv -v hf/README.md $d\n          mv -v hf/tokens.txt $d\n          mv -v ms/*.onnx $d\n\n          pushd $d\n          git lfs track \"*.onnx\"\n          git lfs track \"*.wav\"\n          git status\n          git add .\n          git commit -m \"add models\"\n          ls -lh\n          git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n          popd\n\n          rm -rf $d/.git\n          rm -rf $d/.gitattributes\n          tar cjvf $d.tar.bz2 $d\n\n      - name: Zipformer CTC (non-streaming)\n        if: false\n        shell: bash\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          git lfs install\n          names=(\n            sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\n            sherpa-onnx-zipformer-ctc-zh-2025-07-03\n            sherpa-onnx-zipformer-ctc-zh-fp16-2025-07-03\n            sherpa-onnx-zipformer-ctc-small-zh-int8-2025-07-16\n            sherpa-onnx-zipformer-ctc-small-zh-fp16-2025-07-16\n            sherpa-onnx-zipformer-ctc-small-zh-2025-07-16\n          )\n          for name in ${names[@]}; do\n            rm -rf ms\n            git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/$name.git ms\n          
  git clone https://huggingface.co/csukuangfj/$name\n\n            cp -av ms/test_wavs $name\n            cp -v ms/*.onnx $name\n            cp -v ms/tokens.txt $name\n            cp -v ms/bbpe.model $name\n\n            pushd $name\n            git lfs track \"*.wav\" \"*.onnx\" \"*.model\"\n            git add .\n            git status\n            git commit -m 'add models' || true\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$name main || true\n\n            # git lfs pull\n            rm -rf .git\n            rm -rfv .gitattributes\n            ls -lh\n            popd\n\n            tar cjfv $name.tar.bz2 $name\n            rm -rf $name\n            ls -lh *.tar.bz2\n          done\n\n      - name: sense-voice\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          d=sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2025-09-09\n          f=sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\n          git clone https://huggingface.co/csukuangfj/$d\n          git clone https://huggingface.co/csukuangfj/$f\n\n          rm -rf $d/.git\n          rm -rf $f/.gi*\n\n          rm -rf $d/.gitattributes\n          rm -rf $f/.gitattributes\n\n          tar cjfv $d.tar.bz2 $d\n          tar cjfv $f.tar.bz2 $f\n\n          ls -lh *.tar.bz2\n\n      - name: wenetspeech chuan paraformer\n        if: false\n        shell: bash\n        run: |\n          git lfs install\n          d=sherpa-onnx-paraformer-zh-int8-2025-10-07\n          f=sherpa-onnx-paraformer-zh-2025-10-07\n          git clone https://huggingface.co/csukuangfj/$d\n          git clone https://huggingface.co/csukuangfj/$f\n\n          rm -rf $d/.git\n          rm -rf $f/.gi*\n\n          rm -rf $d/.gitattributes\n          rm -rf $f/.gitattributes\n\n          tar cjfv $d.tar.bz2 $d\n          tar cjfv $f.tar.bz2 $f\n\n          ls -lh *.tar.bz2\n\n      - name: u2ppconformer\n        if: false\n        shell: bash\n        run: |\n          git 
lfs install\n\n          d=sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-2025-09-10\n          f=sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\n\n          git clone https://huggingface.co/csukuangfj/$d\n          git clone https://huggingface.co/csukuangfj/$f\n\n          rm -rf $d/.git\n          rm -rf $f/.gi*\n\n          rm -rf $d/.gitattributes\n          rm -rf $f/.gitattributes\n\n          tar cjfv $d.tar.bz2 $d\n          tar cjfv $f.tar.bz2 $f\n\n          ls -lh *.tar.bz2\n\n      - name: Vietnamese (zipformer)\n        if: false\n        shell: bash\n        run: |\n          rm -rf models\n          mkdir models\n          cd models\n          cat >README.md <<EOF\n          # Introduction\n          Models in this directory are from\n          https://huggingface.co/zzasdf/viet_iter3_pseudo_label\n          which are trained on about 70k hours of data.\n          EOF\n\n          git lfs install\n          git clone https://huggingface.co/csukuangfj/viet_iter3_pseudo_label hf\n\n          ls -lh\n\n          d=sherpa-onnx-zipformer-vi-2025-04-20\n          mkdir -p $d\n          cp -v hf/exp/encoder-epoch-12-avg-8.onnx $d/\n          cp -v hf/exp/decoder-epoch-12-avg-8.onnx $d/\n          cp -v hf/exp/joiner-epoch-12-avg-8.onnx $d/\n          cp -v hf/data/Vietnam_bpe_2000_new/bpe.model $d/\n          cp -v hf/data/Vietnam_bpe_2000_new/tokens.txt $d/\n          cp -av hf/test_wavs $d\n          cp -v README.md $d\n\n          tar cjfv $d.tar.bz2 $d\n\n          d=sherpa-onnx-zipformer-vi-int8-2025-04-20\n          mkdir -p $d\n\n          cp -v hf/exp/encoder-epoch-12-avg-8.int8.onnx $d/\n          cp -v hf/exp/decoder-epoch-12-avg-8.onnx $d/\n          cp -v hf/exp/joiner-epoch-12-avg-8.int8.onnx $d/\n          cp -v hf/data/Vietnam_bpe_2000_new/bpe.model $d/\n          cp -v hf/data/Vietnam_bpe_2000_new/tokens.txt $d/\n          cp -av hf/test_wavs $d\n          cp -v README.md $d\n\n          tar 
cjfv $d.tar.bz2 $d\n\n          rm -rf hf\n\n          ls -lh\n\n          cd ..\n\n          mv models/* .\n\n      - name: Publish to huggingface (Vietnamese zipformer)\n        if: false\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            models=(\n              sherpa-onnx-zipformer-vi-2025-04-20\n              sherpa-onnx-zipformer-vi-int8-2025-04-20\n            )\n            for d in ${models[@]}; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d huggingface\n              cp -av $d/* huggingface\n\n              pushd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"bpe.model\"\n              git lfs track \"*.wav\"\n              git status\n              git add .\n\n              git commit -m \"add models\"\n              git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n\n              popd\n            done\n\n      - name: vosk-model-ru (stream zipformer)\n        if: false\n        shell: bash\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n          cat >README.md <<EOF\n          # Introduction\n          Models in this directory are from\n          https://huggingface.co/alphacep/vosk-model-small-streaming-ru\n          EOF\n\n          git lfs install\n          git clone https://huggingface.co/alphacep/vosk-model-small-streaming-ru hf\n\n          git clone 
https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-small-ru-vosk-int8-2025-08-16 int8\n          git clone https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-small-ru-vosk-2025-08-16 fp32\n\n          rm -fv int8/*.onnx\n          rm -fv fp32/*.onnx\n\n          mkdir -p int8/test_wavs\n          mkdir -p fp32/test_wavs\n\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/russian-i-love-you.wav\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/test.wav\n\n          mv russian-i-love-you.wav 0.wav\n          mv test.wav 1.wav\n\n          cp -v README.md int8/\n          cp -v README.md fp32/\n\n          cp -v *.wav int8/test_wavs\n          cp -v *.wav fp32/test_wavs\n\n          cp -v hf/am-onnx/{encoder,decoder,joiner}.onnx fp32/\n\n          cp -v hf/am-onnx/{encoder,joiner}.int8.onnx int8/\n          cp -v hf/am-onnx/decoder.onnx int8/\n\n          cp -v hf/lang/tokens.txt int8/\n          cp -v hf/lang/bpe.model int8/\n\n          cp -v hf/lang/tokens.txt fp32/\n          cp -v hf/lang/bpe.model fp32/\n\n          mv int8 sherpa-onnx-streaming-zipformer-small-ru-vosk-int8-2025-08-16\n          mv fp32 sherpa-onnx-streaming-zipformer-small-ru-vosk-2025-08-16\n\n          models=(\n            sherpa-onnx-streaming-zipformer-small-ru-vosk-2025-08-16\n            sherpa-onnx-streaming-zipformer-small-ru-vosk-int8-2025-08-16\n          )\n\n          for d in ${models[@]}; do\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            pushd $d\n            git lfs track \"*.onnx\"\n            git lfs track \"bpe.model\"\n            git lfs track \"*.wav\"\n            git status\n            git add .\n\n            git commit -m \"add models\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/csukuangfj/$d main\n            
popd\n\n            rm -rf $d/.git*\n\n            tar cjfv $d.tar.bz2 $d\n          done\n          ls -lh *.tar.bz2\n\n      - name: vosk-model-ru (zipformer)\n        if: false\n        shell: bash\n        run: |\n          rm -rf models\n          mkdir models\n          cd models\n          cat >README.md <<EOF\n          # Introduction\n          Models in this directory are from\n          https://huggingface.co/alphacep/vosk-model-ru/tree/main\n          EOF\n\n          git lfs install\n          git clone https://huggingface.co/alphacep/vosk-model-ru hf\n\n          ls -lh\n\n          mkdir test_wavs\n          pushd test_wavs\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/russian-i-love-you.wav\n          curl -SL -O https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/russian/test.wav\n\n          mv russian-i-love-you.wav 0.wav\n          mv test.wav 1.wav\n          popd\n\n          d=sherpa-onnx-zipformer-ru-2025-04-20\n          mkdir $d\n          cp -v hf/am-onnx/encoder.onnx $d\n          cp -v hf/am-onnx/decoder.onnx $d\n          cp -v hf/am-onnx/joiner.onnx $d\n          cp -v hf/lang/bpe.model $d\n          cp -v hf/lang/tokens.txt $d\n          cp -av test_wavs $d/\n          cp -v README.md $d\n\n          tar cjfv $d.tar.bz2 $d\n\n          d=sherpa-onnx-zipformer-ru-int8-2025-04-20\n          mkdir $d\n          cp -v hf/am-onnx/encoder.int8.onnx $d\n          cp -v hf/am-onnx/decoder.onnx $d\n          cp -v hf/am-onnx/joiner.int8.onnx $d\n          cp -v hf/lang/bpe.model $d\n          cp -v hf/lang/tokens.txt $d\n          cp -av test_wavs $d\n          cp -v README.md $d\n\n          tar cjfv $d.tar.bz2 $d\n\n          rm -rf hf\n\n          ls -lh\n\n          cd ..\n\n          mv models/* .\n\n      - name: Publish to huggingface\n        if: true\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: 
nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            models=(\n              sherpa-onnx-zipformer-ru-2025-04-20\n              sherpa-onnx-zipformer-ru-int8-2025-04-20\n              sherpa-onnx-funasr-nano-int8-2025-12-30\n              sherpa-onnx-funasr-nano-fp16-2025-12-30\n              sherpa-onnx-funasr-nano-2025-12-30\n              sherpa-onnx-streaming-zipformer-bn-vosk-2026-02-09\n              sherpa-onnx-zipformer-vi-30M-int8-2026-02-09\n              sherpa-onnx-zipformer-vi-30M-2026-02-09\n              sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\n              sherpa-onnx-fire-red-asr2-ctc-zh_en-2026-02-25\n              sherpa-onnx-fire-red-asr2-zh_en-int8-2026-02-26\n              sherpa-onnx-fire-red-asr2-zh_en-2026-02-26\n            )\n            for d in ${models[@]}; do\n              if [ ! 
-d $d ]; then\n                continue;\n              fi\n\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n              rm -rf huggingface\n              git clone https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d huggingface\n\n              rm -rf huggingface/*.onnx\n              rm -rf huggingface/*/*.wav\n\n              cp -av $d/* huggingface\n\n              pushd huggingface\n              git lfs track \"*.onnx\"\n              git lfs track \"*.data\"\n              git lfs track \"*.weights\"\n              git lfs track \"bpe.model\"\n              git lfs track \"*.wav\"\n              git lfs track \"*.json\"\n              git status\n              git add .\n\n              git commit -m \"add models\"\n              git push https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/$d main\n\n              popd\n            done\n\n            rm -rf huggingface\n\n      - name: Publish to modelscope\n        if: true\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n            for m in *.tar.bz2; do\n              export GIT_LFS_SKIP_SMUDGE=1\n              export GIT_CLONE_PROTECTION_ACTIVE=false\n\n              rm -rf ms\n              git clone https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git ms\n\n              cp -av $m ms/\n\n              pushd ms\n              git lfs track \"*.tar.bz2\"\n              git status\n              ls -lh\n              git add .\n\n              git commit -m \"add models\"\n              git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/csukuangfj/asr-models.git\n\n              popd\n            done\n\n      - 
name: Release\n        if: true\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: asr-models\n\n      - name: Release\n        if: false\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.onnx\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: speech-enhancement-models\n"
  },
  {
    "path": ".github/workflows/upload-zipvoice-models.yaml",
    "content": "name: upload-zipvoice-models\n\non:\n  push:\n    branches:\n      - upload-zipvoice-onnx-models\n  workflow_dispatch:\n\nconcurrency:\n  group: upload-zipvoice-models-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  upload-zipvoice-models:\n    if: github.repository_owner == 'k2-fsa' || github.repository_owner == 'csukuangfj'\n    name: upload zipvoice models\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        python-version: [\"3.10\"]\n\n    steps:\n      - uses: actions/checkout@v4\n\n      - name: git config\n        shell: bash\n        run: |\n          git config --global user.email \"csukuangfj@gmail.com\"\n          git config --global user.name \"Fangjun Kuang\"\n\n      - name: Setup Python 3.10\n        uses: actions/setup-python@v5\n        with:\n          python-version: \"3.10\"\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip pypinyin\n\n      - name: sherpa-onnx-zipvoice-distill-zh-en-emilia-int8\n        shell: bash\n        run: |\n          echo \"Generate lexicon.txt\"\n\n          python3 ./scripts/zipvoice/zh-en/generate_lexicon.py\n\n          d=sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n          mkdir $d\n\n          cp lexicon.txt $d\n\n          pushd $d\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/prompt.txt\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/news-female.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/news-female-2.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/leijun-1.wav\n\n\n          curl -SL -O https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/fm_decoder_int8.onnx\n          curl -SL -O 
https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/text_encoder_int8.onnx\n\n          mv fm_decoder_int8.onnx decoder.int8.onnx\n          mv text_encoder_int8.onnx encoder.int8.onnx\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n\n          curl -SL -O https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/tokens.txt\n          mkdir test_wavs\n          mv *.wav test_wavs\n\n          mv prompt.txt test_wavs\n\n          ls -lh\n          popd\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n          ls -lh $d.tar.bz2\n\n      - name: sherpa-onnx-zipvoice-distill-zh-en-emilia-fp32\n        shell: bash\n        run: |\n          echo \"Generate lexicon.txt\"\n\n          python3 ./scripts/zipvoice/zh-en/generate_lexicon.py\n\n          d=sherpa-onnx-zipvoice-distill-fp32-zh-en-emilia\n          mkdir $d\n\n          cp lexicon.txt $d\n\n          pushd $d\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/prompt.txt\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/news-female.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/news-female-2.wav\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/leijun-1.wav\n\n\n          curl -SL -O https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/fm_decoder.onnx\n          curl -SL -O https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/text_encoder.onnx\n\n          mv fm_decoder.onnx decoder.onnx\n          mv text_encoder.onnx encoder.onnx\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\n          tar xf espeak-ng-data.tar.bz2\n          rm espeak-ng-data.tar.bz2\n\n          curl 
-SL -O https://huggingface.co/k2-fsa/ZipVoice/resolve/main/zipvoice_distill/tokens.txt\n          mkdir test_wavs\n          mv *.wav test_wavs\n\n          mv prompt.txt test_wavs\n\n          ls -lh\n          popd\n          tar cjfv $d.tar.bz2 $d\n          rm -rf $d\n          ls -lh $d.tar.bz2\n\n      - name: Release\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          file: ./*.tar.bz2\n          overwrite: true\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: tts-models\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-en-asr-zipformer.yaml",
    "content": "name: wasm-simd-hf-space-en-asr-zipformer\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-en-asr-zipformer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-en-asr-zipformer:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/asr/assets\n          ls -lh\n          echo \"----------\"\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n          mv sherpa-onnx-streaming-zipformer-en-2023-06-21/encoder-epoch-99-avg-1.int8.onnx encoder.onnx\n          mv sherpa-onnx-streaming-zipformer-en-2023-06-21/decoder-epoch-99-avg-1.onnx decoder.onnx\n          mv sherpa-onnx-streaming-zipformer-en-2023-06-21/joiner-epoch-99-avg-1.onnx joiner.onnx\n          mv sherpa-onnx-streaming-zipformer-en-2023-06-21/tokens.txt ./\n\n          rm -rf sherpa-onnx-streaming-zipformer-en-2023-06-21\n\n          ls -lh\n\n      - name: Build sherpa-onnx for WebAssembly (ASR)\n        shell: bash\n        run: |\n          
./build-wasm-simd-asr.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-en-asr-zipformer\n          mv build-wasm-simd-asr/install/bin/wasm/asr $dst\n          ls -lh $dst\n          tar cjfv ${dst}.tar.bz2 ./${dst}\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-en-asr-zipformer\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-en.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git 
status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-en.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-en huggingface\n            cd huggingface\n            rm -rf ./*\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-en main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-silero-vad.yaml",
    "content": "name: wasm-simd-hf-space-silero-vad\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-silero-vad-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-silero-vad:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/vad/assets\n          ls -lh\n          echo \"----------\"\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n          ls -lh\n\n      - name: Build sherpa-onnx for WebAssembly\n        shell: bash\n        run: |\n          ./build-wasm-simd-vad.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-vad\n          mv build-wasm-simd-vad/install/bin/wasm/vad $dst\n          ls -lh $dst\n          tar cjfv $dst.tar.bz2 ./$dst\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-vad\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' 
|| github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-sherpa-onnx.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-vad/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/csukuangfj/web-assembly-vad-sherpa-onnx.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config 
--global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-vad-sherpa-onnx huggingface\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-vad/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-vad-sherpa-onnx main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-speaker-diarization.yaml",
    "content": "name: wasm-simd-hf-space-speaker-diarization\n\non:\n  push:\n    branches:\n      - wasm\n      - wasm-speaker-diarization\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-speaker-diarization-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-speaker-diarization:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/speaker-diarization/assets/\n          ls -lh\n          echo \"----------\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n          mv sherpa-onnx-pyannote-segmentation-3-0/model.onnx ./segmentation.onnx\n          rm -rf sherpa-onnx-pyannote-segmentation-3-0\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n          mv 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ./embedding.onnx\n\n          echo \"----------\"\n\n          ls -lh\n\n      - name: Build sherpa-onnx for WebAssembly\n        shell: bash\n        
run: |\n          ./build-wasm-simd-speaker-diarization.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-speaker-diarization\n          mv build-wasm-simd-speaker-diarization/install/bin/wasm/speaker-diarization $dst\n          ls -lh $dst\n          tar cjfv $dst.tar.bz2 ./$dst\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-speaker-diarization\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/csukuangfj/web-assembly-speaker-diarization-sherpa-onnx.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n          
  cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/csukuangfj/web-assembly-speaker-diarization-sherpa-onnx.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-speaker-diarization-sherpa-onnx huggingface\n            ls -lh\n\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-speaker-diarization-sherpa-onnx main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-speech-enhancement-gtcrn.yaml",
    "content": "name: wasm-simd-hf-space-speech-enhancement-gtcrn\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-speech-enhancement-gtcrn-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-speech-enhancement-gtcrn:\n    name: wasm gtcrn\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model\n        shell: bash\n        run: |\n          cd wasm/speech-enhancement/assets\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n          mv gtcrn_simple.onnx gtcrn.onnx\n\n      - name: build\n        shell: bash\n        run: |\n          ./build-wasm-simd-speech-enhancement.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          d=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-speech-enhancement-gtcrn\n          mv build-wasm-simd-speech-enhancement/install/bin/wasm/speech-enhancement $d\n          ls -lh $d\n          tar cjfv $d.tar.bz2 $d\n\n          echo \"---\"\n\n          ls -lh *.tar.bz2\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: wasm-speech-enhancement-gtcrn\n          
path: ./*.tar.bz2\n\n      - name: Release\n        # if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.10.46\n\n      - name: Release\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/csukuangfj/wasm-speech-enhancement-gtcrn.git ms\n\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push 
https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/csukuangfj/wasm-speech-enhancement-gtcrn.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/wasm-speech-enhancement-gtcrn huggingface\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/wasm-speech-enhancement-gtcrn main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-ten-vad.yaml",
    "content": "name: wasm-simd-hf-space-ten-vad\n\non:\n  push:\n    branches:\n      - wasm\n      - wasm-ten-vad\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-ten-vad-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-ten-vad:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/vad/assets\n          ls -lh\n          echo \"----------\"\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n          ls -lh\n          cd ..\n          sed -i.bak 's|.*(with <a .*|    (with <a href=\"https://github.com/TEN-framework/ten-vad\">ten-vad</a>)|' ./index.html\n          git diff .\n\n      - name: Build sherpa-onnx for WebAssembly\n        shell: bash\n        run: |\n          ./build-wasm-simd-vad.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-ten-vad\n          mv build-wasm-simd-vad/install/bin/wasm/vad $dst\n          ls -lh $dst\n          tar cjfv $dst.tar.bz2 ./$dst\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n  
      with:\n          name: sherpa-onnx-wasm-simd-ten-vad\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/csukuangfj/web-assembly-ten-vad-sherpa-onnx.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-ten-vad/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/csukuangfj/web-assembly-ten-vad-sherpa-onnx.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 
200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-ten-vad-sherpa-onnx huggingface\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-ten-vad/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-ten-vad-sherpa-onnx main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-tts.yaml",
    "content": "name: wasm-simd-hf-space-tts\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-tts${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-tts:\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"7\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/wasm\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-tts.py --total $total --index $index\n\n          chmod +x run-tts.sh\n          mv -v ./run-tts.sh ../..\n\n      - name: Show build scripts\n        shell: bash\n        run: |\n          cat ./run-tts.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: run-tts-${{ matrix.index }}\n          path: ./run-tts.sh\n\n      - name: Build sherpa-onnx for WebAssembly\n        shell: bash\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          ./run-tts.sh\n\n      - name: 
Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.19\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-tts-${{ matrix.index }}\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-vad-asr.yaml",
    "content": "name: wasm-simd-hf-space-vad-asr\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-vad-asr${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-vad-asr:\n    name: ${{ matrix.index }}/${{ matrix.total }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n        total: [\"15\"]\n        index: [\"0\", \"1\", \"2\", \"3\", \"4\", \"5\", \"6\", \"7\", \"8\", \"9\", \"10\", \"11\", \"12\", \"13\", \"14\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install Python dependencies\n        shell: bash\n        run: |\n          python3 -m pip install --upgrade pip jinja2\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Generate build script\n        shell: bash\n        run: |\n          cd scripts/wasm\n\n          total=${{ matrix.total }}\n          index=${{ matrix.index }}\n\n          ./generate-vad-asr.py --total $total --index $index\n\n          chmod +x run-vad-asr.sh\n          mv -v ./run-vad-asr.sh ../..\n\n      - name: Show build scripts\n        shell: bash\n        run: |\n          cat ./run-vad-asr.sh\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: run-vad-asr-${{ matrix.index }}\n          path: ./run-vad-asr.sh\n\n      - name: Build sherpa-onnx for WebAssembly\n        shell: bash\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n   
       HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        run: |\n          ./run-vad-asr.sh\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.10.23\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-vad-asr-${{ matrix.index }}\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-zh-cantonese-en-asr-paraformer.yaml",
    "content": "name: wasm-simd-hf-space-zh-cantonese-en-asr-paraformer\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-zh-cantonese-en-asr-paraformer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-zh-cantonese-en-asr-paraformer:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/asr/assets\n          ls -lh\n          echo \"----------\"\n\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en.tar.bz2\n          tar xvf sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en.tar.bz2\n          rm sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en.tar.bz2\n\n          mv sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en/encoder.int8.onnx encoder.onnx\n          mv sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en/decoder.int8.onnx decoder.onnx\n          mv sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en/tokens.txt ./\n\n          rm -rf sherpa-onnx-streaming-paraformer-trilingual-zh-cantonese-en\n\n          ls -lh\n\n          cd ../\n\n          sed -i.bak s/\"type = 0\"/\"type = 1\"/g ./sherpa-onnx-asr.js\n          
sed -i.bak s/Zipformer/Paraformer/g ./index.html\n\n          git diff\n\n      - name: Build sherpa-onnx for WebAssembly (ASR)\n        shell: bash\n        run: |\n          ./build-wasm-simd-asr.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-zh-cantonese-en-asr-paraformer\n          mv build-wasm-simd-asr/install/bin/wasm/asr $dst\n          ls -lh $dst\n          tar cjfv ${dst}.tar.bz2 ./${dst}\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-zh-cantonese-en-asr-paraformer\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer huggingface\n            cd huggingface\n            rm -fv *.js\n            rm 
-fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer main\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 10\n          timeout_seconds: 600\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git config lfs.locksverify true\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer.git\n\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-zh-en-asr-paraformer.yaml",
    "content": "name: wasm-simd-hf-space-zh-en-asr-paraformer\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-zh-en-asr-paraformer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-zh-en-asr-paraformer:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/asr/assets\n          ls -lh\n          echo \"----------\"\n\n          curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n          rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n          mv sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx encoder.onnx\n          mv sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx decoder.onnx\n          mv sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt ./\n\n          rm -rf sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n          ls -lh\n\n          cd ../\n\n          sed -i.bak s/\"type = 0\"/\"type = 1\"/g ./sherpa-onnx-asr.js\n          sed -i.bak s/Zipformer/Paraformer/g ./index.html\n\n          git diff\n\n      - name: Build 
sherpa-onnx for WebAssembly (ASR)\n        shell: bash\n        run: |\n          ./build-wasm-simd-asr.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-zh-en-asr-paraformer\n          mv build-wasm-simd-asr/install/bin/wasm/asr $dst\n          ls -lh $dst\n          tar cjfv ${dst}.tar.bz2 ./${dst}\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-zh-en-asr-paraformer\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff 
origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer huggingface\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer main\n"
  },
  {
    "path": ".github/workflows/wasm-simd-hf-space-zh-en-asr-zipformer.yaml",
    "content": "name: wasm-simd-hf-space-zh-en-asr-zipformer\n\non:\n  push:\n    branches:\n      - wasm\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: wasm-simd-hf-space-zh-en-asr-zipformer-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  wasm-simd-hf-space-zh-en-asr-zipformer:\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [ubuntu-latest]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Install emsdk\n        uses: mymindstorm/setup-emsdk@v14\n        with:\n          version: 3.1.53\n          actions-cache-folder: 'emsdk-cache'\n\n      - name: View emsdk version\n        shell: bash\n        run: |\n          emcc -v\n          echo \"--------------------\"\n          emcc --check\n\n      - name: Download model files\n        shell: bash\n        run: |\n          cd wasm/asr/assets\n          ls -lh\n          echo \"----------\"\n          wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n          mv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx encoder.onnx\n          mv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx decoder.onnx\n          mv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx joiner.onnx\n          mv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ./\n          rm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n\n          ls -lh\n\n      
- name: Build sherpa-onnx for WebAssembly (ASR)\n        shell: bash\n        run: |\n          ./build-wasm-simd-asr.sh\n\n      - name: collect files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-zh-en-asr-zipformer\n          mv build-wasm-simd-asr/install/bin/wasm/asr $dst\n          ls -lh $dst\n          tar cjfv ${dst}.tar.bz2 ./${dst}\n\n      - name: Upload wasm files\n        uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-wasm-simd-zh-en-asr-zipformer\n          path: ./sherpa-onnx-wasm-simd-*.tar.bz2\n\n      - name: Release\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: ./*.tar.bz2\n\n      - name: Publish to ModelScope\n        # if: false\n        env:\n          MS_TOKEN: ${{ secrets.MODEL_SCOPE_GIT_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf ms\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en.git ms\n            cd ms\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff 
origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en.git\n\n      - name: Publish to huggingface\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v2\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_LFS_SKIP_SMUDGE=1\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n\n            git clone https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en huggingface\n            cd huggingface\n            rm -fv *.js\n            rm -fv *.data\n            git fetch\n            git pull\n            git merge -m \"merge remote\" --ff origin main\n\n            cp -v ../sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-*/* .\n\n            git status\n            git lfs track \"*.data\"\n            git lfs track \"*.wasm\"\n            ls -lh\n\n            git add .\n            git commit -m \"update model\"\n            git push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en main\n"
  },
  {
    "path": ".github/workflows/windows-arm64.yaml",
    "content": "name: windows-arm64\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/windows-arm64.yaml'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: windows-arm64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  windows_arm64:\n    name: shared-${{ matrix.shared_lib }} tts-${{ matrix.with_tts }} static CRT ${{ matrix.use_static_crt }} ${{ matrix.build_type }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        build_type: [Release, Debug, MinSizeRel, RelWithDebInfo]\n        shared_lib: [ON, OFF]\n        with_tts: [ON, OFF]\n        use_static_crt: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Set up MSVC\n        uses: ilammy/msvc-dev-cmd@v1\n\n      - name: find dumpbin\n        shell: bash\n        run: |\n          which dumpbin\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -A ARM64 \\\n            -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.with_tts }} \\\n            -DSHERPA_ONNX_USE_STATIC_CRT=${{ matrix.use_static_crt }} \\\n            -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n            -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -D BUILD_SHARED_LIBS=${{ matrix.shared_lib }} \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D BUILD_ESPEAK_NG_EXE=OFF \\\n            ..\n\n      - name: Check 1\n        shell: bash\n        run: |\n          cd build\n\n          cat sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 2\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" 
sherpa-onnx\\csrc\\sherpa-onnx.vcxproj\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-vcxproj-release-windows-arm64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 3\n        shell: bash\n        run: |\n          cd build\n\n          cat c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Check 4\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" \"c-api-examples\\vad-whisper-c-api.vcxproj\"\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: vad-whisper-c-api-vcxproj-release-windows-arm64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . --config ${{ matrix.build_type }} -- -m:2\n          cmake --build . 
--config ${{ matrix.build_type }} --target install -- -m:2\n\n          ls -lh ./bin/${{ matrix.build_type }}/sherpa-onnx.exe\n\n      - name: Show exe\n        shell: bash\n        run: |\n          ls -lh $PWD/build/bin/${{ matrix.build_type }}\n\n      - name: Dump CRT dependencies\n        shell: cmd\n        run: |\n          dumpbin /dependents build\\bin\\${{ matrix.build_type }}\\sherpa-onnx.exe\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-windows-arm64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          shared_lib=${{ matrix.shared_lib }}\n          use_static_crt=${{ matrix.use_static_crt }}\n          if [[ $shared_lib == \"ON\" ]]; then\n            if [[ $use_static_crt == ON ]]; then\n              suffix=shared-MT-${{ matrix.build_type }}\n            else\n              suffix=shared-MD-${{ matrix.build_type }}\n            fi\n          else\n            if [[ $use_static_crt == ON ]]; then\n              suffix=static-MT-${{ matrix.build_type }}\n            else\n              suffix=static-MD-${{ matrix.build_type }}\n            fi\n          fi\n\n          if [[ ${{ matrix.with_tts }} == ON ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-arm64-$suffix\n          else\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-arm64-$suffix-no-tts\n          fi\n\n          if [[ \"${{ matrix.build_type }}\" == \"Debug\" || \"${{ matrix.build_type }}\" == \"RelWithDebInfo\" ]]; then\n            echo \"Copy matching PDB files...\"\n\n            build_bin_dir=build/bin/${{ matrix.build_type }}\n            install_bin_dir=build/install/bin\n\n            for exe in ${install_bin_dir}/*.exe; do\n          
    base=$(basename \"$exe\" .exe)\n              pdb=${build_bin_dir}/${base}.pdb\n\n              if [[ -f \"$pdb\" ]]; then\n                echo \"Copying $pdb\"\n                cp \"$pdb\" ${install_bin_dir}/\n              else\n                echo \"No PDB found for $base\"\n              fi\n            done\n          fi\n\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          ls -lh $dst/bin/\n          echo \"---\"\n          ls -lh $dst/lib/\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n          ls -lh $dst/\n\n          ls -lh *.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=win64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for Windows arm64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-arm64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.28\n\n      - name: Release pre-compiled binaries and libs for Windows arm64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-arm64*.tar.bz2\n"
  },
  {
    "path": ".github/workflows/windows-x64-cuda.yaml",
    "content": "name: windows-x64-cuda\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/windows-x64-cuda.yaml'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: windows-x64-cuda-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  windows_x64_cuda:\n    name: Windows x64 CUDA ${{ matrix.onnxruntime_version }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        onnxruntime_version: [\"1.17.1\", \"1.23.2\"]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n          curl -SL -O https://github.com/microsoft/onnxruntime/releases/download/v$onnxruntime_version/onnxruntime-win-x64-gpu-$onnxruntime_version.zip\n          unzip onnxruntime-win-x64-gpu-$onnxruntime_version.zip\n\n          export SHERPA_ONNXRUNTIME_LIB_DIR=$PWD/onnxruntime-win-x64-gpu-$onnxruntime_version/lib\n          export SHERPA_ONNXRUNTIME_INCLUDE_DIR=$PWD/onnxruntime-win-x64-gpu-$onnxruntime_version/include\n\n          mkdir build\n          cd build\n          cmake \\\n          -A x64 \\\n          -D CMAKE_BUILD_TYPE=Release \\\n          -D BUILD_SHARED_LIBS=ON \\\n          -D CMAKE_INSTALL_PREFIX=./install \\\n          -D SHERPA_ONNX_ENABLE_GPU=ON \\\n          ..\n\n      - name: Build sherpa-onnx for 
windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . --config Release -- -m:2\n          cmake --build . --config Release --target install -- -m:2\n\n          ls -lh ./bin/Release/sherpa-onnx.exe\n\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x64-cuda\n\n          onnxruntime_version=${{ matrix.onnxruntime_version }}\n          if [[ $onnxruntime_version == \"1.23.2\" ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-cuda-12.x-cudnn-9.x-win-x64-cuda\n          fi\n\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      - name: Release pre-compiled binaries and libs for windows x64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*cuda.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.15\n\n      - name: Release pre-compiled binaries and libs for windows x64\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*cuda.tar.bz2\n\n      - name: Test spoken language identification\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export 
EXE=sherpa-onnx-offline-language-identification.exe\n\n          .github/scripts/test-spoken-language-identification.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-ctc.sh\n\n      - name: Test offline TTS\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx-offline-tts.exe\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test online paraformer for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-paraformer.sh\n\n      - name: Test offline Whisper for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test offline CTC for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test offline transducer for Windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online transducer for Windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/Release:$PATH\n          export EXE=decode-file-c-api.exe\n\n          .github/scripts/test-online-transducer.sh\n\n\n"
  },
  {
    "path": ".github/workflows/windows-x64-jni.yaml",
    "content": "name: windows-x64-jni\n\non:\n  push:\n    branches:\n      - jni\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: windows-x64-jni-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  windows_x64_jni:\n    name: windows x64 jni\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - uses: actions/setup-java@v4\n        with:\n          distribution: 'temurin' # See 'Supported distributions' for available options\n          java-version: '21'\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -A x64 \\\n            -DBUILD_SHARED_LIBS=ON \\\n            -D SHERPA_ONNX_ENABLE_JNI=ON \\\n            -DCMAKE_INSTALL_PREFIX=./install \\\n            -DCMAKE_BUILD_TYPE=Release \\\n            -DSHERPA_ONNX_ENABLE_WEBSOCKET=OFF \\\n            -DBUILD_ESPEAK_NG_EXE=OFF \\\n            -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF  \\\n            -DSHERPA_ONNX_ENABLE_BINARY=OFF \\\n            -DSHERPA_ONNX_ENABLE_C_API=OFF \\\n            ..\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . --config Release -- -m:2\n          cmake --build . 
--config Release --target install -- -m:2\n\n          rm -rf install/share\n          rm -rf install/lib/share\n          rm -rf install/lib/pkgconfig\n          rm -rf install/lib/sherpa-onnx-c-api.*\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-jni-windows-x64\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x64-jni\n          mkdir -p $dst\n\n          cp -a build/install/lib $dst/ || true\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=jni/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for Windows x64\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*.tar.bz2\n          # repo_name: k2-fsa/sherpa-onnx\n          # repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          # tag: v1.12.18\n"
  },
  {
    "path": ".github/workflows/windows-x64.yaml",
    "content": "name: windows-x64\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/windows-x64.yaml'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: windows-x64-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  windows_x64:\n    name: shared-${{ matrix.shared_lib }} tts-${{ matrix.with_tts }} static CRT ${{ matrix.use_static_crt }} ${{ matrix.build_type }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        build_type: [Release, Debug, MinSizeRel, RelWithDebInfo]\n        shared_lib: [ON, OFF]\n        with_tts: [ON, OFF]\n        use_static_crt: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up MSVC\n        uses: ilammy/msvc-dev-cmd@v1\n\n      - name: find dumpbin\n        shell: bash\n        run: |\n          which dumpbin\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n\n          cmake --version\n\n          cd build\n          cmake \\\n            -A x64 \\\n            -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.with_tts }} \\\n            -DSHERPA_ONNX_USE_STATIC_CRT=${{ matrix.use_static_crt }} \\\n            -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n            -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -D BUILD_SHARED_LIBS=${{ matrix.shared_lib }} \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n     
       -D BUILD_ESPEAK_NG_EXE=OFF \\\n            ..\n\n      - name: Check 1\n        shell: bash\n        run: |\n          cd build\n\n          cat sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 2\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" sherpa-onnx\\csrc\\sherpa-onnx.vcxproj\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-vcxproj-windows-x64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 3\n        shell: bash\n        run: |\n          cd build\n\n          cat c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Check 4\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" \"c-api-examples\\vad-whisper-c-api.vcxproj\"\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: vad-whisper-c-api-vcxproj-windows-x64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n\n          cmake --version\n\n          cmake --build . --config ${{ matrix.build_type }} -- -m:2\n          cmake --build . 
--config ${{ matrix.build_type }} --target install -- -m:2\n\n          ls -lh ./bin/${{ matrix.build_type }}/sherpa-onnx.exe\n\n      - name: Show exe\n        shell: bash\n        run: |\n          ls -lh $PWD/build/bin/${{ matrix.build_type }}\n\n      - name: Dump CRT dependencies\n        shell: cmd\n        run: |\n          dumpbin /dependents build\\bin\\${{ matrix.build_type }}\\sherpa-onnx.exe\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: windows-x64-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          shared_lib=${{ matrix.shared_lib }}\n          use_static_crt=${{ matrix.use_static_crt }}\n          if [[ $shared_lib == \"ON\" ]]; then\n            if [[ $use_static_crt == ON ]]; then\n              suffix=shared-MT-${{ matrix.build_type }}\n            else\n              suffix=shared-MD-${{ matrix.build_type }}\n            fi\n          else\n            if [[ $use_static_crt == ON ]]; then\n              suffix=static-MT-${{ matrix.build_type }}\n            else\n              suffix=static-MD-${{ matrix.build_type }}\n            fi\n          fi\n\n          if [[ ${{ matrix.with_tts }} == ON ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x64-$suffix\n          else\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x64-$suffix-no-tts\n          fi\n\n          mkdir $dst\n\n          if [[ \"${{ matrix.build_type }}\" == \"Debug\" || \"${{ matrix.build_type }}\" == \"RelWithDebInfo\" ]]; then\n            echo \"Copy matching PDB files...\"\n\n            build_bin_dir=build/bin/${{ matrix.build_type }}\n            install_bin_dir=build/install/bin\n\n            for exe in ${install_bin_dir}/*.exe; 
do\n              base=$(basename \"$exe\" .exe)\n              pdb=${build_bin_dir}/${base}.pdb\n\n              if [[ -f \"$pdb\" ]]; then\n                echo \"Copying $pdb\"\n                cp \"$pdb\" ${install_bin_dir}/\n              else\n                echo \"No PDB found for $base\"\n              fi\n            done\n          fi\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          ls -lh $dst/bin/\n          echo \"---\"\n          ls -lh $dst/lib/\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n          ls -lh $dst/\n\n          ls -lh *.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=win64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for Windows x64\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-x64*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.28\n\n      - name: Release pre-compiled binaries and libs for Windows x64\n        if: github.repository_owner == 'k2-fsa'&& github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-x64*.tar.bz2\n\n      - name: Test offline Moonshine for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-moonshine.sh\n\n      - name: Test C++ API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export CXX_STREAMING_ZIPFORMER_EXE=streaming-zipformer-cxx-api.exe\n          export CXX_WHISPER_EXE=whisper-cxx-api.exe\n          export CXX_SENSE_VOICE_EXE=sense-voice-cxx-api.exe\n\n          .github/scripts/test-cxx-api.sh\n\n      - name: Test offline speaker diarization\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline-speaker-diarization.exe\n\n          .github/scripts/test-speaker-diarization.sh\n\n      - name: Test online punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n       
   export EXE=sherpa-onnx-online-punctuation.exe\n\n          .github/scripts/test-online-punctuation.sh\n\n      - name: Test offline punctuation\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline-punctuation.exe\n\n          .github/scripts/test-offline-punctuation.sh\n\n      - name: Test C API\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export SLID_EXE=spoken-language-identification-c-api.exe\n          export SID_EXE=speaker-identification-c-api.exe\n          export AT_EXE=audio-tagging-c-api.exe\n          export PUNCT_EXE=add-punctuation-c-api.exe\n\n          .github/scripts/test-c-api.sh\n\n      - name: Test Audio tagging\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline-audio-tagging.exe\n\n          .github/scripts/test-audio-tagging.sh\n\n      - name: Test spoken language identification (C++ API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline-language-identification.exe\n\n          .github/scripts/test-spoken-language-identification.sh\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-ctc.sh\n\n      - name: Test offline TTS\n        if: matrix.with_tts == 'ON'\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline-tts.exe\n\n          .github/scripts/test-offline-tts.sh\n\n      - name: Test online paraformer for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n 
         export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-paraformer.sh\n\n      - name: Test offline Whisper for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-whisper.sh\n\n      - name: Test offline CTC for windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-ctc.sh\n\n      - name: Test offline transducer for Windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx-offline.exe\n\n          .github/scripts/test-offline-transducer.sh\n\n      - name: Test online transducer for Windows x64\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-transducer.sh\n\n      - name: Test online transducer (C API)\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=decode-file-c-api.exe\n\n          .github/scripts/test-online-transducer.sh\n"
  },
  {
    "path": ".github/workflows/windows-x86.yaml",
    "content": "name: windows-x86\n\non:\n  push:\n    branches:\n      - master\n    tags:\n      - 'v[0-9]+.[0-9]+.[0-9]+*'\n    paths:\n      - '.github/workflows/windows-x86.yaml'\n      - '.github/scripts/test-online-transducer.sh'\n      - '.github/scripts/test-online-paraformer.sh'\n      - '.github/scripts/test-offline-transducer.sh'\n      - '.github/scripts/test-offline-ctc.sh'\n      - '.github/scripts/test-offline-tts.sh'\n      - '.github/scripts/test-online-ctc.sh'\n      - 'cmake/**'\n      - 'sherpa-onnx/csrc/*'\n\n  workflow_dispatch:\n\nconcurrency:\n  group: windows-x86-${{ github.ref }}\n  cancel-in-progress: true\n\njobs:\n  windows_x86:\n    name: shared-${{ matrix.shared_lib }} tts-${{ matrix.with_tts }} static CRT ${{ matrix.use_static_crt }} ${{ matrix.build_type }}\n    runs-on: ${{ matrix.os }}\n    strategy:\n      fail-fast: false\n      matrix:\n        os: [windows-2022]\n        build_type: [Release, Debug, MinSizeRel, RelWithDebInfo]\n        shared_lib: [OFF, ON]\n        with_tts: [ON, OFF]\n        use_static_crt: [ON, OFF]\n\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0\n\n      - name: Update version\n        shell: bash\n        run: |\n          ./new-release.sh\n          git diff .\n\n      - name: Set up MSVC\n        uses: ilammy/msvc-dev-cmd@v1\n\n      - name: find dumpbin\n        shell: bash\n        run: |\n          which dumpbin\n\n      - name: Configure CMake\n        shell: bash\n        run: |\n          mkdir build\n          cd build\n          cmake \\\n            -A Win32 \\\n            -DSHERPA_ONNX_ENABLE_TTS=${{ matrix.with_tts }} \\\n            -DSHERPA_ONNX_USE_STATIC_CRT=${{ matrix.use_static_crt }} \\\n            -D CMAKE_BUILD_TYPE=${{ matrix.build_type }} \\\n            -D SHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n            -D BUILD_SHARED_LIBS=${{ matrix.shared_lib }} \\\n            -D CMAKE_INSTALL_PREFIX=./install \\\n            -D 
BUILD_ESPEAK_NG_EXE=OFF \\\n            ..\n\n      - name: Check 1\n        shell: bash\n        run: |\n          cd build\n\n          cat sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 2\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" sherpa-onnx\\csrc\\sherpa-onnx.vcxproj\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: sherpa-onnx-vcxproj-release-windows-x86-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/sherpa-onnx/csrc/sherpa-onnx.vcxproj\n\n      - name: Check 3\n        shell: bash\n        run: |\n          cd build\n\n          cat c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Check 4\n        shell: cmd\n        run: |\n          cd build\n\n          findstr /R /C:\"<RuntimeLibrary>\" \"c-api-examples\\vad-whisper-c-api.vcxproj\"\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: vad-whisper-c-api-vcxproj-release-windows-x86-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/c-api-examples/vad-whisper-c-api.vcxproj\n\n      - name: Build sherpa-onnx for windows\n        shell: bash\n        run: |\n          cd build\n          cmake --build . --config ${{ matrix.build_type }} -- -m:2\n          cmake --build . 
--config ${{ matrix.build_type }} --target install -- -m:2\n\n          ls -lh ./bin/${{ matrix.build_type }}/sherpa-onnx.exe\n\n      - name: Show exe\n        shell: bash\n        run: |\n          ls -lh $PWD/build/bin/${{ matrix.build_type }}\n\n      - name: Dump CRT dependencies\n        shell: cmd\n        run: |\n          dumpbin /dependents build\\bin\\${{ matrix.build_type }}\\sherpa-onnx.exe\n\n      - uses: actions/upload-artifact@v4\n        with:\n          name: release-windows-x86-${{ matrix.shared_lib }}-${{ matrix.with_tts }}-static-crt-${{ matrix.use_static_crt }}-${{ matrix.build_type }}\n          path: build/install/*\n\n      - name: Copy files\n        shell: bash\n        run: |\n          SHERPA_ONNX_VERSION=v$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n          shared_lib=${{ matrix.shared_lib }}\n          use_static_crt=${{ matrix.use_static_crt }}\n          if [[ $shared_lib == \"ON\" ]]; then\n            if [[ $use_static_crt == ON ]]; then\n              suffix=shared-MT-${{ matrix.build_type }}\n            else\n              suffix=shared-MD-${{ matrix.build_type }}\n            fi\n          else\n            if [[ $use_static_crt == ON ]]; then\n              suffix=static-MT-${{ matrix.build_type }}\n            else\n              suffix=static-MD-${{ matrix.build_type }}\n            fi\n          fi\n\n          if [[ ${{ matrix.with_tts }} == ON ]]; then\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x86-$suffix\n          else\n            dst=sherpa-onnx-${SHERPA_ONNX_VERSION}-win-x86-$suffix-no-tts\n          fi\n\n          if [[ \"${{ matrix.build_type }}\" == \"Debug\" || \"${{ matrix.build_type }}\" == \"RelWithDebInfo\" ]]; then\n            echo \"Copy matching PDB files...\"\n\n            build_bin_dir=build/bin/${{ matrix.build_type }}\n            install_bin_dir=build/install/bin\n\n            for exe in ${install_bin_dir}/*.exe; do\n              
base=$(basename \"$exe\" .exe)\n              pdb=${build_bin_dir}/${base}.pdb\n\n              if [[ -f \"$pdb\" ]]; then\n                echo \"Copying $pdb\"\n                cp \"$pdb\" ${install_bin_dir}/\n              else\n                echo \"No PDB found for $base\"\n              fi\n            done\n          fi\n\n          mkdir $dst\n\n          cp -a build/install/bin $dst/\n          cp -a build/install/lib $dst/\n          cp -a build/install/include $dst/\n\n          ls -lh $dst/bin/\n          echo \"---\"\n          ls -lh $dst/lib/\n\n          tar cjvf ${dst}.tar.bz2 $dst\n\n          ls -lh $dst/\n\n          ls -lh *.tar.bz2\n\n      # https://huggingface.co/docs/hub/spaces-github-actions\n      - name: Publish to huggingface\n        if: (github.repository_owner == 'csukuangfj' || github.repository_owner == 'k2-fsa') && (github.event_name == 'push' || github.event_name == 'workflow_dispatch')\n        env:\n          HF_TOKEN: ${{ secrets.HF_TOKEN }}\n        uses: nick-fields/retry@v3\n        with:\n          max_attempts: 20\n          timeout_seconds: 200\n          shell: bash\n          command: |\n            SHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n            git config --global user.email \"csukuangfj@gmail.com\"\n            git config --global user.name \"Fangjun Kuang\"\n\n            rm -rf huggingface\n            export GIT_CLONE_PROTECTION_ACTIVE=false\n            GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/csukuangfj2/sherpa-onnx-libs huggingface\n\n            cd huggingface\n            dst=win64/$SHERPA_ONNX_VERSION\n            mkdir -p $dst\n\n            cp -v ../sherpa-onnx-*.tar.bz2 $dst\n\n            git status\n            git lfs track \"*.bz2\"\n\n            git add .\n\n            git commit -m \"upload sherpa-onnx-${SHERPA_ONNX_VERSION}\"\n\n            git push 
https://csukuangfj2:$HF_TOKEN@huggingface.co/csukuangfj2/sherpa-onnx-libs main\n\n      - name: Release pre-compiled binaries and libs for Windows x86\n        if: github.repository_owner == 'csukuangfj' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-x86*.tar.bz2\n          repo_name: k2-fsa/sherpa-onnx\n          repo_token: ${{ secrets.UPLOAD_GH_SHERPA_ONNX_TOKEN }}\n          tag: v1.12.28\n\n      - name: Release pre-compiled binaries and libs for Windows x86\n        if: github.repository_owner == 'k2-fsa' && github.event_name == 'push' && contains(github.ref, 'refs/tags/')\n        uses: svenstaro/upload-release-action@v2\n        with:\n          file_glob: true\n          overwrite: true\n          file: sherpa-onnx-*-win-x86*.tar.bz2\n\n      - name: Test online CTC\n        shell: bash\n        run: |\n          export PATH=$PWD/build/bin/${{ matrix.build_type }}:$PATH\n          export EXE=sherpa-onnx.exe\n\n          .github/scripts/test-online-ctc.sh\n"
  },
  {
    "path": ".gitignore",
    "content": "build\n*.zip\n*.tgz\n*.sw?\nonnxruntime-*\nicefall-*\nrun.sh\n__pycache__\ndist/\nsherpa_onnx.egg-info/\n.DS_Store\nbuild-aarch64-linux-gnu\nbuild-arm-linux-gnueabihf\nsherpa-onnx-streaming-zipformer-*\nsherpa-onnx-lstm-en-*\nsherpa-onnx-lstm-zh-*\nbuild-android-arm64-v8a/\nbuild-android-armv7-eabi/\nbuild-android-x86-64/\na.txt\nrun-bilingual*.sh\nrun-*-zipformer.sh\nrun-zh.sh\ndecode-file-c-api\noffline-tts-c-api\nrun-decode-file-c-api.sh\nsherpa-onnx-ffmpeg\nbuild-ios\nbuild-swift-macos\naa.sh\nclient-2.sh\nffmpeg-examples/run-3.sh\npython-api-examples/decode-file-multiple-bak-2.py\nrun-en-zipformer-microphone*\nrun-websocket-server*\ndecode-file\n*.dylib\ntokens.txt\n*.onnx\nlog.txt\ntags\nrun-decode-file-python.sh\nandroid/SherpaOnnx/app/src/main/assets/\n*.ncnn.*\nrun-sherpa-onnx-offline.sh\nsherpa-onnx-conformer-en-2023-03-18\nparaformer-onnxruntime-python-example\nrun-sherpa-onnx-offline-paraformer.sh\nrun-sherpa-onnx-offline-transducer.sh\nsherpa-onnx-paraformer-zh-2023-03-28\nsherpa-onnx-paraformer-zh-2023-09-14\nrun-offline-websocket-server-paraformer.sh\nrun-*int8.sh\na.sh\nrun-offline-websocket-client-*.sh\nrun-sherpa-onnx-*.sh\nsherpa-onnx-zipformer-en-2023-03-30\nsherpa-onnx-zipformer-en-2023-04-01\nrun-offline-decode-files.sh\nsherpa-onnx-nemo-ctc-en-citrinet-512\nsherpa-onnx-streaming-paraformer-bilingual-zh-en\nrun-offline-decode-files-nemo-ctc.sh\nsherpa-onnx-nemo-ctc-*\n*.wav\nsherpa-onnx-zipformer-*\nsherpa-onnx-conformer-*\nsherpa-onnx-whisper-*\nswift-api-examples/k2fsa-*\nrun-*.sh\ntwo-pass-*.sh\nbuild-*\n\n## User settings\nxcuserdata/\n\n## Xcode 8 and 
earlier\n*.xcscmblueprint\n*.xccheckout\nvits-vctk\nvits-zh-aishell3\njslint.mjs\nvits-piper-en_US-amy-low\nvits-piper-*-*-*\nlog\n*.exe\nvits-piper-*\nvits-coqui-*\nvits-mms-*\n*.tar.bz2\nsherpa-onnx-paraformer-trilingual-zh-cantonese-en\nsr-data\n*xcworkspace/xcuserdata/*\n\nvits-icefall-*\nsherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12\nspoken-language-identification-test-wavs\nmy-release-key*\nvits-zh-hf-fanchen-C\nsherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n*.dll\n*.lib\n*.tar.gz\n*.tar.bz2\n*.zip\nsherpa-onnx-ced-*\nnode_modules\npackage-lock.json\npubspec.lock\nsherpa-onnx-nemo-*\nsherpa-onnx-vits-*\nsherpa-onnx-telespeech-ctc-*\n*.fst\n.ccache\nlib*.a\nsherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\n*.bak\nvits-melo-tts-zh_en\n*.o\n*.ppu\nsherpa-onnx-online-punct-en-2024-08-06\n*.mp4\n*.mp3\nsherpa-onnx-pyannote-segmentation-3-0\nsherpa-onnx-moonshine-tiny-en-int8\nsherpa-onnx-moonshine-base-en-int8\nharmony-os/SherpaOnnxHar/sherpa_onnx/LICENSE\nharmony-os/SherpaOnnxHar/sherpa_onnx/CHANGELOG.md\nmatcha-icefall-zh-baker\nmatcha-icefall-en_US-ljspeech\nkokoro-en-v0_19\n*.pt\nlexicon.txt\nus_gold.json\nus_silver.json\nkokoro-multi-lang-v1_0\nsherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\ncmake-build-debug\ncmake-build-release\nREADME-DEV.txt\n*.rknn\n*.jit\n##clion\n.idea\nsherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\ndict\n*.npz\nvoices.bin\nkitten-nano-en-v0_1-fp16\n*.egg-info\n*.jar\nvocab.json\n*.so\nsherpa-onnx-streaming-t-one-russian-2025-09-08\nsherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\nam.mvn\n*bpe.model\nconfig.yaml\nconfiguration.json\nsherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\nsherpa-onnx-qnn-10-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\nsherpa-onnx-paraformer-zh-int8-2025-10-07\nsherpa-onnx-qnn-5-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\nsherpa-onnx-qnn-10-seconds-zipformer-ctc-zh-2025-07-03-int8-android
-aarch64\nbuild-riscv64-linux-gnu-spacemit/\nspacemit-toolchain*\nsherpa-onnx-qnn-*\nmatcha-icefall-*\nsherpa-onnx-medasr-ctc-en-int8-2025-12-25\nsherpa-onnx-funasr-nano-int8-2025-12-30\n*.raw\n*-input-list.txt\nsherpa-onnx-funasr-nano*2025-12-30\nsherpa-onnx-pocket-tts-int8-2026-01-26\nsherpa-onnx-pocket-tts-2026-01-26\nsherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\nsherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\nnon-streaming-fire-red-asr-ctc-decode-files\nsherpa-onnx-moonshine-*-quantized-2026-02-27\nsherpa-onnx-supertonic-tts-int8-2026-03-06\ntoken_scores.json\nsherpa-onnx-zipvoice-distill-int8-zh-en-emilia\nsherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile\nsherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8\ndoxygen-docs"
  },
  {
    "path": "CHANGELOG.md",
    "content": "## 1.12.31\n\n* Fix building har for OHOS (#3361)\n* Refactor MatchaTTS to use the new Generate API (#3362)\n* Refactor Kokoro TTS to use the new Generate API (#3363)\n* Refactor KittenTTS to use the new Generate API (#3364)\n* Refactor VITS to use the new Generate API (#3365)\n* Add Rust API examples for TTS (#3366)\n* Fix Swift tests (#3367)\n* Add Rust API for audio tagging (#3368)\n* Add Rust API for speaker embedding extractor and manager (#3369)\n* Add Rust API for speaker diarization (#3370)\n* Refactor Rust API for speech denoiser (#3371)\n* Add Rust API for KWS, offline punctuation and spoken language identification (#3372)\n* Add doc for c api and cxx api (#3374)\n* Add link to C API doc (#3375)\n* Add doc for Rust API (#3376)\n* Add doc for Dart API (#3377)\n* Add more doc for Rust API (#3378)\n\n## 1.12.30\n\n* Fix typos in the project (#3293)\n* Fix WebAssembly JavaScript API (#3294)\n* Remove unnecessary SHERPA_ONNX_API from C/C++ APIs (#3295)\n* Fix bugs in CXX APIs (#3296)\n* Result goes to stdout (#3274)\n* Small fix to online recognizer C++ code (#3297)\n* Small fixes to JNI wrappers (#3298)\n* Add SetOption/GetOption to OnlineStream and OfflineStream (#3307)\n* Add SetOption/GetOption C API and export symbols (#3308)\n* Add SetOption/GetOption CXX wrapper for OnlineStream and OfflineStream (#3309)\n* Migrate Paraformer is_final to use SetOption mechanism (#3310)\n* Add SetOption/GetOption Python bindings for OnlineStream and OfflineStream (#3311)\n* Add SetOption/GetOption Java, Kotlin, and JNI bindings (#3312)\n* Add SetOption/GetOption Go bindings for OnlineStream and OfflineStream (#3313)\n* Add SetOption/GetOption C# bindings for OnlineStream and OfflineStream (#3314)\n* Add SetOption/GetOption WASM/JavaScript bindings (#3315)\n* Fix padding bug in test-onnx-streaming.py (#3318)\n* Fix style issues (#3321)\n* Add DPDFNet speech denoiser support for offline and streaming (#3276)\n* Upload DPDFNet models (#3322)\n* Add C API 
example for online punctuation (#3323)\n* Add online speech denoiser for GTCRN and examples (#3324)\n* Add Go API example for online punctuation (#3325)\n* Release Rust package for offline/online speech denoiser (#3328)\n* Refactor Dart API to check for nullptr (#3329)\n* Use onnxruntime v1.23.2 for Android (#3330)\n* Refactor ZipVoice TTS to support callback (#3332)\n* Add C and CXX API examples for ZipVoice (#3333)\n* Add Go API examples for ZipVoice (#3334)\n* Add Python API examples for ZipVoice TTS (#3335)\n* Add WebAssembly example for ZipVoice (#3337)\n* Update WebAssembly download progress text to show MB (#3338)\n* Add WebAssembly example for PocketTTS (#3340)\n* Add JavaScript (WebAssembly) example for ZipVoice TTS (#3341)\n* Add JavaScript (node-addon) example for ZipVoice TTS (#3342)\n* Add JavaScript playback examples for Pocket and Supertonic TTS (#3343)\n* Add Kotlin and Java API for ZipVoice models (#3344)\n* Add C# API examples for ZipVoice models (#3345)\n* Add Swift API examples for ZipVoice models (#3346)\n* Add Dart API examples for ZipVoice models (#3347)\n* Add Rust API example for online punctuation (#3348)\n* Add fcitx5-vinput to projects using sherpa-onnx (#3350)\n* Add Pascal API examples for ZipVoice models (#3351)\n* Add Rust API examples for ZipVoice models (#3352)\n* Add SetOption/GetOption/HasOption Kotlin bindings (#3354)\n* Fix building Python wheels for Windows (#3355)\n* Fix OHOS APIs for TTS and ASR (#3356)\n* Add HarmonyOS APIs for online punctuation (#3357)\n* Add HarmonyOS APIs for offline punctuation (#3359)\n\n## 1.12.29\n\n* Add Supertonic TTS support (#3094)\n* Upload supertonic tts models (#3263)\n* Add Python API examples for Supertonic TTS (#3264)\n* Support dynamic decoder layers in canary model runtime (#3268)\n* Add CXX API for Supertonic TTS (#3280)\n* Add C# API for Supertonic TTS (#3283)\n* Add Go API for Supertonic TTS (#3284)\n* Add Rust API for Supertonic TTS (#3285)\n* Add Swift API for Supertonic TTS 
(#3286)\n* Add JavaScript API for Supertonic TTS (#3287)\n* Add Dart API for Supertonic TTS (#3288)\n* Add Java and Kotlin API for Supertonic TTS (#3289)\n* Add Pascal API and example for Supertonic TTS (#3290)\n* Publish pdb files for Debug build on Windows (#3252)\n* Fix memory leak in WebAssembly for TTS (#3259)\n* Refactor WebAssembly TTS API (#3260)\n\n## 1.12.28\n\n* Add C++ runtime support for Moonshine v2 (#3232)\n* Export Moonshine v2 models to sherpa-onnx (#3234)\n* Update Python APIs for Moonshine v2 models (#3235)\n* Add Kotlin and Java APIs for Moonshine v2 models (#3237)\n* Add C and C++ API for Moonshine v2 models (#3238)\n* Add Swift API for Moonshine v2 models (#3240)\n* Add JavaScript API (WebAssembly) for Moonshine v2 models (#3241)\n* Add JavaScript API (node-addon) for Moonshine v2 models (#3242)\n* Add C# API for Moonshine v2 (#3243)\n* Add Go API for Moonshine v2 (#3244)\n* Add Dart API for Moonshine v2 (#3245)\n* Add Rust API for Moonshine v2 (#3247)\n* Add Pascal API for Moonshine v2 (#3248)\n* Build huggingface spaces for Moonshine v2 with WebAssembly (#3249)\n\n## 1.12.27\n\n* Add Rust API for VAD (#3213)\n* Replace deprecated std::istrstream with std::istringstream (#3214)\n* Replace deprecated std::wstring_convert with manual UTF-8 codec (#3215)\n* Fix CMake warnings: optional feature message level + policy version minimum (#3217)\n* Upload FireRedASR2 CTC model (#3220)\n* Bump hclust-cpp to 2026-02-25 release and modernize FetchContent (#3216)\n* Support FireRedASR CTC models (#3221)\n* Update language bindings for FireRedASR CTC models (#3224)\n\n## 1.12.26\n\n* Fix CI (#3192)\n* Fix heap-buffer-overflow in ReadWaveImpl when data chunk size is odd (#3195)\n* [PocketTTS] Add seed support and voice embedding caching for consiste… (#3189)\n* Feat/pocket tts cache config (#3200)\n* 3197: enhanced java binding for voice_embedding_cache_capacity (#3201)\n* Dart, flutter, go, c-api binding and example (#3202)\n* Begin to add Rust API 
(#3203)\n* Add Rust API for streaming speech recognition (#3204)\n* Add a real-time speech recognition example with microphone for Rust API. (#3205)\n* Add Rust API for offline ASR (#3207)\n* feat: Add PocketTTS cache & seed support to Node.js Addon and WASM APIs (#3206)\n* Add more examples for offline ASR models with Rust API. (#3209)\n* Update C#/Swift/Pascal API for PocketTTS' VoiceEmbeddingCacheCapacity. (#3211)\n\n## 1.12.25\n\n* Fix building without tts (#3168)\n* Fix publishing npm packages for Linux aarch64 and wheels for macOS (#3169)\n* Export PocketTTS for earlier versions of onnxruntime (#3170)\n* Fix building wheels for Python 3.14 (#3182)\n* Update Eigen from 3.4.0 to 3.4.1 (#3178)\n* fix(flutter): add missing FFI struct fields for OfflineWhisper and FunAsrNano (#3186)\n* Fix building wheels for Windows (#3187)\n\n## 1.12.24\n\n* Fix UnicodeDecodeError when accessing tokens in FunASR-nano tokenizer (#3058)\n* Use more jobs for building VAD ASR APKs (#3068)\n* Add export CGO_ENABLED=1 to all GO examples. (#3069)\n* Support BPE tokenizer (#3078)\n* Add C++ runtime and Python support PocketTTS for streaming voice cloning on CPU (#3083)\n* Refactor addon loading logic and add static import for platform-specific binaries (#3075)\n* Update C++ binary for PocketTTS (#3087)\n* Add Python API examples for PocketTTS (#3088)\n* Limit text length for PocketTTS. (#3089)\n* Add CI for PocketTTS. (#3090)\n* Fix Python CI (#3091)\n* Fix build error (#3096)\n* Add Java and Kotlin API for PocketTTS (#3095)\n* Refactor JNI to remove casting. (#3103)\n* Refactor JNI (#3107)\n* Support MD and MT MSVC runtime libraries (CRT) for Windows x64 static build (#3111)\n* Fix MSVC CRT for Windows x64 shared build. (#3114)\n* Fix MSVC CRT for Windows arm64 (#3117)\n* Fix MSVC CRT for Windows x86 (#3118)\n* Refactor CI for Windows x64 (#3119)\n* Fix CI for Windows x64 (#3123)\n* Upload WenetSpeech-Wu u2pp ASR models. 
(#3125)\n* Add TTS generation with GenerationConfig params C API (#3115)\n* Refactor TTS C API (#3127)\n* Add CXX API for PocketTTS (#3128)\n* Add Swift API for PocketTTS (#3129)\n* fix(android): Optimize UI updates and remove dead code in MainActivity (#3130)\n* Change RPATH for sherpa-onnx.node (#3131)\n* Add async js API for tts generate. (#3133)\n* fix(android): Initialize models in background coroutine to avoid UI blocking (#3132)\n* Add hotword support for FunASR-Nano (#3122)\n* Provide async JS API to create TTS. (#3134)\n* feat: Add a WebAssembly Text-to-Speech (TTS) demo with UI and worker-based audio generation using sherpa-onnx. (#3120)\n* feat: add support for Meta Omnilingual ASR v2 models (#3138)\n* Export omnilingualASR v2 (#3140)\n* feat: Add ys_log_probs to NeMo transducer greedy search decoder (#3105)\n* Add modified beam search and hotwords support for NeMo transducer models (#3077)\n* Fix ORT Value default construction for Android build (#3141)\n* Whisper timestamps (#2945)\n* Add node-addon JavaScript API for PocketTTS (#3139)\n* Update lifecycle-runtime-ktx version to 2.5.1 (#3143)\n* Enable return value in callback for TTS in Go API. (#3150)\n* Refactor Go API for TTS (#3151)\n* Export models for CANN 8.1 (#3152)\n* Add Go API for PocketTTS (#3153)\n* Export models for CANN 8.3 and 8.5 (#3156)\n* Add https://huggingface.co/alphacep/vosk-model-small-streaming-bn (#3158)\n* Add Pascal API for Pocket TTS (#3157)\n* Upload Vietnamese ASR models (#3159)\n* Refactor Pascal API (#3160)\n* Add C# API for PocketTTS. 
(#3162)\n* Add JavaScript (WebAssembly) API for PocketTTS (#3163)\n* Add Dart API for PocketTTS (#3164)\n* Add GeneratedAudio ToBuffer() to the GO API (#3136)\n* fix: resolve high vulnerability python.lang.security.audit.dangerous-system-call-tainted-env-args.dangerous-system-call-tainted-env-args (#3155)\n* Fix various language bindings (#3166)\n\n## 1.12.23\n\n* Node addon api jsdoc (#3005)\n* Add JavaScript async api for OfflineRecognizer decodeStream. (#3049)\n* Support creating OfflineRecognizer asynchronously in JavaScript. (#3050)\n* Fix uploading files to huggingface (#3054)\n* Add Dart API for FunASR Nano (#3055)\n* Fix uploading APK files (#3056)\n\n## 1.12.22\n\n* Update wav files for FunASR Nano (#3038)\n* cmake: fix sha256 for onnxruntime linux x86_64 gpu package (#3042)\n* Fix checking funasr nano tokenizer on Windows (#3043)\n* Support nemotron-speech-streaming-en-0.6b (#3044)\n* Build APK for nemotron-speech-streaming-en-0.6b (#3045)\n* Fix building Linux arm wheels (#3047)\n\n## 1.12.21\n\n* Fix publishing NPM packages (#2909)\n* Refactor ZipVoice C++ code (#2911)\n* Export more zipformer ctc models to qnn (#2921)\n* [KWS] Add phone+ppinyin tokenization with lexicon support (for zh-en model) (#2922)\n* Export Paraformer ASR models to QNN (#2925)\n* Add Transpose for a 2-D matrix. (#2926)\n* Optimize computation with Eigen. (#2928)\n* Add C++ runtime for Paraformer ASR models with Qualcomm NPU using QNN (#2931)\n* Add Android demo for Paraformer ASR with Qualcomm NPU. (#2932)\n* Export Google MedASR to sherpa-onnx (#2934)\n* Add C++ runtime and Python API for Google MedASR models (#2935)\n* Fix creating a view of an Ort::Value tensor. 
(#2939)\n* Add C and CXX API for Google MedASR model (#2946)\n* [TTS Engine] Fix engine speed (#2895)\n* Add Swift API for Google MedASR model (#2947)\n* Add C# API for Google MedASR model (#2949)\n* Add Pascal API for Google MedASR model (#2950)\n* Add Go API for Google MedAsr model (#2952)\n* Add Dart API for Google MedAsr model (#2953)\n* Add JavaScript API (WebAssembly) for Google MedAsr model (#2954)\n* Add JavaScript API (node-addon) for Google MedAsr model (#2955)\n* Add Kotlin and Java API for Google MedAsr model (#2956)\n* Add funASR-Nano with LLM support (#2936)\n* Fix building for Windows (#2964)\n* Fix building for HarmonyOS (#2972)\n* [feature] add FunASRNano config into golang api (#2974)\n* Update FunAsr-Nano CTC model (#2978)\n* [opt] opt free pointer function in Go API (#2975)\n* [feature] use jinja2 to generate sherpa-onnx-go lib (#2976)\n* Reformat Go API code (#2979)\n* Fix building for onnxruntime >= 1.11.0 (#2981)\n* Export Whisper to RK NPU (#2983)\n* Test Whisper on Ascend NPU using ACL Python API (#2986)\n* FunASR-nano: switch to unified KV-cache LLM (#2995)\n* Remove filesystem header (#2998)\n* Fix(csrc/melotts): Fix V-words pronunciation on MeloTTS_en (#3002)\n* Upload FunASR Nano ASR models with LLM (#3003)\n* Fix download test wav files (#3004)\n* Use onnxruntime 1.23.2 for Windows (#3007)\n* Add CI to export Whisper models to Ascend NPU (#3008)\n* Add C++ runtime for Whisper with Ascend NPU (#3009)\n* Use onnxruntime v1.23.2 for Linux aarch64 (#3016)\n* Use onnxruntime v1.23.2 for Linux arm (#3017)\n* Start to switch from onnxruntime 1.17.1 to v1.23.2 (#2993)\n* Use onnxruntime 1.23.2 for Linux x64 + NVIDIA GPU (#3018)\n* Update CI test for FunASR Nano C/C++ API (#3021)\n* [feature] add FunASRNano Swift api (#2994)\n* swift: add FunASR nano Swift API (#3022)\n* Add Go API test for FunASR Nano (#3025)\n* Add JavaScript API for FunASR Nano (node-addon) (#3026)\n* Add Pascal API for FunASR Nano (#3029)\n* Add C# API for FunASR Nano 
(#3031)\n* Add Kotlin and Java API for FunASR Nano models (#3030)\n* Fire-Red-ASR: enable ORT I/O binding for encoder/decoder (#3011)\n* whisper: improve ORT IO binding execution (#3023)\n* Add JavaScript API for FunASR Nano (WebAssembly) (#3027)\n* Fix CI test for nodejs (#3033)\n\n## 1.12.20\n\n* Refactor axcl examples. (#2867)\n* Update README to include Axera NPU (#2870)\n* Add CI for Axera NPU (#2872)\n* Refactor sense voice impl (#2873)\n* Refactor Paraformer Impl (#2874)\n* Remove unused lock file (#2875)\n* Load QNN context binary for faster startup (#2877)\n* Export models to Ascend 910B4 (#2878)\n* Optimize streaming output results when VAD does not detect human voice for a long time (#2876)\n* Build APKs for MatchaTTS Chinese+English (#2882)\n* Publish WASM spaces for MatchaTTS Chinese+English model (#2885)\n* Add script for testing zipvoice onnx models (#2887)\n* upload zipvoice onnx models (#2890)\n* Remove cppinyin from zipvoice (#2892)\n* Fix building errors (#2893)\n* Use a shorter name for Zipvoice models. (#2894)\n* Export GigaAM v3 to sherpa-onnx (#2901)\n* Fix typos in URL (#2905)\n* Support Fun-ASR-Nano-2512 (#2906)\n\n## 1.12.19\n\n* Fix building without TTS for C API (#2838)\n* [ZipVoice] Fix english tokenization error (#2834)\n* Add simulate streaming ASR Python example for Paraformer (#2839)\n* Fix building JNI for Windows (#2840)\n* Avoid NaN in NeMo speaker embedding models. (#2844)\n* Add spacemit ort ep for spacemit riscv cpus (#2837)\n* Add token-level confidence scores (ys_probs) for offline transducer models (#2843)\n* Fix token log probabilities in offline transducer modified beam search decoder (#2846)\n* Support AXERA ax630, ax650, and axcl backends. (#2849)\n* Refactor axera npu examples (#2850)\n* Fix matcha tts zh-en model (#2851)\n* Fix the English part for Matcha TTS. 
(#2853)\n* Refactor text-utils (#2855)\n* Fix matcha tts (#2856)\n* Add a space between English words for Matcha zh-en TTS (#2858)\n* Fix punctuations in matcha zh-en tts (#2859)\n* Upload matcha tts zh-en model (#2865)\n* Fix the discrepancy with the Silero VAD isSpeech logic (#2863)\n\n## 1.12.18\n\n* Fix building wheels (#2786)\n* export omniASR_CTC_1B (#2788)\n* Add C++ QNN support for SenseVoice (#2793)\n* Export models for CANN toolkit 7.0 (#2795)\n* Support hotwords with byte level bpe (#2802)\n* Add Android demo with QNN (Qualcomm NPU) for SenseVoice ASR (#2803)\n* Export zipformer ctc models to QNN (#2815)\n* Add spaces between English words for Homophone replacer. (#2817)\n* Add C++ QNN support for Zipformer CTC models. (#2809)\n* Limit symbol visibility in the shared libraries (#2822)\n* Fix warnings for initializing tts lexicon. (#2823)\n* Export zipformer ctc models to Ascend NPU (#2824)\n* Refactor scripts for exporting models to Ascend NPU. (#2825)\n* Add C++ support for Zipformer CTC on Ascend NPU (#2826)\n* Fix segfault when non-wav file is passed to ReadWave (#2821)\n* Avoid calling rknn_dup_context(). (#2828)\n* Add C++ support for Paraformer with RK NPU (#2829)\n* Update README to include NPU support (#2830)\n* Support running whisper large v3 with external data weight (#2807)\n\n## 1.12.17\n\n* Fix releasing\n\n## 1.12.16\n\n* Support exporting SenseVoice and Paraformer to Ascend 310P3 NPU. (#2716)\n* Demo for no stream vad asr with flutter (#2705)\n* Fix crashing in Android KWS demo (#2719)\n* Add C++ API with ACL C API for SenseVoice ASR on Ascend NPU (#2728)\n* Allow up to 30 seconds ASR for sense-voice on Ascend NPU (#2729)\n* Fix compilation error for Ascend NPU (#2731)\n* docs: fix Flutter TTS macOS mirror link targets; fix speech-enhancement link typo (#2723)\n* Export models for Ascend910B2 (#2740)\n* Add C++ runtime for Paraformer on Ascend NPU. 
(#2741)\n* Expose ys probs to JNI, Kotlin and Java API (#2736)\n* Add CI for Ascend NPU (#2743)\n* Export models for CANN 8.2 (#2745)\n* Fix validating model config for Paraformer. (#2749)\n* Add cxx API for online punctuation models (#2759)\n* Export sense voice to qnn (#2760)\n* Export models to Ascend 910B3 (#2761)\n* Support MatchaTTS models for Chinese+English. (#2763)\n* Fix zipvoice. (#2764)\n* Support passing multiple lexicon files for matcha tts models. (#2765)\n* Begin to add qnn C API (#2766)\n* Add QnnConfig. (#2768)\n* Fix missing includes. (#2769)\n* Begin to export omnilingual-asr to sherpa-onnx (#2770)\n* Add C++ and Python API for Omnilingual ASR models. (#2772)\n* Add C API for Omnilingual ASR CTC models (#2773)\n* Add CXX API for Omnilingual ASR CTC models (#2774)\n* Add C# API for Omnilingual ASR CTC models (#2775)\n* Add Swift API for Omnilingual ASR CTC models (#2776)\n* Add Go API for Omnilingual ASR CTC models (#2778)\n* Add JavaScript (node-addon) API for Omnilingual ASR CTC models (#2780)\n* Add Dart API for Omnilingual ASR CTC models (#2779)\n* Add JavaScript (WebAssembly) API for Omnilingual ASR CTC models (#2781)\n* Add Pascal API for Omnilingual ASR CTC models (#2782)\n* Add Kotlin and Java API for Omnilingual ASR CTC models (#2783)\n\n## 1.12.15\n\n* Exposing online punctuation model support in node-addon-api (#2609)\n* Fix building wheels (#2619)\n* Export one more Piper Arabic TTS model (#2623)\n* fix: hot update language for sencevoice (#2627)\n* Add C API and Go API for Zipvoice (#2628)\n* Add CI tests for Zipvoice Go API (#2630)\n* Remove hardcoded dithering value in NeMo transducer recognizer (#2639)\n* Reduce verbose output about reading lexicon for TTS (#2648)\n* Add Parakeet TDT model for generating subtitles (#2649)\n* Add more Piper TTS models (#2651)\n* Add CXX API for audio tagging (#2652)\n* Add C# API for audio tagging (#2653)\n* Support KWS + RKNN. 
(#2190)\n* Support https://github.com/ASLP-lab/WenetSpeech-Chuan (#2656)\n* Fix building for android (#2657)\n* fix ios build script (#2645)\n* Update kaldi-native-fbank (#2659)\n* Add missing python class definitions for builds without TTS support (#2660)\n* Remove jieba from kokoro and matcha tts. (#2662)\n* add flet_sherpa_onnx in readme (#2663)\n* Remove cppjieba (#2664)\n* Add phrase matcher to merge words into phrases for TTS. (#2668)\n* Limit number of tokens per sentence in MatchaTTS. (#2671)\n* Update README to include a ROS2 project using sherpa-onnx (#2672)\n* Fix building Flutter APPs (#2673)\n* Export Paraformer to RKNN (#2689)\n* Update README.md add achatbot-go Projects using sherpa-onnx link (#2691)\n* Add CI to export Paraformer to RKNN (#2692)\n* Support MatchaTTS with English and Chinese (#2695)\n* Export Paraformer ASR models from FunASR to Ascend NPU 910B (#2697)\n* Update README to include Ascend NPU (#2698)\n* Fix WASM (JS) after adding zipvoice. (#2702)\n* Export SenseVoice ASR models to Ascend NPU 910B (#2707)\n* Fix building for various language bindings after adding zipvoice (#2709)\n\n## 1.12.14\n\n* Fix setting rknn core mask (#2594)\n* Add Dart API for spoken language identification (#2596)\n* Add CI tests for dart spoken language identification example (#2598)\n* Provide pre-compiled sherpa-onnx libs/binaries for CUDA 12.x + onnxruntime 1.22.0 (#2599)\n* Provide pre-compiled whls for cuda 12.x on Linux x64 and Windows x64 (#2601)\n* Fix TDT decoding for NeMo TDT transducers (#2606)\n* Add a C++ example for simulated streaming ASR (#2607)\n\n## 1.12.13\n\n* Fix initializing symbol table for OnlineRecognizer. (#2590)\n* Support RK NPU for SenseVoice non-streaming ASR models (#2589)\n* Upload RKNN models for sense-voice (#2592)\n\n## 1.12.12\n\n* Fix building for risc-v (#2549)\n* Fix using sherpa-onnx as a cmake sub-project. 
(#2550)\n* Update kaldifst and kaldi-decoder (#2551)\n* Support armv8l in Java API (#2556)\n* Disable loading libs from jar on Android. (#2557)\n* Fix cantonese vits tts (#2558)\n* Avoid appending blanks for Cantonese vits tts. (#2559)\n* Add hint for loading model files from SD card on Android. (#2564)\n* Update README to include https://github.com/Mentra-Community/MentraOS (#2565)\n* Export models from https://github.com/voicekit-team/T-one to sherpa-onnx (#2571)\n* Add C++ and Python support for T-one streaming Russian ASR models (#2575)\n* Add various language bindings for streaming T-one Russian ASR models (#2576)\n* Fix the missing online punctuation in android aar (#2577)\n* Export KittenTTS mini v0.1 to sherpa-onnx (#2578)\n* Upload new sense-voice models (#2580)\n* Export ASLP-lab/WSYue-ASR/tree/main/u2pp_conformer_yue to sherpa-onnx (#2582)\n* Add various language bindings for Wenet non-streaming CTC models (#2584)\n\n## 1.12.11\n\n* Add two more Piper tts models (#2525)\n* Generate tts samples for MatchaTTS (English). (#2527)\n* Fix releasing go packages (#2529)\n* Add license info about tts models from OpenVoiceOS (#2530)\n* Support BPE models with byte fallback. (#2531)\n* Simplify the usage of our non-Android Java API (#2533)\n* Fix wasm for kws (#2535)\n* Add one more German tts model from OpenVoiceOS. (#2536)\n* Fix uploading win32 libs to huggingface (#2537)\n* Add Zipvoice (#2487)\n* Fix c api (#2545)\n* Fix linking (#2546)\n\n## 1.12.10\n\n* Add VOSK streaming Russian ASR models and Kroko streaming German ASR models (#2502)\n* Refactor CI tests (#2504)\n* Update APK versions (#2505)\n* Export whisper distil-large-v3 and distil-large-v3.5 to sherpa-onnx (#2506)\n* Support specifying pronunciations of phrases in Chinese TTS. 
(#2507)\n* fix(flutter): fix unicode problem in windows path (#2508)\n* feat: add punctuation C++ API (#2510)\n* Fix ctrl+c may lead to coredump (#2511)\n* Add kitten tts nano v0.2 (#2512)\n* Scripts to generate tts samples (#2513)\n* Add tdt duration to APIs (#2514)\n* Support 16KB page size for Android (#2520)\n* Split sherpa-onnx Python package (#2521)\n* Fix kokoro tts for punctuations (#2522)\n\n## 1.12.9\n\n* Add more piper tts models (#2480)\n* Fix ASR for UE (#2483)\n* push to maven center (#2463)\n* Specify ABIs when building APKs (#2488)\n* Add more debug info for vits tts (#2491)\n* Add Swift API for computing speaker embeddings (#2492)\n* Alex/feat add python example (#2490)\n* Support TDT transducer decoding (#2495)\n* Fix java test (#2496)\n* Refactor Swift API (#2493)\n* add TtsReader app to README.md (#2498)\n* Export https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3 to sherpa-onnx (#2500)\n* Fix building apk (#2499)\n\n## 1.12.8\n\n* Expose JNI to compute probability of chunk in VAD (#2433)\n* Add https://huggingface.co/Banafo/Kroko-ASR (#2453)\n* Add APIs for Online NeMo CTC models (#2454)\n* Export https://github.com/KittenML/KittenTTS to sherpa-onnx (#2456)\n* Fix punctuations in kokoro tts. (#2458)\n* Limit number of tokens in fire red asr decoding. (#2459)\n* Add C++ runtime for kitten-tts (#2460)\n* Add Kotlin and Java API for KittenTTS (#2461)\n* Add Android TTS Engine APK for KittenTTS (#2465)\n* Add Python API for KittenTTS. (#2466)\n* Add C API for KittenTTS (#2467)\n* Add CXX API for KittenTTS (#2469)\n* Add JavaScript API (node-addon) for KittenTTS (#2470)\n* Add JavaScript API (WebAssembly) for KittenTTS (#2471)\n* Add Pascal API for KittenTTS (#2474)\n* Add Dart API for KittenTTS (#2475)\n* Add Swift API for KittenTTS (#2476)\n* Add C# API for KittenTTS (#2477)\n* Add Go API for KittenTTS (#2478)\n\n## 1.12.7\n\n* Support Portuguese and German ASR models from NeMo (#2394)\n* Support returning the current speech segment for VAD. 
(#2397)\n* Add more piper tts polish models (#2403)\n* Support VAD+ASR for WearOS (#2404)\n* Support test long audio with streaming-model & vad (#2405)\n* Fix typo in sherpa-onnx-vad-with-online-asr.cc (#2407)\n* Add tail padding for sherpa-onnx-vad-with-online-asr (#2408)\n* Add more French TTS models (#2424)\n* Add more piper tts models (#2425)\n* Implement max_symbols_per_frame for GigaAM2 accurate decoding since model uses char tokens instead of BPE. (#2423)\n* Fix GigaAM transducer encoder output length data type (#2426)\n* Add friendly log messages for Android and HarmonyOS TTS users. (#2427)\n* Fix setGraph in OnlineCtcFstDecoderConfig Java API (#2411)\n\n\n## 1.12.6\n\n* Support silero-vad v4 exported by k2-fsa (#2372)\n* Add C++ and Python support for ten-vad (#2377)\n* Fix compile errors for Linux (#2378)\n* Add C API for ten-vad (#2379)\n* Add CXX API examples for ten-vad. (#2380)\n* Add JavaScript (WebAssembly) API for ten-vad (#2382)\n* Add JavaScript (node-addon) API for ten-vad (#2383)\n* Add Go API for ten-vad (#2384)\n* Add C# API for ten-vad (#2385)\n* Add Dart API for ten-vad (#2386)\n* Add Swift API for ten-vad (#2387)\n* Add Pascal API for ten-vad (#2388)\n* Add Java/Kotlin API and Android support for ten-vad (#2389)\n\n## 1.12.5\n\n* Fix typo CMAKE_EXECUTBLE_LINKER_FLAGS -> CMAKE_EXECUTABLE_LINKER_FLAGS (#2344)\n* Fix testing dart packages (#2345)\n* fix(canary): use dynamo export, single input_ids and avoid 0/1 specialization (#2348)\n* Fix TTS for Unreal Engine (#2349)\n* Update readme to include https://github.com/mawwalker/stt-server (#2350)\n* Add meta data to NeMo canary ONNX models (#2351)\n* Update README to include https://github.com/bbeyondllove/asr_server (#2353)\n* Add C++ runtime and Python API for NeMo Canary models (#2352)\n* Add C/CXX/JavaScript API for NeMo Canary models (#2357)\n* Add Java and Kotlin API for NeMo Canary models (#2359)\n* Upload fp16 onnx model files for FireRedASR (#2360)\n* Fix nemo feature normalization in 
test code (#2361)\n* Refactor exporting NeMo models (#2362)\n* Add LODR support to online and offline recognizers (#2026)\n* Add CXX examples for NeMo TDT ASR. (#2363)\n* Add Pascal/Go/C#/Dart API for NeMo Canary ASR models (#2367)\n\n## 1.12.4\n\n* Refactor release scripts. (#2323)\n* Add TTS engine APKs for more models (#2327)\n* Fix static link without tts (#2328)\n* Fix VAD+ASR C++ example. (#2335)\n* Add sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30 to android ASR apk (#2336)\n* Support non-streaming zipformer CTC ASR models (#2340)\n* Support linux aarch64 for Dart and Flutter (#2342)\n\n## 1.12.3\n\n* Show CMake debug information. (#2316)\n* Remove portaudio-go in Go API examples. (#2317)\n* Support Zipformer CTC ASR with whisper features. (#2319)\n* Support Zipformer transducer ASR with whisper features. (#2321)\n\n## 1.12.2\n\n* Fix CI for windows (#2279)\n* Add jar for Java 24. (#2280)\n* Add Python API for source separation (#2283)\n* Add link to huggingface space for source separation. (#2284)\n* Fix isspace on windows in debug build (#2042)\n* Update wasm/vad-asr/assets/README.md for more clear (#2297)\n* Update TTS Engine APK to support multi-lang (#2294)\n* Add scripts for exporting Piper TTS models to sherpa-onnx (#2299)\n* Update sherpa-onnx-shared.pc.in (#2300)\n* Fixes #2172 (#2301)\n* Refactor kokoro export (#2302)\n* Fix building for Pascal (#2305)\n* Support extra languages in multi-lang kokoro tts (#2303)\n* Update readme to include BreezeApp from MediaTek Research. (#2313)\n* Add API to get version information (#2309)\n\n\n## 1.12.1\n\n* Use jlong explicitly in jni. 
(#2229)\n* Fix building RKNN wheels (#2233)\n* Fix publishing binaries for RKNN (#2234)\n* Export spleeter model to onnx for source separation (#2237)\n* Add C++ runtime for spleeter about source separation (#2242)\n* Add include headers for __ANDROID_API__,__OHOS__ (#2251)\n* JAVA-API: Manual Library Loading Support for Restricted Environments (#2253)\n* Build APK with replace.fst (#2254)\n* repair rknn wheels (#2257)\n* Update kaldi-native-fbank. (#2259)\n* Fix building sherpa-onnx (#2262)\n* Fix building MFC examples (#2263)\n* Add UVR models for source separation. (#2266)\n* move portaudio common record code to microphone (#2264)\n* fixed mfc build error (#2267)\n* Add C++ support for UVR models (#2269)\n* Export nvidia/canary-180m-flash to sherpa-onnx (#2272)\n* Update utils.dart (#2275)\n* Fix rknn for multi-threads (#2274)\n* Fix 32-bit arm CI (#2276)\n\n## 1.12.0\n\n* Fix building wheels for macOS (#2192)\n* Show verbose logs in homophone replacer (#2194)\n* Fix displaying streaming speech recognition results for Python. (#2196)\n* Add real-time speech recognition example for SenseVoice. (#2197)\n* docs: add Open-XiaoAI KWS project (#2198)\n* Add C++ example for streaming ASR with SenseVoice. (#2199)\n* Add C++ example for real-time ASR with nvidia/parakeet-tdt-0.6b-v2. (#2201)\n* Add a link to YouTube video including sherpa-onnx. (#2202)\n* Support sending is_eof for online websocket server. (#2204)\n* Add alsa-based streaming ASR example for sense voice. (#2207)\n* Support homophone replacer in Android asr demo. (#2210)\n* Add Go implementation of the TTS generation callback (#2213)\n* Add Android demo for real-time ASR with non-streaming ASR models. (#2214)\n* Expose dither for JNI (#2215)\n* Add nodejs example for parakeet-tdt-0.6b-v2. (#2219)\n* Add script to build APK for simulated-streaming-asr. (#2220)\n\n\n## 1.11.5\n\n* export parakeet-tdt-0.6b-v2 to sherpa-onnx (#2180)\n* Add C++ runtime for parakeet-tdt-0.6b-v2. 
(#2181)\n* Avoid NaN in feature normalization. (#2186)\n\n## 1.11.4\n\n* Disable strict hotword matching mode for offline transducer (#1837)\n* Comment refinement: Add note about vocoder file for matcha TTS config (#2106)\n* Fix a typo in the JNI for Android. (#2108)\n* Generate subtitles with FireRedAsr models (#2112)\n* Use manylinux_2_28_x86_64 to build linux gpu for sherpa-onnx (#2123)\n* Support running sherpa-onnx with RK NPU on Android (#2124)\n* Fix building for HarmonyOS (#2125)\n* cmake build, configurable from env (#2115)\n* Expose dither in python API (#2127)\n* Add support for GigaAM-CTC-v2 (#2135)\n* Support Giga AM transducer V2 (#2136)\n* Export kokoro 1.0 int8 models (#2137)\n* Upload more onnx ASR models (#2141)\n* Fix building for open harmonyOS (#2142)\n* online-transducer: reset the encoder together with 2 previous output symbols (non-blank) (#2129)\n* Fix punctuations for kokoro tts 1.1-zh. (#2146)\n* Fix setting OnlineModelConfig in Java API (#2147)\n* Support decoding multiple streams in Java API. (#2149)\n* Support replacing homophonic phrases (#2153)\n* Add C and CXX API for homophone replacer (#2156)\n* Add JavaScript API (WASM) for homophone replacer (#2157)\n* Add JavaScript API (node-addon) for homophone replacer (#2158)\n* Fix building without TTS (#2159)\n* Add homophone replacer example for Python API. (#2161)\n* More fix for building without tts (#2162)\n* Add Swift API for homophone replacer. 
(#2164)\n* Add C# API for homophone replacer (#2165)\n* Add Kotlin and Java API for homophone replacer (#2166)\n* Add Dart API for homophone replacer (#2167)\n* Add Go API for homophone replacer (#2168)\n\n## 1.11.3\n\n* fix vits dict dir config (#2036)\n* fix case (#2037)\n* Fix building wheels for RKNN (#2041)\n* Change scale factor to 32767 (#2056)\n* Fix length scale for kokoro tts (#2060)\n* Allow building repository as CMake subdirectory (#2059)\n* Export silero_vad v4 to RKNN (#2067)\n* fix dml with preinstall ort (#2066)\n* Fix building aar to include speech denoiser (#2069)\n* Add CXX API for VAD (#2077)\n* Add C++ runtime for silero_vad with RKNN (#2078)\n* Refactor rknn code (#2079)\n* Fix building for android (#2081)\n* Add C++ and Python API for Dolphin CTC models (#2085)\n* Add Kotlin and Java API for Dolphin CTC models (#2086)\n* Add C and CXX API for Dolphin CTC models (#2088)\n* Preserve more context after endpointing in transducer (#2061)\n* Add C# API for Dolphin CTC models (#2089)\n* Add Go API for Dolphin CTC models (#2090)\n* Add Swift API for Dolphin CTC models (#2091)\n* Add Javascript (WebAssembly) API for Dolphin CTC models (#2093)\n* Add Javascript (node-addon) API for Dolphin CTC models (#2094)\n* Add Dart API for Dolphin CTC models (#2095)\n* Add Pascal API for Dolphin CTC models (#2096)\n\n## 1.11.2\n\n* Fix CI (#2016)\n* Publish jar for more java versions (#2017)\n* add alsa example for vad+offline asr (#2020)\n* Support cuda12 and cudnn8 for Linux aarch64. (#2021)\n* Update README to include more projects using sherpa-onnx (#2022)\n* Fix a bug in vad.reset() (#2023)\n* Fix Matcha + vocos for Android (#2024)\n* Fix crash in Android tts engine demo. 
(#2029)\n* Fix build script: add 'cd build' after 'mkdir build' to ensure the correct working directory for CMake (#2033)\n* fix static linking (#2032)\n\n## 1.11.1\n\n* Export vocos to sherpa-onnx (#2012)\n* Add C++ runtime for vocos (#2014)\n\n## 1.11.0\n\n* Fix building wheels for Python 3.7 (#1933)\n* Add Kotlin and Java API for online punctuation models (#1936)\n* Add Kokoro v1.1-zh (#1942)\n* Support RKNN for Zipformer CTC models. (#1948)\n* Add transducer modified_beam_search for RKNN. (#1949)\n* Update README to include projects that is using sherpa-onnx (#1956)\n* Limit number of tokens per second for whisper. (#1958)\n* Ebranchformer (#1951)\n* Test using sherpa-onnx as a cmake subproject (#1961)\n* Add C++ demo for VAD+non-streaming ASR (#1964)\n* Export gtcrn models to sherpa-onnx (#1975)\n* c-api add wave write to buffer. (#1962)\n* add SherpaOnnxOfflineRecognizerSetConfig binding for go, and optimize the new/free for C.struct_SherpaOnnxOfflineRecognizerConfig ptr (#1976)\n* Add C++ runtime for speech enhancement GTCRN models (#1977)\n* Add Python API for speech enhancement GTCRN models (#1978)\n* Add C API for speech enhancement GTCRN models (#1984)\n* Add CXX API for speech enhancement GTCRN models (#1986)\n* Add Swift API for speech enhancement GTCRN models (#1989)\n* Add C# API for speech enhancement GTCRN models (#1990)\n* Add Go API for speech enhancement GTCRN models (#1991)\n* Add Pascal API for speech enhancement GTCRN models (#1992)\n* Add Dart API for speech enhancement GTCRN models (#1993)\n* Add JavaScript (node-addon) API for speech enhancement GTCRN models (#1996)\n* Add WebAssembly (WASM) for speech enhancement GTCRN models (#2002)\n* Add JavaScript API (wasm) for speech enhancement GTCRN models (#2007)\n* Add Kotlin API for speech enhancement GTCRN models (#2008)\n* Add Java API for speech enhancement GTCRN models (#2009)\n\n\n\n\n\n## 1.10.46\n\n* Fix kokoro lexicon. 
(#1886)\n* speaker-identification-with-vad-non-streaming-asr.py Lack of support for sense_voice. (#1884)\n* Fix generating Chinese lexicon for Kokoro TTS 1.0 (#1888)\n* Reduce vad-whisper-c-api example code. (#1891)\n* JNI Exception Handling (#1452)\n* Fix #1901: UnicodeEncodeError running export_bpe_vocab.py (#1902)\n* Fix publishing pre-built windows libraries (#1905)\n* Fixing Whisper Model Token Normalization (#1904)\n* feat: add mic example for better compatibility (#1909)\n* Add onnxruntime 1.18.1 for Linux aarch64 GPU (#1914)\n* Add C++ API for streaming zipformer ASR on RK NPU (#1908)\n* change [1<<28] to [1<<10], to fix build issues on GOARCH=386 that [1<<28] too large (#1916)\n* Flutter Config toJson/fromJson (#1893)\n* Fix publishing linux pre-built artifacts (#1919)\n* go.mod set to use go 1.17, and use unsafe.Slice to optimize the code (#1920)\n* fix: AddPunct panic for Go (#1921)\n* Fix publishing macos pre-built artifacts (#1922)\n* Minor fixes for rknn (#1925)\n* Build wheels for rknn linux aarch64 (#1928)\n\n## 1.10.45\n\n* [update] fixed bug: create golang instance succeed while the c struct create failed (#1860)\n* fixed typo in RTF calculations (#1861)\n* Export FireRedASR to sherpa-onnx. (#1865)\n* Add C++ and Python API for FireRedASR AED models (#1867)\n* Add Kotlin and Java API for FireRedAsr AED model (#1870)\n* Add C API for FireRedAsr AED model. (#1871)\n* Add CXX API for FireRedAsr (#1872)\n* Add JavaScript API (node-addon) for FireRedAsr (#1873)\n* Add JavaScript API (WebAssembly) for FireRedAsr model. (#1874)\n* Add C# API for FireRedAsr Model (#1875)\n* Add Swift API for FireRedAsr AED Model (#1876)\n* Add Dart API for FireRedAsr AED Model (#1877)\n* Add Go API for FireRedAsr AED Model (#1879)\n* Add Pascal API for FireRedAsr AED Model (#1880)\n\n## 1.10.44\n\n* Export MatchaTTS fa-en model to sherpa-onnx (#1832)\n* Add C++ support for MatchaTTS models not from icefall. 
(#1834)\n* OfflineRecognizer supports create stream with hotwords (#1833)\n* Add PengChengStarling models to sherpa-onnx (#1835)\n* Support specifying voice in espeak-ng for kokoro tts models. (#1836)\n* Fix: made print sherpa_onnx_loge when it is in debug mode (#1838)\n* Add Go API for audio tagging (#1840)\n* Fix CI (#1841)\n* Update readme to contain links for pre-built Apps (#1853)\n* Modify the model used (#1855)\n* Flutter OnlinePunctuation (#1854)\n* Fix splitting text by languages for kokoro tts. (#1849)\n\n## 1.10.43\n\n* Add MFC example for Kokoro TTS 1.0 (#1815)\n* Update sherpa-onnx-tts.js VitsModelConfig.model can be none (#1817)\n* Fix passing gb2312 encoded strings to tts on Windows (#1819)\n* Support scaling the duration of a pause in TTS. (#1820)\n* Fix building wheels for linux aarch64. (#1821)\n* Fix CI for Linux aarch64. (#1822)\n\n## 1.10.42\n\n* Fix publishing wheels (#1746)\n* Update README to include https://github.com/xinhecuican/QSmartAssistant (#1755)\n* Add Kokoro TTS to MFC examples (#1760)\n* Refactor node-addon C++ code. (#1768)\n* Add keyword spotter C API for HarmonyOS (#1769)\n* Add ArkTS API for Keyword spotting. (#1775)\n* Add Flutter example for Kokoro TTS (#1776)\n* Initialize the audio session for iOS ASR example (#1786)\n* Fix: Prepend 0 to tokenization to prevent word skipping for Kokoro. 
(#1787)\n* Export Kokoro 1.0 to sherpa-onnx (#1788)\n* Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795)\n* Add Java and Kotlin API for Kokoro TTS 1.0 (#1798)\n* Add Android demo for Kokoro TTS 1.0 (#1799)\n* Add C API for Kokoro TTS 1.0 (#1801)\n* Add CXX API for Kokoro TTS 1.0 (#1802)\n* Add Swift API for Kokoro TTS 1.0 (#1803)\n* Add Go API for Kokoro TTS 1.0 (#1804)\n* Add C# API for Kokoro TTS 1.0 (#1805)\n* Add Dart API for Kokoro TTS 1.0 (#1806)\n* Add Pascal API for Kokoro TTS 1.0 (#1807)\n* Add JavaScript API (node-addon) for Kokoro TTS 1.0 (#1808)\n* Add JavaScript API (WebAssembly) for Kokoro TTS 1.0 (#1809)\n* Add Flutter example for Kokoro TTS 1.0 (#1810)\n* Add iOS demo for Kokoro TTS 1.0 (#1812)\n* Add HarmonyOS demo for Kokoro TTS 1.0 (#1813)\n\n## 1.10.41\n\n* Fix UI for Android TTS Engine. (#1735)\n* Add iOS TTS example for MatchaTTS (#1736)\n* Add iOS example for Kokoro TTS (#1737)\n* Fix dither binding in Pybind11 to ensure independence from high_freq in FeatureExtractorConfig (#1739)\n* Fix keyword spotting. (#1689)\n* Update readme to include https://github.com/hfyydd/sherpa-onnx-server (#1741)\n* Reduce vad-moonshine-c-api example code. (#1742)\n* Support Kokoro TTS for HarmonyOS. (#1743)\n\n## 1.10.40\n\n* Fix building wheels (#1703)\n* Export kokoro to sherpa-onnx (#1713)\n* Add C++ and Python API for Kokoro TTS models. (#1715)\n* Add C API for Kokoro TTS models (#1717)\n* Fix style issues (#1718)\n* Add C# API for Kokoro TTS models (#1720)\n* Add Swift API for Kokoro TTS models (#1721)\n* Add Go API for Kokoro TTS models (#1722)\n* Add Dart API for Kokoro TTS models (#1723)\n* Add Pascal API for Kokoro TTS models (#1724)\n* Add JavaScript API (node-addon) for Kokoro TTS models (#1725)\n* Add JavaScript (WebAssembly) API for Kokoro TTS models. (#1726)\n* Add Kotlin and Java API for Kokoro TTS models (#1728)\n* Update README.md for KWS to not use git lfs. 
(#1729)\n\n\n\n\n## 1.10.39\n\n* Fix building without TTS (#1691)\n* Add README for android libs. (#1693)\n* Fix: export-onnx.py(expected all tensors to be on the same device) (#1699)\n* Fix passing strings from C# to C. (#1701)\n\n## 1.10.38\n\n* Fix initializing TTS in Python. (#1664)\n* Remove spaces after punctuations for TTS (#1666)\n* Add constructor fromPtr() for all flutter class with factory ctor. (#1667)\n* Add Kotlin API for Matcha-TTS models. (#1668)\n* Support Matcha-TTS models using espeak-ng (#1672)\n* Add Java API for Matcha-TTS models. (#1673)\n* Avoid adding tail padding for VAD in generate-subtitles.py (#1674)\n* Add C API for MatchaTTS models (#1675)\n* Add CXX API for MatchaTTS models (#1676)\n* Add JavaScript API (node-addon-api) for MatchaTTS models. (#1677)\n* Add HarmonyOS examples for MatchaTTS. (#1678)\n* Upgraded to .NET 8 and made code style a little more internally consistent. (#1680)\n* Update workflows to use .NET 8.0 also. (#1681)\n* Add C# and JavaScript (wasm) API for MatchaTTS models (#1682)\n* Add Android demo for MatchaTTS models. (#1683)\n* Add Swift API for MatchaTTS models. (#1684)\n* Add Go API for MatchaTTS models (#1685)\n* Add Pascal API for MatchaTTS models. (#1686)\n* Add Dart API for MatchaTTS models (#1687)\n\n## 1.10.37\n\n* Add new tts models for Latvia and Persian+English (#1644)\n* Add a byte-level BPE Chinese+English non-streaming zipformer model (#1645)\n* Support removing invalid utf-8 sequences. 
(#1648)\n* Add TeleSpeech CTC to non_streaming_server.py (#1649)\n* Fix building macOS libs (#1656)\n* Add Go API for Keyword spotting (#1662)\n* Add Swift online punctuation (#1661)\n* Add C++ runtime for Matcha-TTS (#1627)\n\n## 1.10.36\n\n* Update AAR version in Android Java demo (#1618)\n* Support linking onnxruntime statically for Android (#1619)\n* Update readme to include Open-LLM-VTuber (#1622)\n* Rename maxNumStences to maxNumSentences (#1625)\n* Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX (Linux arm64 GPU). (#1630)\n* Update readme to include jetson orin nx and nano b01 (#1631)\n* feat: add checksum action (#1632)\n* Support decoding with byte-level BPE (bbpe) models. (#1633)\n* feat: enable c api for android ci (#1635)\n* Update README.md (#1640)\n* SherpaOnnxVadAsr: Offload runSecondPass to background thread for improved real-time audio processing (#1638)\n* Fix GitHub actions. (#1642)\n\n\n## 1.10.35\n\n* Add missing changes about speaker identification demo for HarmonyOS (#1612)\n* Provide sherpa-onnx.aar for Android (#1615)\n* Use aar in Android Java demo. (#1616)\n\n## 1.10.34\n\n* Fix building node-addon package (#1598)\n* Update doc links for HarmonyOS (#1601)\n* Add on-device real-time ASR demo for HarmonyOS (#1606)\n* Add speaker identification APIs for HarmonyOS (#1607)\n* Add speaker identification demo for HarmonyOS (#1608)\n* Add speaker diarization API for HarmonyOS. (#1609)\n* Add speaker diarization demo for HarmonyOS (#1610)\n\n## 1.10.33\n\n* Add non-streaming ASR support for HarmonyOS. (#1564)\n* Add streaming ASR support for HarmonyOS. 
(#1565)\n* Fix building for Android (#1568)\n* Publish `sherpa_onnx.har` for HarmonyOS (#1572)\n* Add VAD+ASR demo for HarmonyOS (#1573)\n* Fix publishing har packages for HarmonyOS (#1576)\n* Add CI to build HAPs for HarmonyOS (#1578)\n* Add microphone demo about VAD+ASR for HarmonyOS (#1581)\n* Fix getting microphone permission for HarmonyOS VAD+ASR example (#1582)\n* Add HarmonyOS support for text-to-speech. (#1584)\n* Fix: support both old and new websockets request headers format (#1588)\n* Add on-device text-to-speech (TTS) demo for HarmonyOS (#1590)\n\n## 1.10.32\n\n* Support cross-compiling for HarmonyOS (#1553)\n* HarmonyOS support for VAD. (#1561)\n* Fix publishing flutter iOS app to appstore (#1563).\n\n## 1.10.31\n\n* Publish pre-built wheels for Python 3.13 (#1485)\n* Publish pre-built macos xcframework (#1490)\n* Fix reading tokens.txt on Windows. (#1497)\n* Add two-pass ASR Android APKs for Moonshine models. (#1499)\n* Support building GPU-capable sherpa-onnx on Linux aarch64. (#1500)\n* Publish pre-built wheels with CUDA support for Linux aarch64. (#1507)\n* Export the English TTS model from MeloTTS (#1509)\n* Add Lazarus example for Moonshine models. (#1532)\n* Add isolate_tts demo (#1529)\n* Add WebAssembly example for VAD + Moonshine models. (#1535)\n* Add Android APK for streaming Paraformer ASR (#1538)\n* Support static build for windows arm64. (#1539)\n* Use xcframework for Flutter iOS plugin to support iOS simulators.\n\n## 1.10.30\n\n* Fix building node-addon for Windows x86. (#1469)\n* Begin to support https://github.com/usefulsensors/moonshine (#1470)\n* Publish pre-built JNI libs for Linux aarch64 (#1472)\n* Add C++ runtime and Python APIs for Moonshine models (#1473)\n* Add Kotlin and Java API for Moonshine models (#1474)\n* Add C and C++ API for Moonshine models (#1476)\n* Add Swift API for Moonshine models. (#1477)\n* Add Go API examples for adding punctuations to text. 
(#1478)\n* Add Go API for Moonshine models (#1479)\n* Add JavaScript API for Moonshine models (#1480)\n* Add Dart API for Moonshine models. (#1481)\n* Add Pascal API for Moonshine models (#1482)\n* Add C# API for Moonshine models. (#1483)\n\n## 1.10.29\n\n* Add Go API for offline punctuation models (#1434)\n* Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437)\n* Add more models for speaker diarization (#1440)\n* Add Java API example for hotwords. (#1442)\n* Add java android demo (#1454)\n* Add C++ API for streaming ASR. (#1455)\n* Add C++ API for non-streaming ASR (#1456)\n* Handle NaN embeddings in speaker diarization. (#1461)\n* Add speaker identification with VAD and non-streaming ASR using ALSA (#1463)\n* Support GigaAM CTC models for Russian ASR (#1464)\n* Add GigaAM NeMo transducer model for Russian ASR (#1467)\n\n## 1.10.28\n\n* Fix swift example for generating subtitles. (#1362)\n* Allow more online models to load tokens file from the memory (#1352)\n* Fix CI errors introduced by supporting loading keywords from buffers (#1366)\n* Fix running MeloTTS models on GPU. (#1379)\n* Support Parakeet models from NeMo (#1381)\n* Export Pyannote speaker segmentation models to onnx (#1382)\n* Support Agglomerative clustering. (#1384)\n* Add Python API for clustering (#1385)\n* support whisper turbo (#1390)\n* context_state is not set correctly when previous context is passed after reset (#1393)\n* Speaker diarization example with onnxruntime Python API (#1395)\n* C++ API for speaker diarization (#1396)\n* Python API for speaker diarization. 
(#1400)\n* C API for speaker diarization (#1402)\n* docs(nodejs-addon-examples): add guide for pnpm user (#1401)\n* Go API for speaker diarization (#1403)\n* Swift API for speaker diarization (#1404)\n* Update readme to include more external projects using sherpa-onnx (#1405)\n* C# API for speaker diarization (#1407)\n* JavaScript API (node-addon) for speaker diarization (#1408)\n* WebAssembly example for speaker diarization (#1411)\n* Handle audio files less than 10s long for speaker diarization. (#1412)\n* JavaScript API with WebAssembly for speaker diarization (#1414)\n* Kotlin API for speaker diarization (#1415)\n* Java API for speaker diarization (#1416)\n* Dart API for speaker diarization (#1418)\n* Pascal API for speaker diarization (#1420)\n* Android JNI support for speaker diarization (#1421)\n* Android demo for speaker diarization (#1423)\n\n## 1.10.27\n\n* Add non-streaming ONNX models for Russian ASR (#1358)\n* Fix building Flutter TTS examples for Linux (#1356)\n* Support passing utf-8 strings from JavaScript to C++. (#1355)\n* Fix sherpa_onnx.go to support returning empty recognition results (#1353)\n\n## 1.10.26\n\n* Add links to projects using sherpa-onnx. (#1345)\n* Support lang/emotion/event results from SenseVoice in Swift API. (#1346)\n* Support specifying max speech duration for VAD. (#1348)\n* Add APIs about max speech duration in VAD for various programming languages (#1349)\n\n## 1.10.25\n\n* Allow tokens and hotwords to be loaded from buffered string directly (#1339)\n* Fix computing features for CED audio tagging models. (#1341)\n* Preserve previous result as context for next segment (#1335)\n* Add Python binding for online punctuation models (#1312)\n* Fix vad.Flush(). (#1329)\n* Fix wasm app for streaming paraformer (#1328)\n* Build websocket related binaries for embedded systems. 
(#1327)\n* Fixed the C api calls and created the TTS project file (#1324)\n* Re-implement LM rescore for online transducer (#1231)\n\n## 1.10.24\n\n* Add VAD and keyword spotting for the Node package with WebAssembly (#1286)\n* Fix releasing npm package and fix building Android VAD+ASR example (#1288)\n* add Tokens []string, Timestamps []float32, Lang string, Emotion string, Event string (#1277)\n* add vad+sense voice example for C API (#1291)\n* ADD VAD+ASR example for dart with CircularBuffer. (#1293)\n* Fix VAD+ASR example for Dart API. (#1294)\n* Avoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches freeing null. (#1296)\n* Fix releasing wasm app for vad+asr (#1300)\n* remove extra files from linux/macos/windows jni libs (#1301)\n* two-pass Android APK for SenseVoice (#1302)\n* Downgrade flutter sdk versions. (#1305)\n* Reduce onnxruntime log output. (#1306)\n* Provide prebuilt .jar files for different java versions. (#1307)\n\n\n## 1.10.23\n\n* flutter: add lang, emotion, event to OfflineRecognizerResult (#1268)\n* Use a separate thread to initialize models for lazarus examples. (#1270)\n* Object pascal examples for recording and playing audio with portaudio. (#1271)\n* Text to speech API for Object Pascal. (#1273)\n* update kotlin api for better release native object and add user-friendly apis. (#1275)\n* Update wave-reader.cc to support 8/16/32-bit waves (#1278)\n* Add WebAssembly for VAD (#1281)\n* WebAssembly example for VAD + Non-streaming ASR (#1284)\n\n## 1.10.22\n\n* Add Pascal API for reading wave files (#1243)\n* Pascal API for streaming ASR (#1246)\n* Pascal API for non-streaming ASR (#1247)\n* Pascal API for VAD (#1249)\n* Add more C API examples (#1255)\n* Add emotion, event of SenseVoice. (#1257)\n* Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258)\n* Enable IPO only for Release build. 
(#1261)\n* Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251)\n* Fix looking up OOVs in lexicon.txt for MeloTTS models. (#1266)\n\n\n## 1.10.21\n\n* Fix ffmpeg c api example (#1185)\n* Fix splitting sentences for MeloTTS (#1186)\n* Non-streaming WebSocket client for Java. (#1190)\n* Fix copying asset files for flutter examples. (#1191)\n* Add Chinese+English tts example for flutter (#1192)\n* Add speaker identification and verification example for Dart API (#1194)\n* Fix reading non-standard wav files. (#1199)\n* Add ReazonSpeech Japanese pre-trained model (#1203)\n* Describe how to add new words for MeloTTS models (#1209)\n* Remove libonnxruntime_providers_cuda.so as a dependency. (#1210)\n* Fix setting SenseVoice language. (#1214)\n* Support passing TTS callback in Swift API (#1218)\n* Add MeloTTS example for ios (#1223)\n* Add online punctuation and casing prediction model for English language (#1224)\n* Fix python two pass ASR examples (#1230)\n* Add blank penalty for various language bindings\n\n## 1.10.20\n\n* Add Dart API for audio tagging\n* Add Dart API for adding punctuations to text\n\n## 1.10.19\n\n* Prefix all C API functions with SherpaOnnx\n\n## 1.10.18\n\n* Fix the case when recognition results contain the symbol `\"`. It caused\n  issues when converting results to a json string.\n\n## 1.10.17\n\n* Support SenseVoice CTC models.\n* Add Dart API for keyword spotter.\n\n## 1.10.16\n\n* Support zh-en TTS model from MeloTTS.\n\n## 1.10.15\n\n* Downgrade onnxruntime from v1.18.1 to v1.17.1\n\n## 1.10.14\n\n* Support whisper large v3\n* Update onnxruntime from v1.18.0 to v1.18.1\n* Fix invalid utf8 sequence from Whisper for Dart API.\n\n## 1.10.13\n\n* Update onnxruntime from 1.17.1 to 1.18.0\n* Add C# API for Keyword spotting\n\n## 1.10.12\n\n* Add Flush to VAD so that the last speech segment can be detected. 
See also\n  https://github.com/k2-fsa/sherpa-onnx/discussions/1077#discussioncomment-9979740\n\n## 1.10.11\n\n* Support the iOS platform for Flutter.\n\n## 1.10.10\n\n* Build sherpa-onnx into a single shared library.\n\n## 1.10.9\n\n* Fix released packages. piper-phonemize was not included in v1.10.8.\n\n## 1.10.8\n\n* Fix released packages. There should be a lib directory.\n\n## 1.10.7\n\n* Support Android for Flutter.\n\n## 1.10.2\n\n* Fix passing C# string to C++\n\n## 1.10.1\n\n* Enable to stop TTS generation\n\n## 1.10.0\n\n* Add inverse text normalization\n\n## 1.9.30\n\n* Add TTS\n\n## 1.9.29\n\n* Publish with CI\n\n## 0.0.3\n\n* Fix path separator on Windows.\n\n## 0.0.2\n\n* Support specifying lib path.\n\n## 0.0.1\n\n* Initial release.\n"
  },
  {
    "path": "CMakeLists.txt",
    "content": "if (CMAKE_VERSION VERSION_GREATER_EQUAL \"4.0.0\")\n  set(CMAKE_POLICY_VERSION_MINIMUM 3.10)\nendif()\n\ncmake_minimum_required(VERSION 3.15 FATAL_ERROR)\n\n# https://cmake.org/cmake/help/latest/prop_tgt/MSVC_RUNTIME_LIBRARY.html\ncmake_policy(SET CMP0091 NEW)\n\nmessage(STATUS \"CMake version: ${CMAKE_VERSION}\")\n\nset(CMAKE_OSX_DEPLOYMENT_TARGET \"10.14\" CACHE STRING \"Minimum OS X deployment version. Used only for macOS\")\n\nset(CMAKE_POLICY_DEFAULT_CMP0063 NEW)\nset(CMAKE_POLICY_DEFAULT_CMP0069 NEW)\n\nproject(sherpa-onnx)\n\n# Remember to update\n# ./CHANGELOG.md\n# ./new-release.sh\nset(SHERPA_ONNX_VERSION \"1.12.31\")\n\n# Disable warning about\n#\n# \"The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is\n#  not set.\nif (CMAKE_VERSION VERSION_GREATER_EQUAL \"3.24.0\")\n  cmake_policy(SET CMP0135 NEW)\nendif()\n\n\nif(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)\n  set(SUGGEST_BUILD_BINARIES ON)\nelse()\n  set(SUGGEST_BUILD_BINARIES OFF)\nendif()\n\noption(SHERPA_ONNX_ENABLE_PYTHON \"Whether to build Python\" OFF)\noption(SHERPA_ONNX_ENABLE_TESTS \"Whether to build tests\" OFF)\noption(SHERPA_ONNX_ENABLE_CHECK \"Whether to build with assert\" OFF)\noption(BUILD_SHARED_LIBS \"Whether to build shared libraries\" OFF)\noption(SHERPA_ONNX_ENABLE_PORTAUDIO \"Whether to build with portaudio\" ON)\noption(SHERPA_ONNX_ENABLE_JNI \"Whether to build JNI interface\" OFF)\noption(SHERPA_ONNX_ENABLE_C_API \"Whether to build C API\" ON)\noption(SHERPA_ONNX_ENABLE_WEBSOCKET \"Whether to build websocket server/client\" ON)\noption(SHERPA_ONNX_ENABLE_GPU \"Enable ONNX Runtime GPU support\" OFF)\noption(SHERPA_ONNX_ENABLE_DIRECTML \"Enable ONNX Runtime DirectML support\" OFF)\noption(SHERPA_ONNX_LINK_D3D \"Whether static ONNX runtime lib with DML\" OFF)\n\noption(SHERPA_ONNX_ENABLE_WASM \"Whether to enable WASM\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_SPEAKER_DIARIZATION \"Whether to enable WASM for speaker diarization\"
OFF)\noption(SHERPA_ONNX_ENABLE_WASM_TTS \"Whether to enable WASM for TTS\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_ASR \"Whether to enable WASM for ASR\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_KWS \"Whether to enable WASM for KWS\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_VAD \"Whether to enable WASM for VAD\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_VAD_ASR \"Whether to enable WASM for VAD+ASR\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_NODEJS \"Whether to enable WASM for NodeJS\" OFF)\noption(SHERPA_ONNX_ENABLE_WASM_SPEECH_ENHANCEMENT \"Whether to enable WASM for speech enhancement\" OFF)\noption(SHERPA_ONNX_ENABLE_BINARY \"Whether to build binaries\" ${SUGGEST_BUILD_BINARIES})\noption(SHERPA_ONNX_ENABLE_TTS \"Whether to build TTS related code\" ON)\noption(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION \"Whether to build speaker diarization related code\" ON)\noption(SHERPA_ONNX_LINK_LIBSTDCPP_STATICALLY \"True to link libstdc++ statically. Used only when BUILD_SHARED_LIBS is OFF on Linux\" ON)\noption(SHERPA_ONNX_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE \"True to use pre-installed onnxruntime if available\" ON)\noption(SHERPA_ONNX_ENABLE_SANITIZER \"Whether to enable ubsan and asan\" OFF)\noption(SHERPA_ONNX_BUILD_C_API_EXAMPLES \"Whether to enable C API examples\" ${SUGGEST_BUILD_BINARIES})\noption(SHERPA_ONNX_ENABLE_RKNN \"Whether to build for RKNN NPU \" OFF)\noption(SHERPA_ONNX_ENABLE_AXERA \"Whether to build for Axera NPU \" OFF)\noption(SHERPA_ONNX_ENABLE_AXCL \"Whether to build for Axcl NPU \" OFF)\noption(SHERPA_ONNX_ENABLE_ASCEND_NPU \"Whether to build for Ascend NPU \" OFF)\noption(SHERPA_ONNX_ENABLE_QNN \"Whether to build for Qualcomm NPU\" OFF)\noption(SHERPA_ONNX_ENABLE_SPACEMIT \"Whether to build for SpacemiT CPUs \" OFF)\nset(SHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION \"1.11.0\" CACHE STRING \"Used only for Linux ARM64 GPU. Set to 1.11.0 if you use CUDA 10.2 and cudnn8. Set it to 1.16.0 if you use CUDA 11.4 and cudnn8. 
Set it to 1.18.0 if you use CUDA 12.2 and cudnn8. Set it to 1.18.1 if you use CUDA 12.6 and cudnn9\")\n\n# SHERPA_ONNX_USE_STATIC_CRT controls whether we use:\n#   - Static CRT:   /MT  (Release), /MTd (Debug)\n#   - Dynamic CRT:  /MD  (Release), /MDd (Debug)\noption(SHERPA_ONNX_USE_STATIC_CRT \"For Windows only. ON to use static CRT (/MT /MTd); OFF to use dynamic (/MD /MDd)\" ON)\n\n\n# On Windows with MSVC, explicitly control which C runtime (CRT) to use.\n#\n# We rely on CMAKE_MSVC_RUNTIME_LIBRARY (CMake >= 3.15) instead of manually\n# appending /MT, /MTd, /MD, or /MDd to compiler flags.\n#\n# Benefits:\n#   - Correct behavior for multi-config generators (Visual Studio, Ninja Multi-Config)\n#   - No reliance on CMAKE_BUILD_TYPE (which is empty for multi-config generators)\n#   - Cleaner interaction with subprojects and FetchContent dependencies\n#\n# The generator expression automatically selects:\n#   - Debug   -> /MTd or /MDd\n#   - Release -> /MT  or /MD\n#   - RelWithDebInfo / MinSizeRel -> /MT or /MD\nif (MSVC AND NOT DEFINED CMAKE_MSVC_RUNTIME_LIBRARY)\n  if(DEFINED CMAKE_BUILD_TYPE AND NOT CMAKE_BUILD_TYPE STREQUAL \"\")\n    if (SHERPA_ONNX_USE_STATIC_CRT)\n      # Use static CRT: /MT (Release) and /MTd (Debug)\n      if(CMAKE_BUILD_TYPE MATCHES Debug)\n        set(CMAKE_MSVC_RUNTIME_LIBRARY \"MultiThreadedDebug\")\n      else()\n        set(CMAKE_MSVC_RUNTIME_LIBRARY \"MultiThreaded\")\n      endif()\n    else()\n      # Use dynamic CRT: /MD (Release) and /MDd (Debug)\n      if(CMAKE_BUILD_TYPE MATCHES Debug)\n        set(CMAKE_MSVC_RUNTIME_LIBRARY \"MultiThreadedDebugDLL\")\n      else()\n        set(CMAKE_MSVC_RUNTIME_LIBRARY \"MultiThreadedDLL\")\n      endif()\n    endif()\n  else()\n    if (SHERPA_ONNX_USE_STATIC_CRT)\n      # Use static CRT: /MT (Release) and /MTd (Debug)\n      set(CMAKE_MSVC_RUNTIME_LIBRARY\n          \"MultiThreaded$<$<CONFIG:Debug>:Debug>\")\n    else()\n      # Use dynamic CRT: /MD (Release) and /MDd (Debug)\n      
set(CMAKE_MSVC_RUNTIME_LIBRARY\n          \"MultiThreaded$<$<CONFIG:Debug>:Debug>DLL\")\n    endif()\n  endif()\nendif()\n\nset(CMAKE_ARCHIVE_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/lib\")\nset(CMAKE_LIBRARY_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/lib\")\nset(CMAKE_RUNTIME_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/bin\")\n\nif(NOT WIN32)\n  set(CMAKE_SKIP_BUILD_RPATH FALSE)\n  set(BUILD_RPATH_USE_ORIGIN TRUE)\n  set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)\nendif()\n\nif(NOT APPLE)\n  set(SHERPA_ONNX_RPATH_ORIGIN \"$ORIGIN\")\nelse()\n  set(SHERPA_ONNX_RPATH_ORIGIN \"@loader_path\")\nendif()\n\nif(NOT WIN32)\n  set(CMAKE_INSTALL_RPATH ${SHERPA_ONNX_RPATH_ORIGIN})\n  set(CMAKE_BUILD_RPATH ${SHERPA_ONNX_RPATH_ORIGIN})\nendif()\n\nif(NOT CMAKE_BUILD_TYPE)\n  message(STATUS \"No CMAKE_BUILD_TYPE given, default to Release\")\n  set(CMAKE_BUILD_TYPE Release)\nendif()\n\nif(DEFINED ANDROID_ABI AND NOT SHERPA_ONNX_ENABLE_JNI AND NOT SHERPA_ONNX_ENABLE_C_API)\n  message(STATUS \"Set SHERPA_ONNX_ENABLE_JNI to ON for Android\")\n  set(SHERPA_ONNX_ENABLE_JNI ON CACHE BOOL \"\" FORCE)\nendif()\n\nif(SHERPA_ONNX_ENABLE_PYTHON AND NOT BUILD_SHARED_LIBS)\n  message(STATUS \"Set BUILD_SHARED_LIBS to ON since SHERPA_ONNX_ENABLE_PYTHON is ON\")\n  set(BUILD_SHARED_LIBS ON CACHE BOOL \"\" FORCE)\nendif()\n\nif(SHERPA_ONNX_ENABLE_GPU)\n  message(WARNING \"\\\nCompiling for NVIDIA GPU is enabled. Please make sure cudatoolkit\nis installed on your system. Otherwise, you will get errors at runtime.\nHint: You don't need sudo permission to install CUDA toolkit. Please refer to\n  https://k2-fsa.github.io/k2/installation/cuda-cudnn.html\nto install CUDA toolkit if you have not installed it.\")\n  if(NOT BUILD_SHARED_LIBS)\n    message(STATUS \"Set BUILD_SHARED_LIBS to ON since SHERPA_ONNX_ENABLE_GPU is ON\")\n    set(BUILD_SHARED_LIBS ON CACHE BOOL \"\" FORCE)\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_DIRECTML)\n  message(WARNING \"\\\nCompiling with DirectML enabled.
Please make sure Windows 10 SDK\nis installed on your system. Otherwise, you will get errors at runtime.\nPlease refer to\n  https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html#requirements\nto install Windows 10 SDK if you have not installed it.\")\n  if(NOT BUILD_SHARED_LIBS)\n    message(STATUS \"Set BUILD_SHARED_LIBS to ON since SHERPA_ONNX_ENABLE_DIRECTML is ON\")\n    set(BUILD_SHARED_LIBS ON CACHE BOOL \"\" FORCE)\n  endif()\nendif()\n\nif(CMAKE_SYSTEM_NAME STREQUAL OHOS)\n  set(CMAKE_CXX_FLAGS \"-Wno-unused-command-line-argument ${CMAKE_CXX_FLAGS}\")\n  set(CMAKE_C_FLAGS \"-Wno-unused-command-line-argument ${CMAKE_C_FLAGS}\")\nendif()\n\nif(ANDROID)\n  # see https://github.com/microsoft/onnxruntime/pull/22076\n  # https://github.com/k2-fsa/sherpa-onnx/issues/2413\n  set(CMAKE_SHARED_LINKER_FLAGS \"${CMAKE_SHARED_LINKER_FLAGS} -Wl,-z,max-page-size=16384\")\nendif()\n\nmessage(STATUS \"CMAKE_BUILD_TYPE: ${CMAKE_BUILD_TYPE}\")\nmessage(STATUS \"CMAKE_INSTALL_PREFIX: ${CMAKE_INSTALL_PREFIX}\")\nmessage(STATUS \"BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_PYTHON ${SHERPA_ONNX_ENABLE_PYTHON}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_TESTS ${SHERPA_ONNX_ENABLE_TESTS}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_CHECK ${SHERPA_ONNX_ENABLE_CHECK}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_PORTAUDIO ${SHERPA_ONNX_ENABLE_PORTAUDIO}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_JNI ${SHERPA_ONNX_ENABLE_JNI}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_C_API ${SHERPA_ONNX_ENABLE_C_API}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WEBSOCKET ${SHERPA_ONNX_ENABLE_WEBSOCKET}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_GPU ${SHERPA_ONNX_ENABLE_GPU}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM ${SHERPA_ONNX_ENABLE_WASM}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_SPEAKER_DIARIZATION ${SHERPA_ONNX_ENABLE_WASM_SPEAKER_DIARIZATION}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_TTS ${SHERPA_ONNX_ENABLE_WASM_TTS}\")\nmessage(STATUS 
\"SHERPA_ONNX_ENABLE_WASM_ASR ${SHERPA_ONNX_ENABLE_WASM_ASR}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_KWS ${SHERPA_ONNX_ENABLE_WASM_KWS}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_VAD ${SHERPA_ONNX_ENABLE_WASM_VAD}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_VAD_ASR ${SHERPA_ONNX_ENABLE_WASM_VAD_ASR}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_NODEJS ${SHERPA_ONNX_ENABLE_WASM_NODEJS}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_WASM_SPEECH_ENHANCEMENT ${SHERPA_ONNX_ENABLE_WASM_SPEECH_ENHANCEMENT}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_BINARY ${SHERPA_ONNX_ENABLE_BINARY}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_TTS ${SHERPA_ONNX_ENABLE_TTS}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION ${SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION}\")\nmessage(STATUS \"SHERPA_ONNX_LINK_LIBSTDCPP_STATICALLY ${SHERPA_ONNX_LINK_LIBSTDCPP_STATICALLY}\")\nmessage(STATUS \"SHERPA_ONNX_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE ${SHERPA_ONNX_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_SANITIZER: ${SHERPA_ONNX_ENABLE_SANITIZER}\")\nmessage(STATUS \"SHERPA_ONNX_BUILD_C_API_EXAMPLES: ${SHERPA_ONNX_BUILD_C_API_EXAMPLES}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_RKNN: ${SHERPA_ONNX_ENABLE_RKNN}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_AXERA: ${SHERPA_ONNX_ENABLE_AXERA}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_AXCL: ${SHERPA_ONNX_ENABLE_AXCL}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_ASCEND_NPU: ${SHERPA_ONNX_ENABLE_ASCEND_NPU}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_QNN: ${SHERPA_ONNX_ENABLE_QNN}\")\nmessage(STATUS \"SHERPA_ONNX_ENABLE_SPACEMIT: ${SHERPA_ONNX_ENABLE_SPACEMIT}\")\nmessage(STATUS \"SHERPA_ONNX_LINK_D3D: ${SHERPA_ONNX_LINK_D3D}\")\nmessage(STATUS \"SHERPA_ONNX_USE_STATIC_CRT: ${SHERPA_ONNX_USE_STATIC_CRT}\")\nif(MSVC)\n  message(STATUS \"CMAKE_MSVC_RUNTIME_LIBRARY ${CMAKE_MSVC_RUNTIME_LIBRARY}\")\nendif()\n\nif(BUILD_SHARED_LIBS OR SHERPA_ONNX_ENABLE_JNI)\n  set(CMAKE_CXX_VISIBILITY_PRESET hidden)\n  set(CMAKE_VISIBILITY_INLINES_HIDDEN 
1)\n  set(CMAKE_POSITION_INDEPENDENT_CODE ON)\nendif()\n\nif(BUILD_SHARED_LIBS AND NOT CMAKE_SYSTEM_NAME STREQUAL iOS AND CMAKE_BUILD_TYPE STREQUAL Release)\n  # Don't use LTO for iOS since it causes the following error\n  # error: unable to find any architecture information in the binary\n  # at '/Users/fangjun/open-source/sherpa-onnx/build-ios/build/os64/sherpa-onnx.a':\n  # Unknown header: 0xb17c0de\n  # See also https://forums.developer.apple.com/forums/thread/714324\n\n  include(CheckIPOSupported)\n  check_ipo_supported(RESULT ipo)\n  if(ipo)\n    message(STATUS \"IPO is enabled\")\n    set(CMAKE_INTERPROCEDURAL_OPTIMIZATION ON)\n  else()\n    message(STATUS \"IPO is not available\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  message(STATUS \"TTS is enabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_TTS=1)\nelse()\n  message(STATUS \"TTS is disabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_TTS=0)\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  message(STATUS \"speaker diarization is enabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=1)\nelse()\n  message(STATUS \"speaker diarization is disabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=0)\nendif()\n\nif(SHERPA_ONNX_ENABLE_DIRECTML)\n  message(STATUS \"DirectML is enabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_DIRECTML=1)\nelse()\n  message(STATUS \"DirectML is disabled\")\n  add_definitions(-DSHERPA_ONNX_ENABLE_DIRECTML=0)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_SPEAKER_DIARIZATION)\n  if(NOT SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION to ON if you want to build WASM for speaker diarization\")\n  endif()\n\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for speaker diarization\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_TTS)\n  if(NOT SHERPA_ONNX_ENABLE_TTS)\n    message(FATAL_ERROR \"Please 
set SHERPA_ONNX_ENABLE_TTS to ON if you want to build WASM for TTS\")\n  endif()\n\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for TTS\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_ASR)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for ASR\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_NODEJS)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for NodeJS\")\n  endif()\n  add_definitions(-DSHERPA_ONNX_ENABLE_WASM_KWS=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM)\n  add_definitions(-DSHERPA_ONNX_ENABLE_WASM=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_KWS)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for KWS\")\n  endif()\n  add_definitions(-DSHERPA_ONNX_ENABLE_WASM_KWS=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_VAD)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for VAD\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_VAD_ASR)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for VAD+ASR\")\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_SPEECH_ENHANCEMENT)\n  if(NOT SHERPA_ONNX_ENABLE_WASM)\n    message(FATAL_ERROR \"Please set SHERPA_ONNX_ENABLE_WASM to ON if you enable WASM for speech enhancement\")\n  endif()\nendif()\n\nif(NOT CMAKE_CXX_STANDARD)\n  set(CMAKE_CXX_STANDARD 17 CACHE STRING \"The C++ version to be used.\")\nendif()\nset(CMAKE_CXX_EXTENSIONS OFF)\nmessage(STATUS \"C++ Standard version: ${CMAKE_CXX_STANDARD}\")\n\ninclude(CheckIncludeFileCXX)\n\nif(SHERPA_ONNX_ENABLE_RKNN)\n  add_definitions(-DSHERPA_ONNX_ENABLE_RKNN=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXERA)\n  
add_definitions(-DSHERPA_ONNX_ENABLE_AXERA=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXCL)\n  add_definitions(-DSHERPA_ONNX_ENABLE_AXCL=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_QNN)\n  add_definitions(-DSHERPA_ONNX_ENABLE_QNN=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPACEMIT)\n  add_definitions(-DSHERPA_ONNX_ENABLE_SPACEMIT=1)\nendif()\n\nif(SHERPA_ONNX_ENABLE_ASCEND_NPU)\n  set(ASCEND_TOOLKIT_HOME)\n  if(NOT DEFINED ENV{ASCEND_TOOLKIT_HOME})\n    if(EXISTS /usr/local/Ascend/ascend-toolkit/latest)\n      set(ASCEND_TOOLKIT_HOME /usr/local/Ascend/ascend-toolkit/latest)\n    else()\n      message(FATAL_ERROR \"\\\n      Please specify the installation directory of the ascend toolkit.\n      For instance, if it is installed in\n\n        /usr/local/Ascend/ascend-toolkit/latest\n\n      You can run\n\n        export ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest\n      \")\n    endif()\n  else()\n    set(ASCEND_TOOLKIT_HOME $ENV{ASCEND_TOOLKIT_HOME})\n  endif()\n\n  message(STATUS \"ASCEND_TOOLKIT_HOME: ${ASCEND_TOOLKIT_HOME}\")\n\n  if(NOT EXISTS ${ASCEND_TOOLKIT_HOME}/include/acl/acl.h)\n    message(FATAL_ERROR \"${ASCEND_TOOLKIT_HOME}/include/acl/acl.h does not exist\")\n  endif()\n\n  if(NOT EXISTS ${ASCEND_TOOLKIT_HOME}/lib64/libascendcl.so)\n    message(FATAL_ERROR \"${ASCEND_TOOLKIT_HOME}/lib64/libascendcl.so does not exist\")\n  endif()\n\n  add_definitions(-DSHERPA_ONNX_ENABLE_ASCEND_NPU=1)\n  message(STATUS \"Build with Ascend NPU\")\nendif()\n\nif(SHERPA_ONNX_ENABLE_QNN)\n  if(NOT DEFINED ENV{QNN_SDK_ROOT})\n      message(FATAL_ERROR \"\\\n      Please specify the installation directory of the QNN SDK toolkit.\n      For instance, if it is installed in\n\n        /mnt/sdb/open-source/qairt/2.33.0.250327\n\n      You can run\n\n        source /mnt/sdb/open-source/qairt/2.33.0.250327/bin/envsetup.sh\n\n      which will give you the following output\n\n      [INFO] AISW SDK environment set\n      [INFO] QNN_SDK_ROOT: /mnt/sdb/open-source/qairt/2.33.0.250327\n   
   [INFO] SNPE_ROOT: /mnt/sdb/open-source/qairt/2.33.0.250327\n\n      Then run\n\n        echo $QNN_SDK_ROOT\n\n      It should print:\n\n        /mnt/sdb/open-source/qairt/2.33.0.250327\n\n      You can choose a version of QNN SDK by yourself. You don't need\n      to use 2.33.0.250327\n      \")\n  endif()\n\n  set(QNN_SDK_ROOT $ENV{QNN_SDK_ROOT})\n\n  if(NOT EXISTS ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h)\n    message(FATAL_ERROR \"${QNN_SDK_ROOT}/include/QNN/QnnInterface.h does not exist\")\n  endif()\nendif()\n\nif(UNIX AND NOT APPLE AND NOT SHERPA_ONNX_ENABLE_WASM AND NOT CMAKE_SYSTEM_NAME STREQUAL Android AND NOT CMAKE_SYSTEM_NAME STREQUAL OHOS)\n  check_include_file_cxx(alsa/asoundlib.h SHERPA_ONNX_HAS_ALSA)\n  if(SHERPA_ONNX_HAS_ALSA)\n    message(STATUS \"With Alsa\")\n    add_definitions(-DSHERPA_ONNX_ENABLE_ALSA=1)\n  else()\n    message(WARNING \"\\\nCould not find alsa/asoundlib.h !\nWe won't build sherpa-onnx-alsa\nTo fix that, please do:\n  (1) sudo apt-get install alsa-utils libasound2-dev pkg-config\n  (2) rm -rf build\n  (3) re-try\n  \")\n  endif()\nendif()\n\ncheck_include_file_cxx(cxxabi.h SHERPA_ONNX_HAVE_CXXABI_H)\ncheck_include_file_cxx(execinfo.h SHERPA_ONNX_HAVE_EXECINFO_H)\n\nif(WIN32)\n  add_definitions(-DNOMINMAX) # Otherwise, std::max() and std::min() won't work\nendif()\n\n\nif(WIN32 AND MSVC)\n  # disable various warnings for MSVC\n  # 4244: 'return': conversion from 'unsigned __int64' to 'int', possible loss of data\n  # 4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data\n  # 4305: 'argument': truncation from 'double' to 'const float'\n  # 4334: '<<': result of 32-bit shift implicitly converted to 64 bits\n  # 4800: 'int': forcing value to bool 'true' or 'false'\n  # 4996: 'fopen': This function or variable may be unsafe\n  set(disabled_warnings\n      /wd4244\n      /wd4267\n      /wd4305\n      /wd4334\n      /wd4800\n      /wd4996\n  )\n  message(STATUS \"Disabled warnings: 
${disabled_warnings}\")\n  foreach(w IN LISTS disabled_warnings)\n    string(APPEND CMAKE_CXX_FLAGS \" ${w} \")\n  endforeach()\n\n  add_compile_options(\"$<$<C_COMPILER_ID:MSVC>:/utf-8>\")\n  add_compile_options(\"$<$<CXX_COMPILER_ID:MSVC>:/utf-8>\")\nendif()\n\nlist(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules)\nlist(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)\n\ninclude(show-info)\n\nif(SHERPA_ONNX_ENABLE_WASM)\n  # Enable it for debugging in case there is something wrong.\n  # string(APPEND CMAKE_CXX_FLAGS \" -g4 -s ASSERTIONS=2 -s SAFE_HEAP=1 -s STACK_OVERFLOW_CHECK=1 \")\nendif()\n\nif(NOT BUILD_SHARED_LIBS AND CMAKE_SYSTEM_NAME STREQUAL Linux)\n  if(SHERPA_ONNX_LINK_LIBSTDCPP_STATICALLY)\n    message(STATUS \"Link libstdc++ statically\")\n    set(CMAKE_CXX_FLAGS \" ${CMAKE_CXX_FLAGS} -static-libstdc++ -static-libgcc \")\n  else()\n    message(STATUS \"Link libstdc++ dynamically\")\n  endif()\nendif()\n\ninclude(kaldi-native-fbank)\ninclude(kaldi-decoder)\ninclude(onnxruntime)\ninclude(simple-sentencepiece)\nset(ONNXRUNTIME_DIR ${onnxruntime_SOURCE_DIR})\nmessage(STATUS \"ONNXRUNTIME_DIR: ${ONNXRUNTIME_DIR}\")\n\nif(SHERPA_ONNX_ENABLE_PORTAUDIO AND SHERPA_ONNX_ENABLE_BINARY)\n  # portaudio is used only in building demo binaries and the sherpa-onnx-core\n  # library does not depend on it.\n  include(portaudio)\nendif()\n\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  include(pybind11)\nendif()\n\nif(SHERPA_ONNX_ENABLE_TESTS)\n  enable_testing()\n  include(googletest)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WEBSOCKET)\n  include(websocketpp)\n  include(asio)\nendif()\n\ninclude(json)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  include(espeak-ng-for-piper)\n  set(ESPEAK_NG_DIR ${espeak_ng_SOURCE_DIR})\n  message(STATUS \"ESPEAK_NG_DIR: ${ESPEAK_NG_DIR}\")\n  include(piper-phonemize)\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  include(hclust-cpp)\nendif()\n\n# if(NOT MSVC AND CMAKE_BUILD_TYPE STREQUAL Debug AND (CMAKE_CXX_COMPILER_ID 
STREQUAL \"Clang\" OR CMAKE_CXX_COMPILER_ID STREQUAL \"AppleClang\"))\nif(SHERPA_ONNX_ENABLE_SANITIZER)\n  message(WARNING \"enable ubsan and asan\")\n  set(CMAKE_REQUIRED_LIBRARIES -lubsan -lasan)\n  include(CheckCCompilerFlag)\n\n  set(flags -fsanitize=undefined )\n  string(APPEND flags \" -fno-sanitize-recover=undefined \")\n  string(APPEND flags \" -fsanitize=integer \")\n  string(APPEND flags \" -fsanitize=nullability \")\n  string(APPEND flags \" -fsanitize=implicit-conversion \")\n  string(APPEND flags \" -fsanitize=bounds \")\n  string(APPEND flags \" -fsanitize=address \")\n\n  if(OFF)\n    set(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${flags} -Wall -Wextra\")\n    set(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${flags} -Wall -Wextra\")\n  else()\n    set(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${flags}\")\n    set(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${flags}\")\n  endif()\n\n  set(CMAKE_EXE_LINKER_FLAGS \"${CMAKE_EXE_LINKER_FLAGS} ${flags}\")\n\n  add_compile_options(-fno-omit-frame-pointer)\nendif()\n\nadd_subdirectory(sherpa-onnx)\n\nif(SHERPA_ONNX_ENABLE_C_API AND SHERPA_ONNX_ENABLE_BINARY AND SHERPA_ONNX_BUILD_C_API_EXAMPLES)\n  set(SHERPA_ONNX_PKG_WITH_CARGS \"-lcargs\")\n  add_subdirectory(c-api-examples)\n  add_subdirectory(cxx-api-examples)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM)\n  add_subdirectory(wasm)\nendif()\n\nmessage(STATUS \"CMAKE_CXX_FLAGS: ${CMAKE_CXX_FLAGS}\")\n\nif(NOT BUILD_SHARED_LIBS)\n  if(APPLE)\n    set(SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS \"-lc++ -framework Foundation\")\n  endif()\n\n  if(UNIX AND NOT APPLE)\n    set(SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS \"-lstdc++ -lm -pthread -ldl\")\n  endif()\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n# See https://people.freedesktop.org/~dbn/pkg-config-guide.html\n  if(SHERPA_ONNX_ENABLE_TTS)\n    configure_file(cmake/sherpa-onnx-static.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)\n  else()\n    configure_file(cmake/sherpa-onnx-static-no-tts.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)\n  
endif()\nelse()\n  configure_file(cmake/sherpa-onnx-shared.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)\nendif()\n\ninstall(\n  FILES\n    ${PROJECT_BINARY_DIR}/sherpa-onnx.pc\n  DESTINATION\n    ./\n)\nmessage(STATUS \"CMAKE_CXX_FLAGS: ${CMAKE_CXX_FLAGS}\")\n"
  },
  {
    "path": "CPPLINT.cfg",
    "content": "filter=-./mfc-examples\n"
  },
  {
    "path": "LICENSE",
    "content": "\n                                 Apache License\n                           Version 2.0, January 2004\n                        http://www.apache.org/licenses/\n\n   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION\n\n   1. Definitions.\n\n      \"License\" shall mean the terms and conditions for use, reproduction,\n      and distribution as defined by Sections 1 through 9 of this document.\n\n      \"Licensor\" shall mean the copyright owner or entity authorized by\n      the copyright owner that is granting the License.\n\n      \"Legal Entity\" shall mean the union of the acting entity and all\n      other entities that control, are controlled by, or are under common\n      control with that entity. For the purposes of this definition,\n      \"control\" means (i) the power, direct or indirect, to cause the\n      direction or management of such entity, whether by contract or\n      otherwise, or (ii) ownership of fifty percent (50%) or more of the\n      outstanding shares, or (iii) beneficial ownership of such entity.\n\n      \"You\" (or \"Your\") shall mean an individual or Legal Entity\n      exercising permissions granted by this License.\n\n      \"Source\" form shall mean the preferred form for making modifications,\n      including but not limited to software source code, documentation\n      source, and configuration files.\n\n      \"Object\" form shall mean any form resulting from mechanical\n      transformation or translation of a Source form, including but\n      not limited to compiled object code, generated documentation,\n      and conversions to other media types.\n\n      \"Work\" shall mean the work of authorship, whether in Source or\n      Object form, made available under the License, as indicated by a\n      copyright notice that is included in or attached to the work\n      (an example is provided in the Appendix below).\n\n      \"Derivative Works\" shall mean any work, whether in Source or Object\n      
form, that is based on (or derived from) the Work and for which the\n      editorial revisions, annotations, elaborations, or other modifications\n      represent, as a whole, an original work of authorship. For the purposes\n      of this License, Derivative Works shall not include works that remain\n      separable from, or merely link (or bind by name) to the interfaces of,\n      the Work and Derivative Works thereof.\n\n      \"Contribution\" shall mean any work of authorship, including\n      the original version of the Work and any modifications or additions\n      to that Work or Derivative Works thereof, that is intentionally\n      submitted to Licensor for inclusion in the Work by the copyright owner\n      or by an individual or Legal Entity authorized to submit on behalf of\n      the copyright owner. For the purposes of this definition, \"submitted\"\n      means any form of electronic, verbal, or written communication sent\n      to the Licensor or its representatives, including but not limited to\n      communication on electronic mailing lists, source code control systems,\n      and issue tracking systems that are managed by, or on behalf of, the\n      Licensor for the purpose of discussing and improving the Work, but\n      excluding communication that is conspicuously marked or otherwise\n      designated in writing by the copyright owner as \"Not a Contribution.\"\n\n      \"Contributor\" shall mean Licensor and any individual or Legal Entity\n      on behalf of whom a Contribution has been received by Licensor and\n      subsequently incorporated within the Work.\n\n   2. Grant of Copyright License. 
Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      copyright license to reproduce, prepare Derivative Works of,\n      publicly display, publicly perform, sublicense, and distribute the\n      Work and such Derivative Works in Source or Object form.\n\n   3. Grant of Patent License. Subject to the terms and conditions of\n      this License, each Contributor hereby grants to You a perpetual,\n      worldwide, non-exclusive, no-charge, royalty-free, irrevocable\n      (except as stated in this section) patent license to make, have made,\n      use, offer to sell, sell, import, and otherwise transfer the Work,\n      where such license applies only to those patent claims licensable\n      by such Contributor that are necessarily infringed by their\n      Contribution(s) alone or by combination of their Contribution(s)\n      with the Work to which such Contribution(s) was submitted. If You\n      institute patent litigation against any entity (including a\n      cross-claim or counterclaim in a lawsuit) alleging that the Work\n      or a Contribution incorporated within the Work constitutes direct\n      or contributory patent infringement, then any patent licenses\n      granted to You under this License for that Work shall terminate\n      as of the date such litigation is filed.\n\n   4. Redistribution. 
You may reproduce and distribute copies of the\n      Work or Derivative Works thereof in any medium, with or without\n      modifications, and in Source or Object form, provided that You\n      meet the following conditions:\n\n      (a) You must give any other recipients of the Work or\n          Derivative Works a copy of this License; and\n\n      (b) You must cause any modified files to carry prominent notices\n          stating that You changed the files; and\n\n      (c) You must retain, in the Source form of any Derivative Works\n          that You distribute, all copyright, patent, trademark, and\n          attribution notices from the Source form of the Work,\n          excluding those notices that do not pertain to any part of\n          the Derivative Works; and\n\n      (d) If the Work includes a \"NOTICE\" text file as part of its\n          distribution, then any Derivative Works that You distribute must\n          include a readable copy of the attribution notices contained\n          within such NOTICE file, excluding those notices that do not\n          pertain to any part of the Derivative Works, in at least one\n          of the following places: within a NOTICE text file distributed\n          as part of the Derivative Works; within the Source form or\n          documentation, if provided along with the Derivative Works; or,\n          within a display generated by the Derivative Works, if and\n          wherever such third-party notices normally appear. The contents\n          of the NOTICE file are for informational purposes only and\n          do not modify the License. 
You may add Your own attribution\n          notices within Derivative Works that You distribute, alongside\n          or as an addendum to the NOTICE text from the Work, provided\n          that such additional attribution notices cannot be construed\n          as modifying the License.\n\n      You may add Your own copyright statement to Your modifications and\n      may provide additional or different license terms and conditions\n      for use, reproduction, or distribution of Your modifications, or\n      for any such Derivative Works as a whole, provided Your use,\n      reproduction, and distribution of the Work otherwise complies with\n      the conditions stated in this License.\n\n   5. Submission of Contributions. Unless You explicitly state otherwise,\n      any Contribution intentionally submitted for inclusion in the Work\n      by You to the Licensor shall be under the terms and conditions of\n      this License, without any additional terms or conditions.\n      Notwithstanding the above, nothing herein shall supersede or modify\n      the terms of any separate license agreement you may have executed\n      with Licensor regarding such Contributions.\n\n   6. Trademarks. This License does not grant permission to use the trade\n      names, trademarks, service marks, or product names of the Licensor,\n      except as required for reasonable and customary use in describing the\n      origin of the Work and reproducing the content of the NOTICE file.\n\n   7. Disclaimer of Warranty. Unless required by applicable law or\n      agreed to in writing, Licensor provides the Work (and each\n      Contributor provides its Contributions) on an \"AS IS\" BASIS,\n      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or\n      implied, including, without limitation, any warranties or conditions\n      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A\n      PARTICULAR PURPOSE. 
You are solely responsible for determining the\n      appropriateness of using or redistributing the Work and assume any\n      risks associated with Your exercise of permissions under this License.\n\n   8. Limitation of Liability. In no event and under no legal theory,\n      whether in tort (including negligence), contract, or otherwise,\n      unless required by applicable law (such as deliberate and grossly\n      negligent acts) or agreed to in writing, shall any Contributor be\n      liable to You for damages, including any direct, indirect, special,\n      incidental, or consequential damages of any character arising as a\n      result of this License or out of the use or inability to use the\n      Work (including but not limited to damages for loss of goodwill,\n      work stoppage, computer failure or malfunction, or any and all\n      other commercial damages or losses), even if such Contributor\n      has been advised of the possibility of such damages.\n\n   9. Accepting Warranty or Additional Liability. While redistributing\n      the Work or Derivative Works thereof, You may choose to offer,\n      and charge a fee for, acceptance of support, warranty, indemnity,\n      or other liability obligations and/or rights consistent with this\n      License. However, in accepting such obligations, You may act only\n      on Your own behalf and on Your sole responsibility, not on behalf\n      of any other Contributor, and only if You agree to indemnify,\n      defend, and hold each Contributor harmless for any liability\n      incurred by, or claims asserted against, such Contributor by reason\n      of your accepting any such warranty or additional liability.\n\n   END OF TERMS AND CONDITIONS\n\n   APPENDIX: How to apply the Apache License to your work.\n\n      To apply the Apache License to your work, attach the following\n      boilerplate notice, with the fields enclosed by brackets \"[]\"\n      replaced with your own identifying information. 
(Don't include\n      the brackets!)  The text should be enclosed in the appropriate\n      comment syntax for the file format. We also recommend that a\n      file or class name and description of purpose be included on the\n      same \"printed page\" as the copyright notice for easier\n      identification within third-party archives.\n\n   Copyright [yyyy] [name of copyright owner]\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n"
  },
  {
    "path": "MANIFEST.in",
    "content": "include LICENSE\ninclude README.md\ninclude CMakeLists.txt\nrecursive-include c-api-examples *.*\nrecursive-include sherpa-onnx *.*\nrecursive-include cmake *.*\nprune */__pycache__\nprune android\nprune sherpa-onnx/java-api\nprune ios-swift\nprune ios-swiftui\n\n"
  },
  {
    "path": "README.md",
    "content": " ### Supported functions\n\n|Speech recognition| [Speech synthesis][tts-url] | [Source separation][ss-url] |\n|------------------|------------------|-------------------|\n|   ✔️              |         ✔️        |       ✔️           |\n\n|Speaker identification| [Speaker diarization][sd-url] | Speaker verification |\n|----------------------|-------------------- |------------------------|\n|   ✔️                  |         ✔️           |            ✔️           |\n\n| [Spoken Language identification][slid-url] | [Audio tagging][at-url] | [Voice activity detection][vad-url] |\n|--------------------------------|---------------|--------------------------|\n|                 ✔️              |          ✔️    |                ✔️         |\n\n| [Keyword spotting][kws-url] | [Add punctuation][punct-url] | [Speech enhancement][se-url] |\n|------------------|-----------------|--------------------|\n|     ✔️            |       ✔️         |      ✔️             |\n\n\n### Supported platforms\n\n|Architecture| Android | iOS     | Windows    | macOS | linux | HarmonyOS |\n|------------|---------|---------|------------|-------|-------|-----------|\n|   x64      |  ✔️      |         |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   x86      |  ✔️      |         |   ✔️      |       |        |        |\n|   arm64    |  ✔️      | ✔️      |   ✔️      | ✔️    |  ✔️    |   ✔️   |\n|   arm32    |  ✔️      |         |           |       |  ✔️    |   ✔️   |\n|   riscv64  |          |         |           |       |  ✔️    |        |\n\n### Supported programming languages\n\n| 1. C++ | 2. C  | 3. Python | 4. JavaScript |\n|--------|-------|-----------|---------------|\n|   ✔️    | ✔️     | ✔️         |    ✔️          |\n\n|5. Java | 6. C# | 7. Kotlin | 8. Swift |\n|--------|-------|-----------|----------|\n| ✔️      |  ✔️    | ✔️         |  ✔️       |\n\n| 9. Go | 10. Dart | 11. Rust | 12. 
Pascal |\n|-------|----------|----------|------------|\n| ✔️     |  ✔️       |   ✔️      |    ✔️       |\n\n\nIt also supports WebAssembly.\n\n### Supported NPUs\n\n| [1. Rockchip NPU (RKNN)][rknpu-doc] | [2. Qualcomm NPU (QNN)][qnn-doc]  | [3. Ascend NPU][ascend-doc] |\n|-------------------------------------|-----------------------------------|-----------------------------|\n|     ✔️                              |                  ✔️               |     ✔️                      |\n\n| [4. Axera NPU][axera-npu] |\n|---------------------------|\n|     ✔️                    |\n\n[Join our discord](https://discord.gg/fJdxzg2VbG)\n\n\n## Introduction\n\nThis repository supports running the following functions **locally**\n\n  - Speech-to-text (i.e., ASR); both streaming and non-streaming are supported\n  - Text-to-speech (i.e., TTS)\n  - Speaker diarization\n  - Speaker identification\n  - Speaker verification\n  - Spoken language identification\n  - Audio tagging\n  - VAD (e.g., [silero-vad][silero-vad])\n  - Speech enhancement (e.g., [gtcrn][gtcrn], [DPDFNet](https://github.com/ceva-ip/DPDFNet))\n  - Keyword spotting\n  - Source separation (e.g., [spleeter][spleeter], [UVR][UVR])\n\non the following platforms and operating systems:\n\n  - x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64), **RK NPU**, **Ascend NPU**\n  - Linux, macOS, Windows, openKylin\n  - Android, WearOS\n  - iOS\n  - HarmonyOS\n  - NodeJS\n  - WebAssembly\n  - [NVIDIA Jetson Orin NX][NVIDIA Jetson Orin NX] (Support running on both CPU and GPU)\n  - [NVIDIA Jetson Nano B01][NVIDIA Jetson Nano B01] (Support running on both CPU and GPU)\n  - [Raspberry Pi][Raspberry Pi]\n  - [RV1126][RV1126]\n  - [LicheePi4A][LicheePi4A]\n  - [VisionFive 2][VisionFive 2]\n  - [旭日X3派][旭日X3派]\n  - [爱芯派][爱芯派]\n  - [RK3588][RK3588]\n  - etc\n\nwith the following APIs\n\n  - C++, C, Python, Go, ``C#``\n  - Java, Kotlin, JavaScript\n  - Swift, Rust\n  - Dart, Object Pascal\n\n### Links for 
Huggingface Spaces\n\n<details>\n<summary>You can visit the following Huggingface spaces to try sherpa-onnx without\ninstalling anything. All you need is a browser.</summary>\n\n| Description                                           | URL                                     | 中国镜像                               |\n|-------------------------------------------------------|-----------------------------------------|----------------------------------------|\n| Speaker diarization                                   | [Click me][hf-space-speaker-diarization]| [镜像][hf-space-speaker-diarization-cn]|\n| Speech recognition                                    | [Click me][hf-space-asr]                | [镜像][hf-space-asr-cn]                |\n| Speech recognition with [Whisper][Whisper]            | [Click me][hf-space-asr-whisper]        | [镜像][hf-space-asr-whisper-cn]        |\n| Speech synthesis                                      | [Click me][hf-space-tts]                | [镜像][hf-space-tts-cn]                |\n| Generate subtitles                                    | [Click me][hf-space-subtitle]           | [镜像][hf-space-subtitle-cn]           |\n| Audio tagging                                         | [Click me][hf-space-audio-tagging]      | [镜像][hf-space-audio-tagging-cn]      |\n| Source separation                                     | [Click me][hf-space-source-separation]  | [镜像][hf-space-source-separation-cn]  |\n| Spoken language identification with [Whisper][Whisper]| [Click me][hf-space-slid-whisper]       | [镜像][hf-space-slid-whisper-cn]       |\n\nWe also have spaces built using WebAssembly. 
They are listed below:\n\n| Description                                                                              | Huggingface space| ModelScope space|\n|------------------------------------------------------------------------------------------|------------------|-----------------|\n|Voice activity detection with [silero-vad][silero-vad]                                    | [Click me][wasm-hf-vad]|[地址][wasm-ms-vad]|\n|Real-time speech recognition (Chinese + English) with Zipformer                           | [Click me][wasm-hf-streaming-asr-zh-en-zipformer]|[地址][wasm-hf-streaming-asr-zh-en-zipformer]|\n|Real-time speech recognition (Chinese + English) with Paraformer                          |[Click me][wasm-hf-streaming-asr-zh-en-paraformer]| [地址][wasm-ms-streaming-asr-zh-en-paraformer]|\n|Real-time speech recognition (Chinese + English + Cantonese) with [Paraformer-large][Paraformer-large]|[Click me][wasm-hf-streaming-asr-zh-en-yue-paraformer]| [地址][wasm-ms-streaming-asr-zh-en-yue-paraformer]|\n|Real-time speech recognition (English) |[Click me][wasm-hf-streaming-asr-en-zipformer]    |[地址][wasm-ms-streaming-asr-en-zipformer]|\n|VAD + speech recognition (Chinese) with [Zipformer CTC](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese)|[Click me][wasm-hf-vad-asr-zh-zipformer-ctc-07-03]| [地址][wasm-ms-vad-asr-zh-zipformer-ctc-07-03]|\n|VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with [SenseVoice][SenseVoice]|[Click me][wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]| [地址][wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]|\n|VAD + speech recognition (English) with [Whisper][Whisper] tiny.en|[Click me][wasm-hf-vad-asr-en-whisper-tiny-en]| [地址][wasm-ms-vad-asr-en-whisper-tiny-en]|\n|VAD + speech recognition (English) with [Moonshine tiny][Moonshine tiny]|[Click me][wasm-hf-vad-asr-en-moonshine-tiny-en]| 
[地址][wasm-ms-vad-asr-en-moonshine-tiny-en]|\n|VAD + speech recognition (English) with Zipformer trained with [GigaSpeech][GigaSpeech]    |[Click me][wasm-hf-vad-asr-en-zipformer-gigaspeech]| [地址][wasm-ms-vad-asr-en-zipformer-gigaspeech]|\n|VAD + speech recognition (Chinese) with Zipformer trained with [WenetSpeech][WenetSpeech]  |[Click me][wasm-hf-vad-asr-zh-zipformer-wenetspeech]| [地址][wasm-ms-vad-asr-zh-zipformer-wenetspeech]|\n|VAD + speech recognition (Japanese) with Zipformer trained with [ReazonSpeech][ReazonSpeech]|[Click me][wasm-hf-vad-asr-ja-zipformer-reazonspeech]| [地址][wasm-ms-vad-asr-ja-zipformer-reazonspeech]|\n|VAD + speech recognition (Thai) with Zipformer trained with [GigaSpeech2][GigaSpeech2]      |[Click me][wasm-hf-vad-asr-th-zipformer-gigaspeech2]| [地址][wasm-ms-vad-asr-th-zipformer-gigaspeech2]|\n|VAD + speech recognition (Chinese 多种方言) with a [TeleSpeech-ASR][TeleSpeech-ASR] CTC model|[Click me][wasm-hf-vad-asr-zh-telespeech]| [地址][wasm-ms-vad-asr-zh-telespeech]|\n|VAD + speech recognition (English + Chinese, 及多种中文方言) with Paraformer-large          |[Click me][wasm-hf-vad-asr-zh-en-paraformer-large]| [地址][wasm-ms-vad-asr-zh-en-paraformer-large]|\n|VAD + speech recognition (English + Chinese, 及多种中文方言) with Paraformer-small          |[Click me][wasm-hf-vad-asr-zh-en-paraformer-small]| [地址][wasm-ms-vad-asr-zh-en-paraformer-small]|\n|VAD + speech recognition (多语种及多种中文方言) with [Dolphin][Dolphin]-base          |[Click me][wasm-hf-vad-asr-multi-lang-dolphin-base]| [地址][wasm-ms-vad-asr-multi-lang-dolphin-base]|\n|Speech synthesis (Piper, English)                                                                  |[Click me][wasm-hf-tts-piper-en]| [地址][wasm-ms-tts-piper-en]|\n|Speech synthesis (Piper, German)                                                                   |[Click me][wasm-hf-tts-piper-de]| [地址][wasm-ms-tts-piper-de]|\n|Speech synthesis (Matcha, Chinese)                                                                  |[Click 
me][wasm-hf-tts-matcha-zh]| [地址][wasm-ms-tts-matcha-zh]|\n|Speech synthesis (Matcha, English)                                                                  |[Click me][wasm-hf-tts-matcha-en]| [地址][wasm-ms-tts-matcha-en]|\n|Speech synthesis (Matcha, Chinese+English)                                                          |[Click me][wasm-hf-tts-matcha-zh-en]| [地址][wasm-ms-tts-matcha-zh-en]|\n|Speaker diarization                                                                         |[Click me][wasm-hf-speaker-diarization]|[地址][wasm-ms-speaker-diarization]|\n|Voice cloning with ZipVoice (Chinese+English)                                               |[Click me][wasm-hf-voice-cloning-zipvoice]|[地址][wasm-ms-voice-cloning-zipvoice]|\n|Voice cloning with Pocket TTS (English)                                               |[Click me][wasm-hf-voice-cloning-pocket]|[地址][wasm-ms-voice-cloning-pocket]|\n\n</details>\n\n### Links for pre-built Android APKs\n\n<details>\n\n<summary>You can find pre-built Android APKs for this repository in the following table</summary>\n\n| Description                            | URL                                | 中国用户                          |\n|----------------------------------------|------------------------------------|-----------------------------------|\n| Speaker diarization                    | [Address][apk-speaker-diarization] | [点此][apk-speaker-diarization-cn]|\n| Streaming speech recognition           | [Address][apk-streaming-asr]       | [点此][apk-streaming-asr-cn]      |\n| Simulated-streaming speech recognition | [Address][apk-simula-streaming-asr]| [点此][apk-simula-streaming-asr-cn]|\n| Text-to-speech                         | [Address][apk-tts]                 | [点此][apk-tts-cn]                |\n| Voice activity detection (VAD)         | [Address][apk-vad]                 | [点此][apk-vad-cn]                |\n| VAD + non-streaming speech recognition | [Address][apk-vad-asr]             | [点此][apk-vad-asr-cn]            
|\n| Two-pass speech recognition            | [Address][apk-2pass]               | [点此][apk-2pass-cn]              |\n| Audio tagging                          | [Address][apk-at]                  | [点此][apk-at-cn]                 |\n| Audio tagging (WearOS)                 | [Address][apk-at-wearos]           | [点此][apk-at-wearos-cn]          |\n| Speaker identification                 | [Address][apk-sid]                 | [点此][apk-sid-cn]                |\n| Spoken language identification         | [Address][apk-slid]                | [点此][apk-slid-cn]               |\n| Keyword spotting                       | [Address][apk-kws]                 | [点此][apk-kws-cn]                |\n\n</details>\n\n### Links for pre-built Flutter APPs\n\n<details>\n\n#### Real-time speech recognition\n\n| Description                    | URL                                 | 中国用户                            |\n|--------------------------------|-------------------------------------|-------------------------------------|\n| Streaming speech recognition   | [Address][apk-flutter-streaming-asr]| [点此][apk-flutter-streaming-asr-cn]|\n\n#### Text-to-speech\n\n| Description                              | URL                                | 中国用户                           |\n|------------------------------------------|------------------------------------|------------------------------------|\n| Android (arm64-v8a, armeabi-v7a, x86_64) | [Address][flutter-tts-android]     | [点此][flutter-tts-android-cn]     |\n| Linux (x64)                              | [Address][flutter-tts-linux]       | [点此][flutter-tts-linux-cn]       |\n| macOS (x64)                              | [Address][flutter-tts-macos-x64]   | [点此][flutter-tts-macos-x64-cn] |\n| macOS (arm64)                            | [Address][flutter-tts-macos-arm64] | [点此][flutter-tts-macos-arm64-cn]   |\n| Windows (x64)                            | [Address][flutter-tts-win-x64]     | [点此][flutter-tts-win-x64-cn]     |\n\n> Note: You need 
to build from source for iOS.\n\n</details>\n\n### Links for pre-built Lazarus APPs\n\n<details>\n\n#### Generating subtitles\n\n| Description                    | URL                        | 中国用户                   |\n|--------------------------------|----------------------------|----------------------------|\n| Generate subtitles (生成字幕)  | [Address][lazarus-subtitle]| [点此][lazarus-subtitle-cn]|\n\n</details>\n\n### Links for pre-trained models\n\n<details>\n\n| Description                                 | URL                                                                                   |\n|---------------------------------------------|---------------------------------------------------------------------------------------|\n| Speech recognition (speech to text, ASR)    | [Address][asr-models]                                                                 |\n| Text-to-speech (TTS)                        | [Address][tts-models]                                                                 |\n| VAD                                         | [Address][vad-models]                                                                 |\n| Keyword spotting                            | [Address][kws-models]                                                                 |\n| Audio tagging                               | [Address][at-models]                                                                  |\n| Speaker identification (Speaker ID)         | [Address][sid-models]                                                                 |\n| Spoken language identification (Language ID)| See multi-lingual [Whisper][Whisper] ASR models from  [Speech recognition][asr-models]|\n| Punctuation                                 | [Address][punct-models]                                                               |\n| Speaker segmentation                        | [Address][speaker-segmentation-models]                                                |\n| Speech enhancement       
                   | [Address][speech-enhancement-models]                                                  |\n| Source separation                           | [Address][source-separation-models]                                                  |\n\n</details>\n\n#### Some pre-trained ASR models (Streaming)\n\n<details>\n\nPlease see\n\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html>\n\nfor more models. The following table lists only **SOME** of them.\n\n\n|Name | Supported Languages| Description|\n|-----|-----|----|\n|[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20][sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]| Chinese, English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16][sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]| Chinese, English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english)|\n|[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23][sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]|Chinese| Suitable for Cortex A7 CPU. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23)|\n|[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17][sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]|English|Suitable for Cortex A7 CPU. 
See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-en-20m-2023-02-17)|\n|[sherpa-onnx-streaming-zipformer-korean-2024-06-16][sherpa-onnx-streaming-zipformer-korean-2024-06-16]|Korean| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean)|\n|[sherpa-onnx-streaming-zipformer-fr-2023-04-14][sherpa-onnx-streaming-zipformer-fr-2023-04-14]|French| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french)|\n\n</details>\n\n\n#### Some pre-trained ASR models (Non-Streaming)\n\n<details>\n\nPlease see\n\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/index.html>\n  - <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html>\n\nfor more models. 
The following table lists only **SOME** of them.\n\n|Name | Supported Languages| Description|\n|-----|-----|----|\n|[sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english)| English | It is converted from <https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2>|\n|[Whisper tiny.en](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2)|English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html)|\n|[Moonshine tiny][Moonshine tiny]|English|See [also](https://github.com/usefulsensors/moonshine)|\n|[sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese)|Chinese| A Zipformer CTC model|\n|[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17][sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]|Chinese, Cantonese, English, Korean, Japanese| 支持多种中文方言. See [also](https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html)|\n|[sherpa-onnx-paraformer-zh-2024-03-09][sherpa-onnx-paraformer-zh-2024-03-09]|Chinese, English| 也支持多种中文方言. 
See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2024-03-09-chinese-english)|\n|[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01][sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]|Japanese|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01-japanese)|\n|[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24][sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]|Russian|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24-russian)|\n|[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24][sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]|Russian| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/russian.html#sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24)|\n|[sherpa-onnx-zipformer-ru-2024-09-18][sherpa-onnx-zipformer-ru-2024-09-18]|Russian|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ru-2024-09-18-russian)|\n|[sherpa-onnx-zipformer-korean-2024-06-24][sherpa-onnx-zipformer-korean-2024-06-24]|Korean|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-korean-2024-06-24-korean)|\n|[sherpa-onnx-zipformer-thai-2024-06-20][sherpa-onnx-zipformer-thai-2024-06-20]|Thai| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-thai-2024-06-20-thai)|\n|[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04][sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]|Chinese| 支持多种方言. 
See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/models.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04)|\n\n</details>\n\n### Useful links\n\n- Documentation: https://k2-fsa.github.io/sherpa/onnx/\n- Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi\n\n### How to reach us\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/social-groups.html\nfor the Next-gen Kaldi (新一代 Kaldi) **WeChat group** and **QQ group**.\n\n## Projects using sherpa-onnx\n\n### [BreezeApp](https://github.com/mtkresearch/BreezeApp) from [MediaTek Research](https://github.com/mtkresearch)\n\n> BreezeAPP is a mobile AI application developed for both Android and iOS platforms.\n> Users can download it directly from the App Store and enjoy a variety of features\n> offline, including speech-to-text, text-to-speech, text-based chatbot interactions,\n> and image question-answering\n\n  - [Download APK for BreezeAPP](https://huggingface.co/MediaTek-Research/BreezeApp/resolve/main/BreezeApp.apk)\n  - [APK mirror for China](https://hf-mirror.com/MediaTek-Research/BreezeApp/blob/main/BreezeApp.apk)\n\n| 1 | 2 | 3 |\n|---|---|---|\n|![](https://github.com/user-attachments/assets/1cdbc057-b893-4de6-9e9c-f1d7dfd1d992)|![](https://github.com/user-attachments/assets/d77cd98e-b057-442f-860d-d5befd5c769b)|![](https://github.com/user-attachments/assets/57e546bf-3d39-45b9-b392-b48ca4fb3c58)|\n\n### [Open-LLM-VTuber](https://github.com/t41372/Open-LLM-VTuber)\n\nTalk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking\nface, running locally across platforms.\n\nSee also <https://github.com/t41372/Open-LLM-VTuber/pull/50>\n\n### [voiceapi](https://github.com/ruzhila/voiceapi)\n\n<details>\n  <summary>Streaming ASR and TTS based on FastAPI</summary>\n\n\nIt shows how to use the ASR and TTS Python APIs with FastAPI.\n</details>\n\n### [腾讯会议摸鱼工具 TMSpeech](https://github.com/jxlpzqc/TMSpeech)\n\nIt uses streaming ASR in C# with a graphical user interface.\n\nVideo demo in 
Chinese: [【开源】Windows实时字幕软件（网课/开会必备）](https://www.bilibili.com/video/BV1rX4y1p7Nx)\n\n### [lol互动助手](https://github.com/l1veIn/lol-wom-electron)\n\nIt uses the JavaScript API of sherpa-onnx along with [Electron](https://electronjs.org/).\n\nVideo demo in Chinese: [爆了！炫神教你开打字挂！真正影响胜率的英雄联盟工具！英雄联盟的最后一块拼图！和游戏中的每个人无障碍沟通！](https://www.bilibili.com/video/BV142tje9E74)\n\n### [Sherpa-ONNX 语音识别服务器](https://github.com/hfyydd/sherpa-onnx-server)\n\nA Node.js server that provides a RESTful API for speech recognition.\n\n### [QSmartAssistant](https://github.com/xinhecuican/QSmartAssistant)\n\nA modular, low-footprint conversational robot/smart speaker that can run fully offline.\n\nIt uses Qt. Both [ASR](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#asr)\nand [TTS](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#tts)\nare used.\n\n### [Flutter-EasySpeechRecognition](https://github.com/Jason-chen-coder/Flutter-EasySpeechRecognition)\n\nIt extends [./flutter-examples/streaming_asr](./flutter-examples/streaming_asr) by\ndownloading models inside the app to reduce the size of the app.\n\nNote: [[Team B] Sherpa AI backend](https://github.com/umgc/spring2025/pull/82) also uses\nsherpa-onnx in a Flutter app.\n\n### [sherpa-onnx-unity](https://github.com/xue-fei/sherpa-onnx-unity)\n\nsherpa-onnx in Unity. 
See also [#1695](https://github.com/k2-fsa/sherpa-onnx/issues/1695),\n[#1892](https://github.com/k2-fsa/sherpa-onnx/issues/1892), and [#1859](https://github.com/k2-fsa/sherpa-onnx/issues/1859)\n\n### [xiaozhi-esp32-server](https://github.com/xinnan-tech/xiaozhi-esp32-server)\n\nA backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.\n\nSee also\n\n  - [ASR新增轻量级sherpa-onnx-asr](https://github.com/xinnan-tech/xiaozhi-esp32-server/issues/315)\n  - [feat: ASR增加sherpa-onnx模型](https://github.com/xinnan-tech/xiaozhi-esp32-server/pull/379)\n\n### [KaithemAutomation](https://github.com/EternityForest/KaithemAutomation)\n\nPure Python, GUI-focused home automation/consumer grade SCADA.\n\nIt uses TTS from sherpa-onnx. See also [✨ Speak command that uses the new globally configured TTS model.](https://github.com/EternityForest/KaithemAutomation/commit/8e64d2b138725e426532f7d66bb69dd0b4f53693)\n\n### [Open-XiaoAI KWS](https://github.com/idootop/open-xiaoai-kws)\n\nEnable custom wake word for XiaoAi Speakers. 
\n\nVideo demo in Chinese: [小爱同学启动～˶╹ꇴ╹˶！](https://www.bilibili.com/video/BV1YfVUz5EMj)\n\n### [C++ WebSocket ASR Server](https://github.com/mawwalker/stt-server)\n\nIt provides a WebSocket server based on C++ for ASR using sherpa-onnx.\n\n### [Go WebSocket Server](https://github.com/bbeyondllove/asr_server)\n\nIt provides a WebSocket server based on the Go programming language for sherpa-onnx.\n\n### [Making robot Paimon, Ep10 \"The AI Part 1\"](https://www.youtube.com/watch?v=KxPKkwxGWZs)\n\nIt is a [YouTube video](https://www.youtube.com/watch?v=KxPKkwxGWZs),\nshowing how the author tried to use AI so that he could have a conversation with Paimon.\n\nIt uses sherpa-onnx for speech-to-text and text-to-speech.\n\n|1|\n|---|\n|![](https://github.com/user-attachments/assets/f6eea2d5-1807-42cb-9160-be8da2971e1f)|\n\n### [TtsReader - Desktop application](https://github.com/ys-pro-duction/TtsReader)\n\nA desktop text-to-speech application built using Kotlin Multiplatform.\n\n### [MentraOS](https://github.com/Mentra-Community/MentraOS)\n\n> Smart glasses OS, with dozens of built-in apps. Users get AI assistant, notifications,\n> translation, screen mirror, captions, and more. 
Devs get to write 1 app that runs on\n> any pair of smart glasses.\n\nIt uses sherpa-onnx for real-time speech recognition on iOS and Android devices.\nSee also <https://github.com/Mentra-Community/MentraOS/pull/861>\n\nIt uses Swift for iOS and Java for Android.\n\n### [flet_sherpa_onnx](https://github.com/SamYuan1990/flet_sherpa_onnx)\n\nFlet ASR/STT component based on sherpa-onnx.\nExample: [a chat box agent](https://github.com/SamYuan1990/i18n-agent-action)\n\n### [achatbot-go](https://github.com/ai-bot-pro/achatbot-go)\n\nA multimodal chatbot written in Go that uses sherpa-onnx's speech library API.\n\n### [fcitx5-vinput](https://github.com/xifan2333/fcitx5-vinput)\n\nLocal offline voice input plugin for [Fcitx5](https://github.com/fcitx/fcitx5) (Linux input method framework).\nIt uses C++ with offline ASR for speech recognition, supporting push-to-talk,\ncommand mode, and optional LLM post-processing.\n\nVideo demo in Chinese: [fcitx5-vinput](https://www.bilibili.com/video/BV1a6cUzVE6F)\n\n[silero-vad]: https://github.com/snakers4/silero-vad\n[Raspberry Pi]: https://www.raspberrypi.com/\n[RV1126]: https://www.rock-chips.com/uploads/pdf/2022.8.26/191/RV1126%20Brief%20Datasheet.pdf\n[LicheePi4A]: https://sipeed.com/licheepi4a\n[VisionFive 2]: https://www.starfivetech.com/en/site/boards\n[旭日X3派]: https://developer.horizon.ai/api/v1/fileData/documents_pi/index.html\n[爱芯派]: https://wiki.sipeed.com/hardware/zh/maixIII/ax-pi/axpi.html\n[hf-space-speaker-diarization]: https://huggingface.co/spaces/k2-fsa/speaker-diarization\n[hf-space-speaker-diarization-cn]: https://hf.qhduan.com/spaces/k2-fsa/speaker-diarization\n[hf-space-asr]: https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition\n[hf-space-asr-cn]: https://hf.qhduan.com/spaces/k2-fsa/automatic-speech-recognition\n[Whisper]: https://github.com/openai/whisper\n[hf-space-asr-whisper]: https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition-with-whisper\n[hf-space-asr-whisper-cn]: 
https://hf.qhduan.com/spaces/k2-fsa/automatic-speech-recognition-with-whisper\n[hf-space-tts]: https://huggingface.co/spaces/k2-fsa/text-to-speech\n[hf-space-tts-cn]: https://hf.qhduan.com/spaces/k2-fsa/text-to-speech\n[hf-space-subtitle]: https://huggingface.co/spaces/k2-fsa/generate-subtitles-for-videos\n[hf-space-subtitle-cn]: https://hf.qhduan.com/spaces/k2-fsa/generate-subtitles-for-videos\n[hf-space-audio-tagging]: https://huggingface.co/spaces/k2-fsa/audio-tagging\n[hf-space-audio-tagging-cn]: https://hf.qhduan.com/spaces/k2-fsa/audio-tagging\n[hf-space-source-separation]: https://huggingface.co/spaces/k2-fsa/source-separation\n[hf-space-source-separation-cn]: https://hf.qhduan.com/spaces/k2-fsa/source-separation\n[hf-space-slid-whisper]: https://huggingface.co/spaces/k2-fsa/spoken-language-identification\n[hf-space-slid-whisper-cn]: https://hf.qhduan.com/spaces/k2-fsa/spoken-language-identification\n[wasm-hf-vad]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-sherpa-onnx\n[wasm-ms-vad]: https://modelscope.cn/studios/csukuangfj/web-assembly-vad-sherpa-onnx\n[wasm-hf-streaming-asr-zh-en-zipformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en\n[wasm-ms-streaming-asr-zh-en-zipformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en\n[wasm-hf-streaming-asr-zh-en-paraformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-streaming-asr-zh-en-paraformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer\n[Paraformer-large]: https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary\n[wasm-hf-streaming-asr-zh-en-yue-paraformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-ms-streaming-asr-zh-en-yue-paraformer]: 
https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer\n[wasm-hf-streaming-asr-en-zipformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-en\n[wasm-ms-streaming-asr-en-zipformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-en\n[SenseVoice]: https://github.com/FunAudioLLM/SenseVoice\n[wasm-hf-vad-asr-zh-zipformer-ctc-07-03]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\n[wasm-ms-vad-asr-zh-zipformer-ctc-07-03]: https://modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc/summary\n[wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice\n[wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice\n[wasm-hf-vad-asr-en-whisper-tiny-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-ms-vad-asr-en-whisper-tiny-en]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\n[wasm-hf-vad-asr-en-moonshine-tiny-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-ms-vad-asr-en-moonshine-tiny-en]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\n[wasm-hf-vad-asr-en-zipformer-gigaspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-ms-vad-asr-en-zipformer-gigaspeech]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\n[wasm-hf-vad-asr-zh-zipformer-wenetspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[wasm-ms-vad-asr-zh-zipformer-wenetspeech]: 
https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\n[reazonspeech]: https://research.reazon.jp/_static/reazonspeech_nlp2023.pdf\n[wasm-hf-vad-asr-ja-zipformer-reazonspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[wasm-ms-vad-asr-ja-zipformer-reazonspeech]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-ja-zipformer\n[gigaspeech2]: https://github.com/speechcolab/gigaspeech2\n[wasm-hf-vad-asr-th-zipformer-gigaspeech2]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-th-zipformer\n[wasm-ms-vad-asr-th-zipformer-gigaspeech2]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-th-zipformer\n[telespeech-asr]: https://github.com/tele-ai/telespeech-asr\n[wasm-hf-vad-asr-zh-telespeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-ms-vad-asr-zh-telespeech]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech\n[wasm-hf-vad-asr-zh-en-paraformer-large]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-ms-vad-asr-zh-en-paraformer-large]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\n[wasm-hf-vad-asr-zh-en-paraformer-small]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[wasm-ms-vad-asr-zh-en-paraformer-small]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\n[dolphin]: https://github.com/dataoceanai/dolphin\n[wasm-ms-vad-asr-multi-lang-dolphin-base]: https://modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n[wasm-hf-vad-asr-multi-lang-dolphin-base]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\n\n[wasm-hf-tts-matcha-zh-en]: 
https://huggingface.co/spaces/k2-fsa/web-assembly-zh-en-tts-matcha\n[wasm-hf-tts-matcha-zh]: https://huggingface.co/spaces/k2-fsa/web-assembly-zh-tts-matcha\n[wasm-ms-tts-matcha-zh-en]: https://modelscope.cn/studios/csukuangfj/web-assembly-zh-en-tts-matcha\n[wasm-ms-tts-matcha-zh]: https://modelscope.cn/studios/csukuangfj/web-assembly-zh-tts-matcha\n[wasm-hf-tts-matcha-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-en-tts-matcha\n[wasm-ms-tts-matcha-en]: https://modelscope.cn/studios/csukuangfj/web-assembly-en-tts-matcha\n[wasm-hf-tts-piper-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-tts-sherpa-onnx-en\n[wasm-ms-tts-piper-en]: https://modelscope.cn/studios/k2-fsa/web-assembly-tts-sherpa-onnx-en\n[wasm-hf-tts-piper-de]: https://huggingface.co/spaces/k2-fsa/web-assembly-tts-sherpa-onnx-de\n[wasm-ms-tts-piper-de]: https://modelscope.cn/studios/k2-fsa/web-assembly-tts-sherpa-onnx-de\n[wasm-hf-speaker-diarization]: https://huggingface.co/spaces/k2-fsa/web-assembly-speaker-diarization-sherpa-onnx\n[wasm-ms-speaker-diarization]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-speaker-diarization-sherpa-onnx\n[wasm-hf-voice-cloning-zipvoice]: https://huggingface.co/spaces/k2-fsa/web-assembly-zh-en-tts-zipvoice\n[wasm-ms-voice-cloning-zipvoice]: https://modelscope.cn/studios/csukuangfj/web-assembly-zh-en-tts-zipvoice\n[wasm-hf-voice-cloning-pocket]: https://huggingface.co/spaces/k2-fsa/web-assembly-en-tts-pocket\n[wasm-ms-voice-cloning-pocket]: https://modelscope.cn/studios/csukuangfj/web-assembly-en-tts-pocket\n[apk-speaker-diarization]: https://k2-fsa.github.io/sherpa/onnx/speaker-diarization/apk.html\n[apk-speaker-diarization-cn]: https://k2-fsa.github.io/sherpa/onnx/speaker-diarization/apk-cn.html\n[apk-streaming-asr]: https://k2-fsa.github.io/sherpa/onnx/android/apk.html\n[apk-streaming-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/android/apk-cn.html\n[apk-simula-streaming-asr]: 
https://k2-fsa.github.io/sherpa/onnx/android/apk-simulate-streaming-asr.html\n[apk-simula-streaming-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/android/apk-simulate-streaming-asr-cn.html\n[apk-tts]: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html\n[apk-tts-cn]: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine-cn.html\n[apk-vad]: https://k2-fsa.github.io/sherpa/onnx/vad/apk.html\n[apk-vad-cn]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-cn.html\n[apk-vad-asr]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr.html\n[apk-vad-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr-cn.html\n[apk-2pass]: https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass.html\n[apk-2pass-cn]: https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass-cn.html\n[apk-at]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html\n[apk-at-cn]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-cn.html\n[apk-at-wearos]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos.html\n[apk-at-wearos-cn]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos-cn.html\n[apk-sid]: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html\n[apk-sid-cn]: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk-cn.html\n[apk-slid]: https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk.html\n[apk-slid-cn]: https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk-cn.html\n[apk-kws]: https://k2-fsa.github.io/sherpa/onnx/kws/apk.html\n[apk-kws-cn]: https://k2-fsa.github.io/sherpa/onnx/kws/apk-cn.html\n[apk-flutter-streaming-asr]: https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#streaming-speech-recognition-stt-asr\n[apk-flutter-streaming-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#streaming-speech-recognition-stt-asr\n[flutter-tts-android]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-android.html\n[flutter-tts-android-cn]: 
https://k2-fsa.github.io/sherpa/onnx/flutter/tts-android-cn.html\n[flutter-tts-linux]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-linux.html\n[flutter-tts-linux-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-linux-cn.html\n[flutter-tts-macos-x64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-x64.html\n[flutter-tts-macos-arm64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-arm64-cn.html\n[flutter-tts-macos-arm64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-arm64.html\n[flutter-tts-macos-x64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-x64-cn.html\n[flutter-tts-win-x64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-win.html\n[flutter-tts-win-x64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-win-cn.html\n[lazarus-subtitle]: https://k2-fsa.github.io/sherpa/onnx/lazarus/download-generated-subtitles.html\n[lazarus-subtitle-cn]: https://k2-fsa.github.io/sherpa/onnx/lazarus/download-generated-subtitles-cn.html\n[asr-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n[tts-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n[vad-models]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n[kws-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\n[at-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n[sid-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n[slid-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n[punct-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n[speaker-segmentation-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n[GigaSpeech]: https://github.com/SpeechColab/GigaSpeech\n[WenetSpeech]: https://github.com/wenet-e2e/WenetSpeech\n[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]: 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-korean-2024-06-16]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-korean-2024-06-16.tar.bz2\n[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2\n[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n[sherpa-onnx-zipformer-ru-2024-09-18]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ru-2024-09-18.tar.bz2\n[sherpa-onnx-zipformer-korean-2024-06-24]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-korean-2024-06-24.tar.bz2\n[sherpa-onnx-zipformer-thai-2024-06-20]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-thai-2024-06-20.tar.bz2\n[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-paraformer-zh-2024-03-09]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2024-03-09.tar.bz2\n[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]: 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2\n[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n[sherpa-onnx-streaming-zipformer-fr-2023-04-14]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-fr-2023-04-14.tar.bz2\n[Moonshine tiny]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n[NVIDIA Jetson Orin NX]: https://developer.download.nvidia.com/assets/embedded/secure/jetson/orin_nx/docs/Jetson_Orin_NX_DS-10712-001_v0.5.pdf?RCPGu9Q6OVAOv7a7vgtwc9-BLScXRIWq6cSLuditMALECJ_dOj27DgnqAPGVnT2VpiNpQan9SyFy-9zRykR58CokzbXwjSA7Gj819e91AXPrWkGZR3oS1VLxiDEpJa_Y0lr7UT-N4GnXtb8NlUkP4GkCkkF_FQivGPrAucCUywL481GH_WpP_p7ziHU1Wg==&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczovL3d3dy5nb29nbGUuY29tLmhrLyJ9\n[NVIDIA Jetson Nano B01]: https://www.seeedstudio.com/blog/2020/01/16/new-revision-of-jetson-nano-dev-kit-now-supports-new-jetson-nano-module/\n[speech-enhancement-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n[source-separation-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/source-separation-models\n[RK3588]: https://www.rock-chips.com/uploads/pdf/2022.8.26/192/RK3588%20Brief%20Datasheet.pdf\n[spleeter]: https://github.com/deezer/spleeter\n[UVR]: https://github.com/Anjok07/ultimatevocalremovergui\n[gtcrn]: https://github.com/Xiaobin-Rong/gtcrn\n[tts-url]: https://k2-fsa.github.io/sherpa/onnx/tts/all-in-one.html\n[ss-url]: https://k2-fsa.github.io/sherpa/onnx/source-separation/index.html\n[sd-url]: 
https://k2-fsa.github.io/sherpa/onnx/speaker-diarization/index.html\n[slid-url]: https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/index.html\n[at-url]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/index.html\n[vad-url]: https://k2-fsa.github.io/sherpa/onnx/vad/index.html\n[kws-url]: https://k2-fsa.github.io/sherpa/onnx/kws/index.html\n[punct-url]: https://k2-fsa.github.io/sherpa/onnx/punctuation/index.html\n[se-url]: https://k2-fsa.github.io/sherpa/onnx/speech-enhancement/index.html\n[rknpu-doc]: https://k2-fsa.github.io/sherpa/onnx/rknn/index.html\n[qnn-doc]: https://k2-fsa.github.io/sherpa/onnx/qnn/index.html\n[ascend-doc]: https://k2-fsa.github.io/sherpa/onnx/ascend/index.html\n[axera-npu]: https://axera-tech.com/Skill/166.html\n"
  },
  {
    "path": "android/.gitignore",
    "content": "# Gradle files\n.gradle/\nbuild/\n\n# Local configuration file (sdk path, etc)\nlocal.properties\n\n# Log/OS Files\n*.log\n\n# Android Studio generated files and folders\ncaptures/\n.externalNativeBuild/\n.cxx/\n*.apk\noutput.json\n\n# IntelliJ\n*.iml\n.idea/\nmisc.xml\ndeploymentTargetDropDown.xml\nrender.experimental.xml\n\n# Keystore files\n*.jks\n*.keystore\n\n# Google Services (e.g. APIs or Firebase)\ngoogle-services.json\n\n# Android Profiling\n*.hprof\n*.so\n"
  },
  {
    "path": "android/README.md",
    "content": "# Introduction\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/android/index.html\nfor usage.\n\n|Folder| Pre-built APK | Description|\n|------|---------------|-------------|\n|[SherpaOnnxSpeakerDiarization](./SherpaOnnxSpeakerDiarization)| | It is for speaker diarization.|\n|[SherpaOnnx](./SherpaOnnx)| [URL](https://k2-fsa.github.io/sherpa/onnx/android/apk.html)| It uses a streaming ASR model.|\n|[SherpaOnnx2Pass](./SherpaOnnx2Pass)|[URL](https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass.html)| It uses a streaming ASR model for the first pass and use a non-streaming ASR model for the second pass|\n|[SherpaOnnxKws](./SherpaOnnxKws)|[URL](https://k2-fsa.github.io/sherpa/onnx/kws/apk.html)| It demonstrates how to use keyword spotting|\n|[SherpaOnnxSpeakerIdentification](./SherpaOnnxSpeakerIdentification)|[URL](https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html)| It demonstrates how to use speaker identification|\n|[SherpaOnnxTts](./SherpaOnnxTts)|[URL](https://k2-fsa.github.io/sherpa/onnx/tts/apk.html)| It is for standalone text-to-speech.|\n|[SherpaOnnxTtsEngine](./SherpaOnnxTtsEngine)|[URL](https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html)| It is for text-to-speech engine; you can use it to replace the system TTS engine, e.g., use it in a e-book reader app|\n|[SherpaOnnxVad](./SherpaOnnxVad)|[URL](https://k2-fsa.github.io/sherpa/onnx/vad/apk.html)| It demonstrates how to use a VAD|\n|[SherpaOnnxVadAsr](./SherpaOnnxVadAsr)|[URL](https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr.html)| It uses a VAD with a non-streaming ASR model.|\n|[SherpaOnnxWebSocket](./SherpaOnnxWebSocket)| |It shows how to write a websocket client for the [Python streaming websocket server](https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/streaming_server.py).|\n|[SherpaOnnxAudioTagging](./SherpaOnnxAudioTagging)|[URL](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html)| It shows how to use audio 
tagging.|\n|[SherpaOnnxAudioTaggingWearOS](./SherpaOnnxAudioTagging)|[URL](https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos.html)| It shows how to use audio tagging on WearOS.|\n|[SherpaOnnxSimulateStreamingAsr](./SherpaOnnxSimulateStreamingAsr)|| It shows how to use a non-streaming ASR model for streaming speech recognition.|\n|[SherpaOnnxSimulateStreamingAsrWearOs](./SherpaOnnxSimulateStreamingAsrWearOs)|| It shows how to use a non-streaming ASR model for streaming speech recognition with WearOS.|\n"
  },
  {
    "path": "android/SherpaOnnx/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnx/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnx/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 32\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 32\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.5.1'\n    implementation 'com.google.android.material:material:1.7.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.4'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.0'\n}"
  },
  {
    "path": "android/SherpaOnnx/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnx/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnx\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:label=\"ASR: Next-gen Kaldi\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>\n"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.text.method.ScrollingMovementMethod\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.TextView\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport java.io.File\nimport java.io.FileOutputStream\nimport java.io.IOException\nimport kotlin.concurrent.thread\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\n// To enable microphone in android emulator, use\n//\n// adb emu avd hostmicon\n\nclass MainActivity : AppCompatActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    private lateinit var recognizer: OnlineRecognizer\n    private var audioRecord: AudioRecord? = null\n    private lateinit var recordButton: Button\n    private lateinit var textView: TextView\n    private var recordingThread: Thread? 
= null\n\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n    private var idx: Int = 0\n    private var lastText: String = \"\"\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        // grantResults is empty if the permission request was cancelled\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            finish()\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        Log.i(TAG, \"Start to initialize model\")\n        initModel()\n        Log.i(TAG, \"Finished initializing model\")\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.setOnClickListener { onclick() }\n\n        textView = findViewById(R.id.my_text)\n        textView.movementMethod = ScrollingMovementMethod()\n    }\n\n    private fun onclick() {\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n            textView.text = \"\"\n            lastText = \"\"\n            idx = 0\n\n            recordingThread = thread(true) {\n                processSamples()\n            }\n            Log.i(TAG, \"Started recording\")\n        } else {\n            isRecording = false\n            audioRecord!!.stop()\n            audioRecord!!.release()\n            audioRecord = null\n            recordButton.setText(R.string.start)\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n        val stream = recognizer.createStream()\n\n        val interval = 0.1 // i.e., 100 ms\n        val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n        val buffer = ShortArray(bufferSize)\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n                
stream.acceptWaveform(samples, sampleRate = sampleRateInHz)\n                while (recognizer.isReady(stream)) {\n                    recognizer.decode(stream)\n                }\n\n                val isEndpoint = recognizer.isEndpoint(stream)\n                var text = recognizer.getResult(stream).text\n\n                // For streaming paraformer, we need to manually add some\n                // tail padding so that it has enough right context to\n                // recognize the last word of this segment\n                if (isEndpoint && recognizer.config.modelConfig.paraformer.encoder.isNotBlank()) {\n                    val tailPaddings = FloatArray((0.8 * sampleRateInHz).toInt())\n                    stream.acceptWaveform(tailPaddings, sampleRate = sampleRateInHz)\n                    while (recognizer.isReady(stream)) {\n                        recognizer.decode(stream)\n                    }\n                    text = recognizer.getResult(stream).text\n                }\n\n                var textToDisplay = lastText\n\n                if (text.isNotBlank()) {\n                    textToDisplay = if (lastText.isBlank()) {\n                        \"${idx}: $text\"\n                    } else {\n                        \"${lastText}\\n${idx}: $text\"\n                    }\n                }\n\n                if (isEndpoint) {\n                    recognizer.reset(stream)\n                    if (text.isNotBlank()) {\n                        lastText = \"${lastText}\\n${idx}: $text\"\n                        textToDisplay = lastText\n                        idx += 1\n                    }\n                }\n\n                runOnUiThread {\n                    textView.text = textToDisplay\n                }\n            }\n        }\n        stream.release()\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, Manifest.permission.RECORD_AUDIO\n            ) != 
PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        val numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes * 1000.0f / sampleRateInHz}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n        )\n        return true\n    }\n\n    private fun initModel() {\n        // Please change getModelConfig() to add new models\n        // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // for a list of available models\n        val type = 0\n        var ruleFsts : String?\n        ruleFsts = null\n\n        val useHr = false\n        val hr =  HomophoneReplacerConfig(\n            // Used only when useHr is true\n            // Please download the following 3 files from\n            // https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n            //\n            // dict and lexicon.txt can be shared by different apps\n            //\n            // replace.fst is specific for an app\n            lexicon = \"lexicon.txt\",\n            ruleFsts = \"replace.fst\",\n        )\n\n        Log.i(TAG, \"Select model type $type\")\n        var config = OnlineRecognizerConfig(\n            featConfig = getFeatureConfig(sampleRate = sampleRateInHz, featureDim = 80),\n            modelConfig = getModelConfig(type = type)!!,\n            // lmConfig = getOnlineLMConfig(type = type),\n            endpointConfig = getEndpointConfig(),\n            enableEndpoint = true,\n        )\n\n        if (ruleFsts != null) {\n            config.ruleFsts = ruleFsts\n        }\n\n        if (useHr) {\n            config.hr = 
hr\n        }\n\n        recognizer = OnlineRecognizer(\n            assetManager = application.assets,\n            config = config,\n        )\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/jniLibs/.gitignore",
    "content": "*.so\n*.txt\n*.onnx\n*.wav\n"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <LinearLayout\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"match_parent\"\n        android:gravity=\"center\"\n        android:orientation=\"vertical\">\n\n        <TextView\n            android:id=\"@+id/my_text\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"match_parent\"\n            android:layout_weight=\"2.5\"\n            android:padding=\"24dp\"\n            android:scrollbars=\"vertical\"\n            android:singleLine=\"false\"\n            android:text=\"@string/hint\"\n            app:layout_constraintBottom_toBottomOf=\"parent\"\n            app:layout_constraintEnd_toEndOf=\"parent\"\n            app:layout_constraintStart_toStartOf=\"parent\"\n            app:layout_constraintTop_toTopOf=\"parent\" />\n\n        <Button\n            android:id=\"@+id/record_button\"\n            android:layout_width=\"wrap_content\"\n            android:layout_height=\"wrap_content\"\n            android:layout_weight=\"0.5\"\n            android:text=\"@string/start\" />\n    </LinearLayout>\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">ASR</string>\n    <string name=\"hint\">Click the Start button to try speech-to-text with Next-gen Kaldi.\n        \\n\n        \\n\\n\\n\n        The source code and pre-trained models are publicly available.\n        Please see https://github.com/k2-fsa/sherpa-onnx for details.\n    </string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older than API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnx/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnx/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnx/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnx/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Thu Feb 23 11:09:06 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnx/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnx/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnx/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnx/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnx\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 32\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 32\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.5.1'\n    implementation 'com.google.android.material:material:1.7.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.4'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.0'\n}"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/.gitignore",
    "content": "*.so\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnx2Pass\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:label=\"2pass ASR: Next-gen Kaldi\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.text.method.ScrollingMovementMethod\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.TextView\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport kotlin.concurrent.thread\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\n// adb emu avd hostmicon\n// to enable microphone inside the emulator\nclass MainActivity : AppCompatActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    private lateinit var onlineRecognizer: OnlineRecognizer\n    private lateinit var offlineRecognizer: OfflineRecognizer\n    private var audioRecord: AudioRecord? = null\n    private lateinit var recordButton: Button\n    private lateinit var textView: TextView\n    private var recordingThread: Thread? 
= null\n\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    private var samplesBuffer = arrayListOf<FloatArray>()\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n    private var idx: Int = 0\n    private var lastText: String = \"\"\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            // grantResults is empty if the permission request was cancelled\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            finish()\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        Log.i(TAG, \"Start to initialize first-pass recognizer\")\n        initOnlineRecognizer()\n        Log.i(TAG, \"Finished initializing first-pass recognizer\")\n\n        Log.i(TAG, \"Start to initialize second-pass recognizer\")\n        initOfflineRecognizer()\n        Log.i(TAG, \"Finished initializing second-pass recognizer\")\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.setOnClickListener { onclick() }\n\n        textView = findViewById(R.id.my_text)\n        textView.movementMethod = ScrollingMovementMethod()\n    }\n\n    private fun onclick() {\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n            samplesBuffer.clear()\n            textView.text = \"\"\n            lastText = \"\"\n            idx = 0\n\n            recordingThread = thread(true) {\n                processSamples()\n            }\n            Log.i(TAG, \"Started recording\")\n        } else {\n            isRecording = false\n            audioRecord!!.stop()\n            audioRecord!!.release()\n            audioRecord = null\n            recordButton.setText(R.string.start)\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n        val stream = onlineRecognizer.createStream()\n\n        val interval = 0.1 // i.e., 100 ms\n        val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n        val buffer = 
ShortArray(bufferSize)\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n                samplesBuffer.add(samples)\n\n                stream.acceptWaveform(samples, sampleRate = sampleRateInHz)\n                while (onlineRecognizer.isReady(stream)) {\n                    onlineRecognizer.decode(stream)\n                }\n                val isEndpoint = onlineRecognizer.isEndpoint(stream)\n                var textToDisplay = lastText\n\n                var text = onlineRecognizer.getResult(stream).text\n                if (text.isNotBlank()) {\n                    textToDisplay = if (lastText.isBlank()) {\n                        // textView.text = \"${idx}: ${text}\"\n                        \"${idx}: $text\"\n                    } else {\n                        \"${lastText}\\n${idx}: $text\"\n                    }\n                }\n\n                if (isEndpoint) {\n                    onlineRecognizer.reset(stream)\n\n                    if (text.isNotBlank()) {\n                        text = runSecondPass()\n                        lastText = \"${lastText}\\n${idx}: $text\"\n                        idx += 1\n                    } else {\n                        samplesBuffer.clear()\n                    }\n                }\n\n                runOnUiThread {\n                    textView.text = textToDisplay.lowercase()\n                }\n            }\n        }\n        stream.release()\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, Manifest.permission.RECORD_AUDIO\n            ) != PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        val numBytes = 
AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        // getMinBufferSize() returns a size in bytes; 16-bit mono PCM uses 2 bytes per sample\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes * 1000.0f / (2 * sampleRateInHz)}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // twice the minimum buffer size (in bytes) for extra headroom\n        )\n        return true\n    }\n\n    private fun initOnlineRecognizer() {\n        // Please change getModelConfig() to add new models\n        // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // for a list of available models\n        val firstType = 9\n        val firstRuleFsts: String?\n        firstRuleFsts = null\n        Log.i(TAG, \"Select model type $firstType for the first pass\")\n        val config = OnlineRecognizerConfig(\n            featConfig = getFeatureConfig(sampleRate = sampleRateInHz, featureDim = 80),\n            modelConfig = getModelConfig(type = firstType)!!,\n            endpointConfig = getEndpointConfig(),\n            enableEndpoint = true,\n        )\n        if (firstRuleFsts != null) {\n            config.ruleFsts = firstRuleFsts\n        }\n\n        onlineRecognizer = OnlineRecognizer(\n            assetManager = application.assets,\n            config = config,\n        )\n    }\n\n    private fun initOfflineRecognizer() {\n        // Please change getOfflineModelConfig() to add new models\n        // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // for a list of available models\n        val secondType = 0\n        var secondRuleFsts: String?\n        secondRuleFsts = null\n        Log.i(TAG, \"Select model type $secondType for the second pass\")\n\n        val config = OfflineRecognizerConfig(\n            featConfig = getFeatureConfig(sampleRate = sampleRateInHz, featureDim = 80),\n            modelConfig = getOfflineModelConfig(type = 
secondType)!!,\n        )\n\n        if (secondRuleFsts != null) {\n            config.ruleFsts = secondRuleFsts\n        }\n\n        offlineRecognizer = OfflineRecognizer(\n            assetManager = application.assets,\n            config = config,\n        )\n    }\n\n    private fun runSecondPass(): String {\n        var totalSamples = 0\n        for (a in samplesBuffer) {\n            totalSamples += a.size\n        }\n        var i = 0\n\n        val samples = FloatArray(totalSamples)\n\n        // todo(fangjun): Make it more efficient\n        for (a in samplesBuffer) {\n            for (s in a) {\n                samples[i] = s\n                i += 1\n            }\n        }\n\n\n        val n = maxOf(0, samples.size - 8000)\n\n        samplesBuffer.clear()\n        samplesBuffer.add(samples.sliceArray(n until samples.size))\n\n        val stream = offlineRecognizer.createStream()\n        stream.acceptWaveform(samples.sliceArray(0..n), sampleRateInHz)\n        offlineRecognizer.decode(stream)\n        val result = offlineRecognizer.getResult(stream)\n\n        stream.release()\n\n        return result.text\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/jniLibs/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <LinearLayout\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"match_parent\"\n        android:gravity=\"center\"\n        android:orientation=\"vertical\">\n\n        <TextView\n            android:id=\"@+id/my_text\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"match_parent\"\n            android:layout_weight=\"2.5\"\n            android:padding=\"24dp\"\n            android:scrollbars=\"vertical\"\n            android:singleLine=\"false\"\n            android:text=\"@string/hint\"\n            app:layout_constraintBottom_toBottomOf=\"parent\"\n            app:layout_constraintEnd_toEndOf=\"parent\"\n            app:layout_constraintStart_toStartOf=\"parent\"\n            android:gravity=\"bottom\"\n            app:layout_constraintTop_toTopOf=\"parent\" />\n\n        <Button\n            android:id=\"@+id/record_button\"\n            android:layout_width=\"wrap_content\"\n            android:layout_height=\"wrap_content\"\n            android:layout_weight=\"0.5\"\n            android:text=\"@string/start\" />\n    </LinearLayout>\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">ASR2pass </string>\n    <string name=\"hint\">Click the Start button to play speech-to-text with Next-gen Kaldi.\n        \\n\n        \\n\\n\\n\n        The source code and pre-trained models are publicly available.\n        Please see https://github.com/k2-fsa/sherpa-onnx for details.\n        \\n\\n\n        Two-pass speech recognition with Next-gen Kaldi.\n    </string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx2Pass\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx2Pass\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnx2Pass/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnx2Pass/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnx2Pass/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sun Sep 10 18:03:03 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnx2Pass/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnx2Pass/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnx2Pass\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxAar/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxAar/README.md",
    "content": "# Usage of this project\n\n```\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-v1.12.31-android.tar.bz2\ntar xvf sherpa-onnx-v1.12.31-android.tar.bz2\n\ncp -v jniLibs/arm64-v8a/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/arm64-v8a/\ncp -v jniLibs/armeabi-v7a/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/armeabi-v7a/\ncp -v jniLibs/x86/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86/\ncp -v jniLibs/x86_64/* android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86_64/\n\ncd android/SherpaOnnxAar\n\n./gradlew :sherpa_onnx:assembleRelease\nls -lh ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar\ncp ./sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar ../../sherpa-onnx-1.12.31.aar\n```\n"
  },
  {
    "path": "android/SherpaOnnxAar/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    alias(libs.plugins.android.application) apply false\n    alias(libs.plugins.jetbrains.kotlin.android) apply false\n    alias(libs.plugins.android.library) apply false\n}"
  },
  {
    "path": "android/SherpaOnnxAar/gradle/libs.versions.toml",
    "content": "[versions]\nagp = \"8.4.0\"\nkotlin = \"1.7.20\"\ncoreKtx = \"1.15.0\"\njunit = \"4.13.2\"\njunitVersion = \"1.2.1\"\nespressoCore = \"3.6.1\"\nappcompat = \"1.7.0\"\nmaterial = \"1.12.0\"\n\n[libraries]\nandroidx-core-ktx = { group = \"androidx.core\", name = \"core-ktx\", version.ref = \"coreKtx\" }\njunit = { group = \"junit\", name = \"junit\", version.ref = \"junit\" }\nandroidx-junit = { group = \"androidx.test.ext\", name = \"junit\", version.ref = \"junitVersion\" }\nandroidx-espresso-core = { group = \"androidx.test.espresso\", name = \"espresso-core\", version.ref = \"espressoCore\" }\nandroidx-appcompat = { group = \"androidx.appcompat\", name = \"appcompat\", version.ref = \"appcompat\" }\nmaterial = { group = \"com.google.android.material\", name = \"material\", version.ref = \"material\" }\n\n[plugins]\nandroid-application = { id = \"com.android.application\", version.ref = \"agp\" }\njetbrains-kotlin-android = { id = \"org.jetbrains.kotlin.android\", version.ref = \"kotlin\" }\nandroid-library = { id = \"com.android.library\", version.ref = \"agp\" }\n\n"
  },
  {
    "path": "android/SherpaOnnxAar/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Thu Dec 12 14:02:30 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.6-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxAar/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. For more details, visit\n# https://developer.android.com/r/tools/gradle-multi-project-decoupled-projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxAar/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxAar/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxAar/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google {\n            content {\n                includeGroupByRegex(\"com\\\\.android.*\")\n                includeGroupByRegex(\"com\\\\.google.*\")\n                includeGroupByRegex(\"androidx.*\")\n            }\n        }\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxAar\"\ninclude(\":sherpa_onnx\")\n"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/build.gradle.kts",
    "content": "plugins {\n    alias(libs.plugins.android.library)\n    alias(libs.plugins.jetbrains.kotlin.android)\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx\"\n    compileSdk = 34\n\n    defaultConfig {\n        minSdk = 21\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        consumerProguardFiles(\"consumer-rules.pro\")\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n}\n\ndependencies {\n\n    implementation(libs.androidx.core.ktx)\n    implementation(libs.androidx.appcompat)\n    implementation(libs.material)\n    testImplementation(libs.junit)\n    androidTestImplementation(libs.androidx.junit)\n    androidTestImplementation(libs.androidx.espresso.core)\n}"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/consumer-rules.pro",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.test\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAar/sherpa_onnx/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/build.gradle.kts",
    "content": "plugins {\n    id(\"com.android.application\")\n    id(\"org.jetbrains.kotlin.android\")\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.audio.tagging\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.audio.tagging\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(\"androidx.core:core-ktx:1.12.0\")\n    implementation(\"androidx.lifecycle:lifecycle-runtime-ktx:2.7.0\")\n    implementation(\"androidx.activity:activity-compose:1.8.2\")\n    implementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    implementation(\"androidx.compose.ui:ui\")\n    implementation(\"androidx.compose.ui:ui-graphics\")\n    implementation(\"androidx.compose.ui:ui-tooling-preview\")\n    implementation(\"androidx.compose.material3:material3\")\n    testImplementation(\"junit:junit:4.13.2\")\n    androidTestImplementation(\"androidx.test.ext:junit:1.1.5\")\n    androidTestImplementation(\"androidx.test.espresso:espresso-core:3.5.1\")\n    
androidTestImplementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    androidTestImplementation(\"androidx.compose.ui:ui-test-junit4\")\n    debugImplementation(\"androidx.compose.ui:ui-tooling\")\n    debugImplementation(\"androidx.compose.ui:ui-test-manifest\")\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/androidTest/java/com/k2fsa/sherpa/onnx/audio/tagging/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.audio.tagging\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxAudioTagging\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SherpaOnnxAudioTagging\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/assets/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/Home.kt",
    "content": "@file:OptIn(ExperimentalMaterial3Api::class, ExperimentalFoundationApi::class)\n\npackage com.k2fsa.sherpa.onnx.audio.tagging\n\nimport android.Manifest\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport androidx.compose.foundation.ExperimentalFoundationApi\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.PaddingValues\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.lazy.LazyColumn\nimport androidx.compose.foundation.lazy.items\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.CenterAlignedTopAppBar\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Slider\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBarDefaults\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateListOf\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.text.style.TextAlign\nimport 
androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.AudioEvent\nimport kotlin.concurrent.thread\n\n\n@Composable\nfun Home() {\n    Scaffold(\n        topBar = {\n            CenterAlignedTopAppBar(\n                colors = TopAppBarDefaults.topAppBarColors(\n                    containerColor = MaterialTheme.colorScheme.primaryContainer,\n                    titleContentColor = MaterialTheme.colorScheme.primary,\n                ),\n                title = {\n                    Text(\n                        \"Next-gen Kaldi: Audio tagging\",\n                        fontWeight = FontWeight.Bold,\n                        fontSize = 15.sp,\n                    )\n                },\n            )\n        },\n        content = {\n            MyApp(it)\n        },\n    )\n}\n\nprivate var audioRecord: AudioRecord? = null\nprivate val sampleRateInHz = 16000\n\n@Composable\nfun MyApp(padding: PaddingValues) {\n    val activity = LocalContext.current as Activity\n    var threshold by remember { mutableStateOf<Float>(0.6F) }\n    var isStarted by remember { mutableStateOf(false) }\n    val result = remember { mutableStateListOf<AudioEvent>() }\n\n\n    val onButtonClick: () -> Unit = {\n        isStarted = !isStarted\n        if (isStarted) {\n            result.clear()\n            if (ActivityCompat.checkSelfPermission(\n                    activity,\n                    Manifest.permission.RECORD_AUDIO\n                ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n             
   audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                thread(true) {\n                    Log.i(TAG, \"processing samples\")\n                    val interval = 0.1 // i.e., 100 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n                    val sampleList = ArrayList<FloatArray>()\n                    audioRecord?.let {\n                        it.startRecording()\n                        while (isStarted) {\n                            val ret = it.read(buffer, 0, buffer.size)\n                            // read() returns a negative error code on failure; skip such reads\n                            if (ret > 0) {\n                                val samples = FloatArray(ret) { i -> buffer[i] / 32768.0f }\n                                sampleList.add(samples)\n                            }\n                        }\n                        // Free the microphone once the user stops recording\n                        it.stop()\n                        it.release()\n                    }\n                    Log.i(TAG, \"Stop recording\")\n                    Log.i(TAG, \"Start recognition\")\n                    val samples = Flatten(sampleList)\n                    val stream = Tagger.tagger.createStream()\n                    stream.acceptWaveform(samples, sampleRateInHz)\n                    val events = Tagger.tagger.compute(stream)\n                    stream.release()\n                    for (e in events) {\n                        if (e.prob > threshold) {\n                            result.add(e)\n                        }\n\n                    }\n\n                }\n            }\n        }\n    }\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter\n    ) {\n        Column(\n            Modifier.padding(padding),\n            horizontalAlignment = 
Alignment.CenterHorizontally,\n        ) {\n            Spacer(modifier = Modifier.height(16.dp))\n            Text(\"Threshold \" + String.format(\"%.1f\", threshold))\n            Slider(\n                value = threshold,\n                onValueChange = { threshold = it },\n                valueRange = 0.1F..1.0F,\n                modifier = Modifier.fillMaxWidth()\n            )\n\n            Button(onClick = onButtonClick) {\n                if (isStarted) {\n                    Text(\"Stop\")\n                } else {\n                    Text(\"Start\")\n                }\n            }\n\n            Spacer(modifier = Modifier.height(16.dp))\n            LazyColumn(modifier = Modifier.fillMaxSize()) {\n                if (!result.isEmpty()) {\n\n                    item {\n                        Row(\n                            modifier = Modifier.fillMaxWidth(),\n                            horizontalArrangement = Arrangement.SpaceEvenly\n                        ) {\n                            Text(\n                                text = \"Event name\",\n                            )\n                            Text(\n                                text = \"Probability\",\n                            )\n                        }\n                    }\n                }\n\n                items(result) { event: AudioEvent ->\n                    ViewRow(event = event)\n                }\n            }\n        }\n    }\n}\n\n@Composable\nfun ShowResult(result: String) {\n    Text(\n        modifier = Modifier.fillMaxWidth(),\n        textAlign = TextAlign.Center,\n        color = MaterialTheme.colorScheme.primary,\n        text = result,\n    )\n}\n\n@Composable\nfun ViewRow(\n    modifier: Modifier = Modifier,\n    event: AudioEvent\n) {\n    Surface(\n        modifier = modifier\n            .fillMaxWidth()\n            .padding(8.dp),\n        color = MaterialTheme.colorScheme.inversePrimary,\n    ) {\n        Row(\n            modifier = 
modifier,\n            horizontalArrangement = Arrangement.Center,\n            verticalAlignment = Alignment.CenterVertically,\n        ) {\n            Text(\n                text = event.name,\n                modifier = modifier.weight(1.0F),\n            )\n            Text(\n                text = \"%.2f\".format(event.prob),\n                modifier = modifier.weight(1.0F),\n            )\n        }\n    }\n}\n\nfun Flatten(sampleList: ArrayList<FloatArray>): FloatArray {\n    var totalSamples = 0\n    for (a in sampleList) {\n        totalSamples += a.size\n    }\n    var i = 0\n    val samples = FloatArray(totalSamples)\n    for (a in sampleList) {\n        for (s in a) {\n            samples[i] = s\n            i += 1\n        }\n    }\n    Log.i(TAG, \"$i, $totalSamples\")\n\n    return samples\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Surface\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.Modifier\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.audio.tagging.ui.theme.SherpaOnnxAudioTaggingTheme\n\nconst val TAG = \"sherpa-onnx\"\n\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\n// adb emu avd hostmicon\n// to enable mic inside the emulator\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n    override fun onCreate(savedInstanceState: Bundle?) {\n\n        super.onCreate(savedInstanceState)\n        setContent {\n            AudioTaggingApp()\n        }\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n        Tagger.initTagger(this.assets)\n    }\n\n    @Suppress(\"DEPRECATION\")\n    @Deprecated(\"Deprecated in Java\")\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs access to the microphone\",\n           
     Toast.LENGTH_SHORT\n            )\n                .show()\n            finish()\n            return\n        }\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@Composable\nfun AudioTaggingApp() {\n    SherpaOnnxAudioTaggingTheme {\n        // A surface container using the 'background' color from the theme\n        Surface(\n            modifier = Modifier.fillMaxSize(),\n            color = MaterialTheme.colorScheme.background\n        ) {\n            Home()\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/Tagger.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging\n\nimport android.content.res.AssetManager\nimport android.util.Log\nimport com.k2fsa.sherpa.onnx.AudioTagging\nimport com.k2fsa.sherpa.onnx.getAudioTaggingConfig\n\n\nobject Tagger {\n    private var _tagger: AudioTagging? = null\n    val tagger: AudioTagging\n        get() {\n            return _tagger!!\n        }\n\n    fun initTagger(assetManager: AssetManager? = null, numThreads: Int = 1) {\n        synchronized(this) {\n            if (_tagger != null) {\n                return\n            }\n\n            Log.i(\"sherpa-onnx\", \"Initializing audio tagger\")\n            val config = getAudioTaggingConfig(type = 0, numThreads = numThreads)!!\n            _tagger = AudioTagging(assetManager, config)\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.SideEffect\nimport androidx.compose.ui.graphics.toArgb\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.platform.LocalView\nimport androidx.core.view.WindowCompat\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SherpaOnnxAudioTaggingTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n    val view = LocalView.current\n    if (!view.isInEditMode) {\n        SideEffect {\n            val window = (view.context as Activity).window\n       
     window.statusBarColor = colorScheme.primary.toArgb()\n            WindowCompat.getInsetsController(window, view).isAppearanceLightStatusBars = darkTheme\n        }\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/jniLibs/arm64-v8a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/jniLibs/armeabi-v7a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/jniLibs/x86/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/jniLibs/x86_64/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">Audio Tagging</string>\n</resources>\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SherpaOnnxAudioTagging\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/app/src/test/java/com/k2fsa/sherpa/onnx/audio/tagging/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id(\"com.android.application\") version \"8.2.0\" apply false\n    id(\"org.jetbrains.kotlin.android\") version \"1.9.0\" apply false\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Tue Apr 16 10:10:01 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxAudioTagging/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxAudioTagging\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/build.gradle.kts",
    "content": "plugins {\n    id(\"com.android.application\")\n    id(\"org.jetbrains.kotlin.android\")\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.audio.tagging.wear.os\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.audio.tagging.wear.os\"\n        minSdk = 26\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(\"com.google.android.gms:play-services-wearable:18.1.0\")\n    implementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    implementation(\"androidx.compose.ui:ui\")\n    implementation(\"androidx.compose.ui:ui-tooling-preview\")\n    implementation(\"androidx.wear.compose:compose-material:1.1.2\")\n    implementation(\"androidx.wear.compose:compose-foundation:1.1.2\")\n    implementation(\"androidx.activity:activity-compose:1.7.2\")\n    implementation(\"androidx.core:core-splashscreen:1.0.1\")\n    implementation(\"androidx.compose.material3:material3-android:1.2.1\")\n    androidTestImplementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    androidTestImplementation(\"androidx.compose.ui:ui-test-junit4\")\n    
debugImplementation(\"androidx.compose.ui:ui-tooling\")\n    debugImplementation(\"androidx.compose.ui:ui-test-manifest\")\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/lint.xml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<lint>\n    <!-- Ignore the IconLocation for the Tile preview images -->\n    <issue id=\"IconLocation\">\n        <ignore path=\"res/drawable/tile_preview.png\" />\n        <ignore path=\"res/drawable-round/tile_preview.png\" />\n    </issue>\n</lint>"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n\n    <uses-permission android:name=\"android.permission.WAKE_LOCK\" />\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <uses-feature android:name=\"android.hardware.type.watch\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@android:style/Theme.DeviceDefault\">\n        <uses-library\n            android:name=\"com.google.android.wearable\"\n            android:required=\"true\" />\n\n        <!--\n               Set to true if your app is Standalone, that is, it does not require the handheld\n               app to run.\n        -->\n        <meta-data\n            android:name=\"com.google.android.wearable.standalone\"\n            android:value=\"true\" />\n\n        <activity\n            android:name=\".presentation.MainActivity\"\n            android:exported=\"true\"\n            android:taskAffinity=\"\"\n            android:theme=\"@style/MainActivityTheme.Starting\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/assets/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/wear/os/presentation/HomeScreen.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging.wear.os.presentation\n\nimport android.Manifest\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport androidx.compose.foundation.background\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.material3.Slider\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.text.style.TextAlign\nimport androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\nimport androidx.core.app.ActivityCompat\nimport androidx.wear.compose.material.Button\nimport androidx.wear.compose.material.MaterialTheme\nimport androidx.wear.compose.material.Text\nimport com.k2fsa.sherpa.onnx.AudioEvent\nimport com.k2fsa.sherpa.onnx.audio.tagging.Tagger\nimport com.k2fsa.sherpa.onnx.audio.tagging.wear.os.presentation.theme.SherpaOnnxAudioTaggingWearOsTheme\nimport kotlin.concurrent.thread\n\nprivate var audioRecord: AudioRecord? 
= null\nprivate val sampleRateInHz = 16000\n\n@Composable\nfun HomeScreen() {\n    val activity = LocalContext.current as Activity\n    var threshold by remember { mutableStateOf<Float>(0.6F) }\n    var firstTime by remember { mutableStateOf(true) }\n    var isStarted by remember { mutableStateOf(false) }\n    var result by remember { mutableStateOf(\"\") }\n    val onButtonClick: () -> Unit = {\n        firstTime = false\n\n        isStarted = !isStarted\n        if (isStarted) {\n            result = \"\"\n            if (ActivityCompat.checkSelfPermission(\n                    activity,\n                    Manifest.permission.RECORD_AUDIO\n                ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n                audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                thread(true) {\n                    Log.i(TAG, \"processing samples\")\n                    val interval = 0.1 // i.e., 100 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n                    val sampleList = ArrayList<FloatArray>()\n                    audioRecord?.let {\n                        it.startRecording()\n                        while (isStarted) {\n                            val ret = it.read(buffer, 0, 
buffer.size)\n                            // read() returns a negative error code on failure; only keep real samples\n                            if (ret > 0) {\n                                val samples = FloatArray(ret) { i -> buffer[i] / 32768.0f }\n                                sampleList.add(samples)\n                            }\n                        }\n                        // Free the microphone once recording stops\n                        it.stop()\n                        it.release()\n                    }\n                    audioRecord = null\n                    Log.i(TAG, \"Stop recording\")\n                    Log.i(TAG, \"Start recognition\")\n                    val samples = Flatten(sampleList)\n                    val stream = Tagger.tagger.createStream()\n                    stream.acceptWaveform(samples, sampleRateInHz)\n                    val events = Tagger.tagger.compute(stream)\n                    stream.release()\n\n                    var str: String = \"\"\n                    for (e in events) {\n                        if (e.prob > threshold) {\n                            str += \"%s (%.2f)\\n\".format(e.name, e.prob)\n                        }\n                    }\n                    result = str\n                }\n            }\n        }\n    }\n\n\n    SherpaOnnxAudioTaggingWearOsTheme {\n        Box(\n            modifier = Modifier\n                .fillMaxSize()\n                .background(MaterialTheme.colors.background),\n            contentAlignment = Alignment.Center\n        ) {\n            Column(\n                horizontalAlignment = Alignment.CenterHorizontally\n            ) {\n                Spacer(modifier = Modifier.height(16.dp))\n                if (firstTime) {\n                    ShowMessage()\n                }\n\n                Spacer(modifier = Modifier.height(16.dp))\n                Text(\n                    result,\n                    fontSize = 12.sp,\n                )\n\n                Text(\n                    \"Threshold \" + String.format(\"%.1f\", threshold),\n                    fontSize = 12.sp\n                )\n                Slider(\n                    value = threshold,\n                    onValueChange = { threshold = it },\n                    
valueRange = 0.1F..1.0F,\n                    modifier = Modifier.fillMaxWidth()\n                )\n                Button(\n                    onClick = onButtonClick,\n                ) {\n                    if (isStarted) {\n                        Text(\"Stop\")\n                    } else {\n                        Text(\"Start\")\n                    }\n                }\n            }\n        }\n    }\n}\n\n@Composable\nfun ShowMessage() {\n    val msg = \"Audio tagging\\nwith\\nNext-gen Kaldi\"\n    Text(\n        modifier = Modifier.fillMaxWidth(),\n        textAlign = TextAlign.Center,\n        color = MaterialTheme.colors.primary,\n        text = msg,\n    )\n}\n\n@Composable\nfun ViewRow(\n    modifier: Modifier = Modifier,\n    event: AudioEvent\n) {\n    Row(\n        modifier = modifier,\n        horizontalArrangement = Arrangement.Center,\n        verticalAlignment = Alignment.CenterVertically,\n    ) {\n        Text(\n            text = event.name,\n            modifier = modifier.weight(1.0F),\n        )\n        Text(\n            text = \"%.2f\".format(event.prob),\n            modifier = modifier.weight(1.0F),\n        )\n    }\n\n}\n\n\nfun Flatten(sampleList: ArrayList<FloatArray>): FloatArray {\n    var totalSamples = 0\n    for (a in sampleList) {\n        totalSamples += a.size\n    }\n    var i = 0\n    val samples = FloatArray(totalSamples)\n    for (a in sampleList) {\n        for (s in a) {\n            samples[i] = s\n            i += 1\n        }\n    }\n    Log.i(TAG, \"$i, $totalSamples\")\n\n    return samples\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/wear/os/presentation/MainActivity.kt",
    "content": "/* While this template provides a good starting point for using Wear Compose, you can always\n * take a look at https://github.com/android/wear-os-samples/tree/main/ComposeStarter and\n * https://github.com/android/wear-os-samples/tree/main/ComposeAdvanced to find the most up to date\n * changes to the libraries and their usages.\n */\n\npackage com.k2fsa.sherpa.onnx.audio.tagging.wear.os.presentation\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.view.WindowManager\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.compose.runtime.Composable\nimport androidx.core.app.ActivityCompat\nimport androidx.core.splashscreen.SplashScreen.Companion.installSplashScreen\nimport com.k2fsa.sherpa.onnx.audio.tagging.Tagger\n\nconst val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\n// adb emu avd hostmicon\n// to enable mic inside the emulator\n\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        installSplashScreen()\n\n        super.onCreate(savedInstanceState)\n\n        // Keep the screen always on\n        // https://developer.android.com/develop/background-work/background-tasks/scheduling/wakelock\n        window.addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON)\n\n        setTheme(android.R.style.Theme_DeviceDefault)\n\n        setContent {\n            WearApp()\n        }\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n        Tagger.initTagger(this.assets, numThreads = 2)\n    }\n\n    @Suppress(\"DEPRECATION\")\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        // grantResults is empty if the request was cancelled, so avoid indexing it directly\n        val permissionToRecordAccepted = requestCode == REQUEST_RECORD_AUDIO_PERMISSION &&\n            grantResults.firstOrNull() == PackageManager.PERMISSION_GRANTED\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs access to the microphone\",\n                Toast.LENGTH_SHORT\n            )\n                .show()\n            finish()\n            // Do not fall through to the success log below\n            return\n        }\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@Composable\nfun WearApp() {\n    HomeScreen()\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/wear/os/presentation/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.audio.tagging.wear.os.presentation.theme\n\nimport androidx.compose.runtime.Composable\nimport androidx.wear.compose.material.MaterialTheme\n\n@Composable\nfun SherpaOnnxAudioTaggingWearOsTheme(\n    content: @Composable () -> Unit\n) {\n    /**\n     * Empty theme to customize for your app.\n     * See: https://developer.android.com/jetpack/compose/designsystems/custom\n     */\n    MaterialTheme(\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/arm64-v8a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/armeabi-v7a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/x86/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/x86_64/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/res/drawable/splash_icon.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item\n        android:width=\"48dp\"\n        android:height=\"48dp\"\n        android:gravity=\"center\">\n        <shape android:shape=\"oval\">\n            <solid android:color=\"#FFFFFF\" />\n        </shape>\n    </item>\n    <item\n        android:width=\"40dp\"\n        android:height=\"40dp\"\n        android:gravity=\"center\">\n        <vector\n            android:width=\"24dp\"\n            android:height=\"24dp\"\n            android:tint=\"#000000\"\n            android:viewportWidth=\"24\"\n            android:viewportHeight=\"24\">\n            <path\n                android:fillColor=\"#FF000000\"\n                android:pathData=\"M17.6,11.48 L19.44,8.3a0.63,0.63 0,0 0,-1.09 -0.63l-1.88,3.24a11.43,11.43 0,0 0,-8.94 0L5.65,7.67a0.63,0.63 0,0 0,-1.09 0.63L6.4,11.48A10.81,10.81 0,0 0,1 20L23,20A10.81,10.81 0,0 0,17.6 11.48ZM7,17.25A1.25,1.25 0,1 1,8.25 16,1.25 1.25,0 0,1 7,17.25ZM17,17.25A1.25,1.25 0,1 1,18.25 16,1.25 1.25,0 0,1 17,17.25Z\" />\n        </vector>\n    </item>\n</layer-list>\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">Audio Tagging</string>\n    <!--\n    This string is used for square devices and overridden by hello_world in\n    values-round/strings.xml for round devices.\n    -->\n    <string name=\"hello_world\">From the Square world,\\nHello, %1$s!</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/res/values/styles.xml",
    "content": "<resources>\n\n    <style name=\"MainActivityTheme.Starting\" parent=\"Theme.SplashScreen\">\n        <item name=\"windowSplashScreenBackground\">@android:color/black</item>\n        <item name=\"windowSplashScreenAnimatedIcon\">@drawable/splash_icon</item>\n        <item name=\"postSplashScreenTheme\">@android:style/Theme.DeviceDefault</item>\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/app/src/main/res/values-round/strings.xml",
    "content": "<resources>\n    <string name=\"hello_world\">From the Round world,\\nHello, %1$s!</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id(\"com.android.application\") version \"8.2.0\" apply false\n    id(\"org.jetbrains.kotlin.android\") version \"1.9.0\" apply false\n}"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Tue Apr 16 20:57:10 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxAudioTaggingWearOs/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxAudioTaggingWearOs\"\ninclude(\":app\")\n "
  },
  {
    "path": "android/SherpaOnnxJavaDemo/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/README.md",
    "content": "# Introduction\n\nPlease run the following commands to download model files before you run this Android demo:\n\n```bash\n# Assume we are inside\n# /Users/fangjun/open-source/sherpa-onnx/android/SherpaOnnxJavaDemo\n\ncd app/src/main/assets/\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx ./\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx ./\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx ./\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ./\n\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/*\n\nmv encoder-epoch-99-avg-1.int8.onnx sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\nmv decoder-epoch-99-avg-1.onnx sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\nmv joiner-epoch-99-avg-1.int8.onnx sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\nmv tokens.txt sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n```\n\nYou should have the following directory structure:\n```\n(py38) fangjuns-MacBook-Pro:assets fangjun$ pwd\n/Users/fangjun/open-source/sherpa-onnx/android/SherpaOnnxJavaDemo/app/src/main/assets\n\n(py38) fangjuns-MacBook-Pro:assets fangjun$ tree .\n.\n└── sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n    ├── decoder-epoch-99-avg-1.onnx\n    ├── encoder-epoch-99-avg-1.int8.onnx\n    ├── joiner-epoch-99-avg-1.int8.onnx\n    └── tokens.txt\n\n1 directory, 4 files\n```\n\nRemember to remove unused files to reduce the file size of the final APK.\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n}\n\nandroid {\n    compileSdk 34\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 28\n        targetSdk 34\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n}\n\ndependencies {\n    implementation 'androidx.appcompat:appcompat:1.3.1'\n    implementation 'com.google.android.material:material:1.3.0'\n    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'\n    implementation 'pub.devrel:easypermissions:3.0.0'\n    implementation 'androidx.core:core-ktx:1.7.0'\n    // implementation files('/Users/fangjun/open-source/sherpa-onnx/android/SherpaOnnxAar/sherpa_onnx/build/outputs/aar/sherpa_onnx-release.aar')\n    implementation 'com.github.k2-fsa:sherpa-onnx:v1.12.31'\n}\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    package=\"com.k2fsa.sherpa.onnx\">\n    <uses-permission android:name=\"android.permission.FOREGROUND_SERVICE\" />\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:name=\".Application\"\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxJavaDemo\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n        <service\n            android:name=\".service.SpeechSherpaRecognitionService\"\n            android:exported=\"false\"/>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/java/com/k2fsa/sherpa/onnx/AppViewModel.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\nimport androidx.lifecycle.LiveData;\nimport androidx.lifecycle.MutableLiveData;\nimport androidx.lifecycle.ViewModel;\n\npublic class AppViewModel extends ViewModel {\n    private final MutableLiveData<String> speechRecognitionResult = new MutableLiveData<>();\n\n    public LiveData<String> getSpeechRecognitionResult() {\n        return speechRecognitionResult;\n    }\n\n    public void setSpeechRecognitionResult(String result) {\n        speechRecognitionResult.postValue(result);\n    }\n\n}\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/java/com/k2fsa/sherpa/onnx/Application.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\nimport androidx.annotation.NonNull;\nimport androidx.lifecycle.ViewModelProvider;\nimport androidx.lifecycle.ViewModelStore;\nimport androidx.lifecycle.ViewModelStoreOwner;\n\n\npublic class Application extends android.app.Application implements ViewModelStoreOwner {\n    public static Application sApplication;\n\n\n    private AppViewModel viewModel;\n    private ViewModelStore viewModelStore;\n\n    public static Application getInstance() {\n        return sApplication;\n    }\n\n    @Override\n    public void onCreate() {\n        super.onCreate();\n        sApplication = this;\n        viewModelStore = new ViewModelStore();\n        viewModel = new ViewModelProvider(this).get(AppViewModel.class);\n    }\n\n    @NonNull\n    @Override\n    public ViewModelStore getViewModelStore() {\n        return viewModelStore;\n    }\n\n    public AppViewModel getViewModel() {\n        return viewModel;\n    }\n\n\n}\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\nimport androidx.appcompat.app.AppCompatActivity;\nimport androidx.core.content.ContextCompat;\nimport androidx.lifecycle.ViewModelProvider;\n\nimport android.Manifest;\nimport android.content.Intent;\nimport android.os.Bundle;\nimport android.util.Log;\nimport android.widget.TextView;\n\nimport com.k2fsa.sherpa.onnx.service.SpeechSherpaRecognitionService;\n\nimport pub.devrel.easypermissions.EasyPermissions;\n\npublic class MainActivity extends AppCompatActivity {\n    private AppViewModel appViewModel;\n    private TextView tvText;\n    private static final int RC_AUDIO_PERM = 123;\n\n    @Override\n    protected void onCreate(Bundle savedInstanceState) {\n        super.onCreate(savedInstanceState);\n        setContentView(R.layout.activity_main);\n        tvText = findViewById(R.id.text);\n        requestMicrophonePermission();\n    }\n\n\n    private void startSpeechService() {\n        Intent serviceIntent = new Intent(this, SpeechSherpaRecognitionService.class);\n        ContextCompat.startForegroundService(this, serviceIntent);\n        appViewModel = new ViewModelProvider(Application.getInstance()).get(AppViewModel.class);\n        appViewModel.getSpeechRecognitionResult().observe(this, this::handleSpeechRecognitionResult);\n    }\n\n    private void handleSpeechRecognitionResult(String result) {\n        tvText.setText(result);\n    }\n\n    private void requestMicrophonePermission() {\n        String[] perms = {Manifest.permission.RECORD_AUDIO};\n        if (EasyPermissions.hasPermissions(this, perms)) {\n            startSpeechService();\n        } else {\n            EasyPermissions.requestPermissions(MainActivity.this,\n                    \"We need access to your microphone for voice recognition\",\n                    RC_AUDIO_PERM, perms);\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/java/com/k2fsa/sherpa/onnx/service/SpeechSherpaRecognitionService.java",
    "content": "package com.k2fsa.sherpa.onnx.service;\n\nimport android.Manifest;\nimport android.annotation.SuppressLint;\nimport android.app.Notification;\nimport android.app.NotificationChannel;\nimport android.app.NotificationManager;\nimport android.app.Service;\nimport android.content.Intent;\nimport android.content.pm.PackageManager;\nimport android.content.res.AssetManager;\nimport android.media.AudioFormat;\nimport android.media.AudioRecord;\nimport android.media.MediaRecorder;\nimport android.os.Build;\nimport android.os.IBinder;\nimport android.text.TextUtils;\nimport android.util.Log;\n\nimport androidx.core.app.ActivityCompat;\nimport androidx.core.app.NotificationCompat;\n\n\nimport com.k2fsa.sherpa.onnx.AppViewModel;\nimport com.k2fsa.sherpa.onnx.Application;\n\nimport com.k2fsa.sherpa.onnx.OnlineModelConfig;\nimport com.k2fsa.sherpa.onnx.OnlineRecognizer;\n\nimport com.k2fsa.sherpa.onnx.OnlineRecognizerConfig;\nimport com.k2fsa.sherpa.onnx.OnlineStream;\nimport com.k2fsa.sherpa.onnx.OnlineTransducerModelConfig;\nimport com.k2fsa.sherpa.onnx.R;\n\nimport java.io.File;\nimport java.io.FileOutputStream;\nimport java.io.IOException;\nimport java.io.InputStream;\nimport java.io.OutputStream;\n\nimport java.util.Objects;\n\nimport java.util.concurrent.ExecutorService;\nimport java.util.concurrent.Executors;\n\n\npublic class SpeechSherpaRecognitionService extends Service {\n\n    private AppViewModel appViewModel;\n    private OnlineRecognizer recognizer;\n    private final int sampleRateInHz = 16000;\n\n    private Thread recordingThread;\n    private boolean isRecording = false;\n    private int audioSource = MediaRecorder.AudioSource.MIC;\n    private int channelConfig = AudioFormat.CHANNEL_IN_MONO;\n    private int audioFormat = AudioFormat.ENCODING_PCM_16BIT;\n    private AudioRecord audioRecord;\n    private int idx = 0;\n    private String lastText = \"\";\n    private ExecutorService executor;\n\n    @Override\n    public void onCreate() {\n      
  super.onCreate();\n        startForegroundService();\n        // Get the ViewModel\n        appViewModel = Application.getInstance().getViewModel();\n        int numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat);\n\n        if (ActivityCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO) != PackageManager.PERMISSION_GRANTED) {\n            // TODO: Consider calling\n            //    ActivityCompat#requestPermissions\n            // here to request the missing permissions, and then overriding\n            //   public void onRequestPermissionsResult(int requestCode, String[] permissions,\n            //                                          int[] grantResults)\n            // to handle the case where the user grants the permission. See the documentation\n            // for ActivityCompat#requestPermissions for more details.\n            return;\n        }\n        audioRecord = new AudioRecord(\n                audioSource,\n                sampleRateInHz,\n                channelConfig,\n                audioFormat,\n                numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n        );\n        executor = Executors.newSingleThreadExecutor();\n        executor.execute(this::initializeSherpa);\n    }\n\n\n    private void initializeSherpa() {\n        Log.d(\"Current Directory\", System.getProperty(\"user.dir\"));\n        String modelDir = \"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\";\n        initializeSherpaDir(modelDir, modelDir);\n        OnlineTransducerModelConfig onlineTransducerModelConfig = new OnlineTransducerModelConfig();\n        onlineTransducerModelConfig.setEncoder(modelDir + \"/encoder-epoch-99-avg-1.int8.onnx\");\n        onlineTransducerModelConfig.setDecoder(modelDir + \"/decoder-epoch-99-avg-1.onnx\");\n        onlineTransducerModelConfig.setJoiner(modelDir + \"/joiner-epoch-99-avg-1.int8.onnx\");\n\n        OnlineModelConfig onlineModelConfig = new 
OnlineModelConfig();\n        onlineModelConfig.setTransducer(onlineTransducerModelConfig);\n        onlineModelConfig.setTokens(modelDir + \"/tokens.txt\");\n        onlineModelConfig.setModelType(\"zipformer\");\n        onlineModelConfig.setDebug(true);\n\n        OnlineRecognizerConfig config = new OnlineRecognizerConfig();\n        config.setModelConfig(onlineModelConfig);\n        recognizer = new OnlineRecognizer(getAssets(), config);\n\n        audioRecord.startRecording();\n        startRecognition();\n    }\n\n    private void startRecognition() {\n        isRecording = true;\n        recordingThread = new Thread(this::processSamples);\n        recordingThread.start();\n    }\n\n    private void processSamples() {\n        OnlineStream stream = recognizer.createStream(\"\");\n        double interval = 0.1;\n        int bufferSize = (int) (interval * sampleRateInHz);\n        short[] buffer = new short[bufferSize];\n\n        while (isRecording) {\n            int ret = audioRecord != null ? 
audioRecord.read(buffer, 0, buffer.length) : -1;\n            if (ret > 0) {\n                float[] samples = new float[ret];\n                for (int i = 0; i < ret; i++) {\n                    samples[i] = buffer[i] / 32768.0f;\n                }\n                stream.acceptWaveform(samples, sampleRateInHz);\n                while (recognizer.isReady(stream)) {\n                    recognizer.decode(stream);\n                }\n\n                boolean isEndpoint = recognizer.isEndpoint(stream);\n                String text = recognizer.getResult(stream).getText();\n                if (isEndpoint) {\n                    float[] tailPaddings = new float[(int) (0.8 * sampleRateInHz)];\n                    stream.acceptWaveform(tailPaddings, sampleRateInHz);\n                    while (recognizer.isReady(stream)) {\n                        recognizer.decode(stream);\n                    }\n                    text = recognizer.getResult(stream).getText();\n                }\n\n                String textToDisplay = lastText;\n\n                if (!TextUtils.isEmpty(text)) {\n                    textToDisplay = TextUtils.isEmpty(lastText) ? 
idx + \": \" + text : lastText + \"\\n\" + idx + \": \" + text;\n                }\n\n                if (isEndpoint) {\n                    recognizer.reset(stream);\n                    if (!TextUtils.isEmpty(text)) {\n                        lastText = lastText + \"\\n\" + idx + \": \" + text;\n                        textToDisplay = lastText;\n                        idx += 1;\n                    }\n                    appViewModel.setSpeechRecognitionResult(textToDisplay);\n                }\n            }\n\n        }\n        stream.release();\n\n    }\n\n\n    @Override\n    public int onStartCommand(Intent intent, int flags, int startId) {\n\n        return START_STICKY;\n    }\n\n    @Override\n    public void onDestroy() {\n        super.onDestroy();\n        // Ask the recording loop to stop before releasing the recorder\n        isRecording = false;\n        // audioRecord and executor stay null if onCreate() returned early (permission not granted)\n        if (audioRecord != null) {\n            audioRecord.stop();\n            audioRecord.release();\n        }\n        if (executor != null) {\n            executor.shutdown();\n        }\n        stopForeground(true);\n    }\n\n    @Override\n    public IBinder onBind(Intent intent) {\n        return null;\n    }\n\n\n    @SuppressLint(\"ForegroundServiceType\")\n    private void startForegroundService() {\n        String channelId = createNotificationChannel();\n\n        Notification notification = new NotificationCompat.Builder(this, channelId)\n                .setContentTitle(\"Foreground Service\")\n                .setContentText(\"Running in the foreground\")\n                .setSmallIcon(R.drawable.ic_bg_mic_24)\n                .build();\n\n        startForeground(1, notification);\n    }\n\n    // Create the notification channel (required on Android 8.0 and above)\n    private String createNotificationChannel() {\n        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {\n            String channelId = \"speech_channel\";\n            String channelName = \"Speech Channel\";\n            NotificationChannel channel = new NotificationChannel(channelId, channelName, NotificationManager.IMPORTANCE_LOW);\n            NotificationManager manager = getSystemService(NotificationManager.class);\n            if (manager != null) 
{\n                manager.createNotificationChannel(channel);\n            }\n            return channelId;\n        } else {\n            return \"\";\n        }\n    }\n\n    private void initializeSherpaDir(String assetDir, String internalDir) {\n        AssetManager assetManager = getAssets();\n        File outDir = new File(getFilesDir(), internalDir);\n\n        if (!outDir.exists()) {\n            outDir.mkdirs();\n        }\n\n        try {\n            String[] assets = assetManager.list(assetDir);\n            if (assets != null) {\n                for (String asset : assets) {\n                    String assetPath = assetDir.isEmpty() ? asset : assetDir + \"/\" + asset;\n                    File outFile = new File(outDir, asset);\n                    if (Objects.requireNonNull(assetManager.list(assetPath)).length > 0) {\n                        outFile.mkdirs();\n                        initializeSherpaDir(assetPath, internalDir + \"/\" + asset); // recursively copy subdirectories\n                    } else {\n                        InputStream in = assetManager.open(assetPath);\n                        OutputStream out = new FileOutputStream(outFile);\n\n                        byte[] buffer = new byte[1024];\n                        int read;\n                        while ((read = in.read(buffer)) != -1) {\n                            out.write(buffer, 0, read);\n                        }\n\n                        in.close();\n                        out.flush();\n                        out.close();\n                    }\n                }\n            }\n        } catch (IOException e) {\n            Log.e(\"ModelCopy\", \"Failed to copy assets\", e);\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/drawable/ic_bg_mic_24.xml",
    "content": "<vector android:height=\"24dp\" android:tint=\"#000000\"\n    android:viewportHeight=\"24\" android:viewportWidth=\"24\"\n    android:width=\"24dp\" xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <path android:fillColor=\"@android:color/white\" android:pathData=\"M12,14c1.66,0 2.99,-1.34 2.99,-3L15,5c0,-1.66 -1.34,-3 -3,-3S9,3.34 9,5v6c0,1.66 1.34,3 3,3zM17.3,11c0,3 -2.54,5.1 -5.3,5.1S6.7,14 6.7,11L5,11c0,3.41 2.72,6.23 6,6.72L11,21h2v-3.28c3.28,-0.48 6,-3.3 6,-6.72h-1.7z\"/>\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <TextView\n        android:id=\"@+id/text\"\n        android:layout_width=\"wrap_content\"\n        android:layout_height=\"wrap_content\"\n        android:text=\"Hello World!\"\n\n        app:layout_constraintStart_toStartOf=\"parent\"\n        app:layout_constraintTop_toTopOf=\"parent\" />\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">SherpaOnnxJavaDemo</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxJavaDemo\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\" tools:targetApi=\"l\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxJavaDemo\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\" tools:targetApi=\"l\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.2.2' apply false\n    id 'com.android.library' version '7.2.2' apply false\n}\n\ntask clean(type: Delete) {\n    delete rootProject.buildDir\n}"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Tue Oct 22 10:59:18 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-7.3.3-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app\"s APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/gradlew.bat",
    "content": "@rem\n@rem Copyright 2015 the original author or authors.\n@rem\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\n@rem you may not use this file except in compliance with the License.\n@rem You may obtain a copy of the License at\n@rem\n@rem      https://www.apache.org/licenses/LICENSE-2.0\n@rem\n@rem Unless required by applicable law or agreed to in writing, software\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n@rem See the License for the specific language governing permissions and\n@rem limitations under the License.\n@rem\n\n@if \"%DEBUG%\" == \"\" @echo off\n@rem ##########################################################################\n@rem\n@rem  Gradle startup script for Windows\n@rem\n@rem ##########################################################################\n\n@rem Set local scope for the variables with windows NT shell\nif \"%OS%\"==\"Windows_NT\" setlocal\n\nset DIRNAME=%~dp0\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\nset APP_BASE_NAME=%~n0\nset APP_HOME=%DIRNAME%\n\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\n\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\n\n@rem Find java.exe\nif defined JAVA_HOME goto findJavaFromJavaHome\n\nset JAVA_EXE=java.exe\n%JAVA_EXE% -version >NUL 2>&1\nif \"%ERRORLEVEL%\" == \"0\" goto execute\n\necho.\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\necho.\necho Please set the JAVA_HOME variable in your environment to match the\necho location of your Java installation.\n\ngoto fail\n\n:findJavaFromJavaHome\nset JAVA_HOME=%JAVA_HOME:\"=%\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\n\nif exist \"%JAVA_EXE%\" goto execute\n\necho.\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\necho.\necho Please set the JAVA_HOME variable in your environment to match the\necho location of your Java installation.\n\ngoto fail\n\n:execute\n@rem Setup the command line\n\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\n\n\n@rem Execute Gradle\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\n\n:end\n@rem End local scope for the variables with windows NT shell\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\n\n:fail\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\nrem the _cmd.exe /c_ return code!\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\nexit /b 1\n\n:mainEnd\nif \"%OS%\"==\"Windows_NT\" endlocal\n\n:omega\n"
  },
  {
    "path": "android/SherpaOnnxJavaDemo/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n        maven { url 'https://jitpack.io' }\n    }\n}\nrootProject.name = \"SherpaOnnxJavaDemo\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxKws/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxKws/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 32\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 32\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.5.1'\n    implementation 'com.google.android.material:material:1.7.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.4'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.0'\n}"
  },
  {
    "path": "android/SherpaOnnxKws/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnx\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".kws.MainActivity\"\n            android:label=\"Keyword-spotter\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.kws\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.text.method.ScrollingMovementMethod\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.EditText\nimport android.widget.TextView\nimport android.widget.Toast\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.KeywordSpotter\nimport com.k2fsa.sherpa.onnx.KeywordSpotterConfig\nimport com.k2fsa.sherpa.onnx.OnlineStream\nimport com.k2fsa.sherpa.onnx.R\nimport com.k2fsa.sherpa.onnx.getFeatureConfig\nimport com.k2fsa.sherpa.onnx.getKeywordsFile\nimport com.k2fsa.sherpa.onnx.getKwsModelConfig\nimport kotlin.concurrent.thread\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : AppCompatActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    private lateinit var kws: KeywordSpotter\n    private lateinit var stream: OnlineStream\n    private var audioRecord: AudioRecord? = null\n    private lateinit var recordButton: Button\n    private lateinit var textView: TextView\n    private lateinit var inputText: EditText\n    private var recordingThread: Thread? 
= null\n\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n    private var idx: Int = 0\n    private var lastText: String = \"\"\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            // grantResults is empty if the permission request was interrupted\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            finish()\n            // Return here so we do not also log that recording is permitted\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        Log.i(TAG, \"Start to initialize model\")\n        initModel()\n        Log.i(TAG, \"Finished initializing model\")\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.setOnClickListener { onclick() }\n\n        textView = findViewById(R.id.my_text)\n        textView.movementMethod = ScrollingMovementMethod()\n\n        inputText = findViewById(R.id.input_text)\n    }\n\n    private fun onclick() {\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n            textView.text = \"\"\n            lastText = \"\"\n            idx = 0\n\n            var keywords = inputText.text.toString()\n            Log.i(TAG, \"Raw keywords: $keywords\")\n\n            keywords = keywords.replace(\"\\n\", \"/\")\n            keywords = keywords.trim()\n\n            Log.i(TAG, \"Normalized keywords: $keywords\")\n\n            stream = kws.createStream(keywords)\n            if (stream.ptr == 0L) {\n                Log.e(TAG, \"Failed to create stream with keywords: $keywords\")\n\n                Toast.makeText(this, \"Failed to set keywords to $keywords.\", Toast.LENGTH_LONG)\n                    .show()\n\n                audioRecord?.let {\n                    it.stop()\n                    it.release()\n                }\n                audioRecord = null\n\n                // Roll back the UI and recording state set above, since no\n                // recording thread is started on this path\n                isRecording = false\n                recordButton.setText(R.string.start)\n\n                return\n            }\n\n            Log.i(TAG, \"Created stream. 
Running ...\")\n\n            recordingThread = thread(true) {\n                processSamples()\n            }\n\n            Log.i(TAG, \"Started recording\")\n        } else {\n            isRecording = false\n\n            recordButton.setText(R.string.start)\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n\n        val interval = 0.1 // i.e., 100 ms\n        val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n        val buffer = ShortArray(bufferSize)\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n                stream.acceptWaveform(samples, sampleRate = sampleRateInHz)\n                while (kws.isReady(stream)) {\n                    kws.decode(stream)\n\n                    val text = kws.getResult(stream).keyword\n\n                    var textToDisplay = lastText\n\n                    if (text.isNotBlank()) {\n                        // Remember to reset the stream right after detecting a keyword\n\n                        kws.reset(stream)\n                        if (lastText.isBlank()) {\n                            textToDisplay = \"$idx: $text\"\n                        } else {\n                            textToDisplay = \"$idx: $text\\n$lastText\"\n                        }\n                        lastText = \"$idx: $text\\n$lastText\"\n                        idx += 1\n                    }\n\n                    runOnUiThread {\n                        textView.text = textToDisplay\n                    }\n                }\n            }\n        }\n\n        stream.release()\n        Log.i(TAG, \"Released stream. 
Stopped\")\n\n        audioRecord?.let {\n          it.stop()\n          it.release()\n        }\n\n        audioRecord = null\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, Manifest.permission.RECORD_AUDIO\n            ) != PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        // getMinBufferSize() returns a size in bytes, not in samples\n        val numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        // Assumes 16-bit mono PCM, i.e., 2 bytes per sample\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes * 1000.0f / (2 * sampleRateInHz)}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // twice the minimum size (in bytes) for extra headroom\n        )\n        return true\n    }\n\n    private fun initModel() {\n        // Please change getKwsModelConfig() to add new models\n        // See https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\n        // for a list of available models\n        val type = 0\n        Log.i(TAG, \"Select model type $type\")\n        val config = KeywordSpotterConfig(\n            featConfig = getFeatureConfig(sampleRate = sampleRateInHz, featureDim = 80),\n            modelConfig = getKwsModelConfig(type = type)!!,\n            keywordsFile = getKeywordsFile(type = type),\n        )\n\n        kws = KeywordSpotter(\n            assetManager = application.assets,\n            config = config,\n        )\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/jniLibs/.gitignore",
    "content": "*.so\n*.txt\n*.onnx\n*.wav\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <LinearLayout\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"match_parent\"\n        android:gravity=\"center\"\n        android:orientation=\"vertical\">\n\n        <EditText\n            android:id=\"@+id/input_text\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"320dp\"\n            android:layout_weight=\"2.5\"\n            android:hint=\"@string/keyword_hint\"\n            android:scrollbars=\"vertical\"\n            android:text=\"\"\n            android:textSize=\"15dp\" />\n\n        <TextView\n            android:id=\"@+id/my_text\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"443dp\"\n            android:layout_weight=\"2.5\"\n            android:padding=\"24dp\"\n            android:scrollbars=\"vertical\"\n            android:singleLine=\"false\"\n            android:text=\"@string/hint\"\n            android:textSize=\"15dp\" />\n\n        <Button\n            android:id=\"@+id/record_button\"\n            android:layout_width=\"wrap_content\"\n            android:layout_height=\"wrap_content\"\n            android:layout_weight=\"0.5\"\n            android:text=\"@string/start\" />\n\n    </LinearLayout>\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">Keyword spotting</string>\n    <string name=\"hint\">Click the Start button to play keyword spotting with Next-gen Kaldi.\n        \\n\n        \\n\\n\\n\n        The source code and pre-trained models are publicly available.\n        Please see https://github.com/k2-fsa/sherpa-onnx for details.\n    </string>\n    <string name=\"keyword_hint\">Input your keywords here, one keyword per line.\\nTwo example keywords are given below:\\n\\nn ǐ h ǎo @你好\\nd àn g ē d àn g ē @蛋哥蛋哥</string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n</resources>\n"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxKws/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxKws/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnxKws/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Thu Feb 23 11:09:06 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxKws/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxKws/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxKws/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxKws/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnxKws\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/build.gradle.kts",
    "content": "plugins {\n    alias(libs.plugins.android.application)\n    alias(libs.plugins.jetbrains.kotlin.android)\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.simulate.streaming.asr\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.simulate.streaming.asr\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n    implementation(libs.androidx.core.ktx)\n    implementation(libs.androidx.lifecycle.runtime.ktx)\n    implementation(libs.androidx.activity.compose)\n    implementation(platform(libs.androidx.compose.bom))\n    implementation(libs.androidx.ui)\n    implementation(libs.androidx.ui.graphics)\n    implementation(libs.androidx.ui.tooling.preview)\n    implementation(libs.androidx.material3)\n    implementation(libs.androidx.navigation.compose)\n    testImplementation(libs.junit)\n    androidTestImplementation(libs.androidx.junit)\n    androidTestImplementation(libs.androidx.espresso.core)\n    androidTestImplementation(platform(libs.androidx.compose.bom))\n    
androidTestImplementation(libs.androidx.ui.test.junit4)\n    debugImplementation(libs.androidx.ui.tooling)\n    debugImplementation(libs.androidx.ui.test.manifest)\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/androidTest/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.simulate.streaming.asr\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SimulateStreamingAsr\"\n        tools:targetApi=\"31\">\n\n        <!--\n        required by qnn\n\n        If you don't add it, you would get an error from the deviceCreate() API\n        and the error code is 14001\n\n        It is located at /vendor/lib64/libcdsprpc.so on your Phone\n        -->\n        <uses-native-library\n            android:name=\"libcdsprpc.so\"\n            android:required=\"false\"/>\n\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SimulateStreamingAsr\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/BarItem.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport androidx.compose.ui.graphics.vector.ImageVector\n\ndata class BarItem(\n    val title: String,\n\n    // see https://www.composables.com/icons\n    // and\n    // https://developer.android.com/reference/kotlin/androidx/compose/material/icons/filled/package-summary\n    val image: ImageVector,\n    val route: String,\n)\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.activity.enableEdgeToEdge\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.CenterAlignedTopAppBar\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.Icon\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.NavigationBar\nimport androidx.compose.material3.NavigationBarItem\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBarDefaults\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.core.app.ActivityCompat\nimport androidx.navigation.NavGraph.Companion.findStartDestination\nimport androidx.navigation.NavHostController\nimport androidx.navigation.compose.NavHost\nimport androidx.navigation.compose.composable\nimport androidx.navigation.compose.currentBackStackEntryAsState\nimport androidx.navigation.compose.rememberNavController\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.screens.HelpScreen\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.screens.HomeScreen\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.ui.theme.SimulateStreamingAsrTheme\n\nconst val TAG = \"sherpa-onnx-sim-asr\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\n@Suppress(\"DEPRECATION\")\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = 
arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        enableEdgeToEdge()\n        setContent {\n            SimulateStreamingAsrTheme {\n                Surface(\n                    modifier = Modifier.fillMaxSize(),\n                    color = MaterialTheme.colorScheme.background\n                ) {\n                    MainScreen()\n                }\n            }\n        }\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n    }\n\n    @Deprecated(\"Deprecated in Java\")\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            // grantResults is empty if the permission request was cancelled\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs to access the microphone\",\n                Toast.LENGTH_SHORT\n            )\n                .show()\n            finish()\n            // Do not fall through to the \"permitted\" log below\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@OptIn(ExperimentalMaterial3Api::class)\n@Composable\nfun MainScreen(modifier: Modifier = Modifier) {\n    val navController = rememberNavController()\n\n    Scaffold(\n        topBar = {\n            CenterAlignedTopAppBar(\n                colors = TopAppBarDefaults.topAppBarColors(\n                    containerColor = MaterialTheme.colorScheme.primaryContainer,\n                    titleContentColor = MaterialTheme.colorScheme.primary,\n                ),\n                title = {\n                    Text(\n             
           \"Next-gen Kaldi: Simulate real-time speech recognition\",\n                        fontWeight = FontWeight.Bold,\n                    )\n                },\n            )\n        },\n        content = { padding ->\n            Column(Modifier.padding(padding)) {\n                NavigationHost(navController = navController)\n\n            }\n        },\n        bottomBar = {\n            BottomNavigationBar(navController = navController)\n        }\n    )\n}\n\n@Composable\nfun NavigationHost(navController: NavHostController) {\n    NavHost(navController = navController, startDestination = NavRoutes.Home.route) {\n        composable(NavRoutes.Home.route) {\n            HomeScreen()\n        }\n\n        composable(NavRoutes.Help.route) {\n            HelpScreen()\n        }\n    }\n}\n\n@Composable\nfun BottomNavigationBar(navController: NavHostController) {\n    NavigationBar {\n        val backStackEntry by navController.currentBackStackEntryAsState()\n        val currentRoute = backStackEntry?.destination?.route\n\n        NavBarItems.BarItems.forEach { navItem ->\n            NavigationBarItem(selected = currentRoute == navItem.route,\n                onClick = {\n                    navController.navigate(navItem.route) {\n                        popUpTo(navController.graph.findStartDestination().id) {\n                            saveState = true\n                        }\n                        launchSingleTop = true\n                        restoreState = true\n                    }\n                },\n                icon = {\n                    Icon(imageVector = navItem.image, contentDescription = navItem.title)\n                }, label = {\n                    Text(text = navItem.title)\n                })\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/NavBarItems.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport androidx.compose.material.icons.Icons\nimport androidx.compose.material.icons.filled.Home\nimport androidx.compose.material.icons.filled.Info\n\nobject NavBarItems {\n    val BarItems = listOf(\n        BarItem(\n            title = \"Home\",\n            image = Icons.Filled.Home,\n            route = \"home\",\n        ),\n        BarItem(\n            title = \"Help\",\n            image = Icons.Filled.Info,\n            route = \"help\",\n        ),\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/NavRoutes.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nsealed class NavRoutes(val route: String) {\n    object Home : NavRoutes(\"home\")\n    object Help : NavRoutes(\"help\")\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/SimulateStreamingAsr.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport android.content.Context\nimport android.content.res.AssetManager\nimport android.util.Log\nimport com.k2fsa.sherpa.onnx.HomophoneReplacerConfig\nimport com.k2fsa.sherpa.onnx.OfflineRecognizer\nimport com.k2fsa.sherpa.onnx.OfflineRecognizerConfig\nimport com.k2fsa.sherpa.onnx.Vad\nimport com.k2fsa.sherpa.onnx.getOfflineModelConfig\nimport com.k2fsa.sherpa.onnx.getVadModelConfig\nimport java.io.File\nimport java.io.FileOutputStream\nimport java.io.InputStream\nimport java.io.OutputStream\n\n\nfun assetExists(assetManager: AssetManager, path: String): Boolean {\n    val dir = path.substringBeforeLast('/', \"\")\n    val fileName = path.substringAfterLast('/')\n\n    val files = assetManager.list(dir) ?: return false\n    return files.contains(fileName)\n}\n\nfun assetListExists(\n    assetManager: AssetManager,\n    paths: String\n): Boolean {\n    if (paths.isBlank()) return false\n\n    val pathList = paths.split(\",\")\n        .map { it.trim() }\n        .filter { it.isNotEmpty() }\n\n    if (pathList.isEmpty()) return false\n\n    return pathList.all { path ->\n        assetExists(assetManager, path)\n    }\n}\n\nfun copyAssetToInternalStorage(path: String, context: Context): String {\n    val targetRoot = context.filesDir\n    val outFile = File(targetRoot, path)\n\n    if (!assetExists(context.assets, path = path)) {\n        // for context binary, if it is does not exist, we return a path\n        // that can be written to\n        outFile.parentFile?.mkdirs()\n        Log.i(TAG, \"$path does not exist, return ${outFile.absolutePath}\")\n        return outFile.absolutePath\n    }\n\n    if (outFile.exists()) {\n        val assetSize = context.assets.open(path).use { it.available() }\n        if (outFile.length() == assetSize.toLong()) {\n            Log.i(TAG, \"$targetRoot/$path already exists, skip copying, return $targetRoot/$path\")\n\n            return \"$targetRoot/$path\"\n  
      }\n    }\n\n    outFile.parentFile?.mkdirs()\n\n    context.assets.open(path).use { input: InputStream ->\n        FileOutputStream(outFile).use { output: OutputStream ->\n            input.copyTo(output)\n        }\n    }\n    Log.i(TAG, \"Copied $path to $targetRoot/$path\")\n\n    return outFile.absolutePath\n}\n\nfun copyAssetListToInternalStorage(\n    paths: String,\n    context: Context\n): String {\n    if (paths.isBlank()) return paths\n\n    val pathList = paths.split(\",\")\n        .map { it.trim() }\n        .filter { it.isNotEmpty() }\n\n    val copiedPaths = pathList.map { path ->\n        copyAssetToInternalStorage(path, context)\n    }\n\n    return copiedPaths.joinToString(\",\")\n}\n\n\nobject SimulateStreamingAsr {\n    private var _recognizer: OfflineRecognizer? = null\n    val recognizer: OfflineRecognizer\n        get() {\n            return _recognizer!!\n        }\n\n    private var _vad: Vad? = null\n    val vad: Vad\n        get() {\n            return _vad!!\n        }\n\n    fun initOfflineRecognizer(context: Context, asrModelType: Int) {\n        synchronized(this) {\n            if (_recognizer != null) {\n                return\n            }\n            Log.i(TAG, \"Initializing sherpa-onnx offline recognizer\")\n            // Please change getOfflineModelConfig() to add new models\n            // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n            // for a list of available models\n            val asrRuleFsts: String?\n            asrRuleFsts = null\n            Log.i(TAG, \"Select model type $asrModelType for ASR\")\n\n            val useHr = false\n            val hr = HomophoneReplacerConfig(\n                // Used only when useHr is true\n                // Please download the following 2 files from\n                // https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n                //\n                // lexicon.txt can be shared by different apps\n                //\n       
         // replace.fst is specific for an app\n                lexicon = \"lexicon.txt\",\n                ruleFsts = \"replace.fst\",\n            )\n\n            val config = OfflineRecognizerConfig(\n                modelConfig = getOfflineModelConfig(type = asrModelType)!!,\n            )\n\n            if (config.modelConfig.numThreads == 1) {\n                config.modelConfig.numThreads = 2\n            }\n\n            if (asrRuleFsts != null) {\n                config.ruleFsts = asrRuleFsts\n            }\n\n            if (useHr) {\n                config.hr = hr\n            }\n\n            var assetManager: AssetManager? = context.assets\n\n            if (config.modelConfig.provider == \"qnn\") {\n                // We assume you have copied files like libQnnHtpV81Skel.so to jniLibs/arm64-v8a\n                Log.i(TAG, \"nativelibdir: ${context.applicationInfo.nativeLibraryDir}\")\n\n                // If we don't set the environment variable for ADSP_LIBRARY_PATH, we will see\n                // the error code 1008 from qnn_interface.deviceCreate()\n                // See also\n                // https://workbench.aihub.qualcomm.com/docs/hub/faq.html#why-am-i-seeing-error-1008-when-trying-to-use-htp\n                OfflineRecognizer.prependAdspLibraryPath(context.applicationInfo.nativeLibraryDir)\n\n                // for qnn, we need to copy *.so files from assets folder to sd card\n                if (config.modelConfig.senseVoice.qnnConfig.backendLib.isEmpty()\n                    && config.modelConfig.zipformerCtc.qnnConfig.backendLib.isEmpty()\n                    && config.modelConfig.paraformer.qnnConfig.backendLib.isEmpty()\n                ) {\n                    Log.e(TAG, \"You should provide libQnnHtp.so for qnn\")\n                    throw IllegalArgumentException(\"You should provide libQnnHtp.so for qnn\")\n                }\n                config.modelConfig.tokens =\n                    
copyAssetToInternalStorage(config.modelConfig.tokens, context)\n\n                if (config.modelConfig.senseVoice.model.isNotEmpty() || assetExists(\n                        context.assets,\n                        path = config.modelConfig.senseVoice.qnnConfig.contextBinary\n                    )\n                ) {\n                    if (config.modelConfig.senseVoice.model.isNotEmpty()) {\n                        config.modelConfig.senseVoice.model =\n                            copyAssetToInternalStorage(config.modelConfig.senseVoice.model, context)\n                    }\n\n                    config.modelConfig.senseVoice.qnnConfig.contextBinary =\n                        copyAssetToInternalStorage(\n                            config.modelConfig.senseVoice.qnnConfig.contextBinary,\n                            context\n                        )\n                } else if (config.modelConfig.zipformerCtc.model.isNotEmpty() ||\n                    assetExists(\n                        context.assets,\n                        path = config.modelConfig.zipformerCtc.qnnConfig.contextBinary\n                    )\n                ) {\n                    if (config.modelConfig.zipformerCtc.model.isNotEmpty()) {\n                        config.modelConfig.zipformerCtc.model =\n                            copyAssetToInternalStorage(\n                                config.modelConfig.zipformerCtc.model,\n                                context\n                            )\n                    }\n\n                    config.modelConfig.zipformerCtc.qnnConfig.contextBinary =\n                        copyAssetToInternalStorage(\n                            config.modelConfig.zipformerCtc.qnnConfig.contextBinary,\n                            context\n                        )\n                } else if (config.modelConfig.paraformer.model.isNotEmpty()\n                    || assetListExists(\n                        context.assets,\n                        
config.modelConfig.paraformer.qnnConfig.contextBinary\n                    )\n                ) {\n                    if (config.modelConfig.paraformer.model.isNotEmpty()) {\n                        config.modelConfig.paraformer.model =\n                            copyAssetListToInternalStorage(\n                                config.modelConfig.paraformer.model,\n                                context\n                            )\n                    }\n\n                    config.modelConfig.paraformer.qnnConfig.contextBinary =\n                        copyAssetListToInternalStorage(\n                            config.modelConfig.paraformer.qnnConfig.contextBinary,\n                            context\n                        )\n                }\n\n                if (config.hr.lexicon.isNotEmpty()) {\n                    config.hr.lexicon = copyAssetToInternalStorage(config.hr.lexicon, context)\n                }\n\n                if (config.hr.ruleFsts.isNotEmpty()) {\n                    // it assumes there is only one fst. otherwise, you need to copy each fst separately\n                    config.hr.ruleFsts = copyAssetToInternalStorage(config.hr.ruleFsts, context)\n                }\n\n                assetManager = null\n            }\n\n            _recognizer = OfflineRecognizer(\n                assetManager = assetManager,\n                config = config,\n            )\n\n            Log.i(TAG, \"sherpa-onnx offline recognizer initialized\")\n        }\n    }\n\n    fun initVad(assetManager: AssetManager? = null) {\n        if (_vad != null) {\n            return\n        }\n        val type = 0\n        Log.i(TAG, \"Select VAD model type $type\")\n        val config = getVadModelConfig(type)\n\n        _vad = Vad(\n            assetManager = assetManager,\n            config = config!!,\n        )\n        Log.i(TAG, \"sherpa-onnx vad initialized\")\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/screens/Help.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.screens\n\nimport androidx.compose.runtime.Composable\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.Text\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\n\n@Composable\nfun HelpScreen() {\n    Box(modifier = Modifier.fillMaxSize()) {\n        Column(\n            modifier = Modifier.padding(8.dp)\n        ) {\n            Text(\n                \"This app uses a non-streaming ASR model together with silero-vad \" +\n                        \"for streaming/real-time speech recognition. \",\n                fontSize=10.sp\n            )\n            Spacer(modifier = Modifier.height(10.dp))\n            Text(\"Please see http://github.com/k2-fsa/sherpa-onnx \")\n\n            Spacer(modifier = Modifier.height(10.dp))\n            Text(\"Everything is open-sourced!\", fontSize = 20.sp)\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/screens/Home.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.screens\n\nimport android.Manifest\nimport android.annotation.SuppressLint\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.PaddingValues\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxHeight\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.width\nimport androidx.compose.foundation.lazy.LazyColumn\nimport androidx.compose.foundation.lazy.itemsIndexed\nimport androidx.compose.foundation.lazy.rememberLazyListState\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.LaunchedEffect\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateListOf\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.rememberCoroutineScope\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalClipboardManager\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.res.stringResource\nimport androidx.compose.ui.text.AnnotatedString\nimport androidx.compose.ui.unit.dp\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.R\nimport 
com.k2fsa.sherpa.onnx.simulate.streaming.asr.SimulateStreamingAsr\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.TAG\nimport kotlinx.coroutines.CoroutineScope\nimport kotlinx.coroutines.Dispatchers\nimport kotlinx.coroutines.channels.Channel\nimport kotlinx.coroutines.launch\nimport kotlinx.coroutines.withContext\n\nprivate var audioRecord: AudioRecord? = null\n\nprivate const val sampleRateInHz = 16000\nprivate var samplesChannel = Channel<FloatArray>(capacity = Channel.UNLIMITED)\n\n@Composable\nfun HomeScreen() {\n    val context = LocalContext.current\n    val clipboardManager = LocalClipboardManager.current\n\n    val activity = LocalContext.current as Activity\n    var isStarted by remember { mutableStateOf(false) }\n    val resultList: MutableList<String> = remember { mutableStateListOf() }\n    val lazyColumnListState = rememberLazyListState()\n    val coroutineScope = rememberCoroutineScope()\n\n    var isInitialized by remember { mutableStateOf(false) }\n\n    // we change asrModelType in github actions\n    val asrModelType = 15\n\n    LaunchedEffect(Unit) {\n        if (asrModelType >= 9000) {\n            resultList.add(\"Using QNN for Qualcomm NPU (HTP backend)\")\n            resultList.add(\"It takes about 10s for the first run to start\")\n            resultList.add(\"Later runs require less than 1 second\")\n        }\n\n        withContext(Dispatchers.Default) {\n            // Call your heavy initialization off the main thread\n            SimulateStreamingAsr.initOfflineRecognizer(activity, asrModelType)\n            SimulateStreamingAsr.initVad(activity.assets)\n        }\n\n        // Back on the Main thread: update UI state\n        isInitialized = true\n        resultList.clear()\n    }\n\n    val onRecordingButtonClick: () -> Unit = {\n        isStarted = !isStarted\n        if (isStarted) {\n            if (ActivityCompat.checkSelfPermission(\n                    activity,\n                    Manifest.permission.RECORD_AUDIO\n      
          ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                // recording is allowed\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n                audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                SimulateStreamingAsr.vad.reset()\n\n                CoroutineScope(Dispatchers.IO).launch {\n                    Log.i(TAG, \"processing samples\")\n                    val interval = 0.1 // i.e., 100 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n\n                    audioRecord?.let { it ->\n                        it.startRecording()\n\n                        while (isStarted) {\n                            val ret = audioRecord?.read(buffer, 0, buffer.size)\n                            ret?.let { n ->\n                                val samples = FloatArray(n) { buffer[it] / 32768.0f }\n                                samplesChannel.send(samples)\n                            }\n                        }\n                        val samples = FloatArray(0)\n                        samplesChannel.send(samples)\n                    }\n                }\n\n                CoroutineScope(Dispatchers.Default).launch {\n                    var buffer = arrayListOf<Float>()\n                    var offset = 0\n                    
val windowSize = 512\n                    var isSpeechStarted = false\n                    var startTime = System.currentTimeMillis()\n                    var lastText = \"\"\n                    var added = false\n                    var speechStartOffset = 0\n\n\n                    while (isStarted) {\n                        for (s in samplesChannel) {\n                            if (s.isEmpty()) {\n                                break\n                            }\n\n                            buffer.addAll(s.toList())\n                            while (offset + windowSize < buffer.size) {\n                                SimulateStreamingAsr.vad.acceptWaveform(\n                                    buffer.subList(\n                                        offset,\n                                        offset + windowSize\n                                    ).toFloatArray()\n                                )\n                                offset += windowSize\n                                if (!isSpeechStarted && SimulateStreamingAsr.vad.isSpeechDetected()) {\n                                    isSpeechStarted = true\n                                    // offset 0.4s\n                                    speechStartOffset = offset - 6400\n                                    if(speechStartOffset < 0) {\n                                        speechStartOffset = 0\n                                    }\n                                    startTime = System.currentTimeMillis()\n                                }\n                            }\n\n                            val elapsed = System.currentTimeMillis() - startTime\n                            if (isSpeechStarted && elapsed > 200) {\n                                // Run ASR every 0.2 seconds == 200 milliseconds\n                                // You can change it to some other value\n                                val stream = SimulateStreamingAsr.recognizer.createStream()\n              
                  stream.acceptWaveform(\n                                    buffer.subList(speechStartOffset, offset).toFloatArray(),\n                                    sampleRateInHz\n                                )\n                                SimulateStreamingAsr.recognizer.decode(stream)\n                                val result = SimulateStreamingAsr.recognizer.getResult(stream)\n                                stream.release()\n\n                                lastText = result.text\n\n                                if (lastText.isNotBlank()) {\n                                    if (!added || resultList.isEmpty()) {\n                                        resultList.add(lastText)\n                                        added = true\n                                    } else {\n                                        resultList[resultList.size - 1] = lastText\n                                    }\n\n                                    coroutineScope.launch {\n                                        lazyColumnListState.animateScrollToItem(resultList.size - 1)\n                                    }\n                                }\n\n                                startTime = System.currentTimeMillis()\n                            }\n\n\n                            while (!SimulateStreamingAsr.vad.empty()) {\n                                val stream = SimulateStreamingAsr.recognizer.createStream()\n                                stream.acceptWaveform(\n                                    SimulateStreamingAsr.vad.front().samples,\n                                    sampleRateInHz\n                                )\n                                SimulateStreamingAsr.recognizer.decode(stream)\n                                val result = SimulateStreamingAsr.recognizer.getResult(stream)\n                                stream.release()\n\n                                isSpeechStarted = false\n                                
SimulateStreamingAsr.vad.pop()\n\n                                buffer = arrayListOf()\n                                offset = 0\n                                if (lastText.isNotBlank()) {\n                                    if (added && resultList.isNotEmpty()) {\n                                        resultList[resultList.size - 1] = result.text\n                                    } else {\n                                        resultList.add(result.text)\n                                    }\n\n                                    coroutineScope.launch {\n                                        lazyColumnListState.animateScrollToItem(resultList.size - 1)\n                                    }\n                                    added = false\n                                }\n                            }\n                        }\n                    }\n                }\n            }\n        } else {\n            audioRecord?.stop()\n            audioRecord?.release()\n            audioRecord = null\n        }\n    }\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter,\n    ) {\n        Column(modifier = Modifier) {\n            if (!isInitialized) {\n                Row(\n                    modifier = Modifier.fillMaxWidth(),\n                    horizontalArrangement = Arrangement.Center,\n                ) {\n                    Text(text = \"Initializing... 
Please wait\")\n                }\n            }\n            if (asrModelType >= 9000) {\n                Row(\n                    modifier = Modifier.fillMaxWidth(),\n                    horizontalArrangement = Arrangement.Center,\n                ) {\n                    Text(text = \"Qualcomm NPU (HTP backend with QNN)\")\n                }\n            }\n\n            HomeButtonRow(\n                isStarted = isStarted,\n                isInitialized = isInitialized,\n                onRecordingButtonClick = onRecordingButtonClick,\n                onCopyButtonClick = {\n                    if (resultList.isNotEmpty()) {\n                        val s = resultList.mapIndexed { i, s -> \"${i + 1}: $s\" }\n                            .joinToString(separator = \"\\n\")\n                        clipboardManager.setText(AnnotatedString(s))\n\n                        Toast.makeText(\n                            context,\n                            \"Copied to clipboard\",\n                            Toast.LENGTH_SHORT\n                        )\n                            .show()\n                    } else {\n                        Toast.makeText(\n                            context,\n                            \"Nothing to copy\",\n                            Toast.LENGTH_SHORT\n                        )\n                            .show()\n\n                    }\n                },\n                onClearButtonClick = {\n                    resultList.clear()\n                }\n            )\n\n            if (resultList.size > 0) {\n                LazyColumn(\n                    modifier = Modifier\n                        .fillMaxWidth()\n                        .fillMaxHeight(),\n                    contentPadding = PaddingValues(16.dp),\n                    state = lazyColumnListState\n                ) {\n                    itemsIndexed(resultList) { index, line ->\n                        Text(text = \"${index + 1}: $line\")\n               
     }\n                }\n            }\n\n        }\n    }\n}\n\n@SuppressLint(\"UnrememberedMutableState\")\n@Composable\nprivate fun HomeButtonRow(\n    modifier: Modifier = Modifier,\n    isStarted: Boolean,\n    isInitialized: Boolean,\n    onRecordingButtonClick: () -> Unit,\n    onCopyButtonClick: () -> Unit,\n    onClearButtonClick: () -> Unit,\n) {\n    Row(\n        modifier = modifier.fillMaxWidth(),\n        horizontalArrangement = Arrangement.Center,\n    ) {\n        Button(\n            onClick = onRecordingButtonClick,\n            enabled = isInitialized,\n        ) {\n            Text(text = stringResource(if (isStarted) R.string.stop else R.string.start))\n        }\n\n        Spacer(modifier = Modifier.width(24.dp))\n\n        Button(\n            onClick = onCopyButtonClick,\n            enabled = isInitialized,\n        ) {\n            Text(text = stringResource(id = R.string.copy))\n        }\n\n        Spacer(modifier = Modifier.width(24.dp))\n\n        Button(\n            onClick = onClearButtonClick,\n            enabled = isInitialized,\n        ) {\n            Text(text = stringResource(id = R.string.clear))\n        }\n    }\n}\n\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.platform.LocalContext\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SimulateStreamingAsrTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">SimulateStreamingAsr</string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n    <string name=\"copy\">Copy</string>\n    <string name=\"clear\">Clear</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SimulateStreamingAsr\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older than API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/app/src/test/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    alias(libs.plugins.android.application) apply false\n    alias(libs.plugins.jetbrains.kotlin.android) apply false\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/gradle/libs.versions.toml",
    "content": "[versions]\nagp = \"8.4.0\"\nkotlin = \"1.9.0\"\ncoreKtx = \"1.10.0\"\njunit = \"4.13.2\"\njunitVersion = \"1.1.5\"\nespressoCore = \"3.5.1\"\nlifecycleRuntimeKtx = \"2.6.1\"\nactivityCompose = \"1.8.0\"\ncomposeBom = \"2023.08.00\"\nnavigationCompose = \"2.8.2\"\n\n[libraries]\nandroidx-core-ktx = { group = \"androidx.core\", name = \"core-ktx\", version.ref = \"coreKtx\" }\njunit = { group = \"junit\", name = \"junit\", version.ref = \"junit\" }\nandroidx-junit = { group = \"androidx.test.ext\", name = \"junit\", version.ref = \"junitVersion\" }\nandroidx-espresso-core = { group = \"androidx.test.espresso\", name = \"espresso-core\", version.ref = \"espressoCore\" }\nandroidx-lifecycle-runtime-ktx = { group = \"androidx.lifecycle\", name = \"lifecycle-runtime-ktx\", version.ref = \"lifecycleRuntimeKtx\" }\nandroidx-activity-compose = { group = \"androidx.activity\", name = \"activity-compose\", version.ref = \"activityCompose\" }\nandroidx-compose-bom = { group = \"androidx.compose\", name = \"compose-bom\", version.ref = \"composeBom\" }\nandroidx-ui = { group = \"androidx.compose.ui\", name = \"ui\" }\nandroidx-ui-graphics = { group = \"androidx.compose.ui\", name = \"ui-graphics\" }\nandroidx-ui-tooling = { group = \"androidx.compose.ui\", name = \"ui-tooling\" }\nandroidx-ui-tooling-preview = { group = \"androidx.compose.ui\", name = \"ui-tooling-preview\" }\nandroidx-ui-test-manifest = { group = \"androidx.compose.ui\", name = \"ui-test-manifest\" }\nandroidx-ui-test-junit4 = { group = \"androidx.compose.ui\", name = \"ui-test-junit4\" }\nandroidx-material3 = { group = \"androidx.compose.material3\", name = \"material3\" }\nandroidx-navigation-compose = { group = \"androidx.navigation\", name = \"navigation-compose\", version.ref = \"navigationCompose\" }\n\n\n[plugins]\nandroid-application = { id = \"com.android.application\", version.ref = \"agp\" }\njetbrains-kotlin-android = { id = \"org.jetbrains.kotlin.android\", version.ref = 
\"kotlin\" }\n\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Wed May 14 11:10:06 CST 2025\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.6-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. For more details, visit\n# https://developer.android.com/r/tools/gradle-multi-project-decoupled-projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsr/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google {\n            content {\n                includeGroupByRegex(\"com\\\\.android.*\")\n                includeGroupByRegex(\"com\\\\.google.*\")\n                includeGroupByRegex(\"androidx.*\")\n            }\n        }\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SimulateStreamingAsr\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/build.gradle.kts",
    "content": "plugins {\n    alias(libs.plugins.android.application)\n    alias(libs.plugins.jetbrains.kotlin.android)\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os\"\n        minSdk = 28\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(libs.play.services.wearable)\n    implementation(platform(libs.compose.bom))\n    implementation(libs.ui)\n    implementation(libs.ui.tooling.preview)\n    implementation(libs.compose.material)\n    implementation(libs.compose.foundation)\n    implementation(libs.activity.compose)\n    implementation(libs.core.splashscreen)\n    implementation(\"com.github.k2-fsa:sherpa-onnx:v1.12.31\")\n    androidTestImplementation(platform(libs.compose.bom))\n    androidTestImplementation(libs.ui.test.junit4)\n    debugImplementation(libs.ui.tooling)\n    debugImplementation(libs.ui.test.manifest)\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/lint.xml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<lint>\n    <!-- Ignore the IconLocation for the Tile preview images -->\n    <issue id=\"IconLocation\">\n        <ignore path=\"res/drawable/tile_preview.png\" />\n        <ignore path=\"res/drawable-round/tile_preview.png\" />\n    </issue>\n</lint>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n\n    <uses-permission android:name=\"android.permission.WAKE_LOCK\" />\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <uses-feature android:name=\"android.hardware.type.watch\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@android:style/Theme.DeviceDefault\">\n        <uses-library\n            android:name=\"com.google.android.wearable\"\n            android:required=\"true\" />\n\n        <!--\n               Set to true if your app is Standalone, that is, it does not require the handheld\n               app to run.\n        -->\n        <meta-data\n            android:name=\"com.google.android.wearable.standalone\"\n            android:value=\"true\" />\n\n        <activity\n            android:name=\".presentation.MainActivity\"\n            android:exported=\"true\"\n            android:taskAffinity=\"\"\n            android:theme=\"@style/MainActivityTheme.Starting\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/assets/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/wear/os/presentation/HomeScreen.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation\n\nimport android.Manifest\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport androidx.compose.foundation.background\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.rememberCoroutineScope\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.text.style.TextAlign\nimport androidx.compose.ui.unit.dp\nimport androidx.core.app.ActivityCompat\nimport androidx.wear.compose.material.Button\nimport androidx.wear.compose.material.MaterialTheme\nimport androidx.wear.compose.material.Text\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation.theme.SherpaOnnxSimulateStreamingAsrWearOsTheme\nimport kotlinx.coroutines.CoroutineScope\nimport kotlinx.coroutines.Dispatchers\nimport kotlinx.coroutines.channels.Channel\nimport kotlinx.coroutines.launch\n\n\nprivate var audioRecord: AudioRecord? 
= null\n\nprivate const val sampleRateInHz = 16000\nprivate var samplesChannel = Channel<FloatArray>(capacity = Channel.UNLIMITED)\n\n@Composable\nfun HomeScreen() {\n    val activity = LocalContext.current as Activity\n\n    var firstTime by remember { mutableStateOf(true) }\n    var isStarted by remember { mutableStateOf(false) }\n    var result by remember { mutableStateOf(\"\") }\n\n    val coroutineScope = rememberCoroutineScope()\n\n    val onButtonClick: () -> Unit = {\n        firstTime = false\n        isStarted = !isStarted\n\n\n        if (isStarted) {\n            if (ActivityCompat.checkSelfPermission(\n                    activity, Manifest.permission.RECORD_AUDIO\n                ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                // recording is allowed\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n                audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                SimulateStreamingAsr.vad.reset()\n\n                result = \"Started! 
Please speak\"\n\n                CoroutineScope(Dispatchers.IO).launch {\n                    Log.i(TAG, \"processing samples\")\n                    val interval = 0.2 // i.e., 200 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n\n                    audioRecord?.let { it ->\n                        it.startRecording()\n\n                        while (isStarted) {\n                            val ret = audioRecord?.read(buffer, 0, buffer.size)\n                            ret?.let { n ->\n                                val samples = FloatArray(n) { buffer[it] / 32768.0f }\n                                samplesChannel.send(samples)\n                            }\n                        }\n                        val samples = FloatArray(0)\n                        samplesChannel.send(samples)\n                    }\n                }\n\n                CoroutineScope(Dispatchers.Default).launch {\n                    var buffer = arrayListOf<Float>()\n                    var offset = 0\n                    val windowSize = 512 // change it for ten-vad\n\n                    while (isStarted) {\n                        for (s in samplesChannel) {\n                            if (s.isEmpty()) {\n                                break\n                            }\n\n                            buffer.addAll(s.toList())\n                            while (offset + windowSize < buffer.size) {\n                                SimulateStreamingAsr.vad.acceptWaveform(\n                                    buffer.subList(\n                                        offset, offset + windowSize\n                                    ).toFloatArray()\n                                )\n\n                                offset += windowSize\n                            }\n\n                            while (!SimulateStreamingAsr.vad.empty()) {\n                                
val duration = SimulateStreamingAsr.vad.front().samples.count().toFloat() / sampleRateInHz\n\n                                val s0 = System.currentTimeMillis()\n                                val stream = SimulateStreamingAsr.recognizer.createStream()\n                                stream.acceptWaveform(\n                                    SimulateStreamingAsr.vad.front().samples,\n                                    sampleRateInHz\n                                )\n                                SimulateStreamingAsr.recognizer.decode(stream)\n\n                                val s1 = System.currentTimeMillis()\n                                val diff = (s1 - s0).toFloat() / 1000\n                                val rtf = diff / duration\n                                Log.i(TAG, \"rtf: ${rtf}, elapsed: ${diff}, duration: ${duration}\")\n                                val r = SimulateStreamingAsr.recognizer.getResult(stream)\n                                stream.release()\n\n                                Log.i(TAG, \"result: ${r.text}\")\n\n                                coroutineScope.launch {\n                                    result = r.text\n                                }\n\n                                SimulateStreamingAsr.vad.pop()\n                                buffer = arrayListOf()\n                                offset = 0\n                            }\n                        }\n                    }\n                }\n            }\n        } else {\n            audioRecord?.stop()\n            audioRecord?.release()\n            audioRecord = null\n\n            result = \"Click Start and speak\"\n        }\n    }\n\n    SherpaOnnxSimulateStreamingAsrWearOsTheme {\n        Box(\n            modifier = Modifier\n                .fillMaxSize()\n                .background(MaterialTheme.colors.background),\n            contentAlignment = Alignment.Center\n        ) {\n            Column(\n                horizontalAlignment = 
Alignment.CenterHorizontally\n            ) {\n                Spacer(modifier = Modifier.height(16.dp))\n                if (firstTime) {\n                    ShowMessage()\n                } else {\n                    ShowResult(result)\n                }\n\n                Spacer(modifier = Modifier.height(32.dp))\n\n                Button(\n                    onClick = onButtonClick\n                ) {\n                    if (isStarted) {\n                        Text(\"Stop\")\n                    } else {\n                        Text(\"Start\")\n                    }\n                }\n            }\n        }\n    }\n\n}\n\n@Composable\nfun ShowMessage() {\n    val msg = \"Real-time\\nspeech recognition\\nwith\\nNext-gen Kaldi\"\n    Text(\n        modifier = Modifier.fillMaxWidth(),\n        textAlign = TextAlign.Center,\n        color = MaterialTheme.colors.primary,\n        text = msg,\n    )\n}\n\n@Composable\nfun ShowResult(result: String) {\n    var msg: String = result\n    if (msg.length > 10) {\n        val n = 5\n        val first = result.take(n)\n        val last = result.takeLast(result.length - n)\n        msg = \"${first}\\n${last}\"\n    }\n    Text(\n        modifier = Modifier.fillMaxWidth(),\n        textAlign = TextAlign.Center,\n        color = MaterialTheme.colors.primary,\n        text = msg,\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/wear/os/presentation/MainActivity.kt",
    "content": "/* While this template provides a good starting point for using Wear Compose, you can always\n * take a look at https://github.com/android/wear-os-samples/tree/main/ComposeStarter and\n * https://github.com/android/wear-os-samples/tree/main/ComposeAdvanced to find the most up to date\n * changes to the libraries and their usages.\n */\n\npackage com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.compose.foundation.background\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.res.stringResource\nimport androidx.compose.ui.text.style.TextAlign\nimport androidx.compose.ui.tooling.preview.Devices\nimport androidx.compose.ui.tooling.preview.Preview\nimport androidx.core.app.ActivityCompat\nimport androidx.core.splashscreen.SplashScreen.Companion.installSplashScreen\nimport androidx.wear.compose.material.MaterialTheme\nimport androidx.wear.compose.material.Text\nimport androidx.wear.compose.material.TimeText\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.R\nimport com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation.theme.SherpaOnnxSimulateStreamingAsrWearOsTheme\n\nconst val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        installSplashScreen()\n\n        super.onCreate(savedInstanceState)\n\n        setTheme(android.R.style.Theme_DeviceDefault)\n\n        setContent {\n            WearApp(\"Android\")\n        }\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n        SimulateStreamingAsr.initOfflineRecognizer(this.assets, this.application)\n        SimulateStreamingAsr.initVad(this.assets)\n    }\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n\n        // grantResults is empty if the permission request was cancelled\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs access to the microphone\",\n                Toast.LENGTH_SHORT\n            )\n                .show()\n            finish()\n            // Do not fall through to the \"permitted\" log below\n            return\n        }\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@Composable\nfun WearApp(greetingName: String) {\n    HomeScreen()\n}\n\n@Composable\nfun Greeting(greetingName: String) {\n    Text(\n        modifier = Modifier.fillMaxWidth(),\n        textAlign = TextAlign.Center,\n        color = MaterialTheme.colors.primary,\n        text = stringResource(R.string.hello_world, greetingName)\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/wear/os/presentation/SimulateStreamingAsr.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation\n\nimport android.app.Application\nimport android.content.res.AssetManager\nimport android.util.Log\nimport com.k2fsa.sherpa.onnx.HomophoneReplacerConfig\nimport com.k2fsa.sherpa.onnx.OfflineRecognizer\nimport com.k2fsa.sherpa.onnx.OfflineRecognizerConfig\nimport com.k2fsa.sherpa.onnx.Vad\nimport com.k2fsa.sherpa.onnx.getOfflineModelConfig\nimport com.k2fsa.sherpa.onnx.getVadModelConfig\nimport java.io.File\nimport java.io.FileOutputStream\nimport java.io.IOException\n\n\nobject SimulateStreamingAsr {\n    private var _recognizer: OfflineRecognizer? = null\n    val recognizer: OfflineRecognizer\n        get() {\n            return _recognizer!!\n        }\n\n    private var _vad: Vad? = null\n    val vad: Vad\n        get() {\n            return _vad!!\n        }\n\n    fun initOfflineRecognizer(assetManager: AssetManager? = null, application: Application) {\n        synchronized(this) {\n            if (_recognizer != null) {\n                return\n            }\n            Log.i(TAG, \"Initializing sherpa-onnx offline recognizer\")\n            // Please change getOfflineModelConfig() to add new models\n            // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n            // for a list of available models\n            val asrModelType = 39\n            val asrRuleFsts: String?\n            asrRuleFsts = null\n            Log.i(TAG, \"Select model type $asrModelType for ASR\")\n\n            val useHr = false\n            val hr = HomophoneReplacerConfig(\n                // Used only when useHr is true\n                // Please download the following 2 files from\n                // https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n                //\n                // lexicon.txt can be shared by different apps\n                //\n                // replace.fst is specific for an app\n                lexicon = \"lexicon.txt\",\n  
              ruleFsts = \"replace.fst\",\n            )\n\n            val config = OfflineRecognizerConfig(\n                modelConfig = getOfflineModelConfig(type = asrModelType)!!,\n            )\n\n            if (config.modelConfig.numThreads == 1) {\n                config.modelConfig.numThreads = 2\n            }\n            config.modelConfig.debug = true\n\n            if (asrRuleFsts != null) {\n                config.ruleFsts = asrRuleFsts\n            }\n\n            if (useHr) {\n                config.hr = hr\n            }\n\n            _recognizer = OfflineRecognizer(\n                assetManager = assetManager,\n                config = config,\n            )\n\n            Log.i(TAG, \"sherpa-onnx offline recognizer initialized\")\n        }\n    }\n\n    fun initVad(assetManager: AssetManager? = null) {\n        if (_vad != null) {\n            return\n        }\n        val type = 0\n        Log.i(TAG, \"Select VAD model type $type\")\n        val config = getVadModelConfig(type)\n\n        _vad = Vad(\n            assetManager = assetManager,\n            config = config!!,\n        )\n        Log.i(TAG, \"sherpa-onnx vad initialized\")\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/wear/os/presentation/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.simulate.streaming.asr.wear.os.presentation.theme\n\nimport androidx.compose.runtime.Composable\nimport androidx.wear.compose.material.MaterialTheme\n\n@Composable\nfun SherpaOnnxSimulateStreamingAsrWearOsTheme(\n    content: @Composable () -> Unit\n) {\n    /**\n     * Empty theme to customize for your app.\n     * See: https://developer.android.com/jetpack/compose/designsystems/custom\n     */\n    MaterialTheme(\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/res/drawable/splash_icon.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item\n        android:width=\"48dp\"\n        android:height=\"48dp\"\n        android:gravity=\"center\">\n        <shape android:shape=\"oval\">\n            <solid android:color=\"#FFFFFF\" />\n        </shape>\n    </item>\n    <item\n        android:width=\"40dp\"\n        android:height=\"40dp\"\n        android:gravity=\"center\">\n        <vector\n            android:width=\"24dp\"\n            android:height=\"24dp\"\n            android:tint=\"#000000\"\n            android:viewportWidth=\"24\"\n            android:viewportHeight=\"24\">\n            <path\n                android:fillColor=\"#FF000000\"\n                android:pathData=\"M17.6,11.48 L19.44,8.3a0.63,0.63 0,0 0,-1.09 -0.63l-1.88,3.24a11.43,11.43 0,0 0,-8.94 0L5.65,7.67a0.63,0.63 0,0 0,-1.09 0.63L6.4,11.48A10.81,10.81 0,0 0,1 20L23,20A10.81,10.81 0,0 0,17.6 11.48ZM7,17.25A1.25,1.25 0,1 1,8.25 16,1.25 1.25,0 0,1 7,17.25ZM17,17.25A1.25,1.25 0,1 1,18.25 16,1.25 1.25,0 0,1 17,17.25Z\" />\n        </vector>\n    </item>\n</layer-list>\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">SherpaOnnxSimulateStreamingAsrWearOs</string>\n    <!--\n    This string is used for square devices and overridden by hello_world in\n    values-round/strings.xml for round devices.\n    -->\n    <string name=\"hello_world\">From the Square world,\\nHello, %1$s!</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/res/values/styles.xml",
    "content": "<resources>\n\n    <style name=\"MainActivityTheme.Starting\" parent=\"Theme.SplashScreen\">\n        <item name=\"windowSplashScreenBackground\">@android:color/black</item>\n        <item name=\"windowSplashScreenAnimatedIcon\">@drawable/splash_icon</item>\n        <item name=\"postSplashScreenTheme\">@android:style/Theme.DeviceDefault</item>\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/app/src/main/res/values-round/strings.xml",
    "content": "<resources>\n    <string name=\"hello_world\">From the Round world,\\nHello, %1$s!</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    alias(libs.plugins.android.application) apply false\n    alias(libs.plugins.jetbrains.kotlin.android) apply false\n}"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/gradle/libs.versions.toml",
    "content": "[versions]\nagp = \"8.4.0\"\nkotlin = \"1.9.0\"\nplayServicesWearable = \"18.0.0\"\ncomposeBom = \"2023.08.00\"\ncomposeMaterial = \"1.2.1\"\ncomposeFoundation = \"1.2.1\"\nactivityCompose = \"1.7.2\"\ncoreSplashscreen = \"1.0.1\"\n\n[libraries]\nplay-services-wearable = { group = \"com.google.android.gms\", name = \"play-services-wearable\", version.ref = \"playServicesWearable\" }\ncompose-bom = { group = \"androidx.compose\", name = \"compose-bom\", version.ref = \"composeBom\" }\nui = { group = \"androidx.compose.ui\", name = \"ui\" }\nui-tooling-preview = { group = \"androidx.compose.ui\", name = \"ui-tooling-preview\" }\nui-tooling = { group = \"androidx.compose.ui\", name = \"ui-tooling\" }\nui-test-manifest = { group = \"androidx.compose.ui\", name = \"ui-test-manifest\" }\nui-test-junit4 = { group = \"androidx.compose.ui\", name = \"ui-test-junit4\" }\ncompose-material = { group = \"androidx.wear.compose\", name = \"compose-material\", version.ref = \"composeMaterial\" }\ncompose-foundation = { group = \"androidx.wear.compose\", name = \"compose-foundation\", version.ref = \"composeFoundation\" }\nactivity-compose = { group = \"androidx.activity\", name = \"activity-compose\", version.ref = \"activityCompose\" }\ncore-splashscreen = { group = \"androidx.core\", name = \"core-splashscreen\", version.ref = \"coreSplashscreen\" }\n\n[plugins]\nandroid-application = { id = \"com.android.application\", version.ref = \"agp\" }\njetbrains-kotlin-android = { id = \"org.jetbrains.kotlin.android\", version.ref = \"kotlin\" }\n\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Tue Jul 15 18:18:24 CST 2025\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.6-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. For more details, visit\n# https://developer.android.com/r/tools/gradle-multi-project-decoupled-projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxSimulateStreamingAsrWearOs/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google {\n            content {\n                includeGroupByRegex(\"com\\\\.android.*\")\n                includeGroupByRegex(\"com\\\\.google.*\")\n                includeGroupByRegex(\"androidx.*\")\n            }\n        }\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n        maven { url = uri(\"https://jitpack.io\") }\n    }\n}\n\nrootProject.name = \"SherpaOnnxSimulateStreamingAsrWearOs\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/build.gradle.kts",
    "content": "plugins {\n    alias(libs.plugins.android.application)\n    alias(libs.plugins.jetbrains.kotlin.android)\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.speaker.diarization\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.speaker.diarization\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(libs.androidx.core.ktx)\n    implementation(libs.androidx.lifecycle.runtime.ktx)\n    implementation(libs.androidx.activity.compose)\n    implementation(platform(libs.androidx.compose.bom))\n    implementation(libs.androidx.ui)\n    implementation(libs.androidx.ui.graphics)\n    implementation(libs.androidx.ui.tooling.preview)\n    implementation(libs.androidx.material3)\n    implementation(libs.androidx.navigation.compose)\n    implementation(libs.androidx.documentfile)\n    testImplementation(libs.junit)\n    androidTestImplementation(libs.androidx.junit)\n    androidTestImplementation(libs.androidx.espresso.core)\n    
androidTestImplementation(platform(libs.androidx.compose.bom))\n    androidTestImplementation(libs.androidx.ui.test.junit4)\n    debugImplementation(libs.androidx.ui.tooling)\n    debugImplementation(libs.androidx.ui.test.manifest)\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/androidTest/java/com/k2fsa/sherpa/onnx/speaker/diarization/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.speaker.diarization\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission\n        android:name=\"android.permission.READ_EXTERNAL_STORAGE\"\n        android:maxSdkVersion=\"32\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxSpeakerDiarization\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SherpaOnnxSpeakerDiarization\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/BarItem.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport androidx.compose.ui.graphics.vector.ImageVector\n\ndata class BarItem(\n    val title: String,\n\n    // see https://www.composables.com/icons\n    // and\n    // https://developer.android.com/reference/kotlin/androidx/compose/material/icons/filled/package-summary\n    val image: ImageVector,\n    val route: String,\n)"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport android.os.Bundle\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.activity.enableEdgeToEdge\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.CenterAlignedTopAppBar\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.Icon\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.NavigationBar\nimport androidx.compose.material3.NavigationBarItem\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBarDefaults\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.tooling.preview.Preview\nimport androidx.navigation.NavGraph.Companion.findStartDestination\nimport androidx.navigation.NavHostController\nimport androidx.navigation.compose.NavHost\nimport androidx.navigation.compose.composable\nimport androidx.navigation.compose.currentBackStackEntryAsState\nimport androidx.navigation.compose.rememberNavController\nimport com.k2fsa.sherpa.onnx.speaker.diarization.screens.HelpScreen\nimport com.k2fsa.sherpa.onnx.speaker.diarization.screens.HomeScreen\nimport com.k2fsa.sherpa.onnx.speaker.diarization.ui.theme.SherpaOnnxSpeakerDiarizationTheme\n\nconst val TAG = \"sherpa-onnx-sd\"\n\nclass MainActivity : ComponentActivity() {\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        enableEdgeToEdge()\n        setContent {\n            SherpaOnnxSpeakerDiarizationTheme {\n                // A surface container using the 'background' color from the theme\n                Surface(\n                    modifier = Modifier.fillMaxSize(),\n                    color = MaterialTheme.colorScheme.background\n                ) {\n                    MainScreen()\n                }\n            }\n        }\n        SpeakerDiarizationObject.initSpeakerDiarization(this.assets)\n    }\n}\n\n@OptIn(ExperimentalMaterial3Api::class)\n@Composable\nfun MainScreen(modifier: Modifier = Modifier) {\n    val navController = rememberNavController()\n    Scaffold(\n        topBar = {\n            CenterAlignedTopAppBar(\n                colors = TopAppBarDefaults.topAppBarColors(\n                    containerColor = MaterialTheme.colorScheme.primaryContainer,\n                    titleContentColor = MaterialTheme.colorScheme.primary,\n                ),\n                title = {\n                    Text(\n                        \"Next-gen Kaldi: Speaker Diarization\",\n                        fontWeight = FontWeight.Bold,\n                    )\n                },\n            )\n        },\n        content = { padding ->\n            Column(Modifier.padding(padding)) {\n                NavigationHost(navController = navController)\n\n            }\n        },\n        bottomBar = {\n            BottomNavigationBar(navController = navController)\n        }\n    )\n}\n\n@Composable\nfun NavigationHost(navController: NavHostController) {\n    NavHost(navController = navController, startDestination = NavRoutes.Home.route) {\n        composable(NavRoutes.Home.route) {\n            HomeScreen()\n        }\n\n        composable(NavRoutes.Help.route) {\n            HelpScreen()\n        }\n    }\n}\n\n@Composable\nfun BottomNavigationBar(navController: NavHostController) {\n    NavigationBar {\n        val 
backStackEntry by navController.currentBackStackEntryAsState()\n        val currentRoute = backStackEntry?.destination?.route\n\n        NavBarItems.BarItems.forEach { navItem ->\n            NavigationBarItem(selected = currentRoute == navItem.route,\n                onClick = {\n                    navController.navigate(navItem.route) {\n                        popUpTo(navController.graph.findStartDestination().id) {\n                            saveState = true\n                        }\n                        launchSingleTop = true\n                        restoreState = true\n                    }\n                },\n                icon = {\n                    Icon(imageVector = navItem.image, contentDescription = navItem.title)\n                }, label = {\n                    Text(text = navItem.title)\n                })\n        }\n    }\n}\n\n@Preview(showBackground = true)\n@Composable\nfun MainScreenPreview() {\n    SherpaOnnxSpeakerDiarizationTheme {\n        MainScreen()\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/NavBarItems.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport androidx.compose.material.icons.Icons\nimport androidx.compose.material.icons.filled.Home\nimport androidx.compose.material.icons.filled.Info\n\nobject NavBarItems {\n    val BarItems = listOf(\n        BarItem(\n            title = \"Home\",\n            image = Icons.Filled.Home,\n            route = \"home\",\n        ),\n        BarItem(\n            title = \"Help\",\n            image = Icons.Filled.Info,\n            route = \"help\",\n        ),\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/NavRoutes.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nsealed class NavRoutes(val route: String) {\n    object Home : NavRoutes(\"home\")\n    object Help : NavRoutes(\"help\")\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/ReadWaveFile.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.screens\n\nimport android.content.Context\nimport android.media.AudioFormat\nimport android.media.MediaCodec\nimport android.media.MediaExtractor\nimport android.media.MediaFormat\nimport android.net.Uri\n\ndata class WaveData(\n    val sampleRate: Int? = null,\n    val samples: FloatArray? = null,\n    val msg: String? = null\n)\n\n// It supports only 16-bit encoded wave files\n//\n// References\n// - https://gist.github.com/a-m-s/1991ab18fbcb0fcc2cf9\n// - https://github.com/taehwandev/MediaCodecExample/blob/master/app/src/main/java/tech/thdev/mediacodecexample/audio/AACAudioDecoderThread.kt\nfun readUri(context: Context, uri: Uri): WaveData {\n    val extractor = MediaExtractor()\n    extractor.setDataSource(context, uri, null)\n\n    val samplesList: MutableList<FloatArray> = ArrayList()\n\n    for (i in 0 until extractor.trackCount) {\n        val format = extractor.getTrackFormat(i)\n        val mime = format.getString(MediaFormat.KEY_MIME)\n        if (mime?.startsWith(\"audio/\") == true) {\n            extractor.selectTrack(i)\n\n            var encoding: Int = -1\n            try {\n                encoding = format.getInteger(MediaFormat.KEY_PCM_ENCODING)\n            } catch (_: Exception) {\n            }\n\n            if (encoding != AudioFormat.ENCODING_PCM_16BIT) {\n                return WaveData(msg = \"We support only 16-bit encoded wave files\")\n            }\n\n            val sampleRate = format.getInteger(MediaFormat.KEY_SAMPLE_RATE)\n            val decoder = MediaCodec.createDecoderByType(mime)\n            decoder.configure(format, null, null, 0)\n            decoder.start()\n\n            val inputBuffers = decoder.inputBuffers\n            var outputBuffers = decoder.outputBuffers\n\n            val info = MediaCodec.BufferInfo()\n            var eof = false\n\n            var outputBufferIndex = -1\n\n            while (true) {\n                if (!eof) {\n             
       val inputBufferIndex = decoder.dequeueInputBuffer(10000)\n                    if (inputBufferIndex >= 0) {\n                        val size = extractor.readSampleData(inputBuffers[inputBufferIndex], 0)\n                        if (size < 0) {\n                            decoder.queueInputBuffer(\n                                inputBufferIndex,\n                                0,\n                                0,\n                                0,\n                                MediaCodec.BUFFER_FLAG_END_OF_STREAM\n                            )\n                            eof = true\n                        } else {\n                            decoder.queueInputBuffer(\n                                inputBufferIndex,\n                                0,\n                                size,\n                                extractor.sampleTime,\n                                0\n                            )\n                            extractor.advance()\n                        }\n                    }\n                } // if (!eof)\n\n                if (outputBufferIndex >= 0) {\n                    outputBuffers[outputBufferIndex].position(0)\n                }\n\n                outputBufferIndex = decoder.dequeueOutputBuffer(info, 10000)\n                if (outputBufferIndex >= 0) {\n                    if (info.flags != 0) {\n                        decoder.stop()\n                        decoder.release()\n\n                        var k = 0\n                        for (s in samplesList) {\n                            k += s.size\n                        }\n                        if (k == 0) {\n                            return WaveData(msg = \"Failed to read selected file\")\n                        }\n\n                        val ans = FloatArray(k)\n                        k = 0\n                        for (s in samplesList) {\n                            s.copyInto(ans, k)\n                            k += s.size\n            
             }\n\n                        return WaveData(sampleRate = sampleRate, samples = ans)\n                    }\n\n                    val buffer = outputBuffers[outputBufferIndex]\n                    val chunk = ByteArray(info.size)\n                    buffer[chunk]\n                    buffer.clear()\n\n                    val numSamples = info.size / 2\n\n                    val samples = FloatArray(numSamples)\n                    for (k in 0 until numSamples) {\n                        // assume little endian: the low byte is unsigned,\n                        // the high byte carries the sign of the 16-bit sample\n                        val s = (chunk[2 * k].toInt() and 0xff) + (chunk[2 * k + 1] * 256.0f)\n\n                        samples[k] = s / 32768.0f\n                    }\n                    samplesList.add(samples)\n\n                    decoder.releaseOutputBuffer(outputBufferIndex, false)\n                } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {\n                    outputBuffers = decoder.outputBuffers\n                }\n            }\n        }\n    }\n\n    extractor.release()\n    return WaveData(msg = \"not an audio file\")\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/SpeakerDiarizationObject.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport android.content.res.AssetManager\nimport android.util.Log\nimport com.k2fsa.sherpa.onnx.FastClusteringConfig\nimport com.k2fsa.sherpa.onnx.OfflineSpeakerDiarization\nimport com.k2fsa.sherpa.onnx.OfflineSpeakerDiarizationConfig\nimport com.k2fsa.sherpa.onnx.OfflineSpeakerSegmentationModelConfig\nimport com.k2fsa.sherpa.onnx.OfflineSpeakerSegmentationPyannoteModelConfig\nimport com.k2fsa.sherpa.onnx.SpeakerEmbeddingExtractorConfig\n\n// Please download\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n// then unzip it, rename model.onnx to segmentation.onnx, and mv\n// segmentation.onnx to the assets folder\nval segmentationModel = \"segmentation.onnx\"\n\n// please download it from\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n// and rename it to embedding.onnx\n// and move it to the assets folder\nval embeddingModel = \"embedding.onnx\"\n\n// in the end, your assets folder should look like below\n/*\n(py38) fangjuns-MacBook-Pro:assets fangjun$ pwd\n/Users/fangjun/open-source/sherpa-onnx/android/SherpaOnnxSpeakerDiarization/app/src/main/assets\n(py38) fangjuns-MacBook-Pro:assets fangjun$ ls -lh\ntotal 89048\n-rw-r--r--  1 fangjun  staff    38M Oct 12 20:28 embedding.onnx\n-rw-r--r--  1 fangjun  staff   5.7M Oct 12 20:28 segmentation.onnx\n */\n\nobject SpeakerDiarizationObject {\n    var _sd: OfflineSpeakerDiarization? = null\n    val sd: OfflineSpeakerDiarization\n        get() {\n            return _sd!!\n        }\n\n    fun initSpeakerDiarization(assetManager: AssetManager? 
= null) {\n        synchronized(this) {\n            if (_sd != null) {\n                return\n            }\n            Log.i(TAG, \"Initializing sherpa-onnx speaker diarization\")\n\n            val config = OfflineSpeakerDiarizationConfig(\n                segmentation = OfflineSpeakerSegmentationModelConfig(\n                    pyannote = OfflineSpeakerSegmentationPyannoteModelConfig(\n                        segmentationModel\n                    ),\n                    debug = true,\n                ),\n                embedding = SpeakerEmbeddingExtractorConfig(\n                    model = embeddingModel,\n                    debug = true,\n                    numThreads = 2,\n                ),\n                clustering = FastClusteringConfig(numClusters = -1, threshold = 0.5f),\n                minDurationOn = 0.2f,\n                minDurationOff = 0.5f,\n            )\n            _sd = OfflineSpeakerDiarization(assetManager = assetManager, config = config)\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/screens/Help.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.screens\n\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\n\n@Composable\nfun HelpScreen() {\n    Box(modifier = Modifier.fillMaxSize()) {\n        Column(\n            modifier = Modifier.padding(8.dp)\n        ) {\n            Text(\n                \"This app accepts only 16kHz 16-bit 1-channel *.wav files. \" +\n                        \"It has two arguments: Number of speakers and clustering threshold. \" +\n                        \"If you know the actual number of speakers in the file, please set it. \" +\n                        \"Otherwise, please set it to 0. In that case, you have to set the threshold. \" +\n                        \"A larger threshold leads to fewer segmented speakers.\"\n            )\n            Spacer(modifier = Modifier.height(5.dp))\n            Text(\"The speaker segmentation model is from \" +\n                \"pyannote-audio (https://huggingface.co/pyannote/segmentation-3.0), \"+\n                 \"whereas the embedding extractor model is from 3D-Speaker (https://github.com/modelscope/3D-Speaker)\")\n            Spacer(modifier = Modifier.height(5.dp))\n            Text(\"Please see http://github.com/k2-fsa/sherpa-onnx \")\n            Spacer(modifier = Modifier.height(5.dp))\n            Text(\"Everything is open-sourced!\", fontSize = 20.sp)\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/screens/Home.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.screens\n\nimport android.util.Log\nimport androidx.activity.compose.rememberLauncherForActivityResult\nimport androidx.activity.result.contract.ActivityResultContracts\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.layout.size\nimport androidx.compose.foundation.rememberScrollState\nimport androidx.compose.foundation.verticalScroll\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.OutlinedTextField\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalClipboardManager\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.text.AnnotatedString\nimport androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\nimport androidx.documentfile.provider.DocumentFile\nimport com.k2fsa.sherpa.onnx.speaker.diarization.SpeakerDiarizationObject\nimport com.k2fsa.sherpa.onnx.speaker.diarization.TAG\nimport kotlin.concurrent.thread\n\n\nprivate var samples: FloatArray? 
= null\n\n@Composable\nfun HomeScreen() {\n    val context = LocalContext.current\n\n    var filename by remember { mutableStateOf(\"\") }\n    var status by remember { mutableStateOf(\"\") }\n    var progress by remember { mutableStateOf(\"\") }\n    val clipboardManager = LocalClipboardManager.current\n    var done by remember { mutableStateOf(false) }\n    var fileIsOk by remember { mutableStateOf(false) }\n    var started by remember { mutableStateOf(false) }\n    var numSpeakers by remember { mutableStateOf(0) }\n    var threshold by remember { mutableStateOf(0.5f) }\n\n\n    // Progress callback invoked from the diarization thread; returns 0 on success.\n    val callback = here@{ numProcessedChunks: Int, numTotalChunks: Int, arg: Long ->\n        val percent = 100.0 * numProcessedChunks / numTotalChunks\n        progress = \"%.2f%%\".format(percent)\n        Log.i(TAG, progress)\n        return@here 0\n    }\n\n    val launcher = rememberLauncherForActivityResult(ActivityResultContracts.OpenDocument()) {\n        it?.let {\n            val documentFile = DocumentFile.fromSingleUri(context, it)\n            filename = documentFile?.name ?: \"\"\n\n            progress = \"\"\n            done = false\n            fileIsOk = false\n\n            if (filename.isNotEmpty()) {\n                val data = readUri(context, it)\n                Log.i(TAG, \"sample rate: ${data.sampleRate}\")\n                Log.i(TAG, \"numSamples: ${data.samples?.size ?: 0}\")\n                if (data.msg != null) {\n                    Log.i(TAG, \"failed to read $filename\")\n                    status = data.msg\n                } else if (data.sampleRate != SpeakerDiarizationObject.sd.sampleRate()) {\n                    status =\n                        \"Expected sample rate: ${SpeakerDiarizationObject.sd.sampleRate()}. 
Given wave file with sample rate: ${data.sampleRate}\"\n                } else {\n                    samples = data.samples!!\n                    fileIsOk = true\n                }\n            }\n        }\n    }\n\n    Column(\n        modifier = Modifier.padding(10.dp),\n        verticalArrangement = Arrangement.Top,\n    ) {\n        Row(\n            modifier = Modifier.fillMaxWidth(),\n            horizontalArrangement = Arrangement.SpaceEvenly,\n            verticalAlignment = Alignment.CenterVertically\n        ) {\n\n            Button(onClick = {\n                launcher.launch(arrayOf(\"audio/*\"))\n            }) {\n                Text(\"Select a .wav file\")\n            }\n\n            Button(enabled = fileIsOk && !started,\n                onClick = {\n                    Log.i(TAG, \"started\")\n                    Log.i(TAG, \"num samples: ${samples?.size}\")\n                    started = true\n                    progress = \"\"\n\n                    val config = SpeakerDiarizationObject.sd.config\n                    config.clustering.numClusters = numSpeakers\n                    config.clustering.threshold = threshold\n\n                    SpeakerDiarizationObject.sd.setConfig(config)\n\n                    thread(true) {\n                        done = false\n                        status = \"Started! 
Please wait\"\n                        val segments = SpeakerDiarizationObject.sd.processWithCallback(\n                            samples!!,\n                            callback = callback,\n                        )\n                        done = true\n                        started = false\n                        status = \"\"\n                        for (s in segments) {\n                            val start = \"%.2f\".format(s.start)\n                            val end = \"%.2f\".format(s.end)\n                            val speaker = \"speaker_%02d\".format(s.speaker)\n                            status += \"$start -- $end $speaker\\n\"\n                            Log.i(TAG, \"$start -- $end $speaker\")\n                        }\n\n                        Log.i(TAG, status)\n                    }\n                }) {\n                Text(\"Start\")\n            }\n            if (progress.isNotEmpty()) {\n                Text(progress, fontSize = 25.sp)\n            }\n        }\n\n        Row(\n            modifier = Modifier.fillMaxWidth(),\n            horizontalArrangement = Arrangement.SpaceEvenly,\n            verticalAlignment = Alignment.CenterVertically\n        ) {\n            OutlinedTextField(\n                value = numSpeakers.toString(),\n                onValueChange = {\n                    if (it.isEmpty() || it.isBlank()) {\n                        numSpeakers = 0\n                    } else {\n                        numSpeakers = it.toIntOrNull() ?: 0\n                    }\n                },\n                label = {\n                    Text(\"Number of Speakers\")\n                },\n            )\n        }\n\n        Row(\n            modifier = Modifier.fillMaxWidth(),\n            horizontalArrangement = Arrangement.SpaceEvenly,\n            verticalAlignment = Alignment.CenterVertically\n        ) {\n            OutlinedTextField(\n                value = threshold.toString(),\n                onValueChange = {\n 
                   if (it.isEmpty() || it.isBlank()) {\n                        threshold = 0.5f\n                    } else {\n                        threshold = it.toFloatOrNull() ?: 0.5f\n                    }\n                },\n                label = {\n                    Text(\"Clustering threshold\")\n                },\n            )\n        }\n\n        if (filename.isNotEmpty()) {\n            Text(text = \"Selected $filename\")\n            Spacer(Modifier.size(20.dp))\n        }\n\n        if (done) {\n            Button(onClick = {\n                clipboardManager.setText(AnnotatedString(status))\n                progress = \"Copied!\"\n            }) {\n                Text(\"Copy result\")\n            }\n            Spacer(Modifier.size(20.dp))\n        }\n\n        if (status.isNotEmpty()) {\n            Text(\n                status,\n                modifier = Modifier.verticalScroll(rememberScrollState()),\n            )\n        }\n\n\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.platform.LocalContext\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SherpaOnnxSpeakerDiarizationTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/diarization/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">SherpaOnnxSpeakerDiarization</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SherpaOnnxSpeakerDiarization\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/app/src/test/java/com/k2fsa/sherpa/onnx/speaker/diarization/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.diarization\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    alias(libs.plugins.android.application) apply false\n    alias(libs.plugins.jetbrains.kotlin.android) apply false\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/gradle/libs.versions.toml",
    "content": "[versions]\nagp = \"8.4.0\"\nkotlin = \"1.9.0\"\ncoreKtx = \"1.10.1\"\njunit = \"4.13.2\"\njunitVersion = \"1.1.5\"\nespressoCore = \"3.5.1\"\nlifecycleRuntimeKtx = \"2.6.1\"\nactivityCompose = \"1.8.0\"\ncomposeBom = \"2023.08.00\"\nnavigationCompose = \"2.8.2\"\ndocumentfile = \"1.0.1\"\n\n[libraries]\nandroidx-core-ktx = { group = \"androidx.core\", name = \"core-ktx\", version.ref = \"coreKtx\" }\njunit = { group = \"junit\", name = \"junit\", version.ref = \"junit\" }\nandroidx-junit = { group = \"androidx.test.ext\", name = \"junit\", version.ref = \"junitVersion\" }\nandroidx-espresso-core = { group = \"androidx.test.espresso\", name = \"espresso-core\", version.ref = \"espressoCore\" }\nandroidx-lifecycle-runtime-ktx = { group = \"androidx.lifecycle\", name = \"lifecycle-runtime-ktx\", version.ref = \"lifecycleRuntimeKtx\" }\nandroidx-activity-compose = { group = \"androidx.activity\", name = \"activity-compose\", version.ref = \"activityCompose\" }\nandroidx-compose-bom = { group = \"androidx.compose\", name = \"compose-bom\", version.ref = \"composeBom\" }\nandroidx-ui = { group = \"androidx.compose.ui\", name = \"ui\" }\nandroidx-ui-graphics = { group = \"androidx.compose.ui\", name = \"ui-graphics\" }\nandroidx-ui-tooling = { group = \"androidx.compose.ui\", name = \"ui-tooling\" }\nandroidx-ui-tooling-preview = { group = \"androidx.compose.ui\", name = \"ui-tooling-preview\" }\nandroidx-ui-test-manifest = { group = \"androidx.compose.ui\", name = \"ui-test-manifest\" }\nandroidx-ui-test-junit4 = { group = \"androidx.compose.ui\", name = \"ui-test-junit4\" }\nandroidx-material3 = { group = \"androidx.compose.material3\", name = \"material3\" }\nandroidx-navigation-compose = { group = \"androidx.navigation\", name = \"navigation-compose\", version.ref = \"navigationCompose\" }\nandroidx-documentfile = { group = \"androidx.documentfile\", name = \"documentfile\", version.ref = \"documentfile\" }\n\n[plugins]\nandroid-application = { id = 
\"com.android.application\", version.ref = \"agp\" }\njetbrains-kotlin-android = { id = \"org.jetbrains.kotlin.android\", version.ref = \"kotlin\" }\n\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sat Oct 12 14:27:04 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.6-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. For more details, visit\n# https://developer.android.com/r/tools/gradle-multi-project-decoupled-projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerDiarization/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google {\n            content {\n                includeGroupByRegex(\"com\\\\.android.*\")\n                includeGroupByRegex(\"com\\\\.google.*\")\n                includeGroupByRegex(\"androidx.*\")\n            }\n        }\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxSpeakerDiarization\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/build.gradle.kts",
    "content": "plugins {\n    id(\"com.android.application\")\n    id(\"org.jetbrains.kotlin.android\")\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.speaker.identification\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.speaker.identification\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(\"androidx.core:core-ktx:1.12.0\")\n    implementation(\"androidx.lifecycle:lifecycle-runtime-ktx:2.7.0\")\n    implementation(\"androidx.activity:activity-compose:1.8.2\")\n    implementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    implementation(\"androidx.compose.ui:ui\")\n    implementation(\"androidx.compose.ui:ui-graphics\")\n    implementation(\"androidx.compose.ui:ui-tooling-preview\")\n    implementation(\"androidx.compose.material3:material3\")\n    implementation(\"androidx.navigation:navigation-compose:2.7.6\")\n    testImplementation(\"junit:junit:4.13.2\")\n    androidTestImplementation(\"androidx.test.ext:junit:1.1.5\")\n    
androidTestImplementation(\"androidx.test.espresso:espresso-core:3.5.1\")\n    androidTestImplementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    androidTestImplementation(\"androidx.compose.ui:ui-test-junit4\")\n    debugImplementation(\"androidx.compose.ui:ui-tooling\")\n    debugImplementation(\"androidx.compose.ui:ui-test-manifest\")\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/androidTest/java/com/k2fsa/sherpa/onnx/speaker/identification/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.speaker.identification\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxSpeakerIdentification\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SherpaOnnxSpeakerIdentification\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/BarItem.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nimport androidx.compose.ui.graphics.vector.ImageVector\n\ndata class BarItem(\n    val title: String,\n\n    // see https://www.composables.com/icons\n    // and\n    // https://developer.android.com/reference/kotlin/androidx/compose/material/icons/filled/package-summary\n    val image: ImageVector,\n    val route: String,\n)"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.CenterAlignedTopAppBar\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.Icon\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.NavigationBar\nimport androidx.compose.material3.NavigationBarItem\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBarDefaults\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.tooling.preview.Preview\nimport androidx.core.app.ActivityCompat\nimport androidx.navigation.NavGraph.Companion.findStartDestination\nimport androidx.navigation.NavHostController\nimport androidx.navigation.compose.NavHost\nimport androidx.navigation.compose.composable\nimport androidx.navigation.compose.currentBackStackEntryAsState\nimport androidx.navigation.compose.rememberNavController\nimport com.k2fsa.sherpa.onnx.SpeakerRecognition\nimport com.k2fsa.sherpa.onnx.speaker.identification.screens.HelpScreen\nimport com.k2fsa.sherpa.onnx.speaker.identification.screens.HomeScreen\nimport com.k2fsa.sherpa.onnx.speaker.identification.screens.RegisterScreen\nimport com.k2fsa.sherpa.onnx.speaker.identification.screens.ViewScreen\nimport com.k2fsa.sherpa.onnx.speaker.identification.ui.theme.SherpaOnnxSpeakerIdentificationTheme\n\nconst val 
TAG = \"sherpa-onnx-speaker\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        setContent {\n            SherpaOnnxSpeakerIdentificationTheme {\n                // A surface container using the 'background' color from the theme\n                Surface(\n                    modifier = Modifier.fillMaxSize(),\n                    color = MaterialTheme.colorScheme.background\n                ) {\n                    MainScreen()\n                }\n            }\n        }\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        SpeakerRecognition.initExtractor(this.assets)\n    }\n\n    @Deprecated(\"Deprecated in Java\")\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        // grantResults is empty when the permission request is cancelled\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs access to the microphone\",\n                Toast.LENGTH_SHORT\n            )\n                .show()\n            finish()\n            // Return early so the success message below is not logged after a denial\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@OptIn(ExperimentalMaterial3Api::class)\n@Composable\nfun MainScreen(modifier: Modifier = Modifier) {\n    val navController = rememberNavController()\n\n    Scaffold(\n        topBar = {\n            CenterAlignedTopAppBar(\n   
             colors = TopAppBarDefaults.topAppBarColors(\n                    containerColor = MaterialTheme.colorScheme.primaryContainer,\n                    titleContentColor = MaterialTheme.colorScheme.primary,\n                ),\n                title = {\n                    Text(\n                        \"Next-gen Kaldi: Speaker Identification\",\n                        fontWeight = FontWeight.Bold,\n                    )\n                },\n            )\n        },\n        content = { padding ->\n            Column(Modifier.padding(padding)) {\n                NavigationHost(navController = navController)\n\n            }\n        },\n        bottomBar = {\n            BottomNavigationBar(navController = navController)\n        }\n    )\n}\n\n@Composable\nfun NavigationHost(navController: NavHostController) {\n    NavHost(navController = navController, startDestination = NavRoutes.Home.route) {\n        composable(NavRoutes.Home.route) {\n            HomeScreen()\n        }\n\n        composable(NavRoutes.Register.route) {\n            RegisterScreen()\n        }\n\n        composable(NavRoutes.View.route) {\n            ViewScreen()\n        }\n\n        composable(NavRoutes.Help.route) {\n            HelpScreen()\n        }\n    }\n}\n\n@Composable\nfun BottomNavigationBar(navController: NavHostController) {\n    NavigationBar {\n        val backStackEntry by navController.currentBackStackEntryAsState()\n        val currentRoute = backStackEntry?.destination?.route\n\n        NavBarItems.BarItems.forEach { navItem ->\n            NavigationBarItem(selected = currentRoute == navItem.route,\n                onClick = {\n                    navController.navigate(navItem.route) {\n                        popUpTo(navController.graph.findStartDestination().id) {\n                            saveState = true\n                        }\n                        launchSingleTop = true\n                        restoreState = true\n                    }\n      
          },\n                icon = {\n                    Icon(imageVector = navItem.image, contentDescription = navItem.title)\n                }, label = {\n                    Text(text = navItem.title)\n                })\n        }\n    }\n}\n\n@Preview(showBackground = true)\n@Composable\nfun MainScreenPreview() {\n    SherpaOnnxSpeakerIdentificationTheme {\n        MainScreen()\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/NavBarItems.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nimport androidx.compose.material.icons.Icons\nimport androidx.compose.material.icons.filled.AccountCircle\nimport androidx.compose.material.icons.filled.Add\nimport androidx.compose.material.icons.filled.Home\nimport androidx.compose.material.icons.filled.Info\n\n\nobject NavBarItems {\n    val BarItems = listOf(\n        BarItem(\n            title = \"Home\",\n            image = Icons.Filled.Home,\n            route = \"home\",\n        ),\n        BarItem(\n            title = \"Register\",\n            image = Icons.Filled.Add,\n            route = \"register\",\n        ),\n        BarItem(\n            title = \"View\",\n            image = Icons.Filled.AccountCircle,\n            route = \"view\",\n        ),\n        BarItem(\n            title = \"Help\",\n            image = Icons.Filled.Info,\n            route = \"help\",\n        ),\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/NavRoutes.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nsealed class NavRoutes(val route: String) {\n    object Home : NavRoutes(\"home\")\n    object Register : NavRoutes(\"register\")\n    object View : NavRoutes(\"view\")\n    object Help : NavRoutes(\"help\")\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/screens/Help.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.screens\n\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.unit.dp\n\n@Composable\nfun HelpScreen() {\n    Box(modifier= Modifier.fillMaxSize()) {\n        Column(\n            modifier = Modifier.padding(16.dp)\n        ) {\n            Text(\"Please see http://github.com/k2-fsa/sherpa-onnx \")\n            Spacer(modifier = Modifier.height(16.dp))\n            Text(\"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\")\n            Spacer(modifier = Modifier.height(16.dp))\n            Text(\"https://k2-fsa.github.io/sherpa/social-groups.html\")\n            Spacer(modifier = Modifier.height(16.dp))\n            Text(\"Everything is open-sourced!\")\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/screens/Home.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.screens\n\nimport android.Manifest\nimport android.annotation.SuppressLint\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.layout.width\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Slider\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.res.stringResource\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.dp\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.SpeakerRecognition\nimport com.k2fsa.sherpa.onnx.speaker.identification.R\nimport com.k2fsa.sherpa.onnx.speaker.identification.TAG\nimport kotlin.concurrent.thread\n\nprivate var audioRecord: AudioRecord? = null\nprivate var sampleList: MutableList<FloatArray>? 
= null\n\nprivate val clearedResult = \"-cleared-\"\n@Composable\nfun HomeScreen() {\n    val activity = LocalContext.current as Activity\n    var threshold by remember {\n        mutableStateOf(0.5F)\n    }\n\n    var detectedName by remember {\n        mutableStateOf(clearedResult)\n    }\n\n    var isStarted by remember { mutableStateOf(false) }\n    val onRecordingButtonClick: () -> Unit = {\n        isStarted = !isStarted\n\n        if (isStarted) {\n            if (ActivityCompat.checkSelfPermission(\n                    activity,\n                    Manifest.permission.RECORD_AUDIO\n                ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                // recording is allowed\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n                audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                sampleList = null\n                detectedName = clearedResult\n\n                // recording is started here\n                thread(true) {\n                    Log.i(TAG, \"processing samples\")\n\n                    val interval = 0.1 // i.e., 100 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n                    audioRecord?.let {\n                        it.startRecording()\n\n                        while (isStarted) {\n      
                      val ret = audioRecord?.read(buffer, 0, buffer.size)\n                            ret?.let { n ->\n                                val samples = FloatArray(n) { buffer[it] / 32768.0f }\n                                if (sampleList == null) {\n                                    sampleList = mutableListOf(samples)\n                                } else {\n                                    sampleList?.add(samples)\n                                }\n                            }\n                        }\n                    }\n\n                    Log.i(TAG, \"Home: Recording is stopped. ${sampleList?.count()}\")\n                }\n            }\n        } else {\n            // recording is stopped here\n            audioRecord?.stop()\n            audioRecord?.release()\n            audioRecord = null\n\n            sampleList?.let {\n                val stream = SpeakerRecognition.extractor.createStream()\n                for (samples in it) {\n                    stream.acceptWaveform(samples = samples, sampleRate = sampleRateInHz)\n                }\n                stream.inputFinished()\n                if (SpeakerRecognition.extractor.isReady(stream)) {\n                    val embedding = SpeakerRecognition.extractor.compute(stream)\n                    detectedName = SpeakerRecognition.manager.search(\n                        embedding = embedding,\n                        threshold = threshold,\n                    )\n                }\n            }\n        }\n    }\n\n    val onThresholdChange = { newValue: Float ->\n        threshold = newValue\n    }\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter,\n    ) {\n        Column(\n            horizontalAlignment = Alignment.CenterHorizontally,\n        ) {\n            HomeThresholdRow(\n                threshold = threshold,\n                onValueChange = onThresholdChange,\n            )\n            HomeButtonRow(\n   
             isStarted = isStarted,\n                onRecordingButtonClick = onRecordingButtonClick,\n                onClearButtonClick = {\n                    detectedName = clearedResult\n                },\n            )\n\n            Spacer(modifier = Modifier.height(48.dp))\n\n            if (detectedName == clearedResult) {\n                // do nothing\n            } else if (detectedName.isNotEmpty()) {\n                Text(\n                    text = \"Speaker: $detectedName\",\n                    style = MaterialTheme.typography.headlineLarge,\n                    fontWeight = FontWeight.Bold,\n                )\n            } else {\n                Text(\n                    text = \"Unknown speaker\",\n                    style = MaterialTheme.typography.headlineLarge,\n                    fontWeight = FontWeight.Bold,\n                )\n            }\n        }\n    }\n}\n\n@SuppressLint(\"UnrememberedMutableState\")\n@Composable\nprivate fun HomeButtonRow(\n    modifier: Modifier = Modifier,\n    isStarted: Boolean,\n    onRecordingButtonClick: () -> Unit,\n    onClearButtonClick: () -> Unit,\n) {\n    val numSpeakers: Int by mutableStateOf(SpeakerRecognition.manager.numSpeakers())\n    Row(\n        modifier = modifier.fillMaxWidth(),\n        horizontalArrangement = Arrangement.Center,\n    ) {\n        Button(\n            enabled = numSpeakers > 0,\n            onClick = onRecordingButtonClick\n        ) {\n            Text(text = stringResource(if (isStarted) R.string.stop else R.string.start))\n        }\n\n        Spacer(modifier = Modifier.width(24.dp))\n\n        Button(onClick = onClearButtonClick) {\n            Text(text = stringResource(id = R.string.clear))\n        }\n    }\n}\n\n@Composable\nfun HomeThresholdRow(\n    modifier: Modifier = Modifier,\n    threshold: Float,\n    onValueChange: (Float) -> Unit,\n) {\n    Column(modifier = Modifier) {\n        Text(\n            text = \"Threshold: \" + String.format(\"%.2f\", 
threshold),\n            style = MaterialTheme.typography.headlineMedium,\n            fontWeight = FontWeight.Bold,\n            modifier = modifier.padding(bottom = 8.dp, top = 8.dp),\n        )\n        Slider(\n            value = threshold,\n            onValueChange = onValueChange,\n            valueRange = 0.1F..1.0F,\n            modifier = modifier.fillMaxWidth(),\n        )\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/screens/Register.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.screens\n\nimport android.Manifest\nimport android.annotation.SuppressLint\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.layout.width\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.OutlinedTextField\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.res.stringResource\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.tooling.preview.Preview\nimport androidx.compose.ui.unit.dp\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.SpeakerRecognition\nimport com.k2fsa.sherpa.onnx.speaker.identification.R\nimport com.k2fsa.sherpa.onnx.speaker.identification.TAG\nimport kotlin.concurrent.thread\n\nprivate var audioRecord: AudioRecord? = null\n\nprivate var sampleList: MutableList<FloatArray>? = null\n\nprivate var embeddingList: MutableList<FloatArray>? 
= null\n\nval sampleRateInHz = 16000\n\n@SuppressLint(\"UnrememberedMutableState\")\n@Preview\n@Composable\nfun RegisterScreen(modifier: Modifier = Modifier) {\n    val activity = LocalContext.current as Activity\n\n    var firstTime by remember { mutableStateOf(true) }\n    if (firstTime) {\n        firstTime = false\n        // clear states\n        embeddingList = null\n    }\n\n    val numberAudio: Int by mutableStateOf(embeddingList?.count() ?: 0)\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter\n    ) {\n        var speakerName by remember { mutableStateOf(\"\") }\n        val onSpeakerNameChange = { newName: String -> speakerName = newName }\n\n        var isStarted by remember { mutableStateOf(false) }\n        val onRecordingButtonClick: () -> Unit = {\n            isStarted = !isStarted\n\n            if (isStarted) {\n                if (ActivityCompat.checkSelfPermission(\n                        activity,\n                        Manifest.permission.RECORD_AUDIO\n                    ) != PackageManager.PERMISSION_GRANTED\n                ) {\n                    Log.i(TAG, \"Recording is not allowed\")\n                } else {\n                    // recording is allowed\n                    val audioSource = MediaRecorder.AudioSource.MIC\n                    val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                    val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                    val numBytes =\n                        AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n                    audioRecord = AudioRecord(\n                        audioSource,\n                        sampleRateInHz,\n                        AudioFormat.CHANNEL_IN_MONO,\n                        AudioFormat.ENCODING_PCM_16BIT,\n                        numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                    )\n\n                    sampleList = null\n\n 
                   // recording is started here\n                    thread(true) {\n                        Log.i(TAG, \"processing samples\")\n\n                        val interval = 0.1 // i.e., 100 ms\n                        val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                        val buffer = ShortArray(bufferSize)\n                        audioRecord?.let {\n                            it.startRecording()\n\n                            while (isStarted) {\n                                val ret = audioRecord?.read(buffer, 0, buffer.size)\n                                ret?.let { n ->\n                                    val samples = FloatArray(n) { buffer[it] / 32768.0f }\n                                    if (sampleList == null) {\n                                        sampleList = mutableListOf(samples)\n                                    } else {\n                                        sampleList?.add(samples)\n                                    }\n                                }\n                            }\n                        }\n\n                        Log.i(TAG, \"Recording is stopped. 
${sampleList?.count()}\")\n\n                    }\n                }\n            } else {\n                // recording is stopped here\n                audioRecord?.stop()\n                audioRecord?.release()\n                audioRecord = null\n\n                sampleList?.let {\n                    val stream = SpeakerRecognition.extractor.createStream()\n                    for (samples in it) {\n                        stream.acceptWaveform(samples=samples, sampleRate=sampleRateInHz)\n                    }\n                    stream.inputFinished()\n                    if(SpeakerRecognition.extractor.isReady(stream)) {\n                        val embedding = SpeakerRecognition.extractor.compute(stream)\n                        if(embeddingList == null) {\n                            embeddingList = mutableListOf(embedding)\n                        } else {\n                            embeddingList?.add(embedding)\n                        }\n                    }\n                }\n            }\n        }\n\n        val onAddButtonClick: () -> Unit = {\n            if(speakerName.isEmpty() || speakerName.isBlank()) {\n                Toast.makeText(\n                    activity,\n                    \"please input a speaker name\",\n                    Toast.LENGTH_SHORT\n                ).show()\n            } else if(SpeakerRecognition.manager.contains(speakerName.trim())) {\n                Toast.makeText(\n                    activity,\n                    \"A speaker with $speakerName already exists. 
Please choose a new name\",\n                    Toast.LENGTH_SHORT\n                ).show()\n            } else {\n                val ok = SpeakerRecognition.manager.add(speakerName.trim(), embedding = embeddingList!!.toTypedArray())\n                if(ok) {\n                    Log.i(TAG, \"Added ${speakerName.trim()} successfully\")\n                    Toast.makeText(\n                        activity,\n                        \"Added ${speakerName.trim()}\",\n                        Toast.LENGTH_SHORT\n                    ).show()\n\n                    embeddingList = null\n                    sampleList = null\n                    speakerName = \"\"\n                    firstTime = true\n                } else {\n                    Log.i(TAG, \"Failed to add ${speakerName.trim()}\")\n                    Toast.makeText(\n                        activity,\n                        \"Failed to add ${speakerName.trim()}\",\n                        Toast.LENGTH_SHORT\n                    ).show()\n                }\n            }\n        }\n\n        Column(horizontalAlignment = Alignment.CenterHorizontally) {\n            SpeakerNameRow(speakerName = speakerName, onValueChange = onSpeakerNameChange)\n            Text(\n                \"Number of recordings: ${numberAudio}\",\n                modifier = modifier.padding(24.dp),\n                style = MaterialTheme.typography.headlineMedium,\n                fontWeight = FontWeight.Bold,\n            )\n            RegisterSpeakerButtonRow(\n                modifier,\n                isStarted = isStarted,\n                onRecordingButtonClick = onRecordingButtonClick,\n                onAddButtonClick = onAddButtonClick,\n            )\n        }\n    }\n}\n\n@Composable\nfun SpeakerNameRow(\n    modifier: Modifier = Modifier,\n    speakerName: String,\n    onValueChange: (String) -> Unit\n) {\n    OutlinedTextField(\n        value = speakerName,\n        onValueChange = onValueChange,\n        label = 
{\n            Text(\"Please input the speaker name\")\n        },\n        singleLine = true,\n        modifier = modifier\n            .fillMaxWidth()\n            .padding(8.dp)\n    )\n}\n\n@SuppressLint(\"UnrememberedMutableState\")\n@Composable\nfun RegisterSpeakerButtonRow(\n    modifier: Modifier = Modifier,\n    isStarted: Boolean,\n    onRecordingButtonClick: () -> Unit,\n    onAddButtonClick: () -> Unit,\n) {\n    val numberAudio: Int by mutableStateOf(embeddingList?.count() ?: 0)\n    Row(\n        modifier = modifier.fillMaxWidth(),\n        horizontalArrangement = Arrangement.Center,\n    ) {\n        Button(onClick = onRecordingButtonClick) {\n            Text(text = stringResource(if (isStarted) R.string.stop else R.string.start))\n        }\n\n        Spacer(modifier = Modifier.width(24.dp))\n\n        Button(\n            enabled = numberAudio > 0,\n            onClick = onAddButtonClick,\n        ) {\n            Text(text = stringResource(id = R.string.add))\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/screens/View.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.screens\n\nimport android.annotation.SuppressLint\nimport androidx.compose.foundation.ExperimentalFoundationApi\nimport androidx.compose.foundation.layout.Arrangement\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.lazy.LazyColumn\nimport androidx.compose.foundation.lazy.items\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.Checkbox\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.runtime.toMutableStateList\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.unit.dp\nimport com.k2fsa.sherpa.onnx.SpeakerRecognition\n\nclass SpeakerName(val name: String) {\n    val nameState = mutableStateOf(name)\n    val checked = mutableStateOf(false)\n\n    fun onCheckedChange(newValue: Boolean) {\n        checked.value = newValue\n    }\n}\n\n@SuppressLint(\"UnrememberedMutableState\")\n@OptIn(ExperimentalFoundationApi::class)\n@Composable\nfun ViewScreen() {\n    val allSpeakerNames = SpeakerRecognition.manager.allSpeakerNames()\n    val allSpeakerNameList = remember {\n        MutableList(\n            allSpeakerNames.size\n        ) {\n            SpeakerName(allSpeakerNames[it])\n        }.toMutableStateList()\n    }\n\n    var enabled by remember {\n        mutableStateOf(SpeakerRecognition.manager.numSpeakers() > 0)\n  
  }\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter\n    ) {\n        Column(\n            modifier = Modifier.padding(16.dp),\n            horizontalAlignment = Alignment.CenterHorizontally,\n        ) {\n            Button(\n                enabled = enabled,\n                onClick = {\n                    val toRemove: MutableList<SpeakerName> = mutableListOf()\n                    for (s in allSpeakerNameList) {\n                        if (s.checked.value) {\n                            SpeakerRecognition.manager.remove(s.name)\n                            toRemove.add(s)\n                        }\n                    }\n                    allSpeakerNameList.removeAll(toRemove)\n                    enabled = SpeakerRecognition.manager.numSpeakers() > 0\n                }) {\n                Text(\"Delete selected\")\n            }\n            LazyColumn(modifier = Modifier.fillMaxSize()) {\n                items(allSpeakerNameList) { s: SpeakerName ->\n                    ViewRow(speakerName = s)\n                }\n            }\n        }\n    }\n}\n\n@Composable\nfun ViewRow(\n    modifier: Modifier = Modifier,\n    speakerName: SpeakerName\n) {\n    Surface(\n        modifier = modifier\n            .fillMaxWidth()\n            .padding(8.dp),\n        color = MaterialTheme.colorScheme.inversePrimary,\n    ) {\n        Row(\n            modifier = modifier,\n            horizontalArrangement = Arrangement.Center,\n            verticalAlignment = Alignment.CenterVertically,\n        ) {\n            Text(\n                text = speakerName.name,\n                modifier = modifier.weight(1.0F),\n            )\n            Checkbox(checked = speakerName.checked.value,\n                onCheckedChange = { speakerName.onCheckedChange(it) }\n            )\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.SideEffect\nimport androidx.compose.ui.graphics.toArgb\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.platform.LocalView\nimport androidx.core.view.WindowCompat\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SherpaOnnxSpeakerIdentificationTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n    val view = LocalView.current\n    if (!view.isInEditMode) {\n        SideEffect {\n            val window = (view.context as 
Activity).window\n            window.statusBarColor = colorScheme.primary.toArgb()\n            WindowCompat.getInsetsController(window, view).isAppearanceLightStatusBars = darkTheme\n        }\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">Speaker ID</string>\n    <string name=\"start\">Start recording</string>\n    <string name=\"stop\">Stop recording</string>\n    <string name=\"add\">Add speaker</string>\n    <string name=\"clear\">Clear result</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SherpaOnnxSpeakerIdentification\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/app/src/test/java/com/k2fsa/sherpa/onnx/speaker/identification/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.speaker.identification\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id(\"com.android.application\") version \"8.2.0\" apply false\n    id(\"org.jetbrains.kotlin.android\") version \"1.9.0\" apply false\n}"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sun Jan 21 18:37:37 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxSpeakerIdentification/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxSpeakerIdentification\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/build.gradle.kts",
    "content": "plugins {\n    id(\"com.android.application\")\n    id(\"org.jetbrains.kotlin.android\")\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.slid\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.slid\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(\"androidx.core:core-ktx:1.12.0\")\n    implementation(\"androidx.lifecycle:lifecycle-runtime-ktx:2.7.0\")\n    implementation(\"androidx.activity:activity-compose:1.8.2\")\n    implementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    implementation(\"androidx.compose.ui:ui\")\n    implementation(\"androidx.compose.ui:ui-graphics\")\n    implementation(\"androidx.compose.ui:ui-tooling-preview\")\n    implementation(\"androidx.compose.material3:material3\")\n    testImplementation(\"junit:junit:4.13.2\")\n    androidTestImplementation(\"androidx.test.ext:junit:1.1.5\")\n    androidTestImplementation(\"androidx.test.espresso:espresso-core:3.5.1\")\n    
androidTestImplementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    androidTestImplementation(\"androidx.compose.ui:ui-test-junit4\")\n    debugImplementation(\"androidx.compose.ui:ui-tooling\")\n    debugImplementation(\"androidx.compose.ui:ui-test-manifest\")\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/androidTest/java/com/k2fsa/sherpa/onnx/slid/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.slid\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxSpokenLanguageIdentification\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SherpaOnnxSpokenLanguageIdentification\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/assets/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/Home.kt",
    "content": "@file:OptIn(ExperimentalMaterial3Api::class)\n\npackage com.k2fsa.sherpa.onnx.slid\n\nimport android.Manifest\nimport android.app.Activity\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.util.Log\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.PaddingValues\nimport androidx.compose.foundation.layout.Spacer\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.height\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.CenterAlignedTopAppBar\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBarDefaults\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Alignment\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.dp\nimport androidx.compose.ui.unit.sp\nimport androidx.core.app.ActivityCompat\nimport kotlin.concurrent.thread\n\n@Composable\nfun Home() {\n    Scaffold(\n        topBar = {\n            CenterAlignedTopAppBar(\n                colors = TopAppBarDefaults.topAppBarColors(\n                    containerColor = MaterialTheme.colorScheme.primaryContainer,\n                    titleContentColor = MaterialTheme.colorScheme.primary,\n                ),\n                title = {\n                    Text(\n                        \"Next-gen 
Kaldi: Spoken language identification\",\n                        fontWeight = FontWeight.Bold,\n                        fontSize = 13.sp,\n                    )\n                },\n            )\n        },\n        content = {\n            MyApp(it)\n        },\n    )\n}\n\nprivate var audioRecord: AudioRecord? = null\nprivate const val sampleRateInHz = 16000\n\n@Composable\nfun MyApp(padding: PaddingValues) {\n    val activity = LocalContext.current as Activity\n    var isStarted by remember { mutableStateOf(false) }\n    var result by remember { mutableStateOf(\"\") }\n\n    val onButtonClick: () -> Unit = {\n        isStarted = !isStarted\n        if (isStarted) {\n            result = \"\"\n            if (ActivityCompat.checkSelfPermission(\n                    activity,\n                    Manifest.permission.RECORD_AUDIO\n                ) != PackageManager.PERMISSION_GRANTED\n            ) {\n                Log.i(TAG, \"Recording is not allowed\")\n            } else {\n                val audioSource = MediaRecorder.AudioSource.MIC\n                val channelConfig = AudioFormat.CHANNEL_IN_MONO\n                val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n                val numBytes =\n                    AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n\n                audioRecord = AudioRecord(\n                    audioSource,\n                    sampleRateInHz,\n                    AudioFormat.CHANNEL_IN_MONO,\n                    AudioFormat.ENCODING_PCM_16BIT,\n                    numBytes * 2 // a sample has two bytes as we are using 16-bit PCM\n                )\n\n                thread(true) {\n                    Log.i(TAG, \"processing samples\")\n                    val interval = 0.1 // i.e., 100 ms\n                    val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n                    val buffer = ShortArray(bufferSize)\n                    val sampleList = ArrayList<FloatArray>()\n 
                   audioRecord?.let {\n                        it.startRecording()\n                        while (isStarted) {\n                            val ret = it.read(buffer, 0, buffer.size)\n                            // read() returns a negative error code on failure; skip those\n                            if (ret > 0) {\n                                val samples = FloatArray(ret) { i -> buffer[i] / 32768.0f }\n                                sampleList.add(samples)\n                            }\n                        }\n                        // Free the microphone once recording has stopped\n                        it.stop()\n                        it.release()\n                    }\n                    audioRecord = null\n                    Log.i(TAG, \"Stop recording\")\n                    Log.i(TAG, \"Start recognition\")\n                    val samples = flatten(sampleList)\n                    val stream = Slid.slid.createStream()\n                    stream.acceptWaveform(samples, sampleRateInHz)\n                    val lang = Slid.slid.compute(stream)\n\n                    result = Slid.localeMap[lang] ?: lang\n\n                    stream.release()\n                }\n            }\n        }\n    }\n\n    Box(\n        modifier = Modifier.fillMaxSize(),\n        contentAlignment = Alignment.TopCenter\n    ) {\n        Column(\n            Modifier.padding(padding),\n            horizontalAlignment = Alignment.CenterHorizontally,\n        ) {\n            Spacer(modifier = Modifier.height(16.dp))\n            Button(onClick = onButtonClick) {\n                if (isStarted) {\n                    Text(\"Stop\")\n                } else {\n                    Text(\"Start\")\n                }\n            }\n\n            Spacer(modifier = Modifier.height(16.dp))\n            if (result.isNotBlank()) {\n                Text(\"Detected language: $result\")\n            }\n        }\n    }\n}\n\nfun flatten(sampleList: ArrayList<FloatArray>): FloatArray {\n    var totalSamples = 0\n    for (a in sampleList) {\n        totalSamples += a.size\n    }\n    var i = 0\n    val samples = FloatArray(totalSamples)\n    for (a in sampleList) {\n        for (s in a) {\n            samples[i] = s\n      
      i += 1\n        }\n    }\n    Log.i(TAG, \"$i, $totalSamples\")\n\n    return samples\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.Surface\nimport androidx.compose.runtime.Composable\nimport androidx.compose.ui.Modifier\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.slid.ui.theme.SherpaOnnxSpokenLanguageIdentificationTheme\n\nconst val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : ComponentActivity() {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        setContent {\n            SpokenLanguageIdentificationApp()\n        }\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n        Slid.initSlid(this.assets)\n    }\n\n    @Suppress(\"DEPRECATION\")\n    @Deprecated(\"Deprecated in Java\")\n    override fun onRequestPermissionsResult(\n        requestCode: Int,\n        permissions: Array<out String>,\n        grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            Toast.makeText(\n                this,\n                \"This App needs access to the microphone\",\n                Toast.LENGTH_SHORT\n            )\n                
.show()\n            finish()\n            // Do not fall through to the \"permitted\" log below\n            return\n        }\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n}\n\n@Composable\nfun SpokenLanguageIdentificationApp() {\n    SherpaOnnxSpokenLanguageIdentificationTheme {\n        // A surface container using the 'background' color from the theme\n        Surface(\n            modifier = Modifier.fillMaxSize(),\n            color = MaterialTheme.colorScheme.background\n        ) {\n            Home()\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/slid.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid\n\nimport android.content.res.AssetManager\nimport android.util.Log\nimport com.k2fsa.sherpa.onnx.SpokenLanguageIdentification\nimport com.k2fsa.sherpa.onnx.getSpokenLanguageIdentificationConfig\nimport java.util.Locale\n\n\nobject Slid {\n    private var _slid: SpokenLanguageIdentification? = null\n\n    private var _localeMap = mutableMapOf<String, String>()\n    val slid: SpokenLanguageIdentification\n        get() {\n            return _slid!!\n        }\n    val localeMap: Map<String, String>\n        get() {\n            return _localeMap\n        }\n\n    fun initSlid(assetManager: AssetManager? = null, numThreads: Int = 1) {\n        synchronized(this) {\n            if (_slid == null) {\n\n                Log.i(TAG, \"Initializing slid\")\n                val config =\n                    getSpokenLanguageIdentificationConfig(type = 0, numThreads = numThreads)!!\n                _slid = SpokenLanguageIdentification(assetManager, config)\n            }\n\n            if (_localeMap.isEmpty()) {\n                val allLang = Locale.getISOLanguages()\n                for (lang in allLang) {\n                    val locale = Locale(lang)\n                    _localeMap[lang] = locale.displayName\n                }\n            }\n        }\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.SideEffect\nimport androidx.compose.ui.graphics.toArgb\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.platform.LocalView\nimport androidx.core.view.WindowCompat\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SherpaOnnxSpokenLanguageIdentificationTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n    val view = LocalView.current\n    if (!view.isInEditMode) {\n        SideEffect {\n            val window = (view.context as 
Activity).window\n            window.statusBarColor = colorScheme.primary.toArgb()\n            WindowCompat.getInsetsController(window, view).isAppearanceLightStatusBars = darkTheme\n        }\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/arm64-v8a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/armeabi-v7a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/x86/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/x86_64/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">Language ID</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SherpaOnnxSpokenLanguageIdentification\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older than API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/app/src/test/java/com/k2fsa/sherpa/onnx/slid/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.slid\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id(\"com.android.application\") version \"8.2.0\" apply false\n    id(\"org.jetbrains.kotlin.android\") version \"1.9.0\" apply false\n}"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Wed Apr 17 19:48:00 CST 2024\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxSpokenLanguageIdentification/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxSpokenLanguageIdentification\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxTts/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxTts/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxTts/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 32\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 32\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'com.android.support.constraint:constraint-layout:1.1.3'\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'com.google.android.material:material:1.9.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.5'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.1'\n}"
  },
  {
    "path": "android/SherpaOnnxTts/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/.gitignore",
    "content": "vits-zh-aishell3\nvits-vctk\n"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.WRITE_INTERNAL_STORAGE\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxTts\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\nimport android.media.AudioAttributes\nimport android.media.AudioFormat\nimport android.media.AudioManager\nimport android.media.AudioTrack\nimport android.media.MediaPlayer\nimport android.net.Uri\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.EditText\nimport android.widget.Toast\nimport androidx.appcompat.app.AppCompatActivity\nimport java.io.File\nimport java.io.FileOutputStream\nimport java.io.IOException\n\nconst val TAG = \"sherpa-onnx\"\n\nclass MainActivity : AppCompatActivity() {\n    private lateinit var tts: OfflineTts\n    private lateinit var text: EditText\n    private lateinit var sid: EditText\n    private lateinit var speed: EditText\n    private lateinit var generate: Button\n    private lateinit var play: Button\n    private lateinit var stop: Button\n    private var stopped: Boolean = false\n    private var mediaPlayer: MediaPlayer? = null\n\n    // see\n    // https://developer.android.com/reference/kotlin/android/media/AudioTrack\n    private lateinit var track: AudioTrack\n\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        Log.i(TAG, \"Start to initialize TTS\")\n        initTts()\n        Log.i(TAG, \"Finish initializing TTS\")\n\n        Log.i(TAG, \"Start to initialize AudioTrack\")\n        initAudioTrack()\n        Log.i(TAG, \"Finish initializing AudioTrack\")\n\n        text = findViewById(R.id.text)\n        sid = findViewById(R.id.sid)\n        speed = findViewById(R.id.speed)\n\n        generate = findViewById(R.id.generate)\n        play = findViewById(R.id.play)\n        stop = findViewById(R.id.stop)\n\n        generate.setOnClickListener { onClickGenerate() }\n        play.setOnClickListener { onClickPlay() }\n        stop.setOnClickListener { onClickStop() }\n\n        sid.setText(\"0\")\n        speed.setText(\"1.0\")\n\n        // we will change sampleText here in the CI\n        val sampleText = \"\"\n        text.setText(sampleText)\n\n        play.isEnabled = false\n    }\n\n    private fun initAudioTrack() {\n        val sampleRate = tts.sampleRate()\n        val bufLength = AudioTrack.getMinBufferSize(\n            sampleRate,\n            AudioFormat.CHANNEL_OUT_MONO,\n            AudioFormat.ENCODING_PCM_FLOAT\n        )\n        Log.i(TAG, \"sampleRate: $sampleRate, buffLength: $bufLength\")\n\n        val attr = AudioAttributes.Builder().setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)\n            .setUsage(AudioAttributes.USAGE_MEDIA)\n            .build()\n\n        val format = AudioFormat.Builder()\n            .setEncoding(AudioFormat.ENCODING_PCM_FLOAT)\n            .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)\n            .setSampleRate(sampleRate)\n            .build()\n\n        track = AudioTrack(\n            attr, format, bufLength, AudioTrack.MODE_STREAM,\n            AudioManager.AUDIO_SESSION_ID_GENERATE\n        )\n        track.play()\n    }\n\n    // this function is called from C++\n    private fun callback(samples: FloatArray): 
Int {\n        if (!stopped) {\n            track.write(samples, 0, samples.size, AudioTrack.WRITE_BLOCKING)\n            return 1\n        } else {\n            track.stop()\n            return 0\n        }\n    }\n\n    private fun onClickGenerate() {\n        val sidInt = sid.text.toString().toIntOrNull()\n        if (sidInt == null || sidInt < 0) {\n            Toast.makeText(\n                applicationContext,\n                \"Please input a non-negative integer for speaker ID!\",\n                Toast.LENGTH_SHORT\n            ).show()\n            return\n        }\n\n        val speedFloat = speed.text.toString().toFloatOrNull()\n        if (speedFloat == null || speedFloat <= 0) {\n            Toast.makeText(\n                applicationContext,\n                \"Please input a positive number for speech speed!\",\n                Toast.LENGTH_SHORT\n            ).show()\n            return\n        }\n\n        // trim() already removes surrounding whitespace, so isEmpty() is sufficient\n        val textStr = text.text.toString().trim()\n        if (textStr.isEmpty()) {\n            Toast.makeText(applicationContext, \"Please input a non-empty text!\", Toast.LENGTH_SHORT)\n                .show()\n            return\n        }\n\n        track.pause()\n        track.flush()\n        track.play()\n\n        play.isEnabled = false\n        generate.isEnabled = false\n        stopped = false\n        Thread {\n            val audio = tts.generateWithCallback(\n                text = textStr,\n                sid = sidInt,\n                speed = speedFloat,\n                callback = this::callback\n            )\n\n            val filename = application.filesDir.absolutePath + \"/generated.wav\"\n            val ok = audio.samples.size > 0 && audio.save(filename)\n            runOnUiThread {\n                // Always re-enable Generate so a failed run does not lock the UI\n                generate.isEnabled = true\n                if (ok) {\n                    play.isEnabled = true\n                    track.stop()\n                }\n            }\n        }.start()\n    }\n\n    private 
fun onClickPlay() {\n        val filename = application.filesDir.absolutePath + \"/generated.wav\"\n        mediaPlayer?.stop()\n        mediaPlayer = MediaPlayer.create(\n            applicationContext,\n            Uri.fromFile(File(filename))\n        )\n        mediaPlayer?.start()\n    }\n\n    private fun onClickStop() {\n        stopped = true\n        play.isEnabled = true\n        generate.isEnabled = true\n        track.pause()\n        track.flush()\n        mediaPlayer?.stop()\n        mediaPlayer = null\n    }\n\n    private fun initTts() {\n        var modelDir: String?\n        var modelName: String?\n        var acousticModelName: String?\n        var vocoder: String?\n        var voices: String?\n        var ruleFsts: String?\n        var ruleFars: String?\n        var lexicon: String?\n        var dataDir: String?\n        var assets: AssetManager? = application.assets\n        var isKitten = false\n\n        // The purpose of such a design is to make the CI test easier\n        // Please see\n        // https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/apk/generate-tts-apk-script.py\n\n        // VITS -- begin\n        modelName = null\n        // VITS -- end\n\n        // Matcha -- begin\n        acousticModelName = null\n        vocoder = null\n        // Matcha -- end\n\n        // For Kokoro -- begin\n        voices = null\n        // For Kokoro -- end\n\n\n        modelDir = null\n        ruleFsts = null\n        ruleFars = null\n        lexicon = null\n        dataDir = null\n\n        // Example 1:\n        // modelDir = \"vits-vctk\"\n        // modelName = \"vits-vctk.onnx\"\n        // lexicon = \"lexicon.txt\"\n\n        // Example 2:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n        // modelDir = \"vits-piper-en_US-amy-low\"\n        // modelName = \"en_US-amy-low.onnx\"\n        // 
dataDir = \"vits-piper-en_US-amy-low/espeak-ng-data\"\n\n        // Example 3:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n        // modelDir = \"vits-icefall-zh-aishell3\"\n        // modelName = \"model.onnx\"\n        // ruleFars = \"vits-icefall-zh-aishell3/rule.far\"\n        // lexicon = \"lexicon.txt\"\n\n        // Example 4:\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#csukuangfj-vits-zh-hf-fanchen-c-chinese-187-speakers\n        // modelDir = \"vits-zh-hf-fanchen-C\"\n        // modelName = \"vits-zh-hf-fanchen-C.onnx\"\n        // lexicon = \"lexicon.txt\"\n\n        // Example 5:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n        // modelDir = \"vits-coqui-de-css10\"\n        // modelName = \"model.onnx\"\n\n        // Example 6\n        // vits-melo-tts-zh_en\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-melo-tts-zh-en-chinese-english-1-speaker\n        // modelDir = \"vits-melo-tts-zh_en\"\n        // modelName = \"model.onnx\"\n        // lexicon = \"lexicon.txt\"\n\n        // Example 7\n        // matcha-icefall-zh-baker\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n        // modelDir = \"matcha-icefall-zh-baker\"\n        // acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-22khz-univ.onnx\"    // Vocoder should be downloaded separately; place in the **root directory of your resources folder**, not under modelDir.\n        // lexicon = \"lexicon.txt\"\n\n        // Example 8\n        // matcha-icefall-en_US-ljspeech\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n        // modelDir = \"matcha-icefall-en_US-ljspeech\"\n        // 
acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-22khz-univ.onnx\"\n        // dataDir = \"matcha-icefall-en_US-ljspeech/espeak-ng-data\"\n\n        // Example 9\n        // kokoro-en-v0_19\n        // modelDir = \"kokoro-en-v0_19\"\n        // modelName = \"model.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kokoro-en-v0_19/espeak-ng-data\"\n\n        // Example 10\n        // kokoro-multi-lang-v1_0\n        // modelDir = \"kokoro-multi-lang-v1_0\"\n        // modelName = \"model.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kokoro-multi-lang-v1_0/espeak-ng-data\"\n        // lexicon = \"kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt\"\n        // ruleFsts = \"$modelDir/phone-zh.fst,$modelDir/date-zh.fst,$modelDir/number-zh.fst\"\n\n        // Example 11\n        // kitten-nano-en-v0_1-fp16\n        // modelDir = \"kitten-nano-en-v0_1-fp16\"\n        // modelName = \"model.fp16.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kokoro-multi-lang-v1_0/espeak-ng-data\"\n        // isKitten = true\n\n        // Example 12\n        // matcha-icefall-zh-en\n        // https://k2-fsa.github.io/sherpa/onnx/tts/all/Chinese-English/matcha-icefall-zh-en.html\n        // modelDir = \"matcha-icefall-zh-en\"\n        // acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-16khz-univ.onnx\"    // Vocoder should be downloaded separately; place in the **root directory of your resources folder**, not under modelDir.\n        // dataDir = \"matcha-icefall-zh-en/espeak-ng-data\"\n        // lexicon = \"lexicon.txt\"\n\n        if (dataDir != null) {\n            val newDir = copyDataDir(dataDir!!)\n            dataDir = \"$newDir/$dataDir\"\n        }\n\n        val config = getOfflineTtsConfig(\n            modelDir = modelDir!!,\n            modelName = modelName ?: \"\",\n            acousticModelName = acousticModelName ?: \"\",\n            
vocoder = vocoder ?: \"\",\n            voices = voices ?: \"\",\n            lexicon = lexicon ?: \"\",\n            dataDir = dataDir ?: \"\",\n            dictDir = \"\",\n            ruleFsts = ruleFsts ?: \"\",\n            ruleFars = ruleFars ?: \"\",\n            isKitten = isKitten,\n        )!!\n\n        tts = OfflineTts(assetManager = assets, config = config)\n    }\n\n\n    private fun copyDataDir(dataDir: String): String {\n        Log.i(TAG, \"data dir is $dataDir\")\n        copyAssets(dataDir)\n\n        val newDataDir = application.getExternalFilesDir(null)!!.absolutePath\n        Log.i(TAG, \"newDataDir: $newDataDir\")\n        return newDataDir\n    }\n\n    private fun copyAssets(path: String) {\n        val assets: Array<String>?\n        try {\n            assets = application.assets.list(path)\n            if (assets!!.isEmpty()) {\n                copyFile(path)\n            } else {\n                val fullPath = \"${application.getExternalFilesDir(null)}/$path\"\n                val dir = File(fullPath)\n                dir.mkdirs()\n                for (asset in assets.iterator()) {\n                    val p: String = if (path == \"\") \"\" else path + \"/\"\n                    copyAssets(p + asset)\n                }\n            }\n        } catch (ex: IOException) {\n            Log.e(TAG, \"Failed to copy $path. 
$ex\")\n        }\n    }\n\n    private fun copyFile(filename: String) {\n        try {\n            val newFilename = application.getExternalFilesDir(null).toString() + \"/\" + filename\n            // Log.i(TAG, \"Copying $filename to $newFilename\")\n            // `use` closes both streams even if the copy throws.\n            application.assets.open(filename).use { istream ->\n                FileOutputStream(newFilename).use { ostream ->\n                    istream.copyTo(ostream)\n                }\n            }\n        } catch (ex: Exception) {\n            Log.e(TAG, \"Failed to copy $filename, $ex\")\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/jniLibs/arm64-v8a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/jniLibs/armeabi-v7a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/jniLibs/x86/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/jniLibs/x86_64/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <TextView\n        android:id=\"@+id/sid_label_hint\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"wrap_content\"\n        android:text=\"@string/sid_label\"\n        android:gravity=\"center\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toTopOf=\"parent\"\n        />\n    <EditText\n        android:id=\"@+id/sid\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"60dp\"\n        android:layout_marginTop=\"0dp\"\n        android:hint=\"@string/sid_hint\"\n        android:gravity=\"center\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/sid_label_hint\" />\n\n    <TextView\n        android:id=\"@+id/speed_label_hint\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"wrap_content\"\n        android:layout_marginTop=\"3dp\"\n        android:text=\"@string/speed_label\"\n        android:gravity=\"center\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/sid\"/>\n    <EditText\n        android:id=\"@+id/speed\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"60dp\"\n        android:layout_marginTop=\"0dp\"\n        android:hint=\"@string/speed_hint\"\n        android:gravity=\"center\"\n        
app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/speed_label_hint\" />\n\n    <EditText\n        android:id=\"@+id/text\"\n        android:inputType=\"textMultiLine\"\n        android:lines=\"8\"\n        android:minLines=\"10\"\n        android:gravity=\"top|start\"\n        android:maxLines=\"30\"\n        android:layout_height=\"wrap_content\"\n        android:layout_width=\"match_parent\"\n        android:scrollbars=\"vertical\"\n        android:hint=\"@string/text_hint\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/speed\" />\n\n    <Button\n        android:id=\"@+id/generate\"\n        android:textAllCaps=\"false\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"50dp\"\n        android:layout_marginTop=\"4dp\"\n        android:text=\"@string/generate\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/text\" />\n\n    <Button\n        android:id=\"@+id/play\"\n        android:textAllCaps=\"false\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"50dp\"\n        android:layout_marginTop=\"4dp\"\n        android:text=\"@string/play\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/generate\" />\n\n    <Button\n        android:id=\"@+id/stop\"\n        android:textAllCaps=\"false\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"50dp\"\n        android:layout_marginTop=\"4dp\"\n        android:text=\"@string/stop\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n 
       app:layout_constraintTop_toBottomOf=\"@id/play\" />\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">TTS</string>\n    <string name=\"sid_label\">Speaker ID</string>\n    <string name=\"sid_hint\">0</string>\n    <string name=\"speed_label\">Speech speed (large->fast)</string>\n    <string name=\"speed_hint\">1.0</string>\n    <string name=\"text_hint\">Please input your text here</string>\n    <string name=\"generate\">Generate</string>\n    <string name=\"play\">Play</string>\n    <string name=\"stop\">Stop</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxTts\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxTts\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxTts/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTts/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnxTts/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Mon Oct 23 15:40:58 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxTts/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxTts/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxTts/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxTts/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnxTts\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/build.gradle.kts",
    "content": "plugins {\n    id(\"com.android.application\")\n    id(\"org.jetbrains.kotlin.android\")\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.tts.engine\"\n    compileSdk = 34\n\n    defaultConfig {\n        applicationId = \"com.k2fsa.sherpa.onnx.tts.engine\"\n        minSdk = 21\n        targetSdk = 34\n        versionCode = 20260320\n        versionName = \"1.12.31\"\n\n        testInstrumentationRunner = \"androidx.test.runner.AndroidJUnitRunner\"\n        vectorDrawables {\n            useSupportLibrary = true\n        }\n    }\n\n    buildTypes {\n        release {\n            isMinifyEnabled = false\n            proguardFiles(\n                getDefaultProguardFile(\"proguard-android-optimize.txt\"),\n                \"proguard-rules.pro\"\n            )\n        }\n    }\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = \"1.8\"\n    }\n    buildFeatures {\n        compose = true\n    }\n    composeOptions {\n        kotlinCompilerExtensionVersion = \"1.5.1\"\n    }\n    packaging {\n        resources {\n            excludes += \"/META-INF/{AL2.0,LGPL2.1}\"\n        }\n    }\n}\n\ndependencies {\n\n    implementation(\"androidx.core:core-ktx:1.12.0\")\n    implementation(\"androidx.lifecycle:lifecycle-runtime-ktx:2.6.2\")\n    implementation(\"androidx.activity:activity-compose:1.8.2\")\n    implementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    implementation(\"androidx.compose.ui:ui\")\n    implementation(\"androidx.compose.ui:ui-graphics\")\n    implementation(\"androidx.compose.ui:ui-tooling-preview\")\n    implementation(\"androidx.compose.material3:material3\")\n    implementation(\"androidx.appcompat:appcompat:1.6.1\")\n    implementation(\"com.google.android.material:material:1.9.0\")\n    testImplementation(\"junit:junit:4.13.2\")\n    
androidTestImplementation(\"androidx.test.ext:junit:1.1.5\")\n    androidTestImplementation(\"androidx.test.espresso:espresso-core:3.5.1\")\n    androidTestImplementation(platform(\"androidx.compose:compose-bom:2023.08.00\"))\n    androidTestImplementation(\"androidx.compose.ui:ui-test-junit4\")\n    debugImplementation(\"androidx.compose.ui:ui-tooling\")\n    debugImplementation(\"androidx.compose.ui:ui-test-manifest\")\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/androidTest/java/com/k2fsa/sherpa/onnx/tts/engine/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx.tts.engine\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    package=\"com.k2fsa.sherpa.onnx.tts.engine\">\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxTtsEngine\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".GetSampleText\"\n            android:exported=\"true\"\n            android:theme=\"@android:style/Theme.Translucent.NoTitleBar\">\n            <intent-filter>\n                <action android:name=\"android.speech.tts.engine.GET_SAMPLE_TEXT\" />\n\n                <category android:name=\"android.intent.category.DEFAULT\" />\n            </intent-filter>\n        </activity>\n        <activity\n            android:name=\".CheckVoiceData\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.speech.tts.engine.CHECK_TTS_DATA\" />\n\n                <category android:name=\"android.intent.category.DEFAULT\" />\n            </intent-filter>\n        </activity>\n        <activity\n            android:name=\".InstallVoiceData\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.speech.tts.engine.INSTALL_TTS_DATA\" />\n\n                <category android:name=\"android.intent.category.DEFAULT\" />\n            </intent-filter>\n        </activity>\n\n        <service\n            android:name=\".TtsService\"\n            android:enabled=\"true\"\n            android:exported=\"true\"\n            
android:label=\"@string/app_name\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.TTS_SERVICE\" />\n\n                <category android:name=\"android.intent.category.DEFAULT\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.speech.tts\"\n                android:resource=\"@xml/tts_engine\" />\n        </service>\n\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:label=\"@string/app_name\"\n            android:theme=\"@style/Theme.SherpaOnnxTtsEngine\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n            <intent-filter>\n                <action android:name=\"android.speech.tts.engine.CONFIGURE_ENGINE\" />\n\n                <category android:name=\"android.intent.category.DEFAULT\" />\n            </intent-filter>\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/CheckVoiceData.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport android.content.Intent\nimport android.os.Bundle\nimport android.speech.tts.TextToSpeech\nimport androidx.appcompat.app.AppCompatActivity\n\nclass CheckVoiceData : AppCompatActivity() {\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        val intent = Intent().apply {\n            putStringArrayListExtra(\n                TextToSpeech.Engine.EXTRA_AVAILABLE_VOICES,\n                arrayListOf(TtsEngine.lang)\n            )\n            putStringArrayListExtra(TextToSpeech.Engine.EXTRA_UNAVAILABLE_VOICES, arrayListOf())\n        }\n        setResult(TextToSpeech.Engine.CHECK_VOICE_DATA_PASS, intent)\n        finish()\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/GetSampleText.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport android.app.Activity\nimport android.content.Intent\nimport android.os.Bundle\nimport android.speech.tts.TextToSpeech\n\nfun getSampleText(lang: String): String {\n    var text = \"\"\n    when (lang) {\n        \"ara\" -> {\n            text = \"هذا هو محرك تحويل النص إلى كلام باستخدام الجيل القادم من كالدي\"\n        }\n\n        \"ben\" -> {\n            text = \"এটি একটি টেক্সট-টু-স্পীচ ইঞ্জিন যা পরবর্তী প্রজন্মের কালডি ব্যবহার করে\"\n        }\n\n        \"bul\" -> {\n            text =\n                \"Това е машина за преобразуване на текст в реч, използваща Kaldi от следващо поколение\"\n        }\n\n        \"cat\" -> {\n            text = \"Aquest és un motor de text a veu que utilitza Kaldi de nova generació\"\n        }\n\n        \"cym\" -> {\n            text = \"Peiriant testun-i-lais yw hwn sy'n defnyddio Kaldi'r genhedlaeth nesaf\"\n        }\n\n        \"ces\" -> {\n            text = \"Toto je převodník textu na řeč využívající novou generaci kaldi\"\n        }\n\n        \"dan\" -> {\n            text = \"Dette er en tekst til tale-motor, der bruger næste generation af kaldi\"\n        }\n\n        \"deu\" -> {\n            text =\n                \"Dies ist eine Text-to-Speech-Engine, die Kaldi der nächsten Generation verwendet\"\n        }\n\n        \"ell\" -> {\n            text = \"Αυτή είναι μια μηχανή κειμένου σε ομιλία που χρησιμοποιεί kaldi επόμενης γενιάς\"\n        }\n\n        \"eng\" -> {\n            text = \"How are you doing today? 
This is a text-to-speech engine using next generation Kaldi\"\n        }\n\n        \"est\" -> {\n            text = \"See on teksti kõneks muutmise mootor, mis kasutab järgmise põlvkonna Kaldi\"\n        }\n\n        \"fin\" -> {\n            text = \"Tämä on tekstistä puheeksi -moottori, joka käyttää seuraavan sukupolven kaldia\"\n        }\n\n        \"fra\" -> {\n            text = \"Il s'agit d'un moteur de synthèse vocale utilisant Kaldi de nouvelle génération\"\n        }\n\n        \"gle\" -> {\n            text = \"Is inneall téacs-go-hurlabhra é seo a úsáideann Kaldi den chéad ghlúin eile\"\n        }\n\n        \"hrv\" -> {\n            text =\n                \"Ovo je mehanizam za pretvaranje teksta u govor koji koristi Kaldi sljedeće generacije\"\n        }\n\n        \"hun\" -> {\n            text = \"Ez egy szövegfelolvasó motor a következő generációs kaldi használatával\"\n        }\n\n        \"isl\" -> {\n            text = \"Þetta er texta í tal vél sem notar næstu kynslóð kaldi\"\n        }\n\n        \"ita\" -> {\n            text = \"Questo è un motore di sintesi vocale che utilizza kaldi di nuova generazione\"\n        }\n\n        \"kat\" -> {\n            text = \"ეს არის ტექსტიდან მეტყველების ძრავა შემდეგი თაობის კალდის გამოყენებით\"\n        }\n\n        \"kaz\" -> {\n            text = \"Бұл келесі буын kaldi көмегімен мәтіннен сөйлеуге арналған қозғалтқыш\"\n        }\n\n        \"mlt\" -> {\n            text = \"Din hija magna text-to-speech li tuża Kaldi tal-ġenerazzjoni li jmiss\"\n        }\n\n        \"lav\" -> {\n            text = \"Šis ir teksta pārvēršanas runā dzinējs, kas izmanto nākamās paaudzes Kaldi\"\n        }\n\n        \"lit\" -> {\n            text = \"Tai teksto į kalbą variklis, kuriame naudojamas naujos kartos Kaldi\"\n        }\n\n        \"ltz\" -> {\n            text = \"Dëst ass en Text-zu-Speech-Motor mat der nächster Generatioun Kaldi\"\n        }\n\n        \"nep\" -> {\n            text = \"यो अर्को पुस्ता 
काल्डी प्रयोग गरेर स्पीच इन्जिनको पाठ हो\"\n        }\n\n        \"nld\" -> {\n            text =\n                \"Dit is een tekst-naar-spraak-engine die gebruik maakt van Kaldi van de volgende generatie\"\n        }\n\n        \"nor\" -> {\n            text = \"Dette er en tekst til tale-motor som bruker neste generasjons kaldi\"\n        }\n\n        \"pol\" -> {\n            text = \"Jest to silnik syntezatora mowy wykorzystujący Kaldi nowej generacji\"\n        }\n\n        \"por\" -> {\n            text =\n                \"Este é um mecanismo de conversão de texto em fala usando Kaldi de próxima geração\"\n        }\n\n        \"ron\" -> {\n            text = \"Acesta este un motor text to speech care folosește generația următoare de kaldi\"\n        }\n\n        \"rus\" -> {\n            text =\n                \"Это движок преобразования текста в речь, использующий Kaldi следующего поколения.\"\n        }\n\n        \"slk\" -> {\n            text = \"Toto je nástroj na prevod textu na reč využívajúci kaldi novej generácie\"\n        }\n\n        \"slv\" -> {\n            text =\n                \"To je mehanizem za pretvorbo besedila v govor, ki uporablja Kaldi naslednje generacije\"\n        }\n\n        \"spa\" -> {\n            text = \"Este es un motor de texto a voz que utiliza kaldi de próxima generación.\"\n        }\n\n        \"srp\" -> {\n            text =\n                \"Ово је механизам за претварање текста у говор који користи калди следеће генерације\"\n        }\n\n        \"swa\" -> {\n            text = \"Haya ni maandishi kwa injini ya hotuba kwa kutumia kizazi kijacho kaldi\"\n        }\n\n        \"swe\" -> {\n            text = \"Detta är en text till tal-motor som använder nästa generations kaldi\"\n        }\n\n        \"tur\" -> {\n            text = \"Bu, yeni nesil kaldi'yi kullanan bir metinden konuşmaya motorudur\"\n        }\n\n        \"ukr\" -> {\n            text =\n                \"Це механізм перетворення тексту на 
мовлення, який використовує kaldi нового покоління\"\n        }\n\n        \"vie\" -> {\n            text = \"Đây là công cụ chuyển văn bản thành giọng nói sử dụng kaldi thế hệ tiếp theo\"\n        }\n\n        \"zho\", \"cmn\" -> {\n            text = \"使用新一代卡尔迪的语音合成引擎\"\n        }\n    }\n    return text\n}\n\nclass GetSampleText : Activity() {\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        var result = TextToSpeech.LANG_AVAILABLE\n        val text: String = getSampleText(TtsEngine.lang ?: \"\")\n        if (text.isEmpty()) {\n            result = TextToSpeech.LANG_NOT_SUPPORTED\n        }\n\n        val intent = Intent().apply {\n            if (result == TextToSpeech.LANG_AVAILABLE) {\n                putExtra(TextToSpeech.Engine.EXTRA_SAMPLE_TEXT, text)\n            } else {\n                putExtra(\"sampleText\", text)\n            }\n        }\n\n        setResult(result, intent)\n        finish()\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/InstallVoiceData.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport android.app.Activity\nimport android.os.Bundle\nimport android.view.Window\n\nclass InstallVoiceData : Activity() {\n    override fun onCreate(savedInstanceState: Bundle?) {\n        requestWindowFeature(Window.FEATURE_NO_TITLE)\n        super.onCreate(savedInstanceState)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/MainActivity.kt",
    "content": "@file:OptIn(ExperimentalMaterial3Api::class)\n\npackage com.k2fsa.sherpa.onnx.tts.engine\n\nimport PreferenceHelper\nimport android.media.AudioAttributes\nimport android.media.AudioFormat\nimport android.media.AudioManager\nimport android.media.AudioTrack\nimport android.media.MediaPlayer\nimport android.net.Uri\nimport android.os.Bundle\nimport android.util.Log\nimport android.widget.Toast\nimport androidx.activity.ComponentActivity\nimport androidx.activity.compose.setContent\nimport androidx.activity.viewModels\nimport androidx.compose.foundation.layout.Box\nimport androidx.compose.foundation.layout.Column\nimport androidx.compose.foundation.layout.Row\nimport androidx.compose.foundation.layout.fillMaxSize\nimport androidx.compose.foundation.layout.fillMaxWidth\nimport androidx.compose.foundation.layout.padding\nimport androidx.compose.foundation.layout.wrapContentHeight\nimport androidx.compose.foundation.rememberScrollState\nimport androidx.compose.foundation.text.KeyboardOptions\nimport androidx.compose.foundation.verticalScroll\nimport androidx.compose.material3.Button\nimport androidx.compose.material3.ExperimentalMaterial3Api\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.OutlinedTextField\nimport androidx.compose.material3.Scaffold\nimport androidx.compose.material3.Slider\nimport androidx.compose.material3.Surface\nimport androidx.compose.material3.Text\nimport androidx.compose.material3.TopAppBar\nimport androidx.compose.runtime.getValue\nimport androidx.compose.runtime.mutableStateOf\nimport androidx.compose.runtime.remember\nimport androidx.compose.runtime.setValue\nimport androidx.compose.ui.Modifier\nimport androidx.compose.ui.text.input.KeyboardType\nimport androidx.compose.ui.unit.dp\nimport com.k2fsa.sherpa.onnx.tts.engine.ui.theme.SherpaOnnxTtsEngineTheme\nimport kotlinx.coroutines.CoroutineScope\nimport kotlinx.coroutines.Dispatchers\nimport kotlinx.coroutines.SupervisorJob\nimport 
kotlinx.coroutines.channels.Channel\nimport kotlinx.coroutines.launch\nimport kotlinx.coroutines.withContext\nimport java.io.File\nimport kotlin.time.TimeSource\n\nconst val TAG = \"sherpa-onnx-tts-engine\"\n\nclass MainActivity : ComponentActivity() {\n    // TODO(fangjun): Save settings in ttsViewModel\n    private val ttsViewModel: TtsViewModel by viewModels()\n\n    private var mediaPlayer: MediaPlayer? = null\n\n    // see\n    // https://developer.android.com/reference/kotlin/android/media/AudioTrack\n    private lateinit var track: AudioTrack\n\n    private var stopped: Boolean = false\n\n    private var samplesChannel = Channel<FloatArray>(capacity = 128)\n    private val scope = CoroutineScope(Dispatchers.IO + SupervisorJob())\n\n\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n\n        Log.i(TAG, \"Start to initialize TTS\")\n        TtsEngine.createTts(this)\n        Log.i(TAG, \"Finish initializing TTS\")\n\n        Log.i(TAG, \"Start to initialize AudioTrack\")\n        initAudioTrack()\n        Log.i(TAG, \"Finish initializing AudioTrack\")\n\n        val preferenceHelper = PreferenceHelper(this)\n        setContent {\n            SherpaOnnxTtsEngineTheme {\n                // A surface container using the 'background' color from the theme\n                Surface(\n                    modifier = Modifier.fillMaxSize(),\n                    color = MaterialTheme.colorScheme.background\n                ) {\n                    Scaffold(topBar = {\n                        TopAppBar(title = { Text(\"Next-gen Kaldi: TTS Engine\") })\n                    }) {\n                        Box(modifier = Modifier.padding(it)) {\n                            Column(modifier = Modifier.padding(16.dp)) {\n                                Column {\n                                    Text(\"Speed \" + String.format(\"%.1f\", TtsEngine.speed))\n                                    Slider(\n                          
              value = TtsEngine.speedState.value,\n                                        onValueChange = {\n                                            TtsEngine.speed = it\n                                            preferenceHelper.setSpeed(it)\n                                        },\n                                        valueRange = MIN_TTS_SPEED..MAX_TTS_SPEED,\n                                        modifier = Modifier.fillMaxWidth()\n                                    )\n                                }\n\n                                val testTextContent = getSampleText(TtsEngine.lang ?: \"\")\n\n                                var testText by remember { mutableStateOf(testTextContent) }\n                                var startEnabled by remember { mutableStateOf(true) }\n                                var playEnabled by remember { mutableStateOf(false) }\n                                var rtfText by remember {\n                                    mutableStateOf(\"\")\n                                }\n                                val scrollState = rememberScrollState(0)\n\n                                val numSpeakers = TtsEngine.tts!!.numSpeakers()\n                                if (numSpeakers > 1) {\n                                    OutlinedTextField(\n                                        value = TtsEngine.speakerIdState.value.toString(),\n                                        onValueChange = {\n                                            if (it.isEmpty() || it.isBlank()) {\n                                                TtsEngine.speakerId = 0\n                                            } else {\n                                                try {\n                                                    TtsEngine.speakerId = it.toString().toInt()\n                                                } catch (ex: NumberFormatException) {\n                                                    Log.i(TAG, \"Invalid input: 
$it\")\n                                                    TtsEngine.speakerId = 0\n                                                }\n                                            }\n                                            preferenceHelper.setSid(TtsEngine.speakerId)\n                                        },\n                                        label = {\n                                            Text(\"Speaker ID: (0-${numSpeakers - 1})\")\n                                        },\n                                        keyboardOptions = KeyboardOptions(keyboardType = KeyboardType.Number),\n                                        modifier = Modifier\n                                            .fillMaxWidth()\n                                            .padding(bottom = 16.dp)\n                                            .wrapContentHeight(),\n                                    )\n                                }\n\n                                OutlinedTextField(\n                                    value = testText,\n                                    onValueChange = { testText = it },\n                                    label = { Text(\"Please input your text here\") },\n                                    maxLines = 10,\n                                    modifier = Modifier\n                                        .fillMaxWidth()\n                                        .padding(bottom = 16.dp)\n                                        .verticalScroll(scrollState)\n                                        .wrapContentHeight(),\n                                    singleLine = false,\n                                )\n\n                                Row {\n                                    Button(\n                                        enabled = startEnabled,\n                                        modifier = Modifier.padding(5.dp),\n                                        onClick = {\n                                            
Log.i(TAG, \"Clicked, text: $testText\")\n                                            if (testText.isBlank() || testText.isEmpty()) {\n                                                Toast.makeText(\n                                                    applicationContext,\n                                                    \"Please input some text to generate\",\n                                                    Toast.LENGTH_SHORT\n                                                ).show()\n                                            } else {\n                                                startEnabled = false\n                                                playEnabled = false\n                                                stopped = false\n\n                                                track.pause()\n                                                track.flush()\n                                                track.play()\n                                                rtfText = \"\"\n                                                Log.i(TAG, \"Started with text $testText\")\n\n                                                scope.launch {\n                                                    for (samples in samplesChannel) {\n                                                        if (samples.isEmpty()) {\n                                                            break\n                                                        }\n\n                                                        Log.i(\n                                                            TAG,\n                                                            \"Received ${samples.count()} samples\"\n                                                        )\n                                                        track.write(\n                                                            samples,\n                                                            0,\n                                          
                  samples.size,\n                                                            AudioTrack.WRITE_BLOCKING\n                                                        )\n                                                        if (stopped) {\n                                                            break\n                                                        }\n                                                    }\n                                                    Log.i(TAG, \"Draining the channel\")\n\n                                                    // drain remaining\n                                                    while (!samplesChannel.isEmpty) {\n                                                        samplesChannel.tryReceive().getOrNull()\n                                                    }\n                                                    Log.i(TAG, \"Channel drained\")\n\n                                                }\n\n                                                CoroutineScope(Dispatchers.Default).launch {\n                                                    val timeSource = TimeSource.Monotonic\n                                                    val startTime = timeSource.markNow()\n\n                                                    val audio =\n                                                        TtsEngine.tts!!.generateWithCallback(\n                                                            text = testText,\n                                                            sid = TtsEngine.speakerId,\n                                                            speed = TtsEngine.speed,\n                                                            callback = ::callback,\n                                                        )\n\n                                                    val elapsed =\n                                                        startTime.elapsedNow().inWholeMilliseconds.toFloat() / 1000;\n      
                                              val audioDuration =\n                                                        audio.samples.size / TtsEngine.tts!!.sampleRate()\n                                                            .toFloat()\n                                                    val RTF = String.format(\n                                                        \"Number of threads: %d\\nElapsed: %.3f s\\nAudio duration: %.3f s\\nRTF: %.3f/%.3f = %.3f\",\n                                                        TtsEngine.tts!!.config.model.numThreads,\n                                                        elapsed,\n                                                        audioDuration,\n                                                        elapsed,\n                                                        audioDuration,\n                                                        elapsed / audioDuration\n                                                    )\n\n                                                    scope.launch {\n                                                        Log.i(TAG, \"send 0 samples\")\n                                                            samplesChannel.send(FloatArray(0))\n                                                        Log.i(TAG, \"send 0 samples done\")\n                                                    }\n\n                                                    val filename =\n                                                        application.filesDir.absolutePath + \"/generated.wav\"\n\n\n                                                    val ok =\n                                                        audio.samples.isNotEmpty() && audio.save(\n                                                            filename\n                                                        )\n\n                                                    if (ok) {\n                                                        
withContext(Dispatchers.Main) {\n                                                            startEnabled = true\n                                                            playEnabled = true\n                                                            rtfText = RTF\n                                                        }\n\n\n                                                    }\n                                                }\n                                            }\n                                        }) {\n                                        Text(\"Start\")\n                                    }\n\n                                    Button(\n                                        modifier = Modifier.padding(5.dp),\n                                        enabled = playEnabled,\n                                        onClick = {\n                                            stopped = true\n                                            track.pause()\n                                            track.flush()\n                                            onClickPlay()\n                                        }) {\n                                        Text(\"Play\")\n                                    }\n\n                                    Button(\n                                        modifier = Modifier.padding(5.dp),\n                                        onClick = {\n                                            onClickStop()\n                                            startEnabled = true\n                                        }) {\n                                        Text(\"Stop\")\n                                    }\n                                }\n                                if (rtfText.isNotEmpty()) {\n                                    Row {\n                                        Text(rtfText)\n                                    }\n                                }\n                            }\n               
         }\n                    }\n                }\n            }\n        }\n    }\n\n    override fun onDestroy() {\n        stopMediaPlayer()\n        super.onDestroy()\n    }\n\n    private fun stopMediaPlayer() {\n        mediaPlayer?.stop()\n        mediaPlayer?.release()\n        mediaPlayer = null\n    }\n\n    private fun onClickPlay() {\n        val filename = application.filesDir.absolutePath + \"/generated.wav\"\n        stopMediaPlayer()\n        mediaPlayer = MediaPlayer.create(\n            applicationContext,\n            Uri.fromFile(File(filename))\n        )\n        mediaPlayer?.start()\n    }\n\n    private fun onClickStop() {\n        stopped = true\n        track.pause()\n        track.flush()\n\n        stopMediaPlayer()\n    }\n\n    // this function is called from C++\n    private fun callback(samples: FloatArray): Int {\n        if (!stopped) {\n            val samplesCopy = samples.copyOf()\n            scope.launch {\n                Log.i(TAG, \"callback called with ${samplesCopy.count()} samples\")\n                val ok = samplesChannel.trySend(samplesCopy).isSuccess\n                Log.i(TAG, \"callback called with $ok\")\n            }\n            return 1\n        } else {\n            track.stop()\n            Log.i(TAG, \" return 0\")\n            return 0\n        }\n    }\n\n    private fun initAudioTrack() {\n        val sampleRate = TtsEngine.tts!!.sampleRate()\n        val bufLength = AudioTrack.getMinBufferSize(\n            sampleRate,\n            AudioFormat.CHANNEL_OUT_MONO,\n            AudioFormat.ENCODING_PCM_FLOAT\n        )\n        Log.i(TAG, \"sampleRate: $sampleRate, buffLength: $bufLength\")\n\n        val attr = AudioAttributes.Builder().setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)\n            .setUsage(AudioAttributes.USAGE_MEDIA)\n            .build()\n\n        val format = AudioFormat.Builder()\n            .setEncoding(AudioFormat.ENCODING_PCM_FLOAT)\n            
.setChannelMask(AudioFormat.CHANNEL_OUT_MONO)\n            .setSampleRate(sampleRate)\n            .build()\n\n        track = AudioTrack(\n            attr, format, bufLength, AudioTrack.MODE_STREAM,\n            AudioManager.AUDIO_SESSION_ID_GENERATE\n        )\n        track.play()\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/PreferencesHelper.kt",
    "content": "import android.content.Context\nimport android.content.SharedPreferences\n\nclass PreferenceHelper(context: Context) {\n\n    private val PREFS_NAME = \"com.k2fsa.sherpa.onnx.tts.engine\"\n    private val SPEED_KEY = \"speed\"\n    private val SID_KEY = \"speaker_id\"\n\n    private val sharedPreferences: SharedPreferences =\n        context.getSharedPreferences(PREFS_NAME, Context.MODE_PRIVATE)\n\n    fun setSpeed(value: Float) {\n        val editor = sharedPreferences.edit()\n        editor.putFloat(SPEED_KEY, value)\n        editor.apply()\n    }\n\n    fun getSpeed(): Float {\n        return sharedPreferences.getFloat(SPEED_KEY, 1.0f)\n    }\n\n    fun setSid(value: Int) {\n        val editor = sharedPreferences.edit()\n        editor.putInt(SID_KEY, value)\n        editor.apply()\n    }\n\n    fun getSid(): Int {\n        return sharedPreferences.getInt(SID_KEY, 0)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsEngine.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport PreferenceHelper\nimport android.content.Context\nimport android.content.res.AssetManager\nimport android.util.Log\nimport androidx.compose.runtime.MutableState\nimport androidx.compose.runtime.mutableFloatStateOf\nimport androidx.compose.runtime.mutableIntStateOf\nimport com.k2fsa.sherpa.onnx.OfflineTts\nimport com.k2fsa.sherpa.onnx.getOfflineTtsConfig\nimport java.io.File\nimport java.io.FileOutputStream\nimport java.io.IOException\n\nconst val MIN_TTS_SPEED = 0.1f\nconst val MAX_TTS_SPEED = 5.0f\n\nobject TtsEngine {\n    var tts: OfflineTts? = null\n\n    // https://en.wikipedia.org/wiki/ISO_639-3\n    // Example:\n    // eng for English,\n    // deu for German\n    // cmn for Mandarin\n    var lang: String? = null\n\n    // if a model supports two languages, set also lang2\n    var lang2: String? = null\n\n\n    val speedState: MutableState<Float> = mutableFloatStateOf(1.0F)\n    val speakerIdState: MutableState<Int> = mutableIntStateOf(0)\n\n    var speed: Float\n        get() = speedState.value\n        set(value) {\n            speedState.value = value\n        }\n\n    var speakerId: Int\n        get() = speakerIdState.value\n        set(value) {\n            speakerIdState.value = value\n        }\n\n    private var modelDir: String? = null\n    private var modelName: String? = null\n    private var acousticModelName: String? = null // for matcha tts\n    private var vocoder: String? = null // for matcha tts\n    private var voices: String? = null // for kokoro\n    private var ruleFsts: String? = null\n    private var ruleFars: String? = null\n    private var lexicon: String? = null\n    private var dataDir: String? = null\n    private var assets: AssetManager? 
= null\n    private var isKitten = false\n\n    init {\n        // The purpose of such a design is to make the CI test easier\n        // Please see\n        // https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/apk/generate-tts-apk-script.py\n        //\n        // For VITS -- begin\n        modelName = null\n        // For VITS -- end\n\n        // For Matcha -- begin\n        acousticModelName = null\n        vocoder = null\n        // For Matcha -- end\n\n        // For Kokoro -- begin\n        voices = null\n        // For Kokoro -- end\n\n        modelDir = null\n        ruleFsts = null\n        ruleFars = null\n        lexicon = null\n        dataDir = null\n        lang = null\n        lang2 = null\n\n        // Please enable one and only one of the examples below\n\n        // Example 1:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-vctk.tar.bz2\n        // modelDir = \"vits-vctk\"\n        // modelName = \"vits-vctk.onnx\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"eng\"\n\n        // Example 2:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n        // modelDir = \"vits-piper-en_US-amy-low\"\n        // modelName = \"en_US-amy-low.onnx\"\n        // dataDir = \"vits-piper-en_US-amy-low/espeak-ng-data\"\n        // lang = \"eng\"\n\n        // Example 3:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n        // modelDir = \"vits-icefall-zh-aishell3\"\n        // modelName = \"model.onnx\"\n        // ruleFars = \"vits-icefall-zh-aishell3/rule.far\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"zho\"\n\n        // Example 4:\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#csukuangfj-vits-zh-hf-fanchen-c-chinese-187-speakers\n        // modelDir = 
\"vits-zh-hf-fanchen-C\"\n        // modelName = \"vits-zh-hf-fanchen-C.onnx\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"zho\"\n\n        // Example 5:\n        // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n        // This model does not need lexicon or dataDir\n        // modelDir = \"vits-coqui-de-css10\"\n        // modelName = \"model.onnx\"\n        // lang = \"deu\"\n\n        // Example 6\n        // vits-melo-tts-zh_en\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-melo-tts-zh-en-chinese-english-1-speaker\n        // modelDir = \"vits-melo-tts-zh_en\"\n        // modelName = \"model.onnx\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"zho\"\n        // lang2 = \"eng\"\n\n        // Example 7\n        // matcha-icefall-zh-baker\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n        // modelDir = \"matcha-icefall-zh-baker\"\n        // acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-22khz-univ.onnx\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"zho\"\n\n        // Example 8\n        // matcha-icefall-en_US-ljspeech\n        // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n        // modelDir = \"matcha-icefall-en_US-ljspeech\"\n        // acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-22khz-univ.onnx\"\n        // dataDir = \"matcha-icefall-en_US-ljspeech/espeak-ng-data\"\n        // lang = \"eng\"\n\n        // Example 9\n        // kokoro-en-v0_19\n        // modelDir = \"kokoro-en-v0_19\"\n        // modelName = \"model.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kokoro-en-v0_19/espeak-ng-data\"\n        // lang = \"eng\"\n\n        // Example 10\n        // 
kokoro-multi-lang-v1_0\n        // modelDir = \"kokoro-multi-lang-v1_0\"\n        // modelName = \"model.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kokoro-multi-lang-v1_0/espeak-ng-data\"\n        // lexicon = \"kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt\"\n        // lang = \"eng\"\n        // lang2 = \"zho\"\n        // ruleFsts = \"$modelDir/phone-zh.fst,$modelDir/date-zh.fst,$modelDir/number-zh.fst\"\n        //\n        // This model supports many languages, e.g., English, Chinese, etc.\n        // We set lang to eng here.\n\n        // Example 11\n        // kitten-nano-en-v0_1-fp16\n        // modelDir = \"kitten-nano-en-v0_1-fp16\"\n        // modelName = \"model.fp16.onnx\"\n        // voices = \"voices.bin\"\n        // dataDir = \"kitten-nano-en-v0_1-fp16/espeak-ng-data\"\n        // lang = \"eng\"\n        // isKitten = true\n\n        // Example 12\n        // matcha-icefall-zh-en\n        // https://k2-fsa.github.io/sherpa/onnx/tts/all/Chinese-English/matcha-icefall-zh-en.html\n        // modelDir = \"matcha-icefall-zh-en\"\n        // acousticModelName = \"model-steps-3.onnx\"\n        // vocoder = \"vocos-16khz-univ.onnx\"\n        // dataDir = \"matcha-icefall-zh-en/espeak-ng-data\"\n        // lexicon = \"lexicon.txt\"\n        // lang = \"zho\"\n    }\n\n    fun createTts(context: Context) {\n        Log.i(TAG, \"Init Next-gen Kaldi TTS\")\n        if (tts == null) {\n            initTts(context)\n        }\n    }\n\n    private fun initTts(context: Context) {\n        assets = context.assets\n\n        if (dataDir != null) {\n            val newDir = copyDataDir(context, dataDir!!)\n            dataDir = \"$newDir/$dataDir\"\n        }\n\n        val config = getOfflineTtsConfig(\n            modelDir = modelDir!!,\n            modelName = modelName ?: \"\",\n            acousticModelName = acousticModelName ?: \"\",\n            vocoder = vocoder ?: \"\",\n            voices = voices 
?: \"\",\n            lexicon = lexicon ?: \"\",\n            dataDir = dataDir ?: \"\",\n            dictDir = \"\",\n            ruleFsts = ruleFsts ?: \"\",\n            ruleFars = ruleFars ?: \"\",\n            isKitten = isKitten,\n        )\n\n        speed = PreferenceHelper(context).getSpeed()\n        speakerId = PreferenceHelper(context).getSid()\n\n        tts = OfflineTts(assetManager = assets, config = config)\n    }\n\n\n    private fun copyDataDir(context: Context, dataDir: String): String {\n        Log.i(TAG, \"data dir is $dataDir\")\n        copyAssets(context, dataDir)\n\n        val newDataDir = context.getExternalFilesDir(null)!!.absolutePath\n        Log.i(TAG, \"newDataDir: $newDataDir\")\n        return newDataDir\n    }\n\n    private fun copyAssets(context: Context, path: String) {\n        val assets: Array<String>?\n        try {\n            assets = context.assets.list(path)\n            if (assets!!.isEmpty()) {\n                copyFile(context, path)\n            } else {\n                val fullPath = \"${context.getExternalFilesDir(null)}/$path\"\n                val dir = File(fullPath)\n                dir.mkdirs()\n                for (asset in assets.iterator()) {\n                    val p: String = if (path == \"\") \"\" else \"$path/\"\n                    copyAssets(context, p + asset)\n                }\n            }\n        } catch (ex: IOException) {\n            Log.e(TAG, \"Failed to copy $path. 
$ex\")\n        }\n    }\n\n    private fun copyFile(context: Context, filename: String) {\n        try {\n            val newFilename = context.getExternalFilesDir(null).toString() + \"/\" + filename\n            // Log.i(TAG, \"Copying $filename to $newFilename\")\n            // use {} closes both streams even if the copy throws\n            context.assets.open(filename).use { istream ->\n                FileOutputStream(newFilename).use { ostream ->\n                    istream.copyTo(ostream)\n                }\n            }\n        } catch (ex: Exception) {\n            Log.e(TAG, \"Failed to copy $filename, $ex\")\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsService.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport android.media.AudioFormat\nimport android.speech.tts.SynthesisCallback\nimport android.speech.tts.SynthesisRequest\nimport android.speech.tts.TextToSpeech\nimport android.speech.tts.TextToSpeechService\nimport android.util.Log\n\n/*\nhttps://developer.android.com/reference/java/util/Locale#getISO3Language()\nhttps://developer.android.com/reference/java/util/Locale#getISO3Country()\n\neng, USA,\neng, USA, POSIX\neng,\neng, GBR\nafr,\nafr, NAM\nafr, ZAF\nagq\nagq, CMR\naka,\naka, GHA\namh,\namh, ETH\nara,\nara, 001\nara, ARE\nara, BHR,\ndeu\ndeu, AUT\ndeu, BEL\ndeu, CHE\ndeu, ITA\ndeu, ITA\ndeu, LIE\ndeu, LUX\nspa,\nspa, 419\nspa, ARG,\nspa, BRA\nfra,\nfra, BEL,\nfra, FRA,\n\nE  Failed to check TTS data, no activity found for Intent\n{ act=android.speech.tts.engine.CHECK_TTS_DATA pkg=com.k2fsa.sherpa.chapter5 })\n\nE Failed to get default language from engine com.k2fsa.sherpa.chapter5\nEngine failed voice data integrity check (null return)com.k2fsa.sherpa.chapter5\nFailed to get default language from engine com.k2fsa.sherpa.chapter5\n\n*/\n\nclass TtsService : TextToSpeechService() {\n    override fun onCreate() {\n        Log.i(TAG, \"onCreate tts service\")\n        super.onCreate()\n\n        // see https://github.com/Miserlou/Android-SDK-Samples/blob/master/TtsEngine/src/com/example/android/ttsengine/RobotSpeakTtsService.java#L68\n        onLoadLanguage(TtsEngine.lang, \"\", \"\")\n        if (TtsEngine.lang2 != null) {\n            onLoadLanguage(TtsEngine.lang2, \"\", \"\")\n        }\n    }\n\n    override fun onDestroy() {\n        Log.i(TAG, \"onDestroy tts service\")\n        super.onDestroy()\n    }\n\n    // https://developer.android.com/reference/kotlin/android/speech/tts/TextToSpeechService#onislanguageavailable\n    override fun onIsLanguageAvailable(_lang: String?, _country: String?, _variant: String?): Int {\n        val lang = _lang ?: \"\"\n\n        if (lang == TtsEngine.lang || 
lang == TtsEngine.lang2) {\n            return TextToSpeech.LANG_AVAILABLE\n        }\n\n        return TextToSpeech.LANG_NOT_SUPPORTED\n    }\n\n    override fun onGetLanguage(): Array<String> {\n        return arrayOf(TtsEngine.lang!!, \"\", \"\")\n    }\n\n    // https://developer.android.com/reference/kotlin/android/speech/tts/TextToSpeechService#onLoadLanguage(kotlin.String,%20kotlin.String,%20kotlin.String)\n    override fun onLoadLanguage(_lang: String?, _country: String?, _variant: String?): Int {\n        Log.i(TAG, \"onLoadLanguage: $_lang, $_country\")\n        val lang = _lang ?: \"\"\n\n        return if (lang == TtsEngine.lang || lang == TtsEngine.lang2) {\n            Log.i(TAG, \"creating tts, lang :$lang\")\n            TtsEngine.createTts(application)\n            TextToSpeech.LANG_AVAILABLE\n        } else {\n            Log.i(TAG, \"lang $lang not supported, tts engine lang: ${TtsEngine.lang}, ${TtsEngine.lang2}\")\n            TextToSpeech.LANG_NOT_SUPPORTED\n        }\n    }\n\n    override fun onStop() {}\n\n    override fun onSynthesizeText(request: SynthesisRequest?, callback: SynthesisCallback?) 
{\n        if (request == null || callback == null) {\n            return\n        }\n        val language = request.language\n        val country = request.country\n        val variant = request.variant\n        val text = request.charSequenceText.toString()\n        // Map Android TTS speech rate (where 100 == normal) to engine speed (1.0 == normal)\n        // Allow per-request override from external apps; fallback to engine default if absent.\n        val rate = runCatching { request.speechRate }.getOrDefault(-1)\n        val engineSpeed = if (rate > 0) {\n            // Map 100 -> 1.0f\n            val mapped = rate / 100.0f\n            mapped.coerceIn(MIN_TTS_SPEED, MAX_TTS_SPEED)\n        } else {\n            // Fallback to current engine/global setting\n            TtsEngine.speed\n        }\n\n        val ret = onIsLanguageAvailable(language, country, variant)\n        if (ret == TextToSpeech.LANG_NOT_SUPPORTED) {\n            callback.error()\n            return\n        }\n        Log.i(TAG, \"text: $text, engineSpeed: $engineSpeed\")\n        val tts = TtsEngine.tts!!\n\n        // Note that AudioFormat.ENCODING_PCM_FLOAT requires API level >= 24\n        // callback.start(tts.sampleRate(), AudioFormat.ENCODING_PCM_FLOAT, 1)\n\n        callback.start(tts.sampleRate(), AudioFormat.ENCODING_PCM_16BIT, 1)\n\n        if (text.isBlank() || text.isEmpty()) {\n            callback.done()\n            return\n        }\n\n        val ttsCallback: (FloatArray) -> Int = fun(floatSamples): Int {\n            // convert FloatArray to ByteArray\n            val samples = floatArrayToByteArray(floatSamples)\n            val maxBufferSize: Int = callback.maxBufferSize\n            var offset = 0\n            while (offset < samples.size) {\n                val bytesToWrite = Math.min(maxBufferSize, samples.size - offset)\n                callback.audioAvailable(samples, offset, bytesToWrite)\n                offset += bytesToWrite\n            }\n\n            // 1 
means to continue\n            // 0 means to stop\n            return 1\n        }\n\n        Log.i(TAG, \"text: $text\")\n        tts.generateWithCallback(\n            text = text,\n            sid = TtsEngine.speakerId,\n            speed = engineSpeed,\n            callback = ttsCallback,\n        )\n\n        callback.done()\n    }\n\n    private fun floatArrayToByteArray(audio: FloatArray): ByteArray {\n        // byteArray holds 16-bit little-endian PCM: two bytes per sample\n        val byteArray = ByteArray(audio.size * 2)\n        for (i in audio.indices) {\n            // clamp to [-1, 1] so out-of-range samples saturate instead of wrapping around\n            val sample = (audio[i].coerceIn(-1.0f, 1.0f) * 32767).toInt()\n            byteArray[2 * i] = sample.toByte()\n            byteArray[2 * i + 1] = (sample shr 8).toByte()\n        }\n        return byteArray\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/TtsViewModel.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport android.app.Application\nimport android.speech.tts.TextToSpeech\nimport android.speech.tts.TextToSpeech.OnInitListener\nimport android.speech.tts.UtteranceProgressListener\nimport android.util.Log\nimport androidx.lifecycle.ViewModel\nimport java.util.Locale\n\nclass TtsApp : Application() {\n    companion object {\n        lateinit var instance: TtsApp\n    }\n\n    override fun onCreate() {\n        super.onCreate()\n        instance = this\n    }\n\n}\n\nclass TtsViewModel : ViewModel() {\n\n    // https://developer.android.com/reference/kotlin/android/speech/tts/TextToSpeech.OnInitListener\n    private val onInitListener = object : OnInitListener {\n        override fun onInit(status: Int) {\n            when (status) {\n                TextToSpeech.SUCCESS -> Log.i(TAG, \"Init tts succeeded\")\n                TextToSpeech.ERROR -> Log.i(TAG, \"Init tts failed\")\n                else -> Log.i(TAG, \"Unknown status $status\")\n            }\n        }\n    }\n\n    // https://developer.android.com/reference/kotlin/android/speech/tts/UtteranceProgressListener\n    private val utteranceProgressListener = object : UtteranceProgressListener() {\n        override fun onStart(utteranceId: String?) {\n            Log.i(TAG, \"onStart: $utteranceId\")\n        }\n\n        override fun onStop(utteranceId: String?, interrupted: Boolean) {\n            Log.i(TAG, \"onStop: $utteranceId, $interrupted\")\n            super.onStop(utteranceId, interrupted)\n        }\n\n        override fun onError(utteranceId: String?, errorCode: Int) {\n            Log.i(TAG, \"onError: $utteranceId, $errorCode\")\n            super.onError(utteranceId, errorCode)\n        }\n\n        override fun onDone(utteranceId: String?) {\n            Log.i(TAG, \"onDone: $utteranceId\")\n        }\n\n        @Deprecated(\"Deprecated in Java\")\n        override fun onError(utteranceId: String?) 
{\n            Log.i(TAG, \"onError: $utteranceId\")\n        }\n    }\n\n    val tts = TextToSpeech(TtsApp.instance, onInitListener, \"com.k2fsa.sherpa.onnx.tts.engine\")\n\n    init {\n        tts.setLanguage(Locale(TtsEngine.lang!!))\n        tts.setOnUtteranceProgressListener(utteranceProgressListener)\n    }\n\n    override fun onCleared() {\n        super.onCleared()\n        tts.shutdown()\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/ui/theme/Color.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine.ui.theme\n\nimport androidx.compose.ui.graphics.Color\n\nval Purple80 = Color(0xFFD0BCFF)\nval PurpleGrey80 = Color(0xFFCCC2DC)\nval Pink80 = Color(0xFFEFB8C8)\n\nval Purple40 = Color(0xFF6650a4)\nval PurpleGrey40 = Color(0xFF625b71)\nval Pink40 = Color(0xFF7D5260)"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/ui/theme/Theme.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine.ui.theme\n\nimport android.app.Activity\nimport android.os.Build\nimport androidx.compose.foundation.isSystemInDarkTheme\nimport androidx.compose.material3.MaterialTheme\nimport androidx.compose.material3.darkColorScheme\nimport androidx.compose.material3.dynamicDarkColorScheme\nimport androidx.compose.material3.dynamicLightColorScheme\nimport androidx.compose.material3.lightColorScheme\nimport androidx.compose.runtime.Composable\nimport androidx.compose.runtime.SideEffect\nimport androidx.compose.ui.graphics.toArgb\nimport androidx.compose.ui.platform.LocalContext\nimport androidx.compose.ui.platform.LocalView\nimport androidx.core.view.WindowCompat\n\nprivate val DarkColorScheme = darkColorScheme(\n    primary = Purple80,\n    secondary = PurpleGrey80,\n    tertiary = Pink80\n)\n\nprivate val LightColorScheme = lightColorScheme(\n    primary = Purple40,\n    secondary = PurpleGrey40,\n    tertiary = Pink40\n\n    /* Other default colors to override\n    background = Color(0xFFFFFBFE),\n    surface = Color(0xFFFFFBFE),\n    onPrimary = Color.White,\n    onSecondary = Color.White,\n    onTertiary = Color.White,\n    onBackground = Color(0xFF1C1B1F),\n    onSurface = Color(0xFF1C1B1F),\n    */\n)\n\n@Composable\nfun SherpaOnnxTtsEngineTheme(\n    darkTheme: Boolean = isSystemInDarkTheme(),\n    // Dynamic color is available on Android 12+\n    dynamicColor: Boolean = true,\n    content: @Composable () -> Unit\n) {\n    val colorScheme = when {\n        dynamicColor && Build.VERSION.SDK_INT >= Build.VERSION_CODES.S -> {\n            val context = LocalContext.current\n            if (darkTheme) dynamicDarkColorScheme(context) else dynamicLightColorScheme(context)\n        }\n\n        darkTheme -> DarkColorScheme\n        else -> LightColorScheme\n    }\n    val view = LocalView.current\n    if (!view.isInEditMode) {\n        SideEffect {\n            val window = (view.context as Activity).window\n            
window.statusBarColor = colorScheme.primary.toArgb()\n            WindowCompat.getInsetsController(window, view).isAppearanceLightStatusBars = darkTheme\n        }\n    }\n\n    MaterialTheme(\n        colorScheme = colorScheme,\n        typography = Typography,\n        content = content\n    )\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine/ui/theme/Type.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine.ui.theme\n\nimport androidx.compose.material3.Typography\nimport androidx.compose.ui.text.TextStyle\nimport androidx.compose.ui.text.font.FontFamily\nimport androidx.compose.ui.text.font.FontWeight\nimport androidx.compose.ui.unit.sp\n\n// Set of Material typography styles to start with\nval Typography = Typography(\n    bodyLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 16.sp,\n        lineHeight = 24.sp,\n        letterSpacing = 0.5.sp\n    )\n    /* Other default text styles to override\n    titleLarge = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Normal,\n        fontSize = 22.sp,\n        lineHeight = 28.sp,\n        letterSpacing = 0.sp\n    ),\n    labelSmall = TextStyle(\n        fontFamily = FontFamily.Default,\n        fontWeight = FontWeight.Medium,\n        fontSize = 11.sp,\n        lineHeight = 16.sp,\n        letterSpacing = 0.5.sp\n    )\n    */\n)"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector android:height=\"108dp\" android:viewportHeight=\"12267\"\n    android:viewportWidth=\"12267\" android:width=\"108dp\" xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <path android:fillColor=\"#ffffff\"\n        android:pathData=\"m4121,10338c-0,-105 3,-209 2,-313 -2,-26 -15,-55 -40,-66 -11,-14 -35,-7 -22,12 10,48 20,96 30,143 6,22 8,44 14,65 1,13 14,35 3,42 -19,0 -40,-17 -56,-7 -4,17 5,35 8,52 8,28 17,57 15,87 -1,15 -3,33 -13,45 -14,-5 -17,-25 -32,-30 -21,-7 -39,12 -60,12 -17,3 -31,12 -48,14 -19,11 -20,-16 -25,-28 -8,-21 -17,-43 -23,-65 -2,-24 -28,-31 -47,-37 -28,-11 -57,-21 -85,-32 -21,1 -31,-21 -17,-36 18,-51 37,-102 55,-153 5,-25 -10,-49 -29,-62 -80,-72 -157,-148 -234,-223 3,-20 27,-27 38,-43 40,-35 89,-60 120,-104 21,-29 32,-65 35,-100 3,-26 28,-39 45,-55 61,-50 122,-101 189,-143 18,-12 35,-29 34,-52 2,-22 -18,-31 -34,-39 -26,-20 -38,-53 -40,-84 -6,-26 -21,9 -34,3 -5,-19 -1,-40 -2,-60 -0,-20 6,-47 2,-63 -19,8 -36,18 -54,28 -10,-9 6,-27 6,-39 47,-166 93,-333 140,-499 -17,14 -28,34 -42,51 -23,31 -50,61 -55,101 -5,27 -4,54 -6,81 -15,5 -27,-19 -35,-31 -17,-33 -16,-74 2,-107 21,-42 44,-82 67,-123 -16,-18 -44,1 -60,-18 -8,-21 22,-31 31,-47 23,-25 49,-48 78,-66 20,-13 23,-38 20,-60 -3,-38 -5,-76 -8,-114 -3,-38 -7,-76 -13,-113 -10,16 -11,35 -18,52 -14,45 -27,89 -42,134 -25,6 -8,-31 -20,-42 -8,8 -20,18 -29,5 -14,-15 -12,-37 -21,-54 -14,-42 -25,-87 -18,-131 4,-35 16,-69 20,-103 -20,-10 -44,-11 -64,-22 -31,-13 -65,-35 -72,-71 -4,-22 4,-45 15,-63 18,-34 55,-57 59,-98 6,-33 -10,-65 -8,-97 1,-71 28,-141 75,-193 20,-21 44,-36 70,-50 25,-16 46,-43 44,-74 -0,-18 -5,-35 -10,-52 -20,-11 -43,-0 -64,-4 -26,0 -53,0 -79,2 -36,-45 -91,-74 -119,-126 -15,-30 -14,-64 -19,-96 -8,-65 -16,-130 -17,-196 -1,-41 19,-79 31,-117 -2,-21 -28,-28 -39,-43 -18,-16 -36,-40 -28,-66 8,-28 34,-46 54,-66 43,-38 98,-59 148,-86 121,-62 241,-125 365,-181 38,-13 80,-8 119,-2 55,11 105,35 157,54 16,-12 31,-35 54,-30 29,3 52,24 80,30 13,-18 22,-40 43,-51 35,-22 77,-19 
116,-23 26,-1 51,-7 75,-15 108,-27 218,-49 327,-72 21,-6 41,7 38,29 53,242 82,490 155,728 8,26 17,52 26,78 2,-22 -7,-44 -9,-66 -20,-106 -39,-213 -59,-319 12,-21 22,16 24,26 47,128 93,257 141,385 7,18 26,-5 16,-17 -17,-54 -32,-108 -48,-163 -40,-136 -79,-272 -119,-407 13,-10 23,17 36,21 161,132 322,264 483,395 17,-8 9,-33 18,-48 19,-73 40,-147 72,-215 13,-19 39,-21 60,-19 24,10 28,-20 29,-36 7,-34 14,-67 19,-101 -21,-13 -47,-8 -69,-12 -26,-2 -25,-31 -24,-50 2,-15 5,-51 -19,-47 -16,5 -18,-16 -27,-25 -14,-24 -43,-32 -69,-35 -24,-3 -48,-1 -71,-4 -8,-7 -4,-34 -20,-19 -12,6 -30,29 -41,20 -0,-11 15,-41 -7,-30 -24,4 -31,-24 -41,-39 -17,-27 -36,-59 -27,-92 5,-20 16,-40 10,-61 -11,-22 16,-34 31,-43 16,-15 -32,-44 -1,-45 20,2 39,-5 53,-20 20,-21 30,-49 52,-69 14,-15 29,-29 44,-43 12,2 14,34 25,11 12,-13 17,-33 35,-40 -1,16 15,29 27,15 57,-22 114,-45 171,-69 31,-12 64,-18 97,-16 41,1 78,-21 107,-49 36,-35 61,-82 67,-131 9,-60 30,-118 51,-175 22,-60 54,-116 82,-174 11,-20 28,-36 43,-53 18,-16 44,-17 66,-25 71,-21 142,-39 216,-45 48,-5 96,-8 144,-13 21,5 -9,26 -15,34 -8,17 18,-1 25,-2 38,-15 77,-32 118,-31 51,-1 103,1 154,-2 34,-9 69,-28 105,-16 35,14 52,50 73,79 15,23 30,46 45,69 4,22 -25,17 -38,12 -35,-10 -70,-18 -105,-27 10,15 31,17 44,28 57,33 112,72 154,124 15,10 0,17 -11,13 -36,-0 -71,-6 -107,-8 -16,-5 -8,16 4,13 76,28 151,56 227,85 8,-18 30,-25 47,-15 23,5 49,18 55,43 -1,12 8,31 17,12 5,-14 6,-46 29,-33 60,16 121,31 181,47 16,16 34,33 57,36 27,5 55,-2 79,-14 19,3 25,29 30,42 -3,11 -31,13 -24,23 22,5 45,5 68,8 17,2 33,4 50,7 -1,11 -28,36 -1,31 14,-3 33,-16 43,-1 10,20 11,47 33,59 20,11 31,34 26,56 -6,12 -14,34 7,33 20,3 21,25 7,36 -13,22 11,36 22,51 7,18 4,39 10,58 4,22 5,48 -12,65 -11,13 -46,18 -28,41 7,9 8,31 -9,19 -20,-12 -30,12 -36,27 -17,30 -33,61 -49,91 -14,17 -38,17 -57,26 -38,14 -76,26 -113,40 -6,9 10,30 -11,25 -31,-0 -61,1 -92,1 -14,1 -29,-0 -42,3 -42,25 -95,22 -140,8 -36,-10 -74,-12 -111,-8 -28,10 -37,42 -43,68 -7,42 0,84 -4,125 -2,31 -2,62 1,93 0,29 -18,60 -48,65 
-24,5 -50,-2 -71,-13 -22,2 -22,29 -26,46 -9,62 -50,111 -92,155 -14,14 -29,29 -31,50 -3,23 -7,48 -23,67 -30,38 -77,61 -125,66 -25,3 -47,-10 -70,-17 -49,-19 -96,-42 -144,-63 -19,9 -29,29 -43,43 9,21 34,28 47,46 19,19 40,40 48,67 3,28 0,59 17,83 22,38 52,71 76,108 80,108 160,217 239,325 11,17 25,34 31,53 -7,21 -33,24 -46,39 -36,27 -74,51 -110,78 -15,14 -20,37 -11,56 15,39 32,77 48,116 75,176 151,352 198,539 43,167 71,337 105,505 6,20 5,42 18,59 8,-4 1,-28 3,-40 -0,-99 -2,-199 -23,-297 0,-16 -12,-33 -4,-49 38,37 65,85 75,137 16,79 2,161 -18,238 -6,23 -13,46 -20,69 -20,8 -30,-19 -47,-25 -42,-32 -87,-62 -139,-75 -28,-8 -57,-14 -84,-24 -11,-21 -12,-46 -29,-64 -32,-44 -82,-68 -130,-91 -21,-14 -22,-43 -37,-62 -29,-44 -76,-70 -121,-94 -38,-23 -69,-60 -74,-105 -6,-40 -9,-81 -21,-120 -16,-60 -44,-115 -74,-168 -19,-43 -22,-91 -12,-136 23,-147 34,-296 32,-445 -0,-87 -3,-174 -12,-261 -4,-29 -14,-56 -25,-83 -5,23 2,46 2,70 3,32 5,65 7,97 -13,20 -27,-12 -30,-24 -11,-24 -23,-47 -33,-71 -22,-2 0,29 1,41 22,66 42,133 55,202 11,53 17,108 15,162 -0,67 -7,134 -20,200 -8,36 -18,72 -32,106 -17,-8 -12,-34 -22,-49 -15,-41 -35,-79 -53,-118 -17,-35 -39,-69 -69,-95 -87,-85 -182,-163 -277,-240 -12,-3 7,21 8,28 76,162 188,303 294,447 38,51 75,103 110,155 0,16 -20,-7 -28,-7 -77,-42 -154,-85 -231,-127 -20,15 10,31 20,41 128,112 229,251 336,383 -6,25 -36,11 -46,-2 -45,-31 -90,-63 -135,-94 -8,11 5,38 -20,33 -12,4 -41,2 -42,11 84,54 170,105 255,159 22,14 45,28 68,41 12,19 -19,30 -25,46 -12,20 22,15 33,20 42,10 84,20 126,29 4,18 -14,30 -29,20 -65,-12 -129,-23 -194,-34 -11,22 25,22 38,31 104,42 207,83 312,122 18,8 31,31 20,50 -12,29 -33,53 -49,80 -17,27 -35,53 -51,80 21,-3 39,-16 59,-22 62,-24 125,-49 187,-74 20,6 7,30 -1,41 -9,17 -18,33 -25,51 31,-1 61,-8 91,-11 25,-3 51,-7 76,-9 12,18 -23,24 -31,36 -8,7 -32,18 -29,27 14,-3 43,0 26,20 -3,11 -23,34 -9,39 28,-5 56,-13 84,-18 20,2 2,29 2,42 -1,23 25,-1 35,-4 12,-12 43,-10 31,12 -12,14 -21,31 -32,46 -4,8 -25,31 -5,22 14,-6 33,-24 46,-6 18,19 34,43 59,54 
19,-0 33,-23 53,-9 31,8 62,13 93,21 7,13 8,43 29,27 17,-6 42,-1 46,18 6,12 -3,38 10,41 19,-9 32,-27 52,-32 24,1 25,34 45,41 20,3 41,-23 60,-6 21,21 9,58 33,77 7,6 40,6 21,19 -21,7 -27,32 -10,47 10,11 13,36 27,39 12,-12 12,-31 21,-45 7,-17 14,-34 20,-52 -6,-4 -33,-16 -12,-16 61,-1 111,-41 167,-59 54,-17 111,-11 166,-5 21,1 44,3 55,24 13,16 -3,22 -17,20 -22,2 -45,1 -67,4 -24,8 5,25 18,24 66,18 133,36 200,52 19,6 46,12 46,36 0,24 -23,38 -43,46 -41,14 -83,-1 -123,-8 -12,0 -40,-15 -31,6 7,18 6,44 28,51 20,6 40,-5 61,-5 31,-1 61,5 91,13 16,3 37,14 31,34 -7,25 -19,51 -41,67 -10,18 19,35 12,56 -1,23 -19,40 -38,49 -20,17 -17,45 -24,68 -5,21 -11,42 -19,62 10,25 9,59 -14,76 -21,18 -49,26 -74,37 0,115 0,230 0,345 -40,-0 -80,1 -119,-1 -7,-105 -14,-210 -24,-315 3,-19 -19,-24 -34,-25 -18,-7 -46,-9 -52,14 -2,42 3,85 5,127 2,30 4,61 5,91 3,26 3,51 6,77 -3,10 9,35 -7,33 -47,0 -94,0 -141,0 -5,-40 -7,-80 -11,-120 -4,-56 -8,-111 -15,-167 -7,-16 -11,-53 -35,-43 -33,12 -64,30 -91,53 -17,13 -37,22 -56,33 -17,9 -34,19 -51,27 -21,0 -25,-31 -47,-29 -17,-3 -46,16 -56,-2 -12,-20 -48,-6 -54,-31 -1,-26 -29,-28 -47,-38 -161,-77 -322,-156 -483,-234 -38,-18 -80,-30 -122,-33 -31,-1 -62,-2 -92,5 6,13 28,6 40,13 47,12 94,28 139,46 23,11 47,21 71,32 45,21 88,46 132,70 47,27 93,56 140,85 42,25 88,44 121,81 17,18 31,40 41,63 16,39 27,81 34,122 2,20 5,40 1,60 -6,14 -17,-2 -26,3 -19,-8 -24,-31 -36,-46 -25,-42 -49,-84 -74,-126 -3,-14 -22,-11 -15,4 7,18 10,37 17,56 4,15 -5,11 -12,2 -55,-42 -109,-85 -164,-127 -8,-4 -21,-21 -27,-16 5,15 17,27 23,41 42,67 97,125 148,185 4,9 31,26 17,31 -502,2 -1004,1 -1506,1 -310,-0 -621,0 -931,-1 -16,-3 4,-24 4,-34 49,-136 105,-271 184,-393 4,-9 31,-33 7,-31 -56,2 -113,7 -169,10 -16,1 -32,2 -49,2 -19,9 -14,36 -21,53 -10,45 -21,90 -31,135 2,11 -9,22 -11,6 -7,-11 -9,-39 -24,-37 -16,21 -30,45 -45,67 -5,9 -18,34 -20,13 -12,-13 -3,-34 -6,-50 -1,-58 -0,-116 -2,-173 -11,-16 -34,-0 -50,-3 -47,4 -95,8 -142,13 -15,12 2,33 -1,48 14,100 28,200 42,300 4,24 12,47 7,71 3,1 5,-11 3,0 -14,11 
-36,-0 -53,4 -16,4 -38,-0 -34,-21 -11,-120 -20,-240 -32,-360 -5,-14 3,-46 -20,-38 -37,4 -74,8 -111,12 -13,16 -1,38 -6,57 0,92 -1,183 -5,275 -0,25 -0,50 -0,76 -26,0 -52,0 -79,0 -2,-68 -7,-136 -9,-205 -3,-57 -5,-114 -8,-171 1,-13 0,-35 -19,-27 -71,5 -141,8 -212,14 -7,21 9,41 11,62 14,52 27,105 45,156 7,25 5,52 -6,76 -7,23 4,47 8,70 10,11 6,34 -12,26 -60,1 -120,0 -180,0 0,-40 0,-80 1,-120zM4842,10439c-2,-11 -1,13 0,0zM7749,10377c-4,-6 0,6 0,0zM7147,10366c-55,-141 -148,-268 -268,-362 -63,-52 -136,-91 -210,-126 -33,-15 -66,-28 -100,-40 -4,10 22,21 29,30 41,35 82,69 123,104 24,12 2,-36 23,-32 14,10 26,22 39,34 16,20 27,44 43,64 46,66 103,123 161,178 35,33 70,65 104,98 20,17 36,37 57,52zM7743,10343c-3,-7 -2,6 0,0zM7512,10338c-6,0 5,7 0,0zM7617,10337c-3,-0 3,6 0,0zM7502,10329c-4,-5 2,6 0,0zM7247,10156c7,-22 -27,-25 -37,-40 -34,-26 -68,-51 -103,-76 -18,14 13,26 22,36 33,29 67,58 102,86 6,4 13,2 16,-5zM7470,10136c19,-16 -7,-32 -22,-38 -61,-38 -122,-75 -182,-113 -9,-17 -44,-8 -29,11 19,14 40,24 59,38 41,27 83,53 125,80 17,5 30,31 50,23zM7149,8470c-12,-125 -46,-252 -119,-355 -11,-15 -28,-39 -38,-45 9,31 23,60 33,90 40,103 79,206 119,308 1,3 7,10 5,2zM7319,8389c-10,-208 -49,-416 -133,-608 -39,-89 -86,-173 -142,-252 -12,-16 -2,15 1,21 44,142 88,283 133,425 5,14 9,28 13,42 11,-16 3,-46 24,-53 22,2 14,30 20,44 28,138 56,275 84,413 5,-11 0,-22 1,-34zM7169,8174c-25,-131 -70,-261 -149,-371 -29,-38 -62,-74 -102,-101 -16,-5 4,16 5,23 79,146 158,291 237,437 2,2 9,20 8,12zM3893,7961c13,-35 22,-75 5,-110 -10,-16 -6,14 -7,20 -0,31 -0,62 1,94l1,-2zM6762,7794c-2,-119 -38,-235 -82,-344 -47,-111 -108,-218 -188,-308 -22,-24 -45,-46 -71,-66 -12,11 14,34 1,40 -19,-7 -35,-21 -53,-28 -7,-4 -25,-17 -28,-9 111,145 235,282 320,445 48,90 83,188 97,289 8,24 3,-13 4,-19zM3934,7658c-0,-21 1,-43 -1,-64 -26,17 -44,46 -51,77 -8,43 11,87 36,121 3,8 18,22 14,5 1,-46 1,-92 2,-139zM7001,7695c-34,-83 -68,-167 -102,-250 -5,-18 -9,-1 -7,8 0,72 30,141 72,199 12,17 23,34 38,48 1,-2 -0,-4 -1,-5zM6552,7567c23,-7 6,-30 
-4,-41 -55,-86 -109,-172 -164,-257 -19,3 -21,24 -6,33 56,88 112,176 168,264 1,2 4,3 5,1zM5349,6953c-5,-2 3,10 0,0zM5561,6729c-16,-22 -39,-38 -57,-58 -36,-35 -72,-70 -109,-103 -6,16 8,33 11,49 11,29 22,61 45,83 30,27 72,35 111,30zM9053,4618c-20,0 -42,-3 -51,-23 -86,-94 -172,-190 -241,-298 -15,-23 -27,-48 -38,-74 -14,-20 -41,-19 -62,-15 -20,4 -40,1 -60,1 -19,2 -38,8 -58,5 -20,3 -39,-7 -58,-6 -21,-0 -41,-7 -61,-6 -35,-3 -70,-7 -104,-14 -22,-10 -48,-7 -70,-18 -15,-5 -30,-10 -44,-16 -26,13 -48,33 -74,45 -20,-1 -35,10 -43,28 -11,21 -31,35 -43,55 -14,19 -38,25 -57,38 -18,8 -34,-9 -47,-20 -13,-13 -23,-29 -31,-45 -22,-9 -11,-35 -15,-53 -18,-2 -27,-30 -4,-30 7,1 20,2 16,-9 16,6 20,-23 17,-28 -14,-3 -2,21 -17,13 -8,-14 -25,-9 -34,-0 7,-0 33,-7 26,6 -18,11 -47,7 -50,-17 -11,-23 12,-37 27,-49 25,-21 51,-42 74,-65 11,-16 33,-28 31,-50 -7,-17 13,-24 26,-21 1,-10 4,-20 12,-23 4,-18 39,-12 24,-35 -3,-11 -31,-30 -13,-35 11,-2 25,20 29,11 -5,-19 -29,-29 -26,-51 8,-3 28,15 19,-3 5,-20 21,5 33,3 12,14 26,-1 38,-9 12,-10 25,-22 27,-38 9,-22 26,-40 36,-62 24,-43 42,-90 62,-135 12,-19 18,-41 31,-58 5,-17 10,-33 19,-48 49,-100 83,-207 108,-316 6,-24 11,-48 21,-70 15,-15 28,17 34,28 2,11 2,31 18,18 18,-6 27,-21 35,-38 15,-21 29,16 31,30 6,34 22,65 33,98 9,23 18,46 27,69 -5,23 18,38 23,58 13,29 20,60 38,87 48,87 90,178 135,266 55,110 113,218 170,327 23,48 47,96 77,140 34,53 73,102 111,152 18,31 32,64 43,98 -2,19 -24,34 -10,54 9,21 -5,40 -14,58 -5,14 -19,34 -36,24 -15,-14 -31,4 -19,20 5,16 -10,29 -21,35 -11,-9 -22,-21 -33,-31 -11,-12 -26,-21 -36,-34 -6,-13 -17,-47 -35,-35 5,18 -12,2 -18,1 2,13 -23,22 -5,31 17,0 34,-3 44,15 18,18 33,38 45,60 0,7 -10,7 -13,3zM9028,4599c-6,-13 -7,10 -1,2zM9019,4587c17,-11 -17,-32 -12,-10 3,4 6,10 12,10zM9001,4553c-10,-10 4,12 0,0zM8965,4513c-8,-9 -21,-37 -32,-28 4,13 18,33 32,28zM8993,4490c-1,-8 -19,-33 -20,-15 2,5 14,29 20,15zM8672,4185c2,-14 -25,-0 -7,2 2,0 5,-0 7,-2zM8623,4179c-8,-23 -35,-29 -56,-23 -17,-5 -37,-11 -53,1 -18,9 4,19 14,11 19,-5 32,16 50,5 16,-8 
29,7 43,9l1,-1zM8661,4166c3,-18 -37,-1 -10,3 3,1 9,2 10,-3zM7945,4150c13,-14 -24,-19 -10,-1 2,4 8,3 10,1zM8611,4145c-11,-12 -15,13 0,0zM8589,4141c-6,-11 -7,11 0,0zM8664,4136c-9,-20 -35,-19 -52,-13 -18,-10 -35,2 -55,-6 -29,-3 -58,-1 -86,-9 -22,1 -45,3 -66,-2 -29,-5 -57,-13 -87,-13 -6,-4 -29,-7 -18,3 15,10 32,20 48,8 21,-2 38,11 58,16 21,-9 42,8 63,0 23,-1 44,13 68,8 19,1 41,-7 57,4 17,-0 33,-2 50,3 7,2 15,3 22,0zM7914,4120c10,-6 5,-33 22,-16 8,5 31,13 26,-3 10,-10 22,-21 19,-35 9,-7 44,2 32,-17 -12,-10 -25,-27 -42,-20 -20,13 -31,34 -46,52 -12,10 -30,36 -18,47 3,-2 4,-5 6,-7zM7985,4105c-3,-13 -18,7 -3,4 2,2 4,-2 3,-4zM8430,4033c2,-18 -36,-17 -25,-2 8,3 17,4 25,2zM8368,4022c-2,-12 -24,3 -7,2 2,0 5,0 7,-2zM8567,3929c1,-21 -15,-38 -21,-58 -16,-35 -34,-69 -54,-101 -18,-18 -34,6 -39,22 -12,22 -20,47 -32,70 -14,16 -22,49 6,56 19,13 42,15 63,9 25,-0 50,8 75,6 1,-1 2,-2 2,-3zM8357,3887c-0,-11 1,-22 11,-26 -1,-12 1,-24 13,-25 9,-17 16,-35 21,-53 2,-19 18,-30 23,-48 4,-16 31,-13 21,-32 -7,-12 4,-19 11,-23 2,-23 -14,-43 -28,-59 -19,-13 2,-32 -12,-47 -7,-14 -8,-40 -31,-29 -12,9 -25,13 -38,4 -16,5 -22,31 -13,44 14,7 27,22 12,36 -10,18 -19,39 -13,59 -8,17 -7,47 -31,48 -14,1 -18,15 -8,24 6,22 -21,40 -13,60 10,3 17,18 18,20 13,-11 32,-33 50,-18 13,9 2,23 -10,22 -12,12 -8,42 10,44 3,2 9,3 10,-2zM8364,3718c-6,-19 24,7 2,3zM8346,3591c-20,-16 31,-30 16,-6 -3,6 -10,12 -16,6zM8275,3746c14,-18 -25,-31 -22,-12 6,5 13,16 22,12zM8317,3712c-1,-11 -13,7 0,0zM8326,3624c6,-16 -25,-5 -5,1 2,1 4,1 5,-1zM8323,3591c-10,-6 3,13 0,0zM8438,3219c-3,-9 -27,-23 -21,-4 3,9 14,7 21,4zM8427,3134c2,-12 -6,9 0,0zM5128,4517c-15,-4 -21,-41 -29,-38 5,24 -28,24 -34,4 -2,-7 -22,-20 -11,-7 1,22 -17,-2 -20,-10 -7,-21 -27,-29 -42,-42 -13,-9 -15,-24 -5,-35 -3,-21 -32,-26 -33,-50 -32,-79 -54,-162 -68,-245 -2,-19 -7,-37 -7,-56 -1,-22 -9,-43 -9,-65 -2,-25 -3,-49 -4,-74 -12,-22 -40,-10 -60,-13 -62,-1 -124,-9 -186,-14 -17,-2 -19,14 -17,26 -4,12 17,17 8,32 -8,20 -10,40 -10,61 -6,17 -26,25 -33,43 -12,20 -20,42 -27,63 -3,16 
8,42 -14,48 -16,8 -3,-26 -15,-12 -4,15 -17,26 -21,39 11,2 28,5 21,21 -5,19 -27,29 -27,50 -8,17 -14,39 -31,50 -17,4 -45,-12 -28,-31 0,-22 -9,13 -13,20 -2,10 -23,37 -25,17 4,-12 -0,-19 -11,-10 -5,-8 -9,-13 -18,-13 -11,-13 7,-23 17,-24 9,-7 7,-27 -7,-23 -6,-15 9,-42 -2,-51 -7,17 -20,-9 -26,9 -10,22 -5,47 -15,69 -8,11 -1,37 -18,37 1,-20 7,-41 10,-62 5,-22 8,-45 13,-67 4,-19 20,-10 30,-4 12,-5 4,-15 -4,-14 -2,-7 -11,-25 -13,-8 -4,15 -17,-0 -10,-9 4,-39 12,-78 7,-117 -9,-118 -20,-237 -38,-354 -10,-61 -23,-122 -46,-179 -22,-60 -49,-118 -76,-176 -16,-2 -17,29 -32,35 -20,10 -41,-4 -62,-7 -54,-13 -111,-25 -167,-13 -23,3 -46,-5 -69,2 -16,4 -32,5 -48,1 -24,-3 -46,7 -69,11 -19,5 -35,17 -55,19 -15,6 -19,27 -39,21 -19,1 -32,20 -53,17 -15,9 -32,5 -48,10 -12,17 -34,16 -50,25 -50,23 -95,55 -141,84 -31,14 -60,34 -78,64 -15,23 -20,51 -23,79 -2,17 -14,37 2,52 19,17 46,17 70,21 68,6 137,-2 204,-9 69,-7 137,-17 206,-25 49,-3 99,-1 148,-0 79,4 162,16 230,60 19,14 39,29 51,49 9,25 13,52 20,77 3,28 -16,52 -26,77 -17,34 -37,66 -56,98 -8,13 -24,16 -19,33 0,19 -18,20 -31,20 -10,12 5,22 14,11 19,-1 6,25 -6,26 -10,17 -21,36 -43,38 -17,14 -30,35 -55,36 -17,-0 -12,14 -9,23 -8,5 -30,-2 -15,11 6,22 -25,20 -35,32 -15,6 -39,-3 -46,17 -10,16 -26,-4 -32,-8 -14,-0 -33,-19 -36,4 -11,12 -29,25 -46,18 -12,-24 -33,2 -50,5 -23,8 -49,11 -70,24 -20,13 -47,7 -66,23 -21,14 -44,-10 -63,9 -16,15 -37,17 -57,11 -19,3 -36,16 -55,14 -27,-5 -3,-31 12,-35 14,-4 9,-16 -4,-12 -13,-1 -32,16 -40,4 16,-10 17,-35 36,-41 139,-84 282,-160 429,-230 37,-18 76,-34 108,-61 14,-15 39,-29 39,-50 -13,-8 -30,-5 -45,-11 -36,-9 -73,-17 -109,-27 -6,-4 -5,-29 -14,-12 -19,10 -10,-22 -27,-20 -18,-8 -48,-1 -59,-19 6,-27 35,-9 53,-11 15,-2 38,-0 41,-16 19,1 38,2 56,-3 6,5 28,10 23,0 -14,-7 12,-15 15,-3 6,9 25,18 24,-0 18,-14 36,10 51,17 18,-1 3,-28 -11,-25 -22,-1 -41,-10 -61,-15 -26,1 -50,-7 -75,-10 -21,-3 -43,-8 -65,-5 -15,2 -30,0 -45,-3 -27,-2 -52,-16 -79,-10 -19,-0 -36,-12 -56,-7 -16,-2 -24,9 -28,22 -20,8 10,15 20,13 10,4 35,4 32,18 -17,4 
-35,1 -52,1 -18,1 -38,-5 -52,7 -22,-4 -43,-12 -66,-7 -16,1 -30,-6 -46,-2 -16,-10 -36,-12 -52,-21 -10,-17 -36,-5 -51,-18 -14,-3 -27,-6 -40,-13 -34,-15 -70,-26 -99,-50 -22,-26 -48,-49 -65,-78 -9,-12 -0,-28 -9,-41 -6,-13 -10,-26 -8,-40 -2,-22 8,-45 -5,-63 3,-20 1,-40 2,-60 6,-21 20,-40 27,-61 3,-17 18,-26 27,-40 11,-18 35,-12 48,-28 14,-15 12,-39 35,-46 18,-8 9,-37 33,-37 13,-7 25,-7 37,1 6,-18 30,-0 39,-15 3,-4 -25,-3 -14,-14 37,-27 80,-43 123,-56 109,-33 221,-53 332,-72 37,-5 74,-10 111,-15 49,1 98,-3 145,12 51,14 100,36 145,64 6,12 15,19 28,11 23,-7 35,-29 55,-41 10,-6 25,-3 22,-18 9,-16 31,-15 45,-22 19,-7 43,-10 60,4 38,28 63,70 87,110 40,71 73,146 103,222 5,18 13,35 20,52 6,20 3,41 11,60 2,19 21,25 36,28 77,14 155,25 233,30 23,5 26,-18 24,-35 6,-44 14,-87 27,-130 11,-33 21,-68 44,-96 18,-23 44,-39 73,-47 25,-13 46,-32 72,-44 16,-9 34,-15 53,-19 20,-8 39,-17 61,-19 2,23 -28,33 -31,56 -12,24 -19,50 -27,75 -13,16 -5,40 -6,57 11,19 3,42 10,62 2,21 -5,43 3,64 7,30 -8,59 -7,89 -0,21 2,42 2,64 -24,15 14,21 17,36 4,18 1,36 0,54 -2,19 -6,38 -14,56 -3,20 -3,43 -3,61 -14,12 -11,30 -7,45 3,19 19,10 17,-5 10,-23 19,9 18,20 -4,23 7,44 7,66 0,30 10,60 14,89 -1,14 12,38 1,47 -7,-4 -18,-28 -13,-7 8,20 13,40 17,61 11,20 3,41 -4,60 4,21 14,42 20,63 3,10 12,29 14,32 -5,-17 15,-15 13,1 3,20 23,13 15,-4 -6,-21 -18,-41 -19,-64 -3,-8 -3,-35 10,-26 21,58 31,120 46,180 3,17 -23,-4 -18,17 6,19 -12,37 -31,39 -14,-2 -24,-0 -31,14 -13,21 -39,10 -53,-2 -8,-4 -19,-29 -17,-8 0,9 2,32 -14,19zM5277,4422c3,-15 -23,-14 -9,0 2,3 8,5 9,-0zM5170,4358c3,-9 -14,-25 -8,-6 0,5 9,22 8,6zM5124,4349c-7,-20 -5,15 0,0zM3765,4294c5,-13 -9,7 -0,0zM3784,4283c-7,-17 -3,13 0,0zM3820,4276c3,-18 -27,6 -5,5 2,-1 4,-3 5,-5zM4403,4261c17,7 13,-19 9,-28 -11,-6 -10,18 -15,25 -1,4 3,5 6,3zM3864,4257c-5,-12 -6,11 0,0zM3911,4245c-8,-18 -4,18 0,0zM4435,4238c6,-12 -8,0 -1,4l1,-2zM4495,4205c-5,-16 -5,14 0,0zM3968,4173c-9,-10 -8,13 0,0zM4369,4112c21,-22 -35,-11 -4,1zM4389,4097c2,-17 -11,2 -1,6l1,-3zM4532,4097c-4,-15 -3,15 
0,0zM5165,4055c2,-11 1,-40 -8,-38 0,12 -5,33 8,38zM3881,3935c9,-22 -33,-13 -16,2 5,1 11,-0 16,-2zM4149,3928c1,-20 -11,16 0,0zM4587,3903c3,-8 3,-31 -2,-26 -0,13 -13,9 -16,7 -8,10 9,35 19,19zM4011,3892c-3,-12 -19,-44 -26,-18 -13,22 12,26 26,18zM3141,3594c4,-6 15,-34 0,-25 -2,3 -10,31 -0,25zM3191,3547c-8,-19 -12,18 0,0zM3197,3533c-9,-14 -3,16 0,0zM3211,3508c6,-15 -8,-2 -3,3l1,-1zM3180,3505c-3,-0 3,6 0,0zM3194,3490c15,-18 32,-34 49,-50 3,-21 -21,5 -26,12 -14,15 -27,30 -36,49 1,7 11,-10 13,-11zM3238,3483c9,-14 35,-37 28,-48 -17,10 -28,28 -43,41 -14,16 7,23 15,8zM3301,3411c19,-18 -19,11 -2,2zM3323,3375c-1,-15 -16,10 -0,1zM5031,4494c5,-25 16,15 0,0zM7095,4489c-14,-4 -27,-11 -42,-4 -20,1 -38,-12 -56,-19 -17,-12 -21,-37 -41,-45 -27,-15 -55,-31 -75,-55 -20,-19 -41,-38 -57,-60 -17,-19 -36,-37 -50,-59 -29,-37 -61,-72 -85,-113 -14,-21 -34,-38 -48,-59 -7,-19 -24,-31 -38,-44 -25,-23 -49,-47 -72,-73 -14,-18 -38,-3 -27,17 2,11 -15,13 -5,22 5,21 -12,39 -7,60 -4,22 -6,43 -13,64 -8,10 1,38 -18,36 -3,-5 -4,-25 -8,-8 -4,14 -22,25 -10,41 -18,36 -44,71 -82,88 -18,6 -44,0 -51,22 -13,10 -28,-11 -35,8 -10,11 -24,-2 -32,13 -11,12 -26,7 -38,12 -15,-12 -2,-36 -4,-53 11,-99 25,-198 37,-297 4,-81 7,-162 -6,-242 -16,-97 -42,-191 -70,-285 -9,-28 -17,-58 -33,-83 -2,-17 -20,-21 -29,-6 -13,12 -30,18 -46,27 -49,23 -101,40 -155,45 -53,3 -106,2 -159,1 -16,-4 -41,0 -34,23 6,21 5,43 6,64 8,40 10,80 16,120 20,8 42,-3 62,5 31,1 62,0 93,0 18,1 35,-2 52,-4 18,7 41,2 55,17 13,11 33,27 17,46 -6,15 -35,17 -28,36 8,10 9,20 -4,27 -16,11 -19,33 -36,42 -8,4 -19,9 -9,17 -7,20 -22,40 -42,50 -28,18 -61,23 -93,24 -41,4 -83,9 -123,17 -10,14 -10,34 -20,50 -10,24 -19,49 -29,73 4,22 32,25 51,26 22,-1 44,-3 66,-6 20,4 41,-1 62,-2 21,-5 42,-11 64,-8 22,0 45,0 67,-4 16,1 -8,27 13,22 18,-2 51,-4 49,23 -7,19 -22,32 -31,50 -12,15 -24,31 -42,38 -22,2 -34,23 -25,42 -14,18 -36,-14 -17,-25 12,-11 -29,-7 -23,9 19,13 6,33 -2,49 -11,21 -43,15 -58,33 -4,11 -14,14 -17,-0 -20,-15 -41,14 -63,5 -23,-2 -46,-7 -67,5 -14,10 -39,-15 -42,6 5,4 
32,-6 23,10 -20,1 -40,-1 -60,-0 -20,3 -39,-3 -56,-12 -27,-15 -51,-35 -74,-56 -11,-19 4,-43 -5,-62 -14,-13 -2,-28 3,-42 6,-17 6,-35 5,-53 1,-16 20,-40 3,-51 -21,8 0,-20 -6,-30 4,-20 1,-38 -6,-57 -3,-23 12,-42 16,-63 0,-22 -1,-44 1,-66 4,-31 -2,-62 3,-92 4,-36 4,-73 -3,-109 -9,-63 -25,-125 -44,-186 -11,-42 -22,-83 -34,-125 -9,-20 -21,-38 -26,-59 4,-24 -17,-41 -24,-61 -6,-15 -33,-37 -12,-50 28,-22 61,-38 87,-63 22,-26 45,-56 80,-67 23,1 32,28 48,41 14,17 30,35 53,39 80,23 161,46 244,62 25,5 51,6 77,5 31,5 63,8 94,-0 21,2 42,6 63,4 21,-2 41,-10 56,-24 18,-11 44,-2 55,-25 14,-18 36,-28 57,-33 21,-6 44,-10 63,4 31,2 62,-3 93,-4 33,-3 67,5 100,2 16,-3 31,-2 46,-0 18,-10 -4,24 15,18 18,5 29,-11 40,-19 18,-1 37,15 52,-1 22,-1 44,8 61,22 17,16 41,20 61,33 21,11 40,25 56,43 9,13 22,3 31,6 4,7 -5,27 10,17 18,2 20,25 19,38 6,10 6,16 18,13 12,3 9,20 -3,19 6,17 24,43 7,58 -10,-13 2,-36 -13,-47 -7,-9 -28,-6 -12,5 14,11 11,32 19,47 1,14 22,7 17,22 -3,19 21,21 12,39 -5,20 -13,40 -24,58 -11,17 -16,38 -32,52 -20,19 -37,40 -57,59 -6,11 -26,9 -31,13 13,10 -1,12 -10,11 -13,8 -26,27 -42,20 5,-18 -31,8 -14,14 -16,6 -33,-10 -49,-1 -10,-0 -26,18 -9,19 14,-6 33,-24 46,-8 -12,18 -42,18 -50,38 1,21 19,36 31,52 24,26 46,54 68,82 48,58 101,111 145,173 13,25 38,39 51,63 -14,16 13,13 19,25 14,14 28,28 42,43 19,20 34,43 52,64 15,15 32,28 45,45 -7,22 40,13 29,36 -11,22 24,7 29,24 7,10 18,28 16,5 6,-73 15,-146 26,-219 3,-22 6,-45 4,-67 8,-22 -18,-28 -33,-18 -8,9 -34,8 -19,-7 18,-17 38,-32 53,-52 18,-18 12,-46 10,-68 -1,-16 3,-33 -1,-49 -3,-15 -5,-30 -3,-45 -4,-37 -10,-74 -16,-111 -13,-79 -26,-158 -54,-233 -12,-28 -23,-58 -26,-88 4,-23 28,-33 43,-47 24,-18 50,-35 74,-53 12,-21 41,-23 53,-44 23,-29 49,-58 83,-74 19,-6 39,-8 57,-16 47,-16 98,-18 148,-9 59,10 117,34 165,71 10,12 40,25 28,41 14,11 20,31 41,31 13,2 39,16 29,30 -9,6 -32,-22 -19,-2 12,15 8,38 24,50 17,15 7,-14 1,-21 -12,-12 2,-30 14,-15 21,12 18,36 27,54 11,14 14,29 19,46 -2,24 9,45 15,68 7,14 31,27 15,43 -9,3 -3,-23 -9,-8 -7,20 21,29 21,43 
-8,9 -21,-8 -24,10 -4,18 22,11 17,29 1,19 -14,30 -25,43 -26,29 -51,59 -76,89 -18,11 -24,35 -45,42 -17,13 -39,23 -46,45 -5,10 -11,4 -13,-2 -10,13 -23,22 -36,32 -7,9 -1,20 -17,19 -33,8 -56,36 -88,46 -15,16 -42,14 -53,33 5,15 1,26 -15,31 -16,8 -33,16 -42,32 -6,23 -2,49 -14,70 -2,12 -27,24 -7,29 15,15 -7,32 -10,48 -12,25 -22,52 -36,76 -18,19 -22,46 -37,68 -11,20 -24,39 -39,57 -14,5 -23,-33 -31,-7 -9,19 -34,10 -44,21 -4,7 -18,34 -22,19 3,-9 -1,-18 -5,-5 -5,21 -16,40 -21,60 1,14 4,44 -14,44 -13,-5 -28,-4 -35,10 -13,2 -20,12 -21,23 -9,14 -26,21 -42,17 -3,-0 -5,-2 -7,-4zM7204,4352c13,-19 -12,-5 0,0zM7209,4322c9,-11 -14,-40 -13,-38 -4,12 -2,45 12,39zM7152,4307c-1,-14 -27,-6 -8,0 2,1 5,2 8,-0zM6226,4270c4,-7 -0,-31 -3,-11 -4,5 -3,36 2,17l0,-3zM5815,4243c7,-3 31,-3 25,-15 -15,12 -30,-24 -40,-6 9,8 1,13 -1,21 4,4 11,1 16,-0zM5768,4234c7,-11 5,-30 -11,-18 -25,5 -10,34 9,23l2,-3 0,-3zM5673,4230c9,-21 -29,6 -6,3l3,-1zM5834,4219c10,-16 -17,-1 0,0zM6228,4212c-6,3 4,6 0,0zM5819,4205c-1,-12 -6,9 0,0zM6955,4116c1,-13 -26,-9 -8,-1 2,2 6,4 8,1zM6466,4110c11,-13 -15,-11 -4,1l1,0zM6902,4044c1,-10 -28,-26 -15,-7 3,4 11,18 15,7zM6477,4042c2,-16 -7,-0 -2,4l1,-1zM6478,4016c10,-6 -4,-22 -4,-6 -2,4 -1,14 4,6zM6857,3999c-12,-17 -29,-30 -39,-49 0,22 22,35 35,49l2,0zM7408,3805c5,-13 28,10 26,-13 -0,-13 -11,-48 13,-40 13,12 6,53 34,41 62,-21 125,-41 179,-78 33,-26 52,-66 62,-106 7,-31 -0,-63 -7,-93 -13,-52 -45,-98 -87,-131 -32,-25 -68,-44 -107,-57 -39,-10 -79,-13 -119,-14 -18,15 8,36 7,54 13,41 21,84 23,127 3,16 -0,33 -0,48 -8,15 5,41 -10,49 -7,-5 -18,-30 -17,-8 -0,23 0,47 6,69 -6,21 10,42 4,63 3,21 12,44 3,65 -6,7 -25,22 -12,28 1,-1 2,-3 3,-5zM7427,3680c0,-25 24,20 -0,10 -1,-3 1,-6 0,-10zM7424,3615c-9,-15 20,-7 3,1l-2,-0zM7324,3698c-4,-17 -4,18 0,0zM6494,3625c28,-16 58,-28 90,-32 20,-10 41,-20 61,-31 31,-24 56,-57 69,-93 7,-48 -14,-101 -58,-125 -20,-14 -40,-30 -65,-31 -27,-1 -53,3 -80,4 -36,3 -72,7 -107,18 -21,9 -2,25 3,37 7,19 13,38 22,56 19,49 27,102 46,151 9,15 -4,43 16,48l2,-1zM7892,3598c8,-19 
-16,7 0,0zM7897,3413c5,-9 -11,-35 -5,-12 -2,9 5,38 5,15zM6905,3421c-2,-12 -9,9 0,0zM7889,3381c-6,-16 -3,11 0,0zM6878,3355c7,-9 -6,-19 -4,-4 -1,5 0,14 4,4zM5881,3353c-7,-8 -6,11 0,0zM6101,3352c20,-4 -6,-29 -6,-8 1,5 -3,19 6,8zM6866,3330c16,-3 -14,-10 -5,0 1,-0 4,-2 5,-0zM6861,3295c8,-12 -21,-25 -8,-5 1,2 4,11 8,5zM6861,3251c-3,-7 -3,6 0,0zM6829,3236c-8,-4 -2,11 0,0zM6821,3222c-0,-13 25,14 17,-3 -32,-34 -66,-67 -108,-88 -26,-8 -52,-14 -78,-22 -18,6 14,15 21,16 41,11 82,28 113,57 1,20 23,31 31,46 3,0 5,-3 4,-6zM7764,3211c-3,-14 -22,-4 -7,3 3,1 5,-1 7,-3zM6628,3106c-5,-6 -24,3 -8,1 3,0 6,2 8,-1zM6600,3104c-13,-10 -10,10 0,0zM6524,3098c-4,-4 -6,2 0,0zM3794,4322c-1,-17 43,-22 35,-4 -9,8 -23,3 -35,4zM4365,4322c-10,-13 26,-24 11,-6 -2,4 -6,9 -11,6zM5747,4316c-19,-19 31,-15 14,0 -4,2 -10,2 -14,-0zM5219,4314c-10,-9 4,-25 7,-7 2,5 -2,12 -7,7zM5208,4278c-12,-15 19,-6 5,2l-3,-1zM6448,4174c-3,-16 22,4 2,0zM5196,4163c-10,-18 25,-12 7,1 -2,0 -5,0 -7,-1zM5961,4160c-4,-18 18,8 0,0zM4552,4121c-9,-14 19,-9 7,0l-4,0zM4088,4107c-12,-14 22,-14 7,-1 -2,1 -5,2 -7,1zM5419,4083c-9,-24 22,-1 0,0zM3677,3891c6,-8 -5,-29 11,-21 14,10 5,28 -11,21zM3752,3864c-4,-20 21,6 4,2l-2,-1zM3145,3332c-2,-14 32,-19 15,-2 -4,3 -10,5 -15,2zM6797,3155c-17,1 -23,-33 -2,-18 7,2 28,31 2,18zM6763,3132c-12,-14 21,0 2,1z\" android:strokeWidth=\"1.33333337\"/>\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\r\n    <background android:drawable=\"@color/ic_launcher_background\"/>\r\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\"/>\r\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\"/>\r\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\r\n    <background android:drawable=\"@color/ic_launcher_background\"/>\r\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\"/>\r\n    <monochrome android:drawable=\"@drawable/ic_launcher_foreground\"/>\r\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/values/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\r\n<resources>\r\n    <color name=\"ic_launcher_background\">#0b62c2</color>\r\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">TTS Engine: Next-gen Kaldi</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/values/themes.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n\n    <style name=\"Theme.SherpaOnnxTtsEngine\" parent=\"android:Theme.Material.Light.NoActionBar\" />\n</resources>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older than API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/main/res/xml/tts_engine.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<tts-engine xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:settingsActivity=\"com.k2fsa.sherpa.onnx.tts.engine.MainActivity\"\n    >\n</tts-engine>"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/app/src/test/java/com/k2fsa/sherpa/onnx/tts/engine/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts.engine\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/build.gradle.kts",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id(\"com.android.application\") version \"8.2.0\" apply false\n    id(\"org.jetbrains.kotlin.android\") version \"1.9.0\" apply false\n}"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sun Dec 31 18:47:53 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxTtsEngine/settings.gradle.kts",
    "content": "pluginManagement {\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.name = \"SherpaOnnxTtsEngine\"\ninclude(\":app\")\n"
  },
  {
    "path": "android/SherpaOnnxVad/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxVad/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 33\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 33\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.6.1'\n    implementation 'com.google.android.material:material:1.9.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.5'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.1'\n}"
  },
  {
    "path": "android/SherpaOnnxVad/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxVad\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\"com.k2fsa.sherpa.onnx.vad.MainActivity\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/assets/.gitignore",
    "content": "*.onnx\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.vad\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.util.Log\nimport android.view.View\nimport android.widget.Button\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.R\nimport com.k2fsa.sherpa.onnx.Vad\nimport com.k2fsa.sherpa.onnx.getVadModelConfig\nimport kotlin.concurrent.thread\n\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : AppCompatActivity() {\n\n    private lateinit var recordButton: Button\n    private lateinit var circle: View\n\n    private lateinit var vad: Vad\n\n    private var audioRecord: AudioRecord? = null\n    private var recordingThread: Thread? = null\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio 
record is disallowed\")\n            finish()\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        Log.i(TAG, \"Start to initialize model\")\n        initVadModel()\n        Log.i(TAG, \"Finished initializing model\")\n\n        circle = findViewById(R.id.powerCircle)\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.setOnClickListener { onclick() }\n    }\n\n    private fun onclick() {\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n\n            vad.reset()\n            recordingThread = thread(true) {\n                processSamples()\n            }\n            Log.i(TAG, \"Started recording\")\n            onVad(false)\n\n        } else {\n            isRecording = false\n\n            audioRecord!!.stop()\n            audioRecord!!.release()\n            audioRecord = null\n\n            recordButton.setText(R.string.start)\n            onVad(false)\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private fun onVad(isSpeech: Boolean) {\n        if (isSpeech) {\n            circle.background = resources.getDrawable(R.drawable.red_circle)\n        } else {\n            circle.background = resources.getDrawable(R.drawable.black_circle)\n        }\n    }\n\n    private fun initVadModel() {\n        val type = 0\n        Log.i(TAG, \"Select VAD model type ${type}\")\n        val config = 
getVadModelConfig(type)\n\n        vad = Vad(\n            assetManager = application.assets,\n            config = config!!,\n        )\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, Manifest.permission.RECORD_AUDIO\n            ) != PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        val numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes * 1000.0f / (2 * sampleRateInHz)}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // twice the minimum size; getMinBufferSize already returns bytes\n        )\n        return true\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n\n        val bufferSize = 512 // in samples\n        val buffer = ShortArray(bufferSize)\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n\n                vad.acceptWaveform(samples)\n\n                val isSpeechDetected = vad.isSpeechDetected()\n                vad.clear()\n\n                runOnUiThread {\n                    onVad(isSpeechDetected)\n                }\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/jniLibs/.gitignore",
    "content": "*.so\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/jniLibs/arm64-v8a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/jniLibs/armeabi-v7a/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/jniLibs/x86/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/jniLibs/x86_64/.gitignore",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/drawable/black_circle.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<selector xmlns:android=\"http://schemas.android.com/apk/res/android\">\n  <item>\n    <shape  android:shape=\"oval\">\n\n    <solid  android:color=\"#FF000000\"/>\n\n    <size\n        android:width=\"300dp\"\n        android:height=\"300dp\"/>\n    </shape>\n  </item>\n</selector>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/drawable/red_circle.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<selector xmlns:android=\"http://schemas.android.com/apk/res/android\">\n  <item>\n    <shape  android:shape=\"oval\">\n\n    <solid  android:color=\"#FFFF0000\"/>\n\n    <size\n        android:width=\"300dp\"\n        android:height=\"300dp\"/>\n    </shape>\n  </item>\n</selector>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\"com.k2fsa.sherpa.onnx.vad.MainActivity\">\n    <LinearLayout\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"match_parent\"\n        android:gravity=\"bottom\"\n        android:orientation=\"vertical\"\n        >\n\n        <Space\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"10dp\" />\n\n        <LinearLayout\n            android:id=\"@+id/powerCircle\"\n            android:layout_width=\"wrap_content\"\n            android:layout_height=\"wrap_content\"\n            android:layout_gravity=\"center_horizontal\"\n            android:background=\"@drawable/black_circle\"\n            android:orientation=\"vertical\" />\n\n        <Space\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"200dp\" />\n\n        <Button\n            android:id=\"@+id/record_button\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"wrap_content\"\n            android:text=\"@string/start\" />\n\n\n\n    </LinearLayout>\n\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>\n"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">VAD: Next-gen Kaldi</string>\n\n    <string name=\"hint\">Click the Start button to play Silero VAD with Next-gen Kaldi.</string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxVad\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxVad\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxVad/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxVad/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnxVad/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sat Sep 23 10:24:21 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxVad/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxVad/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxVad/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxVad/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnxVad\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 33\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 33\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.6.1'\n    implementation 'com.google.android.material:material:1.9.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    implementation 'androidx.lifecycle:lifecycle-runtime-ktx:2.5.1'\n    \n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.5'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.1'\n}\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnxVadAsr\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".vad.asr.MainActivity\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/assets/.gitignore",
    "content": "*.onnx\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.vad.asr\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.text.method.ScrollingMovementMethod\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.TextView\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport com.k2fsa.sherpa.onnx.OfflineRecognizer\nimport com.k2fsa.sherpa.onnx.OfflineRecognizerConfig\nimport com.k2fsa.sherpa.onnx.R\nimport com.k2fsa.sherpa.onnx.Vad\nimport com.k2fsa.sherpa.onnx.getFeatureConfig\nimport com.k2fsa.sherpa.onnx.getOfflineModelConfig\nimport com.k2fsa.sherpa.onnx.getVadModelConfig\nimport kotlinx.coroutines.CoroutineScope\nimport kotlinx.coroutines.Dispatchers\nimport kotlinx.coroutines.cancel\nimport kotlinx.coroutines.launch\nimport kotlinx.coroutines.withContext\nimport kotlin.concurrent.thread\nimport androidx.lifecycle.lifecycleScope\n\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : AppCompatActivity() {\n\n    private lateinit var recordButton: Button\n    private lateinit var textView: TextView\n\n    private lateinit var vad: Vad\n\n    private var audioRecord: AudioRecord? = null\n    private var recordingThread: Thread? 
= null\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    // Non-streaming ASR\n    private lateinit var offlineRecognizer: OfflineRecognizer\n\n    private var idx: Int = 0\n    private var lastText: String = \"\"\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            // grantResults is empty if the permission request was cancelled\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            finish()\n            return\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) {\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        textView = findViewById(R.id.my_text)\n        textView.movementMethod = ScrollingMovementMethod()\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.isEnabled = false\n        recordButton.setOnClickListener { onclick() }\n\n        textView.text = \"Initializing models... 
Please wait.\"\n\n        lifecycleScope.launch(Dispatchers.IO) {\n            Log.i(TAG, \"Start to initialize model\")\n            initVadModel()\n            Log.i(TAG, \"Finished initializing model\")\n\n            Log.i(TAG, \"Start to initialize non-streaming recognizer\")\n            initOfflineRecognizer()\n            Log.i(TAG, \"Finished initializing non-streaming recognizer\")\n\n            withContext(Dispatchers.Main) {\n                recordButton.isEnabled = true\n                textView.text = \"\" \n                Log.i(TAG, \"Model initialization completed, button enabled\")\n            }\n        }\n    }\n\n    private fun onclick() {\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n\n            textView.text = \"\"\n            lastText = \"\"\n            idx = 0\n\n            vad.reset()\n            recordingThread = thread(true) {\n                processSamples()\n            }\n            Log.i(TAG, \"Started recording\")\n        } else {\n            isRecording = false\n\n            audioRecord!!.stop()\n            audioRecord!!.release()\n            audioRecord = null\n\n            recordButton.setText(R.string.start)\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private  fun initVadModel() {\n        val type = 0\n        Log.i(TAG, \"Select VAD model type ${type}\")\n        val config = getVadModelConfig(type)\n\n        vad = Vad(\n            assetManager = application.assets,\n            config = config!!,\n        )\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, 
Manifest.permission.RECORD_AUDIO\n            ) != PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        // getMinBufferSize() returns a size in bytes; each 16-bit PCM sample takes two bytes\n        val numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes / 2.0f * 1000.0f / sampleRateInHz}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // request twice the minimum size (in bytes) as headroom\n        )\n        return true\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n\n        val bufferSize = 512 // in samples\n        val buffer = ShortArray(bufferSize)\n        val coroutineScope = CoroutineScope(Dispatchers.IO)\n\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                // Convert 16-bit PCM samples to floats in [-1, 1)\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n\n                vad.acceptWaveform(samples)\n                while (!vad.empty()) {\n                    val segment = vad.front()\n                    coroutineScope.launch {\n                        val text = runSecondPass(segment.samples)\n                        if (text.isNotBlank()) {\n                            withContext(Dispatchers.Main) {\n                                lastText = \"${lastText}\\n${idx}: ${text}\"\n                                idx += 1\n                                textView.text = lastText.lowercase()\n                            }\n                        }\n                    }\n\n                    vad.pop()\n                }\n            }\n        }\n\n        // Clean up the coroutine scope when done\n        coroutineScope.cancel()\n    }\n\n    private fun 
initOfflineRecognizer() {\n        // Please change getOfflineModelConfig() to add new models\n        // See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // for a list of available models\n        val asrModelType = 0\n        val asrRuleFsts: String?\n        asrRuleFsts = null\n        Log.i(TAG, \"Select model type ${asrModelType} for ASR\")\n\n        val config = OfflineRecognizerConfig(\n            featConfig = getFeatureConfig(sampleRate = sampleRateInHz, featureDim = 80),\n            modelConfig = getOfflineModelConfig(type = asrModelType)!!,\n        )\n        if (asrRuleFsts != null) {\n            config.ruleFsts = asrRuleFsts;\n        }\n\n        offlineRecognizer = OfflineRecognizer(\n            assetManager = application.assets,\n            config = config,\n        )\n    }\n\n    private fun runSecondPass(samples: FloatArray): String {\n        val stream = offlineRecognizer.createStream()\n        stream.acceptWaveform(samples, sampleRateInHz)\n        offlineRecognizer.decode(stream)\n        val result = offlineRecognizer.getResult(stream)\n        stream.release()\n        return result.text\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".vad.asr.MainActivity\">\n\n    <LinearLayout\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"match_parent\"\n        android:gravity=\"center\"\n        android:orientation=\"vertical\">\n\n        <TextView\n            android:id=\"@+id/my_text\"\n            android:layout_width=\"match_parent\"\n            android:layout_height=\"match_parent\"\n            android:layout_weight=\"2.5\"\n            android:padding=\"24dp\"\n            android:scrollbars=\"vertical\"\n            android:singleLine=\"false\"\n            android:text=\"@string/hint\"\n            app:layout_constraintBottom_toBottomOf=\"parent\"\n            app:layout_constraintEnd_toEndOf=\"parent\"\n            app:layout_constraintStart_toStartOf=\"parent\"\n            android:gravity=\"bottom\"\n            app:layout_constraintTop_toTopOf=\"parent\" />\n\n        <Button\n            android:id=\"@+id/record_button\"\n            android:layout_width=\"wrap_content\"\n            android:layout_height=\"wrap_content\"\n            android:layout_weight=\"0.5\"\n            android:text=\"@string/start\" />\n    </LinearLayout>\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">VAD+ASR: Next-gen Kaldi</string>\n    <string name=\"hint\">Click the Start button to play speech-to-text with Next-gen Kaldi.\n        \\n\n        \\n\\n\\n\n        The source code and pre-trained models are publicly available.\n        Please see https://github.com/k2-fsa/sherpa-onnx for details.\n        \\n\\n\n        Speech recognition with Next-gen Kaldi using VAD and non-streaming ASR models.\n    </string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxVadAsr\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnxVadAsr\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxVadAsr/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxVadAsr/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnxVadAsr/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Sat Sep 23 20:50:52 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxVadAsr/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/gradlew.bat",
    "content": "@rem\r\n@rem Copyright 2015 the original author or authors.\r\n@rem\r\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\r\n@rem you may not use this file except in compliance with the License.\r\n@rem You may obtain a copy of the License at\r\n@rem\r\n@rem      https://www.apache.org/licenses/LICENSE-2.0\r\n@rem\r\n@rem Unless required by applicable law or agreed to in writing, software\r\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\r\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\r\n@rem See the License for the specific language governing permissions and\r\n@rem limitations under the License.\r\n@rem\r\n\r\n@if \"%DEBUG%\" == \"\" @echo off\r\n@rem ##########################################################################\r\n@rem\r\n@rem  Gradle startup script for Windows\r\n@rem\r\n@rem ##########################################################################\r\n\r\n@rem Set local scope for the variables with windows NT shell\r\nif \"%OS%\"==\"Windows_NT\" setlocal\r\n\r\nset DIRNAME=%~dp0\r\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\r\nset APP_BASE_NAME=%~n0\r\nset APP_HOME=%DIRNAME%\r\n\r\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\r\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\r\n\r\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\r\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\r\n\r\n@rem Find java.exe\r\nif defined JAVA_HOME goto findJavaFromJavaHome\r\n\r\nset JAVA_EXE=java.exe\r\n%JAVA_EXE% -version >NUL 2>&1\r\nif \"%ERRORLEVEL%\" == \"0\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:findJavaFromJavaHome\r\nset JAVA_HOME=%JAVA_HOME:\"=%\r\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\r\n\r\nif exist \"%JAVA_EXE%\" goto execute\r\n\r\necho.\r\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\r\necho.\r\necho Please set the JAVA_HOME variable in your environment to match the\r\necho location of your Java installation.\r\n\r\ngoto fail\r\n\r\n:execute\r\n@rem Setup the command line\r\n\r\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\r\n\r\n\r\n@rem Execute Gradle\r\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\r\n\r\n:end\r\n@rem End local scope for the variables with windows NT shell\r\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\r\n\r\n:fail\r\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\r\nrem the _cmd.exe /c_ return code!\r\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\r\nexit /b 1\r\n\r\n:mainEnd\r\nif \"%OS%\"==\"Windows_NT\" endlocal\r\n\r\n:omega\r\n"
  },
  {
    "path": "android/SherpaOnnxVadAsr/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnxVadAsr\"\ninclude ':app'\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/caches\n/.idea/libraries\n/.idea/modules.xml\n/.idea/workspace.xml\n/.idea/navEditor.xml\n/.idea/assetWizardSettings.xml\n.DS_Store\n/build\n/captures\n.externalNativeBuild\n.cxx\nlocal.properties\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/.gitignore",
    "content": "/build"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/build.gradle",
    "content": "plugins {\n    id 'com.android.application'\n    id 'org.jetbrains.kotlin.android'\n}\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n    compileSdk 32\n\n    defaultConfig {\n        applicationId \"com.k2fsa.sherpa.onnx\"\n        minSdk 21\n        targetSdk 32\n        versionCode 20260320\n        versionName \"1.12.31\"\n\n        testInstrumentationRunner \"androidx.test.runner.AndroidJUnitRunner\"\n    }\n\n    buildTypes {\n        release {\n            minifyEnabled false\n            proguardFiles getDefaultProguardFile('proguard-android-optimize.txt'), 'proguard-rules.pro'\n        }\n    }\n    compileOptions {\n        sourceCompatibility JavaVersion.VERSION_1_8\n        targetCompatibility JavaVersion.VERSION_1_8\n    }\n    kotlinOptions {\n        jvmTarget = '1.8'\n    }\n}\n\ndependencies {\n\n    implementation 'androidx.core:core-ktx:1.7.0'\n    implementation 'androidx.appcompat:appcompat:1.5.1'\n    implementation 'com.google.android.material:material:1.7.0'\n    implementation 'androidx.constraintlayout:constraintlayout:2.1.4'\n    testImplementation 'junit:junit:4.13.2'\n    androidTestImplementation 'androidx.test.ext:junit:1.1.4'\n    androidTestImplementation 'androidx.test.espresso:espresso-core:3.5.0'\n\n    implementation 'org.java-websocket:Java-WebSocket:1.4.0'\n    implementation 'com.google.code.gson:gson:2.10.1'\n}"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/proguard-rules.pro",
    "content": "# Add project specific ProGuard rules here.\n# You can control the set of applied configuration files using the\n# proguardFiles setting in build.gradle.\n#\n# For more details, see\n#   http://developer.android.com/guide/developing/tools/proguard.html\n\n# If your project uses WebView with JS, uncomment the following\n# and specify the fully qualified class name to the JavaScript interface\n# class:\n#-keepclassmembers class fqcn.of.javascript.interface.for.webview {\n#   public *;\n#}\n\n# Uncomment this to preserve the line number information for\n# debugging stack traces.\n#-keepattributes SourceFile,LineNumberTable\n\n# If you keep the line number information, uncomment this to\n# hide the original source file name.\n#-renamesourcefileattribute SourceFile"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/androidTest/java/com/k2fsa/sherpa/onnx/ExampleInstrumentedTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport androidx.test.platform.app.InstrumentationRegistry\nimport androidx.test.ext.junit.runners.AndroidJUnit4\n\nimport org.junit.Test\nimport org.junit.runner.RunWith\n\nimport org.junit.Assert.*\n\n/**\n * Instrumented test, which will execute on an Android device.\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\n@RunWith(AndroidJUnit4::class)\nclass ExampleInstrumentedTest {\n    @Test\n    fun useAppContext() {\n        // Context of the app under test.\n        val appContext = InstrumentationRegistry.getInstrumentation().targetContext\n        assertEquals(\"com.k2fsa.sherpa.onnx\", appContext.packageName)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/AndroidManifest.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:tools=\"http://schemas.android.com/tools\">\n\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n    <uses-permission android:name=\"android.permission.INTERNET\"/>\n\n    <application\n        android:allowBackup=\"true\"\n        android:dataExtractionRules=\"@xml/data_extraction_rules\"\n        android:fullBackupContent=\"@xml/backup_rules\"\n        android:icon=\"@mipmap/ic_launcher\"\n        android:label=\"@string/app_name\"\n        android:roundIcon=\"@mipmap/ic_launcher_round\"\n        android:supportsRtl=\"true\"\n        android:theme=\"@style/Theme.SherpaOnnx\"\n        tools:targetApi=\"31\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\">\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\" />\n\n                <category android:name=\"android.intent.category.LAUNCHER\" />\n            </intent-filter>\n\n            <meta-data\n                android:name=\"android.app.lib_name\"\n                android:value=\"\" />\n        </activity>\n    </application>\n\n</manifest>\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/java/com/k2fsa/sherpa/onnx/MainActivity.kt",
    "content": "// add by longsm at 2023/10/13\npackage com.k2fsa.sherpa.onnx\n\nimport android.Manifest\nimport android.content.pm.PackageManager\nimport android.media.AudioFormat\nimport android.media.AudioRecord\nimport android.media.MediaRecorder\nimport android.os.Bundle\nimport android.text.TextUtils\nimport android.text.method.ScrollingMovementMethod\nimport android.util.Log\nimport android.widget.Button\nimport android.widget.EditText\nimport android.widget.TextView\nimport androidx.appcompat.app.AppCompatActivity\nimport androidx.core.app.ActivityCompat\nimport com.google.gson.Gson\nimport com.google.gson.reflect.TypeToken\nimport org.java_websocket.handshake.ServerHandshake\nimport java.net.URI\nimport java.net.URISyntaxException\nimport java.nio.ByteBuffer\nimport java.nio.ByteOrder\nimport kotlin.concurrent.thread\n\nprivate const val TAG = \"sherpa-onnx\"\nprivate const val REQUEST_RECORD_AUDIO_PERMISSION = 200\n\nclass MainActivity : AppCompatActivity(), MyWebsocketClient.WebsocketClientCallback {\n    private val permissions: Array<String> = arrayOf(Manifest.permission.RECORD_AUDIO)\n\n    private var audioRecord: AudioRecord? = null\n    private lateinit var recordButton: Button\n    private lateinit var connectButton: Button\n    private lateinit var textView: TextView\n    private lateinit var etUrl: EditText\n    private var recordingThread: Thread? = null\n\n    private var websocketClient: MyWebsocketClient? 
= null\n\n    private val audioSource = MediaRecorder.AudioSource.MIC\n    private val sampleRateInHz = 16000\n    private val channelConfig = AudioFormat.CHANNEL_IN_MONO\n\n    // Note: We don't use AudioFormat.ENCODING_PCM_FLOAT\n    // since the AudioRecord.read(float[]) needs API level >= 23\n    // but we are targeting API level >= 21\n    private val audioFormat = AudioFormat.ENCODING_PCM_16BIT\n    private var idx: Long = 0\n    private var lastText: String = \"\"\n\n    @Volatile\n    private var isRecording: Boolean = false\n\n    @Volatile\n    private var isConnected: Boolean = false\n\n    override fun onRequestPermissionsResult(\n        requestCode: Int, permissions: Array<String>, grantResults: IntArray\n    ) {\n        super.onRequestPermissionsResult(requestCode, permissions, grantResults)\n        // grantResults is empty if the permission request was cancelled\n        val permissionToRecordAccepted = if (requestCode == REQUEST_RECORD_AUDIO_PERMISSION) {\n            grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED\n        } else {\n            false\n        }\n\n        if (!permissionToRecordAccepted) {\n            Log.e(TAG, \"Audio record is disallowed\")\n            finish()\n            return // do not fall through to the success log below\n        }\n\n        Log.i(TAG, \"Audio record is permitted\")\n    }\n\n    override fun onCreate(savedInstanceState: Bundle?) 
{\n        super.onCreate(savedInstanceState)\n        setContentView(R.layout.activity_main)\n\n        ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n\n        recordButton = findViewById(R.id.record_button)\n        recordButton.setOnClickListener { onclick() }\n\n        connectButton = findViewById(R.id.connect_button)\n        connectButton.setOnClickListener { onclickConnect() }\n\n        textView = findViewById(R.id.my_text)\n        textView.movementMethod = ScrollingMovementMethod()\n\n        recordButton.isEnabled = false\n\n        etUrl = findViewById(R.id.et_uri)\n    }\n\n    private fun onclickConnect() {\n        if (!isConnected) {\n            val etUrlStr = etUrl.text.toString().trim()\n            var uriStr = \"ws://172.28.13.167:6006\"\n            if (!TextUtils.isEmpty(etUrlStr)) {\n                uriStr = etUrlStr\n            }\n            try {\n                val uri = URI(uriStr)\n                websocketClient = MyWebsocketClient(uri)\n                websocketClient?.setClientCallback(this)\n                websocketClient?.connect()\n            } catch (e: URISyntaxException) {\n                Log.e(TAG, \"URISyntaxException === >> $e\")\n            }\n        } else {\n            Log.e(TAG, \"onclick disconnect\")\n            websocketClient?.close()\n            websocketClient = null\n        }\n\n    }\n\n    private fun onclick() {\n\n        if (!isRecording) {\n            val ret = initMicrophone()\n            if (!ret) {\n                Log.e(TAG, \"Failed to initialize microphone\")\n                return\n            }\n            Log.i(TAG, \"state: ${audioRecord?.state}\")\n            audioRecord!!.startRecording()\n            recordButton.setText(R.string.stop)\n            isRecording = true\n            textView.text = \"\"\n            lastText = \"\"\n            idx = 0\n\n            recordingThread = thread(true) {\n                processSamples()\n       
     }\n            connectButton.isEnabled = false\n            Log.i(TAG, \"Started recording\")\n        } else {\n            isRecording = false\n            audioRecord!!.stop()\n            audioRecord!!.release()\n            audioRecord = null\n            recordButton.setText(R.string.start)\n            connectButton.isEnabled = true\n            Log.i(TAG, \"Stopped recording\")\n        }\n    }\n\n    private fun processSamples() {\n        Log.i(TAG, \"processing samples\")\n\n        val interval = 0.1 // i.e., 100 ms\n        val bufferSize = (interval * sampleRateInHz).toInt() // in samples\n        val buffer = ShortArray(bufferSize)\n\n        while (isRecording) {\n            val ret = audioRecord?.read(buffer, 0, buffer.size)\n            if (ret != null && ret > 0) {\n                val samples = FloatArray(ret) { buffer[it] / 32768.0f }\n\n                val buffer = ByteBuffer.allocate(4 * samples.size)\n                    .order(ByteOrder.LITTLE_ENDIAN) // float is sizeof 4. 
allocate enough buffer\n\n\n                for (f in samples) {\n                    buffer.putFloat(f)\n                }\n\n                if (isConnected) {\n                    // array() returns the whole backing array, independent of position/limit\n                    websocketClient?.send(buffer.array()) // send buf to server\n                }\n\n            }\n        }\n    }\n\n    private fun initMicrophone(): Boolean {\n        if (ActivityCompat.checkSelfPermission(\n                this, Manifest.permission.RECORD_AUDIO\n            ) != PackageManager.PERMISSION_GRANTED\n        ) {\n            ActivityCompat.requestPermissions(this, permissions, REQUEST_RECORD_AUDIO_PERMISSION)\n            return false\n        }\n\n        // getMinBufferSize() returns a size in bytes; 16-bit mono PCM uses 2 bytes per sample\n        val numBytes = AudioRecord.getMinBufferSize(sampleRateInHz, channelConfig, audioFormat)\n        Log.i(\n            TAG, \"buffer size in milliseconds: ${numBytes * 1000.0f / (2 * sampleRateInHz)}\"\n        )\n\n        audioRecord = AudioRecord(\n            audioSource,\n            sampleRateInHz,\n            channelConfig,\n            audioFormat,\n            numBytes * 2 // request twice the minimum size (in bytes) as headroom\n        )\n        return true\n    }\n\n    override fun onOpen(handshakedata: ServerHandshake?) 
{\n        Log.i(TAG, \"onOpen === >>\")\n        isConnected = true\n        runOnUiThread {\n            recordButton.isEnabled = true\n            connectButton.text = getString(R.string.disconnect)\n        }\n    }\n\n    private val gson = Gson()\n    private val recognitionText = hashMapOf<Long, String>()\n\n    private fun getDisplayResult(): String {\n        var i = 0\n        var ans = \"\"\n        for ((key,value) in recognitionText){\n            if (value == \"\"){\n                continue\n            }\n            ans += \" $i : ${recognitionText[key]}\\n\"\n            i += 1\n        }\n        return ans\n\n    }\n\n    override fun onMessage(message: String?) {\n        Log.i(TAG, \"onMessage === >> $message\")\n        if (message == null) {\n            // Nothing to parse; gson.fromJson() would return null here\n            return\n        }\n        val speechContent = gson.fromJson<SpeechContent>(\n            message,\n            object : TypeToken<SpeechContent?>() {}.type\n        )\n\n        val text = speechContent.text\n        val segment = speechContent.segment\n        Log.i(TAG, \"text === >> $text\")\n\n        recognitionText[segment] = text\n        runOnUiThread {\n            textView.text = getDisplayResult()\n        }\n    }\n\n    override fun onClose(code: Int, reason: String?, remote: Boolean?) {\n        Log.i(TAG, \"onClose === >> code$code reason$reason remote$remote\")\n        isConnected = false\n        runOnUiThread {\n            recordButton.isEnabled = false\n            connectButton.text = getString(R.string.connect)\n            textView.text = getString(R.string.hint)\n        }\n\n    }\n\n    override fun onError(ex: Exception?) {\n        Log.e(TAG, \"onError === >> $ex\")\n        runOnUiThread {\n            textView.text = \"onError === >> $ex\"\n        }\n\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/java/com/k2fsa/sherpa/onnx/MyWebsocketClient.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.java_websocket.client.WebSocketClient\nimport org.java_websocket.handshake.ServerHandshake\nimport java.net.URI\n\nclass MyWebsocketClient(serverUri: URI?) : WebSocketClient(serverUri) {\n\n    override fun onOpen(handshakedata: ServerHandshake) {\n        clientCallback?.onOpen(handshakedata)\n\n    }\n    override fun onMessage(message: String) {\n        clientCallback?.onMessage(message)\n    }\n\n    override fun onClose(code: Int, reason: String, remote: Boolean) {\n        clientCallback?.onClose(code,reason,remote)\n    }\n\n    override fun onError(ex: Exception) {\n        clientCallback?.onError(ex)\n    }\n\n    private var clientCallback: WebsocketClientCallback? = null\n\n    fun setClientCallback(clientCallback: WebsocketClientCallback?) {\n        this.clientCallback = clientCallback\n    }\n\n    interface WebsocketClientCallback {\n        fun onOpen(handshakedata: ServerHandshake?)\n        fun onMessage(message: String?)\n        fun onClose(code: Int, reason: String?, remote: Boolean?)\n        fun onError(ex: Exception?)\n    }\n\n\n}"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/java/com/k2fsa/sherpa/onnx/SpeechContent.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\ndata class SpeechContent(val text:String,val segment:Long)\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/java/com/k2fsa/sherpa/onnx/WaveReader.kt",
    "content": "// Copyright (c)  2023  Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\nclass WaveReader {\n    companion object {\n        // Read a mono wave file asset\n        // The returned array has two entries:\n        //  - the first entry contains an 1-D float array\n        //  - the second entry is the sample rate\n        external fun readWaveFromAsset(\n            assetManager: AssetManager,\n            filename: String,\n        ): Array<Any>\n\n        // Read a mono wave file from disk\n        // The returned array has two entries:\n        //  - the first entry contains an 1-D float array\n        //  - the second entry is the sample rate\n        external fun readWaveFromFile(\n            filename: String,\n        ): Array<Any>\n\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/jniLibs/.gitignore",
    "content": "*.so\n*.txt\n*.onnx\n*.wav\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/drawable/ic_launcher_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path\n        android:fillColor=\"#3DDC84\"\n        android:pathData=\"M0,0h108v108h-108z\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M9,0L9,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,0L19,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,0L29,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,0L39,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,0L49,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,0L59,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,0L69,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,0L79,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M89,0L89,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        
android:fillColor=\"#00000000\"\n        android:pathData=\"M99,0L99,108\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,9L108,9\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,19L108,19\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,29L108,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,39L108,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,49L108,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,59L108,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,69L108,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,79L108,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,89L108,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M0,99L108,99\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        
android:pathData=\"M19,29L89,29\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,39L89,39\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,49L89,49\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,59L89,59\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,69L89,69\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M19,79L89,79\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M29,19L29,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M39,19L39,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M49,19L49,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M59,19L59,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M69,19L69,89\"\n        android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n    <path\n        android:fillColor=\"#00000000\"\n        android:pathData=\"M79,19L79,89\"\n       
 android:strokeWidth=\"0.8\"\n        android:strokeColor=\"#33FFFFFF\" />\n</vector>\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/drawable-v24/ic_launcher_foreground.xml",
    "content": "<vector xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:aapt=\"http://schemas.android.com/aapt\"\n    android:width=\"108dp\"\n    android:height=\"108dp\"\n    android:viewportWidth=\"108\"\n    android:viewportHeight=\"108\">\n    <path android:pathData=\"M31,63.928c0,0 6.4,-11 12.1,-13.1c7.2,-2.6 26,-1.4 26,-1.4l38.1,38.1L107,108.928l-32,-1L31,63.928z\">\n        <aapt:attr name=\"android:fillColor\">\n            <gradient\n                android:endX=\"85.84757\"\n                android:endY=\"92.4963\"\n                android:startX=\"42.9492\"\n                android:startY=\"49.59793\"\n                android:type=\"linear\">\n                <item\n                    android:color=\"#44000000\"\n                    android:offset=\"0.0\" />\n                <item\n                    android:color=\"#00000000\"\n                    android:offset=\"1.0\" />\n            </gradient>\n        </aapt:attr>\n    </path>\n    <path\n        android:fillColor=\"#FFFFFF\"\n        android:fillType=\"nonZero\"\n        android:pathData=\"M65.3,45.828l3.8,-6.6c0.2,-0.4 0.1,-0.9 -0.3,-1.1c-0.4,-0.2 -0.9,-0.1 -1.1,0.3l-3.9,6.7c-6.3,-2.8 -13.4,-2.8 -19.7,0l-3.9,-6.7c-0.2,-0.4 -0.7,-0.5 -1.1,-0.3C38.8,38.328 38.7,38.828 38.9,39.228l3.8,6.6C36.2,49.428 31.7,56.028 31,63.928h46C76.3,56.028 71.8,49.428 65.3,45.828zM43.4,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2c-0.3,-0.7 -0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C45.3,56.528 44.5,57.328 43.4,57.328L43.4,57.328zM64.6,57.328c-0.8,0 -1.5,-0.5 -1.8,-1.2s-0.1,-1.5 0.4,-2.1c0.5,-0.5 1.4,-0.7 2.1,-0.4c0.7,0.3 1.2,1 1.2,1.8C66.5,56.528 65.6,57.328 64.6,57.328L64.6,57.328z\"\n        android:strokeWidth=\"1\"\n        android:strokeColor=\"#00000000\" />\n</vector>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/layout/activity_main.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<androidx.constraintlayout.widget.ConstraintLayout xmlns:android=\"http://schemas.android.com/apk/res/android\"\n    xmlns:app=\"http://schemas.android.com/apk/res-auto\"\n    xmlns:tools=\"http://schemas.android.com/tools\"\n    android:layout_width=\"match_parent\"\n    android:layout_height=\"match_parent\"\n    tools:context=\".MainActivity\">\n\n    <TextView\n        android:id=\"@+id/text_hint\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"wrap_content\"\n        android:text=\"@string/uri_format\"\n        android:gravity=\"center\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toTopOf=\"parent\" />\n\n    <EditText\n        android:id=\"@+id/et_uri\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"56dp\"\n        android:layout_marginTop=\"4dp\"\n        android:hint=\"@string/uri_hint\"\n        android:gravity=\"center\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/text_hint\" />\n    <Button\n        android:id=\"@+id/connect_button\"\n        android:layout_width=\"wrap_content\"\n        android:layout_height=\"wrap_content\"\n        android:layout_marginTop=\"4dp\"\n        android:textAllCaps=\"false\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/et_uri\"\n        android:text=\"@string/connect\" />\n    <Button\n        android:id=\"@+id/record_button\"\n        android:layout_width=\"wrap_content\"\n        android:layout_height=\"wrap_content\"\n        android:layout_marginTop=\"4dp\"\n        android:textAllCaps=\"false\"\n        app:layout_constraintLeft_toLeftOf=\"parent\"\n        
app:layout_constraintRight_toRightOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/connect_button\"\n        android:text=\"@string/start\" />\n\n    <TextView\n        android:id=\"@+id/my_text\"\n        android:layout_width=\"match_parent\"\n        android:layout_height=\"0dp\"\n        android:padding=\"24dp\"\n        android:scrollbars=\"vertical\"\n        android:singleLine=\"false\"\n        android:text=\"@string/hint\"\n        app:layout_constraintBottom_toBottomOf=\"parent\"\n        app:layout_constraintEnd_toEndOf=\"parent\"\n        app:layout_constraintStart_toStartOf=\"parent\"\n        app:layout_constraintTop_toBottomOf=\"@id/record_button\" />\n\n\n\n\n</androidx.constraintlayout.widget.ConstraintLayout>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/mipmap-anydpi-v26/ic_launcher.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/mipmap-anydpi-v26/ic_launcher_round.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<adaptive-icon xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <background android:drawable=\"@drawable/ic_launcher_background\" />\n    <foreground android:drawable=\"@drawable/ic_launcher_foreground\" />\n</adaptive-icon>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/values/colors.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <color name=\"purple_200\">#FFBB86FC</color>\n    <color name=\"purple_500\">#FF6200EE</color>\n    <color name=\"purple_700\">#FF3700B3</color>\n    <color name=\"teal_200\">#FF03DAC5</color>\n    <color name=\"teal_700\">#FF018786</color>\n    <color name=\"black\">#FF000000</color>\n    <color name=\"white\">#FFFFFFFF</color>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/values/strings.xml",
    "content": "<resources>\n    <string name=\"app_name\">ASR with Next-gen Kaldi</string>\n    <string name=\"hint\">\n        Click the connect button to connect websocket.\n        \\n\n        \\n\\n\\n\n        Click the Start button to play speech-to-text with Next-gen Kaldi.\n        \\n\n        \\n\\n\\n\n        The source code and pre-trained models are publicly available.\n        Please see https://github.com/k2-fsa/sherpa-onnx for details.\n    </string>\n    <string name=\"start\">Start</string>\n    <string name=\"stop\">Stop</string>\n    <string name=\"connect\">connect</string>\n    <string name=\"disconnect\">disconnect</string>\n    <string name=\"uri_format\">please input uri first,format as follows:\\n\n        ws://ip:port or wss://ip:port</string>\n    <string name=\"uri_hint\">please input uri first</string>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/values/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_500</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/white</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_700</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/values-night/themes.xml",
    "content": "<resources xmlns:tools=\"http://schemas.android.com/tools\">\n    <!-- Base application theme. -->\n    <style name=\"Theme.SherpaOnnx\" parent=\"Theme.MaterialComponents.DayNight.DarkActionBar\">\n        <!-- Primary brand color. -->\n        <item name=\"colorPrimary\">@color/purple_200</item>\n        <item name=\"colorPrimaryVariant\">@color/purple_700</item>\n        <item name=\"colorOnPrimary\">@color/black</item>\n        <!-- Secondary brand color. -->\n        <item name=\"colorSecondary\">@color/teal_200</item>\n        <item name=\"colorSecondaryVariant\">@color/teal_200</item>\n        <item name=\"colorOnSecondary\">@color/black</item>\n        <!-- Status bar color. -->\n        <item name=\"android:statusBarColor\">?attr/colorPrimaryVariant</item>\n        <!-- Customize your theme here. -->\n    </style>\n</resources>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/xml/backup_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample backup rules file; uncomment and customize as necessary.\n   See https://developer.android.com/guide/topics/data/autobackup\n   for details.\n   Note: This file is ignored for devices older that API 31\n   See https://developer.android.com/about/versions/12/backup-restore\n-->\n<full-backup-content>\n    <!--\n   <include domain=\"sharedpref\" path=\".\"/>\n   <exclude domain=\"sharedpref\" path=\"device.xml\"/>\n-->\n</full-backup-content>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/main/res/xml/data_extraction_rules.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?><!--\n   Sample data extraction rules file; uncomment and customize as necessary.\n   See https://developer.android.com/about/versions/12/backup-restore#xml-changes\n   for details.\n-->\n<data-extraction-rules>\n    <cloud-backup>\n        <!-- TODO: Use <include> and <exclude> to control what is backed up.\n        <include .../>\n        <exclude .../>\n        -->\n    </cloud-backup>\n    <!--\n    <device-transfer>\n        <include .../>\n        <exclude .../>\n    </device-transfer>\n    -->\n</data-extraction-rules>"
  },
  {
    "path": "android/SherpaOnnxWebSocket/app/src/test/java/com/k2fsa/sherpa/onnx/ExampleUnitTest.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport org.junit.Test\n\nimport org.junit.Assert.*\n\n/**\n * Example local unit test, which will execute on the development machine (host).\n *\n * See [testing documentation](http://d.android.com/tools/testing).\n */\nclass ExampleUnitTest {\n    @Test\n    fun addition_isCorrect() {\n        assertEquals(4, 2 + 2)\n    }\n}"
  },
  {
    "path": "android/SherpaOnnxWebSocket/build.gradle",
    "content": "// Top-level build file where you can add configuration options common to all sub-projects/modules.\nplugins {\n    id 'com.android.application' version '7.3.1' apply false\n    id 'com.android.library' version '7.3.1' apply false\n    id 'org.jetbrains.kotlin.android' version '1.7.20' apply false\n}"
  },
  {
    "path": "android/SherpaOnnxWebSocket/gradle/wrapper/gradle-wrapper.properties",
    "content": "#Thu Feb 23 11:09:06 CST 2023\ndistributionBase=GRADLE_USER_HOME\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.2-bin.zip\ndistributionPath=wrapper/dists\nzipStorePath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/gradle.properties",
    "content": "# Project-wide Gradle settings.\n# IDE (e.g. Android Studio) users:\n# Gradle settings configured through the IDE *will override*\n# any settings specified in this file.\n# For more details on how to configure your build environment visit\n# http://www.gradle.org/docs/current/userguide/build_environment.html\n# Specifies the JVM arguments used for the daemon process.\n# The setting is particularly useful for tweaking memory settings.\norg.gradle.jvmargs=-Xmx2048m -Dfile.encoding=UTF-8\n# When configured, Gradle will run in incubating parallel mode.\n# This option should only be used with decoupled projects. More details, visit\n# http://www.gradle.org/docs/current/userguide/multi_project_builds.html#sec:decoupled_projects\n# org.gradle.parallel=true\n# AndroidX package structure to make it clearer which packages are bundled with the\n# Android operating system, and which are packaged with your app's APK\n# https://developer.android.com/topic/libraries/support-library/androidx-rn\nandroid.useAndroidX=true\n# Kotlin code style for this project: \"official\" or \"obsolete\":\nkotlin.code.style=official\n# Enables namespacing of each library's R class so that its R class includes only the\n# resources declared in the library itself and none from the library's dependencies,\n# thereby reducing the size of the R class for that library\nandroid.nonTransitiveRClass=true"
  },
  {
    "path": "android/SherpaOnnxWebSocket/gradlew",
    "content": "#!/usr/bin/env sh\n\n#\n# Copyright 2015 the original author or authors.\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#      https://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n\n##############################################################################\n##\n##  Gradle start up script for UN*X\n##\n##############################################################################\n\n# Attempt to set APP_HOME\n# Resolve links: $0 may be a link\nPRG=\"$0\"\n# Need this for relative symlinks.\nwhile [ -h \"$PRG\" ] ; do\n    ls=`ls -ld \"$PRG\"`\n    link=`expr \"$ls\" : '.*-> \\(.*\\)$'`\n    if expr \"$link\" : '/.*' > /dev/null; then\n        PRG=\"$link\"\n    else\n        PRG=`dirname \"$PRG\"`\"/$link\"\n    fi\ndone\nSAVED=\"`pwd`\"\ncd \"`dirname \\\"$PRG\\\"`/\" >/dev/null\nAPP_HOME=\"`pwd -P`\"\ncd \"$SAVED\" >/dev/null\n\nAPP_NAME=\"Gradle\"\nAPP_BASE_NAME=`basename \"$0\"`\n\n# Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nDEFAULT_JVM_OPTS='\"-Xmx64m\" \"-Xms64m\"'\n\n# Use the maximum available, or set MAX_FD != -1 to use that value.\nMAX_FD=\"maximum\"\n\nwarn () {\n    echo \"$*\"\n}\n\ndie () {\n    echo\n    echo \"$*\"\n    echo\n    exit 1\n}\n\n# OS specific support (must be 'true' or 'false').\ncygwin=false\nmsys=false\ndarwin=false\nnonstop=false\ncase \"`uname`\" in\n  CYGWIN* )\n    cygwin=true\n    ;;\n  Darwin* )\n    darwin=true\n    ;;\n  MINGW* )\n    msys=true\n    ;;\n  NONSTOP* )\n    nonstop=true\n    ;;\nesac\n\nCLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar\n\n\n# Determine the Java command to use to start the JVM.\nif [ -n \"$JAVA_HOME\" ] ; then\n    if [ -x \"$JAVA_HOME/jre/sh/java\" ] ; then\n        # IBM's JDK on AIX uses strange locations for the executables\n        JAVACMD=\"$JAVA_HOME/jre/sh/java\"\n    else\n        JAVACMD=\"$JAVA_HOME/bin/java\"\n    fi\n    if [ ! -x \"$JAVACMD\" ] ; then\n        die \"ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\n    fi\nelse\n    JAVACMD=\"java\"\n    which java >/dev/null 2>&1 || die \"ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\n\nPlease set the JAVA_HOME variable in your environment to match the\nlocation of your Java installation.\"\nfi\n\n# Increase the maximum file descriptors if we can.\nif [ \"$cygwin\" = \"false\" -a \"$darwin\" = \"false\" -a \"$nonstop\" = \"false\" ] ; then\n    MAX_FD_LIMIT=`ulimit -H -n`\n    if [ $? -eq 0 ] ; then\n        if [ \"$MAX_FD\" = \"maximum\" -o \"$MAX_FD\" = \"max\" ] ; then\n            MAX_FD=\"$MAX_FD_LIMIT\"\n        fi\n        ulimit -n $MAX_FD\n        if [ $? 
-ne 0 ] ; then\n            warn \"Could not set maximum file descriptor limit: $MAX_FD\"\n        fi\n    else\n        warn \"Could not query maximum file descriptor limit: $MAX_FD_LIMIT\"\n    fi\nfi\n\n# For Darwin, add options to specify how the application appears in the dock\nif $darwin; then\n    GRADLE_OPTS=\"$GRADLE_OPTS \\\"-Xdock:name=$APP_NAME\\\" \\\"-Xdock:icon=$APP_HOME/media/gradle.icns\\\"\"\nfi\n\n# For Cygwin or MSYS, switch paths to Windows format before running java\nif [ \"$cygwin\" = \"true\" -o \"$msys\" = \"true\" ] ; then\n    APP_HOME=`cygpath --path --mixed \"$APP_HOME\"`\n    CLASSPATH=`cygpath --path --mixed \"$CLASSPATH\"`\n\n    JAVACMD=`cygpath --unix \"$JAVACMD\"`\n\n    # We build the pattern for arguments to be converted via cygpath\n    ROOTDIRSRAW=`find -L / -maxdepth 1 -mindepth 1 -type d 2>/dev/null`\n    SEP=\"\"\n    for dir in $ROOTDIRSRAW ; do\n        ROOTDIRS=\"$ROOTDIRS$SEP$dir\"\n        SEP=\"|\"\n    done\n    OURCYGPATTERN=\"(^($ROOTDIRS))\"\n    # Add a user-defined pattern to the cygpath arguments\n    if [ \"$GRADLE_CYGPATTERN\" != \"\" ] ; then\n        OURCYGPATTERN=\"$OURCYGPATTERN|($GRADLE_CYGPATTERN)\"\n    fi\n    # Now convert the arguments - kludge to limit ourselves to /bin/sh\n    i=0\n    for arg in \"$@\" ; do\n        CHECK=`echo \"$arg\"|egrep -c \"$OURCYGPATTERN\" -`\n        CHECK2=`echo \"$arg\"|egrep -c \"^-\"`                                 ### Determine if an option\n\n        if [ $CHECK -ne 0 ] && [ $CHECK2 -eq 0 ] ; then                    ### Added a condition\n            eval `echo args$i`=`cygpath --path --ignore --mixed \"$arg\"`\n        else\n            eval `echo args$i`=\"\\\"$arg\\\"\"\n        fi\n        i=`expr $i + 1`\n    done\n    case $i in\n        0) set -- ;;\n        1) set -- \"$args0\" ;;\n        2) set -- \"$args0\" \"$args1\" ;;\n        3) set -- \"$args0\" \"$args1\" \"$args2\" ;;\n        4) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" ;;\n        5) 
set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" ;;\n        6) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" ;;\n        7) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" ;;\n        8) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" ;;\n        9) set -- \"$args0\" \"$args1\" \"$args2\" \"$args3\" \"$args4\" \"$args5\" \"$args6\" \"$args7\" \"$args8\" ;;\n    esac\nfi\n\n# Escape application args\nsave () {\n    for i do printf %s\\\\n \"$i\" | sed \"s/'/'\\\\\\\\''/g;1s/^/'/;\\$s/\\$/' \\\\\\\\/\" ; done\n    echo \" \"\n}\nAPP_ARGS=`save \"$@\"`\n\n# Collect all arguments for the java command, following the shell quoting and substitution rules\neval set -- $DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS \"\\\"-Dorg.gradle.appname=$APP_BASE_NAME\\\"\" -classpath \"\\\"$CLASSPATH\\\"\" org.gradle.wrapper.GradleWrapperMain \"$APP_ARGS\"\n\nexec \"$JAVACMD\" \"$@\"\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/gradlew.bat",
    "content": "@rem\n@rem Copyright 2015 the original author or authors.\n@rem\n@rem Licensed under the Apache License, Version 2.0 (the \"License\");\n@rem you may not use this file except in compliance with the License.\n@rem You may obtain a copy of the License at\n@rem\n@rem      https://www.apache.org/licenses/LICENSE-2.0\n@rem\n@rem Unless required by applicable law or agreed to in writing, software\n@rem distributed under the License is distributed on an \"AS IS\" BASIS,\n@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n@rem See the License for the specific language governing permissions and\n@rem limitations under the License.\n@rem\n\n@if \"%DEBUG%\" == \"\" @echo off\n@rem ##########################################################################\n@rem\n@rem  Gradle startup script for Windows\n@rem\n@rem ##########################################################################\n\n@rem Set local scope for the variables with windows NT shell\nif \"%OS%\"==\"Windows_NT\" setlocal\n\nset DIRNAME=%~dp0\nif \"%DIRNAME%\" == \"\" set DIRNAME=.\nset APP_BASE_NAME=%~n0\nset APP_HOME=%DIRNAME%\n\n@rem Resolve any \".\" and \"..\" in APP_HOME to make it shorter.\nfor %%i in (\"%APP_HOME%\") do set APP_HOME=%%~fi\n\n@rem Add default JVM options here. 
You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script.\nset DEFAULT_JVM_OPTS=\"-Xmx64m\" \"-Xms64m\"\n\n@rem Find java.exe\nif defined JAVA_HOME goto findJavaFromJavaHome\n\nset JAVA_EXE=java.exe\n%JAVA_EXE% -version >NUL 2>&1\nif \"%ERRORLEVEL%\" == \"0\" goto execute\n\necho.\necho ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH.\necho.\necho Please set the JAVA_HOME variable in your environment to match the\necho location of your Java installation.\n\ngoto fail\n\n:findJavaFromJavaHome\nset JAVA_HOME=%JAVA_HOME:\"=%\nset JAVA_EXE=%JAVA_HOME%/bin/java.exe\n\nif exist \"%JAVA_EXE%\" goto execute\n\necho.\necho ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME%\necho.\necho Please set the JAVA_HOME variable in your environment to match the\necho location of your Java installation.\n\ngoto fail\n\n:execute\n@rem Setup the command line\n\nset CLASSPATH=%APP_HOME%\\gradle\\wrapper\\gradle-wrapper.jar\n\n\n@rem Execute Gradle\n\"%JAVA_EXE%\" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% \"-Dorg.gradle.appname=%APP_BASE_NAME%\" -classpath \"%CLASSPATH%\" org.gradle.wrapper.GradleWrapperMain %*\n\n:end\n@rem End local scope for the variables with windows NT shell\nif \"%ERRORLEVEL%\"==\"0\" goto mainEnd\n\n:fail\nrem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of\nrem the _cmd.exe /c_ return code!\nif  not \"\" == \"%GRADLE_EXIT_CONSOLE%\" exit 1\nexit /b 1\n\n:mainEnd\nif \"%OS%\"==\"Windows_NT\" endlocal\n\n:omega\n"
  },
  {
    "path": "android/SherpaOnnxWebSocket/settings.gradle",
    "content": "pluginManagement {\n    repositories {\n        gradlePluginPortal()\n        google()\n        mavenCentral()\n    }\n}\ndependencyResolutionManagement {\n    repositoriesMode.set(RepositoriesMode.FAIL_ON_PROJECT_REPOS)\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\nrootProject.name = \"SherpaOnnx\"\ninclude ':app'\n"
  },
  {
    "path": "c-api-examples/CMakeLists.txt",
    "content": "include(cargs)\n\ninclude_directories(${PROJECT_SOURCE_DIR})\nadd_executable(decode-file-c-api decode-file-c-api.c)\ntarget_link_libraries(decode-file-c-api sherpa-onnx-c-api cargs)\n\nadd_executable(kws-c-api kws-c-api.c)\ntarget_link_libraries(kws-c-api sherpa-onnx-c-api)\n\nadd_executable(speech-enhancement-gtcrn-c-api speech-enhancement-gtcrn-c-api.c)\ntarget_link_libraries(speech-enhancement-gtcrn-c-api sherpa-onnx-c-api)\n\nadd_executable(speech-enhancement-dpdfnet-c-api speech-enhancement-dpdfnet-c-api.c)\ntarget_link_libraries(speech-enhancement-dpdfnet-c-api sherpa-onnx-c-api)\n\nadd_executable(online-speech-enhancement-gtcrn-c-api\n               online-speech-enhancement-gtcrn-c-api.c)\ntarget_link_libraries(online-speech-enhancement-gtcrn-c-api sherpa-onnx-c-api)\n\nadd_executable(online-speech-enhancement-dpdfnet-c-api\n               online-speech-enhancement-dpdfnet-c-api.c)\ntarget_link_libraries(online-speech-enhancement-dpdfnet-c-api sherpa-onnx-c-api)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  add_executable(offline-tts-c-api offline-tts-c-api.c)\n  target_link_libraries(offline-tts-c-api sherpa-onnx-c-api cargs)\n\n  add_executable(matcha-tts-zh-c-api matcha-tts-zh-c-api.c)\n  target_link_libraries(matcha-tts-zh-c-api sherpa-onnx-c-api)\n\n  add_executable(matcha-tts-en-c-api matcha-tts-en-c-api.c)\n  target_link_libraries(matcha-tts-en-c-api sherpa-onnx-c-api)\n\n  add_executable(kokoro-tts-en-c-api kokoro-tts-en-c-api.c)\n  target_link_libraries(kokoro-tts-en-c-api sherpa-onnx-c-api)\n\n  add_executable(kitten-tts-en-c-api kitten-tts-en-c-api.c)\n  target_link_libraries(kitten-tts-en-c-api sherpa-onnx-c-api)\n\n  add_executable(kokoro-tts-zh-en-c-api kokoro-tts-zh-en-c-api.c)\n  target_link_libraries(kokoro-tts-zh-en-c-api sherpa-onnx-c-api)\n\n  add_executable(pocket-tts-en-c-api pocket-tts-en-c-api.c)\n  target_link_libraries(pocket-tts-en-c-api sherpa-onnx-c-api)\n\n  add_executable(supertonic-tts-en-c-api supertonic-tts-en-c-api.c)\n 
 target_link_libraries(supertonic-tts-en-c-api sherpa-onnx-c-api)\n\n  add_executable(zipvoice-tts-zh-en-c-api zipvoice-tts-zh-en-c-api.c)\n  target_link_libraries(zipvoice-tts-zh-en-c-api sherpa-onnx-c-api)\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  add_executable(offline-speaker-diarization-c-api offline-speaker-diarization-c-api.c)\n  target_link_libraries(offline-speaker-diarization-c-api sherpa-onnx-c-api)\nendif()\n\nadd_executable(spoken-language-identification-c-api spoken-language-identification-c-api.c)\ntarget_link_libraries(spoken-language-identification-c-api sherpa-onnx-c-api)\n\nadd_executable(speaker-identification-c-api speaker-identification-c-api.c)\ntarget_link_libraries(speaker-identification-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-hlg-decode-file-c-api streaming-hlg-decode-file-c-api.c)\ntarget_link_libraries(streaming-hlg-decode-file-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-t-one-ctc-c-api streaming-t-one-ctc-c-api.c)\ntarget_link_libraries(streaming-t-one-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(audio-tagging-c-api audio-tagging-c-api.c)\ntarget_link_libraries(audio-tagging-c-api sherpa-onnx-c-api)\n\nadd_executable(add-punctuation-c-api add-punctuation-c-api.c)\ntarget_link_libraries(add-punctuation-c-api sherpa-onnx-c-api)\n\nadd_executable(add-punctuation-online-c-api add-punctuation-online-c-api.c)\ntarget_link_libraries(add-punctuation-online-c-api sherpa-onnx-c-api)\n\nadd_executable(whisper-c-api whisper-c-api.c)\ntarget_link_libraries(whisper-c-api sherpa-onnx-c-api)\n\nadd_executable(fire-red-asr-c-api fire-red-asr-c-api.c)\ntarget_link_libraries(fire-red-asr-c-api sherpa-onnx-c-api)\n\nadd_executable(nemo-canary-c-api nemo-canary-c-api.c)\ntarget_link_libraries(nemo-canary-c-api sherpa-onnx-c-api)\n\nadd_executable(nemo-parakeet-c-api nemo-parakeet-c-api.c)\ntarget_link_libraries(nemo-parakeet-c-api sherpa-onnx-c-api)\n\nadd_executable(sense-voice-c-api 
sense-voice-c-api.c)\ntarget_link_libraries(sense-voice-c-api sherpa-onnx-c-api)\n\nadd_executable(funasr-nano-c-api funasr-nano-c-api.c)\ntarget_link_libraries(funasr-nano-c-api sherpa-onnx-c-api)\n\nadd_executable(sense-voice-with-hr-c-api sense-voice-with-hr-c-api.c)\ntarget_link_libraries(sense-voice-with-hr-c-api sherpa-onnx-c-api)\n\nadd_executable(dolphin-ctc-c-api dolphin-ctc-c-api.c)\ntarget_link_libraries(dolphin-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(moonshine-c-api moonshine-c-api.c)\ntarget_link_libraries(moonshine-c-api sherpa-onnx-c-api)\n\nadd_executable(moonshine-v2-c-api moonshine-v2-c-api.c)\ntarget_link_libraries(moonshine-v2-c-api sherpa-onnx-c-api)\n\nadd_executable(zipformer-c-api zipformer-c-api.c)\ntarget_link_libraries(zipformer-c-api sherpa-onnx-c-api)\n\nadd_executable(wenet-ctc-c-api wenet-ctc-c-api.c)\ntarget_link_libraries(wenet-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(omnilingual-asr-ctc-c-api omnilingual-asr-ctc-c-api.c)\ntarget_link_libraries(omnilingual-asr-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(medasr-ctc-c-api medasr-ctc-c-api.c)\ntarget_link_libraries(medasr-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(fire-red-asr-ctc-c-api fire-red-asr-ctc-c-api.c)\ntarget_link_libraries(fire-red-asr-ctc-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-zipformer-c-api streaming-zipformer-c-api.c)\ntarget_link_libraries(streaming-zipformer-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-zipformer-with-hr-c-api streaming-zipformer-with-hr-c-api.c)\ntarget_link_libraries(streaming-zipformer-with-hr-c-api sherpa-onnx-c-api)\n\nadd_executable(paraformer-c-api paraformer-c-api.c)\ntarget_link_libraries(paraformer-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-paraformer-c-api streaming-paraformer-c-api.c)\ntarget_link_libraries(streaming-paraformer-c-api sherpa-onnx-c-api)\n\nadd_executable(telespeech-c-api telespeech-c-api.c)\ntarget_link_libraries(telespeech-c-api 
sherpa-onnx-c-api)\n\nadd_executable(vad-sense-voice-c-api vad-sense-voice-c-api.c)\ntarget_link_libraries(vad-sense-voice-c-api sherpa-onnx-c-api)\n\nadd_executable(vad-whisper-c-api vad-whisper-c-api.c)\ntarget_link_libraries(vad-whisper-c-api sherpa-onnx-c-api)\n\nadd_executable(vad-moonshine-c-api vad-moonshine-c-api.c)\ntarget_link_libraries(vad-moonshine-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-zipformer-buffered-tokens-hotwords-c-api\n               streaming-zipformer-buffered-tokens-hotwords-c-api.c)\ntarget_link_libraries(streaming-zipformer-buffered-tokens-hotwords-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-paraformer-buffered-tokens-c-api\n               streaming-paraformer-buffered-tokens-c-api.c)\ntarget_link_libraries(streaming-paraformer-buffered-tokens-c-api sherpa-onnx-c-api)\n\nadd_executable(streaming-ctc-buffered-tokens-c-api\n               streaming-ctc-buffered-tokens-c-api.c)\ntarget_link_libraries(streaming-ctc-buffered-tokens-c-api sherpa-onnx-c-api)\n\nadd_executable(keywords-spotter-buffered-tokens-keywords-c-api\n               keywords-spotter-buffered-tokens-keywords-c-api.c)\ntarget_link_libraries(keywords-spotter-buffered-tokens-keywords-c-api sherpa-onnx-c-api)\n\nif(SHERPA_ONNX_HAS_ALSA)\n  add_subdirectory(./asr-microphone-example)\nelseif((UNIX AND NOT APPLE) OR LINUX)\n  message(WARNING \"Skipping ./asr-microphone-example because ALSA is not available\")\nendif()\n"
  },
  {
    "path": "c-api-examples/Makefile",
    "content": "\nCUR_DIR :=$(shell pwd)\n\nCFLAGS := -I ../ -I ../build/_deps/cargs-src/include/\nLDFLAGS := -L ../build/lib\nLDFLAGS += -L ../build/_deps/onnxruntime-src/lib\nLDFLAGS += -lsherpa-onnx-c-api -lsherpa-onnx-core -lkaldi-decoder-core -lsherpa-onnx-kaldifst-core -lsherpa-onnx-fstfar -lsherpa-onnx-fst -lkaldi-native-fbank-core -lkissfft-float -lpiper_phonemize -lespeak-ng -lucd -lcargs -lonnxruntime\nLDFLAGS += -framework Foundation\nLDFLAGS += -lc++\nLDFLAGS += -Wl,-rpath,${CUR_DIR}/../build/lib\nLDFLAGS += -Wl,-rpath,${CUR_DIR}/../build/_deps/onnxruntime-src/lib\n\n.PHONY: all clean\n\nall: decode-file-c-api offline-tts-c-api\n\ndecode-file-c-api: decode-file-c-api.c\n\t$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)\n\noffline-tts-c-api: offline-tts-c-api.c\n\t$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)\n\nclean:\n\t$(RM) ./decode-file-c-api ./offline-tts-c-api\n"
  },
  {
    "path": "c-api-examples/README.md",
    "content": "# Introduction\n\nThis folder contains C API examples for [sherpa-onnx][sherpa-onnx].\n\nPlease refer to the documentation\nhttps://k2-fsa.github.io/sherpa/onnx/c-api/index.html\nfor details.\n\n\n## File descriptions\n\n- [decode-file-c-api.c](./decode-file-c-api.c) This file shows how to use the C API\n  for speech recognition with a streaming model.\n\n- [offline-tts-c-api.c](./offline-tts-c-api.c) This file shows how to use the C API\n  to convert text to speech with a non-streaming model.\n\n- [speech-enhancement-gtcrn-c-api.c](./speech-enhancement-gtcrn-c-api.c)\n  This file shows how to use the C API for speech enhancement with GTCRN\n  models.\n\n- [speech-enhancement-dpdfnet-c-api.c](./speech-enhancement-dpdfnet-c-api.c)\n  This file shows how to use the C API for speech enhancement with DPDFNet\n  models. Use 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`,\n  `dpdfnet2.onnx`, `dpdfnet4.onnx`, or `dpdfnet8.onnx` for downstream ASR and\n  `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.\n\n- [online-speech-enhancement-gtcrn-c-api.c](./online-speech-enhancement-gtcrn-c-api.c)\n  This file shows how to use the C API for online speech enhancement with\n  GTCRN models.\n\n- [online-speech-enhancement-dpdfnet-c-api.c](./online-speech-enhancement-dpdfnet-c-api.c)\n  This file shows how to use the C API for online speech enhancement with\n  DPDFNet models. Use `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`,\n  `dpdfnet4.onnx`, or `dpdfnet8.onnx` for 16 kHz output.\n\n[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx\n"
  },
  {
    "path": "c-api-examples/add-punctuation-c-api.c",
    "content": "// c-api-examples/add-punctuation-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n// We assume you have pre-downloaded the model files for testing\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n//\n// An example is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n// tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n// rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxOfflinePunctuationConfig config;\n  memset(&config, 0, sizeof(config));\n\n  // clang-format off\n  config.model.ct_transformer = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\";\n  // clang-format on\n  config.model.num_threads = 1;\n  config.model.debug = 1;\n  config.model.provider = \"cpu\";\n\n  const SherpaOnnxOfflinePunctuation *punct =\n      SherpaOnnxCreateOfflinePunctuation(&config);\n  if (!punct) {\n    fprintf(stderr,\n            \"Failed to create OfflinePunctuation. 
Please check your config\\n\");\n    return -1;\n  }\n\n  const char *texts[] = {\n      \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n      \"我们都是木头人不会说话不会动\",\n      (\"The African blogosphere is rapidly expanding bringing more voices \"\n       \"online in the form of commentaries opinions analyses rants and poetry\"),\n  };\n\n  int32_t n = sizeof(texts) / sizeof(const char *);\n  fprintf(stderr, \"n: %d\\n\", n);\n\n  fprintf(stderr, \"--------------------\\n\");\n  for (int32_t i = 0; i != n; ++i) {\n    const char *text_with_punct =\n        SherpaOfflinePunctuationAddPunct(punct, texts[i]);\n\n    fprintf(stderr, \"Input text: %s\\n\", texts[i]);\n    fprintf(stderr, \"Output text: %s\\n\", text_with_punct);\n    SherpaOfflinePunctuationFreeText(text_with_punct);\n    fprintf(stderr, \"--------------------\\n\");\n  }\n\n  SherpaOnnxDestroyOfflinePunctuation(punct);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/add-punctuation-online-c-api.c",
    "content": "// c-api-examples/add-punctuation-online-c-api.c\n//\n// Copyright (c)  zengyw\n\n// We assume you have pre-downloaded the model files for testing\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n//\n// An example is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n// tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n// rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxOnlinePunctuationConfig config;\n  memset(&config, 0, sizeof(config));\n\n  // clang-format off\n  config.model.cnn_bilstm = \"./sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx\";\n  config.model.bpe_vocab = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\";\n  // clang-format on\n  config.model.num_threads = 1;\n  config.model.debug = 1;\n  config.model.provider = \"cpu\";\n\n  const SherpaOnnxOnlinePunctuation *punct =\n      SherpaOnnxCreateOnlinePunctuation(&config);\n  if (!punct) {\n    fprintf(stderr,\n            \"Failed to create OnlinePunctuation. 
Please check your config\\n\");\n    return -1;\n  }\n\n  const char *texts[] = {\n      \"how are you i am fine thank you\",\n      (\"The African blogosphere is rapidly expanding bringing more voices \"\n       \"online in the form of commentaries opinions analyses rants and poetry\"),\n  };\n\n  int32_t n = sizeof(texts) / sizeof(const char *);\n  fprintf(stderr, \"n: %d\\n\", n);\n\n  fprintf(stderr, \"--------------------\\n\");\n  for (int32_t i = 0; i != n; ++i) {\n    const char *text_with_punct =\n        SherpaOnnxOnlinePunctuationAddPunct(punct, texts[i]);\n    if (!text_with_punct) {\n      fprintf(stderr, \"Failed to add punctuation for: %s\\n\", texts[i]);\n      continue;\n    }\n\n    fprintf(stderr, \"Input text: %s\\n\", texts[i]);\n    fprintf(stderr, \"Output text: %s\\n\", text_with_punct);\n    SherpaOnnxOnlinePunctuationFreeText(text_with_punct);\n    fprintf(stderr, \"--------------------\\n\");\n  }\n\n  SherpaOnnxDestroyOnlinePunctuation(punct);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/asr-microphone-example/CMakeLists.txt",
    "content": "\nadd_executable(c-api-alsa c-api-alsa.cc alsa.cc)\ntarget_link_libraries(c-api-alsa sherpa-onnx-c-api cargs)\n\nif(DEFINED ENV{SHERPA_ONNX_ALSA_LIB_DIR})\n  target_link_libraries(c-api-alsa -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\nelse()\n  target_link_libraries(c-api-alsa asound)\nendif()\n"
  },
  {
    "path": "c-api-examples/asr-microphone-example/CPPLINT.cfg",
    "content": "exclude_files=alsa.cc|alsa.h\n"
  },
  {
    "path": "c-api-examples/asr-microphone-example/README.md",
    "content": "# Introduction\n\nThis folder contains examples for real-time speech recognition from a microphone\nusing sherpa-onnx C API.\n\n**Note**: You can call C API from C++ files.\n\n\n## ./c-api-alsa.cc\n\nThis file uses alsa to read a microphone. It runs only on Linux. This file\ndoes not support macOS or Windows.\n"
  },
  {
    "path": "c-api-examples/asr-microphone-example/c-api-alsa.cc",
    "content": "// c-api-examples/asr-microphone-example/c-api-alsa.cc\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include <algorithm>\n#include <cctype>  // std::tolower\n#include <cstdint>\n#include <string>\n#include <vector>\n\n#include \"c-api-examples/asr-microphone-example/alsa.h\"\n\n// NOTE: You don't need to use cargs.h in your own project.\n// We use it in this file to parse commandline arguments\n#include \"cargs.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic struct cag_option options[] = {\n    {/*.identifier =*/'h',\n     /*.access_letters =*/\"h\",\n     /*.access_name =*/\"help\",\n     /*.value_name =*/\"help\",\n     /*.description =*/\"Show help\"},\n    {/*.identifier =*/'t',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"tokens\",\n     /*.value_name =*/\"tokens\",\n     /*.description =*/\"Tokens file\"},\n    {/*.identifier =*/'e',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"encoder\",\n     /*.value_name =*/\"encoder\",\n     /*.description =*/\"Encoder ONNX file\"},\n    {/*.identifier =*/'d',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"decoder\",\n     /*.value_name =*/\"decoder\",\n     /*.description =*/\"Decoder ONNX file\"},\n    {/*.identifier =*/'j',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"joiner\",\n     /*.value_name =*/\"joiner\",\n     /*.description =*/\"Joiner ONNX file\"},\n    {/*.identifier =*/'n',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"num-threads\",\n     /*.value_name =*/\"num-threads\",\n     /*.description =*/\"Number of threads\"},\n    {/*.identifier =*/'p',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"provider\",\n     /*.value_name =*/\"provider\",\n     /*.description =*/\"Provider: cpu (default), cuda, coreml\"},\n    {/*.identifier =*/'m',\n     /*.access_letters =*/NULL,\n     /*.access_name 
=*/\"decoding-method\",\n     /*.value_name =*/\"decoding-method\",\n     /*.description =*/\n     \"Decoding method: greedy_search (default), modified_beam_search\"},\n    {/*.identifier =*/'f',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"hotwords-file\",\n     /*.value_name =*/\"hotwords-file\",\n     /*.description =*/\n     \"The file containing hotwords, one words/phrases per line, and for each \"\n     \"phrase the bpe/cjkchar are separated by a space. For example: ▁HE LL O \"\n     \"▁WORLD, 你 好 世 界\"},\n    {/*.identifier =*/'s',\n     /*.access_letters =*/NULL,\n     /*.access_name =*/\"hotwords-score\",\n     /*.value_name =*/\"hotwords-score\",\n     /*.description =*/\n     \"The bonus score for each token in hotwords. Used only when \"\n     \"decoding_method is modified_beam_search\"},\n};\n\nconst char *kUsage =\n    R\"(\nUsage:\n  ./bin/c-api-alsa \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/decoder.onnx \\\n    device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)\";\n\nbool stop = false;\n\nstatic void Handler(int sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  if (argc < 6) {\n    fprintf(stderr, \"%s\\n\", kUsage);\n    exit(0);\n  }\n\n  signal(SIGINT, Handler);\n\n  SherpaOnnxOnlineRecognizerConfig config;\n  memset(&config, 0, sizeof(config));\n\n  config.model_config.debug = 0;\n  config.model_config.num_threads = 1;\n  config.model_config.provider = \"cpu\";\n\n  config.decoding_method = \"greedy_search\";\n\n  config.max_active_paths = 4;\n\n  config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n\n  config.enable_endpoint = 1;\n  config.rule1_min_trailing_silence = 2.4;\n  config.rule2_min_trailing_silence = 1.2;\n  config.rule3_min_utterance_length = 300;\n\n  cag_option_context context;\n  char identifier;\n  const char *value;\n\n  cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);\n\n  while (cag_option_fetch(&context)) {\n    identifier = cag_option_get(&context);\n    value = cag_option_get_value(&context);\n    switch (identifier) {\n      case 't':\n        config.model_config.tokens = value;\n        break;\n      case 'e':\n        config.model_config.transducer.encoder = value;\n        break;\n      case 'd':\n        config.model_config.transducer.decoder = value;\n        break;\n      case 'j':\n        config.model_config.transducer.joiner = value;\n        break;\n      case 'n':\n        config.model_config.num_threads = atoi(value);\n        break;\n      case 'p':\n        config.model_config.provider = value;\n        break;\n      case 'm':\n        config.decoding_method = value;\n        break;\n      case 'f':\n        config.hotwords_file = value;\n        break;\n      case 's':\n        config.hotwords_score = atof(value);\n        break;\n      case 'h': {\n        fprintf(stderr, \"%s\\n\", kUsage);\n        exit(0);\n        break;\n      }\n      default:\n        // do nothing as config already has valid default values\n        break;\n    }\n  }\n\n  const 
SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&config);\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n  const char *device_name = argv[context.index];\n  sherpa_onnx::Alsa alsa(device_name);\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name);\n  fprintf(stderr,\n          \"Please \\033[32m\\033[1mspeak\\033[0m! Press \\033[31m\\033[1mCtrl + \"\n          \"C\\033[0m to exit\\n\");\n\n  int32_t expected_sample_rate = 16000;\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n\n  std::string last_text;\n\n  int32_t segment_index = 0;\n\n  while (!stop) {\n    const std::vector<float> &samples = alsa.Read(chunk);\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, expected_sample_rate,\n                                         samples.data(), samples.size());\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    std::string text = r->text;\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n    if (!text.empty() && last_text != text) {\n      last_text = text;\n\n      std::transform(text.begin(), text.end(), text.begin(),\n                     [](auto c) { return std::tolower(c); });\n\n      SherpaOnnxPrint(display, segment_index, text.c_str());\n      fflush(stderr);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (!text.empty()) {\n        ++segment_index;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n  }\n\n  // free 
allocated resources\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/audio-tagging-c-api.c",
    "content": "// c-api-examples/audio-tagging-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n// We assume you have pre-downloaded the model files for testing\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n//\n// An example is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n// tar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n// rm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxAudioTaggingConfig config;\n  memset(&config, 0, sizeof(config));\n\n  config.model.zipformer.model =\n      \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx\";\n  config.model.num_threads = 1;\n  config.model.debug = 1;\n  config.model.provider = \"cpu\";\n  // clang-format off\n  config.labels = \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv\";\n  // clang-format on\n\n  const SherpaOnnxAudioTagging *tagger = SherpaOnnxCreateAudioTagging(&config);\n  if (!tagger) {\n    fprintf(stderr, \"Failed to create audio tagger. 
Please check your config\\n\");\n    return -1;\n  }\n\n  // You can find more test waves from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n  const char *wav_filename =\n      \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/1.wav\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    // release the tagger created above before bailing out\n    SherpaOnnxDestroyAudioTagging(tagger);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxAudioTaggingCreateOfflineStream(tagger);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n\n  int32_t top_k = 5;\n  const SherpaOnnxAudioEvent *const *results =\n      SherpaOnnxAudioTaggingCompute(tagger, stream, top_k);\n\n  fprintf(stderr, \"--------------------------------------------------\\n\");\n  fprintf(stderr, \"Index\\t\\tProbability\\t\\tEvent name\\n\");\n  fprintf(stderr, \"--------------------------------------------------\\n\");\n  for (int32_t i = 0; i != top_k; ++i) {\n    fprintf(stderr, \"%d\\t\\t%.3f\\t\\t\\t%s\\n\", i, results[i]->prob,\n            results[i]->name);\n  }\n  fprintf(stderr, \"--------------------------------------------------\\n\");\n\n  SherpaOnnxAudioTaggingFreeResults(results);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyAudioTagging(tagger);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/decode-file-c-api.c",
    "content": "// c-api-examples/decode-file-c-api.c\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// to decode a file.\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"cargs.h\"\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic struct cag_option options[] = {\n    {.identifier = 'h',\n     .access_letters = \"h\",\n     .access_name = \"help\",\n     .description = \"Show help\"},\n    {.identifier = 't',\n     .access_letters = NULL,\n     .access_name = \"tokens\",\n     .value_name = \"tokens\",\n     .description = \"Tokens file\"},\n    {.identifier = 'e',\n     .access_letters = NULL,\n     .access_name = \"encoder\",\n     .value_name = \"encoder\",\n     .description = \"Encoder ONNX file\"},\n    {.identifier = 'd',\n     .access_letters = NULL,\n     .access_name = \"decoder\",\n     .value_name = \"decoder\",\n     .description = \"Decoder ONNX file\"},\n    {.identifier = 'j',\n     .access_letters = NULL,\n     .access_name = \"joiner\",\n     .value_name = \"joiner\",\n     .description = \"Joiner ONNX file\"},\n    {.identifier = 'n',\n     .access_letters = NULL,\n     .access_name = \"num-threads\",\n     .value_name = \"num-threads\",\n     .description = \"Number of threads\"},\n    {.identifier = 'p',\n     .access_letters = NULL,\n     .access_name = \"provider\",\n     .value_name = \"provider\",\n     .description = \"Provider: cpu (default), cuda, coreml\"},\n    {.identifier = 'm',\n     .access_letters = NULL,\n     .access_name = \"decoding-method\",\n     .value_name = \"decoding-method\",\n     .description =\n         \"Decoding method: greedy_search (default), modified_beam_search\"},\n    {.identifier = 'f',\n     .access_letters = NULL,\n     .access_name = \"hotwords-file\",\n     .value_name = \"hotwords-file\",\n     .description = \"The file containing hotwords, one words/phrases per line, \"\n                    \"and for each phrase the 
bpe/cjkchar are separated by a \"\n                    \"space. For example: ▁HE LL O ▁WORLD, 你 好 世 界\"},\n    {.identifier = 's',\n     .access_letters = NULL,\n     .access_name = \"hotwords-score\",\n     .value_name = \"hotwords-score\",\n     .description = \"The bonus score for each token in hotwords. Used only \"\n                    \"when decoding_method is modified_beam_search\"},\n};\n\nconst char *kUsage =\n    \"\\n\"\n    \"Usage:\\n\"\n    \"  ./bin/decode-file-c-api \\\\\\n\"\n    \"    --tokens=/path/to/tokens.txt \\\\\\n\"\n    \"    --encoder=/path/to/encoder.onnx \\\\\\n\"\n    \"    --decoder=/path/to/decoder.onnx \\\\\\n\"\n    \"    --joiner=/path/to/joiner.onnx \\\\\\n\"\n    \"    --provider=cpu \\\\\\n\"\n    \"    /path/to/foo.wav\\n\"\n    \"\\n\\n\"\n    \"Default num_threads is 1.\\n\"\n    \"Valid decoding_method: greedy_search (default), modified_beam_search\\n\\n\"\n    \"Valid provider: cpu (default), cuda, coreml\\n\\n\"\n    \"Please refer to\\n\"\n    \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/\"\n    \"index.html\\n\"\n    \"for a list of pre-trained models to download.\\n\"\n    \"\\n\"\n    \"Note that this file supports only streaming transducer models.\\n\";\n\nint32_t main(int32_t argc, char *argv[]) {\n  if (argc < 6) {\n    fprintf(stderr, \"%s\\n\", kUsage);\n    exit(0);\n  }\n\n  SherpaOnnxOnlineRecognizerConfig config;\n  memset(&config, 0, sizeof(config));\n\n  config.model_config.debug = 0;\n  config.model_config.num_threads = 1;\n  config.model_config.provider = \"cpu\";\n\n  config.decoding_method = \"greedy_search\";\n\n  config.max_active_paths = 4;\n\n  config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n\n  config.enable_endpoint = 1;\n  config.rule1_min_trailing_silence = 2.4;\n  config.rule2_min_trailing_silence = 1.2;\n  config.rule3_min_utterance_length = 300;\n\n  cag_option_context context;\n  char identifier;\n  const char *value;\n\n  
cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);\n\n  while (cag_option_fetch(&context)) {\n    identifier = cag_option_get(&context);\n    value = cag_option_get_value(&context);\n    switch (identifier) {\n      case 't':\n        config.model_config.tokens = value;\n        break;\n      case 'e':\n        config.model_config.transducer.encoder = value;\n        break;\n      case 'd':\n        config.model_config.transducer.decoder = value;\n        break;\n      case 'j':\n        config.model_config.transducer.joiner = value;\n        break;\n      case 'n':\n        config.model_config.num_threads = atoi(value);\n        break;\n      case 'p':\n        config.model_config.provider = value;\n        break;\n      case 'm':\n        config.decoding_method = value;\n        break;\n      case 'f':\n        config.hotwords_file = value;\n        break;\n      case 's':\n        config.hotwords_score = atof(value);\n        break;\n      case 'h': {\n        fprintf(stderr, \"%s\\n\", kUsage);\n        exit(0);\n        break;\n      }\n      default:\n        // do nothing as config already has valid default values\n        break;\n    }\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&config);\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n  const char *wav_filename = argv[context.index];\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n  // simulate streaming\n\n#define N 3200  // 0.2 s. 
Sample rate is fixed to 16 kHz\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/dolphin-ctc-c-api.c",
    "content": "// c-api-examples/dolphin-ctc-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Dolphin CTC model with sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n// tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n// rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\";\n  const char *model_filename = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\";\n  const char *tokens_filename = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\";\n  // clang-format on\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.dolphin.model = model_filename;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your 
config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/fire-red-asr-c-api.c",
    "content": "// c-api-examples/fire-red-asr-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// We assume you have pre-downloaded the FireRedAsr model\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n// An example is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n// tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n// rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.fire_red_asr.encoder = encoder_filename;\n  offline_model_config.fire_red_asr.decoder = decoder_filename;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = 
\"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n\n    SherpaOnnxFreeWave(wave);\n\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/fire-red-asr-ctc-c-api.c",
    "content": "// c-api-examples/fire-red-asr-ctc-c-api.c\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FireRedASR with sherpa-onnx's C API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n*/\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\";\n  const char *model_filename = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\";\n  const char *tokens_filename = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\";\n  // clang-format on\n\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineFireRedAsrCtcModelConfig fire_red_asr_ctc;\n  memset(&fire_red_asr_ctc, 0, sizeof(fire_red_asr_ctc));\n  fire_red_asr_ctc.model = model_filename;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.fire_red_asr_ctc = fire_red_asr_ctc;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = 
offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/funasr-nano-c-api.c",
    "content": "// c-api-examples/funasr-nano-c-api.c\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FunASR Nano with sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n// tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n// rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/dia_yue.wav\";\n  const char *encoder_adaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\";\n  const char *embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\";\n  const char *llm = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\";\n  const char *tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\";\n  // clang-format on\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineFunASRNanoModelConfig funasr_nano;\n  memset(&funasr_nano, 0, sizeof(funasr_nano));\n  funasr_nano.encoder_adaptor = encoder_adaptor;\n  funasr_nano.embedding = embedding;\n  funasr_nano.llm = llm;\n  funasr_nano.tokenizer = tokenizer;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 2;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.funasr_nano = funasr_nano;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, 
sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/keywords-spotter-buffered-tokens-keywords-c-api.c",
    "content": "// c-api-examples/keywords-spotter-buffered-tokens-keywords-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Luo Xiao\n\n//\n// This file demonstrates how to use the keyword spotter with sherpa-onnx's C\n// API, with tokens and keywords loaded from in-memory buffers instead of from\n// external files.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// Reads the whole file into a newly allocated buffer.\n// Returns the number of bytes read, or 0 on failure, in which case\n// *buffer_out is set to NULL. The caller must free() the buffer.\nstatic size_t ReadFile(const char *filename, const char **buffer_out) {\n  *buffer_out = NULL;\n  FILE *file = fopen(filename, \"r\");\n  if (file == NULL) {\n    fprintf(stderr, \"Failed to open %s\\n\", filename);\n    return 0;\n  }\n  fseek(file, 0L, SEEK_END);\n  long size = ftell(file);\n  rewind(file);\n  if (size <= 0) {\n    fclose(file);\n    fprintf(stderr, \"%s is empty or its size cannot be determined\\n\",\n            filename);\n    return 0;\n  }\n  char *buffer = malloc(size);\n  if (buffer == NULL) {\n    fclose(file);\n    fprintf(stderr, \"Memory error\\n\");\n    return 0;\n  }\n  size_t read_bytes = fread(buffer, 1, size, file);\n  fclose(file);\n  if (read_bytes != (size_t)size) {\n    fprintf(stderr, \"Errors occurred in reading the file %s\\n\", filename);\n    free(buffer);\n    return 0;\n  }\n  *buffer_out = buffer;\n  return read_bytes;\n}\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/test_wavs/\"\n      \"6.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      
\"decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n  const char *joiner_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n  const char *provider = \"cpu\";\n  const char *tokens_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/tokens.txt\";\n  const char *keywords_filename =\n      \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/test_wavs/\"\n      \"test_keywords.txt\";\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // reading tokens and keywords to buffers\n  const char *tokens_buf;\n  size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);\n  if (token_buf_size < 1) {\n    fprintf(stderr, \"Please check your tokens.txt!\\n\");\n    free((void *)tokens_buf);\n    return -1;\n  }\n  const char *keywords_buf;\n  size_t keywords_buf_size = ReadFile(keywords_filename, &keywords_buf);\n  if (keywords_buf_size < 1) {\n    fprintf(stderr, \"Please check your keywords.txt!\\n\");\n    free((void *)keywords_buf);\n    return -1;\n  }\n\n  // Zipformer config\n  SherpaOnnxOnlineTransducerModelConfig zipformer_config;\n  memset(&zipformer_config, 0, sizeof(zipformer_config));\n  zipformer_config.encoder = encoder_filename;\n  zipformer_config.decoder = decoder_filename;\n  zipformer_config.joiner = joiner_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens_buf = tokens_buf;\n  online_model_config.tokens_buf_size = token_buf_size;\n  online_model_config.transducer = zipformer_config;\n\n  // Keywords-spotter config\n  SherpaOnnxKeywordSpotterConfig 
keywords_spotter_config;\n  memset(&keywords_spotter_config, 0, sizeof(keywords_spotter_config));\n  keywords_spotter_config.max_active_paths = 4;\n  keywords_spotter_config.keywords_threshold = 0.1;\n  keywords_spotter_config.keywords_score = 3.0;\n  keywords_spotter_config.model_config = online_model_config;\n  keywords_spotter_config.keywords_buf = keywords_buf;\n  keywords_spotter_config.keywords_buf_size = keywords_buf_size;\n\n  const SherpaOnnxKeywordSpotter *keywords_spotter =\n      SherpaOnnxCreateKeywordSpotter(&keywords_spotter_config);\n\n  free((void *)tokens_buf);\n  tokens_buf = NULL;\n  free((void *)keywords_buf);\n  keywords_buf = NULL;\n\n  if (keywords_spotter == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateKeywordStream(keywords_spotter);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? 
wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsKeywordStreamReady(keywords_spotter, stream)) {\n      SherpaOnnxDecodeKeywordStream(keywords_spotter, stream);\n    }\n\n    const SherpaOnnxKeywordResult *r =\n        SherpaOnnxGetKeywordResult(keywords_spotter, stream);\n\n    if (strlen(r->keyword)) {\n      SherpaOnnxPrint(display, segment_id, r->keyword);\n    }\n\n    SherpaOnnxDestroyKeywordResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsKeywordStreamReady(keywords_spotter, stream)) {\n    SherpaOnnxDecodeKeywordStream(keywords_spotter, stream);\n  }\n\n  const SherpaOnnxKeywordResult *r =\n      SherpaOnnxGetKeywordResult(keywords_spotter, stream);\n\n  if (strlen(r->keyword)) {\n    SherpaOnnxPrint(display, segment_id, r->keyword);\n  }\n\n  SherpaOnnxDestroyKeywordResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyKeywordSpotter(keywords_spotter);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/kitten-tts-en-c-api.c",
    "content": "// c-api-examples/kitten-tts-en-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for English TTS with Kitten.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\n./kitten-tts-en-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.kitten.model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\";\n  config.model.kitten.voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\";\n  config.model.kitten.tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\";\n  config.model.kitten.data_dir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  const char *filename = \"./generated-kitten-en.wav\";\n  const char *text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  // mapping of sid to voice name\n  // 0->expr-voice-2-m, 1->expr-voice-2-f, 2->expr-voice-3-m\n  // 3->expr-voice-3-f, 4->expr-voice-4-m, 5->expr-voice-4-f\n  // 6->expr-voice-5-m, 7->expr-voice-5-f\n  int32_t sid = 0;\n  float speed = 1.0;  // larger -> faster in speech speed\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.silence_scale = 0.2f;\n  cfg.sid = sid;\n  cfg.speed = speed;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/kokoro-tts-en-c-api.c",
    "content": "// c-api-examples/kokoro-tts-en-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for English TTS with Kokoro.\n//\n// clang-format off\n/*\nUsage\n\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\n./kokoro-tts-en-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.kokoro.model = \"./kokoro-en-v0_19/model.onnx\";\n  config.model.kokoro.voices = \"./kokoro-en-v0_19/voices.bin\";\n  config.model.kokoro.tokens = \"./kokoro-en-v0_19/tokens.txt\";\n  config.model.kokoro.data_dir = \"./kokoro-en-v0_19/espeak-ng-data\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  const char *filename = \"./generated-kokoro-en.wav\";\n  const char *text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  // mapping of sid to voice name\n  // 0->af, 1->af_bella, 2->af_nicole, 3->af_sarah, 4->af_sky, 5->am_adam\n  // 6->am_michael, 7->bf_emma, 8->bf_isabella, 9->bm_george, 10->bm_lewis\n  int32_t sid = 0;\n  float speed = 1.0;  // larger -> faster in speech speed\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.silence_scale = 0.2f;\n  cfg.sid = sid;\n  cfg.speed = speed;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/kokoro-tts-zh-en-c-api.c",
    "content": "// c-api-examples/kokoro-tts-zh-en-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for English + Chinese TTS with Kokoro.\n//\n// clang-format off\n/*\nUsage\n\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\n./kokoro-tts-zh-en-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.kokoro.model = \"./kokoro-multi-lang-v1_0/model.onnx\";\n  config.model.kokoro.voices = \"./kokoro-multi-lang-v1_0/voices.bin\";\n  config.model.kokoro.tokens = \"./kokoro-multi-lang-v1_0/tokens.txt\";\n  config.model.kokoro.data_dir = \"./kokoro-multi-lang-v1_0/espeak-ng-data\";\n  config.model.kokoro.dict_dir = \"./kokoro-multi-lang-v1_0/dict\";\n  config.model.kokoro.lexicon =\n      \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/\"\n      \"lexicon-zh.txt\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  const char *filename = \"./generated-kokoro-zh-en.wav\";\n  const char *text =\n      \"中英文语音合成测试。This is generated by next generation Kaldi using \"\n      \"Kokoro without Misaki. 
你觉得中英文说的如何呢？\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  int32_t sid = 0;    // there are 53 speakers\n  float speed = 1.0;  // larger -> faster in speech speed\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.silence_scale = 0.2f;\n  cfg.sid = sid;\n  cfg.speed = speed;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/kws-c-api.c",
    "content": "// c-api-examples/kws-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n//\n// This file demonstrates how to use the keyword spotter with sherpa-onnx's C\n// API.\n// clang-format off\n//\n// Usage\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n//\n// ./kws-c-api\n//\n// clang-format on\n#include <stdio.h>\n#include <stdlib.h>  // exit\n#include <string.h>  // memset\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxKeywordSpotterConfig config;\n\n  memset(&config, 0, sizeof(config));\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n\n  config.model_config.tokens =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"tokens.txt\";\n\n  config.model_config.provider = \"cpu\";\n  config.model_config.num_threads = 1;\n  config.model_config.debug = 1;\n\n  config.keywords_file =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"test_wavs/test_keywords.txt\";\n\n  const SherpaOnnxKeywordSpotter *kws = SherpaOnnxCreateKeywordSpotter(&config);\n  if (!kws) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    exit(-1);\n  }\n\n  fprintf(stderr,\n          \"--Test pre-defined keywords from test_wavs/test_keywords.txt--\\n\");\n\n  
const char *wav_filename =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"test_wavs/3.wav\";\n\n  float tail_paddings[8000] = {0};  // 0.5 seconds\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    exit(-1);\n  }\n\n  const SherpaOnnxOnlineStream *stream = SherpaOnnxCreateKeywordStream(kws);\n  if (!stream) {\n    fprintf(stderr, \"Failed to create stream\\n\");\n    exit(-1);\n  }\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,\n                                       wave->num_samples);\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       sizeof(tail_paddings) / sizeof(float));\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsKeywordStreamReady(kws, stream)) {\n    SherpaOnnxDecodeKeywordStream(kws, stream);\n    const SherpaOnnxKeywordResult *r = SherpaOnnxGetKeywordResult(kws, stream);\n    if (r && r->json && strlen(r->keyword)) {\n      fprintf(stderr, \"Detected keyword: %s\\n\", r->json);\n\n      // Remember to reset the keyword stream right after a keyword is detected\n      SherpaOnnxResetKeywordStream(kws, stream);\n    }\n    SherpaOnnxDestroyKeywordResult(r);\n  }\n  SherpaOnnxDestroyOnlineStream(stream);\n\n  // --------------------------------------------------------------------------\n\n  fprintf(stderr, \"--Use pre-defined keywords + add a new keyword--\\n\");\n\n  stream = SherpaOnnxCreateKeywordStreamWithKeywords(kws, \"y ǎn y uán @演员\");\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,\n                                       wave->num_samples);\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       sizeof(tail_paddings) / sizeof(float));\n  
SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsKeywordStreamReady(kws, stream)) {\n    SherpaOnnxDecodeKeywordStream(kws, stream);\n    const SherpaOnnxKeywordResult *r = SherpaOnnxGetKeywordResult(kws, stream);\n    if (r && r->json && strlen(r->keyword)) {\n      fprintf(stderr, \"Detected keyword: %s\\n\", r->json);\n\n      // Remember to reset the keyword stream\n      SherpaOnnxResetKeywordStream(kws, stream);\n    }\n    SherpaOnnxDestroyKeywordResult(r);\n  }\n  SherpaOnnxDestroyOnlineStream(stream);\n\n  // --------------------------------------------------------------------------\n\n  fprintf(stderr, \"--Use pre-defined keywords + add two new keywords--\\n\");\n\n  stream = SherpaOnnxCreateKeywordStreamWithKeywords(\n      kws, \"y ǎn y uán @演员/zh ī m íng @知名\");\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,\n                                       wave->num_samples);\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       sizeof(tail_paddings) / sizeof(float));\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsKeywordStreamReady(kws, stream)) {\n    SherpaOnnxDecodeKeywordStream(kws, stream);\n    const SherpaOnnxKeywordResult *r = SherpaOnnxGetKeywordResult(kws, stream);\n    if (r && r->json && strlen(r->keyword)) {\n      fprintf(stderr, \"Detected keyword: %s\\n\", r->json);\n\n      // Remember to reset the keyword stream\n      SherpaOnnxResetKeywordStream(kws, stream);\n    }\n    SherpaOnnxDestroyKeywordResult(r);\n  }\n  SherpaOnnxDestroyOnlineStream(stream);\n\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyKeywordSpotter(kws);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/matcha-tts-en-c-api.c",
    "content": "// c-api-examples/matcha-tts-en-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for English TTS with MatchaTTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n./matcha-tts-en-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.matcha.acoustic_model =\n      \"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\";\n\n  config.model.matcha.vocoder = \"./vocos-22khz-univ.onnx\";\n\n  config.model.matcha.tokens = \"./matcha-icefall-en_US-ljspeech/tokens.txt\";\n\n  config.model.matcha.data_dir =\n      \"./matcha-icefall-en_US-ljspeech/espeak-ng-data\";\n\n  config.model.num_threads = 1;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  const char *filename = \"./generated-matcha-en.wav\";\n  const char *text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.sid = 0;\n  cfg.speed = 1.0f;  // larger -> faster in speech speed\n  cfg.silence_scale = config.silence_scale;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", cfg.sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/matcha-tts-zh-c-api.c",
    "content": "// c-api-examples/matcha-tts-zh-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for Chinese TTS with MatchaTTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n./matcha-tts-zh-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.matcha.acoustic_model =\n      \"./matcha-icefall-zh-baker/model-steps-3.onnx\";\n  config.model.matcha.vocoder = \"./vocos-22khz-univ.onnx\";\n  config.model.matcha.lexicon = \"./matcha-icefall-zh-baker/lexicon.txt\";\n  config.model.matcha.tokens = \"./matcha-icefall-zh-baker/tokens.txt\";\n  config.model.matcha.dict_dir = \"./matcha-icefall-zh-baker/dict\";\n  config.model.num_threads = 1;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  // clang-format off\n  config.rule_fsts = \"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\";\n  // clang-format on\n\n  const char *filename = \"./generated-matcha-zh.wav\";\n  const char *text =\n      \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如\"\n      \"涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感\"\n      \"受着生命的奇迹与温柔.\"\n      
\"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; \"\n      \"经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.sid = 0;\n  cfg.speed = 1.0f;  // larger -> faster in speech speed\n  cfg.silence_scale = config.silence_scale;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", cfg.sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/medasr-ctc-c-api.c",
    "content": "// c-api-examples/medasr-ctc-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use MedASR with sherpa-onnx's C API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n*/\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\";\n  const char *model_filename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\";\n  const char *tokens_filename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\";\n  // clang-format on\n\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineMedAsrCtcModelConfig medasr;\n  memset(&medasr, 0, sizeof(medasr));\n  medasr.model = model_filename;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.medasr = medasr;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  
if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/moonshine-c-api.c",
    "content": "// c-api-examples/moonshine-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Moonshine tiny with sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\";\n  const char *preprocessor =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\";\n  const char *encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\";\n  const char *uncached_decoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\";\n  const char *cached_decoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\";\n  const char *tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.tokens = tokens;\n  offline_model_config.moonshine.preprocessor = preprocessor;\n  offline_model_config.moonshine.encoder = encoder;\n  offline_model_config.moonshine.uncached_decoder = uncached_decoder;\n  offline_model_config.moonshine.cached_decoder = cached_decoder;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 
0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/moonshine-v2-c-api.c",
    "content": "// c-api-examples/moonshine-v2-c-api.c\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Moonshine v2 with sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n// tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n// rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\";\n  const char *encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\";\n  const char *merged_decoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\";\n  const char *tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\";\n  // clang-format on\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.tokens = tokens;\n  offline_model_config.moonshine.encoder = encoder;\n  offline_model_config.moonshine.merged_decoder = merged_decoder;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer 
*recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/nemo-canary-c-api.c",
    "content": "// c-api-examples/nemo-canary-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// We assume you have pre-downloaded the Nemo Canary model\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n// An example is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n// tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n// rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n//\n// clang-format on\n//\n// see https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html\n// for details\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/de.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n\n  // set debug to 1 to view more logs\n  offline_model_config.debug = 0;\n\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.canary.encoder = encoder_filename;\n  offline_model_config.canary.decoder = decoder_filename;\n\n  // so it output punctuations and cases\n  
offline_model_config.canary.use_pnc = 1;\n\n  offline_model_config.canary.src_lang = \"de\";\n\n  // since there is a German audio, you can set tgt_lang to en or de\n  offline_model_config.canary.tgt_lang = \"en\";\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n\n    SherpaOnnxFreeWave(wave);\n\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text (English): %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n\n  // now output German text\n  recognizer_config.model_config.canary.tgt_lang = \"de\";\n  SherpaOnnxOfflineRecognizerSetConfig(recognizer, &recognizer_config);\n\n  stream = SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  result = SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text (German): %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  
SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/nemo-parakeet-c-api.c",
    "content": "// c-api-examples/nemo-parakeet-c-api.c\n// Example using the C API and sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8 model\n// Prints recognized text, per-token timestamps, and durations\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/test_wavs/en.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/encoder.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/decoder.int8.onnx\";\n  const char *joiner_filename =\n      \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/joiner.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  if (!SherpaOnnxFileExists(wav_filename)) {\n    fprintf(stderr, \"File not found: %s\\n\", wav_filename);\n    return -1;\n  }\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read or parse %s (not a valid mono 16-bit WAVE file)\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 0;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.transducer.encoder = encoder_filename;\n  offline_model_config.transducer.decoder = decoder_filename;\n  offline_model_config.transducer.joiner = joiner_filename;\n\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer 
=\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n  if (stream == NULL) {\n    fprintf(stderr, \"Failed to create offline stream.\\n\");\n    SherpaOnnxDestroyOfflineRecognizer(recognizer);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  printf(\"Recognized text: %s\\n\", result->text);\n\n  if (result->tokens_arr && result->timestamps && result->durations) {\n    printf(\"Token\\tTimestamp\\tDuration\\n\");\n    for (int32_t i = 0; i < result->count; ++i) {\n      printf(\"%s\\t%.2f\\t%.2f\\n\", result->tokens_arr[i], result->timestamps[i], result->durations[i]);\n    }\n  } else {\n    printf(\"Timestamps or durations not available.\\n\");\n  }\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/offline-speaker-diarization-c-api.c",
    "content": "// c-api-examples/offline-speaker-diarization-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to implement speaker diarization with\n// sherpa-onnx's C API.\n\n// clang-format off\n/*\nUsage:\n\nStep 1: Download a speaker segmentation model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\nStep 2: Download a speaker embedding extractor model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\nStep 3. Download test wave files\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available test wave files. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nStep 4. 
Run it\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(int32_t num_processed_chunks,\n                                int32_t num_total_chunks, void *arg) {\n  float progress = 100.0 * num_processed_chunks / num_total_chunks;\n  fprintf(stderr, \"progress %.2f%%\\n\", progress);\n\n  // the return value is currently ignored\n  return 0;\n}\n\nint main() {\n  // Please see the comments at the start of this file for how to download\n  // the .onnx file and .wav files below\n  const char *segmentation_model =\n      \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\";\n\n  const char *embedding_extractor_model =\n      \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\n\n  const char *wav_filename = \"./0-four-speakers-zh.wav\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineSpeakerDiarizationConfig config;\n  memset(&config, 0, sizeof(config));\n\n  config.segmentation.pyannote.model = segmentation_model;\n  config.embedding.model = embedding_extractor_model;\n\n  // the test wave ./0-four-speakers-zh.wav has 4 speakers, so\n  // we set num_clusters to 4\n  //\n  config.clustering.num_clusters = 4;\n  // If you don't know the number of speakers in the test wave file, please\n  // use\n  // config.clustering.threshold = 0.5; // You need to tune this threshold\n\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      SherpaOnnxCreateOfflineSpeakerDiarization(&config);\n\n  if (!sd) {\n    fprintf(stderr, \"Failed to initialize offline speaker diarization\\n\");\n    return -1;\n  }\n\n  if (SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(sd) !=\n      wave->sample_rate) {\n    fprintf(\n        stderr,\n        \"Expected sample rate: %d. 
Actual sample rate from the wave file: %d\\n\",\n        SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(sd),\n        wave->sample_rate);\n    // Only sd and wave exist at this point, so release them here instead of\n    // jumping past the declarations of result and segments below.\n    SherpaOnnxDestroyOfflineSpeakerDiarization(sd);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineSpeakerDiarizationResult *result =\n      SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(\n          sd, wave->samples, wave->num_samples, ProgressCallback, NULL);\n  if (!result) {\n    fprintf(stderr, \"Failed to do speaker diarization\\n\");\n    SherpaOnnxDestroyOfflineSpeakerDiarization(sd);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  int32_t num_segments =\n      SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(result);\n\n  const SherpaOnnxOfflineSpeakerDiarizationSegment *segments =\n      SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(result);\n\n  for (int32_t i = 0; i != num_segments; ++i) {\n    fprintf(stderr, \"%.3f -- %.3f speaker_%02d\\n\", segments[i].start,\n            segments[i].end, segments[i].speaker);\n  }\n\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment(segments);\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult(result);\n  SherpaOnnxDestroyOfflineSpeakerDiarization(sd);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/offline-tts-c-api.c",
    "content": "// c-api-examples/offline-tts-c-api.c\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// to convert text to speech using an offline model.\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"cargs.h\"\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic struct cag_option options[] = {\n    {.identifier = 'h',\n     .access_letters = \"h\",\n     .access_name = \"help\",\n     .description = \"Show help\"},\n    {.access_name = \"vits-model\",\n     .value_name = \"/path/to/xxx.onnx\",\n     .identifier = '0',\n     .description = \"Path to VITS model\"},\n    {.access_name = \"vits-lexicon\",\n     .value_name = \"/path/to/lexicon.txt\",\n     .identifier = '1',\n     .description = \"Path to lexicon.txt for VITS models\"},\n    {.access_name = \"vits-tokens\",\n     .value_name = \"/path/to/tokens.txt\",\n     .identifier = '2',\n     .description = \"Path to tokens.txt for VITS models\"},\n    {.access_name = \"vits-noise-scale\",\n     .value_name = \"0.667\",\n     .identifier = '3',\n     .description = \"noise_scale for VITS models\"},\n    {.access_name = \"vits-noise-scale-w\",\n     .value_name = \"0.8\",\n     .identifier = '4',\n     .description = \"noise_scale_w for VITS models\"},\n    {.access_name = \"vits-length-scale\",\n     .value_name = \"1.0\",\n     .identifier = '5',\n     .description =\n         \"length_scale for VITS models. Default to 1. You can tune it \"\n         \"to change the speech speed. small -> faster; large -> slower. 
\"},\n    {.access_name = \"num-threads\",\n     .value_name = \"1\",\n     .identifier = '6',\n     .description = \"Number of threads\"},\n    {.access_name = \"provider\",\n     .value_name = \"cpu\",\n     .identifier = '7',\n     .description = \"Provider: cpu (default), cuda, coreml\"},\n    {.access_name = \"debug\",\n     .value_name = \"0\",\n     .identifier = '8',\n     .description = \"1 to show debug messages while loading the model\"},\n    {.access_name = \"sid\",\n     .value_name = \"0\",\n     .identifier = '9',\n     .description = \"Speaker ID. Default to 0. Note it is not used for \"\n                    \"single-speaker models.\"},\n    {.access_name = \"output-filename\",\n     .value_name = \"./generated.wav\",\n     .identifier = 'a',\n     .description =\n         \"Filename to save the generated audio. Default to ./generated.wav\"},\n\n    {.access_name = \"tts-rule-fsts\",\n     .value_name = \"/path/to/rule.fst\",\n     .identifier = 'b',\n     .description = \"If not empty, it contains a list of rule FST filenames. \"\n                    \"Multiple filenames are separated by a comma and they are \"\n                    \"applied from left to right. An example value: \"\n                    \"rule1.fst,rule2.fst,rule3.fst\"},\n\n    {.access_name = \"max-num-sentences\",\n     .value_name = \"2\",\n     .identifier = 'c',\n     .description = \"Maximum number of sentences that we process at a time. \"\n                    \"This is to avoid OOM for very long input text. \"\n                    \"If you set it to -1, then we process all sentences in a \"\n                    \"single batch.\"},\n\n    {.access_name = \"vits-data-dir\",\n     .value_name = \"/path/to/espeak-ng-data\",\n     .identifier = 'd',\n     .description =\n         \"Path to espeak-ng-data. 
If it is given, --vits-lexicon is ignored\"},\n\n};\n\nstatic void ShowUsage() {\n  const char *kUsageMessage =\n      \"Offline text-to-speech with sherpa-onnx C API\"\n      \"\\n\"\n      \"./offline-tts-c-api \\\\\\n\"\n      \" --vits-model=/path/to/model.onnx \\\\\\n\"\n      \" --vits-lexicon=/path/to/lexicon.txt \\\\\\n\"\n      \" --vits-tokens=/path/to/tokens.txt \\\\\\n\"\n      \" --sid=0 \\\\\\n\"\n      \" --output-filename=./generated.wav \\\\\\n\"\n      \" 'some text within single quotes on linux/macos or use double quotes on \"\n      \"windows'\\n\"\n      \"\\n\"\n      \"It will generate a file ./generated.wav as specified by \"\n      \"--output-filename.\\n\"\n      \"\\n\"\n      \"You can download a test model from\\n\"\n      \"https://huggingface.co/csukuangfj/vits-ljs\\n\"\n      \"\\n\"\n      \"For instance, you can use:\\n\"\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/vits-ljs.onnx\\n\"\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/lexicon.txt\\n\"\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/tokens.txt\\n\"\n      \"\\n\"\n      \"./offline-tts-c-api \\\\\\n\"\n      \"  --vits-model=./vits-ljs.onnx \\\\\\n\"\n      \"  --vits-lexicon=./lexicon.txt \\\\\\n\"\n      \"  --vits-tokens=./tokens.txt \\\\\\n\"\n      \"  --sid=0 \\\\\\n\"\n      \"  --output-filename=./generated.wav \\\\\\n\"\n      \"  'liliana, the most beautiful and lovely assistant of our team!'\\n\"\n      \"\\n\"\n      \"Please see\\n\"\n      \"https://k2-fsa.github.io/sherpa/onnx/tts/index.html\\n\"\n      \"for details.\\n\\n\";\n\n  fprintf(stderr, \"%s\", kUsageMessage);\n  cag_option_print(options, CAG_ARRAY_SIZE(options), stderr);\n  exit(0);\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  cag_option_context context;\n  char identifier;\n  const char *value;\n\n  cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);\n\n  
SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n\n  int32_t sid = 0;\n  const char *filename = strdup(\"./generated.wav\");\n  const char *text;\n\n  while (cag_option_fetch(&context)) {\n    identifier = cag_option_get(&context);\n    value = cag_option_get_value(&context);\n    switch (identifier) {\n      case '0':\n        config.model.vits.model = value;\n        break;\n      case '1':\n        config.model.vits.lexicon = value;\n        break;\n      case '2':\n        config.model.vits.tokens = value;\n        break;\n      case '3':\n        config.model.vits.noise_scale = atof(value);\n        break;\n      case '4':\n        config.model.vits.noise_scale_w = atof(value);\n        break;\n      case '5':\n        config.model.vits.length_scale = atof(value);\n        break;\n      case '6':\n        config.model.num_threads = atoi(value);\n        break;\n      case '7':\n        config.model.provider = value;\n        break;\n      case '8':\n        config.model.debug = atoi(value);\n        break;\n      case '9':\n        sid = atoi(value);\n        break;\n      case 'a':\n        free((void *)filename);\n        filename = strdup(value);\n        break;\n      case 'b':\n        config.rule_fsts = value;\n        break;\n      case 'c':\n        config.max_num_sentences = atoi(value);\n        break;\n      case 'd':\n        config.model.vits.data_dir = value;\n        break;\n      case '?':\n        fprintf(stderr, \"Unknown option\\n\");\n        // fall through\n      case 'h':\n        // fall through\n      default:\n        ShowUsage();\n    }\n  }\n\n  if (!config.model.vits.model) {\n    fprintf(stderr, \"Please provide --vits-model\\n\");\n    ShowUsage();\n  }\n\n  if (!config.model.vits.tokens) {\n    fprintf(stderr, \"Please provide --vits-tokens\\n\");\n    ShowUsage();\n  }\n\n  if (!config.model.vits.data_dir && !config.model.vits.lexicon) {\n    fprintf(stderr, \"Please 
provide --vits-data-dir or --vits-lexicon\\n\");\n    ShowUsage();\n  }\n\n  // the last arg is the text\n  text = argv[argc - 1];\n  if (text[0] == '-') {\n    fprintf(stderr, \"\\n***Please input your text!***\\n\\n\");\n    fprintf(stderr, \"\\n---------------Usage---------------\\n\\n\");\n    ShowUsage();\n  }\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  if (!tts) {\n    fprintf(stderr, \"Failed to create offline TTS\\n\");\n    free((void *)filename);\n    return -1;\n  }\n\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.silence_scale = 0.2f;\n  cfg.sid = sid;\n  cfg.speed = 1.0f;\n\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n  if (!audio) {\n    fprintf(stderr, \"Failed to generate audio\\n\");\n    SherpaOnnxDestroyOfflineTts(tts);\n    free((void *)filename);\n    return -1;\n  }\n\n  SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename);\n\n  free((void *)filename);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/omnilingual-asr-ctc-c-api.c",
    "content": "// c-api-examples/omnilingual-asr-ctc-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Omnilingual ASR with sherpa-onnx's C API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n*/\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\";\n  const char *model_filename = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\";\n  const char *tokens_filename = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\";\n  // clang-format on\n\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineOmnilingualAsrCtcModelConfig omnilingual;\n  memset(&omnilingual, 0, sizeof(omnilingual));\n  omnilingual.model = model_filename;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.omnilingual = omnilingual;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  
recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/online-speech-enhancement-dpdfnet-c-api.c",
    "content": "// c-api-examples/online-speech-enhancement-dpdfnet-c-api.c\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded model\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// or\n// https://huggingface.co/Ceva-IP/DPDFNet\n//\n// An example command to download\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n\n#include <stdint.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t AppendSamples(float **samples, int32_t *num_samples,\n                             const SherpaOnnxDenoisedAudio *audio) {\n  float *p = NULL;\n\n  if (!audio || audio->n == 0) {\n    return 1;\n  }\n\n  p = (float *)realloc(*samples, sizeof(float) * (*num_samples + audio->n));\n  if (!p) {\n    fprintf(stderr, \"Failed to allocate memory for output samples\\n\");\n    return 0;\n  }\n\n  memcpy(p + *num_samples, audio->samples, sizeof(float) * audio->n);\n  *samples = p;\n  *num_samples += audio->n;\n  return 1;\n}\n\nint32_t main() {\n  SherpaOnnxOnlineSpeechDenoiserConfig config;\n  const char *model_filename = \"./dpdfnet_baseline.onnx\";\n  const char *wav_filename = \"./inp_16k.wav\";\n  const char *out_wave_filename = \"./enhanced-online-dpdfnet.wav\";\n  float *samples = NULL;\n  int32_t num_samples = 0;\n\n  memset(&config, 0, sizeof(config));\n  config.model.dpdfnet.model = model_filename;\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      SherpaOnnxCreateOnlineSpeechDenoiser(&config);\n  if (!sd) {\n    fprintf(stderr, \"Please check your config\\n\");\n    return -1;\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (!wave) {\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    
fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  int32_t frame_shift = SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(sd);\n  for (int32_t start = 0; start < wave->num_samples; start += frame_shift) {\n    int32_t n = frame_shift;\n    if (start + n > wave->num_samples) {\n      n = wave->num_samples - start;\n    }\n\n    const SherpaOnnxDenoisedAudio *audio = SherpaOnnxOnlineSpeechDenoiserRun(\n        sd, wave->samples + start, n, wave->sample_rate);\n    int32_t ok = AppendSamples(&samples, &num_samples, audio);\n    SherpaOnnxDestroyDenoisedAudio(audio);\n    if (!ok) {\n      free(samples);\n      SherpaOnnxFreeWave(wave);\n      SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n      return -1;\n    }\n  }\n\n  const SherpaOnnxDenoisedAudio *tail = SherpaOnnxOnlineSpeechDenoiserFlush(sd);\n  int32_t sample_rate = tail ? tail->sample_rate\n                             : SherpaOnnxOnlineSpeechDenoiserGetSampleRate(sd);\n  int32_t ok = AppendSamples(&samples, &num_samples, tail);\n  SherpaOnnxDestroyDenoisedAudio(tail);\n  if (!ok) {\n    free(samples);\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    return -1;\n  }\n\n  if (num_samples == 0) {\n    fprintf(stderr, \"No denoised samples were produced\\n\");\n    free(samples);\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    return -1;\n  }\n\n  SherpaOnnxWriteWave(samples, num_samples, sample_rate, out_wave_filename);\n\n  free(samples);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n\n  fprintf(stdout, \"Saved to %s\\n\", out_wave_filename);\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/online-speech-enhancement-gtcrn-c-api.c",
    "content": "// c-api-examples/online-speech-enhancement-gtcrn-c-api.c\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded model\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// An example command to download\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n\n#include <stdint.h>\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t AppendSamples(float **samples, int32_t *num_samples,\n                             const SherpaOnnxDenoisedAudio *audio) {\n  float *p = NULL;\n\n  if (!audio || audio->n == 0) {\n    return 1;\n  }\n\n  p = (float *)realloc(*samples, sizeof(float) * (*num_samples + audio->n));\n  if (!p) {\n    fprintf(stderr, \"Failed to allocate memory for output samples\\n\");\n    return 0;\n  }\n\n  memcpy(p + *num_samples, audio->samples, sizeof(float) * audio->n);\n  *samples = p;\n  *num_samples += audio->n;\n  return 1;\n}\n\nint32_t main() {\n  SherpaOnnxOnlineSpeechDenoiserConfig config;\n  const char *model_filename = \"./gtcrn_simple.onnx\";\n  const char *wav_filename = \"./inp_16k.wav\";\n  const char *out_wave_filename = \"./enhanced-online-gtcrn.wav\";\n  float *samples = NULL;\n  int32_t num_samples = 0;\n\n  memset(&config, 0, sizeof(config));\n  config.model.gtcrn.model = model_filename;\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      SherpaOnnxCreateOnlineSpeechDenoiser(&config);\n  if (!sd) {\n    fprintf(stderr, \"Please check your config\\n\");\n    return -1;\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (!wave) {\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    
return -1;\n  }\n\n  int32_t frame_shift = SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(sd);\n  for (int32_t start = 0; start < wave->num_samples; start += frame_shift) {\n    int32_t n = frame_shift;\n    if (start + n > wave->num_samples) {\n      n = wave->num_samples - start;\n    }\n\n    const SherpaOnnxDenoisedAudio *audio = SherpaOnnxOnlineSpeechDenoiserRun(\n        sd, wave->samples + start, n, wave->sample_rate);\n    int32_t ok = AppendSamples(&samples, &num_samples, audio);\n    SherpaOnnxDestroyDenoisedAudio(audio);\n    if (!ok) {\n      free(samples);\n      SherpaOnnxFreeWave(wave);\n      SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n      return -1;\n    }\n  }\n\n  const SherpaOnnxDenoisedAudio *tail = SherpaOnnxOnlineSpeechDenoiserFlush(sd);\n  int32_t sample_rate = tail ? tail->sample_rate\n                             : SherpaOnnxOnlineSpeechDenoiserGetSampleRate(sd);\n  int32_t ok = AppendSamples(&samples, &num_samples, tail);\n  SherpaOnnxDestroyDenoisedAudio(tail);\n  if (!ok) {\n    free(samples);\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    return -1;\n  }\n\n  if (num_samples == 0) {\n    fprintf(stderr, \"No denoised samples were produced\\n\");\n    free(samples);\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n    return -1;\n  }\n\n  SherpaOnnxWriteWave(samples, num_samples, sample_rate, out_wave_filename);\n\n  free(samples);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n\n  fprintf(stdout, \"Saved to %s\\n\", out_wave_filename);\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/paraformer-c-api.c",
    "content": "// c-api-examples/paraformer-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use non-streaming Paraformer with sherpa-onnx's\n// C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n// tar xvf sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n// rm sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-paraformer-zh-small-2024-03-09/test_wavs/0.wav\";\n  const char *model_filename =\n      \"sherpa-onnx-paraformer-zh-small-2024-03-09/model.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-paraformer-zh-small-2024-03-09/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Paraformer config\n  SherpaOnnxOfflineParaformerModelConfig paraformer_config;\n  memset(&paraformer_config, 0, sizeof(paraformer_config));\n  paraformer_config.model = model_filename;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.paraformer = paraformer_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const 
SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/pocket-tts-en-c-api.c",
    "content": "// c-api-examples/pocket-tts-en-c-api.c\n//\n// Copyright (c)  2026  Xiaoyingtao Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for English TTS with Pocket TTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n./pocket-tts-en-c-api\n\n */\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.pocket.lm_flow =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\n  config.model.pocket.lm_main =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\n  config.model.pocket.encoder =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\n  config.model.pocket.decoder =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\n  config.model.pocket.text_conditioner =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\n  config.model.pocket.vocab_json =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\n  config.model.pocket.token_scores_json =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\n  // Voice embedding cache capacity (default: 50)\n  // Increase this if you have many different reference audios to avoid\n  // recomputing voice embeddings\n  config.model.pocket.voice_embedding_cache_capacity = 50;\n\n  
config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  const char *filename = \"./generated-pocket-en.wav\";\n  const char *text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  if (!tts) {\n    fprintf(stderr, \"Failed to create offline TTS\\n\");\n    return -1;\n  }\n  float speed = 1.0;  // larger -> faster in speech speed\n  SherpaOnnxGenerationConfig cfg = {0};\n  const char *reference_audio_file =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\n  const SherpaOnnxWave *wave = NULL;\n  wave = SherpaOnnxReadWave(reference_audio_file);\n  if (!wave) {\n    fprintf(stderr, \"Failed to read %s\\n\", reference_audio_file);\n    SherpaOnnxDestroyOfflineTts(tts);\n    return -1;\n  }\n  cfg.reference_audio = wave->samples;\n  cfg.reference_audio_len = wave->num_samples;\n  cfg.reference_sample_rate = wave->sample_rate;\n  // Extra parameters passed as JSON string\n  // - max_reference_audio_len: maximum length of reference audio in seconds\n  // - seed: random seed for reproducibility (optional, -1 for random)\n  cfg.extra = \"{\\\"max_reference_audio_len\\\": 10.0, \\\"seed\\\": 42}\";\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             
NULL);\n#endif\n\n  if (wave) SherpaOnnxFreeWave(wave);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n\n  if (audio) {\n    SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n    fprintf(stderr, \"Saved to: %s\\n\", filename);\n    SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  }\n\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/sense-voice-c-api.c",
    "content": "// c-api-examples/sense-voice-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use SenseVoice with sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/en.wav\";\n  const char *model_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  const char *tokens_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n  const char *language = \"auto\";\n  const char *provider = \"cpu\";\n  int32_t use_inverse_text_normalization = 1;\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineSenseVoiceModelConfig sense_voice_config;\n  memset(&sense_voice_config, 0, sizeof(sense_voice_config));\n  sense_voice_config.model = model_filename;\n  sense_voice_config.language = language;\n  sense_voice_config.use_itn = use_inverse_text_normalization;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.sense_voice = sense_voice_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  
memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/sense-voice-with-hr-c-api.c",
    "content": "// c-api-examples/sense-voice-with-hr-c-api.c\n//\n// Copyright (c)  2024-2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use SenseVoice with sherpa-onnx's C API\n// with homophone replacer.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n// tar xf dict.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"./test-hr.wav\";\n  const char *model_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  const char *tokens_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n  const char *language = \"auto\";\n  const char *provider = \"cpu\";\n  int32_t use_inverse_text_normalization = 1;\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineSenseVoiceModelConfig sense_voice_config;\n  memset(&sense_voice_config, 0, sizeof(sense_voice_config));\n  sense_voice_config.model = model_filename;\n  sense_voice_config.language = language;\n  sense_voice_config.use_itn = use_inverse_text_normalization;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  
memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.sense_voice = sense_voice_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n  recognizer_config.hr.dict_dir = \"./dict\";\n  recognizer_config.hr.lexicon = \"./lexicon.txt\";\n\n  // Please see\n  // https://colab.research.google.com/drive/1jEaS3s8FbRJIcVQJv2EQx19EM_mnuARi?usp=sharing\n  // for how to generate your own replace.fst\n  recognizer_config.hr.rule_fsts = \"./replace.fst\";\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/speaker-identification-c-api.c",
    "content": "// c-api-examples/speaker-identification-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n// We assume you have pre-downloaded the speaker embedding extractor model\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n//\n// An example command to download\n// \"3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\"\n// is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n//\n// clang-format on\n//\n// Also, please download the test wave files from\n//\n// https://github.com/csukuangfj/sr-data\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic const float *ComputeEmbedding(\n    const SherpaOnnxSpeakerEmbeddingExtractor *ex, const char *wav_filename) {\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    exit(-1);\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxSpeakerEmbeddingExtractorCreateStream(ex);\n\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,\n                                       wave->num_samples);\n  SherpaOnnxOnlineStreamInputFinished(stream);\n\n  if (!SherpaOnnxSpeakerEmbeddingExtractorIsReady(ex, stream)) {\n    fprintf(stderr, \"The input wave file %s is too short!\\n\", wav_filename);\n    exit(-1);\n  }\n\n  // we will free `v` outside of this function\n  const float *v =\n      SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(ex, stream);\n\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxFreeWave(wave);\n\n  // Remember to free v to avoid memory leak\n  return v;\n}\n\nint32_t main() {\n  SherpaOnnxSpeakerEmbeddingExtractorConfig config;\n\n  memset(&config, 0, sizeof(config));\n\n  // please download the model from\n  // 
https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n  config.model = \"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\";\n\n  config.num_threads = 1;\n  config.debug = 0;\n  config.provider = \"cpu\";\n\n  const SherpaOnnxSpeakerEmbeddingExtractor *ex =\n      SherpaOnnxCreateSpeakerEmbeddingExtractor(&config);\n  if (!ex) {\n    fprintf(stderr, \"Failed to create speaker embedding extractor\\n\");\n    return -1;\n  }\n\n  int32_t dim = SherpaOnnxSpeakerEmbeddingExtractorDim(ex);\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      SherpaOnnxCreateSpeakerEmbeddingManager(dim);\n\n  // Please download the test data from\n  // https://github.com/csukuangfj/sr-data\n  const char *spk1_1 = \"./sr-data/enroll/fangjun-sr-1.wav\";\n  const char *spk1_2 = \"./sr-data/enroll/fangjun-sr-2.wav\";\n  const char *spk1_3 = \"./sr-data/enroll/fangjun-sr-3.wav\";\n\n  const char *spk2_1 = \"./sr-data/enroll/leijun-sr-1.wav\";\n  const char *spk2_2 = \"./sr-data/enroll/leijun-sr-2.wav\";\n\n  const float *spk1_vec[4] = {NULL};\n  spk1_vec[0] = ComputeEmbedding(ex, spk1_1);\n  spk1_vec[1] = ComputeEmbedding(ex, spk1_2);\n  spk1_vec[2] = ComputeEmbedding(ex, spk1_3);\n\n  const float *spk2_vec[3] = {NULL};\n  spk2_vec[0] = ComputeEmbedding(ex, spk2_1);\n  spk2_vec[1] = ComputeEmbedding(ex, spk2_2);\n\n  if (!SherpaOnnxSpeakerEmbeddingManagerAddList(manager, \"fangjun\", spk1_vec)) {\n    fprintf(stderr, \"Failed to register fangjun\\n\");\n    exit(-1);\n  }\n\n  if (!SherpaOnnxSpeakerEmbeddingManagerContains(manager, \"fangjun\")) {\n    fprintf(stderr, \"Failed to find fangjun\\n\");\n    exit(-1);\n  }\n\n  if (!SherpaOnnxSpeakerEmbeddingManagerAddList(manager, \"leijun\", spk2_vec)) {\n    fprintf(stderr, \"Failed to register leijun\\n\");\n    exit(-1);\n  }\n\n  if (!SherpaOnnxSpeakerEmbeddingManagerContains(manager, \"leijun\")) {\n    fprintf(stderr, \"Failed to find leijun\\n\");\n    exit(-1);\n  }\n\n  if 
(SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(manager) != 2) {\n    fprintf(stderr, \"There should be two speakers: fangjun and leijun\\n\");\n    exit(-1);\n  }\n\n  const char *const *all_speakers =\n      SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(manager);\n  const char *const *p = all_speakers;\n  fprintf(stderr, \"list of registered speakers\\n-----\\n\");\n  while (p[0]) {\n    fprintf(stderr, \"speaker: %s\\n\", p[0]);\n    ++p;\n  }\n  fprintf(stderr, \"----\\n\");\n\n  SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(all_speakers);\n\n  const char *test1 = \"./sr-data/test/fangjun-test-sr-1.wav\";\n  const char *test2 = \"./sr-data/test/leijun-test-sr-1.wav\";\n  const char *test3 = \"./sr-data/test/liudehua-test-sr-1.wav\";\n\n  const float *v1 = ComputeEmbedding(ex, test1);\n  const float *v2 = ComputeEmbedding(ex, test2);\n  const float *v3 = ComputeEmbedding(ex, test3);\n\n  float threshold = 0.6;\n\n  const char *name1 =\n      SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v1, threshold);\n  if (name1) {\n    fprintf(stderr, \"%s: Found %s\\n\", test1, name1);\n    SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name1);\n  } else {\n    fprintf(stderr, \"%s: Not found\\n\", test1);\n  }\n\n  const char *name2 =\n      SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v2, threshold);\n  if (name2) {\n    fprintf(stderr, \"%s: Found %s\\n\", test2, name2);\n    SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name2);\n  } else {\n    fprintf(stderr, \"%s: Not found\\n\", test2);\n  }\n\n  const char *name3 =\n      SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v3, threshold);\n  if (name3) {\n    fprintf(stderr, \"%s: Found %s\\n\", test3, name3);\n    SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name3);\n  } else {\n    fprintf(stderr, \"%s: Not found\\n\", test3);\n  }\n\n  int32_t ok = SherpaOnnxSpeakerEmbeddingManagerVerify(manager, \"fangjun\", v1,\n                                                       threshold);\n  if (ok) {\n    
fprintf(stderr, \"%s matches fangjun\\n\", test1);\n  } else {\n    fprintf(stderr, \"%s does NOT match fangjun\\n\", test1);\n  }\n\n  ok = SherpaOnnxSpeakerEmbeddingManagerVerify(manager, \"fangjun\", v2,\n                                               threshold);\n  if (ok) {\n    fprintf(stderr, \"%s matches fangjun\\n\", test2);\n  } else {\n    fprintf(stderr, \"%s does NOT match fangjun\\n\", test2);\n  }\n\n  fprintf(stderr, \"Removing fangjun\\n\");\n  if (!SherpaOnnxSpeakerEmbeddingManagerRemove(manager, \"fangjun\")) {\n    fprintf(stderr, \"Failed to remove fangjun\\n\");\n    exit(-1);\n  }\n\n  if (SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(manager) != 1) {\n    fprintf(stderr, \"There should be only 1 speaker left\\n\");\n    exit(-1);\n  }\n\n  name1 = SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v1, threshold);\n  if (name1) {\n    fprintf(stderr, \"%s: Found %s\\n\", test1, name1);\n    SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name1);\n  } else {\n    fprintf(stderr, \"%s: Not found\\n\", test1);\n  }\n\n  fprintf(stderr, \"Removing leijun\\n\");\n  if (!SherpaOnnxSpeakerEmbeddingManagerRemove(manager, \"leijun\")) {\n    fprintf(stderr, \"Failed to remove leijun\\n\");\n    exit(-1);\n  }\n\n  if (SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(manager) != 0) {\n    fprintf(stderr, \"There should be no speakers left\\n\");\n    exit(-1);\n  }\n\n  name2 = SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v2, threshold);\n  if (name2) {\n    fprintf(stderr, \"%s: Found %s\\n\", test2, name2);\n    SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name2);\n  } else {\n    fprintf(stderr, \"%s: Not found\\n\", test2);\n  }\n\n  all_speakers = SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(manager);\n\n  p = all_speakers;\n  fprintf(stderr, \"list of registered speakers\\n-----\\n\");\n  while (p[0]) {\n    fprintf(stderr, \"speaker: %s\\n\", p[0]);\n    ++p;\n  }\n  fprintf(stderr, \"----\\n\");\n\n  
SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(all_speakers);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v1);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v2);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v3);\n\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[0]);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[1]);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[2]);\n\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(spk2_vec[0]);\n  SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(spk2_vec[1]);\n\n  SherpaOnnxDestroySpeakerEmbeddingManager(manager);\n  SherpaOnnxDestroySpeakerEmbeddingExtractor(ex);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/speech-enhancement-dpdfnet-c-api.c",
    "content": "// c-api-examples/speech-enhancement-dpdfnet-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n//\n// We assume you have pre-downloaded model\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// or\n// https://huggingface.co/Ceva-IP/DPDFNet\n//\n//\n// An example command to download\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet4.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet8.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2_48khz_hr.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n//\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n#include <stdio.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxOfflineSpeechDenoiserConfig config;\n  const char *model_filename = \"./dpdfnet_baseline.onnx\";\n  const char *wav_filename = \"./inp_16k.wav\";\n  const char *out_wave_filename = \"./enhanced.wav\";\n\n  memset(&config, 0, sizeof(config));\n  config.model.dpdfnet.model = model_filename;\n\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      SherpaOnnxCreateOfflineSpeechDenoiser(&config);\n  if (!sd) {\n    fprintf(stderr, \"Please check your config\");\n    return -1;\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    SherpaOnnxDestroyOfflineSpeechDenoiser(sd);\n    fprintf(stderr, \"Failed to read 
%s\\n\", wav_filename);\n    return -1;\n  }\n\n  const SherpaOnnxDenoisedAudio *denoised = SherpaOnnxOfflineSpeechDenoiserRun(\n      sd, wave->samples, wave->num_samples, wave->sample_rate);\n\n  SherpaOnnxWriteWave(denoised->samples, denoised->n, denoised->sample_rate,\n                      out_wave_filename);\n\n  SherpaOnnxDestroyDenoisedAudio(denoised);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyOfflineSpeechDenoiser(sd);\n\n  fprintf(stdout, \"Saved to %s\\n\", out_wave_filename);\n}\n"
  },
  {
    "path": "c-api-examples/speech-enhancement-gtcrn-c-api.c",
    "content": "// c-api-examples/speech-enhancement-gtcrn-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n//\n// We assume you have pre-downloaded model\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n//\n// An example command to download\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n#include <stdio.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxOfflineSpeechDenoiserConfig config;\n  const char *model_filename = \"./gtcrn_simple.onnx\";\n  const char *wav_filename = \"./inp_16k.wav\";\n  const char *out_wave_filename = \"./enhanced.wav\";\n\n  memset(&config, 0, sizeof(config));\n  config.model.gtcrn.model = model_filename;\n\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      SherpaOnnxCreateOfflineSpeechDenoiser(&config);\n  if (!sd) {\n    fprintf(stderr, \"Please check your config\");\n    return -1;\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    SherpaOnnxDestroyOfflineSpeechDenoiser(sd);\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  const SherpaOnnxDenoisedAudio *denoised = SherpaOnnxOfflineSpeechDenoiserRun(\n      sd, wave->samples, wave->num_samples, wave->sample_rate);\n\n  SherpaOnnxWriteWave(denoised->samples, denoised->n, denoised->sample_rate,\n                      out_wave_filename);\n\n  SherpaOnnxDestroyDenoisedAudio(denoised);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroyOfflineSpeechDenoiser(sd);\n\n  fprintf(stdout, \"Saved to %s\\n\", out_wave_filename);\n}\n"
  },
  {
    "path": "c-api-examples/spoken-language-identification-c-api.c",
    "content": "// c-api-examples/spoken-language-identification-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n// We assume you have pre-downloaded the whisper multi-lingual models\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n// An example command to download the \"tiny\" whisper model is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n// tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n// rm sherpa-onnx-whisper-tiny.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  SherpaOnnxSpokenLanguageIdentificationConfig config;\n\n  memset(&config, 0, sizeof(config));\n\n  config.whisper.encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\";\n  config.whisper.decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\";\n  config.num_threads = 1;\n  config.debug = 1;\n  config.provider = \"cpu\";\n\n  const SherpaOnnxSpokenLanguageIdentification *slid =\n      SherpaOnnxCreateSpokenLanguageIdentification(&config);\n  if (!slid) {\n    fprintf(stderr, \"Failed to create spoken language identifier\");\n    return -1;\n  }\n\n  // You can find more test waves from\n  // https://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/tree/main/test_wavs\n  const char *wav_filename = \"./sherpa-onnx-whisper-tiny/test_wavs/0.wav\";\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  SherpaOnnxOfflineStream *stream =\n      SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(slid);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n\n  const SherpaOnnxSpokenLanguageIdentificationResult *result =\n      
SherpaOnnxSpokenLanguageIdentificationCompute(slid, stream);\n\n  fprintf(stderr, \"wav_filename: %s\\n\", wav_filename);\n  fprintf(stderr, \"Detected language: %s\\n\", result->lang);\n\n  SherpaOnnxDestroySpokenLanguageIdentificationResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxFreeWave(wave);\n  SherpaOnnxDestroySpokenLanguageIdentification(slid);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-ctc-buffered-tokens-c-api.c",
    "content": "// c-api-examples/streaming-ctc-buffered-tokens-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Luo Xiao\n\n//\n// This file demonstrates how to use streaming Zipformer2 CTC with sherpa-onnx's\n// C API, with tokens loaded from an in-memory buffer instead of from an\n// external file.\n// clang-format off\n// \n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// Reads the whole file into a malloc'ed buffer.\n// Returns the number of bytes read, or 0 on failure. On failure, *buffer_out\n// is set to NULL, so the caller can safely pass it to free().\nstatic size_t ReadFile(const char *filename, const char **buffer_out) {\n  *buffer_out = NULL;\n  FILE *file = fopen(filename, \"r\");\n  if (file == NULL) {\n    fprintf(stderr, \"Failed to open %s\\n\", filename);\n    return 0;\n  }\n  fseek(file, 0L, SEEK_END);\n  long size = ftell(file);\n  rewind(file);\n  if (size <= 0) {\n    fclose(file);\n    fprintf(stderr, \"%s is empty or its size is unknown\\n\", filename);\n    return 0;\n  }\n  *buffer_out = malloc(size);\n  if (*buffer_out == NULL) {\n    fclose(file);\n    fprintf(stderr, \"Memory error\\n\");\n    return 0;\n  }\n  size_t read_bytes = fread((void *)*buffer_out, 1, size, file);\n  if (read_bytes != (size_t)size) {\n    fprintf(stderr, \"Errors occurred in reading the file %s\\n\", filename);\n    free((void *)*buffer_out);\n    *buffer_out = NULL;\n    fclose(file);\n    return 0;\n  }\n  fclose(file);\n  return read_bytes;\n}\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/\"\n      \"DEV_T0000000000.wav\";\n  const char *model_filename =\n      \"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/\"\n      \"ctc-epoch-20-avg-1-chunk-16-left-128.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt\";\n  const char *provider 
= \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // reading tokens to buffers\n  const char *tokens_buf;\n  size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);\n  if (token_buf_size < 1) {\n    fprintf(stderr, \"Please check your tokens.txt!\\n\");\n    free((void *)tokens_buf);\n    return -1;\n  }\n\n  // Zipformer2Ctc config\n  SherpaOnnxOnlineZipformer2CtcModelConfig zipformer2_ctc_config;\n  memset(&zipformer2_ctc_config, 0, sizeof(zipformer2_ctc_config));\n  zipformer2_ctc_config.model = model_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens_buf = tokens_buf;\n  online_model_config.tokens_buf_size = token_buf_size;\n  online_model_config.zipformer2_ctc = zipformer2_ctc_config;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  free((void *)tokens_buf);\n  tokens_buf = NULL;\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. 
You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  
return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-hlg-decode-file-c-api.c",
    "content": "// c-api-examples/streaming-hlg-decode-file-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n/*\nWe use the following model as an example\n\n// clang-format off\n\nDownload the model from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\nbuild/bin/streaming-hlg-decode-file-c-api\n\n(The above model is from https://github.com/k2-fsa/icefall/pull/1557)\n*/\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  //\n  // Please download the model from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  const char *model = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\";\n  const char *tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\";\n  const char *graph = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\";\n  const char *wav_filename = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\";\n  // clang-format on\n\n  SherpaOnnxOnlineRecognizerConfig config;\n\n  memset(&config, 0, sizeof(config));\n  config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n  config.model_config.zipformer2_ctc.model = model;\n  config.model_config.tokens = tokens;\n  config.model_config.num_threads = 1;\n  config.model_config.provider = \"cpu\";\n  config.model_config.debug = 0;\n  config.ctc_fst_decoder_config.graph = graph;\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&config);\n  if (!recognizer) {\n    fprintf(stderr, \"Failed to create recognizer\");\n    exit(-1);\n  }\n\n  
const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    exit(-1);\n  }\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const 
SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-paraformer-buffered-tokens-c-api.c",
    "content": "// c-api-examples/streaming-paraformer-buffered-tokens-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Luo Xiao\n\n//\n// This file demonstrates how to use streaming Paraformer with sherpa-onnx's C\n// API, with tokens loaded from an in-memory buffer instead of from an\n// external file.\n// clang-format off\n// \n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n// tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n// rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// Reads the whole file into a malloc'ed buffer.\n// Returns the number of bytes read, or 0 on failure. On failure, *buffer_out\n// is set to NULL, so the caller can safely pass it to free().\nstatic size_t ReadFile(const char *filename, const char **buffer_out) {\n  *buffer_out = NULL;\n  FILE *file = fopen(filename, \"r\");\n  if (file == NULL) {\n    fprintf(stderr, \"Failed to open %s\\n\", filename);\n    return 0;\n  }\n  fseek(file, 0L, SEEK_END);\n  long size = ftell(file);\n  rewind(file);\n  if (size <= 0) {\n    fclose(file);\n    fprintf(stderr, \"%s is empty or its size is unknown\\n\", filename);\n    return 0;\n  }\n  *buffer_out = malloc(size);\n  if (*buffer_out == NULL) {\n    fclose(file);\n    fprintf(stderr, \"Memory error\\n\");\n    return 0;\n  }\n  size_t read_bytes = fread((void *)*buffer_out, 1, size, file);\n  if (read_bytes != (size_t)size) {\n    fprintf(stderr, \"Errors occurred in reading the file %s\\n\", filename);\n    free((void *)*buffer_out);\n    *buffer_out = NULL;\n    fclose(file);\n    return 0;\n  }\n  fclose(file);\n  return read_bytes;\n}\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  
const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // reading tokens to buffers\n  const char *tokens_buf;\n  size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);\n  if (token_buf_size < 1) {\n    fprintf(stderr, \"Please check your tokens.txt!\\n\");\n    free((void *)tokens_buf);\n    return -1;\n  }\n\n  // Paraformer config\n  SherpaOnnxOnlineParaformerModelConfig paraformer_config;\n  memset(&paraformer_config, 0, sizeof(paraformer_config));\n  paraformer_config.encoder = encoder_filename;\n  paraformer_config.decoder = decoder_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens_buf = tokens_buf;\n  online_model_config.tokens_buf_size = token_buf_size;\n  online_model_config.paraformer = paraformer_config;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  free((void *)tokens_buf);\n  tokens_buf = NULL;\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. 
You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  
return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-paraformer-c-api.c",
    "content": "// c-api-examples/streaming-paraformer-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Paraformer with sherpa-onnx's C\n// API.\n// clang-format off\n// \n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n// tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n// rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Paraformer config\n  SherpaOnnxOnlineParaformerModelConfig paraformer_config;\n  memset(&paraformer_config, 0, sizeof(paraformer_config));\n  paraformer_config.encoder = encoder_filename;\n  paraformer_config.decoder = decoder_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens = tokens_filename;\n  online_model_config.paraformer = paraformer_config;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  
memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? 
wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-t-one-ctc-c-api.c",
    "content": "// c-api-examples/streaming-t-one-ctc-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming T-one with sherpa-onnx's C\n// API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n// tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n// rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\";\n  const char *model =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\";\n  const char *tokens =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // T-one CTC config\n  SherpaOnnxOnlineToneCtcModelConfig t_one_ctc;\n  memset(&t_one_ctc, 0, sizeof(t_one_ctc));\n  t_one_ctc.model = model;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens = tokens;\n  online_model_config.t_one_ctc = t_one_ctc;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  if 
(recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  float left_paddings[2400] = {0};  // 0.3 seconds at 8 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, left_paddings,\n                                       2400);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.6 seconds at 8 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while 
(SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  SherpaOnnxFreeWave(wave);\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-zipformer-buffered-tokens-hotwords-c-api.c",
    "content": "// c-api-examples/streaming-zipformer-buffered-tokens-hotwords-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Luo Xiao\n\n//\n// This file demonstrates how to use streaming Zipformer with sherpa-onnx's C\n// API and with tokens and hotwords loaded from buffered strings instead of from\n// external files API.\n// clang-format off\n// \n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic size_t ReadFile(const char *filename, const char **buffer_out) {\n  FILE *file = fopen(filename, \"r\");\n  if (file == NULL) {\n    fprintf(stderr, \"Failed to open %s\\n\", filename);\n    return -1;\n  }\n  fseek(file, 0L, SEEK_END);\n  long size = ftell(file);\n  rewind(file);\n  *buffer_out = malloc(size);\n  if (*buffer_out == NULL) {\n    fclose(file);\n    fprintf(stderr, \"Memory error\\n\");\n    return -1;\n  }\n  size_t read_bytes = fread((void *)*buffer_out, 1, size, file);\n  if (read_bytes != size) {\n    printf(\"Errors occurred in reading the file %s\\n\", filename);\n    free((void *)*buffer_out);\n    *buffer_out = NULL;\n    fclose(file);\n    return -1;\n  }\n  fclose(file);\n  return read_bytes;\n}\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"encoder-epoch-99-avg-1.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n  const char *joiner_filename =\n      
\"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"joiner-epoch-99-avg-1.onnx\";\n  const char *provider = \"cpu\";\n  const char *modeling_unit = \"bpe\";\n  const char *tokens_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/tokens.txt\";\n  const char *hotwords_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/hotwords.txt\";\n  const char *bpe_vocab =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"bpe.vocab\";\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Read tokens and hotwords into buffers.\n  // ReadFile() returns (size_t)-1 on failure, so check for that explicitly;\n  // an unsigned value is never negative.\n  const char *tokens_buf = NULL;\n  size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);\n  if (token_buf_size == (size_t)-1 || token_buf_size == 0) {\n    fprintf(stderr, \"Please check your tokens.txt!\\n\");\n    free((void *)tokens_buf);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n  const char *hotwords_buf = NULL;\n  size_t hotwords_buf_size = ReadFile(hotwords_filename, &hotwords_buf);\n  if (hotwords_buf_size == (size_t)-1 || hotwords_buf_size == 0) {\n    fprintf(stderr, \"Please check your hotwords.txt!\\n\");\n    free((void *)hotwords_buf);\n    free((void *)tokens_buf);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  // Zipformer config\n  SherpaOnnxOnlineTransducerModelConfig zipformer_config;\n  memset(&zipformer_config, 0, sizeof(zipformer_config));\n  zipformer_config.encoder = encoder_filename;\n  zipformer_config.decoder = decoder_filename;\n  zipformer_config.joiner = joiner_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = provider;\n  online_model_config.tokens_buf = tokens_buf;\n  online_model_config.tokens_buf_size = token_buf_size;\n  online_model_config.transducer = zipformer_config;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  
memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"modified_beam_search\";\n  recognizer_config.model_config = online_model_config;\n  recognizer_config.hotwords_buf = hotwords_buf;\n  recognizer_config.hotwords_buf_size = hotwords_buf_size;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  free((void *)tokens_buf);\n  tokens_buf = NULL;\n  free((void *)hotwords_buf);\n  hotwords_buf = NULL;\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? 
wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-zipformer-c-api.c",
    "content": "// c-api-examples/streaming-zipformer-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Zipformer with sherpa-onnx's C\n// API.\n// clang-format off\n// \n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"encoder-epoch-99-avg-1.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n  const char *joiner_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/\"\n      \"joiner-epoch-99-avg-1.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Zipformer config\n  SherpaOnnxOnlineTransducerModelConfig zipformer_config;\n  memset(&zipformer_config, 0, sizeof(zipformer_config));\n  zipformer_config.encoder = encoder_filename;\n  zipformer_config.decoder = decoder_filename;\n  zipformer_config.joiner = joiner_filename;\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 1;\n  online_model_config.num_threads = 1;\n  
online_model_config.provider = provider;\n  online_model_config.tokens = tokens_filename;\n  online_model_config.transducer = zipformer_config;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n  recognizer_config.enable_endpoint = 1;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? 
wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/streaming-zipformer-with-hr-c-api.c",
    "content": "// c-api-examples/streaming-zipformer-with-hr-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Zipformer with sherpa-onnx's C\n// API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n// tar xf dict.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"test-hr.wav\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Online model config\n  SherpaOnnxOnlineModelConfig online_model_config;\n  memset(&online_model_config, 0, sizeof(online_model_config));\n  online_model_config.debug = 0;\n  online_model_config.num_threads = 1;\n  online_model_config.provider = \"cpu\";\n  online_model_config.tokens =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n  online_model_config.transducer.encoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"encoder-epoch-99-avg-1.int8.onnx\";\n\n  // Note: We recommend not using int8.onnx for the decoder.\n  online_model_config.transducer.decoder =\n      
\"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n\n  online_model_config.transducer.joiner =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"joiner-epoch-99-avg-1.int8.onnx\";\n\n  online_model_config.tokens =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n  online_model_config.num_threads = 1;\n\n  // Recognizer config\n  SherpaOnnxOnlineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = online_model_config;\n\n  recognizer_config.hr.dict_dir = \"./dict\";\n  recognizer_config.hr.lexicon = \"./lexicon.txt\";\n\n  // Please see\n  // https://colab.research.google.com/drive/1jEaS3s8FbRJIcVQJv2EQx19EM_mnuARi?usp=sharing\n  // for how to generate your own replace.fst\n  recognizer_config.hr.rule_fsts = \"./replace.fst\";\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n// simulate streaming. You can choose an arbitrary N\n#define N 3200\n\n  fprintf(stderr, \"sample rate: %d, num samples: %d, duration: %.2f s\\n\",\n          wave->sample_rate, wave->num_samples,\n          (float)wave->num_samples / wave->sample_rate);\n\n  int32_t k = 0;\n  while (k < wave->num_samples) {\n    int32_t start = k;\n    int32_t end =\n        (start + N > wave->num_samples) ? 
wave->num_samples : (start + N);\n    k += N;\n\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n                                         wave->samples + start, end - start);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,\n                                       4800);\n\n  SherpaOnnxFreeWave(wave);\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/supertonic-tts-en-c-api.c",
    "content": "// c-api-examples/supertonic-tts-en-c-api.c\n//\n// Copyright (c)  2026  zengyw\n\n// This file shows how to use sherpa-onnx C API\n// for English TTS with Supertonic.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n./supertonic-tts-en-c-api\n\n*/\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float* samples, int32_t num_samples,\n                                float progress, void* arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char* argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.supertonic.duration_predictor =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/\"\n      \"duration_predictor.int8.onnx\";\n  config.model.supertonic.text_encoder =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\";\n  config.model.supertonic.vector_estimator =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\";\n  config.model.supertonic.vocoder =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\";\n  config.model.supertonic.tts_json =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\";\n  config.model.supertonic.unicode_indexer =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\";\n  config.model.supertonic.voice_style =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  
config.model.debug = 1;\n\n  const char* filename = \"./generated-supertonic-en-c.wav\";\n  const char* text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar.\";\n\n  const SherpaOnnxOfflineTts* tts = SherpaOnnxCreateOfflineTts(&config);\n  if (!tts) {\n    fprintf(stderr, \"Failed to create offline TTS\\n\");\n    return -1;\n  }\n\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.sid = 6;\n  cfg.num_steps = 5;\n  cfg.speed = 1.25f;  // larger -> faster\n  cfg.extra = \"{\\\"lang\\\": \\\"en\\\"}\";\n\n  const SherpaOnnxGeneratedAudio* audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n\n  if (audio) {\n    SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n    fprintf(stderr, \"Saved to: %s\\n\", filename);\n    SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  }\n\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/telespeech-c-api.c",
    "content": "// c-api-examples/telespeech-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use TeleSpeech-ASR CTC model with sherpa-onnx's\n// C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n// tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n// rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav\";\n  const char *model_filename =\n      \"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.telespeech_ctc = model_filename;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your 
config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/vad-moonshine-c-api.c",
    "content": "// c-api-examples/vad-moonshine-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use VAD + Moonshine with sherpa-onnx's C API.\n// clang-format off\n//\n// To use silero-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// To use ten-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"./Obama.wav\";\n  if (!SherpaOnnxFileExists(wav_filename)) {\n    fprintf(stderr, \"Please download %s\\n\", wav_filename);\n    return -1;\n  }\n\n  const char *vad_filename;\n  int32_t use_silero_vad = 0;\n  int32_t use_ten_vad = 0;\n\n  if (SherpaOnnxFileExists(\"./silero_vad.onnx\")) {\n    printf(\"Use silero-vad\\n\");\n    vad_filename = \"./silero_vad.onnx\";\n    use_silero_vad = 1;\n  } else if (SherpaOnnxFileExists(\"./ten-vad.onnx\")) {\n    printf(\"Use ten-vad\\n\");\n    vad_filename = \"./ten-vad.onnx\";\n    use_ten_vad = 1;\n  } else {\n    fprintf(stderr, \"Please provide either silero_vad.onnx or ten-vad.onnx\\n\");\n    return -1;\n  }\n\n  const char *preprocessor =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\";\n  const char *encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\";\n  const char *uncached_decoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\";\n  const char *cached_decoder =\n      
\"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\";\n  const char *tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  if (wave->sample_rate != 16000) {\n    fprintf(stderr, \"Expect the sample rate to be 16000. Given: %d\\n\",\n            wave->sample_rate);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 0;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.tokens = tokens;\n  offline_model_config.moonshine.preprocessor = preprocessor;\n  offline_model_config.moonshine.encoder = encoder;\n  offline_model_config.moonshine.uncached_decoder = uncached_decoder;\n  offline_model_config.moonshine.cached_decoder = cached_decoder;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your recognizer config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  SherpaOnnxVadModelConfig vadConfig;\n  memset(&vadConfig, 0, sizeof(vadConfig));\n  if (use_silero_vad) {\n    vadConfig.silero_vad.model = vad_filename;\n    vadConfig.silero_vad.threshold = 0.25;\n    vadConfig.silero_vad.min_silence_duration = 0.5;\n    vadConfig.silero_vad.min_speech_duration = 0.5;\n    vadConfig.silero_vad.max_speech_duration = 10;\n    
vadConfig.silero_vad.window_size = 512;\n  } else if (use_ten_vad) {\n    vadConfig.ten_vad.model = vad_filename;\n    vadConfig.ten_vad.threshold = 0.25;\n    vadConfig.ten_vad.min_silence_duration = 0.5;\n    vadConfig.ten_vad.min_speech_duration = 0.5;\n    vadConfig.ten_vad.max_speech_duration = 10;\n    vadConfig.ten_vad.window_size = 256;\n  }\n\n  vadConfig.sample_rate = 16000;\n  vadConfig.num_threads = 1;\n  vadConfig.debug = 1;\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      SherpaOnnxCreateVoiceActivityDetector(&vadConfig, 30);\n\n  if (vad == NULL) {\n    fprintf(stderr, \"Please check your VAD config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOfflineRecognizer(recognizer);\n    return -1;\n  }\n\n  int32_t window_size = use_silero_vad ? vadConfig.silero_vad.window_size\n                                       : vadConfig.ten_vad.window_size;\n\n  int32_t i = 0;\n  int is_eof = 0;\n\n  while (!is_eof) {\n    if (i + window_size < wave->num_samples) {\n      SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,\n                                                    window_size);\n    } else {\n      SherpaOnnxVoiceActivityDetectorFlush(vad);\n      is_eof = 1;\n    }\n    while (!SherpaOnnxVoiceActivityDetectorEmpty(vad)) {\n      const SherpaOnnxSpeechSegment *segment =\n          SherpaOnnxVoiceActivityDetectorFront(vad);\n\n      const SherpaOnnxOfflineStream *stream =\n          SherpaOnnxCreateOfflineStream(recognizer);\n\n      SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate,\n                                      segment->samples, segment->n);\n\n      SherpaOnnxDecodeOfflineStream(recognizer, stream);\n\n      const SherpaOnnxOfflineRecognizerResult *result =\n          SherpaOnnxGetOfflineStreamResult(stream);\n\n      float start = segment->start / 16000.0f;\n      float duration = segment->n / 16000.0f;\n      float stop = start + duration;\n\n      fprintf(stderr, \"%.3f -- %.3f: 
%s\\n\", start, stop, result->text);\n\n      SherpaOnnxDestroyOfflineRecognizerResult(result);\n      SherpaOnnxDestroyOfflineStream(stream);\n\n      SherpaOnnxDestroySpeechSegment(segment);\n      SherpaOnnxVoiceActivityDetectorPop(vad);\n    }\n    i += window_size;\n  }\n\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxDestroyVoiceActivityDetector(vad);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/vad-sense-voice-c-api.c",
    "content": "// c-api-examples/vad-sense-voice-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use VAD + SenseVoice with sherpa-onnx's C API.\n// clang-format off\n//\n// To use silero-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// To use ten-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"./lei-jun-test.wav\";\n  if (!SherpaOnnxFileExists(wav_filename)) {\n    fprintf(stderr, \"Please download %s\\n\", wav_filename);\n    return -1;\n  }\n\n  const char *vad_filename;\n  int32_t use_silero_vad = 0;\n  int32_t use_ten_vad = 0;\n\n  if (SherpaOnnxFileExists(\"./silero_vad.onnx\")) {\n    printf(\"Use silero-vad\\n\");\n    vad_filename = \"./silero_vad.onnx\";\n    use_silero_vad = 1;\n  } else if (SherpaOnnxFileExists(\"./ten-vad.onnx\")) {\n    printf(\"Use ten-vad\\n\");\n    vad_filename = \"./ten-vad.onnx\";\n    use_ten_vad = 1;\n  } else {\n    fprintf(stderr, \"Please provide either silero_vad.onnx or ten-vad.onnx\\n\");\n    return -1;\n  }\n\n  const char *model_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  const char *tokens_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n  const char *language = \"auto\";\n  const char *provider = \"cpu\";\n  int32_t 
use_inverse_text_normalization = 1;\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  if (wave->sample_rate != 16000) {\n    fprintf(stderr, \"Expect the sample rate to be 16000. Given: %d\\n\",\n            wave->sample_rate);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  SherpaOnnxOfflineSenseVoiceModelConfig sense_voice_config;\n  memset(&sense_voice_config, 0, sizeof(sense_voice_config));\n  sense_voice_config.model = model_filename;\n  sense_voice_config.language = language;\n  sense_voice_config.use_itn = use_inverse_text_normalization;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 0;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.sense_voice = sense_voice_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your recognizer config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  SherpaOnnxVadModelConfig vadConfig;\n  memset(&vadConfig, 0, sizeof(vadConfig));\n\n  if (use_silero_vad) {\n    vadConfig.silero_vad.model = vad_filename;\n    vadConfig.silero_vad.threshold = 0.25;\n    vadConfig.silero_vad.min_silence_duration = 0.5;\n    vadConfig.silero_vad.min_speech_duration = 0.5;\n    vadConfig.silero_vad.max_speech_duration = 10;\n    vadConfig.silero_vad.window_size = 
512;\n  } else if (use_ten_vad) {\n    vadConfig.ten_vad.model = vad_filename;\n    vadConfig.ten_vad.threshold = 0.25;\n    vadConfig.ten_vad.min_silence_duration = 0.5;\n    vadConfig.ten_vad.min_speech_duration = 0.5;\n    vadConfig.ten_vad.max_speech_duration = 10;\n    vadConfig.ten_vad.window_size = 256;\n  }\n\n  vadConfig.sample_rate = 16000;\n  vadConfig.num_threads = 1;\n  vadConfig.debug = 1;\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      SherpaOnnxCreateVoiceActivityDetector(&vadConfig, 30);\n\n  if (vad == NULL) {\n    fprintf(stderr, \"Please check your VAD config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOfflineRecognizer(recognizer);\n    return -1;\n  }\n\n  int32_t window_size = use_silero_vad ? vadConfig.silero_vad.window_size\n                                       : vadConfig.ten_vad.window_size;\n  int32_t i = 0;\n  int is_eof = 0;\n\n  while (!is_eof) {\n    if (i + window_size < wave->num_samples) {\n      SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,\n                                                    window_size);\n    } else {\n      SherpaOnnxVoiceActivityDetectorFlush(vad);\n      is_eof = 1;\n    }\n\n    while (!SherpaOnnxVoiceActivityDetectorEmpty(vad)) {\n      const SherpaOnnxSpeechSegment *segment =\n          SherpaOnnxVoiceActivityDetectorFront(vad);\n\n      const SherpaOnnxOfflineStream *stream =\n          SherpaOnnxCreateOfflineStream(recognizer);\n\n      SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate,\n                                      segment->samples, segment->n);\n\n      SherpaOnnxDecodeOfflineStream(recognizer, stream);\n\n      const SherpaOnnxOfflineRecognizerResult *result =\n          SherpaOnnxGetOfflineStreamResult(stream);\n\n      float start = segment->start / 16000.0f;\n      float duration = segment->n / 16000.0f;\n      float stop = start + duration;\n\n      fprintf(stderr, \"%.3f -- %.3f: %s\\n\", start, stop, result->text);\n\n  
    SherpaOnnxDestroyOfflineRecognizerResult(result);\n      SherpaOnnxDestroyOfflineStream(stream);\n\n      SherpaOnnxDestroySpeechSegment(segment);\n      SherpaOnnxVoiceActivityDetectorPop(vad);\n    }\n    i += window_size;\n  }\n\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxDestroyVoiceActivityDetector(vad);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/vad-whisper-c-api.c",
    "content": "// c-api-examples/vad-whisper-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use VAD + Whisper tiny.en with\n// sherpa-onnx's C API.\n//\n// clang-format off\n//\n// To use silero-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// To use ten-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n// tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n// rm sherpa-onnx-whisper-tiny.en.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"./Obama.wav\";\n\n  if (!SherpaOnnxFileExists(wav_filename)) {\n    fprintf(stderr, \"Please download %s\\n\", wav_filename);\n    return -1;\n  }\n\n  const char *vad_filename;\n  int32_t use_silero_vad = 0;\n  int32_t use_ten_vad = 0;\n\n  if (SherpaOnnxFileExists(\"./silero_vad.onnx\")) {\n    printf(\"Use silero-vad\\n\");\n    vad_filename = \"./silero_vad.onnx\";\n    use_silero_vad = 1;\n  } else if (SherpaOnnxFileExists(\"./ten-vad.onnx\")) {\n    printf(\"Use ten-vad\\n\");\n    vad_filename = \"./ten-vad.onnx\";\n    use_ten_vad = 1;\n  } else {\n    fprintf(stderr, \"Please provide either silero_vad.onnx or ten-vad.onnx\\n\");\n    return -1;\n  }\n\n  const char *encoder = \"sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\";\n  const char *decoder = \"sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\";\n  const char *tokens = \"sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", 
wav_filename);\n    return -1;\n  }\n\n  if (wave->sample_rate != 16000) {\n    fprintf(stderr, \"Expect the sample rate to be 16000. Given: %d\\n\",\n            wave->sample_rate);\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 0;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = \"cpu\";\n  offline_model_config.tokens = tokens;\n  offline_model_config.whisper.encoder = encoder;\n  offline_model_config.whisper.decoder = decoder;\n  offline_model_config.whisper.language = \"en\";\n  offline_model_config.whisper.tail_paddings = 0;\n  offline_model_config.whisper.task = \"transcribe\";\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your recognizer config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  SherpaOnnxVadModelConfig vadConfig;\n  memset(&vadConfig, 0, sizeof(vadConfig));\n\n  if (use_silero_vad) {\n    vadConfig.silero_vad.model = vad_filename;\n    vadConfig.silero_vad.threshold = 0.25;\n    vadConfig.silero_vad.min_silence_duration = 0.5;\n    vadConfig.silero_vad.min_speech_duration = 0.5;\n    vadConfig.silero_vad.max_speech_duration = 10;\n    vadConfig.silero_vad.window_size = 512;\n  } else if (use_ten_vad) {\n    vadConfig.ten_vad.model = vad_filename;\n    vadConfig.ten_vad.threshold = 0.25;\n    vadConfig.ten_vad.min_silence_duration = 0.5;\n    vadConfig.ten_vad.min_speech_duration = 0.5;\n    vadConfig.ten_vad.max_speech_duration = 
10;\n    vadConfig.ten_vad.window_size = 256;\n  }\n\n  vadConfig.sample_rate = 16000;\n  vadConfig.num_threads = 1;\n  vadConfig.debug = 1;\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      SherpaOnnxCreateVoiceActivityDetector(&vadConfig, 30);\n\n  if (vad == NULL) {\n    fprintf(stderr, \"Please check your VAD config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    SherpaOnnxDestroyOfflineRecognizer(recognizer);\n    return -1;\n  }\n\n  int32_t window_size = use_silero_vad ? vadConfig.silero_vad.window_size\n                                       : vadConfig.ten_vad.window_size;\n  int32_t i = 0;\n  int is_eof = 0;\n\n  while (!is_eof) {\n    if (i + window_size < wave->num_samples) {\n      SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,\n                                                    window_size);\n    } else {\n      SherpaOnnxVoiceActivityDetectorFlush(vad);\n      is_eof = 1;\n    }\n    while (!SherpaOnnxVoiceActivityDetectorEmpty(vad)) {\n      const SherpaOnnxSpeechSegment *segment =\n          SherpaOnnxVoiceActivityDetectorFront(vad);\n\n      const SherpaOnnxOfflineStream *stream =\n          SherpaOnnxCreateOfflineStream(recognizer);\n\n      SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate,\n                                      segment->samples, segment->n);\n\n      SherpaOnnxDecodeOfflineStream(recognizer, stream);\n\n      const SherpaOnnxOfflineRecognizerResult *result =\n          SherpaOnnxGetOfflineStreamResult(stream);\n\n      float start = segment->start / 16000.0f;\n      float duration = segment->n / 16000.0f;\n      float stop = start + duration;\n\n      fprintf(stderr, \"%.3f -- %.3f: %s\\n\", start, stop, result->text);\n\n      SherpaOnnxDestroyOfflineRecognizerResult(result);\n      SherpaOnnxDestroyOfflineStream(stream);\n\n      SherpaOnnxDestroySpeechSegment(segment);\n      SherpaOnnxVoiceActivityDetectorPop(vad);\n    }\n    i += window_size;\n  }\n\n  
SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxDestroyVoiceActivityDetector(vad);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/wenet-ctc-c-api.c",
    "content": "// c-api-examples/wenet-ctc-c-api.c\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use non-streaming Wenet CTC model with\n// sherpa-onnx's C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  // clang-format off\n  const char *wav_filename = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\";\n  const char *model = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx\";\n  const char *tokens = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt\";\n  // clang-format on\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Wenet CTC config\n  SherpaOnnxOfflineWenetCtcModelConfig wenet_ctc_config;\n  memset(&wenet_ctc_config, 0, sizeof(wenet_ctc_config));\n  wenet_ctc_config.model = model;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens;\n  offline_model_config.wenet_ctc = wenet_ctc_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  
memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/whisper-c-api.c",
    "content": "// c-api-examples/whisper-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n// We assume you have pre-downloaded the whisper multi-lingual models\n// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n// An example command to download the \"tiny\" whisper model is given below:\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n// tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n// rm sherpa-onnx-whisper-tiny.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename = \"./sherpa-onnx-whisper-tiny/test_wavs/0.wav\";\n  const char *encoder_filename = \"sherpa-onnx-whisper-tiny/tiny-encoder.onnx\";\n  const char *decoder_filename = \"sherpa-onnx-whisper-tiny/tiny-decoder.onnx\";\n  const char *tokens_filename = \"sherpa-onnx-whisper-tiny/tiny-tokens.txt\";\n  const char *language = \"en\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Whisper config\n  SherpaOnnxOfflineWhisperModelConfig whisper_config;\n  memset(&whisper_config, 0, sizeof(whisper_config));\n  whisper_config.decoder = decoder_filename;\n  whisper_config.encoder = encoder_filename;\n  whisper_config.language = language;\n  whisper_config.tail_paddings = 0;\n  whisper_config.task = \"transcribe\";\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  offline_model_config.whisper = whisper_config;\n\n  // Recognizer config\n  
SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n\n    SherpaOnnxFreeWave(wave);\n\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/zipformer-c-api.c",
    "content": "// c-api-examples/zipformer-c-api.c\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use non-streaming Zipformer with sherpa-onnx's\n// C API.\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n// tar xvf sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n// rm sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2\n//\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nint32_t main() {\n  const char *wav_filename =\n      \"sherpa-onnx-zipformer-small-en-2023-06-26/test_wavs/0.wav\";\n  const char *encoder_filename =\n      \"sherpa-onnx-zipformer-small-en-2023-06-26/encoder-epoch-99-avg-1.onnx\";\n  const char *decoder_filename =\n      \"sherpa-onnx-zipformer-small-en-2023-06-26/decoder-epoch-99-avg-1.onnx\";\n  const char *joiner_filename =\n      \"sherpa-onnx-zipformer-small-en-2023-06-26/joiner-epoch-99-avg-1.onnx\";\n  const char *tokens_filename =\n      \"sherpa-onnx-zipformer-small-en-2023-06-26/tokens.txt\";\n  const char *provider = \"cpu\";\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(wav_filename);\n  if (wave == NULL) {\n    fprintf(stderr, \"Failed to read %s\\n\", wav_filename);\n    return -1;\n  }\n\n  // Zipformer config\n  SherpaOnnxOfflineTransducerModelConfig zipformer_config;\n  memset(&zipformer_config, 0, sizeof(zipformer_config));\n  zipformer_config.encoder = encoder_filename;\n  zipformer_config.decoder = decoder_filename;\n  zipformer_config.joiner = joiner_filename;\n\n  // Offline model config\n  SherpaOnnxOfflineModelConfig offline_model_config;\n  memset(&offline_model_config, 0, sizeof(offline_model_config));\n  offline_model_config.debug = 1;\n  offline_model_config.num_threads = 1;\n  offline_model_config.provider = provider;\n  offline_model_config.tokens = tokens_filename;\n  
offline_model_config.transducer = zipformer_config;\n\n  // Recognizer config\n  SherpaOnnxOfflineRecognizerConfig recognizer_config;\n  memset(&recognizer_config, 0, sizeof(recognizer_config));\n  recognizer_config.decoding_method = \"greedy_search\";\n  recognizer_config.model_config = offline_model_config;\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&recognizer_config);\n\n  if (recognizer == NULL) {\n    fprintf(stderr, \"Please check your config!\\n\");\n    SherpaOnnxFreeWave(wave);\n    return -1;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n                                  wave->num_samples);\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n  const SherpaOnnxOfflineRecognizerResult *result =\n      SherpaOnnxGetOfflineStreamResult(stream);\n\n  fprintf(stderr, \"Decoded text: %s\\n\", result->text);\n\n  SherpaOnnxDestroyOfflineRecognizerResult(result);\n  SherpaOnnxDestroyOfflineStream(stream);\n  SherpaOnnxDestroyOfflineRecognizer(recognizer);\n  SherpaOnnxFreeWave(wave);\n\n  return 0;\n}\n"
  },
  {
    "path": "c-api-examples/zipvoice-tts-zh-en-c-api.c",
    "content": "// c-api-examples/zipvoice-tts-zh-en-c-api.c\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx C API\n// for Chinese/English zero-shot TTS with ZipVoice.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n./zipvoice-tts-zh-en-c-api\n*/\n// clang-format on\n\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.zipvoice.encoder =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\";\n  config.model.zipvoice.decoder =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\";\n  config.model.zipvoice.data_dir =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\";\n  config.model.zipvoice.lexicon =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\";\n  config.model.zipvoice.tokens =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\";\n  config.model.zipvoice.vocoder = \"./vocos_24khz.onnx\";\n\n  config.model.num_threads = 2;\n\n  // If you want to see more debug messages, please set it to 1\n  config.model.debug = 0;\n\n  const char *filename = \"./generated-zipvoice-zh-en-c.wav\";\n  const char *text =\n      \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, \"\n      \"就是全心投入并享受其中.\";\n  const char *reference_text =\n      \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\";\n  const char *reference_audio_file =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\";\n\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n  if (!tts) {\n    fprintf(stderr, \"Failed to create offline TTS\\n\");\n    return -1;\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(reference_audio_file);\n  if (!wave) {\n    fprintf(stderr, \"Failed to read %s\\n\", reference_audio_file);\n    SherpaOnnxDestroyOfflineTts(tts);\n    return -1;\n  }\n\n  SherpaOnnxGenerationConfig cfg = {0};\n  cfg.speed = 1.0f;\n  cfg.num_steps = 4;\n  cfg.reference_audio = wave->samples;\n  cfg.reference_audio_len = wave->num_samples;\n  cfg.reference_sample_rate = wave->sample_rate;\n  cfg.reference_text = reference_text;\n  cfg.extra = \"{\\\"min_char_in_sentence\\\": 10}\";\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, NULL, NULL);\n#else\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text, &cfg, ProgressCallback,\n                                             NULL);\n#endif\n\n  SherpaOnnxFreeWave(wave);\n\n  fprintf(stderr, \"Input text is: %s\\n\", text);\n\n  if (audio) {\n    SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate, filename);\n    fprintf(stderr, \"Saved to: %s\\n\", filename);\n    SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  }\n\n  SherpaOnnxDestroyOfflineTts(tts);\n\n  return 0;\n}\n"
  },
  {
    "path": "cmake/.gitignore",
    "content": "!*.cmake\n"
  },
  {
    "path": "cmake/__init__.py",
    "content": ""
  },
  {
    "path": "cmake/asio.cmake",
    "content": "function(download_asio)\n  include(FetchContent)\n\n  set(asio_URL  \"https://github.com/chriskohlhoff/asio/archive/refs/tags/asio-1-24-0.tar.gz\")\n  set(asio_URL2  \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/asio-asio-1-24-0.tar.gz\")\n  set(asio_HASH \"SHA256=cbcaaba0f66722787b1a7c33afe1befb3a012b5af3ad7da7ff0f6b8c9b7a8a5b\")\n\n  # If you don't have access to the Internet,\n  # please pre-download asio\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/asio-asio-1-24-0.tar.gz\n    ${CMAKE_SOURCE_DIR}/asio-asio-1-24-0.tar.gz\n    ${CMAKE_BINARY_DIR}/asio-asio-1-24-0.tar.gz\n    /tmp/asio-asio-1-24-0.tar.gz\n    /star-fj/fangjun/download/github/asio-asio-1-24-0.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(asio_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${asio_URL}\" asio_URL)\n      message(STATUS \"Found local downloaded asio: ${asio_URL}\")\n      set(asio_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(asio\n    URL\n      ${asio_URL}\n      ${asio_URL2}\n    URL_HASH          ${asio_HASH}\n  )\n\n  FetchContent_GetProperties(asio)\n  if(NOT asio_POPULATED)\n    message(STATUS \"Downloading asio ${asio_URL}\")\n    FetchContent_Populate(asio)\n  endif()\n  message(STATUS \"asio is downloaded to ${asio_SOURCE_DIR}\")\n  # add_subdirectory(${asio_SOURCE_DIR} ${asio_BINARY_DIR} EXCLUDE_FROM_ALL)\n  include_directories(${asio_SOURCE_DIR}/asio/include)\nendfunction()\n\ndownload_asio()\n"
  },
  {
    "path": "cmake/cargs.cmake",
    "content": "function(download_cargs)\n  include(FetchContent)\n\n  set(cargs_URL \"https://github.com/likle/cargs/archive/refs/tags/v1.0.3.tar.gz\")\n  set(cargs_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/cargs-1.0.3.tar.gz\")\n  set(cargs_HASH \"SHA256=ddba25bd35e9c6c75bc706c126001b8ce8e084d40ef37050e6aa6963e836eb8b\")\n\n  # If you don't have access to the Internet,\n  # please pre-download cargs\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/cargs-1.0.3.tar.gz\n    ${CMAKE_SOURCE_DIR}/cargs-1.0.3.tar.gz\n    ${CMAKE_BINARY_DIR}/cargs-1.0.3.tar.gz\n    /tmp/cargs-1.0.3.tar.gz\n    /star-fj/fangjun/download/github/cargs-1.0.3.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(cargs_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${cargs_URL}\" cargs_URL)\n      message(STATUS \"Found local downloaded cargs: ${cargs_URL}\")\n      set(cargs_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(cargs\n    URL\n      ${cargs_URL}\n      ${cargs_URL2}\n    URL_HASH\n      ${cargs_HASH}\n  )\n\n  FetchContent_GetProperties(cargs)\n  if(NOT cargs_POPULATED)\n    message(STATUS \"Downloading cargs ${cargs_URL}\")\n    FetchContent_Populate(cargs)\n  endif()\n  message(STATUS \"cargs is downloaded to ${cargs_SOURCE_DIR}\")\n  add_subdirectory(${cargs_SOURCE_DIR} ${cargs_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  install(TARGETS cargs DESTINATION lib)\n  install(FILES ${cargs_SOURCE_DIR}/include/cargs.h\n    DESTINATION include\n  )\nendfunction()\n\ndownload_cargs()\n"
  },
  {
    "path": "cmake/cmake_extension.py",
    "content": "# cmake/cmake_extension.py\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# flake8: noqa\n\nimport os\nimport platform\nimport shlex\nimport shutil\nimport subprocess\nimport sys\nfrom pathlib import Path\n\nimport glob\nimport setuptools\nfrom setuptools.command.build_ext import build_ext\n\n\ndef need_split_package():\n    ans = os.environ.get(\"SHERPA_ONNX_SPLIT_PYTHON_PACKAGE\", None)\n    return ans is not None\n\n\ndef is_for_pypi():\n    ans = os.environ.get(\"SHERPA_ONNX_IS_FOR_PYPI\", None)\n    return ans is not None\n\n\ndef is_macos():\n    return platform.system() == \"Darwin\"\n\n\ndef is_windows():\n    return platform.system() == \"Windows\"\n\n\ndef is_linux():\n    return platform.system() == \"Linux\"\n\n\ndef is_arm64():\n    return platform.machine() in [\"arm64\", \"aarch64\"]\n\n\ndef is_x86():\n    return platform.machine() in [\"i386\", \"i686\", \"x86_64\"]\n\n\ndef enable_alsa():\n    build_alsa = os.environ.get(\"SHERPA_ONNX_ENABLE_ALSA\", None)\n    return build_alsa and is_linux() and (is_arm64() or is_x86())\n\n\ndef get_binaries():\n    binaries = [\n        \"sherpa-onnx\",\n        \"sherpa-onnx-keyword-spotter\",\n        \"sherpa-onnx-microphone\",\n        \"sherpa-onnx-microphone-offline\",\n        \"sherpa-onnx-microphone-offline-audio-tagging\",\n        \"sherpa-onnx-microphone-offline-speaker-identification\",\n        \"sherpa-onnx-offline\",\n        \"sherpa-onnx-offline-audio-tagging\",\n        \"sherpa-onnx-offline-denoiser\",\n        \"sherpa-onnx-offline-language-identification\",\n        \"sherpa-onnx-offline-punctuation\",\n        \"sherpa-onnx-offline-source-separation\",\n        \"sherpa-onnx-offline-speaker-diarization\",\n        \"sherpa-onnx-offline-tts\",\n        \"sherpa-onnx-offline-tts-play\",\n        \"sherpa-onnx-offline-websocket-server\",\n        \"sherpa-onnx-online-denoiser\",\n        \"sherpa-onnx-online-punctuation\",\n        \"sherpa-onnx-online-websocket-client\",\n    
    \"sherpa-onnx-online-websocket-server\",\n        \"sherpa-onnx-vad\",\n        \"sherpa-onnx-vad-microphone\",\n        \"sherpa-onnx-vad-microphone-offline-asr\",\n        \"sherpa-onnx-vad-microphone-simulated-streaming-asr\",\n        \"sherpa-onnx-vad-with-offline-asr\",\n        \"sherpa-onnx-vad-with-online-asr\",\n        \"sherpa-onnx-version\",\n        \"sherpa-onnx-pa-devs\",\n    ]\n\n    if enable_alsa():\n        binaries += [\n            \"sherpa-onnx-alsa\",\n            \"sherpa-onnx-alsa-offline\",\n            \"sherpa-onnx-alsa-offline-audio-tagging\",\n            \"sherpa-onnx-alsa-offline-speaker-identification\",\n            \"sherpa-onnx-offline-tts-play-alsa\",\n            \"sherpa-onnx-vad-alsa\",\n            \"sherpa-onnx-vad-alsa-offline-asr\",\n        ]\n\n    if is_windows():\n        binaries += [\n            \"onnxruntime.dll\",\n            \"sherpa-onnx-c-api.dll\",\n            \"sherpa-onnx-cxx-api.dll\",\n        ]\n\n    return binaries\n\n\ntry:\n    from wheel.bdist_wheel import bdist_wheel as _bdist_wheel\n\n    class bdist_wheel(_bdist_wheel):\n        def finalize_options(self):\n            _bdist_wheel.finalize_options(self)\n            # In this case, the generated wheel has a name in the form\n            # sherpa-xxx-pyxx-none-any.whl\n            if is_for_pypi() and not is_macos():\n                self.root_is_pure = True\n            else:\n                # The generated wheel has a name ending with\n                # -linux_x86_64.whl\n                self.root_is_pure = False\n\nexcept ImportError:\n    bdist_wheel = None\n\n\ndef cmake_extension(name, *args, **kwargs) -> setuptools.Extension:\n    kwargs[\"language\"] = \"c++\"\n    sources = []\n    return setuptools.Extension(name, sources, *args, **kwargs)\n\n\nclass BuildExtension(build_ext):\n    def build_extension(self, ext: setuptools.extension.Extension):\n        # build/temp.linux-x86_64-3.8\n        os.makedirs(self.build_temp, 
exist_ok=True)\n\n        # build/lib.linux-x86_64-3.8\n        os.makedirs(self.build_lib, exist_ok=True)\n\n        out_bin_dir = Path(self.build_lib).resolve().parent / \"sherpa_onnx\" / \"bin\"\n        install_dir = Path(self.build_lib).resolve() / \"sherpa_onnx\"\n\n        sherpa_onnx_dir = Path(__file__).parent.parent.resolve()\n\n        cmake_args = os.environ.get(\"SHERPA_ONNX_CMAKE_ARGS\", \"\")\n        make_args = os.environ.get(\"SHERPA_ONNX_MAKE_ARGS\", \"\")\n        system_make_args = os.environ.get(\"MAKEFLAGS\", \"\")\n\n        if cmake_args == \"\":\n            cmake_args = \"-DCMAKE_BUILD_TYPE=Release\"\n\n        extra_cmake_args = \"\"\n        if not need_split_package():\n            extra_cmake_args += f\" -DCMAKE_INSTALL_PREFIX={install_dir} \"\n        extra_cmake_args += \" -DBUILD_SHARED_LIBS=ON \"\n        extra_cmake_args += \" -DBUILD_PIPER_PHONMIZE_EXE=OFF \"\n        extra_cmake_args += \" -DBUILD_PIPER_PHONMIZE_TESTS=OFF \"\n        extra_cmake_args += \" -DBUILD_ESPEAK_NG_EXE=OFF \"\n        extra_cmake_args += \" -DBUILD_ESPEAK_NG_TESTS=OFF \"\n\n        if not need_split_package():\n            extra_cmake_args += \" -DSHERPA_ONNX_ENABLE_C_API=ON \"\n\n        extra_cmake_args += \" -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF \"\n        extra_cmake_args += \" -DSHERPA_ONNX_ENABLE_CHECK=OFF \"\n        extra_cmake_args += \" -DSHERPA_ONNX_ENABLE_PYTHON=ON \"\n        extra_cmake_args += \" -DSHERPA_ONNX_ENABLE_PORTAUDIO=ON \"\n        if not need_split_package():\n            extra_cmake_args += \" -DSHERPA_ONNX_ENABLE_WEBSOCKET=ON \"\n\n        if \"PYTHON_EXECUTABLE\" not in cmake_args:\n            print(f\"Setting PYTHON_EXECUTABLE to {sys.executable}\")\n            cmake_args += f\" -DPYTHON_EXECUTABLE={sys.executable}\"\n\n        # Put `cmake_args` from the env variable ${SHERPA_ONNX_CMAKE_ARGS} last,\n        # so they can override the \"defaults\" stored in `extra_cmake_args`\n        cmake_args = extra_cmake_args + 
cmake_args\n\n        if is_windows():\n            if not need_split_package():\n                build_cmd = f\"\"\"\n             cmake {cmake_args} -B {self.build_temp} -S {sherpa_onnx_dir}\n             cmake --build {self.build_temp} --target install --config Release -- -m:2\n                \"\"\"\n            else:\n                build_cmd = f\"\"\"\n             cmake {cmake_args} -B {self.build_temp} -S {sherpa_onnx_dir}\n             cmake --build {self.build_temp} --target _sherpa_onnx --config Release -- -m:2\n                \"\"\"\n\n            print(f\"build command is:\\n{build_cmd}\")\n\n            cmake_configure_cmd = (\n                f'cmake {cmake_args} -B \"{self.build_temp}\" -S \"{sherpa_onnx_dir}\"'\n            )\n            print(\"cmake_configure_cmd\", cmake_configure_cmd)\n\n            ret = subprocess.run(cmake_configure_cmd, shell=True).returncode\n\n            if ret != 0:\n                raise Exception(\"Failed to configure sherpa-onnx\")\n\n            if not need_split_package():\n                cmake_build_cmd = [\n                    \"cmake\",\n                    \"--build\",\n                    str(self.build_temp),\n                    \"--target\",\n                    \"install\",\n                    \"--config\",\n                    \"Release\",\n                    \"--\",\n                    \"-m:2\",\n                ]\n                print(\"cmake_build_cmd\", cmake_build_cmd)\n                ret = subprocess.run(cmake_build_cmd, shell=False).returncode\n            else:\n                cmake_build_cmd = [\n                    \"cmake\",\n                    \"--build\",\n                    str(self.build_temp),\n                    \"--target\",\n                    \"_sherpa_onnx\",\n                    \"--config\",\n                    \"Release\",\n                    \"--\",\n                    \"-m:2\",\n                ]\n                print(\"cmake_build_cmd\", cmake_build_cmd)\n      
          ret = subprocess.run(cmake_build_cmd, shell=False).returncode\n\n            if ret != 0:\n                raise Exception(\"Failed to build and install sherpa\")\n        else:\n            if make_args == \"\" and system_make_args == \"\":\n                print(\"for fast compilation, run:\")\n                print('export SHERPA_ONNX_MAKE_ARGS=\"-j\"; python setup.py install')\n                print('Setting make_args to \"-j4\"')\n                make_args = \"-j4\"\n\n            if \"-G Ninja\" in cmake_args:\n                if not need_split_package():\n                    build_cmd = f\"\"\"\n                        cd {self.build_temp}\n                        cmake {cmake_args} {sherpa_onnx_dir}\n                        ninja {make_args} install\n                    \"\"\"\n                else:\n                    build_cmd = f\"\"\"\n                        cd {self.build_temp}\n                        cmake {cmake_args} {sherpa_onnx_dir}\n                        ninja {make_args} _sherpa_onnx\n                    \"\"\"\n            else:\n                if not need_split_package():\n                    build_cmd = f\"\"\"\n                        cd {self.build_temp}\n\n                        cmake {cmake_args} {sherpa_onnx_dir}\n\n                        make {make_args} install/strip\n                    \"\"\"\n                else:\n                    build_cmd = f\"\"\"\n                        cd {self.build_temp}\n\n                        cmake {cmake_args} {sherpa_onnx_dir}\n\n                        make {make_args} _sherpa_onnx\n                    \"\"\"\n            print(f\"build command is:\\n{build_cmd}\")\n\n            # Parse cmake_args and make_args into lists for safer execution\n            # Use shlex.split() for safer parsing of user-provided arguments\n            cmake_args_list = shlex.split(cmake_args)\n            make_args_list = shlex.split(make_args) if make_args else []\n\n            # Change to 
build_temp directory and execute commands\n            original_dir = os.getcwd()\n            try:\n                os.chdir(self.build_temp)\n\n                # Run cmake configuration\n                cmake_cmd = [\"cmake\"] + cmake_args_list + [str(sherpa_onnx_dir)]\n                ret = subprocess.run(cmake_cmd, shell=False).returncode\n                if ret != 0:\n                    raise Exception(\"Failed to configure sherpa\")\n\n                # Run build command\n                if \"-G Ninja\" in cmake_args:\n                    if not need_split_package():\n                        build_cmd_list = [\"ninja\"] + make_args_list + [\"install\"]\n                    else:\n                        build_cmd_list = [\"ninja\"] + make_args_list + [\"_sherpa_onnx\"]\n                else:\n                    if not need_split_package():\n                        build_cmd_list = [\"make\"] + make_args_list + [\"install/strip\"]\n                    else:\n                        build_cmd_list = [\"make\"] + make_args_list + [\"_sherpa_onnx\"]\n\n                ret = subprocess.run(build_cmd_list, shell=False).returncode\n            finally:\n                os.chdir(original_dir)\n\n            if ret != 0:\n                raise Exception(\n                    \"\\nBuild sherpa-onnx failed. Please check the error message.\\n\"\n                    \"You can ask for help by creating an issue on GitHub.\\n\"\n                    \"\\nClick:\\n\\thttps://github.com/k2-fsa/sherpa-onnx/issues/new\\n\"  # noqa\n                )\n\n        if need_split_package():\n            dst = os.path.join(f\"{self.build_lib}\", \"sherpa_onnx\", \"lib\")\n            os.makedirs(dst, exist_ok=True)\n            # Directory listing for debugging - safe with shell=False\n            if is_windows():\n                # On Windows, use PowerShell's Get-ChildItem or just skip the listing\n                # since 'dir' is a shell built-in. 
For safety, we'll just skip it.\n                pass\n            else:\n                subprocess.run([\"ls\", \"-la\", dst], shell=False)\n\n            ext = \"pyd\" if sys.platform.startswith(\"win\") else \"so\"\n            pattern = os.path.join(self.build_temp, \"**\", f\"_sherpa_onnx.*.{ext}\")\n            matches = glob.glob(pattern, recursive=True)\n            print(\"matches\", list(matches))\n\n            for f in matches:\n                print(f, os.path.join(f\"{self.build_lib}\", \"sherpa_onnx\", \"lib\"))\n                shutil.copy(f\"{f}\", dst)\n                # Directory listing for debugging - safe with shell=False\n                if is_windows():\n                    # On Windows, use PowerShell's Get-ChildItem or just skip the listing\n                    # since 'dir' is a shell built-in. For safety, we'll just skip it.\n                    pass\n                else:\n                    subprocess.run([\"ls\", \"-la\", dst], shell=False)\n\n            return\n\n        suffix = \".exe\" if is_windows() else \"\"\n        # Remember to also change setup.py\n\n        binaries = get_binaries()\n\n        for f in binaries:\n            suffix = \"\" if \".dll\" in f else suffix\n            src_file = install_dir / \"bin\" / (f + suffix)\n            if not src_file.is_file():\n                src_file = install_dir / \"lib\" / (f + suffix)\n            if not src_file.is_file():\n                src_file = install_dir / \"..\" / (f + suffix)\n\n            if not src_file.is_file():\n                continue\n\n            print(f\"Copying {src_file} to {out_bin_dir}/\")\n            shutil.copy(f\"{src_file}\", f\"{out_bin_dir}/\")\n\n        if Path(f\"{install_dir}/bin\").is_dir():\n            shutil.rmtree(f\"{install_dir}/bin\")\n        if Path(f\"{install_dir}/share\").is_dir():\n            shutil.rmtree(f\"{install_dir}/share\")\n        if Path(f\"{install_dir}/lib/pkgconfig\").is_dir():\n            
shutil.rmtree(f\"{install_dir}/lib/pkgconfig\")\n\n        if is_macos():\n            os.remove(f\"{install_dir}/lib/libonnxruntime.dylib\")\n"
  },
  {
    "path": "cmake/eigen.cmake",
    "content": "function(download_eigen)\n  include(FetchContent)\n\n  set(eigen_URL  \"https://gitlab.com/libeigen/eigen/-/archive/3.4.1/eigen-3.4.1.tar.gz\")\n  set(eigen_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/eigen-3.4.1.tar.gz\")\n  set(eigen_HASH \"SHA256=b93c667d1b69265cdb4d9f30ec21f8facbbe8b307cf34c0b9942834c6d4fdbe2\")\n\n  # If you don't have access to the Internet,\n  # please pre-download eigen\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/eigen-3.4.1.tar.gz\n    ${CMAKE_SOURCE_DIR}/eigen-3.4.1.tar.gz\n    ${CMAKE_BINARY_DIR}/eigen-3.4.1.tar.gz\n    /tmp/eigen-3.4.1.tar.gz\n    /star-fj/fangjun/download/github/eigen-3.4.1.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(eigen_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${eigen_URL}\" eigen_URL)\n      message(STATUS \"Found local downloaded eigen: ${eigen_URL}\")\n      set(eigen_URL2)\n      break()\n    endif()\n  endforeach()\n\n  set(BUILD_TESTING OFF CACHE BOOL \"\" FORCE)\n  set(EIGEN_BUILD_DOC OFF CACHE BOOL \"\" FORCE)\n\n  FetchContent_Declare(eigen\n    URL               ${eigen_URL} ${eigen_URL2}\n    URL_HASH          ${eigen_HASH}\n  )\n\n  FetchContent_GetProperties(eigen)\n  if(NOT eigen_POPULATED)\n    message(STATUS \"Downloading eigen from ${eigen_URL}\")\n    FetchContent_Populate(eigen)\n  endif()\n  message(STATUS \"eigen is downloaded to ${eigen_SOURCE_DIR}\")\n  message(STATUS \"eigen's binary dir is ${eigen_BINARY_DIR}\")\n\n  add_subdirectory(${eigen_SOURCE_DIR} ${eigen_BINARY_DIR} EXCLUDE_FROM_ALL)\nendfunction()\n\ndownload_eigen()\n\n"
  },
  {
    "path": "cmake/espeak-ng-for-piper.cmake",
    "content": "function(download_espeak_ng_for_piper)\n  include(FetchContent)\n\n  set(espeak_ng_URL  \"https://github.com/csukuangfj/espeak-ng/archive/f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\")\n  set(espeak_ng_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\")\n  set(espeak_ng_HASH \"SHA256=70cbf4050e7a014aae19140b05e57249da4720f56128459fbe3a93beaf971ae6\")\n\n  set(BUILD_ESPEAK_NG_TESTS OFF CACHE BOOL \"\" FORCE)\n  set(USE_ASYNC OFF CACHE BOOL \"\" FORCE)\n  set(USE_MBROLA OFF CACHE BOOL \"\" FORCE)\n  set(USE_LIBSONIC OFF CACHE BOOL \"\" FORCE)\n  set(USE_LIBPCAUDIO OFF CACHE BOOL \"\" FORCE)\n  set(USE_KLATT OFF CACHE BOOL \"\" FORCE)\n  set(USE_SPEECHPLAYER OFF CACHE BOOL \"\" FORCE)\n  set(EXTRA_cmn ON CACHE BOOL \"\" FORCE)\n  set(EXTRA_ru ON CACHE BOOL \"\" FORCE)\n  if (NOT SHERPA_ONNX_ENABLE_EPSEAK_NG_EXE)\n    set(BUILD_ESPEAK_NG_EXE OFF CACHE BOOL \"\" FORCE)\n  endif()\n\n  # If you don't have access to the Internet,\n  # please pre-download kaldi-decoder\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\n    ${CMAKE_SOURCE_DIR}/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\n    ${CMAKE_BINARY_DIR}/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\n    /tmp/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\n    /star-fj/fangjun/download/github/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(espeak_ng_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${espeak_ng_URL}\" espeak_ng_URL)\n      message(STATUS \"Found local downloaded espeak-ng: ${espeak_ng_URL}\")\n      set(espeak_ng_URL2 )\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(espeak_ng\n    URL\n      ${espeak_ng_URL}\n      ${espeak_ng_URL2}\n    URL_HASH          ${espeak_ng_HASH}\n  )\n\n  
FetchContent_GetProperties(espeak_ng)\n  if(NOT espeak_ng_POPULATED)\n    message(STATUS \"Downloading espeak-ng from ${espeak_ng_URL}\")\n    FetchContent_Populate(espeak_ng)\n  endif()\n  message(STATUS \"espeak-ng is downloaded to ${espeak_ng_SOURCE_DIR}\")\n  message(STATUS \"espeak-ng binary dir is ${espeak_ng_BINARY_DIR}\")\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${espeak_ng_SOURCE_DIR} ${espeak_ng_BINARY_DIR})\n\n  if(_build_shared_libs_bak)\n    set_target_properties(espeak-ng\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  set(espeak_ng_SOURCE_DIR ${espeak_ng_SOURCE_DIR} PARENT_SCOPE)\n\n  if(WIN32 AND MSVC)\n    target_compile_options(ucd PUBLIC\n      /wd4309\n    )\n\n    target_compile_options(espeak-ng PUBLIC\n      /wd4005\n      /wd4018\n      /wd4067\n      /wd4068\n      /wd4090\n      /wd4101\n      /wd4244\n      /wd4267\n      /wd4996\n    )\n\n    if(TARGET espeak-ng-bin)\n      target_compile_options(espeak-ng-bin PRIVATE\n        /wd4244\n        /wd4024\n        /wd4047\n        /wd4067\n        /wd4267\n        /wd4996\n      )\n    endif()\n  endif()\n\n  if(UNIX AND NOT APPLE)\n    target_compile_options(espeak-ng PRIVATE\n      -Wno-unused-result\n      -Wno-format-overflow\n      -Wno-format-truncation\n      -Wno-uninitialized\n      -Wno-format\n    )\n\n    if(TARGET espeak-ng-bin)\n      target_compile_options(espeak-ng-bin PRIVATE\n        -Wno-unused-result\n      )\n    endif()\n  endif()\n\n  target_include_directories(espeak-ng\n    INTERFACE\n      ${espeak_ng_SOURCE_DIR}/src/include\n      ${espeak_ng_SOURCE_DIR}/src/ucd-tools/src/include\n  )\n\n  if(NOT BUILD_SHARED_LIBS)\n    install(TARGETS\n      espeak-ng\n      ucd\n    DESTINATION lib)\n  
endif()\nendfunction()\n\ndownload_espeak_ng_for_piper()\n"
  },
  {
    "path": "cmake/googletest.cmake",
    "content": "function(download_googltest)\n  include(FetchContent)\n\n  set(googletest_URL  \"https://github.com/google/googletest/archive/refs/tags/v1.13.0.tar.gz\")\n  set(googletest_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/googletest-1.13.0.tar.gz\")\n  set(googletest_HASH \"SHA256=ad7fdba11ea011c1d925b3289cf4af2c66a352e18d4c7264392fead75e919363\")\n\n  # If you don't have access to the Internet,\n  # please pre-download googletest\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/googletest-1.13.0.tar.gz\n    ${CMAKE_SOURCE_DIR}/googletest-1.13.0.tar.gz\n    ${CMAKE_BINARY_DIR}/googletest-1.13.0.tar.gz\n    /tmp/googletest-1.13.0.tar.gz\n    /star-fj/fangjun/download/github/googletest-1.13.0.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(googletest_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${googletest_URL}\" googletest_URL)\n      message(STATUS \"Found local downloaded googletest: ${googletest_URL}\")\n      set(googletest_URL2)\n      break()\n    endif()\n  endforeach()\n\n  set(BUILD_GMOCK ON CACHE BOOL \"\" FORCE)\n  set(INSTALL_GTEST OFF CACHE BOOL \"\" FORCE)\n  set(gtest_disable_pthreads ON CACHE BOOL \"\" FORCE)\n  set(gtest_force_shared_crt ON CACHE BOOL \"\" FORCE)\n\n  FetchContent_Declare(googletest\n    URL\n      ${googletest_URL}\n      ${googletest_URL2}\n    URL_HASH          ${googletest_HASH}\n  )\n\n  FetchContent_GetProperties(googletest)\n  if(NOT googletest_POPULATED)\n    message(STATUS \"Downloading googletest from ${googletest_URL}\")\n    FetchContent_Populate(googletest)\n  endif()\n  message(STATUS \"googletest is downloaded to ${googletest_SOURCE_DIR}\")\n  message(STATUS \"googletest's binary dir is ${googletest_BINARY_DIR}\")\n\n  if(APPLE)\n    set(CMAKE_MACOSX_RPATH ON) # to solve the following warning on macOS\n  endif()\n  #[==[\n  -- Generating done\n    Policy CMP0042 is not set: MACOSX_RPATH is enabled by default.  
Run \"cmake\n    --help-policy CMP0042\" for policy details.  Use the cmake_policy command to\n    set the policy and suppress this warning.\n\n    MACOSX_RPATH is not specified for the following targets:\n\n      gmock\n      gmock_main\n      gtest\n      gtest_main\n\n  This warning is for project developers.  Use -Wno-dev to suppress it.\n  ]==]\n\n  add_subdirectory(${googletest_SOURCE_DIR} ${googletest_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  target_include_directories(gtest\n    INTERFACE\n      ${googletest_SOURCE_DIR}/googletest/include\n      ${googletest_SOURCE_DIR}/googlemock/include\n  )\nendfunction()\n\ndownload_googltest()\n"
  },
  {
    "path": "cmake/hclust-cpp.cmake",
    "content": "function(download_hclust_cpp)\n  include(FetchContent)\n\n  # The latest release as of 2026.02.25\n  set(hclust_cpp_URL  \"https://github.com/csukuangfj/hclust-cpp/archive/refs/tags/2026-02-25.tar.gz\")\n  set(hclust_cpp_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/hclust-cpp-2026-02-25.tar.gz\")\n  set(hclust_cpp_HASH \"SHA256=8f14e024c709d73afb40ae69cb22de4b73dba67cbce40f2e518813da8139ab56\")\n\n  # If you don't have access to the Internet,\n  # please pre-download hclust-cpp\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/hclust-cpp-2026-02-25.tar.gz\n    ${CMAKE_SOURCE_DIR}/hclust-cpp-2026-02-25.tar.gz\n    ${CMAKE_BINARY_DIR}/hclust-cpp-2026-02-25.tar.gz\n    /tmp/hclust-cpp-2026-02-25.tar.gz\n    /star-fj/fangjun/download/github/hclust-cpp-2026-02-25.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(hclust_cpp_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${hclust_cpp_URL}\" hclust_cpp_URL)\n      message(STATUS \"Found local downloaded hclust_cpp: ${hclust_cpp_URL}\")\n      set(hclust_cpp_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(hclust_cpp\n    URL\n      ${hclust_cpp_URL}\n      ${hclust_cpp_URL2}\n    URL_HASH          ${hclust_cpp_HASH}\n  )\n\n  # hclust-cpp is header-only with no CMakeLists.txt, so we just need the\n  # source directory populated. Use FetchContent_MakeAvailable on CMake 3.24+\n  # (which handles missing CMakeLists.txt gracefully and avoids the\n  # FetchContent_Populate deprecation warning on CMake 3.28+). 
Fall back to\n  # the older FetchContent_Populate pattern on CMake < 3.24.\n  if(CMAKE_VERSION VERSION_GREATER_EQUAL \"3.24\")\n    FetchContent_MakeAvailable(hclust_cpp)\n  else()\n    FetchContent_GetProperties(hclust_cpp)\n    if(NOT hclust_cpp_POPULATED)\n      message(STATUS \"Downloading hclust_cpp from ${hclust_cpp_URL}\")\n      FetchContent_Populate(hclust_cpp)\n    endif()\n  endif()\n\n  message(STATUS \"hclust_cpp is downloaded to ${hclust_cpp_SOURCE_DIR}\")\n  message(STATUS \"hclust_cpp's binary dir is ${hclust_cpp_BINARY_DIR}\")\n  include_directories(${hclust_cpp_SOURCE_DIR})\nendfunction()\n\ndownload_hclust_cpp()\n"
  },
  {
    "path": "cmake/json.cmake",
    "content": "function(download_json)\n  include(FetchContent)\n\n  set(json_URL  \"https://github.com/nlohmann/json/archive/refs/tags/v3.12.0.tar.gz\")\n  set(json_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/json-3.12.0.tar.gz\")\n  set(json_HASH \"SHA256=4b92eb0c06d10683f7447ce9406cb97cd4b453be18d7279320f7b2f025c10187\")\n\n  # If you don't have access to the Internet,\n  # please pre-download json\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/json-3.12.0.tar.gz\n    ${CMAKE_SOURCE_DIR}/json-3.12.0.tar.gz\n    ${CMAKE_BINARY_DIR}/json-3.12.0.tar.gz\n    /tmp/json-3.12.0.tar.gz\n    /star-fj/fangjun/download/github/json-3.12.0.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(json_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${json_URL}\" json_URL)\n      message(STATUS \"Found local downloaded json: ${json_URL}\")\n      set(json_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(json\n    URL               ${json_URL} ${json_URL2}\n    URL_HASH          ${json_HASH}\n  )\n\n  FetchContent_GetProperties(json)\n  if(NOT json_POPULATED)\n    message(STATUS \"Downloading json from ${json_URL}\")\n    FetchContent_Populate(json)\n  endif()\n  message(STATUS \"json is downloaded to ${json_SOURCE_DIR}\")\n  message(STATUS \"json's binary dir is ${json_BINARY_DIR}\")\n  include_directories(${json_SOURCE_DIR}/include)\n\n  add_subdirectory(${json_SOURCE_DIR} ${json_BINARY_DIR} EXCLUDE_FROM_ALL)\nendfunction()\n\ndownload_json()\n\n"
  },
  {
    "path": "cmake/kaldi-decoder.cmake",
    "content": "function(download_kaldi_decoder)\n  include(FetchContent)\n\n  set(kaldi_decoder_URL  \"https://github.com/k2-fsa/kaldi-decoder/archive/refs/tags/v0.2.11.tar.gz\")\n  set(kaldi_decoder_HASH \"SHA256=85ca462535592541eb5ba6d21843009cf34738f51b28b71f84882a3694b528bf\")\n\n  set(KALDI_DECODER_BUILD_PYTHON OFF CACHE BOOL \"\" FORCE)\n  set(KALDI_DECODER_ENABLE_TESTS OFF CACHE BOOL \"\" FORCE)\n  set(KALDIFST_BUILD_PYTHON OFF CACHE BOOL \"\" FORCE)\n\n  # If you don't have access to the Internet,\n  # please pre-download kaldi-decoder\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/kaldi-decoder-0.2.11.tar.gz\n    ${CMAKE_SOURCE_DIR}/kaldi-decoder-0.2.11.tar.gz\n    ${CMAKE_BINARY_DIR}/kaldi-decoder-0.2.11.tar.gz\n    /tmp/kaldi-decoder-0.2.11.tar.gz\n    /star-fj/fangjun/download/github/kaldi-decoder-0.2.11.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(kaldi_decoder_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${kaldi_decoder_URL}\" kaldi_decoder_URL)\n      message(STATUS \"Found local downloaded kaldi-decoder: ${kaldi_decoder_URL}\")\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(kaldi_decoder\n    URL\n      ${kaldi_decoder_URL}\n    URL_HASH          ${kaldi_decoder_HASH}\n  )\n\n  FetchContent_GetProperties(kaldi_decoder)\n  if(NOT kaldi_decoder_POPULATED)\n    message(STATUS \"Downloading kaldi-decoder from ${kaldi_decoder_URL}\")\n    FetchContent_Populate(kaldi_decoder)\n  endif()\n  message(STATUS \"kaldi-decoder is downloaded to ${kaldi_decoder_SOURCE_DIR}\")\n  message(STATUS \"kaldi-decoder's binary dir is ${kaldi_decoder_BINARY_DIR}\")\n\n  include_directories(${kaldi_decoder_SOURCE_DIR})\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${kaldi_decoder_SOURCE_DIR} ${kaldi_decoder_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(\n  
      kaldi-decoder-core\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  if(WIN32 AND MSVC)\n    target_compile_options(kaldi-decoder-core PUBLIC\n      /wd4018\n      /wd4291\n    )\n  endif()\n\n  target_include_directories(kaldi-decoder-core\n    INTERFACE\n      ${kaldi_decoder_SOURCE_DIR}/\n  )\n  if(NOT BUILD_SHARED_LIBS)\n    install(TARGETS\n      kaldi-decoder-core\n      kaldifst_core\n      fst\n      fstfar\n    DESTINATION lib)\n  endif()\nendfunction()\n\ndownload_kaldi_decoder()\n\n"
  },
  {
    "path": "cmake/kaldi-native-fbank.cmake",
    "content": "function(download_kaldi_native_fbank)\n  include(FetchContent)\n\n  set(kaldi_native_fbank_URL   \"https://github.com/csukuangfj/kaldi-native-fbank/archive/refs/tags/v1.22.3.tar.gz\")\n  set(kaldi_native_fbank_URL2  \"https://hf-mirror.com/csukuangfj/sherpa-ncnn-cmake-deps/resolve/main/kaldi-native-fbank-1.22.3.tar.gz\")\n  set(kaldi_native_fbank_HASH \"SHA256=9176cc66fc7ce1edf85cf355b06e320c57db6297df74277f575183468893cf61\")\n\n  set(KALDI_NATIVE_FBANK_BUILD_TESTS OFF CACHE BOOL \"\" FORCE)\n  set(KALDI_NATIVE_FBANK_BUILD_PYTHON OFF CACHE BOOL \"\" FORCE)\n  set(KALDI_NATIVE_FBANK_ENABLE_CHECK OFF CACHE BOOL \"\" FORCE)\n\n  # If you don't have access to the Internet,\n  # please pre-download kaldi-native-fbank\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/kaldi-native-fbank-1.22.3.tar.gz\n    ${CMAKE_SOURCE_DIR}/kaldi-native-fbank-1.22.3.tar.gz\n    ${CMAKE_BINARY_DIR}/kaldi-native-fbank-1.22.3.tar.gz\n    /tmp/kaldi-native-fbank-1.22.3.tar.gz\n    /star-fj/fangjun/download/github/kaldi-native-fbank-1.22.3.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(kaldi_native_fbank_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${kaldi_native_fbank_URL}\" kaldi_native_fbank_URL)\n      message(STATUS \"Found local downloaded kaldi-native-fbank: ${kaldi_native_fbank_URL}\")\n      set(kaldi_native_fbank_URL2 )\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(kaldi_native_fbank\n    URL\n      ${kaldi_native_fbank_URL}\n      ${kaldi_native_fbank_URL2}\n    URL_HASH          ${kaldi_native_fbank_HASH}\n  )\n\n  FetchContent_GetProperties(kaldi_native_fbank)\n  if(NOT kaldi_native_fbank_POPULATED)\n    message(STATUS \"Downloading kaldi-native-fbank from ${kaldi_native_fbank_URL}\")\n    FetchContent_Populate(kaldi_native_fbank)\n  endif()\n  message(STATUS \"kaldi-native-fbank is downloaded to ${kaldi_native_fbank_SOURCE_DIR}\")\n  message(STATUS \"kaldi-native-fbank's binary dir is 
${kaldi_native_fbank_BINARY_DIR}\")\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${kaldi_native_fbank_SOURCE_DIR} ${kaldi_native_fbank_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(kaldi-native-fbank-core\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  target_include_directories(kaldi-native-fbank-core\n    INTERFACE\n      ${kaldi_native_fbank_SOURCE_DIR}/\n  )\n\n  if(NOT BUILD_SHARED_LIBS)\n    install(TARGETS kaldi-native-fbank-core kissfft DESTINATION lib)\n  endif()\nendfunction()\n\ndownload_kaldi_native_fbank()\n"
  },
  {
    "path": "cmake/kaldifst.cmake",
    "content": "function(download_kaldifst)\n  include(FetchContent)\n\n  set(kaldifst_URL  \"https://github.com/k2-fsa/kaldifst/archive/refs/tags/v1.7.17.tar.gz\")\n  set(kaldifst_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/kaldifst-1.7.17.tar.gz\")\n  set(kaldifst_HASH \"SHA256=c4b701a23a400bda8032586b02c7e0d5e813a765832df60c23e6df9e62b010f4\")\n\n  # If you don't have access to the Internet,\n  # please pre-download kaldifst\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/kaldifst-1.7.17.tar.gz\n    ${CMAKE_SOURCE_DIR}/kaldifst-1.7.17.tar.gz\n    ${CMAKE_BINARY_DIR}/kaldifst-1.7.17.tar.gz\n    /tmp/kaldifst-1.7.17.tar.gz\n    /star-fj/fangjun/download/github/kaldifst-1.7.17.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(kaldifst_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${kaldifst_URL}\" kaldifst_URL)\n      message(STATUS \"Found local downloaded kaldifst: ${kaldifst_URL}\")\n      set(kaldifst_URL2)\n      break()\n    endif()\n  endforeach()\n\n  set(KALDIFST_BUILD_TESTS OFF CACHE BOOL \"\" FORCE)\n  set(KALDIFST_BUILD_PYTHON OFF CACHE BOOL \"\" FORCE)\n\n  FetchContent_Declare(kaldifst\n    URL               ${kaldifst_URL} ${kaldifst_URL2}\n    URL_HASH          ${kaldifst_HASH}\n  )\n\n  FetchContent_GetProperties(kaldifst)\n  if(NOT kaldifst_POPULATED)\n    message(STATUS \"Downloading kaldifst from ${kaldifst_URL}\")\n    FetchContent_Populate(kaldifst)\n  endif()\n  message(STATUS \"kaldifst is downloaded to ${kaldifst_SOURCE_DIR}\")\n  message(STATUS \"kaldifst's binary dir is ${kaldifst_BINARY_DIR}\")\n\n  list(APPEND CMAKE_MODULE_PATH ${kaldifst_SOURCE_DIR}/cmake)\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${kaldifst_SOURCE_DIR} ${kaldifst_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(kaldifst_core\n      PROPERTIES\n    
    POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  target_include_directories(kaldifst_core\n    PUBLIC\n      ${kaldifst_SOURCE_DIR}/\n  )\n\n  set_target_properties(kaldifst_core PROPERTIES OUTPUT_NAME \"sherpa-onnx-kaldifst-core\")\n  # installed in ./kaldi-decoder.cmake\nendfunction()\n\ndownload_kaldifst()\n"
  },
  {
    "path": "cmake/onnxruntime-linux-aarch64-gpu.cmake",
    "content": "# Copyright (c)  2022-2024  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL aarch64)\n  message(FATAL_ERROR \"This file is for aarch64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT SHERPA_ONNX_ENABLE_GPU)\n  message(FATAL_ERROR \"This file is for NVIDIA GPU only. Given SHERPA_ONNX_ENABLE_GPU: ${SHERPA_ONNX_ENABLE_GPU}\")\nendif()\n\nmessage(WARNING \"\\\nSHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION: ${SHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION}\nIf you use Jetson nano b01, then please pass\n   -DSHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION=1.11.0\nto cmake (You need to make sure CUDA 10.2 is available on your board).\n\nIf you use Jetson Orin NX, then please pass\n   -DSHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION=1.16.0\nto cmake (You need to make sure CUDA 11.4 is available on your board).\n\nIf you use NVIDIA Jetson Orin Nano Engineering Reference Developer Kit\nSuper - Jetpack 6.2 [L4T 36.4.3], then please pass\n   -DSHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION=1.18.1\nto cmake (You need to make sure CUDA 12.6 is available on your board).\n\")\n\nset(v ${SHERPA_ONNX_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION})\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v${v}/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\")\n\nif(v STREQUAL \"1.11.0\")\n  set(onnxruntime_HASH 
\"SHA256=36eded935551e23aead09d4173bdf0bd1e7b01fdec15d77f97d6e34029aa60d7\")\nelseif(v STREQUAL \"1.16.0\")\n  set(onnxruntime_HASH \"SHA256=4c09d5acf2c2682b4eab1dc2f1ad98fc1fde5f5f1960063e337983ba59379a4b\")\nelseif(v STREQUAL \"1.18.0\")\n  set(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.18.0/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-1.18.0.tar.bz2\")\n  set(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-1.18.0.tar.bz2\")\n  set(onnxruntime_HASH \"SHA256=da437a69be982fc28ca7d60d0c5ccce2f48d027fa888cc76458cdc05410f4e2d\")\nelseif(v STREQUAL \"1.18.1\")\n  set(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.18.1/onnxruntime-linux-aarch64-gpu-cuda12-1.18.1.tar.bz2\")\n  set(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-aarch64-gpu-cuda12-1.18.1.tar.bz2\")\n  set(onnxruntime_HASH \"SHA256=1e91064ec13a6fabb6b670da8a2da4f369c1dbd50a5be77a879b2473e7afc0a6\")\nelse()\n  message(FATAL_ERROR \"Unsupported onnxruntime version ${v} for Linux aarch64\")\nendif()\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\n  /tmp/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\n  /star-fj/fangjun/download/github/onnxruntime-linux-aarch64-gpu-${v}.tar.bz2\n  #\n  $ENV{HOME}/Downloads/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-${v}.tar.bz2\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-${v}.tar.bz2\n  
${CMAKE_BINARY_DIR}/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-${v}.tar.bz2\n  /tmp/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-${v}.tar.bz2\n  /star-fj/fangjun/download/github/onnxruntime-linux-aarch64-gpu-cuda12.2-cudnn8.9.4-trt8.6.2-${v}.tar.bz2\n  #\n  $ENV{HOME}/Downloads/onnxruntime-linux-aarch64-gpu-cuda12-${v}.tar.bz2\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-aarch64-gpu-cuda12-${v}.tar.bz2\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-aarch64-gpu-cuda12-${v}.tar.bz2\n  /tmp/onnxruntime-linux-aarch64-gpu-cuda12-${v}.tar.bz2\n  /star-fj/fangjun/download/github/onnxruntime-linux-aarch64-gpu-cuda12-${v}.tar.bz2\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-aarch64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL aarch64)\n  message(FATAL_ERROR \"This file is for aarch64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\")\nset(onnxruntime_HASH \"SHA256=7a603d836aa27d37197eb76f055d3c9e4e81d3a5a343c60000d7b6345bc6c80f\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\n  /tmp/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-aarch64-static_lib-1.23.2-glibc2_17.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    
${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-aarch64.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL aarch64)\n  message(FATAL_ERROR \"This file is for aarch64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=2a40a5323827bc59844d00ffdd3697d5e30dccb691233054bace0dc61cfa8341\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\n  /tmp/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-aarch64-glibc2_17-Release-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    
${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nset(location_onnxruntime \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime.so\")\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-arm-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_SYSTEM_PROCESSOR STREQUAL arm OR CMAKE_SYSTEM_PROCESSOR STREQUAL armv7l))\n  message(FATAL_ERROR \"This file is for arm only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\n# requires gcc 11\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-arm-static_lib-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-arm-static_lib-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=334a51dbdc6812f91ee88356cedca14b097ed2907c80aa2b91670680e155ad9f\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-arm-static_lib-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-arm-static_lib-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-arm-static_lib-1.23.2.zip\n  /tmp/onnxruntime-linux-arm-static_lib-1.23.2.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-arm-static_lib-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH         
 ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-arm.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_SYSTEM_PROCESSOR STREQUAL arm OR CMAKE_SYSTEM_PROCESSOR STREQUAL armv7l))\n  message(FATAL_ERROR \"This file is for arm only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\n# requires gcc 11\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-arm-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-arm-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=c00aae409731930433badaf7d629499b9a1dcfac4dd67ad6b6a4838349bd6ba5\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-arm-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-arm-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-arm-1.23.2.zip\n  /tmp/onnxruntime-linux-arm-1.23.2.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-arm-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          
${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-riscv64-spacemit.cmake",
    "content": "message(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL riscv64)\n  message(FATAL_ERROR \"This file is for riscv64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}, SHERPA_ONNX_ENABLE_SPACEMIT: ${SHERPA_ONNX_ENABLE_SPACEMIT}\")\nendif()\n\nset(onnxruntime_pkg_name \"spacemit-ort.riscv64.2.0.1.tar.gz\")\nset(onnxruntime_URL  \"https://archive.spacemit.com/spacemit-ai/onnxruntime/${onnxruntime_pkg_name}\")\nset(onnxruntime_HASH \"SHA256=8a15035aca34d5fd95f24444d4c7843265c1a81f49d84ec6fe9c6d0fdf5b55cf\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_pkg_name}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_pkg_name}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_pkg_name}\n  /tmp/${onnxruntime_pkg_name}\n  /star-fj/fangjun/download/github/${onnxruntime_pkg_name}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime 
is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime\n  NAMES onnxruntime\n  PATHS \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nfind_library(location_spacemit_ep\n  NAMES spacemit_ep\n  PATHS \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_spacemit_ep: ${location_spacemit_ep}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\nadd_library(spacemit_ep SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  IMPORTED_LOCATION \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime.so\"\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include/\"\n)\n\nset_target_properties(spacemit_ep PROPERTIES\n  IMPORTED_LOCATION ${location_spacemit_ep}\n  IMPORTED_LOCATION \"${onnxruntime_SOURCE_DIR}/lib/libspacemit_ep.so\"\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include/\"\n)\n\nfile(GLOB onnxruntime_lib_files\n  \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n\nfile(GLOB spacemit_ep_lib_files\n  \"${onnxruntime_SOURCE_DIR}/lib/libspacemit_ep*\")\nmessage(STATUS \"spacemit_ep lib files: ${spacemit_ep_lib_files}\")\ninstall(FILES ${spacemit_ep_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-riscv64-static.cmake",
    "content": "# Copyright (c)  2022-2024  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL riscv64)\n  message(FATAL_ERROR \"This file is for riscv64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.18.0/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\")\nset(onnxruntime_HASH \"SHA256=77ecc51d8caf0953755db6edcdec2fc03bce3f6d379bedd635be50bb95f88da5\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\n  /tmp/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-riscv64-static_lib-1.18.0.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          
${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-riscv64.cmake",
    "content": "# Copyright (c)  2022-2024  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL riscv64)\n  message(FATAL_ERROR \"This file is for riscv64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.14.1/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\")\nset(onnxruntime_HASH \"SHA256=c2cbc5af081ff82f46640befd85433811486daaf28e702163c6e4e75020fde81\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\n  /tmp/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-riscv64-glibc2_17-Release-1.14.1.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    
${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include/\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-x86_64-gpu.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64)\n  message(FATAL_ERROR \"This file is for x86_64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT SHERPA_ONNX_ENABLE_GPU)\n  message(FATAL_ERROR \"This file is for NVIDIA GPU only. Given SHERPA_ONNX_ENABLE_GPU: ${SHERPA_ONNX_ENABLE_GPU}\")\nendif()\n\n\n# Requires CUDA 12, cudnn 9\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\")\nset(onnxruntime_HASH \"SHA256=e2f622513212304447e34512b99ae4eabb4fd8870dd1baac895f222179dede19\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\n  /tmp/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-x64-gpu-1.23.2-patched.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    
set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-x86_64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64)\n  message(FATAL_ERROR \"This file is for x86_64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\")\nset(onnxruntime_HASH \"SHA256=93a52b9d93a0932259a03090291be861ba21ad4b1b58057d3a0f57a4c4108671\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\n  /tmp/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-x64-static_lib-1.23.2-glibc2_17.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    
${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-linux-x86_64.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Linux)\n  message(FATAL_ERROR \"This file is for Linux only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64)\n  message(FATAL_ERROR \"This file is for x86_64 only. Given: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=77ea3532dfdd8d5c66918429f7eacd80c1fea834941a14746adf3109f8e7b830\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\n  /tmp/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\n  /star-fj/fangjun/download/github/onnxruntime-linux-x64-glibc2_17-Release-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH      
    ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-arm64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-osx-arm64-static_lib-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-arm64-static_lib-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=febeb7116f075409c554434a317cd51a2efb26abbf364c2ed77191f728a56633\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-arm64-static_lib-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-arm64-static_lib-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-arm64-static_lib-1.23.2.zip\n  /tmp/onnxruntime-osx-arm64-static_lib-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS 
\"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n\n# disable coreml when using static onnxruntime lib\nadd_definitions(-DSHERPA_ONNX_DISABLE_COREML)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-arm64.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-osx-arm64-1.23.2.tgz\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-arm64-1.23.2.tgz\")\nset(onnxruntime_HASH \"SHA256=b4d513ab2b26f088c66891dbbc1408166708773d7cc4163de7bdca0e9bbb7856\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-arm64-1.23.2.tgz\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-arm64-1.23.2.tgz\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-arm64-1.23.2.tgz\n  /tmp/onnxruntime-osx-arm64-1.23.2.tgz\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  
FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*dylib\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-universal-static.cmake",
    "content": "# Possible values for CMAKE_SYSTEM_NAME: Linux, Windows, Darwin\n\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-osx-universal2-static_lib-1.23.2.zip\")\nset(onnxruntime_URL2  \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-universal2-static_lib-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=9ea206a621d6e5550ddb9de0b96c4f666b074620f5c685b0479b5fa02c0bba76\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-universal2-static_lib-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-universal2-static_lib-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-universal2-static_lib-1.23.2.zip\n  /tmp/onnxruntime-osx-universal2-static_lib-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          
${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n\n# disable coreml when using static onnxruntime lib\nadd_definitions(-DSHERPA_ONNX_DISABLE_COREML)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-universal.cmake",
    "content": "# Possible values for CMAKE_SYSTEM_NAME: Linux, Windows, Darwin\n\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-osx-universal2-1.23.2.tgz\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-universal2-1.23.2.tgz\")\nset(onnxruntime_HASH \"SHA256=49ae8e3a66ccb18d98ad3fe7f5906b6d7887df8a5edd40f49eb2b14e20885809\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-universal2-1.23.2.tgz\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-universal2-1.23.2.tgz\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-universal2-1.23.2.tgz\n  /tmp/onnxruntime-osx-universal2-1.23.2.tgz\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading 
onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*dylib\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-x86_64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=dc632688d5b48e478742ba1ae2d9ebc78ab6cee18fa6eb61e2fb03b8a80d1b66\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\n  /tmp/onnxruntime-osx-x86_64-static_lib-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  
message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n\n# disable coreml when using static onnxruntime lib\nadd_definitions(-DSHERPA_ONNX_DISABLE_COREML)\n"
  },
  {
    "path": "cmake/onnxruntime-osx-x86_64.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_OSX_ARCHITECTURES: ${CMAKE_OSX_ARCHITECTURES}\")\nmessage(STATUS \"CMAKE_APPLE_SILICON_PROCESSOR : ${CMAKE_APPLE_SILICON_PROCESSOR}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Darwin)\n  message(FATAL_ERROR \"This file is for macOS only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-osx-x86_64-1.23.2.tgz\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-osx-x86_64-1.23.2.tgz\")\nset(onnxruntime_HASH \"SHA256=d10359e16347b57d9959f7e80a225a5b4a66ed7d7e007274a15cae86836485a6\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-osx-x86_64-1.23.2.tgz\n  ${CMAKE_SOURCE_DIR}/onnxruntime-osx-x86_64-1.23.2.tgz\n  ${CMAKE_BINARY_DIR}/onnxruntime-osx-x86_64-1.23.2.tgz\n  /tmp/onnxruntime-osx-x86_64-1.23.2.tgz\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  
FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/libonnxruntime*dylib\")\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-wasm-simd.cmake",
    "content": "# Copyright (c)  2022-2024  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nif(NOT SHERPA_ONNX_ENABLE_WASM)\n  message(FATAL_ERROR \"This file is for WebAssembly.\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"BUILD_SHARED_LIBS should be OFF for WebAssembly\")\nendif()\n\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.17.1/onnxruntime-wasm-static_lib-simd-1.17.1.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/onnxruntime-wasm-static_lib-simd-1.17.1.zip\")\nset(onnxruntime_HASH \"SHA256=8f07778e4233cf5a61a9d0795d90c5497177fbe8a46b701fda2d8d4e2b11cef8\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-wasm-static_lib-simd-1.17.1.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-wasm-static_lib-simd-1.17.1.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-wasm-static_lib-simd-1.17.1.zip\n  /tmp/onnxruntime-wasm-static_lib-simd-1.17.1.zip\n  /star-fj/fangjun/download/github/onnxruntime-wasm-static_lib-simd-1.17.1.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to 
${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/lib*.a\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\n"
  },
  {
    "path": "cmake/onnxruntime-win-arm64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL ARM64 OR CMAKE_VS_PLATFORM_NAME STREQUAL arm64))\n  message(FATAL_ERROR \"This file is for Windows arm64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=03166e5c7a830586b8772ff166611f0806fbc0ca76bcd177113fe3275a8af59b\")\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=61941c8d3058ebb6f9b83c9a99e4ae840708cc99702921d7c950a071f04f26ed\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=304bd830680b773ed1fce35758f317f2b50b2278359ede292e33bd67b57290b7\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=c24f46f689f9dbb8d8cd86d4c1d83f091da82d78548371d59446566d787f8cf4\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=26bae6d13335ecb229baa545d8c3b910998ace4f3617a4046640b9e6ef208dd7\")\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=f3e6d4550ac00c9f8f7ef647974087627f41d063e9899b67a028cca6c34521ab\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=ddef98c48243b0d7209edb9d416566405fd793551f63962fbb4f049b899136b0\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=4b28704e04f25b0839004ca828306f387814ada953750c4103f7076768fcf8a1\")\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Supported CMAKE_BUILD_TYPE values are: Release, Debug, RelWithDebInfo, MinSizeRel. 
Given: ${CMAKE_BUILD_TYPE}\")\nendif()\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  set(onnxruntime_crt \"MT\")\nelse()\n  set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\nset(onnxruntime_filename \"onnxruntime-win-arm64-static_lib-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.lib\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  install(FILES 
${onnxruntime_lib_files} DESTINATION ..)\nelse()\n  install(FILES ${onnxruntime_lib_files} DESTINATION lib)\nendif()\n"
  },
  {
    "path": "cmake/onnxruntime-win-arm64.cmake",
    "content": "# Copyright (c)  2022-2024  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL ARM64 OR CMAKE_VS_PLATFORM_NAME STREQUAL arm64))\n  message(FATAL_ERROR \"This file is for Windows arm64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Please set CMAKE_BUILD_TYPE to Release, Debug, RelWithDebInfo or MinSizeRel\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=c9329ff4e8acdd0a07b40465d6521e15bb21eb0a4d9bc7843803e460dc4d02f0\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=342470bef5681452fb9add5668debc5e79b5cd01a8c8866fc7c47f81dbd8eb70\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=7bcbc8fd66fa1b0783dfdbd66eeb9ad4c023a1080ccc01e80412c2347aeddfa1\")\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=ee3c257f4c56f91a0a6aa3cce9a29dcd654f432fe5a399246c8bdb86c9bb5900\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=722f044c84947e37fec7941f5dc38aa8f71cdf0b4bfa57cb590a4ed634b36ddc\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=025b0dd682309482b3146b5c3a80e814ad9dec1e93ee8139954857b97d5798a9\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=b8d5d508b21b4604d241ac11384fa6906daf556c6667011d9a8c6806d9549b74\")\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=08ed42a71fbce04e10a3192510a2c578a20c1d3a00652187f85002c54db84548\")\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  set(onnxruntime_crt \"MT\")\nelse()\n  
set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_filename \"onnxruntime-win-arm64-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nset_property(TARGET onnxruntime\n  PROPERTY\n    IMPORTED_IMPLIB 
\"${onnxruntime_SOURCE_DIR}/lib/onnxruntime.lib\"\n)\n\nfile(COPY ${onnxruntime_SOURCE_DIR}/lib/onnxruntime.dll\n  DESTINATION\n    ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.dll\")\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\ninstall(FILES ${onnxruntime_lib_files} DESTINATION bin)\n"
  },
  {
    "path": "cmake/onnxruntime-win-x64-directml.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL X64 OR CMAKE_VS_PLATFORM_NAME STREQUAL x64))\n  message(FATAL_ERROR \"This file is for Windows x64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT SHERPA_ONNX_ENABLE_DIRECTML)\n  message(FATAL_ERROR \"This file is for DirectML. Given SHERPA_ONNX_ENABLE_DIRECTML: ${SHERPA_ONNX_ENABLE_DIRECTML}\")\nendif()\n\nif(location_onnxruntime_header_dir AND location_onnxruntime_lib)\n    message(\"Use preinstall onnxruntime with directml: ${location_onnxruntime_lib}\")\nelse()\n\n    set(onnxruntime_URL  \"https://globalcdn.nuget.org/packages/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\")\n    set(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\")\n    set(onnxruntime_HASH \"SHA256=c8ae7623385b19cd5de968d0df5383e13b97d1b3a6771c9177eac15b56013a5a\")\n\n    # If you don't have access to the Internet,\n    # please download onnxruntime to one of the following locations.\n    # You can add more if you want.\n    set(possible_file_locations\n        $ENV{HOME}/Downloads/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\n        ${PROJECT_SOURCE_DIR}/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\n        ${PROJECT_BINARY_DIR}/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\n        /tmp/microsoft.ml.onnxruntime.directml.1.14.1.nupkg\n    )\n\n    foreach(f IN LISTS possible_file_locations)\n     
 if(EXISTS ${f})\n        set(onnxruntime_URL  \"${f}\")\n        file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n        message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n        set(onnxruntime_URL2)\n        break()\n      endif()\n    endforeach()\n\n    FetchContent_Declare(onnxruntime\n      URL\n        ${onnxruntime_URL}\n        ${onnxruntime_URL2}\n      URL_HASH          ${onnxruntime_HASH}\n    )\n\n    FetchContent_GetProperties(onnxruntime)\n    if(NOT onnxruntime_POPULATED)\n      message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n      FetchContent_Populate(onnxruntime)\n    endif()\n    message(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n    find_library(location_onnxruntime onnxruntime\n      PATHS\n      \"${onnxruntime_SOURCE_DIR}/runtimes/win-x64/native\"\n      NO_CMAKE_SYSTEM_PATH\n    )\n\n    message(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\n    add_library(onnxruntime SHARED IMPORTED)\n\n    set_target_properties(onnxruntime PROPERTIES\n      IMPORTED_LOCATION ${location_onnxruntime}\n      INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/build/native/include\"\n    )\n\n    set_property(TARGET onnxruntime\n      PROPERTY\n        IMPORTED_IMPLIB \"${onnxruntime_SOURCE_DIR}/runtimes/win-x64/native/onnxruntime.lib\"\n    )\n\n    file(COPY ${onnxruntime_SOURCE_DIR}/runtimes/win-x64/native/onnxruntime.dll\n      DESTINATION\n        ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n    )\n\n    file(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/runtimes/win-x64/native/onnxruntime.*\")\n\n    message(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\n    if(SHERPA_ONNX_ENABLE_PYTHON)\n      install(FILES ${onnxruntime_lib_files} DESTINATION ..)\n    else()\n      install(FILES ${onnxruntime_lib_files} DESTINATION lib)\n    endif()\n\n    install(FILES ${onnxruntime_lib_files} DESTINATION bin)\n\nendif()\n\n# Setup 
DirectML\n\nset(directml_URL \"https://www.nuget.org/api/v2/package/Microsoft.AI.DirectML/1.15.0\")\nset(directml_HASH \"SHA256=10d175f8e97447712b3680e3ac020bbb8eafdf651332b48f09ffee2eec801c23\")\n\nset(possible_directml_file_locations\n    $ENV{HOME}/Downloads/Microsoft.AI.DirectML.1.15.0.nupkg\n    ${PROJECT_SOURCE_DIR}/Microsoft.AI.DirectML.1.15.0.nupkg\n    ${PROJECT_BINARY_DIR}/Microsoft.AI.DirectML.1.15.0.nupkg\n    /tmp/Microsoft.AI.DirectML.1.15.0.nupkg\n)\n\nforeach(f IN LISTS possible_directml_file_locations)\n  if(EXISTS ${f})\n    set(directml_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${directml_URL}\" directml_URL)\n    message(STATUS \"Found local downloaded DirectML: ${directml_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(directml\n  URL\n    ${directml_URL}\n  URL_HASH ${directml_HASH}\n)\n\nFetchContent_GetProperties(directml)\nif(NOT directml_POPULATED)\n  message(STATUS \"Downloading DirectML from ${directml_URL}\")\n  FetchContent_Populate(directml)\nendif()\nmessage(STATUS \"DirectML is downloaded to ${directml_SOURCE_DIR}\")\n\nfind_library(location_directml DirectML\n  PATHS\n  \"${directml_SOURCE_DIR}/bin/x64-win\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_directml: ${location_directml}\")\n\nadd_library(directml SHARED IMPORTED)\n\nset_target_properties(directml PROPERTIES\n  IMPORTED_LOCATION ${location_directml}\n  INTERFACE_INCLUDE_DIRECTORIES \"${directml_SOURCE_DIR}/bin/x64-win\"\n)\n\nset_property(TARGET directml\n  PROPERTY\n    IMPORTED_IMPLIB \"${directml_SOURCE_DIR}/bin/x64-win/DirectML.lib\"\n)\n\nfile(COPY ${directml_SOURCE_DIR}/bin/x64-win/DirectML.dll\n  DESTINATION\n    ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n)\n\nfile(GLOB directml_lib_files \"${directml_SOURCE_DIR}/bin/x64-win/DirectML.*\")\n\nmessage(STATUS \"DirectML lib files: ${directml_lib_files}\")\n\ninstall(FILES ${directml_lib_files} DESTINATION lib)\ninstall(FILES ${directml_lib_files} DESTINATION bin)\n"
  },
  {
    "path": "cmake/onnxruntime-win-x64-gpu.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL X64 OR CMAKE_VS_PLATFORM_NAME STREQUAL x64))\n  message(FATAL_ERROR \"This file is for Windows x64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT SHERPA_ONNX_ENABLE_GPU)\n  message(FATAL_ERROR \"This file is for NVIDIA GPU only. Given SHERPA_ONNX_ENABLE_GPU: ${SHERPA_ONNX_ENABLE_GPU}\")\nendif()\n\n# Requires cuda 12.x, cudnn 9.x\nset(onnxruntime_URL  \"https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-win-x64-gpu-1.23.2.zip\")\nset(onnxruntime_URL2 \"https://hf-mirror.com/csukuangfj/onnxruntime-libs/resolve/main/1.23.2/onnxruntime-win-x64-gpu-1.23.2.zip\")\nset(onnxruntime_HASH \"SHA256=e77afdbbc2b8cb6da4e5a50d89841b48c44f3e47dce4fb87b15a2743786d0bb9\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/onnxruntime-win-x64-gpu-1.23.2.zip\n  ${CMAKE_SOURCE_DIR}/onnxruntime-win-x64-gpu-1.23.2.zip\n  ${CMAKE_BINARY_DIR}/onnxruntime-win-x64-gpu-1.23.2.zip\n  /tmp/onnxruntime-win-x64-gpu-1.23.2.zip\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    set(onnxruntime_URL2)\n    break()\n  
endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n    ${onnxruntime_URL2}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nset_property(TARGET onnxruntime\n  PROPERTY\n    IMPORTED_IMPLIB \"${onnxruntime_SOURCE_DIR}/lib/onnxruntime.lib\"\n)\n\nfile(COPY ${onnxruntime_SOURCE_DIR}/lib/onnxruntime.dll\n  DESTINATION\n    ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.dll\")\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\ninstall(FILES ${onnxruntime_lib_files} DESTINATION bin)\n"
  },
  {
    "path": "cmake/onnxruntime-win-x64-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL X64 OR CMAKE_VS_PLATFORM_NAME STREQUAL x64))\n  message(FATAL_ERROR \"This file is for Windows x64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Supported CMAKE_BUILD_TYPE values are: Release, Debug, RelWithDebInfo, MinSizeRel. Given ${CMAKE_BUILD_TYPE}\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=c853a7646f9ebb0bf900e547141ef3e68d3ec888b27756ecb5f32476a6472391\")\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=efd7c3aa9fa10a380e5534ead76627790dd533142307e3fd1de2d1fba533dd90\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=4cf1733121eee79c9f18b048d1f5e9603079931e62af1c878c0d873ecd48900e\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=2d362a781ff98731423688ff5a50a08e1dd0e863e2de5b1d66c6595945a60735\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=f4596146f3aea7d9c557e466eb55af1cf8bb8e9f2a291ce4c428dd93d0501e33\")\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=68aa603aa25fd1cbe7ebef465395d0b685aa66fc8fd2df0b6d6f5a1e88621c60\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=ba5ae7bf3b5a29ea348f38516e7c46ff49921eb2a2e81e391f36bc932c4a7a20\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=e57978b5811fcf795e07c33eb69f32fac5cac8b848d32acf1154ce13c9cbcfd7\")\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  set(onnxruntime_crt 
\"MT\")\nelse()\n  set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_filename \"onnxruntime-win-x64-static_lib-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.lib\")\n\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  install(FILES ${onnxruntime_lib_files} DESTINATION ..)\nelse()\n  install(FILES ${onnxruntime_lib_files} DESTINATION 
lib)\nendif()\n"
  },
  {
    "path": "cmake/onnxruntime-win-x64.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL X64 OR CMAKE_VS_PLATFORM_NAME STREQUAL x64))\n  message(FATAL_ERROR \"This file is for Windows x64 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Supported CMAKE_BUILD_TYPE values are: Release, Debug, RelWithDebInfo, MinSizeRel. Given ${CMAKE_BUILD_TYPE}\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=f63a1dafd63bd911135a47ccc75bd04c06a717de21b96c0d8ddb351714551124\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=b5363e34544b1d6bf27161843a72dfc853d80e5f14369242378b2e244d2af632\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=d1d4c76747020eb7ccafd9180da1a5dda0cc7d01b8cfc153fa88a9c205291c93\")\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=a5c917196ef3356c343a69cee919a84f40ada6e9bf756b3e6edf3d07afc8a257\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=422d9aeed64c6a5fa8daf3286a2ff39485cb8eceafb0d264179dd250e240f2f0\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=1bb3ca8ea37f9ca3bb6417da1756aadd984a76990bafa50950da1b679c7a1e65\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=c1d74a28463eee3297cebf5d6ec06fc7cf207e720dcf81259c3acb0e53534ac3\")\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=0fffad34226a8b5bc33e7a130f77a57757f5d6623ca8e2495bc529ec3e959dd1\")\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  
set(onnxruntime_crt \"MT\")\nelse()\n  set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_filename \"onnxruntime-win-x64-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES \"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nset_property(TARGET onnxruntime\n  PROPERTY\n    IMPORTED_IMPLIB 
\"${onnxruntime_SOURCE_DIR}/lib/onnxruntime.lib\"\n)\n\nfile(COPY ${onnxruntime_SOURCE_DIR}/lib/onnxruntime.dll\n  DESTINATION\n    ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.dll\")\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\ninstall(FILES ${onnxruntime_lib_files} DESTINATION bin)\n"
  },
  {
    "path": "cmake/onnxruntime-win-x86-static.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL Win32 OR CMAKE_VS_PLATFORM_NAME STREQUAL win32))\n  message(FATAL_ERROR \"This file is for Windows x86 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building static libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=2e4ecb02d37dfb2d0ed4b4e970b9f0b0a0352a6d7cbcd95fdd693a2a2ba7a0db\")\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=18b1030d47f1b0ea744b82b4f6829e991d17b4206c8059d1b2e5393bf7f29b4f\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=400b6cff390fb36669abe681d34d307746c2ec0309471fe6046dc5def7ccf17e\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=10e9faf9f22f5c784b00db1fe907ef99af845aa211add13fc8ff8dec6ed1a665\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=8793c5ddd6ac44d784005c05ffc8498c15c7a0f26c6b61c4689b5098823b6dad\")\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=f43082bcc1f34fce1222fa5b68011d30702182e45198ac553e35add6090f3a3c\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=96a0be8f1b82c5eff82a8060928dd0a27d1a8a8a94926098bdd6539655393353\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=0fe7fc4cb4dba7afc6c1f622168700b4c98a5c01bcfd64ebe72a9c4bb3db4cc2\")\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Supported CMAKE_BUILD_TYPE values are: Release, Debug, RelWithDebInfo, MinSizeRel. 
Given ${CMAKE_BUILD_TYPE}\")\nendif()\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  set(onnxruntime_crt \"MT\")\nelse()\n  set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\nset(onnxruntime_filename \"onnxruntime-win-x86-static_lib-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\n\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\n# for static libraries, we use onnxruntime_lib_files directly below\ninclude_directories(${onnxruntime_SOURCE_DIR}/include)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.lib\")\nset(onnxruntime_lib_files ${onnxruntime_lib_files} PARENT_SCOPE)\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  install(FILES 
${onnxruntime_lib_files} DESTINATION ..)\nelse()\n  install(FILES ${onnxruntime_lib_files} DESTINATION lib)\nendif()\n"
  },
  {
    "path": "cmake/onnxruntime-win-x86.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\nmessage(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\nif(NOT CMAKE_SYSTEM_NAME STREQUAL Windows)\n  message(FATAL_ERROR \"This file is for Windows only. Given: ${CMAKE_SYSTEM_NAME}\")\nendif()\n\nif(NOT (CMAKE_VS_PLATFORM_NAME STREQUAL Win32 OR CMAKE_VS_PLATFORM_NAME STREQUAL win32))\n  message(FATAL_ERROR \"This file is for Windows x86 only. Given: ${CMAKE_VS_PLATFORM_NAME}\")\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  message(FATAL_ERROR \"This file is for building shared libraries. BUILD_SHARED_LIBS: ${BUILD_SHARED_LIBS}\")\nendif()\n\n# Hashes for static CRT (/MT)\nset(ONNXRUNTIME_HASH_MT_Release \"SHA256=07536b6b0c3929df8a41352331357daac99deecf49a7856493c7d24dc036b071\")\nset(ONNXRUNTIME_HASH_MT_Debug \"SHA256=6f9146c969f2db41049d604a3a6f922a61a7a18e2116d539bce9578da7092037\")\nset(ONNXRUNTIME_HASH_MT_RelWithDebInfo \"SHA256=d9a40560ca39425fc19c599e2b33b39459660deb030c0dc5ef83ddb7bf5f58d0\")\nset(ONNXRUNTIME_HASH_MT_MinSizeRel \"SHA256=c5dc5570a8144b152592de69ecf4d2aaa4557a62dc4c47e24f480e26c831bd65\")\n\n# Hashes for dynamic CRT (/MD)\nset(ONNXRUNTIME_HASH_MD_Release \"SHA256=ce1519a7934204cbf9f3431ba1d67ea1a8e838f743245ac96db9faaaf150581c\")\nset(ONNXRUNTIME_HASH_MD_Debug \"SHA256=d35d5b5bee5a0483f16e845783902c686c9a186c71c17bcf20d6887a734a6ad9\")\nset(ONNXRUNTIME_HASH_MD_RelWithDebInfo \"SHA256=e4df40832040419d9b5e7d983420c51e3095987bd20314c9d7a90b4759df2991\")\nset(ONNXRUNTIME_HASH_MD_MinSizeRel \"SHA256=4ea6d745466f3623a13c0159f09dcf50d6b34e302b219cfc396c6af7122b7b39\")\n\nif(NOT CMAKE_BUILD_TYPE MATCHES \"^(Release|Debug|RelWithDebInfo|MinSizeRel)$\")\n  message(FATAL_ERROR \"Supported CMAKE_BUILD_TYPE values are: Release, Debug, RelWithDebInfo, MinSizeRel. 
Given ${CMAKE_BUILD_TYPE}\")\nendif()\n\nif(SHERPA_ONNX_USE_STATIC_CRT)\n  set(onnxruntime_crt \"MT\")\nelse()\n  set(onnxruntime_crt \"MD\")\nendif()\n\nmessage(STATUS \"Use MSVC CRT: ${onnxruntime_crt}\")\n\nset(onnxruntime_HASH \"${ONNXRUNTIME_HASH_${onnxruntime_crt}_${CMAKE_BUILD_TYPE}}\")\nset(onnxruntime_filename \"onnxruntime-win-x86-${onnxruntime_crt}-${CMAKE_BUILD_TYPE}-1.23.2.tar.bz2\")\nset(onnxruntime_URL  \"https://github.com/csukuangfj/onnxruntime-libs/releases/download/v1.23.2/${onnxruntime_filename}\")\n\n# If you don't have access to the Internet,\n# please download onnxruntime to one of the following locations.\n# You can add more if you want.\nset(possible_file_locations\n  $ENV{HOME}/Downloads/${onnxruntime_filename}\n  ${CMAKE_SOURCE_DIR}/${onnxruntime_filename}\n  ${CMAKE_BINARY_DIR}/${onnxruntime_filename}\n  $ENV{TMP}/${onnxruntime_filename}\n  $ENV{TEMP}/${onnxruntime_filename}\n)\n\nforeach(f IN LISTS possible_file_locations)\n  if(EXISTS ${f})\n    set(onnxruntime_URL  \"${f}\")\n    file(TO_CMAKE_PATH \"${onnxruntime_URL}\" onnxruntime_URL)\n    message(STATUS \"Found local downloaded onnxruntime: ${onnxruntime_URL}\")\n    break()\n  endif()\nendforeach()\n\nFetchContent_Declare(onnxruntime\n  URL\n    ${onnxruntime_URL}\n  URL_HASH          ${onnxruntime_HASH}\n)\n\nFetchContent_GetProperties(onnxruntime)\nif(NOT onnxruntime_POPULATED)\n  message(STATUS \"Downloading onnxruntime from ${onnxruntime_URL}\")\n  FetchContent_Populate(onnxruntime)\nendif()\nmessage(STATUS \"onnxruntime is downloaded to ${onnxruntime_SOURCE_DIR}\")\n\nfind_library(location_onnxruntime onnxruntime\n  PATHS\n  \"${onnxruntime_SOURCE_DIR}/lib\"\n  NO_CMAKE_SYSTEM_PATH\n)\n\nmessage(STATUS \"location_onnxruntime: ${location_onnxruntime}\")\n\nadd_library(onnxruntime SHARED IMPORTED)\n\nset_target_properties(onnxruntime PROPERTIES\n  IMPORTED_LOCATION ${location_onnxruntime}\n  INTERFACE_INCLUDE_DIRECTORIES 
\"${onnxruntime_SOURCE_DIR}/include\"\n)\n\nset_property(TARGET onnxruntime\n  PROPERTY\n    IMPORTED_IMPLIB \"${onnxruntime_SOURCE_DIR}/lib/onnxruntime.lib\"\n)\n\nfile(COPY ${onnxruntime_SOURCE_DIR}/lib/onnxruntime.dll\n  DESTINATION\n    ${CMAKE_BINARY_DIR}/bin/${CMAKE_BUILD_TYPE}\n)\n\nfile(GLOB onnxruntime_lib_files \"${onnxruntime_SOURCE_DIR}/lib/*.dll\")\n\nmessage(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\ninstall(FILES ${onnxruntime_lib_files} DESTINATION lib)\ninstall(FILES ${onnxruntime_lib_files} DESTINATION bin)\n"
  },
  {
    "path": "cmake/onnxruntime.cmake",
    "content": "# Copyright (c)  2022-2023  Xiaomi Corporation\nfunction(download_onnxruntime)\n  include(FetchContent)\n\n  message(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\n  message(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n  if(SHERPA_ONNX_ENABLE_WASM)\n    include(onnxruntime-wasm-simd)\n  elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL riscv64)\n    if(SHERPA_ONNX_ENABLE_SPACEMIT)\n      include(onnxruntime-linux-riscv64-spacemit)\n    elseif(BUILD_SHARED_LIBS)\n      include(onnxruntime-linux-riscv64)\n    else()\n      include(onnxruntime-linux-riscv64-static)\n    endif()\n  elseif(CMAKE_SYSTEM_NAME STREQUAL Linux AND CMAKE_SYSTEM_PROCESSOR STREQUAL aarch64)\n    if(SHERPA_ONNX_ENABLE_GPU)\n      include(onnxruntime-linux-aarch64-gpu)\n    elseif(BUILD_SHARED_LIBS)\n      include(onnxruntime-linux-aarch64)\n    else()\n      include(onnxruntime-linux-aarch64-static)\n    endif()\n  elseif(CMAKE_SYSTEM_NAME STREQUAL Linux AND (CMAKE_SYSTEM_PROCESSOR STREQUAL arm OR CMAKE_SYSTEM_PROCESSOR STREQUAL armv7l))\n    if(BUILD_SHARED_LIBS)\n      include(onnxruntime-linux-arm)\n    else()\n      include(onnxruntime-linux-arm-static)\n    endif()\n  elseif(CMAKE_SYSTEM_NAME STREQUAL Linux AND CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64)\n    if(SHERPA_ONNX_ENABLE_GPU)\n      include(onnxruntime-linux-x86_64-gpu)\n    elseif(BUILD_SHARED_LIBS)\n      include(onnxruntime-linux-x86_64)\n    else()\n      include(onnxruntime-linux-x86_64-static)\n    endif()\n  elseif(CMAKE_SYSTEM_NAME STREQUAL Darwin)\n    if (arm64 IN_LIST CMAKE_OSX_ARCHITECTURES AND x86_64 IN_LIST CMAKE_OSX_ARCHITECTURES)\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-osx-universal)\n      else()\n        include(onnxruntime-osx-universal-static)\n      endif()\n    elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64 AND CMAKE_OSX_ARCHITECTURES STREQUAL \"arm64\")\n      # cross compiling\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-osx-arm64)\n      else()\n       
 include(onnxruntime-osx-arm64-static)\n      endif()\n    elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL arm64 AND CMAKE_OSX_ARCHITECTURES STREQUAL \"x86_64\")\n      # cross compiling\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-osx-x86_64)\n      else()\n        include(onnxruntime-osx-x86_64-static)\n      endif()\n    elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL arm64)\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-osx-arm64)\n      else()\n        include(onnxruntime-osx-arm64-static)\n      endif()\n    elseif(CMAKE_SYSTEM_PROCESSOR STREQUAL x86_64)\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-osx-x86_64)\n      else()\n        include(onnxruntime-osx-x86_64-static)\n      endif()\n    else()\n      message(FATAL_ERROR \"Unsupported processor ${CMAKE_SYSTEM_PROCESSOR} for Darwin\")\n    endif()\n  elseif(WIN32)\n    message(STATUS \"CMAKE_VS_PLATFORM_NAME: ${CMAKE_VS_PLATFORM_NAME}\")\n\n    if(CMAKE_VS_PLATFORM_NAME STREQUAL Win32 OR CMAKE_VS_PLATFORM_NAME STREQUAL win32)\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-win-x86)\n      else()\n        include(onnxruntime-win-x86-static)\n      endif()\n\n      if(SHERPA_ONNX_ENABLE_GPU)\n        message(FATAL_ERROR \"GPU is not supported on Win32!\")\n      endif()\n    elseif(CMAKE_VS_PLATFORM_NAME STREQUAL ARM64 OR CMAKE_VS_PLATFORM_NAME STREQUAL arm64)\n      # for 64-bit windows (arm64)\n      if(BUILD_SHARED_LIBS)\n        include(onnxruntime-win-arm64)\n      else()\n        include(onnxruntime-win-arm64-static)\n      endif()\n    else()\n      # for 64-bit windows (x64)\n      if(SHERPA_ONNX_ENABLE_DIRECTML)\n        message(STATUS \"Use DirectML\")\n        include(onnxruntime-win-x64-directml)\n      elseif(BUILD_SHARED_LIBS)\n        message(STATUS \"Use dynamic onnxruntime libraries\")\n        if(SHERPA_ONNX_ENABLE_GPU)\n          include(onnxruntime-win-x64-gpu)\n        else()\n          include(onnxruntime-win-x64)\n        endif()\n      else()\n     
   # static libraries for windows x64\n        message(STATUS \"Use static onnxruntime libraries\")\n        include(onnxruntime-win-x64-static)\n      endif()\n    endif()\n  else()\n    message(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\n    message(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n    message(FATAL_ERROR \"Only support Linux, macOS, and Windows at present. Will support other OSes later\")\n  endif()\n  set(onnxruntime_SOURCE_DIR ${onnxruntime_SOURCE_DIR} PARENT_SCOPE)\nendfunction()\n\nif(SHERPA_ONNX_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE)\n  # First, we try to locate the header and the lib if the user has already\n  # installed onnxruntime. Otherwise, we will download the pre-compiled lib\n\n  message(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\n  message(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\n  if(DEFINED ENV{SHERPA_ONNXRUNTIME_INCLUDE_DIR})\n    set(location_onnxruntime_header_dir $ENV{SHERPA_ONNXRUNTIME_INCLUDE_DIR})\n\n    include_directories(${location_onnxruntime_header_dir})\n  else()\n    find_path(location_onnxruntime_header_dir onnxruntime_cxx_api.h\n      PATHS\n        /usr/include/onnxruntime\n        /usr/local/include/onnxruntime\n    )\n  endif()\n\n  message(STATUS \"location_onnxruntime_header_dir: ${location_onnxruntime_header_dir}\")\n\n  if(DEFINED ENV{SHERPA_ONNXRUNTIME_LIB_DIR})\n    if(APPLE)\n      set(location_onnxruntime_lib $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime.dylib)\n    elseif(WIN32)\n      if(SHERPA_ONNX_ENABLE_GPU)\n        set(location_onnxruntime_lib $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/onnxruntime.dll)\n        set(location_onnxruntime_lib2 $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/onnxruntime.lib)\n      else()\n        set(location_onnxruntime_lib $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/onnxruntime.lib)\n        if(SHERPA_ONNX_ENABLE_DIRECTML)\n          include(onnxruntime-win-x64-directml)\n        endif()\n      endif()\n    else()\n      
set(location_onnxruntime_lib $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime.so)\n    endif()\n\n    if(NOT EXISTS ${location_onnxruntime_lib})\n      message(STATUS \"${location_onnxruntime_lib} does not exist. Try static lib\")\n\n      set(location_onnxruntime_lib $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime.a)\n      if(NOT EXISTS ${location_onnxruntime_lib})\n        message(FATAL_ERROR \"${location_onnxruntime_lib} cannot be found\")\n      endif()\n      set(onnxruntime_lib_files $ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime.a)\n      message(\"Use static lib: ${onnxruntime_lib_files}\")\n    endif()\n  else()\n    find_library(location_onnxruntime_lib onnxruntime\n      PATHS\n        /lib\n        /usr/lib\n        /usr/local/lib\n    )\n  endif()\n\n  message(STATUS \"location_onnxruntime_lib: ${location_onnxruntime_lib}\")\nendif()\n\nif(location_onnxruntime_header_dir AND location_onnxruntime_lib)\n  if(NOT DEFINED onnxruntime_lib_files)\n    add_library(onnxruntime SHARED IMPORTED)\n\n    if(WIN32)\n      set_target_properties(onnxruntime PROPERTIES\n        IMPORTED_LOCATION ${location_onnxruntime_lib}\n        IMPORTED_IMPLIB ${location_onnxruntime_lib2}\n        INTERFACE_INCLUDE_DIRECTORIES \"${location_onnxruntime_header_dir}\"\n      )\n    else()\n      set_target_properties(onnxruntime PROPERTIES\n        IMPORTED_LOCATION ${location_onnxruntime_lib}\n        INTERFACE_INCLUDE_DIRECTORIES \"${location_onnxruntime_header_dir}\"\n      )\n    endif()\n\n    if(WIN32)\n      file(GLOB onnxruntime_lib_files \"$ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/*.dll\")\n    else()\n      if(DEFINED ANDROID_ABI)\n        file(GLOB onnxruntime_lib_files \"$ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime.so\")\n      else()\n        file(GLOB _onnxruntime_all \"$ENV{SHERPA_ONNXRUNTIME_LIB_DIR}/libonnxruntime*\")\n        set(onnxruntime_lib_files \"\")\n\n        foreach(f ${_onnxruntime_all})\n          if (NOT IS_DIRECTORY \"${f}\")\n            list(APPEND 
onnxruntime_lib_files \"${f}\")\n          endif()\n        endforeach()\n      endif()\n    endif()\n\n    message(STATUS \"onnxruntime lib files: ${onnxruntime_lib_files}\")\n\n    install(FILES ${onnxruntime_lib_files} DESTINATION lib)\n\n    if(WIN32)\n      install(FILES ${onnxruntime_lib_files} DESTINATION bin)\n    endif()\n  endif()\nelse()\n  if(SHERPA_ONNX_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE)\n    message(STATUS \"Could not find a pre-installed onnxruntime.\")\n  endif()\n  message(STATUS \"Downloading pre-compiled onnxruntime\")\n\n  download_onnxruntime()\nendif()\n"
  },
  {
    "path": "cmake/openfst.cmake",
    "content": "# Copyright (c)  2020  Xiaomi Corporation (author: Fangjun Kuang)\n\nfunction(download_openfst)\n  include(FetchContent)\n\n  set(openfst_URL  \"https://github.com/csukuangfj/openfst/archive/refs/tags/sherpa-onnx-2024-06-19.tar.gz\")\n  set(openfst_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/openfst-sherpa-onnx-2024-06-19.tar.gz\")\n  set(openfst_HASH \"SHA256=5c98e82cc509c5618502dde4860b8ea04d843850ed57e6d6b590b644b268853d\")\n\n  # If you don't have access to the Internet,\n  # please pre-download it\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/openfst-sherpa-onnx-2024-06-19.tar.gz\n    ${CMAKE_SOURCE_DIR}/openfst-sherpa-onnx-2024-06-19.tar.gz\n    ${CMAKE_BINARY_DIR}/openfst-sherpa-onnx-2024-06-19.tar.gz\n    /tmp/openfst-sherpa-onnx-2024-06-19.tar.gz\n    /star-fj/fangjun/download/github/openfst-sherpa-onnx-2024-06-19.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(openfst_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${openfst_URL}\" openfst_URL)\n      set(openfst_URL2)\n      break()\n    endif()\n  endforeach()\n\n  set(HAVE_BIN OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_SCRIPT OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_COMPACT OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_COMPRESS OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_CONST OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_FAR ON CACHE BOOL \"\" FORCE)\n  set(HAVE_GRM OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_PDT OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_MPDT OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_LINEAR OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_LOOKAHEAD OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_NGRAM OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_PYTHON OFF CACHE BOOL \"\" FORCE)\n  set(HAVE_SPECIAL OFF CACHE BOOL \"\" FORCE)\n\n  if(NOT WIN32)\n    FetchContent_Declare(openfst\n      URL\n        ${openfst_URL}\n        ${openfst_URL2}\n      URL_HASH          ${openfst_HASH}\n      PATCH_COMMAND\n        sed -i.bak s/enable_testing\\(\\)//g 
\"src/CMakeLists.txt\" &&\n        sed -i.bak s/add_subdirectory\\(test\\)//g \"src/CMakeLists.txt\" &&\n        sed -i.bak /message/d \"src/script/CMakeLists.txt\"\n        # sed -i.bak s/add_subdirectory\\(script\\)//g \"src/CMakeLists.txt\" &&\n        # sed -i.bak s/add_subdirectory\\(extensions\\)//g \"src/CMakeLists.txt\"\n    )\n  else()\n    FetchContent_Declare(openfst\n      URL               ${openfst_URL}\n      URL_HASH          ${openfst_HASH}\n    )\n  endif()\n\n  FetchContent_GetProperties(openfst)\n  if(NOT openfst_POPULATED)\n    message(STATUS \"Downloading openfst from ${openfst_URL}\")\n    FetchContent_Populate(openfst)\n  endif()\n  message(STATUS \"openfst is downloaded to ${openfst_SOURCE_DIR}\")\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${openfst_SOURCE_DIR} ${openfst_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(fst fstfar\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  set(openfst_SOURCE_DIR ${openfst_SOURCE_DIR} PARENT_SCOPE)\n\n  set_target_properties(fst PROPERTIES OUTPUT_NAME \"sherpa-onnx-fst\")\n  set_target_properties(fstfar PROPERTIES OUTPUT_NAME \"sherpa-onnx-fstfar\")\n\n  if(LINUX AND CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 11)\n    target_compile_options(fst PUBLIC -Wno-missing-template-keyword)\n  endif()\n\n  target_include_directories(fst\n    PUBLIC\n      ${openfst_SOURCE_DIR}/src/include\n  )\n\n  target_include_directories(fstfar\n    PUBLIC\n      ${openfst_SOURCE_DIR}/src/include\n  )\n  # installed in ./kaldi-decoder.cmake\nendfunction()\n\ndownload_openfst()\n"
  },
  {
    "path": "cmake/piper-phonemize.cmake",
    "content": "function(download_piper_phonemize)\n  include(FetchContent)\n\n  set(piper_phonemize_URL  \"https://github.com/csukuangfj/piper-phonemize/archive/78a788e0b719013401572d70fef372e77bff8e43.zip\")\n  set(piper_phonemize_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\")\n  set(piper_phonemize_HASH \"SHA256=89641a46489a4898754643ce57bda9c9b54b4ca46485fdc02bf0dc84b866645d\")\n\n  # If you don't have access to the Internet,\n  # please pre-download kaldi-decoder\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\n    ${CMAKE_SOURCE_DIR}/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\n    ${CMAKE_BINARY_DIR}/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\n    /tmp/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\n    /star-fj/fangjun/download/github/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(piper_phonemize_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${piper_phonemize_URL}\" piper_phonemize_URL)\n      message(STATUS \"Found local downloaded espeak-ng: ${piper_phonemize_URL}\")\n      set(piper_phonemize_URL2 )\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(piper_phonemize\n    URL\n      ${piper_phonemize_URL}\n      ${piper_phonemize_URL2}\n    URL_HASH          ${piper_phonemize_HASH}\n  )\n\n  FetchContent_GetProperties(piper_phonemize)\n  if(NOT piper_phonemize_POPULATED)\n    message(STATUS \"Downloading piper-phonemize from ${piper_phonemize_URL}\")\n    FetchContent_Populate(piper_phonemize)\n  endif()\n  message(STATUS \"piper-phonemize is downloaded to ${piper_phonemize_SOURCE_DIR}\")\n  message(STATUS \"piper-phonemize binary dir is ${piper_phonemize_BINARY_DIR}\")\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak 
${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n  add_subdirectory(${piper_phonemize_SOURCE_DIR} ${piper_phonemize_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(piper_phonemize\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  if(WIN32 AND MSVC)\n    target_compile_options(piper_phonemize PUBLIC\n      /wd4309\n    )\n  endif()\n\n  target_include_directories(piper_phonemize\n    INTERFACE\n      ${piper_phonemize_SOURCE_DIR}/src/include\n  )\n\n  if(NOT BUILD_SHARED_LIBS)\n    install(TARGETS\n      piper_phonemize\n    DESTINATION lib)\n  endif()\nendfunction()\n\ndownload_piper_phonemize()\n"
  },
  {
    "path": "cmake/portaudio.cmake",
    "content": "function(download_portaudio)\n  include(FetchContent)\n\n  set(portaudio_URL  \"http://files.portaudio.com/archives/pa_stable_v190700_20210406.tgz\")\n  set(portaudio_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/pa_stable_v190700_20210406.tgz\")\n  set(portaudio_HASH \"SHA256=47efbf42c77c19a05d22e627d42873e991ec0c1357219c0d74ce6a2948cb2def\")\n\n  # If you don't have access to the Internet, please download it to your\n  # local drive and modify the following line according to your needs.\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/pa_stable_v190700_20210406.tgz\n    $ENV{HOME}/asr/pa_stable_v190700_20210406.tgz\n    ${CMAKE_SOURCE_DIR}/pa_stable_v190700_20210406.tgz\n    ${CMAKE_BINARY_DIR}/pa_stable_v190700_20210406.tgz\n    /tmp/pa_stable_v190700_20210406.tgz\n    /star-fj/fangjun/download/github/pa_stable_v190700_20210406.tgz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(portaudio_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${portaudio_URL}\" portaudio_URL)\n      message(STATUS \"Found local downloaded portaudio: ${portaudio_URL}\")\n      set(portaudio_URL2)\n      break()\n    endif()\n  endforeach()\n\n  # Always use static build\n  set(PA_BUILD_SHARED OFF CACHE BOOL \"\" FORCE)\n  set(PA_BUILD_STATIC ON CACHE BOOL \"\" FORCE)\n  set(PA_BUILD_EXAMPLES ON CACHE BOOL \"\" FORCE)\n  set(PA_USE_WDMKS OFF CACHE BOOL \"\" FORCE)\n\n  FetchContent_Declare(portaudio\n    URL\n      ${portaudio_URL}\n      ${portaudio_URL2}\n    URL_HASH          ${portaudio_HASH}\n  )\n\n  FetchContent_GetProperties(portaudio)\n  if(NOT portaudio_POPULATED)\n    message(STATUS \"Downloading portaudio from ${portaudio_URL}\")\n    FetchContent_Populate(portaudio)\n  endif()\n  message(STATUS \"portaudio is downloaded to ${portaudio_SOURCE_DIR}\")\n  message(STATUS \"portaudio's binary dir is ${portaudio_BINARY_DIR}\")\n\n  if(APPLE)\n    set(CMAKE_MACOSX_RPATH ON) # to solve the 
following warning on macOS\n  endif()\n\n  add_subdirectory(${portaudio_SOURCE_DIR} ${portaudio_BINARY_DIR} EXCLUDE_FROM_ALL)\n  if(CMAKE_SYSTEM_NAME STREQUAL Linux)\n    if(PA_USE_ALSA)\n      message(STATUS \"portaudio with ALSA\")\n    else()\n      message(STATUS \"portaudio without ALSA\")\n    endif()\n  endif()\n\n  set_target_properties(pa_devs PROPERTIES OUTPUT_NAME \"sherpa-onnx-pa-devs\")\n\n  set_target_properties(portaudio_static PROPERTIES OUTPUT_NAME \"sherpa-onnx-portaudio_static\")\n  if(NOT WIN32)\n    target_compile_options(portaudio_static PRIVATE \"-Wno-deprecated-declarations\")\n  endif()\n\n  if(NOT BUILD_SHARED_LIBS AND SHERPA_ONNX_ENABLE_BINARY)\n    install(TARGETS\n      portaudio_static\n    DESTINATION lib)\n  endif()\n\n  install(TARGETS\n    pa_devs\n  DESTINATION bin)\n  add_custom_target(build_pa_devs ALL DEPENDS pa_devs)\n\nendfunction()\n\ndownload_portaudio()\n\n# Note\n# See http://portaudio.com/docs/v19-doxydocs/tutorial_start.html\n# for how to use portaudio\n"
  },
  {
    "path": "cmake/pybind11.cmake",
    "content": "function(download_pybind11)\n  include(FetchContent)\n\n  set(pybind11_URL  \"https://github.com/pybind/pybind11/archive/refs/tags/v3.0.0.tar.gz\")\n  set(pybind11_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/pybind11-3.0.0.tar.gz\")\n  set(pybind11_HASH \"SHA256=453b1a3e2b266c3ae9da872411cadb6d693ac18063bd73226d96cfb7015a200c\")\n\n  # If you don't have access to the Internet,\n  # please pre-download pybind11\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/pybind11-3.0.0.tar.gz\n    ${CMAKE_SOURCE_DIR}/pybind11-3.0.0.tar.gz\n    ${CMAKE_BINARY_DIR}/pybind11-3.0.0.tar.gz\n    /tmp/pybind11-3.0.0.tar.gz\n    /star-fj/fangjun/download/github/pybind11-3.0.0.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(pybind11_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${pybind11_URL}\" pybind11_URL)\n      message(STATUS \"Found local downloaded pybind11: ${pybind11_URL}\")\n      set(pybind11_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(pybind11\n    URL\n      ${pybind11_URL}\n      ${pybind11_URL2}\n    URL_HASH          ${pybind11_HASH}\n  )\n\n  FetchContent_GetProperties(pybind11)\n  if(NOT pybind11_POPULATED)\n    message(STATUS \"Downloading pybind11 from ${pybind11_URL}\")\n    FetchContent_Populate(pybind11)\n  endif()\n  message(STATUS \"pybind11 is downloaded to ${pybind11_SOURCE_DIR}\")\n  add_subdirectory(${pybind11_SOURCE_DIR} ${pybind11_BINARY_DIR} EXCLUDE_FROM_ALL)\nendfunction()\n\ndownload_pybind11()\n"
  },
  {
    "path": "cmake/sherpa-onnx-shared.pc.in",
    "content": "# Note: If you use Python, then the prefix might not be correct.\n#\n# You need to either manually modify this file to change the prefix to the location\n# where this sherpa-onnx.pc file actually resides\n# or\n# you can use\n#\n#   pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx\n\nprefix=\"@CMAKE_INSTALL_PREFIX@\"\nexec_prefix=\"${prefix}\"\nincludedir=\"${prefix}/include\"\nlibdir=\"${exec_prefix}/lib\"\n\nName: sherpa-onnx\nDescription: pkg-config for sherpa-onnx\nURL: https://github.com/k2-fsa/sherpa-onnx\n\nVersion: @SHERPA_ONNX_VERSION@\nCflags: -I\"${includedir}\"\n\n# Note: -lcargs is required only for the following file\n# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c\n# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c\nLibs: -L\"${libdir}\" -lsherpa-onnx-cxx-api -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@\n"
  },
  {
    "path": "cmake/sherpa-onnx-static-no-tts.pc.in",
    "content": "# Note: If you use Python, then the prefix might not be correct.\n#\n# You need to either manually modify this file to change the prefix to the location\n# where this sherpa-onnx.pc file actually resides\n# or\n# you can use\n#\n#   pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx\n\nprefix=\"@CMAKE_INSTALL_PREFIX@\"\nexec_prefix=\"${prefix}\"\nincludedir=\"${prefix}/include\"\nlibdir=\"${exec_prefix}/lib\"\n\nName: sherpa-onnx\nDescription: pkg-config for sherpa-onnx with TTS support\nURL: https://github.com/k2-fsa/sherpa-onnx\n\nVersion: @SHERPA_ONNX_VERSION@\nCflags: -I\"${includedir}\"\n\n# Note: -lcargs is required only for the following file\n# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c\n# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c\nLibs: -L\"${libdir}\" -lsherpa-onnx-c-api -lsherpa-onnx-core -lkaldi-decoder-core -lsherpa-onnx-kaldifst-core -lsherpa-onnx-fstfar -lsherpa-onnx-fst -lkaldi-native-fbank-core -lkissfft-float -lonnxruntime -lssentencepiece_core -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@\n"
  },
  {
    "path": "cmake/sherpa-onnx-static.pc.in",
    "content": "# Note: If you use Python, then the prefix might not be correct.\n#\n# You need to either manually modify this file to change the prefix to the location\n# where this sherpa-onnx.pc file actually resides\n# or\n# you can use\n#\n#   pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx\n\nprefix=\"@CMAKE_INSTALL_PREFIX@\"\nexec_prefix=\"${prefix}\"\nincludedir=\"${prefix}/include\"\nlibdir=\"${exec_prefix}/lib\"\n\nName: sherpa-onnx\nDescription: pkg-config for sherpa-onnx\nURL: https://github.com/k2-fsa/sherpa-onnx\n\nVersion: @SHERPA_ONNX_VERSION@\nCflags: -I\"${includedir}\"\n\n# Note: -lcargs is required only for the following file\n# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c\n# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c\nLibs: -L\"${libdir}\" -lsherpa-onnx-c-api -lsherpa-onnx-core -lkaldi-decoder-core -lsherpa-onnx-kaldifst-core -lsherpa-onnx-fstfar -lsherpa-onnx-fst -lkaldi-native-fbank-core -lkissfft-float -lpiper_phonemize -lespeak-ng -lucd -lonnxruntime -lssentencepiece_core -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@\n"
  },
  {
    "path": "cmake/show-info.cmake",
    "content": "message(STATUS \"CMAKE_SOURCE_DIR: ${CMAKE_SOURCE_DIR}\")\nmessage(STATUS \"CMAKE_BINARY_DIR: ${CMAKE_BINARY_DIR}\")\nmessage(STATUS \"PROJECT_SOURCE_DIR: ${PROJECT_SOURCE_DIR}\")\nmessage(STATUS \"PROJECT_BINARY_DIR: ${PROJECT_BINARY_DIR}\")\nmessage(STATUS \"CMake version: ${CMAKE_VERSION}\")\nmessage(STATUS \"CMAKE_SYSTEM: ${CMAKE_SYSTEM}\")\nmessage(STATUS \"CMAKE_SYSTEM_NAME: ${CMAKE_SYSTEM_NAME}\")\nmessage(STATUS \"CMAKE_SYSTEM_VERSION: ${CMAKE_SYSTEM_VERSION}\")\nmessage(STATUS \"CMAKE_SYSTEM_PROCESSOR: ${CMAKE_SYSTEM_PROCESSOR}\")\n\nfind_package(Git QUIET)\nif(Git_FOUND)\n  execute_process(COMMAND\n    \"${GIT_EXECUTABLE}\" describe --always --abbrev=40\n    WORKING_DIRECTORY \"${CMAKE_SOURCE_DIR}\"\n    OUTPUT_VARIABLE SHERPA_ONNX_GIT_SHA1\n    ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\n\n  execute_process(COMMAND\n    \"${GIT_EXECUTABLE}\" log -1 --format=%ad --date=local\n    WORKING_DIRECTORY \"${CMAKE_SOURCE_DIR}\"\n    OUTPUT_VARIABLE SHERPA_ONNX_GIT_DATE\n    ERROR_QUIET OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\n  message(STATUS \"sherpa-onnx git sha1: ${SHERPA_ONNX_GIT_SHA1}\")\n  message(STATUS \"sherpa-onnx git date: ${SHERPA_ONNX_GIT_DATE}\")\nelse()\n  message(WARNING \"git is not found\")\nendif()\n\nif(UNIX AND NOT APPLE)\n  execute_process(COMMAND\n    lsb_release -sd\n    OUTPUT_VARIABLE SHERPA_ONNX_OS\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\nelseif(APPLE)\n  execute_process(COMMAND\n    sw_vers -productName\n    OUTPUT_VARIABLE _product_name\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\n\n  execute_process(COMMAND\n    sw_vers -productVersion\n    OUTPUT_VARIABLE _product_version\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\n\n  execute_process(COMMAND\n    sw_vers -buildVersion\n    OUTPUT_VARIABLE _build_version\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n  )\n  set(SHERPA_ONNX_OS \"${_product_name} ${_product_version} ${_build_version}\")\nelseif(WIN32)\n  # Try PowerShell first to get OS name + version\n  
execute_process(\n    COMMAND powershell -NoProfile -Command \"(Get-CimInstance Win32_OperatingSystem).Caption + ' ' + (Get-CimInstance Win32_OperatingSystem).Version\"\n    OUTPUT_VARIABLE SHERPA_ONNX_OS\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n    ERROR_QUIET\n  )\n\n  if(NOT SHERPA_ONNX_OS)\n    message(WARNING \"PowerShell not available, falling back to cmd /c ver\")\n    # Fallback: cmd.exe /c ver (only version info, less detailed)\n    execute_process(\n      COMMAND cmd /c ver\n      OUTPUT_VARIABLE _cmd_out\n      OUTPUT_STRIP_TRAILING_WHITESPACE\n      ERROR_QUIET\n    )\n    string(REPLACE \"\\r\" \"\" _cmd_out \"${_cmd_out}\")\n    if(_cmd_out)\n      set(SHERPA_ONNX_OS \"Windows ${_cmd_out}\")\n    else()\n      set(SHERPA_ONNX_OS \"Windows (version unknown)\")\n    endif()\n  endif()\nelse()\n  set(SHERPA_ONNX_OS \"Unknown\")\nendif()\nmessage(STATUS \"OS used to build sherpa-onnx: ${SHERPA_ONNX_OS}\")\n\nif(CMAKE_CXX_COMPILER)\n  message(STATUS \"C++ compiler: ${CMAKE_CXX_COMPILER}\")\n  if(CMAKE_CXX_COMPILER_ID)\n    message(STATUS \"C++ compiler ID: ${CMAKE_CXX_COMPILER_ID}\")\n    message(STATUS \"C++ compiler version: ${CMAKE_CXX_COMPILER_VERSION}\")\n  endif()\nendif()\n\nif(CMAKE_C_COMPILER)\n  message(STATUS \"C compiler: ${CMAKE_C_COMPILER}\")\n  if(CMAKE_C_COMPILER_ID)\n    message(STATUS \"C compiler ID: ${CMAKE_C_COMPILER_ID}\")\n    message(STATUS \"C compiler version: ${CMAKE_C_COMPILER_VERSION}\")\n  endif()\nendif()\n"
  },
  {
    "path": "cmake/simple-sentencepiece.cmake",
    "content": "function(download_simple_sentencepiece)\n  include(FetchContent)\n\n  set(simple-sentencepiece_URL  \"https://github.com/pkufool/simple-sentencepiece/archive/refs/tags/v0.7.tar.gz\")\n  set(simple-sentencepiece_URL2 \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/simple-sentencepiece-0.7.tar.gz\")\n  set(simple-sentencepiece_HASH \"SHA256=1748a822060a35baa9f6609f84efc8eb54dc0e74b9ece3d82367b7119fdc75af\")\n\n  # If you don't have access to the Internet,\n  # please pre-download simple-sentencepiece\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/simple-sentencepiece-0.7.tar.gz\n    ${CMAKE_SOURCE_DIR}/simple-sentencepiece-0.7.tar.gz\n    ${CMAKE_BINARY_DIR}/simple-sentencepiece-0.7.tar.gz\n    /tmp/simple-sentencepiece-0.7.tar.gz\n    /star-fj/fangjun/download/github/simple-sentencepiece-0.7.tar.gz\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(simple-sentencepiece_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${simple-sentencepiece_URL}\" simple-sentencepiece_URL)\n      message(STATUS \"Found local downloaded simple-sentencepiece: ${simple-sentencepiece_URL}\")\n      set(simple-sentencepiece_URL2)\n      break()\n    endif()\n  endforeach()\n\n  set(SBPE_ENABLE_TESTS OFF CACHE BOOL \"\" FORCE)\n  set(SBPE_BUILD_PYTHON OFF CACHE BOOL \"\" FORCE)\n\n  FetchContent_Declare(simple-sentencepiece\n    URL\n      ${simple-sentencepiece_URL}\n      ${simple-sentencepiece_URL2}\n    URL_HASH\n      ${simple-sentencepiece_HASH}\n  )\n\n  FetchContent_GetProperties(simple-sentencepiece)\n  if(NOT simple-sentencepiece_POPULATED)\n    message(STATUS \"Downloading simple-sentencepiece ${simple-sentencepiece_URL}\")\n    FetchContent_Populate(simple-sentencepiece)\n  endif()\n  message(STATUS \"simple-sentencepiece is downloaded to ${simple-sentencepiece_SOURCE_DIR}\")\n\n  if(BUILD_SHARED_LIBS)\n    set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})\n    set(BUILD_SHARED_LIBS OFF)\n  endif()\n\n 
 add_subdirectory(${simple-sentencepiece_SOURCE_DIR} ${simple-sentencepiece_BINARY_DIR} EXCLUDE_FROM_ALL)\n\n  if(_build_shared_libs_bak)\n    set_target_properties(ssentencepiece_core\n      PROPERTIES\n        POSITION_INDEPENDENT_CODE ON\n        C_VISIBILITY_PRESET hidden\n        CXX_VISIBILITY_PRESET hidden\n    )\n    set(BUILD_SHARED_LIBS ON)\n  endif()\n\n  target_include_directories(ssentencepiece_core\n    PUBLIC\n      ${simple-sentencepiece_SOURCE_DIR}/\n  )\n\n  if(NOT BUILD_SHARED_LIBS)\n    install(TARGETS ssentencepiece_core DESTINATION lib)\n  endif()\nendfunction()\n\ndownload_simple_sentencepiece()\n"
  },
  {
    "path": "cmake/websocketpp.cmake",
    "content": "function(download_websocketpp)\n  include(FetchContent)\n\n  # The latest commit on the develop branch os as 2022-10-22\n  set(websocketpp_URL  \"https://github.com/zaphoyd/websocketpp/archive/b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\")\n  set(websocketpp_URL2  \"https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\")\n  set(websocketpp_HASH \"SHA256=1385135ede8191a7fbef9ec8099e3c5a673d48df0c143958216cd1690567f583\")\n\n  # If you don't have access to the Internet,\n  # please pre-download websocketpp\n  set(possible_file_locations\n    $ENV{HOME}/Downloads/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\n    ${CMAKE_SOURCE_DIR}/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\n    ${CMAKE_BINARY_DIR}/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\n    /tmp/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\n    /star-fj/fangjun/download/github/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip\n  )\n\n  foreach(f IN LISTS possible_file_locations)\n    if(EXISTS ${f})\n      set(websocketpp_URL  \"${f}\")\n      file(TO_CMAKE_PATH \"${websocketpp_URL}\" websocketpp_URL)\n      message(STATUS \"Found local downloaded websocketpp: ${websocketpp_URL}\")\n      set(websocketpp_URL2)\n      break()\n    endif()\n  endforeach()\n\n  FetchContent_Declare(websocketpp\n    URL\n      ${websocketpp_URL}\n      ${websocketpp_URL2}\n    URL_HASH          ${websocketpp_HASH}\n  )\n\n  FetchContent_GetProperties(websocketpp)\n  if(NOT websocketpp_POPULATED)\n    message(STATUS \"Downloading websocketpp from ${websocketpp_URL}\")\n    FetchContent_Populate(websocketpp)\n  endif()\n  message(STATUS \"websocketpp is downloaded to ${websocketpp_SOURCE_DIR}\")\n  # add_subdirectory(${websocketpp_SOURCE_DIR} ${websocketpp_BINARY_DIR} EXCLUDE_FROM_ALL)\n  include_directories(${websocketpp_SOURCE_DIR})\nendfunction()\n\ndownload_websocketpp()\n"
  },
  {
    "path": "cxx-api-examples/CMakeLists.txt",
    "content": "include_directories(${PROJECT_SOURCE_DIR})\n\nadd_executable(streaming-zipformer-cxx-api ./streaming-zipformer-cxx-api.cc)\ntarget_link_libraries(streaming-zipformer-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(streaming-zipformer-with-hr-cxx-api ./streaming-zipformer-with-hr-cxx-api.cc)\ntarget_link_libraries(streaming-zipformer-with-hr-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(speech-enhancement-gtcrn-cxx-api ./speech-enhancement-gtcrn-cxx-api.cc)\ntarget_link_libraries(speech-enhancement-gtcrn-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(speech-enhancement-dpdfnet-cxx-api ./speech-enhancement-dpdfnet-cxx-api.cc)\ntarget_link_libraries(speech-enhancement-dpdfnet-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(online-speech-enhancement-gtcrn-cxx-api\n               ./online-speech-enhancement-gtcrn-cxx-api.cc)\ntarget_link_libraries(online-speech-enhancement-gtcrn-cxx-api\n                      sherpa-onnx-cxx-api)\n\nadd_executable(online-speech-enhancement-dpdfnet-cxx-api\n               ./online-speech-enhancement-dpdfnet-cxx-api.cc)\ntarget_link_libraries(online-speech-enhancement-dpdfnet-cxx-api\n                      sherpa-onnx-cxx-api)\n\nadd_executable(kws-cxx-api ./kws-cxx-api.cc)\ntarget_link_libraries(kws-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(audio-tagging-ced-cxx-api ./audio-tagging-ced-cxx-api.cc)\ntarget_link_libraries(audio-tagging-ced-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(audio-tagging-zipformer-cxx-api ./audio-tagging-zipformer-cxx-api.cc)\ntarget_link_libraries(audio-tagging-zipformer-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(streaming-zipformer-rtf-cxx-api ./streaming-zipformer-rtf-cxx-api.cc)\ntarget_link_libraries(streaming-zipformer-rtf-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(streaming-t-one-ctc-cxx-api   streaming-t-one-ctc-cxx-api.cc)\ntarget_link_libraries(streaming-t-one-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(whisper-cxx-api 
./whisper-cxx-api.cc)\ntarget_link_libraries(whisper-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(fire-red-asr-cxx-api ./fire-red-asr-cxx-api.cc)\ntarget_link_libraries(fire-red-asr-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(fire-red-asr-ctc-cxx-api ./fire-red-asr-ctc-cxx-api.cc)\ntarget_link_libraries(fire-red-asr-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(moonshine-cxx-api ./moonshine-cxx-api.cc)\ntarget_link_libraries(moonshine-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(moonshine-v2-cxx-api ./moonshine-v2-cxx-api.cc)\ntarget_link_libraries(moonshine-v2-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(sense-voice-cxx-api ./sense-voice-cxx-api.cc)\ntarget_link_libraries(sense-voice-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(wenet-ctc-cxx-api ./wenet-ctc-cxx-api.cc)\ntarget_link_libraries(wenet-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(omnilingual-asr-ctc-cxx-api ./omnilingual-asr-ctc-cxx-api.cc)\ntarget_link_libraries(omnilingual-asr-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(medasr-ctc-cxx-api ./medasr-ctc-cxx-api.cc)\ntarget_link_libraries(medasr-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(nemo-canary-cxx-api ./nemo-canary-cxx-api.cc)\ntarget_link_libraries(nemo-canary-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(offline-punctuation-cxx-api ./offline-punctuation-cxx-api.cc)\ntarget_link_libraries(offline-punctuation-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(online-punctuation-cxx-api ./online-punctuation-cxx-api.cc)\ntarget_link_libraries(online-punctuation-cxx-api sherpa-onnx-cxx-api)\n\nif(SHERPA_ONNX_ENABLE_PORTAUDIO)\n  add_executable(sense-voice-simulate-streaming-microphone-cxx-api\n    ./sense-voice-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(sense-voice-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  
add_executable(fire-red-asr-ctc-simulate-streaming-microphone-cxx-api\n    ./fire-red-asr-ctc-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(fire-red-asr-ctc-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  add_executable(wenet-ctc-simulate-streaming-microphone-cxx-api\n    ./wenet-ctc-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(wenet-ctc-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  add_executable(parakeet-tdt-simulate-streaming-microphone-cxx-api\n    ./parakeet-tdt-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(parakeet-tdt-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  add_executable(parakeet-tdt-ctc-simulate-streaming-microphone-cxx-api\n    ./parakeet-tdt-ctc-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(parakeet-tdt-ctc-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  add_executable(zipformer-ctc-simulate-streaming-microphone-cxx-api\n    ./zipformer-ctc-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(zipformer-ctc-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    portaudio_static\n  )\n\n  add_executable(zipformer-transducer-simulate-streaming-microphone-cxx-api\n    ./zipformer-transducer-simulate-streaming-microphone-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/microphone.cc\n  )\n  target_link_libraries(zipformer-transducer-simulate-streaming-microphone-cxx-api\n    sherpa-onnx-cxx-api\n    
portaudio_static\n  )\nendif()\n\nif(SHERPA_ONNX_HAS_ALSA)\n  add_executable(sense-voice-simulate-streaming-alsa-cxx-api\n    ./sense-voice-simulate-streaming-alsa-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/alsa.cc\n  )\n  target_link_libraries(sense-voice-simulate-streaming-alsa-cxx-api\n    sherpa-onnx-cxx-api\n  )\n\n  add_executable(fire-red-asr-ctc-simulate-streaming-alsa-cxx-api\n    ./fire-red-asr-ctc-simulate-streaming-alsa-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/alsa.cc\n  )\n  target_link_libraries(fire-red-asr-ctc-simulate-streaming-alsa-cxx-api\n    sherpa-onnx-cxx-api\n  )\n\n  add_executable(zipformer-ctc-simulate-streaming-alsa-cxx-api\n    ./zipformer-ctc-simulate-streaming-alsa-cxx-api.cc\n    ${CMAKE_CURRENT_LIST_DIR}/../sherpa-onnx/csrc/alsa.cc\n  )\n  target_link_libraries(zipformer-ctc-simulate-streaming-alsa-cxx-api\n    sherpa-onnx-cxx-api\n  )\n\n  if(DEFINED ENV{SHERPA_ONNX_ALSA_LIB_DIR})\n    target_link_libraries(sense-voice-simulate-streaming-alsa-cxx-api -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\n    target_link_libraries(fire-red-asr-ctc-simulate-streaming-alsa-cxx-api -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\n    target_link_libraries(zipformer-ctc-simulate-streaming-alsa-cxx-api -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\n  else()\n    target_link_libraries(sense-voice-simulate-streaming-alsa-cxx-api asound)\n    target_link_libraries(fire-red-asr-ctc-simulate-streaming-alsa-cxx-api asound)\n    target_link_libraries(zipformer-ctc-simulate-streaming-alsa-cxx-api asound)\n  endif()\nendif()\n\nadd_executable(sense-voice-with-hr-cxx-api ./sense-voice-with-hr-cxx-api.cc)\ntarget_link_libraries(sense-voice-with-hr-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(dolphin-ctc-cxx-api ./dolphin-ctc-cxx-api.cc)\ntarget_link_libraries(dolphin-ctc-cxx-api sherpa-onnx-cxx-api)\n\nadd_executable(vad-cxx-api ./vad-cxx-api.cc)\ntarget_link_libraries(vad-cxx-api 
sherpa-onnx-cxx-api)\n\nadd_executable(funasr-nano-cxx-api ./funasr-nano-cxx-api.cc)\ntarget_link_libraries(funasr-nano-cxx-api sherpa-onnx-cxx-api)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  add_executable(matcha-tts-zh-cxx-api ./matcha-tts-zh-cxx-api.cc)\n  target_link_libraries(matcha-tts-zh-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(matcha-tts-en-cxx-api ./matcha-tts-en-cxx-api.cc)\n  target_link_libraries(matcha-tts-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(kokoro-tts-en-cxx-api ./kokoro-tts-en-cxx-api.cc)\n  target_link_libraries(kokoro-tts-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(kitten-tts-en-cxx-api ./kitten-tts-en-cxx-api.cc)\n  target_link_libraries(kitten-tts-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(pocket-tts-en-cxx-api ./pocket-tts-en-cxx-api.cc)\n  target_link_libraries(pocket-tts-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(kokoro-tts-zh-en-cxx-api ./kokoro-tts-zh-en-cxx-api.cc)\n  target_link_libraries(kokoro-tts-zh-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(supertonic-tts-en-cxx-api ./supertonic-tts-en-cxx-api.cc)\n  target_link_libraries(supertonic-tts-en-cxx-api sherpa-onnx-cxx-api)\n\n  add_executable(zipvoice-tts-zh-en-cxx-api ./zipvoice-tts-zh-en-cxx-api.cc)\n  target_link_libraries(zipvoice-tts-zh-en-cxx-api sherpa-onnx-cxx-api)\nendif()\n"
  },
  {
    "path": "cxx-api-examples/audio-tagging-ced-cxx-api.cc",
    "content": "// cxx-api-examples/audio-tagging-ced-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use CED with sherpa-onnx's C++\n// API for audio tagging.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n// tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n// rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n//\n// clang-format on\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  AudioTaggingConfig config;\n\n  config.model.ced =\n      \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx\";\n  config.model.num_threads = 1;\n  config.model.debug = true;\n  config.labels =\n      \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/\"\n      \"class_labels_indices.csv\";\n\n  config.top_k = 5;\n\n  std::cout << \"Loading model\\n\";\n  AudioTagging tagger = AudioTagging::Create(config);\n  if (!tagger.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  std::string wave_filename =\n      \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/1.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = tagger.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n  std::vector<AudioEvent> events = tagger.Compute(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << \"Done\\n\";\n\n  const float elapsed_seconds =\n      
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  int32_t i = 0;\n\n  for (const auto &event : events) {\n    fprintf(stderr, \"%d: AudioEvent(name='%s', index=%d, prob=%.3f)\\n\", i,\n            event.name.c_str(), event.index, event.prob);\n    i += 1;\n  }\n\n  printf(\"Number of threads: %d\\n\", config.model.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n}\n"
  },
  {
    "path": "cxx-api-examples/audio-tagging-zipformer-cxx-api.cc",
    "content": "// cxx-api-examples/audio-tagging-zipformer-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Zipformer with sherpa-onnx's C++\n// API for audio tagging.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n// tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n// rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n//\n//\n// clang-format on\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  AudioTaggingConfig config;\n\n  config.model.zipformer.model =\n      \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.onnx\";\n  config.model.num_threads = 1;\n  config.model.debug = true;\n  config.labels =\n      \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/\"\n      \"class_labels_indices.csv\";\n\n  config.top_k = 5;\n\n  std::cout << \"Loading model\\n\";\n  AudioTagging tagger = AudioTagging::Create(config);\n  if (!tagger.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  std::string wave_filename =\n      \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/1.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = tagger.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n  std::vector<AudioEvent> events = tagger.Compute(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << 
\"Done\\n\";\n\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  int32_t i = 0;\n\n  for (const auto &event : events) {\n    fprintf(stderr, \"%d: AudioEvent(name='%s', index=%d, prob=%.3f)\\n\", i,\n            event.name.c_str(), event.index, event.prob);\n    i += 1;\n  }\n\n  printf(\"Number of threads: %d\\n\", config.model.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n}\n"
  },
  {
    "path": "cxx-api-examples/dolphin-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/dolphin-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Dolphini CTC model with sherpa-onnx's C++\n// API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n// tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n// rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.dolphin.model = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\"; // NOLINT\n  config.model_config.tokens = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\"; // NOLINT\n\n  std::string wave_filename = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\"; // NOLINT\n  // clang-format on\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = 
recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/fire-red-asr-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/fire-red-asr-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FireRedASR CTC with sherpa-onnx's C++ API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n*/\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.fire_red_asr_ctc.model = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\";\n  config.model_config.tokens = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\";\n  // clang-format on\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = 
std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/fire-red-asr-ctc-simulate-streaming-alsa-cxx-api.cc",
    "content": "// cxx-api-examples/fire-red-asr-ctc-simulate-streaming-alsa-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FireRedASR CTC models with sherpa-onnx's\n// C++ API for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n// tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n// rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <string>\n#include <thread>  // NOLINT\n#include <utility>\n#include <vector>\n\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/alsa.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic void RecordCallback(sherpa_onnx::Alsa *alsa) {\n  int32_t chunk = 0.1 * alsa->GetActualSampleRate();\n  while (!stop) {\n    std::vector<float> samples = alsa->Read(chunk);\n\n    std::lock_guard<std::mutex> lock(mutex);\n    samples_queue.emplace(std::move(samples));\n    condition_variable.notify_one();\n  }\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.fire_red_asr_ctc.model =\n      \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main(int32_t argc, const char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nwget 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n./fire-red-asr-ctc-simulate-streaming-alsa-cxx-api device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  if (argc != 2) {\n    fprintf(stderr, \"%s\\n\", kUsageMessage);\n    return -1;\n  }\n\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  int32_t expected_sample_rate = 16000;\n\n  std::string device_name = argv[1];\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::thread record_thread(RecordCallback, &alsa);\n\n  std::cout << \"Started! 
Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      buffer.insert(buffer.end(), s.begin(), s.end());\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n 
     display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  record_thread.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/fire-red-asr-ctc-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/fire-red-asr-ctc-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2026  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FireRedASR CTC models with sherpa-onnx's\n// C++ API for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n// tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n// rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.fire_red_asr_ctc.model =\n      \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr\n        << \"  If you are using Linux, please try \"\n           \"./build/bin/fire-red-asr-ctc-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    // Parse the override first so that the log line reports the value in use\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, 
sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n              
                                                started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/fire-red-asr-cxx-api.cc",
    "content": "// cxx-api-examples/fire-red-asr-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use FireRedAsr AED with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n// tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n// rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.fire_red_asr.encoder =\n      \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx\";\n  config.model_config.fire_red_asr.decoder =\n      \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult 
result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/funasr-nano-cxx-api.cc",
    "content": "// cxx-api-examples/funasr-nano-cxx-api.cc\n//\n// Copyright (c)  2025  zengyw\n//\n// This file demonstrates how to use FunASR-nano with sherpa-onnx's C++ API.\n//\n//\n// clang-format off\n//\n// Usage:\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n// tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n// rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n//\n// clang-format on\n\n#include <chrono>\n#include <cstdio>\n#include <cstring>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;\n\n  OfflineRecognizerConfig config;\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n  config.model_config.provider = \"cpu\";\n\n  // clang-format off\n  config.model_config.funasr_nano.encoder_adaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\";\n  config.model_config.funasr_nano.llm = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\";\n  config.model_config.funasr_nano.embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\";\n  config.model_config.funasr_nano.tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\";\n\n  // clang-format on\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/dia_yue.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  
OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/kitten-tts-en-cxx-api.cc",
    "content": "// cxx-api-examples/kitten-tts-en-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for English TTS with Kitten.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\n./kitten-tts-en-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.kitten.model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\";\n  config.model.kitten.voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\";\n  config.model.kitten.tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\";\n  config.model.kitten.data_dir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  std::string filename = \"./generated-kitten-en-cxx.wav\";\n  std::string text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  auto tts = OfflineTts::Create(config);\n  int32_t sid = 0;\n  float speed = 1.0;  // larger -> faster in speech speed\n  GenerationConfig gen_config;\n  gen_config.sid = sid;\n  gen_config.speed = speed;\n  gen_config.silence_scale = 0.2f;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/kokoro-tts-en-cxx-api.cc",
    "content": "// cxx-api-examples/kokoro-tts-en-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for English TTS with Kokoro.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\n./kokoro-tts-en-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.kokoro.model = \"./kokoro-en-v0_19/model.onnx\";\n  config.model.kokoro.voices = \"./kokoro-en-v0_19/voices.bin\";\n  config.model.kokoro.tokens = \"./kokoro-en-v0_19/tokens.txt\";\n  config.model.kokoro.data_dir = \"./kokoro-en-v0_19/espeak-ng-data\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  std::string filename = \"./generated-kokoro-en-cxx.wav\";\n  std::string text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  auto tts = OfflineTts::Create(config);\n  int32_t sid = 0;\n  float speed = 1.0;  // larger -> faster in speech speed\n  GenerationConfig gen_config;\n  gen_config.sid = sid;\n  gen_config.speed = speed;\n  gen_config.silence_scale = 0.2f;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/kokoro-tts-zh-en-cxx-api.cc",
    "content": "// cxx-api-examples/kokoro-tts-zh-en-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for Chinese + English TTS with Kokoro.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\n./kokoro-tts-zh-en-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.kokoro.model = \"./kokoro-multi-lang-v1_0/model.onnx\";\n  config.model.kokoro.voices = \"./kokoro-multi-lang-v1_0/voices.bin\";\n  config.model.kokoro.tokens = \"./kokoro-multi-lang-v1_0/tokens.txt\";\n  config.model.kokoro.data_dir = \"./kokoro-multi-lang-v1_0/espeak-ng-data\";\n  config.model.kokoro.dict_dir = \"./kokoro-multi-lang-v1_0/dict\";\n  config.model.kokoro.lexicon =\n      \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/\"\n      \"lexicon-zh.txt\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  std::string filename = \"./generated-kokoro-zh-en-cxx.wav\";\n  std::string text =\n      \"中英文语音合成测试。This is generated by next generation Kaldi using \"\n      \"Kokoro without Misaki. 
你觉得中英文说的如何呢？\";\n\n  auto tts = OfflineTts::Create(config);\n  int32_t sid = 50;\n  float speed = 1.0;  // larger -> faster in speech speed\n  GenerationConfig gen_config;\n  gen_config.sid = sid;\n  gen_config.speed = speed;\n  gen_config.silence_scale = 0.2f;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Speaker ID is: %d\\n\", sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/kws-cxx-api.cc",
    "content": "// cxx-api-examples/kws-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n//\n// This file demonstrates how to use the keyword spotter with sherpa-onnx's\n// C++ API.\n//\n// clang-format off\n//\n// Usage\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2\n//\n// ./kws-cxx-api\n//\n// clang-format on\n#include <array>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  KeywordSpotterConfig config;\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n\n  config.model_config.tokens =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"tokens.txt\";\n\n  config.model_config.provider = \"cpu\";\n  config.model_config.num_threads = 1;\n  config.model_config.debug = 1;\n\n  config.keywords_file =\n      \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"test_wavs/test_keywords.txt\";\n\n  KeywordSpotter kws = KeywordSpotter::Create(config);\n  if (!kws.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  std::cout\n      << \"--Test pre-defined keywords from test_wavs/test_keywords.txt--\\n\";\n\n  std::string wave_filename =\n      
\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n      \"test_wavs/3.wav\";\n\n  std::array<float, 8000> tail_paddings = {0};  // 0.5 seconds\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  OnlineStream stream = kws.CreateStream();\n  if (!stream.Get()) {\n    std::cerr << \"Failed to create stream\\n\";\n    return -1;\n  }\n\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),\n                        tail_paddings.size());\n  stream.InputFinished();\n\n  while (kws.IsReady(&stream)) {\n    kws.Decode(&stream);\n    auto r = kws.GetResult(&stream);\n    if (!r.keyword.empty()) {\n      std::cout << \"Detected keyword: \" << r.json << \"\\n\";\n\n      // Remember to reset the keyword stream right after a keyword is detected\n      kws.Reset(&stream);\n    }\n  }\n\n  // --------------------------------------------------------------------------\n\n  std::cout << \"--Use pre-defined keywords + add a new keyword--\\n\";\n\n  stream = kws.CreateStream(\"y ǎn y uán @演员\");\n\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),\n                        tail_paddings.size());\n  stream.InputFinished();\n\n  while (kws.IsReady(&stream)) {\n    kws.Decode(&stream);\n    auto r = kws.GetResult(&stream);\n    if (!r.keyword.empty()) {\n      std::cout << \"Detected keyword: \" << r.json << \"\\n\";\n\n      // Remember to reset the keyword stream right after a keyword is detected\n      kws.Reset(&stream);\n    }\n  }\n\n  // --------------------------------------------------------------------------\n\n  std::cout << \"--Use pre-defined keywords + add two new keywords--\\n\";\n\n  stream = 
kws.CreateStream(\"y ǎn y uán @演员/zh ī m íng @知名\");\n\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),\n                        tail_paddings.size());\n  stream.InputFinished();\n\n  while (kws.IsReady(&stream)) {\n    kws.Decode(&stream);\n    auto r = kws.GetResult(&stream);\n    if (!r.keyword.empty()) {\n      std::cout << \"Detected keyword: \" << r.json << \"\\n\";\n\n      // Remember to reset the keyword stream right after a keyword is detected\n      kws.Reset(&stream);\n    }\n  }\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/matcha-tts-en-cxx-api.cc",
    "content": "// cxx-api-examples/matcha-tts-en-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for English TTS with MatchaTTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n./matcha-tts-en-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.matcha.acoustic_model =\n      \"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\";\n\n  config.model.matcha.vocoder = \"./vocos-22khz-univ.onnx\";\n\n  config.model.matcha.tokens = \"./matcha-icefall-en_US-ljspeech/tokens.txt\";\n\n  config.model.matcha.data_dir =\n      \"./matcha-icefall-en_US-ljspeech/espeak-ng-data\";\n\n  config.model.num_threads = 1;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  std::string filename = \"./generated-matcha-en-cxx.wav\";\n  std::string text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. 
The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  auto tts = OfflineTts::Create(config);\n  GenerationConfig gen_config;\n  gen_config.sid = 0;\n  gen_config.speed = 1.0;  // larger -> faster in speech speed\n  gen_config.silence_scale = config.silence_scale;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Speaker ID is: %d\\n\", gen_config.sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/matcha-tts-zh-cxx-api.cc",
    "content": "// cxx-api-examples/matcha-tts-zh-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for Chinese TTS with MatchaTTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n./matcha-tts-zh-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n  config.model.matcha.acoustic_model =\n      \"./matcha-icefall-zh-baker/model-steps-3.onnx\";\n  config.model.matcha.vocoder = \"./vocos-22khz-univ.onnx\";\n  config.model.matcha.lexicon = \"./matcha-icefall-zh-baker/lexicon.txt\";\n  config.model.matcha.tokens = \"./matcha-icefall-zh-baker/tokens.txt\";\n  config.model.matcha.dict_dir = \"./matcha-icefall-zh-baker/dict\";\n  config.model.num_threads = 1;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  // clang-format off\n  config.rule_fsts = \"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\";  // NOLINT\n  // clang-format on\n\n  std::string filename = \"./generated-matcha-zh-cxx.wav\";\n  std::string text =\n      \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如\"\n      \"涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感\"\n      \"受着生命的奇迹与温柔.\"\n      
\"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; \"\n      \"经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\";\n\n  auto tts = OfflineTts::Create(config);\n  GenerationConfig gen_config;\n  gen_config.sid = 0;\n  gen_config.speed = 1.0;  // larger -> faster in speech speed\n  gen_config.silence_scale = config.silence_scale;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Speaker ID is: %d\\n\", gen_config.sid);\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/medasr-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/medasr-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use MedASR with sherpa-onnx's C++ API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n*/\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.medasr.model = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\";\n  config.model_config.tokens = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\";\n  // clang-format on\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/moonshine-cxx-api.cc",
    "content": "// cxx-api-examples/moonshine-cxx-api.cc\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Moonshine with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.moonshine.preprocessor =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\";\n  config.model_config.moonshine.encoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\";\n  config.model_config.moonshine.uncached_decoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\";\n  config.model_config.moonshine.cached_decoder =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/moonshine-v2-cxx-api.cc",
    "content": "// cxx-api-examples/moonshine-v2-cxx-api.cc\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Moonshine v2 with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n// tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n// rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.moonshine.encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\";\n  config.model_config.moonshine.merged_decoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\";\n  config.model_config.tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\";\n  // clang-format on\n\n  config.model_config.num_threads = 2;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  
recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/nemo-canary-cxx-api.cc",
    "content": "// cxx-api-examples/nemo-canary-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use NeMo Canary models with\n// sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n// tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n// rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n//\n// clang-format on\n//\n// see https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html\n// for details\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.canary.encoder =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\";\n  config.model_config.canary.decoder =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\";\n\n  // our input audio is German, so we set src_lang to \"de\"\n  config.model_config.canary.src_lang = \"de\";\n\n  // we can set tgt_lang either to de or en in this specific case\n  config.model_config.canary.tgt_lang = \"en\";\n  config.model_config.tokens =\n      \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/de.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return 
-1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text (English): \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  // now output text in German\n  config.model_config.canary.tgt_lang = \"de\";\n  recognizer.SetConfig(config);\n  stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  result = recognizer.GetResult(&stream);\n  std::cout << \"text (German): \" << result.text << \"\\n\";\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/offline-punctuation-cxx-api.cc",
    "content": "// cxx-api-examples/offline-punctuation-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n// To use punctuation model:\n// clang-format off\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n// tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n// rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8.tar.bz2\n// clang-format on\n\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OfflinePunctuationConfig punctuation_config;\n  punctuation_config.model.ct_transformer =\n      \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8/\"\n      \"model.int8.onnx\";\n  punctuation_config.model.num_threads = 1;\n  punctuation_config.model.debug = false;\n  punctuation_config.model.provider = \"cpu\";\n\n  OfflinePunctuation punct = OfflinePunctuation::Create(punctuation_config);\n  if (!punct.Get()) {\n    std::cerr\n        << \"Failed to create punctuation model. Please check your config\\n\";\n    return -1;\n  }\n\n  std::string text = \"你好吗how are you Fantastic 谢谢我很好你怎么样呢\";\n  std::string text_with_punct = punct.AddPunctuation(text);\n  std::cout << \"Original text: \" << text << std::endl;\n  std::cout << \"With punctuation: \" << text_with_punct << std::endl;\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/omnilingual-asr-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/omnilingual-asr-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Omnilingual ASR with sherpa-onnx's C++ API.\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n*/\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.omnilingual.model = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\";\n  config.model_config.tokens = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\";\n  // clang-format on\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  
OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/online-punctuation-cxx-api.cc",
    "content": "// cxx-api-examples/online-punctuation-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n// To use punctuation model:\n// clang-format off\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n// tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n// rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n// clang-format on\n\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OnlinePunctuationConfig punctuation_config;\n  punctuation_config.model.cnn_bilstm =\n      \"sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx\";\n  punctuation_config.model.bpe_vocab =\n      \"sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\";\n  punctuation_config.model.num_threads = 1;\n  punctuation_config.model.debug = false;\n  punctuation_config.model.provider = \"cpu\";\n\n  OnlinePunctuation punct = OnlinePunctuation::Create(punctuation_config);\n  if (!punct.Get()) {\n    std::cerr\n        << \"Failed to create punctuation model. Please check your config\\n\";\n    return -1;\n  }\n\n  std::string text = \"how are you i am fine thank you\";\n  std::string text_with_punct = punct.AddPunctuation(text);\n  std::cout << \"Original text: \" << text << std::endl;\n  std::cout << \"With punctuation: \" << text_with_punct << std::endl;\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/online-speech-enhancement-dpdfnet-cxx-api.cc",
    "content": "// cxx-api-examples/online-speech-enhancement-dpdfnet-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded the DPDFNet model and sample test wave.\n// DPDFNet models are available from either:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// https://huggingface.co/Ceva-IP/DPDFNet\n//\n// An example command to download:\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n//\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OnlineSpeechDenoiserConfig config;\n  std::string model_filename = \"./dpdfnet_baseline.onnx\";\n  std::string wav_filename = \"./inp_16k.wav\";\n  std::string out_wave_filename = \"./enhanced-online-dpdfnet.wav\";\n  config.model.dpdfnet.model = model_filename;\n\n  auto sd = OnlineSpeechDenoiser::Create(config);\n  if (!sd.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  Wave wave = ReadWave(wav_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wav_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::vector<float> samples;\n  auto frame_shift = sd.GetFrameShiftInSamples();\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  for (int32_t start = 0; start < static_cast<int32_t>(wave.samples.size());\n       start += frame_shift) {\n    int32_t n = 
std::min<int32_t>(frame_shift, wave.samples.size() - start);\n    auto denoised = sd.Run(wave.samples.data() + start, n, wave.sample_rate);\n    samples.insert(samples.end(), denoised.samples.begin(),\n                   denoised.samples.end());\n  }\n\n  auto tail = sd.Flush();\n  samples.insert(samples.end(), tail.samples.begin(), tail.samples.end());\n\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << \"Done\\n\";\n\n  WriteWave(out_wave_filename, {samples, sd.GetSampleRate()});\n\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"Saved to \" << out_wave_filename << \"\\n\";\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/online-speech-enhancement-gtcrn-cxx-api.cc",
    "content": "// cxx-api-examples/online-speech-enhancement-gtcrn-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded the GTCRN model and sample test wave from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// An example command to download:\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OnlineSpeechDenoiserConfig config;\n  std::string model_filename = \"./gtcrn_simple.onnx\";\n  std::string wav_filename = \"./inp_16k.wav\";\n  std::string out_wave_filename = \"./enhanced-online-gtcrn.wav\";\n  config.model.gtcrn.model = model_filename;\n\n  auto sd = OnlineSpeechDenoiser::Create(config);\n  if (!sd.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  Wave wave = ReadWave(wav_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wav_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::vector<float> samples;\n  auto frame_shift = sd.GetFrameShiftInSamples();\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  for (int32_t start = 0; start < static_cast<int32_t>(wave.samples.size());\n       start += frame_shift) {\n    int32_t n = std::min<int32_t>(frame_shift, wave.samples.size() - start);\n    auto denoised = sd.Run(wave.samples.data() + start, n, wave.sample_rate);\n    samples.insert(samples.end(), denoised.samples.begin(),\n                   denoised.samples.end());\n  }\n\n  auto tail = sd.Flush();\n  samples.insert(samples.end(), 
tail.samples.begin(), tail.samples.end());\n\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << \"Done\\n\";\n\n  WriteWave(out_wave_filename, {samples, sd.GetSampleRate()});\n\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"Saved to \" << out_wave_filename << \"\\n\";\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/parakeet-tdt-ctc-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/parakeet-tdt-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use parakeet-tdt with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8.tar.bz2\n// tar xvf sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8.tar.bz2\n// rm sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.25;\n  config.silero_vad.min_silence_duration = 0.25;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 5;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 60);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.nemo_ctc.model =\n      \"./sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8/model.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"  If you are using Linux, please try to modify \"\n                 \"./build/bin/sense-voice-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, 
sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                
                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/parakeet-tdt-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/parakeet-tdt-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use parakeet-tdt with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n// tar xvf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n// rm sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.25;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 5;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 60);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx\";\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx\";\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt\";\n\n  config.model_config.model_type = \"nemo_transducer\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"  If you are using Linux, please try \"\n                 \"./build/bin/sense-voice-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    
fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n    mic_sample_rate = atof(sample_rate_str);\n  }\n\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * 
window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/pocket-tts-en-cxx-api.cc",
    "content": "// cxx-api-examples/pocket-tts-en-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for English TTS with PocketTTS.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n./pocket-tts-en-cxx-api\n\n */\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n#include <utility>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.pocket.lm_flow =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\n  config.model.pocket.lm_main =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\n  config.model.pocket.encoder =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\n  config.model.pocket.decoder =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\n  config.model.pocket.text_conditioner =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\n  config.model.pocket.vocab_json =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\n  config.model.pocket.token_scores_json =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  config.model.debug = 1;\n\n  std::string filename = \"./generated-pocket-en-cxx.wav\";\n  
std::string text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar. \"\n      \"Friends fell out often because life was changing so fast. The easiest \"\n      \"thing in the world was to lose touch with someone.\";\n\n  auto tts = OfflineTts::Create(config);\n  GenerationConfig cfg;\n  cfg.speed = 1.0;\n\n  std::string reference_audio_file =\n      \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\n\n  Wave wave = ReadWave(reference_audio_file);\n  cfg.reference_audio = std::move(wave.samples);\n  cfg.reference_sample_rate = wave.sample_rate;\n  cfg.extra[\"max_reference_audio_len\"] = \"10\";\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, cfg);\n#else\n  GeneratedAudio audio = tts.Generate(text, cfg, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/sense-voice-cxx-api.cc",
    "content": "// cxx-api-examples/sense-voice-cxx-api.cc\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use sense voice with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.sense_voice.model =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  config.model_config.sense_voice.use_itn = true;\n  config.model_config.sense_voice.language = \"auto\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/en.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result 
= recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/sense-voice-simulate-streaming-alsa-cxx-api.cc",
    "content": "// cxx-api-examples/sense-voice-simulate-streaming-alsa-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use sense voice with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <string>\n#include <thread>  // NOLINT\n#include <utility>\n#include <vector>\n\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/alsa.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic void RecordCallback(sherpa_onnx::Alsa *alsa) {\n  int32_t chunk = 0.1 * alsa->GetActualSampleRate();\n  while (!stop) {\n    std::vector<float> samples = alsa->Read(chunk);\n\n    std::lock_guard<std::mutex> lock(mutex);\n    samples_queue.emplace(std::move(samples));\n    condition_variable.notify_one();\n  }\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.sense_voice.model =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  config.model_config.sense_voice.use_itn = false;\n  config.model_config.sense_voice.language = \"auto\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main(int32_t argc, const char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nUsage:\n\nwget 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\n./sense-voice-simulate-streaming-alsa-cxx-api device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  if (argc != 2) {\n    fprintf(stderr, \"%s\\n\", kUsageMessage);\n    return -1;\n  }\n\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  int32_t expected_sample_rate = 16000;\n\n  std::string device_name = argv[1];\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::thread record_thread(RecordCallback, &alsa);\n\n  std::cout << \"Started! 
Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      buffer.insert(buffer.end(), s.begin(), s.end());\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n 
     display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  record_thread.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/sense-voice-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/sense-voice-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use sense voice with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.sense_voice.model =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  config.model_config.sense_voice.use_itn = false;\n  config.model_config.sense_voice.language = \"auto\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"  If you are using Linux, please try \"\n                 \"./build/bin/sense-voice-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    // Parse the value first so that the log line reports the rate actually used.\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  float sample_rate = 16000;\n  LinearResampler 
resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float 
elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/sense-voice-with-hr-cxx-api.cc",
    "content": "// cxx-api-examples/sense-voice-with-hr-cxx-api.cc\n//\n// Copyright (c)  2024-2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use sense voice with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n// tar xf dict.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.sense_voice.model =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n  config.model_config.sense_voice.use_itn = true;\n  config.model_config.sense_voice.language = \"auto\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n  config.hr.dict_dir = \"./dict\";\n  config.hr.lexicon = \"./lexicon.txt\";\n\n  // Please see\n  // https://colab.research.google.com/drive/1jEaS3s8FbRJIcVQJv2EQx19EM_mnuARi?usp=sharing\n  // for how to generate your own replace.fst\n  config.hr.rule_fsts = \"./replace.fst\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    
return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"./test-hr.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/sherpa-display.h",
    "content": "// cxx-api-examples/sherpa-display.cc\n// Copyright (c)  2025  Xiaomi Corporation\n#pragma once\n\n#include <stdlib.h>\n\n#include <cstdio>\n#include <ctime>\n#include <iomanip>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx::cxx {\n\nclass SherpaDisplay {\n public:\n  void UpdateText(const std::string &text) { current_text_ = text; }\n\n  void FinalizeCurrentSentence() {\n    if (!current_text_.empty() &&\n        (current_text_[0] != ' ' || current_text_.size() > 1)) {\n      sentences_.push_back({GetCurrentDateTime(), std::move(current_text_)});\n    }\n  }\n\n  void Display() const {\n    if (!sentences_.empty() || !current_text_.empty()) {\n      ClearScreen();\n    }\n\n    printf(\"=== Speech Recognition with Next-gen Kaldi ===\\n\");\n    printf(\"------------------------------\\n\");\n    if (!sentences_.empty()) {\n      int32_t i = 1;\n      for (const auto &p : sentences_) {\n        printf(\"[%s] %d. %s\\n\", p.first.c_str(), i, p.second.c_str());\n        i += 1;\n      }\n\n      printf(\"------------------------------\\n\");\n    }\n\n    if (!current_text_.empty()) {\n      printf(\"Recognizing: %s\\n\", current_text_.c_str());\n    }\n  }\n\n private:\n  static void ClearScreen() {\n#ifdef _MSC_VER\n    auto ret = system(\"cls\");\n#else\n    auto ret = system(\"clear\");\n#endif\n    (void)ret;\n  }\n\n  static std::string GetCurrentDateTime() {\n    std::ostringstream os;\n    auto t = std::time(nullptr);\n    auto tm = std::localtime(&t);\n    os << std::put_time(tm, \"%Y-%m-%d %H:%M:%S\");\n    return os.str();\n  }\n\n private:\n  std::vector<std::pair<std::string, std::string>> sentences_;\n  std::string current_text_;\n};\n\n}  // namespace sherpa_onnx::cxx\n"
  },
  {
    "path": "cxx-api-examples/speech-enhancement-dpdfnet-cxx-api.cc",
    "content": "// cxx-api-examples/speech-enhancement-dpdfnet-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded the DPDFNet model and sample test wave.\n// DPDFNet models are available from either:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// https://huggingface.co/Ceva-IP/DPDFNet\n//\n// An example command to download:\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n//\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OfflineSpeechDenoiserConfig config;\n  std::string model_filename = \"./dpdfnet_baseline.onnx\";\n  std::string wav_filename = \"./inp_16k.wav\";\n  std::string out_wave_filename = \"./enhanced-dpdfnet.wav\";\n  config.model.dpdfnet.model = model_filename;\n\n  auto sd = OfflineSpeechDenoiser::Create(config);\n  if (!sd.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  Wave wave = ReadWave(wav_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wav_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n  auto denoised =\n      sd.Run(wave.samples.data(), wave.samples.size(), wave.sample_rate);\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << \"Done\\n\";\n\n  WriteWave(out_wave_filename, {denoised.samples, denoised.sample_rate});\n\n  const 
float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"Saved to \" << out_wave_filename << \"\\n\";\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/speech-enhancement-gtcrn-cxx-api.cc",
    "content": "// cxx-api-examples/speech-enhancement-gtcrn-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n//\n// We assume you have pre-downloaded the GTCRN model and sample test wave from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// An example command to download:\n// clang-format off\n/*\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n*/\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  OfflineSpeechDenoiserConfig config;\n  std::string model_filename = \"./gtcrn_simple.onnx\";\n  std::string wav_filename = \"./inp_16k.wav\";\n  std::string out_wave_filename = \"./enhanced-gtcrn.wav\";\n  config.model.gtcrn.model = model_filename;\n\n  auto sd = OfflineSpeechDenoiser::Create(config);\n  if (!sd.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n\n  Wave wave = ReadWave(wav_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wav_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n  auto denoised =\n      sd.Run(wave.samples.data(), wave.samples.size(), wave.sample_rate);\n  const auto end = std::chrono::steady_clock::now();\n  std::cout << \"Done\\n\";\n\n  WriteWave(out_wave_filename, {denoised.samples, denoised.sample_rate});\n\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"Saved to \" << 
out_wave_filename << \"\\n\";\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/streaming-t-one-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/streaming-t-one-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming T-one\n// with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n// tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n// rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OnlineRecognizerConfig config;\n\n  // please see\n  config.model_config.t_one_ctc.model =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\";\n\n  config.model_config.tokens =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\";\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OnlineStream stream = recognizer.CreateStream();\n  std::vector<float> left_padding(2400);  // 0.3 seconds at 8kHz\n  std::vector<float> tail_padding(4800);  // 0.6 seconds at 8kHz\n\n  stream.AcceptWaveform(wave.sample_rate, left_padding.data(),\n                        left_padding.size());\n  
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n  stream.AcceptWaveform(wave.sample_rate, tail_padding.data(),\n                        tail_padding.size());\n  stream.InputFinished();\n\n  while (recognizer.IsReady(&stream)) {\n    recognizer.Decode(&stream);\n  }\n\n  OnlineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/streaming-zipformer-cxx-api.cc",
    "content": "// cxx-api-examples/streaming-zipformer-cxx-api.cc\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Zipformer\n// with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OnlineRecognizerConfig config;\n\n  // please see\n  // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"encoder-epoch-99-avg-1.int8.onnx\";\n\n  // Note: We recommend not using int8.onnx for the decoder.\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"joiner-epoch-99-avg-1.int8.onnx\";\n\n  config.model_config.tokens =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string 
wave_filename =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/\"\n      \"0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OnlineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n  stream.InputFinished();\n\n  while (recognizer.IsReady(&stream)) {\n    recognizer.Decode(&stream);\n  }\n\n  OnlineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/streaming-zipformer-rtf-cxx-api.cc",
    "content": "// cxx-api-examples/streaming-zipformer-rtf-cxx-api.cc\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Zipformer\n// with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// cd /path/sherpa-onnx/\n// mkdir build\n// cd build\n// cmake ..\n// make\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n//\n// #  1. Test on CPU, run once\n//\n// ./bin/streaming-zipformer-rtf-cxx-api\n//\n// #  2. Test on CPU, run 10 times\n//\n// ./bin/streaming-zipformer-rtf-cxx-api 10\n//\n// #  3. Test on GPU, run 10 times\n//\n// ./bin/streaming-zipformer-rtf-cxx-api 10 cuda\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <cstdlib>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main(int argc, char *argv[]) {\n  int32_t num_runs = 1;\n  if (argc >= 2) {\n    num_runs = atoi(argv[1]);\n    // Fall back to a single run for non-positive or unparsable values;\n    // this also avoids dividing by zero when averaging below.\n    if (num_runs <= 0) {\n      num_runs = 1;\n    }\n  }\n\n  bool use_gpu = (argc == 3);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OnlineRecognizerConfig config;\n\n  // please see\n  // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"encoder-epoch-99-avg-1.int8.onnx\";\n\n  // Note: We recommend not using int8.onnx for the decoder.\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"joiner-epoch-99-avg-1.int8.onnx\";\n\n  config.model_config.tokens =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n  config.model_config.provider = use_gpu ? \"cuda\" : \"cpu\";\n\n  std::cout << \"Loading model\\n\";\n  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/\"\n      \"0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  float total_elapsed_seconds = 0;\n  OnlineRecognizerResult result;\n  for (int32_t i = 0; i < num_runs; ++i) {\n    const auto begin = std::chrono::steady_clock::now();\n\n    OnlineStream stream = recognizer.CreateStream();\n    stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                          wave.samples.size());\n    stream.InputFinished();\n\n    while (recognizer.IsReady(&stream)) {\n      recognizer.Decode(&stream);\n    }\n\n    result = recognizer.GetResult(&stream);\n\n    auto end = std::chrono::steady_clock::now();\n    float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n            .count() /\n        1000.;\n    printf(\"Run %d/%d, elapsed seconds: %.3f\\n\", i + 1, num_runs,\n           elapsed_seconds);\n    total_elapsed_seconds += elapsed_seconds;\n  }\n  float average_elapsed_seconds = total_elapsed_seconds / num_runs;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = average_elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Total Elapsed seconds: %.3fs\\n\", total_elapsed_seconds);\n  printf(\"Num runs: %d\\n\", num_runs);\n  printf(\"Elapsed seconds per run: %.3f/%d=%.3f\\n\", total_elapsed_seconds,\n         num_runs, average_elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\",\n         average_elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/streaming-zipformer-with-hr-cxx-api.cc",
    "content": "// cxx-api-examples/streaming-zipformer-with-hr-cxx-api.cc\n// Copyright (c)  2024-2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use streaming Zipformer\n// with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n// tar xf dict.tar.bz2\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OnlineRecognizerConfig config;\n\n  // please see\n  // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"encoder-epoch-99-avg-1.int8.onnx\";\n\n  // Note: We recommend not using int8.onnx for the decoder.\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n      \"joiner-epoch-99-avg-1.int8.onnx\";\n\n  config.model_config.tokens =\n      
\"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  config.hr.dict_dir = \"./dict\";\n  config.hr.lexicon = \"./lexicon.txt\";\n\n  // Please see\n  // https://colab.research.google.com/drive/1jEaS3s8FbRJIcVQJv2EQx19EM_mnuARi?usp=sharing\n  // for how to generate your own replace.fst\n  config.hr.rule_fsts = \"./replace.fst\";\n\n  std::cout << \"Loading model\\n\";\n  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"./test-hr.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OnlineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n  stream.InputFinished();\n\n  while (recognizer.IsReady(&stream)) {\n    recognizer.Decode(&stream);\n  }\n\n  OnlineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/supertonic-tts-en-cxx-api.cc",
    "content": "// cxx-api-examples/supertonic-tts-en-cxx-api.cc\n//\n// Copyright (c)  2026  zengyw\n\n// This file shows how to use sherpa-onnx CXX API\n// for English TTS with Supertonic.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n./supertonic-tts-en-cxx-api\n\n*/\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.supertonic.duration_predictor =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/\"\n      \"duration_predictor.int8.onnx\";\n  config.model.supertonic.text_encoder =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\";\n  config.model.supertonic.vector_estimator =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\";\n  config.model.supertonic.vocoder =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\";\n  config.model.supertonic.tts_json =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\";\n  config.model.supertonic.unicode_indexer =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\";\n  config.model.supertonic.voice_style =\n      \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\";\n\n  config.model.num_threads = 2;\n\n  // If you don't want to see debug messages, please set it to 0\n  
config.model.debug = 1;\n\n  std::string filename = \"./generated-supertonic-en-cxx.wav\";\n  std::string text =\n      \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n      \"does not have two-thirds of his day for himself, is a slave, whatever \"\n      \"he may be: a statesman, a businessman, an official, or a scholar.\";\n\n  auto tts = OfflineTts::Create(config);\n\n  GenerationConfig gen_config;\n  gen_config.sid = 6;\n  gen_config.num_steps = 5;\n  gen_config.speed = 1.25;  // larger -> faster\n  gen_config.extra[\"lang\"] = \"en\";\n\n  // Use GenerationConfig for Supertonic.\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/vad-cxx-api.cc",
    "content": "// cxx-api-examples/vad-cxx-api.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use VAD to remove silences from a file\n// clang-format off\n//\n// To use silero-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// To use ten-vad:\n//  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n//\n// clang-format on\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  std::string wave_filename = \"./lei-jun-test.wav\";\n  if (!FileExists(wave_filename)) {\n    fprintf(stderr, \"Please download %s\\n\", wave_filename.c_str());\n    return -1;\n  }\n\n  std::string vad_filename;\n  bool use_silero_vad = false;\n  bool use_ten_vad = false;\n\n  if (FileExists(\"./silero_vad.onnx\")) {\n    printf(\"Use silero-vad\\n\");\n    vad_filename = \"./silero_vad.onnx\";\n    use_silero_vad = true;\n  } else if (FileExists(\"./ten-vad.onnx\")) {\n    printf(\"Use ten-vad\\n\");\n    vad_filename = \"./ten-vad.onnx\";\n    use_ten_vad = true;\n  } else {\n    fprintf(stderr, \"Please provide either silero_vad.onnx or ten-vad.onnx\\n\");\n    return -1;\n  }\n\n  VadModelConfig config;\n  if (use_silero_vad) {\n    config.silero_vad.model = vad_filename;\n    config.silero_vad.threshold = 0.3;\n    config.silero_vad.min_silence_duration = 0.5;\n    config.silero_vad.min_speech_duration = 0.25;\n    config.silero_vad.max_speech_duration = 20;\n    config.silero_vad.window_size = 512;\n  } else if (use_ten_vad) {\n    config.ten_vad.model = vad_filename;\n    config.ten_vad.threshold = 0.3;\n    config.ten_vad.min_silence_duration = 0.5;\n    config.ten_vad.min_speech_duration = 0.25;\n    
config.ten_vad.max_speech_duration = 20;\n    config.ten_vad.window_size = 256;\n  }\n\n  config.sample_rate = 16000;\n  config.debug = true;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. Please check your config\\n\";\n    return -1;\n  }\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n  bool is_eof = false;\n  int32_t i = 0;\n  int32_t window_size = use_silero_vad ? config.silero_vad.window_size\n                                       : config.ten_vad.window_size;\n\n  int32_t sample_rate = config.sample_rate;\n\n  std::vector<float> samples_without_silence;\n\n  while (!is_eof) {\n    if (i + window_size < wave.samples.size()) {\n      vad.AcceptWaveform(wave.samples.data() + i, window_size);\n      i += window_size;\n    } else {\n      is_eof = true;\n      vad.Flush();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n      float start_time = segment.start / static_cast<float>(sample_rate);\n      float end_time =\n          start_time + segment.samples.size() / static_cast<float>(sample_rate);\n      printf(\"%.3f -- %.3f\\n\", start_time, end_time);\n\n      samples_without_silence.insert(samples_without_silence.end(),\n                                     segment.samples.begin(),\n                                     segment.samples.end());\n\n      vad.Pop();\n    }\n  }\n\n  bool ok = WriteWave(\"./lei-jun-test-no-silence.wav\",\n                      {samples_without_silence, sample_rate});\n  if (ok) {\n    std::cout << \"Saved to ./lei-jun-test-no-silence.wav\\n\";\n  } else {\n    std::cerr << \"Failed to write ./lei-jun-test-no-silence.wav\\n\";\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/wenet-ctc-cxx-api.cc",
    "content": "// cxx-api-examples/wenet-ctc-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Wenet CTC with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.wenet_ctc.model = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx\";\n  config.model_config.tokens = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\";\n  // clang-format on\n\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/wenet-ctc-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/wenet-ctc-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Wenet CTC with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n// rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  // clang-format off\n  config.model_config.wenet_ctc.model = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx\";\n  config.model_config.tokens = \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt\";\n  // clang-format on\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"No microphone devices found.\\n\"\n                 \"  If you are using Linux, please try \"\n                 \"./build/bin/sense-voice-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    // Parse first so that the message reports the rate actually used.\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate 
!= sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        
std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/whisper-cxx-api.cc",
    "content": "// cxx-api-examples/whisper-cxx-api.cc\n// Copyright (c)  2024  Xiaomi Corporation\n\n//\n// This file demonstrates how to use whisper with sherpa-onnx's C++ API.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n// tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n// rm sherpa-onnx-whisper-tiny.en.tar.bz2\n//\n// clang-format on\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nint32_t main() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.whisper.encoder =\n      \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\";\n  config.model_config.whisper.decoder =\n      \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\";\n\n  config.model_config.num_threads = 1;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    return -1;\n  }\n  std::cout << \"Loading model done\\n\";\n\n  std::string wave_filename = \"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\";\n  Wave wave = ReadWave(wave_filename);\n  if (wave.samples.empty()) {\n    std::cerr << \"Failed to read: '\" << wave_filename << \"'\\n\";\n    return -1;\n  }\n\n  std::cout << \"Start recognition\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n\n  OfflineStream stream = recognizer.CreateStream();\n  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n                        wave.samples.size());\n\n  recognizer.Decode(&stream);\n\n  OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n  const auto end = std::chrono::steady_clock::now();\n  const float elapsed_seconds =\n      
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);\n  float rtf = elapsed_seconds / duration;\n\n  std::cout << \"text: \" << result.text << \"\\n\";\n  printf(\"Number of threads: %d\\n\", config.model_config.num_threads);\n  printf(\"Duration: %.3fs\\n\", duration);\n  printf(\"Elapsed seconds: %.3fs\\n\", elapsed_seconds);\n  printf(\"(Real time factor) RTF = %.3f / %.3f = %.3f\\n\", elapsed_seconds,\n         duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/zipformer-ctc-simulate-streaming-alsa-cxx-api.cc",
    "content": "// cxx-api-examples/zipformer-ctc-simulate-streaming-alsa-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use zipformer CTC with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n// tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n// rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <string>\n#include <thread>  // NOLINT\n#include <utility>\n#include <vector>\n\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/alsa.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic void RecordCallback(sherpa_onnx::Alsa *alsa) {\n  int32_t chunk = 0.1 * alsa->GetActualSampleRate();\n  while (!stop) {\n    std::vector<float> samples = alsa->Read(chunk);\n\n    std::lock_guard<std::mutex> lock(mutex);\n    samples_queue.emplace(std::move(samples));\n    condition_variable.notify_one();\n  }\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.zipformer_ctc.model =\n      \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main(int32_t argc, const char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nwget 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\ntar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nrm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n./zipformer-ctc-simulate-streaming-alsa-cxx-api device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  if (argc != 2) {\n    fprintf(stderr, \"%s\\n\", kUsageMessage);\n    return -1;\n  }\n\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  int32_t expected_sample_rate = 16000;\n\n  std::string device_name = argv[1];\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::thread record_thread(RecordCallback, &alsa);\n\n  std::cout << \"Started! 
Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      buffer.insert(buffer.end(), s.begin(), s.end());\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(expected_sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n 
     display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  record_thread.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/zipformer-ctc-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/zipformer-ctc-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n\n//\n// This file demonstrates how to use Zipformer CTC with sherpa-onnx's C++ API\n// for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n// tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n// rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.zipformer_ctc.model =\n      \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"  If you are using Linux, please try \"\n                 \"./build/bin/zipformer-ctc-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    // Parse the value first so that the message reports the rate actually used\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    
float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                   
                           started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/zipformer-transducer-simulate-streaming-microphone-cxx-api.cc",
    "content": "// cxx-api-examples/zipformer-transducer-simulate-streaming-microphone-cxx-api.cc\n// Copyright (c)  2025  Xiaomi Corporation\n//\n// This file demonstrates how to use Zipformer transducer with sherpa-onnx's C++\n// API for streaming speech recognition from a microphone.\n//\n// clang-format off\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n// tar xvf sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n// rm sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2\n//\n// clang-format on\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <iostream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <vector>\n\n#include \"portaudio.h\"       // NOLINT\n#include \"sherpa-display.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic sherpa_onnx::cxx::VoiceActivityDetector CreateVad() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  VadModelConfig config;\n  config.silero_vad.model = \"./silero_vad.onnx\";\n  config.silero_vad.threshold = 0.5;\n  config.silero_vad.min_silence_duration = 0.1;\n  config.silero_vad.min_speech_duration = 0.25;\n  config.silero_vad.max_speech_duration = 8;\n  config.sample_rate = 16000;\n  config.debug = false;\n\n  VoiceActivityDetector vad = VoiceActivityDetector::Create(config, 20);\n  if (!vad.Get()) {\n    std::cerr << \"Failed to create VAD. 
Please check your config\\n\";\n    exit(-1);\n  }\n\n  return vad;\n}\n\nstatic sherpa_onnx::cxx::OfflineRecognizer CreateOfflineRecognizer() {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineRecognizerConfig config;\n\n  config.model_config.transducer.encoder =\n      \"./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/\"\n      \"encoder-epoch-99-avg-1.int8.onnx\";\n\n  config.model_config.transducer.decoder =\n      \"./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/\"\n      \"decoder-epoch-99-avg-1.onnx\";\n\n  config.model_config.transducer.joiner =\n      \"./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/\"\n      \"joiner-epoch-99-avg-1.int8.onnx\";\n  config.model_config.tokens =\n      \"./sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01/tokens.txt\";\n\n  config.model_config.num_threads = 2;\n  config.model_config.debug = false;\n\n  std::cout << \"Loading model\\n\";\n  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);\n  if (!recognizer.Get()) {\n    std::cerr << \"Please check your config\\n\";\n    exit(-1);\n  }\n  std::cout << \"Loading model done\\n\";\n  return recognizer;\n}\n\nint32_t main() {\n  signal(SIGINT, Handler);\n\n  using namespace sherpa_onnx::cxx;  // NOLINT\n\n  auto vad = CreateVad();\n  auto recognizer = CreateOfflineRecognizer();\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  if (num_devices == 0) {\n    std::cerr << \"  If you are using Linux, please try \"\n                 \"./build/bin/zipformer-ctc-simulate-streaming-alsa-cxx-api\\n\";\n    return -1;\n  }\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *sample_rate_str = 
std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (sample_rate_str) {\n    // Parse the value first so that the message reports the rate actually used\n    mic_sample_rate = atof(sample_rate_str);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  float sample_rate = 16000;\n  LinearResampler resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = LinearResampler::Create(mic_sample_rate, sample_rate,\n                                        lowpass_cutoff, lowpass_filter_width);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    std::cerr << \"Failed to open microphone device\\n\";\n    return -1;\n  }\n\n  int32_t window_size = 512;  // samples, please don't change\n\n  int32_t offset = 0;\n  std::vector<float> buffer;\n  bool speech_started = false;\n\n  auto started_time = std::chrono::steady_clock::now();\n\n  SherpaDisplay display;\n\n  std::cout << \"Started! 
Please speak\\n\";\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler.Get()) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        auto resampled = resampler.Resample(s.data(), s.size(), false);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad.AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad.IsDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad.IsEmpty()) {\n      auto segment = vad.Front();\n\n      vad.Pop();\n\n      OfflineStream stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sample_rate, 
segment.samples.data(),\n                            segment.samples.size());\n\n      recognizer.Decode(&stream);\n\n      OfflineRecognizerResult result = recognizer.GetResult(&stream);\n\n      display.UpdateText(result.text);\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "cxx-api-examples/zipvoice-tts-zh-en-cxx-api.cc",
    "content": "// cxx-api-examples/zipvoice-tts-zh-en-cxx-api.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx CXX API\n// for Chinese/English zero-shot TTS with ZipVoice.\n//\n// clang-format off\n/*\nUsage\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n./zipvoice-tts-zh-en-cxx-api\n*/\n// clang-format on\n\n#include <cstdint>\n#include <cstdio>\n#include <string>\n#include <utility>\n\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\nstatic int32_t ProgressCallback(const float *samples, int32_t num_samples,\n                                float progress, void *arg) {\n  fprintf(stderr, \"Progress: %.3f%%\\n\", progress * 100);\n  // return 1 to continue generating\n  // return 0 to stop generating\n  return 1;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  using namespace sherpa_onnx::cxx;  // NOLINT\n  OfflineTtsConfig config;\n\n  config.model.zipvoice.encoder =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\";\n  config.model.zipvoice.decoder =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\";\n  config.model.zipvoice.data_dir =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\";\n  config.model.zipvoice.lexicon =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\";\n  config.model.zipvoice.tokens =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\";\n  config.model.zipvoice.vocoder = \"./vocos_24khz.onnx\";\n\n  config.model.num_threads = 2;\n\n  // If you want to see debug messages, please set it to 1\n  config.model.debug = 0;\n\n  std::string filename = 
\"./generated-zipvoice-zh-en-cxx.wav\";\n  std::string text =\n      \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, \"\n      \"就是全心投入并享受其中.\";\n  std::string reference_text =\n      \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\";\n  std::string reference_audio_file =\n      \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\";\n\n  auto tts = OfflineTts::Create(config);\n\n  GenerationConfig gen_config;\n  gen_config.speed = 1.0;\n  gen_config.num_steps = 4;\n  gen_config.reference_text = reference_text;\n  gen_config.extra[\"min_char_in_sentence\"] = \"10\";\n\n  Wave wave = ReadWave(reference_audio_file);\n  gen_config.reference_audio = std::move(wave.samples);\n  gen_config.reference_sample_rate = wave.sample_rate;\n\n#if 0\n  // If you don't want to use a callback, then please enable this branch\n  GeneratedAudio audio = tts.Generate(text, gen_config);\n#else\n  GeneratedAudio audio = tts.Generate(text, gen_config, ProgressCallback);\n#endif\n\n  WriteWave(filename, {audio.samples, audio.sample_rate});\n\n  fprintf(stderr, \"Input text is: %s\\n\", text.c_str());\n  fprintf(stderr, \"Saved to: %s\\n\", filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "dart-api-examples/.gitignore",
    "content": "!run*.sh\n# See https://www.dartlang.org/guides/libraries/private-files\n\n# Files and directories created by pub\n.dart_tool/\n.packages\nbuild/\n# If you're building an application, you may want to check-in your pubspec.lock\npubspec.lock\n\n# Directory created by dartdoc\n# If you don't generate documentation locally you can remove this line.\ndoc/api/\n\n# dotenv environment variables file\n.env*\n\n# Avoid committing generated Javascript files:\n*.dart.js\n*.info.json      # Produced by the --dump-info flag.\n*.js             # When generated by dart2js. Don't specify *.js if your\n                 # project includes source files written in JavaScript.\n*.js_\n*.js.deps\n*.js.map\n\n.flutter-plugins\n.flutter-plugins-dependencies\n"
  },
  {
    "path": "dart-api-examples/README.md",
    "content": "# Introduction\n\nThis directory contains examples for Dart API.\n\nYou can find the package at\nhttps://pub.dev/packages/sherpa_onnx\n\n## Description\n\n| Directory | Description |\n|-----------|-------------|\n| [./speaker-diarization](./speaker-diarization)| Example for speaker diarization.|\n| [./add-punctuations](./add-punctuations)| Example for adding punctuations to text.|\n| [./audio-tagging](./audio-tagging)| Example for audio tagging.|\n| [./keyword-spotter](./keyword-spotter)| Example for keyword spotting|\n| [./non-streaming-asr](./non-streaming-asr)| Example for non-streaming speech recognition|\n| [./speaker-identification](./speaker-identification)| Example for speaker identification and verification.|\n| [./streaming-asr](./streaming-asr)| Example for streaming speech recognition|\n| [./tts](./tts)| Example for text to speech|\n| [./vad-with-non-streaming-asr](./vad-with-non-streaming-asr)| Example for voice activity detection with non-streaming speech recognition. You can use it to generate subtitles.|\n| [./vad](./vad)| Example for voice activity detection|\n| [./speech-enhancement-gtcrn](./speech-enhancement-gtcrn)| Example for speech enhancement/denoising with GTCRN.|\n| [./speech-enhancement-dpdfnet](./speech-enhancement-dpdfnet)| Example for speech enhancement/denoising with DPDFNet, including the 16 kHz family (`dpdfnet_baseline`, `dpdfnet2`, `dpdfnet4`, `dpdfnet8`).|\n| [./streaming-speech-enhancement-gtcrn](./streaming-speech-enhancement-gtcrn)| Example for streaming speech enhancement/denoising with GTCRN.|\n| [./streaming-speech-enhancement-dpdfnet](./streaming-speech-enhancement-dpdfnet)| Example for streaming speech enhancement/denoising with DPDFNet.|\n\n## How to create an example in this folder\n\n```bash\ndart create vad\ncd vad\n\n# Edit pubspec.yaml and add sherpa_onnx to dependencies\n\ndart pub get\ndart run\n```\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx to add punctuations to text.\n\n| File | Description|\n|------|------------|\n|[./bin/punctuations.dart](./bin/punctuations.dart)| Use a [CT Transformer model](https://modelscope.cn/models/iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary) to add punctuations to text. See [./run-ct-transformer.sh](./run-ct-transformer.sh)|\n\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/bin/punctuations.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()..addOption('model', help: 'Path to model.onnx');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final modelFile = res['model'] as String;\n  final modelConfig = sherpa_onnx.OfflinePunctuationModelConfig(\n    ctTransformer: modelFile,\n    numThreads: 1,\n    provider: 'cpu',\n    debug: false,\n  );\n\n  final config = sherpa_onnx.OfflinePunctuationConfig(model: modelConfig);\n\n  final punct = sherpa_onnx.OfflinePunctuation(config: config);\n\n  final texts = [\n    '这是一个测试你好吗How are you我很好thank you are you ok谢谢你',\n    '我们都是木头人不会说话不会动',\n    'The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry',\n  ];\n\n  for (final t in texts) {\n    final textWithPunct = punct.addPunct(t);\n    print('----------');\n    print('Before: $t');\n    print('After: $textWithPunct');\n  }\n  print('----------');\n\n  punct.free();\n}\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/pubspec.yaml",
    "content": "name: add_punctuations\n\ndescription: >\n  This example demonstrates how to use the Dart API to add punctuations to text.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/add-punctuations/run-ct-transformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [[ ! -f ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nfi\n\ndart run \\\n  ./bin/punctuations.dart \\\n  --model ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx for audio tagging.\n\n| File | Description|\n|------|------------|\n|[./bin/zipformer.dart](./bin/zipformer.dart)| Use a Zipformer model for audio tagging. See [./run-zipformer.sh](./run-zipformer.sh)|\n|[./bin/ced.dart](./bin/ced.dart)| Use a [CED](https://github.com/RicherMans/CED) model for audio tagging. See [./run-ced.sh](./run-ced.sh)|\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/bin/ced.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the zipformer model')\n    ..addOption('labels', help: 'Path to class_labels_indices.csv')\n    ..addOption('top-k', help: 'topK events to be returned', defaultsTo: '5')\n    ..addOption('wav', help: 'Path to test.wav to be tagged');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null || res['labels'] == null || res['wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final labels = res['labels'] as String;\n  final topK = int.tryParse(res['top-k'] as String) ?? 5;\n  final wav = res['wav'] as String;\n\n  final modelConfig = sherpa_onnx.AudioTaggingModelConfig(\n    ced: model,\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  );\n\n  final config = sherpa_onnx.AudioTaggingConfig(\n    model: modelConfig,\n    labels: labels,\n  );\n\n  final at = sherpa_onnx.AudioTagging(config: config);\n\n  final waveData = sherpa_onnx.readWave(wav);\n\n  final stream = at.createStream();\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n\n  final events = at.compute(stream: stream, topK: topK);\n\n  print(events);\n\n  stream.free();\n  at.free();\n}\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/bin/zipformer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the zipformer model')\n    ..addOption('labels', help: 'Path to class_labels_indices.csv')\n    ..addOption('top-k', help: 'topK events to be returned', defaultsTo: '5')\n    ..addOption('wav', help: 'Path to test.wav to be tagged');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null || res['labels'] == null || res['wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final labels = res['labels'] as String;\n  final topK = int.tryParse(res['top-k'] as String) ?? 5;\n  final wav = res['wav'] as String;\n\n  final zipformerModelConfig =\n      sherpa_onnx.OfflineZipformerAudioTaggingModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.AudioTaggingModelConfig(\n    zipformer: zipformerModelConfig,\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  );\n\n  final config = sherpa_onnx.AudioTaggingConfig(\n    model: modelConfig,\n    labels: labels,\n  );\n\n  final at = sherpa_onnx.AudioTagging(config: config);\n\n  final waveData = sherpa_onnx.readWave(wav);\n\n  final stream = at.createStream();\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n\n  final events = at.compute(stream: stream, topK: topK);\n\n  print(events);\n\n  stream.free();\n  at.free();\n}\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/pubspec.yaml",
    "content": "name: audio_tagging\n\ndescription: >\n  This example demonstrates how to use the Dart API for audio tagging.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/run-ced.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [[ ! -f ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\nfi\n\nfor w in 1 2 3 4 5 6; do\n  dart run \\\n    ./bin/ced.dart \\\n    --model ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx \\\n    --labels ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv \\\n    --wav ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/$w.wav\ndone\n"
  },
  {
    "path": "dart-api-examples/audio-tagging/run-zipformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [[ ! -f ./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n  tar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n  rm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nfi\n\nfor w in 1 2 3 4 5 6; do\n  dart run \\\n    ./bin/zipformer.dart \\\n    --model ./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx \\\n    --labels ./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv \\\n    --wav ./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/$w.wav\ndone\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/README.md",
    "content": "# Introduction\n\nThis directory contains keyword spotting examples using\nDart API from [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/bin/zipformer-transducer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('joiner', help: 'Path to joiner model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('keywords-file', help: 'Path to keywords.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['joiner'] == null ||\n      res['tokens'] == null ||\n      res['keywords-file'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final joiner = res['joiner'] as String;\n  final tokens = res['tokens'] as String;\n  final keywordsFile = res['keywords-file'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final transducer = sherpa_onnx.OnlineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    transducer: transducer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.KeywordSpotterConfig(\n    model: modelConfig,\n    keywordsFile: keywordsFile,\n  );\n  final spotter = sherpa_onnx.KeywordSpotter(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  var stream = spotter.createStream();\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // A chunkSize of a single sample is also OK, i.e., chunkSize = 1.\n  // Note: trailing samples that do not fill a whole chunk are not fed.\n  final chunkSize = 1600; // 0.1 seconds at 16 kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (spotter.isReady(stream)) {\n      spotter.decode(stream);\n      final result = spotter.getResult(stream);\n      if (result.keyword != '') {\n        // Remember to reset the stream right after detecting a keyword\n        spotter.reset(stream);\n        print('Detected: ${result.keyword}');\n      }\n    }\n  }\n\n  // 0.5 seconds of silence, assuming sampleRate is 16 kHz\n  final tailPaddings = Float32List(8000);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (spotter.isReady(stream)) {\n    spotter.decode(stream);\n    final result = spotter.getResult(stream);\n    if (result.keyword != '') {\n      // Reset here as well, for the same reason as in the loop above\n      spotter.reset(stream);\n      print('Detected: ${result.keyword}');\n    }\n  }\n\n  stream.free();\n  spotter.free();\n}\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/pubspec.yaml",
    "content": "name: keyword_spotter\n\ndescription: >\n  This example demonstrates how to use the Dart API for keyword spotting\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/keyword-spotter/run-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --encoder ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --decoder ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --joiner ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx \\\n  --tokens ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt \\\n  --keywords-file ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt \\\n  --input-wav ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\n\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/README.md",
    "content": "# Introduction\n\nThis folder contains examples for non-streaming ASR with Dart API.\n\n| File | Description|\n|------|------------|\n|[./bin/dolphin-ctc.dart](./bin/dolphin-ctc.dart)| Use a [Dolphin](https://github.com/DataoceanAI/Dolphin) Ctc model for speech recognition. See [./run-dolphin-ctc.sh](./run-dolphin-ctc.sh)|\n|[./bin/nemo-ctc.dart](./bin/nemo-ctc.dart)| Use a NeMo Ctc model for speech recognition. See [./run-nemo-ctc.sh](./run-nemo-ctc.sh)|\n|[./bin/nemo-transducer.dart](./bin/nemo-transducer.dart)| Use a NeMo transducer model for speech recognition. See [./run-nemo-transducer.sh](./run-nemo-transducer.sh)|\n|[./bin/paraformer.dart](./bin/paraformer.dart)|Use a paraformer model for speech recognition. See [./run-paraformer.sh](./run-paraformer.sh)|\n|[./bin/telespeech-ctc.dart](./bin/telespeech-ctc.dart)| Use models from [Tele-AI/TeleSpeech-ASR](https://github.com/Tele-AI/TeleSpeech-ASR) for speech recognition. See [./run-telespeech-ctc.sh](./run-telespeech-ctc.sh)|\n|[./bin/whisper.dart](./bin/whisper.dart)| Use whisper for speech recognition. See [./run-whisper.sh](./run-whisper.sh)|\n|[./bin/zipformer-transducer.dart](./bin/zipformer-transducer.dart)| Use a zipformer transducer for speech recognition. See [./run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|\n|[./bin/vad-with-paraformer.dart](./bin/vad-with-paraformer.dart)| Use a [silero-vad](https://github.com/snakers4/silero-vad) with paraformer for speech recognition. See [./run-vad-with-paraformer.sh](./run-vad-with-paraformer.sh)|\n|[./bin/sense-voice.dart](./bin/sense-voice.dart)| Use a SenseVoice CTC model for speech recognition. See [./run-sense-voice.sh](./run-sense-voice.sh)|\n\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/dolphin-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the Dolphin CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final dolphin = sherpa_onnx.OfflineDolphinModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    dolphin: dolphin,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/fire-red-asr-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the FireRedASR CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final fireRedAsrCtc = sherpa_onnx.OfflineFireRedAsrCtcModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    fireRedAsrCtc: fireRedAsrCtc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n    samples: waveData.samples,\n    sampleRate: waveData.sampleRate,\n  );\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/fire-red-asr.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the FireRedAsr encoder model')\n    ..addOption('decoder', help: 'Path to FireRedAsr decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final fireRedAsr = sherpa_onnx.OfflineFireRedAsrModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    fireRedAsr: fireRedAsr,\n    tokens: tokens,\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/funasr-nano.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder-adaptor', help: 'Path to the encoder adaptor model')\n    ..addOption('llm', help: 'Path to the llm model')\n    ..addOption('embedding', help: 'Path to the embedding model')\n    ..addOption('tokenizer', help: 'Path to the tokenizer directory')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder-adaptor'] == null ||\n      res['llm'] == null ||\n      res['embedding'] == null ||\n      res['tokenizer'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoderAdaptor = res['encoder-adaptor'] as String;\n  final llm = res['llm'] as String;\n  final embedding = res['embedding'] as String;\n  final tokenizer = res['tokenizer'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final funasrNano = sherpa_onnx.OfflineFunAsrNanoModelConfig(\n    encoderAdaptor: encoderAdaptor,\n    llm: llm,\n    embedding: embedding,\n    tokenizer: tokenizer,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    funasrNano: funasrNano,\n    tokens: '',\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n    samples: waveData.samples,\n    sampleRate: waveData.sampleRate,\n  );\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/medasr-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the MedASR CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final medasr = sherpa_onnx.OfflineMedAsrCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    medasr: medasr,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n    samples: waveData.samples,\n    sampleRate: waveData.sampleRate,\n  );\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/moonshine.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('preprocessor',\n        help: 'Path to the moonshine preprocessor model')\n    ..addOption('encoder', help: 'Path to the moonshine encoder model')\n    ..addOption('uncached-decoder',\n        help: 'Path to moonshine uncached decoder model')\n    ..addOption('cached-decoder',\n        help: 'Path to moonshine cached decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['preprocessor'] == null ||\n      res['encoder'] == null ||\n      res['uncached-decoder'] == null ||\n      res['cached-decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final preprocessor = res['preprocessor'] as String;\n  final encoder = res['encoder'] as String;\n  final uncachedDecoder = res['uncached-decoder'] as String;\n  final cachedDecoder = res['cached-decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final moonshine = sherpa_onnx.OfflineMoonshineModelConfig(\n    preprocessor: preprocessor,\n    encoder: encoder,\n    uncachedDecoder: uncachedDecoder,\n    cachedDecoder: cachedDecoder,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    moonshine: moonshine,\n    tokens: tokens,\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  
stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/moonshine_v2.dart",
    "content": "// Copyright (c)  2024-2026  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the moonshine v2 encoder model')\n    ..addOption('decoder', help: 'Path to moonshine v2 decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final moonshine = sherpa_onnx.OfflineMoonshineModelConfig(\n    encoder: encoder,\n    mergedDecoder: decoder,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    moonshine: moonshine,\n    tokens: tokens,\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n    samples: waveData.samples,\n    sampleRate: waveData.sampleRate,\n  );\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/nemo-canary.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the NeMo Canary encoder model')\n    ..addOption('decoder', help: 'Path to the NeMo Canary decoder model')\n    ..addOption('src-lang', help: 'Language of the input audio')\n    ..addOption('tgt-lang', help: 'Language of the recognition result')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['src-lang'] == null ||\n      res['tgt-lang'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final srcLang = res['src-lang'] as String;\n  final tgtLang = res['tgt-lang'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final canary = sherpa_onnx.OfflineCanaryModelConfig(\n      encoder: encoder, decoder: decoder, srcLang: srcLang, tgtLang: tgtLang);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    canary: canary,\n    tokens: tokens,\n    debug: false,\n    numThreads: 1,\n  );\n  var config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print('Result in 
$tgtLang: ${result.text}');\n\n  stream.free();\n\n  // Example of changing the target language to English (en)\n  if (tgtLang != 'en') {\n    var json = config.toJson();\n\n    ((json['model'] as Map<String, dynamic>)['canary']\n        as Map<String, dynamic>)['tgtLang'] = 'en';\n\n    config = sherpa_onnx.OfflineRecognizerConfig.fromJson(json);\n    recognizer.setConfig(config);\n\n    final stream = recognizer.createStream();\n\n    stream.acceptWaveform(\n        samples: waveData.samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    print('Result in English: ${result.text}');\n    stream.free();\n  }\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/nemo-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the NeMo CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final nemo = sherpa_onnx.OfflineNemoEncDecCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    nemoCtc: nemo,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/nemo-transducer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('joiner', help: 'Path to joiner model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['joiner'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final joiner = res['joiner'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final transducer = sherpa_onnx.OfflineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    transducer: transducer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/omnilingual-asr-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the Omnilingual ASR CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final omnilingual = sherpa_onnx.OfflineOmnilingualAsrCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    omnilingual: omnilingual,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/paraformer-itn.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the paraformer model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('rule-fsts',\n        help: 'Path to rule fsts for inverse text normalization')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['rule-fsts'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final paraformer = sherpa_onnx.OfflineParaformerModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    paraformer: paraformer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'paraformer',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts,\n  );\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/paraformer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the paraformer model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final paraformer = sherpa_onnx.OfflineParaformerModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    paraformer: paraformer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'paraformer',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/sense-voice-with-hr.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  print('sherpa-onnx version: ${sherpa_onnx.getVersion()}');\n  print('sherpa-onnx gitSha1: ${sherpa_onnx.getGitSha1()}');\n  print('sherpa-onnx gitDate: ${sherpa_onnx.getGitDate()}');\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the SenseVoice model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('language',\n        help: 'auto, zh, en, ja, ko, yue, or leave it empty to use auto',\n        defaultsTo: '')\n    ..addOption('use-itn',\n        help: 'true to use inverse text normalization', defaultsTo: 'false')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe')\n    ..addOption('hr-lexicon',\n        help: 'Path to lexicon.txt for homophone replacer')\n    ..addOption('hr-rule-fsts',\n        help: 'Path to replace.fst for homophone replacer');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['hr-lexicon'] == null ||\n      res['hr-rule-fsts'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n  final language = res['language'] as String;\n  final useItn = (res['use-itn'] as String).toLowerCase() == 'true';\n  final hrLexicon = res['hr-lexicon'] as String;\n  final hrRuleFsts = res['hr-rule-fsts'] as String;\n\n  final senseVoice = sherpa_onnx.OfflineSenseVoiceModelConfig(\n      model: model, language: language, useInverseTextNormalization: useItn);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    senseVoice: senseVoice,\n    tokens: tokens,\n    debug: true,\n    
numThreads: 1,\n  );\n\n  final hr = sherpa_onnx.HomophoneReplacerConfig(\n      lexicon: hrLexicon, ruleFsts: hrRuleFsts);\n\n  final config =\n      sherpa_onnx.OfflineRecognizerConfig(model: modelConfig, hr: hr);\n\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/sense-voice.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the SenseVoice model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('language',\n        help: 'auto, zh, en, ja, ko, yue, or leave it empty to use auto',\n        defaultsTo: '')\n    ..addOption('use-itn',\n        help: 'true to use inverse text normalization', defaultsTo: 'false')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n  final language = res['language'] as String;\n  final useItn = (res['use-itn'] as String).toLowerCase() == 'true';\n\n  final senseVoice = sherpa_onnx.OfflineSenseVoiceModelConfig(\n      model: model, language: language, useInverseTextNormalization: useItn);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    senseVoice: senseVoice,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/telespeech-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the telespeech CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    telespeechCtc: model,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'telespeech_ctc',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/vad-with-paraformer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the paraformer model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final sileroVad = res['silero-vad'] as String;\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final paraformer = sherpa_onnx.OfflineParaformerModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    paraformer: paraformer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'paraformer',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int 
start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final stream = recognizer.createStream();\n      final segment = vad.front();\n      stream.acceptWaveform(\n          samples: segment.samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n\n      final startTime = segment.start * 1.0 / waveData.sampleRate;\n      final duration = segment.samples.length * 1.0 / waveData.sampleRate;\n      final stopTime = startTime + duration;\n      if (result.text != '') {\n        print(\n            '${startTime.toStringAsPrecision(4)} -- ${stopTime.toStringAsPrecision(4)}: ${result.text}');\n      }\n\n      stream.free();\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n  while (!vad.isEmpty()) {\n    final stream = recognizer.createStream();\n    final segment = vad.front();\n    stream.acceptWaveform(\n        samples: segment.samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n\n    final startTime = segment.start * 1.0 / waveData.sampleRate;\n    final duration = segment.samples.length * 1.0 / waveData.sampleRate;\n    final stopTime = startTime + duration;\n    if (result.text != '') {\n      print(\n          '${startTime.toStringAsPrecision(4)} -- ${stopTime.toStringAsPrecision(4)}: ${result.text}');\n    }\n\n    stream.free();\n    vad.pop();\n  }\n\n  vad.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/wenet-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the Wenet CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final wenetCtc = sherpa_onnx.OfflineWenetCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    wenetCtc: wenetCtc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/whisper.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the whisper encoder model')\n    ..addOption('decoder', help: 'Path to whisper decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final whisper = sherpa_onnx.OfflineWhisperModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    whisper: whisper,\n    tokens: tokens,\n    modelType: 'whisper',\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/zipformer-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the Zipformer CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final zipformerCtc = sherpa_onnx.OfflineZipformerCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    zipformerCtc: zipformerCtc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/bin/zipformer-transducer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('joiner', help: 'Path to joiner model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['joiner'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final joiner = res['joiner'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final transducer = sherpa_onnx.OfflineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    transducer: transducer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  stream.acceptWaveform(\n      samples: waveData.samples, sampleRate: waveData.sampleRate);\n  recognizer.decode(stream);\n\n  final result = recognizer.getResult(stream);\n  print(result.text);\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/pubspec.yaml",
    "content": "name: non_streaming_asr\ndescription: >\n  This example demonstrates how to use the Dart API for Non-streaming speech recognition. Specifically, we use the following models as examples, whisper, zipformer, and paraformer.\n\nversion: 1.0.0\n# repository: https://github.com/my_org/my_repo\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\ndart run \\\n  ./bin/dolphin-ctc.dart \\\n  --model ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \\\n  --tokens ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt \\\n  --input-wav ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nfi\n\ndart run \\\n  ./bin/fire-red-asr-ctc.dart \\\n  --model ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx \\\n  --tokens ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt \\\n  --input-wav ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\nfi\n\ndart pub get\n\ndart run \\\n  ./bin/fire-red-asr.dart \\\n  --encoder ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \\\n  --tokens ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \\\n  --input-wav ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-funasr-nano.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\ndart run \\\n  ./bin/funasr-nano.dart \\\n  --encoder-adaptor ./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx \\\n  --llm ./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx \\\n  --embedding ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx \\\n  --tokenizer ./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B \\\n  --input-wav ./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-medasr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\ndart run \\\n  ./bin/medasr-ctc.dart \\\n  --model ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx \\\n  --tokens ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt \\\n  --input-wav ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-moonshine-v2.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\ndart run \\\n  ./bin/moonshine_v2.dart \\\n  --encoder ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort \\\n  --decoder ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort \\\n  --tokens ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt \\\n  --input-wav ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\ndart run \\\n  ./bin/moonshine.dart \\\n  --preprocessor ./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --encoder ./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  --uncached-decoder ./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --cached-decoder ./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \\\n  --input-wav ./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-nemo-canary.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nfi\n\nfor tgt_lang in en de es fr; do\n  dart run \\\n    ./bin/nemo-canary.dart \\\n    --encoder ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx \\\n    --decoder ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx \\\n    --tokens ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt \\\n    --src-lang en \\\n    --tgt-lang $tgt_lang \\\n    --input-wav ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\ndone\n\nfor tgt_lang in en de; do\n  dart run \\\n    ./bin/nemo-canary.dart \\\n    --encoder ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx \\\n    --decoder ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx \\\n    --tokens ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt \\\n    --src-lang de \\\n    --tgt-lang $tgt_lang \\\n    --input-wav ./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/de.wav\ndone\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-nemo-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  tar xvf sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  rm sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nfi\n\ndart run \\\n  ./bin/nemo-ctc.dart \\\n  --model ./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/model.onnx \\\n  --tokens ./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt \\\n  --input-wav ./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-nemo-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n\n  tar xvf sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  rm sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nfi\n\ndart run \\\n  ./bin/nemo-transducer.dart \\\n  --encoder ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/encoder.onnx \\\n  --decoder ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/decoder.onnx \\\n  --joiner ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/joiner.onnx \\\n  --tokens ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt \\\n  --input-wav ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-omnilingual-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\ndart run \\\n  ./bin/omnilingual-asr-ctc.dart \\\n  --model ./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx \\\n  --tokens ./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt \\\n  --input-wav ./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-paraformer-itn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\ndart run \\\n  ./bin/paraformer-itn.dart \\\n  --model ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --rule-fsts ./itn_zh_number.fst \\\n  --input-wav ./itn-zh-number.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\ndart run \\\n  ./bin/paraformer.dart \\\n  --model ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --input-wav ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/3-sichuan.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-sense-voice-with-hr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -d dict ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n  tar xf dict.tar.bz2\n  rm dict.tar.bz2\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\nfi\n\ndart run \\\n  ./bin/sense-voice-with-hr.dart \\\n  --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx \\\n  --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --use-itn true \\\n  --hr-lexicon ./lexicon.txt \\\n  --hr-rule-fsts ./replace.fst \\\n  --input-wav ./test-hr.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\ndart run \\\n  ./bin/sense-voice.dart \\\n  --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx \\\n  --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --use-itn true \\\n  --input-wav ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-telespeech-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n\n  tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nfi\n\ndart run \\\n  ./bin/telespeech-ctc.dart \\\n  --model ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx \\\n  --tokens ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt \\\n  --input-wav ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-vad-with-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [[ ! -f ./lei-jun-test.wav ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\ndart run \\\n  ./bin/vad-with-paraformer.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --input-wav ./lei-jun-test.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-wenet-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\n  rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nfi\n\ndart run \\\n  ./bin/wenet-ctc.dart \\\n  --model ./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx \\\n  --tokens ./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt \\\n  --input-wav ./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\ndart run \\\n  ./bin/whisper.dart \\\n  --encoder ./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \\\n  --tokens ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \\\n  --input-wav ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-ctc.dart \\\n  --model ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx \\\n  --tokens ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt \\\n  --input-wav ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/non-streaming-asr/run-zipformer-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n  rm sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --encoder ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/encoder-epoch-30-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/decoder-epoch-30-avg-1.onnx \\\n  --joiner ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/joiner-epoch-30-avg-1.int8.onnx \\\n  --tokens ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt \\\n  --input-wav ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/test_wavs/1221-135766-0001.wav\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx for speaker diarization.\n\n# Usage\n\nPlease see [./run.sh](./run.sh)\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/bin/speaker-diarization.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\nimport 'dart:ffi';\n\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  /* Please use the following commands to download files used in this file\n    Step 1: Download a speaker segmentation model\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n    for a list of available models. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n      tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n      rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\n    Step 2: Download a speaker embedding extractor model\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n    for a list of available models. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\n    Step 3. Download test wave files\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n    for a list of available test wave files. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\n    Step 4. 
Run it\n        */\n\n  final segmentationModel =\n      \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\";\n\n  final embeddingModel =\n      \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\n\n  final waveFilename = \"./0-four-speakers-zh.wav\";\n\n  final segmentationConfig = sherpa_onnx.OfflineSpeakerSegmentationModelConfig(\n    pyannote: sherpa_onnx.OfflineSpeakerSegmentationPyannoteModelConfig(\n        model: segmentationModel),\n  );\n\n  final embeddingConfig =\n      sherpa_onnx.SpeakerEmbeddingExtractorConfig(model: embeddingModel);\n\n  // Since we know there are 4 speakers in ./0-four-speakers-zh.wav, we set\n  // numClusters to 4. If you don't know the exact number, set it to -1;\n  // in that case, you have to set threshold. A larger threshold leads to\n  // fewer clusters, i.e., fewer speakers.\n  final clusteringConfig =\n      sherpa_onnx.FastClusteringConfig(numClusters: 4, threshold: 0.5);\n\n  var config = sherpa_onnx.OfflineSpeakerDiarizationConfig(\n      segmentation: segmentationConfig,\n      embedding: embeddingConfig,\n      clustering: clusteringConfig,\n      minDurationOn: 0.2,\n      minDurationOff: 0.5);\n\n  final sd = sherpa_onnx.OfflineSpeakerDiarization(config);\n  if (sd.ptr == nullptr) {\n    return;\n  }\n\n  final waveData = sherpa_onnx.readWave(waveFilename);\n  if (sd.sampleRate != waveData.sampleRate) {\n    print(\n        'Expected sample rate: ${sd.sampleRate}, given: ${waveData.sampleRate}');\n    return;\n  }\n\n  print('started');\n\n  // Use the following statement if you don't want to use a callback\n  // final segments = sd.process(samples: waveData.samples);\n\n  final segments = sd.processWithCallback(\n      samples: waveData.samples,\n      callback: (int numProcessedChunk, int numTotalChunks) {\n        final progress = 100.0 * numProcessedChunk / numTotalChunks;\n\n        print('Progress ${progress.toStringAsFixed(2)}%');\n\n        return 0;\n      });\n\n  for (int i = 0; i 
< segments.length; ++i) {\n    print(\n        '${segments[i].start.toStringAsFixed(3)} -- ${segments[i].end.toStringAsFixed(3)}  speaker_${segments[i].speaker}');\n  }\n}\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/pubspec.yaml",
    "content": "name: speaker_diarization\ndescription: >\n  This example demonstrates how to use the Dart API for speaker diarization.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/speaker-diarization/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\ndart run ./bin/speaker-diarization.dart\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx for speaker identification.\n\n| File | Description|\n|------|------------|\n|[./bin/speaker_id.dart](./bin/speaker_id.dart)| Use a speaker embedding extractor model for speaker identification and verification. See also [./run-3d-speaker.sh](./run-3d-speaker.sh)|\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/bin/speaker_id.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nFloat32List computeEmbedding(\n    {required sherpa_onnx.SpeakerEmbeddingExtractor extractor,\n    required String filename}) {\n  final waveData = sherpa_onnx.readWave(filename);\n  final stream = extractor.createStream();\n\n  stream.acceptWaveform(\n    samples: waveData.samples,\n    sampleRate: waveData.sampleRate,\n  );\n\n  stream.inputFinished();\n\n  final embedding = extractor.compute(stream);\n\n  stream.free();\n\n  return embedding;\n}\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()..addOption('model', help: 'Path to model.onnx');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  /*\n     Please download test data by yourself\n\n  curl -SL -o sr-data.tar.gz https://github.com/csukuangfj/sr-data/archive/refs/tags/v1.0.0.tar.gz\n  tar xvf sr-data.tar.gz\n  mv sr-data-1.0.0 sr-data\n  */\n\n  final config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n    model: model,\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  );\n  final extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config: config);\n\n  final manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim);\n\n  final spk1Files = [\n    \"./sr-data/enroll/fangjun-sr-1.wav\",\n    \"./sr-data/enroll/fangjun-sr-2.wav\",\n    \"./sr-data/enroll/fangjun-sr-3.wav\",\n  ];\n\n  final spk1Vec = <Float32List>[];\n  for (final f in spk1Files) {\n    final embedding = computeEmbedding(extractor: extractor, filename: f);\n    spk1Vec.add(embedding);\n  }\n\n  final spk2Files = [\n    \"./sr-data/enroll/leijun-sr-1.wav\",\n    \"./sr-data/enroll/leijun-sr-2.wav\",\n  ];\n\n  final spk2Vec = 
<Float32List>[];\n  for (final f in spk2Files) {\n    final embedding = computeEmbedding(extractor: extractor, filename: f);\n    spk2Vec.add(embedding);\n  }\n\n  if (!manager.addMulti(name: \"fangjun\", embeddingList: spk1Vec)) {\n    // Note: in your app, free extractor and manager to avoid a memory leak\n    print(\"Failed to register fangjun\");\n    return;\n  }\n\n  if (!manager.addMulti(name: \"leijun\", embeddingList: spk2Vec)) {\n    print(\"Failed to register leijun\");\n    return;\n  }\n\n  if (manager.numSpeakers != 2) {\n    print(\"There should be two speakers\");\n    return;\n  }\n\n  if (!manager.contains(\"fangjun\")) {\n    print(\"It should contain the speaker fangjun\");\n    return;\n  }\n\n  if (!manager.contains(\"leijun\")) {\n    print(\"It should contain the speaker leijun\");\n    return;\n  }\n\n  print(\"---All speakers---\");\n  final allSpeakers = manager.allSpeakerNames;\n  for (final s in allSpeakers) {\n    print(s);\n  }\n  print(\"------------\");\n\n  final testFiles = [\n    \"./sr-data/test/fangjun-test-sr-1.wav\",\n    \"./sr-data/test/leijun-test-sr-1.wav\",\n    \"./sr-data/test/liudehua-test-sr-1.wav\",\n  ];\n\n  final threshold = 0.6;\n  for (final file in testFiles) {\n    final embedding = computeEmbedding(extractor: extractor, filename: file);\n\n    var name = manager.search(embedding: embedding, threshold: threshold);\n    if (name == '') {\n      name = \"<Unknown>\";\n    }\n    print(\"$file: $name\");\n  }\n\n  if (!manager.verify(\n      name: \"fangjun\",\n      embedding: computeEmbedding(extractor: extractor, filename: testFiles[0]),\n      threshold: threshold)) {\n    print(\"${testFiles[0]} should match fangjun!\");\n    return;\n  }\n\n  if (!manager.remove(\"fangjun\")) {\n    print(\"Failed to remove fangjun\");\n    return;\n  }\n\n  if (manager.verify(\n      name: \"fangjun\",\n      embedding: computeEmbedding(extractor: extractor, filename: testFiles[0]),\n      threshold: threshold)) 
{\n    print(\"${testFiles[0]} should match no one!\");\n    return;\n  }\n\n  if (manager.numSpeakers != 1) {\n    print(\"There should only 1 speaker left.\");\n    return;\n  }\n\n  extractor.free();\n  manager.free();\n}\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/pubspec.yaml",
    "content": "name: speaker_identification\n\ndescription: >\n  This example demonstrates how to use the Dart API for speaker identification.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/speaker-identification/run-3d-speaker.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./sr-data/enroll/leijun-sr-1.wav ]; then\n  curl -SL -o sr-data.tar.gz https://github.com/csukuangfj/sr-data/archive/refs/tags/v1.0.0.tar.gz\n  tar xvf sr-data.tar.gz\n  mv sr-data-1.0.0 sr-data\nfi\n\ndart run \\\n  ./bin/speaker_id.dart \\\n  --model ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/.gitignore",
    "content": ".dart_tool/\n.packages\nbuild/\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/README.md",
    "content": "# Speech Enhancement Example\n\nThis example shows how to use the Dart offline speech denoiser API with\nDPDFNet models.\n\nUse 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`,\n`dpdfnet4.onnx`, or `dpdfnet8.onnx` when the output feeds downstream speech\nrecognition (ASR). Use `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.\n\nDPDFNet models are available from either:\n\n- https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n- https://huggingface.co/Ceva-IP/DPDFNet\n\nThen run:\n\n```bash\ndart pub get\ndart run ./bin/speech_enhancement_dpdfnet.dart --model ./dpdfnet_baseline.onnx --input-wav ./inp_16k.wav --output-wav ./enhanced-16k.wav\n```\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/bin/speech_enhancement_dpdfnet.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to a DPDFNet onnx model')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final config = sherpa_onnx.OfflineSpeechDenoiserConfig(\n      model: sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n    gtcrn: const sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(),\n    dpdfnet: sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(model: model),\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  ));\n\n  final sd = sherpa_onnx.OfflineSpeechDenoiser(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n\n  final denoised =\n      sd.run(samples: waveData.samples, sampleRate: waveData.sampleRate);\n\n  sd.free();\n\n  sherpa_onnx.writeWave(\n      filename: outputWav,\n      samples: denoised.samples,\n      sampleRate: denoised.sampleRate);\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/pubspec.yaml",
    "content": "name: speech_enhancement_dpdfnet\n\ndescription: >\n  This example demonstrates how to use the Dart API for DPDFNet speech enhancement/denoising.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndart run \\\n  ./bin/speech_enhancement_dpdfnet.dart \\\n  --model ./dpdfnet_baseline.onnx \\\n  --input-wav ./inp_16k.wav \\\n  --output-wav ./enhanced-16k.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/README.md",
    "content": "# Speech Enhancement Example\n\nThis example shows how to use the Dart offline speech denoiser API with GTCRN\nmodels.\n\nDownload GTCRN models and test wave files from:\n\n- https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nThen run:\n\n```bash\ndart pub get\ndart run ./bin/speech_enhancement_gtcrn.dart --model ./gtcrn_simple.onnx --input-wav ./inp_16k.wav --output-wav ./enhanced-16k.wav\n```\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/bin/speech_enhancement_gtcrn.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to a GTCRN onnx model')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final config = sherpa_onnx.OfflineSpeechDenoiserConfig(\n      model: sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n    gtcrn: sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(model: model),\n    dpdfnet: const sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(),\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  ));\n\n  final sd = sherpa_onnx.OfflineSpeechDenoiser(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n\n  final denoised =\n      sd.run(samples: waveData.samples, sampleRate: waveData.sampleRate);\n\n  sd.free();\n\n  sherpa_onnx.writeWave(\n      filename: outputWav,\n      samples: denoised.samples,\n      sampleRate: denoised.sampleRate);\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/pubspec.yaml",
    "content": "name: speech_enhancement_gtcrn\n\ndescription: >\n  This example demonstrates how to use the Dart API for GTCRN speech enhancement/denoising.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\n\ndart run \\\n  ./bin/speech_enhancement_gtcrn.dart \\\n  --model ./gtcrn_simple.onnx \\\n  --input-wav ./inp_16k.wav \\\n  --output-wav ./enhanced-16k.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/spoken-language-identification/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx for spoken language identification.\n\n| File | Description|\n|------|------------|\n|[./bin/spoken_language_identification.dart](./bin/spoken_language_identification.dart)| Use a whisper model for spoken language identification. See also [./run-whisper.sh](./run-whisper.sh)|\n"
  },
  {
    "path": "dart-api-examples/spoken-language-identification/analysis_options.yaml",
    "content": "include: package:lints/recommended.yaml\n\nanalyzer:\n  language:\n    strict-casts: true\n    strict-inference: true\n    strict-raw-types: true\n\nlinter:\n  rules:\n    - always_use_package_imports\n    - avoid_dynamic_calls\n    - cancel_subscriptions\n    - close_sinks\n    - unawaited_futures\n    - use_super_parameters\n"
  },
  {
    "path": "dart-api-examples/spoken-language-identification/bin/spoken_language_identification.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the whisper encoder model')\n    ..addOption('decoder', help: 'Path to the whisper decoder model')\n    ..addOption('tail-paddings', help: 'Tail paddings for the whisper model', defaultsTo: '0')\n    ..addOption('wav', help: 'Path to test.wav for language identification')\n    ..addFlag('help', abbr: 'h', help: 'Show this help message', negatable: false);\n\n  final res = parser.parse(arguments);\n  if (res['help'] as bool) {\n    print(parser.usage);\n    exit(0);\n  }\n\n  if (res['encoder'] == null || res['decoder'] == null || res['wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tailPaddings = int.tryParse(res['tail-paddings'] as String) ?? 0;\n  final wav = res['wav'] as String;\n\n  final whisperConfig = sherpa_onnx.SpokenLanguageIdentificationWhisperConfig(\n    encoder: encoder,\n    decoder: decoder,\n    tailPaddings: tailPaddings,\n  );\n\n  final config = sherpa_onnx.SpokenLanguageIdentificationConfig(\n    whisper: whisperConfig,\n    numThreads: 1,\n    debug: true,\n    provider: 'cpu',\n  );\n\n  final slid = sherpa_onnx.SpokenLanguageIdentification(config);\n\n  final waveData = sherpa_onnx.readWave(wav);\n\n  final stream = slid.createStream();\n  stream.acceptWaveform(samples: waveData.samples, sampleRate: waveData.sampleRate);\n\n  final result = slid.compute(stream);\n\n  print('File: $wav');\n  print('Detected language: ${result.lang}');\n\n  stream.free();\n  slid.free();\n}\n"
  },
  {
    "path": "dart-api-examples/spoken-language-identification/pubspec.yaml",
    "content": "name: spoken_language_identification\n\ndescription: >\n  This example demonstrates how to use the Dart API for spoken language identification.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/spoken-language-identification/run-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\n# Download test WAV files\nwaves=(\n# ar-arabic.wav\n# bg-bulgarian.wav\n# cs-czech.wav\n# da-danish.wav\n# de-german.wav\n# el-greek.wav\nen-english.wav\nes-spanish.wav\n# fa-persian.wav\n# fi-finnish.wav\n# fr-french.wav\n# hi-hindi.wav\n# hr-croatian.wav\n# id-indonesian.wav\n# it-italian.wav\n# ja-japanese.wav\n# ko-korean.wav\n# nl-dutch.wav\n# no-norwegian.wav\n# pl-polish.wav\n# pt-portuguese.wav\n# ro-romanian.wav\nru-russian.wav\n# sk-slovak.wav\n# sv-swedish.wav\n# ta-tamil.wav\n# tl-tagalog.wav\n# tr-turkish.wav\n# uk-ukrainian.wav\nzh-chinese.wav\n)\n\nfor wav in ${waves[@]}; do\n  if [ ! -f ./$wav ]; then\n    echo \"Downloading $wav\"\n    curl -SL -O https://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/resolve/main/test_wavs/$wav\n  fi\n  \n  echo \"Testing $wav\"\n  dart run \\\n    ./bin/spoken_language_identification.dart \\\n    --encoder ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx \\\n    --decoder ./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx \\\n    --wav ./$wav\n  \n  echo \"----------------------------------------\"\ndone\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/README.md",
    "content": "# Introduction\n\nThis folder contains examples for streaming ASR with Dart API.\n\n| File | Description|\n|------|------------|\n|[./bin/nemo-transducer.dart](./bin/nemo-transducer.dart)| Use a NeMo transducer model for speech recognition. See [./run-nemo-transducer.sh](./run-nemo-transducer.sh)|\n|[./bin/paraformer.dart](./bin/paraformer.dart)| Use a Paraformer model for speech recognition. See [./run-paraformer.sh](./run-paraformer.sh)|\n|[./bin/zipformer-ctc-hlg.dart](./bin/zipformer-ctc-hlg.dart)| Use a Zipformer CTC model with HLG graph for speech recognition. See [./run-zipformer-ctc-hlg.sh](./run-zipformer-ctc-hlg.sh)|\n|[./bin/zipformer-ctc.dart](./bin/zipformer-ctc.dart)| Use a Zipformer CTC model for speech recognition. See [./run-zipformer-ctc.sh](./run-zipformer-ctc.sh)|\n|[./bin/zipformer-transducer.dart](./bin/zipformer-transducer.dart)| Use a Zipformer transducer model for speech recognition. See [./run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|\n\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/bin/paraformer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final paraformer = sherpa_onnx.OnlineParaformerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    paraformer: paraformer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OnlineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OnlineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // chunkSize of a single sample is also ok, i.e., chunkSize = 1\n  final chunkSize = 1600; // 0.1 second for 16kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  var last = '';\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    final result = recognizer.getResult(stream);\n    if (result.text != last && result.text != '') {\n      last = result.text;\n      print(last);\n    }\n  }\n\n  // 0.5 seconds, assume sampleRate is 16kHz\n  final tailPaddings = Float32List(8000);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  final result = recognizer.getResult(stream);\n\n  if (result.text != '') {\n    print(result.text);\n  }\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/bin/t-one-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final ctc = sherpa_onnx.OnlineToneCtcModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    toneCtc: ctc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OnlineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OnlineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  // 0.3 seconds, assume sampleRate is 8kHz\n  final leftPaddings = Float32List(2400);\n  stream.acceptWaveform(\n    samples: leftPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // chunkSize of a single sample is also ok, i.e., chunkSize = 1\n  final chunkSize = 1600; // 0.2 seconds for 8kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  var last = '';\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    final result = recognizer.getResult(stream);\n    if (result.text != last && result.text != '') {\n      last = result.text;\n      print(last);\n    }\n  }\n\n  // 0.6 seconds, assume sampleRate is 8kHz\n  final tailPaddings = Float32List(4800);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  final result = recognizer.getResult(stream);\n\n  if (result.text != '') {\n    print(result.text);\n  }\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/bin/zipformer-ctc-hlg.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the model')\n    ..addOption('hlg', help: 'Path to HLG.fst')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['hlg'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final hlg = res['hlg'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final ctc = sherpa_onnx.OnlineZipformer2CtcModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    zipformer2Ctc: ctc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OnlineRecognizerConfig(\n    model: modelConfig,\n    ctcFstDecoderConfig: sherpa_onnx.OnlineCtcFstDecoderConfig(graph: hlg),\n  );\n  final recognizer = sherpa_onnx.OnlineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // chunkSize of a single sample is also ok, i.e., chunkSize = 1\n  final chunkSize = 1600; // 0.1 second for 16kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  var last = '';\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    final result = recognizer.getResult(stream);\n    if (result.text != last && result.text != '') {\n      last = result.text;\n      print(last);\n    }\n  }\n\n  // 0.5 seconds, assume sampleRate is 16kHz\n  final tailPaddings = Float32List(8000);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  final result = recognizer.getResult(stream);\n\n  if (result.text != '') {\n    print(result.text);\n  }\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/bin/zipformer-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final ctc = sherpa_onnx.OnlineZipformer2CtcModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    zipformer2Ctc: ctc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OnlineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OnlineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // chunkSize of a single sample is also ok, i.e., chunkSize = 1\n  final chunkSize = 1600; // 0.1 second for 16kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  var last = '';\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    final result = recognizer.getResult(stream);\n    if (result.text != last && result.text != '') {\n      last = result.text;\n      print(last);\n    }\n  }\n\n  // 0.5 seconds, assume sampleRate is 16kHz\n  final tailPaddings = Float32List(8000);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  final result = recognizer.getResult(stream);\n\n  if (result.text != '') {\n    print(result.text);\n  }\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/bin/zipformer-transducer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('joiner', help: 'Path to joiner model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['joiner'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final joiner = res['joiner'] as String;\n  final tokens = res['tokens'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final transducer = sherpa_onnx.OnlineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner,\n  );\n\n  final modelConfig = sherpa_onnx.OnlineModelConfig(\n    transducer: transducer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OnlineRecognizerConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts,\n  );\n  final recognizer = sherpa_onnx.OnlineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final stream = recognizer.createStream();\n\n  // simulate streaming. 
You can choose an arbitrary chunk size.\n  // chunkSize of a single sample is also ok, i.e., chunkSize = 1\n  final chunkSize = 1600; // 0.1 second for 16kHz\n  final numChunks = waveData.samples.length ~/ chunkSize;\n\n  var last = '';\n  for (int i = 0; i != numChunks; ++i) {\n    int start = i * chunkSize;\n    stream.acceptWaveform(\n      samples:\n          Float32List.sublistView(waveData.samples, start, start + chunkSize),\n      sampleRate: waveData.sampleRate,\n    );\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    final result = recognizer.getResult(stream);\n    if (result.text != last && result.text != '') {\n      last = result.text;\n      print(last);\n    }\n  }\n\n  // 0.5 seconds, assume sampleRate is 16kHz\n  final tailPaddings = Float32List(8000);\n  stream.acceptWaveform(\n    samples: tailPaddings,\n    sampleRate: waveData.sampleRate,\n  );\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  final result = recognizer.getResult(stream);\n\n  if (result.text != '') {\n    print(result.text);\n  }\n\n  stream.free();\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/pubspec.yaml",
    "content": "name: streaming_asr\n\ndescription: >\n  This example demonstrates how to use the Dart API for streaming speech recognition.\n\nversion: 1.0.0\n# repository: https://github.com/my_org/my_repo\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n  test: ^1.24.0\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-nemo-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\n  tar xvf sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\n  rm sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --encoder ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/encoder.onnx \\\n  --decoder ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/decoder.onnx \\\n  --joiner ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/joiner.onnx \\\n  --tokens ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/tokens.txt \\\n  --input-wav ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nfi\n\ndart run \\\n  ./bin/paraformer.dart \\\n  --encoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx \\\n  --tokens ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n  --input-wav ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-t-one-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nfi\n\ndart run \\\n  ./bin/t-one-ctc.dart \\\n  --model ./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx \\\n  --tokens ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt \\\n  --input-wav ./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-zipformer-ctc-hlg.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-ctc-hlg.dart \\\n  --model ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  --hlg ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst \\\n  --tokens ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt \\\n  --input-wav ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/1.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-ctc.dart \\\n  --model ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  --tokens ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt \\\n  --input-wav ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/1.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-zipformer-transducer-itn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --rule-fsts ./itn_zh_number.fst \\\n  --input-wav ./itn-zh-number.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-asr/run-zipformer-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --input-wav ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-dpdfnet/README.md",
    "content": "# Streaming Speech Enhancement Example\n\nThis example shows how to use the Dart streaming speech denoiser API with\nDPDFNet models.\n\nUse 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`,\n`dpdfnet4.onnx`, or `dpdfnet8.onnx` for downstream ASR or speech recognition.\n\nDPDFNet models are available from either:\n\n- https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n- https://huggingface.co/Ceva-IP/DPDFNet\n\nThen run:\n\n```bash\ndart pub get\ndart run ./bin/streaming_speech_enhancement_dpdfnet.dart --model ./dpdfnet_baseline.onnx --input-wav ./inp_16k.wav --output-wav ./enhanced-online-dpdfnet.wav\n```\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-dpdfnet/bin/streaming_speech_enhancement_dpdfnet.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to a DPDFNet onnx model')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final config = sherpa_onnx.OnlineSpeechDenoiserConfig(\n    model: sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n      gtcrn: const sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(),\n      dpdfnet:\n          sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(model: model),\n      numThreads: 1,\n      debug: true,\n      provider: 'cpu',\n    ),\n  );\n\n  final sd = sherpa_onnx.OnlineSpeechDenoiser(config);\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final frameShift = sd.frameShiftInSamples;\n  final output = <double>[];\n\n  var start = 0;\n  while (start < waveData.samples.length) {\n    final end = start + frameShift < waveData.samples.length\n        ? 
start + frameShift\n        : waveData.samples.length;\n    final chunk = waveData.samples.sublist(start, end);\n    final denoised = sd.run(samples: chunk, sampleRate: waveData.sampleRate);\n    output.addAll(denoised.samples);\n    start = end;\n  }\n\n  output.addAll(sd.flush().samples);\n  sd.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: Float32List.fromList(output),\n    sampleRate: waveData.sampleRate,\n  );\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndart run \\\n  ./bin/streaming_speech_enhancement_dpdfnet.dart \\\n  --model ./dpdfnet_baseline.onnx \\\n  --input-wav ./inp_16k.wav \\\n  --output-wav ./enhanced-online-dpdfnet.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-gtcrn/README.md",
    "content": "# Streaming Speech Enhancement Example\n\nThis example shows how to use the Dart streaming speech denoiser API with GTCRN\nmodels.\n\nDownload GTCRN models and test wave files from:\n\n- https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nThen run:\n\n```bash\ndart pub get\ndart run ./bin/streaming_speech_enhancement_gtcrn.dart --model ./gtcrn_simple.onnx --input-wav ./inp_16k.wav --output-wav ./enhanced-online-gtcrn.wav\n```\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-gtcrn/bin/streaming_speech_enhancement_gtcrn.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to a GTCRN onnx model')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final model = res['model'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final config = sherpa_onnx.OnlineSpeechDenoiserConfig(\n    model: sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n      gtcrn: sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(model: model),\n      dpdfnet: const sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(),\n      numThreads: 1,\n      debug: true,\n      provider: 'cpu',\n    ),\n  );\n\n  final sd = sherpa_onnx.OnlineSpeechDenoiser(config);\n  final waveData = sherpa_onnx.readWave(inputWav);\n  final frameShift = sd.frameShiftInSamples;\n  final output = <double>[];\n\n  var start = 0;\n  while (start < waveData.samples.length) {\n    final end = start + frameShift < waveData.samples.length\n        ? 
start + frameShift\n        : waveData.samples.length;\n    final chunk = waveData.samples.sublist(start, end);\n    final denoised = sd.run(samples: chunk, sampleRate: waveData.sampleRate);\n    output.addAll(denoised.samples);\n    start = end;\n  }\n\n  output.addAll(sd.flush().samples);\n  sd.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: Float32List.fromList(output),\n    sampleRate: waveData.sampleRate,\n  );\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/streaming-speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndart run \\\n  ./bin/streaming_speech_enhancement_gtcrn.dart \\\n  --model ./gtcrn_simple.onnx \\\n  --input-wav ./inp_16k.wav \\\n  --output-wav ./enhanced-online-gtcrn.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/tts/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/tts/README.md",
    "content": "# Introduction\n\nThis folder contains examples for text-to-speech with the Dart API.\n\n| File | Description |\n|------|-------------|\n|[./bin/piper.dart](./bin/piper.dart)| Use a Piper TTS model for text-to-speech. See [./run-piper.sh](./run-piper.sh)|\n|[./bin/coqui.dart](./bin/coqui.dart)| Use a Coqui TTS model for text-to-speech. See [./run-coqui.sh](./run-coqui.sh)|\n|[./bin/vits-zh.dart](./bin/vits-zh.dart)| Use a Chinese VITS TTS model for text-to-speech. See [./run-zh.sh](./run-zh.sh)|\n|[./bin/zipvoice-zh-en.dart](./bin/zipvoice-zh-en.dart)| Use a ZipVoice Chinese/English zero-shot TTS model. See [./run-zipvoice-zh-en.sh](./run-zipvoice-zh-en.sh)|\n"
  },
  {
    "path": "dart-api-examples/tts/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/tts/bin/coqui.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the ONNX model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(\n    model: model,\n    tokens: tokens,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    vits: vits,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: 0.2,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/kitten-en.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the onnx model')\n    ..addOption('voices', help: 'Path to the voices.bin')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption(\n      'data-dir',\n      help: 'Path to espeak-ng-data directory',\n      defaultsTo: '',\n    )\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['voices'] == null ||\n      res['tokens'] == null ||\n      res['data-dir'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final voices = res['voices'] as String;\n  final tokens = res['tokens'] as String;\n  final dataDir = res['data-dir'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final kitten = sherpa_onnx.OfflineTtsKittenModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    kitten: kitten,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: config.silenceScale,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/kokoro-en.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the onnx model')\n    ..addOption('voices', help: 'Path to the voices.bin')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption(\n      'data-dir',\n      help: 'Path to espeak-ng-data directory',\n      defaultsTo: '',\n    )\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['voices'] == null ||\n      res['tokens'] == null ||\n      res['data-dir'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final voices = res['voices'] as String;\n  final tokens = res['tokens'] as String;\n  final dataDir = res['data-dir'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    kokoro: kokoro,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: config.silenceScale,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/kokoro-zh-en.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the onnx model')\n    ..addOption('voices', help: 'Path to the voices.bin')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption(\n      'data-dir',\n      help: 'Path to espeak-ng-data directory',\n      defaultsTo: '',\n    )\n    ..addOption(\n      'lexicon',\n      help: 'Path to lexicon files',\n      defaultsTo: '',\n    )\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['voices'] == null ||\n      res['tokens'] == null ||\n      res['data-dir'] == null ||\n      res['lexicon'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final voices = res['voices'] as String;\n  final tokens = res['tokens'] as String;\n  final dataDir = res['data-dir'] as String;\n  final lexicon = res['lexicon'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 
1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir,\n    lexicon: lexicon,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    kokoro: kokoro,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: config.silenceScale,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/matcha-en.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('acoustic-model', help: 'Path to the acoustic model')\n    ..addOption('vocoder', help: 'Path to the vocoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption(\n      'data-dir',\n      help: 'Path to espeak-ng-data directory',\n      defaultsTo: '',\n    )\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['acoustic-model'] == null ||\n      res['vocoder'] == null ||\n      res['tokens'] == null ||\n      res['data-dir'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final acousticModel = res['acoustic-model'] as String;\n  final vocoder = res['vocoder'] as String;\n  final tokens = res['tokens'] as String;\n  final dataDir = res['data-dir'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(\n    acousticModel: acousticModel,\n    vocoder: vocoder,\n    tokens: tokens,\n    dataDir: dataDir,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    matcha: matcha,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: config.silenceScale,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/matcha-zh.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('acoustic-model', help: 'Path to the acoustic model')\n    ..addOption('vocoder', help: 'Path to the vocoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('lexicon', help: 'Path to lexicon.txt')\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['acoustic-model'] == null ||\n      res['vocoder'] == null ||\n      res['lexicon'] == null ||\n      res['tokens'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final acousticModel = res['acoustic-model'] as String;\n  final vocoder = res['vocoder'] as String;\n  final lexicon = res['lexicon'] as String;\n  final tokens = res['tokens'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final matcha = sherpa_onnx.OfflineTtsMatchaModelConfig(\n    acousticModel: acousticModel,\n    vocoder: vocoder,\n    lexicon: lexicon,\n    tokens: tokens,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    matcha: matcha,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: config.silenceScale,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/piper.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the ONNX model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('data-dir', help: 'Path to espeak-ng-data directory')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['tokens'] == null ||\n      res['data-dir'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final dataDir = res['data-dir'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(\n    model: model,\n    tokens: tokens,\n    dataDir: dataDir,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    vits: vits,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: 0.2,\n  );\n  final audio = tts.generateWithConfig(\n      text: text,\n      config: genConfig,\n      callback: (Float32List samples) {\n        print('${samples.length} samples received');\n        // You can play samples in a separate thread/isolate\n\n        // 1 means to continue\n        // 0 means to stop\n        return 1;\n      });\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/pocket-en.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('lm-flow', help: 'Path to the lm flow model')\n    ..addOption('lm-main', help: 'Path to the lm main model')\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to the decoder model')\n    ..addOption('text-conditioner', help: 'Path to the text conditioner model')\n    ..addOption('vocab-json', help: 'Path to the vocab.json file')\n    ..addOption('token-scores-json', help: 'Path to the token_scores.json file')\n    ..addOption('reference-audio', help: 'Path to reference audio (wav)')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption(\n      'voice-embedding-cache-capacity',\n      help: 'Voice embedding cache capacity (default: 50)',\n      defaultsTo: '50',\n    )\n    ..addOption(\n      'seed',\n      help: 'Random seed for reproducibility (default: -1, random)',\n      defaultsTo: '-1',\n    );\n\n  final res = parser.parse(arguments);\n\n  if (res['lm-flow'] == null ||\n      res['lm-main'] == null ||\n      res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['text-conditioner'] == null ||\n      res['vocab-json'] == null ||\n      res['token-scores-json'] == null ||\n      res['reference-audio'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final lmFlow = res['lm-flow'] as String;\n  final lmMain = res['lm-main'] as String;\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final textConditioner = res['text-conditioner'] as String;\n  final vocabJson = 
res['vocab-json'] as String;\n  final tokenScoresJson = res['token-scores-json'] as String;\n  final referenceAudioPath = res['reference-audio'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  final voiceEmbeddingCacheCapacity = int.parse(\n    res['voice-embedding-cache-capacity'] as String,\n  );\n  final seed = int.parse(res['seed'] as String);\n\n  // ---------------- Pocket model config ----------------\n  final pocket = sherpa_onnx.OfflineTtsPocketModelConfig(\n    lmFlow: lmFlow,\n    lmMain: lmMain,\n    encoder: encoder,\n    decoder: decoder,\n    textConditioner: textConditioner,\n    vocabJson: vocabJson,\n    tokenScoresJson: tokenScoresJson,\n    voiceEmbeddingCacheCapacity: voiceEmbeddingCacheCapacity,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    pocket: pocket,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final config = sherpa_onnx.OfflineTtsConfig(model: modelConfig);\n\n  final tts = sherpa_onnx.OfflineTts(config);\n\n  // ---------------- Reference audio (REQUIRED) ----------------\n  final wave = sherpa_onnx.readWave(referenceAudioPath);\n  if (wave.samples.isEmpty || wave.sampleRate == 0) {\n    throw Exception('Failed to read reference audio: $referenceAudioPath');\n  }\n\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: 0,\n    speed: 1.0,\n    referenceAudio: wave.samples,\n    referenceSampleRate: wave.sampleRate,\n    extra: {\"max_reference_audio_len\": 12, if (seed >= 0) \"seed\": seed},\n  );\n\n  // If you don't want to use a callback\n  // final audio = tts.generateWithConfig(text: text, config: genConfig);\n\n  final audio = tts.generateWithConfig(\n    text: text,\n    config: genConfig,\n    onProgress: (samples, progress) {\n      // Print progress as percentage\n      print(\"Progress: ${(progress * 100).toStringAsFixed(2)}%\");\n\n      // Print the length of the received samples chunk\n      print(\"Received samples length: 
${samples.length}\");\n\n      // Return 1 to continue, 0 to stop generation\n      return 1;\n    },\n  );\n\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/supertonic-en.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('duration-predictor',\n        help: 'Path to the duration predictor model')\n    ..addOption('text-encoder', help: 'Path to the text encoder model')\n    ..addOption('vector-estimator',\n        help: 'Path to the vector estimator model')\n    ..addOption('vocoder', help: 'Path to the vocoder model')\n    ..addOption('tts-json', help: 'Path to tts.json')\n    ..addOption('unicode-indexer', help: 'Path to unicode_indexer.bin')\n    ..addOption('voice-style', help: 'Path to voice.bin')\n    ..addOption('sid', help: 'Speaker ID (default: 6)', defaultsTo: '6')\n    ..addOption('speed', help: 'Speed (default: 1.25)', defaultsTo: '1.25')\n    ..addOption('num-steps',\n        help: 'Number of steps (default: 5)', defaultsTo: '5')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio');\n\n  final res = parser.parse(arguments);\n\n  if (res['duration-predictor'] == null ||\n      res['text-encoder'] == null ||\n      res['vector-estimator'] == null ||\n      res['vocoder'] == null ||\n      res['tts-json'] == null ||\n      res['unicode-indexer'] == null ||\n      res['voice-style'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final durationPredictor = res['duration-predictor'] as String;\n  final textEncoder = res['text-encoder'] as String;\n  final vectorEstimator = res['vector-estimator'] as String;\n  final vocoder = res['vocoder'] as String;\n  final ttsJson = res['tts-json'] as String;\n  final unicodeIndexer = res['unicode-indexer'] as String;\n  final voiceStyle = res['voice-style'] as 
String;\n  final sid = int.parse(res['sid'] as String);\n  final speed = double.parse(res['speed'] as String);\n  final numSteps = int.parse(res['num-steps'] as String);\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final supertonic = sherpa_onnx.OfflineTtsSupertonicModelConfig(\n    durationPredictor: durationPredictor,\n    textEncoder: textEncoder,\n    vectorEstimator: vectorEstimator,\n    vocoder: vocoder,\n    ttsJson: ttsJson,\n    unicodeIndexer: unicodeIndexer,\n    voiceStyle: voiceStyle,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    supertonic: supertonic,\n    numThreads: 2,\n    debug: true,\n  );\n\n  final config = sherpa_onnx.OfflineTtsConfig(model: modelConfig);\n\n  final tts = sherpa_onnx.OfflineTts(config);\n\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    extra: {'lang': 'en', 'num_steps': numSteps},\n  );\n\n  final audio = tts.generateWithConfig(\n    text: text,\n    config: genConfig,\n    onProgress: (samples, progress) {\n      print('Progress: ${(progress * 100).toStringAsFixed(2)}%');\n      return 1;\n    },\n  );\n\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/vits-zh.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('model', help: 'Path to the ONNX model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('lexicon', help: 'Path to lexicon.txt')\n    ..addOption('rule-fsts', help: 'Path to rule fsts', defaultsTo: '')\n    ..addOption('rule-fars', help: 'Path to rule fars', defaultsTo: '')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption('speed', help: 'Speech speed', defaultsTo: '1.0')\n    ..addOption(\n      'sid',\n      help: 'Speaker ID to select. Used only for multi-speaker TTS',\n      defaultsTo: '0',\n    );\n  final res = parser.parse(arguments);\n  if (res['model'] == null ||\n      res['lexicon'] == null ||\n      res['tokens'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n  final model = res['model'] as String;\n  final lexicon = res['lexicon'] as String;\n  final tokens = res['tokens'] as String;\n  final ruleFsts = res['rule-fsts'] as String;\n  final ruleFars = res['rule-fars'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as String;\n  var speed = double.tryParse(res['speed'] as String) ?? 1.0;\n  final sid = int.tryParse(res['sid'] as String) ?? 
0;\n\n  if (speed == 0) {\n    speed = 1.0;\n  }\n\n  final vits = sherpa_onnx.OfflineTtsVitsModelConfig(\n    model: model,\n    lexicon: lexicon,\n    tokens: tokens,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    vits: vits,\n    numThreads: 1,\n    debug: true,\n  );\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    maxNumSenetences: 1,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n  );\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    sid: sid,\n    speed: speed,\n    silenceScale: 0.2,\n  );\n  final audio = tts.generateWithConfig(text: text, config: genConfig);\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/bin/zipvoice-zh-en.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to the decoder model')\n    ..addOption('vocoder', help: 'Path to the vocoder model')\n    ..addOption('data-dir', help: 'Path to espeak-ng-data directory')\n    ..addOption('lexicon', help: 'Path to lexicon.txt')\n    ..addOption('reference-audio', help: 'Path to reference audio (wav)')\n    ..addOption('reference-text', help: 'Reference text for zero-shot TTS')\n    ..addOption('text', help: 'Text to generate TTS for')\n    ..addOption('output-wav', help: 'Filename to save the generated audio')\n    ..addOption(\n      'num-steps',\n      help: 'Number of inference steps (default: 4)',\n      defaultsTo: '4',\n    );\n\n  final res = parser.parse(arguments);\n\n  if (res['tokens'] == null ||\n      res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['vocoder'] == null ||\n      res['data-dir'] == null ||\n      res['lexicon'] == null ||\n      res['reference-audio'] == null ||\n      res['reference-text'] == null ||\n      res['output-wav'] == null ||\n      res['text'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final tokens = res['tokens'] as String;\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final vocoder = res['vocoder'] as String;\n  final dataDir = res['data-dir'] as String;\n  final lexicon = res['lexicon'] as String;\n  final referenceAudioPath = res['reference-audio'] as String;\n  final referenceText = res['reference-text'] as String;\n  final text = res['text'] as String;\n  final outputWav = res['output-wav'] as 
String;\n  final numSteps = int.parse(res['num-steps'] as String);\n\n  final zipvoice = sherpa_onnx.OfflineTtsZipVoiceModelConfig(\n    tokens: tokens,\n    encoder: encoder,\n    decoder: decoder,\n    vocoder: vocoder,\n    dataDir: dataDir,\n    lexicon: lexicon,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    zipvoice: zipvoice,\n    numThreads: 2,\n    debug: true,\n  );\n\n  final config = sherpa_onnx.OfflineTtsConfig(model: modelConfig);\n\n  final tts = sherpa_onnx.OfflineTts(config);\n\n  final wave = sherpa_onnx.readWave(referenceAudioPath);\n  if (wave.samples.isEmpty || wave.sampleRate == 0) {\n    throw Exception('Failed to read reference audio: $referenceAudioPath');\n  }\n\n  final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n    speed: 1.0,\n    referenceAudio: wave.samples,\n    referenceSampleRate: wave.sampleRate,\n    referenceText: referenceText,\n    numSteps: numSteps,\n    extra: {'min_char_in_sentence': 10},\n  );\n\n  final audio = tts.generateWithConfig(\n    text: text,\n    config: genConfig,\n    onProgress: (samples, progress) {\n      print('Progress: ${(progress * 100).toStringAsFixed(2)}%');\n      print('Received samples length: ${samples.length}');\n      return 1;\n    },\n  );\n\n  tts.free();\n\n  sherpa_onnx.writeWave(\n    filename: outputWav,\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  );\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/tts/pubspec.yaml",
    "content": "name: tts\ndescription: A sample command-line application.\nversion: 1.0.0\n# repository: https://github.com/my_org/my_repo\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/tts/run-coqui.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n\n# Please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\n\nif [[ ! -f ./vits-coqui-de-css10/tokens.txt ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n  tar xvf vits-coqui-de-css10.tar.bz2\n  rm vits-coqui-de-css10.tar.bz2\nfi\n\n# It is a character-based TTS model, so there is no need to use a lexicon\ndart run \\\n  ./bin/coqui.dart \\\n  --model ./vits-coqui-de-css10/model.onnx \\\n  --tokens ./vits-coqui-de-css10/tokens.txt \\\n  --sid 0 \\\n  --speed 0.7 \\\n  --text 'Alles hat ein Ende, nur die Wurst hat zwei.' \\\n  --output-wav coqui-0.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\n# to download more models\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\ndart run \\\n  ./bin/kitten-en.dart \\\n  --model ./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --voices ./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --tokens ./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --data-dir ./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --sid 0 \\\n  --speed 1.0 \\\n  --output-wav kitten-en-0.wav \\\n  --text \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ndart run \\\n  ./bin/kokoro-en.dart \\\n  --model ./kokoro-en-v0_19/model.onnx \\\n  --voices ./kokoro-en-v0_19/voices.bin \\\n  --tokens ./kokoro-en-v0_19/tokens.txt \\\n  --data-dir ./kokoro-en-v0_19/espeak-ng-data \\\n  --sid 9 \\\n  --speed 1.0 \\\n  --output-wav kokoro-en-9.wav \\\n  --text \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\ndart run \\\n  ./bin/kokoro-zh-en.dart \\\n  --model ./kokoro-multi-lang-v1_0/model.onnx \\\n  --voices ./kokoro-multi-lang-v1_0/voices.bin \\\n  --tokens ./kokoro-multi-lang-v1_0/tokens.txt \\\n  --data-dir ./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --lexicon ./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --sid 45 \\\n  --speed 1.0 \\\n  --output-wav kokoro-zh-en-45.wav \\\n  --text \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ndart run \\\n  ./bin/matcha-en.dart \\\n  --acoustic-model ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --vocoder ./vocos-22khz-univ.onnx \\\n  --tokens ./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --data-dir ./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --sid 0 \\\n  --speed 1.0 \\\n  --output-wav matcha-en-1.wav \\\n  --text \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\" \\\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ndart run \\\n  ./bin/matcha-zh.dart \\\n  --acoustic-model ./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --vocoder ./vocos-22khz-univ.onnx \\\n  --lexicon ./matcha-icefall-zh-baker/lexicon.txt \\\n  --tokens ./matcha-icefall-zh-baker/tokens.txt \\\n  --rule-fsts ./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n  --sid 0 \\\n  --speed 1.0 \\\n  --output-wav matcha-zh-1.wav \\\n  --text \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\" \\\n\ndart run \\\n  ./bin/matcha-zh.dart \\\n  --acoustic-model ./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --vocoder ./vocos-22khz-univ.onnx \\\n  --lexicon ./matcha-icefall-zh-baker/lexicon.txt \\\n  --tokens ./matcha-icefall-zh-baker/tokens.txt \\\n  --sid 0 \\\n  --speed 1.0 \\\n  --output-wav matcha-zh-2.wav \\\n  --text \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔.\" \\\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-piper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n\n# Please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\n\nif [[ ! -f ./vits-piper-en_US-libritts_r-medium/tokens.txt ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\n  tar xf vits-piper-en_US-libritts_r-medium.tar.bz2\n  rm vits-piper-en_US-libritts_r-medium.tar.bz2\nfi\n\ndart run \\\n  ./bin/piper.dart \\\n  --model ./vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx \\\n  --tokens ./vits-piper-en_US-libritts_r-medium/tokens.txt \\\n  --data-dir ./vits-piper-en_US-libritts_r-medium/espeak-ng-data \\\n  --sid 351 \\\n  --speed 1.0 \\\n  --text 'How are you doing? This is a speech to text example, using next generation kaldi with piper.' \\\n  --output-wav piper-351.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-pocket-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\n# to download more models\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ndart run \\\n  ./bin/pocket-en.dart \\\n  --lm-flow ./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx \\\n  --lm-main ./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx \\\n  --encoder ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx \\\n  --decoder ./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx \\\n  --text-conditioner ./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx \\\n  --vocab-json ./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json \\\n  --token-scores-json ./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json \\\n  --reference-audio ./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n  --output-wav pocket-en-0.wav \\\n  --text \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-supertonic-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/supertonic.html\n# to download more models\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\ndart run \\\n  ./bin/supertonic-en.dart \\\n  --duration-predictor ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx \\\n  --text-encoder ./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx \\\n  --vector-estimator ./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx \\\n  --vocoder ./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx \\\n  --tts-json ./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json \\\n  --unicode-indexer ./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin \\\n  --voice-style ./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin \\\n  --sid 6 \\\n  --speed 1.25 \\\n  --num-steps 5 \\\n  --output-wav supertonic-en-0.wav \\\n  --text \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-vits-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n\n# Please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\n\nif [[ ! -f ./sherpa-onnx-vits-zh-ll/tokens.txt ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\n  tar xvf sherpa-onnx-vits-zh-ll.tar.bz2\n  rm sherpa-onnx-vits-zh-ll.tar.bz2\nfi\n\ndart run \\\n  ./bin/vits-zh.dart \\\n  --model ./sherpa-onnx-vits-zh-ll/model.onnx \\\n  --lexicon ./sherpa-onnx-vits-zh-ll/lexicon.txt \\\n  --tokens ./sherpa-onnx-vits-zh-ll/tokens.txt \\\n  --sid 2 \\\n  --speed 1.0 \\\n  --text '当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔。' \\\n  --output-wav vits-zh-jieba-2.wav\n\ndart run \\\n  ./bin/vits-zh.dart \\\n  --model ./sherpa-onnx-vits-zh-ll/model.onnx \\\n  --lexicon ./sherpa-onnx-vits-zh-ll/lexicon.txt \\\n  --tokens ./sherpa-onnx-vits-zh-ll/tokens.txt \\\n  --rule-fsts \"./sherpa-onnx-vits-zh-ll/phone.fst,./sherpa-onnx-vits-zh-ll/date.fst,./sherpa-onnx-vits-zh-ll/number.fst\" \\\n  --sid 3 \\\n  --speed 1.0 \\\n  --text '今天是2024年6月15号，13点23分。如果有困难，请拨打110或者18920240511。123456块钱。' \\\n  --output-wav vits-zh-jieba-3.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/tts/run-zipvoice-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n# to download more models\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ndart run \\\n  ./bin/zipvoice-zh-en.dart \\\n  --tokens ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt \\\n  --encoder ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx \\\n  --vocoder ./vocos_24khz.onnx \\\n  --data-dir ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data \\\n  --lexicon ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt \\\n  --reference-audio ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n  --reference-text \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n  --num-steps 4 \\\n  --output-wav zipvoice-zh-en-0.wav \\\n  --text \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/vad/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/vad/CHANGELOG.md",
    "content": "## 1.0.0\n\n- Initial version.\n"
  },
  {
    "path": "dart-api-examples/vad/README.md",
    "content": "# Introduction\n\nThis example shows how to use the Dart API from sherpa-onnx for voice activity detection (VAD).\nSpecifically, we use VAD to remove silences from a wave file.\n\n# Usage\n\n```bash\ndart pub get\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n\ndart run \\\n  ./bin/vad.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --input-wav ./lei-jun-test.wav \\\n  --output-wav ./lei-jun-test-no-silence.wav\n```\n\nIt should generate a file `lei-jun-test-no-silence.wav`, where silences are removed.\n"
  },
  {
    "path": "dart-api-examples/vad/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/vad/bin/init.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:isolate';\nimport 'package:path/path.dart' as p;\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nFuture<void> initSherpaOnnx() async {\n  String platform = '';\n\n  if (Platform.isMacOS) {\n    platform = 'macos';\n  } else if (Platform.isLinux) {\n    platform = 'linux';\n  } else if (Platform.isWindows) {\n    platform = 'windows';\n  } else {\n    throw UnsupportedError('Unknown platform: ${Platform.operatingSystem}');\n  }\n\n  var uri = await Isolate.resolvePackageUri(\n      Uri.parse('package:sherpa_onnx_$platform/any_path_is_ok_here.dart'));\n\n  if (uri == null) {\n    print('File not found');\n    exit(1);\n  }\n\n  var libPath = p.join(p.dirname(p.fromUri(uri)), '..', platform);\n  if (platform == 'linux') {\n    final arch = Platform.version.contains('arm64') ||\n            Platform.version.contains('aarch64')\n        ? 'aarch64'\n        : 'x64';\n    libPath = p.join(p.dirname(p.fromUri(uri)), '..', platform, arch);\n  }\n\n  sherpa_onnx.initBindings(libPath);\n}\n"
  },
  {
    "path": "dart-api-examples/vad/bin/ten-vad.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('ten-vad', help: 'Path to ten-vad.onnx')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['ten-vad'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final tenVad = res['ten-vad'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final tenVadConfig = sherpa_onnx.TenVadModelConfig(\n    model: tenVad,\n    threshold: 0.25,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    windowSize: 256,\n  );\n\n  final config = sherpa_onnx.VadModelConfig(\n    tenVad: tenVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: config, bufferSizeInSeconds: 10);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. 
Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ config.tenVad.windowSize;\n\n  List<List<double>> allSamples = [];\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * config.tenVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + config.tenVad.windowSize));\n\n    if (vad.isDetected()) {\n      while (!vad.isEmpty()) {\n        allSamples.add(vad.front().samples);\n        vad.pop();\n      }\n    }\n  }\n\n  vad.flush();\n  while (!vad.isEmpty()) {\n    allSamples.add(vad.front().samples);\n    vad.pop();\n  }\n\n  vad.free();\n\n  final s = Float32List.fromList(allSamples.expand((x) => x).toList());\n  sherpa_onnx.writeWave(\n      filename: outputWav, samples: s, sampleRate: waveData.sampleRate);\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/vad/bin/vad.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('input-wav', help: 'Path to input.wav')\n    ..addOption('output-wav', help: 'Path to output.wav');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['input-wav'] == null ||\n      res['output-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  final sileroVad = res['silero-vad'] as String;\n  final inputWav = res['input-wav'] as String;\n  final outputWav = res['output-wav'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n  );\n\n  final config = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: config, bufferSizeInSeconds: 10);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. 
Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ config.sileroVad.windowSize;\n\n  List<List<double>> allSamples = [];\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * config.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + config.sileroVad.windowSize));\n\n    if (vad.isDetected()) {\n      while (!vad.isEmpty()) {\n        allSamples.add(vad.front().samples);\n        vad.pop();\n      }\n    }\n  }\n\n  vad.flush();\n  while (!vad.isEmpty()) {\n    allSamples.add(vad.front().samples);\n    vad.pop();\n  }\n\n  vad.free();\n\n  final s = Float32List.fromList(allSamples.expand((x) => x).toList());\n  sherpa_onnx.writeWave(\n      filename: outputWav, samples: s, sampleRate: waveData.sampleRate);\n\n  print('Saved to $outputWav');\n}\n"
  },
  {
    "path": "dart-api-examples/vad/pubspec.yaml",
    "content": "name: vad\n\ndescription: >\n  This example demonstrates how to use the Dart API for VAD (voice activity detection).\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/vad/run-ten-vad.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n\nif [[ ! -f ./ten-vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [[ ! -f ./lei-jun-test.wav ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\ndart run \\\n  ./bin/ten-vad.dart \\\n  --ten-vad ./ten-vad.onnx \\\n  --input-wav ./lei-jun-test.wav \\\n  --output-wav ./lei-jun-test-no-silence.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/vad/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [[ ! -f ./lei-jun-test.wav ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\ndart run \\\n  ./bin/vad.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --input-wav ./lei-jun-test.wav \\\n  --output-wav ./lei-jun-test-no-silence.wav\n\nls -lh *.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/.gitignore",
    "content": "# https://dart.dev/guides/libraries/private-files\n# Created by `dart pub`\n.dart_tool/\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/README.md",
    "content": "# Introduction\n\nThis folder contains examples for non-streaming ASR + voice activity detection\nwith Dart API.\n\n| File | Description|\n|------|------------|\n|[./bin/paraformer.dart](./bin/paraformer.dart)| Use a Paraformer model for speech recognition. See [./run-paraformer.sh](./run-paraformer.sh)|\n|[./bin/sense-voice.dart](./bin/sense-voice.dart)| Use a SenseVoice Ctc model for speech recognition. See [./run-sense-voice-zh.sh](./run-sense-voice-zh.sh) and [./run-sense-voice-en.sh](./run-sense-voice-en.sh)|\n|[./bin/telespeech-ctc.dart](./bin/telespeech-ctc.dart)| Use a TeleSpeech CTC model for speech recognition. See [./run-telespeech-ctc.sh](./run-telespeech-ctc.sh)|\n|[./bin/whisper.dart](./bin/whisper.dart)| Use a Whisper model for speech recognition. See [./run-whisper.sh](./run-whisper.sh)|\n|[./bin/zipformer-transducer.dart](./bin/zipformer-transducer.dart)| Use a Zipformer transducer model for speech recognition. See [./run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|\n\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/analysis_options.yaml",
    "content": "# This file configures the static analysis results for your project (errors,\n# warnings, and lints).\n#\n# This enables the 'recommended' set of lints from `package:lints`.\n# This set helps identify many issues that may lead to problems when running\n# or consuming Dart code, and enforces writing Dart using a single, idiomatic\n# style and format.\n#\n# If you want a smaller set of lints you can change this to specify\n# 'package:lints/core.yaml'. These are just the most critical lints\n# (the recommended set includes the core lints).\n# The core lints are also what is used by pub.dev for scoring packages.\n\ninclude: package:lints/recommended.yaml\n\n# Uncomment the following section to specify additional rules.\n\n# linter:\n#   rules:\n#     - camel_case_types\n\n# analyzer:\n#   exclude:\n#     - path/to/excluded/files/**\n\n# For more information about the core and recommended set of lints, see\n# https://dart.dev/go/core-lints\n\n# For additional information about configuring this file, see\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/dolphin-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the Dolphin CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create offline recognizer\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final dolphin = sherpa_onnx.OfflineDolphinModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    dolphin: dolphin,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. 
Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/moonshine.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('preprocessor',\n        help: 'Path to the moonshine preprocessor model')\n    ..addOption('encoder', help: 'Path to the moonshine encoder model')\n    ..addOption('uncached-decoder',\n        help: 'Path to moonshine uncached decoder model')\n    ..addOption('cached-decoder',\n        help: 'Path to moonshine cached decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['preprocessor'] == null ||\n      res['encoder'] == null ||\n      res['uncached-decoder'] == null ||\n      res['cached-decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create whisper recognizer\n  final preprocessor = res['preprocessor'] as String;\n  final encoder = res['encoder'] as String;\n  final uncachedDecoder = res['uncached-decoder'] as String;\n  final cachedDecoder = res['cached-decoder'] as String;\n  final tokens = res['tokens'] as 
String;\n  final inputWav = res['input-wav'] as String;\n\n  final moonshine = sherpa_onnx.OfflineMoonshineModelConfig(\n    preprocessor: preprocessor,\n    encoder: encoder,\n    uncachedDecoder: uncachedDecoder,\n    cachedDecoder: cachedDecoder,\n  );\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    moonshine: moonshine,\n    tokens: tokens,\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    
stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/paraformer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the paraformer model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create paraformer recognizer\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final paraformer = sherpa_onnx.OfflineParaformerModelConfig(\n    model: model,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    paraformer: paraformer,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'paraformer',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is 
supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/sense-voice-2.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\n//\n// Different from ./sense-voice.dart, this file uses a CircularBuffer\nimport 'dart:io';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the SenseVoice model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('language',\n        help: 'auto, zh, en, ja, ko, yue, or leave it empty to use auto',\n        defaultsTo: '')\n    ..addOption('use-itn',\n        help: 'true to use inverse text normalization', defaultsTo: 'false')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create SenseVoice\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n  final language = res['language'] as String;\n  final useItn = (res['use-itn'] as String).toLowerCase() == 'true';\n\n  final senseVoice = sherpa_onnx.OfflineSenseVoiceModelConfig(\n      model: model, language: language, useInverseTextNormalization: useItn);\n\n  
final modelConfig = sherpa_onnx.OfflineModelConfig(\n    senseVoice: senseVoice,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  final buffer = sherpa_onnx.CircularBuffer(capacity: 30 * 16000);\n  buffer.push(waveData.samples);\n\n  while (buffer.size > vadConfig.sileroVad.windowSize) {\n    final samples =\n        buffer.get(startIndex: buffer.head, n: vadConfig.sileroVad.windowSize);\n    buffer.pop(vadConfig.sileroVad.windowSize);\n\n    vad.acceptWaveform(samples);\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- 
${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  buffer.free();\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/sense-voice.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the SenseVoice model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('language',\n        help: 'auto, zh, en, ja, ko, yue, or leave it empty to use auto',\n        defaultsTo: '')\n    ..addOption('use-itn',\n        help: 'true to use inverse text normalization', defaultsTo: 'false')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create SenseVoice\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n  final language = res['language'] as String;\n  final useItn = (res['use-itn'] as String).toLowerCase() == 'true';\n\n  final senseVoice = sherpa_onnx.OfflineSenseVoiceModelConfig(\n      model: model, language: language, useInverseTextNormalization: useItn);\n\n  final modelConfig = 
sherpa_onnx.OfflineModelConfig(\n    senseVoice: senseVoice,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : 
${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/telespeech-ctc.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the telespeech CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create telespeech CTC recognizer\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    telespeechCtc: model,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n    modelType: 'telespeech_ctc',\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. 
Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/whisper.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('encoder', help: 'Path to the whisper encoder model')\n    ..addOption('decoder', help: 'Path to whisper decoder model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create whisper recognizer\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final whisper = sherpa_onnx.OfflineWhisperModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    whisper: whisper,\n    tokens: tokens,\n    modelType: 'whisper',\n    debug: false,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = 
sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/zipformer-ctc.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('model', help: 'Path to the Zipformer CTC model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n  if (res['silero-vad'] == null ||\n      res['model'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create offline recognizer\n  final model = res['model'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final zipformerCtc = sherpa_onnx.OfflineZipformerCtcModelConfig(model: model);\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    zipformerCtc: zipformerCtc,\n    tokens: tokens,\n    debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. 
Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/bin/zipformer-transducer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:args/args.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './init.dart';\n\nvoid main(List<String> arguments) async {\n  await initSherpaOnnx();\n\n  final parser = ArgParser()\n    ..addOption('silero-vad', help: 'Path to silero_vad.onnx')\n    ..addOption('encoder', help: 'Path to the encoder model')\n    ..addOption('decoder', help: 'Path to decoder model')\n    ..addOption('joiner', help: 'Path to joiner model')\n    ..addOption('tokens', help: 'Path to tokens.txt')\n    ..addOption('input-wav', help: 'Path to input.wav to transcribe');\n\n  final res = parser.parse(arguments);\n\n  if (res['silero-vad'] == null ||\n      res['encoder'] == null ||\n      res['decoder'] == null ||\n      res['joiner'] == null ||\n      res['tokens'] == null ||\n      res['input-wav'] == null) {\n    print(parser.usage);\n    exit(1);\n  }\n\n  // create VAD\n  final sileroVad = res['silero-vad'] as String;\n\n  final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n    model: sileroVad,\n    minSilenceDuration: 0.25,\n    minSpeechDuration: 0.5,\n    maxSpeechDuration: 5.0,\n  );\n\n  final vadConfig = sherpa_onnx.VadModelConfig(\n    sileroVad: sileroVadConfig,\n    numThreads: 1,\n    debug: true,\n  );\n\n  final vad = sherpa_onnx.VoiceActivityDetector(\n      config: vadConfig, bufferSizeInSeconds: 10);\n\n  // create zipformer transducer recognizer\n  final encoder = res['encoder'] as String;\n  final decoder = res['decoder'] as String;\n  final joiner = res['joiner'] as String;\n  final tokens = res['tokens'] as String;\n  final inputWav = res['input-wav'] as String;\n\n  final transducer = sherpa_onnx.OfflineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner,\n  );\n\n  final modelConfig = sherpa_onnx.OfflineModelConfig(\n    transducer: transducer,\n    tokens: tokens,\n    
debug: true,\n    numThreads: 1,\n  );\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  final recognizer = sherpa_onnx.OfflineRecognizer(config);\n\n  final waveData = sherpa_onnx.readWave(inputWav);\n  if (waveData.sampleRate != 16000) {\n    print('Only 16000 Hz is supported. Given: ${waveData.sampleRate}');\n    exit(1);\n  }\n\n  int numSamples = waveData.samples.length;\n  int numIter = numSamples ~/ vadConfig.sileroVad.windowSize;\n\n  for (int i = 0; i != numIter; ++i) {\n    int start = i * vadConfig.sileroVad.windowSize;\n    vad.acceptWaveform(Float32List.sublistView(\n        waveData.samples, start, start + vadConfig.sileroVad.windowSize));\n\n    while (!vad.isEmpty()) {\n      final samples = vad.front().samples;\n      final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n      final endTime =\n          startTime + samples.length.toDouble() / waveData.sampleRate;\n\n      final stream = recognizer.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n      recognizer.decode(stream);\n\n      final result = recognizer.getResult(stream);\n      stream.free();\n      print(\n          '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n      vad.pop();\n    }\n  }\n\n  vad.flush();\n\n  while (!vad.isEmpty()) {\n    final samples = vad.front().samples;\n    final startTime = vad.front().start.toDouble() / waveData.sampleRate;\n    final endTime = startTime + samples.length.toDouble() / waveData.sampleRate;\n\n    final stream = recognizer.createStream();\n    stream.acceptWaveform(samples: samples, sampleRate: waveData.sampleRate);\n    recognizer.decode(stream);\n\n    final result = recognizer.getResult(stream);\n    stream.free();\n    print(\n        '${startTime.toStringAsPrecision(5)} -- ${endTime.toStringAsPrecision(5)} : ${result.text}');\n\n    vad.pop();\n  }\n\n  vad.free();\n\n  recognizer.free();\n}\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/pubspec.yaml",
    "content": "name: vad_with_non_streaming_asr\n\ndescription: >\n  This example demonstrates how to use the Dart API for VAD (voice activity detection)\n  with non-streaming speech recognition.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx: ^1.12.31\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/dolphin-ctc.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \\\n  --tokens ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt \\\n  --input-wav ./lei-jun-test.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/moonshine.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --preprocessor ./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --encoder ./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  --uncached-decoder ./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --cached-decoder ./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \\\n  --input-wav ./Obama.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/paraformer.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --input-wav ./lei-jun-test.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-sense-voice-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/sense-voice.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --use-itn true \\\n  --input-wav ./Obama.wav\n\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-sense-voice-zh-2.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/sense-voice-2.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --use-itn true \\\n  --input-wav ./lei-jun-test.wav\n\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-sense-voice-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/sense-voice.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --use-itn true \\\n  --input-wav ./lei-jun-test.wav\n\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-telespeech-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n\n  tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/telespeech-ctc.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx \\\n  --tokens ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt \\\n  --input-wav ./lei-jun-test.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\n\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/whisper.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --encoder ./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \\\n  --tokens ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \\\n  --input-wav ./Obama.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/zipformer-ctc.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --model ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx \\\n  --tokens ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt \\\n  --input-wav ./lei-jun-test.wav\n"
  },
  {
    "path": "dart-api-examples/vad-with-non-streaming-asr/run-zipformer-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndart pub get\n\nif [ ! -f ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n  rm sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\nfi\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ndart run \\\n  ./bin/zipformer-transducer.dart \\\n  --silero-vad ./silero_vad.onnx \\\n  --encoder ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/encoder-epoch-30-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/decoder-epoch-30-avg-1.onnx \\\n  --joiner ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/joiner-epoch-30-avg-1.int8.onnx \\\n  --tokens ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt \\\n  --input-wav ./Obama.wav\n\n"
  },
  {
    "path": "dotnet-examples/.editorconfig",
    "content": "# top-most EditorConfig file\nroot = true\n\n# Don't use tabs for indentation.\n[*]\nindent_style = space\n\n# Code files\n[*.{cs,csx,vb,vbx}]\nindent_size = 2\ninsert_final_newline = true\ncharset = utf-8-bom\nend_of_line = crlf\n"
  },
  {
    "path": "dotnet-examples/.gitignore",
    "content": "bin\nobj\nv17\n.vs\n!*.sh\n*.vsidx\n"
  },
  {
    "path": "dotnet-examples/.notes",
    "content": "# How to create a new project in this folder\n\n```bash\nmkdir offline-tts\ncd offline-tts\ndotnet new console\ncd ..\ndotnet sln ./sherpa-onnx.sln add ./offline-tts\n```\n"
  },
  {
    "path": "dotnet-examples/Common/Common.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\n\n    <PropertyGroup>\n        <TargetFramework>net8.0</TargetFramework>\n        <AllowUnsafeBlocks>true</AllowUnsafeBlocks>\n    </PropertyGroup>\n    <ItemGroup>\n        <PackageReference Include=\"org.k2fsa.sherpa.onnx\" Version=\"*\" />\n    </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/Common/WaveHeader.cs",
    "content": "﻿// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.IO;\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx;\n\n[StructLayout(LayoutKind.Sequential)]\npublic struct WaveHeader\n{\n  public int ChunkID;\n  public int ChunkSize;\n  public int Format;\n  public int SubChunk1ID;\n  public int SubChunk1Size;\n  public short AudioFormat;\n  public short NumChannels;\n  public int SampleRate;\n  public int ByteRate;\n  public short BlockAlign;\n  public short BitsPerSample;\n  public int SubChunk2ID;\n  public int SubChunk2Size;\n\n  public bool Validate()\n  {\n    if (ChunkID != 0x46464952)\n    {\n      Console.WriteLine($\"Invalid chunk ID: 0x{ChunkID:X}. Expect 0x46464952\");\n      return false;\n    }\n\n    //               E V A W\n    if (Format != 0x45564157)\n    {\n      Console.WriteLine($\"Invalid format: 0x{Format:X}. Expect 0x45564157\");\n      return false;\n    }\n\n    //                      t m f\n    if (SubChunk1ID != 0x20746d66)\n    {\n      Console.WriteLine($\"Invalid SubChunk1ID: 0x{SubChunk1ID:X}. Expect 0x20746d66\");\n      return false;\n    }\n\n    if (SubChunk1Size != 16)\n    {\n      Console.WriteLine($\"Invalid SubChunk1Size: {SubChunk1Size}. Expect 16\");\n      return false;\n    }\n\n    if (AudioFormat != 1)\n    {\n      Console.WriteLine($\"Invalid AudioFormat: {AudioFormat}. Expect 1\");\n      return false;\n    }\n\n    if (NumChannels != 1)\n    {\n      Console.WriteLine($\"Invalid NumChannels: {NumChannels}. 
Expect 1\");\n      return false;\n    }\n\n    if (ByteRate != (SampleRate * NumChannels * BitsPerSample / 8))\n    {\n      Console.WriteLine($\"Invalid byte rate: {ByteRate}.\");\n      return false;\n    }\n\n    if (BlockAlign != (NumChannels * BitsPerSample / 8))\n    {\n      Console.WriteLine($\"Invalid block align: {BlockAlign}.\");\n      return false;\n    }\n\n    if (BitsPerSample != 16)\n    {  // we support only 16 bits per sample\n      Console.WriteLine($\"Invalid bits per sample: {BitsPerSample}. Expect 16\");\n      return false;\n    }\n\n    return true;\n  }\n}\n\n// It supports only 16-bit, single channel WAVE format.\n// The sample rate can be any value.\npublic class WaveReader\n{\n  public WaveReader(string fileName)\n  {\n    if (!File.Exists(fileName))\n    {\n      throw new ApplicationException($\"{fileName} does not exist!\");\n    }\n\n    // Open read-only so that write-protected files can also be read\n    using var stream = File.OpenRead(fileName);\n    using var reader = new BinaryReader(stream);\n\n    _header = ReadHeader(reader);\n\n    if (!_header.Validate())\n    {\n      throw new ApplicationException($\"Invalid wave file {fileName}\");\n    }\n\n    SkipMetaData(reader);\n\n    // now read samples\n    // _header.SubChunk2Size contains number of bytes in total.\n    // we assume each sample is of type int16\n    var buffer = reader.ReadBytes(_header.SubChunk2Size);\n    // Size the array from the bytes actually read, in case the file is truncated\n    var samples_int16 = new short[buffer.Length / 2];\n    Buffer.BlockCopy(buffer, 0, samples_int16, 0, samples_int16.Length * 2);\n\n    _samples = new float[samples_int16.Length];\n\n    for (var i = 0; i < samples_int16.Length; ++i)\n    {\n      _samples[i] = samples_int16[i] / 32768.0F;\n    }\n  }\n\n  private static WaveHeader ReadHeader(BinaryReader reader)\n  {\n    var bytes = reader.ReadBytes(Marshal.SizeOf(typeof(WaveHeader)));\n\n    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);\n    WaveHeader header = (WaveHeader)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(WaveHeader))!;\n    
handle.Free();\n\n    return header;\n  }\n\n  // Skip any chunks that precede the \"data\" chunk (e.g., LIST metadata)\n  private void SkipMetaData(BinaryReader reader)\n  {\n    var bs = reader.BaseStream;\n\n    var subChunk2ID = _header.SubChunk2ID;\n    var subChunk2Size = _header.SubChunk2Size;\n\n    //                                                    a t a d\n    while (bs.Position != bs.Length && subChunk2ID != 0x61746164)\n    {\n      bs.Seek(subChunk2Size, SeekOrigin.Current);\n      subChunk2ID = reader.ReadInt32();\n      subChunk2Size = reader.ReadInt32();\n    }\n    _header.SubChunk2ID = subChunk2ID;\n    _header.SubChunk2Size = subChunk2Size;\n  }\n\n  private WaveHeader _header;\n\n  // Samples are normalized to the range [-1, 1)\n  private float[] _samples;\n\n  public int SampleRate => _header.SampleRate;\n\n  public float[] Samples => _samples;\n\n  public static void Test(string fileName)\n  {\n    WaveReader reader = new WaveReader(fileName);\n    Console.WriteLine($\"number of samples: {reader.Samples.Length}\");\n    Console.WriteLine($\"sample rate: {reader.SampleRate}\");\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/README.md",
    "content": "# Introduction\n\nThis folder contains C# API examples for [sherpa-onnx][sherpa-onnx].\n\nPlease refer to the documentation\nhttps://k2-fsa.github.io/sherpa/onnx/csharp-api/index.html\nfor details.\n\n- [./speech-enhancement-gtcrn](./speech-enhancement-gtcrn) It shows how to use\n  the offline speech denoiser API with GTCRN models.\n- [./speech-enhancement-dpdfnet](./speech-enhancement-dpdfnet) It shows how to\n  use the offline speech denoiser API with DPDFNet models. Use 16 kHz DPDFNet\n  models such as `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`, `dpdfnet4.onnx`, or\n  `dpdfnet8.onnx` for downstream ASR and `dpdfnet2_48khz_hr.onnx` for 48 kHz\n  enhancement output.\n- [./streaming-speech-enhancement-gtcrn](./streaming-speech-enhancement-gtcrn)\n  It shows how to use the online speech denoiser API with GTCRN models.\n- [./streaming-speech-enhancement-dpdfnet](./streaming-speech-enhancement-dpdfnet)\n  It shows how to use the online speech denoiser API with DPDFNet models.\n- [./zipvoice-tts](./zipvoice-tts) It shows how to use ZipVoice for\n  Chinese/English zero-shot text-to-speech.\n- [./zipvoice-tts-play](./zipvoice-tts-play) It shows how to use ZipVoice for\n  Chinese/English zero-shot text-to-speech with playback.\n\n```bash\ndotnet new console -n offline-tts-play\ndotnet sln ./sherpa-onnx.sln add ./offline-tts-play\n```\n\n```bash\ndotnet nuget locals all --list\ndotnet nuget locals all --clear\n```\n\n[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-files/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to do keyword spotting with sherpa-onnx.\r\n//\r\n// 1. Download a model from\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\r\n//\r\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\r\n// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\r\n//\r\n// 2. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing SherpaOnnx;\r\n\r\nclass KeywordSpotterDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new KeywordSpotterConfig();\r\n    config.FeatConfig.SampleRate = 16000;\r\n    config.FeatConfig.FeatureDim = 80;\r\n\r\n    config.ModelConfig.Transducer.Encoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n    config.ModelConfig.Transducer.Decoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n    config.ModelConfig.Transducer.Joiner = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n\r\n    config.ModelConfig.Tokens = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\";\r\n    config.ModelConfig.Provider = \"cpu\";\r\n    config.ModelConfig.NumThreads = 1;\r\n    config.ModelConfig.Debug = 1;\r\n    config.KeywordsFile = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\";\r\n\r\n    var kws = new KeywordSpotter(config);\r\n\r\n    var filename = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\";\r\n\r\n    var waveReader = new WaveReader(filename);\r\n\r\n    Console.WriteLine(\"----------Use pre-defined keywords----------\");\r\n\r\n    var s = kws.CreateStream();\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    float[] tailPadding = new 
float[(int)(waveReader.SampleRate * 0.3)];\r\n    s.AcceptWaveform(waveReader.SampleRate, tailPadding);\r\n    s.InputFinished();\r\n\r\n    while (kws.IsReady(s))\r\n    {\r\n      kws.Decode(s);\r\n      var result = kws.GetResult(s);\r\n      if (result.Keyword != string.Empty)\r\n      {\r\n        // Remember to call Reset() right after detecting a keyword\r\n        kws.Reset(s);\r\n        Console.WriteLine(\"Detected: {0}\", result.Keyword);\r\n      }\r\n    }\r\n\r\n    Console.WriteLine(\"----------Use pre-defined keywords + add a new keyword----------\");\r\n    s = kws.CreateStream(\"y ǎn y uán @演员\");\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    s.AcceptWaveform(waveReader.SampleRate, tailPadding);\r\n    s.InputFinished();\r\n\r\n    while (kws.IsReady(s))\r\n    {\r\n      kws.Decode(s);\r\n      var result = kws.GetResult(s);\r\n      if (result.Keyword != string.Empty)\r\n      {\r\n        // Remember to call Reset() right after detecting a keyword\r\n        kws.Reset(s);\r\n        Console.WriteLine(\"Detected: {0}\", result.Keyword);\r\n      }\r\n    }\r\n\r\n    Console.WriteLine(\"----------Use pre-defined keywords + add 2 new keywords----------\");\r\n\r\n    // Note keywords are separated by /\r\n    s = kws.CreateStream(\"y ǎn y uán @演员/zh ī m íng @知名\");\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    s.AcceptWaveform(waveReader.SampleRate, tailPadding);\r\n    s.InputFinished();\r\n\r\n    while (kws.IsReady(s))\r\n    {\r\n      kws.Decode(s);\r\n      var result = kws.GetResult(s);\r\n      if (result.Keyword != string.Empty)\r\n      {\r\n        // Remember to call Reset() right after detecting a keyword\r\n        kws.Reset(s);\r\n        Console.WriteLine(\"Detected: {0}\", result.Keyword);\r\n      }\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-files/keyword-spotting-from-files.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>keyword_spotting_from_files</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\ndotnet run -c Release\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-microphone/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to do keyword spotting with sherpa-onnx.\r\n//\r\n// 1. Download a model from\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\r\n//\r\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\r\n// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\r\n//\r\n// 2. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass KeywordSpotterDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new KeywordSpotterConfig();\r\n    config.FeatConfig.SampleRate = 16000;\r\n    config.FeatConfig.FeatureDim = 80;\r\n\r\n    config.ModelConfig.Transducer.Encoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n    config.ModelConfig.Transducer.Decoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n    config.ModelConfig.Transducer.Joiner = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\";\r\n\r\n    config.ModelConfig.Tokens = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\";\r\n    config.ModelConfig.Provider = \"cpu\";\r\n    config.ModelConfig.NumThreads = 1;\r\n    config.ModelConfig.Debug = 1;\r\n    config.KeywordsFile = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\";\r\n\r\n    var kws = new KeywordSpotter(config);\r\n\r\n    var filename = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\";\r\n\r\n    var waveReader = new WaveReader(filename);\r\n\r\n    Console.WriteLine(\"----------Use pre-defined keywords----------\");\r\n\r\n    var s = kws.CreateStream();\r\n\r\n    
Console.WriteLine(PortAudio.VersionInfo.versionText);\r\n    PortAudio.Initialize();\r\n\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      var deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max input channels: {deviceInfo.maxInputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultInputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default input device found\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowInputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    PortAudioSharp.Stream.Callback callback = (IntPtr input, IntPtr output,\r\n        uint frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      var samples = new float[frameCount];\r\n      Marshal.Copy(input, samples, 0, (int)frameCount);\r\n\r\n      s.AcceptWaveform(config.FeatConfig.SampleRate, samples);\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    var stream = new PortAudioSharp.Stream(inParams: param, outParams: null, sampleRate: config.FeatConfig.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: callback,\r\n        userData: IntPtr.Zero\r\n    
    );\r\n\r\n    Console.WriteLine(param);\r\n    Console.WriteLine(\"Started! Please speak\");\r\n\r\n    stream.Start();\r\n\r\n    while (true)\r\n    {\r\n      while (kws.IsReady(s))\r\n      {\r\n        kws.Decode(s);\r\n\r\n        var result = kws.GetResult(s);\r\n        if (result.Keyword != string.Empty)\r\n        {\r\n          // Remember to call Reset() right after detecting a keyword\r\n          kws.Reset(s);\r\n\r\n          Console.WriteLine(\"Detected: {0}\", result.Keyword);\r\n        }\r\n      }\r\n\r\n      Thread.Sleep(200); // ms\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-microphone/keyword-spotting-from-microphone.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>keyword_spotting_from_microphone</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/keyword-spotting-from-microphone/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\ndotnet run -c Release\n"
  },
  {
    "path": "dotnet-examples/kitten-tts/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming KittenTTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass KittenTtsDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n\r\n    TestEn();\r\n  }\r\n\r\n  static void TestEn()\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Kitten.Model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\";\r\n    config.Model.Kitten.Voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\";\r\n    config.Model.Kitten.Tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\";\r\n    config.Model.Kitten.DataDir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    var tts = new OfflineTts(config);\r\n    var speed = 1.0f;\r\n    var text = \"Today as always, men fall into two groups: slaves and free men. Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. \" +\r\n      \"Friends fell out often because life was changing so fast. 
The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    // mapping of sid to voice name\r\n    // 0->expr-voice-2-m, 1->expr-voice-2-f, 2->expr-voice-3-m\r\n    // 3->expr-voice-3-f, 4->expr-voice-4-m, 5->expr-voice-4-f\r\n    // 6->expr-voice-5-m, 7->expr-voice-5-f\r\n    var sid = 0;\r\n\r\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\r\n    genConfig.Sid = sid;\r\n    genConfig.Speed = speed;\r\n\r\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\r\n    {\r\n      float[] data = new float[n];\r\n      Marshal.Copy(samples, data, 0, n);\r\n      // You can process samples here, e.g., play them.\r\n      // See ../kitten-tts-play for how to play them\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\r\n\r\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\r\n\r\n    var outputFilename = \"./generated-kitten-en.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Saved to {outputFilename}\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/kitten-tts/kitten-tts.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>kitten_tts</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/kitten-tts/run-kitten.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/kitten-tts-play/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming Kitten TTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Collections.Concurrent;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass KittenTtsPlayDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Kitten.Model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\";\r\n    config.Model.Kitten.Voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\";\r\n    config.Model.Kitten.Tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\";\r\n    config.Model.Kitten.DataDir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    var tts = new OfflineTts(config);\r\n    var speed = 1.0f;\r\n    var text = \"Today as always, men fall into two groups: slaves and free men. Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. \" +\r\n      \"Friends fell out often because life was changing so fast. 
The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    // mapping of sid to voice name\r\n    // 0->expr-voice-2-m, 1->expr-voice-2-f, 2->expr-voice-3-m\n    // 3->expr-voice-3-f, 4->expr-voice-4-m, 5->expr-voice-4-f\n    // 6->expr-voice-5-m, 7->expr-voice-5-f\n    var sid = 0;\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n\r\n\r\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\r\n    PortAudio.Initialize();\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max output channels: {deviceInfo.maxOutputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultOutputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default output device found. 
Please use ../offline-tts instead\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default output device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowOutputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    // https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview\r\n    var dataItems = new BlockingCollection<float[]>();\r\n\r\n    var myCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      float[] data = new float[n];\r\n\r\n      Marshal.Copy(samples, data, 0, n);\r\n\r\n      dataItems.Add(data);\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var playFinished = false;\r\n\r\n    float[]? 
lastSampleArray = null;\r\n    int lastIndex = 0; // not played\r\n\r\n    PortAudioSharp.Stream.Callback playCallback = (IntPtr input, IntPtr output,\r\n        UInt32 frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      if (dataItems.IsCompleted && lastSampleArray == null && lastIndex == 0)\r\n      {\r\n        Console.WriteLine($\"Finished playing\");\r\n        playFinished = true;\r\n        return StreamCallbackResult.Complete;\r\n      }\r\n\r\n      int expected = Convert.ToInt32(frameCount);\r\n      int i = 0;\r\n\r\n      while ((lastSampleArray != null || dataItems.Count != 0) && (i < expected))\r\n      {\r\n        int needed = expected - i;\r\n\r\n        if (lastSampleArray != null)\r\n        {\r\n          int remaining = lastSampleArray.Length - lastIndex;\r\n          if (remaining >= needed)\r\n          {\r\n            float[] this_block = lastSampleArray.Skip(lastIndex).Take(needed).ToArray();\r\n            lastIndex += needed;\r\n            if (lastIndex == lastSampleArray.Length)\r\n            {\r\n              lastSampleArray = null;\r\n              lastIndex = 0;\r\n            }\r\n\r\n            Marshal.Copy(this_block, 0, IntPtr.Add(output, i * sizeof(float)), needed);\r\n            return StreamCallbackResult.Continue;\r\n          }\r\n\r\n          float[] this_block2 = lastSampleArray.Skip(lastIndex).Take(remaining).ToArray();\r\n          lastIndex = 0;\r\n          lastSampleArray = null;\r\n\r\n          Marshal.Copy(this_block2, 0, IntPtr.Add(output, i * sizeof(float)), remaining);\r\n          i += remaining;\r\n          continue;\r\n        }\r\n\r\n        if (dataItems.Count != 0)\r\n        {\r\n          lastSampleArray = dataItems.Take();\r\n          lastIndex = 0;\r\n        }\r\n      }\r\n\r\n      if (i < expected)\r\n      {\r\n        int sizeInBytes = (expected - i) * 4;\r\n        
Marshal.Copy(new byte[sizeInBytes], 0, IntPtr.Add(output, i * sizeof(float)), sizeInBytes);\r\n      }\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: null, outParams: param, sampleRate: tts.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: playCallback,\r\n        userData: IntPtr.Zero\r\n        );\r\n\r\n    stream.Start();\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(myCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n    var outputFilename = \"./generated-kitten-0.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n    dataItems.CompleteAdding();\r\n\r\n    while (!playFinished)\r\n    {\r\n      Thread.Sleep(100); // 100ms\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/kitten-tts-play/kitten-tts-play.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>kitten_tts_play</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/kitten-tts-play/run-kitten.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming Kokoro TTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass KokoroTtsDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n\r\n    TestZhEn();\r\n    TestEn();\r\n  }\r\n\r\n  static void TestZhEn()\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Kokoro.Model = \"./kokoro-multi-lang-v1_0/model.onnx\";\r\n    config.Model.Kokoro.Voices = \"./kokoro-multi-lang-v1_0/voices.bin\";\r\n    config.Model.Kokoro.Tokens = \"./kokoro-multi-lang-v1_0/tokens.txt\";\r\n    config.Model.Kokoro.DataDir = \"./kokoro-multi-lang-v1_0/espeak-ng-data\";\r\n    config.Model.Kokoro.Lexicon = \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    var tts = new OfflineTts(config);\r\n    var speed = 1.0f;\r\n    var text = \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 
你觉得中英文说的如何呢？\";\r\n\r\n    var sid = 50;\r\n\r\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n    genConfig.SilenceScale = 0.2f;\n\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\r\n      float[] data = new float[n];\r\n      Marshal.Copy(samples, data, 0, n);\r\n      // You can process samples here, e.g., play them.\r\n      // See ../kokoro-tts-play for how to play them\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n\r\n    var outputFilename = \"./generated-kokoro-zh-en.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n  }\r\n\r\n  static void TestEn()\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Kokoro.Model = \"./kokoro-en-v0_19/model.onnx\";\r\n    config.Model.Kokoro.Voices = \"./kokoro-en-v0_19/voices.bin\";\r\n    config.Model.Kokoro.Tokens = \"./kokoro-en-v0_19/tokens.txt\";\r\n    config.Model.Kokoro.DataDir = \"./kokoro-en-v0_19/espeak-ng-data\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    var tts = new OfflineTts(config);\r\n    var speed = 1.0f;\r\n    var text = \"Today as always, men fall into two groups: slaves and free men. Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. 
\" +\r\n      \"Friends fell out often because life was changing so fast. The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    // mapping of sid to voice name\r\n    // 0->af, 1->af_bella, 2->af_nicole, 3->af_sarah, 4->af_sky, 5->am_adam\r\n    // 6->am_michael, 7->bf_emma, 8->bf_isabella, 9->bm_george, 10->bm_lewis\r\n    var sid = 0;\r\n\r\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n    genConfig.SilenceScale = 0.2f;\n\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\r\n      float[] data = new float[n];\r\n      Marshal.Copy(samples, data, 0, n);\r\n      // You can process samples here, e.g., play them.\r\n      // See ../kokoro-tts-playback for how to play them\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n\r\n    var outputFilename = \"./generated-kokoro-en.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts/kokoro-tts.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>kokoro_tts</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts/run-kokoro.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts-play/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming Kokoro TTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Collections.Concurrent;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass KokoroTtsPlayDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Kokoro.Model = \"./kokoro-en-v0_19/model.onnx\";\r\n    config.Model.Kokoro.Voices = \"./kokoro-en-v0_19/voices.bin\";\r\n    config.Model.Kokoro.Tokens = \"./kokoro-en-v0_19/tokens.txt\";\r\n    config.Model.Kokoro.DataDir = \"./kokoro-en-v0_19/espeak-ng-data\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    var tts = new OfflineTts(config);\r\n    var speed = 1.0f;\r\n    var text = \"Today as always, men fall into two groups: slaves and free men. Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. \" +\r\n      \"Friends fell out often because life was changing so fast. 
The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    // mapping of sid to voice name\r\n    // 0->af, 1->af_bella, 2->af_nicole, 3->af_sarah, 4->af_sky, 5->am_adam\n    // 6->am_michael, 7->bf_emma, 8->bf_isabella, 9->bm_george, 10->bm_lewis\n    var sid = 0;\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n    genConfig.SilenceScale = 0.2f;\n\r\n\r\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\r\n    PortAudio.Initialize();\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max output channels: {deviceInfo.maxOutputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultOutputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default output device found. 
Please use ../offline-tts instead\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default output device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowOutputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    // https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview\r\n    var dataItems = new BlockingCollection<float[]>();\r\n\r\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      float[] data = new float[n];\r\n\r\n      Marshal.Copy(samples, data, 0, n);\r\n\r\n      dataItems.Add(data);\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var playFinished = false;\r\n\r\n    float[]? 
lastSampleArray = null;\r\n    int lastIndex = 0; // not played\r\n\r\n    PortAudioSharp.Stream.Callback playCallback = (IntPtr input, IntPtr output,\r\n        UInt32 frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      if (dataItems.IsCompleted && lastSampleArray == null && lastIndex == 0)\r\n      {\r\n        Console.WriteLine($\"Finished playing\");\r\n        playFinished = true;\r\n        return StreamCallbackResult.Complete;\r\n      }\r\n\r\n      int expected = Convert.ToInt32(frameCount);\r\n      int i = 0;\r\n\r\n      while ((lastSampleArray != null || dataItems.Count != 0) && (i < expected))\r\n      {\r\n        int needed = expected - i;\r\n\r\n        if (lastSampleArray != null)\r\n        {\r\n          int remaining = lastSampleArray.Length - lastIndex;\r\n          if (remaining >= needed)\r\n          {\r\n            float[] this_block = lastSampleArray.Skip(lastIndex).Take(needed).ToArray();\r\n            lastIndex += needed;\r\n            if (lastIndex == lastSampleArray.Length)\r\n            {\r\n              lastSampleArray = null;\r\n              lastIndex = 0;\r\n            }\r\n\r\n            Marshal.Copy(this_block, 0, IntPtr.Add(output, i * sizeof(float)), needed);\r\n            return StreamCallbackResult.Continue;\r\n          }\r\n\r\n          float[] this_block2 = lastSampleArray.Skip(lastIndex).Take(remaining).ToArray();\r\n          lastIndex = 0;\r\n          lastSampleArray = null;\r\n\r\n          Marshal.Copy(this_block2, 0, IntPtr.Add(output, i * sizeof(float)), remaining);\r\n          i += remaining;\r\n          continue;\r\n        }\r\n\r\n        if (dataItems.Count != 0)\r\n        {\r\n          lastSampleArray = dataItems.Take();\r\n          lastIndex = 0;\r\n        }\r\n      }\r\n\r\n      if (i < expected)\r\n      {\r\n        int sizeInBytes = (expected - i) * 4;\r\n        
Marshal.Copy(new byte[sizeInBytes], 0, IntPtr.Add(output, i * sizeof(float)), sizeInBytes);\r\n      }\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: null, outParams: param, sampleRate: tts.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: playCallback,\r\n        userData: IntPtr.Zero\r\n        );\r\n\r\n    stream.Start();\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n    var outputFilename = \"./generated-kokoro-0.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n    dataItems.CompleteAdding();\r\n\r\n    while (!playFinished)\r\n    {\r\n      Thread.Sleep(100); // 100ms\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts-play/kokoro-tts-play.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>kokoro_tts_play</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/kokoro-tts-play/run-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/non-streaming-canary-decode-files/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a NeMo Canary model for speech recognition.\r\n//\r\n// You can find the model doc at\r\n// https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html\r\nusing SherpaOnnx;\r\n\r\nclass NonStreamingAsrCanary\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    // please download model files from\r\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\r\n    var config = new OfflineRecognizerConfig();\r\n    config.ModelConfig.Canary.Encoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\";\r\n    config.ModelConfig.Canary.Decoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\";\r\n    config.ModelConfig.Canary.SrcLang = \"en\";\r\n    config.ModelConfig.Canary.TgtLang = \"en\";\r\n    config.ModelConfig.Tokens = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\";\r\n    config.ModelConfig.Debug = 0;\r\n    var recognizer = new OfflineRecognizer(config);\r\n\r\n    var testWaveFilename = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\";\r\n    var reader = new WaveReader(testWaveFilename);\r\n    var stream = recognizer.CreateStream();\r\n    stream.AcceptWaveform(reader.SampleRate, reader.Samples);\r\n    recognizer.Decode(stream);\r\n    var text = stream.Result.Text;\r\n    Console.WriteLine(\"Text (English): {0}\", text);\r\n\r\n    // Now output text in German\r\n    config.ModelConfig.Canary.TgtLang = \"de\";\r\n    recognizer.SetConfig(config);\r\n\r\n    stream = recognizer.CreateStream();\r\n    stream.AcceptWaveform(reader.SampleRate, reader.Samples);\r\n    recognizer.Decode(stream);\r\n    text = stream.Result.Text;\r\n    Console.WriteLine(\"Text (German): {0}\", text);\r\n  }\r\n}\r\n\r\n\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-canary-decode-files/non-streaming-canary-decode-files.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>non_streaming_canary_decode_files</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-canary-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/non-streaming-funasr-nano-decode-files/Program.cs",
    "content": "﻿// Copyright (c)  2026  Xiaomi Corporation\r\n//\r\n// This file shows how to use a FunASR Nano model for speech recognition.\r\n//\r\n// You can find the model doc at\r\n// https://k2-fsa.github.io/sherpa/onnx/funasr-nano.html\r\nusing SherpaOnnx;\r\n\r\nclass NonStreamingFunAsrNano\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    // please download model files from\r\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\r\n    var config = new OfflineRecognizerConfig();\r\n    config.ModelConfig.FunAsrNano.EncoderAdaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.LLM = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.Embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.Tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\";\r\n    config.ModelConfig.Tokens = \"\";\r\n    config.ModelConfig.Debug = 1;\r\n    var recognizer = new OfflineRecognizer(config);\r\n\r\n    var testWaveFilename = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\";\r\n    var reader = new WaveReader(testWaveFilename);\r\n    var stream = recognizer.CreateStream();\r\n    stream.AcceptWaveform(reader.SampleRate, reader.Samples);\r\n    recognizer.Decode(stream);\r\n    var text = stream.Result.Text;\r\n    Console.WriteLine(\"Text: {0}\", text);\r\n  }\r\n}\r\n\r\n\r\n\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-funasr-nano-decode-files/non-streaming-funasr-nano-decode-files.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>non_streaming_funasr_nano_decode_files</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-funasr-nano-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2 \n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/non-streaming-moonshine-v2-decode-files/Program.cs",
    "content": "﻿// Copyright (c)  2026  Xiaomi Corporation\r\n//\r\n// This file shows how to use a Moonshine v2 model for speech recognition.\r\n//\r\n// You can find the model doc at\r\n// https://k2-fsa.github.io/sherpa/onnx/moonshine/\r\nusing SherpaOnnx;\r\n\r\nclass NonStreamingAsrMoonshineV2\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    // please download model files from\r\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\r\n    var config = new OfflineRecognizerConfig();\r\n    config.ModelConfig.Moonshine.Encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\";\r\n    config.ModelConfig.Moonshine.MergedDecoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\";\r\n    config.ModelConfig.Tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\";\r\n    config.ModelConfig.Debug = 0;\r\n    var recognizer = new OfflineRecognizer(config);\r\n\r\n    var testWaveFilename = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\";\r\n    var reader = new WaveReader(testWaveFilename);\r\n    var stream = recognizer.CreateStream();\r\n    stream.AcceptWaveform(reader.SampleRate, reader.Samples);\r\n    recognizer.Decode(stream);\r\n    var text = stream.Result.Text;\r\n    Console.WriteLine(\"Text: {0}\", text);\r\n  }\r\n}\r\n\r\n\r\n\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-moonshine-v2-decode-files/non-streaming-moonshine-v2-decode-files.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>non_streaming_moonshine_v2_decode_files</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/non-streaming-moonshine-v2-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/offline-audio-tagging/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming Zipformer or CED model\r\n// for audio tagging\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/audio-tagging/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\r\n// to download pre-trained models\r\n\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass AudioTaggingDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    TestZipformer();\r\n    TestCED();\r\n  }\r\n\r\n  static void TestZipformer()\r\n  {\r\n    var config = new AudioTaggingConfig();\r\n\r\n    config.Model.Zipformer.Model = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.onnx\";\r\n\r\n    config.Model.NumThreads = 1;\r\n    config.Model.Debug = 1;\r\n    config.Labels = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/class_labels_indices.csv\";\r\n\r\n    config.TopK = 5;\r\n\r\n    var tagger = new AudioTagging(config);\r\n\r\n    var s = tagger.CreateStream();\r\n\r\n    var waveFilename = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/1.wav\";\r\n    WaveReader waveReader = new WaveReader(waveFilename);\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    var events = tagger.Compute(s);\r\n    foreach (var e in events)\r\n    {\r\n      Console.WriteLine($\"Name {e.Name}, index: {e.Index}, prob: {e.Prob}\");\r\n    }\r\n  }\r\n\r\n  static void TestCED()\r\n  {\r\n    var config = new AudioTaggingConfig();\r\n\r\n    config.Model.CED =\"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx\";\r\n\r\n    config.Model.NumThreads = 1;\r\n    config.Model.Debug = 1;\r\n    config.Labels = \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv\";\r\n\r\n    config.TopK = 5;\r\n\r\n    var tagger = new AudioTagging(config);\r\n\r\n    var s = tagger.CreateStream();\r\n\r\n    var waveFilename = 
\"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/1.wav\";\r\n    WaveReader waveReader = new WaveReader(waveFilename);\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    var events = tagger.Compute(s);\r\n    foreach (var e in events)\r\n    {\r\n      Console.WriteLine($\"Name {e.Name}, index: {e.Index}, prob: {e.Prob}\");\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/offline-audio-tagging/offline-audio-tagging.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>offline_audio_tagging</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/offline-audio-tagging/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n\n  ls -lh sherpa-onnx-zipformer-small-audio-tagging-2024-04-15\nfi\n\nif [ ! -f ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n\n  ls -lh sherpa-onnx-ced-mini-audio-tagging-2024-04-19\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/Program.cs",
    "content": "﻿// Copyright (c)  2023  Xiaomi Corporation\r\n// Copyright (c)  2023 by manyeyes\r\n//\r\n// This file shows how to use a non-streaming model to decode files\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\r\n// to download non-streaming models\r\nusing CommandLine;\r\nusing CommandLine.Text;\r\nusing SherpaOnnx;\r\n\r\nclass OfflineDecodeFiles\r\n{\r\n  class Options\r\n  {\r\n    [Option(\"sample-rate\", Required = false, Default = 16000, HelpText = \"Sample rate of the data used to train the model\")]\r\n    public int SampleRate { get; set; } = 16000;\r\n\r\n    [Option(\"feat-dim\", Required = false, Default = 80, HelpText = \"Dimension of the features used to train the model\")]\r\n    public int FeatureDim { get; set; } = 80;\r\n\r\n    [Option(Required = false, HelpText = \"Path to tokens.txt\")]\r\n    public string Tokens { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, Default = \"\", HelpText = \"Path to transducer encoder.onnx. Used only for transducer models\")]\r\n    public string Encoder { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, Default = \"\", HelpText = \"Path to transducer decoder.onnx. Used only for transducer models\")]\r\n    public string Decoder { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, Default = \"\", HelpText = \"Path to transducer joiner.onnx. Used only for transducer models\")]\r\n    public string Joiner { get; set; } = string.Empty;\r\n\r\n    [Option(\"model-type\", Required = false, Default = \"\", HelpText = \"model type\")]\r\n    public string ModelType { get; set; } = string.Empty;\r\n\r\n    [Option(\"fire-red-asr-encoder\", Required = false, Default = \"\", HelpText = \"Path to FireRedAsr encoder.int8.onnx. 
Used only for FireRedAsr models\")]\r\n    public string FireRedAsrEncoder { get; set; } = string.Empty;\r\n\r\n\r\n    [Option(\"fire-red-asr-decoder\", Required = false, Default = \"\", HelpText = \"Path to FireRedAsr decoder.int8.onnx. Used only for FireRedAsr models\")]\r\n    public string FireRedAsrDecoder { get; set; } = string.Empty;\r\n\r\n\r\n    [Option(\"whisper-encoder\", Required = false, Default = \"\", HelpText = \"Path to whisper encoder.onnx. Used only for whisper models\")]\r\n    public string WhisperEncoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"whisper-decoder\", Required = false, Default = \"\", HelpText = \"Path to whisper decoder.onnx. Used only for whisper models\")]\r\n    public string WhisperDecoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"whisper-language\", Required = false, Default = \"\", HelpText = \"Language of the input file. Can be empty\")]\r\n    public string WhisperLanguage { get; set; } = string.Empty;\r\n\r\n    [Option(\"whisper-task\", Required = false, Default = \"transcribe\", HelpText = \"transcribe or translate\")]\r\n    public string WhisperTask { get; set; } = \"transcribe\";\r\n\r\n    [Option(\"moonshine-preprocessor\", Required = false, Default = \"\", HelpText = \"Path to preprocess.onnx. Used only for Moonshine models\")]\r\n    public string MoonshinePreprocessor { get; set; } = string.Empty;\r\n\r\n    [Option(\"moonshine-encoder\", Required = false, Default = \"\", HelpText = \"Path to encode.onnx. Used only for Moonshine models\")]\r\n    public string MoonshineEncoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"moonshine-uncached-decoder\", Required = false, Default = \"\", HelpText = \"Path to uncached_decode.onnx. Used only for Moonshine models\")]\r\n    public string MoonshineUncachedDecoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"moonshine-cached-decoder\", Required = false, Default = \"\", HelpText = \"Path to cached_decode.onnx. 
Used only for Moonshine models\")]\r\n    public string MoonshineCachedDecoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"tdnn-model\", Required = false, Default = \"\", HelpText = \"Path to tdnn yesno model\")]\r\n    public string TdnnModel { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, HelpText = \"Path to model.onnx. Used only for paraformer models\")]\r\n    public string Paraformer { get; set; } = string.Empty;\r\n\r\n    [Option(\"nemo-ctc\", Required = false, HelpText = \"Path to model.onnx. Used only for NeMo CTC models\")]\r\n    public string NeMoCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"zipformer-ctc\", Required = false, HelpText = \"Path to model.onnx. Used only for Zipformer CTC models\")]\r\n    public string ZipformerCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"dolphin-model\", Required = false, Default = \"\", HelpText = \"Path to dolphin ctc model\")]\r\n    public string DolphinModel { get; set; } = string.Empty;\r\n\r\n    [Option(\"telespeech-ctc\", Required = false, HelpText = \"Path to model.onnx. Used only for TeleSpeech CTC models\")]\r\n    public string TeleSpeechCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"wenet-ctc\", Required = false, HelpText = \"Path to model.onnx. Used only for Wenet CTC models\")]\r\n    public string WenetCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"omnilingual-asr-ctc\", Required = false, HelpText = \"Path to model.onnx. Used only for Omnilingual ASR CTC models\")]\r\n    public string Omnilingual { get; set; } = string.Empty;\r\n\r\n    [Option(\"medasr\", Required = false, HelpText = \"Path to model.onnx. Used only for Google MedASR CTC models\")]\r\n    public string MedAsr { get; set; } = string.Empty;\r\n\r\n    [Option(\"fire-red-asr-ctc\", Required = false, HelpText = \"Path to model.onnx. 
Used only for FireRedASR CTC models\")]\r\n    public string FireRedAsrCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"sense-voice-model\", Required = false, HelpText = \"Path to model.onnx. Used only for SenseVoice CTC models\")]\r\n    public string SenseVoiceModel { get; set; } = string.Empty;\r\n\r\n    [Option(\"sense-voice-use-itn\", Required = false, HelpText = \"1 to use inverse text normalization for sense voice.\")]\r\n    public int SenseVoiceUseItn { get; set; } = 1;\r\n\r\n    [Option(\"num-threads\", Required = false, Default = 1, HelpText = \"Number of threads for computation\")]\r\n    public int NumThreads { get; set; } = 1;\r\n\r\n    [Option(\"decoding-method\", Required = false, Default = \"greedy_search\",\r\n            HelpText = \"Valid decoding methods are: greedy_search, modified_beam_search\")]\r\n    public string DecodingMethod { get; set; } = \"greedy_search\";\r\n\r\n    [Option(\"rule-fsts\", Required = false, Default = \"\",\r\n            HelpText = \"If not empty, path to rule fst for inverse text normalization\")]\r\n    public string RuleFsts { get; set; } = string.Empty;\r\n\r\n    [Option(\"max-active-paths\", Required = false, Default = 4,\r\n        HelpText = @\"Used only when --decoding-method is modified_beam_search.\r\nIt specifies the number of active paths to keep during the search\")]\r\n    public int MaxActivePaths { get; set; } = 4;\r\n\r\n    [Option(\"hotwords-file\", Required = false, Default = \"\", HelpText = \"Path to hotwords.txt\")]\r\n    public string HotwordsFile { get; set; } = string.Empty;\r\n\r\n    [Option(\"hotwords-score\", Required = false, Default = 1.5F, HelpText = \"hotwords score\")]\r\n    public float HotwordsScore { get; set; } = 1.5F;\r\n\r\n    [Option(\"files\", Required = true, HelpText = \"Audio files for decoding\")]\r\n    public IEnumerable<string> Files { get; set; } = new string[] { };\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var parser = new 
CommandLine.Parser(with => with.HelpWriter = null);\r\n    var parserResult = parser.ParseArguments<Options>(args);\r\n\r\n    parserResult\r\n      .WithParsed<Options>(options => Run(options))\r\n      .WithNotParsed(errs => DisplayHelp(parserResult, errs));\r\n  }\r\n\r\n  private static void DisplayHelp<T>(ParserResult<T> result, IEnumerable<Error> errs)\r\n  {\r\n    var usage = @\"\r\n# Zipformer\r\n\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \\\r\n  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \\\r\n  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \\\r\n  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \\\r\n  --files ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \\\r\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav \\\r\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\r\nto download pre-trained non-streaming zipformer models.\r\n\r\n# Paraformer\r\n\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\r\n  --paraformer=./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\r\n  --files ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav \\\r\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/1.wav \\\r\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/2.wav \\\r\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/8k.wav\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\r\nto download pre-trained paraformer models.\r\n\r\n# NeMo CTC\r\n\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt \\\r\n  --nemo-ctc=./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\r\n  --num-threads=1 \\\r\n  --files 
./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\r\n  ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\r\n  ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\r\nto download pre-trained NeMo CTC models.\r\n\r\n# Whisper\r\n\r\ndotnet run \\\r\n  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \\\r\n  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \\\r\n  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \\\r\n  --files ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \\\r\n  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \\\r\n  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\r\nto download pre-trained whisper models.\r\n\r\n# Tdnn yesno\r\n\r\ndotnet run \\\r\n  --sample-rate=8000 \\\r\n  --feat-dim=23 \\\r\n  --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\r\n  --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\r\n  --files ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\r\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav \\\r\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_1_1_1.wav \\\r\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_1_0_0_1.wav \\\r\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_0_0_1.wav \\\r\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_1_1_0.wav\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/yesno/index.html\r\nto download pre-trained Tdnn models.\r\n\";\r\n\r\n    var helpText = HelpText.AutoBuild(result, h =>\r\n    {\r\n      h.AdditionalNewLineAfterOption = false;\r\n      h.Heading = usage;\r\n      h.Copyright = \"Copyright (c) 2023 Xiaomi Corporation\";\r\n      return HelpText.DefaultParsingErrorsHandler(result, h);\r\n    }, e => e);\r\n    
Console.WriteLine(helpText);\r\n  }\r\n\r\n  private static void Run(Options options)\r\n  {\r\n    OfflineRecognizerConfig config = new OfflineRecognizerConfig();\r\n    config.FeatConfig.SampleRate = options.SampleRate;\r\n    config.FeatConfig.FeatureDim = options.FeatureDim;\r\n\r\n    config.ModelConfig.Tokens = options.Tokens;\r\n\r\n    if (!string.IsNullOrEmpty(options.Encoder))\r\n    {\r\n      // this is a transducer model\r\n      config.ModelConfig.Transducer.Encoder = options.Encoder;\r\n      config.ModelConfig.Transducer.Decoder = options.Decoder;\r\n      config.ModelConfig.Transducer.Joiner = options.Joiner;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.Paraformer))\r\n    {\r\n      config.ModelConfig.Paraformer.Model = options.Paraformer;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.NeMoCtc))\r\n    {\r\n      config.ModelConfig.NeMoCtc.Model = options.NeMoCtc;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.DolphinModel))\r\n    {\r\n      config.ModelConfig.Dolphin.Model = options.DolphinModel;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.ZipformerCtc))\r\n    {\r\n      config.ModelConfig.ZipformerCtc.Model = options.ZipformerCtc;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.TeleSpeechCtc))\r\n    {\r\n      config.ModelConfig.TeleSpeechCtc = options.TeleSpeechCtc;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.WenetCtc))\r\n    {\r\n      config.ModelConfig.WenetCtc.Model = options.WenetCtc;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.Omnilingual))\r\n    {\r\n      config.ModelConfig.Omnilingual.Model = options.Omnilingual;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.MedAsr))\r\n    {\r\n      config.ModelConfig.MedAsr.Model = options.MedAsr;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.WhisperEncoder))\r\n    {\r\n      config.ModelConfig.Whisper.Encoder = options.WhisperEncoder;\r\n      config.ModelConfig.Whisper.Decoder = options.WhisperDecoder;\r\n    
  config.ModelConfig.Whisper.Language = options.WhisperLanguage;\r\n      config.ModelConfig.Whisper.Task = options.WhisperTask;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.TdnnModel))\r\n    {\r\n      config.ModelConfig.Tdnn.Model = options.TdnnModel;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.SenseVoiceModel))\r\n    {\r\n      config.ModelConfig.SenseVoice.Model = options.SenseVoiceModel;\r\n      config.ModelConfig.SenseVoice.UseInverseTextNormalization = options.SenseVoiceUseItn;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.MoonshinePreprocessor))\r\n    {\r\n      config.ModelConfig.Moonshine.Preprocessor = options.MoonshinePreprocessor;\r\n      config.ModelConfig.Moonshine.Encoder = options.MoonshineEncoder;\r\n      config.ModelConfig.Moonshine.UncachedDecoder = options.MoonshineUncachedDecoder;\r\n      config.ModelConfig.Moonshine.CachedDecoder = options.MoonshineCachedDecoder;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.FireRedAsrEncoder))\r\n    {\r\n      config.ModelConfig.FireRedAsr.Encoder = options.FireRedAsrEncoder;\r\n      config.ModelConfig.FireRedAsr.Decoder = options.FireRedAsrDecoder;\r\n    }\r\n    else if (!string.IsNullOrEmpty(options.FireRedAsrCtc))\r\n    {\r\n      config.ModelConfig.FireRedAsrCtc.Model = options.FireRedAsrCtc;\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine(\"Please provide a model\");\r\n      return;\r\n    }\r\n\r\n    config.ModelConfig.ModelType = options.ModelType;\r\n    config.DecodingMethod = options.DecodingMethod;\r\n    config.MaxActivePaths = options.MaxActivePaths;\r\n    config.HotwordsFile = options.HotwordsFile;\r\n    config.HotwordsScore = options.HotwordsScore;\r\n    config.RuleFsts = options.RuleFsts;\r\n\r\n    config.ModelConfig.Debug = 0;\r\n\r\n    var recognizer = new OfflineRecognizer(config);\r\n\r\n    var files = options.Files.ToArray();\r\n\r\n    // We create a separate stream for each file\r\n    var streams = new 
List<OfflineStream>();\r\n    streams.EnsureCapacity(files.Length);\r\n\r\n    for (int i = 0; i != files.Length; ++i)\r\n    {\r\n      var s = recognizer.CreateStream();\r\n\r\n      WaveReader waveReader = new WaveReader(files[i]);\r\n      s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n      streams.Add(s);\r\n    }\r\n\r\n    recognizer.Decode(streams);\r\n\r\n    // display results\r\n    for (int i = 0; i != files.Length; ++i)\r\n    {\r\n      var r = streams[i].Result;\r\n      Console.WriteLine(\"--------------------\");\r\n      Console.WriteLine(files[i]);\r\n      Console.WriteLine(\"Text: {0}\", r.Text);\r\n      Console.WriteLine(\"Tokens: [{0}]\", string.Join(\", \", r.Tokens));\r\n      if (r.Timestamps != null && r.Timestamps.Length > 0)\r\n      {\r\n        Console.Write(\"Timestamps: [\");\r\n        var sep = string.Empty;\r\n        for (int k = 0; k != r.Timestamps.Length; ++k)\r\n        {\r\n          Console.Write(\"{0}{1}\", sep, r.Timestamps[k].ToString(\"0.00\"));\r\n          sep = \", \";\r\n        }\r\n        Console.WriteLine(\"]\");\r\n      }\r\n    }\r\n    Console.WriteLine(\"--------------------\");\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/offline-decode-files.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>offline_decode_files</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"CommandLineParser\" Version=\"2.9.1\" />\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt \\\n  --dolphin-model=./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx \\\n  --num-threads=1 \\\n  --files ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n  ls -lh sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\nfi\n\ndotnet run \\\n  --num-threads=2 \\\n  --fire-red-asr-ctc=./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx \\\n  --tokens=./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt \\\n  --files ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\nfi\n\ndotnet run \\\n  --num-threads=2 \\\n  --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \\\n  --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \\\n  --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \\\n  --files ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-hotwords.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-zipformer-en-2023-04-01 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n  tar xvf sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n  rm sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-en-2023-04-01/hotwords_en.txt ]; then\ncat >./sherpa-onnx-zipformer-en-2023-04-01/hotwords_en.txt <<EOF\n▁ QUA R TER S\n▁FOR E VER\nEOF\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \\\n  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \\\n  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \\\n  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding-method=modified_beam_search \\\n  --files ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \\\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav\n\ndotnet run \\\n  --hotwords-file=./sherpa-onnx-zipformer-en-2023-04-01/hotwords_en.txt \\\n  --hotwords-score=2.0 \\\n  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \\\n  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \\\n  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \\\n  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding-method=modified_beam_search \\\n  --files ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \\\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav\n\n# 0.wav: QUARTER -> QUARTERS\n# 1.wav: FOR EVER -> FOREVER\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-medasr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\ndotnet run \\\n  --medasr=./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx \\\n  --tokens=./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt \\\n  --files ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\ndotnet run \\\n  --num-threads=2 \\\n  --moonshine-preprocessor=./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --moonshine-encoder=./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  --moonshine-uncached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --moonshine-cached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens=./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \\\n  --files ./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-nemo-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-nemo-ctc-en-conformer-medium ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n  tar xvf sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n  rm sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt \\\n  --nemo-ctc=./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n  --num-threads=1 \\\n  --files ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\n  ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\n  ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-omnilingual-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\ndotnet run \\\n  --omnilingual-asr-ctc=./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx \\\n  --tokens=./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt \\\n  --files ./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-paraformer-itn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-paraformer-zh-2023-09-14 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --paraformer=./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --rule-fsts=./itn_zh_number.fst \\\n  --num-threads=2 \\\n  --files ./itn-zh-number.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-paraformer-zh-2023-09-14 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt \\\n  --paraformer=./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --num-threads=2 \\\n  --files ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav \\\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/1.wav \\\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/2.wav \\\n  ./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/8k.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-sense-voice-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\ndotnet run \\\n  --sense-voice-model=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --files ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-tdnn-yesno.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-tdnn-yesno ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-tdnn-yesno.tar.bz2\n  tar xvf sherpa-onnx-tdnn-yesno.tar.bz2\n  rm sherpa-onnx-tdnn-yesno.tar.bz2\nfi\n\ndotnet run \\\n  --sample-rate=8000 \\\n  --feat-dim=23 \\\n  --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n  --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n  --files ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_1_1_1.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_1_0_0_1.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_0_0_1.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_1_0_1_1_0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-telespeech-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nfi\n\ndotnet run \\\n  --telespeech-ctc=./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx \\\n  --tokens=./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt \\\n  --model-type=telespeech_ctc \\\n  --files ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-wenet-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nfi\n\ndotnet run \\\n  --wenet-ctc=./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx \\\n  --tokens=./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt \\\n  --files ./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-whisper-large-v3.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./large-v3-encoder.int8.onnx ]; then\n  git lfs install\n\n  git clone https://huggingface.co/csukuangfj/sherpa-onnx-whisper-large-v3\n\n  ls -lh sherpa-onnx-whisper-large-v3\n  cp -v sherpa-onnx-whisper-large-v3/*.onnx .\n  cp -v sherpa-onnx-whisper-large-v3/*.weights .\n  ls -lh\nfi\n\ndotnet run \\\n  --num-threads=2 \\\n  --whisper-encoder=./large-v3-encoder.int8.onnx \\\n  --whisper-decoder=./large-v3-decoder.int8.onnx \\\n  --tokens=./sherpa-onnx-whisper-large-v3/large-v3-tokens.txt \\\n  --files ./sherpa-onnx-whisper-large-v3/test_wavs/0.wav \\\n  ./sherpa-onnx-whisper-large-v3/test_wavs/1.wav \\\n  ./sherpa-onnx-whisper-large-v3/test_wavs/8k.wav\n\ndotnet run \\\n  --num-threads=2 \\\n  --whisper-encoder=./large-v3-encoder.onnx \\\n  --whisper-decoder=./large-v3-decoder.onnx \\\n  --tokens=./sherpa-onnx-whisper-large-v3/large-v3-tokens.txt \\\n  --files ./sherpa-onnx-whisper-large-v3/test_wavs/0.wav \\\n  ./sherpa-onnx-whisper-large-v3/test_wavs/1.wav \\\n  ./sherpa-onnx-whisper-large-v3/test_wavs/8k.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-whisper-tiny.en ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\ndotnet run \\\n  --num-threads=2 \\\n  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \\\n  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \\\n  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \\\n  --files ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \\\n  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \\\n  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt \\\n  --zipformer-ctc=./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx \\\n  --num-threads=1 \\\n  --files ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/offline-decode-files/run-zipformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-zipformer-en-2023-04-01 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n  tar xvf sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n  rm sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nfi\n\ndotnet run \\\n  --tokens=./sherpa-onnx-zipformer-en-2023-04-01/tokens.txt \\\n  --encoder=./sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx \\\n  --decoder=./sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx \\\n  --joiner=./sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx \\\n  --num-threads=2 \\\n  --decoding-method=modified_beam_search \\\n  --files ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav \\\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav \\\n  ./sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav\n"
  },
  {
    "path": "dotnet-examples/offline-punctuation/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\n//\n// This file shows how to add punctuations to text.\n//\n// 1. Download a model from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n//\n// 3. Now run it\n//\n// dotnet run\n\nusing SherpaOnnx;\n\nclass OfflinePunctuationDemo\n{\n  static void Main(string[] args)\n  {\n    var config = new OfflinePunctuationConfig();\n    config.Model.CtTransformer = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\";\n    config.Model.Debug = 1;\n    config.Model.NumThreads = 1;\n    var punct = new OfflinePunctuation(config);\n\n    var textList = new string[] {\n        \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n        \"我们都是木头人不会说话不会动\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n    };\n\n    Console.WriteLine(\"---------\");\n    foreach (var text in textList)\n    {\n      string textWithPunct = punct.AddPunct(text);\n      Console.WriteLine(\"Input text: {0}\", text);\n      Console.WriteLine(\"Output text: {0}\", textWithPunct);\n      Console.WriteLine(\"---------\");\n    }\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/offline-punctuation/offline-punctuation.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>offline_punctuation</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/offline-punctuation/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -e ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/offline-speaker-diarization/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n\r\n// This file shows how to use sherpa-onnx C# API for speaker diarization\r\n/*\r\nUsage:\r\n\r\nStep 1: Download a speaker segmentation model\r\n\r\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\r\nfor a list of available models. The following is an example\r\n\r\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\r\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\r\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\r\n\r\nStep 2: Download a speaker embedding extractor model\r\n\r\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\r\nfor a list of available models. The following is an example\r\n\r\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\r\n\r\nStep 3. Download test wave files\r\n\r\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\r\nfor a list of available test wave files. The following is an example\r\n\r\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\r\n\r\nStep 4. 
Run it\r\n\r\n  dotnet run\r\n*/\r\n\r\nusing SherpaOnnx;\r\n\r\nclass OfflineSpeakerDiarizationDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new OfflineSpeakerDiarizationConfig();\r\n    config.Segmentation.Pyannote.Model = \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\";\r\n    config.Embedding.Model = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\r\n\r\n    // the test wave ./0-four-speakers-zh.wav has 4 speakers, so\r\n    // we set num_clusters to 4\r\n    //\r\n    config.Clustering.NumClusters = 4;\r\n    // If you don't know the number of speakers in the test wave file, please\r\n    // use\r\n    // config.Clustering.Threshold = 0.5; // You need to tune this threshold\r\n    var sd = new OfflineSpeakerDiarization(config);\r\n\r\n    var testWaveFile = \"./0-four-speakers-zh.wav\";\r\n    var waveReader = new WaveReader(testWaveFile);\r\n    if (sd.SampleRate != waveReader.SampleRate)\r\n    {\r\n      Console.WriteLine($\"Expected sample rate: {sd.SampleRate}. Given: {waveReader.SampleRate}\");\r\n      return;\r\n    }\r\n\r\n    Console.WriteLine(\"Started\");\r\n\r\n     // var segments = sd.Process(waveReader.Samples); // this one is also ok\r\n\r\n    var progressCallback = (int numProcessedChunks, int numTotalChunks, IntPtr arg) =>\r\n    {\r\n      var progress = 100.0F * numProcessedChunks / numTotalChunks;\r\n      Console.WriteLine(\"Progress {0}%\", string.Format(\"{0:0.00}\", progress));\r\n      return 0;\r\n    };\r\n\r\n    var callback = new OfflineSpeakerDiarizationProgressCallback(progressCallback);\r\n    var segments = sd.ProcessWithCallback(waveReader.Samples, callback, IntPtr.Zero);\r\n\r\n    foreach (var s in segments)\r\n    {\r\n      Console.WriteLine(\"{0} -- {1} speaker_{2}\", string.Format(\"{0:0.00}\", s.Start), string.Format(\"{0:0.00}\", s.End), s.Speaker);\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/offline-speaker-diarization/offline-speaker-diarization.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>offline_speaker_diarization</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/offline-speaker-diarization/run.sh",
    "content": "#!/usr/bin/env bash\n\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/offline-tts/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming TTS model for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing CommandLine;\r\nusing CommandLine.Text;\r\nusing SherpaOnnx;\r\n\r\nclass OfflineTtsDemo\r\n{\r\n  class Options\r\n  {\r\n    [Option(\"tts-rule-fsts\", Required = false, Default = \"\", HelpText = \"path to rule.fst\")]\r\n    public string RuleFsts { get; set; } = string.Empty;\r\n\r\n    [Option(\"tts-rule-fars\", Required = false, Default = \"\", HelpText = \"path to rule.far\")]\r\n    public string RuleFars { get; set; } = string.Empty;\r\n\r\n    [Option(\"data-dir\", Required = false, Default = \"\", HelpText = \"Path to the directory containing dict for espeak-ng.\")]\r\n    public string DataDir { get; set; } = string.Empty;\r\n\r\n    [Option(\"length-scale\", Required = false, Default = 1, HelpText = \"speech speed. 
Larger->Slower; Smaller->faster\")]\r\n    public float LengthScale { get; set; } = 1;\r\n\r\n    [Option(\"noise-scale\", Required = false, Default = 0.667f, HelpText = \"noise_scale for VITS or Matcha models\")]\r\n    public float NoiseScale { get; set; } = 0.667F;\r\n\r\n    [Option(\"vits-noise-scale-w\", Required = false, Default = 0.8F, HelpText = \"noise_scale_w for VITS models\")]\r\n    public float NoiseScaleW { get; set; } = 0.8F;\r\n\r\n    [Option(\"lexicon\", Required = false, Default = \"\", HelpText = \"Path to lexicon.txt\")]\r\n    public string Lexicon { get; set; } = string.Empty;\r\n\r\n    [Option(\"tokens\", Required = true, Default = \"\", HelpText = \"Path to tokens.txt\")]\r\n    public string Tokens { get; set; } = string.Empty;\r\n\r\n    [Option(\"tts-max-num-sentences\", Required = false, Default = 1, HelpText = \"Maximum number of sentences that we process at a time.\")]\r\n    public int MaxNumSentences { get; set; } = 1;\r\n\r\n    [Option(Required = false, Default = 0, HelpText = \"1 to show debug messages.\")]\r\n    public int Debug { get; set; } = 0;\r\n\r\n    [Option(\"vits-model\", Required = false, HelpText = \"Path to VITS model\")]\r\n    public string Model { get; set; } = string.Empty;\r\n\r\n    [Option(\"matcha-acoustic-model\", Required = false, HelpText = \"Path to the acoustic model of Matcha\")]\r\n    public string AcousticModel { get; set; } = \"\";\r\n\r\n    [Option(\"matcha-vocoder\", Required = false, HelpText = \"Path to the vocoder model of Matcha\")]\r\n    public string Vocoder { get; set; } = \"\";\r\n\r\n    [Option(\"sid\", Required = false, Default = 0, HelpText = \"Speaker ID\")]\r\n    public int SpeakerId { get; set; } = 0;\r\n\r\n    [Option(\"text\", Required = true, HelpText = \"Text to synthesize\")]\r\n    public string Text { get; set; } = string.Empty;\r\n\r\n    [Option(\"output-filename\", Required = true, Default = \"./generated.wav\", HelpText = \"Path to save the generated 
audio\")]\r\n    public string OutputFilename { get; set; } = \"./generated.wav\";\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var parser = new Parser(with => with.HelpWriter = null);\r\n    var parserResult = parser.ParseArguments<Options>(args);\r\n\r\n    parserResult\r\n      .WithParsed<Options>(options => Run(options))\r\n      .WithNotParsed(errs => DisplayHelp(parserResult, errs));\r\n  }\r\n\r\n  private static void DisplayHelp<T>(ParserResult<T> result, IEnumerable<Error> errs)\r\n  {\r\n    var usage = @\"\r\n# matcha-icefall-zh-baker\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\r\ntar xvf matcha-icefall-zh-baker.tar.bz2\r\nrm matcha-icefall-zh-baker.tar.bz2\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\r\n\r\ndotnet run \\\r\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\r\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\r\n  --lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\r\n  --tokens=./matcha-icefall-zh-baker/tokens.txt \\\r\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\r\n  --debug=1 \\\r\n  --output-filename=./matcha-zh.wav \\\r\n  --text='某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。'\r\n\r\n# matcha-icefall-en_US-ljspeech\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\r\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\r\nrm matcha-icefall-en_US-ljspeech.tar.bz2\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\r\n\r\ndotnet run \\\r\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\r\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\r\n  --tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\r\n  
--data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\r\n  --debug=1 \\\r\n  --output-filename=./matcha-en.wav \\\r\n  --text='Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.'\r\n\r\n# vits-aishell3\r\n\r\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\r\ntar xvf vits-icefall-zh-aishell3.tar.bz2\r\n\r\ndotnet run \\\r\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\r\n  --tokens=./vits-icefall-zh-aishell3/tokens.txt \\\r\n  --lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\r\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\r\n  --tts-rule-fars=./vits-icefall-zh-aishell3/rule.far \\\r\n  --sid=66 \\\r\n  --debug=1 \\\r\n  --output-filename=./aishell3-66.wav \\\r\n  --text=这是一个语音合成测试\r\n\r\n# Piper models\r\n\r\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\r\ntar xf vits-piper-en_US-amy-low.tar.bz2\r\n\r\ndotnet run \\\r\n  --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\r\n  --tokens=./vits-piper-en_US-amy-low/tokens.txt \\\r\n  --data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\r\n  --debug=1 \\\r\n  --output-filename=./amy.wav \\\r\n  --text='This is a text to speech application in dotnet with Next Generation Kaldi'\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\nto download more models.\r\n\";\r\n\r\n    var helpText = HelpText.AutoBuild(result, h =>\r\n    {\r\n      h.AdditionalNewLineAfterOption = false;\r\n      h.Heading = usage;\r\n      h.Copyright = \"Copyright (c) 2024 Xiaomi Corporation\";\r\n      return HelpText.DefaultParsingErrorsHandler(result, h);\r\n    }, e => e);\r\n    
Console.WriteLine(helpText);\r\n  }\r\n\r\n  private static void Run(Options options)\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Vits.Model = options.Model;\r\n    config.Model.Vits.Lexicon = options.Lexicon;\r\n    config.Model.Vits.Tokens = options.Tokens;\r\n    config.Model.Vits.DataDir = options.DataDir;\r\n    config.Model.Vits.NoiseScale = options.NoiseScale;\r\n    config.Model.Vits.NoiseScaleW = options.NoiseScaleW;\r\n    config.Model.Vits.LengthScale = options.LengthScale;\r\n\r\n    config.Model.Matcha.AcousticModel = options.AcousticModel;\r\n    config.Model.Matcha.Vocoder = options.Vocoder;\r\n    config.Model.Matcha.Lexicon = options.Lexicon;\r\n    config.Model.Matcha.Tokens = options.Tokens;\r\n    config.Model.Matcha.DataDir = options.DataDir;\r\n    config.Model.Matcha.NoiseScale = options.NoiseScale;\r\n    config.Model.Matcha.LengthScale = options.LengthScale;\r\n\r\n    config.Model.NumThreads = 1;\r\n    config.Model.Debug = options.Debug;\r\n    config.Model.Provider = \"cpu\";\r\n    config.RuleFsts = options.RuleFsts;\r\n    config.RuleFars = options.RuleFars;\r\n    config.MaxNumSentences = options.MaxNumSentences;\n\n    var tts = new OfflineTts(config);\n    var speed = 1.0f / options.LengthScale;\n    var sid = options.SpeakerId;\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n    genConfig.SilenceScale = 0.2f;\n    var audio = tts.GenerateWithConfig(options.Text, genConfig, null);\n    var ok = audio.SaveToWaveFile(options.OutputFilename);\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Saved to {options.OutputFilename}\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {options.OutputFilename}\");\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/offline-tts/offline-tts.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>offline_tts</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"CommandLineParser\" Version=\"2.9.1\" />\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/offline-tts/run-aishell3.sh",
    "content": "#!/usr/bin/env bash\nset -ex\nif [ ! -f ./vits-zh-aishell3/vits-aishell3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n  tar xvf vits-icefall-zh-aishell3.tar.bz2\n  rm vits-icefall-zh-aishell3.tar.bz2\nfi\n\ndotnet run \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\n  --tts-rule-fars=./vits-icefall-zh-aishell3/rule.far \\\n  --sid=66 \\\n  --debug=1 \\\n  --output-filename=./aishell3-66.wav \\\n  --text=\"这是一个语音合成测试, 写于公元 2024 年 1 月 28 号, 23点27分，星期天。长沙长大，去过长白山和长安街。行行出状元。行行，银行行长，行业。\"\n"
  },
  {
    "path": "dotnet-examples/offline-tts/run-hf-fanchen.sh",
    "content": "#!/usr/bin/env bash\nset -ex\nif [ ! -f ./vits-zh-hf-fanchen-C/vits-zh-hf-fanchen-C.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-hf-fanchen-C.tar.bz2\n  tar xf vits-zh-hf-fanchen-C.tar.bz2\n  rm vits-zh-hf-fanchen-C.tar.bz2\nfi\n\ndotnet run \\\n  --vits-model=./vits-zh-hf-fanchen-C/vits-zh-hf-fanchen-C.onnx \\\n  --tokens=./vits-zh-hf-fanchen-C/tokens.txt \\\n  --lexicon=./vits-zh-hf-fanchen-C/lexicon.txt \\\n  --tts-rule-fsts=./vits-zh-hf-fanchen-C/phone.fst,./vits-zh-hf-fanchen-C/date.fst,./vits-zh-hf-fanchen-C/number.fst \\\n  --sid=100 \\\n  --debug=1 \\\n  --output-filename=./fanchen-100.wav \\\n  --text=\"这是一个语音合成测试, 写于公元2024年4月26号, 11点05分，星期5。小米的使命是，始终坚持做'感动人心、价格厚道'的好产品，让全球每个人都能享受科技带来的美好生活。\"\n"
  },
  {
    "path": "dotnet-examples/offline-tts/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ndotnet run \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./matcha-en.wav \\\n  --text='Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.'\n"
  },
  {
    "path": "dotnet-examples/offline-tts/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\n\ndotnet run \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n  --debug=1 \\\n  --output-filename=./matcha-zh.wav \\\n  --text=\"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n"
  },
  {
    "path": "dotnet-examples/offline-tts/run-piper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\nif [ ! -f ./vits-piper-en_US-amy-low/en_US-amy-low.onnx ]; then\n  # wget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  curl -OL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  tar xf vits-piper-en_US-amy-low.tar.bz2\n  rm vits-piper-en_US-amy-low.tar.bz2\nfi\n\ndotnet run \\\n  --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n  --tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n  --data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./amy.wav \\\n  --text=\"This is a text to speech application in dotnet with Next Generation Kaldi\"\n\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/.gitignore",
    "content": "run-piper.sh\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming TTS model for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\n//\r\n// Note that you need a speaker to run this file since it will play\r\n// the generated audio back as it is being generated.\r\n\r\nusing CommandLine;\r\nusing CommandLine.Text;\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Collections.Concurrent;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass OfflineTtsPlayDemo\r\n{\r\n  class Options\r\n  {\r\n    [Option(\"tts-rule-fsts\", Required = false, Default = \"\", HelpText = \"path to rule.fst\")]\r\n    public string RuleFsts { get; set; } = string.Empty;\r\n\r\n    [Option(\"tts-rule-fars\", Required = false, Default = \"\", HelpText = \"path to rule.far\")]\r\n    public string RuleFars { get; set; } = string.Empty;\r\n\r\n    [Option(\"data-dir\", Required = false, Default = \"\", HelpText = \"Path to the directory containing dict for espeak-ng.\")]\r\n    public string DataDir { get; set; } = string.Empty;\r\n\r\n    [Option(\"length-scale\", Required = false, Default = 1, HelpText = \"speech speed. 
Larger->Slower; Smaller->faster\")]\r\n    public float LengthScale { get; set; } = 1;\r\n\r\n    [Option(\"noise-scale\", Required = false, Default = 0.667f, HelpText = \"noise_scale for VITS or Matcha models\")]\r\n    public float NoiseScale { get; set; } = 0.667F;\r\n\r\n    [Option(\"vits-noise-scale-w\", Required = false, Default = 0.8F, HelpText = \"noise_scale_w for VITS models\")]\r\n    public float NoiseScaleW { get; set; } = 0.8F;\r\n\r\n    [Option(\"lexicon\", Required = false, Default = \"\", HelpText = \"Path to lexicon.txt\")]\r\n    public string Lexicon { get; set; } = string.Empty;\r\n\r\n    [Option(\"tokens\", Required = true, Default = \"\", HelpText = \"Path to tokens.txt\")]\r\n    public string Tokens { get; set; } = string.Empty;\r\n\r\n    [Option(\"tts-max-num-sentences\", Required = false, Default = 1, HelpText = \"Maximum number of sentences that we process at a time.\")]\r\n    public int MaxNumSentences { get; set; } = 1;\r\n\r\n    [Option(Required = false, Default = 0, HelpText = \"1 to show debug messages.\")]\r\n    public int Debug { get; set; } = 0;\r\n\r\n    [Option(\"vits-model\", Required = false, HelpText = \"Path to VITS model\")]\r\n    public string Model { get; set; } = string.Empty;\r\n\r\n    [Option(\"matcha-acoustic-model\", Required = false, HelpText = \"Path to the acoustic model of Matcha\")]\r\n    public string AcousticModel { get; set; } = \"\";\r\n\r\n    [Option(\"matcha-vocoder\", Required = false, HelpText = \"Path to the vocoder model of Matcha\")]\r\n    public string Vocoder { get; set; } = \"\";\r\n\r\n    [Option(\"sid\", Required = false, Default = 0, HelpText = \"Speaker ID\")]\r\n    public int SpeakerId { get; set; } = 0;\r\n\r\n    [Option(\"text\", Required = true, HelpText = \"Text to synthesize\")]\r\n    public string Text { get; set; } = string.Empty;\r\n\r\n    [Option(\"output-filename\", Required = true, Default = \"./generated.wav\", HelpText = \"Path to save the generated 
audio\")]\r\n    public string OutputFilename { get; set; } = \"./generated.wav\";\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var parser = new CommandLine.Parser(with => with.HelpWriter = null);\r\n    var parserResult = parser.ParseArguments<Options>(args);\r\n\r\n    parserResult\r\n      .WithParsed<Options>(options => Run(options))\r\n      .WithNotParsed(errs => DisplayHelp(parserResult, errs));\r\n  }\r\n\r\n  private static void DisplayHelp<T>(ParserResult<T> result, IEnumerable<Error> errs)\r\n  {\r\n    string usage = @\"\r\n# matcha-icefall-zh-baker\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\r\ntar xvf matcha-icefall-zh-baker.tar.bz2\r\nrm matcha-icefall-zh-baker.tar.bz2\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\r\n\r\ndotnet run \\\r\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\r\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\r\n  --lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\r\n  --tokens=./matcha-icefall-zh-baker/tokens.txt \\\r\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\r\n  --debug=1 \\\r\n  --output-filename=./matcha-zh.wav \\\r\n  --text='某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。'\r\n\r\n# matcha-icefall-en_US-ljspeech\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\r\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\r\nrm matcha-icefall-en_US-ljspeech.tar.bz2\r\n\r\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\r\n\r\ndotnet run \\\r\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\r\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\r\n  --tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\r\n  
--data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\r\n  --debug=1 \\\r\n  --output-filename=./matcha-en.wav \\\r\n  --text='Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.'\r\n\r\n# vits-aishell3\r\n\r\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-aishell3.tar.bz2\r\ntar xf vits-zh-aishell3.tar.bz2\r\n\r\ndotnet run \\\r\n  --vits-model=./vits-zh-aishell3/vits-aishell3.onnx \\\r\n  --tokens=./vits-zh-aishell3/tokens.txt \\\r\n  --lexicon=./vits-zh-aishell3/lexicon.txt \\\r\n  --tts-rule-fsts=./vits-zh-aishell3/rule.fst \\\r\n  --sid=66 \\\r\n  --debug=1 \\\r\n  --output-filename=./aishell3-66.wav \\\r\n  --text=这是一个语音合成测试\r\n\r\n# Piper models\r\n\r\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\r\ntar xf vits-piper-en_US-amy-low.tar.bz2\r\n\r\ndotnet run \\\r\n  --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\r\n  --tokens=./vits-piper-en_US-amy-low/tokens.txt \\\r\n  --data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\r\n  --debug=1 \\\r\n  --output-filename=./amy.wav \\\r\n  --text='This is a text to speech application in dotnet with Next Generation Kaldi'\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\r\nto download more models.\r\n\";\r\n\r\n    var helpText = HelpText.AutoBuild(result, h =>\r\n    {\r\n      h.AdditionalNewLineAfterOption = false;\r\n      h.Heading = usage;\r\n      h.Copyright = \"Copyright (c) 2024 Xiaomi Corporation\";\r\n      return HelpText.DefaultParsingErrorsHandler(result, h);\r\n    }, e => e);\r\n    Console.WriteLine(helpText);\r\n  }\r\n\r\n  private static void Run(Options options)\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n\r\n    config.Model.Vits.Model = options.Model;\r\n    
config.Model.Vits.Lexicon = options.Lexicon;\r\n    config.Model.Vits.Tokens = options.Tokens;\r\n    config.Model.Vits.DataDir = options.DataDir;\r\n    config.Model.Vits.NoiseScale = options.NoiseScale;\r\n    config.Model.Vits.NoiseScaleW = options.NoiseScaleW;\r\n    config.Model.Vits.LengthScale = options.LengthScale;\r\n\r\n    config.Model.Matcha.AcousticModel = options.AcousticModel;\r\n    config.Model.Matcha.Vocoder = options.Vocoder;\r\n    config.Model.Matcha.Lexicon = options.Lexicon;\r\n    config.Model.Matcha.Tokens = options.Tokens;\r\n    config.Model.Matcha.DataDir = options.DataDir;\r\n    config.Model.Matcha.NoiseScale = options.NoiseScale;\r\n    config.Model.Matcha.LengthScale = options.LengthScale;\r\n\r\n    config.Model.NumThreads = 1;\r\n    config.Model.Debug = options.Debug;\r\n    config.Model.Provider = \"cpu\";\r\n    config.RuleFsts = options.RuleFsts;\r\n    config.MaxNumSentences = options.MaxNumSentences;\n\n    var tts = new OfflineTts(config);\n    var speed = 1.0f / options.LengthScale;\n    var sid = options.SpeakerId;\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = sid;\n    genConfig.Speed = speed;\n    genConfig.SilenceScale = 0.2f;\n\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\n    PortAudio.Initialize();\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max output channels: {deviceInfo.maxOutputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultOutputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default output device 
found. Please use ../offline-tts instead\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default output device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowOutputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    // https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview\r\n    var dataItems = new BlockingCollection<float[]>();\r\n\r\n    var myCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\n      float[] data = new float[n];\n\n      Marshal.Copy(samples, data, 0, n);\n\r\n      dataItems.Add(data);\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var playFinished = false;\r\n\r\n    float[]? 
lastSampleArray = null;\r\n    int lastIndex = 0; // not played\r\n\r\n    PortAudioSharp.Stream.Callback playCallback = (IntPtr input, IntPtr output,\r\n        UInt32 frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      if (dataItems.IsCompleted && lastSampleArray == null && lastIndex == 0)\r\n      {\r\n        Console.WriteLine($\"Finished playing\");\r\n        playFinished = true;\r\n        return StreamCallbackResult.Complete;\r\n      }\r\n\r\n      int expected = Convert.ToInt32(frameCount);\r\n      int i = 0;\r\n\r\n      while ((lastSampleArray != null || dataItems.Count != 0) && (i < expected))\r\n      {\r\n        int needed = expected - i;\r\n\r\n        if (lastSampleArray != null)\r\n        {\r\n          int remaining = lastSampleArray.Length - lastIndex;\r\n          if (remaining >= needed)\r\n          {\r\n            float[] this_block = lastSampleArray.Skip(lastIndex).Take(needed).ToArray();\r\n            lastIndex += needed;\r\n            if (lastIndex == lastSampleArray.Length)\r\n            {\r\n              lastSampleArray = null;\r\n              lastIndex = 0;\r\n            }\r\n\r\n            Marshal.Copy(this_block, 0, IntPtr.Add(output, i * sizeof(float)), needed);\r\n            return StreamCallbackResult.Continue;\r\n          }\r\n\r\n          float[] this_block2 = lastSampleArray.Skip(lastIndex).Take(remaining).ToArray();\r\n          lastIndex = 0;\r\n          lastSampleArray = null;\r\n\r\n          Marshal.Copy(this_block2, 0, IntPtr.Add(output, i * sizeof(float)), remaining);\r\n          i += remaining;\r\n          continue;\r\n        }\r\n\r\n        if (dataItems.Count != 0)\r\n        {\r\n          lastSampleArray = dataItems.Take();\r\n          lastIndex = 0;\r\n        }\r\n      }\r\n\r\n      if (i < expected)\r\n      {\r\n        int sizeInBytes = (expected - i) * 4;\r\n        
Marshal.Copy(new byte[sizeInBytes], 0, IntPtr.Add(output, i * sizeof(float)), sizeInBytes);\r\n      }\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: null, outParams: param, sampleRate: tts.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: playCallback,\r\n        userData: IntPtr.Zero\r\n        );\r\n\r\n    stream.Start();\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(myCallback);\n\n    var audio = tts.GenerateWithConfig(options.Text, genConfig, callback);\n    var ok = audio.SaveToWaveFile(options.OutputFilename);\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {options.OutputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {options.OutputFilename}\");\r\n    }\r\n    dataItems.CompleteAdding();\r\n\r\n    while (!playFinished)\r\n    {\r\n      Thread.Sleep(100); // 100ms\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/offline-tts-play.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>offline_tts_play</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"CommandLineParser\" Version=\"2.9.1\" />\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/run-hf-fanchen.sh",
    "content": "#!/usr/bin/env bash\nset -ex\nif [ ! -f ./vits-zh-hf-fanchen-C/vits-zh-hf-fanchen-C.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-hf-fanchen-C.tar.bz2\n  tar xf vits-zh-hf-fanchen-C.tar.bz2\n  rm vits-zh-hf-fanchen-C.tar.bz2\nfi\n\ndotnet run \\\n  --vits-model=./vits-zh-hf-fanchen-C/vits-zh-hf-fanchen-C.onnx \\\n  --tokens=./vits-zh-hf-fanchen-C/tokens.txt \\\n  --lexicon=./vits-zh-hf-fanchen-C/lexicon.txt \\\n  --tts-rule-fsts=./vits-zh-hf-fanchen-C/phone.fst,./vits-zh-hf-fanchen-C/date.fst,./vits-zh-hf-fanchen-C/number.fst \\\n  --sid=100 \\\n  --debug=1 \\\n  --output-filename=./fanchen-100.wav \\\n  --text=\"这是一个语音合成测试, 写于公元2024年4月26号, 11点05分，星期5。小米的使命是，始终坚持做'感动人心、价格厚道'的好产品，让全球每个人都能享受科技带来的美好生活。\"\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ndotnet run \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./matcha-en.wav \\\n  --text='Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.'\n"
  },
  {
    "path": "dotnet-examples/offline-tts-play/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\n\ndotnet run \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n  --debug=1 \\\n  --output-filename=./matcha-zh.wav \\\n  --text=\"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/Program.cs",
    "content": "﻿// Copyright (c)  2023  Xiaomi Corporation\r\n// Copyright (c)  2023 by manyeyes\r\n//\r\n// This file shows how to use a streaming model to decode files\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html\r\n// to download streaming models\r\n\r\nusing CommandLine;\r\nusing CommandLine.Text;\r\nusing SherpaOnnx;\r\n\r\nclass OnlineDecodeFiles\r\n{\r\n  class Options\r\n  {\r\n    [Option(Required = true, HelpText = \"Path to tokens.txt\")]\r\n    public string Tokens { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, Default = \"cpu\", HelpText = \"Provider, e.g., cpu, coreml\")]\r\n    public string Provider { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer encoder.onnx\")]\r\n    public string Encoder { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer decoder.onnx\")]\r\n    public string Decoder { get; set; } = string.Empty;\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer joiner.onnx\")]\r\n    public string Joiner { get; set; } = string.Empty;\r\n\r\n    [Option(\"paraformer-encoder\", Required = false, HelpText = \"Path to paraformer encoder.onnx\")]\r\n    public string ParaformerEncoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"paraformer-decoder\", Required = false, HelpText = \"Path to paraformer decoder.onnx\")]\r\n    public string ParaformerDecoder { get; set; } = string.Empty;\r\n\r\n    [Option(\"zipformer2-ctc\", Required = false, HelpText = \"Path to zipformer2 CTC onnx model\")]\r\n    public string Zipformer2Ctc { get; set; } = string.Empty;\r\n\r\n    [Option(\"t-one-ctc\", Required = false, HelpText = \"Path to T-one CTC onnx model\")]\r\n    public string ToneCtc { get; set; } = string.Empty;\r\n\r\n    [Option(\"num-threads\", Required = false, Default = 1, HelpText = \"Number of threads for 
computation\")]\r\n    public int NumThreads { get; set; } = 1;\r\n\r\n    [Option(\"decoding-method\", Required = false, Default = \"greedy_search\",\r\n            HelpText = \"Valid decoding methods are: greedy_search, modified_beam_search\")]\r\n    public string DecodingMethod { get; set; } = \"greedy_search\";\r\n\r\n    [Option(Required = false, Default = false, HelpText = \"True to show model info during loading\")]\r\n    public bool Debug { get; set; } = false;\r\n\r\n    [Option(\"sample-rate\", Required = false, Default = 16000, HelpText = \"Sample rate of the data used to train the model\")]\r\n    public int SampleRate { get; set; } = 16000;\r\n\r\n    [Option(\"max-active-paths\", Required = false, Default = 4,\r\n        HelpText = @\"Used only when --decoding-method is modified_beam_search.\r\nIt specifies the number of active paths to keep during the search\")]\r\n    public int MaxActivePaths { get; set; } = 4;\r\n\r\n    [Option(\"enable-endpoint\", Required = false, Default = false,\r\n        HelpText = \"True to enable endpoint detection.\")]\r\n    public bool EnableEndpoint { get; set; } = false;\r\n\r\n    [Option(\"rule1-min-trailing-silence\", Required = false, Default = 2.4F,\r\n        HelpText = @\"An endpoint is detected if trailing silence in seconds is\r\nlarger than this value even if nothing has been decoded. Used only when --enable-endpoint is true.\")]\r\n    public float Rule1MinTrailingSilence { get; set; } = 2.4F;\r\n\r\n    [Option(\"rule2-min-trailing-silence\", Required = false, Default = 1.2F,\r\n        HelpText = @\"An endpoint is detected if trailing silence in seconds is\r\nlarger than this value after something that is not blank has been decoded. 
Used\r\nonly when --enable-endpoint is true.\")]\r\n    public float Rule2MinTrailingSilence { get; set; }  = 1.2F;\r\n\r\n    [Option(\"rule3-min-utterance-length\", Required = false, Default = 20.0F,\r\n        HelpText = @\"An endpoint is detected if the utterance in seconds is\r\nlarger than this value. Used only when --enable-endpoint is true.\")]\r\n    public float Rule3MinUtteranceLength { get; set; } = 20.0F;\r\n\r\n    [Option(\"hotwords-file\", Required = false, Default = \"\", HelpText = \"Path to hotwords.txt\")]\r\n    public string HotwordsFile { get; set; } = string.Empty;\r\n\r\n    [Option(\"hotwords-score\", Required = false, Default = 1.5F, HelpText = \"hotwords score\")]\r\n    public float HotwordsScore { get; set; } = 1.5F;\r\n\r\n    [Option(\"rule-fsts\", Required = false, Default = \"\",\r\n            HelpText = \"If not empty, path to rule fst for inverse text normalization\")]\r\n    public string RuleFsts { get; set; } = string.Empty;\r\n\r\n    [Option(\"files\", Required = true, HelpText = \"Audio files for decoding\")]\r\n    public IEnumerable<string> Files { get; set; } = new string[] {};\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var parser = new CommandLine.Parser(with => with.HelpWriter = null);\r\n    var parserResult = parser.ParseArguments<Options>(args);\r\n\r\n    parserResult\r\n      .WithParsed<Options>(options => Run(options))\r\n      .WithNotParsed(errs => DisplayHelp(parserResult, errs));\r\n  }\r\n\r\n  private static void DisplayHelp<T>(ParserResult<T> result, IEnumerable<Error> errs)\r\n  {\r\n    string usage = @\"\r\n(1) Streaming transducer models\r\n\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\r\n  --encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \\\r\n  --decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\r\n  
--joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \\\r\n  --num-threads=2 \\\r\n  --decoding-method=modified_beam_search \\\r\n  --debug=false \\\r\n  --files ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav\r\n\r\n(2) Streaming Zipformer2 Ctc models\r\n\r\ndotnet run -c Release \\\r\n  --tokens ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt \\\r\n  --zipformer2-ctc ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\r\n  --files ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000001.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000002.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000113.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000219.wav \\\r\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000351.wav\r\n\r\n(3) Streaming Paraformer models\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\r\n  --paraformer-encoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\r\n  --paraformer-decoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx \\\r\n  --num-threads=2 \\\r\n  --decoding-method=greedy_search \\\r\n  --debug=false \\\r\n  --files ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav \\\r\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/1.wav\r\n\r\nPlease refer 
to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html\r\nto download pre-trained streaming models.\r\n\";\r\n\r\n    var helpText = HelpText.AutoBuild(result, h =>\r\n    {\r\n      h.AdditionalNewLineAfterOption = false;\r\n      h.Heading = usage;\r\n      h.Copyright = \"Copyright (c) 2023 Xiaomi Corporation\";\r\n      return HelpText.DefaultParsingErrorsHandler(result, h);\r\n    }, e => e);\r\n    Console.WriteLine(helpText);\r\n  }\r\n\r\n  private static void Run(Options options)\r\n  {\r\n    var config = new OnlineRecognizerConfig();\r\n    config.FeatConfig.SampleRate = options.SampleRate;\r\n\r\n    // All models from icefall using feature dim 80.\r\n    // You can change it if your model has a different feature dim.\r\n    config.FeatConfig.FeatureDim = 80;\r\n\r\n    config.ModelConfig.Transducer.Encoder = options.Encoder;\r\n    config.ModelConfig.Transducer.Decoder = options.Decoder;\r\n    config.ModelConfig.Transducer.Joiner = options.Joiner;\r\n\r\n    config.ModelConfig.Paraformer.Encoder = options.ParaformerEncoder;\r\n    config.ModelConfig.Paraformer.Decoder = options.ParaformerDecoder;\r\n\r\n    config.ModelConfig.Zipformer2Ctc.Model = options.Zipformer2Ctc;\r\n    config.ModelConfig.ToneCtc.Model = options.ToneCtc;\r\n\r\n    config.ModelConfig.Tokens = options.Tokens;\r\n    config.ModelConfig.Provider = options.Provider;\r\n    config.ModelConfig.NumThreads = options.NumThreads;\r\n    config.ModelConfig.Debug = options.Debug ? 1 : 0;\r\n\r\n    config.DecodingMethod = options.DecodingMethod;\r\n    config.MaxActivePaths = options.MaxActivePaths;\r\n    config.EnableEndpoint = options.EnableEndpoint ? 
1 : 0;\r\n\r\n    config.Rule1MinTrailingSilence = options.Rule1MinTrailingSilence;\r\n    config.Rule2MinTrailingSilence = options.Rule2MinTrailingSilence;\r\n    config.Rule3MinUtteranceLength = options.Rule3MinUtteranceLength;\r\n    config.HotwordsFile = options.HotwordsFile;\r\n    config.HotwordsScore = options.HotwordsScore;\r\n    config.RuleFsts = options.RuleFsts;\r\n\r\n    var recognizer = new OnlineRecognizer(config);\r\n\r\n    var files = options.Files.ToArray();\r\n\r\n    // We create a separate stream for each file\r\n    var streams = new List<OnlineStream>();\r\n    streams.EnsureCapacity(files.Length);\r\n\r\n    for (int i = 0; i != files.Length; ++i)\r\n    {\r\n      var s = recognizer.CreateStream();\r\n\r\n      var waveReader = new WaveReader(files[i]);\r\n\r\n      var leftPadding = new float[(int)(waveReader.SampleRate * 0.3)];\r\n      s.AcceptWaveform(waveReader.SampleRate, leftPadding);\r\n\r\n      s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n      var tailPadding = new float[(int)(waveReader.SampleRate * 0.6)];\r\n      s.AcceptWaveform(waveReader.SampleRate, tailPadding);\r\n\r\n      s.InputFinished();\r\n\r\n      streams.Add(s);\r\n    }\r\n\r\n    while (true)\r\n    {\r\n      var readyStreams = streams.Where(s => recognizer.IsReady(s));\r\n      if (!readyStreams.Any())\r\n      {\r\n        break;\r\n      }\r\n\r\n      recognizer.Decode(readyStreams);\r\n    }\r\n\r\n    // display results\r\n    for (int i = 0; i != files.Length; ++i)\r\n    {\r\n      var r = recognizer.GetResult(streams[i]);\r\n      var text = r.Text;\r\n      var tokens = r.Tokens;\r\n      Console.WriteLine(\"--------------------\");\r\n      Console.WriteLine(files[i]);\r\n      Console.WriteLine(\"text: {0}\", text);\r\n      Console.WriteLine(\"tokens: [{0}]\", string.Join(\", \", tokens));\r\n      Console.Write(\"timestamps: [\");\r\n      r.Timestamps.ToList().ForEach(i => Console.Write(string.Format(\"{0:0.00}\", i) + 
\", \"));\r\n      Console.WriteLine(\"]\");\r\n    }\r\n    Console.WriteLine(\"--------------------\");\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/online-decode-files.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>online_decode_files</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <PackageReference Include=\"CommandLineParser\" Version=\"2.9.1\" />\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english\n# to download the model files\n\nset -ex\nif [ ! -d ./sherpa-onnx-streaming-paraformer-bilingual-zh-en ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n  --paraformer-encoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\n  --paraformer-decoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx \\\n  --decoding-method greedy_search \\\n  --files ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/1.wav \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/run-t-one-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt \\\n  --t-one-ctc ./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx \\\n  --files ./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/run-transducer-itn.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n# to download the model files\n\nset -ex\nif [ ! -d ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n  --rule-fsts ./itn_zh_number.fst \\\n  --decoding-method greedy_search \\\n  --files ./itn-zh-number.wav\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/run-transducer.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n# to download the model files\n\nset -ex\nif [ ! -d ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n  --decoding-method greedy_search \\\n  --files ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav \\\n  ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav \\\n"
  },
  {
    "path": "dotnet-examples/online-decode-files/run-zipformer2-ctc.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/zipformer-ctc-models.html#sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13-chinese\n# to download the model files\n\nset -ex\nif [ ! -d ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt \\\n  --zipformer2-ctc ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n  --files ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000001.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000002.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000113.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000219.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/TEST_MEETING_T0000000351.wav\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot/Program.cs",
    "content": "﻿// Copyright (c)  2026  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming PocketTTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass PocketTtsDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n\r\n    TestEn();\r\n  }\r\n\r\n  static void TestEn()\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Pocket.LmFlow = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\r\n    config.Model.Pocket.LmMain = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\r\n    config.Model.Pocket.Encoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\r\n    config.Model.Pocket.Decoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\r\n    config.Model.Pocket.TextConditioner = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\r\n    config.Model.Pocket.VocabJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\r\n    config.Model.Pocket.TokenScoresJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\r\n\r\n    var referenceWaveFilename = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\r\n    var reader = new WaveReader(referenceWaveFilename);\r\n\r\n    genConfig.ReferenceAudio = reader.Samples;\r\n    genConfig.ReferenceSampleRate = reader.SampleRate;\r\n    genConfig.Extra[\"max_reference_audio_len\"] = 12;\r\n\r\n    var tts = new OfflineTts(config);\r\n    var text = \"Today as always, men fall into two groups: slaves and free men. 
Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. \" +\r\n      \"Friends fell out often because life was changing so fast. The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\r\n    {\r\n      float[] data = new float[n];\r\n      Marshal.Copy(samples, data, 0, n);\r\n      // You can process samples here, e.g., play them.\r\n      // See ../pocket-tts-zero-shot-play for how to play them\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n    };\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\r\n\r\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\r\n\r\n    var outputFilename = \"./generated-pocket-en.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot/pocket-tts-zero-shot.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>pocket_tts_zero_shot</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot-play/Program.cs",
    "content": "﻿// Copyright (c)  2026  Xiaomi Corporation\r\n//\r\n// This file shows how to use a non-streaming PocketTTS model\r\n// for text-to-speech\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\r\n// and\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\r\n// to download pre-trained models\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Collections.Concurrent;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass PocketTtsDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n\r\n    TestEn();\r\n  }\r\n\r\n  static void TestEn()\r\n  {\r\n    var config = new OfflineTtsConfig();\r\n    config.Model.Pocket.LmFlow = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\r\n    config.Model.Pocket.LmMain = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\r\n    config.Model.Pocket.Encoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\r\n    config.Model.Pocket.Decoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\r\n    config.Model.Pocket.TextConditioner = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\r\n    config.Model.Pocket.VocabJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\r\n    config.Model.Pocket.TokenScoresJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\r\n\r\n    config.Model.NumThreads = 2;\r\n    config.Model.Debug = 1;\r\n    config.Model.Provider = \"cpu\";\r\n\r\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\r\n\r\n    var referenceWaveFilename = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\r\n    var reader = new WaveReader(referenceWaveFilename);\r\n\r\n    genConfig.ReferenceAudio = reader.Samples;\r\n    genConfig.ReferenceSampleRate= reader.SampleRate;\r\n    genConfig.Extra[\"max_reference_audio_len\"] = 12;\r\n\r\n    var tts = new OfflineTts(config);\r\n    var text = \"Today as always, men fall 
into two groups: slaves and free men. Whoever \" +\r\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\r\n      \"he may be: a statesman, a businessman, an official, or a scholar. \" +\r\n      \"Friends fell out often because life was changing so fast. The easiest \" +\r\n      \"thing in the world was to lose touch with someone.\";\r\n\r\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\r\n    PortAudio.Initialize();\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max output channels: {deviceInfo.maxOutputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultOutputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default output device found. 
Please use ../pocket-tts-zero-shot instead\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default output device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowOutputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    // https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview\r\n    var dataItems = new BlockingCollection<float[]>();\r\n\r\n    var myCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\r\n    {\r\n      Console.WriteLine($\"Progress {progress*100}%\");\r\n\r\n      // Copy the n generated samples from native memory into a managed array\r\n      // and queue them for the playback callback.\r\n      float[] data = new float[n];\r\n\r\n      Marshal.Copy(samples, data, 0, n);\r\n\r\n      dataItems.Add(data);\r\n\r\n      // 1 means to keep generating\r\n      // 0 means to stop generating\r\n      return 1;\r\n\r\n    };\r\n\r\n\r\n    var playFinished = false;\r\n\r\n    float[]? 
lastSampleArray = null;\r\n    int lastIndex = 0; // not played\r\n\r\n    PortAudioSharp.Stream.Callback playCallback = (IntPtr input, IntPtr output,\r\n        UInt32 frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      if (dataItems.IsCompleted && lastSampleArray == null && lastIndex == 0)\r\n      {\r\n        Console.WriteLine($\"Finished playing\");\r\n        playFinished = true;\r\n        return StreamCallbackResult.Complete;\r\n      }\r\n\r\n      int expected = Convert.ToInt32(frameCount);\r\n      int i = 0;\r\n\r\n      while ((lastSampleArray != null || dataItems.Count != 0) && (i < expected))\r\n      {\r\n        int needed = expected - i;\r\n\r\n        if (lastSampleArray != null)\r\n        {\r\n          int remaining = lastSampleArray.Length - lastIndex;\r\n          if (remaining >= needed)\r\n          {\r\n            float[] this_block = lastSampleArray.Skip(lastIndex).Take(needed).ToArray();\r\n            lastIndex += needed;\r\n            if (lastIndex == lastSampleArray.Length)\r\n            {\r\n              lastSampleArray = null;\r\n              lastIndex = 0;\r\n            }\r\n\r\n            Marshal.Copy(this_block, 0, IntPtr.Add(output, i * sizeof(float)), needed);\r\n            return StreamCallbackResult.Continue;\r\n          }\r\n\r\n          float[] this_block2 = lastSampleArray.Skip(lastIndex).Take(remaining).ToArray();\r\n          lastIndex = 0;\r\n          lastSampleArray = null;\r\n\r\n          Marshal.Copy(this_block2, 0, IntPtr.Add(output, i * sizeof(float)), remaining);\r\n          i += remaining;\r\n          continue;\r\n        }\r\n\r\n        if (dataItems.Count != 0)\r\n        {\r\n          lastSampleArray = dataItems.Take();\r\n          lastIndex = 0;\r\n        }\r\n      }\r\n\r\n      if (i < expected)\r\n      {\r\n        int sizeInBytes = (expected - i) * 4;\r\n        
Marshal.Copy(new byte[sizeInBytes], 0, IntPtr.Add(output, i * sizeof(float)), sizeInBytes);\r\n      }\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: null, outParams: param, sampleRate: tts.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: playCallback,\r\n        userData: IntPtr.Zero\r\n        );\r\n\r\n    stream.Start();\r\n\r\n    var callback = new OfflineTtsCallbackProgressWithArg(myCallback);\r\n\r\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\r\n\r\n    var outputFilename = \"./generated-pocket-en-play.wav\";\r\n    var ok = audio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Saved to {outputFilename}\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n\r\n    // Generation is done, so no more samples will be queued.\r\n    dataItems.CompleteAdding();\r\n\r\n    // Wait until the playback callback has drained the queue.\r\n    while (!playFinished)\r\n    {\r\n      Thread.Sleep(100); // 100ms\r\n    }\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot-play/pocket-tts-zero-shot-play.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>pocket_tts_zero_shot_play</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/pocket-tts-zero-shot-play/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/sherpa-onnx.sln",
    "content": "﻿\r\nMicrosoft Visual Studio Solution File, Format Version 12.00\r\n# Visual Studio Version 17\r\nVisualStudioVersion = 17.0.31903.59\r\nMinimumVisualStudioVersion = 10.0.40219.1\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"online-decode-files\", \"online-decode-files\\online-decode-files.csproj\", \"{45307474-BECB-4ABE-9388-D01D55A1A9BE}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"offline-decode-files\", \"offline-decode-files\\offline-decode-files.csproj\", \"{2DAB152C-9E24-47A0-9DB0-781297ECE458}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"speech-recognition-from-microphone\", \"speech-recognition-from-microphone\\speech-recognition-from-microphone.csproj\", \"{FE4EA1FF-062A-46B3-B78D-C828FED7B82E}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"offline-tts\", \"offline-tts\\offline-tts.csproj\", \"{72196886-7143-4043-96E2-BCACEC6C79EB}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"offline-tts-play\", \"offline-tts-play\\offline-tts-play.csproj\", \"{40781464-5948-462B-BA4B-98932711513F}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"spoken-language-identification\", \"spoken-language-identification\\spoken-language-identification.csproj\", \"{3D7CF3D6-AC45-4D50-9619-5687B1443E94}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"streaming-hlg-decoding\", \"streaming-hlg-decoding\\streaming-hlg-decoding.csproj\", \"{C4A368A5-FCA0-419D-97C9-C8CE0B08EB99}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"speaker-identification\", \"speaker-identification\\speaker-identification.csproj\", \"{2B1B140E-A92F-426B-B0DF-5D916B67304F}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"offline-punctuation\", \"offline-punctuation\\offline-punctuation.csproj\", 
\"{42D85582-BB63-4259-A4EA-837D66AC078B}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"vad-non-streaming-asr-paraformer\", \"vad-non-streaming-asr-paraformer\\vad-non-streaming-asr-paraformer.csproj\", \"{8CD6B7E5-F59F-47B3-BB87-2B2E3678924D}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"Common\", \"Common\\Common.csproj\", \"{401E963F-E25A-43CE-987D-8DB2D4715756}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"keyword-spotting-from-files\", \"keyword-spotting-from-files\\keyword-spotting-from-files.csproj\", \"{A87EDD31-D654-4C9F-AED7-F6F2825659BD}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"keyword-spotting-from-microphone\", \"keyword-spotting-from-microphone\\keyword-spotting-from-microphone.csproj\", \"{AEE0ED2B-C86F-4952-863C-EAD3219CB4EC}\"\r\nEndProject\r\nProject(\"{9A19103F-16F7-4668-BE54-9A1E7A4F7556}\") = \"offline-speaker-diarization\", \"offline-speaker-diarization\\offline-speaker-diarization.csproj\", \"{D3A1FF28-A77D-429D-AEAC-2BA77CA682BC}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"kokoro-tts\", \"kokoro-tts\\kokoro-tts.csproj\", \"{9C0ABE6C-1F54-42B5-804E-C3FED6668F52}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"kokoro-tts-play\", \"kokoro-tts-play\\kokoro-tts-play.csproj\", \"{EC0BCEAB-1B4E-4129-82CE-9880426AFA0B}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"speech-enhancement-gtcrn\", \"speech-enhancement-gtcrn\\speech-enhancement-gtcrn.csproj\", \"{DF2569C6-6011-4716-9538-F9E9069E00EB}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"speech-enhancement-dpdfnet\", \"speech-enhancement-dpdfnet\\speech-enhancement-dpdfnet.csproj\", \"{016E5D0E-6D79-4AF6-B2C6-F0E091D78C00}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"version-test\", \"version-test\\version-test.csproj\", 
\"{E57711E5-6546-4BA0-B627-79C94F415BC5}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"non-streaming-canary-decode-files\", \"non-streaming-canary-decode-files\\non-streaming-canary-decode-files.csproj\", \"{925779DB-4429-4366-87C3-B14DD44AE1D4}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"kitten-tts\", \"kitten-tts\\kitten-tts.csproj\", \"{E5AB574B-9E31-45D4-9B75-1C1892241E41}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"kitten-tts-play\", \"kitten-tts-play\\kitten-tts-play.csproj\", \"{D60A8A84-D6D3-4B79-A18A-1817BEBD35B9}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"offline-audio-tagging\", \"offline-audio-tagging\\offline-audio-tagging.csproj\", \"{0EBE2CE5-8940-4472-8A38-6A0E976E678F}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"non-streaming-funasr-nano-decode-files\", \"non-streaming-funasr-nano-decode-files\\non-streaming-funasr-nano-decode-files.csproj\", \"{32F7534B-117E-4D1D-BAED-A1D1A6C6A62C}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"vad-non-streaming-funasr-nano\", \"vad-non-streaming-funasr-nano\\vad-non-streaming-funasr-nano.csproj\", \"{32C8C12B-D7DB-455E-B35C-945A745520CC}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"pocket-tts-zero-shot\", \"pocket-tts-zero-shot\\pocket-tts-zero-shot.csproj\", \"{9164FA6A-F8D3-4F52-8173-A2FA78E74BB2}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"pocket-tts-zero-shot-play\", \"pocket-tts-zero-shot-play\\pocket-tts-zero-shot-play.csproj\", \"{0E73BD08-EA6F-416D-8DBF-E92893A8C3B1}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"non-streaming-moonshine-v2-decode-files\", \"non-streaming-moonshine-v2-decode-files\\non-streaming-moonshine-v2-decode-files.csproj\", 
\"{C9E5A6D3-02F4-46DE-808B-5163348F45B3}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"supertonic-tts\", \"supertonic-tts\\supertonic-tts.csproj\", \"{A3B7C4D1-E5F6-4A8B-9C0D-1E2F3A4B5C6D}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"streaming-speech-enhancement-gtcrn\", \"streaming-speech-enhancement-gtcrn\\streaming-speech-enhancement-gtcrn.csproj\", \"{5B87496C-EF81-4232-A448-6308F8E5A18C}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"streaming-speech-enhancement-dpdfnet\", \"streaming-speech-enhancement-dpdfnet\\streaming-speech-enhancement-dpdfnet.csproj\", \"{8CD66C3E-3AE3-43AA-8FDA-DD5BA456F2EC}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"zipvoice-tts\", \"zipvoice-tts\\zipvoice-tts.csproj\", \"{BBC69A08-01A7-4F89-938F-F0D551AD3F6C}\"\r\nEndProject\r\nProject(\"{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}\") = \"zipvoice-tts-play\", \"zipvoice-tts-play\\zipvoice-tts-play.csproj\", \"{84A37E18-095E-42A6-93CC-C27CD90B8478}\"\r\nEndProject\r\nGlobal\r\n\tGlobalSection(SolutionConfigurationPlatforms) = preSolution\r\n\t\tDebug|Any CPU = Debug|Any CPU\r\n\t\tRelease|Any CPU = Release|Any CPU\r\n\tEndGlobalSection\r\n\tGlobalSection(ProjectConfigurationPlatforms) = postSolution\r\n\t\t{45307474-BECB-4ABE-9388-D01D55A1A9BE}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{45307474-BECB-4ABE-9388-D01D55A1A9BE}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{45307474-BECB-4ABE-9388-D01D55A1A9BE}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{45307474-BECB-4ABE-9388-D01D55A1A9BE}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{2DAB152C-9E24-47A0-9DB0-781297ECE458}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{2DAB152C-9E24-47A0-9DB0-781297ECE458}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{2DAB152C-9E24-47A0-9DB0-781297ECE458}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{2DAB152C-9E24-47A0-9DB0-781297ECE458}.Release|Any CPU.Build.0 = 
Release|Any CPU\r\n\t\t{FE4EA1FF-062A-46B3-B78D-C828FED7B82E}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{FE4EA1FF-062A-46B3-B78D-C828FED7B82E}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{FE4EA1FF-062A-46B3-B78D-C828FED7B82E}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{FE4EA1FF-062A-46B3-B78D-C828FED7B82E}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{72196886-7143-4043-96E2-BCACEC6C79EB}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{72196886-7143-4043-96E2-BCACEC6C79EB}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{72196886-7143-4043-96E2-BCACEC6C79EB}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{72196886-7143-4043-96E2-BCACEC6C79EB}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{40781464-5948-462B-BA4B-98932711513F}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{40781464-5948-462B-BA4B-98932711513F}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{40781464-5948-462B-BA4B-98932711513F}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{40781464-5948-462B-BA4B-98932711513F}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{3D7CF3D6-AC45-4D50-9619-5687B1443E94}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{3D7CF3D6-AC45-4D50-9619-5687B1443E94}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{3D7CF3D6-AC45-4D50-9619-5687B1443E94}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{3D7CF3D6-AC45-4D50-9619-5687B1443E94}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{C4A368A5-FCA0-419D-97C9-C8CE0B08EB99}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{C4A368A5-FCA0-419D-97C9-C8CE0B08EB99}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{C4A368A5-FCA0-419D-97C9-C8CE0B08EB99}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{C4A368A5-FCA0-419D-97C9-C8CE0B08EB99}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{2B1B140E-A92F-426B-B0DF-5D916B67304F}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{2B1B140E-A92F-426B-B0DF-5D916B67304F}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{2B1B140E-A92F-426B-B0DF-5D916B67304F}.Release|Any CPU.ActiveCfg = 
Release|Any CPU\r\n\t\t{2B1B140E-A92F-426B-B0DF-5D916B67304F}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{42D85582-BB63-4259-A4EA-837D66AC078B}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{42D85582-BB63-4259-A4EA-837D66AC078B}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{42D85582-BB63-4259-A4EA-837D66AC078B}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{42D85582-BB63-4259-A4EA-837D66AC078B}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{8CD6B7E5-F59F-47B3-BB87-2B2E3678924D}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{8CD6B7E5-F59F-47B3-BB87-2B2E3678924D}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{8CD6B7E5-F59F-47B3-BB87-2B2E3678924D}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{8CD6B7E5-F59F-47B3-BB87-2B2E3678924D}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{401E963F-E25A-43CE-987D-8DB2D4715756}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{401E963F-E25A-43CE-987D-8DB2D4715756}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{401E963F-E25A-43CE-987D-8DB2D4715756}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{401E963F-E25A-43CE-987D-8DB2D4715756}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{A87EDD31-D654-4C9F-AED7-F6F2825659BD}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{A87EDD31-D654-4C9F-AED7-F6F2825659BD}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{A87EDD31-D654-4C9F-AED7-F6F2825659BD}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{A87EDD31-D654-4C9F-AED7-F6F2825659BD}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{AEE0ED2B-C86F-4952-863C-EAD3219CB4EC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{AEE0ED2B-C86F-4952-863C-EAD3219CB4EC}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{AEE0ED2B-C86F-4952-863C-EAD3219CB4EC}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{AEE0ED2B-C86F-4952-863C-EAD3219CB4EC}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{D3A1FF28-A77D-429D-AEAC-2BA77CA682BC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{D3A1FF28-A77D-429D-AEAC-2BA77CA682BC}.Debug|Any CPU.Build.0 = 
Debug|Any CPU\r\n\t\t{D3A1FF28-A77D-429D-AEAC-2BA77CA682BC}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{D3A1FF28-A77D-429D-AEAC-2BA77CA682BC}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{9C0ABE6C-1F54-42B5-804E-C3FED6668F52}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{9C0ABE6C-1F54-42B5-804E-C3FED6668F52}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{9C0ABE6C-1F54-42B5-804E-C3FED6668F52}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{9C0ABE6C-1F54-42B5-804E-C3FED6668F52}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{EC0BCEAB-1B4E-4129-82CE-9880426AFA0B}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{EC0BCEAB-1B4E-4129-82CE-9880426AFA0B}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{EC0BCEAB-1B4E-4129-82CE-9880426AFA0B}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{EC0BCEAB-1B4E-4129-82CE-9880426AFA0B}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{DF2569C6-6011-4716-9538-F9E9069E00EB}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{DF2569C6-6011-4716-9538-F9E9069E00EB}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{DF2569C6-6011-4716-9538-F9E9069E00EB}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{DF2569C6-6011-4716-9538-F9E9069E00EB}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{016E5D0E-6D79-4AF6-B2C6-F0E091D78C00}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{016E5D0E-6D79-4AF6-B2C6-F0E091D78C00}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{016E5D0E-6D79-4AF6-B2C6-F0E091D78C00}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{016E5D0E-6D79-4AF6-B2C6-F0E091D78C00}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{E57711E5-6546-4BA0-B627-79C94F415BC5}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{E57711E5-6546-4BA0-B627-79C94F415BC5}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{E57711E5-6546-4BA0-B627-79C94F415BC5}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{E57711E5-6546-4BA0-B627-79C94F415BC5}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{925779DB-4429-4366-87C3-B14DD44AE1D4}.Debug|Any 
CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{925779DB-4429-4366-87C3-B14DD44AE1D4}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{925779DB-4429-4366-87C3-B14DD44AE1D4}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{925779DB-4429-4366-87C3-B14DD44AE1D4}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{E5AB574B-9E31-45D4-9B75-1C1892241E41}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{E5AB574B-9E31-45D4-9B75-1C1892241E41}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{E5AB574B-9E31-45D4-9B75-1C1892241E41}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{E5AB574B-9E31-45D4-9B75-1C1892241E41}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{D60A8A84-D6D3-4B79-A18A-1817BEBD35B9}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{D60A8A84-D6D3-4B79-A18A-1817BEBD35B9}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{D60A8A84-D6D3-4B79-A18A-1817BEBD35B9}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{D60A8A84-D6D3-4B79-A18A-1817BEBD35B9}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{0EBE2CE5-8940-4472-8A38-6A0E976E678F}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{0EBE2CE5-8940-4472-8A38-6A0E976E678F}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{0EBE2CE5-8940-4472-8A38-6A0E976E678F}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{0EBE2CE5-8940-4472-8A38-6A0E976E678F}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{32F7534B-117E-4D1D-BAED-A1D1A6C6A62C}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{32F7534B-117E-4D1D-BAED-A1D1A6C6A62C}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{32F7534B-117E-4D1D-BAED-A1D1A6C6A62C}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{32F7534B-117E-4D1D-BAED-A1D1A6C6A62C}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{32C8C12B-D7DB-455E-B35C-945A745520CC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{32C8C12B-D7DB-455E-B35C-945A745520CC}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{32C8C12B-D7DB-455E-B35C-945A745520CC}.Release|Any CPU.ActiveCfg = Release|Any 
CPU\r\n\t\t{32C8C12B-D7DB-455E-B35C-945A745520CC}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{9164FA6A-F8D3-4F52-8173-A2FA78E74BB2}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{9164FA6A-F8D3-4F52-8173-A2FA78E74BB2}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{9164FA6A-F8D3-4F52-8173-A2FA78E74BB2}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{9164FA6A-F8D3-4F52-8173-A2FA78E74BB2}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{0E73BD08-EA6F-416D-8DBF-E92893A8C3B1}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{0E73BD08-EA6F-416D-8DBF-E92893A8C3B1}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{0E73BD08-EA6F-416D-8DBF-E92893A8C3B1}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{0E73BD08-EA6F-416D-8DBF-E92893A8C3B1}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{C9E5A6D3-02F4-46DE-808B-5163348F45B3}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{C9E5A6D3-02F4-46DE-808B-5163348F45B3}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{C9E5A6D3-02F4-46DE-808B-5163348F45B3}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{C9E5A6D3-02F4-46DE-808B-5163348F45B3}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{A3B7C4D1-E5F6-4A8B-9C0D-1E2F3A4B5C6D}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{A3B7C4D1-E5F6-4A8B-9C0D-1E2F3A4B5C6D}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{A3B7C4D1-E5F6-4A8B-9C0D-1E2F3A4B5C6D}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{A3B7C4D1-E5F6-4A8B-9C0D-1E2F3A4B5C6D}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{5B87496C-EF81-4232-A448-6308F8E5A18C}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{5B87496C-EF81-4232-A448-6308F8E5A18C}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{5B87496C-EF81-4232-A448-6308F8E5A18C}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{5B87496C-EF81-4232-A448-6308F8E5A18C}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{8CD66C3E-3AE3-43AA-8FDA-DD5BA456F2EC}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{8CD66C3E-3AE3-43AA-8FDA-DD5BA456F2EC}.Debug|Any CPU.Build.0 = Debug|Any 
CPU\r\n\t\t{8CD66C3E-3AE3-43AA-8FDA-DD5BA456F2EC}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{8CD66C3E-3AE3-43AA-8FDA-DD5BA456F2EC}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{BBC69A08-01A7-4F89-938F-F0D551AD3F6C}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{BBC69A08-01A7-4F89-938F-F0D551AD3F6C}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{BBC69A08-01A7-4F89-938F-F0D551AD3F6C}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{BBC69A08-01A7-4F89-938F-F0D551AD3F6C}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\t\t{84A37E18-095E-42A6-93CC-C27CD90B8478}.Debug|Any CPU.ActiveCfg = Debug|Any CPU\r\n\t\t{84A37E18-095E-42A6-93CC-C27CD90B8478}.Debug|Any CPU.Build.0 = Debug|Any CPU\r\n\t\t{84A37E18-095E-42A6-93CC-C27CD90B8478}.Release|Any CPU.ActiveCfg = Release|Any CPU\r\n\t\t{84A37E18-095E-42A6-93CC-C27CD90B8478}.Release|Any CPU.Build.0 = Release|Any CPU\r\n\tEndGlobalSection\r\n\tGlobalSection(SolutionProperties) = preSolution\r\n\t\tHideSolutionNode = FALSE\r\n\tEndGlobalSection\r\n\tGlobalSection(ExtensibilityGlobals) = postSolution\r\n\t\tSolutionGuid = {07A6023C-0A37-4F82-A29F-896A3A338EAC}\r\n\tEndGlobalSection\r\nEndGlobal\r\n"
  },
  {
    "path": "dotnet-examples/speaker-identification/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to do speaker identification with sherpa-onnx.\r\n//\r\n// 1. Download a model from\r\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\r\n//\r\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\r\n//\r\n// 2. Download test data from\r\n//\r\n// git clone https://github.com/csukuangfj/sr-data\r\n//\r\n// 3. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing SherpaOnnx;\r\n\r\nclass SpeakerIdentificationDemo\r\n{\r\n  public static float[] ComputeEmbedding(SpeakerEmbeddingExtractor extractor, string filename)\r\n  {\r\n    var reader = new WaveReader(filename);\r\n\r\n    var stream = extractor.CreateStream();\r\n    stream.AcceptWaveform(reader.SampleRate, reader.Samples);\r\n    stream.InputFinished();\r\n\r\n    var embedding = extractor.Compute(stream);\r\n\r\n    return embedding;\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new SpeakerEmbeddingExtractorConfig();\r\n    config.Model = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\r\n    config.Debug = 1;\r\n    var extractor = new SpeakerEmbeddingExtractor(config);\r\n\r\n    var manager = new SpeakerEmbeddingManager(extractor.Dim);\r\n\r\n    var spk1Files =\r\n        new string[] {\r\n          \"./sr-data/enroll/fangjun-sr-1.wav\",\r\n          \"./sr-data/enroll/fangjun-sr-2.wav\",\r\n          \"./sr-data/enroll/fangjun-sr-3.wav\",\r\n        };\r\n    var spk1Vec = new float[spk1Files.Length][];\r\n\r\n    for (int i = 0; i < spk1Files.Length; ++i)\r\n    {\r\n      spk1Vec[i] = ComputeEmbedding(extractor, spk1Files[i]);\r\n    }\r\n\r\n    var spk2Files =\r\n        new string[] {\r\n          \"./sr-data/enroll/leijun-sr-1.wav\", \"./sr-data/enroll/leijun-sr-2.wav\",\r\n        };\r\n\r\n    var spk2Vec = new 
float[spk2Files.Length][];\r\n\r\n    for (int i = 0; i < spk2Files.Length; ++i)\r\n    {\r\n      spk2Vec[i] = ComputeEmbedding(extractor, spk2Files[i]);\r\n    }\r\n\r\n    if (!manager.Add(\"fangjun\", spk1Vec))\r\n    {\r\n      Console.WriteLine(\"Failed to register fangjun\");\r\n      return;\r\n    }\r\n\r\n    if (!manager.Add(\"leijun\", spk2Vec))\r\n    {\r\n      Console.WriteLine(\"Failed to register leijun\");\r\n      return;\r\n    }\r\n\r\n    if (manager.NumSpeakers != 2)\r\n    {\r\n      Console.WriteLine(\"There should be two speakers\");\r\n      return;\r\n    }\r\n\r\n    if (!manager.Contains(\"fangjun\"))\r\n    {\r\n      Console.WriteLine(\"It should contain the speaker fangjun\");\r\n      return;\r\n    }\r\n\r\n    if (!manager.Contains(\"leijun\"))\r\n    {\r\n      Console.WriteLine(\"It should contain the speaker leijun\");\r\n      return;\r\n    }\r\n\r\n    Console.WriteLine(\"---All speakers---\");\r\n\r\n    var allSpeakers = manager.GetAllSpeakers();\r\n    foreach (var s in allSpeakers)\r\n    {\r\n      Console.WriteLine(s);\r\n    }\r\n    Console.WriteLine(\"------------\");\r\n\r\n    var testFiles =\r\n        new string[] {\r\n          \"./sr-data/test/fangjun-test-sr-1.wav\",\r\n          \"./sr-data/test/leijun-test-sr-1.wav\",\r\n          \"./sr-data/test/liudehua-test-sr-1.wav\"\r\n        };\r\n\r\n    float threshold = 0.6f;\r\n    foreach (var file in testFiles)\r\n    {\r\n      var embedding = ComputeEmbedding(extractor, file);\r\n\r\n      var name = manager.Search(embedding, threshold);\r\n      if (name == \"\")\r\n      {\r\n        name = \"<Unknown>\";\r\n      }\r\n      Console.WriteLine(\"{0}: {1}\", file, name);\r\n    }\r\n\r\n    // test verify\r\n    if (!manager.Verify(\"fangjun\", ComputeEmbedding(extractor, testFiles[0]), threshold))\r\n    {\r\n      Console.WriteLine(\"testFiles[0] should match fangjun!\");\r\n      return;\r\n    }\r\n\r\n    if (!manager.Remove(\"fangjun\"))\r\n    {\r\n  
    Console.WriteLine(\"Failed to remove fangjun\");\r\n      return;\r\n    }\r\n\r\n    if (manager.Verify(\"fangjun\", ComputeEmbedding(extractor, testFiles[0]), threshold))\r\n    {\r\n      Console.WriteLine(\"{0} should match no one!\", testFiles[0]);\r\n      return;\r\n    }\r\n\r\n    if (manager.NumSpeakers != 1)\r\n    {\r\n      Console.WriteLine(\"There should be only 1 speaker left.\");\r\n      return;\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/speaker-identification/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -e ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -d ./sr-data ]; then\n  git clone https://github.com/csukuangfj/sr-data\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/speaker-identification/speaker-identification.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>speaker_identification</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-dpdfnet/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\n//\n// This file shows how to use speech enhancement API with DPDFNet models.\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n//\n// 1. Download a model from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2.onnx\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet4.onnx\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet8.onnx\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2_48khz_hr.onnx\n//\n// 2. Download a test file\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n//\n// 3. Now run it\n//\n// dotnet run\n\nusing SherpaOnnx;\n\nclass OfflineSpeechEnhancementDemo\n{\n  static void Main(string[] args)\n  {\n    var model = \"./dpdfnet_baseline.onnx\";\n    var config = new OfflineSpeechDenoiserConfig();\n    config.Model.Dpdfnet.Model = model;\n    config.Model.Debug = 1;\n    config.Model.NumThreads = 1;\n    var sd = new OfflineSpeechDenoiser(config);\n\n    WaveReader waveReader = new WaveReader(\"./inp_16k.wav\");\n    var denoisedAudio = sd.Run(waveReader.Samples, waveReader.SampleRate);\n\n    var outputFilename = \"./enhanced.wav\";\n    var ok = denoisedAudio.SaveToWaveFile(outputFilename);\n\n    if (ok)\n    {\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outputFilename}\");\n    }\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-dpdfnet/speech-enhancement-dpdfnet.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>speech_enhancement_dpdfnet</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-gtcrn/Program.cs",
    "content": "﻿// Copyright (c)  2025  Xiaomi Corporation\r\n//\r\n// This file shows how to use speech enhancement API with GTCRN models.\n//\n// 1. Download a model from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n//\n// 2. Download a test file\n//\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n//\r\n// 3. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing SherpaOnnx;\r\n\r\nclass OfflineSpeechEnhancementDemo\r\n{\r\n  static void Main(string[] args)\n  {\n    var model = \"./gtcrn_simple.onnx\";\n    var config = new OfflineSpeechDenoiserConfig();\n    config.Model.Gtcrn.Model = model;\n    config.Model.Debug = 1;\n    config.Model.NumThreads = 1;\n    var sd = new OfflineSpeechDenoiser(config);\n\r\n    WaveReader waveReader = new WaveReader(\"./inp_16k.wav\");\r\n    var denoisedAudio =  sd.Run(waveReader.Samples, waveReader.SampleRate);\r\n\r\n    var outputFilename = \"./enhanced.wav\";\n    var ok = denoisedAudio.SaveToWaveFile(outputFilename);\r\n\r\n    if (ok)\r\n    {\r\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine($\"Failed to write {outputFilename}\");\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/speech-enhancement-gtcrn/speech-enhancement-gtcrn.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>speech_enhancement_gtcrn</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/speech-recognition-from-microphone/Program.cs",
    "content": "﻿// Copyright (c)  2023  Xiaomi Corporation\r\n//\r\n// This file shows how to use a streaming model for real-time speech\r\n// recognition from a microphone.\r\n// Please refer to\r\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html\r\n// to download streaming models\r\n\r\nusing CommandLine;\r\nusing CommandLine.Text;\r\nusing PortAudioSharp;\r\nusing SherpaOnnx;\r\nusing System.Runtime.InteropServices;\r\n\r\nclass SpeechRecognitionFromMicrophone\r\n{\r\n  class Options\r\n  {\r\n    [Option(Required = true, HelpText = \"Path to tokens.txt\")]\r\n    public string? Tokens { get; set; }\r\n\r\n    [Option(Required = false, Default = \"cpu\", HelpText = \"Provider, e.g., cpu, coreml\")]\r\n    public string? Provider { get; set; }\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer encoder.onnx\")]\r\n    public string? Encoder { get; set; }\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer decoder.onnx\")]\r\n    public string? Decoder { get; set; }\r\n\r\n    [Option(Required = false, HelpText = \"Path to transducer joiner.onnx\")]\r\n    public string? Joiner { get; set; }\r\n\r\n    [Option(\"paraformer-encoder\", Required = false, HelpText = \"Path to paraformer encoder.onnx\")]\r\n    public string? ParaformerEncoder { get; set; }\r\n\r\n    [Option(\"paraformer-decoder\", Required = false, HelpText = \"Path to paraformer decoder.onnx\")]\r\n    public string? ParaformerDecoder { get; set; }\r\n\r\n    [Option(\"num-threads\", Required = false, Default = 1, HelpText = \"Number of threads for computation\")]\r\n    public int NumThreads { get; set; }\r\n\r\n    [Option(\"decoding-method\", Required = false, Default = \"greedy_search\",\r\n            HelpText = \"Valid decoding methods are: greedy_search, modified_beam_search\")]\r\n    public string? 
DecodingMethod { get; set; }\r\n\r\n    [Option(Required = false, Default = false, HelpText = \"True to show model info during loading\")]\r\n    public bool Debug { get; set; }\r\n\r\n    [Option(\"sample-rate\", Required = false, Default = 16000, HelpText = \"Sample rate of the data used to train the model\")]\r\n    public int SampleRate { get; set; }\r\n\r\n    [Option(\"max-active-paths\", Required = false, Default = 4,\r\n        HelpText = @\"Used only when --decoding-method is modified_beam_search.\r\nIt specifies the number of active paths to keep during the search\")]\r\n    public int MaxActivePaths { get; set; }\r\n\r\n    [Option(\"enable-endpoint\", Required = false, Default = true,\r\n        HelpText = \"True to enable endpoint detection.\")]\r\n    public bool EnableEndpoint { get; set; }\r\n\r\n    [Option(\"rule1-min-trailing-silence\", Required = false, Default = 2.4F,\r\n        HelpText = @\"An endpoint is detected if trailing silence in seconds is\r\nlarger than this value even if nothing has been decoded. Used only when --enable-endpoint is true.\")]\r\n    public float Rule1MinTrailingSilence { get; set; }\r\n\r\n    [Option(\"rule2-min-trailing-silence\", Required = false, Default = 0.8F,\r\n        HelpText = @\"An endpoint is detected if trailing silence in seconds is\r\nlarger than this value after something that is not blank has been decoded. Used\r\nonly when --enable-endpoint is true.\")]\r\n    public float Rule2MinTrailingSilence { get; set; }\r\n\r\n    [Option(\"rule3-min-utterance-length\", Required = false, Default = 20.0F,\r\n        HelpText = @\"An endpoint is detected if the utterance in seconds is\r\nlarger than this value. 
Used only when --enable-endpoint is true.\")]\r\n    public float Rule3MinUtteranceLength { get; set; }\r\n  }\r\n\r\n  static void Main(string[] args)\r\n  {\r\n    var parser = new CommandLine.Parser(with => with.HelpWriter = null);\r\n    var parserResult = parser.ParseArguments<Options>(args);\r\n\r\n    parserResult\r\n      .WithParsed<Options>(options => Run(options))\r\n      .WithNotParsed(errs => DisplayHelp(parserResult, errs));\r\n  }\r\n\r\n  private static void DisplayHelp<T>(ParserResult<T> result, IEnumerable<Error> errs)\r\n  {\r\n    string usage = @\"\r\n(1) Streaming transducer models\r\n\r\ndotnet run -c Release \\\r\n  --tokens ./icefall-asr-zipformer-streaming-wenetspeech-20230615/data/lang_char/tokens.txt \\\r\n  --encoder ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx \\\r\n  --decoder ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx \\\r\n  --joiner ./icefall-asr-zipformer-streaming-wenetspeech-20230615/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx\r\n\r\n(2) Streaming Paraformer models\r\n\r\ndotnet run \\\r\n  --tokens=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\r\n  --paraformer-encoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\r\n  --paraformer-decoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\r\n\r\nPlease refer to\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\r\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\r\nto download pre-trained streaming models.\r\n\";\r\n\r\n    var helpText = HelpText.AutoBuild(result, h =>\r\n    {\r\n      h.AdditionalNewLineAfterOption = false;\r\n      h.Heading = usage;\r\n      h.Copyright = \"Copyright (c) 2023 Xiaomi Corporation\";\r\n      return HelpText.DefaultParsingErrorsHandler(result, h);\r\n    }, e => e);\r\n    
Console.WriteLine(helpText);\r\n  }\r\n\r\n  private static void Run(Options options)\r\n  {\r\n    var config = new OnlineRecognizerConfig();\r\n    config.FeatConfig.SampleRate = options.SampleRate;\r\n\r\n    // All models from icefall use feature dim 80.\r\n    // You can change it if your model has a different feature dim.\r\n    config.FeatConfig.FeatureDim = 80;\r\n\r\n    config.ModelConfig.Transducer.Encoder = options.Encoder;\r\n    config.ModelConfig.Transducer.Decoder = options.Decoder;\r\n    config.ModelConfig.Transducer.Joiner = options.Joiner;\r\n\r\n    config.ModelConfig.Paraformer.Encoder = options.ParaformerEncoder;\r\n    config.ModelConfig.Paraformer.Decoder = options.ParaformerDecoder;\r\n\r\n    config.ModelConfig.Tokens = options.Tokens;\r\n    config.ModelConfig.Provider = options.Provider;\r\n    config.ModelConfig.NumThreads = options.NumThreads;\r\n    config.ModelConfig.Debug = options.Debug ? 1 : 0;\r\n\r\n    config.DecodingMethod = options.DecodingMethod;\r\n    config.MaxActivePaths = options.MaxActivePaths;\r\n    config.EnableEndpoint = options.EnableEndpoint ? 
1 : 0;\r\n\r\n    config.Rule1MinTrailingSilence = options.Rule1MinTrailingSilence;\r\n    config.Rule2MinTrailingSilence = options.Rule2MinTrailingSilence;\r\n    config.Rule3MinUtteranceLength = options.Rule3MinUtteranceLength;\r\n\r\n    var recognizer = new OnlineRecognizer(config);\r\n\r\n    var s = recognizer.CreateStream();\r\n\r\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\r\n    PortAudio.Initialize();\r\n\r\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\r\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\r\n    {\r\n      Console.WriteLine($\" Device {i}\");\r\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\r\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\r\n      Console.WriteLine($\"   Max input channels: {deviceInfo.maxInputChannels}\");\r\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\r\n    }\r\n    int deviceIndex = PortAudio.DefaultInputDevice;\r\n    if (deviceIndex == PortAudio.NoDevice)\r\n    {\r\n      Console.WriteLine(\"No default input device found\");\r\n      Environment.Exit(1);\r\n    }\r\n\r\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\r\n\r\n    Console.WriteLine();\r\n    Console.WriteLine($\"Use default device {deviceIndex} ({info.name})\");\r\n\r\n    var param = new StreamParameters();\r\n    param.device = deviceIndex;\r\n    param.channelCount = 1;\r\n    param.sampleFormat = SampleFormat.Float32;\r\n    param.suggestedLatency = info.defaultLowInputLatency;\r\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\r\n\r\n    PortAudioSharp.Stream.Callback callback = (IntPtr input, IntPtr output,\r\n        uint frameCount,\r\n        ref StreamCallbackTimeInfo timeInfo,\r\n        StreamCallbackFlags statusFlags,\r\n        IntPtr userData\r\n        ) =>\r\n    {\r\n      var samples = new float[frameCount];\r\n      Marshal.Copy(input, samples, 0, (int)frameCount);\r\n\r\n      s.AcceptWaveform(options.SampleRate, 
samples);\r\n\r\n      return StreamCallbackResult.Continue;\r\n    };\r\n\r\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: param, outParams: null, sampleRate: options.SampleRate,\r\n        framesPerBuffer: 0,\r\n        streamFlags: StreamFlags.ClipOff,\r\n        callback: callback,\r\n        userData: IntPtr.Zero\r\n        );\r\n\r\n    Console.WriteLine(param);\r\n    Console.WriteLine(\"Started! Please speak\");\r\n\r\n    stream.Start();\r\n\r\n    var lastText = string.Empty;\r\n    int segmentIndex = 0;\r\n\r\n    while (true)\r\n    {\r\n      while (recognizer.IsReady(s))\r\n      {\r\n        recognizer.Decode(s);\r\n      }\r\n\r\n      var text = recognizer.GetResult(s).Text;\r\n      bool isEndpoint = recognizer.IsEndpoint(s);\r\n      if (!string.IsNullOrWhiteSpace(text) && lastText != text)\r\n      {\r\n        lastText = text;\r\n        Console.Write($\"\\r{segmentIndex}: {lastText}\");\r\n      }\r\n\r\n      if (isEndpoint)\r\n      {\r\n        if (!string.IsNullOrWhiteSpace(text))\r\n        {\r\n          ++segmentIndex;\r\n          Console.WriteLine();\r\n        }\r\n        recognizer.Reset(s);\r\n      }\r\n\r\n      Thread.Sleep(200); // ms\r\n    }\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/speech-recognition-from-microphone/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english\n# to download the model files\n\nset -ex\nif [ ! -d ./sherpa-onnx-streaming-paraformer-bilingual-zh-en ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n  --paraformer-encoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\n  --paraformer-decoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\n"
  },
  {
    "path": "dotnet-examples/speech-recognition-from-microphone/run-transducer.sh",
    "content": "#!/usr/bin/env bash\n\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n# to download the model files\n#\nset -ex\n\nexport LD_LIBRARY_PATH=$PWD:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$PWD:$DYLD_LIBRARY_PATH\n\nif [ ! -d ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\ndotnet run -c Release \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\n"
  },
  {
    "path": "dotnet-examples/speech-recognition-from-microphone/speech-recognition-from-microphone.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>speech_recognition_from_microphone</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <PackageReference Include=\"CommandLineParser\" Version=\"2.9.1\" />\r\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\r\n  </ItemGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/spoken-language-identification/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to do spoken language identification with whisper.\r\n//\r\n// 1. Download a whisper multilingual model. We use a tiny model below.\r\n// Please refer to https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\r\n// to download more models.\r\n//\r\n// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\r\n// tar xvf sherpa-onnx-whisper-tiny.tar.bz2\r\n// rm sherpa-onnx-whisper-tiny.tar.bz2\r\n//\r\n// 2. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing SherpaOnnx;\r\n\r\nclass SpokenLanguageIdentificationDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new SpokenLanguageIdentificationConfig();\r\n    config.Whisper.Encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\";\r\n    config.Whisper.Decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\";\r\n\r\n    var slid = new SpokenLanguageIdentification(config);\r\n    var filename = \"./sherpa-onnx-whisper-tiny/test_wavs/0.wav\";\r\n\r\n    var waveReader = new WaveReader(filename);\r\n\r\n    var s = slid.CreateStream();\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n    var result = slid.Compute(s);\r\n    Console.WriteLine($\"Filename: {filename}\");\r\n    Console.WriteLine($\"Detected language: {result.Lang}\");\r\n  }\r\n}\r\n\r\n"
  },
  {
    "path": "dotnet-examples/spoken-language-identification/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ./sherpa-onnx-whisper-tiny ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\ndotnet run\n\n"
  },
  {
    "path": "dotnet-examples/spoken-language-identification/spoken-language-identification.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>spoken_language_identification</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/streaming-hlg-decoding/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\r\n//\r\n// This file shows how to do streaming HLG decoding.\r\n//\r\n// 1. Download the model for testing\r\n//\r\n//  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\r\n//  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\r\n//  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\r\n//\r\n// 2. Now run it\r\n//\r\n// dotnet run\r\n\r\nusing SherpaOnnx;\r\n\r\nclass StreamingHlgDecodingDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var config = new OnlineRecognizerConfig();\r\n    config.FeatConfig.SampleRate = 16000;\r\n    config.FeatConfig.FeatureDim = 80;\r\n    config.ModelConfig.Zipformer2Ctc.Model = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\";\r\n\r\n    config.ModelConfig.Tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\";\r\n    config.ModelConfig.Provider = \"cpu\";\r\n    config.ModelConfig.NumThreads = 1;\r\n    config.ModelConfig.Debug = 0;\r\n    config.CtcFstDecoderConfig.Graph = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\";\r\n\r\n    var recognizer = new OnlineRecognizer(config);\r\n\r\n    var filename = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\";\r\n\r\n    var waveReader = new WaveReader(filename);\r\n    var s = recognizer.CreateStream();\r\n    s.AcceptWaveform(waveReader.SampleRate, waveReader.Samples);\r\n\r\n    var tailPadding = new float[(int)(waveReader.SampleRate * 0.3)];\r\n    s.AcceptWaveform(waveReader.SampleRate, tailPadding);\r\n    s.InputFinished();\r\n\r\n    while (recognizer.IsReady(s))\r\n    {\r\n      recognizer.Decode(s);\r\n    }\r\n\r\n    var r = recognizer.GetResult(s);\r\n    var text = r.Text;\r\n    var tokens = r.Tokens;\r\n    Console.WriteLine(\"--------------------\");\r\n 
   Console.WriteLine(filename);\r\n    Console.WriteLine(\"text: {0}\", text);\r\n    Console.WriteLine(\"tokens: [{0}]\", string.Join(\", \", tokens));\r\n    Console.Write(\"timestamps: [\");\r\n    r.Timestamps.ToList().ForEach(i => Console.Write(string.Format(\"{0:0.00}\", i) + \", \"));\r\n    Console.WriteLine(\"]\");\r\n    Console.WriteLine(\"--------------------\");\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/streaming-hlg-decoding/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\ndotnet run -c Release\n"
  },
  {
    "path": "dotnet-examples/streaming-hlg-decoding/streaming-hlg-decoding.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>streaming_hlg_decoding</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-dpdfnet/Program.cs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file shows how to use the online speech enhancement API with DPDFNet\n// models.\n\nusing SherpaOnnx;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nclass StreamingSpeechEnhancementDpdfnet\n{\n  static void Main(string[] args)\n  {\n    var config = new OnlineSpeechDenoiserConfig();\n    config.Model.Dpdfnet.Model = \"./dpdfnet_baseline.onnx\";\n    config.Model.Debug = 1;\n    config.Model.NumThreads = 1;\n\n    var sd = new OnlineSpeechDenoiser(config);\n    WaveReader waveReader = new WaveReader(\"./inp_16k.wav\");\n\n    var samples = waveReader.Samples;\n    var output = new List<float>(samples.Length);\n    int frameShift = sd.FrameShiftInSamples;\n\n    for (int start = 0; start < samples.Length; start += frameShift)\n    {\n      int count = Math.Min(frameShift, samples.Length - start);\n      float[] chunk = new float[count];\n      Array.Copy(samples, start, chunk, 0, count);\n      output.AddRange(sd.Run(chunk, waveReader.SampleRate).Samples);\n    }\n\n    output.AddRange(sd.Flush().Samples);\n\n    var outFilename = \"./enhanced-online-dpdfnet.wav\";\n    var outAudio = new GeneratedDenoisedAudio(output.ToArray(), sd.SampleRate);\n    if (outAudio.SaveToWaveFile(outFilename))\n    {\n      Console.WriteLine($\"Wrote to {outFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outFilename}\");\n    }\n  }\n\n  private sealed class GeneratedDenoisedAudio\n  {\n    private readonly float[] _samples;\n    private readonly int _sampleRate;\n\n    public GeneratedDenoisedAudio(float[] samples, int sampleRate)\n    {\n      _samples = samples;\n      _sampleRate = sampleRate;\n    }\n\n    public bool SaveToWaveFile(string filename)\n    {\n      byte[] utf8Filename = Encoding.UTF8.GetBytes(filename);\n      byte[] utf8FilenameWithNull = new byte[utf8Filename.Length + 1];\n      Array.Copy(utf8Filename, utf8FilenameWithNull, 
utf8Filename.Length);\n      utf8FilenameWithNull[utf8Filename.Length] = 0;\n      return SherpaOnnxWriteWave(_samples, _samples.Length, _sampleRate, utf8FilenameWithNull) == 1;\n    }\n\n    [DllImport(Dll.Filename)]\n    private static extern int SherpaOnnxWriteWave(\n        float[] samples,\n        int n,\n        int sampleRate,\n        [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Filename);\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-dpdfnet/streaming-speech-enhancement-dpdfnet.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>streaming_speech_enhancement_dpdfnet</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-gtcrn/Program.cs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file shows how to use the online speech enhancement API with GTCRN\n// models.\n\nusing SherpaOnnx;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nclass StreamingSpeechEnhancementGtcrn\n{\n  static void Main(string[] args)\n  {\n    var config = new OnlineSpeechDenoiserConfig();\n    config.Model.Gtcrn.Model = \"./gtcrn_simple.onnx\";\n    config.Model.Debug = 1;\n    config.Model.NumThreads = 1;\n\n    var sd = new OnlineSpeechDenoiser(config);\n    WaveReader waveReader = new WaveReader(\"./inp_16k.wav\");\n\n    var samples = waveReader.Samples;\n    var output = new List<float>(samples.Length);\n    int frameShift = sd.FrameShiftInSamples;\n\n    for (int start = 0; start < samples.Length; start += frameShift)\n    {\n      int count = Math.Min(frameShift, samples.Length - start);\n      float[] chunk = new float[count];\n      Array.Copy(samples, start, chunk, 0, count);\n      output.AddRange(sd.Run(chunk, waveReader.SampleRate).Samples);\n    }\n\n    output.AddRange(sd.Flush().Samples);\n\n    var outFilename = \"./enhanced-online-gtcrn.wav\";\n    var outAudio = new GeneratedDenoisedAudio(output.ToArray(), sd.SampleRate);\n    if (outAudio.SaveToWaveFile(outFilename))\n    {\n      Console.WriteLine($\"Wrote to {outFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outFilename}\");\n    }\n  }\n\n  private sealed class GeneratedDenoisedAudio\n  {\n    private readonly float[] _samples;\n    private readonly int _sampleRate;\n\n    public GeneratedDenoisedAudio(float[] samples, int sampleRate)\n    {\n      _samples = samples;\n      _sampleRate = sampleRate;\n    }\n\n    public bool SaveToWaveFile(string filename)\n    {\n      byte[] utf8Filename = Encoding.UTF8.GetBytes(filename);\n      byte[] utf8FilenameWithNull = new byte[utf8Filename.Length + 1];\n      Array.Copy(utf8Filename, utf8FilenameWithNull, 
utf8Filename.Length);\n      utf8FilenameWithNull[utf8Filename.Length] = 0;\n      return SherpaOnnxWriteWave(_samples, _samples.Length, _sampleRate, utf8FilenameWithNull) == 1;\n    }\n\n    [DllImport(Dll.Filename)]\n    private static extern int SherpaOnnxWriteWave(\n        float[] samples,\n        int n,\n        int sampleRate,\n        [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Filename);\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/streaming-speech-enhancement-gtcrn/streaming-speech-enhancement-gtcrn.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>streaming_speech_enhancement_gtcrn</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/supertonic-tts/Program.cs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file shows how to use a non-streaming Supertonic TTS model\n// for text-to-speech\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\n// and\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n// to download pre-trained models\nusing SherpaOnnx;\nusing System.Runtime.InteropServices;\n\nclass SupertonicTtsDemo\n{\n  static void Main(string[] args)\n  {\n    TestEn();\n  }\n\n  static void TestEn()\n  {\n    var config = new OfflineTtsConfig();\n    config.Model.Supertonic.DurationPredictor = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx\";\n    config.Model.Supertonic.TextEncoder = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\";\n    config.Model.Supertonic.VectorEstimator = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\";\n    config.Model.Supertonic.Vocoder = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\";\n    config.Model.Supertonic.TtsJson = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\";\n    config.Model.Supertonic.UnicodeIndexer = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\";\n    config.Model.Supertonic.VoiceStyle = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\";\n\n    config.Model.NumThreads = 2;\n    config.Model.Debug = 1;\n    config.Model.Provider = \"cpu\";\n\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.Sid = 6;\n    genConfig.NumSteps = 5;\n    genConfig.Speed = 1.25f;  // larger -> faster\n    genConfig.Extra[\"lang\"] = \"en\";\n\n    var tts = new OfflineTts(config);\n    var text = \"Today as always, men fall into two groups: slaves and free men. 
Whoever \" +\n      \"does not have two-thirds of his day for himself, is a slave, whatever \" +\n      \"he may be: a statesman, a businessman, an official, or a scholar.\";\n\n    var MyCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\n      float[] data = new float[n];\n      Marshal.Copy(samples, data, 0, n);\n      // You can process samples here, e.g., play them.\n      Console.WriteLine($\"Progress {progress*100}%\");\n\n      // 1 means to keep generating\n      // 0 means to stop generating\n      return 1;\n    };\n\n    var callback = new OfflineTtsCallbackProgressWithArg(MyCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n\n    var outputFilename = \"./generated-supertonic-en.wav\";\n    var ok = audio.SaveToWaveFile(outputFilename);\n\n    if (ok)\n    {\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outputFilename}\");\n    }\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/supertonic-tts/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/supertonic-tts/supertonic-tts.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>supertonic_tts</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-asr-paraformer/Program.cs",
    "content": "﻿// Copyright (c)  2024  Xiaomi Corporation\n//\n// This file shows how to use a silero_vad model or ten-vad model\n// with a non-streaming Paraformer for speech recognition.\nusing SherpaOnnx;\nusing System.IO;\n\n\nclass VadNonStreamingAsrParaformer\n{\n  static void Main(string[] args)\n  {\n    // please download model files from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    var config = new OfflineRecognizerConfig();\n    config.ModelConfig.Paraformer.Model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\";\n    config.ModelConfig.Tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\";\n    config.ModelConfig.Debug = 0;\n    var recognizer = new OfflineRecognizer(config);\n\n    var vadModelConfig = new VadModelConfig();\n    if (File.Exists(\"./silero_vad.onnx\"))\n    {\n      Console.WriteLine(\"Use silero-vad\");\n      vadModelConfig.SileroVad.Model = \"./silero_vad.onnx\";\n      vadModelConfig.SileroVad.Threshold = 0.3F;\n      vadModelConfig.SileroVad.MinSilenceDuration = 0.5F;\n      vadModelConfig.SileroVad.MinSpeechDuration = 0.25F;\n      vadModelConfig.SileroVad.MaxSpeechDuration = 5.0F;\n      vadModelConfig.SileroVad.WindowSize = 512;\n    }\n    else if (File.Exists(\"./ten-vad.onnx\"))\n    {\n      Console.WriteLine(\"Use ten-vad\");\n      vadModelConfig.TenVad.Model = \"./ten-vad.onnx\";\n      vadModelConfig.TenVad.Threshold = 0.3F;\n      vadModelConfig.TenVad.MinSilenceDuration = 0.5F;\n      vadModelConfig.TenVad.MinSpeechDuration = 0.25F;\n      vadModelConfig.TenVad.MaxSpeechDuration = 5.0F;\n      vadModelConfig.TenVad.WindowSize = 256;\n    }\n    else\n    {\n      Console.WriteLine(\"Please download ./silero_vad.onnx or ./ten-vad.onnx\");\n      return;\n    }\n    vadModelConfig.Debug = 0;\n\n    var vad = new VoiceActivityDetector(vadModelConfig, 60);\n\n    var testWaveFilename = \"./lei-jun-test.wav\";\n    var reader = new WaveReader(testWaveFilename);\n\n    int 
numSamples = reader.Samples.Length;\n    int windowSize = vadModelConfig.SileroVad.WindowSize;\n\n    if (vadModelConfig.TenVad.Model != \"\")\n    {\n      windowSize = vadModelConfig.TenVad.WindowSize;\n    }\n\n    int sampleRate = vadModelConfig.SampleRate;\n    int numIter = numSamples / windowSize;\n\n    for (int i = 0; i != numIter; ++i)\n    {\n      int start = i * windowSize;\n      var samples = new float[windowSize];\n      Array.Copy(reader.Samples, start, samples, 0, windowSize);\n      vad.AcceptWaveform(samples);\n      if (vad.IsSpeechDetected())\n      {\n        while (!vad.IsEmpty())\n        {\n          SpeechSegment segment = vad.Front();\n          var startTime = segment.Start / (float)sampleRate;\n          var duration = segment.Samples.Length / (float)sampleRate;\n\n          OfflineStream stream = recognizer.CreateStream();\n          stream.AcceptWaveform(sampleRate, segment.Samples);\n          recognizer.Decode(stream);\n          var text = stream.Result.Text;\n\n          if (!string.IsNullOrEmpty(text))\n          {\n            Console.WriteLine(\"{0}--{1}: {2}\", string.Format(\"{0:0.00}\", startTime),\n                string.Format(\"{0:0.00}\", startTime + duration), text);\n          }\n\n          vad.Pop();\n        }\n      }\n    }\n\n    vad.Flush();\n\n    while (!vad.IsEmpty())\n    {\n      var segment = vad.Front();\n      float startTime = segment.Start / (float)sampleRate;\n      float duration = segment.Samples.Length / (float)sampleRate;\n\n      var stream = recognizer.CreateStream();\n      stream.AcceptWaveform(sampleRate, segment.Samples);\n      recognizer.Decode(stream);\n      var text = stream.Result.Text;\n\n      if (!string.IsNullOrEmpty(text))\n      {\n        Console.WriteLine(\"{0}--{1}: {2}\", string.Format(\"{0:0.00}\", startTime),\n            string.Format(\"{0:0.00}\", startTime + duration), text);\n      }\n\n      vad.Pop();\n    }\n  }\n}\n\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-asr-paraformer/run-ten-vad.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./ten-vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-asr-paraformer/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-asr-paraformer/vad-non-streaming-asr-paraformer.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>vad_non_streaming_asr_paraformer</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-funasr-nano/Program.cs",
    "content": "﻿// Copyright (c)  2026  Xiaomi Corporation\r\n//\r\n// This file shows how to use a silero_vad model or ten-vad model\r\n// with a non-streaming FunASR Nano for speech recognition.\r\nusing SherpaOnnx;\r\nusing System.IO;\r\n\r\n\r\nclass VadNonStreamingFunAsrNano\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    // please download model files from\r\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\r\n    var config = new OfflineRecognizerConfig();\r\n    config.ModelConfig.FunAsrNano.EncoderAdaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.LLM = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.Embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\";\r\n    config.ModelConfig.FunAsrNano.Tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\";\r\n    config.ModelConfig.Tokens = \"\";\r\n    config.ModelConfig.Debug = 0;\r\n    var recognizer = new OfflineRecognizer(config);\r\n\r\n    var vadModelConfig = new VadModelConfig();\r\n    if (File.Exists(\"./silero_vad.onnx\"))\r\n    {\r\n      Console.WriteLine(\"Use silero-vad\");\r\n      vadModelConfig.SileroVad.Model = \"./silero_vad.onnx\";\r\n      vadModelConfig.SileroVad.Threshold = 0.3F;\r\n      vadModelConfig.SileroVad.MinSilenceDuration = 0.5F;\r\n      vadModelConfig.SileroVad.MinSpeechDuration = 0.25F;\r\n      vadModelConfig.SileroVad.MaxSpeechDuration = 5.0F;\r\n      vadModelConfig.SileroVad.WindowSize = 512;\r\n    }\r\n    else if (File.Exists(\"./ten-vad.onnx\"))\r\n    {\r\n      Console.WriteLine(\"Use ten-vad\");\r\n      vadModelConfig.TenVad.Model = \"./ten-vad.onnx\";\r\n      vadModelConfig.TenVad.Threshold = 0.3F;\r\n      vadModelConfig.TenVad.MinSilenceDuration = 0.5F;\r\n      vadModelConfig.TenVad.MinSpeechDuration = 0.25F;\r\n      vadModelConfig.TenVad.MaxSpeechDuration = 5.0F;\r\n      
vadModelConfig.TenVad.WindowSize = 256;\r\n    }\r\n    else\r\n    {\r\n      Console.WriteLine(\"Please download ./silero_vad.onnx or ./ten-vad.onnx\");\r\n      return;\r\n    }\r\n    vadModelConfig.Debug = 0;\r\n\r\n    var vad = new VoiceActivityDetector(vadModelConfig, 60);\r\n\r\n    var testWaveFilename = \"./lei-jun-test.wav\";\r\n    var reader = new WaveReader(testWaveFilename);\r\n\r\n    int numSamples = reader.Samples.Length;\r\n    int windowSize = vadModelConfig.SileroVad.WindowSize;\r\n\r\n    if (vadModelConfig.TenVad.Model != \"\")\r\n    {\r\n      windowSize = vadModelConfig.TenVad.WindowSize;\r\n    }\r\n\r\n    int sampleRate = vadModelConfig.SampleRate;\r\n    int numIter = numSamples / windowSize;\r\n\r\n    for (int i = 0; i != numIter; ++i)\r\n    {\r\n      int start = i * windowSize;\r\n      var samples = new float[windowSize];\r\n      Array.Copy(reader.Samples, start, samples, 0, windowSize);\r\n      vad.AcceptWaveform(samples);\r\n      if (vad.IsSpeechDetected())\r\n      {\r\n        while (!vad.IsEmpty())\r\n        {\r\n          SpeechSegment segment = vad.Front();\r\n          var startTime = segment.Start / (float)sampleRate;\r\n          var duration = segment.Samples.Length / (float)sampleRate;\r\n\r\n          OfflineStream stream = recognizer.CreateStream();\r\n          stream.AcceptWaveform(sampleRate, segment.Samples);\r\n          recognizer.Decode(stream);\r\n          var text = stream.Result.Text;\r\n\r\n          if (!string.IsNullOrEmpty(text))\r\n          {\r\n            Console.WriteLine(\"{0}--{1}: {2}\", string.Format(\"{0:0.00}\", startTime),\r\n                string.Format(\"{0:0.00}\", startTime + duration), text);\r\n          }\r\n\r\n          vad.Pop();\r\n        }\r\n      }\r\n    }\r\n\r\n    vad.Flush();\r\n\r\n    while (!vad.IsEmpty())\r\n    {\r\n      var segment = vad.Front();\r\n      float startTime = segment.Start / (float)sampleRate;\r\n      float duration = segment.Samples.Length / 
(float)sampleRate;\r\n\r\n      var stream = recognizer.CreateStream();\r\n      stream.AcceptWaveform(sampleRate, segment.Samples);\r\n      recognizer.Decode(stream);\r\n      var text = stream.Result.Text;\r\n\r\n      if (!string.IsNullOrEmpty(text))\r\n      {\r\n        Console.WriteLine(\"{0}--{1}: {2}\", string.Format(\"{0:0.00}\", startTime),\r\n            string.Format(\"{0:0.00}\", startTime + duration), text);\r\n      }\r\n\r\n      vad.Pop();\r\n    }\r\n  }\r\n}\r\n\r\n\r\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-funasr-nano/run-ten-vad.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./ten-vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-funasr-nano/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/vad-non-streaming-funasr-nano/vad-non-streaming-funasr-nano.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>vad_non_streaming_funasr_nano</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/version-test/Program.cs",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\r\nusing SherpaOnnx;\r\n\r\nclass VersionTestDemo\r\n{\r\n  static void Main(string[] args)\r\n  {\r\n    var version = VersionInfo.Version;\r\n    var gitSha1 = VersionInfo.GitSha1;\r\n    var gitDate = VersionInfo.GitDate;\r\n\r\n    Console.WriteLine(\"sherpa-onnx version: {0}\", version);\r\n    Console.WriteLine(\"sherpa-onnx gitSha1: {0}\", gitSha1);\r\n    Console.WriteLine(\"sherpa-onnx gitDate: {0}\", gitDate);\r\n  }\r\n}\r\n"
  },
  {
    "path": "dotnet-examples/version-test/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/version-test/version-test.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\r\n\r\n  <PropertyGroup>\r\n    <OutputType>Exe</OutputType>\r\n    <TargetFramework>net8.0</TargetFramework>\r\n    <RootNamespace>version_test</RootNamespace>\r\n    <ImplicitUsings>enable</ImplicitUsings>\r\n    <Nullable>enable</Nullable>\r\n  </PropertyGroup>\r\n\r\n  <ItemGroup>\r\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\r\n  </ItemGroup>\r\n\r\n</Project>\r\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts/Program.cs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file shows how to use a non-streaming ZipVoice model\n// for zero-shot text-to-speech.\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n// and\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n// to download pre-trained models\nusing SherpaOnnx;\nusing System.Runtime.InteropServices;\n\nclass ZipVoiceTtsDemo\n{\n  static void Main(string[] args)\n  {\n    TestZhEn();\n  }\n\n  static void TestZhEn()\n  {\n    var config = new OfflineTtsConfig();\n    config.Model.ZipVoice.Tokens = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\";\n    config.Model.ZipVoice.Encoder = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\";\n    config.Model.ZipVoice.Decoder = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\";\n    config.Model.ZipVoice.Vocoder = \"./vocos_24khz.onnx\";\n    config.Model.ZipVoice.DataDir = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\";\n    config.Model.ZipVoice.Lexicon = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\";\n\n    config.Model.NumThreads = 2;\n    config.Model.Debug = 1;\n    config.Model.Provider = \"cpu\";\n\n    var referenceWaveFilename = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\";\n    var reader = new WaveReader(referenceWaveFilename);\n\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.ReferenceAudio = reader.Samples;\n    genConfig.ReferenceSampleRate = reader.SampleRate;\n    genConfig.ReferenceText = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\";\n    genConfig.NumSteps = 4;\n    genConfig.Extra[\"min_char_in_sentence\"] = \"10\";\n\n    var tts = new OfflineTts(config);\n    var text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.\";\n\n    var myCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\n      float[] data = new float[n];\n      Marshal.Copy(samples, data, 0, n);\n      Console.WriteLine($\"Progress {progress * 100}%\");\n\n      // 1 means to keep generating\n      // 0 means to stop generating\n      return 1;\n    };\n\n    var callback = new OfflineTtsCallbackProgressWithArg(myCallback);\n\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n\n    var outputFilename = \"./generated-zipvoice-zh-en.wav\";\n    var ok = audio.SaveToWaveFile(outputFilename);\n\n    if (ok)\n    {\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outputFilename}\");\n    }\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts/zipvoice-tts.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>zipvoice_tts</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts-play/Program.cs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file shows how to use a non-streaming ZipVoice model\n// for zero-shot text-to-speech with playback.\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n// and\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n// to download pre-trained models\nusing PortAudioSharp;\nusing SherpaOnnx;\nusing System.Collections.Concurrent;\nusing System.Runtime.InteropServices;\n\nclass ZipVoiceTtsDemo\n{\n  static void Main(string[] args)\n  {\n    TestZhEn();\n  }\n\n  static void TestZhEn()\n  {\n    var config = new OfflineTtsConfig();\n    config.Model.ZipVoice.Tokens = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\";\n    config.Model.ZipVoice.Encoder = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\";\n    config.Model.ZipVoice.Decoder = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\";\n    config.Model.ZipVoice.Vocoder = \"./vocos_24khz.onnx\";\n    config.Model.ZipVoice.DataDir = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\";\n    config.Model.ZipVoice.Lexicon = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\";\n\n    config.Model.NumThreads = 2;\n    config.Model.Debug = 1;\n    config.Model.Provider = \"cpu\";\n\n    var referenceWaveFilename = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\";\n    var reader = new WaveReader(referenceWaveFilename);\n\n    OfflineTtsGenerationConfig genConfig = new OfflineTtsGenerationConfig();\n    genConfig.ReferenceAudio = reader.Samples;\n    genConfig.ReferenceSampleRate = reader.SampleRate;\n    genConfig.ReferenceText = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\";\n    genConfig.NumSteps = 4;\n    genConfig.Extra[\"min_char_in_sentence\"] = \"10\";\n\n    var tts = new OfflineTts(config);\n    var text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.\";\n\n    Console.WriteLine(PortAudio.VersionInfo.versionText);\n    PortAudio.Initialize();\n    Console.WriteLine($\"Number of devices: {PortAudio.DeviceCount}\");\n\n    for (int i = 0; i != PortAudio.DeviceCount; ++i)\n    {\n      Console.WriteLine($\" Device {i}\");\n      DeviceInfo deviceInfo = PortAudio.GetDeviceInfo(i);\n      Console.WriteLine($\"   Name: {deviceInfo.name}\");\n      Console.WriteLine($\"   Max output channels: {deviceInfo.maxOutputChannels}\");\n      Console.WriteLine($\"   Default sample rate: {deviceInfo.defaultSampleRate}\");\n    }\n    int deviceIndex = PortAudio.DefaultOutputDevice;\n    if (deviceIndex == PortAudio.NoDevice)\n    {\n      Console.WriteLine(\"No default output device found. Please use ../zipvoice-tts instead\");\n      Environment.Exit(1);\n    }\n\n    var info = PortAudio.GetDeviceInfo(deviceIndex);\n    Console.WriteLine();\n    Console.WriteLine($\"Use output default device {deviceIndex} ({info.name})\");\n\n    var param = new StreamParameters();\n    param.device = deviceIndex;\n    param.channelCount = 1;\n    param.sampleFormat = SampleFormat.Float32;\n    param.suggestedLatency = info.defaultLowOutputLatency;\n    param.hostApiSpecificStreamInfo = IntPtr.Zero;\n\n    var dataItems = new BlockingCollection<float[]>();\n\n    var myCallback = (IntPtr samples, int n, float progress, IntPtr arg) =>\n    {\n      Console.WriteLine($\"Progress {progress * 100}%\");\n\n      float[] data = new float[n];\n      Marshal.Copy(samples, data, 0, n);\n      dataItems.Add(data);\n\n      // 1 means to keep generating\n      // 0 means to stop generating\n      return 1;\n    };\n\n    var playFinished = false;\n\n    float[]? 
lastSampleArray = null;\n    int lastIndex = 0;\n\n    PortAudioSharp.Stream.Callback playCallback = (IntPtr input, IntPtr output,\n        UInt32 frameCount,\n        ref StreamCallbackTimeInfo timeInfo,\n        StreamCallbackFlags statusFlags,\n        IntPtr userData\n        ) =>\n    {\n      if (dataItems.IsCompleted && lastSampleArray == null && lastIndex == 0)\n      {\n        Console.WriteLine(\"Finished playing\");\n        playFinished = true;\n        return StreamCallbackResult.Complete;\n      }\n\n      int expected = Convert.ToInt32(frameCount);\n      int i = 0;\n\n      while ((lastSampleArray != null || dataItems.Count != 0) && (i < expected))\n      {\n        int needed = expected - i;\n\n        if (lastSampleArray != null)\n        {\n          int remaining = lastSampleArray.Length - lastIndex;\n          if (remaining >= needed)\n          {\n            float[] thisBlock = lastSampleArray.Skip(lastIndex).Take(needed).ToArray();\n            lastIndex += needed;\n            if (lastIndex == lastSampleArray.Length)\n            {\n              lastSampleArray = null;\n              lastIndex = 0;\n            }\n\n            Marshal.Copy(thisBlock, 0, IntPtr.Add(output, i * sizeof(float)), needed);\n            return StreamCallbackResult.Continue;\n          }\n\n          float[] thisBlock2 = lastSampleArray.Skip(lastIndex).Take(remaining).ToArray();\n          lastIndex = 0;\n          lastSampleArray = null;\n\n          Marshal.Copy(thisBlock2, 0, IntPtr.Add(output, i * sizeof(float)), remaining);\n          i += remaining;\n          continue;\n        }\n\n        if (dataItems.Count != 0)\n        {\n          lastSampleArray = dataItems.Take();\n          lastIndex = 0;\n        }\n      }\n\n      if (i < expected)\n      {\n        int sizeInBytes = (expected - i) * 4;\n        Marshal.Copy(new byte[sizeInBytes], 0, IntPtr.Add(output, i * sizeof(float)), sizeInBytes);\n      }\n\n      return StreamCallbackResult.Continue;\n  
  };\n\n    PortAudioSharp.Stream stream = new PortAudioSharp.Stream(inParams: null, outParams: param, sampleRate: tts.SampleRate,\n        framesPerBuffer: 0,\n        streamFlags: StreamFlags.ClipOff,\n        callback: playCallback,\n        userData: IntPtr.Zero\n        );\n\n    stream.Start();\n\n    var callback = new OfflineTtsCallbackProgressWithArg(myCallback);\n    var audio = tts.GenerateWithConfig(text, genConfig, callback);\n\n    var outputFilename = \"./generated-zipvoice-zh-en-play.wav\";\n    var ok = audio.SaveToWaveFile(outputFilename);\n\n    if (ok)\n    {\n      Console.WriteLine($\"Wrote to {outputFilename} succeeded!\");\n    }\n    else\n    {\n      Console.WriteLine($\"Failed to write {outputFilename}\");\n    }\n\n    dataItems.CompleteAdding();\n\n    while (!playFinished)\n    {\n      Thread.Sleep(100);\n    }\n  }\n}\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts-play/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ndotnet run\n"
  },
  {
    "path": "dotnet-examples/zipvoice-tts-play/zipvoice-tts-play.csproj",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <OutputType>Exe</OutputType>\n    <TargetFramework>net8.0</TargetFramework>\n    <RootNamespace>zipvoice_tts_play</RootNamespace>\n    <ImplicitUsings>enable</ImplicitUsings>\n    <Nullable>enable</Nullable>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <PackageReference Include=\"PortAudioSharp2\" Version=\"*\" />\n  </ItemGroup>\n\n  <ItemGroup>\n    <ProjectReference Include=\"..\\Common\\Common.csproj\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "ffmpeg-examples/Makefile",
    "content": "CC=g++\nGDB ?= FALSE\n\n# use pkg-config for getting CFLAGS and LDLIBS\nSHARED_LIBS=libavdevice                          \\\n            libavformat                          \\\n            libavfilter                          \\\n            libavcodec                           \\\n            libswresample                        \\\n            libswscale                           \\\n            libavutil\n\nifeq ($(GDB), TRUE)\n\tOPTFLAG += -g\nendif\n\n# CFLAGS := $(shell pkg-config --cflags $(SHARED_LIBS)) -I.. -Wall -std=c++17 -fopenmp ${OPTFLAG}\nCFLAGS := $(shell pkg-config --cflags $(SHARED_LIBS)) -I.. -Wall -std=c++17  ${OPTFLAG}\nLDLIBS := $(shell pkg-config --libs $(SHARED_LIBS))\n\nCUR_DIR :=$(shell pwd)\n\nLDLIBS += -L ../build/lib\nLDLIBS += -L ../build/_deps/onnxruntime-src/lib\nLDLIBS += -lsherpa-onnx-c-api -lonnxruntime\nLDLIBS += -Wl,-rpath,${CUR_DIR}/../build/lib\nLDLIBS += -Wl,-rpath,${CUR_DIR}/../build/_deps/onnxruntime-src/lib\n\n#Get libavutil version and extract major, minor and micro\nLIBAVUTIL_VERSION := $(shell pkg-config --modversion libavutil)\nLIBAVUTIL_MAJOR := $(shell echo \"$(LIBAVUTIL_VERSION)\" | awk -F. '{print $$1}')\nLIBAVUTIL_MINOR := $(shell echo \"$(LIBAVUTIL_VERSION)\" | awk -F. '{print $$2}')\nLIBAVUTIL_MICRO := $(shell echo \"$(LIBAVUTIL_VERSION)\" | awk -F. 
'{print $$3}')\n#Check if libavutil version is 57.28.100 or above\nFFMPEG_51_AND_ABOVE = $(shell echo \"$(LIBAVUTIL_MAJOR) $(LIBAVUTIL_MINOR) $(LIBAVUTIL_MICRO)\" | awk '{if ($$1 > 57 || ($$1 == 57 && $$2 > 28) || ($$1 == 57 && $$2 == 28 && $$3 >= 100)) print \"TRUE\"; else print \"FALSE\"}')\nifeq ($(FFMPEG_51_AND_ABOVE), FALSE)\n$(error FFmpeg version should be n5.1 or above!)\nendif\n\nEXAMPLES=sherpa-onnx-ffmpeg\n\nOBJS=$(addsuffix .o,$(EXAMPLES))\n\n.PHONY: all clean build_info\n\nall: $(EXAMPLES)\n\t@echo $(EXAMPLES)\n\t$(RM) $(OBJS)\n\n$(EXAMPLES): $(OBJS)\n\t$(CC) $(addsuffix .o,$@) $(CFLAGS) $(LDLIBS) -o $@\n\n%.o : %.c\n\t${CC} ${CFLAGS} -c -o $@ $<\n\nclean:\n\t$(RM) $(EXAMPLES) $(OBJS)\n\nbuild_info:\n\t@echo \"libavutil version: $(LIBAVUTIL_VERSION)\"\n\t@echo \"Supported examples: $(EXAMPLES)\"\n"
  },
  {
    "path": "ffmpeg-examples/README.md",
    "content": "# Introduction\n\nYou can use `sherpa-onnx-ffmpeg` to decode a wav, mp3, or even a URL.\n\nSee <https://github.com/ossrs/srs>\nfor more supported formats and protocols, e.g.,\nRTMP/WebRTC/HLS/HTTP-FLV/SRT/MPEG-DASH/GB28181.\n\n\n## How to use\n\nPlease have a look at\n\n```\n./run.sh\n```\n"
  },
  {
    "path": "ffmpeg-examples/how-to-fix-errors.md",
    "content": "# Fixes for errors\n\nTo fix the following error:\n```\nPackage libavdevice was not found in the pkg-config search path.\n```\nplease run\n\n```\nsudo apt-get install libavdevice-dev\n```\n\nTo fix the following error\n```\nMakefile:28: *** FFmpeg version should be n5.1 or above!.  Stop.\n```\nplease run\n```\nsudo apt-get install software-properties-common\nsudo add-apt-repository ppa:savoury1/ffmpeg4\nsudo add-apt-repository ppa:savoury1/ffmpeg5\nsudo apt-get update\nsudo apt-get install ffmpeg --reinstall\nsudo apt-get install libavutil-dev --reinstall\n```\n\nTo fix the following error:\n```\nModuleNotFoundError: No module named 'apt_pkg'\n```\nplease run:\n```\nsudo apt-get install python-apt\n```\n"
  },
  {
    "path": "ffmpeg-examples/sherpa-onnx-ffmpeg.c",
    "content": "// ffmpeg-examples/sherpa-onnx-ffmpeg.c\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n/*\n * Copyright (c) 2010 Nicolas George\n * Copyright (c) 2011 Stefano Sabatini\n * Copyright (c) 2012 Clément Bœsch\n *\n * Permission is hereby granted, free of charge, to any person obtaining a copy\n * of this software and associated documentation files (the \"Software\"), to deal\n * in the Software without restriction, including without limitation the rights\n * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n * copies of the Software, and to permit persons to whom the Software is\n * furnished to do so, subject to the following conditions:\n *\n * The above copyright notice and this permission notice shall be included in\n * all copies or substantial portions of the Software.\n *\n * THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL\n * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\n * THE SOFTWARE.\n */\n\n/**\n * @file audio decoding and filtering usage example\n * @example sherpa-onnx-ffmpeg.c\n *\n * Demux, decode and filter audio input file, generate a raw audio\n * file to be played with ffplay.\n */\n\n#include <unistd.h>\nextern \"C\" {\n#include <libavcodec/avcodec.h>\n#include <libavfilter/buffersink.h>\n#include <libavfilter/buffersrc.h>\n#include <libavformat/avformat.h>\n#include <libavutil/channel_layout.h>\n#include <libavutil/opt.h>\n}\n\nstatic const char *filter_descr =\n    \"aresample=16000,aformat=sample_fmts=s16:channel_layouts=mono\";\n\nstatic AVFormatContext *fmt_ctx;\nstatic AVCodecContext *dec_ctx;\nAVFilterContext *buffersink_ctx;\nAVFilterContext *buffersrc_ctx;\nAVFilterGraph *filter_graph;\nstatic int audio_stream_index = -1;\n\nstatic int open_input_file(const char *filename) {\n  const AVCodec *dec;\n  int ret;\n\n  if ((ret = avformat_open_input(&fmt_ctx, filename, NULL, NULL)) < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot open input file %s\\n\", filename);\n    return ret;\n  }\n\n  if ((ret = avformat_find_stream_info(fmt_ctx, NULL)) < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot find stream information\\n\");\n    return ret;\n  }\n\n  /* select the audio stream */\n  ret = av_find_best_stream(fmt_ctx, AVMEDIA_TYPE_AUDIO, -1, -1, &dec, 0);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR,\n           \"Cannot find an audio stream in the input file\\n\");\n    return ret;\n  }\n  audio_stream_index = ret;\n\n  /* create decoding context */\n  dec_ctx = avcodec_alloc_context3(dec);\n  if (!dec_ctx) return AVERROR(ENOMEM);\n  avcodec_parameters_to_context(dec_ctx,\n                                fmt_ctx->streams[audio_stream_index]->codecpar);\n\n  /* 
init the audio decoder */\n  if ((ret = avcodec_open2(dec_ctx, dec, NULL)) < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot open audio decoder\\n\");\n    return ret;\n  }\n\n  return 0;\n}\n\nstatic int init_filters(const char *filters_descr) {\n  char args[512];\n  int ret = 0;\n  const AVFilter *abuffersrc = avfilter_get_by_name(\"abuffer\");\n  const AVFilter *abuffersink = avfilter_get_by_name(\"abuffersink\");\n  AVFilterInOut *outputs = avfilter_inout_alloc();\n  AVFilterInOut *inputs = avfilter_inout_alloc();\n  static const enum AVSampleFormat out_sample_fmts[] = {AV_SAMPLE_FMT_S16,\n                                                        AV_SAMPLE_FMT_NONE};\n  static const int out_sample_rates[] = {16000, -1};\n  const AVFilterLink *outlink;\n  AVRational time_base = fmt_ctx->streams[audio_stream_index]->time_base;\n\n  filter_graph = avfilter_graph_alloc();\n  if (!outputs || !inputs || !filter_graph) {\n    ret = AVERROR(ENOMEM);\n    goto end;\n  }\n\n  /* buffer audio source: the decoded frames from the decoder will be inserted\n   * here. */\n  if (dec_ctx->ch_layout.order == AV_CHANNEL_ORDER_UNSPEC)\n    av_channel_layout_default(&dec_ctx->ch_layout,\n                              dec_ctx->ch_layout.nb_channels);\n  ret = snprintf(args, sizeof(args),\n                 \"time_base=%d/%d:sample_rate=%d:sample_fmt=%s:channel_layout=\",\n                 time_base.num, time_base.den, dec_ctx->sample_rate,\n                 av_get_sample_fmt_name(dec_ctx->sample_fmt));\n  av_channel_layout_describe(&dec_ctx->ch_layout, args + ret,\n                             sizeof(args) - ret);\n  ret = avfilter_graph_create_filter(&buffersrc_ctx, abuffersrc, \"in\", args,\n                                     NULL, filter_graph);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot create audio buffer source\\n\");\n    goto end;\n  }\n\n  /* buffer audio sink: to terminate the filter chain. 
*/\n  ret = avfilter_graph_create_filter(&buffersink_ctx, abuffersink, \"out\", NULL,\n                                     NULL, filter_graph);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot create audio buffer sink\\n\");\n    goto end;\n  }\n\n  ret = av_opt_set_int_list(buffersink_ctx, \"sample_fmts\", out_sample_fmts, -1,\n                            AV_OPT_SEARCH_CHILDREN);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot set output sample format\\n\");\n    goto end;\n  }\n\n  ret =\n      av_opt_set(buffersink_ctx, \"ch_layouts\", \"mono\", AV_OPT_SEARCH_CHILDREN);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot set output channel layout\\n\");\n    goto end;\n  }\n\n  ret = av_opt_set_int_list(buffersink_ctx, \"sample_rates\", out_sample_rates,\n                            -1, AV_OPT_SEARCH_CHILDREN);\n  if (ret < 0) {\n    av_log(NULL, AV_LOG_ERROR, \"Cannot set output sample rate\\n\");\n    goto end;\n  }\n\n  /*\n   * Set the endpoints for the filter graph. 
The filter_graph will\n   * be linked to the graph described by filters_descr.\n   */\n\n  /*\n   * The buffer source output must be connected to the input pad of\n   * the first filter described by filters_descr; since the first\n   * filter input label is not specified, it is set to \"in\" by\n   * default.\n   */\n  outputs->name = av_strdup(\"in\");\n  outputs->filter_ctx = buffersrc_ctx;\n  outputs->pad_idx = 0;\n  outputs->next = NULL;\n\n  /*\n   * The buffer sink input must be connected to the output pad of\n   * the last filter described by filters_descr; since the last\n   * filter output label is not specified, it is set to \"out\" by\n   * default.\n   */\n  inputs->name = av_strdup(\"out\");\n  inputs->filter_ctx = buffersink_ctx;\n  inputs->pad_idx = 0;\n  inputs->next = NULL;\n\n  if ((ret = avfilter_graph_parse_ptr(filter_graph, filters_descr, &inputs,\n                                      &outputs, NULL)) < 0)\n    goto end;\n\n  if ((ret = avfilter_graph_config(filter_graph, NULL)) < 0) goto end;\n\n  /* Print summary of the sink buffer\n   * Note: args buffer is reused to store channel layout string */\n  outlink = buffersink_ctx->inputs[0];\n  av_channel_layout_describe(&outlink->ch_layout, args, sizeof(args));\n  av_log(NULL, AV_LOG_INFO, \"Output: srate:%dHz fmt:%s chlayout:%s\\n\",\n         (int)outlink->sample_rate,\n         (char *)av_x_if_null(\n             av_get_sample_fmt_name((AVSampleFormat)outlink->format), \"?\"),\n         args);\n\nend:\n  avfilter_inout_free(&inputs);\n  avfilter_inout_free(&outputs);\n\n  return ret;\n}\n\nstatic void sherpa_decode_frame(const AVFrame *frame,\n                                const SherpaOnnxOnlineRecognizer *recognizer,\n                                const SherpaOnnxOnlineStream *stream,\n                                const SherpaOnnxDisplay *display,\n                                int32_t *segment_id) {\n#define N 3200  // 200 ms. 
Sample rate is fixed to 16 kHz\n  static float samples[N];\n  static int nb_samples = 0;\n  const int16_t *p = (int16_t *)frame->data[0];\n\n  if (frame->nb_samples + nb_samples > N) {\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, 16000, samples, nb_samples);\n    while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n      SherpaOnnxDecodeOnlineStream(recognizer, stream);\n    }\n\n    const SherpaOnnxOnlineRecognizerResult *r =\n        SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n    if (strlen(r->text)) {\n      SherpaOnnxPrint(display, *segment_id, r->text);\n    }\n\n    if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n      if (strlen(r->text)) {\n        ++*segment_id;\n      }\n      SherpaOnnxOnlineStreamReset(recognizer, stream);\n    }\n\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n    nb_samples = 0;\n  }\n\n  for (int i = 0; i < frame->nb_samples; i++) {\n    samples[nb_samples++] = p[i] / 32768.;\n  }\n}\n\nstatic inline char *__av_err2str(int errnum) {\n  static char str[AV_ERROR_MAX_STRING_SIZE];\n  memset(str, 0, sizeof(str));\n  return av_make_error_string(str, AV_ERROR_MAX_STRING_SIZE, errnum);\n}\n\nint main(int argc, char **argv) {\n  int ret;\n  int num_threads = 1;\n  AVPacket *packet = av_packet_alloc();\n  AVFrame *frame = av_frame_alloc();\n  AVFrame *filt_frame = av_frame_alloc();\n  const char *kUsage =\n      \"\\n\"\n      \"Usage:\\n\"\n      \"  ./sherpa-onnx-ffmpeg \\\\\\n\"\n      \"    /path/to/tokens.txt \\\\\\n\"\n      \"    /path/to/encoder.onnx\\\\\\n\"\n      \"    /path/to/decoder.onnx\\\\\\n\"\n      \"    /path/to/joiner.onnx\\\\\\n\"\n      \"    /path/to/foo.wav [num_threads [decoding_method]]\"\n      \"\\n\\n\"\n      \"Default num_threads is 1.\\n\"\n      \"Valid decoding_method: greedy_search (default), modified_beam_search\\n\\n\"\n      \"Please refer to \\n\"\n      \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\\n\"\n      \"for a list of pre-trained 
models to download.\\n\";\n\n  if (!packet || !frame || !filt_frame) {\n    fprintf(stderr, \"Could not allocate frame or packet\\n\");\n    exit(1);\n  }\n\n  if (argc < 6 || argc > 8) {\n    fprintf(stderr, \"%s\\n\", kUsage);\n    return -1;\n  }\n\n  SherpaOnnxOnlineRecognizerConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model_config.tokens = argv[1];\n  config.model_config.transducer.encoder = argv[2];\n  config.model_config.transducer.decoder = argv[3];\n  config.model_config.transducer.joiner = argv[4];\n\n  // argv[6] is num_threads, whether or not decoding_method follows it\n  if (argc >= 7 && atoi(argv[6]) > 0) {\n    num_threads = atoi(argv[6]);\n  }\n\n  config.model_config.num_threads = num_threads;\n  config.model_config.debug = 0;\n\n  config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n\n  config.decoding_method = \"greedy_search\";\n  if (argc == 8) {\n    config.decoding_method = argv[7];\n  }\n\n  config.max_active_paths = 4;\n\n  config.enable_endpoint = 1;\n  config.rule1_min_trailing_silence = 2.4;\n  config.rule2_min_trailing_silence = 1.2;\n  config.rule3_min_utterance_length = 300;\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&config);\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n  int32_t segment_id = 0;\n\n  if ((ret = open_input_file(argv[5])) < 0) exit(1);\n\n  if ((ret = init_filters(filter_descr)) < 0) exit(1);\n\n  /* read all packets */\n  while (1) {\n    if ((ret = av_read_frame(fmt_ctx, packet)) < 0) break;\n\n    if (packet->stream_index == audio_stream_index) {\n      ret = avcodec_send_packet(dec_ctx, packet);\n      if (ret < 0) {\n        av_log(NULL, AV_LOG_ERROR,\n               \"Error while sending a packet to the decoder\\n\");\n        break;\n      }\n\n      while (ret >= 0) {\n        ret = avcodec_receive_frame(dec_ctx, frame);\n        if (ret == AVERROR(EAGAIN) || ret == 
AVERROR_EOF) {\n          break;\n        } else if (ret < 0) {\n          av_log(NULL, AV_LOG_ERROR,\n                 \"Error while receiving a frame from the decoder\\n\");\n          exit(1);\n        }\n\n        if (ret >= 0) {\n          /* push the audio data from decoded frame into the filtergraph */\n          if (av_buffersrc_add_frame_flags(buffersrc_ctx, frame,\n                                           AV_BUFFERSRC_FLAG_KEEP_REF) < 0) {\n            av_log(NULL, AV_LOG_ERROR,\n                   \"Error while feeding the audio filtergraph\\n\");\n            break;\n          }\n\n          /* pull filtered audio from the filtergraph */\n          while (1) {\n            ret = av_buffersink_get_frame(buffersink_ctx, filt_frame);\n            if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) break;\n            if (ret < 0) exit(1);\n            sherpa_decode_frame(filt_frame, recognizer, stream, display,\n                                &segment_id);\n            av_frame_unref(filt_frame);\n          }\n          av_frame_unref(frame);\n        }\n      }\n    }\n    av_packet_unref(packet);\n  }\n\n  // add some tail padding\n  float tail_paddings[4800] = {0};  // 0.3 seconds at 16 kHz sample rate\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, 16000, tail_paddings, 4800);\n  SherpaOnnxOnlineStreamInputFinished(stream);\n\n  while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream);\n  }\n\n  const SherpaOnnxOnlineRecognizerResult *r =\n      SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n  if (strlen(r->text)) {\n    SherpaOnnxPrint(display, segment_id, r->text);\n  }\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  SherpaOnnxDestroyDisplay(display);\n  SherpaOnnxDestroyOnlineStream(stream);\n  SherpaOnnxDestroyOnlineRecognizer(recognizer);\n\n  avfilter_graph_free(&filter_graph);\n  avcodec_free_context(&dec_ctx);\n  avformat_close_input(&fmt_ctx);\n  av_packet_free(&packet);\n  
av_frame_free(&frame);\n  av_frame_free(&filt_frame);\n\n  if (ret < 0 && ret != AVERROR_EOF) {\n    fprintf(stderr, \"Error occurred: %s\\n\", __av_err2str(ret));\n    exit(1);\n  }\n  fprintf(stderr, \"\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "flutter/.gitignore",
    "content": "# Do not remove or rename entries in this file, only add new ones\n# See https://github.com/flutter/flutter/issues/128635 for more context.\n\n# Miscellaneous\n*.class\n*.lock\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# Visual Studio Code related\n.classpath\n.project\n.settings/\n.vscode/*\n\n# Flutter repo-specific\n/bin/cache/\n/bin/internal/bootstrap.bat\n/bin/internal/bootstrap.sh\n/bin/mingit/\n/dev/benchmarks/mega_gallery/\n/dev/bots/.recipe_deps\n/dev/bots/android_tools/\n/dev/devicelab/ABresults*.json\n/dev/docs/doc/\n/dev/docs/api_docs.zip\n/dev/docs/flutter.docs.zip\n/dev/docs/lib/\n/dev/docs/pubspec.yaml\n/dev/integration_tests/**/xcuserdata\n/dev/integration_tests/**/Pods\n/packages/flutter/coverage/\nversion\nanalysis_benchmark.json\n\n# packages file containing multi-root paths\n.packages.generated\n\n# Flutter/Dart/Pub related\n**/doc/api/\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n**/generated_plugin_registrant.dart\n.packages\n.pub-preload-cache/\n.pub-cache/\n.pub/\nbuild/\nflutter_*.png\nlinked_*.ds\nunlinked.ds\nunlinked_spec.ds\n\n# Android related\n**/android/**/gradle-wrapper.jar\n.gradle/\n**/android/captures/\n**/android/gradlew\n**/android/gradlew.bat\n**/android/local.properties\n**/android/**/GeneratedPluginRegistrant.java\n**/android/key.properties\n*.jks\n\n# iOS/XCode 
related\n**/ios/**/*.mode1v3\n**/ios/**/*.mode2v3\n**/ios/**/*.moved-aside\n**/ios/**/*.pbxuser\n**/ios/**/*.perspectivev3\n**/ios/**/*sync/\n**/ios/**/.sconsign.dblite\n**/ios/**/.tags*\n**/ios/**/.vagrant/\n**/ios/**/DerivedData/\n**/ios/**/Icon?\n**/ios/**/Pods/\n**/ios/**/.symlinks/\n**/ios/**/profile\n**/ios/**/xcuserdata\n**/ios/.generated/\n**/ios/Flutter/.last_build_id\n**/ios/Flutter/App.framework\n**/ios/Flutter/Flutter.framework\n**/ios/Flutter/Flutter.podspec\n**/ios/Flutter/Generated.xcconfig\n**/ios/Flutter/ephemeral\n**/ios/Flutter/app.flx\n**/ios/Flutter/app.zip\n**/ios/Flutter/flutter_assets/\n**/ios/Flutter/flutter_export_environment.sh\n**/ios/ServiceDefinitions.json\n**/ios/Runner/GeneratedPluginRegistrant.*\n\n# macOS\n**/Flutter/ephemeral/\n**/Pods/\n**/macos/Flutter/GeneratedPluginRegistrant.swift\n**/macos/Flutter/ephemeral\n**/xcuserdata/\n\n# Windows\n**/windows/flutter/generated_plugin_registrant.cc\n**/windows/flutter/generated_plugin_registrant.h\n**/windows/flutter/generated_plugins.cmake\n\n# Linux\n**/linux/flutter/generated_plugin_registrant.cc\n**/linux/flutter/generated_plugin_registrant.h\n**/linux/flutter/generated_plugins.cmake\n\n# Coverage\ncoverage/\n\n# Symbols\napp.*.symbols\n\n# Exceptions to above rules.\n!**/ios/**/default.mode1v3\n!**/ios/**/default.mode2v3\n!**/ios/**/default.pbxuser\n!**/ios/**/default.perspectivev3\n!/packages/flutter_tools/test/data/dart_dependencies_test/**/.packages\n!/dev/ci/**/Gemfile.lock\n!.vscode/settings.json\n"
  },
  {
    "path": "flutter/README.md",
    "content": "# Introduction\n\nThis directory contains the source code of the flutter\npackage [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)\n\nCaution: You are not expected to use this directory directly.\n\nThis directory is for developers only.\n\nFor common users, please use our package at <https://pub.dev/packages/sherpa_onnx>\n"
  },
  {
    "path": "flutter/notes.md",
    "content": "# Introduction\n\nThis file keeps some notes about how packages in this directory\nare created.\n\n1. Create `sherpa_onnx`.\n\n```bash\nflutter create --template plugin sherpa_onnx\n```\n\n2. Create `sherpa_onnx_macos`\n\n```bash\nflutter create --template plugin_ffi --platforms macos sherpa_onnx_macos\n```\n\n3. Create `sherpa_onnx_linux`\n\n```bash\nflutter create --template plugin_ffi --platforms linux sherpa_onnx_linux\n```\n\n4. Create `sherpa_onnx_windows`\n\n```bash\nflutter create --template plugin_ffi --platforms linux sherpa_onnx_windows\n```\n\n5. Create `sherpa_onnx_android`\n\n```bash\nflutter create --template plugin_ffi --platforms android --org com.k2fsa.sherpa.onnx sherpa_onnx_android\n```\n\n6. Create `sherpa_onnx_ios`\n\n```bash\nflutter create --template plugin_ffi --platforms ios sherpa_onnx_ios\n```\n"
  },
  {
    "path": "flutter/notes2.md",
    "content": "# Some use commands while learning flutter/dart\n\n## macOS\n\n1. Build required libraries\n\n```bash\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\nmkdir build\ncd build\n\ncmake -DCMAKE_INSTALL_PREFIX=./install -DBUILD_SHARED_LIBS=ON -DCMAKE_OSX_ARCHITECTURES=\"x86_64;arm64\" ..\nmake install\ncd ../sherpa-onnx/flutter/\ncp -v  ../../build/install/lib/lib* ./macos/\n```\n\n2. Test for speaker identification\n\n```bash\ncd sherpa-onnx/sherpa-onnx/flutter/example\nmkdir assets\n```\n\n\n## Useful commands\n```\nflutter pub publish --dry-run\nflutter run -d macos\nflutter run -d linux\nflutter run -d windows\n\nflutter build macos\n\nflutter run --release -d macos\n\n# add platform to an existing project\nflutter create --platforms=windows,macos,linux .\n\ndart analyze\n\nFLUTTER_XCODE_ARCHS=arm64\nFLUTTER_XCODE_ARCHS=x86_64\n```\n\n## Examples\n\n  - https://dart.dev/tools/pub/automated-publishing\n\n     Use GitHub actions to publish\n\n  - https://dart.dev/tools/pub/pubspec\n\n     It describes the format of ./pubspec.yaml\n\n  - https://github.com/folksable/blurhash_ffi/\n\n      It supports ios, android, linux, macos, and windows.\n\n - https://github.com/alexmercerind/dart_vlc\n - https://github.com/dart-lang/native/tree/main/pkgs/jni\n"
  },
  {
    "path": "flutter/publish.md",
    "content": "# Note\n\nBefore publishing a new version, please first run\n```\nflutter analyze\n```\nto check if there are any issues.\n"
  },
  {
    "path": "flutter/sherpa_onnx/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx/example/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n**/doc/api/\n**/ios/Flutter/.last_build_id\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n.pub-cache/\n.pub/\n/build/\n\n# Symbolication related\napp.*.symbols\n\n# Obfuscation related\napp.*.map.json\n\n# Android Studio will place build artifacts here\n/android/app/debug\n/android/app/profile\n/android/app/release\n"
  },
  {
    "path": "flutter/sherpa_onnx/example/README.md",
    "content": "# Introduction\n\nPlease find examples at\n\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/flutter-examples\n\nand\n\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples\n"
  },
  {
    "path": "flutter/sherpa_onnx/example/example.md",
    "content": "# sherpa-onnx app example\n\n## Flutter examples\n\n| Functions | URL | Supported Platforms|\n|---|---|---|\n|Streaming speech recognition| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter-examples/streaming_asr)| Android, iOS, Linux, macOS, Windows|\n|Speech synthesis| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter-examples/tts)| Android, iOS, Linux, macOS, Windows|\n\n## Pure dart-examples\n\nHint: All of the following functions can be used in Flutter, even if some of them are only provided in pure dart api examples.\n\n| Functions | URL | Supported Platforms|\n|---|---|---|\n|Speaker diarization| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/speaker-diarization)| macOS, Windows, Linux|\n|Streaming speech recognition| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/streaming-asr)| macOS, Windows, Linux|\n|Non-Streaming speech recognition| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/non-streaming-asr)| macOS, Windows, Linux|\n|Text to speech| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/tts)| macOS, Windows, Linux|\n|Voice activity detection (VAD)| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/vad)| macOS, Windows, Linux|\n|Voice activity detection (VAD) with non-streaming speech recognition| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/vad-with-non-streaming-asr)| macOS, Windows, Linux|\n|Speaker identification and verification| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/speaker-identification)| macOS, Windows, Linux|\n|Audio tagging| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/audio-tagging)| macOS, Windows, Linux|\n|Keyword spotter| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/keyword-spotter)| macOS, Windows, 
Linux|\n|Add punctuations| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/add-punctuations)| macOS, Windows, Linux|\n|Speech enhancement/denoising| [Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/dart-api-examples/speech-enhancement-gtcrn) GTCRN and DPDFNet (`baseline`, `dpdfnet2`, `dpdfnet4`, `dpdfnet8` for 16 kHz ASR, `dpdfnet2_48khz_hr` for 48 kHz output)| macOS, Windows, Linux|\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/sherpa_onnx.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:ffi';\n\n/// Dart bindings for the public sherpa-onnx inference APIs.\n///\n/// Import this library to access offline and streaming ASR, text-to-speech,\n/// VAD, speaker identification, speaker diarization, punctuation restoration,\n/// audio tagging, spoken language identification, speech denoising, and WAV\n/// I/O helpers from a single entry point.\n///\n/// Before creating any runtime object, call [initBindings] once so the package\n/// can load the underlying native `sherpa-onnx-c-api` library for the current\n/// platform.\n///\n/// For concrete end-to-end usage, see `dart-api-examples/` in the repository,\n/// especially:\n///\n/// - `non-streaming-asr/bin/sense-voice.dart`\n/// - `non-streaming-asr/bin/whisper.dart`\n/// - `non-streaming-asr/bin/nemo-transducer.dart`\n/// - `streaming-asr/bin/zipformer-transducer.dart`\n/// - `tts/bin/pocket-en.dart`\n/// - `vad/bin/vad.dart`\n/// - `speaker-diarization/`\n\nexport 'src/audio_tagging.dart';\nexport 'src/feature_config.dart';\nexport 'src/homophone_replacer_config.dart';\nexport 'src/keyword_spotter.dart';\nexport 'src/offline_punctuation.dart';\nexport 'src/offline_recognizer.dart';\nexport 'src/offline_speaker_diarization.dart';\nexport 'src/offline_speech_denoiser.dart';\nexport 'src/offline_stream.dart';\nexport 'src/online_speech_denoiser.dart';\nexport 'src/online_punctuation.dart';\nexport 'src/online_recognizer.dart';\nexport 'src/online_stream.dart';\nexport 'src/speaker_identification.dart';\nexport 'src/spoken_language_identification.dart';\nexport 'src/tts.dart';\nexport 'src/vad.dart';\nexport 'src/version.dart';\nexport 'src/wave_reader.dart';\nexport 'src/wave_writer.dart';\n\nimport 'src/sherpa_onnx_bindings.dart';\n\nString? 
_path;\n\n// see also\n// https://github.com/flutter/codelabs/blob/main/ffigen_codelab/step_05/lib/ffigen_app.dart\n// https://api.flutter.dev/flutter/dart-io/Platform-class.html\nfinal DynamicLibrary _dylib = () {\n  if (Platform.isMacOS) {\n    if (_path == null) {\n      return DynamicLibrary.open('libsherpa-onnx-c-api.dylib');\n    } else {\n      return DynamicLibrary.open('$_path/libsherpa-onnx-c-api.dylib');\n    }\n  }\n\n  if (Platform.isIOS) {\n    if (_path == null) {\n      return DynamicLibrary.open('sherpa_onnx.framework/sherpa_onnx');\n    } else {\n      return DynamicLibrary.open('$_path/sherpa_onnx.framework/sherpa_onnx');\n    }\n  }\n\n  if (Platform.isAndroid || Platform.isLinux) {\n    if (_path == null) {\n      return DynamicLibrary.open('libsherpa-onnx-c-api.so');\n    } else {\n      return DynamicLibrary.open('$_path/libsherpa-onnx-c-api.so');\n    }\n  }\n\n  if (Platform.isWindows) {\n    if (_path == null) {\n      return DynamicLibrary.open('sherpa-onnx-c-api.dll');\n    } else {\n      return DynamicLibrary.open('$_path\\\\sherpa-onnx-c-api.dll');\n    }\n  }\n\n  throw UnsupportedError('Unknown platform: ${Platform.operatingSystem}');\n}();\n\n/// Initialize the native sherpa-onnx bindings.\n///\n/// Call this exactly once before using any other API from this package.\n///\n/// If [p] is provided, it is treated as the directory containing the native\n/// dynamic library for desktop platforms, or the framework root on Apple\n/// platforms. If omitted, the package tries to load the library from the\n/// default platform-specific filename.\nvoid initBindings([String? p]) {\n  _path ??= p;\n  SherpaOnnxBindings.init(_dylib);\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/audio_tagging.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'package:ffi/ffi.dart';\n\nimport './offline_stream.dart';\nimport './sherpa_onnx_bindings.dart';\n\n/// Offline audio tagging.\n///\n/// This module classifies complete audio clips and returns the most likely\n/// events. See `dart-api-examples/audio-tagging/` for working examples.\n///\n/// Example:\n///\n/// ```dart\n/// final modelConfig = AudioTaggingModelConfig(\n///   zipformer: const OfflineZipformerAudioTaggingModelConfig(\n///     model: './sherpa-onnx-zipformer-audio-tagging/model.int8.onnx',\n///   ),\n///   numThreads: 1,\n///   debug: true,\n/// );\n///\n/// final config = AudioTaggingConfig(\n///   model: modelConfig,\n///   labels: './sherpa-onnx-zipformer-audio-tagging/class_labels_indices.csv',\n/// );\n///\n/// final tagger = AudioTagging(config: config);\n/// final wave = readWave('./test.wav');\n/// final stream = tagger.createStream();\n/// stream.acceptWaveform(samples: wave.samples, sampleRate: wave.sampleRate);\n/// final events = tagger.compute(stream: stream, topK: 5);\n/// print(events);\n/// stream.free();\n/// tagger.free();\n/// ```\nclass OfflineZipformerAudioTaggingModelConfig {\n  const OfflineZipformerAudioTaggingModelConfig({this.model = ''});\n\n  factory OfflineZipformerAudioTaggingModelConfig.fromJson(\n      Map<String, dynamic> map) {\n    return OfflineZipformerAudioTaggingModelConfig(\n      model: map['model'] ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineZipformerAudioTaggingModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'model': model,\n    };\n  }\n\n  final String model;\n}\n\n/// Aggregate model configuration for audio tagging.\n///\n/// Configure either [zipformer] or [ced] for typical use.\nclass AudioTaggingModelConfig {\n  AudioTaggingModelConfig(\n      {this.zipformer = const OfflineZipformerAudioTaggingModelConfig(),\n      this.ced = '',\n      this.numThreads = 1,\n      this.provider = 'cpu',\n      this.debug = true});\n\n  factory AudioTaggingModelConfig.fromJson(Map<String, dynamic> map) {\n    return AudioTaggingModelConfig(\n      zipformer:\n          OfflineZipformerAudioTaggingModelConfig.fromJson(map['zipformer']),\n      ced: map['ced'] ?? '',\n      numThreads: map['numThreads'] ?? 1,\n      provider: map['provider'] ?? 'cpu',\n      debug: map['debug'] ?? true,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'AudioTaggingModelConfig(zipformer: $zipformer, ced: $ced, numThreads: $numThreads, provider: $provider, debug: $debug)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'zipformer': zipformer.toJson(),\n      'ced': ced,\n      'numThreads': numThreads,\n      'provider': provider,\n      'debug': debug,\n    };\n  }\n\n  final OfflineZipformerAudioTaggingModelConfig zipformer;\n  final String ced;\n  final int numThreads;\n  final String provider;\n  final bool debug;\n}\n\n/// Top-level configuration for [AudioTagging].\nclass AudioTaggingConfig {\n  AudioTaggingConfig({required this.model, this.labels = ''});\n\n  factory AudioTaggingConfig.fromJson(Map<String, dynamic> map) {\n    return AudioTaggingConfig(\n      model: AudioTaggingModelConfig.fromJson(map['model']),\n      labels: map['labels'] ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'AudioTaggingConfig(model: $model, labels: $labels)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'model': model.toJson(),\n      'labels': labels,\n    };\n  }\n\n  final AudioTaggingModelConfig model;\n  final String labels;\n}\n\n/// One predicted audio event.\nclass AudioEvent {\n  AudioEvent({required this.name, required this.index, required this.prob});\n\n  factory AudioEvent.fromJson(Map<String, dynamic> map) {\n    return AudioEvent(\n      name: map['name'],\n      index: map['index'],\n      prob: map['prob'],\n    );\n  }\n\n  @override\n  String toString() {\n    return 'AudioEvent(name: $name, index: $index, prob: $prob)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'name': name,\n      'index': index,\n      'prob': prob,\n    };\n  }\n\n  final String name;\n  final int index;\n  final double prob;\n}\n\n/// Offline audio tagger.\nclass AudioTagging {\n  AudioTagging.fromPtr({required this.ptr, required this.config});\n\n  AudioTagging._({required this.ptr, required this.config});\n\n  /// Create an audio tagger from [config].\n  factory AudioTagging({required AudioTaggingConfig config}) {\n    final c = calloc<SherpaOnnxAudioTaggingConfig>();\n\n    final zipformerPtr = config.model.zipformer.model.toNativeUtf8();\n    c.ref.model.zipformer.model = zipformerPtr;\n\n    final cedPtr = config.model.ced.toNativeUtf8();\n    c.ref.model.ced = cedPtr;\n\n    c.ref.model.numThreads = config.model.numThreads;\n\n    final providerPtr = config.model.provider.toNativeUtf8();\n    c.ref.model.provider = providerPtr;\n\n    c.ref.model.debug = config.model.debug ? 
1 : 0;\n\n    final labelsPtr = config.labels.toNativeUtf8();\n    c.ref.labels = labelsPtr;\n\n    if (SherpaOnnxBindings.sherpaOnnxCreateAudioTagging == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final ptr =\n        SherpaOnnxBindings.sherpaOnnxCreateAudioTagging?.call(c) ?? nullptr;\n\n    calloc.free(labelsPtr);\n    calloc.free(providerPtr);\n    calloc.free(cedPtr);\n    calloc.free(zipformerPtr);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create audio tagging. Please check your config\");\n    }\n\n    return AudioTagging._(ptr: ptr, config: config);\n  }\n\n  /// Release the native tagger.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyAudioTagging == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.sherpaOnnxDestroyAudioTagging?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create an offline stream for one audio clip.\n  OfflineStream createStream() {\n    if (SherpaOnnxBindings.sherpaOnnxAudioTaggingCreateOfflineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    final p = SherpaOnnxBindings.sherpaOnnxAudioTaggingCreateOfflineStream\n            ?.call(ptr) ??\n        nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    return OfflineStream(ptr: p);\n  }\n\n  /// Compute the top [topK] events for [stream].\n  List<AudioEvent> compute({required OfflineStream stream, required int topK}) {\n    if (SherpaOnnxBindings.sherpaOnnxAudioTaggingCompute == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return <AudioEvent>[];\n    }\n\n    final pp = 
SherpaOnnxBindings.sherpaOnnxAudioTaggingCompute\n            ?.call(ptr, stream.ptr, topK) ??\n        nullptr;\n\n    final ans = <AudioEvent>[];\n\n    if (pp == nullptr) {\n      return ans;\n    }\n\n    var i = 0;\n    while (pp[i] != nullptr) {\n      final p = pp[i];\n\n      final name = p.ref.name.toDartString();\n      final index = p.ref.index;\n      final prob = p.ref.prob;\n      final e = AudioEvent(name: name, index: index, prob: prob);\n      ans.add(e);\n\n      i += 1;\n    }\n\n    SherpaOnnxBindings.sherpaOnnxAudioTaggingFreeResults?.call(pp);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxAudioTagging> ptr;\n  final AudioTaggingConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/feature_config.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\n\n/// Feature extraction settings shared by recognizers and keyword spotting.\n///\n/// In most cases the defaults of 16 kHz audio and 80-dimensional filterbank\n/// features should match the model packages provided in the repository.\nclass FeatureConfig {\n  const FeatureConfig({this.sampleRate = 16000, this.featureDim = 80});\n\n  factory FeatureConfig.fromJson(Map<String, dynamic> json) {\n    return FeatureConfig(\n      sampleRate: json['sampleRate'] as int? ?? 16000,\n      featureDim: json['featureDim'] as int? ?? 80,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'FeatureConfig(sampleRate: $sampleRate, featureDim: $featureDim)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'sampleRate': sampleRate,\n        'featureDim': featureDim,\n      };\n\n  final int sampleRate;\n  final int featureDim;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/homophone_replacer_config.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\n\n/// Optional resources for homophone replacement during decoding.\n///\n/// Set [lexicon] and [ruleFsts] when using models or grammars that support\n/// homophone-aware post-processing.\nclass HomophoneReplacerConfig {\n  const HomophoneReplacerConfig(\n      {this.dictDir = '', this.lexicon = '', this.ruleFsts = ''});\n\n  factory HomophoneReplacerConfig.fromJson(Map<String, dynamic> json) {\n    return HomophoneReplacerConfig(\n      lexicon: json['lexicon'] as String? ?? '',\n      ruleFsts: json['ruleFsts'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'HomophoneReplacerConfig(lexicon: $lexicon, ruleFsts: $ruleFsts)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'lexicon': lexicon,\n        'ruleFsts': ruleFsts,\n      };\n\n  final String dictDir; // unused\n  final String lexicon;\n  final String ruleFsts;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/keyword_spotter.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:convert';\nimport 'dart:ffi';\n\nimport 'package:ffi/ffi.dart';\n\nimport './feature_config.dart';\nimport './online_stream.dart';\nimport './online_recognizer.dart';\nimport './sherpa_onnx_bindings.dart';\nimport './utils.dart';\n\n/// Streaming keyword spotting.\n///\n/// See `dart-api-examples/keyword-spotter/` for end-to-end usage.\n///\n/// Example:\n///\n/// ```dart\n/// final spotter = KeywordSpotter(\n///   KeywordSpotterConfig(\n///     model: onlineModelConfig,\n///     keywordsFile: './keywords.txt',\n///   ),\n/// );\n///\n/// final stream = spotter.createStream();\n/// stream.acceptWaveform(samples: chunk, sampleRate: 16000);\n/// while (spotter.isReady(stream)) {\n///   spotter.decode(stream);\n/// }\n/// print(spotter.getResult(stream).keyword);\n/// ```\nclass KeywordSpotterConfig {\n  const KeywordSpotterConfig({\n    this.feat = const FeatureConfig(),\n    required this.model,\n    this.maxActivePaths = 4,\n    this.numTrailingBlanks = 1,\n    this.keywordsScore = 1.0,\n    this.keywordsThreshold = 0.25,\n    this.keywordsFile = '',\n    this.keywordsBuf = '',\n    this.keywordsBufSize = 0,\n  });\n\n  factory KeywordSpotterConfig.fromJson(Map<String, dynamic> json) {\n    return KeywordSpotterConfig(\n      feat: json['feat'] != null\n          ? FeatureConfig.fromJson(json['feat'] as Map<String, dynamic>)\n          : const FeatureConfig(),\n      model: OnlineModelConfig.fromJson(json['model'] as Map<String, dynamic>),\n      maxActivePaths: json['maxActivePaths'] as int? ?? 4,\n      numTrailingBlanks: json['numTrailingBlanks'] as int? ?? 1,\n      keywordsScore: (json['keywordsScore'] as num?)?.toDouble() ?? 1.0,\n      keywordsThreshold:\n          (json['keywordsThreshold'] as num?)?.toDouble() ?? 0.25,\n      keywordsFile: json['keywordsFile'] as String? ?? '',\n      keywordsBuf: json['keywordsBuf'] as String? ?? 
'',\n      keywordsBufSize: json['keywordsBufSize'] as int? ?? 0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'KeywordSpotterConfig(feat: $feat, model: $model, maxActivePaths: $maxActivePaths, numTrailingBlanks: $numTrailingBlanks, keywordsScore: $keywordsScore, keywordsThreshold: $keywordsThreshold, keywordsFile: $keywordsFile, keywordsBuf: $keywordsBuf, keywordsBufSize: $keywordsBufSize)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'feat': feat.toJson(),\n        'model': model.toJson(),\n        'maxActivePaths': maxActivePaths,\n        'numTrailingBlanks': numTrailingBlanks,\n        'keywordsScore': keywordsScore,\n        'keywordsThreshold': keywordsThreshold,\n        'keywordsFile': keywordsFile,\n        'keywordsBuf': keywordsBuf,\n        'keywordsBufSize': keywordsBufSize,\n      };\n\n  final FeatureConfig feat;\n  final OnlineModelConfig model;\n\n  final int maxActivePaths;\n  final int numTrailingBlanks;\n\n  final double keywordsScore;\n  final double keywordsThreshold;\n  final String keywordsFile;\n  final String keywordsBuf;\n  final int keywordsBufSize;\n}\n\n/// Result returned by [KeywordSpotter.getResult].\nclass KeywordResult {\n  KeywordResult({required this.keyword});\n\n  factory KeywordResult.fromJson(Map<String, dynamic> json) {\n    return KeywordResult(\n      keyword: json['keyword'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'KeywordResult(keyword: $keyword)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'keyword': keyword,\n      };\n\n  final String keyword;\n}\n\n/// Streaming keyword spotter.\nclass KeywordSpotter {\n  KeywordSpotter.fromPtr({required this.ptr, required this.config});\n\n  KeywordSpotter._({required this.ptr, required this.config});\n\n  /// Create a keyword spotter from [config].\n  factory KeywordSpotter(KeywordSpotterConfig config) {\n    final c = calloc<SherpaOnnxKeywordSpotterConfig>();\n    c.ref.feat.sampleRate = config.feat.sampleRate;\n    c.ref.feat.featureDim = config.feat.featureDim;\n\n    // transducer\n    c.ref.model.transducer.encoder =\n        config.model.transducer.encoder.toNativeUtf8();\n    c.ref.model.transducer.decoder =\n        config.model.transducer.decoder.toNativeUtf8();\n    c.ref.model.transducer.joiner =\n        config.model.transducer.joiner.toNativeUtf8();\n\n    // paraformer\n    c.ref.model.paraformer.encoder =\n        config.model.paraformer.encoder.toNativeUtf8();\n    c.ref.model.paraformer.decoder =\n        config.model.paraformer.decoder.toNativeUtf8();\n\n    // zipformer2Ctc\n    c.ref.model.zipformer2Ctc.model =\n        config.model.zipformer2Ctc.model.toNativeUtf8();\n\n    // nemoCtc\n    c.ref.model.nemoCtc.model = config.model.nemoCtc.model.toNativeUtf8();\n\n    c.ref.model.tokens = config.model.tokens.toNativeUtf8();\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n    c.ref.model.debug = config.model.debug ? 
1 : 0;\n    c.ref.model.modelType = config.model.modelType.toNativeUtf8();\n    c.ref.model.modelingUnit = config.model.modelingUnit.toNativeUtf8();\n    c.ref.model.bpeVocab = config.model.bpeVocab.toNativeUtf8();\n\n    c.ref.maxActivePaths = config.maxActivePaths;\n    c.ref.numTrailingBlanks = config.numTrailingBlanks;\n    c.ref.keywordsScore = config.keywordsScore;\n    c.ref.keywordsThreshold = config.keywordsThreshold;\n    c.ref.keywordsFile = config.keywordsFile.toNativeUtf8();\n    c.ref.keywordsBuf = config.keywordsBuf.toNativeUtf8();\n    c.ref.keywordsBufSize = config.keywordsBufSize;\n\n    if (SherpaOnnxBindings.createKeywordSpotter == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final ptr = SherpaOnnxBindings.createKeywordSpotter?.call(c) ?? nullptr;\n\n    calloc.free(c.ref.keywordsBuf);\n    calloc.free(c.ref.keywordsFile);\n    calloc.free(c.ref.model.bpeVocab);\n    calloc.free(c.ref.model.modelingUnit);\n    calloc.free(c.ref.model.modelType);\n    calloc.free(c.ref.model.provider);\n    calloc.free(c.ref.model.tokens);\n    calloc.free(c.ref.model.nemoCtc.model);\n    calloc.free(c.ref.model.zipformer2Ctc.model);\n    calloc.free(c.ref.model.paraformer.encoder);\n    calloc.free(c.ref.model.paraformer.decoder);\n\n    calloc.free(c.ref.model.transducer.encoder);\n    calloc.free(c.ref.model.transducer.decoder);\n    calloc.free(c.ref.model.transducer.joiner);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create kws. 
Please check your config\");\n    }\n\n    return KeywordSpotter._(ptr: ptr, config: config);\n  }\n\n  /// Release the native keyword spotter.\n  void free() {\n    if (SherpaOnnxBindings.destroyKeywordSpotter == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyKeywordSpotter?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create a streaming input stream.\n  ///\n  /// If [keywords] is provided, it overrides the configured keywords for that\n  /// stream.\n  OnlineStream createStream({String keywords = ''}) {\n    if (keywords == '') {\n      if (SherpaOnnxBindings.createKeywordStream == null) {\n        throw Exception(\"Please initialize sherpa-onnx first\");\n      }\n    } else {\n      if (SherpaOnnxBindings.createKeywordStreamWithKeywords == null) {\n        throw Exception(\"Please initialize sherpa-onnx first\");\n      }\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    if (keywords == '') {\n      final p = SherpaOnnxBindings.createKeywordStream?.call(ptr) ?? 
nullptr;\n      if (p == nullptr) {\n        throw Exception(\"Failed to create online stream\");\n      }\n      return OnlineStream(ptr: p);\n    }\n\n    final utf8 = keywords.toNativeUtf8();\n    final p =\n        SherpaOnnxBindings.createKeywordStreamWithKeywords?.call(ptr, utf8) ??\n            nullptr;\n    calloc.free(utf8);\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    return OnlineStream(ptr: p);\n  }\n\n  /// Return `true` if [stream] has enough audio for another decode step.\n  bool isReady(OnlineStream stream) {\n    if (SherpaOnnxBindings.isKeywordStreamReady == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return false;\n    }\n\n    int ready =\n        SherpaOnnxBindings.isKeywordStreamReady?.call(ptr, stream.ptr) ?? 0;\n\n    return ready == 1;\n  }\n\n  /// Fetch the current keyword spotting result for [stream].\n  KeywordResult getResult(OnlineStream stream) {\n    if (SherpaOnnxBindings.getKeywordResultAsJson == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return KeywordResult(keyword: '');\n    }\n\n    final json =\n        SherpaOnnxBindings.getKeywordResultAsJson?.call(ptr, stream.ptr) ??\n            nullptr;\n    if (json == nullptr) {\n      return KeywordResult(keyword: '');\n    }\n\n    final parsedJson = jsonDecode(toDartString(json));\n\n    SherpaOnnxBindings.freeKeywordResultJson?.call(json);\n\n    return KeywordResult(\n      keyword: parsedJson['keyword'],\n    );\n  }\n\n  /// Decode one incremental step for [stream].\n  void decode(OnlineStream stream) {\n    if (SherpaOnnxBindings.decodeKeywordStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return;\n    }\n    
SherpaOnnxBindings.decodeKeywordStream?.call(ptr, stream.ptr);\n  }\n\n  /// Reset the internal state for [stream].\n  void reset(OnlineStream stream) {\n    if (SherpaOnnxBindings.resetKeywordStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.resetKeywordStream?.call(ptr, stream.ptr);\n  }\n\n  Pointer<SherpaOnnxKeywordSpotter> ptr;\n  KeywordSpotterConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/offline_punctuation.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Offline punctuation restoration.\n///\n/// This is intended for complete text strings when you want one-shot\n/// punctuation insertion. See `dart-api-examples/add-punctuations/`.\nclass OfflinePunctuationModelConfig {\n  OfflinePunctuationModelConfig(\n      {required this.ctTransformer,\n      this.numThreads = 1,\n      this.provider = 'cpu',\n      this.debug = true});\n\n  factory OfflinePunctuationModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflinePunctuationModelConfig(\n      ctTransformer: json['ctTransformer'] as String,\n      numThreads: json['numThreads'] as int? ?? 1,\n      provider: json['provider'] as String? ?? 'cpu',\n      debug: json['debug'] as bool? ?? true,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflinePunctuationModelConfig(ctTransformer: $ctTransformer, numThreads: $numThreads, provider: $provider, debug: $debug)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'ctTransformer': ctTransformer,\n        'numThreads': numThreads,\n        'provider': provider,\n        'debug': debug,\n      };\n\n  final String ctTransformer;\n  final int numThreads;\n  final String provider;\n  final bool debug;\n}\n\n/// Top-level configuration for [OfflinePunctuation].\nclass OfflinePunctuationConfig {\n  OfflinePunctuationConfig({\n    required this.model,\n  });\n\n  factory OfflinePunctuationConfig.fromJson(Map<String, dynamic> json) {\n    return OfflinePunctuationConfig(\n      model: OfflinePunctuationModelConfig.fromJson(\n          json['model'] as Map<String, dynamic>),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflinePunctuationConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model.toJson(),\n      };\n\n  final OfflinePunctuationModelConfig model;\n}\n\n/// Offline punctuation 
restorer.\nclass OfflinePunctuation {\n  OfflinePunctuation.fromPtr({required this.ptr, required this.config});\n\n  OfflinePunctuation._({required this.ptr, required this.config});\n\n  /// Create an offline punctuator from [config].\n  factory OfflinePunctuation({required OfflinePunctuationConfig config}) {\n    if (SherpaOnnxBindings.sherpaOnnxCreateOfflinePunctuation == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOfflinePunctuationConfig>();\n\n    final ctTransformerPtr = config.model.ctTransformer.toNativeUtf8();\n    c.ref.model.ctTransformer = ctTransformerPtr;\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 1 : 0;\n\n    final providerPtr = config.model.provider.toNativeUtf8();\n    c.ref.model.provider = providerPtr;\n\n    final ptr =\n        SherpaOnnxBindings.sherpaOnnxCreateOfflinePunctuation?.call(c) ??\n            nullptr;\n\n    calloc.free(providerPtr);\n    calloc.free(ctTransformerPtr);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create offline punctuation. 
Please check your config\");\n    }\n\n    return OfflinePunctuation._(ptr: ptr, config: config);\n  }\n\n  /// Release the native punctuator.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyOfflinePunctuation == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.sherpaOnnxDestroyOfflinePunctuation?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Add punctuation to [text].\n  String addPunct(String text) {\n    if (SherpaOnnxBindings.sherpaOfflinePunctuationAddPunct == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return '';\n    }\n\n    final textPtr = text.toNativeUtf8();\n\n    final p = SherpaOnnxBindings.sherpaOfflinePunctuationAddPunct\n            ?.call(ptr, textPtr) ??\n        nullptr;\n\n    calloc.free(textPtr);\n\n    if (p == nullptr) {\n      return '';\n    }\n\n    final ans = p.toDartString();\n\n    SherpaOnnxBindings.sherpaOfflinePunctuationFreeText?.call(p);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxOfflinePunctuation> ptr;\n  final OfflinePunctuationConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/offline_recognizer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:convert';\nimport 'dart:ffi';\n\nimport 'package:ffi/ffi.dart';\n\nimport './feature_config.dart';\nimport './homophone_replacer_config.dart';\nimport './offline_stream.dart';\nimport './sherpa_onnx_bindings.dart';\nimport './utils.dart';\n\n/// Offline speech recognition.\n///\n/// This module covers non-streaming ASR model families such as transducer,\n/// Paraformer, Whisper, SenseVoice, Moonshine, Canary, Fire-Red-ASR, WeNet,\n/// Omnilingual-ASR, TeleSpeech-CTC, FunASR-Nano, and several CTC variants.\n///\n/// See `dart-api-examples/non-streaming-asr/bin/` for concrete usage,\n/// including `sense-voice.dart`, `whisper.dart`, `nemo-transducer.dart`,\n/// `moonshine_v2.dart`, and `fire-red-asr-ctc.dart`.\n///\n/// Example:\n///\n/// ```dart\n/// final whisper = OfflineWhisperModelConfig(\n///   encoder: './sherpa-onnx-whisper-tiny/encoder.int8.onnx',\n///   decoder: './sherpa-onnx-whisper-tiny/decoder.int8.onnx',\n/// );\n///\n/// final model = OfflineModelConfig(\n///   whisper: whisper,\n///   tokens: './sherpa-onnx-whisper-tiny/tokens.txt',\n///   modelType: 'whisper',\n///   numThreads: 1,\n/// );\n///\n/// final recognizer = OfflineRecognizer(OfflineRecognizerConfig(model: model));\n/// final wave = readWave('./test.wav');\n/// final stream = recognizer.createStream();\n/// stream.acceptWaveform(samples: wave.samples, sampleRate: wave.sampleRate);\n/// recognizer.decode(stream);\n/// print(recognizer.getResult(stream).text);\n/// stream.free();\n/// recognizer.free();\n/// ```\n\n/// Model files for an offline transducer recognizer.\n///\n/// This family is also used by NeMo Parakeet TDT-style examples.\nclass OfflineTransducerModelConfig {\n  const OfflineTransducerModelConfig({\n    this.encoder = '',\n    this.decoder = '',\n    this.joiner = '',\n  });\n\n  factory OfflineTransducerModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTransducerModelConfig(\n      
encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      joiner: json['joiner'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTransducerModelConfig(encoder: $encoder, decoder: $decoder, joiner: $joiner)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'encoder': encoder,\n    'decoder': decoder,\n    'joiner': joiner,\n  };\n\n  final String encoder;\n  final String decoder;\n  final String joiner;\n}\n\n/// Model files for an offline Paraformer recognizer.\nclass OfflineParaformerModelConfig {\n  const OfflineParaformerModelConfig({this.model = ''});\n\n  factory OfflineParaformerModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineParaformerModelConfig(model: json['model'] as String? ?? '');\n  }\n\n  @override\n  String toString() {\n    return 'OfflineParaformerModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for an offline NeMo CTC recognizer.\nclass OfflineNemoEncDecCtcModelConfig {\n  const OfflineNemoEncDecCtcModelConfig({this.model = ''});\n\n  factory OfflineNemoEncDecCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineNemoEncDecCtcModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineNemoEncDecCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for an offline Dolphin recognizer.\nclass OfflineDolphinModelConfig {\n  const OfflineDolphinModelConfig({this.model = ''});\n\n  factory OfflineDolphinModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineDolphinModelConfig(model: json['model'] as String? ?? 
'');\n  }\n\n  @override\n  String toString() {\n    return 'OfflineDolphinModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for an offline Zipformer CTC recognizer.\nclass OfflineZipformerCtcModelConfig {\n  const OfflineZipformerCtcModelConfig({this.model = ''});\n\n  factory OfflineZipformerCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineZipformerCtcModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineZipformerCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for an offline WeNet CTC recognizer.\nclass OfflineWenetCtcModelConfig {\n  const OfflineWenetCtcModelConfig({this.model = ''});\n\n  factory OfflineWenetCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineWenetCtcModelConfig(model: json['model'] as String? ?? '');\n  }\n\n  @override\n  String toString() {\n    return 'OfflineWenetCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for the omnilingual ASR CTC recognizer.\nclass OfflineOmnilingualAsrCtcModelConfig {\n  const OfflineOmnilingualAsrCtcModelConfig({this.model = ''});\n\n  factory OfflineOmnilingualAsrCtcModelConfig.fromJson(\n    Map<String, dynamic> json,\n  ) {\n    return OfflineOmnilingualAsrCtcModelConfig(\n      model: json['model'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineOmnilingualAsrCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for the MedASR CTC recognizer.\nclass OfflineMedAsrCtcModelConfig {\n  const OfflineMedAsrCtcModelConfig({this.model = ''});\n\n  factory OfflineMedAsrCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineMedAsrCtcModelConfig(model: json['model'] as String? ?? '');\n  }\n\n  @override\n  String toString() {\n    return 'OfflineMedAsrCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files for the Fire-Red-ASR CTC recognizer.\nclass OfflineFireRedAsrCtcModelConfig {\n  const OfflineFireRedAsrCtcModelConfig({this.model = ''});\n\n  factory OfflineFireRedAsrCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineFireRedAsrCtcModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineFireRedAsrCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files and prompt settings for FunASR-Nano.\nclass OfflineFunAsrNanoModelConfig {\n  const OfflineFunAsrNanoModelConfig({\n    this.encoderAdaptor = '',\n    this.llm = '',\n    this.embedding = '',\n    this.tokenizer = '',\n    this.systemPrompt = 'You are a helpful assistant.',\n    this.userPrompt = '语音转写：',\n    this.maxNewTokens = 512,\n    this.temperature = 1e-6,\n    this.topP = 0.8,\n    this.seed = 42,\n    this.language = '',\n    this.itn = 1,\n    this.hotwords = '',\n  });\n\n  factory OfflineFunAsrNanoModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineFunAsrNanoModelConfig(\n      encoderAdaptor: json['encoderAdaptor'] as String? ?? '',\n      llm: json['llm'] as String? ?? 
'',\n      embedding: json['embedding'] as String? ?? '',\n      tokenizer: json['tokenizer'] as String? ?? '',\n      systemPrompt:\n          json['systemPrompt'] as String? ?? 'You are a helpful assistant.',\n      userPrompt: json['userPrompt'] as String? ?? '语音转写：',\n      maxNewTokens: json['maxNewTokens'] as int? ?? 512,\n      temperature: (json['temperature'] as num?)?.toDouble() ?? 1e-6,\n      topP: (json['topP'] as num?)?.toDouble() ?? 0.8,\n      seed: json['seed'] as int? ?? 42,\n      language: json['language'] as String? ?? '',\n      itn: json['itn'] as int? ?? 1,\n      hotwords: json['hotwords'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineFunAsrNanoModelConfig(encoderAdaptor: $encoderAdaptor, llm: $llm, embedding: $embedding, tokenizer: $tokenizer, systemPrompt: $systemPrompt, userPrompt: $userPrompt, maxNewTokens: $maxNewTokens, temperature: $temperature, topP: $topP, seed: $seed, language: $language, itn: $itn, hotwords: $hotwords)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'encoderAdaptor': encoderAdaptor,\n    'llm': llm,\n    'embedding': embedding,\n    'tokenizer': tokenizer,\n    'systemPrompt': systemPrompt,\n    'userPrompt': userPrompt,\n    'maxNewTokens': maxNewTokens,\n    'temperature': temperature,\n    'topP': topP,\n    'seed': seed,\n    'language': language,\n    'itn': itn,\n    'hotwords': hotwords,\n  };\n\n  final String encoderAdaptor;\n  final String llm;\n  final String embedding;\n  final String tokenizer;\n  final String systemPrompt;\n  final String userPrompt;\n  final int maxNewTokens;\n  final double temperature;\n  final double topP;\n  final int seed;\n  final String language;\n  final int itn;\n  final String hotwords;\n}\n\n/// Model files and options for an offline Whisper recognizer.\nclass OfflineWhisperModelConfig {\n  const OfflineWhisperModelConfig({\n    this.encoder = '',\n    this.decoder = '',\n    this.language = '',\n    this.task = '',\n    this.tailPaddings = -1,\n    
this.enableTokenTimestamps = false,\n    this.enableSegmentTimestamps = false,\n  });\n\n  factory OfflineWhisperModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineWhisperModelConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      language: json['language'] as String? ?? '',\n      task: json['task'] as String? ?? '',\n      tailPaddings: json['tailPaddings'] as int? ?? -1,\n      enableTokenTimestamps: json['enableTokenTimestamps'] as bool? ?? false,\n      enableSegmentTimestamps:\n          json['enableSegmentTimestamps'] as bool? ?? false,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineWhisperModelConfig(encoder: $encoder, decoder: $decoder, language: $language, task: $task, tailPaddings: $tailPaddings, enableTokenTimestamps: $enableTokenTimestamps, enableSegmentTimestamps: $enableSegmentTimestamps)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'encoder': encoder,\n    'decoder': decoder,\n    'language': language,\n    'task': task,\n    'tailPaddings': tailPaddings,\n    'enableTokenTimestamps': enableTokenTimestamps,\n    'enableSegmentTimestamps': enableSegmentTimestamps,\n  };\n\n  final String encoder;\n  final String decoder;\n  final String language;\n  final String task;\n  final int tailPaddings;\n  final bool enableTokenTimestamps;\n  final bool enableSegmentTimestamps;\n}\n\n/// Model files and translation options for NeMo Canary.\nclass OfflineCanaryModelConfig {\n  const OfflineCanaryModelConfig({\n    this.encoder = '',\n    this.decoder = '',\n    this.srcLang = 'en',\n    this.tgtLang = 'en',\n    this.usePnc = true,\n  });\n\n  factory OfflineCanaryModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineCanaryModelConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      srcLang: json['srcLang'] as String? ?? 'en',\n      tgtLang: json['tgtLang'] as String? ?? 
'en',\n      usePnc: json['usePnc'] as bool? ?? true,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineCanaryModelConfig(encoder: $encoder, decoder: $decoder, srcLang: $srcLang, tgtLang: $tgtLang, usePnc: $usePnc)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'encoder': encoder,\n    'decoder': decoder,\n    'srcLang': srcLang,\n    'tgtLang': tgtLang,\n    'usePnc': usePnc,\n  };\n\n  final String encoder;\n  final String decoder;\n  final String srcLang;\n  final String tgtLang;\n  final bool usePnc;\n}\n\n/// Model files for the FireRedASR encoder-decoder recognizer.\nclass OfflineFireRedAsrModelConfig {\n  const OfflineFireRedAsrModelConfig({this.encoder = '', this.decoder = ''});\n\n  factory OfflineFireRedAsrModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineFireRedAsrModelConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineFireRedAsrModelConfig(encoder: $encoder, decoder: $decoder)';\n  }\n\n  Map<String, dynamic> toJson() => {'encoder': encoder, 'decoder': decoder};\n\n  final String encoder;\n  final String decoder;\n}\n\n/// Model files for Moonshine v1 or v2.\n///\n/// Moonshine v1 needs four models: [preprocessor], [encoder],\n/// [uncachedDecoder], and [cachedDecoder].\n///\n/// Moonshine v2 needs two models: [encoder] and [mergedDecoder].\nclass OfflineMoonshineModelConfig {\n  const OfflineMoonshineModelConfig({\n    this.preprocessor = '',\n    this.encoder = '',\n    this.uncachedDecoder = '',\n    this.cachedDecoder = '',\n    this.mergedDecoder = '',\n  });\n\n  factory OfflineMoonshineModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineMoonshineModelConfig(\n      preprocessor: json['preprocessor'] as String? ?? '',\n      encoder: json['encoder'] as String? ?? '',\n      uncachedDecoder: json['uncachedDecoder'] as String? ?? 
'',\n      cachedDecoder: json['cachedDecoder'] as String? ?? '',\n      mergedDecoder: json['mergedDecoder'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineMoonshineModelConfig(preprocessor: $preprocessor, encoder: $encoder, uncachedDecoder: $uncachedDecoder, cachedDecoder: $cachedDecoder, mergedDecoder: $mergedDecoder)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'preprocessor': preprocessor,\n    'encoder': encoder,\n    'uncachedDecoder': uncachedDecoder,\n    'cachedDecoder': cachedDecoder,\n    'mergedDecoder': mergedDecoder,\n  };\n\n  final String preprocessor;\n  final String encoder;\n  final String uncachedDecoder;\n  final String cachedDecoder;\n  final String mergedDecoder;\n}\n\n/// Model files for an offline TDNN recognizer.\nclass OfflineTdnnModelConfig {\n  const OfflineTdnnModelConfig({this.model = ''});\n\n  factory OfflineTdnnModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTdnnModelConfig(model: json['model'] as String? ?? '');\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTdnnModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model};\n\n  final String model;\n}\n\n/// Model files and options for SenseVoice.\n///\n/// In the examples, this is typically paired with the\n/// `sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8` package.\nclass OfflineSenseVoiceModelConfig {\n  const OfflineSenseVoiceModelConfig({\n    this.model = '',\n    this.language = '',\n    this.useInverseTextNormalization = false,\n  });\n\n  factory OfflineSenseVoiceModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineSenseVoiceModelConfig(\n      model: json['model'] as String? ?? '',\n      language: json['language'] as String? ?? '',\n      useInverseTextNormalization:\n          json['useInverseTextNormalization'] as bool? ?? 
false,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSenseVoiceModelConfig(model: $model, language: $language, useInverseTextNormalization: $useInverseTextNormalization)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'model': model,\n    'language': language,\n    'useInverseTextNormalization': useInverseTextNormalization,\n  };\n\n  final String model;\n  final String language;\n  final bool useInverseTextNormalization;\n}\n\n/// Optional external language model settings for offline ASR.\nclass OfflineLMConfig {\n  const OfflineLMConfig({this.model = '', this.scale = 1.0});\n\n  factory OfflineLMConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineLMConfig(\n      model: json['model'] as String? ?? '',\n      scale: (json['scale'] as num?)?.toDouble() ?? 1.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineLMConfig(model: $model, scale: $scale)';\n  }\n\n  Map<String, dynamic> toJson() => {'model': model, 'scale': scale};\n\n  final String model;\n  final double scale;\n}\n\n/// Aggregate model configuration for offline recognition.\n///\n/// In typical use, configure exactly one model family and set the shared\n/// options such as [tokens], [provider], and [numThreads].\n///\n/// For NeMo Parakeet-style transducer models, set [modelType] to\n/// `nemo_transducer`, matching the repository examples.\nclass OfflineModelConfig {\n  const OfflineModelConfig({\n    this.transducer = const OfflineTransducerModelConfig(),\n    this.paraformer = const OfflineParaformerModelConfig(),\n    this.nemoCtc = const OfflineNemoEncDecCtcModelConfig(),\n    this.whisper = const OfflineWhisperModelConfig(),\n    this.tdnn = const OfflineTdnnModelConfig(),\n    this.senseVoice = const OfflineSenseVoiceModelConfig(),\n    this.moonshine = const OfflineMoonshineModelConfig(),\n    this.fireRedAsr = const OfflineFireRedAsrModelConfig(),\n    this.dolphin = const OfflineDolphinModelConfig(),\n    this.zipformerCtc = const 
OfflineZipformerCtcModelConfig(),\n    this.canary = const OfflineCanaryModelConfig(),\n    this.wenetCtc = const OfflineWenetCtcModelConfig(),\n    this.omnilingual = const OfflineOmnilingualAsrCtcModelConfig(),\n    this.medasr = const OfflineMedAsrCtcModelConfig(),\n    this.funasrNano = const OfflineFunAsrNanoModelConfig(),\n    this.fireRedAsrCtc = const OfflineFireRedAsrCtcModelConfig(),\n    required this.tokens,\n    this.numThreads = 1,\n    this.debug = true,\n    this.provider = 'cpu',\n    this.modelType = '',\n    this.modelingUnit = '',\n    this.bpeVocab = '',\n    this.telespeechCtc = '',\n  });\n\n  factory OfflineModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineModelConfig(\n      transducer: json['transducer'] != null\n          ? OfflineTransducerModelConfig.fromJson(\n              json['transducer'] as Map<String, dynamic>,\n            )\n          : const OfflineTransducerModelConfig(),\n      paraformer: json['paraformer'] != null\n          ? OfflineParaformerModelConfig.fromJson(\n              json['paraformer'] as Map<String, dynamic>,\n            )\n          : const OfflineParaformerModelConfig(),\n      nemoCtc: json['nemoCtc'] != null\n          ? OfflineNemoEncDecCtcModelConfig.fromJson(\n              json['nemoCtc'] as Map<String, dynamic>,\n            )\n          : const OfflineNemoEncDecCtcModelConfig(),\n      whisper: json['whisper'] != null\n          ? OfflineWhisperModelConfig.fromJson(\n              json['whisper'] as Map<String, dynamic>,\n            )\n          : const OfflineWhisperModelConfig(),\n      tdnn: json['tdnn'] != null\n          ? OfflineTdnnModelConfig.fromJson(\n              json['tdnn'] as Map<String, dynamic>,\n            )\n          : const OfflineTdnnModelConfig(),\n      senseVoice: json['senseVoice'] != null\n          ? 
OfflineSenseVoiceModelConfig.fromJson(\n              json['senseVoice'] as Map<String, dynamic>,\n            )\n          : const OfflineSenseVoiceModelConfig(),\n      moonshine: json['moonshine'] != null\n          ? OfflineMoonshineModelConfig.fromJson(\n              json['moonshine'] as Map<String, dynamic>,\n            )\n          : const OfflineMoonshineModelConfig(),\n      fireRedAsr: json['fireRedAsr'] != null\n          ? OfflineFireRedAsrModelConfig.fromJson(\n              json['fireRedAsr'] as Map<String, dynamic>,\n            )\n          : const OfflineFireRedAsrModelConfig(),\n      dolphin: json['dolphin'] != null\n          ? OfflineDolphinModelConfig.fromJson(\n              json['dolphin'] as Map<String, dynamic>,\n            )\n          : const OfflineDolphinModelConfig(),\n      zipformerCtc: json['zipformerCtc'] != null\n          ? OfflineZipformerCtcModelConfig.fromJson(\n              json['zipformerCtc'] as Map<String, dynamic>,\n            )\n          : const OfflineZipformerCtcModelConfig(),\n      canary: json['canary'] != null\n          ? OfflineCanaryModelConfig.fromJson(\n              json['canary'] as Map<String, dynamic>,\n            )\n          : const OfflineCanaryModelConfig(),\n      wenetCtc: json['wenetCtc'] != null\n          ? OfflineWenetCtcModelConfig.fromJson(\n              json['wenetCtc'] as Map<String, dynamic>,\n            )\n          : const OfflineWenetCtcModelConfig(),\n      omnilingual: json['omnilingual'] != null\n          ? OfflineOmnilingualAsrCtcModelConfig.fromJson(\n              json['omnilingual'] as Map<String, dynamic>,\n            )\n          : const OfflineOmnilingualAsrCtcModelConfig(),\n      medasr: json['medasr'] != null\n          ? OfflineMedAsrCtcModelConfig.fromJson(\n              json['medasr'] as Map<String, dynamic>,\n            )\n          : const OfflineMedAsrCtcModelConfig(),\n      funasrNano: json['funasrNano'] != null\n          ? 
OfflineFunAsrNanoModelConfig.fromJson(\n              json['funasrNano'] as Map<String, dynamic>,\n            )\n          : const OfflineFunAsrNanoModelConfig(),\n      fireRedAsrCtc: json['fireRedAsrCtc'] != null\n          ? OfflineFireRedAsrCtcModelConfig.fromJson(\n              json['fireRedAsrCtc'] as Map<String, dynamic>,\n            )\n          : const OfflineFireRedAsrCtcModelConfig(),\n      tokens: json['tokens'] as String,\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? true,\n      provider: json['provider'] as String? ?? 'cpu',\n      modelType: json['modelType'] as String? ?? '',\n      modelingUnit: json['modelingUnit'] as String? ?? '',\n      bpeVocab: json['bpeVocab'] as String? ?? '',\n      telespeechCtc: json['telespeechCtc'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineModelConfig(transducer: $transducer, paraformer: $paraformer, nemoCtc: $nemoCtc, whisper: $whisper, tdnn: $tdnn, senseVoice: $senseVoice, moonshine: $moonshine, fireRedAsr: $fireRedAsr, dolphin: $dolphin, zipformerCtc: $zipformerCtc, canary: $canary, wenetCtc: $wenetCtc, omnilingual: $omnilingual, medasr: $medasr, funasrNano: $funasrNano, fireRedAsrCtc: $fireRedAsrCtc, tokens: $tokens, numThreads: $numThreads, debug: $debug, provider: $provider, modelType: $modelType, modelingUnit: $modelingUnit, bpeVocab: $bpeVocab, telespeechCtc: $telespeechCtc)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'transducer': transducer.toJson(),\n    'paraformer': paraformer.toJson(),\n    'nemoCtc': nemoCtc.toJson(),\n    'whisper': whisper.toJson(),\n    'tdnn': tdnn.toJson(),\n    'senseVoice': senseVoice.toJson(),\n    'moonshine': moonshine.toJson(),\n    'fireRedAsr': fireRedAsr.toJson(),\n    'dolphin': dolphin.toJson(),\n    'zipformerCtc': zipformerCtc.toJson(),\n    'canary': canary.toJson(),\n    'wenetCtc': wenetCtc.toJson(),\n    'omnilingual': omnilingual.toJson(),\n    'medasr': 
medasr.toJson(),\n    'funasrNano': funasrNano.toJson(),\n    'fireRedAsrCtc': fireRedAsrCtc.toJson(),\n    'tokens': tokens,\n    'numThreads': numThreads,\n    'debug': debug,\n    'provider': provider,\n    'modelType': modelType,\n    'modelingUnit': modelingUnit,\n    'bpeVocab': bpeVocab,\n    'telespeechCtc': telespeechCtc,\n  };\n\n  final OfflineTransducerModelConfig transducer;\n  final OfflineParaformerModelConfig paraformer;\n  final OfflineNemoEncDecCtcModelConfig nemoCtc;\n  final OfflineWhisperModelConfig whisper;\n  final OfflineTdnnModelConfig tdnn;\n  final OfflineSenseVoiceModelConfig senseVoice;\n  final OfflineMoonshineModelConfig moonshine;\n  final OfflineFireRedAsrModelConfig fireRedAsr;\n  final OfflineDolphinModelConfig dolphin;\n  final OfflineZipformerCtcModelConfig zipformerCtc;\n  final OfflineCanaryModelConfig canary;\n  final OfflineWenetCtcModelConfig wenetCtc;\n  final OfflineOmnilingualAsrCtcModelConfig omnilingual;\n  final OfflineMedAsrCtcModelConfig medasr;\n  final OfflineFunAsrNanoModelConfig funasrNano;\n  final OfflineFireRedAsrCtcModelConfig fireRedAsrCtc;\n\n  final String tokens;\n  final int numThreads;\n  final bool debug;\n  final String provider;\n  final String modelType;\n  final String modelingUnit;\n  final String bpeVocab;\n  final String telespeechCtc;\n}\n\n/// Top-level configuration for [OfflineRecognizer].\n///\n/// This combines feature extraction, the selected model family, optional\n/// language model settings, hotwords, grammar resources, and optional\n/// homophone replacement resources.\nclass OfflineRecognizerConfig {\n  const OfflineRecognizerConfig({\n    this.feat = const FeatureConfig(),\n    required this.model,\n    this.lm = const OfflineLMConfig(),\n    this.decodingMethod = 'greedy_search',\n    this.maxActivePaths = 4,\n    this.hotwordsFile = '',\n    this.hotwordsScore = 1.5,\n    this.ruleFsts = '',\n    this.ruleFars = '',\n    this.blankPenalty = 0.0,\n    this.hr = const 
HomophoneReplacerConfig(),\n  });\n\n  factory OfflineRecognizerConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineRecognizerConfig(\n      feat: json['feat'] != null\n          ? FeatureConfig.fromJson(json['feat'] as Map<String, dynamic>)\n          : const FeatureConfig(),\n      model: OfflineModelConfig.fromJson(json['model'] as Map<String, dynamic>),\n      lm: json['lm'] != null\n          ? OfflineLMConfig.fromJson(json['lm'] as Map<String, dynamic>)\n          : const OfflineLMConfig(),\n      decodingMethod: json['decodingMethod'] as String? ?? 'greedy_search',\n      maxActivePaths: json['maxActivePaths'] as int? ?? 4,\n      hotwordsFile: json['hotwordsFile'] as String? ?? '',\n      hotwordsScore: (json['hotwordsScore'] as num?)?.toDouble() ?? 1.5,\n      ruleFsts: json['ruleFsts'] as String? ?? '',\n      ruleFars: json['ruleFars'] as String? ?? '',\n      blankPenalty: (json['blankPenalty'] as num?)?.toDouble() ?? 0.0,\n      // 'hr' is optional, like 'feat' and 'lm'; fall back to the default.\n      hr: json['hr'] != null\n          ? HomophoneReplacerConfig.fromJson(json['hr'] as Map<String, dynamic>)\n          : const HomophoneReplacerConfig(),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineRecognizerConfig(feat: $feat, model: $model, lm: $lm, decodingMethod: $decodingMethod, maxActivePaths: $maxActivePaths, hotwordsFile: $hotwordsFile, hotwordsScore: $hotwordsScore, ruleFsts: $ruleFsts, ruleFars: $ruleFars, blankPenalty: $blankPenalty, hr: $hr)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'feat': feat.toJson(),\n    'model': model.toJson(),\n    'lm': lm.toJson(),\n    'decodingMethod': decodingMethod,\n    'maxActivePaths': maxActivePaths,\n    'hotwordsFile': hotwordsFile,\n    'hotwordsScore': hotwordsScore,\n    'ruleFsts': ruleFsts,\n    'ruleFars': ruleFars,\n    'blankPenalty': blankPenalty,\n    'hr': hr.toJson(),\n  };\n\n  final FeatureConfig feat;\n  final OfflineModelConfig model;\n  final OfflineLMConfig lm;\n  final String decodingMethod;\n\n  final int maxActivePaths;\n\n  final String hotwordsFile;\n\n  final double hotwordsScore;\n\n  
final String ruleFsts;\n  final String ruleFars;\n\n  final double blankPenalty;\n  final HomophoneReplacerConfig hr;\n}\n\n/// Recognition result returned by [OfflineRecognizer.getResult].\n///\n/// Some model families populate [lang], [emotion], or [event] in addition to\n/// the decoded text and token timestamps.\nclass OfflineRecognizerResult {\n  OfflineRecognizerResult({\n    required this.text,\n    required this.tokens,\n    required this.timestamps,\n    required this.lang,\n    required this.emotion,\n    required this.event,\n  });\n\n  factory OfflineRecognizerResult.fromJson(Map<String, dynamic> json) {\n    return OfflineRecognizerResult(\n      text: json['text'] as String? ?? '',\n      tokens: (json['tokens'] as List?)?.map((e) => e as String).toList() ?? [],\n      timestamps:\n          (json['timestamps'] as List?)\n              ?.map((e) => (e as num).toDouble())\n              .toList() ??\n          [],\n      lang: json['lang'] as String? ?? '',\n      emotion: json['emotion'] as String? ?? '',\n      event: json['event'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineRecognizerResult(text: $text, tokens: $tokens, timestamps: $timestamps, lang: $lang, emotion: $emotion, event: $event)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'text': text,\n    'tokens': tokens,\n    'timestamps': timestamps,\n    'lang': lang,\n    'emotion': emotion,\n    'event': event,\n  };\n\n  final String text;\n  final List<String> tokens;\n  final List<double> timestamps;\n  final String lang;\n  final String emotion;\n  final String event;\n}\n\n/// Offline speech recognizer.\n///\n/// Create one from an [OfflineRecognizerConfig], then create an\n/// [OfflineStream], feed waveform samples, call [decode], and fetch the final\n/// hypothesis with [getResult].\nclass OfflineRecognizer {\n  OfflineRecognizer.fromPtr({required this.ptr, required this.config});\n\n  OfflineRecognizer._({required this.ptr, required this.config});\n\n  /// Release the native recognizer.\n  void free() {\n    if (SherpaOnnxBindings.destroyOfflineRecognizer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyOfflineRecognizer?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create a recognizer from [config].\n  ///\n  /// The caller is responsible for calling [free] on the returned instance\n  /// to avoid a memory leak.\n  factory OfflineRecognizer(OfflineRecognizerConfig config) {\n    if (SherpaOnnxBindings.createOfflineRecognizer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = convertConfig(config);\n\n    final ptr = SherpaOnnxBindings.createOfflineRecognizer?.call(c) ?? nullptr;\n\n    if (ptr == nullptr) {\n      // Release the native config before reporting the failure.\n      freeConfig(c);\n      throw Exception(\n        \"Failed to create offline recognizer. 
Please check your config\",\n      );\n    }\n\n    freeConfig(c);\n\n    return OfflineRecognizer._(ptr: ptr, config: config);\n  }\n\n  /// Replace the runtime configuration.\n  void setConfig(OfflineRecognizerConfig config) {\n    if (SherpaOnnxBindings.offlineRecognizerSetConfig == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    final c = convertConfig(config);\n\n    SherpaOnnxBindings.offlineRecognizerSetConfig?.call(ptr, c);\n\n    freeConfig(c);\n    // we don't update this.config\n  }\n\n  static Pointer<SherpaOnnxOfflineRecognizerConfig> convertConfig(\n    OfflineRecognizerConfig config,\n  ) {\n    final c = calloc<SherpaOnnxOfflineRecognizerConfig>();\n\n    c.ref.feat.sampleRate = config.feat.sampleRate;\n    c.ref.feat.featureDim = config.feat.featureDim;\n\n    // transducer\n    c.ref.model.transducer.encoder = config.model.transducer.encoder\n        .toNativeUtf8();\n    c.ref.model.transducer.decoder = config.model.transducer.decoder\n        .toNativeUtf8();\n    c.ref.model.transducer.joiner = config.model.transducer.joiner\n        .toNativeUtf8();\n\n    // paraformer\n    c.ref.model.paraformer.model = config.model.paraformer.model.toNativeUtf8();\n\n    // nemoCtc\n    c.ref.model.nemoCtc.model = config.model.nemoCtc.model.toNativeUtf8();\n\n    // whisper\n    c.ref.model.whisper.encoder = config.model.whisper.encoder.toNativeUtf8();\n\n    c.ref.model.whisper.decoder = config.model.whisper.decoder.toNativeUtf8();\n\n    c.ref.model.whisper.language = config.model.whisper.language.toNativeUtf8();\n\n    c.ref.model.whisper.task = config.model.whisper.task.toNativeUtf8();\n\n    c.ref.model.whisper.tailPaddings = config.model.whisper.tailPaddings;\n    c.ref.model.whisper.enableTokenTimestamps =\n        config.model.whisper.enableTokenTimestamps ? 
1 : 0;\n    c.ref.model.whisper.enableSegmentTimestamps =\n        config.model.whisper.enableSegmentTimestamps ? 1 : 0;\n\n    c.ref.model.tdnn.model = config.model.tdnn.model.toNativeUtf8();\n\n    c.ref.model.senseVoice.model = config.model.senseVoice.model.toNativeUtf8();\n\n    c.ref.model.senseVoice.language = config.model.senseVoice.language\n        .toNativeUtf8();\n\n    c.ref.model.senseVoice.useInverseTextNormalization =\n        config.model.senseVoice.useInverseTextNormalization ? 1 : 0;\n\n    c.ref.model.moonshine.preprocessor = config.model.moonshine.preprocessor\n        .toNativeUtf8();\n    c.ref.model.moonshine.encoder = config.model.moonshine.encoder\n        .toNativeUtf8();\n    c.ref.model.moonshine.uncachedDecoder = config\n        .model\n        .moonshine\n        .uncachedDecoder\n        .toNativeUtf8();\n    c.ref.model.moonshine.cachedDecoder = config.model.moonshine.cachedDecoder\n        .toNativeUtf8();\n    c.ref.model.moonshine.mergedDecoder = config.model.moonshine.mergedDecoder\n        .toNativeUtf8();\n\n    // FireRedAsr\n    c.ref.model.fireRedAsr.encoder = config.model.fireRedAsr.encoder\n        .toNativeUtf8();\n    c.ref.model.fireRedAsr.decoder = config.model.fireRedAsr.decoder\n        .toNativeUtf8();\n\n    c.ref.model.dolphin.model = config.model.dolphin.model.toNativeUtf8();\n    c.ref.model.zipformerCtc.model = config.model.zipformerCtc.model\n        .toNativeUtf8();\n\n    c.ref.model.canary.encoder = config.model.canary.encoder.toNativeUtf8();\n    c.ref.model.canary.decoder = config.model.canary.decoder.toNativeUtf8();\n    c.ref.model.canary.srcLang = config.model.canary.srcLang.toNativeUtf8();\n    c.ref.model.canary.tgtLang = config.model.canary.tgtLang.toNativeUtf8();\n    c.ref.model.canary.usePnc = config.model.canary.usePnc ? 
1 : 0;\n\n    c.ref.model.wenetCtc.model = config.model.wenetCtc.model.toNativeUtf8();\n    c.ref.model.omnilingual.model = config.model.omnilingual.model\n        .toNativeUtf8();\n    c.ref.model.medasr.model = config.model.medasr.model.toNativeUtf8();\n\n    c.ref.model.funasrNano.encoderAdaptor = config\n        .model\n        .funasrNano\n        .encoderAdaptor\n        .toNativeUtf8();\n    c.ref.model.funasrNano.llm = config.model.funasrNano.llm.toNativeUtf8();\n    c.ref.model.funasrNano.embedding = config.model.funasrNano.embedding\n        .toNativeUtf8();\n    c.ref.model.funasrNano.tokenizer = config.model.funasrNano.tokenizer\n        .toNativeUtf8();\n    c.ref.model.funasrNano.systemPrompt = config.model.funasrNano.systemPrompt\n        .toNativeUtf8();\n    c.ref.model.funasrNano.userPrompt = config.model.funasrNano.userPrompt\n        .toNativeUtf8();\n    c.ref.model.funasrNano.maxNewTokens = config.model.funasrNano.maxNewTokens;\n    c.ref.model.funasrNano.temperature = config.model.funasrNano.temperature;\n    c.ref.model.funasrNano.topP = config.model.funasrNano.topP;\n    c.ref.model.funasrNano.seed = config.model.funasrNano.seed;\n    c.ref.model.funasrNano.language = config.model.funasrNano.language\n        .toNativeUtf8();\n    c.ref.model.funasrNano.itn = config.model.funasrNano.itn;\n    c.ref.model.funasrNano.hotwords = config.model.funasrNano.hotwords\n        .toNativeUtf8();\n\n    c.ref.model.fireRedAsrCtc.model = config.model.fireRedAsrCtc.model\n        .toNativeUtf8();\n\n    c.ref.model.tokens = config.model.tokens.toNativeUtf8();\n\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 
1 : 0;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n    c.ref.model.modelType = config.model.modelType.toNativeUtf8();\n    c.ref.model.modelingUnit = config.model.modelingUnit.toNativeUtf8();\n    c.ref.model.bpeVocab = config.model.bpeVocab.toNativeUtf8();\n    c.ref.model.telespeechCtc = config.model.telespeechCtc.toNativeUtf8();\n\n    c.ref.lm.model = config.lm.model.toNativeUtf8();\n    c.ref.lm.scale = config.lm.scale;\n\n    c.ref.decodingMethod = config.decodingMethod.toNativeUtf8();\n    c.ref.maxActivePaths = config.maxActivePaths;\n\n    c.ref.hotwordsFile = config.hotwordsFile.toNativeUtf8();\n    c.ref.hotwordsScore = config.hotwordsScore;\n\n    c.ref.ruleFsts = config.ruleFsts.toNativeUtf8();\n    c.ref.ruleFars = config.ruleFars.toNativeUtf8();\n\n    c.ref.blankPenalty = config.blankPenalty;\n\n    c.ref.hr.lexicon = config.hr.lexicon.toNativeUtf8();\n    c.ref.hr.ruleFsts = config.hr.ruleFsts.toNativeUtf8();\n\n    return c;\n  }\n\n  static void freeConfig(Pointer<SherpaOnnxOfflineRecognizerConfig> c) {\n    calloc.free(c.ref.hr.lexicon);\n    calloc.free(c.ref.hr.ruleFsts);\n    calloc.free(c.ref.ruleFars);\n    calloc.free(c.ref.ruleFsts);\n    calloc.free(c.ref.hotwordsFile);\n    calloc.free(c.ref.decodingMethod);\n    calloc.free(c.ref.lm.model);\n    calloc.free(c.ref.model.telespeechCtc);\n    calloc.free(c.ref.model.bpeVocab);\n    calloc.free(c.ref.model.modelingUnit);\n    calloc.free(c.ref.model.modelType);\n    calloc.free(c.ref.model.provider);\n    calloc.free(c.ref.model.tokens);\n    calloc.free(c.ref.model.fireRedAsrCtc.model);\n    calloc.free(c.ref.model.funasrNano.hotwords);\n    calloc.free(c.ref.model.funasrNano.language);\n    calloc.free(c.ref.model.funasrNano.userPrompt);\n    calloc.free(c.ref.model.funasrNano.systemPrompt);\n    calloc.free(c.ref.model.funasrNano.tokenizer);\n    calloc.free(c.ref.model.funasrNano.embedding);\n    calloc.free(c.ref.model.funasrNano.llm);\n    
calloc.free(c.ref.model.funasrNano.encoderAdaptor);\n    calloc.free(c.ref.model.medasr.model);\n    calloc.free(c.ref.model.omnilingual.model);\n    calloc.free(c.ref.model.wenetCtc.model);\n    calloc.free(c.ref.model.canary.tgtLang);\n    calloc.free(c.ref.model.canary.srcLang);\n    calloc.free(c.ref.model.canary.decoder);\n    calloc.free(c.ref.model.canary.encoder);\n    calloc.free(c.ref.model.zipformerCtc.model);\n    calloc.free(c.ref.model.dolphin.model);\n    calloc.free(c.ref.model.fireRedAsr.decoder);\n    calloc.free(c.ref.model.fireRedAsr.encoder);\n    calloc.free(c.ref.model.moonshine.mergedDecoder);\n    calloc.free(c.ref.model.moonshine.cachedDecoder);\n    calloc.free(c.ref.model.moonshine.uncachedDecoder);\n    calloc.free(c.ref.model.moonshine.encoder);\n    calloc.free(c.ref.model.moonshine.preprocessor);\n    calloc.free(c.ref.model.senseVoice.language);\n    calloc.free(c.ref.model.senseVoice.model);\n    calloc.free(c.ref.model.tdnn.model);\n    calloc.free(c.ref.model.whisper.task);\n    calloc.free(c.ref.model.whisper.language);\n    calloc.free(c.ref.model.whisper.decoder);\n    calloc.free(c.ref.model.whisper.encoder);\n    calloc.free(c.ref.model.nemoCtc.model);\n    calloc.free(c.ref.model.paraformer.model);\n    calloc.free(c.ref.model.transducer.encoder);\n    calloc.free(c.ref.model.transducer.decoder);\n    calloc.free(c.ref.model.transducer.joiner);\n    calloc.free(c);\n  }\n\n  /// Create an offline stream.\n  ///\n  /// The caller must call [OfflineStream.free] on the returned stream to avoid\n  /// a memory leak.\n  OfflineStream createStream() {\n    if (SherpaOnnxBindings.createOfflineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    final p = SherpaOnnxBindings.createOfflineStream?.call(ptr) ?? 
nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    return OfflineStream(ptr: p);\n  }\n\n  /// Decode one stream.\n  void decode(OfflineStream stream) {\n    if (SherpaOnnxBindings.decodeOfflineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.decodeOfflineStream?.call(ptr, stream.ptr);\n  }\n\n  /// Fetch the current recognition result for [stream].\n  OfflineRecognizerResult getResult(OfflineStream stream) {\n    if (SherpaOnnxBindings.getOfflineStreamResultAsJson == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return OfflineRecognizerResult(\n        text: '',\n        tokens: [],\n        timestamps: [],\n        lang: '',\n        emotion: '',\n        event: '',\n      );\n    }\n\n    final json =\n        SherpaOnnxBindings.getOfflineStreamResultAsJson?.call(stream.ptr) ??\n        nullptr;\n    if (json == nullptr) {\n      return OfflineRecognizerResult(\n        text: '',\n        tokens: [],\n        timestamps: [],\n        lang: '',\n        emotion: '',\n        event: '',\n      );\n    }\n\n    // Copy the native string first so it is released even if parsing fails.\n    final jsonString = toDartString(json);\n\n    SherpaOnnxBindings.destroyOfflineStreamResultJson?.call(json);\n\n    // fromJson tolerates missing fields and converts numeric timestamps.\n    return OfflineRecognizerResult.fromJson(\n      jsonDecode(jsonString) as Map<String, dynamic>,\n    );\n  }\n\n  Pointer<SherpaOnnxOfflineRecognizer> ptr;\n  OfflineRecognizerConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/offline_speaker_diarization.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\n\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\nimport './speaker_identification.dart';\n\n/// Offline speaker diarization.\n///\n/// This module combines segmentation, speaker embedding extraction, and\n/// clustering to assign speaker labels to time spans. See\n/// `dart-api-examples/speaker-diarization/` for a complete example.\nclass OfflineSpeakerDiarizationSegment {\n  const OfflineSpeakerDiarizationSegment({\n    required this.start,\n    required this.end,\n    required this.speaker,\n  });\n\n  factory OfflineSpeakerDiarizationSegment.fromJson(Map<String, dynamic> json) {\n    return OfflineSpeakerDiarizationSegment(\n      start: (json['start'] as num).toDouble(),\n      end: (json['end'] as num).toDouble(),\n      speaker: json['speaker'] as int,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeakerDiarizationSegment(start: $start, end: $end, speaker: $speaker)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'start': start,\n        'end': end,\n        'speaker': speaker,\n      };\n\n  final double start;\n  final double end;\n  final int speaker;\n}\n\n/// Pyannote segmentation model path.\nclass OfflineSpeakerSegmentationPyannoteModelConfig {\n  const OfflineSpeakerSegmentationPyannoteModelConfig({\n    this.model = '',\n  });\n\n  factory OfflineSpeakerSegmentationPyannoteModelConfig.fromJson(\n      Map<String, dynamic> json) {\n    return OfflineSpeakerSegmentationPyannoteModelConfig(\n      model: json['model'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeakerSegmentationPyannoteModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// Segmentation model configuration for speaker diarization.\nclass OfflineSpeakerSegmentationModelConfig {\n  const OfflineSpeakerSegmentationModelConfig({\n    this.pyannote = const OfflineSpeakerSegmentationPyannoteModelConfig(),\n    this.numThreads = 1,\n    this.debug = true,\n    this.provider = 'cpu',\n  });\n\n  factory OfflineSpeakerSegmentationModelConfig.fromJson(\n      Map<String, dynamic> json) {\n    return OfflineSpeakerSegmentationModelConfig(\n      pyannote: json['pyannote'] != null\n          ? OfflineSpeakerSegmentationPyannoteModelConfig.fromJson(\n              json['pyannote'] as Map<String, dynamic>)\n          : const OfflineSpeakerSegmentationPyannoteModelConfig(),\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? true,\n      provider: json['provider'] as String? ?? 'cpu',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeakerSegmentationModelConfig(pyannote: $pyannote, numThreads: $numThreads, debug: $debug, provider: $provider)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'pyannote': pyannote.toJson(),\n        'numThreads': numThreads,\n        'debug': debug,\n        'provider': provider,\n      };\n\n  final OfflineSpeakerSegmentationPyannoteModelConfig pyannote;\n\n  final int numThreads;\n  final bool debug;\n  final String provider;\n}\n\n/// Clustering options used after segmentation and embedding extraction.\nclass FastClusteringConfig {\n  const FastClusteringConfig({\n    this.numClusters = -1,\n    this.threshold = 0.5,\n  });\n\n  factory FastClusteringConfig.fromJson(Map<String, dynamic> json) {\n    return FastClusteringConfig(\n      numClusters: json['numClusters'] as int? ?? 
-1,\n      threshold: (json['threshold'] as num?)?.toDouble() ?? 0.5,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'FastClusteringConfig(numClusters: $numClusters, threshold: $threshold)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'numClusters': numClusters,\n        'threshold': threshold,\n      };\n\n  final int numClusters;\n  final double threshold;\n}\n\n/// Top-level configuration for [OfflineSpeakerDiarization].\nclass OfflineSpeakerDiarizationConfig {\n  const OfflineSpeakerDiarizationConfig({\n    this.segmentation = const OfflineSpeakerSegmentationModelConfig(),\n    this.embedding = const SpeakerEmbeddingExtractorConfig(model: ''),\n    this.clustering = const FastClusteringConfig(),\n    this.minDurationOn = 0.2,\n    this.minDurationOff = 0.5,\n  });\n\n  factory OfflineSpeakerDiarizationConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineSpeakerDiarizationConfig(\n      segmentation: json['segmentation'] != null\n          ? OfflineSpeakerSegmentationModelConfig.fromJson(\n              json['segmentation'] as Map<String, dynamic>)\n          : const OfflineSpeakerSegmentationModelConfig(),\n      embedding: json['embedding'] != null\n          ? SpeakerEmbeddingExtractorConfig.fromJson(\n              json['embedding'] as Map<String, dynamic>)\n          : const SpeakerEmbeddingExtractorConfig(model: ''),\n      clustering: json['clustering'] != null\n          ? FastClusteringConfig.fromJson(\n              json['clustering'] as Map<String, dynamic>)\n          : const FastClusteringConfig(),\n      minDurationOn: (json['minDurationOn'] as num?)?.toDouble() ?? 0.2,\n      minDurationOff: (json['minDurationOff'] as num?)?.toDouble() ?? 
0.5,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeakerDiarizationConfig(segmentation: $segmentation, embedding: $embedding, clustering: $clustering, minDurationOn: $minDurationOn, minDurationOff: $minDurationOff)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'segmentation': segmentation.toJson(),\n        'embedding': embedding.toJson(),\n        'clustering': clustering.toJson(),\n        'minDurationOn': minDurationOn,\n        'minDurationOff': minDurationOff,\n      };\n\n  final OfflineSpeakerSegmentationModelConfig segmentation;\n  final SpeakerEmbeddingExtractorConfig embedding;\n  final FastClusteringConfig clustering;\n  final double minDurationOff; // in seconds\n  final double minDurationOn; // in seconds\n}\n\n/// Offline speaker diarizer.\nclass OfflineSpeakerDiarization {\n  OfflineSpeakerDiarization.fromPtr(\n      {required this.ptr, required this.config, required this.sampleRate});\n\n  OfflineSpeakerDiarization._(\n      {required this.ptr, required this.config, required this.sampleRate});\n\n  /// Release the native diarizer.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyOfflineSpeakerDiarization == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.sherpaOnnxDestroyOfflineSpeakerDiarization?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create a diarizer from [config].\n  factory OfflineSpeakerDiarization(OfflineSpeakerDiarizationConfig config) {\n    if (SherpaOnnxBindings.sherpaOnnxCreateOfflineSpeakerDiarization == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOfflineSpeakerDiarizationConfig>();\n\n    c.ref.segmentation.pyannote.model =\n        config.segmentation.pyannote.model.toNativeUtf8();\n    c.ref.segmentation.numThreads = config.segmentation.numThreads;\n    c.ref.segmentation.debug = config.segmentation.debug 
? 1 : 0;\n    c.ref.segmentation.provider = config.segmentation.provider.toNativeUtf8();\n\n    c.ref.embedding.model = config.embedding.model.toNativeUtf8();\n    c.ref.embedding.numThreads = config.embedding.numThreads;\n    c.ref.embedding.debug = config.embedding.debug ? 1 : 0;\n    c.ref.embedding.provider = config.embedding.provider.toNativeUtf8();\n\n    c.ref.clustering.numClusters = config.clustering.numClusters;\n    c.ref.clustering.threshold = config.clustering.threshold;\n\n    c.ref.minDurationOn = config.minDurationOn;\n    c.ref.minDurationOff = config.minDurationOff;\n\n    final ptr =\n        SherpaOnnxBindings.sherpaOnnxCreateOfflineSpeakerDiarization?.call(c) ??\n            nullptr;\n\n    calloc.free(c.ref.embedding.provider);\n    calloc.free(c.ref.embedding.model);\n    calloc.free(c.ref.segmentation.provider);\n    calloc.free(c.ref.segmentation.pyannote.model);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create offline speaker diarization. Please check your config\");\n    }\n\n    int sampleRate = SherpaOnnxBindings\n              .sherpaOnnxOfflineSpeakerDiarizationGetSampleRate\n              ?.call(ptr) ?? 
0;\n\n    return OfflineSpeakerDiarization._(\n        ptr: ptr, config: config, sampleRate: sampleRate);\n  }\n\n  /// Process a complete waveform and return speaker-labeled segments.\n  List<OfflineSpeakerDiarizationSegment> process(\n      {required Float32List samples}) {\n    if (SherpaOnnxBindings.sherpaOnnxOfflineSpeakerDiarizationProcess == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return <OfflineSpeakerDiarizationSegment>[];\n    }\n\n    final n = samples.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, samples);\n\n    final r = SherpaOnnxBindings.sherpaOnnxOfflineSpeakerDiarizationProcess\n            ?.call(ptr, p, n) ??\n        nullptr;\n\n    // Release the temporary native copy of the samples.\n    calloc.free(p);\n\n    final ans = _processImpl(r);\n\n    SherpaOnnxBindings.sherpaOnnxOfflineSpeakerDiarizationDestroyResult\n        ?.call(r);\n\n    return ans;\n  }\n\n  /// Same as [process], but reports progress through [callback].\n  List<OfflineSpeakerDiarizationSegment> processWithCallback({\n    required Float32List samples,\n    required int Function(int numProcessedChunks, int numTotalChunks) callback,\n  }) {\n    if (SherpaOnnxBindings\n            .sherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg ==\n        null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return <OfflineSpeakerDiarizationSegment>[];\n    }\n\n    final n = samples.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, samples);\n\n    final wrapper = NativeCallable<\n            SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArgNative>.isolateLocal(\n        (int numProcessedChunks, int numTotalChunks) {\n      return callback(numProcessedChunks, numTotalChunks);\n    }, exceptionalReturn: 0);\n\n    final r = SherpaOnnxBindings\n            .sherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg\n            ?.call(ptr, p, n, 
wrapper.nativeFunction) ??\n        nullptr;\n\n    wrapper.close();\n\n    // Release the temporary native copy of the samples.\n    calloc.free(p);\n\n    final ans = _processImpl(r);\n\n    SherpaOnnxBindings.sherpaOnnxOfflineSpeakerDiarizationDestroyResult\n        ?.call(r);\n\n    return ans;\n  }\n\n  List<OfflineSpeakerDiarizationSegment> _processImpl(\n      Pointer<SherpaOnnxOfflineSpeakerDiarizationResult> r) {\n    if (r == nullptr) {\n      return <OfflineSpeakerDiarizationSegment>[];\n    }\n\n    final numSegments = SherpaOnnxBindings\n            .sherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments\n            ?.call(r) ??\n        0;\n    final segments = SherpaOnnxBindings\n            .sherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime\n            ?.call(r) ??\n        nullptr;\n\n    if (segments == nullptr) {\n      return <OfflineSpeakerDiarizationSegment>[];\n    }\n\n    final ans = <OfflineSpeakerDiarizationSegment>[];\n    for (int i = 0; i != numSegments; ++i) {\n      final s = segments + i;\n\n      final tmp = OfflineSpeakerDiarizationSegment(\n          start: s.ref.start, end: s.ref.end, speaker: s.ref.speaker);\n      ans.add(tmp);\n    }\n\n    SherpaOnnxBindings.sherpaOnnxOfflineSpeakerDiarizationDestroySegment\n        ?.call(segments);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxOfflineSpeakerDiarization> ptr;\n  OfflineSpeakerDiarizationConfig config;\n  final int sampleRate;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/offline_speech_denoiser.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\n\nimport 'package:ffi/ffi.dart';\nimport './sherpa_onnx_bindings.dart';\n\n/// Offline speech denoising.\n///\n/// Supported model families include GTCRN and DPDFNet. See the examples under\n/// `dart-api-examples/speech-enhancement-gtcrn/` and\n/// `dart-api-examples/speech-enhancement-dpdfnet/`.\nclass OfflineSpeechDenoiserGtcrnModelConfig {\n  const OfflineSpeechDenoiserGtcrnModelConfig({\n    this.model = '',\n  });\n\n  factory OfflineSpeechDenoiserGtcrnModelConfig.fromJson(\n      Map<String, dynamic> json) {\n    return OfflineSpeechDenoiserGtcrnModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeechDenoiserGtcrnModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// DPDFNet model path for offline speech denoising.\nclass OfflineSpeechDenoiserDpdfNetModelConfig {\n  const OfflineSpeechDenoiserDpdfNetModelConfig({\n    this.model = '',\n  });\n\n  factory OfflineSpeechDenoiserDpdfNetModelConfig.fromJson(\n      Map<String, dynamic> json) {\n    return OfflineSpeechDenoiserDpdfNetModelConfig(\n      model: json['model'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeechDenoiserDpdfNetModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// Aggregate model configuration for [OfflineSpeechDenoiser].\n///\n/// Configure either [gtcrn] or [dpdfnet] for typical use.\nclass OfflineSpeechDenoiserModelConfig {\n  const OfflineSpeechDenoiserModelConfig({\n    this.gtcrn = const OfflineSpeechDenoiserGtcrnModelConfig(),\n    this.dpdfnet = const OfflineSpeechDenoiserDpdfNetModelConfig(),\n    this.numThreads = 1,\n    this.debug = true,\n    this.provider = 'cpu',\n  });\n\n  factory OfflineSpeechDenoiserModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineSpeechDenoiserModelConfig(\n      gtcrn: json['gtcrn'] != null\n          ? OfflineSpeechDenoiserGtcrnModelConfig.fromJson(\n              json['gtcrn'] as Map<String, dynamic>)\n          : const OfflineSpeechDenoiserGtcrnModelConfig(),\n      dpdfnet: json['dpdfnet'] != null\n          ? OfflineSpeechDenoiserDpdfNetModelConfig.fromJson(\n              json['dpdfnet'] as Map<String, dynamic>)\n          : const OfflineSpeechDenoiserDpdfNetModelConfig(),\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? true,\n      provider: json['provider'] as String? ?? 
'cpu',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeechDenoiserModelConfig(gtcrn: $gtcrn, dpdfnet: $dpdfnet, numThreads: $numThreads, debug: $debug, provider: $provider)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'gtcrn': gtcrn.toJson(),\n        'dpdfnet': dpdfnet.toJson(),\n        'numThreads': numThreads,\n        'debug': debug,\n        'provider': provider,\n      };\n\n  final OfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n  final OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n  final int numThreads;\n  final bool debug;\n  final String provider;\n}\n\n/// Top-level configuration for [OfflineSpeechDenoiser].\nclass OfflineSpeechDenoiserConfig {\n  const OfflineSpeechDenoiserConfig({\n    this.model = const OfflineSpeechDenoiserModelConfig(),\n  });\n\n  factory OfflineSpeechDenoiserConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineSpeechDenoiserConfig(\n      model: json['model'] != null\n          ? OfflineSpeechDenoiserModelConfig.fromJson(\n              json['model'] as Map<String, dynamic>)\n          : const OfflineSpeechDenoiserModelConfig(),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineSpeechDenoiserConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model.toJson(),\n      };\n\n  final OfflineSpeechDenoiserModelConfig model;\n}\n\n/// Audio returned by offline or online speech denoisers.\nclass DenoisedAudio {\n  DenoisedAudio({\n    required this.samples,\n    required this.sampleRate,\n  });\n\n  final Float32List samples;\n  final int sampleRate;\n}\n\n/// Offline speech denoiser.\nclass OfflineSpeechDenoiser {\n  OfflineSpeechDenoiser.fromPtr({required this.ptr, required this.config});\n\n  OfflineSpeechDenoiser._({required this.ptr, required this.config});\n\n  /// Create an offline denoiser from [config].\n  factory OfflineSpeechDenoiser(OfflineSpeechDenoiserConfig config) {\n    if 
(SherpaOnnxBindings.sherpaOnnxCreateOfflineSpeechDenoiser == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOfflineSpeechDenoiserConfig>();\n    c.ref.model.gtcrn.model = config.model.gtcrn.model.toNativeUtf8();\n    c.ref.model.dpdfnet.model = config.model.dpdfnet.model.toNativeUtf8();\n\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 1 : 0;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n\n    final ptr =\n        SherpaOnnxBindings.sherpaOnnxCreateOfflineSpeechDenoiser?.call(c) ??\n            nullptr;\n\n    calloc.free(c.ref.model.provider);\n    calloc.free(c.ref.model.gtcrn.model);\n    calloc.free(c.ref.model.dpdfnet.model);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create offline speech denoiser. Please check your config\");\n    }\n\n    return OfflineSpeechDenoiser._(ptr: ptr, config: config);\n  }\n\n  /// Release the native denoiser.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyOfflineSpeechDenoiser == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.sherpaOnnxDestroyOfflineSpeechDenoiser?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Denoise one chunk or a complete waveform.\n  DenoisedAudio run({required Float32List samples, required int sampleRate}) {\n    if (SherpaOnnxBindings.sherpaOnnxOfflineSpeechDenoiserRun == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final n = samples.length;\n    final Pointer<Float> psamples = calloc<Float>(n);\n\n    final pList = psamples.asTypedList(n);\n    pList.setAll(0, samples);\n\n    final p = SherpaOnnxBindings.sherpaOnnxOfflineSpeechDenoiserRun\n        
    ?.call(ptr, psamples, n, sampleRate) ??\n        nullptr;\n\n    calloc.free(psamples);\n\n    if (p == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final sampleRateOut = p.ref.sampleRate;\n    final nOut = p.ref.n;\n    Float32List newSamples = Float32List(0);\n    if (nOut > 0 && p.ref.samples != nullptr) {\n      newSamples = Float32List.fromList(p.ref.samples.asTypedList(nOut));\n    }\n\n    SherpaOnnxBindings.sherpaOnnxDestroyDenoisedAudio?.call(p);\n\n    return DenoisedAudio(samples: newSamples, sampleRate: sampleRateOut);\n  }\n\n  /// Return the expected sample rate for this denoiser.\n  int get sampleRate {\n    if (SherpaOnnxBindings.sherpaOnnxOfflineSpeechDenoiserGetSampleRate ==\n        null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.sherpaOnnxOfflineSpeechDenoiserGetSampleRate\n            ?.call(ptr) ??\n        0;\n  }\n\n  Pointer<SherpaOnnxOfflineSpeechDenoiser> ptr;\n  OfflineSpeechDenoiserConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/offline_stream.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Input stream for offline APIs such as offline ASR, audio tagging, and\n/// spoken language identification.\nclass OfflineStream {\n  /// The user has to call OfflineStream.free() to avoid memory leak.\n  OfflineStream({required this.ptr});\n\n  /// Release the native stream.\n  void free() {\n    if (SherpaOnnxBindings.destroyOfflineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyOfflineStream?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// If you have List<double> data, then you can use\n  /// Float32List.fromList(data) to convert data to Float32List\n  ///\n  /// See\n  ///  https://api.flutter.dev/flutter/dart-core/List-class.html\n  /// and\n  ///  https://api.flutter.dev/flutter/dart-typed_data/Float32List-class.html\n  /// Append waveform samples to the stream.\n  ///\n  /// [samples] must contain mono floating-point PCM data normalized to\n  /// `[-1, 1]`. [sampleRate] should match the model expectation, typically\n  /// 16000 for the provided examples.\n  void acceptWaveform({required Float32List samples, required int sampleRate}) {\n    if (SherpaOnnxBindings.acceptWaveformOffline == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    final n = samples.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, samples);\n\n    SherpaOnnxBindings.acceptWaveformOffline?.call(ptr, sampleRate, p, n);\n\n    calloc.free(p);\n  }\n\n  Pointer<SherpaOnnxOfflineStream> ptr;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/online_punctuation.dart",
    "content": "import 'dart:ffi';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Online punctuation restoration.\n///\n/// This wrapper is intended for shorter or incremental text fragments. See\n/// `dart-api-examples/add-punctuations/` for working examples.\nclass OnlinePunctuationModelConfig {\n  OnlinePunctuationModelConfig(\n      {required this.cnnBiLstm,\n      required this.bpeVocab,\n      this.numThreads = 1,\n      this.provider = 'cpu',\n      this.debug = true});\n\n  factory OnlinePunctuationModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlinePunctuationModelConfig(\n      cnnBiLstm: json['cnnBiLstm'],\n      bpeVocab: json['bpeVocab'],\n      numThreads: json['numThreads'],\n      provider: json['provider'],\n      debug: json['debug'],\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlinePunctuationModelConfig(cnnBiLstm: $cnnBiLstm, '\n        'bpeVocab: $bpeVocab, numThreads: $numThreads, '\n        'provider: $provider, debug: $debug)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'cnnBiLstm': cnnBiLstm,\n      'bpeVocab': bpeVocab,\n      'numThreads': numThreads,\n      'provider': provider,\n      'debug': debug,\n    };\n  }\n\n  final String cnnBiLstm;\n  final String bpeVocab;\n  final int numThreads;\n  final String provider;\n  final bool debug;\n}\n\n/// Top-level configuration for [OnlinePunctuation].\nclass OnlinePunctuationConfig {\n  OnlinePunctuationConfig({\n    required this.model,\n  });\n\n  factory OnlinePunctuationConfig.fromJson(Map<String, dynamic> json) {\n    return OnlinePunctuationConfig(\n      model: OnlinePunctuationModelConfig.fromJson(json['model']),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlinePunctuationConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() {\n    return {\n      'model': model.toJson(),\n    };\n  }\n\n  final OnlinePunctuationModelConfig model;\n}\n\n/// Online punctuation 
restorer.\nclass OnlinePunctuation {\n  OnlinePunctuation.fromPtr({required this.ptr, required this.config});\n\n  OnlinePunctuation._({required this.ptr, required this.config});\n\n  /// Create an online punctuator from [config].\n  factory OnlinePunctuation({required OnlinePunctuationConfig config}) {\n    if (SherpaOnnxBindings.sherpaOnnxCreateOnlinePunctuation == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOnlinePunctuationConfig>();\n\n    final cnnBiLstmPtr = config.model.cnnBiLstm.toNativeUtf8();\n    final bpeVocabPtr = config.model.bpeVocab.toNativeUtf8();\n    c.ref.model.cnnBiLstm = cnnBiLstmPtr;\n    c.ref.model.bpeVocab = bpeVocabPtr;\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 1 : 0;\n\n    final providerPtr = config.model.provider.toNativeUtf8();\n    c.ref.model.provider = providerPtr;\n\n    final ptr = SherpaOnnxBindings.sherpaOnnxCreateOnlinePunctuation?.call(c) ??\n        nullptr;\n\n    calloc.free(providerPtr);\n    calloc.free(cnnBiLstmPtr);\n    calloc.free(bpeVocabPtr);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create online punctuation. 
Please check your config\");\n    }\n\n    return OnlinePunctuation._(ptr: ptr, config: config);\n  }\n\n  /// Release the native punctuator.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyOnlinePunctuation == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.sherpaOnnxDestroyOnlinePunctuation?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Add punctuation to [text].\n  String addPunct(String text) {\n    if (SherpaOnnxBindings.sherpaOnnxOnlinePunctuationAddPunct == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return '';\n    }\n\n    final textPtr = text.toNativeUtf8();\n\n    final p = SherpaOnnxBindings.sherpaOnnxOnlinePunctuationAddPunct\n            ?.call(ptr, textPtr) ??\n        nullptr;\n\n    calloc.free(textPtr);\n\n    if (p == nullptr) {\n      return '';\n    }\n\n    final ans = p.toDartString();\n\n    SherpaOnnxBindings.sherpaOnnxOnlinePunctuationFreeText?.call(p);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxOnlinePunctuation> ptr;\n  final OnlinePunctuationConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/online_recognizer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:convert';\nimport 'dart:ffi';\n\nimport 'package:ffi/ffi.dart';\n\nimport './feature_config.dart';\nimport './homophone_replacer_config.dart';\nimport './online_stream.dart';\nimport './sherpa_onnx_bindings.dart';\nimport './utils.dart';\n\n/// Streaming speech recognition.\n///\n/// This module wraps the online ASR APIs used by the examples in\n/// `dart-api-examples/streaming-asr/bin/`, including Zipformer transducer,\n/// Zipformer CTC, Paraformer, T-One-CTC, and NeMo-CTC style models.\n///\n/// Example:\n///\n/// ```dart\n/// final model = OnlineModelConfig(\n///   transducer: const OnlineTransducerModelConfig(\n///     encoder: './streaming-zipformer/encoder-epoch-99-avg-1.int8.onnx',\n///     decoder: './streaming-zipformer/decoder-epoch-99-avg-1.onnx',\n///     joiner: './streaming-zipformer/joiner-epoch-99-avg-1.int8.onnx',\n///   ),\n///   tokens: './streaming-zipformer/tokens.txt',\n///   modelType: 'zipformer2',\n/// );\n///\n/// final recognizer = OnlineRecognizer(OnlineRecognizerConfig(model: model));\n/// final stream = recognizer.createStream();\n/// stream.acceptWaveform(samples: chunk, sampleRate: 16000);\n/// while (recognizer.isReady(stream)) {\n///   recognizer.decode(stream);\n/// }\n/// print(recognizer.getResult(stream).text);\n/// ```\n\n/// Model files for a streaming transducer recognizer.\nclass OnlineTransducerModelConfig {\n  const OnlineTransducerModelConfig({\n    this.encoder = '',\n    this.decoder = '',\n    this.joiner = '',\n  });\n\n  factory OnlineTransducerModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineTransducerModelConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      joiner: json['joiner'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineTransducerModelConfig(encoder: $encoder, decoder: $decoder, joiner: $joiner)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'encoder': encoder,\n        'decoder': decoder,\n        'joiner': joiner,\n      };\n\n  final String encoder;\n  final String decoder;\n  final String joiner;\n}\n\n/// Model files for a streaming Paraformer recognizer.\nclass OnlineParaformerModelConfig {\n  const OnlineParaformerModelConfig({this.encoder = '', this.decoder = ''});\n\n  factory OnlineParaformerModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineParaformerModelConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineParaformerModelConfig(encoder: $encoder, decoder: $decoder)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'encoder': encoder,\n        'decoder': decoder,\n      };\n\n  final String encoder;\n  final String decoder;\n}\n\n/// Model file for a streaming Zipformer2 CTC recognizer.\nclass OnlineZipformer2CtcModelConfig {\n  const OnlineZipformer2CtcModelConfig({this.model = ''});\n\n  factory OnlineZipformer2CtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineZipformer2CtcModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineZipformer2CtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// Model file for a streaming NeMo CTC recognizer.\nclass OnlineNemoCtcModelConfig {\n  const OnlineNemoCtcModelConfig({this.model = ''});\n\n  factory OnlineNemoCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineNemoCtcModelConfig(\n      model: json['model'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineNemoCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// Model file for a streaming tone-aware CTC recognizer.\nclass OnlineToneCtcModelConfig {\n  const OnlineToneCtcModelConfig({this.model = ''});\n\n  factory OnlineToneCtcModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineToneCtcModelConfig(\n      model: json['model'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineToneCtcModelConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n      };\n\n  final String model;\n}\n\n/// Aggregate model configuration for streaming recognition.\n///\n/// Configure exactly one model family for a typical deployment and supply the\n/// shared tokenizer and runtime settings here.\nclass OnlineModelConfig {\n  const OnlineModelConfig({\n    this.transducer = const OnlineTransducerModelConfig(),\n    this.paraformer = const OnlineParaformerModelConfig(),\n    this.zipformer2Ctc = const OnlineZipformer2CtcModelConfig(),\n    this.nemoCtc = const OnlineNemoCtcModelConfig(),\n    this.toneCtc = const OnlineToneCtcModelConfig(),\n    required this.tokens,\n    this.numThreads = 1,\n    this.provider = 'cpu',\n    this.debug = true,\n    this.modelType = '',\n    this.modelingUnit = '',\n    this.bpeVocab = '',\n  });\n\n  factory OnlineModelConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineModelConfig(\n      transducer: OnlineTransducerModelConfig.fromJson(\n          json['transducer'] as Map<String, dynamic>? ?? const {}),\n      paraformer: OnlineParaformerModelConfig.fromJson(\n          json['paraformer'] as Map<String, dynamic>? ?? const {}),\n      zipformer2Ctc: OnlineZipformer2CtcModelConfig.fromJson(\n          json['zipformer2Ctc'] as Map<String, dynamic>? ?? 
const {}),\n      nemoCtc: OnlineNemoCtcModelConfig.fromJson(\n          json['nemoCtc'] as Map<String, dynamic>? ?? const {}),\n      toneCtc: OnlineToneCtcModelConfig.fromJson(\n          json['toneCtc'] as Map<String, dynamic>? ?? const {}),\n      tokens: json['tokens'] as String,\n      numThreads: json['numThreads'] as int? ?? 1,\n      provider: json['provider'] as String? ?? 'cpu',\n      debug: json['debug'] as bool? ?? true,\n      modelType: json['modelType'] as String? ?? '',\n      modelingUnit: json['modelingUnit'] as String? ?? '',\n      bpeVocab: json['bpeVocab'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineModelConfig(transducer: $transducer, paraformer: $paraformer, zipformer2Ctc: $zipformer2Ctc, nemoCtc: $nemoCtc, toneCtc: $toneCtc, tokens: $tokens, numThreads: $numThreads, provider: $provider, debug: $debug, modelType: $modelType, modelingUnit: $modelingUnit, bpeVocab: $bpeVocab)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'transducer': transducer.toJson(),\n        'paraformer': paraformer.toJson(),\n        'zipformer2Ctc': zipformer2Ctc.toJson(),\n        'nemoCtc': nemoCtc.toJson(),\n        'toneCtc': toneCtc.toJson(),\n        'tokens': tokens,\n        'numThreads': numThreads,\n        'provider': provider,\n        'debug': debug,\n        'modelType': modelType,\n        'modelingUnit': modelingUnit,\n        'bpeVocab': bpeVocab,\n      };\n\n  final OnlineTransducerModelConfig transducer;\n  final OnlineParaformerModelConfig paraformer;\n  final OnlineZipformer2CtcModelConfig zipformer2Ctc;\n  final OnlineNemoCtcModelConfig nemoCtc;\n  final OnlineToneCtcModelConfig toneCtc;\n\n  final String tokens;\n\n  final int numThreads;\n\n  final String provider;\n\n  final bool debug;\n\n  final String modelType;\n\n  final String modelingUnit;\n\n  final String bpeVocab;\n}\n\n/// FST decoder settings for CTC-based streaming recognition.\nclass OnlineCtcFstDecoderConfig {\n  const 
OnlineCtcFstDecoderConfig({this.graph = '', this.maxActive = 3000});\n\n  factory OnlineCtcFstDecoderConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineCtcFstDecoderConfig(\n      graph: json['graph'] as String? ?? '',\n      maxActive: json['maxActive'] as int? ?? 3000,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineCtcFstDecoderConfig(graph: $graph, maxActive: $maxActive)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'graph': graph,\n        'maxActive': maxActive,\n      };\n\n  final String graph;\n  final int maxActive;\n}\n\n/// Top-level configuration for [OnlineRecognizer].\n///\n/// This combines feature extraction, the selected online model family,\n/// endpointing rules, hotwords, grammar resources, and optional homophone\n/// replacement resources.\nclass OnlineRecognizerConfig {\n  const OnlineRecognizerConfig({\n    this.feat = const FeatureConfig(),\n    required this.model,\n    this.decodingMethod = 'greedy_search',\n    this.maxActivePaths = 4,\n    this.enableEndpoint = true,\n    this.rule1MinTrailingSilence = 2.4,\n    this.rule2MinTrailingSilence = 1.2,\n    this.rule3MinUtteranceLength = 20,\n    this.hotwordsFile = '',\n    this.hotwordsScore = 1.5,\n    this.ctcFstDecoderConfig = const OnlineCtcFstDecoderConfig(),\n    this.ruleFsts = '',\n    this.ruleFars = '',\n    this.blankPenalty = 0.0,\n    this.hr = const HomophoneReplacerConfig(),\n  });\n\n  factory OnlineRecognizerConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineRecognizerConfig(\n      feat: FeatureConfig.fromJson(\n          json['feat'] as Map<String, dynamic>? ?? const {}),\n      model: OnlineModelConfig.fromJson(json['model'] as Map<String, dynamic>),\n      decodingMethod: json['decodingMethod'] as String? ?? 'greedy_search',\n      maxActivePaths: json['maxActivePaths'] as int? ?? 4,\n      enableEndpoint: json['enableEndpoint'] as bool? ?? 
true,\n      rule1MinTrailingSilence:\n          (json['rule1MinTrailingSilence'] as num?)?.toDouble() ?? 2.4,\n      rule2MinTrailingSilence:\n          (json['rule2MinTrailingSilence'] as num?)?.toDouble() ?? 1.2,\n      rule3MinUtteranceLength:\n          (json['rule3MinUtteranceLength'] as num?)?.toDouble() ?? 20.0,\n      hotwordsFile: json['hotwordsFile'] as String? ?? '',\n      hotwordsScore: (json['hotwordsScore'] as num?)?.toDouble() ?? 1.5,\n      ctcFstDecoderConfig: OnlineCtcFstDecoderConfig.fromJson(\n          json['ctcFstDecoderConfig'] as Map<String, dynamic>? ?? const {}),\n      ruleFsts: json['ruleFsts'] as String? ?? '',\n      ruleFars: json['ruleFars'] as String? ?? '',\n      blankPenalty: (json['blankPenalty'] as num?)?.toDouble() ?? 0.0,\n      hr: HomophoneReplacerConfig.fromJson(\n          json['hr'] as Map<String, dynamic>? ?? const {}),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineRecognizerConfig(feat: $feat, model: $model, decodingMethod: $decodingMethod, maxActivePaths: $maxActivePaths, enableEndpoint: $enableEndpoint, rule1MinTrailingSilence: $rule1MinTrailingSilence, rule2MinTrailingSilence: $rule2MinTrailingSilence, rule3MinUtteranceLength: $rule3MinUtteranceLength, hotwordsFile: $hotwordsFile, hotwordsScore: $hotwordsScore, ctcFstDecoderConfig: $ctcFstDecoderConfig, ruleFsts: $ruleFsts, ruleFars: $ruleFars, blankPenalty: $blankPenalty, hr: $hr)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'feat': feat.toJson(),\n        'model': model.toJson(),\n        'decodingMethod': decodingMethod,\n        'maxActivePaths': maxActivePaths,\n        'enableEndpoint': enableEndpoint,\n        'rule1MinTrailingSilence': rule1MinTrailingSilence,\n        'rule2MinTrailingSilence': rule2MinTrailingSilence,\n        'rule3MinUtteranceLength': rule3MinUtteranceLength,\n        'hotwordsFile': hotwordsFile,\n        'hotwordsScore': hotwordsScore,\n        'ctcFstDecoderConfig': ctcFstDecoderConfig.toJson(),\n  
      'ruleFsts': ruleFsts,\n        'ruleFars': ruleFars,\n        'blankPenalty': blankPenalty,\n        'hr': hr.toJson(),\n      };\n\n  final FeatureConfig feat;\n  final OnlineModelConfig model;\n  final String decodingMethod;\n\n  final int maxActivePaths;\n\n  final bool enableEndpoint;\n\n  final double rule1MinTrailingSilence;\n\n  final double rule2MinTrailingSilence;\n\n  final double rule3MinUtteranceLength;\n\n  final String hotwordsFile;\n\n  final double hotwordsScore;\n\n  final OnlineCtcFstDecoderConfig ctcFstDecoderConfig;\n  final String ruleFsts;\n  final String ruleFars;\n\n  final double blankPenalty;\n  final HomophoneReplacerConfig hr;\n}\n\n/// Streaming recognition result returned by [OnlineRecognizer.getResult].\nclass OnlineRecognizerResult {\n  OnlineRecognizerResult(\n      {required this.text, required this.tokens, required this.timestamps});\n\n  factory OnlineRecognizerResult.fromJson(Map<String, dynamic> json) {\n    return OnlineRecognizerResult(\n      text: json['text'] as String,\n      tokens: List<String>.from(json['tokens'] as List),\n      timestamps: (json['timestamps'] as List)\n          .map<double>((e) => (e as num).toDouble())\n          .toList(),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineRecognizerResult(text: $text, tokens: $tokens, timestamps: $timestamps)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'text': text,\n        'tokens': tokens,\n        'timestamps': timestamps,\n      };\n\n  final String text;\n  final List<String> tokens;\n  final List<double> timestamps;\n}\n\n/// Streaming speech recognizer.\n///\n/// Create one from an [OnlineRecognizerConfig], then feed chunks to an\n/// [OnlineStream] and call [decode] while [isReady] is true.\nclass OnlineRecognizer {\n  OnlineRecognizer.fromPtr({required this.ptr, required this.config});\n\n  OnlineRecognizer._({required this.ptr, required this.config});\n\n  /// Create a recognizer from [config].\n  ///\n  /// The caller is responsible for calling [free] on the 
returned instance to\n  /// avoid a memory leak.\n  factory OnlineRecognizer(OnlineRecognizerConfig config) {\n    if (SherpaOnnxBindings.createOnlineRecognizer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOnlineRecognizerConfig>();\n    c.ref.feat.sampleRate = config.feat.sampleRate;\n    c.ref.feat.featureDim = config.feat.featureDim;\n\n    // transducer\n    c.ref.model.transducer.encoder =\n        config.model.transducer.encoder.toNativeUtf8();\n    c.ref.model.transducer.decoder =\n        config.model.transducer.decoder.toNativeUtf8();\n    c.ref.model.transducer.joiner =\n        config.model.transducer.joiner.toNativeUtf8();\n\n    // paraformer\n    c.ref.model.paraformer.encoder =\n        config.model.paraformer.encoder.toNativeUtf8();\n    c.ref.model.paraformer.decoder =\n        config.model.paraformer.decoder.toNativeUtf8();\n\n    // zipformer2Ctc\n    c.ref.model.zipformer2Ctc.model =\n        config.model.zipformer2Ctc.model.toNativeUtf8();\n\n    // nemoCtc\n    c.ref.model.nemoCtc.model = config.model.nemoCtc.model.toNativeUtf8();\n\n    // toneCtc\n    c.ref.model.toneCtc.model = config.model.toneCtc.model.toNativeUtf8();\n\n    c.ref.model.tokens = config.model.tokens.toNativeUtf8();\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n    c.ref.model.debug = config.model.debug ? 1 : 0;\n    c.ref.model.modelType = config.model.modelType.toNativeUtf8();\n    c.ref.model.modelingUnit = config.model.modelingUnit.toNativeUtf8();\n    c.ref.model.bpeVocab = config.model.bpeVocab.toNativeUtf8();\n\n    c.ref.decodingMethod = config.decodingMethod.toNativeUtf8();\n    c.ref.maxActivePaths = config.maxActivePaths;\n    c.ref.enableEndpoint = config.enableEndpoint ? 
1 : 0;\n    c.ref.rule1MinTrailingSilence = config.rule1MinTrailingSilence;\n    c.ref.rule2MinTrailingSilence = config.rule2MinTrailingSilence;\n    c.ref.rule3MinUtteranceLength = config.rule3MinUtteranceLength;\n    c.ref.hotwordsFile = config.hotwordsFile.toNativeUtf8();\n    c.ref.hotwordsScore = config.hotwordsScore;\n\n    c.ref.ctcFstDecoderConfig.graph =\n        config.ctcFstDecoderConfig.graph.toNativeUtf8();\n    c.ref.ctcFstDecoderConfig.maxActive = config.ctcFstDecoderConfig.maxActive;\n    c.ref.ruleFsts = config.ruleFsts.toNativeUtf8();\n    c.ref.ruleFars = config.ruleFars.toNativeUtf8();\n\n    c.ref.blankPenalty = config.blankPenalty;\n\n    c.ref.hr.lexicon = config.hr.lexicon.toNativeUtf8();\n    c.ref.hr.ruleFsts = config.hr.ruleFsts.toNativeUtf8();\n\n    final ptr = SherpaOnnxBindings.createOnlineRecognizer?.call(c) ?? nullptr;\n\n    calloc.free(c.ref.hr.lexicon);\n    calloc.free(c.ref.hr.ruleFsts);\n    calloc.free(c.ref.ruleFars);\n    calloc.free(c.ref.ruleFsts);\n    calloc.free(c.ref.ctcFstDecoderConfig.graph);\n    calloc.free(c.ref.hotwordsFile);\n    calloc.free(c.ref.decodingMethod);\n    calloc.free(c.ref.model.bpeVocab);\n    calloc.free(c.ref.model.modelingUnit);\n    calloc.free(c.ref.model.modelType);\n    calloc.free(c.ref.model.provider);\n    calloc.free(c.ref.model.tokens);\n    calloc.free(c.ref.model.toneCtc.model);\n    calloc.free(c.ref.model.nemoCtc.model);\n    calloc.free(c.ref.model.zipformer2Ctc.model);\n    calloc.free(c.ref.model.paraformer.encoder);\n    calloc.free(c.ref.model.paraformer.decoder);\n\n    calloc.free(c.ref.model.transducer.encoder);\n    calloc.free(c.ref.model.transducer.decoder);\n    calloc.free(c.ref.model.transducer.joiner);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create online recognizer. 
Please check your config\");\n    }\n\n    return OnlineRecognizer._(ptr: ptr, config: config);\n  }\n\n  /// Release the native recognizer.\n  void free() {\n    if (SherpaOnnxBindings.destroyOnlineRecognizer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyOnlineRecognizer?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create a streaming input stream.\n  ///\n  /// If [hotwords] is provided, the stream uses those per-stream hotwords in\n  /// addition to any recognizer-wide settings.\n  ///\n  /// The caller must invoke [OnlineStream.free] on the returned instance to\n  /// avoid a memory leak.\n  OnlineStream createStream({String hotwords = ''}) {\n    if (hotwords == '') {\n      if (SherpaOnnxBindings.createOnlineStream == null) {\n        throw Exception(\"Please initialize sherpa-onnx first\");\n      }\n    } else {\n      if (SherpaOnnxBindings.createOnlineStreamWithHotwords == null) {\n        throw Exception(\"Please initialize sherpa-onnx first\");\n      }\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    if (hotwords == '') {\n      final p = SherpaOnnxBindings.createOnlineStream?.call(ptr) ?? 
nullptr;\n      if (p == nullptr) {\n        throw Exception(\"Failed to create online stream\");\n      }\n      return OnlineStream(ptr: p);\n    }\n\n    final utf8 = hotwords.toNativeUtf8();\n    final p =\n        SherpaOnnxBindings.createOnlineStreamWithHotwords?.call(ptr, utf8) ??\n            nullptr;\n    calloc.free(utf8);\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    return OnlineStream(ptr: p);\n  }\n\n  /// Return `true` if the recognizer has enough audio to run another step.\n  bool isReady(OnlineStream stream) {\n    if (SherpaOnnxBindings.isOnlineStreamReady == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return false;\n    }\n\n    int ready =\n        SherpaOnnxBindings.isOnlineStreamReady?.call(ptr, stream.ptr) ?? 0;\n\n    return ready == 1;\n  }\n\n  /// Fetch the current recognition hypothesis.\n  OnlineRecognizerResult getResult(OnlineStream stream) {\n    if (SherpaOnnxBindings.getOnlineStreamResultAsJson == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return OnlineRecognizerResult(text: '', tokens: [], timestamps: []);\n    }\n\n    final json =\n        SherpaOnnxBindings.getOnlineStreamResultAsJson?.call(ptr, stream.ptr) ??\n            nullptr;\n    if (json == nullptr) {\n      return OnlineRecognizerResult(text: '', tokens: [], timestamps: []);\n    }\n\n    final parsedJson = jsonDecode(toDartString(json));\n\n    SherpaOnnxBindings.destroyOnlineStreamResultJson?.call(json);\n\n    return OnlineRecognizerResult(\n        text: parsedJson['text'],\n        tokens: List<String>.from(parsedJson['tokens']),\n        timestamps: List<double>.from(parsedJson['timestamps']));\n  }\n\n  /// Reset stream state after an endpoint or utterance boundary.\n  void reset(OnlineStream stream) {\n    if 
(SherpaOnnxBindings.reset == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.reset?.call(ptr, stream.ptr);\n  }\n\n  /// Decode one incremental step for [stream].\n  void decode(OnlineStream stream) {\n    if (SherpaOnnxBindings.decodeOnlineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.decodeOnlineStream?.call(ptr, stream.ptr);\n  }\n\n  /// Return `true` if endpointing rules say the current utterance has ended.\n  bool isEndpoint(OnlineStream stream) {\n    if (SherpaOnnxBindings.isEndpoint == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return false;\n    }\n\n    int yes = SherpaOnnxBindings.isEndpoint?.call(ptr, stream.ptr) ?? 0;\n\n    return yes == 1;\n  }\n\n  Pointer<SherpaOnnxOnlineRecognizer> ptr;\n  OnlineRecognizerConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/online_speech_denoiser.dart",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\n\nimport 'package:ffi/ffi.dart';\n\nimport './offline_speech_denoiser.dart';\nimport './sherpa_onnx_bindings.dart';\n\n/// Streaming speech denoising.\n///\n/// Call [run] on consecutive chunks, then [flush] after the final chunk to\n/// drain any buffered state.\nclass OnlineSpeechDenoiserConfig {\n  const OnlineSpeechDenoiserConfig({\n    this.model = const OfflineSpeechDenoiserModelConfig(),\n  });\n\n  factory OnlineSpeechDenoiserConfig.fromJson(Map<String, dynamic> json) {\n    return OnlineSpeechDenoiserConfig(\n      model: json['model'] != null\n          ? OfflineSpeechDenoiserModelConfig.fromJson(\n              json['model'] as Map<String, dynamic>,\n            )\n          : const OfflineSpeechDenoiserModelConfig(),\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OnlineSpeechDenoiserConfig(model: $model)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model.toJson(),\n      };\n\n  final OfflineSpeechDenoiserModelConfig model;\n}\n\n/// Streaming speech denoiser.\nclass OnlineSpeechDenoiser {\n  OnlineSpeechDenoiser.fromPtr({required this.ptr, required this.config});\n\n  OnlineSpeechDenoiser._({required this.ptr, required this.config});\n\n  /// Create a streaming denoiser from [config].\n  factory OnlineSpeechDenoiser(OnlineSpeechDenoiserConfig config) {\n    if (SherpaOnnxBindings.sherpaOnnxCreateOnlineSpeechDenoiser == null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    final c = calloc<SherpaOnnxOnlineSpeechDenoiserConfig>();\n    c.ref.model.gtcrn.model = config.model.gtcrn.model.toNativeUtf8();\n    c.ref.model.dpdfnet.model = config.model.dpdfnet.model.toNativeUtf8();\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 
1 : 0;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n\n    final ptr =\n        SherpaOnnxBindings.sherpaOnnxCreateOnlineSpeechDenoiser?.call(c) ??\n            nullptr;\n\n    calloc.free(c.ref.model.provider);\n    calloc.free(c.ref.model.gtcrn.model);\n    calloc.free(c.ref.model.dpdfnet.model);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n        'Failed to create online speech denoiser. Please check your config',\n      );\n    }\n\n    return OnlineSpeechDenoiser._(ptr: ptr, config: config);\n  }\n\n  /// Release the native denoiser.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroyOnlineSpeechDenoiser == null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.sherpaOnnxDestroyOnlineSpeechDenoiser?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Denoise one input chunk.\n  DenoisedAudio run({required Float32List samples, required int sampleRate}) {\n    if (SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserRun == null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final n = samples.length;\n    final Pointer<Float> psamples = calloc<Float>(n);\n    final pList = psamples.asTypedList(n);\n    pList.setAll(0, samples);\n\n    final p =\n        SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserRun?.call(\n              ptr,\n              psamples,\n              n,\n              sampleRate,\n            ) ??\n            nullptr;\n\n    calloc.free(psamples);\n\n    if (p == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final sampleRateOut = p.ref.sampleRate;\n    final nOut = p.ref.n;\n    Float32List newSamples = Float32List(0);\n    if (nOut > 0 && p.ref.samples != nullptr) {\n      newSamples = 
Float32List.fromList(p.ref.samples.asTypedList(nOut));\n    }\n\n    SherpaOnnxBindings.sherpaOnnxDestroyDenoisedAudio?.call(p);\n\n    return DenoisedAudio(samples: newSamples, sampleRate: sampleRateOut);\n  }\n\n  /// Flush buffered output after the final chunk.\n  DenoisedAudio flush() {\n    if (SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserFlush == null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final p =\n        SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserFlush?.call(ptr) ??\n            nullptr;\n\n    if (p == nullptr) {\n      return DenoisedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final sampleRateOut = p.ref.sampleRate;\n    final nOut = p.ref.n;\n    Float32List newSamples = Float32List(0);\n    if (nOut > 0 && p.ref.samples != nullptr) {\n      newSamples = Float32List.fromList(p.ref.samples.asTypedList(nOut));\n    }\n\n    SherpaOnnxBindings.sherpaOnnxDestroyDenoisedAudio?.call(p);\n\n    return DenoisedAudio(samples: newSamples, sampleRate: sampleRateOut);\n  }\n\n  /// Reset the streaming state.\n  void reset() {\n    if (SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserReset == null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserReset?.call(ptr);\n  }\n\n  /// Return the expected sample rate for this denoiser.\n  int get sampleRate {\n    if (SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserGetSampleRate ==\n        null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserGetSampleRate?.call(\n          ptr,\n        ) ??\n        0;\n  }\n\n  /// Return the preferred frame shift in samples.\n  int get 
frameShiftInSamples {\n    if (SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples ==\n        null) {\n      throw Exception('Please initialize sherpa-onnx first');\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.sherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples\n            ?.call(ptr) ??\n        0;\n  }\n\n  Pointer<SherpaOnnxOnlineSpeechDenoiser> ptr;\n  OnlineSpeechDenoiserConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/online_stream.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Input stream for streaming APIs such as online ASR and keyword spotting.\nclass OnlineStream {\n  /// The user has to call OnlineStream.free() to avoid memory leak.\n  OnlineStream({required this.ptr});\n\n  /// Release the native stream.\n  void free() {\n    if (SherpaOnnxBindings.destroyOnlineStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyOnlineStream?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// If you have List<double> data, then you can use\n  /// Float32List.fromList(data) to convert data to Float32List\n  ///\n  /// See\n  ///  https://api.flutter.dev/flutter/dart-core/List-class.html\n  /// and\n  ///  https://api.flutter.dev/flutter/dart-typed_data/Float32List-class.html\n  /// Append waveform samples to the stream.\n  ///\n  /// [samples] must contain mono floating-point PCM data normalized to\n  /// `[-1, 1]`. 
Feed your audio in chunks, then call [inputFinished] after the\n  /// last chunk if you want the recognizer to flush trailing context.\n  void acceptWaveform({required Float32List samples, required int sampleRate}) {\n    if (SherpaOnnxBindings.onlineStreamAcceptWaveform == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    final n = samples.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, samples);\n\n    SherpaOnnxBindings.onlineStreamAcceptWaveform?.call(ptr, sampleRate, p, n);\n\n    calloc.free(p);\n  }\n\n  /// Mark the end of input.\n  void inputFinished() {\n    if (SherpaOnnxBindings.onlineStreamInputFinished == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.onlineStreamInputFinished?.call(ptr);\n  }\n\n  Pointer<SherpaOnnxOnlineStream> ptr;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/sherpa_onnx_bindings.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'package:ffi/ffi.dart';\n\nfinal class SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineSpeechDenoiserModelConfig extends Struct {\n  external SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n\n  external SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n}\n\nfinal class SherpaOnnxOfflineSpeechDenoiserConfig extends Struct {\n  external SherpaOnnxOfflineSpeechDenoiserModelConfig model;\n}\n\nfinal class SherpaOnnxOnlineSpeechDenoiserConfig extends Struct {\n  external SherpaOnnxOfflineSpeechDenoiserModelConfig model;\n}\n\nfinal class SherpaOnnxDenoisedAudio extends Struct {\n  external Pointer<Float> samples;\n\n  @Int32()\n  external int n;\n\n  @Int32()\n  external int sampleRate;\n}\n\nfinal class SherpaOnnxSpeakerEmbeddingExtractorConfig extends Struct {\n  external Pointer<Utf8> model;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxOfflineSpeakerDiarizationSegment extends Struct {\n  @Float()\n  external double start;\n\n  @Float()\n  external double end;\n\n  @Int32()\n  external int speaker;\n}\n\nfinal class SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig\n    extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineSpeakerSegmentationModelConfig extends Struct {\n  external SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig pyannote;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxFastClusteringConfig extends 
Struct {\n  @Int32()\n  external int numClusters;\n\n  @Float()\n  external double threshold;\n}\n\nfinal class SherpaOnnxOfflineSpeakerDiarizationConfig extends Struct {\n  external SherpaOnnxOfflineSpeakerSegmentationModelConfig segmentation;\n  external SherpaOnnxSpeakerEmbeddingExtractorConfig embedding;\n  external SherpaOnnxFastClusteringConfig clustering;\n\n  @Float()\n  external double minDurationOn;\n\n  @Float()\n  external double minDurationOff;\n}\n\nfinal class SherpaOnnxOfflinePunctuationModelConfig extends Struct {\n  external Pointer<Utf8> ctTransformer;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxOfflinePunctuationConfig extends Struct {\n  external SherpaOnnxOfflinePunctuationModelConfig model;\n}\n\nfinal class SherpaOnnxOnlinePunctuationModelConfig extends Struct {\n  external Pointer<Utf8> cnnBiLstm;\n  external Pointer<Utf8> bpeVocab;\n  @Int32()\n  external int numThreads;\n  @Int32()\n  external int debug;\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxOnlinePunctuationConfig extends Struct {\n  external SherpaOnnxOnlinePunctuationModelConfig model;\n}\n\nfinal class SherpaOnnxOfflineZipformerAudioTaggingModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxAudioTaggingModelConfig extends Struct {\n  external SherpaOnnxOfflineZipformerAudioTaggingModelConfig zipformer;\n  external Pointer<Utf8> ced;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxAudioTaggingConfig extends Struct {\n  external SherpaOnnxAudioTaggingModelConfig model;\n  external Pointer<Utf8> labels;\n\n  @Int32()\n  external int topK;\n}\n\nfinal class SherpaOnnxAudioEvent extends Struct {\n  external Pointer<Utf8> name;\n\n  @Int32()\n  external int index;\n\n  @Float()\n  external double prob;\n}\n\nfinal class 
SherpaOnnxOfflineTtsVitsModelConfig extends Struct {\n  external Pointer<Utf8> model;\n  external Pointer<Utf8> lexicon;\n  external Pointer<Utf8> tokens;\n  external Pointer<Utf8> dataDir;\n\n  @Float()\n  external double noiseScale;\n\n  @Float()\n  external double noiseScaleW;\n\n  @Float()\n  external double lengthScale;\n\n  external Pointer<Utf8> dictDir;\n}\n\nfinal class SherpaOnnxOfflineTtsMatchaModelConfig extends Struct {\n  external Pointer<Utf8> acousticModel;\n  external Pointer<Utf8> vocoder;\n  external Pointer<Utf8> lexicon;\n  external Pointer<Utf8> tokens;\n  external Pointer<Utf8> dataDir;\n\n  @Float()\n  external double noiseScale;\n\n  @Float()\n  external double lengthScale;\n\n  external Pointer<Utf8> dictDir;\n}\n\nfinal class SherpaOnnxOfflineTtsKokoroModelConfig extends Struct {\n  external Pointer<Utf8> model;\n  external Pointer<Utf8> voices;\n  external Pointer<Utf8> tokens;\n  external Pointer<Utf8> dataDir;\n\n  @Float()\n  external double lengthScale;\n  external Pointer<Utf8> dictDir;\n  external Pointer<Utf8> lexicon;\n  external Pointer<Utf8> lang;\n}\n\nfinal class SherpaOnnxOfflineTtsKittenModelConfig extends Struct {\n  external Pointer<Utf8> model;\n  external Pointer<Utf8> voices;\n  external Pointer<Utf8> tokens;\n  external Pointer<Utf8> dataDir;\n\n  @Float()\n  external double lengthScale;\n}\n\nfinal class SherpaOnnxOfflineTtsZipVoiceModelConfig extends Struct {\n  external Pointer<Utf8> tokens;\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n  external Pointer<Utf8> vocoder;\n  external Pointer<Utf8> dataDir;\n  external Pointer<Utf8> lexicon;\n\n  @Float()\n  external double featScale;\n\n  @Float()\n  external double tShift;\n\n  @Float()\n  external double targetRms;\n\n  @Float()\n  external double guidanceScale;\n}\n\nfinal class SherpaOnnxOfflineTtsPocketModelConfig extends Struct {\n  external Pointer<Utf8> lmFlow;\n  external Pointer<Utf8> lmMain;\n  external Pointer<Utf8> encoder;\n  
external Pointer<Utf8> decoder;\n  external Pointer<Utf8> textConditioner;\n  external Pointer<Utf8> vocabJson;\n  external Pointer<Utf8> tokenScoresJson;\n\n  @Int32()\n  external int voiceEmbeddingCacheCapacity;\n}\n\nfinal class SherpaOnnxOfflineTtsSupertonicModelConfig extends Struct {\n  external Pointer<Utf8> durationPredictor;\n  external Pointer<Utf8> textEncoder;\n  external Pointer<Utf8> vectorEstimator;\n  external Pointer<Utf8> vocoder;\n  external Pointer<Utf8> ttsJson;\n  external Pointer<Utf8> unicodeIndexer;\n  external Pointer<Utf8> voiceStyle;\n}\n\nfinal class SherpaOnnxOfflineTtsModelConfig extends Struct {\n  external SherpaOnnxOfflineTtsVitsModelConfig vits;\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n  external SherpaOnnxOfflineTtsMatchaModelConfig matcha;\n  external SherpaOnnxOfflineTtsKokoroModelConfig kokoro;\n  external SherpaOnnxOfflineTtsKittenModelConfig kitten;\n  external SherpaOnnxOfflineTtsZipVoiceModelConfig zipvoice;\n  external SherpaOnnxOfflineTtsPocketModelConfig pocket;\n  external SherpaOnnxOfflineTtsSupertonicModelConfig supertonic;\n}\n\nfinal class SherpaOnnxOfflineTtsConfig extends Struct {\n  external SherpaOnnxOfflineTtsModelConfig model;\n  external Pointer<Utf8> ruleFsts;\n\n  @Int32()\n  external int maxNumSenetences;\n\n  external Pointer<Utf8> ruleFars;\n\n  @Float()\n  external double silenceScale;\n}\n\nfinal class SherpaOnnxGenerationConfig extends Struct {\n  @Float()\n  external double silenceScale;\n\n  @Float()\n  external double speed;\n\n  @Int32()\n  external int sid;\n\n  external Pointer<Float> referenceAudio;\n\n  @Int32()\n  external int referenceAudioLength;\n\n  @Int32()\n  external int referenceSampleRate;\n\n  external Pointer<Utf8> referenceText;\n\n  @Int32()\n  external int numSteps;\n\n  external Pointer<Utf8> extra;\n}\n\nfinal class SherpaOnnxGeneratedAudio extends Struct {\n  external Pointer<Float> samples;\n\n  
@Int32()\n  external int n;\n\n  @Int32()\n  external int sampleRate;\n}\n\nfinal class SherpaOnnxFeatureConfig extends Struct {\n  @Int32()\n  external int sampleRate;\n\n  @Int32()\n  external int featureDim;\n}\n\nfinal class SherpaOnnxOfflineTransducerModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n  external Pointer<Utf8> joiner;\n}\n\nfinal class SherpaOnnxOfflineParaformerModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineNemoEncDecCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineDolphinModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineZipformerCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineWenetCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineOmnilingualAsrCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineMedAsrCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineFireRedAsrCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineFunAsrNanoModelConfig extends Struct {\n  external Pointer<Utf8> encoderAdaptor;\n  external Pointer<Utf8> llm;\n  external Pointer<Utf8> embedding;\n  external Pointer<Utf8> tokenizer;\n  external Pointer<Utf8> systemPrompt;\n  external Pointer<Utf8> userPrompt;\n\n  @Int32()\n  external int maxNewTokens;\n\n  @Float()\n  external double temperature;\n\n  @Float()\n  external double topP;\n\n  @Int32()\n  external int seed;\n\n  external Pointer<Utf8> language;\n\n  @Int32()\n  external int itn;\n\n  external Pointer<Utf8> hotwords;\n}\n\nfinal class SherpaOnnxOfflineWhisperModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n  external Pointer<Utf8> 
language;\n  external Pointer<Utf8> task;\n\n  @Int32()\n  external int tailPaddings;\n\n  @Int32()\n  external int enableTokenTimestamps;\n\n  @Int32()\n  external int enableSegmentTimestamps;\n}\n\nfinal class SherpaOnnxOfflineCanaryModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n  external Pointer<Utf8> srcLang;\n  external Pointer<Utf8> tgtLang;\n\n  @Int32()\n  external int usePnc;\n}\n\nfinal class SherpaOnnxOfflineMoonshineModelConfig extends Struct {\n  external Pointer<Utf8> preprocessor;\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> uncachedDecoder;\n  external Pointer<Utf8> cachedDecoder;\n  external Pointer<Utf8> mergedDecoder;\n}\n\nfinal class SherpaOnnxOfflineFireRedAsrModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n}\n\nfinal class SherpaOnnxOfflineTdnnModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOfflineSenseVoiceModelConfig extends Struct {\n  external Pointer<Utf8> model;\n  external Pointer<Utf8> language;\n\n  @Int32()\n  external int useInverseTextNormalization;\n}\n\nfinal class SherpaOnnxOfflineLMConfig extends Struct {\n  external Pointer<Utf8> model;\n\n  @Float()\n  external double scale;\n}\n\nfinal class SherpaOnnxOfflineModelConfig extends Struct {\n  external SherpaOnnxOfflineTransducerModelConfig transducer;\n  external SherpaOnnxOfflineParaformerModelConfig paraformer;\n  external SherpaOnnxOfflineNemoEncDecCtcModelConfig nemoCtc;\n  external SherpaOnnxOfflineWhisperModelConfig whisper;\n  external SherpaOnnxOfflineTdnnModelConfig tdnn;\n\n  external Pointer<Utf8> tokens;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n\n  external Pointer<Utf8> modelType;\n  external Pointer<Utf8> modelingUnit;\n  external Pointer<Utf8> bpeVocab;\n  external Pointer<Utf8> telespeechCtc;\n\n  external 
SherpaOnnxOfflineSenseVoiceModelConfig senseVoice;\n  external SherpaOnnxOfflineMoonshineModelConfig moonshine;\n  external SherpaOnnxOfflineFireRedAsrModelConfig fireRedAsr;\n  external SherpaOnnxOfflineDolphinModelConfig dolphin;\n  external SherpaOnnxOfflineZipformerCtcModelConfig zipformerCtc;\n  external SherpaOnnxOfflineCanaryModelConfig canary;\n  external SherpaOnnxOfflineWenetCtcModelConfig wenetCtc;\n  external SherpaOnnxOfflineOmnilingualAsrCtcModelConfig omnilingual;\n  external SherpaOnnxOfflineMedAsrCtcModelConfig medasr;\n  external SherpaOnnxOfflineFunAsrNanoModelConfig funasrNano;\n  external SherpaOnnxOfflineFireRedAsrCtcModelConfig fireRedAsrCtc;\n}\n\nfinal class SherpaOnnxOfflineRecognizerConfig extends Struct {\n  external SherpaOnnxFeatureConfig feat;\n  external SherpaOnnxOfflineModelConfig model;\n  external SherpaOnnxOfflineLMConfig lm;\n  external Pointer<Utf8> decodingMethod;\n\n  @Int32()\n  external int maxActivePaths;\n\n  external Pointer<Utf8> hotwordsFile;\n\n  @Float()\n  external double hotwordsScore;\n\n  external Pointer<Utf8> ruleFsts;\n  external Pointer<Utf8> ruleFars;\n\n  @Float()\n  external double blankPenalty;\n  external SherpaOnnxHomophoneReplacerConfig hr;\n}\n\nfinal class SherpaOnnxOnlineTransducerModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n  external Pointer<Utf8> joiner;\n}\n\nfinal class SherpaOnnxOnlineParaformerModelConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n}\n\nfinal class SherpaOnnxOnlineZipformer2CtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOnlineNemoCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOnlineToneCtcModelConfig extends Struct {\n  external Pointer<Utf8> model;\n}\n\nfinal class SherpaOnnxOnlineModelConfig extends Struct {\n  external SherpaOnnxOnlineTransducerModelConfig transducer;\n  external 
SherpaOnnxOnlineParaformerModelConfig paraformer;\n  external SherpaOnnxOnlineZipformer2CtcModelConfig zipformer2Ctc;\n\n  external Pointer<Utf8> tokens;\n\n  @Int32()\n  external int numThreads;\n\n  external Pointer<Utf8> provider;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> modelType;\n\n  external Pointer<Utf8> modelingUnit;\n\n  external Pointer<Utf8> bpeVocab;\n\n  external Pointer<Utf8> tokensBuf;\n\n  @Int32()\n  external int tokensBufSize;\n\n  external SherpaOnnxOnlineNemoCtcModelConfig nemoCtc;\n\n  external SherpaOnnxOnlineToneCtcModelConfig toneCtc;\n}\n\nfinal class SherpaOnnxOnlineCtcFstDecoderConfig extends Struct {\n  external Pointer<Utf8> graph;\n\n  @Int32()\n  external int maxActive;\n}\n\nfinal class SherpaOnnxHomophoneReplacerConfig extends Struct {\n  external Pointer<Utf8> dictDir;\n  external Pointer<Utf8> lexicon;\n  external Pointer<Utf8> ruleFsts;\n}\n\nfinal class SherpaOnnxOnlineRecognizerConfig extends Struct {\n  external SherpaOnnxFeatureConfig feat;\n  external SherpaOnnxOnlineModelConfig model;\n  external Pointer<Utf8> decodingMethod;\n\n  @Int32()\n  external int maxActivePaths;\n\n  @Int32()\n  external int enableEndpoint;\n\n  @Float()\n  external double rule1MinTrailingSilence;\n\n  @Float()\n  external double rule2MinTrailingSilence;\n\n  @Float()\n  external double rule3MinUtteranceLength;\n\n  external Pointer<Utf8> hotwordsFile;\n\n  @Float()\n  external double hotwordsScore;\n\n  external SherpaOnnxOnlineCtcFstDecoderConfig ctcFstDecoderConfig;\n\n  external Pointer<Utf8> ruleFsts;\n  external Pointer<Utf8> ruleFars;\n\n  @Float()\n  external double blankPenalty;\n\n  external Pointer<Utf8> hotwordsBuf;\n\n  @Int32()\n  external int hotwordsBufSize;\n  external SherpaOnnxHomophoneReplacerConfig hr;\n}\n\nfinal class SherpaOnnxSileroVadModelConfig extends Struct {\n  external Pointer<Utf8> model;\n\n  @Float()\n  external double threshold;\n\n  @Float()\n  external double minSilenceDuration;\n\n  
@Float()\n  external double minSpeechDuration;\n\n  @Int32()\n  external int windowSize;\n\n  @Float()\n  external double maxSpeechDuration;\n}\n\nfinal class SherpaOnnxTenVadModelConfig extends Struct {\n  external Pointer<Utf8> model;\n\n  @Float()\n  external double threshold;\n\n  @Float()\n  external double minSilenceDuration;\n\n  @Float()\n  external double minSpeechDuration;\n\n  @Int32()\n  external int windowSize;\n\n  @Float()\n  external double maxSpeechDuration;\n}\n\nfinal class SherpaOnnxVadModelConfig extends Struct {\n  external SherpaOnnxSileroVadModelConfig sileroVad;\n\n  @Int32()\n  external int sampleRate;\n\n  @Int32()\n  external int numThreads;\n\n  external Pointer<Utf8> provider;\n\n  @Int32()\n  external int debug;\n\n  external SherpaOnnxTenVadModelConfig tenVad;\n}\n\nfinal class SherpaOnnxSpeechSegment extends Struct {\n  @Int32()\n  external int start;\n\n  external Pointer<Float> samples;\n\n  @Int32()\n  external int n;\n}\n\nfinal class SherpaOnnxWave extends Struct {\n  external Pointer<Float> samples;\n\n  @Int32()\n  external int sampleRate;\n\n  @Int32()\n  external int numSamples;\n}\n\nfinal class SherpaOnnxKeywordSpotterConfig extends Struct {\n  external SherpaOnnxFeatureConfig feat;\n\n  external SherpaOnnxOnlineModelConfig model;\n\n  @Int32()\n  external int maxActivePaths;\n\n  @Int32()\n  external int numTrailingBlanks;\n\n  @Float()\n  external double keywordsScore;\n\n  @Float()\n  external double keywordsThreshold;\n\n  external Pointer<Utf8> keywordsFile;\n\n  external Pointer<Utf8> keywordsBuf;\n\n  @Int32()\n  external int keywordsBufSize;\n}\n\nfinal class SherpaOnnxOfflinePunctuation extends Opaque {}\n\nfinal class SherpaOnnxOnlinePunctuation extends Opaque {}\n\nfinal class SherpaOnnxAudioTagging extends Opaque {}\n\nfinal class SherpaOnnxKeywordSpotter extends Opaque {}\n\nfinal class SherpaOnnxOfflineTts extends Opaque {}\n\nfinal class SherpaOnnxCircularBuffer extends Opaque {}\n\nfinal class 
SherpaOnnxVoiceActivityDetector extends Opaque {}\n\nfinal class SherpaOnnxOnlineStream extends Opaque {}\n\nfinal class SherpaOnnxOnlineRecognizer extends Opaque {}\n\nfinal class SherpaOnnxOfflineRecognizer extends Opaque {}\n\nfinal class SherpaOnnxOfflineStream extends Opaque {}\n\nfinal class SherpaOnnxSpeakerEmbeddingExtractor extends Opaque {}\n\nfinal class SherpaOnnxSpeakerEmbeddingManager extends Opaque {}\n\nfinal class SherpaOnnxOfflineSpeakerDiarization extends Opaque {}\n\nfinal class SherpaOnnxOfflineSpeakerDiarizationResult extends Opaque {}\n\nfinal class SherpaOnnxSpokenLanguageIdentificationWhisperConfig extends Struct {\n  external Pointer<Utf8> encoder;\n  external Pointer<Utf8> decoder;\n\n  @Int32()\n  external int tailPaddings;\n}\n\nfinal class SherpaOnnxSpokenLanguageIdentificationConfig extends Struct {\n  external SherpaOnnxSpokenLanguageIdentificationWhisperConfig whisper;\n\n  @Int32()\n  external int numThreads;\n\n  @Int32()\n  external int debug;\n\n  external Pointer<Utf8> provider;\n}\n\nfinal class SherpaOnnxSpokenLanguageIdentificationResult extends Struct {\n  external Pointer<Utf8> lang;\n}\n\nfinal class SherpaOnnxSpokenLanguageIdentification extends Opaque {}\n\nfinal class SherpaOnnxOfflineSpeechDenoiser extends Opaque {}\n\nfinal class SherpaOnnxOnlineSpeechDenoiser extends Opaque {}\n\ntypedef SherpaOnnxCreateOfflineSpeechDenoiserNative =\n    Pointer<SherpaOnnxOfflineSpeechDenoiser> Function(\n      Pointer<SherpaOnnxOfflineSpeechDenoiserConfig>,\n    );\n\ntypedef SherpaOnnxCreateOfflineSpeechDenoiser =\n    SherpaOnnxCreateOfflineSpeechDenoiserNative;\n\ntypedef SherpaOnnxDestroyOfflineSpeechDenoiserNative =\n    Void Function(Pointer<SherpaOnnxOfflineSpeechDenoiser>);\n\ntypedef SherpaOnnxDestroyOfflineSpeechDenoiser =\n    void Function(Pointer<SherpaOnnxOfflineSpeechDenoiser>);\n\ntypedef SherpaOnnxOfflineSpeechDenoiserGetSampleRateNative =\n    Int32 Function(Pointer<SherpaOnnxOfflineSpeechDenoiser>);\n\ntypedef 
SherpaOnnxOfflineSpeechDenoiserGetSampleRate =\n    int Function(Pointer<SherpaOnnxOfflineSpeechDenoiser>);\n\ntypedef SherpaOnnxOfflineSpeechDenoiserRunNative =\n    Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOfflineSpeechDenoiser>,\n      Pointer<Float>,\n      Int32,\n      Int32,\n    );\n\ntypedef SherpaOnnxOfflineSpeechDenoiserRun =\n    Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOfflineSpeechDenoiser>,\n      Pointer<Float>,\n      int,\n      int,\n    );\n\ntypedef SherpaOnnxDestroyDenoisedAudioNative =\n    Void Function(Pointer<SherpaOnnxDenoisedAudio>);\n\ntypedef SherpaOnnxDestroyDenoisedAudio =\n    void Function(Pointer<SherpaOnnxDenoisedAudio>);\n\ntypedef SherpaOnnxCreateOnlineSpeechDenoiserNative =\n    Pointer<SherpaOnnxOnlineSpeechDenoiser> Function(\n      Pointer<SherpaOnnxOnlineSpeechDenoiserConfig>,\n    );\n\ntypedef SherpaOnnxCreateOnlineSpeechDenoiser =\n    SherpaOnnxCreateOnlineSpeechDenoiserNative;\n\ntypedef SherpaOnnxDestroyOnlineSpeechDenoiserNative =\n    Void Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxDestroyOnlineSpeechDenoiser =\n    void Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserGetSampleRateNative =\n    Int32 Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserGetSampleRate =\n    int Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamplesNative =\n    Int32 Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples =\n    int Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserRunNative =\n    Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOnlineSpeechDenoiser>,\n      Pointer<Float>,\n      Int32,\n      Int32,\n    );\n\ntypedef SherpaOnnxOnlineSpeechDenoiserRun =\n    
Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOnlineSpeechDenoiser>,\n      Pointer<Float>,\n      int,\n      int,\n    );\n\ntypedef SherpaOnnxOnlineSpeechDenoiserFlushNative =\n    Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOnlineSpeechDenoiser>,\n    );\n\ntypedef SherpaOnnxOnlineSpeechDenoiserFlush =\n    Pointer<SherpaOnnxDenoisedAudio> Function(\n      Pointer<SherpaOnnxOnlineSpeechDenoiser>,\n    );\n\ntypedef SherpaOnnxOnlineSpeechDenoiserResetNative =\n    Void Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxOnlineSpeechDenoiserReset =\n    void Function(Pointer<SherpaOnnxOnlineSpeechDenoiser>);\n\ntypedef SherpaOnnxCreateSpokenLanguageIdentificationNative =\n    Pointer<SherpaOnnxSpokenLanguageIdentification> Function(\n      Pointer<SherpaOnnxSpokenLanguageIdentificationConfig>,\n    );\n\ntypedef SherpaOnnxCreateSpokenLanguageIdentification =\n    SherpaOnnxCreateSpokenLanguageIdentificationNative;\n\ntypedef SherpaOnnxDestroySpokenLanguageIdentificationNative =\n    Void Function(Pointer<SherpaOnnxSpokenLanguageIdentification>);\n\ntypedef SherpaOnnxDestroySpokenLanguageIdentification =\n    void Function(Pointer<SherpaOnnxSpokenLanguageIdentification>);\n\ntypedef SherpaOnnxSpokenLanguageIdentificationCreateOfflineStreamNative =\n    Pointer<SherpaOnnxOfflineStream> Function(\n      Pointer<SherpaOnnxSpokenLanguageIdentification>,\n    );\n\ntypedef SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream =\n    SherpaOnnxSpokenLanguageIdentificationCreateOfflineStreamNative;\n\ntypedef SherpaOnnxSpokenLanguageIdentificationComputeNative =\n    Pointer<SherpaOnnxSpokenLanguageIdentificationResult> Function(\n      Pointer<SherpaOnnxSpokenLanguageIdentification>,\n      Pointer<SherpaOnnxOfflineStream>,\n    );\n\ntypedef SherpaOnnxSpokenLanguageIdentificationCompute =\n    SherpaOnnxSpokenLanguageIdentificationComputeNative;\n\ntypedef 
SherpaOnnxDestroySpokenLanguageIdentificationResultNative =\n    Void Function(Pointer<SherpaOnnxSpokenLanguageIdentificationResult>);\n\ntypedef SherpaOnnxDestroySpokenLanguageIdentificationResult =\n    void Function(Pointer<SherpaOnnxSpokenLanguageIdentificationResult>);\n\ntypedef SherpaOnnxCreateOfflineSpeakerDiarizationNative =\n    Pointer<SherpaOnnxOfflineSpeakerDiarization> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarizationConfig>,\n    );\n\ntypedef SherpaOnnxCreateOfflineSpeakerDiarization =\n    SherpaOnnxCreateOfflineSpeakerDiarizationNative;\n\ntypedef SherpaOnnxDestroyOfflineSpeakerDiarizationNative =\n    Void Function(Pointer<SherpaOnnxOfflineSpeakerDiarization>);\n\ntypedef SherpaOnnxDestroyOfflineSpeakerDiarization =\n    void Function(Pointer<SherpaOnnxOfflineSpeakerDiarization>);\n\ntypedef SherpaOnnxCreateOfflinePunctuationNative =\n    Pointer<SherpaOnnxOfflinePunctuation> Function(\n      Pointer<SherpaOnnxOfflinePunctuationConfig>,\n    );\n\ntypedef SherpaOnnxCreateOnlinePunctuationNative =\n    Pointer<SherpaOnnxOnlinePunctuation> Function(\n      Pointer<SherpaOnnxOnlinePunctuationConfig>,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationGetSampleRateNative =\n    Int32 Function(Pointer<SherpaOnnxOfflineSpeakerDiarization>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationGetSampleRate =\n    int Function(Pointer<SherpaOnnxOfflineSpeakerDiarization>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationSetConfigNative =\n    Void Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<SherpaOnnxOfflineSpeakerDiarizationConfig>,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakersNative =\n    Int32 Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers =\n    int Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegmentsNative =\n    Int32 
Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments =\n    int Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTimeNative =\n    Pointer<SherpaOnnxOfflineSpeakerDiarizationSegment> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime =\n    SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTimeNative;\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationDestroySegmentNative =\n    Void Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationSegment>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationDestroySegment =\n    void Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationSegment>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationProcessNative =\n    Pointer<SherpaOnnxOfflineSpeakerDiarizationResult> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<Float>,\n      Int32,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationProcess =\n    Pointer<SherpaOnnxOfflineSpeakerDiarizationResult> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<Float>,\n      int,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArgNative =\n    Int32 Function(Int32, Int32);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArgNative =\n    Pointer<SherpaOnnxOfflineSpeakerDiarizationResult> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<Float>,\n      Int32,\n      Pointer<\n        NativeFunction<\n          SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArgNative\n        >\n      >,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg =\n    Pointer<SherpaOnnxOfflineSpeakerDiarizationResult> Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<Float>,\n      int,\n      
Pointer<\n        NativeFunction<\n          SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArgNative\n        >\n      >,\n    );\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationDestroyResultNative =\n    Void Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationDestroyResult =\n    void Function(Pointer<SherpaOnnxOfflineSpeakerDiarizationResult>);\n\ntypedef SherpaOnnxOfflineSpeakerDiarizationSetConfig =\n    void Function(\n      Pointer<SherpaOnnxOfflineSpeakerDiarization>,\n      Pointer<SherpaOnnxOfflineSpeakerDiarizationConfig>,\n    );\n\ntypedef SherpaOnnxCreateOfflinePunctuation =\n    SherpaOnnxCreateOfflinePunctuationNative;\n\ntypedef SherpaOnnxDestroyOfflinePunctuationNative =\n    Void Function(Pointer<SherpaOnnxOfflinePunctuation>);\n\ntypedef SherpaOnnxDestroyOfflinePunctuation =\n    void Function(Pointer<SherpaOnnxOfflinePunctuation>);\n\ntypedef SherpaOfflinePunctuationAddPunctNative =\n    Pointer<Utf8> Function(\n      Pointer<SherpaOnnxOfflinePunctuation>,\n      Pointer<Utf8>,\n    );\n\ntypedef SherpaOfflinePunctuationAddPunct =\n    SherpaOfflinePunctuationAddPunctNative;\n\ntypedef SherpaOfflinePunctuationFreeTextNative = Void Function(Pointer<Utf8>);\n\ntypedef SherpaOfflinePunctuationFreeText = void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxCreateOnlinePunctuation =\n    SherpaOnnxCreateOnlinePunctuationNative;\n\ntypedef SherpaOnnxDestroyOnlinePunctuationNative =\n    Void Function(Pointer<SherpaOnnxOnlinePunctuation>);\n\ntypedef SherpaOnnxDestroyOnlinePunctuation =\n    void Function(Pointer<SherpaOnnxOnlinePunctuation>);\n\ntypedef SherpaOnnxOnlinePunctuationAddPunctNative =\n    Pointer<Utf8> Function(Pointer<SherpaOnnxOnlinePunctuation>, Pointer<Utf8>);\n\ntypedef SherpaOnnxOnlinePunctuationAddPunct =\n    SherpaOnnxOnlinePunctuationAddPunctNative;\n\ntypedef SherpaOnnxOnlinePunctuationFreeTextNative =\n    Void Function(Pointer<Utf8>);\n\ntypedef 
SherpaOnnxOnlinePunctuationFreeText = void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxCreateAudioTaggingNative =\n    Pointer<SherpaOnnxAudioTagging> Function(\n      Pointer<SherpaOnnxAudioTaggingConfig>,\n    );\n\ntypedef SherpaOnnxCreateAudioTagging = SherpaOnnxCreateAudioTaggingNative;\n\ntypedef SherpaOnnxDestroyAudioTaggingNative =\n    Void Function(Pointer<SherpaOnnxAudioTagging>);\n\ntypedef SherpaOnnxDestroyAudioTagging =\n    void Function(Pointer<SherpaOnnxAudioTagging>);\n\ntypedef SherpaOnnxAudioTaggingCreateOfflineStreamNative =\n    Pointer<SherpaOnnxOfflineStream> Function(Pointer<SherpaOnnxAudioTagging>);\n\ntypedef SherpaOnnxAudioTaggingCreateOfflineStream =\n    SherpaOnnxAudioTaggingCreateOfflineStreamNative;\n\ntypedef SherpaOnnxAudioTaggingComputeNative =\n    Pointer<Pointer<SherpaOnnxAudioEvent>> Function(\n      Pointer<SherpaOnnxAudioTagging>,\n      Pointer<SherpaOnnxOfflineStream>,\n      Int32,\n    );\n\ntypedef SherpaOnnxAudioTaggingCompute =\n    Pointer<Pointer<SherpaOnnxAudioEvent>> Function(\n      Pointer<SherpaOnnxAudioTagging>,\n      Pointer<SherpaOnnxOfflineStream>,\n      int,\n    );\n\ntypedef SherpaOnnxAudioTaggingFreeResultsNative =\n    Void Function(Pointer<Pointer<SherpaOnnxAudioEvent>>);\n\ntypedef SherpaOnnxAudioTaggingFreeResults =\n    void Function(Pointer<Pointer<SherpaOnnxAudioEvent>>);\n\ntypedef CreateKeywordSpotterNative =\n    Pointer<SherpaOnnxKeywordSpotter> Function(\n      Pointer<SherpaOnnxKeywordSpotterConfig>,\n    );\n\ntypedef CreateKeywordSpotter = CreateKeywordSpotterNative;\n\ntypedef DestroyKeywordSpotterNative =\n    Void Function(Pointer<SherpaOnnxKeywordSpotter>);\n\ntypedef DestroyKeywordSpotter =\n    void Function(Pointer<SherpaOnnxKeywordSpotter>);\n\ntypedef CreateKeywordStreamNative =\n    Pointer<SherpaOnnxOnlineStream> Function(Pointer<SherpaOnnxKeywordSpotter>);\n\ntypedef CreateKeywordStream = CreateKeywordStreamNative;\n\ntypedef CreateKeywordStreamWithKeywordsNative =\n    
Pointer<SherpaOnnxOnlineStream> Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<Utf8>,\n    );\n\ntypedef CreateKeywordStreamWithKeywords = CreateKeywordStreamWithKeywordsNative;\n\ntypedef IsKeywordStreamReadyNative =\n    Int32 Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef IsKeywordStreamReady =\n    int Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef DecodeKeywordStreamNative =\n    Void Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef DecodeKeywordStream =\n    void Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef ResetKeywordStreamNative =\n    Void Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef ResetKeywordStream =\n    void Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef GetKeywordResultAsJsonNative =\n    Pointer<Utf8> Function(\n      Pointer<SherpaOnnxKeywordSpotter>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef GetKeywordResultAsJson = GetKeywordResultAsJsonNative;\n\ntypedef FreeKeywordResultJsonNative = Void Function(Pointer<Utf8>);\n\ntypedef FreeKeywordResultJson = void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxCreateOfflineTtsNative =\n    Pointer<SherpaOnnxOfflineTts> Function(Pointer<SherpaOnnxOfflineTtsConfig>);\n\ntypedef SherpaOnnxCreateOfflineTts = SherpaOnnxCreateOfflineTtsNative;\n\ntypedef SherpaOnnxDestroyOfflineTtsNative =\n    Void Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxDestroyOfflineTts =\n    void Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxOfflineTtsSampleRateNative =\n    Int32 Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxOfflineTtsSampleRate =\n    int 
Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxOfflineTtsNumSpeakersNative =\n    Int32 Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxOfflineTtsNumSpeakers =\n    int Function(Pointer<SherpaOnnxOfflineTts>);\n\ntypedef SherpaOnnxOfflineTtsGenerateNative =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      Int32,\n      Float,\n    );\n\ntypedef SherpaOnnxOfflineTtsGenerate =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      int,\n      double,\n    );\n\ntypedef SherpaOnnxDestroyOfflineTtsGeneratedAudioNative =\n    Void Function(Pointer<SherpaOnnxGeneratedAudio>);\n\ntypedef SherpaOnnxDestroyOfflineTtsGeneratedAudio =\n    void Function(Pointer<SherpaOnnxGeneratedAudio>);\n\ntypedef SherpaOnnxGeneratedAudioCallbackNative =\n    Int32 Function(Pointer<Float>, Int32);\n\ntypedef SherpaOnnxGeneratedAudioProgressCallbackWithArgNative =\n    Int32 Function(Pointer<Float> samples, Int32 n, Float p, Pointer<Void> arg);\n\ntypedef SherpaOnnxGeneratedAudioProgressCallbackWithArg =\n    int Function(Pointer<Float> samples, int n, double p, Pointer<Void> arg);\n\ntypedef SherpaOnnxOfflineTtsGenerateWithCallbackNative =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      Int32,\n      Float,\n      Pointer<NativeFunction<SherpaOnnxGeneratedAudioCallbackNative>>,\n    );\n\ntypedef SherpaOnnxOfflineTtsGenerateWithCallback =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      int,\n      double,\n      Pointer<NativeFunction<SherpaOnnxGeneratedAudioCallbackNative>>,\n    );\n\ntypedef SherpaOnnxOfflineTtsGenerateWithConfigNative =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      Pointer<SherpaOnnxGenerationConfig>,\n      
Pointer<\n        NativeFunction<SherpaOnnxGeneratedAudioProgressCallbackWithArgNative>\n      >,\n      Pointer<Void>,\n    );\n\ntypedef SherpaOnnxOfflineTtsGenerateWithConfig =\n    Pointer<SherpaOnnxGeneratedAudio> Function(\n      Pointer<SherpaOnnxOfflineTts>,\n      Pointer<Utf8>,\n      Pointer<SherpaOnnxGenerationConfig>,\n      Pointer<\n        NativeFunction<SherpaOnnxGeneratedAudioProgressCallbackWithArgNative>\n      >,\n      Pointer<Void>,\n    );\n\ntypedef CreateOfflineRecognizerNative =\n    Pointer<SherpaOnnxOfflineRecognizer> Function(\n      Pointer<SherpaOnnxOfflineRecognizerConfig>,\n    );\n\ntypedef CreateOfflineRecognizer = CreateOfflineRecognizerNative;\n\ntypedef OfflineRecognizerSetConfigNative =\n    Void Function(\n      Pointer<SherpaOnnxOfflineRecognizer>,\n      Pointer<SherpaOnnxOfflineRecognizerConfig>,\n    );\n\ntypedef OfflineRecognizerSetConfig =\n    void Function(\n      Pointer<SherpaOnnxOfflineRecognizer>,\n      Pointer<SherpaOnnxOfflineRecognizerConfig>,\n    );\n\ntypedef DestroyOfflineRecognizerNative =\n    Void Function(Pointer<SherpaOnnxOfflineRecognizer>);\n\ntypedef DestroyOfflineRecognizer =\n    void Function(Pointer<SherpaOnnxOfflineRecognizer>);\n\ntypedef CreateOfflineStreamNative =\n    Pointer<SherpaOnnxOfflineStream> Function(\n      Pointer<SherpaOnnxOfflineRecognizer>,\n    );\n\ntypedef CreateOfflineStream = CreateOfflineStreamNative;\n\ntypedef DestroyOfflineStreamNative =\n    Void Function(Pointer<SherpaOnnxOfflineStream>);\n\ntypedef DestroyOfflineStream = void Function(Pointer<SherpaOnnxOfflineStream>);\n\ntypedef AcceptWaveformOfflineNative =\n    Void Function(\n      Pointer<SherpaOnnxOfflineStream>,\n      Int32,\n      Pointer<Float>,\n      Int32,\n    );\n\ntypedef AcceptWaveformOffline =\n    void Function(Pointer<SherpaOnnxOfflineStream>, int, Pointer<Float>, int);\n\ntypedef DecodeOfflineStreamNative =\n    Void Function(\n      Pointer<SherpaOnnxOfflineRecognizer>,\n      
Pointer<SherpaOnnxOfflineStream>,\n    );\n\ntypedef DecodeOfflineStream =\n    void Function(\n      Pointer<SherpaOnnxOfflineRecognizer>,\n      Pointer<SherpaOnnxOfflineStream>,\n    );\n\ntypedef GetOfflineStreamResultAsJsonNative =\n    Pointer<Utf8> Function(Pointer<SherpaOnnxOfflineStream>);\n\ntypedef GetOfflineStreamResultAsJson = GetOfflineStreamResultAsJsonNative;\n\ntypedef DestroyOfflineStreamResultJsonNative = Void Function(Pointer<Utf8>);\n\ntypedef DestroyOfflineStreamResultJson = void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxCreateOnlineRecognizerNative =\n    Pointer<SherpaOnnxOnlineRecognizer> Function(\n      Pointer<SherpaOnnxOnlineRecognizerConfig>,\n    );\n\ntypedef SherpaOnnxCreateOnlineRecognizer =\n    SherpaOnnxCreateOnlineRecognizerNative;\n\ntypedef SherpaOnnxDestroyOnlineRecognizerNative =\n    Void Function(Pointer<SherpaOnnxOnlineRecognizer>);\n\ntypedef SherpaOnnxDestroyOnlineRecognizer =\n    void Function(Pointer<SherpaOnnxOnlineRecognizer>);\n\ntypedef SherpaOnnxCreateOnlineStreamNative =\n    Pointer<SherpaOnnxOnlineStream> Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n    );\n\ntypedef SherpaOnnxCreateOnlineStream = SherpaOnnxCreateOnlineStreamNative;\n\ntypedef SherpaOnnxCreateOnlineStreamWithHotwordsNative =\n    Pointer<SherpaOnnxOnlineStream> Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<Utf8>,\n    );\n\ntypedef SherpaOnnxCreateOnlineStreamWithHotwords =\n    SherpaOnnxCreateOnlineStreamWithHotwordsNative;\n\ntypedef IsOnlineStreamReadyNative =\n    Int32 Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef IsOnlineStreamReady =\n    int Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef SherpaOnnxDecodeOnlineStreamNative =\n    Void Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef 
SherpaOnnxDecodeOnlineStream =\n    void Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef GetOnlineStreamResultAsJsonNative =\n    Pointer<Utf8> Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef GetOnlineStreamResultAsJson = GetOnlineStreamResultAsJsonNative;\n\ntypedef ResetNative =\n    Void Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef Reset =\n    void Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef IsEndpointNative =\n    Int32 Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef IsEndpoint =\n    int Function(\n      Pointer<SherpaOnnxOnlineRecognizer>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef DestroyOnlineStreamResultJsonNative = Void Function(Pointer<Utf8>);\n\ntypedef DestroyOnlineStreamResultJson = void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxCreateVoiceActivityDetectorNative =\n    Pointer<SherpaOnnxVoiceActivityDetector> Function(\n      Pointer<SherpaOnnxVadModelConfig>,\n      Float,\n    );\n\ntypedef SherpaOnnxCreateVoiceActivityDetector =\n    Pointer<SherpaOnnxVoiceActivityDetector> Function(\n      Pointer<SherpaOnnxVadModelConfig>,\n      double,\n    );\n\ntypedef SherpaOnnxDestroyVoiceActivityDetectorNative =\n    Void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxDestroyVoiceActivityDetector =\n    void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorAcceptWaveformNative =\n    Void Function(\n      Pointer<SherpaOnnxVoiceActivityDetector>,\n      Pointer<Float>,\n      Int32,\n    );\n\ntypedef SherpaOnnxVoiceActivityDetectorAcceptWaveform =\n    void Function(\n      Pointer<SherpaOnnxVoiceActivityDetector>,\n      Pointer<Float>,\n 
     int,\n    );\n\ntypedef SherpaOnnxVoiceActivityDetectorEmptyNative =\n    Int32 Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorEmpty =\n    int Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorDetectedNative =\n    Int32 Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorDetected =\n    int Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorPopNative =\n    Void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorPop =\n    void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorClearNative =\n    Void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorClear =\n    void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorResetNative =\n    Void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorReset =\n    void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorFlushNative =\n    Void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorFlush =\n    void Function(Pointer<SherpaOnnxVoiceActivityDetector>);\n\ntypedef SherpaOnnxVoiceActivityDetectorFrontNative =\n    Pointer<SherpaOnnxSpeechSegment> Function(\n      Pointer<SherpaOnnxVoiceActivityDetector>,\n    );\n\ntypedef SherpaOnnxVoiceActivityDetectorFront =\n    SherpaOnnxVoiceActivityDetectorFrontNative;\n\ntypedef SherpaOnnxDestroySpeechSegmentNative =\n    Void Function(Pointer<SherpaOnnxSpeechSegment>);\n\ntypedef SherpaOnnxDestroySpeechSegment =\n    void Function(Pointer<SherpaOnnxSpeechSegment>);\n\ntypedef SherpaOnnxCreateCircularBufferNative =\n    Pointer<SherpaOnnxCircularBuffer> Function(Int32);\n\ntypedef 
SherpaOnnxCreateCircularBuffer =\n    Pointer<SherpaOnnxCircularBuffer> Function(int);\n\ntypedef SherpaOnnxDestroyCircularBufferNative =\n    Void Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxDestroyCircularBuffer =\n    void Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferPushNative =\n    Void Function(Pointer<SherpaOnnxCircularBuffer>, Pointer<Float>, Int32);\n\ntypedef SherpaOnnxCircularBufferPush =\n    void Function(Pointer<SherpaOnnxCircularBuffer>, Pointer<Float>, int);\n\ntypedef SherpaOnnxCircularBufferGetNative =\n    Pointer<Float> Function(Pointer<SherpaOnnxCircularBuffer>, Int32, Int32);\n\ntypedef SherpaOnnxCircularBufferGet =\n    Pointer<Float> Function(Pointer<SherpaOnnxCircularBuffer>, int, int);\n\ntypedef SherpaOnnxCircularBufferFreeNative = Void Function(Pointer<Float>);\n\ntypedef SherpaOnnxCircularBufferFree = void Function(Pointer<Float>);\n\ntypedef SherpaOnnxCircularBufferPopNative =\n    Void Function(Pointer<SherpaOnnxCircularBuffer>, Int32);\n\ntypedef SherpaOnnxCircularBufferPop =\n    void Function(Pointer<SherpaOnnxCircularBuffer>, int);\n\ntypedef SherpaOnnxCircularBufferSizeNative =\n    Int32 Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferSize =\n    int Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferHeadNative =\n    Int32 Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferHead =\n    int Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferResetNative =\n    Void Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCircularBufferReset =\n    void Function(Pointer<SherpaOnnxCircularBuffer>);\n\ntypedef SherpaOnnxCreateSpeakerEmbeddingManagerNative =\n    Pointer<SherpaOnnxSpeakerEmbeddingManager> Function(Int32);\n\ntypedef SherpaOnnxCreateSpeakerEmbeddingManager =\n    Pointer<SherpaOnnxSpeakerEmbeddingManager> Function(int);\n\ntypedef 
SherpaOnnxDestroySpeakerEmbeddingManagerNative =\n    Void Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>);\n\ntypedef SherpaOnnxDestroySpeakerEmbeddingManager =\n    void Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerAddNative =\n    Int32 Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerAdd =\n    int Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerAddListFlattenedNative =\n    Int32 Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n      Int32,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerAddListFlattened =\n    int Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n      int,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerRemoveNative =\n    Int32 Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>, Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerRemove =\n    int Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>, Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerContainsNative =\n    Int32 Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>, Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerContains =\n    int Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>, Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerSearchNative =\n    Pointer<Utf8> Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Float>,\n      Float,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerSearch =\n    Pointer<Utf8> Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Float>,\n      double,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerFreeSearchNative =\n    Void 
Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerFreeSearch =\n    void Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerNumSpeakersNative =\n    Int32 Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerNumSpeakers =\n    int Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerVerifyNative =\n    Int32 Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n      Float,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerVerify =\n    int Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingManager>,\n      Pointer<Utf8>,\n      Pointer<Float>,\n      double,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakersNative =\n    Pointer<Pointer<Utf8>> Function(Pointer<SherpaOnnxSpeakerEmbeddingManager>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers =\n    SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakersNative;\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakersNative =\n    Void Function(Pointer<Pointer<Utf8>>);\n\ntypedef SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers =\n    void Function(Pointer<Pointer<Utf8>>);\n\ntypedef SherpaOnnxCreateSpeakerEmbeddingExtractorNative =\n    Pointer<SherpaOnnxSpeakerEmbeddingExtractor> Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingExtractorConfig>,\n    );\n\ntypedef SherpaOnnxCreateSpeakerEmbeddingExtractor =\n    SherpaOnnxCreateSpeakerEmbeddingExtractorNative;\n\ntypedef SherpaOnnxDestroySpeakerEmbeddingExtractorNative =\n    Void Function(Pointer<SherpaOnnxSpeakerEmbeddingExtractor>);\n\ntypedef SherpaOnnxDestroySpeakerEmbeddingExtractor =\n    void Function(Pointer<SherpaOnnxSpeakerEmbeddingExtractor>);\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorDimNative =\n    Int32 Function(Pointer<SherpaOnnxSpeakerEmbeddingExtractor>);\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorDim =\n    int 
Function(Pointer<SherpaOnnxSpeakerEmbeddingExtractor>);\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorCreateStreamNative =\n    Pointer<SherpaOnnxOnlineStream> Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingExtractor>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorCreateStream =\n    SherpaOnnxSpeakerEmbeddingExtractorCreateStreamNative;\n\ntypedef SherpaOnnxDestroyOnlineStreamNative =\n    Void Function(Pointer<SherpaOnnxOnlineStream>);\n\ntypedef SherpaOnnxDestroyOnlineStream =\n    void Function(Pointer<SherpaOnnxOnlineStream>);\n\ntypedef OnlineStreamAcceptWaveformNative =\n    Void Function(\n      Pointer<SherpaOnnxOnlineStream>,\n      Int32,\n      Pointer<Float>,\n      Int32,\n    );\n\ntypedef OnlineStreamAcceptWaveform =\n    void Function(Pointer<SherpaOnnxOnlineStream>, int, Pointer<Float>, int);\n\ntypedef OnlineStreamInputFinishedNative =\n    Void Function(Pointer<SherpaOnnxOnlineStream>);\n\ntypedef OnlineStreamInputFinished =\n    void Function(Pointer<SherpaOnnxOnlineStream>);\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorIsReadyNative =\n    Int32 Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingExtractor>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorIsReady =\n    int Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingExtractor>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorComputeEmbeddingNative =\n    Pointer<Float> Function(\n      Pointer<SherpaOnnxSpeakerEmbeddingExtractor>,\n      Pointer<SherpaOnnxOnlineStream>,\n    );\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding =\n    SherpaOnnxSpeakerEmbeddingExtractorComputeEmbeddingNative;\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbeddingNative =\n    Void Function(Pointer<Float>);\n\ntypedef SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding =\n    void Function(Pointer<Float>);\n\ntypedef SherpaOnnxReadWaveNative =\n    Pointer<SherpaOnnxWave> 
Function(Pointer<Utf8>);\n\ntypedef SherpaOnnxReadWave = SherpaOnnxReadWaveNative;\n\ntypedef SherpaOnnxWriteWaveNative =\n    Int32 Function(Pointer<Float>, Int32, Int32, Pointer<Utf8>);\n\ntypedef SherpaOnnxWriteWave =\n    int Function(Pointer<Float>, int, int, Pointer<Utf8>);\n\ntypedef SherpaOnnxFreeWaveNative = Void Function(Pointer<SherpaOnnxWave>);\n\ntypedef SherpaOnnxFreeWave = void Function(Pointer<SherpaOnnxWave>);\n\ntypedef SherpaOnnxGetVersionStr = Pointer<Utf8> Function();\ntypedef SherpaOnnxGetVersionStrNative = SherpaOnnxGetVersionStr;\n\ntypedef SherpaOnnxGetGitSha1Native = Pointer<Utf8> Function();\ntypedef SherpaOnnxGetGitSha1 = SherpaOnnxGetGitSha1Native;\n\ntypedef SherpaOnnxGetGitDateNative = Pointer<Utf8> Function();\ntypedef SherpaOnnxGetGitDate = SherpaOnnxGetGitDateNative;\n\nclass SherpaOnnxBindings {\n  static SherpaOnnxCreateOfflineSpeechDenoiser?\n  sherpaOnnxCreateOfflineSpeechDenoiser;\n\n  static SherpaOnnxDestroyOfflineSpeechDenoiser?\n  sherpaOnnxDestroyOfflineSpeechDenoiser;\n\n  static SherpaOnnxOfflineSpeechDenoiserGetSampleRate?\n  sherpaOnnxOfflineSpeechDenoiserGetSampleRate;\n  static SherpaOnnxOfflineSpeechDenoiserRun? sherpaOnnxOfflineSpeechDenoiserRun;\n  static SherpaOnnxDestroyDenoisedAudio? sherpaOnnxDestroyDenoisedAudio;\n  static SherpaOnnxCreateOnlineSpeechDenoiser?\n  sherpaOnnxCreateOnlineSpeechDenoiser;\n  static SherpaOnnxDestroyOnlineSpeechDenoiser?\n  sherpaOnnxDestroyOnlineSpeechDenoiser;\n  static SherpaOnnxOnlineSpeechDenoiserGetSampleRate?\n  sherpaOnnxOnlineSpeechDenoiserGetSampleRate;\n  static SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples?\n  sherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples;\n  static SherpaOnnxOnlineSpeechDenoiserRun? 
sherpaOnnxOnlineSpeechDenoiserRun;\n  static SherpaOnnxOnlineSpeechDenoiserFlush?\n  sherpaOnnxOnlineSpeechDenoiserFlush;\n  static SherpaOnnxOnlineSpeechDenoiserReset?\n  sherpaOnnxOnlineSpeechDenoiserReset;\n\n  static SherpaOnnxCreateSpokenLanguageIdentification?\n  sherpaOnnxCreateSpokenLanguageIdentification;\n  static SherpaOnnxDestroySpokenLanguageIdentification?\n  sherpaOnnxDestroySpokenLanguageIdentification;\n  static SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream?\n  sherpaOnnxSpokenLanguageIdentificationCreateOfflineStream;\n  static SherpaOnnxSpokenLanguageIdentificationCompute?\n  sherpaOnnxSpokenLanguageIdentificationCompute;\n  static SherpaOnnxDestroySpokenLanguageIdentificationResult?\n  sherpaOnnxDestroySpokenLanguageIdentificationResult;\n\n  static SherpaOnnxCreateOfflineSpeakerDiarization?\n  sherpaOnnxCreateOfflineSpeakerDiarization;\n  static SherpaOnnxDestroyOfflineSpeakerDiarization?\n  sherpaOnnxDestroyOfflineSpeakerDiarization;\n  static SherpaOnnxOfflineSpeakerDiarizationGetSampleRate?\n  sherpaOnnxOfflineSpeakerDiarizationGetSampleRate;\n  static SherpaOnnxOfflineSpeakerDiarizationSetConfig?\n  sherpaOnnxOfflineSpeakerDiarizationSetConfig;\n  static SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers?\n  sherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers;\n  static SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments?\n  sherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments;\n  static SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime?\n  sherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime;\n  static SherpaOnnxOfflineSpeakerDiarizationDestroySegment?\n  sherpaOnnxOfflineSpeakerDiarizationDestroySegment;\n  static SherpaOnnxOfflineSpeakerDiarizationProcess?\n  sherpaOnnxOfflineSpeakerDiarizationProcess;\n  static SherpaOnnxOfflineSpeakerDiarizationDestroyResult?\n  sherpaOnnxOfflineSpeakerDiarizationDestroyResult;\n  static SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg?\n  
sherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg;\n\n  static SherpaOnnxCreateOfflinePunctuation? sherpaOnnxCreateOfflinePunctuation;\n  static SherpaOnnxDestroyOfflinePunctuation?\n  sherpaOnnxDestroyOfflinePunctuation;\n  static SherpaOfflinePunctuationAddPunct? sherpaOfflinePunctuationAddPunct;\n  static SherpaOfflinePunctuationFreeText? sherpaOfflinePunctuationFreeText;\n\n  static SherpaOnnxCreateOnlinePunctuation? sherpaOnnxCreateOnlinePunctuation;\n  static SherpaOnnxDestroyOnlinePunctuation? sherpaOnnxDestroyOnlinePunctuation;\n  static SherpaOnnxOnlinePunctuationAddPunct?\n  sherpaOnnxOnlinePunctuationAddPunct;\n  static SherpaOnnxOnlinePunctuationFreeText?\n  sherpaOnnxOnlinePunctuationFreeText;\n\n  static SherpaOnnxCreateAudioTagging? sherpaOnnxCreateAudioTagging;\n  static SherpaOnnxDestroyAudioTagging? sherpaOnnxDestroyAudioTagging;\n  static SherpaOnnxAudioTaggingCreateOfflineStream?\n  sherpaOnnxAudioTaggingCreateOfflineStream;\n  static SherpaOnnxAudioTaggingCompute? sherpaOnnxAudioTaggingCompute;\n  static SherpaOnnxAudioTaggingFreeResults? sherpaOnnxAudioTaggingFreeResults;\n\n  static CreateKeywordSpotter? createKeywordSpotter;\n  static DestroyKeywordSpotter? destroyKeywordSpotter;\n  static CreateKeywordStream? createKeywordStream;\n  static CreateKeywordStreamWithKeywords? createKeywordStreamWithKeywords;\n  static IsKeywordStreamReady? isKeywordStreamReady;\n  static DecodeKeywordStream? decodeKeywordStream;\n  static ResetKeywordStream? resetKeywordStream;\n  static GetKeywordResultAsJson? getKeywordResultAsJson;\n  static FreeKeywordResultJson? freeKeywordResultJson;\n\n  static SherpaOnnxCreateOfflineTts? createOfflineTts;\n  static SherpaOnnxDestroyOfflineTts? destroyOfflineTts;\n  static SherpaOnnxOfflineTtsSampleRate? offlineTtsSampleRate;\n  static SherpaOnnxOfflineTtsNumSpeakers? offlineTtsNumSpeakers;\n  static SherpaOnnxOfflineTtsGenerate? 
offlineTtsGenerate;\n  static SherpaOnnxDestroyOfflineTtsGeneratedAudio?\n  destroyOfflineTtsGeneratedAudio;\n  static SherpaOnnxOfflineTtsGenerateWithCallback?\n  offlineTtsGenerateWithCallback;\n\n  static SherpaOnnxOfflineTtsGenerateWithConfig? offlineTtsGenerateWithConfig;\n\n  static CreateOfflineRecognizer? createOfflineRecognizer;\n  static DestroyOfflineRecognizer? destroyOfflineRecognizer;\n  static OfflineRecognizerSetConfig? offlineRecognizerSetConfig;\n  static CreateOfflineStream? createOfflineStream;\n  static DestroyOfflineStream? destroyOfflineStream;\n  static AcceptWaveformOffline? acceptWaveformOffline;\n  static DecodeOfflineStream? decodeOfflineStream;\n  static GetOfflineStreamResultAsJson? getOfflineStreamResultAsJson;\n  static DestroyOfflineStreamResultJson? destroyOfflineStreamResultJson;\n\n  static SherpaOnnxCreateOnlineRecognizer? createOnlineRecognizer;\n\n  static SherpaOnnxDestroyOnlineRecognizer? destroyOnlineRecognizer;\n\n  static SherpaOnnxCreateOnlineStream? createOnlineStream;\n\n  static SherpaOnnxCreateOnlineStreamWithHotwords?\n  createOnlineStreamWithHotwords;\n\n  static IsOnlineStreamReady? isOnlineStreamReady;\n\n  static SherpaOnnxDecodeOnlineStream? decodeOnlineStream;\n\n  static GetOnlineStreamResultAsJson? getOnlineStreamResultAsJson;\n\n  static Reset? reset;\n\n  static IsEndpoint? isEndpoint;\n\n  static DestroyOnlineStreamResultJson? destroyOnlineStreamResultJson;\n\n  static SherpaOnnxCreateVoiceActivityDetector? createVoiceActivityDetector;\n\n  static SherpaOnnxDestroyVoiceActivityDetector? destroyVoiceActivityDetector;\n\n  static SherpaOnnxVoiceActivityDetectorAcceptWaveform?\n  voiceActivityDetectorAcceptWaveform;\n\n  static SherpaOnnxVoiceActivityDetectorEmpty? voiceActivityDetectorEmpty;\n\n  static SherpaOnnxVoiceActivityDetectorDetected? voiceActivityDetectorDetected;\n\n  static SherpaOnnxVoiceActivityDetectorPop? voiceActivityDetectorPop;\n\n  static SherpaOnnxVoiceActivityDetectorClear? 
voiceActivityDetectorClear;\n\n  static SherpaOnnxVoiceActivityDetectorFront? voiceActivityDetectorFront;\n\n  static SherpaOnnxDestroySpeechSegment? destroySpeechSegment;\n\n  static SherpaOnnxVoiceActivityDetectorReset? voiceActivityDetectorReset;\n\n  static SherpaOnnxVoiceActivityDetectorFlush? voiceActivityDetectorFlush;\n\n  static SherpaOnnxCreateCircularBuffer? createCircularBuffer;\n\n  static SherpaOnnxDestroyCircularBuffer? destroyCircularBuffer;\n\n  static SherpaOnnxCircularBufferPush? circularBufferPush;\n\n  static SherpaOnnxCircularBufferGet? circularBufferGet;\n\n  static SherpaOnnxCircularBufferFree? circularBufferFree;\n\n  static SherpaOnnxCircularBufferPop? circularBufferPop;\n\n  static SherpaOnnxCircularBufferSize? circularBufferSize;\n\n  static SherpaOnnxCircularBufferHead? circularBufferHead;\n\n  static SherpaOnnxCircularBufferReset? circularBufferReset;\n\n  static SherpaOnnxCreateSpeakerEmbeddingExtractor?\n  createSpeakerEmbeddingExtractor;\n\n  static SherpaOnnxDestroySpeakerEmbeddingExtractor?\n  destroySpeakerEmbeddingExtractor;\n\n  static SherpaOnnxSpeakerEmbeddingExtractorDim? speakerEmbeddingExtractorDim;\n\n  static SherpaOnnxSpeakerEmbeddingExtractorCreateStream?\n  speakerEmbeddingExtractorCreateStream;\n\n  static SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding?\n  speakerEmbeddingExtractorComputeEmbedding;\n\n  static SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding?\n  speakerEmbeddingExtractorDestroyEmbedding;\n\n  static SherpaOnnxDestroyOnlineStream? destroyOnlineStream;\n\n  static OnlineStreamAcceptWaveform? onlineStreamAcceptWaveform;\n\n  static OnlineStreamInputFinished? onlineStreamInputFinished;\n\n  static SherpaOnnxSpeakerEmbeddingExtractorIsReady?\n  speakerEmbeddingExtractorIsReady;\n\n  static SherpaOnnxCreateSpeakerEmbeddingManager? createSpeakerEmbeddingManager;\n\n  static SherpaOnnxDestroySpeakerEmbeddingManager?\n  destroySpeakerEmbeddingManager;\n\n  static SherpaOnnxSpeakerEmbeddingManagerAdd? 
speakerEmbeddingManagerAdd;\n\n  static SherpaOnnxSpeakerEmbeddingManagerAddListFlattened?\n  speakerEmbeddingManagerAddListFlattened;\n\n  static SherpaOnnxSpeakerEmbeddingManagerRemove? speakerEmbeddingManagerRemove;\n\n  static SherpaOnnxSpeakerEmbeddingManagerContains?\n  speakerEmbeddingManagerContains;\n\n  static SherpaOnnxSpeakerEmbeddingManagerSearch? speakerEmbeddingManagerSearch;\n\n  static SherpaOnnxSpeakerEmbeddingManagerFreeSearch?\n  speakerEmbeddingManagerFreeSearch;\n\n  static SherpaOnnxSpeakerEmbeddingManagerNumSpeakers?\n  speakerEmbeddingManagerNumSpeakers;\n\n  static SherpaOnnxSpeakerEmbeddingManagerVerify? speakerEmbeddingManagerVerify;\n\n  static SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers?\n  speakerEmbeddingManagerGetAllSpeakers;\n\n  static SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers?\n  speakerEmbeddingManagerFreeAllSpeakers;\n\n  static SherpaOnnxReadWave? readWave;\n\n  static SherpaOnnxWriteWave? writeWave;\n\n  static SherpaOnnxFreeWave? freeWave;\n\n  static SherpaOnnxGetVersionStr? getVersionStr;\n  static SherpaOnnxGetGitSha1? getGitSha1;\n  static SherpaOnnxGetGitDate? 
getGitDate;\n\n  static void init(DynamicLibrary dynamicLibrary) {\n    sherpaOnnxCreateOfflineSpeechDenoiser ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOfflineSpeechDenoiserNative>>(\n          'SherpaOnnxCreateOfflineSpeechDenoiser',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyOfflineSpeechDenoiser ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOfflineSpeechDenoiserNative>>(\n          'SherpaOnnxDestroyOfflineSpeechDenoiser',\n        )\n        .asFunction();\n\n    sherpaOnnxOfflineSpeechDenoiserGetSampleRate ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOfflineSpeechDenoiserGetSampleRateNative>\n        >('SherpaOnnxOfflineSpeechDenoiserGetSampleRate')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeechDenoiserRun ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineSpeechDenoiserRunNative>>(\n          'SherpaOnnxOfflineSpeechDenoiserRun',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyDenoisedAudio ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyDenoisedAudioNative>>(\n          'SherpaOnnxDestroyDenoisedAudio',\n        )\n        .asFunction();\n\n    sherpaOnnxCreateOnlineSpeechDenoiser ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOnlineSpeechDenoiserNative>>(\n          'SherpaOnnxCreateOnlineSpeechDenoiser',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyOnlineSpeechDenoiser ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOnlineSpeechDenoiserNative>>(\n          'SherpaOnnxDestroyOnlineSpeechDenoiser',\n        )\n        .asFunction();\n\n    sherpaOnnxOnlineSpeechDenoiserGetSampleRate ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOnlineSpeechDenoiserGetSampleRateNative>\n        >('SherpaOnnxOnlineSpeechDenoiserGetSampleRate')\n        .asFunction();\n\n    sherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples ??= dynamicLibrary\n        
.lookup<\n          NativeFunction<\n            SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamplesNative\n          >\n        >('SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples')\n        .asFunction();\n\n    sherpaOnnxOnlineSpeechDenoiserRun ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOnlineSpeechDenoiserRunNative>>(\n          'SherpaOnnxOnlineSpeechDenoiserRun',\n        )\n        .asFunction();\n\n    sherpaOnnxOnlineSpeechDenoiserFlush ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOnlineSpeechDenoiserFlushNative>>(\n          'SherpaOnnxOnlineSpeechDenoiserFlush',\n        )\n        .asFunction();\n\n    sherpaOnnxOnlineSpeechDenoiserReset ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOnlineSpeechDenoiserResetNative>>(\n          'SherpaOnnxOnlineSpeechDenoiserReset',\n        )\n        .asFunction();\n\n    sherpaOnnxCreateSpokenLanguageIdentification ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxCreateSpokenLanguageIdentificationNative>\n        >('SherpaOnnxCreateSpokenLanguageIdentification')\n        .asFunction();\n\n    sherpaOnnxDestroySpokenLanguageIdentification ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxDestroySpokenLanguageIdentificationNative>\n        >('SherpaOnnxDestroySpokenLanguageIdentification')\n        .asFunction();\n\n    sherpaOnnxSpokenLanguageIdentificationCreateOfflineStream ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxSpokenLanguageIdentificationCreateOfflineStreamNative\n          >\n        >('SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream')\n        .asFunction();\n\n    sherpaOnnxSpokenLanguageIdentificationCompute ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpokenLanguageIdentificationComputeNative>\n        >('SherpaOnnxSpokenLanguageIdentificationCompute')\n        .asFunction();\n\n    
sherpaOnnxDestroySpokenLanguageIdentificationResult ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxDestroySpokenLanguageIdentificationResultNative\n          >\n        >('SherpaOnnxDestroySpokenLanguageIdentificationResult')\n        .asFunction();\n\n    sherpaOnnxCreateOfflineSpeakerDiarization ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxCreateOfflineSpeakerDiarizationNative>\n        >('SherpaOnnxCreateOfflineSpeakerDiarization')\n        .asFunction();\n\n    sherpaOnnxDestroyOfflineSpeakerDiarization ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxDestroyOfflineSpeakerDiarizationNative>\n        >('SherpaOnnxDestroyOfflineSpeakerDiarization')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationGetSampleRate ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOfflineSpeakerDiarizationGetSampleRateNative>\n        >('SherpaOnnxOfflineSpeakerDiarizationGetSampleRate')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationSetConfig ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOfflineSpeakerDiarizationSetConfigNative>\n        >('SherpaOnnxOfflineSpeakerDiarizationSetConfig')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakersNative\n          >\n        >('SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegmentsNative\n          >\n        >('SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime ??= dynamicLibrary\n        
.lookup<\n          NativeFunction<\n            SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTimeNative\n          >\n        >('SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationDestroySegment ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxOfflineSpeakerDiarizationDestroySegmentNative\n          >\n        >('SherpaOnnxOfflineSpeakerDiarizationDestroySegment')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationProcess ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOfflineSpeakerDiarizationProcessNative>\n        >('SherpaOnnxOfflineSpeakerDiarizationProcess')\n        .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg ??=\n        dynamicLibrary\n            .lookup<\n              NativeFunction<\n                SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArgNative\n              >\n            >('SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg')\n            .asFunction();\n\n    sherpaOnnxOfflineSpeakerDiarizationDestroyResult ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxOfflineSpeakerDiarizationDestroyResultNative>\n        >('SherpaOnnxOfflineSpeakerDiarizationDestroyResult')\n        .asFunction();\n\n    sherpaOnnxCreateOfflinePunctuation ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOfflinePunctuationNative>>(\n          'SherpaOnnxCreateOfflinePunctuation',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyOfflinePunctuation ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOfflinePunctuationNative>>(\n          'SherpaOnnxDestroyOfflinePunctuation',\n        )\n        .asFunction();\n\n    sherpaOfflinePunctuationAddPunct ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOfflinePunctuationAddPunctNative>>(\n          'SherpaOfflinePunctuationAddPunct',\n   
     )\n        .asFunction();\n\n    sherpaOfflinePunctuationFreeText ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOfflinePunctuationFreeTextNative>>(\n          'SherpaOfflinePunctuationFreeText',\n        )\n        .asFunction();\n\n    sherpaOnnxCreateOnlinePunctuation ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOnlinePunctuationNative>>(\n          'SherpaOnnxCreateOnlinePunctuation',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyOnlinePunctuation ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOnlinePunctuationNative>>(\n          'SherpaOnnxDestroyOnlinePunctuation',\n        )\n        .asFunction();\n\n    sherpaOnnxOnlinePunctuationAddPunct ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOnlinePunctuationAddPunctNative>>(\n          'SherpaOnnxOnlinePunctuationAddPunct',\n        )\n        .asFunction();\n\n    sherpaOnnxOnlinePunctuationFreeText ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOnlinePunctuationFreeTextNative>>(\n          'SherpaOnnxOnlinePunctuationFreeText',\n        )\n        .asFunction();\n\n    sherpaOnnxCreateAudioTagging ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateAudioTaggingNative>>(\n          'SherpaOnnxCreateAudioTagging',\n        )\n        .asFunction();\n\n    sherpaOnnxDestroyAudioTagging ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyAudioTaggingNative>>(\n          'SherpaOnnxDestroyAudioTagging',\n        )\n        .asFunction();\n\n    sherpaOnnxAudioTaggingCreateOfflineStream ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxAudioTaggingCreateOfflineStreamNative>\n        >('SherpaOnnxAudioTaggingCreateOfflineStream')\n        .asFunction();\n\n    sherpaOnnxAudioTaggingCompute ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxAudioTaggingComputeNative>>(\n          'SherpaOnnxAudioTaggingCompute',\n        )\n        .asFunction();\n\n    
sherpaOnnxAudioTaggingFreeResults ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxAudioTaggingFreeResultsNative>>(\n          'SherpaOnnxAudioTaggingFreeResults',\n        )\n        .asFunction();\n\n    createKeywordSpotter ??= dynamicLibrary\n        .lookup<NativeFunction<CreateKeywordSpotterNative>>(\n          'SherpaOnnxCreateKeywordSpotter',\n        )\n        .asFunction();\n\n    destroyKeywordSpotter ??= dynamicLibrary\n        .lookup<NativeFunction<DestroyKeywordSpotterNative>>(\n          'SherpaOnnxDestroyKeywordSpotter',\n        )\n        .asFunction();\n\n    createKeywordStream ??= dynamicLibrary\n        .lookup<NativeFunction<CreateKeywordStreamNative>>(\n          'SherpaOnnxCreateKeywordStream',\n        )\n        .asFunction();\n\n    createKeywordStreamWithKeywords ??= dynamicLibrary\n        .lookup<NativeFunction<CreateKeywordStreamWithKeywordsNative>>(\n          'SherpaOnnxCreateKeywordStreamWithKeywords',\n        )\n        .asFunction();\n\n    isKeywordStreamReady ??= dynamicLibrary\n        .lookup<NativeFunction<IsKeywordStreamReadyNative>>(\n          'SherpaOnnxIsKeywordStreamReady',\n        )\n        .asFunction();\n\n    decodeKeywordStream ??= dynamicLibrary\n        .lookup<NativeFunction<DecodeKeywordStreamNative>>(\n          'SherpaOnnxDecodeKeywordStream',\n        )\n        .asFunction();\n\n    resetKeywordStream ??= dynamicLibrary\n        .lookup<NativeFunction<ResetKeywordStreamNative>>(\n          'SherpaOnnxResetKeywordStream',\n        )\n        .asFunction();\n\n    getKeywordResultAsJson ??= dynamicLibrary\n        .lookup<NativeFunction<GetKeywordResultAsJsonNative>>(\n          'SherpaOnnxGetKeywordResultAsJson',\n        )\n        .asFunction();\n\n    freeKeywordResultJson ??= dynamicLibrary\n        .lookup<NativeFunction<FreeKeywordResultJsonNative>>(\n          'SherpaOnnxFreeKeywordResultJson',\n        )\n        .asFunction();\n\n    createOfflineTts ??= dynamicLibrary\n        
.lookup<NativeFunction<SherpaOnnxCreateOfflineTtsNative>>(\n          'SherpaOnnxCreateOfflineTts',\n        )\n        .asFunction();\n\n    destroyOfflineTts ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOfflineTtsNative>>(\n          'SherpaOnnxDestroyOfflineTts',\n        )\n        .asFunction();\n\n    offlineTtsSampleRate ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineTtsSampleRateNative>>(\n          'SherpaOnnxOfflineTtsSampleRate',\n        )\n        .asFunction();\n\n    offlineTtsNumSpeakers ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineTtsNumSpeakersNative>>(\n          'SherpaOnnxOfflineTtsNumSpeakers',\n        )\n        .asFunction();\n\n    offlineTtsGenerate ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineTtsGenerateNative>>(\n          'SherpaOnnxOfflineTtsGenerate',\n        )\n        .asFunction();\n\n    destroyOfflineTtsGeneratedAudio ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxDestroyOfflineTtsGeneratedAudioNative>\n        >('SherpaOnnxDestroyOfflineTtsGeneratedAudio')\n        .asFunction();\n\n    offlineTtsGenerateWithCallback ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineTtsGenerateWithCallbackNative>>(\n          'SherpaOnnxOfflineTtsGenerateWithCallback',\n        )\n        .asFunction();\n\n    offlineTtsGenerateWithConfig ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxOfflineTtsGenerateWithConfigNative>>(\n          'SherpaOnnxOfflineTtsGenerateWithConfig',\n        )\n        .asFunction();\n\n    createOfflineRecognizer ??= dynamicLibrary\n        .lookup<NativeFunction<CreateOfflineRecognizerNative>>(\n          'SherpaOnnxCreateOfflineRecognizer',\n        )\n        .asFunction();\n\n    destroyOfflineRecognizer ??= dynamicLibrary\n        .lookup<NativeFunction<DestroyOfflineRecognizerNative>>(\n          'SherpaOnnxDestroyOfflineRecognizer',\n        )\n        
.asFunction();\n\n    offlineRecognizerSetConfig ??= dynamicLibrary\n        .lookup<NativeFunction<OfflineRecognizerSetConfigNative>>(\n          'SherpaOnnxOfflineRecognizerSetConfig',\n        )\n        .asFunction();\n\n    createOfflineStream ??= dynamicLibrary\n        .lookup<NativeFunction<CreateOfflineStreamNative>>(\n          'SherpaOnnxCreateOfflineStream',\n        )\n        .asFunction();\n\n    destroyOfflineStream ??= dynamicLibrary\n        .lookup<NativeFunction<DestroyOfflineStreamNative>>(\n          'SherpaOnnxDestroyOfflineStream',\n        )\n        .asFunction();\n\n    acceptWaveformOffline ??= dynamicLibrary\n        .lookup<NativeFunction<AcceptWaveformOfflineNative>>(\n          'SherpaOnnxAcceptWaveformOffline',\n        )\n        .asFunction();\n\n    decodeOfflineStream ??= dynamicLibrary\n        .lookup<NativeFunction<DecodeOfflineStreamNative>>(\n          'SherpaOnnxDecodeOfflineStream',\n        )\n        .asFunction();\n\n    getOfflineStreamResultAsJson ??= dynamicLibrary\n        .lookup<NativeFunction<GetOfflineStreamResultAsJsonNative>>(\n          'SherpaOnnxGetOfflineStreamResultAsJson',\n        )\n        .asFunction();\n\n    destroyOfflineStreamResultJson ??= dynamicLibrary\n        .lookup<NativeFunction<DestroyOfflineStreamResultJsonNative>>(\n          'SherpaOnnxDestroyOfflineStreamResultJson',\n        )\n        .asFunction();\n\n    createOnlineRecognizer ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOnlineRecognizerNative>>(\n          'SherpaOnnxCreateOnlineRecognizer',\n        )\n        .asFunction();\n\n    destroyOnlineRecognizer ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOnlineRecognizerNative>>(\n          'SherpaOnnxDestroyOnlineRecognizer',\n        )\n        .asFunction();\n\n    createOnlineStream ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOnlineStreamNative>>(\n          'SherpaOnnxCreateOnlineStream',\n        )\n        
.asFunction();\n\n    createOnlineStreamWithHotwords ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateOnlineStreamWithHotwordsNative>>(\n          'SherpaOnnxCreateOnlineStreamWithHotwords',\n        )\n        .asFunction();\n\n    isOnlineStreamReady ??= dynamicLibrary\n        .lookup<NativeFunction<IsOnlineStreamReadyNative>>(\n          'SherpaOnnxIsOnlineStreamReady',\n        )\n        .asFunction();\n\n    decodeOnlineStream ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDecodeOnlineStreamNative>>(\n          'SherpaOnnxDecodeOnlineStream',\n        )\n        .asFunction();\n\n    getOnlineStreamResultAsJson ??= dynamicLibrary\n        .lookup<NativeFunction<GetOnlineStreamResultAsJsonNative>>(\n          'SherpaOnnxGetOnlineStreamResultAsJson',\n        )\n        .asFunction();\n\n    reset ??= dynamicLibrary\n        .lookup<NativeFunction<ResetNative>>('SherpaOnnxOnlineStreamReset')\n        .asFunction();\n\n    isEndpoint ??= dynamicLibrary\n        .lookup<NativeFunction<IsEndpointNative>>(\n          'SherpaOnnxOnlineStreamIsEndpoint',\n        )\n        .asFunction();\n\n    destroyOnlineStreamResultJson ??= dynamicLibrary\n        .lookup<NativeFunction<DestroyOnlineStreamResultJsonNative>>(\n          'SherpaOnnxDestroyOnlineStreamResultJson',\n        )\n        .asFunction();\n\n    createVoiceActivityDetector ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateVoiceActivityDetectorNative>>(\n          'SherpaOnnxCreateVoiceActivityDetector',\n        )\n        .asFunction();\n\n    destroyVoiceActivityDetector ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyVoiceActivityDetectorNative>>(\n          'SherpaOnnxDestroyVoiceActivityDetector',\n        )\n        .asFunction();\n\n    voiceActivityDetectorAcceptWaveform ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxVoiceActivityDetectorAcceptWaveformNative>\n        
>('SherpaOnnxVoiceActivityDetectorAcceptWaveform')\n        .asFunction();\n\n    voiceActivityDetectorEmpty ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorEmptyNative>>(\n          'SherpaOnnxVoiceActivityDetectorEmpty',\n        )\n        .asFunction();\n\n    voiceActivityDetectorDetected ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorDetectedNative>>(\n          'SherpaOnnxVoiceActivityDetectorDetected',\n        )\n        .asFunction();\n\n    voiceActivityDetectorPop ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorPopNative>>(\n          'SherpaOnnxVoiceActivityDetectorPop',\n        )\n        .asFunction();\n\n    voiceActivityDetectorClear ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorClearNative>>(\n          'SherpaOnnxVoiceActivityDetectorClear',\n        )\n        .asFunction();\n\n    voiceActivityDetectorFront ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorFrontNative>>(\n          'SherpaOnnxVoiceActivityDetectorFront',\n        )\n        .asFunction();\n\n    destroySpeechSegment ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroySpeechSegmentNative>>(\n          'SherpaOnnxDestroySpeechSegment',\n        )\n        .asFunction();\n\n    voiceActivityDetectorReset ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorResetNative>>(\n          'SherpaOnnxVoiceActivityDetectorReset',\n        )\n        .asFunction();\n\n    voiceActivityDetectorFlush ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxVoiceActivityDetectorFlushNative>>(\n          'SherpaOnnxVoiceActivityDetectorFlush',\n        )\n        .asFunction();\n\n    createCircularBuffer ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCreateCircularBufferNative>>(\n          'SherpaOnnxCreateCircularBuffer',\n        )\n        .asFunction();\n\n  
  destroyCircularBuffer ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyCircularBufferNative>>(\n          'SherpaOnnxDestroyCircularBuffer',\n        )\n        .asFunction();\n\n    circularBufferPush ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferPushNative>>(\n          'SherpaOnnxCircularBufferPush',\n        )\n        .asFunction();\n\n    circularBufferGet ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferGetNative>>(\n          'SherpaOnnxCircularBufferGet',\n        )\n        .asFunction();\n\n    circularBufferFree ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferFreeNative>>(\n          'SherpaOnnxCircularBufferFree',\n        )\n        .asFunction();\n\n    circularBufferPop ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferPopNative>>(\n          'SherpaOnnxCircularBufferPop',\n        )\n        .asFunction();\n\n    circularBufferSize ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferSizeNative>>(\n          'SherpaOnnxCircularBufferSize',\n        )\n        .asFunction();\n\n    circularBufferHead ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferHeadNative>>(\n          'SherpaOnnxCircularBufferHead',\n        )\n        .asFunction();\n\n    circularBufferReset ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxCircularBufferResetNative>>(\n          'SherpaOnnxCircularBufferReset',\n        )\n        .asFunction();\n\n    createSpeakerEmbeddingExtractor ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxCreateSpeakerEmbeddingExtractorNative>\n        >('SherpaOnnxCreateSpeakerEmbeddingExtractor')\n        .asFunction();\n\n    destroySpeakerEmbeddingExtractor ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxDestroySpeakerEmbeddingExtractorNative>\n        >('SherpaOnnxDestroySpeakerEmbeddingExtractor')\n        
.asFunction();\n\n    speakerEmbeddingExtractorDim ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxSpeakerEmbeddingExtractorDimNative>>(\n          'SherpaOnnxSpeakerEmbeddingExtractorDim',\n        )\n        .asFunction();\n\n    speakerEmbeddingExtractorCreateStream ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingExtractorCreateStreamNative>\n        >('SherpaOnnxSpeakerEmbeddingExtractorCreateStream')\n        .asFunction();\n\n    speakerEmbeddingExtractorComputeEmbedding ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxSpeakerEmbeddingExtractorComputeEmbeddingNative\n          >\n        >('SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding')\n        .asFunction();\n\n    speakerEmbeddingExtractorDestroyEmbedding ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbeddingNative\n          >\n        >('SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding')\n        .asFunction();\n\n    destroyOnlineStream ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroyOnlineStreamNative>>(\n          'SherpaOnnxDestroyOnlineStream',\n        )\n        .asFunction();\n\n    onlineStreamAcceptWaveform ??= dynamicLibrary\n        .lookup<NativeFunction<OnlineStreamAcceptWaveformNative>>(\n          'SherpaOnnxOnlineStreamAcceptWaveform',\n        )\n        .asFunction();\n\n    onlineStreamInputFinished ??= dynamicLibrary\n        .lookup<NativeFunction<OnlineStreamInputFinishedNative>>(\n          'SherpaOnnxOnlineStreamInputFinished',\n        )\n        .asFunction();\n\n    speakerEmbeddingExtractorIsReady ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingExtractorIsReadyNative>\n        >('SherpaOnnxSpeakerEmbeddingExtractorIsReady')\n        .asFunction();\n\n    createSpeakerEmbeddingManager ??= dynamicLibrary\n        
.lookup<NativeFunction<SherpaOnnxCreateSpeakerEmbeddingManagerNative>>(\n          'SherpaOnnxCreateSpeakerEmbeddingManager',\n        )\n        .asFunction();\n\n    destroySpeakerEmbeddingManager ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxDestroySpeakerEmbeddingManagerNative>>(\n          'SherpaOnnxDestroySpeakerEmbeddingManager',\n        )\n        .asFunction();\n\n    speakerEmbeddingManagerAdd ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxSpeakerEmbeddingManagerAddNative>>(\n          'SherpaOnnxSpeakerEmbeddingManagerAdd',\n        )\n        .asFunction();\n\n    speakerEmbeddingManagerAddListFlattened ??= dynamicLibrary\n        .lookup<\n          NativeFunction<\n            SherpaOnnxSpeakerEmbeddingManagerAddListFlattenedNative\n          >\n        >('SherpaOnnxSpeakerEmbeddingManagerAddListFlattened')\n        .asFunction();\n\n    speakerEmbeddingManagerRemove ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxSpeakerEmbeddingManagerRemoveNative>>(\n          'SherpaOnnxSpeakerEmbeddingManagerRemove',\n        )\n        .asFunction();\n\n    speakerEmbeddingManagerContains ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingManagerContainsNative>\n        >('SherpaOnnxSpeakerEmbeddingManagerContains')\n        .asFunction();\n\n    speakerEmbeddingManagerSearch ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxSpeakerEmbeddingManagerSearchNative>>(\n          'SherpaOnnxSpeakerEmbeddingManagerSearch',\n        )\n        .asFunction();\n\n    speakerEmbeddingManagerFreeSearch ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingManagerFreeSearchNative>\n        >('SherpaOnnxSpeakerEmbeddingManagerFreeSearch')\n        .asFunction();\n\n    speakerEmbeddingManagerNumSpeakers ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingManagerNumSpeakersNative>\n        
>('SherpaOnnxSpeakerEmbeddingManagerNumSpeakers')\n        .asFunction();\n\n    speakerEmbeddingManagerVerify ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxSpeakerEmbeddingManagerVerifyNative>>(\n          'SherpaOnnxSpeakerEmbeddingManagerVerify',\n        )\n        .asFunction();\n\n    speakerEmbeddingManagerGetAllSpeakers ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakersNative>\n        >('SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers')\n        .asFunction();\n\n    speakerEmbeddingManagerFreeAllSpeakers ??= dynamicLibrary\n        .lookup<\n          NativeFunction<SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakersNative>\n        >('SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers')\n        .asFunction();\n\n    readWave ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxReadWaveNative>>('SherpaOnnxReadWave')\n        .asFunction();\n\n    writeWave ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxWriteWaveNative>>(\n          'SherpaOnnxWriteWave',\n        )\n        .asFunction();\n\n    freeWave ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxFreeWaveNative>>('SherpaOnnxFreeWave')\n        .asFunction();\n\n    getVersionStr ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxGetVersionStrNative>>(\n          'SherpaOnnxGetVersionStr',\n        )\n        .asFunction();\n\n    getGitSha1 ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxGetGitSha1Native>>(\n          'SherpaOnnxGetGitSha1',\n        )\n        .asFunction();\n\n    getGitDate ??= dynamicLibrary\n        .lookup<NativeFunction<SherpaOnnxGetGitDateNative>>(\n          'SherpaOnnxGetGitDate',\n        )\n        .asFunction();\n  }\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/speaker_identification.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './online_stream.dart';\nimport './sherpa_onnx_bindings.dart';\n\n/// Speaker embedding extraction and speaker identification utilities.\n///\n/// See `dart-api-examples/speaker-identification/` for end-to-end examples.\n///\n/// Example:\n///\n/// ```dart\n/// final extractor = SpeakerEmbeddingExtractor(\n///   config: const SpeakerEmbeddingExtractorConfig(\n///     model: './3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx',\n///   ),\n/// );\n///\n/// final stream = extractor.createStream();\n/// stream.acceptWaveform(samples: wave.samples, sampleRate: wave.sampleRate);\n/// stream.inputFinished();\n/// if (!extractor.isReady(stream)) {\n///   throw Exception('Not enough audio to compute an embedding');\n/// }\n/// final embedding = extractor.compute(stream);\n///\n/// final manager = SpeakerEmbeddingManager(extractor.dim);\n/// manager.add(name: 'alice', embedding: embedding);\n/// print(manager.search(embedding: embedding, threshold: 0.6));\n/// ```\nclass SpeakerEmbeddingExtractorConfig {\n  const SpeakerEmbeddingExtractorConfig(\n      {required this.model,\n      this.numThreads = 1,\n      this.debug = true,\n      this.provider = 'cpu'});\n\n  factory SpeakerEmbeddingExtractorConfig.fromJson(Map<String, dynamic> json) {\n    return SpeakerEmbeddingExtractorConfig(\n      model: json['model'] as String,\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? true,\n      provider: json['provider'] as String? ?? 
'cpu',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'SpeakerEmbeddingExtractorConfig(model: $model, numThreads: $numThreads, debug: $debug, provider: $provider)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n        'numThreads': numThreads,\n        'debug': debug,\n        'provider': provider,\n      };\n\n  final String model;\n  final int numThreads;\n  final bool debug;\n  final String provider;\n}\n\n/// Speaker embedding extractor.\n///\n/// Feed audio through an [OnlineStream], then call [compute] to obtain a\n/// fixed-dimensional embedding suitable for search or verification.\nclass SpeakerEmbeddingExtractor {\n  SpeakerEmbeddingExtractor.fromPtr({required this.ptr, required this.dim});\n\n  SpeakerEmbeddingExtractor._({required this.ptr, required this.dim});\n\n  /// Create an extractor from [config].\n  factory SpeakerEmbeddingExtractor(\n      {required SpeakerEmbeddingExtractorConfig config}) {\n    // Check the binding before allocating native memory so that nothing leaks\n    // when the library has not been initialized.\n    if (SherpaOnnxBindings.createSpeakerEmbeddingExtractor == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxSpeakerEmbeddingExtractorConfig>();\n\n    final modelPtr = config.model.toNativeUtf8();\n    c.ref.model = modelPtr;\n\n    c.ref.numThreads = config.numThreads;\n    c.ref.debug = config.debug ? 1 : 0;\n\n    final providerPtr = config.provider.toNativeUtf8();\n    c.ref.provider = providerPtr;\n\n    final ptr =\n        SherpaOnnxBindings.createSpeakerEmbeddingExtractor?.call(c) ?? nullptr;\n\n    calloc.free(providerPtr);\n    calloc.free(modelPtr);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\n          \"Failed to create speaker embedding extractor. Please check your config\");\n    }\n\n    final dim = SherpaOnnxBindings.speakerEmbeddingExtractorDim?.call(ptr) ?? 
0;\n\n    return SpeakerEmbeddingExtractor._(ptr: ptr, dim: dim);\n  }\n\n  /// Release the native extractor.\n  void free() {\n    if (SherpaOnnxBindings.destroySpeakerEmbeddingExtractor == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroySpeakerEmbeddingExtractor?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create an input stream for embedding extraction.\n  OnlineStream createStream() {\n    if (SherpaOnnxBindings.speakerEmbeddingExtractorCreateStream == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    final p =\n        SherpaOnnxBindings.speakerEmbeddingExtractorCreateStream?.call(ptr) ??\n            nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create online stream\");\n    }\n\n    return OnlineStream(ptr: p);\n  }\n\n  /// Return `true` if [stream] has enough audio for embedding extraction.\n  bool isReady(OnlineStream stream) {\n    if (SherpaOnnxBindings.speakerEmbeddingExtractorIsReady == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return false;\n    }\n\n    final int ready = SherpaOnnxBindings.speakerEmbeddingExtractorIsReady\n            ?.call(ptr, stream.ptr) ??\n        0;\n    return ready == 1;\n  }\n\n  /// Compute an embedding for [stream].\n  Float32List compute(OnlineStream stream) {\n    if (SherpaOnnxBindings.speakerEmbeddingExtractorComputeEmbedding == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return Float32List(0);\n    }\n\n    final Pointer<Float> embedding = SherpaOnnxBindings\n            .speakerEmbeddingExtractorComputeEmbedding\n            ?.call(ptr, stream.ptr) ??\n   
     nullptr;\n\n    if (embedding == nullptr) {\n      return Float32List(0);\n    }\n\n    final embeddingList = embedding.asTypedList(dim);\n    final ans = Float32List(dim);\n    ans.setAll(0, embeddingList);\n\n    SherpaOnnxBindings.speakerEmbeddingExtractorDestroyEmbedding\n        ?.call(embedding);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxSpeakerEmbeddingExtractor> ptr;\n  final int dim;\n}\n\n/// In-memory store of named speaker embeddings.\n///\n/// Use this class to add reference embeddings, search for the best matching\n/// speaker, and verify whether a candidate embedding belongs to a known\n/// identity.\nclass SpeakerEmbeddingManager {\n  SpeakerEmbeddingManager.fromPtr({required this.ptr, required this.dim});\n\n  SpeakerEmbeddingManager._({required this.ptr, required this.dim});\n\n  /// Create a manager for embeddings whose dimension is [dim].\n  factory SpeakerEmbeddingManager(int dim) {\n    if (SherpaOnnxBindings.createSpeakerEmbeddingManager == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final p =\n        SherpaOnnxBindings.createSpeakerEmbeddingManager?.call(dim) ?? 
nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create speaker embedding manager\");\n    }\n\n    return SpeakerEmbeddingManager._(ptr: p, dim: dim);\n  }\n\n  /// Release the native manager.\n  void free() {\n    if (SherpaOnnxBindings.destroySpeakerEmbeddingManager == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroySpeakerEmbeddingManager?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Add one reference embedding for [name].\n  bool add({required String name, required Float32List embedding}) {\n    assert(embedding.length == dim, '${embedding.length} vs $dim');\n\n    if (SherpaOnnxBindings.speakerEmbeddingManagerAdd == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final Pointer<Utf8> namePtr = name.toNativeUtf8();\n    final int n = embedding.length;\n\n    final Pointer<Float> p = calloc<Float>(n);\n    final pList = p.asTypedList(n);\n    pList.setAll(0, embedding);\n\n    final int ok =\n        SherpaOnnxBindings.speakerEmbeddingManagerAdd?.call(ptr, namePtr, p) ??\n            0;\n\n    calloc.free(p);\n    calloc.free(namePtr);\n\n    return ok == 1;\n  }\n\n  /// Add multiple reference embeddings for [name].\n  bool addMulti(\n      {required String name, required List<Float32List> embeddingList}) {\n    if (SherpaOnnxBindings.speakerEmbeddingManagerAddListFlattened == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final Pointer<Utf8> namePtr = name.toNativeUtf8();\n    final int n = embeddingList.length;\n\n    final Pointer<Float> p = calloc<Float>(n * dim);\n    final pList = p.asTypedList(n * dim);\n\n    int offset = 0;\n    for (final e in embeddingList) {\n      assert(e.length == dim, '${e.length} vs $dim');\n\n      
pList.setAll(offset, e);\n      offset += dim;\n    }\n\n    final int ok = SherpaOnnxBindings.speakerEmbeddingManagerAddListFlattened\n            ?.call(ptr, namePtr, p, n) ??\n        0;\n\n    calloc.free(p);\n    calloc.free(namePtr);\n\n    return ok == 1;\n  }\n\n  /// Return `true` if [name] exists in the manager.\n  bool contains(String name) {\n    if (SherpaOnnxBindings.speakerEmbeddingManagerContains == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final Pointer<Utf8> namePtr = name.toNativeUtf8();\n\n    final int found = SherpaOnnxBindings.speakerEmbeddingManagerContains\n            ?.call(ptr, namePtr) ??\n        0;\n\n    calloc.free(namePtr);\n\n    return found == 1;\n  }\n\n  /// Remove all embeddings associated with [name].\n  bool remove(String name) {\n    if (SherpaOnnxBindings.speakerEmbeddingManagerRemove == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final Pointer<Utf8> namePtr = name.toNativeUtf8();\n\n    final int ok =\n        SherpaOnnxBindings.speakerEmbeddingManagerRemove?.call(ptr, namePtr) ??\n            0;\n\n    calloc.free(namePtr);\n\n    return ok == 1;\n  }\n\n  /// Search for the best matching speaker above [threshold].\n  ///\n  /// Returns an empty string if no speaker is found.\n  String search({required Float32List embedding, required double threshold}) {\n    assert(embedding.length == dim);\n\n    if (SherpaOnnxBindings.speakerEmbeddingManagerSearch == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return '';\n    }\n\n    final Pointer<Float> p = calloc<Float>(dim);\n    final pList = p.asTypedList(dim);\n    pList.setAll(0, embedding);\n\n    final Pointer<Utf8> name = SherpaOnnxBindings.speakerEmbeddingManagerSearch\n            ?.call(ptr, p, 
threshold) ??\n        nullptr;\n\n    calloc.free(p);\n\n    if (name == nullptr) {\n      return '';\n    }\n\n    final String ans = name.toDartString();\n\n    SherpaOnnxBindings.speakerEmbeddingManagerFreeSearch?.call(name);\n\n    return ans;\n  }\n\n  /// Verify whether [embedding] matches [name] above [threshold].\n  bool verify(\n      {required String name,\n      required Float32List embedding,\n      required double threshold}) {\n    assert(embedding.length == dim);\n\n    if (SherpaOnnxBindings.speakerEmbeddingManagerVerify == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final Pointer<Utf8> namePtr = name.toNativeUtf8();\n\n    final Pointer<Float> p = calloc<Float>(dim);\n    final pList = p.asTypedList(dim);\n    pList.setAll(0, embedding);\n\n    final int ok = SherpaOnnxBindings.speakerEmbeddingManagerVerify\n            ?.call(ptr, namePtr, p, threshold) ??\n        0;\n\n    calloc.free(p);\n    calloc.free(namePtr);\n\n    return ok == 1;\n  }\n\n  /// Number of speakers currently stored in the manager.\n  int get numSpeakers {\n    if (SherpaOnnxBindings.speakerEmbeddingManagerNumSpeakers == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.speakerEmbeddingManagerNumSpeakers?.call(ptr) ??\n        0;\n  }\n\n  /// Names of all speakers currently stored in the manager.\n  List<String> get allSpeakerNames {\n    if (SherpaOnnxBindings.speakerEmbeddingManagerGetAllSpeakers == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    int n = numSpeakers;\n    if (n == 0) {\n      return <String>[];\n    }\n\n    final Pointer<Pointer<Utf8>> names =\n        SherpaOnnxBindings.speakerEmbeddingManagerGetAllSpeakers?.call(ptr) ??\n            nullptr;\n\n    if (names == nullptr) {\n      return <String>[];\n    }\n\n    final ans = <String>[];\n\n    // see https://api.flutter.dev/flutter/dart-ffi/PointerPointer.html\n    for 
(int i = 0; i != n; ++i) {\n      String name = names[i].toDartString();\n      ans.add(name);\n    }\n\n    SherpaOnnxBindings.speakerEmbeddingManagerFreeAllSpeakers?.call(names);\n\n    return ans;\n  }\n\n  Pointer<SherpaOnnxSpeakerEmbeddingManager> ptr;\n  final int dim;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/spoken_language_identification.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\n\nimport 'package:ffi/ffi.dart';\n\nimport './offline_stream.dart';\nimport './sherpa_onnx_bindings.dart';\nimport './utils.dart';\n\n/// Spoken language identification.\n///\n/// This module identifies the language spoken in an audio clip, using the\n/// Whisper-based language ID model family exposed by the native library.\n///\n/// Example:\n///\n/// ```dart\n/// final sli = SpokenLanguageIdentification(\n///   SpokenLanguageIdentificationConfig(\n///     whisper: const SpokenLanguageIdentificationWhisperConfig(\n///       encoder: './sherpa-onnx-whisper-tiny/encoder.int8.onnx',\n///       decoder: './sherpa-onnx-whisper-tiny/decoder.int8.onnx',\n///     ),\n///   ),\n/// );\n///\n/// final stream = sli.createStream();\n/// stream.acceptWaveform(samples: wave.samples, sampleRate: wave.sampleRate);\n/// print(sli.compute(stream).lang);\n/// ```\nclass SpokenLanguageIdentificationWhisperConfig {\n  const SpokenLanguageIdentificationWhisperConfig({\n    this.encoder = '',\n    this.decoder = '',\n    this.tailPaddings = 0,\n  });\n\n  factory SpokenLanguageIdentificationWhisperConfig.fromJson(\n      Map<String, dynamic> json) {\n    return SpokenLanguageIdentificationWhisperConfig(\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      tailPaddings: json['tailPaddings'] as int? ?? 
0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'SpokenLanguageIdentificationWhisperConfig(encoder: $encoder, decoder: $decoder, tailPaddings: $tailPaddings)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'encoder': encoder,\n        'decoder': decoder,\n        'tailPaddings': tailPaddings,\n      };\n\n  final String encoder;\n  final String decoder;\n  final int tailPaddings;\n}\n\n/// Top-level configuration for [SpokenLanguageIdentification].\nclass SpokenLanguageIdentificationConfig {\n  const SpokenLanguageIdentificationConfig({\n    this.whisper = const SpokenLanguageIdentificationWhisperConfig(),\n    this.numThreads = 1,\n    this.debug = false,\n    this.provider = 'cpu',\n  });\n\n  factory SpokenLanguageIdentificationConfig.fromJson(\n      Map<String, dynamic> json) {\n    return SpokenLanguageIdentificationConfig(\n      whisper: json['whisper'] != null\n          ? SpokenLanguageIdentificationWhisperConfig.fromJson(\n              json['whisper'] as Map<String, dynamic>)\n          : const SpokenLanguageIdentificationWhisperConfig(),\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? false,\n      provider: json['provider'] as String? ?? 
'cpu',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'SpokenLanguageIdentificationConfig(whisper: $whisper, numThreads: $numThreads, debug: $debug, provider: $provider)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'whisper': whisper.toJson(),\n        'numThreads': numThreads,\n        'debug': debug,\n        'provider': provider,\n      };\n\n  final SpokenLanguageIdentificationWhisperConfig whisper;\n  final int numThreads;\n  final bool debug;\n  final String provider;\n}\n\n/// Result returned by [SpokenLanguageIdentification.compute].\nclass SpokenLanguageIdentificationResult {\n  const SpokenLanguageIdentificationResult({\n    required this.lang,\n  });\n\n  factory SpokenLanguageIdentificationResult.fromJson(\n      Map<String, dynamic> json) {\n    return SpokenLanguageIdentificationResult(\n      lang: json['lang'] as String? ?? '',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'SpokenLanguageIdentificationResult(lang: $lang)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'lang': lang,\n      };\n\n  final String lang;\n}\n\n/// Spoken language identifier.\nclass SpokenLanguageIdentification {\n  SpokenLanguageIdentification.fromPtr(\n      {required this.ptr, required this.config});\n\n  SpokenLanguageIdentification._({required this.ptr, required this.config});\n\n  /// Release the native language identifier.\n  void free() {\n    if (SherpaOnnxBindings.sherpaOnnxDestroySpokenLanguageIdentification ==\n        null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.sherpaOnnxDestroySpokenLanguageIdentification?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Create a language identifier from [config].\n  factory SpokenLanguageIdentification(\n      SpokenLanguageIdentificationConfig config) {\n    final c = convertConfig(config);\n\n    if (SherpaOnnxBindings.sherpaOnnxCreateSpokenLanguageIdentification 
==\n        null) {\n      freeConfig(c);\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final ptr = SherpaOnnxBindings.sherpaOnnxCreateSpokenLanguageIdentification\n            ?.call(c) ??\n        nullptr;\n\n    if (ptr == nullptr) {\n      freeConfig(c);\n      throw Exception(\n          \"Failed to create spoken language identification. Please check your config\");\n    }\n\n    freeConfig(c);\n\n    return SpokenLanguageIdentification._(ptr: ptr, config: config);\n  }\n\n  static Pointer<SherpaOnnxSpokenLanguageIdentificationConfig> convertConfig(\n      SpokenLanguageIdentificationConfig config) {\n    final c = calloc<SherpaOnnxSpokenLanguageIdentificationConfig>();\n\n    c.ref.whisper.encoder = config.whisper.encoder.toNativeUtf8();\n    c.ref.whisper.decoder = config.whisper.decoder.toNativeUtf8();\n    c.ref.whisper.tailPaddings = config.whisper.tailPaddings;\n\n    c.ref.numThreads = config.numThreads;\n    c.ref.debug = config.debug ? 1 : 0;\n    c.ref.provider = config.provider.toNativeUtf8();\n\n    return c;\n  }\n\n  static void freeConfig(\n      Pointer<SherpaOnnxSpokenLanguageIdentificationConfig> c) {\n    malloc.free(c.ref.whisper.encoder);\n    malloc.free(c.ref.whisper.decoder);\n    malloc.free(c.ref.provider);\n    malloc.free(c);\n  }\n\n  /// Create an offline stream for one audio clip.\n  OfflineStream createStream() {\n    if (SherpaOnnxBindings\n            .sherpaOnnxSpokenLanguageIdentificationCreateOfflineStream ==\n        null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    final p = SherpaOnnxBindings\n            .sherpaOnnxSpokenLanguageIdentificationCreateOfflineStream\n            ?.call(ptr) ??\n        nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\"Failed to create offline stream\");\n    }\n\n    return OfflineStream(ptr: p);\n  }\n\n  /// 
Compute the spoken language for [stream].\n  SpokenLanguageIdentificationResult compute(OfflineStream stream) {\n    if (SherpaOnnxBindings.sherpaOnnxSpokenLanguageIdentificationCompute ==\n        null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr || stream.ptr == nullptr) {\n      return const SpokenLanguageIdentificationResult(lang: '');\n    }\n\n    final result = SherpaOnnxBindings\n            .sherpaOnnxSpokenLanguageIdentificationCompute\n            ?.call(ptr, stream.ptr) ??\n        nullptr;\n\n    if (result == nullptr) {\n      return const SpokenLanguageIdentificationResult(lang: '');\n    }\n\n    final lang = toDartString(result.ref.lang);\n\n    SherpaOnnxBindings.sherpaOnnxDestroySpokenLanguageIdentificationResult\n        ?.call(result);\n\n    return SpokenLanguageIdentificationResult(lang: lang);\n  }\n\n  Pointer<SherpaOnnxSpokenLanguageIdentification> ptr;\n  SpokenLanguageIdentificationConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/tts.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:convert';\nimport 'dart:ffi';\nimport 'dart:typed_data';\n\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Offline text-to-speech.\n///\n/// This module supports VITS, Matcha, Kokoro, Kitten, ZipVoice, Pocket TTS,\n/// and Supertonic model families. See `dart-api-examples/tts/bin/` for working\n/// examples such as `pocket-en.dart`, `kokoro-en.dart`, `kokoro-zh-en.dart`,\n/// `matcha-en.dart`, and `zipvoice-zh-en.dart`.\n///\n/// Example:\n///\n/// ```dart\n/// final model = OfflineTtsModelConfig(\n///   pocketTts: const OfflineTtsPocketSphinxModelConfig(\n///     model: './sherpa-onnx-pocket-tts/model.int8.onnx',\n///     tokens: './sherpa-onnx-pocket-tts/tokens.txt',\n///     dataDir: './sherpa-onnx-pocket-tts/espeak-ng-data',\n///   ),\n///   numThreads: 1,\n/// );\n///\n/// final tts = OfflineTts(OfflineTtsConfig(model: model));\n/// final audio = tts.generate(\n///   text: 'Hello from sherpa-onnx',\n///   sid: 0,\n///   speed: 1.0,\n/// );\n/// writeWave(\n///   filename: './out.wav',\n///   samples: audio.samples,\n///   sampleRate: audio.sampleRate,\n/// );\n/// tts.free();\n/// ```\n\n/// Per-request generation options for [OfflineTts.generateWithConfig].\n///\n/// Use this when you need advanced generation controls such as zero-shot voice\n/// cloning reference audio, explicit reference sample rate, or model-specific\n/// values in [extra].\nclass OfflineTtsGenerationConfig {\n  const OfflineTtsGenerationConfig({\n    this.silenceScale = 0.2,\n    this.speed = 1.0,\n    this.sid = 0,\n    this.referenceAudio,\n    this.referenceSampleRate = 0,\n    this.referenceText = '',\n    this.numSteps = 5,\n    this.extra = const {},\n  });\n\n  /// Convert Extra to JSON string.\n  /// Returns nullptr if empty.\n  /// The user should use calloc.free(p); to free the returned value\n  Pointer<Utf8> extraToNativeUtf8() {\n    if (extra.isEmpty) {\n      return 
nullptr;\n    }\n\n    // Validate values\n    for (final v in extra.values) {\n      if (v is! String && v is! int && v is! double) {\n        throw ArgumentError(\n          'extra values must be String, int, or double. Got: ${v.runtimeType}',\n        );\n      }\n    }\n\n    final jsonString = jsonEncode(extra);\n    return jsonString.toNativeUtf8();\n  }\n\n  Pointer<SherpaOnnxGenerationConfig> toNative() {\n    final p = calloc<SherpaOnnxGenerationConfig>();\n\n    p.ref.silenceScale = silenceScale;\n    p.ref.speed = speed;\n    p.ref.sid = sid;\n    p.ref.numSteps = numSteps;\n\n    if (referenceAudio != null && referenceAudio!.isNotEmpty) {\n      final audioPtr = calloc<Float>(referenceAudio!.length);\n      audioPtr.asTypedList(referenceAudio!.length).setAll(0, referenceAudio!);\n      p.ref.referenceAudio = audioPtr;\n      p.ref.referenceAudioLength = referenceAudio!.length;\n      p.ref.referenceSampleRate = referenceSampleRate;\n    } else {\n      p.ref.referenceAudio = nullptr;\n      p.ref.referenceAudioLength = 0;\n      p.ref.referenceSampleRate = 0;\n    }\n\n    p.ref.referenceText = referenceText.isEmpty\n        ? nullptr\n        : referenceText.toNativeUtf8();\n\n    p.ref.extra = extraToNativeUtf8();\n\n    return p;\n  }\n\n  void freeNative(Pointer<SherpaOnnxGenerationConfig> p) {\n    if (p.ref.referenceAudio != nullptr) {\n      calloc.free(p.ref.referenceAudio);\n    }\n    if (p.ref.referenceText != nullptr) {\n      calloc.free(p.ref.referenceText);\n    }\n    if (p.ref.extra != nullptr) {\n      calloc.free(p.ref.extra);\n    }\n    calloc.free(p);\n  }\n\n  final double silenceScale;\n  final double speed;\n  final int sid;\n\n  /// mono audio in [-1, 1]\n  final Float32List? 
referenceAudio;\n  final int referenceSampleRate;\n  final String referenceText;\n  final int numSteps;\n\n  /// Extra model-specific attributes\n  /// key: string\n  /// value: string | int | double\n  final Map<String, Object> extra;\n}\n\n/// VITS model configuration.\nclass OfflineTtsVitsModelConfig {\n  const OfflineTtsVitsModelConfig({\n    this.model = '',\n    this.lexicon = '',\n    this.tokens = '',\n    this.dataDir = '',\n    this.noiseScale = 0.667,\n    this.noiseScaleW = 0.8,\n    this.lengthScale = 1.0,\n    this.dictDir = '',\n  });\n\n  factory OfflineTtsVitsModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsVitsModelConfig(\n      model: json['model'] as String? ?? '',\n      lexicon: json['lexicon'] as String? ?? '',\n      tokens: json['tokens'] as String? ?? '',\n      dataDir: json['dataDir'] as String? ?? '',\n      noiseScale: (json['noiseScale'] as num?)?.toDouble() ?? 0.667,\n      noiseScaleW: (json['noiseScaleW'] as num?)?.toDouble() ?? 0.8,\n      lengthScale: (json['lengthScale'] as num?)?.toDouble() ?? 
1.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsVitsModelConfig(model: $model, lexicon: $lexicon, tokens: $tokens, dataDir: $dataDir, noiseScale: $noiseScale, noiseScaleW: $noiseScaleW, lengthScale: $lengthScale)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'model': model,\n    'lexicon': lexicon,\n    'tokens': tokens,\n    'dataDir': dataDir,\n    'noiseScale': noiseScale,\n    'noiseScaleW': noiseScaleW,\n    'lengthScale': lengthScale,\n  };\n\n  final String model;\n  final String lexicon;\n  final String tokens;\n  final String dataDir;\n  final double noiseScale;\n  final double noiseScaleW;\n  final double lengthScale;\n  final String dictDir; // unused\n}\n\n/// Matcha model configuration.\nclass OfflineTtsMatchaModelConfig {\n  const OfflineTtsMatchaModelConfig({\n    this.acousticModel = '',\n    this.vocoder = '',\n    this.lexicon = '',\n    this.tokens = '',\n    this.dataDir = '',\n    this.noiseScale = 0.667,\n    this.lengthScale = 1.0,\n    this.dictDir = '',\n  });\n\n  factory OfflineTtsMatchaModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsMatchaModelConfig(\n      acousticModel: json['acousticModel'] as String? ?? '',\n      vocoder: json['vocoder'] as String? ?? '',\n      lexicon: json['lexicon'] as String? ?? '',\n      tokens: json['tokens'] as String? ?? '',\n      dataDir: json['dataDir'] as String? ?? '',\n      noiseScale: (json['noiseScale'] as num?)?.toDouble() ?? 0.667,\n      lengthScale: (json['lengthScale'] as num?)?.toDouble() ?? 
1.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsMatchaModelConfig(acousticModel: $acousticModel, vocoder: $vocoder, lexicon: $lexicon, tokens: $tokens, dataDir: $dataDir, noiseScale: $noiseScale, lengthScale: $lengthScale)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'acousticModel': acousticModel,\n    'vocoder': vocoder,\n    'lexicon': lexicon,\n    'tokens': tokens,\n    'dataDir': dataDir,\n    'noiseScale': noiseScale,\n    'lengthScale': lengthScale,\n  };\n\n  final String acousticModel;\n  final String vocoder;\n  final String lexicon;\n  final String tokens;\n  final String dataDir;\n  final double noiseScale;\n  final double lengthScale;\n  final String dictDir; // unused\n}\n\n/// Kokoro model configuration.\nclass OfflineTtsKokoroModelConfig {\n  const OfflineTtsKokoroModelConfig({\n    this.model = '',\n    this.voices = '',\n    this.tokens = '',\n    this.dataDir = '',\n    this.lengthScale = 1.0,\n    this.dictDir = '',\n    this.lexicon = '',\n    this.lang = '',\n  });\n\n  factory OfflineTtsKokoroModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsKokoroModelConfig(\n      model: json['model'] as String? ?? '',\n      voices: json['voices'] as String? ?? '',\n      tokens: json['tokens'] as String? ?? '',\n      dataDir: json['dataDir'] as String? ?? '',\n      lengthScale: (json['lengthScale'] as num?)?.toDouble() ?? 1.0,\n      lexicon: json['lexicon'] as String? ?? '',\n      lang: json['lang'] as String? ?? 
'',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsKokoroModelConfig(model: $model, voices: $voices, tokens: $tokens, dataDir: $dataDir, lengthScale: $lengthScale, lexicon: $lexicon, lang: $lang)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'model': model,\n    'voices': voices,\n    'tokens': tokens,\n    'dataDir': dataDir,\n    'lengthScale': lengthScale,\n    'lexicon': lexicon,\n    'lang': lang,\n  };\n\n  final String model;\n  final String voices;\n  final String tokens;\n  final String dataDir;\n  final double lengthScale;\n  final String dictDir; // unused\n  final String lexicon;\n  final String lang;\n}\n\n/// Kitten model configuration.\nclass OfflineTtsKittenModelConfig {\n  const OfflineTtsKittenModelConfig({\n    this.model = '',\n    this.voices = '',\n    this.tokens = '',\n    this.dataDir = '',\n    this.lengthScale = 1.0,\n  });\n\n  factory OfflineTtsKittenModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsKittenModelConfig(\n      model: json['model'] as String? ?? '',\n      voices: json['voices'] as String? ?? '',\n      tokens: json['tokens'] as String? ?? '',\n      dataDir: json['dataDir'] as String? ?? '',\n      lengthScale: (json['lengthScale'] as num?)?.toDouble() ?? 
1.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsKittenModelConfig(model: $model, voices: $voices, tokens: $tokens, dataDir: $dataDir, lengthScale: $lengthScale)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'model': model,\n    'voices': voices,\n    'tokens': tokens,\n    'dataDir': dataDir,\n    'lengthScale': lengthScale,\n  };\n\n  final String model;\n  final String voices;\n  final String tokens;\n  final String dataDir;\n  final double lengthScale;\n}\n\n/// ZipVoice model configuration.\nclass OfflineTtsZipVoiceModelConfig {\n  const OfflineTtsZipVoiceModelConfig({\n    this.tokens = '',\n    this.encoder = '',\n    this.decoder = '',\n    this.vocoder = '',\n    this.dataDir = '',\n    this.lexicon = '',\n    this.featScale = 0.1,\n    this.tShift = 0.5,\n    this.targetRms = 0.1,\n    this.guidanceScale = 1.0,\n  });\n\n  factory OfflineTtsZipVoiceModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsZipVoiceModelConfig(\n      tokens: json['tokens'] as String? ?? '',\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      vocoder: json['vocoder'] as String? ?? '',\n      dataDir: json['dataDir'] as String? ?? '',\n      lexicon: json['lexicon'] as String? ?? '',\n      featScale: (json['featScale'] as num?)?.toDouble() ?? 0.1,\n      tShift: (json['tShift'] as num?)?.toDouble() ?? 0.5,\n      targetRms: (json['targetRms'] as num?)?.toDouble() ?? 0.1,\n      guidanceScale: (json['guidanceScale'] as num?)?.toDouble() ?? 
1.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsZipVoiceModelConfig(tokens: $tokens, encoder: $encoder, decoder: $decoder, vocoder: $vocoder, dataDir: $dataDir, lexicon: $lexicon, featScale: $featScale, tShift: $tShift, targetRms: $targetRms, guidanceScale: $guidanceScale)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'tokens': tokens,\n    'encoder': encoder,\n    'decoder': decoder,\n    'vocoder': vocoder,\n    'dataDir': dataDir,\n    'lexicon': lexicon,\n    'featScale': featScale,\n    'tShift': tShift,\n    'targetRms': targetRms,\n    'guidanceScale': guidanceScale,\n  };\n\n  final String tokens;\n  final String encoder;\n  final String decoder;\n  final String vocoder;\n  final String dataDir;\n  final String lexicon;\n  final double featScale;\n  final double tShift;\n  final double targetRms;\n  final double guidanceScale;\n}\n\n/// Pocket TTS model configuration.\n///\n/// This family supports zero-shot voice cloning with a reference waveform.\nclass OfflineTtsPocketModelConfig {\n  const OfflineTtsPocketModelConfig({\n    this.lmFlow = '',\n    this.lmMain = '',\n    this.encoder = '',\n    this.decoder = '',\n    this.textConditioner = '',\n    this.vocabJson = '',\n    this.tokenScoresJson = '',\n    this.voiceEmbeddingCacheCapacity = 50,\n  });\n\n  factory OfflineTtsPocketModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsPocketModelConfig(\n      lmFlow: json['lmFlow'] as String? ?? '',\n      lmMain: json['lmMain'] as String? ?? '',\n      encoder: json['encoder'] as String? ?? '',\n      decoder: json['decoder'] as String? ?? '',\n      textConditioner: json['textConditioner'] as String? ?? '',\n      vocabJson: json['vocabJson'] as String? ?? '',\n      tokenScoresJson: json['tokenScoresJson'] as String? ?? '',\n      voiceEmbeddingCacheCapacity:\n          json['voiceEmbeddingCacheCapacity'] as int? ?? 
50,\n    );\n  }\n\n  Map<String, dynamic> toJson() => {\n    'lmFlow': lmFlow,\n    'lmMain': lmMain,\n    'encoder': encoder,\n    'decoder': decoder,\n    'textConditioner': textConditioner,\n    'vocabJson': vocabJson,\n    'tokenScoresJson': tokenScoresJson,\n    'voiceEmbeddingCacheCapacity': voiceEmbeddingCacheCapacity,\n  };\n\n  @override\n  String toString() {\n    return 'OfflineTtsPocketModelConfig(lmFlow: $lmFlow, lmMain: $lmMain, encoder: $encoder, decoder: $decoder, textConditioner: $textConditioner, vocabJson: $vocabJson, tokenScoresJson: $tokenScoresJson, voiceEmbeddingCacheCapacity: $voiceEmbeddingCacheCapacity)';\n  }\n\n  final String lmFlow;\n  final String lmMain;\n  final String encoder;\n  final String decoder;\n  final String textConditioner;\n  final String vocabJson;\n  final String tokenScoresJson;\n  final int voiceEmbeddingCacheCapacity;\n}\n\n/// Supertonic model configuration.\nclass OfflineTtsSupertonicModelConfig {\n  const OfflineTtsSupertonicModelConfig({\n    this.durationPredictor = '',\n    this.textEncoder = '',\n    this.vectorEstimator = '',\n    this.vocoder = '',\n    this.ttsJson = '',\n    this.unicodeIndexer = '',\n    this.voiceStyle = '',\n  });\n\n  factory OfflineTtsSupertonicModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsSupertonicModelConfig(\n      durationPredictor: json['durationPredictor'] as String? ?? '',\n      textEncoder: json['textEncoder'] as String? ?? '',\n      vectorEstimator: json['vectorEstimator'] as String? ?? '',\n      vocoder: json['vocoder'] as String? ?? '',\n      ttsJson: json['ttsJson'] as String? ?? '',\n      unicodeIndexer: json['unicodeIndexer'] as String? ?? '',\n      voiceStyle: json['voiceStyle'] as String? ?? 
'',\n    );\n  }\n\n  Map<String, dynamic> toJson() => {\n    'durationPredictor': durationPredictor,\n    'textEncoder': textEncoder,\n    'vectorEstimator': vectorEstimator,\n    'vocoder': vocoder,\n    'ttsJson': ttsJson,\n    'unicodeIndexer': unicodeIndexer,\n    'voiceStyle': voiceStyle,\n  };\n\n  @override\n  String toString() {\n    return 'OfflineTtsSupertonicModelConfig(durationPredictor: $durationPredictor, textEncoder: $textEncoder, vectorEstimator: $vectorEstimator, vocoder: $vocoder, ttsJson: $ttsJson, unicodeIndexer: $unicodeIndexer, voiceStyle: $voiceStyle)';\n  }\n\n  final String durationPredictor;\n  final String textEncoder;\n  final String vectorEstimator;\n  final String vocoder;\n  final String ttsJson;\n  final String unicodeIndexer;\n  final String voiceStyle;\n}\n\n/// Aggregate model configuration for offline TTS.\n///\n/// Configure exactly one model family for a typical setup and set the shared\n/// runtime options such as [numThreads] and [provider].\nclass OfflineTtsModelConfig {\n  const OfflineTtsModelConfig({\n    this.vits = const OfflineTtsVitsModelConfig(),\n    this.matcha = const OfflineTtsMatchaModelConfig(),\n    this.kokoro = const OfflineTtsKokoroModelConfig(),\n    this.kitten = const OfflineTtsKittenModelConfig(),\n    this.zipvoice = const OfflineTtsZipVoiceModelConfig(),\n    this.pocket = const OfflineTtsPocketModelConfig(),\n    this.supertonic = const OfflineTtsSupertonicModelConfig(),\n    this.numThreads = 1,\n    this.debug = true,\n    this.provider = 'cpu',\n  });\n\n  factory OfflineTtsModelConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsModelConfig(\n      vits: OfflineTtsVitsModelConfig.fromJson(\n        json['vits'] as Map<String, dynamic>? ?? const {},\n      ),\n      matcha: OfflineTtsMatchaModelConfig.fromJson(\n        json['matcha'] as Map<String, dynamic>? ?? const {},\n      ),\n      kokoro: OfflineTtsKokoroModelConfig.fromJson(\n        json['kokoro'] as Map<String, dynamic>? 
?? const {},\n      ),\n      kitten: OfflineTtsKittenModelConfig.fromJson(\n        json['kitten'] as Map<String, dynamic>? ?? const {},\n      ),\n      zipvoice: OfflineTtsZipVoiceModelConfig.fromJson(\n        json['zipvoice'] as Map<String, dynamic>? ?? const {},\n      ),\n      pocket: OfflineTtsPocketModelConfig.fromJson(\n        json['pocket'] as Map<String, dynamic>? ?? const {},\n      ),\n      supertonic: OfflineTtsSupertonicModelConfig.fromJson(\n        json['supertonic'] as Map<String, dynamic>? ?? const {},\n      ),\n      numThreads: json['numThreads'] as int? ?? 1,\n      debug: json['debug'] as bool? ?? true,\n      provider: json['provider'] as String? ?? 'cpu',\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsModelConfig(vits: $vits, matcha: $matcha, kokoro: $kokoro, kitten: $kitten, zipvoice: $zipvoice, pocket: $pocket, supertonic: $supertonic, numThreads: $numThreads, debug: $debug, provider: $provider)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'vits': vits.toJson(),\n    'matcha': matcha.toJson(),\n    'kokoro': kokoro.toJson(),\n    'kitten': kitten.toJson(),\n    'zipvoice': zipvoice.toJson(),\n    'pocket': pocket.toJson(),\n    'supertonic': supertonic.toJson(),\n    'numThreads': numThreads,\n    'debug': debug,\n    'provider': provider,\n  };\n\n  final OfflineTtsVitsModelConfig vits;\n  final OfflineTtsMatchaModelConfig matcha;\n  final OfflineTtsKokoroModelConfig kokoro;\n  final OfflineTtsKittenModelConfig kitten;\n  final OfflineTtsZipVoiceModelConfig zipvoice;\n  final OfflineTtsPocketModelConfig pocket;\n  final OfflineTtsSupertonicModelConfig supertonic;\n  final int numThreads;\n  final bool debug;\n  final String provider;\n}\n\n/// Top-level configuration for [OfflineTts].\nclass OfflineTtsConfig {\n  const OfflineTtsConfig({\n    required this.model,\n    this.ruleFsts = '',\n    this.maxNumSenetences = 1,\n    this.ruleFars = '',\n    this.silenceScale = 0.2,\n  });\n\n  factory 
OfflineTtsConfig.fromJson(Map<String, dynamic> json) {\n    return OfflineTtsConfig(\n      model: OfflineTtsModelConfig.fromJson(\n        json['model'] as Map<String, dynamic>,\n      ),\n      ruleFsts: json['ruleFsts'] as String? ?? '',\n      maxNumSenetences: json['maxNumSenetences'] as int? ?? 1,\n      ruleFars: json['ruleFars'] as String? ?? '',\n      silenceScale: (json['silenceScale'] as num?)?.toDouble() ?? 0.2,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'OfflineTtsConfig(model: $model, ruleFsts: $ruleFsts, maxNumSenetences: $maxNumSenetences, ruleFars: $ruleFars, silenceScale: $silenceScale)';\n  }\n\n  Map<String, dynamic> toJson() => {\n    'model': model.toJson(),\n    'ruleFsts': ruleFsts,\n    'maxNumSenetences': maxNumSenetences,\n    'ruleFars': ruleFars,\n    'silenceScale': silenceScale,\n  };\n\n  final OfflineTtsModelConfig model;\n  final String ruleFsts;\n  final int maxNumSenetences;\n  final String ruleFars;\n  final double silenceScale;\n}\n\n/// Audio generated by [OfflineTts].\nclass GeneratedAudio {\n  GeneratedAudio({required this.samples, required this.sampleRate});\n\n  final Float32List samples;\n  final int sampleRate;\n}\n\n/// Offline text-to-speech engine.\n///\n/// Create one from an [OfflineTtsConfig], then call [generate],\n/// [generateWithCallback], or [generateWithConfig] depending on how much\n/// control you need over the generation process.\nclass OfflineTts {\n  OfflineTts.fromPtr({required this.ptr, required this.config});\n\n  OfflineTts._({required this.ptr, required this.config});\n\n  /// The caller is responsible for calling [free] on the returned\n  /// instance to avoid a memory leak.\n  factory OfflineTts(OfflineTtsConfig config) {\n    if (SherpaOnnxBindings.createOfflineTts == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxOfflineTtsConfig>();\n    c.ref.model.vits.model = 
config.model.vits.model.toNativeUtf8();\n    c.ref.model.vits.lexicon = config.model.vits.lexicon.toNativeUtf8();\n    c.ref.model.vits.tokens = config.model.vits.tokens.toNativeUtf8();\n    c.ref.model.vits.dataDir = config.model.vits.dataDir.toNativeUtf8();\n    c.ref.model.vits.noiseScale = config.model.vits.noiseScale;\n    c.ref.model.vits.noiseScaleW = config.model.vits.noiseScaleW;\n    c.ref.model.vits.lengthScale = config.model.vits.lengthScale;\n\n    c.ref.model.matcha.acousticModel = config.model.matcha.acousticModel\n        .toNativeUtf8();\n    c.ref.model.matcha.vocoder = config.model.matcha.vocoder.toNativeUtf8();\n    c.ref.model.matcha.lexicon = config.model.matcha.lexicon.toNativeUtf8();\n    c.ref.model.matcha.tokens = config.model.matcha.tokens.toNativeUtf8();\n    c.ref.model.matcha.dataDir = config.model.matcha.dataDir.toNativeUtf8();\n    c.ref.model.matcha.noiseScale = config.model.matcha.noiseScale;\n    c.ref.model.matcha.lengthScale = config.model.matcha.lengthScale;\n\n    c.ref.model.kokoro.model = config.model.kokoro.model.toNativeUtf8();\n    c.ref.model.kokoro.voices = config.model.kokoro.voices.toNativeUtf8();\n    c.ref.model.kokoro.tokens = config.model.kokoro.tokens.toNativeUtf8();\n    c.ref.model.kokoro.dataDir = config.model.kokoro.dataDir.toNativeUtf8();\n    c.ref.model.kokoro.lengthScale = config.model.kokoro.lengthScale;\n    c.ref.model.kokoro.lexicon = config.model.kokoro.lexicon.toNativeUtf8();\n    c.ref.model.kokoro.lang = config.model.kokoro.lang.toNativeUtf8();\n\n    c.ref.model.kitten.model = config.model.kitten.model.toNativeUtf8();\n    c.ref.model.kitten.voices = config.model.kitten.voices.toNativeUtf8();\n    c.ref.model.kitten.tokens = config.model.kitten.tokens.toNativeUtf8();\n    c.ref.model.kitten.dataDir = config.model.kitten.dataDir.toNativeUtf8();\n    c.ref.model.kitten.lengthScale = config.model.kitten.lengthScale;\n\n    c.ref.model.zipvoice.tokens = config.model.zipvoice.tokens.toNativeUtf8();\n  
  c.ref.model.zipvoice.encoder = config.model.zipvoice.encoder.toNativeUtf8();\n    c.ref.model.zipvoice.decoder = config.model.zipvoice.decoder.toNativeUtf8();\n    c.ref.model.zipvoice.vocoder = config.model.zipvoice.vocoder.toNativeUtf8();\n    c.ref.model.zipvoice.dataDir = config.model.zipvoice.dataDir.toNativeUtf8();\n    c.ref.model.zipvoice.lexicon = config.model.zipvoice.lexicon.toNativeUtf8();\n    c.ref.model.zipvoice.featScale = config.model.zipvoice.featScale;\n    c.ref.model.zipvoice.tShift = config.model.zipvoice.tShift;\n    c.ref.model.zipvoice.targetRms = config.model.zipvoice.targetRms;\n    c.ref.model.zipvoice.guidanceScale = config.model.zipvoice.guidanceScale;\n\n    c.ref.model.pocket.lmFlow = config.model.pocket.lmFlow.toNativeUtf8();\n    c.ref.model.pocket.lmMain = config.model.pocket.lmMain.toNativeUtf8();\n    c.ref.model.pocket.encoder = config.model.pocket.encoder.toNativeUtf8();\n    c.ref.model.pocket.decoder = config.model.pocket.decoder.toNativeUtf8();\n    c.ref.model.pocket.textConditioner = config.model.pocket.textConditioner\n        .toNativeUtf8();\n    c.ref.model.pocket.vocabJson = config.model.pocket.vocabJson.toNativeUtf8();\n    c.ref.model.pocket.tokenScoresJson = config.model.pocket.tokenScoresJson\n        .toNativeUtf8();\n    c.ref.model.pocket.voiceEmbeddingCacheCapacity =\n        config.model.pocket.voiceEmbeddingCacheCapacity;\n\n    c.ref.model.supertonic.durationPredictor = config.model.supertonic\n        .durationPredictor.toNativeUtf8();\n    c.ref.model.supertonic.textEncoder = config.model.supertonic.textEncoder\n        .toNativeUtf8();\n    c.ref.model.supertonic.vectorEstimator = config.model.supertonic\n        .vectorEstimator.toNativeUtf8();\n    c.ref.model.supertonic.vocoder = config.model.supertonic.vocoder\n        .toNativeUtf8();\n    c.ref.model.supertonic.ttsJson = config.model.supertonic.ttsJson\n        .toNativeUtf8();\n    c.ref.model.supertonic.unicodeIndexer = 
config.model.supertonic\n        .unicodeIndexer.toNativeUtf8();\n    c.ref.model.supertonic.voiceStyle = config.model.supertonic.voiceStyle\n        .toNativeUtf8();\n\n    c.ref.model.numThreads = config.model.numThreads;\n    c.ref.model.debug = config.model.debug ? 1 : 0;\n    c.ref.model.provider = config.model.provider.toNativeUtf8();\n\n    c.ref.ruleFsts = config.ruleFsts.toNativeUtf8();\n    c.ref.maxNumSenetences = config.maxNumSenetences;\n    c.ref.ruleFars = config.ruleFars.toNativeUtf8();\n    c.ref.silenceScale = config.silenceScale;\n\n    final ptr = SherpaOnnxBindings.createOfflineTts?.call(c) ?? nullptr;\n\n    calloc.free(c.ref.ruleFars);\n    calloc.free(c.ref.ruleFsts);\n    calloc.free(c.ref.model.provider);\n\n    calloc.free(c.ref.model.supertonic.voiceStyle);\n    calloc.free(c.ref.model.supertonic.unicodeIndexer);\n    calloc.free(c.ref.model.supertonic.ttsJson);\n    calloc.free(c.ref.model.supertonic.vocoder);\n    calloc.free(c.ref.model.supertonic.vectorEstimator);\n    calloc.free(c.ref.model.supertonic.textEncoder);\n    calloc.free(c.ref.model.supertonic.durationPredictor);\n\n    calloc.free(c.ref.model.pocket.tokenScoresJson);\n    calloc.free(c.ref.model.pocket.vocabJson);\n    calloc.free(c.ref.model.pocket.textConditioner);\n    calloc.free(c.ref.model.pocket.decoder);\n    calloc.free(c.ref.model.pocket.encoder);\n    calloc.free(c.ref.model.pocket.lmMain);\n    calloc.free(c.ref.model.pocket.lmFlow);\n\n    calloc.free(c.ref.model.zipvoice.lexicon);\n    calloc.free(c.ref.model.zipvoice.dataDir);\n    calloc.free(c.ref.model.zipvoice.vocoder);\n    calloc.free(c.ref.model.zipvoice.decoder);\n    calloc.free(c.ref.model.zipvoice.encoder);\n    calloc.free(c.ref.model.zipvoice.tokens);\n\n    calloc.free(c.ref.model.kitten.dataDir);\n    calloc.free(c.ref.model.kitten.tokens);\n    calloc.free(c.ref.model.kitten.voices);\n    calloc.free(c.ref.model.kitten.model);\n\n    calloc.free(c.ref.model.kokoro.lang);\n    
calloc.free(c.ref.model.kokoro.lexicon);\n    calloc.free(c.ref.model.kokoro.dataDir);\n    calloc.free(c.ref.model.kokoro.tokens);\n    calloc.free(c.ref.model.kokoro.voices);\n    calloc.free(c.ref.model.kokoro.model);\n\n    calloc.free(c.ref.model.matcha.dataDir);\n    calloc.free(c.ref.model.matcha.tokens);\n    calloc.free(c.ref.model.matcha.lexicon);\n    calloc.free(c.ref.model.matcha.vocoder);\n    calloc.free(c.ref.model.matcha.acousticModel);\n\n    calloc.free(c.ref.model.vits.dataDir);\n    calloc.free(c.ref.model.vits.tokens);\n    calloc.free(c.ref.model.vits.lexicon);\n    calloc.free(c.ref.model.vits.model);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create offline tts. Please check your config\");\n    }\n\n    return OfflineTts._(ptr: ptr, config: config);\n  }\n\n  /// Release the native TTS engine.\n  void free() {\n    if (SherpaOnnxBindings.destroyOfflineTts == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyOfflineTts?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Generate audio using the simple `(text, sid, speed)` API.\n  GeneratedAudio generate({\n    required String text,\n    int sid = 0,\n    double speed = 1.0,\n  }) {\n    if (SherpaOnnxBindings.offlineTtsGenerate == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final Pointer<Utf8> textPtr = text.toNativeUtf8();\n    final p =\n        SherpaOnnxBindings.offlineTtsGenerate?.call(ptr, textPtr, sid, speed) ??\n        nullptr;\n    calloc.free(textPtr);\n\n    if (p == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final samples = p.ref.samples.asTypedList(p.ref.n);\n    final sampleRate = p.ref.sampleRate;\n    final newSamples = 
Float32List.fromList(samples);\n\n    SherpaOnnxBindings.destroyOfflineTtsGeneratedAudio?.call(p);\n\n    return GeneratedAudio(samples: newSamples, sampleRate: sampleRate);\n  }\n\n  /// Generate audio while receiving partial sample chunks through [callback].\n  GeneratedAudio generateWithCallback({\n    required String text,\n    int sid = 0,\n    double speed = 1.0,\n    required int Function(Float32List samples) callback,\n  }) {\n    if (SherpaOnnxBindings.offlineTtsGenerateWithCallback == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    // see\n    // https://github.com/dart-lang/sdk/issues/54276#issuecomment-1846109285\n    // https://stackoverflow.com/questions/69537440/callbacks-in-dart-dartffi-only-supports-calling-static-dart-functions-from-nat\n    // https://github.com/dart-lang/sdk/blob/main/tests/ffi/isolate_local_function_callbacks_test.dart#L46\n    final wrapper =\n        NativeCallable<SherpaOnnxGeneratedAudioCallbackNative>.isolateLocal((\n          Pointer<Float> samples,\n          int n,\n        ) {\n          final s = samples.asTypedList(n);\n          final newSamples = Float32List.fromList(s);\n          return callback(newSamples);\n        }, exceptionalReturn: 0);\n\n    final Pointer<Utf8> textPtr = text.toNativeUtf8();\n    final p =\n        SherpaOnnxBindings.offlineTtsGenerateWithCallback?.call(\n          ptr,\n          textPtr,\n          sid,\n          speed,\n          wrapper.nativeFunction,\n        ) ??\n        nullptr;\n\n    calloc.free(textPtr);\n    wrapper.close();\n\n    if (p == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final samples = p.ref.samples.asTypedList(p.ref.n);\n    final sampleRate = p.ref.sampleRate;\n    final newSamples = Float32List.fromList(samples);\n\n    
SherpaOnnxBindings.destroyOfflineTtsGeneratedAudio?.call(p);\n\n    return GeneratedAudio(samples: newSamples, sampleRate: sampleRate);\n  }\n\n  /// Generate audio using [OfflineTtsGenerationConfig].\n  ///\n  /// This is the most flexible generation API and is the recommended entry\n  /// point for features such as Pocket TTS reference-audio cloning and\n  /// model-specific options supplied through [OfflineTtsGenerationConfig.extra].\n  GeneratedAudio generateWithConfig({\n    required String text,\n    required OfflineTtsGenerationConfig config,\n    int Function(Float32List samples, double progress)? onProgress,\n  }) {\n    if (SherpaOnnxBindings.offlineTtsGenerateWithConfig == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final textPtr = text.toNativeUtf8();\n    final cfgPtr = config.toNative();\n\n    NativeCallable<SherpaOnnxGeneratedAudioProgressCallbackWithArgNative>?\n    wrapper;\n\n    if (onProgress != null) {\n      wrapper =\n          NativeCallable<\n            SherpaOnnxGeneratedAudioProgressCallbackWithArgNative\n          >.isolateLocal((\n            Pointer<Float> samples,\n            int n,\n            double p,\n            Pointer<Void> arg,\n          ) {\n            final list = Float32List.fromList(samples.asTypedList(n));\n            return onProgress(list, p);\n          }, exceptionalReturn: 0);\n    }\n\n    final p =\n        SherpaOnnxBindings.offlineTtsGenerateWithConfig?.call(\n          ptr,\n          textPtr,\n          cfgPtr,\n          wrapper?.nativeFunction ?? 
nullptr,\n          nullptr,\n        ) ??\n        nullptr;\n\n    calloc.free(textPtr);\n    config.freeNative(cfgPtr);\n    wrapper?.close();\n\n    if (p == nullptr) {\n      return GeneratedAudio(samples: Float32List(0), sampleRate: 0);\n    }\n\n    final samples = Float32List.fromList(p.ref.samples.asTypedList(p.ref.n));\n    final sampleRate = p.ref.sampleRate;\n\n    SherpaOnnxBindings.destroyOfflineTtsGeneratedAudio?.call(p);\n\n    return GeneratedAudio(samples: samples, sampleRate: sampleRate);\n  }\n\n  /// Return the output sample rate reported by the model.\n  int get sampleRate {\n    if (SherpaOnnxBindings.offlineTtsSampleRate == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.offlineTtsSampleRate?.call(ptr) ?? 0;\n  }\n\n  /// Return the number of built-in speakers reported by the model.\n  int get numSpeakers {\n    if (SherpaOnnxBindings.offlineTtsNumSpeakers == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.offlineTtsNumSpeakers?.call(ptr) ?? 0;\n  }\n\n  Pointer<SherpaOnnxOfflineTts> ptr;\n  OfflineTtsConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/utils.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:convert';\nimport 'dart:ffi';\n\nimport 'package:ffi/ffi.dart';\n\nint _strLen(Pointer<Uint8> codeUnits) {\n  // this function is copied from\n  // https://github.com/dart-archive/ffi/blob/main/lib/src/utf8.dart#L52\n  var length = 0;\n  while (codeUnits[length] != 0) {\n    length++;\n  }\n  return length;\n}\n\n// This function is modified from\n// https://github.com/dart-archive/ffi/blob/main/lib/src/utf8.dart#L41\n// It ignores invalid utf8 sequence\nString toDartString(Pointer<Utf8> s) {\n  final codeUnits = s.cast<Uint8>();\n  final length = _strLen(codeUnits);\n  return utf8.decode(codeUnits.asTypedList(length), allowMalformed: true);\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/vad.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Voice activity detection and buffering helpers.\n///\n/// See `dart-api-examples/vad/bin/vad.dart` and\n/// `dart-api-examples/vad/bin/ten-vad.dart` for complete examples.\n///\n/// Example:\n///\n/// ```dart\n/// final config = VadModelConfig(\n///   sileroVad: const SileroVadModelConfig(\n///     model: './silero_vad.onnx',\n///     minSilenceDuration: 0.25,\n///     minSpeechDuration: 0.5,\n///   ),\n///   numThreads: 1,\n/// );\n///\n/// final vad = VoiceActivityDetector(config: config, bufferSizeInSeconds: 10);\n/// final wave = readWave('./test.wav');\n/// vad.acceptWaveform(wave.samples);\n/// vad.flush();\n/// while (!vad.isEmpty()) {\n///   print(vad.front());\n///   vad.pop();\n/// }\n/// vad.free();\n/// ```\n\n/// Silero VAD model configuration.\nclass SileroVadModelConfig {\n  const SileroVadModelConfig(\n      {this.model = '',\n      this.threshold = 0.5,\n      this.minSilenceDuration = 0.5,\n      this.minSpeechDuration = 0.25,\n      this.windowSize = 512,\n      this.maxSpeechDuration = 5.0});\n\n  factory SileroVadModelConfig.fromJson(Map<String, dynamic> json) {\n    return SileroVadModelConfig(\n      model: json['model'] as String? ?? '',\n      threshold: (json['threshold'] as num?)?.toDouble() ?? 0.5,\n      minSilenceDuration:\n          (json['minSilenceDuration'] as num?)?.toDouble() ?? 0.5,\n      minSpeechDuration:\n          (json['minSpeechDuration'] as num?)?.toDouble() ?? 0.25,\n      windowSize: json['windowSize'] as int? ?? 512,\n      maxSpeechDuration: (json['maxSpeechDuration'] as num?)?.toDouble() ?? 
5.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'SileroVadModelConfig(model: $model, threshold: $threshold, minSilenceDuration: $minSilenceDuration, minSpeechDuration: $minSpeechDuration, windowSize: $windowSize, maxSpeechDuration: $maxSpeechDuration)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n        'threshold': threshold,\n        'minSilenceDuration': minSilenceDuration,\n        'minSpeechDuration': minSpeechDuration,\n        'windowSize': windowSize,\n        'maxSpeechDuration': maxSpeechDuration,\n      };\n\n  final String model;\n  final double threshold;\n  final double minSilenceDuration;\n  final double minSpeechDuration;\n  final int windowSize;\n  final double maxSpeechDuration;\n}\n\n/// Ten VAD model configuration.\nclass TenVadModelConfig {\n  const TenVadModelConfig(\n      {this.model = '',\n      this.threshold = 0.5,\n      this.minSilenceDuration = 0.5,\n      this.minSpeechDuration = 0.25,\n      this.windowSize = 256,\n      this.maxSpeechDuration = 5.0});\n\n  factory TenVadModelConfig.fromJson(Map<String, dynamic> json) {\n    return TenVadModelConfig(\n      model: json['model'] as String? ?? '',\n      threshold: (json['threshold'] as num?)?.toDouble() ?? 0.5,\n      minSilenceDuration:\n          (json['minSilenceDuration'] as num?)?.toDouble() ?? 0.5,\n      minSpeechDuration:\n          (json['minSpeechDuration'] as num?)?.toDouble() ?? 0.25,\n      windowSize: json['windowSize'] as int? ?? 256,\n      maxSpeechDuration: (json['maxSpeechDuration'] as num?)?.toDouble() ?? 
5.0,\n    );\n  }\n\n  @override\n  String toString() {\n    return 'TenVadModelConfig(model: $model, threshold: $threshold, minSilenceDuration: $minSilenceDuration, minSpeechDuration: $minSpeechDuration, windowSize: $windowSize, maxSpeechDuration: $maxSpeechDuration)';\n  }\n\n  Map<String, dynamic> toJson() => {\n        'model': model,\n        'threshold': threshold,\n        'minSilenceDuration': minSilenceDuration,\n        'minSpeechDuration': minSpeechDuration,\n        'windowSize': windowSize,\n        'maxSpeechDuration': maxSpeechDuration,\n      };\n\n  final String model;\n  final double threshold;\n  final double minSilenceDuration;\n  final double minSpeechDuration;\n  final int windowSize;\n  final double maxSpeechDuration;\n}\n\n/// Top-level VAD model configuration.\n///\n/// Configure either [sileroVad] or [tenVad] for typical use and set the shared\n/// sample rate and runtime settings here.\nclass VadModelConfig {\n  VadModelConfig({\n    this.sileroVad = const SileroVadModelConfig(),\n    this.sampleRate = 16000,\n    this.numThreads = 1,\n    this.provider = 'cpu',\n    this.debug = true,\n    this.tenVad = const TenVadModelConfig(),\n  });\n\n  final SileroVadModelConfig sileroVad;\n  final TenVadModelConfig tenVad;\n  final int sampleRate;\n  final int numThreads;\n  final String provider;\n  final bool debug;\n\n  factory VadModelConfig.fromJson(Map<String, dynamic> json) {\n    return VadModelConfig(\n      sileroVad: SileroVadModelConfig.fromJson(\n          json['sileroVad'] as Map<String, dynamic>? ?? const {}),\n      tenVad: TenVadModelConfig.fromJson(\n          json['tenVad'] as Map<String, dynamic>? ?? const {}),\n      sampleRate: json['sampleRate'] as int? ?? 16000,\n      numThreads: json['numThreads'] as int? ?? 1,\n      provider: json['provider'] as String? ?? 'cpu',\n      debug: json['debug'] as bool? ?? 
true,\n    );\n  }\n\n  Map<String, dynamic> toJson() => {\n        'sileroVad': sileroVad.toJson(),\n        'tenVad': tenVad.toJson(),\n        'sampleRate': sampleRate,\n        'numThreads': numThreads,\n        'provider': provider,\n        'debug': debug,\n      };\n\n  @override\n  String toString() {\n    return 'VadModelConfig(sileroVad: $sileroVad, tenVad: $tenVad, sampleRate: $sampleRate, numThreads: $numThreads, provider: $provider, debug: $debug)';\n  }\n}\n\n/// One detected speech segment emitted by [VoiceActivityDetector].\nclass SpeechSegment {\n  SpeechSegment({required this.samples, required this.start});\n  final Float32List samples;\n  final int start;\n}\n\n/// Circular sample buffer used by VAD-related pipelines.\nclass CircularBuffer {\n  CircularBuffer.fromPtr({required this.ptr});\n\n  CircularBuffer._({required this.ptr});\n\n  /// The user has to invoke CircularBuffer.free() on the returned instance\n  /// to avoid memory leak.\n  factory CircularBuffer({required int capacity}) {\n    assert(capacity > 0, 'capacity is $capacity');\n\n    if (SherpaOnnxBindings.createCircularBuffer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final p =\n        SherpaOnnxBindings.createCircularBuffer?.call(capacity) ?? nullptr;\n\n    if (p == nullptr) {\n      throw Exception(\n          \"Failed to create circular buffer. 
Please check your config\");\n    }\n\n    return CircularBuffer._(ptr: p);\n  }\n\n  /// Release the native buffer.\n  void free() {\n    if (SherpaOnnxBindings.destroyCircularBuffer == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyCircularBuffer?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Append samples to the tail of the buffer.\n  void push(Float32List data) {\n    if (SherpaOnnxBindings.circularBufferPush == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    final n = data.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, data);\n\n    SherpaOnnxBindings.circularBufferPush?.call(ptr, p, n);\n\n    calloc.free(p);\n  }\n\n  /// Copy [n] samples starting at [startIndex].\n  Float32List get({required int startIndex, required int n}) {\n    if (SherpaOnnxBindings.circularBufferGet == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return Float32List(0);\n    }\n\n    final Pointer<Float> p =\n        SherpaOnnxBindings.circularBufferGet?.call(ptr, startIndex, n) ??\n            nullptr;\n\n    if (p == nullptr) {\n      return Float32List(0);\n    }\n\n    final pList = p.asTypedList(n);\n    final Float32List ans = Float32List.fromList(pList);\n\n    SherpaOnnxBindings.circularBufferFree?.call(p);\n\n    return ans;\n  }\n\n  /// Drop [n] samples from the head of the buffer.\n  void pop(int n) {\n    if (SherpaOnnxBindings.circularBufferPop == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.circularBufferPop?.call(ptr, n);\n  }\n\n  /// Clear the buffer contents.\n  void reset() 
{\n    if (SherpaOnnxBindings.circularBufferReset == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.circularBufferReset?.call(ptr);\n  }\n\n  int get size {\n    if (SherpaOnnxBindings.circularBufferSize == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.circularBufferSize?.call(ptr) ?? 0;\n  }\n\n  int get head {\n    if (SherpaOnnxBindings.circularBufferHead == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return 0;\n    }\n\n    return SherpaOnnxBindings.circularBufferHead?.call(ptr) ?? 0;\n  }\n\n  Pointer<SherpaOnnxCircularBuffer> ptr;\n}\n\n/// Voice activity detector that emits [SpeechSegment] objects.\n///\n/// Create one with a [VadModelConfig], feed audio with [acceptWaveform], then\n/// inspect queued segments with [isEmpty], [front], [pop], and [flush].\nclass VoiceActivityDetector {\n  VoiceActivityDetector.fromPtr({required this.ptr, required this.config});\n\n  VoiceActivityDetector._({required this.ptr, required this.config});\n\n  /// Create a detector with an internal result buffer sized in seconds.\n  ///\n  /// The caller must invoke [free] on the returned instance to avoid a memory\n  /// leak.\n  factory VoiceActivityDetector(\n      {required VadModelConfig config, required double bufferSizeInSeconds}) {\n    if (SherpaOnnxBindings.createVoiceActivityDetector == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    final c = calloc<SherpaOnnxVadModelConfig>();\n\n    final sileroVadModelPtr = config.sileroVad.model.toNativeUtf8();\n    c.ref.sileroVad.model = sileroVadModelPtr;\n\n    c.ref.sileroVad.threshold = config.sileroVad.threshold;\n    c.ref.sileroVad.minSilenceDuration = 
config.sileroVad.minSilenceDuration;\n    c.ref.sileroVad.minSpeechDuration = config.sileroVad.minSpeechDuration;\n    c.ref.sileroVad.windowSize = config.sileroVad.windowSize;\n    c.ref.sileroVad.maxSpeechDuration = config.sileroVad.maxSpeechDuration;\n\n    final tenVadModelPtr = config.tenVad.model.toNativeUtf8();\n    c.ref.tenVad.model = tenVadModelPtr;\n\n    c.ref.tenVad.threshold = config.tenVad.threshold;\n    c.ref.tenVad.minSilenceDuration = config.tenVad.minSilenceDuration;\n    c.ref.tenVad.minSpeechDuration = config.tenVad.minSpeechDuration;\n    c.ref.tenVad.windowSize = config.tenVad.windowSize;\n    c.ref.tenVad.maxSpeechDuration = config.tenVad.maxSpeechDuration;\n\n    c.ref.sampleRate = config.sampleRate;\n    c.ref.numThreads = config.numThreads;\n\n    final providerPtr = config.provider.toNativeUtf8();\n    c.ref.provider = providerPtr;\n\n    c.ref.debug = config.debug ? 1 : 0;\n\n    final ptr = SherpaOnnxBindings.createVoiceActivityDetector\n            ?.call(c, bufferSizeInSeconds) ??\n        nullptr;\n\n    calloc.free(providerPtr);\n    calloc.free(tenVadModelPtr);\n    calloc.free(sileroVadModelPtr);\n    calloc.free(c);\n\n    if (ptr == nullptr) {\n      throw Exception(\"Failed to create vad. 
Please check your config\");\n    }\n\n    return VoiceActivityDetector._(ptr: ptr, config: config);\n  }\n\n  /// Release the native detector.\n  void free() {\n    if (SherpaOnnxBindings.destroyVoiceActivityDetector == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.destroyVoiceActivityDetector?.call(ptr);\n    ptr = nullptr;\n  }\n\n  /// Feed normalized waveform samples into the detector.\n  void acceptWaveform(Float32List samples) {\n    if (SherpaOnnxBindings.voiceActivityDetectorAcceptWaveform == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n\n    final n = samples.length;\n    final Pointer<Float> p = calloc<Float>(n);\n\n    final pList = p.asTypedList(n);\n    pList.setAll(0, samples);\n\n    SherpaOnnxBindings.voiceActivityDetectorAcceptWaveform?.call(ptr, p, n);\n\n    calloc.free(p);\n  }\n\n  /// Return `true` if there are no queued speech segments.\n  bool isEmpty() {\n    if (SherpaOnnxBindings.voiceActivityDetectorEmpty == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return true;\n    }\n\n    final int empty =\n        SherpaOnnxBindings.voiceActivityDetectorEmpty?.call(ptr) ?? 0;\n\n    return empty == 1;\n  }\n\n  /// Return `true` if speech is currently being detected.\n  bool isDetected() {\n    if (SherpaOnnxBindings.voiceActivityDetectorDetected == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return false;\n    }\n\n    final int detected =\n        SherpaOnnxBindings.voiceActivityDetectorDetected?.call(ptr) ?? 
0;\n\n    return detected == 1;\n  }\n\n  /// Drop the front queued speech segment.\n  void pop() {\n    if (SherpaOnnxBindings.voiceActivityDetectorPop == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.voiceActivityDetectorPop?.call(ptr);\n  }\n\n  /// Remove all queued speech segments.\n  void clear() {\n    if (SherpaOnnxBindings.voiceActivityDetectorClear == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.voiceActivityDetectorClear?.call(ptr);\n  }\n\n  /// Return the front queued speech segment.\n  SpeechSegment front() {\n    if (SherpaOnnxBindings.voiceActivityDetectorFront == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return SpeechSegment(samples: Float32List(0), start: 0);\n    }\n\n    final Pointer<SherpaOnnxSpeechSegment> segment =\n        SherpaOnnxBindings.voiceActivityDetectorFront?.call(ptr) ?? 
nullptr;\n    if (segment == nullptr) {\n      return SpeechSegment(samples: Float32List(0), start: 0);\n    }\n\n    final sampleList = segment.ref.samples.asTypedList(segment.ref.n);\n    final start = segment.ref.start;\n\n    final samples = Float32List.fromList(sampleList);\n\n    SherpaOnnxBindings.destroySpeechSegment?.call(segment);\n\n    return SpeechSegment(samples: samples, start: start);\n  }\n\n  /// Reset the detector state.\n  void reset() {\n    if (SherpaOnnxBindings.voiceActivityDetectorReset == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.voiceActivityDetectorReset?.call(ptr);\n  }\n\n  /// Flush trailing buffered speech into the output queue.\n  void flush() {\n    if (SherpaOnnxBindings.voiceActivityDetectorFlush == null) {\n      throw Exception(\"Please initialize sherpa-onnx first\");\n    }\n\n    if (ptr == nullptr) {\n      return;\n    }\n    SherpaOnnxBindings.voiceActivityDetectorFlush?.call(ptr);\n  }\n\n  Pointer<SherpaOnnxVoiceActivityDetector> ptr;\n  final VadModelConfig config;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/version.dart",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'package:ffi/ffi.dart';\nimport './sherpa_onnx_bindings.dart';\n\n/// Return the sherpa-onnx version string compiled into the native library.\nString getVersion() {\n  Pointer<Utf8> version = SherpaOnnxBindings.getVersionStr?.call() ?? nullptr;\n  if (version == nullptr) {\n    return '';\n  }\n\n  return version.toDartString();\n}\n\n/// Return the Git SHA1 of the native library build.\nString getGitSha1() {\n  Pointer<Utf8> gitSha1 = SherpaOnnxBindings.getGitSha1?.call() ?? nullptr;\n  if (gitSha1 == nullptr) {\n    return '';\n  }\n\n  return gitSha1.toDartString();\n}\n\n/// Return the Git date of the native library build.\nString getGitDate() {\n  Pointer<Utf8> gitDate = SherpaOnnxBindings.getGitDate?.call() ?? nullptr;\n  if (gitDate == nullptr) {\n    return '';\n  }\n\n  return gitDate.toDartString();\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/wave_reader.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Audio samples loaded from a WAV file.\n///\n/// Samples are normalized to the range `[-1, 1]` and are stored as mono\n/// `Float32List` PCM data.\nclass WaveData {\n  WaveData({required this.samples, required this.sampleRate});\n\n  /// normalized to [-1, 1]\n  Float32List samples;\n  int sampleRate;\n}\n\n/// Read a WAV file from disk.\n///\n/// Returns an empty [WaveData] object if the file cannot be read or decoded.\nWaveData readWave(String filename) {\n  final Pointer<Utf8> str = filename.toNativeUtf8();\n\n  if (SherpaOnnxBindings.readWave == null) {\n    throw Exception(\"Please initialize sherpa-onnx first\");\n  }\n\n  Pointer<SherpaOnnxWave> wave =\n      SherpaOnnxBindings.readWave?.call(str) ?? nullptr;\n  calloc.free(str);\n\n  if (wave == nullptr) {\n    return WaveData(samples: Float32List(0), sampleRate: 0);\n  }\n\n  final samples = wave.ref.samples.asTypedList(wave.ref.numSamples);\n\n  final newSamples = Float32List.fromList(samples);\n  int sampleRate = wave.ref.sampleRate;\n  SherpaOnnxBindings.freeWave?.call(wave);\n\n  return WaveData(samples: newSamples, sampleRate: sampleRate);\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/lib/src/wave_writer.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:ffi';\nimport 'dart:typed_data';\nimport 'package:ffi/ffi.dart';\n\nimport './sherpa_onnx_bindings.dart';\n\n/// Write normalized mono PCM samples to a WAV file.\n///\n/// Returns `true` on success and `false` otherwise. This is commonly used with\n/// samples returned from TTS, VAD pipelines, or speech denoisers.\nbool writeWave(\n    {required String filename,\n    required Float32List samples,\n    required int sampleRate}) {\n  final Pointer<Utf8> filenamePtr = filename.toNativeUtf8();\n\n  final n = samples.length;\n  final Pointer<Float> p = calloc<Float>(n);\n\n  final pList = p.asTypedList(n);\n  pList.setAll(0, samples);\n\n  if (SherpaOnnxBindings.writeWave == null) {\n    throw Exception(\"Please initialize sherpa-onnx first\");\n  }\n\n  int ok =\n      SherpaOnnxBindings.writeWave?.call(p, n, sampleRate, filenamePtr) ?? 0;\n\n  calloc.free(p);\n  calloc.free(filenamePtr);\n\n  return ok == 1;\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx/pubspec.yaml",
    "content": "name: sherpa_onnx\n\ndescription: >\n  Speech recognition, speech synthesis, speaker diarization, and speaker recognition\n  using next-gen Kaldi with onnxruntime without Internet connection.\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-diarization\n  - audio-tagging\n  - voice-activity-detection\n\n# remember to change the version in ../sherpa_onnx_macos/macos/sherpa_onnx_macos.podspec\nversion: 1.12.31\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\nenvironment:\n  sdk: \">=3.1.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  ffi: ^2.1.0\n  flutter:\n    sdk: flutter\n\n  sherpa_onnx_android: ^1.12.31\n  # sherpa_onnx_android:\n  #   path: ../sherpa_onnx_android\n\n  sherpa_onnx_macos: ^1.12.31\n  # sherpa_onnx_macos:\n  #   path: ../sherpa_onnx_macos\n\n  sherpa_onnx_linux: ^1.12.31\n  # sherpa_onnx_linux:\n  #   path: ../sherpa_onnx_linux\n\n  sherpa_onnx_windows: ^1.12.31\n  # sherpa_onnx_windows:\n  #   path: ../sherpa_onnx_windows\n\n  sherpa_onnx_ios: ^1.12.31\n  # sherpa_onnx_ios:\n  #   path: ../sherpa_onnx_ios\n\ndev_dependencies:\n  flutter_lints: ^3.0.0\n\nflutter:\n  plugin:\n    platforms:\n      android:\n        default_package: sherpa_onnx_android\n\n      ios:\n        default_package: sherpa_onnx_ios\n\n      macos:\n        default_package: sherpa_onnx_macos\n\n      linux:\n        default_package: sherpa_onnx_linux\n\n      windows:\n        default_package: sherpa_onnx_windows\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin_ffi\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: android\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/README.md",
    "content": "# sherpa_onnx_android\n\nThis is a sub project of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nYou are not expected to use this package directly.\n\nPlease see the entry point at <https://pub.dev/packages/sherpa_onnx>.\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/.gitignore",
    "content": "*.iml\n.gradle\n/local.properties\n/.idea/workspace.xml\n/.idea/libraries\n.DS_Store\n/build\n/captures\n.cxx\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/build.gradle",
    "content": "// The Android Gradle Plugin builds the native code with the Android NDK.\n\ngroup = \"com.k2fsa.sherpa.onnx.sherpa_onnx_android\"\nversion = \"1.0\"\n\nbuildscript {\n    repositories {\n        google()\n        mavenCentral()\n    }\n\n    dependencies {\n        // The Android Gradle Plugin knows how to build native code with the NDK.\n        classpath(\"com.android.tools.build:gradle:7.3.0\")\n    }\n}\n\nrootProject.allprojects {\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\napply plugin: \"com.android.library\"\n\nandroid {\n    namespace 'com.k2fsa.sherpa.onnx'\n\n    // Bumping the plugin compileSdk version requires all clients of this plugin\n    // to bump the version in their app.\n    compileSdk = 34\n\n    // Use the NDK version\n    // declared in /android/app/build.gradle file of the Flutter project.\n    // Replace it with a version number if this plugin requires a specific NDK version.\n    // (e.g. ndkVersion \"23.1.7779620\")\n    ndkVersion = android.ndkVersion\n\n    compileOptions {\n        sourceCompatibility = JavaVersion.VERSION_1_8\n        targetCompatibility = JavaVersion.VERSION_1_8\n    }\n\n    defaultConfig {\n        minSdk = 21\n    }\n}\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/settings.gradle",
    "content": "rootProject.name = 'sherpa_onnx_android'\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\"\n  package=\"com.k2fsa.sherpa.onnx\">\n</manifest>\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/jniLibs/README.md",
    "content": "# Introduction\n\nPre-built libs are not checked-in.\n\nPlease use\n\n - https://github.com/k2-fsa/sherpa-onnx/blob/master/build-android-arm64-v8a.sh\n - https://github.com/k2-fsa/sherpa-onnx/blob/master/build-android-armv7-eabi.sh\n - https://github.com/k2-fsa/sherpa-onnx/blob/master/build-android-x86-64.sh\n - https://github.com/k2-fsa/sherpa-onnx/blob/master/build-android-x86.sh\n\nThe following is an example for `arm64-v8a`:\n\n```bash\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\n\nexport SHERPA_ONNX_ENABLE_JNI=OFF\nexport SHERPA_ONNX_ENABLE_C_API=ON\n./build-android-arm64-v8a.sh\n\ncp -v build-android-arm64-v8a/install/lib/*.so flutter/sherpa_onnx_android/android/src/main/jniLibs/arm64-v8a/\n```\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/jniLibs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/jniLibs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/jniLibs/x86/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_android/android/src/main/jniLibs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_android/lib/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_android/lib/README.md",
    "content": "# Introduction\n\nThis directory is left empty intentionally.\n"
  },
  {
    "path": "flutter/sherpa_onnx_android/pubspec.yaml",
    "content": "name: sherpa_onnx_android\n\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nversion: 0.0.1\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\nflutter:\n  plugin:\n    platforms:\n      android:\n        ffiPlugin: true\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin_ffi\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: ios\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/README.md",
    "content": "# sherpa_onnx_ios\n\nThis is a sub project of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nYou are not expected to use this package directly.\n\nPlease see the entry point at <https://pub.dev/packages/sherpa_onnx>.\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/ios/sherpa_onnx_ios.podspec",
    "content": "#\n# To learn more about a Podspec see http://guides.cocoapods.org/syntax/podspec.html.\n# Run `pod lib lint sherpa_onnx_ios.podspec` to validate before publishing.\n#\n# See also\n# https://github.com/google/webcrypto.dart/blob/2010361a106d7a872d90e3dfebfed250e2ede609/ios/webcrypto.podspec#L23-L28\n# https://groups.google.com/g/dart-ffi/c/nUATMBy7r0c\nPod::Spec.new do |s|\n  s.name             = 'sherpa_onnx_ios'\n  s.version          = '1.12.31'\n  s.summary          = 'A new Flutter FFI plugin project.'\n  s.description      = <<-DESC\nA new Flutter FFI plugin project.\n                       DESC\n  s.homepage         = 'https://github.com/k2-fsa/sherpa-onnx'\n  s.license          = { :file => '../LICENSE' }\n  s.author           = { 'Fangjun Kuang' => 'csukuangfj@gmail.com' }\n\n  # This will ensure the source files in Classes/ are included in the native\n  # builds of apps using this FFI plugin. Podspec does not support relative\n  # paths, so Classes contains a forwarder C file that relatively imports\n  # `../src/*` so that the C sources can be shared among all target platforms.\n  s.source           = { :path => '.' }\n  s.dependency 'Flutter'\n  s.platform = :ios, '13.0'\n  s.preserve_paths = 'sherpa_onnx.xcframework/**/*'\n  s.vendored_frameworks = 'sherpa_onnx.xcframework'\n\n  # Flutter.framework does not contain a i386 slice.\n  s.pod_target_xcconfig = {\n    'DEFINES_MODULE' => 'YES', 'EXCLUDED_ARCHS[sdk=iphonesimulator*]' => 'i386'\n    }\n  s.swift_version = '5.0'\nend\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/lib/README.md",
    "content": "# Introduction\n\nThis directory is left empty intentionally.\n"
  },
  {
    "path": "flutter/sherpa_onnx_ios/pubspec.yaml",
    "content": "name: sherpa_onnx_ios\n\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nversion: 0.0.1\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\nflutter:\n  plugin:\n    platforms:\n      ios:\n        ffiPlugin: true\n\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin_ffi\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: linux\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/README.md",
    "content": "# sherpa_onnx_linux\n\nThis is a subproject of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nYou are not expected to use this package directly.\n\nPlease see the entry point at <https://pub.dev/packages/sherpa_onnx>.\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/lib/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_linux/lib/README.md",
    "content": "# Introduction\n\nThis directory is left empty intentionally.\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/linux/CMakeLists.txt",
    "content": "# The Flutter tooling requires that developers have CMake 3.10 or later\n# installed. You should not increase this version, as doing so will cause\n# the plugin to fail to compile for some customers of the plugin.\ncmake_minimum_required(VERSION 3.10)\n\n# Project-level configuration.\nset(PROJECT_NAME \"sherpa_onnx_linux\")\nproject(${PROJECT_NAME} LANGUAGES CXX)\n\nif(CMAKE_SYSTEM_PROCESSOR STREQUAL \"x86_64\")\n  set(LIB_ARCH_DIR \"x64\")\nelseif(CMAKE_SYSTEM_PROCESSOR STREQUAL \"aarch64\")\n  set(LIB_ARCH_DIR \"aarch64\")\nelse()\n  message(FATAL_ERROR \"Unsupported arch: ${CMAKE_SYSTEM_PROCESSOR}\")\nendif()\n\n# List of absolute paths to libraries that should be bundled with the plugin.\n# This list could contain prebuilt libraries, or libraries created by an\n# external build triggered from this build file.\nset(sherpa_onnx_linux_bundled_libraries\n  \"${CMAKE_CURRENT_SOURCE_DIR}/${LIB_ARCH_DIR}/libsherpa-onnx-c-api.so\"\n  \"${CMAKE_CURRENT_SOURCE_DIR}/${LIB_ARCH_DIR}/libonnxruntime.so\"\n  PARENT_SCOPE\n)\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/linux/README.md",
    "content": "# Introduction\n\n`*.so` files are generated by GitHub Actions as part of each new release.\n\nWe don't check pre-built library files into git.\n"
  },
  {
    "path": "flutter/sherpa_onnx_linux/linux/aarch64/.gitikeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_linux/linux/x64/.gitikeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_linux/pubspec.yaml",
    "content": "name: sherpa_onnx_linux\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nversion: 0.0.1\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\nflutter:\n  plugin:\n    platforms:\n      linux:\n        ffiPlugin: true\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin_ffi\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: macos\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/README.md",
    "content": "# sherpa_onnx_macos\n\nThis is a subproject of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nYou are not expected to use this package directly.\n\nPlease see the entry point at <https://pub.dev/packages/sherpa_onnx>.\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/lib/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_macos/lib/README.md",
    "content": "# Introduction\n\nThis directory is left empty intentionally.\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/macos/README.md",
    "content": "# Introduction\n\n`*.dylib` files are generated by GitHub Actions as part of each new release.\n\nWe don't check pre-built library files into git.\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/macos/sherpa_onnx_macos.podspec",
    "content": "#\n# To learn more about a Podspec see http://guides.cocoapods.org/syntax/podspec.html.\n# Run `pod lib lint sherpa_onnx_macos.podspec` to validate before publishing.\n#\nPod::Spec.new do |s|\n  s.name             = 'sherpa_onnx_macos'\n  s.version          = '1.12.31'\n  s.summary          = 'sherpa-onnx Flutter FFI plugin project.'\n  s.description      = <<-DESC\nsherpa-onnx Flutter FFI plugin project.\n                       DESC\n  s.homepage         = 'https://github.com/k2-fsa/sherpa-onnx'\n  s.license          = { :file => '../LICENSE' }\n  s.author           = { 'Fangjun Kuang' => 'csukuangfj@gmail.com' }\n\n  # This will ensure the source files in Classes/ are included in the native\n  # builds of apps using this FFI plugin. Podspec does not support relative\n  # paths, so Classes contains a forwarder C file that relatively imports\n  # `../src/*` so that the C sources can be shared among all target platforms.\n  s.source           = { :path => '.' }\n  s.dependency 'FlutterMacOS'\n  s.vendored_libraries = '*.dylib'\n\n  s.platform = :osx, '10.11'\n  s.pod_target_xcconfig = { 'DEFINES_MODULE' => 'YES' }\n  s.swift_version = '5.0'\nend\n"
  },
  {
    "path": "flutter/sherpa_onnx_macos/pubspec.yaml",
    "content": "name: sherpa_onnx_macos\n\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nversion: 0.0.1\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\nflutter:\n  plugin:\n    platforms:\n      macos:\n        ffiPlugin: true\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n# Libraries should not include pubspec.lock, per https://dart.dev/guides/libraries/private-files#pubspeclock.\n/pubspec.lock\n**/doc/api/\n.dart_tool/\nbuild/\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: plugin_ffi\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: windows\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/README.md",
    "content": "# sherpa_onnx_windows\n\nThis is a subproject of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nYou are not expected to use this package directly.\n\nPlease see the entry point at <https://pub.dev/packages/sherpa_onnx>.\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/analysis_options.yaml",
    "content": "include: package:flutter_lints/flutter.yaml\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/lib/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter/sherpa_onnx_windows/lib/README.md",
    "content": "# Introduction\n\nThis directory is left empty intentionally.\n"
  },
  {
    "path": "flutter/sherpa_onnx_windows/pubspec.yaml",
    "content": "name: sherpa_onnx_windows\n\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nversion: 0.0.1\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\nflutter:\n  plugin:\n    platforms:\n      windows:\n        ffiPlugin: true\n"
  },
  {
    "path": "flutter-examples/.gitignore",
    "content": "# Do not remove or rename entries in this file, only add new ones\n# See https://github.com/flutter/flutter/issues/128635 for more context.\n\n# Miscellaneous\n*.class\n*.lock\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# Visual Studio Code related\n.classpath\n.project\n.settings/\n.vscode/*\n\n# Flutter repo-specific\n/bin/cache/\n/bin/internal/bootstrap.bat\n/bin/internal/bootstrap.sh\n/bin/mingit/\n/dev/benchmarks/mega_gallery/\n/dev/bots/.recipe_deps\n/dev/bots/android_tools/\n/dev/devicelab/ABresults*.json\n/dev/docs/doc/\n/dev/docs/api_docs.zip\n/dev/docs/flutter.docs.zip\n/dev/docs/lib/\n/dev/docs/pubspec.yaml\n/dev/integration_tests/**/xcuserdata\n/dev/integration_tests/**/Pods\n/packages/flutter/coverage/\nversion\nanalysis_benchmark.json\n\n# packages file containing multi-root paths\n.packages.generated\n\n# Flutter/Dart/Pub related\n**/doc/api/\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n**/generated_plugin_registrant.dart\n.packages\n.pub-preload-cache/\n.pub-cache/\n.pub/\nbuild/\nflutter_*.png\nlinked_*.ds\nunlinked.ds\nunlinked_spec.ds\n\n# Android related\n**/android/**/gradle-wrapper.jar\n.gradle/\n**/android/captures/\n**/android/gradlew\n**/android/gradlew.bat\n**/android/local.properties\n**/android/**/GeneratedPluginRegistrant.java\n**/android/key.properties\n*.jks\n\n# iOS/XCode 
related\n**/ios/**/*.mode1v3\n**/ios/**/*.mode2v3\n**/ios/**/*.moved-aside\n**/ios/**/*.pbxuser\n**/ios/**/*.perspectivev3\n**/ios/**/*sync/\n**/ios/**/.sconsign.dblite\n**/ios/**/.tags*\n**/ios/**/.vagrant/\n**/ios/**/DerivedData/\n**/ios/**/Icon?\n**/ios/**/Pods/\n**/ios/**/.symlinks/\n**/ios/**/profile\n**/ios/**/xcuserdata\n**/ios/.generated/\n**/ios/Flutter/.last_build_id\n**/ios/Flutter/App.framework\n**/ios/Flutter/Flutter.framework\n**/ios/Flutter/Flutter.podspec\n**/ios/Flutter/Generated.xcconfig\n**/ios/Flutter/ephemeral\n**/ios/Flutter/app.flx\n**/ios/Flutter/app.zip\n**/ios/Flutter/flutter_assets/\n**/ios/Flutter/flutter_export_environment.sh\n**/ios/ServiceDefinitions.json\n**/ios/Runner/GeneratedPluginRegistrant.*\n\n# macOS\n**/Flutter/ephemeral/\n**/Pods/\n**/macos/Flutter/GeneratedPluginRegistrant.swift\n**/macos/Flutter/ephemeral\n**/xcuserdata/\n\n# Windows\n**/windows/flutter/generated_plugin_registrant.cc\n**/windows/flutter/generated_plugin_registrant.h\n**/windows/flutter/generated_plugins.cmake\n\n# Linux\n**/linux/flutter/generated_plugin_registrant.cc\n**/linux/flutter/generated_plugin_registrant.h\n**/linux/flutter/generated_plugins.cmake\n\n# Coverage\ncoverage/\n\n# Symbols\napp.*.symbols\n\n# Exceptions to above rules.\n!**/ios/**/default.mode1v3\n!**/ios/**/default.mode2v3\n!**/ios/**/default.pbxuser\n!**/ios/**/default.perspectivev3\n!/packages/flutter_tools/test/data/dart_dependencies_test/**/.packages\n!/dev/ci/**/Gemfile.lock\n!.vscode/settings.json\nPodfile\n"
  },
  {
    "path": "flutter-examples/README.md",
    "content": "# Introduction\n\nThis directory contains Flutter examples of `sherpa-onnx`.\n\n| Directory | Pre-built App |\n|-----------|---------------|\n|[./tts](./tts)|[URL](https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#text-to-speech-tts-speech-synthesis)|\n|[./streaming_asr](./streaming_asr)|[URL](https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#streaming-speech-recognition-stt-asr)|\n\n# Ways to create an example\n```bash\nflutter create --platforms windows,macos streaming_asr\ncd streaming_asr\nflutter pub get\n\n# to support a new platform, e.g., android, use\n\ncd streaming_asr\nflutter create --platforms android --org com.k2fsa.sherpa.onnx ./\n\n# To add linux\nflutter config --enable-linux-desktop\nflutter create --platforms=linux .\n```\n\nTo run on Android, first use\n```\n(py38) fangjuns-MacBook-Pro:streaming_asr fangjun$ flutter run devices\nNo devices found yet. Checking for wireless devices...\n\nNo supported devices found with name or id matching 'android-arm64'.\n\nThe following devices were found:\nMi 10 (mobile)  • 61106679 • android-arm64  • Android 12 (API 31)\nmacOS (desktop) • macos    • darwin-x64     • macOS 13.1 22C65 darwin-x64\nChrome (web)    • chrome   • web-javascript • Google Chrome 126.0.6478.127\n```\nto find available devices. I have attached my Android phone (Xiaomi 10) to my computer\nand the output shows that its device ID is `61106679`, so I use\n\n```bash\n(py38) fangjuns-MacBook-Pro:streaming_asr fangjun$ flutter run -d 61106679\n```\n\nto run it.\n\nIf you get the following error and hint:\n\n```\nBUILD FAILED in 2m 43s\nRunning Gradle task 'assembleDebug'...                            165.3s\n\n┌─ Flutter Fix ───────────────────────────────────────────────────────────────────────────────────────────────────┐\n│ The plugin record_android requires a higher Android SDK version.                                                
│\n│ Fix this issue by adding the following to the file                                                              │\n│ /Users/fangjun/open-source/sherpa-onnx/flutter-examples/streaming_asr/android/app/build.gradle:                 │\n│ android {                                                                                                       │\n│   defaultConfig {                                                                                               │\n│     minSdkVersion 23                                                                                            │\n│   }                                                                                                             │\n│ }                                                                                                               │\n│                                                                                                                 │\n│                                                                                                                 │\n│ Following this change, your app will not be available to users running Android SDKs below 23.                   │\n│ Consider searching for a version of this plugin that supports these lower versions of the Android SDK instead.  
│\n│ For more information, see: https://docs.flutter.dev/deployment/android#reviewing-the-gradle-build-configuration │\n└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘\nError: Gradle task assembleDebug failed with exit code 1\n```\n\nPlease use the following changes:\n\n```diff\n--- a/flutter-examples/streaming_asr/android/app/build.gradle\n+++ b/flutter-examples/streaming_asr/android/app/build.gradle\n@@ -38,7 +38,7 @@ android {\n         applicationId = \"com.k2fsa.sherpa.onnx.streaming_asr\"\n         // You can update the following values to match your application needs.\n         // For more information, see: https://docs.flutter.dev/deployment/android#reviewing-the-gradle-build-configuration.\n-        minSdk = flutter.minSdkVersion\n+        minSdk = 23\n         targetSdk = flutter.targetSdkVersion\n         versionCode = flutterVersionCode.toInteger()\n         versionName = flutterVersionName\n```\n\nIf you get the following errors:\n\n```\nLaunching lib/main.dart on Mi 10 in debug mode...\nERROR:/Users/fangjun/open-source/sherpa-onnx/flutter-examples/streaming_asr/build/record_android/intermediates/runtime_library_classes_jar/debug/clas\nses.jar: D8: com.android.tools.r8.internal.Hc: Sealed classes are not supported as program classes\n\nFAILURE: Build failed with an exception.\n\n* What went wrong:\nExecution failed for task ':app:mergeLibDexDebug'.\n> Could not resolve all files for configuration ':app:debugRuntimeClasspath'.\n   > Failed to transform classes.jar (project :record_android) to match attributes {artifactType=android-dex, asm-transformed-variant=NONE, com.andro\nid.build.api.attributes.AgpVersionAttr=7.3.0, com.android.build.api.attributes.BuildTypeAttr=debug, com.android.build.gradle.internal.attributes.Vari\nantAttr=debug, dexing-enable-desugaring=true, dexing-enable-jacoco-instrumentation=false, dexing-is-debuggable=true, dexing-min-sdk=23, 
org.gradle.ca\ntegory=library, org.gradle.jvm.environment=android, org.gradle.libraryelements=jar, org.gradle.usage=java-runtime, org.jetbrains.kotlin.platform.type\n=androidJvm}.\n      > Execution failed for DexingWithClasspathTransform: /Users/fangjun/open-source/sherpa-onnx/flutter-examples/streaming_asr/build/record_android\n/intermediates/runtime_library_classes_jar/debug/classes.jar.\n         > Error while dexing.\n\n* Try:\n> Run with --stacktrace option to get the stack trace.\n> Run with --info or --debug option to get more log output.\n> Run with --scan to get full insights.\n\n* Get more help at https://help.gradle.org\n\nBUILD FAILED in 2m 10s\n```\n\nPlease refer to <https://github.com/llfbandit/record/blob/master/record_android/README.md>\nand make the following changes:\n\n```diff\ndiff --git a/flutter-examples/streaming_asr/android/settings.gradle b/flutter-examples/streaming_asr/android/settings.gradle\nindex 536165d3..9b1a1012 100644\n--- a/flutter-examples/streaming_asr/android/settings.gradle\n+++ b/flutter-examples/streaming_asr/android/settings.gradle\n@@ -18,7 +18,7 @@ pluginManagement {\n\n plugins {\n     id \"dev.flutter.flutter-plugin-loader\" version \"1.0.0\"\n-    id \"com.android.application\" version \"7.3.0\" apply false\n+    id \"com.android.application\" version \"7.4.2\" apply false\n     id \"org.jetbrains.kotlin.android\" version \"1.7.10\" apply false\n }\n```\n\n# iOS\n\nTo support iOS, run\n\n```bash\ncd streaming_asr\nflutter create --platforms ios ./\n```\n\nConnect your iPhone to the computer, and run `flutter devices`, which will print:\n\n```bash\nFound 4 connected devices:\n  iPhone 14 (mobile) • 634110C4-168D-408F-A938-D7FC62222579 • ios            • com.apple.CoreSimulator.SimRuntime.iOS-16-2 (simulator)\n  iPhone (mobile)    • 00008030-001064212E85802E            • ios            • iOS 16.3 20D47\n  macOS (desktop)    • macos                                • darwin-x64     • macOS 13.1 22C65 darwin-x64\n  Chrome 
(web)       • chrome                               • web-javascript • Google Chrome 126.0.6478.127\n\nNo wireless devices were found.\n\nRun \"flutter emulators\" to list and start any available device emulators.\n(E.g., flutter emulators --launch ios)\n\nIf you expected another device to be detected, please run \"flutter doctor\" to diagnose potential issues. You may also try increasing the time to wait\nfor connected devices with the \"--device-timeout\" flag. Visit https://flutter.dev/setup/ for troubleshooting tips.\n```\n\nThen run\n\n```bash\nflutter run -d 00008030-001064212E85802E\n```\n\nIt will show:\n```\nLaunching lib/main.dart on iPhone in debug mode...\n════════════════════════════════════════════════════════════════════════════════\nNo valid code signing certificates were found\nYou can connect to your Apple Developer account by signing in with your Apple ID\nin Xcode and create an iOS Development Certificate as well as a Provisioning\nProfile for your project by:\n  1- Open the Flutter project's Xcode target with\n       open ios/Runner.xcworkspace\n  2- Select the 'Runner' project in the navigator then the 'Runner' target\n     in the project settings\n  3- Make sure a 'Development Team' is selected under Signing & Capabilities > Team.\n     You may need to:\n         - Log in with your Apple ID in Xcode first\n         - Ensure you have a valid unique Bundle ID\n         - Register your device with your Apple Developer Account\n         - Let Xcode automatically provision a profile for your app\n  4- Build or run your project again\n  5- Trust your newly created Development Certificate on your iOS device\n     via Settings > General > Device Management > [your new certificate] > Trust\n\nFor more information, please visit:\n  https://developer.apple.com/library/content/documentation/IDEs/Conceptual/\n  AppDistributionGuide/MaintainingCertificates/MaintainingCertificates.html\n\nOr run on an iOS simulator without code 
signing\n════════════════════════════════════════════════════════════════════════════════\nError: No development certificates available to code sign app for device deployment\n```\n\nFollow the above instructions.\n\nThe following is a screenshot.\n\n![](./ios-demo-1.jpg)\n\nThen close `xcode` and run again\n\n```bash\nflutter run -d 00008030-001064212E85802E\n```\n\nYou would get the following errors:\n```\nError (Xcode): Undefined symbol: ___cxa_pure_virtual\n\n\nError (Xcode): Undefined symbol: ___cxa_throw\n\n\nError (Xcode): Undefined symbol: ___gxx_personality_v0\n\n\n\nError launching application on iPhone.\n```\n\nMake the following changes:\n\n```diff\ndiff --git a/flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.pbxproj b/flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.pbxproj\nindex b208c7e9..466b0afb 100644\n--- a/flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.pbxproj\n+++ b/flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.pbxproj\n@@ -482,6 +482,7 @@\n \t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n \t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n \t\t\t};\n \t\t\tname = Profile;\n@@ -500,6 +501,7 @@\n \t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n \t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n \t\t\t};\n \t\t\tname = Debug;\n@@ -516,6 +518,7 @@\n \t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr.RunnerTests;\n \t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n \t\t\t};\n \t\t\tname = Release;\n@@ -532,6 +535,7 
@@\n \t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr.RunnerTests;\n \t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n \t\t\t};\n \t\t\tname = Profile;\n@@ -666,6 +670,7 @@\n \t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n \t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n \t\t\t};\n \t\t\tname = Debug;\n@@ -688,6 +693,7 @@\n \t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n \t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n \t\t\t\tSWIFT_VERSION = 5.0;\n+\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n \t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n \t\t\t};\n \t\t\tname = Release;\n```\n\nThen re-run\n\n```bash\nflutter run -d 00008030-001064212E85802E\n```\n\nFinally, it shows the following:\n\n```\nLaunching lib/main.dart on iPhone in debug mode...\nAutomatically signing iOS for device deployment using specified development team in Xcode project: N5ZH3Z63A6\nRunning Xcode build...\n └─Compiling, linking and signing...                         9.0s\nXcode build done.                                           25.6s\n(lldb) 2024-07-06 17:43:54.970077+0800 Runner[4851:965716] [SceneConfiguration] Info.plist contained no UIScene configuration dictionary (looking for configuration named \"(no name)\")\nWarning: Unable to create restoration in progress marker file\nfopen failed for data file: errno = 2 (No such file or directory)\nErrors found! Invalidating cache...\nfopen failed for data file: errno = 2 (No such file or directory)\nErrors found! Invalidating cache...\nInstalling and launching...                                        31.8s\nSyncing files to device iPhone...                                
1,080ms\n\nFlutter run key commands.\nr Hot reload. 🔥🔥🔥\nR Hot restart.\nh List all available interactive commands.\nd Detach (terminate \"flutter run\" but leave application running).\nc Clear the screen\nq Quit (terminate the application on the device).\n\nA Dart VM Service on iPhone is available at: http://127.0.0.1:51556/QDn_7CJ2gzk=/\nThe Flutter DevTools debugger and profiler on iPhone is available at: http://127.0.0.1:9100?uri=http://127.0.0.1:51556/QDn_7CJ2gzk=/\n```\n\nIf it shows the following log after pressing `start` within the sherpa-onnx APP on your iPhone:\n\n```\n[access] This app has crashed because it attempted to access privacy-sensitive data without a usage description.  The app's Info.plist must contain an NSMicrophoneUsageDescription key with a string value explaining to the user how the app uses this data.\n```\n\nPlease make the following changes\n```diff\n--- a/flutter-examples/streaming_asr/ios/Runner/Info.plist\n+++ b/flutter-examples/streaming_asr/ios/Runner/Info.plist\n@@ -2,6 +2,8 @@\n <!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n <plist version=\"1.0\">\n <dict>\n+       <key>NSMicrophoneUsageDescription</key>\n+       <string>Need microphone access for recording speech</string>\n        <key>CFBundleDevelopmentRegion</key>\n        <string>$(DEVELOPMENT_LANGUAGE)</string>\n        <key>CFBundleDisplayName</key>\n```\n\nAnd re-run\n\n```bash\nflutter run -d 00008030-001064212E85802E\n```\n\nThe following are some screenshots of the iOS APP:\n\n|1|2|3|\n|---|---|---|\n|![](./ios-demo-2.jpg)|![](./ios-demo-3.jpg)|![](./ios-demo-4.jpg)|\n\n\n**Hint**: If you find that you cannot start the APP on your iPhone after\ndisconnecting from the computer, please use\n\n```bash\nflutter run --release -d 00008030-001064212E85802E\n```\n"
  },
  {
    "path": "flutter-examples/andriod-notes.md",
    "content": "# Note about android\n\nUseful commands\n\n```bash\nflutter build apk --split-per-abi --release\n```\n\nThe above commands print the following:\n\n```\n✓ Built build/app/outputs/flutter-apk/app-armeabi-v7a-release.apk (94.8MB)\n✓ Built build/app/outputs/flutter-apk/app-arm64-v8a-release.apk (96.1MB)\n✓ Built build/app/outputs/flutter-apk/app-x86_64-release.apk (96.9MB)\n```\n\nNote that it does not generate APK for `x86`.\n\n```\nadb install build/app/outputs/flutter-apk/app-arm64-v8a-release.apk\n```\n"
  },
  {
    "path": "flutter-examples/how-tts-is-created.md",
    "content": "# Introduction\n\nThis document describes how the [tts](./tts) folder is created.\n\n\n```bash\nflutter create --platforms windows,macos,linux,android,ios tts\n```\n\nIt prints the following:\n\n```\nDeveloper identity \"Apple Development: xxx@zzz.com (xxxxxxx)\" selected for iOS code signing\nCreating project tts...\nResolving dependencies in `tts`... (1.3s)\nDownloading packages...\nGot dependencies in `tts`.\nWrote 122 files.\n\nAll done!\nYou can find general documentation for Flutter at: https://docs.flutter.dev/\nDetailed API documentation is available at: https://api.flutter.dev/\nIf you prefer video documentation, consider: https://www.youtube.com/c/flutterdev\n\nIn order to run your application, type:\n\n  $ cd tts\n  $ flutter run\n\nYour application code is in tts/lib/main.dart.\n```\n\n```\ncd tts\nflutter pub get\nflutter build macos\nflutter run -d macos\n```\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.build/\n.buildlog/\n.history\n.svn/\n.swiftpm/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n**/doc/api/\n**/ios/Flutter/.last_build_id\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n.pub-cache/\n.pub/\n/build/\n\n# Symbolication related\napp.*.symbols\n\n# Obfuscation related\napp.*.map.json\n\n# Android Studio will place build artifacts here\n/android/app/debug\n/android/app/profile\n/android/app/release\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/README.md",
    "content": "# Real-time speech recognition by non streaming and VAD\n\nThis APP supports the following platforms:\n\n- macOS (tested)\n\n## Getting Started\n\nFollow these steps to download and set up the required models to run the demo successfully.\n\n### 1. Select a non-streaming model\n\nChoose one of the following non-streaming ASR models:\n\n#### Code Available Models:\n- **whisper**: Whisper base model\n- **senseVoice**: SenseVoice multilingual model (supports Chinese, English, Japanese, Korean, Cantonese)\n- **parakeet-tdt**: NeMo transducer-based parakeet-tdt model\n\n#### Model Download Links:\n- **whisper**: https://huggingface.co/csukuangfj/sherpa-onnx-whisper-base\n- **senseVoice**: https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09  \n- **parakeet-tdt**: https://huggingface.co/csukuangfj/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\n\n### 2. Download VAD Model\n\nDownload the VAD (Voice Activity Detection) model from:\nhttps://huggingface.co/csukuangfj/vad\n\nPlace the VAD model file (e.g., `silero_vad.onnx`) in the `assets` directory.\n\n### 3. Configure the Model in Code\n\n#### Step 3.1: Update Model Selection\nEdit `lib/non_streaming_vad_asr.dart` and set the model type:\n\n```dart\nFuture<sherpa_onnx.OfflineRecognizer> createOfflineRecognizer() async {\n  final type = 2; // 0: whisper, 1: senseVoice, 2: parakeet-tdt\n  final modelConfig = await getOfflineModelConfig(type: type);\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  return sherpa_onnx.OfflineRecognizer(config);\n}\n```\n\n#### Step 3.2: Update Asset Configuration\nEdit `pubspec.yaml` and add the appropriate asset directory for your chosen model:\n\n```yaml\nflutter:\n  assets:\n    - assets/\n    - assets/whisper/        # For whisper model\n    # - assets/senseVoice/    # For senseVoice model (uncomment when using)\n    # - assets/nemo_transducer/ # For parakeet-tdt model (uncomment when using)\n```\n\n### 4. 
Directory Structure Setup\n\n#### For whisper model:\n```\n./assets/\n├── whisper/\n│   ├── base-decoder.onnx\n│   ├── base-encoder.onnx\n│   └── base-tokens.txt\n└── silero_vad.onnx\n```\n\n#### For senseVoice model:\n```\n./assets/\n├── senseVoice/\n│   ├── model.int8.onnx\n│   └── tokens.txt\n└── silero_vad.onnx\n```\n\n#### For parakeet-tdt model:\n```\n./assets/\n├── nemo_transducer/\n│   ├── encoder.int8.onnx\n│   ├── decoder.int8.onnx\n│   ├── joiner.int8.onnx\n│   └── tokens.txt\n└── silero_vad.onnx\n```\n\n### 5. Advanced Configuration (Optional)\n\n#### Modify Model Configuration:\nYou can edit `lib/offline_model.dart` to customize the model configuration, such as model size and quantization settings.\n\n#### Adjust Audio Recording Settings:\nIn `lib/non_streaming_vad_asr.dart`, you can modify the VAD configuration:\n\n```dart\n_vad = sherpa_onnx.VoiceActivityDetector(\n  config: _vadConfig, \n  bufferSizeInSeconds: 30  // Adjust based on your needs\n);\n_buffer = sherpa_onnx.CircularBuffer(capacity: 30 * 16000);\n```\n\n### 6. Run the Application\n\nUse the following command to run the app:\n\n```bash\nflutter run -d macos\n```\n\n## Troubleshooting\n\n- Ensure all model files are placed in the correct directories\n- Check that `pubspec.yaml` includes the correct asset paths\n- Verify the model type in `non_streaming_vad_asr.dart` matches your chosen model\n- Make sure to delete unnecessary files to reduce app size\n\n## Notes\n\n- The VAD model is required for all non-streaming ASR models\n- Model performance may vary depending on hardware capabilities\n- Adjust buffer sizes and VAD parameters based on your specific use case\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/analysis_options.yaml",
    "content": "# This file configures the analyzer, which statically analyzes Dart code to\n# check for errors, warnings, and lints.\n#\n# The issues identified by the analyzer are surfaced in the UI of Dart-enabled\n# IDEs (https://dart.dev/tools#ides-and-editors). The analyzer can also be\n# invoked from the command line by running `flutter analyze`.\n\n# The following line activates a set of recommended lints for Flutter apps,\n# packages, and plugins designed to encourage good coding practices.\ninclude: package:flutter_lints/flutter.yaml\n\nlinter:\n  # The lint rules applied to this project can be customized in the\n  # section below to disable rules from the `package:flutter_lints/flutter.yaml`\n  # included above or to enable additional rules. A list of all available lints\n  # and their documentation is published at https://dart.dev/lints.\n  #\n  # Instead of disabling a lint rule for the entire project in the\n  # section below, it can also be suppressed for a single line of code\n  # or a specific dart file by using the `// ignore: name_of_lint` and\n  # `// ignore_for_file: name_of_lint` syntax on the line or in the file\n  # producing the lint.\n  rules:\n    # avoid_print: false  # Uncomment to disable the `avoid_print` rule\n    # prefer_single_quotes: true  # Uncomment to enable the `prefer_single_quotes` rule\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/lib/info.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\nimport 'package:url_launcher/url_launcher.dart';\n\nclass InfoScreen extends StatelessWidget {\n  @override\n  Widget build(BuildContext context) {\n    const double height = 20;\n    return Container(\n      child: Padding(\n        padding: const EdgeInsets.all(8.0),\n        child: Column(\n          crossAxisAlignment: CrossAxisAlignment.start,\n          children: <Widget>[\n            Text('Everything is open-sourced.'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Code: https://github.com/k2-fsa/sherpa-onnx'),\n              onTap: () => launch('https://k2-fsa.github.io/sherpa/onnx/'),\n            ),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Doc: https://k2-fsa.github.io/sherpa/onnx/'),\n              onTap: () => launch('https://k2-fsa.github.io/sherpa/onnx/'),\n            ),\n            SizedBox(height: height),\n            Text('QQ 群: 744602236'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text(\n                  '微信群: https://k2-fsa.github.io/sherpa/social-groups.html'),\n              onTap: () =>\n                  launch('https://k2-fsa.github.io/sherpa/social-groups.html'),\n            ),\n          ],\n        ),\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/lib/main.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\n\nimport './non_streaming_vad_asr.dart';\nimport './info.dart';\n\nvoid main() {\n  runApp(const MyApp());\n}\n\nclass MyApp extends StatelessWidget {\n  const MyApp({super.key});\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      title: 'Next-gen Kaldi flutter demo',\n      theme: ThemeData(\n        colorScheme: ColorScheme.fromSeed(seedColor: Colors.deepPurple),\n        useMaterial3: true,\n      ),\n      home: const MyHomePage(title: 'Next-gen Kaldi with Flutter'),\n    );\n  }\n}\n\nclass MyHomePage extends StatefulWidget {\n  const MyHomePage({super.key, required this.title});\n\n  final String title;\n\n  @override\n  State<MyHomePage> createState() => _MyHomePageState();\n}\n\nclass _MyHomePageState extends State<MyHomePage> {\n  int _currentIndex = 0;\n  final List<Widget> _tabs = [\n    NoStreamingAsrVAdScreen(),\n    InfoScreen(),\n  ];\n  @override\n  Widget build(BuildContext context) {\n    return Scaffold(\n      appBar: AppBar(\n        title: Text(widget.title),\n      ),\n      body: _tabs[_currentIndex],\n      bottomNavigationBar: BottomNavigationBar(\n        currentIndex: _currentIndex,\n        onTap: (int index) {\n          setState(() {\n            _currentIndex = index;\n          });\n        },\n        items: [\n          BottomNavigationBarItem(\n            icon: Icon(Icons.home),\n            label: 'Home',\n          ),\n          BottomNavigationBarItem(\n            icon: Icon(Icons.info),\n            label: 'Info',\n          ),\n        ],\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/lib/non_streaming_vad_asr.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:async';\nimport 'dart:typed_data';\n\nimport 'package:flutter/foundation.dart';\nimport 'package:flutter/material.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:path_provider/path_provider.dart';\nimport 'package:record/record.dart';\n\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './utils.dart';\nimport './offline_model.dart';\n\nfinal modelDir = 'assets';\nFuture<sherpa_onnx.OfflineRecognizer> createOfflineRecognizer() async {\n  final type = 2;\n  final modelConfig = await getOfflineModelConfig(type: type);\n  final config = sherpa_onnx.OfflineRecognizerConfig(model: modelConfig);\n  return sherpa_onnx.OfflineRecognizer(config);\n}\n\nclass NoStreamingAsrVAdScreen extends StatefulWidget {\n  const NoStreamingAsrVAdScreen({super.key});\n\n  @override\n  State<NoStreamingAsrVAdScreen> createState() => _NoStreamingAsrVAdScreenState();\n}\n\nclass _NoStreamingAsrVAdScreenState extends State<NoStreamingAsrVAdScreen> {\n\n  late final TextEditingController _controller;\n  late final AudioRecorder _audioRecorder;\n\n  String _title = 'Real-time speech recognition(offline recognizer with vad)';\n  String _last = '';\n  int _index = 0;\n  bool _isInitialized = false;\n\n  // offline recognizer related vars\n  sherpa_onnx.OfflineRecognizer? _recognizer;\n  static const int _sampleRate = 16000;\n\n  // VAD related vars\n  sherpa_onnx.VoiceActivityDetector? _vad;\n  sherpa_onnx.CircularBuffer? _buffer;\n  \n  // VAD config\n  late sherpa_onnx.VadModelConfig _vadConfig;\n\n  StreamSubscription<RecordState>? 
_recordSub;\n  RecordState _recordState = RecordState.stop;\n\n  @override\n  void initState() {\n    _audioRecorder = AudioRecorder();\n    _controller = TextEditingController();\n\n    _recordSub = _audioRecorder.onStateChanged().listen((recordState) {\n      _updateRecordState(recordState);\n    });\n\n    super.initState();\n  }\n\n  Future<void> _start() async {\n    if (!_isInitialized) {\n      sherpa_onnx.initBindings();\n\n      // Initialize VAD\n      final sileroVadConfig = sherpa_onnx.SileroVadModelConfig(\n        model: await copyAssetFile('$modelDir/silero_vad.onnx'),\n        minSilenceDuration: 0.25,\n        minSpeechDuration: 0.5,\n        maxSpeechDuration: 5.0,\n      );\n\n      _vadConfig = sherpa_onnx.VadModelConfig(\n        sileroVad: sileroVadConfig,\n        numThreads: 1,\n        debug: false,\n      );\n\n      // Create the VAD and a circular buffer that holds incoming samples\n      _vad = sherpa_onnx.VoiceActivityDetector(\n        config: _vadConfig, \n        bufferSizeInSeconds: 30\n      );\n      _buffer = sherpa_onnx.CircularBuffer(capacity: 30 * _sampleRate);\n\n      _recognizer = await createOfflineRecognizer();\n      _isInitialized = true;\n    }\n\n    try {\n      if (await _audioRecorder.hasPermission()) {\n        const encoder = AudioEncoder.pcm16bits;\n\n        if (!await _isEncoderSupported(encoder)) {\n          return;\n        }\n\n        final devs = await _audioRecorder.listInputDevices();\n        debugPrint(devs.toString());\n\n        const config = RecordConfig(\n          encoder: encoder,\n          sampleRate: _sampleRate,\n          numChannels: 1,\n        );\n\n        final stream = await _audioRecorder.startStream(config);\n\n        stream.listen(\n          (data) {\n            final samplesFloat32 = convertBytesToFloat32(Uint8List.fromList(data));\n            \n            // Buffer the samples, then feed them to the VAD one window at a time\n            _buffer!.push(samplesFloat32);\n            \n            final windowSize = 
_vadConfig.sileroVad.windowSize;\n            while (_buffer!.size > windowSize) {\n              final samples = _buffer!.get(\n                startIndex: _buffer!.head, \n                n: windowSize\n              );\n              _buffer!.pop(windowSize);\n              _vad!.acceptWaveform(samples);  \n\n              while (!_vad!.isEmpty()) {\n                final segment = _vad!.front();\n                final samples = segment.samples;  \n                \n                // offline _recognizer stream handle logic\n                final stream = _recognizer!.createStream();\n                stream.acceptWaveform(samples: samples, sampleRate: _sampleRate);\n                _recognizer!.decode(stream);\n                final text = _recognizer!.getResult(stream).text;\n                debugPrint(\"recognize:\"+text);\n                stream.free();\n                _vad!.pop();\n                \n                // update text to display\n                String textToDisplay = _last;\n                if (text != '') {\n                  _index += 1;\n                  if (_last == '') {\n                    textToDisplay = '$_index: $text';\n                  } else {\n                    textToDisplay = '$_index: $text\\n$_last';\n                  }\n                  _last = textToDisplay;\n                }\n                debugPrint(\"final:\"+textToDisplay);\n                _controller.value = TextEditingValue(\n                      text: textToDisplay,\n                      selection: TextSelection.collapsed(offset: textToDisplay.length),\n                );\n              }\n            }\n          },\n          onDone: () {\n            print('stream stopped.');\n          },\n        );\n      }\n    } catch (e) {\n      print(e);\n    }\n  }\n\n  Future<void> _stop() async {\n    await _audioRecorder.stop();\n    // handle rest of vad data\n     _vad!.flush();\n    while (!_vad!.isEmpty()) {\n      final segment = _vad!.front();\n      
final samples = segment.samples;\n\n      final stream = _recognizer!.createStream();\n      stream.acceptWaveform(samples: samples, sampleRate: _sampleRate);\n      _recognizer!.decode(stream);\n      final text = _recognizer!.getResult(stream).text;\n\n      String textToDisplay = _last;\n      if (text != '') {\n        _index += 1;\n        if (_last == '') {\n          textToDisplay = '$_index: $text';\n        } else {\n          textToDisplay = '$_index: $text\\n$_last';\n        }\n        // Keep earlier results if more than one segment is flushed\n        _last = textToDisplay;\n      }\n      debugPrint(\"final:\"+textToDisplay);\n      _controller.value = TextEditingValue(\n        text: textToDisplay,\n        selection: TextSelection.collapsed(offset: textToDisplay.length),\n      );\n      stream.free();\n      _vad!.pop();\n    }\n    // Reset the history once, after all remaining segments are handled,\n    // so that the next recording starts again from index 1\n    _last = \"\";\n    _index = 0;\n  }\n\n  Future<void> _pause() => _audioRecorder.pause();\n\n  Future<void> _resume() => _audioRecorder.resume();\n\n  void _updateRecordState(RecordState recordState) {\n    setState(() => _recordState = recordState);\n  }\n\n  Future<bool> _isEncoderSupported(AudioEncoder encoder) async {\n    final isSupported = await _audioRecorder.isEncoderSupported(\n      encoder,\n    );\n\n    if (!isSupported) {\n      debugPrint('${encoder.name} is not supported on this platform.');\n      debugPrint('Supported encoders are:');\n\n      for (final e in AudioEncoder.values) {\n        if (await _audioRecorder.isEncoderSupported(e)) {\n          debugPrint('- ${e.name}');\n        }\n      }\n    }\n\n    return isSupported;\n  }\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      home: Scaffold(\n        appBar: AppBar(\n          title: Text(_title),\n        ),\n        body: Column(\n          mainAxisAlignment: MainAxisAlignment.center,\n          children: [\n            const SizedBox(height: 50),\n            TextField(\n              maxLines: 5,\n              
controller: _controller,\n              readOnly: true,\n            ),\n            const SizedBox(height: 50),\n            Row(\n              mainAxisAlignment: MainAxisAlignment.center,\n              children: <Widget>[\n                _buildRecordStopControl(),\n                const SizedBox(width: 20),\n                _buildText(),\n              ],\n            ),\n          ],\n        ),\n      ),\n    );\n  }\n\n  @override\n  void dispose() {\n    _recordSub?.cancel();\n    _audioRecorder.dispose();\n    _recognizer?.free();\n    _vad?.free(); // release vad\n    _buffer?.free(); // release buffer\n    super.dispose();\n  }\n\n  Widget _buildRecordStopControl() {\n    late Icon icon;\n    late Color color;\n\n    if (_recordState != RecordState.stop) {\n      icon = const Icon(Icons.stop, color: Colors.red, size: 30);\n      color = Colors.red.withOpacity(0.1);\n    } else {\n      final theme = Theme.of(context);\n      icon = Icon(Icons.mic, color: theme.primaryColor, size: 30);\n      color = theme.primaryColor.withOpacity(0.1);\n    }\n\n    return ClipOval(\n      child: Material(\n        color: color,\n        child: InkWell(\n          child: SizedBox(width: 56, height: 56, child: icon),\n          onTap: () {\n            (_recordState != RecordState.stop) ? _stop() : _start();\n          },\n        ),\n      ),\n    );\n  }\n\n  Widget _buildText() {\n    if (_recordState == RecordState.stop) {\n      return const Text(\"Start\");\n    } else {\n      return const Text(\"Stop\");\n    }\n  }\n}"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/lib/offline_model.dart",
    "content": "import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './utils.dart';\n\nfinal modelDir = 'assets';\n// Remember to change `assets` in ../pubspec.yaml\n// and download files to ../assets\nFuture<sherpa_onnx.OfflineModelConfig> getOfflineModelConfig(\n    {required int type}) async {\n  switch (type) {\n    // whisper\n    case 0: \n      return sherpa_onnx.OfflineModelConfig(\n        whisper:sherpa_onnx.OfflineWhisperModelConfig(\n          encoder: await copyAssetFile('$modelDir/whisper/base-encoder.onnx'),\n          decoder: await copyAssetFile('$modelDir/whisper/base-decoder.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/whisper/base-tokens.txt'),\n        modelType: 'whisper',\n      );\n    // senseVoice  \n    case 1:\n      return sherpa_onnx.OfflineModelConfig(\n        senseVoice: sherpa_onnx.OfflineSenseVoiceModelConfig(\n          model: await copyAssetFile('$modelDir/senseVoice/model.int8.onnx'), \n        ),\n        tokens: await copyAssetFile('$modelDir/senseVoice/tokens.txt'),\n      );\n    // nemo_transducer-parakeet-tdt\n    case 2:\n      return sherpa_onnx.OfflineModelConfig(\n        transducer: sherpa_onnx.OfflineTransducerModelConfig(\n          encoder: await copyAssetFile(\n              '$modelDir/nemo_transducer/encoder.int8.onnx'),\n          decoder: await copyAssetFile(\n              '$modelDir/nemo_transducer/decoder.int8.onnx'),\n          joiner: await copyAssetFile(\n              '$modelDir/nemo_transducer/joiner.int8.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/nemo_transducer/tokens.txt'),\n        modelType: 'nemo_transducer',\n      );\n    default:\n      throw ArgumentError('Unsupported type: $type');\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/lib/utils.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:path/path.dart';\nimport 'package:path_provider/path_provider.dart';\nimport 'package:flutter/services.dart' show rootBundle;\nimport 'dart:typed_data';\nimport \"dart:io\";\n\n// Copy the asset file from src to dst\nFuture<String> copyAssetFile(String src, [String? dst]) async {\n  final Directory directory = await getApplicationSupportDirectory();\n  if (dst == null) {\n    dst = basename(src);\n  }\n  final target = join(directory.path, dst);\n  bool exists = await new File(target).exists();\n\n  final data = await rootBundle.load(src);\n\n  if (!exists || File(target).lengthSync() != data.lengthInBytes) {\n    final List<int> bytes =\n        data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);\n    await File(target).writeAsBytes(bytes);\n  }\n\n  return target;\n}\n\nFloat32List convertBytesToFloat32(Uint8List bytes, [endian = Endian.little]) {\n  final values = Float32List(bytes.length ~/ 2);\n\n  final data = ByteData.view(bytes.buffer);\n\n  for (var i = 0; i < bytes.length; i += 2) {\n    int short = data.getInt16(i, endian);\n    values[i ~/ 2] = short / 32768.0;\n  }\n\n  return values;\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/.gitignore",
    "content": "# Flutter-related\n**/Flutter/ephemeral/\n**/Pods/\n\n# Xcode-related\n**/dgph\n**/xcuserdata/\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Flutter/Flutter-Debug.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.debug.xcconfig\"\n#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Flutter/Flutter-Release.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.release.xcconfig\"\n#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/AppDelegate.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\n@main\nclass AppDelegate: FlutterAppDelegate {\n  override func applicationShouldTerminateAfterLastWindowClosed(_ sender: NSApplication) -> Bool {\n    return true\n  }\n\n  override func applicationSupportsSecureRestorableState(_ app: NSApplication) -> Bool {\n    return true\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_16.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_64.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_128.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_1024.png\",\n      \"scale\" : \"2x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Base.lproj/MainMenu.xib",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<document type=\"com.apple.InterfaceBuilder3.Cocoa.XIB\" version=\"3.0\" toolsVersion=\"14490.70\" targetRuntime=\"MacOSX.Cocoa\" propertyAccessControl=\"none\" useAutolayout=\"YES\" customObjectInstantitationMethod=\"direct\">\n    <dependencies>\n        <deployment identifier=\"macosx\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.CocoaPlugin\" version=\"14490.70\"/>\n        <capability name=\"documents saved in the Xcode 8 format\" minToolsVersion=\"8.0\"/>\n    </dependencies>\n    <objects>\n        <customObject id=\"-2\" userLabel=\"File's Owner\" customClass=\"NSApplication\">\n            <connections>\n                <outlet property=\"delegate\" destination=\"Voe-Tx-rLC\" id=\"GzC-gU-4Uq\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"-1\" userLabel=\"First Responder\" customClass=\"FirstResponder\"/>\n        <customObject id=\"-3\" userLabel=\"Application\" customClass=\"NSObject\"/>\n        <customObject id=\"Voe-Tx-rLC\" customClass=\"AppDelegate\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <connections>\n                <outlet property=\"applicationMenu\" destination=\"uQy-DD-JDr\" id=\"XBo-yE-nKs\"/>\n                <outlet property=\"mainFlutterWindow\" destination=\"QvC-M9-y7g\" id=\"gIp-Ho-8D9\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"YLy-65-1bz\" customClass=\"NSFontManager\"/>\n        <menu title=\"Main Menu\" systemMenu=\"main\" id=\"AYu-sK-qS6\">\n            <items>\n                <menuItem title=\"APP_NAME\" id=\"1Xt-HY-uBw\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"APP_NAME\" systemMenu=\"apple\" id=\"uQy-DD-JDr\">\n                        <items>\n                            <menuItem title=\"About APP_NAME\" id=\"5kV-Vb-QxS\">\n                                
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"orderFrontStandardAboutPanel:\" target=\"-1\" id=\"Exp-CZ-Vem\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"VOq-y0-SEH\"/>\n                            <menuItem title=\"Preferences…\" keyEquivalent=\",\" id=\"BOF-NM-1cW\"/>\n                            <menuItem isSeparatorItem=\"YES\" id=\"wFC-TO-SCJ\"/>\n                            <menuItem title=\"Services\" id=\"NMo-om-nkz\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Services\" systemMenu=\"services\" id=\"hz9-B4-Xy5\"/>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"4je-JR-u6R\"/>\n                            <menuItem title=\"Hide APP_NAME\" keyEquivalent=\"h\" id=\"Olw-nP-bQN\">\n                                <connections>\n                                    <action selector=\"hide:\" target=\"-1\" id=\"PnN-Uc-m68\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Hide Others\" keyEquivalent=\"h\" id=\"Vdr-fp-XzO\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"hideOtherApplications:\" target=\"-1\" id=\"VT4-aY-XCT\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Show All\" id=\"Kd2-mp-pUS\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n             
                       <action selector=\"unhideAllApplications:\" target=\"-1\" id=\"Dhg-Le-xox\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"kCx-OE-vgT\"/>\n                            <menuItem title=\"Quit APP_NAME\" keyEquivalent=\"q\" id=\"4sb-4s-VLi\">\n                                <connections>\n                                    <action selector=\"terminate:\" target=\"-1\" id=\"Te7-pn-YzF\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Edit\" id=\"5QF-Oa-p0T\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Edit\" id=\"W48-6f-4Dl\">\n                        <items>\n                            <menuItem title=\"Undo\" keyEquivalent=\"z\" id=\"dRJ-4n-Yzg\">\n                                <connections>\n                                    <action selector=\"undo:\" target=\"-1\" id=\"M6e-cu-g7V\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Redo\" keyEquivalent=\"Z\" id=\"6dh-zS-Vam\">\n                                <connections>\n                                    <action selector=\"redo:\" target=\"-1\" id=\"oIA-Rs-6OD\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"WRV-NI-Exz\"/>\n                            <menuItem title=\"Cut\" keyEquivalent=\"x\" id=\"uRl-iY-unG\">\n                                <connections>\n                                    <action selector=\"cut:\" target=\"-1\" id=\"YJe-68-I9s\"/>\n                                </connections>\n                            
</menuItem>\n                            <menuItem title=\"Copy\" keyEquivalent=\"c\" id=\"x3v-GG-iWU\">\n                                <connections>\n                                    <action selector=\"copy:\" target=\"-1\" id=\"G1f-GL-Joy\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste\" keyEquivalent=\"v\" id=\"gVA-U4-sdL\">\n                                <connections>\n                                    <action selector=\"paste:\" target=\"-1\" id=\"UvS-8e-Qdg\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste and Match Style\" keyEquivalent=\"V\" id=\"WeT-3V-zwk\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"pasteAsPlainText:\" target=\"-1\" id=\"cEh-KX-wJQ\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Delete\" id=\"pa3-QI-u2k\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"delete:\" target=\"-1\" id=\"0Mk-Ml-PaM\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Select All\" keyEquivalent=\"a\" id=\"Ruw-6m-B2m\">\n                                <connections>\n                                    <action selector=\"selectAll:\" target=\"-1\" id=\"VNm-Mi-diN\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"uyl-h8-XO2\"/>\n                            <menuItem 
title=\"Find\" id=\"4EN-yA-p0u\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Find\" id=\"1b7-l0-nxx\">\n                                    <items>\n                                        <menuItem title=\"Find…\" tag=\"1\" keyEquivalent=\"f\" id=\"Xz5-n4-O0W\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"cD7-Qs-BN4\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find and Replace…\" tag=\"12\" keyEquivalent=\"f\" id=\"YEy-JH-Tfz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"WD3-Gg-5AJ\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Next\" tag=\"2\" keyEquivalent=\"g\" id=\"q09-fT-Sye\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"NDo-RZ-v9R\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Previous\" tag=\"3\" keyEquivalent=\"G\" id=\"OwM-mh-QMV\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"HOh-sY-3ay\"/>\n                                            
</connections>\n                                        </menuItem>\n                                        <menuItem title=\"Use Selection for Find\" tag=\"7\" keyEquivalent=\"e\" id=\"buJ-ug-pKt\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"U76-nv-p5D\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Jump to Selection\" keyEquivalent=\"j\" id=\"S0p-oC-mLd\">\n                                            <connections>\n                                                <action selector=\"centerSelectionInVisibleArea:\" target=\"-1\" id=\"IOG-6D-g5B\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Spelling and Grammar\" id=\"Dv1-io-Yv7\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Spelling\" id=\"3IN-sU-3Bg\">\n                                    <items>\n                                        <menuItem title=\"Show Spelling and Grammar\" keyEquivalent=\":\" id=\"HFo-cy-zxI\">\n                                            <connections>\n                                                <action selector=\"showGuessPanel:\" target=\"-1\" id=\"vFj-Ks-hy3\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Document Now\" keyEquivalent=\";\" id=\"hz2-CU-CR7\">\n                                            <connections>\n                                        
        <action selector=\"checkSpelling:\" target=\"-1\" id=\"fz7-VC-reM\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"bNw-od-mp5\"/>\n                                        <menuItem title=\"Check Spelling While Typing\" id=\"rbD-Rh-wIN\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleContinuousSpellChecking:\" target=\"-1\" id=\"7w6-Qz-0kB\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Grammar With Spelling\" id=\"mK6-2p-4JG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleGrammarChecking:\" target=\"-1\" id=\"muD-Qn-j4w\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Correct Spelling Automatically\" id=\"78Y-hA-62v\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticSpellingCorrection:\" target=\"-1\" id=\"2lM-Qi-WAP\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem 
title=\"Substitutions\" id=\"9ic-FL-obx\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Substitutions\" id=\"FeM-D8-WVr\">\n                                    <items>\n                                        <menuItem title=\"Show Substitutions\" id=\"z6F-FW-3nz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"orderFrontSubstitutionsPanel:\" target=\"-1\" id=\"oku-mr-iSq\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"gPx-C9-uUO\"/>\n                                        <menuItem title=\"Smart Copy/Paste\" id=\"9yt-4B-nSM\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleSmartInsertDelete:\" target=\"-1\" id=\"3IJ-Se-DZD\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Quotes\" id=\"hQb-2v-fYv\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticQuoteSubstitution:\" target=\"-1\" id=\"ptq-xd-QOA\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Dashes\" id=\"rgM-f4-ycn\">\n                                            
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDashSubstitution:\" target=\"-1\" id=\"oCt-pO-9gS\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Links\" id=\"cwL-P1-jid\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticLinkDetection:\" target=\"-1\" id=\"Gip-E3-Fov\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Data Detectors\" id=\"tRr-pd-1PS\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDataDetection:\" target=\"-1\" id=\"R1I-Nq-Kbl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Text Replacement\" id=\"HFQ-gK-NFA\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticTextReplacement:\" target=\"-1\" id=\"DvP-Fe-Py6\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                       
     <menuItem title=\"Transformations\" id=\"2oI-Rn-ZJC\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Transformations\" id=\"c8a-y6-VQd\">\n                                    <items>\n                                        <menuItem title=\"Make Upper Case\" id=\"vmV-6d-7jI\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"uppercaseWord:\" target=\"-1\" id=\"sPh-Tk-edu\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Make Lower Case\" id=\"d9M-CD-aMd\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"lowercaseWord:\" target=\"-1\" id=\"iUZ-b5-hil\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Capitalize\" id=\"UEZ-Bs-lqG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"capitalizeWord:\" target=\"-1\" id=\"26H-TL-nsh\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Speech\" id=\"xrE-MZ-jX0\">\n                                <modifierMask 
key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Speech\" id=\"3rS-ZA-NoH\">\n                                    <items>\n                                        <menuItem title=\"Start Speaking\" id=\"Ynk-f8-cLZ\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"startSpeaking:\" target=\"-1\" id=\"654-Ng-kyl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Stop Speaking\" id=\"Oyz-dy-DGm\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"stopSpeaking:\" target=\"-1\" id=\"dX8-6p-jy9\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"View\" id=\"H8h-7b-M4v\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"View\" id=\"HyV-fh-RgO\">\n                        <items>\n                            <menuItem title=\"Enter Full Screen\" keyEquivalent=\"f\" id=\"4J7-dP-txa\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" control=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"toggleFullScreen:\" target=\"-1\" id=\"dU3-MA-1Rq\"/>\n                           
     </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Window\" id=\"aUF-d1-5bR\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Window\" systemMenu=\"window\" id=\"Td7-aD-5lo\">\n                        <items>\n                            <menuItem title=\"Minimize\" keyEquivalent=\"m\" id=\"OY7-WF-poV\">\n                                <connections>\n                                    <action selector=\"performMiniaturize:\" target=\"-1\" id=\"VwT-WD-YPe\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Zoom\" id=\"R4o-n2-Eq4\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"performZoom:\" target=\"-1\" id=\"DIl-cC-cCs\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"eu3-7i-yIM\"/>\n                            <menuItem title=\"Bring All to Front\" id=\"LE2-aR-0XJ\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"arrangeInFront:\" target=\"-1\" id=\"DRN-fu-gQh\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Help\" id=\"EPT-qC-fAb\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Help\" systemMenu=\"help\" id=\"rJ0-wn-3NY\"/>\n                
</menuItem>\n            </items>\n            <point key=\"canvasLocation\" x=\"142\" y=\"-258\"/>\n        </menu>\n        <window title=\"APP_NAME\" allowsToolTipsWhenApplicationIsInactive=\"NO\" autorecalculatesKeyViewLoop=\"NO\" releasedWhenClosed=\"NO\" animationBehavior=\"default\" id=\"QvC-M9-y7g\" customClass=\"MainFlutterWindow\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <windowStyleMask key=\"styleMask\" titled=\"YES\" closable=\"YES\" miniaturizable=\"YES\" resizable=\"YES\"/>\n            <rect key=\"contentRect\" x=\"335\" y=\"390\" width=\"800\" height=\"600\"/>\n            <rect key=\"screenRect\" x=\"0.0\" y=\"0.0\" width=\"2560\" height=\"1577\"/>\n            <view key=\"contentView\" wantsLayer=\"YES\" id=\"EiT-Mj-1SZ\">\n                <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"800\" height=\"600\"/>\n                <autoresizingMask key=\"autoresizingMask\"/>\n            </view>\n        </window>\n    </objects>\n</document>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Configs/AppInfo.xcconfig",
    "content": "// Application-level settings for the Runner target.\n//\n// This may be replaced with something auto-generated from metadata (e.g., pubspec.yaml) in the\n// future. If not, the values below would default to using the project name when this becomes a\n// 'flutter create' template.\n\n// The application's name. By default this is also the title of the Flutter window.\nPRODUCT_NAME = non_streaming_vad_asr\n\n// The application's bundle identifier\nPRODUCT_BUNDLE_IDENTIFIER = com.example.nonStreamingVadAsr\n\n// The copyright displayed in application information\nPRODUCT_COPYRIGHT = Copyright © 2024 com.example. All rights reserved.\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Configs/Debug.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Debug.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Configs/Release.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Release.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Configs/Warnings.xcconfig",
    "content": "WARNING_CFLAGS = -Wall -Wconditional-uninitialized -Wnullable-to-nonnull-conversion -Wmissing-method-return-type -Woverlength-strings\nGCC_WARN_UNDECLARED_SELECTOR = YES\nCLANG_UNDEFINED_BEHAVIOR_SANITIZER_NULLABILITY = YES\nCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE\nCLANG_WARN__DUPLICATE_METHOD_MATCH = YES\nCLANG_WARN_PRAGMA_PACK = YES\nCLANG_WARN_STRICT_PROTOTYPES = YES\nCLANG_WARN_COMMA = YES\nGCC_WARN_STRICT_SELECTOR_MATCH = YES\nCLANG_WARN_OBJC_REPEATED_USE_OF_WEAK = YES\nCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES\nGCC_WARN_SHADOW = YES\nCLANG_WARN_UNREACHABLE_CODE = YES\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/DebugProfile.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n\t<key>com.apple.security.cs.allow-jit</key>\n\t<true/>\n\t<key>com.apple.security.device.audio-input</key>\n\t<true/>\n\t<key>com.apple.security.network.server</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen kaldi to work</string>\n\t<key>CFBundleDevelopmentRegion</key>\n\t<string>$(DEVELOPMENT_LANGUAGE)</string>\n\t<key>CFBundleExecutable</key>\n\t<string>$(EXECUTABLE_NAME)</string>\n\t<key>CFBundleIconFile</key>\n\t<string></string>\n\t<key>CFBundleIdentifier</key>\n\t<string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>\n\t<key>CFBundleInfoDictionaryVersion</key>\n\t<string>6.0</string>\n\t<key>CFBundleName</key>\n\t<string>$(PRODUCT_NAME)</string>\n\t<key>CFBundlePackageType</key>\n\t<string>APPL</string>\n\t<key>CFBundleShortVersionString</key>\n\t<string>$(FLUTTER_BUILD_NAME)</string>\n\t<key>CFBundleVersion</key>\n\t<string>$(FLUTTER_BUILD_NUMBER)</string>\n\t<key>LSMinimumSystemVersion</key>\n\t<string>$(MACOSX_DEPLOYMENT_TARGET)</string>\n\t<key>NSHumanReadableCopyright</key>\n\t<string>$(PRODUCT_COPYRIGHT)</string>\n\t<key>NSMainNibFile</key>\n\t<string>MainMenu</string>\n\t<key>NSPrincipalClass</key>\n\t<string>NSApplication</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/MainFlutterWindow.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\nclass MainFlutterWindow: NSWindow {\n  override func awakeFromNib() {\n    let flutterViewController = FlutterViewController()\n    let windowFrame = self.frame\n    self.contentViewController = flutterViewController\n    self.setFrame(windowFrame, display: true)\n\n    RegisterGeneratedPlugins(registry: flutterViewController)\n\n    super.awakeFromNib()\n  }\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner/Release.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n\t<key>com.apple.security.device.audio-input</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 54;\n\tobjects = {\n\n/* Begin PBXAggregateTarget section */\n\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */ = {\n\t\t\tisa = PBXAggregateTarget;\n\t\t\tbuildConfigurationList = 33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t33CC111E2044C6BF0003C045 /* ShellScript */,\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = \"Flutter Assemble\";\n\t\t\tproductName = FLX;\n\t\t};\n/* End PBXAggregateTarget section */\n\n/* Begin PBXBuildFile section */\n\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 331C80D7294CF71000263BE5 /* RunnerTests.swift */; };\n\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */ = {isa = PBXBuildFile; fileRef = 335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */; };\n\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC10F02044A3C60003C045 /* AppDelegate.swift */; };\n\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F22044A3C60003C045 /* Assets.xcassets */; };\n\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F42044A3C60003C045 /* MainMenu.xib */; };\n\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC11122044BFA00003C045 /* MainFlutterWindow.swift */; };\n\t\t3FE622CE7FAD50CAB6A50227 /* Pods_RunnerTests.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 6FDD8A902F607871AAC21564 /* Pods_RunnerTests.framework */; };\n\t\tB6BF18E4D30EDE6C4C5FD00D /* Pods_Runner.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = 2DE4C5BCFCD3E0DF5FD13E12 /* Pods_Runner.framework */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section 
*/\n\t\t331C80D9294CF71000263BE5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC10EC2044A3C60003C045;\n\t\t\tremoteInfo = Runner;\n\t\t};\n\t\t33CC111F2044C79F0003C045 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC111A2044C6BA0003C045;\n\t\t\tremoteInfo = FLX;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t33CC110E2044A8840003C045 /* Bundle Framework */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tname = \"Bundle Framework\";\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t2DE4C5BCFCD3E0DF5FD13E12 /* Pods_Runner.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = Pods_Runner.framework; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = RunnerTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = RunnerTests.swift; sourceTree = \"<group>\"; };\n\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Warnings.xcconfig; sourceTree = \"<group>\"; };\n\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = GeneratedPluginRegistrant.swift; sourceTree = 
\"<group>\"; };\n\t\t33CC10ED2044A3C60003C045 /* streaming_asr.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = streaming_asr.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; name = Assets.xcassets; path = Runner/Assets.xcassets; sourceTree = \"<group>\"; };\n\t\t33CC10F52044A3C60003C045 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.xib; name = Base; path = Base.lproj/MainMenu.xib; sourceTree = \"<group>\"; };\n\t\t33CC10F72044A3C60003C045 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; name = Info.plist; path = Runner/Info.plist; sourceTree = \"<group>\"; };\n\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = MainFlutterWindow.swift; sourceTree = \"<group>\"; };\n\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; name = \"Flutter-Generated.xcconfig\"; path = \"ephemeral/Flutter-Generated.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */ = {isa = PBXFileReference; lastKnownFileType = text.plist.entitlements; path = DebugProfile.entitlements; sourceTree = \"<group>\"; };\n\t\t33E51914231749380026EE4D /* 
Release.entitlements */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.entitlements; path = Release.entitlements; sourceTree = \"<group>\"; };\n\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = AppInfo.xcconfig; sourceTree = \"<group>\"; };\n\t\t6FDD8A902F607871AAC21564 /* Pods_RunnerTests.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = Pods_RunnerTests.framework; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Release.xcconfig; sourceTree = \"<group>\"; };\n\t\t8C6D1508C07A543BCF20E922 /* Pods-Runner.debug.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.debug.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; path = Debug.xcconfig; sourceTree = \"<group>\"; };\n\t\tA82F4113E9B77A3EFBDC76CB /* Pods-Runner.release.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.release.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tC2292CF6D0521881EB8F30D6 /* Pods-Runner.profile.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.profile.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.profile.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tD0400B19E48718CF50379B60 /* Pods-RunnerTests.debug.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.debug.xcconfig\"; path = \"Target Support 
Files/Pods-RunnerTests/Pods-RunnerTests.debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tE0928E31BD7FB7421B509154 /* Pods-RunnerTests.profile.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.profile.xcconfig\"; path = \"Target Support Files/Pods-RunnerTests/Pods-RunnerTests.profile.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tFE1C524AA6E87A1F323D2F64 /* Pods-RunnerTests.release.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.release.xcconfig\"; path = \"Target Support Files/Pods-RunnerTests/Pods-RunnerTests.release.xcconfig\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t331C80D2294CF70F00263BE5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t3FE622CE7FAD50CAB6A50227 /* Pods_RunnerTests.framework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EA2044A3C60003C045 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tB6BF18E4D30EDE6C4C5FD00D /* Pods_Runner.framework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t331C80D6294CF71000263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */,\n\t\t\t);\n\t\t\tpath = RunnerTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33BA886A226E78AF003329D5 /* Configs */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */,\n\t\t\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */,\n\t\t\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */,\n\t\t\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig 
*/,\n\t\t\t);\n\t\t\tpath = Configs;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10E42044A3C60003C045 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33FAB671232836740065AC1E /* Runner */,\n\t\t\t\t33CEB47122A05771004F2AC0 /* Flutter */,\n\t\t\t\t331C80D6294CF71000263BE5 /* RunnerTests */,\n\t\t\t\t33CC10EE2044A3C60003C045 /* Products */,\n\t\t\t\tD73912EC22F37F3D000D13A0 /* Frameworks */,\n\t\t\t\tEFAE9269CAE479A42FBED805 /* Pods */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10EE2044A3C60003C045 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10ED2044A3C60003C045 /* streaming_asr.app */,\n\t\t\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC11242044D66E0003C045 /* Resources */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */,\n\t\t\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */,\n\t\t\t\t33CC10F72044A3C60003C045 /* Info.plist */,\n\t\t\t);\n\t\t\tname = Resources;\n\t\t\tpath = ..;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CEB47122A05771004F2AC0 /* Flutter */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */,\n\t\t\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */,\n\t\t\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */,\n\t\t\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */,\n\t\t\t);\n\t\t\tpath = Flutter;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33FAB671232836740065AC1E /* Runner */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */,\n\t\t\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */,\n\t\t\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */,\n\t\t\t\t33E51914231749380026EE4D /* Release.entitlements */,\n\t\t\t\t33CC11242044D66E0003C045 /* Resources 
*/,\n\t\t\t\t33BA886A226E78AF003329D5 /* Configs */,\n\t\t\t);\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tD73912EC22F37F3D000D13A0 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t2DE4C5BCFCD3E0DF5FD13E12 /* Pods_Runner.framework */,\n\t\t\t\t6FDD8A902F607871AAC21564 /* Pods_RunnerTests.framework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tEFAE9269CAE479A42FBED805 /* Pods */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t8C6D1508C07A543BCF20E922 /* Pods-Runner.debug.xcconfig */,\n\t\t\t\tA82F4113E9B77A3EFBDC76CB /* Pods-Runner.release.xcconfig */,\n\t\t\t\tC2292CF6D0521881EB8F30D6 /* Pods-Runner.profile.xcconfig */,\n\t\t\t\tD0400B19E48718CF50379B60 /* Pods-RunnerTests.debug.xcconfig */,\n\t\t\t\tFE1C524AA6E87A1F323D2F64 /* Pods-RunnerTests.release.xcconfig */,\n\t\t\t\tE0928E31BD7FB7421B509154 /* Pods-RunnerTests.profile.xcconfig */,\n\t\t\t);\n\t\t\tname = Pods;\n\t\t\tpath = Pods;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t331C80D4294CF70F00263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 331C80DE294CF71000263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t82A3EDCE842FB1EFBCBBAD9F /* [CP] Check Pods Manifest.lock */,\n\t\t\t\t331C80D1294CF70F00263BE5 /* Sources */,\n\t\t\t\t331C80D2294CF70F00263BE5 /* Frameworks */,\n\t\t\t\t331C80D3294CF70F00263BE5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = RunnerTests;\n\t\t\tproductName = RunnerTests;\n\t\t\tproductReference = 331C80D5294CF71000263BE5 /* RunnerTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\t33CC10EC2044A3C60003C045 /* Runner */ = {\n\t\t\tisa = 
PBXNativeTarget;\n\t\t\tbuildConfigurationList = 33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t9382924FF32AC828037E7DB5 /* [CP] Check Pods Manifest.lock */,\n\t\t\t\t33CC10E92044A3C60003C045 /* Sources */,\n\t\t\t\t33CC10EA2044A3C60003C045 /* Frameworks */,\n\t\t\t\t33CC10EB2044A3C60003C045 /* Resources */,\n\t\t\t\t33CC110E2044A8840003C045 /* Bundle Framework */,\n\t\t\t\t3399D490228B24CF009A79C7 /* ShellScript */,\n\t\t\t\t736059A98E6FCBCF66678C71 /* [CP] Embed Pods Frameworks */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = Runner;\n\t\t\tproductName = Runner;\n\t\t\tproductReference = 33CC10ED2044A3C60003C045 /* streaming_asr.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t33CC10E52044A3C60003C045 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = YES;\n\t\t\t\tLastSwiftUpdateCheck = 0920;\n\t\t\t\tLastUpgradeCheck = 1510;\n\t\t\t\tORGANIZATIONNAME = \"\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t331C80D4294CF70F00263BE5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.0;\n\t\t\t\t\t\tTestTargetID = 33CC10EC2044A3C60003C045;\n\t\t\t\t\t};\n\t\t\t\t\t33CC10EC2044A3C60003C045 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tLastSwiftMigration = 1100;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t\tSystemCapabilities = {\n\t\t\t\t\t\t\tcom.apple.Sandbox = {\n\t\t\t\t\t\t\t\tenabled = 1;\n\t\t\t\t\t\t\t};\n\t\t\t\t\t\t};\n\t\t\t\t\t};\n\t\t\t\t\t33CC111A2044C6BA0003C045 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tProvisioningStyle = Manual;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" 
*/;\n\t\t\tcompatibilityVersion = \"Xcode 9.3\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = 33CC10E42044A3C60003C045;\n\t\t\tproductRefGroup = 33CC10EE2044A3C60003C045 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t33CC10EC2044A3C60003C045 /* Runner */,\n\t\t\t\t331C80D4294CF70F00263BE5 /* RunnerTests */,\n\t\t\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\t331C80D3294CF70F00263BE5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EB2044A3C60003C045 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */,\n\t\t\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXShellScriptBuildPhase section */\n\t\t3399D490228B24CF009A79C7 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"echo \\\"$PRODUCT_NAME.app\\\" > \\\"$PROJECT_DIR\\\"/Flutter/ephemeral/.app_filename && \\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh embed\\n\";\n\t\t};\n\t\t33CC111E2044C6BF0003C045 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles 
= (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterInputs.xcfilelist,\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\tFlutter/ephemeral/tripwire,\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterOutputs.xcfilelist,\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"\\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh && touch Flutter/ephemeral/tripwire\";\n\t\t};\n\t\t736059A98E6FCBCF66678C71 /* [CP] Embed Pods Frameworks */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t\t\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks-${CONFIGURATION}-input-files.xcfilelist\",\n\t\t\t);\n\t\t\tname = \"[CP] Embed Pods Frameworks\";\n\t\t\toutputFileListPaths = (\n\t\t\t\t\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks-${CONFIGURATION}-output-files.xcfilelist\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"\\\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks.sh\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n\t\t82A3EDCE842FB1EFBCBBAD9F /* [CP] Check Pods Manifest.lock */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\",\n\t\t\t\t\"${PODS_ROOT}/Manifest.lock\",\n\t\t\t);\n\t\t\tname = \"[CP] Check Pods Manifest.lock\";\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t\t\"$(DERIVED_FILE_DIR)/Pods-RunnerTests-checkManifestLockResult.txt\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"diff \\\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\\\" 
\\\"${PODS_ROOT}/Manifest.lock\\\" > /dev/null\\nif [ $? != 0 ] ; then\\n    # print error to STDERR\\n    echo \\\"error: The sandbox is not in sync with the Podfile.lock. Run 'pod install' or update your CocoaPods installation.\\\" >&2\\n    exit 1\\nfi\\n# This output is used by Xcode 'outputs' to avoid re-running this script phase.\\necho \\\"SUCCESS\\\" > \\\"${SCRIPT_OUTPUT_FILE_0}\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n\t\t9382924FF32AC828037E7DB5 /* [CP] Check Pods Manifest.lock */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\",\n\t\t\t\t\"${PODS_ROOT}/Manifest.lock\",\n\t\t\t);\n\t\t\tname = \"[CP] Check Pods Manifest.lock\";\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t\t\"$(DERIVED_FILE_DIR)/Pods-Runner-checkManifestLockResult.txt\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"diff \\\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\\\" \\\"${PODS_ROOT}/Manifest.lock\\\" > /dev/null\\nif [ $? != 0 ] ; then\\n    # print error to STDERR\\n    echo \\\"error: The sandbox is not in sync with the Podfile.lock. 
Run 'pod install' or update your CocoaPods installation.\\\" >&2\\n    exit 1\\nfi\\n# This output is used by Xcode 'outputs' to avoid re-running this script phase.\\necho \\\"SUCCESS\\\" > \\\"${SCRIPT_OUTPUT_FILE_0}\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n/* End PBXShellScriptBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t331C80D1294CF70F00263BE5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10E92044A3C60003C045 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */,\n\t\t\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */,\n\t\t\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 33CC10EC2044A3C60003C045 /* Runner */;\n\t\t\ttargetProxy = 331C80D9294CF71000263BE5 /* PBXContainerItemProxy */;\n\t\t};\n\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 33CC111A2044C6BA0003C045 /* Flutter Assemble */;\n\t\t\ttargetProxy = 33CC111F2044C79F0003C045 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F52044A3C60003C045 /* Base */,\n\t\t\t);\n\t\t\tname = MainMenu.xib;\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin 
XCBuildConfiguration section */\n\t\t331C80DB294CF71000263BE5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = D0400B19E48718CF50379B60 /* Pods-RunnerTests.debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t331C80DC294CF71000263BE5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = FE1C524AA6E87A1F323D2F64 /* Pods-RunnerTests.release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t331C80DD294CF71000263BE5 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = E0928E31BD7FB7421B509154 /* Pods-RunnerTests.profile.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = 
\"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CE9231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = 
YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEA231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEB231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t33CC10F92044A3C60003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 9740EEB21CF90195004384FC /* Debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = 
YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FA2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = 
YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC10FC2044A3C60003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = 
{\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FD2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/Release.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC111C2044C6BA0003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC111D2044C6BA0003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t331C80DE294CF71000263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = 
(\n\t\t\t\t331C80DB294CF71000263BE5 /* Debug */,\n\t\t\t\t331C80DC294CF71000263BE5 /* Release */,\n\t\t\t\t331C80DD294CF71000263BE5 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10F92044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FA2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CE9231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10FC2044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FD2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CEA231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC111C2044C6BA0003C045 /* Debug */,\n\t\t\t\t33CC111D2044C6BA0003C045 /* Release */,\n\t\t\t\t338D0CEB231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 33CC10E52044A3C60003C045 /* Project object */;\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner.xcodeproj/xcshareddata/xcschemes/Runner.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1510\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n               BuildableName = \"non_streaming_vad_asr.app\"\n               BlueprintName = \"Runner\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <MacroExpansion>\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </MacroExpansion>\n      <Testables>\n         <TestableReference\n            skipped = \"NO\"\n            parallelizable = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"331C80D4294CF70F00263BE5\"\n               BuildableName = \"RunnerTests.xctest\"\n               BlueprintName = \"RunnerTests\"\n               ReferencedContainer = 
\"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </TestableReference>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"NO\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      enableGPUValidationMode = \"1\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Profile\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"group:Runner.xcodeproj\">\n   </FileRef>\n   <FileRef\n      location = \"group:Pods/Pods.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/Runner.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/macos/RunnerTests/RunnerTests.swift",
    "content": "import Cocoa\nimport FlutterMacOS\nimport XCTest\n\nclass RunnerTests: XCTestCase {\n\n  func testExample() {\n    // If you add code to the Runner application, consider adding tests here.\n    // See https://developer.apple.com/documentation/xctest for more information about using XCTest.\n  }\n\n}\n"
  },
  {
    "path": "flutter-examples/non_streaming_vad_asr/pubspec.yaml",
    "content": "name: non_streaming_vad_asr\n\ndescription: >\n  This example shows how to implement \"real-time\" speech recognition using sherpa-onnx via non_streaming and vad.\n\npublish_to: 'none'\n\nversion: 1.12.31\n\ntopics:\n  - speech-recognition\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/sherpa-onnx/flutter\n\nenvironment:\n  sdk: \">=2.17.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\n  cupertino_icons: ^1.0.6\n\n  path_provider: ^2.1.3\n  path: ^1.9.0\n\n  # Note: record does not support Linux for streaming ASR\n  record: 6.0.0\n  url_launcher: ^6.2.6\n\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n\ndev_dependencies:\n  flutter_test:\n    sdk: flutter\n\n  flutter_lints: ^3.0.0\n\nflutter:\n  uses-material-design: true\n\n  assets:\n    - assets/\n    #- assets/whisper/\n    #- assets/senseVoice/\n    - assets/nemo_transducer/\n    # - assets/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n"
  },
  {
    "path": "flutter-examples/streaming_asr/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n**/doc/api/\n**/ios/Flutter/.last_build_id\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n.pub-cache/\n.pub/\n/build/\n\n# Symbolication related\napp.*.symbols\n\n# Obfuscation related\napp.*.map.json\n\n# Android Studio will place build artifacts here\n/android/app/debug\n/android/app/profile\n/android/app/release\n"
  },
  {
    "path": "flutter-examples/streaming_asr/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"9f455d2486bcb28cad87b062475f42edc959f636\"\n  channel: \"stable\"\n\nproject_type: app\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 9f455d2486bcb28cad87b062475f42edc959f636\n      base_revision: 9f455d2486bcb28cad87b062475f42edc959f636\n    - platform: linux\n      create_revision: 9f455d2486bcb28cad87b062475f42edc959f636\n      base_revision: 9f455d2486bcb28cad87b062475f42edc959f636\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter-examples/streaming_asr/README.md",
    "content": "# Real-time speech recognition\n\nThis APP supports the following platforms:\n\n  - Windows\n  - macOS\n  - Linux\n  - Android\n  - iOS\n\nPre-built APPs for this folder can be found at <https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#streaming-speech-recognition-stt-asr>\n\nSee also <https://github.com/Jason-chen-coder/Flutter-EasySpeechRecognition>\n\n## Getting Started\n\nRemember to use the following steps to download a model. Otherwise, you would\nget errors after you start and run the app.\n\n###  1. Select a streaming model\n\nPlease visit <https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models>\nto download a streaming ASR model.\n\nYou can find introductions about each streaming model at\n<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html>\n\n\nNote: `Streaming` is the same as `Online` in this context.\n\n### 2. Let the code know which model you are using\n\nWe have pre-configured some streaming models in the following file\n\n<https://github.com/k2-fsa/sherpa-onnx/blob/master/flutter-examples/streaming_asr/lib/online_model.dart>\n\nIf you select a model that is not in the above file, please add it to the above file\nby yourself by following how existing models are added.\n\nThen you need to update\n\n<https://github.com/k2-fsa/sherpa-onnx/blob/master/flutter-examples/streaming_asr/lib/streaming_asr.dart#L16>\n\n```\nfinal type = 0;\n```\n\nPlease change ``type`` accordingly.\n\nYou also need to change [./pubspec.yaml](./pubspec.yaml) so that your APP knows where to find it.\nPlease see the example below for how to do that.\n\n### 3. Place your downloaded model inside the directory assets\n\nThe downloaded model has to be placed in the [assets](./assets) directory.\n\n**HINT**: Please delete files that are not needed by the code. 
Otherwise, you put\nunnecessary files in your APP, which will significantly increase its size.\n\n## Example\n\nSuppose you have selected the following model\n\n<https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2>\n\nPlease use the following steps to make it available in your APP.\n\n - 1. Change [online_model.dart](./lib/online_model.dart)\n\n    This model is already in the file and its type is `0`, so there is no need to change this file.\n\n - 2. Change [streaming_asr.dart](./lib/streaming_asr.dart)\n\n    The default value for `type` is `0` and our model also has type `0`, so there is no need to change this file.\n\n - 3. Change [pubspec.yaml](./pubspec.yaml)\n\n   At the end of [pubspec.yaml](./pubspec.yaml), please change the `assets` section so that it looks exactly like below:\n\n```\n  assets:\n    - assets/\n    - assets/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n```\n\n  - 4. Download the model to the [./assets](./assets) directory.\n\n```\ncd assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n# Remember to remove unused files.\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/README.md\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/bpe*\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.int8.onnx\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\n```\n\nYour [assets](./assets) directory should look like the following at the end.\n\n```\nassets/\n└── sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n    ├── 
decoder-epoch-99-avg-1.onnx\n    ├── encoder-epoch-99-avg-1.int8.onnx\n    ├── joiner-epoch-99-avg-1.onnx\n    └── tokens.txt\n\n1 directory, 4 files\n```\n\n  - 5. Run it!\n\n    For instance\n\n      - `flutter run -d macos` for macOS.\n\n      - `flutter run -d windows` for Windows.\n"
  },
  {
    "path": "flutter-examples/streaming_asr/analysis_options.yaml",
    "content": "# This file configures the analyzer, which statically analyzes Dart code to\n# check for errors, warnings, and lints.\n#\n# The issues identified by the analyzer are surfaced in the UI of Dart-enabled\n# IDEs (https://dart.dev/tools#ides-and-editors). The analyzer can also be\n# invoked from the command line by running `flutter analyze`.\n\n# The following line activates a set of recommended lints for Flutter apps,\n# packages, and plugins designed to encourage good coding practices.\ninclude: package:flutter_lints/flutter.yaml\n\nlinter:\n  # The lint rules applied to this project can be customized in the\n  # section below to disable rules from the `package:flutter_lints/flutter.yaml`\n  # included above or to enable additional rules. A list of all available lints\n  # and their documentation is published at https://dart.dev/lints.\n  #\n  # Instead of disabling a lint rule for the entire project in the\n  # section below, it can also be suppressed for a single line of code\n  # or a specific dart file by using the `// ignore: name_of_lint` and\n  # `// ignore_for_file: name_of_lint` syntax on the line or in the file\n  # producing the lint.\n  rules:\n    # avoid_print: false  # Uncomment to disable the `avoid_print` rule\n    # prefer_single_quotes: true  # Uncomment to enable the `prefer_single_quotes` rule\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/.gitignore",
    "content": "gradle-wrapper.jar\n/.gradle\n/captures/\n/gradlew\n/gradlew.bat\n/local.properties\nGeneratedPluginRegistrant.java\n\n# Remember to never publicly share your keystore.\n# See https://flutter.dev/docs/deployment/android#reference-the-keystore-from-the-app\nkey.properties\n**/*.keystore\n**/*.jks\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/build.gradle",
    "content": "plugins {\n    id \"com.android.application\"\n    id \"kotlin-android\"\n    // The Flutter Gradle Plugin must be applied after the Android and Kotlin Gradle plugins.\n    id \"dev.flutter.flutter-gradle-plugin\"\n}\n\ndef localProperties = new Properties()\ndef localPropertiesFile = rootProject.file(\"local.properties\")\nif (localPropertiesFile.exists()) {\n    localPropertiesFile.withReader(\"UTF-8\") { reader ->\n        localProperties.load(reader)\n    }\n}\n\ndef flutterVersionCode = localProperties.getProperty(\"flutter.versionCode\")\nif (flutterVersionCode == null) {\n    flutterVersionCode = \"1\"\n}\n\ndef flutterVersionName = localProperties.getProperty(\"flutter.versionName\")\nif (flutterVersionName == null) {\n    flutterVersionName = \"1.0\"\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.streaming_asr\"\n    compileSdk = 36\n    ndkVersion = \"27.0.12077973\"\n\n    compileOptions {\n        sourceCompatibility = JavaVersion.toVersion(17)\n        targetCompatibility = JavaVersion.toVersion(17)\n    }\n\n    kotlinOptions {\n        jvmTarget = '17'\n    }\n\n    java {\n        toolchain {\n            languageVersion = JavaLanguageVersion.of(17)\n        }\n    }\n\n    defaultConfig {\n        // TODO: Specify your own unique Application ID (https://developer.android.com/studio/build/application-id.html).\n        applicationId = \"com.k2fsa.sherpa.onnx.streaming_asr\"\n        // You can update the following values to match your application needs.\n        // For more information, see: https://docs.flutter.dev/deployment/android#reviewing-the-gradle-build-configuration.\n        minSdk = 23\n        targetSdk = 36\n        versionCode = flutterVersionCode.toInteger()\n        versionName = flutterVersionName\n    }\n\n    buildTypes {\n        release {\n            // TODO: Add your own signing config for the release build.\n            // Signing with the debug keys for now, so `flutter run --release` works.\n       
     signingConfig = signingConfigs.debug\n        }\n    }\n}\n\nflutter {\n    source = \"../..\"\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/debug/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <!-- The INTERNET permission is required for development. Specifically,\n         the Flutter tool needs it to communicate with the running application\n         to allow setting breakpoints, to provide hot reload, etc.\n    -->\n    <uses-permission android:name=\"android.permission.INTERNET\"/>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <uses-permission android:name=\"android.permission.RECORD_AUDIO\" />\n    <!-- Optional: Add this permission if you want to use bluetooth telephony device like headset/earbuds -->\n    <uses-permission android:name=\"android.permission.MODIFY_AUDIO_SETTINGS\" />\n    <!-- Optional: Add this permission if you want to save your recordings in public folders -->\n    <uses-permission android:name=\"android.permission.WRITE_EXTERNAL_STORAGE\" />\n\n    <application\n        android:label=\"streaming_asr\"\n        android:name=\"${applicationName}\"\n        android:icon=\"@mipmap/ic_launcher\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:launchMode=\"singleTop\"\n            android:taskAffinity=\"\"\n            android:theme=\"@style/LaunchTheme\"\n            android:configChanges=\"orientation|keyboardHidden|keyboard|screenSize|smallestScreenSize|locale|layoutDirection|fontScale|screenLayout|density|uiMode\"\n            android:hardwareAccelerated=\"true\"\n            android:windowSoftInputMode=\"adjustResize\">\n            <!-- Specifies an Android theme to apply to this Activity as soon as\n                 the Android process has started. This theme is visible to the user\n                 while the Flutter UI initializes. After that, this theme continues\n                 to determine the Window background behind the Flutter UI. 
-->\n            <meta-data\n              android:name=\"io.flutter.embedding.android.NormalTheme\"\n              android:resource=\"@style/NormalTheme\"\n              />\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\"/>\n                <category android:name=\"android.intent.category.LAUNCHER\"/>\n            </intent-filter>\n        </activity>\n        <!-- Don't delete the meta-data below.\n             This is used by the Flutter tool to generate GeneratedPluginRegistrant.java -->\n        <meta-data\n            android:name=\"flutterEmbedding\"\n            android:value=\"2\" />\n    </application>\n    <!-- Required to query activities that can process text, see:\n         https://developer.android.com/training/package-visibility and\n         https://developer.android.com/reference/android/content/Intent#ACTION_PROCESS_TEXT.\n\n         In particular, this is used by the Flutter engine in io.flutter.plugin.text.ProcessTextPlugin. -->\n    <queries>\n        <intent>\n            <action android:name=\"android.intent.action.PROCESS_TEXT\"/>\n            <data android:mimeType=\"text/plain\"/>\n        </intent>\n    </queries>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/kotlin/com/k2fsa/sherpa/onnx/streaming_asr/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.streaming_asr\n\nimport io.flutter.embedding.android.FlutterActivity\n\nclass MainActivity: FlutterActivity()\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/res/drawable/launch_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<!-- Modify this file to customize your launch splash screen -->\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item android:drawable=\"@android:color/white\" />\n\n    <!-- You can insert your own image assets here -->\n    <!-- <item>\n        <bitmap\n            android:gravity=\"center\"\n            android:src=\"@mipmap/launch_image\" />\n    </item> -->\n</layer-list>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/res/drawable-v21/launch_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<!-- Modify this file to customize your launch splash screen -->\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item android:drawable=\"?android:colorBackground\" />\n\n    <!-- You can insert your own image assets here -->\n    <!-- <item>\n        <bitmap\n            android:gravity=\"center\"\n            android:src=\"@mipmap/launch_image\" />\n    </item> -->\n</layer-list>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/res/values/styles.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <!-- Theme applied to the Android Window while the process is starting when the OS's Dark Mode setting is off -->\n    <style name=\"LaunchTheme\" parent=\"@android:style/Theme.Light.NoTitleBar\">\n        <!-- Show a splash screen on the activity. Automatically removed when\n             the Flutter engine draws its first frame -->\n        <item name=\"android:windowBackground\">@drawable/launch_background</item>\n    </style>\n    <!-- Theme applied to the Android Window as soon as the process has started.\n         This theme determines the color of the Android Window while your\n         Flutter UI initializes, as well as behind your Flutter UI while its\n         running.\n\n         This Theme is only used starting with V2 of Flutter's Android embedding. -->\n    <style name=\"NormalTheme\" parent=\"@android:style/Theme.Light.NoTitleBar\">\n        <item name=\"android:windowBackground\">?android:colorBackground</item>\n    </style>\n</resources>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/main/res/values-night/styles.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <!-- Theme applied to the Android Window while the process is starting when the OS's Dark Mode setting is on -->\n    <style name=\"LaunchTheme\" parent=\"@android:style/Theme.Black.NoTitleBar\">\n        <!-- Show a splash screen on the activity. Automatically removed when\n             the Flutter engine draws its first frame -->\n        <item name=\"android:windowBackground\">@drawable/launch_background</item>\n    </style>\n    <!-- Theme applied to the Android Window as soon as the process has started.\n         This theme determines the color of the Android Window while your\n         Flutter UI initializes, as well as behind your Flutter UI while its\n         running.\n\n         This Theme is only used starting with V2 of Flutter's Android embedding. -->\n    <style name=\"NormalTheme\" parent=\"@android:style/Theme.Black.NoTitleBar\">\n        <item name=\"android:windowBackground\">?android:colorBackground</item>\n    </style>\n</resources>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/app/src/profile/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <!-- The INTERNET permission is required for development. Specifically,\n         the Flutter tool needs it to communicate with the running application\n         to allow setting breakpoints, to provide hot reload, etc.\n    -->\n    <uses-permission android:name=\"android.permission.INTERNET\"/>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/build.gradle",
    "content": "allprojects {\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.buildDir = \"../build\"\nsubprojects {\n    project.buildDir = \"${rootProject.buildDir}/${project.name}\"\n}\nsubprojects {\n    project.evaluationDependsOn(\":app\")\n}\n\ntasks.register(\"clean\", Delete) {\n    delete rootProject.buildDir\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/gradle/wrapper/gradle-wrapper.properties",
    "content": "distributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.11.1-all.zip\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/gradle.properties",
    "content": "org.gradle.jvmargs=-Xmx4G -XX:+HeapDumpOnOutOfMemoryError\nandroid.useAndroidX=true\nandroid.enableJetifier=true\n"
  },
  {
    "path": "flutter-examples/streaming_asr/android/settings.gradle",
    "content": "pluginManagement {\n    def flutterSdkPath = {\n        def properties = new Properties()\n        file(\"local.properties\").withInputStream { properties.load(it) }\n        def flutterSdkPath = properties.getProperty(\"flutter.sdk\")\n        assert flutterSdkPath != null, \"flutter.sdk not set in local.properties\"\n        return flutterSdkPath\n    }()\n\n    includeBuild(\"$flutterSdkPath/packages/flutter_tools/gradle\")\n\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\n\nplugins {\n    id \"dev.flutter.flutter-plugin-loader\" version \"1.0.0\"\n    id \"com.android.application\" version \"8.9.1\" apply false\n    id \"org.jetbrains.kotlin.android\" version \"1.9.24\" apply false\n}\n\ninclude \":app\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/assets/.gitignore",
    "content": ""
  },
  {
    "path": "flutter-examples/streaming_asr/ios/.gitignore",
    "content": "**/dgph\n*.mode1v3\n*.mode2v3\n*.moved-aside\n*.pbxuser\n*.perspectivev3\n**/*sync/\n.sconsign.dblite\n.tags*\n**/.vagrant/\n**/DerivedData/\nIcon?\n**/Pods/\n**/.symlinks/\nprofile\nxcuserdata\n**/.generated/\nFlutter/App.framework\nFlutter/Flutter.framework\nFlutter/Flutter.podspec\nFlutter/Generated.xcconfig\nFlutter/ephemeral/\nFlutter/app.flx\nFlutter/app.zip\nFlutter/flutter_assets/\nFlutter/flutter_export_environment.sh\nServiceDefinitions.json\nRunner/GeneratedPluginRegistrant.*\n\n# Exceptions to above rules.\n!default.mode1v3\n!default.mode2v3\n!default.pbxuser\n!default.perspectivev3\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Flutter/AppFrameworkInfo.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n  <key>CFBundleDevelopmentRegion</key>\n  <string>en</string>\n  <key>CFBundleExecutable</key>\n  <string>App</string>\n  <key>CFBundleIdentifier</key>\n  <string>io.flutter.flutter.app</string>\n  <key>CFBundleInfoDictionaryVersion</key>\n  <string>6.0</string>\n  <key>CFBundleName</key>\n  <string>App</string>\n  <key>CFBundlePackageType</key>\n  <string>FMWK</string>\n  <key>CFBundleShortVersionString</key>\n  <string>1.0</string>\n  <key>CFBundleSignature</key>\n  <string>????</string>\n  <key>CFBundleVersion</key>\n  <string>1.0</string>\n  <key>MinimumOSVersion</key>\n  <string>12.0</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Flutter/Debug.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.debug.xcconfig\"\n#include \"Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Flutter/Release.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.release.xcconfig\"\n#include \"Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/AppDelegate.swift",
    "content": "import Flutter\nimport UIKit\n\n@UIApplicationMain\n@objc class AppDelegate: FlutterAppDelegate {\n  override func application(\n    _ application: UIApplication,\n    didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?\n  ) -> Bool {\n    GeneratedPluginRegistrant.register(with: self)\n    return super.application(application, didFinishLaunchingWithOptions: launchOptions)\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-20x20@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-20x20@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-40x40@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-40x40@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"60x60\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-60x60@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"60x60\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-60x60@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-20x20@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-20x20@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-29x29@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-29x29@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      
\"size\" : \"40x40\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-40x40@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-40x40@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"76x76\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-76x76@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"76x76\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-76x76@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"83.5x83.5\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-83.5x83.5@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"1024x1024\",\n      \"idiom\" : \"ios-marketing\",\n      \"filename\" : \"Icon-App-1024x1024@1x.png\",\n      \"scale\" : \"1x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Assets.xcassets/LaunchImage.imageset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage@3x.png\",\n      \"scale\" : \"3x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Assets.xcassets/LaunchImage.imageset/README.md",
    "content": "# Launch Screen Assets\n\nYou can customize the launch screen with your own desired assets by replacing the image files in this directory.\n\nYou can also do it by opening your Flutter project's Xcode project with `open ios/Runner.xcworkspace`, selecting `Runner/Assets.xcassets` in the Project Navigator and dropping in the desired images."
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Base.lproj/LaunchScreen.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"12121\" systemVersion=\"16G29\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" launchScreen=\"YES\" colorMatched=\"YES\" initialViewController=\"01J-lp-oVM\">\n    <dependencies>\n        <deployment identifier=\"iOS\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"12089\"/>\n    </dependencies>\n    <scenes>\n        <!--View Controller-->\n        <scene sceneID=\"EHf-IW-A2E\">\n            <objects>\n                <viewController id=\"01J-lp-oVM\" sceneMemberID=\"viewController\">\n                    <layoutGuides>\n                        <viewControllerLayoutGuide type=\"top\" id=\"Ydg-fD-yQy\"/>\n                        <viewControllerLayoutGuide type=\"bottom\" id=\"xbc-2k-c8Z\"/>\n                    </layoutGuides>\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"Ze5-6b-2t3\">\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <subviews>\n                            <imageView opaque=\"NO\" clipsSubviews=\"YES\" multipleTouchEnabled=\"YES\" contentMode=\"center\" image=\"LaunchImage\" translatesAutoresizingMaskIntoConstraints=\"NO\" id=\"YRO-k0-Ey4\">\n                            </imageView>\n                        </subviews>\n                        <color key=\"backgroundColor\" red=\"1\" green=\"1\" blue=\"1\" alpha=\"1\" colorSpace=\"custom\" customColorSpace=\"sRGB\"/>\n                        <constraints>\n                            <constraint firstItem=\"YRO-k0-Ey4\" firstAttribute=\"centerX\" secondItem=\"Ze5-6b-2t3\" secondAttribute=\"centerX\" id=\"1a2-6s-vTC\"/>\n                            <constraint firstItem=\"YRO-k0-Ey4\" firstAttribute=\"centerY\" 
secondItem=\"Ze5-6b-2t3\" secondAttribute=\"centerY\" id=\"4X2-HB-R7a\"/>\n                        </constraints>\n                    </view>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"iYj-Kq-Ea1\" userLabel=\"First Responder\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n            <point key=\"canvasLocation\" x=\"53\" y=\"375\"/>\n        </scene>\n    </scenes>\n    <resources>\n        <image name=\"LaunchImage\" width=\"168\" height=\"185\"/>\n    </resources>\n</document>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Base.lproj/Main.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"10117\" systemVersion=\"15F34\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" useTraitCollections=\"YES\" initialViewController=\"BYZ-38-t0r\">\n    <dependencies>\n        <deployment identifier=\"iOS\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"10085\"/>\n    </dependencies>\n    <scenes>\n        <!--Flutter View Controller-->\n        <scene sceneID=\"tne-QT-ifu\">\n            <objects>\n                <viewController id=\"BYZ-38-t0r\" customClass=\"FlutterViewController\" sceneMemberID=\"viewController\">\n                    <layoutGuides>\n                        <viewControllerLayoutGuide type=\"top\" id=\"y3c-jy-aDJ\"/>\n                        <viewControllerLayoutGuide type=\"bottom\" id=\"wfy-db-euE\"/>\n                    </layoutGuides>\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"8bC-Xf-vdC\">\n                        <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"600\" height=\"600\"/>\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <color key=\"backgroundColor\" white=\"1\" alpha=\"1\" colorSpace=\"custom\" customColorSpace=\"calibratedWhite\"/>\n                    </view>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"dkx-z0-nzr\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n        </scene>\n    </scenes>\n</document>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for recording speech</string>\n\t<key>CFBundleDevelopmentRegion</key>\n\t<string>$(DEVELOPMENT_LANGUAGE)</string>\n\t<key>CFBundleDisplayName</key>\n\t<string>Streaming Asr</string>\n\t<key>CFBundleExecutable</key>\n\t<string>$(EXECUTABLE_NAME)</string>\n\t<key>CFBundleIdentifier</key>\n\t<string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>\n\t<key>CFBundleInfoDictionaryVersion</key>\n\t<string>6.0</string>\n\t<key>CFBundleName</key>\n\t<string>streaming_asr</string>\n\t<key>CFBundlePackageType</key>\n\t<string>APPL</string>\n\t<key>CFBundleShortVersionString</key>\n\t<string>$(FLUTTER_BUILD_NAME)</string>\n\t<key>CFBundleSignature</key>\n\t<string>????</string>\n\t<key>CFBundleVersion</key>\n\t<string>$(FLUTTER_BUILD_NUMBER)</string>\n\t<key>LSRequiresIPhoneOS</key>\n\t<true/>\n\t<key>UILaunchStoryboardName</key>\n\t<string>LaunchScreen</string>\n\t<key>UIMainStoryboardFile</key>\n\t<string>Main</string>\n\t<key>UISupportedInterfaceOrientations</key>\n\t<array>\n\t\t<string>UIInterfaceOrientationPortrait</string>\n\t\t<string>UIInterfaceOrientationLandscapeLeft</string>\n\t\t<string>UIInterfaceOrientationLandscapeRight</string>\n\t</array>\n\t<key>UISupportedInterfaceOrientations~ipad</key>\n\t<array>\n\t\t<string>UIInterfaceOrientationPortrait</string>\n\t\t<string>UIInterfaceOrientationPortraitUpsideDown</string>\n\t\t<string>UIInterfaceOrientationLandscapeLeft</string>\n\t\t<string>UIInterfaceOrientationLandscapeRight</string>\n\t</array>\n\t<key>CADisableMinimumFrameDurationOnPhone</key>\n\t<true/>\n\t<key>UIApplicationSupportsIndirectInputEvents</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner/Runner-Bridging-Header.h",
    "content": "#import \"GeneratedPluginRegistrant.h\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 54;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\t05D5EF72926AFE8B0BB8E849 /* Pods_Runner.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = B422E1CC20F2C7BF721B8DEA /* Pods_Runner.framework */; };\n\t\t1498D2341E8E89220040F4C2 /* GeneratedPluginRegistrant.m in Sources */ = {isa = PBXBuildFile; fileRef = 1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */; };\n\t\t331C808B294A63AB00263BE5 /* RunnerTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 331C807B294A618700263BE5 /* RunnerTests.swift */; };\n\t\t3B3967161E833CAA004F5970 /* AppFrameworkInfo.plist in Resources */ = {isa = PBXBuildFile; fileRef = 3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */; };\n\t\t5A4BF2984B010F625045AEF9 /* Pods_RunnerTests.framework in Frameworks */ = {isa = PBXBuildFile; fileRef = CD3E5A0B481F8C71365F9259 /* Pods_RunnerTests.framework */; };\n\t\t74858FAF1ED2DC5600515810 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 74858FAE1ED2DC5600515810 /* AppDelegate.swift */; };\n\t\t97C146FC1CF9000F007C117D /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FA1CF9000F007C117D /* Main.storyboard */; };\n\t\t97C146FE1CF9000F007C117D /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FD1CF9000F007C117D /* Assets.xcassets */; };\n\t\t97C147011CF9000F007C117D /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\t331C8085294A63A400263BE5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 97C146E61CF9000F007C117D /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 97C146ED1CF9000F007C117D;\n\t\t\tremoteInfo = Runner;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin 
PBXCopyFilesBuildPhase section */\n\t\t9705A1C41CF9048500538489 /* Embed Frameworks */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tname = \"Embed Frameworks\";\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t0AE88D6BF022DF2B961162B1 /* Pods-RunnerTests.debug.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.debug.xcconfig\"; path = \"Target Support Files/Pods-RunnerTests/Pods-RunnerTests.debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t1498D2321E8E86230040F4C2 /* GeneratedPluginRegistrant.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = GeneratedPluginRegistrant.h; sourceTree = \"<group>\"; };\n\t\t1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.objc; path = GeneratedPluginRegistrant.m; sourceTree = \"<group>\"; };\n\t\t18DE41FC48D4E4A22BB8396E /* Pods-Runner.release.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.release.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t1FA6A3CB2526375DC4E7577F /* Pods-RunnerTests.profile.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.profile.xcconfig\"; path = \"Target Support Files/Pods-RunnerTests/Pods-RunnerTests.profile.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t331C807B294A618700263BE5 /* RunnerTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = RunnerTests.swift; sourceTree = \"<group>\"; };\n\t\t331C8081294A63A400263BE5 /* RunnerTests.xctest */ = {isa = PBXFileReference; 
explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = RunnerTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.xml; name = AppFrameworkInfo.plist; path = Flutter/AppFrameworkInfo.plist; sourceTree = \"<group>\"; };\n\t\t74858FAD1ED2DC5600515810 /* Runner-Bridging-Header.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = \"Runner-Bridging-Header.h\"; sourceTree = \"<group>\"; };\n\t\t74858FAE1ED2DC5600515810 /* AppDelegate.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; name = Release.xcconfig; path = Flutter/Release.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = Debug.xcconfig; path = Flutter/Debug.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB31CF90195004384FC /* Generated.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = Generated.xcconfig; path = Flutter/Generated.xcconfig; sourceTree = \"<group>\"; };\n\t\t97C146EE1CF9000F007C117D /* Runner.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = Runner.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t97C146FB1CF9000F007C117D /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = \"<group>\"; };\n\t\t97C146FD1CF9000F007C117D /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\t97C147001CF9000F007C117D /* Base */ = {isa = PBXFileReference; 
lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = \"<group>\"; };\n\t\t97C147021CF9000F007C117D /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = \"<group>\"; };\n\t\tB422E1CC20F2C7BF721B8DEA /* Pods_Runner.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = Pods_Runner.framework; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tCD3E5A0B481F8C71365F9259 /* Pods_RunnerTests.framework */ = {isa = PBXFileReference; explicitFileType = wrapper.framework; includeInIndex = 0; path = Pods_RunnerTests.framework; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tD39135D1BCA9F8B2E889A4A7 /* Pods-Runner.profile.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.profile.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.profile.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tECE8263C82D7A5EDCDD523B1 /* Pods-RunnerTests.release.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-RunnerTests.release.xcconfig\"; path = \"Target Support Files/Pods-RunnerTests/Pods-RunnerTests.release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\tF2428E84328DFA24DFEF0A8B /* Pods-Runner.debug.xcconfig */ = {isa = PBXFileReference; includeInIndex = 1; lastKnownFileType = text.xcconfig; name = \"Pods-Runner.debug.xcconfig\"; path = \"Target Support Files/Pods-Runner/Pods-Runner.debug.xcconfig\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t370CDD7E022C5FF755B5EF47 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t5A4BF2984B010F625045AEF9 /* Pods_RunnerTests.framework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t97C146EB1CF9000F007C117D 
/* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t05D5EF72926AFE8B0BB8E849 /* Pods_Runner.framework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t331C8082294A63A400263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t331C807B294A618700263BE5 /* RunnerTests.swift */,\n\t\t\t);\n\t\t\tpath = RunnerTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t50F577A9B451352B5312D8B8 /* Pods */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tF2428E84328DFA24DFEF0A8B /* Pods-Runner.debug.xcconfig */,\n\t\t\t\t18DE41FC48D4E4A22BB8396E /* Pods-Runner.release.xcconfig */,\n\t\t\t\tD39135D1BCA9F8B2E889A4A7 /* Pods-Runner.profile.xcconfig */,\n\t\t\t\t0AE88D6BF022DF2B961162B1 /* Pods-RunnerTests.debug.xcconfig */,\n\t\t\t\tECE8263C82D7A5EDCDD523B1 /* Pods-RunnerTests.release.xcconfig */,\n\t\t\t\t1FA6A3CB2526375DC4E7577F /* Pods-RunnerTests.profile.xcconfig */,\n\t\t\t);\n\t\t\tname = Pods;\n\t\t\tpath = Pods;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t9740EEB11CF90186004384FC /* Flutter */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */,\n\t\t\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */,\n\t\t\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */,\n\t\t\t\t9740EEB31CF90195004384FC /* Generated.xcconfig */,\n\t\t\t);\n\t\t\tname = Flutter;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146E51CF9000F007C117D = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t9740EEB11CF90186004384FC /* Flutter */,\n\t\t\t\t97C146F01CF9000F007C117D /* Runner */,\n\t\t\t\t97C146EF1CF9000F007C117D /* Products */,\n\t\t\t\t331C8082294A63A400263BE5 /* RunnerTests */,\n\t\t\t\t50F577A9B451352B5312D8B8 /* Pods */,\n\t\t\t\tD7A66A32065C41441BF0E0D3 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\t97C146EF1CF9000F007C117D /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146EE1CF9000F007C117D /* Runner.app */,\n\t\t\t\t331C8081294A63A400263BE5 /* RunnerTests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146F01CF9000F007C117D /* Runner */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146FA1CF9000F007C117D /* Main.storyboard */,\n\t\t\t\t97C146FD1CF9000F007C117D /* Assets.xcassets */,\n\t\t\t\t97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */,\n\t\t\t\t97C147021CF9000F007C117D /* Info.plist */,\n\t\t\t\t1498D2321E8E86230040F4C2 /* GeneratedPluginRegistrant.h */,\n\t\t\t\t1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */,\n\t\t\t\t74858FAE1ED2DC5600515810 /* AppDelegate.swift */,\n\t\t\t\t74858FAD1ED2DC5600515810 /* Runner-Bridging-Header.h */,\n\t\t\t);\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tD7A66A32065C41441BF0E0D3 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tB422E1CC20F2C7BF721B8DEA /* Pods_Runner.framework */,\n\t\t\t\tCD3E5A0B481F8C71365F9259 /* Pods_RunnerTests.framework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t331C8080294A63A400263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 331C8087294A63A400263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t05C536716C891AD06C35ACE8 /* [CP] Check Pods Manifest.lock */,\n\t\t\t\t331C807D294A63A400263BE5 /* Sources */,\n\t\t\t\t331C807F294A63A400263BE5 /* Resources */,\n\t\t\t\t370CDD7E022C5FF755B5EF47 /* Frameworks */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t331C8086294A63A400263BE5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = RunnerTests;\n\t\t\tproductName = 
RunnerTests;\n\t\t\tproductReference = 331C8081294A63A400263BE5 /* RunnerTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\t97C146ED1CF9000F007C117D /* Runner */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 97C147051CF9000F007C117D /* Build configuration list for PBXNativeTarget \"Runner\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t7BF04CD64B1097AB8C6E66EA /* [CP] Check Pods Manifest.lock */,\n\t\t\t\t9740EEB61CF901F6004384FC /* Run Script */,\n\t\t\t\t97C146EA1CF9000F007C117D /* Sources */,\n\t\t\t\t97C146EB1CF9000F007C117D /* Frameworks */,\n\t\t\t\t97C146EC1CF9000F007C117D /* Resources */,\n\t\t\t\t9705A1C41CF9048500538489 /* Embed Frameworks */,\n\t\t\t\t3B06AD1E1E4923F5004D2608 /* Thin Binary */,\n\t\t\t\tE862F7828A330E975EF6E1F9 /* [CP] Embed Pods Frameworks */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = Runner;\n\t\t\tproductName = Runner;\n\t\t\tproductReference = 97C146EE1CF9000F007C117D /* Runner.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t97C146E61CF9000F007C117D /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = YES;\n\t\t\t\tLastUpgradeCheck = 1510;\n\t\t\t\tORGANIZATIONNAME = \"\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t331C8080294A63A400263BE5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.0;\n\t\t\t\t\t\tTestTargetID = 97C146ED1CF9000F007C117D;\n\t\t\t\t\t};\n\t\t\t\t\t97C146ED1CF9000F007C117D = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 7.3.1;\n\t\t\t\t\t\tLastSwiftMigration = 1100;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 97C146E91CF9000F007C117D /* Build configuration list for PBXProject \"Runner\" */;\n\t\t\tcompatibilityVersion = \"Xcode 9.3\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = 
(\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = 97C146E51CF9000F007C117D;\n\t\t\tproductRefGroup = 97C146EF1CF9000F007C117D /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t97C146ED1CF9000F007C117D /* Runner */,\n\t\t\t\t331C8080294A63A400263BE5 /* RunnerTests */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\t331C807F294A63A400263BE5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t97C146EC1CF9000F007C117D /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t97C147011CF9000F007C117D /* LaunchScreen.storyboard in Resources */,\n\t\t\t\t3B3967161E833CAA004F5970 /* AppFrameworkInfo.plist in Resources */,\n\t\t\t\t97C146FE1CF9000F007C117D /* Assets.xcassets in Resources */,\n\t\t\t\t97C146FC1CF9000F007C117D /* Main.storyboard in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXShellScriptBuildPhase section */\n\t\t05C536716C891AD06C35ACE8 /* [CP] Check Pods Manifest.lock */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\",\n\t\t\t\t\"${PODS_ROOT}/Manifest.lock\",\n\t\t\t);\n\t\t\tname = \"[CP] Check Pods Manifest.lock\";\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t\t\"$(DERIVED_FILE_DIR)/Pods-RunnerTests-checkManifestLockResult.txt\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"diff \\\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\\\" \\\"${PODS_ROOT}/Manifest.lock\\\" > /dev/null\\nif [ $? 
!= 0 ] ; then\\n    # print error to STDERR\\n    echo \\\"error: The sandbox is not in sync with the Podfile.lock. Run 'pod install' or update your CocoaPods installation.\\\" >&2\\n    exit 1\\nfi\\n# This output is used by Xcode 'outputs' to avoid re-running this script phase.\\necho \\\"SUCCESS\\\" > \\\"${SCRIPT_OUTPUT_FILE_0}\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n\t\t3B06AD1E1E4923F5004D2608 /* Thin Binary */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${TARGET_BUILD_DIR}/${INFOPLIST_PATH}\",\n\t\t\t);\n\t\t\tname = \"Thin Binary\";\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"/bin/sh \\\"$FLUTTER_ROOT/packages/flutter_tools/bin/xcode_backend.sh\\\" embed_and_thin\";\n\t\t};\n\t\t7BF04CD64B1097AB8C6E66EA /* [CP] Check Pods Manifest.lock */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\",\n\t\t\t\t\"${PODS_ROOT}/Manifest.lock\",\n\t\t\t);\n\t\t\tname = \"[CP] Check Pods Manifest.lock\";\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t\t\"$(DERIVED_FILE_DIR)/Pods-Runner-checkManifestLockResult.txt\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"diff \\\"${PODS_PODFILE_DIR_PATH}/Podfile.lock\\\" \\\"${PODS_ROOT}/Manifest.lock\\\" > /dev/null\\nif [ $? != 0 ] ; then\\n    # print error to STDERR\\n    echo \\\"error: The sandbox is not in sync with the Podfile.lock. 
Run 'pod install' or update your CocoaPods installation.\\\" >&2\\n    exit 1\\nfi\\n# This output is used by Xcode 'outputs' to avoid re-running this script phase.\\necho \\\"SUCCESS\\\" > \\\"${SCRIPT_OUTPUT_FILE_0}\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n\t\t9740EEB61CF901F6004384FC /* Run Script */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t);\n\t\t\tname = \"Run Script\";\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"/bin/sh \\\"$FLUTTER_ROOT/packages/flutter_tools/bin/xcode_backend.sh\\\" build\";\n\t\t};\n\t\tE862F7828A330E975EF6E1F9 /* [CP] Embed Pods Frameworks */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t\t\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks-${CONFIGURATION}-input-files.xcfilelist\",\n\t\t\t);\n\t\t\tname = \"[CP] Embed Pods Frameworks\";\n\t\t\toutputFileListPaths = (\n\t\t\t\t\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks-${CONFIGURATION}-output-files.xcfilelist\",\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"\\\"${PODS_ROOT}/Target Support Files/Pods-Runner/Pods-Runner-frameworks.sh\\\"\\n\";\n\t\t\tshowEnvVarsInLog = 0;\n\t\t};\n/* End PBXShellScriptBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t331C807D294A63A400263BE5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t331C808B294A63AB00263BE5 /* RunnerTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t97C146EA1CF9000F007C117D /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = 
(\n\t\t\t\t74858FAF1ED2DC5600515810 /* AppDelegate.swift in Sources */,\n\t\t\t\t1498D2341E8E89220040F4C2 /* GeneratedPluginRegistrant.m in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\t331C8086294A63A400263BE5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 97C146ED1CF9000F007C117D /* Runner */;\n\t\t\ttargetProxy = 331C8085294A63A400263BE5 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\t97C146FA1CF9000F007C117D /* Main.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146FB1CF9000F007C117D /* Base */,\n\t\t\t);\n\t\t\tname = Main.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t97C147001CF9000F007C117D /* Base */,\n\t\t\t);\n\t\t\tname = LaunchScreen.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin XCBuildConfiguration section */\n\t\t249021D3217E4FDB00AE95B9 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = 
YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSUPPORTED_PLATFORMS = iphoneos;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t249021D4217E4FDB00AE95B9 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = 
(\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t331C8088294A63A400263BE5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 0AE88D6BF022DF2B961162B1 /* Pods-RunnerTests.debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t331C8089294A63A400263BE5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = ECE8263C82D7A5EDCDD523B1 /* Pods-RunnerTests.release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = 
Release;\n\t\t};\n\t\t331C808A294A63A400263BE5 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 1FA6A3CB2526375DC4E7577F /* Pods-RunnerTests.profile.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t97C147031CF9000F007C117D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = 
YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t97C147041CF9000F007C117D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = 
YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSUPPORTED_PLATFORMS = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t97C147061CF9000F007C117D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 9740EEB21CF90195004384FC /* Debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = 
(\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t97C147071CF9000F007C117D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.streamingAsr;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t331C8087294A63A400263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t331C8088294A63A400263BE5 /* Debug */,\n\t\t\t\t331C8089294A63A400263BE5 /* Release */,\n\t\t\t\t331C808A294A63A400263BE5 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t97C146E91CF9000F007C117D /* Build configuration list for PBXProject \"Runner\" */ = 
{\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t97C147031CF9000F007C117D /* Debug */,\n\t\t\t\t97C147041CF9000F007C117D /* Release */,\n\t\t\t\t249021D3217E4FDB00AE95B9 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t97C147051CF9000F007C117D /* Build configuration list for PBXNativeTarget \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t97C147061CF9000F007C117D /* Debug */,\n\t\t\t\t97C147071CF9000F007C117D /* Release */,\n\t\t\t\t249021D4217E4FDB00AE95B9 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 97C146E61CF9000F007C117D /* Project object */;\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcodeproj/project.xcworkspace/xcshareddata/WorkspaceSettings.xcsettings",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>PreviewsEnabled</key>\n\t<false/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcodeproj/xcshareddata/xcschemes/Runner.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1510\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n               BuildableName = \"Runner.app\"\n               BlueprintName = \"Runner\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <MacroExpansion>\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </MacroExpansion>\n      <Testables>\n         <TestableReference\n            skipped = \"NO\"\n            parallelizable = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"331C8080294A63A400263BE5\"\n               BuildableName = \"RunnerTests.xctest\"\n               BlueprintName = \"RunnerTests\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            
</BuildableReference>\n         </TestableReference>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"NO\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Profile\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"group:Runner.xcodeproj\">\n   </FileRef>\n   <FileRef\n      location = \"group:Pods/Pods.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/Runner.xcworkspace/xcshareddata/WorkspaceSettings.xcsettings",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>PreviewsEnabled</key>\n\t<false/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/ios/RunnerTests/RunnerTests.swift",
    "content": "import Flutter\nimport UIKit\nimport XCTest\n\nclass RunnerTests: XCTestCase {\n\n  func testExample() {\n    // If you add code to the Runner application, consider adding tests here.\n    // See https://developer.apple.com/documentation/xctest for more information about using XCTest.\n  }\n\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/lib/info.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\nimport 'package:url_launcher/url_launcher.dart';\n\nclass InfoScreen extends StatelessWidget {\n  @override\n  Widget build(BuildContext context) {\n    const double height = 20;\n    return Container(\n      child: Padding(\n        padding: const EdgeInsets.all(8.0),\n        child: Column(\n          crossAxisAlignment: CrossAxisAlignment.start,\n          children: <Widget>[\n            Text('Everything is open-sourced.'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Code: https://github.com/k2-fsa/sherpa-onnx'),\n              onTap: () => launch('https://k2-fsa.github.io/sherpa/onnx/'),\n            ),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Doc: https://k2-fsa.github.io/sherpa/onnx/'),\n              onTap: () => launch('https://k2-fsa.github.io/sherpa/onnx/'),\n            ),\n            SizedBox(height: height),\n            Text('QQ 群: 744602236'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text(\n                  '微信群: https://k2-fsa.github.io/sherpa/social-groups.html'),\n              onTap: () =>\n                  launch('https://k2-fsa.github.io/sherpa/social-groups.html'),\n            ),\n          ],\n        ),\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/lib/main.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\n\nimport './streaming_asr.dart';\nimport './info.dart';\n\nvoid main() {\n  runApp(const MyApp());\n}\n\nclass MyApp extends StatelessWidget {\n  const MyApp({super.key});\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      title: 'Next-gen Kaldi flutter demo',\n      theme: ThemeData(\n        colorScheme: ColorScheme.fromSeed(seedColor: Colors.deepPurple),\n        useMaterial3: true,\n      ),\n      home: const MyHomePage(title: 'Next-gen Kaldi with Flutter'),\n    );\n  }\n}\n\nclass MyHomePage extends StatefulWidget {\n  const MyHomePage({super.key, required this.title});\n\n  final String title;\n\n  @override\n  State<MyHomePage> createState() => _MyHomePageState();\n}\n\nclass _MyHomePageState extends State<MyHomePage> {\n  int _currentIndex = 0;\n  final List<Widget> _tabs = [\n    StreamingAsrScreen(),\n    InfoScreen(),\n  ];\n  @override\n  Widget build(BuildContext context) {\n    return Scaffold(\n      appBar: AppBar(\n        title: Text(widget.title),\n      ),\n      body: _tabs[_currentIndex],\n      bottomNavigationBar: BottomNavigationBar(\n        currentIndex: _currentIndex,\n        onTap: (int index) {\n          setState(() {\n            _currentIndex = index;\n          });\n        },\n        items: [\n          BottomNavigationBarItem(\n            icon: Icon(Icons.home),\n            label: 'Home',\n          ),\n          BottomNavigationBarItem(\n            icon: Icon(Icons.info),\n            label: 'Info',\n          ),\n        ],\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/lib/online_model.dart",
    "content": "import 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\nimport './utils.dart';\n\n// Remember to change `assets` in ../pubspec.yaml\n// and download files to ../assets\nFuture<sherpa_onnx.OnlineModelConfig> getOnlineModelConfig(\n    {required int type}) async {\n  switch (type) {\n    case 0:\n      final modelDir =\n          'assets/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20';\n      return sherpa_onnx.OnlineModelConfig(\n        transducer: sherpa_onnx.OnlineTransducerModelConfig(\n          encoder:\n              await copyAssetFile('$modelDir/encoder-epoch-99-avg-1.int8.onnx'),\n          decoder: await copyAssetFile('$modelDir/decoder-epoch-99-avg-1.onnx'),\n          joiner: await copyAssetFile('$modelDir/joiner-epoch-99-avg-1.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/tokens.txt'),\n        modelType: 'zipformer',\n      );\n    case 1:\n      final modelDir = 'assets/sherpa-onnx-streaming-zipformer-en-2023-06-26';\n      return sherpa_onnx.OnlineModelConfig(\n        transducer: sherpa_onnx.OnlineTransducerModelConfig(\n          encoder: await copyAssetFile(\n              '$modelDir/encoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx'),\n          decoder: await copyAssetFile(\n              '$modelDir/decoder-epoch-99-avg-1-chunk-16-left-128.onnx'),\n          joiner: await copyAssetFile(\n              '$modelDir/joiner-epoch-99-avg-1-chunk-16-left-128.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/tokens.txt'),\n        modelType: 'zipformer2',\n      );\n    case 2:\n      final modelDir =\n          'assets/icefall-asr-zipformer-streaming-wenetspeech-20230615';\n      return sherpa_onnx.OnlineModelConfig(\n        transducer: sherpa_onnx.OnlineTransducerModelConfig(\n          encoder: await copyAssetFile(\n              '$modelDir/exp/encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx'),\n          decoder: await copyAssetFile(\n              
'$modelDir/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx'),\n          joiner: await copyAssetFile(\n              '$modelDir/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/data/lang_char/tokens.txt'),\n        modelType: 'zipformer2',\n      );\n    case 3:\n      final modelDir = 'assets/sherpa-onnx-streaming-zipformer-fr-2023-04-14';\n      return sherpa_onnx.OnlineModelConfig(\n        transducer: sherpa_onnx.OnlineTransducerModelConfig(\n          encoder: await copyAssetFile(\n              '$modelDir/encoder-epoch-29-avg-9-with-averaged-model.int8.onnx'),\n          decoder: await copyAssetFile(\n              '$modelDir/decoder-epoch-29-avg-9-with-averaged-model.onnx'),\n          joiner: await copyAssetFile(\n              '$modelDir/joiner-epoch-29-avg-9-with-averaged-model.onnx'),\n        ),\n        tokens: await copyAssetFile('$modelDir/tokens.txt'),\n        modelType: 'zipformer',\n      );\n    default:\n      throw ArgumentError('Unsupported type: $type');\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/lib/streaming_asr.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:async';\n\nimport 'package:flutter/foundation.dart';\nimport 'package:flutter/material.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:path_provider/path_provider.dart';\nimport 'package:record/record.dart';\n\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './utils.dart';\nimport './online_model.dart';\n\nFuture<sherpa_onnx.OnlineRecognizer> createOnlineRecognizer() async {\n  final type = 0;\n\n  final modelConfig = await getOnlineModelConfig(type: type);\n  final config = sherpa_onnx.OnlineRecognizerConfig(\n    model: modelConfig,\n    ruleFsts: '',\n  );\n\n  return sherpa_onnx.OnlineRecognizer(config);\n}\n\nclass StreamingAsrScreen extends StatefulWidget {\n  const StreamingAsrScreen({super.key});\n\n  @override\n  State<StreamingAsrScreen> createState() => _StreamingAsrScreenState();\n}\n\nclass _StreamingAsrScreenState extends State<StreamingAsrScreen> {\n  late final TextEditingController _controller;\n  late final AudioRecorder _audioRecorder;\n\n  String _title = 'Real-time speech recognition';\n  String _last = '';\n  int _index = 0;\n  bool _isInitialized = false;\n\n  sherpa_onnx.OnlineRecognizer? _recognizer;\n  sherpa_onnx.OnlineStream? _stream;\n  int _sampleRate = 16000;\n\n  StreamSubscription<RecordState>? 
_recordSub;\n  RecordState _recordState = RecordState.stop;\n\n  @override\n  void initState() {\n    _audioRecorder = AudioRecorder();\n    _controller = TextEditingController();\n\n    _recordSub = _audioRecorder.onStateChanged().listen((recordState) {\n      _updateRecordState(recordState);\n    });\n\n    super.initState();\n  }\n\n  Future<void> _start() async {\n    if (!_isInitialized) {\n      sherpa_onnx.initBindings();\n      _recognizer = await createOnlineRecognizer();\n      _stream = _recognizer?.createStream();\n\n      _isInitialized = true;\n    }\n\n    try {\n      if (await _audioRecorder.hasPermission()) {\n        const encoder = AudioEncoder.pcm16bits;\n\n        if (!await _isEncoderSupported(encoder)) {\n          return;\n        }\n\n        final devs = await _audioRecorder.listInputDevices();\n        debugPrint(devs.toString());\n\n        const config = RecordConfig(\n          encoder: encoder,\n          sampleRate: 16000,\n          numChannels: 1,\n        );\n\n        final stream = await _audioRecorder.startStream(config);\n\n        stream.listen(\n          (data) {\n            final samplesFloat32 =\n                convertBytesToFloat32(Uint8List.fromList(data));\n\n            _stream!.acceptWaveform(\n                samples: samplesFloat32, sampleRate: _sampleRate);\n            while (_recognizer!.isReady(_stream!)) {\n              _recognizer!.decode(_stream!);\n            }\n            final text = _recognizer!.getResult(_stream!).text;\n            String textToDisplay = _last;\n            if (text != '') {\n              if (_last == '') {\n                textToDisplay = '$_index: $text';\n              } else {\n                textToDisplay = '$_index: $text\\n$_last';\n              }\n            }\n\n            if (_recognizer!.isEndpoint(_stream!)) {\n              _recognizer!.reset(_stream!);\n              if (text != '') {\n                _last = textToDisplay;\n                _index += 1;\n       
       }\n            }\n            // print('text: $textToDisplay');\n\n            _controller.value = TextEditingValue(\n              text: textToDisplay,\n              selection: TextSelection.collapsed(offset: textToDisplay.length),\n            );\n          },\n          onDone: () {\n            print('stream stopped.');\n          },\n        );\n      }\n    } catch (e) {\n      print(e);\n    }\n  }\n\n  Future<void> _stop() async {\n    _stream!.free();\n    _stream = _recognizer!.createStream();\n\n    await _audioRecorder.stop();\n  }\n\n  Future<void> _pause() => _audioRecorder.pause();\n\n  Future<void> _resume() => _audioRecorder.resume();\n\n  void _updateRecordState(RecordState recordState) {\n    setState(() => _recordState = recordState);\n  }\n\n  Future<bool> _isEncoderSupported(AudioEncoder encoder) async {\n    final isSupported = await _audioRecorder.isEncoderSupported(\n      encoder,\n    );\n\n    if (!isSupported) {\n      debugPrint('${encoder.name} is not supported on this platform.');\n      debugPrint('Supported encoders are:');\n\n      for (final e in AudioEncoder.values) {\n        if (await _audioRecorder.isEncoderSupported(e)) {\n          debugPrint('- ${e.name}');\n        }\n      }\n    }\n\n    return isSupported;\n  }\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      home: Scaffold(\n        appBar: AppBar(\n          title: Text(_title),\n        ),\n        body: Column(\n          mainAxisAlignment: MainAxisAlignment.center,\n          children: [\n            const SizedBox(height: 50),\n            TextField(\n              maxLines: 5,\n              controller: _controller,\n              readOnly: true,\n            ),\n            const SizedBox(height: 50),\n            Row(\n              mainAxisAlignment: MainAxisAlignment.center,\n              children: <Widget>[\n                _buildRecordStopControl(),\n                const SizedBox(width: 20),\n           
     _buildText(),\n              ],\n            ),\n          ],\n        ),\n      ),\n    );\n  }\n\n  @override\n  void dispose() {\n    _recordSub?.cancel();\n    _audioRecorder.dispose();\n    _stream?.free();\n    _recognizer?.free();\n    super.dispose();\n  }\n\n  Widget _buildRecordStopControl() {\n    late Icon icon;\n    late Color color;\n\n    if (_recordState != RecordState.stop) {\n      icon = const Icon(Icons.stop, color: Colors.red, size: 30);\n      color = Colors.red.withOpacity(0.1);\n    } else {\n      final theme = Theme.of(context);\n      icon = Icon(Icons.mic, color: theme.primaryColor, size: 30);\n      color = theme.primaryColor.withOpacity(0.1);\n    }\n\n    return ClipOval(\n      child: Material(\n        color: color,\n        child: InkWell(\n          child: SizedBox(width: 56, height: 56, child: icon),\n          onTap: () {\n            (_recordState != RecordState.stop) ? _stop() : _start();\n          },\n        ),\n      ),\n    );\n  }\n\n  Widget _buildText() {\n    if (_recordState == RecordState.stop) {\n      return const Text(\"Start\");\n    } else {\n      return const Text(\"Stop\");\n    }\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/lib/utils.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:path/path.dart';\nimport 'package:path_provider/path_provider.dart';\nimport 'package:flutter/services.dart' show rootBundle;\nimport 'dart:typed_data';\nimport \"dart:io\";\n\n// Copy the asset file from src to dst\nFuture<String> copyAssetFile(String src, [String? dst]) async {\n  final Directory directory = await getApplicationSupportDirectory();\n  if (dst == null) {\n    dst = basename(src);\n  }\n  final target = join(directory.path, dst);\n  bool exists = await new File(target).exists();\n\n  final data = await rootBundle.load(src);\n\n  if (!exists || File(target).lengthSync() != data.lengthInBytes) {\n    final List<int> bytes =\n        data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);\n    await File(target).writeAsBytes(bytes);\n  }\n\n  return target;\n}\n\nFloat32List convertBytesToFloat32(Uint8List bytes, [endian = Endian.little]) {\n  final values = Float32List(bytes.length ~/ 2);\n\n  final data = ByteData.view(bytes.buffer);\n\n  for (var i = 0; i < bytes.length; i += 2) {\n    int short = data.getInt16(i, endian);\n    values[i ~/ 2] = short / 32768.0;\n  }\n\n  return values;\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/.gitignore",
    "content": "flutter/ephemeral\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/CMakeLists.txt",
    "content": "# Project-level configuration.\ncmake_minimum_required(VERSION 3.10)\nproject(runner LANGUAGES CXX)\n\n# The name of the executable created for the application. Change this to change\n# the on-disk name of your application.\nset(BINARY_NAME \"streaming_asr\")\n# The unique GTK application identifier for this application. See:\n# https://wiki.gnome.org/HowDoI/ChooseApplicationID\nset(APPLICATION_ID \"com.k2fsa.sherpa.onnx.streaming_asr\")\n\n# Explicitly opt in to modern CMake behaviors to avoid warnings with recent\n# versions of CMake.\ncmake_policy(SET CMP0063 NEW)\n\n# Load bundled libraries from the lib/ directory relative to the binary.\nset(CMAKE_INSTALL_RPATH \"$ORIGIN/lib\")\n\n# Root filesystem for cross-building.\nif(FLUTTER_TARGET_PLATFORM_SYSROOT)\n  set(CMAKE_SYSROOT ${FLUTTER_TARGET_PLATFORM_SYSROOT})\n  set(CMAKE_FIND_ROOT_PATH ${CMAKE_SYSROOT})\n  set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)\n  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\n  set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\nendif()\n\n# Define build configuration options.\nif(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)\n  set(CMAKE_BUILD_TYPE \"Debug\" CACHE\n    STRING \"Flutter build mode\" FORCE)\n  set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS\n    \"Debug\" \"Profile\" \"Release\")\nendif()\n\n# Compilation settings that should be applied to most targets.\n#\n# Be cautious about adding new options here, as plugins use this function by\n# default. 
In most cases, you should add new options to specific targets instead\n# of modifying this function.\nfunction(APPLY_STANDARD_SETTINGS TARGET)\n  target_compile_features(${TARGET} PUBLIC cxx_std_14)\n  target_compile_options(${TARGET} PRIVATE -Wall -Werror)\n  target_compile_options(${TARGET} PRIVATE \"$<$<NOT:$<CONFIG:Debug>>:-O3>\")\n  target_compile_definitions(${TARGET} PRIVATE \"$<$<NOT:$<CONFIG:Debug>>:NDEBUG>\")\nendfunction()\n\n# Flutter library and tool build rules.\nset(FLUTTER_MANAGED_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/flutter\")\nadd_subdirectory(${FLUTTER_MANAGED_DIR})\n\n# System-level dependencies.\nfind_package(PkgConfig REQUIRED)\npkg_check_modules(GTK REQUIRED IMPORTED_TARGET gtk+-3.0)\n\nadd_definitions(-DAPPLICATION_ID=\"${APPLICATION_ID}\")\n\n# Define the application target. To change its name, change BINARY_NAME above,\n# not the value here, or `flutter run` will no longer work.\n#\n# Any new source files that you add to the application should be added here.\nadd_executable(${BINARY_NAME}\n  \"main.cc\"\n  \"my_application.cc\"\n  \"${FLUTTER_MANAGED_DIR}/generated_plugin_registrant.cc\"\n)\n\n# Apply the standard set of build settings. This can be removed for applications\n# that need different build settings.\napply_standard_settings(${BINARY_NAME})\n\n# Add dependency libraries. Add any application-specific dependencies here.\ntarget_link_libraries(${BINARY_NAME} PRIVATE flutter)\ntarget_link_libraries(${BINARY_NAME} PRIVATE PkgConfig::GTK)\n\n# Run the Flutter tool portions of the build. This must not be removed.\nadd_dependencies(${BINARY_NAME} flutter_assemble)\n\n# Only the install-generated bundle's copy of the executable will launch\n# correctly, since the resources must be in the right relative locations. 
To avoid\n# people trying to run the unbundled copy, put it in a subdirectory instead of\n# the default top-level location.\nset_target_properties(${BINARY_NAME}\n  PROPERTIES\n  RUNTIME_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/intermediates_do_not_run\"\n)\n\n\n# Generated plugin build rules, which manage building the plugins and adding\n# them to the application.\ninclude(flutter/generated_plugins.cmake)\n\n\n# === Installation ===\n# By default, \"installing\" just makes a relocatable bundle in the build\n# directory.\nset(BUILD_BUNDLE_DIR \"${PROJECT_BINARY_DIR}/bundle\")\nif(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)\n  set(CMAKE_INSTALL_PREFIX \"${BUILD_BUNDLE_DIR}\" CACHE PATH \"...\" FORCE)\nendif()\n\n# Start with a clean build bundle directory every time.\ninstall(CODE \"\n  file(REMOVE_RECURSE \\\"${BUILD_BUNDLE_DIR}/\\\")\n  \" COMPONENT Runtime)\n\nset(INSTALL_BUNDLE_DATA_DIR \"${CMAKE_INSTALL_PREFIX}/data\")\nset(INSTALL_BUNDLE_LIB_DIR \"${CMAKE_INSTALL_PREFIX}/lib\")\n\ninstall(TARGETS ${BINARY_NAME} RUNTIME DESTINATION \"${CMAKE_INSTALL_PREFIX}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_ICU_DATA_FILE}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n  COMPONENT Runtime)\n\nforeach(bundled_library ${PLUGIN_BUNDLED_LIBRARIES})\n  install(FILES \"${bundled_library}\"\n    DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n    COMPONENT Runtime)\nendforeach(bundled_library)\n\n# Copy the native assets provided by the build.dart from all packages.\nset(NATIVE_ASSETS_DIR \"${PROJECT_BUILD_DIR}native_assets/linux/\")\ninstall(DIRECTORY \"${NATIVE_ASSETS_DIR}\"\n   DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n   COMPONENT Runtime)\n\n# Fully re-copy the assets directory on each build to avoid having stale files\n# from a previous install.\nset(FLUTTER_ASSET_DIR_NAME \"flutter_assets\")\ninstall(CODE \"\n  file(REMOVE_RECURSE 
\\\"${INSTALL_BUNDLE_DATA_DIR}/${FLUTTER_ASSET_DIR_NAME}\\\")\n  \" COMPONENT Runtime)\ninstall(DIRECTORY \"${PROJECT_BUILD_DIR}/${FLUTTER_ASSET_DIR_NAME}\"\n  DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\" COMPONENT Runtime)\n\n# Install the AOT library on non-Debug builds only.\nif(NOT CMAKE_BUILD_TYPE MATCHES \"Debug\")\n  install(FILES \"${AOT_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n    COMPONENT Runtime)\nendif()\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/flutter/CMakeLists.txt",
    "content": "# This file controls Flutter-level build steps. It should not be edited.\ncmake_minimum_required(VERSION 3.10)\n\nset(EPHEMERAL_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/ephemeral\")\n\n# Configuration provided via flutter tool.\ninclude(${EPHEMERAL_DIR}/generated_config.cmake)\n\n# TODO: Move the rest of this into files in ephemeral. See\n# https://github.com/flutter/flutter/issues/57146.\n\n# Serves the same purpose as list(TRANSFORM ... PREPEND ...),\n# which isn't available in 3.10.\nfunction(list_prepend LIST_NAME PREFIX)\n    set(NEW_LIST \"\")\n    foreach(element ${${LIST_NAME}})\n        list(APPEND NEW_LIST \"${PREFIX}${element}\")\n    endforeach(element)\n    set(${LIST_NAME} \"${NEW_LIST}\" PARENT_SCOPE)\nendfunction()\n\n# === Flutter Library ===\n# System-level dependencies.\nfind_package(PkgConfig REQUIRED)\npkg_check_modules(GTK REQUIRED IMPORTED_TARGET gtk+-3.0)\npkg_check_modules(GLIB REQUIRED IMPORTED_TARGET glib-2.0)\npkg_check_modules(GIO REQUIRED IMPORTED_TARGET gio-2.0)\n\nset(FLUTTER_LIBRARY \"${EPHEMERAL_DIR}/libflutter_linux_gtk.so\")\n\n# Published to parent scope for install step.\nset(FLUTTER_LIBRARY ${FLUTTER_LIBRARY} PARENT_SCOPE)\nset(FLUTTER_ICU_DATA_FILE \"${EPHEMERAL_DIR}/icudtl.dat\" PARENT_SCOPE)\nset(PROJECT_BUILD_DIR \"${PROJECT_DIR}/build/\" PARENT_SCOPE)\nset(AOT_LIBRARY \"${PROJECT_DIR}/build/lib/libapp.so\" PARENT_SCOPE)\n\nlist(APPEND FLUTTER_LIBRARY_HEADERS\n  \"fl_basic_message_channel.h\"\n  \"fl_binary_codec.h\"\n  \"fl_binary_messenger.h\"\n  \"fl_dart_project.h\"\n  \"fl_engine.h\"\n  \"fl_json_message_codec.h\"\n  \"fl_json_method_codec.h\"\n  \"fl_message_codec.h\"\n  \"fl_method_call.h\"\n  \"fl_method_channel.h\"\n  \"fl_method_codec.h\"\n  \"fl_method_response.h\"\n  \"fl_plugin_registrar.h\"\n  \"fl_plugin_registry.h\"\n  \"fl_standard_message_codec.h\"\n  \"fl_standard_method_codec.h\"\n  \"fl_string_codec.h\"\n  \"fl_value.h\"\n  \"fl_view.h\"\n  
\"flutter_linux.h\"\n)\nlist_prepend(FLUTTER_LIBRARY_HEADERS \"${EPHEMERAL_DIR}/flutter_linux/\")\nadd_library(flutter INTERFACE)\ntarget_include_directories(flutter INTERFACE\n  \"${EPHEMERAL_DIR}\"\n)\ntarget_link_libraries(flutter INTERFACE \"${FLUTTER_LIBRARY}\")\ntarget_link_libraries(flutter INTERFACE\n  PkgConfig::GTK\n  PkgConfig::GLIB\n  PkgConfig::GIO\n)\nadd_dependencies(flutter flutter_assemble)\n\n# === Flutter tool backend ===\n# _phony_ is a non-existent file to force this command to run every time,\n# since currently there's no way to get a full input/output list from the\n# flutter tool.\nadd_custom_command(\n  OUTPUT ${FLUTTER_LIBRARY} ${FLUTTER_LIBRARY_HEADERS}\n    ${CMAKE_CURRENT_BINARY_DIR}/_phony_\n  COMMAND ${CMAKE_COMMAND} -E env\n    ${FLUTTER_TOOL_ENVIRONMENT}\n    \"${FLUTTER_ROOT}/packages/flutter_tools/bin/tool_backend.sh\"\n      ${FLUTTER_TARGET_PLATFORM} ${CMAKE_BUILD_TYPE}\n  VERBATIM\n)\nadd_custom_target(flutter_assemble DEPENDS\n  \"${FLUTTER_LIBRARY}\"\n  ${FLUTTER_LIBRARY_HEADERS}\n)\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/main.cc",
    "content": "#include \"my_application.h\"\n\nint main(int argc, char** argv) {\n  g_autoptr(MyApplication) app = my_application_new();\n  return g_application_run(G_APPLICATION(app), argc, argv);\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/my_application.cc",
    "content": "#include \"my_application.h\"\n\n#include <flutter_linux/flutter_linux.h>\n#ifdef GDK_WINDOWING_X11\n#include <gdk/gdkx.h>\n#endif\n\n#include \"flutter/generated_plugin_registrant.h\"\n\nstruct _MyApplication {\n  GtkApplication parent_instance;\n  char** dart_entrypoint_arguments;\n};\n\nG_DEFINE_TYPE(MyApplication, my_application, GTK_TYPE_APPLICATION)\n\n// Implements GApplication::activate.\nstatic void my_application_activate(GApplication* application) {\n  MyApplication* self = MY_APPLICATION(application);\n  GtkWindow* window =\n      GTK_WINDOW(gtk_application_window_new(GTK_APPLICATION(application)));\n\n  // Use a header bar when running in GNOME as this is the common style used\n  // by applications and is the setup most users will be using (e.g. Ubuntu\n  // desktop).\n  // If running on X and not using GNOME then just use a traditional title bar\n  // in case the window manager does more exotic layout, e.g. tiling.\n  // If running on Wayland assume the header bar will work (may need changing\n  // if future cases occur).\n  gboolean use_header_bar = TRUE;\n#ifdef GDK_WINDOWING_X11\n  GdkScreen* screen = gtk_window_get_screen(window);\n  if (GDK_IS_X11_SCREEN(screen)) {\n    const gchar* wm_name = gdk_x11_screen_get_window_manager_name(screen);\n    if (g_strcmp0(wm_name, \"GNOME Shell\") != 0) {\n      use_header_bar = FALSE;\n    }\n  }\n#endif\n  if (use_header_bar) {\n    GtkHeaderBar* header_bar = GTK_HEADER_BAR(gtk_header_bar_new());\n    gtk_widget_show(GTK_WIDGET(header_bar));\n    gtk_header_bar_set_title(header_bar, \"streaming_asr\");\n    gtk_header_bar_set_show_close_button(header_bar, TRUE);\n    gtk_window_set_titlebar(window, GTK_WIDGET(header_bar));\n  } else {\n    gtk_window_set_title(window, \"streaming_asr\");\n  }\n\n  gtk_window_set_default_size(window, 1280, 720);\n  gtk_widget_show(GTK_WIDGET(window));\n\n  g_autoptr(FlDartProject) project = fl_dart_project_new();\n  
fl_dart_project_set_dart_entrypoint_arguments(project, self->dart_entrypoint_arguments);\n\n  FlView* view = fl_view_new(project);\n  gtk_widget_show(GTK_WIDGET(view));\n  gtk_container_add(GTK_CONTAINER(window), GTK_WIDGET(view));\n\n  fl_register_plugins(FL_PLUGIN_REGISTRY(view));\n\n  gtk_widget_grab_focus(GTK_WIDGET(view));\n}\n\n// Implements GApplication::local_command_line.\nstatic gboolean my_application_local_command_line(GApplication* application, gchar*** arguments, int* exit_status) {\n  MyApplication* self = MY_APPLICATION(application);\n  // Strip out the first argument as it is the binary name.\n  self->dart_entrypoint_arguments = g_strdupv(*arguments + 1);\n\n  g_autoptr(GError) error = nullptr;\n  if (!g_application_register(application, nullptr, &error)) {\n     g_warning(\"Failed to register: %s\", error->message);\n     *exit_status = 1;\n     return TRUE;\n  }\n\n  g_application_activate(application);\n  *exit_status = 0;\n\n  return TRUE;\n}\n\n// Implements GApplication::startup.\nstatic void my_application_startup(GApplication* application) {\n  //MyApplication* self = MY_APPLICATION(object);\n\n  // Perform any actions required at application startup.\n\n  G_APPLICATION_CLASS(my_application_parent_class)->startup(application);\n}\n\n// Implements GApplication::shutdown.\nstatic void my_application_shutdown(GApplication* application) {\n  //MyApplication* self = MY_APPLICATION(object);\n\n  // Perform any actions required at application shutdown.\n\n  G_APPLICATION_CLASS(my_application_parent_class)->shutdown(application);\n}\n\n// Implements GObject::dispose.\nstatic void my_application_dispose(GObject* object) {\n  MyApplication* self = MY_APPLICATION(object);\n  g_clear_pointer(&self->dart_entrypoint_arguments, g_strfreev);\n  G_OBJECT_CLASS(my_application_parent_class)->dispose(object);\n}\n\nstatic void my_application_class_init(MyApplicationClass* klass) {\n  G_APPLICATION_CLASS(klass)->activate = my_application_activate;\n  
G_APPLICATION_CLASS(klass)->local_command_line = my_application_local_command_line;\n  G_APPLICATION_CLASS(klass)->startup = my_application_startup;\n  G_APPLICATION_CLASS(klass)->shutdown = my_application_shutdown;\n  G_OBJECT_CLASS(klass)->dispose = my_application_dispose;\n}\n\nstatic void my_application_init(MyApplication* self) {}\n\nMyApplication* my_application_new() {\n  return MY_APPLICATION(g_object_new(my_application_get_type(),\n                                     \"application-id\", APPLICATION_ID,\n                                     \"flags\", G_APPLICATION_NON_UNIQUE,\n                                     nullptr));\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/linux/my_application.h",
    "content": "#ifndef FLUTTER_MY_APPLICATION_H_\n#define FLUTTER_MY_APPLICATION_H_\n\n#include <gtk/gtk.h>\n\nG_DECLARE_FINAL_TYPE(MyApplication, my_application, MY, APPLICATION,\n                     GtkApplication)\n\n/**\n * my_application_new:\n *\n * Creates a new Flutter-based application.\n *\n * Returns: a new #MyApplication.\n */\nMyApplication* my_application_new();\n\n#endif  // FLUTTER_MY_APPLICATION_H_\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/.gitignore",
    "content": "# Flutter-related\n**/Flutter/ephemeral/\n**/Pods/\n\n# Xcode-related\n**/dgph\n**/xcuserdata/\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Flutter/Flutter-Debug.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.debug.xcconfig\"\n#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Flutter/Flutter-Release.xcconfig",
    "content": "#include? \"Pods/Target Support Files/Pods-Runner/Pods-Runner.release.xcconfig\"\n#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/AppDelegate.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\n@NSApplicationMain\nclass AppDelegate: FlutterAppDelegate {\n  override func applicationShouldTerminateAfterLastWindowClosed(_ sender: NSApplication) -> Bool {\n    return true\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_16.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_64.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_128.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_1024.png\",\n      \"scale\" : \"2x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Base.lproj/MainMenu.xib",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<document type=\"com.apple.InterfaceBuilder3.Cocoa.XIB\" version=\"3.0\" toolsVersion=\"14490.70\" targetRuntime=\"MacOSX.Cocoa\" propertyAccessControl=\"none\" useAutolayout=\"YES\" customObjectInstantitationMethod=\"direct\">\n    <dependencies>\n        <deployment identifier=\"macosx\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.CocoaPlugin\" version=\"14490.70\"/>\n        <capability name=\"documents saved in the Xcode 8 format\" minToolsVersion=\"8.0\"/>\n    </dependencies>\n    <objects>\n        <customObject id=\"-2\" userLabel=\"File's Owner\" customClass=\"NSApplication\">\n            <connections>\n                <outlet property=\"delegate\" destination=\"Voe-Tx-rLC\" id=\"GzC-gU-4Uq\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"-1\" userLabel=\"First Responder\" customClass=\"FirstResponder\"/>\n        <customObject id=\"-3\" userLabel=\"Application\" customClass=\"NSObject\"/>\n        <customObject id=\"Voe-Tx-rLC\" customClass=\"AppDelegate\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <connections>\n                <outlet property=\"applicationMenu\" destination=\"uQy-DD-JDr\" id=\"XBo-yE-nKs\"/>\n                <outlet property=\"mainFlutterWindow\" destination=\"QvC-M9-y7g\" id=\"gIp-Ho-8D9\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"YLy-65-1bz\" customClass=\"NSFontManager\"/>\n        <menu title=\"Main Menu\" systemMenu=\"main\" id=\"AYu-sK-qS6\">\n            <items>\n                <menuItem title=\"APP_NAME\" id=\"1Xt-HY-uBw\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"APP_NAME\" systemMenu=\"apple\" id=\"uQy-DD-JDr\">\n                        <items>\n                            <menuItem title=\"About APP_NAME\" id=\"5kV-Vb-QxS\">\n                                
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"orderFrontStandardAboutPanel:\" target=\"-1\" id=\"Exp-CZ-Vem\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"VOq-y0-SEH\"/>\n                            <menuItem title=\"Preferences…\" keyEquivalent=\",\" id=\"BOF-NM-1cW\"/>\n                            <menuItem isSeparatorItem=\"YES\" id=\"wFC-TO-SCJ\"/>\n                            <menuItem title=\"Services\" id=\"NMo-om-nkz\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Services\" systemMenu=\"services\" id=\"hz9-B4-Xy5\"/>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"4je-JR-u6R\"/>\n                            <menuItem title=\"Hide APP_NAME\" keyEquivalent=\"h\" id=\"Olw-nP-bQN\">\n                                <connections>\n                                    <action selector=\"hide:\" target=\"-1\" id=\"PnN-Uc-m68\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Hide Others\" keyEquivalent=\"h\" id=\"Vdr-fp-XzO\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"hideOtherApplications:\" target=\"-1\" id=\"VT4-aY-XCT\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Show All\" id=\"Kd2-mp-pUS\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n             
                       <action selector=\"unhideAllApplications:\" target=\"-1\" id=\"Dhg-Le-xox\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"kCx-OE-vgT\"/>\n                            <menuItem title=\"Quit APP_NAME\" keyEquivalent=\"q\" id=\"4sb-4s-VLi\">\n                                <connections>\n                                    <action selector=\"terminate:\" target=\"-1\" id=\"Te7-pn-YzF\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Edit\" id=\"5QF-Oa-p0T\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Edit\" id=\"W48-6f-4Dl\">\n                        <items>\n                            <menuItem title=\"Undo\" keyEquivalent=\"z\" id=\"dRJ-4n-Yzg\">\n                                <connections>\n                                    <action selector=\"undo:\" target=\"-1\" id=\"M6e-cu-g7V\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Redo\" keyEquivalent=\"Z\" id=\"6dh-zS-Vam\">\n                                <connections>\n                                    <action selector=\"redo:\" target=\"-1\" id=\"oIA-Rs-6OD\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"WRV-NI-Exz\"/>\n                            <menuItem title=\"Cut\" keyEquivalent=\"x\" id=\"uRl-iY-unG\">\n                                <connections>\n                                    <action selector=\"cut:\" target=\"-1\" id=\"YJe-68-I9s\"/>\n                                </connections>\n                            
</menuItem>\n                            <menuItem title=\"Copy\" keyEquivalent=\"c\" id=\"x3v-GG-iWU\">\n                                <connections>\n                                    <action selector=\"copy:\" target=\"-1\" id=\"G1f-GL-Joy\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste\" keyEquivalent=\"v\" id=\"gVA-U4-sdL\">\n                                <connections>\n                                    <action selector=\"paste:\" target=\"-1\" id=\"UvS-8e-Qdg\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste and Match Style\" keyEquivalent=\"V\" id=\"WeT-3V-zwk\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"pasteAsPlainText:\" target=\"-1\" id=\"cEh-KX-wJQ\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Delete\" id=\"pa3-QI-u2k\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"delete:\" target=\"-1\" id=\"0Mk-Ml-PaM\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Select All\" keyEquivalent=\"a\" id=\"Ruw-6m-B2m\">\n                                <connections>\n                                    <action selector=\"selectAll:\" target=\"-1\" id=\"VNm-Mi-diN\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"uyl-h8-XO2\"/>\n                            <menuItem 
title=\"Find\" id=\"4EN-yA-p0u\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Find\" id=\"1b7-l0-nxx\">\n                                    <items>\n                                        <menuItem title=\"Find…\" tag=\"1\" keyEquivalent=\"f\" id=\"Xz5-n4-O0W\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"cD7-Qs-BN4\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find and Replace…\" tag=\"12\" keyEquivalent=\"f\" id=\"YEy-JH-Tfz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"WD3-Gg-5AJ\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Next\" tag=\"2\" keyEquivalent=\"g\" id=\"q09-fT-Sye\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"NDo-RZ-v9R\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Previous\" tag=\"3\" keyEquivalent=\"G\" id=\"OwM-mh-QMV\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"HOh-sY-3ay\"/>\n                                            
</connections>\n                                        </menuItem>\n                                        <menuItem title=\"Use Selection for Find\" tag=\"7\" keyEquivalent=\"e\" id=\"buJ-ug-pKt\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"U76-nv-p5D\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Jump to Selection\" keyEquivalent=\"j\" id=\"S0p-oC-mLd\">\n                                            <connections>\n                                                <action selector=\"centerSelectionInVisibleArea:\" target=\"-1\" id=\"IOG-6D-g5B\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Spelling and Grammar\" id=\"Dv1-io-Yv7\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Spelling\" id=\"3IN-sU-3Bg\">\n                                    <items>\n                                        <menuItem title=\"Show Spelling and Grammar\" keyEquivalent=\":\" id=\"HFo-cy-zxI\">\n                                            <connections>\n                                                <action selector=\"showGuessPanel:\" target=\"-1\" id=\"vFj-Ks-hy3\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Document Now\" keyEquivalent=\";\" id=\"hz2-CU-CR7\">\n                                            <connections>\n                                        
        <action selector=\"checkSpelling:\" target=\"-1\" id=\"fz7-VC-reM\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"bNw-od-mp5\"/>\n                                        <menuItem title=\"Check Spelling While Typing\" id=\"rbD-Rh-wIN\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleContinuousSpellChecking:\" target=\"-1\" id=\"7w6-Qz-0kB\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Grammar With Spelling\" id=\"mK6-2p-4JG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleGrammarChecking:\" target=\"-1\" id=\"muD-Qn-j4w\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Correct Spelling Automatically\" id=\"78Y-hA-62v\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticSpellingCorrection:\" target=\"-1\" id=\"2lM-Qi-WAP\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem 
title=\"Substitutions\" id=\"9ic-FL-obx\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Substitutions\" id=\"FeM-D8-WVr\">\n                                    <items>\n                                        <menuItem title=\"Show Substitutions\" id=\"z6F-FW-3nz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"orderFrontSubstitutionsPanel:\" target=\"-1\" id=\"oku-mr-iSq\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"gPx-C9-uUO\"/>\n                                        <menuItem title=\"Smart Copy/Paste\" id=\"9yt-4B-nSM\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleSmartInsertDelete:\" target=\"-1\" id=\"3IJ-Se-DZD\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Quotes\" id=\"hQb-2v-fYv\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticQuoteSubstitution:\" target=\"-1\" id=\"ptq-xd-QOA\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Dashes\" id=\"rgM-f4-ycn\">\n                                            
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDashSubstitution:\" target=\"-1\" id=\"oCt-pO-9gS\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Links\" id=\"cwL-P1-jid\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticLinkDetection:\" target=\"-1\" id=\"Gip-E3-Fov\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Data Detectors\" id=\"tRr-pd-1PS\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDataDetection:\" target=\"-1\" id=\"R1I-Nq-Kbl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Text Replacement\" id=\"HFQ-gK-NFA\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticTextReplacement:\" target=\"-1\" id=\"DvP-Fe-Py6\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                       
     <menuItem title=\"Transformations\" id=\"2oI-Rn-ZJC\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Transformations\" id=\"c8a-y6-VQd\">\n                                    <items>\n                                        <menuItem title=\"Make Upper Case\" id=\"vmV-6d-7jI\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"uppercaseWord:\" target=\"-1\" id=\"sPh-Tk-edu\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Make Lower Case\" id=\"d9M-CD-aMd\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"lowercaseWord:\" target=\"-1\" id=\"iUZ-b5-hil\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Capitalize\" id=\"UEZ-Bs-lqG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"capitalizeWord:\" target=\"-1\" id=\"26H-TL-nsh\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Speech\" id=\"xrE-MZ-jX0\">\n                                <modifierMask 
key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Speech\" id=\"3rS-ZA-NoH\">\n                                    <items>\n                                        <menuItem title=\"Start Speaking\" id=\"Ynk-f8-cLZ\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"startSpeaking:\" target=\"-1\" id=\"654-Ng-kyl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Stop Speaking\" id=\"Oyz-dy-DGm\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"stopSpeaking:\" target=\"-1\" id=\"dX8-6p-jy9\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"View\" id=\"H8h-7b-M4v\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"View\" id=\"HyV-fh-RgO\">\n                        <items>\n                            <menuItem title=\"Enter Full Screen\" keyEquivalent=\"f\" id=\"4J7-dP-txa\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" control=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"toggleFullScreen:\" target=\"-1\" id=\"dU3-MA-1Rq\"/>\n                           
     </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Window\" id=\"aUF-d1-5bR\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Window\" systemMenu=\"window\" id=\"Td7-aD-5lo\">\n                        <items>\n                            <menuItem title=\"Minimize\" keyEquivalent=\"m\" id=\"OY7-WF-poV\">\n                                <connections>\n                                    <action selector=\"performMiniaturize:\" target=\"-1\" id=\"VwT-WD-YPe\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Zoom\" id=\"R4o-n2-Eq4\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"performZoom:\" target=\"-1\" id=\"DIl-cC-cCs\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"eu3-7i-yIM\"/>\n                            <menuItem title=\"Bring All to Front\" id=\"LE2-aR-0XJ\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"arrangeInFront:\" target=\"-1\" id=\"DRN-fu-gQh\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Help\" id=\"EPT-qC-fAb\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Help\" systemMenu=\"help\" id=\"rJ0-wn-3NY\"/>\n                
</menuItem>\n            </items>\n            <point key=\"canvasLocation\" x=\"142\" y=\"-258\"/>\n        </menu>\n        <window title=\"APP_NAME\" allowsToolTipsWhenApplicationIsInactive=\"NO\" autorecalculatesKeyViewLoop=\"NO\" releasedWhenClosed=\"NO\" animationBehavior=\"default\" id=\"QvC-M9-y7g\" customClass=\"MainFlutterWindow\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <windowStyleMask key=\"styleMask\" titled=\"YES\" closable=\"YES\" miniaturizable=\"YES\" resizable=\"YES\"/>\n            <rect key=\"contentRect\" x=\"335\" y=\"390\" width=\"800\" height=\"600\"/>\n            <rect key=\"screenRect\" x=\"0.0\" y=\"0.0\" width=\"2560\" height=\"1577\"/>\n            <view key=\"contentView\" wantsLayer=\"YES\" id=\"EiT-Mj-1SZ\">\n                <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"800\" height=\"600\"/>\n                <autoresizingMask key=\"autoresizingMask\"/>\n            </view>\n        </window>\n    </objects>\n</document>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Configs/AppInfo.xcconfig",
    "content": "// Application-level settings for the Runner target.\n//\n// This may be replaced with something auto-generated from metadata (e.g., pubspec.yaml) in the\n// future. If not, the values below would default to using the project name when this becomes a\n// 'flutter create' template.\n\n// The application's name. By default this is also the title of the Flutter window.\nPRODUCT_NAME = streaming_asr\n\n// The application's bundle identifier\nPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr\n\n// The copyright displayed in application information\nPRODUCT_COPYRIGHT = Copyright © 2024 com.example. All rights reserved.\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Configs/Debug.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Debug.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Configs/Release.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Release.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Configs/Warnings.xcconfig",
    "content": "WARNING_CFLAGS = -Wall -Wconditional-uninitialized -Wnullable-to-nonnull-conversion -Wmissing-method-return-type -Woverlength-strings\nGCC_WARN_UNDECLARED_SELECTOR = YES\nCLANG_UNDEFINED_BEHAVIOR_SANITIZER_NULLABILITY = YES\nCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE\nCLANG_WARN__DUPLICATE_METHOD_MATCH = YES\nCLANG_WARN_PRAGMA_PACK = YES\nCLANG_WARN_STRICT_PROTOTYPES = YES\nCLANG_WARN_COMMA = YES\nGCC_WARN_STRICT_SELECTOR_MATCH = YES\nCLANG_WARN_OBJC_REPEATED_USE_OF_WEAK = YES\nCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES\nGCC_WARN_SHADOW = YES\nCLANG_WARN_UNREACHABLE_CODE = YES\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/DebugProfile.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n\t<key>com.apple.security.cs.allow-jit</key>\n\t<true/>\n\t<key>com.apple.security.device.audio-input</key>\n\t<true/>\n\t<key>com.apple.security.network.server</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen kaldi to work</string>\n\t<key>CFBundleDevelopmentRegion</key>\n\t<string>$(DEVELOPMENT_LANGUAGE)</string>\n\t<key>CFBundleExecutable</key>\n\t<string>$(EXECUTABLE_NAME)</string>\n\t<key>CFBundleIconFile</key>\n\t<string></string>\n\t<key>CFBundleIdentifier</key>\n\t<string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>\n\t<key>CFBundleInfoDictionaryVersion</key>\n\t<string>6.0</string>\n\t<key>CFBundleName</key>\n\t<string>$(PRODUCT_NAME)</string>\n\t<key>CFBundlePackageType</key>\n\t<string>APPL</string>\n\t<key>CFBundleShortVersionString</key>\n\t<string>$(FLUTTER_BUILD_NAME)</string>\n\t<key>CFBundleVersion</key>\n\t<string>$(FLUTTER_BUILD_NUMBER)</string>\n\t<key>LSMinimumSystemVersion</key>\n\t<string>$(MACOSX_DEPLOYMENT_TARGET)</string>\n\t<key>NSHumanReadableCopyright</key>\n\t<string>$(PRODUCT_COPYRIGHT)</string>\n\t<key>NSMainNibFile</key>\n\t<string>MainMenu</string>\n\t<key>NSPrincipalClass</key>\n\t<string>NSApplication</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/MainFlutterWindow.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\nclass MainFlutterWindow: NSWindow {\n  override func awakeFromNib() {\n    let flutterViewController = FlutterViewController()\n    let windowFrame = self.frame\n    self.contentViewController = flutterViewController\n    self.setFrame(windowFrame, display: true)\n\n    RegisterGeneratedPlugins(registry: flutterViewController)\n\n    super.awakeFromNib()\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner/Release.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n\t<key>com.apple.security.device.audio-input</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 54;\n\tobjects = {\n\n/* Begin PBXAggregateTarget section */\n\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */ = {\n\t\t\tisa = PBXAggregateTarget;\n\t\t\tbuildConfigurationList = 33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t33CC111E2044C6BF0003C045 /* ShellScript */,\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = \"Flutter Assemble\";\n\t\t\tproductName = FLX;\n\t\t};\n/* End PBXAggregateTarget section */\n\n/* Begin PBXBuildFile section */\n\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 331C80D7294CF71000263BE5 /* RunnerTests.swift */; };\n\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */ = {isa = PBXBuildFile; fileRef = 335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */; };\n\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC10F02044A3C60003C045 /* AppDelegate.swift */; };\n\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F22044A3C60003C045 /* Assets.xcassets */; };\n\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F42044A3C60003C045 /* MainMenu.xib */; };\n\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC11122044BFA00003C045 /* MainFlutterWindow.swift */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\t331C80D9294CF71000263BE5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC10EC2044A3C60003C045;\n\t\t\tremoteInfo = Runner;\n\t\t};\n\t\t33CC111F2044C79F0003C045 /* PBXContainerItemProxy */ 
= {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC111A2044C6BA0003C045;\n\t\t\tremoteInfo = FLX;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t33CC110E2044A8840003C045 /* Bundle Framework */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tname = \"Bundle Framework\";\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = RunnerTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = RunnerTests.swift; sourceTree = \"<group>\"; };\n\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Warnings.xcconfig; sourceTree = \"<group>\"; };\n\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = GeneratedPluginRegistrant.swift; sourceTree = \"<group>\"; };\n\t\t33CC10ED2044A3C60003C045 /* streaming_asr.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = \"streaming_asr.app\"; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; name = Assets.xcassets; path = 
Runner/Assets.xcassets; sourceTree = \"<group>\"; };\n\t\t33CC10F52044A3C60003C045 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.xib; name = Base; path = Base.lproj/MainMenu.xib; sourceTree = \"<group>\"; };\n\t\t33CC10F72044A3C60003C045 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; name = Info.plist; path = Runner/Info.plist; sourceTree = \"<group>\"; };\n\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = MainFlutterWindow.swift; sourceTree = \"<group>\"; };\n\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; name = \"Flutter-Generated.xcconfig\"; path = \"ephemeral/Flutter-Generated.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */ = {isa = PBXFileReference; lastKnownFileType = text.plist.entitlements; path = DebugProfile.entitlements; sourceTree = \"<group>\"; };\n\t\t33E51914231749380026EE4D /* Release.entitlements */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.entitlements; path = Release.entitlements; sourceTree = \"<group>\"; };\n\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = AppInfo.xcconfig; sourceTree = \"<group>\"; };\n\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Release.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB21CF90195004384FC /* Debug.xcconfig 
*/ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; path = Debug.xcconfig; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t331C80D2294CF70F00263BE5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EA2044A3C60003C045 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t331C80D6294CF71000263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */,\n\t\t\t);\n\t\t\tpath = RunnerTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33BA886A226E78AF003329D5 /* Configs */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */,\n\t\t\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */,\n\t\t\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */,\n\t\t\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig */,\n\t\t\t);\n\t\t\tpath = Configs;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10E42044A3C60003C045 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33FAB671232836740065AC1E /* Runner */,\n\t\t\t\t33CEB47122A05771004F2AC0 /* Flutter */,\n\t\t\t\t331C80D6294CF71000263BE5 /* RunnerTests */,\n\t\t\t\t33CC10EE2044A3C60003C045 /* Products */,\n\t\t\t\tD73912EC22F37F3D000D13A0 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10EE2044A3C60003C045 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10ED2044A3C60003C045 /* streaming_asr.app */,\n\t\t\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\t33CC11242044D66E0003C045 /* Resources */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */,\n\t\t\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */,\n\t\t\t\t33CC10F72044A3C60003C045 /* Info.plist */,\n\t\t\t);\n\t\t\tname = Resources;\n\t\t\tpath = ..;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CEB47122A05771004F2AC0 /* Flutter */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */,\n\t\t\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */,\n\t\t\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */,\n\t\t\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */,\n\t\t\t);\n\t\t\tpath = Flutter;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33FAB671232836740065AC1E /* Runner */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */,\n\t\t\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */,\n\t\t\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */,\n\t\t\t\t33E51914231749380026EE4D /* Release.entitlements */,\n\t\t\t\t33CC11242044D66E0003C045 /* Resources */,\n\t\t\t\t33BA886A226E78AF003329D5 /* Configs */,\n\t\t\t);\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tD73912EC22F37F3D000D13A0 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t331C80D4294CF70F00263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 331C80DE294CF71000263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t331C80D1294CF70F00263BE5 /* Sources */,\n\t\t\t\t331C80D2294CF70F00263BE5 /* Frameworks */,\n\t\t\t\t331C80D3294CF70F00263BE5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = 
(\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = RunnerTests;\n\t\t\tproductName = RunnerTests;\n\t\t\tproductReference = 331C80D5294CF71000263BE5 /* RunnerTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\t33CC10EC2044A3C60003C045 /* Runner */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t33CC10E92044A3C60003C045 /* Sources */,\n\t\t\t\t33CC10EA2044A3C60003C045 /* Frameworks */,\n\t\t\t\t33CC10EB2044A3C60003C045 /* Resources */,\n\t\t\t\t33CC110E2044A8840003C045 /* Bundle Framework */,\n\t\t\t\t3399D490228B24CF009A79C7 /* ShellScript */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = Runner;\n\t\t\tproductName = Runner;\n\t\t\tproductReference = 33CC10ED2044A3C60003C045 /* streaming_asr.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t33CC10E52044A3C60003C045 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = YES;\n\t\t\t\tLastSwiftUpdateCheck = 0920;\n\t\t\t\tLastUpgradeCheck = 1510;\n\t\t\t\tORGANIZATIONNAME = \"\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t331C80D4294CF70F00263BE5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.0;\n\t\t\t\t\t\tTestTargetID = 33CC10EC2044A3C60003C045;\n\t\t\t\t\t};\n\t\t\t\t\t33CC10EC2044A3C60003C045 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tLastSwiftMigration = 1100;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t\tSystemCapabilities = {\n\t\t\t\t\t\t\tcom.apple.Sandbox = {\n\t\t\t\t\t\t\t\tenabled = 1;\n\t\t\t\t\t\t\t};\n\t\t\t\t\t\t};\n\t\t\t\t\t};\n\t\t\t\t\t33CC111A2044C6BA0003C045 = 
{\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tProvisioningStyle = Manual;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" */;\n\t\t\tcompatibilityVersion = \"Xcode 9.3\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = 33CC10E42044A3C60003C045;\n\t\t\tproductRefGroup = 33CC10EE2044A3C60003C045 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t33CC10EC2044A3C60003C045 /* Runner */,\n\t\t\t\t331C80D4294CF70F00263BE5 /* RunnerTests */,\n\t\t\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\t331C80D3294CF70F00263BE5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EB2044A3C60003C045 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */,\n\t\t\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXShellScriptBuildPhase section */\n\t\t3399D490228B24CF009A79C7 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"echo \\\"$PRODUCT_NAME.app\\\" > 
\\\"$PROJECT_DIR\\\"/Flutter/ephemeral/.app_filename && \\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh embed\\n\";\n\t\t};\n\t\t33CC111E2044C6BF0003C045 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterInputs.xcfilelist,\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\tFlutter/ephemeral/tripwire,\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterOutputs.xcfilelist,\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"\\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh && touch Flutter/ephemeral/tripwire\";\n\t\t};\n/* End PBXShellScriptBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t331C80D1294CF70F00263BE5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10E92044A3C60003C045 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */,\n\t\t\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */,\n\t\t\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 33CC10EC2044A3C60003C045 /* Runner */;\n\t\t\ttargetProxy = 331C80D9294CF71000263BE5 /* PBXContainerItemProxy */;\n\t\t};\n\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */ = {\n\t\t\tisa = 
PBXTargetDependency;\n\t\t\ttarget = 33CC111A2044C6BA0003C045 /* Flutter Assemble */;\n\t\t\ttargetProxy = 33CC111F2044C79F0003C045 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F52044A3C60003C045 /* Base */,\n\t\t\t);\n\t\t\tname = MainMenu.xib;\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin XCBuildConfiguration section */\n\t\t331C80DB294CF71000263BE5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t331C80DC294CF71000263BE5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t331C80DD294CF71000263BE5 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 
1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.example.streamingAsr.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/streaming_asr.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/streaming_asr\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CE9231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = 
YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEA231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEB231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t33CC10F92044A3C60003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 9740EEB21CF90195004384FC /* Debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = 
YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FA2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = 
{\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.15;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC10FC2044A3C60003C045 /* Debug */ = 
{\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FD2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/Release.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC111C2044C6BA0003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC111D2044C6BA0003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t331C80DE294CF71000263BE5 /* 
Build configuration list for PBXNativeTarget \"RunnerTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t331C80DB294CF71000263BE5 /* Debug */,\n\t\t\t\t331C80DC294CF71000263BE5 /* Release */,\n\t\t\t\t331C80DD294CF71000263BE5 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10F92044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FA2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CE9231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10FC2044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FD2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CEA231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC111C2044C6BA0003C045 /* Debug */,\n\t\t\t\t33CC111D2044C6BA0003C045 /* Release */,\n\t\t\t\t338D0CEB231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 33CC10E52044A3C60003C045 /* Project object */;\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner.xcodeproj/xcshareddata/xcschemes/Runner.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1510\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n               BuildableName = \"streaming_asr.app\"\n               BlueprintName = \"Runner\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <MacroExpansion>\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </MacroExpansion>\n      <Testables>\n         <TestableReference\n            skipped = \"NO\"\n            parallelizable = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"331C80D4294CF70F00263BE5\"\n               BuildableName = \"RunnerTests.xctest\"\n               BlueprintName = \"RunnerTests\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n       
     </BuildableReference>\n         </TestableReference>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"NO\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Profile\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"streaming_asr.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"group:Runner.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/Runner.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/macos/RunnerTests/RunnerTests.swift",
    "content": "import Cocoa\nimport FlutterMacOS\nimport XCTest\n\nclass RunnerTests: XCTestCase {\n\n  func testExample() {\n    // If you add code to the Runner application, consider adding tests here.\n    // See https://developer.apple.com/documentation/xctest for more information about using XCTest.\n  }\n\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/pubspec.yaml",
    "content": "name: streaming_asr\n\ndescription: >\n  This example shows how to implement real-time speech recognition using sherpa-onnx.\n\npublish_to: 'none'\n\nversion: 1.12.31\n\ntopics:\n  - speech-recognition\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/sherpa-onnx/flutter\n\nenvironment:\n  sdk: \">=2.17.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\n  cupertino_icons: ^1.0.6\n\n  path_provider: ^2.1.3\n  path: ^1.9.0\n\n  record: ^6.1.2\n  url_launcher: ^6.2.6\n\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n\ndev_dependencies:\n  flutter_test:\n    sdk: flutter\n\n  flutter_lints: ^3.0.0\n\nflutter:\n  uses-material-design: true\n\n  assets:\n    - assets/\n    # - assets/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n"
  },
  {
    "path": "flutter-examples/streaming_asr/test/widget_test.dart",
    "content": "// This is a basic Flutter widget test.\n//\n// To perform an interaction with a widget in your test, use the WidgetTester\n// utility in the flutter_test package. For example, you can send tap and scroll\n// gestures. You can also use WidgetTester to find child widgets in the widget\n// tree, read text, and verify that the values of widget properties are correct.\n\nimport 'package:flutter/material.dart';\nimport 'package:flutter_test/flutter_test.dart';\n\nimport 'package:streaming_asr/main.dart';\n\nvoid main() {\n  testWidgets('Counter increments smoke test', (WidgetTester tester) async {\n    // Build our app and trigger a frame.\n    await tester.pumpWidget(const MyApp());\n\n    // Verify that our counter starts at 0.\n    expect(find.text('0'), findsOneWidget);\n    expect(find.text('1'), findsNothing);\n\n    // Tap the '+' icon and trigger a frame.\n    await tester.tap(find.byIcon(Icons.add));\n    await tester.pump();\n\n    // Verify that our counter has incremented.\n    expect(find.text('0'), findsNothing);\n    expect(find.text('1'), findsOneWidget);\n  });\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/.gitignore",
    "content": "flutter/ephemeral/\n\n# Visual Studio user-specific files.\n*.suo\n*.user\n*.userosscache\n*.sln.docstates\n\n# Visual Studio build-related files.\nx64/\nx86/\n\n# Visual Studio cache files\n# files ending in .cache can be ignored\n*.[Cc]ache\n# but keep track of directories ending in .cache\n!*.[Cc]ache/\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/CMakeLists.txt",
    "content": "# Project-level configuration.\ncmake_minimum_required(VERSION 3.14)\nproject(streaming_asr LANGUAGES CXX)\n\n# The name of the executable created for the application. Change this to change\n# the on-disk name of your application.\nset(BINARY_NAME \"streaming_asr\")\n\n# Explicitly opt in to modern CMake behaviors to avoid warnings with recent\n# versions of CMake.\ncmake_policy(VERSION 3.14...3.25)\n\n# Define build configuration option.\nget_property(IS_MULTICONFIG GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)\nif(IS_MULTICONFIG)\n  set(CMAKE_CONFIGURATION_TYPES \"Debug;Profile;Release\"\n    CACHE STRING \"\" FORCE)\nelse()\n  if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)\n    set(CMAKE_BUILD_TYPE \"Debug\" CACHE\n      STRING \"Flutter build mode\" FORCE)\n    set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS\n      \"Debug\" \"Profile\" \"Release\")\n  endif()\nendif()\n# Define settings for the Profile build mode.\nset(CMAKE_EXE_LINKER_FLAGS_PROFILE \"${CMAKE_EXE_LINKER_FLAGS_RELEASE}\")\nset(CMAKE_SHARED_LINKER_FLAGS_PROFILE \"${CMAKE_SHARED_LINKER_FLAGS_RELEASE}\")\nset(CMAKE_C_FLAGS_PROFILE \"${CMAKE_C_FLAGS_RELEASE}\")\nset(CMAKE_CXX_FLAGS_PROFILE \"${CMAKE_CXX_FLAGS_RELEASE}\")\n\n# Use Unicode for all projects.\nadd_definitions(-DUNICODE -D_UNICODE)\n\n# Compilation settings that should be applied to most targets.\n#\n# Be cautious about adding new options here, as plugins use this function by\n# default. 
In most cases, you should add new options to specific targets instead\n# of modifying this function.\nfunction(APPLY_STANDARD_SETTINGS TARGET)\n  target_compile_features(${TARGET} PUBLIC cxx_std_17)\n  target_compile_options(${TARGET} PRIVATE /W4 /WX /wd\"4100\")\n  target_compile_options(${TARGET} PRIVATE /EHsc)\n  target_compile_definitions(${TARGET} PRIVATE \"_HAS_EXCEPTIONS=0\")\n  target_compile_definitions(${TARGET} PRIVATE \"$<$<CONFIG:Debug>:_DEBUG>\")\nendfunction()\n\n# Flutter library and tool build rules.\nset(FLUTTER_MANAGED_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/flutter\")\nadd_subdirectory(${FLUTTER_MANAGED_DIR})\n\n# Application build; see runner/CMakeLists.txt.\nadd_subdirectory(\"runner\")\n\n\n# Generated plugin build rules, which manage building the plugins and adding\n# them to the application.\ninclude(flutter/generated_plugins.cmake)\n\n\n# === Installation ===\n# Support files are copied into place next to the executable, so that it can\n# run in place. This is done instead of making a separate bundle (as on Linux)\n# so that building and running from within Visual Studio will work.\nset(BUILD_BUNDLE_DIR \"$<TARGET_FILE_DIR:${BINARY_NAME}>\")\n# Make the \"install\" step default, as it's required to run.\nset(CMAKE_VS_INCLUDE_INSTALL_TO_DEFAULT_BUILD 1)\nif(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)\n  set(CMAKE_INSTALL_PREFIX \"${BUILD_BUNDLE_DIR}\" CACHE PATH \"...\" FORCE)\nendif()\n\nset(INSTALL_BUNDLE_DATA_DIR \"${CMAKE_INSTALL_PREFIX}/data\")\nset(INSTALL_BUNDLE_LIB_DIR \"${CMAKE_INSTALL_PREFIX}\")\n\ninstall(TARGETS ${BINARY_NAME} RUNTIME DESTINATION \"${CMAKE_INSTALL_PREFIX}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_ICU_DATA_FILE}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n  COMPONENT Runtime)\n\nif(PLUGIN_BUNDLED_LIBRARIES)\n  install(FILES \"${PLUGIN_BUNDLED_LIBRARIES}\"\n    DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n 
   COMPONENT Runtime)\nendif()\n\n# Copy the native assets provided by the build.dart from all packages.\nset(NATIVE_ASSETS_DIR \"${PROJECT_BUILD_DIR}native_assets/windows/\")\ninstall(DIRECTORY \"${NATIVE_ASSETS_DIR}\"\n   DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n   COMPONENT Runtime)\n\n# Fully re-copy the assets directory on each build to avoid having stale files\n# from a previous install.\nset(FLUTTER_ASSET_DIR_NAME \"flutter_assets\")\ninstall(CODE \"\n  file(REMOVE_RECURSE \\\"${INSTALL_BUNDLE_DATA_DIR}/${FLUTTER_ASSET_DIR_NAME}\\\")\n  \" COMPONENT Runtime)\ninstall(DIRECTORY \"${PROJECT_BUILD_DIR}/${FLUTTER_ASSET_DIR_NAME}\"\n  DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\" COMPONENT Runtime)\n\n# Install the AOT library on non-Debug builds only.\ninstall(FILES \"${AOT_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  CONFIGURATIONS Profile;Release\n  COMPONENT Runtime)\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/flutter/CMakeLists.txt",
    "content": "# This file controls Flutter-level build steps. It should not be edited.\ncmake_minimum_required(VERSION 3.14)\n\nset(EPHEMERAL_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/ephemeral\")\n\n# Configuration provided via flutter tool.\ninclude(${EPHEMERAL_DIR}/generated_config.cmake)\n\n# TODO: Move the rest of this into files in ephemeral. See\n# https://github.com/flutter/flutter/issues/57146.\nset(WRAPPER_ROOT \"${EPHEMERAL_DIR}/cpp_client_wrapper\")\n\n# Set fallback configurations for older versions of the flutter tool.\nif (NOT DEFINED FLUTTER_TARGET_PLATFORM)\n  set(FLUTTER_TARGET_PLATFORM \"windows-x64\")\nendif()\n\n# === Flutter Library ===\nset(FLUTTER_LIBRARY \"${EPHEMERAL_DIR}/flutter_windows.dll\")\n\n# Published to parent scope for install step.\nset(FLUTTER_LIBRARY ${FLUTTER_LIBRARY} PARENT_SCOPE)\nset(FLUTTER_ICU_DATA_FILE \"${EPHEMERAL_DIR}/icudtl.dat\" PARENT_SCOPE)\nset(PROJECT_BUILD_DIR \"${PROJECT_DIR}/build/\" PARENT_SCOPE)\nset(AOT_LIBRARY \"${PROJECT_DIR}/build/windows/app.so\" PARENT_SCOPE)\n\nlist(APPEND FLUTTER_LIBRARY_HEADERS\n  \"flutter_export.h\"\n  \"flutter_windows.h\"\n  \"flutter_messenger.h\"\n  \"flutter_plugin_registrar.h\"\n  \"flutter_texture_registrar.h\"\n)\nlist(TRANSFORM FLUTTER_LIBRARY_HEADERS PREPEND \"${EPHEMERAL_DIR}/\")\nadd_library(flutter INTERFACE)\ntarget_include_directories(flutter INTERFACE\n  \"${EPHEMERAL_DIR}\"\n)\ntarget_link_libraries(flutter INTERFACE \"${FLUTTER_LIBRARY}.lib\")\nadd_dependencies(flutter flutter_assemble)\n\n# === Wrapper ===\nlist(APPEND CPP_WRAPPER_SOURCES_CORE\n  \"core_implementations.cc\"\n  \"standard_codec.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_CORE PREPEND \"${WRAPPER_ROOT}/\")\nlist(APPEND CPP_WRAPPER_SOURCES_PLUGIN\n  \"plugin_registrar.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_PLUGIN PREPEND \"${WRAPPER_ROOT}/\")\nlist(APPEND CPP_WRAPPER_SOURCES_APP\n  \"flutter_engine.cc\"\n  \"flutter_view_controller.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_APP PREPEND 
\"${WRAPPER_ROOT}/\")\n\n# Wrapper sources needed for a plugin.\nadd_library(flutter_wrapper_plugin STATIC\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_PLUGIN}\n)\napply_standard_settings(flutter_wrapper_plugin)\nset_target_properties(flutter_wrapper_plugin PROPERTIES\n  POSITION_INDEPENDENT_CODE ON)\nset_target_properties(flutter_wrapper_plugin PROPERTIES\n  CXX_VISIBILITY_PRESET hidden)\ntarget_link_libraries(flutter_wrapper_plugin PUBLIC flutter)\ntarget_include_directories(flutter_wrapper_plugin PUBLIC\n  \"${WRAPPER_ROOT}/include\"\n)\nadd_dependencies(flutter_wrapper_plugin flutter_assemble)\n\n# Wrapper sources needed for the runner.\nadd_library(flutter_wrapper_app STATIC\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_APP}\n)\napply_standard_settings(flutter_wrapper_app)\ntarget_link_libraries(flutter_wrapper_app PUBLIC flutter)\ntarget_include_directories(flutter_wrapper_app PUBLIC\n  \"${WRAPPER_ROOT}/include\"\n)\nadd_dependencies(flutter_wrapper_app flutter_assemble)\n\n# === Flutter tool backend ===\n# _phony_ is a non-existent file to force this command to run every time,\n# since currently there's no way to get a full input/output list from the\n# flutter tool.\nset(PHONY_OUTPUT \"${CMAKE_CURRENT_BINARY_DIR}/_phony_\")\nset_source_files_properties(\"${PHONY_OUTPUT}\" PROPERTIES SYMBOLIC TRUE)\nadd_custom_command(\n  OUTPUT ${FLUTTER_LIBRARY} ${FLUTTER_LIBRARY_HEADERS}\n    ${CPP_WRAPPER_SOURCES_CORE} ${CPP_WRAPPER_SOURCES_PLUGIN}\n    ${CPP_WRAPPER_SOURCES_APP}\n    ${PHONY_OUTPUT}\n  COMMAND ${CMAKE_COMMAND} -E env\n    ${FLUTTER_TOOL_ENVIRONMENT}\n    \"${FLUTTER_ROOT}/packages/flutter_tools/bin/tool_backend.bat\"\n      ${FLUTTER_TARGET_PLATFORM} $<CONFIG>\n  VERBATIM\n)\nadd_custom_target(flutter_assemble DEPENDS\n  \"${FLUTTER_LIBRARY}\"\n  ${FLUTTER_LIBRARY_HEADERS}\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_PLUGIN}\n  ${CPP_WRAPPER_SOURCES_APP}\n)\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/CMakeLists.txt",
    "content": "cmake_minimum_required(VERSION 3.14)\nproject(runner LANGUAGES CXX)\n\n# Define the application target. To change its name, change BINARY_NAME in the\n# top-level CMakeLists.txt, not the value here, or `flutter run` will no longer\n# work.\n#\n# Any new source files that you add to the application should be added here.\nadd_executable(${BINARY_NAME} WIN32\n  \"flutter_window.cpp\"\n  \"main.cpp\"\n  \"utils.cpp\"\n  \"win32_window.cpp\"\n  \"${FLUTTER_MANAGED_DIR}/generated_plugin_registrant.cc\"\n  \"Runner.rc\"\n  \"runner.exe.manifest\"\n)\n\n# Apply the standard set of build settings. This can be removed for applications\n# that need different build settings.\napply_standard_settings(${BINARY_NAME})\n\n# Add preprocessor definitions for the build version.\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION=\\\"${FLUTTER_VERSION}\\\"\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_MAJOR=${FLUTTER_VERSION_MAJOR}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_MINOR=${FLUTTER_VERSION_MINOR}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_PATCH=${FLUTTER_VERSION_PATCH}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_BUILD=${FLUTTER_VERSION_BUILD}\")\n\n# Disable Windows macros that collide with C++ standard library functions.\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"NOMINMAX\")\n\n# Add dependency libraries and include directories. Add any application-specific\n# dependencies here.\ntarget_link_libraries(${BINARY_NAME} PRIVATE flutter flutter_wrapper_app)\ntarget_link_libraries(${BINARY_NAME} PRIVATE \"dwmapi.lib\")\ntarget_include_directories(${BINARY_NAME} PRIVATE \"${CMAKE_SOURCE_DIR}\")\n\n# Run the Flutter tool portions of the build. This must not be removed.\nadd_dependencies(${BINARY_NAME} flutter_assemble)\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/Runner.rc",
    "content": "// Microsoft Visual C++ generated resource script.\n//\n#pragma code_page(65001)\n#include \"resource.h\"\n\n#define APSTUDIO_READONLY_SYMBOLS\n/////////////////////////////////////////////////////////////////////////////\n//\n// Generated from the TEXTINCLUDE 2 resource.\n//\n#include \"winres.h\"\n\n/////////////////////////////////////////////////////////////////////////////\n#undef APSTUDIO_READONLY_SYMBOLS\n\n/////////////////////////////////////////////////////////////////////////////\n// English (United States) resources\n\n#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_ENU)\nLANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_US\n\n#ifdef APSTUDIO_INVOKED\n/////////////////////////////////////////////////////////////////////////////\n//\n// TEXTINCLUDE\n//\n\n1 TEXTINCLUDE\nBEGIN\n    \"resource.h\\0\"\nEND\n\n2 TEXTINCLUDE\nBEGIN\n    \"#include \"\"winres.h\"\"\\r\\n\"\n    \"\\0\"\nEND\n\n3 TEXTINCLUDE\nBEGIN\n    \"\\r\\n\"\n    \"\\0\"\nEND\n\n#endif    // APSTUDIO_INVOKED\n\n\n/////////////////////////////////////////////////////////////////////////////\n//\n// Icon\n//\n\n// Icon with lowest ID value placed first to ensure application icon\n// remains consistent on all systems.\nIDI_APP_ICON            ICON                    \"resources\\\\app_icon.ico\"\n\n\n/////////////////////////////////////////////////////////////////////////////\n//\n// Version\n//\n\n#if defined(FLUTTER_VERSION_MAJOR) && defined(FLUTTER_VERSION_MINOR) && defined(FLUTTER_VERSION_PATCH) && defined(FLUTTER_VERSION_BUILD)\n#define VERSION_AS_NUMBER FLUTTER_VERSION_MAJOR,FLUTTER_VERSION_MINOR,FLUTTER_VERSION_PATCH,FLUTTER_VERSION_BUILD\n#else\n#define VERSION_AS_NUMBER 1,0,0,0\n#endif\n\n#if defined(FLUTTER_VERSION)\n#define VERSION_AS_STRING FLUTTER_VERSION\n#else\n#define VERSION_AS_STRING \"1.0.0\"\n#endif\n\nVS_VERSION_INFO VERSIONINFO\n FILEVERSION VERSION_AS_NUMBER\n PRODUCTVERSION VERSION_AS_NUMBER\n FILEFLAGSMASK VS_FFI_FILEFLAGSMASK\n#ifdef _DEBUG\n FILEFLAGS 
VS_FF_DEBUG\n#else\n FILEFLAGS 0x0L\n#endif\n FILEOS VOS__WINDOWS32\n FILETYPE VFT_APP\n FILESUBTYPE 0x0L\nBEGIN\n    BLOCK \"StringFileInfo\"\n    BEGIN\n        BLOCK \"040904e4\"\n        BEGIN\n            VALUE \"CompanyName\", \"com.example\" \"\\0\"\n            VALUE \"FileDescription\", \"streaming_asr\" \"\\0\"\n            VALUE \"FileVersion\", VERSION_AS_STRING \"\\0\"\n            VALUE \"InternalName\", \"streaming_asr\" \"\\0\"\n            VALUE \"LegalCopyright\", \"Copyright (C) 2024 com.example. All rights reserved.\" \"\\0\"\n            VALUE \"OriginalFilename\", \"streaming_asr.exe\" \"\\0\"\n            VALUE \"ProductName\", \"streaming_asr\" \"\\0\"\n            VALUE \"ProductVersion\", VERSION_AS_STRING \"\\0\"\n        END\n    END\n    BLOCK \"VarFileInfo\"\n    BEGIN\n        VALUE \"Translation\", 0x409, 1252\n    END\nEND\n\n#endif    // English (United States) resources\n/////////////////////////////////////////////////////////////////////////////\n\n\n\n#ifndef APSTUDIO_INVOKED\n/////////////////////////////////////////////////////////////////////////////\n//\n// Generated from the TEXTINCLUDE 3 resource.\n//\n\n\n/////////////////////////////////////////////////////////////////////////////\n#endif    // not APSTUDIO_INVOKED\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/flutter_window.cpp",
    "content": "#include \"flutter_window.h\"\n\n#include <optional>\n\n#include \"flutter/generated_plugin_registrant.h\"\n\nFlutterWindow::FlutterWindow(const flutter::DartProject& project)\n    : project_(project) {}\n\nFlutterWindow::~FlutterWindow() {}\n\nbool FlutterWindow::OnCreate() {\n  if (!Win32Window::OnCreate()) {\n    return false;\n  }\n\n  RECT frame = GetClientArea();\n\n  // The size here must match the window dimensions to avoid unnecessary surface\n  // creation / destruction in the startup path.\n  flutter_controller_ = std::make_unique<flutter::FlutterViewController>(\n      frame.right - frame.left, frame.bottom - frame.top, project_);\n  // Ensure that basic setup of the controller was successful.\n  if (!flutter_controller_->engine() || !flutter_controller_->view()) {\n    return false;\n  }\n  RegisterPlugins(flutter_controller_->engine());\n  SetChildContent(flutter_controller_->view()->GetNativeWindow());\n\n  flutter_controller_->engine()->SetNextFrameCallback([&]() {\n    this->Show();\n  });\n\n  // Flutter can complete the first frame before the \"show window\" callback is\n  // registered. The following call ensures a frame is pending to ensure the\n  // window is shown. 
It is a no-op if the first frame hasn't completed yet.\n  flutter_controller_->ForceRedraw();\n\n  return true;\n}\n\nvoid FlutterWindow::OnDestroy() {\n  if (flutter_controller_) {\n    flutter_controller_ = nullptr;\n  }\n\n  Win32Window::OnDestroy();\n}\n\nLRESULT\nFlutterWindow::MessageHandler(HWND hwnd, UINT const message,\n                              WPARAM const wparam,\n                              LPARAM const lparam) noexcept {\n  // Give Flutter, including plugins, an opportunity to handle window messages.\n  if (flutter_controller_) {\n    std::optional<LRESULT> result =\n        flutter_controller_->HandleTopLevelWindowProc(hwnd, message, wparam,\n                                                      lparam);\n    if (result) {\n      return *result;\n    }\n  }\n\n  switch (message) {\n    case WM_FONTCHANGE:\n      flutter_controller_->engine()->ReloadSystemFonts();\n      break;\n  }\n\n  return Win32Window::MessageHandler(hwnd, message, wparam, lparam);\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/flutter_window.h",
    "content": "#ifndef RUNNER_FLUTTER_WINDOW_H_\n#define RUNNER_FLUTTER_WINDOW_H_\n\n#include <flutter/dart_project.h>\n#include <flutter/flutter_view_controller.h>\n\n#include <memory>\n\n#include \"win32_window.h\"\n\n// A window that does nothing but host a Flutter view.\nclass FlutterWindow : public Win32Window {\n public:\n  // Creates a new FlutterWindow hosting a Flutter view running |project|.\n  explicit FlutterWindow(const flutter::DartProject& project);\n  virtual ~FlutterWindow();\n\n protected:\n  // Win32Window:\n  bool OnCreate() override;\n  void OnDestroy() override;\n  LRESULT MessageHandler(HWND window, UINT const message, WPARAM const wparam,\n                         LPARAM const lparam) noexcept override;\n\n private:\n  // The project to run.\n  flutter::DartProject project_;\n\n  // The Flutter instance hosted by this window.\n  std::unique_ptr<flutter::FlutterViewController> flutter_controller_;\n};\n\n#endif  // RUNNER_FLUTTER_WINDOW_H_\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/main.cpp",
    "content": "#include <flutter/dart_project.h>\n#include <flutter/flutter_view_controller.h>\n#include <windows.h>\n\n#include \"flutter_window.h\"\n#include \"utils.h\"\n\nint APIENTRY wWinMain(_In_ HINSTANCE instance, _In_opt_ HINSTANCE prev,\n                      _In_ wchar_t *command_line, _In_ int show_command) {\n  // Attach to console when present (e.g., 'flutter run') or create a\n  // new console when running with a debugger.\n  if (!::AttachConsole(ATTACH_PARENT_PROCESS) && ::IsDebuggerPresent()) {\n    CreateAndAttachConsole();\n  }\n\n  // Initialize COM, so that it is available for use in the library and/or\n  // plugins.\n  ::CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED);\n\n  flutter::DartProject project(L\"data\");\n\n  std::vector<std::string> command_line_arguments =\n      GetCommandLineArguments();\n\n  project.set_dart_entrypoint_arguments(std::move(command_line_arguments));\n\n  FlutterWindow window(project);\n  Win32Window::Point origin(10, 10);\n  Win32Window::Size size(1280, 720);\n  if (!window.Create(L\"streaming_asr\", origin, size)) {\n    return EXIT_FAILURE;\n  }\n  window.SetQuitOnClose(true);\n\n  ::MSG msg;\n  while (::GetMessage(&msg, nullptr, 0, 0)) {\n    ::TranslateMessage(&msg);\n    ::DispatchMessage(&msg);\n  }\n\n  ::CoUninitialize();\n  return EXIT_SUCCESS;\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/resource.h",
    "content": "//{{NO_DEPENDENCIES}}\n// Microsoft Visual C++ generated include file.\n// Used by Runner.rc\n//\n#define IDI_APP_ICON                    101\n\n// Next default values for new objects\n//\n#ifdef APSTUDIO_INVOKED\n#ifndef APSTUDIO_READONLY_SYMBOLS\n#define _APS_NEXT_RESOURCE_VALUE        102\n#define _APS_NEXT_COMMAND_VALUE         40001\n#define _APS_NEXT_CONTROL_VALUE         1001\n#define _APS_NEXT_SYMED_VALUE           101\n#endif\n#endif\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/runner.exe.manifest",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<assembly xmlns=\"urn:schemas-microsoft-com:asm.v1\" manifestVersion=\"1.0\">\n  <application xmlns=\"urn:schemas-microsoft-com:asm.v3\">\n    <windowsSettings>\n      <dpiAwareness xmlns=\"http://schemas.microsoft.com/SMI/2016/WindowsSettings\">PerMonitorV2</dpiAwareness>\n    </windowsSettings>\n  </application>\n  <compatibility xmlns=\"urn:schemas-microsoft-com:compatibility.v1\">\n    <application>\n      <!-- Windows 10 and Windows 11 -->\n      <supportedOS Id=\"{8e0f7a12-bfb3-4fe8-b9a5-48fd50a15a9a}\"/>\n      <!-- Windows 8.1 -->\n      <supportedOS Id=\"{1f676c76-80e1-4239-95bb-83d0f6d0da78}\"/>\n      <!-- Windows 8 -->\n      <supportedOS Id=\"{4a2f28e3-53b9-4441-ba9c-d69d4a4a6e38}\"/>\n      <!-- Windows 7 -->\n      <supportedOS Id=\"{35138b9a-5d96-4fbd-8e2d-a2440225f93a}\"/>\n    </application>\n  </compatibility>\n</assembly>\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/utils.cpp",
    "content": "#include \"utils.h\"\n\n#include <flutter_windows.h>\n#include <io.h>\n#include <stdio.h>\n#include <windows.h>\n\n#include <iostream>\n\nvoid CreateAndAttachConsole() {\n  if (::AllocConsole()) {\n    FILE *unused;\n    if (freopen_s(&unused, \"CONOUT$\", \"w\", stdout)) {\n      _dup2(_fileno(stdout), 1);\n    }\n    if (freopen_s(&unused, \"CONOUT$\", \"w\", stderr)) {\n      _dup2(_fileno(stdout), 2);\n    }\n    std::ios::sync_with_stdio();\n    FlutterDesktopResyncOutputStreams();\n  }\n}\n\nstd::vector<std::string> GetCommandLineArguments() {\n  // Convert the UTF-16 command line arguments to UTF-8 for the Engine to use.\n  int argc;\n  wchar_t** argv = ::CommandLineToArgvW(::GetCommandLineW(), &argc);\n  if (argv == nullptr) {\n    return std::vector<std::string>();\n  }\n\n  std::vector<std::string> command_line_arguments;\n\n  // Skip the first argument as it's the binary name.\n  for (int i = 1; i < argc; i++) {\n    command_line_arguments.push_back(Utf8FromUtf16(argv[i]));\n  }\n\n  ::LocalFree(argv);\n\n  return command_line_arguments;\n}\n\nstd::string Utf8FromUtf16(const wchar_t* utf16_string) {\n  if (utf16_string == nullptr) {\n    return std::string();\n  }\n  unsigned int target_length = ::WideCharToMultiByte(\n      CP_UTF8, WC_ERR_INVALID_CHARS, utf16_string,\n      -1, nullptr, 0, nullptr, nullptr)\n    -1; // remove the trailing null character\n  int input_length = (int)wcslen(utf16_string);\n  std::string utf8_string;\n  if (target_length == 0 || target_length > utf8_string.max_size()) {\n    return utf8_string;\n  }\n  utf8_string.resize(target_length);\n  int converted_length = ::WideCharToMultiByte(\n      CP_UTF8, WC_ERR_INVALID_CHARS, utf16_string,\n      input_length, utf8_string.data(), target_length, nullptr, nullptr);\n  if (converted_length == 0) {\n    return std::string();\n  }\n  return utf8_string;\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/utils.h",
    "content": "#ifndef RUNNER_UTILS_H_\n#define RUNNER_UTILS_H_\n\n#include <string>\n#include <vector>\n\n// Creates a console for the process, and redirects stdout and stderr to\n// it for both the runner and the Flutter library.\nvoid CreateAndAttachConsole();\n\n// Takes a null-terminated wchar_t* encoded in UTF-16 and returns a std::string\n// encoded in UTF-8. Returns an empty std::string on failure.\nstd::string Utf8FromUtf16(const wchar_t* utf16_string);\n\n// Gets the command line arguments passed in as a std::vector<std::string>,\n// encoded in UTF-8. Returns an empty std::vector<std::string> on failure.\nstd::vector<std::string> GetCommandLineArguments();\n\n#endif  // RUNNER_UTILS_H_\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/win32_window.cpp",
    "content": "#include \"win32_window.h\"\n\n#include <dwmapi.h>\n#include <flutter_windows.h>\n\n#include \"resource.h\"\n\nnamespace {\n\n/// Window attribute that enables dark mode window decorations.\n///\n/// Redefined in case the developer's machine has a Windows SDK older than\n/// version 10.0.22000.0.\n/// See: https://docs.microsoft.com/windows/win32/api/dwmapi/ne-dwmapi-dwmwindowattribute\n#ifndef DWMWA_USE_IMMERSIVE_DARK_MODE\n#define DWMWA_USE_IMMERSIVE_DARK_MODE 20\n#endif\n\nconstexpr const wchar_t kWindowClassName[] = L\"FLUTTER_RUNNER_WIN32_WINDOW\";\n\n/// Registry key for app theme preference.\n///\n/// A value of 0 indicates apps should use dark mode. A non-zero or missing\n/// value indicates apps should use light mode.\nconstexpr const wchar_t kGetPreferredBrightnessRegKey[] =\n  L\"Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Themes\\\\Personalize\";\nconstexpr const wchar_t kGetPreferredBrightnessRegValue[] = L\"AppsUseLightTheme\";\n\n// The number of Win32Window objects that currently exist.\nstatic int g_active_window_count = 0;\n\nusing EnableNonClientDpiScaling = BOOL __stdcall(HWND hwnd);\n\n// Scale helper to convert logical scaler values to physical using passed in\n// scale factor\nint Scale(int source, double scale_factor) {\n  return static_cast<int>(source * scale_factor);\n}\n\n// Dynamically loads the |EnableNonClientDpiScaling| from the User32 module.\n// This API is only needed for PerMonitor V1 awareness mode.\nvoid EnableFullDpiSupportIfAvailable(HWND hwnd) {\n  HMODULE user32_module = LoadLibraryA(\"User32.dll\");\n  if (!user32_module) {\n    return;\n  }\n  auto enable_non_client_dpi_scaling =\n      reinterpret_cast<EnableNonClientDpiScaling*>(\n          GetProcAddress(user32_module, \"EnableNonClientDpiScaling\"));\n  if (enable_non_client_dpi_scaling != nullptr) {\n    enable_non_client_dpi_scaling(hwnd);\n  }\n  FreeLibrary(user32_module);\n}\n\n}  // namespace\n\n// Manages the Win32Window's window class 
registration.\nclass WindowClassRegistrar {\n public:\n  ~WindowClassRegistrar() = default;\n\n  // Returns the singleton registrar instance.\n  static WindowClassRegistrar* GetInstance() {\n    if (!instance_) {\n      instance_ = new WindowClassRegistrar();\n    }\n    return instance_;\n  }\n\n  // Returns the name of the window class, registering the class if it hasn't\n  // previously been registered.\n  const wchar_t* GetWindowClass();\n\n  // Unregisters the window class. Should only be called if there are no\n  // instances of the window.\n  void UnregisterWindowClass();\n\n private:\n  WindowClassRegistrar() = default;\n\n  static WindowClassRegistrar* instance_;\n\n  bool class_registered_ = false;\n};\n\nWindowClassRegistrar* WindowClassRegistrar::instance_ = nullptr;\n\nconst wchar_t* WindowClassRegistrar::GetWindowClass() {\n  if (!class_registered_) {\n    WNDCLASS window_class{};\n    window_class.hCursor = LoadCursor(nullptr, IDC_ARROW);\n    window_class.lpszClassName = kWindowClassName;\n    window_class.style = CS_HREDRAW | CS_VREDRAW;\n    window_class.cbClsExtra = 0;\n    window_class.cbWndExtra = 0;\n    window_class.hInstance = GetModuleHandle(nullptr);\n    window_class.hIcon =\n        LoadIcon(window_class.hInstance, MAKEINTRESOURCE(IDI_APP_ICON));\n    window_class.hbrBackground = 0;\n    window_class.lpszMenuName = nullptr;\n    window_class.lpfnWndProc = Win32Window::WndProc;\n    RegisterClass(&window_class);\n    class_registered_ = true;\n  }\n  return kWindowClassName;\n}\n\nvoid WindowClassRegistrar::UnregisterWindowClass() {\n  UnregisterClass(kWindowClassName, nullptr);\n  class_registered_ = false;\n}\n\nWin32Window::Win32Window() {\n  ++g_active_window_count;\n}\n\nWin32Window::~Win32Window() {\n  --g_active_window_count;\n  Destroy();\n}\n\nbool Win32Window::Create(const std::wstring& title,\n                         const Point& origin,\n                         const Size& size) {\n  Destroy();\n\n  const wchar_t* 
window_class =\n      WindowClassRegistrar::GetInstance()->GetWindowClass();\n\n  const POINT target_point = {static_cast<LONG>(origin.x),\n                              static_cast<LONG>(origin.y)};\n  HMONITOR monitor = MonitorFromPoint(target_point, MONITOR_DEFAULTTONEAREST);\n  UINT dpi = FlutterDesktopGetDpiForMonitor(monitor);\n  double scale_factor = dpi / 96.0;\n\n  HWND window = CreateWindow(\n      window_class, title.c_str(), WS_OVERLAPPEDWINDOW,\n      Scale(origin.x, scale_factor), Scale(origin.y, scale_factor),\n      Scale(size.width, scale_factor), Scale(size.height, scale_factor),\n      nullptr, nullptr, GetModuleHandle(nullptr), this);\n\n  if (!window) {\n    return false;\n  }\n\n  UpdateTheme(window);\n\n  return OnCreate();\n}\n\nbool Win32Window::Show() {\n  return ShowWindow(window_handle_, SW_SHOWNORMAL);\n}\n\n// static\nLRESULT CALLBACK Win32Window::WndProc(HWND const window,\n                                      UINT const message,\n                                      WPARAM const wparam,\n                                      LPARAM const lparam) noexcept {\n  if (message == WM_NCCREATE) {\n    auto window_struct = reinterpret_cast<CREATESTRUCT*>(lparam);\n    SetWindowLongPtr(window, GWLP_USERDATA,\n                     reinterpret_cast<LONG_PTR>(window_struct->lpCreateParams));\n\n    auto that = static_cast<Win32Window*>(window_struct->lpCreateParams);\n    EnableFullDpiSupportIfAvailable(window);\n    that->window_handle_ = window;\n  } else if (Win32Window* that = GetThisFromHandle(window)) {\n    return that->MessageHandler(window, message, wparam, lparam);\n  }\n\n  return DefWindowProc(window, message, wparam, lparam);\n}\n\nLRESULT\nWin32Window::MessageHandler(HWND hwnd,\n                            UINT const message,\n                            WPARAM const wparam,\n                            LPARAM const lparam) noexcept {\n  switch (message) {\n    case WM_DESTROY:\n      window_handle_ = nullptr;\n      Destroy();\n  
    if (quit_on_close_) {\n        PostQuitMessage(0);\n      }\n      return 0;\n\n    case WM_DPICHANGED: {\n      auto newRectSize = reinterpret_cast<RECT*>(lparam);\n      LONG newWidth = newRectSize->right - newRectSize->left;\n      LONG newHeight = newRectSize->bottom - newRectSize->top;\n\n      SetWindowPos(hwnd, nullptr, newRectSize->left, newRectSize->top, newWidth,\n                   newHeight, SWP_NOZORDER | SWP_NOACTIVATE);\n\n      return 0;\n    }\n    case WM_SIZE: {\n      RECT rect = GetClientArea();\n      if (child_content_ != nullptr) {\n        // Size and position the child window.\n        MoveWindow(child_content_, rect.left, rect.top, rect.right - rect.left,\n                   rect.bottom - rect.top, TRUE);\n      }\n      return 0;\n    }\n\n    case WM_ACTIVATE:\n      if (child_content_ != nullptr) {\n        SetFocus(child_content_);\n      }\n      return 0;\n\n    case WM_DWMCOLORIZATIONCOLORCHANGED:\n      UpdateTheme(hwnd);\n      return 0;\n  }\n\n  return DefWindowProc(window_handle_, message, wparam, lparam);\n}\n\nvoid Win32Window::Destroy() {\n  OnDestroy();\n\n  if (window_handle_) {\n    DestroyWindow(window_handle_);\n    window_handle_ = nullptr;\n  }\n  if (g_active_window_count == 0) {\n    WindowClassRegistrar::GetInstance()->UnregisterWindowClass();\n  }\n}\n\nWin32Window* Win32Window::GetThisFromHandle(HWND const window) noexcept {\n  return reinterpret_cast<Win32Window*>(\n      GetWindowLongPtr(window, GWLP_USERDATA));\n}\n\nvoid Win32Window::SetChildContent(HWND content) {\n  child_content_ = content;\n  SetParent(content, window_handle_);\n  RECT frame = GetClientArea();\n\n  MoveWindow(content, frame.left, frame.top, frame.right - frame.left,\n             frame.bottom - frame.top, true);\n\n  SetFocus(child_content_);\n}\n\nRECT Win32Window::GetClientArea() {\n  RECT frame;\n  GetClientRect(window_handle_, &frame);\n  return frame;\n}\n\nHWND Win32Window::GetHandle() {\n  return window_handle_;\n}\n\nvoid 
Win32Window::SetQuitOnClose(bool quit_on_close) {\n  quit_on_close_ = quit_on_close;\n}\n\nbool Win32Window::OnCreate() {\n  // No-op; provided for subclasses.\n  return true;\n}\n\nvoid Win32Window::OnDestroy() {\n  // No-op; provided for subclasses.\n}\n\nvoid Win32Window::UpdateTheme(HWND const window) {\n  DWORD light_mode;\n  DWORD light_mode_size = sizeof(light_mode);\n  LSTATUS result = RegGetValue(HKEY_CURRENT_USER, kGetPreferredBrightnessRegKey,\n                               kGetPreferredBrightnessRegValue,\n                               RRF_RT_REG_DWORD, nullptr, &light_mode,\n                               &light_mode_size);\n\n  if (result == ERROR_SUCCESS) {\n    BOOL enable_dark_mode = light_mode == 0;\n    DwmSetWindowAttribute(window, DWMWA_USE_IMMERSIVE_DARK_MODE,\n                          &enable_dark_mode, sizeof(enable_dark_mode));\n  }\n}\n"
  },
  {
    "path": "flutter-examples/streaming_asr/windows/runner/win32_window.h",
    "content": "#ifndef RUNNER_WIN32_WINDOW_H_\n#define RUNNER_WIN32_WINDOW_H_\n\n#include <windows.h>\n\n#include <functional>\n#include <memory>\n#include <string>\n\n// A class abstraction for a high DPI-aware Win32 Window. Intended to be\n// inherited from by classes that wish to specialize with custom\n// rendering and input handling\nclass Win32Window {\n public:\n  struct Point {\n    unsigned int x;\n    unsigned int y;\n    Point(unsigned int x, unsigned int y) : x(x), y(y) {}\n  };\n\n  struct Size {\n    unsigned int width;\n    unsigned int height;\n    Size(unsigned int width, unsigned int height)\n        : width(width), height(height) {}\n  };\n\n  Win32Window();\n  virtual ~Win32Window();\n\n  // Creates a win32 window with |title| that is positioned and sized using\n  // |origin| and |size|. New windows are created on the default monitor. Window\n  // sizes are specified to the OS in physical pixels, hence to ensure a\n  // consistent size this function will scale the inputted width and height as\n  // as appropriate for the default monitor. The window is invisible until\n  // |Show| is called. Returns true if the window was created successfully.\n  bool Create(const std::wstring& title, const Point& origin, const Size& size);\n\n  // Show the current window. Returns true if the window was successfully shown.\n  bool Show();\n\n  // Release OS resources associated with window.\n  void Destroy();\n\n  // Inserts |content| into the window tree.\n  void SetChildContent(HWND content);\n\n  // Returns the backing Window handle to enable clients to set icon and other\n  // window properties. 
Returns nullptr if the window has been destroyed.\n  HWND GetHandle();\n\n  // If true, closing this window will quit the application.\n  void SetQuitOnClose(bool quit_on_close);\n\n  // Return a RECT representing the bounds of the current client area.\n  RECT GetClientArea();\n\n protected:\n  // Processes and routes salient window messages for mouse handling,\n  // size change and DPI. Delegates handling of these to member overloads that\n  // inheriting classes can handle.\n  virtual LRESULT MessageHandler(HWND window,\n                                 UINT const message,\n                                 WPARAM const wparam,\n                                 LPARAM const lparam) noexcept;\n\n  // Called when Create is called, allowing subclass window-related\n  // setup. Subclasses should return false if setup fails.\n  virtual bool OnCreate();\n\n  // Called when Destroy is called.\n  virtual void OnDestroy();\n\n private:\n  friend class WindowClassRegistrar;\n\n  // OS callback called by message pump. Handles the WM_NCCREATE message which\n  // is passed when the non-client area is being created and enables automatic\n  // non-client DPI scaling so that the non-client area automatically\n  // responds to changes in DPI. 
All other messages are handled by\n  // MessageHandler.\n  static LRESULT CALLBACK WndProc(HWND const window,\n                                  UINT const message,\n                                  WPARAM const wparam,\n                                  LPARAM const lparam) noexcept;\n\n  // Retrieves a class instance pointer for |window|\n  static Win32Window* GetThisFromHandle(HWND const window) noexcept;\n\n  // Update the window frame's theme to match the system theme.\n  static void UpdateTheme(HWND const window);\n\n  bool quit_on_close_ = false;\n\n  // window handle for top level window.\n  HWND window_handle_ = nullptr;\n\n  // window handle for hosted content.\n  HWND child_content_ = nullptr;\n};\n\n#endif  // RUNNER_WIN32_WINDOW_H_\n"
  },
  {
    "path": "flutter-examples/tts/.gitignore",
    "content": "# Miscellaneous\n*.class\n*.log\n*.pyc\n*.swp\n.DS_Store\n.atom/\n.buildlog/\n.history\n.svn/\nmigrate_working_dir/\n\n# IntelliJ related\n*.iml\n*.ipr\n*.iws\n.idea/\n\n# The .vscode folder contains launch configuration and tasks you configure in\n# VS Code which you may wish to be included in version control, so this line\n# is commented out by default.\n#.vscode/\n\n# Flutter/Dart/Pub related\n**/doc/api/\n**/ios/Flutter/.last_build_id\n.dart_tool/\n.flutter-plugins\n.flutter-plugins-dependencies\n.pub-cache/\n.pub/\n/build/\n\n# Symbolication related\napp.*.symbols\n\n# Obfuscation related\napp.*.map.json\n\n# Android Studio will place build artifacts here\n/android/app/debug\n/android/app/profile\n/android/app/release\n"
  },
  {
    "path": "flutter-examples/tts/.metadata",
    "content": "# This file tracks properties of this Flutter project.\n# Used by Flutter tool to assess capabilities and perform upgrades etc.\n#\n# This file should be version controlled and should not be manually edited.\n\nversion:\n  revision: \"5dcb86f68f239346676ceb1ed1ea385bd215fba1\"\n  channel: \"stable\"\n\nproject_type: app\n\n# Tracks metadata for the flutter migrate command\nmigration:\n  platforms:\n    - platform: root\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: android\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: ios\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: linux\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: macos\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n    - platform: windows\n      create_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n      base_revision: 5dcb86f68f239346676ceb1ed1ea385bd215fba1\n\n  # User provided section\n\n  # List of Local paths (relative to this file) that should be\n  # ignored by the migrate tool.\n  #\n  # Files that are not part of the templates will be ignored by default.\n  unmanaged_files:\n    - 'lib/main.dart'\n    - 'ios/Runner.xcodeproj/project.pbxproj'\n"
  },
  {
    "path": "flutter-examples/tts/README.md",
    "content": "# tts\n\nThis example demonstrates how to use text to speech (TTS) in Flutter with sherpa-onnx.\n\nIt works on the following platforms:\n\n  - Android\n  - iOS\n  - Linux\n  - macOS (both arm64 and x86_64 are supported)\n  - Windows\n\nPre-built APPs for this folder can be found at <https://k2-fsa.github.io/sherpa/onnx/flutter/pre-built-app.html#text-to-speech-tts-speech-synthesis>\n\nScreenshots are given below:\n\n|Android|iOS|Linux|macOS|Windows|\n|-------|---|-----|-----|-------|\n|![](./android.jpg)|![](./ios.jpg)|![](./ubuntu.jpg)|![](./macos.jpg)|![](./windows.jpg)|\n\n## How to build\n\nBefore you run `flutter build`, you have to select a TTS model and change\nthe code to use your selected model.\n\n### 1. Select a TTS model\n\nWe have a list of TTS models at\n\n<https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models>\n\nYou can select any of them. If you feel that there are so many that you don't know\nwhich one is the best, please visit <http://huggingface.co/spaces/k2-fsa/text-to-speech>\nand try each one by yourself and select the one you consider the best.\n\nSuppose you select\n\n  <https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2>\n\nThen please do the following:\n\n  - 1. Download and unzip the model\n\n```bash\ncd flutter-examples/tts/assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\ntar xf vits-piper-en_US-libritts_r-medium.tar.bz2\nrm vits-piper-en_US-libritts_r-medium.tar.bz2\ncd ..\n\n./generate-asset-list.py\n```\n\n  Note that you have to run [./generate-asset-list.py](./generate-asset-list.py) so that Flutter knows where\n  to find the model.\n\n  - 2. 
Change the code to use the downloaded model.\n\n    We have given several examples for different models in [./lib/model.dart](./lib/model.dart).\n    For our selected model, we need to change [./lib/model.dart](./lib/model.dart) so that it looks like below:\n\n```\n// Example 6\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\nmodelDir = 'vits-piper-en_US-libritts_r-medium';\nmodelName = 'en_US-libritts_r-medium.onnx';\ndataDir = 'vits-piper-en_US-libritts_r-medium/espeak-ng-data';\n```\n\n  - 3. That's it.\n\n### Build the APP\n\n  - 1. For Linux\n\n```bash\nflutter build linux\n\n# See below if you get any errors\n```\n\n  - 2. For macOS\n\nTo build a universal2 APP, use\n\n```bash\nflutter build macos\n```\n\nTo build for `x86_64`, use\n\n```bash\nexport FLUTTER_XCODE_ARCHS=x86_64\nflutter build macos\n```\n\nTo build for `arm64`, use\n\n```bash\nexport FLUTTER_XCODE_ARCHS=arm64\nflutter build macos\n```\n\n  - 3. For Windows\n\n```bash\nflutter build windows\n```\n\n  - 4. For Android\n\n```bash\nflutter build apk --split-per-abi\n```\n\n  - 5. For iOS\n\nFirst, connect your iPhone to your computer and use `flutter devices` to show\navailable devices. You will see something like below:\n\n```\nFound 3 connected devices:\n  iPhone (mobile) • 00008030-001064212E85802E • ios            • iOS 16.3 20D47\n  macOS (desktop) • macos                     • darwin-x64     • macOS 13.1 22C65 darwin-x64\n  Chrome (web)    • chrome                    • web-javascript • Google Chrome 126.0.6478.127\n\nNo wireless devices were found.\n\nRun \"flutter emulators\" to list and start any available device emulators.\n\nIf you expected another device to be detected, please run \"flutter doctor\" to diagnose potential issues. You may also try increasing the time to wait for connected devices with the \"--device-timeout\" flag. 
Visit https://flutter.dev/setup/ for troubleshooting tips.\n```\n\nThen you can use\n```\nflutter run -d 00008030-001064212E85802E --release\n```\n\nYou will see something like below:\n```\nLaunching lib/main.dart on iPhone in release mode...\nAutomatically signing iOS for device deployment using specified development team in Xcode project: N5ZH3Z63A6\nRunning pod install...                                           1,773ms\nRunning Xcode build...\nXcode build done.                                            7.9s\nFailed to build iOS app\nCould not build the precompiled application for the device.\nError (Xcode): No profiles for 'com.k2fsa.sherpa.onnx.tts' were found: Xcode couldn't find any iOS App Development provisioning profiles matching\n'com.k2fsa.sherpa.onnx.tts'. Automatic signing is disabled and unable to generate a profile. To enable automatic signing, pass\n-allowProvisioningUpdates to xcodebuild.\n/Users/fangjun/open-source/sherpa-onnx/flutter-examples/tts/ios/Runner.xcodeproj\n\n\n\nIt appears that there was a problem signing your application prior to installation on the device.\n\nVerify that the Bundle Identifier in your project is your signing id in Xcode\n  open ios/Runner.xcworkspace\n\nAlso try selecting 'Product > Build' to fix the problem.\n\nError running application on iPhone.\n```\n\nAfter you have followed the instructions in the above log, run the command again:\n\n> Note: I ran `open ios/Runner.xcworkspace` and clicked `Product -> Build`.\n\n```\nflutter run -d 00008030-001064212E85802E --release\n```\n\nFinally, it will show something like below:\n\n```\nLaunching lib/main.dart on iPhone in release mode...\nAutomatically signing iOS for device deployment using specified development team in Xcode project: N5ZH3Z63A6\nRunning Xcode build...\n └─Compiling, linking and signing...                         6.5s\nXcode build done.                                           18.3s\nInstalling and launching...                                        
22.9s\n\nFlutter run key commands.\nh List all available interactive commands.\nc Clear the screen\nq Quit (terminate the application on the device).\n```\n\n## Fix for Linux\n\nIf you get the following errors on Linux,\n\n```\nBuilding Linux application...\nCMake Error at /usr/local/share/cmake-3.29/Modules/FindPkgConfig.cmake:634 (message):\n  The following required packages were not found:\n\n   - gstreamer-1.0\n\nCall Stack (most recent call first):\n  /usr/local/share/cmake-3.29/Modules/FindPkgConfig.cmake:862 (_pkg_check_modules_internal)\n  flutter/ephemeral/.plugin_symlinks/audioplayers_linux/linux/CMakeLists.txt:24 (pkg_check_modules)\n```\n\nplease run:\n\n```bash\nsudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libunwind-dev\n```\n\nSee also <https://github.com/bluefireteam/audioplayers/tree/main/packages/audioplayers_linux#setup-for-linux>\nfor the above error.\n"
  },
  {
    "path": "flutter-examples/tts/analysis_options.yaml",
    "content": "# This file configures the analyzer, which statically analyzes Dart code to\n# check for errors, warnings, and lints.\n#\n# The issues identified by the analyzer are surfaced in the UI of Dart-enabled\n# IDEs (https://dart.dev/tools#ides-and-editors). The analyzer can also be\n# invoked from the command line by running `flutter analyze`.\n\n# The following line activates a set of recommended lints for Flutter apps,\n# packages, and plugins designed to encourage good coding practices.\ninclude: package:flutter_lints/flutter.yaml\n\nlinter:\n  # The lint rules applied to this project can be customized in the\n  # section below to disable rules from the `package:flutter_lints/flutter.yaml`\n  # included above or to enable additional rules. A list of all available lints\n  # and their documentation is published at https://dart.dev/lints.\n  #\n  # Instead of disabling a lint rule for the entire project in the\n  # section below, it can also be suppressed for a single line of code\n  # or a specific dart file by using the `// ignore: name_of_lint` and\n  # `// ignore_for_file: name_of_lint` syntax on the line or in the file\n  # producing the lint.\n  rules:\n    # avoid_print: false  # Uncomment to disable the `avoid_print` rule\n    # prefer_single_quotes: true  # Uncomment to enable the `prefer_single_quotes` rule\n\n# Additional information about this file can be found at\n# https://dart.dev/guides/language/analysis-options\n"
  },
  {
    "path": "flutter-examples/tts/android/.gitignore",
    "content": "gradle-wrapper.jar\n/.gradle\n/captures/\n/gradlew\n/gradlew.bat\n/local.properties\nGeneratedPluginRegistrant.java\n\n# Remember to never publicly share your keystore.\n# See https://flutter.dev/docs/deployment/android#reference-the-keystore-from-the-app\nkey.properties\n**/*.keystore\n**/*.jks\n"
  },
  {
    "path": "flutter-examples/tts/android/app/build.gradle",
    "content": "plugins {\n    id \"com.android.application\"\n    id \"kotlin-android\"\n    // The Flutter Gradle Plugin must be applied after the Android and Kotlin Gradle plugins.\n    id \"dev.flutter.flutter-gradle-plugin\"\n}\n\ndef localProperties = new Properties()\ndef localPropertiesFile = rootProject.file(\"local.properties\")\nif (localPropertiesFile.exists()) {\n    localPropertiesFile.withReader(\"UTF-8\") { reader ->\n        localProperties.load(reader)\n    }\n}\n\ndef flutterVersionCode = localProperties.getProperty(\"flutter.versionCode\")\nif (flutterVersionCode == null) {\n    flutterVersionCode = \"1\"\n}\n\ndef flutterVersionName = localProperties.getProperty(\"flutter.versionName\")\nif (flutterVersionName == null) {\n    flutterVersionName = \"1.0\"\n}\n\nandroid {\n    namespace = \"com.k2fsa.sherpa.onnx.tts\"\n    compileSdk = 35\n    ndkVersion = flutter.ndkVersion\n\n    compileOptions {\n        sourceCompatibility = JavaVersion.toVersion(17)\n        targetCompatibility = JavaVersion.toVersion(17)\n    }\n\n    kotlinOptions {\n        jvmTarget = \"17\"\n    }\n\n    java {\n        toolchain {\n            languageVersion = JavaLanguageVersion.of(17)\n        }\n    }\n\n    defaultConfig {\n        // TODO: Specify your own unique Application ID (https://developer.android.com/studio/build/application-id.html).\n        applicationId = \"com.k2fsa.sherpa.onnx.tts\"\n        // You can update the following values to match your application needs.\n        // For more information, see: https://docs.flutter.dev/deployment/android#reviewing-the-gradle-build-configuration.\n        minSdk = flutter.minSdkVersion\n        targetSdk = 35\n        versionCode = flutterVersionCode.toInteger()\n        versionName = flutterVersionName\n    }\n\n    buildTypes {\n        release {\n            // TODO: Add your own signing config for the release build.\n            // Signing with the debug keys for now, so `flutter run --release` works.\n     
       signingConfig = signingConfigs.debug\n        }\n    }\n}\n\nflutter {\n    source = \"../..\"\n}\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/debug/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <!-- The INTERNET permission is required for development. Specifically,\n         the Flutter tool needs it to communicate with the running application\n         to allow setting breakpoints, to provide hot reload, etc.\n    -->\n    <uses-permission android:name=\"android.permission.INTERNET\"/>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <application\n        android:label=\"tts\"\n        android:name=\"${applicationName}\"\n        android:icon=\"@mipmap/ic_launcher\">\n        <activity\n            android:name=\".MainActivity\"\n            android:exported=\"true\"\n            android:launchMode=\"singleTop\"\n            android:taskAffinity=\"\"\n            android:theme=\"@style/LaunchTheme\"\n            android:configChanges=\"orientation|keyboardHidden|keyboard|screenSize|smallestScreenSize|locale|layoutDirection|fontScale|screenLayout|density|uiMode\"\n            android:hardwareAccelerated=\"true\"\n            android:windowSoftInputMode=\"adjustResize\">\n            <!-- Specifies an Android theme to apply to this Activity as soon as\n                 the Android process has started. This theme is visible to the user\n                 while the Flutter UI initializes. After that, this theme continues\n                 to determine the Window background behind the Flutter UI. 
-->\n            <meta-data\n              android:name=\"io.flutter.embedding.android.NormalTheme\"\n              android:resource=\"@style/NormalTheme\"\n              />\n            <intent-filter>\n                <action android:name=\"android.intent.action.MAIN\"/>\n                <category android:name=\"android.intent.category.LAUNCHER\"/>\n            </intent-filter>\n        </activity>\n        <!-- Don't delete the meta-data below.\n             This is used by the Flutter tool to generate GeneratedPluginRegistrant.java -->\n        <meta-data\n            android:name=\"flutterEmbedding\"\n            android:value=\"2\" />\n    </application>\n    <!-- Required to query activities that can process text, see:\n         https://developer.android.com/training/package-visibility and\n         https://developer.android.com/reference/android/content/Intent#ACTION_PROCESS_TEXT.\n\n         In particular, this is used by the Flutter engine in io.flutter.plugin.text.ProcessTextPlugin. -->\n    <queries>\n        <intent>\n            <action android:name=\"android.intent.action.PROCESS_TEXT\"/>\n            <data android:mimeType=\"text/plain\"/>\n        </intent>\n    </queries>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/kotlin/com/example/tts/MainActivity.kt",
    "content": "package com.k2fsa.sherpa.onnx.tts\n\nimport io.flutter.embedding.android.FlutterActivity\n\nclass MainActivity: FlutterActivity()\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/res/drawable/launch_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<!-- Modify this file to customize your launch splash screen -->\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item android:drawable=\"@android:color/white\" />\n\n    <!-- You can insert your own image assets here -->\n    <!-- <item>\n        <bitmap\n            android:gravity=\"center\"\n            android:src=\"@mipmap/launch_image\" />\n    </item> -->\n</layer-list>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/res/drawable-v21/launch_background.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<!-- Modify this file to customize your launch splash screen -->\n<layer-list xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <item android:drawable=\"?android:colorBackground\" />\n\n    <!-- You can insert your own image assets here -->\n    <!-- <item>\n        <bitmap\n            android:gravity=\"center\"\n            android:src=\"@mipmap/launch_image\" />\n    </item> -->\n</layer-list>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/res/values/styles.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <!-- Theme applied to the Android Window while the process is starting when the OS's Dark Mode setting is off -->\n    <style name=\"LaunchTheme\" parent=\"@android:style/Theme.Light.NoTitleBar\">\n        <!-- Show a splash screen on the activity. Automatically removed when\n             the Flutter engine draws its first frame -->\n        <item name=\"android:windowBackground\">@drawable/launch_background</item>\n    </style>\n    <!-- Theme applied to the Android Window as soon as the process has started.\n         This theme determines the color of the Android Window while your\n         Flutter UI initializes, as well as behind your Flutter UI while its\n         running.\n\n         This Theme is only used starting with V2 of Flutter's Android embedding. -->\n    <style name=\"NormalTheme\" parent=\"@android:style/Theme.Light.NoTitleBar\">\n        <item name=\"android:windowBackground\">?android:colorBackground</item>\n    </style>\n</resources>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/main/res/values-night/styles.xml",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<resources>\n    <!-- Theme applied to the Android Window while the process is starting when the OS's Dark Mode setting is on -->\n    <style name=\"LaunchTheme\" parent=\"@android:style/Theme.Black.NoTitleBar\">\n        <!-- Show a splash screen on the activity. Automatically removed when\n             the Flutter engine draws its first frame -->\n        <item name=\"android:windowBackground\">@drawable/launch_background</item>\n    </style>\n    <!-- Theme applied to the Android Window as soon as the process has started.\n         This theme determines the color of the Android Window while your\n         Flutter UI initializes, as well as behind your Flutter UI while its\n         running.\n\n         This Theme is only used starting with V2 of Flutter's Android embedding. -->\n    <style name=\"NormalTheme\" parent=\"@android:style/Theme.Black.NoTitleBar\">\n        <item name=\"android:windowBackground\">?android:colorBackground</item>\n    </style>\n</resources>\n"
  },
  {
    "path": "flutter-examples/tts/android/app/src/profile/AndroidManifest.xml",
    "content": "<manifest xmlns:android=\"http://schemas.android.com/apk/res/android\">\n    <!-- The INTERNET permission is required for development. Specifically,\n         the Flutter tool needs it to communicate with the running application\n         to allow setting breakpoints, to provide hot reload, etc.\n    -->\n    <uses-permission android:name=\"android.permission.INTERNET\"/>\n</manifest>\n"
  },
  {
    "path": "flutter-examples/tts/android/build.gradle",
    "content": "allprojects {\n    repositories {\n        google()\n        mavenCentral()\n    }\n}\n\nrootProject.buildDir = \"../build\"\nsubprojects {\n    project.buildDir = \"${rootProject.buildDir}/${project.name}\"\n}\nsubprojects {\n    project.evaluationDependsOn(\":app\")\n}\n\ntasks.register(\"clean\", Delete) {\n    delete rootProject.buildDir\n}\n"
  },
  {
    "path": "flutter-examples/tts/android/gradle/wrapper/gradle-wrapper.properties",
    "content": "distributionBase=GRADLE_USER_HOME\ndistributionPath=wrapper/dists\nzipStoreBase=GRADLE_USER_HOME\nzipStorePath=wrapper/dists\ndistributionUrl=https\\://services.gradle.org/distributions/gradle-8.9-bin.zip\n"
  },
  {
    "path": "flutter-examples/tts/android/gradle.properties",
    "content": "org.gradle.jvmargs=-Xmx4G -XX:+HeapDumpOnOutOfMemoryError\nandroid.useAndroidX=true\nandroid.enableJetifier=true\nFLUTTER_COMPILE_SDK_VERSION=35\norg.gradle.daemon=false\n"
  },
  {
    "path": "flutter-examples/tts/android/settings.gradle",
    "content": "pluginManagement {\n    def flutterSdkPath = {\n        def properties = new Properties()\n        file(\"local.properties\").withInputStream { properties.load(it) }\n        def flutterSdkPath = properties.getProperty(\"flutter.sdk\")\n        assert flutterSdkPath != null, \"flutter.sdk not set in local.properties\"\n        return flutterSdkPath\n    }()\n\n    includeBuild(\"$flutterSdkPath/packages/flutter_tools/gradle\")\n\n    repositories {\n        google()\n        mavenCentral()\n        gradlePluginPortal()\n    }\n}\n\nplugins {\n    id \"dev.flutter.flutter-plugin-loader\" version \"1.0.0\"\n    id \"com.android.application\" version \"8.7.0\" apply false\n    id \"org.jetbrains.kotlin.android\" version \"1.9.24\" apply false\n}\n\ninclude \":app\"\n"
  },
  {
    "path": "flutter-examples/tts/assets/.gitkeep",
    "content": ""
  },
  {
    "path": "flutter-examples/tts/generate-asset-list.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file assumes that\n  assets:\nis the last line in ./pubspec.yaml\n\nIt reads the file names of all files from the ./assets folder\nand turns them as assets and writes them into ./pubspec.yaml\n\"\"\"\n\nimport os\n\ndef main():\n    target = \"./assets/\"\n    space = \"    \"\n    subfolders = []\n    patterns_to_skip = [\"1.5x\", \"2.x\", \"3.x\", \"4.x\"]\n    for root, dirs, files in os.walk(target):\n        for d in dirs:\n            path = os.path.join(root, d).replace(\"\\\\\", \"/\")\n            if os.listdir(path):\n                path = path.lstrip('./')\n                if any(path.endswith(pattern) for pattern in patterns_to_skip):\n                    continue\n                subfolders.append(\"{space}- {path}/\".format(space=space, path=path))\n\n    assert subfolders, \"The subfolders list is empty.\"\n\n    subfolders = sorted(subfolders)\n\n    loc_of_flutter = -1\n    loc_of_flutter_asset = -1\n    loc_of_end_flutter_asset = -1\n    loc_of_end_flutter = -1\n\n    with open(\"./pubspec.yaml\", encoding=\"utf-8\") as f:\n        lines = f.readlines()\n        for index, line in enumerate(lines):\n            if line == \"flutter:\\n\":\n                loc_of_flutter = index + 1\n                if index == len(lines) - 1:\n                    loc_of_end_flutter = index + 2\n                continue\n            if loc_of_flutter >= 0 and loc_of_flutter_asset < 0 and line == \"  assets:\\n\":\n                loc_of_flutter_asset = index + 1\n                continue\n\n    with open(\"./pubspec.yaml\", encoding=\"utf-8\") as f:\n        lines = f.readlines()\n        for index, line in enumerate(lines):\n            if index < loc_of_flutter:\n                continue\n            if loc_of_flutter_asset >= 0:\n                if line.startswith(\"    - assets/\"):\n                    loc_of_end_flutter_asset = index + 1\n                    continue\n                else:\n              
      loc_of_end_flutter = index + 1\n                    continue\n            else:\n                if line.startswith(\"  \") is False:\n                    loc_of_end_flutter = index + 1\n                    continue\n                else:\n                    loc_of_end_flutter = index + 2\n                    break\n\n    assert loc_of_flutter >= 0, \"The 'flutter:' section is missing in the pubspec.yaml file.\"\n\n    with open(\"./pubspec.yaml\", \"w\", encoding=\"utf-8\") as f:\n        for index, line in enumerate(lines):\n            if loc_of_end_flutter_asset >= 0:\n                if index + 1 < loc_of_flutter_asset or index + 1 > loc_of_end_flutter_asset:\n                    f.write(line)\n                if index + 1 == loc_of_flutter_asset:\n                    f.write(\"  assets:\\n\")\n                    for folder in subfolders:\n                        f.write(\"{folder}\\n\".format(folder=folder))\n            else:\n                if index + 1 < loc_of_end_flutter or index + 1 > loc_of_end_flutter:\n                    f.write(line)\n                if index + 1 == loc_of_end_flutter:\n                    f.write(\"  assets:\\n\")\n                    for indexOfFolder, folder in enumerate(subfolders):\n                        f.write(\"{folder}\\n\".format(folder=folder))\n                        if indexOfFolder == len(subfolders) - 1:\n                            f.write(\"\\n\")\n                            break\n\n        if loc_of_end_flutter == len(lines) + 1:\n            f.write(\"\\n\")\n            f.write(\"  assets:\\n\")\n            for folder in subfolders:\n                f.write(\"{folder}\\n\".format(folder=folder))\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "flutter-examples/tts/ios/.gitignore",
    "content": "**/dgph\n*.mode1v3\n*.mode2v3\n*.moved-aside\n*.pbxuser\n*.perspectivev3\n**/*sync/\n.sconsign.dblite\n.tags*\n**/.vagrant/\n**/DerivedData/\nIcon?\n**/Pods/\n**/.symlinks/\nprofile\nxcuserdata\n**/.generated/\nFlutter/App.framework\nFlutter/Flutter.framework\nFlutter/Flutter.podspec\nFlutter/Generated.xcconfig\nFlutter/ephemeral/\nFlutter/app.flx\nFlutter/app.zip\nFlutter/flutter_assets/\nFlutter/flutter_export_environment.sh\nServiceDefinitions.json\nRunner/GeneratedPluginRegistrant.*\n\n# Exceptions to above rules.\n!default.mode1v3\n!default.mode2v3\n!default.pbxuser\n!default.perspectivev3\n"
  },
  {
    "path": "flutter-examples/tts/ios/Flutter/AppFrameworkInfo.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n  <key>CFBundleDevelopmentRegion</key>\n  <string>en</string>\n  <key>CFBundleExecutable</key>\n  <string>App</string>\n  <key>CFBundleIdentifier</key>\n  <string>io.flutter.flutter.app</string>\n  <key>CFBundleInfoDictionaryVersion</key>\n  <string>6.0</string>\n  <key>CFBundleName</key>\n  <string>App</string>\n  <key>CFBundlePackageType</key>\n  <string>FMWK</string>\n  <key>CFBundleShortVersionString</key>\n  <string>1.0</string>\n  <key>CFBundleSignature</key>\n  <string>????</string>\n  <key>CFBundleVersion</key>\n  <string>1.0</string>\n  <key>MinimumOSVersion</key>\n  <string>12.0</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Flutter/Debug.xcconfig",
    "content": "#include \"Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/ios/Flutter/Release.xcconfig",
    "content": "#include \"Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/AppDelegate.swift",
    "content": "import Flutter\nimport UIKit\n\n@UIApplicationMain\n@objc class AppDelegate: FlutterAppDelegate {\n  override func application(\n    _ application: UIApplication,\n    didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?\n  ) -> Bool {\n    GeneratedPluginRegistrant.register(with: self)\n    return super.application(application, didFinishLaunchingWithOptions: launchOptions)\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-20x20@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-20x20@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-29x29@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-40x40@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-40x40@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"60x60\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-60x60@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"60x60\",\n      \"idiom\" : \"iphone\",\n      \"filename\" : \"Icon-App-60x60@3x.png\",\n      \"scale\" : \"3x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-20x20@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"20x20\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-20x20@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-29x29@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"29x29\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-29x29@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      
\"size\" : \"40x40\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-40x40@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"40x40\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-40x40@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"76x76\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-76x76@1x.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"76x76\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-76x76@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"83.5x83.5\",\n      \"idiom\" : \"ipad\",\n      \"filename\" : \"Icon-App-83.5x83.5@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"1024x1024\",\n      \"idiom\" : \"ios-marketing\",\n      \"filename\" : \"Icon-App-1024x1024@1x.png\",\n      \"scale\" : \"1x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Assets.xcassets/LaunchImage.imageset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage@2x.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"idiom\" : \"universal\",\n      \"filename\" : \"LaunchImage@3x.png\",\n      \"scale\" : \"3x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Assets.xcassets/LaunchImage.imageset/README.md",
    "content": "# Launch Screen Assets\n\nYou can customize the launch screen with your own desired assets by replacing the image files in this directory.\n\nYou can also do it by opening your Flutter project's Xcode project with `open ios/Runner.xcworkspace`, selecting `Runner/Assets.xcassets` in the Project Navigator and dropping in the desired images."
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Base.lproj/LaunchScreen.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"12121\" systemVersion=\"16G29\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" launchScreen=\"YES\" colorMatched=\"YES\" initialViewController=\"01J-lp-oVM\">\n    <dependencies>\n        <deployment identifier=\"iOS\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"12089\"/>\n    </dependencies>\n    <scenes>\n        <!--View Controller-->\n        <scene sceneID=\"EHf-IW-A2E\">\n            <objects>\n                <viewController id=\"01J-lp-oVM\" sceneMemberID=\"viewController\">\n                    <layoutGuides>\n                        <viewControllerLayoutGuide type=\"top\" id=\"Ydg-fD-yQy\"/>\n                        <viewControllerLayoutGuide type=\"bottom\" id=\"xbc-2k-c8Z\"/>\n                    </layoutGuides>\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"Ze5-6b-2t3\">\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <subviews>\n                            <imageView opaque=\"NO\" clipsSubviews=\"YES\" multipleTouchEnabled=\"YES\" contentMode=\"center\" image=\"LaunchImage\" translatesAutoresizingMaskIntoConstraints=\"NO\" id=\"YRO-k0-Ey4\">\n                            </imageView>\n                        </subviews>\n                        <color key=\"backgroundColor\" red=\"1\" green=\"1\" blue=\"1\" alpha=\"1\" colorSpace=\"custom\" customColorSpace=\"sRGB\"/>\n                        <constraints>\n                            <constraint firstItem=\"YRO-k0-Ey4\" firstAttribute=\"centerX\" secondItem=\"Ze5-6b-2t3\" secondAttribute=\"centerX\" id=\"1a2-6s-vTC\"/>\n                            <constraint firstItem=\"YRO-k0-Ey4\" firstAttribute=\"centerY\" 
secondItem=\"Ze5-6b-2t3\" secondAttribute=\"centerY\" id=\"4X2-HB-R7a\"/>\n                        </constraints>\n                    </view>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"iYj-Kq-Ea1\" userLabel=\"First Responder\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n            <point key=\"canvasLocation\" x=\"53\" y=\"375\"/>\n        </scene>\n    </scenes>\n    <resources>\n        <image name=\"LaunchImage\" width=\"168\" height=\"185\"/>\n    </resources>\n</document>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Base.lproj/Main.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"10117\" systemVersion=\"15F34\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" useTraitCollections=\"YES\" initialViewController=\"BYZ-38-t0r\">\n    <dependencies>\n        <deployment identifier=\"iOS\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"10085\"/>\n    </dependencies>\n    <scenes>\n        <!--Flutter View Controller-->\n        <scene sceneID=\"tne-QT-ifu\">\n            <objects>\n                <viewController id=\"BYZ-38-t0r\" customClass=\"FlutterViewController\" sceneMemberID=\"viewController\">\n                    <layoutGuides>\n                        <viewControllerLayoutGuide type=\"top\" id=\"y3c-jy-aDJ\"/>\n                        <viewControllerLayoutGuide type=\"bottom\" id=\"wfy-db-euE\"/>\n                    </layoutGuides>\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"8bC-Xf-vdC\">\n                        <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"600\" height=\"600\"/>\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <color key=\"backgroundColor\" white=\"1\" alpha=\"1\" colorSpace=\"custom\" customColorSpace=\"calibratedWhite\"/>\n                    </view>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"dkx-z0-nzr\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n        </scene>\n    </scenes>\n</document>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>CFBundleDevelopmentRegion</key>\n\t<string>$(DEVELOPMENT_LANGUAGE)</string>\n\t<key>CFBundleDisplayName</key>\n\t<string>Tts</string>\n\t<key>CFBundleExecutable</key>\n\t<string>$(EXECUTABLE_NAME)</string>\n\t<key>CFBundleIdentifier</key>\n\t<string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>\n\t<key>CFBundleInfoDictionaryVersion</key>\n\t<string>6.0</string>\n\t<key>CFBundleName</key>\n\t<string>tts</string>\n\t<key>CFBundlePackageType</key>\n\t<string>APPL</string>\n\t<key>CFBundleShortVersionString</key>\n\t<string>$(FLUTTER_BUILD_NAME)</string>\n\t<key>CFBundleSignature</key>\n\t<string>????</string>\n\t<key>CFBundleVersion</key>\n\t<string>$(FLUTTER_BUILD_NUMBER)</string>\n\t<key>LSRequiresIPhoneOS</key>\n\t<true/>\n\t<key>UILaunchStoryboardName</key>\n\t<string>LaunchScreen</string>\n\t<key>UIMainStoryboardFile</key>\n\t<string>Main</string>\n\t<key>UISupportedInterfaceOrientations</key>\n\t<array>\n\t\t<string>UIInterfaceOrientationPortrait</string>\n\t\t<string>UIInterfaceOrientationLandscapeLeft</string>\n\t\t<string>UIInterfaceOrientationLandscapeRight</string>\n\t</array>\n\t<key>UISupportedInterfaceOrientations~ipad</key>\n\t<array>\n\t\t<string>UIInterfaceOrientationPortrait</string>\n\t\t<string>UIInterfaceOrientationPortraitUpsideDown</string>\n\t\t<string>UIInterfaceOrientationLandscapeLeft</string>\n\t\t<string>UIInterfaceOrientationLandscapeRight</string>\n\t</array>\n\t<key>CADisableMinimumFrameDurationOnPhone</key>\n\t<true/>\n\t<key>UIApplicationSupportsIndirectInputEvents</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner/Runner-Bridging-Header.h",
    "content": "#import \"GeneratedPluginRegistrant.h\"\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 54;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\t1498D2341E8E89220040F4C2 /* GeneratedPluginRegistrant.m in Sources */ = {isa = PBXBuildFile; fileRef = 1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */; };\n\t\t331C808B294A63AB00263BE5 /* RunnerTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 331C807B294A618700263BE5 /* RunnerTests.swift */; };\n\t\t3B3967161E833CAA004F5970 /* AppFrameworkInfo.plist in Resources */ = {isa = PBXBuildFile; fileRef = 3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */; };\n\t\t74858FAF1ED2DC5600515810 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 74858FAE1ED2DC5600515810 /* AppDelegate.swift */; };\n\t\t97C146FC1CF9000F007C117D /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FA1CF9000F007C117D /* Main.storyboard */; };\n\t\t97C146FE1CF9000F007C117D /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FD1CF9000F007C117D /* Assets.xcassets */; };\n\t\t97C147011CF9000F007C117D /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\t331C8085294A63A400263BE5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 97C146E61CF9000F007C117D /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 97C146ED1CF9000F007C117D;\n\t\t\tremoteInfo = Runner;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t9705A1C41CF9048500538489 /* Embed Frameworks */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tname = \"Embed Frameworks\";\n\t\t\trunOnlyForDeploymentPostprocessing = 
0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t1498D2321E8E86230040F4C2 /* GeneratedPluginRegistrant.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = GeneratedPluginRegistrant.h; sourceTree = \"<group>\"; };\n\t\t1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.objc; path = GeneratedPluginRegistrant.m; sourceTree = \"<group>\"; };\n\t\t331C807B294A618700263BE5 /* RunnerTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = RunnerTests.swift; sourceTree = \"<group>\"; };\n\t\t331C8081294A63A400263BE5 /* RunnerTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = RunnerTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.xml; name = AppFrameworkInfo.plist; path = Flutter/AppFrameworkInfo.plist; sourceTree = \"<group>\"; };\n\t\t74858FAD1ED2DC5600515810 /* Runner-Bridging-Header.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = \"Runner-Bridging-Header.h\"; sourceTree = \"<group>\"; };\n\t\t74858FAE1ED2DC5600515810 /* AppDelegate.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; name = Release.xcconfig; path = Flutter/Release.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; name = Debug.xcconfig; path = Flutter/Debug.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB31CF90195004384FC /* Generated.xcconfig */ = {isa = PBXFileReference; 
fileEncoding = 4; lastKnownFileType = text.xcconfig; name = Generated.xcconfig; path = Flutter/Generated.xcconfig; sourceTree = \"<group>\"; };\n\t\t97C146EE1CF9000F007C117D /* Runner.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = Runner.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t97C146FB1CF9000F007C117D /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = \"<group>\"; };\n\t\t97C146FD1CF9000F007C117D /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\t97C147001CF9000F007C117D /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = \"<group>\"; };\n\t\t97C147021CF9000F007C117D /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t97C146EB1CF9000F007C117D /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t331C8082294A63A400263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t331C807B294A618700263BE5 /* RunnerTests.swift */,\n\t\t\t);\n\t\t\tpath = RunnerTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t9740EEB11CF90186004384FC /* Flutter */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t3B3967151E833CAA004F5970 /* AppFrameworkInfo.plist */,\n\t\t\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */,\n\t\t\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */,\n\t\t\t\t9740EEB31CF90195004384FC /* Generated.xcconfig */,\n\t\t\t);\n\t\t\tname = 
Flutter;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146E51CF9000F007C117D = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t9740EEB11CF90186004384FC /* Flutter */,\n\t\t\t\t97C146F01CF9000F007C117D /* Runner */,\n\t\t\t\t97C146EF1CF9000F007C117D /* Products */,\n\t\t\t\t331C8082294A63A400263BE5 /* RunnerTests */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146EF1CF9000F007C117D /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146EE1CF9000F007C117D /* Runner.app */,\n\t\t\t\t331C8081294A63A400263BE5 /* RunnerTests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146F01CF9000F007C117D /* Runner */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146FA1CF9000F007C117D /* Main.storyboard */,\n\t\t\t\t97C146FD1CF9000F007C117D /* Assets.xcassets */,\n\t\t\t\t97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */,\n\t\t\t\t97C147021CF9000F007C117D /* Info.plist */,\n\t\t\t\t1498D2321E8E86230040F4C2 /* GeneratedPluginRegistrant.h */,\n\t\t\t\t1498D2331E8E89220040F4C2 /* GeneratedPluginRegistrant.m */,\n\t\t\t\t74858FAE1ED2DC5600515810 /* AppDelegate.swift */,\n\t\t\t\t74858FAD1ED2DC5600515810 /* Runner-Bridging-Header.h */,\n\t\t\t);\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t331C8080294A63A400263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 331C8087294A63A400263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t331C807D294A63A400263BE5 /* Sources */,\n\t\t\t\t331C807F294A63A400263BE5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t331C8086294A63A400263BE5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = RunnerTests;\n\t\t\tproductName = RunnerTests;\n\t\t\tproductReference = 331C8081294A63A400263BE5 /* RunnerTests.xctest 
*/;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\t97C146ED1CF9000F007C117D /* Runner */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 97C147051CF9000F007C117D /* Build configuration list for PBXNativeTarget \"Runner\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t9740EEB61CF901F6004384FC /* Run Script */,\n\t\t\t\t97C146EA1CF9000F007C117D /* Sources */,\n\t\t\t\t97C146EB1CF9000F007C117D /* Frameworks */,\n\t\t\t\t97C146EC1CF9000F007C117D /* Resources */,\n\t\t\t\t9705A1C41CF9048500538489 /* Embed Frameworks */,\n\t\t\t\t3B06AD1E1E4923F5004D2608 /* Thin Binary */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = Runner;\n\t\t\tproductName = Runner;\n\t\t\tproductReference = 97C146EE1CF9000F007C117D /* Runner.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t97C146E61CF9000F007C117D /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = YES;\n\t\t\t\tLastUpgradeCheck = 1510;\n\t\t\t\tORGANIZATIONNAME = \"\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t331C8080294A63A400263BE5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.0;\n\t\t\t\t\t\tTestTargetID = 97C146ED1CF9000F007C117D;\n\t\t\t\t\t};\n\t\t\t\t\t97C146ED1CF9000F007C117D = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 7.3.1;\n\t\t\t\t\t\tLastSwiftMigration = 1100;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 97C146E91CF9000F007C117D /* Build configuration list for PBXProject \"Runner\" */;\n\t\t\tcompatibilityVersion = \"Xcode 9.3\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = 97C146E51CF9000F007C117D;\n\t\t\tproductRefGroup = 97C146EF1CF9000F007C117D /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = 
(\n\t\t\t\t97C146ED1CF9000F007C117D /* Runner */,\n\t\t\t\t331C8080294A63A400263BE5 /* RunnerTests */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\t331C807F294A63A400263BE5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t97C146EC1CF9000F007C117D /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t97C147011CF9000F007C117D /* LaunchScreen.storyboard in Resources */,\n\t\t\t\t3B3967161E833CAA004F5970 /* AppFrameworkInfo.plist in Resources */,\n\t\t\t\t97C146FE1CF9000F007C117D /* Assets.xcassets in Resources */,\n\t\t\t\t97C146FC1CF9000F007C117D /* Main.storyboard in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXShellScriptBuildPhase section */\n\t\t3B06AD1E1E4923F5004D2608 /* Thin Binary */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\t\"${TARGET_BUILD_DIR}/${INFOPLIST_PATH}\",\n\t\t\t);\n\t\t\tname = \"Thin Binary\";\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"/bin/sh \\\"$FLUTTER_ROOT/packages/flutter_tools/bin/xcode_backend.sh\\\" embed_and_thin\";\n\t\t};\n\t\t9740EEB61CF901F6004384FC /* Run Script */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t);\n\t\t\tname = \"Run Script\";\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"/bin/sh \\\"$FLUTTER_ROOT/packages/flutter_tools/bin/xcode_backend.sh\\\" build\";\n\t\t};\n/* End 
PBXShellScriptBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t331C807D294A63A400263BE5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t331C808B294A63AB00263BE5 /* RunnerTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t97C146EA1CF9000F007C117D /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t74858FAF1ED2DC5600515810 /* AppDelegate.swift in Sources */,\n\t\t\t\t1498D2341E8E89220040F4C2 /* GeneratedPluginRegistrant.m in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\t331C8086294A63A400263BE5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 97C146ED1CF9000F007C117D /* Runner */;\n\t\t\ttargetProxy = 331C8085294A63A400263BE5 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\t97C146FA1CF9000F007C117D /* Main.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t97C146FB1CF9000F007C117D /* Base */,\n\t\t\t);\n\t\t\tname = Main.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t97C146FF1CF9000F007C117D /* LaunchScreen.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t97C147001CF9000F007C117D /* Base */,\n\t\t\t);\n\t\t\tname = LaunchScreen.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin XCBuildConfiguration section */\n\t\t249021D3217E4FDB00AE95B9 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = 
\"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSUPPORTED_PLATFORMS = iphoneos;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t249021D4217E4FDB00AE95B9 /* Profile */ = {\n\t\t\tisa 
= XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t331C8088294A63A400263BE5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t331C8089294A63A400263BE5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = 
\"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t331C808A294A63A400263BE5 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/Runner.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/Runner\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t97C147031CF9000F007C117D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = 
YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t97C147041CF9000F007C117D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++0x\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = 
YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\t\"CODE_SIGN_IDENTITY[sdk=iphoneos*]\" = \"iPhone Developer\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu99;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 12.0;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSUPPORTED_PLATFORMS = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t97C147061CF9000F007C117D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 9740EEB21CF90195004384FC /* Debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = 
NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t97C147071CF9000F007C117D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCURRENT_PROJECT_VERSION = \"$(FLUTTER_BUILD_NUMBER)\";\n\t\t\t\tDEVELOPMENT_TEAM = N5ZH3Z63A6;\n\t\t\t\tENABLE_BITCODE = NO;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"Runner/Runner-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tVERSIONING_SYSTEM = \"apple-generic\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t331C8087294A63A400263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t331C8088294A63A400263BE5 /* Debug */,\n\t\t\t\t331C8089294A63A400263BE5 /* Release */,\n\t\t\t\t331C808A294A63A400263BE5 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = 
Release;\n\t\t};\n\t\t97C146E91CF9000F007C117D /* Build configuration list for PBXProject \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t97C147031CF9000F007C117D /* Debug */,\n\t\t\t\t97C147041CF9000F007C117D /* Release */,\n\t\t\t\t249021D3217E4FDB00AE95B9 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t97C147051CF9000F007C117D /* Build configuration list for PBXNativeTarget \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t97C147061CF9000F007C117D /* Debug */,\n\t\t\t\t97C147071CF9000F007C117D /* Release */,\n\t\t\t\t249021D4217E4FDB00AE95B9 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 97C146E61CF9000F007C117D /* Project object */;\n}\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcodeproj/project.xcworkspace/xcshareddata/WorkspaceSettings.xcsettings",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>PreviewsEnabled</key>\n\t<false/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcodeproj/xcshareddata/xcschemes/Runner.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1510\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n               BuildableName = \"Runner.app\"\n               BlueprintName = \"Runner\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <MacroExpansion>\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </MacroExpansion>\n      <Testables>\n         <TestableReference\n            skipped = \"NO\"\n            parallelizable = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"331C8080294A63A400263BE5\"\n               BuildableName = \"RunnerTests.xctest\"\n               BlueprintName = \"RunnerTests\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            
</BuildableReference>\n         </TestableReference>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"NO\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Profile\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"97C146ED1CF9000F007C117D\"\n            BuildableName = \"Runner.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"group:Runner.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/Runner.xcworkspace/xcshareddata/WorkspaceSettings.xcsettings",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>PreviewsEnabled</key>\n\t<false/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/ios/RunnerTests/RunnerTests.swift",
    "content": "import Flutter\nimport UIKit\nimport XCTest\n\nclass RunnerTests: XCTestCase {\n\n  func testExample() {\n    // If you add code to the Runner application, consider adding tests here.\n    // See https://developer.apple.com/documentation/xctest for more information about using XCTest.\n  }\n\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/info.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\nimport 'package:url_launcher/url_launcher.dart';\n\nclass InfoScreen extends StatelessWidget {\n  @override\n  Widget build(BuildContext context) {\n    const double height = 20;\n    return Container(\n      child: Padding(\n        padding: const EdgeInsets.all(8.0),\n        child: Column(\n          crossAxisAlignment: CrossAxisAlignment.start,\n          children: <Widget>[\n            Text('Everything is open-sourced.'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Code: https://github.com/k2-fsa/sherpa-onnx'),\n              onTap: () => launch('https://github.com/k2-fsa/sherpa-onnx'),\n            ),\n            SizedBox(height: height),\n            InkWell(\n              child: Text('Doc: https://k2-fsa.github.io/sherpa/onnx/'),\n              onTap: () => launch('https://k2-fsa.github.io/sherpa/onnx/'),\n            ),\n            SizedBox(height: height),\n            Text('QQ group: 744602236'),\n            SizedBox(height: height),\n            InkWell(\n              child: Text(\n                  'WeChat group: https://k2-fsa.github.io/sherpa/social-groups.html'),\n              onTap: () =>\n                  launch('https://k2-fsa.github.io/sherpa/social-groups.html'),\n            ),\n          ],\n        ),\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/isolate_tts.dart",
    "content": "import 'dart:io';\nimport 'dart:isolate';\n\nimport 'package:flutter/material.dart';\nimport 'package:flutter/services.dart';\nimport 'package:media_kit/media_kit.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:path_provider/path_provider.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport 'utils.dart';\n\nclass _IsolateTask<T> {\n  final SendPort sendPort;\n\n  RootIsolateToken? rootIsolateToken;\n\n  _IsolateTask(this.sendPort, this.rootIsolateToken);\n}\n\nclass _PortModel {\n  final String method;\n\n  final SendPort? sendPort;\n  dynamic data;\n\n  _PortModel({\n    required this.method,\n    this.sendPort,\n    this.data,\n  });\n}\n\nclass _TtsManager {\n  /// Communication port owned by the main isolate\n  final ReceivePort receivePort;\n\n  final Isolate isolate;\n\n  final SendPort isolatePort;\n\n  _TtsManager({\n    required this.receivePort,\n    required this.isolate,\n    required this.isolatePort,\n  });\n}\n\nclass IsolateTts {\n  static late final _TtsManager _ttsManager;\n\n  /// Send port used to talk to the background isolate\n  static SendPort get _sendPort => _ttsManager.isolatePort;\n\n  static late sherpa_onnx.OfflineTts _tts;\n\n  static late Player _player;\n\n  static Future<void> init() async {\n    ReceivePort port = ReceivePort();\n    RootIsolateToken? 
rootIsolateToken = RootIsolateToken.instance;\n\n    Isolate isolate = await Isolate.spawn(\n      _isolateEntry,\n      _IsolateTask(port.sendPort, rootIsolateToken),\n      errorsAreFatal: false,\n    );\n    port.listen((msg) async {\n      if (msg is SendPort) {\n        print(11);\n        _ttsManager =\n            _TtsManager(receivePort: port, isolate: isolate, isolatePort: msg);\n        return;\n      }\n    });\n  }\n\n  static Future<void> _isolateEntry(_IsolateTask task) async {\n    if (task.rootIsolateToken != null) {\n      BackgroundIsolateBinaryMessenger.ensureInitialized(\n          task.rootIsolateToken!);\n    }\n    MediaKit.ensureInitialized();\n    _player = Player();\n    sherpa_onnx.initBindings();\n    final receivePort = ReceivePort();\n    task.sendPort.send(receivePort.sendPort);\n\n    String modelDir = '';\n    String modelName = '';\n    String ruleFsts = '';\n    String ruleFars = '';\n    String lexicon = '';\n    String dataDir = '';\n\n    // Example 7\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-melo-tts-zh_en.tar.bz2\n    modelDir = 'vits-melo-tts-zh_en';\n    modelName = 'model.onnx';\n    lexicon = 'lexicon.txt';\n\n    if (modelName == '') {\n      throw Exception(\n          'You are supposed to select a model by changing the code before you run the app');\n    }\n\n    final Directory directory = await getApplicationSupportDirectory();\n    modelName = p.join(directory.path, modelDir, modelName);\n\n    if (ruleFsts != '') {\n      final all = ruleFsts.split(',');\n      var tmp = <String>[];\n      for (final f in all) {\n        tmp.add(p.join(directory.path, f));\n      }\n      ruleFsts = tmp.join(',');\n    }\n\n    if (ruleFars != '') {\n      final all = ruleFars.split(',');\n      var tmp = <String>[];\n      for (final f in all) {\n        tmp.add(p.join(directory.path, f));\n      }\n      ruleFars = 
tmp.join(',');\n    }\n\n    if (lexicon != '') {\n      lexicon = p.join(directory.path, modelDir, lexicon);\n    }\n\n    if (dataDir != '') {\n      dataDir = p.join(directory.path, dataDir);\n    }\n\n    final tokens = p.join(directory.path, modelDir, 'tokens.txt');\n\n    final vits = sherpa_onnx.OfflineTtsVitsModelConfig(\n      model: modelName,\n      lexicon: lexicon,\n      tokens: tokens,\n      dataDir: dataDir,\n    );\n\n    final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n      vits: vits,\n      numThreads: 2,\n      debug: true,\n      provider: 'cpu',\n    );\n\n    final config = sherpa_onnx.OfflineTtsConfig(\n      model: modelConfig,\n      ruleFsts: ruleFsts,\n      ruleFars: ruleFars,\n      maxNumSenetences: 1,\n    );\n    // print(config);\n    receivePort.listen((msg) async {\n      print(msg);\n      if (msg is _PortModel) {\n        switch (msg.method) {\n          case 'generate':\n            {\n              _PortModel _v = msg;\n              final stopwatch = Stopwatch();\n              stopwatch.start();\n              final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n                sid: _v.data['sid'],\n                speed: _v.data['speed'],\n                silenceScale: 0.2,\n              );\n              final audio =\n                  _tts.generateWithConfig(text: _v.data['text'], config: genConfig);\n              final suffix =\n                  '-sid-${_v.data['sid']}-speed-${_v.data['speed'].toStringAsPrecision(2)}';\n              final filename = await generateWaveFilename(suffix);\n\n              final ok = sherpa_onnx.writeWave(\n                filename: filename,\n                samples: audio.samples,\n                sampleRate: audio.sampleRate,\n              );\n\n              if (ok) {\n                stopwatch.stop();\n                double elapsed = stopwatch.elapsed.inMilliseconds.toDouble();\n\n                double waveDuration = audio.samples.length.toDouble() /\n            
        audio.sampleRate.toDouble();\n\n                print('Saved to\\n$filename\\n'\n                    'Elapsed: ${(elapsed / 1000).toStringAsPrecision(4)} s\\n'\n                    'Wave duration: ${waveDuration.toStringAsPrecision(4)} s\\n'\n                    'RTF: ${(elapsed / 1000).toStringAsPrecision(4)}/${waveDuration.toStringAsPrecision(4)} '\n                    '= ${(elapsed / 1000 / waveDuration).toStringAsPrecision(3)} ');\n\n                await _player.open(Media('file:///$filename'));\n                await _player.play();\n              }\n              // Reply to the caller so that generate() stops waiting on\n              // receivePort.first and can close its port.\n              _v.sendPort?.send(ok);\n            }\n            break;\n        }\n      }\n    });\n    _tts = sherpa_onnx.OfflineTts(config);\n  }\n\n  static Future<void> generate(\n      {required String text, int sid = 0, double speed = 1.0}) async {\n    ReceivePort receivePort = ReceivePort();\n    _sendPort.send(_PortModel(\n      method: 'generate',\n      data: {'text': text, 'sid': sid, 'speed': speed},\n      sendPort: receivePort.sendPort,\n    ));\n    await receivePort.first;\n    receivePort.close();\n  }\n}\n\n/// The page widget\nclass IsolateTtsView extends StatefulWidget {\n  const IsolateTtsView({super.key});\n\n  @override\n  State<IsolateTtsView> createState() => _IsolateTtsViewState();\n}\n\nclass _IsolateTtsViewState extends State<IsolateTtsView> {\n  @override\n  void initState() {\n    super.initState();\n    IsolateTts.init();\n  }\n\n  @override\n  Widget build(BuildContext context) {\n    return Scaffold(\n      body: Center(\n        child: ElevatedButton(\n          onPressed: () {\n            IsolateTts.generate(text: '这是已退出的 isolate TTS');\n          },\n          child: Text('Isolate TTS'),\n        ),\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/main.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'package:flutter/material.dart';\n\nimport './info.dart';\nimport './tts.dart';\nimport 'isolate_tts.dart';\n\nvoid main() {\n  runApp(const MyApp());\n}\n\nclass MyApp extends StatelessWidget {\n  const MyApp({super.key});\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      title: 'Next-gen Kaldi flutter demo',\n      theme: ThemeData(\n        colorScheme: ColorScheme.fromSeed(seedColor: Colors.deepPurple),\n        useMaterial3: true,\n      ),\n      home: const MyHomePage(title: 'Next-gen Kaldi with Flutter'),\n    );\n  }\n}\n\nclass MyHomePage extends StatefulWidget {\n  const MyHomePage({super.key, required this.title});\n\n  final String title;\n\n  @override\n  State<MyHomePage> createState() => _MyHomePageState();\n}\n\nclass _MyHomePageState extends State<MyHomePage> {\n  int _currentIndex = 0;\n  final List<Widget> _tabs = [\n    TtsScreen(),\n    InfoScreen(),\n    IsolateTtsView(),\n  ];\n  @override\n  Widget build(BuildContext context) {\n    return Scaffold(\n      appBar: AppBar(\n        title: Text(widget.title),\n      ),\n      body: _tabs[_currentIndex],\n      bottomNavigationBar: BottomNavigationBar(\n        currentIndex: _currentIndex,\n        onTap: (int index) {\n          setState(() {\n            _currentIndex = index;\n          });\n        },\n        items: [\n          BottomNavigationBarItem(\n            icon: Icon(Icons.home),\n            label: 'Home',\n          ),\n          BottomNavigationBarItem(\n            icon: Icon(Icons.info),\n            label: 'Info',\n          ),\n          BottomNavigationBarItem(\n            icon: Icon(Icons.multiline_chart),\n            label: 'isolate',\n          ),\n        ],\n      ),\n    );\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/model.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\n\nimport \"dart:io\";\n\nimport 'package:flutter/services.dart';\nimport 'package:path_provider/path_provider.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './utils.dart';\n\nFuture<sherpa_onnx.OfflineTts> createOfflineTts() async {\n  // sherpa_onnx requires that model files are in the local disk, so we\n  // need to copy all asset files to disk.\n  await copyAllAssetFiles();\n\n  sherpa_onnx.initBindings();\n\n  // Such a design is to make it easier to build flutter APPs with\n  // github actions for a variety of tts models\n  //\n  // See https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/flutter/generate-tts.py\n  // for details\n\n  String modelDir = '';\n  String modelName = '';\n  String voices = ''; // for Kokoro only\n  String ruleFsts = '';\n  String ruleFars = '';\n  String lexicon = '';\n  String dataDir = '';\n\n  // You can select an example below and change it accordingly to match your\n  // selected tts model\n\n  // ============================================================\n  // Your change starts here\n  // ============================================================\n\n  // Example 1:\n  // modelDir = 'vits-vctk';\n  // modelName = 'vits-vctk.onnx';\n  // lexicon = 'lexicon.txt';\n\n  // Example 2:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  // modelDir = 'vits-piper-en_US-amy-low';\n  // modelName = 'en_US-amy-low.onnx';\n  // dataDir = 'vits-piper-en_US-amy-low/espeak-ng-data';\n\n  // Example 3:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n  // modelDir = 'vits-icefall-zh-aishell3';\n  // modelName = 'model.onnx';\n  // ruleFsts = 
'vits-icefall-zh-aishell3/phone.fst,vits-icefall-zh-aishell3/date.fst,vits-icefall-zh-aishell3/number.fst,vits-icefall-zh-aishell3/new_heteronym.fst';\n  // ruleFars = 'vits-icefall-zh-aishell3/rule.far';\n  // lexicon = 'lexicon.txt';\n\n  // Example 4:\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#csukuangfj-vits-zh-hf-fanchen-c-chinese-187-speakers\n  // modelDir = 'vits-zh-hf-fanchen-C';\n  // modelName = 'vits-zh-hf-fanchen-C.onnx';\n  // lexicon = 'lexicon.txt';\n\n  // Example 5:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n  // modelDir = 'vits-coqui-de-css10';\n  // modelName = 'model.onnx';\n\n  // Example 6\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\n  // modelDir = 'vits-piper-en_US-libritts_r-medium';\n  // modelName = 'en_US-libritts_r-medium.onnx';\n  // dataDir = 'vits-piper-en_US-libritts_r-medium/espeak-ng-data';\n\n  // Example 7\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-melo-tts-zh_en.tar.bz2\n  // modelDir = 'vits-melo-tts-zh_en';\n  // modelName = 'model.onnx';\n  // lexicon = 'lexicon.txt';\n\n  // Example 8\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html#kokoro-en-v0-19-english-11-speakers\n  // modelDir = 'kokoro-en-v0_19';\n  // modelName = 'model.onnx';\n  // voices = 'voices.bin';\n  // dataDir = 'kokoro-en-v0_19/espeak-ng-data';\n\n  // Example 9\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n  // modelDir = 'kokoro-multi-lang-v1_0';\n  // modelName = 'model.onnx';\n  // voices = 'voices.bin';\n  // dataDir = 'kokoro-multi-lang-v1_0/espeak-ng-data';\n  // lexicon = 'kokoro-multi-lang-v1_0/lexicon-us-en.txt,kokoro-multi-lang-v1_0/lexicon-zh.txt';\n\n  // 
============================================================\n  // Please don't change the remaining part of this function\n  // ============================================================\n  if (modelName == '') {\n    throw Exception(\n        'You are supposed to select a model by changing the code before you run the app');\n  }\n\n  final Directory directory = await getApplicationSupportDirectory();\n  modelName = p.join(directory.path, modelDir, modelName);\n\n  if (ruleFsts != '') {\n    final all = ruleFsts.split(',');\n    var tmp = <String>[];\n    for (final f in all) {\n      tmp.add(p.join(directory.path, f));\n    }\n    ruleFsts = tmp.join(',');\n  }\n\n  if (ruleFars != '') {\n    final all = ruleFars.split(',');\n    var tmp = <String>[];\n    for (final f in all) {\n      tmp.add(p.join(directory.path, f));\n    }\n    ruleFars = tmp.join(',');\n  }\n\n  if (lexicon.contains(',')) {\n    final all = lexicon.split(',');\n    var tmp = <String>[];\n    for (final f in all) {\n      tmp.add(p.join(directory.path, f));\n    }\n    lexicon = tmp.join(',');\n  } else if (lexicon != '') {\n    lexicon = p.join(directory.path, modelDir, lexicon);\n  }\n\n  if (dataDir != '') {\n    dataDir = p.join(directory.path, dataDir);\n  }\n\n  final tokens = p.join(directory.path, modelDir, 'tokens.txt');\n  if (voices != '') {\n    voices = p.join(directory.path, modelDir, voices);\n  }\n\n  late final sherpa_onnx.OfflineTtsVitsModelConfig vits;\n  late final sherpa_onnx.OfflineTtsKokoroModelConfig kokoro;\n\n  if (voices != '') {\n    vits = sherpa_onnx.OfflineTtsVitsModelConfig();\n    kokoro = sherpa_onnx.OfflineTtsKokoroModelConfig(\n      model: modelName,\n      voices: voices,\n      tokens: tokens,\n      dataDir: dataDir,\n      lexicon: lexicon,\n    );\n  } else {\n    vits = sherpa_onnx.OfflineTtsVitsModelConfig(\n      model: modelName,\n      lexicon: lexicon,\n      tokens: tokens,\n      dataDir: dataDir,\n    );\n\n    kokoro = 
sherpa_onnx.OfflineTtsKokoroModelConfig();\n  }\n\n  final modelConfig = sherpa_onnx.OfflineTtsModelConfig(\n    vits: vits,\n    kokoro: kokoro,\n    numThreads: 2,\n    debug: true,\n    provider: 'cpu',\n  );\n\n  final config = sherpa_onnx.OfflineTtsConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars,\n    maxNumSenetences: 1,\n  );\n  // print(config);\n\n  final tts = sherpa_onnx.OfflineTts(config);\n  print('tts created successfully');\n\n  return tts;\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/tts.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:async';\n\nimport 'package:flutter/foundation.dart';\nimport 'package:flutter/services.dart';\n\nimport 'package:flutter/material.dart';\n\nimport 'package:audioplayers/audioplayers.dart';\nimport 'package:sherpa_onnx/sherpa_onnx.dart' as sherpa_onnx;\n\nimport './model.dart';\nimport './utils.dart';\n\nclass TtsScreen extends StatefulWidget {\n  const TtsScreen({super.key});\n\n  @override\n  State<TtsScreen> createState() => _TtsScreenState();\n}\n\nclass _TtsScreenState extends State<TtsScreen> {\n  late final TextEditingController _controller_text_input;\n  late final TextEditingController _controller_sid;\n  late final TextEditingController _controller_hint;\n  // Nullable rather than `late`: the Stop button can be pressed before\n  // _init() has created the player.\n  AudioPlayer? _player;\n  String _title = 'Text to speech';\n  String _lastFilename = '';\n  bool _isInitialized = false;\n  int _maxSpeakerID = 0;\n  double _speed = 1.0;\n\n  sherpa_onnx.OfflineTts? _tts;\n\n  @override\n  void initState() {\n    _controller_text_input = TextEditingController();\n    _controller_hint = TextEditingController();\n    _controller_sid = TextEditingController(text: '0');\n\n    super.initState();\n  }\n\n  Future<void> _init() async {\n    if (!_isInitialized) {\n      sherpa_onnx.initBindings();\n\n      _tts?.free();\n      _tts = await createOfflineTts();\n\n      _player = AudioPlayer();\n\n      _isInitialized = true;\n    }\n  }\n\n  @override\n  Widget build(BuildContext context) {\n    return MaterialApp(\n      home: Scaffold(\n        appBar: AppBar(\n          title: Text(_title),\n        ),\n        body: Padding(\n          padding: EdgeInsets.all(10),\n          child: Column(\n            // mainAxisAlignment: MainAxisAlignment.center,\n            children: <Widget>[\n              TextField(\n                  decoration: InputDecoration(\n                    labelText: \"Speaker ID (0-$_maxSpeakerID)\",\n                    hintText: 'Please input your speaker ID',\n        
          ),\n                  keyboardType: TextInputType.number,\n                  maxLines: 1,\n                  controller: _controller_sid,\n                  onTapOutside: (PointerDownEvent event) {\n                    FocusManager.instance.primaryFocus?.unfocus();\n                  },\n                  inputFormatters: <TextInputFormatter>[FilteringTextInputFormatter.digitsOnly]),\n              Slider(\n                // decoration: InputDecoration(\n                //   labelText: \"speech speed\",\n                // ),\n                label: \"Speech speed ${_speed.toStringAsPrecision(2)}\",\n                min: 0.5,\n                max: 3.0,\n                divisions: 25,\n                value: _speed,\n                onChanged: (value) {\n                  setState(() {\n                    _speed = value;\n                  });\n                },\n              ),\n              const SizedBox(height: 5),\n              TextField(\n                decoration: InputDecoration(\n                  border: OutlineInputBorder(),\n                  hintText: 'Please enter your text here',\n                ),\n                maxLines: 5,\n                controller: _controller_text_input,\n                onTapOutside: (PointerDownEvent event) {\n                  FocusManager.instance.primaryFocus?.unfocus();\n                },\n              ),\n              const SizedBox(height: 5),\n              Row(mainAxisAlignment: MainAxisAlignment.center, children: <Widget>[\n                OutlinedButton(\n                  child: Text(\"Generate\"),\n                  onPressed: () async {\n                    await _init();\n                    await _player?.stop();\n\n                    setState(() {\n                      _maxSpeakerID = _tts?.numSpeakers ?? 
0;\n                      if (_maxSpeakerID > 0) {\n                        _maxSpeakerID -= 1;\n                      }\n                    });\n\n                    if (_tts == null) {\n                      _controller_hint.value = TextEditingValue(\n                        text: 'Failed to initialize tts',\n                      );\n                      return;\n                    }\n\n                    _controller_hint.value = TextEditingValue(\n                      text: '',\n                    );\n\n                    final text = _controller_text_input.text.trim();\n                    if (text == '') {\n                      _controller_hint.value = TextEditingValue(\n                        text: 'Please first input your text to generate',\n                      );\n                      return;\n                    }\n\n                    final sid = int.tryParse(_controller_sid.text.trim()) ?? 0;\n\n                    final stopwatch = Stopwatch();\n                    stopwatch.start();\n                    final genConfig = sherpa_onnx.OfflineTtsGenerationConfig(\n                      sid: sid,\n                      speed: _speed,\n                      silenceScale: 0.2,\n                    );\n                    final audio =\n                        _tts!.generateWithConfig(text: text, config: genConfig);\n                    final suffix = '-sid-$sid-speed-${_speed.toStringAsPrecision(2)}';\n                    final filename = await generateWaveFilename(suffix);\n\n                    final ok = sherpa_onnx.writeWave(\n                      filename: filename,\n                      samples: audio.samples,\n                      sampleRate: audio.sampleRate,\n                    );\n\n                    if (ok) {\n                      stopwatch.stop();\n                      double elapsed = stopwatch.elapsed.inMilliseconds.toDouble();\n\n                      double waveDuration = audio.samples.length.toDouble() / 
audio.sampleRate.toDouble();\n\n                      _controller_hint.value = TextEditingValue(\n                        text: 'Saved to\\n$filename\\n'\n                            'Elapsed: ${(elapsed / 1000).toStringAsPrecision(4)} s\\n'\n                            'Wave duration: ${waveDuration.toStringAsPrecision(4)} s\\n'\n                            'RTF: ${(elapsed / 1000).toStringAsPrecision(4)}/${waveDuration.toStringAsPrecision(4)} '\n                            '= ${(elapsed / 1000 / waveDuration).toStringAsPrecision(3)} ',\n                      );\n                      _lastFilename = filename;\n\n                      await _player?.play(DeviceFileSource(_lastFilename));\n                    } else {\n                      _controller_hint.value = TextEditingValue(\n                        text: 'Failed to save generated audio',\n                      );\n                    }\n                  },\n                ),\n                const SizedBox(width: 5),\n                OutlinedButton(\n                  child: Text(\"Clear\"),\n                  onPressed: () {\n                    _controller_text_input.value = TextEditingValue(\n                      text: '',\n                    );\n\n                    _controller_hint.value = TextEditingValue(\n                      text: '',\n                    );\n                  },\n                ),\n                const SizedBox(width: 5),\n                OutlinedButton(\n                  child: Text(\"Play\"),\n                  onPressed: () async {\n                    if (_lastFilename == '') {\n                      _controller_hint.value = TextEditingValue(\n                        text: 'No generated wave file found',\n                      );\n                      return;\n                    }\n                    await _player?.stop();\n                    await _player?.play(DeviceFileSource(_lastFilename));\n                    _controller_hint.value = TextEditingValue(\n    
                  text: 'Playing\\n$_lastFilename',\n                    );\n                  },\n                ),\n                const SizedBox(width: 5),\n                OutlinedButton(\n                  child: Text(\"Stop\"),\n                  onPressed: () async {\n                    await _player?.stop();\n                    _controller_hint.value = TextEditingValue(\n                      text: '',\n                    );\n                  },\n                ),\n              ]),\n              const SizedBox(height: 5),\n              TextField(\n                decoration: InputDecoration(\n                  border: OutlineInputBorder(),\n                  hintText: 'Logs will be shown here.\\n'\n                      'The first run is slower due to model initialization.',\n                ),\n                maxLines: 6,\n                controller: _controller_hint,\n                readOnly: true,\n              ),\n            ],\n          ),\n        ),\n      ),\n    );\n  }\n\n  @override\n  void dispose() {\n    // Release the text controllers created in initState().\n    _controller_text_input.dispose();\n    _controller_sid.dispose();\n    _controller_hint.dispose();\n    _tts?.free();\n    super.dispose();\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/lib/utils.dart",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nimport 'dart:io';\nimport 'dart:typed_data';\n\nimport 'package:flutter/services.dart';\nimport 'package:path/path.dart' as p;\nimport 'package:path_provider/path_provider.dart';\n\nFuture<String> generateWaveFilename([String suffix = '']) async {\n  final Directory directory = await getApplicationSupportDirectory();\n  DateTime now = DateTime.now();\n  final filename =\n      '${now.year.toString()}-${now.month.toString().padLeft(2, '0')}-${now.day.toString().padLeft(2, '0')}-${now.hour.toString().padLeft(2, '0')}-${now.minute.toString().padLeft(2, '0')}-${now.second.toString().padLeft(2, '0')}$suffix.wav';\n\n  return p.join(directory.path, filename);\n}\n\n// https://stackoverflow.com/questions/68862225/flutter-how-to-get-all-files-from-assets-folder-in-one-list\nFuture<List<String>> getAllAssetFiles() async {\n  final AssetManifest assetManifest =\n      await AssetManifest.loadFromAssetBundle(rootBundle);\n  final List<String> assets = assetManifest.listAssets();\n  return assets;\n}\n\nString stripLeadingDirectory(String src, {int n = 1}) {\n  return p.joinAll(p.split(src).sublist(n));\n}\n\nFuture<void> copyAllAssetFiles() async {\n  final allFiles = await getAllAssetFiles();\n  for (final src in allFiles) {\n    final dst = stripLeadingDirectory(src);\n    await copyAssetFile(src, dst);\n  }\n}\n\n// Copy the asset file from src to dst.\n// If dst already exists, then just skip the copy\nFuture<String> copyAssetFile(String src, [String? 
dst]) async {\n  final Directory directory = await getApplicationSupportDirectory();\n  // Default to the asset's base name when no destination is given.\n  dst ??= p.basename(src);\n  final target = p.join(directory.path, dst);\n  final bool exists = await File(target).exists();\n\n  final data = await rootBundle.load(src);\n  if (!exists || File(target).lengthSync() != data.lengthInBytes) {\n    final List<int> bytes =\n        data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes);\n    await (await File(target).create(recursive: true)).writeAsBytes(bytes);\n  }\n\n  return target;\n}\n"
  },
  {
    "path": "flutter-examples/tts/linux/.gitignore",
    "content": "flutter/ephemeral\n"
  },
  {
    "path": "flutter-examples/tts/linux/CMakeLists.txt",
    "content": "# Project-level configuration.\ncmake_minimum_required(VERSION 3.10)\nproject(runner LANGUAGES CXX)\n\n# The name of the executable created for the application. Change this to change\n# the on-disk name of your application.\nset(BINARY_NAME \"tts\")\n# The unique GTK application identifier for this application. See:\n# https://wiki.gnome.org/HowDoI/ChooseApplicationID\nset(APPLICATION_ID \"com.k2fsa.sherpa.onnx.tts\")\n\n# Explicitly opt in to modern CMake behaviors to avoid warnings with recent\n# versions of CMake.\ncmake_policy(SET CMP0063 NEW)\n\n# Load bundled libraries from the lib/ directory relative to the binary.\nset(CMAKE_INSTALL_RPATH \"$ORIGIN/lib\")\n\n# Root filesystem for cross-building.\nif(FLUTTER_TARGET_PLATFORM_SYSROOT)\n  set(CMAKE_SYSROOT ${FLUTTER_TARGET_PLATFORM_SYSROOT})\n  set(CMAKE_FIND_ROOT_PATH ${CMAKE_SYSROOT})\n  set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)\n  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\n  set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\nendif()\n\n# Define build configuration options.\nif(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)\n  set(CMAKE_BUILD_TYPE \"Debug\" CACHE\n    STRING \"Flutter build mode\" FORCE)\n  set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS\n    \"Debug\" \"Profile\" \"Release\")\nendif()\n\n# Compilation settings that should be applied to most targets.\n#\n# Be cautious about adding new options here, as plugins use this function by\n# default. 
In most cases, you should add new options to specific targets instead\n# of modifying this function.\nfunction(APPLY_STANDARD_SETTINGS TARGET)\n  target_compile_features(${TARGET} PUBLIC cxx_std_14)\n  target_compile_options(${TARGET} PRIVATE -Wall -Werror)\n  target_compile_options(${TARGET} PRIVATE \"$<$<NOT:$<CONFIG:Debug>>:-O3>\")\n  target_compile_definitions(${TARGET} PRIVATE \"$<$<NOT:$<CONFIG:Debug>>:NDEBUG>\")\nendfunction()\n\n# Flutter library and tool build rules.\nset(FLUTTER_MANAGED_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/flutter\")\nadd_subdirectory(${FLUTTER_MANAGED_DIR})\n\n# System-level dependencies.\nfind_package(PkgConfig REQUIRED)\npkg_check_modules(GTK REQUIRED IMPORTED_TARGET gtk+-3.0)\n\nadd_definitions(-DAPPLICATION_ID=\"${APPLICATION_ID}\")\n\n# Define the application target. To change its name, change BINARY_NAME above,\n# not the value here, or `flutter run` will no longer work.\n#\n# Any new source files that you add to the application should be added here.\nadd_executable(${BINARY_NAME}\n  \"main.cc\"\n  \"my_application.cc\"\n  \"${FLUTTER_MANAGED_DIR}/generated_plugin_registrant.cc\"\n)\n\n# Apply the standard set of build settings. This can be removed for applications\n# that need different build settings.\napply_standard_settings(${BINARY_NAME})\n\n# Add dependency libraries. Add any application-specific dependencies here.\ntarget_link_libraries(${BINARY_NAME} PRIVATE flutter)\ntarget_link_libraries(${BINARY_NAME} PRIVATE PkgConfig::GTK)\n\n# Run the Flutter tool portions of the build. This must not be removed.\nadd_dependencies(${BINARY_NAME} flutter_assemble)\n\n# Only the install-generated bundle's copy of the executable will launch\n# correctly, since the resources must be in the right relative locations. 
To avoid\n# people trying to run the unbundled copy, put it in a subdirectory instead of\n# the default top-level location.\nset_target_properties(${BINARY_NAME}\n  PROPERTIES\n  RUNTIME_OUTPUT_DIRECTORY \"${CMAKE_BINARY_DIR}/intermediates_do_not_run\"\n)\n\n\n# Generated plugin build rules, which manage building the plugins and adding\n# them to the application.\ninclude(flutter/generated_plugins.cmake)\n\n\n# === Installation ===\n# By default, \"installing\" just makes a relocatable bundle in the build\n# directory.\nset(BUILD_BUNDLE_DIR \"${PROJECT_BINARY_DIR}/bundle\")\nif(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)\n  set(CMAKE_INSTALL_PREFIX \"${BUILD_BUNDLE_DIR}\" CACHE PATH \"...\" FORCE)\nendif()\n\n# Start with a clean build bundle directory every time.\ninstall(CODE \"\n  file(REMOVE_RECURSE \\\"${BUILD_BUNDLE_DIR}/\\\")\n  \" COMPONENT Runtime)\n\nset(INSTALL_BUNDLE_DATA_DIR \"${CMAKE_INSTALL_PREFIX}/data\")\nset(INSTALL_BUNDLE_LIB_DIR \"${CMAKE_INSTALL_PREFIX}/lib\")\n\ninstall(TARGETS ${BINARY_NAME} RUNTIME DESTINATION \"${CMAKE_INSTALL_PREFIX}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_ICU_DATA_FILE}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n  COMPONENT Runtime)\n\nforeach(bundled_library ${PLUGIN_BUNDLED_LIBRARIES})\n  install(FILES \"${bundled_library}\"\n    DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n    COMPONENT Runtime)\nendforeach(bundled_library)\n\n# Copy the native assets provided by the build.dart from all packages.\nset(NATIVE_ASSETS_DIR \"${PROJECT_BUILD_DIR}native_assets/linux/\")\ninstall(DIRECTORY \"${NATIVE_ASSETS_DIR}\"\n   DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n   COMPONENT Runtime)\n\n# Fully re-copy the assets directory on each build to avoid having stale files\n# from a previous install.\nset(FLUTTER_ASSET_DIR_NAME \"flutter_assets\")\ninstall(CODE \"\n  file(REMOVE_RECURSE 
\\\"${INSTALL_BUNDLE_DATA_DIR}/${FLUTTER_ASSET_DIR_NAME}\\\")\n  \" COMPONENT Runtime)\ninstall(DIRECTORY \"${PROJECT_BUILD_DIR}/${FLUTTER_ASSET_DIR_NAME}\"\n  DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\" COMPONENT Runtime)\n\n# Install the AOT library on non-Debug builds only.\nif(NOT CMAKE_BUILD_TYPE MATCHES \"Debug\")\n  install(FILES \"${AOT_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n    COMPONENT Runtime)\nendif()\n"
  },
  {
    "path": "flutter-examples/tts/linux/flutter/CMakeLists.txt",
    "content": "# This file controls Flutter-level build steps. It should not be edited.\ncmake_minimum_required(VERSION 3.10)\n\nset(EPHEMERAL_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/ephemeral\")\n\n# Configuration provided via flutter tool.\ninclude(${EPHEMERAL_DIR}/generated_config.cmake)\n\n# TODO: Move the rest of this into files in ephemeral. See\n# https://github.com/flutter/flutter/issues/57146.\n\n# Serves the same purpose as list(TRANSFORM ... PREPEND ...),\n# which isn't available in 3.10.\nfunction(list_prepend LIST_NAME PREFIX)\n    set(NEW_LIST \"\")\n    foreach(element ${${LIST_NAME}})\n        list(APPEND NEW_LIST \"${PREFIX}${element}\")\n    endforeach(element)\n    set(${LIST_NAME} \"${NEW_LIST}\" PARENT_SCOPE)\nendfunction()\n\n# === Flutter Library ===\n# System-level dependencies.\nfind_package(PkgConfig REQUIRED)\npkg_check_modules(GTK REQUIRED IMPORTED_TARGET gtk+-3.0)\npkg_check_modules(GLIB REQUIRED IMPORTED_TARGET glib-2.0)\npkg_check_modules(GIO REQUIRED IMPORTED_TARGET gio-2.0)\n\nset(FLUTTER_LIBRARY \"${EPHEMERAL_DIR}/libflutter_linux_gtk.so\")\n\n# Published to parent scope for install step.\nset(FLUTTER_LIBRARY ${FLUTTER_LIBRARY} PARENT_SCOPE)\nset(FLUTTER_ICU_DATA_FILE \"${EPHEMERAL_DIR}/icudtl.dat\" PARENT_SCOPE)\nset(PROJECT_BUILD_DIR \"${PROJECT_DIR}/build/\" PARENT_SCOPE)\nset(AOT_LIBRARY \"${PROJECT_DIR}/build/lib/libapp.so\" PARENT_SCOPE)\n\nlist(APPEND FLUTTER_LIBRARY_HEADERS\n  \"fl_basic_message_channel.h\"\n  \"fl_binary_codec.h\"\n  \"fl_binary_messenger.h\"\n  \"fl_dart_project.h\"\n  \"fl_engine.h\"\n  \"fl_json_message_codec.h\"\n  \"fl_json_method_codec.h\"\n  \"fl_message_codec.h\"\n  \"fl_method_call.h\"\n  \"fl_method_channel.h\"\n  \"fl_method_codec.h\"\n  \"fl_method_response.h\"\n  \"fl_plugin_registrar.h\"\n  \"fl_plugin_registry.h\"\n  \"fl_standard_message_codec.h\"\n  \"fl_standard_method_codec.h\"\n  \"fl_string_codec.h\"\n  \"fl_value.h\"\n  \"fl_view.h\"\n  
\"flutter_linux.h\"\n)\nlist_prepend(FLUTTER_LIBRARY_HEADERS \"${EPHEMERAL_DIR}/flutter_linux/\")\nadd_library(flutter INTERFACE)\ntarget_include_directories(flutter INTERFACE\n  \"${EPHEMERAL_DIR}\"\n)\ntarget_link_libraries(flutter INTERFACE \"${FLUTTER_LIBRARY}\")\ntarget_link_libraries(flutter INTERFACE\n  PkgConfig::GTK\n  PkgConfig::GLIB\n  PkgConfig::GIO\n)\nadd_dependencies(flutter flutter_assemble)\n\n# === Flutter tool backend ===\n# _phony_ is a non-existent file to force this command to run every time,\n# since currently there's no way to get a full input/output list from the\n# flutter tool.\nadd_custom_command(\n  OUTPUT ${FLUTTER_LIBRARY} ${FLUTTER_LIBRARY_HEADERS}\n    ${CMAKE_CURRENT_BINARY_DIR}/_phony_\n  COMMAND ${CMAKE_COMMAND} -E env\n    ${FLUTTER_TOOL_ENVIRONMENT}\n    \"${FLUTTER_ROOT}/packages/flutter_tools/bin/tool_backend.sh\"\n      ${FLUTTER_TARGET_PLATFORM} ${CMAKE_BUILD_TYPE}\n  VERBATIM\n)\nadd_custom_target(flutter_assemble DEPENDS\n  \"${FLUTTER_LIBRARY}\"\n  ${FLUTTER_LIBRARY_HEADERS}\n)\n"
  },
  {
    "path": "flutter-examples/tts/linux/main.cc",
    "content": "#include \"my_application.h\"\n\nint main(int argc, char** argv) {\n  g_autoptr(MyApplication) app = my_application_new();\n  return g_application_run(G_APPLICATION(app), argc, argv);\n}\n"
  },
  {
    "path": "flutter-examples/tts/linux/my_application.cc",
    "content": "#include \"my_application.h\"\n\n#include <flutter_linux/flutter_linux.h>\n#ifdef GDK_WINDOWING_X11\n#include <gdk/gdkx.h>\n#endif\n\n#include \"flutter/generated_plugin_registrant.h\"\n\nstruct _MyApplication {\n  GtkApplication parent_instance;\n  char** dart_entrypoint_arguments;\n};\n\nG_DEFINE_TYPE(MyApplication, my_application, GTK_TYPE_APPLICATION)\n\n// Implements GApplication::activate.\nstatic void my_application_activate(GApplication* application) {\n  MyApplication* self = MY_APPLICATION(application);\n  GtkWindow* window =\n      GTK_WINDOW(gtk_application_window_new(GTK_APPLICATION(application)));\n\n  // Use a header bar when running in GNOME as this is the common style used\n  // by applications and is the setup most users will be using (e.g. Ubuntu\n  // desktop).\n  // If running on X and not using GNOME then just use a traditional title bar\n  // in case the window manager does more exotic layout, e.g. tiling.\n  // If running on Wayland assume the header bar will work (may need changing\n  // if future cases occur).\n  gboolean use_header_bar = TRUE;\n#ifdef GDK_WINDOWING_X11\n  GdkScreen* screen = gtk_window_get_screen(window);\n  if (GDK_IS_X11_SCREEN(screen)) {\n    const gchar* wm_name = gdk_x11_screen_get_window_manager_name(screen);\n    if (g_strcmp0(wm_name, \"GNOME Shell\") != 0) {\n      use_header_bar = FALSE;\n    }\n  }\n#endif\n  if (use_header_bar) {\n    GtkHeaderBar* header_bar = GTK_HEADER_BAR(gtk_header_bar_new());\n    gtk_widget_show(GTK_WIDGET(header_bar));\n    gtk_header_bar_set_title(header_bar, \"tts\");\n    gtk_header_bar_set_show_close_button(header_bar, TRUE);\n    gtk_window_set_titlebar(window, GTK_WIDGET(header_bar));\n  } else {\n    gtk_window_set_title(window, \"tts\");\n  }\n\n  gtk_window_set_default_size(window, 1280, 720);\n  gtk_widget_show(GTK_WIDGET(window));\n\n  g_autoptr(FlDartProject) project = fl_dart_project_new();\n  fl_dart_project_set_dart_entrypoint_arguments(project, 
self->dart_entrypoint_arguments);\n\n  FlView* view = fl_view_new(project);\n  gtk_widget_show(GTK_WIDGET(view));\n  gtk_container_add(GTK_CONTAINER(window), GTK_WIDGET(view));\n\n  fl_register_plugins(FL_PLUGIN_REGISTRY(view));\n\n  gtk_widget_grab_focus(GTK_WIDGET(view));\n}\n\n// Implements GApplication::local_command_line.\nstatic gboolean my_application_local_command_line(GApplication* application, gchar*** arguments, int* exit_status) {\n  MyApplication* self = MY_APPLICATION(application);\n  // Strip out the first argument as it is the binary name.\n  self->dart_entrypoint_arguments = g_strdupv(*arguments + 1);\n\n  g_autoptr(GError) error = nullptr;\n  if (!g_application_register(application, nullptr, &error)) {\n     g_warning(\"Failed to register: %s\", error->message);\n     *exit_status = 1;\n     return TRUE;\n  }\n\n  g_application_activate(application);\n  *exit_status = 0;\n\n  return TRUE;\n}\n\n// Implements GApplication::startup.\nstatic void my_application_startup(GApplication* application) {\n  //MyApplication* self = MY_APPLICATION(object);\n\n  // Perform any actions required at application startup.\n\n  G_APPLICATION_CLASS(my_application_parent_class)->startup(application);\n}\n\n// Implements GApplication::shutdown.\nstatic void my_application_shutdown(GApplication* application) {\n  //MyApplication* self = MY_APPLICATION(object);\n\n  // Perform any actions required at application shutdown.\n\n  G_APPLICATION_CLASS(my_application_parent_class)->shutdown(application);\n}\n\n// Implements GObject::dispose.\nstatic void my_application_dispose(GObject* object) {\n  MyApplication* self = MY_APPLICATION(object);\n  g_clear_pointer(&self->dart_entrypoint_arguments, g_strfreev);\n  G_OBJECT_CLASS(my_application_parent_class)->dispose(object);\n}\n\nstatic void my_application_class_init(MyApplicationClass* klass) {\n  G_APPLICATION_CLASS(klass)->activate = my_application_activate;\n  G_APPLICATION_CLASS(klass)->local_command_line = 
my_application_local_command_line;\n  G_APPLICATION_CLASS(klass)->startup = my_application_startup;\n  G_APPLICATION_CLASS(klass)->shutdown = my_application_shutdown;\n  G_OBJECT_CLASS(klass)->dispose = my_application_dispose;\n}\n\nstatic void my_application_init(MyApplication* self) {}\n\nMyApplication* my_application_new() {\n  return MY_APPLICATION(g_object_new(my_application_get_type(),\n                                     \"application-id\", APPLICATION_ID,\n                                     \"flags\", G_APPLICATION_NON_UNIQUE,\n                                     nullptr));\n}\n"
  },
  {
    "path": "flutter-examples/tts/linux/my_application.h",
    "content": "#ifndef FLUTTER_MY_APPLICATION_H_\n#define FLUTTER_MY_APPLICATION_H_\n\n#include <gtk/gtk.h>\n\nG_DECLARE_FINAL_TYPE(MyApplication, my_application, MY, APPLICATION,\n                     GtkApplication)\n\n/**\n * my_application_new:\n *\n * Creates a new Flutter-based application.\n *\n * Returns: a new #MyApplication.\n */\nMyApplication* my_application_new();\n\n#endif  // FLUTTER_MY_APPLICATION_H_\n"
  },
  {
    "path": "flutter-examples/tts/macos/.gitignore",
    "content": "# Flutter-related\n**/Flutter/ephemeral/\n**/Pods/\n\n# Xcode-related\n**/dgph\n**/xcuserdata/\n"
  },
  {
    "path": "flutter-examples/tts/macos/Flutter/Flutter-Debug.xcconfig",
    "content": "#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/macos/Flutter/Flutter-Release.xcconfig",
    "content": "#include \"ephemeral/Flutter-Generated.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/AppDelegate.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\n@NSApplicationMain\nclass AppDelegate: FlutterAppDelegate {\n  override func applicationShouldTerminateAfterLastWindowClosed(_ sender: NSApplication) -> Bool {\n    return true\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_16.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"16x16\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_32.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"32x32\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_64.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_128.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"128x128\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_256.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"256x256\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"2x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_512.png\",\n      \"scale\" : \"1x\"\n    },\n    {\n      \"size\" : \"512x512\",\n      \"idiom\" : \"mac\",\n      \"filename\" : \"app_icon_1024.png\",\n      \"scale\" : \"2x\"\n    }\n  ],\n  \"info\" : {\n    \"version\" : 1,\n    \"author\" : \"xcode\"\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Base.lproj/MainMenu.xib",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<document type=\"com.apple.InterfaceBuilder3.Cocoa.XIB\" version=\"3.0\" toolsVersion=\"14490.70\" targetRuntime=\"MacOSX.Cocoa\" propertyAccessControl=\"none\" useAutolayout=\"YES\" customObjectInstantitationMethod=\"direct\">\n    <dependencies>\n        <deployment identifier=\"macosx\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.CocoaPlugin\" version=\"14490.70\"/>\n        <capability name=\"documents saved in the Xcode 8 format\" minToolsVersion=\"8.0\"/>\n    </dependencies>\n    <objects>\n        <customObject id=\"-2\" userLabel=\"File's Owner\" customClass=\"NSApplication\">\n            <connections>\n                <outlet property=\"delegate\" destination=\"Voe-Tx-rLC\" id=\"GzC-gU-4Uq\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"-1\" userLabel=\"First Responder\" customClass=\"FirstResponder\"/>\n        <customObject id=\"-3\" userLabel=\"Application\" customClass=\"NSObject\"/>\n        <customObject id=\"Voe-Tx-rLC\" customClass=\"AppDelegate\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <connections>\n                <outlet property=\"applicationMenu\" destination=\"uQy-DD-JDr\" id=\"XBo-yE-nKs\"/>\n                <outlet property=\"mainFlutterWindow\" destination=\"QvC-M9-y7g\" id=\"gIp-Ho-8D9\"/>\n            </connections>\n        </customObject>\n        <customObject id=\"YLy-65-1bz\" customClass=\"NSFontManager\"/>\n        <menu title=\"Main Menu\" systemMenu=\"main\" id=\"AYu-sK-qS6\">\n            <items>\n                <menuItem title=\"APP_NAME\" id=\"1Xt-HY-uBw\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"APP_NAME\" systemMenu=\"apple\" id=\"uQy-DD-JDr\">\n                        <items>\n                            <menuItem title=\"About APP_NAME\" id=\"5kV-Vb-QxS\">\n                                
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"orderFrontStandardAboutPanel:\" target=\"-1\" id=\"Exp-CZ-Vem\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"VOq-y0-SEH\"/>\n                            <menuItem title=\"Preferences…\" keyEquivalent=\",\" id=\"BOF-NM-1cW\"/>\n                            <menuItem isSeparatorItem=\"YES\" id=\"wFC-TO-SCJ\"/>\n                            <menuItem title=\"Services\" id=\"NMo-om-nkz\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Services\" systemMenu=\"services\" id=\"hz9-B4-Xy5\"/>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"4je-JR-u6R\"/>\n                            <menuItem title=\"Hide APP_NAME\" keyEquivalent=\"h\" id=\"Olw-nP-bQN\">\n                                <connections>\n                                    <action selector=\"hide:\" target=\"-1\" id=\"PnN-Uc-m68\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Hide Others\" keyEquivalent=\"h\" id=\"Vdr-fp-XzO\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"hideOtherApplications:\" target=\"-1\" id=\"VT4-aY-XCT\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Show All\" id=\"Kd2-mp-pUS\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n             
                       <action selector=\"unhideAllApplications:\" target=\"-1\" id=\"Dhg-Le-xox\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"kCx-OE-vgT\"/>\n                            <menuItem title=\"Quit APP_NAME\" keyEquivalent=\"q\" id=\"4sb-4s-VLi\">\n                                <connections>\n                                    <action selector=\"terminate:\" target=\"-1\" id=\"Te7-pn-YzF\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Edit\" id=\"5QF-Oa-p0T\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Edit\" id=\"W48-6f-4Dl\">\n                        <items>\n                            <menuItem title=\"Undo\" keyEquivalent=\"z\" id=\"dRJ-4n-Yzg\">\n                                <connections>\n                                    <action selector=\"undo:\" target=\"-1\" id=\"M6e-cu-g7V\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Redo\" keyEquivalent=\"Z\" id=\"6dh-zS-Vam\">\n                                <connections>\n                                    <action selector=\"redo:\" target=\"-1\" id=\"oIA-Rs-6OD\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"WRV-NI-Exz\"/>\n                            <menuItem title=\"Cut\" keyEquivalent=\"x\" id=\"uRl-iY-unG\">\n                                <connections>\n                                    <action selector=\"cut:\" target=\"-1\" id=\"YJe-68-I9s\"/>\n                                </connections>\n                            
</menuItem>\n                            <menuItem title=\"Copy\" keyEquivalent=\"c\" id=\"x3v-GG-iWU\">\n                                <connections>\n                                    <action selector=\"copy:\" target=\"-1\" id=\"G1f-GL-Joy\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste\" keyEquivalent=\"v\" id=\"gVA-U4-sdL\">\n                                <connections>\n                                    <action selector=\"paste:\" target=\"-1\" id=\"UvS-8e-Qdg\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Paste and Match Style\" keyEquivalent=\"V\" id=\"WeT-3V-zwk\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"pasteAsPlainText:\" target=\"-1\" id=\"cEh-KX-wJQ\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Delete\" id=\"pa3-QI-u2k\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"delete:\" target=\"-1\" id=\"0Mk-Ml-PaM\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Select All\" keyEquivalent=\"a\" id=\"Ruw-6m-B2m\">\n                                <connections>\n                                    <action selector=\"selectAll:\" target=\"-1\" id=\"VNm-Mi-diN\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"uyl-h8-XO2\"/>\n                            <menuItem 
title=\"Find\" id=\"4EN-yA-p0u\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Find\" id=\"1b7-l0-nxx\">\n                                    <items>\n                                        <menuItem title=\"Find…\" tag=\"1\" keyEquivalent=\"f\" id=\"Xz5-n4-O0W\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"cD7-Qs-BN4\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find and Replace…\" tag=\"12\" keyEquivalent=\"f\" id=\"YEy-JH-Tfz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\" option=\"YES\" command=\"YES\"/>\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"WD3-Gg-5AJ\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Next\" tag=\"2\" keyEquivalent=\"g\" id=\"q09-fT-Sye\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"NDo-RZ-v9R\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Find Previous\" tag=\"3\" keyEquivalent=\"G\" id=\"OwM-mh-QMV\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"HOh-sY-3ay\"/>\n                                            
</connections>\n                                        </menuItem>\n                                        <menuItem title=\"Use Selection for Find\" tag=\"7\" keyEquivalent=\"e\" id=\"buJ-ug-pKt\">\n                                            <connections>\n                                                <action selector=\"performFindPanelAction:\" target=\"-1\" id=\"U76-nv-p5D\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Jump to Selection\" keyEquivalent=\"j\" id=\"S0p-oC-mLd\">\n                                            <connections>\n                                                <action selector=\"centerSelectionInVisibleArea:\" target=\"-1\" id=\"IOG-6D-g5B\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Spelling and Grammar\" id=\"Dv1-io-Yv7\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Spelling\" id=\"3IN-sU-3Bg\">\n                                    <items>\n                                        <menuItem title=\"Show Spelling and Grammar\" keyEquivalent=\":\" id=\"HFo-cy-zxI\">\n                                            <connections>\n                                                <action selector=\"showGuessPanel:\" target=\"-1\" id=\"vFj-Ks-hy3\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Document Now\" keyEquivalent=\";\" id=\"hz2-CU-CR7\">\n                                            <connections>\n                                        
        <action selector=\"checkSpelling:\" target=\"-1\" id=\"fz7-VC-reM\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"bNw-od-mp5\"/>\n                                        <menuItem title=\"Check Spelling While Typing\" id=\"rbD-Rh-wIN\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleContinuousSpellChecking:\" target=\"-1\" id=\"7w6-Qz-0kB\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Check Grammar With Spelling\" id=\"mK6-2p-4JG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleGrammarChecking:\" target=\"-1\" id=\"muD-Qn-j4w\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Correct Spelling Automatically\" id=\"78Y-hA-62v\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticSpellingCorrection:\" target=\"-1\" id=\"2lM-Qi-WAP\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem 
title=\"Substitutions\" id=\"9ic-FL-obx\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Substitutions\" id=\"FeM-D8-WVr\">\n                                    <items>\n                                        <menuItem title=\"Show Substitutions\" id=\"z6F-FW-3nz\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"orderFrontSubstitutionsPanel:\" target=\"-1\" id=\"oku-mr-iSq\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem isSeparatorItem=\"YES\" id=\"gPx-C9-uUO\"/>\n                                        <menuItem title=\"Smart Copy/Paste\" id=\"9yt-4B-nSM\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleSmartInsertDelete:\" target=\"-1\" id=\"3IJ-Se-DZD\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Quotes\" id=\"hQb-2v-fYv\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticQuoteSubstitution:\" target=\"-1\" id=\"ptq-xd-QOA\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Dashes\" id=\"rgM-f4-ycn\">\n                                            
<modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDashSubstitution:\" target=\"-1\" id=\"oCt-pO-9gS\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Smart Links\" id=\"cwL-P1-jid\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticLinkDetection:\" target=\"-1\" id=\"Gip-E3-Fov\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Data Detectors\" id=\"tRr-pd-1PS\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticDataDetection:\" target=\"-1\" id=\"R1I-Nq-Kbl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Text Replacement\" id=\"HFQ-gK-NFA\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"toggleAutomaticTextReplacement:\" target=\"-1\" id=\"DvP-Fe-Py6\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                       
     <menuItem title=\"Transformations\" id=\"2oI-Rn-ZJC\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Transformations\" id=\"c8a-y6-VQd\">\n                                    <items>\n                                        <menuItem title=\"Make Upper Case\" id=\"vmV-6d-7jI\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"uppercaseWord:\" target=\"-1\" id=\"sPh-Tk-edu\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Make Lower Case\" id=\"d9M-CD-aMd\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"lowercaseWord:\" target=\"-1\" id=\"iUZ-b5-hil\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Capitalize\" id=\"UEZ-Bs-lqG\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"capitalizeWord:\" target=\"-1\" id=\"26H-TL-nsh\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                            <menuItem title=\"Speech\" id=\"xrE-MZ-jX0\">\n                                <modifierMask 
key=\"keyEquivalentModifierMask\"/>\n                                <menu key=\"submenu\" title=\"Speech\" id=\"3rS-ZA-NoH\">\n                                    <items>\n                                        <menuItem title=\"Start Speaking\" id=\"Ynk-f8-cLZ\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"startSpeaking:\" target=\"-1\" id=\"654-Ng-kyl\"/>\n                                            </connections>\n                                        </menuItem>\n                                        <menuItem title=\"Stop Speaking\" id=\"Oyz-dy-DGm\">\n                                            <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                            <connections>\n                                                <action selector=\"stopSpeaking:\" target=\"-1\" id=\"dX8-6p-jy9\"/>\n                                            </connections>\n                                        </menuItem>\n                                    </items>\n                                </menu>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"View\" id=\"H8h-7b-M4v\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"View\" id=\"HyV-fh-RgO\">\n                        <items>\n                            <menuItem title=\"Enter Full Screen\" keyEquivalent=\"f\" id=\"4J7-dP-txa\">\n                                <modifierMask key=\"keyEquivalentModifierMask\" control=\"YES\" command=\"YES\"/>\n                                <connections>\n                                    <action selector=\"toggleFullScreen:\" target=\"-1\" id=\"dU3-MA-1Rq\"/>\n                           
     </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Window\" id=\"aUF-d1-5bR\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Window\" systemMenu=\"window\" id=\"Td7-aD-5lo\">\n                        <items>\n                            <menuItem title=\"Minimize\" keyEquivalent=\"m\" id=\"OY7-WF-poV\">\n                                <connections>\n                                    <action selector=\"performMiniaturize:\" target=\"-1\" id=\"VwT-WD-YPe\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem title=\"Zoom\" id=\"R4o-n2-Eq4\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"performZoom:\" target=\"-1\" id=\"DIl-cC-cCs\"/>\n                                </connections>\n                            </menuItem>\n                            <menuItem isSeparatorItem=\"YES\" id=\"eu3-7i-yIM\"/>\n                            <menuItem title=\"Bring All to Front\" id=\"LE2-aR-0XJ\">\n                                <modifierMask key=\"keyEquivalentModifierMask\"/>\n                                <connections>\n                                    <action selector=\"arrangeInFront:\" target=\"-1\" id=\"DRN-fu-gQh\"/>\n                                </connections>\n                            </menuItem>\n                        </items>\n                    </menu>\n                </menuItem>\n                <menuItem title=\"Help\" id=\"EPT-qC-fAb\">\n                    <modifierMask key=\"keyEquivalentModifierMask\"/>\n                    <menu key=\"submenu\" title=\"Help\" systemMenu=\"help\" id=\"rJ0-wn-3NY\"/>\n                
</menuItem>\n            </items>\n            <point key=\"canvasLocation\" x=\"142\" y=\"-258\"/>\n        </menu>\n        <window title=\"APP_NAME\" allowsToolTipsWhenApplicationIsInactive=\"NO\" autorecalculatesKeyViewLoop=\"NO\" releasedWhenClosed=\"NO\" animationBehavior=\"default\" id=\"QvC-M9-y7g\" customClass=\"MainFlutterWindow\" customModule=\"Runner\" customModuleProvider=\"target\">\n            <windowStyleMask key=\"styleMask\" titled=\"YES\" closable=\"YES\" miniaturizable=\"YES\" resizable=\"YES\"/>\n            <rect key=\"contentRect\" x=\"335\" y=\"390\" width=\"800\" height=\"600\"/>\n            <rect key=\"screenRect\" x=\"0.0\" y=\"0.0\" width=\"2560\" height=\"1577\"/>\n            <view key=\"contentView\" wantsLayer=\"YES\" id=\"EiT-Mj-1SZ\">\n                <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"800\" height=\"600\"/>\n                <autoresizingMask key=\"autoresizingMask\"/>\n            </view>\n        </window>\n    </objects>\n</document>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Configs/AppInfo.xcconfig",
    "content": "// Application-level settings for the Runner target.\n//\n// This may be replaced with something auto-generated from metadata (e.g., pubspec.yaml) in the\n// future. If not, the values below would default to using the project name when this becomes a\n// 'flutter create' template.\n\n// The application's name. By default this is also the title of the Flutter window.\nPRODUCT_NAME = tts\n\n// The application's bundle identifier\nPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts\n\n// The copyright displayed in application information\nPRODUCT_COPYRIGHT = Copyright © 2024 Next-gen Kaldi. All rights reserved.\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Configs/Debug.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Debug.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Configs/Release.xcconfig",
    "content": "#include \"../../Flutter/Flutter-Release.xcconfig\"\n#include \"Warnings.xcconfig\"\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Configs/Warnings.xcconfig",
    "content": "WARNING_CFLAGS = -Wall -Wconditional-uninitialized -Wnullable-to-nonnull-conversion -Wmissing-method-return-type -Woverlength-strings\nGCC_WARN_UNDECLARED_SELECTOR = YES\nCLANG_UNDEFINED_BEHAVIOR_SANITIZER_NULLABILITY = YES\nCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE\nCLANG_WARN__DUPLICATE_METHOD_MATCH = YES\nCLANG_WARN_PRAGMA_PACK = YES\nCLANG_WARN_STRICT_PROTOTYPES = YES\nCLANG_WARN_COMMA = YES\nGCC_WARN_STRICT_SELECTOR_MATCH = YES\nCLANG_WARN_OBJC_REPEATED_USE_OF_WEAK = YES\nCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES\nGCC_WARN_SHADOW = YES\nCLANG_WARN_UNREACHABLE_CODE = YES\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/DebugProfile.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n\t<key>com.apple.security.cs.allow-jit</key>\n\t<true/>\n\t<key>com.apple.security.network.server</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>CFBundleDevelopmentRegion</key>\n\t<string>$(DEVELOPMENT_LANGUAGE)</string>\n\t<key>CFBundleExecutable</key>\n\t<string>$(EXECUTABLE_NAME)</string>\n\t<key>CFBundleIconFile</key>\n\t<string></string>\n\t<key>CFBundleIdentifier</key>\n\t<string>$(PRODUCT_BUNDLE_IDENTIFIER)</string>\n\t<key>CFBundleInfoDictionaryVersion</key>\n\t<string>6.0</string>\n\t<key>CFBundleName</key>\n\t<string>$(PRODUCT_NAME)</string>\n\t<key>CFBundlePackageType</key>\n\t<string>APPL</string>\n\t<key>CFBundleShortVersionString</key>\n\t<string>$(FLUTTER_BUILD_NAME)</string>\n\t<key>CFBundleVersion</key>\n\t<string>$(FLUTTER_BUILD_NUMBER)</string>\n\t<key>LSMinimumSystemVersion</key>\n\t<string>$(MACOSX_DEPLOYMENT_TARGET)</string>\n\t<key>NSHumanReadableCopyright</key>\n\t<string>$(PRODUCT_COPYRIGHT)</string>\n\t<key>NSMainNibFile</key>\n\t<string>MainMenu</string>\n\t<key>NSPrincipalClass</key>\n\t<string>NSApplication</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/MainFlutterWindow.swift",
    "content": "import Cocoa\nimport FlutterMacOS\n\nclass MainFlutterWindow: NSWindow {\n  override func awakeFromNib() {\n    let flutterViewController = FlutterViewController()\n    let windowFrame = self.frame\n    self.contentViewController = flutterViewController\n    self.setFrame(windowFrame, display: true)\n\n    RegisterGeneratedPlugins(registry: flutterViewController)\n\n    super.awakeFromNib()\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner/Release.entitlements",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>com.apple.security.app-sandbox</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 54;\n\tobjects = {\n\n/* Begin PBXAggregateTarget section */\n\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */ = {\n\t\t\tisa = PBXAggregateTarget;\n\t\t\tbuildConfigurationList = 33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t33CC111E2044C6BF0003C045 /* ShellScript */,\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = \"Flutter Assemble\";\n\t\t\tproductName = FLX;\n\t\t};\n/* End PBXAggregateTarget section */\n\n/* Begin PBXBuildFile section */\n\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = 331C80D7294CF71000263BE5 /* RunnerTests.swift */; };\n\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */ = {isa = PBXBuildFile; fileRef = 335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */; };\n\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC10F02044A3C60003C045 /* AppDelegate.swift */; };\n\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F22044A3C60003C045 /* Assets.xcassets */; };\n\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */ = {isa = PBXBuildFile; fileRef = 33CC10F42044A3C60003C045 /* MainMenu.xib */; };\n\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */ = {isa = PBXBuildFile; fileRef = 33CC11122044BFA00003C045 /* MainFlutterWindow.swift */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\t331C80D9294CF71000263BE5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC10EC2044A3C60003C045;\n\t\t\tremoteInfo = Runner;\n\t\t};\n\t\t33CC111F2044C79F0003C045 /* PBXContainerItemProxy */ 
= {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = 33CC10E52044A3C60003C045 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = 33CC111A2044C6BA0003C045;\n\t\t\tremoteInfo = FLX;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\t33CC110E2044A8840003C045 /* Bundle Framework */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tname = \"Bundle Framework\";\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = RunnerTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = RunnerTests.swift; sourceTree = \"<group>\"; };\n\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Warnings.xcconfig; sourceTree = \"<group>\"; };\n\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = GeneratedPluginRegistrant.swift; sourceTree = \"<group>\"; };\n\t\t33CC10ED2044A3C60003C045 /* tts.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = \"tts.app\"; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; name = Assets.xcassets; path = Runner/Assets.xcassets; 
sourceTree = \"<group>\"; };\n\t\t33CC10F52044A3C60003C045 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.xib; name = Base; path = Base.lproj/MainMenu.xib; sourceTree = \"<group>\"; };\n\t\t33CC10F72044A3C60003C045 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; name = Info.plist; path = Runner/Info.plist; sourceTree = \"<group>\"; };\n\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = MainFlutterWindow.swift; sourceTree = \"<group>\"; };\n\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Debug.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = \"Flutter-Release.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; name = \"Flutter-Generated.xcconfig\"; path = \"ephemeral/Flutter-Generated.xcconfig\"; sourceTree = \"<group>\"; };\n\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */ = {isa = PBXFileReference; lastKnownFileType = text.plist.entitlements; path = DebugProfile.entitlements; sourceTree = \"<group>\"; };\n\t\t33E51914231749380026EE4D /* Release.entitlements */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.entitlements; path = Release.entitlements; sourceTree = \"<group>\"; };\n\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = AppInfo.xcconfig; sourceTree = \"<group>\"; };\n\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */ = {isa = PBXFileReference; lastKnownFileType = text.xcconfig; path = Release.xcconfig; sourceTree = \"<group>\"; };\n\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */ = {isa = 
PBXFileReference; fileEncoding = 4; lastKnownFileType = text.xcconfig; path = Debug.xcconfig; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\t331C80D2294CF70F00263BE5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EA2044A3C60003C045 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\t331C80D6294CF71000263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t331C80D7294CF71000263BE5 /* RunnerTests.swift */,\n\t\t\t);\n\t\t\tpath = RunnerTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33BA886A226E78AF003329D5 /* Configs */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33E5194F232828860026EE4D /* AppInfo.xcconfig */,\n\t\t\t\t9740EEB21CF90195004384FC /* Debug.xcconfig */,\n\t\t\t\t7AFA3C8E1D35360C0083082E /* Release.xcconfig */,\n\t\t\t\t333000ED22D3DE5D00554162 /* Warnings.xcconfig */,\n\t\t\t);\n\t\t\tpath = Configs;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10E42044A3C60003C045 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33FAB671232836740065AC1E /* Runner */,\n\t\t\t\t33CEB47122A05771004F2AC0 /* Flutter */,\n\t\t\t\t331C80D6294CF71000263BE5 /* RunnerTests */,\n\t\t\t\t33CC10EE2044A3C60003C045 /* Products */,\n\t\t\t\tD73912EC22F37F3D000D13A0 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CC10EE2044A3C60003C045 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10ED2044A3C60003C045 /* tts.app */,\n\t\t\t\t331C80D5294CF71000263BE5 /* RunnerTests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\t33CC11242044D66E0003C045 /* Resources */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F22044A3C60003C045 /* Assets.xcassets */,\n\t\t\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */,\n\t\t\t\t33CC10F72044A3C60003C045 /* Info.plist */,\n\t\t\t);\n\t\t\tname = Resources;\n\t\t\tpath = ..;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33CEB47122A05771004F2AC0 /* Flutter */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t335BBD1A22A9A15E00E9071D /* GeneratedPluginRegistrant.swift */,\n\t\t\t\t33CEB47222A05771004F2AC0 /* Flutter-Debug.xcconfig */,\n\t\t\t\t33CEB47422A05771004F2AC0 /* Flutter-Release.xcconfig */,\n\t\t\t\t33CEB47722A0578A004F2AC0 /* Flutter-Generated.xcconfig */,\n\t\t\t);\n\t\t\tpath = Flutter;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\t33FAB671232836740065AC1E /* Runner */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F02044A3C60003C045 /* AppDelegate.swift */,\n\t\t\t\t33CC11122044BFA00003C045 /* MainFlutterWindow.swift */,\n\t\t\t\t33E51913231747F40026EE4D /* DebugProfile.entitlements */,\n\t\t\t\t33E51914231749380026EE4D /* Release.entitlements */,\n\t\t\t\t33CC11242044D66E0003C045 /* Resources */,\n\t\t\t\t33BA886A226E78AF003329D5 /* Configs */,\n\t\t\t);\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tD73912EC22F37F3D000D13A0 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\t331C80D4294CF70F00263BE5 /* RunnerTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 331C80DE294CF71000263BE5 /* Build configuration list for PBXNativeTarget \"RunnerTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t331C80D1294CF70F00263BE5 /* Sources */,\n\t\t\t\t331C80D2294CF70F00263BE5 /* Frameworks */,\n\t\t\t\t331C80D3294CF70F00263BE5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = 
(\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = RunnerTests;\n\t\t\tproductName = RunnerTests;\n\t\t\tproductReference = 331C80D5294CF71000263BE5 /* RunnerTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\t33CC10EC2044A3C60003C045 /* Runner */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = 33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */;\n\t\t\tbuildPhases = (\n\t\t\t\t33CC10E92044A3C60003C045 /* Sources */,\n\t\t\t\t33CC10EA2044A3C60003C045 /* Frameworks */,\n\t\t\t\t33CC10EB2044A3C60003C045 /* Resources */,\n\t\t\t\t33CC110E2044A8840003C045 /* Bundle Framework */,\n\t\t\t\t3399D490228B24CF009A79C7 /* ShellScript */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = Runner;\n\t\t\tproductName = Runner;\n\t\t\tproductReference = 33CC10ED2044A3C60003C045 /* tts.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\t33CC10E52044A3C60003C045 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = YES;\n\t\t\t\tLastSwiftUpdateCheck = 0920;\n\t\t\t\tLastUpgradeCheck = 1510;\n\t\t\t\tORGANIZATIONNAME = \"\";\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\t331C80D4294CF70F00263BE5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.0;\n\t\t\t\t\t\tTestTargetID = 33CC10EC2044A3C60003C045;\n\t\t\t\t\t};\n\t\t\t\t\t33CC10EC2044A3C60003C045 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tLastSwiftMigration = 1100;\n\t\t\t\t\t\tProvisioningStyle = Automatic;\n\t\t\t\t\t\tSystemCapabilities = {\n\t\t\t\t\t\t\tcom.apple.Sandbox = {\n\t\t\t\t\t\t\t\tenabled = 1;\n\t\t\t\t\t\t\t};\n\t\t\t\t\t\t};\n\t\t\t\t\t};\n\t\t\t\t\t33CC111A2044C6BA0003C045 = 
{\n\t\t\t\t\t\tCreatedOnToolsVersion = 9.2;\n\t\t\t\t\t\tProvisioningStyle = Manual;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = 33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" */;\n\t\t\tcompatibilityVersion = \"Xcode 9.3\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = 33CC10E42044A3C60003C045;\n\t\t\tproductRefGroup = 33CC10EE2044A3C60003C045 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\t33CC10EC2044A3C60003C045 /* Runner */,\n\t\t\t\t331C80D4294CF70F00263BE5 /* RunnerTests */,\n\t\t\t\t33CC111A2044C6BA0003C045 /* Flutter Assemble */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\t331C80D3294CF70F00263BE5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10EB2044A3C60003C045 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC10F32044A3C60003C045 /* Assets.xcassets in Resources */,\n\t\t\t\t33CC10F62044A3C60003C045 /* MainMenu.xib in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXShellScriptBuildPhase section */\n\t\t3399D490228B24CF009A79C7 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\talwaysOutOfDate = 1;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"echo \\\"$PRODUCT_NAME.app\\\" > 
\\\"$PROJECT_DIR\\\"/Flutter/ephemeral/.app_filename && \\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh embed\\n\";\n\t\t};\n\t\t33CC111E2044C6BF0003C045 /* ShellScript */ = {\n\t\t\tisa = PBXShellScriptBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\tinputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterInputs.xcfilelist,\n\t\t\t);\n\t\t\tinputPaths = (\n\t\t\t\tFlutter/ephemeral/tripwire,\n\t\t\t);\n\t\t\toutputFileListPaths = (\n\t\t\t\tFlutter/ephemeral/FlutterOutputs.xcfilelist,\n\t\t\t);\n\t\t\toutputPaths = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t\tshellPath = /bin/sh;\n\t\t\tshellScript = \"\\\"$FLUTTER_ROOT\\\"/packages/flutter_tools/bin/macos_assemble.sh && touch Flutter/ephemeral/tripwire\";\n\t\t};\n/* End PBXShellScriptBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\t331C80D1294CF70F00263BE5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t331C80D8294CF71000263BE5 /* RunnerTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\t33CC10E92044A3C60003C045 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\t33CC11132044BFA00003C045 /* MainFlutterWindow.swift in Sources */,\n\t\t\t\t33CC10F12044A3C60003C045 /* AppDelegate.swift in Sources */,\n\t\t\t\t335BBD1B22A9A15E00E9071D /* GeneratedPluginRegistrant.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\t331C80DA294CF71000263BE5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = 33CC10EC2044A3C60003C045 /* Runner */;\n\t\t\ttargetProxy = 331C80D9294CF71000263BE5 /* PBXContainerItemProxy */;\n\t\t};\n\t\t33CC11202044C79F0003C045 /* PBXTargetDependency */ = {\n\t\t\tisa = 
PBXTargetDependency;\n\t\t\ttarget = 33CC111A2044C6BA0003C045 /* Flutter Assemble */;\n\t\t\ttargetProxy = 33CC111F2044C79F0003C045 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\t33CC10F42044A3C60003C045 /* MainMenu.xib */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\t33CC10F52044A3C60003C045 /* Base */,\n\t\t\t);\n\t\t\tname = MainMenu.xib;\n\t\t\tpath = Runner;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin XCBuildConfiguration section */\n\t\t331C80DB294CF71000263BE5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/tts.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/tts\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t331C80DC294CF71000263BE5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/tts.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/tts\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t331C80DD294CF71000263BE5 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = 
com.k2fsa.sherpa.onnx.tts.RunnerTests;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/tts.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/tts\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CE9231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = 
YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.14;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEA231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t338D0CEB231458BD00FA5F75 /* Profile */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Profile;\n\t\t};\n\t\t33CC10F92044A3C60003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 9740EEB21CF90195004384FC /* Debug.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = 
YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.14;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FA2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 7AFA3C8E1D35360C0083082E /* Release.xcconfig */;\n\t\t\tbuildSettings = 
{\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++14\";\n\t\t\t\tCLANG_CXX_LIBRARY = \"libc++\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCODE_SIGN_IDENTITY = \"-\";\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEAD_CODE_STRIPPING = YES;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = NO;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tMACOSX_DEPLOYMENT_TARGET = 10.14;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tSDKROOT = macosx;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC10FC2044A3C60003C045 /* Debug */ = 
{\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/DebugProfile.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC10FD2044A3C60003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbaseConfigurationReference = 33E5194F232828860026EE4D /* AppInfo.xcconfig */;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCODE_SIGN_ENTITLEMENTS = Runner/Release.entitlements;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCOMBINE_HIDPI_IMAGES = YES;\n\t\t\t\tINFOPLIST_FILE = Runner/Info.plist;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/../Frameworks\",\n\t\t\t\t);\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\t33CC111C2044C6BA0003C045 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Manual;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\t33CC111D2044C6BA0003C045 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\t331C80DE294CF71000263BE5 /* 
Build configuration list for PBXNativeTarget \"RunnerTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t331C80DB294CF71000263BE5 /* Debug */,\n\t\t\t\t331C80DC294CF71000263BE5 /* Release */,\n\t\t\t\t331C80DD294CF71000263BE5 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10E82044A3C60003C045 /* Build configuration list for PBXProject \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10F92044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FA2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CE9231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC10FB2044A3C60003C045 /* Build configuration list for PBXNativeTarget \"Runner\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC10FC2044A3C60003C045 /* Debug */,\n\t\t\t\t33CC10FD2044A3C60003C045 /* Release */,\n\t\t\t\t338D0CEA231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\t33CC111B2044C6BA0003C045 /* Build configuration list for PBXAggregateTarget \"Flutter Assemble\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\t33CC111C2044C6BA0003C045 /* Debug */,\n\t\t\t\t33CC111D2044C6BA0003C045 /* Release */,\n\t\t\t\t338D0CEB231458BD00FA5F75 /* Profile */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = 33CC10E52044A3C60003C045 /* Project object */;\n}\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner.xcodeproj/xcshareddata/xcschemes/Runner.xcscheme",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Scheme\n   LastUpgradeVersion = \"1510\"\n   version = \"1.3\">\n   <BuildAction\n      parallelizeBuildables = \"YES\"\n      buildImplicitDependencies = \"YES\">\n      <BuildActionEntries>\n         <BuildActionEntry\n            buildForTesting = \"YES\"\n            buildForRunning = \"YES\"\n            buildForProfiling = \"YES\"\n            buildForArchiving = \"YES\"\n            buildForAnalyzing = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n               BuildableName = \"tts.app\"\n               BlueprintName = \"Runner\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            </BuildableReference>\n         </BuildActionEntry>\n      </BuildActionEntries>\n   </BuildAction>\n   <TestAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\">\n      <MacroExpansion>\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"tts.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </MacroExpansion>\n      <Testables>\n         <TestableReference\n            skipped = \"NO\"\n            parallelizable = \"YES\">\n            <BuildableReference\n               BuildableIdentifier = \"primary\"\n               BlueprintIdentifier = \"331C80D4294CF70F00263BE5\"\n               BuildableName = \"RunnerTests.xctest\"\n               BlueprintName = \"RunnerTests\"\n               ReferencedContainer = \"container:Runner.xcodeproj\">\n            
</BuildableReference>\n         </TestableReference>\n      </Testables>\n   </TestAction>\n   <LaunchAction\n      buildConfiguration = \"Debug\"\n      selectedDebuggerIdentifier = \"Xcode.DebuggerFoundation.Debugger.LLDB\"\n      selectedLauncherIdentifier = \"Xcode.DebuggerFoundation.Launcher.LLDB\"\n      launchStyle = \"0\"\n      useCustomWorkingDirectory = \"NO\"\n      ignoresPersistentStateOnLaunch = \"NO\"\n      debugDocumentVersioning = \"YES\"\n      debugServiceExtension = \"internal\"\n      allowLocationSimulation = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"tts.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </LaunchAction>\n   <ProfileAction\n      buildConfiguration = \"Profile\"\n      shouldUseLaunchSchemeArgsEnv = \"YES\"\n      savedToolIdentifier = \"\"\n      useCustomWorkingDirectory = \"NO\"\n      debugDocumentVersioning = \"YES\">\n      <BuildableProductRunnable\n         runnableDebuggingMode = \"0\">\n         <BuildableReference\n            BuildableIdentifier = \"primary\"\n            BlueprintIdentifier = \"33CC10EC2044A3C60003C045\"\n            BuildableName = \"tts.app\"\n            BlueprintName = \"Runner\"\n            ReferencedContainer = \"container:Runner.xcodeproj\">\n         </BuildableReference>\n      </BuildableProductRunnable>\n   </ProfileAction>\n   <AnalyzeAction\n      buildConfiguration = \"Debug\">\n   </AnalyzeAction>\n   <ArchiveAction\n      buildConfiguration = \"Release\"\n      revealArchiveInOrganizer = \"YES\">\n   </ArchiveAction>\n</Scheme>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"group:Runner.xcodeproj\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "flutter-examples/tts/macos/Runner.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "flutter-examples/tts/macos/RunnerTests/RunnerTests.swift",
    "content": "import Cocoa\nimport FlutterMacOS\nimport XCTest\n\nclass RunnerTests: XCTestCase {\n\n  func testExample() {\n    // If you add code to the Runner application, consider adding tests here.\n    // See https://developer.apple.com/documentation/xctest for more information about using XCTest.\n  }\n\n}\n"
  },
  {
    "path": "flutter-examples/tts/pubspec.yaml",
    "content": "name: tts\ndescription: >\n  This example shows how to implement text to speech, i.e., speech synthesis,\n  using sherpa-onnx.\n\npublish_to: 'none' # Remove this line if you wish to publish to pub.dev\n\nversion: 1.12.31\n\nenvironment:\n  sdk: \">=2.17.0 <4.0.0\"\n  flutter: \">=2.8.1\"\n\ndependencies:\n  flutter:\n    sdk: flutter\n\n  cupertino_icons: ^1.0.6\n  path_provider: ^2.1.3\n  path: ^1.9.0\n  sherpa_onnx: ^1.12.31\n  # sherpa_onnx:\n  #   path: ../../flutter/sherpa_onnx\n  url_launcher: 6.2.6\n  url_launcher_linux: 3.1.0\n  audioplayers: ^5.0.0\n  media_kit: \n  media_kit_libs_video: \n\nflutter:\n  uses-material-design: true\n\n  assets:\n    - assets/vits-melo-tts-zh_en/\n    - assets/vits-melo-tts-zh_en/dict/"
  },
  {
    "path": "flutter-examples/tts/test/widget_test.dart",
    "content": "// This is a basic Flutter widget test.\n//\n// To perform an interaction with a widget in your test, use the WidgetTester\n// utility in the flutter_test package. For example, you can send tap and scroll\n// gestures. You can also use WidgetTester to find child widgets in the widget\n// tree, read text, and verify that the values of widget properties are correct.\n\nimport 'package:flutter/material.dart';\nimport 'package:flutter_test/flutter_test.dart';\n\nimport 'package:tts/main.dart';\n\nvoid main() {\n  testWidgets('Counter increments smoke test', (WidgetTester tester) async {\n    // Build our app and trigger a frame.\n    await tester.pumpWidget(const MyApp());\n\n    // Verify that our counter starts at 0.\n    expect(find.text('0'), findsOneWidget);\n    expect(find.text('1'), findsNothing);\n\n    // Tap the '+' icon and trigger a frame.\n    await tester.tap(find.byIcon(Icons.add));\n    await tester.pump();\n\n    // Verify that our counter has incremented.\n    expect(find.text('0'), findsNothing);\n    expect(find.text('1'), findsOneWidget);\n  });\n}\n"
  },
  {
    "path": "flutter-examples/tts/windows/.gitignore",
    "content": "flutter/ephemeral/\n\n# Visual Studio user-specific files.\n*.suo\n*.user\n*.userosscache\n*.sln.docstates\n\n# Visual Studio build-related files.\nx64/\nx86/\n\n# Visual Studio cache files\n# files ending in .cache can be ignored\n*.[Cc]ache\n# but keep track of directories ending in .cache\n!*.[Cc]ache/\n"
  },
  {
    "path": "flutter-examples/tts/windows/CMakeLists.txt",
    "content": "# Project-level configuration.\ncmake_minimum_required(VERSION 3.14)\nproject(tts LANGUAGES CXX)\n\n# The name of the executable created for the application. Change this to change\n# the on-disk name of your application.\nset(BINARY_NAME \"tts\")\n\n# Explicitly opt in to modern CMake behaviors to avoid warnings with recent\n# versions of CMake.\ncmake_policy(VERSION 3.14...3.25)\n\n# Define build configuration option.\nget_property(IS_MULTICONFIG GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)\nif(IS_MULTICONFIG)\n  set(CMAKE_CONFIGURATION_TYPES \"Debug;Profile;Release\"\n    CACHE STRING \"\" FORCE)\nelse()\n  if(NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)\n    set(CMAKE_BUILD_TYPE \"Debug\" CACHE\n      STRING \"Flutter build mode\" FORCE)\n    set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS\n      \"Debug\" \"Profile\" \"Release\")\n  endif()\nendif()\n# Define settings for the Profile build mode.\nset(CMAKE_EXE_LINKER_FLAGS_PROFILE \"${CMAKE_EXE_LINKER_FLAGS_RELEASE}\")\nset(CMAKE_SHARED_LINKER_FLAGS_PROFILE \"${CMAKE_SHARED_LINKER_FLAGS_RELEASE}\")\nset(CMAKE_C_FLAGS_PROFILE \"${CMAKE_C_FLAGS_RELEASE}\")\nset(CMAKE_CXX_FLAGS_PROFILE \"${CMAKE_CXX_FLAGS_RELEASE}\")\n\n# Use Unicode for all projects.\nadd_definitions(-DUNICODE -D_UNICODE)\n\n# Compilation settings that should be applied to most targets.\n#\n# Be cautious about adding new options here, as plugins use this function by\n# default. 
In most cases, you should add new options to specific targets instead\n# of modifying this function.\nfunction(APPLY_STANDARD_SETTINGS TARGET)\n  target_compile_features(${TARGET} PUBLIC cxx_std_17)\n  target_compile_options(${TARGET} PRIVATE /W4 /WX /wd\"4100\")\n  target_compile_options(${TARGET} PRIVATE /EHsc)\n  target_compile_definitions(${TARGET} PRIVATE \"_HAS_EXCEPTIONS=0\")\n  target_compile_definitions(${TARGET} PRIVATE \"$<$<CONFIG:Debug>:_DEBUG>\")\nendfunction()\n\n# Flutter library and tool build rules.\nset(FLUTTER_MANAGED_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/flutter\")\nadd_subdirectory(${FLUTTER_MANAGED_DIR})\n\n# Application build; see runner/CMakeLists.txt.\nadd_subdirectory(\"runner\")\n\n\n# Generated plugin build rules, which manage building the plugins and adding\n# them to the application.\ninclude(flutter/generated_plugins.cmake)\n\n\n# === Installation ===\n# Support files are copied into place next to the executable, so that it can\n# run in place. This is done instead of making a separate bundle (as on Linux)\n# so that building and running from within Visual Studio will work.\nset(BUILD_BUNDLE_DIR \"$<TARGET_FILE_DIR:${BINARY_NAME}>\")\n# Make the \"install\" step default, as it's required to run.\nset(CMAKE_VS_INCLUDE_INSTALL_TO_DEFAULT_BUILD 1)\nif(CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT)\n  set(CMAKE_INSTALL_PREFIX \"${BUILD_BUNDLE_DIR}\" CACHE PATH \"...\" FORCE)\nendif()\n\nset(INSTALL_BUNDLE_DATA_DIR \"${CMAKE_INSTALL_PREFIX}/data\")\nset(INSTALL_BUNDLE_LIB_DIR \"${CMAKE_INSTALL_PREFIX}\")\n\ninstall(TARGETS ${BINARY_NAME} RUNTIME DESTINATION \"${CMAKE_INSTALL_PREFIX}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_ICU_DATA_FILE}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  COMPONENT Runtime)\n\ninstall(FILES \"${FLUTTER_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n  COMPONENT Runtime)\n\nif(PLUGIN_BUNDLED_LIBRARIES)\n  install(FILES \"${PLUGIN_BUNDLED_LIBRARIES}\"\n    DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n 
   COMPONENT Runtime)\nendif()\n\n# Copy the native assets provided by the build.dart from all packages.\nset(NATIVE_ASSETS_DIR \"${PROJECT_BUILD_DIR}native_assets/windows/\")\ninstall(DIRECTORY \"${NATIVE_ASSETS_DIR}\"\n   DESTINATION \"${INSTALL_BUNDLE_LIB_DIR}\"\n   COMPONENT Runtime)\n\n# Fully re-copy the assets directory on each build to avoid having stale files\n# from a previous install.\nset(FLUTTER_ASSET_DIR_NAME \"flutter_assets\")\ninstall(CODE \"\n  file(REMOVE_RECURSE \\\"${INSTALL_BUNDLE_DATA_DIR}/${FLUTTER_ASSET_DIR_NAME}\\\")\n  \" COMPONENT Runtime)\ninstall(DIRECTORY \"${PROJECT_BUILD_DIR}/${FLUTTER_ASSET_DIR_NAME}\"\n  DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\" COMPONENT Runtime)\n\n# Install the AOT library on non-Debug builds only.\ninstall(FILES \"${AOT_LIBRARY}\" DESTINATION \"${INSTALL_BUNDLE_DATA_DIR}\"\n  CONFIGURATIONS Profile;Release\n  COMPONENT Runtime)\n"
  },
  {
    "path": "flutter-examples/tts/windows/flutter/CMakeLists.txt",
    "content": "# This file controls Flutter-level build steps. It should not be edited.\ncmake_minimum_required(VERSION 3.14)\n\nset(EPHEMERAL_DIR \"${CMAKE_CURRENT_SOURCE_DIR}/ephemeral\")\n\n# Configuration provided via flutter tool.\ninclude(${EPHEMERAL_DIR}/generated_config.cmake)\n\n# TODO: Move the rest of this into files in ephemeral. See\n# https://github.com/flutter/flutter/issues/57146.\nset(WRAPPER_ROOT \"${EPHEMERAL_DIR}/cpp_client_wrapper\")\n\n# Set fallback configurations for older versions of the flutter tool.\nif (NOT DEFINED FLUTTER_TARGET_PLATFORM)\n  set(FLUTTER_TARGET_PLATFORM \"windows-x64\")\nendif()\n\n# === Flutter Library ===\nset(FLUTTER_LIBRARY \"${EPHEMERAL_DIR}/flutter_windows.dll\")\n\n# Published to parent scope for install step.\nset(FLUTTER_LIBRARY ${FLUTTER_LIBRARY} PARENT_SCOPE)\nset(FLUTTER_ICU_DATA_FILE \"${EPHEMERAL_DIR}/icudtl.dat\" PARENT_SCOPE)\nset(PROJECT_BUILD_DIR \"${PROJECT_DIR}/build/\" PARENT_SCOPE)\nset(AOT_LIBRARY \"${PROJECT_DIR}/build/windows/app.so\" PARENT_SCOPE)\n\nlist(APPEND FLUTTER_LIBRARY_HEADERS\n  \"flutter_export.h\"\n  \"flutter_windows.h\"\n  \"flutter_messenger.h\"\n  \"flutter_plugin_registrar.h\"\n  \"flutter_texture_registrar.h\"\n)\nlist(TRANSFORM FLUTTER_LIBRARY_HEADERS PREPEND \"${EPHEMERAL_DIR}/\")\nadd_library(flutter INTERFACE)\ntarget_include_directories(flutter INTERFACE\n  \"${EPHEMERAL_DIR}\"\n)\ntarget_link_libraries(flutter INTERFACE \"${FLUTTER_LIBRARY}.lib\")\nadd_dependencies(flutter flutter_assemble)\n\n# === Wrapper ===\nlist(APPEND CPP_WRAPPER_SOURCES_CORE\n  \"core_implementations.cc\"\n  \"standard_codec.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_CORE PREPEND \"${WRAPPER_ROOT}/\")\nlist(APPEND CPP_WRAPPER_SOURCES_PLUGIN\n  \"plugin_registrar.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_PLUGIN PREPEND \"${WRAPPER_ROOT}/\")\nlist(APPEND CPP_WRAPPER_SOURCES_APP\n  \"flutter_engine.cc\"\n  \"flutter_view_controller.cc\"\n)\nlist(TRANSFORM CPP_WRAPPER_SOURCES_APP PREPEND 
\"${WRAPPER_ROOT}/\")\n\n# Wrapper sources needed for a plugin.\nadd_library(flutter_wrapper_plugin STATIC\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_PLUGIN}\n)\napply_standard_settings(flutter_wrapper_plugin)\nset_target_properties(flutter_wrapper_plugin PROPERTIES\n  POSITION_INDEPENDENT_CODE ON)\nset_target_properties(flutter_wrapper_plugin PROPERTIES\n  CXX_VISIBILITY_PRESET hidden)\ntarget_link_libraries(flutter_wrapper_plugin PUBLIC flutter)\ntarget_include_directories(flutter_wrapper_plugin PUBLIC\n  \"${WRAPPER_ROOT}/include\"\n)\nadd_dependencies(flutter_wrapper_plugin flutter_assemble)\n\n# Wrapper sources needed for the runner.\nadd_library(flutter_wrapper_app STATIC\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_APP}\n)\napply_standard_settings(flutter_wrapper_app)\ntarget_link_libraries(flutter_wrapper_app PUBLIC flutter)\ntarget_include_directories(flutter_wrapper_app PUBLIC\n  \"${WRAPPER_ROOT}/include\"\n)\nadd_dependencies(flutter_wrapper_app flutter_assemble)\n\n# === Flutter tool backend ===\n# _phony_ is a non-existent file to force this command to run every time,\n# since currently there's no way to get a full input/output list from the\n# flutter tool.\nset(PHONY_OUTPUT \"${CMAKE_CURRENT_BINARY_DIR}/_phony_\")\nset_source_files_properties(\"${PHONY_OUTPUT}\" PROPERTIES SYMBOLIC TRUE)\nadd_custom_command(\n  OUTPUT ${FLUTTER_LIBRARY} ${FLUTTER_LIBRARY_HEADERS}\n    ${CPP_WRAPPER_SOURCES_CORE} ${CPP_WRAPPER_SOURCES_PLUGIN}\n    ${CPP_WRAPPER_SOURCES_APP}\n    ${PHONY_OUTPUT}\n  COMMAND ${CMAKE_COMMAND} -E env\n    ${FLUTTER_TOOL_ENVIRONMENT}\n    \"${FLUTTER_ROOT}/packages/flutter_tools/bin/tool_backend.bat\"\n      ${FLUTTER_TARGET_PLATFORM} $<CONFIG>\n  VERBATIM\n)\nadd_custom_target(flutter_assemble DEPENDS\n  \"${FLUTTER_LIBRARY}\"\n  ${FLUTTER_LIBRARY_HEADERS}\n  ${CPP_WRAPPER_SOURCES_CORE}\n  ${CPP_WRAPPER_SOURCES_PLUGIN}\n  ${CPP_WRAPPER_SOURCES_APP}\n)\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/CMakeLists.txt",
    "content": "cmake_minimum_required(VERSION 3.14)\nproject(runner LANGUAGES CXX)\n\n# Define the application target. To change its name, change BINARY_NAME in the\n# top-level CMakeLists.txt, not the value here, or `flutter run` will no longer\n# work.\n#\n# Any new source files that you add to the application should be added here.\nadd_executable(${BINARY_NAME} WIN32\n  \"flutter_window.cpp\"\n  \"main.cpp\"\n  \"utils.cpp\"\n  \"win32_window.cpp\"\n  \"${FLUTTER_MANAGED_DIR}/generated_plugin_registrant.cc\"\n  \"Runner.rc\"\n  \"runner.exe.manifest\"\n)\n\n# Apply the standard set of build settings. This can be removed for applications\n# that need different build settings.\napply_standard_settings(${BINARY_NAME})\n\n# Add preprocessor definitions for the build version.\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION=\\\"${FLUTTER_VERSION}\\\"\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_MAJOR=${FLUTTER_VERSION_MAJOR}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_MINOR=${FLUTTER_VERSION_MINOR}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_PATCH=${FLUTTER_VERSION_PATCH}\")\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"FLUTTER_VERSION_BUILD=${FLUTTER_VERSION_BUILD}\")\n\n# Disable Windows macros that collide with C++ standard library functions.\ntarget_compile_definitions(${BINARY_NAME} PRIVATE \"NOMINMAX\")\n\n# Add dependency libraries and include directories. Add any application-specific\n# dependencies here.\ntarget_link_libraries(${BINARY_NAME} PRIVATE flutter flutter_wrapper_app)\ntarget_link_libraries(${BINARY_NAME} PRIVATE \"dwmapi.lib\")\ntarget_include_directories(${BINARY_NAME} PRIVATE \"${CMAKE_SOURCE_DIR}\")\n\n# Run the Flutter tool portions of the build. This must not be removed.\nadd_dependencies(${BINARY_NAME} flutter_assemble)\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/Runner.rc",
    "content": "// Microsoft Visual C++ generated resource script.\n//\n#pragma code_page(65001)\n#include \"resource.h\"\n\n#define APSTUDIO_READONLY_SYMBOLS\n/////////////////////////////////////////////////////////////////////////////\n//\n// Generated from the TEXTINCLUDE 2 resource.\n//\n#include \"winres.h\"\n\n/////////////////////////////////////////////////////////////////////////////\n#undef APSTUDIO_READONLY_SYMBOLS\n\n/////////////////////////////////////////////////////////////////////////////\n// English (United States) resources\n\n#if !defined(AFX_RESOURCE_DLL) || defined(AFX_TARG_ENU)\nLANGUAGE LANG_ENGLISH, SUBLANG_ENGLISH_US\n\n#ifdef APSTUDIO_INVOKED\n/////////////////////////////////////////////////////////////////////////////\n//\n// TEXTINCLUDE\n//\n\n1 TEXTINCLUDE\nBEGIN\n    \"resource.h\\0\"\nEND\n\n2 TEXTINCLUDE\nBEGIN\n    \"#include \"\"winres.h\"\"\\r\\n\"\n    \"\\0\"\nEND\n\n3 TEXTINCLUDE\nBEGIN\n    \"\\r\\n\"\n    \"\\0\"\nEND\n\n#endif    // APSTUDIO_INVOKED\n\n\n/////////////////////////////////////////////////////////////////////////////\n//\n// Icon\n//\n\n// Icon with lowest ID value placed first to ensure application icon\n// remains consistent on all systems.\nIDI_APP_ICON            ICON                    \"resources\\\\app_icon.ico\"\n\n\n/////////////////////////////////////////////////////////////////////////////\n//\n// Version\n//\n\n#if defined(FLUTTER_VERSION_MAJOR) && defined(FLUTTER_VERSION_MINOR) && defined(FLUTTER_VERSION_PATCH) && defined(FLUTTER_VERSION_BUILD)\n#define VERSION_AS_NUMBER FLUTTER_VERSION_MAJOR,FLUTTER_VERSION_MINOR,FLUTTER_VERSION_PATCH,FLUTTER_VERSION_BUILD\n#else\n#define VERSION_AS_NUMBER 1,0,0,0\n#endif\n\n#if defined(FLUTTER_VERSION)\n#define VERSION_AS_STRING FLUTTER_VERSION\n#else\n#define VERSION_AS_STRING \"1.0.0\"\n#endif\n\nVS_VERSION_INFO VERSIONINFO\n FILEVERSION VERSION_AS_NUMBER\n PRODUCTVERSION VERSION_AS_NUMBER\n FILEFLAGSMASK VS_FFI_FILEFLAGSMASK\n#ifdef _DEBUG\n FILEFLAGS 
VS_FF_DEBUG\n#else\n FILEFLAGS 0x0L\n#endif\n FILEOS VOS__WINDOWS32\n FILETYPE VFT_APP\n FILESUBTYPE 0x0L\nBEGIN\n    BLOCK \"StringFileInfo\"\n    BEGIN\n        BLOCK \"040904e4\"\n        BEGIN\n            VALUE \"CompanyName\", \"com.example\" \"\\0\"\n            VALUE \"FileDescription\", \"tts\" \"\\0\"\n            VALUE \"FileVersion\", VERSION_AS_STRING \"\\0\"\n            VALUE \"InternalName\", \"tts\" \"\\0\"\n            VALUE \"LegalCopyright\", \"Copyright (C) 2024 com.example. All rights reserved.\" \"\\0\"\n            VALUE \"OriginalFilename\", \"tts.exe\" \"\\0\"\n            VALUE \"ProductName\", \"tts\" \"\\0\"\n            VALUE \"ProductVersion\", VERSION_AS_STRING \"\\0\"\n        END\n    END\n    BLOCK \"VarFileInfo\"\n    BEGIN\n        VALUE \"Translation\", 0x409, 1252\n    END\nEND\n\n#endif    // English (United States) resources\n/////////////////////////////////////////////////////////////////////////////\n\n\n\n#ifndef APSTUDIO_INVOKED\n/////////////////////////////////////////////////////////////////////////////\n//\n// Generated from the TEXTINCLUDE 3 resource.\n//\n\n\n/////////////////////////////////////////////////////////////////////////////\n#endif    // not APSTUDIO_INVOKED\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/flutter_window.cpp",
    "content": "#include \"flutter_window.h\"\n\n#include <optional>\n\n#include \"flutter/generated_plugin_registrant.h\"\n\nFlutterWindow::FlutterWindow(const flutter::DartProject& project)\n    : project_(project) {}\n\nFlutterWindow::~FlutterWindow() {}\n\nbool FlutterWindow::OnCreate() {\n  if (!Win32Window::OnCreate()) {\n    return false;\n  }\n\n  RECT frame = GetClientArea();\n\n  // The size here must match the window dimensions to avoid unnecessary surface\n  // creation / destruction in the startup path.\n  flutter_controller_ = std::make_unique<flutter::FlutterViewController>(\n      frame.right - frame.left, frame.bottom - frame.top, project_);\n  // Ensure that basic setup of the controller was successful.\n  if (!flutter_controller_->engine() || !flutter_controller_->view()) {\n    return false;\n  }\n  RegisterPlugins(flutter_controller_->engine());\n  SetChildContent(flutter_controller_->view()->GetNativeWindow());\n\n  flutter_controller_->engine()->SetNextFrameCallback([&]() {\n    this->Show();\n  });\n\n  // Flutter can complete the first frame before the \"show window\" callback is\n  // registered. The following call ensures a frame is pending to ensure the\n  // window is shown. 
It is a no-op if the first frame hasn't completed yet.\n  flutter_controller_->ForceRedraw();\n\n  return true;\n}\n\nvoid FlutterWindow::OnDestroy() {\n  if (flutter_controller_) {\n    flutter_controller_ = nullptr;\n  }\n\n  Win32Window::OnDestroy();\n}\n\nLRESULT\nFlutterWindow::MessageHandler(HWND hwnd, UINT const message,\n                              WPARAM const wparam,\n                              LPARAM const lparam) noexcept {\n  // Give Flutter, including plugins, an opportunity to handle window messages.\n  if (flutter_controller_) {\n    std::optional<LRESULT> result =\n        flutter_controller_->HandleTopLevelWindowProc(hwnd, message, wparam,\n                                                      lparam);\n    if (result) {\n      return *result;\n    }\n  }\n\n  switch (message) {\n    case WM_FONTCHANGE:\n      // The controller is null before OnCreate and after OnDestroy.\n      if (flutter_controller_) {\n        flutter_controller_->engine()->ReloadSystemFonts();\n      }\n      break;\n  }\n\n  return Win32Window::MessageHandler(hwnd, message, wparam, lparam);\n}\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/flutter_window.h",
    "content": "#ifndef RUNNER_FLUTTER_WINDOW_H_\n#define RUNNER_FLUTTER_WINDOW_H_\n\n#include <flutter/dart_project.h>\n#include <flutter/flutter_view_controller.h>\n\n#include <memory>\n\n#include \"win32_window.h\"\n\n// A window that does nothing but host a Flutter view.\nclass FlutterWindow : public Win32Window {\n public:\n  // Creates a new FlutterWindow hosting a Flutter view running |project|.\n  explicit FlutterWindow(const flutter::DartProject& project);\n  virtual ~FlutterWindow();\n\n protected:\n  // Win32Window:\n  bool OnCreate() override;\n  void OnDestroy() override;\n  LRESULT MessageHandler(HWND window, UINT const message, WPARAM const wparam,\n                         LPARAM const lparam) noexcept override;\n\n private:\n  // The project to run.\n  flutter::DartProject project_;\n\n  // The Flutter instance hosted by this window.\n  std::unique_ptr<flutter::FlutterViewController> flutter_controller_;\n};\n\n#endif  // RUNNER_FLUTTER_WINDOW_H_\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/main.cpp",
    "content": "#include <flutter/dart_project.h>\n#include <flutter/flutter_view_controller.h>\n#include <windows.h>\n\n#include \"flutter_window.h\"\n#include \"utils.h\"\n\nint APIENTRY wWinMain(_In_ HINSTANCE instance, _In_opt_ HINSTANCE prev,\n                      _In_ wchar_t *command_line, _In_ int show_command) {\n  // Attach to console when present (e.g., 'flutter run') or create a\n  // new console when running with a debugger.\n  if (!::AttachConsole(ATTACH_PARENT_PROCESS) && ::IsDebuggerPresent()) {\n    CreateAndAttachConsole();\n  }\n\n  // Initialize COM, so that it is available for use in the library and/or\n  // plugins.\n  ::CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED);\n\n  flutter::DartProject project(L\"data\");\n\n  std::vector<std::string> command_line_arguments =\n      GetCommandLineArguments();\n\n  project.set_dart_entrypoint_arguments(std::move(command_line_arguments));\n\n  FlutterWindow window(project);\n  Win32Window::Point origin(10, 10);\n  Win32Window::Size size(1280, 720);\n  if (!window.Create(L\"tts\", origin, size)) {\n    return EXIT_FAILURE;\n  }\n  window.SetQuitOnClose(true);\n\n  ::MSG msg;\n  while (::GetMessage(&msg, nullptr, 0, 0)) {\n    ::TranslateMessage(&msg);\n    ::DispatchMessage(&msg);\n  }\n\n  ::CoUninitialize();\n  return EXIT_SUCCESS;\n}\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/resource.h",
    "content": "//{{NO_DEPENDENCIES}}\n// Microsoft Visual C++ generated include file.\n// Used by Runner.rc\n//\n#define IDI_APP_ICON                    101\n\n// Next default values for new objects\n//\n#ifdef APSTUDIO_INVOKED\n#ifndef APSTUDIO_READONLY_SYMBOLS\n#define _APS_NEXT_RESOURCE_VALUE        102\n#define _APS_NEXT_COMMAND_VALUE         40001\n#define _APS_NEXT_CONTROL_VALUE         1001\n#define _APS_NEXT_SYMED_VALUE           101\n#endif\n#endif\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/runner.exe.manifest",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n<assembly xmlns=\"urn:schemas-microsoft-com:asm.v1\" manifestVersion=\"1.0\">\n  <application xmlns=\"urn:schemas-microsoft-com:asm.v3\">\n    <windowsSettings>\n      <dpiAwareness xmlns=\"http://schemas.microsoft.com/SMI/2016/WindowsSettings\">PerMonitorV2</dpiAwareness>\n    </windowsSettings>\n  </application>\n  <compatibility xmlns=\"urn:schemas-microsoft-com:compatibility.v1\">\n    <application>\n      <!-- Windows 10 and Windows 11 -->\n      <supportedOS Id=\"{8e0f7a12-bfb3-4fe8-b9a5-48fd50a15a9a}\"/>\n      <!-- Windows 8.1 -->\n      <supportedOS Id=\"{1f676c76-80e1-4239-95bb-83d0f6d0da78}\"/>\n      <!-- Windows 8 -->\n      <supportedOS Id=\"{4a2f28e3-53b9-4441-ba9c-d69d4a4a6e38}\"/>\n      <!-- Windows 7 -->\n      <supportedOS Id=\"{35138b9a-5d96-4fbd-8e2d-a2440225f93a}\"/>\n    </application>\n  </compatibility>\n</assembly>\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/utils.cpp",
    "content": "#include \"utils.h\"\n\n#include <flutter_windows.h>\n#include <io.h>\n#include <stdio.h>\n#include <windows.h>\n\n#include <iostream>\n\nvoid CreateAndAttachConsole() {\n  if (::AllocConsole()) {\n    FILE *unused;\n    if (freopen_s(&unused, \"CONOUT$\", \"w\", stdout)) {\n      _dup2(_fileno(stdout), 1);\n    }\n    if (freopen_s(&unused, \"CONOUT$\", \"w\", stderr)) {\n      _dup2(_fileno(stdout), 2);\n    }\n    std::ios::sync_with_stdio();\n    FlutterDesktopResyncOutputStreams();\n  }\n}\n\nstd::vector<std::string> GetCommandLineArguments() {\n  // Convert the UTF-16 command line arguments to UTF-8 for the Engine to use.\n  int argc;\n  wchar_t** argv = ::CommandLineToArgvW(::GetCommandLineW(), &argc);\n  if (argv == nullptr) {\n    return std::vector<std::string>();\n  }\n\n  std::vector<std::string> command_line_arguments;\n\n  // Skip the first argument as it's the binary name.\n  for (int i = 1; i < argc; i++) {\n    command_line_arguments.push_back(Utf8FromUtf16(argv[i]));\n  }\n\n  ::LocalFree(argv);\n\n  return command_line_arguments;\n}\n\nstd::string Utf8FromUtf16(const wchar_t* utf16_string) {\n  if (utf16_string == nullptr) {\n    return std::string();\n  }\n  unsigned int target_length = ::WideCharToMultiByte(\n      CP_UTF8, WC_ERR_INVALID_CHARS, utf16_string,\n      -1, nullptr, 0, nullptr, nullptr)\n    -1; // remove the trailing null character\n  int input_length = (int)wcslen(utf16_string);\n  std::string utf8_string;\n  if (target_length == 0 || target_length > utf8_string.max_size()) {\n    return utf8_string;\n  }\n  utf8_string.resize(target_length);\n  int converted_length = ::WideCharToMultiByte(\n      CP_UTF8, WC_ERR_INVALID_CHARS, utf16_string,\n      input_length, utf8_string.data(), target_length, nullptr, nullptr);\n  if (converted_length == 0) {\n    return std::string();\n  }\n  return utf8_string;\n}\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/utils.h",
    "content": "#ifndef RUNNER_UTILS_H_\n#define RUNNER_UTILS_H_\n\n#include <string>\n#include <vector>\n\n// Creates a console for the process, and redirects stdout and stderr to\n// it for both the runner and the Flutter library.\nvoid CreateAndAttachConsole();\n\n// Takes a null-terminated wchar_t* encoded in UTF-16 and returns a std::string\n// encoded in UTF-8. Returns an empty std::string on failure.\nstd::string Utf8FromUtf16(const wchar_t* utf16_string);\n\n// Gets the command line arguments passed in as a std::vector<std::string>,\n// encoded in UTF-8. Returns an empty std::vector<std::string> on failure.\nstd::vector<std::string> GetCommandLineArguments();\n\n#endif  // RUNNER_UTILS_H_\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/win32_window.cpp",
    "content": "#include \"win32_window.h\"\n\n#include <dwmapi.h>\n#include <flutter_windows.h>\n\n#include \"resource.h\"\n\nnamespace {\n\n/// Window attribute that enables dark mode window decorations.\n///\n/// Redefined in case the developer's machine has a Windows SDK older than\n/// version 10.0.22000.0.\n/// See: https://docs.microsoft.com/windows/win32/api/dwmapi/ne-dwmapi-dwmwindowattribute\n#ifndef DWMWA_USE_IMMERSIVE_DARK_MODE\n#define DWMWA_USE_IMMERSIVE_DARK_MODE 20\n#endif\n\nconstexpr const wchar_t kWindowClassName[] = L\"FLUTTER_RUNNER_WIN32_WINDOW\";\n\n/// Registry key for app theme preference.\n///\n/// A value of 0 indicates apps should use dark mode. A non-zero or missing\n/// value indicates apps should use light mode.\nconstexpr const wchar_t kGetPreferredBrightnessRegKey[] =\n  L\"Software\\\\Microsoft\\\\Windows\\\\CurrentVersion\\\\Themes\\\\Personalize\";\nconstexpr const wchar_t kGetPreferredBrightnessRegValue[] = L\"AppsUseLightTheme\";\n\n// The number of Win32Window objects that currently exist.\nstatic int g_active_window_count = 0;\n\nusing EnableNonClientDpiScaling = BOOL __stdcall(HWND hwnd);\n\n// Scale helper to convert logical scaler values to physical using passed in\n// scale factor\nint Scale(int source, double scale_factor) {\n  return static_cast<int>(source * scale_factor);\n}\n\n// Dynamically loads the |EnableNonClientDpiScaling| from the User32 module.\n// This API is only needed for PerMonitor V1 awareness mode.\nvoid EnableFullDpiSupportIfAvailable(HWND hwnd) {\n  HMODULE user32_module = LoadLibraryA(\"User32.dll\");\n  if (!user32_module) {\n    return;\n  }\n  auto enable_non_client_dpi_scaling =\n      reinterpret_cast<EnableNonClientDpiScaling*>(\n          GetProcAddress(user32_module, \"EnableNonClientDpiScaling\"));\n  if (enable_non_client_dpi_scaling != nullptr) {\n    enable_non_client_dpi_scaling(hwnd);\n  }\n  FreeLibrary(user32_module);\n}\n\n}  // namespace\n\n// Manages the Win32Window's window class 
registration.\nclass WindowClassRegistrar {\n public:\n  ~WindowClassRegistrar() = default;\n\n  // Returns the singleton registrar instance.\n  static WindowClassRegistrar* GetInstance() {\n    if (!instance_) {\n      instance_ = new WindowClassRegistrar();\n    }\n    return instance_;\n  }\n\n  // Returns the name of the window class, registering the class if it hasn't\n  // previously been registered.\n  const wchar_t* GetWindowClass();\n\n  // Unregisters the window class. Should only be called if there are no\n  // instances of the window.\n  void UnregisterWindowClass();\n\n private:\n  WindowClassRegistrar() = default;\n\n  static WindowClassRegistrar* instance_;\n\n  bool class_registered_ = false;\n};\n\nWindowClassRegistrar* WindowClassRegistrar::instance_ = nullptr;\n\nconst wchar_t* WindowClassRegistrar::GetWindowClass() {\n  if (!class_registered_) {\n    WNDCLASS window_class{};\n    window_class.hCursor = LoadCursor(nullptr, IDC_ARROW);\n    window_class.lpszClassName = kWindowClassName;\n    window_class.style = CS_HREDRAW | CS_VREDRAW;\n    window_class.cbClsExtra = 0;\n    window_class.cbWndExtra = 0;\n    window_class.hInstance = GetModuleHandle(nullptr);\n    window_class.hIcon =\n        LoadIcon(window_class.hInstance, MAKEINTRESOURCE(IDI_APP_ICON));\n    window_class.hbrBackground = 0;\n    window_class.lpszMenuName = nullptr;\n    window_class.lpfnWndProc = Win32Window::WndProc;\n    RegisterClass(&window_class);\n    class_registered_ = true;\n  }\n  return kWindowClassName;\n}\n\nvoid WindowClassRegistrar::UnregisterWindowClass() {\n  UnregisterClass(kWindowClassName, nullptr);\n  class_registered_ = false;\n}\n\nWin32Window::Win32Window() {\n  ++g_active_window_count;\n}\n\nWin32Window::~Win32Window() {\n  --g_active_window_count;\n  Destroy();\n}\n\nbool Win32Window::Create(const std::wstring& title,\n                         const Point& origin,\n                         const Size& size) {\n  Destroy();\n\n  const wchar_t* 
window_class =\n      WindowClassRegistrar::GetInstance()->GetWindowClass();\n\n  const POINT target_point = {static_cast<LONG>(origin.x),\n                              static_cast<LONG>(origin.y)};\n  HMONITOR monitor = MonitorFromPoint(target_point, MONITOR_DEFAULTTONEAREST);\n  UINT dpi = FlutterDesktopGetDpiForMonitor(monitor);\n  double scale_factor = dpi / 96.0;\n\n  HWND window = CreateWindow(\n      window_class, title.c_str(), WS_OVERLAPPEDWINDOW,\n      Scale(origin.x, scale_factor), Scale(origin.y, scale_factor),\n      Scale(size.width, scale_factor), Scale(size.height, scale_factor),\n      nullptr, nullptr, GetModuleHandle(nullptr), this);\n\n  if (!window) {\n    return false;\n  }\n\n  UpdateTheme(window);\n\n  return OnCreate();\n}\n\nbool Win32Window::Show() {\n  return ShowWindow(window_handle_, SW_SHOWNORMAL);\n}\n\n// static\nLRESULT CALLBACK Win32Window::WndProc(HWND const window,\n                                      UINT const message,\n                                      WPARAM const wparam,\n                                      LPARAM const lparam) noexcept {\n  if (message == WM_NCCREATE) {\n    auto window_struct = reinterpret_cast<CREATESTRUCT*>(lparam);\n    SetWindowLongPtr(window, GWLP_USERDATA,\n                     reinterpret_cast<LONG_PTR>(window_struct->lpCreateParams));\n\n    auto that = static_cast<Win32Window*>(window_struct->lpCreateParams);\n    EnableFullDpiSupportIfAvailable(window);\n    that->window_handle_ = window;\n  } else if (Win32Window* that = GetThisFromHandle(window)) {\n    return that->MessageHandler(window, message, wparam, lparam);\n  }\n\n  return DefWindowProc(window, message, wparam, lparam);\n}\n\nLRESULT\nWin32Window::MessageHandler(HWND hwnd,\n                            UINT const message,\n                            WPARAM const wparam,\n                            LPARAM const lparam) noexcept {\n  switch (message) {\n    case WM_DESTROY:\n      window_handle_ = nullptr;\n      Destroy();\n  
    if (quit_on_close_) {\n        PostQuitMessage(0);\n      }\n      return 0;\n\n    case WM_DPICHANGED: {\n      auto newRectSize = reinterpret_cast<RECT*>(lparam);\n      LONG newWidth = newRectSize->right - newRectSize->left;\n      LONG newHeight = newRectSize->bottom - newRectSize->top;\n\n      SetWindowPos(hwnd, nullptr, newRectSize->left, newRectSize->top, newWidth,\n                   newHeight, SWP_NOZORDER | SWP_NOACTIVATE);\n\n      return 0;\n    }\n    case WM_SIZE: {\n      RECT rect = GetClientArea();\n      if (child_content_ != nullptr) {\n        // Size and position the child window.\n        MoveWindow(child_content_, rect.left, rect.top, rect.right - rect.left,\n                   rect.bottom - rect.top, TRUE);\n      }\n      return 0;\n    }\n\n    case WM_ACTIVATE:\n      if (child_content_ != nullptr) {\n        SetFocus(child_content_);\n      }\n      return 0;\n\n    case WM_DWMCOLORIZATIONCOLORCHANGED:\n      UpdateTheme(hwnd);\n      return 0;\n  }\n\n  return DefWindowProc(window_handle_, message, wparam, lparam);\n}\n\nvoid Win32Window::Destroy() {\n  OnDestroy();\n\n  if (window_handle_) {\n    DestroyWindow(window_handle_);\n    window_handle_ = nullptr;\n  }\n  if (g_active_window_count == 0) {\n    WindowClassRegistrar::GetInstance()->UnregisterWindowClass();\n  }\n}\n\nWin32Window* Win32Window::GetThisFromHandle(HWND const window) noexcept {\n  return reinterpret_cast<Win32Window*>(\n      GetWindowLongPtr(window, GWLP_USERDATA));\n}\n\nvoid Win32Window::SetChildContent(HWND content) {\n  child_content_ = content;\n  SetParent(content, window_handle_);\n  RECT frame = GetClientArea();\n\n  MoveWindow(content, frame.left, frame.top, frame.right - frame.left,\n             frame.bottom - frame.top, true);\n\n  SetFocus(child_content_);\n}\n\nRECT Win32Window::GetClientArea() {\n  RECT frame;\n  GetClientRect(window_handle_, &frame);\n  return frame;\n}\n\nHWND Win32Window::GetHandle() {\n  return window_handle_;\n}\n\nvoid 
Win32Window::SetQuitOnClose(bool quit_on_close) {\n  quit_on_close_ = quit_on_close;\n}\n\nbool Win32Window::OnCreate() {\n  // No-op; provided for subclasses.\n  return true;\n}\n\nvoid Win32Window::OnDestroy() {\n  // No-op; provided for subclasses.\n}\n\nvoid Win32Window::UpdateTheme(HWND const window) {\n  DWORD light_mode;\n  DWORD light_mode_size = sizeof(light_mode);\n  LSTATUS result = RegGetValue(HKEY_CURRENT_USER, kGetPreferredBrightnessRegKey,\n                               kGetPreferredBrightnessRegValue,\n                               RRF_RT_REG_DWORD, nullptr, &light_mode,\n                               &light_mode_size);\n\n  if (result == ERROR_SUCCESS) {\n    BOOL enable_dark_mode = light_mode == 0;\n    DwmSetWindowAttribute(window, DWMWA_USE_IMMERSIVE_DARK_MODE,\n                          &enable_dark_mode, sizeof(enable_dark_mode));\n  }\n}\n"
  },
  {
    "path": "flutter-examples/tts/windows/runner/win32_window.h",
    "content": "#ifndef RUNNER_WIN32_WINDOW_H_\n#define RUNNER_WIN32_WINDOW_H_\n\n#include <windows.h>\n\n#include <functional>\n#include <memory>\n#include <string>\n\n// A class abstraction for a high DPI-aware Win32 Window. Intended to be\n// inherited from by classes that wish to specialize with custom\n// rendering and input handling\nclass Win32Window {\n public:\n  struct Point {\n    unsigned int x;\n    unsigned int y;\n    Point(unsigned int x, unsigned int y) : x(x), y(y) {}\n  };\n\n  struct Size {\n    unsigned int width;\n    unsigned int height;\n    Size(unsigned int width, unsigned int height)\n        : width(width), height(height) {}\n  };\n\n  Win32Window();\n  virtual ~Win32Window();\n\n  // Creates a win32 window with |title| that is positioned and sized using\n  // |origin| and |size|. New windows are created on the default monitor. Window\n  // sizes are specified to the OS in physical pixels, hence to ensure a\n  // consistent size this function will scale the inputted width and height as\n  // as appropriate for the default monitor. The window is invisible until\n  // |Show| is called. Returns true if the window was created successfully.\n  bool Create(const std::wstring& title, const Point& origin, const Size& size);\n\n  // Show the current window. Returns true if the window was successfully shown.\n  bool Show();\n\n  // Release OS resources associated with window.\n  void Destroy();\n\n  // Inserts |content| into the window tree.\n  void SetChildContent(HWND content);\n\n  // Returns the backing Window handle to enable clients to set icon and other\n  // window properties. 
Returns nullptr if the window has been destroyed.\n  HWND GetHandle();\n\n  // If true, closing this window will quit the application.\n  void SetQuitOnClose(bool quit_on_close);\n\n  // Return a RECT representing the bounds of the current client area.\n  RECT GetClientArea();\n\n protected:\n  // Processes and routes salient window messages for mouse handling,\n  // size change and DPI. Delegates handling of these to member overloads that\n  // inheriting classes can handle.\n  virtual LRESULT MessageHandler(HWND window,\n                                 UINT const message,\n                                 WPARAM const wparam,\n                                 LPARAM const lparam) noexcept;\n\n  // Called when Create is called, allowing subclass window-related\n  // setup. Subclasses should return false if setup fails.\n  virtual bool OnCreate();\n\n  // Called when Destroy is called.\n  virtual void OnDestroy();\n\n private:\n  friend class WindowClassRegistrar;\n\n  // OS callback called by message pump. Handles the WM_NCCREATE message which\n  // is passed when the non-client area is being created and enables automatic\n  // non-client DPI scaling so that the non-client area automatically\n  // responds to changes in DPI. 
All other messages are handled by\n  // MessageHandler.\n  static LRESULT CALLBACK WndProc(HWND const window,\n                                  UINT const message,\n                                  WPARAM const wparam,\n                                  LPARAM const lparam) noexcept;\n\n  // Retrieves a class instance pointer for |window|\n  static Win32Window* GetThisFromHandle(HWND const window) noexcept;\n\n  // Update the window frame's theme to match the system theme.\n  static void UpdateTheme(HWND const window);\n\n  bool quit_on_close_ = false;\n\n  // window handle for top level window.\n  HWND window_handle_ = nullptr;\n\n  // window handle for hosted content.\n  HWND child_content_ = nullptr;\n};\n\n#endif  // RUNNER_WIN32_WINDOW_H_\n"
  },
  {
    "path": "go-api-examples/.gitignore",
    "content": "!*.sh\n"
  },
  {
    "path": "go-api-examples/README.md",
    "content": "# Introduction\n\nThis folder contains Go API examples for [sherpa-onnx][sherpa-onnx].\n\nPlease refer to the documentation\nhttps://k2-fsa.github.io/sherpa/onnx/go-api/index.html\nfor details.\n\n- [./add-punctuation](./add-punctuation) It shows how to use\n  a punctuation model to add punctuations to text\n\n- [./add-punctuation-online](./add-punctuation-online) It shows how to use\n  an online punctuation model to add punctuations and casing to text\n\n- [./non-streaming-decode-files](./non-streaming-decode-files) It shows how to use\n  a non-streaming ASR model to decode files\n\n- [./non-streaming-speaker-diarization](./non-streaming-speaker-diarization) It shows how to use\n  a speaker segmentation model and a speaker embedding model for speaker diarization.\n\n- [./speech-enhancement-gtcrn](./speech-enhancement-gtcrn) It shows how to use\n  the offline speech denoiser API with GTCRN models.\n\n- [./speech-enhancement-dpdfnet](./speech-enhancement-dpdfnet) It shows how to use\n  the offline speech denoiser API with DPDFNet models.\n\n- [./streaming-speech-enhancement-gtcrn](./streaming-speech-enhancement-gtcrn) It shows how to use\n  the online speech denoiser API with GTCRN models.\n\n- [./streaming-speech-enhancement-dpdfnet](./streaming-speech-enhancement-dpdfnet) It shows how to use\n  the online speech denoiser API with DPDFNet models.\n\n- [./non-streaming-tts](./non-streaming-tts) It shows how to use a non-streaming TTS\n  model to convert text to speech\n\n- [./offline-tts-play](./offline-tts-play) It shows how to use a non-streaming TTS\n  model to convert text to speech. It plays the audio back as it is being generated.\n\n- [./zero-shot-pocket-tts](./zero-shot-pocket-tts) It shows how to use a PocketTTS\n  model for zero-shot TTS.\n\n- [./zero-shot-pocket-tts-play](./zero-shot-pocket-tts-play) It shows how to use a PocketTTS\n  model for zero-shot TTS. 
It plays the audio back as it is being generated.\n\n- [./zero-shot-zipvoice-tts](./zero-shot-zipvoice-tts) It shows how to use a ZipVoice\n  model for zero-shot TTS with the GenerationConfig API.\n\n- [./zero-shot-zipvoice-tts-play](./zero-shot-zipvoice-tts-play) It shows how to use a\n  ZipVoice model for zero-shot TTS. It plays the audio back as it is being generated.\n\n- [./real-time-speech-recognition-from-microphone](./real-time-speech-recognition-from-microphone)\n  It shows how to use a streaming ASR model to recognize speech from a microphone in real-time\n\n- [./speaker-identification](./speaker-identification) It shows how to use a speaker\n  embedding model for speaker identification.\n\n- [./streaming-decode-files](./streaming-decode-files) It shows how to use a streaming\n  model for streaming speech recognition\n\n- [./streaming-hlg-decoding](./streaming-hlg-decoding) It shows how to use a streaming\n  model for streaming speech recognition with HLG decoding\n\n- [./vad](./vad) It shows how to use silero VAD with Golang.\n\n- [./vad-asr-paraformer](./vad-asr-paraformer) It shows how to use silero VAD + Paraformer\n  for speech recognition.\n\n- [./vad-asr-whisper](./vad-asr-whisper) It shows how to use silero VAD + Whisper\n  for speech recognition.\n\n- [./vad-speaker-identification](./vad-speaker-identification) It shows how to use the Go API for VAD + speaker identification.\n\n- [./vad-spoken-language-identification](./vad-spoken-language-identification) It shows how to use silero VAD + Whisper\n  for spoken language identification.\n\n[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx\n"
  },
  {
    "path": "go-api-examples/add-punctuation/go.mod",
    "content": "module add-punctuation\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/add-punctuation/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflinePunctuationConfig{}\n\tconfig.Model.CtTransformer = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Provider = \"cpu\"\n\n\tpunct := sherpa.NewOfflinePunctuation(&config)\n\tdefer sherpa.DeleteOfflinePunc(punct)\n\n\ttextArray := []string{\n\t\t\"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n\t\t\"我们都是木头人不会说话不会动\",\n\t\t\"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n\t}\n\tlog.Println(\"----------\")\n\tfor _, text := range textArray {\n\t\tnewText := punct.AddPunct(text)\n\t\tlog.Printf(\"Input text: %v\", text)\n\t\tlog.Printf(\"Output text: %v\", newText)\n\t\tlog.Println(\"----------\")\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/add-punctuation/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./add-punctuation\n"
  },
  {
    "path": "go-api-examples/add-punctuation-online/go.mod",
    "content": "module add-punctuation-online\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/add-punctuation-online/main.go",
    "content": "package main\n\nimport (\n\t\"log\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OnlinePunctuationConfig{}\n\tconfig.Model.CnnBilstm = \"./sherpa-onnx-online-punct-en-2024-08-06/model.onnx\"\n\tconfig.Model.BpeVocab = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Provider = \"cpu\"\n\n\tpunct := sherpa.NewOnlinePunctuation(&config)\n\tif punct == nil {\n\t\tlog.Fatal(\"Failed to create OnlinePunctuation\")\n\t}\n\tdefer sherpa.DeleteOnlinePunctuation(punct)\n\n\ttextArray := []string{\n\t\t\"how are you i am fine thank you\",\n\t\t\"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n\t}\n\n\tlog.Println(\"----------\")\n\tfor _, text := range textArray {\n\t\tnewText := punct.AddPunct(text)\n\t\tlog.Printf(\"Input text: %v\", text)\n\t\tlog.Printf(\"Output text: %v\", newText)\n\t\tlog.Println(\"----------\")\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/add-punctuation-online/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d ./sherpa-onnx-online-punct-en-2024-08-06 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./add-punctuation-online\n"
  },
  {
    "path": "go-api-examples/audio-tagging/go.mod",
    "content": "module audio-tagging\n\ngo 1.17\n\n"
  },
  {
    "path": "go-api-examples/audio-tagging/main.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tconfig := sherpa.AudioTaggingConfig{}\n\tconfig.Model.Zipformer.Model = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Debug = 1\n\tconfig.Model.Provider = \"cpu\"\n\tconfig.Labels = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/class_labels_indices.csv\"\n\tconfig.TopK = 5\n\n\ttagging := sherpa.NewAudioTagging(&config)\n\tdefer sherpa.DeleteAudioTagging(tagging)\n\n\twave_filename := \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/3.wav\"\n\n\twave := sherpa.ReadWave(wave_filename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", wave_filename)\n\t\treturn\n\t}\n\n\tstream := sherpa.NewAudioTaggingStream(tagging)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(wave.SampleRate, wave.Samples)\n\n\tresult := tagging.Compute(stream, 10)\n\tfmt.Printf(\"the tagging result: %v\\n\", result)\n}\n"
  },
  {
    "path": "go-api-examples/audio-tagging/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./audio-tagging\n"
  },
  {
    "path": "go-api-examples/keyword-spotting-from-file/go.mod",
    "content": "module keyword-spotting-from-file\n\ngo 1.17\n\n"
  },
  {
    "path": "go-api-examples/keyword-spotting-from-file/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.KeywordSpotterConfig{}\n\n\t// Please download the models from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\n\n\tconfig.ModelConfig.Transducer.Encoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n\tconfig.ModelConfig.Transducer.Decoder = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n\tconfig.ModelConfig.Transducer.Joiner = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\"\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\"\n\tconfig.KeywordsFile = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\"\n\tconfig.ModelConfig.NumThreads = 1\n\tconfig.ModelConfig.Debug = 1\n\n\tspotter := sherpa.NewKeywordSpotter(&config)\n\tdefer sherpa.DeleteKeywordSpotter(spotter)\n\n\twave_filename := \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\"\n\n\twave := sherpa.ReadWave(wave_filename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", wave_filename)\n\t\treturn\n\t}\n\n\tlog.Println(\"----------Use pre-defined keywords----------\")\n\n\tstream := sherpa.NewKeywordStream(spotter)\n\tdefer sherpa.DeleteOnlineStream(stream)\n\n\tstream.AcceptWaveform(wave.SampleRate, wave.Samples)\n\n\tfor spotter.IsReady(stream) {\n\t\tspotter.Decode(stream)\n\t\tresult := spotter.GetResult(stream)\n\t\tif result.Keyword != \"\" {\n\t\t\t// You have to reset the stream right after detecting a keyword\n\t\t\tspotter.Reset(stream)\n\t\t\tlog.Printf(\"Detected %v\\n\", result.Keyword)\n\t\t}\n\t}\n\n\tlog.Println(\"----------Use pre-defined keywords + add a new 
keyword----------\")\n\n\tstream2 := sherpa.NewKeywordStreamWithKeywords(spotter, \"y ǎn y uán @演员\")\n\tdefer sherpa.DeleteOnlineStream(stream2)\n\n\tstream2.AcceptWaveform(wave.SampleRate, wave.Samples)\n\n\tfor spotter.IsReady(stream2) {\n\t\tspotter.Decode(stream2)\n\t\tresult := spotter.GetResult(stream2)\n\t\tif result.Keyword != \"\" {\n\t\t\tlog.Printf(\"Detected %v\\n\", result.Keyword)\n\t\t}\n\t}\n\n\tlog.Println(\"----------Use pre-defined keywords + add 2 new keywords----------\")\n\n\tstream3 := sherpa.NewKeywordStreamWithKeywords(spotter, \"y ǎn y uán @演员/zh ī m íng @知名\")\n\tdefer sherpa.DeleteOnlineStream(stream3)\n\n\tstream3.AcceptWaveform(wave.SampleRate, wave.Samples)\n\n\tfor spotter.IsReady(stream3) {\n\t\tspotter.Decode(stream3)\n\t\tresult := spotter.GetResult(stream3)\n\t\tif result.Keyword != \"\" {\n\t\t\tlog.Printf(\"Detected %v\\n\", result.Keyword)\n\t\t}\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/keyword-spotting-from-file/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./keyword-spotting-from-file\n"
  },
  {
    "path": "go-api-examples/non-streaming-canary-decode-files/go.mod",
    "content": "module non-streaming-canary-decode-files\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-canary-decode-files/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineRecognizerConfig{}\n\n\tconfig.ModelConfig.Canary.Encoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\"\n\tconfig.ModelConfig.Canary.Decoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\"\n\tconfig.ModelConfig.Canary.SrcLang = \"en\"\n\tconfig.ModelConfig.Canary.TgtLang = \"en\"\n\tconfig.ModelConfig.Canary.UsePnc = 1\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\"\n\n\twaveFilename := \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\"\n\n\tsamples, sampleRate := readWave(waveFilename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOfflineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\trecognizer.Decode(stream)\n\tlog.Println(\"Decoding done!\")\n\tresult := stream.GetResult()\n\n\tlog.Println(\"Text in English: \" + strings.ToLower(result.Text))\n\n\ts := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(s)\n\n\ts.AcceptWaveform(sampleRate, samples)\n\n\tconfig.ModelConfig.Canary.TgtLang = \"de\"\n\trecognizer.SetConfig(&config)\n\trecognizer.Decode(s)\n\tresult = s.GetResult()\n\n\tlog.Println(\"Text in German: \" + strings.ToLower(result.Text))\n}\n\nfunc readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, _ := os.Open(filename)\n\tdefer file.Close()\n\n\treader := 
wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format\")\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes\\n\", reader.Size, n)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-canary-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./non-streaming-canary-decode-files\n"
  },
  {
    "path": "go-api-examples/non-streaming-funasr-nano-decode-files/go.mod",
    "content": "module non-streaming-funasr-nano-decode-files\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-funasr-nano-decode-files/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineRecognizerConfig{}\n\n\tconfig.ModelConfig.FunAsrNano.EncoderAdaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\"\n\tconfig.ModelConfig.FunAsrNano.LLM = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\"\n\tconfig.ModelConfig.FunAsrNano.Embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\"\n\tconfig.ModelConfig.FunAsrNano.Tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\"\n\t// Seed for reproducibility (default: 42)\n\tconfig.ModelConfig.FunAsrNano.Seed = 42\n\n\tconfig.ModelConfig.Tokens = \"\"\n\n\twaveFilename := \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\"\n\n\tsamples, sampleRate := readWave(waveFilename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOfflineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\trecognizer.Decode(stream)\n\tlog.Println(\"Decoding done!\")\n\tresult := stream.GetResult()\n\n\tlog.Println(\"Text: \" + strings.ToLower(result.Text))\n}\n\nfunc readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, _ := os.Open(filename)\n\tdefer file.Close()\n\n\treader := wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format\")\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. 
Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes\\n\", reader.Size, n)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-funasr-nano-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-funasr-nano-decode-files\n"
  },
  {
    "path": "go-api-examples/non-streaming-medasr-ctc-decode-files/go.mod",
    "content": "module non-streaming-medasr-ctc-decode-files\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-medasr-ctc-decode-files/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineRecognizerConfig{}\n\n\tconfig.ModelConfig.MedAsr.Model = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\"\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\"\n\n\twaveFilename := \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\"\n\n\tsamples, sampleRate := readWave(waveFilename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOfflineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\trecognizer.Decode(stream)\n\tlog.Println(\"Decoding done!\")\n\tresult := stream.GetResult()\n\n\tlog.Println(\"Text: \" + strings.ToLower(result.Text))\n}\n\nfunc readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, _ := os.Open(filename)\n\tdefer file.Close()\n\n\treader := wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format\")\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. 
Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes\\n\", reader.Size, n)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-medasr-ctc-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./non-streaming-medasr-ctc-decode-files\n"
  },
  {
    "path": "go-api-examples/non-streaming-moonshine-v2-decode-files/go.mod",
    "content": "module non-streaming-moonshine-v2-decode-files\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-moonshine-v2-decode-files/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineRecognizerConfig{}\n\n\tconfig.ModelConfig.Moonshine.Encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\"\n\tconfig.ModelConfig.Moonshine.MergedDecoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\"\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\"\n\n\twaveFilename := \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\"\n\n\tsamples, sampleRate := readWave(waveFilename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOfflineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\trecognizer.Decode(stream)\n\tlog.Println(\"Decoding done!\")\n\tresult := stream.GetResult()\n\n\tlog.Println(\"Text: \" + strings.ToLower(result.Text))\n}\n\nfunc readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, _ := os.Open(filename)\n\tdefer file.Close()\n\n\treader := wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format\")\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. 
Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes\\n\", reader.Size, n)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-moonshine-v2-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./non-streaming-moonshine-v2-decode-files\n"
  },
  {
    "path": "go-api-examples/non-streaming-omnilingual-asr-ctc-decode-files/go.mod",
    "content": "module non-streaming-omnilingual-asr-ctc-decode-files\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-omnilingual-asr-ctc-decode-files/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineRecognizerConfig{}\n\n\tconfig.ModelConfig.Omnilingual.Model = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\"\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\"\n\n\twaveFilename := \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\"\n\n\tsamples, sampleRate := readWave(waveFilename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOfflineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\trecognizer.Decode(stream)\n\tlog.Println(\"Decoding done!\")\n\tresult := stream.GetResult()\n\n\tlog.Println(\"Text: \" + strings.ToLower(result.Text))\n}\n\nfunc readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, _ := os.Open(filename)\n\tdefer file.Close()\n\n\treader := wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format\")\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. 
Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes\\n\", reader.Size, n)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-omnilingual-asr-ctc-decode-files/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./non-streaming-omnilingual-asr-ctc-decode-files\n"
  },
  {
    "path": "go-api-examples/non-streaming-speaker-diarization/go.mod",
    "content": "module non-streaming-speaker-diarization\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-speaker-diarization/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\n/*\nUsage:\n\nStep 1: Download a speaker segmentation model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\nStep 2: Download a speaker embedding extractor model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\nStep 3. Download test wave files\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available test wave files. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nStep 4. 
Run it\n*/\n\nfunc initSpeakerDiarization() *sherpa.OfflineSpeakerDiarization {\n\tconfig := sherpa.OfflineSpeakerDiarizationConfig{}\n\n\tconfig.Segmentation.Pyannote.Model = \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\"\n\tconfig.Embedding.Model = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n\n\t// The test wave file contains 4 speakers, so we use 4 here\n\tconfig.Clustering.NumClusters = 4\n\n\t// If you don't know the number of speakers in the wave file,\n\t// don't set NumClusters; set a clustering threshold instead:\n\t//\n\t// config.Clustering.Threshold = 0.5\n\t//\n\n\t// A larger Threshold leads to fewer clusters\n\t// A smaller Threshold leads to more clusters\n\n\tsd := sherpa.NewOfflineSpeakerDiarization(&config)\n\treturn sd\n}\n\nfunc main() {\n\twaveFilename := \"./0-four-speakers-zh.wav\"\n\twave := sherpa.ReadWave(waveFilename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\", waveFilename)\n\t\treturn\n\t}\n\n\tsd := initSpeakerDiarization()\n\tif sd == nil {\n\t\tlog.Printf(\"Please check your config\")\n\t\treturn\n\t}\n\n\tdefer sherpa.DeleteOfflineSpeakerDiarization(sd)\n\n\tif wave.SampleRate != sd.SampleRate() {\n\t\tlog.Printf(\"Expected sample rate: %v, given: %d\\n\", sd.SampleRate(), wave.SampleRate)\n\t\treturn\n\t}\n\n\tlog.Println(\"Started\")\n\tsegments := sd.Process(wave.Samples)\n\tn := len(segments)\n\n\tfor i := 0; i < n; i++ {\n\t\tlog.Printf(\"%.3f -- %.3f speaker_%02d\\n\", segments[i].Start, segments[i].End, segments[i].Speaker)\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-speaker-diarization/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\ngo mod tidy\ngo build\n./non-streaming-speaker-diarization\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/go.mod",
    "content": "module non-streaming-tts\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/main.go",
    "content": "package main\n\nimport (\n\t\"log\"\n\t\"math\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineTtsConfig{}\n\tsid := 0\n\tfilename := \"./generated.wav\"\n\n\tvar speed float32\n\n\tflag.StringVar(&config.Model.Vits.Model, \"vits-model\", \"\", \"Path to the vits ONNX model\")\n\tflag.StringVar(&config.Model.Vits.Lexicon, \"vits-lexicon\", \"\", \"Path to lexicon.txt\")\n\tflag.StringVar(&config.Model.Vits.Tokens, \"vits-tokens\", \"\", \"Path to tokens.txt\")\n\tflag.StringVar(&config.Model.Vits.DataDir, \"vits-data-dir\", \"\", \"Path to espeak-ng-data\")\n\tflag.Float32Var(&config.Model.Vits.NoiseScale, \"vits-noise-scale\", 0.667, \"noise_scale for VITS\")\n\tflag.Float32Var(&config.Model.Vits.NoiseScaleW, \"vits-noise-scale-w\", 0.8, \"noise_scale_w for VITS\")\n\tflag.Float32Var(&config.Model.Vits.LengthScale, \"vits-length-scale\", 1.0, \"length_scale for VITS. small -> faster; large -> slower\")\n\n\tflag.StringVar(&config.Model.Matcha.AcousticModel, \"matcha-acoustic-model\", \"\", \"Path to the matcha acoustic model\")\n\tflag.StringVar(&config.Model.Matcha.Vocoder, \"matcha-vocoder\", \"\", \"Path to the matcha vocoder model\")\n\tflag.StringVar(&config.Model.Matcha.Lexicon, \"matcha-lexicon\", \"\", \"Path to lexicon.txt\")\n\tflag.StringVar(&config.Model.Matcha.Tokens, \"matcha-tokens\", \"\", \"Path to tokens.txt\")\n\tflag.StringVar(&config.Model.Matcha.DataDir, \"matcha-data-dir\", \"\", \"Path to espeak-ng-data\")\n\tflag.Float32Var(&config.Model.Matcha.NoiseScale, \"matcha-noise-scale\", 0.667, \"noise_scale for Matcha\")\n\tflag.Float32Var(&config.Model.Matcha.LengthScale, \"matcha-length-scale\", 1.0, \"length_scale for Matcha. 
small -> faster; large -> slower\")\n\n\tflag.StringVar(&config.Model.Kokoro.Model, \"kokoro-model\", \"\", \"Path to the Kokoro ONNX model\")\n\tflag.StringVar(&config.Model.Kokoro.Voices, \"kokoro-voices\", \"\", \"Path to voices.bin for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.Tokens, \"kokoro-tokens\", \"\", \"Path to tokens.txt for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.DataDir, \"kokoro-data-dir\", \"\", \"Path to espeak-ng-data for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.Lexicon, \"kokoro-lexicon\", \"\", \"Path to lexicon files for Kokoro\")\n\tflag.Float32Var(&config.Model.Kokoro.LengthScale, \"kokoro-length-scale\", 1.0, \"length_scale for Kokoro. small -> faster; large -> slower\")\n\n\tflag.StringVar(&config.Model.Kitten.Model, \"kitten-model\", \"\", \"Path to the kitten ONNX model\")\n\tflag.StringVar(&config.Model.Kitten.Voices, \"kitten-voices\", \"\", \"Path to voices.bin for kitten\")\n\tflag.StringVar(&config.Model.Kitten.Tokens, \"kitten-tokens\", \"\", \"Path to tokens.txt for kitten\")\n\tflag.StringVar(&config.Model.Kitten.DataDir, \"kitten-data-dir\", \"\", \"Path to espeak-ng-data for kitten\")\n\tflag.Float32Var(&config.Model.Kitten.LengthScale, \"kitten-length-scale\", 1.0, \"length_scale for kitten. small -> faster; large -> slower\")\n\n\tflag.Float32Var(&speed, \"speed\", 1.0, \"Speech speed. 
larger->faster; smaller->slower\")\n\n\tflag.IntVar(&config.Model.NumThreads, \"num-threads\", 1, \"Number of threads for computing\")\n\tflag.IntVar(&config.Model.Debug, \"debug\", 0, \"Whether to show debug message\")\n\tflag.StringVar(&config.Model.Provider, \"provider\", \"cpu\", \"Provider to use: cpu/cuda/coreml\")\n\tflag.StringVar(&config.RuleFsts, \"tts-rule-fsts\", \"\", \"Path to rule.fst\")\n\tflag.StringVar(&config.RuleFars, \"tts-rule-fars\", \"\", \"Path to rule.far\")\n\tflag.IntVar(&config.MaxNumSentences, \"tts-max-num-sentences\", 1, \"Batch size (split long text to avoid OOM)\")\n\n\tflag.IntVar(&sid, \"sid\", sid, \"Speaker ID (multi-speaker models only)\")\n\tflag.StringVar(&filename, \"output-filename\", filename, \"Output wav filename\")\n\n\tflag.Parse()\n\n\tif len(flag.Args()) != 1 {\n\t\tlog.Fatalf(\"Please provide the text to generate audios\")\n\t}\n\ttext := flag.Arg(0)\n\n\tlog.Println(\"Input text:\", text)\n\tlog.Println(\"Speaker ID:\", sid)\n\tlog.Println(\"Output filename:\", filename)\n\n\tlog.Println(\"Initializing model (may take several seconds)\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tdefer sherpa.DeleteOfflineTts(tts)\n\tlog.Println(\"Model created!\")\n\n\tlog.Println(\"Start generating!\")\n\tcfg := sherpa.GenerationConfig{\n\t\tSilenceScale: 0.2,\n\t\tSpeed:        float32(math.Max(float64(speed), 1e-6)),\n\t\tSid:          sid,\n\t}\n\taudio := tts.GenerateWithConfig(text, &cfg, nil)\n\n\tlog.Println(\"Done!\")\n\tif ok := audio.Save(filename); !ok {\n\t\tlog.Fatalf(\"Failed to write %s\", filename)\n\t}\n\tlog.Println(\"Saved to\", filename)\n}\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --kitten-model=./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --kitten-voices=./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --kitten-tokens=./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --kitten-data-dir=./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./test-kitten-en.wav \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n  --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n  --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n  --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./test-kokoro-en.wav \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --kokoro-model=./kokoro-multi-lang-v1_0/model.onnx \\\n  --kokoro-voices=./kokoro-multi-lang-v1_0/voices.bin \\\n  --kokoro-tokens=./kokoro-multi-lang-v1_0/tokens.txt \\\n  --kokoro-data-dir=./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --kokoro-lexicon=./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --debug=1 \\\n  --output-filename=./test-kokoro-zh-en.wav \\\n  \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？\"\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --debug=1 \\\n  --output-filename=./test-matcha-en.wav \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\n\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --debug=1 \\\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n  --output-filename=./test-matcha-zh.wav \\\n  \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-vits-ljs.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-ljs ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-ljs.tar.bz2\n  tar xvf vits-ljs.tar.bz2\n  rm vits-ljs.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --vits-model=./vits-ljs/vits-ljs.onnx \\\n  --vits-lexicon=./vits-ljs/lexicon.txt \\\n  --vits-tokens=./vits-ljs/tokens.txt \\\n  --sid=0 \\\n  --debug=1 \\\n  --output-filename=./vits-ljs.wav \\\n  \"Liliana, the most beautiful and lovely assistant of our team!\"\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-vits-piper-en_US-lessac-medium.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-piper-en_US-lessac-medium ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2\n  tar xf vits-piper-en_US-lessac-medium.tar.bz2\n  rm vits-piper-en_US-lessac-medium.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./non-streaming-tts \\\n  --vits-model=./vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx \\\n  --vits-data-dir=./vits-piper-en_US-lessac-medium/espeak-ng-data \\\n  --vits-tokens=./vits-piper-en_US-lessac-medium/tokens.txt \\\n  --output-filename=./liliana-piper-en_US-lessac-medium.wav \\\n  'liliana, the most beautiful and lovely assistant of our team!'\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-vits-vctk.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-vctk ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-vctk.tar.bz2\n  tar xvf vits-vctk.tar.bz2\n  rm vits-vctk.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\nfor sid in 0 10 108; do\n./non-streaming-tts \\\n  --vits-model=./vits-vctk/vits-vctk.onnx \\\n  --vits-lexicon=./vits-vctk/lexicon.txt \\\n  --vits-tokens=./vits-vctk/tokens.txt \\\n  --sid=0 \\\n  --debug=1 \\\n  --output-filename=./kennedy-$sid.wav \\\n  'Ask not what your country can do for you; ask what you can do for your country.'\ndone\n"
  },
  {
    "path": "go-api-examples/non-streaming-tts/run-vits-zh-aishell3.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-icefall-zh-aishell3 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n  tar xvf vits-icefall-zh-aishell3.tar.bz2\n  rm vits-icefall-zh-aishell3.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\nfor sid in 10 33 99; do\n./non-streaming-tts \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --sid=$sid \\\n  --debug=1 \\\n  --output-filename=./liliana-$sid.wav \\\n  \"林美丽最美丽、最漂亮、最可爱！\"\n\n./non-streaming-tts \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\n  --sid=$sid \\\n  --debug=1 \\\n  --output-filename=./numbers-$sid.wav \\\n  \"数字12345.6789怎么念\"\n\n./non-streaming-tts \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\n  --tts-rule-fars=./vits-icefall-zh-aishell3/rule.far \\\n  --sid=$sid \\\n  --debug=1 \\\n  --output-filename=./heteronym-$sid.wav \\\n  \"万古长存长沙长大长白山长孙长安街\"\ndone\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/go.mod",
    "content": "module offline-tts-play\n\ngo 1.24.0\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/main.go",
    "content": "package main\n\nimport (\n\t\"encoding/binary\"\n\t\"io\"\n\t\"log\"\n\t\"math\"\n\t\"os\"\n\t\"os/signal\"\n\t\"sync\"\n\t\"syscall\"\n\t\"time\"\n\n\toto \"github.com/ebitengine/oto/v3\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\ntype pcmBuffer struct {\n\tmu       sync.Mutex\n\tqueue    [][]byte\n\tfinished bool\n\tstarted  chan struct{} // closed on first callback\n\tonce     sync.Once\n}\n\nfunc newPCMBuffer() *pcmBuffer {\n\treturn &pcmBuffer{\n\t\tstarted: make(chan struct{}),\n\t}\n}\n\nfunc (b *pcmBuffer) Push(p []byte) {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.queue = append(b.queue, p)\n\tb.mu.Unlock()\n}\n\nfunc (b *pcmBuffer) Finish() {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.finished = true\n\tb.mu.Unlock()\n}\n\ntype pcmReader struct {\n\tbuf  *pcmBuffer\n\tdone chan struct{}\n\tonce sync.Once\n}\n\nfunc (r *pcmReader) Read(p []byte) (int, error) {\n\t<-r.buf.started\n\n\tr.buf.mu.Lock()\n\tdefer r.buf.mu.Unlock()\n\n\t// 2) Have audio\n\tif len(r.buf.queue) > 0 {\n\t\tchunk := r.buf.queue[0]\n\t\tn := copy(p, chunk)\n\n\t\tif n == len(chunk) {\n\t\t\tr.buf.queue = r.buf.queue[1:]\n\t\t} else {\n\t\t\tr.buf.queue[0] = chunk[n:]\n\t\t}\n\t\treturn n, nil\n\t}\n\n\t// 3) Finished → EOF\n\tif r.buf.finished {\n\t\tr.once.Do(func() { close(r.done) })\n\t\treturn 0, io.EOF\n\t}\n\n\t// 4) Gap → silence\n\tfor i := range p {\n\t\tp[i] = 0\n\t}\n\treturn len(p), nil\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineTtsConfig{}\n\tsid := 0\n\tfilename := \"./generated.wav\"\n\n\tflag.StringVar(&config.Model.Vits.Model, \"vits-model\", \"\", \"Path to the vits ONNX model\")\n\tflag.StringVar(&config.Model.Vits.Lexicon, \"vits-lexicon\", \"\", \"Path to lexicon.txt\")\n\tflag.StringVar(&config.Model.Vits.Tokens, \"vits-tokens\", \"\", \"Path to 
tokens.txt\")\n\tflag.StringVar(&config.Model.Vits.DataDir, \"vits-data-dir\", \"\", \"Path to espeak-ng-data\")\n\n\tflag.Float32Var(&config.Model.Vits.NoiseScale, \"vits-noise-scale\", 0.667, \"noise_scale for VITS\")\n\tflag.Float32Var(&config.Model.Vits.NoiseScaleW, \"vits-noise-scale-w\", 0.8, \"noise_scale_w for VITS\")\n\tflag.Float32Var(&config.Model.Vits.LengthScale, \"vits-length-scale\", 1.0, \"length_scale for VITS. small -> faster in speech speed; large -> slower\")\n\n\tflag.StringVar(&config.Model.Matcha.AcousticModel, \"matcha-acoustic-model\", \"\", \"Path to the matcha acoustic model\")\n\tflag.StringVar(&config.Model.Matcha.Vocoder, \"matcha-vocoder\", \"\", \"Path to the matcha vocoder model\")\n\tflag.StringVar(&config.Model.Matcha.Lexicon, \"matcha-lexicon\", \"\", \"Path to lexicon.txt\")\n\tflag.StringVar(&config.Model.Matcha.Tokens, \"matcha-tokens\", \"\", \"Path to tokens.txt\")\n\tflag.StringVar(&config.Model.Matcha.DataDir, \"matcha-data-dir\", \"\", \"Path to espeak-ng-data\")\n\n\tflag.Float32Var(&config.Model.Matcha.NoiseScale, \"matcha-noise-scale\", 0.667, \"noise_scale for Matcha\")\n\tflag.Float32Var(&config.Model.Matcha.LengthScale, \"matcha-length-scale\", 1.0, \"length_scale for Matcha. small -> faster in speech speed; large -> slower\")\n\n\tflag.StringVar(&config.Model.Kokoro.Model, \"kokoro-model\", \"\", \"Path to the Kokoro ONNX model\")\n\tflag.StringVar(&config.Model.Kokoro.Voices, \"kokoro-voices\", \"\", \"Path to voices.bin for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.Tokens, \"kokoro-tokens\", \"\", \"Path to tokens.txt for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.DataDir, \"kokoro-data-dir\", \"\", \"Path to espeak-ng-data for Kokoro\")\n\tflag.StringVar(&config.Model.Kokoro.Lexicon, \"kokoro-lexicon\", \"\", \"Path to lexicon files for Kokoro\")\n\tflag.Float32Var(&config.Model.Kokoro.LengthScale, \"kokoro-length-scale\", 1.0, \"length_scale for Kokoro. 
small -> faster in speech speed; large -> slower\")\n\n\tflag.StringVar(&config.Model.Kitten.Model, \"kitten-model\", \"\", \"Path to the kitten ONNX model\")\n\tflag.StringVar(&config.Model.Kitten.Voices, \"kitten-voices\", \"\", \"Path to voices.bin for kitten\")\n\tflag.StringVar(&config.Model.Kitten.Tokens, \"kitten-tokens\", \"\", \"Path to tokens.txt for kitten\")\n\tflag.StringVar(&config.Model.Kitten.DataDir, \"kitten-data-dir\", \"\", \"Path to espeak-ng-data for kitten\")\n\tflag.Float32Var(&config.Model.Kitten.LengthScale, \"kitten-length-scale\", 1.0, \"length_scale for kitten. small -> faster in speech speed; large -> slower\")\n\n\tflag.IntVar(&config.Model.NumThreads, \"num-threads\", 1, \"Number of threads for computing\")\n\tflag.IntVar(&config.Model.Debug, \"debug\", 0, \"Whether to show debug message\")\n\tflag.StringVar(&config.Model.Provider, \"provider\", \"cpu\", \"Provider to use\")\n\tflag.StringVar(&config.RuleFsts, \"tts-rule-fsts\", \"\", \"Path to rule.fst\")\n\tflag.StringVar(&config.RuleFars, \"tts-rule-fars\", \"\", \"Path to rule.far\")\n\tflag.IntVar(&config.MaxNumSentences, \"tts-max-num-sentences\", 1, \"Batch size\")\n\n\tflag.IntVar(&sid, \"sid\", sid, \"Speaker ID. 
Used only for multi-speaker models\")\n\tflag.StringVar(&filename, \"output-filename\", filename, \"Output wav filename\")\n\n\tflag.Parse()\n\n\tif len(flag.Args()) != 1 {\n\t\tlog.Fatalf(\"Please provide the text to generate audio\")\n\t}\n\n\ttext := flag.Arg(0)\n\n\tlog.Println(\"Input text:\", text)\n\tlog.Println(\"Speaker ID:\", sid)\n\n\tlog.Println(\"Initializing model (may take several seconds)\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tdefer sherpa.DeleteOfflineTts(tts)\n\tlog.Println(\"Model created!\")\n\n\tctx, ready, err := oto.NewContext(&oto.NewContextOptions{\n\t\tSampleRate:   tts.SampleRate(),\n\t\tChannelCount: 1,\n\t\tFormat:       oto.FormatSignedInt16LE,\n\t})\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\t<-ready\n\n\tpcmBuf := newPCMBuffer()\n\n\treader := &pcmReader{\n\t\tbuf:  pcmBuf,\n\t\tdone: make(chan struct{}),\n\t}\n\n\tplayer := ctx.NewPlayer(reader)\n\tplayer.Play()\n\tdefer player.Close()\n\n\tstop := make(chan os.Signal, 1)\n\tsignal.Notify(stop, syscall.SIGINT, syscall.SIGTERM)\n\n\tvar generated *sherpa.GeneratedAudio\n\n\tstart := time.Now()\n\tcfg := sherpa.GenerationConfig{\n\t\tSilenceScale: 0.2,\n\t\tSpeed:        1.0,\n\t\tSid:          sid,\n\t}\n\n\tgo func() {\n\t\tdefer pcmBuf.Finish()\n\n\t\tgenerated = tts.GenerateWithConfig(\n\t\t\ttext,\n\t\t\t&cfg,\n\t\t\tfunc(samples []float32, progress float32) bool {\n\t\t\t\tlog.Printf(\"Progress: %.1f%%\", progress*100)\n\n\t\t\t\tbuf := make([]byte, len(samples)*2)\n\t\t\t\tfor i, s := range samples {\n\t\t\t\t\tif s > 1 {\n\t\t\t\t\t\ts = 1\n\t\t\t\t\t} else if s < -1 {\n\t\t\t\t\t\ts = -1\n\t\t\t\t\t}\n\t\t\t\t\tv := int16(math.Round(float64(s * 32767)))\n\t\t\t\t\tbinary.LittleEndian.PutUint16(buf[i*2:], uint16(v))\n\t\t\t\t}\n\n\t\t\t\tpcmBuf.Push(buf)\n\t\t\t\treturn true\n\t\t\t},\n\t\t)\n\n\t\tlog.Println(\"TTS generation finished in\", time.Since(start))\n\t}()\n\n\tselect {\n\tcase <-stop:\n\t\tlog.Println(\"Interrupted\")\n\tcase 
<-reader.done:\n\t\tlog.Println(\"Playback finished\")\n\t}\n\n\tif generated != nil {\n\t\tif ok := generated.Save(filename); !ok {\n\t\t\tlog.Println(\"Failed to save audio\")\n\t\t} else {\n\t\t\tlog.Println(\"Saved generated audio to\", filename)\n\t\t}\n\t}\n\n\t// let remaining audio drain\n\ttime.Sleep(800 * time.Millisecond)\n\n\tlog.Println(\"Done\")\n}\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --kitten-model=./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --kitten-voices=./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --kitten-tokens=./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --kitten-data-dir=./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --debug=1 \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n  --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n  --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n  --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n  --debug=1 \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --kokoro-model=./kokoro-multi-lang-v1_0/model.onnx \\\n  --kokoro-voices=./kokoro-multi-lang-v1_0/voices.bin \\\n  --kokoro-tokens=./kokoro-multi-lang-v1_0/tokens.txt \\\n  --kokoro-data-dir=./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --kokoro-lexicon=./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --debug=1 \\\n  \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？\"\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --debug=1 \\\n  \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\n\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n  --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n  --debug=0 \\\n  --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n  \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-vits-ljs.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-ljs ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-ljs.tar.bz2\n  tar xvf vits-ljs.tar.bz2\n  rm vits-ljs.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --vits-model=./vits-ljs/vits-ljs.onnx \\\n  --vits-lexicon=./vits-ljs/lexicon.txt \\\n  --vits-tokens=./vits-ljs/tokens.txt \\\n  --sid=0 \\\n  --debug=1 \\\n  \"Liliana, the most beautiful and lovely assistant of our team!\"\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-vits-piper-en_US-lessac-medium.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-piper-en_US-lessac-medium ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-lessac-medium.tar.bz2\n  tar xf vits-piper-en_US-lessac-medium.tar.bz2\n  rm vits-piper-en_US-lessac-medium.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./offline-tts-play \\\n  --vits-model=./vits-piper-en_US-lessac-medium/en_US-lessac-medium.onnx \\\n  --vits-data-dir=./vits-piper-en_US-lessac-medium/espeak-ng-data \\\n  --vits-tokens=./vits-piper-en_US-lessac-medium/tokens.txt \\\n  'liliana, the most beautiful and lovely assistant of our team!'\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-vits-vctk.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-vctk ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-vctk.tar.bz2\n  tar xvf vits-vctk.tar.bz2\n  rm vits-vctk.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\nfor sid in 0 10 108; do\n./offline-tts-play \\\n  --vits-model=./vits-vctk/vits-vctk.onnx \\\n  --vits-lexicon=./vits-vctk/lexicon.txt \\\n  --vits-tokens=./vits-vctk/tokens.txt \\\n  --sid=0 \\\n  --debug=1 \\\n  'Ask not what your country can do for you; ask what you can do for your country.'\ndone\n"
  },
  {
    "path": "go-api-examples/offline-tts-play/run-vits-zh-aishell3.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -d vits-icefall-zh-aishell3 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n  tar xvf vits-icefall-zh-aishell3.tar.bz2\n  rm vits-icefall-zh-aishell3.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\nfor sid in 10 33 99; do\n./offline-tts-play \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --sid=$sid \\\n  --debug=1 \\\n  \"林美丽最美丽、最漂亮、最可爱！\"\n\n./offline-tts-play \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\n  --sid=$sid \\\n  --debug=1 \\\n  \"数字12345.6789怎么念\"\n\n./offline-tts-play \\\n  --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n  --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n  --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n  --tts-rule-fsts=./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst \\\n  --tts-rule-fars=./vits-icefall-zh-aishell3/rule.far \\\n  --sid=$sid \\\n  --debug=1 \\\n  \"万古长存长沙长大长白山长孙长安街\"\ndone\n"
  },
  {
    "path": "go-api-examples/speaker-identification/go.mod",
    "content": "module speaker-identification\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/speaker-identification/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc createSpeakerEmbeddingExtractor() *sherpa.SpeakerEmbeddingExtractor {\n\tconfig := sherpa.SpeakerEmbeddingExtractorConfig{}\n\n\t// Please download the model from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n\t//\n\t// You can find more models at\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n\n\tconfig.Model = \"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\"\n\tconfig.NumThreads = 1\n\tconfig.Debug = 1\n\tconfig.Provider = \"cpu\"\n\n\tex := sherpa.NewSpeakerEmbeddingExtractor(&config)\n\treturn ex\n}\n\nfunc computeEmbeddings(ex *sherpa.SpeakerEmbeddingExtractor, files []string) [][]float32 {\n\tembeddings := make([][]float32, len(files))\n\n\tfor i, f := range files {\n\t\twave := sherpa.ReadWave(f)\n\n\t\tstream := ex.CreateStream()\n\t\tdefer sherpa.DeleteOnlineStream(stream)\n\t\tstream.AcceptWaveform(wave.SampleRate, wave.Samples)\n\t\tstream.InputFinished()\n\t\tembeddings[i] = ex.Compute(stream)\n\t}\n\n\treturn embeddings\n\n}\n\nfunc registerSpeakers(ex *sherpa.SpeakerEmbeddingExtractor, manager *sherpa.SpeakerEmbeddingManager) {\n\t// Please download the test data from\n\t// https://github.com/csukuangfj/sr-data\n\tspk1_files := []string{\n\t\t\"./sr-data/enroll/fangjun-sr-1.wav\",\n\t\t\"./sr-data/enroll/fangjun-sr-2.wav\",\n\t\t\"./sr-data/enroll/fangjun-sr-3.wav\",\n\t}\n\n\tspk2_files := []string{\n\t\t\"./sr-data/enroll/leijun-sr-1.wav\",\n\t\t\"./sr-data/enroll/leijun-sr-2.wav\",\n\t}\n\n\tspk1_embeddings := computeEmbeddings(ex, spk1_files)\n\tspk2_embeddings := computeEmbeddings(ex, spk2_files)\n\n\tok := manager.RegisterV(\"fangjun\", spk1_embeddings)\n\tif !ok {\n\t\tpanic(\"Failed to register fangjun\")\n\t}\n\n\tok = manager.RegisterV(\"leijun\", spk2_embeddings)\n\tif !ok 
{\n\t\tpanic(\"Failed to register leijun\")\n\t}\n\n\tif !manager.Contains(\"fangjun\") {\n\t\tpanic(\"Failed to find fangjun\")\n\t}\n\n\tif !manager.Contains(\"leijun\") {\n\t\tpanic(\"Failed to find leijun\")\n\t}\n\n\tif manager.NumSpeakers() != 2 {\n\t\tpanic(\"There should be only 2 speakers\")\n\t}\n\n\tall_speakers := manager.AllSpeakers()\n\tlog.Printf(\"All speakers: %v\\n\", all_speakers)\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tex := createSpeakerEmbeddingExtractor()\n\tdefer sherpa.DeleteSpeakerEmbeddingExtractor(ex)\n\n\tmanager := sherpa.NewSpeakerEmbeddingManager(ex.Dim())\n\tdefer sherpa.DeleteSpeakerEmbeddingManager(manager)\n\tregisterSpeakers(ex, manager)\n\n\t// Please download the test data from\n\t// https://github.com/csukuangfj/sr-data\n\ttest1 := \"./sr-data/test/fangjun-test-sr-1.wav\"\n\tembeddings := computeEmbeddings(ex, []string{test1})[0]\n\tthreshold := float32(0.6)\n\tname := manager.Search(embeddings, threshold)\n\tif len(name) > 0 {\n\t\tlog.Printf(\"%v matches %v\", test1, name)\n\t} else {\n\t\tlog.Printf(\"No matches found for %v\", test1)\n\t}\n\n\ttest2 := \"./sr-data/test/leijun-test-sr-1.wav\"\n\tembeddings = computeEmbeddings(ex, []string{test2})[0]\n\tname = manager.Search(embeddings, threshold)\n\tif len(name) > 0 {\n\t\tlog.Printf(\"%v matches %v\", test2, name)\n\t} else {\n\t\tlog.Printf(\"No matches found for %v\", test2)\n\t}\n\n\ttest3 := \"./sr-data/test/liudehua-test-sr-1.wav\"\n\tembeddings = computeEmbeddings(ex, []string{test3})[0]\n\tname = manager.Search(embeddings, threshold)\n\tif len(name) > 0 {\n\t\tlog.Printf(\"%v matches %v\", test3, name)\n\t} else {\n\t\tlog.Printf(\"No matches found for %v\", test3)\n\t}\n\n\tif !manager.Remove(\"fangjun\") {\n\t\tpanic(\"Failed to deregister fangjun\")\n\t} else {\n\t\tlog.Print(\"fangjun deregistered\\n\")\n\t}\n\n\ttest1 = \"./sr-data/test/fangjun-test-sr-1.wav\"\n\tembeddings = computeEmbeddings(ex, []string{test1})[0]\n\tname 
= manager.Search(embeddings, threshold)\n\tif len(name) > 0 {\n\t\tlog.Printf(\"%v matches %v\", test1, name)\n\t} else {\n\t\tlog.Printf(\"No matches found for %v\", test1)\n\t}\n}\n\nfunc chk(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/speaker-identification/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\nfi\n\nif [ ! -f ./sr-data/enroll/fangjun-sr-1.wav ]; then\n  git clone https://github.com/csukuangfj/sr-data\nfi\n\ngo mod tidy\ngo build\n./speaker-identification\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-dpdfnet/go.mod",
    "content": "module speech-enhancement-dpdfnet\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-dpdfnet/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineSpeechDenoiserConfig{}\n\tconfig.Model.DpdfNet.Model = \"./dpdfnet_baseline.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Debug = 1\n\n\tsd := sherpa.NewOfflineSpeechDenoiser(&config)\n\tdefer sherpa.DeleteOfflineSpeechDenoiser(sd)\n\n\twaveFilename := \"./inp_16k.wav\"\n\twave := sherpa.ReadWave(waveFilename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", waveFilename)\n\t\treturn\n\t}\n\n\taudio := sd.Run(wave.Samples, wave.SampleRate)\n\tfilename := \"./enhanced-dpdfnet-16k.wav\"\n\tif !audio.Save(filename) {\n\t\tlog.Fatalf(\"Failed to write %v\\n\", filename)\n\t}\n\n\tlog.Printf(\"Saved to %v\\n\", filename)\n}\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ngo mod tidy\ngo build\n\n./speech-enhancement-dpdfnet\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-gtcrn/go.mod",
    "content": "module speech-enhancement-gtcrn\n\ngo 1.17\n\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-gtcrn/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OfflineSpeechDenoiserConfig{}\n\n\t// Please download the models from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\n\tconfig.Model.Gtcrn.Model = \"./gtcrn_simple.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Debug = 1\n\n\tsd := sherpa.NewOfflineSpeechDenoiser(&config)\n\tdefer sherpa.DeleteOfflineSpeechDenoiser(sd)\n\n\twave_filename := \"./inp_16k.wav\"\n\n\twave := sherpa.ReadWave(wave_filename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", wave_filename)\n\t\treturn\n\t}\n\n\tlog.Println(\"Started\")\n\taudio := sd.Run(wave.Samples, wave.SampleRate)\n\tlog.Println(\"Done!\")\n\n\tfilename := \"./enhanced-16k.wav\"\n\tok := audio.Save(filename)\n\tif !ok {\n\t\tlog.Fatalf(\"Failed to write\", filename)\n\t} else {\n\t\tlog.Println(\"Saved to \", filename)\n\t}\n\n}\n"
  },
  {
    "path": "go-api-examples/speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ngo mod tidy\ngo build\n\n./speech-enhancement-gtcrn\n"
  },
  {
    "path": "go-api-examples/streaming-hlg-decoding/go.mod",
    "content": "module streaming-hlg-decoding\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/streaming-hlg-decoding/main.go",
    "content": "package main\n\nimport (\n\t\"bytes\"\n\t\"encoding/binary\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"github.com/youpy/go-wav\"\n\t\"log\"\n\t\"os\"\n\t\"strings\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OnlineRecognizerConfig{}\n\tconfig.FeatConfig = sherpa.FeatureConfig{SampleRate: 16000, FeatureDim: 80}\n\n\t// please download model files from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\tconfig.ModelConfig.Zipformer2Ctc.Model = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\"\n\tconfig.ModelConfig.Tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\"\n\n\tconfig.ModelConfig.NumThreads = 1\n\tconfig.ModelConfig.Debug = 0\n\tconfig.ModelConfig.Provider = \"cpu\"\n\tconfig.CtcFstDecoderConfig.Graph = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\"\n\n\twav_filename := \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\"\n\n\tsamples, sampleRate := readWave(wav_filename)\n\n\tlog.Println(\"Initializing recognizer (may take several seconds)\")\n\trecognizer := sherpa.NewOnlineRecognizer(&config)\n\tlog.Println(\"Recognizer created!\")\n\tdefer sherpa.DeleteOnlineRecognizer(recognizer)\n\n\tlog.Println(\"Start decoding!\")\n\tstream := sherpa.NewOnlineStream(recognizer)\n\tdefer sherpa.DeleteOnlineStream(stream)\n\n\tstream.AcceptWaveform(sampleRate, samples)\n\n\ttailPadding := make([]float32, int(float32(sampleRate)*0.3))\n\tstream.AcceptWaveform(sampleRate, tailPadding)\n\n\tfor recognizer.IsReady(stream) {\n\t\trecognizer.Decode(stream)\n\t}\n\tlog.Println(\"Decoding done!\")\n\tresult := recognizer.GetResult(stream)\n\tlog.Println(strings.ToLower(result.Text))\n\tlog.Printf(\"Wave duration: %v seconds\", float32(len(samples))/float32(sampleRate))\n}\n\nfunc 
readWave(filename string) (samples []float32, sampleRate int) {\n\tfile, err := os.Open(filename)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to open %v: %v\\n\", filename, err)\n\t}\n\tdefer file.Close()\n\n\treader := wav.NewReader(file)\n\tformat, err := reader.Format()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to read wave format: %v\\n\", err)\n\t}\n\n\tif format.AudioFormat != 1 {\n\t\tlog.Fatalf(\"Support only PCM format. Given: %v\\n\", format.AudioFormat)\n\t}\n\n\tif format.NumChannels != 1 {\n\t\tlog.Fatalf(\"Support only 1 channel wave file. Given: %v\\n\", format.NumChannels)\n\t}\n\n\tif format.BitsPerSample != 16 {\n\t\tlog.Fatalf(\"Support only 16-bit per sample. Given: %v\\n\", format.BitsPerSample)\n\t}\n\n\treader.Duration() // so that it initializes reader.Size\n\n\tbuf := make([]byte, reader.Size)\n\tn, err := reader.Read(buf)\n\tif n != int(reader.Size) {\n\t\tlog.Fatalf(\"Failed to read %v bytes. Returned %v bytes (err: %v)\\n\", reader.Size, n, err)\n\t}\n\n\tsamples = samplesInt16ToFloat(buf)\n\tsampleRate = int(format.SampleRate)\n\n\treturn\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\ts := inSamples[i*2 : (i+1)*2]\n\n\t\tvar s16 int16\n\t\tbuf := bytes.NewReader(s)\n\t\terr := binary.Read(buf, binary.LittleEndian, &s16)\n\t\tif err != nil {\n\t\t\tlog.Fatal(\"Failed to parse 16-bit sample\")\n\t\t}\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n"
  },
  {
    "path": "go-api-examples/streaming-hlg-decoding/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\ngo mod tidy\ngo build\nls -lh\n./streaming-hlg-decoding\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-dpdfnet/go.mod",
    "content": "module streaming-speech-enhancement-dpdfnet\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-dpdfnet/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc appendSamples(dst []float32, src []float32) []float32 {\n\treturn append(dst, src...)\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OnlineSpeechDenoiserConfig{}\n\tconfig.Model.DpdfNet.Model = \"./dpdfnet_baseline.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Debug = 1\n\n\tsd := sherpa.NewOnlineSpeechDenoiser(&config)\n\tdefer sherpa.DeleteOnlineSpeechDenoiser(sd)\n\n\twaveFilename := \"./inp_16k.wav\"\n\twave := sherpa.ReadWave(waveFilename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", waveFilename)\n\t\treturn\n\t}\n\n\toutput := make([]float32, 0, len(wave.Samples))\n\tframeShift := sd.FrameShiftInSamples()\n\tfor start := 0; start < len(wave.Samples); start += frameShift {\n\t\tend := start + frameShift\n\t\tif end > len(wave.Samples) {\n\t\t\tend = len(wave.Samples)\n\t\t}\n\t\taudio := sd.Run(wave.Samples[start:end], wave.SampleRate)\n\t\toutput = appendSamples(output, audio.Samples)\n\t}\n\n\toutput = appendSamples(output, sd.Flush().Samples)\n\tfilename := \"./enhanced-online-dpdfnet.wav\"\n\tif !(&sherpa.DenoisedAudio{Samples: output, SampleRate: sd.SampleRate()}).Save(filename) {\n\t\tlog.Fatalf(\"Failed to write %v\\n\", filename)\n\t}\n\n\tlog.Printf(\"Saved to %v\\n\", filename)\n}\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-dpdfnet/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ngo mod tidy\ngo build\n\n./streaming-speech-enhancement-dpdfnet\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-gtcrn/go.mod",
    "content": "module streaming-speech-enhancement-gtcrn\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-gtcrn/main.go",
    "content": "package main\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc appendSamples(dst []float32, src []float32) []float32 {\n\treturn append(dst, src...)\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.OnlineSpeechDenoiserConfig{}\n\tconfig.Model.Gtcrn.Model = \"./gtcrn_simple.onnx\"\n\tconfig.Model.NumThreads = 1\n\tconfig.Model.Debug = 1\n\n\tsd := sherpa.NewOnlineSpeechDenoiser(&config)\n\tdefer sherpa.DeleteOnlineSpeechDenoiser(sd)\n\n\twaveFilename := \"./inp_16k.wav\"\n\twave := sherpa.ReadWave(waveFilename)\n\tif wave == nil {\n\t\tlog.Printf(\"Failed to read %v\\n\", waveFilename)\n\t\treturn\n\t}\n\n\toutput := make([]float32, 0, len(wave.Samples))\n\tframeShift := sd.FrameShiftInSamples()\n\tfor start := 0; start < len(wave.Samples); start += frameShift {\n\t\tend := start + frameShift\n\t\tif end > len(wave.Samples) {\n\t\t\tend = len(wave.Samples)\n\t\t}\n\t\taudio := sd.Run(wave.Samples[start:end], wave.SampleRate)\n\t\toutput = appendSamples(output, audio.Samples)\n\t}\n\n\toutput = appendSamples(output, sd.Flush().Samples)\n\tfilename := \"./enhanced-online-gtcrn.wav\"\n\tif !(&sherpa.DenoisedAudio{Samples: output, SampleRate: sd.SampleRate()}).Save(filename) {\n\t\tlog.Fatalf(\"Failed to write %v\\n\", filename)\n\t}\n\n\tlog.Printf(\"Saved to %v\\n\", filename)\n}\n"
  },
  {
    "path": "go-api-examples/streaming-speech-enhancement-gtcrn/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ngo mod tidy\ngo build\n\n./streaming-speech-enhancement-gtcrn\n"
  },
  {
    "path": "go-api-examples/supertonic-tts/go.mod",
    "content": "module supertonic-tts\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/supertonic-tts/main.go",
    "content": "package main\n\nimport (\n\t\"encoding/json\"\n\t\"log\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\t// ---------------- config ----------------\n\tvar config sherpa.OfflineTtsConfig\n\n\tconfig.Model.Supertonic.DurationPredictor =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx\"\n\tconfig.Model.Supertonic.TextEncoder =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\"\n\tconfig.Model.Supertonic.VectorEstimator =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\"\n\tconfig.Model.Supertonic.Vocoder =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\"\n\tconfig.Model.Supertonic.TtsJson =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\"\n\tconfig.Model.Supertonic.UnicodeIndexer =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\"\n\tconfig.Model.Supertonic.VoiceStyle =\n\t\t\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\"\n\n\tconfig.Model.NumThreads = 2\n\tconfig.Model.Debug = 1\n\n\tlog.Println(\"Creating Offline TTS\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tif tts == nil {\n\t\tlog.Fatal(\"Failed to create OfflineTts\")\n\t}\n\tdefer sherpa.DeleteOfflineTts(tts)\n\n\ttext := \"Today as always, men fall into two groups: slaves and free men. 
Whoever \" +\n\t\t\"does not have two-thirds of his day for himself, is a slave, whatever \" +\n\t\t\"he may be: a statesman, a businessman, an official, or a scholar.\"\n\n\tvar cfg sherpa.GenerationConfig\n\tcfg.Sid = 6\n\tcfg.NumSteps = 5\n\tcfg.Speed = 1.25 // larger -> faster\n\n\textraMap := map[string]interface{}{\n\t\t\"lang\": \"en\",\n\t}\n\textraBytes, _ := json.Marshal(extraMap)\n\tcfg.Extra = json.RawMessage(extraBytes)\n\n\tlog.Println(\"Start generating\")\n\n\taudio := tts.GenerateWithConfig(\n\t\ttext,\n\t\t&cfg,\n\t\tfunc(samples []float32, progress float32) bool {\n\t\t\tlog.Printf(\"Progress: %.3f%%, Number of samples: %d\", progress*100, len(samples))\n\t\t\treturn true\n\t\t},\n\t)\n\n\tif audio == nil {\n\t\tlog.Fatal(\"Generation failed\")\n\t}\n\n\toutputFilename := \"./generated-supertonic-en.wav\"\n\tif !audio.Save(outputFilename) {\n\t\tlog.Fatal(\"Failed to save wav\")\n\t}\n\n\tlog.Println(\"Saved to:\", outputFilename)\n}\n"
  },
  {
    "path": "go-api-examples/supertonic-tts/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./supertonic-tts\n"
  },
  {
    "path": "go-api-examples/vad/go.mod",
    "content": "module vad\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/vad/main.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\t\"github.com/gen2brain/malgo\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n\t\"os\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tconfig := sherpa.VadModelConfig{}\n\n\t// Please download silero_vad.onnx from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\t// or ten-vad.onnx from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n\n\tif FileExists(\"./silero_vad.onnx\") {\n\t\tfmt.Println(\"Use silero-vad\")\n\t\tconfig.SileroVad.Model = \"./silero_vad.onnx\"\n\t\tconfig.SileroVad.Threshold = 0.5\n\t\tconfig.SileroVad.MinSilenceDuration = 0.5\n\t\tconfig.SileroVad.MinSpeechDuration = 0.25\n\t\tconfig.SileroVad.MaxSpeechDuration = 10\n\t\tconfig.SileroVad.WindowSize = 512\n\t} else if FileExists(\"./ten-vad.onnx\") {\n\t\tfmt.Println(\"Use ten-vad\")\n\t\tconfig.TenVad.Model = \"./ten-vad.onnx\"\n\t\tconfig.TenVad.Threshold = 0.5\n\t\tconfig.TenVad.MinSilenceDuration = 0.5\n\t\tconfig.TenVad.MinSpeechDuration = 0.25\n\t\tconfig.TenVad.MaxSpeechDuration = 10\n\t\tconfig.TenVad.WindowSize = 256\n\t} else {\n\t\tfmt.Println(\"Please download either ./silero_vad.onnx or ./ten-vad.onnx\")\n\t\treturn\n\t}\n\n\tconfig.SampleRate = 16000\n\tconfig.NumThreads = 1\n\tconfig.Provider = \"cpu\"\n\tconfig.Debug = 1\n\n\twindowSize := config.SileroVad.WindowSize\n\tif config.TenVad.Model != \"\" {\n\t\twindowSize = config.TenVad.WindowSize\n\t}\n\n\tvar bufferSizeInSeconds float32 = 5\n\n\tvad := sherpa.NewVoiceActivityDetector(&config, bufferSizeInSeconds)\n\tdefer sherpa.DeleteVoiceActivityDetector(vad)\n\n\tbuffer := sherpa.NewCircularBuffer(10 * config.SampleRate)\n\tdefer sherpa.DeleteCircularBuffer(buffer)\n\n\tctx, err := malgo.InitContext(nil, malgo.ContextConfig{}, func(message string) {\n\t\tfmt.Printf(\"LOG <%v>\", message)\n\t})\n\tchk(err)\n\n\tdefer func() {\n\t\t_ = 
ctx.Uninit()\n\t\tctx.Free()\n\t}()\n\n\tdeviceConfig := malgo.DefaultDeviceConfig(malgo.Duplex)\n\tdeviceConfig.Capture.Format = malgo.FormatS16\n\tdeviceConfig.Capture.Channels = 1\n\tdeviceConfig.Playback.Format = malgo.FormatS16\n\tdeviceConfig.Playback.Channels = 1\n\tdeviceConfig.SampleRate = 16000\n\tdeviceConfig.Alsa.NoMMap = 1\n\n\tprinted := false\n\tk := 0\n\n\tonRecvFrames := func(_, pSample []byte, framecount uint32) {\n\t\tsamples := samplesInt16ToFloat(pSample)\n\t\tbuffer.Push(samples)\n\t\tfor buffer.Size() >= windowSize {\n\t\t\thead := buffer.Head()\n\t\t\ts := buffer.Get(head, windowSize)\n\t\t\tbuffer.Pop(windowSize)\n\n\t\t\tvad.AcceptWaveform(s)\n\n\t\t\tif vad.IsSpeech() && !printed {\n\t\t\t\tprinted = true\n\t\t\t\tlog.Print(\"Detected speech\\n\")\n\t\t\t}\n\n\t\t\tif !vad.IsSpeech() {\n\t\t\t\tprinted = false\n\t\t\t}\n\n\t\t\tfor !vad.IsEmpty() {\n\t\t\t\tspeechSegment := vad.Front()\n\t\t\t\tvad.Pop()\n\n\t\t\t\tduration := float32(len(speechSegment.Samples)) / float32(config.SampleRate)\n\n\t\t\t\taudio := sherpa.GeneratedAudio{}\n\t\t\t\taudio.Samples = speechSegment.Samples\n\t\t\t\taudio.SampleRate = config.SampleRate\n\n\t\t\t\tfilename := fmt.Sprintf(\"seg-%d-%.2f-seconds.wav\", k, duration)\n\t\t\t\tok := audio.Save(filename)\n\t\t\t\tif ok {\n\t\t\t\t\tlog.Printf(\"Saved to %s\", filename)\n\t\t\t\t}\n\n\t\t\t\tk += 1\n\n\t\t\t\tlog.Printf(\"Duration: %.2f seconds\\n\", duration)\n\t\t\t\tlog.Print(\"----------\\n\")\n\t\t\t}\n\t\t}\n\t}\n\n\tcaptureCallbacks := malgo.DeviceCallbacks{\n\t\tData: onRecvFrames,\n\t}\n\n\tdevice, err := malgo.InitDevice(ctx.Context, deviceConfig, captureCallbacks)\n\tchk(err)\n\n\terr = device.Start()\n\tchk(err)\n\n\tfmt.Println(\"Started. Please speak. 
Press Ctrl+C to exit\")\n\tfmt.Scanln()\n\tdevice.Uninit()\n\n}\n\nfunc chk(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n\nfunc samplesInt16ToFloat(inSamples []byte) []float32 {\n\tnumSamples := len(inSamples) / 2\n\toutSamples := make([]float32, numSamples)\n\n\tfor i := 0; i != numSamples; i++ {\n\t\t// Decode two little-endian bytes into an int16 using bit manipulation\n\t\ts16 := int16(inSamples[2*i]) | int16(inSamples[2*i+1])<<8\n\t\toutSamples[i] = float32(s16) / 32768\n\t}\n\n\treturn outSamples\n}\n\n// FileExists reports whether os.Stat succeeds for path.\nfunc FileExists(path string) bool {\n\t_, err := os.Stat(path)\n\treturn err == nil\n}\n"
  },
  {
    "path": "go-api-examples/vad/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./ten-vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\ngo mod tidy\ngo build\n./vad\n"
  },
  {
    "path": "go-api-examples/vad-asr-whisper/go.mod",
    "content": "module vad-asr-whisper\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/vad-asr-whisper/main.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\tportaudio \"github.com/csukuangfj/portaudio-go\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n\t\"strings\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\t// 1. Create VAD\n\tconfig := sherpa.VadModelConfig{}\n\n\t// Please download silero_vad.onnx from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n\tconfig.SileroVad.Model = \"./silero_vad.onnx\"\n\tconfig.SileroVad.Threshold = 0.5\n\tconfig.SileroVad.MinSilenceDuration = 0.5\n\tconfig.SileroVad.MinSpeechDuration = 0.25\n\tconfig.SileroVad.WindowSize = 512\n\tconfig.SileroVad.MaxSpeechDuration = 5.0\n\tconfig.SampleRate = 16000\n\tconfig.NumThreads = 1\n\tconfig.Provider = \"cpu\"\n\tconfig.Debug = 1\n\n\tvar bufferSizeInSeconds float32 = 20\n\n\tvad := sherpa.NewVoiceActivityDetector(&config, bufferSizeInSeconds)\n\tdefer sherpa.DeleteVoiceActivityDetector(vad)\n\n\t// 2. Create ASR recognizer\n\n\tc := sherpa.OfflineRecognizerConfig{}\n\tc.FeatConfig.SampleRate = 16000\n\tc.FeatConfig.FeatureDim = 80\n\tc.ModelConfig.Whisper.Encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\"\n\tc.ModelConfig.Whisper.Decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\"\n\tc.ModelConfig.Tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\"\n\tc.ModelConfig.NumThreads = 2\n\tc.ModelConfig.Debug = 1\n\tc.ModelConfig.Provider = \"cpu\"\n\n\trecognizer := sherpa.NewOfflineRecognizer(&c)\n\tdefer sherpa.DeleteOfflineRecognizer(recognizer)\n\n\terr := portaudio.Initialize()\n\tif err != nil {\n\t\tlog.Fatalf(\"Unable to initialize portaudio: %v\\n\", err)\n\t}\n\tdefer portaudio.Terminate()\n\n\tdefault_device, err := portaudio.DefaultInputDevice()\n\tif err != nil {\n\t\tlog.Fatal(\"Failed to get default input device: %v\\n\", err)\n\t}\n\tlog.Printf(\"Selected default input device: %s\\n\", default_device.Name)\n\tparam := 
portaudio.StreamParameters{}\n\tparam.Input.Device = default_device\n\tparam.Input.Channels = 1\n\tparam.Input.Latency = default_device.DefaultHighInputLatency\n\n\tparam.SampleRate = float64(config.SampleRate)\n\tparam.FramesPerBuffer = 0\n\tparam.Flags = portaudio.ClipOff\n\n\t// you can choose another value for 0.1 if you want\n\tsamplesPerCall := int32(param.SampleRate * 0.1) // 0.1 second\n\tsamples := make([]float32, samplesPerCall)\n\n\ts, err := portaudio.OpenStream(param, samples)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to open the stream\")\n\t}\n\n\tdefer s.Close()\n\tchk(s.Start())\n\n\tlog.Print(\"Started! Please speak\")\n\tprinted := false\n\n\tk := 0\n\tfor {\n\t\tchk(s.Read())\n\t\tvad.AcceptWaveform(samples)\n\n\t\tif vad.IsSpeech() && !printed {\n\t\t\tprinted = true\n\t\t\tlog.Print(\"Detected speech\\n\")\n\t\t}\n\n\t\tif !vad.IsSpeech() {\n\t\t\tprinted = false\n\t\t}\n\n\t\tfor !vad.IsEmpty() {\n\t\t\tspeechSegment := vad.Front()\n\t\t\tvad.Pop()\n\n\t\t\tduration := float32(len(speechSegment.Samples)) / float32(config.SampleRate)\n\n\t\t\taudio := &sherpa.Wave{}\n\t\t\taudio.Samples = speechSegment.Samples\n\t\t\taudio.SampleRate = config.SampleRate\n\n\t\t\t// Now decode it\n\t\t\tgo decode(recognizer, audio, k)\n\n\t\t\tk += 1\n\n\t\t\tlog.Printf(\"Duration: %.2f seconds\\n\", duration)\n\t\t}\n\t}\n\n\tchk(s.Stop())\n}\n\nfunc decode(recognizer *sherpa.OfflineRecognizer, audio *sherpa.Wave, id int) {\n\tstream := sherpa.NewOfflineStream(recognizer)\n\tdefer sherpa.DeleteOfflineStream(stream)\n\tstream.AcceptWaveform(audio.SampleRate, audio.Samples)\n\trecognizer.Decode(stream)\n\tresult := stream.GetResult()\n\ttext := strings.ToLower(result.Text)\n\ttext = strings.Trim(text, \" \")\n\tlog.Println(text)\n\n\tduration := float32(len(audio.Samples)) / float32(audio.SampleRate)\n\n\tfilename := fmt.Sprintf(\"seg-%d-%.2f-seconds-%s.wav\", id, duration, text)\n\tok := audio.Save(filename)\n\tif ok {\n\t\tlog.Printf(\"Saved to %s\", 
filename)\n\t}\n\tlog.Print(\"----------\\n\")\n}\n\nfunc chk(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/vad-asr-whisper/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./vad-asr-whisper\n"
  },
  {
    "path": "go-api-examples/vad-speaker-identification/go.mod",
    "content": "module vad-speaker-identification\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/vad-speaker-identification/main.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\tportaudio \"github.com/csukuangfj/portaudio-go\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc createSpeakerEmbeddingExtractor() *sherpa.SpeakerEmbeddingExtractor {\n\tconfig := sherpa.SpeakerEmbeddingExtractorConfig{}\n\n\t// Please download the model from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\n\t//\n\t// You can find more models at\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n\n\tconfig.Model = \"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\"\n\tconfig.NumThreads = 2\n\tconfig.Debug = 1\n\tconfig.Provider = \"cpu\"\n\n\tex := sherpa.NewSpeakerEmbeddingExtractor(&config)\n\treturn ex\n}\n\nfunc computeEmbeddings(ex *sherpa.SpeakerEmbeddingExtractor, files []string) [][]float32 {\n\tembeddings := make([][]float32, len(files))\n\n\tfor i, f := range files {\n\t\twave := sherpa.ReadWave(f)\n\n\t\tstream := ex.CreateStream()\n\t\tdefer sherpa.DeleteOnlineStream(stream)\n\t\tstream.AcceptWaveform(wave.SampleRate, wave.Samples)\n\t\tstream.InputFinished()\n\t\tembeddings[i] = ex.Compute(stream)\n\t}\n\n\treturn embeddings\n\n}\n\nfunc registerSpeakers(ex *sherpa.SpeakerEmbeddingExtractor, manager *sherpa.SpeakerEmbeddingManager) {\n\t// Please download the test data from\n\t// https://github.com/csukuangfj/sr-data\n\tspk1_files := []string{\n\t\t\"./sr-data/enroll/fangjun-sr-1.wav\",\n\t\t\"./sr-data/enroll/fangjun-sr-2.wav\",\n\t\t\"./sr-data/enroll/fangjun-sr-3.wav\",\n\t}\n\n\tspk2_files := []string{\n\t\t\"./sr-data/enroll/leijun-sr-1.wav\",\n\t\t\"./sr-data/enroll/leijun-sr-2.wav\",\n\t}\n\n\tspk1_embeddings := computeEmbeddings(ex, spk1_files)\n\tspk2_embeddings := computeEmbeddings(ex, spk2_files)\n\n\tok := manager.RegisterV(\"fangjun\", spk1_embeddings)\n\tif !ok {\n\t\tpanic(\"Failed to register fangjun\")\n\t}\n\n\tok = 
manager.RegisterV(\"leijun\", spk2_embeddings)\n\tif !ok {\n\t\tpanic(\"Failed to register leijun\")\n\t}\n\n\tif !manager.Contains(\"fangjun\") {\n\t\tpanic(\"Failed to find fangjun\")\n\t}\n\n\tif !manager.Contains(\"leijun\") {\n\t\tpanic(\"Failed to find leijun\")\n\t}\n\n\tif manager.NumSpeakers() != 2 {\n\t\tpanic(\"There should be exactly 2 speakers\")\n\t}\n\n\tall_speakers := manager.AllSpeakers()\n\tlog.Printf(\"All speakers: %v\\n\", all_speakers)\n}\n\nfunc createVad() *sherpa.VoiceActivityDetector {\n\tconfig := sherpa.VadModelConfig{}\n\n\t// Please download silero_vad.onnx from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n\tconfig.SileroVad.Model = \"./silero_vad.onnx\"\n\tconfig.SileroVad.Threshold = 0.5\n\tconfig.SileroVad.MinSilenceDuration = 0.5\n\tconfig.SileroVad.MinSpeechDuration = 0.5\n\tconfig.SileroVad.WindowSize = 512\n\tconfig.SampleRate = 16000\n\tconfig.NumThreads = 1\n\tconfig.Provider = \"cpu\"\n\tconfig.Debug = 1\n\n\tvar bufferSizeInSeconds float32 = 20\n\n\tvad := sherpa.NewVoiceActivityDetector(&config, bufferSizeInSeconds)\n\treturn vad\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tvad := createVad()\n\tdefer sherpa.DeleteVoiceActivityDetector(vad)\n\n\tex := createSpeakerEmbeddingExtractor()\n\tdefer sherpa.DeleteSpeakerEmbeddingExtractor(ex)\n\n\tmanager := sherpa.NewSpeakerEmbeddingManager(ex.Dim())\n\tdefer sherpa.DeleteSpeakerEmbeddingManager(manager)\n\tregisterSpeakers(ex, manager)\n\n\terr := portaudio.Initialize()\n\tif err != nil {\n\t\tlog.Fatalf(\"Unable to initialize portaudio: %v\\n\", err)\n\t}\n\tdefer portaudio.Terminate()\n\n\tdefault_device, err := portaudio.DefaultInputDevice()\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to get default input device: %v\\n\", err)\n\t}\n\tlog.Printf(\"Selected default input device: %s\\n\", default_device.Name)\n\tparam := portaudio.StreamParameters{}\n\tparam.Input.Device = 
default_device\n\tparam.Input.Channels = 1\n\tparam.Input.Latency = default_device.DefaultHighInputLatency\n\n\tparam.SampleRate = 16000\n\tparam.FramesPerBuffer = 0\n\tparam.Flags = portaudio.ClipOff\n\n\t// 0.1 is an arbitrary chunk length; choose another value if you want\n\tsamplesPerCall := int32(param.SampleRate * 0.1) // 0.1 second\n\tsamples := make([]float32, samplesPerCall)\n\n\ts, err := portaudio.OpenStream(param, samples)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to open the stream: %v\\n\", err)\n\t}\n\n\tdefer s.Close()\n\tchk(s.Start())\n\n\tlog.Print(\"Started! Please speak\")\n\tprinted := false\n\n\tk := 0\n\tfor {\n\t\tchk(s.Read())\n\t\tvad.AcceptWaveform(samples)\n\n\t\tif vad.IsSpeech() && !printed {\n\t\t\tprinted = true\n\t\t\tlog.Print(\"Detected speech\\n\")\n\t\t}\n\n\t\tif !vad.IsSpeech() {\n\t\t\tprinted = false\n\t\t}\n\n\t\tfor !vad.IsEmpty() {\n\t\t\tspeechSegment := vad.Front()\n\t\t\tvad.Pop()\n\n\t\t\taudio := &sherpa.Wave{}\n\t\t\taudio.Samples = speechSegment.Samples\n\t\t\taudio.SampleRate = 16000\n\n\t\t\t// Now identify the speaker\n\t\t\tgo decode(ex, manager, audio, k)\n\n\t\t\tk += 1\n\t\t}\n\t}\n\n\tchk(s.Stop())\n\n}\n\nfunc chk(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n\nfunc decode(ex *sherpa.SpeakerEmbeddingExtractor, manager *sherpa.SpeakerEmbeddingManager, audio *sherpa.GeneratedAudio, id int) {\n\tstream := ex.CreateStream()\n\tdefer sherpa.DeleteOnlineStream(stream)\n\n\tstream.AcceptWaveform(audio.SampleRate, audio.Samples)\n\tstream.InputFinished()\n\tembeddings := ex.Compute(stream)\n\tthreshold := float32(0.5)\n\tname := manager.Search(embeddings, threshold)\n\tif len(name) > 0 {\n\t\tlog.Printf(\"Found speaker: %v\\n\", name)\n\t} else {\n\t\tlog.Print(\"Unknown speaker\\n\")\n\t\tname = \"Unknown\"\n\t}\n\n\tduration := float32(len(audio.Samples)) / float32(audio.SampleRate)\n\n\tfilename := fmt.Sprintf(\"seg-%d-%.2f-seconds-%s.wav\", id, duration, name)\n\tok := audio.Save(filename)\n\tif ok {\n\t\tlog.Printf(\"Saved to %s\", 
filename)\n\t}\n\tlog.Print(\"----------\\n\")\n}\n"
  },
  {
    "path": "go-api-examples/vad-speaker-identification/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\nfi\n\nif [ ! -f ./sr-data/enroll/fangjun-sr-1.wav ]; then\n  git clone https://github.com/csukuangfj/sr-data\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\ngo mod tidy\ngo build\n./vad-speaker-identification\n"
  },
  {
    "path": "go-api-examples/vad-spoken-language-identification/go.mod",
    "content": "module vad-spoken-language-identification\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/vad-spoken-language-identification/main.go",
    "content": "package main\n\nimport (\n\t\"fmt\"\n\tiso639 \"github.com/barbashov/iso639-3\"\n\tportaudio \"github.com/csukuangfj/portaudio-go\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\t\"log\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\t// 1. Create VAD\n\tconfig := sherpa.VadModelConfig{}\n\n\t// Please download silero_vad.onnx from\n\t// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n\tconfig.SileroVad.Model = \"./silero_vad.onnx\"\n\tconfig.SileroVad.Threshold = 0.5\n\tconfig.SileroVad.MinSilenceDuration = 0.5\n\tconfig.SileroVad.MinSpeechDuration = 0.25\n\tconfig.SileroVad.WindowSize = 512\n\tconfig.SampleRate = 16000\n\tconfig.NumThreads = 1\n\tconfig.Provider = \"cpu\"\n\tconfig.Debug = 1\n\n\tvar bufferSizeInSeconds float32 = 20\n\n\tvad := sherpa.NewVoiceActivityDetector(&config, bufferSizeInSeconds)\n\tdefer sherpa.DeleteVoiceActivityDetector(vad)\n\n\t// 2. Create spoken language identifier\n\n\tc := sherpa.SpokenLanguageIdentificationConfig{}\n\tc.Whisper.Encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\"\n\tc.Whisper.Decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\"\n\tc.NumThreads = 2\n\tc.Debug = 1\n\tc.Provider = \"cpu\"\n\n\tslid := sherpa.NewSpokenLanguageIdentification(&c)\n\tdefer sherpa.DeleteSpokenLanguageIdentification(slid)\n\n\terr := portaudio.Initialize()\n\tif err != nil {\n\t\tlog.Fatalf(\"Unable to initialize portaudio: %v\\n\", err)\n\t}\n\tdefer portaudio.Terminate()\n\n\tdefault_device, err := portaudio.DefaultInputDevice()\n\tif err != nil {\n\t\tlog.Fatal(\"Failed to get default input device: %v\\n\", err)\n\t}\n\tlog.Printf(\"Selected default input device: %s\\n\", default_device.Name)\n\tparam := portaudio.StreamParameters{}\n\tparam.Input.Device = default_device\n\tparam.Input.Channels = 1\n\tparam.Input.Latency = default_device.DefaultHighInputLatency\n\n\tparam.SampleRate = 
float64(config.SampleRate)\n\tparam.FramesPerBuffer = 0\n\tparam.Flags = portaudio.ClipOff\n\n\t// 0.1 is an arbitrary chunk length; choose another value if you want\n\tsamplesPerCall := int32(param.SampleRate * 0.1) // 0.1 second\n\tsamples := make([]float32, samplesPerCall)\n\n\ts, err := portaudio.OpenStream(param, samples)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to open the stream: %v\\n\", err)\n\t}\n\n\tdefer s.Close()\n\tchk(s.Start())\n\n\tlog.Print(\"Started! Please speak\")\n\tprinted := false\n\n\tk := 0\n\tfor {\n\t\tchk(s.Read())\n\t\tvad.AcceptWaveform(samples)\n\n\t\tif vad.IsSpeech() && !printed {\n\t\t\tprinted = true\n\t\t\tlog.Print(\"Detected speech\\n\")\n\t\t}\n\n\t\tif !vad.IsSpeech() {\n\t\t\tprinted = false\n\t\t}\n\n\t\tfor !vad.IsEmpty() {\n\t\t\tspeechSegment := vad.Front()\n\t\t\tvad.Pop()\n\n\t\t\tduration := float32(len(speechSegment.Samples)) / float32(config.SampleRate)\n\n\t\t\taudio := &sherpa.Wave{}\n\t\t\taudio.Samples = speechSegment.Samples\n\t\t\taudio.SampleRate = config.SampleRate\n\n\t\t\t// Now identify the spoken language\n\t\t\tgo decode(slid, audio, k)\n\n\t\t\tk += 1\n\n\t\t\tlog.Printf(\"Duration: %.2f seconds\\n\", duration)\n\t\t}\n\t}\n\n\tchk(s.Stop())\n}\n\nfunc decode(slid *sherpa.SpokenLanguageIdentification, audio *sherpa.Wave, id int) {\n\tstream := slid.CreateStream()\n\tdefer sherpa.DeleteOfflineStream(stream)\n\n\tstream.AcceptWaveform(audio.SampleRate, audio.Samples)\n\tresult := slid.Compute(stream)\n\tlang := iso639.FromPart1Code(result.Lang).Name\n\tlog.Printf(\"Detected language: %v\", lang)\n\n\tduration := float32(len(audio.Samples)) / float32(audio.SampleRate)\n\n\tfilename := fmt.Sprintf(\"seg-%d-%.2f-seconds-%s.wav\", id, duration, lang)\n\tok := audio.Save(filename)\n\tif ok {\n\t\tlog.Printf(\"Saved to %s\", filename)\n\t}\n\tlog.Print(\"----------\\n\")\n}\n\nfunc chk(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n"
  },
  {
    "path": "go-api-examples/vad-spoken-language-identification/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\ngo mod tidy\ngo build\n./vad-spoken-language-identification\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts/go.mod",
    "content": "module zero-shot-pocket-tts\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts/main.go",
    "content": "package main\n\nimport (\n\t\"log\"\n\n\t\"encoding/json\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tvar referenceAudio string = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n\tvar outputFilename string = \"./generated.wav\"\n\tvar voiceEmbeddingCacheCapacity int = 50\n\tvar seed int = -1\n\n\ttext := `Today as always, men fall into two groups: slaves and free men.\nWhoever does not have two-thirds of his day for himself, is a slave,\nwhatever he may be: a statesman, a businessman, an official, or a scholar.`\n\n\tflag.StringVar(&referenceAudio, \"reference-audio\", referenceAudio, \"Path to the reference audio\")\n\tflag.StringVar(&text, \"text\", text, \"Text to be synthesized\")\n\tflag.StringVar(&outputFilename, \"output-filename\", outputFilename, \"File to save the generated audio\")\n\tflag.IntVar(&voiceEmbeddingCacheCapacity, \"voice-embedding-cache-capacity\", voiceEmbeddingCacheCapacity, \"Voice embedding cache capacity (default: 50)\")\n\tflag.IntVar(&seed, \"seed\", seed, \"Random seed for reproducibility (default: -1, random)\")\n\tflag.Parse()\n\n\t// ---------------- config ----------------\n\tvar config sherpa.OfflineTtsConfig\n\n\tconfig.Model.Pocket.LmFlow =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\"\n\tconfig.Model.Pocket.LmMain =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\"\n\tconfig.Model.Pocket.Encoder =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\"\n\tconfig.Model.Pocket.Decoder =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\"\n\tconfig.Model.Pocket.TextConditioner =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\"\n\tconfig.Model.Pocket.VocabJson =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\"\n\tconfig.Model.Pocket.TokenScoresJson 
=\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\"\n\tconfig.Model.Pocket.VoiceEmbeddingCacheCapacity = voiceEmbeddingCacheCapacity\n\n\tconfig.Model.NumThreads = 2\n\tconfig.Model.Debug = 0\n\tconfig.Model.Provider = \"cpu\"\n\n\tlog.Println(\"Creating Offline TTS\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tif tts == nil {\n\t\tlog.Fatal(\"Failed to create OfflineTts\")\n\t}\n\tdefer sherpa.DeleteOfflineTts(tts)\n\n\twave := sherpa.ReadWave(referenceAudio)\n\tif wave == nil {\n\t\tlog.Fatal(\"Failed to read reference wav:\", referenceAudio)\n\t}\n\n\tvar cfg sherpa.GenerationConfig\n\tcfg.ReferenceAudio = wave.Samples\n\tcfg.ReferenceSampleRate = wave.SampleRate\n\n\t// Build extra config with optional seed\n\textraMap := map[string]interface{}{\n\t\t\"max_reference_audio_len\": 10,\n\t\t\"temperature\":             0.7,\n\t}\n\tif seed >= 0 {\n\t\textraMap[\"seed\"] = seed\n\t}\n\textraBytes, err := json.Marshal(extraMap)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to marshal generation config extra: %v\", err)\n\t}\n\tcfg.Extra = json.RawMessage(extraBytes)\n\n\tlog.Println(\"Start generating\")\n\n\taudio := tts.GenerateWithConfig(\n\t\ttext,\n\t\t&cfg,\n\t\tfunc(samples []float32, progress float32) bool {\n\t\t\tlog.Printf(\"Progress: %.3f%%, Number of samples: %d\", progress*100, len(samples))\n\t\t\t// return false here if you want to cancel\n\t\t\treturn true\n\t\t},\n\t)\n\n\tif audio == nil {\n\t\tlog.Fatal(\"Generation failed\")\n\t}\n\n\tif !audio.Save(outputFilename) {\n\t\tlog.Fatal(\"Failed to save wav\")\n\t}\n\n\tlog.Println(\"Saved to:\", outputFilename)\n}\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./zero-shot-pocket-tts \\\n  --reference-audio ./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n  --output-filename ./generated-bria.wav \\\n  --text \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts-play/go.mod",
    "content": "module zero-shot-pocket-tts-play\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts-play/main.go",
    "content": "package main\n\nimport (\n\t\"encoding/binary\"\n\t\"encoding/json\"\n\t\"io\"\n\t\"log\"\n\t\"math\"\n\t\"os\"\n\t\"os/signal\"\n\t\"sync\"\n\t\"syscall\"\n\t\"time\"\n\n\toto \"github.com/ebitengine/oto/v3\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\ntype pcmBuffer struct {\n\tmu       sync.Mutex\n\tqueue    [][]byte\n\tfinished bool\n\tstarted  chan struct{} // closed on first callback\n\tonce     sync.Once\n}\n\nfunc newPCMBuffer() *pcmBuffer {\n\treturn &pcmBuffer{\n\t\tstarted: make(chan struct{}),\n\t}\n}\n\nfunc (b *pcmBuffer) Push(p []byte) {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.queue = append(b.queue, p)\n\tb.mu.Unlock()\n}\n\nfunc (b *pcmBuffer) Finish() {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.finished = true\n\tb.mu.Unlock()\n}\n\ntype pcmReader struct {\n\tbuf  *pcmBuffer\n\tdone chan struct{}\n\tonce sync.Once\n}\n\nfunc (r *pcmReader) Read(p []byte) (int, error) {\n\t<-r.buf.started\n\n\tr.buf.mu.Lock()\n\tdefer r.buf.mu.Unlock()\n\n\t// 2) Have audio\n\tif len(r.buf.queue) > 0 {\n\t\tchunk := r.buf.queue[0]\n\t\tn := copy(p, chunk)\n\n\t\tif n == len(chunk) {\n\t\t\tr.buf.queue = r.buf.queue[1:]\n\t\t} else {\n\t\t\tr.buf.queue[0] = chunk[n:]\n\t\t}\n\t\treturn n, nil\n\t}\n\n\t// 3) Finished → EOF\n\tif r.buf.finished {\n\t\tr.once.Do(func() { close(r.done) })\n\t\treturn 0, io.EOF\n\t}\n\n\t// 4) Gap → silence\n\tfor i := range p {\n\t\tp[i] = 0\n\t}\n\treturn len(p), nil\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tvar referenceAudio string = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n\tvar outputFilename string = \"./generated.wav\"\n\tvar voiceEmbeddingCacheCapacity int = 50\n\tvar seed int = -1\n\n\ttext := `Today as always, men fall into two groups: slaves and free men.\nWhoever does not have two-thirds of his day for himself, is a slave,\nwhatever he 
may be: a statesman, a businessman, an official, or a scholar.`\n\n\tflag.StringVar(&referenceAudio, \"reference-audio\", referenceAudio, \"Path to the reference audio\")\n\tflag.StringVar(&text, \"text\", text, \"Text to be synthesized\")\n\tflag.StringVar(&outputFilename, \"output-filename\", outputFilename, \"File to save the generated audio\")\n\tflag.IntVar(&voiceEmbeddingCacheCapacity, \"voice-embedding-cache-capacity\", voiceEmbeddingCacheCapacity, \"Voice embedding cache capacity (default: 50)\")\n\tflag.IntVar(&seed, \"seed\", seed, \"Random seed for reproducibility (default: -1, random)\")\n\tflag.Parse()\n\n\t// ---------------- config ----------------\n\tvar config sherpa.OfflineTtsConfig\n\n\tconfig.Model.Pocket.LmFlow =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\"\n\tconfig.Model.Pocket.LmMain =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\"\n\tconfig.Model.Pocket.Encoder =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\"\n\tconfig.Model.Pocket.Decoder =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\"\n\tconfig.Model.Pocket.TextConditioner =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\"\n\tconfig.Model.Pocket.VocabJson =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\"\n\tconfig.Model.Pocket.TokenScoresJson =\n\t\t\"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\"\n\tconfig.Model.Pocket.VoiceEmbeddingCacheCapacity = voiceEmbeddingCacheCapacity\n\n\tconfig.Model.NumThreads = 2\n\tconfig.Model.Debug = 0\n\tconfig.Model.Provider = \"cpu\"\n\n\tlog.Println(\"Creating Offline TTS\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tif tts == nil {\n\t\tlog.Fatal(\"Failed to create OfflineTts\")\n\t}\n\tdefer sherpa.DeleteOfflineTts(tts)\n\n\twave := sherpa.ReadWave(referenceAudio)\n\tif wave == nil {\n\t\tlog.Fatal(\"Failed to read reference wav:\", referenceAudio)\n\t}\n\n\tvar cfg sherpa.GenerationConfig\n\tcfg.ReferenceAudio = 
wave.Samples\n\tcfg.ReferenceSampleRate = wave.SampleRate\n\n\t// Build extra config with optional seed\n\textraMap := map[string]interface{}{\n\t\t\"max_reference_audio_len\": 10,\n\t\t\"temperature\":             0.7,\n\t}\n\tif seed >= 0 {\n\t\textraMap[\"seed\"] = seed\n\t}\n\textraBytes, err := json.Marshal(extraMap)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to marshal generation config extra: %v\", err)\n\t}\n\tcfg.Extra = json.RawMessage(extraBytes)\n\n\tlog.Println(\"Start generating\")\n\n\tctx, ready, err := oto.NewContext(&oto.NewContextOptions{\n\t\tSampleRate:   tts.SampleRate(),\n\t\tChannelCount: 1,\n\t\tFormat:       oto.FormatSignedInt16LE,\n\t})\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\t<-ready\n\n\tpcmBuf := newPCMBuffer()\n\n\treader := &pcmReader{\n\t\tbuf:  pcmBuf,\n\t\tdone: make(chan struct{}),\n\t}\n\n\tplayer := ctx.NewPlayer(reader)\n\tplayer.Play()\n\tdefer player.Close()\n\n\tstop := make(chan os.Signal, 1)\n\tsignal.Notify(stop, syscall.SIGINT, syscall.SIGTERM)\n\n\tvar generated *sherpa.GeneratedAudio\n\n\tstart := time.Now()\n\n\tgo func() {\n\t\tdefer pcmBuf.Finish()\n\n\t\tgenerated = tts.GenerateWithConfig(\n\t\t\ttext,\n\t\t\t&cfg,\n\t\t\tfunc(samples []float32, progress float32) bool {\n\t\t\t\tlog.Printf(\"Progress: %.1f%%\", progress*100)\n\n\t\t\t\tbuf := make([]byte, len(samples)*2)\n\t\t\t\tfor i, s := range samples {\n\t\t\t\t\tif s > 1 {\n\t\t\t\t\t\ts = 1\n\t\t\t\t\t} else if s < -1 {\n\t\t\t\t\t\ts = -1\n\t\t\t\t\t}\n\t\t\t\t\tv := int16(math.Round(float64(s * 32767)))\n\t\t\t\t\tbinary.LittleEndian.PutUint16(buf[i*2:], uint16(v))\n\t\t\t\t}\n\n\t\t\t\tpcmBuf.Push(buf)\n\t\t\t\treturn true\n\t\t\t},\n\t\t)\n\n\t\tlog.Println(\"TTS generation finished in\", time.Since(start))\n\t}()\n\n\tselect {\n\tcase <-stop:\n\t\tlog.Println(\"Interrupted\")\n\tcase <-reader.done:\n\t\tlog.Println(\"Playback finished\")\n\t}\n\n\tif generated != nil {\n\t\tif ok := generated.Save(outputFilename); !ok {\n\t\t\tlog.Println(\"Failed to save audio\")\n\t\t} else {\n\t\t\tlog.Println(\"Saved generated audio to\", 
outputFilename)\n\t\t}\n\t}\n\n\t// let remaining audio drain\n\ttime.Sleep(800 * time.Millisecond)\n\n\tlog.Println(\"Done\")\n}\n"
  },
  {
    "path": "go-api-examples/zero-shot-pocket-tts-play/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ngo mod tidy\ngo build\n\n./zero-shot-pocket-tts-play \\\n  --reference-audio ./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n  --output-filename ./generated-bria.wav \\\n  --text \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts/go.mod",
    "content": "module zero-shot-zipvoice-tts\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts/main.go",
    "content": "package main\n\nimport (\n\t\"encoding/json\"\n\t\"log\"\n\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tvar referenceAudio string = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\"\n\tvar referenceText string = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n\tvar outputFilename string = \"./generated.wav\"\n\tvar numSteps int = 4\n\tvar minCharInSentence int = 10\n\n\ttext := \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\n\tflag.StringVar(&referenceAudio, \"reference-audio\", referenceAudio, \"Path to the reference audio\")\n\tflag.StringVar(&referenceText, \"reference-text\", referenceText, \"Reference text for the reference audio\")\n\tflag.StringVar(&text, \"text\", text, \"Text to be synthesized\")\n\tflag.StringVar(&outputFilename, \"output-filename\", outputFilename, \"File to save the generated audio\")\n\tflag.IntVar(&numSteps, \"num-steps\", numSteps, \"Number of ZipVoice flow-matching steps\")\n\tflag.IntVar(&minCharInSentence, \"min-char-in-sentence\", minCharInSentence, \"Minimum characters in a sentence chunk\")\n\tflag.Parse()\n\n\tvar config sherpa.OfflineTtsConfig\n\tconfig.Model.Zipvoice.Encoder =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\"\n\tconfig.Model.Zipvoice.Decoder =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\"\n\tconfig.Model.Zipvoice.DataDir =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\"\n\tconfig.Model.Zipvoice.Lexicon =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\"\n\tconfig.Model.Zipvoice.Tokens =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\"\n\tconfig.Model.Zipvoice.Vocoder = \"./vocos_24khz.onnx\"\n\n\tconfig.Model.NumThreads = 2\n\tconfig.Model.Debug = 0\n\tconfig.Model.Provider = \"cpu\"\n\n\tlog.Println(\"Creating Offline 
TTS\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tif tts == nil {\n\t\tlog.Fatal(\"Failed to create OfflineTts\")\n\t}\n\tdefer sherpa.DeleteOfflineTts(tts)\n\n\twave := sherpa.ReadWave(referenceAudio)\n\tif wave == nil {\n\t\tlog.Fatal(\"Failed to read reference wav:\", referenceAudio)\n\t}\n\n\tvar cfg sherpa.GenerationConfig\n\tcfg.ReferenceAudio = wave.Samples\n\tcfg.ReferenceSampleRate = wave.SampleRate\n\tcfg.ReferenceText = referenceText\n\tcfg.NumSteps = numSteps\n\n\textraMap := map[string]interface{}{\n\t\t\"min_char_in_sentence\": minCharInSentence,\n\t}\n\textraBytes, err := json.Marshal(extraMap)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to marshal generation config extra: %v\", err)\n\t}\n\tcfg.Extra = json.RawMessage(extraBytes)\n\n\tlog.Println(\"Start generating\")\n\n\taudio := tts.GenerateWithConfig(\n\t\ttext,\n\t\t&cfg,\n\t\tfunc(samples []float32, progress float32) bool {\n\t\t\tlog.Printf(\"Progress: %.3f%%, Number of samples: %d\", progress*100, len(samples))\n\t\t\t// return false here if you want to cancel\n\t\t\treturn true\n\t\t},\n\t)\n\n\tif audio == nil {\n\t\tlog.Fatal(\"Generation failed\")\n\t}\n\n\tif !audio.Save(outputFilename) {\n\t\tlog.Fatal(\"Failed to save wav\")\n\t}\n\n\tlog.Println(\"Saved to:\", outputFilename)\n}\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ngo mod tidy\ngo build\n\n./zero-shot-zipvoice-tts \\\n  --reference-audio ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n  --reference-text \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n  --num-steps 4 \\\n  --min-char-in-sentence 10 \\\n  --output-filename ./test-zipvoice.wav \\\n  --text \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts-play/go.mod",
    "content": "module zero-shot-zipvoice-tts-play\n\ngo 1.17\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts-play/main.go",
    "content": "package main\n\nimport (\n\t\"encoding/binary\"\n\t\"encoding/json\"\n\t\"io\"\n\t\"log\"\n\t\"math\"\n\t\"os\"\n\t\"os/signal\"\n\t\"sync\"\n\t\"syscall\"\n\t\"time\"\n\n\toto \"github.com/ebitengine/oto/v3\"\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx\"\n\tflag \"github.com/spf13/pflag\"\n)\n\ntype pcmBuffer struct {\n\tmu       sync.Mutex\n\tqueue    [][]byte\n\tfinished bool\n\tstarted  chan struct{}\n\tonce     sync.Once\n}\n\nfunc newPCMBuffer() *pcmBuffer {\n\treturn &pcmBuffer{\n\t\tstarted: make(chan struct{}),\n\t}\n}\n\nfunc (b *pcmBuffer) Push(p []byte) {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.queue = append(b.queue, p)\n\tb.mu.Unlock()\n}\n\nfunc (b *pcmBuffer) Finish() {\n\tb.once.Do(func() {\n\t\tclose(b.started)\n\t})\n\n\tb.mu.Lock()\n\tb.finished = true\n\tb.mu.Unlock()\n}\n\ntype pcmReader struct {\n\tbuf  *pcmBuffer\n\tdone chan struct{}\n\tonce sync.Once\n}\n\nfunc (r *pcmReader) Read(p []byte) (int, error) {\n\t<-r.buf.started\n\n\tr.buf.mu.Lock()\n\tdefer r.buf.mu.Unlock()\n\n\tif len(r.buf.queue) > 0 {\n\t\tchunk := r.buf.queue[0]\n\t\tn := copy(p, chunk)\n\n\t\tif n == len(chunk) {\n\t\t\tr.buf.queue = r.buf.queue[1:]\n\t\t} else {\n\t\t\tr.buf.queue[0] = chunk[n:]\n\t\t}\n\t\treturn n, nil\n\t}\n\n\tif r.buf.finished {\n\t\tr.once.Do(func() { close(r.done) })\n\t\treturn 0, io.EOF\n\t}\n\n\tfor i := range p {\n\t\tp[i] = 0\n\t}\n\treturn len(p), nil\n}\n\nfunc main() {\n\tlog.SetFlags(log.LstdFlags | log.Lmicroseconds)\n\n\tvar referenceAudio string = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\"\n\tvar referenceText string = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n\tvar outputFilename string = \"./generated.wav\"\n\tvar numSteps int = 4\n\tvar minCharInSentence int = 30\n\n\ttext := \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.\"\n\n\tflag.StringVar(&referenceAudio, \"reference-audio\", referenceAudio, \"Path to the reference audio\")\n\tflag.StringVar(&referenceText, \"reference-text\", referenceText, \"Reference text for the reference audio\")\n\tflag.StringVar(&text, \"text\", text, \"Text to be synthesized\")\n\tflag.StringVar(&outputFilename, \"output-filename\", outputFilename, \"File to save the generated audio\")\n\tflag.IntVar(&numSteps, \"num-steps\", numSteps, \"Number of ZipVoice flow-matching steps\")\n\tflag.IntVar(&minCharInSentence, \"min-char-in-sentence\", minCharInSentence, \"Minimum characters in a sentence chunk\")\n\tflag.Parse()\n\n\tvar config sherpa.OfflineTtsConfig\n\tconfig.Model.Zipvoice.Encoder =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\"\n\tconfig.Model.Zipvoice.Decoder =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\"\n\tconfig.Model.Zipvoice.DataDir =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\"\n\tconfig.Model.Zipvoice.Lexicon =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\"\n\tconfig.Model.Zipvoice.Tokens =\n\t\t\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\"\n\tconfig.Model.Zipvoice.Vocoder = \"./vocos_24khz.onnx\"\n\n\tconfig.Model.NumThreads = 2\n\tconfig.Model.Debug = 0\n\tconfig.Model.Provider = \"cpu\"\n\n\tlog.Println(\"Creating Offline TTS\")\n\ttts := sherpa.NewOfflineTts(&config)\n\tif tts == nil {\n\t\tlog.Fatal(\"Failed to create OfflineTts\")\n\t}\n\tdefer sherpa.DeleteOfflineTts(tts)\n\n\twave := sherpa.ReadWave(referenceAudio)\n\tif wave == nil {\n\t\tlog.Fatal(\"Failed to read reference wav:\", referenceAudio)\n\t}\n\n\tvar cfg sherpa.GenerationConfig\n\tcfg.ReferenceAudio = wave.Samples\n\tcfg.ReferenceSampleRate = wave.SampleRate\n\tcfg.ReferenceText = referenceText\n\tcfg.NumSteps = numSteps\n\n\textraMap := map[string]interface{}{\n\t\t\"min_char_in_sentence\": 
minCharInSentence,\n\t}\n\textraBytes, err := json.Marshal(extraMap)\n\tif err != nil {\n\t\tlog.Fatalf(\"Failed to marshal generation config extra: %v\", err)\n\t}\n\tcfg.Extra = json.RawMessage(extraBytes)\n\n\tlog.Println(\"Start generating\")\n\n\tctx, ready, err := oto.NewContext(&oto.NewContextOptions{\n\t\tSampleRate:   tts.SampleRate(),\n\t\tChannelCount: 1,\n\t\tFormat:       oto.FormatSignedInt16LE,\n\t})\n\tif err != nil {\n\t\tlog.Fatal(err)\n\t}\n\t<-ready\n\n\tpcmBuf := newPCMBuffer()\n\treader := &pcmReader{\n\t\tbuf:  pcmBuf,\n\t\tdone: make(chan struct{}),\n\t}\n\n\tplayer := ctx.NewPlayer(reader)\n\tplayer.Play()\n\tdefer player.Close()\n\n\tstop := make(chan os.Signal, 1)\n\tsignal.Notify(stop, syscall.SIGINT, syscall.SIGTERM)\n\n\tvar generated *sherpa.GeneratedAudio\n\tstart := time.Now()\n\n\tgo func() {\n\t\tdefer pcmBuf.Finish()\n\n\t\tgenerated = tts.GenerateWithConfig(\n\t\t\ttext,\n\t\t\t&cfg,\n\t\t\tfunc(samples []float32, progress float32) bool {\n\t\t\t\tlog.Printf(\"Progress: %.1f%%\", progress*100)\n\n\t\t\t\tbuf := make([]byte, len(samples)*2)\n\t\t\t\tfor i, s := range samples {\n\t\t\t\t\tif s > 1 {\n\t\t\t\t\t\ts = 1\n\t\t\t\t\t} else if s < -1 {\n\t\t\t\t\t\ts = -1\n\t\t\t\t\t}\n\t\t\t\t\tv := int16(math.Round(float64(s * 32767)))\n\t\t\t\t\tbinary.LittleEndian.PutUint16(buf[i*2:], uint16(v))\n\t\t\t\t}\n\n\t\t\t\tpcmBuf.Push(buf)\n\t\t\t\treturn true\n\t\t\t},\n\t\t)\n\n\t\tlog.Println(\"TTS generation finished in\", time.Since(start))\n\t}()\n\n\tselect {\n\tcase <-stop:\n\t\tlog.Println(\"Interrupted\")\n\tcase <-reader.done:\n\t\tlog.Println(\"Playback finished\")\n\t}\n\n\tif generated != nil {\n\t\tif ok := generated.Save(outputFilename); !ok {\n\t\t\tlog.Println(\"Failed to save audio\")\n\t\t} else {\n\t\t\tlog.Println(\"Saved generated audio to\", outputFilename)\n\t\t}\n\t}\n\n\ttime.Sleep(800 * time.Millisecond)\n\n\tlog.Println(\"Done\")\n}\n"
  },
  {
    "path": "go-api-examples/zero-shot-zipvoice-tts-play/run.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nexport CGO_ENABLED=1\n\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ngo mod tidy\ngo build\n\n./zero-shot-zipvoice-tts-play \\\n  --reference-audio ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n  --reference-text \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n  --num-steps 4 \\\n  --min-char-in-sentence 10 \\\n  --output-filename ./generated-leijun.wav \\\n  --text \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n"
  },
  {
    "path": "harmony-os/.gitignore",
    "content": "!build-profile.json5\n*.har\n"
  },
  {
    "path": "harmony-os/README.md",
    "content": "# Introduction\n\n- [./SherpaOnnxHar](./SherpaOnnxHar) It is for building `sherpa_onnx.har`.\n  If you don't need to change the C++ or Typescript code of sherpa-onnx, then\n  you can download pre-built `sherpa_onnx.har` from us. Just run `ohpm install sherpa_onnx`.\n  Please refer to our [doc](https://k2-fsa.github.io/sherpa/onnx/harmony-os/how-to-build-har.html)\n  if you want to build `sherpa-onnx` from source.\n\n- [./SherpaOnnxSpeakerDiarization](./SherpaOnnxSpeakerDiarization) It shows how\n  to run on-device speaker diarization.\n\n- [./SherpaOnnxSpeakerIdentification](./SherpaOnnxSpeakerIdentification) It shows how to use\n  speaker embedding models for on-device speaker identification.\n\n- [./SherpaOnnxStreamingAsr](./SherpaOnnxStreamingAsr) It shows how to use\n  streaming ASR models for real-time on-device speech recognition.\n\n- [./SherpaOnnxTts](./SherpaOnnxTts) It shows how to run on-device text-to-speech.\n  Please see the doc at <https://k2-fsa.github.io/sherpa/onnx/harmony-os/tts.html>\n\n- [./SherpaOnnxVadAsr](./SherpaOnnxVadAsr) It shows how to use\n  VAD + Non-streaming ASR for speech recognition.\n  Please see the doc at <https://k2-fsa.github.io/sherpa/onnx/harmony-os/vad-asr.html>\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx\",\n    \"vendor\": \"example\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxHar\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/README.md",
    "content": "# Introduction\n\nHow to build `sherpa_onnx.har` from the command line\n----------------------------------------------------\n\nPlease see https://k2-fsa.github.io/sherpa/onnx/harmony-os/how-to-build-har.html\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    },\n    {\n      \"name\": \"sherpa_onnx\",\n      \"srcPath\": \"./sherpa_onnx\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {}\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has brought to foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has back to background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/ets/pages/Index.ets",
    "content": "@Entry\n@Component\nstruct Index {\n  @State message: string = 'Hello World';\n\n  build() {\n    Row() {\n      Column() {\n        Text(this.message)\n          .fontSize(50)\n          .fontWeight(FontWeight.Bold)\n      }\n      .width('100%')\n    }\n    .height('100%')\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ]\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"module description\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"description\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"label\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"module description\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"description\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"label\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"模块描述\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"description\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"label\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/notes.md",
    "content": "# Notes\n\n## How to publish a package\n\nPlease see\n - <https://ohpm.openharmony.cn/#/cn/help/publishrequirefile>\n - <https://ohpm.openharmony.cn/#/cn/help/createandpublish>\n - <https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/ide-har-publish-V5>\n\n## How to sign the HAP file from commandline\n\nPlease see\n<https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/ide-command-line-building-app-V5>\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/release.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nexport PATH=/Users/fangjun/software/command-line-tools/bin:$PATH\n\ncp -v ../../CHANGELOG.md ./sherpa_onnx\n\nhvigorw clean --no-daemon\nhvigorw --mode module -p product=default -p module=sherpa_onnx@default assembleHar --analyze=normal --parallel --incremental --no-daemon\n\nohpm publish ./sherpa_onnx/build/default/outputs/default/sherpa_onnx.har\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/BuildProfile.ets",
    "content": "/**\n * Use these variables when you tailor your ArkTS code. They must be of the const type.\n */\nexport const HAR_VERSION = '1.12.31';\nexport const BUILD_MODE_NAME = 'debug';\nexport const DEBUG = true;\nexport const TARGET_NAME = 'default';\n\n/**\n * BuildProfile Class is used only for compatibility purposes.\n */\nexport default class BuildProfile { \n\tstatic readonly HAR_VERSION = HAR_VERSION;\n\tstatic readonly BUILD_MODE_NAME = BUILD_MODE_NAME;\n\tstatic readonly DEBUG = DEBUG;\n\tstatic readonly TARGET_NAME = TARGET_NAME;\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/Index.ets",
    "content": "export { listRawfileDir, readWave, readWaveFromBinary, } from \"libsherpa_onnx.so\";\n\nexport { CircularBuffer, SileroVadConfig, TenVadConfig, SpeechSegment, Vad, VadConfig, } from './src/main/ets/components/Vad';\n\n\nexport { Samples,\n  OfflineStream,\n  FeatureConfig,\n  HomophoneReplacerConfig,\n  OfflineCanaryModelConfig,\n  OfflineDolphinModelConfig,\n  OfflineFireRedAsrCtcModelConfig,\n  OfflineFireRedAsrModelConfig,\n  OfflineFunASRNanoModelConfig,\n  OfflineMedAsrCtcModelConfig,\n  OfflineOmnilingualAsrCtcModelConfig,\n  OfflineTransducerModelConfig,\n  OfflineParaformerModelConfig,\n  OfflineNemoEncDecCtcModelConfig,\n  OfflineWhisperModelConfig,\n  OfflineTdnnModelConfig,\n  OfflineMoonshineModelConfig,\n  OfflineSenseVoiceModelConfig,\n  OfflineWenetCtcModelConfig,\n  OfflineZipformerCtcModelConfig,\n  OfflineModelConfig,\n  OfflineLMConfig,\n  OfflineRecognizerConfig,\n  OfflineRecognizerResult,\n  OfflineRecognizer,\n} from './src/main/ets/components/NonStreamingAsr';\n\nexport { OnlineStream,\n  OnlineNemoCtcModelConfig,\n  OnlineParaformerModelConfig,\n  OnlineToneCtcModelConfig,\n  OnlineTransducerModelConfig,\n  OnlineZipformer2CtcModelConfig,\n  OnlineModelConfig,\n  OnlineCtcFstDecoderConfig,\n  OnlineRecognizerConfig,\n  OnlineRecognizerResult,\n  OnlineRecognizer,\n} from './src/main/ets/components/StreamingAsr';\n\nexport { OfflineTtsKittenModelConfig,\n  OfflineTtsKokoroModelConfig,\n  OfflineTtsMatchaModelConfig,\n  OfflineTtsPocketModelConfig,\n  OfflineTtsSupertonicModelConfig,\n  OfflineTtsVitsModelConfig,\n  OfflineTtsZipvoiceModelConfig,\n  OfflineTtsModelConfig,\n  OfflineTtsConfig,\n  OfflineTts,\n  TtsOutput,\n  TtsGenerationConfig,\n  TtsInput,\n  TtsInputWithConfig,\n} from './src/main/ets/components/NonStreamingTts';\n\nexport { OfflinePunctuationModelConfig,\n  OfflinePunctuationConfig,\n  OfflinePunctuation,\n} from './src/main/ets/components/OfflinePunctuation';\n\nexport { OnlinePunctuationModelConfig,\n  
OnlinePunctuationConfig,\n  OnlinePunctuation,\n} from './src/main/ets/components/OnlinePunctuation';\n\nexport { SpeakerEmbeddingExtractorConfig,\n  SpeakerEmbeddingExtractor,\n  SpeakerEmbeddingManager,\n} from './src/main/ets/components/SpeakerIdentification';\n\nexport { OfflineSpeakerSegmentationPyannoteModelConfig,\n  OfflineSpeakerSegmentationModelConfig,\n  OfflineSpeakerDiarizationConfig,\n  OfflineSpeakerDiarizationSegment,\n  OfflineSpeakerDiarization,\n  FastClusteringConfig,\n} from './src/main/ets/components/NonStreamingSpeakerDiarization';\n\nexport { KeywordSpotterConfig,\n  KeywordSpotterResult,\n  KeywordSpotter,\n} from './src/main/ets/components/KeywordSpotting';\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/README.md",
    "content": "# Introduction\n\n[sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) is one of the deployment\nframeworks of [Next-gen Kaldi](https://github.com/k2-fsa).\n\nIt supports speech-to-text, text-to-speech, speaker diarization, and VAD using\nonnxruntime without Internet connection.\n\nIt also supports embedded systems, Android, iOS, HarmonyOS,\nRaspberry Pi, RISC-V, x86_64 servers, websocket server/client,\nC/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript,\nFlutter, Object Pascal, Lazarus, Rust, etc.\n\n\n# Installation\n\nTo use `sherpa-onnx` in your project, please either use\n\n```\nohpm install sherpa_onnx\n```\nor update your `oh-package.json5` to include the following:\n\n```\n  \"dependencies\": {\n    \"sherpa_onnx\": \"1.12.31\",\n  },\n```\n\nNote that we recommend always using the latest version.\n\n# Examples\n\n| Demo | URL | Description|\n|------|-----|------------|\n|SherpaOnnxStreamingAsr|[Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxStreamingAsr)|On-device real-time/streaming speech recognition with Next-gen Kaldi|\n|SherpaOnnxVadAsr|[Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxVadAsr)|It shows how to use VAD with a non-streaming ASR model for on-device speech recognition without accessing the network |\n|SherpaOnnxTts|[Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxTts)|It shows how to use Next-gen Kaldi for on-device text-to-speech (TTS, i.e., speech synthesis)|\n|SherpaOnnxSpeakerDiarization|[Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxSpeakerDiarization)|On-device speaker diarization with Next-gen Kaldi|\n|SherpaOnnxSpeakerIdentification|[Address](https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxSpeakerIdentification)|On-device speaker identification with Next-gen Kaldi|\n\n# Documentation\n\nIf you have any issues, please either look at our doc 
at\n<https://k2-fsa.github.io/sherpa/onnx/> or create an issue at\n<https://github.com/k2-fsa/sherpa-onnx/issues>\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"externalNativeOptions\": {\n      \"path\": \"./src/main/cpp/CMakeLists.txt\",\n      \"arguments\": \"\",\n      \"cppFlags\": \"-std=c++17\",\n      \"abiFilters\": [\n        \"arm64-v8a\",\n        \"x86_64\",\n      ],\n    },\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          },\n          \"consumerFiles\": [\n            \"./consumer-rules.txt\"\n          ]\n        }\n      },\n      \"nativeLib\": {\n        \"debugSymbol\": {\n          \"strip\": true,\n          \"exclude\": []\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/consumer-rules.txt",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/hvigorfile.ts",
    "content": "import { harTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: harTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"libsherpa_onnx.so@src/main/cpp/types/libsherpa_onnx\": \"libsherpa_onnx.so@src/main/cpp/types/libsherpa_onnx\"\n  },\n  \"packages\": {\n    \"libsherpa_onnx.so@src/main/cpp/types/libsherpa_onnx\": {\n      \"name\": \"libsherpa_onnx.so\",\n      \"version\": \"1.0.0\",\n      \"resolved\": \"src/main/cpp/types/libsherpa_onnx\",\n      \"registryType\": \"local\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/oh-package.json5",
    "content": "{\n  \"name\": \"sherpa_onnx\",\n  \"version\": \"1.12.31\",\n  \"description\": \"On-device speech-to-text, text-to-speech, and speaker diarization using Next-gen Kaldi without Internet connection\",\n  \"main\": \"Index.ets\",\n  \"author\": \"The next-gen Kaldi team\",\n  \"license\": \"Apache-2.0\",\n  \"homepage\": \"https://github.com/k2-fsa/sherpa-onnx\",\n  \"repository\": \"https://github.com/k2-fsa/sherpa-onnx/tree/master/harmony-os/SherpaOnnxHar\",\n  \"dependencies\": {\n    \"libsherpa_onnx.so\": \"file:./src/main/cpp/types/libsherpa_onnx\"\n  },\n  \"keywords\": [\n    \"语音识别\",\n    \"语音合成\",\n    \"说话人日志\",\n    \"新一代Kaldi\",\n    \"不联网\",\n    \"本地\",\n    \"tts\",\n    \"asr\",\n    \"privacy\",\n    \"open-source\",\n  ],\n  \"bugs\": {\n    \"url\": \"https://github.com/k2-fsa/sherpa-onnx/issues\"\n  },\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/CMakeLists.txt",
    "content": "# the minimum version of CMake.\ncmake_minimum_required(VERSION 3.13.0)\nproject(myNpmLib)\n\nif (NOT CMAKE_CXX_STANDARD)\n  set(CMAKE_CXX_STANDARD 17 CACHE STRING \"The C++ version to use\")\nendif()\n\n# Disable warning about\n#\n# \"The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is\n#  not set.\nif (CMAKE_VERSION VERSION_GREATER_EQUAL \"3.24.0\")\n  cmake_policy(SET CMP0135 NEW)\nendif()\n\nset(NATIVERENDER_ROOT_PATH ${CMAKE_CURRENT_SOURCE_DIR})\n\nif(DEFINED PACKAGE_FIND_FILE)\n    include(${PACKAGE_FIND_FILE})\nendif()\n\ninclude_directories(${NATIVERENDER_ROOT_PATH}\n                    ${NATIVERENDER_ROOT_PATH}/include)\n\ninclude(FetchContent)\nFetchContent_Declare(node_addon_api\n    GIT_REPOSITORY \"https://github.com/nodejs/node-addon-api.git\"\n    GIT_TAG c679f6f4c9dc6bf9fc0d99cbe5982bd24a5e2c7b\n    PATCH_COMMAND git checkout . && git apply --ignore-whitespace \"${CMAKE_CURRENT_LIST_DIR}/my-patch.diff\"\n)\nFetchContent_MakeAvailable(node_addon_api)\nFetchContent_GetProperties(node_addon_api)\nif(NOT node_addon_api_POPULATED)\n    message(STATUS \"Downloading node-addon-api from\")\n    FetchContent_Populate(node_addon_api)\nendif()\n\nmessage(STATUS \"node-addon-api is downloaded to ${node_addon_api_SOURCE_DIR}\")\ninclude_directories(${node_addon_api_SOURCE_DIR})\n\nadd_library(sherpa_onnx SHARED\n  audio-tagging.cc\n  keyword-spotting.cc\n  non-streaming-asr.cc\n  non-streaming-speaker-diarization.cc\n  non-streaming-speech-denoiser.cc\n  non-streaming-tts.cc\n  offline-punctuation.cc\n  online-punctuation.cc\n  streaming-speech-denoiser.cc\n  sherpa-onnx-node-addon-api.cc\n  speaker-identification.cc\n  spoken-language-identification.cc\n  streaming-asr.cc\n  utils.cc\n  vad.cc\n  version.cc\n  wave-reader.cc\n  wave-writer.cc\n)\n\nadd_library(sherpa_onnx_c_api SHARED IMPORTED)\nset_target_properties(sherpa_onnx_c_api\n    PROPERTIES\n    IMPORTED_LOCATION 
${CMAKE_CURRENT_SOURCE_DIR}/libs/${OHOS_ARCH}/libsherpa-onnx-c-api.so)\n\nadd_library(onnxruntime SHARED IMPORTED)\nset_target_properties(onnxruntime\n    PROPERTIES\n    IMPORTED_LOCATION ${CMAKE_CURRENT_SOURCE_DIR}/libs/${OHOS_ARCH}/libonnxruntime.so)\n\n\ntarget_link_libraries(sherpa_onnx PUBLIC libace_napi.z.so\n libhilog_ndk.z.so # for hilog\n librawfile.z.so\n sherpa_onnx_c_api onnxruntime\n)\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/audio-tagging.cc",
    "content": "// scripts/node-addon-api/src/audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <sstream>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic SherpaOnnxOfflineZipformerAudioTaggingModelConfig\nGetAudioTaggingZipformerModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineZipformerAudioTaggingModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"zipformer\") || !obj.Get(\"zipformer\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"zipformer\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxAudioTaggingModelConfig GetAudioTaggingModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxAudioTaggingModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"model\") || !obj.Get(\"model\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"model\").As<Napi::Object>();\n  c.zipformer = GetAudioTaggingZipformerModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(ced, ced);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxAudioTagging> CreateAudioTaggingWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"You should pass an object as the only argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxAudioTaggingConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model = GetAudioTaggingModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(labels, labels);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(top_k, topK);\n\n  const SherpaOnnxAudioTagging *at = SherpaOnnxCreateAudioTagging(&c);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model.zipformer.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model.ced);\n  SHERPA_ONNX_DELETE_C_STR(c.model.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.labels);\n\n  if (!at) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxAudioTagging>::New(\n      env, const_cast<SherpaOnnxAudioTagging *>(at),\n      [](Napi::Env env, SherpaOnnxAudioTagging *at) {\n        SherpaOnnxDestroyAudioTagging(at);\n      });\n}\n\nstatic Napi::External<SherpaOnnxOfflineStream>\nAudioTaggingCreateOfflineStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"You should pass an audio tagging pointer as the only argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxAudioTagging *at =\n      info[0].As<Napi::External<SherpaOnnxAudioTagging>>().Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxAudioTaggingCreateOfflineStream(at);\n\n  return Napi::External<SherpaOnnxOfflineStream>::New(\n      env, const_cast<SherpaOnnxOfflineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOfflineStream *stream) {\n        SherpaOnnxDestroyOfflineStream(stream);\n      });\n}\n\nstatic Napi::Object AudioTaggingComputeWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 3) {\n    std::ostringstream os;\n    os << \"Expect only 3 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"You should pass an audio tagging pointer as the first argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"You should pass an offline stream pointer as the second argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[2].IsNumber()) {\n    Napi::TypeError::New(env,\n                         \"You should pass an integer as the third argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxAudioTagging *at =\n      info[0].As<Napi::External<SherpaOnnxAudioTagging>>().Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  int32_t top_k = 
info[2].As<Napi::Number>().Int32Value();\n\n  const SherpaOnnxAudioEvent *const *events =\n      SherpaOnnxAudioTaggingCompute(at, stream, top_k);\n\n  // Count the entries of the null-terminated events array.\n  auto p = events;\n  uint32_t k = 0;\n  while (p && *p) {\n    ++k;\n    ++p;\n  }\n\n  Napi::Array ans = Napi::Array::New(env, k);\n  for (uint32_t i = 0; i != k; ++i) {\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"name\"),\n            Napi::String::New(env, events[i]->name));\n    obj.Set(Napi::String::New(env, \"index\"),\n            Napi::Number::New(env, events[i]->index));\n    obj.Set(Napi::String::New(env, \"prob\"),\n            Napi::Number::New(env, events[i]->prob));\n    // ans[i] = obj; // see #2120\n    ans.Set(i, obj);\n  }\n\n  SherpaOnnxAudioTaggingFreeResults(events);\n\n  return ans;\n}\n\nvoid InitAudioTagging(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createAudioTagging\"),\n              Napi::Function::New(env, CreateAudioTaggingWrapper));\n\n  exports.Set(Napi::String::New(env, \"audioTaggingCreateOfflineStream\"),\n              Napi::Function::New(env, AudioTaggingCreateOfflineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"audioTaggingCompute\"),\n              Napi::Function::New(env, AudioTaggingComputeWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/include/sherpa-onnx/c-api/README.md",
    "content": "# Node\n\n[./c-api.h](./c-api.h) is a symbolic link to\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/c-api/c-api.h\n\nIf you are using Windows, then you need to manually replace this file with\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/c-api/c-api.h\nsince Windows does not support symbolic links.\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/keyword-spotting.cc",
    "content": "// scripts/node-addon-api/src/keyword-spotting.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <memory>\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// defined ./streaming-asr.cc\nSherpaOnnxFeatureConfig GetFeatureConfig(Napi::Object obj);\n\n// defined ./streaming-asr.cc\nSherpaOnnxOnlineModelConfig GetOnlineModelConfig(Napi::Object obj);\n\nstatic Napi::External<SherpaOnnxKeywordSpotter> CreateKeywordSpotterWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n  SherpaOnnxKeywordSpotterConfig c;\n  memset(&c, 0, sizeof(c));\n  c.feat_config = GetFeatureConfig(o);\n  c.model_config = GetOnlineModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(max_active_paths, maxActivePaths);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_trailing_blanks, numTrailingBlanks);\n  
SHERPA_ONNX_ASSIGN_ATTR_FLOAT(keywords_score, keywordsScore);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(keywords_threshold, keywordsThreshold);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(keywords_file, keywordsFile);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(keywords_buf, keywordsBuf);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(keywords_buf_size, keywordsBufSize);\n\n#if __OHOS__\n  const SherpaOnnxKeywordSpotter *kws = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    kws = SherpaOnnxCreateKeywordSpotterOHOS(&c, mgr.get());\n  } else {\n    kws = SherpaOnnxCreateKeywordSpotter(&c);\n  }\n#else\n  const SherpaOnnxKeywordSpotter *kws = SherpaOnnxCreateKeywordSpotter(&c);\n#endif\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.joiner);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.paraformer.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.paraformer.decoder);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.zipformer2_ctc.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.tokens);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.model_type);\n  SHERPA_ONNX_DELETE_C_STR(c.keywords_file);\n  SHERPA_ONNX_DELETE_C_STR(c.keywords_buf);\n\n  if (!kws) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxKeywordSpotter>::New(\n      env, const_cast<SherpaOnnxKeywordSpotter *>(kws),\n      [](Napi::Env env, SherpaOnnxKeywordSpotter *kws) {\n        SherpaOnnxDestroyKeywordSpotter(kws);\n      });\n}\n\nstatic Napi::External<SherpaOnnxOnlineStream> 
CreateKeywordStreamWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"You should pass a keyword spotter pointer as the first argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (info.Length() == 2 && !info[1].IsString()) {\n    std::ostringstream os;\n    os << \"Argument 1 should be a string.\";\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxKeywordSpotter *kws =\n      info[0].As<Napi::External<SherpaOnnxKeywordSpotter>>().Data();\n\n  const SherpaOnnxOnlineStream *stream;\n  if (info.Length() == 1) {\n    stream = SherpaOnnxCreateKeywordStream(kws);\n  } else {\n    Napi::String js_keywords = info[1].As<Napi::String>();\n    std::string keywords = js_keywords.Utf8Value();\n    stream = SherpaOnnxCreateKeywordStreamWithKeywords(kws, keywords.c_str());\n  }\n\n  return Napi::External<SherpaOnnxOnlineStream>::New(\n      env, const_cast<SherpaOnnxOnlineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOnlineStream *stream) {\n        SherpaOnnxDestroyOnlineStream(stream);\n      });\n}\n\nstatic Napi::Boolean IsKeywordStreamReadyWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a keyword spotter pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxKeywordSpotter *kws =\n      info[0].As<Napi::External<SherpaOnnxKeywordSpotter>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  int32_t is_ready = SherpaOnnxIsKeywordStreamReady(kws, stream);\n\n  return Napi::Boolean::New(env, is_ready);\n}\n\nstatic void DecodeKeywordStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a keyword spotter pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxKeywordSpotter *kws =\n      info[0].As<Napi::External<SherpaOnnxKeywordSpotter>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  SherpaOnnxDecodeKeywordStream(kws, stream);\n}\n\nstatic void ResetKeywordStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a keyword spotter pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxKeywordSpotter *kws =\n      info[0].As<Napi::External<SherpaOnnxKeywordSpotter>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  SherpaOnnxResetKeywordStream(kws, stream);\n}\n\nstatic Napi::String GetKeywordResultAsJsonWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a keyword spotter pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxKeywordSpotter *kws =\n      info[0].As<Napi::External<SherpaOnnxKeywordSpotter>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  const char *json = SherpaOnnxGetKeywordResultAsJson(kws, stream);\n\n  Napi::String s = Napi::String::New(env, json);\n\n  SherpaOnnxFreeKeywordResultJson(json);\n\n  return s;\n}\n\nvoid InitKeywordSpotting(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createKeywordSpotter\"),\n 
             Napi::Function::New(env, CreateKeywordSpotterWrapper));\n\n  exports.Set(Napi::String::New(env, \"createKeywordStream\"),\n              Napi::Function::New(env, CreateKeywordStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"isKeywordStreamReady\"),\n              Napi::Function::New(env, IsKeywordStreamReadyWrapper));\n\n  exports.Set(Napi::String::New(env, \"decodeKeywordStream\"),\n              Napi::Function::New(env, DecodeKeywordStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"resetKeywordStream\"),\n              Napi::Function::New(env, ResetKeywordStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"getKeywordResultAsJson\"),\n              Napi::Function::New(env, GetKeywordResultAsJsonWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/libs/.gitignore",
    "content": "*.so\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/libs/README.md",
    "content": "# Introduction\n\nYou need to get the following four `.so` files using\n\n  - [build-ohos-arm64-v8a.sh](https://github.com/k2-fsa/sherpa-onnx/blob/master/build-ohos-arm64-v8a.sh)\n  - [build-ohos-x86-64.sh](https://github.com/k2-fsa/sherpa-onnx/blob/master/build-ohos-x86-64.sh)\n\n```\n.\n├── README.md\n├── arm64-v8a\n│   ├── libonnxruntime.so\n│   └── libsherpa-onnx-c-api.so\n└── x86_64\n    ├── libonnxruntime.so\n    └── libsherpa-onnx-c-api.so\n```\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/libs/arm64-v8a/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/libs/armeabi-v7a/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/libs/x86_64/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/macros.h",
    "content": "// scripts/node-addon-api/src/macros.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SCRIPTS_NODE_ADDON_API_SRC_MACROS_H_\n#define SCRIPTS_NODE_ADDON_API_SRC_MACROS_H_\n\n#include <algorithm>\n#include <string>\n\n#if __OHOS__\n#include \"hilog/log.h\"\n#include \"rawfile/raw_file_manager.h\"\n\n#undef LOG_DOMAIN\n#undef LOG_TAG\n\n// https://gitee.com/openharmony/docs/blob/145a084f0b742e4325915e32f8184817927d1251/en/contribute/OpenHarmony-Log-guide.md#hilog-api-usage-specifications\n#define LOG_DOMAIN 0x6666\n#define LOG_TAG \"sherpa_onnx\"\n#endif\n\n#define SHERPA_ONNX_ASSIGN_ATTR_STR(c_name, js_name)                       \\\n  do {                                                                     \\\n    if (o.Has(#js_name) && o.Get(#js_name).IsString()) {                   \\\n      Napi::String _str = o.Get(#js_name).As<Napi::String>();              \\\n      std::string s = _str.Utf8Value();                                    \\\n      char *p = new char[s.size() + 1];                                    \\\n      std::copy(s.begin(), s.end(), p);                                    \\\n      p[s.size()] = 0;                                                     \\\n                                                                           \\\n      c.c_name = p;                                                        \\\n    } else if (o.Has(#js_name) && o.Get(#js_name).IsTypedArray()) {        \\\n      Napi::Uint8Array _array = o.Get(#js_name).As<Napi::Uint8Array>();    \\\n      char *p = new char[_array.ElementLength() + 1];                      \\\n      std::copy(_array.Data(), _array.Data() + _array.ElementLength(), p); \\\n      p[_array.ElementLength()] = '\\0';                                    \\\n                                                                           \\\n      c.c_name = p;                                                        \\\n    }                                                                 
     \\\n  } while (0)\n\n#define SHERPA_ONNX_ASSIGN_ATTR_INT32(c_name, js_name)            \\\n  do {                                                            \\\n    if (o.Has(#js_name) && o.Get(#js_name).IsNumber()) {          \\\n      c.c_name = o.Get(#js_name).As<Napi::Number>().Int32Value(); \\\n    }                                                             \\\n  } while (0)\n\n#define SHERPA_ONNX_ASSIGN_ATTR_FLOAT(c_name, js_name)            \\\n  do {                                                            \\\n    if (o.Has(#js_name) && o.Get(#js_name).IsNumber()) {          \\\n      c.c_name = o.Get(#js_name).As<Napi::Number>().FloatValue(); \\\n    }                                                             \\\n  } while (0)\n\n#define SHERPA_ONNX_DELETE_C_STR(p) \\\n  do {                              \\\n    if (p) {                        \\\n      delete[] p;                   \\\n    }                               \\\n  } while (0)\n\n#endif  // SCRIPTS_NODE_ADDON_API_SRC_MACROS_H_\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/my-patch.diff",
    "content": "diff --git a/napi-inl.h b/napi-inl.h\nindex e7141c0..0fd90d8 100644\n--- a/napi-inl.h\n+++ b/napi-inl.h\n@@ -2156,7 +2156,8 @@ inline ArrayBuffer::ArrayBuffer(napi_env env, napi_value value)\n \n inline void* ArrayBuffer::Data() {\n   void* data;\n-  napi_status status = napi_get_arraybuffer_info(_env, _value, &data, nullptr);\n+  size_t byte_length;\n+  napi_status status = napi_get_arraybuffer_info(_env, _value, &data, &byte_length);\n   NAPI_THROW_IF_FAILED(_env, status, nullptr);\n   return data;\n }\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-asr.cc",
    "content": "// scripts/node-addon-api/src/non-streaming-asr.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <memory>\n#include <sstream>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// defined in ./streaming-asr.cc\nSherpaOnnxFeatureConfig GetFeatureConfig(Napi::Object obj);\nSherpaOnnxHomophoneReplacerConfig GetHomophoneReplacerConfig(Napi::Object obj);\n\nstatic SherpaOnnxOfflineTransducerModelConfig GetOfflineTransducerModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTransducerModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"transducer\") || !obj.Get(\"transducer\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"transducer\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(joiner, joiner);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineParaformerModelConfig GetOfflineParaformerModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineParaformerModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"paraformer\") || !obj.Get(\"paraformer\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"paraformer\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineZipformerCtcModelConfig\nGetOfflineZipformerCtcModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineZipformerCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"zipformerCtc\") || !obj.Get(\"zipformerCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"zipformerCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineWenetCtcModelConfig GetOfflineWenetCtcModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineWenetCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"wenetCtc\") || 
!obj.Get(\"wenetCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"wenetCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineOmnilingualAsrCtcModelConfig\nGetOfflineOmnilingualAsrCtcModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineOmnilingualAsrCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"omnilingual\") || !obj.Get(\"omnilingual\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"omnilingual\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineMedAsrCtcModelConfig GetOfflineMedAsrCtcModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineMedAsrCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"medasr\") || !obj.Get(\"medasr\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"medasr\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineFireRedAsrCtcModelConfig\nGetOfflineFireRedAsrCtcModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineFireRedAsrCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"fireRedAsrCtc\") || !obj.Get(\"fireRedAsrCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"fireRedAsrCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineFunASRNanoModelConfig GetOfflineFunAsrNanoModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineFunASRNanoModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"funasrNano\") || !obj.Get(\"funasrNano\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"funasrNano\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder_adaptor, encoderAdaptor);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(llm, llm);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(embedding, embedding);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokenizer, tokenizer);\n  
SHERPA_ONNX_ASSIGN_ATTR_STR(system_prompt, systemPrompt);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(user_prompt, userPrompt);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(max_new_tokens, maxNewTokens);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(temperature, temperature);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(top_p, topP);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(seed, seed);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(language, language);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(itn, itn);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(hotwords, hotwords);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineDolphinModelConfig GetOfflineDolphinModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineDolphinModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"dolphin\") || !obj.Get(\"dolphin\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"dolphin\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineNemoEncDecCtcModelConfig GetOfflineNeMoCtcModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineNemoEncDecCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"nemoCtc\") || !obj.Get(\"nemoCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"nemoCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineCanaryModelConfig GetOfflineCanaryModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineCanaryModelConfig c;\n  memset(&c, 0, sizeof(c));\n  c.use_pnc = 1;  // Align default with JS default\n\n  if (!obj.Has(\"canary\") || !obj.Get(\"canary\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"canary\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(src_lang, srcLang);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tgt_lang, tgtLang);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(use_pnc, usePnc);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineWhisperModelConfig 
GetOfflineWhisperModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineWhisperModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"whisper\") || !obj.Get(\"whisper\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"whisper\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(language, language);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(task, task);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(tail_paddings, tailPaddings);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(enable_token_timestamps, enableTokenTimestamps);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(enable_segment_timestamps,\n                                enableSegmentTimestamps);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineFireRedAsrModelConfig GetOfflineFireRedAsrModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineFireRedAsrModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"fireRedAsr\") || !obj.Get(\"fireRedAsr\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"fireRedAsr\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineMoonshineModelConfig GetOfflineMoonshineModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineMoonshineModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"moonshine\") || !obj.Get(\"moonshine\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"moonshine\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(preprocessor, preprocessor);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(uncached_decoder, uncachedDecoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(cached_decoder, cachedDecoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(merged_decoder, mergedDecoder);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTdnnModelConfig GetOfflineTdnnModelConfig(\n    Napi::Object obj) {\n  
SherpaOnnxOfflineTdnnModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"tdnn\") || !obj.Get(\"tdnn\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"tdnn\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineSenseVoiceModelConfig GetOfflineSenseVoiceModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineSenseVoiceModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"senseVoice\") || !obj.Get(\"senseVoice\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"senseVoice\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(language, language);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(use_itn, useInverseTextNormalization);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineModelConfig GetOfflineModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"modelConfig\") || !obj.Get(\"modelConfig\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"modelConfig\").As<Napi::Object>();\n\n  c.transducer = GetOfflineTransducerModelConfig(o);\n  c.paraformer = GetOfflineParaformerModelConfig(o);\n  c.nemo_ctc = GetOfflineNeMoCtcModelConfig(o);\n  c.whisper = GetOfflineWhisperModelConfig(o);\n  c.tdnn = GetOfflineTdnnModelConfig(o);\n  c.sense_voice = GetOfflineSenseVoiceModelConfig(o);\n  c.moonshine = GetOfflineMoonshineModelConfig(o);\n  c.fire_red_asr = GetOfflineFireRedAsrModelConfig(o);\n  c.dolphin = GetOfflineDolphinModelConfig(o);\n  c.zipformer_ctc = GetOfflineZipformerCtcModelConfig(o);\n  c.canary = GetOfflineCanaryModelConfig(o);\n  c.wenet_ctc = GetOfflineWenetCtcModelConfig(o);\n  c.omnilingual = GetOfflineOmnilingualAsrCtcModelConfig(o);\n  c.medasr = GetOfflineMedAsrCtcModelConfig(o);\n  c.funasr_nano = GetOfflineFunAsrNanoModelConfig(o);\n  c.fire_red_asr_ctc = GetOfflineFireRedAsrCtcModelConfig(o);\n\n  
SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model_type, modelType);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(modeling_unit, modelingUnit);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(bpe_vocab, bpeVocab);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(telespeech_ctc, teleSpeechCtc);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineLMConfig GetOfflineLMConfig(Napi::Object obj) {\n  SherpaOnnxOfflineLMConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"lmConfig\") || !obj.Get(\"lmConfig\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"lmConfig\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(scale, scale);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineRecognizerConfig ParseConfig(Napi::Object o) {\n  SherpaOnnxOfflineRecognizerConfig c;\n  memset(&c, 0, sizeof(c));\n  c.feat_config = GetFeatureConfig(o);\n  c.model_config = GetOfflineModelConfig(o);\n  c.lm_config = GetOfflineLMConfig(o);\n  c.hr = GetHomophoneReplacerConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoding_method, decodingMethod);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(max_active_paths, maxActivePaths);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(hotwords_file, hotwordsFile);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(hotwords_score, hotwordsScore);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fsts, ruleFsts);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fars, ruleFars);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(blank_penalty, blankPenalty);\n\n  return c;\n}\n\nstatic void FreeConfig(const SherpaOnnxOfflineRecognizerConfig &c) {\n  
SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.joiner);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.paraformer.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.nemo_ctc.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.whisper.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.whisper.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.whisper.language);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.whisper.task);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.tdnn.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.sense_voice.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.sense_voice.language);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.moonshine.preprocessor);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.moonshine.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.moonshine.uncached_decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.moonshine.cached_decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.moonshine.merged_decoder);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.fire_red_asr.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.fire_red_asr.decoder);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.dolphin.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.zipformer_ctc.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.canary.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.canary.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.canary.src_lang);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.canary.tgt_lang);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.wenet_ctc.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.omnilingual.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.medasr.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.hotwords);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.language);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.user_prompt);\n  
SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.system_prompt);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.tokenizer);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.embedding);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.llm);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.funasr_nano.encoder_adaptor);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.fire_red_asr_ctc.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.tokens);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.model_type);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.modeling_unit);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.bpe_vocab);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.telespeech_ctc);\n\n  SHERPA_ONNX_DELETE_C_STR(c.lm_config.model);\n\n  SHERPA_ONNX_DELETE_C_STR(c.decoding_method);\n  SHERPA_ONNX_DELETE_C_STR(c.hotwords_file);\n  SHERPA_ONNX_DELETE_C_STR(c.rule_fsts);\n  SHERPA_ONNX_DELETE_C_STR(c.rule_fars);\n  SHERPA_ONNX_DELETE_C_STR(c.hr.lexicon);\n  SHERPA_ONNX_DELETE_C_STR(c.hr.rule_fsts);\n}\n\nstatic Napi::External<SherpaOnnxOfflineRecognizer>\nCreateOfflineRecognizerWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  // the last argument is the NativeResourceManager\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflineRecognizerConfig c = ParseConfig(o);\n\n#if __OHOS__\n  const SherpaOnnxOfflineRecognizer *recognizer = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    recognizer = SherpaOnnxCreateOfflineRecognizerOHOS(&c, mgr.get());\n  } else {\n    recognizer = SherpaOnnxCreateOfflineRecognizer(&c);\n  }\n#else\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      SherpaOnnxCreateOfflineRecognizer(&c);\n#endif\n\n  FreeConfig(c);\n\n  if (!recognizer) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOfflineRecognizer>::New(\n      env, const_cast<SherpaOnnxOfflineRecognizer *>(recognizer),\n      [](Napi::Env env, SherpaOnnxOfflineRecognizer *recognizer) {\n        SherpaOnnxDestroyOfflineRecognizer(recognizer);\n      });\n}\n\nclass CreateRecognizerAsyncWorker : public Napi::AsyncWorker {\n public:\n  CreateRecognizerAsyncWorker(const Napi::Env &env,\n                              const 
SherpaOnnxOfflineRecognizerConfig &cfg,\n                              const Napi::Promise::Deferred &deferred)\n      : Napi::AsyncWorker(env), cfg_(cfg), deferred_(deferred) {}\n\n  void Execute() override {\n    recognizer_ = SherpaOnnxCreateOfflineRecognizer(&cfg_);\n    FreeConfig(cfg_);\n\n    if (!recognizer_) {\n      SetError(\"Failed to create offline recognizer\");\n    }\n  }\n\n  void OnOK() override {\n    Napi::Env env = Env();\n\n    deferred_.Resolve(Napi::External<SherpaOnnxOfflineRecognizer>::New(\n        env, const_cast<SherpaOnnxOfflineRecognizer *>(recognizer_),\n        [](Napi::Env /*env*/, SherpaOnnxOfflineRecognizer *r) {\n          SherpaOnnxDestroyOfflineRecognizer(r);\n        }));\n  }\n\n  void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); }\n\n private:\n  SherpaOnnxOfflineRecognizerConfig cfg_;\n  const SherpaOnnxOfflineRecognizer *recognizer_ = nullptr;\n  Napi::Promise::Deferred deferred_;\n};\n\nNapi::Value CreateOfflineRecognizerAsyncWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1 || !info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expected config object\")\n        .ThrowAsJavaScriptException();\n    return env.Null();\n  }\n\n  SherpaOnnxOfflineRecognizerConfig cfg =\n      ParseConfig(info[0].As<Napi::Object>());\n\n  Napi::Promise::Deferred deferred = Napi::Promise::Deferred::New(env);\n\n  auto *worker = new CreateRecognizerAsyncWorker(env, cfg, deferred);\n  worker->Queue();\n\n  return deferred.Promise();\n}\n\nstatic Napi::External<SherpaOnnxOfflineStream> CreateOfflineStreamWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env,\n        \"You should pass an offline recognizer pointer as the only argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>().Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxCreateOfflineStream(recognizer);\n\n  return Napi::External<SherpaOnnxOfflineStream>::New(\n      env, const_cast<SherpaOnnxOfflineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOfflineStream *stream) {\n        SherpaOnnxDestroyOfflineStream(stream);\n      });\n}\n\nstatic void AcceptWaveformOfflineWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      info[0].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"samples\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field samples\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Get(\"samples\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['samples'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Has(\"sampleRate\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field sampleRate\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Get(\"sampleRate\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['sampleRate'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Float32Array samples = obj.Get(\"samples\").As<Napi::Float32Array>();\n  int32_t sample_rate = obj.Get(\"sampleRate\").As<Napi::Number>().Int32Value();\n\n#if __OHOS__\n  // Note(fangjun): For unknown reasons on HarmonyOS, we need to divide it by\n  // sizeof(float) here\n  SherpaOnnxAcceptWaveformOffline(stream, sample_rate, samples.Data(),\n                                  samples.ElementLength() / sizeof(float));\n#else\n  SherpaOnnxAcceptWaveformOffline(stream, sample_rate, samples.Data(),\n                                  samples.ElementLength());\n#endif\n}\n\nstatic void 
OfflineRecognizerSetConfigWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an offline recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the second argument\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Object o = info[1].As<Napi::Object>();\n  SherpaOnnxOfflineRecognizerConfig c = ParseConfig(o);\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>().Data();\n\n  SherpaOnnxOfflineRecognizerSetConfig(recognizer, &c);\n\n  FreeConfig(c);\n}\n\nclass DecodeOfflineStreamAsyncWorker : public Napi::AsyncWorker {\n public:\n  DecodeOfflineStreamAsyncWorker(Napi::Env env,\n                                 const SherpaOnnxOfflineRecognizer *recognizer,\n                                 const SherpaOnnxOfflineStream *stream,\n                                 Napi::Promise::Deferred deferred)\n      : Napi::AsyncWorker(env),\n        recognizer_(recognizer),\n        stream_(stream),\n        deferred_(deferred) {}\n\n  void Execute() override {\n    try {\n      SherpaOnnxDecodeOfflineStream(recognizer_, stream_);\n    } catch (const std::exception &e) {\n      SetError(e.what());\n    }\n  }\n\n  void OnOK() override {\n    const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream_);\n    Napi::String s = Napi::String::New(Env(), json);\n    SherpaOnnxDestroyOfflineStreamResultJson(json);\n    deferred_.Resolve(s);\n  }\n\n  void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); }\n\n private:\n 
 const SherpaOnnxOfflineRecognizer *recognizer_;\n  const SherpaOnnxOfflineStream *stream_;\n  Napi::Promise::Deferred deferred_;\n};\n\nstatic Napi::Value DecodeOfflineStreamAsyncWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 2 arguments. Given: \" << info.Length();\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n    return env.Null();\n  }\n\n  if (!info[0].IsExternal() || !info[1].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Expected recognizer and stream as external pointers\")\n        .ThrowAsJavaScriptException();\n    return env.Null();\n  }\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>().Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  Napi::Promise::Deferred deferred = Napi::Promise::Deferred::New(env);\n\n  // no need to free worker by ourselves\n  auto worker =\n      new DecodeOfflineStreamAsyncWorker(env, recognizer, stream, deferred);\n\n  worker->Queue();\n\n  return deferred.Promise();\n}\n\nstatic void DecodeOfflineStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an offline recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an offline stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOfflineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>().Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  SherpaOnnxDecodeOfflineStream(recognizer, stream);\n}\n\nstatic Napi::String GetOfflineStreamResultAsJsonWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineStream *stream =\n      info[0].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream);\n  Napi::String s = Napi::String::New(env, json);\n\n  SherpaOnnxDestroyOfflineStreamResultJson(json);\n\n  return s;\n}\n\nvoid InitNonStreamingAsr(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOfflineRecognizer\"),\n              Napi::Function::New(env, CreateOfflineRecognizerWrapper));\n\n  exports.Set(Napi::String::New(env, \"createOfflineRecognizerAsync\"),\n              Napi::Function::New(env, CreateOfflineRecognizerAsyncWrapper));\n\n  exports.Set(Napi::String::New(env, \"createOfflineStream\"),\n              Napi::Function::New(env, CreateOfflineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"acceptWaveformOffline\"),\n              Napi::Function::New(env, AcceptWaveformOfflineWrapper));\n\n  exports.Set(Napi::String::New(env, \"decodeOfflineStream\"),\n              Napi::Function::New(env, DecodeOfflineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"decodeOfflineStreamAsync\"),\n              Napi::Function::New(env, DecodeOfflineStreamAsyncWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlineRecognizerSetConfig\"),\n              Napi::Function::New(env, OfflineRecognizerSetConfigWrapper));\n\n  exports.Set(Napi::String::New(env, \"getOfflineStreamResultAsJson\"),\n              Napi::Function::New(env, GetOfflineStreamResultAsJsonWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-speaker-diarization.cc",
    "content": "// scripts/node-addon-api/src/non-streaming-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig\nGetOfflineSpeakerSegmentationPyannoteModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"pyannote\") || !obj.Get(\"pyannote\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"pyannote\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineSpeakerSegmentationModelConfig\nGetOfflineSpeakerSegmentationModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineSpeakerSegmentationModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"segmentation\") || !obj.Get(\"segmentation\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"segmentation\").As<Napi::Object>();\n\n  c.pyannote = GetOfflineSpeakerSegmentationPyannoteModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic SherpaOnnxSpeakerEmbeddingExtractorConfig\nGetSpeakerEmbeddingExtractorConfig(Napi::Object obj) {\n  SherpaOnnxSpeakerEmbeddingExtractorConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"embedding\") || !obj.Get(\"embedding\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = 
obj.Get(\"embedding\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic SherpaOnnxFastClusteringConfig GetFastClusteringConfig(\n    Napi::Object obj) {\n  SherpaOnnxFastClusteringConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"clustering\") || !obj.Get(\"clustering\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"clustering\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_clusters, numClusters);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(threshold, threshold);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxOfflineSpeakerDiarization>\nCreateOfflineSpeakerDiarizationWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflineSpeakerDiarizationConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.segmentation = GetOfflineSpeakerSegmentationModelConfig(o);\n  c.embedding = GetSpeakerEmbeddingExtractorConfig(o);\n  c.clustering = GetFastClusteringConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_duration_on, minDurationOn);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_duration_off, minDurationOff);\n\n#if __OHOS__\n  const SherpaOnnxOfflineSpeakerDiarization *sd = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    sd = SherpaOnnxCreateOfflineSpeakerDiarizationOHOS(&c, mgr.get());\n  } else {\n    sd = SherpaOnnxCreateOfflineSpeakerDiarization(&c);\n  }\n#else\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      SherpaOnnxCreateOfflineSpeakerDiarization(&c);\n#endif\n\n  SHERPA_ONNX_DELETE_C_STR(c.segmentation.pyannote.model);\n  SHERPA_ONNX_DELETE_C_STR(c.segmentation.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.embedding.model);\n  SHERPA_ONNX_DELETE_C_STR(c.embedding.provider);\n\n  if (!sd) {\n    Napi::TypeError::New(env, \"Please check 
your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOfflineSpeakerDiarization>::New(\n      env, const_cast<SherpaOnnxOfflineSpeakerDiarization *>(sd),\n      [](Napi::Env env, SherpaOnnxOfflineSpeakerDiarization *sd) {\n        SherpaOnnxDestroyOfflineSpeakerDiarization(sd);\n      });\n}\n\nstatic Napi::Number OfflineSpeakerDiarizationGetSampleRateWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speaker diarization pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeakerDiarization>>().Data();\n\n  int32_t sample_rate = SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(sd);\n\n  return Napi::Number::New(env, sample_rate);\n}\n\nstatic Napi::Array OfflineSpeakerDiarizationProcessWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speaker diarization pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeakerDiarization>>().Data();\n\n  if (!info[1].IsTypedArray()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array samples = info[1].As<Napi::Float32Array>();\n\n#if __OHOS__\n  // Note(fangjun): For unknown reasons on HarmonyOS, we need to divide it by\n  // sizeof(float) here\n  const SherpaOnnxOfflineSpeakerDiarizationResult *r =\n      SherpaOnnxOfflineSpeakerDiarizationProcess(\n          sd, samples.Data(), samples.ElementLength() / sizeof(float));\n#else\n  const SherpaOnnxOfflineSpeakerDiarizationResult *r =\n      SherpaOnnxOfflineSpeakerDiarizationProcess(sd, samples.Data(),\n                                                 samples.ElementLength());\n#endif\n\n  int32_t num_segments =\n      SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(r);\n\n  const SherpaOnnxOfflineSpeakerDiarizationSegment *segments =\n      SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(r);\n\n  Napi::Array ans = Napi::Array::New(env, num_segments);\n\n  for (int32_t i = 0; i != num_segments; ++i) {\n    Napi::Object obj = Napi::Object::New(env);\n\n    obj.Set(Napi::String::New(env, \"start\"), segments[i].start);\n    obj.Set(Napi::String::New(env, \"end\"), segments[i].end);\n    obj.Set(Napi::String::New(env, \"speaker\"), segments[i].speaker);\n\n    ans.Set(i, obj);\n  }\n\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment(segments);\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult(r);\n\n  return ans;\n}\n\nstruct 
SpeakerDiarizationCallbackData {\n  int32_t num_processed_chunks;\n  int32_t num_total_chunks;\n};\n\n// see\n// https://github.com/nodejs/node-addon-examples/blob/main/src/6-threadsafe-function/typed_threadsafe_function/node-addon-api/clock.cc\nstatic void InvokeJsCallback(Napi::Env env, Napi::Function callback,\n                             Napi::Reference<Napi::Value> *context,\n                             SpeakerDiarizationCallbackData *data) {\n  if (env != nullptr) {\n    if (callback != nullptr) {\n      Napi::Number num_processed_chunks =\n          Napi::Number::New(env, data->num_processed_chunks);\n      Napi::Number num_total_chunks =\n          Napi::Number::New(env, data->num_total_chunks);\n\n      callback.Call(context->Value(), {num_processed_chunks, num_total_chunks});\n    }\n  }\n  delete data;\n}\n\nusing TSFN = Napi::TypedThreadSafeFunction<Napi::Reference<Napi::Value>,\n                                           SpeakerDiarizationCallbackData,\n                                           InvokeJsCallback>;\n\nclass SpeakerDiarizationProcessWorker : public Napi::AsyncWorker {\n public:\n  SpeakerDiarizationProcessWorker(const Napi::Env &env, TSFN tsfn,\n                                  const SherpaOnnxOfflineSpeakerDiarization *sd,\n                                  std::vector<float> samples)\n      : tsfn_(tsfn),\n        Napi::AsyncWorker{env, \"SpeakerDiarizationProcessAsyncWorker\"},\n        deferred_(env),\n        sd_(sd),\n        samples_(std::move(samples)) {}\n\n  Napi::Promise Promise() { return deferred_.Promise(); }\n\n protected:\n  void Execute() override {\n    auto callback = [](int32_t num_processed_chunks, int32_t num_total_chunks,\n                       void *arg) -> int32_t {\n      auto _this = reinterpret_cast<SpeakerDiarizationProcessWorker *>(arg);\n\n      auto data = new SpeakerDiarizationCallbackData;\n      data->num_processed_chunks = num_processed_chunks;\n      data->num_total_chunks = num_total_chunks;\n\n  
    _this->tsfn_.NonBlockingCall(data);\n\n      return 0;\n    };\n\n    r_ = SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(\n        sd_, samples_.data(), samples_.size(), callback, this);\n\n    tsfn_.Release();\n  }\n\n  void OnOK() override {\n    Napi::Env env = deferred_.Env();\n\n    int32_t num_segments =\n        SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(r_);\n\n    const SherpaOnnxOfflineSpeakerDiarizationSegment *segments =\n        SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(r_);\n\n    Napi::Array ans = Napi::Array::New(env, num_segments);\n\n    for (int32_t i = 0; i != num_segments; ++i) {\n      Napi::Object obj = Napi::Object::New(env);\n\n      obj.Set(Napi::String::New(env, \"start\"), segments[i].start);\n      obj.Set(Napi::String::New(env, \"end\"), segments[i].end);\n      obj.Set(Napi::String::New(env, \"speaker\"), segments[i].speaker);\n\n      ans.Set(i, obj);\n    }\n\n    SherpaOnnxOfflineSpeakerDiarizationDestroySegment(segments);\n    SherpaOnnxOfflineSpeakerDiarizationDestroyResult(r_);\n\n    deferred_.Resolve(ans);\n  }\n\n private:\n  TSFN tsfn_;\n  Napi::Promise::Deferred deferred_;\n  const SherpaOnnxOfflineSpeakerDiarization *sd_;\n  std::vector<float> samples_;\n  const SherpaOnnxOfflineSpeakerDiarizationResult *r_;\n};\n\nstatic Napi::Object OfflineSpeakerDiarizationProcessAsyncWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 3) {\n    std::ostringstream os;\n    os << \"Expect only 3 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speaker diarization pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeakerDiarization>>().Data();\n\n  if (!info[1].IsTypedArray()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[2].IsFunction()) {\n    Napi::TypeError::New(env, \"Argument 2 should be a function\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Function cb = info[2].As<Napi::Function>();\n\n  auto context =\n      new Napi::Reference<Napi::Value>(Napi::Persistent(info.This()));\n\n  TSFN tsfn = TSFN::New(\n      env,\n      cb,  // JavaScript function called asynchronously\n      \"SpeakerDiarizationProcessAsyncFunc\",  // Name\n      0,                                     // Unlimited queue\n      1,  // Only one thread will use this initially\n      context,\n      [](Napi::Env, void *, Napi::Reference<Napi::Value> *ctx) { delete ctx; });\n\n  Napi::Float32Array samples = info[1].As<Napi::Float32Array>();\n\n#if __OHOS__\n  int32_t num_samples = samples.ElementLength() / sizeof(float);\n#else\n  int32_t num_samples = samples.ElementLength();\n#endif\n  std::vector<float> v(num_samples);\n  std::copy(samples.Data(), samples.Data() + num_samples, v.begin());\n\n  SpeakerDiarizationProcessWorker *worker =\n      new SpeakerDiarizationProcessWorker(env, tsfn, sd, v);\n  worker->Queue();\n  return worker->Promise();\n}\n\nstatic void OfflineSpeakerDiarizationSetConfigWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << 
\"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speaker diarization pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOfflineSpeakerDiarization *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeakerDiarization>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Object o = info[1].As<Napi::Object>();\n\n  SherpaOnnxOfflineSpeakerDiarizationConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.clustering = GetFastClusteringConfig(o);\n  SherpaOnnxOfflineSpeakerDiarizationSetConfig(sd, &c);\n}\n\nvoid InitNonStreamingSpeakerDiarization(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOfflineSpeakerDiarization\"),\n              Napi::Function::New(env, CreateOfflineSpeakerDiarizationWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"getOfflineSpeakerDiarizationSampleRate\"),\n      Napi::Function::New(env, OfflineSpeakerDiarizationGetSampleRateWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"offlineSpeakerDiarizationProcess\"),\n      Napi::Function::New(env, OfflineSpeakerDiarizationProcessWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"offlineSpeakerDiarizationProcessAsync\"),\n      Napi::Function::New(env, OfflineSpeakerDiarizationProcessAsyncWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"offlineSpeakerDiarizationSetConfig\"),\n      Napi::Function::New(env, OfflineSpeakerDiarizationSetConfigWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-speech-denoiser.cc",
    "content": "// scripts/node-addon-api/src/non-streaming-speech-denoiser.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include <memory>\n#include <sstream>\n\n#include \"napi.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n#include \"speech-denoiser.h\"  // NOLINT\n\nstatic Napi::External<SherpaOnnxOfflineSpeechDenoiser>\nCreateOfflineSpeechDenoiserWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  // the last argument is the NativeResourceManager\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflineSpeechDenoiserConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model = GetSpeechDenoiserModelConfig(o);\n\n#if __OHOS__\n  std::unique_ptr<NativeResourceManager,\n                  decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n      mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n          &OH_ResourceManager_ReleaseNativeResourceManager);\n\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      SherpaOnnxCreateOfflineSpeechDenoiserOHOS(&c, mgr.get());\n#else\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      SherpaOnnxCreateOfflineSpeechDenoiser(&c);\n#endif\n\n  DeleteSpeechDenoiserModelConfig(c.model);\n\n  if (!sd) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return 
Napi::External<SherpaOnnxOfflineSpeechDenoiser>::New(\n      env, const_cast<SherpaOnnxOfflineSpeechDenoiser *>(sd),\n      [](Napi::Env env, SherpaOnnxOfflineSpeechDenoiser *sd) {\n        SherpaOnnxDestroyOfflineSpeechDenoiser(sd);\n      });\n}\n\nstatic Napi::Object OfflineSpeechDenoiserRunWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeechDenoiser>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"samples\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field samples\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"samples\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['samples'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"sampleRate\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field sampleRate\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"sampleRate\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['sampleRate'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array samples = obj.Get(\"samples\").As<Napi::Float32Array>();\n  
int32_t sample_rate = obj.Get(\"sampleRate\").As<Napi::Number>().Int32Value();\n\n  const SherpaOnnxDenoisedAudio *audio;\n\n  audio = SherpaOnnxOfflineSpeechDenoiserRun(\n      sd, samples.Data(), GetFloat32ArrayElementLength(samples), sample_rate);\n\n  return CreateDenoisedAudioObject(env, audio, GetEnableExternalBuffer(obj));\n}\n\nstatic Napi::Number OfflineSpeechDenoiserGetSampleRateWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOfflineSpeechDenoiser>>().Data();\n\n  int32_t sample_rate = SherpaOnnxOfflineSpeechDenoiserGetSampleRate(sd);\n\n  return Napi::Number::New(env, sample_rate);\n}\n\nvoid InitNonStreamingSpeechDenoiser(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOfflineSpeechDenoiser\"),\n              Napi::Function::New(env, CreateOfflineSpeechDenoiserWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlineSpeechDenoiserRunWrapper\"),\n              Napi::Function::New(env, OfflineSpeechDenoiserRunWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"offlineSpeechDenoiserGetSampleRateWrapper\"),\n      Napi::Function::New(env, OfflineSpeechDenoiserGetSampleRateWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-tts.cc",
    "content": "// scripts/node-addon-api/src/non-streaming-tts.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <algorithm>\n#include <atomic>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n#define SHERPA_ONNX_ASSIGN_TTS_ATTR()                                  \\\n  do {                                                                 \\\n    SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fsts, ruleFsts);                  \\\n    SHERPA_ONNX_ASSIGN_ATTR_INT32(max_num_sentences, maxNumSentences); \\\n    SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fars, ruleFars);                  \\\n    SHERPA_ONNX_ASSIGN_ATTR_FLOAT(silence_scale, silenceScale);        \\\n  } while (0)\n\n#define SHERPA_ONNX_DELETE_TTS_C_STR()                          \\\n  do {                                                          \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.vits.model);               \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.vits.lexicon);             \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.vits.tokens);              \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.vits.data_dir);            \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.matcha.acoustic_model);    \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.matcha.vocoder);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.matcha.lexicon);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.matcha.tokens);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.matcha.data_dir);          \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kitten.model);             \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kitten.voices);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kitten.tokens);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kitten.data_dir);          \\\n                             
                                    \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.tokens);          \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.encoder);         \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.decoder);         \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.vocoder);         \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.data_dir);        \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.zipvoice.lexicon);         \\\n                                                                 \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.model);             \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.voices);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.tokens);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.data_dir);          \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.lexicon);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.kokoro.lang);              \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.lm_flow);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.lm_main);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.encoder);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.decoder);           \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.text_conditioner);  \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.vocab_json);        \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.pocket.token_scores_json); \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.duration_predictor);  \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.text_encoder);        \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.vector_estimator);    \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.vocoder);             \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.tts_json);            \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.unicode_indexer);     \\\n    
SHERPA_ONNX_DELETE_C_STR(c.model.supertonic.voice_style);         \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.model.provider);                 \\\n                                                                \\\n    SHERPA_ONNX_DELETE_C_STR(c.rule_fsts);                      \\\n    SHERPA_ONNX_DELETE_C_STR(c.rule_fars);                      \\\n  } while (0)\n\n#define SHERPA_ONNX_DELETE_GENERATION_C_STR(c)  \\\n  do {                                          \\\n    SHERPA_ONNX_DELETE_C_STR(c.reference_text); \\\n    SHERPA_ONNX_DELETE_C_STR(c.extra);          \\\n    if (c.reference_audio) {                    \\\n      delete[] c.reference_audio;               \\\n    }                                           \\\n  } while (0)\n\nstatic std::string JsObjectToJson(Napi::Env env, const Napi::Object &obj) {\n  Napi::Object json = env.Global().Get(\"JSON\").As<Napi::Object>();\n  Napi::Function stringify = json.Get(\"stringify\").As<Napi::Function>();\n  return stringify.Call(json, {obj}).As<Napi::String>().Utf8Value();\n}\n\nstatic SherpaOnnxGenerationConfig GetGenerationConfig(Napi::Object o) {\n  SherpaOnnxGenerationConfig c;\n  memset(&c, 0, sizeof(c));\n\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(silence_scale, silenceScale);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(speed, speed);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(sid, sid);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_steps, numSteps);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(reference_text, referenceText);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(reference_sample_rate, referenceSampleRate);\n\n  if (o.Has(\"referenceAudio\") && o.Get(\"referenceAudio\").IsTypedArray()) {\n    auto arr = o.Get(\"referenceAudio\").As<Napi::Float32Array>();\n    int32_t n = arr.ElementLength();\n\n    if (n > 0) {\n      float *buf = new float[n];\n      std::copy(arr.Data(), arr.Data() + n, buf);\n\n      c.reference_audio = buf;\n      c.reference_audio_len = n;\n    }\n  }\n\n  if (o.Has(\"extra\") && 
o.Get(\"extra\").IsObject()) {\n    std::string s = JsObjectToJson(o.Env(), o.Get(\"extra\").As<Napi::Object>());\n\n    char *p = new char[s.size() + 1];\n    std::copy(s.begin(), s.end(), p);\n    p[s.size()] = '\\0';\n\n    c.extra = p;\n  }\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsVitsModelConfig GetOfflineTtsVitsModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTtsVitsModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"vits\") || !obj.Get(\"vits\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"vits\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lexicon, lexicon);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(data_dir, dataDir);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(noise_scale, noiseScale);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(noise_scale_w, noiseScaleW);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(length_scale, lengthScale);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsMatchaModelConfig GetOfflineTtsMatchaModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTtsMatchaModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"matcha\") || !obj.Get(\"matcha\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"matcha\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(acoustic_model, acousticModel);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(vocoder, vocoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lexicon, lexicon);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(data_dir, dataDir);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(noise_scale, noiseScale);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(length_scale, lengthScale);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsKokoroModelConfig GetOfflineTtsKokoroModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTtsKokoroModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"kokoro\") || !obj.Get(\"kokoro\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = 
obj.Get(\"kokoro\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(voices, voices);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(data_dir, dataDir);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(length_scale, lengthScale);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lexicon, lexicon);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lang, lang);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsKittenModelConfig GetOfflineTtsKittenModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTtsKittenModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"kitten\") || !obj.Get(\"kitten\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"kitten\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(voices, voices);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(data_dir, dataDir);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(length_scale, lengthScale);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsZipvoiceModelConfig\nGetOfflineTtsZipvoiceModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineTtsZipvoiceModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"zipvoice\") || !obj.Get(\"zipvoice\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"zipvoice\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(vocoder, vocoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(data_dir, dataDir);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lexicon, lexicon);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(feat_scale, featScale);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(t_shift, tShift);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(target_rms, targetRms);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(guidance_scale, guidanceScale);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsPocketModelConfig GetOfflineTtsPocketModelConfig(\n    Napi::Object obj) {\n  
SherpaOnnxOfflineTtsPocketModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"pocket\") || !obj.Get(\"pocket\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"pocket\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lm_flow, lmFlow);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lm_main, lmMain);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(text_conditioner, textConditioner);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(vocab_json, vocabJson);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(token_scores_json, tokenScoresJson);\n\n  if (o.Has(\"voiceEmbeddingCacheCapacity\")) {\n    c.voice_embedding_cache_capacity =\n        o.Get(\"voiceEmbeddingCacheCapacity\").As<Napi::Number>().Int32Value();\n  } else {\n    c.voice_embedding_cache_capacity = 50;\n  }\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsSupertonicModelConfig\nGetOfflineTtsSupertonicModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineTtsSupertonicModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"supertonic\") || !obj.Get(\"supertonic\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"supertonic\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(duration_predictor, durationPredictor);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(text_encoder, textEncoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(vector_estimator, vectorEstimator);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(vocoder, vocoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tts_json, ttsJson);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(unicode_indexer, unicodeIndexer);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(voice_style, voiceStyle);\n\n  return c;\n}\n\nstatic SherpaOnnxOfflineTtsModelConfig GetOfflineTtsModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflineTtsModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"model\") || !obj.Get(\"model\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"model\").As<Napi::Object>();\n\n  c.vits = 
GetOfflineTtsVitsModelConfig(o);\n  c.matcha = GetOfflineTtsMatchaModelConfig(o);\n  c.kokoro = GetOfflineTtsKokoroModelConfig(o);\n  c.kitten = GetOfflineTtsKittenModelConfig(o);\n  c.zipvoice = GetOfflineTtsZipvoiceModelConfig(o);\n  c.pocket = GetOfflineTtsPocketModelConfig(o);\n  c.supertonic = GetOfflineTtsSupertonicModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\n// Async worker for creating OfflineTts\nclass CreateOfflineTtsAsyncWorker : public Napi::AsyncWorker {\n public:\n  CreateOfflineTtsAsyncWorker(Napi::Env env,\n                              const SherpaOnnxOfflineTtsConfig &config)\n      : Napi::AsyncWorker(env),\n        deferred_(Napi::Promise::Deferred::New(env)),\n        config_(config) {}\n\n  Napi::Promise Promise() { return deferred_.Promise(); }\n\n protected:\n  void Execute() override {\n    // Create OfflineTts\n    tts_ = SherpaOnnxCreateOfflineTts(&config_);\n    if (!tts_) {\n      SetError(\"Failed to create OfflineTts. 
Check your config!\");\n    }\n  }\n\n  void OnOK() override {\n    Napi::Env env = Env();\n    deferred_.Resolve(Napi::External<SherpaOnnxOfflineTts>::New(\n        env, const_cast<SherpaOnnxOfflineTts *>(tts_),\n        [](Napi::Env, SherpaOnnxOfflineTts *ptr) {\n          SherpaOnnxDestroyOfflineTts(ptr);\n        }));\n  }\n\n  void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); }\n\n  ~CreateOfflineTtsAsyncWorker() override {\n    SherpaOnnxOfflineTtsConfig &c = config_;\n\n    SHERPA_ONNX_DELETE_TTS_C_STR();\n  }\n\n private:\n  SherpaOnnxOfflineTtsConfig config_;\n  const SherpaOnnxOfflineTts *tts_ = nullptr;\n  Napi::Promise::Deferred deferred_;\n};\n\n// JS wrapper\nstatic Napi::Value CreateOfflineTtsAsyncWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1 || !info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect 1 object argument for config\")\n        .ThrowAsJavaScriptException();\n    return env.Null();\n  }\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflineTtsConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.model = GetOfflineTtsModelConfig(o);\n  SHERPA_ONNX_ASSIGN_TTS_ATTR();\n\n  auto *worker = new CreateOfflineTtsAsyncWorker(env, c);\n  worker->Queue();\n  return worker->Promise();\n}\n\nstatic Napi::External<SherpaOnnxOfflineTts> CreateOfflineTtsWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  // the last argument is the NativeResourceManager\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflineTtsConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.model = GetOfflineTtsModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_TTS_ATTR();\n\n#if __OHOS__\n  const SherpaOnnxOfflineTts *tts = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n    tts = SherpaOnnxCreateOfflineTtsOHOS(&c, mgr.get());\n  } else {\n    tts = SherpaOnnxCreateOfflineTts(&c);\n  }\n#else\n  const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&c);\n#endif\n\n  SHERPA_ONNX_DELETE_TTS_C_STR();\n\n  if (!tts) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOfflineTts>::New(\n      env, const_cast<SherpaOnnxOfflineTts *>(tts),\n      [](Napi::Env env, SherpaOnnxOfflineTts *tts) {\n        SherpaOnnxDestroyOfflineTts(tts);\n      });\n}\n\nstatic Napi::Number OfflineTtsSampleRateWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect 
only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline tts pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n\n  int32_t sample_rate = SherpaOnnxOfflineTtsSampleRate(tts);\n\n  return Napi::Number::New(env, sample_rate);\n}\n\nstatic Napi::Number OfflineTtsNumSpeakersWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline tts pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n\n  int32_t num_speakers = SherpaOnnxOfflineTtsNumSpeakers(tts);\n\n  return Napi::Number::New(env, num_speakers);\n}\n\n// synchronous version\nstatic Napi::Object OfflineTtsGenerateWithConfigWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    Napi::TypeError::New(env, \"Expect 2 arguments\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 must be OfflineTts handle\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 must be an object\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      
info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"text\") || !obj.Get(\"text\").IsString()) {\n    Napi::TypeError::New(env, \"obj.text must be a string\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  std::string text = obj.Get(\"text\").As<Napi::String>().Utf8Value();\n\n  bool enable_external_buffer = true;\n  if (obj.Has(\"enableExternalBuffer\") &&\n      obj.Get(\"enableExternalBuffer\").IsBoolean()) {\n    enable_external_buffer =\n        obj.Get(\"enableExternalBuffer\").As<Napi::Boolean>().Value();\n  }\n\n  Napi::Object genObj =\n      obj.Has(\"generationConfig\") && obj.Get(\"generationConfig\").IsObject()\n          ? obj.Get(\"generationConfig\").As<Napi::Object>()\n          : Napi::Object::New(env);\n\n  SherpaOnnxGenerationConfig gen_config = GetGenerationConfig(genObj);\n\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(tts, text.c_str(), &gen_config,\n                                             nullptr, nullptr);\n\n  SHERPA_ONNX_DELETE_GENERATION_C_STR(gen_config);\n\n  if (!audio) {\n    Napi::Error::New(env, \"TTS generation failed\").ThrowAsJavaScriptException();\n    return {};\n  }\n\n  Napi::Object result = Napi::Object::New(env);\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer buffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(audio->samples), sizeof(float) * audio->n,\n        [](Napi::Env, void *, const SherpaOnnxGeneratedAudio *hint) {\n          SherpaOnnxDestroyOfflineTtsGeneratedAudio(hint);\n        },\n        audio);\n\n    result.Set(\"samples\", Napi::Float32Array::New(env, audio->n, buffer, 0));\n  } else {\n    Napi::ArrayBuffer buffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * audio->n);\n\n    auto arr = Napi::Float32Array::New(env, audio->n, buffer, 0);\n\n    std::copy(audio->samples, audio->samples + audio->n, arr.Data());\n\n    
    // Read sample_rate before freeing audio to avoid a use-after-free.\n    result.Set(\"sampleRate\", audio->sample_rate);\n    SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n\n    result.Set(\"samples\", arr);\n    return result;\n  }\n\n  result.Set(\"sampleRate\", audio->sample_rate);\n  return result;\n}\n\nstatic Napi::Object OfflineTtsGenerateWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline tts pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"text\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field text\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"text\").IsString()) {\n    Napi::TypeError::New(env, \"The object['text'] should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"sid\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field sid\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"sid\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['sid'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"speed\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field speed\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"speed\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['speed'] should be a number\")\n  
      .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (obj.Has(\"enableExternalBuffer\") &&\n      obj.Get(\"enableExternalBuffer\").IsBoolean()) {\n    enable_external_buffer =\n        obj.Get(\"enableExternalBuffer\").As<Napi::Boolean>().Value();\n  }\n\n  Napi::String _text = obj.Get(\"text\").As<Napi::String>();\n  std::string text = _text.Utf8Value();\n  int32_t sid = obj.Get(\"sid\").As<Napi::Number>().Int32Value();\n  float speed = obj.Get(\"speed\").As<Napi::Number>().FloatValue();\n\n  const SherpaOnnxGeneratedAudio *audio;\n  audio = SherpaOnnxOfflineTtsGenerate(tts, text.c_str(), sid, speed);\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(audio->samples), sizeof(float) * audio->n,\n        [](Napi::Env /*env*/, void * /*data*/,\n           const SherpaOnnxGeneratedAudio *hint) {\n          SherpaOnnxDestroyOfflineTtsGeneratedAudio(hint);\n        },\n        audio);\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, audio->n, arrayBuffer, 0);\n\n    Napi::Object ans = Napi::Object::New(env);\n    ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n    ans.Set(Napi::String::New(env, \"sampleRate\"), audio->sample_rate);\n    return ans;\n  } else {\n    // don't use external buffer\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * audio->n);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, audio->n, arrayBuffer, 0);\n\n    std::copy(audio->samples, audio->samples + audio->n, float32Array.Data());\n\n    Napi::Object ans = Napi::Object::New(env);\n    ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n    ans.Set(Napi::String::New(env, \"sampleRate\"), audio->sample_rate);\n    SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n    return ans;\n  }\n}\n\nstruct TtsCallbackData {\n  std::vector<float> 
samples;\n  float progress;\n  std::atomic<bool> processed = {false};\n  std::atomic<bool> cancelled = {false};\n};\n\n// see\n// https://github.com/nodejs/node-addon-examples/blob/main/src/6-threadsafe-function/typed_threadsafe_function/node-addon-api/clock.cc\nstatic void InvokeJsCallback(Napi::Env env, Napi::Function callback,\n                             Napi::Reference<Napi::Value> *context,\n                             TtsCallbackData *data) {\n  if (env != nullptr) {\n    if (callback != nullptr) {\n      Napi::ArrayBuffer arrayBuffer =\n          Napi::ArrayBuffer::New(env, sizeof(float) * data->samples.size());\n\n      Napi::Float32Array float32Array =\n          Napi::Float32Array::New(env, data->samples.size(), arrayBuffer, 0);\n\n      std::copy(data->samples.begin(), data->samples.end(),\n                float32Array.Data());\n\n      Napi::Object arg = Napi::Object::New(env);\n      arg.Set(Napi::String::New(env, \"samples\"), float32Array);\n      arg.Set(Napi::String::New(env, \"progress\"), data->progress);\n\n      auto v = callback.Call(context->Value(), {arg});\n\n      if ((v.IsBoolean() && !v.As<Napi::Boolean>().Value()) ||\n          (v.IsNumber() && v.As<Napi::Number>().Int32Value() == 0)) {\n        data->cancelled = true;\n      } else {\n        data->cancelled = false;\n      }\n\n      data->processed = true;\n    }\n  }\n}\n\nusing TSFN = Napi::TypedThreadSafeFunction<Napi::Reference<Napi::Value>,\n                                           TtsCallbackData, InvokeJsCallback>;\n\nclass TtsGenerateWorker : public Napi::AsyncWorker {\n public:\n  TtsGenerateWorker(const Napi::Env &env, TSFN tsfn,\n                    const SherpaOnnxOfflineTts *tts, const std::string &text,\n                    float speed, int32_t sid, bool use_external_buffer)\n      : tsfn_(tsfn),\n        Napi::AsyncWorker{env, \"TtsGenerateWorker\"},\n        deferred_(env),\n        tts_(tts),\n        text_(text),\n        speed_(speed),\n        sid_(sid),\n    
    use_external_buffer_(use_external_buffer) {}\n\n  Napi::Promise Promise() { return deferred_.Promise(); }\n\n  ~TtsGenerateWorker() {\n    for (auto d : data_list_) {\n      delete d;\n    }\n  }\n\n protected:\n  void Execute() override {\n    auto callback = [](const float *samples, int32_t n, float progress,\n                       void *arg) -> int32_t {\n      TtsGenerateWorker *_this = reinterpret_cast<TtsGenerateWorker *>(arg);\n\n      for (auto it = _this->data_list_.begin();\n           it != _this->data_list_.end();) {\n        if ((*it)->processed) {\n          delete *it;\n          it = _this->data_list_.erase(it);\n        } else {\n          ++it;\n        }\n      }\n\n      for (auto d : _this->data_list_) {\n        if (d->cancelled) {\n#if __OHOS__\n          OH_LOG_INFO(LOG_APP, \"TtsGenerate is cancelled\");\n#endif\n          return 0;\n        }\n      }\n\n      auto data = new TtsCallbackData;\n      data->samples = std::vector<float>{samples, samples + n};\n      data->progress = progress;\n      _this->data_list_.push_back(data);\n\n      _this->tsfn_.NonBlockingCall(data);\n\n      return 1;\n    };\n    audio_ = SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n        tts_, text_.c_str(), sid_, speed_, callback, this);\n\n    tsfn_.Release();\n  }\n\n  void OnOK() override {\n    Napi::Env env = deferred_.Env();\n    Napi::Object ans = Napi::Object::New(env);\n    if (use_external_buffer_) {\n      Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n          env, const_cast<float *>(audio_->samples), sizeof(float) * audio_->n,\n          [](Napi::Env /*env*/, void * /*data*/,\n             const SherpaOnnxGeneratedAudio *hint) {\n            SherpaOnnxDestroyOfflineTtsGeneratedAudio(hint);\n          },\n          audio_);\n      Napi::Float32Array float32Array =\n          Napi::Float32Array::New(env, audio_->n, arrayBuffer, 0);\n\n      ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n      
ans.Set(Napi::String::New(env, \"sampleRate\"), audio_->sample_rate);\n    } else {\n      // don't use external buffer\n      Napi::ArrayBuffer arrayBuffer =\n          Napi::ArrayBuffer::New(env, sizeof(float) * audio_->n);\n\n      Napi::Float32Array float32Array =\n          Napi::Float32Array::New(env, audio_->n, arrayBuffer, 0);\n\n      std::copy(audio_->samples, audio_->samples + audio_->n,\n                float32Array.Data());\n\n      ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n      ans.Set(Napi::String::New(env, \"sampleRate\"), audio_->sample_rate);\n      SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio_);\n    }\n\n    deferred_.Resolve(ans);\n  }\n\n private:\n  TSFN tsfn_;\n  Napi::Promise::Deferred deferred_;\n  const SherpaOnnxOfflineTts *tts_;\n  std::string text_;\n  float speed_;\n  int32_t sid_;\n  bool use_external_buffer_;\n\n  const SherpaOnnxGeneratedAudio *audio_;\n\n  std::vector<TtsCallbackData *> data_list_;\n};\n\nstatic Napi::Object OfflineTtsGenerateAsyncWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an offline tts pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"text\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field text\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"text\").IsString()) {\n    Napi::TypeError::New(env, \"The object['text'] should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"sid\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field sid\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"sid\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['sid'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"speed\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field speed\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"speed\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['speed'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (obj.Has(\"enableExternalBuffer\") &&\n      obj.Get(\"enableExternalBuffer\").IsBoolean()) {\n    enable_external_buffer =\n        obj.Get(\"enableExternalBuffer\").As<Napi::Boolean>().Value();\n  }\n\n  Napi::String _text = obj.Get(\"text\").As<Napi::String>();\n  
std::string text = _text.Utf8Value();\n  int32_t sid = obj.Get(\"sid\").As<Napi::Number>().Int32Value();\n  float speed = obj.Get(\"speed\").As<Napi::Number>().FloatValue();\n\n  Napi::Function cb;\n  if (obj.Has(\"callback\") && obj.Get(\"callback\").IsFunction()) {\n    cb = obj.Get(\"callback\").As<Napi::Function>();\n  }\n\n  auto context =\n      new Napi::Reference<Napi::Value>(Napi::Persistent(info.This()));\n\n  TSFN tsfn = TSFN::New(\n      env,\n      cb,                 // JavaScript function called asynchronously\n      \"TtsGenerateFunc\",  // Name\n      0,                  // Unlimited queue\n      1,                  // Only one thread will use this initially\n      context,\n      [](Napi::Env, void *, Napi::Reference<Napi::Value> *ctx) { delete ctx; });\n\n  TtsGenerateWorker *worker = new TtsGenerateWorker(\n      env, tsfn, tts, text, speed, sid, enable_external_buffer);\n  worker->Queue();\n  return worker->Promise();\n}\n\n// Async worker for TTS generation with generationConfig\nclass TtsGenerateWithConfigWorker : public Napi::AsyncWorker {\n public:\n  TtsGenerateWithConfigWorker(const Napi::Env &env, TSFN tsfn,\n                              const SherpaOnnxOfflineTts *tts,\n                              const std::string &text,\n                              const SherpaOnnxGenerationConfig &gen_config,\n                              bool use_external_buffer)\n      : tsfn_(tsfn),\n        Napi::AsyncWorker(env, \"TtsGenerateWithConfigWorker\"),\n        deferred_(env),\n        tts_(tts),\n        text_(text),\n        gen_config_(gen_config),\n        use_external_buffer_(use_external_buffer) {}\n\n  Napi::Promise Promise() { return deferred_.Promise(); }\n\n  ~TtsGenerateWithConfigWorker() {\n    SHERPA_ONNX_DELETE_GENERATION_C_STR(gen_config_);\n    for (auto d : data_list_) delete d;\n  }\n\n protected:\n  void Execute() override {\n    auto callback = [](const float *samples, int32_t n, float progress,\n                       void 
*arg) -> int32_t {\n      TtsGenerateWithConfigWorker *_this =\n          reinterpret_cast<TtsGenerateWithConfigWorker *>(arg);\n\n      // Clean up processed chunks\n      for (auto it = _this->data_list_.begin();\n           it != _this->data_list_.end();) {\n        if ((*it)->processed) {\n          delete *it;\n          it = _this->data_list_.erase(it);\n        } else {\n          ++it;\n        }\n      }\n\n      // Cancel check\n      for (auto d : _this->data_list_) {\n        if (d->cancelled) return 0;\n      }\n\n      auto data = new TtsCallbackData;\n      data->samples = std::vector<float>{samples, samples + n};\n      data->progress = progress;\n      _this->data_list_.push_back(data);\n\n      _this->tsfn_.NonBlockingCall(data);\n\n      return 1;\n    };\n\n    audio_ = SherpaOnnxOfflineTtsGenerateWithConfig(\n        tts_, text_.c_str(), &gen_config_, callback, this);\n\n    tsfn_.Release();\n  }\n\n  void OnOK() override {\n    Napi::Env env = deferred_.Env();\n    Napi::Object ans = Napi::Object::New(env);\n    if (use_external_buffer_) {\n      Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n          env, const_cast<float *>(audio_->samples), sizeof(float) * audio_->n,\n          [](Napi::Env, void *, const SherpaOnnxGeneratedAudio *hint) {\n            SherpaOnnxDestroyOfflineTtsGeneratedAudio(hint);\n          },\n          audio_);\n      Napi::Float32Array float32Array =\n          Napi::Float32Array::New(env, audio_->n, arrayBuffer, 0);\n      ans.Set(\"samples\", float32Array);\n      ans.Set(\"sampleRate\", audio_->sample_rate);\n    } else {\n      Napi::ArrayBuffer arrayBuffer =\n          Napi::ArrayBuffer::New(env, sizeof(float) * audio_->n);\n      Napi::Float32Array float32Array =\n          Napi::Float32Array::New(env, audio_->n, arrayBuffer, 0);\n      std::copy(audio_->samples, audio_->samples + audio_->n,\n                float32Array.Data());\n      ans.Set(\"samples\", float32Array);\n      
ans.Set(\"sampleRate\", audio_->sample_rate);\n      SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio_);\n    }\n    deferred_.Resolve(ans);\n  }\n\n private:\n  TSFN tsfn_;\n  Napi::Promise::Deferred deferred_;\n  const SherpaOnnxOfflineTts *tts_;\n  std::string text_;\n  SherpaOnnxGenerationConfig gen_config_;\n  bool use_external_buffer_;\n  const SherpaOnnxGeneratedAudio *audio_;\n  std::vector<TtsCallbackData *> data_list_;\n};\n\nstatic Napi::Object OfflineTtsGenerateAsyncWithConfigWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2 || !info[0].IsExternal() || !info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Expect (External<OfflineTts>, Object)\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxOfflineTts *tts =\n      info[0].As<Napi::External<SherpaOnnxOfflineTts>>().Data();\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"text\") || !obj.Get(\"text\").IsString()) {\n    Napi::TypeError::New(env, \"obj.text must be a string\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  std::string text = obj.Get(\"text\").As<Napi::String>().Utf8Value();\n\n  bool enable_external_buffer = true;\n  if (obj.Has(\"enableExternalBuffer\") &&\n      obj.Get(\"enableExternalBuffer\").IsBoolean()) {\n    enable_external_buffer =\n        obj.Get(\"enableExternalBuffer\").As<Napi::Boolean>().Value();\n  }\n\n  Napi::Function cb;\n  if (obj.Has(\"callback\") && obj.Get(\"callback\").IsFunction()) {\n    cb = obj.Get(\"callback\").As<Napi::Function>();\n  }\n\n  auto context =\n      new Napi::Reference<Napi::Value>(Napi::Persistent(info.This()));\n  TSFN tsfn = TSFN::New(\n      env, cb, \"TtsGenerateWithConfig\", 0, 1, context,\n      [](Napi::Env, void *, Napi::Reference<Napi::Value> *ctx) { delete ctx; });\n\n  SherpaOnnxGenerationConfig gen_config;\n  memset(&gen_config, 0, sizeof(gen_config));\n  if (obj.Has(\"generationConfig\") && 
obj.Get(\"generationConfig\").IsObject()) {\n    gen_config =\n        GetGenerationConfig(obj.Get(\"generationConfig\").As<Napi::Object>());\n  }\n\n  TtsGenerateWithConfigWorker *worker = new TtsGenerateWithConfigWorker(\n      env, tsfn, tts, text, gen_config, enable_external_buffer);\n  worker->Queue();\n  return worker->Promise();\n}\n\nvoid InitNonStreamingTts(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOfflineTts\"),\n              Napi::Function::New(env, CreateOfflineTtsWrapper));\n\n  exports.Set(Napi::String::New(env, \"createOfflineTtsAsync\"),\n              Napi::Function::New(env, CreateOfflineTtsAsyncWrapper));\n\n  exports.Set(Napi::String::New(env, \"getOfflineTtsSampleRate\"),\n              Napi::Function::New(env, OfflineTtsSampleRateWrapper));\n\n  exports.Set(Napi::String::New(env, \"getOfflineTtsNumSpeakers\"),\n              Napi::Function::New(env, OfflineTtsNumSpeakersWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlineTtsGenerate\"),\n              Napi::Function::New(env, OfflineTtsGenerateWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlineTtsGenerateWithConfig\"),\n              Napi::Function::New(env, OfflineTtsGenerateWithConfigWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlineTtsGenerateAsync\"),\n              Napi::Function::New(env, OfflineTtsGenerateAsyncWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"offlineTtsGenerateAsyncWithConfig\"),\n      Napi::Function::New(env, OfflineTtsGenerateAsyncWithConfigWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/offline-punctuation.cc",
    "content": "// scripts/node-addon-api/src/offline-punctuation.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <cstring>\n#include <memory>\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic SherpaOnnxOfflinePunctuationModelConfig GetOfflinePunctuationModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOfflinePunctuationModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"model\") || !obj.Get(\"model\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"model\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(ct_transformer, ctTransformer);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxOfflinePunctuation>\nCreateOfflinePunctuationWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"You should pass an object as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOfflinePunctuationConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model = GetOfflinePunctuationModelConfig(o);\n\n#if __OHOS__\n  const SherpaOnnxOfflinePunctuation *punct = nullptr;\n\n  if (info.Length() == 1 || info[1].IsUndefined() || info[1].IsNull()) {\n    punct = SherpaOnnxCreateOfflinePunctuation(&c);\n  } else {\n    if (!info[1].IsObject()) {\n      Napi::TypeError::New(\n          env, \"You should pass a resource manager as the second argument.\")\n          .ThrowAsJavaScriptException();\n\n      SHERPA_ONNX_DELETE_C_STR(c.model.ct_transformer);\n      SHERPA_ONNX_DELETE_C_STR(c.model.provider);\n      return {};\n    }\n\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    punct = SherpaOnnxCreateOfflinePunctuationOHOS(&c, mgr.get());\n  }\n#else\n  const SherpaOnnxOfflinePunctuation *punct =\n      SherpaOnnxCreateOfflinePunctuation(&c);\n#endif\n\n  SHERPA_ONNX_DELETE_C_STR(c.model.ct_transformer);\n  SHERPA_ONNX_DELETE_C_STR(c.model.provider);\n\n  if (!punct) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOfflinePunctuation>::New(\n      env, const_cast<SherpaOnnxOfflinePunctuation *>(punct),\n      [](Napi::Env env, SherpaOnnxOfflinePunctuation *punct) {\n        SherpaOnnxDestroyOfflinePunctuation(punct);\n      });\n}\n\nstatic Napi::String 
OfflinePunctuationAddPunctWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env,\n        \"You should pass an offline punctuation pointer as the first argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsString()) {\n    Napi::TypeError::New(env, \"You should pass a string as the second argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOfflinePunctuation *punct =\n      info[0].As<Napi::External<SherpaOnnxOfflinePunctuation>>().Data();\n  Napi::String js_text = info[1].As<Napi::String>();\n  std::string text = js_text.Utf8Value();\n\n  const char *punct_text =\n      SherpaOfflinePunctuationAddPunct(punct, text.c_str());\n\n  Napi::String ans = Napi::String::New(env, punct_text);\n  SherpaOfflinePunctuationFreeText(punct_text);\n  return ans;\n}\n\nvoid InitOfflinePunctuation(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOfflinePunctuation\"),\n              Napi::Function::New(env, CreateOfflinePunctuationWrapper));\n\n  exports.Set(Napi::String::New(env, \"offlinePunctuationAddPunct\"),\n              Napi::Function::New(env, OfflinePunctuationAddPunctWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/online-punctuation.cc",
    "content": "// scripts/node-addon-api/src/online-punctuation.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <cstring>\n#include <memory>\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic SherpaOnnxOnlinePunctuationModelConfig GetOnlinePunctuationModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlinePunctuationModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"model\") || !obj.Get(\"model\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"model\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(cnn_bilstm, cnnBilstm);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(bpe_vocab, bpeVocab);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxOnlinePunctuation>\nCreateOnlinePunctuationWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"You should pass an object as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxOnlinePunctuationConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model = GetOnlinePunctuationModelConfig(o);\n\n#if __OHOS__\n  const SherpaOnnxOnlinePunctuation *punct = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    punct = SherpaOnnxCreateOnlinePunctuationOHOS(&c, mgr.get());\n  } else {\n    punct = SherpaOnnxCreateOnlinePunctuation(&c);\n  }\n#else\n  const SherpaOnnxOnlinePunctuation *punct =\n      SherpaOnnxCreateOnlinePunctuation(&c);\n#endif\n\n  SHERPA_ONNX_DELETE_C_STR(c.model.cnn_bilstm);\n  SHERPA_ONNX_DELETE_C_STR(c.model.bpe_vocab);\n  SHERPA_ONNX_DELETE_C_STR(c.model.provider);\n\n  if (!punct) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOnlinePunctuation>::New(\n      env, const_cast<SherpaOnnxOnlinePunctuation *>(punct),\n      [](Napi::Env env, SherpaOnnxOnlinePunctuation *punct) {\n        SherpaOnnxDestroyOnlinePunctuation(punct);\n      });\n}\n\nstatic 
Napi::String OnlinePunctuationAddPunctWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env,\n        \"You should pass an online punctuation pointer as the first argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsString()) {\n    Napi::TypeError::New(env, \"You should pass a string as the second argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOnlinePunctuation *punct =\n      info[0].As<Napi::External<SherpaOnnxOnlinePunctuation>>().Data();\n  Napi::String js_text = info[1].As<Napi::String>();\n  std::string text = js_text.Utf8Value();\n\n  const char *punct_text =\n      SherpaOnnxOnlinePunctuationAddPunct(punct, text.c_str());\n\n  Napi::String ans = Napi::String::New(env, punct_text);\n  SherpaOnnxOnlinePunctuationFreeText(punct_text);\n  return ans;\n}\n\nvoid InitOnlinePunctuation(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOnlinePunctuation\"),\n              Napi::Function::New(env, CreateOnlinePunctuationWrapper));\n\n  exports.Set(Napi::String::New(env, \"onlinePunctuationAddPunct\"),\n              Napi::Function::New(env, OnlinePunctuationAddPunctWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/sherpa-onnx-node-addon-api.cc",
    "content": "// scripts/node-addon-api/src/sherpa-onnx-node-addon-api.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"napi.h\"  // NOLINT\n\nvoid InitStreamingAsr(Napi::Env env, Napi::Object exports);\n\nvoid InitNonStreamingAsr(Napi::Env env, Napi::Object exports);\n\nvoid InitNonStreamingTts(Napi::Env env, Napi::Object exports);\n\nvoid InitVad(Napi::Env env, Napi::Object exports);\n\nvoid InitWaveReader(Napi::Env env, Napi::Object exports);\n\nvoid InitWaveWriter(Napi::Env env, Napi::Object exports);\n\nvoid InitSpokenLanguageID(Napi::Env env, Napi::Object exports);\n\nvoid InitSpeakerID(Napi::Env env, Napi::Object exports);\n\nvoid InitAudioTagging(Napi::Env env, Napi::Object exports);\n\nvoid InitOfflinePunctuation(Napi::Env env, Napi::Object exports);\n\nvoid InitOnlinePunctuation(Napi::Env env, Napi::Object exports);\n\nvoid InitKeywordSpotting(Napi::Env env, Napi::Object exports);\n\nvoid InitNonStreamingSpeakerDiarization(Napi::Env env, Napi::Object exports);\n\nvoid InitNonStreamingSpeechDenoiser(Napi::Env env, Napi::Object exports);\n\nvoid InitOnlineSpeechDenoiser(Napi::Env env, Napi::Object exports);\n\nvoid InitVersion(Napi::Env env, Napi::Object exports);\n\n#if __OHOS__\nvoid InitUtils(Napi::Env env, Napi::Object exports);\n#endif\n\nNapi::Object Init(Napi::Env env, Napi::Object exports) {\n  InitStreamingAsr(env, exports);\n  InitNonStreamingAsr(env, exports);\n  InitNonStreamingTts(env, exports);\n  InitVad(env, exports);\n  InitWaveReader(env, exports);\n  InitWaveWriter(env, exports);\n  InitSpokenLanguageID(env, exports);\n  InitSpeakerID(env, exports);\n  InitAudioTagging(env, exports);\n  InitOfflinePunctuation(env, exports);\n  InitOnlinePunctuation(env, exports);\n  InitKeywordSpotting(env, exports);\n  InitNonStreamingSpeakerDiarization(env, exports);\n  InitNonStreamingSpeechDenoiser(env, exports);\n  InitOnlineSpeechDenoiser(env, exports);\n  InitVersion(env, exports);\n\n#if __OHOS__\n  InitUtils(env, 
exports);\n#endif\n\n  return exports;\n}\n\n#if __OHOS__\nNODE_API_MODULE(sherpa_onnx, Init)\n#else\nNODE_API_MODULE(addon, Init)\n#endif\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/speaker-identification.cc",
    "content": "// scripts/node-addon-api/src/speaker-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>\nCreateSpeakerEmbeddingExtractorWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"You should pass an object as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxSpeakerEmbeddingExtractorConfig c;\n  memset(&c, 0, sizeof(c));\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = 
o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n#if __OHOS__\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    extractor = SherpaOnnxCreateSpeakerEmbeddingExtractorOHOS(&c, mgr.get());\n  } else {\n    extractor = SherpaOnnxCreateSpeakerEmbeddingExtractor(&c);\n  }\n#else\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor =\n      SherpaOnnxCreateSpeakerEmbeddingExtractor(&c);\n#endif\n  SHERPA_ONNX_DELETE_C_STR(c.model);\n  SHERPA_ONNX_DELETE_C_STR(c.provider);\n\n  if (!extractor) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>::New(\n      env, const_cast<SherpaOnnxSpeakerEmbeddingExtractor *>(extractor),\n      [](Napi::Env env, SherpaOnnxSpeakerEmbeddingExtractor *extractor) {\n        SherpaOnnxDestroySpeakerEmbeddingExtractor(extractor);\n      });\n}\n\nstatic Napi::Number SpeakerEmbeddingExtractorDimWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be a speaker embedding extractor pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>>().Data();\n\n  int32_t dim = SherpaOnnxSpeakerEmbeddingExtractorDim(extractor);\n\n  return Napi::Number::New(env, dim);\n}\n\nstatic Napi::External<SherpaOnnxOnlineStream>\nSpeakerEmbeddingExtractorCreateStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding extractor \"\n                         \"pointer as the only argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxSpeakerEmbeddingExtractorCreateStream(extractor);\n\n  return Napi::External<SherpaOnnxOnlineStream>::New(\n      env, const_cast<SherpaOnnxOnlineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOnlineStream *stream) {\n        SherpaOnnxDestroyOnlineStream(stream);\n      });\n}\n\nstatic Napi::Boolean SpeakerEmbeddingExtractorIsReadyWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be a speaker embedding extractor pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  int32_t is_ready =\n      SherpaOnnxSpeakerEmbeddingExtractorIsReady(extractor, stream);\n\n  return Napi::Boolean::New(env, is_ready);\n}\n\nstatic Napi::Float32Array SpeakerEmbeddingExtractorComputeEmbeddingWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2 && info.Length() != 3) {\n    std::ostringstream os;\n    os << \"Expect only 2 or 3 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be a speaker embedding extractor pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (info.Length() == 3) {\n    if (info[2].IsBoolean()) {\n      enable_external_buffer = info[2].As<Napi::Boolean>().Value();\n    } else {\n      Napi::TypeError::New(env, \"Argument 2 should be a boolean.\")\n          .ThrowAsJavaScriptException();\n\n      return {};\n    }\n  }\n\n  const SherpaOnnxSpeakerEmbeddingExtractor *extractor =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingExtractor>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  const float *v =\n      SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(extractor, stream);\n\n  int32_t dim = SherpaOnnxSpeakerEmbeddingExtractorDim(extractor);\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(v), sizeof(float) * dim,\n        [](Napi::Env /*env*/, void *data) {\n          SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(\n              reinterpret_cast<float *>(data));\n        });\n\n    return Napi::Float32Array::New(env, dim, arrayBuffer, 0);\n  } else {\n    // don't use external buffer\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * dim);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, dim, arrayBuffer, 0);\n\n    std::copy(v, v + dim, float32Array.Data());\n\n    SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v);\n\n    
return float32Array;\n  }\n}\n\nstatic Napi::External<SherpaOnnxSpeakerEmbeddingManager>\nCreateSpeakerEmbeddingManagerWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsNumber()) {\n    Napi::TypeError::New(env,\n                         \"You should pass an integer as the only argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  int32_t dim = info[0].As<Napi::Number>().Int32Value();\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      SherpaOnnxCreateSpeakerEmbeddingManager(dim);\n\n  if (!manager) {\n    Napi::TypeError::New(env, \"Please check your input dim!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxSpeakerEmbeddingManager>::New(\n      env, const_cast<SherpaOnnxSpeakerEmbeddingManager *>(manager),\n      [](Napi::Env env, SherpaOnnxSpeakerEmbeddingManager *manager) {\n        SherpaOnnxDestroySpeakerEmbeddingManager(manager);\n      });\n}\n\nstatic Napi::Boolean SpeakerEmbeddingManagerAddWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"v\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field v\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"v\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['v'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"name\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field name\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"name\").IsString()) {\n    Napi::TypeError::New(env, \"The object['name'] should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array v = obj.Get(\"v\").As<Napi::Float32Array>();\n  Napi::String js_name = obj.Get(\"name\").As<Napi::String>();\n  std::string name = js_name.Utf8Value();\n\n  int32_t ok =\n      SherpaOnnxSpeakerEmbeddingManagerAdd(manager, name.c_str(), v.Data());\n  return Napi::Boolean::New(env, ok);\n}\n\nstatic Napi::Boolean SpeakerEmbeddingManagerAddListFlattenedWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"vv\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field vv\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"vv\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['vv'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"name\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field name\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"name\").IsString()) {\n    Napi::TypeError::New(env, \"The object['name'] should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"n\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field n\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"n\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['n'] should be an integer\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array v = obj.Get(\"vv\").As<Napi::Float32Array>();\n  Napi::String js_name = obj.Get(\"name\").As<Napi::String>();\n  int32_t n = obj.Get(\"n\").As<Napi::Number>().Int32Value();\n\n  
std::string name = js_name.Utf8Value();\n\n  int32_t ok = SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(\n      manager, name.c_str(), v.Data(), n);\n\n  return Napi::Boolean::New(env, ok);\n}\n\nstatic Napi::Boolean SpeakerEmbeddingManagerRemoveWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsString()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::String js_name = info[1].As<Napi::String>();\n  std::string name = js_name.Utf8Value();\n\n  int32_t ok = SherpaOnnxSpeakerEmbeddingManagerRemove(manager, name.c_str());\n\n  return Napi::Boolean::New(env, ok);\n}\n\nstatic Napi::String SpeakerEmbeddingManagerSearchWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"v\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field v\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"v\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['v'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"threshold\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field threshold\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"threshold\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['threshold'] should be a float\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array v = obj.Get(\"v\").As<Napi::Float32Array>();\n  float threshold = obj.Get(\"threshold\").As<Napi::Number>().FloatValue();\n\n  const char *name =\n      SherpaOnnxSpeakerEmbeddingManagerSearch(manager, v.Data(), threshold);\n  const char *p = name;\n  if (!p) {\n    p = \"\";\n  }\n\n  Napi::String js_name = Napi::String::New(env, p);\n  SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name);\n\n  return js_name;\n}\n\nstatic Napi::Boolean SpeakerEmbeddingManagerVerifyWrapper(\n    const 
Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"v\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field v\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"v\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['v'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"threshold\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field threshold\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"threshold\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['threshold'] should be a float\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"name\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field name\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"name\").IsString()) {\n    Napi::TypeError::New(env, \"The object['name'] should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n 
 Napi::Float32Array v = obj.Get(\"v\").As<Napi::Float32Array>();\n  float threshold = obj.Get(\"threshold\").As<Napi::Number>().FloatValue();\n\n  Napi::String js_name = obj.Get(\"name\").As<Napi::String>();\n  std::string name = js_name.Utf8Value();\n\n  int32_t found = SherpaOnnxSpeakerEmbeddingManagerVerify(manager, name.c_str(),\n                                                          v.Data(), threshold);\n\n  return Napi::Boolean::New(env, found);\n}\n\nstatic Napi::Boolean SpeakerEmbeddingManagerContainsWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsString()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  Napi::String js_name = info[1].As<Napi::String>();\n  std::string name = js_name.Utf8Value();\n\n  int32_t exists =\n      SherpaOnnxSpeakerEmbeddingManagerContains(manager, name.c_str());\n\n  return Napi::Boolean::New(env, exists);\n}\n\nstatic Napi::Number SpeakerEmbeddingManagerNumSpeakersWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  int32_t num_speakers = SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(manager);\n\n  return Napi::Number::New(env, num_speakers);\n}\n\nstatic Napi::Array SpeakerEmbeddingManagerGetAllSpeakersWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"You should pass a speaker embedding manager pointer \"\n                         \"as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpeakerEmbeddingManager *manager =\n      info[0].As<Napi::External<SherpaOnnxSpeakerEmbeddingManager>>().Data();\n\n  int32_t num_speakers = SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(manager);\n  if (num_speakers == 0) {\n    return Napi::Array::New(env, num_speakers);\n  }\n\n  const char *const *all_speaker_names =\n      SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(manager);\n\n  Napi::Array ans = Napi::Array::New(env, num_speakers);\n  for (uint32_t i = 0; i != num_speakers; ++i) {\n    // ans[i] = Napi::String::New(env, all_speaker_names[i]); // see #2120\n    ans.Set(i, Napi::String::New(env, all_speaker_names[i]));\n  }\n  
SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(all_speaker_names);\n  return ans;\n}\n\nvoid InitSpeakerID(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createSpeakerEmbeddingExtractor\"),\n              Napi::Function::New(env, CreateSpeakerEmbeddingExtractorWrapper));\n\n  exports.Set(Napi::String::New(env, \"speakerEmbeddingExtractorDim\"),\n              Napi::Function::New(env, SpeakerEmbeddingExtractorDimWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingExtractorCreateStream\"),\n      Napi::Function::New(env, SpeakerEmbeddingExtractorCreateStreamWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingExtractorIsReady\"),\n      Napi::Function::New(env, SpeakerEmbeddingExtractorIsReadyWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingExtractorComputeEmbedding\"),\n      Napi::Function::New(env,\n                          SpeakerEmbeddingExtractorComputeEmbeddingWrapper));\n\n  exports.Set(Napi::String::New(env, \"createSpeakerEmbeddingManager\"),\n              Napi::Function::New(env, CreateSpeakerEmbeddingManagerWrapper));\n\n  exports.Set(Napi::String::New(env, \"speakerEmbeddingManagerAdd\"),\n              Napi::Function::New(env, SpeakerEmbeddingManagerAddWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingManagerAddListFlattened\"),\n      Napi::Function::New(env, SpeakerEmbeddingManagerAddListFlattenedWrapper));\n\n  exports.Set(Napi::String::New(env, \"speakerEmbeddingManagerRemove\"),\n              Napi::Function::New(env, SpeakerEmbeddingManagerRemoveWrapper));\n\n  exports.Set(Napi::String::New(env, \"speakerEmbeddingManagerSearch\"),\n              Napi::Function::New(env, SpeakerEmbeddingManagerSearchWrapper));\n\n  exports.Set(Napi::String::New(env, \"speakerEmbeddingManagerVerify\"),\n              Napi::Function::New(env, SpeakerEmbeddingManagerVerifyWrapper));\n\n  exports.Set(Napi::String::New(env, 
\"speakerEmbeddingManagerContains\"),\n              Napi::Function::New(env, SpeakerEmbeddingManagerContainsWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingManagerNumSpeakers\"),\n      Napi::Function::New(env, SpeakerEmbeddingManagerNumSpeakersWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"speakerEmbeddingManagerGetAllSpeakers\"),\n      Napi::Function::New(env, SpeakerEmbeddingManagerGetAllSpeakersWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/speech-denoiser.h",
    "content": "// scripts/node-addon-api/src/speech-denoiser.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_HARMONY_OS_SHERPAONNXHAR_SHERPA_ONNX_SRC_MAIN_CPP_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_HARMONY_OS_SHERPAONNXHAR_SHERPA_ONNX_SRC_MAIN_CPP_SPEECH_DENOISER_H_\n\n#include <algorithm>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic inline SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig\nGetSpeechDenoiserGtcrnModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"gtcrn\") || !obj.Get(\"gtcrn\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"gtcrn\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  return c;\n}\n\nstatic inline SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig\nGetSpeechDenoiserDpdfNetModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"dpdfnet\") || !obj.Get(\"dpdfnet\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"dpdfnet\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  return c;\n}\n\nstatic inline SherpaOnnxOfflineSpeechDenoiserModelConfig\nGetSpeechDenoiserModelConfig(Napi::Object obj) {\n  SherpaOnnxOfflineSpeechDenoiserModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"model\") || !obj.Get(\"model\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"model\").As<Napi::Object>();\n  c.gtcrn = GetSpeechDenoiserGtcrnModelConfig(o);\n  c.dpdfnet = GetSpeechDenoiserDpdfNetModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug 
= o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  return c;\n}\n\nstatic inline void DeleteSpeechDenoiserModelConfig(\n    const SherpaOnnxOfflineSpeechDenoiserModelConfig &c) {\n  SHERPA_ONNX_DELETE_C_STR(c.gtcrn.model);\n  SHERPA_ONNX_DELETE_C_STR(c.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.dpdfnet.model);\n}\n\nstatic inline bool GetEnableExternalBuffer(Napi::Object obj) {\n  if (obj.Has(\"enableExternalBuffer\") &&\n      obj.Get(\"enableExternalBuffer\").IsBoolean()) {\n    return obj.Get(\"enableExternalBuffer\").As<Napi::Boolean>().Value();\n  }\n\n  return true;\n}\n\nstatic inline int32_t GetFloat32ArrayElementLength(Napi::Float32Array samples) {\n#if __OHOS__\n  return samples.ElementLength() / sizeof(float);\n#else\n  return samples.ElementLength();\n#endif\n}\n\nstatic inline Napi::Object CreateDenoisedAudioObject(\n    Napi::Env env, const SherpaOnnxDenoisedAudio *audio,\n    bool enable_external_buffer) {\n  Napi::Object ans = Napi::Object::New(env);\n\n  if (!audio) {\n    ans.Set(Napi::String::New(env, \"samples\"), Napi::Float32Array::New(env, 0));\n    ans.Set(Napi::String::New(env, \"sampleRate\"), 0);\n    return ans;\n  }\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(audio->samples), sizeof(float) * audio->n,\n        [](Napi::Env /*env*/, void * /*data*/,\n           const SherpaOnnxDenoisedAudio *hint) {\n          SherpaOnnxDestroyDenoisedAudio(hint);\n        },\n        audio);\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, audio->n, arrayBuffer, 0);\n    ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n    ans.Set(Napi::String::New(env, \"sampleRate\"), audio->sample_rate);\n    return ans;\n  }\n\n  Napi::ArrayBuffer arrayBuffer =\n      Napi::ArrayBuffer::New(env, sizeof(float) * audio->n);\n  Napi::Float32Array float32Array =\n      
Napi::Float32Array::New(env, audio->n, arrayBuffer, 0);\n\n  if (audio->n > 0 && audio->samples) {\n    std::copy(audio->samples, audio->samples + audio->n, float32Array.Data());\n  }\n\n  ans.Set(Napi::String::New(env, \"samples\"), float32Array);\n  ans.Set(Napi::String::New(env, \"sampleRate\"), audio->sample_rate);\n  SherpaOnnxDestroyDenoisedAudio(audio);\n  return ans;\n}\n\n#endif  // SHERPA_ONNX_HARMONY_OS_SHERPAONNXHAR_SHERPA_ONNX_SRC_MAIN_CPP_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/spoken-language-identification.cc",
    "content": "// scripts/node-addon-api/src/spoken-language-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic SherpaOnnxSpokenLanguageIdentificationWhisperConfig\nGetSpokenLanguageIdentificationWhisperConfig(Napi::Object obj) {\n  SherpaOnnxSpokenLanguageIdentificationWhisperConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"whisper\") || !obj.Get(\"whisper\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"whisper\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(tail_paddings, tailPaddings);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxSpokenLanguageIdentification>\nCreateSpokenLanguageIdentificationWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"You should pass an object as the only argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxSpokenLanguageIdentificationConfig c;\n  memset(&c, 0, sizeof(c));\n  c.whisper = GetSpokenLanguageIdentificationWhisperConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  const SherpaOnnxSpokenLanguageIdentification *slid =\n      SherpaOnnxCreateSpokenLanguageIdentification(&c);\n\n  SHERPA_ONNX_DELETE_C_STR(c.whisper.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.whisper.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.provider);\n\n  if (!slid) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxSpokenLanguageIdentification>::New(\n      env, const_cast<SherpaOnnxSpokenLanguageIdentification *>(slid),\n      [](Napi::Env env, SherpaOnnxSpokenLanguageIdentification *slid) {\n        SherpaOnnxDestroySpokenLanguageIdentification(slid);\n      });\n}\n\nstatic Napi::External<SherpaOnnxOfflineStream>\nSpokenLanguageIdentificationCreateOfflineStreamWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env,\n        \"You should pass an offline language ID pointer as the only argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpokenLanguageIdentification *slid =\n      info[0]\n          .As<Napi::External<SherpaOnnxSpokenLanguageIdentification>>()\n          .Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(slid);\n\n  return Napi::External<SherpaOnnxOfflineStream>::New(\n      env, const_cast<SherpaOnnxOfflineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOfflineStream *stream) {\n        SherpaOnnxDestroyOfflineStream(stream);\n      });\n}\n\nstatic Napi::String SpokenLanguageIdentificationComputeWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env, \"Argument 0 should be an offline spoken language ID pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an offline stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxSpokenLanguageIdentification *slid =\n      info[0]\n          .As<Napi::External<SherpaOnnxSpokenLanguageIdentification>>()\n          .Data();\n\n  const SherpaOnnxOfflineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOfflineStream>>().Data();\n\n  const SherpaOnnxSpokenLanguageIdentificationResult *r =\n      SherpaOnnxSpokenLanguageIdentificationCompute(slid, stream);\n\n  std::string lang = r->lang;\n  SherpaOnnxDestroySpokenLanguageIdentificationResult(r);\n\n  return Napi::String::New(env, lang);\n}\n\nvoid InitSpokenLanguageID(Napi::Env env, Napi::Object exports) {\n  exports.Set(\n      Napi::String::New(env, \"createSpokenLanguageIdentification\"),\n      Napi::Function::New(env, CreateSpokenLanguageIdentificationWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"createSpokenLanguageIdentificationOfflineStream\"),\n      Napi::Function::New(\n          env, SpokenLanguageIdentificationCreateOfflineStreamWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"spokenLanguageIdentificationCompute\"),\n      Napi::Function::New(env, SpokenLanguageIdentificationComputeWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/streaming-asr.cc",
    "content": "// scripts/node-addon-api/src/streaming-asr.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <memory>\n#include <sstream>\n#include <string>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n/*\n{\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  }\n};\n */\nSherpaOnnxFeatureConfig GetFeatureConfig(Napi::Object obj) {\n  SherpaOnnxFeatureConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"featConfig\") || !obj.Get(\"featConfig\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"featConfig\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(sample_rate, sampleRate);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(feature_dim, featureDim);\n\n  return c;\n}\n/*\n{\n  'transducer': {\n    'encoder': './encoder.onnx',\n    'decoder': './decoder.onnx',\n    'joiner': './joiner.onnx',\n  }\n}\n */\n\nstatic SherpaOnnxOnlineTransducerModelConfig GetOnlineTransducerModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlineTransducerModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"transducer\") || !obj.Get(\"transducer\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"transducer\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(joiner, joiner);\n\n  return c;\n}\n\nstatic SherpaOnnxOnlineZipformer2CtcModelConfig\nGetOnlineZipformer2CtcModelConfig(Napi::Object obj) {\n  SherpaOnnxOnlineZipformer2CtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"zipformer2Ctc\") || !obj.Get(\"zipformer2Ctc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"zipformer2Ctc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOnlineNemoCtcModelConfig GetOnlineNemoCtcModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlineNemoCtcModelConfig c;\n  
memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"nemoCtc\") || !obj.Get(\"nemoCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"nemoCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOnlineToneCtcModelConfig GetOnlineToneCtcModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlineToneCtcModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"toneCtc\") || !obj.Get(\"toneCtc\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"toneCtc\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n\n  return c;\n}\n\nstatic SherpaOnnxOnlineParaformerModelConfig GetOnlineParaformerModelConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlineParaformerModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"paraformer\") || !obj.Get(\"paraformer\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"paraformer\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(encoder, encoder);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoder, decoder);\n\n  return c;\n}\n\nSherpaOnnxOnlineModelConfig GetOnlineModelConfig(Napi::Object obj) {\n  SherpaOnnxOnlineModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"modelConfig\") || !obj.Get(\"modelConfig\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"modelConfig\").As<Napi::Object>();\n\n  c.transducer = GetOnlineTransducerModelConfig(o);\n  c.paraformer = GetOnlineParaformerModelConfig(o);\n  c.zipformer2_ctc = GetOnlineZipformer2CtcModelConfig(o);\n  c.nemo_ctc = GetOnlineNemoCtcModelConfig(o);\n  c.t_one_ctc = GetOnlineToneCtcModelConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens, tokens);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = 
o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model_type, modelType);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(modeling_unit, modelingUnit);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(bpe_vocab, bpeVocab);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(tokens_buf, tokensBuf);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(tokens_buf_size, tokensBufSize);\n\n  return c;\n}\n\nstatic SherpaOnnxOnlineCtcFstDecoderConfig GetCtcFstDecoderConfig(\n    Napi::Object obj) {\n  SherpaOnnxOnlineCtcFstDecoderConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"ctcFstDecoderConfig\") ||\n      !obj.Get(\"ctcFstDecoderConfig\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"ctcFstDecoderConfig\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(graph, graph);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(max_active, maxActive);\n\n  return c;\n}\n\n// Also used in ./non-streaming-asr.cc\nSherpaOnnxHomophoneReplacerConfig GetHomophoneReplacerConfig(Napi::Object obj) {\n  SherpaOnnxHomophoneReplacerConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"hr\") || !obj.Get(\"hr\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"hr\").As<Napi::Object>();\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(lexicon, lexicon);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fsts, ruleFsts);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxOnlineRecognizer> CreateOnlineRecognizerWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 2 && !info[1].IsUndefined() && !info[1].IsNull();\n  if (use_resource_manager && !info[1].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the second argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n  SherpaOnnxOnlineRecognizerConfig c;\n  memset(&c, 0, sizeof(c));\n  c.feat_config = GetFeatureConfig(o);\n  c.model_config = GetOnlineModelConfig(o);\n  c.hr = GetHomophoneReplacerConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_STR(decoding_method, decodingMethod);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(max_active_paths, maxActivePaths);\n\n  // enableEndpoint can be either a boolean or an integer\n  if (o.Has(\"enableEndpoint\") && (o.Get(\"enableEndpoint\").IsNumber() ||\n                                  o.Get(\"enableEndpoint\").IsBoolean())) {\n    if (o.Get(\"enableEndpoint\").IsNumber()) {\n      c.enable_endpoint =\n          o.Get(\"enableEndpoint\").As<Napi::Number>().Int32Value();\n    } else {\n      c.enable_endpoint = o.Get(\"enableEndpoint\").As<Napi::Boolean>().Value();\n    }\n  }\n\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(rule1_min_trailing_silence,\n                                rule1MinTrailingSilence);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(rule2_min_trailing_silence,\n                                rule2MinTrailingSilence);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(rule3_min_utterance_length,\n                                rule3MinUtteranceLength);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(hotwords_file, hotwordsFile);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(hotwords_score, hotwordsScore);\n  
SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fsts, ruleFsts);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(rule_fars, ruleFars);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(blank_penalty, blankPenalty);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(hotwords_buf, hotwordsBuf);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(hotwords_buf_size, hotwordsBufSize);\n\n  c.ctc_fst_decoder_config = GetCtcFstDecoderConfig(o);\n\n#if __OHOS__\n  const SherpaOnnxOnlineRecognizer *recognizer = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    recognizer = SherpaOnnxCreateOnlineRecognizerOHOS(&c, mgr.get());\n  } else {\n    recognizer = SherpaOnnxCreateOnlineRecognizer(&c);\n  }\n#else\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      SherpaOnnxCreateOnlineRecognizer(&c);\n#endif\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.decoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.transducer.joiner);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.paraformer.encoder);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.paraformer.decoder);\n\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.t_one_ctc.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.nemo_ctc.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.zipformer2_ctc.model);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.tokens);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.provider);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.model_type);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.modeling_unit);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.bpe_vocab);\n  SHERPA_ONNX_DELETE_C_STR(c.model_config.tokens_buf);\n  SHERPA_ONNX_DELETE_C_STR(c.decoding_method);\n  SHERPA_ONNX_DELETE_C_STR(c.hotwords_file);\n  SHERPA_ONNX_DELETE_C_STR(c.rule_fsts);\n  
SHERPA_ONNX_DELETE_C_STR(c.rule_fars);\n  SHERPA_ONNX_DELETE_C_STR(c.hotwords_buf);\n  SHERPA_ONNX_DELETE_C_STR(c.ctc_fst_decoder_config.graph);\n\n  SHERPA_ONNX_DELETE_C_STR(c.hr.lexicon);\n  SHERPA_ONNX_DELETE_C_STR(c.hr.rule_fsts);\n\n  if (!recognizer) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOnlineRecognizer>::New(\n      env, const_cast<SherpaOnnxOnlineRecognizer *>(recognizer),\n      [](Napi::Env env, SherpaOnnxOnlineRecognizer *recognizer) {\n        SherpaOnnxDestroyOnlineRecognizer(recognizer);\n      });\n}\n\nstatic Napi::External<SherpaOnnxOnlineStream> CreateOnlineStreamWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(\n        env,\n        \"You should pass an online recognizer pointer as the only argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      SherpaOnnxCreateOnlineStream(recognizer);\n\n  return Napi::External<SherpaOnnxOnlineStream>::New(\n      env, const_cast<SherpaOnnxOnlineStream *>(stream),\n      [](Napi::Env env, SherpaOnnxOnlineStream *stream) {\n        SherpaOnnxDestroyOnlineStream(stream);\n      });\n}\n\nstatic void AcceptWaveformWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      info[0].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"samples\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field samples\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Get(\"samples\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['samples'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Has(\"sampleRate\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field sampleRate\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!obj.Get(\"sampleRate\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['sampleRate'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Float32Array samples = obj.Get(\"samples\").As<Napi::Float32Array>();\n  int32_t sample_rate = obj.Get(\"sampleRate\").As<Napi::Number>().Int32Value();\n\n#if __OHOS__\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, sample_rate, samples.Data(),\n                                       samples.ElementLength() / sizeof(float));\n#else\n  SherpaOnnxOnlineStreamAcceptWaveform(stream, sample_rate, samples.Data(),\n                                       samples.ElementLength());\n#endif\n}\n\nstatic Napi::Boolean IsOnlineStreamReadyWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = 
info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an online recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  int32_t is_ready = SherpaOnnxIsOnlineStreamReady(recognizer, stream);\n\n  return Napi::Boolean::New(env, is_ready);\n}\n\nstatic void DecodeOnlineStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an online recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  SherpaOnnxDecodeOnlineStream(recognizer, stream);\n}\n\nstatic Napi::String GetOnlineStreamResultAsJsonWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an online recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  const char *json = SherpaOnnxGetOnlineStreamResultAsJson(recognizer, stream);\n  Napi::String s = Napi::String::New(env, json);\n\n  SherpaOnnxDestroyOnlineStreamResultJson(json);\n\n  return s;\n}\n\nstatic void InputFinishedWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOnlineStream *stream =\n      info[0].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  SherpaOnnxOnlineStreamInputFinished(stream);\n}\n\nstatic void ResetOnlineStreamWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an online recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  SherpaOnnxOnlineStreamReset(recognizer, stream);\n}\n\nstatic Napi::Boolean IsEndpointWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env,\n                         \"Argument 0 should be an online recognizer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an online stream pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxOnlineRecognizer *recognizer =\n      info[0].As<Napi::External<SherpaOnnxOnlineRecognizer>>().Data();\n\n  const SherpaOnnxOnlineStream *stream =\n      info[1].As<Napi::External<SherpaOnnxOnlineStream>>().Data();\n\n  int32_t is_endpoint = SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream);\n\n  return Napi::Boolean::New(env, is_endpoint);\n}\n\nstatic Napi::External<SherpaOnnxDisplay> CreateDisplayWrapper(\n    const Napi::CallbackInfo &info) {\n  
Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsNumber()) {\n    Napi::TypeError::New(env, \"Expect a number as the argument\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n  int32_t max_word_per_line = info[0].As<Napi::Number>().Int32Value();\n\n  const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(max_word_per_line);\n\n  return Napi::External<SherpaOnnxDisplay>::New(\n      env, const_cast<SherpaOnnxDisplay *>(display),\n      [](Napi::Env env, SherpaOnnxDisplay *display) {\n        SherpaOnnxDestroyDisplay(display);\n      });\n}\n\nstatic void PrintWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 3) {\n    std::ostringstream os;\n    os << \"Expect only 3 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a display pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[1].IsNumber()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a number.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[2].IsString()) {\n    Napi::TypeError::New(env, \"Argument 2 should be a string.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxDisplay *display =\n      info[0].As<Napi::External<SherpaOnnxDisplay>>().Data();\n\n  int32_t idx = info[1].As<Napi::Number>().Int32Value();\n\n  Napi::String text = info[2].As<Napi::String>();\n  std::string s = text.Utf8Value();\n  SherpaOnnxPrint(display, idx, s.c_str());\n}\n\nvoid InitStreamingAsr(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, 
\"createOnlineRecognizer\"),\n              Napi::Function::New(env, CreateOnlineRecognizerWrapper));\n\n  exports.Set(Napi::String::New(env, \"createOnlineStream\"),\n              Napi::Function::New(env, CreateOnlineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"acceptWaveformOnline\"),\n              Napi::Function::New(env, AcceptWaveformWrapper));\n\n  exports.Set(Napi::String::New(env, \"isOnlineStreamReady\"),\n              Napi::Function::New(env, IsOnlineStreamReadyWrapper));\n\n  exports.Set(Napi::String::New(env, \"decodeOnlineStream\"),\n              Napi::Function::New(env, DecodeOnlineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"getOnlineStreamResultAsJson\"),\n              Napi::Function::New(env, GetOnlineStreamResultAsJsonWrapper));\n\n  exports.Set(Napi::String::New(env, \"inputFinished\"),\n              Napi::Function::New(env, InputFinishedWrapper));\n\n  exports.Set(Napi::String::New(env, \"reset\"),\n              Napi::Function::New(env, ResetOnlineStreamWrapper));\n\n  exports.Set(Napi::String::New(env, \"isEndpoint\"),\n              Napi::Function::New(env, IsEndpointWrapper));\n\n  exports.Set(Napi::String::New(env, \"createDisplay\"),\n              Napi::Function::New(env, CreateDisplayWrapper));\n\n  exports.Set(Napi::String::New(env, \"print\"),\n              Napi::Function::New(env, PrintWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/streaming-speech-denoiser.cc",
    "content": "// scripts/node-addon-api/src/streaming-speech-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#include <memory>\n#include <sstream>\n\n#include \"napi.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n#include \"speech-denoiser.h\"  // NOLINT\n\nstatic Napi::External<SherpaOnnxOnlineSpeechDenoiser>\nCreateOnlineSpeechDenoiserWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n    return {};\n  }\n#else\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env, \"Expect an object as the argument\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  SherpaOnnxOnlineSpeechDenoiserConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model = GetSpeechDenoiserModelConfig(info[0].As<Napi::Object>());\n\n#if __OHOS__\n  std::unique_ptr<NativeResourceManager,\n                  decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n      mgr(OH_ResourceManager_InitNativeResourceManager(env, info[1]),\n          &OH_ResourceManager_ReleaseNativeResourceManager);\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      SherpaOnnxCreateOnlineSpeechDenoiserOHOS(&c, mgr.get());\n#else\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      SherpaOnnxCreateOnlineSpeechDenoiser(&c);\n#endif\n\n  DeleteSpeechDenoiserModelConfig(c.model);\n\n  if (!sd) {\n    Napi::TypeError::New(env, \"Please check your config!\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  return Napi::External<SherpaOnnxOnlineSpeechDenoiser>::New(\n      env, 
const_cast<SherpaOnnxOnlineSpeechDenoiser *>(sd),\n      [](Napi::Env /*env*/, SherpaOnnxOnlineSpeechDenoiser *sd) {\n        SherpaOnnxDestroyOnlineSpeechDenoiser(sd);\n      });\n}\n\nstatic Napi::Object OnlineSpeechDenoiserRunWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 2 || !info[0].IsExternal() || !info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Expect a denoiser handle and an audio object\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOnlineSpeechDenoiser>>().Data();\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"samples\") || !obj.Get(\"samples\").IsTypedArray()) {\n    Napi::TypeError::New(\n        env, \"The argument object should have a typed array field samples\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  if (!obj.Has(\"sampleRate\") || !obj.Get(\"sampleRate\").IsNumber()) {\n    Napi::TypeError::New(\n        env, \"The argument object should have a number field sampleRate\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  Napi::Float32Array samples = obj.Get(\"samples\").As<Napi::Float32Array>();\n  int32_t sample_rate = obj.Get(\"sampleRate\").As<Napi::Number>().Int32Value();\n  const SherpaOnnxDenoisedAudio *audio = SherpaOnnxOnlineSpeechDenoiserRun(\n      sd, samples.Data(), GetFloat32ArrayElementLength(samples), sample_rate);\n  return CreateDenoisedAudioObject(env, audio, GetEnableExternalBuffer(obj));\n}\n\nstatic Napi::Object OnlineSpeechDenoiserFlushWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() < 1 || !info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Expect an online speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (info.Length() > 1 && info[1].IsBoolean()) {\n   
 enable_external_buffer = info[1].As<Napi::Boolean>().Value();\n  }\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOnlineSpeechDenoiser>>().Data();\n  const SherpaOnnxDenoisedAudio *audio =\n      SherpaOnnxOnlineSpeechDenoiserFlush(sd);\n  return CreateDenoisedAudioObject(env, audio, enable_external_buffer);\n}\n\nstatic void OnlineSpeechDenoiserResetWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1 || !info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Expect an online speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n    return;\n  }\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOnlineSpeechDenoiser>>().Data();\n  SherpaOnnxOnlineSpeechDenoiserReset(sd);\n}\n\nstatic Napi::Number OnlineSpeechDenoiserGetSampleRateWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1 || !info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Expect an online speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOnlineSpeechDenoiser>>().Data();\n  return Napi::Number::New(env,\n                           SherpaOnnxOnlineSpeechDenoiserGetSampleRate(sd));\n}\n\nstatic Napi::Number OnlineSpeechDenoiserGetFrameShiftInSamplesWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1 || !info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Expect an online speech denoiser pointer.\")\n        .ThrowAsJavaScriptException();\n    return {};\n  }\n\n  const SherpaOnnxOnlineSpeechDenoiser *sd =\n      info[0].As<Napi::External<SherpaOnnxOnlineSpeechDenoiser>>().Data();\n  return Napi::Number::New(\n      env, SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(sd));\n}\n\nvoid 
InitOnlineSpeechDenoiser(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createOnlineSpeechDenoiser\"),\n              Napi::Function::New(env, CreateOnlineSpeechDenoiserWrapper));\n  exports.Set(Napi::String::New(env, \"onlineSpeechDenoiserRunWrapper\"),\n              Napi::Function::New(env, OnlineSpeechDenoiserRunWrapper));\n  exports.Set(Napi::String::New(env, \"onlineSpeechDenoiserFlushWrapper\"),\n              Napi::Function::New(env, OnlineSpeechDenoiserFlushWrapper));\n  exports.Set(Napi::String::New(env, \"onlineSpeechDenoiserResetWrapper\"),\n              Napi::Function::New(env, OnlineSpeechDenoiserResetWrapper));\n  exports.Set(\n      Napi::String::New(env, \"onlineSpeechDenoiserGetSampleRateWrapper\"),\n      Napi::Function::New(env, OnlineSpeechDenoiserGetSampleRateWrapper));\n  exports.Set(Napi::String::New(\n                  env, \"onlineSpeechDenoiserGetFrameShiftInSamplesWrapper\"),\n              Napi::Function::New(\n                  env, OnlineSpeechDenoiserGetFrameShiftInSamplesWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/types/libsherpa_onnx/Index.d.ts",
    "content": "export const listRawfileDir: (mgr: object, dir: string) => Array<string>;\n\nexport const readWave: (filename: string, enableExternalBuffer: boolean = true) => {samples: Float32Array, sampleRate: number};\nexport const readWaveFromBinary: (data: Uint8Array, enableExternalBuffer: boolean = true) => {samples: Float32Array, sampleRate: number};\nexport const createCircularBuffer: (capacity: number) => object;\nexport const circularBufferPush: (handle: object, samples: Float32Array) => void;\nexport const circularBufferGet: (handle: object, index: number, n: number, enableExternalBuffer: boolean = true) => Float32Array;\nexport const circularBufferPop: (handle: object, n: number) => void;\nexport const circularBufferSize: (handle: object) => number;\nexport const circularBufferHead: (handle: object) => number;\nexport const circularBufferReset: (handle: object) => void;\n\nexport const createVoiceActivityDetector: (config: object, bufferSizeInSeconds?: number, mgr?: object) => object;\nexport const voiceActivityDetectorAcceptWaveform: (handle: object, samples: Float32Array) => void;\nexport const voiceActivityDetectorIsEmpty: (handle: object) => boolean;\nexport const voiceActivityDetectorIsDetected: (handle: object) => boolean;\nexport const voiceActivityDetectorPop: (handle: object) => void;\nexport const voiceActivityDetectorClear: (handle: object) => void;\nexport const voiceActivityDetectorFront: (handle: object, enableExternalBuffer: boolean = true) => {samples: Float32Array, start: number};\nexport const voiceActivityDetectorReset: (handle: object) => void;\nexport const voiceActivityDetectorFlush: (handle: object) => void;\n\nexport const createOfflineRecognizer: (config: object, mgr?: object) => object;\nexport const createOfflineStream: (handle: object) => object;\nexport const offlineRecognizerSetConfig: (handle: object, config: object) => void;\nexport const acceptWaveformOffline: (handle: object, audio: object) => void;\nexport const 
decodeOfflineStream: (handle: object, streamHandle: object) => void;\nexport const getOfflineStreamResultAsJson: (streamHandle: object) => string;\n\nexport const createOnlineRecognizer: (config: object, mgr?: object) => object;\nexport const createOnlineStream: (handle: object) => object;\nexport const acceptWaveformOnline: (handle: object, audio: object) => void;\nexport const inputFinished: (streamHandle: object) => void;\nexport const isOnlineStreamReady: (handle: object, streamHandle: object) => boolean;\nexport const decodeOnlineStream: (handle: object, streamHandle: object) => void;\nexport const isEndpoint: (handle: object, streamHandle: object) => boolean;\nexport const reset: (handle: object, streamHandle: object) => void;\nexport const getOnlineStreamResultAsJson: (handle: object, streamHandle: object) => string;\n\nexport const createOfflineTts: (config: object, mgr?: object) => object;\nexport const getOfflineTtsNumSpeakers: (handle: object) => number;\nexport const getOfflineTtsSampleRate: (handle: object) => number;\n\nexport type TtsOutput = {\n  samples: Float32Array;\n  sampleRate: number;\n};\n\nexport const offlineTtsGenerate: (handle: object, input: object) => TtsOutput;\nexport const offlineTtsGenerateWithConfig: (handle: object, input: object) => TtsOutput;\nexport const offlineTtsGenerateAsync: (handle: object, input: object) => Promise<TtsOutput>;\nexport const offlineTtsGenerateAsyncWithConfig: (handle: object, input: object) => Promise<TtsOutput>;\n\nexport const createOfflinePunctuation: (config: object, mgr?: object) => object;\nexport const offlinePunctuationAddPunct: (handle: object, text: string) => string;\nexport const createOnlinePunctuation: (config: object, mgr?: object) => object;\nexport const onlinePunctuationAddPunct: (handle: object, text: string) => string;\n\nexport const createSpeakerEmbeddingExtractor: (config: object, mgr?: object) => object;\nexport const speakerEmbeddingExtractorDim: (handle: object) => 
number;\nexport const speakerEmbeddingExtractorCreateStream: (handle: object) => object;\nexport const speakerEmbeddingExtractorIsReady: (handle: object, stream: object) => boolean;\nexport const speakerEmbeddingExtractorComputeEmbedding: (handle: object, stream: object, enableExternalBuffer: boolean) => Float32Array;\nexport const createSpeakerEmbeddingManager: (dim: number) => object;\nexport const speakerEmbeddingManagerAdd: (handle: object, speaker: {name: string, v: Float32Array}) => boolean;\nexport const speakerEmbeddingManagerAddListFlattened: (handle: object, speaker: {name: string, vv: Float32Array, n: number}) => boolean;\nexport const speakerEmbeddingManagerRemove: (handle: object, name: string) => boolean;\nexport const speakerEmbeddingManagerSearch: (handle: object, obj: {v: Float32Array, threshold: number}) => string;\nexport const speakerEmbeddingManagerVerify: (handle: object, obj: {name: string, v: Float32Array, threshold: number}) => boolean;\nexport const speakerEmbeddingManagerContains: (handle: object, name: string) => boolean;\nexport const speakerEmbeddingManagerNumSpeakers: (handle: object) => number;\nexport const speakerEmbeddingManagerGetAllSpeakers: (handle: object) => Array<string>;\n\nexport const createOfflineSpeakerDiarization: (config: object, mgr?: object) => object;\nexport const getOfflineSpeakerDiarizationSampleRate: (handle: object) => number;\nexport const offlineSpeakerDiarizationProcess: (handle: object, input: object) => object;\nexport const offlineSpeakerDiarizationProcessAsync: (handle: object, input: object, callback: object) => object;\nexport const offlineSpeakerDiarizationSetConfig: (handle: object, config: object) => void;\n\nexport const createKeywordSpotter: (config: object, mgr?: object) => object;\nexport const createKeywordStream: (handle: object, keywords?: string) => object;\nexport const isKeywordStreamReady: (handle: object, stream: object) => boolean;\nexport const decodeKeywordStream: (handle: object, 
stream: object) => void;\nexport const resetKeywordStream: (handle: object, stream: object) => void;\nexport const getKeywordResultAsJson: (handle: object, stream: object) => string;\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/types/libsherpa_onnx/oh-package.json5",
    "content": "{\n  \"name\": \"libsherpa_onnx.so\",\n  \"types\": \"./Index.d.ts\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\"\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/utils.cc",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\n\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"  // NOLINT\n\nstatic std::vector<std::string> GetFilenames(NativeResourceManager *mgr,\n                                             const std::string &d) {\n  std::unique_ptr<RawDir, decltype(&OH_ResourceManager_CloseRawDir)> raw_dir(\n      OH_ResourceManager_OpenRawDir(mgr, d.c_str()),\n      &OH_ResourceManager_CloseRawDir);\n  int count = OH_ResourceManager_GetRawFileCount(raw_dir.get());\n  std::vector<std::string> ans;\n  ans.reserve(count);\n  for (int32_t i = 0; i < count; ++i) {\n    std::string filename = OH_ResourceManager_GetRawFileName(raw_dir.get(), i);\n    bool is_dir = OH_ResourceManager_IsRawDir(\n        mgr, d.empty() ? filename.c_str() : (d + \"/\" + filename).c_str());\n    if (is_dir) {\n      auto files = GetFilenames(mgr, d.empty() ? filename : d + \"/\" + filename);\n      for (auto &f : files) {\n        ans.push_back(std::move(f));\n      }\n    } else {\n      if (d.empty()) {\n        ans.push_back(std::move(filename));\n      } else {\n        ans.push_back(d + \"/\" + filename);\n      }\n    }\n  }\n\n  return ans;\n}\n\nstatic Napi::Array ListRawFileDir(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  std::unique_ptr<NativeResourceManager,\n                  decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n      mgr(OH_ResourceManager_InitNativeResourceManager(env, info[0]),\n          &OH_ResourceManager_ReleaseNativeResourceManager);\n\n  if (!info[1].IsString()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  std::string dir = info[1].As<Napi::String>().Utf8Value();\n\n  auto files = GetFilenames(mgr.get(), dir);\n  Napi::Array ans = Napi::Array::New(env, files.size());\n  for (int32_t i = 0; i != files.size(); ++i) {\n    // Fix #2120\n    // ans[i] = Napi::String::New(env, files[i]);\n    ans.Set(i, Napi::String::New(env, files[i]));\n  }\n  return ans;\n}\nvoid InitUtils(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"listRawfileDir\"),\n              Napi::Function::New(env, ListRawFileDir));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/vad.cc",
    "content": "// scripts/node-addon-api/src/vad.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n\n#include \"macros.h\"  // NOLINT\n#include \"napi.h\"    // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic Napi::External<SherpaOnnxCircularBuffer> CreateCircularBufferWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsNumber()) {\n    Napi::TypeError::New(env, \"You should pass an integer as the argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      SherpaOnnxCreateCircularBuffer(info[0].As<Napi::Number>().Int32Value());\n\n  return Napi::External<SherpaOnnxCircularBuffer>::New(\n      env, const_cast<SherpaOnnxCircularBuffer *>(buf),\n      [](Napi::Env env, SherpaOnnxCircularBuffer *p) {\n        SherpaOnnxDestroyCircularBuffer(p);\n      });\n}\n\nstatic void CircularBufferPushWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  if (!info[1].IsTypedArray()) {\n    Napi::TypeError::New(env, \"Argument 1 should be a Float32Array.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Float32Array data = info[1].As<Napi::Float32Array>();\n\n#if __OHOS__\n  // Note(fangjun): Normally, we don't need to divide it by sizeof(float).\n  // However, data.ElementLength() here returns the number of bytes, not the\n  // number of elements.\n  SherpaOnnxCircularBufferPush(buf, data.Data(),\n                               data.ElementLength() / sizeof(float));\n#else\n  SherpaOnnxCircularBufferPush(buf, data.Data(), data.ElementLength());\n#endif\n}\n\n// see https://github.com/nodejs/node-addon-api/blob/main/doc/typed_array.md\n// https://github.com/nodejs/node-addon-examples/blob/main/src/2-js-to-native-conversion/typed_array_to_native/node-addon-api/typed_array_to_native.cc\nstatic Napi::Float32Array CircularBufferGetWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 3 && info.Length() != 4) {\n    std::ostringstream os;\n    os << \"Expect only 3 or 4 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  if (!info[1].IsNumber()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an integer (startIndex).\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[2].IsNumber()) {\n    Napi::TypeError::New(env, \"Argument 2 should be an integer (n).\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (info.Length() == 4) {\n    if (info[3].IsBoolean()) {\n      enable_external_buffer = info[3].As<Napi::Boolean>().Value();\n    } else {\n      Napi::TypeError::New(env, \"Argument 3 should be a boolean.\")\n          .ThrowAsJavaScriptException();\n\n      // Stop here so we don't continue with a pending JavaScript exception.\n      return {};\n    }\n  }\n\n  int32_t start_index = info[1].As<Napi::Number>().Int32Value();\n  int32_t n = info[2].As<Napi::Number>().Int32Value();\n\n  const float *data = SherpaOnnxCircularBufferGet(buf, start_index, n);\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(data), sizeof(float) * n,\n        [](Napi::Env /*env*/, void *p) {\n          SherpaOnnxCircularBufferFree(reinterpret_cast<const float *>(p));\n        });\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, n, arrayBuffer, 0);\n\n    return float32Array;\n  } else {\n    // don't use external buffer\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * n);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, n, arrayBuffer, 0);\n\n    std::copy(data, data + n, float32Array.Data());\n\n    
SherpaOnnxCircularBufferFree(data);\n\n    return float32Array;\n  }\n}\n\nstatic void CircularBufferPopWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  if (!info[1].IsNumber()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an integer (n).\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  int32_t n = info[1].As<Napi::Number>().Int32Value();\n\n  SherpaOnnxCircularBufferPop(buf, n);\n}\n\nstatic Napi::Number CircularBufferSizeWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  int32_t size = SherpaOnnxCircularBufferSize(buf);\n\n  return Napi::Number::New(env, size);\n}\n\nstatic Napi::Number CircularBufferHeadWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  int32_t head = SherpaOnnxCircularBufferHead(buf);\n\n  return Napi::Number::New(env, head);\n}\n\nstatic void CircularBufferResetWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a CircularBuffer pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxCircularBuffer *buf =\n      info[0].As<Napi::External<SherpaOnnxCircularBuffer>>().Data();\n\n  SherpaOnnxCircularBufferReset(buf);\n}\n\nstatic SherpaOnnxSileroVadModelConfig GetSileroVadConfig(\n    const Napi::Object &obj) {\n  SherpaOnnxSileroVadModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"sileroVad\") || !obj.Get(\"sileroVad\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"sileroVad\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(threshold, threshold);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_silence_duration, minSilenceDuration);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_speech_duration, minSpeechDuration);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(window_size, windowSize);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(max_speech_duration, maxSpeechDuration);\n\n  return c;\n}\n\nstatic SherpaOnnxTenVadModelConfig GetTenVadConfig(const Napi::Object &obj) {\n  
SherpaOnnxTenVadModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  if (!obj.Has(\"tenVad\") || !obj.Get(\"tenVad\").IsObject()) {\n    return c;\n  }\n\n  Napi::Object o = obj.Get(\"tenVad\").As<Napi::Object>();\n  SHERPA_ONNX_ASSIGN_ATTR_STR(model, model);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(threshold, threshold);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_silence_duration, minSilenceDuration);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(min_speech_duration, minSpeechDuration);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(window_size, windowSize);\n  SHERPA_ONNX_ASSIGN_ATTR_FLOAT(max_speech_duration, maxSpeechDuration);\n\n  return c;\n}\n\nstatic Napi::External<SherpaOnnxVoiceActivityDetector>\nCreateVoiceActivityDetectorWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n#if __OHOS__\n  // the last argument is a NativeResourceManager\n  if (info.Length() != 1 && info.Length() != 2 && info.Length() != 3) {\n    std::ostringstream os;\n    os << \"Expect 1, 2, or 3 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#else\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  if (!info[0].IsObject()) {\n    Napi::TypeError::New(env,\n                         \"You should pass an object as the first argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  float buffer_size_in_seconds = 60;\n  if (info.Length() >= 2 && !info[1].IsUndefined() && !info[1].IsNull()) {\n    if (!info[1].IsNumber()) {\n      Napi::TypeError::New(env,\n                           \"You should pass a number as the second argument.\")\n          .ThrowAsJavaScriptException();\n\n      return {};\n    }\n\n    buffer_size_in_seconds = info[1].As<Napi::Number>().FloatValue();\n  }\n\n#if __OHOS__\n  bool use_resource_manager =\n      info.Length() == 3 && !info[2].IsUndefined() && !info[2].IsNull();\n  if (use_resource_manager && !info[2].IsObject()) {\n    Napi::TypeError::New(\n        env, \"You should pass a resource manager as the third argument.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n#endif\n\n  Napi::Object o = info[0].As<Napi::Object>();\n\n  SherpaOnnxVadModelConfig c;\n  memset(&c, 0, sizeof(c));\n  c.silero_vad = GetSileroVadConfig(o);\n  c.ten_vad = GetTenVadConfig(o);\n\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(sample_rate, sampleRate);\n  SHERPA_ONNX_ASSIGN_ATTR_INT32(num_threads, numThreads);\n  SHERPA_ONNX_ASSIGN_ATTR_STR(provider, provider);\n\n  if (o.Has(\"debug\") &&\n      (o.Get(\"debug\").IsNumber() || o.Get(\"debug\").IsBoolean())) {\n    if (o.Get(\"debug\").IsBoolean()) {\n      c.debug = o.Get(\"debug\").As<Napi::Boolean>().Value();\n    } else {\n      c.debug = o.Get(\"debug\").As<Napi::Number>().Int32Value();\n    }\n  }\n\n#if __OHOS__\n  const SherpaOnnxVoiceActivityDetector *vad = nullptr;\n\n  if (use_resource_manager) {\n    std::unique_ptr<NativeResourceManager,\n                    decltype(&OH_ResourceManager_ReleaseNativeResourceManager)>\n        
mgr(OH_ResourceManager_InitNativeResourceManager(env, info[2]),\n            &OH_ResourceManager_ReleaseNativeResourceManager);\n\n    vad = SherpaOnnxCreateVoiceActivityDetectorOHOS(&c, buffer_size_in_seconds,\n                                                    mgr.get());\n  } else {\n    vad = SherpaOnnxCreateVoiceActivityDetector(&c, buffer_size_in_seconds);\n  }\n#else\n  const SherpaOnnxVoiceActivityDetector *vad =\n      SherpaOnnxCreateVoiceActivityDetector(&c, buffer_size_in_seconds);\n#endif\n  SHERPA_ONNX_DELETE_C_STR(c.silero_vad.model);\n  SHERPA_ONNX_DELETE_C_STR(c.ten_vad.model);\n  SHERPA_ONNX_DELETE_C_STR(c.provider);\n\n  return Napi::External<SherpaOnnxVoiceActivityDetector>::New(\n      env, const_cast<SherpaOnnxVoiceActivityDetector *>(vad),\n      [](Napi::Env env, SherpaOnnxVoiceActivityDetector *p) {\n        SherpaOnnxDestroyVoiceActivityDetector(p);\n      });\n}\n\nstatic void VoiceActivityDetectorAcceptWaveformWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  if (!info[1].IsTypedArray()) {\n    Napi::TypeError::New(\n        env, \"Argument 1 should be a Float32Array containing samples\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  Napi::Float32Array samples = info[1].As<Napi::Float32Array>();\n\n#if __OHOS__\n  // Note(fangjun): For unknown reasons, we need to use `/sizeof(float)` here\n  // for Huawei\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n      vad, samples.Data(), samples.ElementLength() / sizeof(float));\n#else\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad, samples.Data(),\n                                                samples.ElementLength());\n#endif\n}\n\nstatic Napi::Boolean VoiceActivityDetectorEmptyWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  int32_t is_empty = SherpaOnnxVoiceActivityDetectorEmpty(vad);\n\n  return Napi::Boolean::New(env, is_empty);\n}\n\nstatic Napi::Boolean VoiceActivityDetectorDetectedWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  int32_t is_detected = SherpaOnnxVoiceActivityDetectorDetected(vad);\n\n  return Napi::Boolean::New(env, is_detected);\n}\n\nstatic void VoiceActivityDetectorPopWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  SherpaOnnxVoiceActivityDetectorPop(vad);\n}\n\nstatic void VoiceActivityDetectorClearWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  SherpaOnnxVoiceActivityDetectorClear(vad);\n}\n\nstatic Napi::Object VoiceActivityDetectorFrontWrapper(\n    const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1 && info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 1 or 2 arguments. 
Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (info.Length() == 2) {\n    if (info[1].IsBoolean()) {\n      enable_external_buffer = info[1].As<Napi::Boolean>().Value();\n    } else {\n      Napi::TypeError::New(env, \"Argument 1 should be a boolean.\")\n          .ThrowAsJavaScriptException();\n\n      // Stop here so we don't continue with a pending JavaScript exception.\n      return {};\n    }\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  const SherpaOnnxSpeechSegment *segment =\n      SherpaOnnxVoiceActivityDetectorFront(vad);\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(segment->samples), sizeof(float) * segment->n,\n        [](Napi::Env /*env*/, void * /*data*/,\n           const SherpaOnnxSpeechSegment *hint) {\n          SherpaOnnxDestroySpeechSegment(hint);\n        },\n        segment);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, segment->n, arrayBuffer, 0);\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"start\"), segment->start);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n\n    return obj;\n  } else {\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * segment->n);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, segment->n, arrayBuffer, 0);\n\n    std::copy(segment->samples, segment->samples + segment->n,\n              float32Array.Data());\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"start\"), segment->start);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n\n   
 SherpaOnnxDestroySpeechSegment(segment);\n\n    return obj;\n  }\n}\n\nstatic void VoiceActivityDetectorResetWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  SherpaOnnxVoiceActivityDetectorReset(vad);\n}\n\nstatic void VoiceActivityDetectorFlushWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 1) {\n    std::ostringstream os;\n    os << \"Expect only 1 argument. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  if (!info[0].IsExternal()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a VAD pointer.\")\n        .ThrowAsJavaScriptException();\n\n    return;\n  }\n\n  const SherpaOnnxVoiceActivityDetector *vad =\n      info[0].As<Napi::External<SherpaOnnxVoiceActivityDetector>>().Data();\n\n  SherpaOnnxVoiceActivityDetectorFlush(vad);\n}\n\nvoid InitVad(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"createCircularBuffer\"),\n              Napi::Function::New(env, CreateCircularBufferWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferPush\"),\n              Napi::Function::New(env, CircularBufferPushWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferGet\"),\n              Napi::Function::New(env, CircularBufferGetWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferPop\"),\n              Napi::Function::New(env, 
CircularBufferPopWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferSize\"),\n              Napi::Function::New(env, CircularBufferSizeWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferHead\"),\n              Napi::Function::New(env, CircularBufferHeadWrapper));\n\n  exports.Set(Napi::String::New(env, \"circularBufferReset\"),\n              Napi::Function::New(env, CircularBufferResetWrapper));\n\n  exports.Set(Napi::String::New(env, \"createVoiceActivityDetector\"),\n              Napi::Function::New(env, CreateVoiceActivityDetectorWrapper));\n\n  exports.Set(\n      Napi::String::New(env, \"voiceActivityDetectorAcceptWaveform\"),\n      Napi::Function::New(env, VoiceActivityDetectorAcceptWaveformWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorIsEmpty\"),\n              Napi::Function::New(env, VoiceActivityDetectorEmptyWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorIsDetected\"),\n              Napi::Function::New(env, VoiceActivityDetectorDetectedWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorPop\"),\n              Napi::Function::New(env, VoiceActivityDetectorPopWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorClear\"),\n              Napi::Function::New(env, VoiceActivityDetectorClearWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorFront\"),\n              Napi::Function::New(env, VoiceActivityDetectorFrontWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorReset\"),\n              Napi::Function::New(env, VoiceActivityDetectorResetWrapper));\n\n  exports.Set(Napi::String::New(env, \"voiceActivityDetectorFlush\"),\n              Napi::Function::New(env, VoiceActivityDetectorFlushWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/version.cc",
    "content": "// scripts/node-addon-api/src/version.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include <sstream>\n\n#include \"napi.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nvoid InitVersion(Napi::Env env, Napi::Object exports) {\n  Napi::String version = Napi::String::New(env, SherpaOnnxGetVersionStr());\n  Napi::String git_sha1 = Napi::String::New(env, SherpaOnnxGetGitSha1());\n  Napi::String git_date = Napi::String::New(env, SherpaOnnxGetGitDate());\n\n  exports.Set(Napi::String::New(env, \"version\"), version);\n  exports.Set(Napi::String::New(env, \"gitSha1\"), git_sha1);\n  exports.Set(Napi::String::New(env, \"gitDate\"), git_date);\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/wave-reader.cc",
    "content": "// scripts/node-addon-api/src/wave-reader.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <algorithm>\n#include <sstream>\n#include <string>\n\n#include \"napi.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nstatic Napi::Object ReadWaveWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() > 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsString()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  std::string filename = info[0].As<Napi::String>().Utf8Value();\n\n  bool enable_external_buffer = true;\n  if (info.Length() == 2) {\n    if (info[1].IsBoolean()) {\n      enable_external_buffer = info[1].As<Napi::Boolean>().Value();\n    } else {\n      Napi::TypeError::New(env, \"Argument 1 should be a boolean\")\n          .ThrowAsJavaScriptException();\n\n      return {};\n    }\n  }\n\n  const SherpaOnnxWave *wave = SherpaOnnxReadWave(filename.c_str());\n  if (!wave) {\n    std::ostringstream os;\n    os << \"Failed to read '\" << filename << \"'\";\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = Napi::ArrayBuffer::New(\n        env, const_cast<float *>(wave->samples),\n        sizeof(float) * wave->num_samples,\n        [](Napi::Env /*env*/, void * /*data*/, const SherpaOnnxWave *hint) {\n          SherpaOnnxFreeWave(hint);\n        },\n        wave);\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, wave->num_samples, arrayBuffer, 0);\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n    obj.Set(Napi::String::New(env, 
\"sampleRate\"), wave->sample_rate);\n    return obj;\n  } else {\n    // don't use external buffer\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * wave->num_samples);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, wave->num_samples, arrayBuffer, 0);\n\n    std::copy(wave->samples, wave->samples + wave->num_samples,\n              float32Array.Data());\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n    obj.Set(Napi::String::New(env, \"sampleRate\"), wave->sample_rate);\n\n    SherpaOnnxFreeWave(wave);\n\n    return obj;\n  }\n}\n\nstatic Napi::Object ReadWaveFromBinaryWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n  if (info.Length() > 2) {\n    std::ostringstream os;\n    os << \"Expect only 1 or 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsTypedArray()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a Uint8Array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Uint8Array data = info[0].As<Napi::Uint8Array>();\n  int32_t n = data.ElementLength();\n  const SherpaOnnxWave *wave = SherpaOnnxReadWaveFromBinaryData(\n      reinterpret_cast<const char *>(data.Data()), n);\n  if (!wave) {\n    std::ostringstream os;\n    os << \"Failed to read wave\";\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  bool enable_external_buffer = true;\n  if (info.Length() == 2) {\n    if (info[1].IsBoolean()) {\n      enable_external_buffer = info[1].As<Napi::Boolean>().Value();\n    } else {\n      Napi::TypeError::New(env, \"Argument 1 should be a boolean\")\n          .ThrowAsJavaScriptException();\n\n      return {};\n    }\n  }\n\n  if (enable_external_buffer) {\n    Napi::ArrayBuffer arrayBuffer = 
Napi::ArrayBuffer::New(\n        env, const_cast<float *>(wave->samples),\n        sizeof(float) * wave->num_samples,\n        [](Napi::Env /*env*/, void * /*data*/, const SherpaOnnxWave *hint) {\n          SherpaOnnxFreeWave(hint);\n        },\n        wave);\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, wave->num_samples, arrayBuffer, 0);\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n    obj.Set(Napi::String::New(env, \"sampleRate\"), wave->sample_rate);\n    return obj;\n  } else {\n    // don't use external buffer\n    Napi::ArrayBuffer arrayBuffer =\n        Napi::ArrayBuffer::New(env, sizeof(float) * wave->num_samples);\n\n    Napi::Float32Array float32Array =\n        Napi::Float32Array::New(env, wave->num_samples, arrayBuffer, 0);\n\n    std::copy(wave->samples, wave->samples + wave->num_samples,\n              float32Array.Data());\n\n    Napi::Object obj = Napi::Object::New(env);\n    obj.Set(Napi::String::New(env, \"samples\"), float32Array);\n    obj.Set(Napi::String::New(env, \"sampleRate\"), wave->sample_rate);\n\n    SherpaOnnxFreeWave(wave);\n\n    return obj;\n  }\n}\n\nvoid InitWaveReader(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"readWave\"),\n              Napi::Function::New(env, ReadWaveWrapper));\n\n  exports.Set(Napi::String::New(env, \"readWaveFromBinary\"),\n              Napi::Function::New(env, ReadWaveFromBinaryWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/wave-writer.cc",
    "content": "// scripts/node-addon-api/src/wave-writer.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <sstream>\n\n#include \"napi.h\"  // NOLINT\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// (filename, {samples: samples, sampleRate: sampleRate}\nstatic Napi::Boolean WriteWaveWrapper(const Napi::CallbackInfo &info) {\n  Napi::Env env = info.Env();\n\n  if (info.Length() != 2) {\n    std::ostringstream os;\n    os << \"Expect only 2 arguments. Given: \" << info.Length();\n\n    Napi::TypeError::New(env, os.str()).ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[0].IsString()) {\n    Napi::TypeError::New(env, \"Argument 0 should be a string\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!info[1].IsObject()) {\n    Napi::TypeError::New(env, \"Argument 1 should be an object\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Object obj = info[1].As<Napi::Object>();\n\n  if (!obj.Has(\"samples\")) {\n    Napi::TypeError::New(env, \"The argument object should have a field samples\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"samples\").IsTypedArray()) {\n    Napi::TypeError::New(env, \"The object['samples'] should be a typed array\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Has(\"sampleRate\")) {\n    Napi::TypeError::New(env,\n                         \"The argument object should have a field sampleRate\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  if (!obj.Get(\"sampleRate\").IsNumber()) {\n    Napi::TypeError::New(env, \"The object['samples'] should be a number\")\n        .ThrowAsJavaScriptException();\n\n    return {};\n  }\n\n  Napi::Float32Array samples = obj.Get(\"samples\").As<Napi::Float32Array>();\n  int32_t sample_rate = obj.Get(\"sampleRate\").As<Napi::Number>().Int32Value();\n#if __OHOS__\n  int32_t ok = SherpaOnnxWriteWave(\n      samples.Data(), 
samples.ElementLength() / sizeof(float), sample_rate,\n      info[0].As<Napi::String>().Utf8Value().c_str());\n#else\n  int32_t ok =\n      SherpaOnnxWriteWave(samples.Data(), samples.ElementLength(), sample_rate,\n                          info[0].As<Napi::String>().Utf8Value().c_str());\n#endif\n\n  return Napi::Boolean::New(env, ok);\n}\n\nvoid InitWaveWriter(Napi::Env env, Napi::Object exports) {\n  exports.Set(Napi::String::New(env, \"writeWave\"),\n              Napi::Function::New(env, WriteWaveWrapper));\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/KeywordSpotting.ets",
    "content": "import {\n  createKeywordSpotter,\n  createKeywordStream,\n  isKeywordStreamReady,\n  decodeKeywordStream,\n  resetKeywordStream,\n  getKeywordResultAsJson,\n} from 'libsherpa_onnx.so';\n\nimport { FeatureConfig } from './NonStreamingAsr';\nimport { OnlineModelConfig, OnlineStream } from './StreamingAsr';\n\nexport class KeywordSpotterConfig {\n  public featConfig: FeatureConfig = new FeatureConfig();\n  public modelConfig: OnlineModelConfig = new OnlineModelConfig();\n  public maxActivePaths: number = 4;\n  public numTrailingBlanks: number = 1;\n  public keywordsScore: number = 1;\n  public keywordsThreshold: number = 0.25;\n  public keywordsFile: string = '';\n}\n\ninterface KeywordSpotterResultJson {\n  keyword: string;\n  timestamps: number[];\n  tokens: string[];\n}\n\nexport class KeywordSpotterResult {\n  public keyword: string = '';\n  public tokens: string[] = [];\n  public timestamps: number[] = [];\n  public json: string = '';\n}\n\nexport class KeywordSpotter {\n  public handle: object;\n  public config: KeywordSpotterConfig;\n\n  constructor(config: KeywordSpotterConfig, mgr?: object) {\n    this.handle = createKeywordSpotter(config, mgr);\n    this.config = config\n  }\n\n  createStream(keywords?: string): OnlineStream {\n    if (typeof keywords !== \"undefined\") {\n      return new OnlineStream(createKeywordStream(this.handle, keywords));\n    } else {\n      return new OnlineStream(createKeywordStream(this.handle));\n    }\n  }\n\n  isReady(stream: OnlineStream): boolean {\n    return isKeywordStreamReady(this.handle, stream.handle);\n  }\n\n  decode(stream: OnlineStream) {\n    decodeKeywordStream(this.handle, stream.handle);\n  }\n\n  reset(stream: OnlineStream) {\n    resetKeywordStream(this.handle, stream.handle);\n  }\n\n  getResult(stream: OnlineStream): KeywordSpotterResult {\n    const jsonStr: string = getKeywordResultAsJson(this.handle, stream.handle);\n\n    let o = JSON.parse(jsonStr) as KeywordSpotterResultJson;\n\n    
const r = new KeywordSpotterResult();\n    r.keyword = o.keyword;\n    r.timestamps = o.timestamps;\n    r.tokens = o.tokens;\n    r.json = jsonStr;\n\n    return r;\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/MainPage.ets",
    "content": "import hilog from '@ohos.hilog';\nimport testNapi from 'libsherpa_onnx.so';\n\n@Component\nexport struct MainPage {\n  @State message: string = 'Hello World';\n\n  build() {\n    Row() {\n      Column() {\n        Text(this.message)\n          .fontSize(50)\n          .fontWeight(FontWeight.Bold)\n          .onClick(() => {\n          })\n      }\n      .width('100%')\n    }\n    .height('100%')\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/NonStreamingAsr.ets",
    "content": "import {\n  acceptWaveformOffline,\n  createOfflineRecognizer,\n  createOfflineStream,\n  decodeOfflineStream,\n  getOfflineStreamResultAsJson,\n  offlineRecognizerSetConfig,\n} from 'libsherpa_onnx.so';\n\nexport interface Samples {\n  samples: Float32Array;\n  sampleRate: number;\n}\n\nexport class OfflineStream {\n  public handle: object;\n\n  constructor(handle: object) {\n    this.handle = handle;\n  }\n\n  // obj is {samples: samples, sampleRate: sampleRate}\n  // samples is a float32 array containing samples in the range [-1, 1]\n  // sampleRate is a number\n  acceptWaveform(obj: Samples) {\n    acceptWaveformOffline(this.handle, obj)\n  }\n}\n\nexport class HomophoneReplacerConfig {\n  public dictDir: string = '';  // unused\n  public lexicon: string = '';\n  public ruleFsts: string = '';\n}\n\nexport class FeatureConfig {\n  public sampleRate: number = 16000;\n  public featureDim: number = 80;\n}\n\nexport class OfflineTransducerModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n  public joiner: string = '';\n}\n\nexport class OfflineParaformerModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineNemoEncDecCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineDolphinModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineOmnilingualAsrCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineMedAsrCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineFunASRNanoModelConfig {\n  public encoderAdaptor: string = '';\n  public llm: string = '';\n  public embedding: string = '';\n  public tokenizer: string = '';\n  public systemPrompt: string = '';\n  public userPrompt: string = '';\n  public maxNewTokens: number = 0;\n  public temperature: number = 1e-6;\n  public topP: number = 0.8;\n  public seed: number = 0;\n  public language: string = '';\n  public itn: number = 0;\n  public hotwords: string = '';\n}\n\nexport class 
OfflineZipformerCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineWenetCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineFireRedAsrModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n}\n\nexport class OfflineFireRedAsrCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineWhisperModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n  public language: string = '';\n  public task: string = 'transcribe';\n  public tailPaddings: number = -1;\n  public enableTokenTimestamps: boolean = false;\n  public enableSegmentTimestamps: boolean = false;\n}\n\nexport class OfflineCanaryModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n  public srcLang: string = '';\n  public tgtLang: string = '';\n  public usePnc: number = 1;\n}\n\nexport class OfflineTdnnModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineSenseVoiceModelConfig {\n  public model: string = '';\n  public language: string = '';\n  public useItn: boolean = false;\n}\n\nexport class OfflineMoonshineModelConfig {\n  public preprocessor: string = '';\n  public encoder: string = '';\n  public uncachedDecoder: string = '';\n  public cachedDecoder: string = '';\n  public mergedDecoder: string = '';\n}\n\nexport class OfflineModelConfig {\n  public transducer: OfflineTransducerModelConfig = new OfflineTransducerModelConfig();\n  public paraformer: OfflineParaformerModelConfig = new OfflineParaformerModelConfig();\n  public nemoCtc: OfflineNemoEncDecCtcModelConfig = new OfflineNemoEncDecCtcModelConfig();\n  public whisper: OfflineWhisperModelConfig = new OfflineWhisperModelConfig();\n  public tdnn: OfflineTdnnModelConfig = new OfflineTdnnModelConfig();\n  public tokens: string = '';\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n  public modelType: string = '';\n  public modelingUnit: string = \"cjkchar\";\n  
public bpeVocab: string = '';\n  public telespeechCtc: string = '';\n  public senseVoice: OfflineSenseVoiceModelConfig = new OfflineSenseVoiceModelConfig();\n  public moonshine: OfflineMoonshineModelConfig = new OfflineMoonshineModelConfig();\n  public fireRedAsr: OfflineFireRedAsrModelConfig = new OfflineFireRedAsrModelConfig();\n  public dolphin: OfflineDolphinModelConfig = new OfflineDolphinModelConfig();\n  public zipformerCtc: OfflineZipformerCtcModelConfig = new OfflineZipformerCtcModelConfig();\n  public canary: OfflineCanaryModelConfig = new OfflineCanaryModelConfig();\n  public wenetCtc: OfflineWenetCtcModelConfig = new OfflineWenetCtcModelConfig();\n  public omnilingual: OfflineOmnilingualAsrCtcModelConfig = new OfflineOmnilingualAsrCtcModelConfig();\n  public medasr: OfflineMedAsrCtcModelConfig = new OfflineMedAsrCtcModelConfig();\n  public funasrNano: OfflineFunASRNanoModelConfig = new OfflineFunASRNanoModelConfig();\n  public fireRedAsrCtc: OfflineFireRedAsrCtcModelConfig = new OfflineFireRedAsrCtcModelConfig();\n}\n\nexport class OfflineLMConfig {\n  public model: string = '';\n  public scale: number = 1.0;\n}\n\nexport class OfflineRecognizerConfig {\n  public featConfig: FeatureConfig = new FeatureConfig();\n  public modelConfig: OfflineModelConfig = new OfflineModelConfig();\n  public lmConfig: OfflineLMConfig = new OfflineLMConfig();\n  public decodingMethod: string = \"greedy_search\";\n  public maxActivePaths: number = 4;\n  public hotwordsFile: string = '';\n  public hotwordsScore: number = 1.5;\n  public ruleFsts: string = '';\n  public ruleFars: string = '';\n  public blankPenalty: number = 0;\n  public hr: HomophoneReplacerConfig = new HomophoneReplacerConfig();\n}\n\nexport class OfflineRecognizerResult {\n  public text: string = '';\n  public timestamps: number[] = [];\n  public tokens: string[] = [];\n  public json = '';\n  public lang: string = '';\n  public emotion: string = '';\n  public event: string = '';\n}\n\ninterface 
OfflineRecognizerResultJson {\n  text: string;\n  timestamps: number[];\n  tokens: string[];\n  lang: string;\n  emotion: string;\n  event: string;\n}\n\nexport class OfflineRecognizer {\n  public handle: object;\n  public config: OfflineRecognizerConfig;\n\n  constructor(config: OfflineRecognizerConfig, mgr?: object) {\n    this.handle = createOfflineRecognizer(config, mgr);\n    this.config = config;\n  }\n\n  setConfig(config: OfflineRecognizerConfig) {\n    offlineRecognizerSetConfig(this.handle, config);\n  }\n\n  createStream(): OfflineStream {\n    const handle: object = createOfflineStream(this.handle);\n    return new OfflineStream(handle);\n  }\n\n  decode(stream: OfflineStream) {\n    decodeOfflineStream(this.handle, stream.handle);\n  }\n\n  getResult(stream: OfflineStream): OfflineRecognizerResult {\n    const jsonStr: string = getOfflineStreamResultAsJson(stream.handle);\n\n    let o = JSON.parse(jsonStr) as OfflineRecognizerResultJson;\n\n    const r = new OfflineRecognizerResult();\n    r.text = o.text;\n    r.timestamps = o.timestamps;\n    r.tokens = o.tokens;\n    r.json = jsonStr;\n    r.lang = o.lang;\n    r.emotion = o.emotion;\n    r.event = o.event;\n\n    return r;\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/NonStreamingSpeakerDiarization.ets",
    "content": "import {\n  createOfflineSpeakerDiarization,\n  getOfflineSpeakerDiarizationSampleRate,\n  offlineSpeakerDiarizationProcess,\n  offlineSpeakerDiarizationProcessAsync,\n  offlineSpeakerDiarizationSetConfig,\n} from 'libsherpa_onnx.so';\n\nimport { SpeakerEmbeddingExtractorConfig } from './SpeakerIdentification';\n\nexport class OfflineSpeakerSegmentationPyannoteModelConfig {\n  public model: string = '';\n}\n\nexport class OfflineSpeakerSegmentationModelConfig {\n  public pyannote: OfflineSpeakerSegmentationPyannoteModelConfig = new OfflineSpeakerSegmentationPyannoteModelConfig();\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n}\n\nexport class FastClusteringConfig {\n  public numClusters: number = -1;\n  public threshold: number = 0.5;\n}\n\nexport class OfflineSpeakerDiarizationConfig {\n  public segmentation: OfflineSpeakerSegmentationModelConfig = new OfflineSpeakerSegmentationModelConfig();\n  public embedding: SpeakerEmbeddingExtractorConfig = new SpeakerEmbeddingExtractorConfig();\n  public clustering: FastClusteringConfig = new FastClusteringConfig();\n  public minDurationOn: number = 0.2;\n  public minDurationOff: number = 0.5;\n}\n\nexport class OfflineSpeakerDiarizationSegment {\n  // in seconds\n  public start: number = 0;\n  // in seconds\n  public end: number = 0;\n  // ID of the speaker; count from 0\n  public speaker: number = 0;\n}\n\nexport class OfflineSpeakerDiarization {\n  public config: OfflineSpeakerDiarizationConfig;\n  public sampleRate: number;\n  private handle: object;\n\n  constructor(config: OfflineSpeakerDiarizationConfig, mgr?: object) {\n    this.handle = createOfflineSpeakerDiarization(config, mgr);\n    this.config = config;\n\n    this.sampleRate = getOfflineSpeakerDiarizationSampleRate(this.handle);\n  }\n\n  /**\n   * samples is a 1-d float32 array. 
Each element of the array should be\n   * in the range [-1, 1].\n   *\n   * We assume its sample rate equals this.sampleRate.\n   *\n   * Returns an array of objects, where each object is\n   *\n   *  {\n   *    \"start\": start_time_in_seconds,\n   *    \"end\": end_time_in_seconds,\n   *    \"speaker\": an_integer,\n   *  }\n   */\n  process(samples: Float32Array): OfflineSpeakerDiarizationSegment[] {\n    return offlineSpeakerDiarizationProcess(this.handle, samples) as OfflineSpeakerDiarizationSegment[];\n  }\n\n  processAsync(samples: Float32Array, callback: (numProcessedChunks: number,\n    numTotalChunks: number) => void): Promise<OfflineSpeakerDiarizationSegment[]> {\n    return offlineSpeakerDiarizationProcessAsync(this.handle, samples,\n      callback) as Promise<OfflineSpeakerDiarizationSegment[]>;\n  }\n\n  setConfig(config: OfflineSpeakerDiarizationConfig) {\n    offlineSpeakerDiarizationSetConfig(this.handle, config);\n    this.config.clustering = config.clustering;\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/NonStreamingTts.ets",
    "content": "import {\n  createOfflineTts,\n  getOfflineTtsNumSpeakers,\n  getOfflineTtsSampleRate,\n  offlineTtsGenerate,\n  offlineTtsGenerateAsyncWithConfig,\n  offlineTtsGenerateAsync,\n  offlineTtsGenerateWithConfig,\n} from 'libsherpa_onnx.so';\n\nexport class OfflineTtsVitsModelConfig {\n  public model: string = '';\n  public lexicon: string = '';\n  public tokens: string = '';\n  public dataDir: string = '';\n  public dictDir: String = '';  // unused\n  public noiseScale: number = 0.667;\n  public noiseScaleW: number = 0.8;\n  public lengthScale: number = 1.0;\n}\n\nexport class OfflineTtsMatchaModelConfig {\n  public acousticModel: string = '';\n  public vocoder: string = '';\n  public lexicon: string = '';\n  public tokens: string = '';\n  public dataDir: string = '';\n  public dictDir: String = '';  // unused\n  public noiseScale: number = 0.667;\n  public lengthScale: number = 1.0;\n}\n\nexport class OfflineTtsKokoroModelConfig {\n  public model: string = '';\n  public voices: string = '';\n  public tokens: string = '';\n  public dataDir: string = '';\n  public lengthScale: number = 1.0;\n  public dictDir: string = '';  // unused\n  public lexicon: string = '';\n  public lang: string = '';\n}\n\nexport class OfflineTtsKittenModelConfig {\n  public model: string = '';\n  public voices: string = '';\n  public tokens: string = '';\n  public dataDir: string = '';\n  public lengthScale: number = 1.0;\n}\n\nexport class OfflineTtsZipvoiceModelConfig {\n  public tokens: string = '';\n  public encoder: string = '';\n  public decoder: string = '';\n  public vocoder: string = '';\n  public dataDir: string = '';\n  public lexicon: string = '';\n  public featScale: number = 0.1;\n  public tShift: number = 0.5;\n  public targetRms: number = 0.1;\n  public guidanceScale: number = 1.0;\n}\n\nexport class OfflineTtsPocketModelConfig {\n  public lmFlow: string = '';\n  public lmMain: string = '';\n  public encoder: string = '';\n  public decoder: string = '';\n  
public textConditioner: string = '';\n  public vocabJson: string = '';\n  public tokenScoresJson: string = '';\n  public voiceEmbeddingCacheCapacity: number = 50;\n}\n\nexport class OfflineTtsSupertonicModelConfig {\n  public durationPredictor: string = '';\n  public textEncoder: string = '';\n  public vectorEstimator: string = '';\n  public vocoder: string = '';\n  public ttsJson: string = '';\n  public unicodeIndexer: string = '';\n  public voiceStyle: string = '';\n}\n\nexport class OfflineTtsModelConfig {\n  public vits: OfflineTtsVitsModelConfig = new OfflineTtsVitsModelConfig();\n  public matcha: OfflineTtsMatchaModelConfig = new OfflineTtsMatchaModelConfig();\n  public kokoro: OfflineTtsKokoroModelConfig = new OfflineTtsKokoroModelConfig();\n  public kitten: OfflineTtsKittenModelConfig = new OfflineTtsKittenModelConfig();\n  public zipvoice: OfflineTtsZipvoiceModelConfig = new OfflineTtsZipvoiceModelConfig();\n  public pocket: OfflineTtsPocketModelConfig = new OfflineTtsPocketModelConfig();\n  public supertonic: OfflineTtsSupertonicModelConfig = new OfflineTtsSupertonicModelConfig();\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n}\n\nexport class OfflineTtsConfig {\n  public model: OfflineTtsModelConfig = new OfflineTtsModelConfig();\n  public ruleFsts: string = '';\n  public ruleFars: string = '';\n  public maxNumSentences: number = 1;\n  public silenceScale: number = 0.2;\n}\n\nexport class TtsOutput {\n  public samples: Float32Array = new Float32Array(0);\n  public sampleRate: number = 0;\n}\n\ninterface TtsCallbackData {\n  samples: Float32Array;\n  progress: number;\n}\n\nexport class TtsGenerationConfig {\n  public silenceScale: number = 0.2;\n  public speed: number = 1.0;\n  public sid: number = 0;\n  public referenceAudio?: Float32Array;\n  public referenceSampleRate: number = 0;\n  public referenceText: string = '';\n  public numSteps: number = 5;\n  public extra: object = new 
Object();\n}\n\nexport class TtsInput {\n  public text: string = '';\n  public sid: number = 0;\n  public speed: number = 1.0;\n  public enableExternalBuffer: boolean = true;\n  public callback?: (data: TtsCallbackData) => number;\n}\n\nexport class TtsInputWithConfig {\n  public text: string = '';\n  public generationConfig: TtsGenerationConfig = new TtsGenerationConfig();\n  public enableExternalBuffer: boolean = true;\n  public callback?: (data: TtsCallbackData) => number;\n}\n\nexport class OfflineTts {\n  public config: OfflineTtsConfig;\n  public numSpeakers: number;\n  public sampleRate: number;\n  private handle: object;\n\n  constructor(config: OfflineTtsConfig, mgr?: object) {\n    this.handle = createOfflineTts(config, mgr);\n    this.config = config;\n\n    this.numSpeakers = getOfflineTtsNumSpeakers(this.handle);\n    this.sampleRate = getOfflineTtsSampleRate(this.handle);\n  }\n\n  /*\n   input obj: {text: \"xxxx\", sid: 0, speed: 1.0}\n   where text is a string, sid is an int32, and speed is a float\n\n   returns an object {samples: Float32Array, sampleRate: <a number>}\n   */\n  generate(input: TtsInput): TtsOutput {\n    return offlineTtsGenerate(this.handle, input);\n  }\n\n  generateAsync(input: TtsInput): Promise<TtsOutput> {\n    return offlineTtsGenerateAsync(this.handle, input);\n  }\n\n  generateWithConfig(input: TtsInputWithConfig): TtsOutput {\n    return offlineTtsGenerateWithConfig(this.handle, input);\n  }\n\n  generateAsyncWithConfig(input: TtsInputWithConfig): Promise<TtsOutput> {\n    return offlineTtsGenerateAsyncWithConfig(this.handle, input);\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/OfflinePunctuation.ets",
    "content": "import {\n  createOfflinePunctuation,\n  offlinePunctuationAddPunct,\n} from 'libsherpa_onnx.so';\n\nexport class OfflinePunctuationModelConfig {\n  public ctTransformer: string = '';\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n}\n\nexport class OfflinePunctuationConfig {\n  public model: OfflinePunctuationModelConfig = new OfflinePunctuationModelConfig();\n}\n\nexport class OfflinePunctuation {\n  public config: OfflinePunctuationConfig;\n  private handle: object;\n\n  constructor(config: OfflinePunctuationConfig, mgr?: object) {\n    this.handle = createOfflinePunctuation(config, mgr);\n    this.config = config;\n  }\n\n  addPunct(text: string): string {\n    return offlinePunctuationAddPunct(this.handle, text);\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/OnlinePunctuation.ets",
    "content": "import {\n  createOnlinePunctuation,\n  onlinePunctuationAddPunct,\n} from 'libsherpa_onnx.so';\n\nexport class OnlinePunctuationModelConfig {\n  public cnnBilstm: string = '';\n  public bpeVocab: string = '';\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n}\n\nexport class OnlinePunctuationConfig {\n  public model: OnlinePunctuationModelConfig = new OnlinePunctuationModelConfig();\n}\n\nexport class OnlinePunctuation {\n  public config: OnlinePunctuationConfig;\n  private handle: object;\n\n  constructor(config: OnlinePunctuationConfig, mgr?: object) {\n    this.handle = createOnlinePunctuation(config, mgr);\n    this.config = config;\n  }\n\n  addPunct(text: string): string {\n    return onlinePunctuationAddPunct(this.handle, text);\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/SpeakerIdentification.ets",
    "content": "import {\n  createSpeakerEmbeddingExtractor,\n  createSpeakerEmbeddingManager,\n  speakerEmbeddingExtractorComputeEmbedding,\n  speakerEmbeddingExtractorCreateStream,\n  speakerEmbeddingExtractorDim,\n  speakerEmbeddingExtractorIsReady,\n  speakerEmbeddingManagerAdd,\n  speakerEmbeddingManagerAddListFlattened,\n  speakerEmbeddingManagerContains,\n  speakerEmbeddingManagerGetAllSpeakers,\n  speakerEmbeddingManagerNumSpeakers,\n  speakerEmbeddingManagerRemove,\n  speakerEmbeddingManagerSearch,\n  speakerEmbeddingManagerVerify\n} from 'libsherpa_onnx.so';\nimport { OnlineStream } from './StreamingAsr';\n\nexport class SpeakerEmbeddingExtractorConfig {\n  public model: string = '';\n  public numThreads: number = 1;\n  public debug: boolean = false;\n  public provider: string = 'cpu';\n}\n\nexport class SpeakerEmbeddingExtractor {\n  public config: SpeakerEmbeddingExtractorConfig = new SpeakerEmbeddingExtractorConfig();\n  public dim: number;\n  private handle: object;\n\n  constructor(config: SpeakerEmbeddingExtractorConfig, mgr?: object) {\n    this.handle = createSpeakerEmbeddingExtractor(config, mgr);\n    this.config = config;\n    this.dim = speakerEmbeddingExtractorDim(this.handle);\n  }\n\n  createStream(): OnlineStream {\n    return new OnlineStream(speakerEmbeddingExtractorCreateStream(this.handle));\n  }\n\n  isReady(stream: OnlineStream): boolean {\n    return speakerEmbeddingExtractorIsReady(this.handle, stream.handle);\n  }\n\n  compute(stream: OnlineStream, enableExternalBuffer: boolean = true): Float32Array {\n    return speakerEmbeddingExtractorComputeEmbedding(this.handle, stream.handle, enableExternalBuffer);\n  }\n}\n\nfunction flatten(arrayList: Float32Array[]): Float32Array {\n  let n = 0;\n  for (let i = 0; i < arrayList.length; ++i) {\n    n += arrayList[i].length;\n  }\n  let ans = new Float32Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < arrayList.length; ++i) {\n    ans.set(arrayList[i], offset);\n    offset += 
arrayList[i].length;\n  }\n  return ans;\n}\n\ninterface SpeakerNameWithEmbedding {\n  name: string;\n  v: Float32Array;\n}\n\ninterface SpeakerNameWithEmbeddingList {\n  name: string;\n  v: Float32Array[];\n}\n\ninterface SpeakerNameWithEmbeddingN {\n  name: string;\n  vv: Float32Array;\n  n: number;\n}\n\ninterface EmbeddingWithThreshold {\n  v: Float32Array;\n  threshold: number;\n}\n\ninterface SpeakerNameEmbeddingThreshold {\n  name: string;\n  v: Float32Array;\n  threshold: number;\n}\n\nexport class SpeakerEmbeddingManager {\n  public dim: number;\n  private handle: object;\n\n  constructor(dim: number) {\n    this.handle = createSpeakerEmbeddingManager(dim);\n    this.dim = dim;\n  }\n\n  add(speaker: SpeakerNameWithEmbedding): boolean {\n    return speakerEmbeddingManagerAdd(this.handle, speaker);\n  }\n\n  addMulti(speaker: SpeakerNameWithEmbeddingList): boolean {\n    const c: SpeakerNameWithEmbeddingN = {\n      name: speaker.name, vv: flatten(speaker.v), n: speaker.v.length,\n    };\n    return speakerEmbeddingManagerAddListFlattened(this.handle, c);\n  }\n\n  remove(name: string): boolean {\n    return speakerEmbeddingManagerRemove(this.handle, name);\n  }\n\n  search(obj: EmbeddingWithThreshold): string {\n    return speakerEmbeddingManagerSearch(this.handle, obj);\n  }\n\n  verify(obj: SpeakerNameEmbeddingThreshold): boolean {\n    return speakerEmbeddingManagerVerify(this.handle, obj);\n  }\n\n  contains(name: string): boolean {\n    return speakerEmbeddingManagerContains(this.handle, name);\n  }\n\n  getNumSpeakers(): number {\n    return speakerEmbeddingManagerNumSpeakers(this.handle);\n  }\n\n  getAllSpeakerNames(): string[] {\n    return speakerEmbeddingManagerGetAllSpeakers(this.handle);\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/StreamingAsr.ets",
    "content": "import {\n  acceptWaveformOnline,\n  createOnlineRecognizer,\n  createOnlineStream,\n  decodeOnlineStream,\n  getOnlineStreamResultAsJson,\n  inputFinished,\n  isEndpoint,\n  isOnlineStreamReady,\n  reset,\n} from 'libsherpa_onnx.so';\n\nimport { FeatureConfig, HomophoneReplacerConfig, Samples } from './NonStreamingAsr';\n\nexport class OnlineStream {\n  public handle: object;\n\n  constructor(handle: object) {\n    this.handle = handle;\n  }\n\n  // obj is {samples: samples, sampleRate: sampleRate}\n  // samples is a float32 array containing samples in the range [-1, 1]\n  // sampleRate is a number\n  acceptWaveform(obj: Samples) {\n    acceptWaveformOnline(this.handle, obj)\n  }\n\n  inputFinished() {\n    inputFinished(this.handle)\n  }\n}\n\nexport class OnlineTransducerModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n  public joiner: string = '';\n}\n\nexport class OnlineParaformerModelConfig {\n  public encoder: string = '';\n  public decoder: string = '';\n}\n\nexport class OnlineZipformer2CtcModelConfig {\n  public model: string = '';\n}\n\nexport class OnlineNemoCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OnlineToneCtcModelConfig {\n  public model: string = '';\n}\n\nexport class OnlineModelConfig {\n  public transducer: OnlineTransducerModelConfig = new OnlineTransducerModelConfig();\n  public paraformer: OnlineParaformerModelConfig = new OnlineParaformerModelConfig();\n  public zipformer2Ctc: OnlineZipformer2CtcModelConfig = new OnlineZipformer2CtcModelConfig();\n  public nemoCtc: OnlineNemoCtcModelConfig = new OnlineNemoCtcModelConfig();\n  public toneCtc: OnlineToneCtcModelConfig = new OnlineToneCtcModelConfig();\n  public tokens: string = '';\n  public numThreads: number = 1;\n  public provider: string = 'cpu';\n  public debug: boolean = false;\n  public modelType: string = '';\n  public modelingUnit: string = \"cjkchar\";\n  public bpeVocab: string = '';\n  // Raw string data 
mirrored from the native OHOS binding; size is the string length.\n  public tokensBuf: string = '';\n  public tokensBufSize: number = 0;\n}\n\nexport class OnlineCtcFstDecoderConfig {\n  public graph: string = '';\n  public maxActive: number = 3000;\n}\n\nexport class OnlineRecognizerConfig {\n  public featConfig: FeatureConfig = new FeatureConfig();\n  public modelConfig: OnlineModelConfig = new OnlineModelConfig();\n  public decodingMethod: string = 'greedy_search';\n  public maxActivePaths: number = 4;\n  public enableEndpoint: boolean = false;\n  public rule1MinTrailingSilence: number = 2.4;\n  public rule2MinTrailingSilence: number = 1.2;\n  public rule3MinUtteranceLength: number = 20;\n  public hotwordsFile: string = '';\n  public hotwordsScore: number = 1.5;\n  public ctcFstDecoderConfig: OnlineCtcFstDecoderConfig = new OnlineCtcFstDecoderConfig();\n  public ruleFsts: string = '';\n  public ruleFars: string = '';\n  public blankPenalty: number = 0;\n  // Raw string data mirrored from the native OHOS binding; size is the string length.\n  public hotwordsBuf: string = '';\n  public hotwordsBufSize: number = 0;\n  public hr: HomophoneReplacerConfig = new HomophoneReplacerConfig();\n}\n\ninterface OnlineRecognizerResultJson {\n  text: string;\n  timestamps: number[];\n  tokens: string[];\n}\n\nexport class OnlineRecognizerResult {\n  public text: string = '';\n  public tokens: string[] = [];\n  public timestamps: number[] = [];\n  public json: string = '';\n}\n\nexport class OnlineRecognizer {\n  public handle: object;\n  public config: OnlineRecognizerConfig\n\n  constructor(config: OnlineRecognizerConfig, mgr?: object) {\n    this.handle = createOnlineRecognizer(config, mgr);\n    this.config = config\n  }\n\n  createStream(): OnlineStream {\n    const handle: object = createOnlineStream(this.handle);\n    return new OnlineStream(handle);\n  }\n\n  isReady(stream: OnlineStream): boolean {\n    return isOnlineStreamReady(this.handle, stream.handle);\n  }\n\n  
decode(stream: OnlineStream) {\n    decodeOnlineStream(this.handle, stream.handle);\n  }\n\n  isEndpoint(stream: OnlineStream): boolean {\n    return isEndpoint(this.handle, stream.handle);\n  }\n\n  reset(stream: OnlineStream) {\n    reset(this.handle, stream.handle);\n  }\n\n  getResult(stream: OnlineStream): OnlineRecognizerResult {\n    const jsonStr: string = getOnlineStreamResultAsJson(this.handle, stream.handle);\n\n    let o = JSON.parse(jsonStr) as OnlineRecognizerResultJson;\n\n    const r = new OnlineRecognizerResult()\n    r.text = o.text\n    r.timestamps = o.timestamps;\n    r.tokens = o.tokens;\n    r.json = jsonStr;\n\n    return r;\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/ets/components/Vad.ets",
    "content": "import {\n  circularBufferGet,\n  circularBufferHead,\n  circularBufferPop,\n  circularBufferPush,\n  circularBufferReset,\n  circularBufferSize,\n  createCircularBuffer,\n  createVoiceActivityDetector,\n  voiceActivityDetectorAcceptWaveform,\n  voiceActivityDetectorClear,\n  voiceActivityDetectorFlush,\n  voiceActivityDetectorFront,\n  voiceActivityDetectorIsDetected,\n  voiceActivityDetectorIsEmpty,\n  voiceActivityDetectorPop,\n  voiceActivityDetectorReset,\n} from 'libsherpa_onnx.so';\n\nexport class SileroVadConfig {\n  public model: string;\n  public threshold: number;\n  public minSpeechDuration: number;\n  public minSilenceDuration: number;\n  public windowSize: number;\n  public maxSpeechDuration: number;\n\n  public constructor(model: string, threshold: number, minSpeechDuration: number, minSilenceDuration: number,\n    windowSize: number, maxSpeechDuration: number = 20) {\n    this.model = model;\n    this.threshold = threshold;\n    this.minSpeechDuration = minSpeechDuration;\n    this.minSilenceDuration = minSilenceDuration;\n    this.windowSize = windowSize;\n    this.maxSpeechDuration = maxSpeechDuration\n  }\n}\n\nexport class TenVadConfig {\n  public model: string;\n  public threshold: number;\n  public minSpeechDuration: number;\n  public minSilenceDuration: number;\n  public windowSize: number;\n  public maxSpeechDuration: number;\n\n  public constructor(model: string, threshold: number, minSpeechDuration: number, minSilenceDuration: number,\n    windowSize: number, maxSpeechDuration: number = 20) {\n    this.model = model;\n    this.threshold = threshold;\n    this.minSpeechDuration = minSpeechDuration;\n    this.minSilenceDuration = minSilenceDuration;\n    this.windowSize = windowSize;\n    this.maxSpeechDuration = maxSpeechDuration\n  }\n}\n\nexport class VadConfig {\n  public sileroVad: SileroVadConfig;\n  public tenVad: TenVadConfig;\n  public sampleRate: number;\n  public debug: boolean;\n  public numThreads: number;\n  
public provider: string = 'cpu';\n\n  public constructor(sileroVad: SileroVadConfig, tenVad: TenVadConfig, sampleRate: number, debug: boolean,\n    numThreads: number, provider: string = 'cpu') {\n    this.sileroVad = sileroVad;\n    this.tenVad = tenVad;\n    this.sampleRate = sampleRate;\n    this.debug = debug;\n    this.numThreads = numThreads;\n    this.provider = provider;\n  }\n}\n\nexport class CircularBuffer {\n  private handle: object;\n\n  constructor(capacity: number) {\n    this.handle = createCircularBuffer(capacity);\n  }\n\n  // samples is a float32 array\n  push(samples: Float32Array) {\n    circularBufferPush(this.handle, samples);\n  }\n\n  // return a float32 array\n  get(startIndex: number, n: number, enableExternalBuffer: boolean = true): Float32Array {\n    return circularBufferGet(this.handle, startIndex, n, enableExternalBuffer);\n  }\n\n  pop(n: number) {\n    circularBufferPop(this.handle, n);\n  }\n\n  size(): number {\n    return circularBufferSize(this.handle);\n  }\n\n  head(): number {\n    return circularBufferHead(this.handle);\n  }\n\n  reset() {\n    circularBufferReset(this.handle);\n  }\n}\n\nexport interface SpeechSegment {\n  samples: Float32Array;\n  start: number;\n}\n\nexport class Vad {\n  public config: VadConfig;\n  private handle: object;\n\n  constructor(config: VadConfig, bufferSizeInSeconds: number = 60, mgr?: object) {\n    this.handle = createVoiceActivityDetector(config, bufferSizeInSeconds, mgr);\n    this.config = config;\n  }\n\n  acceptWaveform(samples: Float32Array): void {\n    voiceActivityDetectorAcceptWaveform(this.handle, samples);\n  }\n\n  isEmpty(): boolean {\n    return voiceActivityDetectorIsEmpty(this.handle);\n  }\n\n  isDetected(): boolean {\n    return voiceActivityDetectorIsDetected(this.handle);\n  }\n\n  pop(): void {\n    voiceActivityDetectorPop(this.handle);\n  }\n\n  clear(): void {\n    voiceActivityDetectorClear(this.handle);\n  }\n\n  front(enableExternalBuffer = true): SpeechSegment 
{\n    return voiceActivityDetectorFront(this.handle, enableExternalBuffer);\n  }\n\n  reset(): void {\n    voiceActivityDetectorReset(this.handle);\n  }\n\n  flush(): void {\n    voiceActivityDetectorFlush(this.handle);\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"sherpa_onnx\",\n    \"type\": \"har\",\n    \"deviceTypes\": [\n      \"default\",\n      \"tablet\",\n      \"2in1\"\n    ]\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"page_show\",\n      \"value\": \"page from package\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"page_show\",\n      \"value\": \"page from package\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"page_show\",\n      \"value\": \"page from package\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"sherpa_onnx_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"default\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxHar/sherpa_onnx/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx.speaker.diarization\",\n    \"vendor\": \"example\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxSpeakerDiarization\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"sourceOption\": {\n      \"workers\": [\n        './src/main/ets/workers/SpeakerDiarizationWorker.ets'\n      ]\n    }\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {\n    \"sherpa_onnx\": \"1.12.31\"\n  }\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has been brought to the foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has moved to the background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/ets/pages/Index.ets",
    "content": "import { LengthUnit, promptAction } from '@kit.ArkUI';\nimport worker, { MessageEvents } from '@ohos.worker';\nimport { BusinessError, pasteboard } from '@kit.BasicServicesKit';\nimport { picker } from '@kit.CoreFileKit';\n\n\n@Entry\n@Component\nstruct Index {\n  @State title: string = 'Next-gen Kaldi: Speaker Diarization';\n  @State titleFontSize: number = 15;\n  @State currentIndex: number = 0;\n  @State resultForFile: string = '';\n  @State resultForMic: string = '';\n  @State progressForFile: number = 0;\n  @State selectFileBtnEnabled: boolean = false;\n  @State copyBtnForFileEnabled: boolean = false;\n  private controller: TabsController = new TabsController();\n  private workerInstance?: worker.ThreadWorker\n  private readonly scriptURL: string = 'entry/ets/workers/SpeakerDiarizationWorker.ets'\n  private numSpeakers: string = '-1';\n\n  @Builder\n  TabBuilder(title: string, targetIndex: number, selectedImg: Resource, normalImg: Resource) {\n    Column() {\n      Image(this.currentIndex == targetIndex ? selectedImg : normalImg).size({ width: 25, height: 25 })\n      Text(title).fontColor(this.currentIndex == targetIndex ? 
'#28bff1' : '#8a8a8a')\n    }.width('100%').height(50).justifyContent(FlexAlign.Center).onClick(() => {\n      this.currentIndex = targetIndex;\n      this.controller.changeIndex(this.currentIndex);\n    })\n  }\n\n  aboutToAppear(): void {\n    this.workerInstance = new worker.ThreadWorker(this.scriptURL, {\n      name: 'Speaker diarization worker'\n    });\n\n    this.workerInstance.onmessage = (e: MessageEvents) => {\n      const msgType = e.data['msgType'] as string;\n\n      if (msgType != 'speaker-diarization-file-progress') {\n        console.log(`received msg from worker: ${msgType}`);\n      }\n\n      if (msgType == 'init-speaker-diarization-done') {\n        console.log('Speaker diarization initialized successfully');\n\n        this.resultForFile = 'Initialization finished.\\nPlease select a .wav file.';\n        this.resultForMic = 'Initialization finished.\\nPlease click the button Start recording.';\n\n        this.selectFileBtnEnabled = true;\n      }\n\n      if (msgType == 'speaker-diarization-file-progress') {\n        this.progressForFile = e.data['progress'] as number;\n      }\n\n      if (msgType == 'speaker-diarization-file-done') {\n        const result = e.data['result'] as string;\n        this.resultForFile = result;\n\n        this.selectFileBtnEnabled = true;\n        this.copyBtnForFileEnabled = true;\n      }\n    };\n\n    const context = getContext();\n    this.workerInstance.postMessage({ msgType: 'init-speaker-diarization', context });\n    console.log('initializing');\n    this.resultForFile = 'Initializing models. 
Please wait';\n    this.resultForMic = this.resultForFile;\n  }\n\n  build() {\n    Column() {\n      Tabs({ barPosition: BarPosition.End, controller: this.controller }) {\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            Row({ space: 10 }) {\n              Text(`Number of speakers`).width('60%')\n\n              TextInput({ text: this.numSpeakers }).onChange((text) => {\n                this.numSpeakers = text.trim();\n              }).width('20%')\n            }.justifyContent(FlexAlign.Center)\n\n            Row({ space: 10 }) {\n              Button('Select .wav file (16kHz) ').enabled(this.selectFileBtnEnabled).onClick(() => {\n                this.resultForFile = '';\n                this.progressForFile = 0;\n                this.copyBtnForFileEnabled = false;\n\n                let numSpeakers = parseInt(this.numSpeakers);\n                if (numSpeakers.toString() != this.numSpeakers) {\n                  this.resultForFile =\n                    'Please input a valid value for the number of speakers in the .wav file you are going to select';\n                  return;\n                }\n\n                if (numSpeakers < 1) {\n                  this.resultForFile =\n                    'Please input a positive value for the number of speakers in the .wav file you are going to select';\n                  return;\n                }\n\n                this.selectFileBtnEnabled = false;\n\n                const documentSelectOptions = new picker.DocumentSelectOptions();\n                documentSelectOptions.maxSelectNumber = 1;\n                documentSelectOptions.fileSuffixFilters = ['.wav'];\n                const documentViewPicker = new picker.DocumentViewPicker();\n\n                documentViewPicker.select(documentSelectOptions).then((result: Array<string>) => {\n                  console.log(`select file result: ${result}`);\n\n          
        if (!result[0]) {\n                    this.resultForFile = 'Please select a file to decode';\n                    this.selectFileBtnEnabled = true;\n                    return;\n                  }\n\n                  if (this.workerInstance) {\n                    this.workerInstance.postMessage({\n                      msgType: 'speaker-diarization-file', filename: result[0], numSpeakers,\n                    });\n                    this.resultForFile = `Decoding ${result[0]} ... ...`;\n                  } else {\n                    console.log(`this worker instance is undefined ${this.workerInstance}`);\n                  }\n                }).catch((err: BusinessError) => {\n                  console.error(`Failed to select file, code is ${err.code}, message is ${err.message}`);\n                  this.selectFileBtnEnabled = true;\n                })\n              })\n              Button('Copy results')\n                .enabled(this.copyBtnForFileEnabled)\n                .onClick(() => { // See https://developer.huawei.com/consumer/cn/doc/harmonyos-faqs/faqs-arkui-308-V5\n                  const pasteboardData = pasteboard.createData(pasteboard.MIMETYPE_TEXT_PLAIN, this.resultForFile);\n                  const systemPasteboard = pasteboard.getSystemPasteboard();\n                  systemPasteboard.setData(pasteboardData);\n                  systemPasteboard.getData().then((data) => {\n                    if (data) {\n                      promptAction.showToast({ message: 'Result copied.' 
});\n                    } else {\n                      promptAction.showToast({ message: 'Failed to copy' });\n                    }\n                  })\n                })\n            }\n\n            if (this.progressForFile > 0) {\n              Row() {\n                Progress({ value: 0, total: 100, type: ProgressType.Capsule })\n                  .width('80%')\n                  .height(20)\n                  .value(this.progressForFile);\n\n                Text(`${this.progressForFile.toFixed(2)}%`).width('15%')\n              }.width('100%').justifyContent(FlexAlign.Center)\n            }\n\n            TextArea({ text: this.resultForFile })\n              .lineSpacing({ value: 10, unit: LengthUnit.VP })\n              .width('100%')\n              .height('100%')\n          }\n        }.tabBar(this.TabBuilder('From file', 0, $r('app.media.icon_doc'), $r('app.media.icon_doc')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            TextArea({\n              text: `\nEverything is open-sourced.\n\nIt runs locally, without accessing the network.\n\nSee also https://github.com/k2-fsa/sherpa-onnx\n\n新一代 Kaldi QQ 和微信交流群: 请看\n\nhttps://k2-fsa.github.io/sherpa/social-groups.html\n\n微信公众号: 新一代 Kaldi\n            `\n            }).width('100%').height('100%').focusable(false)\n          }.justifyContent(FlexAlign.Start)\n        }.tabBar(this.TabBuilder('Help', 1, $r('app.media.info'), $r('app.media.info')))\n      }.scrollable(false)\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/ets/workers/SpeakerDiarizationWorker.ets",
    "content": "import worker, { ErrorEvent, MessageEvents, ThreadWorkerGlobalScope } from '@ohos.worker';\nimport {\n  OfflineSpeakerDiarization,\n  OfflineSpeakerDiarizationConfig,\n  OfflineSpeakerDiarizationSegment,\n  readWaveFromBinary,\n  Samples\n} from 'sherpa_onnx';\nimport { fileIo } from '@kit.CoreFileKit';\n\nconst workerPort: ThreadWorkerGlobalScope = worker.workerPort;\n\nlet sd: OfflineSpeakerDiarization;\nlet useAsync: boolean = true;\n\nfunction readWave(filename: string): Samples {\n  const fp = fileIo.openSync(filename);\n  const stat = fileIo.statSync(fp.fd);\n  const arrayBuffer = new ArrayBuffer(stat.size);\n  fileIo.readSync(fp.fd, arrayBuffer);\n  const data: Uint8Array = new Uint8Array(arrayBuffer);\n  return readWaveFromBinary(data) as Samples;\n}\n\nfunction initOfflineSpeakerDiarization(context: Context): OfflineSpeakerDiarization {\n  const config: OfflineSpeakerDiarizationConfig = new OfflineSpeakerDiarizationConfig();\n\n  // Please refer to https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n  // to download models.\n  // Make sure you have placed it inside the directory\n  // harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/rawfile\n  //\n  // Also, please delete unused files to reduce the size of the app\n  config.segmentation.pyannote.model = 'sherpa-onnx-pyannote-segmentation-3-0/model.int8.onnx';\n  config.segmentation.numThreads = 2;\n  config.segmentation.debug = true;\n\n  // Please refer to https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n  // to download models.\n  // Make sure you have placed it inside the directory\n  // harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/rawfile\n  config.embedding.model = '3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx';\n  config.embedding.numThreads = 2;\n  config.embedding.debug = true;\n\n  config.minDurationOn = 0.2;\n  config.minDurationOff = 0.5;\n  return new 
OfflineSpeakerDiarization(config, context.resourceManager);\n\n  // For the above two models files, you should have the following directory structure\n  /*\n  (py38) fangjuns-MacBook-Pro:rawfile fangjun$ pwd\n  /Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/rawfile\n  (py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls -lh\n  total 77336\n  -rw-r--r--  1 fangjun  staff    38M Dec 10 16:28 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n  drwxr-xr-x  3 fangjun  staff    96B Dec 10 19:36 sherpa-onnx-pyannote-segmentation-3-0\n  (py38) fangjuns-MacBook-Pro:rawfile fangjun$ tree .\n  .\n  ├── 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n  └── sherpa-onnx-pyannote-segmentation-3-0\n      └── model.int8.onnx\n\n  1 directory, 2 files\n\n  (Note that we have kept only model.int8.onnx and removed all other files\n  from sherpa-onnx-pyannote-segmentation-3-0\n  )\n   */\n}\n\n/**\n * Defines the event handler to be called when the worker thread receives a message sent by the host thread.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessage = (e: MessageEvents) => {\n  const msgType = e.data['msgType'] as string;\n\n  console.log(`from the main thread, msg-type: ${msgType}`);\n  if (msgType == 'init-speaker-diarization' && !sd) {\n    const context: Context = e.data['context'] as Context;\n    sd = initOfflineSpeakerDiarization(context);\n    workerPort.postMessage({ msgType: 'init-speaker-diarization-done' });\n    console.log('Init sd done');\n  }\n\n  if (msgType == 'speaker-diarization-file') {\n    const filename = e.data['filename'] as string;\n    const numSpeakers = e.data['numSpeakers'] as number;\n    const wave = readWave(filename);\n    let result = '';\n    if (wave == undefined || wave == null) {\n      result = `Failed to read ${filename}`;\n\n      workerPort.postMessage({\n        msgType: 
'speaker-diarization-file-done', result\n      });\n      return;\n    }\n\n    if (wave.sampleRate != sd.sampleRate) {\n      result = `Expected sample rate: ${sd.sampleRate}`;\n      result += '\\n';\n      result += `Sample rate in file ${filename} is ${wave.sampleRate}`;\n\n      workerPort.postMessage({\n        msgType: 'speaker-diarization-file-done', result\n      });\n\n      return;\n    }\n\n    const duration = wave.samples.length / wave.sampleRate;\n    console.log(`Processing ${filename} of ${duration} seconds`);\n\n    // You can remove this if statement if you want\n    if (duration < 0.3) {\n      result = `${filename} has only ${duration} seconds. Please use a longer file`;\n\n      workerPort.postMessage({\n        msgType: 'speaker-diarization-file-done', result\n      });\n      return;\n    }\n    sd.config.clustering.numClusters = numSpeakers;\n    sd.setConfig(sd.config);\n\n    if (useAsync) {\n      sd.processAsync(wave.samples, (numProcessedChunks: number, numTotalChunks: number) => {\n        const progress = numProcessedChunks / numTotalChunks * 100;\n        workerPort.postMessage({\n          msgType: 'speaker-diarization-file-progress', progress\n        });\n      }).then((r: OfflineSpeakerDiarizationSegment[]) => {\n        console.log(`r is ${r.length}, ${r}`);\n\n        for (const s of r) {\n          const start: string = s.start.toFixed(3);\n          const end: string = s.end.toFixed(3);\n          result += `${start}\\t--\\t${end}\\tspeaker_${s.speaker}\\n`;\n          console.log(`result: ${result}`);\n        }\n\n        if (r.length == 0) {\n          result = 'The result is empty';\n        }\n\n        workerPort.postMessage({\n          msgType: 'speaker-diarization-file-done', result\n        });\n      });\n    } else {\n      const r: OfflineSpeakerDiarizationSegment[] = sd.process(wave.samples)\n      console.log(`r is ${r.length}, ${r}`);\n      for (const s of r) {\n        const start: string = 
s.start.toFixed(3);\n        const end: string = s.end.toFixed(3);\n        result += `${start}\\t--\\t${end}\\tspeaker_${s.speaker}\\n`;\n        console.log(`result: ${result}`);\n      }\n\n      if (r.length == 0) {\n        result = 'The result is empty';\n      }\n\n      workerPort.postMessage({\n        msgType: 'speaker-diarization-file-done', result\n      });\n    }\n  }\n}\n\n/**\n * Defines the event handler to be called when the worker receives a message that cannot be deserialized.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessageerror = (e: MessageEvents) => {\n}\n\n/**\n * Defines the event handler to be called when an exception occurs during worker execution.\n * The event handler is executed in the worker thread.\n *\n * @param e error message\n */\nworkerPort.onerror = (e: ErrorEvent) => {\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ]\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device speaker diarization with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device speaker diarization with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Speaker diarization\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device speaker diarization with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device speaker diarization with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Speaker diarization\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/rawfile/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"新一代Kaldi: 本地说话人日志\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"新一代Kaldi: 本地说话人日志\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"说话人日志\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerDiarization/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx.speaker.identification\",\n    \"vendor\": \"example\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxSpeakerIdentification\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"sourceOption\": {\n      \"workers\": [\n        './src/main/ets/workers/SpeakerIdentificationWorker.ets'\n      ]\n    }\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1y+qvabrznvcerrtte4uydjhwfdt7hfnlsk0jsnicmy=/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1y+qvabrznvcerrtte4uydjhwfdt7hfnlsk0jsnicmy=/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n    \"sherpa_onnx@sherpa_onnx_2.har\": \"sherpa_onnx@sherpa_onnx_2.har\"\n  },\n  \"packages\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1y+qvabrznvcerrtte4uydjhwfdt7hfnlsk0jsnicmy=/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": {\n      \"name\": \"libsherpa_onnx.so\",\n      \"version\": \"1.0.0\",\n      \"resolved\": \"../oh_modules/.ohpm/sherpa_onnx@1y+qvabrznvcerrtte4uydjhwfdt7hfnlsk0jsnicmy=/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n      \"registryType\": \"local\"\n    },\n    \"sherpa_onnx@sherpa_onnx_2.har\": {\n      \"name\": \"sherpa_onnx\",\n      \"version\": \"1.10.33\",\n      \"resolved\": \"sherpa_onnx_2.har\",\n      \"registryType\": \"local\",\n      \"dependencies\": {\n        \"libsherpa_onnx.so\": \"file:./src/main/cpp/types/libsherpa_onnx\"\n      }\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {\n    \"sherpa_onnx\": \"1.12.31\",\n  }\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has brought to foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has back to background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/ets/pages/Index.ets",
    "content": "import worker, { MessageEvents } from '@ohos.worker';\nimport { audio } from '@kit.AudioKit';\nimport { allAllowed, requestPermissions } from './Permission';\nimport { Permissions } from '@kit.AbilityKit';\nimport { picker } from '@kit.CoreFileKit';\nimport fs from '@ohos.file.fs';\n\n\n\nfunction flatten(samples: Float32Array[]): Float32Array {\n  let n = 0;\n  for (let i = 0; i < samples.length; ++i) {\n    n += samples[i].length;\n  }\n\n  const ans: Float32Array = new Float32Array(n);\n  let offset: number = 0;\n  for (let i = 0; i < samples.length; ++i) {\n    ans.set(samples[i], offset);\n    offset += samples[i].length;\n  }\n\n  return ans;\n}\n\nfunction savePcmToWav(filename: string, samples: Int16Array, sampleRate: number) {\n  const fp = fs.openSync(filename, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);\n\n  const header = new ArrayBuffer(44);\n  const view = new DataView(header);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true); // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true); // chunkSize //                   E V A W\n  view.setUint32(8, 0x45564157, true); // format // //                      t m f\n  view.setUint32(12, 0x20746d66, true); // subchunk1ID\n  view.setUint32(16, 16, true); // subchunk1Size, 16 for PCM\n  view.setUint32(20, 1, true); // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true); // numChannels: 1 channel\n  view.setUint32(24, sampleRate, true); // sampleRate\n  view.setUint32(28, sampleRate * 2, true); // byteRate\n  view.setUint16(32, 2, true); // blockAlign\n  view.setUint16(34, 16, true); // bitsPerSample\n  view.setUint32(36, 0x61746164, true); // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true); // subchunk2Size\n\n  fs.writeSync(fp.fd, new Uint8Array(header).buffer, { length: header.byteLength });\n  fs.writeSync(fp.fd, samples.buffer, { length: samples.buffer.byteLength });\n\n  
fs.closeSync(fp.fd);\n}\n\nfunction toInt16Samples(samples: Float32Array): Int16Array {\n  const int16Samples = new Int16Array(samples.length);\n  for (let i = 0; i < samples.length; ++i) {\n    let s = samples[i] * 32767;\n    s = s > 32767 ? 32767 : s;\n    s = s < -32768 ? -32768 : s;\n    int16Samples[i] = s;\n  }\n\n  return int16Samples;\n}\n\n@Entry\n@Component\nstruct Index {\n  @State title: string = 'Next-gen Kaldi: Speaker Identification';\n  @State titleFontSize: number = 18;\n  private controller: TabsController = new TabsController();\n\n  @State currentIndex: number = 0;\n\n  private threshold: string = '0.5';\n\n  private workerInstance?: worker.ThreadWorker\n  private readonly scriptURL: string = 'entry/ets/workers/SpeakerIdentificationWorker.ets'\n\n  @State allSpeakerNames: string[] = [];\n  private inputSpeakerName: string = '';\n\n  @State btnSaveAudioEnabled: boolean = false;\n  @State btnAddEnabled: boolean = false;\n\n  private sampleRate: number = 48000;\n  private sampleListForAdding: Float32Array[] = []\n  private sampleListForTesting: Float32Array[] = []\n  private mic?: audio.AudioCapturer;\n\n  @State infoHome: string = '';\n  @State infoAdd: string = '';\n\n  @State micBtnCaptionForAdding: string = 'Start recording';\n  @State micStartedForAdding: boolean = false;\n  @State micBtnEnabledForAdding: boolean = true;\n\n  @State micBtnCaptionForTesting: string = 'Start recording';\n  @State micStartedForTesting: boolean = false;\n  @State micBtnEnabledForTesting: boolean = true;\n\n  async initMic() {\n    const permissions: Permissions[] = [\"ohos.permission.MICROPHONE\"];\n    let allowed: boolean = await allAllowed(permissions);\n    if (!allowed) {\n      console.log(\"request to access the microphone\");\n      const status: boolean = await requestPermissions(permissions);\n\n      if (!status) {\n        console.error('access to microphone is denied')\n        this.infoHome = \"Failed to get microphone permission. 
Please retry\";\n        this.infoAdd = this.infoHome;\n        return;\n      }\n\n      allowed = await allAllowed(permissions);\n      if (!allowed) {\n        console.error('failed to get microphone permission');\n        this.infoHome = \"Failed to get microphone permission. Please retry\";\n        this.infoAdd = this.infoHome;\n        return;\n      }\n    } else {\n      console.log(\"allowed to access microphone\");\n    }\n\n    const audioStreamInfo: audio.AudioStreamInfo = {\n      samplingRate: this.sampleRate,\n      channels: audio.AudioChannel.CHANNEL_1,\n      sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,\n      encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW,\n    };\n\n    const audioCapturerInfo: audio.AudioCapturerInfo = {\n      source: audio.SourceType.SOURCE_TYPE_MIC, capturerFlags: 0\n    };\n\n    const audioCapturerOptions: audio.AudioCapturerOptions = {\n      streamInfo: audioStreamInfo, capturerInfo: audioCapturerInfo\n\n    };\n    audio.createAudioCapturer(audioCapturerOptions, (err, data) => {\n      if (err) {\n        console.error(`error code is ${err.code}, error message is ${err.message}`);\n        this.infoHome = 'Failed to init microphone';\n        this.infoAdd = this.infoHome;\n      } else {\n        console.info(`init mic successfully`);\n        this.mic = data;\n        this.mic.on('readData', this.micCallback);\n      }\n    });\n  }\n\n  async aboutToAppear() {\n    this.workerInstance = new worker.ThreadWorker(this.scriptURL, {\n      name: 'Speaker identification worker'\n    });\n\n    this.workerInstance.onmessage = (e: MessageEvents) => {\n      const msgType = e.data['msgType'] as string;\n      console.log(`received msg from worker: ${msgType}`);\n\n      if (msgType == 'manager-all-speaker-names') {\n        this.allSpeakerNames = e.data['allSpeakers'] as string[];\n      }\n\n      if (msgType == 'manager-add-speaker-done') {\n        const ok: boolean = e.data['ok'] as boolean;\n       
 const status: string = e.data['status'] as string;\n        this.infoAdd += '\\n' + status;\n\n        if (ok) {\n          this.sampleListForAdding = [];\n          this.btnSaveAudioEnabled = false;\n          this.btnAddEnabled = false;\n        }\n      }\n\n      if (msgType == 'manager-search-speaker-done') {\n        const name = e.data['name'] as string;\n        this.infoHome = name;\n      }\n    };\n\n    this.workerInstance.postMessage({ msgType: 'init-extractor', context: getContext()});\n\n    await this.initMic();\n  }\n\n  @Builder\n  TabBuilder(title: string, targetIndex: number, selectedImg: Resource, normalImg: Resource) {\n    Column() {\n      Image(this.currentIndex == targetIndex ? selectedImg : normalImg).size({ width: 25, height: 25 })\n      Text(title).fontColor(this.currentIndex == targetIndex ? '#28bff1' : '#8a8a8a')\n    }.width('100%').height(50).justifyContent(FlexAlign.Center).onClick(() => {\n      this.currentIndex = targetIndex;\n      this.controller.changeIndex(this.currentIndex);\n    })\n  }\n\n  build() {\n    Column() {\n      Tabs({ barPosition: BarPosition.End, controller: this.controller }) {\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            Row() {\n              Text('Similarity threshold').width('60%');\n\n              TextInput({ text: this.threshold }).onChange((text) => {\n                this.threshold = text.trim();\n              }).width('20%')\n            }\n            Row() {\n              Button(this.micBtnCaptionForTesting)\n                .enabled(this.micBtnEnabledForTesting)\n                .onClick(()=>{\n                  if (this.allSpeakerNames.length == 0) {\n                    this.infoHome = 'There are no speakers registered. 
Please add them first';\n                    return;\n                  }\n\n                  let threshold = parseFloat(this.threshold);\n                  if (isNaN(threshold)) {\n                    this.infoHome = 'Please enter a valid threshold';\n                    return;\n                  }\n\n                  if (threshold <= 0) {\n                    this.infoHome = 'Please enter a positive threshold';\n                    return;\n                  }\n                  console.log(`threshold: ${threshold}`);\n\n                  if (this.micStartedForTesting) {\n                    this.micStartedForTesting = false;\n                    this.micBtnCaptionForTesting = 'Start';\n                    this.micBtnEnabledForAdding = true;\n                    this.mic?.stop();\n\n                    const samples = flatten(this.sampleListForTesting);\n                    const duration = samples.length / this.sampleRate;\n                    if (duration < 0.5) {\n                      this.infoHome = `Please speak for a longer time! 
Current duration: ${duration}`;\n                      return;\n                    }\n                    if (this.workerInstance) {\n                      this.workerInstance.postMessage({\n                        msgType: 'manager-search-speaker',\n                        samples: samples,\n                        sampleRate: this.sampleRate,\n                        threshold,\n                      });\n                    }\n                  } else {\n                    this.sampleListForTesting = [];\n                    this.micStartedForTesting = true;\n                    this.micBtnCaptionForTesting = 'Stop';\n                    this.micBtnEnabledForAdding = false;\n                    this.mic?.start();\n                    this.infoHome = `Use threshold: ${threshold}`;\n                    this.infoHome += '\\nPlease speak and then click Stop';\n                  }\n                })\n\n              Button('Save audio')\n                .enabled(!this.micStartedForTesting)\n                .onClick(()=>{\n                  if (this.sampleListForTesting.length == 0) {\n                    this.infoHome = 'No audio samples recorded';\n                    return;\n                  }\n                  const samples = flatten(this.sampleListForTesting);\n\n                  if (samples.length == 0) {\n                    this.infoHome = 'Empty samples';\n                    return;\n                  }\n\n                  let uri: string = '';\n\n                  const audioOptions = new picker.AudioSaveOptions(); // audioOptions.newFileNames = ['o.wav'];\n\n                  const audioViewPicker = new picker.AudioViewPicker();\n\n                  audioViewPicker.save(audioOptions).then((audioSelectResult: Array<string>) => {\n                    uri = audioSelectResult[0];\n                    savePcmToWav(uri, toInt16Samples(samples), this.sampleRate);\n                    console.log(`Saved to ${uri}`);\n                    this.infoHome+= 
`\\nSaved to ${uri}`;\n                  });\n                })\n            }\n            TextArea({text: this.infoHome})\n              .height('100%')\n              .focusable(false)\n          }\n        }.tabBar(this.TabBuilder('Home', 0, $r('app.media.icon_home'), $r('app.media.icon_home')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n\n            if (this.allSpeakerNames.length == 0) {\n              Text('Please add speakers first')\n            } else {\n              List({ space: 10, initialIndex: 0 }) {\n                ForEach(this.allSpeakerNames, (item: string, index: number) => {\n                  ListItem() {\n                    Flex({ direction: FlexDirection.Row, alignItems: ItemAlign.Center }) {\n                      Text(item)\n                        .width('100%')\n                        .height(80)\n                        .fontSize(20)\n                        .textAlign(TextAlign.Center)\n                        .borderRadius(10)\n                        .flexShrink(1)\n\n                      Button('Delete')\n                      .width('30%')\n                        .height(40)\n                      .onClick(() => {\n                        if (index != undefined) {\n                          const name = this.allSpeakerNames[index];\n                          console.log(`Deleting speaker ${name}`);\n                          if (this.workerInstance) {\n                            this.workerInstance.postMessage({\n                              msgType: 'manager-delete-speaker',\n                              name: name\n                            });\n                          }\n                        }\n                      }).stateEffect(true)\n\n                      Text('')\n                        .width('15%')\n                        .height(80)\n                    }\n                  }\n                }, 
(item: string) => item)\n              }\n            }\n          }\n        }.tabBar(this.TabBuilder('View', 1, $r('app.media.icon_view'), $r('app.media.icon_view')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n\n            Row({space: 10}) {\n              Text('Speaker name')\n              TextInput({placeholder: 'Input speaker name'})\n                .onChange((value: string)=>{\n                  this.inputSpeakerName = value.trim();\n                });\n            }.width('100%')\n\n            Row({space: 10}) {\n              Button(this.micBtnCaptionForAdding)\n                .enabled(this.micBtnEnabledForAdding)\n                .onClick(()=> {\n                  if (this.mic) {\n                    if (this.micStartedForAdding) {\n                      this.micStartedForAdding = false;\n                      this.micBtnEnabledForTesting = true;\n                      this.micBtnCaptionForAdding = 'Start recording';\n                      this.mic.stop();\n                      this.infoAdd = '';\n                      if (this.sampleListForAdding.length > 0) {\n                        this.btnAddEnabled = true;\n                        this.btnSaveAudioEnabled = true;\n                      }\n                    } else {\n                      this.micStartedForAdding = true;\n                      this.micBtnEnabledForTesting = false;\n                      this.micBtnCaptionForAdding = 'Stop recording';\n                      this.sampleListForAdding = [];\n                      this.mic.start();\n                      this.infoAdd = '';\n\n                      this.btnAddEnabled = false;\n                      this.btnSaveAudioEnabled = false;\n                    }\n                  }\n                })\n\n              Button('Add')\n                .enabled(this.btnAddEnabled)\n                .onClick(()=>{\n                  if 
(this.inputSpeakerName.trim() == '') {\n                    this.infoAdd += '\\nPlease input a speaker name first';\n                    return;\n                  }\n\n                  const samples = flatten(this.sampleListForAdding);\n                  const duration = samples.length / this.sampleRate;\n                  if (duration < 0.5) {\n                    this.infoAdd = `Please speak for a longer time. Current duration: ${duration}`;\n                    return;\n                  }\n                  if (this.workerInstance) {\n                    this.workerInstance.postMessage({\n                      msgType: 'manager-add-speaker',\n                      name: this.inputSpeakerName,\n                      samples: samples,\n                      sampleRate: this.sampleRate,\n                    })\n                  }\n                })\n\n              Button('Save audio')\n                .enabled(this.btnSaveAudioEnabled)\n                .onClick(()=>{\n                  if (this.sampleListForAdding.length == 0) {\n                    this.btnSaveAudioEnabled = false;\n                    return;\n                  }\n\n                  const samples = flatten(this.sampleListForAdding);\n\n                  if (samples.length == 0) {\n                    this.btnSaveAudioEnabled = false;\n                    return;\n                  }\n\n                  let uri: string = '';\n\n\n                  const audioOptions = new picker.AudioSaveOptions(); // audioOptions.newFileNames = ['o.wav'];\n\n                  const audioViewPicker = new picker.AudioViewPicker();\n\n                  audioViewPicker.save(audioOptions).then((audioSelectResult: Array<string>) => {\n                    uri = audioSelectResult[0];\n                    savePcmToWav(uri, toInt16Samples(samples), this.sampleRate);\n                    console.log(`Saved to ${uri}`);\n                    this.infoAdd += `\\nSaved to ${uri}`;\n                  });\n                
})\n            }\n            TextArea({text: this.infoAdd})\n              .height('100%')\n              .width('100%')\n              .focusable(false)\n          }\n        }.tabBar(this.TabBuilder('Add', 2, $r('app.media.icon_add'), $r('app.media.icon_add')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            TextArea({\n              text: `\nEverything is open-sourced.\n\nIt runs locally, without accessing the network\n\nSee also https://github.com/k2-fsa/sherpa-onnx\n\n新一代 Kaldi QQ 和微信交流群: 请看\n\nhttps://k2-fsa.github.io/sherpa/social-groups.html\n\n微信公众号: 新一代 Kaldi\n            `\n            }).width('100%').height('100%').focusable(false)\n          }\n        }.tabBar(this.TabBuilder('Help', 3, $r('app.media.icon_info'), $r('app.media.icon_info')))\n\n      }.scrollable(false)\n    }.width('100%')\n  }\n\n  private micCallback = (buffer: ArrayBuffer) => {\n    const view: Int16Array = new Int16Array(buffer);\n\n    const samplesFloat: Float32Array = new Float32Array(view.length);\n    for (let i = 0; i < view.length; ++i) {\n      samplesFloat[i] = view[i] / 32768.0;\n    }\n\n    if (this.micStartedForAdding) {\n      this.sampleListForAdding.push(samplesFloat);\n    }\n\n    if (this.micStartedForTesting) {\n      this.sampleListForTesting.push(samplesFloat);\n    }\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/ets/pages/Permission.ets",
    "content": "// This file is modified from\n// https://gitee.com/ukSir/hmchat2/blob/master/entry/src/main/ets/utils/permissionMananger.ets\nimport { abilityAccessCtrl, bundleManager, common, Permissions } from '@kit.AbilityKit';\n\nexport function allAllowed(permissions: Permissions[]): boolean {\n  if (permissions.length == 0) {\n    return false;\n  }\n\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n\n  const bundleInfo = bundleManager.getBundleInfoForSelfSync(bundleManager.BundleFlag.GET_BUNDLE_INFO_WITH_APPLICATION);\n\n  let tokenID: number = bundleInfo.appInfo.accessTokenId;\n\n  return permissions.every(permission => abilityAccessCtrl.GrantStatus.PERMISSION_GRANTED ==\n  mgr.checkAccessTokenSync(tokenID, permission));\n}\n\nexport async function requestPermissions(permissions: Permissions[]): Promise<boolean> {\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n  const context: Context = getContext() as common.UIAbilityContext;\n\n  const result = await mgr.requestPermissionsFromUser(context, permissions);\n  return result.authResults.length > 0 && result.authResults.every(authResults => authResults == 0);\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/ets/workers/SpeakerIdentificationWorker.ets",
    "content": "import worker, { ErrorEvent, MessageEvents, ThreadWorkerGlobalScope } from '@ohos.worker';\nimport {\n  OnlineStream,\n  readWaveFromBinary,\n  Samples,\n  SpeakerEmbeddingExtractor,\n  SpeakerEmbeddingExtractorConfig,\n  SpeakerEmbeddingManager\n} from 'sherpa_onnx';\n\nconst workerPort: ThreadWorkerGlobalScope = worker.workerPort;\n\nlet extractor: SpeakerEmbeddingExtractor;\nlet manager: SpeakerEmbeddingManager;\n\nfunction readWaveFromRawfile(filename: string, context: Context): Samples {\n  const data: Uint8Array = context.resourceManager.getRawFileContentSync(filename);\n  return readWaveFromBinary(data) as Samples;\n}\n\nfunction initExtractor(context: Context): SpeakerEmbeddingExtractor {\n  const config: SpeakerEmbeddingExtractorConfig = new SpeakerEmbeddingExtractorConfig();\n\n  // Please put the model file inside the directory\n  // harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/rawfile\n/*\n(py38) fangjuns-MacBook-Pro:rawfile fangjun$ pwd\n/Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/rawfile\n(py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls -lh\ntotal 77336\n-rw-r--r--  1 fangjun  staff    38M Dec  9 19:34 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n */\n  // You can find more models at\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n  config.model = '3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx';\n  config.numThreads = 2;\n  config.debug = true;\n\n  return new SpeakerEmbeddingExtractor(config, context.resourceManager);\n}\n\nfunction extractEmbedding(samples: Samples): Float32Array {\n  const stream: OnlineStream = extractor.createStream();\n  stream.acceptWaveform(samples);\n  return extractor.compute(stream);\n}\n\n/**\n * Defines the event handler to be called when the worker thread receives a message sent by the host thread.\n * The event handler is executed in the worker thread.\n 
*\n * @param e message data\n */\nworkerPort.onmessage = (e: MessageEvents) => {\n  const msgType = e.data['msgType'] as string;\n\n  console.log(`from the main thread, msg-type: ${msgType}`);\n\n  if (msgType == 'init-extractor' && !extractor) {\n    const context: Context = e.data['context'] as Context;\n    extractor = initExtractor(context);\n    manager = new SpeakerEmbeddingManager(extractor.dim);\n\n    workerPort.postMessage({\n      msgType: 'manager-all-speaker-names', allSpeakers: manager.getAllSpeakerNames(),\n    });\n  }\n\n  if (msgType == 'manager-delete-speaker') {\n    const name = e.data['name'] as string;\n    const ok: boolean = manager.remove(name);\n    if (ok) {\n      console.log(`Removed ${name}.`);\n\n      console.log(`Number of speakers: ${manager.getNumSpeakers()}`);\n      console.log(`Number of speakers2: ${manager.getAllSpeakerNames().length}`);\n      console.log(JSON.stringify(manager.getAllSpeakerNames()));\n      workerPort.postMessage({\n        msgType: 'manager-all-speaker-names', allSpeakers: manager.getAllSpeakerNames(),\n      });\n    }\n  }\n\n  if (msgType == 'manager-add-speaker') {\n    const name = e.data['name'] as string;\n    const samples = e.data['samples'] as Float32Array;\n    const sampleRate = e.data['sampleRate'] as number;\n\n    const v = extractEmbedding({ samples, sampleRate });\n    const ok: boolean = manager.add({ name, v });\n    if (ok) {\n      workerPort.postMessage({\n        msgType: 'manager-add-speaker-done',\n        status: `Added ${name}`,\n        ok,\n      });\n      workerPort.postMessage({\n        msgType: 'manager-all-speaker-names', allSpeakers: manager.getAllSpeakerNames(),\n      }\n      );\n    } else {\n      workerPort.postMessage({\n        msgType: 'manager-add-speaker-done',\n        status: `Failed to add ${name}. Possibly due to existing speaker name. Please recheck`,\n        ok,\n      });\n    }\n  }\n\n  if (msgType == 'manager-search-speaker') {\n    const threshold = e.data['threshold'] as number;\n    const samples = e.data['samples'] as Float32Array;\n    const sampleRate = e.data['sampleRate'] as number;\n\n    const v = extractEmbedding({ samples, sampleRate });\n    let name: string = manager.search({ threshold, v });\n    if (name == '' || name == undefined) {\n      name = \"===<Unknown>===\";\n    }\n    workerPort.postMessage({\n      msgType: 'manager-search-speaker-done',\n      name\n    });\n  }\n}\n\n/**\n * Defines the event handler to be called when the worker receives a message that cannot be deserialized.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessageerror = (e: MessageEvents) => {\n}\n\n/**\n * Defines the event handler to be called when an exception occurs during worker execution.\n * The event handler is executed in the worker thread.\n *\n * @param e error message\n */\nworkerPort.onerror = (e: ErrorEvent) => {\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ],\n    \"requestPermissions\": [\n      {\n        \"name\": \"ohos.permission.MICROPHONE\",\n        \"reason\": \"$string:mic_reason\",\n        \"usedScene\": {\n          \"abilities\": [\n            \"EntryAbility\",\n          ],\n          \"when\": \"inuse\",\n        }\n      }\n    ]\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device speaker identification with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device speaker identification with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Speaker identification\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device speaker identification with Next-gen Kaldi\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device speaker identification with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device speaker identification with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Speaker identification\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device speaker identification with Next-gen Kaldi\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/rawfile/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"新一代Kaldi: 本地说话人识别\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"新一代Kaldi: 本地说话人识别\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"说话人识别\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"使用新一代Kaldi, 访问麦克风进行本地说话人识别 (不需要联网)\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxSpeakerIdentification/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx.streaming.asr\",\n    \"vendor\": \"example\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxStreamingAsr\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"sourceOption\": {\n      \"workers\": [\n        './src/main/ets/workers/StreamingAsrWorker.ets'\n      ]\n    }\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.33/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.33/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n    \"sherpa_onnx@1.10.33\": \"sherpa_onnx@1.10.33\"\n  },\n  \"packages\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.33/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": {\n      \"name\": \"libsherpa_onnx.so\",\n      \"version\": \"1.0.0\",\n      \"resolved\": \"../oh_modules/.ohpm/sherpa_onnx@1.10.33/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n      \"registryType\": \"local\"\n    },\n    \"sherpa_onnx@1.10.33\": {\n      \"name\": \"sherpa_onnx\",\n      \"version\": \"1.10.33\",\n      \"integrity\": \"sha512-cmZ8zwOMx4qmDvOjF1/PL6/suBgReanSf5XdQTuMWWZ6qN74rynODHrt4C+Qz754MTXg0q/phAKeVjGA4rHHSA==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/sherpa_onnx/-/sherpa_onnx-1.10.33.har\",\n      \"registryType\": \"ohpm\",\n      \"dependencies\": {\n        \"libsherpa_onnx.so\": \"file:./src/main/cpp/types/libsherpa_onnx\"\n      }\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {\n    \"sherpa_onnx\": \"1.12.31\",\n  }\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has brought to foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has back to background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/ets/pages/Index.ets",
    "content": "import { LengthUnit } from '@kit.ArkUI';\nimport worker, { MessageEvents } from '@ohos.worker';\nimport { BusinessError } from '@kit.BasicServicesKit';\nimport { picker } from '@kit.CoreFileKit';\nimport systemTime from '@ohos.systemTime';\nimport { Permissions } from '@kit.AbilityKit';\nimport { allAllowed, requestPermissions } from './Permission';\nimport { audio } from '@kit.AudioKit';\nimport fs from '@ohos.file.fs';\n\n\nfunction savePcmToWav(filename: string, samples: Int16Array, sampleRate: number) {\n  const fp = fs.openSync(filename, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);\n\n  const header = new ArrayBuffer(44);\n  const view = new DataView(header);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true); // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true); // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true); // format\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true); // subchunk1ID\n  view.setUint32(16, 16, true); // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true); // audioFormat, 1 for PCM (a 2-byte field)\n  view.setUint16(22, 1, true); // numChannels: 1 channel\n  view.setUint32(24, sampleRate, true); // sampleRate\n  view.setUint32(28, sampleRate * 2, true); // byteRate\n  view.setUint16(32, 2, true); // blockAlign\n  view.setUint16(34, 16, true); // bitsPerSample\n  view.setUint32(36, 0x61746164, true); // subchunk2ID\n  view.setUint32(40, samples.length * 2, true); // subchunk2Size\n\n  fs.writeSync(fp.fd, new Uint8Array(header).buffer, { length: header.byteLength });\n  fs.writeSync(fp.fd, samples.buffer, { length: samples.buffer.byteLength });\n\n  fs.closeSync(fp.fd);\n}\n\nfunction toInt16Samples(samples: Float32Array): Int16Array {\n  const int16Samples = new Int16Array(samples.length);\n  for (let i = 0; i < samples.length; ++i) {\n    let s = samples[i] * 32767;\n    s = s > 32767 ? 
32767 : s;\n    s = s < -32768 ? -32768 : s;\n    int16Samples[i] = s;\n  }\n\n  return int16Samples;\n}\n\n\n@Entry\n@Component\nstruct Index {\n  @State title: string = 'Next-gen Kaldi: Real-time speech recognition';\n  @State titleFontSize: number = 15;\n  @State currentIndex: number = 0;\n  @State lang: string = 'English';\n  @State resultForFile: string = ''\n  @State resultForMic: string = ''\n  @State selectFileBtnEnabled: boolean = false;\n  @State micBtnCaption: string = 'Start';\n  @State micStarted: boolean = false;\n  @State micAllowed: boolean = false;\n  @State micBtnEnabled: boolean = false;\n  @State micSaveBtnCaption: string = 'Save recorded audio';\n  @State micSaveBtnEnabled: boolean = false;\n  @State info: string = '';\n  @State micInfo: string = '';\n  @State micInitDone: boolean = false;\n  private resultListForMic: string[] = [];\n  private controller: TabsController = new TabsController();\n  private workerInstance?: worker.ThreadWorker\n  private readonly scriptURL: string = 'entry/ets/workers/StreamingAsrWorker.ets'\n  private startTime: number = 0;\n  private stopTime: number = 0;\n  private sampleRate: number = 48000;\n  private sampleList: Float32Array[] = []\n  private mic?: audio.AudioCapturer;\n\n  flatten(samples: Float32Array[]): Float32Array {\n    let n = 0;\n    for (let i = 0; i < samples.length; ++i) {\n      n += samples[i].length;\n    }\n\n    const ans: Float32Array = new Float32Array(n);\n    let offset: number = 0;\n    for (let i = 0; i < samples.length; ++i) {\n      ans.set(samples[i], offset);\n      offset += samples[i].length;\n    }\n\n    return ans;\n  }\n\n  async initMic() {\n    const permissions: Permissions[] = [\"ohos.permission.MICROPHONE\"];\n    let allowed: boolean = await allAllowed(permissions);\n    if (!allowed) {\n      console.log(\"request to access the microphone\");\n      const status: boolean = await requestPermissions(permissions);\n\n      if (!status) {\n        console.error('access to 
microphone is denied')\n        this.resultForMic = \"Failed to get microphone permission. Please retry\";\n        return;\n      }\n\n      allowed = await allAllowed(permissions);\n      if (!allowed) {\n        console.error('failed to get microphone permission');\n        this.resultForMic = \"Failed to get microphone permission. Please retry\";\n        return;\n      }\n      this.micAllowed = true;\n    } else {\n      console.log(\"allowed to access microphone\");\n      this.micAllowed = true;\n    }\n\n    const audioStreamInfo: audio.AudioStreamInfo = {\n      samplingRate: this.sampleRate,\n      channels: audio.AudioChannel.CHANNEL_1,\n      sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,\n      encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW,\n    };\n\n    const audioCapturerInfo: audio.AudioCapturerInfo = {\n      source: audio.SourceType.SOURCE_TYPE_MIC, capturerFlags: 0\n    };\n\n    const audioCapturerOptions: audio.AudioCapturerOptions = {\n      streamInfo: audioStreamInfo, capturerInfo: audioCapturerInfo\n\n    };\n    audio.createAudioCapturer(audioCapturerOptions, (err, data) => {\n      if (err) {\n        console.error(`error code is ${err.code}, error message is ${err.message}`);\n        this.resultForMic = 'Failed to init microphone';\n      } else {\n        console.info(`init mic successfully`);\n        this.mic = data;\n        this.mic.on('readData', this.micCallback);\n      }\n    });\n  }\n\n  async aboutToAppear() {\n    this.workerInstance = new worker.ThreadWorker(this.scriptURL, {\n      name: 'Streaming ASR worker'\n    });\n\n    this.workerInstance.onmessage = (e: MessageEvents) => {\n      const msgType = e.data['msgType'] as string;\n      console.log(`received msg from worker: ${msgType}`);\n\n      if (msgType == 'init-streaming-asr-done') {\n        this.selectFileBtnEnabled = true;\n        this.micBtnEnabled = true;\n        this.info = `Initializing done.\\n\\nPlease select a wave file of 
16kHz in language ${this.lang}`;\n        this.micInfo = `Initializing done.\\n\\nPlease click Start and speak`;\n      }\n\n      if (msgType == 'streaming-asr-decode-file-done') {\n        const text = e.data['text'] as string;\n        this.resultForFile = text;\n        this.selectFileBtnEnabled = true;\n\n        systemTime.getRealTime((err, data) => {\n          if (err) {\n            console.log('Failed to get stop time');\n          } else {\n            this.stopTime = data;\n\n            const audioDuration = e.data['duration'] as number;\n            const elapsedSeconds = (this.stopTime - this.startTime) / 1000;\n            const RTF = elapsedSeconds / audioDuration;\n            this.info = `Audio duration: ${audioDuration.toFixed(2)} s\nElapsed: ${elapsedSeconds.toFixed(2)} s\nRTF = ${elapsedSeconds.toFixed(2)}/${audioDuration.toFixed(2)} = ${RTF.toFixed(3)}\n`;\n          }\n        });\n      }\n\n      if (msgType == 'streaming-asr-decode-mic-result') {\n        const text = e.data['text'] as string;\n        if (text.trim() == '') {\n          return;\n        }\n\n        const isEndpoint = e.data['isEndpoint'] as boolean;\n\n        let s = '';\n        let i = 0;\n        for (; i < this.resultListForMic.length; ++i) {\n          s += `${i}: ${this.resultListForMic[i]}\\n`\n        }\n\n        s += `${i}: ${text}`;\n        this.resultForMic = s;\n\n        if (isEndpoint) {\n          this.resultListForMic.push(text);\n        }\n      }\n    };\n\n    const context = getContext();\n    this.workerInstance.postMessage({ msgType: 'init-streaming-asr', context });\n    this.info = 'Initializing ASR model.\\nPlease wait';\n    this.micInfo = 'Initializing ASR model.\\nPlease wait';\n\n    await this.initMic();\n  }\n\n  @Builder\n  TabBuilder(title: string, targetIndex: number, selectedImg: Resource, normalImg: Resource) {\n    Column() {\n      Image(this.currentIndex == targetIndex ? 
selectedImg : normalImg).size({ width: 25, height: 25 })\n      Text(title).fontColor(this.currentIndex == targetIndex ? '#28bff1' : '#8a8a8a')\n    }.width('100%').height(50).justifyContent(FlexAlign.Center).onClick(() => {\n      this.currentIndex = targetIndex;\n      this.controller.changeIndex(this.currentIndex);\n    })\n  }\n\n  build() {\n    Column() {\n      Tabs({ barPosition: BarPosition.End, controller: this.controller }) {\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            Button('Select .wav file (16kHz) ')\n              .enabled(this.selectFileBtnEnabled)\n              .fontSize(13)\n              .width(296)\n              .height(60)\n              .onClick(() => {\n                this.resultForFile = '';\n                this.info = '';\n                this.selectFileBtnEnabled = false;\n\n                const documentSelectOptions = new picker.DocumentSelectOptions();\n                documentSelectOptions.maxSelectNumber = 1;\n                documentSelectOptions.fileSuffixFilters = ['.wav'];\n                const documentViewPicker = new picker.DocumentViewPicker();\n\n                documentViewPicker.select(documentSelectOptions).then((result: Array<string>) => {\n                  console.log(`select file result: ${result}`);\n\n                  if (!result[0]) {\n                    this.resultForFile = 'Please select a file to decode';\n                    this.selectFileBtnEnabled = true;\n                    return;\n                  }\n\n                  if (this.workerInstance) {\n                    systemTime.getRealTime((err, data) => {\n                      if (err) {\n                        console.log('Failed to get start time');\n                      } else {\n                        this.startTime = data;\n                      }\n                    });\n\n                    
this.workerInstance.postMessage({\n                      msgType: 'streaming-asr-decode-file', filename: result[0],\n                    });\n                    this.info = `Decoding ${result[0]} ... ...`;\n                  } else {\n                    console.log(`this worker instance is undefined ${this.workerInstance}`);\n                  }\n\n                }).catch((err: BusinessError) => {\n                  console.error(`Failed to select file, code is ${err.code}, message is ${err.message}`);\n                  this.selectFileBtnEnabled = true;\n                })\n              })\n\n            Text(`Supported languages: ${this.lang}`);\n            if (this.info != '') {\n              TextArea({ text: this.info }).focusable(false);\n            }\n            TextArea({ text: this.resultForFile })\n              .width('100%')\n              .lineSpacing({ value: 10, unit: LengthUnit.VP })\n              .height('100%');\n          }\n        }.tabBar(this.TabBuilder('From file', 0, $r('app.media.icon_doc'), $r('app.media.icon_doc')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            Button(this.micBtnCaption)\n              .enabled(this.micBtnEnabled)\n              .fontSize(13)\n              .width(296)\n              .height(60)\n              .onClick(() => {\n                this.micInfo = '';\n                if (this.mic) {\n                  if (this.micStarted) {\n                    this.micStarted = false;\n                    this.micBtnCaption = 'Start';\n                    this.mic.stop();\n                    this.micSaveBtnEnabled = true;\n\n                    if (this.workerInstance) {\n                      this.workerInstance.postMessage({\n                        msgType: 'streaming-asr-decode-mic-stop'\n                      });\n                    }\n                  } else {\n                    
this.micStarted = true;\n                    this.micSaveBtnEnabled = false;\n                    this.micBtnCaption = 'Stop';\n                    this.resultForMic = '';\n                    this.resultListForMic = [];\n\n                    if (this.workerInstance) {\n                      this.workerInstance.postMessage({\n                        msgType: 'streaming-asr-decode-mic-start'\n                      });\n                    }\n\n                    this.sampleList = [];\n                    this.mic.start();\n                  }\n                }\n              });\n            Button(this.micSaveBtnCaption)\n              .enabled(this.micSaveBtnEnabled)\n              .fontSize(13)\n              .width(296)\n              .height(60)\n              .onClick(() => {\n                if (this.sampleList.length == 0) {\n                  this.micSaveBtnEnabled = false;\n                  return;\n                }\n\n                const samples = this.flatten(this.sampleList);\n\n                if (samples.length == 0) {\n                  this.micSaveBtnEnabled = false;\n                  return;\n                }\n\n\n                let uri: string = '';\n\n\n                const audioOptions = new picker.AudioSaveOptions(); // audioOptions.newFileNames = ['o.wav'];\n\n                const audioViewPicker = new picker.AudioViewPicker();\n\n                audioViewPicker.save(audioOptions).then((audioSelectResult: Array<string>) => {\n                  uri = audioSelectResult[0];\n                  savePcmToWav(uri, toInt16Samples(samples), this.sampleRate);\n                  console.log(`Saved to ${uri}`);\n                  this.micInfo += `\\nSaved to ${uri}`;\n                });\n\n              })\n\n\n            Text(`Supported languages: ${this.lang}`)\n\n            if (this.micInfo != '') {\n              TextArea({ text: this.micInfo })\n                .focusable(false);\n            }\n\n            TextArea({ text: 
this.resultForMic })\n              .width('100%')\n              .lineSpacing({ value: 10, unit: LengthUnit.VP })\n              .height('100%');\n          }\n        }.tabBar(this.TabBuilder('From mic', 1, $r('app.media.icon_mic'), $r('app.media.icon_mic')))\n\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(this.titleFontSize).fontWeight(FontWeight.Bold);\n            TextArea({\n              text: `\nEverything is open-sourced.\n\nIt runs locally, without accessing the network\n\nSee also https://github.com/k2-fsa/sherpa-onnx\n\n新一代 Kaldi QQ 和微信交流群: 请看\n\nhttps://k2-fsa.github.io/sherpa/social-groups.html\n\n微信公众号: 新一代 Kaldi\n            `\n            }).width('100%').height('100%').focusable(false)\n          }.justifyContent(FlexAlign.Start)\n        }.tabBar(this.TabBuilder('Help', 2, $r('app.media.info'), $r('app.media.info')))\n      }.scrollable(false)\n    }.width('100%')\n  }\n\n  private micCallback = (buffer: ArrayBuffer) => {\n    const view: Int16Array = new Int16Array(buffer);\n\n    const samplesFloat: Float32Array = new Float32Array(view.length);\n    for (let i = 0; i < view.length; ++i) {\n      samplesFloat[i] = view[i] / 32768.0;\n    }\n\n    this.sampleList.push(samplesFloat);\n\n    if (this.workerInstance) {\n      this.workerInstance.postMessage({\n        msgType: 'streaming-asr-decode-mic-samples',\n        samples: samplesFloat,\n        sampleRate: this.sampleRate,\n      })\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/ets/pages/Permission.ets",
    "content": "// This file is modified from\n// https://gitee.com/ukSir/hmchat2/blob/master/entry/src/main/ets/utils/permissionMananger.ets\nimport { abilityAccessCtrl, bundleManager, common, Permissions } from '@kit.AbilityKit';\n\nexport function allAllowed(permissions: Permissions[]): boolean {\n  if (permissions.length == 0) {\n    return false;\n  }\n\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n\n  const bundleInfo = bundleManager.getBundleInfoForSelfSync(bundleManager.BundleFlag.GET_BUNDLE_INFO_WITH_APPLICATION);\n\n  let tokenID: number = bundleInfo.appInfo.accessTokenId;\n\n  return permissions.every(permission => abilityAccessCtrl.GrantStatus.PERMISSION_GRANTED ==\n  mgr.checkAccessTokenSync(tokenID, permission));\n}\n\nexport async function requestPermissions(permissions: Permissions[]): Promise<boolean> {\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n  const context: Context = getContext() as common.UIAbilityContext;\n\n  const result = await mgr.requestPermissionsFromUser(context, permissions);\n  return result.authResults.length > 0 && result.authResults.every(authResults => authResults == 0);\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/ets/workers/StreamingAsrWorker.ets",
    "content": "import worker, { ErrorEvent, MessageEvents, ThreadWorkerGlobalScope } from '@ohos.worker';\nimport {\n  OnlineModelConfig,\n  OnlineRecognizer,\n  OnlineRecognizerConfig,\n  OnlineStream,\n  readWaveFromBinary,\n  Samples\n} from 'sherpa_onnx';\nimport { fileIo } from '@kit.CoreFileKit';\n\nconst workerPort: ThreadWorkerGlobalScope = worker.workerPort;\n\n\nlet recognizer: OnlineRecognizer;\nlet micStream: OnlineStream;\n\nfunction getModelConfig(type: number): OnlineModelConfig {\n  const modelConfig = new OnlineModelConfig();\n  switch (type) {\n    case 0: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n\n    case 1: {\n      const modelDir = 'sherpa-onnx-lstm-zh-2023-02-20';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-11-avg-1.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-11-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-11-avg-1.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'lstm';\n      break;\n    }\n\n    case 2: {\n      const modelDir = 'sherpa-onnx-lstm-en-2023-02-17';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'lstm';\n      break;\n    }\n\n    case 3: {\n      const modelDir = 
'icefall-asr-zipformer-streaming-wenetspeech-20230615';\n      modelConfig.transducer.encoder = `${modelDir}/exp/encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx`;\n      modelConfig.tokens = `${modelDir}/data/lang_char/tokens.txt`;\n      modelConfig.modelType = 'zipformer2';\n      break;\n    }\n\n    case 4: {\n      const modelDir = 'icefall-asr-zipformer-streaming-wenetspeech-20230615';\n      modelConfig.transducer.encoder = `${modelDir}/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx`;\n      modelConfig.tokens = `${modelDir}/data/lang_char/tokens.txt`;\n      modelConfig.modelType = 'zipformer2';\n      break;\n    }\n\n    case 5: {\n      const modelDir = 'sherpa-onnx-streaming-paraformer-bilingual-zh-en';\n      modelConfig.paraformer.encoder = `${modelDir}/encoder.int8.onnx`;\n      modelConfig.paraformer.decoder = `${modelDir}/decoder.int8.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'paraformer';\n      break;\n    }\n\n    case 6: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-en-2023-06-26';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1-chunk-16-left-128.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1-chunk-16-left-128.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer2';\n      break;\n    }\n\n    case 7: {\n      const modelDir = 
'sherpa-onnx-streaming-zipformer-fr-2023-04-14';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-29-avg-9-with-averaged-model.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-29-avg-9-with-averaged-model.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-29-avg-9-with-averaged-model.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n\n    case 8: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n\n    case 9: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23'\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n\n    case 10: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-en-20M-2023-02-17';\n      modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n\n    case 14: {\n      const modelDir = 'sherpa-onnx-streaming-zipformer-korean-2024-06-16';\n     
 modelConfig.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      modelConfig.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      modelConfig.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      modelConfig.tokens = `${modelDir}/tokens.txt`;\n      modelConfig.modelType = 'zipformer';\n      break;\n    }\n    default: {\n      console.log(`Please specify a supported type. Given type ${type}`);\n    }\n  }\n  return modelConfig;\n}\n\nfunction initStreamingAsr(context: Context): OnlineRecognizer {\n  let type: number;\n\n  /*\n\nIf you use type = 8, then you should have the following directory structure in the rawfile directory\n\n(py38) fangjuns-MacBook-Pro:rawfile fangjun$ pwd\n/Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/rawfile\n(py38) fangjuns-MacBook-Pro:rawfile fangjun$ ls\nsherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n(py38) fangjuns-MacBook-Pro:rawfile fangjun$ tree .\n.\n└── sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n    ├── decoder-epoch-99-avg-1.onnx\n    ├── encoder-epoch-99-avg-1.int8.onnx\n    ├── joiner-epoch-99-avg-1.int8.onnx\n    └── tokens.txt\n\n1 directory, 4 files\n\nYou can download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nNote that please delete files that are not used. 
Otherwise, your app will be very large\ndue to containing unused large files.\n\n   */\n  type = 8;\n\n  const config: OnlineRecognizerConfig = new OnlineRecognizerConfig();\n  config.modelConfig = getModelConfig(type);\n  config.modelConfig.debug = true;\n  config.modelConfig.numThreads = 2;\n  config.enableEndpoint = true;\n\n  return new OnlineRecognizer(config, context.resourceManager);\n}\n\ninterface DecodeFileResult {\n  text: string;\n  duration: number;\n}\n\nfunction decodeFile(filename: string): DecodeFileResult {\n  const fp = fileIo.openSync(filename);\n  const stat = fileIo.statSync(fp.fd);\n  const arrayBuffer = new ArrayBuffer(stat.size);\n  fileIo.readSync(fp.fd, arrayBuffer);\n  // Release the file descriptor once the whole file is in memory\n  fileIo.closeSync(fp);\n  const data: Uint8Array = new Uint8Array(arrayBuffer);\n  const wave: Samples = readWaveFromBinary(data) as Samples;\n  console.log(`Sample rate: ${wave.sampleRate}`);\n\n  const stream = recognizer.createStream();\n  stream.acceptWaveform(wave);\n  const tailPadding = new Float32Array(0.5 * wave.sampleRate);\n  tailPadding.fill(0);\n\n  stream.acceptWaveform({ samples: tailPadding, sampleRate: wave.sampleRate });\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const audioDuration = wave.samples.length / wave.sampleRate;\n\n  return { text: recognizer.getResult(stream).text, duration: audioDuration };\n}\n\n/**\n * Defines the event handler to be called when the worker thread receives a message sent by the host thread.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessage = (e: MessageEvents) => {\n  const msgType = e.data['msgType'] as string;\n\n  if (msgType != 'streaming-asr-decode-mic-samples') {\n    console.log(`from the main thread, msg-type: ${msgType}`);\n  }\n\n  if (msgType == 'init-streaming-asr' && !recognizer) {\n    console.log('initializing streaming ASR...');\n    const context = e.data['context'] as Context;\n    recognizer = initStreamingAsr(context);\n    
console.log('streaming ASR is initialized. ');\n    workerPort.postMessage({ 'msgType': 'init-streaming-asr-done' });\n  }\n\n  if (msgType == 'streaming-asr-decode-file') {\n    const filename = e.data['filename'] as string;\n    console.log(`decoding ${filename}`);\n    const result = decodeFile(filename);\n    workerPort.postMessage({\n      'msgType': 'streaming-asr-decode-file-done', text: result.text, duration: result.duration\n    });\n  }\n\n  if (msgType == 'streaming-asr-decode-mic-start') {\n    micStream = recognizer.createStream();\n  }\n\n  if (msgType == 'streaming-asr-decode-mic-stop') { // nothing to do\n  }\n\n  if (msgType == 'streaming-asr-decode-mic-samples') {\n    const samples = e.data['samples'] as Float32Array;\n    const sampleRate = e.data['sampleRate'] as number;\n\n    micStream.acceptWaveform({ samples, sampleRate });\n    while (recognizer.isReady(micStream)) {\n      recognizer.decode(micStream);\n\n      let isEndpoint = false;\n      let text = recognizer.getResult(micStream).text;\n\n      if (recognizer.isEndpoint(micStream)) {\n        isEndpoint = true;\n        recognizer.reset(micStream);\n      }\n\n      if (text.trim() != '') {\n        workerPort.postMessage({\n          'msgType': 'streaming-asr-decode-mic-result', text: text, isEndpoint: isEndpoint,\n        });\n      }\n    }\n  }\n\n}\n\n/**\n * Defines the event handler to be called when the worker receives a message that cannot be deserialized.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessageerror = (e: MessageEvents) => {\n}\n\n/**\n * Defines the event handler to be called when an exception occurs during worker execution.\n * The event handler is executed in the worker thread.\n *\n * @param e error message\n */\nworkerPort.onerror = (e: ErrorEvent) => {\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ],\n    \"requestPermissions\": [\n      {\n        \"name\": \"ohos.permission.MICROPHONE\",\n        \"reason\": \"$string:mic_reason\",\n        \"usedScene\": {\n          \"abilities\": [\n            \"EntryAbility\",\n          ],\n          \"when\": \"inuse\",\n        }\n      }\n    ]\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device real-time speech recognition with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device real-time speech recognition with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Real-time ASR\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device real-time speech recognition with Next-gen Kaldi\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device real-time speech recognition with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device real-time speech recognition with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"Real-time ASR\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device real-time speech recognition with Next-gen Kaldi\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/rawfile/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"新一代Kaldi: 本地实时语音识别\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"新一代Kaldi: 本地实时语音识别\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"实时语音识别\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"使用新一代Kaldi, 访问麦克风进行本地实时语音识别 (不需要联网)\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxStreamingAsr/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx.tts\",\n    \"vendor\": \"next-gen Kaldi\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxTts\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/README.md",
    "content": "# Introduction\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/harmony-os/tts.html\nfor how to run code in this folder.\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"sourceOption\": {\n      \"workers\": [\n        \"./src/main/ets/workers/NonStreamingTtsWorker.ets\"\n      ]\n    }\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n    \"sherpa_onnx@1.10.32\": \"sherpa_onnx@1.10.32\"\n  },\n  \"packages\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": {\n      \"name\": \"libsherpa_onnx.so\",\n      \"version\": \"1.0.0\",\n      \"resolved\": \"../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n      \"registryType\": \"local\"\n    },\n    \"sherpa_onnx@1.10.32\": {\n      \"name\": \"sherpa_onnx\",\n      \"version\": \"1.10.32\",\n      \"integrity\": \"sha512-yHYmWoeqhrunOqGr9gxPJJH/8+rdwcKFOW6onYByVObQVpbqypslg301IjGm9xpnc5bJEkO3S9sra2zQTpPA/w==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/sherpa_onnx/-/sherpa_onnx-1.10.32.har\",\n      \"registryType\": \"ohpm\",\n      \"dependencies\": {\n        \"libsherpa_onnx.so\": \"file:./src/main/cpp/types/libsherpa_onnx\"\n      }\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {\n    \"sherpa_onnx\": \"1.12.31\",\n  }\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has brought to foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has back to background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/ets/pages/Index.ets",
    "content": "import { CircularBuffer } from 'sherpa_onnx';\nimport worker, { MessageEvents } from '@ohos.worker';\nimport { audio } from '@kit.AudioKit';\nimport picker from '@ohos.file.picker';\nimport fs from '@ohos.file.fs';\nimport systemTime from '@ohos.systemTime';\n\n\nfunction savePcmToWav(filename: string, samples: Int16Array, sampleRate: number) {\n  const fp = fs.openSync(filename, fs.OpenMode.READ_WRITE | fs.OpenMode.CREATE);\n\n  const header = new ArrayBuffer(44);\n  const view = new DataView(header);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true); // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true); // chunkSize //                   E V A W\n  view.setUint32(8, 0x45564157, true); // format // //                      t m f\n  view.setUint32(12, 0x20746d66, true); // subchunk1ID\n  view.setUint32(16, 16, true); // subchunk1Size, 16 for PCM\n  view.setUint32(20, 1, true); // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true); // numChannels: 1 channel\n  view.setUint32(24, sampleRate, true); // sampleRate\n  view.setUint32(28, sampleRate * 2, true); // byteRate\n  view.setUint16(32, 2, true); // blockAlign\n  view.setUint16(34, 16, true); // bitsPerSample\n  view.setUint32(36, 0x61746164, true); // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true); // subchunk2Size\n\n  fs.writeSync(fp.fd, new Uint8Array(header).buffer, { length: header.byteLength });\n  fs.writeSync(fp.fd, samples.buffer, { length: samples.buffer.byteLength });\n\n  fs.closeSync(fp.fd);\n}\n\nfunction toInt16Samples(samples: Float32Array): Int16Array {\n  const int16Samples = new Int16Array(samples.length);\n  for (let i = 0; i < samples.length; ++i) {\n    let s = samples[i] * 32767;\n    s = s > 32767 ? 32767 : s;\n    s = s < -32768 ? 
-32768 : s;\n    int16Samples[i] = s;\n  }\n\n  return int16Samples;\n}\n\n\n@Entry\n@Component\nstruct Index {\n  @State currentIndex: number = 0;\n  @State title: string = 'Next-gen Kaldi: Text-to-speech';\n  @State info: string = '';\n  @State btnStartCaption: string = 'Start';\n  @State btnStartEnabled: boolean = false;\n  @State btnStopCaption: string = 'Stop';\n  @State btnStopEnabled: boolean = false;\n  @State btnSaveCaption: string = 'Save';\n  @State btnSaveEnabled: boolean = false;\n  @State progress: number = 0;\n  @State sid: string = '0';\n  @State speechSpeed: string = '1.0';\n  @State isGenerating: boolean = false;\n  @State initTtsDone: boolean = false;\n  @State ttsGeneratedDone: boolean = true;\n  @State numSpeakers: number = 1;\n  @State numThreads: number = 1;\n  @State initAudioDone: boolean = false;\n  private controller: TabsController = new TabsController();\n  private cancelled: boolean = false;\n  private sampleRate: number = 0;\n  private startTime: number = 0;\n  private stopTime: number = 0;\n  private inputText: string = '';\n  // it specifies only the initial capacity.\n  private workerInstance?: worker.ThreadWorker\n  private readonly scriptURL: string = 'entry/ets/workers/NonStreamingTtsWorker.ets'\n  // note that circular buffer can automatically resize.\n  private sampleBuffer: CircularBuffer = new CircularBuffer(16000 * 5);\n  private finalSamples: Float32Array | null = null;\n  private audioRenderer: audio.AudioRenderer | null = null;\n\n  initAudioRenderer() {\n    if (this.audioRenderer) {\n      console.log(`Audio renderer has already been created. 
Skip creating`);\n      return;\n    }\n    // see\n    // https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/using-audiorenderer-for-playback-V5\n    console.log('Initializing audio renderer');\n    const audioStreamInfo: audio.AudioStreamInfo = {\n      samplingRate: this.sampleRate,\n      channels: audio.AudioChannel.CHANNEL_1, // number of channels (mono)\n      sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,\n      encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW\n    };\n\n    const audioRendererInfo: audio.AudioRendererInfo = {\n      usage: audio.StreamUsage.STREAM_USAGE_MUSIC, rendererFlags: 0\n    };\n\n    const audioRendererOptions: audio.AudioRendererOptions = {\n      streamInfo: audioStreamInfo, rendererInfo: audioRendererInfo\n    };\n\n    audio.createAudioRenderer(audioRendererOptions, (err, renderer) => {\n      if (!err) {\n        console.log('audio renderer initialized successfully');\n        this.initAudioDone = true;\n        if (renderer) {\n          this.audioRenderer = renderer;\n          this.audioRenderer.on(\"writeData\", this.audioPlayCallback);\n          if (this.sampleBuffer.size()) {\n            this.audioRenderer.start();\n          }\n        } else {\n          console.log(`returned audio renderer is ${renderer}`);\n        }\n      } else {\n        console.log(`Failed to initialize audio renderer. 
error message: ${err.message}, error code: ${err.code}`);\n      }\n    });\n  }\n\n  async aboutToAppear() {\n    this.initAudioRenderer();\n\n    this.workerInstance = new worker.ThreadWorker(this.scriptURL, {\n      name: 'NonStreaming TTS worker'\n    });\n    this.workerInstance.onmessage = (e: MessageEvents) => {\n      const msgType = e.data['msgType'] as string;\n      console.log(`received msg from worker: ${msgType}`);\n\n      if (msgType == 'init-tts-done') {\n        this.info = 'Model initialized!\\nPlease enter text and press start.';\n        this.sampleRate = e.data['sampleRate'] as number;\n        this.numSpeakers = e.data['numSpeakers'] as number;\n        this.numThreads = e.data['numThreads'] as number;\n\n        this.initTtsDone = true;\n      }\n\n      if (msgType == 'tts-generate-partial') {\n        if (this.cancelled) {\n          return;\n        }\n\n        const samples: Float32Array = e.data['samples'] as Float32Array;\n        const progress: number = e.data['progress'] as number;\n        this.progress = progress;\n\n        this.sampleBuffer.push(samples);\n\n        if (!this.initAudioDone) {\n          this.initAudioRenderer();\n        }\n\n        if (this.audioRenderer && this.audioRenderer?.state != audio.AudioState.STATE_RUNNING) {\n          this.audioRenderer.start();\n        }\n      }\n\n      if (msgType == 'tts-generate-done') {\n        this.isGenerating = false;\n        const samples: Float32Array = e.data['samples'] as Float32Array;\n\n        systemTime.getRealTime((err, data) => {\n\n          if (err) {\n            console.log(`Failed to get stop time`)\n          } else {\n            this.stopTime = data;\n\n            const audioDuration = samples.length / this.sampleRate;\n            const elapsedSeconds = (this.stopTime - this.startTime) / 1000;\n            const RTF = elapsedSeconds / audioDuration;\n\n            this.info = `Audio duration: ${audioDuration} s\nElapsed: ${elapsedSeconds} s\nRTF = 
${elapsedSeconds.toFixed(2)}/${audioDuration.toFixed(2)} = ${RTF.toFixed(3)}\nNumber of threads: ${this.numThreads}\n`;\n            if (this.cancelled) {\n              this.info += '\\nCancelled.';\n            }\n          }\n        });\n\n        this.finalSamples = samples;\n        this.ttsGeneratedDone = true;\n        this.btnSaveEnabled = true;\n\n        this.ttsGeneratedDone = true;\n\n        if (this.audioRenderer && this.audioRenderer?.state != audio.AudioState.STATE_RUNNING &&\n          this.sampleBuffer.size() == 0) {\n          this.sampleBuffer.push(samples);\n          this.progress = 1;\n          this.audioRenderer.start();\n        }\n\n        if (!this.initAudioDone) {\n          this.btnStartEnabled = true;\n          this.btnStopEnabled = false;\n          this.info += '\\nAudio renderer is not initialized. Disable playing audio.';\n        }\n      }\n    }\n\n    this.info = 'Initializing TTS model ...';\n    this.workerInstance.postMessage({ msgType: 'init-tts', context: getContext() });\n  }\n\n  @Builder\n  TabBuilder(title: string, targetIndex: number, selectedImg: Resource, normalImg: Resource) {\n    Column() {\n      Image(this.currentIndex == targetIndex ? selectedImg : normalImg).size({ width: 25, height: 25 })\n      Text(title).fontColor(this.currentIndex == targetIndex ? 
'#28bff1' : '#8a8a8a')\n    }.width('100%').height(50).justifyContent(FlexAlign.Center).onClick(() => {\n      this.currentIndex = targetIndex;\n      this.controller.changeIndex(this.currentIndex);\n    })\n  }\n\n  build() {\n    Column() {\n      Tabs({ barPosition: BarPosition.End, controller: this.controller }) {\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(20).fontWeight(FontWeight.Bold);\n            if (this.numSpeakers > 1) {\n              Row({ space: 10 }) {\n                Text(`Speaker ID (0-${this.numSpeakers - 1})`).width('60%')\n\n                TextInput({ text: this.sid }).onChange((text) => {\n                  this.sid = text.trim();\n                }).width('20%')\n              }.justifyContent(FlexAlign.Center)\n            }\n\n            Row() {\n              Text('Speech speed').width('60%');\n\n              TextInput({ text: this.speechSpeed }).onChange((text) => {\n                this.speechSpeed = text.trim();\n              }).width('20%')\n            }\n\n            Row({ space: 10 }) {\n              Button(this.btnStartCaption).enabled(this.btnStartEnabled).onClick(async () => {\n                let sid = parseInt(this.sid);\n                if (sid.toString() != this.sid) {\n                  this.info = 'Please input a valid speaker ID';\n                  return;\n                }\n\n                let speed = parseFloat(this.speechSpeed);\n                if (isNaN(speed)) {\n                  this.info = 'Please enter a valid speech speed';\n                  return;\n                }\n\n                if (speed <= 0) {\n                  this.info = 'Please enter a positive speech speed';\n                  return;\n                }\n\n                if (this.workerInstance && this.initTtsDone) {\n                  this.info = 'Generating...';\n                  this.cancelled = false;\n                  this.finalSamples = null;\n                  
this.sampleBuffer.reset();\n                  this.ttsGeneratedDone = false;\n                  this.progress = 0;\n\n                  this.btnStartEnabled = false;\n                  this.btnStopEnabled = true;\n                  this.btnSaveEnabled = false;\n                  console.log(`sending ${this.inputText}`)\n                  this.ttsGeneratedDone = false;\n                  this.startTime = await systemTime.getRealTime();\n                  this.workerInstance?.postMessage({\n                    msgType: 'tts-generate',\n                    text: this.inputText,\n                    sid: sid,\n                    speed: speed,\n                  });\n                  this.isGenerating = true;\n                  this.info = '';\n                } else {\n                  this.info = 'Failed to initialize tts model';\n                  this.btnStartEnabled = false;\n                }\n              });\n\n              Button(this.btnStopCaption).enabled(this.btnStopEnabled).onClick(() => {\n                this.ttsGeneratedDone = true;\n                this.btnStartEnabled = true;\n                this.btnStopEnabled = false;\n                this.sampleBuffer.reset();\n                this.cancelled = true;\n                this.isGenerating = false;\n\n                if (this.workerInstance && this.initTtsDone) {\n                  this.workerInstance.postMessage({ msgType: 'tts-generate-cancel' });\n                }\n                this.audioRenderer?.stop();\n              })\n\n              Button(this.btnSaveCaption).enabled(this.btnSaveEnabled).onClick(() => {\n                if (!this.finalSamples || this.finalSamples.length == 0) {\n\n                  this.btnSaveEnabled = false;\n                  return;\n                }\n\n                let uri: string = '';\n\n                const audioOptions = new picker.AudioSaveOptions(); // audioOptions.newFileNames = ['o.wav'];\n\n                const audioViewPicker = new 
picker.AudioViewPicker();\n\n                audioViewPicker.save(audioOptions).then((audioSelectResult: Array<string>) => {\n                  uri = audioSelectResult[0];\n                  if (this.finalSamples) {\n                    savePcmToWav(uri, toInt16Samples(this.finalSamples), this.sampleRate);\n                    console.log(`Saved to ${uri}`);\n                    this.info += `\\nSaved to ${uri}`;\n                  }\n                });\n              });\n            }\n\n            if (this.info != '') {\n              TextArea({ text: this.info }).focusable(false);\n            }\n            if (this.progress > 0) {\n              Row() {\n                Progress({ value: 0, total: 100, type: ProgressType.Capsule })\n                  .width('80%')\n                  .height(20)\n                  .value(this.progress * 100);\n\n                Text(`${(this.progress * 100).toFixed(2)}%`).width('15%')\n              }.width('100%').justifyContent(FlexAlign.Center)\n            }\n\n            TextArea({ placeholder: 'Input text for TTS and click the start button' })\n              .width('100%')\n              .height('100%')\n              .focusable(this.isGenerating == false && this.initTtsDone)\n              .onChange((text) => {\n                this.inputText = text;\n                if (text.trim() == '') {\n                  this.btnStartEnabled = false;\n                  return;\n                }\n                this.btnStartEnabled = true;\n              })\n          }.width('100%')\n\n          // see https://composeicons.com/\n        }.tabBar(this.TabBuilder('TTS', 0, $r('app.media.home'), $r('app.media.home')))\n\n        TabContent() {\n          Column({space: 10}) {\n            Text(this.title).fontSize(20).fontWeight(FontWeight.Bold);\n            TextArea({text: `\nEverything is open-sourced.\n\nIt runs locally, without accessing the network\n\nSee also https://github.com/k2-fsa/sherpa-onnx\n\n新一代 Kaldi QQ 和微信交流群: 
请看\n\nhttps://k2-fsa.github.io/sherpa/social-groups.html\n\n微信公众号: 新一代 Kaldi\n            `}).width('100%')\n              .height('100%')\n              .focusable(false)\n          }.justifyContent(FlexAlign.Start)\n        }.tabBar(this.TabBuilder('Help', 1, $r('app.media.info'), $r('app.media.info')))\n      }.scrollable(false)\n    }\n  }\n\n  private audioPlayCallback = (buffer: ArrayBuffer) => {\n    const numSamples = buffer.byteLength / 2;\n    if (this.sampleBuffer.size() >= numSamples) {\n      const samples: Float32Array = this.sampleBuffer.get(this.sampleBuffer.head(), numSamples);\n\n      const int16Samples = new Int16Array(buffer);\n      for (let i = 0; i < numSamples; ++i) {\n        let s = samples[i] * 32767;\n        s = s > 32767 ? 32767 : s;\n        s = s < -32768 ? -32768 : s;\n        int16Samples[i] = s;\n      }\n      this.sampleBuffer.pop(numSamples);\n    } else {\n      (new Int16Array(buffer)).fill(0);\n      if (this.ttsGeneratedDone) {\n        this.audioRenderer?.stop();\n        this.btnStartEnabled = true;\n        this.btnStopEnabled = false;\n      }\n    }\n  };\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/ets/workers/NonStreamingTtsWorker.ets",
    "content": "import worker, { ThreadWorkerGlobalScope, MessageEvents, ErrorEvent } from '@ohos.worker';\n\nimport { fileIo as fs } from '@kit.CoreFileKit';\n\nimport { OfflineTtsConfig, OfflineTts, listRawfileDir, TtsInput, TtsOutput } from 'sherpa_onnx';\nimport { buffer } from '@kit.ArkTS';\n\nconst workerPort: ThreadWorkerGlobalScope = worker.workerPort;\n\nlet tts: OfflineTts;\nlet cancelled = false;\n\nfunction mkdir(context: Context, parts: string[]) {\n  const path = parts.join('/');\n  if (fs.accessSync(path)) {\n    return;\n  }\n\n  const sandboxPath: string = context.getApplicationContext().filesDir;\n  let d = sandboxPath\n  for (const p of parts) {\n    d = d + '/' + p;\n\n    if (fs.accessSync(d)) {\n      continue;\n    }\n\n    fs.mkdirSync(d);\n  }\n}\n\nfunction copyRawFileDirToSandbox(context: Context, srcDir: string) {\n  let mgr = context.resourceManager;\n  const allFiles: string[] = listRawfileDir(mgr, srcDir);\n  for (const src of allFiles) {\n    const parts: string[] = src.split('/');\n    if (parts.length != 1) {\n      mkdir(context, parts.slice(0, -1));\n    }\n\n    copyRawFileToSandbox(context, src, src);\n  }\n}\n\nfunction copyRawFileToSandbox(context: Context, src: string,\n  dst: string) {\n  /* see\n   https://blog.csdn.net/weixin_44640245/article/details/142634846\n   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/rawfile-guidelines-V5\n   */\n  let uint8Array: Uint8Array = context.resourceManager.getRawFileContentSync(src);\n\n  // https://developer.huawei.com/consumer/cn/doc/harmonyos-references-V5/js-apis-file-fs-V5#fsmkdir\n  let sandboxPath: string = context.getApplicationContext().filesDir;\n  let filepath = sandboxPath + '/' + dst;\n\n  if (fs.accessSync(filepath)) {\n    /* if the destination exists and has the expected file size\n       then we skip copying it\n     */\n    let stat = fs.statSync(filepath);\n    if (stat.size == uint8Array.length) {\n      return;\n    }\n  }\n\n  const fp = 
fs.openSync(filepath, fs.OpenMode.WRITE_ONLY | fs.OpenMode.CREATE | fs.OpenMode.TRUNC);\n  fs.writeSync(fp.fd, buffer.from(uint8Array).buffer)\n  fs.closeSync(fp.fd);\n}\n\nfunction initTts(context: Context): OfflineTts {\n  /* Such a design is to make it easier to build flutter APPs with\n     github actions for a variety of tts models\n\n     See https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/flutter/generate-tts.py\n     for details\n   */\n\n  let modelDir = '';\n\n  // for VITS begin\n  let modelName = '';\n  // for VITS end\n\n  // for Matcha begin\n  let acousticModelName = '';\n  let vocoder = '';\n  // for Matcha end\n\n  // for Kokoro begin\n  let voices = '';\n  // for Kokoro end\n\n  let ruleFsts = '';\n  let ruleFars = '';\n  let lexicon = '';\n  let dataDir = '';\n  /*\n    You can select an example below and change it to match your\n    selected tts model\n   */\n\n  // ============================================================\n  // Your change starts here\n  // ============================================================\n\n  // Example 1:\n  // modelDir = 'vits-vctk';\n  // modelName = 'vits-vctk.onnx';\n  // lexicon = 'lexicon.txt';\n\n  // Example 2:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  // modelDir = 'vits-piper-en_US-amy-low';\n  // modelName = 'en_US-amy-low.onnx';\n  // dataDir = 'espeak-ng-data';\n\n  // Example 3:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\n  // modelDir = 'vits-icefall-zh-aishell3';\n  // modelName = 'model.onnx';\n  // ruleFsts = 'phone.fst,date.fst,number.fst,new_heteronym.fst';\n  // ruleFars = 'rule.far';\n  // lexicon = 'lexicon.txt';\n\n  // Example 4:\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#csukuangfj-vits-zh-hf-fanchen-c-chinese-187-speakers\n  // modelDir = 
'vits-zh-hf-fanchen-C';\n  // modelName = 'vits-zh-hf-fanchen-C.onnx';\n  // lexicon = 'lexicon.txt';\n\n  // Example 5:\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n  // modelDir = 'vits-coqui-de-css10';\n  // modelName = 'model.onnx';\n\n  // Example 6\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\n  // modelDir = 'vits-piper-en_US-libritts_r-medium';\n  // modelName = 'en_US-libritts_r-medium.onnx';\n  // dataDir = 'espeak-ng-data';\n\n  // Example 7\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-melo-tts-zh_en.tar.bz2\n  // modelDir = 'vits-melo-tts-zh_en';\n  // modelName = 'model.onnx';\n  // lexicon = 'lexicon.txt';\n  // ruleFsts = `date.fst,phone.fst,number.fst`;\n\n  // Example 8\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n  // modelDir = 'matcha-icefall-zh-baker';\n  // acousticModelName = 'model-steps-3.onnx';\n  // vocoder = 'vocos-22khz-univ.onnx';\n  // lexicon = 'lexicon.txt';\n  // ruleFsts = `date.fst,phone.fst,number.fst`;\n\n  // Example 9\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n  // modelDir = 'matcha-icefall-en_US-ljspeech';\n  // acousticModelName = 'model-steps-3.onnx';\n  // vocoder = 'vocos-22khz-univ.onnx';\n  // dataDir = 'espeak-ng-data';\n\n  // Example 10\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html#kokoro-en-v0-19-english-11-speakers\n  // modelDir = 'kokoro-en-v0_19';\n  // modelName = 
'model.onnx';\n  // voices = 'voices.bin'\n  // dataDir = 'espeak-ng-data';\n\n  // Example 11\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n  // modelDir = 'kokoro-multi-lang-v1_0';\n  // modelName = 'model.onnx';\n  // voices = 'voices.bin'\n  // dataDir = 'espeak-ng-data';\n  // lexicon = 'lexicon-us-en.txt,lexicon-zh.txt';\n  // ruleFsts = `date-zh.fst,phone-zh.fst,number-zh.fst`;\n\n  // ============================================================\n  // Please don't change the remaining part of this function\n  // ============================================================\n\n  if (modelName == '' && acousticModelName == '' && vocoder == '') {\n    throw new Error('You are supposed to select a model by changing the code before you run the app');\n  }\n\n  if (modelName != '' && acousticModelName != '') {\n    throw new Error('Please select either VITS or Matcha, not both');\n  }\n\n  if (acousticModelName != '' && vocoder == '') {\n    throw new Error('Please provide a vocoder for Matcha TTS models');\n  }\n\n  if (modelName != '') {\n    modelName = modelDir + '/' + modelName;\n  }\n\n  if (acousticModelName != '') {\n    acousticModelName = modelDir + '/' + acousticModelName;\n  }\n\n  if (voices != '') {\n    voices = modelDir + '/' + voices;\n  }\n\n  if (ruleFsts != '') {\n    let fsts = ruleFsts.split(',')\n    let tmp: string[] = [];\n    for (const f of fsts) {\n      tmp.push(modelDir + '/' + f);\n    }\n    ruleFsts = tmp.join(',');\n  }\n\n  if (ruleFars != '') {\n    let fars = ruleFars.split(',')\n    let tmp: string[] = [];\n    for (const f of fars) {\n      tmp.push(modelDir + '/' + f);\n    }\n    ruleFars = tmp.join(',');\n  }\n\n  if (lexicon.includes(\",\")) {\n    let v = lexicon.split(',')\n    let tmp: string[] = [];\n    for (const f of v) {\n      tmp.push(modelDir + '/' + f);\n    }\n    lexicon = tmp.join(',');\n  } else if (lexicon != '') {\n    lexicon = modelDir + '/' + lexicon;\n  }\n\n  if 
(dataDir != '') {\n    copyRawFileDirToSandbox(context, modelDir + '/' + dataDir)\n    let sandboxPath: string = context.getApplicationContext().filesDir;\n    dataDir = sandboxPath + '/' + modelDir + '/' + dataDir;\n  }\n\n  const tokens = modelDir + '/tokens.txt';\n\n  const config: OfflineTtsConfig = new OfflineTtsConfig();\n  if (voices != '') {\n    config.model.vits.model = '';\n  } else {\n    config.model.vits.model = modelName;\n  }\n\n  if (voices == '') {\n    config.model.vits.lexicon = lexicon;\n    config.model.vits.tokens = tokens;\n    config.model.vits.dataDir = dataDir;\n\n    config.model.matcha.acousticModel = acousticModelName;\n    config.model.matcha.vocoder = vocoder;\n    config.model.matcha.lexicon = lexicon;\n    config.model.matcha.tokens = tokens;\n    config.model.matcha.dataDir = dataDir;\n  }\n\n  if (voices != '') {\n    config.model.kokoro.model = modelName;\n  } else {\n    config.model.kokoro.model = '';\n  }\n\n  if (voices != '') {\n    config.model.kokoro.voices = voices;\n    config.model.kokoro.tokens = tokens;\n    config.model.kokoro.dataDir = dataDir;\n    config.model.kokoro.lexicon = lexicon;\n  }\n\n  config.model.numThreads = 2;\n  config.model.debug = true;\n  config.ruleFsts = ruleFsts;\n  config.ruleFars = ruleFars;\n\n  return new OfflineTts(config, context.resourceManager);\n}\n\ninterface TtsCallbackData {\n  samples: Float32Array;\n  progress: number;\n}\n\nfunction callback(data: TtsCallbackData): number {\n  workerPort.postMessage({\n    'msgType': 'tts-generate-partial', samples: Float32Array.from(data.samples), progress: data.progress,\n  });\n\n  // 0 means to stop generating in C++\n  // 1 means to continue generating in C++\n  return cancelled ? 
0 : 1;\n}\n\n/**\n * Defines the event handler to be called when the worker thread receives a message sent by the host thread.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessage = (e: MessageEvents) => {\n  const msgType = e.data['msgType'] as string;\n  console.log(`msg-type: ${msgType}`);\n  if (msgType == 'init-tts' && !tts) {\n    const context = e.data['context'] as Context;\n    tts = initTts(context);\n    workerPort.postMessage({\n      'msgType': 'init-tts-done',\n      sampleRate: tts.sampleRate,\n      numSpeakers: tts.numSpeakers,\n      numThreads: tts.config.model.numThreads,\n    });\n  }\n\n  if (msgType == 'tts-generate-cancel') {\n    cancelled = true;\n  }\n\n  if (msgType == 'tts-generate') {\n    const text = e.data['text'] as string;\n    console.log(`received text ${text}`);\n    const input: TtsInput = new TtsInput();\n    input.text = text;\n    input.sid = e.data['sid'] as number;\n    input.speed = e.data['speed'] as number;\n    input.callback = callback;\n\n    cancelled = false;\n    if (true) {\n      tts.generateAsync(input).then((ttsOutput: TtsOutput) => {\n        console.log(`sampleRate: ${ttsOutput.sampleRate}`);\n\n        workerPort.postMessage({\n          'msgType': 'tts-generate-done', samples: Float32Array.from(ttsOutput.samples),\n        });\n\n      });\n    } else {\n      const ttsOutput: TtsOutput = tts.generate(input);\n      workerPort.postMessage({\n        'msgType': 'tts-generate-done', samples: Float32Array.from(ttsOutput.samples),\n      });\n    }\n\n\n  }\n}\n\n/**\n * Defines the event handler to be called when the worker receives a message that cannot be deserialized.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessageerror = (e: MessageEvents) => {\n}\n\n/**\n * Defines the event handler to be called when an exception occurs during worker execution.\n * The event handler is executed in 
the worker thread.\n *\n * @param e error message\n */\nworkerPort.onerror = (e: ErrorEvent) => {\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ]\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device text-to-speech with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device text-to-speech with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"TTS\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device text-to-speech with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device text-to-speech with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"TTS\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/rawfile/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"使用新一代Kaldi进行本地离线语音合成\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"使用新一代Kaldi进行本地离线语音合成\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"本地语音合成\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    });\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxTts/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/.gitignore",
    "content": "/node_modules\n/oh_modules\n/local.properties\n/.idea\n**/build\n/.hvigor\n.cxx\n/.clangd\n/.clang-format\n/.clang-tidy\n**/.test\n/.appanalyzer"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/AppScope/app.json5",
    "content": "{\n  \"app\": {\n    \"bundleName\": \"com.k2fsa.sherpa.onnx.vad.asr\",\n    \"vendor\": \"example\",\n    \"versionCode\": 1000000,\n    \"versionName\": \"1.0.0\",\n    \"icon\": \"$media:app_icon\",\n    \"label\": \"$string:app_name\"\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/AppScope/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"app_name\",\n      \"value\": \"SherpaOnnxVadAsr\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/README.md",
    "content": "# Introduction\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/harmony-os/vad-asr.html\nfor how to run code in this folder.\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/build-profile.json5",
    "content": "{\n  \"app\": {\n    \"signingConfigs\": [],\n    \"products\": [\n      {\n        \"name\": \"default\",\n        \"signingConfig\": \"default\",\n        \"compatibleSdkVersion\": \"4.0.0(10)\",\n        \"runtimeOS\": \"HarmonyOS\",\n        \"buildOption\": {\n          \"strictMode\": {\n            \"caseSensitiveCheck\": true,\n          }\n        }\n      }\n    ],\n    \"buildModeSet\": [\n      {\n        \"name\": \"debug\",\n      },\n      {\n        \"name\": \"release\"\n      }\n    ]\n  },\n  \"modules\": [\n    {\n      \"name\": \"entry\",\n      \"srcPath\": \"./entry\",\n      \"targets\": [\n        {\n          \"name\": \"default\",\n          \"applyToProducts\": [\n            \"default\"\n          ]\n        }\n      ]\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/code-linter.json5",
    "content": "{\n  \"files\": [\n    \"**/*.ets\"\n  ],\n  \"ignore\": [\n    \"**/src/ohosTest/**/*\",\n    \"**/src/test/**/*\",\n    \"**/src/mock/**/*\",\n    \"**/node_modules/**/*\",\n    \"**/oh_modules/**/*\",\n    \"**/build/**/*\",\n    \"**/.preview/**/*\"\n  ],\n  \"ruleSet\": [\n    \"plugin:@performance/recommended\",\n    \"plugin:@typescript-eslint/recommended\"\n  ],\n  \"rules\": {\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/.gitignore",
    "content": "/node_modules\n/oh_modules\n/.preview\n/build\n/.cxx\n/.test\n*.har\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/README.md",
    "content": "# Introduction\n\nPlease download ./sherpa_onnx-v1.12.31.har\nfrom <https://huggingface.co/csukuangfj/sherpa-onnx-harmony-os/tree/main/har>\n\nHint: For users who have no access to huggingface, please use\n<https://hf-mirror.com/csukuangfj/sherpa-onnx-harmony-os/tree/main/har>\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/build-profile.json5",
    "content": "{\n  \"apiType\": \"stageMode\",\n  \"buildOption\": {\n    \"sourceOption\": {\n      \"workers\": [\n        './src/main/ets/workers/NonStreamingAsrWithVadWorker.ets'\n      ]\n    }\n  },\n  \"buildOptionSet\": [\n    {\n      \"name\": \"release\",\n      \"arkOptions\": {\n        \"obfuscation\": {\n          \"ruleOptions\": {\n            \"enable\": false,\n            \"files\": [\n              \"./obfuscation-rules.txt\"\n            ]\n          }\n        }\n      }\n    },\n  ],\n  \"targets\": [\n    {\n      \"name\": \"default\"\n    },\n    {\n      \"name\": \"ohosTest\",\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/hvigorfile.ts",
    "content": "import { hapTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: hapTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/obfuscation-rules.txt",
    "content": "# Define project specific obfuscation rules here.\n# You can include the obfuscation configuration files in the current module's build-profile.json5.\n#\n# For more details, see\n#   https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/source-obfuscation-V5\n\n# Obfuscation options:\n# -disable-obfuscation: disable all obfuscations\n# -enable-property-obfuscation: obfuscate the property names\n# -enable-toplevel-obfuscation: obfuscate the names in the global scope\n# -compact: remove unnecessary blank spaces and all line feeds\n# -remove-log: remove all console.* statements\n# -print-namecache: print the name cache that contains the mapping from the old names to new names\n# -apply-namecache: reuse the given cache file\n\n# Keep options:\n# -keep-property-name: specifies property names that you want to keep\n# -keep-global-name: specifies names that you want to keep in the global scope\n\n-enable-property-obfuscation\n-enable-toplevel-obfuscation\n-enable-filename-obfuscation\n-enable-export-obfuscation"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n    \"sherpa_onnx@1.10.32\": \"sherpa_onnx@1.10.32\"\n  },\n  \"packages\": {\n    \"libsherpa_onnx.so@../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\": {\n      \"name\": \"libsherpa_onnx.so\",\n      \"version\": \"1.0.0\",\n      \"resolved\": \"../oh_modules/.ohpm/sherpa_onnx@1.10.32/oh_modules/sherpa_onnx/src/main/cpp/types/libsherpa_onnx\",\n      \"registryType\": \"local\"\n    },\n    \"sherpa_onnx@1.10.32\": {\n      \"name\": \"sherpa_onnx\",\n      \"version\": \"1.10.32\",\n      \"integrity\": \"sha512-yHYmWoeqhrunOqGr9gxPJJH/8+rdwcKFOW6onYByVObQVpbqypslg301IjGm9xpnc5bJEkO3S9sra2zQTpPA/w==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/sherpa_onnx/-/sherpa_onnx-1.10.32.har\",\n      \"registryType\": \"ohpm\",\n      \"dependencies\": {\n        \"libsherpa_onnx.so\": \"file:./src/main/cpp/types/libsherpa_onnx\"\n      }\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/oh-package.json5",
    "content": "{\n  \"name\": \"entry\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"main\": \"\",\n  \"author\": \"\",\n  \"license\": \"\",\n  \"dependencies\": {\n    // please see https://ohpm.openharmony.cn/#/cn/detail/sherpa_onnx\n    \"sherpa_onnx\": \"1.12.31\",\n  }\n}\n\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/entryability/EntryAbility.ets",
    "content": "import AbilityConstant from '@ohos.app.ability.AbilityConstant';\nimport hilog from '@ohos.hilog';\nimport UIAbility from '@ohos.app.ability.UIAbility';\nimport Want from '@ohos.app.ability.Want';\nimport window from '@ohos.window';\n\nexport default class EntryAbility extends UIAbility {\n  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');\n  }\n\n  onDestroy(): void {\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');\n  }\n\n  onWindowStageCreate(windowStage: window.WindowStage): void {\n    // Main window is created, set main page for this ability\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');\n\n    windowStage.loadContent('pages/Index', (err) => {\n      if (err.code) {\n        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');\n        return;\n      }\n      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content.');\n    });\n  }\n\n  onWindowStageDestroy(): void {\n    // Main window is destroyed, release UI related resources\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');\n  }\n\n  onForeground(): void {\n    // Ability has brought to foreground\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');\n  }\n\n  onBackground(): void {\n    // Ability has back to background\n    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/entrybackupability/EntryBackupAbility.ets",
    "content": "import hilog from '@ohos.hilog';\nimport BackupExtensionAbility, { BundleVersion } from '@ohos.application.BackupExtensionAbility';\n\nexport default class EntryBackupAbility extends BackupExtensionAbility {\n  async onBackup() {\n    hilog.info(0x0000, 'testTag', 'onBackup ok');\n  }\n\n  async onRestore(bundleVersion: BundleVersion) {\n    hilog.info(0x0000, 'testTag', 'onRestore ok %{public}s', JSON.stringify(bundleVersion));\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/pages/Index.ets",
    "content": "import { LengthUnit } from '@kit.ArkUI';\nimport worker, { MessageEvents } from '@ohos.worker';\nimport { BusinessError } from '@kit.BasicServicesKit';\nimport { picker } from '@kit.CoreFileKit';\n\nimport { Permissions } from '@kit.AbilityKit';\nimport { allAllowed, requestPermissions } from './Permission';\nimport { audio } from '@kit.AudioKit';\n\n\n@Entry\n@Component\nstruct Index {\n  @State title: string = 'Next-gen Kaldi: VAD + ASR';\n  @State currentIndex: number = 0;\n  @State resultForFile: string = '';\n  @State progressForFile: number = 0;\n  @State selectFileBtnEnabled: boolean = false;\n  @State lang: string = 'English';\n  @State resultForMic: string = '';\n  @State micStarted: boolean = false;\n  @State message: string = 'Start recording';\n  @State micInitDone: boolean = false;\n  private controller: TabsController = new TabsController();\n  private workerInstance?: worker.ThreadWorker\n  private readonly scriptURL: string = 'entry/ets/workers/NonStreamingAsrWithVadWorker.ets'\n  private mic?: audio.AudioCapturer;\n  private sampleList: Float32Array[] = []\n\n  flatten(samples: Float32Array[]): Float32Array {\n    let n = 0;\n    for (let i = 0; i < samples.length; ++i) {\n      n += samples[i].length;\n    }\n\n    const ans: Float32Array = new Float32Array(n);\n    let offset: number = 0;\n    for (let i = 0; i < samples.length; ++i) {\n      ans.set(samples[i], offset);\n      offset += samples[i].length;\n    }\n\n    return ans;\n  }\n\n  async initMic() {\n    const permissions: Permissions[] = [\"ohos.permission.MICROPHONE\"];\n    let allowed: boolean = await allAllowed(permissions);\n    if (!allowed) {\n      console.log(\"request to access the microphone\");\n      const status: boolean = await requestPermissions(permissions);\n\n      if (!status) {\n        console.error('access to microphone is denied')\n        this.resultForMic = \"Failed to get microphone permission. 
Please retry\";\n        return;\n      }\n\n      allowed = await allAllowed(permissions);\n      if (!allowed) {\n        console.error('failed to get microphone permission');\n        this.resultForMic = \"Failed to get microphone permission. Please retry\";\n        return;\n      }\n    } else {\n      console.log(\"allowed to access microphone\");\n    }\n\n    const audioStreamInfo: audio.AudioStreamInfo = {\n      samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_16000,\n      channels: audio.AudioChannel.CHANNEL_1,\n      sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,\n      encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW,\n    };\n\n    const audioCapturerInfo: audio.AudioCapturerInfo = {\n      source: audio.SourceType.SOURCE_TYPE_MIC, capturerFlags: 0\n    };\n\n    const audioCapturerOptions: audio.AudioCapturerOptions = {\n      streamInfo: audioStreamInfo, capturerInfo: audioCapturerInfo\n\n    };\n    audio.createAudioCapturer(audioCapturerOptions, (err, data) => {\n      if (err) {\n        console.error(`error code is ${err.code}, error message is ${err.message}`);\n        this.resultForMic = 'Failed to init microphone';\n      } else {\n        console.info(`init mic successfully`);\n        this.mic = data;\n        this.mic.on('readData', this.micCallback);\n\n        if (this.workerInstance) {\n          this.workerInstance.postMessage({ msgType: 'init-vad-mic', context: getContext() });\n        }\n      }\n    });\n  }\n\n  async aboutToAppear() {\n    this.workerInstance = new worker.ThreadWorker(this.scriptURL, {\n      name: 'NonStreaming ASR worker'\n    });\n\n    this.workerInstance.onmessage = (e: MessageEvents) => {\n      const msgType = e.data['msgType'] as string;\n      console.log(`received msg from worker: ${msgType}`);\n\n      if (msgType == 'init-vad-mic-done') {\n        this.micInitDone = true;\n      }\n\n      if (msgType == 'init-non-streaming-asr-done') {\n        this.selectFileBtnEnabled = 
true;\n        this.resultForFile = `Initializing done.\\n\\nPlease select a wave file of 16kHz in language ${this.lang}`;\n      }\n\n      if (msgType == 'non-streaming-asr-vad-decode-done') {\n        this.resultForFile = e.data['text'] as string + '\\n';\n      }\n\n      if (msgType == 'non-streaming-asr-vad-decode-partial') {\n        if (this.resultForFile == '') {\n          this.resultForFile = e.data['text'] as string;\n        } else {\n          this.resultForFile += '\\n\\n' + e.data['text'] as string;\n        }\n      }\n\n      if (msgType == 'non-streaming-asr-vad-decode-error') {\n        this.resultForFile = e.data['text'] as string;\n      }\n\n      if (msgType == 'non-streaming-asr-vad-decode-progress') {\n        this.progressForFile = e.data['progress'] as number;\n\n        this.selectFileBtnEnabled = this.progressForFile >= 100;\n      }\n\n      if (msgType == 'non-streaming-asr-vad-mic-partial') {\n        if (this.resultForMic == '') {\n          this.resultForMic = e.data['text'] as string;\n        } else {\n          this.resultForMic += '\\n\\n' + e.data['text'] as string;\n        }\n      }\n\n      if (msgType == 'non-streaming-asr-vad-mic-error') {\n        this.resultForMic = e.data['text'] as string;\n      }\n    }\n\n    const context = getContext();\n    this.resultForFile = 'Initializing models';\n    this.workerInstance.postMessage({ msgType: 'init-vad', context });\n    this.workerInstance.postMessage({ msgType: 'init-non-streaming-asr', context });\n\n    await this.initMic();\n  }\n\n  @Builder\n  TabBuilder(title: string, targetIndex: number, selectedImg: Resource, normalImg: Resource) {\n    Column() {\n      Image(this.currentIndex == targetIndex ? selectedImg : normalImg).size({ width: 25, height: 25 })\n      Text(title).fontColor(this.currentIndex == targetIndex ? 
'#28bff1' : '#8a8a8a')\n    }.width('100%').height(50).justifyContent(FlexAlign.Center).onClick(() => {\n      this.currentIndex = targetIndex;\n      this.controller.changeIndex(this.currentIndex);\n    })\n  }\n\n  build() {\n    Column() {\n      Tabs({ barPosition: BarPosition.End, controller: this.controller }) {\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(20).fontWeight(FontWeight.Bold);\n\n            Button('Select .wav file (16kHz) ')\n              .enabled(this.selectFileBtnEnabled)\n              .fontSize(13)\n              .width(296)\n              .height(60)\n              .onClick(() => {\n                this.resultForFile = '';\n                this.progressForFile = 0;\n\n                const documentSelectOptions = new picker.DocumentSelectOptions();\n                documentSelectOptions.maxSelectNumber = 1;\n                documentSelectOptions.fileSuffixFilters = ['.wav'];\n                const documentViewPicker = new picker.DocumentViewPicker();\n                documentViewPicker.select(documentSelectOptions).then((result: Array<string>) => {\n                  console.log(`Result: ${result}`);\n\n                  if (!result[0]) {\n                    this.resultForFile = 'Please select a file to decode';\n                    this.selectFileBtnEnabled = true;\n                    return;\n                  }\n\n                  if (this.workerInstance) {\n                    this.workerInstance.postMessage({\n                      msgType: 'non-streaming-asr-vad-decode', filename: result[0],\n                    });\n                  } else {\n                    console.log(`this worker instance is undefined ${this.workerInstance}`);\n                  }\n                }).catch((err: BusinessError) => {\n                  console.error(`Failed to select file, code is ${err.code}, message is ${err.message}`);\n                })\n\n              })\n\n            
Text(`Supported languages: ${this.lang}`)\n\n            if (this.progressForFile > 0) {\n              Row() {\n                Progress({ value: 0, total: 100, type: ProgressType.Capsule })\n                  .width('80%')\n                  .height(20)\n                  .value(this.progressForFile);\n\n                Text(`${this.progressForFile.toFixed(2)}%`).width('15%')\n              }.width('100%').justifyContent(FlexAlign.Center)\n            }\n\n            TextArea({ text: this.resultForFile })\n              .width('100%')\n              .lineSpacing({ value: 10, unit: LengthUnit.VP })\n              .height('100%');\n          }.alignItems(HorizontalAlign.Center).justifyContent(FlexAlign.Start)\n        }.tabBar(this.TabBuilder('From file', 0, $r('app.media.icon_doc'), $r('app.media.icon_doc')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(20).fontWeight(FontWeight.Bold);\n            Button(this.message).enabled(this.micInitDone).onClick(() => {\n              console.log('clicked mic button');\n              this.resultForMic = '';\n              if (this.mic) {\n                if (this.micStarted) {\n                  this.mic.stop();\n                  this.message = \"Start recording\";\n                  this.micStarted = false;\n                  console.log('mic stopped');\n\n                  const samples = this.flatten(this.sampleList);\n                  let s = 0;\n                  for (let i = 0; i < samples.length; ++i) {\n                    s += samples[i];\n                  }\n                  console.log(`samples ${samples.length}, sum: ${s}`);\n\n                  if (this.workerInstance) {\n                    console.log('decode mic');\n                    this.workerInstance.postMessage({\n                      msgType: 'non-streaming-asr-vad-mic', samples,\n                    });\n                  } else {\n                    console.log(`this worker instance is 
undefined ${this.workerInstance}`);\n                  }\n                } else {\n                  this.sampleList = [];\n                  this.mic.start();\n                  this.message = \"Stop recording\";\n                  this.micStarted = true;\n                  console.log('mic started');\n                }\n              }\n            });\n\n            Text(`Supported languages: ${this.lang}`)\n\n            TextArea({ text: this.resultForMic })\n              .width('100%')\n              .lineSpacing({ value: 10, unit: LengthUnit.VP })\n              .height('100%');\n          }.alignItems(HorizontalAlign.Center).justifyContent(FlexAlign.Start)\n        }\n        .tabBar(this.TabBuilder('From mic', 1, $r('app.media.icon_mic'),\n          $r('app.media.icon_mic')))\n\n        TabContent() {\n          Column({ space: 10 }) {\n            Text(this.title).fontSize(20).fontWeight(FontWeight.Bold);\n            TextArea({\n              text: `\nEverything is open-sourced.\n\nIt runs locally, without accessing the network\n\nSee also https://github.com/k2-fsa/sherpa-onnx\n\n新一代 Kaldi QQ 和微信交流群: 请看\n\nhttps://k2-fsa.github.io/sherpa/social-groups.html\n\n微信公众号: 新一代 Kaldi\n            `\n            }).width('100%').height('100%').focusable(false)\n          }.justifyContent(FlexAlign.Start)\n        }.tabBar(this.TabBuilder('Help', 2, $r('app.media.info'), $r('app.media.info')))\n\n      }.scrollable(false)\n    }.width('100%').justifyContent(FlexAlign.Start)\n  }\n\n  private micCallback = (buffer: ArrayBuffer) => {\n    const view: Int16Array = new Int16Array(buffer);\n\n    const samplesFloat: Float32Array = new Float32Array(view.length);\n    for (let i = 0; i < view.length; ++i) {\n      samplesFloat[i] = view[i] / 32768.0;\n    }\n    this.sampleList.push(samplesFloat);\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/pages/NonStreamingAsrModels.ets",
    "content": "// Please keep in sync with\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/kotlin-api/OfflineRecognizer.kt#L184\n\nimport { OfflineModelConfig } from 'sherpa_onnx';\n\nexport function getOfflineModelConfig(type: number): OfflineModelConfig {\n  const c: OfflineModelConfig = new OfflineModelConfig();\n  switch (type) {\n    case 0: {\n      const modelDir = 'sherpa-onnx-paraformer-zh-2023-09-14'\n      c.paraformer.model = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"paraformer\";\n\n      break;\n    }\n\n    case 1: {\n      const modelDir = 'icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04'\n      c.transducer.encoder = `$modelDir}/encoder-epoch-30-avg-4.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder-epoch-30-avg-4.onnx`;\n      c.transducer.encoder = `${modelDir}/joiner-epoch-30-avg-4.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 2: {\n      const modelDir = 'sherpa-onnx-whisper-tiny.en';\n      c.whisper.encoder = `${modelDir}/tiny.en-encoder.int8.onnx`;\n      c.whisper.decoder = `${modelDir}/tiny.en-decoder.int8.onnx`;\n      c.tokens = `${modelDir}/tiny.en-tokens.txt`;\n      c.modelType = \"whisper\";\n\n      break;\n    }\n\n    case 3: {\n      const modelDir = 'sherpa-onnx-whisper-base.en';\n      c.whisper.encoder = `${modelDir}/base.en-encoder.int8.onnx`;\n      c.whisper.decoder = `${modelDir}/base.en-decoder.int8.onnx`;\n      c.tokens = `${modelDir}/base.en-tokens.txt`;\n      c.modelType = \"whisper\";\n\n      break;\n    }\n\n    case 4: {\n      const modelDir = \"icefall-asr-zipformer-wenetspeech-20230615\";\n      c.transducer.encoder = `${modelDir}/encoder-epoch-12-avg-4.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder-epoch-12-avg-4.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-12-avg-4.int8.onnx`;\n      c.tokens = 
`${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 5: {\n      const modelDir = \"sherpa-onnx-zipformer-multi-zh-hans-2023-9-2\";\n      c.transducer.encoder = `${modelDir}/encoder-epoch-20-avg-1.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder-epoch-20-avg-1.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-20-avg-1.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 6: {\n      const modelDir = \"sherpa-onnx-nemo-ctc-en-citrinet-512\";\n      c.nemoCtc.model = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 7: {\n      const modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\"\n      c.nemoCtc.model = `${modelDir}/model.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 8: {\n      const modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-en-24500\"\n      c.nemoCtc.model = `${modelDir}/model.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 9: {\n      const modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\"\n      c.nemoCtc.model = `${modelDir}/model.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 10: {\n      const modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-es-1424\"\n      c.nemoCtc.model = `${modelDir}/model.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 11: {\n      const modelDir = \"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\"\n      c.telespeechCtc = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"telespeech_ctc\";\n\n      break;\n    }\n\n    case 12: {\n      const modelDir = \"sherpa-onnx-zipformer-thai-2024-06-20\"\n      c.transducer.encoder = `${modelDir}/encoder-epoch-12-avg-5.int8.onnx`;\n 
     c.transducer.decoder = `${modelDir}/decoder-epoch-12-avg-5.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-12-avg-5.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 13: {\n      const modelDir = \"sherpa-onnx-zipformer-korean-2024-06-24\";\n      c.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 14: {\n      const modelDir = \"sherpa-onnx-paraformer-zh-small-2024-03-09\";\n      c.paraformer.model = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"paraformer\";\n\n      break;\n    }\n\n    case 15: {\n      const modelDir = \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\";\n      c.senseVoice.model = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 16: {\n      const modelDir = \"sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01\";\n      c.transducer.encoder = `${modelDir}/encoder-epoch-99-avg-1.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder-epoch-99-avg-1.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-99-avg-1.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 17: {\n      const modelDir = \"sherpa-onnx-zipformer-ru-2024-09-18\";\n      c.transducer.encoder = `${modelDir}/encoder.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 18: {\n      const modelDir = 
\"sherpa-onnx-small-zipformer-ru-2024-09-18\";\n      c.transducer.encoder = `${modelDir}/encoder.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    case 19: {\n      const modelDir = \"sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24\";\n      c.nemoCtc.model = `${modelDir}/model.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 20: {\n      const modelDir = \"sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24\";\n      c.transducer.encoder = `${modelDir}/encoder.int8.onnx`;\n      c.transducer.decoder = `${modelDir}/decoder.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"nemo_transducer\";\n\n      break;\n    }\n\n    case 21: {\n      const modelDir = \"sherpa-onnx-moonshine-tiny-en-int8\";\n      c.moonshine.preprocessor = `${modelDir}/preprocess.onnx`;\n      c.moonshine.encoder = `${modelDir}/encode.int8.onnx`;\n      c.moonshine.uncachedDecoder = `${modelDir}/uncached_decode.int8.onnx`;\n      c.moonshine.cachedDecoder = `${modelDir}/cached_decode.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 22: {\n      const modelDir = \"sherpa-onnx-moonshine-base-en-int8\";\n      c.moonshine.preprocessor = `${modelDir}/preprocess.onnx`;\n      c.moonshine.encoder = `${modelDir}/encode.int8.onnx`;\n      c.moonshine.uncachedDecoder = `${modelDir}/uncached_decode.int8.onnx`;\n      c.moonshine.cachedDecoder = `${modelDir}/cached_decode.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n\n      break;\n    }\n\n    case 23: {\n      const modelDir = \"sherpa-onnx-zipformer-zh-en-2023-11-22\";\n      c.transducer.encoder = `${modelDir}/encoder-epoch-34-avg-19.int8.onnx`;\n      c.transducer.decoder = 
`${modelDir}/decoder-epoch-34-avg-19.onnx`;\n      c.transducer.joiner = `${modelDir}/joiner-epoch-34-avg-19.int8.onnx`;\n      c.tokens = `${modelDir}/tokens.txt`;\n      c.modelType = \"transducer\";\n\n      break;\n    }\n\n    default: {\n      console.log(`Please specify a supported type. Given type ${type}`);\n    }\n  }\n\n  return c;\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/pages/Permission.ets",
    "content": "// This file is modified from\n// https://gitee.com/ukSir/hmchat2/blob/master/entry/src/main/ets/utils/permissionMananger.ets\nimport { abilityAccessCtrl, bundleManager, common, Permissions } from '@kit.AbilityKit';\n\nexport function allAllowed(permissions: Permissions[]): boolean {\n  if (permissions.length == 0) {\n    return false;\n  }\n\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n\n  const bundleInfo = bundleManager.getBundleInfoForSelfSync(bundleManager.BundleFlag.GET_BUNDLE_INFO_WITH_APPLICATION);\n\n  let tokenID: number = bundleInfo.appInfo.accessTokenId;\n\n  return permissions.every(permission => abilityAccessCtrl.GrantStatus.PERMISSION_GRANTED ==\n  mgr.checkAccessTokenSync(tokenID, permission));\n}\n\nexport async function requestPermissions(permissions: Permissions[]): Promise<boolean> {\n  const mgr: abilityAccessCtrl.AtManager = abilityAccessCtrl.createAtManager();\n  const context: Context = getContext() as common.UIAbilityContext;\n\n  const result = await mgr.requestPermissionsFromUser(context, permissions);\n  return result.authResults.length > 0 && result.authResults.every(authResults => authResults == 0);\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/workers/NonStreamingAsrWithVadWorker.ets",
    "content": "import { ErrorEvent, MessageEvents, ThreadWorkerGlobalScope, worker } from '@kit.ArkTS';\nimport {\n  OfflineRecognizer,\n  OfflineRecognizerConfig,\n  OfflineStream,\n  OnlineRecognizerResult,\n  readWaveFromBinary,\n  SileroVadConfig,\n  TenVadConfig,\n  SpeechSegment,\n  Vad,\n  VadConfig,\n} from 'sherpa_onnx';\nimport { Context } from '@kit.AbilityKit';\nimport { fileIo } from '@kit.CoreFileKit';\nimport { getOfflineModelConfig } from '../pages/NonStreamingAsrModels';\nimport { BusinessError } from '@kit.BasicServicesKit';\n\nconst workerPort: ThreadWorkerGlobalScope = worker.workerPort;\n\nlet recognizer: OfflineRecognizer;\nlet vad: Vad; // vad for decoding files\nlet vadMic: Vad; // vad for mic\n\nfunction initVad(context: Context): Vad {\n  let mgr = context.resourceManager;\n  const config: VadConfig = new VadConfig(\n    new SileroVadConfig(\n      'silero_vad.onnx',\n      0.5,\n      0.25,\n      0.5,\n      512,\n    ),\n    new TenVadConfig(\n      '', // set it to ten-vad.onnx to use ten-vad\n      0.5,\n      0.25,\n      0.5,\n      256,\n    ),\n    16000,\n    true,\n    1,\n  );\n\n  const bufferSizeInSeconds = 60;\n  return new Vad(config, bufferSizeInSeconds, mgr);\n}\n\nfunction initNonStreamingAsr(context: Context): OfflineRecognizer {\n  let mgr = context.resourceManager;\n  const config: OfflineRecognizerConfig = new OfflineRecognizerConfig();\n\n  // Note that you can switch to a new model by changing type\n  //\n  // If you use type = 2, which means you will use\n  // sherpa-onnx-whisper-tiny.en\n  // we assume you have the following folder structure in you resources/rawfile\n  /*\n  (py38) fangjuns-MacBook-Pro:main fangjun$ pwd\n  /Users/fangjun/open-source/sherpa-onnx/harmony-os/SherpaOnnxVadAsr/entry/src/main\n  (py38) fangjuns-MacBook-Pro:main fangjun$ tree resources/rawfile/\n  resources/rawfile/\n  ├── sherpa-onnx-whisper-tiny.en\n  │   ├── README.md\n  │   ├── tiny.en-decoder.int8.onnx\n  │   ├── 
tiny.en-encoder.int8.onnx\n  │   └── tiny.en-tokens.txt\n  └── silero_vad.onnx\n\n  1 directory, 5 files\n   */\n  const type = 2;\n  config.modelConfig = getOfflineModelConfig(type);\n  config.modelConfig.debug = true;\n  config.ruleFsts = '';\n  return new OfflineRecognizer(config, mgr);\n}\n\ninterface Wave {\n  samples: Float32Array;\n  sampleRate: number;\n}\n\nfunction decodeFile(filename: string): string {\n  vad.reset();\n\n  const fp = fileIo.openSync(filename);\n  const stat = fileIo.statSync(fp.fd);\n  const arrayBuffer = new ArrayBuffer(stat.size);\n  fileIo.readSync(fp.fd, arrayBuffer);\n  const data: Uint8Array = new Uint8Array(arrayBuffer);\n\n  const wave: Wave = readWaveFromBinary(data);\n  if (wave.sampleRate != 16000) {\n    return `the sample rate in ${filename} is not 16000Hz. Given: ${wave.sampleRate}Hz.\\nPlease select a wav file of 16kHz.`;\n  }\n\n  console.log(`sample rate ${wave.sampleRate}`);\n  console.log(`samples length ${wave.samples.length}`);\n  const resultList: string[] = [];\n\n  let windowSize: number = vad.config.sileroVad.windowSize;\n\n  if (vad.config.tenVad.model != '') {\n    windowSize = vad.config.tenVad.windowSize;\n  }\n\n  for (let i = 0; i < wave.samples.length; i += windowSize) {\n    const thisWindow: Float32Array = wave.samples.subarray(i, i + windowSize)\n    vad.acceptWaveform(thisWindow);\n    if (i + windowSize >= wave.samples.length) {\n      vad.flush();\n    }\n    while (!vad.isEmpty()) {\n      const segment: SpeechSegment = vad.front();\n      const _startTime: number = (segment.start / wave.sampleRate);\n      const _endTime: number = _startTime + segment.samples.length / wave.sampleRate;\n\n      if (_endTime - _startTime < 0.2) {\n        vad.pop();\n        continue;\n      }\n\n      const startTime: string = _startTime.toFixed(2);\n      const endTime: string = _endTime.toFixed(2);\n\n      const progress: number = (segment.start + segment.samples.length) / wave.samples.length * 100;\n\n      
workerPort.postMessage({ 'msgType': 'non-streaming-asr-vad-decode-progress', progress });\n\n      const stream: OfflineStream = recognizer.createStream();\n      stream.acceptWaveform({ samples: segment.samples, sampleRate: wave.sampleRate });\n      recognizer.decode(stream);\n      const result: OnlineRecognizerResult = recognizer.getResult(stream);\n\n      const text: string = `${startTime} -- ${endTime} ${result.text}`\n      resultList.push(text);\n      console.log(`partial result ${text}`);\n\n      workerPort.postMessage({ 'msgType': 'non-streaming-asr-vad-decode-partial', text });\n\n      vad.pop();\n    }\n  }\n\n  return resultList.join('\\n\\n');\n}\n\nfunction decodeMic(samples: Float32Array) {\n  const resultList: string[] = [];\n\n  // Use the mic-specific VAD; the caller resets it before each decode.\n  let windowSize: number = vadMic.config.sileroVad.windowSize;\n\n  if (vadMic.config.tenVad.model != '') {\n    windowSize = vadMic.config.tenVad.windowSize;\n  }\n\n  for (let i = 0; i < samples.length; i += windowSize) {\n    const thisWindow: Float32Array = samples.subarray(i, i + windowSize)\n    vadMic.acceptWaveform(thisWindow);\n    if (i + windowSize >= samples.length) {\n      vadMic.flush();\n    }\n    while (!vadMic.isEmpty()) {\n      const segment: SpeechSegment = vadMic.front();\n      const _startTime: number = (segment.start / 16000);\n      const _endTime: number = _startTime + segment.samples.length / 16000;\n\n      if (_endTime - _startTime < 0.2) {\n        vadMic.pop();\n        continue;\n      }\n\n      const startTime: string = _startTime.toFixed(2);\n      const endTime: string = _endTime.toFixed(2);\n\n      const stream: OfflineStream = recognizer.createStream();\n      stream.acceptWaveform({ samples: segment.samples, sampleRate: 16000 });\n      recognizer.decode(stream);\n      const result: OnlineRecognizerResult = recognizer.getResult(stream);\n\n      const text: string = `${startTime} -- ${endTime} ${result.text}`\n      resultList.push(text);\n      console.log(`partial result ${text}`);\n\n      
workerPort.postMessage({ 'msgType': 'non-streaming-asr-vad-mic-partial', text });\n\n      vadMic.pop();\n    }\n  }\n\n  return resultList.join('\\n\\n');\n}\n\n/**\n * Defines the event handler to be called when the worker thread receives a message sent by the host thread.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessage = (e: MessageEvents) => {\n  const msgType = e.data['msgType'] as string;\n  console.log(`msg-type: ${msgType}`)\n  if (msgType == 'init-vad' && !vad) {\n    const context = e.data['context'] as Context;\n    vad = initVad(context);\n    console.log('init vad done');\n    workerPort.postMessage({ 'msgType': 'init-vad-done' });\n  }\n\n  if (msgType == 'init-vad-mic' && !vadMic) {\n    const context = e.data['context'] as Context;\n    vadMic = initVad(context);\n    console.log('init vad mic done');\n    workerPort.postMessage({ 'msgType': 'init-vad-mic-done' });\n  }\n\n  if (msgType == 'init-non-streaming-asr' && !recognizer) {\n    const context = e.data['context'] as Context;\n    recognizer = initNonStreamingAsr(context);\n    console.log('init non streaming ASR done');\n    workerPort.postMessage({ 'msgType': 'init-non-streaming-asr-done' });\n  }\n\n  if (msgType == 'non-streaming-asr-vad-decode') {\n    const filename = e.data['filename'] as string;\n    console.log(`decoding ${filename}`);\n    try {\n      const text = decodeFile(filename);\n      workerPort.postMessage({ msgType: 'non-streaming-asr-vad-decode-done', text });\n    } catch (e) {\n      workerPort.postMessage({ msgType: 'non-streaming-asr-vad-decode-error', text: `Failed to decode ${filename}` });\n    }\n\n    workerPort.postMessage({ 'msgType': 'non-streaming-asr-vad-decode-progress', progress: 100 });\n  }\n\n  if (msgType == 'non-streaming-asr-vad-mic') {\n    const samples: Float32Array = e.data['samples'] as Float32Array;\n    vadMic.reset();\n    try {\n      const text = decodeMic(samples);\n      
workerPort.postMessage({ msgType: 'non-streaming-asr-vad-mic-done', text });\n    } catch (e) {\n      workerPort.postMessage({ msgType: 'non-streaming-asr-vad-mic-error', text: `Failed to decode` });\n    }\n  }\n}\n\n/**\n * Defines the event handler to be called when the worker receives a message that cannot be deserialized.\n * The event handler is executed in the worker thread.\n *\n * @param e message data\n */\nworkerPort.onmessageerror = (e: MessageEvents) => {\n}\n\n/**\n * Defines the event handler to be called when an exception occurs during worker execution.\n * The event handler is executed in the worker thread.\n *\n * @param e error message\n */\nworkerPort.onerror = (e: ErrorEvent) => {\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry\",\n    \"type\": \"entry\",\n    \"description\": \"$string:module_desc\",\n    \"mainElement\": \"EntryAbility\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false,\n    \"pages\": \"$profile:main_pages\",\n    \"abilities\": [\n      {\n        \"name\": \"EntryAbility\",\n        \"srcEntry\": \"./ets/entryability/EntryAbility.ets\",\n        \"description\": \"$string:EntryAbility_desc\",\n        \"icon\": \"$media:layered_image\",\n        \"label\": \"$string:EntryAbility_label\",\n        \"startWindowIcon\": \"$media:startIcon\",\n        \"startWindowBackground\": \"$color:start_window_background\",\n        \"exported\": true,\n        \"skills\": [\n          {\n            \"entities\": [\n              \"entity.system.home\"\n            ],\n            \"actions\": [\n              \"action.system.home\"\n            ]\n          }\n        ]\n      }\n    ],\n    \"extensionAbilities\": [\n      {\n        \"name\": \"EntryBackupAbility\",\n        \"srcEntry\": \"./ets/entrybackupability/EntryBackupAbility.ets\",\n        \"type\": \"backup\",\n        \"exported\": false,\n        \"metadata\": [\n          {\n            \"name\": \"ohos.extension.backup\",\n            \"resource\": \"$profile:backup_config\"\n          }\n        ],\n      }\n    ],\n    \"requestPermissions\": [\n      {\n        \"name\": \"ohos.permission.MICROPHONE\",\n        \"reason\": \"$string:mic_reason\",\n        \"usedScene\": {\n          \"abilities\": [\n            \"EntryAbility\",\n          ],\n          \"when\": \"inuse\",\n        }\n      }\n    ]\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/base/element/color.json",
    "content": "{\n  \"color\": [\n    {\n      \"name\": \"start_window_background\",\n      \"value\": \"#FFFFFF\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/base/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device VAD+ASR with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device VAD+ASR with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"On-device speech recognition\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device speech recognition with Next-gen Kaldi\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/base/media/layered_image.json",
    "content": "{\n  \"layered-image\":\n  {\n    \"background\" : \"$media:background\",\n    \"foreground\" : \"$media:foreground\"\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/base/profile/backup_config.json",
    "content": "{\n  \"allowToBackupRestore\": true\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/base/profile/main_pages.json",
    "content": "{\n  \"src\": [\n    \"pages/Index\"\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/en_US/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"On-device VAD+ASR with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"On-device VAD+ASR with Next-gen Kaldi\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"On-device speech recognition\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"access the microphone for on-device speech recognition with Next-gen Kaldi\"\n    }\n  ]\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile/.gitkeep",
    "content": ""
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/zh_CN/element/string.json",
    "content": "{\n  \"string\": [\n    {\n      \"name\": \"module_desc\",\n      \"value\": \"基于新一代Kaldi的本地语音识别\"\n    },\n    {\n      \"name\": \"EntryAbility_desc\",\n      \"value\": \"基于新一代Kaldi的本地语音识别\"\n    },\n    {\n      \"name\": \"EntryAbility_label\",\n      \"value\": \"本地语音识别\"\n    },\n    {\n      \"name\": \"mic_reason\",\n      \"value\": \"使用新一代Kaldi, 访问麦克风进行本地语音识别 (不需要联网)\"\n    }\n  ]\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/ohosTest/ets/test/Ability.test.ets",
    "content": "import hilog from '@ohos.hilog';\nimport { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function abilityTest() {\n  describe('ActsAbilityTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    })\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    })\n    afterEach(() => {\n      // Presets a clear action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: clear action function.\n    })\n    afterAll(() => {\n      // Presets a clear action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: clear action function.\n    })\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      hilog.info(0x0000, 'testTag', '%{public}s', 'it begin');\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    })\n  })\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/ohosTest/ets/test/List.test.ets",
    "content": "import abilityTest from './Ability.test';\n\nexport default function testsuite() {\n  abilityTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/ohosTest/module.json5",
    "content": "{\n  \"module\": {\n    \"name\": \"entry_test\",\n    \"type\": \"feature\",\n    \"deviceTypes\": [\n      \"phone\",\n      \"tablet\",\n      \"2in1\"\n    ],\n    \"deliveryWithInstall\": true,\n    \"installationFree\": false\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/test/List.test.ets",
    "content": "import localUnitTest from './LocalUnit.test';\n\nexport default function testsuite() {\n  localUnitTest();\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/entry/src/test/LocalUnit.test.ets",
    "content": "import { describe, beforeAll, beforeEach, afterEach, afterAll, it, expect } from '@ohos/hypium';\n\nexport default function localUnitTest() {\n  describe('localUnitTest', () => {\n    // Defines a test suite. Two parameters are supported: test suite name and test suite function.\n    beforeAll(() => {\n      // Presets an action, which is performed only once before all test cases of the test suite start.\n      // This API supports only one parameter: preset action function.\n    });\n    beforeEach(() => {\n      // Presets an action, which is performed before each unit test case starts.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: preset action function.\n    });\n    afterEach(() => {\n      // Presets a cleanup action, which is performed after each unit test case ends.\n      // The number of execution times is the same as the number of test cases defined by **it**.\n      // This API supports only one parameter: cleanup action function.\n    });\n    afterAll(() => {\n      // Presets a cleanup action, which is performed after all test cases of the test suite end.\n      // This API supports only one parameter: cleanup action function.\n    });\n    it('assertContain', 0, () => {\n      // Defines a test case. This API supports three parameters: test case name, filter parameter, and test case function.\n      let a = 'abc';\n      let b = 'b';\n      // Defines a variety of assertion methods, which are used to declare expected boolean conditions.\n      expect(a).assertContain(b);\n      expect(a).assertEqual(a);\n    });\n  });\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/hvigor/hvigor-config.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"dependencies\": {\n  },\n  \"execution\": {\n    // \"analyze\": \"normal\",                     /* Define the build analyze mode. Value: [ \"normal\" | \"advanced\" | false ]. Default: \"normal\" */\n    // \"daemon\": true,                          /* Enable daemon compilation. Value: [ true | false ]. Default: true */\n    // \"incremental\": true,                     /* Enable incremental compilation. Value: [ true | false ]. Default: true */\n    // \"parallel\": true,                        /* Enable parallel compilation. Value: [ true | false ]. Default: true */\n    // \"typeCheck\": false,                      /* Enable typeCheck. Value: [ true | false ]. Default: false */\n  },\n  \"logging\": {\n    // \"level\": \"info\"                          /* Define the log level. Value: [ \"debug\" | \"info\" | \"warn\" | \"error\" ]. Default: \"info\" */\n  },\n  \"debugging\": {\n    // \"stacktrace\": false                      /* Disable stacktrace compilation. Value: [ true | false ]. Default: false */\n  },\n  \"nodeOptions\": {\n    // \"maxOldSpaceSize\": 8192                  /* Enable nodeOptions maxOldSpaceSize compilation. Unit M. Used for the daemon process. Default: 8192*/\n    // \"exposeGC\": true                         /* Enable to trigger garbage collection explicitly. Default: true*/\n  }\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/hvigorfile.ts",
    "content": "import { appTasks } from '@ohos/hvigor-ohos-plugin';\n\nexport default {\n    system: appTasks,  /* Built-in plugin of Hvigor. It cannot be modified. */\n    plugins:[]         /* Custom plugin to extend the functionality of Hvigor. */\n}\n"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/oh-package-lock.json5",
    "content": "{\n  \"meta\": {\n    \"stableOrder\": true\n  },\n  \"lockfileVersion\": 3,\n  \"ATTENTION\": \"THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\",\n  \"specifiers\": {\n    \"@ohos/hypium@1.0.19\": \"@ohos/hypium@1.0.19\"\n  },\n  \"packages\": {\n    \"@ohos/hypium@1.0.19\": {\n      \"name\": \"@ohos/hypium\",\n      \"version\": \"1.0.19\",\n      \"integrity\": \"sha512-cEjDgLFCm3cWZDeRXk7agBUkPqjWxUo6AQeiu0gEkb3J8ESqlduQLSIXeo3cCsm8U/asL7iKjF85ZyOuufAGSQ==\",\n      \"resolved\": \"https://ohpm.openharmony.cn/ohpm/@ohos/hypium/-/hypium-1.0.19.har\",\n      \"registryType\": \"ohpm\"\n    }\n  }\n}"
  },
  {
    "path": "harmony-os/SherpaOnnxVadAsr/oh-package.json5",
    "content": "{\n  \"modelVersion\": \"5.0.0\",\n  \"description\": \"Please describe the basic information.\",\n  \"dependencies\": {\n  },\n  \"devDependencies\": {\n    \"@ohos/hypium\": \"1.0.19\"\n  }\n}\n"
  },
  {
    "path": "ios-swift/.gitignore",
    "content": "# See https://github.com/github/gitignore/blob/main/Swift.gitignore\n# Xcode\n#\n# gitignore contributors: remember to update Global/Xcode.gitignore, Objective-C.gitignore & Swift.gitignore\n\n## User settings\nxcuserdata/\n\n## compatibility with Xcode 8 and earlier (ignoring not required starting Xcode 9)\n*.xcscmblueprint\n*.xccheckout\n\n## compatibility with Xcode 3 and earlier (ignoring not required starting Xcode 4)\nbuild/\nDerivedData/\n*.moved-aside\n*.pbxuser\n!default.pbxuser\n*.mode1v3\n!default.mode1v3\n*.mode2v3\n!default.mode2v3\n*.perspectivev3\n!default.perspectivev3\n\n## Obj-C/Swift specific\n*.hmap\n\n## App packaging\n*.ipa\n*.dSYM.zip\n*.dSYM\n\n## Playgrounds\ntimeline.xctimeline\nplayground.xcworkspace\n\n# Swift Package Manager\n#\n# Add this line if you want to avoid checking in source code from Swift Package Manager dependencies.\n# Packages/\n# Package.pins\n# Package.resolved\n# *.xcodeproj\n#\n# Xcode automatically generates this directory with a .xcworkspacedata file and xcuserdata\n# hence it is not needed unless you have added a package configuration file to your project\n# .swiftpm\n\n.build/\n\n# CocoaPods\n#\n# We recommend against adding the Pods directory to your .gitignore. 
However\n# you should judge for yourself, the pros and cons are mentioned at:\n# https://guides.cocoapods.org/using/using-cocoapods.html#should-i-check-the-pods-directory-into-source-control\n#\n# Pods/\n#\n# Add this line if you want to avoid checking in source code from the Xcode workspace\n# *.xcworkspace\n\n# Carthage\n#\n# Add this line if you want to avoid checking in source code from Carthage dependencies.\n# Carthage/Checkouts\n\nCarthage/Build/\n\n# Accio dependency management\nDependencies/\n.accio/\n\n# fastlane\n#\n# It is recommended to not store the screenshots in the git repo.\n# Instead, use fastlane to re-generate the screenshots whenever they are needed.\n# For more information about the recommended setup visit:\n# https://docs.fastlane.tools/best-practices/source-control/#source-control\n\nfastlane/report.xml\nfastlane/Preview.html\nfastlane/screenshots/**/*.png\nfastlane/test_output\n\n# Code Injection\n#\n# After new code Injection tools there's a generated folder /iOSInjectionProject\n# https://github.com/johnno1962/injectionforxcode\n\niOSInjectionProject/\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/AppDelegate.swift",
    "content": "//\n//  AppDelegate.swift\n//  SherpaOnnx\n//\n//  Created by fangjun on 2023/2/25.\n//\n\nimport UIKit\n\n@main\nclass AppDelegate: UIResponder, UIApplicationDelegate {\n\n\n\n    func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {\n        // Override point for customization after application launch.\n        return true\n    }\n\n    // MARK: UISceneSession Lifecycle\n\n    func application(_ application: UIApplication, configurationForConnecting connectingSceneSession: UISceneSession, options: UIScene.ConnectionOptions) -> UISceneConfiguration {\n        // Called when a new scene session is being created.\n        // Use this method to select a configuration to create the new scene with.\n        return UISceneConfiguration(name: \"Default Configuration\", sessionRole: connectingSceneSession.role)\n    }\n\n    func application(_ application: UIApplication, didDiscardSceneSessions sceneSessions: Set<UISceneSession>) {\n        // Called when the user discards a scene session.\n        // If any sessions were discarded while the application was not running, this will be called shortly after application:didFinishLaunchingWithOptions.\n        // Use this method to release any resources that were specific to the discarded scenes, as they will not return.\n    }\n\n\n}\n\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"filename\" : \"k2-1024x1024.png\",\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Base.lproj/LaunchScreen.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"13122.16\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" launchScreen=\"YES\" useTraitCollections=\"YES\" useSafeAreas=\"YES\" colorMatched=\"YES\" initialViewController=\"01J-lp-oVM\">\n    <dependencies>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"13104.12\"/>\n        <capability name=\"Safe area layout guides\" minToolsVersion=\"9.0\"/>\n        <capability name=\"documents saved in the Xcode 8 format\" minToolsVersion=\"8.0\"/>\n    </dependencies>\n    <scenes>\n        <!--View Controller-->\n        <scene sceneID=\"EHf-IW-A2E\">\n            <objects>\n                <viewController id=\"01J-lp-oVM\" sceneMemberID=\"viewController\">\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"Ze5-6b-2t3\">\n                        <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"375\" height=\"667\"/>\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <color key=\"backgroundColor\" xcode11CocoaTouchSystemColor=\"systemBackgroundColor\" cocoaTouchSystemColor=\"whiteColor\"/>\n                        <viewLayoutGuide key=\"safeArea\" id=\"6Tk-OE-BBY\"/>\n                    </view>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"iYj-Kq-Ea1\" userLabel=\"First Responder\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n            <point key=\"canvasLocation\" x=\"53\" y=\"375\"/>\n        </scene>\n    </scenes>\n</document>\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Base.lproj/Main.storyboard",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<document type=\"com.apple.InterfaceBuilder3.CocoaTouch.Storyboard.XIB\" version=\"3.0\" toolsVersion=\"21507\" targetRuntime=\"iOS.CocoaTouch\" propertyAccessControl=\"none\" useAutolayout=\"YES\" useTraitCollections=\"YES\" useSafeAreas=\"YES\" colorMatched=\"YES\" initialViewController=\"BYZ-38-t0r\">\n    <device id=\"retina6_12\" orientation=\"portrait\" appearance=\"light\"/>\n    <dependencies>\n        <deployment identifier=\"iOS\"/>\n        <plugIn identifier=\"com.apple.InterfaceBuilder.IBCocoaTouchPlugin\" version=\"21505\"/>\n        <capability name=\"Safe area layout guides\" minToolsVersion=\"9.0\"/>\n        <capability name=\"System colors in document resources\" minToolsVersion=\"11.0\"/>\n        <capability name=\"documents saved in the Xcode 8 format\" minToolsVersion=\"8.0\"/>\n    </dependencies>\n    <scenes>\n        <!--View Controller-->\n        <scene sceneID=\"tne-QT-ifu\">\n            <objects>\n                <viewController id=\"BYZ-38-t0r\" customClass=\"ViewController\" customModule=\"SherpaNcnn\" customModuleProvider=\"target\" sceneMemberID=\"viewController\">\n                    <view key=\"view\" contentMode=\"scaleToFill\" id=\"8bC-Xf-vdC\">\n                        <rect key=\"frame\" x=\"0.0\" y=\"0.0\" width=\"393\" height=\"852\"/>\n                        <autoresizingMask key=\"autoresizingMask\" widthSizable=\"YES\" heightSizable=\"YES\"/>\n                        <subviews>\n                            <button opaque=\"NO\" contentMode=\"scaleToFill\" contentHorizontalAlignment=\"center\" contentVerticalAlignment=\"center\" buttonType=\"system\" lineBreakMode=\"middleTruncation\" translatesAutoresizingMaskIntoConstraints=\"NO\" id=\"7q8-Y3-WbJ\">\n                                <rect key=\"frame\" x=\"166\" y=\"773\" width=\"61.333333333333343\" height=\"35\"/>\n                                <state key=\"normal\" title=\"Button\"/>\n                   
             <buttonConfiguration key=\"configuration\" style=\"plain\" title=\"Start\"/>\n                                <connections>\n                                    <action selector=\"onRecordBtnClick:\" destination=\"BYZ-38-t0r\" eventType=\"touchUpInside\" id=\"rS6-DT-XWm\"/>\n                                </connections>\n                            </button>\n                            <label opaque=\"NO\" userInteractionEnabled=\"NO\" contentMode=\"left\" horizontalHuggingPriority=\"251\" verticalHuggingPriority=\"251\" text=\"Label\" lineBreakMode=\"tailTruncation\" numberOfLines=\"0\" baselineAdjustment=\"alignBaselines\" adjustsFontSizeToFit=\"NO\" translatesAutoresizingMaskIntoConstraints=\"NO\" id=\"jfS-7J-m9C\">\n                                <rect key=\"frame\" x=\"8\" y=\"67\" width=\"377\" height=\"20.333333333333329\"/>\n                                <fontDescription key=\"fontDescription\" type=\"system\" pointSize=\"17\"/>\n                                <nil key=\"textColor\"/>\n                                <nil key=\"highlightedColor\"/>\n                            </label>\n                        </subviews>\n                        <viewLayoutGuide key=\"safeArea\" id=\"6Tk-OE-BBY\"/>\n                        <color key=\"backgroundColor\" systemColor=\"systemBackgroundColor\"/>\n                        <constraints>\n                            <constraint firstItem=\"jfS-7J-m9C\" firstAttribute=\"leading\" secondItem=\"6Tk-OE-BBY\" secondAttribute=\"leading\" constant=\"8\" id=\"HX3-rI-U9E\"/>\n                            <constraint firstItem=\"jfS-7J-m9C\" firstAttribute=\"top\" secondItem=\"6Tk-OE-BBY\" secondAttribute=\"top\" constant=\"8\" id=\"NEv-PD-DHj\"/>\n                            <constraint firstItem=\"7q8-Y3-WbJ\" firstAttribute=\"centerX\" secondItem=\"8bC-Xf-vdC\" secondAttribute=\"centerX\" id=\"Nha-gf-R2b\"/>\n                            <constraint firstItem=\"6Tk-OE-BBY\" firstAttribute=\"trailing\" 
secondItem=\"jfS-7J-m9C\" secondAttribute=\"trailing\" constant=\"8\" id=\"P2f-hG-O2e\"/>\n                            <constraint firstAttribute=\"bottomMargin\" secondItem=\"7q8-Y3-WbJ\" secondAttribute=\"bottom\" constant=\"10\" id=\"Pgb-4G-ySa\"/>\n                        </constraints>\n                    </view>\n                    <connections>\n                        <outlet property=\"recordBtn\" destination=\"7q8-Y3-WbJ\" id=\"mFd-cu-zjn\"/>\n                        <outlet property=\"resultLabel\" destination=\"jfS-7J-m9C\" id=\"xQU-ID-m5Q\"/>\n                    </connections>\n                </viewController>\n                <placeholder placeholderIdentifier=\"IBFirstResponder\" id=\"dkx-z0-nzr\" sceneMemberID=\"firstResponder\"/>\n            </objects>\n            <point key=\"canvasLocation\" x=\"32.824427480916029\" y=\"3.5211267605633805\"/>\n        </scene>\n    </scenes>\n    <resources>\n        <systemColor name=\"systemBackgroundColor\">\n            <color white=\"1\" alpha=\"1\" colorSpace=\"custom\" customColorSpace=\"genericGamma22GrayColorSpace\"/>\n        </systemColor>\n    </resources>\n</document>\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n\t<key>UIApplicationSceneManifest</key>\n\t<dict>\n\t\t<key>UIApplicationSupportsMultipleScenes</key>\n\t\t<false/>\n\t\t<key>UISceneConfigurations</key>\n\t\t<dict>\n\t\t\t<key>UIWindowSceneSessionRoleApplication</key>\n\t\t\t<array>\n\t\t\t\t<dict>\n\t\t\t\t\t<key>UISceneConfigurationName</key>\n\t\t\t\t\t<string>Default Configuration</string>\n\t\t\t\t\t<key>UISceneDelegateClassName</key>\n\t\t\t\t\t<string>$(PRODUCT_MODULE_NAME).SceneDelegate</string>\n\t\t\t\t\t<key>UISceneStoryboardFile</key>\n\t\t\t\t\t<string>Main</string>\n\t\t\t\t</dict>\n\t\t\t</array>\n\t\t</dict>\n\t</dict>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/Model.swift",
    "content": "import Foundation\n\nfunc getResource(_ forResource: String, _ ofType: String) -> String {\n  let path = Bundle.main.path(forResource: forResource, ofType: ofType)\n  precondition(\n    path != nil,\n    \"\\(forResource).\\(ofType) does not exist!\\n\" + \"Remember to change \\n\"\n      + \"  Build Phases -> Copy Bundle Resources\\n\" + \"to add it!\"\n  )\n  return path!\n}\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to download pre-trained models\n\n/// sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html\nfunc getBilingualStreamZhEnZipformer20230220() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner\n    ),\n    numThreads: 1,\n    modelType: \"zipformer\"\n  )\n}\n\nfunc getZhZipformer20230615() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-12-avg-4-chunk-16-left-128\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-12-avg-4-chunk-16-left-128\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-12-avg-4-chunk-16-left-128\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner\n    ),\n    numThreads: 1,\n    modelType: \"zipformer2\"\n  )\n}\n\nfunc getZhZipformer20230615Int8() -> SherpaOnnxOnlineModelConfig {\n  
let encoder = getResource(\"encoder-epoch-12-avg-4-chunk-16-left-128.int8\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-12-avg-4-chunk-16-left-128\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-12-avg-4-chunk-16-left-128\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer2\"\n  )\n}\n\nfunc getEnZipformer20230626() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1-chunk-16-left-128\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1-chunk-16-left-128\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1-chunk-16-left-128\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer2\"\n  )\n}\n\nfunc getBilingualStreamingZhEnParaformer() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder.int8\", \"onnx\")\n  let decoder = getResource(\"decoder.int8\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    paraformer: sherpaOnnxOnlineParaformerModelConfig(\n      encoder: encoder,\n      decoder: decoder),\n    numThreads: 1,\n    modelType: \"paraformer\"\n  )\n}\n\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to add more models if you need\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/SceneDelegate.swift",
    "content": "//\n//  SceneDelegate.swift\n//  SherpaOnnx\n//\n//  Created by fangjun on 2023/2/25.\n//\n\nimport UIKit\n\nclass SceneDelegate: UIResponder, UIWindowSceneDelegate {\n\n    var window: UIWindow?\n\n\n    func scene(_ scene: UIScene, willConnectTo session: UISceneSession, options connectionOptions: UIScene.ConnectionOptions) {\n        // Use this method to optionally configure and attach the UIWindow `window` to the provided UIWindowScene `scene`.\n        // If using a storyboard, the `window` property will automatically be initialized and attached to the scene.\n        // This delegate does not imply the connecting scene or session are new (see `application:configurationForConnectingSceneSession` instead).\n        guard let _ = (scene as? UIWindowScene) else { return }\n    }\n\n    func sceneDidDisconnect(_ scene: UIScene) {\n        // Called as the scene is being released by the system.\n        // This occurs shortly after the scene enters the background, or when its session is discarded.\n        // Release any resources associated with this scene that can be re-created the next time the scene connects.\n        // The scene may re-connect later, as its session was not necessarily discarded (see `application:didDiscardSceneSessions` instead).\n    }\n\n    func sceneDidBecomeActive(_ scene: UIScene) {\n        // Called when the scene has moved from an inactive state to an active state.\n        // Use this method to restart any tasks that were paused (or not yet started) when the scene was inactive.\n    }\n\n    func sceneWillResignActive(_ scene: UIScene) {\n        // Called when the scene will move from an active state to an inactive state.\n        // This may occur due to temporary interruptions (ex. 
an incoming phone call).\n    }\n\n    func sceneWillEnterForeground(_ scene: UIScene) {\n        // Called as the scene transitions from the background to the foreground.\n        // Use this method to undo the changes made on entering the background.\n    }\n\n    func sceneDidEnterBackground(_ scene: UIScene) {\n        // Called as the scene transitions from the foreground to the background.\n        // Use this method to save data, release shared resources, and store enough scene-specific state information\n        // to restore the scene back to its current state.\n    }\n\n\n}\n\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx/ViewController.swift",
    "content": "//\n//  ViewController.swift\n//  SherpaOnnx\n//\n//  Created by fangjun on 2023/1/28.\n//\n\nimport AVFoundation\nimport UIKit\n\nextension AudioBuffer {\n    func array() -> [Float] {\n        return Array(UnsafeBufferPointer(self))\n    }\n}\n\nextension AVAudioPCMBuffer {\n    func array() -> [Float] {\n        return self.audioBufferList.pointee.mBuffers.array()\n    }\n}\n\nclass ViewController: UIViewController {\n    @IBOutlet weak var resultLabel: UILabel!\n    @IBOutlet weak var recordBtn: UIButton!\n\n    var audioEngine: AVAudioEngine? = nil\n    var recognizer: SherpaOnnxRecognizer! = nil\n\n    /// It saves the decoded results so far\n    var sentences: [String] = [] {\n        didSet {\n            updateLabel()\n        }\n    }\n    var lastSentence: String = \"\"\n    let maxSentence: Int = 20\n    var results: String {\n        if sentences.isEmpty && lastSentence.isEmpty {\n            return \"\"\n        }\n        if sentences.isEmpty {\n            return \"0: \\(lastSentence.lowercased())\"\n        }\n\n        let start = max(sentences.count - maxSentence, 0)\n        if lastSentence.isEmpty {\n            return sentences.enumerated().map { (index, s) in \"\\(index): \\(s.lowercased())\" }[start...]\n                .joined(separator: \"\\n\")\n        } else {\n            return sentences.enumerated().map { (index, s) in \"\\(index): \\(s.lowercased())\" }[start...]\n                .joined(separator: \"\\n\") + \"\\n\\(sentences.count): \\(lastSentence.lowercased())\"\n        }\n    }\n\n    func updateLabel() {\n        DispatchQueue.main.async {\n            self.resultLabel.text = self.results\n        }\n    }\n\n    override func viewDidLoad() {\n        super.viewDidLoad()\n        // Do any additional setup after loading the view.\n\n        resultLabel.text = \"ASR with Next-gen Kaldi\\n\\nSee https://github.com/k2-fsa/sherpa-onnx\\n\\nPress the Start button to run!\"\n        recordBtn.setTitle(\"Start\", 
 for: .normal)\n        initRecognizer()\n        initRecorder()\n    }\n\n    @IBAction func onRecordBtnClick(_ sender: UIButton) {\n        if recordBtn.currentTitle == \"Start\" {\n            startRecorder()\n            recordBtn.setTitle(\"Stop\", for: .normal)\n        } else {\n            stopRecorder()\n            recordBtn.setTitle(\"Start\", for: .normal)\n        }\n    }\n\n    func initRecognizer() {\n        // Please select the model that suits you best.\n        //\n        // You can also modify Model.swift to add new pre-trained models from\n        // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n\n        let modelConfig = getBilingualStreamZhEnZipformer20230220()\n        // let modelConfig = getZhZipformer20230615()\n        // let modelConfig = getEnZipformer20230626()\n        // let modelConfig = getBilingualStreamingZhEnParaformer()\n\n        let featConfig = sherpaOnnxFeatureConfig(\n            sampleRate: 16000,\n            featureDim: 80)\n\n        var config = sherpaOnnxOnlineRecognizerConfig(\n            featConfig: featConfig,\n            modelConfig: modelConfig,\n            enableEndpoint: true,\n            rule1MinTrailingSilence: 2.4,\n            rule2MinTrailingSilence: 0.8,\n            rule3MinUtteranceLength: 30,\n            decodingMethod: \"greedy_search\",\n            maxActivePaths: 4\n        )\n        recognizer = SherpaOnnxRecognizer(config: &config)\n    }\n\n    func initRecorder() {\n        print(\"init recorder\")\n        audioEngine = AVAudioEngine()\n        let inputNode = self.audioEngine?.inputNode\n        let bus = 0\n        let inputFormat = inputNode?.outputFormat(forBus: bus)\n        let outputFormat = AVAudioFormat(\n            commonFormat: .pcmFormatFloat32,\n            sampleRate: 16000, channels: 1,\n            interleaved: false)!\n\n        let converter = AVAudioConverter(from: inputFormat!, to: outputFormat)!\n\n        inputNode!.installTap(\n  
          onBus: bus,\n            bufferSize: 1024,\n            format: inputFormat\n        ) {\n            (buffer: AVAudioPCMBuffer, when: AVAudioTime) in\n            var newBufferAvailable = true\n\n            let inputCallback: AVAudioConverterInputBlock = {\n                inNumPackets, outStatus in\n                if newBufferAvailable {\n                    outStatus.pointee = .haveData\n                    newBufferAvailable = false\n\n                    return buffer\n                } else {\n                    outStatus.pointee = .noDataNow\n                    return nil\n                }\n            }\n\n            let convertedBuffer = AVAudioPCMBuffer(\n                pcmFormat: outputFormat,\n                frameCapacity:\n                    AVAudioFrameCount(outputFormat.sampleRate)\n                * buffer.frameLength\n                / AVAudioFrameCount(buffer.format.sampleRate))!\n\n            var error: NSError?\n            let _ = converter.convert(\n                to: convertedBuffer,\n                error: &error, withInputFrom: inputCallback)\n\n            // TODO(fangjun): Handle status != haveData\n\n            let array = convertedBuffer.array()\n            if !array.isEmpty {\n                self.recognizer.acceptWaveform(samples: array)\n                while (self.recognizer.isReady()){\n                    self.recognizer.decode()\n                }\n                let isEndpoint = self.recognizer.isEndpoint()\n                let text = self.recognizer.getResult().text\n\n                if !text.isEmpty && self.lastSentence != text {\n                    self.lastSentence = text\n                    self.updateLabel()\n                    print(text)\n                }\n\n                if isEndpoint {\n                    if !text.isEmpty {\n                        let tmp = self.lastSentence\n                        self.lastSentence = \"\"\n                        self.sentences.append(tmp)\n           
         }\n                    self.recognizer.reset()\n                }\n            }\n        }\n\n    }\n\n    func startRecorder() {\n        lastSentence = \"\"\n        sentences = []\n\n        do {\n            try self.audioEngine?.start()\n        } catch let error as NSError {\n            print(\"Got an error starting audioEngine: \\(error.domain), \\(error)\")\n        }\n        print(\"started\")\n    }\n\n    func stopRecorder() {\n        audioEngine?.stop()\n        print(\"stopped\")\n    }\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tC93989AE2A89FE13009AB859 /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C984A81B29AA11C500D74C52 /* sherpa-onnx.xcframework */; };\n\t\tC984A7E829A9EEB700D74C52 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A7E729A9EEB700D74C52 /* AppDelegate.swift */; };\n\t\tC984A7EA29A9EEB700D74C52 /* SceneDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A7E929A9EEB700D74C52 /* SceneDelegate.swift */; };\n\t\tC984A7F129A9EEB900D74C52 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C984A7F029A9EEB900D74C52 /* Assets.xcassets */; };\n\t\tC984A7F429A9EEB900D74C52 /* LaunchScreen.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = C984A7F229A9EEB900D74C52 /* LaunchScreen.storyboard */; };\n\t\tC984A7FF29A9EEBA00D74C52 /* SherpaOnnxTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A7FE29A9EEBA00D74C52 /* SherpaOnnxTests.swift */; };\n\t\tC984A80929A9EEBA00D74C52 /* SherpaOnnxUITests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A80829A9EEBA00D74C52 /* SherpaOnnxUITests.swift */; };\n\t\tC984A80B29A9EEBA00D74C52 /* SherpaOnnxUITestsLaunchTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A80A29A9EEBA00D74C52 /* SherpaOnnxUITestsLaunchTests.swift */; };\n\t\tC984A81929AA119400D74C52 /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A81829AA119400D74C52 /* SherpaOnnx.swift */; };\n\t\tC984A82829AA196100D74C52 /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = C984A82629AA196100D74C52 /* Main.storyboard */; };\n\t\tC984A82A29AA19AC00D74C52 /* Model.swift in Sources */ = {isa = PBXBuildFile; fileRef = C984A82929AA19AC00D74C52 /* Model.swift */; };\n\t\tC984A83C29AA430B00D74C52 /* ViewController.swift in Sources */ = {isa = PBXBuildFile; fileRef = 
C984A83B29AA430B00D74C52 /* ViewController.swift */; };\n\t\tC9AC22172BB50165008B65E2 /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C9AC22162BB50165008B65E2 /* onnxruntime.xcframework */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\tC984A7FB29A9EEBA00D74C52 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = C984A7DC29A9EEB700D74C52 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = C984A7E329A9EEB700D74C52;\n\t\t\tremoteInfo = SherpaOnnx;\n\t\t};\n\t\tC984A80529A9EEBA00D74C52 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = C984A7DC29A9EEB700D74C52 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = C984A7E329A9EEB700D74C52;\n\t\t\tremoteInfo = SherpaOnnx;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXFileReference section */\n\t\tC984A7E429A9EEB700D74C52 /* SherpaOnnx.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnx.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC984A7E729A9EEB700D74C52 /* AppDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = \"<group>\"; };\n\t\tC984A7E929A9EEB700D74C52 /* SceneDelegate.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SceneDelegate.swift; sourceTree = \"<group>\"; };\n\t\tC984A7F029A9EEB900D74C52 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tC984A7F329A9EEB900D74C52 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/LaunchScreen.storyboard; sourceTree = \"<group>\"; };\n\t\tC984A7F529A9EEB900D74C52 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; 
sourceTree = \"<group>\"; };\n\t\tC984A7FA29A9EEBA00D74C52 /* SherpaOnnxTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = SherpaOnnxTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC984A7FE29A9EEBA00D74C52 /* SherpaOnnxTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxTests.swift; sourceTree = \"<group>\"; };\n\t\tC984A80429A9EEBA00D74C52 /* SherpaOnnxUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = SherpaOnnxUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC984A80829A9EEBA00D74C52 /* SherpaOnnxUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxUITests.swift; sourceTree = \"<group>\"; };\n\t\tC984A80A29A9EEBA00D74C52 /* SherpaOnnxUITestsLaunchTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxUITestsLaunchTests.swift; sourceTree = \"<group>\"; };\n\t\tC984A81729A9F51B00D74C52 /* SherpaOnnx-Bridging-Header.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = \"SherpaOnnx-Bridging-Header.h\"; path = \"../../../swift-api-examples/SherpaOnnx-Bridging-Header.h\"; sourceTree = \"<group>\"; };\n\t\tC984A81829AA119400D74C52 /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = \"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n\t\tC984A81B29AA11C500D74C52 /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = \"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tC984A82729AA196100D74C52 /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = \"<group>\"; 
};\n\t\tC984A82929AA19AC00D74C52 /* Model.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Model.swift; sourceTree = \"<group>\"; };\n\t\tC984A83B29AA430B00D74C52 /* ViewController.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = ViewController.swift; sourceTree = \"<group>\"; };\n\t\tC9AC22162BB50165008B65E2 /* onnxruntime.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path = \"../../build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tC984A7E129A9EEB700D74C52 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC9AC22172BB50165008B65E2 /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tC93989AE2A89FE13009AB859 /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A7F729A9EEBA00D74C52 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A80129A9EEBA00D74C52 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tC984A7DB29A9EEB700D74C52 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A7E629A9EEB700D74C52 /* SherpaOnnx */,\n\t\t\t\tC984A7FD29A9EEBA00D74C52 /* SherpaOnnxTests */,\n\t\t\t\tC984A80729A9EEBA00D74C52 /* SherpaOnnxUITests */,\n\t\t\t\tC984A7E529A9EEB700D74C52 /* Products */,\n\t\t\t\tC984A81A29AA11C500D74C52 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\tC984A7E529A9EEB700D74C52 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A7E429A9EEB700D74C52 /* SherpaOnnx.app */,\n\t\t\t\tC984A7FA29A9EEBA00D74C52 /* SherpaOnnxTests.xctest */,\n\t\t\t\tC984A80429A9EEBA00D74C52 /* SherpaOnnxUITests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC984A7E629A9EEB700D74C52 /* SherpaOnnx */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A83B29AA430B00D74C52 /* ViewController.swift */,\n\t\t\t\tC984A82929AA19AC00D74C52 /* Model.swift */,\n\t\t\t\tC984A81829AA119400D74C52 /* SherpaOnnx.swift */,\n\t\t\t\tC984A81729A9F51B00D74C52 /* SherpaOnnx-Bridging-Header.h */,\n\t\t\t\tC984A7E729A9EEB700D74C52 /* AppDelegate.swift */,\n\t\t\t\tC984A7E929A9EEB700D74C52 /* SceneDelegate.swift */,\n\t\t\t\tC984A82629AA196100D74C52 /* Main.storyboard */,\n\t\t\t\tC984A7F029A9EEB900D74C52 /* Assets.xcassets */,\n\t\t\t\tC984A7F229A9EEB900D74C52 /* LaunchScreen.storyboard */,\n\t\t\t\tC984A7F529A9EEB900D74C52 /* Info.plist */,\n\t\t\t);\n\t\t\tpath = SherpaOnnx;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC984A7FD29A9EEBA00D74C52 /* SherpaOnnxTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A7FE29A9EEBA00D74C52 /* SherpaOnnxTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC984A80729A9EEBA00D74C52 /* SherpaOnnxUITests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A80829A9EEBA00D74C52 /* SherpaOnnxUITests.swift */,\n\t\t\t\tC984A80A29A9EEBA00D74C52 /* SherpaOnnxUITestsLaunchTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxUITests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC984A81A29AA11C500D74C52 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9AC22162BB50165008B65E2 /* onnxruntime.xcframework */,\n\t\t\t\tC984A81B29AA11C500D74C52 /* sherpa-onnx.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tC984A7E329A9EEB700D74C52 /* SherpaOnnx */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C984A80E29A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnx\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC984A7E029A9EEB700D74C52 /* Sources */,\n\t\t\t\tC984A7E129A9EEB700D74C52 /* Frameworks */,\n\t\t\t\tC984A7E229A9EEB700D74C52 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = SherpaOnnx;\n\t\t\tproductName = SherpaOnnx;\n\t\t\tproductReference = C984A7E429A9EEB700D74C52 /* SherpaOnnx.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n\t\tC984A7F929A9EEBA00D74C52 /* SherpaOnnxTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C984A81129A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnxTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC984A7F629A9EEBA00D74C52 /* Sources */,\n\t\t\t\tC984A7F729A9EEBA00D74C52 /* Frameworks */,\n\t\t\t\tC984A7F829A9EEBA00D74C52 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tC984A7FC29A9EEBA00D74C52 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxTests;\n\t\t\tproductName = SherpaOnnxTests;\n\t\t\tproductReference = C984A7FA29A9EEBA00D74C52 /* SherpaOnnxTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\tC984A80329A9EEBA00D74C52 /* SherpaOnnxUITests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C984A81429A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnxUITests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC984A80029A9EEBA00D74C52 /* Sources */,\n\t\t\t\tC984A80129A9EEBA00D74C52 /* Frameworks */,\n\t\t\t\tC984A80229A9EEBA00D74C52 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tC984A80629A9EEBA00D74C52 /* 
PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxUITests;\n\t\t\tproductName = SherpaOnnxUITests;\n\t\t\tproductReference = C984A80429A9EEBA00D74C52 /* SherpaOnnxUITests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.ui-testing\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tC984A7DC29A9EEB700D74C52 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1420;\n\t\t\t\tLastUpgradeCheck = 1420;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tC984A7E329A9EEB700D74C52 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t};\n\t\t\t\t\tC984A7F929A9EEBA00D74C52 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t\tTestTargetID = C984A7E329A9EEB700D74C52;\n\t\t\t\t\t};\n\t\t\t\t\tC984A80329A9EEBA00D74C52 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t\tTestTargetID = C984A7E329A9EEB700D74C52;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = C984A7DF29A9EEB700D74C52 /* Build configuration list for PBXProject \"SherpaOnnx\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = C984A7DB29A9EEB700D74C52;\n\t\t\tproductRefGroup = C984A7E529A9EEB700D74C52 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tC984A7E329A9EEB700D74C52 /* SherpaOnnx */,\n\t\t\t\tC984A7F929A9EEBA00D74C52 /* SherpaOnnxTests */,\n\t\t\t\tC984A80329A9EEBA00D74C52 /* SherpaOnnxUITests */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\tC984A7E229A9EEB700D74C52 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC984A82829AA196100D74C52 /* Main.storyboard in Resources */,\n\t\t\t\tC984A7F429A9EEB900D74C52 /* 
LaunchScreen.storyboard in Resources */,\n\t\t\t\tC984A7F129A9EEB900D74C52 /* Assets.xcassets in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A7F829A9EEBA00D74C52 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A80229A9EEBA00D74C52 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tC984A7E029A9EEB700D74C52 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC984A83C29AA430B00D74C52 /* ViewController.swift in Sources */,\n\t\t\t\tC984A82A29AA19AC00D74C52 /* Model.swift in Sources */,\n\t\t\t\tC984A81929AA119400D74C52 /* SherpaOnnx.swift in Sources */,\n\t\t\t\tC984A7E829A9EEB700D74C52 /* AppDelegate.swift in Sources */,\n\t\t\t\tC984A7EA29A9EEB700D74C52 /* SceneDelegate.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A7F629A9EEBA00D74C52 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC984A7FF29A9EEBA00D74C52 /* SherpaOnnxTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC984A80029A9EEBA00D74C52 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC984A80B29A9EEBA00D74C52 /* SherpaOnnxUITestsLaunchTests.swift in Sources */,\n\t\t\t\tC984A80929A9EEBA00D74C52 /* SherpaOnnxUITests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\tC984A7FC29A9EEBA00D74C52 /* 
PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = C984A7E329A9EEB700D74C52 /* SherpaOnnx */;\n\t\t\ttargetProxy = C984A7FB29A9EEBA00D74C52 /* PBXContainerItemProxy */;\n\t\t};\n\t\tC984A80629A9EEBA00D74C52 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = C984A7E329A9EEB700D74C52 /* SherpaOnnx */;\n\t\t\ttargetProxy = C984A80529A9EEBA00D74C52 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin PBXVariantGroup section */\n\t\tC984A7F229A9EEB900D74C52 /* LaunchScreen.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A7F329A9EEB900D74C52 /* Base */,\n\t\t\t);\n\t\t\tname = LaunchScreen.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC984A82629AA196100D74C52 /* Main.storyboard */ = {\n\t\t\tisa = PBXVariantGroup;\n\t\t\tchildren = (\n\t\t\t\tC984A82729AA196100D74C52 /* Base */,\n\t\t\t);\n\t\t\tname = Main.storyboard;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXVariantGroup section */\n\n/* Begin XCBuildConfiguration section */\n\t\tC984A80C29A9EEBA00D74C52 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = 
YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC984A80D29A9EEBA00D74C52 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC 
= YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC984A80F29A9EEBA00D74C52 /* Debug */ = 
{\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = SherpaOnnx/Info.plist;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchStoryboardName = LaunchScreen;\n\t\t\t\tINFOPLIST_KEY_UIMainStoryboardFile = Main;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC984A81029A9EEBA00D74C52 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = SherpaOnnx/Info.plist;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = 
YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchStoryboardName = LaunchScreen;\n\t\t\t\tINFOPLIST_KEY_UIMainStoryboardFile = Main;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC984A81229A9EEBA00D74C52 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/SherpaOnnx.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnx\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC984A81329A9EEBA00D74C52 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = 
{\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/SherpaOnnx.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnx\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC984A81529A9EEBA00D74C52 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnx;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC984A81629A9EEBA00D74C52 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnx;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section 
*/\n\t\tC984A7DF29A9EEB700D74C52 /* Build configuration list for PBXProject \"SherpaOnnx\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC984A80C29A9EEBA00D74C52 /* Debug */,\n\t\t\t\tC984A80D29A9EEBA00D74C52 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC984A80E29A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnx\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC984A80F29A9EEBA00D74C52 /* Debug */,\n\t\t\t\tC984A81029A9EEBA00D74C52 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC984A81129A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnxTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC984A81229A9EEBA00D74C52 /* Debug */,\n\t\t\t\tC984A81329A9EEBA00D74C52 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC984A81429A9EEBA00D74C52 /* Build configuration list for PBXNativeTarget \"SherpaOnnxUITests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC984A81529A9EEBA00D74C52 /* Debug */,\n\t\t\t\tC984A81629A9EEBA00D74C52 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = C984A7DC29A9EEB700D74C52 /* Project object */;\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnx.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnxTests/SherpaOnnxTests.swift",
    "content": "//\n//  SherpaOnnxTests.swift\n//  SherpaOnnxTests\n//\n//  Created by fangjun on 2023/2/25.\n//\n\nimport XCTest\n@testable import SherpaOnnx\n\nfinal class SherpaOnnxTests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // This is an example of a functional test case.\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n        // Any test you write for XCTest can be annotated as throws and async.\n        // Mark your test throws to produce an unexpected failure when your test encounters an uncaught error.\n        // Mark your test async to allow awaiting for asynchronous code to complete. Check the results with assertions afterwards.\n    }\n\n    func testPerformanceExample() throws {\n        // This is an example of a performance test case.\n        self.measure {\n            // Put the code you want to measure the time of here.\n        }\n    }\n\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnxUITests/SherpaOnnxUITests.swift",
    "content": "//\n//  SherpaOnnxUITests.swift\n//  SherpaOnnxUITests\n//\n//  Created by fangjun on 2023/2/25.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxUITests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n\n        // In UI tests it is usually best to stop immediately when a failure occurs.\n        continueAfterFailure = false\n\n        // In UI tests it’s important to set the initial state - such as interface orientation - required for your tests before they run. The setUp method is a good place to do this.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // UI tests must launch the application that they test.\n        let app = XCUIApplication()\n        app.launch()\n\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n    }\n\n    func testLaunchPerformance() throws {\n        if #available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 7.0, *) {\n            // This measures how long it takes to launch your application.\n            measure(metrics: [XCTApplicationLaunchMetric()]) {\n                XCUIApplication().launch()\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swift/SherpaOnnx/SherpaOnnxUITests/SherpaOnnxUITestsLaunchTests.swift",
    "content": "//\n//  SherpaOnnxUITestsLaunchTests.swift\n//  SherpaOnnxUITests\n//\n//  Created by fangjun on 2023/2/25.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxUITestsLaunchTests: XCTestCase {\n\n    override class var runsForEachTargetApplicationUIConfiguration: Bool {\n        true\n    }\n\n    override func setUpWithError() throws {\n        continueAfterFailure = false\n    }\n\n    func testLaunch() throws {\n        let app = XCUIApplication()\n        app.launch()\n\n        // Insert steps here to perform after app launch but before taking a screenshot,\n        // such as logging into a test account or navigating somewhere in the app\n\n        let attachment = XCTAttachment(screenshot: app.screenshot())\n        attachment.name = \"Launch Screen\"\n        attachment.lifetime = .keepAlways\n        add(attachment)\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/.gitignore",
    "content": "# See https://github.com/github/gitignore/blob/main/Swift.gitignore\n# Xcode\n#\n# gitignore contributors: remember to update Global/Xcode.gitignore, Objective-C.gitignore & Swift.gitignore\n\n## User settings\nxcuserdata/\n\n## compatibility with Xcode 8 and earlier (ignoring not required starting Xcode 9)\n*.xcscmblueprint\n*.xccheckout\n\n## compatibility with Xcode 3 and earlier (ignoring not required starting Xcode 4)\nbuild/\nDerivedData/\n*.moved-aside\n*.pbxuser\n!default.pbxuser\n*.mode1v3\n!default.mode1v3\n*.mode2v3\n!default.mode2v3\n*.perspectivev3\n!default.perspectivev3\n\n## Obj-C/Swift specific\n*.hmap\n\n## App packaging\n*.ipa\n*.dSYM.zip\n*.dSYM\n\n## Playgrounds\ntimeline.xctimeline\nplayground.xcworkspace\n\n# Swift Package Manager\n#\n# Add this line if you want to avoid checking in source code from Swift Package Manager dependencies.\n# Packages/\n# Package.pins\n# Package.resolved\n# *.xcodeproj\n#\n# Xcode automatically generates this directory with a .xcworkspacedata file and xcuserdata\n# hence it is not needed unless you have added a package configuration file to your project\n# .swiftpm\n\n.build/\n\n# CocoaPods\n#\n# We recommend against adding the Pods directory to your .gitignore. 
However\n# you should judge for yourself, the pros and cons are mentioned at:\n# https://guides.cocoapods.org/using/using-cocoapods.html#should-i-check-the-pods-directory-into-source-control\n#\n# Pods/\n#\n# Add this line if you want to avoid checking in source code from the Xcode workspace\n# *.xcworkspace\n\n# Carthage\n#\n# Add this line if you want to avoid checking in source code from Carthage dependencies.\n# Carthage/Checkouts\n\nCarthage/Build/\n\n# Accio dependency management\nDependencies/\n.accio/\n\n# fastlane\n#\n# It is recommended to not store the screenshots in the git repo.\n# Instead, use fastlane to re-generate the screenshots whenever they are needed.\n# For more information about the recommended setup visit:\n# https://docs.fastlane.tools/best-practices/source-control/#source-control\n\nfastlane/report.xml\nfastlane/Preview.html\nfastlane/screenshots/**/*.png\nfastlane/test_output\n\n# Code Injection\n#\n# After new code Injection tools there's a generated folder /iOSInjectionProject\n# https://github.com/johnno1962/injectionforxcode\n\niOSInjectionProject/\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"filename\" : \"k2-1024x1024.png\",\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/ContentView.swift",
    "content": "//\n//  ContentView.swift\n//  SherpaOnnx\n//\n//  Created by fangjun on 2023/4/5.\n//\n\nimport SwiftUI\n\nstruct ContentView: View {\n    @StateObject var sherpaOnnxVM = SherpaOnnxViewModel()\n\n    var body: some View {\n        VStack {\n            Text(\"ASR with Next-gen Kaldi\")\n                .font(.title)\n            if sherpaOnnxVM.status == .stop {\n                Text(\"See https://github.com/k2-fsa/sherpa-onnx\")\n                Text(\"Press the Start button to run!\")\n            }\n            ScrollView(.vertical, showsIndicators: true) {\n                HStack {\n                    Text(sherpaOnnxVM.subtitles)\n                    Spacer()\n                }\n            }\n            Spacer()\n            Button {\n                toggleRecorder()\n            } label: {\n                Text(sherpaOnnxVM.status == .stop ? \"Start\" : \"Stop\")\n            }\n        }\n        .padding()\n    }\n\n    private func toggleRecorder() {\n        sherpaOnnxVM.toggleRecorder()\n    }\n}\n\nstruct ContentView_Previews: PreviewProvider {\n    static var previews: some View {\n        ContentView()\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Extension.swift",
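    "content": "//\n//  Extension.swift\n//  SherpaOnnx\n//\n//  Created by knight on 2023/4/5.\n//\n\nimport AVFoundation\n\nextension AudioBuffer {\n    /// Copies the buffer's contents into an array, interpreting the raw bytes as `Float` samples.\n    func array() -> [Float] {\n        return Array(UnsafeBufferPointer(self))\n    }\n}\n\nextension AVAudioPCMBuffer {\n    /// Returns the samples of the first buffer in the buffer list (the only channel for mono audio).\n    func array() -> [Float] {\n        return self.audioBufferList.pointee.mBuffers.array()\n    }\n}\n"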
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Model.swift",
    "content": "import Foundation\n\nfunc getResource(_ forResource: String, _ ofType: String) -> String {\n  let path = Bundle.main.path(forResource: forResource, ofType: ofType)\n  precondition(\n    path != nil,\n    \"\\(forResource).\\(ofType) does not exist!\\n\" + \"Remember to change \\n\"\n      + \"  Build Phases -> Copy Bundle Resources\\n\" + \"to add it!\"\n  )\n  return path!\n}\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to download pre-trained models\n\n/// sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html\nfunc getBilingualStreamZhEnZipformer20230220() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 2,\n    modelType: \"zipformer\"\n  )\n}\n\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\nfunc getBilingualStreamingZhEnParaformer() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder.int8\", \"onnx\")\n  let decoder = getResource(\"decoder.int8\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    paraformer: sherpaOnnxOnlineParaformerModelConfig(\n      encoder: encoder,\n      decoder: decoder),\n    numThreads: 1,\n    modelType: \"paraformer\"\n  )\n}\n\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html#tiny-en\n//\nfunc getLanguageIdentificationTiny() -> 
SherpaOnnxSpokenLanguageIdentificationConfig {\n  let encoder = getResource(\"tiny-encoder.int8\", \"onnx\")\n  let decoder = getResource(\"tiny-decoder.int8\", \"onnx\")\n\n  let whisperConfig = sherpaOnnxSpokenLanguageIdentificationWhisperConfig(\n    encoder: encoder,\n    decoder: decoder\n  )\n\n  let config = sherpaOnnxSpokenLanguageIdentificationConfig(\n    whisper: whisperConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: \"cpu\"\n  )\n  return config\n}\n\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to add more models if you need them\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/Preview Content/Preview Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/SherpaOnnxApp.swift",
    "content": "//\n//  SherpaOnnxApp.swift\n//  SherpaOnnx\n//\n//  Created by fangjun on 2023/4/5.\n//\n\nimport SwiftUI\n\n@main\nstruct SherpaOnnxApp: App {\n    var body: some Scene {\n        WindowGroup {\n            ContentView()\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx/SherpaOnnxViewModel.swift",
    "content": "//\n//  SherpaOnnxViewModel.swift\n//  SherpaOnnx\n//\n//  Created by knight on 2023/4/5.\n//\n\nimport AVFoundation\nimport Foundation\n\nenum Status {\n    case stop\n    case recording\n}\n\n@MainActor\nclass SherpaOnnxViewModel: ObservableObject {\n    @Published var status: Status = .stop\n    @Published var subtitles: String = \"\"\n\n    var sentences: [String] = []\n\n    var audioEngine: AVAudioEngine? = nil\n    var recognizer: SherpaOnnxRecognizer! = nil\n    private var audioSession: AVAudioSession!\n\n    var lastSentence: String = \"\"\n    let maxSentence: Int = 20\n\n    var results: String {\n        if sentences.isEmpty && lastSentence.isEmpty {\n            return \"\"\n        }\n        if sentences.isEmpty {\n            return \"0: \\(lastSentence.lowercased())\"\n        }\n\n        let start = max(sentences.count - maxSentence, 0)\n        if lastSentence.isEmpty {\n            return sentences.enumerated().map { (index, s) in\n                \"\\(index): \\(s.lowercased())\"\n            }[start...]\n            .joined(separator: \"\\n\")\n        } else {\n            return sentences.enumerated().map { (index, s) in\n                \"\\(index): \\(s.lowercased())\"\n            }[start...]\n            .joined(separator: \"\\n\")\n                + \"\\n\\(sentences.count): \\(lastSentence.lowercased())\"\n        }\n    }\n\n    func updateLabel() {\n        self.subtitles = self.results\n    }\n\n    func setupAudioSession() {\n        audioSession = AVAudioSession.sharedInstance()\n        do {\n            try audioSession.setCategory(\n                .playAndRecord, mode: .default, options: [.defaultToSpeaker])\n            try audioSession.setActive(true)\n        } catch {\n            print(\"Failed to set up audio session: \\(error)\")\n        }\n    }\n\n    init() {\n        initRecognizer()\n        setupAudioSession()\n        initRecorder()\n    }\n\n    private func initRecognizer() {\n        // 
Please select one model that is best suitable for you.\n        //\n        // You can also modify Model.swift to add new pre-trained models from\n        // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // let modelConfig = getBilingualStreamZhEnZipformer20230220()\n        let modelConfig = getBilingualStreamingZhEnParaformer()\n\n        let featConfig = sherpaOnnxFeatureConfig(\n            sampleRate: 16000,\n            featureDim: 80)\n\n        var config = sherpaOnnxOnlineRecognizerConfig(\n            featConfig: featConfig,\n            modelConfig: modelConfig,\n            enableEndpoint: true,\n            rule1MinTrailingSilence: 2.4,\n            rule2MinTrailingSilence: 0.8,\n            rule3MinUtteranceLength: 30,\n            decodingMethod: \"greedy_search\",\n            maxActivePaths: 4\n        )\n        recognizer = SherpaOnnxRecognizer(config: &config)\n    }\n\n    private func initRecorder() {\n        print(\"init recorder\")\n        audioEngine = AVAudioEngine()\n        let inputNode = self.audioEngine?.inputNode\n        let bus = 0\n        let inputFormat = inputNode?.outputFormat(forBus: bus)\n        let outputFormat = AVAudioFormat(\n            commonFormat: .pcmFormatFloat32,\n            sampleRate: 16000, channels: 1,\n            interleaved: false)!\n\n        let converter = AVAudioConverter(from: inputFormat!, to: outputFormat)!\n\n        inputNode!.installTap(\n            onBus: bus,\n            bufferSize: 1024,\n            format: inputFormat\n        ) {\n            (buffer: AVAudioPCMBuffer, when: AVAudioTime) in\n            var newBufferAvailable = true\n\n            let inputCallback: AVAudioConverterInputBlock = {\n                inNumPackets, outStatus in\n                if newBufferAvailable {\n                    outStatus.pointee = .haveData\n                    newBufferAvailable = false\n\n                    return buffer\n                } else {\n                  
  outStatus.pointee = .noDataNow\n                    return nil\n                }\n            }\n\n            let convertedBuffer = AVAudioPCMBuffer(\n                pcmFormat: outputFormat,\n                frameCapacity:\n                    AVAudioFrameCount(outputFormat.sampleRate)\n                    * buffer.frameLength\n                    / AVAudioFrameCount(buffer.format.sampleRate))!\n\n            var error: NSError?\n            let _ = converter.convert(\n                to: convertedBuffer,\n                error: &error, withInputFrom: inputCallback)\n\n            // TODO(fangjun): Handle status != haveData\n\n            let array = convertedBuffer.array()\n            if !array.isEmpty {\n                self.recognizer.acceptWaveform(samples: array)\n                while self.recognizer.isReady() {\n                    self.recognizer.decode()\n                }\n                let isEndpoint = self.recognizer.isEndpoint()\n                let text = self.recognizer.getResult().text\n\n                if !text.isEmpty && self.lastSentence != text {\n                    self.lastSentence = text\n                    self.updateLabel()\n                    print(text)\n                }\n\n                if isEndpoint {\n                    if !text.isEmpty {\n                        let tmp = self.lastSentence\n                        self.lastSentence = \"\"\n                        self.sentences.append(tmp)\n                    }\n                    self.recognizer.reset()\n                }\n            }\n        }\n    }\n\n    public func toggleRecorder() {\n        if status == .stop {\n            startRecorder()\n            status = .recording\n        } else {\n            stopRecorder()\n            status = .stop\n        }\n    }\n\n    private func startRecorder() {\n        lastSentence = \"\"\n        sentences = []\n\n        do {\n            try self.audioEngine?.start()\n        } catch let error as NSError {\n       
     print(\n                \"Got an error starting audioEngine: \\(error.domain), \\(error)\")\n        }\n        print(\"started\")\n    }\n\n    private func stopRecorder() {\n        audioEngine?.stop()\n        print(\"stopped\")\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tC924F32E29DDAC0B00A440A5 /* SherpaOnnxApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F32D29DDAC0B00A440A5 /* SherpaOnnxApp.swift */; };\n\t\tC924F33029DDAC0B00A440A5 /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F32F29DDAC0B00A440A5 /* ContentView.swift */; };\n\t\tC924F33229DDAC0D00A440A5 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C924F33129DDAC0D00A440A5 /* Assets.xcassets */; };\n\t\tC924F33529DDAC0D00A440A5 /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C924F33429DDAC0D00A440A5 /* Preview Assets.xcassets */; };\n\t\tC924F33F29DDAC0D00A440A5 /* SherpaOnnxTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F33E29DDAC0D00A440A5 /* SherpaOnnxTests.swift */; };\n\t\tC924F34929DDAC0D00A440A5 /* SherpaOnnxUITests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F34829DDAC0D00A440A5 /* SherpaOnnxUITests.swift */; };\n\t\tC924F34B29DDAC0D00A440A5 /* SherpaOnnxUITestsLaunchTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F34A29DDAC0D00A440A5 /* SherpaOnnxUITestsLaunchTests.swift */; };\n\t\tC924F35929DDACED00A440A5 /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F35829DDACED00A440A5 /* SherpaOnnx.swift */; };\n\t\tC924F35C29DDAE4000A440A5 /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C924F35B29DDAE4000A440A5 /* sherpa-onnx.xcframework */; };\n\t\tC924F35E29DDAE8200A440A5 /* Model.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F35D29DDAE8200A440A5 /* Model.swift */; };\n\t\tC924F36029DDB05D00A440A5 /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C924F35F29DDB05D00A440A5 /* onnxruntime.xcframework */; };\n\t\tC924F36229DDB15D00A440A5 /* Extension.swift in Sources */ = {isa = PBXBuildFile; fileRef = 
C924F36129DDB15D00A440A5 /* Extension.swift */; };\n\t\tC924F36429DDB1D500A440A5 /* SherpaOnnxViewModel.swift in Sources */ = {isa = PBXBuildFile; fileRef = C924F36329DDB1D500A440A5 /* SherpaOnnxViewModel.swift */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\tC924F33B29DDAC0D00A440A5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = C924F32229DDAC0B00A440A5 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = C924F32929DDAC0B00A440A5;\n\t\t\tremoteInfo = SherpaOnnx;\n\t\t};\n\t\tC924F34529DDAC0D00A440A5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = C924F32229DDAC0B00A440A5 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = C924F32929DDAC0B00A440A5;\n\t\t\tremoteInfo = SherpaOnnx;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXFileReference section */\n\t\tC924F32A29DDAC0B00A440A5 /* SherpaOnnx.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnx.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC924F32D29DDAC0B00A440A5 /* SherpaOnnxApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxApp.swift; sourceTree = \"<group>\"; };\n\t\tC924F32F29DDAC0B00A440A5 /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = \"<group>\"; };\n\t\tC924F33129DDAC0D00A440A5 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tC924F33429DDAC0D00A440A5 /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = \"Preview Assets.xcassets\"; sourceTree = \"<group>\"; };\n\t\tC924F33A29DDAC0D00A440A5 /* SherpaOnnxTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; 
includeInIndex = 0; path = SherpaOnnxTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC924F33E29DDAC0D00A440A5 /* SherpaOnnxTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxTests.swift; sourceTree = \"<group>\"; };\n\t\tC924F34429DDAC0D00A440A5 /* SherpaOnnxUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = SherpaOnnxUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC924F34829DDAC0D00A440A5 /* SherpaOnnxUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxUITests.swift; sourceTree = \"<group>\"; };\n\t\tC924F34A29DDAC0D00A440A5 /* SherpaOnnxUITestsLaunchTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxUITestsLaunchTests.swift; sourceTree = \"<group>\"; };\n\t\tC924F35729DDACED00A440A5 /* SherpaOnnx-Bridging-Header.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = \"SherpaOnnx-Bridging-Header.h\"; path = \"../../../swift-api-examples/SherpaOnnx-Bridging-Header.h\"; sourceTree = \"<group>\"; };\n\t\tC924F35829DDACED00A440A5 /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = \"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n\t\tC924F35B29DDAE4000A440A5 /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = \"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tC924F35D29DDAE8200A440A5 /* Model.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Model.swift; sourceTree = \"<group>\"; };\n\t\tC924F35F29DDB05D00A440A5 /* onnxruntime.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path 
= \"../../build-ios/ios-onnxruntime/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n\t\tC924F36129DDB15D00A440A5 /* Extension.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Extension.swift; sourceTree = \"<group>\"; };\n\t\tC924F36329DDB1D500A440A5 /* SherpaOnnxViewModel.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = SherpaOnnxViewModel.swift; sourceTree = \"<group>\"; };\n\t\tDEFC34EE2BBA8AD100E174E9 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist; path = Info.plist; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tC924F32729DDAC0B00A440A5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC924F36029DDB05D00A440A5 /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tC924F35C29DDAE4000A440A5 /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F33729DDAC0D00A440A5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F34129DDAC0D00A440A5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tC924F32129DDAC0B00A440A5 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F32C29DDAC0B00A440A5 /* SherpaOnnx */,\n\t\t\t\tC924F33D29DDAC0D00A440A5 /* SherpaOnnxTests */,\n\t\t\t\tC924F34729DDAC0D00A440A5 /* SherpaOnnxUITests */,\n\t\t\t\tC924F32B29DDAC0B00A440A5 /* Products */,\n\t\t\t\tC924F35A29DDAE3F00A440A5 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\tC924F32B29DDAC0B00A440A5 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F32A29DDAC0B00A440A5 /* SherpaOnnx.app */,\n\t\t\t\tC924F33A29DDAC0D00A440A5 /* SherpaOnnxTests.xctest */,\n\t\t\t\tC924F34429DDAC0D00A440A5 /* SherpaOnnxUITests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC924F32C29DDAC0B00A440A5 /* SherpaOnnx */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEFC34EE2BBA8AD100E174E9 /* Info.plist */,\n\t\t\t\tC924F36329DDB1D500A440A5 /* SherpaOnnxViewModel.swift */,\n\t\t\t\tC924F36129DDB15D00A440A5 /* Extension.swift */,\n\t\t\t\tC924F35D29DDAE8200A440A5 /* Model.swift */,\n\t\t\t\tC924F35729DDACED00A440A5 /* SherpaOnnx-Bridging-Header.h */,\n\t\t\t\tC924F35829DDACED00A440A5 /* SherpaOnnx.swift */,\n\t\t\t\tC924F32D29DDAC0B00A440A5 /* SherpaOnnxApp.swift */,\n\t\t\t\tC924F32F29DDAC0B00A440A5 /* ContentView.swift */,\n\t\t\t\tC924F33129DDAC0D00A440A5 /* Assets.xcassets */,\n\t\t\t\tC924F33329DDAC0D00A440A5 /* Preview Content */,\n\t\t\t);\n\t\t\tpath = SherpaOnnx;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC924F33329DDAC0D00A440A5 /* Preview Content */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F33429DDAC0D00A440A5 /* Preview Assets.xcassets */,\n\t\t\t);\n\t\t\tpath = \"Preview Content\";\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC924F33D29DDAC0D00A440A5 /* SherpaOnnxTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F33E29DDAC0D00A440A5 /* SherpaOnnxTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC924F34729DDAC0D00A440A5 /* SherpaOnnxUITests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F34829DDAC0D00A440A5 /* SherpaOnnxUITests.swift */,\n\t\t\t\tC924F34A29DDAC0D00A440A5 /* SherpaOnnxUITestsLaunchTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxUITests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC924F35A29DDAE3F00A440A5 /* Frameworks 
*/ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC924F35F29DDB05D00A440A5 /* onnxruntime.xcframework */,\n\t\t\t\tC924F35B29DDAE4000A440A5 /* sherpa-onnx.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tC924F32929DDAC0B00A440A5 /* SherpaOnnx */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C924F34E29DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnx\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC924F32629DDAC0B00A440A5 /* Sources */,\n\t\t\t\tC924F32729DDAC0B00A440A5 /* Frameworks */,\n\t\t\t\tC924F32829DDAC0B00A440A5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = SherpaOnnx;\n\t\t\tproductName = SherpaOnnx;\n\t\t\tproductReference = C924F32A29DDAC0B00A440A5 /* SherpaOnnx.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n\t\tC924F33929DDAC0D00A440A5 /* SherpaOnnxTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C924F35129DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC924F33629DDAC0D00A440A5 /* Sources */,\n\t\t\t\tC924F33729DDAC0D00A440A5 /* Frameworks */,\n\t\t\t\tC924F33829DDAC0D00A440A5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tC924F33C29DDAC0D00A440A5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxTests;\n\t\t\tproductName = SherpaOnnxTests;\n\t\t\tproductReference = C924F33A29DDAC0D00A440A5 /* SherpaOnnxTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\tC924F34329DDAC0D00A440A5 /* SherpaOnnxUITests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C924F35429DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxUITests\" */;\n\t\t\tbuildPhases = 
(\n\t\t\t\tC924F34029DDAC0D00A440A5 /* Sources */,\n\t\t\t\tC924F34129DDAC0D00A440A5 /* Frameworks */,\n\t\t\t\tC924F34229DDAC0D00A440A5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tC924F34629DDAC0D00A440A5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxUITests;\n\t\t\tproductName = SherpaOnnxUITests;\n\t\t\tproductReference = C924F34429DDAC0D00A440A5 /* SherpaOnnxUITests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.ui-testing\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tC924F32229DDAC0B00A440A5 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1420;\n\t\t\t\tLastUpgradeCheck = 1420;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tC924F32929DDAC0B00A440A5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t};\n\t\t\t\t\tC924F33929DDAC0D00A440A5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t\tTestTargetID = C924F32929DDAC0B00A440A5;\n\t\t\t\t\t};\n\t\t\t\t\tC924F34329DDAC0D00A440A5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t\tTestTargetID = C924F32929DDAC0B00A440A5;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = C924F32529DDAC0B00A440A5 /* Build configuration list for PBXProject \"SherpaOnnx\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = C924F32129DDAC0B00A440A5;\n\t\t\tproductRefGroup = C924F32B29DDAC0B00A440A5 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tC924F32929DDAC0B00A440A5 /* SherpaOnnx */,\n\t\t\t\tC924F33929DDAC0D00A440A5 /* SherpaOnnxTests */,\n\t\t\t\tC924F34329DDAC0D00A440A5 /* SherpaOnnxUITests */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase 
section */\n\t\tC924F32829DDAC0B00A440A5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC924F33529DDAC0D00A440A5 /* Preview Assets.xcassets in Resources */,\n\t\t\t\tC924F33229DDAC0D00A440A5 /* Assets.xcassets in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F33829DDAC0D00A440A5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F34229DDAC0D00A440A5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tC924F32629DDAC0B00A440A5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC924F36229DDB15D00A440A5 /* Extension.swift in Sources */,\n\t\t\t\tC924F33029DDAC0B00A440A5 /* ContentView.swift in Sources */,\n\t\t\t\tC924F35929DDACED00A440A5 /* SherpaOnnx.swift in Sources */,\n\t\t\t\tC924F32E29DDAC0B00A440A5 /* SherpaOnnxApp.swift in Sources */,\n\t\t\t\tC924F36429DDB1D500A440A5 /* SherpaOnnxViewModel.swift in Sources */,\n\t\t\t\tC924F35E29DDAE8200A440A5 /* Model.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F33629DDAC0D00A440A5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC924F33F29DDAC0D00A440A5 /* SherpaOnnxTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tC924F34029DDAC0D00A440A5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC924F34B29DDAC0D00A440A5 /* SherpaOnnxUITestsLaunchTests.swift in Sources 
*/,\n\t\t\t\tC924F34929DDAC0D00A440A5 /* SherpaOnnxUITests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section */\n\t\tC924F33C29DDAC0D00A440A5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = C924F32929DDAC0B00A440A5 /* SherpaOnnx */;\n\t\t\ttargetProxy = C924F33B29DDAC0D00A440A5 /* PBXContainerItemProxy */;\n\t\t};\n\t\tC924F34629DDAC0D00A440A5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = C924F32929DDAC0B00A440A5 /* SherpaOnnx */;\n\t\t\ttargetProxy = C924F34529DDAC0D00A440A5 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin XCBuildConfiguration section */\n\t\tC924F34C29DDAC0D00A440A5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = 
YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC924F34D29DDAC0D00A440A5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = 
YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC924F34F29DDAC0D00A440A5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 
1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnx/Preview Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = \"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = SherpaOnnx/Info.plist;\n\t\t\t\tINFOPLIST_KEY_NSMicrophoneUsageDescription = \"Use microphone to record voice\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC924F35029DDAC0D00A440A5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnx/Preview Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = \"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = 
SherpaOnnx/Info.plist;\n\t\t\t\tINFOPLIST_KEY_NSMicrophoneUsageDescription = \"Use microphone to record voice\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC924F35229DDAC0D00A440A5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = 
\"$(BUILT_PRODUCTS_DIR)/SherpaOnnx.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnx\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC924F35329DDAC0D00A440A5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/SherpaOnnx.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnx\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC924F35529DDAC0D00A440A5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnx;\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC924F35629DDAC0D00A440A5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = 
NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnx;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\tC924F32529DDAC0B00A440A5 /* Build configuration list for PBXProject \"SherpaOnnx\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC924F34C29DDAC0D00A440A5 /* Debug */,\n\t\t\t\tC924F34D29DDAC0D00A440A5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC924F34E29DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnx\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC924F34F29DDAC0D00A440A5 /* Debug */,\n\t\t\t\tC924F35029DDAC0D00A440A5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC924F35129DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC924F35229DDAC0D00A440A5 /* Debug */,\n\t\t\t\tC924F35329DDAC0D00A440A5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC924F35429DDAC0D00A440A5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxUITests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC924F35529DDAC0D00A440A5 /* Debug */,\n\t\t\t\tC924F35629DDAC0D00A440A5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = C924F32229DDAC0B00A440A5 /* Project object */;\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnx.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnxTests/SherpaOnnxTests.swift",
    "content": "//\n//  SherpaOnnxTests.swift\n//  SherpaOnnxTests\n//\n//  Created by fangjun on 2023/4/5.\n//\n\nimport XCTest\n@testable import SherpaOnnx\n\nfinal class SherpaOnnxTests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // This is an example of a functional test case.\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n        // Any test you write for XCTest can be annotated as throws and async.\n        // Mark your test throws to produce an unexpected failure when your test encounters an uncaught error.\n        // Mark your test async to allow awaiting for asynchronous code to complete. Check the results with assertions afterwards.\n    }\n\n    func testPerformanceExample() throws {\n        // This is an example of a performance test case.\n        self.measure {\n            // Put the code you want to measure the time of here.\n        }\n    }\n\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnxUITests/SherpaOnnxUITests.swift",
    "content": "//\n//  SherpaOnnxUITests.swift\n//  SherpaOnnxUITests\n//\n//  Created by fangjun on 2023/4/5.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxUITests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n\n        // In UI tests it is usually best to stop immediately when a failure occurs.\n        continueAfterFailure = false\n\n        // In UI tests it’s important to set the initial state - such as interface orientation - required for your tests before they run. The setUp method is a good place to do this.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // UI tests must launch the application that they test.\n        let app = XCUIApplication()\n        app.launch()\n\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n    }\n\n    func testLaunchPerformance() throws {\n        if #available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 7.0, *) {\n            // This measures how long it takes to launch your application.\n            measure(metrics: [XCTApplicationLaunchMetric()]) {\n                XCUIApplication().launch()\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx/SherpaOnnxUITests/SherpaOnnxUITestsLaunchTests.swift",
    "content": "//\n//  SherpaOnnxUITestsLaunchTests.swift\n//  SherpaOnnxUITests\n//\n//  Created by fangjun on 2023/4/5.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxUITestsLaunchTests: XCTestCase {\n\n    override class var runsForEachTargetApplicationUIConfiguration: Bool {\n        true\n    }\n\n    override func setUpWithError() throws {\n        continueAfterFailure = false\n    }\n\n    func testLaunch() throws {\n        let app = XCUIApplication()\n        app.launch()\n\n        // Insert steps here to perform after app launch but before taking a screenshot,\n        // such as logging into a test account or navigating somewhere in the app\n\n        let attachment = XCTAttachment(screenshot: app.screenshot())\n        attachment.name = \"Launch Screen\"\n        attachment.lifetime = .keepAlways\n        add(attachment)\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"filename\" : \"k2-1024x1024.png\",\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/ContentView.swift",
    "content": "//\n//  ContentView.swift\n//  SherpaOnnx2Pass\n//\n//  Created by fangjun on 2023/9/11.\n//\n\nimport SwiftUI\n\nstruct ContentView: View {\n    @StateObject var sherpaOnnxVM = SherpaOnnxViewModel()\n\n    var body: some View {\n        VStack {\n            Text(\"ASR with Next-gen Kaldi\")\n                .font(.title)\n            if sherpaOnnxVM.status == .stop {\n                Text(\"See https://github.com/k2-fsa/sherpa-onnx\")\n                Text(\"Press the Start button to run!\")\n            }\n            ScrollView(.vertical, showsIndicators: true) {\n                HStack {\n                    Text(sherpaOnnxVM.subtitles)\n                    Spacer()\n                }\n            }\n            Spacer()\n            Button {\n                toggleRecorder()\n            } label: {\n                Text(sherpaOnnxVM.status == .stop ? \"Start\" : \"Stop\")\n            }\n        }\n        .padding()\n    }\n\n    private func toggleRecorder() {\n        sherpaOnnxVM.toggleRecorder()\n    }\n}\n\nstruct ContentView_Previews: PreviewProvider {\n    static var previews: some View {\n        ContentView()\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Extension.swift",
    "content": "//\n//  Extension.swift\n//  SherpaOnnx\n//\n//  Created by knight on 2023/4/5.\n//\n\nimport AVFoundation\n\nextension AudioBuffer {\n    func array() -> [Float] {\n        return Array(UnsafeBufferPointer(self))\n    }\n}\n\nextension AVAudioPCMBuffer {\n    func array() -> [Float] {\n        return self.audioBufferList.pointee.mBuffers.array()\n    }\n}\n\nextension TimeInterval {\n  var hourMinuteSecondMS: String {\n    String(format: \"%d:%02d:%02d,%03d\", hour, minute, second, millisecond)\n  }\n\n  var hour: Int {\n    Int((self / 3600).truncatingRemainder(dividingBy: 3600))\n  }\n  var minute: Int {\n    Int((self / 60).truncatingRemainder(dividingBy: 60))\n  }\n  var second: Int {\n    Int(truncatingRemainder(dividingBy: 60))\n  }\n  var millisecond: Int {\n    Int((self * 1000).truncatingRemainder(dividingBy: 1000))\n  }\n}\n\nextension String {\n  var fileURL: URL {\n    return URL(fileURLWithPath: self)\n  }\n  var pathExtension: String {\n    return fileURL.pathExtension\n  }\n  var lastPathComponent: String {\n    return fileURL.lastPathComponent\n  }\n  var stringByDeletingPathExtension: String {\n    return fileURL.deletingPathExtension().path\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Model.swift",
    "content": "import Foundation\n\nfunc getResource(_ forResource: String, _ ofType: String) -> String {\n  let path = Bundle.main.path(forResource: forResource, ofType: ofType)\n  precondition(\n    path != nil,\n    \"\\(forResource).\\(ofType) does not exist!\\n\" + \"Remember to change \\n\"\n      + \"  Build Phases -> Copy Bundle Resources\\n\" + \"to add it!\"\n  )\n  return path!\n}\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to download pre-trained models\n\n/// sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html\nfunc getBilingualStreamingZhEnZipformer20230220() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1.int8\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1.int8\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer\"\n  )\n}\n\n/// csukuangfj/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23 (Chinese)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23-chinese\n\nfunc getStreamingZh14MZipformer20230223() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1.int8\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1.int8\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: 
sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer\"\n  )\n}\n\n/// csukuangfj/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17 (English)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-20m-2023-02-17-english\n\nfunc getStreamingEn20MZipformer20230217() -> SherpaOnnxOnlineModelConfig {\n  let encoder = getResource(\"encoder-epoch-99-avg-1.int8\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-99-avg-1\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-99-avg-1\", \"onnx\")\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  return sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer\"\n  )\n}\n\n/// ========================================\n///   Non-streaming models\n/// ========================================\n\n/// csukuangfj/sherpa-onnx-paraformer-zh-2023-09-14 (Chinese)\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese\nfunc getNonStreamingZhParaformer20230914() -> SherpaOnnxOfflineModelConfig {\n  let model = getResource(\"model.int8\", \"onnx\")\n  let tokens = getResource(\"paraformer-tokens\", \"txt\")\n\n  return sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    paraformer: sherpaOnnxOfflineParaformerModelConfig(\n      model: model),\n    numThreads: 1,\n    modelType: \"paraformer\"\n  )\n}\n\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html#tiny-en\n// English, int8 encoder and decoder\nfunc getNonStreamingWhisperTinyEn() -> SherpaOnnxOfflineModelConfig {\n  let encoder = 
getResource(\"tiny.en-encoder.int8\", \"onnx\")\n  let decoder = getResource(\"tiny.en-decoder.int8\", \"onnx\")\n  let tokens = getResource(\"tiny.en-tokens\", \"txt\")\n\n  return sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    whisper: sherpaOnnxOfflineWhisperModelConfig(\n      encoder: encoder,\n      decoder: decoder\n    ),\n    numThreads: 1,\n    modelType: \"whisper\"\n  )\n}\n\n// icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 (English)\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-english\n\nfunc getNonStreamingEnZipformer20230504() -> SherpaOnnxOfflineModelConfig {\n  let encoder = getResource(\"encoder-epoch-30-avg-4.int8\", \"onnx\")\n  let decoder = getResource(\"decoder-epoch-30-avg-4\", \"onnx\")\n  let joiner = getResource(\"joiner-epoch-30-avg-4\", \"onnx\")\n  let tokens = getResource(\"non-streaming-zipformer-tokens\", \"txt\")\n\n  return sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    transducer: sherpaOnnxOfflineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner),\n    numThreads: 1,\n    modelType: \"zipformer\"\n  )\n}\n\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to add more models if you need\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/Preview Content/Preview Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/SherpaOnnx2PassApp.swift",
    "content": "//\n//  SherpaOnnx2PassApp.swift\n//  SherpaOnnx2Pass\n//\n//  Created by fangjun on 2023/9/11.\n//\n\nimport SwiftUI\n\n@main\nstruct SherpaOnnx2PassApp: App {\n    var body: some Scene {\n        WindowGroup {\n            ContentView()\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass/SherpaOnnxViewModel.swift",
    "content": "//\n//  SherpaOnnxViewModel.swift\n//  SherpaOnnx\n//\n//  Created by knight on 2023/4/5.\n//\n\nimport Foundation\nimport AVFoundation\n\nenum Status {\n    case stop\n    case recording\n}\n\nclass SherpaOnnxViewModel: ObservableObject {\n    @Published var status: Status = .stop\n    @Published var subtitles: String = \"\"\n\n    var sentences: [String] = []\n    var samplesBuffer = [[Float]] ()\n\n    var audioEngine: AVAudioEngine? = nil\n    var recognizer: SherpaOnnxRecognizer! = nil\n    var offlineRecognizer: SherpaOnnxOfflineRecognizer! = nil\n\n    var lastSentence: String = \"\"\n    // let maxSentence: Int = 10 // for Chinese\n    let maxSentence: Int = 6 // for English\n\n    var results: String {\n        if sentences.isEmpty && lastSentence.isEmpty {\n            return \"\"\n        }\n        if sentences.isEmpty {\n            return \"0: \\(lastSentence.lowercased())\"\n        }\n\n        let start = max(sentences.count - maxSentence, 0)\n        if lastSentence.isEmpty {\n            return sentences.enumerated().map { (index, s) in \"\\(index): \\(s.lowercased())\" }[start...]\n                .joined(separator: \"\\n\")\n        } else {\n            return sentences.enumerated().map { (index, s) in \"\\(index): \\(s.lowercased())\" }[start...]\n                .joined(separator: \"\\n\") + \"\\n\\(sentences.count): \\(lastSentence.lowercased())\"\n        }\n    }\n\n    func updateLabel() {\n        DispatchQueue.main.async {\n            self.subtitles = self.results\n        }\n    }\n\n    init() {\n        initRecognizer()\n        initOfflineRecognizer()\n        initRecorder()\n    }\n\n    private func initRecognizer() {\n        // Please select one model that is best suitable for you.\n        //\n        // You can also modify Model.swift to add new pre-trained models from\n        // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // let modelConfig = 
getBilingualStreamingZhEnZipformer20230220()\n        /* let modelConfig = getStreamingZh14MZipformer20230223() */\n\n        let modelConfig = getStreamingEn20MZipformer20230217()\n\n        let featConfig = sherpaOnnxFeatureConfig(\n            sampleRate: 16000,\n            featureDim: 80)\n\n        var config = sherpaOnnxOnlineRecognizerConfig(\n            featConfig: featConfig,\n            modelConfig: modelConfig,\n            enableEndpoint: true,\n            rule1MinTrailingSilence: 2.4,\n\n            // rule2MinTrailingSilence: 1.2, // for Chinese\n\n            rule2MinTrailingSilence: 0.5, // for English\n\n            rule3MinUtteranceLength: 30,\n            decodingMethod: \"greedy_search\",\n            maxActivePaths: 4\n        )\n        recognizer = SherpaOnnxRecognizer(config: &config)\n    }\n\n    private func initOfflineRecognizer() {\n        // let modelConfig = getNonStreamingZhParaformer20230914()\n        let modelConfig = getNonStreamingWhisperTinyEn()\n\n        // let modelConfig = getNonStreamingEnZipformer20230504()\n\n        let featConfig = sherpaOnnxFeatureConfig(\n            sampleRate: 16000,\n            featureDim: 80)\n\n        var config = sherpaOnnxOfflineRecognizerConfig(\n            featConfig: featConfig,\n            modelConfig: modelConfig,\n            decodingMethod: \"greedy_search\",\n            maxActivePaths: 4\n        )\n        offlineRecognizer = SherpaOnnxOfflineRecognizer(config: &config)\n    }\n\n    private func initRecorder() {\n        print(\"init recorder\")\n        audioEngine = AVAudioEngine()\n        let inputNode = self.audioEngine?.inputNode\n        let bus = 0\n        let inputFormat = inputNode?.outputFormat(forBus: bus)\n        let outputFormat = AVAudioFormat(\n            commonFormat: .pcmFormatFloat32,\n            sampleRate: 16000, channels: 1,\n            interleaved: false)!\n\n        let converter = AVAudioConverter(from: inputFormat!, to: outputFormat)!\n\n      
  inputNode!.installTap(\n            onBus: bus,\n            bufferSize: 1024,\n            format: inputFormat\n        ) {\n            (buffer: AVAudioPCMBuffer, when: AVAudioTime) in\n            var newBufferAvailable = true\n\n            let inputCallback: AVAudioConverterInputBlock = {\n                inNumPackets, outStatus in\n                if newBufferAvailable {\n                    outStatus.pointee = .haveData\n                    newBufferAvailable = false\n\n                    return buffer\n                } else {\n                    outStatus.pointee = .noDataNow\n                    return nil\n                }\n            }\n\n            let convertedBuffer = AVAudioPCMBuffer(\n                pcmFormat: outputFormat,\n                frameCapacity:\n                    AVAudioFrameCount(outputFormat.sampleRate)\n                * buffer.frameLength\n                / AVAudioFrameCount(buffer.format.sampleRate))!\n\n            var error: NSError?\n            let _ = converter.convert(\n                to: convertedBuffer,\n                error: &error, withInputFrom: inputCallback)\n\n            // TODO(fangjun): Handle status != haveData\n\n            let array = convertedBuffer.array()\n            if !array.isEmpty {\n                self.samplesBuffer.append(array)\n\n                self.recognizer.acceptWaveform(samples: array)\n                while (self.recognizer.isReady()){\n                    self.recognizer.decode()\n                }\n                let isEndpoint = self.recognizer.isEndpoint()\n                let text = self.recognizer.getResult().text\n\n                if !text.isEmpty && self.lastSentence != text {\n                    self.lastSentence = text\n                    self.updateLabel()\n                    print(text)\n                }\n\n                if isEndpoint{\n                    if !text.isEmpty {\n                        // Invoke offline recognizer\n                        var 
numSamples: Int = 0\n                        for a in self.samplesBuffer {\n                          numSamples += a.count\n                        }\n\n                        var samples: [Float] = Array(repeating: 0, count: numSamples)\n                        var i = 0\n                        for a in self.samplesBuffer {\n                            for s in a {\n                                samples[i] = s\n                                i += 1\n                            }\n                        }\n\n                        // let num = 12000 // For Chinese\n                        let num = 10000 // For English\n\n                        // Clamp the split point: an utterance shorter than `num` samples\n                        // would otherwise form an invalid range and trap at runtime.\n                        let split = max(samples.count - num, 0)\n\n                        // With nothing left for the second pass, keep the streaming result.\n                        self.lastSentence = split > 0\n                            ? self.offlineRecognizer.decode(samples: Array(samples[0..<split])).text\n                            : text\n\n                        let tmp = self.lastSentence\n                        self.lastSentence = \"\"\n                        self.sentences.append(tmp)\n\n                        self.updateLabel()\n\n                        // Carry the trailing samples over as the start of the next utterance.\n                        let tail = Array(samples[split...])\n\n                        self.samplesBuffer = [[Float]]()\n                        self.samplesBuffer.append(tail)\n                    } else {\n                        self.samplesBuffer = [[Float]]()\n                    }\n                    self.recognizer.reset()\n                }\n            }\n        }\n    }\n\n    public func toggleRecorder() {\n        if status == .stop {\n            startRecorder()\n            status = .recording\n        } else {\n            stopRecorder()\n            status = .stop\n        }\n    }\n\n    private func startRecorder() {\n        lastSentence = \"\"\n        sentences = []\n        samplesBuffer = [[Float]] ()\n        updateLabel()\n\n        do {\n            try self.audioEngine?.start()\n        } catch let error as NSError {\n            print(\"Got an error starting audioEngine: \\(error.domain), \\(error)\")\n        }\n        print(\"started\")\n    }\n\n    private func stopRecorder() {\n        audioEngine?.stop()\n        print(\"stopped\")\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tC98126502BFEED7D000AD7AA /* Info.plist in Resources */ = {isa = PBXBuildFile; fileRef = C981264F2BFEED7C000AD7AA /* Info.plist */; };\n\t\tC9A2587D2AAEFFF100E555CA /* SherpaOnnx2PassApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A2587C2AAEFFF100E555CA /* SherpaOnnx2PassApp.swift */; };\n\t\tC9A2587F2AAEFFF100E555CA /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A2587E2AAEFFF100E555CA /* ContentView.swift */; };\n\t\tC9A258812AAEFFF200E555CA /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C9A258802AAEFFF200E555CA /* Assets.xcassets */; };\n\t\tC9A258842AAEFFF200E555CA /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C9A258832AAEFFF200E555CA /* Preview Assets.xcassets */; };\n\t\tC9A2588E2AAF039D00E555CA /* Model.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A2588A2AAF039D00E555CA /* Model.swift */; };\n\t\tC9A258902AAF039D00E555CA /* SherpaOnnxViewModel.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A2588C2AAF039D00E555CA /* SherpaOnnxViewModel.swift */; };\n\t\tC9A258912AAF039D00E555CA /* Extension.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A2588D2AAF039D00E555CA /* Extension.swift */; };\n\t\tC9A258932AAF057E00E555CA /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9A258922AAF057E00E555CA /* SherpaOnnx.swift */; };\n\t\tC9A258962AAF05D100E555CA /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C9A258952AAF05D100E555CA /* sherpa-onnx.xcframework */; };\n\t\tC9A258982AAF05E400E555CA /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C9A258972AAF05E400E555CA /* onnxruntime.xcframework */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXFileReference section */\n\t\tC981264F2BFEED7C000AD7AA /* Info.plist */ = {isa = PBXFileReference; 
fileEncoding = 4; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = \"<group>\"; };\n\t\tC9A258792AAEFFF100E555CA /* SherpaOnnx2Pass.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnx2Pass.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC9A2587C2AAEFFF100E555CA /* SherpaOnnx2PassApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnx2PassApp.swift; sourceTree = \"<group>\"; };\n\t\tC9A2587E2AAEFFF100E555CA /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = \"<group>\"; };\n\t\tC9A258802AAEFFF200E555CA /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tC9A258832AAEFFF200E555CA /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = \"Preview Assets.xcassets\"; sourceTree = \"<group>\"; };\n\t\tC9A2588A2AAF039D00E555CA /* Model.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Model.swift; sourceTree = \"<group>\"; };\n\t\tC9A2588C2AAF039D00E555CA /* SherpaOnnxViewModel.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = SherpaOnnxViewModel.swift; sourceTree = \"<group>\"; };\n\t\tC9A2588D2AAF039D00E555CA /* Extension.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Extension.swift; sourceTree = \"<group>\"; };\n\t\tC9A258922AAF057E00E555CA /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = \"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n\t\tC9A258952AAF05D100E555CA /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = 
\"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tC9A258972AAF05E400E555CA /* onnxruntime.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path = \"../../build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tC9A258762AAEFFF100E555CA /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC9A258982AAF05E400E555CA /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tC9A258962AAF05D100E555CA /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tC9A258702AAEFFF100E555CA = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9A2587B2AAEFFF100E555CA /* SherpaOnnx2Pass */,\n\t\t\t\tC9A2587A2AAEFFF100E555CA /* Products */,\n\t\t\t\tC9A258942AAF05D100E555CA /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC9A2587A2AAEFFF100E555CA /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9A258792AAEFFF100E555CA /* SherpaOnnx2Pass.app */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC9A2587B2AAEFFF100E555CA /* SherpaOnnx2Pass */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC981264F2BFEED7C000AD7AA /* Info.plist */,\n\t\t\t\tC9A258922AAF057E00E555CA /* SherpaOnnx.swift */,\n\t\t\t\tC9A2588D2AAF039D00E555CA /* Extension.swift */,\n\t\t\t\tC9A2588A2AAF039D00E555CA /* Model.swift */,\n\t\t\t\tC9A2588C2AAF039D00E555CA /* SherpaOnnxViewModel.swift */,\n\t\t\t\tC9A2587C2AAEFFF100E555CA /* SherpaOnnx2PassApp.swift */,\n\t\t\t\tC9A2587E2AAEFFF100E555CA /* ContentView.swift */,\n\t\t\t\tC9A258802AAEFFF200E555CA /* Assets.xcassets 
*/,\n\t\t\t\tC9A258822AAEFFF200E555CA /* Preview Content */,\n\t\t\t);\n\t\t\tpath = SherpaOnnx2Pass;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC9A258822AAEFFF200E555CA /* Preview Content */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9A258832AAEFFF200E555CA /* Preview Assets.xcassets */,\n\t\t\t);\n\t\t\tpath = \"Preview Content\";\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC9A258942AAF05D100E555CA /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9A258972AAF05E400E555CA /* onnxruntime.xcframework */,\n\t\t\t\tC9A258952AAF05D100E555CA /* sherpa-onnx.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tC9A258782AAEFFF100E555CA /* SherpaOnnx2Pass */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C9A258872AAEFFF200E555CA /* Build configuration list for PBXNativeTarget \"SherpaOnnx2Pass\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC9A258752AAEFFF100E555CA /* Sources */,\n\t\t\t\tC9A258762AAEFFF100E555CA /* Frameworks */,\n\t\t\t\tC9A258772AAEFFF100E555CA /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = SherpaOnnx2Pass;\n\t\t\tproductName = SherpaOnnx2Pass;\n\t\t\tproductReference = C9A258792AAEFFF100E555CA /* SherpaOnnx2Pass.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tC9A258712AAEFFF100E555CA /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1420;\n\t\t\t\tLastUpgradeCheck = 1420;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tC9A258782AAEFFF100E555CA = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = C9A258742AAEFFF100E555CA /* Build configuration list for PBXProject 
\"SherpaOnnx2Pass\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = C9A258702AAEFFF100E555CA;\n\t\t\tproductRefGroup = C9A2587A2AAEFFF100E555CA /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tC9A258782AAEFFF100E555CA /* SherpaOnnx2Pass */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\tC9A258772AAEFFF100E555CA /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC9A258842AAEFFF200E555CA /* Preview Assets.xcassets in Resources */,\n\t\t\t\tC9A258812AAEFFF200E555CA /* Assets.xcassets in Resources */,\n\t\t\t\tC98126502BFEED7D000AD7AA /* Info.plist in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tC9A258752AAEFFF100E555CA /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC9A2588E2AAF039D00E555CA /* Model.swift in Sources */,\n\t\t\t\tC9A258902AAF039D00E555CA /* SherpaOnnxViewModel.swift in Sources */,\n\t\t\t\tC9A258912AAF039D00E555CA /* Extension.swift in Sources */,\n\t\t\t\tC9A2587F2AAEFFF100E555CA /* ContentView.swift in Sources */,\n\t\t\t\tC9A258932AAF057E00E555CA /* SherpaOnnx.swift in Sources */,\n\t\t\t\tC9A2587D2AAEFFF100E555CA /* SherpaOnnx2PassApp.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin XCBuildConfiguration section */\n\t\tC9A258852AAEFFF200E555CA /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = 
YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 
16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC9A258862AAEFFF200E555CA /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = 
gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC9A258882AAEFFF200E555CA /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnx2Pass/Preview Content\\\"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx2Pass\";\n\t\t\t\tPRODUCT_NAME = 
\"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC9A258892AAEFFF200E555CA /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnx2Pass/Preview Content\\\"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnx2Pass\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section 
*/\n\t\tC9A258742AAEFFF100E555CA /* Build configuration list for PBXProject \"SherpaOnnx2Pass\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC9A258852AAEFFF200E555CA /* Debug */,\n\t\t\t\tC9A258862AAEFFF200E555CA /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC9A258872AAEFFF200E555CA /* Build configuration list for PBXNativeTarget \"SherpaOnnx2Pass\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC9A258882AAEFFF200E555CA /* Debug */,\n\t\t\t\tC9A258892AAEFFF200E555CA /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = C9A258712AAEFFF100E555CA /* Project object */;\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnx2Pass/SherpaOnnx2Pass.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Assets.xcassets/AppIcon 1.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"filename\" : \"k2-1024x1024.png\",\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/ContentView.swift",
    "content": "//\n//  ContentView.swift\n//  SherpaOnnxLangID\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport SwiftUI\n\nstruct ContentView: View {\n    @StateObject var viewModel = ViewModel()\n\n    var body: some View {\n        VStack {\n            Text(\"ASR with Next-gen Kaldi\")\n                .font(.title)\n            if viewModel.status == .stop {\n                Text(\"See https://github.com/k2-fsa/sherpa-onnx\")\n                Text(\"Press the Start button to run!\")\n            }\n            if viewModel.status == .recording {\n                Text(\"Stop will show recording language.\")\n            }\n            Spacer()\n            Text(\"Recording language is: \\(viewModel.language)\")\n                .frame(maxWidth: .infinity)\n            Spacer()\n            Button {\n                toggleRecorder()\n            } label: {\n                Text(viewModel.status == .stop ? \"Start\" : \"Stop\")\n            }\n        }\n        .padding()\n    }\n\n    private func toggleRecorder() {\n        Task {\n            await viewModel.toggleRecorder()\n        }\n    }\n}\n\n#Preview {\n    ContentView()\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/Preview Content/Preview Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/SherpaOnnxLangIDApp.swift",
    "content": "//\n//  SherpaOnnxLangIDApp.swift\n//  SherpaOnnxLangID\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport SwiftUI\n\n@main\nstruct SherpaOnnxLangIDApp: App {\n    var body: some Scene {\n        WindowGroup {\n            ContentView()\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID/ViewModel.swift",
    "content": "//\n//  ViewModel.swift\n//  SherpaOnnxLangID\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport SwiftUI\nimport AVFoundation\n\nenum Status {\n    case stop\n    case recording\n}\n\n@MainActor\nclass ViewModel:ObservableObject {\n    @Published var status: Status = .stop\n\n    @Published var language: String = \"\"\n    \n    var languageIdentifier: SherpaOnnxSpokenLanguageIdentificationWrapper? = nil\n    var audioEngine: AVAudioEngine? = nil\n    \n    var voices: [Float] = []\n\n    init() {\n        initRecorder()\n        initRecognizer()\n    }\n    \n    private func initRecognizer() {\n        var config =  getLanguageIdentificationTiny()\n        self.languageIdentifier = SherpaOnnxSpokenLanguageIdentificationWrapper(config: &config)\n    }\n    \n    private func initRecorder() {\n        print(\"init recorder\")\n        audioEngine = AVAudioEngine()\n        let inputNode = self.audioEngine?.inputNode\n        let bus = 0\n        let inputFormat = inputNode?.outputFormat(forBus: bus)\n        let outputFormat = AVAudioFormat(\n            commonFormat: .pcmFormatFloat32,\n            sampleRate: 16000, channels: 1,\n            interleaved: false)!\n\n        let converter = AVAudioConverter(from: inputFormat!, to: outputFormat)!\n\n        inputNode!.installTap(\n            onBus: bus,\n            bufferSize: 1024,\n            format: inputFormat\n        ) {\n            (buffer: AVAudioPCMBuffer, when: AVAudioTime) in\n            var newBufferAvailable = true\n\n            let inputCallback: AVAudioConverterInputBlock = {\n                inNumPackets, outStatus in\n                if newBufferAvailable {\n                    outStatus.pointee = .haveData\n                    newBufferAvailable = false\n\n                    return buffer\n                } else {\n                    outStatus.pointee = .noDataNow\n                    return nil\n                }\n            }\n\n            let convertedBuffer = 
AVAudioPCMBuffer(\n                pcmFormat: outputFormat,\n                frameCapacity:\n                    AVAudioFrameCount(outputFormat.sampleRate)\n                * buffer.frameLength\n                / AVAudioFrameCount(buffer.format.sampleRate))!\n\n            var error: NSError?\n            let _ = converter.convert(\n                to: convertedBuffer,\n                error: &error, withInputFrom: inputCallback)\n\n            // TODO(fangjun): Handle status != haveData\n\n            let array = convertedBuffer.array()\n            if !array.isEmpty {\n                self.voices.append(contentsOf: array)\n            }\n        }\n    }\n    \n    public func toggleRecorder() async {\n        if status == .stop {\n            await startRecorder()\n        } else {\n            await stopRecorder()\n        }\n    }\n\n    private func startRecorder() async {\n        await MainActor.run {\n            self.language = \"\"\n        }\n        if !self.voices.isEmpty {\n            self.voices = []\n        }\n        do {\n            try self.audioEngine?.start()\n            status = .recording\n            print(\"started\")\n        } catch let error as NSError {\n            print(\"Got an error starting audioEngine: \\(error.domain), \\(error)\")\n        }\n    }\n\n    private func stopRecorder() async {\n        audioEngine?.stop()\n        print(\"stopped, and begin identify language\")\n        await self.identify()\n        status = .stop\n    }\n    \n    private func identify() async {\n        let result = self.languageIdentifier?.decode(samples: self.voices)\n        if let language = result?.lang {\n            await MainActor.run {\n                self.language = language\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tC98126522BFEEDB7000AD7AA /* Info.plist in Resources */ = {isa = PBXBuildFile; fileRef = C98126512BFEEDB7000AD7AA /* Info.plist */; };\n\t\tDEBB2D762BBAAA3500864EF5 /* SherpaOnnxLangIDApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2D752BBAAA3500864EF5 /* SherpaOnnxLangIDApp.swift */; };\n\t\tDEBB2D782BBAAA3500864EF5 /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2D772BBAAA3500864EF5 /* ContentView.swift */; };\n\t\tDEBB2D7A2BBAAA3600864EF5 /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = DEBB2D792BBAAA3600864EF5 /* Assets.xcassets */; };\n\t\tDEBB2D7D2BBAAA3600864EF5 /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = DEBB2D7C2BBAAA3600864EF5 /* Preview Assets.xcassets */; };\n\t\tDEBB2D872BBAAA3600864EF5 /* SherpaOnnxLangIDTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2D862BBAAA3600864EF5 /* SherpaOnnxLangIDTests.swift */; };\n\t\tDEBB2D912BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2D902BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.swift */; };\n\t\tDEBB2D932BBAAA3600864EF5 /* SherpaOnnxLangIDUITestsLaunchTests.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2D922BBAAA3600864EF5 /* SherpaOnnxLangIDUITestsLaunchTests.swift */; };\n\t\tDEBB2DA12BBAAAD800864EF5 /* Extension.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2DA02BBAAAD800864EF5 /* Extension.swift */; };\n\t\tDEBB2DA32BBAAAE700864EF5 /* Model.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2DA22BBAAAE700864EF5 /* Model.swift */; };\n\t\tDEBB2DA52BBAAAFD00864EF5 /* ViewModel.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2DA42BBAAAFD00864EF5 /* ViewModel.swift */; };\n\t\tDEBB2DAC2BBAAC6200864EF5 /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; 
fileRef = DEBB2DAB2BBAAC6200864EF5 /* onnxruntime.xcframework */; };\n\t\tDEBB2DAD2BBAAC6200864EF5 /* onnxruntime.xcframework in Embed Frameworks */ = {isa = PBXBuildFile; fileRef = DEBB2DAB2BBAAC6200864EF5 /* onnxruntime.xcframework */; settings = {ATTRIBUTES = (CodeSignOnCopy, RemoveHeadersOnCopy, ); }; };\n\t\tDEBB2DAF2BBAAC6400864EF5 /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = DEBB2DA72BBAAC4D00864EF5 /* sherpa-onnx.xcframework */; };\n\t\tDEBB2DB02BBAAC6400864EF5 /* sherpa-onnx.xcframework in Embed Frameworks */ = {isa = PBXBuildFile; fileRef = DEBB2DA72BBAAC4D00864EF5 /* sherpa-onnx.xcframework */; settings = {ATTRIBUTES = (CodeSignOnCopy, RemoveHeadersOnCopy, ); }; };\n\t\tDEBB2DB22BBAAD0000864EF5 /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEBB2DB12BBAAD0000864EF5 /* SherpaOnnx.swift */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXContainerItemProxy section */\n\t\tDEBB2D832BBAAA3600864EF5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = DEBB2D6A2BBAAA3500864EF5 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = DEBB2D712BBAAA3500864EF5;\n\t\t\tremoteInfo = SherpaOnnxLangID;\n\t\t};\n\t\tDEBB2D8D2BBAAA3600864EF5 /* PBXContainerItemProxy */ = {\n\t\t\tisa = PBXContainerItemProxy;\n\t\t\tcontainerPortal = DEBB2D6A2BBAAA3500864EF5 /* Project object */;\n\t\t\tproxyType = 1;\n\t\t\tremoteGlobalIDString = DEBB2D712BBAAA3500864EF5;\n\t\t\tremoteInfo = SherpaOnnxLangID;\n\t\t};\n/* End PBXContainerItemProxy section */\n\n/* Begin PBXCopyFilesBuildPhase section */\n\t\tDEBB2DAE2BBAAC6200864EF5 /* Embed Frameworks */ = {\n\t\t\tisa = PBXCopyFilesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tdstPath = \"\";\n\t\t\tdstSubfolderSpec = 10;\n\t\t\tfiles = (\n\t\t\t\tDEBB2DAD2BBAAC6200864EF5 /* onnxruntime.xcframework in Embed Frameworks */,\n\t\t\t\tDEBB2DB02BBAAC6400864EF5 /* sherpa-onnx.xcframework in Embed Frameworks 
*/,\n\t\t\t);\n\t\t\tname = \"Embed Frameworks\";\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXCopyFilesBuildPhase section */\n\n/* Begin PBXFileReference section */\n\t\tC98126512BFEEDB7000AD7AA /* Info.plist */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = \"<group>\"; };\n\t\tDEBB2D722BBAAA3500864EF5 /* SherpaOnnxLangID.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnxLangID.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tDEBB2D752BBAAA3500864EF5 /* SherpaOnnxLangIDApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxLangIDApp.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2D772BBAAA3500864EF5 /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2D792BBAAA3600864EF5 /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tDEBB2D7C2BBAAA3600864EF5 /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = \"Preview Assets.xcassets\"; sourceTree = \"<group>\"; };\n\t\tDEBB2D822BBAAA3600864EF5 /* SherpaOnnxLangIDTests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = SherpaOnnxLangIDTests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tDEBB2D862BBAAA3600864EF5 /* SherpaOnnxLangIDTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxLangIDTests.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2D8C2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.xctest */ = {isa = PBXFileReference; explicitFileType = wrapper.cfbundle; includeInIndex = 0; path = SherpaOnnxLangIDUITests.xctest; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tDEBB2D902BBAAA3600864EF5 /* 
SherpaOnnxLangIDUITests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxLangIDUITests.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2D922BBAAA3600864EF5 /* SherpaOnnxLangIDUITestsLaunchTests.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxLangIDUITestsLaunchTests.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2D9F2BBAAACD00864EF5 /* SherpaOnnx-Bridging-Header.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; name = \"SherpaOnnx-Bridging-Header.h\"; path = \"../../../swift-api-examples/SherpaOnnx-Bridging-Header.h\"; sourceTree = \"<group>\"; };\n\t\tDEBB2DA02BBAAAD800864EF5 /* Extension.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = Extension.swift; path = ../../SherpaOnnx/SherpaOnnx/Extension.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2DA22BBAAAE700864EF5 /* Model.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = Model.swift; path = ../../SherpaOnnx/SherpaOnnx/Model.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2DA42BBAAAFD00864EF5 /* ViewModel.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ViewModel.swift; sourceTree = \"<group>\"; };\n\t\tDEBB2DA72BBAAC4D00864EF5 /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = \"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tDEBB2DAB2BBAAC6200864EF5 /* onnxruntime.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path = \"../../build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n\t\tDEBB2DB12BBAAD0000864EF5 /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = 
\"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tDEBB2D6F2BBAAA3500864EF5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEBB2DAC2BBAAC6200864EF5 /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tDEBB2DAF2BBAAC6400864EF5 /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D7F2BBAAA3600864EF5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D892BBAAA3600864EF5 /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tDEBB2D692BBAAA3500864EF5 = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2D742BBAAA3500864EF5 /* SherpaOnnxLangID */,\n\t\t\t\tDEBB2D852BBAAA3600864EF5 /* SherpaOnnxLangIDTests */,\n\t\t\t\tDEBB2D8F2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests */,\n\t\t\t\tDEBB2D732BBAAA3500864EF5 /* Products */,\n\t\t\t\tDEBB2DA62BBAAC4D00864EF5 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2D732BBAAA3500864EF5 /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2D722BBAAA3500864EF5 /* SherpaOnnxLangID.app */,\n\t\t\t\tDEBB2D822BBAAA3600864EF5 /* SherpaOnnxLangIDTests.xctest */,\n\t\t\t\tDEBB2D8C2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.xctest */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2D742BBAAA3500864EF5 /* SherpaOnnxLangID */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC98126512BFEEDB7000AD7AA /* Info.plist */,\n\t\t\t\tDEBB2D752BBAAA3500864EF5 /* 
SherpaOnnxLangIDApp.swift */,\n\t\t\t\tDEBB2D772BBAAA3500864EF5 /* ContentView.swift */,\n\t\t\t\tDEBB2DA42BBAAAFD00864EF5 /* ViewModel.swift */,\n\t\t\t\tDEBB2D9F2BBAAACD00864EF5 /* SherpaOnnx-Bridging-Header.h */,\n\t\t\t\tDEBB2DB12BBAAD0000864EF5 /* SherpaOnnx.swift */,\n\t\t\t\tDEBB2DA02BBAAAD800864EF5 /* Extension.swift */,\n\t\t\t\tDEBB2DA22BBAAAE700864EF5 /* Model.swift */,\n\t\t\t\tDEBB2D792BBAAA3600864EF5 /* Assets.xcassets */,\n\t\t\t\tDEBB2D7B2BBAAA3600864EF5 /* Preview Content */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxLangID;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2D7B2BBAAA3600864EF5 /* Preview Content */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2D7C2BBAAA3600864EF5 /* Preview Assets.xcassets */,\n\t\t\t);\n\t\t\tpath = \"Preview Content\";\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2D852BBAAA3600864EF5 /* SherpaOnnxLangIDTests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2D862BBAAA3600864EF5 /* SherpaOnnxLangIDTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxLangIDTests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2D8F2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2D902BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.swift */,\n\t\t\t\tDEBB2D922BBAAA3600864EF5 /* SherpaOnnxLangIDUITestsLaunchTests.swift */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxLangIDUITests;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEBB2DA62BBAAC4D00864EF5 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEBB2DAB2BBAAC6200864EF5 /* onnxruntime.xcframework */,\n\t\t\t\tDEBB2DA72BBAAC4D00864EF5 /* sherpa-onnx.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tDEBB2D712BBAAA3500864EF5 /* SherpaOnnxLangID */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = DEBB2D962BBAAA3600864EF5 /* Build configuration list for 
PBXNativeTarget \"SherpaOnnxLangID\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tDEBB2D6E2BBAAA3500864EF5 /* Sources */,\n\t\t\t\tDEBB2D6F2BBAAA3500864EF5 /* Frameworks */,\n\t\t\t\tDEBB2D702BBAAA3500864EF5 /* Resources */,\n\t\t\t\tDEBB2DAE2BBAAC6200864EF5 /* Embed Frameworks */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = SherpaOnnxLangID;\n\t\t\tproductName = SherpaOnnxLangID;\n\t\t\tproductReference = DEBB2D722BBAAA3500864EF5 /* SherpaOnnxLangID.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n\t\tDEBB2D812BBAAA3600864EF5 /* SherpaOnnxLangIDTests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = DEBB2D992BBAAA3600864EF5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxLangIDTests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tDEBB2D7E2BBAAA3600864EF5 /* Sources */,\n\t\t\t\tDEBB2D7F2BBAAA3600864EF5 /* Frameworks */,\n\t\t\t\tDEBB2D802BBAAA3600864EF5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tDEBB2D842BBAAA3600864EF5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxLangIDTests;\n\t\t\tproductName = SherpaOnnxLangIDTests;\n\t\t\tproductReference = DEBB2D822BBAAA3600864EF5 /* SherpaOnnxLangIDTests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.unit-test\";\n\t\t};\n\t\tDEBB2D8B2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = DEBB2D9C2BBAAA3600864EF5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxLangIDUITests\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tDEBB2D882BBAAA3600864EF5 /* Sources */,\n\t\t\t\tDEBB2D892BBAAA3600864EF5 /* Frameworks */,\n\t\t\t\tDEBB2D8A2BBAAA3600864EF5 /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t\tDEBB2D8E2BBAAA3600864EF5 /* PBXTargetDependency */,\n\t\t\t);\n\t\t\tname = SherpaOnnxLangIDUITests;\n\t\t\tproductName = 
SherpaOnnxLangIDUITests;\n\t\t\tproductReference = DEBB2D8C2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.xctest */;\n\t\t\tproductType = \"com.apple.product-type.bundle.ui-testing\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tDEBB2D6A2BBAAA3500864EF5 /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1530;\n\t\t\t\tLastUpgradeCheck = 1530;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tDEBB2D712BBAAA3500864EF5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 15.3;\n\t\t\t\t\t};\n\t\t\t\t\tDEBB2D812BBAAA3600864EF5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 15.3;\n\t\t\t\t\t\tTestTargetID = DEBB2D712BBAAA3500864EF5;\n\t\t\t\t\t};\n\t\t\t\t\tDEBB2D8B2BBAAA3600864EF5 = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 15.3;\n\t\t\t\t\t\tTestTargetID = DEBB2D712BBAAA3500864EF5;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = DEBB2D6D2BBAAA3500864EF5 /* Build configuration list for PBXProject \"SherpaOnnxLangID\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = DEBB2D692BBAAA3500864EF5;\n\t\t\tproductRefGroup = DEBB2D732BBAAA3500864EF5 /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tDEBB2D712BBAAA3500864EF5 /* SherpaOnnxLangID */,\n\t\t\t\tDEBB2D812BBAAA3600864EF5 /* SherpaOnnxLangIDTests */,\n\t\t\t\tDEBB2D8B2BBAAA3600864EF5 /* SherpaOnnxLangIDUITests */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\tDEBB2D702BBAAA3500864EF5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEBB2D7D2BBAAA3600864EF5 /* Preview Assets.xcassets in Resources */,\n\t\t\t\tDEBB2D7A2BBAAA3600864EF5 /* Assets.xcassets in Resources 
*/,\n\t\t\t\tC98126522BFEEDB7000AD7AA /* Info.plist in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D802BBAAA3600864EF5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D8A2BBAAA3600864EF5 /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tDEBB2D6E2BBAAA3500864EF5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEBB2DA52BBAAAFD00864EF5 /* ViewModel.swift in Sources */,\n\t\t\t\tDEBB2DB22BBAAD0000864EF5 /* SherpaOnnx.swift in Sources */,\n\t\t\t\tDEBB2DA12BBAAAD800864EF5 /* Extension.swift in Sources */,\n\t\t\t\tDEBB2D782BBAAA3500864EF5 /* ContentView.swift in Sources */,\n\t\t\t\tDEBB2D762BBAAA3500864EF5 /* SherpaOnnxLangIDApp.swift in Sources */,\n\t\t\t\tDEBB2DA32BBAAAE700864EF5 /* Model.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D7E2BBAAA3600864EF5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEBB2D872BBAAA3600864EF5 /* SherpaOnnxLangIDTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n\t\tDEBB2D882BBAAA3600864EF5 /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEBB2D912BBAAA3600864EF5 /* SherpaOnnxLangIDUITests.swift in Sources */,\n\t\t\t\tDEBB2D932BBAAA3600864EF5 /* SherpaOnnxLangIDUITestsLaunchTests.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin PBXTargetDependency section 
*/\n\t\tDEBB2D842BBAAA3600864EF5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = DEBB2D712BBAAA3500864EF5 /* SherpaOnnxLangID */;\n\t\t\ttargetProxy = DEBB2D832BBAAA3600864EF5 /* PBXContainerItemProxy */;\n\t\t};\n\t\tDEBB2D8E2BBAAA3600864EF5 /* PBXTargetDependency */ = {\n\t\t\tisa = PBXTargetDependency;\n\t\t\ttarget = DEBB2D712BBAAA3500864EF5 /* SherpaOnnxLangID */;\n\t\t\ttargetProxy = DEBB2D8D2BBAAA3600864EF5 /* PBXContainerItemProxy */;\n\t\t};\n/* End PBXTargetDependency section */\n\n/* Begin XCBuildConfiguration section */\n\t\tDEBB2D942BBAAA3600864EF5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = 
YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu17;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLOCALIZATION_PREFERS_STRING_CATALOGS = YES;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = \"DEBUG $(inherited)\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tDEBB2D952BBAAA3600864EF5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = 
YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu17;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLOCALIZATION_PREFERS_STRING_CATALOGS = YES;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tDEBB2D972BBAAA3600864EF5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = 
AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_IDENTITY = \"Apple Development\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxLangID/Preview Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = \"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_KEY_NSMicrophoneUsageDescription = \"Use microphone to record voice\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangID\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tDEBB2D982BBAAA3600864EF5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_IDENTITY = 
\"Apple Development\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxLangID/Preview Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = \"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_KEY_NSMicrophoneUsageDescription = \"Use microphone to record voice\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangID\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tPROVISIONING_PROFILE_SPECIFIER = \"\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tDEBB2D9A2BBAAA3600864EF5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tGENERATE_INFOPLIST_FILE = 
YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 17.4;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangIDTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/SherpaOnnxLangID.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnxLangID\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tDEBB2D9B2BBAAA3600864EF5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tBUNDLE_LOADER = \"$(TEST_HOST)\";\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 17.4;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangIDTests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_HOST = \"$(BUILT_PRODUCTS_DIR)/SherpaOnnxLangID.app/$(BUNDLE_EXECUTABLE_FOLDER_PATH)/SherpaOnnxLangID\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tDEBB2D9D2BBAAA3600864EF5 /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangIDUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnxLangID;\n\t\t\t};\n\t\t\tname = 
Debug;\n\t\t};\n\t\tDEBB2D9E2BBAAA3600864EF5 /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_EMBED_SWIFT_STANDARD_LIBRARIES = YES;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxLangIDUITests\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = NO;\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t\tTEST_TARGET_NAME = SherpaOnnxLangID;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\tDEBB2D6D2BBAAA3500864EF5 /* Build configuration list for PBXProject \"SherpaOnnxLangID\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEBB2D942BBAAA3600864EF5 /* Debug */,\n\t\t\t\tDEBB2D952BBAAA3600864EF5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tDEBB2D962BBAAA3600864EF5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxLangID\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEBB2D972BBAAA3600864EF5 /* Debug */,\n\t\t\t\tDEBB2D982BBAAA3600864EF5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tDEBB2D992BBAAA3600864EF5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxLangIDTests\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEBB2D9A2BBAAA3600864EF5 /* Debug */,\n\t\t\t\tDEBB2D9B2BBAAA3600864EF5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tDEBB2D9C2BBAAA3600864EF5 /* Build configuration list for PBXNativeTarget \"SherpaOnnxLangIDUITests\" */ = 
{\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEBB2D9D2BBAAA3600864EF5 /* Debug */,\n\t\t\t\tDEBB2D9E2BBAAA3600864EF5 /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = DEBB2D6A2BBAAA3500864EF5 /* Project object */;\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangID.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangIDTests/SherpaOnnxLangIDTests.swift",
    "content": "//\n//  SherpaOnnxLangIDTests.swift\n//  SherpaOnnxLangIDTests\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport XCTest\n@testable import SherpaOnnxLangID\n\nfinal class SherpaOnnxLangIDTests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // This is an example of a functional test case.\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n        // Any test you write for XCTest can be annotated as throws and async.\n        // Mark your test throws to produce an unexpected failure when your test encounters an uncaught error.\n        // Mark your test async to allow awaiting for asynchronous code to complete. Check the results with assertions afterwards.\n    }\n\n    func testPerformanceExample() throws {\n        // This is an example of a performance test case.\n        self.measure {\n            // Put the code you want to measure the time of here.\n        }\n    }\n\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangIDUITests/SherpaOnnxLangIDUITests.swift",
    "content": "//\n//  SherpaOnnxLangIDUITests.swift\n//  SherpaOnnxLangIDUITests\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxLangIDUITests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n\n        // In UI tests it is usually best to stop immediately when a failure occurs.\n        continueAfterFailure = false\n\n        // In UI tests it’s important to set the initial state - such as interface orientation - required for your tests before they run. The setUp method is a good place to do this.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // UI tests must launch the application that they test.\n        let app = XCUIApplication()\n        app.launch()\n\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n    }\n\n    func testLaunchPerformance() throws {\n        if #available(macOS 10.15, iOS 13.0, tvOS 13.0, watchOS 7.0, *) {\n            // This measures how long it takes to launch your application.\n            measure(metrics: [XCTApplicationLaunchMetric()]) {\n                XCUIApplication().launch()\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxLangID/SherpaOnnxLangIDUITests/SherpaOnnxLangIDUITestsLaunchTests.swift",
    "content": "//\n//  SherpaOnnxLangIDUITestsLaunchTests.swift\n//  SherpaOnnxLangIDUITests\n//\n//  Created by knight on 2024/4/1.\n//\n\nimport XCTest\n\nfinal class SherpaOnnxLangIDUITestsLaunchTests: XCTestCase {\n\n    override class var runsForEachTargetApplicationUIConfiguration: Bool {\n        true\n    }\n\n    override func setUpWithError() throws {\n        continueAfterFailure = false\n    }\n\n    func testLaunch() throws {\n        let app = XCUIApplication()\n        app.launch()\n\n        // Insert steps here to perform after app launch but before taking a screenshot,\n        // such as logging into a test account or navigating somewhere in the app\n\n        let attachment = XCTAttachment(screenshot: app.screenshot())\n        attachment.name = \"Launch Screen\"\n        attachment.lifetime = .keepAlways\n        add(attachment)\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/.gitignore",
    "content": "tiny.en-tokens.txt\n*.onnx\n*.ort\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"filename\" : \"k2-1024x1024.png\",\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/ContentView.swift",
    "content": "//\n//  ContentView.swift\n//  SherpaOnnxSubtitle\n//\n//  Created by knight on 2023/9/23.\n//\n\nimport AVKit\nimport MediaPlayer\nimport PhotosUI\nimport SwiftUI\n\nstruct ContentView: View {\n    @StateObject var subtitleViewModel = SubtitleViewModel()\n\n    var body: some View {\n        VStack {\n            VStack {\n                Text(\"SherpaOnnxSubtitle\")\n                    .font(.title)\n                VStack(alignment: .leading) {\n                    Text(\"Audio format should be **mono** channel and **16khz** sample rate\")\n\n                    Text(\"You can convert file with the help of ffmpeg\")\n                    Text(\"```ffmpeg -i ./foo.mov -acodec pcm_s16le -ac 1 -ar 16000 foo.wav```\")\n                }\n            }\n            .padding(.vertical)\n            PhotosPicker(\n                selection: $subtitleViewModel.selectedItem,\n                matching: .videos\n            ) {\n                Label(\"Open Audio from Photo Library\", systemImage: \"photo\")\n                    .frame(minWidth: 0, maxWidth: .infinity)\n                    .padding()\n                    .background(.blue, in: .rect(cornerRadius: 8.0))\n                    .foregroundColor(.white)\n            }\n\n            Button(action: {\n                subtitleViewModel.importNow = true\n            }, label: {\n                Text(\"Open Audio from Files\")\n                    .frame(minWidth: 0, maxWidth: .infinity)\n                    .padding()\n                    .background(.blue, in: .rect(cornerRadius: 8.0))\n            })\n            .foregroundColor(.white)\n            switch subtitleViewModel.loadState {\n            case .initial, .loaded(_), .done:\n                EmptyView()\n            case .loading:\n                ProgressView()\n            case .failed:\n                Text(\"Gen SRT failed\")\n            }\n        }\n        .fileImporter(isPresented: $subtitleViewModel.importNow, 
allowedContentTypes: [.movie, .audio], onCompletion: handleImportCompletion)\n        .onChange(of: subtitleViewModel.importNow) { importNow in\n            if !importNow {\n                subtitleViewModel.restoreState()\n            }\n        }\n        .fileExporter(isPresented: $subtitleViewModel.exportNow,\n                      document: subtitleViewModel.srtDocument, contentType: .srt,\n                      defaultFilename: subtitleViewModel.srtName,\n                      onCompletion: handleExportCompletion)\n        .task(id: subtitleViewModel.selectedItem) {\n            do {\n                if !subtitleViewModel.hasAudio {\n                    return\n                }\n                subtitleViewModel.loadState = .loading\n\n                if let movie = try await subtitleViewModel.selectedItem?.loadTransferable(type: Audio.self) {\n                    subtitleViewModel.loadState = .loaded(movie)\n                    subtitleViewModel.generateSRT(from: movie.url)\n                } else {\n                    subtitleViewModel.loadState = .failed\n                }\n            } catch {\n                subtitleViewModel.loadState = .failed\n            }\n        }\n        .padding()\n    }\n\n    private func handleImportCompletion(result: Result<URL, Error>) {\n        print(\"file import...\")\n        switch result {\n        case let .success(file):\n            let accessing = file.startAccessingSecurityScopedResource()\n            defer {\n                if accessing {\n                    file.stopAccessingSecurityScopedResource()\n                }\n            }\n            subtitleViewModel.generateSRT(from: file)\n        case let .failure(error):\n            print(error.localizedDescription)\n            subtitleViewModel.loadState = .failed\n        }\n    }\n\n    private func handleExportCompletion(result: Result<URL, any Error>) {\n        switch result {\n        case let .success(url):\n            print(\"audio export 
to: \\(url)\")\n            subtitleViewModel.loadState = .done\n        case let .failure(error):\n            print(\"export audio error: \\(error.localizedDescription)\")\n            subtitleViewModel.loadState = .failed\n        }\n    }\n}\n\nstruct ContentView_Previews: PreviewProvider {\n    static var previews: some View {\n        ContentView()\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Extensions/UTType.swift",
    "content": "//\n//  UTType.swift\n//  YPlayer\n//\n//  Created by knight on 2023/7/7.\n//\n\nimport UniformTypeIdentifiers\n\nextension UTType {\n    static var srt: UTType {\n        UTType(exportedAs: \"com.k2.srt\")\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n\t<key>UTExportedTypeDeclarations</key>\n\t<array>\n\t\t<dict>\n\t\t\t<key>UTTypeConformsTo</key>\n\t\t\t<array>\n\t\t\t\t<string>public.plain-text</string>\n\t\t\t</array>\n\t\t\t<key>UTTypeDescription</key>\n\t\t\t<string>SubRip Subtitle File</string>\n\t\t\t<key>UTTypeIconFiles</key>\n\t\t\t<array/>\n\t\t\t<key>UTTypeIdentifier</key>\n\t\t\t<string>com.k2.srt</string>\n\t\t\t<key>UTTypeTagSpecification</key>\n\t\t\t<dict>\n\t\t\t\t<key>public.filename-extension</key>\n\t\t\t\t<array>\n\t\t\t\t\t<string>srt</string>\n\t\t\t\t</array>\n\t\t\t</dict>\n\t\t</dict>\n\t</array>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Models/Audio.swift",
    "content": "//\n//  Audio.swift\n//  SherpaOnnxSubtitle\n//\n//  Created by knight on 2023/9/23.\n//\n\nimport SwiftUI\n\n/// A media file picked from the photo library, referenced by its local copy.\nstruct Audio: Transferable {\n    let url: URL\n\n    static var transferRepresentation: some TransferRepresentation {\n        FileRepresentation(contentType: .movie) { movie in\n            SentTransferredFile(movie.url)\n        } importing: { received in\n            // Always copy to the same destination, replacing any previous import.\n            // The file is only copied, not transcoded, despite the .wav name.\n            let copy = URL.documentsDirectory.appending(path: \"audio.wav\")\n\n            if FileManager.default.fileExists(atPath: copy.path()) {\n                try FileManager.default.removeItem(at: copy)\n            }\n\n            try FileManager.default.copyItem(at: received.file, to: copy)\n            return Self(url: copy)\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Models/Document.swift",
    "content": "//\n//  Document.swift\n//  YPlayer\n//\n//  Created by knight on 2023/6/5.\n//\n\nimport SwiftUI\nimport UniformTypeIdentifiers\n\nstruct Document: FileDocument {\n    static var readableContentTypes = [UTType.srt]\n    static var writableContentTypes = [UTType.srt]\n    var data: Data?\n\n    init(data: Data?) {\n        self.data = data\n    }\n\n    init(configuration: ReadConfiguration) throws {\n        if let data = configuration.file.regularFileContents {\n            self.data = data\n        }\n    }\n\n    func fileWrapper(configuration _: WriteConfiguration) throws -> FileWrapper {\n        guard let data = data else {\n            throw ExportError.fileNotFound\n        }\n        return FileWrapper(regularFileWithContents: data)\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Models/Errors.swift",
    "content": "//\n//  Errors.swift\n//  YPlayer\n//\n//  Created by knight on 2023/8/26.\n//\n\nimport Foundation\n\nenum ExportError: String, Error {\n    case fileNotFound = \"export file not found\"\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Models/SpeechSegment.swift",
    "content": "//\n//  SpeechSegment.swift\n//  SherpaOnnxSubtitle\n//\n//  Created by knight on 2023/9/23.\n//\n\nimport Foundation\n\n/// One recognized stretch of speech; `start` and `end` are in seconds.\nclass SpeechSegment: CustomStringConvertible {\n    let start: Float\n    let end: Float\n    let text: String\n\n    init(start: Float, duration: Float, text: String) {\n        self.start = start\n        end = start + duration\n        self.text = text\n    }\n\n    /// The timing line (`start --> end`) followed by the text, i.e. an SRT cue\n    /// without its leading sequence number.\n    public var description: String {\n        var s: String\n        s = TimeInterval(start).hourMinuteSecondMS\n        s += \" --> \"\n        s += TimeInterval(end).hourMinuteSecondMS\n        s += \"\\n\"\n        s += text\n\n        return s\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/Preview Content/Preview Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/SherpaOnnxSubtitleApp.swift",
    "content": "//\n//  SherpaOnnxSubtitleApp.swift\n//  SherpaOnnxSubtitle\n//\n//  Created by knight on 2023/9/23.\n//\n\nimport SwiftUI\n\n@main\nstruct SherpaOnnxSubtitleApp: App {\n    var body: some Scene {\n        WindowGroup {\n            ContentView()\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle/SubtitleViewModel.swift",
    "content": "//\n//  SubtitleViewModel.swift\n//  SherpaOnnxSubtitle\n//\n//  Created by knight on 2023/9/23.\n//\n\nimport AVFoundation\nimport PhotosUI\nimport SwiftUI\n\nenum LoadState {\n    case initial\n    case loading\n    case loaded(Audio)\n    case done\n    case failed\n}\n\nclass SubtitleViewModel: ObservableObject {\n    var modelType = \"whisper\"\n    let sampleRate = 16000\n\n    var modelConfig: SherpaOnnxOfflineModelConfig?\n    // modelType = \"paraformer\"\n\n    var recognizer: SherpaOnnxOfflineRecognizer?\n\n    var vadModelConfig: SherpaOnnxVadModelConfig?\n    var vad: SherpaOnnxVoiceActivityDetectorWrapper?\n\n    @Published var loadState: LoadState = .initial\n\n    @Published var selectedItem: PhotosPickerItem? = nil\n\n    @Published var importNow: Bool = false {\n        didSet {\n            loadState = .loading\n        }\n    }\n\n    @Published var exportNow: Bool = false\n\n    var srtName: String = \"unknown.srt\"\n    var content: String = \"\"\n\n    var srtDocument: Document {\n        let content = content.data(using: .utf8)\n        return Document(data: content)\n    }\n\n    var hasAudio: Bool {\n        return selectedItem != nil\n    }\n\n    init() {\n        if modelType == \"whisper\" {\n            // for English\n            self.modelConfig = getNonStreamingWhisperTinyEn()\n        } else if modelType == \"paraformer\" {\n            // for Chinese\n            self.modelConfig = getNonStreamingZhParaformer20230914()\n        } else {\n            print(\"Please specify a supported modelType \\(modelType)\")\n            return\n        }\n\n        let featConfig = sherpaOnnxFeatureConfig(\n            sampleRate: sampleRate,\n            featureDim: 80\n        )\n\n        guard let modelConfig else {\n            return\n        }\n\n        var config = sherpaOnnxOfflineRecognizerConfig(\n            featConfig: featConfig,\n            modelConfig: modelConfig\n        )\n\n        recognizer = 
SherpaOnnxOfflineRecognizer(config: &config)\n\n        let sileroVadConfig = sherpaOnnxSileroVadModelConfig(\n            model: getResource(\"silero_vad\", \"onnx\")\n        )\n\n        self.vadModelConfig = sherpaOnnxVadModelConfig(sileroVad: sileroVadConfig)\n        guard var vadModelConfig else {\n            return\n        }\n        vad = SherpaOnnxVoiceActivityDetectorWrapper(\n            config: &vadModelConfig, buffer_size_in_seconds: 120\n        )\n    }\n\n    func restoreState() {\n        loadState = .initial\n    }\n\n    func generateSRT(from file: URL) {\n        print(\"gen srt from: \\(file)\")\n        content = \"\"\n\n        // restore state\n        defer {\n            loadState = .done\n        }\n        guard let recognizer else {\n            return\n        }\n        guard let vadModelConfig else {\n            return\n        }\n\n        guard let vad else {\n            return\n        }\n\n        do {\n            let audioFile = try AVAudioFile(forReading: file)\n            let audioFormat = audioFile.processingFormat\n            assert(audioFormat.sampleRate == Double(sampleRate))\n            assert(audioFormat.channelCount == 1)\n            assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n            let audioFrameCount = UInt32(audioFile.length)\n            let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n            try audioFile.read(into: audioFileBuffer!)\n            var array: [Float]! 
= audioFileBuffer?.array()\n\n            let windowSize = Int(vadModelConfig.silero_vad.window_size)\n\n            var segments: [SpeechSegment] = []\n\n            // Feed the VAD one window at a time, advancing an offset instead of\n            // re-copying the remaining samples on every iteration.\n            var offset = 0\n            while array.count - offset > windowSize {\n                vad.acceptWaveform(samples: [Float](array[offset ..< offset + windowSize]))\n                offset += windowSize\n\n                while !vad.isEmpty() {\n                    let s = vad.front()\n                    vad.pop()\n                    let result = recognizer.decode(samples: s.samples)\n\n                    segments.append(\n                        SpeechSegment(\n                            start: Float(s.start) / Float(sampleRate),\n                            duration: Float(s.samples.count) / Float(sampleRate),\n                            text: result.text\n                        ))\n\n                    print(segments.last!)\n                }\n            }\n            content = zip(segments.indices, segments).map { index, element in\n                \"\\(index + 1)\\n\\(element)\"\n            }.joined(separator: \"\\n\\n\")\n\n            // Only offer the export once the SRT content has been built, and set\n            // the default filename before presenting the exporter.\n            srtName = \"\\(file.lastPathComponent).srt\"\n            exportNow = true\n        } catch {\n            print(\"error: \\(error.localizedDescription)\")\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tDE081A8F2ABF287C00E8CD63 /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081A8E2ABF287C00E8CD63 /* SherpaOnnx.swift */; };\n\t\tDE081A922ABF28D400E8CD63 /* SubtitleViewModel.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081A912ABF28D400E8CD63 /* SubtitleViewModel.swift */; };\n\t\tDE081A952ABFC60E00E8CD63 /* Model.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081A942ABFC60E00E8CD63 /* Model.swift */; };\n\t\tDE081AAF2ABFF35400E8CD63 /* UTType.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081AAE2ABFF35400E8CD63 /* UTType.swift */; };\n\t\tDE081AB12ABFFEEE00E8CD63 /* Document.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081AB02ABFFEEE00E8CD63 /* Document.swift */; };\n\t\tDE081AB32ABFFF2600E8CD63 /* Errors.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE081AB22ABFFF2600E8CD63 /* Errors.swift */; };\n\t\tDE8C85A62ABF23E100F667E3 /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = DE8C85A52ABF23E100F667E3 /* onnxruntime.xcframework */; };\n\t\tDE8C85AA2ABF23FA00F667E3 /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = DE8C85A92ABF23FA00F667E3 /* sherpa-onnx.xcframework */; };\n\t\tDE8C85B22ABF257200F667E3 /* SpeechSegment.swift in Sources */ = {isa = PBXBuildFile; fileRef = DE8C85B12ABF257200F667E3 /* SpeechSegment.swift */; };\n\t\tDEA657152ABF19730066A81D /* SherpaOnnxSubtitleApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEA657142ABF19730066A81D /* SherpaOnnxSubtitleApp.swift */; };\n\t\tDEA657172ABF19730066A81D /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEA657162ABF19730066A81D /* ContentView.swift */; };\n\t\tDEA657192ABF19740066A81D /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = DEA657182ABF19740066A81D /* Assets.xcassets */; 
};\n\t\tDEA6571C2ABF19740066A81D /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = DEA6571B2ABF19740066A81D /* Preview Assets.xcassets */; };\n\t\tDEA657232ABF20130066A81D /* Audio.swift in Sources */ = {isa = PBXBuildFile; fileRef = DEA657222ABF20130066A81D /* Audio.swift */; };\n\t\tDED059702AC136FF00122A60 /* Extension.swift in Sources */ = {isa = PBXBuildFile; fileRef = DED0596F2AC136FF00122A60 /* Extension.swift */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXFileReference section */\n\t\tDE081A8E2ABF287C00E8CD63 /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = \"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n\t\tDE081A912ABF28D400E8CD63 /* SubtitleViewModel.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SubtitleViewModel.swift; sourceTree = \"<group>\"; };\n\t\tDE081A942ABFC60E00E8CD63 /* Model.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = Model.swift; path = ../../SherpaOnnx2Pass/SherpaOnnx2Pass/Model.swift; sourceTree = \"<group>\"; };\n\t\tDE081AAC2ABFF30A00E8CD63 /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist; path = Info.plist; sourceTree = \"<group>\"; };\n\t\tDE081AAE2ABFF35400E8CD63 /* UTType.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = UTType.swift; sourceTree = \"<group>\"; };\n\t\tDE081AB02ABFFEEE00E8CD63 /* Document.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Document.swift; sourceTree = \"<group>\"; };\n\t\tDE081AB22ABFFF2600E8CD63 /* Errors.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Errors.swift; sourceTree = \"<group>\"; };\n\t\tDE8C85A52ABF23E100F667E3 /* onnxruntime.xcframework */ = {isa = PBXFileReference; 
lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path = \"../../build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n\t\tDE8C85A92ABF23FA00F667E3 /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = \"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tDE8C85B12ABF257200F667E3 /* SpeechSegment.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SpeechSegment.swift; sourceTree = \"<group>\"; };\n\t\tDEA657112ABF19730066A81D /* SherpaOnnxSubtitle.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnxSubtitle.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tDEA657142ABF19730066A81D /* SherpaOnnxSubtitleApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxSubtitleApp.swift; sourceTree = \"<group>\"; };\n\t\tDEA657162ABF19730066A81D /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ContentView.swift; sourceTree = \"<group>\"; };\n\t\tDEA657182ABF19740066A81D /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tDEA6571B2ABF19740066A81D /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = \"Preview Assets.xcassets\"; sourceTree = \"<group>\"; };\n\t\tDEA657222ABF20130066A81D /* Audio.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = Audio.swift; sourceTree = \"<group>\"; };\n\t\tDED0596F2AC136FF00122A60 /* Extension.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = Extension.swift; path = ../../../SherpaOnnx2Pass/SherpaOnnx2Pass/Extension.swift; sourceTree = \"<group>\"; };\n/* End PBXFileReference section 
*/\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tDEA6570E2ABF19730066A81D /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDE8C85A62ABF23E100F667E3 /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tDE8C85AA2ABF23FA00F667E3 /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tDE081A902ABF28BE00E8CD63 /* Models */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEA657222ABF20130066A81D /* Audio.swift */,\n\t\t\t\tDE8C85B12ABF257200F667E3 /* SpeechSegment.swift */,\n\t\t\t\tDE081AB02ABFFEEE00E8CD63 /* Document.swift */,\n\t\t\t\tDE081AB22ABFFF2600E8CD63 /* Errors.swift */,\n\t\t\t);\n\t\t\tpath = Models;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDE081AAD2ABFF34900E8CD63 /* Extensions */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDED0596F2AC136FF00122A60 /* Extension.swift */,\n\t\t\t\tDE081AAE2ABFF35400E8CD63 /* UTType.swift */,\n\t\t\t);\n\t\t\tpath = Extensions;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDE8C85A42ABF23E100F667E3 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDE8C85A92ABF23FA00F667E3 /* sherpa-onnx.xcframework */,\n\t\t\t\tDE8C85A52ABF23E100F667E3 /* onnxruntime.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEA657082ABF19730066A81D = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEA657132ABF19730066A81D /* SherpaOnnxSubtitle */,\n\t\t\t\tDEA657122ABF19730066A81D /* Products */,\n\t\t\t\tDE8C85A42ABF23E100F667E3 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEA657122ABF19730066A81D /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEA657112ABF19730066A81D /* SherpaOnnxSubtitle.app */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = 
\"<group>\";\n\t\t};\n\t\tDEA657132ABF19730066A81D /* SherpaOnnxSubtitle */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDE081AAC2ABFF30A00E8CD63 /* Info.plist */,\n\t\t\t\tDE081A8E2ABF287C00E8CD63 /* SherpaOnnx.swift */,\n\t\t\t\tDEA657142ABF19730066A81D /* SherpaOnnxSubtitleApp.swift */,\n\t\t\t\tDEA657162ABF19730066A81D /* ContentView.swift */,\n\t\t\t\tDE081A912ABF28D400E8CD63 /* SubtitleViewModel.swift */,\n\t\t\t\tDE081AAD2ABFF34900E8CD63 /* Extensions */,\n\t\t\t\tDE081A942ABFC60E00E8CD63 /* Model.swift */,\n\t\t\t\tDE081A902ABF28BE00E8CD63 /* Models */,\n\t\t\t\tDEA657182ABF19740066A81D /* Assets.xcassets */,\n\t\t\t\tDEA6571A2ABF19740066A81D /* Preview Content */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxSubtitle;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tDEA6571A2ABF19740066A81D /* Preview Content */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tDEA6571B2ABF19740066A81D /* Preview Assets.xcassets */,\n\t\t\t);\n\t\t\tpath = \"Preview Content\";\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tDEA657102ABF19730066A81D /* SherpaOnnxSubtitle */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = DEA6571F2ABF19740066A81D /* Build configuration list for PBXNativeTarget \"SherpaOnnxSubtitle\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tDEA6570D2ABF19730066A81D /* Sources */,\n\t\t\t\tDEA6570E2ABF19730066A81D /* Frameworks */,\n\t\t\t\tDEA6570F2ABF19730066A81D /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = (\n\t\t\t);\n\t\t\tname = SherpaOnnxSubtitle;\n\t\t\tproductName = SherpaOnnxSubtitle;\n\t\t\tproductReference = DEA657112ABF19730066A81D /* SherpaOnnxSubtitle.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tDEA657092ABF19730066A81D /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = 
{\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1500;\n\t\t\t\tLastUpgradeCheck = 1500;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tDEA657102ABF19730066A81D = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 15.0;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = DEA6570C2ABF19730066A81D /* Build configuration list for PBXProject \"SherpaOnnxSubtitle\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = DEA657082ABF19730066A81D;\n\t\t\tproductRefGroup = DEA657122ABF19730066A81D /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tDEA657102ABF19730066A81D /* SherpaOnnxSubtitle */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\tDEA6570F2ABF19730066A81D /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDEA6571C2ABF19740066A81D /* Preview Assets.xcassets in Resources */,\n\t\t\t\tDEA657192ABF19740066A81D /* Assets.xcassets in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tDEA6570D2ABF19730066A81D /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tDE081AAF2ABFF35400E8CD63 /* UTType.swift in Sources */,\n\t\t\t\tDE8C85B22ABF257200F667E3 /* SpeechSegment.swift in Sources */,\n\t\t\t\tDE081A922ABF28D400E8CD63 /* SubtitleViewModel.swift in Sources */,\n\t\t\t\tDE081AB12ABFFEEE00E8CD63 /* Document.swift in Sources */,\n\t\t\t\tDED059702AC136FF00122A60 /* Extension.swift in Sources */,\n\t\t\t\tDEA657172ABF19730066A81D /* ContentView.swift in Sources */,\n\t\t\t\tDEA657152ABF19730066A81D /* SherpaOnnxSubtitleApp.swift in Sources 
*/,\n\t\t\t\tDE081AB32ABFFF2600E8CD63 /* Errors.swift in Sources */,\n\t\t\t\tDEA657232ABF20130066A81D /* Audio.swift in Sources */,\n\t\t\t\tDE081A8F2ABF287C00E8CD63 /* SherpaOnnx.swift in Sources */,\n\t\t\t\tDE081A952ABFC60E00E8CD63 /* Model.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin XCBuildConfiguration section */\n\t\tDEA6571D2ABF19740066A81D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = 
YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu17;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLOCALIZATION_PREFERS_STRING_CATALOGS = YES;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = \"DEBUG $(inherited)\";\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tDEA6571E2ABF19740066A81D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = 
YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_USER_SCRIPT_SANDBOXING = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu17;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLOCALIZATION_PREFERS_STRING_CATALOGS = YES;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tDEA657202ABF19740066A81D /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxSubtitle/Preview 
Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = SherpaOnnxSubtitle/Info.plist;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = net.duoziwei.SherpaOnnxSubtitle;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tDEA657212ABF19740066A81D /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxSubtitle/Preview Content\\\"\";\n\t\t\t\tDEVELOPMENT_TEAM = 896WS4KUPV;\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tINFOPLIST_FILE = 
SherpaOnnxSubtitle/Info.plist;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.0;\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = net.duoziwei.SherpaOnnxSubtitle;\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\tDEA6570C2ABF19730066A81D /* Build configuration list for PBXProject \"SherpaOnnxSubtitle\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEA6571D2ABF19740066A81D /* Debug */,\n\t\t\t\tDEA6571E2ABF19740066A81D /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tDEA6571F2ABF19740066A81D /* Build configuration list for PBXNativeTarget \"SherpaOnnxSubtitle\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tDEA657202ABF19740066A81D /* Debug */,\n\t\t\t\tDEA657212ABF19740066A81D /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 
0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = DEA657092ABF19730066A81D /* Project object */;\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxSubtitle/SherpaOnnxSubtitle.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/Assets.xcassets/AccentColor.colorset/Contents.json",
    "content": "{\n  \"colors\" : [\n    {\n      \"idiom\" : \"universal\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/Assets.xcassets/AppIcon.appiconset/Contents.json",
    "content": "{\n  \"images\" : [\n    {\n      \"idiom\" : \"universal\",\n      \"platform\" : \"ios\",\n      \"size\" : \"1024x1024\"\n    }\n  ],\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/ContentView.swift",
    "content": "//\n//  ContentView.swift\n//  SherpaOnnxTts\n//\n//  Created by fangjun on 2023/11/23.\n//\n// Text-to-speech with Next-gen Kaldi on iOS without Internet connection\n\nimport SwiftUI\nimport AVFoundation\n\nstruct ContentView: View {\n    @State private var sid = \"0\"\n    @State private var speed = 1.0\n    @State private var text = \"\"\n    @State private var showAlert = false\n    @State var filename: URL = NSURL() as URL\n    @State var audioPlayer: AVAudioPlayer!\n\n    private var tts = createOfflineTts()\n\n    var body: some View {\n\n        VStack(alignment: .leading) {\n            HStack {\n                Spacer()\n                Text(\"Next-gen Kaldi: TTS\").font(.title)\n                Spacer()\n            }\n            HStack{\n                Text(\"Speaker ID\")\n                TextField(\"Please input a speaker ID\", text: $sid).textFieldStyle(.roundedBorder)\n                    .keyboardType(.numberPad)\n            }\n            HStack{\n                Text(\"Speed \\(String(format: \"%.1f\", speed))\")\n                    .padding(.trailing)\n                Slider(value: $speed, in: 0.5...2.0, step: 0.1) {\n                    Text(\"Speech speed\")\n                }\n            }\n\n            Text(\"Please input your text below\").padding([.trailing, .top, .bottom])\n\n            TextEditor(text: $text)\n                .font(.body)\n                .opacity(self.text.isEmpty ? 0.25 : 1)\n                .disableAutocorrection(true)\n                .border(Color.black)\n\n            Spacer()\n            HStack {\n                Spacer()\n                Button(action: {\n                    let speakerId = Int(self.sid) ?? 
0\n                    let t = self.text.trimmingCharacters(in: .whitespacesAndNewlines)\n                    if t.isEmpty {\n                        self.showAlert = true\n                        return\n                    }\n\n                    let audio = tts.generate(text: t, sid: speakerId, speed: Float(self.speed))\n                    if self.filename.absoluteString.isEmpty {\n                        let tempDirectoryURL = NSURL.fileURL(withPath: NSTemporaryDirectory(), isDirectory: true)\n                        self.filename = tempDirectoryURL.appendingPathComponent(\"test.wav\")\n                    }\n\n                    let _ = audio.save(filename: filename.path)\n\n                    // Do not crash if the saved file cannot be opened for playback\n                    self.audioPlayer = try? AVAudioPlayer(contentsOf: filename)\n                    self.audioPlayer?.play()\n                }) {\n                    Text(\"Generate\")\n                }.alert(isPresented: $showAlert) {\n                    Alert(title: Text(\"Empty text\"), message: Text(\"Please input your text before clicking the Generate button\"))\n                }\n                Spacer()\n                Button (action: {\n                    self.audioPlayer?.play()\n                }) {\n                    Text(\"Play\")\n                }.disabled(filename.absoluteString.isEmpty)\n                Spacer()\n            }\n            Spacer()\n        }\n        .padding()\n    }\n}\n\nstruct ContentView_Previews: PreviewProvider {\n    static var previews: some View {\n        ContentView()\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/Info.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>NSMicrophoneUsageDescription</key>\n\t<string>Need microphone access for Next-gen Kaldi to work</string>\n</dict>\n</plist>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/Preview Content/Preview Assets.xcassets/Contents.json",
    "content": "{\n  \"info\" : {\n    \"author\" : \"xcode\",\n    \"version\" : 1\n  }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/SherpaOnnxTtsApp.swift",
    "content": "//\n//  SherpaOnnxTtsApp.swift\n//  SherpaOnnxTts\n//\n//  Created by fangjun on 2023/11/23.\n//\n\nimport SwiftUI\n\n@main\nstruct SherpaOnnxTtsApp: App {\n    var body: some Scene {\n        WindowGroup {\n            ContentView()\n        }\n    }\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts/ViewModel.swift",
    "content": "//\n//  ViewModel.swift\n//  SherpaOnnxTts\n//\n//  Created by fangjun on 2023/11/23.\n//\n\nimport Foundation\n\n// used to get the path to espeak-ng-data\nfunc resourceURL(to path: String) -> String {\n  return URL(string: path, relativeTo: Bundle.main.resourceURL)!.path\n}\n\nfunc getResource(_ forResource: String, _ ofType: String) -> String {\n  let path = Bundle.main.path(forResource: forResource, ofType: ofType)\n  precondition(\n    path != nil,\n    \"\\(forResource).\\(ofType) does not exist!\\n\" + \"Remember to change \\n\"\n      + \"  Build Phases -> Copy Bundle Resources\\n\" + \"to add it!\"\n  )\n  return path!\n}\n\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\n/// to download pre-trained models\n\nfunc getTtsForVCTK() -> SherpaOnnxOfflineTtsWrapper {\n  // See the following link\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vctk-english-multi-speaker-109-speakers\n\n  // vits-vctk.onnx\n  let model = getResource(\"vits-vctk\", \"onnx\")\n\n  // lexicon.txt\n  let lexicon = getResource(\"lexicon\", \"txt\")\n\n  // tokens.txt\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  let vits = sherpaOnnxOfflineTtsVitsModelConfig(model: model, lexicon: lexicon, tokens: tokens)\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)\n  var config = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\nfunc getTtsForAishell3() -> SherpaOnnxOfflineTtsWrapper {\n  // See the following link\n  // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-model-aishell3\n\n  let model = getResource(\"model\", \"onnx\")\n\n  // lexicon.txt\n  let lexicon = getResource(\"lexicon\", \"txt\")\n\n  // tokens.txt\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  // rule.fst\n  let ruleFsts = getResource(\"rule\", \"fst\")\n\n  // rule.far\n  let ruleFars = getResource(\"rule\", \"far\")\n\n 
 let vits = sherpaOnnxOfflineTtsVitsModelConfig(model: model, lexicon: lexicon, tokens: tokens)\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)\n  var config = sherpaOnnxOfflineTtsConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts,\n    ruleFars: ruleFars\n  )\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nfunc getTtsFor_en_US_amy_low() -> SherpaOnnxOfflineTtsWrapper {\n  // please see  https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n\n  let model = getResource(\"en_US-amy-low\", \"onnx\")\n\n  // tokens.txt\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  // in this case, we don't need lexicon.txt\n  let dataDir = resourceURL(to: \"espeak-ng-data\")\n\n  let vits = sherpaOnnxOfflineTtsVitsModelConfig(\n    model: model, lexicon: \"\", tokens: tokens, dataDir: dataDir)\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)\n  var config = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-melo-tts-zh-en-chinese-english-1-speaker\nfunc getTtsFor_zh_en_melo_tts() -> SherpaOnnxOfflineTtsWrapper {\n  // please see https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-melo-tts-zh_en.tar.bz2\n\n  let model = getResource(\"model\", \"onnx\")\n\n  let tokens = getResource(\"tokens\", \"txt\")\n  let lexicon = getResource(\"lexicon\", \"txt\")\n\n  let numFst = getResource(\"number\", \"fst\")\n  let dateFst = getResource(\"date\", \"fst\")\n  let phoneFst = getResource(\"phone\", \"fst\")\n  let ruleFsts = \"\\(dateFst),\\(phoneFst),\\(numFst)\"\n\n  let vits = sherpaOnnxOfflineTtsVitsModelConfig(\n    model: model, lexicon: lexicon, tokens: tokens,\n    dataDir: \"\",\n    noiseScale: 0.667,\n    noiseScaleW: 0.8,\n    lengthScale: 1.0\n  )\n\n  let 
modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)\n  var config = sherpaOnnxOfflineTtsConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts\n  )\n\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\nfunc getTtsFor_matcha_icefall_zh_baker() -> SherpaOnnxOfflineTtsWrapper {\n  // please see https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n\n  let acousticModel = getResource(\"model-steps-3\", \"onnx\")\n  let vocoder = getResource(\"vocos-22khz-univ\", \"onnx\")\n\n  let tokens = getResource(\"tokens\", \"txt\")\n  let lexicon = getResource(\"lexicon\", \"txt\")\n\n  let numFst = getResource(\"number\", \"fst\")\n  let dateFst = getResource(\"date\", \"fst\")\n  let phoneFst = getResource(\"phone\", \"fst\")\n  let ruleFsts = \"\\(dateFst),\\(phoneFst),\\(numFst)\"\n\n  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(\n    acousticModel: acousticModel,\n    vocoder: vocoder,\n    lexicon: lexicon,\n    tokens: tokens\n  )\n\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha)\n  var config = sherpaOnnxOfflineTtsConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts\n  )\n\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\nfunc getTtsFor_kokoro_en_v0_19() -> SherpaOnnxOfflineTtsWrapper {\n  // please see https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html#kokoro-en-v0-19-english-11-speakers\n\n  let model = getResource(\"model\", \"onnx\")\n  let voices = getResource(\"voices\", \"bin\")\n\n  // tokens.txt\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  // in this case, we don't need lexicon.txt\n  let dataDir = resourceURL(to: \"espeak-ng-data\")\n\n  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(\n    model: model, voices: voices, tokens: tokens, dataDir: dataDir)\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)\n  var config = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  return 
SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\nfunc getTtsFor_kokoro_multi_lang_v1_0() -> SherpaOnnxOfflineTtsWrapper {\n  // please see https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n\n  let model = getResource(\"model\", \"onnx\")\n  let voices = getResource(\"voices\", \"bin\")\n\n  // tokens.txt\n  let tokens = getResource(\"tokens\", \"txt\")\n\n  // lexicon-us-en.txt and lexicon-zh.txt, joined with a comma\n  let lexicon_en = getResource(\"lexicon-us-en\", \"txt\")\n  let lexicon_zh = getResource(\"lexicon-zh\", \"txt\")\n  let lexicon = \"\\(lexicon_en),\\(lexicon_zh)\"\n\n  // espeak-ng-data directory\n  let dataDir = resourceURL(to: \"espeak-ng-data\")\n\n  let numFst = getResource(\"number-zh\", \"fst\")\n  let dateFst = getResource(\"date-zh\", \"fst\")\n  let phoneFst = getResource(\"phone-zh\", \"fst\")\n  let ruleFsts = \"\\(dateFst),\\(phoneFst),\\(numFst)\"\n\n  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(\n    model: model, voices: voices, tokens: tokens, dataDir: dataDir,\n    lexicon: lexicon)\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro)\n  // Pass the rule FSTs built above; previously they were computed but never used\n  var config = sherpaOnnxOfflineTtsConfig(\n    model: modelConfig,\n    ruleFsts: ruleFsts\n  )\n\n  return SherpaOnnxOfflineTtsWrapper(config: &config)\n}\n\nfunc createOfflineTts() -> SherpaOnnxOfflineTtsWrapper {\n  // Please enable only one of them\n\n  return getTtsFor_kokoro_multi_lang_v1_0()\n\n  // return getTtsFor_kokoro_en_v0_19()\n\n  // return getTtsFor_matcha_icefall_zh_baker()\n\n  // return getTtsForVCTK()\n\n  // return getTtsForAishell3()\n\n  // return getTtsFor_en_US_amy_low()\n\n  // return getTtsFor_zh_en_melo_tts()\n\n  // Add more models as needed by following the examples above\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts.xcodeproj/project.pbxproj",
    "content": "// !$*UTF8*$!\n{\n\tarchiveVersion = 1;\n\tclasses = {\n\t};\n\tobjectVersion = 56;\n\tobjects = {\n\n/* Begin PBXBuildFile section */\n\t\tC917B4E52B0EEF3B005245AC /* SherpaOnnxTtsApp.swift in Sources */ = {isa = PBXBuildFile; fileRef = C917B4E42B0EEF3B005245AC /* SherpaOnnxTtsApp.swift */; };\n\t\tC917B4E72B0EEF3B005245AC /* ContentView.swift in Sources */ = {isa = PBXBuildFile; fileRef = C917B4E62B0EEF3B005245AC /* ContentView.swift */; };\n\t\tC917B4E92B0EEF3C005245AC /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C917B4E82B0EEF3C005245AC /* Assets.xcassets */; };\n\t\tC917B4EC2B0EEF3C005245AC /* Preview Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = C917B4EB2B0EEF3C005245AC /* Preview Assets.xcassets */; };\n\t\tC9FE9FE52B0F33CD009F1003 /* ViewModel.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9FE9FE42B0F33CD009F1003 /* ViewModel.swift */; };\n\t\tC9FE9FE72B0F3620009F1003 /* SherpaOnnx.swift in Sources */ = {isa = PBXBuildFile; fileRef = C9FE9FE62B0F3620009F1003 /* SherpaOnnx.swift */; };\n\t\tC9FE9FEA2B0F3754009F1003 /* sherpa-onnx.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C9FE9FE92B0F3754009F1003 /* sherpa-onnx.xcframework */; };\n\t\tC9FE9FEF2B0F3EFB009F1003 /* onnxruntime.xcframework in Frameworks */ = {isa = PBXBuildFile; fileRef = C9FE9FEB2B0F3785009F1003 /* onnxruntime.xcframework */; };\n/* End PBXBuildFile section */\n\n/* Begin PBXFileReference section */\n\t\tC917B4E12B0EEF3B005245AC /* SherpaOnnxTts.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = SherpaOnnxTts.app; sourceTree = BUILT_PRODUCTS_DIR; };\n\t\tC917B4E42B0EEF3B005245AC /* SherpaOnnxTtsApp.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = SherpaOnnxTtsApp.swift; sourceTree = \"<group>\"; };\n\t\tC917B4E62B0EEF3B005245AC /* ContentView.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = 
ContentView.swift; sourceTree = \"<group>\"; };\n\t\tC917B4E82B0EEF3C005245AC /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = \"<group>\"; };\n\t\tC917B4EB2B0EEF3C005245AC /* Preview Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = \"Preview Assets.xcassets\"; sourceTree = \"<group>\"; };\n\t\tC9FE9FE42B0F33CD009F1003 /* ViewModel.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ViewModel.swift; sourceTree = \"<group>\"; };\n\t\tC9FE9FE62B0F3620009F1003 /* SherpaOnnx.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; name = SherpaOnnx.swift; path = \"../../../swift-api-examples/SherpaOnnx.swift\"; sourceTree = \"<group>\"; };\n\t\tC9FE9FE92B0F3754009F1003 /* sherpa-onnx.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = \"sherpa-onnx.xcframework\"; path = \"../../build-ios/sherpa-onnx.xcframework\"; sourceTree = \"<group>\"; };\n\t\tC9FE9FEB2B0F3785009F1003 /* onnxruntime.xcframework */ = {isa = PBXFileReference; lastKnownFileType = wrapper.xcframework; name = onnxruntime.xcframework; path = \"../../build-ios/ios-onnxruntime/1.17.1/onnxruntime.xcframework\"; sourceTree = \"<group>\"; };\n/* End PBXFileReference section */\n\n/* Begin PBXFrameworksBuildPhase section */\n\t\tC917B4DE2B0EEF3B005245AC /* Frameworks */ = {\n\t\t\tisa = PBXFrameworksBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC9FE9FEF2B0F3EFB009F1003 /* onnxruntime.xcframework in Frameworks */,\n\t\t\t\tC9FE9FEA2B0F3754009F1003 /* sherpa-onnx.xcframework in Frameworks */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXFrameworksBuildPhase section */\n\n/* Begin PBXGroup section */\n\t\tC917B4D82B0EEF3B005245AC = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC917B4E32B0EEF3B005245AC /* SherpaOnnxTts 
*/,\n\t\t\t\tC917B4E22B0EEF3B005245AC /* Products */,\n\t\t\t\tC9FE9FE82B0F3754009F1003 /* Frameworks */,\n\t\t\t);\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC917B4E22B0EEF3B005245AC /* Products */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC917B4E12B0EEF3B005245AC /* SherpaOnnxTts.app */,\n\t\t\t);\n\t\t\tname = Products;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC917B4E32B0EEF3B005245AC /* SherpaOnnxTts */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9FE9FE62B0F3620009F1003 /* SherpaOnnx.swift */,\n\t\t\t\tC9FE9FE42B0F33CD009F1003 /* ViewModel.swift */,\n\t\t\t\tC917B4E42B0EEF3B005245AC /* SherpaOnnxTtsApp.swift */,\n\t\t\t\tC917B4E62B0EEF3B005245AC /* ContentView.swift */,\n\t\t\t\tC917B4E82B0EEF3C005245AC /* Assets.xcassets */,\n\t\t\t\tC917B4EA2B0EEF3C005245AC /* Preview Content */,\n\t\t\t);\n\t\t\tpath = SherpaOnnxTts;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC917B4EA2B0EEF3C005245AC /* Preview Content */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC917B4EB2B0EEF3C005245AC /* Preview Assets.xcassets */,\n\t\t\t);\n\t\t\tpath = \"Preview Content\";\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n\t\tC9FE9FE82B0F3754009F1003 /* Frameworks */ = {\n\t\t\tisa = PBXGroup;\n\t\t\tchildren = (\n\t\t\t\tC9FE9FEB2B0F3785009F1003 /* onnxruntime.xcframework */,\n\t\t\t\tC9FE9FE92B0F3754009F1003 /* sherpa-onnx.xcframework */,\n\t\t\t);\n\t\t\tname = Frameworks;\n\t\t\tsourceTree = \"<group>\";\n\t\t};\n/* End PBXGroup section */\n\n/* Begin PBXNativeTarget section */\n\t\tC917B4E02B0EEF3B005245AC /* SherpaOnnxTts */ = {\n\t\t\tisa = PBXNativeTarget;\n\t\t\tbuildConfigurationList = C917B4EF2B0EEF3C005245AC /* Build configuration list for PBXNativeTarget \"SherpaOnnxTts\" */;\n\t\t\tbuildPhases = (\n\t\t\t\tC917B4DD2B0EEF3B005245AC /* Sources */,\n\t\t\t\tC917B4DE2B0EEF3B005245AC /* Frameworks */,\n\t\t\t\tC917B4DF2B0EEF3B005245AC /* Resources */,\n\t\t\t);\n\t\t\tbuildRules = (\n\t\t\t);\n\t\t\tdependencies = 
(\n\t\t\t);\n\t\t\tname = SherpaOnnxTts;\n\t\t\tproductName = SherpaOnnxTts;\n\t\t\tproductReference = C917B4E12B0EEF3B005245AC /* SherpaOnnxTts.app */;\n\t\t\tproductType = \"com.apple.product-type.application\";\n\t\t};\n/* End PBXNativeTarget section */\n\n/* Begin PBXProject section */\n\t\tC917B4D92B0EEF3B005245AC /* Project object */ = {\n\t\t\tisa = PBXProject;\n\t\t\tattributes = {\n\t\t\t\tBuildIndependentTargetsInParallel = 1;\n\t\t\t\tLastSwiftUpdateCheck = 1420;\n\t\t\t\tLastUpgradeCheck = 1420;\n\t\t\t\tTargetAttributes = {\n\t\t\t\t\tC917B4E02B0EEF3B005245AC = {\n\t\t\t\t\t\tCreatedOnToolsVersion = 14.2;\n\t\t\t\t\t};\n\t\t\t\t};\n\t\t\t};\n\t\t\tbuildConfigurationList = C917B4DC2B0EEF3B005245AC /* Build configuration list for PBXProject \"SherpaOnnxTts\" */;\n\t\t\tcompatibilityVersion = \"Xcode 14.0\";\n\t\t\tdevelopmentRegion = en;\n\t\t\thasScannedForEncodings = 0;\n\t\t\tknownRegions = (\n\t\t\t\ten,\n\t\t\t\tBase,\n\t\t\t);\n\t\t\tmainGroup = C917B4D82B0EEF3B005245AC;\n\t\t\tproductRefGroup = C917B4E22B0EEF3B005245AC /* Products */;\n\t\t\tprojectDirPath = \"\";\n\t\t\tprojectRoot = \"\";\n\t\t\ttargets = (\n\t\t\t\tC917B4E02B0EEF3B005245AC /* SherpaOnnxTts */,\n\t\t\t);\n\t\t};\n/* End PBXProject section */\n\n/* Begin PBXResourcesBuildPhase section */\n\t\tC917B4DF2B0EEF3B005245AC /* Resources */ = {\n\t\t\tisa = PBXResourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC917B4EC2B0EEF3C005245AC /* Preview Assets.xcassets in Resources */,\n\t\t\t\tC917B4E92B0EEF3C005245AC /* Assets.xcassets in Resources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXResourcesBuildPhase section */\n\n/* Begin PBXSourcesBuildPhase section */\n\t\tC917B4DD2B0EEF3B005245AC /* Sources */ = {\n\t\t\tisa = PBXSourcesBuildPhase;\n\t\t\tbuildActionMask = 2147483647;\n\t\t\tfiles = (\n\t\t\t\tC917B4E72B0EEF3B005245AC /* ContentView.swift in Sources */,\n\t\t\t\tC9FE9FE72B0F3620009F1003 /* SherpaOnnx.swift in 
Sources */,\n\t\t\t\tC9FE9FE52B0F33CD009F1003 /* ViewModel.swift in Sources */,\n\t\t\t\tC917B4E52B0EEF3B005245AC /* SherpaOnnxTtsApp.swift in Sources */,\n\t\t\t);\n\t\t\trunOnlyForDeploymentPostprocessing = 0;\n\t\t};\n/* End PBXSourcesBuildPhase section */\n\n/* Begin XCBuildConfiguration section */\n\t\tC917B4ED2B0EEF3C005245AC /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = dwarf;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tENABLE_TESTABILITY = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = 
gnu11;\n\t\t\t\tGCC_DYNAMIC_NO_PIC = NO;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_OPTIMIZATION_LEVEL = 0;\n\t\t\t\tGCC_PREPROCESSOR_DEFINITIONS = (\n\t\t\t\t\t\"DEBUG=1\",\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t);\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tONLY_ACTIVE_ARCH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-Onone\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC917B4EE2B0EEF3C005245AC /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tALWAYS_SEARCH_USER_PATHS = NO;\n\t\t\t\tCLANG_ANALYZER_NONNULL = YES;\n\t\t\t\tCLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;\n\t\t\t\tCLANG_CXX_LANGUAGE_STANDARD = \"gnu++20\";\n\t\t\t\tCLANG_ENABLE_MODULES = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_ARC = YES;\n\t\t\t\tCLANG_ENABLE_OBJC_WEAK = YES;\n\t\t\t\tCLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;\n\t\t\t\tCLANG_WARN_BOOL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_COMMA = YES;\n\t\t\t\tCLANG_WARN_CONSTANT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;\n\t\t\t\tCLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;\n\t\t\t\tCLANG_WARN_DOCUMENTATION_COMMENTS = YES;\n\t\t\t\tCLANG_WARN_EMPTY_BODY = YES;\n\t\t\t\tCLANG_WARN_ENUM_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_INFINITE_RECURSION = YES;\n\t\t\t\tCLANG_WARN_INT_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;\n\t\t\t\tCLANG_WARN_OBJC_LITERAL_CONVERSION = YES;\n\t\t\t\tCLANG_WARN_OBJC_ROOT_CLASS = 
YES_ERROR;\n\t\t\t\tCLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;\n\t\t\t\tCLANG_WARN_RANGE_LOOP_ANALYSIS = YES;\n\t\t\t\tCLANG_WARN_STRICT_PROTOTYPES = YES;\n\t\t\t\tCLANG_WARN_SUSPICIOUS_MOVE = YES;\n\t\t\t\tCLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;\n\t\t\t\tCLANG_WARN_UNREACHABLE_CODE = YES;\n\t\t\t\tCLANG_WARN__DUPLICATE_METHOD_MATCH = YES;\n\t\t\t\tCOPY_PHASE_STRIP = NO;\n\t\t\t\tDEBUG_INFORMATION_FORMAT = \"dwarf-with-dsym\";\n\t\t\t\tENABLE_NS_ASSERTIONS = NO;\n\t\t\t\tENABLE_STRICT_OBJC_MSGSEND = YES;\n\t\t\t\tGCC_C_LANGUAGE_STANDARD = gnu11;\n\t\t\t\tGCC_NO_COMMON_BLOCKS = YES;\n\t\t\t\tGCC_WARN_64_TO_32_BIT_CONVERSION = YES;\n\t\t\t\tGCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;\n\t\t\t\tGCC_WARN_UNDECLARED_SELECTOR = YES;\n\t\t\t\tGCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;\n\t\t\t\tGCC_WARN_UNUSED_FUNCTION = YES;\n\t\t\t\tGCC_WARN_UNUSED_VARIABLE = YES;\n\t\t\t\tIPHONEOS_DEPLOYMENT_TARGET = 16.2;\n\t\t\t\tMTL_ENABLE_DEBUG_INFO = NO;\n\t\t\t\tMTL_FAST_MATH = YES;\n\t\t\t\tSDKROOT = iphoneos;\n\t\t\t\tSWIFT_COMPILATION_MODE = wholemodule;\n\t\t\t\tSWIFT_OPTIMIZATION_LEVEL = \"-O\";\n\t\t\t\tVALIDATE_PRODUCT = YES;\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n\t\tC917B4F02B0EEF3C005245AC /* Debug */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxTts/Preview Content\\\"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tFRAMEWORK_SEARCH_PATHS = \"${PROJECT_DIR}/../../build-ios\";\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tHEADER_SEARCH_PATHS = \"${PROJECT_DIR}/../../build-ios/sherpa-onnx.xcframework/Headers\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = 
YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTts\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Debug;\n\t\t};\n\t\tC917B4F12B0EEF3C005245AC /* Release */ = {\n\t\t\tisa = XCBuildConfiguration;\n\t\t\tbuildSettings = {\n\t\t\t\tASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;\n\t\t\t\tASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;\n\t\t\t\tCODE_SIGN_STYLE = Automatic;\n\t\t\t\tCURRENT_PROJECT_VERSION = 1;\n\t\t\t\tDEVELOPMENT_ASSET_PATHS = \"\\\"SherpaOnnxTts/Preview Content\\\"\";\n\t\t\t\tENABLE_PREVIEWS = YES;\n\t\t\t\tFRAMEWORK_SEARCH_PATHS = \"${PROJECT_DIR}/../../build-ios\";\n\t\t\t\tGENERATE_INFOPLIST_FILE = YES;\n\t\t\t\tHEADER_SEARCH_PATHS = \"${PROJECT_DIR}/../../build-ios/sherpa-onnx.xcframework/Headers\";\n\t\t\t\tINFOPLIST_KEY_UIApplicationSceneManifest_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UIApplicationSupportsIndirectInputEvents = YES;\n\t\t\t\tINFOPLIST_KEY_UILaunchScreen_Generation = YES;\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPad = \"UIInterfaceOrientationPortrait UIInterfaceOrientationPortraitUpsideDown 
UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tINFOPLIST_KEY_UISupportedInterfaceOrientations_iPhone = \"UIInterfaceOrientationPortrait UIInterfaceOrientationLandscapeLeft UIInterfaceOrientationLandscapeRight\";\n\t\t\t\tLD_RUNPATH_SEARCH_PATHS = (\n\t\t\t\t\t\"$(inherited)\",\n\t\t\t\t\t\"@executable_path/Frameworks\",\n\t\t\t\t);\n\t\t\t\tMARKETING_VERSION = 1.0;\n\t\t\t\tOTHER_LDFLAGS = \"-lc++\";\n\t\t\t\tPRODUCT_BUNDLE_IDENTIFIER = \"com.k2-fsa.org.SherpaOnnxTts\";\n\t\t\t\tPRODUCT_NAME = \"$(TARGET_NAME)\";\n\t\t\t\tSWIFT_EMIT_LOC_STRINGS = YES;\n\t\t\t\tSWIFT_OBJC_BRIDGING_HEADER = \"${PROJECT_DIR}/../../swift-api-examples/SherpaOnnx-Bridging-Header.h\";\n\t\t\t\tSWIFT_VERSION = 5.0;\n\t\t\t\tTARGETED_DEVICE_FAMILY = \"1,2\";\n\t\t\t};\n\t\t\tname = Release;\n\t\t};\n/* End XCBuildConfiguration section */\n\n/* Begin XCConfigurationList section */\n\t\tC917B4DC2B0EEF3B005245AC /* Build configuration list for PBXProject \"SherpaOnnxTts\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC917B4ED2B0EEF3C005245AC /* Debug */,\n\t\t\t\tC917B4EE2B0EEF3C005245AC /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n\t\tC917B4EF2B0EEF3C005245AC /* Build configuration list for PBXNativeTarget \"SherpaOnnxTts\" */ = {\n\t\t\tisa = XCConfigurationList;\n\t\t\tbuildConfigurations = (\n\t\t\t\tC917B4F02B0EEF3C005245AC /* Debug */,\n\t\t\t\tC917B4F12B0EEF3C005245AC /* Release */,\n\t\t\t);\n\t\t\tdefaultConfigurationIsVisible = 0;\n\t\t\tdefaultConfigurationName = Release;\n\t\t};\n/* End XCConfigurationList section */\n\t};\n\trootObject = C917B4D92B0EEF3B005245AC /* Project object */;\n}\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts.xcodeproj/project.xcworkspace/contents.xcworkspacedata",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Workspace\n   version = \"1.0\">\n   <FileRef\n      location = \"self:\">\n   </FileRef>\n</Workspace>\n"
  },
  {
    "path": "ios-swiftui/SherpaOnnxTts/SherpaOnnxTts.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n<plist version=\"1.0\">\n<dict>\n\t<key>IDEDidComputeMac32BitWarning</key>\n\t<true/>\n</dict>\n</plist>\n"
  },
  {
    "path": "java-api-examples/.gitignore",
    "content": "lib\nhs_err*\n!run-*.sh\n./hotwords_cn.txt\n*.class\n"
  },
  {
    "path": "java-api-examples/AudioTaggingCEDFromFile.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a CED audio tagging model to tag\n// input audio files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class AudioTaggingCEDFromFile {\n  public static void main(String[] args) {\n    // please download the model from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    String model = \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx\";\n    String labels = \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv\";\n    int topK = 5;\n\n    String[] testWaves =\n        new String[] {\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/1.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/2.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/3.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/4.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/5.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/6.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/7.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/8.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/9.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/10.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/11.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/12.wav\",\n          \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/13.wav\",\n        };\n\n    AudioTaggingModelConfig modelConfig =\n        AudioTaggingModelConfig.builder().setCED(model).setNumThreads(1).setDebug(true).build();\n\n    AudioTaggingConfig config =\n        AudioTaggingConfig.builder().setModel(modelConfig).setLabels(labels).setTopK(topK).build();\n\n    AudioTagging 
tagger = new AudioTagging(config);\n    System.out.println(\"------\");\n    for (String filename : testWaves) {\n      WaveReader reader = new WaveReader(filename);\n\n      OfflineStream stream = tagger.createStream();\n      stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n      AudioEvent[] events = tagger.compute(stream);\n\n      stream.release();\n\n      System.out.printf(\"input file: %s\\n\", filename);\n      System.out.printf(\"Probability\\t\\tName\\n\");\n      for (AudioEvent e : events) {\n        System.out.printf(\"%.3f\\t\\t\\t%s\\n\", e.getProb(), e.getName());\n      }\n      System.out.println(\"------\");\n    }\n\n    tagger.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/AudioTaggingZipformerFromFile.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a zipformer audio tagging model to tag\n// input audio files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class AudioTaggingZipformerFromFile {\n  public static void main(String[] args) {\n    // please download the model from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    String model = \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx\";\n    String labels =\n        \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/class_labels_indices.csv\";\n    int topK = 5;\n\n    String[] testWaves =\n        new String[] {\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/1.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/2.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/3.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/4.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/5.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/6.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/7.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/8.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/9.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/10.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/11.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/12.wav\",\n          \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/13.wav\",\n        };\n\n    OfflineZipformerAudioTaggingModelConfig zipformer =\n        OfflineZipformerAudioTaggingModelConfig.builder().setModel(model).build();\n\n    AudioTaggingModelConfig 
modelConfig =\n        AudioTaggingModelConfig.builder()\n            .setZipformer(zipformer)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    AudioTaggingConfig config =\n        AudioTaggingConfig.builder().setModel(modelConfig).setLabels(labels).setTopK(topK).build();\n\n    AudioTagging tagger = new AudioTagging(config);\n    System.out.println(\"------\");\n    for (String filename : testWaves) {\n      WaveReader reader = new WaveReader(filename);\n\n      OfflineStream stream = tagger.createStream();\n      stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n      AudioEvent[] events = tagger.compute(stream);\n\n      stream.release();\n\n      System.out.printf(\"input file: %s\\n\", filename);\n      System.out.printf(\"Probability\\t\\tName\\n\");\n      for (AudioEvent e : events) {\n        System.out.printf(\"%.3f\\t\\t\\t%s\\n\", e.getProb(), e.getName());\n      }\n      System.out.println(\"------\");\n    }\n\n    tagger.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/InverseTextNormalizationNonStreamingParaformer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline paraformer, i.e., non-streaming paraformer,\n// to decode files with inverse text normalization.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class InverseTextNormalizationNonStreamingParaformer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese-english\n    // to download model files\n    String model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n    String waveFilename = \"./itn-zh-number.wav\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    String ruleFsts = \"./itn_zh_number.fst\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineParaformerModelConfig paraformer =\n        OfflineParaformerModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setParaformer(paraformer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .setRuleFsts(ruleFsts)\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    
System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/InverseTextNormalizationStreamingTransducer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a streaming transducer\n// to decode files with inverse text normalization.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class InverseTextNormalizationStreamingTransducer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n    // to download model files\n    String encoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\";\n    String decoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\";\n    String joiner =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n    String waveFilename = \"./itn-zh-number.wav\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    String ruleFsts = \"./itn_zh_number.fst\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineTransducerModelConfig transducer =\n        OnlineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            
.setDecodingMethod(\"greedy_search\")\n            .setRuleFsts(ruleFsts)\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.8 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/KeywordSpotterFromFile.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a keyword spotter model to spot keywords from\n// a file.\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class KyewordSpotterFromFile {\n  public static void main(String[] args) {\n    // please download test files from https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\n    String encoder =\n        \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n    String decoder =\n        \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n    String joiner =\n        \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\";\n    String tokens = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\";\n\n    String keywordsFile =\n        \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\";\n\n    String waveFilename = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\";\n\n    OnlineTransducerModelConfig transducer =\n        OnlineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    KeywordSpotterConfig config =\n        KeywordSpotterConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setKeywordsFile(keywordsFile)\n            .build();\n\n    KeywordSpotter kws = new KeywordSpotter(config);\n    OnlineStream stream = kws.createStream();\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] 
tailPaddings = new float[(int) (0.8 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n    while (kws.isReady(stream)) {\n      kws.decode(stream);\n\n      String keyword = kws.getResult(stream).getKeyword();\n      if (!keyword.isEmpty()) {\n        // Remember to reset the stream right after detecting a keyword\n        kws.reset(stream);\n        System.out.printf(\"Detected keyword: %s\\n\", keyword);\n      }\n    }\n\n    stream.release();\n    kws.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileDolphinCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline Dolphin CTC model, i.e.,\n// non-streaming Dolphin CTC model, to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileDolphinCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    // to download model files\n    String model = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\";\n\n    String waveFilename =\n        \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineDolphinModelConfig dolphin = OfflineDolphinModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setDolphin(dolphin)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileFireRedAsr.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline FireRedAsr AED model\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileFireRedAsr {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/FireRedAsr/index.html\n    // to download model files\n    String encoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineFireRedAsrModelConfig fireRedAsr =\n        OfflineFireRedAsrModelConfig.builder().setEncoder(encoder).setDecoder(decoder).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setFireRedAsr(fireRedAsr)\n            .setTokens(tokens)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileFireRedAsrCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline FireRedASR CTC model,\n// i.e., non-streaming FireRedASR CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileFireRedAsrCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\";\n\n    String tokens = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineFireRedAsrCtcModelConfig medasr =\n        OfflineFireRedAsrCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setFireRedAsrCtc(medasr)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileFunAsrNano.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\n// This file shows how to use an offline FunASR Nano model,\n// i.e., non-streaming FunASR Nano model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileFunAsrNano {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/funasr-nano/index.html\n    // to download model files\n    String encoderAdaptor = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\";\n    String llm = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\";\n    String embedding = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\";\n    String tokenizer = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\";\n\n    String tokens = \"\";\n\n    String waveFilename = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineFunAsrNanoModelConfig funasrNano =\n        OfflineFunAsrNanoModelConfig.builder()\n            .setEncoderAdaptor(encoderAdaptor)\n            .setLLM(llm)\n            .setEmbedding(embedding)\n            .setTokenizer(tokenizer)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setFunAsrNano(funasrNano)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    
System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileMedAsrCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline Google MedASR CTC model,\n// i.e., non-streaming MedASR CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileMedAsrCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/medasr/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\";\n\n    String tokens = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineMedAsrCtcModelConfig medasr =\n        OfflineMedAsrCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setMedAsr(medasr)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileMoonshine.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline Moonshine,\n// i.e., non-streaming Moonshine model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileMoonshine {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\n    // to download model files\n\n    String preprocessor = \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\";\n    String encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\";\n    String uncachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\";\n    String cachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\";\n\n    String tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineMoonshineModelConfig moonshine =\n        OfflineMoonshineModelConfig.builder()\n            .setPreprocessor(preprocessor)\n            .setEncoder(encoder)\n            .setUncachedDecoder(uncachedDecoder)\n            .setCachedDecoder(cachedDecoder)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setMoonshine(moonshine)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = 
recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileMoonshineV2.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline Moonshine,\n// i.e., non-streaming Moonshine v2 model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileMoonshineV2 {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\n    // to download model files\n\n    String encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\";\n    String decoder =\n        \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\";\n    String tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineMoonshineModelConfig moonshine =\n        OfflineMoonshineModelConfig.builder().setEncoder(encoder).setMergedDecoder(decoder).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setMoonshine(moonshine)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileNemo.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline NeMo CTC model, i.e., non-streaming NeMo CTC model,,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileNemo {\n  public static void main(String[] args) {\n    // please refer to\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\n    // to download model files\n    String model = \"./sherpa-onnx-nemo-ctc-en-citrinet-512/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-nemo-ctc-en-citrinet-512/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/1.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineNemoEncDecCtcModelConfig nemo =\n        OfflineNemoEncDecCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setNemo(nemo)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileNemoCanary.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline NeMo Canary model, i.e.,\n// non-streaming NeMo Canary model, to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileNemoCanary {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html\n    // to download model files\n    String encoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineCanaryModelConfig canary =\n        OfflineCanaryModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setSrcLang(\"en\")\n            .setTgtLang(\"en\")\n            .setUsePnc(true)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setCanary(canary)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult(English):%s\\n\", waveFilename, text);\n\n    stream.release();\n    
recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileOmnilingualAsrCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline Omnilingual ASR CTC model,\n// i.e., non-streaming Omnilingual ASR CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileOmnilingualAsrCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/omnilingual-asr/index.html\n    // to download model files\n    String model =\n        \"sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\";\n\n    String tokens =\n        \"sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\";\n\n    String waveFilename =\n        \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineOmnilingualAsrCtcModelConfig omnilingual =\n        OfflineOmnilingualAsrCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setOmnilingual(omnilingual)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileParaformer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline paraformer, i.e., non-streaming paraformer,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileParaformer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese-english\n    // to download model files\n    String model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/3-sichuan.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineParaformerModelConfig paraformer =\n        OfflineParaformerModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setParaformer(paraformer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileSenseVoice.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline SenseVoice model,\n// i.e., non-streaming SenseVoice model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileSenseVoice {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineSenseVoiceModelConfig senseVoice =\n        OfflineSenseVoiceModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setSenseVoice(senseVoice)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileSenseVoiceWithHr.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline SenseVoice model,\n// i.e., non-streaming SenseVoice model\n// to decode files with homophone replacer.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileSenseVoiceWithHr {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n    String waveFilename = \"./test-hr.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineSenseVoiceModelConfig senseVoice =\n        OfflineSenseVoiceModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setSenseVoice(senseVoice)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    HomophoneReplacerConfig hr =\n        HomophoneReplacerConfig.builder()\n            .setDictDir(\"./dict\")\n            .setLexicon(\"./lexicon.txt\")\n            .setRuleFsts(\"./replace.fst\")\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .setHr(hr)\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileTeleSpeechCtc.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline TeleSpeech CTC model\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileTeleSpeechCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/models.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\n    // to download model files\n    String model = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setTeleSpeech(model)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setModelType(\"telespeech_ctc\")\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileTransducer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline transducer, i.e., non-streaming transducer,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileTransducer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-gigaspeech-2023-12-12-english\n    // to download model files\n    String encoder =\n        \"./sherpa-onnx-zipformer-gigaspeech-2023-12-12/encoder-epoch-30-avg-1.int8.onnx\";\n    String decoder = \"./sherpa-onnx-zipformer-gigaspeech-2023-12-12/decoder-epoch-30-avg-1.onnx\";\n    String joiner = \"./sherpa-onnx-zipformer-gigaspeech-2023-12-12/joiner-epoch-30-avg-1.onnx\";\n    String tokens = \"./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt\";\n\n    String waveFilename =\n        \"./sherpa-onnx-zipformer-gigaspeech-2023-12-12/test_wavs/1089-134686-0001.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineTransducerModelConfig transducer =\n        OfflineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = 
recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileTransducerHotwords.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline transducer, i.e., non-streaming transducer,\n// to decode files with hotwords support.\n//\n// See also\n// https://k2-fsa.github.io/sherpa/onnx/hotwords/index.html#modeling-unit-is-cjkchar\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileTransducerHotwords {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/hotwords/index.html#modeling-unit-is-cjkchar\n    // to download model files\n    String encoder =\n        \"./sherpa-onnx-conformer-zh-stateless2-2023-05-23/encoder-epoch-99-avg-1.int8.onnx\";\n    String decoder = \"./sherpa-onnx-conformer-zh-stateless2-2023-05-23/decoder-epoch-99-avg-1.onnx\";\n    String joiner = \"./sherpa-onnx-conformer-zh-stateless2-2023-05-23/joiner-epoch-99-avg-1.onnx\";\n    String tokens = \"./sherpa-onnx-conformer-zh-stateless2-2023-05-23/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/6.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineTransducerModelConfig transducer =\n        OfflineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setModelingUnit(\"cjkchar\")\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"modified_beam_search\")\n            .setHotwordsFile(\"./hotwords_cn.txt\")\n            .setHotwordsScore(2.0f)\n            .build();\n\n    OfflineRecognizer recognizer = new 
OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileWenetCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline Wenet CTC model,\n// i.e., non-streaming Wenet CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileWenetCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html\n    // to download model files\n    String model =\n        \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx\";\n\n    String tokens =\n        \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt\";\n\n    String waveFilename =\n        \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineWenetCtcModelConfig wenetCtc =\n        OfflineWenetCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setWenetCtc(wenetCtc)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileWhisper.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an offline whisper, i.e., non-streaming whisper,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileWhisper {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n    // to download model files\n    String encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineWhisperModelConfig whisper =\n        OfflineWhisperModelConfig.builder().setEncoder(encoder).setDecoder(decoder).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setWhisper(whisper)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileWhisperMultiple.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline whisper, i.e., non-streaming whisper,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileWhisperMultiple {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n    // to download model files\n    String encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\";\n\n    String waveFilename0 = \"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\";\n    String waveFilename1 = \"./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav\";\n\n    WaveReader reader0 = new WaveReader(waveFilename0);\n    WaveReader reader1 = new WaveReader(waveFilename1);\n\n    OfflineWhisperModelConfig whisper =\n        OfflineWhisperModelConfig.builder().setEncoder(encoder).setDecoder(decoder).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setWhisper(whisper)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream0 = recognizer.createStream();\n    stream0.acceptWaveform(reader0.getSamples(), reader0.getSampleRate());\n\n    OfflineStream stream1 = recognizer.createStream();\n    stream1.acceptWaveform(reader1.getSamples(), reader1.getSampleRate());\n\n    OfflineStream[] ss = new OfflineStream[] {stream0, stream1};\n    recognizer.decode(ss);\n\n    String text0 = 
recognizer.getResult(stream0).getText();\n    String text1 = recognizer.getResult(stream1).getText();\n\n    System.out.printf(\"filename0:%s\\nresult0:%s\\n\\n\", waveFilename0, text0);\n    System.out.printf(\"filename1:%s\\nresult1:%s\\n\\n\", waveFilename1, text1);\n\n    stream0.release();\n    stream1.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingDecodeFileZipformerCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use an offline Zipformer CTC model,\n// i.e., non-streaming Zipformer CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingDecodeFileZipformerCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n    // to download model files\n    String model = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt\";\n\n    String waveFilename = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineZipformerCtcModelConfig zipformerCtc =\n        OfflineZipformerCtcModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setZipformerCtc(zipformerCtc)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OfflineRecognizer recognizer = new OfflineRecognizer(config);\n    OfflineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    recognizer.decode(stream);\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingSpeechEnhancementDpdfNet.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use DPDFNet speech enhancement models in sherpa-onnx\n//\n// Download DPDFNet models from either:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// https://huggingface.co/Ceva-IP/DPDFNet\n//\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingSpeechEnhancementDpdfNet {\n  public static void main(String[] args) {\n    String model = \"./dpdfnet_baseline.onnx\";\n    OfflineSpeechDenoiserModelConfig.Builder builder =\n        OfflineSpeechDenoiserModelConfig.builder()\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .setDpdfnet(\n                OfflineSpeechDenoiserDpdfNetModelConfig.builder().setModel(model).build());\n\n    OfflineSpeechDenoiserModelConfig modelConfig = builder.build();\n    OfflineSpeechDenoiserConfig config =\n        OfflineSpeechDenoiserConfig.builder().setModel(modelConfig).build();\n\n    OfflineSpeechDenoiser speech_denoiser = new OfflineSpeechDenoiser(config);\n\n    String testWaveFilename = \"./inp_16k.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    DenoisedAudio denoised = speech_denoiser.run(reader.getSamples(), reader.getSampleRate());\n    String outFilename = \"enhanced.wav\";\n    WaveWriter.write(outFilename, denoised.getSamples(), denoised.getSampleRate());\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    speech_denoiser.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingSpeechEnhancementGtcrn.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use speech enhancement models in sherpa-onnx\n//\n// Download GTCRN models and sample test waves from:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingSpeechEnhancementGtcrn {\n  public static void main(String[] args) {\n    String model = \"./gtcrn_simple.onnx\";\n    OfflineSpeechDenoiserModelConfig.Builder builder =\n        OfflineSpeechDenoiserModelConfig.builder()\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\");\n\n    builder.setGtcrn(OfflineSpeechDenoiserGtcrnModelConfig.builder().setModel(model).build());\n\n    OfflineSpeechDenoiserModelConfig modelConfig = builder.build();\n    OfflineSpeechDenoiserConfig config =\n        OfflineSpeechDenoiserConfig.builder().setModel(modelConfig).build();\n\n    OfflineSpeechDenoiser speechDenoiser = new OfflineSpeechDenoiser(config);\n\n    String testWaveFilename = \"./inp_16k.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    DenoisedAudio denoised = speechDenoiser.run(reader.getSamples(), reader.getSampleRate());\n    String outFilename = \"enhanced.wav\";\n    WaveWriter.write(outFilename, denoised.getSamples(), denoised.getSampleRate());\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    speechDenoiser.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsCoquiDe.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a Coqui-ai VITS German TTS model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsCoquiDe {\n  public static void main(String[] args) {\n    // please visit\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n    // to download model files\n    String model = \"./vits-coqui-de-css10/model.onnx\";\n    String tokens = \"./vits-coqui-de-css10/tokens.txt\";\n    String text = \"Alles hat ein Ende, nur die Wurst hat zwei.\";\n\n    OfflineTtsVitsModelConfig vitsModelConfig =\n        OfflineTtsVitsModelConfig.builder().setModel(model).setTokens(tokens).build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setVits(vitsModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 0;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(config.getSilenceScale());\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(text, genConfig, (float[] samples) -> 1);\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-coqui-de.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", 
realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsKittenEn.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a KittenTTS English model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsKittenEn {\n  public static void main(String[] args) {\n    LibraryUtils.enableDebug();\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\n    // to download model files\n    String model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\";\n    String voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\";\n    String tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\";\n    String dataDir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsKittenModelConfig kittenModelConfig =\n        OfflineTtsKittenModelConfig.builder()\n            .setModel(model)\n            .setVoices(voices)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setKitten(kittenModelConfig)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 7;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(0.2f);\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generateWithConfigAndCallback(text, genConfig, samples -> {});\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop 
- start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-kitten-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsKokoroEn.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a Kokoro English model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsKokoroEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n    // to download model files\n    String model = \"./kokoro-en-v0_19/model.onnx\";\n    String voices = \"./kokoro-en-v0_19/voices.bin\";\n    String tokens = \"./kokoro-en-v0_19/tokens.txt\";\n    String dataDir = \"./kokoro-en-v0_19/espeak-ng-data\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsKokoroModelConfig kokoroModelConfig =\n        OfflineTtsKokoroModelConfig.builder()\n            .setModel(model)\n            .setVoices(voices)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setKokoro(kokoroModelConfig)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 0;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(0.2f);\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generateWithConfigAndCallback(text, genConfig, samples -> {});\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / 
(float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-kokoro-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsKokoroZhEn.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a Kokoro multi-lingual model\n// to convert Chinese and English text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsKokoroZhEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n    // to download model files\n    String model = \"./kokoro-multi-lang-v1_0/model.onnx\";\n    String voices = \"./kokoro-multi-lang-v1_0/voices.bin\";\n    String tokens = \"./kokoro-multi-lang-v1_0/tokens.txt\";\n    String dataDir = \"./kokoro-multi-lang-v1_0/espeak-ng-data\";\n    String lexicon =\n        \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt\";\n    String text =\n        \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki.\"\n            + \" 你觉得中英文说的如何呢？\";\n\n    OfflineTtsKokoroModelConfig kokoroModelConfig =\n        OfflineTtsKokoroModelConfig.builder()\n            .setModel(model)\n            .setVoices(voices)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .setLexicon(lexicon)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setKokoro(kokoroModelConfig)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 0; // this model has 53 speakers. 
You can use sid in the range 0-52\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(0.2f);\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generateWithConfigAndCallback(text, genConfig, samples -> {});\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-kokoro-zh-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsMatchaEn.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a matcha English model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsMatchaEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n    // to download model files\n    String acousticModel = \"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\";\n    String vocoder = \"./vocos-22khz-univ.onnx\";\n    String tokens = \"./matcha-icefall-en_US-ljspeech/tokens.txt\";\n    String dataDir = \"./matcha-icefall-en_US-ljspeech/espeak-ng-data\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsMatchaModelConfig matchaModelConfig =\n        OfflineTtsMatchaModelConfig.builder()\n            .setAcousticModel(acousticModel)\n            .setVocoder(vocoder)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setMatcha(matchaModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(0);\n    genConfig.setSpeed(1.0f);\n    genConfig.setSilenceScale(config.getSilenceScale());\n\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generateWithConfigAndCallback(text, genConfig, (float[] samples) -> 1);\n    long stop = 
System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-matcha-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- speaker ID: %d\\n\", genConfig.getSid());\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsMatchaZh.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a matcha Chinese TTS model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsMatchaZh {\n  public static void main(String[] args) {\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n    // to download model files\n    String acousticModel = \"./matcha-icefall-zh-baker/model-steps-3.onnx\";\n    String vocoder = \"./vocos-22khz-univ.onnx\";\n    String tokens = \"./matcha-icefall-zh-baker/tokens.txt\";\n    String lexicon = \"./matcha-icefall-zh-baker/lexicon.txt\";\n    String ruleFsts =\n        \"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\";\n    String text =\n        \"某某银行的副行长和一些行政领导表示，他们去过长江\"\n            + \"和长白山; 经济不断增长。\"\n            + \"2024年12月31号，拨打110或者18920240511。\"\n            + \"123456块钱。\";\n\n    OfflineTtsMatchaModelConfig matchaModelConfig =\n        OfflineTtsMatchaModelConfig.builder()\n            .setAcousticModel(acousticModel)\n            .setVocoder(vocoder)\n            .setTokens(tokens)\n            .setLexicon(lexicon)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setMatcha(matchaModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config =\n        OfflineTtsConfig.builder().setModel(modelConfig).setRuleFsts(ruleFsts).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(0);\n    genConfig.setSpeed(1.0f);\n    genConfig.setSilenceScale(config.getSilenceScale());\n\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generateWithConfigAndCallback(text, genConfig, (float[] samples) -> 1);\n    
long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-matcha-zh.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- speaker ID: %d\\n\", genConfig.getSid());\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsPiperEn.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a piper VITS English TTS model\n// to convert text to speech\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsPiperEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n    // to download model files\n    String model = \"./vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx\";\n    String tokens = \"./vits-piper-en_GB-cori-medium/tokens.txt\";\n    String dataDir = \"./vits-piper-en_GB-cori-medium/espeak-ng-data\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsVitsModelConfig vitsModelConfig =\n        OfflineTtsVitsModelConfig.builder()\n            .setModel(model)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setVits(vitsModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 0;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(config.getSilenceScale());\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(text, genConfig, (float[] samples) -> 1);\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) 
audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-piper-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsPiperEnWithCallback.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n//\n// References\n// https://www.baeldung.com/java-passing-method-parameter\n// https://www.geeksforgeeks.org/how-to-create-a-thread-safe-queue-in-java/\n// https://stackoverflow.com/questions/74077394/java-audio-how-to-continuously-write-bytes-to-an-audio-file-as-they-are-being-g\n\n// This file shows how to use a piper VITS English TTS model\n// to convert text to speech. You can pass a callback to the generation call,\n// which is invoked whenever max_num_sentences sentences have been\n// finished generation.\n//\n// The callback saves the generated samples into a queue, which are played\n// by a separate thread.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.Queue;\nimport java.util.concurrent.*;\nimport java.util.concurrent.ConcurrentLinkedQueue;\nimport javax.sound.sampled.*;\n\npublic class NonStreamingTtsPiperEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n    // to download model files\n    String model = \"./vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx\";\n    String tokens = \"./vits-piper-en_GB-cori-medium/tokens.txt\";\n    String dataDir = \"./vits-piper-en_GB-cori-medium/espeak-ng-data\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsVitsModelConfig vitsModelConfig =\n        OfflineTtsVitsModelConfig.builder()\n            .setModel(model)\n            .setTokens(tokens)\n            .setDataDir(dataDir)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setVits(vitsModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    Queue<byte[]> samplesQueue = new ConcurrentLinkedQueue<>();\n\n    Semaphore canPlaySem = new Semaphore(1);\n    try {\n      canPlaySem.acquire();\n    } catch (InterruptedException ex) {\n      System.out.println(\"Failed to acquire the play semaphore in the main thread\");\n      return;\n    }\n\n    Runnable playRuannable =\n        () -> {\n          try {\n            canPlaySem.acquire();\n          } catch (InterruptedException e) {\n            System.out.println(\"Failed to get canPlay semaphore in the play thread\");\n            return;\n          }\n\n          // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n          AudioFormat format =\n              new AudioFormat(\n                  tts.getSampleRate(), // sampleRate\n                  16, // sampleSizeInBits\n                  1, // channels\n                  true, // signed\n                  false // bigEndian\n                  );\n          DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);\n          SourceDataLine line;\n          try {\n            line = (SourceDataLine) AudioSystem.getLine(info);\n\n            int bufferSizeInBytes = tts.getSampleRate(); // 0.5 seconds\n            
line.open(format, bufferSizeInBytes);\n          } catch (LineUnavailableException ex) {\n            System.out.println(\"Failed to open a device for playing\");\n            return;\n          }\n          line.start();\n\n          while (true) {\n            if (samplesQueue.isEmpty()) {\n              // Do nothing.\n              //\n              // If the generating speed is very slow, we can sleep\n              // for some time here to save some CPU.\n            } else {\n              byte[] samples = samplesQueue.poll();\n              if (samples.length == 1) {\n                // end of the generating\n                break;\n              }\n              line.write(samples, 0, samples.length);\n            }\n          }\n\n          line.drain();\n          line.close();\n        };\n\n    Thread playThread = new Thread(playRuannable);\n    playThread.start();\n\n    int sid = 0;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(config.getSilenceScale());\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(\n            text,\n            genConfig,\n            (float[] samples) -> {\n\n              // we use a byte array to save int16 samples\n              byte[] samplesInt16 = new byte[samples.length * 2];\n              for (int i = 0; i < samples.length; ++i) {\n                float s = samples[i];\n                if (s > 1) {\n                  s = 1;\n                }\n\n                if (s < -1) {\n                  s = -1;\n                }\n\n                short t = (short) (s * 32767);\n\n                // we use little endian\n                samplesInt16[2 * i] = (byte) (t & 0xff);\n                samplesInt16[2 * i + 1] = (byte) ((t & 0xff00) >> 8);\n              }\n\n              samplesQueue.add(samplesInt16);\n\n              
canPlaySem.release();\n\n              // Note: You can play the samples here.\n              // Warning: You need to save a copy of samples since the array is freed\n              // when this function returns.\n\n              // return 1 to continue generation\n              // return 0 to stop generation\n              return 1;\n            });\n\n    // Since a sample always has two bytes, we put a single byte\n    // into the queue to indicate that we have finished processing.\n    samplesQueue.add(new byte[1]);\n\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    try {\n      playThread.join();\n    } catch (InterruptedException ex) {\n      System.out.println(\"Failed to join the play thread\");\n      return;\n    }\n\n    String waveFilename = \"tts-piper-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingTtsVitsZh.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a VITS Chinese TTS model\n// to convert text to speech.\n//\n// You can use https://github.com/Plachtaa/VITS-fast-fine-tuning\n// to train your model\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class NonStreamingTtsPiperEn {\n  public static void main(String[] args) {\n    // please visit\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n    // to download model files\n    String model = \"./vits-zh-hf-fanchen-C/vits-zh-hf-fanchen-C.onnx\";\n    String tokens = \"./vits-zh-hf-fanchen-C/tokens.txt\";\n    String lexicon = \"./vits-zh-hf-fanchen-C/lexicon.txt\";\n    String ruleFsts =\n        \"./vits-zh-hf-fanchen-C/phone.fst,./vits-zh-hf-fanchen-C/date.fst,./vits-zh-hf-fanchen-C/number.fst\";\n    String text = \"有问题，请拨打110或者手机18601239876。我们的价值观是真诚热爱！\";\n\n    OfflineTtsVitsModelConfig vitsModelConfig =\n        OfflineTtsVitsModelConfig.builder()\n            .setModel(model)\n            .setTokens(tokens)\n            .setLexicon(lexicon)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setVits(vitsModelConfig)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config =\n        OfflineTtsConfig.builder().setModel(modelConfig).setRuleFsts(ruleFsts).build();\n\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 100;\n    float speed = 1.0f;\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(sid);\n    genConfig.setSpeed(speed);\n    genConfig.setSilenceScale(config.getSilenceScale());\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(text, genConfig, (float[] samples) -> 1);\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = 
audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-vits-zh.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/NonStreamingWebsocketClient.java",
    "content": "// Refer to\n// https://stackoverflow.com/questions/55380813/require-assistance-with-simple-pure-java-11-websocket-client-example\n//\n//\n// This is a WebSocketClient client for ../python-api-examples/non_streaming_server.py\n//\n// Please see ./run-non-streaming-websocket-client.sh\nimport com.k2fsa.sherpa.onnx.*;\nimport java.net.URI;\nimport java.net.http.HttpClient;\nimport java.net.http.WebSocket;\nimport java.nio.*;\nimport java.util.concurrent.CompletionStage;\nimport java.util.concurrent.CountDownLatch;\n\npublic class NonStreamingWebsocketClient {\n  public static void main(String[] args) throws Exception {\n    CountDownLatch latch = new CountDownLatch(1);\n\n    WebSocket ws =\n        HttpClient.newHttpClient()\n            .newWebSocketBuilder()\n            .buildAsync(URI.create(\"ws://localhost:6006\"), new WebSocketClient(latch))\n            .join();\n\n    // Please use a 16-bit, single channel wav for testing.\n    // the sample rate does not need to be 16kHz\n    String waveFilename = \"./zh.wav\";\n    WaveReader reader = new WaveReader(waveFilename);\n    int sampleRate = reader.getSampleRate();\n    int numSamples = reader.getSamples().length;\n\n    // Here is the format of the message\n    // byte 0-3 in little endian: sampleRate\n    // byte 4-7 in little endian: number of bytes for samples\n    // remaining bytes: samples. 
Each sample is a float32\n    ByteBuffer buffer = ByteBuffer.allocate(8 + 4 * numSamples).order(ByteOrder.LITTLE_ENDIAN);\n    buffer.putInt(sampleRate);\n    buffer.putInt(numSamples * 4); // each sample has 4 bytes\n\n    for (float s : reader.getSamples()) {\n      buffer.putFloat(s);\n    }\n\n    buffer.rewind();\n    buffer.flip();\n    buffer.order(ByteOrder.LITTLE_ENDIAN);\n\n    ws.sendBinary(ByteBuffer.wrap(buffer.array()), true).join();\n\n    // Send Done to the server to indicate that we don't have new wave files to decode\n    ws.sendText(\"Done\", true).join();\n\n    latch.await();\n  }\n\n  private static class WebSocketClient implements WebSocket.Listener {\n    private final CountDownLatch latch;\n\n    public WebSocketClient(CountDownLatch latch) {\n      this.latch = latch;\n    }\n\n    @Override\n    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {\n      System.out.println(\"Result is \" + data);\n      latch.countDown();\n      return WebSocket.Listener.super.onText(webSocket, data, last);\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/OfflineAddPunctuation.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a punctuation model to add punctuations to text.\n//\n// The model supports both English and Chinese.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class OfflineAddPunctuation {\n  public static void main(String[] args) {\n    // please download the model from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n    String model = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\";\n    OfflinePunctuationModelConfig modelConfig =\n        OfflinePunctuationModelConfig.builder()\n            .setCtTransformer(model)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n    OfflinePunctuationConfig config =\n        OfflinePunctuationConfig.builder().setModel(modelConfig).build();\n\n    OfflinePunctuation punct = new OfflinePunctuation(config);\n\n    String[] sentences =\n        new String[] {\n          \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n          \"我们都是木头人不会说话不会动\",\n          \"The African blogosphere is rapidly expanding bringing more voices online in the form of\"\n              + \" commentaries opinions analyses rants and poetry\",\n        };\n\n    System.out.println(\"---\");\n    for (String text : sentences) {\n      String out = punct.addPunctuation(text);\n      System.out.printf(\"Input: %s\\n\", text);\n      System.out.printf(\"Output: %s\\n\", out);\n      System.out.println(\"---\");\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/OfflineSpeakerDiarizationDemo.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use sherpa-onnx Java API for speaker diarization,\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class OfflineSpeakerDiarizationDemo {\n  public static void main(String[] args) {\n    /* Please use the following commands to download files used in this file\n    Step 1: Download a speaker segmentation model\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n    for a list of available models. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n      tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n      rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\n    Step 2: Download a speaker embedding extractor model\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n    for a list of available models. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\n    Step 3. Download test wave files\n\n    Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n    for a list of available test wave files. The following is an example\n\n      wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\n    Step 4. 
Run it\n        */\n\n    String segmentationModel = \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\";\n    String embeddingModel = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\n    String waveFilename = \"./0-four-speakers-zh.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OfflineSpeakerSegmentationPyannoteModelConfig pyannote =\n        OfflineSpeakerSegmentationPyannoteModelConfig.builder().setModel(segmentationModel).build();\n\n    OfflineSpeakerSegmentationModelConfig segmentation =\n        OfflineSpeakerSegmentationModelConfig.builder()\n            .setPyannote(pyannote)\n            .setDebug(true)\n            .build();\n\n    SpeakerEmbeddingExtractorConfig embedding =\n        SpeakerEmbeddingExtractorConfig.builder().setModel(embeddingModel).setDebug(true).build();\n\n    // The test wave file ./0-four-speakers-zh.wav contains four speakers, so\n    // we use numClusters=4 here. If you don't know the number of speakers\n    // in the test wave file, please set the numClusters to -1 and provide\n    // threshold for clustering\n    FastClusteringConfig clustering =\n        FastClusteringConfig.builder()\n            .setNumClusters(4) // set it to -1 if you don't know the actual number\n            .setThreshold(0.5f)\n            .build();\n\n    OfflineSpeakerDiarizationConfig config =\n        OfflineSpeakerDiarizationConfig.builder()\n            .setSegmentation(segmentation)\n            .setEmbedding(embedding)\n            .setClustering(clustering)\n            .setMinDurationOn(0.2f)\n            .setMinDurationOff(0.5f)\n            .build();\n\n    OfflineSpeakerDiarization sd = new OfflineSpeakerDiarization(config);\n    if (sd.getSampleRate() != reader.getSampleRate()) {\n      System.out.printf(\n          \"Expected sample rate: %d, given: %d\\n\", sd.getSampleRate(), reader.getSampleRate());\n      sd.release();\n      return;\n    }\n\n    // OfflineSpeakerDiarizationSegment[] 
segments = sd.process(reader.getSamples());\n    // without callback is also ok\n\n    // or you can use a callback to show the progress\n    OfflineSpeakerDiarizationSegment[] segments =\n        sd.processWithCallback(\n            reader.getSamples(),\n            (int numProcessedChunks, int numTotalChunks, long arg) -> {\n              float progress = 100.0f * numProcessedChunks / numTotalChunks;\n              System.out.printf(\"Progress: %.2f%%\\n\", progress);\n\n              return 0;\n            });\n\n    for (OfflineSpeakerDiarizationSegment s : segments) {\n      System.out.printf(\"%.3f -- %.3f speaker_%02d\\n\", s.getStart(), s.getEnd(), s.getSpeaker());\n    }\n\n    sd.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/OnlineAddPunctuation.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a punctuation model to add punctuations to text.\n//\n// The model supports ONLY English.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class OnlineAddPunctuation {\n  public static void main(String[] args) {\n    // please download the model from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n    String model = \"./sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx\";\n    String bpeVocab = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\";\n    OnlinePunctuationModelConfig modelConfig =\n        OnlinePunctuationModelConfig.builder()\n            .setCnnBilstm(model)\n            .setBpeVocab(bpeVocab)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n    OnlinePunctuationConfig config =\n        OnlinePunctuationConfig.builder().setModel(modelConfig).build();\n\n    OnlinePunctuation punct = new OnlinePunctuation(config);\n\n    String[] sentences =\n        new String[] {\n          \"how are you doing fantastic thank you how about you\",\n          \"The African blogosphere is rapidly expanding bringing more voices online in the form of\"\n              + \" commentaries opinions analyses rants and poetry\",\n        };\n\n    System.out.println(\"---\");\n    for (String text : sentences) {\n      String out = punct.addPunctuation(text);\n      System.out.printf(\"Input: %s\\n\", text);\n      System.out.printf(\"Output: %s\\n\", out);\n      System.out.println(\"---\");\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/PocketTts.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\n// This file shows how to use a PocketTTS English model\n// for voice cloning.\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.HashMap;\nimport java.util.Map;\n\npublic class PocketTts {\n  public static void main(String[] args) {\n    LibraryUtils.enableDebug();\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\n    // to download model files\n    String lmFlow = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\n    String lmMain = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\n    String encoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\n    String decoder = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\n    String textConditioner = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\n    String vocabJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\n    String tokenScoresJson = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsPocketModelConfig pocketModelConfig =\n        OfflineTtsPocketModelConfig.builder()\n            .setLmMain(lmMain)\n            .setLmFlow(lmFlow)\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setTextConditioner(textConditioner)\n            .setVocabJson(vocabJson)\n            .setTokenScoresJson(tokenScoresJson)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setPocket(pocketModelConfig)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    String referenceAudioFilename = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\n    WaveReader reader = new WaveReader(referenceAudioFilename);\n\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setReferenceAudio(reader.getSamples());\n    genConfig.setReferenceSampleRate(reader.getSampleRate());\n    genConfig.setNumSteps(5);\n\n    Map<String, String> extra = new HashMap<>();\n    extra.put(\"temperature\", \"0.7\");\n    extra.put(\"chunk_size\", \"15\");\n\n    genConfig.setExtra(extra);\n\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = null;\n\n    // You can choose one of the following callback style\n    // ---------------------------------------------------\n    // 1. 
Anonymous class implementing OfflineTtsCallback\n    // ---------------------------------------------------\n    if (true) {\n      audio =\n          tts.generateWithConfigAndCallback(\n              text,\n              genConfig,\n              new OfflineTtsCallback() {\n                @Override\n                public Integer invoke(float[] samples) {\n                  // you can play the generated samples in a separate thread\n                  System.out.println(\"callback got called with \" + samples.length + \" samples\");\n                  // 1 = continue, 0 = stop\n                  return 1;\n                }\n              });\n    }\n\n    // -------------------------------\n    // 2. Lambda implementing OfflineTtsCallback\n    // -------------------------------\n    if (false) {\n      audio =\n          tts.generateWithConfigAndCallback(\n              text,\n              genConfig,\n              samples -> {\n                System.out.println(\"Lambda Integer callback: \" + samples.length);\n                return 1; // continue\n              });\n    }\n\n    if (false) {\n      audio =\n          tts.generateWithConfigAndCallback(\n              text,\n              genConfig,\n              samples -> {\n                System.out.println(\"Consumer: \" + samples.length);\n                // implicitly, it returns 1 internally\n              });\n    }\n\n    if (audio == null) {\n      System.err.println(\"No audio was generated. 
Please enable at least one callback branch.\");\n      return;\n    }\n\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"pocket-tts-bria.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/README.md",
    "content": "# Introduction\n\nThis directory contains examples for the JAVA API of sherpa-onnx.\n\n# Usage\n\n## Non-streaming speech enhancement\n\n```bash\n./run-non-streaming-speech-enhancement-gtcrn.sh\n./run-non-streaming-speech-enhancement-dpdfnet.sh\n```\n\nUse 16 kHz DPDFNet models such as\n`dpdfnet_baseline.onnx`, `dpdfnet2.onnx`, `dpdfnet4.onnx`, or `dpdfnet8.onnx` for\ndownstream ASR and `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.\n\n## Non-streaming speaker diarization\n\n```bash\n./run-offline-speaker-diarization.sh\n```\n\n## Streaming Speech recognition\n\n```bash\n./run-streaming-asr-from-mic-transducer.sh\n./run-streaming-decode-file-ctc-hlg.sh\n./run-streaming-decode-file-ctc.sh\n./run-streaming-decode-file-paraformer.sh\n./run-streaming-decode-file-tone-ctc.sh\n./run-streaming-decode-file-transducer.sh\n```\n\n## Non-Streaming Speech recognition\n\n```bash\n./run-non-streaming-decode-file-dolphin-ctc.sh\n./run-non-streaming-decode-file-fire-red-asr-ctc.sh\n./run-non-streaming-decode-file-fire-red-asr.sh\n./run-non-streaming-decode-file-funasr-nano.sh\n./run-non-streaming-decode-file-medasr-ctc.sh\n./run-non-streaming-decode-file-moonshine.sh\n./run-non-streaming-decode-file-moonshine-v2.sh\n./run-non-streaming-decode-file-nemo-canary.sh\n./run-non-streaming-decode-file-nemo.sh\n./run-non-streaming-decode-file-omnilingual-asr-ctc.sh\n./run-non-streaming-decode-file-paraformer.sh\n./run-non-streaming-decode-file-sense-voice-with-hr.sh\n./run-non-streaming-decode-file-sense-voice.sh\n./run-non-streaming-decode-file-tele-speech-ctc.sh\n./run-non-streaming-decode-file-transducer-hotwords.sh\n./run-non-streaming-decode-file-transducer.sh\n./run-non-streaming-decode-file-wenet-ctc.sh\n./run-non-streaming-decode-file-whisper-multiple.sh\n./run-non-streaming-decode-file-whisper.sh\n./run-non-streaming-decode-file-zipformer-ctc.sh\n```\n\n## Non-Streaming Speech recognition with homophone 
replacer\n\n```bash\n./run-non-streaming-decode-file-sense-voice-with-hr.sh\n```\n\n## Non-Streaming text-to-speech\n\n```bash\n./run-non-streaming-tts-coqui-de.sh\n./run-non-streaming-tts-kitten-en.sh\n./run-non-streaming-tts-kokoro-en.sh\n./run-non-streaming-tts-kokoro-zh-en.sh\n./run-non-streaming-tts-matcha-en.sh\n./run-non-streaming-tts-matcha-zh.sh\n./run-non-streaming-tts-piper-en-with-callback.sh\n./run-non-streaming-tts-piper-en.sh\n./run-non-streaming-tts-vits-zh.sh\n./run-pocket-tts.sh\n./run-zipvoice-tts.sh\n```\n\n## Non-Streaming text-to-speech (Play back the audio while it is being generated)\n\n```bash\n./run-non-streaming-tts-piper-en-with-callback.sh\n```\n\n## Spoken language identification\n\n```bash\n./run-spoken-language-identification-whisper.sh\n```\n\n## Add punctuation to text\n\nThe punctuation model supports both English and Chinese.\n\n```bash\n./run-offline-add-punctuation-zh-en.sh\n./run-online-add-punctuation-zh-en.sh\n```\n\n## Audio tagging\n\n```bash\n./run-audio-tagging-zipformer-from-file.sh\n./run-audio-tagging-ced-from-file.sh\n```\n\n## Speaker identification\n\n```bash\n./run-speaker-identification.sh\n```\n\n## VAD with a microphone\n\n```bash\n./run-vad-from-mic.sh\n```\n\n## VAD with a microphone + Non-streaming SenseVoice for speech recognition\n\n```bash\n./run-vad-from-mic-non-streaming-sense-voice.sh\n```\n\n## VAD with a microphone + Non-streaming Paraformer for speech recognition\n\n```bash\n./run-vad-from-mic-non-streaming-paraformer.sh\n```\n\n## VAD with a microphone + Non-streaming Whisper tiny.en for speech recognition\n\n```bash\n./run-vad-from-mic-non-streaming-whisper.sh\n```\n\n## VAD (Remove silence)\n\n```bash\n./run-vad-remove-slience.sh\n./run-ten-vad-remove-slience.sh\n```\n\n## VAD + Non-streaming Dolphin CTC for speech recognition\n\n```bash\n./run-vad-non-streaming-dolphin-ctc.sh\n```\n\n## VAD + Non-streaming SenseVoice for speech 
recognition\n\n```bash\n./run-vad-non-streaming-sense-voice.sh\n```\n\n## VAD + Non-streaming Paraformer for speech recognition\n\n```bash\n./run-vad-non-streaming-paraformer.sh\n```\n\n## Keyword spotter\n\n```bash\n./run-kws-from-file.sh\n```\n"
  },
  {
    "path": "java-api-examples/SpeakerIdentification.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a speaker embedding extractor model for speaker\n// identification.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class SpeakerIdentification {\n  public static float[] computeEmbedding(SpeakerEmbeddingExtractor extractor, String filename) {\n    WaveReader reader = new WaveReader(filename);\n\n    OnlineStream stream = extractor.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n    stream.inputFinished();\n\n    float[] embedding = extractor.compute(stream);\n    stream.release();\n\n    return embedding;\n  }\n\n  public static void main(String[] args) {\n    // Please download the model from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n    String model = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\n    SpeakerEmbeddingExtractorConfig config =\n        SpeakerEmbeddingExtractorConfig.builder()\n            .setModel(model)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n    SpeakerEmbeddingExtractor extractor = new SpeakerEmbeddingExtractor(config);\n    SpeakerEmbeddingManager manager = new SpeakerEmbeddingManager(extractor.getDim());\n\n    try {\n      String[] spk1Files =\n          new String[] {\n            \"./sr-data/enroll/fangjun-sr-1.wav\",\n            \"./sr-data/enroll/fangjun-sr-2.wav\",\n            \"./sr-data/enroll/fangjun-sr-3.wav\",\n          };\n\n      float[][] spk1Vec = new float[spk1Files.length][];\n\n      for (int i = 0; i < spk1Files.length; ++i) {\n        spk1Vec[i] = computeEmbedding(extractor, spk1Files[i]);\n      }\n\n      String[] spk2Files =\n          new String[] {\n            \"./sr-data/enroll/leijun-sr-1.wav\", \"./sr-data/enroll/leijun-sr-2.wav\",\n          };\n\n      float[][] spk2Vec = new float[spk2Files.length][];\n\n      for (int i = 0; i < spk2Files.length; ++i) {\n        spk2Vec[i] = 
computeEmbedding(extractor, spk2Files[i]);\n      }\n\n      if (!manager.add(\"fangjun\", spk1Vec)) {\n        System.out.println(\"Failed to register fangjun\");\n        return;\n      }\n\n      if (!manager.add(\"leijun\", spk2Vec)) {\n        System.out.println(\"Failed to register leijun\");\n        return;\n      }\n\n      if (manager.getNumSpeakers() != 2) {\n        System.out.println(\"There should be two speakers\");\n        return;\n      }\n\n      if (!manager.contains(\"fangjun\")) {\n        System.out.println(\"It should contain the speaker fangjun\");\n        return;\n      }\n\n      if (!manager.contains(\"leijun\")) {\n        System.out.println(\"It should contain the speaker leijun\");\n        return;\n      }\n\n      System.out.println(\"---All speakers---\");\n      String[] allSpeakers = manager.getAllSpeakerNames();\n      for (String s : allSpeakers) {\n        System.out.println(s);\n      }\n      System.out.println(\"------------\");\n\n      String[] testFiles =\n          new String[] {\n            \"./sr-data/test/fangjun-test-sr-1.wav\",\n            \"./sr-data/test/leijun-test-sr-1.wav\",\n            \"./sr-data/test/liudehua-test-sr-1.wav\"\n          };\n\n      float threshold = 0.6f;\n      for (String file : testFiles) {\n        float[] embedding = computeEmbedding(extractor, file);\n\n        String name = manager.search(embedding, threshold);\n        if (name.isEmpty()) {\n          name = \"<Unknown>\";\n        }\n        System.out.printf(\"%s: %s\\n\", file, name);\n      }\n\n      // test verify\n      if (!manager.verify(\"fangjun\", computeEmbedding(extractor, testFiles[0]), threshold)) {\n        System.out.printf(\"%s should match fangjun!\\n\", testFiles[0]);\n        return;\n      }\n\n      if (!manager.remove(\"fangjun\")) {\n        System.out.println(\"Failed to remove fangjun\");\n        return;\n      }\n\n      if (manager.verify(\"fangjun\", computeEmbedding(extractor, testFiles[0]), 
threshold)) {\n        System.out.printf(\"%s should match no one!\\n\", testFiles[0]);\n        return;\n      }\n\n      if (manager.getNumSpeakers() != 1) {\n        System.out.println(\"There should be only 1 speaker left.\");\n        return;\n      }\n    } finally {\n      extractor.release();\n      manager.release();\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/SpokenLanguageIdentificationWhisper.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a multilingual whisper model for\n// spoken language identification.\n//\n// Note that it needs a multilingual whisper model. For instance,\n// tiny works, but tiny.en doesn't.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class SpokenLanguageIdentificationWhisper {\n  public static void main(String[] args) {\n    // please download model and test files from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\";\n\n    String[] testFiles =\n        new String[] {\n          \"./spoken-language-identification-test-wavs/en-english.wav\",\n          \"./spoken-language-identification-test-wavs/de-german.wav\",\n          \"./spoken-language-identification-test-wavs/zh-chinese.wav\",\n          \"./spoken-language-identification-test-wavs/es-spanish.wav\",\n          \"./spoken-language-identification-test-wavs/fa-persian.wav\",\n          \"./spoken-language-identification-test-wavs/ko-korean.wav\",\n          \"./spoken-language-identification-test-wavs/ja-japanese.wav\",\n          \"./spoken-language-identification-test-wavs/ru-russian.wav\",\n          \"./spoken-language-identification-test-wavs/uk-ukrainian.wav\",\n        };\n\n    SpokenLanguageIdentificationWhisperConfig whisper =\n        SpokenLanguageIdentificationWhisperConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .build();\n\n    SpokenLanguageIdentificationConfig config =\n        SpokenLanguageIdentificationConfig.builder()\n            .setWhisper(whisper)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    SpokenLanguageIdentification slid = new SpokenLanguageIdentification(config);\n    for (String filename : testFiles) {\n      WaveReader reader = new 
WaveReader(filename);\n\n      OfflineStream stream = slid.createStream();\n      stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n      String lang = slid.compute(stream);\n      System.out.println(\"---\");\n      System.out.printf(\"filename: %s\\n\", filename);\n      System.out.printf(\"lang: %s\\n\", lang);\n\n      stream.release();\n    }\n    System.out.println(\"---\");\n\n    slid.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingAsrFromMicTransducer.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online transducer, i.e., streaming transducer,\n// for real-time speech recognition with a microphone.\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class StreamingAsrFromMicTransducer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n    // to download model files\n    String encoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\";\n    String decoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\";\n    String joiner =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    String ruleFsts = \"./itn_zh_number.fst\";\n\n    int sampleRate = 16000;\n\n    OnlineTransducerModelConfig transducer =\n        OnlineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .setRuleFsts(ruleFsts)\n            .build();\n\n    
OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      recognizer.release();\n      stream.release();\n      return;\n    }\n\n    String lastText = \"\";\n    int segmentIndex = 0;\n\n    // You can choose an arbitrary number\n    int bufferSize = 1600; // 0.1 seconds for 16000Hz\n    byte[] buffer = new byte[bufferSize * 2]; // a short has 2 bytes\n    float[] samples = new float[bufferSize];\n\n    System.out.println(\"Started! Please speak\");\n    while (targetDataLine.isOpen()) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. 
Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != bufferSize; ++i) {\n        // little endian: mask the low byte so that it is not sign-extended\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = (float) s / 32768;\n      }\n      stream.acceptWaveform(samples, sampleRate);\n\n      while (recognizer.isReady(stream)) {\n        recognizer.decode(stream);\n      }\n\n      String text = recognizer.getResult(stream).getText();\n      boolean isEndpoint = recognizer.isEndpoint(stream);\n      // compare strings by value, not by reference\n      if (!text.isEmpty() && !text.equals(\" \") && !lastText.equals(text)) {\n        lastText = text;\n        System.out.printf(\"%d: %s\\r\", segmentIndex, text);\n      }\n\n      if (isEndpoint) {\n        if (!text.isEmpty()) {\n          System.out.println();\n          segmentIndex += 1;\n        }\n\n        recognizer.reset(stream);\n      }\n    } // while (targetDataLine.isOpen())\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingDecodeFileCtc.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online CTC model, i.e., streaming CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingDecodeFileCtc {\n  public static void main(String[] args) {\n    // please refer to\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n    // to download model files\n    String model =\n        \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\";\n    String waveFilename = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineZipformer2CtcModelConfig ctc =\n        OnlineZipformer2CtcModelConfig.builder().setModel(model).build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setZipformer2Ctc(ctc)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.3 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    
recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingDecodeFileCtcHLG.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online CTC model, i.e., streaming CTC model,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingDecodeFileCtcHLG {\n  public static void main(String[] args) {\n    // please refer to\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n    // to download model files\n    String model =\n        \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\";\n    String hlg = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\";\n    String waveFilename = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineZipformer2CtcModelConfig ctc =\n        OnlineZipformer2CtcModelConfig.builder().setModel(model).build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setZipformer2Ctc(ctc)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineCtcFstDecoderConfig ctcFstDecoderConfig =\n        OnlineCtcFstDecoderConfig.builder().setGraph(hlg).build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setCtcFstDecoderConfig(ctcFstDecoderConfig)\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.3 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while 
(recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingDecodeFileParaformer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online paraformer, i.e., streaming paraformer,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingDecodeFileParaformer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english\n    // to download model files\n    String encoder = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt\";\n    String waveFilename = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/2.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineParaformerModelConfig paraformer =\n        OnlineParaformerModelConfig.builder().setEncoder(encoder).setDecoder(decoder).build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setParaformer(paraformer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.8 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = 
recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingDecodeFileToneCtc.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online T-one CTC model, i.e.,\n// streaming T-one CTC model, to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingDecodeFileToneCtc {\n  public static void main(String[] args) {\n    String model = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\";\n    String waveFilename = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineToneCtcModelConfig ctc = OnlineToneCtcModelConfig.builder().setModel(model).build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setToneCtc(ctc)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = recognizer.createStream();\n\n    float[] leftPaddings = new float[(int) (0.3 * reader.getSampleRate())];\n    stream.acceptWaveform(leftPaddings, reader.getSampleRate());\n\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.6 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingDecodeFileTransducer.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use an online transducer, i.e., streaming transducer,\n// to decode files.\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingDecodeFileTransducer {\n  public static void main(String[] args) {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n    // to download model files\n    String encoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\";\n    String decoder =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\";\n    String joiner =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\";\n    String tokens = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n\n    String waveFilename =\n        \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav\";\n\n    WaveReader reader = new WaveReader(waveFilename);\n\n    OnlineTransducerModelConfig transducer =\n        OnlineTransducerModelConfig.builder()\n            .setEncoder(encoder)\n            .setDecoder(decoder)\n            .setJoiner(joiner)\n            .build();\n\n    OnlineModelConfig modelConfig =\n        OnlineModelConfig.builder()\n            .setTransducer(transducer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OnlineRecognizerConfig config =\n        OnlineRecognizerConfig.builder()\n            .setOnlineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    OnlineRecognizer recognizer = new OnlineRecognizer(config);\n    OnlineStream stream = 
recognizer.createStream();\n    stream.acceptWaveform(reader.getSamples(), reader.getSampleRate());\n\n    float[] tailPaddings = new float[(int) (0.8 * reader.getSampleRate())];\n    stream.acceptWaveform(tailPaddings, reader.getSampleRate());\n\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n\n    String text = recognizer.getResult(stream).getText();\n\n    System.out.printf(\"filename:%s\\nresult:%s\\n\", waveFilename, text);\n\n    stream.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingSpeechEnhancementDpdfNet.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n//\n// This file shows how to use streaming DPDFNet speech enhancement models in\n// sherpa-onnx.\n//\n// Download DPDFNet models from either:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// https://huggingface.co/Ceva-IP/DPDFNet\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingSpeechEnhancementDpdfNet {\n  private static void appendSamples(java.util.ArrayList<Float> dst, float[] src) {\n    for (float v : src) {\n      dst.add(v);\n    }\n  }\n\n  private static float[] toFloatArray(java.util.ArrayList<Float> src) {\n    float[] ans = new float[src.size()];\n    for (int i = 0; i != src.size(); ++i) {\n      ans[i] = src.get(i);\n    }\n    return ans;\n  }\n\n  public static void main(String[] args) {\n    String model = \"./dpdfnet_baseline.onnx\";\n\n    OfflineSpeechDenoiserModelConfig modelConfig =\n        OfflineSpeechDenoiserModelConfig.builder()\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .setDpdfnet(\n                OfflineSpeechDenoiserDpdfNetModelConfig.builder().setModel(model).build())\n            .build();\n\n    OnlineSpeechDenoiserConfig config =\n        OnlineSpeechDenoiserConfig.builder().setModel(modelConfig).build();\n\n    OnlineSpeechDenoiser speechDenoiser = new OnlineSpeechDenoiser(config);\n\n    WaveReader reader = new WaveReader(\"./inp_16k.wav\");\n    int frameShift = speechDenoiser.getFrameShiftInSamples();\n    java.util.ArrayList<Float> output = new java.util.ArrayList<>();\n\n    float[] samples = reader.getSamples();\n    for (int start = 0; start < samples.length; start += frameShift) {\n      int end = Math.min(start + frameShift, samples.length);\n      float[] chunk = java.util.Arrays.copyOfRange(samples, start, end);\n      DenoisedAudio denoised = speechDenoiser.run(chunk, reader.getSampleRate());\n      appendSamples(output, denoised.getSamples());\n    
}\n\n    DenoisedAudio denoised = speechDenoiser.flush();\n    appendSamples(output, denoised.getSamples());\n    String outFilename = \"enhanced-online-dpdfnet.wav\";\n    WaveWriter.write(outFilename, toFloatArray(output), speechDenoiser.getSampleRate());\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    speechDenoiser.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/StreamingSpeechEnhancementGtcrn.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n//\n// This file shows how to use streaming GTCRN speech enhancement models in\n// sherpa-onnx.\n//\n// Download GTCRN models and sample test waves from:\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class StreamingSpeechEnhancementGtcrn {\n  private static void appendSamples(java.util.ArrayList<Float> dst, float[] src) {\n    for (float v : src) {\n      dst.add(v);\n    }\n  }\n\n  private static float[] toFloatArray(java.util.ArrayList<Float> src) {\n    float[] ans = new float[src.size()];\n    for (int i = 0; i != src.size(); ++i) {\n      ans[i] = src.get(i);\n    }\n    return ans;\n  }\n\n  public static void main(String[] args) {\n    String model = \"./gtcrn_simple.onnx\";\n\n    OfflineSpeechDenoiserModelConfig modelConfig =\n        OfflineSpeechDenoiserModelConfig.builder()\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .setGtcrn(\n                OfflineSpeechDenoiserGtcrnModelConfig.builder().setModel(model).build())\n            .build();\n\n    OnlineSpeechDenoiserConfig config =\n        OnlineSpeechDenoiserConfig.builder().setModel(modelConfig).build();\n\n    OnlineSpeechDenoiser speechDenoiser = new OnlineSpeechDenoiser(config);\n\n    WaveReader reader = new WaveReader(\"./inp_16k.wav\");\n    int frameShift = speechDenoiser.getFrameShiftInSamples();\n    java.util.ArrayList<Float> output = new java.util.ArrayList<>();\n\n    float[] samples = reader.getSamples();\n    for (int start = 0; start < samples.length; start += frameShift) {\n      int end = Math.min(start + frameShift, samples.length);\n      float[] chunk = java.util.Arrays.copyOfRange(samples, start, end);\n      DenoisedAudio denoised = speechDenoiser.run(chunk, reader.getSampleRate());\n      appendSamples(output, denoised.getSamples());\n    }\n\n    DenoisedAudio denoised = 
speechDenoiser.flush();\n    appendSamples(output, denoised.getSamples());\n    String outFilename = \"enhanced-online-gtcrn.wav\";\n    WaveWriter.write(outFilename, toFloatArray(output), speechDenoiser.getSampleRate());\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    speechDenoiser.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/SupertonicTts.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\n// This file shows how to use a Supertonic TTS English model.\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.HashMap;\nimport java.util.Map;\n\npublic class SupertonicTts {\n  public static void main(String[] args) {\n    LibraryUtils.enableDebug();\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\n    // to download model files\n    String modelDir = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06\";\n    String durationPredictor = modelDir + \"/duration_predictor.int8.onnx\";\n    String textEncoder = modelDir + \"/text_encoder.int8.onnx\";\n    String vectorEstimator = modelDir + \"/vector_estimator.int8.onnx\";\n    String vocoder = modelDir + \"/vocoder.int8.onnx\";\n    String ttsJson = modelDir + \"/tts.json\";\n    String unicodeIndexer = modelDir + \"/unicode_indexer.bin\";\n    String voiceStyle = modelDir + \"/voice.bin\";\n\n    String text =\n        \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have\"\n            + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n            + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsSupertonicModelConfig supertonicModelConfig =\n        OfflineTtsSupertonicModelConfig.builder()\n            .setDurationPredictor(durationPredictor)\n            .setTextEncoder(textEncoder)\n            .setVectorEstimator(vectorEstimator)\n            .setVocoder(vocoder)\n            .setTtsJson(ttsJson)\n            .setUnicodeIndexer(unicodeIndexer)\n            .setVoiceStyle(voiceStyle)\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setSupertonic(supertonicModelConfig)\n            .setNumThreads(2)\n            .setDebug(true)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setSid(6);\n    genConfig.setSpeed(1.25f);\n    genConfig.setNumSteps(5);\n\n    Map<String, String> extra = new HashMap<>();\n    extra.put(\"lang\", \"en\");\n\n    genConfig.setExtra(extra);\n\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(\n            text,\n            genConfig,\n            new OfflineTtsCallback() {\n              @Override\n              public Integer invoke(float[] samples) {\n                System.out.println(\"callback got called with \" + samples.length + \" samples\");\n                return 1;\n              }\n            });\n\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = 
\"supertonic-tts-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/TenVadRemoveSilence.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a ten-vad model to remove silences from\n// a wave file.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.ArrayList;\nimport java.util.Arrays;\n\npublic class TenVadRemoveSilence {\n  public static void main(String[] args) {\n    // please download ./ten-vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./ten-vad.onnx\";\n    TenVadModelConfig tenVad =\n        TenVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(256)\n            .setMaxSpeechDuration(5.0f)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setTenVadModelConfig(tenVad)\n            .setSampleRate(16000)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    Vad vad = new Vad(config);\n\n    // You can download the test file from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String testWaveFilename = \"./lei-jun-test.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    int numSamples = reader.getSamples().length;\n    int windowSize = tenVad.getWindowSize();\n    int numIter = numSamples / windowSize;\n\n    ArrayList<float[]> segments = new ArrayList<float[]>();\n\n    for (int i = 0; i != numIter; ++i) {\n      int start = i * windowSize;\n      int end = start + windowSize;\n      float[] samples = Arrays.copyOfRange(reader.getSamples(), start, end);\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected()) {\n        while (!vad.empty()) {\n\n          // if you want to get the starting time of this segment, you can use\n          /* float startTime = vad.front().getStart() / 16000.0f; */\n\n          
segments.add(vad.front().getSamples());\n          vad.pop();\n        }\n      }\n    }\n\n    vad.flush();\n    while (!vad.empty()) {\n\n      // if you want to get the starting time of this segment, you can use\n      /* float startTime = vad.front().getStart() / 16000.0f; */\n\n      segments.add(vad.front().getSamples());\n      vad.pop();\n    }\n\n    // get total number of samples\n    int n = 0;\n    for (float[] s : segments) {\n      n += s.length;\n    }\n\n    float[] allSamples = new float[n];\n    int i = 0;\n    for (float[] s : segments) {\n      System.arraycopy(s, 0, allSamples, i, s.length);\n      i += s.length;\n    }\n\n    String outFilename = \"lei-jun-test-no-silence.wav\";\n    WaveWriter.write(outFilename, allSamples, 16000);\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    vad.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadFromMic.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model to detect speech\n// and save detected speech into a wave file.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class VadFromMic {\n  public static void main(String[] args) {\n    int sampleRate = 16000;\n    int windowSize = 512;\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(windowSize)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(sampleRate)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    Vad vad = new Vad(config);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      vad.release();\n      return;\n    }\n\n    boolean printed = false;\n    int index = 0;\n\n    byte[] 
buffer = new byte[windowSize * 2];\n    float[] samples = new float[windowSize];\n\n    while (targetDataLine.isOpen()) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != windowSize; ++i) {\n        // Little-endian 16-bit PCM: mask the low byte so it is not sign-extended.\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = s / 32768.0f;\n      }\n\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected() && !printed) {\n        System.out.println(\"Detected speech\");\n        printed = true;\n      }\n\n      if (!vad.isSpeechDetected()) {\n        printed = false;\n      }\n\n      while (!vad.empty()) {\n        float[] segment = vad.front().getSamples();\n        float duration = segment.length / (float) sampleRate;\n        System.out.printf(\"Duration: %.3f seconds\\n\", duration);\n\n        String filename = String.format(\"seg-%d-%.3fs.wav\", index, duration);\n        index += 1;\n        WaveWriter.write(filename, segment, sampleRate);\n        System.out.printf(\"Saved to %s\\n\", filename);\n        System.out.println(\"----------\");\n        vad.pop();\n      }\n    }\n\n    vad.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadFromMicWithNonStreamingMoonshine.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming\n// Moonshine tiny for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class VadFromMicNonStreamingMoonshine {\n  private static final int sampleRate = 16000;\n  private static final int windowSize = 512;\n\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(windowSize)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(sampleRate)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\n    // to download model files\n\n    String preprocessor = \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\";\n    String encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\";\n    String uncachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\";\n    String cachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\";\n\n    String tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\";\n\n    OfflineMoonshineModelConfig moonshine =\n        OfflineMoonshineModelConfig.builder()\n            .setPreprocessor(preprocessor)\n            .setEncoder(encoder)\n            
.setUncachedDecoder(uncachedDecoder)\n            .setCachedDecoder(cachedDecoder)\n            .build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setMoonshine(moonshine)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      vad.release();\n      recognizer.release();\n      return;\n    }\n\n    boolean printed = false;\n    byte[] buffer = new byte[windowSize * 2];\n    float[] samples = new float[windowSize];\n\n    System.out.println(\"Started. Please speak\");\n    boolean running = true;\n    while (targetDataLine.isOpen() && running) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. 
Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != windowSize; ++i) {\n        // Little-endian 16-bit PCM: mask the low byte so it is not sign-extended.\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = s / 32768.0f;\n      }\n\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected() && !printed) {\n        System.out.println(\"Detected speech\");\n        printed = true;\n      }\n\n      if (!vad.isSpeechDetected()) {\n        printed = false;\n      }\n\n      while (!vad.empty()) {\n        SpeechSegment segment = vad.front();\n        float startTime = segment.getStart() / (float) sampleRate;\n        float duration = segment.getSamples().length / (float) sampleRate;\n\n        OfflineStream stream = recognizer.createStream();\n        stream.acceptWaveform(segment.getSamples(), sampleRate);\n        recognizer.decode(stream);\n        String text = recognizer.getResult(stream).getText();\n        stream.release();\n\n        if (!text.isEmpty()) {\n          System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n        }\n\n        if (text.contains(\"exit the program\")) {\n          running = false;\n        }\n\n        vad.pop();\n      }\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadFromMicWithNonStreamingParaformer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming Paraformer\n// for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class VadFromMicWithNonStreamingParaformer {\n  private static final int sampleRate = 16000;\n  private static final int windowSize = 512;\n\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(windowSize)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(sampleRate)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese-english\n    // to download model files\n    String model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\";\n\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    String ruleFsts = \"./itn_zh_number.fst\";\n\n    OfflineParaformerModelConfig paraformer =\n        OfflineParaformerModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            
.setParaformer(paraformer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .setRuleFsts(ruleFsts)\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      vad.release();\n      recognizer.release();\n      return;\n    }\n\n    boolean printed = false;\n    byte[] buffer = new byte[windowSize * 2];\n    float[] samples = new float[windowSize];\n\n    System.out.println(\"Started. Please speak\");\n    boolean running = true;\n    while (targetDataLine.isOpen() && running) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. 
Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != windowSize; ++i) {\n        // Little-endian 16-bit PCM: mask the low byte so it is not sign-extended.\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = s / 32768.0f;\n      }\n\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected() && !printed) {\n        System.out.println(\"Detected speech\");\n        printed = true;\n      }\n\n      if (!vad.isSpeechDetected()) {\n        printed = false;\n      }\n\n      while (!vad.empty()) {\n        SpeechSegment segment = vad.front();\n        float startTime = segment.getStart() / (float) sampleRate;\n        float duration = segment.getSamples().length / (float) sampleRate;\n\n        OfflineStream stream = recognizer.createStream();\n        stream.acceptWaveform(segment.getSamples(), sampleRate);\n        recognizer.decode(stream);\n        String text = recognizer.getResult(stream).getText();\n        stream.release();\n\n        if (!text.isEmpty()) {\n          System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n        }\n\n        if (text.contains(\"退出程序\")) {\n          running = false;\n        }\n\n        vad.pop();\n      }\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadFromMicWithNonStreamingSenseVoice.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming\n// SenseVoice model for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class VadFromMicWithNonStreamingSenseVoice {\n  private static final int sampleRate = 16000;\n  private static final int windowSize = 512;\n\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(windowSize)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(sampleRate)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n    OfflineSenseVoiceModelConfig senseVoice =\n        OfflineSenseVoiceModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setSenseVoice(senseVoice)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n         
   .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      vad.release();\n      recognizer.release();\n      return;\n    }\n\n    boolean printed = false;\n    byte[] buffer = new byte[windowSize * 2];\n    float[] samples = new float[windowSize];\n\n    System.out.println(\"Started. Please speak\");\n    boolean running = true;\n    while (targetDataLine.isOpen() && running) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. 
Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != windowSize; ++i) {\n        // Little-endian 16-bit PCM: mask the low byte so it is not sign-extended.\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = s / 32768.0f;\n      }\n\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected() && !printed) {\n        System.out.println(\"Detected speech\");\n        printed = true;\n      }\n\n      if (!vad.isSpeechDetected()) {\n        printed = false;\n      }\n\n      while (!vad.empty()) {\n        SpeechSegment segment = vad.front();\n        float startTime = segment.getStart() / (float) sampleRate;\n        float duration = segment.getSamples().length / (float) sampleRate;\n\n        OfflineStream stream = recognizer.createStream();\n        stream.acceptWaveform(segment.getSamples(), sampleRate);\n        recognizer.decode(stream);\n        String text = recognizer.getResult(stream).getText();\n        stream.release();\n\n        if (!text.isEmpty()) {\n          System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n        }\n\n        if (text.contains(\"退出程序\")) {\n          running = false;\n        }\n\n        vad.pop();\n      }\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadFromMicWithNonStreamingWhisper.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming Whisper tiny.en\n// for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport javax.sound.sampled.*;\n\npublic class VadFromMicNonStreamingWhisper {\n  private static final int sampleRate = 16000;\n  private static final int windowSize = 512;\n\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(windowSize)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(sampleRate)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n    // to download model files\n    String encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\";\n    String decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\";\n    String tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\";\n\n    OfflineWhisperModelConfig whisper =\n        OfflineWhisperModelConfig.builder().setEncoder(encoder).setDecoder(decoder).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setWhisper(whisper)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    
OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/AudioFormat.html\n    // Linear PCM, 16000Hz, 16-bit, 1 channel, signed, little endian\n    AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);\n\n    // https://docs.oracle.com/javase/8/docs/api/javax/sound/sampled/DataLine.Info.html#Info-java.lang.Class-javax.sound.sampled.AudioFormat-int-\n    DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);\n    TargetDataLine targetDataLine;\n    try {\n      targetDataLine = (TargetDataLine) AudioSystem.getLine(info);\n      targetDataLine.open(format);\n      targetDataLine.start();\n    } catch (LineUnavailableException e) {\n      System.out.println(\"Failed to open target data line: \" + e.getMessage());\n      vad.release();\n      recognizer.release();\n      return;\n    }\n\n    boolean printed = false;\n    byte[] buffer = new byte[windowSize * 2];\n    float[] samples = new float[windowSize];\n\n    System.out.println(\"Started. Please speak\");\n    boolean running = true;\n    while (targetDataLine.isOpen() && running) {\n      int n = targetDataLine.read(buffer, 0, buffer.length);\n      if (n <= 0) {\n        System.out.printf(\"Got %d bytes. 
Expected %d bytes.\\n\", n, buffer.length);\n        continue;\n      }\n      for (int i = 0; i != windowSize; ++i) {\n        // Little-endian 16-bit PCM: mask the low byte so it is not sign-extended.\n        int low = buffer[2 * i] & 0xff;\n        int high = buffer[2 * i + 1];\n        int s = (high << 8) | low;\n        samples[i] = s / 32768.0f;\n      }\n\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected() && !printed) {\n        System.out.println(\"Detected speech\");\n        printed = true;\n      }\n\n      if (!vad.isSpeechDetected()) {\n        printed = false;\n      }\n\n      while (!vad.empty()) {\n        SpeechSegment segment = vad.front();\n        float startTime = segment.getStart() / (float) sampleRate;\n        float duration = segment.getSamples().length / (float) sampleRate;\n\n        OfflineStream stream = recognizer.createStream();\n        stream.acceptWaveform(segment.getSamples(), sampleRate);\n        recognizer.decode(stream);\n        String text = recognizer.getResult(stream).getText();\n        stream.release();\n\n        if (!text.isEmpty()) {\n          System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n        }\n\n        if (text.contains(\"exit the program\")) {\n          running = false;\n        }\n\n        vad.pop();\n      }\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadNonStreamingDolphinCtc.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming Dolphin\n// CTC model for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.Arrays;\n\npublic class VadNonStreamingSenseVoice {\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(512)\n            .setMaxSpeechDuration(5.0f)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(16000)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/dolphin/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\";\n\n    OfflineDolphinModelConfig dolphin = OfflineDolphinModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setDolphin(dolphin)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n         
   .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // You can download the test file from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String testWaveFilename = \"./lei-jun-test.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    int numSamples = reader.getSamples().length;\n    int numIter = numSamples / 512;\n\n    for (int i = 0; i != numIter; ++i) {\n      int start = i * 512;\n      int end = start + 512;\n      float[] samples = Arrays.copyOfRange(reader.getSamples(), start, end);\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected()) {\n        while (!vad.empty()) {\n          SpeechSegment segment = vad.front();\n          float startTime = segment.getStart() / 16000.0f;\n          float duration = segment.getSamples().length / 16000.0f;\n\n          OfflineStream stream = recognizer.createStream();\n          stream.acceptWaveform(segment.getSamples(), 16000);\n          recognizer.decode(stream);\n          String text = recognizer.getResult(stream).getText();\n          stream.release();\n\n          if (!text.isEmpty()) {\n            System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n          }\n\n          vad.pop();\n        }\n      }\n    }\n\n    vad.flush();\n    while (!vad.empty()) {\n      SpeechSegment segment = vad.front();\n      float startTime = segment.getStart() / 16000.0f;\n      float duration = segment.getSamples().length / 16000.0f;\n\n      OfflineStream stream = recognizer.createStream();\n      stream.acceptWaveform(segment.getSamples(), 16000);\n      recognizer.decode(stream);\n      String text = recognizer.getResult(stream).getText();\n      stream.release();\n\n      if (!text.isEmpty()) {\n        System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, 
text);\n      }\n\n      vad.pop();\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadNonStreamingParaformer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming Paraformer\n// for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.Arrays;\n\npublic class VadNonStreamingParaformer {\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(512)\n            .setMaxSpeechDuration(5.0f)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(16000)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese-english\n    // to download model files\n    String model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\";\n\n    OfflineParaformerModelConfig paraformer =\n        OfflineParaformerModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setParaformer(paraformer)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            
.setOfflineModelConfig(modelConfig)\n            .setDecodingMethod(\"greedy_search\")\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // You can download the test file from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String testWaveFilename = \"./lei-jun-test.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    int numSamples = reader.getSamples().length;\n    int numIter = numSamples / 512;\n\n    for (int i = 0; i != numIter; ++i) {\n      int start = i * 512;\n      int end = start + 512;\n      float[] samples = Arrays.copyOfRange(reader.getSamples(), start, end);\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected()) {\n        while (!vad.empty()) {\n          SpeechSegment segment = vad.front();\n          float startTime = segment.getStart() / 16000.0f;\n          float duration = segment.getSamples().length / 16000.0f;\n\n          OfflineStream stream = recognizer.createStream();\n          stream.acceptWaveform(segment.getSamples(), 16000);\n          recognizer.decode(stream);\n          String text = recognizer.getResult(stream).getText();\n          stream.release();\n\n          if (!text.isEmpty()) {\n            System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n          }\n\n          vad.pop();\n        }\n      }\n    }\n\n    vad.flush();\n    while (!vad.empty()) {\n      SpeechSegment segment = vad.front();\n      float startTime = segment.getStart() / 16000.0f;\n      float duration = segment.getSamples().length / 16000.0f;\n\n      OfflineStream stream = recognizer.createStream();\n      stream.acceptWaveform(segment.getSamples(), 16000);\n      recognizer.decode(stream);\n      String text = recognizer.getResult(stream).getText();\n      stream.release();\n\n      if 
(!text.isEmpty()) {\n        System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n      }\n\n      vad.pop();\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadNonStreamingSenseVoice.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model with a non-streaming SenseVoiceModel\n// for speech recognition.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.Arrays;\n\npublic class VadNonStreamingSenseVoice {\n  public static Vad createVad() {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(512)\n            .setMaxSpeechDuration(5.0f)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(16000)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    return new Vad(config);\n  }\n\n  public static OfflineRecognizer createOfflineRecognizer() {\n    // please refer to\n    // https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html\n    // to download model files\n    String model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\";\n    String tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\";\n\n    OfflineSenseVoiceModelConfig senseVoice =\n        OfflineSenseVoiceModelConfig.builder().setModel(model).build();\n\n    OfflineModelConfig modelConfig =\n        OfflineModelConfig.builder()\n            .setSenseVoice(senseVoice)\n            .setTokens(tokens)\n            .setNumThreads(1)\n            .setDebug(true)\n            .build();\n\n    OfflineRecognizerConfig config =\n        OfflineRecognizerConfig.builder()\n            .setOfflineModelConfig(modelConfig)\n            
.setDecodingMethod(\"greedy_search\")\n            .build();\n\n    return new OfflineRecognizer(config);\n  }\n\n  public static void main(String[] args) {\n\n    Vad vad = createVad();\n    OfflineRecognizer recognizer = createOfflineRecognizer();\n\n    // You can download the test file from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String testWaveFilename = \"./lei-jun-test.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    int numSamples = reader.getSamples().length;\n    int numIter = numSamples / 512;\n\n    for (int i = 0; i != numIter; ++i) {\n      int start = i * 512;\n      int end = start + 512;\n      float[] samples = Arrays.copyOfRange(reader.getSamples(), start, end);\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected()) {\n        while (!vad.empty()) {\n          SpeechSegment segment = vad.front();\n          float startTime = segment.getStart() / 16000.0f;\n          float duration = segment.getSamples().length / 16000.0f;\n\n          OfflineStream stream = recognizer.createStream();\n          stream.acceptWaveform(segment.getSamples(), 16000);\n          recognizer.decode(stream);\n          String text = recognizer.getResult(stream).getText();\n          stream.release();\n\n          if (!text.isEmpty()) {\n            System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n          }\n\n          vad.pop();\n        }\n      }\n    }\n\n    vad.flush();\n    while (!vad.empty()) {\n      SpeechSegment segment = vad.front();\n      float startTime = segment.getStart() / 16000.0f;\n      float duration = segment.getSamples().length / 16000.0f;\n\n      OfflineStream stream = recognizer.createStream();\n      stream.acceptWaveform(segment.getSamples(), 16000);\n      recognizer.decode(stream);\n      String text = recognizer.getResult(stream).getText();\n      stream.release();\n\n      if (!text.isEmpty()) {\n        
System.out.printf(\"%.3f--%.3f: %s\\n\", startTime, startTime + duration, text);\n      }\n\n      vad.pop();\n    }\n\n    vad.release();\n    recognizer.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VadRemoveSilence.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\n// This file shows how to use a silero_vad model to remove silences from\n// a wave file.\n\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.ArrayList;\nimport java.util.Arrays;\n\npublic class VadRemoveSilence {\n  public static void main(String[] args) {\n    // please download ./silero_vad.onnx from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String model = \"./silero_vad.onnx\";\n    SileroVadModelConfig sileroVad =\n        SileroVadModelConfig.builder()\n            .setModel(model)\n            .setThreshold(0.5f)\n            .setMinSilenceDuration(0.25f)\n            .setMinSpeechDuration(0.5f)\n            .setWindowSize(512)\n            .setMaxSpeechDuration(5.0f)\n            .build();\n\n    VadModelConfig config =\n        VadModelConfig.builder()\n            .setSileroVadModelConfig(sileroVad)\n            .setSampleRate(16000)\n            .setNumThreads(1)\n            .setDebug(true)\n            .setProvider(\"cpu\")\n            .build();\n\n    Vad vad = new Vad(config);\n\n    // You can download the test file from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n    String testWaveFilename = \"./lei-jun-test.wav\";\n    WaveReader reader = new WaveReader(testWaveFilename);\n\n    int numSamples = reader.getSamples().length;\n    int numIter = numSamples / 512;\n\n    ArrayList<float[]> segments = new ArrayList<float[]>();\n\n    for (int i = 0; i != numIter; ++i) {\n      int start = i * 512;\n      int end = start + 512;\n      float[] samples = Arrays.copyOfRange(reader.getSamples(), start, end);\n      vad.acceptWaveform(samples);\n      if (vad.isSpeechDetected()) {\n        while (!vad.empty()) {\n\n          // if you want to get the starting time of this segment, you can use\n          /* float startTime = vad.front().getStart() / 16000.0f; */\n\n          segments.add(vad.front().getSamples());\n          vad.pop();\n       
 }\n      }\n    }\n\n    vad.flush();\n    while (!vad.empty()) {\n\n      // if you want to get the starting time of this segment, you can use\n      /* float startTime = vad.front().getStart() / 16000.0f; */\n\n      segments.add(vad.front().getSamples());\n      vad.pop();\n    }\n\n    // get total number of samples\n    int n = 0;\n    for (float[] s : segments) {\n      n += s.length;\n    }\n\n    float[] allSamples = new float[n];\n    int i = 0;\n    for (float[] s : segments) {\n      System.arraycopy(s, 0, allSamples, i, s.length);\n      i += s.length;\n    }\n\n    String outFilename = \"lei-jun-test-no-silence.wav\";\n    WaveWriter.write(outFilename, allSamples, 16000);\n    System.out.printf(\"Saved to %s\\n\", outFilename);\n\n    vad.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/VersionTest.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\nimport com.k2fsa.sherpa.onnx.*;\n\npublic class VersionTest {\n  public static void main(String[] args) {\n    System.out.printf(\"sherpa-onnx version: %s\\n\", VersionInfo.getVersion());\n    System.out.printf(\"sherpa-onnx gitSha1: %s\\n\", VersionInfo.getGitSha1());\n    System.out.printf(\"sherpa-onnx gitDate: %s\\n\", VersionInfo.getGitDate());\n  }\n}\n"
  },
  {
    "path": "java-api-examples/ZipVoiceTts.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\n// This file shows how to use a ZipVoice Chinese/English model\n// for zero-shot text to speech.\nimport com.k2fsa.sherpa.onnx.*;\nimport java.util.HashMap;\nimport java.util.Map;\n\npublic class ZipVoiceTts {\n  public static void main(String[] args) {\n    LibraryUtils.enableDebug();\n    // please visit\n    // https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n    // to download model files\n    String modelDir = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\";\n    String referenceAudioFilename = modelDir + \"/test_wavs/leijun-1.wav\";\n    String text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\";\n    String referenceText = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\";\n\n    OfflineTtsZipVoiceModelConfig zipvoiceModelConfig =\n        OfflineTtsZipVoiceModelConfig.builder()\n            .setTokens(modelDir + \"/tokens.txt\")\n            .setEncoder(modelDir + \"/encoder.int8.onnx\")\n            .setDecoder(modelDir + \"/decoder.int8.onnx\")\n            .setVocoder(\"./vocos_24khz.onnx\")\n            .setDataDir(modelDir + \"/espeak-ng-data\")\n            .setLexicon(modelDir + \"/lexicon.txt\")\n            .build();\n\n    OfflineTtsModelConfig modelConfig =\n        OfflineTtsModelConfig.builder()\n            .setZipvoice(zipvoiceModelConfig)\n            .setNumThreads(2)\n            .setDebug(false)\n            .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder().setModel(modelConfig).build();\n    OfflineTts tts = new OfflineTts(config);\n\n    WaveReader reader = new WaveReader(referenceAudioFilename);\n\n    GenerationConfig genConfig = new GenerationConfig();\n    genConfig.setReferenceAudio(reader.getSamples());\n    genConfig.setReferenceSampleRate(reader.getSampleRate());\n    genConfig.setReferenceText(referenceText);\n    genConfig.setNumSteps(4);\n\n    Map<String, String> extra = new HashMap<>();\n    extra.put(\"min_char_in_sentence\", \"10\");\n    
genConfig.setExtra(extra);\n\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio =\n        tts.generateWithConfigAndCallback(\n            text,\n            genConfig,\n            samples -> {\n              System.out.println(\"callback got called with \" + samples.length + \" samples\");\n              return 1;\n            });\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float realTimeFactor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"generated-zipvoice-zh-en-java.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", realTimeFactor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n"
  },
  {
    "path": "java-api-examples/run-audio-tagging-ced-from-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./AudioTaggingCEDFromFile.java\n"
  },
  {
    "path": "java-api-examples/run-audio-tagging-zipformer-from-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./AudioTaggingZipformerFromFile.java\n"
  },
  {
    "path": "java-api-examples/run-inverse-text-normalization-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  InverseTextNormalizationNonStreamingParaformer.java\n"
  },
  {
    "path": "java-api-examples/run-inverse-text-normalization-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  InverseTextNormalizationStreamingTransducer.java\n"
  },
  {
    "path": "java-api-examples/run-kws-from-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./KeywordSpotterFromFile.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileDolphinCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nfi\n\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileFireRedAsrCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileFireRedAsr.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-funasr-nano.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileFunAsrNano.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-medasr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileMedAsrCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-moonshine-v2.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileMoonshineV2.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileMoonshine.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-nemo-canary.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileNemoCanary.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-nemo.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-nemo-ctc-en-citrinet-512/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\n  tar xvf sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\n  rm sherpa-onnx-nemo-ctc-en-citrinet-512.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileNemo.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-omnilingual-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileOmnilingualAsrCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileParaformer.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-sense-voice-with-hr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -d dict ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n  tar xf dict.tar.bz2\n  rm dict.tar.bz2\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileSenseVoiceWithHr.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileSenseVoice.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-tele-speech-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./NonStreamingDecodeFileTeleSpeechCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-transducer-hotwords.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2\n  tar xvf sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2\n  rm sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2\nfi\n\nif [ ! -f hotwords_cn.txt ]; then\n  cat > hotwords_cn.txt <<EOF\n朱丽楠\nEOF\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileTransducerHotwords.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n  rm sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileTransducer.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-wenet-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileWenetCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-whisper-multiple.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileWhisperMultiple.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileWhisper.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-decode-file-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingDecodeFileZipformerCtc.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingSpeechEnhancementDpdfNet.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingSpeechEnhancementGtcrn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-coqui-de.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\nif [ ! -f ./vits-coqui-de-css10/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\n  tar xf vits-coqui-de-css10.tar.bz2\n  rm vits-coqui-de-css10.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsCoquiDe.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\n# to download more models\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsKittenEn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsKokoroEn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsKokoroZhEn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsMatchaEn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsMatchaZh.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-piper-en-with-callback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\nif [ ! -f ./vits-piper-en_GB-cori-medium/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2\n  tar xf vits-piper-en_GB-cori-medium.tar.bz2\n  rm vits-piper-en_GB-cori-medium.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsPiperEnWithCallback.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-piper-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\nif [ ! -f ./vits-piper-en_GB-cori-medium/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2\n  tar xf vits-piper-en_GB-cori-medium.tar.bz2\n  rm vits-piper-en_GB-cori-medium.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsPiperEn.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-tts-vits-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\nif [ ! -f ./vits-zh-hf-fanchen-C/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-hf-fanchen-C.tar.bz2\n  tar xf vits-zh-hf-fanchen-C.tar.bz2\n  rm vits-zh-hf-fanchen-C.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingTtsVitsZh.java\n"
  },
  {
    "path": "java-api-examples/run-non-streaming-websocket-client.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f zh.wav ]; then\n  # wget https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\n  wget https://hf-mirror.com/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/test_wavs/zh.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  NonStreamingWebsocketClient.java\n"
  },
  {
    "path": "java-api-examples/run-offline-add-punctuation-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./OfflineAddPunctuation.java\n"
  },
  {
    "path": "java-api-examples/run-offline-speaker-diarization.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./OfflineSpeakerDiarizationDemo.java\n"
  },
  {
    "path": "java-api-examples/run-online-add-punctuation-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./OnlineAddPunctuation.java\n"
  },
  {
    "path": "java-api-examples/run-pocket-tts.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\nif false; then\n  javac \\\n    -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n    PocketTts.java\n  javap -p -s PocketTts.class\n  # Quote the name so the shell does not expand $1 in the inner-class file name\n  javap -p -s 'PocketTts$1.class'\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  PocketTts.java\n"
  },
  {
    "path": "java-api-examples/run-speaker-identification.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./sr-data/enroll/leijun-sr-1.wav ]; then\n  curl -SL -o sr-data.tar.gz https://github.com/csukuangfj/sr-data/archive/refs/tags/v1.0.0.tar.gz\n  tar xvf sr-data.tar.gz\n  mv sr-data-1.0.0 sr-data\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./SpeakerIdentification.java\n"
  },
  {
    "path": "java-api-examples/run-spoken-language-identification-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# Note: this example needs a multilingual whisper model. For example, tiny works while tiny.en does not.\n# https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\nif [ ! -f ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\nif [ ! -f ./spoken-language-identification-test-wavs/en-english.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\n  tar xvf spoken-language-identification-test-wavs.tar.bz2\n  rm spoken-language-identification-test-wavs.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./SpokenLanguageIdentificationWhisper.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-asr-from-mic-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./StreamingAsrFromMicTransducer.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-decode-file-ctc-hlg.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingDecodeFileCtcHLG.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-decode-file-ctc.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingDecodeFileCtc.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-decode-file-paraformer.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingDecodeFileParaformer.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-decode-file-tone-ctc.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingDecodeFileToneCtc.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-decode-file-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingDecodeFileTransducer.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingSpeechEnhancementDpdfNet.java\n"
  },
  {
    "path": "java-api-examples/run-streaming-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  StreamingSpeechEnhancementGtcrn.java\n"
  },
  {
    "path": "java-api-examples/run-supertonic-tts.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n# to download more models\n\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  SupertonicTts.java\n"
  },
  {
    "path": "java-api-examples/run-ten-vad-remove-silence.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./ten-vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./TenVadRemoveSilence.java\n"
  },
  {
    "path": "java-api-examples/run-vad-from-mic-non-streaming-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadFromMicWithNonStreamingMoonshine.java\n"
  },
  {
    "path": "java-api-examples/run-vad-from-mic-non-streaming-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadFromMicWithNonStreamingParaformer.java\n"
  },
  {
    "path": "java-api-examples/run-vad-from-mic-non-streaming-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadFromMicWithNonStreamingSenseVoice.java\n"
  },
  {
    "path": "java-api-examples/run-vad-from-mic-non-streaming-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadFromMicWithNonStreamingWhisper.java\n"
  },
  {
    "path": "java-api-examples/run-vad-from-mic.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadFromMic.java\n"
  },
  {
    "path": "java-api-examples/run-vad-non-streaming-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadNonStreamingDolphinCtc.java\n"
  },
  {
    "path": "java-api-examples/run-vad-non-streaming-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadNonStreamingParaformer.java\n"
  },
  {
    "path": "java-api-examples/run-vad-non-streaming-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadNonStreamingSenseVoice.java\n"
  },
  {
    "path": "java-api-examples/run-vad-remove-silence.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\nif [ ! -f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VadRemoveSilence.java\n"
  },
  {
    "path": "java-api-examples/run-version-test.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ./VersionTest.java\n\n"
  },
  {
    "path": "java-api-examples/run-zipvoice-tts.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [[ ! -f ../build/lib/libsherpa-onnx-jni.dylib  && ! -f ../build/lib/libsherpa-onnx-jni.so ]]; then\n  mkdir -p ../build\n  pushd ../build\n  cmake \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    -DSHERPA_ONNX_ENABLE_JNI=ON \\\n    ..\n\n  make -j4\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ../sherpa-onnx/java-api/build/sherpa-onnx.jar ]; then\n  pushd ../sherpa-onnx/java-api\n  make\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n# to download more models\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\njava \\\n  -Djava.library.path=$PWD/../build/lib \\\n  -cp ../sherpa-onnx/java-api/build/sherpa-onnx.jar \\\n  ZipVoiceTts.java\n"
  },
  {
    "path": "java-api-examples/src/websocketsrv/AsrWebsocketClient.java",
    "content": "/*\n * // Copyright 2022-2023 by zhaomingwork\n */\n// java AsrWebsocketClient\n// usage: AsrWebsocketClient soPath srvIp srvPort wavPath numThreads\npackage websocketsrv;\n\nimport com.k2fsa.sherpa.onnx.OnlineRecognizer;\nimport java.net.URI;\nimport java.net.URISyntaxException;\nimport java.nio.*;\nimport java.util.Map;\nimport org.java_websocket.client.WebSocketClient;\nimport org.java_websocket.drafts.Draft;\nimport org.java_websocket.handshake.ServerHandshake;\nimport org.slf4j.Logger;\nimport org.slf4j.LoggerFactory;\n\n/** This example demonstrates how to connect to websocket server. */\npublic class AsrWebsocketClient extends WebSocketClient {\n  private static final Logger logger = LoggerFactory.getLogger(AsrWebsocketClient.class);\n\n  public AsrWebsocketClient(URI serverUri, Draft draft) {\n    super(serverUri, draft);\n  }\n\n  public AsrWebsocketClient(URI serverURI) {\n    super(serverURI);\n  }\n\n  public AsrWebsocketClient(URI serverUri, Map<String, String> httpHeaders) {\n    super(serverUri, httpHeaders);\n  }\n\n  @Override\n  public void onOpen(ServerHandshake handshakedata) {\n\n    float[] floats = OnlineRecognizer.readWavFile(AsrWebsocketClient.wavPath);\n    ByteBuffer buffer =\n        ByteBuffer.allocate(4 * floats.length)\n            .order(ByteOrder.LITTLE_ENDIAN); // float is sizeof 4. allocate enough buffer\n\n    for (float f : floats) {\n      buffer.putFloat(f);\n    }\n    buffer.rewind();\n    buffer.flip();\n    buffer.order(ByteOrder.LITTLE_ENDIAN);\n\n    send(buffer.array()); // send buf to server\n    send(\"Done\"); // send 'Done' means finished\n  }\n\n  @Override\n  public void onMessage(String message) {\n\n    logger.info(\"received: \" + message);\n  }\n\n  @Override\n  public void onClose(int code, String reason, boolean remote) {\n\n    logger.info(\n        \"Connection closed by \"\n            + (remote ? 
\"remote peer\" : \"us\")\n            + \" Code: \"\n            + code\n            + \" Reason: \"\n            + reason);\n  }\n\n  @Override\n  public void onError(Exception ex) {\n    ex.printStackTrace();\n    // if the error is fatal then onClose will be called additionally\n  }\n\n  public static OnlineRecognizer rcgobj;\n  public static String wavPath;\n\n  public static void main(String[] args) throws URISyntaxException {\n\n    if (args.length != 5) {\n      System.out.println(\"usage: AsrWebsocketClient soPath srvIp srvPort wavPath numThreads\");\n      return;\n    }\n\n    String soPath = args[0];\n    String srvIp = args[1];\n    String srvPort = args[2];\n    String wavPath = args[3];\n    int numThreads = Integer.parseInt(args[4]);\n    System.out.println(\"serIp=\" + srvIp + \",srvPort=\" + srvPort + \",wavPath=\" + wavPath);\n\n    class ClientThread implements Runnable {\n\n      String soPath;\n      String srvIp;\n      String srvPort;\n      String wavPath;\n\n      ClientThread(String soPath, String srvIp, String srvPort, String wavPath) {\n        this.soPath = soPath;\n        this.srvIp = srvIp;\n        this.srvPort = srvPort;\n        this.wavPath = wavPath;\n      }\n\n      public void run() {\n        try {\n\n          OnlineRecognizer.setSoPath(soPath);\n\n          AsrWebsocketClient.wavPath = wavPath;\n\n          String wsAddress = \"ws://\" + srvIp + \":\" + srvPort;\n          AsrWebsocketClient c = new AsrWebsocketClient(new URI(wsAddress));\n\n          c.connect();\n        } catch (Exception e) {\n          e.printStackTrace();\n        }\n      }\n    }\n    for (int i = 0; i < numThreads; i++) {\n      System.out.println(\"Thread1 is running...\");\n      Thread t = new Thread(new ClientThread(soPath, srvIp, srvPort, wavPath));\n      t.start();\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/src/websocketsrv/AsrWebsocketServer.java",
    "content": "/*\n * // Copyright 2022-2023 by zhaoming\n */\n// java websocketServer\n// usage: AsrWebsocketServer soPath modelCfgPath\npackage websocketsrv;\n\nimport com.k2fsa.sherpa.onnx.OnlineRecognizer;\nimport com.k2fsa.sherpa.onnx.OnlineStream;\nimport java.io.*;\nimport java.io.IOException;\nimport java.net.InetSocketAddress;\nimport java.net.UnknownHostException;\nimport java.nio.ByteBuffer;\nimport java.nio.ByteOrder;\nimport java.nio.FloatBuffer;\nimport java.util.*;\nimport java.util.Collections;\nimport java.util.concurrent.*;\nimport java.util.concurrent.LinkedBlockingQueue;\nimport org.java_websocket.WebSocket;\nimport org.java_websocket.drafts.Draft;\nimport org.java_websocket.drafts.Draft_6455;\nimport org.java_websocket.handshake.ClientHandshake;\nimport org.java_websocket.server.WebSocketServer;\nimport org.slf4j.Logger;\nimport org.slf4j.LoggerFactory;\n\n/**\n * AsrWebSocketServer has three threads pools, one pool for network io, one pool for asr stream and\n * one pool for asr decoder.\n */\npublic class AsrWebsocketServer extends WebSocketServer {\n  private static final Logger logger = LoggerFactory.getLogger(AsrWebsocketServer.class);\n  //  Queue between io network io thread pool and stream thread pool, use websocket as the key\n  private LinkedBlockingQueue<WebSocket> streamQueue = new LinkedBlockingQueue<WebSocket>();\n  //  Queue waiting for deocdeing, use websocket as the key\n  private LinkedBlockingQueue<WebSocket> decoderQueue = new LinkedBlockingQueue<WebSocket>();\n\n  // recognizer object\n  private OnlineRecognizer rcgOjb = null;\n\n  // mapping between websocket connection and connection data\n  private ConcurrentHashMap<WebSocket, ConnectionData> connectionMap =\n      new ConcurrentHashMap<WebSocket, ConnectionData>();\n\n  public AsrWebsocketServer(int port, int numThread) throws UnknownHostException {\n    // server port and num of threads for  network io\n    super(new InetSocketAddress(port), numThread);\n  }\n\n  
public AsrWebsocketServer(InetSocketAddress address) {\n    super(address);\n  }\n\n  public AsrWebsocketServer(int port, Draft_6455 draft) {\n    super(new InetSocketAddress(port), Collections.<Draft>singletonList(draft));\n  }\n\n  @Override\n  public void onOpen(WebSocket conn, ClientHandshake handshake) {}\n\n  @Override\n  public void onClose(WebSocket conn, int code, String reason, boolean remote) {\n    connectionMap.remove(conn);\n    logger.info(\n        conn\n            + \" remove one connection!, now connection number=\"\n            + String.valueOf(connectionMap.size()));\n  }\n\n  @Override\n  public void onMessage(WebSocket conn, String message) {\n    // this is text message\n    try {\n      // if rec \"Done\" msg from client\n      if (message.equals(\"Done\")) {\n        ConnectionData connData = creatOrGetConnectionData(conn);\n        connData.setEof(true);\n        if (!streamQueueFind(conn)) {\n          streamQueue.put(conn);\n        }\n      }\n\n    } catch (Exception e) {\n      e.printStackTrace();\n    }\n  }\n\n  private ConnectionData creatOrGetConnectionData(WebSocket conn) {\n    // create a new connection data if not in connection map or return the existed one\n\n    ConnectionData connData = null;\n    try {\n      if (!connectionMap.containsKey(conn)) {\n        OnlineStream stream = rcgOjb.createStream();\n        connData = new ConnectionData(conn, stream);\n        connectionMap.put(conn, connData);\n      } else {\n        connData = connectionMap.get(conn);\n      }\n\n      logger.info(\n          conn.getRemoteSocketAddress().getAddress().getHostAddress()\n              + \" open one connection,, now connection number=\"\n              + String.valueOf(connectionMap.size()));\n\n    } catch (Exception e) {\n      System.err.println(e);\n      e.printStackTrace();\n    }\n    return connData;\n  }\n\n  @Override\n  public void onMessage(WebSocket conn, ByteBuffer blob) {\n    try {\n\n      // for handle binary data\n   
   blob.order(ByteOrder.LITTLE_ENDIAN); // set little endian\n\n      // set to float\n      FloatBuffer floatbuf = blob.asFloatBuffer();\n\n      if (floatbuf.capacity() > 0) {\n        // allocate memory for float data\n        float[] arr = new float[floatbuf.capacity()];\n\n        floatbuf.get(arr);\n        ConnectionData connData = creatOrGetConnectionData(conn);\n        // put websocket  to stream queue with binary type==1\n        connData.addSamplesToData(arr);\n\n        if (!streamQueueFind(conn)) {\n          streamQueue.put(conn);\n        }\n      }\n    } catch (Exception e) {\n      e.printStackTrace();\n    }\n  }\n\n  public boolean streamQueueFind(WebSocket conn) {\n    return streamQueue.contains(conn);\n  }\n\n  public void initModelWithCfg(Map<String, String> cfgMap, String cfgPath) {\n    try {\n\n      rcgOjb = new OnlineRecognizer(cfgPath);\n      // size of stream thread pool\n      int streamThreadNum = Integer.valueOf(cfgMap.getOrDefault(\"stream_thread_num\", \"16\"));\n      // size of decoder thread pool\n      int decoderThreadNum = Integer.valueOf(cfgMap.getOrDefault(\"decoder_thread_num\", \"16\"));\n\n      // time(ms) idle for decoder thread when no job\n      int decoderTimeIdle = Integer.valueOf(cfgMap.getOrDefault(\"decoder_time_idle\", \"200\"));\n      // size of streams for parallel decoding\n      int parallelDecoderNum = Integer.valueOf(cfgMap.getOrDefault(\"parallel_decoder_num\", \"16\"));\n      // time(ms) out for connection data\n      int deocderTimeOut = Integer.valueOf(cfgMap.getOrDefault(\"deocder_time_out\", \"30000\"));\n\n      // create stream threads\n      for (int i = 0; i < streamThreadNum; i++) {\n        new StreamThreadHandler(streamQueue, decoderQueue, connectionMap).start();\n      }\n      // create decoder threads\n      for (int i = 0; i < decoderThreadNum; i++) {\n        new DecoderThreadHandler(\n                decoderQueue,\n                connectionMap,\n                rcgOjb,\n          
      decoderTimeIdle,\n                parallelDecoderNum,\n                deocderTimeOut)\n            .start();\n      }\n    } catch (Exception e) {\n      System.err.println(e);\n      e.printStackTrace();\n    }\n  }\n\n  public static Map<String, String> readProperties(String CfgPath) {\n    // read and parse config file\n    Properties props = new Properties();\n    Map<String, String> proMap = new HashMap<String, String>();\n    try {\n\n      File file = new File(CfgPath);\n      if (!file.exists()) {\n        logger.info(String.valueOf(CfgPath) + \" cfg file not exists!\");\n        System.exit(0);\n      }\n      InputStream in = new BufferedInputStream(new FileInputStream(CfgPath));\n      props.load(in);\n      Enumeration en = props.propertyNames();\n      while (en.hasMoreElements()) {\n        String key = (String) en.nextElement();\n        String Property = props.getProperty(key);\n        proMap.put(key, Property);\n      }\n\n    } catch (Exception e) {\n      e.printStackTrace();\n    }\n    return proMap;\n  }\n\n  public static void main(String[] args) throws InterruptedException, IOException {\n    if (args.length != 2) {\n      logger.info(\"usage: AsrWebsocketServer soPath modelCfgPath\");\n\n      return;\n    }\n\n    String soPath = args[0];\n    String cfgPath = args[1];\n\n    OnlineRecognizer.setSoPath(soPath);\n    logger.info(\"readProperties\");\n    Map<String, String> cfgMap = AsrWebsocketServer.readProperties(cfgPath);\n    int port = Integer.valueOf(cfgMap.getOrDefault(\"port\", \"8890\"));\n\n    int connectionThreadNum = Integer.valueOf(cfgMap.getOrDefault(\"connection_thread_num\", \"16\"));\n    AsrWebsocketServer s = new AsrWebsocketServer(port, connectionThreadNum);\n    logger.info(\"initModelWithCfg\");\n    s.initModelWithCfg(cfgMap, cfgPath);\n    logger.info(\"Server started on port: \" + s.getPort());\n    s.start();\n  }\n\n  @Override\n  public void onError(WebSocket conn, Exception ex) {\n    
ex.printStackTrace();\n    if (conn != null) {\n      // some errors like port binding failed may not be assignable to a specific websocket\n    }\n  }\n\n  @Override\n  public void onStart() {\n    logger.info(\"Server started!\");\n    setConnectionLostTimeout(0);\n    setConnectionLostTimeout(100);\n  }\n}\n"
  },
  {
    "path": "java-api-examples/src/websocketsrv/ConnectionData.java",
    "content": "/*\n * // Copyright 2022-2023 by zhaoming\n */\n// connection data acts as a bridge between the different thread pools\n\npackage websocketsrv;\n\nimport com.k2fsa.sherpa.onnx.OnlineStream;\nimport java.time.LocalDateTime;\nimport java.util.Queue;\nimport java.util.concurrent.*;\nimport org.java_websocket.WebSocket;\n\npublic class ConnectionData {\n\n  private WebSocket webSocket; // the websocket for this connection data\n\n  private OnlineStream stream; // connection stream\n\n  // binary data received from the client; thread-safe because the network io threads add\n  // samples while the stream threads drain them\n  private Queue<float[]> queueSamples = new ConcurrentLinkedQueue<float[]>();\n\n  private volatile boolean eof = false; // the client has sent all of its data\n\n  private volatile LocalDateTime lastHandleTime; // last handled time, used for the timeout check\n\n  public ConnectionData(WebSocket webSocket, OnlineStream stream) {\n    this.webSocket = webSocket;\n\n    this.stream = stream;\n  }\n\n  public void addSamplesToData(float[] samples) {\n    this.queueSamples.add(samples);\n  }\n\n  public LocalDateTime getLastHandleTime() {\n    return this.lastHandleTime;\n  }\n\n  public void setLastHandleTime(LocalDateTime now) {\n    this.lastHandleTime = now;\n  }\n\n  public boolean getEof() {\n    return this.eof;\n  }\n\n  public void setEof(boolean eof) {\n    this.eof = eof;\n  }\n\n  public WebSocket getWebSocket() {\n    return this.webSocket;\n  }\n\n  public Queue<float[]> getQueueSamples() {\n    return this.queueSamples;\n  }\n\n  public OnlineStream getStream() {\n    return this.stream;\n  }\n}\n"
  },
  {
    "path": "java-api-examples/src/websocketsrv/DecoderThreadHandler.java",
    "content": "/*\n * // Copyright 2022-2023 by zhaoming\n */\n// java DecoderThreadHandler\npackage websocketsrv;\n\nimport com.k2fsa.sherpa.onnx.OnlineRecognizer;\nimport com.k2fsa.sherpa.onnx.OnlineStream;\nimport java.nio.*;\nimport java.nio.charset.StandardCharsets;\nimport java.time.LocalDateTime;\nimport java.util.*;\nimport java.util.List;\nimport java.util.concurrent.*;\nimport java.util.concurrent.LinkedBlockingQueue;\nimport org.java_websocket.WebSocket;\nimport org.java_websocket.drafts.Draft;\nimport org.java_websocket.framing.Framedata;\nimport org.slf4j.Logger;\nimport org.slf4j.LoggerFactory;\n\npublic class DecoderThreadHandler extends Thread {\n  private static final Logger logger = LoggerFactory.getLogger(DecoderThreadHandler.class);\n  // Websocket Queue that waiting for decoding\n  private LinkedBlockingQueue<WebSocket> decoderQueue;\n  // the mapping between websocket and connection data\n  private ConcurrentHashMap<WebSocket, ConnectionData> connMap;\n\n  private OnlineRecognizer rcgOjb = null; // recgnizer object\n\n  // connection data list for this thread to decode in parallel\n  private List<ConnectionData> connDataList = new ArrayList<ConnectionData>();\n\n  private int parallelDecoderNum = 10; // parallel decoding number\n  private int deocderTimeIdle = 10; // idle time(ms) when no job\n  private int deocderTimeOut = 3000; // if it is timeout(ms), the connection data will be removed\n\n  public DecoderThreadHandler(\n      LinkedBlockingQueue<WebSocket> decoderQueue,\n      ConcurrentHashMap<WebSocket, ConnectionData> connMap,\n      OnlineRecognizer rcgOjb,\n      int deocderTimeIdle,\n      int parallelDecoderNum,\n      int deocderTimeOut) {\n    this.decoderQueue = decoderQueue;\n    this.connMap = connMap;\n    this.rcgOjb = rcgOjb;\n    this.deocderTimeIdle = deocderTimeIdle;\n    this.parallelDecoderNum = parallelDecoderNum;\n    this.deocderTimeOut = deocderTimeOut;\n  }\n\n  public void run() {\n    while (true) {\n      try 
{\n        // time(ms) idle  if there is no job\n\n        Thread.sleep(deocderTimeIdle);\n        // clear data list for this threads\n        connDataList.clear();\n        if (rcgOjb == null) continue;\n\n        // loop for total decoder Queue\n        while (!decoderQueue.isEmpty()) {\n\n          // get websocket\n          WebSocket conn = decoderQueue.take();\n          // get connection data according to websocket\n          ConnectionData connData = connMap.get(conn);\n\n          // if the websocket closed, continue\n          if (connData == null) continue;\n          // get the stream\n          OnlineStream stream = connData.getStream();\n\n          // put to decoder list if 1) stream is ready; 2) and\n          // size not > parallelDecoderNum\n          if ((rcgOjb.isReady(stream) && connDataList.size() < parallelDecoderNum)) {\n\n            // add to this thread's decoder list\n            connDataList.add(connData);\n            // change the handled time for this connection data\n            connData.setLastHandleTime(LocalDateTime.now());\n          }\n          // break when decoder list size >= parallelDecoderNum\n          if (connDataList.size() >= parallelDecoderNum) {\n            break;\n          }\n        }\n\n        // if decoder data list for this thread >0\n        if (connDataList.size() > 0) {\n\n          // create a stream array for parallel decoding\n          OnlineStream[] arr = new OnlineStream[connDataList.size()];\n          for (int i = 0; i < connDataList.size(); i++) {\n\n            arr[i] = connDataList.get(i).getStream();\n          }\n\n          // parallel decoding\n          rcgOjb.decodeStreams(arr);\n        }\n\n        // get result for each connection\n        for (ConnectionData connData : connDataList) {\n\n          OnlineStream stream = connData.getStream();\n          WebSocket webSocket = connData.getWebSocket();\n\n          String txtResult = rcgOjb.getResult(stream);\n\n          // decode text 
in utf-8\n          byte[] utf8Data = txtResult.getBytes(StandardCharsets.UTF_8);\n\n          boolean isEof = (connData.getEof() == true && !rcgOjb.isReady(stream));\n          // result\n          if (utf8Data.length > 0) {\n\n            String jsonResult =\n                \"{\\\"text\\\":\\\"\" + txtResult + \"\\\",\\\"eof\\\":\" + String.valueOf(isEof) + \"}\";\n\n            if (webSocket.isOpen()) {\n              // create a TEXT frame to send back the json result\n              Draft draft = webSocket.getDraft();\n              List<Framedata> frames = null;\n              frames = draft.createFrames(jsonResult, false);\n              // send to client\n              webSocket.sendFrame(frames);\n            }\n          }\n        }\n        // loop for each connection data in this thread\n        for (ConnectionData connData : connDataList) {\n          OnlineStream stream = connData.getStream();\n          WebSocket webSocket = connData.getWebSocket();\n          // if the stream is still ready, put it to decoder Queue again for next decoding\n          if (rcgOjb.isReady(stream)) {\n            decoderQueue.put(webSocket);\n          }\n          // the duration between last handled time and now\n          java.time.Duration duration =\n              java.time.Duration.between(connData.getLastHandleTime(), LocalDateTime.now());\n          // close the websocket if 1) data is done and the stream is not ready; 2) or the data\n          // has timed out; 3) or the connection is closed\n          if ((connData.getEof() == true\n                  && !rcgOjb.isReady(stream)\n                  && connData.getQueueSamples().isEmpty())\n              || duration.toMillis() > deocderTimeOut\n              || !connData.getWebSocket().isOpen()) {\n\n            logger.info(\"close websocket!!!\");\n\n            // delay closing the websocket as data may still be in processing\n            Timer timer = new Timer();\n            timer.schedule(\n                new 
TimerTask() {\n                  public void run() {\n\n                    webSocket.close();\n                  }\n                },\n                5000); // 5 seconds\n          }\n        }\n\n      } catch (Exception e) {\n        e.printStackTrace();\n      }\n    }\n  }\n}\n"
  },
  {
    "path": "java-api-examples/src/websocketsrv/StreamThreadHandler.java",
    "content": "/*\n * // Copyright 2022-2023 by zhaoming\n */\n// java StreamThreadHandler\npackage websocketsrv;\n\nimport com.k2fsa.sherpa.onnx.OnlineStream;\nimport java.nio.*;\nimport java.util.*;\nimport java.util.concurrent.*;\nimport java.util.concurrent.LinkedBlockingQueue;\nimport org.java_websocket.WebSocket;\n// thread for processing stream\n\npublic class StreamThreadHandler extends Thread {\n  //  Queue between io network io thread pool and stream thread pool, use websocket as the key\n  private LinkedBlockingQueue<WebSocket> streamQueue;\n  //  Queue waiting for deocdeing, use websocket as the key\n  private LinkedBlockingQueue<WebSocket> decoderQueue;\n  // mapping between websocket connection and connection data\n  private ConcurrentHashMap<WebSocket, ConnectionData> connMap;\n\n  public StreamThreadHandler(\n      LinkedBlockingQueue<WebSocket> streamQueue,\n      LinkedBlockingQueue<WebSocket> decoderQueue,\n      ConcurrentHashMap<WebSocket, ConnectionData> connMap) {\n    this.streamQueue = streamQueue;\n    this.decoderQueue = decoderQueue;\n    this.connMap = connMap;\n  }\n\n  public void run() {\n    while (true) {\n      try {\n        // fetch one websocket from queue\n        WebSocket conn = (WebSocket) this.streamQueue.take();\n        // get the connection data according to websocket\n        ConnectionData connData = connMap.get(conn);\n        OnlineStream stream = connData.getStream();\n\n        // handle received binary data\n        if (!connData.getQueueSamples().isEmpty()) {\n          // loop to put all received binary data to stream\n          while (!connData.getQueueSamples().isEmpty()) {\n\n            float[] samples = connData.getQueueSamples().poll();\n\n            stream.acceptWaveform(samples);\n          }\n          //  if data is finished\n          if (connData.getEof() == true) {\n\n            stream.inputFinished();\n          }\n          // add this websocket to decoder Queue if not in the Queue\n          
if (!decoderQueue.contains(conn)) {\n\n            decoderQueue.put(conn);\n          }\n        }\n\n      } catch (Exception e) {\n        e.printStackTrace();\n      }\n    }\n  }\n}\n"
  },
  {
    "path": "jitpack.yml",
    "content": "jdk:\n  - openjdk17\n\nbefore_install:\n  - wget https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-1.12.31.aar\n\ninstall:\n  - FILE=\"-Dfile=sherpa-onnx-1.12.31.aar\"\n  - mvn install:install-file $FILE -DgroupId=com.k2fsa.sherpa.onnx -DartifactId=sherpa-onnx -Dversion=1.12.31 -Dpackaging=aar -DgeneratePom=true\n"
  },
  {
    "path": "kotlin-api-examples/.gitignore",
    "content": "hs_err*\nvits-zh-aishell3\n*.jar\n"
  },
  {
    "path": "kotlin-api-examples/faked-asset-manager.kt",
    "content": "package android.content.res\n\nclass AssetManager {}\n"
  },
  {
    "path": "kotlin-api-examples/faked-log.kt",
    "content": "package android.util\n\nclass Log {\n  companion object {\n    fun i(tag: String, msg: String) {\n      println(\"$tag, $msg\")\n    }\n  }\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_audio_tagging.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testAudioTagging()\n}\n\nfun testAudioTagging() {\n  val config = AudioTaggingConfig(\n      model=AudioTaggingModelConfig(\n        zipformer=OfflineZipformerAudioTaggingModelConfig(\n          model=\"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx\",\n        ),\n        numThreads=1,\n        debug=true,\n        provider=\"cpu\",\n      ),\n      labels=\"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv\",\n      topK=5,\n   )\n  val tagger = AudioTagging(config=config)\n\n  val testFiles = arrayOf(\n    \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/1.wav\",\n    \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/2.wav\",\n    \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/3.wav\",\n    \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/4.wav\",\n  )\n  println(\"----------\")\n  for (waveFilename in testFiles) {\n    val stream = tagger.createStream()\n\n    val waveData = WaveReader.readWaveFromFile(\n        filename = waveFilename,\n    )\n\n    stream.acceptWaveform(waveData.samples, sampleRate = waveData.sampleRate)\n    val events = tagger.compute(stream)\n    stream.release()\n\n    println(waveFilename)\n    for (event in events) {\n      println(\"Name: ${event.name}, Index: ${event.index}, Probability: ${event.prob}\")\n    }\n\n    println(\"----------\")\n  }\n\n  tagger.release()\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_itn_offline_asr.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  test()\n}\n\nfun test() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./itn-zh-number.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  val stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  val result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      featConfig = getFeatureConfig(sampleRate = 16000, featureDim = 80),\n      modelConfig = getOfflineModelConfig(0)!!,\n      ruleFsts = \"./itn_zh_number.fst\",\n  )\n\n  return OfflineRecognizer(config = config)\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_itn_online_asr.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  test()\n}\n\nfun test() {\n  val recognizer = createOnlineRecognizer()\n  val waveFilename = \"./itn-zh-number.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  val stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream)\n  }\n\n  val result = recognizer.getResult(stream).text\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\nfun createOnlineRecognizer(): OnlineRecognizer {\n  val config = OnlineRecognizerConfig(\n      featConfig = getFeatureConfig(sampleRate = 16000, featureDim = 80),\n      modelConfig = getModelConfig(8)!!,\n  )\n\n  config.ruleFsts = \"./itn_zh_number.fst\"\n  println(config)\n\n  return OnlineRecognizer(config = config)\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_language_id.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testSpokenLanguageIdentification()\n}\n\nfun testSpokenLanguageIdentification() {\n  val config = SpokenLanguageIdentificationConfig(\n    whisper = SpokenLanguageIdentificationWhisperConfig(\n      encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\",\n      decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\",\n      tailPaddings = 33,\n    ),\n    numThreads=1,\n    debug=true,\n    provider=\"cpu\",\n  )\n  val slid = SpokenLanguageIdentification(config=config)\n\n  val testFiles = arrayOf(\n    \"./spoken-language-identification-test-wavs/ar-arabic.wav\",\n    \"./spoken-language-identification-test-wavs/bg-bulgarian.wav\",\n    \"./spoken-language-identification-test-wavs/de-german.wav\",\n  )\n\n  for (waveFilename in testFiles) {\n    val waveData = WaveReader.readWaveFromFile(\n        filename = waveFilename,\n    )\n\n    val stream = slid.createStream()\n    stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n    val lang = slid.compute(stream)\n    stream.release()\n    println(waveFilename)\n    println(lang)\n  }\n\n  slid.release()\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_asr.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val types = arrayOf(0, 2, 5, 6, 15, 21, 24, 25, 31)\n  for (type in types) {\n    test(type)\n  }\n}\n\nfun test(type: Int) {\n  val recognizer = createOfflineRecognizer(type)\n\n  val waveFilename = when (type) {\n    0 -> \"./sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav\"\n    2 -> \"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\"\n    5 -> \"./sherpa-onnx-zipformer-multi-zh-hans-2023-9-2/test_wavs/1.wav\"\n    6 -> \"./sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/8k.wav\"\n    15 -> \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/test_wavs/zh.wav\"\n    21 -> \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\"\n    24 -> \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\"\n    25 -> \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\"\n    31 -> \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\"\n    else -> null\n  }\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename!!,\n  )\n\n  val stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  val result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\nfun createOfflineRecognizer(type: Int): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      featConfig = getFeatureConfig(sampleRate = 16000, featureDim = 80),\n      modelConfig = getOfflineModelConfig(type = type)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_fire_red_asr_ctc.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  val stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  val result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 50)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_funasr_nano.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 46)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_medasr_ctc.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 45)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_moonshine_asr_v2.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 53)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_nemo_canary.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(\"English: $result\")\n\n  stream.release()\n\n  // now output text in German\n  val config = recognizer.config.copy(modelConfig=recognizer.config.modelConfig.copy(\n    canary=recognizer.config.modelConfig.canary.copy(\n      tgtLang=\"de\"\n    )\n  ))\n  recognizer.setConfig(config)\n\n  stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  result = recognizer.getResult(stream)\n  println(\"German: $result\")\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 32)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_omnilingual_asr_ctc.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 44)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_punctuation.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testPunctuation()\n}\n\nfun testPunctuation() {\n  val config = OfflinePunctuationConfig(\n      model=OfflinePunctuationModelConfig(\n          ctTransformer=\"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\",\n          numThreads=1,\n          debug=true,\n          provider=\"cpu\",\n      )\n  )\n  val punct = OfflinePunctuation(config = config)\n  val sentences = arrayOf(\n        \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n        \"我们都是木头人不会说话不会动\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n  )\n  println(\"---\")\n  for (text in sentences) {\n    val out = punct.addPunctuation(text)\n    println(\"Input: $text\")\n    println(\"Output: $out\")\n    println(\"---\")\n  }\n  println(sentences)\n\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_sense_voice_with_hr.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./test-hr.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  val stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  val result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      featConfig = getFeatureConfig(sampleRate = 16000, featureDim = 80),\n      modelConfig = getOfflineModelConfig(type = 15)!!,\n      hr = HomophoneReplacerConfig(\n        lexicon = \"./lexicon.txt\",\n        ruleFsts = \"./replace.fst\"),\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_speaker_diarization.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testOfflineSpeakerDiarization()\n}\n\nfun callback(numProcessedChunks: Int, numTotalChunks: Int, arg: Long): Int {\n  val progress = numProcessedChunks.toFloat() / numTotalChunks * 100\n  val s = \"%.2f\".format(progress)\n  println(\"Progress: ${s}%\");\n\n  return 0\n}\n\nfun testOfflineSpeakerDiarization() {\n  var config = OfflineSpeakerDiarizationConfig(\n    segmentation=OfflineSpeakerSegmentationModelConfig(\n      pyannote=OfflineSpeakerSegmentationPyannoteModelConfig(\"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\"),\n    ),\n    embedding=SpeakerEmbeddingExtractorConfig(\n      model=\"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\",\n    ),\n\n    // The test wave file ./0-four-speakers-zh.wav contains four speakers, so\n    // we use numClusters=4 here. If you don't know the number of speakers\n    // in the test wave file, please set the threshold like below.\n    //\n    // clustering=FastClusteringConfig(threshold=0.5),\n    //\n    // WARNING: You need to tune threshold by yourself.\n    // A larger threshold leads to fewer clusters, i.e., few speakers.\n    // A smaller threshold leads to more clusters, i.e., more speakers.\n    //\n    clustering=FastClusteringConfig(numClusters=4),\n  )\n\n  val sd = OfflineSpeakerDiarization(config=config)\n\n  val waveData = WaveReader.readWave(\n      filename = \"./0-four-speakers-zh.wav\",\n  )\n\n  if (sd.sampleRate() != waveData.sampleRate) {\n    println(\"Expected sample rate: ${sd.sampleRate()}, given: ${waveData.sampleRate}\")\n    return\n  }\n\n  // val segments = sd.process(waveData.samples) // this one is also ok\n  val segments = sd.processWithCallback(waveData.samples, callback=::callback)\n  for (segment in segments) {\n    println(\"${segment.start} -- ${segment.end} speaker_${segment.speaker}\")\n  }\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_speech_denoiser.kt",
    "content": "package com.k2fsa.sherpa.onnx\n// Please download test files in this script from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nfun main() {\n  test()\n}\n\nfun test() {\n  val denoiser  = createOfflineSpeechDenoiser()\n\n  val waveFilename = \"./inp_16k.wav\";\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  val denoised = denoiser.run(waveData.samples, waveData.sampleRate);\n  denoised.save(filename=\"./enhanced-16k.wav\")\n  println(\"saved to ./enhanced-16k.wav\")\n}\n\nfun createOfflineSpeechDenoiser(): OfflineSpeechDenoiser {\n  val config = OfflineSpeechDenoiserConfig(\n      model = OfflineSpeechDenoiserModelConfig(\n        gtcrn = OfflineSpeechDenoiserGtcrnModelConfig(\n          model = \"./gtcrn_simple.onnx\"\n        ),\n        provider = \"cpu\",\n        numThreads = 1,\n      ),\n  )\n\n  println(config)\n\n  return OfflineSpeechDenoiser(config = config)\n}\n\n\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_speech_denoiser_dpdfnet.kt",
    "content": "package com.k2fsa.sherpa.onnx\n// Please download test files in this script from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nfun main() {\n  val denoiser = createOfflineSpeechDenoiserDpdfNet()\n  val waveData = WaveReader.readWaveFromFile(filename = \"./inp_16k.wav\")\n  val denoised = denoiser.run(waveData.samples, waveData.sampleRate)\n  denoised.save(filename = \"./enhanced-dpdfnet-16k.wav\")\n  println(\"saved to ./enhanced-dpdfnet-16k.wav\")\n}\n\nfun createOfflineSpeechDenoiserDpdfNet(): OfflineSpeechDenoiser {\n  val config = OfflineSpeechDenoiserConfig(\n      model = OfflineSpeechDenoiserModelConfig(\n        dpdfnet = OfflineSpeechDenoiserDpdfNetModelConfig(\n          model = \"./dpdfnet_baseline.onnx\"\n        ),\n        provider = \"cpu\",\n        numThreads = 1,\n      ),\n  )\n\n  return OfflineSpeechDenoiser(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_offline_wenet_ctc.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  val recognizer = createOfflineRecognizer()\n  val waveFilename = \"./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\"\n\n  val waveData = WaveReader.readWaveFromFile(\n      filename = waveFilename,\n  )\n\n  var stream = recognizer.createStream()\n  stream.acceptWaveform(waveData.samples, sampleRate=waveData.sampleRate)\n  recognizer.decode(stream)\n\n  var result = recognizer.getResult(stream)\n  println(result)\n\n  stream.release()\n  recognizer.release()\n}\n\n\nfun createOfflineRecognizer(): OfflineRecognizer {\n  val config = OfflineRecognizerConfig(\n      modelConfig = getOfflineModelConfig(type = 42)!!,\n  )\n\n  return OfflineRecognizer(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_online_asr.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testOnlineAsr(\"transducer\")\n  testOnlineAsr(\"zipformer2-ctc\")\n  testOnlineAsr(\"ctc-hlg\")\n  testOnlineAsr(\"nemo-ctc\")\n  testOnlineAsr(\"tone-ctc\")\n}\n\nfun testOnlineAsr(type: String) {\n    val featConfig = FeatureConfig(\n        sampleRate = 16000,\n        featureDim = 80,\n    )\n\n    var ctcFstDecoderConfig  = OnlineCtcFstDecoderConfig()\n    val waveFilename: String\n    val modelConfig: OnlineModelConfig = when (type) {\n      \"transducer\" -> {\n        waveFilename = \"./sherpa-onnx-streaming-zipformer-en-2023-02-21/test_wavs/0.wav\"\n        // please refer to\n        // https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n        // to download pre-trained models\n        OnlineModelConfig(\n            transducer = OnlineTransducerModelConfig(\n                encoder = \"./sherpa-onnx-streaming-zipformer-en-2023-02-21/encoder-epoch-99-avg-1.onnx\",\n                decoder = \"./sherpa-onnx-streaming-zipformer-en-2023-02-21/decoder-epoch-99-avg-1.onnx\",\n                joiner = \"./sherpa-onnx-streaming-zipformer-en-2023-02-21/joiner-epoch-99-avg-1.onnx\",\n            ),\n            tokens = \"./sherpa-onnx-streaming-zipformer-en-2023-02-21/tokens.txt\",\n            numThreads = 1,\n            debug = false,\n        )\n      }\n      \"zipformer2-ctc\" -> {\n        waveFilename = \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav\"\n        OnlineModelConfig(\n            zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                model = \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx\",\n            ),\n            tokens = \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt\",\n            numThreads = 1,\n            debug = false,\n        )\n      }\n      \"nemo-ctc\" -> {\n        waveFilename = 
\"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/test_wavs/0.wav\"\n        OnlineModelConfig(\n            neMoCtc = OnlineNeMoCtcModelConfig(\n                model = \"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/model.onnx\",\n            ),\n            tokens = \"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/tokens.txt\",\n            numThreads = 1,\n            debug = false,\n        )\n      }\n      \"tone-ctc\" -> {\n        waveFilename = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\"\n        OnlineModelConfig(\n            toneCtc = OnlineToneCtcModelConfig(\n                model = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\",\n            ),\n            tokens = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\",\n            numThreads = 1,\n            debug = false,\n        )\n      }\n      \"ctc-hlg\" -> {\n        waveFilename = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/1.wav\"\n        ctcFstDecoderConfig.graph = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\"\n        OnlineModelConfig(\n            zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                model = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\",\n            ),\n            tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\",\n            numThreads = 1,\n            debug = false,\n        )\n      }\n      else -> throw IllegalArgumentException(type)\n    }\n\n    val endpointConfig = EndpointConfig()\n\n    val lmConfig = OnlineLMConfig()\n\n    val config = OnlineRecognizerConfig(\n        modelConfig = modelConfig,\n        lmConfig = lmConfig,\n        featConfig = featConfig,\n        ctcFstDecoderConfig=ctcFstDecoderConfig,\n        endpointConfig = endpointConfig,\n        enableEndpoint = true,\n        decodingMethod = \"greedy_search\",\n        
maxActivePaths = 4,\n    )\n\n    val recognizer = OnlineRecognizer(\n        config = config,\n    )\n\n    val waveData = WaveReader.readWaveFromFile(\n        filename = waveFilename,\n    )\n\n    val stream = recognizer.createStream()\n\n    val leftPaddings = FloatArray((waveData.sampleRate * 0.3).toInt()) // 0.3 seconds\n    stream.acceptWaveform(leftPaddings, sampleRate = waveData.sampleRate)\n\n    stream.acceptWaveform(waveData.samples, sampleRate = waveData.sampleRate)\n    while (recognizer.isReady(stream)) {\n        recognizer.decode(stream)\n    }\n\n    val tailPaddings = FloatArray((waveData.sampleRate * 0.6).toInt()) // 0.6 seconds\n    stream.acceptWaveform(tailPaddings, sampleRate = waveData.sampleRate)\n    stream.inputFinished()\n    while (recognizer.isReady(stream)) {\n        recognizer.decode(stream)\n    }\n\n    println(\"results: ${recognizer.getResult(stream).text}\")\n\n    stream.release()\n    recognizer.release()\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_online_punctuation.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testPunctuation()\n}\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nfun testPunctuation() {\n  val config = OnlinePunctuationConfig(\n      model=OnlinePunctuationModelConfig(\n          cnnBilstm=\"./sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx\",\n          bpeVocab=\"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\",\n          numThreads=1,\n          debug=true,\n          provider=\"cpu\",\n      )\n  )\n  val punct = OnlinePunctuation(config = config)\n  val sentences = arrayOf(\n        \"how are you doing fantastic thank you what is about you\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n  )\n  println(\"---\")\n  for (text in sentences) {\n    val out = punct.addPunctuation(text)\n    println(\"Input: $text\")\n    println(\"Output: $out\")\n    println(\"---\")\n  }\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_online_speech_denoiser.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\n// Please download test files in this script from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nfun main() {\n  testGtcrn()\n  testDpdfNet()\n}\n\nfun testGtcrn() {\n  val denoiser = createOnlineSpeechDenoiserGtcrn()\n  val waveData = WaveReader.readWaveFromFile(\"./inp_16k.wav\")\n  val output = mutableListOf<Float>()\n  val frameShift = denoiser.frameShiftInSamples\n\n  var start = 0\n  while (start < waveData.samples.size) {\n    val end = minOf(start + frameShift, waveData.samples.size)\n    val chunk = waveData.samples.copyOfRange(start, end)\n    val denoised = denoiser.run(chunk, waveData.sampleRate)\n    output.addAll(denoised.samples.asList())\n    start = end\n  }\n\n  output.addAll(denoiser.flush().samples.asList())\n  DenoisedAudio(output.toFloatArray(), denoiser.sampleRate).save(\n    filename = \"./enhanced-online-gtcrn.wav\"\n  )\n  println(\"saved to ./enhanced-online-gtcrn.wav\")\n\n  denoiser.release()\n}\n\nfun testDpdfNet() {\n  val denoiser = createOnlineSpeechDenoiserDpdfNet()\n  val waveData = WaveReader.readWaveFromFile(\"./inp_16k.wav\")\n  val output = mutableListOf<Float>()\n\n  val frameShift = denoiser.frameShiftInSamples\n  var start = 0\n  while (start < waveData.samples.size) {\n    val end = minOf(start + frameShift, waveData.samples.size)\n    val chunk = waveData.samples.copyOfRange(start, end)\n    val denoised = denoiser.run(chunk, waveData.sampleRate)\n    output.addAll(denoised.samples.asList())\n    start = end\n  }\n\n  output.addAll(denoiser.flush().samples.asList())\n  DenoisedAudio(output.toFloatArray(), denoiser.sampleRate).save(\n    filename = \"./enhanced-online-dpdfnet.wav\"\n  )\n  println(\"saved to ./enhanced-online-dpdfnet.wav\")\n\n  denoiser.release()\n}\n\nfun createOnlineSpeechDenoiserGtcrn(): OnlineSpeechDenoiser {\n  val config = OnlineSpeechDenoiserConfig(\n      model = OfflineSpeechDenoiserModelConfig(\n        gtcrn = 
OfflineSpeechDenoiserGtcrnModelConfig(\n          model = \"./gtcrn_simple.onnx\"\n        ),\n        provider = \"cpu\",\n        numThreads = 1,\n      ),\n  )\n\n  return OnlineSpeechDenoiser(config = config)\n}\n\nfun createOnlineSpeechDenoiserDpdfNet(): OnlineSpeechDenoiser {\n  val config = OnlineSpeechDenoiserConfig(\n      model = OfflineSpeechDenoiserModelConfig(\n        dpdfnet = OfflineSpeechDenoiserDpdfNetModelConfig(\n          model = \"./dpdfnet_baseline.onnx\"\n        ),\n        provider = \"cpu\",\n        numThreads = 1,\n      ),\n  )\n\n  return OnlineSpeechDenoiser(config = config)\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_pocket_tts.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testPocketTts()\n}\n\nfun testPocketTts() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  val config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      pocket=OfflineTtsPocketModelConfig(\n        lmFlow=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\",\n        lmMain=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\",\n        encoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\",\n        decoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\",\n        textConditioner=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\",\n        vocabJson=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\",\n        tokenScoresJson=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\",\n      ),\n      numThreads=2,\n      debug=true,\n    ),\n  )\n  val tts = OfflineTts(config=config)\n\n  val referenceAudioFilename = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n  val wave = WaveReader.readWave(\n      filename = referenceAudioFilename,\n  )\n\n  val genConfig = GenerationConfig(\n    referenceAudio = wave.samples,\n    referenceSampleRate = wave.sampleRate,\n    numSteps = 5,\n    extra = mapOf(\n        \"temperature\" to \"0.7\",\n        \"chunk_size\" to \"15\",\n    )\n  )\n\n  val text = \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be, a statesman, a businessman, an official, or a scholar.\"\n\n  val audio = tts.generateWithConfigAndCallback(text=text, config=genConfig, callback=::callback)\n  audio.save(filename=\"out-bria.wav\")\n  tts.release()\n  println(\"Saved to out-bria.wav\")\n}\n\nfun callback(samples: FloatArray): Int {\n  println(\"callback got called with ${samples.size} samples\")\n\n  // 1 means to continue\n  // 0 means to stop\n  return 1\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_speaker_id.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testSpeakerRecognition()\n}\n\nfun testSpeakerRecognition() {\n    val config = SpeakerEmbeddingExtractorConfig(\n        model=\"./3dspeaker_speech_eres2net_large_sv_zh-cn_3dspeaker_16k.onnx\",\n        )\n    val extractor = SpeakerEmbeddingExtractor(config = config)\n\n    val embedding1a = computeEmbedding(extractor, \"./speaker1_a_cn_16k.wav\")\n    val embedding2a = computeEmbedding(extractor, \"./speaker2_a_cn_16k.wav\")\n    val embedding1b = computeEmbedding(extractor, \"./speaker1_b_cn_16k.wav\")\n\n    var manager = SpeakerEmbeddingManager(extractor.dim())\n    var ok = manager.add(name = \"speaker1\", embedding=embedding1a)\n    check(ok)\n\n    manager.add(name = \"speaker2\", embedding=embedding2a)\n    check(ok)\n\n    var name = manager.search(embedding=embedding1b, threshold=0.5f)\n    check(name == \"speaker1\")\n\n    manager.release()\n\n    manager = SpeakerEmbeddingManager(extractor.dim())\n    val embeddingList = mutableListOf(embedding1a, embedding1b)\n    ok = manager.add(name = \"s1\", embedding=embeddingList.toTypedArray())\n    check(ok)\n\n    name = manager.search(embedding=embedding1b, threshold=0.5f)\n    check(name == \"s1\")\n\n    name = manager.search(embedding=embedding2a, threshold=0.5f)\n    check(name.length == 0)\n\n    manager.release()\n    extractor.release()\n    println(\"Speaker ID test done!\")\n}\n\nfun computeEmbedding(extractor: SpeakerEmbeddingExtractor, filename: String): FloatArray {\n    val waveData = WaveReader.readWaveFromFile(\n        filename = filename,\n    )\n    val stream = extractor.createStream()\n    stream.acceptWaveform(sampleRate = waveData.sampleRate, samples=waveData.samples)\n    stream.inputFinished()\n    check(extractor.isReady(stream))\n\n    val embedding = extractor.compute(stream)\n\n    stream.release()\n\n    return embedding\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_supertonic_tts.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testSupertonicTts()\n}\n\nfun testSupertonicTts() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  val modelDir = \"./sherpa-onnx-supertonic-tts-int8-2026-03-06\"\n  val config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      supertonic=OfflineTtsSupertonicModelConfig(\n        durationPredictor=\"$modelDir/duration_predictor.int8.onnx\",\n        textEncoder=\"$modelDir/text_encoder.int8.onnx\",\n        vectorEstimator=\"$modelDir/vector_estimator.int8.onnx\",\n        vocoder=\"$modelDir/vocoder.int8.onnx\",\n        ttsJson=\"$modelDir/tts.json\",\n        unicodeIndexer=\"$modelDir/unicode_indexer.bin\",\n        voiceStyle=\"$modelDir/voice.bin\",\n      ),\n      numThreads=2,\n      debug=true,\n    ),\n  )\n  val tts = OfflineTts(config=config)\n\n  val genConfig = GenerationConfig(\n    sid = 6,\n    speed = 1.25f,\n    numSteps = 5,\n    extra = mapOf(\n        \"lang\" to \"en\",\n    )\n  )\n\n  val text = \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be, a statesman, a businessman, an official, or a scholar.\"\n\n  val audio = tts.generateWithConfigAndCallback(text=text, config=genConfig, callback=::supertonicCallback)\n  audio.save(filename=\"test-supertonic-en.wav\")\n  tts.release()\n  println(\"Saved to test-supertonic-en.wav\")\n}\n\nfun supertonicCallback(samples: FloatArray): Int {\n  println(\"callback got called with ${samples.size} samples\")\n\n  // 1 means to continue\n  // 0 means to stop\n  return 1\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_tts.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testVits()\n  testMatcha()\n  testKokoroEn()\n  testKokoroZhEn()\n  testKittenEn()\n}\n\nfun testKokoroZhEn() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  var config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      kokoro=OfflineTtsKokoroModelConfig(\n        model=\"./kokoro-multi-lang-v1_0/model.onnx\",\n        voices=\"./kokoro-multi-lang-v1_0/voices.bin\",\n        tokens=\"./kokoro-multi-lang-v1_0/tokens.txt\",\n        dataDir=\"./kokoro-multi-lang-v1_0/espeak-ng-data\",\n        lexicon=\"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt\",\n      ),\n      numThreads=2,\n      debug=true,\n    ),\n  )\n  val tts = OfflineTts(config=config)\n  val genConfig = GenerationConfig(\n    silenceScale = 0.2f,\n  )\n  val audio = tts.generateWithConfigAndCallback(text=\"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 
你觉得中英文说的如何呢？\", config=genConfig, callback=::callback)\n  audio.save(filename=\"test-kokoro-zh-en.wav\")\n  tts.release()\n  println(\"Saved to test-kokoro-zh-en.wav\")\n}\n\nfun testKokoroEn() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  var config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      kokoro=OfflineTtsKokoroModelConfig(\n        model=\"./kokoro-en-v0_19/model.onnx\",\n        voices=\"./kokoro-en-v0_19/voices.bin\",\n        tokens=\"./kokoro-en-v0_19/tokens.txt\",\n        dataDir=\"./kokoro-en-v0_19/espeak-ng-data\",\n      ),\n      numThreads=2,\n      debug=true,\n    ),\n  )\n  val tts = OfflineTts(config=config)\n  val genConfig = GenerationConfig(\n    silenceScale = 0.2f,\n  )\n  val audio = tts.generateWithConfigAndCallback(text=\"How are you doing today?\", config=genConfig, callback=::callback)\n  audio.save(filename=\"test-kokoro-en.wav\")\n  tts.release()\n  println(\"Saved to test-kokoro-en.wav\")\n}\n\nfun testMatcha() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  var config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      matcha=OfflineTtsMatchaModelConfig(\n        acousticModel=\"./matcha-icefall-zh-baker/model-steps-3.onnx\",\n        vocoder=\"./vocos-22khz-univ.onnx\",\n        tokens=\"./matcha-icefall-zh-baker/tokens.txt\",\n        lexicon=\"./matcha-icefall-zh-baker/lexicon.txt\",\n      ),\n      numThreads=1,\n      debug=true,\n    ),\n    ruleFsts=\"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\",\n  )\n  val tts = OfflineTts(config=config)\n  val genConfig = GenerationConfig(\n    silenceScale = 0.2f,\n  )\n  val audio = tts.generateWithConfigAndCallback(text=\"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\", config=genConfig, 
callback=::callback)\n  audio.save(filename=\"test-matcha-zh.wav\")\n  tts.release()\n  println(\"Saved to test-matcha-zh.wav\")\n}\n\nfun testVits() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  var config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      vits=OfflineTtsVitsModelConfig(\n        model=\"./vits-piper-en_US-amy-low/en_US-amy-low.onnx\",\n        tokens=\"./vits-piper-en_US-amy-low/tokens.txt\",\n        dataDir=\"./vits-piper-en_US-amy-low/espeak-ng-data\",\n      ),\n      numThreads=1,\n      debug=true,\n    )\n  )\n  val tts = OfflineTts(config=config)\n  val genConfig = GenerationConfig(\n    silenceScale = 0.2f,\n  )\n  val audio = tts.generateWithConfigAndCallback(text=\"“Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.”\", config=genConfig, callback=::callback)\n  audio.save(filename=\"test-en.wav\")\n  tts.release()\n  println(\"Saved to test-en.wav\")\n}\n\nfun testKittenEn() {\n  // see https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n  var config = OfflineTtsConfig(\n    model=OfflineTtsModelConfig(\n      kitten=OfflineTtsKittenModelConfig(\n        model=\"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\",\n        voices=\"./kitten-nano-en-v0_1-fp16/voices.bin\",\n        tokens=\"./kitten-nano-en-v0_1-fp16/tokens.txt\",\n        dataDir=\"./kitten-nano-en-v0_1-fp16/espeak-ng-data\",\n      ),\n      numThreads=2,\n      debug=true,\n    ),\n  )\n  val tts = OfflineTts(config=config)\n  val genConfig = GenerationConfig(\n    sid = 7,\n    silenceScale = 0.2f,\n  )\n  val audio = tts.generateWithConfigAndCallback(text=\"How are you doing today?\", config=genConfig, callback=::callback)\n  
audio.save(filename=\"test-kitten-en.wav\")\n  tts.release()\n  println(\"Saved to test-kitten-en.wav\")\n}\n\n/*\n1. Unzip test_tts.jar\n2.\njavap ./com/k2fsa/sherpa/onnx/Test_ttsKt\\$testTts\\$audio\\$1.class\n\n3. It prints:\nCompiled from \"test_tts.kt\"\nfinal class com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1 extends kotlin.jvm.internal.FunctionReferenceImpl implements kotlin.jvm.functions.Function1<float[], java.lang.Integer> {\n  public static final com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1 INSTANCE;\n  com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1();\n  public final java.lang.Integer invoke(float[]);\n  public java.lang.Object invoke(java.lang.Object);\n  static {};\n}\n\n4.\njavap -s ./com/k2fsa/sherpa/onnx/Test_ttsKt\\$testTts\\$audio\\$1.class\n\n5. It prints\nCompiled from \"test_tts.kt\"\nfinal class com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1 extends kotlin.jvm.internal.FunctionReferenceImpl implements kotlin.jvm.functions.Function1<float[], java.lang.Integer> {\n  public static final com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1 INSTANCE;\n    descriptor: Lcom/k2fsa/sherpa/onnx/Test_ttsKt$testTts$audio$1;\n  com.k2fsa.sherpa.onnx.Test_ttsKt$testTts$audio$1();\n    descriptor: ()V\n\n  public final java.lang.Integer invoke(float[]);\n    descriptor: ([F)Ljava/lang/Integer;\n\n  public java.lang.Object invoke(java.lang.Object);\n    descriptor: (Ljava/lang/Object;)Ljava/lang/Object;\n\n  static {};\n    descriptor: ()V\n}\n*/\nfun callback(samples: FloatArray): Int {\n  println(\"callback got called with ${samples.size} samples\");\n\n  // 1 means to continue\n  // 0 means to stop\n  return 1\n}\n"
  },
  {
    "path": "kotlin-api-examples/test_version.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  println(\"sherpa-onnx version: ${VersionInfo.version}\")\n  println(\"sherpa-onnx gitSha1: ${VersionInfo.gitSha1}\")\n  println(\"sherpa-onnx gitDate: ${VersionInfo.gitDate}\")\n}\n\n"
  },
  {
    "path": "kotlin-api-examples/test_zipvoice_tts.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nfun main() {\n  testZipVoiceTts()\n}\n\nfun testZipVoiceTts() {\n  val modelDir = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\"\n  val referenceAudioFilename = \"$modelDir/test_wavs/leijun-1.wav\"\n  val wave = WaveReader.readWave(filename = referenceAudioFilename)\n\n  val config = OfflineTtsConfig(\n    model = OfflineTtsModelConfig(\n      zipvoice = OfflineTtsZipVoiceModelConfig(\n        tokens = \"$modelDir/tokens.txt\",\n        encoder = \"$modelDir/encoder.int8.onnx\",\n        decoder = \"$modelDir/decoder.int8.onnx\",\n        vocoder = \"./vocos_24khz.onnx\",\n        dataDir = \"$modelDir/espeak-ng-data\",\n        lexicon = \"$modelDir/lexicon.txt\",\n      ),\n      numThreads = 2,\n      debug = false,\n    ),\n  )\n\n  val tts = OfflineTts(config = config)\n  val text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n  val referenceText = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n  val genConfig = GenerationConfig(\n    referenceAudio = wave.samples,\n    referenceSampleRate = wave.sampleRate,\n    referenceText = referenceText,\n    numSteps = 4,\n    extra = mapOf(\"min_char_in_sentence\" to \"10\"),\n  )\n\n  val audio = tts.generateWithConfigAndCallback(text = text, config = genConfig, callback = ::callback)\n  audio.save(filename = \"test-zipvoice-zh-en.wav\")\n  tts.release()\n  println(\"Saved to test-zipvoice-zh-en.wav\")\n}\n\nfun callback(samples: FloatArray): Int {\n  println(\"callback got called with ${samples.size} samples\")\n\n  // 1 means to continue\n  // 0 means to stop\n  return 1\n}\n"
  },
  {
    "path": "lazarus-examples/.gitignore",
    "content": "# Lazarus compiler-generated binaries (safe to delete)\n*.exe\n*.dll\n*.so\n*.dylib\n*.lrs\n*.res\n*.compiled\n*.dbg\n*.ppu\n*.o\n*.or\n*.a\n\n# Lazarus autogenerated files (duplicated info)\n*.rst\n*.rsj\n*.lrt\n\n# Lazarus local files (user-specific info)\n*.lps\n\n# Lazarus backups and unit output folders.\n# These can be changed by user in Lazarus/project options.\nbackup/\n*.bak\nlib/\n\n# Application bundle for Mac OS\n*.app/\n"
  },
  {
    "path": "lazarus-examples/README.md",
    "content": "# Introduction\n\nThis directory contains examples of using\nhttps://www.lazarus-ide.org/\nwith the Object Pascal API to develop speech-related applications.\n\n**Documentation for this directory**:\nhttps://k2-fsa.github.io/sherpa/onnx/lazarus/index.html\n\n|Directory|Pre-built App|\n|---------|--------------|\n|[./generate_subtitles](./generate_subtitles)|[URL](https://k2-fsa.github.io/sherpa/onnx/lazarus/download-generated-subtitles.html)|\n"
  },
  {
    "path": "mfc-examples/.gitignore",
    "content": "# See https://github.com/github/gitignore/blob/main/VisualStudio.gitignore\n## Ignore Visual Studio temporary files, build results, and\n## files generated by popular Visual Studio add-ons.\n##\n## Get latest from https://github.com/github/gitignore/blob/main/VisualStudio.gitignore\n\n# User-specific files\n*.rsuser\n*.suo\n*.user\n*.userosscache\n*.sln.docstates\n\n# User-specific files (MonoDevelop/Xamarin Studio)\n*.userprefs\n\n# Mono auto generated files\nmono_crash.*\n\n# Build results\n[Dd]ebug/\n[Dd]ebugPublic/\n[Rr]elease/\n[Rr]eleases/\nx64/\nx86/\n[Ww][Ii][Nn]32/\n[Aa][Rr][Mm]/\n[Aa][Rr][Mm]64/\nbld/\n[Bb]in/\n[Oo]bj/\n[Ll]og/\n[Ll]ogs/\n\n# Visual Studio 2015/2017 cache/options directory\n.vs/\n# Uncomment if you have tasks that create the project's static files in wwwroot\n#wwwroot/\n\n# Visual Studio 2017 auto generated files\nGenerated\\ Files/\n\n# MSTest test Results\n[Tt]est[Rr]esult*/\n[Bb]uild[Ll]og.*\n\n# NUnit\n*.VisualState.xml\nTestResult.xml\nnunit-*.xml\n\n# Build Results of an ATL Project\n[Dd]ebugPS/\n[Rr]eleasePS/\ndlldata.c\n\n# Benchmark Results\nBenchmarkDotNet.Artifacts/\n\n# .NET Core\nproject.lock.json\nproject.fragment.lock.json\nartifacts/\n\n# ASP.NET Scaffolding\nScaffoldingReadMe.txt\n\n# StyleCop\nStyleCopReport.xml\n\n# Files built by Visual Studio\n*_i.c\n*_p.c\n*_h.h\n*.ilk\n*.meta\n*.obj\n*.iobj\n*.pch\n*.pdb\n*.ipdb\n*.pgc\n*.pgd\n*.rsp\n*.sbr\n*.tlb\n*.tli\n*.tlh\n*.tmp\n*.tmp_proj\n*_wpftmp.csproj\n*.log\n*.tlog\n*.vspscc\n*.vssscc\n.builds\n*.pidb\n*.svclog\n*.scc\n\n# Chutzpah Test files\n_Chutzpah*\n\n# Visual C++ cache files\nipch/\n*.aps\n*.ncb\n*.opendb\n*.opensdf\n*.sdf\n*.cachefile\n*.VC.db\n*.VC.VC.opendb\n\n# Visual Studio profiler\n*.psess\n*.vsp\n*.vspx\n*.sap\n\n# Visual Studio Trace Files\n*.e2e\n\n# TFS 2012 Local Workspace\n$tf/\n\n# Guidance Automation Toolkit\n*.gpState\n\n# ReSharper is a .NET coding add-in\n_ReSharper*/\n*.[Rr]e[Ss]harper\n*.DotSettings.user\n\n# TeamCity is a build 
add-in\n_TeamCity*\n\n# DotCover is a Code Coverage Tool\n*.dotCover\n\n# AxoCover is a Code Coverage Tool\n.axoCover/*\n!.axoCover/settings.json\n\n# Coverlet is a free, cross platform Code Coverage Tool\ncoverage*.json\ncoverage*.xml\ncoverage*.info\n\n# Visual Studio code coverage results\n*.coverage\n*.coveragexml\n\n# NCrunch\n_NCrunch_*\n.*crunch*.local.xml\nnCrunchTemp_*\n\n# MightyMoose\n*.mm.*\nAutoTest.Net/\n\n# Web workbench (sass)\n.sass-cache/\n\n# Installshield output folder\n[Ee]xpress/\n\n# DocProject is a documentation generator add-in\nDocProject/buildhelp/\nDocProject/Help/*.HxT\nDocProject/Help/*.HxC\nDocProject/Help/*.hhc\nDocProject/Help/*.hhk\nDocProject/Help/*.hhp\nDocProject/Help/Html2\nDocProject/Help/html\n\n# Click-Once directory\npublish/\n\n# Publish Web Output\n*.[Pp]ublish.xml\n*.azurePubxml\n# Note: Comment the next line if you want to checkin your web deploy settings,\n# but database connection strings (with potential passwords) will be unencrypted\n*.pubxml\n*.publishproj\n\n# Microsoft Azure Web App publish settings. 
Comment the next line if you want to\n# checkin your Azure Web App publish settings, but sensitive information contained\n# in these scripts will be unencrypted\nPublishScripts/\n\n# NuGet Packages\n*.nupkg\n# NuGet Symbol Packages\n*.snupkg\n# The packages folder can be ignored because of Package Restore\n**/[Pp]ackages/*\n# except build/, which is used as an MSBuild target.\n!**/[Pp]ackages/build/\n# Uncomment if necessary however generally it will be regenerated when needed\n#!**/[Pp]ackages/repositories.config\n# NuGet v3's project.json files produces more ignorable files\n*.nuget.props\n*.nuget.targets\n\n# Microsoft Azure Build Output\ncsx/\n*.build.csdef\n\n# Microsoft Azure Emulator\necf/\nrcf/\n\n# Windows Store app package directories and files\nAppPackages/\nBundleArtifacts/\nPackage.StoreAssociation.xml\n_pkginfo.txt\n*.appx\n*.appxbundle\n*.appxupload\n\n# Visual Studio cache files\n# files ending in .cache can be ignored\n*.[Cc]ache\n# but keep track of directories ending in .cache\n!?*.[Cc]ache/\n\n# Others\nClientBin/\n~$*\n*~\n*.dbmdl\n*.dbproj.schemaview\n*.jfm\n*.pfx\n*.publishsettings\norleans.codegen.cs\n\n# Including strong name files can present a security risk\n# (https://github.com/github/gitignore/pull/2483#issue-259490424)\n#*.snk\n\n# Since there are multiple workflows, uncomment next line to ignore bower_components\n# (https://github.com/github/gitignore/pull/1529#issuecomment-104372622)\n#bower_components/\n\n# RIA/Silverlight projects\nGenerated_Code/\n\n# Backup & report files from converting an old project file\n# to a newer Visual Studio version. 
Backup files are not needed,\n# because we have git ;-)\n_UpgradeReport_Files/\nBackup*/\nUpgradeLog*.XML\nUpgradeLog*.htm\nServiceFabricBackup/\n*.rptproj.bak\n\n# SQL Server files\n*.mdf\n*.ldf\n*.ndf\n\n# Business Intelligence projects\n*.rdl.data\n*.bim.layout\n*.bim_*.settings\n*.rptproj.rsuser\n*- [Bb]ackup.rdl\n*- [Bb]ackup ([0-9]).rdl\n*- [Bb]ackup ([0-9][0-9]).rdl\n\n# Microsoft Fakes\nFakesAssemblies/\n\n# GhostDoc plugin setting file\n*.GhostDoc.xml\n\n# Node.js Tools for Visual Studio\n.ntvs_analysis.dat\nnode_modules/\n\n# Visual Studio 6 build log\n*.plg\n\n# Visual Studio 6 workspace options file\n*.opt\n\n# Visual Studio 6 auto-generated workspace file (contains which files were open etc.)\n*.vbw\n\n# Visual Studio 6 auto-generated project file (contains which files were open etc.)\n*.vbp\n\n# Visual Studio 6 workspace and project file (working project files containing files to include in project)\n*.dsw\n*.dsp\n\n# Visual Studio 6 technical files\n*.ncb\n*.aps\n\n# Visual Studio LightSwitch build output\n**/*.HTMLClient/GeneratedArtifacts\n**/*.DesktopClient/GeneratedArtifacts\n**/*.DesktopClient/ModelManifest.xml\n**/*.Server/GeneratedArtifacts\n**/*.Server/ModelManifest.xml\n_Pvt_Extensions\n\n# Paket dependency manager\n.paket/paket.exe\npaket-files/\n\n# FAKE - F# Make\n.fake/\n\n# CodeRush personal settings\n.cr/personal\n\n# Python Tools for Visual Studio (PTVS)\n__pycache__/\n*.pyc\n\n# Cake - Uncomment if you are using it\n# tools/**\n# !tools/packages.config\n\n# Tabs Studio\n*.tss\n\n# Telerik's JustMock configuration file\n*.jmconfig\n\n# BizTalk build output\n*.btp.cs\n*.btm.cs\n*.odx.cs\n*.xsd.cs\n\n# OpenCover UI analysis results\nOpenCover/\n\n# Azure Stream Analytics local run output\nASALocalRun/\n\n# MSBuild Binary and Structured Log\n*.binlog\n\n# NVidia Nsight GPU debugger configuration file\n*.nvuser\n\n# MFractors (Xamarin productivity tool) working folder\n.mfractor/\n\n# Local History for Visual Studio\n.localhistory/\n\n# 
Visual Studio History (VSHistory) files\n.vshistory/\n\n# BeatPulse healthcheck temp database\nhealthchecksdb\n\n# Backup folder for Package Reference Convert tool in Visual Studio 2017\nMigrationBackup/\n\n# Ionide (cross platform F# VS Code tools) working folder\n.ionide/\n\n# Fody - auto-generated XML schema\nFodyWeavers.xsd\n\n# VS Code files for those working on multiple tools\n.vscode/*\n!.vscode/settings.json\n!.vscode/tasks.json\n!.vscode/launch.json\n!.vscode/extensions.json\n*.code-workspace\n\n# Local History for Visual Studio Code\n.history/\n\n# Windows Installer files from build outputs\n*.cab\n*.msi\n*.msix\n*.msm\n*.msp\n\n# JetBrains Rider\n*.sln.iml\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognition.cpp",
    "content": "\n// NonStreamingSpeechRecognition.cpp : Defines the class behaviors for the\n// application.\n//\n\n// clang-format off\n#include \"pch.h\"\n#include \"framework.h\"\n#include \"NonStreamingSpeechRecognitionDlg.h\"\n#include \"NonStreamingSpeechRecognition.h\"\n// clang-format on\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\n// CNonStreamingSpeechRecognitionApp\n\nBEGIN_MESSAGE_MAP(CNonStreamingSpeechRecognitionApp, CWinApp)\nON_COMMAND(ID_HELP, &CWinApp::OnHelp)\nEND_MESSAGE_MAP()\n\n// CNonStreamingSpeechRecognitionApp construction\n\nCNonStreamingSpeechRecognitionApp::CNonStreamingSpeechRecognitionApp() {\n  // TODO: add construction code here,\n  // Place all significant initialization in InitInstance\n}\n\n// The one and only CNonStreamingSpeechRecognitionApp object\n\nCNonStreamingSpeechRecognitionApp theApp;\n\n// CNonStreamingSpeechRecognitionApp initialization\n\nBOOL CNonStreamingSpeechRecognitionApp::InitInstance() {\n  CWinApp::InitInstance();\n\n  // Create the shell manager, in case the dialog contains\n  // any shell tree view or shell list view controls.\n  CShellManager *pShellManager = new CShellManager;\n\n  // Activate \"Windows Native\" visual manager for enabling themes in MFC\n  // controls\n  CMFCVisualManager::SetDefaultManager(RUNTIME_CLASS(CMFCVisualManagerWindows));\n\n  // Standard initialization\n  // If you are not using these features and wish to reduce the size\n  // of your final executable, you should remove from the following\n  // the specific initialization routines you do not need\n  // Change the registry key under which our settings are stored\n  // TODO: You should modify this string to be something appropriate\n  // such as the name of your company or organization\n  SetRegistryKey(_T(\"Local AppWizard-Generated Applications\"));\n\n  CNonStreamingSpeechRecognitionDlg dlg;\n  m_pMainWnd = &dlg;\n  INT_PTR nResponse = dlg.DoModal();\n  if (nResponse == IDOK) {\n    // TODO: Place code here to handle 
when the dialog is\n    //  dismissed with OK\n  } else if (nResponse == IDCANCEL) {\n    // TODO: Place code here to handle when the dialog is\n    //  dismissed with Cancel\n  } else if (nResponse == -1) {\n    TRACE(traceAppMsg, 0,\n          \"Warning: dialog creation failed, so application is terminating \"\n          \"unexpectedly.\\n\");\n    TRACE(traceAppMsg, 0,\n          \"Warning: if you are using MFC controls on the dialog, you cannot \"\n          \"#define _AFX_NO_MFC_CONTROLS_IN_DIALOGS.\\n\");\n  }\n\n  // Delete the shell manager created above.\n  if (pShellManager != nullptr) {\n    delete pShellManager;\n  }\n\n#if !defined(_AFXDLL) && !defined(_AFX_NO_MFC_CONTROLS_IN_DIALOGS)\n  ControlBarCleanUp();\n#endif\n\n  // Since the dialog has been closed, return FALSE so that we exit the\n  //  application, rather than start the application's message pump.\n  return FALSE;\n}\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognition.h",
    "content": "\n// NonStreamingSpeechRecognition.h : main header file for the PROJECT_NAME\n// application\n//\n\n#pragma once\n\n#ifndef __AFXWIN_H__\n#error \"include 'pch.h' before including this file for PCH\"\n#endif\n\n#include \"resource.h\"  // main symbols\n\n// CNonStreamingSpeechRecognitionApp:\n// See NonStreamingSpeechRecognition.cpp for the implementation of this class\n//\n\nclass CNonStreamingSpeechRecognitionApp : public CWinApp {\n public:\n  CNonStreamingSpeechRecognitionApp();\n\n  // Overrides\n public:\n  virtual BOOL InitInstance();\n\n  // Implementation\n\n  DECLARE_MESSAGE_MAP()\n};\n\nextern CNonStreamingSpeechRecognitionApp theApp;\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognition.vcxproj",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project DefaultTargets=\"Build\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup Label=\"ProjectConfigurations\">\n    <ProjectConfiguration Include=\"Debug|Win32\">\n      <Configuration>Debug</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|Win32\">\n      <Configuration>Release</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Debug|x64\">\n      <Configuration>Debug</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|x64\">\n      <Configuration>Release</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n  </ItemGroup>\n  <PropertyGroup Label=\"Globals\">\n    <VCProjectVersion>17.0</VCProjectVersion>\n    <ProjectGuid>{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}</ProjectGuid>\n    <Keyword>MFCProj</Keyword>\n    <RootNamespace>NonStreamingSpeechRecognition</RootNamespace>\n    <WindowsTargetPlatformVersion>10.0</WindowsTargetPlatformVersion>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.Default.props\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    
<UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.props\" />\n  <ImportGroup Label=\"ExtensionSettings\">\n  </ImportGroup>\n  <ImportGroup Label=\"Shared\">\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n   
 <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      
<Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    
<ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemGroup>\n    <ClInclude Include=\"framework.h\" />\n    <ClInclude Include=\"NonStreamingSpeechRecognition.h\" />\n    <ClInclude Include=\"NonStreamingSpeechRecognitionDlg.h\" />\n    <ClInclude Include=\"pch.h\" />\n    <ClInclude Include=\"Resource.h\" />\n    <ClInclude Include=\"targetver.h\" />\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile Include=\"NonStreamingSpeechRecognition.cpp\" />\n    <ClCompile 
Include=\"NonStreamingSpeechRecognitionDlg.cpp\" />\n    <ClCompile Include=\"pch.cpp\">\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">Create</PrecompiledHeader>\n    </ClCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"NonStreamingSpeechRecognition.rc\" />\n  </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\NonStreamingSpeechRecognition.rc2\" />\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\NonStreamingSpeechRecognition.ico\" />\n  </ItemGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.targets\" />\n  <ImportGroup Label=\"ExtensionTargets\">\n  </ImportGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognition.vcxproj.filters",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup>\n    <Filter Include=\"Source Files\">\n      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>\n      <Extensions>cpp;c;cc;cxx;c++;cppm;ixx;def;odl;idl;hpj;bat;asm;asmx</Extensions>\n    </Filter>\n    <Filter Include=\"Header Files\">\n      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>\n      <Extensions>h;hh;hpp;hxx;h++;hm;inl;inc;ipp;xsd</Extensions>\n    </Filter>\n    <Filter Include=\"Resource Files\">\n      <UniqueIdentifier>{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}</UniqueIdentifier>\n      <Extensions>rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms</Extensions>\n    </Filter>\n  </ItemGroup>\n  <ItemGroup>\n    <ClInclude Include=\"NonStreamingSpeechRecognition.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"NonStreamingSpeechRecognitionDlg.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"framework.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"targetver.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"Resource.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"pch.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile Include=\"NonStreamingSpeechRecognition.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"NonStreamingSpeechRecognitionDlg.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"pch.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"NonStreamingSpeechRecognition.rc\">\n      <Filter>Resource 
Files</Filter>\n    </ResourceCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\NonStreamingSpeechRecognition.rc2\">\n      <Filter>Resource Files</Filter>\n    </None>\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\NonStreamingSpeechRecognition.ico\">\n      <Filter>Resource Files</Filter>\n    </Image>\n  </ItemGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognitionDlg.cpp",
    "content": "\n// NonStreamingSpeechRecognitionDlg.cpp : implementation file\n//\n\n// clang-format off\n#include \"pch.h\"\n#include \"framework.h\"\n#include \"afxdialogex.h\"\n#include \"NonStreamingSpeechRecognition.h\"\n#include \"NonStreamingSpeechRecognitionDlg.h\"\n// clang-format on\n\n#include <fstream>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\nMicrophone::Microphone() {\n  PaError err = Pa_Initialize();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\nMicrophone::~Microphone() {\n  PaError err = Pa_Terminate();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\n// see\n// https://stackoverflow.com/questions/7153935/how-to-convert-utf-8-stdstring-to-utf-16-stdwstring\nstatic std::wstring Utf8ToUtf16(const std::string &utf8) {\n  std::vector<unsigned long> unicode;\n  size_t i = 0;\n  while (i < utf8.size()) {\n    unsigned long uni;\n    size_t todo;\n    bool error = false;\n    unsigned char ch = utf8[i++];\n    if (ch <= 0x7F) {\n      uni = ch;\n      todo = 0;\n    } else if (ch <= 0xBF) {\n      throw std::logic_error(\"not a UTF-8 string\");\n    } else if (ch <= 0xDF) {\n      uni = ch & 0x1F;\n      todo = 1;\n    } else if (ch <= 0xEF) {\n      uni = ch & 0x0F;\n      todo = 2;\n    } else if (ch <= 0xF7) {\n      uni = ch & 0x07;\n      todo = 3;\n    } else {\n      throw std::logic_error(\"not a UTF-8 string\");\n    }\n    for (size_t j = 0; j < todo; ++j) {\n      if (i == utf8.size()) throw std::logic_error(\"not a UTF-8 string\");\n      unsigned char ch = utf8[i++];\n      if (ch < 0x80 || ch > 0xBF) throw std::logic_error(\"not a UTF-8 string\");\n      uni <<= 6;\n      uni += ch & 0x3F;\n    }\n    if (uni >= 0xD800 && uni <= 0xDFFF)\n      throw std::logic_error(\"not a UTF-8 string\");\n    if (uni > 0x10FFFF) throw 
std::logic_error(\"not a UTF-8 string\");\n    unicode.push_back(uni);\n  }\n  std::wstring utf16;\n  for (size_t i = 0; i < unicode.size(); ++i) {\n    unsigned long uni = unicode[i];\n    if (uni <= 0xFFFF) {\n      utf16 += (wchar_t)uni;\n    } else {\n      uni -= 0x10000;\n      utf16 += (wchar_t)((uni >> 10) + 0xD800);\n      utf16 += (wchar_t)((uni & 0x3FF) + 0xDC00);\n    }\n  }\n  return utf16;\n}\n\nstatic std::string Cat(const std::vector<std::string> &results) {\n  std::ostringstream os;\n  std::string sep;\n\n  int i = 0;\n  for (i = 0; i != results.size(); ++i) {\n    os << sep << i << \": \" << results[i];\n    sep = \"\\r\\n\";\n  }\n\n  return os.str();\n}\n\n// CNonStreamingSpeechRecognitionDlg dialog\n\nCNonStreamingSpeechRecognitionDlg::CNonStreamingSpeechRecognitionDlg(\n    CWnd *pParent /*=nullptr*/)\n    : CDialogEx(IDD_NONSTREAMINGSPEECHRECOGNITION_DIALOG, pParent) {\n  m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);\n}\n\nCNonStreamingSpeechRecognitionDlg::~CNonStreamingSpeechRecognitionDlg() {\n  if (recognizer_) {\n    SherpaOnnxDestroyOfflineRecognizer(recognizer_);\n    recognizer_ = nullptr;\n  }\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::DoDataExchange(CDataExchange *pDX) {\n  CDialogEx::DoDataExchange(pDX);\n  DDX_Control(pDX, IDC_EDIT1, my_text_);\n  DDX_Control(pDX, IDOK, my_btn_);\n}\n\nBEGIN_MESSAGE_MAP(CNonStreamingSpeechRecognitionDlg, CDialogEx)\nON_WM_PAINT()\nON_WM_QUERYDRAGICON()\nON_BN_CLICKED(IDOK, &CNonStreamingSpeechRecognitionDlg::OnBnClickedOk)\nEND_MESSAGE_MAP()\n\n// CNonStreamingSpeechRecognitionDlg message handlers\n\nBOOL CNonStreamingSpeechRecognitionDlg::OnInitDialog() {\n  CDialogEx::OnInitDialog();\n\n  // Set the icon for this dialog.  
The framework does this automatically\n  //  when the application's main window is not a dialog\n  SetIcon(m_hIcon, TRUE);   // Set big icon\n  SetIcon(m_hIcon, FALSE);  // Set small icon\n\n  // TODO: Add extra initialization here\n  InitMicrophone();\n\n  return TRUE;  // return TRUE  unless you set the focus to a control\n}\n\n// If you add a minimize button to your dialog, you will need the code below\n//  to draw the icon.  For MFC applications using the document/view model,\n//  this is automatically done for you by the framework.\n\nvoid CNonStreamingSpeechRecognitionDlg::OnPaint() {\n  if (IsIconic()) {\n    CPaintDC dc(this);  // device context for painting\n\n    SendMessage(WM_ICONERASEBKGND, reinterpret_cast<WPARAM>(dc.GetSafeHdc()),\n                0);\n\n    // Center icon in client rectangle\n    int cxIcon = GetSystemMetrics(SM_CXICON);\n    int cyIcon = GetSystemMetrics(SM_CYICON);\n    CRect rect;\n    GetClientRect(&rect);\n    int x = (rect.Width() - cxIcon + 1) / 2;\n    int y = (rect.Height() - cyIcon + 1) / 2;\n\n    // Draw the icon\n    dc.DrawIcon(x, y, m_hIcon);\n  } else {\n    CDialogEx::OnPaint();\n  }\n}\n\n// The system calls this function to obtain the cursor to display while the user\n// drags\n//  the minimized window.\nHCURSOR CNonStreamingSpeechRecognitionDlg::OnQueryDragIcon() {\n  return static_cast<HCURSOR>(m_hIcon);\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void *user_data) {\n  auto dlg = reinterpret_cast<CNonStreamingSpeechRecognitionDlg *>(user_data);\n  auto begin = reinterpret_cast<const float *>(input_buffer);\n  auto end = begin + frames_per_buffer;\n  
dlg->samples_.insert(dlg->samples_.end(), begin, end);\n\n  return dlg->started_ ? paContinue : paComplete;\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::OnBnClickedOk() {\n  if (!recognizer_) {\n    AppendLineToMultilineEditCtrl(\"Creating recognizer...\");\n    AppendLineToMultilineEditCtrl(\"It will take several seconds. Please wait\");\n    InitRecognizer();\n    if (!recognizer_) {\n      // failed to create the recognizer\n      return;\n    }\n    AppendLineToMultilineEditCtrl(\"Recognizer created!\");\n  }\n\n  if (!started_) {\n    samples_.clear();\n    started_ = true;\n\n    PaStreamParameters param;\n    param.device = Pa_GetDefaultInputDevice();\n    const PaDeviceInfo *info = Pa_GetDeviceInfo(param.device);\n    param.channelCount = 1;\n    param.sampleFormat = paFloat32;\n    param.suggestedLatency = info->defaultLowInputLatency;\n    param.hostApiSpecificStreamInfo = nullptr;\n    float sample_rate = static_cast<float>(config_.feat_config.sample_rate);\n    pa_stream_ = nullptr;\n    PaError err =\n        Pa_OpenStream(&pa_stream_, &param, nullptr, /* &outputParameters, */\n                      sample_rate,\n                      0,          // frames per buffer\n                      paClipOff,  // we won't output out of range samples\n                                  // so don't bother clipping them\n                      RecordCallback, this);\n    if (err != paNoError) {\n      AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                    Pa_GetErrorText(err));\n      my_btn_.EnableWindow(FALSE);\n      return;\n    }\n\n    err = Pa_StartStream(pa_stream_);\n    if (err != paNoError) {\n      AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                    Pa_GetErrorText(err));\n      my_btn_.EnableWindow(FALSE);\n      return;\n    }\n    AppendLineToMultilineEditCtrl(\n        \"\\r\\nStarted! 
Please speak and click stop.\\r\\n\");\n    my_btn_.SetWindowText(_T(\"Stop\"));\n\n  } else {\n    started_ = false;\n\n    Pa_Sleep(200);  // sleep for 200ms\n    if (pa_stream_) {\n      PaError err = Pa_CloseStream(pa_stream_);\n      if (err != paNoError) {\n        AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                      Pa_GetErrorText(err));\n        my_btn_.EnableWindow(FALSE);\n        return;\n      }\n    }\n    pa_stream_ = nullptr;\n\n    const SherpaOnnxOfflineStream *stream = SherpaOnnxCreateOfflineStream(recognizer_);\n\n    SherpaOnnxAcceptWaveformOffline(stream, config_.feat_config.sample_rate,\n                          samples_.data(), static_cast<int32_t>(samples_.size()));\n    SherpaOnnxDecodeOfflineStream(recognizer_, stream);\n    auto r = SherpaOnnxGetOfflineStreamResult(stream);\n    results_.emplace_back(r->text);\n\n    auto str = Utf8ToUtf16(Cat(results_).c_str());\n    my_text_.SetWindowText(str.c_str());\n    my_text_.SetFocus();\n    my_text_.SetSel(-1);\n\n    SherpaOnnxDestroyOfflineRecognizerResult(r);\n\n    SherpaOnnxDestroyOfflineStream(stream);\n    // AfxMessageBox(\"Stopped\", MB_OK);\n    my_btn_.SetWindowText(_T(\"Start\"));\n    AppendLineToMultilineEditCtrl(\"\\r\\nStopped. 
Please click start and speak\");\n  }\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::InitMicrophone() {\n  int default_device = Pa_GetDefaultInputDevice();\n  int device_count = Pa_GetDeviceCount();\n  if (default_device == paNoDevice) {\n    // CString str;\n    // str.Format(_T(\"No default input device found!\"));\n    // AfxMessageBox(str, MB_OK | MB_ICONSTOP);\n    // exit(-1);\n    AppendLineToMultilineEditCtrl(\"No default input device found!\");\n    my_btn_.EnableWindow(FALSE);\n    return;\n  }\n  AppendLineToMultilineEditCtrl(std::string(\"Selected device \") +\n                                Pa_GetDeviceInfo(default_device)->name);\n}\n\nbool CNonStreamingSpeechRecognitionDlg::Exists(const std::string &filename) {\n  std::ifstream is(filename);\n  return is.good();\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::ShowInitRecognizerHelpMessage() {\n  my_btn_.EnableWindow(FALSE);\n  std::string msg =\n      \"\\r\\nPlease go to\\r\\n\"\n      \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html \"\n      \"\\r\\n\";\n  msg += \"to download a non-streaming model, i.e., an offline model.\\r\\n\";\n  msg += \"You need to rename them after downloading\\r\\n\\r\\n\";\n  msg += \"It supports transducer, paraformer, and whisper models.\\r\\n\\r\\n\";\n  msg +=\n      \"We give three examples below to show you how to download models\\r\\n\\r\\n\";\n  msg += \"(1) Transducer\\r\\n\\r\\n\";\n  msg +=\n      \"We use \"\n      \"https://huggingface.co/pkufool/\"\n      \"icefall-asr-zipformer-wenetspeech-20230615 below\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/pkufool/\"\n      \"icefall-asr-zipformer-wenetspeech-20230615/resolve/main/exp/\"\n      \"encoder-epoch-12-avg-4.onnx\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/pkufool/\"\n      \"icefall-asr-zipformer-wenetspeech-20230615/resolve/main/exp/\"\n      \"decoder-epoch-12-avg-4.onnx\\r\\n\";\n  msg +=\n      \"wget \"\n      
\"https://huggingface.co/pkufool/\"\n      \"icefall-asr-zipformer-wenetspeech-20230615/resolve/main/exp/\"\n      \"joiner-epoch-12-avg-4.onnx\\r\\n\";\n  msg += \"\\r\\n Now rename them\\r\\n\";\n  msg += \"mv encoder-epoch-12-avg-4.onnx encoder.onnx\\r\\n\";\n  msg += \"mv decoder-epoch-12-avg-4.onnx decoder.onnx\\r\\n\";\n  msg += \"mv joiner-epoch-12-avg-4.onnx joiner.onnx\\r\\n\\r\\n\";\n  msg += \"(2) Paraformer\\r\\n\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/\"\n      \"sherpa-onnx-paraformer-zh-2023-09-14/resolve/main/model.int8.onnx\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/sherpa-onnx-paraformer-zh-2023-09-14/\"\n      \"resolve/main/tokens.txt\\r\\n\\r\\n\";\n  msg += \"\\r\\n Now rename them\\r\\n\";\n  msg += \"mv model.int8.onnx paraformer.int8.onnx\\r\\n\\r\\n\";\n  msg += \"(3) Whisper\\r\\n\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/\"\n      \"main/tiny.en-encoder.onnx\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/\"\n      \"main/tiny.en-decoder.onnx\\r\\n\";\n  msg +=\n      \"wget \"\n      \"https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/\"\n      \"main/tiny.en-tokens.txt\\r\\n\";\n  msg += \"\\r\\n Now rename them\\r\\n\";\n  msg += \"mv tiny.en-encoder.onnx whisper-encoder.onnx\\r\\n\";\n  msg += \"mv tiny.en-decoder.onnx whisper-decoder.onnx\\r\\n\";\n  msg += \"mv tiny.en-tokens.txt tokens.txt\\r\\n\";\n  msg += \"\\r\\n\";\n  msg += \"That's it!\\r\\n\";\n\n  AppendLineToMultilineEditCtrl(msg);\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::InitWhisper() {\n  std::string whisper_encoder = \"./whisper-encoder.onnx\";\n  std::string whisper_decoder = \"./whisper-decoder.onnx\";\n\n  std::string tokens = \"./tokens.txt\";\n\n  bool is_ok = true;\n\n  if (Exists(\"./whisper-encoder.int8.onnx\")) {\n    whisper_encoder = \"./whisper-encoder.int8.onnx\";\n  } else if 
(!Exists(whisper_encoder)) {\n    std::string msg = whisper_encoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (Exists(\"./whisper-decoder.int8.onnx\")) {\n    whisper_decoder = \"./whisper-decoder.int8.onnx\";\n  } else if (!Exists(whisper_decoder)) {\n    std::string msg = whisper_decoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(tokens)) {\n    std::string msg = tokens + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!is_ok) {\n    ShowInitRecognizerHelpMessage();\n    return;\n  }\n\n  memset(&config_, 0, sizeof(config_));\n\n  config_.feat_config.sample_rate = 16000;\n  config_.feat_config.feature_dim = 80;\n\n  config_.model_config.whisper.encoder = whisper_encoder.c_str();\n  config_.model_config.whisper.decoder = whisper_decoder.c_str();\n  config_.model_config.tokens = tokens.c_str();\n  config_.model_config.num_threads = 1;\n  config_.model_config.debug = 1;\n  config_.model_config.model_type = \"whisper\";\n\n  config_.decoding_method = \"greedy_search\";\n  config_.max_active_paths = 4;\n\n  recognizer_ = SherpaOnnxCreateOfflineRecognizer(&config_);\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::InitParaformer() {\n  std::string paraformer = \"./paraformer.onnx\";\n  std::string tokens = \"./tokens.txt\";\n\n  bool is_ok = true;\n\n  if (Exists(\"./paraformer.int8.onnx\")) {\n    paraformer = \"./paraformer.int8.onnx\";\n  } else if (!Exists(paraformer)) {\n    std::string msg = paraformer + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(tokens)) {\n    std::string msg = tokens + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!is_ok) {\n    ShowInitRecognizerHelpMessage();\n    return;\n  }\n\n  memset(&config_, 0, sizeof(config_));\n\n  config_.feat_config.sample_rate = 16000;\n  
config_.feat_config.feature_dim = 80;\n\n  config_.model_config.paraformer.model = paraformer.c_str();\n  config_.model_config.tokens = tokens.c_str();\n  config_.model_config.num_threads = 1;\n  config_.model_config.debug = 1;\n  config_.model_config.model_type = \"paraformer\";\n\n  config_.decoding_method = \"greedy_search\";\n  config_.max_active_paths = 4;\n\n  recognizer_ = SherpaOnnxCreateOfflineRecognizer(&config_);\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::InitRecognizer() {\n  if (Exists(\"./paraformer.onnx\") || Exists(\"./paraformer.int8.onnx\")) {\n    InitParaformer();\n    return;\n  }\n\n  if (Exists(\"./whisper-encoder.onnx\") || Exists(\"./whisper-encoder.int8.onnx\")) {\n    InitWhisper();\n    return;\n  }\n\n  // assume it is transducer\n\n  std::string encoder = \"./encoder.onnx\";\n  std::string decoder = \"./decoder.onnx\";\n  std::string joiner = \"./joiner.onnx\";\n  std::string tokens = \"./tokens.txt\";\n\n  bool is_ok = true;\n  if (!Exists(encoder)) {\n    std::string msg = encoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(decoder)) {\n    std::string msg = decoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(joiner)) {\n    std::string msg = joiner + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(tokens)) {\n    std::string msg = tokens + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!is_ok) {\n    ShowInitRecognizerHelpMessage();\n    return;\n  }\n  memset(&config_, 0, sizeof(config_));\n\n  config_.feat_config.sample_rate = 16000;\n  config_.feat_config.feature_dim = 80;\n\n  config_.model_config.transducer.encoder = encoder.c_str();\n  config_.model_config.transducer.decoder = decoder.c_str();\n  config_.model_config.transducer.joiner = joiner.c_str();\n  config_.model_config.tokens = 
tokens.c_str();\n  config_.model_config.num_threads = 1;\n  config_.model_config.debug = 0;\n  config_.model_config.model_type = \"transducer\";\n\n  config_.decoding_method = \"greedy_search\";\n  config_.max_active_paths = 4;\n\n  recognizer_ = SherpaOnnxCreateOfflineRecognizer(&config_);\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::AppendTextToEditCtrl(\n    const std::string &s) {\n  // get the initial text length\n  int nLength = my_text_.GetWindowTextLength();\n  // put the selection at the end of text\n  my_text_.SetSel(nLength, nLength);\n  // replace the selection\n\n  std::wstring wstr = Utf8ToUtf16(s);\n\n  my_text_.ReplaceSel(wstr.c_str());\n}\n\nvoid CNonStreamingSpeechRecognitionDlg::AppendLineToMultilineEditCtrl(\n    const std::string &s) {\n  AppendTextToEditCtrl(\"\\r\\n\" + s);\n}\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/NonStreamingSpeechRecognitionDlg.h",
    "content": "\n// NonStreamingSpeechRecognitionDlg.h : header file\n//\n\n#pragma once\n\n#include <string>\n#include <vector>\n\n#include \"portaudio.h\"\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nclass Microphone {\n public:\n  Microphone();\n  ~Microphone();\n};\n\n// CNonStreamingSpeechRecognitionDlg dialog\nclass CNonStreamingSpeechRecognitionDlg : public CDialogEx {\n  // Construction\n public:\n  CNonStreamingSpeechRecognitionDlg(\n      CWnd *pParent = nullptr);  // standard constructor\n  ~CNonStreamingSpeechRecognitionDlg();\n\n// Dialog Data\n#ifdef AFX_DESIGN_TIME\n  enum { IDD = IDD_NONSTREAMINGSPEECHRECOGNITION_DIALOG };\n#endif\n\n protected:\n  virtual void DoDataExchange(CDataExchange *pDX);  // DDX/DDV support\n\n  // Implementation\n protected:\n  HICON m_hIcon;\n\n  // Generated message map functions\n  virtual BOOL OnInitDialog();\n  afx_msg void OnPaint();\n  afx_msg HCURSOR OnQueryDragIcon();\n  DECLARE_MESSAGE_MAP()\n public:\n  afx_msg void OnBnClickedOk();\n  int RunThread();\n\n private:\n  Microphone mic_;\n\n  const SherpaOnnxOfflineRecognizer *recognizer_ = nullptr;\n  SherpaOnnxOfflineRecognizerConfig config_;\n\n  PaStream *pa_stream_ = nullptr;\n  CButton my_btn_;\n  CEdit my_text_;\n  std::vector<std::string> results_;\n\n public:\n  bool started_ = false;\n  std::vector<float> samples_;\n\n private:\n  void AppendTextToEditCtrl(const std::string &s);\n  void AppendLineToMultilineEditCtrl(const std::string &s);\n  void InitMicrophone();\n\n  bool Exists(const std::string &filename);\n  void InitRecognizer();\n\n  void InitParaformer();\n  void InitWhisper();\n  void ShowInitRecognizerHelpMessage();\n};\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/Resource.h",
    "content": "//{{NO_DEPENDENCIES}}\n// Microsoft Visual C++ generated include file.\n// Used by NonStreamingSpeechRecognition.rc\n//\n#define IDD_NONSTREAMINGSPEECHRECOGNITION_DIALOG 102\n#define IDR_MAINFRAME 128\n#define IDC_EDIT1 1000\n\n// Next default values for new objects\n//\n#ifdef APSTUDIO_INVOKED\n#ifndef APSTUDIO_READONLY_SYMBOLS\n#define _APS_NEXT_RESOURCE_VALUE 130\n#define _APS_NEXT_COMMAND_VALUE 32771\n#define _APS_NEXT_CONTROL_VALUE 1001\n#define _APS_NEXT_SYMED_VALUE 101\n#endif\n#endif\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/framework.h",
    "content": "#pragma once\n\n#ifndef VC_EXTRALEAN\n#define VC_EXTRALEAN  // Exclude rarely-used stuff from Windows headers\n#endif\n\n#include \"targetver.h\"\n\n#define _ATL_CSTRING_EXPLICIT_CONSTRUCTORS  // some CString constructors will be\n                                            // explicit\n\n// turns off MFC's hiding of some common and often safely ignored warning\n// messages\n#define _AFX_ALL_WARNINGS\n\n#include <afxext.h>  // MFC extensions\n#include <afxwin.h>  // MFC core and standard components\n\n#ifndef _AFX_NO_OLE_SUPPORT\n#include <afxdtctl.h>  // MFC support for Internet Explorer 4 Common Controls\n#endif\n#ifndef _AFX_NO_AFXCMN_SUPPORT\n#include <afxcmn.h>  // MFC support for Windows Common Controls\n#endif               // _AFX_NO_AFXCMN_SUPPORT\n\n#include <afxcontrolbars.h>  // MFC support for ribbons and control bars\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/pch.cpp",
    "content": "// pch.cpp: source file corresponding to the pre-compiled header\n\n#include \"pch.h\"\n\n// When you are using pre-compiled headers, this source file is necessary for\n// compilation to succeed.\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/pch.h",
    "content": "// pch.h: This is a precompiled header file.\n// Files listed below are compiled only once, improving build performance for\n// future builds. This also affects IntelliSense performance, including code\n// completion and many code browsing features. However, files listed here are\n// ALL re-compiled if any one of them is updated between builds. Do not add\n// files here that you will be updating frequently as this negates the\n// performance advantage.\n\n#ifndef PCH_H\n#define PCH_H\n\n// add headers that you want to pre-compile here\n#include \"framework.h\"\n\n#endif  // PCH_H\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/sherpa-onnx-deps.props",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ImportGroup Label=\"PropertySheets\" />\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup>\n    <SherpaOnnxBuildDirectory>..\\..\\build</SherpaOnnxBuildDirectory>\n    <SherpaOnnxInstallDirectory>..\\..\\build\\install</SherpaOnnxInstallDirectory>\n    <SherpaOnnxLibraries>\n        sherpa-onnx-portaudio_static.lib;\n        sherpa-onnx-c-api.lib;\n        sherpa-onnx-core.lib;\n        kaldi-decoder-core.lib;\n        sherpa-onnx-kaldifst-core.lib;\n        sherpa-onnx-fstfar.lib;\n        sherpa-onnx-fst.lib;\n        kaldi-native-fbank-core.lib;\n        kissfft-float.lib;\n        onnxruntime.lib;\n        piper_phonemize.lib;\n        espeak-ng.lib;\n        ucd.lib;\n        ssentencepiece_core.lib;\n    </SherpaOnnxLibraries>\n  </PropertyGroup>\n  <ItemDefinitionGroup>\n    <ClCompile>\n      <AdditionalIncludeDirectories>\n\t  $(SherpaOnnxBuildDirectory)\\_deps\\portaudio-src\\include;\n    $(SherpaOnnxInstallDirectory)\\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ClCompile>\n    <Link>\n      <AdditionalLibraryDirectories>$(SherpaOnnxInstallDirectory)\\lib;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>\n      <AdditionalDependencies>$(SherpaOnnxLibraries);</AdditionalDependencies>\n    </Link>\n  </ItemDefinitionGroup>\n  <ItemGroup />\n</Project>\n"
  },
  {
    "path": "mfc-examples/NonStreamingSpeechRecognition/targetver.h",
    "content": "#pragma once\n\n// Including SDKDDKVer.h defines the highest available Windows platform.\n\n// If you wish to build your application for a previous Windows platform,\n// include WinSDKVer.h and set the _WIN32_WINNT macro to the platform you wish\n// to support before including SDKDDKVer.h.\n\n#include <SDKDDKVer.h>\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeech.cpp",
    "content": "\n// NonStreamingTextToSpeech.cpp : Defines the class behaviors for the application.\n//\n\n#include \"pch.h\"\n#include \"framework.h\"\n#include \"NonStreamingTextToSpeech.h\"\n#include \"NonStreamingTextToSpeechDlg.h\"\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\n\n// CNonStreamingTextToSpeechApp\n\nBEGIN_MESSAGE_MAP(CNonStreamingTextToSpeechApp, CWinApp)\n\tON_COMMAND(ID_HELP, &CWinApp::OnHelp)\nEND_MESSAGE_MAP()\n\n\n// CNonStreamingTextToSpeechApp construction\n\nCNonStreamingTextToSpeechApp::CNonStreamingTextToSpeechApp()\n{\n\t// TODO: add construction code here,\n\t// Place all significant initialization in InitInstance\n}\n\n\n// The one and only CNonStreamingTextToSpeechApp object\n\nCNonStreamingTextToSpeechApp theApp;\n\n\n// CNonStreamingTextToSpeechApp initialization\n\nBOOL CNonStreamingTextToSpeechApp::InitInstance()\n{\n\tCWinApp::InitInstance();\n\n\n\t// Create the shell manager, in case the dialog contains\n\t// any shell tree view or shell list view controls.\n\tCShellManager *pShellManager = new CShellManager;\n\n\t// Activate \"Windows Native\" visual manager for enabling themes in MFC controls\n\tCMFCVisualManager::SetDefaultManager(RUNTIME_CLASS(CMFCVisualManagerWindows));\n\n\t// Standard initialization\n\t// If you are not using these features and wish to reduce the size\n\t// of your final executable, you should remove from the following\n\t// the specific initialization routines you do not need\n\t// Change the registry key under which our settings are stored\n\t// TODO: You should modify this string to be something appropriate\n\t// such as the name of your company or organization\n\tSetRegistryKey(_T(\"Local AppWizard-Generated Applications\"));\n\n\tCNonStreamingTextToSpeechDlg dlg;\n\tm_pMainWnd = &dlg;\n\tINT_PTR nResponse = dlg.DoModal();\n\tif (nResponse == IDOK)\n\t{\n\t\t// TODO: Place code here to handle when the dialog is\n\t\t//  dismissed with OK\n\t}\n\telse if (nResponse == IDCANCEL)\n\t{\n\t\t// 
TODO: Place code here to handle when the dialog is\n\t\t//  dismissed with Cancel\n\t}\n\telse if (nResponse == -1)\n\t{\n\t\tTRACE(traceAppMsg, 0, \"Warning: dialog creation failed, so application is terminating unexpectedly.\\n\");\n\t\tTRACE(traceAppMsg, 0, \"Warning: if you are using MFC controls on the dialog, you cannot #define _AFX_NO_MFC_CONTROLS_IN_DIALOGS.\\n\");\n\t}\n\n\t// Delete the shell manager created above.\n\tif (pShellManager != nullptr)\n\t{\n\t\tdelete pShellManager;\n\t}\n\n#if !defined(_AFXDLL) && !defined(_AFX_NO_MFC_CONTROLS_IN_DIALOGS)\n\tControlBarCleanUp();\n#endif\n\n\t// Since the dialog has been closed, return FALSE so that we exit the\n\t//  application, rather than start the application's message pump.\n\treturn FALSE;\n}\n\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeech.h",
    "content": "\n// NonStreamingTextToSpeech.h : main header file for the NonStreamingTextToSpeech application\n//\n\n#pragma once\n\n#ifndef __AFXWIN_H__\n\t#error \"include 'pch.h' before including this file for PCH\"\n#endif\n\n#include \"resource.h\"\t\t// main symbols\n\n\n// CNonStreamingTextToSpeechApp:\n// See NonStreamingTextToSpeech.cpp for the implementation of this class\n//\n\nclass CNonStreamingTextToSpeechApp : public CWinApp\n{\npublic:\n\tCNonStreamingTextToSpeechApp();\n\n// Overrides\npublic:\n\tvirtual BOOL InitInstance();\n\n// Implementation\n\n\tDECLARE_MESSAGE_MAP()\n};\n\nextern CNonStreamingTextToSpeechApp theApp;\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeech.vcxproj",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project DefaultTargets=\"Build\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup Label=\"ProjectConfigurations\">\n    <ProjectConfiguration Include=\"Debug|Win32\">\n      <Configuration>Debug</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|Win32\">\n      <Configuration>Release</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Debug|x64\">\n      <Configuration>Debug</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|x64\">\n      <Configuration>Release</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n  </ItemGroup>\n  <PropertyGroup Label=\"Globals\">\n    <VCProjectVersion>17.0</VCProjectVersion>\n    <ProjectGuid>{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}</ProjectGuid>\n    <Keyword>MFCProj</Keyword>\n    <RootNamespace>NonStreamingTextToSpeech</RootNamespace>\n    <WindowsTargetPlatformVersion>10.0</WindowsTargetPlatformVersion>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.Default.props\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Dynamic</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    
<UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Dynamic</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.props\" />\n  <ImportGroup Label=\"ExtensionSettings\">\n  </ImportGroup>\n  <ImportGroup Label=\"Shared\">\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n  
  <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      
<Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    
<ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemGroup>\n    <ClInclude Include=\"framework.h\" />\n    <ClInclude Include=\"NonStreamingTextToSpeech.h\" />\n    <ClInclude Include=\"NonStreamingTextToSpeechDlg.h\" />\n    <ClInclude Include=\"pch.h\" />\n    <ClInclude Include=\"Resource.h\" />\n    <ClInclude Include=\"targetver.h\" />\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile Include=\"NonStreamingTextToSpeech.cpp\" />\n    <ClCompile Include=\"NonStreamingTextToSpeechDlg.cpp\" />\n    
<ClCompile Include=\"pch.cpp\">\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">Create</PrecompiledHeader>\n    </ClCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"NonStreamingTextToSpeech.rc\" />\n  </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\NonStreamingTextToSpeech.rc2\" />\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\NonStreamingTextToSpeech.ico\" />\n  </ItemGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.targets\" />\n  <ImportGroup Label=\"ExtensionTargets\">\n  </ImportGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeech.vcxproj.filters",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup>\n    <Filter Include=\"Source Files\">\n      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>\n      <Extensions>cpp;c;cc;cxx;c++;cppm;ixx;def;odl;idl;hpj;bat;asm;asmx</Extensions>\n    </Filter>\n    <Filter Include=\"Header Files\">\n      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>\n      <Extensions>h;hh;hpp;hxx;h++;hm;inl;inc;ipp;xsd</Extensions>\n    </Filter>\n    <Filter Include=\"Resource Files\">\n      <UniqueIdentifier>{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}</UniqueIdentifier>\n      <Extensions>rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms</Extensions>\n    </Filter>\n  </ItemGroup>\n  <ItemGroup>\n    <ClInclude Include=\"NonStreamingTextToSpeech.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"NonStreamingTextToSpeechDlg.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"framework.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"targetver.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"Resource.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"pch.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile Include=\"NonStreamingTextToSpeech.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"NonStreamingTextToSpeechDlg.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"pch.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"NonStreamingTextToSpeech.rc\">\n      <Filter>Resource Files</Filter>\n    </ResourceCompile>\n 
 </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\NonStreamingTextToSpeech.rc2\">\n      <Filter>Resource Files</Filter>\n    </None>\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\NonStreamingTextToSpeech.ico\">\n      <Filter>Resource Files</Filter>\n    </Image>\n  </ItemGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeechDlg.cpp",
    "content": "\n// NonStreamingTextToSpeechDlg.cpp : implementation file\n//\n\n#include \"pch.h\"\n#include \"framework.h\"\n#include \"NonStreamingTextToSpeech.h\"\n#include \"NonStreamingTextToSpeechDlg.h\"\n#include \"afxdialogex.h\"\n\n#include <fstream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <stdexcept>\n#include <string>\n#include <thread>  // NOLINT\n#include <vector>\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\nMicrophone::Microphone() {\n  PaError err = Pa_Initialize();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\nMicrophone::~Microphone() {\n  PaError err = Pa_Terminate();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\n// NOTE(fangjun): Code is copied from\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/sherpa-onnx-offline-tts-play.cc#L22\nstatic std::condition_variable g_cv;\nstatic std::mutex g_cv_m;\n\nstruct Samples {\n  std::vector<float> data;\n  int32_t consumed = 0;\n};\n\nstruct Buffer {\n  std::queue<Samples> samples;\n  std::mutex mutex;\n};\n\nstatic Buffer g_buffer;\n\nstatic bool g_started = false;\nstatic bool g_stopped = false;\nstatic bool g_killed = false;\n\nstatic int32_t AudioGeneratedCallback(const float *s, int32_t n) {\n  if (n > 0) {\n    Samples samples;\n    samples.data = std::vector<float>{s, s + n};\n\n    std::lock_guard<std::mutex> lock(g_buffer.mutex);\n    g_buffer.samples.push(std::move(samples));\n    g_started = true;\n  }\n  if (g_killed) {\n    return 0;\n  }\n  return 1;\n}\n\nstatic int PlayCallback(const void * /*in*/, void *out,\n                        unsigned long _n,  // NOLINT\n                        const PaStreamCallbackTimeInfo * /*time_info*/,\n                        PaStreamCallbackFlags /*status_flags*/,\n                        void * /*user_data*/) {\n  int32_t n = static_cast<int32_t>(_n);\n  
if (g_killed) {\n    return paComplete;\n  }\n\n  float *pout = reinterpret_cast<float *>(out);\n  std::lock_guard<std::mutex> lock(g_buffer.mutex);\n\n  if (g_buffer.samples.empty()) {\n    if (g_stopped) {\n      // no more data is available and we have processed all of the samples\n      return paComplete;\n    }\n\n    // The current sentence is so long, though very unlikely, that\n    // the model has not finished processing it yet.\n    std::fill_n(pout, n, 0);\n\n    return paContinue;\n  }\n\n  int32_t k = 0;\n  for (; k < n && !g_buffer.samples.empty();) {\n    int32_t this_block = n - k;\n\n    auto &p = g_buffer.samples.front();\n\n    int32_t remaining = static_cast<int32_t>(p.data.size()) - p.consumed;\n\n    if (this_block <= remaining) {\n      std::copy(p.data.begin() + p.consumed,\n                p.data.begin() + p.consumed + this_block, pout + k);\n      p.consumed += this_block;\n\n      k = n;\n\n      if (p.consumed == p.data.size()) {\n        g_buffer.samples.pop();\n      }\n      break;\n    }\n\n    std::copy(p.data.begin() + p.consumed, p.data.end(), pout + k);\n    k += static_cast<int32_t>(p.data.size()) - p.consumed;\n    g_buffer.samples.pop();\n  }\n\n  if (k < n) {\n    std::fill_n(pout + k, n - k, 0);\n  }\n\n  if (g_stopped && g_buffer.samples.empty()) {\n    return paComplete;\n  }\n\n  return paContinue;\n}\n\nstatic void PlayCallbackFinished(void *userData) { g_cv.notify_all(); }\n\nstatic void StartPlayback(int32_t sample_rate) {\n  int32_t frames_per_buffer = 1024;\n  PaStreamParameters outputParameters;\n  PaStream *stream;\n  PaError err;\n\n  outputParameters.device =\n      Pa_GetDefaultOutputDevice(); /* default output device */\n\n  outputParameters.channelCount = 1;         /* mono output */\n  outputParameters.sampleFormat = paFloat32; /* 32 bit floating point output */\n  outputParameters.suggestedLatency =\n      Pa_GetDeviceInfo(outputParameters.device)->defaultLowOutputLatency;\n  
outputParameters.hostApiSpecificStreamInfo = nullptr;\n\n  err = Pa_OpenStream(&stream, nullptr, /* no input */\n                      &outputParameters, sample_rate, frames_per_buffer,\n                      paClipOff,  // we won't output out of range samples so\n                                  //   don't bother clipping them\n                      PlayCallback, nullptr);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  err = Pa_SetStreamFinishedCallback(stream, &PlayCallbackFinished);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  err = Pa_StartStream(stream);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  std::unique_lock<std::mutex> lock(g_cv_m);\n  while (!g_killed && !g_stopped &&\n         (!g_started || (g_started && !g_buffer.samples.empty()))) {\n    g_cv.wait(lock);\n  }\n\n  err = Pa_StopStream(stream);\n  if (err != paNoError) {\n    return;\n  }\n\n  err = Pa_CloseStream(stream);\n  if (err != paNoError) {\n    return;\n  }\n}\n\n\n// CAboutDlg dialog used for App About\n\nclass CAboutDlg : public CDialogEx\n{\npublic:\n\tCAboutDlg();\n\n// Dialog Data\n#ifdef AFX_DESIGN_TIME\n\tenum { IDD = IDD_ABOUTBOX };\n#endif\n\n\tprotected:\n\tvirtual void DoDataExchange(CDataExchange* pDX);    // DDX/DDV support\n\n// Implementation\nprotected:\n\tDECLARE_MESSAGE_MAP()\n};\n\nCAboutDlg::CAboutDlg() : CDialogEx(IDD_ABOUTBOX)\n{\n}\n\nvoid CAboutDlg::DoDataExchange(CDataExchange* pDX)\n{\n\tCDialogEx::DoDataExchange(pDX);\n}\n\nBEGIN_MESSAGE_MAP(CAboutDlg, CDialogEx)\nEND_MESSAGE_MAP()\n\n\n// CNonStreamingTextToSpeechDlg dialog\n\n// see\n// https://stackoverflow.com/questions/7153935/how-to-convert-utf-8-stdstring-to-utf-16-stdwstring\nstatic std::wstring Utf8ToUtf16(const std::string &utf8) {\n  
std::vector<unsigned long> unicode;\n  size_t i = 0;\n  while (i < utf8.size()) {\n    unsigned long uni;\n    size_t todo;\n    bool error = false;\n    unsigned char ch = utf8[i++];\n    if (ch <= 0x7F) {\n      uni = ch;\n      todo = 0;\n    } else if (ch <= 0xBF) {\n      throw std::logic_error(\"not a UTF-8 string\");\n    } else if (ch <= 0xDF) {\n      uni = ch & 0x1F;\n      todo = 1;\n    } else if (ch <= 0xEF) {\n      uni = ch & 0x0F;\n      todo = 2;\n    } else if (ch <= 0xF7) {\n      uni = ch & 0x07;\n      todo = 3;\n    } else {\n      throw std::logic_error(\"not a UTF-8 string\");\n    }\n    for (size_t j = 0; j < todo; ++j) {\n      if (i == utf8.size()) throw std::logic_error(\"not a UTF-8 string\");\n      unsigned char ch = utf8[i++];\n      if (ch < 0x80 || ch > 0xBF) throw std::logic_error(\"not a UTF-8 string\");\n      uni <<= 6;\n      uni += ch & 0x3F;\n    }\n    if (uni >= 0xD800 && uni <= 0xDFFF)\n      throw std::logic_error(\"not a UTF-8 string\");\n    if (uni > 0x10FFFF) throw std::logic_error(\"not a UTF-8 string\");\n    unicode.push_back(uni);\n  }\n  std::wstring utf16;\n  for (size_t i = 0; i < unicode.size(); ++i) {\n    unsigned long uni = unicode[i];\n    if (uni <= 0xFFFF) {\n      utf16 += (wchar_t)uni;\n    } else {\n      uni -= 0x10000;\n      utf16 += (wchar_t)((uni >> 10) + 0xD800);\n      utf16 += (wchar_t)((uni & 0x3FF) + 0xDC00);\n    }\n  }\n  return utf16;\n}\n\n// The system calls this function to obtain the cursor to display while the user drags\n//  the minimized window.\nHCURSOR CNonStreamingTextToSpeechDlg::OnQueryDragIcon()\n{\n\treturn static_cast<HCURSOR>(m_hIcon);\n}\n\n\nvoid AppendTextToEditCtrl(CEdit& e, const std::string &s) {\n  // get the initial text length\n  int nLength = e.GetWindowTextLength();\n  // put the selection at the end of text\n  e.SetSel(nLength, nLength);\n  // replace the selection\n\n  std::wstring wstr = Utf8ToUtf16(s);\n\n  // my_text_.ReplaceSel(wstr.c_str());\n  
e.ReplaceSel(wstr.c_str());\n}\n\nvoid AppendLineToMultilineEditCtrl(CEdit& e, const std::string &s) {\n  AppendTextToEditCtrl(e, \"\\r\\n\" + s);\n}\n\n\nCNonStreamingTextToSpeechDlg::CNonStreamingTextToSpeechDlg(CWnd* pParent /*=nullptr*/)\n\t: CDialogEx(IDD_NONSTREAMINGTEXTTOSPEECH_DIALOG, pParent)\n       {\n\tm_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);\n}\n\nvoid CNonStreamingTextToSpeechDlg::DoDataExchange(CDataExchange* pDX)\n{\n        CDialogEx::DoDataExchange(pDX);\n        DDX_Control(pDX, IDC_HINT, my_hint_);\n        DDX_Control(pDX, IDC_SPEAKER, speaker_id_);\n        DDX_Control(pDX, IDC_SPEED, speed_);\n        DDX_Control(pDX, IDOK, generate_btn_);\n        DDX_Control(pDX, IDC_TEXT, my_text_);\n        DDX_Control(pDX, IDC_OUTPUT_FILENAME, output_filename_);\n}\n\nBEGIN_MESSAGE_MAP(CNonStreamingTextToSpeechDlg, CDialogEx)\n\tON_WM_SYSCOMMAND()\n\tON_WM_PAINT()\n\tON_WM_QUERYDRAGICON()\n        ON_BN_CLICKED(IDOK, &CNonStreamingTextToSpeechDlg::OnBnClickedOk)\n        ON_BN_CLICKED(IDC_STOP, &CNonStreamingTextToSpeechDlg::OnBnClickedStop)\n        END_MESSAGE_MAP()\n\n\n// CNonStreamingTextToSpeechDlg message handlers\n\nBOOL CNonStreamingTextToSpeechDlg::OnInitDialog()\n{\n\tCDialogEx::OnInitDialog();\n\n\t// Add \"About...\" menu item to system menu.\n\n\t// IDM_ABOUTBOX must be in the system command range.\n\tASSERT((IDM_ABOUTBOX & 0xFFF0) == IDM_ABOUTBOX);\n\tASSERT(IDM_ABOUTBOX < 0xF000);\n\n\tCMenu* pSysMenu = GetSystemMenu(FALSE);\n\tif (pSysMenu != nullptr)\n\t{\n\t\tBOOL bNameValid;\n\t\tCString strAboutMenu;\n\t\tbNameValid = strAboutMenu.LoadString(IDS_ABOUTBOX);\n\t\tASSERT(bNameValid);\n\t\tif (!strAboutMenu.IsEmpty())\n\t\t{\n\t\t\tpSysMenu->AppendMenu(MF_SEPARATOR);\n\t\t\tpSysMenu->AppendMenu(MF_STRING, IDM_ABOUTBOX, strAboutMenu);\n\t\t}\n\t}\n\n\t// Set the icon for this dialog.  
The framework does this automatically\n\t//  when the application's main window is not a dialog\n\tSetIcon(m_hIcon, TRUE);\t\t\t// Set big icon\n\tSetIcon(m_hIcon, FALSE);\t\t// Set small icon\n\n\t// TODO: Add extra initialization here\n    Init();\n\n\treturn TRUE;  // return TRUE  unless you set the focus to a control\n}\n\nvoid CNonStreamingTextToSpeechDlg::OnSysCommand(UINT nID, LPARAM lParam)\n{\n\tif ((nID & 0xFFF0) == IDM_ABOUTBOX)\n\t{\n\t\tCAboutDlg dlgAbout;\n\t\tdlgAbout.DoModal();\n\t}\n\telse\n\t{\n\t\tCDialogEx::OnSysCommand(nID, lParam);\n\t}\n}\n\n// If you add a minimize button to your dialog, you will need the code below\n//  to draw the icon.  For MFC applications using the document/view model,\n//  this is automatically done for you by the framework.\n\nvoid CNonStreamingTextToSpeechDlg::OnPaint()\n{\n\tif (IsIconic())\n\t{\n\t\tCPaintDC dc(this); // device context for painting\n\n\t\tSendMessage(WM_ICONERASEBKGND, reinterpret_cast<WPARAM>(dc.GetSafeHdc()), 0);\n\n\t\t// Center icon in client rectangle\n\t\tint cxIcon = GetSystemMetrics(SM_CXICON);\n\t\tint cyIcon = GetSystemMetrics(SM_CYICON);\n\t\tCRect rect;\n\t\tGetClientRect(&rect);\n\t\tint x = (rect.Width() - cxIcon + 1) / 2;\n\t\tint y = (rect.Height() - cyIcon + 1) / 2;\n\n\t\t// Draw the icon\n\t\tdc.DrawIcon(x, y, m_hIcon);\n\t}\n\telse\n\t{\n\t\tCDialogEx::OnPaint();\n\t}\n}\n\nbool Exists(const std::string &filename) {\n  std::ifstream is(filename);\n  return is.good();\n}\n\nvoid CNonStreamingTextToSpeechDlg::InitHint() {\n    AppendLineToMultilineEditCtrl(my_hint_, \"Speaker ID: Used only for multi-speaker models. Example value: 0\");\n    AppendLineToMultilineEditCtrl(my_hint_, \"Speed: Larger -> Faster in speech speed. 
Example value: 1.0\");\n    AppendLineToMultilineEditCtrl(my_hint_, \"\\r\\nPlease input your text and click the Generate button\");\n\n}\n\nvoid CNonStreamingTextToSpeechDlg::Init() {\n    InitHint();\n    speaker_id_.SetWindowText(Utf8ToUtf16(\"0\").c_str());\n    speed_.SetWindowText(Utf8ToUtf16(\"1.0\").c_str());\n    output_filename_.SetWindowText(Utf8ToUtf16(\"./generated.wav\").c_str());\n\n  bool ok = true;\n  std::string error_message = \"--------------------\\r\\n\";\n  if (!Exists(\"./model.onnx\")) {\n    error_message += \"Cannot find ./model.onnx\\r\\n\";\n    ok = false;\n  }\n\n  if (!Exists(\"./tokens.txt\")) {\n    error_message += \"Cannot find ./tokens.txt\\r\\n\";\n    ok = false;\n  }\n  // it is OK to leave lexicon.txt and espeak-ng-data empty\n  // since models using characters don't need them\n\n  if (!ok) {\n    generate_btn_.EnableWindow(FALSE);\n    error_message +=\n        \"\\r\\nPlease refer to\\r\\n\"\n        \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\"\n        \"\\r\\nor\\r\\n\"\n        \"https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models\";\n\n    error_message += \"\\r\\nto download models.\\r\\n\";\n    error_message += \"\\r\\nWe give several examples below\\r\\n\";\n    error_message += \"      1. Use a Kokoro TTS model (multi-lingual, e.g., English + Chinese)\\r\\n\";\n    error_message += \"      2. Use a Kokoro TTS model (English only)\\r\\n\";\n    error_message += \"      3. Use a VITS Piper TTS model\\r\\n\";\n    error_message += \"      4. Use a VITS Chinese TTS model\\r\\n\";\n    error_message += \"      5. Use a Matcha TTS model\\r\\n\";\n    error_message += \"\\r\\n\";\n\n    error_message += \n        \"----------1. 
Use a Kokoro TTS model (multi-lingual, e.g., English + Chinese)----------\\r\\n\"\n        \"(a) Download the model from \\r\\n\"\n        \"     https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\\r\\n\"\n        \"(b) Uncompress it and you will get a directory kokoro-multi-lang-v1_0\\r\\n\"\n        \"(c) Switch to the directory kokoro-multi-lang-v1_0\\r\\n\"\n        \"(d) Copy the current exe to the directory kokoro-multi-lang-v1_0\\r\\n\"\n        \"(e) Done! You can now run the exe in the directory kokoro-multi-lang-v1_0\\r\\n\";\n\n    error_message +=  \"\\r\\n\";\n\n    error_message += \n        \"----------2. Use a Kokoro TTS model (English only)----------\\r\\n\"\n        \"(a) Download the model from \\r\\n\"\n        \"     https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\\r\\n\"\n        \"(b) Uncompress it and you will get a directory kokoro-en-v0_19\\r\\n\"\n        \"(c) Switch to the directory kokoro-en-v0_19\\r\\n\"\n        \"(d) Copy the current exe to the directory kokoro-en-v0_19\\r\\n\"\n        \"(e) Done! You can now run the exe in the directory kokoro-en-v0_19\\r\\n\";\n\n    error_message +=  \"\\r\\n\";\n\n    error_message += \n        \"----------3. Use a VITS Piper TTS model----------\\r\\n\"\n        \"(a) Download the model from \\r\\n\"\n        \"     https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\\r\\n\"\n        \"(b) Uncompress it and you will get a directory vits-piper-en_US-amy-low\\r\\n\"\n        \"(c) Switch to the directory vits-piper-en_US-amy-low \\r\\n\"\n        \"(d) Rename en_US-amy-low.onnx to model.onnx\\r\\n\"\n        \"(e) Copy the current exe to the directory vits-piper-en_US-amy-low\\r\\n\"\n        \"(f) Done! You can now run the exe in the directory vits-piper-en_US-amy-low\\r\\n\";\n\n    error_message +=  \"\\r\\n\";\n\n    error_message += \n        \"----------4. 
Use a VITS Chinese TTS model----------\\r\\n\"\n        \"(a) Download the model from \\r\\n\"\n        \"     https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\\r\\n\"\n        \"(b) Uncompress it and you will get a directory sherpa-onnx-vits-zh-ll\\r\\n\"\n        \"(c) Switch to the directory sherpa-onnx-vits-zh-ll\\r\\n\"\n        \"(d) Copy the current exe to the directory sherpa-onnx-vits-zh-ll\\r\\n\"\n        \"(e) Done! You can now run the exe in the directory sherpa-onnx-vits-zh-ll\\r\\n\";\n\n    error_message +=  \"\\r\\n\";\n\n    error_message += \n        \"----------5. Use a Matcha TTS model----------\\r\\n\"\n        \"(a) Download the model from \\r\\n\"\n        \"     https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\\r\\n\"\n        \"(b) Uncompress it and you will get a directory matcha-icefall-zh-baker\\r\\n\"\n        \"(c) Switch to the directory matcha-icefall-zh-baker\\r\\n\"\n        \"(d) Rename model-steps-3.onnx to model.onnx\\r\\n\"\n        \"(e) Download a vocoder model from \\r\\n\"\n        \"      https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\\r\\n\"\n        \"(f) Rename vocos-22khz-univ.onnx to vocos.onnx\\r\\n\"\n        \"(g) Remember to put vocos.onnx in the directory matcha-icefall-zh-baker\\r\\n\"\n        \"(h) Copy the current exe to the directory matcha-icefall-zh-baker\\r\\n\"\n        \"(i) Done! 
You can now run the exe in the directory matcha-icefall-zh-baker\\r\\n\";\n\n    AppendLineToMultilineEditCtrl(my_hint_, error_message);\n    return;\n  }\n\n  // Now init tts\n  SherpaOnnxOfflineTtsConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model.debug = 0;\n  config.model.num_threads = 4;\n  config.model.provider = \"cpu\";\n\n  if (Exists(\"./voices.bin\")) {\n    // it is a kokoro tts model\n    config.model.kokoro.model = \"./model.onnx\";\n    config.model.kokoro.voices = \"./voices.bin\";\n    config.model.kokoro.tokens = \"./tokens.txt\";\n    config.model.kokoro.data_dir = \"./espeak-ng-data\";\n    if (Exists(\"./dict/jieba.dict.utf8\") && Exists(\"./lexicon-zh.txt\")) {\n      config.model.kokoro.dict_dir = \"./dict\";\n      config.model.kokoro.lexicon = \"./lexicon-us-en.txt,./lexicon-zh.txt\";\n    }\n  } else if (Exists(\"./hifigan.onnx\") || Exists(\"./vocos.onnx\")) {\n    // it is a matcha tts model\n    config.model.matcha.acoustic_model = \"./model.onnx\";\n\n    if (Exists(\"./hifigan.onnx\")) {\n      config.model.matcha.vocoder = \"./hifigan.onnx\";\n    } else if (Exists(\"./vocos.onnx\")) {\n      config.model.matcha.vocoder = \"./vocos.onnx\";\n    }\n\n    config.model.matcha.tokens = \"./tokens.txt\";\n\n    if (Exists(\"./espeak-ng-data/phontab\")) {\n      config.model.matcha.data_dir = \"./espeak-ng-data\";\n    }\n\n    if(Exists(\"./lexicon.txt\")) {\n      config.model.matcha.lexicon = \"./lexicon.txt\";\n    }\n\n    if (Exists(\"./dict/jieba.dict.utf8\")) {\n      config.model.matcha.dict_dir = \"./dict\";\n    }\n  } else {\n    // it is a vits tts model\n    config.model.vits.model = \"./model.onnx\";\n    config.model.vits.tokens = \"./tokens.txt\";\n    if (Exists(\"./espeak-ng-data/phontab\")) {\n      config.model.vits.data_dir = \"./espeak-ng-data\";\n    } \n\n    if (Exists(\"./lexicon.txt\")) {\n      config.model.vits.lexicon = \"./lexicon.txt\";\n    }\n\n    if 
(Exists(\"./dict/jieba.dict.utf8\")) {\n      config.model.vits.dict_dir = \"./dict\";\n    }\n  }\n\n  if (Exists(\"./phone.fst\") && Exists(\"./date.fst\") && Exists(\"./number.fst\")) {\n    config.rule_fsts = \"./phone.fst,./date.fst,./number.fst\";\n  }\n\n  if (Exists(\"./phone-zh.fst\") && Exists(\"./date-zh.fst\") && Exists(\"./number-zh.fst\")) {\n    config.rule_fsts = \"./phone-zh.fst,./date-zh.fst,./number-zh.fst\";\n  }\n\n  if (Exists(\"./rule.far\")) {\n    config.rule_fars = \"./rule.far\";\n  }\n\n  tts_ = SherpaOnnxCreateOfflineTts(&config);\n}\n\n CNonStreamingTextToSpeechDlg::~CNonStreamingTextToSpeechDlg() {\n  // Join the worker threads before destroying tts_, since the generate\n  // thread still uses it.\n  if (generate_thread_ && generate_thread_->joinable()) {\n    generate_thread_->join();\n  }\n\n  if (play_thread_ && play_thread_->joinable()) {\n    play_thread_->join();\n  }\n\n  if (tts_) {\n    SherpaOnnxDestroyOfflineTts(tts_);\n  }\n }\n\n\n static std::string ToString(const CString &s) {\n    CT2CA pszConvertedAnsiString(s);\n    return std::string(pszConvertedAnsiString);\n }\n\nvoid CNonStreamingTextToSpeechDlg::OnBnClickedOk() {\n  CString s;\n  speaker_id_.GetWindowText(s);\n  int speaker_id = _ttoi(s);\n  if (speaker_id < 0) {\n    AfxMessageBox(Utf8ToUtf16(\"Please input a valid speaker ID\").c_str(), MB_OK);\n    return;\n  }\n\n  speed_.GetWindowText(s);\n  float speed = static_cast<float>(_ttof(s));\n  if (speed < 0) {\n    AfxMessageBox(Utf8ToUtf16(\"Please input a valid speed\").c_str(), MB_OK);\n    return;\n  }\n\n  my_text_.GetWindowText(s);\n\n  std::string ss = ToString(s);\n  if (ss.empty()) {\n    AfxMessageBox(Utf8ToUtf16(\"Please input your text\").c_str(), MB_OK);\n    return;\n  }\n\n  if (play_thread_) {\n    g_killed = true;\n    g_stopped = true;\n    if (play_thread_->joinable()) {\n      play_thread_->join();\n    }\n  }\n\n  g_killed = false;\n  g_stopped = false;\n  g_started = false;\n  g_buffer.samples = {};\n\n  // Caution(fangjun): It is not efficient to re-create the thread. 
We use this approach\n  // for simplicity\n  play_thread_ = std::make_unique<std::thread>(StartPlayback, SherpaOnnxOfflineTtsSampleRate(tts_));\n\n  if (generate_thread_ && generate_thread_->joinable()) {\n    generate_thread_->join();\n  }\n\n  output_filename_.GetWindowText(s);\n  std::string filename = ToString(s);\n\n  generate_thread_ = std::make_unique<std::thread>([ss, this,filename, speaker_id, speed]() {\n      std::string text = ss;\n\n      // generate_btn_.EnableWindow(FALSE);\n\n\t  const SherpaOnnxGeneratedAudio *audio =\n\t\t  SherpaOnnxOfflineTtsGenerateWithCallback(tts_, text.c_str(), speaker_id, speed, &AudioGeneratedCallback);\n      // generate_btn_.EnableWindow(TRUE);\n       g_stopped = true;\n\n\t  int ok = SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,\n\t\t\t\t\t\tfilename.c_str());\n\n\t  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n\n\t  if (ok) {\n\t\t// AfxMessageBox(Utf8ToUtf16(std::string(\"Saved to \") + filename + \" successfully\").c_str(), MB_OK);\n\n\t\t// AppendLineToMultilineEditCtrl(my_hint_, std::string(\"Saved to \") + filename + \" successfully\");\n\t  } else {\n\t\t// AfxMessageBox(Utf8ToUtf16(std::string(\"Failed to save to \") + filename).c_str(), MB_OK);\n\n\t\t// AppendLineToMultilineEditCtrl(my_hint_, std::string(\"Failed to saved to \") + filename);\n\t  }\n  });\n\n  //CDialogEx::OnOK();\n}\n\nvoid CNonStreamingTextToSpeechDlg::OnBnClickedStop() { g_killed = true; }\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/NonStreamingTextToSpeechDlg.h",
    "content": "\n// NonStreamingTextToSpeechDlg.h : header file\n//\n\n#pragma once\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n#include <memory>\n#include <thread>\n\n#include \"portaudio.h\"\n\nclass Microphone {\n public:\n  Microphone();\n  ~Microphone();\n};\n\n// CNonStreamingTextToSpeechDlg dialog\nclass CNonStreamingTextToSpeechDlg : public CDialogEx\n{\n// Construction\npublic:\n\tCNonStreamingTextToSpeechDlg(CWnd* pParent = nullptr);\t// standard constructor\n ~CNonStreamingTextToSpeechDlg();\n\n// Dialog Data\n#ifdef AFX_DESIGN_TIME\n\tenum { IDD = IDD_NONSTREAMINGTEXTTOSPEECH_DIALOG };\n#endif\n\n\tprotected:\n\tvirtual void DoDataExchange(CDataExchange* pDX);\t// DDX/DDV support\n\n\n// Implementation\nprotected:\n\tHICON m_hIcon;\n\n\t// Generated message map functions\n\tvirtual BOOL OnInitDialog();\n\tafx_msg void OnSysCommand(UINT nID, LPARAM lParam);\n\tafx_msg void OnPaint();\n\tafx_msg HCURSOR OnQueryDragIcon();\n\tDECLARE_MESSAGE_MAP()\npublic:\n\tCEdit my_hint_;\n\tCEdit speaker_id_;\n\tCEdit speed_;\n\tvoid Init();\n\tvoid InitHint();\n\tCButton generate_btn_;\n\tafx_msg void OnBnClickedOk();\n\n\tconst SherpaOnnxOfflineTts *tts_ = nullptr;\n\tCEdit my_text_;\n\tCEdit output_filename_;\n\nprivate:\n    Microphone mic_;\n\tstd::unique_ptr<std::thread> play_thread_;\n\tstd::unique_ptr<std::thread> generate_thread_;\n\n   public:\n    afx_msg void OnBnClickedStop();\n};\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/Resource.h",
    "content": "//{{NO_DEPENDENCIES}}\n// Microsoft Visual C++ generated include file.\n// Used by NonStreamingTextToSpeech.rc\n//\n#define IDM_ABOUTBOX                    0x0010\n#define IDD_ABOUTBOX                    100\n#define IDS_ABOUTBOX                    101\n#define IDD_NONSTREAMINGTEXTTOSPEECH_DIALOG 102\n#define IDR_MAINFRAME                   128\n#define IDC_SPEAKER                     1000\n#define IDC_SPEED                       1003\n#define IDC_TEXT                        1004\n#define IDC_HINT                        1005\n#define IDC_EDIT1                       1006\n#define IDC_OUTPUT_FILENAME             1006\n#define IDC_STOP                        1009\n\n// Next default values for new objects\n// \n#ifdef APSTUDIO_INVOKED\n#ifndef APSTUDIO_READONLY_SYMBOLS\n#define _APS_NEXT_RESOURCE_VALUE        130\n#define _APS_NEXT_COMMAND_VALUE         32771\n#define _APS_NEXT_CONTROL_VALUE         1010\n#define _APS_NEXT_SYMED_VALUE           101\n#endif\n#endif\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/framework.h",
    "content": "#pragma once\n\n#ifndef VC_EXTRALEAN\n#define VC_EXTRALEAN            // Exclude rarely-used stuff from Windows headers\n#endif\n\n#include \"targetver.h\"\n\n#define _ATL_CSTRING_EXPLICIT_CONSTRUCTORS      // some CString constructors will be explicit\n\n// turns off MFC's hiding of some common and often safely ignored warning messages\n#define _AFX_ALL_WARNINGS\n\n#include <afxwin.h>         // MFC core and standard components\n#include <afxext.h>         // MFC extensions\n\n\n\n\n\n#ifndef _AFX_NO_OLE_SUPPORT\n#include <afxdtctl.h>           // MFC support for Internet Explorer 4 Common Controls\n#endif\n#ifndef _AFX_NO_AFXCMN_SUPPORT\n#include <afxcmn.h>             // MFC support for Windows Common Controls\n#endif // _AFX_NO_AFXCMN_SUPPORT\n\n#include <afxcontrolbars.h>     // MFC support for ribbons and control bars\n\n\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/pch.cpp",
    "content": "// pch.cpp: source file corresponding to the pre-compiled header\n\n#include \"pch.h\"\n\n// When you are using pre-compiled headers, this source file is necessary for compilation to succeed.\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/pch.h",
    "content": "// pch.h: This is a precompiled header file.\n// Files listed below are compiled only once, improving build performance for future builds.\n// This also affects IntelliSense performance, including code completion and many code browsing features.\n// However, files listed here are ALL re-compiled if any one of them is updated between builds.\n// Do not add files here that you will be updating frequently as this negates the performance advantage.\n\n#ifndef PCH_H\n#define PCH_H\n\n// add headers that you want to pre-compile here\n#include \"framework.h\"\n\n#endif //PCH_H\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/sherpa-onnx-deps.props",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ImportGroup Label=\"PropertySheets\" />\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup>\n    <SherpaOnnxBuildDirectory>..\\..\\build</SherpaOnnxBuildDirectory>\n    <SherpaOnnxInstallDirectory>..\\..\\build\\install</SherpaOnnxInstallDirectory>\n    <SherpaOnnxLibraries>\n        sherpa-onnx-portaudio_static.lib;\n        sherpa-onnx-c-api.lib;\n        sherpa-onnx-core.lib;\n        kaldi-decoder-core.lib;\n        sherpa-onnx-kaldifst-core.lib;\n        sherpa-onnx-fstfar.lib;\n        sherpa-onnx-fst.lib;\n        kaldi-native-fbank-core.lib;\n        kissfft-float.lib;\n        onnxruntime.lib;\n        piper_phonemize.lib;\n        espeak-ng.lib;\n        ucd.lib;\n        ssentencepiece_core.lib;\n    </SherpaOnnxLibraries>\n  </PropertyGroup>\n  <ItemDefinitionGroup>\n    <ClCompile>\n      <AdditionalIncludeDirectories>\n\t  $(SherpaOnnxBuildDirectory)\\_deps\\portaudio-src\\include;\n    $(SherpaOnnxInstallDirectory)\\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ClCompile>\n    <Link>\n      <AdditionalLibraryDirectories>$(SherpaOnnxInstallDirectory)\\lib;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>\n      <AdditionalDependencies>$(SherpaOnnxLibraries);</AdditionalDependencies>\n    </Link>\n  </ItemDefinitionGroup>\n  <ItemGroup />\n</Project>\n"
  },
  {
    "path": "mfc-examples/NonStreamingTextToSpeech/targetver.h",
    "content": "#pragma once\n\n// Including SDKDDKVer.h defines the highest available Windows platform.\n\n// If you wish to build your application for a previous Windows platform, include WinSDKVer.h and\n// set the _WIN32_WINNT macro to the platform you wish to support before including SDKDDKVer.h.\n\n#include <SDKDDKVer.h>\n"
  },
  {
    "path": "mfc-examples/README.md",
    "content": "# Speech recognition with Visual C++ MFC\n\nThis directory contains examples showing how to use Next-gen Kaldi in MFC\nfor speech recognition.\n\n|Directory| Pre-built exe (x64)|Pre-built exe (x86)| Description|\n|---------|--------------------|-------------------|------------|\n|[./NonStreamingSpeechRecognition](./NonStreamingSpeechRecognition)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-non-streaming-asr-x64-v1.12.31.exe)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-non-streaming-asr-x86-v1.12.31.exe)| Non-streaming speech recognition|\n|[./StreamingSpeechRecognition](./StreamingSpeechRecognition)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-streaming-asr-x64-v1.12.31.exe)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-streaming-asr-x86-v1.12.31.exe)| Streaming speech recognition|\n|[./NonStreamingTextToSpeech](./NonStreamingTextToSpeech)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-non-streaming-tts-x64-v1.12.31.exe)|[URL](https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-non-streaming-tts-x86-v1.12.31.exe)| Non-streaming text to speech|\n\nCaution: You need to use Windows and install Visual Studio 2022 in order to\ncompile it.\n\nHint: If you don't want to install Visual Studio, you can find below\nabout how to download pre-compiled `exe`.\n\nWe use bash script below to demonstrate how to use it. Please change\nthe commands accordingly for Windows.\n\n## How to compile\n\n\nFirst, we need to compile sherpa-onnx:\n\n```bash\nmkdir -p $HOME/open-source\ncd $HOME/open-source\n\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\nmkdir build\ncd build\n\ncmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=./install ..\ncmake --build . 
--config Release --target install\ncd ../mfc-examples\n\nmsbuild ./mfc-examples.sln /property:Configuration=Release /property:Platform=x64\n\n# now run the programs\n\n./x64/Release/StreamingSpeechRecognition.exe\n./x64/Release/NonStreamingSpeechRecognition.exe\n```\n\nIf you don't want to compile the project yourself, you can download a\npre-compiled `exe` from https://github.com/k2-fsa/sherpa-onnx/releases\n\nFor instance, you can use the following addresses:\n\n  - https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.5.1/sherpa-onnx-streaming-v1.5.1.exe\n  - https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.5.1/sherpa-onnx-non-streaming-v1.5.1.exe\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/Resource.h",
    "content": "//{{NO_DEPENDENCIES}}\n// Microsoft Visual C++ generated include file.\n// Used by StreamingSpeechRecognition.rc\n//\n#define IDD_STREAMINGSPEECHRECOGNITION_DIALOG 102\n#define IDR_MAINFRAME 128\n#define IDC_EDIT1 1000\n\n// Next default values for new objects\n//\n#ifdef APSTUDIO_INVOKED\n#ifndef APSTUDIO_READONLY_SYMBOLS\n#define _APS_NEXT_RESOURCE_VALUE 130\n#define _APS_NEXT_COMMAND_VALUE 32771\n#define _APS_NEXT_CONTROL_VALUE 1001\n#define _APS_NEXT_SYMED_VALUE 101\n#endif\n#endif\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognition.cpp",
    "content": "\n// StreamingSpeechRecognition.cpp : Defines the class behaviors for the\n// application.\n//\n\n// clang-format off\n#include \"pch.h\"\n#include \"framework.h\"\n// clang-format on\n\n#include \"StreamingSpeechRecognition.h\"\n\n#include \"StreamingSpeechRecognitionDlg.h\"\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\n// CStreamingSpeechRecognitionApp\n\nBEGIN_MESSAGE_MAP(CStreamingSpeechRecognitionApp, CWinApp)\nON_COMMAND(ID_HELP, &CWinApp::OnHelp)\nEND_MESSAGE_MAP()\n\n// CStreamingSpeechRecognitionApp construction\n\nCStreamingSpeechRecognitionApp::CStreamingSpeechRecognitionApp() {\n  // TODO: add construction code here,\n  // Place all significant initialization in InitInstance\n}\n\n// The one and only CStreamingSpeechRecognitionApp object\n\nCStreamingSpeechRecognitionApp theApp;\n\n// CStreamingSpeechRecognitionApp initialization\n\nBOOL CStreamingSpeechRecognitionApp::InitInstance() {\n  CWinApp::InitInstance();\n\n  // Create the shell manager, in case the dialog contains\n  // any shell tree view or shell list view controls.\n  CShellManager *pShellManager = new CShellManager;\n\n  // Activate \"Windows Native\" visual manager for enabling themes in MFC\n  // controls\n  CMFCVisualManager::SetDefaultManager(RUNTIME_CLASS(CMFCVisualManagerWindows));\n\n  // Standard initialization\n  // If you are not using these features and wish to reduce the size\n  // of your final executable, you should remove from the following\n  // the specific initialization routines you do not need\n  // Change the registry key under which our settings are stored\n  // TODO: You should modify this string to be something appropriate\n  // such as the name of your company or organization\n  SetRegistryKey(_T(\"Local AppWizard-Generated Applications\"));\n\n  CStreamingSpeechRecognitionDlg dlg;\n  m_pMainWnd = &dlg;\n  INT_PTR nResponse = dlg.DoModal();\n  if (nResponse == IDOK) {\n    // TODO: Place code here to handle when the dialog is\n    //  dismissed 
with OK\n  } else if (nResponse == IDCANCEL) {\n    // TODO: Place code here to handle when the dialog is\n    //  dismissed with Cancel\n  } else if (nResponse == -1) {\n    TRACE(traceAppMsg, 0,\n          \"Warning: dialog creation failed, so application is terminating \"\n          \"unexpectedly.\\n\");\n    TRACE(traceAppMsg, 0,\n          \"Warning: if you are using MFC controls on the dialog, you cannot \"\n          \"#define _AFX_NO_MFC_CONTROLS_IN_DIALOGS.\\n\");\n  }\n\n  // Delete the shell manager created above.\n  if (pShellManager != nullptr) {\n    delete pShellManager;\n  }\n\n#if !defined(_AFXDLL) && !defined(_AFX_NO_MFC_CONTROLS_IN_DIALOGS)\n  ControlBarCleanUp();\n#endif\n\n  // Since the dialog has been closed, return FALSE so that we exit the\n  //  application, rather than start the application's message pump.\n  return FALSE;\n}\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognition.h",
    "content": "\n// StreamingSpeechRecognition.h : main header file for the PROJECT_NAME\n// application\n//\n\n#pragma once\n\n#ifndef __AFXWIN_H__\n#error \"include 'pch.h' before including this file for PCH\"\n#endif\n\n#include \"resource.h\"  // main symbols\n\n// CStreamingSpeechRecognitionApp:\n// See StreamingSpeechRecognition.cpp for the implementation of this class\n//\n\nclass CStreamingSpeechRecognitionApp : public CWinApp {\n public:\n  CStreamingSpeechRecognitionApp();\n\n  // Overrides\n public:\n  virtual BOOL InitInstance();\n\n  // Implementation\n\n  DECLARE_MESSAGE_MAP()\n};\n\nextern CStreamingSpeechRecognitionApp theApp;\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognition.vcxproj",
    "content": "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project DefaultTargets=\"Build\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup Label=\"ProjectConfigurations\">\n    <ProjectConfiguration Include=\"Debug|Win32\">\n      <Configuration>Debug</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|Win32\">\n      <Configuration>Release</Configuration>\n      <Platform>Win32</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Debug|x64\">\n      <Configuration>Debug</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n    <ProjectConfiguration Include=\"Release|x64\">\n      <Configuration>Release</Configuration>\n      <Platform>x64</Platform>\n    </ProjectConfiguration>\n  </ItemGroup>\n  <PropertyGroup Label=\"Globals\">\n    <VCProjectVersion>16.0</VCProjectVersion>\n    <ProjectGuid>{A79C2604-C33D-497C-9770-D34E118B77FE}</ProjectGuid>\n    <Keyword>MFCProj</Keyword>\n    <RootNamespace>StreamingSpeechRecognition</RootNamespace>\n    <WindowsTargetPlatformVersion>10.0</WindowsTargetPlatformVersion>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.Default.props\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v142</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    
<UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>true</UseDebugLibraries>\n    <PlatformToolset>v142</PlatformToolset>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\" Label=\"Configuration\">\n    <ConfigurationType>Application</ConfigurationType>\n    <UseDebugLibraries>false</UseDebugLibraries>\n    <PlatformToolset>v143</PlatformToolset>\n    <WholeProgramOptimization>true</WholeProgramOptimization>\n    <CharacterSet>Unicode</CharacterSet>\n    <UseOfMfc>Static</UseOfMfc>\n  </PropertyGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.props\" />\n  <ImportGroup Label=\"ExtensionSettings\">\n  </ImportGroup>\n  <ImportGroup Label=\"Shared\">\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n   
 <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <ImportGroup Label=\"PropertySheets\" Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <Import Project=\"$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props\" Condition=\"exists('$(UserRootDir)\\Microsoft.Cpp.$(Platform).user.props')\" Label=\"LocalAppDataPlatform\" />\n    <Import Project=\"sherpa-onnx-deps.props\" />\n  </ImportGroup>\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <LinkIncremental>true</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <PropertyGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <LinkIncremental>false</LinkIncremental>\n  </PropertyGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      
<AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>WIN32;_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      
<PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemDefinitionGroup Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">\n    <ClCompile>\n      <PrecompiledHeader>Use</PrecompiledHeader>\n      <WarningLevel>Level3</WarningLevel>\n      <FunctionLevelLinking>true</FunctionLevelLinking>\n      <IntrinsicFunctions>true</IntrinsicFunctions>\n      <SDLCheck>true</SDLCheck>\n      <PreprocessorDefinitions>_WINDOWS;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <PrecompiledHeaderFile>pch.h</PrecompiledHeaderFile>\n    </ClCompile>\n    <Link>\n      <SubSystem>Windows</SubSystem>\n      <EnableCOMDATFolding>true</EnableCOMDATFolding>\n      <OptimizeReferences>true</OptimizeReferences>\n    </Link>\n    <Midl>\n      <MkTypLibCompatible>false</MkTypLibCompatible>\n      <ValidateAllParameters>true</ValidateAllParameters>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n    </Midl>\n    <ResourceCompile>\n      <Culture>0x0409</Culture>\n      <PreprocessorDefinitions>NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>\n      <AdditionalIncludeDirectories>$(IntDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ResourceCompile>\n  </ItemDefinitionGroup>\n  <ItemGroup>\n    <ClInclude Include=\"framework.h\" />\n    <ClInclude Include=\"pch.h\" />\n    <ClInclude Include=\"Resource.h\" />\n    <ClInclude Include=\"StreamingSpeechRecognition.h\" />\n    <ClInclude Include=\"StreamingSpeechRecognitionDlg.h\" />\n    <ClInclude Include=\"targetver.h\" />\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile 
Include=\"pch.cpp\">\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|Win32'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Debug|x64'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|Win32'\">Create</PrecompiledHeader>\n      <PrecompiledHeader Condition=\"'$(Configuration)|$(Platform)'=='Release|x64'\">Create</PrecompiledHeader>\n    </ClCompile>\n    <ClCompile Include=\"StreamingSpeechRecognition.cpp\" />\n    <ClCompile Include=\"StreamingSpeechRecognitionDlg.cpp\" />\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"StreamingSpeechRecognition.rc\" />\n  </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\StreamingSpeechRecognition.rc2\" />\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\StreamingSpeechRecognition.ico\" />\n  </ItemGroup>\n  <Import Project=\"$(VCTargetsPath)\\Microsoft.Cpp.targets\" />\n  <ImportGroup Label=\"ExtensionTargets\">\n  </ImportGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognition.vcxproj.filters",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ItemGroup>\n    <Filter Include=\"Source Files\">\n      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>\n      <Extensions>cpp;c;cc;cxx;c++;cppm;ixx;def;odl;idl;hpj;bat;asm;asmx</Extensions>\n    </Filter>\n    <Filter Include=\"Header Files\">\n      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>\n      <Extensions>h;hh;hpp;hxx;h++;hm;inl;inc;ipp;xsd</Extensions>\n    </Filter>\n    <Filter Include=\"Resource Files\">\n      <UniqueIdentifier>{67DA6AB6-F800-4c08-8B7A-83BB121AAD01}</UniqueIdentifier>\n      <Extensions>rc;ico;cur;bmp;dlg;rc2;rct;bin;rgs;gif;jpg;jpeg;jpe;resx;tiff;tif;png;wav;mfcribbon-ms</Extensions>\n    </Filter>\n  </ItemGroup>\n  <ItemGroup>\n    <ClInclude Include=\"StreamingSpeechRecognition.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"StreamingSpeechRecognitionDlg.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"framework.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"targetver.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"Resource.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n    <ClInclude Include=\"pch.h\">\n      <Filter>Header Files</Filter>\n    </ClInclude>\n  </ItemGroup>\n  <ItemGroup>\n    <ClCompile Include=\"StreamingSpeechRecognition.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"StreamingSpeechRecognitionDlg.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n    <ClCompile Include=\"pch.cpp\">\n      <Filter>Source Files</Filter>\n    </ClCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <ResourceCompile Include=\"StreamingSpeechRecognition.rc\">\n      <Filter>Resource Files</Filter>\n    
</ResourceCompile>\n  </ItemGroup>\n  <ItemGroup>\n    <None Include=\"res\\StreamingSpeechRecognition.rc2\">\n      <Filter>Resource Files</Filter>\n    </None>\n  </ItemGroup>\n  <ItemGroup>\n    <Image Include=\"res\\StreamingSpeechRecognition.ico\">\n      <Filter>Resource Files</Filter>\n    </Image>\n  </ItemGroup>\n</Project>"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognitionDlg.cpp",
    "content": "\n// StreamingSpeechRecognitionDlg.cpp : implementation file\n//\n// clang-format off\n#include \"pch.h\"\n#include \"framework.h\"\n#include \"afxdialogex.h\"\n// clang-format on\n\n#include \"StreamingSpeechRecognitionDlg.h\"\n\n#include <fstream>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"StreamingSpeechRecognition.h\"\n\n#ifdef _DEBUG\n#define new DEBUG_NEW\n#endif\n\nMicrophone::Microphone() {\n  PaError err = Pa_Initialize();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\nMicrophone::~Microphone() {\n  PaError err = Pa_Terminate();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-2);\n  }\n}\n\n// CStreamingSpeechRecognitionDlg dialog\n\nCStreamingSpeechRecognitionDlg::CStreamingSpeechRecognitionDlg(\n    CWnd *pParent /*=nullptr*/)\n    : CDialogEx(IDD_STREAMINGSPEECHRECOGNITION_DIALOG, pParent) {\n  m_hIcon = AfxGetApp()->LoadIcon(IDR_MAINFRAME);\n}\n\nCStreamingSpeechRecognitionDlg::~CStreamingSpeechRecognitionDlg() {\n  if (recognizer_) {\n    SherpaOnnxDestroyOnlineRecognizer(recognizer_);\n    recognizer_ = nullptr;\n  }\n}\n\nvoid CStreamingSpeechRecognitionDlg::DoDataExchange(CDataExchange *pDX) {\n  CDialogEx::DoDataExchange(pDX);\n  DDX_Control(pDX, IDOK, my_btn_);\n  DDX_Control(pDX, IDC_EDIT1, my_text_);\n}\n\nBEGIN_MESSAGE_MAP(CStreamingSpeechRecognitionDlg, CDialogEx)\nON_WM_PAINT()\nON_WM_QUERYDRAGICON()\nON_BN_CLICKED(IDOK, &CStreamingSpeechRecognitionDlg::OnBnClickedOk)\nEND_MESSAGE_MAP()\n\n// CStreamingSpeechRecognitionDlg message handlers\n\nBOOL CStreamingSpeechRecognitionDlg::OnInitDialog() {\n  CDialogEx::OnInitDialog();\n\n  // Set the icon for this dialog.  
The framework does this automatically\n  //  when the application's main window is not a dialog\n  SetIcon(m_hIcon, TRUE);   // Set big icon\n  SetIcon(m_hIcon, FALSE);  // Set small icon\n\n  // TODO: Add extra initialization here\n  SetWindowText(_T(\"Real-time speech recognition with Next-gen Kaldi\"));\n  InitMicrophone();\n\n  return TRUE;  // return TRUE  unless you set the focus to a control\n}\n\n// If you add a minimize button to your dialog, you will need the code below\n//  to draw the icon.  For MFC applications using the document/view model,\n//  this is automatically done for you by the framework.\n\nvoid CStreamingSpeechRecognitionDlg::OnPaint() {\n  if (IsIconic()) {\n    CPaintDC dc(this);  // device context for painting\n\n    SendMessage(WM_ICONERASEBKGND, reinterpret_cast<WPARAM>(dc.GetSafeHdc()),\n                0);\n\n    // Center icon in client rectangle\n    int cxIcon = GetSystemMetrics(SM_CXICON);\n    int cyIcon = GetSystemMetrics(SM_CYICON);\n    CRect rect;\n    GetClientRect(&rect);\n    int x = (rect.Width() - cxIcon + 1) / 2;\n    int y = (rect.Height() - cyIcon + 1) / 2;\n\n    // Draw the icon\n    dc.DrawIcon(x, y, m_hIcon);\n  } else {\n    CDialogEx::OnPaint();\n  }\n}\n\n// The system calls this function to obtain the cursor to display while the\n// user drags the minimized window.\nHCURSOR CStreamingSpeechRecognitionDlg::OnQueryDragIcon() {\n  return static_cast<HCURSOR>(m_hIcon);\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void *user_data) {\n  auto dlg = reinterpret_cast<CStreamingSpeechRecognitionDlg *>(user_data);\n\n  auto stream = dlg->stream_;\n  if (stream) {\n    
SherpaOnnxOnlineStreamAcceptWaveform(stream, 16000, reinterpret_cast<const float *>(input_buffer),\n                   frames_per_buffer);\n  }\n\n  return dlg->started_ ? paContinue : paComplete;\n}\n\nvoid CStreamingSpeechRecognitionDlg::OnBnClickedOk() {\n  if (!recognizer_) {\n    AppendLineToMultilineEditCtrl(\"Creating recognizer...\");\n    AppendLineToMultilineEditCtrl(\"It will take several seconds. Please wait\");\n    InitRecognizer();\n    if (!recognizer_) {\n      // failed to create the recognizer\n      return;\n    }\n    AppendLineToMultilineEditCtrl(\"Recognizer created!\");\n  }\n\n  if (!started_) {\n    started_ = true;\n\n    if (stream_) {\n      SherpaOnnxDestroyOnlineStream(stream_);\n      stream_ = nullptr;\n    }\n\n    stream_ = SherpaOnnxCreateOnlineStream(recognizer_);\n\n    PaStreamParameters param;\n    param.device = Pa_GetDefaultInputDevice();\n    const PaDeviceInfo *info = Pa_GetDeviceInfo(param.device);\n    param.channelCount = 1;\n    param.sampleFormat = paFloat32;\n    param.suggestedLatency = info->defaultLowInputLatency;\n    param.hostApiSpecificStreamInfo = nullptr;\n    float sample_rate = 16000;\n    pa_stream_ = nullptr;\n    PaError err =\n        Pa_OpenStream(&pa_stream_, &param, nullptr, /* &outputParameters, */\n                      sample_rate,\n                      0,          // frames per buffer\n                      paClipOff,  // we won't output out of range samples\n                                  // so don't bother clipping them\n                      RecordCallback, this);\n    if (err != paNoError) {\n      AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                    Pa_GetErrorText(err));\n      my_btn_.EnableWindow(FALSE);\n      return;\n    }\n\n    err = Pa_StartStream(pa_stream_);\n    if (err != paNoError) {\n      AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                    Pa_GetErrorText(err));\n      
my_btn_.EnableWindow(FALSE);\n      return;\n    }\n    AppendLineToMultilineEditCtrl(\"Started! Please speak\");\n    my_btn_.SetWindowText(_T(\"Stop\"));\n\n    thread_ = new RecognizerThread(this);\n    thread_->CreateThread(CREATE_SUSPENDED);\n    thread_->m_bAutoDelete = false;  // Let me delete it.\n    thread_->ResumeThread();\n  } else {\n    started_ = false;\n    Pa_Sleep(200);  // sleep for 200ms\n    if (pa_stream_) {\n      PaError err = Pa_CloseStream(pa_stream_);\n      if (err != paNoError) {\n        AppendLineToMultilineEditCtrl(std::string(\"PortAudio error: \") +\n                                      Pa_GetErrorText(err));\n        my_btn_.EnableWindow(FALSE);\n        return;\n      }\n    }\n    pa_stream_ = nullptr;\n\n    WaitForSingleObject(thread_->m_hThread, INFINITE);\n    delete thread_;\n    thread_ = nullptr;\n\n    // AfxMessageBox(\"stopped\", MB_OK);\n    my_btn_.SetWindowText(_T(\"Start\"));\n    AppendLineToMultilineEditCtrl(\"Stopped\");\n  }\n}\n\nvoid CStreamingSpeechRecognitionDlg::InitMicrophone() {\n  int default_device = Pa_GetDefaultInputDevice();\n  int device_count = Pa_GetDeviceCount();\n  if (default_device == paNoDevice) {\n    // CString str;\n    // str.Format(_T(\"No default input device found!\"));\n    // AfxMessageBox(str, MB_OK | MB_ICONSTOP);\n    // exit(-1);\n    AppendLineToMultilineEditCtrl(\"No default input device found!\");\n    my_btn_.EnableWindow(FALSE);\n    return;\n  }\n  AppendLineToMultilineEditCtrl(std::string(\"Selected device \") +\n                                Pa_GetDeviceInfo(default_device)->name);\n}\n\nbool CStreamingSpeechRecognitionDlg::Exists(const std::string &filename) {\n  std::ifstream is(filename);\n  return is.good();\n}\n\nvoid CStreamingSpeechRecognitionDlg::ShowInitRecognizerHelpMessage() {\n    my_btn_.EnableWindow(FALSE);\n    std::string msg =\n        \"\\r\\nPlease go to\\r\\n\"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html \"\n        
\"\\r\\n\";\n    msg += \"to download a streaming model, i.e., an online model.\\r\\n\";\n    msg += \"You need to rename them after downloading\\r\\n\\r\\n\";\n    msg += \"It supports both transducer and paraformer models.\\r\\n\\r\\n\";\n    msg +=\n      \"We give two examples below to show you how to download models\\r\\n\\r\\n\";\n    msg += \"(1) Transducer\\r\\n\\r\\n\";\n    msg +=\n        \"https://huggingface.co/pkufool/\"\n        \"icefall-asr-zipformer-streaming-wenetspeech-20230615\";\n    msg += \"\\r\\n\\r\\n\";\n    msg +=\n        \"wget https:// \"\n        \"huggingface.co/pkufool/\"\n        \"icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/\"\n        \"encoder-epoch-12-avg-4-chunk-16-left-128.onnx\\r\\n\";\n    msg +=\n        \"wget https:// \"\n        \"huggingface.co/pkufool/\"\n        \"icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/\"\n        \"decoder-epoch-12-avg-4-chunk-16-left-128.onnx\\r\\n\";\n    msg +=\n        \"wget https:// \"\n        \"huggingface.co/pkufool/\"\n        \"icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/exp/\"\n        \"joiner-epoch-12-avg-4-chunk-16-left-128.onnx\\r\\n\";\n    msg +=\n        \"wget \"\n        \"https://huggingface.co/pkufool/\"\n        \"icefall-asr-zipformer-streaming-wenetspeech-20230615/resolve/main/\"\n        \"data/lang_char/tokens.txt\\r\\n\";\n\n    msg += \"\\r\\nNow rename them.\\r\\n\";\n    msg += \"mv encoder-epoch-12-avg-4-chunk-16-left-128.onnx encoder.onnx\\r\\n\";\n    msg += \"mv decoder-epoch-12-avg-4-chunk-16-left-128.onnx decoder.onnx\\r\\n\";\n    msg += \"mv joiner-epoch-12-avg-4-chunk-16-left-128.onnx joiner.onnx\\r\\n\";\n    msg += \"\\r\\n\";\n    msg += \"(2) Paraformer\\r\\n\\r\\n\";\n    msg +=\n        \"wget \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/resolve/main/\"\n        \"encoder.int8.onnx\\r\\n\";\n    msg +=\n       
 \"wget \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/resolve/main/\"\n        \"decoder.int8.onnx\\r\\n\";\n    msg +=\n        \"wget \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-streaming-paraformer-bilingual-zh-en/resolve/main/\"\n        \"tokens.txt\\r\\n\";\n    msg += \"\\r\\nNow rename them.\\r\\n\";\n    msg += \"mv encoder.int8.onnx paraformer-encoder.onnx\\r\\n\";\n    msg += \"mv decoder.int8.onnx paraformer-decoder.onnx\\r\\n\\r\\n\";\n    msg += \"That's it!\\r\\n\";\n\n    AppendLineToMultilineEditCtrl(msg);\n}\n\nvoid CStreamingSpeechRecognitionDlg::InitParaformer() {\n  std::string paraformer_encoder = \"./paraformer-encoder.onnx\";\n  std::string paraformer_decoder = \"./paraformer-decoder.onnx\";\n\n  std::string tokens = \"./tokens.txt\";\n\n  bool is_ok = true;\n\n  if (Exists(\"./paraformer-encoder.int8.onnx\")) {\n    paraformer_encoder = \"./paraformer-encoder.int8.onnx\";\n  } else if (!Exists(paraformer_encoder)) {\n    std::string msg = paraformer_encoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (Exists(\"./paraformer-decoder.int8.onnx\")) {\n    paraformer_decoder = \"./paraformer-decoder.int8.onnx\";\n  } else if (!Exists(paraformer_decoder)) {\n    std::string msg = paraformer_decoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(tokens)) {\n    std::string msg = tokens + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!is_ok) {\n    ShowInitRecognizerHelpMessage();\n    return;\n  }\n\n  SherpaOnnxOnlineRecognizerConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model_config.debug = 0;\n  config.model_config.num_threads = 1;\n  config.model_config.provider = \"cpu\";\n\n  config.decoding_method = \"greedy_search\";\n  config.max_active_paths = 4;\n\n  
config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n\n  config.enable_endpoint = 1;\n  config.rule1_min_trailing_silence = 1.2f;\n  config.rule2_min_trailing_silence = 0.8f;\n  config.rule3_min_utterance_length = 300.0f;\n\n  config.model_config.tokens = tokens.c_str();\n  config.model_config.paraformer.encoder = paraformer_encoder.c_str();\n  config.model_config.paraformer.decoder = paraformer_decoder.c_str();\n\n  recognizer_ = SherpaOnnxCreateOnlineRecognizer(&config);\n}\n\nvoid CStreamingSpeechRecognitionDlg::InitRecognizer() {\n  if (Exists(\"./paraformer-encoder.onnx\") || Exists(\"./paraformer-encoder.int8.onnx\")) {\n    InitParaformer();\n    return;\n  }\n\n  std::string encoder = \"./encoder.onnx\";\n  std::string decoder = \"./decoder.onnx\";\n  std::string joiner = \"./joiner.onnx\";\n  std::string tokens = \"./tokens.txt\";\n\n  bool is_ok = true;\n  if (!Exists(encoder)) {\n    std::string msg = encoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(decoder)) {\n    std::string msg = decoder + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(joiner)) {\n    std::string msg = joiner + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!Exists(tokens)) {\n    std::string msg = tokens + \" does not exist!\";\n    AppendLineToMultilineEditCtrl(msg);\n    is_ok = false;\n  }\n\n  if (!is_ok) {\n    ShowInitRecognizerHelpMessage();\n    return;\n  }\n\n  SherpaOnnxOnlineRecognizerConfig config;\n  memset(&config, 0, sizeof(config));\n  config.model_config.debug = 0;\n  config.model_config.num_threads = 1;\n  config.model_config.provider = \"cpu\";\n\n  config.decoding_method = \"greedy_search\";\n  config.max_active_paths = 4;\n\n  config.feat_config.sample_rate = 16000;\n  config.feat_config.feature_dim = 80;\n\n  config.enable_endpoint = 1;\n  
config.rule1_min_trailing_silence = 1.2f;\n  config.rule2_min_trailing_silence = 0.8f;\n  config.rule3_min_utterance_length = 300.0f;\n\n  config.model_config.tokens = tokens.c_str();\n  config.model_config.transducer.encoder = encoder.c_str();\n  config.model_config.transducer.decoder = decoder.c_str();\n  config.model_config.transducer.joiner = joiner.c_str();\n\n  recognizer_ = SherpaOnnxCreateOnlineRecognizer(&config);\n}\n\n// see\n// https://stackoverflow.com/questions/7153935/how-to-convert-utf-8-stdstring-to-utf-16-stdwstring\nstatic std::wstring Utf8ToUtf16(const std::string &utf8) {\n  std::vector<unsigned long> unicode;\n  size_t i = 0;\n  while (i < utf8.size()) {\n    unsigned long uni;\n    size_t todo;\n    bool error = false;\n    unsigned char ch = utf8[i++];\n    if (ch <= 0x7F) {\n      uni = ch;\n      todo = 0;\n    } else if (ch <= 0xBF) {\n      throw std::logic_error(\"not a UTF-8 string\");\n    } else if (ch <= 0xDF) {\n      uni = ch & 0x1F;\n      todo = 1;\n    } else if (ch <= 0xEF) {\n      uni = ch & 0x0F;\n      todo = 2;\n    } else if (ch <= 0xF7) {\n      uni = ch & 0x07;\n      todo = 3;\n    } else {\n      throw std::logic_error(\"not a UTF-8 string\");\n    }\n    for (size_t j = 0; j < todo; ++j) {\n      if (i == utf8.size()) throw std::logic_error(\"not a UTF-8 string\");\n      unsigned char ch = utf8[i++];\n      if (ch < 0x80 || ch > 0xBF) throw std::logic_error(\"not a UTF-8 string\");\n      uni <<= 6;\n      uni += ch & 0x3F;\n    }\n    if (uni >= 0xD800 && uni <= 0xDFFF)\n      throw std::logic_error(\"not a UTF-8 string\");\n    if (uni > 0x10FFFF) throw std::logic_error(\"not a UTF-8 string\");\n    unicode.push_back(uni);\n  }\n  std::wstring utf16;\n  for (size_t i = 0; i < unicode.size(); ++i) {\n    unsigned long uni = unicode[i];\n    if (uni <= 0xFFFF) {\n      utf16 += (wchar_t)uni;\n    } else {\n      uni -= 0x10000;\n      utf16 += (wchar_t)((uni >> 10) + 0xD800);\n      utf16 += (wchar_t)((uni & 0x3FF) 
+ 0xDC00);\n    }\n  }\n  return utf16;\n}\n\nvoid CStreamingSpeechRecognitionDlg::AppendTextToEditCtrl(\n    const std::string &s) {\n  // get the initial text length\n  int nLength = my_text_.GetWindowTextLength();\n  // put the selection at the end of text\n  my_text_.SetSel(nLength, nLength);\n  // replace the selection\n\n  std::wstring wstr = Utf8ToUtf16(s);\n\n  // my_text_.ReplaceSel(wstr.c_str());\n  my_text_.ReplaceSel(wstr.c_str());\n}\n\nvoid CStreamingSpeechRecognitionDlg::AppendLineToMultilineEditCtrl(\n    const std::string &s) {\n  AppendTextToEditCtrl(\"\\r\\n\" + s);\n}\n\nstatic std::string Cat(const std::vector<std::string> &results,\n                       const std::string &s) {\n  std::ostringstream os;\n  std::string sep;\n\n  int i = 0;\n  for (i = 0; i != results.size(); ++i) {\n    os << sep << i << \": \" << results[i];\n    sep = \"\\r\\n\";\n  }\n\n  if (!s.empty()) {\n    os << sep << i << \": \" << s;\n  }\n  return os.str();\n}\n\nint CStreamingSpeechRecognitionDlg::RunThread() {\n  std::vector<std::string> results;\n\n  std::string last_text;\n  while (started_) {\n    while (SherpaOnnxIsOnlineStreamReady(recognizer_, stream_)) {\n      SherpaOnnxDecodeOnlineStream(recognizer_, stream_);\n    }\n\n    auto r = SherpaOnnxGetOnlineStreamResult(recognizer_, stream_);\n    std::string text = r->text;\n    SherpaOnnxDestroyOnlineRecognizerResult(r);\n    if (!text.empty() && last_text != text) {\n      // CString str;\n      // str.Format(_T(\"%s\"), Cat(results, text).c_str());\n      auto str = Utf8ToUtf16(Cat(results, text).c_str());\n      my_text_.SetWindowText(str.c_str());\n      my_text_.SetFocus();\n      my_text_.SetSel(-1);\n      last_text = text;\n    }\n    int is_endpoint = SherpaOnnxOnlineStreamIsEndpoint(recognizer_, stream_);\n    if (is_endpoint) {\n      SherpaOnnxOnlineStreamReset(recognizer_, stream_);\n      if (!text.empty()) {\n        results.push_back(std::move(text));\n      }\n    }\n\n    Pa_Sleep(100);  // 
sleep for 100ms\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/StreamingSpeechRecognitionDlg.h",
    "content": "\n// StreamingSpeechRecognitionDlg.h : header file\n//\n\n#pragma once\n\n#include <string>\n\n#include \"portaudio.h\"\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nclass Microphone {\n public:\n  Microphone();\n  ~Microphone();\n};\n\nclass RecognizerThread;\n\n// CStreamingSpeechRecognitionDlg dialog\nclass CStreamingSpeechRecognitionDlg : public CDialogEx {\n  // Construction\n public:\n  CStreamingSpeechRecognitionDlg(\n      CWnd *pParent = nullptr);  // standard constructor\n  ~CStreamingSpeechRecognitionDlg();\n\n// Dialog Data\n#ifdef AFX_DESIGN_TIME\n  enum { IDD = IDD_STREAMINGSPEECHRECOGNITION_DIALOG };\n#endif\n\n protected:\n  virtual void DoDataExchange(CDataExchange *pDX);  // DDX/DDV support\n\n  // Implementation\n protected:\n  HICON m_hIcon;\n\n  // Generated message map functions\n  virtual BOOL OnInitDialog();\n  afx_msg void OnPaint();\n  afx_msg HCURSOR OnQueryDragIcon();\n  DECLARE_MESSAGE_MAP()\n private:\n  Microphone mic_;\n\n  const SherpaOnnxOnlineRecognizer *recognizer_ = nullptr;\n\n  PaStream *pa_stream_ = nullptr;\n  RecognizerThread *thread_ = nullptr;\n  CButton my_btn_;\n  CEdit my_text_;\n\n public:\n  bool started_ = false;\n  const SherpaOnnxOnlineStream *stream_ = nullptr;\n\n public:\n  int RunThread();\n  afx_msg void OnBnClickedOk();\n\n private:\n  void AppendTextToEditCtrl(const std::string &s);\n  void AppendLineToMultilineEditCtrl(const std::string &s);\n  void InitMicrophone();\n\n  bool Exists(const std::string &filename);\n  void InitRecognizer();\n  void InitParaformer();\n  void ShowInitRecognizerHelpMessage();\n};\n\nclass RecognizerThread : public CWinThread {\n public:\n  RecognizerThread(CStreamingSpeechRecognitionDlg *dlg) : dlg_(dlg) {}\n  virtual BOOL InitInstance() { return TRUE; }\n  virtual int Run() { return dlg_->RunThread(); }\n\n private:\n  CStreamingSpeechRecognitionDlg *dlg_;\n};\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/framework.h",
    "content": "#pragma once\n\n#ifndef VC_EXTRALEAN\n#define VC_EXTRALEAN  // Exclude rarely-used stuff from Windows headers\n#endif\n\n#include \"targetver.h\"\n\n#define _ATL_CSTRING_EXPLICIT_CONSTRUCTORS  // some CString constructors will be\n                                            // explicit\n\n// turns off MFC's hiding of some common and often safely ignored warning\n// messages\n#define _AFX_ALL_WARNINGS\n\n#include <afxext.h>  // MFC extensions\n#include <afxwin.h>  // MFC core and standard components\n\n#ifndef _AFX_NO_OLE_SUPPORT\n#include <afxdtctl.h>  // MFC support for Internet Explorer 4 Common Controls\n#endif\n#ifndef _AFX_NO_AFXCMN_SUPPORT\n#include <afxcmn.h>  // MFC support for Windows Common Controls\n#endif               // _AFX_NO_AFXCMN_SUPPORT\n\n#include <afxcontrolbars.h>  // MFC support for ribbons and control bars\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/pch.cpp",
    "content": "// pch.cpp: source file corresponding to the pre-compiled header\n\n#include \"pch.h\"\n\n// When you are using pre-compiled headers, this source file is necessary for\n// compilation to succeed.\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/pch.h",
    "content": "// pch.h: This is a precompiled header file.\n// Files listed below are compiled only once, improving build performance for\n// future builds. This also affects IntelliSense performance, including code\n// completion and many code browsing features. However, files listed here are\n// ALL re-compiled if any one of them is updated between builds. Do not add\n// files here that you will be updating frequently as this negates the\n// performance advantage.\n\n#ifndef PCH_H\n#define PCH_H\n\n// add headers that you want to pre-compile here\n#include \"framework.h\"\n\n#endif  // PCH_H\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/sherpa-onnx-deps.props",
    "content": "﻿<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<Project ToolsVersion=\"4.0\" xmlns=\"http://schemas.microsoft.com/developer/msbuild/2003\">\n  <ImportGroup Label=\"PropertySheets\" />\n  <PropertyGroup Label=\"UserMacros\" />\n  <PropertyGroup>\n    <SherpaOnnxBuildDirectory>..\\..\\build</SherpaOnnxBuildDirectory>\n    <SherpaOnnxInstallDirectory>..\\..\\build\\install</SherpaOnnxInstallDirectory>\n    <SherpaOnnxLibraries>\n        sherpa-onnx-portaudio_static.lib;\n        sherpa-onnx-c-api.lib;\n        sherpa-onnx-core.lib;\n        kaldi-decoder-core.lib;\n        sherpa-onnx-kaldifst-core.lib;\n        sherpa-onnx-fstfar.lib;\n        sherpa-onnx-fst.lib;\n        kaldi-native-fbank-core.lib;\n        kissfft-float.lib;\n        onnxruntime.lib;\n        piper_phonemize.lib;\n        espeak-ng.lib;\n        ucd.lib;\n        ssentencepiece_core.lib;\n    </SherpaOnnxLibraries>\n  </PropertyGroup>\n  <ItemDefinitionGroup>\n    <ClCompile>\n      <AdditionalIncludeDirectories>\n\t  $(SherpaOnnxBuildDirectory)\\_deps\\portaudio-src\\include;\n    $(SherpaOnnxInstallDirectory)\\include;%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>\n    </ClCompile>\n    <Link>\n      <AdditionalLibraryDirectories>$(SherpaOnnxInstallDirectory)\\lib;%(AdditionalLibraryDirectories)</AdditionalLibraryDirectories>\n      <AdditionalDependencies>$(SherpaOnnxLibraries);</AdditionalDependencies>\n    </Link>\n  </ItemDefinitionGroup>\n  <ItemGroup />\n</Project>\n"
  },
  {
    "path": "mfc-examples/StreamingSpeechRecognition/targetver.h",
    "content": "#pragma once\n\n// Including SDKDDKVer.h defines the highest available Windows platform.\n\n// If you wish to build your application for a previous Windows platform,\n// include WinSDKVer.h and set the _WIN32_WINNT macro to the platform you wish\n// to support before including SDKDDKVer.h.\n\n#include <SDKDDKVer.h>\n"
  },
  {
    "path": "mfc-examples/mfc-examples.sln",
    "content": "﻿\nMicrosoft Visual Studio Solution File, Format Version 12.00\n# Visual Studio Version 17\nVisualStudioVersion = 17.6.33829.357\nMinimumVisualStudioVersion = 10.0.40219.1\nProject(\"{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}\") = \"StreamingSpeechRecognition\", \"StreamingSpeechRecognition\\StreamingSpeechRecognition.vcxproj\", \"{A79C2604-C33D-497C-9770-D34E118B77FE}\"\nEndProject\nProject(\"{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}\") = \"NonStreamingSpeechRecognition\", \"NonStreamingSpeechRecognition\\NonStreamingSpeechRecognition.vcxproj\", \"{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}\"\nEndProject\nProject(\"{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}\") = \"NonStreamingTextToSpeech\", \"NonStreamingTextToSpeech\\NonStreamingTextToSpeech.vcxproj\", \"{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}\"\nEndProject\nGlobal\n\tGlobalSection(SolutionConfigurationPlatforms) = preSolution\n\t\tDebug|x64 = Debug|x64\n\t\tDebug|x86 = Debug|x86\n\t\tRelease|x64 = Release|x64\n\t\tRelease|x86 = Release|x86\n\tEndGlobalSection\n\tGlobalSection(ProjectConfigurationPlatforms) = postSolution\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Debug|x64.ActiveCfg = Debug|x64\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Debug|x64.Build.0 = Debug|x64\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Debug|x86.ActiveCfg = Debug|Win32\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Debug|x86.Build.0 = Debug|Win32\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Release|x64.ActiveCfg = Release|x64\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Release|x64.Build.0 = Release|x64\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Release|x86.ActiveCfg = Release|Win32\n\t\t{A79C2604-C33D-497C-9770-D34E118B77FE}.Release|x86.Build.0 = Release|Win32\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Debug|x64.ActiveCfg = Debug|x64\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Debug|x64.Build.0 = Debug|x64\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Debug|x86.ActiveCfg = 
Debug|Win32\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Debug|x86.Build.0 = Debug|Win32\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Release|x64.ActiveCfg = Release|x64\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Release|x64.Build.0 = Release|x64\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Release|x86.ActiveCfg = Release|Win32\n\t\t{0298EE00-7AF2-4A66-9D5F-AA0D92AC871D}.Release|x86.Build.0 = Release|Win32\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Debug|x64.ActiveCfg = Debug|x64\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Debug|x64.Build.0 = Debug|x64\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Debug|x86.ActiveCfg = Debug|Win32\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Debug|x86.Build.0 = Debug|Win32\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Release|x64.ActiveCfg = Release|x64\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Release|x64.Build.0 = Release|x64\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Release|x86.ActiveCfg = Release|Win32\n\t\t{9A5F2CCC-1AAB-4F7F-A608-F0B512023405}.Release|x86.Build.0 = Release|Win32\n\tEndGlobalSection\n\tGlobalSection(SolutionProperties) = preSolution\n\t\tHideSolutionNode = FALSE\n\tEndGlobalSection\n\tGlobalSection(ExtensibilityGlobals) = postSolution\n\t\tSolutionGuid = {C0A85719-CF8C-4BCD-BDF6-7C57EE651CBB}\n\tEndGlobalSection\nEndGlobal\n"
  },
  {
    "path": "new-release.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nold_version_code=20260319\nnew_version_code=20260320\n\nold_version=\"1\\.12\\.30\"\nnew_version=\"1\\.12\\.31\"\n\nreplace_str=\"s/$old_version/$new_version/g\"\n\nsed -i.bak \"$replace_str\" ./CMakeLists.txt\n\nsed -i.bak \"$replace_str\" ./sherpa-onnx/csrc/version.cc\nsha1=$(git describe --match=NeVeRmAtCh --always --abbrev=8)\ndate=$(git log -1 --format=%ad --date=local)\n\nfind android -name \"build.gradle\" -type f -exec sed -i.bak \"s/versionName \\\"$old_version\\\"/versionName \\\"$new_version\\\"/g\" {} \\;\nfind android -name \"build.gradle.kts\" -type f -exec sed -i.bak \"s/versionName = \\\"$old_version\\\"/versionName = \\\"$new_version\\\"/g\" {} \\;\n\nfind android -name \"build.gradle\" -type f -exec sed -i.bak \"s/versionCode $old_version_code/versionCode $new_version_code/g\" {} \\;\nfind android -name \"build.gradle.kts\" -type f -exec sed -i.bak \"s/versionCode = $old_version_code/versionCode = $new_version_code/g\" {} \\;\n\nsed -i.bak \"s/  static const char \\*sha1.*/  static const char \\*sha1 = \\\"$sha1\\\";/g\" ./sherpa-onnx/csrc/version.cc\nsed -i.bak \"s/  static const char \\*date.*/  static const char \\*date = \\\"$date\\\";/g\" ./sherpa-onnx/csrc/version.cc\n\n\nfind scripts/wheel -name \"setup.py\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nsed -i.bak \"$replace_str\" ./setup.py\n\nsed -i.bak \"$replace_str\" ./build-ios-shared.sh\nsed -i.bak \"$replace_str\" ./pom.xml\nsed -i.bak \"$replace_str\" ./jitpack.yml\nsed -i.bak \"$replace_str\" ./android/SherpaOnnxAar/README.md\n\nsed -i.bak \"$replace_str\" ./rust-api-examples/Cargo.toml\nsed -i.bak \"$replace_str\" ./sherpa-onnx/rust/sherpa-onnx-sys/Cargo.toml\nsed -i.bak \"$replace_str\" ./sherpa-onnx/rust/sherpa-onnx/Cargo.toml\nsed -i.bak \"$replace_str\" ./rust-api-examples/README.md\n\nfind android -name build.gradle -type f -exec sed -i.bak \"s/sherpa-onnx:v$old_version/sherpa-onnx:v$new_version/g\" {} \\;\nfind android 
-name build.gradle.kts -type f -exec sed -i.bak \"s/sherpa-onnx:v$old_version/sherpa-onnx:v$new_version/g\" {} \\;\n\nfind flutter -name \"*.yaml\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind dart-api-examples -name \"*.yaml\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind flutter-examples -name \"*.yaml\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind flutter -name \"*.podspec\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind nodejs-addon-examples -name package.json -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind nodejs-examples -name package.json -type f -exec sed -i.bak \"$replace_str\" {} \\;\n\nfind harmony-os -name \"README.md\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind harmony-os -name oh-package.json5 -type f -exec sed -i.bak \"$replace_str\" {} \\;\nfind harmony-os -name BuildProfile.ets -type f -exec sed -i.bak \"$replace_str\" {} \\;\n\nfind mfc-examples -name \"README.md\" -type f -exec sed -i.bak \"$replace_str\" {} \\;\n\nfind . -name \"*.bak\" -exec rm {} \\;\n"
  },
  {
    "path": "nodejs-addon-examples/.gitignore",
    "content": "crash.log\n"
  },
  {
    "path": "nodejs-addon-examples/README.md",
    "content": "# Introduction\n\nNote: You need `Node >= 16`.\n\nThis repo contains examples for NodeJS.\nIt uses [node-addon-api](https://github.com/nodejs/node-addon-api) to wrap\n`sherpa-onnx` for NodeJS and it supports multiple threads.\n\nNote: [../nodejs-examples](../nodejs-examples) uses WebAssembly to wrap\n`sherpa-onnx` for NodeJS and it does not support multiple threads.\n\nBefore you continue, please first run\n\n```bash\nnpm install # or pnpm install\n\n# For macOS x64\n## With npm\nexport DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n## With pnpm\nexport DYLD_LIBRARY_PATH=$PWD/node_modules/.pnpm/sherpa-onnx-node@<REPLACE-THIS-WITH-THE-INSTALLED-VERSION>/node_modules/sherpa-onnx-darwin-x64:$DYLD_LIBRARY_PATH\n\n# For macOS arm64\n## With npm\nexport DYLD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n## With pnpm\nexport DYLD_LIBRARY_PATH=$PWD/node_modules/.pnpm/sherpa-onnx-node@<REPLACE-THIS-WITH-THE-INSTALLED-VERSION>/node_modules/sherpa-onnx-darwin-arm64:$DYLD_LIBRARY_PATH\n\n# For Linux x64\n## With npm\nexport LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-x64:$LD_LIBRARY_PATH\n## With pnpm\nexport LD_LIBRARY_PATH=$PWD/node_modules/.pnpm/sherpa-onnx-node@<REPLACE-THIS-WITH-THE-INSTALLED-VERSION>/node_modules/sherpa-onnx-linux-x64:$LD_LIBRARY_PATH\n\n# For Linux arm64, e.g., Raspberry Pi 4\n## With npm\nexport LD_LIBRARY_PATH=$PWD/node_modules/sherpa-onnx-linux-arm64:$LD_LIBRARY_PATH\n## With pnpm\nexport LD_LIBRARY_PATH=$PWD/node_modules/.pnpm/sherpa-onnx-node@<REPLACE-THIS-WITH-THE-INSTALLED-VERSION>/node_modules/sherpa-onnx-linux-arm64:$LD_LIBRARY_PATH\n```\n\n# Examples\n\nThe following tables list the examples in this folder.\n\n## Speech enhancement/denoising\n\n|File| Description|\n|---|---|\n|[./test_offline_speech_enhancement_gtcrn.js](./test_offline_speech_enhancement_gtcrn.js)| It demonstrates how to use sherpa-onnx JavaScript API for speech enhancement with 
GTCRN.|\n|[./test_offline_speech_enhancement_dpdfnet.js](./test_offline_speech_enhancement_dpdfnet.js)| It demonstrates how to use sherpa-onnx JavaScript API for speech enhancement with DPDFNet.|\n|[./test_online_speech_enhancement_gtcrn.js](./test_online_speech_enhancement_gtcrn.js)| It demonstrates how to use sherpa-onnx JavaScript API for online speech enhancement with GTCRN.|\n|[./test_online_speech_enhancement_dpdfnet.js](./test_online_speech_enhancement_dpdfnet.js)| It demonstrates how to use sherpa-onnx JavaScript API for online speech enhancement with DPDFNet.|\n\n## Speaker diarization\n\n|File| Description|\n|---|---|\n|[./test_offline_speaker_diarization.js](./test_offline_speaker_diarization.js)| It demonstrates how to use sherpa-onnx JavaScript API for speaker diarization. It supports speaker segmentation models from [pyannote-audio](https://github.com/pyannote/pyannote-audio)|\n\n## Add punctuations to text\n\n|File| Description|\n|---|---|\n|[./test_offline_punctuation.js](./test_offline_punctuation.js)| Add punctuations to input text using [CT transformer](https://modelscope.cn/models/iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch/summary). It supports both Chinese and English.|\n|[./test_online_punctuation.js](./test_online_punctuation.js)| Add punctuations to input text using an online/streaming punctuation model.|\n\n## Voice activity detection (VAD)\n\n|File| Description|\n|---|---|\n|[./test_vad_microphone.js](./test_vad_microphone.js)| VAD with a microphone. 
It uses [silero-vad](https://github.com/snakers4/silero-vad)|\n\n## Speaker identification\n\n|File| Description|\n|---|---|\n|[ ./test_speaker_identification.js]( ./test_speaker_identification.js)| Speaker identification from a file|\n\n## Spoken language identification\n\n|File| Description|\n|---|---|\n|[./test_spoken_language_identification.js](./test_spoken_language_identification.js)|Spoken language identification from a file using a multi-lingual [Whisper](https://github.com/openai/whisper) model|\n|[./test_vad_spoken_language_identification_microphone.js](./test_vad_spoken_language_identification_microphone.js)|Spoken language identification from a microphone using a multi-lingual [Whisper](https://github.com/openai/whisper) model|\n\n## Audio tagging\n\n|File| Description|\n|---|---|\n|[./test_audio_tagging_zipformer.js](./test_audio_tagging_zipformer.js)| Audio tagging with a Zipformer model|\n|[./test_audio_tagging_ced.js](./test_audio_tagging_ced.js)| Audio tagging with a [CED](https://github.com/RicherMans/CED) model|\n\n## Keyword spotting\n\n|File| Description|\n|---|---|\n|[./test_keyword_spotter_transducer.js](./test_keyword_spotter_transducer.js)| Keyword spotting from a file using a Zipformer model|\n|[./test_keyword_spotter_transducer_microphone.js](./test_keyword_spotter_transducer_microphone.js)| Keyword spotting from a microphone using a Zipformer model|\n\n## Streaming speech-to-text from files\n\n|File| Description|\n|---|---|\n|[./test_asr_streaming_t_one_ctc.js](./test_asr_streaming_t_one_ctc.js)| Streaming speech recognition from a file using a T-one CTC model|\n|[./test_asr_streaming_transducer.js](./test_asr_streaming_transducer.js)| Streaming speech recognition from a file using a Zipformer transducer model|\n|[./test_asr_streaming_transducer_itn.js](./test_asr_streaming_transducer_itn.js)| Streaming speech recognition from a file using a Zipformer transducer model with 
ITN|\n|[./test_asr_streaming_transducer_with_hr.js](./test_asr_streaming_transducer_with_hr.js)| Streaming speech recognition from a file using a Zipformer transducer model with homophone replacer|\n|[./test_asr_streaming_ctc.js](./test_asr_streaming_ctc.js)| Streaming speech recognition from a file using a Zipformer CTC model with greedy search|\n|[./test_asr_streaming_ctc_hlg.js](./test_asr_streaming_ctc_hlg.js)| Streaming speech recognition from a file using a Zipformer CTC model with HLG decoding|\n|[./test_asr_streaming_paraformer.js](./test_asr_streaming_paraformer.js)|Streaming speech recognition from a file using a [Paraformer](https://github.com/alibaba-damo-academy/FunASR) model|\n\n## Streaming speech-to-text from a microphone\n\n|File| Description|\n|---|---|\n|[./test_asr_streaming_transducer_microphone.js](./test_asr_streaming_transducer_microphone.js)| Streaming speech recognition from a microphone using a Zipformer transducer model|\n|[./test_asr_streaming_transducer_microphone_itn.js](./test_asr_streaming_transducer_microphone_itn.js)| Streaming speech recognition from a microphone using a Zipformer transducer model with ITN|\n|[./test_asr_streaming_ctc_microphone.js](./test_asr_streaming_ctc_microphone.js)| Streaming speech recognition from a microphone using a Zipformer CTC model with greedy search|\n|[./test_asr_streaming_ctc_hlg_microphone.js](./test_asr_streaming_ctc_hlg_microphone.js)|Streaming speech recognition from a microphone using a Zipformer CTC model with HLG decoding|\n|[./test_asr_streaming_paraformer_microphone.js](./test_asr_streaming_paraformer_microphone.js)| Streaming speech recognition from a microphone using a [Paraformer](https://github.com/alibaba-damo-academy/FunASR) model|\n\n## Non-Streaming speech-to-text from files\n\n|File| Description|\n|---|---|\n|[./test_asr_non_streaming_transducer.js](./test_asr_non_streaming_transducer.js)|Non-streaming speech recognition from a file with a Zipformer transducer 
model|\n|[./test_asr_non_streaming_fire_red_asr.js](./test_asr_non_streaming_fire_red_asr.js)| Non-streaming speech recognition from a file using [FireRedAsr](https://github.com/FireRedTeam/FireRedASR)|\n|[./test_asr_non_streaming_fire_red_asr_ctc.js](./test_asr_non_streaming_fire_red_asr_ctc.js)| Non-streaming speech recognition from a file using [FireRedAsr](https://github.com/FireRedTeam/FireRedASR) CTC model|\n|[./test_asr_non_streaming_fire_red_asr_ctc_async.js](./test_asr_non_streaming_fire_red_asr_ctc_async.js)| Async non-streaming speech recognition from a file using [FireRedAsr](https://github.com/FireRedTeam/FireRedASR) CTC model|\n|[./test_asr_non_streaming_whisper.js](./test_asr_non_streaming_whisper.js)| Non-streaming speech recognition from a file using [Whisper](https://github.com/openai/whisper)|\n|[./test_vad_with_non_streaming_asr_whisper.js](./test_vad_with_non_streaming_asr_whisper.js)| Non-streaming speech recognition from a file using [Whisper](https://github.com/openai/whisper) + [Silero VAD](https://github.com/snakers4/silero-vad)|\n|[./test_asr_non_streaming_moonshine.js](./test_asr_non_streaming_moonshine.js)|Non-streaming speech recognition from a file using [Moonshine](https://github.com/usefulsensors/moonshine)|\n|[./test_asr_non_streaming_moonshine_v2.js](./test_asr_non_streaming_moonshine_v2.js)|Non-streaming speech recognition from a file using [Moonshine](https://github.com/usefulsensors/moonshine) v2|\n|[./test_vad_with_non_streaming_asr_moonshine.js](./test_vad_with_non_streaming_asr_moonshine.js)| Non-streaming speech recognition from a file using [Moonshine](https://github.com/usefulsensors/moonshine) + [Silero VAD](https://github.com/snakers4/silero-vad)|\n|[./test_asr_non_streaming_nemo_ctc.js](./test_asr_non_streaming_nemo_ctc.js)|Non-streaming speech recognition from a file using a [NeMo](https://github.com/NVIDIA/NeMo) CTC model with greedy 
search|\n|[./test_asr_non_streaming_wenet_ctc.js](./test_asr_non_streaming_wenet_ctc.js)|Non-streaming speech recognition from a file using a [u2pp_conformer_yue](https://huggingface.co/ASLP-lab/WSYue-ASR/tree/main/u2pp_conformer_yue) CTC model with greedy search|\n|[./test_asr_non_streaming_omnilingual_asr_ctc.js](./test_asr_non_streaming_omnilingual_asr_ctc.js)|Non-streaming speech recognition from a file using a [Omnilingual-ASR](https://github.com/facebookresearch/omnilingual-asr) CTC model with greedy search|\n|[./test_asr_non_streaming_medasr_ctc.js](./test_asr_non_streaming_medasr_ctc.js)|Non-streaming speech recognition from a file using a [Google MedASR](https://github.com/google-health/medasr) CTC model with greedy search|\n|[./test_asr_non_streaming_funasr_nano.js](./test_asr_non_streaming_funasr_nano.js)|Non-streaming speech recognition from a file using a [FunASR Nano](https://modelscope.cn/models/FunAudioLLM/Fun-ASR-Nano-2512) model|\n|[./test_asr_non_streaming_funasr_nano_async.js](./test_asr_non_streaming_funasr_nano_async.js)|Async non-streaming speech recognition from multiple files using a [FunASR Nano](https://modelscope.cn/models/FunAudioLLM/Fun-ASR-Nano-2512) model|\n|[./test_asr_non_streaming_nemo_canary.js](./test_asr_non_streaming_nemo_canary.js)|Non-streaming speech recognition from a file using a [NeMo](https://github.com/NVIDIA/NeMo) [Canary](https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html#sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8-english-spanish-german-french) model|\n|[./test_asr_non_streaming_zipformer_ctc.js](./test_asr_non_streaming_zipformer_ctc.js)|Non-streaming speech recognition from a file using a Zipformer CTC model with greedy search|\n|[./test_asr_non_streaming_nemo_parakeet_tdt_v2.js](./test_asr_non_streaming_nemo_parakeet_tdt_v2.js)|Non-streaming speech recognition from a file using a [NeMo](https://github.com/NVIDIA/NeMo) 
[parakeet-tdt-0.6b-v2](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english) model with greedy search|\n|[./test_asr_non_streaming_dolphin_ctc.js](./test_asr_non_streaming_dolphin_ctc.js)|Non-streaming speech recognition from a file using a [Dolphin](https://github.com/DataoceanAI/Dolphin) CTC model with greedy search|\n|[./test_asr_non_streaming_paraformer.js](./test_asr_non_streaming_paraformer.js)|Non-streaming speech recognition from a file using [Paraformer](https://github.com/alibaba-damo-academy/FunASR)|\n|[./test_asr_non_streaming_paraformer_itn.js](./test_asr_non_streaming_paraformer_itn.js)|Non-streaming speech recognition from a file using [Paraformer](https://github.com/alibaba-damo-academy/FunASR) with ITN|\n|[./test_asr_non_streaming_sense_voice.js](./test_asr_non_streaming_sense_voice.js)|Non-streaming speech recognition from a file using [SenseVoice](https://github.com/FunAudioLLM/SenseVoice)|\n|[./test_asr_non_streaming_sense_voice_with_hr.js](./test_asr_non_streaming_sense_voice_with_hr.js)|Non-streaming speech recognition from a file using [SenseVoice](https://github.com/FunAudioLLM/SenseVoice) with homophone replacer|\n\n## Non-Streaming speech-to-text from a microphone with VAD\n\n|File| Description|\n|---|---|\n|[./test_vad_asr_non_streaming_transducer_microphone.js](./test_vad_asr_non_streaming_transducer_microphone.js)|VAD + Non-streaming speech recognition from a microphone using a Zipformer transducer model|\n|[./test_vad_asr_non_streaming_whisper_microphone.js](./test_vad_asr_non_streaming_whisper_microphone.js)|VAD + Non-streaming speech recognition from a microphone using [Whisper](https://github.com/openai/whisper)|\n|[./test_vad_asr_non_streaming_moonshine_microphone.js](./test_vad_asr_non_streaming_moonshine_microphone.js)|VAD + Non-streaming speech recognition from a microphone using 
[Moonshine](https://github.com/usefulsensors/moonshine)|\n|[./test_vad_asr_non_streaming_nemo_ctc_microphone.js](./test_vad_asr_non_streaming_nemo_ctc_microphone.js)|VAD + Non-streaming speech recognition from a microphone using a [NeMo](https://github.com/NVIDIA/NeMo) CTC model with greedy search|\n|[./test_vad_asr_non_streaming_zipformer_ctc_microphone.js](./test_vad_asr_non_streaming_zipformer_ctc_microphone.js)|VAD + Non-streaming speech recognition from a microphone using a Zipformer CTC model with greedy search|\n|[./test_vad_asr_non_streaming_paraformer_microphone.js](./test_vad_asr_non_streaming_paraformer_microphone.js)|VAD + Non-streaming speech recognition from a microphone using [Paraformer](https://github.com/alibaba-damo-academy/FunASR)|\n|[./test_vad_asr_non_streaming_sense_voice_microphone.js](./test_vad_asr_non_streaming_sense_voice_microphone.js)|VAD + Non-streaming speech recognition from a microphone using [SenseVoice](https://github.com/FunAudioLLM/SenseVoice)|\n\n## Text-to-speech\n\n|File| Description|\n|---|---|\n|[./test_tts_non_streaming_pocket_en.js](./test_tts_non_streaming_pocket_en.js)| Zero-shot text-to-speech with a PocketTTS English Model|\n|[./test_tts_non_streaming_pocket_en_async.js](./test_tts_non_streaming_pocket_en_async.js)| Zero-shot text-to-speech with a PocketTTS English Model using async JS API|\n|[./test_tts_non_streaming_pocket_en_play_async.js](./test_tts_non_streaming_pocket_en_play_async.js)| Zero-shot text-to-speech with a PocketTTS English Model using async JS API and live audio playback|\n|[./test_tts_non_streaming_zipvoice_zh_en.js](./test_tts_non_streaming_zipvoice_zh_en.js)| Zero-shot text-to-speech with a ZipVoice Chinese/English Model|\n|[./test_tts_non_streaming_zipvoice_zh_en_async.js](./test_tts_non_streaming_zipvoice_zh_en_async.js)| Zero-shot text-to-speech with a ZipVoice Chinese/English Model using async JS 
API|\n|[./test_tts_non_streaming_zipvoice_zh_en_play_async.js](./test_tts_non_streaming_zipvoice_zh_en_play_async.js)| Zero-shot text-to-speech with a ZipVoice Chinese/English Model using async JS API and live audio playback|\n|[./test_tts_non_streaming_kitten_en.js](./test_tts_non_streaming_kitten_en.js)| Text-to-speech with a KittenTTS English Model|\n|[./test_tts_non_streaming_kokoro_en.js](./test_tts_non_streaming_kokoro_en.js)| Text-to-speech with a Kokoro English Model|\n|[./test_tts_non_streaming_kokoro_zh_en.js](./test_tts_non_streaming_kokoro_zh_en.js)| Text-to-speech with a Kokoro Model supporting Chinese and English|\n|[./test_tts_non_streaming_matcha_icefall_en.js](./test_tts_non_streaming_matcha_icefall_en.js)| Text-to-speech with a [MatchaTTS English Model](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker)|\n|[./test_tts_non_streaming_matcha_icefall_zh.js](./test_tts_non_streaming_matcha_icefall_zh.js)| Text-to-speech with a [MatchaTTS Chinese Model](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker)|\n|[./test_tts_non_streaming_supertonic_en.js](./test_tts_non_streaming_supertonic_en.js)| Text-to-speech with a Supertonic English Model|\n|[./test_tts_non_streaming_supertonic_en_async.js](./test_tts_non_streaming_supertonic_en_async.js)| Text-to-speech with a Supertonic English Model using async JS API|\n|[./test_tts_non_streaming_supertonic_en_play_async.js](./test_tts_non_streaming_supertonic_en_play_async.js)| Text-to-speech with a Supertonic English Model using async JS API and live audio playback|\n|[./test_tts_non_streaming_vits_piper_en.js](./test_tts_non_streaming_vits_piper_en.js)| Text-to-speech with a [piper](https://github.com/rhasspy/piper) English model|\n|[./test_tts_non_streaming_vits_coqui_de.js](./test_tts_non_streaming_vits_coqui_de.js)| Text-to-speech with a 
[coqui](https://github.com/coqui-ai/TTS) German model|\n|[./test_tts_non_streaming_vits_zh_ll.js](./test_tts_non_streaming_vits_zh_ll.js)| Text-to-speech with a Chinese model using [cppjieba](https://github.com/yanyiwu/cppjieba)|\n|[./test_tts_non_streaming_vits_zh_aishell3.js](./test_tts_non_streaming_vits_zh_aishell3.js)| Text-to-speech with a Chinese TTS model|\n\n\n### Speaker diarization\n\n```bash\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nnode ./test_offline_speaker_diarization.js\n```\n\n### Speech enhancement/denoising\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\nnode ./test_offline_speech_enhancement_gtcrn.js\n```\n\n### Voice Activity detection (VAD)\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n\n# To run the test with a microphone, you need to install the package naudiodon2\nnpm install naudiodon2\n\nnode ./test_vad_microphone.js\n```\n\n### Audio tagging with zipformer\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\ntar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\nrm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n\nnode ./test_audio_tagging_zipformer.js\n```\n\n### Audio tagging with 
CED\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-09-14.tar.bz2\ntar xvf sherpa-onnx-ced-mini-audio-tagging-2024-09-14.tar.bz2\nrm sherpa-onnx-ced-mini-audio-tagging-2024-09-14.tar.bz2\n\nnode ./test_audio_tagging_ced.js\n```\n\n### Streaming speech recognition with Zipformer transducer with homophone replacer\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nnode ./test_asr_streaming_transducer_with_hr.js\n```\n\n### Streaming speech recognition with T-one CTC\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\nnode ./test_asr_streaming_t_one_ctc.js\n```\n\n### Streaming speech recognition with Zipformer transducer\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\nnode ./test_asr_streaming_transducer.js\n\n# To run the test with a microphone, you need to install the package naudiodon2\nnpm install 
naudiodon2\n\nnode ./test_asr_streaming_transducer_microphone.js\n```\n\n### Streaming speech recognition with Zipformer CTC\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\nnode ./test_asr_streaming_ctc.js\n\n# To decode with HLG.fst\nnode ./test_asr_streaming_ctc_hlg.js\n\n# To run the test with a microphone, you need to install the package naudiodon2\nnpm install naudiodon2\n\nnode ./test_asr_streaming_ctc_microphone.js\nnode ./test_asr_streaming_ctc_hlg_microphone.js\n```\n\n### Streaming speech recognition with Paraformer\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\nnode ./test_asr_streaming_paraformer.js\n\n# To run the test with a microphone, you need to install the package naudiodon2\nnpm install naudiodon2\n\nnode ./test_asr_streaming_paraformer_microphone.js\n```\n\n### Non-streaming speech recognition with Zipformer transducer\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\ntar xvf sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\nrm sherpa-onnx-zipformer-en-2023-04-01.tar.bz2\n\nnode ./test_asr_non_streaming_transducer.js\n\n# To run VAD + non-streaming ASR with transducer using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_transducer_microphone.js\n```\n\n### Non-streaming speech recognition with FireRedAsr\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\ntar xvf 
sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\nrm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\nnode ./test_asr_non_streaming_fire_red_asr.js\n```\n\n### Non-streaming speech recognition with FireRedAsr CTC\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\nnode ./test_asr_non_streaming_fire_red_asr_ctc.js\nnode ./test_asr_non_streaming_fire_red_asr_ctc_async.js\n```\n\n### Non-streaming speech recognition with Whisper\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\nnode ./test_asr_non_streaming_whisper.js\n\n# To run VAD + non-streaming ASR with Whisper using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_whisper_microphone.js\n```\n\n### Non-streaming speech recognition with Moonshine v2\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\nnode ./test_asr_non_streaming_moonshine_v2.js\n```\n\n### Non-streaming speech recognition with Moonshine\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nnode ./test_asr_non_streaming_moonshine.js\n\n# To run VAD + non-streaming ASR with Moonshine using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_moonshine_microphone.js\n```\n\n### Non-streaming speech recognition with 
Moonshine + VAD\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test_vad_with_non_streaming_asr_moonshine.js\n```\n\n### Non-streaming speech recognition with Whisper + VAD\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test_vad_with_non_streaming_asr_whisper.js\n```\n\n### Non-streaming speech recognition with Dolphin CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\ntar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nrm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n\nnode ./test_asr_non_streaming_dolphin_ctc.js\n```\n\n### Non-streaming speech recognition with NeMo parakeet-tdt-0.6b-v2 models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\nrm sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n\nnode ./test_asr_non_streaming_nemo_parakeet_tdt_v2.js\n```\n\n### Non-streaming speech recognition with Zipformer CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\ntar xvf 
sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nrm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\nnode ./test_asr_non_streaming_zipformer_ctc.js\n\n# To run VAD + non-streaming ASR with Zipformer CTC using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_zipformer_ctc_microphone.js\n```\n\n### Non-streaming speech recognition with NeMo Canary models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nrm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\nnode ./test_asr_non_streaming_nemo_canary.js\n```\n\n### Non-streaming speech recognition with NeMo CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\ntar xvf sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nrm sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n\nnode ./test_asr_non_streaming_nemo_ctc.js\n\n# To run VAD + non-streaming ASR with NeMo CTC using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_nemo_ctc_microphone.js\n```\n\n### Asynchronous non-streaming speech recognition with FunASR Nano models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\ntar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nrm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\nnode ./test_asr_non_streaming_funasr_nano_async.js\n```\n\n### Non-streaming speech recognition with FunASR Nano models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\ntar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nrm 
sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\nnode ./test_asr_non_streaming_funasr_nano.js\n```\n\n### Non-streaming speech recognition with Google MedASR CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\nnode ./test_asr_non_streaming_medasr_ctc.js\n```\n\n### Non-streaming speech recognition with Omnilingual ASR CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\nnode ./test_asr_non_streaming_omnilingual_asr_ctc.js\n```\n\n### Non-streaming speech recognition with WeNet CTC models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\ntar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nrm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\nnode ./test_asr_non_streaming_wenet_ctc.js\n```\n\n### Non-streaming speech recognition with Paraformer\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\nnode ./test_asr_non_streaming_paraformer.js\n\n# To run VAD + non-streaming ASR with Paraformer using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_paraformer_microphone.js\n```\n\n### Non-streaming speech recognition with SenseVoice with homophone replacer\n```bash\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nnode ./test_asr_non_streaming_sense_voice_with_hr.js\n```\n\n### Non-streaming speech recognition with SenseVoice\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\nnode ./test_asr_non_streaming_sense_voice.js\n\n# To run VAD + non-streaming ASR with SenseVoice using a microphone\nnpm install naudiodon2\nnode ./test_vad_asr_non_streaming_sense_voice_microphone.js\n```\n\n### Zero-shot text-to-speech with PocketTTS models (English TTS, async API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nnode ./test_tts_non_streaming_pocket_en_async.js\n```\n\n### Zero-shot text-to-speech with PocketTTS models (English TTS, async API + playback)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nnpm install speaker\n\nnode 
./test_tts_non_streaming_pocket_en_play_async.js\n```\n\n### Zero-shot text-to-speech with PocketTTS models (English TTS, sync API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nnode ./test_tts_non_streaming_pocket_en.js\n```\n\n### Zero-shot text-to-speech with ZipVoice models (Chinese/English TTS, async API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n# The reference text must match the reference audio transcript.\nnode ./test_tts_non_streaming_zipvoice_zh_en_async.js\n```\n\n### Zero-shot text-to-speech with ZipVoice models (Chinese/English TTS, async API + playback)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n# Install the playback package once.\nnpm install speaker\n\n# The reference text must match the reference audio transcript.\nnode ./test_tts_non_streaming_zipvoice_zh_en_play_async.js\n```\n\n### Zero-shot text-to-speech with ZipVoice models (Chinese/English TTS, sync API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm 
sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n# The reference text must match the reference audio transcript.\nnode ./test_tts_non_streaming_zipvoice_zh_en.js\n```\n\n### Text-to-speech with KittenTTS models (English TTS)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\nnode ./test_tts_non_streaming_kitten_en.js\n```\n\n### Text-to-speech with Supertonic TTS models (English TTS, sync API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\nnode ./test_tts_non_streaming_supertonic_en.js\n```\n\n### Text-to-speech with Supertonic TTS models (English TTS, async API)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\nnode ./test_tts_non_streaming_supertonic_en_async.js\n```\n\n### Text-to-speech with Supertonic TTS models (English TTS, async API + playback)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\nnpm install speaker\n\nnode ./test_tts_non_streaming_supertonic_en_play_async.js\n```\n\n### Text-to-speech with Kokoro TTS models (Chinese + English TTS)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf 
kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\nnode ./test_tts_non_streaming_kokoro_zh_en.js\n```\n\n### Text-to-speech with Kokoro TTS models (English TTS)\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\nnode ./test_tts_non_streaming_kokoro_en.js\n```\n\n### Text-to-speech with MatchaTTS models (English TTS)\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test_tts_non_streaming_matcha_icefall_en.js\n```\n\n### Text-to-speech with MatchaTTS models (Chinese TTS)\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test_tts_non_streaming_matcha_icefall_zh.js\n```\n\n### Text-to-speech with piper VITS models (TTS)\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_GB-cori-medium.tar.bz2\ntar xvf vits-piper-en_GB-cori-medium.tar.bz2\nrm vits-piper-en_GB-cori-medium.tar.bz2\n\nnode ./test_tts_non_streaming_vits_piper_en.js\n```\n\n### Text-to-speech with Coqui-ai/TTS VITS models (TTS)\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-coqui-de-css10.tar.bz2\ntar xvf vits-coqui-de-css10.tar.bz2\nrm vits-coqui-de-css10.tar.bz2\n\nnode ./test_tts_non_streaming_vits_coqui_de.js\n```\n\n### Text-to-speech with vits Chinese models (1/2)\n\n```bash\nwget 
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\ntar xvf sherpa-onnx-vits-zh-ll.tar.bz2\nrm sherpa-onnx-vits-zh-ll.tar.bz2\n\nnode ./test_tts_non_streaming_vits_zh_ll.js\n```\n\n### Text-to-speech with vits Chinese models (2/2)\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\ntar xvf vits-icefall-zh-aishell3.tar.bz2\nrm vits-icefall-zh-aishell3.tar.bz2\n\nnode ./test_tts_non_streaming_vits_zh_aishell3.js\n```\n\n### Spoken language identification with Whisper multi-lingual models\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.tar.bz2\nrm sherpa-onnx-whisper-tiny.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\ntar xvf spoken-language-identification-test-wavs.tar.bz2\nrm spoken-language-identification-test-wavs.tar.bz2\n\nnode ./test_spoken_language_identification.js\n\n# To run VAD + spoken language identification using a microphone\nnpm install naudiodon2\nnode ./test_vad_spoken_language_identification_microphone.js\n```\n\n### Speaker identification\n\nYou can find more models at\n<https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models>\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ngit clone https://github.com/csukuangfj/sr-data\n\nnode ./test_speaker_identification.js\n```\n\n### Add punctuations\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\ntar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n\nnode 
./test_offline_punctuation.js\n```\n\n### Online punctuation\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\ntar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\nnode ./test_online_punctuation.js\n```\n\n## Keyword spotting\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\ntar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nrm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n\nnode ./test_keyword_spotter_transducer.js\n\n# To run keyword spotting using a microphone\nnpm install naudiodon2\nnode ./test_keyword_spotter_transducer_microphone.js\n```\n"
  },
  {
    "path": "nodejs-addon-examples/package.json",
    "content": "{\n  \"dependencies\": {\n    \"sherpa-onnx-node\": \"^1.12.31\"\n  }\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_dolphin_ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'dolphin': {\n      'model':\n          './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_fire_red_asr.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'fireRedAsr': {\n      'encoder':\n          './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx',\n      'decoder':\n          './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_fire_red_asr_ctc.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'fireRedAsrCtc': {\n      'model':\n          './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_fire_red_asr_ctc_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//  This file shows how to use the async API to decode multiple files\nconst path = require('path');\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n/**\n * Create an OfflineRecognizer with FireRedASR CTC model asynchronously.\n */\nasync function createRecognizerAsync(modelDir, numThreads = 2, debug = 1) {\n  const config = {\n    featConfig: {\n      sampleRate: 16000,\n      featureDim: 80,\n    },\n    modelConfig: {\n      fireRedAsrCtc: {\n        model: path.join(modelDir, 'model.int8.onnx'),\n      },\n      tokens: path.join(modelDir, 'tokens.txt'),\n      numThreads,\n      provider: 'cpu',\n      debug,\n    },\n  };\n\n  // Use the async C++ API to create recognizer without blocking Node.js\n  return await sherpa_onnx.OfflineRecognizer.createAsync(config);\n}\n\n/**\n * Read a waveform and create a stream for decoding.\n */\nfunction createStreamFromFile(recognizer, file) {\n  const wave = sherpa_onnx.readWave(file);\n  const stream = recognizer.createStream();\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n  return stream;\n}\n\nasync function main() {\n  const modelDir = './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25';\n\n  // Async recognizer creation\n  const recognizer = await createRecognizerAsync(modelDir);\n\n  const testFiles = [\n    'test_wavs/0.wav',\n    'test_wavs/1.wav',\n    'test_wavs/2.wav',\n    'test_wavs/3-sichuan.wav',\n    'test_wavs/3.wav',\n    'test_wavs/4-tianjin.wav',\n    'test_wavs/5-henan.wav',\n    'test_wavs/8k.wav',\n  ].map(f => path.join(modelDir, f));\n\n  // Create streams for each file\n  const streams = testFiles.map(file => createStreamFromFile(recognizer, file));\n\n  // Decode all streams concurrently\n  const results =\n      await Promise.all(streams.map(stream => recognizer.decodeAsync(stream)));\n\n  console.log('Concurrent 
decode results:');\n  testFiles.forEach((file, i) => {\n    console.log(`${file}: ${results[i].text}`);\n  });\n}\n\nmain().catch(console.error);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_funasr_nano.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'funasrNano': {\n      'encoderAdaptor':\n          './sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx',\n      'llm': './sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx',\n      'embedding':\n          './sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx',\n      'tokenizer': './sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B',\n    },\n    'tokens': '',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_funasr_nano_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//  This file shows how to use the async API to decode multiple files\nconst path = require('path');\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n/**\n * Create an OfflineRecognizer with FunASR Nano model asynchronously.\n */\nasync function createRecognizerAsync(modelDir, numThreads = 2, debug = 1) {\n  const config = {\n    featConfig: {\n      sampleRate: 16000,\n      featureDim: 80,\n    },\n    modelConfig: {\n      funasrNano: {\n        encoderAdaptor: path.join(modelDir, 'encoder_adaptor.int8.onnx'),\n        llm: path.join(modelDir, 'llm.int8.onnx'),\n        embedding: path.join(modelDir, 'embedding.int8.onnx'),\n        tokenizer: path.join(modelDir, 'Qwen3-0.6B'),\n      },\n      tokens: '',\n      numThreads,\n      provider: 'cpu',\n      debug,\n    },\n  };\n\n  // Use the async C++ API to create recognizer without blocking Node.js\n  return await sherpa_onnx.OfflineRecognizer.createAsync(config);\n}\n\n/**\n * Read a waveform and create a stream for decoding.\n */\nfunction createStreamFromFile(recognizer, file) {\n  const wave = sherpa_onnx.readWave(file);\n  const stream = recognizer.createStream();\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n  return stream;\n}\n\nasync function main() {\n  const modelDir = './sherpa-onnx-funasr-nano-int8-2025-12-30';\n\n  // Async recognizer creation\n  const recognizer = await createRecognizerAsync(modelDir);\n\n  const testFiles = [\n    'test_wavs/lyrics_en_1.wav',\n    'test_wavs/lyrics_en_2.wav',\n    'test_wavs/lyrics_en_3.wav',\n  ].map(f => path.join(modelDir, f));\n\n  // Create streams for each file\n  const streams = testFiles.map(file => createStreamFromFile(recognizer, file));\n\n  // Decode all streams concurrently\n  const results =\n      await Promise.all(streams.map(stream => recognizer.decodeAsync(stream)));\n\n  console.log('Concurrent decode results:');\n  testFiles.forEach((file, i) 
=> {\n    console.log(`${file}: ${results[i].text}`);\n  });\n}\n\nmain().catch(console.error);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_medasr_ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'medasr': {\n      'model': './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_moonshine.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'moonshine': {\n      'preprocessor': './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx',\n      'encoder': './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx',\n      'uncachedDecoder':\n          './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx',\n      'cachedDecoder':\n          './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename = './sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_moonshine_v2.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'moonshine': {\n      'encoder':\n          './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort',\n      'mergedDecoder':\n          './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort',\n    },\n    'tokens': './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_nemo_canary.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'canary': {\n      'encoder':\n          './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx',\n      'decoder':\n          './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx',\n      'srcLang': 'en',\n      'tgtLang': 'en',\n      'usePnc': 1,\n    },\n    'tokens':\n        './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 0,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nlet stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nlet result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result (English)\\n', result);\n\nstream = recognizer.createStream();\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\nrecognizer.config.modelConfig.canary.tgtLang = 
'de';\nrecognizer.setConfig(recognizer.config);\n\nrecognizer.decode(stream);\nresult = recognizer.getResult(stream);\nconsole.log('result (German)\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_nemo_ctc.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'nemoCtc': {\n      'model':\n          './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/model.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_nemo_parakeet_tdt_v2.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx',\n      'decoder':\n          './sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx',\n      'joiner': './sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n    'modelType': 'nemo_transducer',\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_omnilingual_asr_ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'omnilingual': {\n      'model':\n          './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_paraformer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'paraformer': {\n      'model': './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/5-henan.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_paraformer_itn.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'paraformer': {\n      'model': './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n  ruleFsts: './itn_zh_number.fst',\n};\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nconst waveFilename = './itn-zh-number.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_sense_voice.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\n// If your path contains non-ascii characters, e.g., Chinese, you can use\n// the following code\n//\n\n// let encoder = new TextEncoder();\n// let tokens = encoder.encode(\n//     './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/测试.txt');\n// let model = encoder.encode(\n//     './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/测试.int8.onnx');\n\n\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'senseVoice': {\n      'model':\n          './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx',\n      // 'model': model,\n      'useInverseTextNormalization': 1,\n    },\n    'tokens': './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt',\n    // 'tokens': tokens,\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    
real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_sense_voice_with_hr.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n\n\n// If your path contains non-ascii characters, e.g., Chinese, you can use\n// the following code\n//\n\n// let encoder = new TextEncoder();\n// let tokens = encoder.encode(\n//     './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/测试.txt');\n// let model = encoder.encode(\n//     './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/测试.int8.onnx');\n\n\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'senseVoice': {\n      'model':\n          './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx',\n      // 'model': model,\n      'useInverseTextNormalization': 1,\n    },\n    'tokens': './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt',\n    // 'tokens': tokens,\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  'hr': {\n    // Please download files from\n    // https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n    'lexicon': './lexicon.txt',\n    'ruleFsts': './replace.fst',\n  }\n};\n\nconst waveFilename = './test-hr.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 
'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_transducer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.int8.onnx',\n      'decoder':\n          './sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx',\n      'joiner':\n          './sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-zipformer-en-2023-04-01/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename = './sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_wenet_ctc.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'wenetCtc': {\n      'model':\n          './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_whisper.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\nconsole.log(`version : ${sherpa_onnx.version}`);\nconsole.log(`git sha1: ${sherpa_onnx.gitSha1}`);\nconsole.log(`git date: ${sherpa_onnx.gitDate}`);\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'whisper': {\n      'encoder': './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx',\n      'decoder': './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename = './sherpa-onnx-whisper-tiny.en/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_non_streaming_zipformer_ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'zipformerCtc': {\n      'model': './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OfflineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nrecognizer.decode(stream);\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_ctc.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'zipformer2Ctc': {\n      'model':\n          './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_ctc_hlg.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'zipformer2Ctc': {\n      'model':\n          './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  'ctcFstDecoderConfig': {\n    'graph': './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst',\n  },\n};\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/1.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_ctc_hlg_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineRecognizer() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'zipformer2Ctc': {\n        'model':\n            './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'ctcFstDecoderConfig': {\n      'graph': './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst',\n    },\n    'enableEndpoint': true,\n    'rule1MinTrailingSilence': 2.4,\n    'rule2MinTrailingSilence': 1.2,\n    'rule3MinUtteranceLength': 20\n  };\n\n  return new sherpa_onnx.OnlineRecognizer(config);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: recognizer.config.featConfig.sampleRate, samples: samples});\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = 
recognizer.getResult(stream).text.toLowerCase();\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    display.print(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_ctc_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineRecognizer() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'zipformer2Ctc': {\n        'model':\n            './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'decodingMethod': 'greedy_search',\n    'maxActivePaths': 4,\n    'enableEndpoint': true,\n    'rule1MinTrailingSilence': 2.4,\n    'rule2MinTrailingSilence': 1.2,\n    'rule3MinUtteranceLength': 20\n  };\n\n  return new sherpa_onnx.OnlineRecognizer(config);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: recognizer.config.featConfig.sampleRate, samples: samples});\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text.toLowerCase();\n\n  if 
(text.length > 0 && lastText != text) {\n    lastText = text;\n    display.print(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_paraformer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'paraformer': {\n      'encoder':\n          './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx',\n      'decoder':\n          './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx',\n    },\n    'tokens': './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_paraformer_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineRecognizer() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'paraformer': {\n        'encoder':\n            './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx',\n        'decoder':\n            './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'decodingMethod': 'greedy_search',\n    'maxActivePaths': 4,\n    'enableEndpoint': true,\n    'rule1MinTrailingSilence': 2.4,\n    'rule2MinTrailingSilence': 1.2,\n    'rule3MinUtteranceLength': 20\n  };\n\n  return new sherpa_onnx.OnlineRecognizer(config);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: recognizer.config.featConfig.sampleRate, samples: samples});\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  let text = 
recognizer.getResult(stream).text.toLowerCase();\n\n  if (isEndpoint) {\n    // for online paraformer models, we have to manually add padding at an\n    // endpoint so that the last word can be recognized\n    const tailPadding =\n        new Float32Array(recognizer.config.featConfig.sampleRate * 0.4);\n    stream.acceptWaveform({\n      samples: tailPadding,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    while (recognizer.isReady(stream)) {\n      recognizer.decode(stream);\n    }\n    text = recognizer.getResult(stream).text.toLowerCase();\n  }\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    display.print(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_t_one_ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'modelConfig': {\n    'toneCtc': {\n      'model': './sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx',\n    },\n    'tokens': './sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename = './sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nconst leftPadding = new Float32Array(wave.sampleRate * 0.3);\nstream.acceptWaveform({samples: leftPadding, sampleRate: wave.sampleRate});\n\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.6);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_transducer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx',\n      'decoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n      'joiner':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    
real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_transducer_itn.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx',\n      'decoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n      'joiner':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n  ruleFsts: './itn_zh_number.fst',\n};\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nconst waveFilename = './itn-zh-number.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', 
elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_transducer_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineRecognizer() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'transducer': {\n        'encoder':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx',\n        'decoder':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n        'joiner':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'decodingMethod': 'greedy_search',\n    'maxActivePaths': 4,\n    'enableEndpoint': true,\n    'rule1MinTrailingSilence': 2.4,\n    'rule2MinTrailingSilence': 1.2,\n    'rule3MinUtteranceLength': 20\n  };\n\n  return new sherpa_onnx.OnlineRecognizer(config);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: 
recognizer.config.featConfig.sampleRate, samples: samples});\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text.toLowerCase();\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    display.print(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_transducer_microphone_itn.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineRecognizer() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'transducer': {\n        'encoder':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx',\n        'decoder':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n        'joiner':\n            './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'decodingMethod': 'greedy_search',\n    'maxActivePaths': 4,\n    'enableEndpoint': true,\n    'rule1MinTrailingSilence': 2.4,\n    'rule2MinTrailingSilence': 1.2,\n    'rule3MinUtteranceLength': 20,\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    ruleFsts: './itn_zh_number.fst',\n  };\n\n  return new sherpa_onnx.OnlineRecognizer(config);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data 
=> {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: recognizer.config.featConfig.sampleRate, samples: samples});\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text.toLowerCase();\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    display.print(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_asr_streaming_transducer_with_hr.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx',\n      'decoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n      'joiner':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  'hr': {\n    'lexicon': './lexicon.txt',\n    'ruleFsts': './replace.fst',\n  },\n};\n\nconst waveFilename = './test-hr.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst result = recognizer.getResult(stream);\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    
real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', result);\n"
  },
  {
    "path": "nodejs-addon-examples/test_audio_tagging_ced.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download models files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\nfunction createAudioTagging() {\n  const config = {\n    model: {\n      ced: './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx',\n      numThreads: 1,\n      debug: true,\n    },\n    labels:\n        './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv',\n    topK: 5,\n  };\n  return new sherpa_onnx.AudioTagging(config);\n}\n\nconst at = createAudioTagging();\n\nconst testWaves = [\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/1.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/2.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/3.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/4.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/5.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/6.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/7.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/8.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/9.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/10.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/11.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/12.wav',\n  './sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/13.wav',\n];\n\nconsole.log('------');\n\nfor (let filename of testWaves) {\n  const start = Date.now();\n  const stream = at.createStream();\n  const wave = sherpa_onnx.readWave(filename);\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n  const events = at.compute(stream);\n  const stop = Date.now();\n\n  const elapsed_seconds = (stop - start) / 1000;\n  const duration = wave.samples.length / 
wave.sampleRate;\n  const real_time_factor = elapsed_seconds / duration;\n\n  console.log('input file:', filename);\n  console.log('Probability\\t\\tName');\n  for (let e of events) {\n    console.log(`${e.prob.toFixed(3)}\\t\\t\\t${e.name}`);\n  }\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n  console.log('------');\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_audio_tagging_zipformer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download models files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\nfunction createAudioTagging() {\n  const config = {\n    model: {\n      zipformer: {\n        model:\n            './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx'\n      },\n      numThreads: 1,\n      debug: true,\n    },\n    labels:\n        './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/class_labels_indices.csv',\n    topK: 5,\n  };\n  return new sherpa_onnx.AudioTagging(config);\n}\n\nconst at = createAudioTagging();\n\nconst testWaves = [\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/1.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/2.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/3.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/4.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/5.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/6.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/7.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/8.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/9.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/10.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/11.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/12.wav',\n  './sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/13.wav',\n];\n\nconsole.log('------');\n\nfor (let filename of testWaves) {\n  const start = Date.now();\n  const stream = at.createStream();\n  const wave = sherpa_onnx.readWave(filename);\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n  const events = 
at.compute(stream);\n  const stop = Date.now();\n\n  const elapsed_seconds = (stop - start) / 1000;\n  const duration = wave.samples.length / wave.sampleRate;\n  const real_time_factor = elapsed_seconds / duration;\n\n  console.log('input file:', filename);\n  console.log('Probability\\t\\tName');\n  for (let e of events) {\n    console.log(`${e.prob.toFixed(3)}\\t\\t\\t${e.name}`);\n  }\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n  console.log('------');\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_keyword_spotter_transducer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n      'decoder':\n          './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n      'joiner':\n          './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt',\n    'numThreads': 1,\n    'provider': 'cpu',\n    'debug': 1,\n  },\n  'keywordsFile':\n      './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt',\n};\n\nconst waveFilename =\n    './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav';\n\nconst kws = new sherpa_onnx.KeywordSpotter(config);\nconsole.log('Started');\nlet start = Date.now();\nconst stream = kws.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nconst detectedKeywords = [];\nwhile (kws.isReady(stream)) {\n  kws.decode(stream);\n  const keyword = kws.getResult(stream).keyword;\n  if (keyword != '') {\n    detectedKeywords.push(keyword);\n  }\n}\nlet stop = Date.now();\n\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', 
duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\nconsole.log(waveFilename);\nconsole.log('result\\n', detectedKeywords);\n"
  },
  {
    "path": "nodejs-addon-examples/test_keyword_spotter_transducer_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createKeywordSpotter() {\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'transducer': {\n        'encoder':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n        'decoder':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n        'joiner':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    },\n    'keywordsFile':\n        './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/keywords.txt',\n  };\n\n  return new sherpa_onnx.KeywordSpotter(config);\n}\n\nconst kws = createKeywordSpotter();\nconst stream = kws.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: kws.config.featConfig.sampleRate\n  }\n});\n\nconst display = new sherpa_onnx.Display(50);\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(\n      {sampleRate: kws.config.featConfig.sampleRate, samples: samples});\n\n  while (kws.isReady(stream)) {\n    kws.decode(stream);\n  }\n\n  
const keyword = kws.getResult(stream).keyword;\n  if (keyword != '') {\n    display.print(segmentIndex, keyword);\n    segmentIndex += 1;\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak.');\nconsole.log(`Only words from ${kws.config.keywordsFile} can be recognized`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_offline_punctuation.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\nfunction createPunctuation() {\n  const config = {\n    model: {\n      ctTransformer:\n          './sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx',\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n  };\n  return new sherpa_onnx.OfflinePunctuation(config);\n}\n\nconst punct = createPunctuation();\nconst sentences = [\n  '这是一个测试你好吗How are you我很好thank you are you ok谢谢你',\n  '我们都是木头人不会说话不会动',\n  'The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry',\n];\nconsole.log('---');\nfor (let sentence of sentences) {\n  const punct_text = punct.addPunct(sentence);\n  console.log(`Input: ${sentence}`);\n  console.log(`Output: ${punct_text}`);\n  console.log('---');\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_offline_speaker_diarization.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// clang-format off\n/* Please use the following commands to download files\n   used in this script\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\n */\n// clang-format on\n\nconst config = {\n  segmentation: {\n    pyannote: {\n      model: './sherpa-onnx-pyannote-segmentation-3-0/model.onnx',\n    },\n  },\n  embedding: {\n    model: './3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx',\n  },\n  clustering: {\n    // since we know that the test wave file\n    // ./0-four-speakers-zh.wav contains 4 speakers, we use 4 for numClusters\n    // here. 
If you don't have such information, please set numClusters to -1\n    numClusters: 4,\n\n    // If numClusters is not -1, then threshold is ignored.\n    //\n    // A larger threshold leads to fewer clusters, i.e., fewer speakers\n    // A smaller threshold leads to more clusters, i.e., more speakers\n    // You need to tune it yourself.\n    threshold: 0.5,\n  },\n\n  // If a segment is shorter than minDurationOn, we discard it\n  minDurationOn: 0.2,  // in seconds\n\n  // If the gap between two segments is less than minDurationOff, then we\n  // merge these two segments into a single one\n  minDurationOff: 0.5,  // in seconds\n};\n\nconst waveFilename = './0-four-speakers-zh.wav';\n\nconst sd = new sherpa_onnx.OfflineSpeakerDiarization(config);\nconsole.log('Started');\n\nconst wave = sherpa_onnx.readWave(waveFilename);\nif (sd.sampleRate != wave.sampleRate) {\n  throw new Error(\n      `Expected sample rate: ${sd.sampleRate}, given: ${wave.sampleRate}`);\n}\n\nconst segments = sd.process(wave.samples);\nconsole.log(segments);\n"
  },
  {
    "path": "nodejs-addon-examples/test_offline_speech_enhancement_dpdfnet.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOfflineSpeechDenoiser() {\n  // please download models from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  const config = {\n    model: {\n      dpdfnet: {model: './dpdfnet_baseline.onnx'},\n      debug: true,\n      numThreads: 1,\n    },\n  };\n\n  return new sherpa_onnx.OfflineSpeechDenoiser(config);\n}\n\nconst sd = createOfflineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nconst denoised = sd.run({\n  samples: wave.samples,\n  sampleRate: wave.sampleRate,\n  enableExternalBuffer: true\n});\nsherpa_onnx.writeWave(\n    './enhanced-dpdfnet-16k.wav',\n    {samples: denoised.samples, sampleRate: denoised.sampleRate});\n\nconsole.log('Saved to ./enhanced-dpdfnet-16k.wav');\n"
  },
  {
    "path": "nodejs-addon-examples/test_offline_speech_enhancement_gtcrn.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOfflineSpeechDenoiser() {\n  // please download models from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  const config = {\n    model: {\n      gtcrn: {model: './gtcrn_simple.onnx'},\n      debug: true,\n      numThreads: 1,\n    },\n  };\n\n  return new sherpa_onnx.OfflineSpeechDenoiser(config);\n}\n\nconst sd = createOfflineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nconst denoised = sd.run({\n  samples: wave.samples,\n  sampleRate: wave.sampleRate,\n  enableExternalBuffer: true\n});\nsherpa_onnx.writeWave(\n    './enhanced-16k.wav',\n    {samples: denoised.samples, sampleRate: denoised.sampleRate});\n\nconsole.log(`Saved to ./enhanced-16k.wav`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_online_punctuation.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\nfunction createPunctuation() {\n  const config = {\n    model: {\n      cnnBilstm:\n          './sherpa-onnx-online-punct-en-2024-08-06/model.onnx',\n      bpeVocab:\n          './sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab',\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n  };\n  return new sherpa_onnx.OnlinePunctuation(config);\n}\n\nconst punct = createPunctuation();\nconst sentences = [\n  'How are you i am fine thank you',\n  'The african blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry',\n];\nconsole.log('---');\nfor (let sentence of sentences) {\n  const punct_text = punct.addPunct(sentence);\n  console.log(`Input: ${sentence}`);\n  console.log(`Output: ${punct_text}`);\n  console.log('---');\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_online_speech_enhancement_dpdfnet.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineSpeechDenoiser() {\n  const config = {\n    model: {\n      dpdfnet: {model: './dpdfnet_baseline.onnx'},\n      debug: true,\n      numThreads: 1,\n    },\n  };\n\n  return new sherpa_onnx.OnlineSpeechDenoiser(config);\n}\n\nconst sd = createOnlineSpeechDenoiser();\nconst wave = sherpa_onnx.readWave('./inp_16k.wav');\nconst output = [];\nconst frameShift = sd.frameShiftInSamples;\n\nfor (let start = 0; start < wave.samples.length; start += frameShift) {\n  const end = Math.min(start + frameShift, wave.samples.length);\n  const chunk = wave.samples.slice(start, end);\n  const denoised = sd.run({\n    samples: chunk,\n    sampleRate: wave.sampleRate,\n    enableExternalBuffer: true\n  });\n  output.push(...denoised.samples);\n}\n\nconst tail = sd.flush(true);\noutput.push(...tail.samples);\n\nsherpa_onnx.writeWave(\n    './enhanced-online-dpdfnet.wav',\n    {samples: Float32Array.from(output), sampleRate: sd.sampleRate});\n\nconsole.log('Saved to ./enhanced-online-dpdfnet.wav');\n"
  },
  {
    "path": "nodejs-addon-examples/test_online_speech_enhancement_gtcrn.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createOnlineSpeechDenoiser() {\n  const config = {\n    model: {\n      gtcrn: {model: './gtcrn_simple.onnx'},\n      debug: true,\n      numThreads: 1,\n    },\n  };\n\n  return new sherpa_onnx.OnlineSpeechDenoiser(config);\n}\n\nconst sd = createOnlineSpeechDenoiser();\nconst wave = sherpa_onnx.readWave('./inp_16k.wav');\nconst output = [];\nconst frameShift = sd.frameShiftInSamples;\n\nfor (let start = 0; start < wave.samples.length; start += frameShift) {\n  const end = Math.min(start + frameShift, wave.samples.length);\n  const chunk = wave.samples.slice(start, end);\n  const denoised = sd.run({\n    samples: chunk,\n    sampleRate: wave.sampleRate,\n    enableExternalBuffer: true\n  });\n  output.push(...denoised.samples);\n}\n\nconst tail = sd.flush(true);\noutput.push(...tail.samples);\n\nsherpa_onnx.writeWave(\n    './enhanced-online-gtcrn.wav',\n    {samples: Float32Array.from(output), sampleRate: sd.sampleRate});\n\nconsole.log('Saved to ./enhanced-online-gtcrn.wav');\n"
  },
  {
    "path": "nodejs-addon-examples/test_speaker_identification.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\nconst assert = require('node:assert');\n\n// Please download models files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfunction createSpeakerEmbeddingExtractor() {\n  const config = {\n    model: './3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx',\n    numThreads: 1,\n    debug: true,\n  };\n  return new sherpa_onnx.SpeakerEmbeddingExtractor(config);\n}\n\nfunction computeEmbedding(extractor, filename) {\n  const stream = extractor.createStream();\n  const wave = sherpa_onnx.readWave(filename);\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n  return extractor.compute(stream);\n}\n\nconst extractor = createSpeakerEmbeddingExtractor();\nconst manager = new sherpa_onnx.SpeakerEmbeddingManager(extractor.dim);\n\n// Please download test files from\n// https://github.com/csukuangfj/sr-data\nconst spk1Files = [\n  './sr-data/enroll/fangjun-sr-1.wav',\n  './sr-data/enroll/fangjun-sr-2.wav',\n  './sr-data/enroll/fangjun-sr-3.wav',\n];\n\nlet spk1Vec = [];\nfor (let f of spk1Files) {\n  spk1Vec.push(computeEmbedding(extractor, f));\n}\n\nconst spk2Files = [\n  './sr-data/enroll/leijun-sr-1.wav',\n  './sr-data/enroll/leijun-sr-2.wav',\n];\n\nlet spk2Vec = [];\nfor (let f of spk2Files) {\n  spk2Vec.push(computeEmbedding(extractor, f));\n}\n\nlet ok = manager.addMulti({name: 'fangjun', v: spk1Vec});\nassert.equal(ok, true);\n\nok = manager.addMulti({name: 'leijun', v: spk2Vec});\nassert.equal(ok, true);\n\nassert.equal(manager.getNumSpeakers(), 2);\n\nassert.equal(manager.contains('fangjun'), true);\nassert.equal(manager.contains('leijun'), true);\n\nconsole.log('---All speakers---');\n\nconsole.log(manager.getAllSpeakerNames());\nconsole.log('------------');\n\nconst testFiles = [\n  './sr-data/test/fangjun-test-sr-1.wav',\n  './sr-data/test/leijun-test-sr-1.wav',\n  
'./sr-data/test/liudehua-test-sr-1.wav',\n];\n\nconst threshold = 0.6;\n\nfor (let f of testFiles) {\n  const embedding = computeEmbedding(extractor, f);\n\n  let name = manager.search({v: embedding, threshold: threshold});\n  if (name == '') {\n    name = '<Unknown>';\n  }\n  console.log(`${f}: ${name}`);\n}\n\n\nok = manager.verify({\n  name: 'fangjun',\n  v: computeEmbedding(extractor, testFiles[0]),\n  threshold: threshold\n});\n\nassert.equal(ok, true);\n\nok = manager.remove('fangjun');\nassert.equal(ok, true);\n\nok = manager.verify({\n  name: 'fangjun',\n  v: computeEmbedding(extractor, testFiles[0]),\n  threshold: threshold\n});\nassert.equal(ok, false);\n\nassert.equal(manager.getNumSpeakers(), 1);\n"
  },
  {
    "path": "nodejs-addon-examples/test_spoken_language_identification.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// Please download whisper multi-lingual models from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nfunction createSpokenLanguageID() {\n  const config = {\n    whisper: {\n      encoder: './sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx',\n      decoder: './sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx',\n    },\n    debug: true,\n    numThreads: 1,\n    provider: 'cpu',\n  };\n  return new sherpa_onnx.SpokenLanguageIdentification(config);\n}\n\nconst slid = createSpokenLanguageID();\n\nconst testWaves = [\n  './spoken-language-identification-test-wavs/ar-arabic.wav',\n  './spoken-language-identification-test-wavs/de-german.wav',\n  './spoken-language-identification-test-wavs/en-english.wav',\n  './spoken-language-identification-test-wavs/fr-french.wav',\n  './spoken-language-identification-test-wavs/pt-portuguese.wav',\n  './spoken-language-identification-test-wavs/es-spanish.wav',\n  './spoken-language-identification-test-wavs/zh-chinese.wav',\n];\n\nconst display = new Intl.DisplayNames(['en'], {type: 'language'});\n\nfor (let f of testWaves) {\n  const stream = slid.createStream();\n\n  const wave = sherpa_onnx.readWave(f);\n  stream.acceptWaveform({sampleRate: wave.sampleRate, samples: wave.samples});\n\n  const lang = slid.compute(stream);\n  console.log(f.split('/')[2], lang, display.of(lang));\n}\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_kitten_en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n/**\n * Create an offline TTS instance asynchronously using the Kitten model.\n *\n * Model files can be downloaded from:\n * https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\n */\nasync function createOfflineTtsAsync() {\n  const config = {\n    model: {\n      kitten: {\n        model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',\n        voices: './kitten-nano-en-v0_1-fp16/voices.bin',\n        tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',\n        dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  // Use the async factory (non-blocking)\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\nasync function main() {\n  // Asynchronously create the OfflineTts instance\n  const tts = await createOfflineTtsAsync();\n\n  const text =\n      'Today as always, men fall into two groups: slaves and free men. ' +\n      'Whoever does not have two-thirds of his day for himself, is a slave, ' +\n      'whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n  console.log('Number of speakers:', tts.numSpeakers);\n  console.log('Sample rate:', tts.sampleRate);\n\n  const start = Date.now();\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    sid: 6,\n    speed: 1.0,\n    silenceScale: 0.2,\n  });\n\n  // Asynchronous generation with progress reporting\n  const audio = await tts.generateAsync({\n    text,\n    generationConfig,\n\n    // Progress callback receives audio chunks\n    onProgress({samples, progress}) {\n      // samples is Float32Array for this chunk\n      process.stdout.write(`\\rGenerating... 
${\n          (progress * 100).toFixed(1)}% (chunk length: ${samples.length})`);\n\n      // Return 0 or false to cancel, any other value to continue\n      return true;\n    },\n  });\n\n  console.log('\\nGeneration finished.');\n\n  const stop = Date.now();\n  const elapsedSeconds = (stop - start) / 1000;\n  const durationSeconds = audio.samples.length / audio.sampleRate;\n  const realTimeFactor = elapsedSeconds / durationSeconds;\n\n  console.log('Wave duration:', durationSeconds.toFixed(3), 'seconds');\n  console.log('Elapsed time:', elapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsedSeconds.toFixed(3)} / ${durationSeconds.toFixed(3)} =`,\n      realTimeFactor.toFixed(3));\n\n  const filename = 'test-kitten-en-6.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n\n  console.log(`Saved to ${filename}`);\n}\n\n// Run the demo\nmain().catch((err) => {\n  console.error('TTS failed:', err);\n  process.exit(1);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_kokoro_en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      kokoro: {\n        model: './kokoro-en-v0_19/model.onnx',\n        voices: './kokoro-en-v0_19/voices.bin',\n        tokens: './kokoro-en-v0_19/tokens.txt',\n        dataDir: './kokoro-en-v0_19/espeak-ng-data',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 6,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-kokoro-en-6.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_kokoro_zh_en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      kokoro: {\n        model: './kokoro-multi-lang-v1_0/model.onnx',\n        voices: './kokoro-multi-lang-v1_0/voices.bin',\n        tokens: './kokoro-multi-lang-v1_0/tokens.txt',\n        dataDir: './kokoro-multi-lang-v1_0/espeak-ng-data',\n        lexicon:\n            './kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    '中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 48,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-kokoro-zh-en-48.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_matcha_icefall_en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      matcha: {\n        acousticModel: './matcha-icefall-en_US-ljspeech/model-steps-3.onnx',\n        vocoder: './vocos-22khz-univ.onnx',\n        tokens: './matcha-icefall-en_US-ljspeech/tokens.txt',\n        dataDir: './matcha-icefall-en_US-ljspeech/espeak-ng-data',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 0,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-matcha-en.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_matcha_icefall_zh.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      matcha: {\n        acousticModel: './matcha-icefall-zh-baker/model-steps-3.onnx',\n        vocoder: './vocos-22khz-univ.onnx',\n        lexicon: './matcha-icefall-zh-baker/lexicon.txt',\n        tokens: './matcha-icefall-zh-baker/tokens.txt',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n    ruleFsts:\n        './matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst',\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    '当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔. 某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 0,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-matcha-zh.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_pocket_en.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      pocket: {\n        lmFlow: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx',\n        lmMain: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx',\n        encoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx',\n        decoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx',\n        textConditioner:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx',\n        vocabJson: './sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json',\n        tokenScoresJson:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json',\n        voiceEmbeddingCacheCapacity: 50,\n      },\n      debug: true,\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst referenceAudioFilename =\n    './sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav';\nconst referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  speed: 1.0,\n  referenceAudio: referenceWave.samples,\n  referenceSampleRate: referenceWave.sampleRate,\n  numSteps: 5,\n  extra: {max_reference_audio_len: 12, seed: 42}\n});\n\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\n\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-pocket-bria.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_pocket_en_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      pocket: {\n        lmFlow: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx',\n        lmMain: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx',\n        encoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx',\n        decoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx',\n        textConditioner:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx',\n        vocabJson: './sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json',\n        tokenScoresJson:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json',\n        voiceEmbeddingCacheCapacity: 50,\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\n/**\n * Async function to generate audio with progress callback\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const referenceAudioFilename =\n      './sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav';\n  const referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    speed: 1.0,\n    referenceAudio: referenceWave.samples,\n    referenceSampleRate: referenceWave.sampleRate,\n    numSteps: 5,\n    extra: {max_reference_audio_len: 12, seed: 42},\n  });\n\n  console.log('Starting generation...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      // Print progress percentage and number of samples generated\n      process.stdout.write(\n          `Progress: ${(progress 
* 100).toFixed(1)}%, ` +\n          `Samples: ${samples.length}\\r`);\n\n      // Return anything other than 0/false to continue generation\n      return 1;\n    },\n  });\n\n  console.log('\\nGeneration complete!');\n  return audio;\n}\n\n/**\n * Main entry\n */\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n\n  const start = Date.now();\n  const audio = await generateAudioAsync(tts, text);\n  const stop = Date.now();\n\n  const elapsed_seconds = (stop - start) / 1000;\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = elapsed_seconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-pocket-bria-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\n// Run the async main\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_pocket_en_play_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// npm install speaker\n//\nconst Speaker = require('speaker');\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      pocket: {\n        lmFlow: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx',\n        lmMain: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx',\n        encoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx',\n        decoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx',\n        textConditioner:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx',\n        vocabJson: './sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json',\n        tokenScoresJson:\n            './sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json',\n        voiceEmbeddingCacheCapacity: 50,\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\nfunction createSpeaker(sampleRate) {\n  return new Speaker({\n    channels: 1,\n    bitDepth: 16,\n    sampleRate: sampleRate,\n    signed: true,\n  });\n}\n\nfunction float32ToInt16Buffer(samples) {\n  const buffer = Buffer.alloc(samples.length * 2);\n\n  for (let i = 0; i < samples.length; ++i) {\n    const s = Math.max(-1, Math.min(1, samples[i]));\n    const v = s < 0 ? 
s * 0x8000 : s * 0x7fff;\n    buffer.writeInt16LE(Math.round(v), i * 2);\n  }\n\n  return buffer;\n}\n\nfunction waitForEvent(emitter, eventName) {\n  return new Promise((resolve, reject) => {\n    emitter.once(eventName, resolve);\n    emitter.once('error', reject);\n  });\n}\n\n/**\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const referenceAudioFilename =\n      './sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav';\n  const referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    speed: 1.0,\n    referenceAudio: referenceWave.samples,\n    referenceSampleRate: referenceWave.sampleRate,\n    numSteps: 5,\n    extra: {max_reference_audio_len: 12, seed: 42},\n  });\n\n  const speaker = createSpeaker(tts.sampleRate);\n  const start = Date.now();\n\n  console.log('Starting generation and playback...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      process.stdout.write(\n          `Progress: ${(progress * 100).toFixed(1)}%, ` +\n          `Chunk samples: ${samples.length}\\r`);\n      speaker.write(float32ToInt16Buffer(samples));\n      return 1;\n    },\n  });\n\n  const generationStop = Date.now();\n  speaker.end();\n  await waitForEvent(speaker, 'close');\n  const playbackStop = Date.now();\n\n  console.log('\\nGeneration and playback complete!');\n  return {\n    audio,\n    generationElapsedSeconds: (generationStop - start) / 1000,\n    playbackElapsedSeconds: (playbackStop - start) / 1000,\n  };\n}\n\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      'Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n  const {audio, generationElapsedSeconds, playbackElapsedSeconds} =\n      await generateAudioAsync(tts, text);\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = generationElapsedSeconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log(\n      'Generation elapsed', generationElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      'Playback drained in', playbackElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${generationElapsedSeconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-pocket-bria-play-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_supertonic_en.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      supertonic: {\n        durationPredictor:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx',\n        textEncoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx',\n        vectorEstimator:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx',\n        vocoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx',\n        ttsJson: './sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json',\n        unicodeIndexer:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin',\n        voiceStyle: './sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin',\n      },\n      debug: true,\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 6,\n  speed: 1.25,\n  numSteps: 5,\n  extra: {lang: 'en'},\n});\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\n\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-supertonic-en.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_supertonic_en_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      supertonic: {\n        durationPredictor:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx',\n        textEncoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx',\n        vectorEstimator:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx',\n        vocoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx',\n        ttsJson: './sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json',\n        unicodeIndexer:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin',\n        voiceStyle: './sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin',\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\n/**\n * Async function to generate audio with progress callback\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    sid: 6,\n    speed: 1.25,\n    numSteps: 5,\n    extra: {lang: 'en'},\n  });\n\n  console.log('Starting generation...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      // Print progress percentage and number of samples generated\n      process.stdout.write(\n          `Progress: ${(progress * 100).toFixed(1)}%, ` +\n          `Samples: ${samples.length}\\r`);\n\n      // Return anything other than 0/false to continue generation\n      return 1;\n    },\n  });\n\n  console.log('\\nGeneration complete!');\n  
return audio;\n}\n\n/**\n * Main entry\n */\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n\n  const start = Date.now();\n  const audio = await generateAudioAsync(tts, text);\n  const stop = Date.now();\n\n  const elapsed_seconds = (stop - start) / 1000;\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = elapsed_seconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-supertonic-en-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\n// Run the async main\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_supertonic_en_play_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// npm install speaker\n//\nconst Speaker = require('speaker');\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      supertonic: {\n        durationPredictor:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx',\n        textEncoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx',\n        vectorEstimator:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx',\n        vocoder:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx',\n        ttsJson: './sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json',\n        unicodeIndexer:\n            './sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin',\n        voiceStyle: './sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin',\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\nfunction createSpeaker(sampleRate) {\n  return new Speaker({\n    channels: 1,\n    bitDepth: 16,\n    sampleRate: sampleRate,\n    signed: true,\n  });\n}\n\nfunction float32ToInt16Buffer(samples) {\n  const buffer = Buffer.alloc(samples.length * 2);\n\n  for (let i = 0; i < samples.length; ++i) {\n    const s = Math.max(-1, Math.min(1, samples[i]));\n    const v = s < 0 ? 
s * 0x8000 : s * 0x7fff;\n    buffer.writeInt16LE(Math.round(v), i * 2);\n  }\n\n  return buffer;\n}\n\nfunction waitForEvent(emitter, eventName) {\n  return new Promise((resolve, reject) => {\n    emitter.once(eventName, resolve);\n    emitter.once('error', reject);\n  });\n}\n\n/**\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    sid: 6,\n    speed: 1.25,\n    numSteps: 5,\n    extra: {lang: 'en'},\n  });\n\n  const speaker = createSpeaker(tts.sampleRate);\n  const start = Date.now();\n\n  console.log('Starting generation and playback...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      process.stdout.write(\n          `Progress: ${(progress * 100).toFixed(1)}%, ` +\n          `Chunk samples: ${samples.length}\\r`);\n      speaker.write(float32ToInt16Buffer(samples));\n      return 1;\n    },\n  });\n\n  const generationStop = Date.now();\n  speaker.end();\n  await waitForEvent(speaker, 'close');\n  const playbackStop = Date.now();\n\n  console.log('\\nGeneration and playback complete!');\n  return {\n    audio,\n    generationElapsedSeconds: (generationStop - start) / 1000,\n    playbackElapsedSeconds: (playbackStop - start) / 1000,\n  };\n}\n\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      'Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n  const {audio, generationElapsedSeconds, playbackElapsedSeconds} =\n      await generateAudioAsync(tts, text);\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = generationElapsedSeconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log(\n      'Generation elapsed', generationElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      'Playback drained in', playbackElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${generationElapsedSeconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-supertonic-en-play-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_vits_coqui_de.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please download model files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      vits: {\n        model: './vits-coqui-de-css10/model.onnx',\n        tokens: './vits-coqui-de-css10/tokens.txt',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text = 'Alles hat ein Ende, nur die Wurst hat zwei.';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 0,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\nlet start = Date.now();\nconst audio = tts.generate({\n  text: text,\n  generationConfig,\n  enableExternalBuffer: true,\n});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-coqui-de.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_vits_piper_en.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please download model files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      vits: {\n        model: './vits-piper-en_GB-cori-medium/en_GB-cori-medium.onnx',\n        tokens: './vits-piper-en_GB-cori-medium/tokens.txt',\n        dataDir: './vits-piper-en_GB-cori-medium/espeak-ng-data',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 0,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\n\nlet start = Date.now();\nconst audio = tts.generate({text: text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-piper-en.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_vits_zh_aishell3.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please download model files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      vits: {\n        model: './vits-icefall-zh-aishell3/model.onnx',\n        tokens: './vits-icefall-zh-aishell3/tokens.txt',\n        lexicon: './vits-icefall-zh-aishell3/lexicon.txt',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n    ruleFsts:\n        './vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/number.fst,./vits-icefall-zh-aishell3/new_heteronym.fst',\n    ruleFars: './vits-icefall-zh-aishell3/rule.far',\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    '他在长沙出生，长白山长大，去过长江，现在他是一个银行的行长，主管行政工作。有困难，请拨110，或者13020240513。今天是2024年5月13号, 他上个月的工资是12345块钱。';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 88,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\nlet start = Date.now();\nconst audio = tts.generate({text: text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-zh-aishell3.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_vits_zh_ll.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please download model files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      vits: {\n        model: './sherpa-onnx-vits-zh-ll/model.onnx',\n        tokens: './sherpa-onnx-vits-zh-ll/tokens.txt',\n        lexicon: './sherpa-onnx-vits-zh-ll/lexicon.txt',\n      },\n      debug: true,\n      numThreads: 1,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n    ruleFsts:\n        './sherpa-onnx-vits-zh-ll/date.fst,./sherpa-onnx-vits-zh-ll/phone.fst,./sherpa-onnx-vits-zh-ll/number.fst',\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\n\nconst text =\n    '当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔。2024年5月13号，拨打110或者18920240513。123456块钱。';\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  sid: 2,\n  speed: 1.0,\n  silenceScale: 0.2,\n});\n\nlet start = Date.now();\nconst audio = tts.generate({text: text, generationConfig});\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-zh-ll.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_zipvoice_zh_en.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n// to download model files\nfunction createOfflineTts() {\n  const config = {\n    model: {\n      zipvoice: {\n        tokens: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt',\n        encoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx',\n        decoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx',\n        vocoder: './vocos_24khz.onnx',\n        dataDir:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data',\n        lexicon: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt',\n      },\n      debug: true,\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n  return new sherpa_onnx.OfflineTts(config);\n}\n\nconst tts = createOfflineTts();\nconst text =\n    '小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.';\nconst referenceText =\n    '那还是三十六年前, 一九八七年. 
我呢考上了武汉大学的计算机系.';\nconst referenceAudioFilename =\n    './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav';\nconst referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\nconst generationConfig = new sherpa_onnx.GenerationConfig({\n  speed: 1.0,\n  referenceAudio: referenceWave.samples,\n  referenceSampleRate: referenceWave.sampleRate,\n  referenceText,\n  numSteps: 4,\n  extra: {min_char_in_sentence: 10},\n});\n\nlet start = Date.now();\nconst audio = tts.generate({text, generationConfig});\n\nlet stop = Date.now();\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = audio.samples.length / audio.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nconst filename = 'test-zipvoice-zh-en.wav';\nsherpa_onnx.writeWave(\n    filename, {samples: audio.samples, sampleRate: audio.sampleRate});\n\nconsole.log(`Saved to ${filename}`);\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_zipvoice_zh_en_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      zipvoice: {\n        tokens: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt',\n        encoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx',\n        decoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx',\n        vocoder: './vocos_24khz.onnx',\n        dataDir:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data',\n        lexicon: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt',\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\n/**\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const referenceText =\n      '那还是三十六年前, 一九八七年. 
我呢考上了武汉大学的计算机系.';\n  const referenceAudioFilename =\n      './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav';\n  const referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    speed: 1.0,\n    referenceAudio: referenceWave.samples,\n    referenceSampleRate: referenceWave.sampleRate,\n    referenceText,\n    numSteps: 4,\n    extra: {min_char_in_sentence: 10},\n  });\n\n  console.log('Starting generation...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      process.stdout.write(\n          `Progress: ${(progress * 100).toFixed(1)}%, ` +\n          `Samples: ${samples.length}\\r`);\n      return 1;\n    },\n  });\n\n  console.log('\\nGeneration complete!');\n  return audio;\n}\n\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      '小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.';\n\n  const start = Date.now();\n  const audio = await generateAudioAsync(tts, text);\n  const stop = Date.now();\n\n  const elapsed_seconds = (stop - start) / 1000;\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = elapsed_seconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-zipvoice-zh-en-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_tts_non_streaming_zipvoice_zh_en_play_async.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// npm install speaker\n//\nconst Speaker = require('speaker');\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nasync function createOfflineTts() {\n  const config = {\n    model: {\n      zipvoice: {\n        tokens: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt',\n        encoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx',\n        decoder:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx',\n        vocoder: './vocos_24khz.onnx',\n        dataDir:\n            './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data',\n        lexicon: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt',\n      },\n      debug: false,  // set to true to see verbose logs\n      numThreads: 2,\n      provider: 'cpu',\n    },\n    maxNumSentences: 1,\n  };\n\n  return await sherpa_onnx.OfflineTts.createAsync(config);\n}\n\nfunction createSpeaker(sampleRate) {\n  return new Speaker({\n    channels: 1,\n    bitDepth: 16,\n    sampleRate: sampleRate,\n    signed: true,\n  });\n}\n\nfunction float32ToInt16Buffer(samples) {\n  const buffer = Buffer.alloc(samples.length * 2);\n\n  for (let i = 0; i < samples.length; ++i) {\n    const s = Math.max(-1, Math.min(1, samples[i]));\n    const v = s < 0 ? s * 0x8000 : s * 0x7fff;\n    buffer.writeInt16LE(Math.round(v), i * 2);\n  }\n\n  return buffer;\n}\n\nfunction waitForEvent(emitter, eventName) {\n  return new Promise((resolve, reject) => {\n    emitter.once(eventName, resolve);\n    emitter.once('error', reject);\n  });\n}\n\n/**\n * @param {sherpa_onnx.OfflineTts} tts\n * @param {string} text\n */\nasync function generateAudioAsync(tts, text) {\n  const referenceText =\n      '那还是三十六年前, 一九八七年. 
我呢考上了武汉大学的计算机系.';\n  const referenceAudioFilename =\n      './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav';\n  const referenceWave = sherpa_onnx.readWave(referenceAudioFilename);\n\n  const generationConfig = new sherpa_onnx.GenerationConfig({\n    speed: 1.0,\n    referenceAudio: referenceWave.samples,\n    referenceSampleRate: referenceWave.sampleRate,\n    referenceText,\n    numSteps: 4,\n    extra: {min_char_in_sentence: 10},\n  });\n\n  const speaker = createSpeaker(tts.sampleRate);\n  const start = Date.now();\n\n  console.log('Starting generation and playback...');\n\n  const audio = await tts.generateAsync({\n    text,\n    enableExternalBuffer: true,\n    generationConfig,\n    onProgress: ({samples, progress}) => {\n      process.stdout.write(\n          `Progress: ${(progress * 100).toFixed(1)}%, ` +\n          `Chunk samples: ${samples.length}\\r`);\n      speaker.write(float32ToInt16Buffer(samples));\n      return 1;\n    },\n  });\n\n  const generationStop = Date.now();\n  speaker.end();\n  await waitForEvent(speaker, 'close');\n  const playbackStop = Date.now();\n\n  console.log('\\nGeneration and playback complete!');\n  return {\n    audio,\n    generationElapsedSeconds: (generationStop - start) / 1000,\n    playbackElapsedSeconds: (playbackStop - start) / 1000,\n  };\n}\n\nasync function main() {\n  console.log('Creating OfflineTts...');\n  const tts = await createOfflineTts();\n  console.log('OfflineTts created!');\n\n  const text =\n      '小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.';\n\n  const {audio, generationElapsedSeconds, playbackElapsedSeconds} =\n      await generateAudioAsync(tts, text);\n  const duration = audio.samples.length / audio.sampleRate;\n  const real_time_factor = generationElapsedSeconds / duration;\n\n  console.log('Wave duration', duration.toFixed(3), 'seconds');\n  console.log(\n      'Generation elapsed', generationElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      'Playback drained in', playbackElapsedSeconds.toFixed(3), 'seconds');\n  console.log(\n      `RTF = ${generationElapsedSeconds.toFixed(3)}/${duration.toFixed(3)} =`,\n      real_time_factor.toFixed(3));\n\n  const filename = 'test-zipvoice-zh-en-play-async.wav';\n  sherpa_onnx.writeWave(filename, {\n    samples: audio.samples,\n    sampleRate: audio.sampleRate,\n  });\n  console.log(`Saved to ${filename}`);\n}\n\nmain().catch((err) => {\n  console.error('Error:', err);\n});\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_moonshine_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'moonshine': {\n        'preprocessor': './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx',\n        'encoder': './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx',\n        'uncachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx',\n        'cachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                       
  // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_nemo_ctc_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'nemoCtc': {\n        'model':\n            './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/model.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: 
vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_paraformer_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'paraformer': {\n        'model': './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  
const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_sense_voice_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'senseVoice': {\n        'model':\n            './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx',\n        'useInverseTextNormalization': 1,\n      },\n      'tokens':\n          './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: 
vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_transducer_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'transducer': {\n        'encoder':\n            './sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.int8.onnx',\n        'decoder':\n            './sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx',\n        'joiner':\n            './sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-zipformer-en-2023-04-01/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    
deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_whisper_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'whisper': {\n        'encoder': './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx',\n        'decoder': './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate\n  
}\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_asr_non_streaming_zipformer_ctc_microphone.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'zipformerCtc': {\n        'model':\n            './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate\n  }\n});\n\nlet printed = false;\nlet index = 
0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n  }\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n    const stream = recognizer.createStream();\n    stream.acceptWaveform({\n      samples: segment.samples,\n      sampleRate: recognizer.config.featConfig.sampleRate\n    });\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${index}: ${text}`);\n\n      const filename = `${index}-${text}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n\n      index += 1;\n    }\n  }\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  //\n  // OR\n  //\n  // please download ten-vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n  const config = {\n    sileroVad: {\n      // model: '',\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    tenVad: {\n      model: '',\n      // model: './ten-vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 256,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst vad = createVad();\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate,\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.model != '' ?\n      vad.config.sileroVad.windowSize :\n      vad.config.tenVad.windowSize;\n\n  buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), 
windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n    if (vad.isDetected() && !printed) {\n      console.log(`${index}: Detected speech`);\n      printed = true;\n    }\n\n    if (!vad.isDetected()) {\n      printed = false;\n    }\n\n    while (!vad.isEmpty()) {\n      const segment = vad.front();\n      vad.pop();\n      const filename = `${index}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n      const duration = segment.samples.length / vad.config.sampleRate;\n      console.log(`${index} End of speech. Duration: ${duration} seconds`);\n      console.log(`Saved to ${filename}`);\n      index += 1;\n    }\n  }\n});\n\nai.on('close', () => {\n  console.log('Free resources');\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_spoken_language_identification_microphone.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nfunction createSpokenLanguageID() {\n  const config = {\n    whisper: {\n      encoder: './sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx',\n      decoder: './sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx',\n    },\n    debug: true,\n    numThreads: 1,\n    provider: 'cpu',\n  };\n  return new sherpa_onnx.SpokenLanguageIdentification(config);\n}\n\nconst slid = createSpokenLanguageID();\nconst vad = createVad();\n\nconst display = new Intl.DisplayNames(['en'], {type: 'language'});\n\nconst bufferSizeInSeconds = 30;\nconst buffer =\n    new sherpa_onnx.CircularBuffer(bufferSizeInSeconds * vad.config.sampleRate);\n\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: vad.config.sampleRate,\n  }\n});\n\nlet printed = false;\nlet index = 0;\nai.on('data', data => {\n  const windowSize = vad.config.sileroVad.windowSize;\n  
buffer.push(new Float32Array(data.buffer));\n  while (buffer.size() > windowSize) {\n    const samples = buffer.get(buffer.head(), windowSize);\n    buffer.pop(windowSize);\n    vad.acceptWaveform(samples);\n    if (vad.isDetected() && !printed) {\n      console.log(`${index}: Detected speech`);\n      printed = true;\n    }\n\n    if (!vad.isDetected()) {\n      printed = false;\n    }\n\n    while (!vad.isEmpty()) {\n      const segment = vad.front();\n      vad.pop();\n\n      const stream = slid.createStream();\n      stream.acceptWaveform(\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n      const lang = slid.compute(stream);\n      const fullLang = display.of(lang);\n\n      const filename = `${index}-${fullLang}-${\n                           new Date()\n                               .toLocaleTimeString('en-US', {hour12: false})\n                               .split(' ')[0]}.wav`\n                           .replace(/:/g, '-');\n\n      sherpa_onnx.writeWave(\n          filename,\n          {samples: segment.samples, sampleRate: vad.config.sampleRate});\n      const duration = segment.samples.length / vad.config.sampleRate;\n      console.log(`${index} End of speech. Duration: ${\n          duration} seconds.\\n Detected language: ${fullLang}`);\n      console.log(`Saved to ${filename}`);\n      index += 1;\n    }\n  }\n});\n\nai.on('close', () => {\n  console.log('Free resources');\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_with_non_streaming_asr_moonshine.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'moonshine': {\n        'preprocessor': './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx',\n        'encoder': './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx',\n        'uncachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx',\n        'cachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      maxSpeechDuration: 5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\n// please download ./Obama.wav from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst waveFilename = './Obama.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nif (wave.sampleRate != recognizer.config.featConfig.sampleRate) {\n  throw new Error(\n      'Expected sample rate: ${recognizer.config.featConfig.sampleRate}. 
Given: ${wave.sampleRate}');\n}\n\nconsole.log('Started');\nlet start = Date.now();\n\nconst windowSize = vad.config.sileroVad.windowSize;\nfor (let i = 0; i < wave.samples.length; i += windowSize) {\n  const thisWindow = wave.samples.subarray(i, i + windowSize);\n  vad.acceptWaveform(thisWindow);\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n\n    let start_time = segment.start / wave.sampleRate;\n    let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n    start_time = start_time.toFixed(2);\n    end_time = end_time.toFixed(2);\n\n    const stream = recognizer.createStream();\n    stream.acceptWaveform(\n        {samples: segment.samples, sampleRate: wave.sampleRate});\n\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${start_time} -- ${end_time}: ${text}`);\n    }\n  }\n}\n\nvad.flush();\n\nwhile (!vad.isEmpty()) {\n  const segment = vad.front();\n  vad.pop();\n\n  let start_time = segment.start / wave.sampleRate;\n  let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n  start_time = start_time.toFixed(2);\n  end_time = end_time.toFixed(2);\n\n  const stream = recognizer.createStream();\n  stream.acceptWaveform(\n      {samples: segment.samples, sampleRate: wave.sampleRate});\n\n  recognizer.decode(stream);\n  const r = recognizer.getResult(stream);\n  if (r.text.length > 0) {\n    const text = r.text.toLowerCase().trim();\n    console.log(`${start_time} -- ${end_time}: ${text}`);\n  }\n}\n\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = 
${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n"
  },
  {
    "path": "nodejs-addon-examples/test_vad_with_non_streaming_asr_whisper.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx-node');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'featConfig': {\n      'sampleRate': 16000,\n      'featureDim': 80,\n    },\n    'modelConfig': {\n      'whisper': {\n        'encoder': './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx',\n        'decoder': './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt',\n      'numThreads': 2,\n      'provider': 'cpu',\n      'debug': 1,\n    }\n  };\n\n  return new sherpa_onnx.OfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      maxSpeechDuration: 5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n  };\n\n  const bufferSizeInSeconds = 60;\n\n  return new sherpa_onnx.Vad(config, bufferSizeInSeconds);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\n// please download ./Obama.wav from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst waveFilename = './Obama.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nif (wave.sampleRate != recognizer.config.featConfig.sampleRate) {\n  throw new Error(\n      'Expected sample rate: ${recognizer.config.featConfig.sampleRate}. 
Given: ${wave.sampleRate}');\n}\n\nconsole.log('Started');\nlet start = Date.now();\n\nconst windowSize = vad.config.sileroVad.windowSize;\nfor (let i = 0; i < wave.samples.length; i += windowSize) {\n  const thisWindow = wave.samples.subarray(i, i + windowSize);\n  vad.acceptWaveform(thisWindow);\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n\n    let start_time = segment.start / wave.sampleRate;\n    let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n    start_time = start_time.toFixed(2);\n    end_time = end_time.toFixed(2);\n\n    const stream = recognizer.createStream();\n    stream.acceptWaveform(\n        {samples: segment.samples, sampleRate: wave.sampleRate});\n\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${start_time} -- ${end_time}: ${text}`);\n    }\n  }\n}\n\nvad.flush();\n\nwhile (!vad.isEmpty()) {\n  const segment = vad.front();\n  vad.pop();\n\n  let start_time = segment.start / wave.sampleRate;\n  let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n  start_time = start_time.toFixed(2);\n  end_time = end_time.toFixed(2);\n\n  const stream = recognizer.createStream();\n  stream.acceptWaveform(\n      {samples: segment.samples, sampleRate: wave.sampleRate});\n\n  recognizer.decode(stream);\n  const r = recognizer.getResult(stream);\n  if (r.text.length > 0) {\n    const text = r.text.toLowerCase().trim();\n    console.log(`${start_time} -- ${end_time}: ${text}`);\n  }\n}\n\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = 
${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n"
  },
  {
    "path": "nodejs-examples/.gitignore",
    "content": "node_modules\nlib\npackage-lock.json\n*.tar.bz2\n"
  },
  {
    "path": "nodejs-examples/README.md",
    "content": "# Introduction\n\nNote: You need `Node >= 18`. \n\nNote: For Mac M1 and other silicon chip series, do check the example `test-online-paraformer-microphone-mic.js` \n\nThis directory contains nodejs examples for [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\nIt uses WebAssembly to wrap `sherpa-onnx` for NodeJS and it does not support multiple threads.\n\nNote: [../nodejs-addon-examples](../nodejs-addon-examples) uses\n[node-addon-api](https://github.com/nodejs/node-addon-api) to wrap\n`sherpa-onnx` for NodeJS and it supports multiple threads.\n\nBefore you continue, please first run\n\n```bash\ncd ./nodejs-examples\n\nnpm i\n```\n\nIn the following, we describe how to use [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx)\nfor text-to-speech and speech-to-text.\n\n\n# Speech enhancement\n\nIn the following, we demonstrate how to run speech enhancement.\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nnode ./test-offline-speech-enhancement-gtcrn.js\nnode ./test-online-speech-enhancement-gtcrn.js\n```\n\nThe GTCRN example files use `gtcrn_simple.onnx`.\n\nDPDFNet has a separate example file. 
Download DPDFNet models from\n`https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models`\nor the official Hugging Face hub `https://huggingface.co/Ceva-IP/DPDFNet`.\n\nUse 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`,\n`dpdfnet4.onnx`, or `dpdfnet8.onnx` if you want enhanced audio for downstream\nspeech recognition (ASR), and use `dpdfnet2_48khz_hr.onnx` if you want 48 kHz\nenhancement output.\n\n```bash\nnode ./test-offline-speech-enhancement-dpdfnet.js\nnode ./test-online-speech-enhancement-dpdfnet.js\n```\n\nThe following four example files are available:\n\n```bash\nnode ./test-offline-speech-enhancement-gtcrn.js\nnode ./test-offline-speech-enhancement-dpdfnet.js\nnode ./test-online-speech-enhancement-gtcrn.js\nnode ./test-online-speech-enhancement-dpdfnet.js\n```\n\n# Speaker diarization\n\nIn the following, we demonstrate how to run speaker diarization.\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nnode ./test-offline-speaker-diarization.js\n```\n\n# Text-to-speech\n\nIn the following, we demonstrate how to run text-to-speech.\n\n## ./test-offline-tts-zipvoice-zh-en.js\n\n[./test-offline-tts-zipvoice-zh-en.js](./test-offline-tts-zipvoice-zh-en.js)\nshows how to use ZipVoice for zero-shot TTS in Chinese and English.\n\nPlease make sure that the reference transcript matches the reference audio.\n\nYou can use the following command to run it:\n\n```bash\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\nnode ./test-offline-tts-zipvoice-zh-en.js\n```\n\n## ./test-offline-tts-pocket-en.js\n\n[./test-offline-tts-pocket-en.js](./test-offline-tts-pocket-en.js)\nshows how to use PocketTTS for zero-shot TTS.\n\nYou can use the following command to run it:\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nnode ./test-offline-tts-pocket-en.js\n```\n\n## ./test-offline-tts-kitten-en.js\n\n[./test-offline-tts-kitten-en.js](./test-offline-tts-kitten-en.js) shows how to use\n[kitten-nano-en-v0_1-fp16](https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\nnode ./test-offline-tts-kitten-en.js\n```\n\n## ./test-offline-tts-kokoro-en.js\n\n[./test-offline-tts-kokoro-en.js](./test-offline-tts-kokoro-en.js) shows how to use\n[kokoro-en-v0_19](https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\nnode ./test-offline-tts-kokoro-en.js\n```\n\n## 
./test-offline-tts-matcha-zh.js\n\n[./test-offline-tts-matcha-zh.js](./test-offline-tts-matcha-zh.js) shows how to use\n[matcha-icefall-zh-baker](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test-offline-tts-matcha-zh.js\n```\n\n## ./test-offline-tts-matcha-en.js\n\n[./test-offline-tts-matcha-en.js](./test-offline-tts-matcha-en.js) shows how to use\n[matcha-icefall-en_US-ljspeech](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\nnode ./test-offline-tts-matcha-en.js\n```\n\n## ./test-offline-tts-vits-en.js\n\n[./test-offline-tts-vits-en.js](./test-offline-tts-vits-en.js) shows how to use\n[vits-piper-en_US-amy-low.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xvf vits-piper-en_US-amy-low.tar.bz2\nnode ./test-offline-tts-vits-en.js\n```\n\n## 
./test-offline-tts-vits-zh.js\n\n[./test-offline-tts-vits-zh.js](./test-offline-tts-vits-zh.js) shows how to use\na VITS pretrained model\n[aishell3](https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/vits.html#vits-model-aishell3)\nfor text-to-speech.\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\ntar xvf vits-icefall-zh-aishell3.tar.bz2\nnode ./test-offline-tts-vits-zh.js\n```\n\n# Speech-to-text\n\nIn the following, we demonstrate how to decode files and how to perform\nspeech recognition with a microphone with `nodejs`.\n\n## ./test-offline-dolphin-ctc.js\n\n[./test-offline-dolphin-ctc.js](./test-offline-dolphin-ctc.js) demonstrates\nhow to decode a file with a [Dolphin](https://github.com/DataoceanAI/Dolphin) CTC model.\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\ntar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nrm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nnode ./test-offline-dolphin-ctc.js\n```\n\n## ./test-offline-nemo-canary.js\n\n[./test-offline-nemo-canary.js](./test-offline-nemo-canary.js) demonstrates\nhow to decode a file with a NeMo Canary model. 
In the code we use\n[sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8](https://k2-fsa.github.io/sherpa/onnx/nemo/canary.html#sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8-english-spanish-german-french).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nrm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n\nnode ./test-offline-nemo-canary.js\n```\n\n## ./test-offline-zipformer-ctc.js\n\n[./test-offline-zipformer-ctc.js](./test-offline-zipformer-ctc.js) demonstrates\nhow to decode a file with a Zipformer CTC model. In the code we use\n[sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\ntar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nrm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\nnode ./test-offline-zipformer-ctc.js\n```\n\n## ./test-offline-funasr-nano.js\n\n[./test-offline-funasr-nano.js](./test-offline-funasr-nano.js) demonstrates\nhow to decode a file with a FunASR Nano model. 
In the code we use\n[sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\ntar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nrm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n\nnode ./test-offline-funasr-nano.js\n```\n\n## ./test-offline-medasr-ctc.js\n\n[./test-offline-medasr-ctc.js](./test-offline-medasr-ctc.js) demonstrates\nhow to decode a file with a Google MedASR CTC model. In the code we use\n[sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\nnode ./test-offline-medasr-ctc.js\n```\n\n## ./test-offline-fire-red-asr-ctc.js\n\n[./test-offline-fire-red-asr-ctc.js](./test-offline-fire-red-asr-ctc.js) demonstrates\nhow to decode a file with a FireRedASR CTC model. 
In the code we use\n[sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\nnode ./test-offline-fire-red-asr-ctc.js\n```\n\n## ./test-offline-omnilingual-asr-ctc.js\n\n[./test-offline-omnilingual-asr-ctc.js](./test-offline-omnilingual-asr-ctc.js) demonstrates\nhow to decode a file with an Omnilingual ASR CTC model. In the code we use\n[sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\nnode ./test-offline-omnilingual-asr-ctc.js\n```\n\n## ./test-offline-wenet-ctc.js\n\n[./test-offline-wenet-ctc.js](./test-offline-wenet-ctc.js) demonstrates\nhow to decode a file with a WeNet CTC model. 
In the code we use\n[sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\ntar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nrm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n\nnode ./test-offline-wenet-ctc.js\n```\n\n## ./test-offline-nemo-ctc.js\n\n[./test-offline-nemo-ctc.js](./test-offline-nemo-ctc.js) demonstrates\nhow to decode a file with a NeMo CTC model. In the code we use\n[stt_en_conformer_ctc_small](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/english.html#stt-en-conformer-ctc-small).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2\ntar xvf sherpa-onnx-nemo-ctc-en-conformer-small.tar.bz2\nnode ./test-offline-nemo-ctc.js\n```\n\n## ./test-offline-paraformer.js\n\n[./test-offline-paraformer.js](./test-offline-paraformer.js) demonstrates\nhow to decode a file with a non-streaming Paraformer model. 
In the code we use\n[sherpa-onnx-paraformer-zh-2023-09-14](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nnode ./test-offline-paraformer.js\n```\n\n## ./test-offline-sense-voice-with-hr.js\n\n[./test-offline-sense-voice-with-hr.js](./test-offline-sense-voice-with-hr.js) demonstrates\nhow to decode a file with a non-streaming SenseVoice model with homophone replacer.\n\nYou can use the following command to run it:\n\n```bash\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\nnode ./test-offline-sense-voice-with-hr.js\n```\n\n## ./test-offline-sense-voice.js\n\n[./test-offline-sense-voice.js](./test-offline-sense-voice.js) demonstrates\nhow to decode a file with a non-streaming SenseVoice model.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\nrm 
sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n\nnode ./test-offline-sense-voice.js\n```\n\n## ./test-offline-transducer.js\n\n[./test-offline-transducer.js](./test-offline-transducer.js) demonstrates\nhow to decode a file with a non-streaming transducer model. In the code we use\n[sherpa-onnx-zipformer-en-2023-06-26](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-zipformer-en-2023-06-26-english).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\ntar xvf sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\nnode ./test-offline-transducer.js\n```\n\n## ./test-vad-with-non-streaming-asr-whisper.js\n\n[./test-vad-with-non-streaming-asr-whisper.js](./test-vad-with-non-streaming-asr-whisper.js)\nshows how to use VAD + whisper to decode a very long file.\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test-vad-with-non-streaming-asr-whisper.js\n```\n\n## ./test-offline-whisper.js\n\n[./test-offline-whisper.js](./test-offline-whisper.js) demonstrates\nhow to decode a file with a Whisper model. 
In the code we use\n[sherpa-onnx-whisper-tiny.en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nnode ./test-offline-whisper.js\n```\n\n## ./test-offline-fire-red-asr.js\n\n[./test-offline-fire-red-asr.js](./test-offline-fire-red-asr.js) demonstrates\nhow to decode a file with a FireRedAsr AED model.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\nrm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\nnode ./test-offline-fire-red-asr.js\n```\n\n## ./test-offline-moonshine-v2.js\n\n[./test-offline-moonshine-v2.js](./test-offline-moonshine-v2.js) demonstrates\nhow to decode a file with a Moonshine v2 model. In the code we use\n[sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\nnode ./test-offline-moonshine-v2.js\n```\n\n## ./test-offline-moonshine.js\n\n[./test-offline-moonshine.js](./test-offline-moonshine.js) demonstrates\nhow to decode a file with a Moonshine model. 
In the code we use\n[sherpa-onnx-moonshine-tiny-en-int8](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nnode ./test-offline-moonshine.js\n```\n\n## ./test-vad-with-non-streaming-asr-moonshine.js\n\n[./test-vad-with-non-streaming-asr-moonshine.js](./test-vad-with-non-streaming-asr-moonshine.js)\nshows how to use VAD + Moonshine to decode a very long file.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nnode ./test-vad-with-non-streaming-asr-moonshine.js\n```\n\n## ./test-online-paraformer-microphone.js\n\n[./test-online-paraformer-microphone.js](./test-online-paraformer-microphone.js)\ndemonstrates how to do real-time speech recognition from microphone\nwith a streaming Paraformer model. 
In the code we use\n[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nnode ./test-online-paraformer-microphone.js\n```\n\n\n## ./test-online-paraformer-microphone-mic.js\n\n[./test-online-paraformer-microphone-mic.js](./test-online-paraformer-microphone-mic.js)\ndemonstrates how to do real-time speech recognition from microphone\nwith a streaming Paraformer model. In the code we use\n[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).\n\nIt uses `mic` for better compatibility; check its [npm](https://www.npmjs.com/package/mic) page before running it.\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nnode ./test-online-paraformer-microphone-mic.js\n```\n\n## ./test-online-t-one-ctc.js\n[./test-online-t-one-ctc.js](./test-online-t-one-ctc.js) demonstrates\nhow to decode a file using a streaming T-one model.\n\nYou can use the following command to run it:\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nnode ./test-online-t-one-ctc.js\n```\n\n## 
./test-online-paraformer.js\n[./test-online-paraformer.js](./test-online-paraformer.js) demonstrates\nhow to decode a file using a streaming Paraformer model. In the code we use\n[sherpa-onnx-streaming-paraformer-bilingual-zh-en](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-streaming-paraformer-bilingual-zh-en-chinese-english).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nnode ./test-online-paraformer.js\n```\n\n## ./test-online-transducer-microphone.js\n[./test-online-transducer-microphone.js](./test-online-transducer-microphone.js)\ndemonstrates how to do real-time speech recognition with microphone using a streaming transducer model. In the code\nwe use [sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english).\n\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nnode ./test-online-transducer-microphone.js\n```\n\n## ./test-online-transducer.js\n[./test-online-transducer.js](./test-online-transducer.js) demonstrates\nhow to decode a file using a streaming transducer model. 
In the code\nwe use [sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nnode ./test-online-transducer.js\n```\n\n## ./test-online-zipformer2-ctc.js\n[./test-online-zipformer2-ctc.js](./test-online-zipformer2-ctc.js) demonstrates\nhow to decode a file using a streaming zipformer2 CTC model. In the code\nwe use [sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/zipformer-ctc-models.html#sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13-chinese).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nnode ./test-online-zipformer2-ctc.js\n```\n\n## ./test-online-zipformer2-ctc-hlg.js\n[./test-online-zipformer2-ctc-hlg.js](./test-online-zipformer2-ctc-hlg.js) demonstrates\nhow to decode a file using a streaming zipformer2 CTC model with HLG. 
In the code\nwe use [sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2).\n\nYou can use the following command to run it:\n\n```bash\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nnode ./test-online-zipformer2-ctc-hlg.js\n```\n"
  },
  {
    "path": "nodejs-examples/package.json",
    "content": "{\n  \"dependencies\": {\n    \"mic\": \"^2.1.2\",\n    \"naudiodon2\": \"^2.4.0\",\n    \"sherpa-onnx\": \"^1.12.31\",\n    \"wav\": \"^1.0.2\"\n  }\n}\n"
  },
  {
    "path": "nodejs-examples/test-keyword-spotter-transducer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createKeywordSpotter() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models\n  const config = {\n    'modelConfig': {\n      'transducer': {\n        'encoder':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n        'decoder':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n        'joiner':\n            './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx',\n      },\n      'tokens':\n          './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt',\n    },\n    keywords: 'w én s ēn t è k ǎ s uǒ  @文森特卡索\\n' +\n        'f ǎ g uó @法国'\n  };\n\n  return sherpa_onnx.createKws(config);\n}\n\nconst kws = createKeywordSpotter();\nconst stream = kws.createStream();\nconst waveFilename =\n    './sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav';\n\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform(kws.config.featConfig.sampleRate, tailPadding);\n\nconst detectedKeywords = [];\nwhile (kws.isReady(stream)) {\n  kws.decode(stream);\n  const keyword = kws.getResult(stream).keyword;\n  if (keyword != '') {\n    detectedKeywords.push(keyword);\n\n    // remember to reset the stream right after detecting a keyword\n    kws.reset(stream);\n  }\n}\nconsole.log(detectedKeywords);\n\nstream.free();\nkws.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-dolphin-ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      dolphin: {\n        model:\n            './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx',\n      },\n      tokens:\n          './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-fire-red-asr-ctc.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      fireRedAsrCtc: {\n        model:\n            './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx',\n      },\n      tokens:\n          './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-fire-red-asr.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    fireRedAsr: {\n      encoder:\n          './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx',\n      decoder:\n          './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx',\n    },\n    tokens: './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt',\n    debug: 1,\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-funasr-nano.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      funasrNano: {\n        encoderAdaptor:\n            './sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx',\n        llm: './sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx',\n        embedding:\n            './sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx',\n        tokenizer: './sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B',\n      },\n      tokens: '',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-medasr-ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      medasr: {\n        model: './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx',\n      },\n      tokens: './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-moonshine-v2.js",
    "content": "// Copyright (c)  2023-2026  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    moonshine: {\n      encoder:\n          './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort',\n      mergedDecoder:\n          './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort',\n    },\n    tokens: './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-moonshine.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    moonshine: {\n      preprocessor: './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx',\n      encoder: './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx',\n      uncachedDecoder:\n          './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx',\n      cachedDecoder:\n          './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx',\n    },\n    tokens: './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-nemo-canary.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      canary: {\n        encoder:\n            './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx',\n        decoder:\n            './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx',\n        srcLang: 'en',\n        tgtLang: 'en',\n        usePnc: 1,\n      },\n      debug: 0,\n      tokens:\n          './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nlet stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nlet text = recognizer.getResult(stream).text;\nconsole.log(`text in English: ${text}`);\n\nstream.free();\n\n// now output German text\nrecognizer.config.modelConfig.canary.tgtLang = 'de';\nrecognizer.setConfig(recognizer.config);\n\nstream = recognizer.createStream();\nstream.acceptWaveform(wave.sampleRate, wave.samples);\nrecognizer.decode(stream);\ntext = recognizer.getResult(stream).text;\n\nconsole.log(`text in German: ${text}`);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-nemo-ctc.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      nemoCtc: {\n        model: './sherpa-onnx-nemo-ctc-en-conformer-small/model.int8.onnx',\n      },\n      tokens: './sherpa-onnx-nemo-ctc-en-conformer-small/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-nemo-ctc-en-conformer-small/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-omnilingual-asr-ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      omnilingual: {\n        model:\n            './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx',\n      },\n      tokens:\n          './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-paraformer-itn.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    paraformer: {\n      model: './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx',\n    },\n    tokens: './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    ruleFsts: './itn_zh_number.fst',\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nconst waveFilename = './itn-zh-number.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-paraformer.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    paraformer: {\n      model: './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx',\n    },\n    tokens: './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-sense-voice-with-hr.js",
    "content": "// Copyright (c)  2024-2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    senseVoice: {\n      model:\n          './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/model.int8.onnx',\n      language: '',\n      useInverseTextNormalization: 1,\n    },\n    tokens:\n        './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n    hr: {\n      lexicon: './lexicon.txt',\n      ruleFsts: './replace.fst',\n    },\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './test-hr.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-sense-voice.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    senseVoice: {\n      model:\n          './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/model.int8.onnx',\n      language: '',\n      useInverseTextNormalization: 1,\n    },\n    tokens:\n        './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/test_wavs/zh.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-speaker-diarization.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('sherpa-onnx');\n\n// clang-format off\n/* Please use the following commands to download files\n   used in this script\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\n */\n// clang-format on\n\nconst config = {\n  segmentation: {\n    pyannote: {\n      model: './sherpa-onnx-pyannote-segmentation-3-0/model.onnx',\n      debug: 1,\n    },\n  },\n  embedding: {\n    model: './3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx',\n    debug: 1,\n  },\n  clustering: {\n    // since we know that the test wave file\n    // ./0-four-speakers-zh.wav contains 4 speakers, we use 4 for numClusters\n    // here. 
if you don't have such information, please set numClusters to -1\n    numClusters: 4,\n\n    // If numClusters is not -1, then threshold is ignored.\n    //\n    // A larger threshold leads to fewer clusters, i.e., fewer speakers\n    // A smaller threshold leads to more clusters, i.e., more speakers\n    // You need to tune it by yourself.\n    threshold: 0.5,\n  },\n\n  // If a segment is shorter than minDurationOn, we discard it\n  minDurationOn: 0.2,  // in seconds\n\n  // If the gap between two segments is less than minDurationOff, then we\n  // merge these two segments into a single one\n  minDurationOff: 0.5,  // in seconds\n};\n\nconst waveFilename = './0-four-speakers-zh.wav';\n\nconst sd = sherpa_onnx.createOfflineSpeakerDiarization(config);\nconsole.log('Started');\n\nconst wave = sherpa_onnx.readWave(waveFilename);\nif (sd.sampleRate != wave.sampleRate) {\n  throw new Error(\n      `Expected sample rate: ${sd.sampleRate}, given: ${wave.sampleRate}`);\n}\n\nconst segments = sd.process(wave.samples);\nconsole.log(segments);\n"
  },
  {
    "path": "nodejs-examples/test-offline-speech-enhancement-dpdfnet.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\n// Please download a DPDFNet model and ./inp_16k.wav used in this file\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// or https://huggingface.co/Ceva-IP/DPDFNet\n//\n// This script shows how to use speech enhancement API from sherpa-onnx.\n// Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n// for 16 kHz downstream ASR or speech recognition.\n// Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineSpeechDenoiser() {\n  const model = './dpdfnet2.onnx';\n  let config = {\n    model: {\n      dpdfnet: {model},\n      debug: 1,\n    },\n  };\n\n  return sherpa_onnx.createOfflineSpeechDenoiser(config);\n}\n\nconst speech_denoiser = createOfflineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nconst denoised = speech_denoiser.run(wave.samples, wave.sampleRate);\nconst outputFilename = './enhanced.wav';\nsherpa_onnx.writeWave(outputFilename, denoised);\nconsole.log(`Saved to ${outputFilename}`);\n\nspeech_denoiser.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-speech-enhancement-gtcrn.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\n// Please download a speech enhancement model and ./inp_16k.wav used in this file\n// from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// This script shows how to use speech enhancement API from sherpa-onnx.\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineSpeechDenoiser() {\n  const model = './gtcrn_simple.onnx';\n  let config = {\n    model: {\n      gtcrn: {model},\n      debug: 1,\n    },\n  };\n\n  return sherpa_onnx.createOfflineSpeechDenoiser(config);\n}\n\nconst speech_denoiser = createOfflineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nconst denoised = speech_denoiser.run(wave.samples, wave.sampleRate);\nconst outputFilename = './enhanced.wav';\nsherpa_onnx.writeWave(outputFilename, denoised);\nconsole.log(`Saved to ${outputFilename}`);\n\nspeech_denoiser.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-transducer.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    transducer: {\n      encoder:\n          './sherpa-onnx-zipformer-en-2023-06-26/encoder-epoch-99-avg-1.int8.onnx',\n      decoder:\n          './sherpa-onnx-zipformer-en-2023-06-26/decoder-epoch-99-avg-1.onnx',\n      joiner:\n          './sherpa-onnx-zipformer-en-2023-06-26/joiner-epoch-99-avg-1.int8.onnx',\n    },\n    tokens: './sherpa-onnx-zipformer-en-2023-06-26/tokens.txt',\n    modelType: 'transducer',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './sherpa-onnx-zipformer-en-2023-06-26/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-kitten-en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let offlineTtsKittenModelConfig = {\n    model: './kitten-nano-en-v0_1-fp16/model.fp16.onnx',\n    voices: './kitten-nano-en-v0_1-fp16/voices.bin',\n    tokens: './kitten-nano-en-v0_1-fp16/tokens.txt',\n    dataDir: './kitten-nano-en-v0_1-fp16/espeak-ng-data',\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsKittenModelConfig: offlineTtsKittenModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst speakerId = 0;\nconst speed = 1.0;\nconst generationConfig = {\n  sid: speakerId,\n  speed: speed,\n  silenceScale: 0.2,\n};\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-kitten-en.wav', audio);\nconsole.log('Saved to test-kitten-en.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-kokoro-en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let offlineTtsKokoroModelConfig = {\n    model: './kokoro-en-v0_19/model.onnx',\n    voices: './kokoro-en-v0_19/voices.bin',\n    tokens: './kokoro-en-v0_19/tokens.txt',\n    dataDir: './kokoro-en-v0_19/espeak-ng-data',\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsKokoroModelConfig: offlineTtsKokoroModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst speakerId = 0;\nconst speed = 1.0;\nconst generationConfig = {\n  sid: speakerId,\n  speed: speed,\n  silenceScale: 0.2,\n};\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-kokoro-en.wav', audio);\nconsole.log('Saved to test-kokoro-en.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-kokoro-zh-en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let offlineTtsKokoroModelConfig = {\n    model: './kokoro-multi-lang-v1_0/model.onnx',\n    voices: './kokoro-multi-lang-v1_0/voices.bin',\n    tokens: './kokoro-multi-lang-v1_0/tokens.txt',\n    dataDir: './kokoro-multi-lang-v1_0/espeak-ng-data',\n    lexicon:\n        './kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt',\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsKokoroModelConfig: offlineTtsKokoroModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst speakerId = 49;\nconst speed = 1.0;\nconst generationConfig = {\n  sid: speakerId,\n  speed: speed,\n  silenceScale: 0.2,\n};\nconst text =\n    '中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？';\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-kokoro-zh-en-49.wav', audio);\nconsole.log('Saved to test-kokoro-zh-en-49.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-matcha-en.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nconst silenceScale = 0.2;\n\nfunction createOfflineTts() {\n  let offlineTtsMatchaModelConfig = {\n    acousticModel: './matcha-icefall-en_US-ljspeech/model-steps-3.onnx',\n    vocoder: './vocos-22khz-univ.onnx',\n    tokens: './matcha-icefall-en_US-ljspeech/tokens.txt',\n    dataDir: './matcha-icefall-en_US-ljspeech/espeak-ng-data',\n\n    noiseScale: 0.667,\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsMatchaModelConfig: offlineTtsMatchaModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n    silenceScale: silenceScale,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst generationConfig = {\n  sid: 0,\n  speed: 1.0,\n  silenceScale: silenceScale,\n};\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-matcha-en.wav', audio);\nconsole.log('Saved to test-matcha-en.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-matcha-zh.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nconst silenceScale = 0.2;\n\nfunction createOfflineTts() {\n  let offlineTtsMatchaModelConfig = {\n    acousticModel: './matcha-icefall-zh-baker/model-steps-3.onnx',\n    vocoder: './vocos-22khz-univ.onnx',\n    lexicon: './matcha-icefall-zh-baker/lexicon.txt',\n    tokens: './matcha-icefall-zh-baker/tokens.txt',\n    noiseScale: 0.667,\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsMatchaModelConfig: offlineTtsMatchaModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n    silenceScale: silenceScale,\n    ruleFsts:\n        './matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst',\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst text =\n    '当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔. 某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。';\n\nconst generationConfig = {\n  sid: 0,\n  speed: 1.0,\n  silenceScale: silenceScale,\n};\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-matcha-zh.wav', audio);\nconsole.log('Saved to test-matcha-zh.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-pocket-en.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let pocket = {\n    lmFlow: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx',\n    lmMain: './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx',\n    encoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx',\n    decoder: './sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx',\n    textConditioner:\n        './sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx',\n    vocabJson: './sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json',\n    tokenScoresJson:\n        './sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json',\n    voiceEmbeddingCacheCapacity: 50,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsPocketModelConfig: pocket,\n    numThreads: 1,\n    debug: 1,  // set it to 1 to see verbose logs; 0 to disable logs\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst referenceWaveFilename =\n    './sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav';\nconst wave = sherpa_onnx.readWave(referenceWaveFilename);\n\nconst generationConfig = {\n  silenceScale: 0.2,\n  referenceAudio: wave.samples,\n  referenceSampleRate: wave.sampleRate,\n  numSteps: 5,\n  extra: {max_reference_audio_len: 12, seed: 42}\n};\n\nconst tts = createOfflineTts();\nconst text =\n    'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-pocket-en.wav', audio);\nconsole.log('Saved to test-pocket-en.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-vits-en.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let offlineTtsVitsModelConfig = {\n    model: './vits-piper-en_US-amy-low/en_US-amy-low.onnx',\n    tokens: './vits-piper-en_US-amy-low/tokens.txt',\n    dataDir: './vits-piper-en_US-amy-low/espeak-ng-data',\n    noiseScale: 0.667,\n    noiseScaleW: 0.8,\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsVitsModelConfig: offlineTtsVitsModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\n\nconst tts = createOfflineTts();\nconst speakerId = 0;\nconst speed = 1.0;\nconst generationConfig = {\n  sid: speakerId,\n  speed: speed,\n  silenceScale: 0.2,\n};\nconst audio = tts.generateWithConfig(\n    '“Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.”',\n    generationConfig);\n\ntts.save('./test-vits-en.wav', audio);\nconsole.log('Saved to test-vits-en.wav successfully.');\n\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-vits-zh.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  let offlineTtsVitsModelConfig = {\n    model: './vits-icefall-zh-aishell3/model.onnx',\n    lexicon: './vits-icefall-zh-aishell3/lexicon.txt',\n    tokens: './vits-icefall-zh-aishell3/tokens.txt',\n    noiseScale: 0.667,\n    noiseScaleW: 0.8,\n    lengthScale: 1.0,\n  };\n  let offlineTtsModelConfig = {\n    offlineTtsVitsModelConfig: offlineTtsVitsModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    ruleFsts:\n        './vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst,./vits-icefall-zh-aishell3/new_heteronym.fst',\n    ruleFars: './vits-icefall-zh-aishell3/rule.far',\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst tts = createOfflineTts();\nconst speakerId = 66;\nconst speed = 1.0;\nconst generationConfig = {\n  sid: speakerId,\n  speed: speed,\n  silenceScale: 0.2,\n};\nconst audio = tts.generateWithConfig('3年前中国总人口是1411778724人', generationConfig);\ntts.save('./test-vits-zh.wav', audio);\nconsole.log('Saved to test-vits-zh.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-tts-zipvoice-zh-en.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineTts() {\n  const zipvoice = {\n    encoder: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx',\n    decoder: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx',\n    tokens: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt',\n    lexicon: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt',\n    dataDir: './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data',\n    vocoder: './vocos_24khz.onnx',\n  };\n\n  const offlineTtsModelConfig = {\n    offlineTtsZipVoiceModelConfig: zipvoice,\n    numThreads: 1,\n    debug: 1,  // set it to 1 to see verbose logs; 0 to disable logs\n    provider: 'cpu',\n  };\n\n  const offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    maxNumSentences: 1,\n  };\n\n  return sherpa_onnx.createOfflineTts(offlineTtsConfig);\n}\n\nconst referenceWaveFilename =\n    './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav';\nconst wave = sherpa_onnx.readWave(referenceWaveFilename);\n\nconst referenceText =\n    '那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.';\nconst text =\n    '小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.';\n\nconst generationConfig = {\n  referenceAudio: wave.samples,\n  referenceSampleRate: wave.sampleRate,\n  // It must match the transcript of the reference audio above.\n  referenceText: referenceText,\n  numSteps: 4,\n  extra: {min_char_in_sentence: 10},\n};\n\nconst tts = createOfflineTts();\nconst audio = tts.generateWithConfig(text, generationConfig);\ntts.save('./test-zipvoice-zh-en.wav', audio);\nconsole.log('Saved to test-zipvoice-zh-en.wav successfully.');\ntts.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-wenet-ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      wenetCtc: {\n        model:\n            './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx',\n      },\n      tokens:\n          './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-whisper.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst sherpa_onnx = require('sherpa-onnx');\nconsole.log(`version : ${sherpa_onnx.version}`);\nconsole.log(`git sha1: ${sherpa_onnx.gitSha1}`);\nconsole.log(`git date: ${sherpa_onnx.gitDate}`);\n\nfunction createOfflineRecognizer() {\n  let modelConfig = {\n    whisper: {\n      encoder: './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx',\n      decoder: './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx',\n      language: '',\n      task: 'transcribe',\n      tailPaddings: -1,\n    },\n    tokens: './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt',\n  };\n\n  let config = {\n    modelConfig: modelConfig,\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './sherpa-onnx-whisper-tiny.en/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-offline-zipformer-ctc.js",
    "content": "// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      zipformerCtc: {\n        model: './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx',\n      },\n      tokens: './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt',\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nconst recognizer = createOfflineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\n\nrecognizer.decode(stream);\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-online-paraformer-microphone-mic.js",
    "content": "// Copyright (c) 2023 Xiaomi Corporation (authors: Fangjun Kuang)\nconst mic = require(\n    'mic');  // It uses `mic` for better compatibility, do check its\n             // [npm](https://www.npmjs.com/package/mic) before running it.\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineParaformerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    paraformer: onlineParaformerModelConfig,\n    tokens: './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt',\n  };\n\n  let recognizerConfig = {\n    modelConfig: onlineModelConfig,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\n/**\n * SpeechSession class, work as a session manager with the formatOutput function\n * Sample output:\n=== Automated Speech Recognition ===\nCurrent Session #1\nTime: 8:44:46 PM\n------------------------\nRecognized Sentences:\n[8:44:43 PM] 1. it's so great three result is great great 她还支持中文\n[8:44:46 PM] 2. 
很厉\n------------------------\nRecognizing: 真的很厉害太厉害\n\n*/\nclass SpeechSession {\n  constructor() {\n    this.startTime = Date.now();\n    this.sentences = [];\n    this.currentText = '';\n    this.lastUpdateTime = Date.now();\n  }\n\n  addOrUpdateText(text) {\n    this.currentText = text;\n    this.lastUpdateTime = Date.now();\n  }\n\n  finalizeSentence() {\n    if (this.currentText.trim()) {\n      this.sentences.push({\n        text: this.currentText.trim(),\n        timestamp: new Date().toLocaleTimeString()\n      });\n    }\n    this.currentText = '';\n  }\n\n  shouldStartNewSession() {\n    return Date.now() - this.lastUpdateTime > 10000;  // 10 seconds of silence\n  }\n}\n\nfunction formatOutput() {\n  clearConsole();\n  console.log('\\n=== Automated Speech Recognition ===');\n  console.log(`Current Session #${sessionCount}`);\n  console.log('Time:', new Date().toLocaleTimeString());\n  console.log('------------------------');\n\n  // Show previously recognized sentences\n  if (currentSession.sentences.length > 0) {\n    console.log('Recognized Sentences:');\n    currentSession.sentences.forEach((sentence, index) => {\n      console.log(`[${sentence.timestamp}] ${index + 1}. 
${sentence.text}`);\n    });\n    console.log('------------------------');\n  }\n\n  // Show the text currently being recognized\n  if (currentSession.currentText) {\n    console.log('Recognizing:', currentSession.currentText);\n  }\n}\n\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\nlet currentSession = new SpeechSession();\nlet sessionCount = 1;\n\nfunction clearConsole() {\n  process.stdout.write('\\x1B[2J\\x1B[0f');\n}\n\n\nfunction exitHandler(options, exitCode) {\n  if (options.cleanup) {\n    console.log('\\nCleaned up resources...');\n    micInstance.stop();\n    stream.free();\n    recognizer.free();\n  }\n  if (exitCode || exitCode === 0) console.log('Exit code:', exitCode);\n  if (options.exit) process.exit();\n}\n\nconst micInstance = mic({\n  rate: recognizer.config.featConfig.sampleRate,\n  channels: 1,\n  debug: false,  // disable debug output\n  device: 'default',\n  bitwidth: 16,\n  encoding: 'signed-integer',\n  exitOnSilence: 6,\n  fileType: 'raw'\n});\n\nconst micInputStream = micInstance.getAudioStream();\n\nfunction startMic() {\n  return new Promise((resolve, reject) => {\n    micInputStream.once('startComplete', () => {\n      console.log('Microphone started.');\n      resolve();\n    });\n\n    micInputStream.once('error', (err) => {\n      console.error('Microphone start error:', err);\n      reject(err);\n    });\n\n    micInstance.start();\n  });\n}\n\nmicInputStream.on('data', buffer => {\n  // A Node.js Buffer may be a view into a larger ArrayBuffer, so honor its\n  // byteOffset and length instead of wrapping the whole backing store.\n  const int16Array = new Int16Array(\n      buffer.buffer, buffer.byteOffset,\n      buffer.length / Int16Array.BYTES_PER_ELEMENT);\n  const samples = new Float32Array(int16Array.length);\n\n  for (let i = 0; i < int16Array.length; i++) {\n    samples[i] = int16Array[i] / 32768.0;\n  }\n\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text;\n\n  if (text.length > 0) {\n    // Check whether a new session should be started\n    if 
(currentSession.shouldStartNewSession()) {\n      currentSession.finalizeSentence();\n      sessionCount++;\n      currentSession = new SpeechSession();\n    }\n\n    currentSession.addOrUpdateText(text);\n    formatOutput();\n  }\n\n  if (isEndpoint) {\n    if (text.length > 0) {\n      currentSession.finalizeSentence();\n      formatOutput();\n    }\n    recognizer.reset(stream);\n  }\n});\n\nmicInputStream.on('error', err => {\n  console.error('Audio stream error:', err);\n});\n\nmicInputStream.on('close', () => {\n  console.log('Microphone closed.');\n});\n\nprocess.on('exit', exitHandler.bind(null, {cleanup: true}));\nprocess.on('SIGINT', exitHandler.bind(null, {exit: true}));\nprocess.on('SIGUSR1', exitHandler.bind(null, {exit: true}));\nprocess.on('SIGUSR2', exitHandler.bind(null, {exit: true}));\nprocess.on('uncaughtException', exitHandler.bind(null, {exit: true}));\n\nasync function main() {\n  try {\n    console.log('Starting ...');\n    await startMic();\n    console.log('Initialized, waiting for speech ...');\n    formatOutput();\n  } catch (err) {\n    console.error('Failed to initialize:', err);\n    process.exit(1);\n  }\n}\n\nmain();\n"
  },
  {
    "path": "nodejs-examples/test-online-paraformer-microphone.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\nconsole.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineParaformerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    paraformer: onlineParaformerModelConfig,\n    tokens: './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt',\n  };\n\n  let recognizerConfig = {\n    modelConfig: onlineModelConfig,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text;\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    console.log(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = 
text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.on('close', () => {\n  console.log('Free resources');\n  stream.free();\n  recognizer.free();\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-examples/test-online-paraformer.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineParaformerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    paraformer: onlineParaformerModelConfig,\n    tokens: './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt',\n  };\n\n  let recognizerConfig = {\n    modelConfig: onlineModelConfig,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav';\n\nconst reader = new wav.Reader();\nconst readable = new Readable().wrap(reader);\n\nfunction decode(samples) {\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n  const text = recognizer.getResult(stream).text;\n  console.log(text);\n}\n\nreader.on('format', ({audioFormat, bitDepth, channels, sampleRate}) => {\n  if (sampleRate != recognizer.config.featConfig.sampleRate) {\n    throw new Error(`Only support sampleRate ${\n        recognizer.config.featConfig.sampleRate}. Given ${sampleRate}`);\n  }\n\n  if (audioFormat != 1) {\n    throw new Error(`Only support PCM format. Given ${audioFormat}`);\n  }\n\n  if (channels != 1) {\n    throw new Error(`Only a single channel. Given ${channels}`);\n  }\n\n  if (bitDepth != 16) {\n    throw new Error(`Only support 16-bit samples. 
Given ${bitDepth}`);\n  }\n});\n\nfs.createReadStream(waveFilename, {'highWaterMark': 4096})\n    .pipe(reader)\n    .on('finish', function(err) {\n      // tail padding\n      const floatSamples =\n          new Float32Array(recognizer.config.featConfig.sampleRate * 0.5);\n      decode(floatSamples);\n      stream.free();\n      recognizer.free();\n    });\n\nreadable.on('readable', function() {\n  let chunk;\n  while ((chunk = readable.read()) != null) {\n    const int16Samples = new Int16Array(\n        chunk.buffer, chunk.byteOffset,\n        chunk.length / Int16Array.BYTES_PER_ELEMENT);\n\n    const floatSamples = new Float32Array(int16Samples.length);\n\n    for (let i = 0; i < floatSamples.length; i++) {\n      floatSamples[i] = int16Samples[i] / 32768.0;\n    }\n\n    decode(floatSamples);\n  }\n});\n"
  },
  {
    "path": "nodejs-examples/test-online-speech-enhancement-dpdfnet.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n//\n// Please download a DPDFNet model and ./inp_16k.wav used in this file from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n// or https://huggingface.co/Ceva-IP/DPDFNet\n//\n// This script shows how to use the streaming speech enhancement API from\n// sherpa-onnx.\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineSpeechDenoiser() {\n  const model = './dpdfnet_baseline.onnx';\n  const config = {\n    model: {\n      dpdfnet: {model},\n      debug: 1,\n    },\n  };\n\n  return sherpa_onnx.createOnlineSpeechDenoiser(config);\n}\n\nconst speech_denoiser = createOnlineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nconst frameShift = speech_denoiser.frameShiftInSamples;\nconst output = [];\n\nlet start = 0;\nwhile (start < wave.samples.length) {\n  const end = Math.min(start + frameShift, wave.samples.length);\n  const chunk = wave.samples.slice(start, end);\n  const denoised = speech_denoiser.run(chunk, wave.sampleRate);\n  output.push(...denoised.samples);\n  start = end;\n}\n\noutput.push(...speech_denoiser.flush().samples);\n\nconst outputFilename = './enhanced-online-dpdfnet.wav';\nsherpa_onnx.writeWave(outputFilename, {\n  samples: Float32Array.from(output),\n  sampleRate: speech_denoiser.sampleRate,\n});\nconsole.log(`Saved to ${outputFilename}`);\n\nspeech_denoiser.free();\n"
  },
  {
    "path": "nodejs-examples/test-online-speech-enhancement-gtcrn.js",
    "content": "// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n//\n// Please download a speech enhancement model and ./inp_16k.wav used in this\n// file from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n//\n// This script shows how to use the streaming speech enhancement API from\n// sherpa-onnx.\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineSpeechDenoiser() {\n  const model = './gtcrn_simple.onnx';\n  const config = {\n    model: {\n      gtcrn: {model},\n      debug: 1,\n    },\n  };\n\n  return sherpa_onnx.createOnlineSpeechDenoiser(config);\n}\n\nconst speech_denoiser = createOnlineSpeechDenoiser();\n\nconst waveFilename = './inp_16k.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\nconst frameShift = speech_denoiser.frameShiftInSamples;\nconst output = [];\n\nlet start = 0;\nwhile (start < wave.samples.length) {\n  const end = Math.min(start + frameShift, wave.samples.length);\n  const chunk = wave.samples.slice(start, end);\n  const denoised = speech_denoiser.run(chunk, wave.sampleRate);\n  output.push(...denoised.samples);\n  start = end;\n}\n\noutput.push(...speech_denoiser.flush().samples);\n\nconst outputFilename = './enhanced-online-gtcrn.wav';\nsherpa_onnx.writeWave(outputFilename, {\n  samples: Float32Array.from(output),\n  sampleRate: speech_denoiser.sampleRate,\n});\nconsole.log(`Saved to ${outputFilename}`);\n\nspeech_denoiser.free();\n"
  },
  {
    "path": "nodejs-examples/test-online-t-one-ctc.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let toneCtc = {\n    model: './sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx',\n  };\n\n  let onlineModelConfig = {\n    toneCtc: toneCtc,\n    tokens: './sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n  };\n\n\n  let recognizerConfig = {\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename = './sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nconst leftPadding = new Float32Array(wave.sampleRate * 0.3);\nconst tailPadding = new Float32Array(wave.sampleRate * 0.6);\n\nstream.acceptWaveform(wave.sampleRate, leftPadding);\nstream.acceptWaveform(wave.sampleRate, wave.samples);\nstream.acceptWaveform(wave.sampleRate, tailPadding);\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nconst text = recognizer.getResult(stream).text;\nconsole.log(text);\n\nstream.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-online-transducer-itn.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineTransducerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n    joiner:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    transducer: onlineTransducerModelConfig,\n    tokens:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n    modelType: 'zipformer',\n  };\n\n  let featureConfig = {\n    sampleRate: 16000,\n    featureDim: 80,\n  };\n\n  let recognizerConfig = {\n    featConfig: featureConfig,\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n    // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n    ruleFsts: './itn_zh_number.fst',\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nconst waveFilename = './itn-zh-number.wav';\n\nconst reader = new wav.Reader();\nconst readable = new Readable().wrap(reader);\n\nfunction decode(samples) {\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  
}\n  const text = recognizer.getResult(stream).text;\n  console.log(text);\n}\n\nreader.on('format', ({audioFormat, bitDepth, channels, sampleRate}) => {\n  if (sampleRate != recognizer.config.featConfig.sampleRate) {\n    throw new Error(`Only support sampleRate ${\n        recognizer.config.featConfig.sampleRate}. Given ${sampleRate}`);\n  }\n\n  if (audioFormat != 1) {\n    throw new Error(`Only support PCM format. Given ${audioFormat}`);\n  }\n\n  if (channels != 1) {\n    throw new Error(`Only a single channel. Given ${channels}`);\n  }\n\n  if (bitDepth != 16) {\n    throw new Error(`Only support 16-bit samples. Given ${bitDepth}`);\n  }\n});\n\nfs.createReadStream(waveFilename, {'highWaterMark': 4096})\n    .pipe(reader)\n    .on('finish', function(err) {\n      // tail padding\n      const floatSamples =\n          new Float32Array(recognizer.config.featConfig.sampleRate * 0.5);\n      decode(floatSamples);\n      stream.free();\n      recognizer.free();\n    });\n\nreadable.on('readable', function() {\n  let chunk;\n  while ((chunk = readable.read()) != null) {\n    const int16Samples = new Int16Array(\n        chunk.buffer, chunk.byteOffset,\n        chunk.length / Int16Array.BYTES_PER_ELEMENT);\n\n    const floatSamples = new Float32Array(int16Samples.length);\n\n    for (let i = 0; i < floatSamples.length; i++) {\n      floatSamples[i] = int16Samples[i] / 32768.0;\n    }\n\n    decode(floatSamples);\n  }\n});\n"
  },
  {
    "path": "nodejs-examples/test-online-transducer-microphone.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst portAudio = require('naudiodon2');\n// console.log(portAudio.getDevices());\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineTransducerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n    joiner:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    transducer: onlineTransducerModelConfig,\n    tokens:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n    modelType: 'zipformer',\n  };\n\n  let featureConfig = {\n    sampleRate: 16000,\n    featureDim: 80,\n  };\n\n  let recognizerConfig = {\n    featConfig: featureConfig,\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nlet lastText = '';\nlet segmentIndex = 0;\n\nconst ai = new portAudio.AudioIO({\n  inOptions: {\n    channelCount: 1,\n    closeOnError: true,  // Close the stream if an audio error is detected, if\n                         // set false then just log the error\n    deviceId: -1,  // Use -1 or omit the deviceId to select the default device\n    sampleFormat: portAudio.SampleFormatFloat32,\n    sampleRate: recognizer.config.featConfig.sampleRate\n  }\n});\n\nai.on('data', data => {\n  const samples = new Float32Array(data.buffer);\n\n  
stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n\n  const isEndpoint = recognizer.isEndpoint(stream);\n  const text = recognizer.getResult(stream).text;\n\n  if (text.length > 0 && lastText != text) {\n    lastText = text;\n    console.log(segmentIndex, lastText);\n  }\n  if (isEndpoint) {\n    if (text.length > 0) {\n      lastText = text;\n      segmentIndex += 1;\n    }\n    recognizer.reset(stream);\n  }\n});\n\nai.on('close', () => {\n  console.log('Free resources');\n  stream.free();\n  recognizer.free();\n});\n\nai.start();\nconsole.log('Started! Please speak');\n"
  },
  {
    "path": "nodejs-examples/test-online-transducer.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineTransducerModelConfig = {\n    encoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx',\n    decoder:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n    joiner:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    transducer: onlineTransducerModelConfig,\n    tokens:\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n  };\n\n  let recognizerConfig = {\n    modelConfig: onlineModelConfig,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav';\n\nconst reader = new wav.Reader();\nconst readable = new Readable().wrap(reader);\n\nfunction decode(samples) {\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n  const text = recognizer.getResult(stream).text;\n  console.log(text);\n}\n\nreader.on('format', ({audioFormat, bitDepth, channels, sampleRate}) => {\n  if (sampleRate != recognizer.config.featConfig.sampleRate) {\n    throw new Error(`Only support sampleRate ${\n        recognizer.config.featConfig.sampleRate}. Given ${sampleRate}`);\n  }\n\n  if (audioFormat != 1) {\n    throw new Error(`Only support PCM format. Given ${audioFormat}`);\n  }\n\n  if (channels != 1) {\n    throw new Error(`Only a single channel. 
Given ${channels}`);\n  }\n\n  if (bitDepth != 16) {\n    throw new Error(`Only support 16-bit samples. Given ${bitDepth}`);\n  }\n});\n\nfs.createReadStream(waveFilename, {'highWaterMark': 4096})\n    .pipe(reader)\n    .on('finish', function(err) {\n      // tail padding\n      const floatSamples =\n          new Float32Array(recognizer.config.featConfig.sampleRate * 0.5);\n      decode(floatSamples);\n      stream.free();\n      recognizer.free();\n    });\n\nreadable.on('readable', function() {\n  let chunk;\n  while ((chunk = readable.read()) != null) {\n    const int16Samples = new Int16Array(\n        chunk.buffer, chunk.byteOffset,\n        chunk.length / Int16Array.BYTES_PER_ELEMENT);\n\n    const floatSamples = new Float32Array(int16Samples.length);\n\n    for (let i = 0; i < floatSamples.length; i++) {\n      floatSamples[i] = int16Samples[i] / 32768.0;\n    }\n\n    decode(floatSamples);\n  }\n});\n"
  },
  {
    "path": "nodejs-examples/test-online-zipformer2-ctc-hlg.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineZipformer2CtcModelConfig = {\n    model:\n        './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx',\n  };\n\n  let onlineModelConfig = {\n    zipformer2Ctc: onlineZipformer2CtcModelConfig,\n    tokens: './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 0,\n    modelType: '',\n  };\n\n  let featureConfig = {\n    sampleRate: 16000,\n    featureDim: 80,\n  };\n\n  let recognizerConfig = {\n    featConfig: featureConfig,\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n    ctcFstDecoderConfig: {\n      graph: './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst',\n      maxActive: 3000,\n    }\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav';\n\nconst reader = new wav.Reader();\nconst readable = new Readable().wrap(reader);\n\nfunction decode(samples) {\n  stream.acceptWaveform(gSampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n  const text = recognizer.getResult(stream).text;\n  console.log(text);\n}\n\nlet gSampleRate = 16000;\n\nreader.on('format', ({audioFormat, bitDepth, channels, sampleRate}) => {\n  gSampleRate = sampleRate;\n\n  if (audioFormat != 1) {\n    throw new Error(`Only support PCM format. 
Given ${audioFormat}`);\n  }\n\n  if (channels != 1) {\n    throw new Error(`Only a single channel. Given ${channels}`);\n  }\n\n  if (bitDepth != 16) {\n    throw new Error(`Only support 16-bit samples. Given ${bitDepth}`);\n  }\n});\n\nfs.createReadStream(waveFilename, {'highWaterMark': 4096})\n    .pipe(reader)\n    .on('finish', function(err) {\n      // tail padding\n      const floatSamples =\n          new Float32Array(recognizer.config.featConfig.sampleRate * 0.5);\n      decode(floatSamples);\n      stream.free();\n      recognizer.free();\n    });\n\nreadable.on('readable', function() {\n  let chunk;\n  while ((chunk = readable.read()) != null) {\n    const int16Samples = new Int16Array(\n        chunk.buffer, chunk.byteOffset,\n        chunk.length / Int16Array.BYTES_PER_ELEMENT);\n\n    const floatSamples = new Float32Array(int16Samples.length);\n\n    for (let i = 0; i < floatSamples.length; i++) {\n      floatSamples[i] = int16Samples[i] / 32768.0;\n    }\n\n    decode(floatSamples);\n  }\n});\n"
  },
  {
    "path": "nodejs-examples/test-online-zipformer2-ctc.js",
    "content": "// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n//\nconst fs = require('fs');\nconst {Readable} = require('stream');\nconst wav = require('wav');\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createOnlineRecognizer() {\n  let onlineZipformer2CtcModelConfig = {\n    model:\n        './sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx',\n  };\n\n  let onlineModelConfig = {\n    zipformer2Ctc: onlineZipformer2CtcModelConfig,\n    tokens:\n        './sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n  };\n\n  let featureConfig = {\n    sampleRate: 16000,\n    featureDim: 80,\n  };\n\n  let recognizerConfig = {\n    featConfig: featureConfig,\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n  };\n\n  return sherpa_onnx.createOnlineRecognizer(recognizerConfig);\n}\n\nconst recognizer = createOnlineRecognizer();\nconst stream = recognizer.createStream();\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav';\n\nconst reader = new wav.Reader();\nconst readable = new Readable().wrap(reader);\n\nfunction decode(samples) {\n  stream.acceptWaveform(recognizer.config.featConfig.sampleRate, samples);\n\n  while (recognizer.isReady(stream)) {\n    recognizer.decode(stream);\n  }\n  const text = recognizer.getResult(stream).text;\n  console.log(text);\n}\n\nreader.on('format', ({audioFormat, bitDepth, channels, sampleRate}) => {\n  if (sampleRate != recognizer.config.featConfig.sampleRate) {\n    throw new Error(`Only support sampleRate ${\n        recognizer.config.featConfig.sampleRate}. 
Given ${sampleRate}`);\n  }\n\n  if (audioFormat != 1) {\n    throw new Error(`Only support PCM format. Given ${audioFormat}`);\n  }\n\n  if (channels != 1) {\n    throw new Error(`Only a single channel. Given ${channels}`);\n  }\n\n  if (bitDepth != 16) {\n    throw new Error(`Only support 16-bit samples. Given ${bitDepth}`);\n  }\n});\n\nfs.createReadStream(waveFilename, {'highWaterMark': 4096})\n    .pipe(reader)\n    .on('finish', function(err) {\n      // tail padding\n      const floatSamples =\n          new Float32Array(recognizer.config.featConfig.sampleRate * 0.5);\n      decode(floatSamples);\n      stream.free();\n      recognizer.free();\n    });\n\nreadable.on('readable', function() {\n  let chunk;\n  while ((chunk = readable.read()) != null) {\n    const int16Samples = new Int16Array(\n        chunk.buffer, chunk.byteOffset,\n        chunk.length / Int16Array.BYTES_PER_ELEMENT);\n\n    const floatSamples = new Float32Array(int16Samples.length);\n\n    for (let i = 0; i < floatSamples.length; i++) {\n      floatSamples[i] = int16Samples[i] / 32768.0;\n    }\n\n    decode(floatSamples);\n  }\n});\n"
  },
  {
    "path": "nodejs-examples/test-vad-with-non-streaming-asr-moonshine.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'modelConfig': {\n      'moonshine': {\n        'preprocessor': './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx',\n        'encoder': './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx',\n        'uncachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx',\n        'cachedDecoder':\n            './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx',\n      },\n      'tokens': './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt',\n      'debug': 0,\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  //\n  // please download ten-vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n  //\n  // You only need one vad\n  //\n  // To use ten-vad.onnx, please set sileroVad.model to ''\n  // and set tenVad.model to 'ten-vad.onnx'\n  //\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      maxSpeechDuration: 5,\n      windowSize: 512,\n    },\n    tenVad: {\n      // model: './ten-vad.onnx',\n      model: '',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      maxSpeechDuration: 5,\n      windowSize: 256,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n    bufferSizeInSeconds: 60,\n  };\n\n\n  return sherpa_onnx.createVad(config);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\n// please download 
./Obama.wav from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst waveFilename = './Obama.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nif (wave.sampleRate != recognizer.config.featConfig.sampleRate) {\n  throw new Error(\n      `Expected sample rate: ${recognizer.config.featConfig.sampleRate}. Given: ${wave.sampleRate}`);\n}\n\nconsole.log('Started');\nlet start = Date.now();\n\nlet windowSize = vad.config.sileroVad.windowSize;\nif (vad.config.tenVad.model != '') {\n  windowSize = vad.config.tenVad.windowSize;\n}\n\nfor (let i = 0; i < wave.samples.length; i += windowSize) {\n  const thisWindow = wave.samples.subarray(i, i + windowSize);\n  vad.acceptWaveform(thisWindow);\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n\n    let start_time = segment.start / wave.sampleRate;\n    let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n    start_time = start_time.toFixed(2);\n    end_time = end_time.toFixed(2);\n\n    const stream = recognizer.createStream();\n    stream.acceptWaveform(wave.sampleRate, segment.samples);\n\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${start_time} -- ${end_time}: ${text}`);\n    }\n\n    stream.free();\n  }\n}\n\nvad.flush();\n\nwhile (!vad.isEmpty()) {\n  const segment = vad.front();\n  vad.pop();\n\n  let start_time = segment.start / wave.sampleRate;\n  let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n  start_time = start_time.toFixed(2);\n  end_time = end_time.toFixed(2);\n\n  const stream = recognizer.createStream();\n  stream.acceptWaveform(wave.sampleRate, segment.samples);\n\n  recognizer.decode(stream);\n  const r = recognizer.getResult(stream);\n  if (r.text.length > 0) {\n    const text = r.text.toLowerCase().trim();\n    console.log(`${start_time} -- ${end_time}: ${text}`);\n  
}\n\n  stream.free();\n}\n\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    real_time_factor.toFixed(3));\n\nvad.free();\nrecognizer.free();\n"
  },
  {
    "path": "nodejs-examples/test-vad-with-non-streaming-asr-whisper.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nconst sherpa_onnx = require('sherpa-onnx');\n\nfunction createRecognizer() {\n  // Please download test files from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  const config = {\n    'modelConfig': {\n      'whisper': {\n        'encoder': './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx',\n        'decoder': './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx',\n        'tailPaddings': 2000,\n      },\n      'tokens': './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt',\n      'debug': 0,\n    }\n  };\n\n  return sherpa_onnx.createOfflineRecognizer(config);\n}\n\nfunction createVad() {\n  // please download silero_vad.onnx from\n  // https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n  const config = {\n    sileroVad: {\n      model: './silero_vad.onnx',\n      threshold: 0.5,\n      minSpeechDuration: 0.25,\n      minSilenceDuration: 0.5,\n      maxSpeechDuration: 5,\n      windowSize: 512,\n    },\n    sampleRate: 16000,\n    debug: true,\n    numThreads: 1,\n    bufferSizeInSeconds: 60,\n  };\n\n  return sherpa_onnx.createVad(config);\n}\n\nconst recognizer = createRecognizer();\nconst vad = createVad();\n\n// please download ./Obama.wav from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst waveFilename = './Obama.wav';\nconst wave = sherpa_onnx.readWave(waveFilename);\n\nif (wave.sampleRate != recognizer.config.featConfig.sampleRate) {\n  throw new Error(\n      'Expected sample rate: ${recognizer.config.featConfig.sampleRate}. 
Given: ${wave.sampleRate}');\n}\n\nconsole.log('Started');\nlet start = Date.now();\n\nconst windowSize = vad.config.sileroVad.windowSize;\nfor (let i = 0; i < wave.samples.length; i += windowSize) {\n  const thisWindow = wave.samples.subarray(i, i + windowSize);\n  vad.acceptWaveform(thisWindow);\n\n  while (!vad.isEmpty()) {\n    const segment = vad.front();\n    vad.pop();\n\n    let start_time = segment.start / wave.sampleRate;\n    let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n    start_time = start_time.toFixed(2);\n    end_time = end_time.toFixed(2);\n\n    const stream = recognizer.createStream();\n    stream.acceptWaveform(wave.sampleRate, segment.samples);\n\n    recognizer.decode(stream);\n    const r = recognizer.getResult(stream);\n    if (r.text.length > 0) {\n      const text = r.text.toLowerCase().trim();\n      console.log(`${start_time} -- ${end_time}: ${text}`);\n    }\n\n    stream.free();\n  }\n}\n\nvad.flush();\n\nwhile (!vad.isEmpty()) {\n  const segment = vad.front();\n  vad.pop();\n\n  let start_time = segment.start / wave.sampleRate;\n  let end_time = start_time + segment.samples.length / wave.sampleRate;\n\n  start_time = start_time.toFixed(2);\n  end_time = end_time.toFixed(2);\n\n  const stream = recognizer.createStream();\n  stream.acceptWaveform(wave.sampleRate, segment.samples);\n\n  recognizer.decode(stream);\n  const r = recognizer.getResult(stream);\n  if (r.text.length > 0) {\n    const text = r.text.toLowerCase().trim();\n    console.log(`${start_time} -- ${end_time}: ${text}`);\n  }\n\n  stream.free();\n}\n\nlet stop = Date.now();\nconsole.log('Done');\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'seconds');\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'seconds');\nconsole.log(\n    `RTF = ${elapsed_seconds.toFixed(3)}/${duration.toFixed(3)} =`,\n    
real_time_factor.toFixed(3));\n\nvad.free();\nrecognizer.free();\n"
  },
  {
    "path": "pascal-api-examples/.gitignore",
    "content": "link*.res\n"
  },
  {
    "path": "pascal-api-examples/README.md",
    "content": "# Introduction\n\nThis directory contains examples for how to use the [Object Pascal](https://en.wikipedia.org/wiki/Object_Pascal)\nAPIs of [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).\n\n**Documentation for this directory**:\nhttps://k2-fsa.github.io/sherpa/onnx/pascal-api/index.html\n\n|Directory| Description|\n|---------|------------|\n|[read-wav](./read-wav)|It shows how to read a wave file.|\n|[speaker-diarization](./speaker-diarization)|It shows how to use Pascal API for speaker diarization.|\n|[speech-enhancement-gtcrn](./speech-enhancement-gtcrn)| It shows how to use the offline speech denoiser API with GTCRN.|\n|[speech-enhancement-dpdfnet](./speech-enhancement-dpdfnet)| It shows how to use the offline speech denoiser API with DPDFNet. Use `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`, `dpdfnet4.onnx`, or `dpdfnet8.onnx` for 16 kHz downstream ASR and `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.|\n|[streaming-speech-enhancement-gtcrn](./streaming-speech-enhancement-gtcrn)| It shows how to use the streaming speech denoiser API with GTCRN.|\n|[streaming-speech-enhancement-dpdfnet](./streaming-speech-enhancement-dpdfnet)| It shows how to use the streaming speech denoiser API with DPDFNet.|\n|[streaming-asr](./streaming-asr)| It shows how to use streaming models for speech recognition.|\n|[non-streaming-asr](./non-streaming-asr)| It shows how to use non-streaming models for speech recognition.|\n|[vad](./vad)| It shows how to use the voice activity detection API.|\n|[vad-with-non-streaming-asr](./vad-with-non-streaming-asr)| It shows how to use the voice activity detection API with non-streaming models for speech recognition.|\n|[portaudio-test](./portaudio-test)| It shows how to use PortAudio for recording and playing.|\n|[tts](./tts)| It shows how to use the text-to-speech API.|\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/.gitignore",
    "content": "!run-*.sh\nzipformer_transducer\nwhisper\nnemo_transducer\nnemo_ctc\nparaformer\nparaformer_itn\nsense_voice\ntelespeech_ctc\nmoonshine\nmoonshine_v2\ndolphin_ctc\nzipformer_ctc\nwenet_ctc\nnemo_canary\nomnilingual_asr_ctc\nmedasr_ctc\nfunasr_nano\nfire_red_asr_ctc\nfire_red_asr\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/README.md",
    "content": "# Introduction\n\nThis folder contains examples about using sherpa-onnx's object pascal\nAPIs with non-streaming models for speech recognition.\n\n|File|Description|\n|----|-----------|\n|[run-dolphin-ctc.sh](./run-dolphin-ctc.sh)|Use a non-streaming [Dolphin](https://github.com/DataoceanAI/Dolphin) CTC model for speech recognition|\n|[run-nemo-ctc.sh](./run-nemo-ctc.sh)|Use a non-streaming NeMo CTC model for speech recognition|\n|[run-nemo-transducer.sh](./run-nemo-transducer.sh)|Use a non-streaming NeMo transducer model for speech recognition|\n|[run-paraformer-itn.sh](./run-paraformer-itn.sh)|Use a non-streaming Paraformer model for speech recognition with inverse text normalization for numbers|\n|[run-paraformer.sh](./run-paraformer.sh)|Use a non-streaming Paraformer model for speech recognition|\n|[run-sense-voice.sh](./run-sense-voice.sh)|Use a non-streaming SenseVoice model for speech recognition|\n|[run-telespeech-ctc.sh](./run-telespeech-ctc.sh)|Use a non-streaming TeleSpeech CTC model for speech recognition|\n|[run-whisper.sh](./run-whisper.sh)|Use a Whisper model for speech recognition|\n|[run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|Use a non-streaming Zipformer transducer model for speech recognition|\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/dolphin_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Dolphin CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram dolphin_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Dolphin.Model := './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory 
leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/fire_red_asr.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming FireRedAsr AED model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram fire_red_asr;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.FireRedAsr.Encoder := './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx';\n  Config.ModelConfig.FireRedAsr.Decoder := './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  
WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/fire_red_asr_ctc.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming FireRedASR CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram fire_red_asr_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.FireRedAsrCtc.Model := './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := True;\n\n  WaveFilename := './sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory 
leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/funasr_nano.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming FunASR Nano model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram funasr_nano;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.FunAsrNano.EncoderAdaptor := './sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx';\n  Config.ModelConfig.FunAsrNano.LLM := './sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx';\n  Config.ModelConfig.FunAsrNano.Embedding := './sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx';\n  Config.ModelConfig.FunAsrNano.Tokenizer := './sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B';\n  Config.ModelConfig.Tokens := '';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 2;\n  Config.ModelConfig.Debug := True;\n\n  WaveFilename := './sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', 
[Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/medasr_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Google MedASR CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram medasr_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.MedAsr.Model := './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := True;\n\n  WaveFilename := './sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to 
invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/moonshine.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Moonshine model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram moonshine;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Moonshine.Preprocessor := './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx';\n  Config.ModelConfig.Moonshine.Encoder := './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx';\n  Config.ModelConfig.Moonshine.UncachedDecoder := './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx';\n  Config.ModelConfig.Moonshine.CachedDecoder := './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx';\n\n  Config.ModelConfig.Tokens := './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  
WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/moonshine_v2.pas",
    "content": "{ Copyright (c)  2024-2026  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Moonshine v2 model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram moonshine_v2;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Moonshine.Encoder := './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort';\n  Config.ModelConfig.Moonshine.MergedDecoder := './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort';\n\n  Config.ModelConfig.Tokens := './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', 
[Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leaks.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/nemo_canary.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming NeMo Canary model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram nemo_canary;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Canary.Encoder := './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx';\n  Config.ModelConfig.Canary.Decoder := './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx';\n  Config.ModelConfig.Canary.SrcLang := 'en';\n  Config.ModelConfig.Canary.TgtLang := 'en';\n  Config.ModelConfig.Canary.UsePnc := True;\n  Config.ModelConfig.Tokens := './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', 
[Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  FreeAndNil(Stream);\n\n  WriteLn('-----------Output German-----');\n\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  Config.ModelConfig.Canary.TgtLang := 'de';\n  Recognizer.SetConfig(Config);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/nemo_ctc.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming NeMo CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram nemo_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.NeMoCtC.Model := './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/model.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/es-spanish.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, 
RealTimeFactor]));\n\n  {Free resources to avoid memory leaks.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/nemo_transducer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming NeMo transducer\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram nemo_transducer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Transducer.Encoder := './sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/encoder.onnx';\n  Config.ModelConfig.Transducer.Decoder := './sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/decoder.onnx';\n  Config.ModelConfig.Transducer.Joiner := './sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/joiner.onnx';\n  Config.ModelConfig.ModelType := 'nemo_transducer';\n  Config.ModelConfig.Tokens := './sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := 
Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leaks.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/omnilingual_asr_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Omnilingual ASR CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram omnilingual_asr_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Omnilingual.Model := './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, 
RealTimeFactor]));\n\n  {Free resources to avoid memory leaks.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/paraformer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Paraformer model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram paraformer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Paraformer.Model := './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/3-sichuan.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them 
for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/paraformer_itn.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Paraformer model\nto decode files with inverse text normalization for numbers.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram paraformer_itn;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Paraformer.Model := './sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n  Config.RuleFsts := './itn_zh_number.fst';\n\n  WaveFilename := './itn-zh-number.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory 
leaks.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./dolphin_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./dolphin_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\nfi\n\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./fire_red_asr_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./fire_red_asr_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\nfi\n\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./fire_red_asr.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./fire_red_asr\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-funasr-nano.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./funasr_nano.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./funasr_nano\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-medasr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./medasr_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./medasr_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-moonshine-v2.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./moonshine_v2.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./moonshine_v2\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./moonshine.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./moonshine\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-nemo-canary.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  tar xvf sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\n  rm sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./nemo_canary.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./nemo_canary\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-nemo-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  tar xvf sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  rm sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./nemo_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./nemo_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-nemo-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n\n  tar xvf sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\n  rm sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./nemo_transducer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./nemo_transducer\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-omnilingual-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./omnilingual_asr_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./omnilingual_asr_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-paraformer-itn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nif [ ! -f ./itn-zh-number.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\nfi\n\nif [ ! -f ./itn_zh_number.fst ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./paraformer_itn.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./paraformer_itn\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n  tar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n  rm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./paraformer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./paraformer\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./sense_voice.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./sense_voice\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-telespeech-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n\n  tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n  rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./telespeech_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./telespeech_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-wenet-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./wenet_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./wenet_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./whisper.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./whisper\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipformer_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipformer_ctc\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/run-zipformer-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\n  rm sherpa-onnx-zipformer-gigaspeech-2023-12-12.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipformer_transducer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipformer_transducer\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/sense_voice.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming SenseVoice model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram sense_voice;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.SenseVoice.Model := './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx';\n  Config.ModelConfig.SenseVoice.Language := 'auto';\n  Config.ModelConfig.SenseVoice.UseItn := False;\n  Config.ModelConfig.Tokens := './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(Format('sherpa-onnx version: %s', [SherpaOnnxGetVersionStr()]));\n  WriteLn(Format('sherpa-onnx gitSha1: %s', [SherpaOnnxGetGitSha1()]));\n  WriteLn(Format('sherpa-onnx gitDate: %s', [SherpaOnnxGetGitDate()]));\n  
WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/telespeech_ctc.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming TeleSpeech CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram telespeech_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.TeleSpeechCtc := './sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: 
You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/wenet_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Wenet CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram wenet_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.WenetCtc.Model := './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', 
[Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/whisper.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Whisper model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram whisper;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Whisper.Encoder := './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx';\n  Config.ModelConfig.Whisper.Decoder := './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-whisper-tiny.en/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free 
resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/zipformer_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Zipformer CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram zipformer_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.ZipformerCtc.Model := './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You 
don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/non-streaming-asr/zipformer_transducer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Zipformer transducer\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram zipformer_transducer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n\n  Config: TSherpaOnnxOfflineRecognizerConfig;\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Transducer.Encoder := './sherpa-onnx-zipformer-gigaspeech-2023-12-12/encoder-epoch-30-avg-1.int8.onnx';\n  Config.ModelConfig.Transducer.Decoder := './sherpa-onnx-zipformer-gigaspeech-2023-12-12/decoder-epoch-30-avg-1.onnx';\n  Config.ModelConfig.Transducer.Joiner := './sherpa-onnx-zipformer-gigaspeech-2023-12-12/joiner-epoch-30-avg-1.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-zipformer-gigaspeech-2023-12-12/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-zipformer-gigaspeech-2023-12-12/test_wavs/1089-134686-0001.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOfflineRecognizer.Create(Config);\n  Stream := Recognizer.CreateStream();\n  Start := Now;\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n  Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', 
[Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/portaudio-test/.gitignore",
    "content": "test-record\ntest-play\n"
  },
  {
    "path": "pascal-api-examples/portaudio-test/README.md",
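    "content": "# Introduction\n\n[portaudio.pas](./portaudio.pas)\nrequires the portaudio library to be installed on your system.\n\n\nOn macOS, you can use\n\n```bash\nbrew install portaudio\n```\n\nwhich installs `portaudio` into a versioned directory such as `/usr/local/Cellar/portaudio/19.7.0`.\nThe exact path depends on your Homebrew prefix and the installed portaudio version;\nrun `brew --prefix portaudio` to print it.\n"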
  },
  {
    "path": "pascal-api-examples/portaudio-test/test-play.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n{\nThis file shows how to use portaudio for playing.\n\n}\nprogram main;\n\n{$mode objfpc}{$H+}\n\n\nuses\n  portaudio,\n  sherpa_onnx,\n  dos,\n  ctypes,\n  SysUtils;\n\nvar\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n\n  Buffer: TSherpaOnnxCircularBuffer;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  if Buffer.Size >= frameCount then\n    begin\n      Samples := Buffer.Get(Buffer.Head, FrameCount);\n      Buffer.Pop(FrameCount);\n    end\n  else\n    begin\n      Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n      Buffer.Pop(Buffer.Size);\n      SetLength(Samples, frameCount);\n    end;\n  for I := 0 to frameCount - 1 do\n    pcfloat(output)[I] := Samples[I];\n\n  if Buffer.Size > 0 then\n    Result := paContinue\n  else\n    Result := paComplete;\nend;\n\n\n\nbegin\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable 
SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Wave := SherpaOnnxReadWave('./record.wav');\n  if Wave.Samples = nil then\n    begin\n      WriteLn('Failed to read ./record.wav');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(Length(Wave.Samples));\n  Buffer.Push(Wave.Samples);\n\n  Status := Pa_OpenStream(stream, nil, @Param, Wave.SampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  while Buffer.Size > 0 do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', 
Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n\n"
  },
  {
    "path": "pascal-api-examples/portaudio-test/test-record.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n{\nThis file shows how to use portaudio for recording.\n\nIt records for 10 seconds and saves the audio samples to ./record.wav\n}\nprogram main;\n\n{$mode objfpc}\n\nuses\n  portaudio,\n  sherpa_onnx,\n  dos,\n  ctypes,\n  SysUtils;\n\nvar\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n  I: Integer;\n  Param: TPaStreamParameters;\n  SampleRate: Double;\n  Stream: PPaStream;\n\n  Buffer: TSherpaOnnxCircularBuffer;\n  AllSamples: TSherpaOnnxSamplesArray;\n\nfunction RecordCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nbegin\n  Buffer.Push(pcfloat(input), frameCount);\n  Result := paContinue;\nend;\n\n\n\nbegin\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultInputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default input device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, 
AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max input channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxInputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighInputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  SampleRate := 48000;\n  Buffer := TSherpaOnnxCircularBuffer.Create(Round(SampleRate) * 20);\n\n  Status := Pa_OpenStream(stream, @Param, nil, SampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@RecordCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('Please speak! It will exit after 10 seconds.');\n  Pa_Sleep(10000);  {sleep for 10 seconds }\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  AllSamples := Buffer.Get(0, Buffer.Size);\n\n  SherpaOnnxWriteWave('record.wav', AllSamples, Round(SampleRate));\n  WriteLn('Saved to record.wav');\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n\n"
  },
  {
    "path": "pascal-api-examples/read-wav/.gitignore",
    "content": "main\n"
  },
  {
    "path": "pascal-api-examples/read-wav/main.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\nprogram main;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx;\n\nvar\n  Wave: TSherpaOnnxWave;\n  S: Single;\n  I: Integer;\nbegin\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  WriteLn('info ', Wave.SampleRate, ' ', Length(Wave.Samples));\n  S := 0;\n  for I := Low(Wave.Samples) to High(Wave.Samples) do\n    S += Wave.Samples[I];\n\n  WriteLn('sum is ', S);\nend.\n"
  },
  {
    "path": "pascal-api-examples/speaker-diarization/main.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n{\nThis file shows how to use the Pascal API from sherpa-onnx\nfor speaker diarization.\n\nUsage:\n\nStep 1: Download a speaker segmentation model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\nStep 2: Download a speaker embedding extractor model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\nStep 3. Download test wave files\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available test wave files. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nStep 4. 
Run it\n}\n\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  ctypes,\n  SysUtils;\n\nfunction ProgressCallback(\n      NumProcessedChunks: cint32;\n      NumTotalChunks: cint32): cint32; cdecl;\nvar\n  Progress: Single;\nbegin\n  Progress := 100.0 * NumProcessedChunks / NumTotalChunks;\n  WriteLn(Format('Progress: %.3f%%', [Progress]));\n\n  Result := 0;\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n  Config: TSherpaOnnxOfflineSpeakerDiarizationConfig;\n  Sd: TSherpaOnnxOfflineSpeakerDiarization;\n  Segments: TSherpaOnnxOfflineSpeakerDiarizationSegmentArray;\n  I: Integer;\nbegin\n  Wave := SherpaOnnxReadWave('./0-four-speakers-zh.wav');\n\n  Config.Segmentation.Pyannote.Model := './sherpa-onnx-pyannote-segmentation-3-0/model.onnx';\n  Config.Embedding.Model := './3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx';\n\n  {\n    Since we know that there are 4 speakers in ./0-four-speakers-zh.wav, we\n    set NumClusters to 4 here.\n    If you don't have such information, please set NumClusters to -1.\n    In that case, you have to set Config.Clustering.Threshold.\n    A larger threshold leads to fewer clusters, i.e., fewer speakers.\n  }\n  Config.Clustering.NumClusters := 4;\n  Config.Segmentation.Debug := True;\n  Config.Embedding.Debug := True;\n\n  Sd := TSherpaOnnxOfflineSpeakerDiarization.Create(Config);\n  if Sd.GetHandle = nil then\n    begin\n      WriteLn('Please check your config');\n      Exit;\n    end;\n\n  if Sd.GetSampleRate <> Wave.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d, given: %d', [Sd.GetSampleRate, Wave.SampleRate]));\n      Exit;\n    end;\n\n  {\n    // If you don't want to use a callback\n    Segments := Sd.Process(Wave.Samples);\n  }\n  Segments := Sd.Process(Wave.Samples, @ProgressCallback);\n\n  for I := Low(Segments) to High(Segments) do\n    begin\n      WriteLn(Format('%.3f -- %.3f speaker_%d',\n        [Segments[I].Start, Segments[I].Stop, Segments[I].Speaker]));\n    end;\n\n  
FreeAndNil(Sd);\nend.\n"
  },
  {
    "path": "pascal-api-examples/speech-enhancement-dpdfnet/.gitignore",
    "content": "dpdfnet\n"
  },
  {
    "path": "pascal-api-examples/speech-enhancement-dpdfnet/dpdfnet.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\n{\nThis file shows how to use the offline speech enhancement API from sherpa-onnx\nwith a DPDFNet model.\n\nPlease first download files used in this script before you run it.\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n\nUse dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\nfor 16 kHz downstream ASR (automatic speech recognition).\nUse dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  Config: TSherpaOnnxOfflineSpeechDenoiserConfig;\n  Sd: TSherpaOnnxOfflineSpeechDenoiser;\n  Audio: TSherpaOnnxDenoisedAudio;\nbegin\n  Wave := SherpaOnnxReadWave('./inp_16k.wav');\n\n  Initialize(Config);\n  Config.Model.DpdfNet.Model := './dpdfnet_baseline.onnx';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := True;\n  Config.Model.Provider := 'cpu';\n\n  Sd := TSherpaOnnxOfflineSpeechDenoiser.Create(Config);\n\n  Audio := Sd.Run(Wave.Samples, Wave.SampleRate);\n\n  SherpaOnnxWriteWave('./enhanced-dpdfnet.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./enhanced-dpdfnet.wav');\n\n  FreeAndNil(Sd);\nend.\n"
  },
  {
    "path": "pascal-api-examples/speech-enhancement-gtcrn/.gitignore",
    "content": "gtcrn\n"
  },
  {
    "path": "pascal-api-examples/speech-enhancement-gtcrn/gtcrn.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n{\nThis file shows how to use the speech enhancement API from sherpa-onnx\n\nPlease first download files used in this script before you run it.\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  Model: AnsiString;\n\n  Config: TSherpaOnnxOfflineSpeechDenoiserConfig;\n  Sd: TSherpaOnnxOfflineSpeechDenoiser;\n  Audio: TSherpaOnnxDenoisedAudio;\nbegin\n  Wave := SherpaOnnxReadWave('./inp_16k.wav');\n  Model := './gtcrn_simple.onnx';\n\n  Initialize(Config);\n  Config.Model.Gtcrn.Model := Model;\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := True;\n  Config.Model.Provider := 'cpu';\n\n  Sd := TSherpaOnnxOfflineSpeechDenoiser.Create(Config);\n\n  Audio := Sd.Run(Wave.Samples, Wave.SampleRate);\n\n  SherpaOnnxWriteWave('./enhanced.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./enhanced.wav');\n\n  FreeAndNil(Sd);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/.gitignore",
    "content": "!run-*.sh\nzipformer_transducer\nparaformer\nzipformer_ctc\nzipformer_ctc_hlg\nnemo_transducer\nt_one_ctc\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/README.md",
    "content": "# Introduction\n\nThis folder contains examples of using sherpa-onnx's Object Pascal\nAPIs with streaming models for speech recognition.\n\n|File|Description|\n|----|-----------|\n|[run-paraformer.sh](./run-paraformer.sh)|Use a streaming Paraformer model for speech recognition|\n|[run-zipformer-ctc-hlg.sh](./run-zipformer-ctc-hlg.sh)|Use a streaming Zipformer CTC model with HLG for speech recognition|\n|[run-zipformer-ctc.sh](./run-zipformer-ctc.sh)|Use a streaming Zipformer CTC model for speech recognition|\n|[run-zipformer-transducer.sh](./run-zipformer-transducer.sh)|Use a streaming Zipformer transducer model for speech recognition|\n|[run-nemo-transducer.sh](./run-nemo-transducer.sh)|Use a streaming NeMo transducer model for speech recognition|\n|[run-t-one-ctc.sh](./run-t-one-ctc.sh)|Use a streaming T-one CTC model for speech recognition|\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/nemo_transducer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming NeMo transducer\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram nemo_transducer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.Transducer.Encoder := './sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/encoder.onnx';\n  Config.ModelConfig.Transducer.Decoder := './sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/decoder.onnx';\n  Config.ModelConfig.Transducer.Joiner := './sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/joiner.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.5)); {0.5 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  
while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/paraformer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming Paraformer model to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram paraformer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.Paraformer.Encoder := './sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx';\n  Config.ModelConfig.Paraformer.Decoder := './sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt';\n\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/2.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.5)); {0.5 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := 
MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-nemo-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\n  tar xvf sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\n  rm sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-80ms.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./nemo_transducer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./nemo_transducer\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-paraformer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\n\nif [ ! -f ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./paraformer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./paraformer\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-t-one-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./t_one_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./t_one_ctc\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-zipformer-ctc-hlg.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipformer_ctc_hlg.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipformer_ctc_hlg\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipformer_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipformer_ctc\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/run-zipformer-transducer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  ls -lh lib\n  popd\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipformer_transducer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipformer_transducer\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/t_one_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming T-one CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram t_one_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  LeftPaddings: array of Single;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.ToneCtc.Model := './sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  SetLength(LeftPaddings, Round(Wave.SampleRate * 0.3)); {0.3 seconds of padding}\n  Stream.AcceptWaveform(LeftPaddings, Wave.SampleRate);\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.6)); {0.6 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := 
Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/zipformer_ctc.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming Zipformer CTC model\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram zipformer_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.Zipformer2Ctc.Model := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.5)); {0.5 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := 
Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/zipformer_ctc_hlg.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming Zipformer CTC model\nwith HLG to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram zipformer_ctc_hlg;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.Zipformer2Ctc.Model := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := True;\n  Config.CtcFstDecoderConfig.Graph := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst';\n\n  WaveFilename := './sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.5)); {0.5 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := 
Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-asr/zipformer_transducer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a streaming Zipformer transducer\nto decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram zipformer_transducer;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  DateUtils,\n  SysUtils;\n\nvar\n  Config: TSherpaOnnxOnlineRecognizerConfig;\n  Recognizer: TSherpaOnnxOnlineRecognizer;\n  Stream: TSherpaOnnxOnlineStream;\n  RecognitionResult: TSherpaOnnxOnlineRecognizerResult;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  TailPaddings: array of Single;\n\n  Start: TDateTime;\n  Stop: TDateTime;\n\n  Elapsed: Single;\n  Duration: Single;\n  RealTimeFactor: Single;\nbegin\n  Initialize(Config);\n\n  {Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n  to download model files used in this file.}\n  Config.ModelConfig.Transducer.Encoder := './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx';\n  Config.ModelConfig.Transducer.Decoder := './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx';\n  Config.ModelConfig.Transducer.Joiner := './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  WaveFilename := './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav';\n\n  Wave := SherpaOnnxReadWave(WaveFilename);\n\n  Recognizer := TSherpaOnnxOnlineRecognizer.Create(Config);\n\n  Start := Now;\n\n  Stream := Recognizer.CreateStream();\n\n  Stream.AcceptWaveform(Wave.Samples, Wave.SampleRate);\n\n  SetLength(TailPaddings, Round(Wave.SampleRate * 0.5)); {0.5 seconds of padding}\n  Stream.AcceptWaveform(TailPaddings, 
Wave.SampleRate);\n\n  Stream.InputFinished();\n\n  while Recognizer.IsReady(Stream) do\n    Recognizer.Decode(Stream);\n\n  RecognitionResult := Recognizer.GetResult(Stream);\n\n  Stop := Now;\n\n  Elapsed := MilliSecondsBetween(Stop, Start) / 1000;\n  Duration := Length(Wave.Samples) / Wave.SampleRate;\n  RealTimeFactor := Elapsed / Duration;\n\n  WriteLn(RecognitionResult.ToString);\n  WriteLn(Format('NumThreads %d', [Config.ModelConfig.NumThreads]));\n  WriteLn(Format('Elapsed %.3f s', [Elapsed]));\n  WriteLn(Format('Wave duration %.3f s', [Duration]));\n  WriteLn(Format('RTF = %.3f/%.3f = %.3f', [Elapsed, Duration, RealTimeFactor]));\n\n  {Free resources to avoid memory leak.\n\n  Note: You don't need to invoke them for this simple script.\n  However, you have to invoke them in your own large/complex project.\n  }\n  FreeAndNil(Stream);\n  FreeAndNil(Recognizer);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-speech-enhancement-dpdfnet/.gitignore",
    "content": "dpdfnet\n"
  },
  {
    "path": "pascal-api-examples/streaming-speech-enhancement-dpdfnet/dpdfnet.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\n{\nThis file shows how to use the streaming speech enhancement API from sherpa-onnx\nwith a DPDFNet model.\n\nPlease first download files used in this script before you run it.\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  Config: TSherpaOnnxOnlineSpeechDenoiserConfig;\n  Sd: TSherpaOnnxOnlineSpeechDenoiser;\n  Audio: TSherpaOnnxDenoisedAudio;\n  Chunk: array of Single;\n  Enhanced: array of Single;\n  StartIndex: Integer;\n  N: Integer;\n  NewLength: Integer;\nbegin\n  Wave := SherpaOnnxReadWave('./inp_16k.wav');\n\n  Initialize(Config);\n  Config.Model.DpdfNet.Model := './dpdfnet_baseline.onnx';\n  Config.Model.NumThreads:= 1;\n  Config.Model.Debug:= True;\n  Config.Model.Provider:= 'cpu';\n\n  Sd := TSherpaOnnxOnlineSpeechDenoiser.Create(Config);\n\n  SetLength(Enhanced, 0);\n  StartIndex := 0;\n  while StartIndex < Length(Wave.Samples) do\n    begin\n      N := Sd.GetFrameShiftInSamples;\n      if StartIndex + N > Length(Wave.Samples) then\n        N := Length(Wave.Samples) - StartIndex;\n\n      Chunk := Copy(Wave.Samples, StartIndex, N);\n      Audio := Sd.Run(Chunk, Wave.SampleRate);\n      NewLength := Length(Enhanced) + Length(Audio.Samples);\n      SetLength(Enhanced, NewLength);\n      if Length(Audio.Samples) > 0 then\n        Move(Audio.Samples[0], Enhanced[NewLength - Length(Audio.Samples)],\n          Length(Audio.Samples) * SizeOf(Single));\n      Inc(StartIndex, N);\n    end;\n\n  Audio := Sd.Flush;\n  NewLength := Length(Enhanced) + Length(Audio.Samples);\n  SetLength(Enhanced, NewLength);\n  if Length(Audio.Samples) > 0 then\n    Move(Audio.Samples[0], Enhanced[NewLength - Length(Audio.Samples)],\n      
Length(Audio.Samples) * SizeOf(Single));\n\n  SherpaOnnxWriteWave('./enhanced-online-dpdfnet.wav', Enhanced, Sd.GetSampleRate);\n  WriteLn('Saved to ./enhanced-online-dpdfnet.wav');\n\n  FreeAndNil(Sd);\nend.\n"
  },
  {
    "path": "pascal-api-examples/streaming-speech-enhancement-gtcrn/.gitignore",
    "content": "gtcrn\n"
  },
  {
    "path": "pascal-api-examples/streaming-speech-enhancement-gtcrn/gtcrn.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\n{\nThis file shows how to use the streaming speech enhancement API from sherpa-onnx\nwith a GTCRN model.\n\nPlease first download files used in this script before you run it.\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n  Config: TSherpaOnnxOnlineSpeechDenoiserConfig;\n  Sd: TSherpaOnnxOnlineSpeechDenoiser;\n  Audio: TSherpaOnnxDenoisedAudio;\n  Chunk: array of Single;\n  Enhanced: array of Single;\n  StartIndex: Integer;\n  N: Integer;\n  NewLength: Integer;\nbegin\n  Wave := SherpaOnnxReadWave('./inp_16k.wav');\n\n  Initialize(Config);\n  Config.Model.Gtcrn.Model := './gtcrn_simple.onnx';\n  Config.Model.NumThreads:= 1;\n  Config.Model.Debug:= True;\n  Config.Model.Provider:= 'cpu';\n\n  Sd := TSherpaOnnxOnlineSpeechDenoiser.Create(Config);\n\n  SetLength(Enhanced, 0);\n  StartIndex := 0;\n  while StartIndex < Length(Wave.Samples) do\n    begin\n      N := Sd.GetFrameShiftInSamples;\n      if StartIndex + N > Length(Wave.Samples) then\n        N := Length(Wave.Samples) - StartIndex;\n\n      Chunk := Copy(Wave.Samples, StartIndex, N);\n      Audio := Sd.Run(Chunk, Wave.SampleRate);\n      NewLength := Length(Enhanced) + Length(Audio.Samples);\n      SetLength(Enhanced, NewLength);\n      if Length(Audio.Samples) > 0 then\n        Move(Audio.Samples[0], Enhanced[NewLength - Length(Audio.Samples)],\n          Length(Audio.Samples) * SizeOf(Single));\n      Inc(StartIndex, N);\n    end;\n\n  Audio := Sd.Flush;\n  NewLength := Length(Enhanced) + Length(Audio.Samples);\n  SetLength(Enhanced, NewLength);\n  if Length(Audio.Samples) > 0 then\n    Move(Audio.Samples[0], Enhanced[NewLength - Length(Audio.Samples)],\n      Length(Audio.Samples) * 
SizeOf(Single));\n\n  SherpaOnnxWriteWave('./enhanced-online-gtcrn.wav', Enhanced, Sd.GetSampleRate);\n  WriteLn('Saved to ./enhanced-online-gtcrn.wav');\n\n  FreeAndNil(Sd);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/.gitignore",
    "content": "!run-*.sh\npiper\npiper-playback\nlink*.res\nmatcha-zh\nmatcha-en\nmatcha-zh-playback\nmatcha-en-playback\nkokoro-en\nkitten-en\nkokoro-en-playback\nkitten-en-playback\nkokoro-zh-en\nkokoro-zh-en-playback\npocket-en\nsupertonic-en\nzipvoice-zh-en\n"
  },
  {
    "path": "pascal-api-examples/tts/README.md",
    "content": "# Introduction\n\nThis directory contains examples of how to use the TTS (text-to-speech) APIs.\n\n|Script|Description|\n|------|-----------|\n|[run-piper.sh](./run-piper.sh)|It shows how to use models from [piper](https://github.com/rhasspy/piper) for text to speech.|\n|[run-piper-playback.sh](./run-piper-playback.sh)|It shows how to use models from [piper](https://github.com/rhasspy/piper) for text to speech. It plays the audio while it is still being generated.|\n|[run-zipvoice-zh-en.sh](./run-zipvoice-zh-en.sh)|It shows how to use ZipVoice Chinese/English zero-shot TTS for text to speech.|\n"
  },
  {
    "path": "pascal-api-examples/tts/kitten-en-playback.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kitten_en_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kitten TTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kitten.Model := './kitten-nano-en-v0_1-fp16/model.fp16.onnx';\n  Config.Model.Kitten.Voices := './kitten-nano-en-v0_1-fp16/voices.bin';\n  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_1-fp16/tokens.txt';\n  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_1-fp16/espeak-ng-data';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed 
to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./kitten-en-playback-0.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kitten-en-playback-0.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/kitten-en.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kitten_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kitten TTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./kitten-en-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kitten.Model := './kitten-nano-en-v0_1-fp16/model.fp16.onnx';\n  Config.Model.Kitten.Voices := './kitten-nano-en-v0_1-fp16/voices.bin';\n  Config.Model.Kitten.Tokens := './kitten-nano-en-v0_1-fp16/tokens.txt';\n  Config.Model.Kitten.DataDir := './kitten-nano-en-v0_1-fp16/espeak-ng-data';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./kitten-en-0.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kitten-en-0.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/kokoro-en-playback.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kokoro_en_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kokoro models.\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 7;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kokoro.Model := './kokoro-en-v0_19/model.onnx';\n  Config.Model.Kokoro.Voices := './kokoro-en-v0_19/voices.bin';\n  Config.Model.Kokoro.Tokens := './kokoro-en-v0_19/tokens.txt';\n  Config.Model.Kokoro.DataDir := './kokoro-en-v0_19/espeak-ng-data';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', 
Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./kokoro-en-playback-7.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kokoro-en-playback-7.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/kokoro-en.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kokoro_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kokoro TTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./kokoro-en-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kokoro.Model := './kokoro-en-v0_19/model.onnx';\n  Config.Model.Kokoro.Voices := './kokoro-en-v0_19/voices.bin';\n  Config.Model.Kokoro.Tokens := './kokoro-en-v0_19/tokens.txt';\n  Config.Model.Kokoro.DataDir := './kokoro-en-v0_19/espeak-ng-data';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 8;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./kokoro-en-8.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kokoro-en-8.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/kokoro-zh-en-playback.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kokoro_zh_en_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kokoro models (Chinese + English).\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 47;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kokoro.Model := './kokoro-multi-lang-v1_0/model.onnx';\n  Config.Model.Kokoro.Voices := './kokoro-multi-lang-v1_0/voices.bin';\n  Config.Model.Kokoro.Tokens := './kokoro-multi-lang-v1_0/tokens.txt';\n  Config.Model.Kokoro.DataDir := './kokoro-multi-lang-v1_0/espeak-ng-data';\n  Config.Model.Kokoro.DictDir := './kokoro-multi-lang-v1_0/dict';\n  Config.Model.Kokoro.Lexicon := './kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, 
DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := '中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./kokoro-zh-en-playback-47.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kokoro-zh-en-playback-47.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/kokoro-zh-en.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram kokoro_zh_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Kokoro TTS models (Chinese + English).\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./kokoro-zh-en-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Kokoro.Model := './kokoro-multi-lang-v1_0/model.onnx';\n  Config.Model.Kokoro.Voices := './kokoro-multi-lang-v1_0/voices.bin';\n  Config.Model.Kokoro.Tokens := './kokoro-multi-lang-v1_0/tokens.txt';\n  Config.Model.Kokoro.DataDir := './kokoro-multi-lang-v1_0/espeak-ng-data';\n  Config.Model.Kokoro.DictDir := './kokoro-multi-lang-v1_0/dict';\n  Config.Model.Kokoro.Lexicon := './kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 46;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := '中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 你觉得中英文说的如何呢？';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./kokoro-zh-en-46.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./kokoro-zh-en-46.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/matcha-en-playback.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram matcha_en_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith MatchaTTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Matcha.AcousticModel := './matcha-icefall-en_US-ljspeech/model-steps-3.onnx';\n  Config.Model.Matcha.Vocoder := './vocos-22khz-univ.onnx';\n  Config.Model.Matcha.Tokens := './matcha-icefall-en_US-ljspeech/tokens.txt';\n  Config.Model.Matcha.DataDir := './matcha-icefall-en_US-ljspeech/espeak-ng-data';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      
WriteLn('Failed to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./matcha-en-playback.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./matcha-en-playback.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/matcha-en.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram matcha_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith MatchaTTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./matcha-en-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Matcha.AcousticModel := './matcha-icefall-en_US-ljspeech/model-steps-3.onnx';\n  Config.Model.Matcha.Vocoder := './vocos-22khz-univ.onnx';\n  Config.Model.Matcha.Tokens := './matcha-icefall-en_US-ljspeech/tokens.txt';\n  Config.Model.Matcha.DataDir := './matcha-icefall-en_US-ljspeech/espeak-ng-data';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./matcha-en.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./matcha-en.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/matcha-zh-playback.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram matcha_zh_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith MatchaTTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Matcha.AcousticModel := './matcha-icefall-zh-baker/model-steps-3.onnx';\n  Config.Model.Matcha.Vocoder := './vocos-22khz-univ.onnx';\n  Config.Model.Matcha.Lexicon := './matcha-icefall-zh-baker/lexicon.txt';\n  Config.Model.Matcha.Tokens := './matcha-icefall-zh-baker/tokens.txt';\n  Config.Model.Matcha.DictDir := './matcha-icefall-zh-baker/dict';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.RuleFsts := './matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst';\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, 
DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := '某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./matcha-zh-playback.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./matcha-zh-playback.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/matcha-zh.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\nprogram matcha_zh;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith MatchaTTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./matcha-zh-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Matcha.AcousticModel := './matcha-icefall-zh-baker/model-steps-3.onnx';\n  Config.Model.Matcha.Vocoder := './vocos-22khz-univ.onnx';\n  Config.Model.Matcha.Lexicon := './matcha-icefall-zh-baker/lexicon.txt';\n  Config.Model.Matcha.Tokens := './matcha-icefall-zh-baker/tokens.txt';\n  Config.Model.Matcha.DictDir := './matcha-icefall-zh-baker/dict';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.RuleFsts := './matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst';\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := '某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./matcha-zh.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./matcha-zh.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/piper-playback.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\nprogram piper_playback;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Piper models.\n\nIt generates speech from text and saves it to a wave file.\n\nNote that it plays the audio back while it is still being generated.\n}\n\n{$mode objfpc}\n\nuses\n  {$ifdef unix}\n  cthreads,\n  {$endif}\n  SysUtils,\n  dos,\n  ctypes,\n  portaudio,\n  sherpa_onnx;\n\nvar\n  CriticalSection: TRTLCriticalSection;\n\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Resampler: TSherpaOnnxLinearResampler;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n  Buffer: TSherpaOnnxCircularBuffer;\n  FinishedGeneration: Boolean = False;\n  FinishedPlaying: Boolean = False;\n\n  Version: String;\n  EnvStr: String;\n  Status: Integer;\n  NumDevices: Integer;\n  DeviceIndex: Integer;\n  DeviceInfo: PPaDeviceInfo;\n\n  { If you get EDivByZero: Division by zero error, please change the sample rate\n    to one supported by your output device.\n  }\n  DeviceSampleRate: Integer = 48000;\n  I: Integer;\n  Param: TPaStreamParameters;\n  Stream: PPaStream;\n  Wave: TSherpaOnnxWave;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\nfunction GenerateCallback(\n      Samples: pcfloat; N: cint32;\n      Progress: cfloat; Arg: Pointer): cint; cdecl;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Resampler <> nil then\n      Buffer.Push(Resampler.Resample(Samples, N, False))\n    else\n      Buffer.Push(Samples, N);\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\n\n  { 1 means to continue generating; 0 means to stop generating. 
}\n  Result := 1;\nend;\n\nfunction PlayCallback(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\nvar\n  Samples: TSherpaOnnxSamplesArray;\n  I: Integer;\nbegin\n  EnterCriticalSection(CriticalSection);\n  try\n    if Buffer.Size >= frameCount then\n      begin\n        Samples := Buffer.Get(Buffer.Head, FrameCount);\n        Buffer.Pop(FrameCount);\n      end\n    else if Buffer.Size > 0 then\n      begin\n        Samples := Buffer.Get(Buffer.Head, Buffer.Size);\n        Buffer.Pop(Buffer.Size);\n        SetLength(Samples, frameCount);\n      end\n    else\n      SetLength(Samples, frameCount);\n\n    for I := 0 to frameCount - 1 do\n      pcfloat(output)[I] := Samples[I];\n\n    if (Buffer.Size > 0) or (not FinishedGeneration) then\n      Result := paContinue\n    else\n      begin\n        Result := paComplete;\n        FinishedPlaying := True;\n      end;\n  finally\n    LeaveCriticalSection(CriticalSection);\n  end;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Vits.Model := './vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx';\n  Config.Model.Vits.Tokens := './vits-piper-en_US-libritts_r-medium/tokens.txt';\n  Config.Model.Vits.DataDir := './vits-piper-en_US-libritts_r-medium/espeak-ng-data';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nbegin\n  Tts := GetOfflineTts;\n  if Tts.GetSampleRate <> DeviceSampleRate then\n    Resampler := TSherpaOnnxLinearResampler.Create(Tts.GetSampleRate, DeviceSampleRate);\n\n  Version := String(Pa_GetVersionText);\n  WriteLn('Version is ', Version);\n  Status := Pa_Initialize;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to initialize portaudio, ', 
Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  NumDevices := Pa_GetDeviceCount;\n  WriteLn('Num devices: ', NumDevices);\n\n  DeviceIndex := Pa_GetDefaultOutputDevice;\n\n  if DeviceIndex = paNoDevice then\n    begin\n      WriteLn('No default output device found');\n      Pa_Terminate;\n      Exit;\n    end;\n\n  EnvStr := GetEnv('SHERPA_ONNX_MIC_DEVICE');\n  if EnvStr <> '' then\n    begin\n      DeviceIndex := StrToIntDef(EnvStr, DeviceIndex);\n      WriteLn('Use device index from environment variable SHERPA_ONNX_MIC_DEVICE: ', EnvStr);\n    end;\n\n  for I := 0 to (NumDevices - 1) do\n    begin\n      DeviceInfo := Pa_GetDeviceInfo(I);\n      if I = DeviceIndex then\n        { WriteLn(Format(' * %d %s', [I, DeviceInfo^.Name])) }\n        WriteLn(Format(' * %d %s', [I, AnsiString(DeviceInfo^.Name)]))\n      else\n        WriteLn(Format('   %d %s', [I, AnsiString(DeviceInfo^.Name)]));\n    end;\n\n  WriteLn('Use device ', DeviceIndex);\n  WriteLn(' Name ', Pa_GetDeviceInfo(DeviceIndex)^.Name);\n  WriteLn(' Max output channels ', Pa_GetDeviceInfo(DeviceIndex)^.MaxOutputChannels);\n\n  Initialize(Param);\n  Param.Device := DeviceIndex;\n  Param.ChannelCount := 1;\n  Param.SampleFormat := paFloat32;\n  param.SuggestedLatency := Pa_GetDeviceInfo(DeviceIndex)^.DefaultHighOutputLatency;\n  param.HostApiSpecificStreamInfo := nil;\n\n  Buffer := TSherpaOnnxCircularBuffer.Create(30 * DeviceSampleRate);\n\n\n  { Note(fangjun): PortAudio invokes PlayCallback in a separate thread. 
}\n  Status := Pa_OpenStream(stream, nil, @Param, DeviceSampleRate, paFramesPerBufferUnspecified, paNoFlag,\n    PPaStreamCallback(@PlayCallback), nil);\n\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to open stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  InitCriticalSection(CriticalSection);\n\n  Status := Pa_StartStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to start stream, ', Pa_GetErrorText(Status));\n      Pa_Terminate;\n      Exit;\n    end;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig,\n    @GenerateCallback, nil);\n  FinishedGeneration := True;\n  SherpaOnnxWriteWave('./libritts_r-generated.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./libritts_r-generated.wav');\n\n  while not FinishedPlaying do\n    Pa_Sleep(100);  {sleep for 0.1 second }\n    {TODO(fangjun): Use an event to indicate the play is finished}\n\n  DoneCriticalSection(CriticalSection);\n\n  FreeAndNil(Tts);\n  FreeAndNil(Resampler);\n\n  Status := Pa_CloseStream(stream);\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to close stream, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\n\n  Status := Pa_Terminate;\n  if Status <> paNoError then\n    begin\n      WriteLn('Failed to deinitialize portaudio, ', Pa_GetErrorText(Status));\n      Exit;\n    end;\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/piper.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\nprogram piper;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Piper models.\n\nIt generates speech from text and saves it to a wave file.\n\nIf you want to play it while it is generating, please see\n./piper-playback.pas\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Vits.Model := './vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx';\n  Config.Model.Vits.Tokens := './vits-piper-en_US-libritts_r-medium/tokens.txt';\n  Config.Model.Vits.DataDir := './vits-piper-en_US-libritts_r-medium/espeak-ng-data';\n  Config.Model.NumThreads := 1;\n  Config.Model.Debug := False;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  Audio: TSherpaOnnxGeneratedAudio;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n\n  Text: AnsiString;\n  Speed: Single = 1.0;  {Use a larger value to speak faster}\n  SpeakerId: Integer = 0;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.';\n\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := Speed;\n  GenerationConfig.Sid := SpeakerId;\n\n  Audio :=  Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./libritts_r-generated.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./libritts_r-generated.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/pocket-en.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\nprogram pocket_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Pocket TTS models.\n\nIt generates speech from text and saves it to a wave file.\n}\n\n{$mode objfpc}\n\nuses\n  ctypes,\n  SysUtils,\n  sherpa_onnx;\n\nfunction ProgressCallback(Samples: pcfloat; N: cint32; P: cfloat;\n  Arg: Pointer): cint32; cdecl;\nbegin\n  WriteLn(Format('Progress: %.2f%%, samples: %d', [P * 100.0, N]));\n  Result := 1;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Pocket.LmFlow := './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx';\n  Config.Model.Pocket.LmMain := './sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx';\n  Config.Model.Pocket.Encoder := './sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx';\n  Config.Model.Pocket.Decoder := './sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx';\n  Config.Model.Pocket.TextConditioner := './sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx';\n  Config.Model.Pocket.VocabJson := './sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json';\n  Config.Model.Pocket.TokenScoresJson := './sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := True;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  Audio: TSherpaOnnxGeneratedAudio;\n\n  Text: AnsiString;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Friends fell out often because life was changing so fast. 
The easiest thing in the world was to lose touch with someone.';\n\n  WaveFilename := './sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav';\n  Wave := SherpaOnnxReadWave(WaveFilename);\n  GenerationConfig.ReferenceAudio := Wave.Samples;\n  GenerationConfig.ReferenceAudioLen := Length(Wave.Samples);\n  GenerationConfig.ReferenceSampleRate := Wave.SampleRate;\n\n  Audio := Tts.Generate(Text, GenerationConfig, @ProgressCallback, NIL);\n  SherpaOnnxWriteWave('./pocket-tts-en.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./pocket-tts-en.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kitten-en-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./kitten-en-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kitten-en-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./kitten-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kitten-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kokoro-en-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./kokoro-en-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kokoro-en-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./kokoro-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kokoro-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kokoro-zh-en-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./kokoro-zh-en-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kokoro-zh-en-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./kokoro-zh-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./kokoro-zh-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-matcha-en-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./matcha-en-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./matcha-en-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./matcha-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./matcha-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-matcha-zh-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./matcha-zh-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./matcha-zh-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./matcha-zh.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./matcha-zh\n"
  },
  {
    "path": "pascal-api-examples/tts/run-piper-playback.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./vits-piper-en_US-libritts_r-medium/tokens.txt ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\n  tar xf vits-piper-en_US-libritts_r-medium.tar.bz2\n  rm vits-piper-en_US-libritts_r-medium.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  -Fl/usr/local/Cellar/portaudio/19.7.0/lib \\\n  ./piper-playback.pas\n\n# Please see ../portaudio-test/README.md\n# for how to install portaudio on macOS\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./piper-playback\n"
  },
  {
    "path": "pascal-api-examples/tts/run-piper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./vits-piper-en_US-libritts_r-medium/tokens.txt ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\n  tar xf vits-piper-en_US-libritts_r-medium.tar.bz2\n  rm vits-piper-en_US-libritts_r-medium.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./piper.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./piper\n"
  },
  {
    "path": "pascal-api-examples/tts/run-pocket-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./pocket-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./pocket-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-supertonic-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./supertonic-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./supertonic-en\n"
  },
  {
    "path": "pascal-api-examples/tts/run-zipvoice-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./zipvoice-zh-en.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./zipvoice-zh-en\n"
  },
  {
    "path": "pascal-api-examples/tts/supertonic-en.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\nprogram supertonic_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith Supertonic TTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nPlease visit\nhttps://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\nto download the model.\n}\n\n{$mode objfpc}\n\nuses\n  SysUtils,\n  sherpa_onnx;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config.Model.Supertonic.DurationPredictor := './sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx';\n  Config.Model.Supertonic.TextEncoder := './sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx';\n  Config.Model.Supertonic.VectorEstimator := './sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx';\n  Config.Model.Supertonic.Vocoder := './sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx';\n  Config.Model.Supertonic.TtsJson := './sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json';\n  Config.Model.Supertonic.UnicodeIndexer := './sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin';\n  Config.Model.Supertonic.VoiceStyle := './sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin';\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := True;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Text: AnsiString;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 'Today as always, men fall into two groups: slaves and free men. 
Whoever ' +\n    'does not have two-thirds of his day for himself, is a slave, whatever ' +\n    'he may be: a statesman, a businessman, an official, or a scholar.';\n\n  GenerationConfig.Sid := 6;\n  GenerationConfig.NumSteps := 5;\n  GenerationConfig.Speed := 1.25;\n  GenerationConfig.Extra := '{\"lang\": \"en\"}';\n\n  Audio := Tts.Generate(Text, GenerationConfig, NIL, NIL);\n  SherpaOnnxWriteWave('./supertonic-tts-en.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./supertonic-tts-en.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/tts/zipvoice-zh-en.pas",
    "content": "{ Copyright (c)  2026  Xiaomi Corporation }\nprogram zipvoice_zh_en;\n{\nThis file shows how to use the text to speech API of sherpa-onnx\nwith ZipVoice TTS models.\n\nIt generates speech from text and saves it to a wave file.\n\nPlease visit\nhttps://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\nto download the model.\n}\n\n{$mode objfpc}\n\nuses\n  ctypes,\n  SysUtils,\n  sherpa_onnx;\n\nfunction ProgressCallback(Samples: pcfloat; N: cint32; P: cfloat;\n  Arg: Pointer): cint32; cdecl;\nbegin\n  WriteLn(Format('Progress: %.2f%%, samples: %d', [P * 100.0, N]));\n  Result := 1;\nend;\n\nfunction GetOfflineTts: TSherpaOnnxOfflineTts;\nvar\n  Config: TSherpaOnnxOfflineTtsConfig;\nbegin\n  Config := Default(TSherpaOnnxOfflineTtsConfig);\n  Config.Model.ZipVoice.Tokens := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt';\n  Config.Model.ZipVoice.Encoder := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx';\n  Config.Model.ZipVoice.Decoder := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx';\n  Config.Model.ZipVoice.Vocoder := './vocos_24khz.onnx';\n  Config.Model.ZipVoice.DataDir := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data';\n  Config.Model.ZipVoice.Lexicon := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt';\n  Config.Model.ZipVoice.FeatScale := 0.1;\n  Config.Model.ZipVoice.Tshift := 0.5;\n  Config.Model.ZipVoice.TargetRms := 0.1;\n  Config.Model.ZipVoice.GuidanceScale := 1.0;\n  Config.Model.NumThreads := 2;\n  Config.Model.Debug := True;\n  Config.MaxNumSentences := 1;\n\n  Result := TSherpaOnnxOfflineTts.Create(Config);\nend;\n\nvar\n  Tts: TSherpaOnnxOfflineTts;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n  Wave: TSherpaOnnxWave;\n  WaveFilename: AnsiString;\n  Audio: TSherpaOnnxGeneratedAudio;\n  Text: AnsiString;\n  ReferenceText: AnsiString;\n\nbegin\n  Tts := GetOfflineTts;\n\n  WriteLn('There are ', Tts.GetNumSpeakers, ' speakers');\n\n  Text := 
'小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.';\n  ReferenceText := '那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.';\n\n  WaveFilename := './sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav';\n  Wave := SherpaOnnxReadWave(WaveFilename);\n  GenerationConfig := Default(TSherpaOnnxGenerationConfig);\n  GenerationConfig.SilenceScale := 0.2;\n  GenerationConfig.Speed := 1.0;\n  GenerationConfig.Sid := 0;\n  GenerationConfig.ReferenceAudio := Wave.Samples;\n  GenerationConfig.ReferenceAudioLen := Length(Wave.Samples);\n  GenerationConfig.ReferenceSampleRate := Wave.SampleRate;\n  GenerationConfig.ReferenceText := ReferenceText;\n  GenerationConfig.NumSteps := 4;\n  GenerationConfig.Extra := '{\"min_char_in_sentence\": \"10\"}';\n\n  Audio := Tts.Generate(Text, GenerationConfig, @ProgressCallback, NIL);\n  SherpaOnnxWriteWave('./zipvoice-zh-en.wav', Audio.Samples, Audio.SampleRate);\n  WriteLn('Saved to ./zipvoice-zh-en.wav');\n\n  FreeAndNil(Tts);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad/.gitignore",
    "content": "!run*.sh\ncircular_buffer\nremove_silence\nremove_silence_ten_vad\n"
  },
  {
    "path": "pascal-api-examples/vad/README.md",
    "content": "# Introduction\n\n\nThis directory contains examples for how to use the VAD (voice activity detection)\nAPIs.\n\n|Directory| Description|\n|---------|------------|\n|[run-circular-buffer.sh](./run-circular-buffer.sh)|It shows how to use the circular buffer API.|\n|[run-remove-silence.sh](./run-remove-silence.sh)|It shows how to use the VAD API to remove silences from a wave file.|\n\n"
  },
  {
    "path": "pascal-api-examples/vad/circular_buffer.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\nprogram circular_buffer;\n{\nThis file shows how to use the CircularBuffer API of sherpa-onnx\n}\n\n{$mode objfpc}\n{$ASSERTIONS ON}\n\nuses\n  sherpa_onnx;\n\nvar\n  Buffer: TSherpaOnnxCircularBuffer;\n  Samples: TSherpaOnnxSamplesArray;\nbegin\n  {The initial capacity is 5. It will be resized automatically if needed.}\n  Buffer := TSherpaOnnxCircularBuffer.Create(5);\n  Assert(Buffer.Size = 0);\n  Assert(Buffer.Head = 0);\n  Buffer.Push([0, 10, 20]);\n\n  {Push() changes Size. Head is not changed.}\n  Assert(Buffer.Size = 3);\n  Assert(Buffer.Head = 0);\n\n  Samples := Buffer.Get(0, 1);\n  Assert(Length(Samples) = 1);\n  Assert(Samples[0] = 0);\n\n  { Get() does not change Size or Head}\n  Assert(Buffer.Size = 3);\n  Assert(Buffer.Head = 0);\n\n  Samples := Buffer.Get(0, 2);\n  Assert(Length(Samples) = 2);\n  Assert(Samples[0] = 0);\n  Assert(Samples[1] = 10);\n\n  { The buffer will be resized since its initial capacity is 5 but we have\n    pushed 7 elements into it.\n\n    No data is lost during the resize.\n  }\n  Buffer.Push([30, 40, 50, 60]);\n\n  Assert(Buffer.Size = 7); {There are now 7 elements}\n  Assert(Buffer.Head = 0);\n\n  {Remove the first 4 elements}\n  Buffer.Pop(4);\n\n  Assert(Buffer.Size = 3); {There are only 3 elements left}\n  Assert(Buffer.Head = 4);\n\n  Samples := Buffer.Get(Buffer.Head, 2);\n  Assert(Length(Samples) = 2);\n  Assert(Samples[0] = 40);\n  Assert(Samples[1] = 50);\n\n  Buffer.Pop(1);\n\n  Assert(Buffer.Size = 2); {There are only 2 elements left}\n  Assert(Buffer.Head = 5);\n\n  Samples := Buffer.Get(Buffer.Head, 2);\n  Assert(Length(Samples) = 2);\n  Assert(Samples[0] = 50);\n  Assert(Samples[1] = 60);\n\n  Buffer.Pop(2);\n  Assert(Buffer.Size = 0); {There are no elements left}\n  Assert(Buffer.Head = 7);\n\n  Buffer.Push([100, 200, 300, 400, 500]);\n  Assert(Buffer.Size = 5);\n  Assert(Buffer.Head = 7);\n\n  Buffer.Pop(4);\n  Assert(Buffer.Size = 1);\n\n  {Head can 
be larger than the Capacity!\n   This is what circular means. The underlying slot is Buffer.Head mod Capacity.\n  }\n  Assert(Buffer.Head = 11);\n  Buffer.Push([600, 700]);\n\n  Assert(Buffer.Size = 3);\n  Assert(Buffer.Head = 11);\n\n  Samples := Buffer.Get(Buffer.Head, 3);\n  Assert(Length(Samples) = 3);\n  Assert(Samples[0] = 500);\n  Assert(Samples[1] = 600);\n  Assert(Samples[2] = 700);\n\n  Buffer.Pop(3);\n  Assert(Buffer.Size = 0);\n  Assert(Buffer.Head = 14);\n\n  Buffer.Reset();\n\n  Assert(Buffer.Size = 0);\n  Assert(Buffer.Head = 0);\nend.\n\n"
  },
  {
    "path": "pascal-api-examples/vad/remove_silence.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n{\nThis file shows how to use the VAD API from sherpa-onnx\nto remove silences from a wave file with silero-vad.\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Config: TSherpaOnnxVadModelConfig;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n  SampleRate: Integer;\n\n  AllSpeechSegment: array of TSherpaOnnxSpeechSegment;\n  AllSamples: array of Single;\n  N: Integer;\n  I: Integer;\nbegin\n  SampleRate := 16000; {Please don't change it unless you know the details}\n\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  if Wave.SampleRate <> SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := 512; {Please don't change it unless you know the details}\n  Initialize(Config);\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.25;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Vad := TSherpaOnnxVoiceActivityDetector.Create(Config, 20);\n\n  AllSpeechSegment := nil;\n  AllSamples := nil;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Inc(Offset, WindowSize);\n\n      while not Vad.IsEmpty do\n        begin\n          SetLength(AllSpeechSegment, Length(AllSpeechSegment) + 1);\n\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          AllSpeechSegment[Length(AllSpeechSegment)-1] := SpeechSegment;\n\n          Start := SpeechSegment.Start / 
SampleRate;\n          Duration := Length(SpeechSegment.Samples) / SampleRate;\n          WriteLn(Format('%.3f -- %.3f', [Start, Start + Duration]));\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SetLength(AllSpeechSegment, Length(AllSpeechSegment) + 1);\n\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      AllSpeechSegment[Length(AllSpeechSegment)-1] := SpeechSegment;\n\n      Start := SpeechSegment.Start / SampleRate;\n      Duration := Length(SpeechSegment.Samples) / SampleRate;\n      WriteLn(Format('%.3f -- %.3f', [Start, Start + Duration]));\n    end;\n\n  N := 0;\n  for SpeechSegment in AllSpeechSegment do\n    Inc(N, Length(SpeechSegment.Samples));\n\n  SetLength(AllSamples, N);\n\n  N := 0;\n  for SpeechSegment in AllSpeechSegment do\n    begin\n      for I := Low(SpeechSegment.Samples) to High(SpeechSegment.Samples) do\n        begin\n          AllSamples[N] := SpeechSegment.Samples[I];\n          Inc(N);\n        end;\n    end;\n\n  SherpaOnnxWriteWave('./lei-jun-test-no-silence.wav', AllSamples, SampleRate);\n  WriteLn('Saved to ./lei-jun-test-no-silence.wav');\n\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad/remove_silence_ten_vad.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n{\nThis file shows how to use the VAD API from sherpa-onnx\nto remove silences from a wave file with ten-vad.\n}\nprogram main;\n\n{$mode delphi}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Config: TSherpaOnnxVadModelConfig;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n  SampleRate: Integer;\n\n  AllSpeechSegment: array of TSherpaOnnxSpeechSegment;\n  AllSamples: array of Single;\n  N: Integer;\n  I: Integer;\nbegin\n  SampleRate := 16000; {Please don't change it unless you know the details}\n\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  if Wave.SampleRate <> SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := 256; {Please don't change it unless you know the details}\n  Initialize(Config);\n\n  Config.TenVad.Model := './ten-vad.onnx';\n  Config.TenVad.MinSpeechDuration := 0.25;\n  Config.TenVad.MinSilenceDuration := 0.5;\n  Config.TenVad.Threshold := 0.25;\n  Config.TenVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Vad := TSherpaOnnxVoiceActivityDetector.Create(Config, 20);\n\n  AllSpeechSegment := nil;\n  AllSamples := nil;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Inc(Offset, WindowSize);\n\n      while not Vad.IsEmpty do\n        begin\n          SetLength(AllSpeechSegment, Length(AllSpeechSegment) + 1);\n\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          AllSpeechSegment[Length(AllSpeechSegment)-1] := SpeechSegment;\n\n          Start := SpeechSegment.Start / SampleRate;\n          
Duration := Length(SpeechSegment.Samples) / SampleRate;\n          WriteLn(Format('%.3f -- %.3f', [Start, Start + Duration]));\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SetLength(AllSpeechSegment, Length(AllSpeechSegment) + 1);\n\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      AllSpeechSegment[Length(AllSpeechSegment)-1] := SpeechSegment;\n\n      Start := SpeechSegment.Start / SampleRate;\n      Duration := Length(SpeechSegment.Samples) / SampleRate;\n      WriteLn(Format('%.3f -- %.3f', [Start, Start + Duration]));\n    end;\n\n  N := 0;\n  for SpeechSegment in AllSpeechSegment do\n    Inc(N, Length(SpeechSegment.Samples));\n\n  SetLength(AllSamples, N);\n\n  N := 0;\n  for SpeechSegment in AllSpeechSegment do\n    begin\n      for I := Low(SpeechSegment.Samples) to High(SpeechSegment.Samples) do\n        begin\n          AllSamples[N] := SpeechSegment.Samples[I];\n          Inc(N);\n        end;\n    end;\n\n  SherpaOnnxWriteWave('./lei-jun-test-no-silence-ten-vad.wav', AllSamples, SampleRate);\n  WriteLn('Saved to ./lei-jun-test-no-silence-ten-vad.wav');\n\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad/run-circular-buffer.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./circular_buffer.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./circular_buffer\n"
  },
  {
    "path": "pascal-api-examples/vad/run-remove-silence-ten-vad.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./ten-vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./remove_silence_ten_vad.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./remove_silence_ten_vad\n"
  },
  {
    "path": "pascal-api-examples/vad/run-remove-silence.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./remove_silence.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./remove_silence\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/.gitignore",
    "content": "!run-*.sh\nvad_with_whisper\nvad_with_sense_voice\nvad_with_moonshine\nvad_with_zipformer_ctc\nvad_with_dolphin\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/README.md",
    "content": "# Introduction\n\n\nThis directory contains examples for how to use the VAD (voice activity detection)\nwith non-streaming speech recognition models.\n\n|Directory| Description|\n|---------|------------|\n|[run-vad-with-dolphin-ctc.sh](./run-vad-with-dolphin-ctc.sh)|It shows how to use the VAD + [Dolphin](https://github.com/DataoceanAI/Dolphin) for speech recognition.|\n|[run-vad-with-whisper.sh](./run-vad-with-whisper.sh)|It shows how to use the VAD + [Whisper](https://github.com/openai/whisper) for speech recognition.|\n|[run-vad-with-sense-voice.sh](./run-vad-with-sense-voice.sh)|It shows how to use the VAD + [SenseVoice](https://github.com/FunAudioLLM/SenseVoice) for speech recognition.|\n|[run-vad-with-moonshine.sh](./run-vad-with-moonshine.sh)|It shows how to use the VAD + [Moonshine](https://github.com/usefulsensors/moonshine) for speech recognition.|\n\n\nPlease refer to [non-streaming-asr](../non-streaming-asr) for more kinds of non-streaming models.\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/run-vad-with-dolphin-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./vad_with_dolphin.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./vad_with_dolphin\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/run-vad-with-moonshine.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./vad_with_moonshine.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./vad_with_moonshine\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/run-vad-with-sense-voice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./vad_with_sense_voice.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./vad_with_sense_voice\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/run-vad-with-whisper.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./Obama.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav\nfi\n\nif [ ! -f ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./vad_with_whisper.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./vad_with_whisper\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/run-vad-with-zipformer-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\n\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nif [[ ! -f ../../build/install/lib/libsherpa-onnx-c-api.dylib  && ! -f ../../build/install/lib/libsherpa-onnx-c-api.so && ! -f ../../build/install/lib/sherpa-onnx-c-api.dll ]]; then\n  mkdir -p ../../build\n  pushd ../../build\n  cmake \\\n    -DCMAKE_INSTALL_PREFIX=./install \\\n    -DSHERPA_ONNX_ENABLE_PYTHON=OFF \\\n    -DSHERPA_ONNX_ENABLE_TESTS=OFF \\\n    -DSHERPA_ONNX_ENABLE_CHECK=OFF \\\n    -DBUILD_SHARED_LIBS=ON \\\n    -DSHERPA_ONNX_ENABLE_PORTAUDIO=OFF \\\n    ..\n\n  cmake --build . --target install --config Release\n  popd\nfi\n\nif [[ ! -f ./silero_vad.onnx ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\nfi\n\nfpc \\\n  -dSHERPA_ONNX_USE_SHARED_LIBS \\\n  -Fu$SHERPA_ONNX_DIR/sherpa-onnx/pascal-api \\\n  -Fl$SHERPA_ONNX_DIR/build/install/lib \\\n  ./vad_with_zipformer_ctc.pas\n\nexport LD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$LD_LIBRARY_PATH\nexport DYLD_LIBRARY_PATH=$SHERPA_ONNX_DIR/build/install/lib:$DYLD_LIBRARY_PATH\n\n./vad_with_zipformer_ctc\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/vad_with_dolphin.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Dolphin model\nwith silero VAD to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram vad_with_dolphin;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nfunction CreateVad(): TSherpaOnnxVoiceActivityDetector;\nvar\n  Config: TSherpaOnnxVadModelConfig;\n\n  SampleRate: Integer;\n  WindowSize: Integer;\nbegin\n  Initialize(Config);\n\n  SampleRate := 16000; {Please don't change it unless you know the details}\n  WindowSize := 512; {Please don't change it unless you know the details}\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.5;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Result := TSherpaOnnxVoiceActivityDetector.Create(Config, 30);\nend;\n\nfunction CreateOfflineRecognizer(): TSherpaOnnxOfflineRecognizer;\nvar\n  Config: TSherpaOnnxOfflineRecognizerConfig;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Dolphin.Model := './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  Result := TSherpaOnnxOfflineRecognizer.Create(Config);\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: 
TSherpaOnnxOfflineRecognizerResult;\nbegin\n  Vad := CreateVad();\n  Recognizer := CreateOfflineRecognizer();\n\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  if Wave.SampleRate <> Vad.Config.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [Vad.Config.SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := Vad.Config.SileroVad.WindowSize;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Offset += WindowSize;\n\n      while not Vad.IsEmpty do\n        begin\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          Stream := Recognizer.CreateStream();\n\n          Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n          Recognizer.Decode(Stream);\n          RecognitionResult := Recognizer.GetResult(Stream);\n\n          Start := SpeechSegment.Start / Wave.SampleRate;\n          Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n          WriteLn(Format('%.3f -- %.3f %s',\n            [Start, Start + Duration, RecognitionResult.Text]));\n\n          FreeAndNil(Stream);\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      Stream := Recognizer.CreateStream();\n\n      Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n      Recognizer.Decode(Stream);\n      RecognitionResult := Recognizer.GetResult(Stream);\n\n      Start := SpeechSegment.Start / Wave.SampleRate;\n      Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n      WriteLn(Format('%.3f -- %.3f %s',\n        [Start, Start + Duration, RecognitionResult.Text]));\n\n      FreeAndNil(Stream);\n    end;\n\n  FreeAndNil(Recognizer);\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/vad_with_moonshine.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Moonshine model\nwith silero VAD to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram vad_with_moonshine;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nfunction CreateVad(): TSherpaOnnxVoiceActivityDetector;\nvar\n  Config: TSherpaOnnxVadModelConfig;\n\n  SampleRate: Integer;\n  WindowSize: Integer;\nbegin\n  Initialize(Config);\n\n  SampleRate := 16000; {Please don't change it unless you know the details}\n  WindowSize := 512; {Please don't change it unless you know the details}\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.5;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Result := TSherpaOnnxVoiceActivityDetector.Create(Config, 30);\nend;\n\nfunction CreateOfflineRecognizer(): TSherpaOnnxOfflineRecognizer;\nvar\n  Config: TSherpaOnnxOfflineRecognizerConfig;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Moonshine.Preprocessor := './sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx';\n  Config.ModelConfig.Moonshine.Encoder := './sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx';\n  Config.ModelConfig.Moonshine.UncachedDecoder := './sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx';\n  Config.ModelConfig.Moonshine.CachedDecoder := './sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx';\n\n  Config.ModelConfig.Tokens := './sherpa-onnx-moonshine-tiny-en-int8/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  Result := TSherpaOnnxOfflineRecognizer.Create(Config);\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  
Recognizer: TSherpaOnnxOfflineRecognizer;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\nbegin\n  Vad := CreateVad();\n  Recognizer := CreateOfflineRecognizer();\n\n  Wave := SherpaOnnxReadWave('./Obama.wav');\n  if Wave.SampleRate <> Vad.Config.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [Vad.Config.SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := Vad.Config.SileroVad.WindowSize;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Offset += WindowSize;\n\n      while not Vad.IsEmpty do\n        begin\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          Stream := Recognizer.CreateStream();\n\n          Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n          Recognizer.Decode(Stream);\n          RecognitionResult := Recognizer.GetResult(Stream);\n\n          Start := SpeechSegment.Start / Wave.SampleRate;\n          Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n          WriteLn(Format('%.3f -- %.3f %s',\n            [Start, Start + Duration, RecognitionResult.Text]));\n\n          FreeAndNil(Stream);\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      Stream := Recognizer.CreateStream();\n\n      Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n      Recognizer.Decode(Stream);\n      RecognitionResult := Recognizer.GetResult(Stream);\n\n      Start := SpeechSegment.Start / Wave.SampleRate;\n      Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n      WriteLn(Format('%.3f -- %.3f %s',\n        
[Start, Start + Duration, RecognitionResult.Text]));\n\n      FreeAndNil(Stream);\n    end;\n\n  FreeAndNil(Recognizer);\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/vad_with_sense_voice.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming SenseVoice model\nwith silero VAD to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram vad_with_sense_voice;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nfunction CreateVad(): TSherpaOnnxVoiceActivityDetector;\nvar\n  Config: TSherpaOnnxVadModelConfig;\n\n  SampleRate: Integer;\n  WindowSize: Integer;\nbegin\n  Initialize(Config);\n\n  SampleRate := 16000; {Please don't change it unless you know the details}\n  WindowSize := 512; {Please don't change it unless you know the details}\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.5;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Result := TSherpaOnnxVoiceActivityDetector.Create(Config, 30);\nend;\n\nfunction CreateOfflineRecognizer(): TSherpaOnnxOfflineRecognizer;\nvar\n  Config: TSherpaOnnxOfflineRecognizerConfig;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.SenseVoice.Model := './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx';\n  Config.ModelConfig.SenseVoice.Language := 'auto';\n  Config.ModelConfig.SenseVoice.UseItn := False;\n  Config.ModelConfig.Tokens := './sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  Result := TSherpaOnnxOfflineRecognizer.Create(Config);\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  
Duration: Single;\n\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\nbegin\n  Vad := CreateVad();\n  Recognizer := CreateOfflineRecognizer();\n\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  if Wave.SampleRate <> Vad.Config.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [Vad.Config.SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := Vad.Config.SileroVad.WindowSize;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Offset += WindowSize;\n\n      while not Vad.IsEmpty do\n        begin\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          Stream := Recognizer.CreateStream();\n\n          Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n          Recognizer.Decode(Stream);\n          RecognitionResult := Recognizer.GetResult(Stream);\n\n          Start := SpeechSegment.Start / Wave.SampleRate;\n          Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n          WriteLn(Format('%.3f -- %.3f %s',\n            [Start, Start + Duration, RecognitionResult.Text]));\n\n          FreeAndNil(Stream);\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      Stream := Recognizer.CreateStream();\n\n      Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n      Recognizer.Decode(Stream);\n      RecognitionResult := Recognizer.GetResult(Stream);\n\n      Start := SpeechSegment.Start / Wave.SampleRate;\n      Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n      WriteLn(Format('%.3f -- %.3f %s',\n        [Start, Start + Duration, RecognitionResult.Text]));\n\n      FreeAndNil(Stream);\n    end;\n\n  FreeAndNil(Recognizer);\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/vad_with_whisper.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Whisper model\nwith silero VAD to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram vad_with_whisper;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nfunction CreateVad(): TSherpaOnnxVoiceActivityDetector;\nvar\n  Config: TSherpaOnnxVadModelConfig;\n\n  SampleRate: Integer;\n  WindowSize: Integer;\nbegin\n  Initialize(Config);\n\n  SampleRate := 16000; {Please don't change it unless you know the details}\n  WindowSize := 512; {Please don't change it unless you know the details}\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.5;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Result := TSherpaOnnxVoiceActivityDetector.Create(Config, 30);\nend;\n\nfunction CreateOfflineRecognizer(): TSherpaOnnxOfflineRecognizer;\nvar\n  Config: TSherpaOnnxOfflineRecognizerConfig;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.Whisper.Encoder := './sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx';\n  Config.ModelConfig.Whisper.Decoder := './sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  Result := TSherpaOnnxOfflineRecognizer.Create(Config);\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n\n  Stream: 
TSherpaOnnxOfflineStream;\n  RecognitionResult: TSherpaOnnxOfflineRecognizerResult;\nbegin\n  Vad := CreateVad();\n  Recognizer := CreateOfflineRecognizer();\n\n  Wave := SherpaOnnxReadWave('./Obama.wav');\n  if Wave.SampleRate <> Vad.Config.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [Vad.Config.SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := Vad.Config.SileroVad.WindowSize;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Offset += WindowSize;\n\n      while not Vad.IsEmpty do\n        begin\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          Stream := Recognizer.CreateStream();\n\n          Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n          Recognizer.Decode(Stream);\n          RecognitionResult := Recognizer.GetResult(Stream);\n\n          Start := SpeechSegment.Start / Wave.SampleRate;\n          Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n          WriteLn(Format('%.3f -- %.3f %s',\n            [Start, Start + Duration, RecognitionResult.Text]));\n\n          FreeAndNil(Stream);\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      Stream := Recognizer.CreateStream();\n\n      Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n      Recognizer.Decode(Stream);\n      RecognitionResult := Recognizer.GetResult(Stream);\n\n      Start := SpeechSegment.Start / Wave.SampleRate;\n      Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n      WriteLn(Format('%.3f -- %.3f %s',\n        [Start, Start + Duration, RecognitionResult.Text]));\n\n      FreeAndNil(Stream);\n    end;\n\n  FreeAndNil(Recognizer);\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pascal-api-examples/vad-with-non-streaming-asr/vad_with_zipformer_ctc.pas",
    "content": "{ Copyright (c)  2025  Xiaomi Corporation }\n\n{\nThis file shows how to use a non-streaming Zipformer CTC model\nwith silero VAD to decode files.\n\nYou can download the model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n}\n\nprogram vad_with_zipformer_ctc;\n\n{$mode objfpc}\n\nuses\n  sherpa_onnx,\n  SysUtils;\n\nfunction CreateVad(): TSherpaOnnxVoiceActivityDetector;\nvar\n  Config: TSherpaOnnxVadModelConfig;\n\n  SampleRate: Integer;\n  WindowSize: Integer;\nbegin\n  Initialize(Config);\n\n  SampleRate := 16000; {Please don't change it unless you know the details}\n  WindowSize := 512; {Please don't change it unless you know the details}\n\n  Config.SileroVad.Model := './silero_vad.onnx';\n  Config.SileroVad.MinSpeechDuration := 0.5;\n  Config.SileroVad.MinSilenceDuration := 0.5;\n  Config.SileroVad.Threshold := 0.5;\n  Config.SileroVad.WindowSize := WindowSize;\n  Config.NumThreads:= 1;\n  Config.Debug:= True;\n  Config.Provider:= 'cpu';\n  Config.SampleRate := SampleRate;\n\n  Result := TSherpaOnnxVoiceActivityDetector.Create(Config, 30);\nend;\n\nfunction CreateOfflineRecognizer(): TSherpaOnnxOfflineRecognizer;\nvar\n  Config: TSherpaOnnxOfflineRecognizerConfig;\nbegin\n  Initialize(Config);\n\n  Config.ModelConfig.ZipformerCtc.Model := './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx';\n  Config.ModelConfig.Tokens := './sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt';\n  Config.ModelConfig.Provider := 'cpu';\n  Config.ModelConfig.NumThreads := 1;\n  Config.ModelConfig.Debug := False;\n\n  Result := TSherpaOnnxOfflineRecognizer.Create(Config);\nend;\n\nvar\n  Wave: TSherpaOnnxWave;\n\n  Recognizer: TSherpaOnnxOfflineRecognizer;\n  Vad: TSherpaOnnxVoiceActivityDetector;\n\n  Offset: Integer;\n  WindowSize: Integer;\n  SpeechSegment: TSherpaOnnxSpeechSegment;\n\n  Start: Single;\n  Duration: Single;\n\n  Stream: TSherpaOnnxOfflineStream;\n  RecognitionResult: 
TSherpaOnnxOfflineRecognizerResult;\nbegin\n  Vad := CreateVad();\n  Recognizer := CreateOfflineRecognizer();\n\n  Wave := SherpaOnnxReadWave('./lei-jun-test.wav');\n  if Wave.SampleRate <> Vad.Config.SampleRate then\n    begin\n      WriteLn(Format('Expected sample rate: %d. Given: %d',\n        [Vad.Config.SampleRate, Wave.SampleRate]));\n\n      Exit;\n    end;\n\n  WindowSize := Vad.Config.SileroVad.WindowSize;\n  Offset := 0;\n  while Offset + WindowSize <= Length(Wave.Samples) do\n    begin\n      Vad.AcceptWaveform(Wave.Samples, Offset, WindowSize);\n      Offset += WindowSize;\n\n      while not Vad.IsEmpty do\n        begin\n          SpeechSegment := Vad.Front();\n          Vad.Pop();\n          Stream := Recognizer.CreateStream();\n\n          Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n          Recognizer.Decode(Stream);\n          RecognitionResult := Recognizer.GetResult(Stream);\n\n          Start := SpeechSegment.Start / Wave.SampleRate;\n          Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n          WriteLn(Format('%.3f -- %.3f %s',\n            [Start, Start + Duration, RecognitionResult.Text]));\n\n          FreeAndNil(Stream);\n        end;\n    end;\n\n  Vad.Flush;\n\n  while not Vad.IsEmpty do\n    begin\n      SpeechSegment := Vad.Front();\n      Vad.Pop();\n      Stream := Recognizer.CreateStream();\n\n      Stream.AcceptWaveform(SpeechSegment.Samples, Wave.SampleRate);\n      Recognizer.Decode(Stream);\n      RecognitionResult := Recognizer.GetResult(Stream);\n\n      Start := SpeechSegment.Start / Wave.SampleRate;\n      Duration := Length(SpeechSegment.Samples) / Wave.SampleRate;\n      WriteLn(Format('%.3f -- %.3f %s',\n        [Start, Start + Duration, RecognitionResult.Text]));\n\n      FreeAndNil(Stream);\n    end;\n\n  FreeAndNil(Recognizer);\n  FreeAndNil(Vad);\nend.\n"
  },
  {
    "path": "pom.xml",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<project xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd\" xmlns=\"http://maven.apache.org/POM/4.0.0\"\n    xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">\n    <modelVersion>4.0.0</modelVersion>\n    <groupId>com.k2fsa.sherpa.onnx</groupId>\n    <artifactId>sherpa-onnx-android</artifactId>\n    <version>1.12.31</version>\n    <url>https://github.com/k2-fsa/sherpa-onnx</url>\n    <packaging>pom</packaging>\n    <description>First Android Library</description>\n\n    <licenses>\n      <license>\n        <name>The Apache Software License, Version 2.0</name>\n        <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>\n        <distribution>repo</distribution>\n      </license>\n    </licenses>\n</project>\n"
  },
  {
    "path": "python-api-examples/README.md",
    "content": "# File description\n\n- [./http_server.py](./http_server.py) It defines which files to serve.\n  Files are saved in [./web](./web).\n- [non_streaming_server.py](./non_streaming_server.py) WebSocket server for\n  non-streaming models.\n- [vad-remove-non-speech-segments.py](./vad-remove-non-speech-segments.py) It uses\n  [silero-vad](https://github.com/snakers4/silero-vad) to remove non-speech\n  segments and concatenate all speech segments into a single one.\n- [vad-with-non-streaming-asr.py](./vad-with-non-streaming-asr.py) It shows\n  how to use VAD with a non-streaming ASR model for speech recognition from\n  a microphone.\n- [offline-speech-enhancement-gtcrn.py](./offline-speech-enhancement-gtcrn.py)\n  It shows how to use the offline speech denoiser API with GTCRN.\n\n- [offline-speech-enhancement-dpdfnet.py](./offline-speech-enhancement-dpdfnet.py)\n  It shows how to use the offline speech denoiser API with DPDFNet.\n\n- [online-speech-enhancement-gtcrn.py](./online-speech-enhancement-gtcrn.py)\n  It shows how to use the online speech denoiser API with GTCRN.\n\n- [online-speech-enhancement-dpdfnet.py](./online-speech-enhancement-dpdfnet.py)\n  It shows how to use the online speech denoiser API with DPDFNet\n  models. Use 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`,\n  `dpdfnet2.onnx`, `dpdfnet4.onnx`, or `dpdfnet8.onnx` for downstream ASR and\n  `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.\n\n- [pocket-tts.py](./pocket-tts.py) It shows how to use PocketTTS with the\n  `GenerationConfig` API.\n\n- [supertonic-tts.py](./supertonic-tts.py) It shows how to use SupertonicTTS\n  with the `GenerationConfig` API.\n\n- [zipvoice-tts.py](./zipvoice-tts.py) It shows how to use ZipVoice for\n  zero-shot TTS with the `GenerationConfig` API.\n\n- [zipvoice-tts-play.py](./zipvoice-tts-play.py) It shows how to use ZipVoice\n  for zero-shot TTS and plays the generated audio while it is being synthesized.\n"
  },
  {
    "path": "python-api-examples/add-punctuation-online.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to add punctuations to text using sherpa-onnx Python API.\n\nPlease download the model from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n\nThe following is an example\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\ntar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nrm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\n\n\ndef main():\n    model = \"./sherpa-onnx-online-punct-en-2024-08-06/model.onnx\"\n    bpe = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\"\n    if not Path(model).is_file():\n        raise ValueError(f\"{model} does not exist\")\n    if not Path(bpe).is_file():\n        raise ValueError(f\"{bpe} does not exist\")\n\n    model_config = sherpa_onnx.OnlinePunctuationModelConfig(\n        cnn_bilstm=model, bpe_vocab=bpe\n    )\n    config = sherpa_onnx.OnlinePunctuationConfig(model_config=model_config)\n    punct = sherpa_onnx.OnlinePunctuation(config)\n\n    texts = [\n        \"how are you i am fine thank you\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n    ]\n    for text in texts:\n        text_with_punct = punct.add_punctuation_with_case(text)\n        print(\"----------\")\n        print(f\"input : {text}\")\n        print(f\"output: {text_with_punct}\")\n    print(\"----------\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/add-punctuation.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to add punctuations to text using sherpa-onnx Python API.\n\nPlease download the model from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models\n\nThe following is an example\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\ntar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\n\n\ndef main():\n    model = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\"\n    if not Path(model).is_file():\n        raise ValueError(f\"{model} does not exist\")\n    config = sherpa_onnx.OfflinePunctuationConfig(\n        model=sherpa_onnx.OfflinePunctuationModelConfig(ct_transformer=model),\n    )\n\n    punct = sherpa_onnx.OfflinePunctuation(config)\n\n    text_list = [\n        \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n        \"我们都是木头人不会说话不会动\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n    ]\n    for text in text_list:\n        text_with_punct = punct.add_punctuation(text)\n        print(\"----------\")\n        print(f\"input: {text}\")\n        print(f\"output: {text_with_punct}\")\n\n    print(\"----------\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/audio-tagging-from-a-file-ced.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use audio tagging Python APIs to tag a file.\n\nPlease read the code to download the required model files and test wave file.\n\"\"\"\n\nimport logging\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef read_test_wave():\n    # Please download the model files and test wave files from\n    # https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    test_wave = \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/6.wav\"\n\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"Please download {test_wave} from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    # See https://python-soundfile.readthedocs.io/en/0.11.0/#soundfile.read\n    data, sample_rate = sf.read(\n        test_wave,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n\n    # samples is a 1-d array of dtype float32\n    # sample_rate is a scalar\n    return samples, sample_rate\n\n\ndef create_audio_tagger():\n    # Please download the model files and test wave files from\n    # https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    model_file = \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx\"\n    label_file = (\n        \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv\"\n    )\n\n    if not Path(model_file).is_file():\n        raise ValueError(\n            f\"Please download {model_file} from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    if not Path(label_file).is_file():\n        raise ValueError(\n            f\"Please download {label_file} from \"\n            
\"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    config = sherpa_onnx.AudioTaggingConfig(\n        model=sherpa_onnx.AudioTaggingModelConfig(\n            ced=model_file,\n            num_threads=1,\n            debug=True,\n            provider=\"cpu\",\n        ),\n        labels=label_file,\n        top_k=5,\n    )\n    if not config.validate():\n        raise ValueError(f\"Please check the config: {config}\")\n\n    print(config)\n\n    return sherpa_onnx.AudioTagging(config)\n\n\ndef main():\n    logging.info(\"Create audio tagger\")\n    audio_tagger = create_audio_tagger()\n\n    logging.info(\"Read test wave\")\n    samples, sample_rate = read_test_wave()\n\n    logging.info(\"Computing\")\n\n    start_time = time.time()\n\n    stream = audio_tagger.create_stream()\n    stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n    result = audio_tagger.compute(stream)\n    end_time = time.time()\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / sample_rate\n\n    real_time_factor = elapsed_seconds / audio_duration\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n    s = \"\\n\"\n    for i, e in enumerate(result):\n        s += f\"{i}: {e}\\n\"\n\n    logging.info(s)\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n\n    main()\n"
  },
  {
    "path": "python-api-examples/audio-tagging-from-a-file.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use audio tagging Python APIs to tag a file.\n\nPlease read the code to download the required model files and test wave file.\n\"\"\"\n\nimport logging\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef read_test_wave():\n    # Please download the model files and test wave files from\n    # https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    test_wave = \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/1.wav\"\n\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"Please download {test_wave} from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    # See https://python-soundfile.readthedocs.io/en/0.11.0/#soundfile.read\n    data, sample_rate = sf.read(\n        test_wave,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n\n    # samples is a 1-d array of dtype float32\n    # sample_rate is a scalar\n    return samples, sample_rate\n\n\ndef create_audio_tagger():\n    # Please download the model files and test wave files from\n    # https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    model_file = \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.onnx\"\n    label_file = (\n        \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv\"\n    )\n\n    if not Path(model_file).is_file():\n        raise ValueError(\n            f\"Please download {model_file} from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    if not Path(label_file).is_file():\n        raise ValueError(\n            f\"Please download {label_file} from \"\n            
\"https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\"\n        )\n\n    config = sherpa_onnx.AudioTaggingConfig(\n        model=sherpa_onnx.AudioTaggingModelConfig(\n            zipformer=sherpa_onnx.OfflineZipformerAudioTaggingModelConfig(\n                model=model_file,\n            ),\n            num_threads=1,\n            debug=True,\n            provider=\"cpu\",\n        ),\n        labels=label_file,\n        top_k=5,\n    )\n    if not config.validate():\n        raise ValueError(f\"Please check the config: {config}\")\n\n    print(config)\n\n    return sherpa_onnx.AudioTagging(config)\n\n\ndef main():\n    logging.info(\"Create audio tagger\")\n    audio_tagger = create_audio_tagger()\n\n    logging.info(\"Read test wave\")\n    samples, sample_rate = read_test_wave()\n\n    logging.info(\"Computing\")\n\n    start_time = time.time()\n\n    stream = audio_tagger.create_stream()\n    stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n    result = audio_tagger.compute(stream)\n    end_time = time.time()\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / sample_rate\n\n    real_time_factor = elapsed_seconds / audio_duration\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n    s = \"\\n\"\n    for i, e in enumerate(result):\n        s += f\"{i}: {e}\\n\"\n\n    logging.info(s)\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n\n    main()\n"
  },
  {
    "path": "python-api-examples/generate-subtitles.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python APIs to generate\nsubtitles.\n\nSupported file formats are those supported by ffmpeg; for instance,\n*.mov, *.mp4, *.wav, etc.\n\nNote that you need a non-streaming model for this script.\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nor download ten-vad.onnx, for instance\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n\nPlease replace --silero-vad-model with --ten-vad-model below to use ten-vad.\n\n(1) For paraformer\n\n    ./python-api-examples/generate-subtitles.py  \\\n      --silero-vad-model=/path/to/silero_vad.onnx \\\n      --tokens=/path/to/tokens.txt \\\n      --paraformer=/path/to/paraformer.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80 \\\n      /path/to/test.mp4\n\n(2) For transducer models from icefall\n\n    ./python-api-examples/generate-subtitles.py  \\\n      --silero-vad-model=/path/to/silero_vad.onnx \\\n      --tokens=/path/to/tokens.txt \\\n      --encoder=/path/to/encoder.onnx \\\n      --decoder=/path/to/decoder.onnx \\\n      --joiner=/path/to/joiner.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80 \\\n      /path/to/test.mp4\n\n(3) For Moonshine models\n\n./python-api-examples/generate-subtitles.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --moonshine-preprocessor=./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --moonshine-encoder=./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  
--moonshine-uncached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --moonshine-cached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens=./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \\\n  --num-threads=2 \\\n  /path/to/test.mp4\n\n(4) For Whisper models\n\n./python-api-examples/generate-subtitles.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n  --whisper-task=transcribe \\\n  --num-threads=2 \\\n  /path/to/test.mp4\n\n(5) For SenseVoice CTC models\n\n./python-api-examples/generate-subtitles.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --sense-voice=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --num-threads=2 \\\n  /path/to/test.mp4\n\n(6) For FireRedAsr models\n\n./python-api-examples/generate-subtitles.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \\\n  --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \\\n  --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \\\n  --num-threads=2 \\\n  /path/to/test.mp4\n\n(7) For WeNet CTC models\n\n./python-api-examples/generate-subtitles.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --wenet-ctc=./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx \\\n  --tokens=./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt \\\n  --num-threads=2 \\\n  /path/to/test.mp4\n\n(8) For NeMo Parakeet TDT models\n\n./python-api-examples/generate-subtitles.py  \\\n  
--silero-vad-model=./silero_vad.onnx \\\n  --encoder ./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx \\\n  --decoder ./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx \\\n  --joiner ./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx \\\n  --tokens ./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt \\\n  --model-type nemo_transducer \\\n  /path/to/test.mp4\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/index.html\nto install sherpa-onnx and to download non-streaming pre-trained models\nused in this file.\n\"\"\"\nimport argparse\nimport datetime as dt\nimport shutil\nimport subprocess\nimport sys\nfrom dataclasses import dataclass\nfrom datetime import timedelta\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        help=\"Path to silero_vad.onnx.\",\n    )\n\n    parser.add_argument(\n        \"--ten-vad-model\",\n        type=str,\n        help=\"Path to ten-vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--model-type\",\n        default=\"\",\n        type=str,\n        help=\"If using NeMo transducer models, please set it to nemo_transducer\",\n    )\n\n    parser.add_argument(\n        
\"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--sense-voice\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from SenseVoice\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the CTC model.onnx from WeNet\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--fire-red-asr-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to FireRedAsr encoder model\",\n    )\n\n    parser.add_argument(\n        \"--fire-red-asr-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to FireRedAsr decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input file.\n        Example values: en, fr, de, zh, jp.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in 
English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-preprocessor\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine preprocessor model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine encoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-uncached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine uncached decoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-cached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine cached decoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Valid values are greedy_search and modified_beam_search.\n        modified_beam_search is valid only for transducer models.\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--debug\",\n        # bool(\"false\") is True, so parse the text instead of using type=bool\n        type=lambda s: s.lower() in (\"true\", \"1\", \"yes\"),\n        default=False,\n        help=\"True to show debug messages when loading models.\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"\"\"Sample rate of the feature extractor. Must match the one\n        expected by the model. Note: The input sound files can have a\n        different sample rate from this argument.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--feature-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension. 
Must match the one expected by the model\",\n    )\n\n    parser.add_argument(\n        \"sound_file\",\n        type=str,\n        help=\"The input sound file to generate subtitles \",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            model_type=args.model_type,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n     
   )\n    elif args.paraformer:\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.sense_voice:\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        
assert_file_exists(args.sense_voice)\n        recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=args.sense_voice,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            use_itn=True,\n            debug=args.debug,\n        )\n    elif args.wenet_ctc:\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.wenet_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n            model=args.wenet_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.whisper_encoder:\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, 
args.moonshine_cached_decoder\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n        )\n    elif args.moonshine_preprocessor:\n        assert len(args.fire_red_asr_encoder) == 0, args.fire_red_asr_encoder\n        assert len(args.fire_red_asr_decoder) == 0, args.fire_red_asr_decoder\n        assert_file_exists(args.moonshine_preprocessor)\n        assert_file_exists(args.moonshine_encoder)\n        assert_file_exists(args.moonshine_uncached_decoder)\n        assert_file_exists(args.moonshine_cached_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_moonshine(\n            preprocessor=args.moonshine_preprocessor,\n            encoder=args.moonshine_encoder,\n            uncached_decoder=args.moonshine_uncached_decoder,\n            cached_decoder=args.moonshine_cached_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.fire_red_asr_encoder:\n        recognizer = sherpa_onnx.OfflineRecognizer.from_fire_red_asr(\n            encoder=args.fire_red_asr_encoder,\n            decoder=args.fire_red_asr_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model\")\n\n    return recognizer\n\n\n@dataclass\nclass Segment:\n    start: float\n    duration: float\n    text: str = \"\"\n\n    @property\n    def end(self):\n        return 
self.start + self.duration\n\n    def __str__(self):\n        s = f\"{timedelta(seconds=self.start)}\"[:-3]\n        s += \" --> \"\n        s += f\"{timedelta(seconds=self.end)}\"[:-3]\n        s = s.replace(\".\", \",\")\n        s += \"\\n\"\n        s += self.text\n        return s\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.tokens)\n    if args.silero_vad_model:\n        assert_file_exists(args.silero_vad_model)\n    elif args.ten_vad_model:\n        assert_file_exists(args.ten_vad_model)\n    else:\n        raise ValueError(\"You need to supply one vad model\")\n\n    assert args.num_threads > 0, args.num_threads\n\n    if not Path(args.sound_file).is_file():\n        raise ValueError(f\"{args.sound_file} does not exist\")\n\n    assert (\n        args.sample_rate == 16000\n    ), f\"Only sample rate 16000 is supported. Given: {args.sample_rate}\"\n\n    recognizer = create_recognizer(args)\n\n    ffmpeg_cmd = [\n        \"ffmpeg\",\n        \"-i\",\n        args.sound_file,\n        \"-f\",\n        \"s16le\",\n        \"-acodec\",\n        \"pcm_s16le\",\n        \"-ac\",\n        \"1\",\n        \"-ar\",\n        str(args.sample_rate),\n        \"-\",\n    ]\n\n    process = subprocess.Popen(\n        ffmpeg_cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL\n    )\n\n    frames_per_read = int(args.sample_rate * 100)  # 100 seconds\n\n    stream = recognizer.create_stream()\n\n    config = sherpa_onnx.VadModelConfig()\n    if args.silero_vad_model:\n        config.silero_vad.model = args.silero_vad_model\n        config.silero_vad.threshold = 0.2\n        config.silero_vad.min_silence_duration = 0.25  # seconds\n        config.silero_vad.min_speech_duration = 0.25  # seconds\n\n        # If the current segment is larger than this value, then it increases\n        # the threshold to 0.9 internally. 
After detecting this segment,\n        # it resets the threshold to its original value.\n        config.silero_vad.max_speech_duration = 5  # seconds\n        config.sample_rate = args.sample_rate\n\n        window_size = config.silero_vad.window_size\n        print(\"use silero-vad\")\n    else:\n        config.ten_vad.model = args.ten_vad_model\n        config.ten_vad.threshold = 0.2\n        config.ten_vad.min_silence_duration = 0.25  # seconds\n        config.ten_vad.min_speech_duration = 0.25  # seconds\n\n        # If the current segment is larger than this value, then it increases\n        # the threshold to 0.9 internally. After detecting this segment,\n        # it resets the threshold to its original value.\n        config.ten_vad.max_speech_duration = 5  # seconds\n        config.sample_rate = args.sample_rate\n\n        window_size = config.ten_vad.window_size\n        print(\"use ten-vad\")\n\n    buffer = []\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=100)\n\n    segment_list = []\n\n    print(\"Started!\")\n    start_t = dt.datetime.now()\n    num_processed_samples = 0\n\n    is_eof = False\n    # TODO(fangjun): Support multithreads\n    while not is_eof:\n        # *2 because int16_t has two bytes\n        data = process.stdout.read(frames_per_read * 2)\n        if not data:\n            vad.flush()\n            is_eof = True\n        else:\n            samples = np.frombuffer(data, dtype=np.int16)\n            samples = samples.astype(np.float32) / 32768\n\n            num_processed_samples += samples.shape[0]\n\n            buffer = np.concatenate([buffer, samples])\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n\n                if False:\n                    # If you want to process the speech segment as soon as\n                    # speech is detected, you can use\n                    current_segment = 
vad.current_segment\n                    if len(current_segment.samples) > 0:\n                        print(\n                            f\"speech starts at {current_segment.start/16000} seconds: \",\n                            f\"duration {len(current_segment.samples)/16000} seconds\",\n                        )\n\n        streams = []\n        segments = []\n        while not vad.empty():\n            segment = Segment(\n                start=vad.front.start / args.sample_rate,\n                duration=len(vad.front.samples) / args.sample_rate,\n            )\n            segments.append(segment)\n\n            stream = recognizer.create_stream()\n            stream.accept_waveform(args.sample_rate, vad.front.samples)\n\n            streams.append(stream)\n\n            vad.pop()\n\n        for s in streams:\n            recognizer.decode_stream(s)\n\n        for seg, stream in zip(segments, streams):\n            seg.text = stream.result.text\n            if seg.text in (\".\", \"The.\"):\n                continue\n            segment_list.append(seg)\n\n    end_t = dt.datetime.now()\n    elapsed_seconds = (end_t - start_t).total_seconds()\n    duration = num_processed_samples / 16000\n    rtf = elapsed_seconds / duration\n\n    srt_filename = Path(args.sound_file).with_suffix(\".srt\")\n    with open(srt_filename, \"w\", encoding=\"utf-8\") as f:\n        for i, seg in enumerate(segment_list):\n            print(i + 1, file=f)\n            print(seg, file=f)\n            print(\"\", file=f)\n\n    print(f\"Saved to {srt_filename}\")\n    print(f\"Audio duration:\\t{duration:.3f} s\")\n    print(f\"Elapsed:\\t{elapsed_seconds:.3f} s\")\n    print(f\"RTF = {elapsed_seconds:.3f}/{duration:.3f} = {rtf:.3f}\")\n    print(\"Done!\")\n\n\nif __name__ == \"__main__\":\n    if shutil.which(\"ffmpeg\") is None:\n        sys.exit(\"Please install ffmpeg first!\")\n    main()\n"
  },
  {
    "path": "python-api-examples/http_server.py",
    "content": "# Copyright      2022  Xiaomi Corp.        (authors: Fangjun Kuang)\n#\n# See ../../../LICENSE for clarification regarding multiple authors\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\nfrom typing import Tuple\n\n# Please sort it alphabetically\n_static_files = (\n    (\"/css/bootstrap.min.css\", \"text/css\"),\n    (\"/css/bootstrap.min.css.map\", \"text/css\"),\n    (\"/index.html\", \"text/html\"),\n    (\"/js/bootstrap.min.js\", \"application/javascript\"),\n    (\"/js/bootstrap.min.js.map\", \"application/javascript\"),\n    (\"/js/jquery-3.6.0.min.js\", \"application/javascript\"),\n    (\"/js/offline_record.js\", \"application/javascript\"),\n    (\"/js/offline_record.js\", \"application/javascript\"),\n    (\"/js/popper.min.js\", \"application/javascript\"),\n    (\"/js/popper.min.js.map\", \"application/javascript\"),\n    (\"/js/streaming_record.js\", \"application/javascript\"),\n    (\"/js/upload.js\", \"application/javascript\"),\n    (\"/k2-logo.png\", \"image/png\"),\n    (\"/nav-partial.html\", \"text/html\"),\n    (\"/offline_record.html\", \"text/html\"),\n    (\"/streaming_record.html\", \"text/html\"),\n    (\"/upload.html\", \"text/html\"),\n)\n\n_404_page = r\"\"\"\n<!doctype html><html><head>\n<title>Speech recognition with next-gen Kaldi</title><body>\n<h1>404 ERROR! 
Please re-check your URL</h1>\n</body></head></html>\n\"\"\"\n\n\ndef read_file(root: str, name: str) -> str:\n    try:\n        with open(f\"{root}/{name}\") as f:\n            return f.read()\n    except:  # noqa\n        # Fall back to binary mode for files that are not text, e.g., images\n        with open(f\"{root}/{name}\", \"rb\") as f:\n            return f.read()\n\n\nclass HttpServer:\n    \"\"\"\n    A simple HTTP server that hosts only static files\n    \"\"\"\n\n    def __init__(self, doc_root: str):\n        content = dict()\n        for f, mime_type in _static_files:\n            content[f] = (read_file(doc_root, f), mime_type)\n        self.content = content\n\n    def process_request(self, f: str) -> Tuple[bool, str, str]:\n        \"\"\"\n        Args:\n          f:\n            The filename to read.\n        Returns:\n          Return a tuple:\n            - a bool, True if the given file is found. False otherwise.\n            - a str, the content of the file if found. Otherwise, it\n              contains the content for the 404 page\n            - a str, the MIME type of the returned content\n        \"\"\"\n        if f in self.content:\n            return True, self.content[f][0], self.content[f][1]\n        else:\n            return False, _404_page, \"text/html\"\n"
  },
  {
    "path": "python-api-examples/inverse-text-normalization-offline-asr.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2024  Xiaomi Corporation\n\n\"\"\"\nThis script shows how to use inverse text normalization with non-streaming ASR.\n\nUsage:\n\n(1) Download the test model\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\n(2) Download rule fst\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n\nPlease refer to\nhttps://github.com/k2-fsa/colab/blob/master/sherpa-onnx/itn_zh_number.ipynb\nfor how itn_zh_number.fst is generated.\n\n(3) Download test wave\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\n(4) Run this script\n\npython3 ./python-api-examples/inverse-text-normalization-offline-asr.py\n\"\"\"\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\"\n    rule_fsts = \"./itn_zh_number.fst\"\n\n    if (\n        not Path(model).is_file()\n        or not Path(tokens).is_file()\n        or not Path(rule_fsts).is_file()\n    ):\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return sherpa_onnx.OfflineRecognizer.from_paraformer(\n        paraformer=model,\n        tokens=tokens,\n        debug=True,\n        rule_fsts=rule_fsts,\n    )\n\n\ndef main():\n    recognizer = create_recognizer()\n    wave_filename = \"./itn-zh-number.wav\"\n    if not Path(wave_filename).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n          
  \"\"\"\n        )\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/inverse-text-normalization-online-asr.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2024  Xiaomi Corporation\n\n\"\"\"\nThis script shows how to use inverse text normalization with streaming ASR.\n\nUsage:\n\n(1) Download the test model\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n(2) Download rule fst\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n\nPlease refer to\nhttps://github.com/k2-fsa/colab/blob/master/sherpa-onnx/itn_zh_number.ipynb\nfor how itn_zh_number.fst is generated.\n\n(3) Download test wave\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn-zh-number.wav\n\n(4) Run this script\n\npython3 ./python-api-examples/inverse-text-normalization-online-asr.py\n\"\"\"\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\"\n    decoder = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n    joiner = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\"\n    tokens = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\"\n    rule_fsts = \"./itn_zh_number.fst\"\n\n    if (\n        not Path(encoder).is_file()\n        or not Path(decoder).is_file()\n        or not Path(joiner).is_file()\n        or not Path(tokens).is_file()\n        or not Path(rule_fsts).is_file()\n    ):\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return sherpa_onnx.OnlineRecognizer.from_transducer(\n        encoder=encoder,\n        decoder=decoder,\n        joiner=joiner,\n        tokens=tokens,\n        debug=True,\n        
rule_fsts=rule_fsts,\n    )\n\n\ndef main():\n    recognizer = create_recognizer()\n    wave_filename = \"./itn-zh-number.wav\"\n    if not Path(wave_filename).is_file():\n        raise ValueError(\n            \"\"\"Please download the test wave file from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n\n    tail_padding = [0] * int(0.3 * sample_rate)\n    stream.accept_waveform(sample_rate, tail_padding)\n\n    while recognizer.is_ready(stream):\n        recognizer.decode_stream(stream)\n\n    print(wave_filename)\n    print(recognizer.get_result_all(stream))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/keyword-spotter-from-microphone.py",
    "content": "#!/usr/bin/env python3\n\n# Real-time keyword spotting from a microphone with sherpa-onnx Python API\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\n# to download pre-trained models\n\nimport argparse\nimport sys\nfrom pathlib import Path\n\nfrom typing import List\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--max-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"\n        It specifies number of active paths to keep 
during decoding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--num-trailing-blanks\",\n        type=int,\n        default=1,\n        help=\"\"\"The number of trailing blanks that should follow a keyword. Set it\n        to a larger value (e.g. 8) when your keywords have overlapping tokens\n        with each other.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--keywords-file\",\n        type=str,\n        help=\"\"\"\n        The file containing keywords, one word/phrase per line, and for each\n        phrase the bpe/cjkchar/pinyin tokens are separated by spaces. For example:\n\n        ▁HE LL O ▁WORLD\n        x iǎo ài t óng x ué\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--keywords-score\",\n        type=float,\n        default=1.0,\n        help=\"\"\"\n        The boosting score of each token for keywords. The larger the score, the\n        easier it is for a keyword to survive beam search.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--keywords-threshold\",\n        type=float,\n        default=0.25,\n        help=\"\"\"\n        The trigger threshold (i.e. probability) of the keyword. 
The larger the\n        threshold, the harder it is to trigger.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    assert_file_exists(args.tokens)\n    assert_file_exists(args.encoder)\n    assert_file_exists(args.decoder)\n    assert_file_exists(args.joiner)\n\n    assert Path(\n        args.keywords_file\n    ).is_file(), (\n        f\"keywords_file: {args.keywords_file} does not exist, please provide a valid path.\"\n    )\n\n    keyword_spotter = sherpa_onnx.KeywordSpotter(\n        tokens=args.tokens,\n        encoder=args.encoder,\n        decoder=args.decoder,\n        joiner=args.joiner,\n        num_threads=args.num_threads,\n        max_active_paths=args.max_active_paths,\n        keywords_file=args.keywords_file,\n        keywords_score=args.keywords_score,\n        keywords_threshold=args.keywords_threshold,\n        num_trailing_blanks=args.num_trailing_blanks,\n        provider=args.provider,\n    )\n\n    print(\"Started! 
Please speak\")\n\n    idx = 0\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n    stream = keyword_spotter.create_stream()\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            stream.accept_waveform(sample_rate, samples)\n            while keyword_spotter.is_ready(stream):\n                keyword_spotter.decode_stream(stream)\n                result = keyword_spotter.get_result(stream)\n                if result:\n                    print(f\"{idx}: {result}\")\n                    idx += 1\n                    # Remember to reset the stream right after detecting a keyword\n                    keyword_spotter.reset_stream(stream)\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/keyword-spotter.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API to do keyword spotting\nfrom wave file(s).\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\nto download pre-trained models.\n\"\"\"\nimport argparse\nimport time\nimport wave\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef create_keyword_spotter():\n    kws = sherpa_onnx.KeywordSpotter(\n        tokens=\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\",\n        encoder=\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n        decoder=\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n        joiner=\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\",\n        num_threads=2,\n        
keywords_file=\"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\",\n        provider=\"cpu\",\n    )\n\n    return kws\n\n\ndef main():\n    kws = create_keyword_spotter()\n\n    wave_filename = (\n        \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\"\n    )\n\n    samples, sample_rate = read_wave(wave_filename)\n\n    tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)\n\n    print(\"----------Use pre-defined keywords----------\")\n    s = kws.create_stream()\n    s.accept_waveform(sample_rate, samples)\n    s.accept_waveform(sample_rate, tail_paddings)\n    s.input_finished()\n    while kws.is_ready(s):\n        kws.decode_stream(s)\n        r = kws.get_result(s)\n        if r != \"\":\n            # Remember to reset the stream right after detecting a keyword\n            kws.reset_stream(s)\n\n            print(f\"Detected {r}\")\n\n    print(\"----------Use pre-defined keywords + add a new keyword----------\")\n\n    s = kws.create_stream(\"y ǎn y uán @演员\")\n    s.accept_waveform(sample_rate, samples)\n    s.accept_waveform(sample_rate, tail_paddings)\n    s.input_finished()\n    while kws.is_ready(s):\n        kws.decode_stream(s)\n        r = kws.get_result(s)\n        if r != \"\":\n            # Remember to reset the stream right after detecting a keyword\n            kws.reset_stream(s)\n\n            print(f\"Detected {r}\")\n\n    print(\"----------Use pre-defined keywords + add 2 new keywords----------\")\n\n    s = kws.create_stream(\"y ǎn y uán @演员/zh ī m íng @知名\")\n    s.accept_waveform(sample_rate, samples)\n    s.accept_waveform(sample_rate, tail_paddings)\n    s.input_finished()\n    while kws.is_ready(s):\n        kws.decode_stream(s)\n        r = kws.get_result(s)\n        if r != \"\":\n            # Remember to reset the stream right after detecting a keyword\n            kws.reset_stream(s)\n\n            print(f\"Detected {r}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/non_streaming_server.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2022-2023  Xiaomi Corp.\n\"\"\"\nA server for non-streaming speech recognition. Non-streaming means you send all\nthe content of the audio at once for recognition.\n\nIt supports multiple clients sending at the same time.\n\nUsage:\n    ./non_streaming_server.py --help\n\nPlease refer to\n\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html\n\nfor pre-trained models to download.\n\nUsage examples:\n\n(1) Use a non-streaming transducer model\n\ncd /path/to/sherpa-onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\ntar xvf sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\nrm sherpa-onnx-zipformer-en-2023-06-26.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --encoder ./sherpa-onnx-zipformer-en-2023-06-26/encoder-epoch-99-avg-1.onnx \\\n  --decoder ./sherpa-onnx-zipformer-en-2023-06-26/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-zipformer-en-2023-06-26/joiner-epoch-99-avg-1.onnx \\\n  --tokens ./sherpa-onnx-zipformer-en-2023-06-26/tokens.txt \\\n  --port 6006\n  \n(2) Use a non-streaming paraformer\n\ncd /path/to/sherpa-onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\ntar xvf sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\nrm sherpa-onnx-paraformer-zh-2023-09-14.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\n\n(3) Use a non-streaming CTC model from NeMo\n\ncd /path/to/sherpa-onnx\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\ntar xvf sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\nrm sherpa-onnx-nemo-ctc-en-conformer-medium.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --nemo-ctc ./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n  --tokens ./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt\n\n(4) Use a non-streaming CTC model from WeNet\n\ncd /path/to/sherpa-onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\ntar xvf sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\nrm sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --wenet-ctc ./sherpa-onnx-zh-wenet-wenetspeech/model.onnx \\\n  --tokens ./sherpa-onnx-zh-wenet-wenetspeech/tokens.txt\n\n(5) Use a Moonshine model\n\ncd /path/to/sherpa-onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --moonshine-preprocessor=./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --moonshine-encoder=./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  --moonshine-uncached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --moonshine-cached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens=./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\n\n(6) Use a Whisper model\n\ncd /path/to/sherpa-onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  
--whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \\\n  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \\\n  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\n\n(7) Use a tdnn model of the yesno recipe from icefall\n\ncd /path/to/sherpa-onnx\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-tdnn-yesno.tar.bz2\ntar xvf sherpa-onnx-tdnn-yesno.tar.bz2\nrm sherpa-onnx-tdnn-yesno.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --sample-rate=8000 \\\n  --feat-dim=23 \\\n  --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n  --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt\n\n(8) Use a Non-streaming SenseVoice model\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --sense-voice=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\n\n(9) Use a Non-streaming telespeech ctc model\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\ntar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\nrm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --telespeech-ctc=./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx \\\n  --tokens=./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt\n\n----\n\nTo use a certificate so that you can use https, please use\n\npython3 ./python-api-examples/non_streaming_server.py \\\n  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \\\n  
--whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \\\n  --certificate=/path/to/your/cert.pem\n\nIf you don't have a certificate, please run:\n\n    cd ./python-api-examples/web\n    ./generate-certificate.py\n\nIt will generate 3 files, one of which is the required `cert.pem`.\n\"\"\"  # noqa\n\nimport argparse\nimport asyncio\nimport http\nimport logging\nimport socket\nimport ssl\nimport sys\nimport warnings\nfrom concurrent.futures import ThreadPoolExecutor\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import Optional, Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\nimport websockets\n\nfrom http_server import HttpServer\n\n\ndef setup_logger(\n    log_filename: str,\n    log_level: str = \"info\",\n    use_console: bool = True,\n) -> None:\n    \"\"\"Setup log level.\n\n    Args:\n      log_filename:\n        The filename to save the log.\n      log_level:\n        The log level to use, e.g., \"debug\", \"info\", \"warning\", \"error\",\n        \"critical\"\n      use_console:\n        True to also print logs to console.\n    \"\"\"\n    now = datetime.now()\n    date_time = now.strftime(\"%Y-%m-%d-%H-%M-%S\")\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n    log_filename = f\"{log_filename}-{date_time}.txt\"\n\n    Path(log_filename).parent.mkdir(parents=True, exist_ok=True)\n\n    level = logging.ERROR\n    if log_level == \"debug\":\n        level = logging.DEBUG\n    elif log_level == \"info\":\n        level = logging.INFO\n    elif log_level == \"warning\":\n        level = logging.WARNING\n    elif log_level == \"critical\":\n        level = logging.CRITICAL\n\n    logging.basicConfig(\n        filename=log_filename,\n        format=formatter,\n        level=level,\n        filemode=\"w\",\n    )\n    if use_console:\n        console = logging.StreamHandler()\n        console.setLevel(level)\n        console.setFormatter(logging.Formatter(formatter))\n        
logging.getLogger(\"\").addHandler(console)\n\n\ndef add_transducer_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n\ndef add_paraformer_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n\ndef add_sense_voice_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--sense-voice\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from SenseVoice\",\n    )\n\n\ndef add_nemo_ctc_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--nemo-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from NeMo CTC\",\n    )\n\n\ndef add_telespeech_ctc_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--telespeech-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from TeleSpeech CTC\",\n    )\n\n\ndef add_wenet_ctc_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--wenet-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from WeNet CTC\",\n    )\n\n\ndef add_tdnn_ctc_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--tdnn-model\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx for the tdnn model of the yesno recipe\",\n    )\n\n\ndef 
add_moonshine_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--moonshine-preprocessor\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine preprocessor model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine encoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-uncached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine uncached decoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-cached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine cached decoder model\",\n    )\n\n\ndef add_whisper_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input audio file.\n        Example values: en, fr, de, zh, jp.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        
type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n\ndef add_model_args(parser: argparse.ArgumentParser):\n    add_transducer_model_args(parser)\n    add_paraformer_model_args(parser)\n    add_sense_voice_model_args(parser)\n    add_nemo_ctc_model_args(parser)\n    add_wenet_ctc_model_args(parser)\n    add_telespeech_ctc_model_args(parser)\n    add_tdnn_ctc_model_args(parser)\n    add_whisper_model_args(parser)\n    add_moonshine_model_args(parser)\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads to run the neural network model\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n\ndef add_feature_config_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"Sample rate of the data used to train the model. \",\n    )\n\n    parser.add_argument(\n        \"--feat-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension of the model\",\n    )\n\n\ndef add_decoding_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Decoding method to use. 
Currently supported methods are:\n        - greedy_search\n        - modified_beam_search  (for transducer models only)\n        \"\"\",\n    )\n\n    add_modified_beam_search_args(parser)\n\n\ndef add_modified_beam_search_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--max-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        It specifies the number of active paths to keep during decoding.\n        \"\"\",\n    )\n\n\ndef add_hotwords_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word or phrase per line. Within each\n        phrase, the BPE tokens or CJK characters are separated by spaces.\n        For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. 
Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n\ndef add_blank_penalty_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied to the blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n\ndef check_args(args):\n    if not Path(args.tokens).is_file():\n        raise ValueError(f\"{args.tokens} does not exist\")\n\n    if args.decoding_method not in (\n        \"greedy_search\",\n        \"modified_beam_search\",\n    ):\n        raise ValueError(f\"Unsupported decoding method {args.decoding_method}\")\n\n    if args.decoding_method == \"modified_beam_search\":\n        # The option is --max-active-paths, so argparse stores it as\n        # args.max_active_paths.\n        assert args.max_active_paths > 0, args.max_active_paths\n        assert Path(args.encoder).is_file(), args.encoder\n        assert Path(args.decoder).is_file(), args.decoder\n        assert Path(args.joiner).is_file(), args.joiner\n\n    if args.hotwords_file != \"\":\n        assert args.decoding_method == \"modified_beam_search\", args.decoding_method\n        assert Path(args.hotwords_file).is_file(), args.hotwords_file\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    add_model_args(parser)\n    add_feature_config_args(parser)\n    add_decoding_args(parser)\n    add_hotwords_args(parser)\n    add_blank_penalty_args(parser)\n\n    parser.add_argument(\n        \"--port\",\n        type=int,\n        default=6006,\n        help=\"The server will listen on this port\",\n    )\n\n    parser.add_argument(\n        \"--max-batch-size\",\n        type=int,\n        default=3,\n        help=\"\"\"Max batch size for computation. 
Note if there are not enough\n        requests in the queue, it will wait for max_wait_ms time. After that,\n        even if there are not enough requests, it still sends the\n        available requests in the queue for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-wait-ms\",\n        type=float,\n        default=5,\n        help=\"\"\"Max time in millisecond to wait to build batches for inference.\n        If there are not enough requests in the feature queue to build a batch\n        of max_batch_size, it waits up to this time before fetching available\n        requests for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--nn-pool-size\",\n        type=int,\n        default=1,\n        help=\"Number of threads for NN computation and decoding.\",\n    )\n\n    parser.add_argument(\n        \"--max-message-size\",\n        type=int,\n        default=(1 << 20),\n        help=\"\"\"Max message size in bytes.\n        The max size per message cannot exceed this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-queue-size\",\n        type=int,\n        default=32,\n        help=\"Max number of messages in the queue for each connection.\",\n    )\n\n    parser.add_argument(\n        \"--max-active-connections\",\n        type=int,\n        default=200,\n        help=\"\"\"Maximum number of active connections. The server will refuse\n        to accept new connections once the current number of active connections\n        equals to this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--certificate\",\n        type=str,\n        help=\"\"\"Path to the X.509 certificate. 
You need it only if you want to\n        use a secure websocket connection, i.e., use wss:// instead of ws://.\n        You can use ./web/generate-certificate.py\n        to generate the certificate `cert.pem`.\n        Note ./web/generate-certificate.py will generate three files but you\n        only need to pass the generated cert.pem to this option.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--doc-root\",\n        type=str,\n        default=\"./python-api-examples/web\",\n        help=\"Path to the web root\",\n    )\n\n    return parser.parse_args()\n\n\nclass NonStreamingServer:\n    def __init__(\n        self,\n        recognizer: sherpa_onnx.OfflineRecognizer,\n        max_batch_size: int,\n        max_wait_ms: float,\n        nn_pool_size: int,\n        max_message_size: int,\n        max_queue_size: int,\n        max_active_connections: int,\n        doc_root: str,\n        certificate: Optional[str] = None,\n    ):\n        \"\"\"\n        Args:\n          recognizer:\n            An instance of the sherpa_onnx.OfflineRecognizer.\n          max_batch_size:\n            Max batch size for inference.\n          max_wait_ms:\n            Max wait time in milliseconds in order to build a batch of\n            `max_batch_size`.\n          nn_pool_size:\n            Number of threads for the thread pool that is used for NN\n            computation and decoding.\n          max_message_size:\n            Max size in bytes per message.\n          max_queue_size:\n            Max number of messages in the queue for each connection.\n          max_active_connections:\n            Max number of active connections. Once number of active client\n            equals to this limit, the server refuses to accept new connections.\n          doc_root:\n            Path to the directory where files like index.html for the HTTP\n            server locate.\n          certificate:\n            Optional. 
If not None, it will use secure websocket.\n            You can use ./web/generate-certificate.py to generate\n            it (the default generated filename is `cert.pem`).\n        \"\"\"\n        self.recognizer = recognizer\n\n        self.certificate = certificate\n        self.http_server = HttpServer(doc_root)\n\n        self.nn_pool_size = nn_pool_size\n        self.nn_pool = ThreadPoolExecutor(\n            max_workers=nn_pool_size,\n            thread_name_prefix=\"nn\",\n        )\n\n        self.stream_queue = asyncio.Queue()\n\n        self.max_wait_ms = max_wait_ms\n        self.max_batch_size = max_batch_size\n        self.max_message_size = max_message_size\n        self.max_queue_size = max_queue_size\n        self.max_active_connections = max_active_connections\n\n        self.current_active_connections = 0\n        self.sample_rate = int(recognizer.config.feat_config.sampling_rate)\n\n    async def process_request(\n        self,\n        path: str,\n        request_headers: websockets.Headers,\n    ) -> Optional[Tuple[http.HTTPStatus, websockets.Headers, bytes]]:\n        if \"sec-websocket-key\" not in (\n            request_headers.headers  # For new request_headers\n            if hasattr(request_headers, \"headers\")\n            else request_headers  # For old request_headers\n        ):\n            # This is a normal HTTP request\n            if path == \"/\":\n                path = \"/index.html\"\n            if path[-1] == \"?\":\n                path = path[:-1]\n\n            if path == \"/streaming_record.html\":\n                response = r\"\"\"\n<!doctype html><html><head>\n<title>Speech recognition with next-gen Kaldi</title><body>\n<h2>Only\n<a href=\"/upload.html\">/upload.html</a>\nand\n<a href=\"/offline_record.html\">/offline_record.html</a>\nis available for the non-streaming server.<h2>\n<br/>\n<br/>\nGo back to <a href=\"/upload.html\">/upload.html</a>\nor <a 
href=\"/offline_record.html\">/offline_record.html</a>\n</body></head></html>\n\"\"\"\n                found = True\n                mime_type = \"text/html\"\n            else:\n                found, response, mime_type = self.http_server.process_request(path)\n            if isinstance(response, str):\n                response = response.encode(\"utf-8\")\n\n            if not found:\n                status = http.HTTPStatus.NOT_FOUND\n            else:\n                status = http.HTTPStatus.OK\n            header = {\"Content-Type\": mime_type}\n            return status, header, response\n\n        if self.current_active_connections < self.max_active_connections:\n            self.current_active_connections += 1\n            return None\n\n        # Refuse new connections\n        status = http.HTTPStatus.SERVICE_UNAVAILABLE  # 503\n        header = {\"Hint\": \"The server is overloaded. Please retry later.\"}\n        response = b\"The server is busy. Please retry later.\"\n\n        return status, header, response\n\n    async def run(self, port: int):\n        logging.info(\"started\")\n\n        tasks = []\n        for i in range(self.nn_pool_size):\n            tasks.append(asyncio.create_task(self.stream_consumer_task()))\n\n        if self.certificate:\n            logging.info(f\"Using certificate: {self.certificate}\")\n            ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)\n            ssl_context.load_cert_chain(self.certificate)\n        else:\n            ssl_context = None\n            logging.info(\"No certificate provided\")\n\n        async with websockets.serve(\n            self.handle_connection,\n            host=\"\",\n            port=port,\n            max_size=self.max_message_size,\n            max_queue=self.max_queue_size,\n            process_request=self.process_request,\n            ssl=ssl_context,\n        ):\n            ip_list = [\"localhost\"]\n            if ssl_context:\n                ip_list += 
[\"0.0.0.0\", \"127.0.0.1\"]\n                ip_list.append(socket.gethostbyname(socket.gethostname()))\n\n            proto = \"http://\" if ssl_context is None else \"https://\"\n            s = \"Please visit one of the following addresses:\\n\\n\"\n            for p in ip_list:\n                s += \"  \" + proto + p + f\":{port}\" \"\\n\"\n            logging.info(s)\n\n            await asyncio.Future()  # run forever\n\n        await asyncio.gather(*tasks)  # not reachable\n\n    async def recv_audio_samples(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ) -> Tuple[Optional[np.ndarray], Optional[float]]:\n        \"\"\"Receive a tensor from the client.\n\n        The message from the client is a **bytes** buffer.\n\n        The first message is either \"Done\", meaning the client won't send\n        anything more, or a buffer containing at least 8 bytes.\n        The first 4 bytes, in little endian, specify the sample rate of the\n        audio samples; the next 4 bytes, in little endian, specify the number\n        of bytes in the audio file, which will be sent by the client in the\n        subsequent messages.\n        Since there is a limit on the message size, the client may send the\n        audio file in multiple messages if the audio file is very large.\n\n        The second and remaining messages contain audio samples.\n\n        Please refer to ./offline-websocket-client-decode-files-paralell.py\n        and ./offline-websocket-client-decode-files-sequential.py\n        for how the client sends the message.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        Returns:\n          Return a tuple containing:\n            - 1-D np.float32 array containing the audio samples\n            - sample rate of the audio samples\n          or return (None, None) indicating the end of utterance.\n        \"\"\"\n        header = await 
socket.recv()\n        if header == \"Done\":\n            return None, None\n\n        assert len(header) >= 8, (\n            \"The first message should contain at least 8 bytes. \"\n            + f\"Given {len(header)}\"\n        )\n\n        sample_rate = int.from_bytes(header[:4], \"little\", signed=True)\n        expected_num_bytes = int.from_bytes(header[4:8], \"little\", signed=True)\n\n        received = []\n        num_received_bytes = 0\n        if len(header) > 8:\n            received.append(header[8:])\n            num_received_bytes += len(header) - 8\n\n        if num_received_bytes < expected_num_bytes:\n            async for message in socket:\n                received.append(message)\n                num_received_bytes += len(message)\n                if num_received_bytes >= expected_num_bytes:\n                    break\n\n        assert num_received_bytes == expected_num_bytes, (\n            num_received_bytes,\n            expected_num_bytes,\n        )\n\n        samples = b\"\".join(received)\n        array = np.frombuffer(samples, dtype=np.float32)\n        return array, sample_rate\n\n    async def stream_consumer_task(self):\n        \"\"\"This function extracts streams from the queue, batches them up, and\n        sends them to the recognizer for computation and decoding.\n        \"\"\"\n        while True:\n            if self.stream_queue.empty():\n                await asyncio.sleep(self.max_wait_ms / 1000)\n                continue\n\n            batch = []\n            try:\n                while len(batch) < self.max_batch_size:\n                    item = self.stream_queue.get_nowait()\n\n                    batch.append(item)\n            except asyncio.QueueEmpty:\n                pass\n\n            stream_list = [b[0] for b in batch]\n            future_list = [b[1] for b in batch]\n\n            loop = asyncio.get_running_loop()\n            await loop.run_in_executor(\n                self.nn_pool,\n                
self.recognizer.decode_streams,\n                stream_list,\n            )\n\n            for f in future_list:\n                self.stream_queue.task_done()\n                f.set_result(None)\n\n    async def compute_and_decode(\n        self,\n        stream: sherpa_onnx.OfflineStream,\n    ) -> None:\n        \"\"\"Put the stream into the queue and wait for it to be processed by\n        the consumer task.\n\n        Args:\n          stream:\n            The stream to be processed. Note: It is changed in-place.\n        \"\"\"\n        loop = asyncio.get_running_loop()\n        future = loop.create_future()\n        await self.stream_queue.put((stream, future))\n        await future\n\n    async def handle_connection(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process them, and send\n        the decoding results back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        try:\n            await self.handle_connection_impl(socket)\n        except websockets.exceptions.ConnectionClosedError:\n            logging.info(f\"{socket.remote_address} disconnected\")\n        finally:\n            # Decrement so that it can accept new connections\n            self.current_active_connections -= 1\n\n            logging.info(\n                f\"Disconnected: {socket.remote_address}. 
\"\n                f\"Number of connections: {self.current_active_connections}/{self.max_active_connections}\"  # noqa\n            )\n\n    async def handle_connection_impl(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process it, and send\n        decoding results back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        logging.info(\n            f\"Connected: {socket.remote_address}. \"\n            f\"Number of connections: {self.current_active_connections}/{self.max_active_connections}\"  # noqa\n        )\n\n        while True:\n            stream = self.recognizer.create_stream()\n            samples, sample_rate = await self.recv_audio_samples(socket)\n            if samples is None:\n                break\n            # stream.accept_samples() runs in the main thread\n\n            stream.accept_waveform(sample_rate, samples)\n\n            await self.compute_and_decode(stream)\n            result = stream.result.text\n            logging.info(f\"result: {result}\")\n\n            if result:\n                await socket.send(result)\n            else:\n                # If result is an empty string, send something to the client.\n                # Otherwise, socket.send() is a no-op and the client will\n                # wait for a reply indefinitely.\n                await socket.send(\"<EMPTY>\")\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert 
len(args.nemo_ctc) == 0, args.nemo_ctc\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.telespeech_ctc) == 0, args.telespeech_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            max_active_paths=args.max_active_paths,\n            hotwords_file=args.hotwords_file,\n            hotwords_score=args.hotwords_score,\n            blank_penalty=args.blank_penalty,\n            provider=args.provider,\n        )\n    elif args.paraformer:\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert len(args.nemo_ctc) == 0, args.nemo_ctc\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.telespeech_ctc) == 0, args.telespeech_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, 
args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            provider=args.provider,\n        )\n    elif args.sense_voice:\n        assert len(args.nemo_ctc) == 0, args.nemo_ctc\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.telespeech_ctc) == 0, args.telespeech_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.sense_voice)\n        recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=args.sense_voice,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            use_itn=True,\n        )\n    elif args.nemo_ctc:\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.telespeech_ctc) == 0, args.telespeech_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert 
len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.nemo_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n            model=args.nemo_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            provider=args.provider,\n        )\n    elif args.wenet_ctc:\n        assert len(args.telespeech_ctc) == 0, args.telespeech_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.wenet_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n            model=args.wenet_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            provider=args.provider,\n        )\n    elif args.telespeech_ctc:\n        assert 
len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.telespeech_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_telespeech_ctc(\n            model=args.telespeech_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            provider=args.provider,\n        )\n    elif args.whisper_encoder:\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n            provider=args.provider,\n        )\n    elif 
args.tdnn_model:\n        assert_file_exists(args.tdnn_model)\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_tdnn_ctc(\n            model=args.tdnn_model,\n            tokens=args.tokens,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            provider=args.provider,\n        )\n    elif args.moonshine_preprocessor:\n        assert_file_exists(args.moonshine_preprocessor)\n        assert_file_exists(args.moonshine_encoder)\n        assert_file_exists(args.moonshine_uncached_decoder)\n        assert_file_exists(args.moonshine_cached_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_moonshine(\n            preprocessor=args.moonshine_preprocessor,\n            encoder=args.moonshine_encoder,\n            uncached_decoder=args.moonshine_uncached_decoder,\n            cached_decoder=args.moonshine_cached_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model\")\n\n    return recognizer\n\n\ndef main():\n    args = get_args()\n    logging.info(vars(args))\n    check_args(args)\n\n    recognizer = create_recognizer(args)\n\n    port = args.port\n    max_wait_ms = args.max_wait_ms\n    max_batch_size = args.max_batch_size\n    nn_pool_size = args.nn_pool_size\n    max_message_size = args.max_message_size\n    max_queue_size = args.max_queue_size\n    max_active_connections = 
args.max_active_connections\n    certificate = args.certificate\n    doc_root = args.doc_root\n\n    if certificate and not Path(certificate).is_file():\n        raise ValueError(f\"{certificate} does not exist\")\n\n    if not Path(doc_root).is_dir():\n        raise ValueError(f\"Directory {doc_root} does not exist\")\n\n    non_streaming_server = NonStreamingServer(\n        recognizer=recognizer,\n        max_wait_ms=max_wait_ms,\n        max_batch_size=max_batch_size,\n        nn_pool_size=nn_pool_size,\n        max_message_size=max_message_size,\n        max_queue_size=max_queue_size,\n        max_active_connections=max_active_connections,\n        certificate=certificate,\n        doc_root=doc_root,\n    )\n    asyncio.run(non_streaming_server.run(port))\n\n\nif __name__ == \"__main__\":\n    log_filename = \"log/log-non-streaming-server\"\n    setup_logger(log_filename)\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-decode-files.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023 by manyeyes\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API to transcribe\nfile(s) with a non-streaming model.\n\n(1) For paraformer\n\n    ./python-api-examples/offline-decode-files.py  \\\n      --tokens=/path/to/tokens.txt \\\n      --paraformer=/path/to/paraformer.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80 \\\n      /path/to/0.wav \\\n      /path/to/1.wav\n\n(2) For transducer models from icefall\n\n    ./python-api-examples/offline-decode-files.py  \\\n      --tokens=/path/to/tokens.txt \\\n      --encoder=/path/to/encoder.onnx \\\n      --decoder=/path/to/decoder.onnx \\\n      --joiner=/path/to/joiner.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80 \\\n      /path/to/0.wav \\\n      /path/to/1.wav\n\n    also with RNN LM rescoring and LODR (optional):\n\n    ./python-api-examples/offline-decode-files.py  \\\n      --tokens=/path/to/tokens.txt \\\n      --encoder=/path/to/encoder.onnx \\\n      --decoder=/path/to/decoder.onnx \\\n      --joiner=/path/to/joiner.onnx \\\n      --num-threads=2 \\\n      --decoding-method=modified_beam_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80 \\\n      --lm=/path/to/lm.onnx \\\n      --lm-scale=0.1 \\\n      --lodr-fst=/path/to/lodr.fst \\\n      --lodr-scale=-0.1 \\\n      /path/to/0.wav \\\n      /path/to/1.wav\n\n(3) For CTC models from NeMo\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --tokens=./sherpa-onnx-nemo-ctc-en-citrinet-512/tokens.txt \\\n  --nemo-ctc=./sherpa-onnx-nemo-ctc-en-citrinet-512/model.onnx \\\n  --num-threads=2 \\\n  --decoding-method=greedy_search \\\n  --debug=false \\\n  
./sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/0.wav \\\n  ./sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/1.wav \\\n  ./sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/8k.wav\n\n(4) For Whisper models\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n  --whisper-task=transcribe \\\n  --num-threads=1 \\\n  ./sherpa-onnx-whisper-base.en/test_wavs/0.wav \\\n  ./sherpa-onnx-whisper-base.en/test_wavs/1.wav \\\n  ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav\n\n(5) For CTC models from WeNet\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --wenet-ctc=./sherpa-onnx-zh-wenet-wenetspeech/model.onnx \\\n  --tokens=./sherpa-onnx-zh-wenet-wenetspeech/tokens.txt \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/0.wav \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/1.wav \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/8k.wav\n\n(6) For tdnn models of the yesno recipe from icefall\n\npython3 ./python-api-examples/offline-decode-files.py \\\n  --sample-rate=8000 \\\n  --feature-dim=23 \\\n  --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n  --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav \\\n  ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_1_1_1.wav\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/index.html\nto install sherpa-onnx and to download non-streaming pre-trained models\nused in this file.\n\"\"\"\nimport argparse\nimport time\nimport wave\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    
parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word or phrase per line, like\n        HELLO WORLD\n        你好世界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing a word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--modeling-unit\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The modeling unit of the model. Valid values are cjkchar, bpe, and\n        cjkchar+bpe. Used only when hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--bpe-vocab\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The path to the bpe vocabulary, which is generated by sentencepiece.\n        You can also export the bpe vocabulary from a bpe model\n        with `scripts/export_bpe_vocab.py`. 
Used only when hotwords-file is given\n        and modeling-unit is bpe or cjkchar+bpe.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the joiner model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--nemo-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from NeMo CTC\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from WeNet CTC\",\n    )\n\n    parser.add_argument(\n        \"--tdnn-model\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx for the tdnn model of the yesno recipe\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input audio file.\n        Example values: en, fr, de, zh, ja.\n        Available languages for multilingual models can be found at\n     
   https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--lm\",\n        metavar=\"file\",\n        type=str,\n        default=\"\",\n        help=\"Path to RNN LM model\",\n    )\n\n    parser.add_argument(\n        \"--lm-scale\",\n        metavar=\"lm_scale\",\n        type=float,\n        default=0.1,\n        help=\"LM model scale for rescoring\",\n    )\n\n    parser.add_argument(\n        \"--lodr-fst\",\n        metavar=\"file\",\n        type=str,\n        default=\"\",\n        help=\"Path to LODR FST model. 
Used only when --lm is given.\",\n    )\n\n    parser.add_argument(\n        \"--lodr-scale\",\n        metavar=\"lodr_scale\",\n        type=float,\n        default=-0.1,\n        help=\"LODR scale for rescoring. Used only when --lodr-fst is given.\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        # type=bool would treat any non-empty string, including \"false\", as True\n        type=lambda s: s.lower() in (\"true\", \"1\", \"yes\"),\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"\"\"Sample rate of the feature extractor. Must match the one\n        expected by the model. Note: The input sound files can have a\n        different sample rate from this argument.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--feature-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension. Must match the one expected by the model\",\n    )\n\n    parser.add_argument(\n        \"sound_files\",\n        type=str,\n        nargs=\"+\",\n        help=\"The input sound file(s) to decode. Each file must be of WAVE \"\n        \"format with a single channel, and each sample must be 16-bit, \"\n        \"i.e., int16_t. \"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. 
Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.tokens)\n    assert args.num_threads > 0, args.num_threads\n\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.nemo_ctc) == 0, args.nemo_ctc\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            lm=args.lm,\n            lm_scale=args.lm_scale,\n            lodr_fst=args.lodr_fst,\n            lodr_scale=args.lodr_scale,\n            decoding_method=args.decoding_method,\n            hotwords_file=args.hotwords_file,\n            
hotwords_score=args.hotwords_score,\n            modeling_unit=args.modeling_unit,\n            bpe_vocab=args.bpe_vocab,\n            blank_penalty=args.blank_penalty,\n            debug=args.debug,\n        )\n    elif args.paraformer:\n        assert len(args.nemo_ctc) == 0, args.nemo_ctc\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.nemo_ctc:\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n\n        assert_file_exists(args.nemo_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n            model=args.nemo_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.wenet_ctc:\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n\n        assert_file_exists(args.wenet_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n            
model=args.wenet_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.whisper_encoder:\n        assert len(args.tdnn_model) == 0, args.tdnn_model\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n        )\n    elif args.tdnn_model:\n        assert_file_exists(args.tdnn_model)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_tdnn_ctc(\n            model=args.tdnn_model,\n            tokens=args.tokens,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    else:\n        print(\"Please specify at least one model\")\n        return\n\n    print(\"Started!\")\n    start_time = time.time()\n\n    streams = []\n    total_duration = 0\n    for wave_filename in args.sound_files:\n        assert_file_exists(wave_filename)\n        samples, sample_rate = read_wave(wave_filename)\n        duration = len(samples) / sample_rate\n        total_duration += duration\n        s = recognizer.create_stream()\n        s.accept_waveform(sample_rate, samples)\n\n        streams.append(s)\n\n    recognizer.decode_streams(streams)\n    results = [s.result.text for s in streams]\n    end_time = 
time.time()\n    print(\"Done!\")\n\n    for wave_filename, result in zip(args.sound_files, results):\n        print(f\"{wave_filename}\\n{result}\")\n        print(\"-\" * 10)\n\n    elapsed_seconds = end_time - start_time\n    rtf = elapsed_seconds / total_duration\n    print(f\"num_threads: {args.num_threads}\")\n    print(f\"decoding_method: {args.decoding_method}\")\n    print(f\"Wave duration: {total_duration:.3f} s\")\n    print(f\"Elapsed time: {elapsed_seconds:.3f} s\")\n    print(\n        f\"Real time factor (RTF): {elapsed_seconds:.3f}/{total_duration:.3f} = {rtf:.3f}\"\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-dolphin-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming CTC model from Dolphin\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\"\"\"\n\nfrom pathlib import Path\nimport time\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\"\n    test_wav = (\n        \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\"\n    )\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_dolphin_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    start = time.time()\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    end = time.time()\n\n    print(wave_filename)\n    print(stream.result)\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio) / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = 
{real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-fire-red-asr-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming FireRedASR CTC model from\nhttps://github.com/FireRedTeam/FireRedASR2S\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\nrm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport numpy as np\nimport sherpa_onnx\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\"\n    test_wav_0 = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/0.wav\"\n    test_wav_1 = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\"\n    test_wav_2 = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/2.wav\"\n    test_wav_3 = (\n        \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/3-sichuan.wav\"\n    )\n    test_wav_4 = (\n        \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/4-tianjin.wav\"\n    )\n    test_wav_5 = (\n        \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/5-henan.wav\"\n    )\n\n    for f in [\n        model,\n        tokens,\n        test_wav_0,\n        test_wav_1,\n        test_wav_2,\n        test_wav_3,\n        test_wav_4,\n        test_wav_5,\n    ]:\n        if not Path(f).is_file():\n            print(f\"{f} does not exist\")\n\n            raise ValueError(\n                \"\"\"Please download model files from\n                https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n                \"\"\"\n            )\n    return (\n        
sherpa_onnx.OfflineRecognizer.from_fire_red_asr_ctc(\n            model=model,\n            tokens=tokens,\n            num_threads=2,\n        ),\n        test_wav_0,\n        test_wav_1,\n        test_wav_2,\n        test_wav_3,\n        test_wav_4,\n        test_wav_5,\n    )\n\n\ndef load_audio(filename):\n    audio, sample_rate = librosa.load(filename, sr=16000)\n    assert sample_rate == 16000, sample_rate\n\n    return np.ascontiguousarray(audio)\n\n\ndef decode_single_file(recognizer, filename):\n    samples = load_audio(filename)\n\n    start_time = time.time()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate=16000, waveform=samples)\n    recognizer.decode_stream(stream)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"---\")\n    print(filename)\n    print(stream.result)\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n\n\ndef decode_multiple_files(recognizer, filenames):\n    streams = []\n\n    start_time = time.time()\n\n    audio_duration = 0\n\n    for filename in filenames:\n        samples = load_audio(filename)\n        audio_duration += len(samples) / 16000\n\n        stream = recognizer.create_stream()\n        stream.accept_waveform(sample_rate=16000, waveform=samples)\n        streams.append(stream)\n\n    recognizer.decode_streams(streams)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    real_time_factor = elapsed_seconds / audio_duration\n\n    for name, stream in zip(filenames, streams):\n        print(\"---\")\n        print(name)\n        print(stream.result)\n        print()\n\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in 
seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n    print()\n\n\ndef main():\n    recognizer, *filenames = create_recognizer()\n\n    decode_single_file(recognizer, filenames[0])\n    decode_single_file(recognizer, filenames[1])\n    decode_multiple_files(recognizer, filenames[2:])\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-fire-red-asr-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming FireRedAsr AED model from\nhttps://github.com/FireRedTeam/FireRedASR\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\ntar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\nrm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx\"\n    decoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx\"\n    tokens = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt\"\n    test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/1.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/2.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/3.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/8k.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/3-sichuan.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/4-tianjin.wav\"\n    #  test_wav = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/5-henan.wav\"\n\n    if (\n        not Path(encoder).is_file()\n        or not Path(decoder).is_file()\n        or not Path(test_wav).is_file()\n    ):\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        
sherpa_onnx.OfflineRecognizer.from_fire_red_asr(\n            encoder=encoder,\n            decoder=decoder,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-funasr-nano-decode-files.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2025  zengyw\n#\n\"\"\"\nDecode audio files using FunASR-nano models with sherpa-onnx Python API.\n\nThis script demonstrates how to use FunASR-nano models for offline speech recognition.\n\nUsage:\n    python offline-funasr-nano-decode-files.py \\\n        --encoder-adaptor=/path/to/encoder_adaptor.onnx \\\n        --llm=/path/to/llm.onnx \\\n        --tokenizer=/path/to/Qwen3-0.6B \\\n        --embedding=/path/to/embedding.onnx \\\n        [--num-threads=4] \\\n        [--provider=cpu] \\\n        audio1.wav audio2.wav ...\n\"\"\"\n\nimport argparse\nimport sys\nfrom pathlib import Path\n\nimport soundfile as sf\n\ntry:\n    import sherpa_onnx\nexcept ImportError:\n    print(\"Please install sherpa-onnx: pip install sherpa-onnx\")\n    sys.exit(1)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.RawDescriptionHelpFormatter,\n        description=__doc__,\n    )\n\n    parser.add_argument(\n        \"--encoder-adaptor\",\n        type=str,\n        required=True,\n        help=\"Path to encoder_adaptor.onnx\",\n    )\n\n    parser.add_argument(\n        \"--llm\",\n        type=str,\n        required=True,\n        help=\"Path to llm.onnx (unified KV cache model)\",\n    )\n\n    parser.add_argument(\n        \"--tokenizer\",\n        type=str,\n        required=True,\n        help=\"Path to tokenizer directory (e.g., Qwen3-0.6B)\",\n    )\n\n    parser.add_argument(\n        \"--embedding\",\n        type=str,\n        required=True,\n        help=\"Path to embedding.onnx\",\n    )\n\n    parser.add_argument(\n        \"--system-prompt\",\n        type=str,\n        default=\"You are a helpful assistant.\",\n        help=\"System prompt for FunASR-nano\",\n    )\n\n    parser.add_argument(\n        \"--user-prompt\",\n        type=str,\n        default=\"语音转写:\",\n        help=\"User prompt template for FunASR-nano\",\n    )\n\n    parser.add_argument(\n        
\"--max-new-tokens\",\n        type=int,\n        default=512,\n        help=\"Maximum number of new tokens to generate\",\n    )\n\n    parser.add_argument(\n        \"--temperature\",\n        type=float,\n        default=1e-6,\n        help=\"Sampling temperature\",\n    )\n\n    parser.add_argument(\n        \"--top-p\",\n        type=float,\n        default=0.8,\n        help=\"Top-p (nucleus) sampling threshold\",\n    )\n\n    parser.add_argument(\n        \"--seed\",\n        type=int,\n        default=42,\n        help=\"Random seed\",\n    )\n\n    parser.add_argument(\n        \"--language\",\n        type=str,\n        default=\"\",\n        help=\"Language for transcription (empty string means None)\",\n    )\n\n    parser.add_argument(\n        \"--itn\",\n        action=\"store_true\",\n        default=True,\n        help=\"Whether to apply inverse text normalization (default: True)\",\n    )\n\n    parser.add_argument(\n        \"--no-itn\",\n        dest=\"itn\",\n        action=\"store_false\",\n        help=\"Disable inverse text normalization\",\n    )\n\n    parser.add_argument(\n        \"--hotwords\",\n        type=str,\n        default=\"\",\n        help=\"Hotwords (comma-separated, e.g., 'Sherpa,FunASR')\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        choices=[\"cpu\", \"cuda\"],\n        help=\"Provider: cpu or cuda\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        action=\"store_true\",\n        help=\"True to print model information while loading\",\n    )\n\n    parser.add_argument(\n        \"sound_files\",\n        type=str,\n        nargs=\"+\",\n        help=\"The input sound file(s) to decode. \"\n        \"Each file must be of single channel, 16-bit PCM encoded wav file. 
\"\n        \"Its sample rate can be arbitrary and does not need to be 16kHz.\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    return sherpa_onnx.OfflineRecognizer.from_funasr_nano(\n        encoder_adaptor=args.encoder_adaptor,\n        llm=args.llm,\n        embedding=args.embedding,\n        tokenizer=args.tokenizer,\n        num_threads=args.num_threads,\n        provider=args.provider,\n        debug=args.debug,\n        system_prompt=args.system_prompt,\n        user_prompt=args.user_prompt,\n        max_new_tokens=args.max_new_tokens,\n        temperature=args.temperature,\n        top_p=args.top_p,\n        seed=args.seed,\n        language=args.language,\n        itn=args.itn,\n        hotwords=args.hotwords,\n    )\n\n\ndef decode_file(\n    recognizer: sherpa_onnx.OfflineRecognizer,\n    filename: str,\n):\n    \"\"\"Decode a single audio file.\"\"\"\n    audio, sample_rate = sf.read(filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    result = stream.result\n    return result\n\n\ndef main():\n    args = get_args()\n\n    print(\"Creating recognizer...\")\n    recognizer = create_recognizer(args)\n    print(\"Recognizer created!\")\n\n    print(f\"\\nDecoding {len(args.sound_files)} file(s)...\\n\")\n\n    for sound_file in args.sound_files:\n        if not Path(sound_file).exists():\n            print(f\"Error: File not found: {sound_file}\", file=sys.stderr)\n            continue\n\n        print(f\"Processing: {sound_file}\")\n        result = decode_file(recognizer, sound_file)\n\n        print(f\"Text: {result.text}\")\n        if result.tokens:\n            print(f\"Tokens: {result.tokens}\")\n        if result.timestamps:\n            print(f\"Timestamps: {[f'{t:.2f}' for t in result.timestamps]}\")\n   
     print()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-medasr-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Google MedASR CTC model from\nhttps://huggingface.co/google/medasr\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\ntar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nrm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport numpy as np\nimport sherpa_onnx\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\"\n    test_wav_0 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\"\n    test_wav_1 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/1.wav\"\n    test_wav_2 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/2.wav\"\n    test_wav_3 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/3.wav\"\n    test_wav_4 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/4.wav\"\n    test_wav_5 = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/5.wav\"\n\n    for f in [\n        model,\n        tokens,\n        test_wav_0,\n        test_wav_1,\n        test_wav_2,\n        test_wav_3,\n        test_wav_4,\n        test_wav_5,\n    ]:\n        if not Path(f).is_file():\n            print(f\"{f} does not exist\")\n\n            raise ValueError(\n                \"\"\"Please download model files from\n                https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n                \"\"\"\n            )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_medasr_ctc(\n            model=model,\n            tokens=tokens,\n            num_threads=2,\n        ),\n        test_wav_0,\n        test_wav_1,\n        
test_wav_2,\n        test_wav_3,\n        test_wav_4,\n        test_wav_5,\n    )\n\n\ndef load_audio(filename):\n    audio, sample_rate = librosa.load(filename, sr=16000)\n    assert sample_rate == 16000, sample_rate\n\n    return np.ascontiguousarray(audio)\n\n\ndef decode_single_file(recognizer, filename):\n    samples = load_audio(filename)\n\n    start_time = time.time()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate=16000, waveform=samples)\n    recognizer.decode_stream(stream)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"---\")\n    print(filename)\n    print(stream.result)\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n\n\ndef decode_multiple_files(recognizer, filenames):\n    streams = []\n\n    start_time = time.time()\n\n    audio_duration = 0\n\n    for filename in filenames:\n        samples = load_audio(filename)\n        audio_duration += len(samples) / 16000\n\n        stream = recognizer.create_stream()\n        stream.accept_waveform(sample_rate=16000, waveform=samples)\n        streams.append(stream)\n\n    recognizer.decode_streams(streams)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    real_time_factor = elapsed_seconds / audio_duration\n\n    for name, stream in zip(filenames, streams):\n        print(\"---\")\n        print(name)\n        print(stream.result)\n        print()\n\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n    print()\n\n\ndef main():\n    recognizer, *filenames = 
create_recognizer()\n\n    decode_single_file(recognizer, filenames[0])\n    decode_single_file(recognizer, filenames[1])\n    decode_multiple_files(recognizer, filenames[2:])\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-moonshine-decode-files-v2.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Moonshine model from\nhttps://github.com/usefulsensors/moonshine\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n\"\"\"\n\nimport datetime as dt\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\"\n    decoder = (\n        \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\"\n    )\n    tokens = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\"\n    test_wav = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\"\n\n    if not Path(encoder).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_moonshine_v2(\n            encoder=encoder,\n            decoder=decoder,\n            tokens=tokens,\n            debug=False,  # Set to True to see more logs\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    start_t = dt.datetime.now()\n\n    stream = recognizer.create_stream()\n    
stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n\n    end_t = dt.datetime.now()\n    elapsed_seconds = (end_t - start_t).total_seconds()\n    duration = audio.shape[-1] / sample_rate\n    rtf = elapsed_seconds / duration\n\n    print(stream.result)\n    print(wave_filename)\n    print(\"Text:\", stream.result.text)\n    print(f\"Audio duration:\\t{duration:.3f} s\")\n    print(f\"Elapsed:\\t{elapsed_seconds:.3f} s\")\n    print(f\"RTF = {elapsed_seconds:.3f}/{duration:.3f} = {rtf:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-moonshine-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Moonshine model from\nhttps://github.com/usefulsensors/moonshine\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\ntar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\nrm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2\n\"\"\"\n\nimport datetime as dt\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    preprocessor = \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\"\n    encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\"\n    uncached_decoder = \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\"\n    cached_decoder = \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\"\n\n    tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\"\n    test_wav = \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\"\n\n    if not Path(preprocessor).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_moonshine(\n            preprocessor=preprocessor,\n            encoder=encoder,\n            uncached_decoder=uncached_decoder,\n            cached_decoder=cached_decoder,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # 
sample_rate does not need to be 16000 Hz\n\n    start_t = dt.datetime.now()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n\n    end_t = dt.datetime.now()\n    elapsed_seconds = (end_t - start_t).total_seconds()\n    duration = audio.shape[-1] / sample_rate\n    rtf = elapsed_seconds / duration\n\n    print(stream.result)\n    print(wave_filename)\n    print(\"Text:\", stream.result.text)\n    print(f\"Audio duration:\\t{duration:.3f} s\")\n    print(f\"Elapsed:\\t{elapsed_seconds:.3f} s\")\n    print(f\"RTF = {elapsed_seconds:.3f}/{duration:.3f} = {rtf:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-nemo-canary-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Canary model from NeMo\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\nThe example model supports 4 languages and it is converted from\nhttps://huggingface.co/nvidia/canary-180m-flash\n\nIt supports automatic speech-to-text recognition (ASR) in 4 languages\n(English, German, French, Spanish) and translation from English to\nGerman/French/Spanish and from German/French/Spanish to English with or\nwithout punctuation and capitalization (PnC).\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/encoder.int8.onnx\"\n    decoder = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/decoder.int8.onnx\"\n    tokens = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/tokens.txt\"\n\n    en_wav = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/en.wav\"\n    de_wav = \"./sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8/test_wavs/de.wav\"\n\n    if not Path(encoder).is_file() or not Path(en_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_nemo_canary(\n            encoder=encoder,\n            decoder=decoder,\n            tokens=tokens,\n            debug=True,\n        ),\n        en_wav,\n        de_wav,\n    )\n\n\ndef decode(recognizer, samples, sample_rate, src_lang, tgt_lang):\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n\n    recognizer.recognizer.set_config(\n        config=sherpa_onnx.OfflineRecognizerConfig(\n            model_config=sherpa_onnx.OfflineModelConfig(\n                
canary=sherpa_onnx.OfflineCanaryModelConfig(\n                    src_lang=src_lang,\n                    tgt_lang=tgt_lang,\n                )\n            )\n        )\n    )\n\n    recognizer.decode_stream(stream)\n    return stream.result.text\n\n\ndef main():\n    recognizer, en_wav, de_wav = create_recognizer()\n\n    en_audio, en_sample_rate = sf.read(en_wav, dtype=\"float32\", always_2d=True)\n    en_audio = en_audio[:, 0]  # only use the first channel\n\n    de_audio, de_sample_rate = sf.read(de_wav, dtype=\"float32\", always_2d=True)\n    de_audio = de_audio[:, 0]  # only use the first channel\n\n    en_wav_en_result = decode(\n        recognizer, en_audio, en_sample_rate, src_lang=\"en\", tgt_lang=\"en\"\n    )\n    en_wav_es_result = decode(\n        recognizer, en_audio, en_sample_rate, src_lang=\"en\", tgt_lang=\"es\"\n    )\n    en_wav_de_result = decode(\n        recognizer, en_audio, en_sample_rate, src_lang=\"en\", tgt_lang=\"de\"\n    )\n    en_wav_fr_result = decode(\n        recognizer, en_audio, en_sample_rate, src_lang=\"en\", tgt_lang=\"fr\"\n    )\n\n    de_wav_en_result = decode(\n        recognizer, de_audio, de_sample_rate, src_lang=\"de\", tgt_lang=\"en\"\n    )\n    de_wav_de_result = decode(\n        recognizer, de_audio, de_sample_rate, src_lang=\"de\", tgt_lang=\"de\"\n    )\n\n    print(\"en_wav_en_result\", en_wav_en_result)\n    print(\"en_wav_es_result\", en_wav_es_result)\n    print(\"en_wav_de_result\", en_wav_de_result)\n    print(\"en_wav_fr_result\", en_wav_fr_result)\n    print(\"-\" * 10)\n    print(\"de_wav_en_result\", de_wav_en_result)\n    print(\"de_wav_de_result\", de_wav_de_result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-nemo-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming CTC model from NeMo\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\nThe example model supports 10 languages and it is converted from\nhttps://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/model.onnx\"\n    tokens = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt\"\n\n    test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/en-english.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/es-spanish.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/fr-french.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/hr-croatian.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/it-italian.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/po-polish.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/ru-russian.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/uk-ukrainian.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            
https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-nemo-parakeet-decode-file.py",
    "content": "# Example using the sherpa-onnx Python API and sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8 model\n# Prints recognized text, per-token timestamps, and durations\n\nimport os\nimport sys\nimport sherpa_onnx\nimport soundfile as sf\n\nwav_filename = \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/test_wavs/en.wav\"\nencoder = \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/encoder.int8.onnx\"\ndecoder = \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/decoder.int8.onnx\"\njoiner = \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/joiner.int8.onnx\"\ntokens = \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/tokens.txt\"\n\nif not os.path.exists(wav_filename):\n    print(f\"File not found: {wav_filename}\")\n    sys.exit(1)\n\n\nrecognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n    encoder,\n    decoder,\n    joiner,\n    tokens,\n    num_threads=1,\n    provider=\"cpu\",\n    debug=False,\n    decoding_method=\"greedy_search\",\n    model_type=\"nemo_transducer\"\n)\n\naudio, sample_rate = sf.read(wav_filename, dtype=\"float32\", always_2d=True)\naudio = audio[:, 0]  # use first channel if multi-channel\nstream = recognizer.create_stream()\nstream.accept_waveform(sample_rate, audio)\nrecognizer.decode_stream(stream)\nresult = stream.result\n\nprint(f\"Recognized text: {result.text}\")\n\nif hasattr(result, \"tokens\") and hasattr(result, \"timestamps\") and hasattr(result, \"durations\"):\n    print(\"Token\\tTimestamp\\tDuration\")\n    for token, ts, dur in zip(result.tokens, result.timestamps, result.durations):\n        print(f\"{token}\\t{ts:.2f}\\t{dur:.2f}\")\nelse:\n    print(\"Timestamps or durations not available.\")\n"
  },
  {
    "path": "python-api-examples/offline-nemo-transducer-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming transducer model from NeMo\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\nThe example model supports 10 languages and it is converted from\nhttps://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/encoder.onnx\"\n    decoder = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/decoder.onnx\"\n    joiner = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/joiner.onnx\"\n    tokens = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/tokens.txt\"\n\n    test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/de-german.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/en-english.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/es-spanish.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/fr-french.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/hr-croatian.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/it-italian.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/po-polish.wav\"\n    #  test_wav = \"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/ru-russian.wav\"\n    #  test_wav = 
\"./sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k/test_wavs/uk-ukrainian.wav\"\n\n    if not Path(encoder).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=encoder,\n            decoder=decoder,\n            joiner=joiner,\n            tokens=tokens,\n            model_type=\"nemo_transducer\",\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-omnilingual-asr-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Omnilingual ASR CTC model from\nhttps://github.com/facebookresearch/omnilingual-asr\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport time\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\"\n    test_wav_en = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\"\n    test_wav_de = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/de.wav\"\n    test_wav_fr = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/fr.wav\"\n    test_wav_es = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/es.wav\"\n\n    for f in [model, tokens, test_wav_en, test_wav_de, test_wav_fr, test_wav_es]:\n        if not Path(f).is_file():\n            print(f\"{f} does not exist\")\n\n            raise ValueError(\n                \"\"\"Please download model files from\n                https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n                \"\"\"\n            )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_omnilingual_asr_ctc(\n            model=model,\n            tokens=tokens,\n            num_threads=1,\n        ),\n        test_wav_en,\n        
test_wav_de,\n        test_wav_fr,\n        test_wav_es,\n    )\n\n\ndef load_audio(filename):\n    audio, sample_rate = sf.read(filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        import librosa\n\n        audio = librosa.resample(audio, orig_sr=sample_rate, target_sr=16000)\n\n    return np.ascontiguousarray(audio)\n\n\ndef decode_single_file(recognizer, filename):\n    samples = load_audio(filename)\n\n    start_time = time.time()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate=16000, waveform=samples)\n    recognizer.decode_stream(stream)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"---\")\n    print(filename)\n    print(stream.result)\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n\n\ndef decode_multiple_files(recognizer, filenames):\n    streams = []\n\n    start_time = time.time()\n\n    audio_duration = 0\n\n    for filename in filenames:\n        samples = load_audio(filename)\n        audio_duration += len(samples) / 16000\n\n        stream = recognizer.create_stream()\n        stream.accept_waveform(sample_rate=16000, waveform=samples)\n        streams.append(stream)\n\n    recognizer.decode_streams(streams)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    real_time_factor = elapsed_seconds / audio_duration\n\n    for name, stream in zip(filenames, streams):\n        print(\"---\")\n        print(name)\n        print(stream.result)\n        print()\n\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: 
{elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n    print()\n\n\ndef main():\n    recognizer, *filenames = create_recognizer()\n\n    decode_single_file(recognizer, filenames[0])\n    decode_single_file(recognizer, filenames[1])\n    decode_multiple_files(recognizer, filenames[2:])\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-omnilingual-asr-ctc-v2-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming Omnilingual ASR CTC v2 model from\nhttps://github.com/facebookresearch/omnilingual-asr\nto decode files.\n\nPlease download model files from\nhttps://huggingface.co/Edison2ST/sherpa-onnx-omnilingual-asr-1600-languages-ctc-v2\n\nFor instance,\n\nwget https://huggingface.co/Edison2ST/sherpa-onnx-omnilingual-asr-1600-languages-ctc-v2/resolve/main/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05.tar.bz2 # noqa: E501\ntar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05.tar.bz2\nrm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport time\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/tokens.txt\"\n    test_wav_en = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/test_wavs/en.wav\"\n    test_wav_de = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/test_wavs/de.wav\"\n    test_wav_fr = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/test_wavs/fr.wav\"\n    test_wav_es = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-v2-int8-2026-02-05/test_wavs/es.wav\"\n\n    for f in [model, tokens, test_wav_en, test_wav_de, test_wav_fr, test_wav_es]:\n        if not Path(f).is_file():\n            print(f\"{f} does not exist\")\n\n            raise ValueError(\"\"\"Please download model files from\n                https://huggingface.co/Edison2ST/sherpa-onnx-omnilingual-asr-1600-languages-ctc-v2\n                \"\"\")\n    return (\n        sherpa_onnx.OfflineRecognizer.from_omnilingual_asr_ctc(\n            model=model,\n            
tokens=tokens,\n            num_threads=1,\n        ),\n        test_wav_en,\n        test_wav_de,\n        test_wav_fr,\n        test_wav_es,\n    )\n\n\ndef load_audio(filename):\n    audio, sample_rate = sf.read(filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        import librosa\n\n        audio = librosa.resample(audio, orig_sr=sample_rate, target_sr=16000)\n\n    return np.ascontiguousarray(audio)\n\n\ndef decode_single_file(recognizer, filename):\n    samples = load_audio(filename)\n\n    start_time = time.time()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate=16000, waveform=samples)\n    recognizer.decode_stream(stream)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"---\")\n    print(filename)\n    print(stream.result)\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n\n\ndef decode_multiple_files(recognizer, filenames):\n    streams = []\n\n    start_time = time.time()\n\n    audio_duration = 0\n\n    for filename in filenames:\n        samples = load_audio(filename)\n        audio_duration += len(samples) / 16000\n\n        stream = recognizer.create_stream()\n        stream.accept_waveform(sample_rate=16000, waveform=samples)\n        streams.append(stream)\n\n    recognizer.decode_streams(streams)\n\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n    real_time_factor = elapsed_seconds / audio_duration\n\n    for name, stream in zip(filenames, streams):\n        print(\"---\")\n        print(name)\n        print(stream.result)\n        print()\n\n    print(f\"Elapsed seconds: 
{elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n    print()\n    print()\n\n\ndef main():\n    recognizer, *filenames = create_recognizer()\n\n    decode_single_file(recognizer, filenames[0])\n    decode_single_file(recognizer, filenames[1])\n    decode_multiple_files(recognizer, filenames[2:])\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-sense-voice-ctc-decode-files-with-hr.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming SenseVoice CTC model from\nhttps://github.com/FunAudioLLM/SenseVoice\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\ntar xf dict.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx\"\n    tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\"\n    test_wav = \"./test-hr.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            and\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/hr-files\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=model,\n            tokens=tokens,\n            use_itn=True,\n            debug=True,\n            hr_lexicon=\"./lexicon.txt\",\n            hr_rule_fsts=\"./replace.fst\",\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, 
dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-sense-voice-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming SenseVoice CTC model from\nhttps://github.com/FunAudioLLM/SenseVoice\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nrm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx\"\n    tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\"\n    test_wav = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav\"\n    #  test_wav = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/en.wav\"\n    #  test_wav = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/ja.wav\"\n    #  test_wav = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/ko.wav\"\n    #  test_wav = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/yue.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=model,\n            tokens=tokens,\n            use_itn=True,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 
1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-source-separation-spleeter.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\n\"\"\"\nThis file shows how to use spleeter for source separation.\n\nPlease first download a spleeter model from\n\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/source-separation-models\n\nThe following is an example:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/sherpa-onnx-spleeter-2stems-fp16.tar.bz2\n\nPlease also download a test file\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/qi-feng-le-zh.wav\n\nThe test wav file is 16-bit encoded with 2 channels. If you have other\nformats, e.g., .mp4 or .mp3, please first use ffmpeg to convert it.\nFor instance\n\n    ffmpeg -i your.mp4 -vn -acodec pcm_s16le -ar 44100 -ac 2 out.wav\n\nThen you can use out.wav as input for this example.\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_offline_source_separation():\n    # Please read the help message at the beginning of this file\n    # to download model files\n    vocals = \"./sherpa-onnx-spleeter-2stems-fp16/vocals.fp16.onnx\"\n    accompaniment = \"./sherpa-onnx-spleeter-2stems-fp16/accompaniment.fp16.onnx\"\n\n    if not Path(vocals).is_file():\n        raise ValueError(f\"{vocals} does not exist.\")\n\n    if not Path(accompaniment).is_file():\n        raise ValueError(f\"{accompaniment} does not exist.\")\n\n    config = sherpa_onnx.OfflineSourceSeparationConfig(\n        model=sherpa_onnx.OfflineSourceSeparationModelConfig(\n            spleeter=sherpa_onnx.OfflineSourceSeparationSpleeterModelConfig(\n                vocals=vocals,\n                accompaniment=accompaniment,\n            ),\n            num_threads=1,\n            debug=False,\n            provider=\"cpu\",\n        )\n    )\n    if not config.validate():\n        raise ValueError(\"Please check your config.\")\n\n    return 
sherpa_onnx.OfflineSourceSeparation(config)\n\n\ndef load_audio():\n    # Please read the help message at the beginning of this file to download\n    # the following wav_file\n    wav_file = \"./qi-feng-le-zh.wav\"\n    if not Path(wav_file).is_file():\n        raise ValueError(f\"{wav_file} does not exist\")\n\n    samples, sample_rate = sf.read(wav_file, dtype=\"float32\", always_2d=True)\n    samples = np.transpose(samples)\n    # now samples is of shape (num_channels, num_samples)\n    assert (\n        samples.shape[1] > samples.shape[0]\n    ), f\"You should use (num_channels, num_samples). {samples.shape}\"\n\n    assert (\n        samples.dtype == np.float32\n    ), f\"Expect np.float32 as dtype. Given: {samples.dtype}\"\n\n    return samples, sample_rate\n\n\ndef main():\n    sp = create_offline_source_separation()\n    samples, sample_rate = load_audio()\n    samples = np.ascontiguousarray(samples)\n\n    start = time.time()\n    output = sp.process(sample_rate=sample_rate, samples=samples)\n    end = time.time()\n\n    print(\"output.sample_rate\", output.sample_rate)\n\n    assert len(output.stems) == 2, len(output.stems)\n\n    vocals = output.stems[0].data\n    non_vocals = output.stems[1].data\n    # vocals.shape (num_channels, num_samples)\n\n    vocals = np.transpose(vocals)\n    non_vocals = np.transpose(non_vocals)\n\n    # vocals.shape (num_samples,num_channels)\n\n    sf.write(\"./spleeter-vocals.wav\", vocals, samplerate=output.sample_rate)\n    sf.write(\"./spleeter-non-vocals.wav\", non_vocals, samplerate=output.sample_rate)\n\n    elapsed_seconds = end - start\n    audio_duration = samples.shape[1] / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"Saved to ./spleeter-vocals.wav and ./spleeter-non-vocals.wav\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = 
{real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-source-separation-uvr.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\n\"\"\"\nThis file shows how to use UVR for source separation.\n\nPlease first download a UVR model from\n\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/source-separation-models\n\nThe following is an example:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/UVR_MDXNET_9482.onnx\n\nPlease also download a test file\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/qi-feng-le-zh.wav\n\nThe test wav file is 16-bit encoded with 2 channels. If you have other\nformats, e.g., .mp4 or .mp3, please first use ffmpeg to convert it.\nFor instance\n\n    ffmpeg -i your.mp4 -vn -acodec pcm_s16le -ar 44100 -ac 2 out.wav\n\nThen you can use out.wav as input for this example.\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_offline_source_separation():\n    # Please read the help message at the beginning of this file\n    # to download model files\n    model = \"./UVR_MDXNET_9482.onnx\"\n\n    if not Path(model).is_file():\n        raise ValueError(f\"{model} does not exist.\")\n\n    config = sherpa_onnx.OfflineSourceSeparationConfig(\n        model=sherpa_onnx.OfflineSourceSeparationModelConfig(\n            uvr=sherpa_onnx.OfflineSourceSeparationUvrModelConfig(\n                model=model,\n            ),\n            num_threads=1,\n            debug=False,\n            provider=\"cpu\",\n        )\n    )\n    if not config.validate():\n        raise ValueError(\"Please check your config.\")\n\n    return sherpa_onnx.OfflineSourceSeparation(config)\n\n\ndef load_audio():\n    # Please read the help message at the beginning of this file to download\n    # the following wav_file\n    wav_file = \"./qi-feng-le-zh.wav\"\n    if not Path(wav_file).is_file():\n        raise ValueError(f\"{wav_file} does not exist\")\n\n    
samples, sample_rate = sf.read(wav_file, dtype=\"float32\", always_2d=True)\n    samples = np.transpose(samples)\n    # now samples is of shape (num_channels, num_samples)\n    assert (\n        samples.shape[1] > samples.shape[0]\n    ), f\"You should use (num_channels, num_samples). {samples.shape}\"\n\n    assert (\n        samples.dtype == np.float32\n    ), f\"Expect np.float32 as dtype. Given: {samples.dtype}\"\n\n    return samples, sample_rate\n\n\ndef main():\n    sp = create_offline_source_separation()\n    samples, sample_rate = load_audio()\n    samples = np.ascontiguousarray(samples)\n\n    print(\"Started. Please wait\")\n    start = time.time()\n    output = sp.process(sample_rate=sample_rate, samples=samples)\n    end = time.time()\n\n    print(\"output.sample_rate\", output.sample_rate)\n\n    assert len(output.stems) == 2, len(output.stems)\n\n    vocals = output.stems[0].data\n    non_vocals = output.stems[1].data\n    # vocals.shape (num_channels, num_samples)\n\n    vocals = np.transpose(vocals)\n    non_vocals = np.transpose(non_vocals)\n\n    # vocals.shape (num_samples,num_channels)\n\n    sf.write(\"./uvr-vocals.wav\", vocals, samplerate=output.sample_rate)\n    sf.write(\"./uvr-non-vocals.wav\", non_vocals, samplerate=output.sample_rate)\n\n    elapsed_seconds = end - start\n    audio_duration = samples.shape[1] / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"Saved to ./uvr-vocals.wav and ./uvr-non-vocals.wav\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-speaker-diarization.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2024  Xiaomi Corporation\n\n\"\"\"\nThis file shows how to use sherpa-onnx Python API for\noffline/non-streaming speaker diarization.\n\nUsage:\n\nStep 1: Download a speaker segmentation model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\nStep 2: Download a speaker embedding extractor model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\nStep 3. Download test wave files\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available test wave files. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nStep 4. Run it\n\n    python3 ./python-api-examples/offline-speaker-diarization.py\n\n\"\"\"\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\nimport librosa\n\n\ndef resample_audio(audio, sample_rate, target_sample_rate):\n    \"\"\"\n    Resample audio to target sample rate using librosa\n    \"\"\"\n    if sample_rate != target_sample_rate:\n        print(f\"Resampling audio from {sample_rate}Hz to {target_sample_rate}Hz...\")\n        audio = librosa.resample(audio, orig_sr=sample_rate, target_sr=target_sample_rate)\n        print(f\"Resampling completed. 
New audio shape: {audio.shape}\")\n        return audio, target_sample_rate\n    return audio, sample_rate\n\n\ndef init_speaker_diarization(num_speakers: int = -1, cluster_threshold: float = 0.5):\n    \"\"\"\n    Args:\n      num_speakers:\n        If you know the actual number of speakers in the wave file, then please\n        specify it. Otherwise, leave it to -1\n      cluster_threshold:\n        If num_speakers is -1, then this threshold is used for clustering.\n        A smaller cluster_threshold leads to more clusters, i.e., more speakers.\n        A larger cluster_threshold leads to fewer clusters, i.e., fewer speakers.\n    \"\"\"\n    segmentation_model = \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\"\n    embedding_extractor_model = (\n        \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n    )\n\n    config = sherpa_onnx.OfflineSpeakerDiarizationConfig(\n        segmentation=sherpa_onnx.OfflineSpeakerSegmentationModelConfig(\n            pyannote=sherpa_onnx.OfflineSpeakerSegmentationPyannoteModelConfig(\n                model=segmentation_model\n            ),\n        ),\n        embedding=sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n            model=embedding_extractor_model\n        ),\n        clustering=sherpa_onnx.FastClusteringConfig(\n            num_clusters=num_speakers, threshold=cluster_threshold\n        ),\n        min_duration_on=0.3,\n        min_duration_off=0.5,\n    )\n    if not config.validate():\n        raise RuntimeError(\n            \"Please check your config and make sure all required files exist\"\n        )\n\n    return sherpa_onnx.OfflineSpeakerDiarization(config)\n\n\ndef progress_callback(num_processed_chunk: int, num_total_chunks: int) -> int:\n    progress = num_processed_chunk / num_total_chunks * 100\n    print(f\"Progress: {progress:.3f}%\")\n    return 0\n\n\ndef main():\n    wave_filename = \"./0-four-speakers-zh.wav\"\n    if not Path(wave_filename).is_file():\n        raise 
RuntimeError(f\"{wave_filename} does not exist\")\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # Since we know there are 4 speakers in the above test wave file, we use\n    # num_speakers 4 here\n    sd = init_speaker_diarization(num_speakers=4)\n\n    # Resample audio to match the expected sample rate\n    target_sample_rate = sd.sample_rate\n    audio, sample_rate = resample_audio(audio, sample_rate, target_sample_rate)\n\n    if sample_rate != sd.sample_rate:\n        raise RuntimeError(\n            f\"Expected sample rate: {sd.sample_rate}, given: {sample_rate}\"\n        )\n\n    show_progress = True\n\n    if show_progress:\n        result = sd.process(audio, callback=progress_callback).sort_by_start_time()\n    else:\n        result = sd.process(audio).sort_by_start_time()\n\n    for r in result:\n        print(f\"{r.start:.3f} -- {r.end:.3f} speaker_{r.speaker:02}\")\n        #  print(r) # this one is simpler\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-speech-enhancement-dpdfnet.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use the speech enhancement API with DPDFNet.\n\nPlease download DPDFNet models from the sherpa-onnx GitHub release\nor the official Hugging Face hub:\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\nhttps://huggingface.co/Ceva-IP/DPDFNet\n\nExample:\n\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet4.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet8.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet2_48khz_hr.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav\n\nUse 16 kHz DPDFNet models such as `dpdfnet_baseline.onnx`, `dpdfnet2.onnx`,\n`dpdfnet4.onnx`, or `dpdfnet8.onnx` for downstream ASR or speech recognition.\nUse `dpdfnet2_48khz_hr.onnx` for 48 kHz enhancement output.\n\"\"\"\n\nimport time\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_speech_denoiser():\n    model_filename = \"./dpdfnet_baseline.onnx\"\n    if not Path(model_filename).is_file():\n        print(f\"{model_filename} does not exist\")\n        raise ValueError(\n            \"Please first download a DPDFNet model from \"\n            \"the sherpa-onnx GitHub release or the official Hugging Face hub: \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models or \"\n            \"https://huggingface.co/Ceva-IP/DPDFNet\"\n        )\n\n    config = sherpa_onnx.OfflineSpeechDenoiserConfig(\n        model=sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n     
       dpdfnet=sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(\n                model=model_filename\n            ),\n            debug=False,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n    )\n    if not config.validate():\n        print(config)\n        raise ValueError(\"Errors in config. Please check previous error logs\")\n    return sherpa_onnx.OfflineSpeechDenoiser(config)\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef main():\n    sd = create_speech_denoiser()\n    test_wave = \"./speech_with_noise.wav\"\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"{test_wave} does not exist. You can download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    samples, sample_rate = load_audio(test_wave)\n\n    start = time.time()\n    denoised = sd(samples, sample_rate)\n    end = time.time()\n\n    elapsed_seconds = end - start\n    audio_duration = len(samples) / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = f\"./enhanced_{denoised.sample_rate}.wav\"\n    sf.write(output_filename, denoised.samples, denoised.sample_rate)\n    print(f\"Saved to {output_filename}\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-speech-enhancement-gtcrn.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use the speech enhancement API.\n\nPlease download files used by this script from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nExample:\n\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/speech_with_noise.wav\n\"\"\"\n\nimport time\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_speech_denoiser():\n    model_filename = \"./gtcrn_simple.onnx\"\n    if not Path(model_filename).is_file():\n        raise ValueError(\n            \"Please first download a model from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    config = sherpa_onnx.OfflineSpeechDenoiserConfig(\n        model=sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n            gtcrn=sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(\n                model=model_filename\n            ),\n            debug=False,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n    )\n    if not config.validate():\n        print(config)\n        raise ValueError(\"Errors in config. Please check previous error logs\")\n    return sherpa_onnx.OfflineSpeechDenoiser(config)\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef main():\n    sd = create_speech_denoiser()\n    test_wave = \"./speech_with_noise.wav\"\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"{test_wave} does not exist. 
You can download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    samples, sample_rate = load_audio(test_wave)\n\n    start = time.time()\n    denoised = sd(samples, sample_rate)\n    end = time.time()\n\n    elapsed_seconds = end - start\n    audio_duration = len(samples) / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    sf.write(\"./enhanced_16k.wav\", denoised.samples, denoised.sample_rate)\n    print(\"Saved to ./enhanced_16k.wav\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-telespeech-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming CTC model from\nhttps://github.com/Tele-AI/TeleSpeech-ASR\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt\"\n    test_wav = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav\"\n    #  test_wav = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/4-tianjin.wav\"\n    #  test_wav = \"./sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/5-henan.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_telespeech_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-tts-play.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API to generate audio\nfrom text, i.e., text-to-speech.\n\nDifferent from ./offline-tts.py, this file plays back the generated audio\nwhile the model is still generating.\n\nUsage:\n\nExample (1/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n --output-filename=./generated.wav \\\n \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (2/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-zh-aishell3.tar.bz2\ntar xvf vits-zh-aishell3.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n --tts-rule-fsts='./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst' \\\n --sid=21 \\\n --output-filename=./liubei-21.wav \\\n \"勿以恶小而为之，勿以善小而不为。惟贤惟德，能服于人。122334\"\n\nExample (3/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\ntar xvf sherpa-onnx-vits-zh-ll.tar.bz2\nrm sherpa-onnx-vits-zh-ll.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n --vits-model=./sherpa-onnx-vits-zh-ll/model.onnx \\\n --vits-lexicon=./sherpa-onnx-vits-zh-ll/lexicon.txt \\\n 
--vits-tokens=./sherpa-onnx-vits-zh-ll/tokens.txt \\\n --tts-rule-fsts=./sherpa-onnx-vits-zh-ll/phone.fst,./sherpa-onnx-vits-zh-ll/date.fst,./sherpa-onnx-vits-zh-ll/number.fst \\\n --sid=2 \\\n --output-filename=./test-2.wav \\\n \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔。2024年5月11号，拨打110或者18920240511。123456块钱。\"\n\nExample (4/8)\n\ncurl -O -SL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts-play.py \\\n --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n --matcha-vocoder=./vocos-22khz-univ.onnx \\\n --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n --output-filename=./test-matcha.wav \\\n \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n\nExample (5/8)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts-play.py \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --output-filename=./test-matcha-ljspeech-en.wav \\\n  --num-threads=2 \\\n \"Today as always, men fall into two groups: 
slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (6/8)\n\n(This version of kokoro supports only English)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n  --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n  --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n  --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=10 \\\n  --output-filename=\"./kokoro-10.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (7/8)\n\n(This version of kokoro supports English, Chinese, etc.)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-multi-lang-v1_0/model.onnx \\\n  --kokoro-voices=./kokoro-multi-lang-v1_0/voices.bin \\\n  --kokoro-tokens=./kokoro-multi-lang-v1_0/tokens.txt \\\n  --kokoro-data-dir=./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --kokoro-lexicon=./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --num-threads=2 \\\n  --sid=18 \\\n  --output-filename=\"./kokoro-18-zh-en.wav\" \\\n  \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 
你觉得中英文说的如何呢？\"\n\nExample (8/8)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\npython3 ./python-api-examples/offline-tts-play.py \\\n  --debug=1 \\\n  --kitten-model=./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --kitten-voices=./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --kitten-tokens=./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --kitten-data-dir=./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=0 \\\n  --output-filename=\"./kitten-0.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/index.html\nfor details.\n\"\"\"\n\nimport argparse\nimport logging\nimport queue\nimport sys\nimport threading\nimport time\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef add_vits_args(parser):\n    parser.add_argument(\n        \"--vits-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to vits model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--vits-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt\",\n    )\n\n    parser.add_argument(\n        \"--vits-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--vits-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"\"\"Path to the dict directory of espeak-ng. If it is specified,\n        --vits-lexicon and --vits-tokens are ignored\"\"\",\n    )\n\n\ndef add_matcha_args(parser):\n    parser.add_argument(\n        \"--matcha-acoustic-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-vocoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to vocoder for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"\"\"Path to the dict directory of espeak-ng. 
If it is specified,\n        --matcha-lexicon and --matcha-tokens are ignored\"\"\",\n    )\n\n\ndef add_kokoro_args(parser):\n    parser.add_argument(\n        \"--kokoro-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-voices\",\n        type=str,\n        default=\"\",\n        help=\"Path to voices.bin for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"Path to the dict directory of espeak-ng.\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt for kokoro. Needed only by multilingual kokoro\",\n    )\n\n\ndef add_kitten_args(parser):\n    parser.add_argument(\n        \"--kitten-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-voices\",\n        type=str,\n        default=\"\",\n        help=\"Path to voices.bin for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"Path to the dict directory of espeak-ng.\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    add_vits_args(parser)\n    add_matcha_args(parser)\n    add_kokoro_args(parser)\n    add_kitten_args(parser)\n\n    parser.add_argument(\n        \"--tts-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"Path to 
rule.fst\",\n    )\n\n    parser.add_argument(\n        \"--output-filename\",\n        type=str,\n        default=\"./generated.wav\",\n        help=\"Path to save generated wave\",\n    )\n\n    parser.add_argument(\n        \"--sid\",\n        type=int,\n        default=0,\n        help=\"\"\"Speaker ID. Used only for multi-speaker models, e.g.\n        models trained using the VCTK dataset. Not used for single-speaker\n        models, e.g., models trained using the LJ speech dataset.\n        \"\"\",\n    )\n\n    # Note: type=bool would treat every non-empty string, including \"0\" and\n    # \"false\", as True, so the value is parsed explicitly here.\n    parser.add_argument(\n        \"--debug\",\n        type=lambda s: s.lower() in (\"1\", \"true\", \"yes\"),\n        default=False,\n        help=\"Set to 1 or true to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--speed\",\n        type=float,\n        default=1.0,\n        help=\"Speech speed. 
Larger->faster; smaller->slower\",\n    )\n\n    parser.add_argument(\n        \"text\",\n        type=str,\n        help=\"The input text to generate audio for\",\n    )\n\n    return parser.parse_args()\n\n\n# buffer saves audio samples to be played\nbuffer = queue.Queue()\n\n# started is set to True once generated_audio_callback is called.\nstarted = False\n\n# stopped is set to True once all the text has been processed\nstopped = False\n\n# killed is set to True once ctrl + C is pressed\nkilled = False\n\n# Note: When started is True, and stopped is True, and buffer is empty,\n# we will exit the program since all audio samples have been played.\n\nsample_rate = None\n\nevent = threading.Event()\n\nfirst_message_time = None\n\n\ndef generated_audio_callback(samples: np.ndarray, progress: float):\n    \"\"\"This function is called whenever max_num_sentences sentences\n    have been processed.\n\n    Note that it is passed to C++ and is invoked in C++.\n\n    Args:\n      samples:\n        A 1-D np.float32 array containing audio samples\n    \"\"\"\n    global first_message_time\n    if first_message_time is None:\n        first_message_time = time.time()\n\n    buffer.put(samples)\n    global started\n\n    if started is False:\n        logging.info(\"Start playing ...\")\n    started = True\n\n    # 1 means to keep generating\n    # 0 means to stop generating\n    if killed:\n        return 0\n\n    return 1\n\n\n# see https://python-sounddevice.readthedocs.io/en/0.4.6/api/streams.html#sounddevice.OutputStream\ndef play_audio_callback(\n    outdata: np.ndarray, frames: int, time, status: sd.CallbackFlags\n):\n    if killed or (started and buffer.empty() and stopped):\n        event.set()\n\n    # outdata is of shape (frames, num_channels)\n    if buffer.empty():\n        outdata.fill(0)\n        return\n\n    n = 0\n    while n < frames and not buffer.empty():\n        remaining = frames - n\n        k = buffer.queue[0].shape[0]\n\n        if remaining <= k:\n   
         outdata[n:, 0] = buffer.queue[0][:remaining]\n            buffer.queue[0] = buffer.queue[0][remaining:]\n            n = frames\n            if buffer.queue[0].shape[0] == 0:\n                buffer.get()\n\n            break\n\n        outdata[n : n + k, 0] = buffer.get()\n        n += k\n\n    if n < frames:\n        outdata[n:, 0] = 0\n\n\n# Please see\n# https://python-sounddevice.readthedocs.io/en/0.4.6/usage.html#device-selection\n# for how to select a device\ndef play_audio():\n    if False:\n        # This if branch can be safely removed. It is here to show you how to\n        # change the default output device in case you need that.\n        devices = sd.query_devices()\n        print(devices)\n\n        # sd.default.device[1] is the output device. If you want to\n        # select a different device, say, 3, as the output device, please\n        # use sd.default.device[1] = 3\n\n        default_output_device_idx = sd.default.device[1]\n        print(\n            f'Use default output device: {devices[default_output_device_idx][\"name\"]}'\n        )\n\n    with sd.OutputStream(\n        channels=1,\n        callback=play_audio_callback,\n        dtype=\"float32\",\n        samplerate=sample_rate,\n        blocksize=1024,\n    ):\n        event.wait()\n\n    logging.info(\"Exiting ...\")\n\n\ndef main():\n    args = get_args()\n    print(args)\n\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            vits=sherpa_onnx.OfflineTtsVitsModelConfig(\n                model=args.vits_model,\n                lexicon=args.vits_lexicon,\n                data_dir=args.vits_data_dir,\n                tokens=args.vits_tokens,\n            ),\n            matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(\n                acoustic_model=args.matcha_acoustic_model,\n                vocoder=args.matcha_vocoder,\n                lexicon=args.matcha_lexicon,\n                tokens=args.matcha_tokens,\n            
    data_dir=args.matcha_data_dir,\n            ),\n            kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(\n                model=args.kokoro_model,\n                voices=args.kokoro_voices,\n                tokens=args.kokoro_tokens,\n                data_dir=args.kokoro_data_dir,\n                lexicon=args.kokoro_lexicon,\n            ),\n            kitten=sherpa_onnx.OfflineTtsKittenModelConfig(\n                model=args.kitten_model,\n                voices=args.kitten_voices,\n                tokens=args.kitten_tokens,\n                data_dir=args.kitten_data_dir,\n            ),\n            provider=args.provider,\n            debug=args.debug,\n            num_threads=args.num_threads,\n        ),\n        rule_fsts=args.tts_rule_fsts,\n        max_num_sentences=1,\n    )\n\n    if not tts_config.validate():\n        raise ValueError(\"Please check your config\")\n\n    logging.info(\"Loading model ...\")\n    tts = sherpa_onnx.OfflineTts(tts_config)\n    logging.info(\"Loading model done.\")\n\n    global sample_rate\n    sample_rate = tts.sample_rate\n\n    play_back_thread = threading.Thread(target=play_audio)\n    play_back_thread.start()\n\n    logging.info(\"Start generating ...\")\n    start_time = time.time()\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.sid = args.sid\n    gen_config.speed = args.speed\n    gen_config.silence_scale = 0.2\n    audio = tts.generate(\n        args.text,\n        gen_config,\n        callback=generated_audio_callback,\n    )\n    end_time = time.time()\n    logging.info(\"Finished generating!\")\n    global stopped\n    stopped = True\n\n    if len(audio.samples) == 0:\n        print(\"Error in generating audios. 
Please read previous error messages.\")\n        global killed\n        killed = True\n        play_back_thread.join()\n        return\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    sf.write(\n        args.output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    logging.info(f\"The text is '{args.text}'\")\n    logging.info(\n        \"Time in seconds to receive the first \"\n        f\"message: {first_message_time-start_time:.3f}\"\n    )\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n    logging.info(f\"***  Saved to {args.output_filename} ***\")\n\n    print(\"\\n   >>>>>>>>> You can safely press ctrl + C to stop the play <<<<<<<<<<\\n\")\n\n    play_back_thread.join()\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n        killed = True\n        sys.exit(0)\n"
  },
  {
    "path": "python-api-examples/offline-tts.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023-2025  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API to generate audio\nfrom text, i.e., text-to-speech.\n\n\nDifferent from ./offline-tts-play.py, this file does not play back the\ngenerated audio.\n\nUsage:\n\nExample (1/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n --output-filename=./generated.wav \\\n \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (2/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-icefall-zh-aishell3.tar.bz2\ntar xvf vits-icefall-zh-aishell3.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n --vits-model=./vits-icefall-zh-aishell3/model.onnx \\\n --vits-lexicon=./vits-icefall-zh-aishell3/lexicon.txt \\\n --vits-tokens=./vits-icefall-zh-aishell3/tokens.txt \\\n --tts-rule-fsts='./vits-icefall-zh-aishell3/phone.fst,./vits-icefall-zh-aishell3/date.fst,./vits-icefall-zh-aishell3/number.fst' \\\n --sid=21 \\\n --output-filename=./liubei-21.wav \\\n \"勿以恶小而为之，勿以善小而不为。惟贤惟德，能服于人。122334\"\n\nExample (3/8)\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2\ntar xvf sherpa-onnx-vits-zh-ll.tar.bz2\nrm sherpa-onnx-vits-zh-ll.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n --vits-model=./sherpa-onnx-vits-zh-ll/model.onnx \\\n --vits-lexicon=./sherpa-onnx-vits-zh-ll/lexicon.txt \\\n --vits-tokens=./sherpa-onnx-vits-zh-ll/tokens.txt 
\\\n --tts-rule-fsts=./sherpa-onnx-vits-zh-ll/phone.fst,./sherpa-onnx-vits-zh-ll/date.fst,./sherpa-onnx-vits-zh-ll/number.fst \\\n --sid=2 \\\n --output-filename=./test-2.wav \\\n \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔。2024年5月11号，拨打110或者18920240511。123456块钱。\"\n\nExample (4/8)\n\ncurl -O -SL https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\ntar xvf matcha-icefall-zh-baker.tar.bz2\nrm matcha-icefall-zh-baker.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts.py \\\n --matcha-acoustic-model=./matcha-icefall-zh-baker/model-steps-3.onnx \\\n --matcha-vocoder=./vocos-22khz-univ.onnx \\\n --matcha-lexicon=./matcha-icefall-zh-baker/lexicon.txt \\\n --matcha-tokens=./matcha-icefall-zh-baker/tokens.txt \\\n --tts-rule-fsts=./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst \\\n --output-filename=./test-matcha.wav \\\n \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n\nExample (5/8)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\ntar xvf matcha-icefall-en_US-ljspeech.tar.bz2\nrm matcha-icefall-en_US-ljspeech.tar.bz2\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\npython3 ./python-api-examples/offline-tts.py \\\n  --matcha-acoustic-model=./matcha-icefall-en_US-ljspeech/model-steps-3.onnx \\\n  --matcha-vocoder=./vocos-22khz-univ.onnx \\\n  --matcha-tokens=./matcha-icefall-en_US-ljspeech/tokens.txt \\\n  --matcha-data-dir=./matcha-icefall-en_US-ljspeech/espeak-ng-data \\\n  --output-filename=./test-matcha-ljspeech-en.wav \\\n  --num-threads=2 \\\n \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (6/8)\n\n(This version of kokoro supports only English)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\ntar xf kokoro-en-v0_19.tar.bz2\nrm kokoro-en-v0_19.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-en-v0_19/model.onnx \\\n  --kokoro-voices=./kokoro-en-v0_19/voices.bin \\\n  --kokoro-tokens=./kokoro-en-v0_19/tokens.txt \\\n  --kokoro-data-dir=./kokoro-en-v0_19/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=10 \\\n  --output-filename=\"./kokoro-10.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nExample (7/8)\n\n(This version of kokoro supports English, Chinese, etc.)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\ntar xf kokoro-multi-lang-v1_0.tar.bz2\nrm kokoro-multi-lang-v1_0.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kokoro-model=./kokoro-multi-lang-v1_0/model.onnx \\\n  --kokoro-voices=./kokoro-multi-lang-v1_0/voices.bin \\\n  --kokoro-tokens=./kokoro-multi-lang-v1_0/tokens.txt \\\n  --kokoro-data-dir=./kokoro-multi-lang-v1_0/espeak-ng-data \\\n  --kokoro-lexicon=./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt \\\n  --num-threads=2 \\\n  --sid=18 \\\n  --output-filename=\"./kokoro-18-zh-en.wav\" \\\n  \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 
你觉得中英文说的如何呢？\"\n\nExample (8/8)\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\ntar xf kitten-nano-en-v0_1-fp16.tar.bz2\nrm kitten-nano-en-v0_1-fp16.tar.bz2\n\npython3 ./python-api-examples/offline-tts.py \\\n  --debug=1 \\\n  --kitten-model=./kitten-nano-en-v0_1-fp16/model.fp16.onnx \\\n  --kitten-voices=./kitten-nano-en-v0_1-fp16/voices.bin \\\n  --kitten-tokens=./kitten-nano-en-v0_1-fp16/tokens.txt \\\n  --kitten-data-dir=./kitten-nano-en-v0_1-fp16/espeak-ng-data \\\n  --num-threads=2 \\\n  --sid=0 \\\n  --output-filename=\"./kitten-0.wav\" \\\n  \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/index.html\nfor details.\n\n\"\"\"\n\nimport argparse\nimport time\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef add_vits_args(parser):\n    parser.add_argument(\n        \"--vits-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to vits model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--vits-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt\",\n    )\n\n    parser.add_argument(\n        \"--vits-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--vits-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"\"\"Path to the dict directory of espeak-ng. 
If it is specified,\n        --vits-lexicon and --vits-tokens are ignored\"\"\",\n    )\n\n\ndef add_matcha_args(parser):\n    parser.add_argument(\n        \"--matcha-acoustic-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-vocoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to vocoder for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for matcha\",\n    )\n\n    parser.add_argument(\n        \"--matcha-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"\"\"Path to the dict directory of espeak-ng. If it is specified,\n        --matcha-lexicon and --matcha-tokens are ignored\"\"\",\n    )\n\n\ndef add_kokoro_args(parser):\n    parser.add_argument(\n        \"--kokoro-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-voices\",\n        type=str,\n        default=\"\",\n        help=\"Path to voices.bin for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for kokoro\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"Path to the dict directory of espeak-ng.\",\n    )\n\n    parser.add_argument(\n        \"--kokoro-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"Path to lexicon.txt for kokoro. 
Needed only by multilingual kokoro\",\n    )\n\n\ndef add_kitten_args(parser):\n    parser.add_argument(\n        \"--kitten-model\",\n        type=str,\n        default=\"\",\n        help=\"Path to model.onnx for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-voices\",\n        type=str,\n        default=\"\",\n        help=\"Path to voices.bin for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt for kitten\",\n    )\n\n    parser.add_argument(\n        \"--kitten-data-dir\",\n        type=str,\n        default=\"\",\n        help=\"Path to the dict directory of espeak-ng.\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    add_vits_args(parser)\n    add_matcha_args(parser)\n    add_kokoro_args(parser)\n    add_kitten_args(parser)\n\n    parser.add_argument(\n        \"--tts-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"Path to rule.fst\",\n    )\n\n    parser.add_argument(\n        \"--max-num-sentences\",\n        type=int,\n        default=1,\n        help=\"\"\"Max number of sentences in a batch to avoid OOM if the input\n        text is very long. Set it to -1 to process all the sentences in a\n        single batch. A smaller value does not mean it is slower compared\n        to a larger one on CPU.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--output-filename\",\n        type=str,\n        default=\"./generated.wav\",\n        help=\"Path to save generated wave\",\n    )\n\n    parser.add_argument(\n        \"--sid\",\n        type=int,\n        default=0,\n        help=\"\"\"Speaker ID. Used only for multi-speaker models, e.g.\n        models trained using the VCTK dataset. 
Not used for single-speaker\n        models, e.g., models trained using the LJ speech dataset.\n        \"\"\",\n    )\n\n    # Note: type=bool would treat every non-empty string, including \"0\" and\n    # \"false\", as True, so the value is parsed explicitly here.\n    parser.add_argument(\n        \"--debug\",\n        type=lambda s: s.lower() in (\"1\", \"true\", \"yes\"),\n        default=False,\n        help=\"Set to 1 or true to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--speed\",\n        type=float,\n        default=1.0,\n        help=\"Speech speed. Larger->faster; smaller->slower\",\n    )\n\n    parser.add_argument(\n        \"text\",\n        type=str,\n        help=\"The input text to generate audio for\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args)\n\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            vits=sherpa_onnx.OfflineTtsVitsModelConfig(\n                model=args.vits_model,\n                lexicon=args.vits_lexicon,\n                data_dir=args.vits_data_dir,\n                tokens=args.vits_tokens,\n            ),\n            matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(\n                acoustic_model=args.matcha_acoustic_model,\n                vocoder=args.matcha_vocoder,\n                lexicon=args.matcha_lexicon,\n                tokens=args.matcha_tokens,\n                data_dir=args.matcha_data_dir,\n            ),\n            kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(\n                model=args.kokoro_model,\n                voices=args.kokoro_voices,\n                tokens=args.kokoro_tokens,\n                data_dir=args.kokoro_data_dir,\n                lexicon=args.kokoro_lexicon,\n            ),\n            
kitten=sherpa_onnx.OfflineTtsKittenModelConfig(\n                model=args.kitten_model,\n                voices=args.kitten_voices,\n                tokens=args.kitten_tokens,\n                data_dir=args.kitten_data_dir,\n            ),\n            provider=args.provider,\n            debug=args.debug,\n            num_threads=args.num_threads,\n        ),\n        rule_fsts=args.tts_rule_fsts,\n        max_num_sentences=args.max_num_sentences,\n    )\n    if not tts_config.validate():\n        raise ValueError(\"Please check your config\")\n\n    tts = sherpa_onnx.OfflineTts(tts_config)\n\n    start = time.time()\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.sid = args.sid\n    gen_config.speed = args.speed\n    gen_config.silence_scale = 0.2\n    audio = tts.generate(args.text, gen_config)\n    end = time.time()\n\n    if len(audio.samples) == 0:\n        print(\"Error in generating audios. Please read previous error messages.\")\n        return\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    sf.write(\n        args.output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\"Saved to {args.output_filename}\")\n    print(f\"The text is '{args.text}'\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-websocket-client-decode-files-paralell.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nA websocket client for sherpa-onnx-offline-websocket-server\n\nThis file shows how to transcribe multiple\nfiles in parallel. We create a separate connection for transcribing each file.\n\nUsage:\n    ./offline-websocket-client-decode-files-parallel.py \\\n      --server-addr localhost \\\n      --server-port 6006 \\\n      /path/to/foo.wav \\\n      /path/to/bar.wav \\\n      /path/to/16kHz.wav \\\n      /path/to/8kHz.wav\n\n(Note: You have to first start the server before starting the client)\n\nYou can find the server at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/offline-websocket-server.cc\n\nNote: The server is implemented in C++.\n\"\"\"\n\nimport argparse\nimport asyncio\nimport logging\nimport wave\nfrom typing import Tuple\n\ntry:\n    import websockets\nexcept ImportError:\n    print(\"please run:\")\n    print(\"\")\n    print(\"  pip install websockets\")\n    print(\"\")\n    print(\"before you run this script\")\n    print(\"\")\n\nimport numpy as np\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--server-addr\",\n        type=str,\n        default=\"localhost\",\n        help=\"Address of the server\",\n    )\n\n    parser.add_argument(\n        \"--server-port\",\n        type=int,\n        default=6006,\n        help=\"Port of the server\",\n    )\n\n    parser.add_argument(\n        \"sound_files\",\n        type=str,\n        nargs=\"+\",\n        help=\"The input sound file(s) to decode. Each file must be in WAVE \"\n        \"format with a single channel, and each sample must be 16-bit, \"\n        \"i.e., int16_t. 
\"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\nasync def run(\n    server_addr: str,\n    server_port: int,\n    wave_filename: str,\n):\n    async with websockets.connect(\n        f\"ws://{server_addr}:{server_port}\"\n    ) as websocket:  # noqa\n        logging.info(f\"Sending {wave_filename}\")\n        samples, sample_rate = read_wave(wave_filename)\n        assert isinstance(sample_rate, int)\n        assert samples.dtype == np.float32, samples.dtype\n        assert samples.ndim == 1, samples.ndim\n        buf = sample_rate.to_bytes(4, byteorder=\"little\")  # 4 bytes\n        buf += (samples.size * 4).to_bytes(4, byteorder=\"little\")\n        buf += samples.tobytes()\n\n        payload_len = 10240\n        while len(buf) > payload_len:\n            await websocket.send(buf[:payload_len])\n            buf = buf[payload_len:]\n\n        if buf:\n            await websocket.send(buf)\n\n        decoding_results = await 
websocket.recv()\n        logging.info(f\"{wave_filename}\\n{decoding_results}\")\n\n        # to signal that the client has sent all the data\n        await websocket.send(\"Done\")\n\n\nasync def main():\n    args = get_args()\n    logging.info(vars(args))\n\n    server_addr = args.server_addr\n    server_port = args.server_port\n    sound_files = args.sound_files\n\n    all_tasks = []\n    for wave_filename in sound_files:\n        task = asyncio.create_task(\n            run(\n                server_addr=server_addr,\n                server_port=server_port,\n                wave_filename=wave_filename,\n            )\n        )\n        all_tasks.append(task)\n\n    await asyncio.gather(*all_tasks)\n\n\nif __name__ == \"__main__\":\n    formatter = (\n        \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"  # noqa\n    )\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    asyncio.run(main())\n"
  },
  {
    "path": "python-api-examples/offline-websocket-client-decode-files-sequential.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nA websocket client for sherpa-onnx-offline-websocket-server\n\nThis file shows how to use a single connection to transcribe multiple\nfiles sequentially.\n\nUsage:\n    ./offline-websocket-client-decode-files-sequential.py \\\n      --server-addr localhost \\\n      --server-port 6006 \\\n      /path/to/foo.wav \\\n      /path/to/bar.wav \\\n      /path/to/16kHz.wav \\\n      /path/to/8kHz.wav\n\n(Note: You have to first start the server before starting the client)\n\nYou can find the server at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/offline-websocket-server.cc\n\nNote: The server is implemented in C++.\n\"\"\"\n\nimport argparse\nimport asyncio\nimport logging\nimport wave\nfrom typing import List, Tuple\n\ntry:\n    import websockets\nexcept ImportError:\n    print(\"please run:\")\n    print(\"\")\n    print(\"  pip install websockets\")\n    print(\"\")\n    print(\"before you run this script\")\n    print(\"\")\n\nimport numpy as np\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--server-addr\",\n        type=str,\n        default=\"localhost\",\n        help=\"Address of the server\",\n    )\n\n    parser.add_argument(\n        \"--server-port\",\n        type=int,\n        default=6006,\n        help=\"Port of the server\",\n    )\n\n    parser.add_argument(\n        \"sound_files\",\n        type=str,\n        nargs=\"+\",\n        help=\"The input sound file(s) to decode. Each file must be of WAVE \"\n        \"format with a single channel, and each sample has 16-bit, \"\n        \"i.e., int16_t. 
\"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\nasync def run(\n    server_addr: str,\n    server_port: int,\n    sound_files: List[str],\n):\n    async with websockets.connect(\n        f\"ws://{server_addr}:{server_port}\"\n    ) as websocket:  # noqa\n        for wave_filename in sound_files:\n            logging.info(f\"Sending {wave_filename}\")\n            samples, sample_rate = read_wave(wave_filename)\n            assert isinstance(sample_rate, int)\n            assert samples.dtype == np.float32, samples.dtype\n            assert samples.ndim == 1, samples.dim\n\n            buf = sample_rate.to_bytes(4, byteorder=\"little\")  # 4 bytes\n            buf += (samples.size * 4).to_bytes(4, byteorder=\"little\")\n            buf += samples.tobytes()\n\n            payload_len = 10240\n            while len(buf) > payload_len:\n                await websocket.send(buf[:payload_len])\n                buf = 
buf[payload_len:]\n\n            if buf:\n                await websocket.send(buf)\n\n            decoding_results = await websocket.recv()\n            print(decoding_results)\n\n        # to signal that the client has sent all the data\n        await websocket.send(\"Done\")\n\n\nasync def main():\n    args = get_args()\n    logging.info(vars(args))\n\n    server_addr = args.server_addr\n    server_port = args.server_port\n    sound_files = args.sound_files\n\n    await run(\n        server_addr=server_addr,\n        server_port=server_port,\n        sound_files=sound_files,\n    )\n\n\nif __name__ == \"__main__\":\n    formatter = (\n        \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"  # noqa\n    )\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    asyncio.run(main())\n"
  },
  {
    "path": "python-api-examples/offline-whisper-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming whisper model from\nhttps://github.com/openai/whisper\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\nrm sherpa-onnx-whisper-tiny.en.tar.bz2\n\"\"\"\n\nimport datetime as dt\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\"\n    decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\"\n    tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\"\n    test_wav = \"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\"\n\n    if not Path(encoder).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=encoder,\n            decoder=decoder,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    start_t = dt.datetime.now()\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n\n    end_t = dt.datetime.now()\n    elapsed_seconds = (end_t - start_t).total_seconds()\n    duration = audio.shape[-1] / 
sample_rate\n    rtf = elapsed_seconds / duration\n\n    print(stream.result)\n    print(wave_filename)\n    print(\"Text:\", stream.result.text)\n    print(f\"Audio duration:\\t{duration:.3f} s\")\n    print(f\"Elapsed:\\t{elapsed_seconds:.3f} s\")\n    print(f\"RTF = {elapsed_seconds:.3f}/{duration:.3f} = {rtf:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/offline-zipformer-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a non-streaming zipformer CTC model from icefall\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx\"\n    tokens = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt\"\n    test_wav = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OfflineRecognizer.from_zipformer_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n    recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(stream.result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/online-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API to transcribe\nfile(s) with a streaming model.\n\nUsage:\n\n(1) Streaming transducer\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\nrm sherpa-onnx-streaming-zipformer-en-2023-06-26.tar.bz2\n\n./python-api-examples/online-decode-files.py \\\n  --tokens=./sherpa-onnx-streaming-zipformer-en-2023-06-26/tokens.txt \\\n  --encoder=./sherpa-onnx-streaming-zipformer-en-2023-06-26/encoder-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  --decoder=./sherpa-onnx-streaming-zipformer-en-2023-06-26/decoder-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  --joiner=./sherpa-onnx-streaming-zipformer-en-2023-06-26/joiner-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/0.wav \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/1.wav \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/8k.wav\n\nor with RNN LM rescoring and LODR:\n\n./python-api-examples/online-decode-files.py \\\n  --tokens=./sherpa-onnx-streaming-zipformer-en-2023-06-26/tokens.txt \\\n  --encoder=./sherpa-onnx-streaming-zipformer-en-2023-06-26/encoder-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  --decoder=./sherpa-onnx-streaming-zipformer-en-2023-06-26/decoder-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  --joiner=./sherpa-onnx-streaming-zipformer-en-2023-06-26/joiner-epoch-99-avg-1-chunk-16-left-64.onnx \\\n  --decoding-method=modified_beam_search \\\n  --lm=/path/to/lm.onnx \\\n  --lm-scale=0.1 \\\n  --lodr-fst=/path/to/lodr.fst \\\n  --lodr-scale=-0.1 \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/0.wav \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/1.wav \\\n  ./sherpa-onnx-streaming-zipformer-en-2023-06-26/test_wavs/8k.wav\n\n(2) Streaming paraformer\n\ncurl -SL -O 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n./python-api-examples/online-decode-files.py \\\n  --tokens=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n  --paraformer-encoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx \\\n  --paraformer-decoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/1.wav \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/2.wav \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/3.wav \\\n  ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/8k.wav\n\n(3) Streaming Zipformer2 CTC\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nls -lh sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13\n\n./python-api-examples/online-decode-files.py \\\n  --zipformer2-ctc=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n  --tokens=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav \\\n  ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000001.wav\n\n(4) Streaming Conformer CTC from WeNet\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\ntar xvf 
sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\nrm sherpa-onnx-zh-wenet-wenetspeech.tar.bz2\n\n./python-api-examples/online-decode-files.py \\\n  --tokens=./sherpa-onnx-zh-wenet-wenetspeech/tokens.txt \\\n  --wenet-ctc=./sherpa-onnx-zh-wenet-wenetspeech/model-streaming.onnx \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/0.wav \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/1.wav \\\n  ./sherpa-onnx-zh-wenet-wenetspeech/test_wavs/8k.wav\n\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nto download streaming pre-trained models.\n\"\"\"\nimport argparse\nimport time\nimport wave\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--zipformer2-ctc\",\n        type=str,\n        help=\"Path to the zipformer2 ctc model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-encoder\",\n        type=str,\n        help=\"Path to the paraformer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-decoder\",\n        type=str,\n        help=\"Path to the paraformer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        type=str,\n        help=\"Path to the wenet ctc model\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc-chunk-size\",\n        
type=int,\n        default=16,\n        help=\"The --chunk-size parameter for streaming WeNet models\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc-num-left-chunks\",\n        type=int,\n        default=4,\n        help=\"The --num-left-chunks parameter for streaming WeNet models\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--max-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        It specifies number of active paths to keep during decoding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--lm\",\n        type=str,\n        default=\"\",\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        path of language model.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--lm-scale\",\n        type=float,\n        default=0.1,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        scale of language model.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--lodr-fst\",\n        metavar=\"file\",\n        type=str,\n        default=\"\",\n        help=\"Path to LODR FST model. 
Used only when --lm is given.\",\n    )\n\n    parser.add_argument(\n        \"--lodr-scale\",\n        metavar=\"lodr_scale\",\n        type=float,\n        default=-0.1,\n        help=\"LODR scale for rescoring. Used only when --lodr-fst is given.\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word/phrase per line, like\n        HELLO WORLD\n        你好世界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--modeling-unit\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The modeling unit of the model. Valid values are cjkchar, bpe, cjkchar+bpe.\n        Used only when hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--bpe-vocab\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The path to the bpe vocabulary, which is generated by\n        sentencepiece. You can also export the bpe vocabulary from a bpe model\n        by `scripts/export_bpe_vocab.py`. 
Used only when hotwords-file is given\n        and modeling-unit is bpe or cjkchar+bpe.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied to the blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"sound_files\",\n        type=str,\n        nargs=\"+\",\n        help=\"The input sound file(s) to decode. Each file must be of WAVE \"\n        \"format with a single channel, and each sample has 16-bit, \"\n        \"i.e., int16_t. \"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. 
Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.tokens)\n\n    if args.encoder:\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        assert not args.paraformer_encoder, args.paraformer_encoder\n        assert not args.paraformer_decoder, args.paraformer_decoder\n\n        recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n            tokens=args.tokens,\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            num_threads=args.num_threads,\n            provider=args.provider,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=args.decoding_method,\n            max_active_paths=args.max_active_paths,\n            lm=args.lm,\n            lm_scale=args.lm_scale,\n            lodr_fst=args.lodr_fst,\n            lodr_scale=args.lodr_scale,\n            hotwords_file=args.hotwords_file,\n            hotwords_score=args.hotwords_score,\n            modeling_unit=args.modeling_unit,\n            bpe_vocab=args.bpe_vocab,\n            blank_penalty=args.blank_penalty,\n        )\n    elif args.zipformer2_ctc:\n        recognizer = 
sherpa_onnx.OnlineRecognizer.from_zipformer2_ctc(\n            tokens=args.tokens,\n            model=args.zipformer2_ctc,\n            num_threads=args.num_threads,\n            provider=args.provider,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n        )\n    elif args.paraformer_encoder:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_paraformer(\n            tokens=args.tokens,\n            encoder=args.paraformer_encoder,\n            decoder=args.paraformer_decoder,\n            num_threads=args.num_threads,\n            provider=args.provider,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n        )\n    elif args.wenet_ctc:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_wenet_ctc(\n            tokens=args.tokens,\n            model=args.wenet_ctc,\n            chunk_size=args.wenet_ctc_chunk_size,\n            num_left_chunks=args.wenet_ctc_num_left_chunks,\n            num_threads=args.num_threads,\n            provider=args.provider,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n        )\n    else:\n        raise ValueError(\"Please provide a model\")\n\n    print(\"Started!\")\n    start_time = time.time()\n\n    streams = []\n    total_duration = 0\n    for wave_filename in args.sound_files:\n        assert_file_exists(wave_filename)\n        samples, sample_rate = read_wave(wave_filename)\n        duration = len(samples) / sample_rate\n        total_duration += duration\n\n        s = recognizer.create_stream()\n\n        s.accept_waveform(sample_rate, samples)\n\n        tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)\n        s.accept_waveform(sample_rate, tail_paddings)\n\n        s.input_finished()\n\n        streams.append(s)\n\n    while True:\n        ready_list = []\n        for s in streams:\n            if 
recognizer.is_ready(s):\n                ready_list.append(s)\n        if len(ready_list) == 0:\n            break\n        recognizer.decode_streams(ready_list)\n    results = [recognizer.get_result(s) for s in streams]\n    end_time = time.time()\n    print(\"Done!\")\n\n    for wave_filename, result in zip(args.sound_files, results):\n        print(f\"{wave_filename}\\n{result}\")\n        print(\"-\" * 10)\n\n    elapsed_seconds = end_time - start_time\n    rtf = elapsed_seconds / total_duration\n    print(f\"num_threads: {args.num_threads}\")\n    print(f\"decoding_method: {args.decoding_method}\")\n    print(f\"Wave duration: {total_duration:.3f} s\")\n    print(f\"Elapsed time: {elapsed_seconds:.3f} s\")\n    print(\n        f\"Real time factor (RTF): {elapsed_seconds:.3f}/{total_duration:.3f} = {rtf:.3f}\"\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/online-nemo-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a streaming CTC model from NeMo\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\nThe example model is converted from\nhttps://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_80ms\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/model.onnx\"\n    tokens = \"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/tokens.txt\"\n\n    test_wav = \"./sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms/test_wavs/0.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OnlineRecognizer.from_nemo_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 16000 Hz\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, audio)\n\n    tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)\n    stream.accept_waveform(sample_rate, tail_paddings)\n    stream.input_finished()\n\n    while recognizer.is_ready(stream):\n        recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(recognizer.get_result_all(stream))\n\n\nif __name__ == \"__main__\":\n    
main()\n"
  },
  {
    "path": "python-api-examples/online-speech-enhancement-dpdfnet.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use the online speech enhancement API with DPDFNet.\n\nPlease download files used in this script from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_speech_denoiser():\n    model_filename = \"./dpdfnet_baseline.onnx\"\n    if not Path(model_filename).is_file():\n        raise ValueError(\n            \"Please first download a model from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n            \" or https://huggingface.co/csukuangfj/speech-enhancement-models\"\n        )\n\n    config = sherpa_onnx.OnlineSpeechDenoiserConfig(\n        model=sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n            dpdfnet=sherpa_onnx.OfflineSpeechDenoiserDpdfNetModelConfig(\n                model=model_filename\n            ),\n            debug=False,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n    )\n\n    if not config.validate():\n        print(config)\n        raise ValueError(\"Errors in config. Please check previous error logs\")\n\n    return sherpa_onnx.OnlineSpeechDenoiser(config)\n\n\ndef load_audio(filename: str):\n    data, sample_rate = sf.read(filename, always_2d=True, dtype=\"float32\")\n    samples = np.ascontiguousarray(data[:, 0])\n    return samples, sample_rate\n\n\ndef main():\n    sd = create_speech_denoiser()\n    test_wave = \"./speech_with_noise.wav\"\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"{test_wave} does not exist. 
You can download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    samples, sample_rate = load_audio(test_wave)\n    frame_shift = sd.frame_shift_in_samples\n    output = []\n\n    for start in range(0, len(samples), frame_shift):\n        chunk = samples[start : start + frame_shift]\n        denoised = sd(chunk, sample_rate)\n        output.append(np.asarray(denoised.samples, dtype=np.float32))\n\n    output.append(np.asarray(sd.flush().samples, dtype=np.float32))\n    enhanced = np.concatenate(output) if output else np.empty(0, dtype=np.float32)\n\n    sf.write(\"./enhanced_online_dpdfnet.wav\", enhanced, sd.sample_rate)\n    print(\"Saved to ./enhanced_online_dpdfnet.wav\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/online-speech-enhancement-gtcrn.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use the online speech enhancement API with GTCRN.\n\nPlease download files used in this script from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_speech_denoiser():\n    model_filename = \"./gtcrn_simple.onnx\"\n    if not Path(model_filename).is_file():\n        raise ValueError(\n            \"Please first download a model from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    config = sherpa_onnx.OnlineSpeechDenoiserConfig(\n        model=sherpa_onnx.OfflineSpeechDenoiserModelConfig(\n            gtcrn=sherpa_onnx.OfflineSpeechDenoiserGtcrnModelConfig(\n                model=model_filename\n            ),\n            debug=False,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n    )\n\n    if not config.validate():\n        print(config)\n        raise ValueError(\"Errors in config. Please check previous error logs\")\n\n    return sherpa_onnx.OnlineSpeechDenoiser(config)\n\n\ndef load_audio(filename: str):\n    data, sample_rate = sf.read(filename, always_2d=True, dtype=\"float32\")\n    samples = np.ascontiguousarray(data[:, 0])\n    return samples, sample_rate\n\n\ndef main():\n    sd = create_speech_denoiser()\n    test_wave = \"./speech_with_noise.wav\"\n    if not Path(test_wave).is_file():\n        raise ValueError(\n            f\"{test_wave} does not exist. 
You can download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\"\n        )\n\n    samples, sample_rate = load_audio(test_wave)\n    frame_shift = sd.frame_shift_in_samples\n    output = []\n\n    for start in range(0, len(samples), frame_shift):\n        chunk = samples[start : start + frame_shift]\n        denoised = sd(chunk, sample_rate)\n        output.append(np.asarray(denoised.samples, dtype=np.float32))\n\n    output.append(np.asarray(sd.flush().samples, dtype=np.float32))\n    enhanced = np.concatenate(output) if output else np.empty(0, dtype=np.float32)\n\n    sf.write(\"./enhanced_online_gtcrn.wav\", enhanced, sd.sample_rate)\n    print(\"Saved to ./enhanced_online_gtcrn.wav\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/online-t-one-ctc-decode-files.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to use a streaming CTC model from T-one\nto decode files.\n\nPlease download model files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n\n\nThe example model is converted from\nhttps://github.com/voicekit-team/T-one\nusing\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/scripts/t-one\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\ntar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nrm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n\"\"\"\n\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_recognizer():\n    model = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\"\n    tokens = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\"\n    test_wav = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\"\n\n    if not Path(model).is_file() or not Path(test_wav).is_file():\n        raise ValueError(\n            \"\"\"Please download model files from\n            https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\n            \"\"\"\n        )\n    return (\n        sherpa_onnx.OnlineRecognizer.from_t_one_ctc(\n            model=model,\n            tokens=tokens,\n            debug=True,\n        ),\n        test_wav,\n    )\n\n\ndef main():\n    recognizer, wave_filename = create_recognizer()\n\n    audio, sample_rate = sf.read(wave_filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n\n    # audio is a 1-D float32 numpy array normalized to the range [-1, 1]\n    # sample_rate does not need to be 8000 Hz\n\n    stream = recognizer.create_stream()\n    left_paddings = np.zeros(int(0.3 * sample_rate), dtype=np.float32)\n    stream.accept_waveform(sample_rate, left_paddings)\n\n    stream.accept_waveform(sample_rate, audio)\n\n    
tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)\n    stream.accept_waveform(sample_rate, tail_paddings)\n    stream.input_finished()\n\n    while recognizer.is_ready(stream):\n        recognizer.decode_stream(stream)\n    print(wave_filename)\n    print(recognizer.get_result_all(stream))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/online-websocket-client-decode-file.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nA websocket client for sherpa-onnx-online-websocket-server\n\nUsage:\n    ./online-websocket-client-decode-file.py \\\n      --server-addr localhost \\\n      --server-port 6006 \\\n      --seconds-per-message 0.1 \\\n      --samples-per-message 8000 \\\n      /path/to/foo.wav\n\n(Note: You have to first start the server before starting the client)\n\nYou can find the c++ server at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/online-websocket-server.cc\nor use the python server ./python-api-examples/streaming_server.py\n\nThere is also a C++ version of the client. Please see\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/online-websocket-client.cc\n\"\"\"\n\nimport argparse\nimport asyncio\nimport json\nimport logging\nimport wave\n\ntry:\n    import websockets\nexcept ImportError:\n    print(\"please run:\")\n    print(\"\")\n    print(\"  pip install websockets\")\n    print(\"\")\n    print(\"before you run this script\")\n    print(\"\")\n\nimport numpy as np\n\n\ndef read_wave(wave_filename: str) -> np.ndarray:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. 
Its sampling rate has to be 16000.\n        It should be single channel and each sample should be 16-bit.\n    Returns:\n      Return a 1-D float32 tensor.\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getframerate() == 16000, f.getframerate()\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--server-addr\",\n        type=str,\n        default=\"localhost\",\n        help=\"Address of the server\",\n    )\n\n    parser.add_argument(\n        \"--server-port\",\n        type=int,\n        default=6006,\n        help=\"Port of the server\",\n    )\n\n    parser.add_argument(\n        \"--samples-per-message\",\n        type=int,\n        default=8000,\n        help=\"Number of samples per message\",\n    )\n\n    parser.add_argument(\n        \"--seconds-per-message\",\n        type=float,\n        default=0.1,\n        help=\"We will simulate that the duration of two messages is of this value\",\n    )\n\n    parser.add_argument(\n        \"sound_file\",\n        type=str,\n        help=\"The input sound file. 
Must be wave with a single channel, 16kHz \"\n        \"sampling rate, 16-bit of each sample.\",\n    )\n\n    return parser.parse_args()\n\n\nasync def receive_results(socket: websockets.WebSocketServerProtocol):\n    last_message = \"\"\n    async for message in socket:\n        if message != \"Done!\":\n            last_message = message\n            logging.info(json.loads(message))\n        else:\n            break\n    return last_message\n\n\nasync def run(\n    server_addr: str,\n    server_port: int,\n    wave_filename: str,\n    samples_per_message: int,\n    seconds_per_message: float,\n):\n    data = read_wave(wave_filename)\n\n    async with websockets.connect(\n        f\"ws://{server_addr}:{server_port}\"\n    ) as websocket:  # noqa\n        logging.info(f\"Sending {wave_filename}\")\n\n        receive_task = asyncio.create_task(receive_results(websocket))\n\n        start = 0\n        while start < data.shape[0]:\n            end = start + samples_per_message\n            end = min(end, data.shape[0])\n            d = data.data[start:end].tobytes()\n\n            await websocket.send(d)\n\n            # Simulate streaming. 
You can remove the sleep if you want\n            await asyncio.sleep(seconds_per_message)  # in seconds\n\n            start += samples_per_message\n\n        # to signal that the client has sent all the data\n        await websocket.send(\"Done\")\n\n        decoding_results = await receive_task\n        logging.info(f\"\\nFinal result is:\\n{json.loads(decoding_results)}\")\n\n\nasync def main():\n    args = get_args()\n    logging.info(vars(args))\n\n    server_addr = args.server_addr\n    server_port = args.server_port\n    samples_per_message = args.samples_per_message\n    seconds_per_message = args.seconds_per_message\n\n    await run(\n        server_addr=server_addr,\n        server_port=server_port,\n        wave_filename=args.sound_file,\n        samples_per_message=samples_per_message,\n        seconds_per_message=seconds_per_message,\n    )\n\n\nif __name__ == \"__main__\":\n    formatter = (\n        \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"  # noqa\n    )\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    asyncio.run(main())\n"
  },
  {
    "path": "python-api-examples/online-websocket-client-microphone.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nA websocket client for sherpa-onnx-online-websocket-server\n\nUsage:\n    ./online-websocket-client-microphone.py \\\n      --server-addr localhost \\\n      --server-port 6006\n\n(Note: You have to first start the server before starting the client)\n\nYou can find the C++ server at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/online-websocket-server.cc\nor use the python server ./python-api-examples/streaming_server.py\n\nThere is also a C++ version of the client. Please see\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/csrc/online-websocket-client.cc\n\"\"\"\n\nimport argparse\nimport asyncio\nimport sys\n\nimport numpy as np\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\ntry:\n    import websockets\nexcept ImportError:\n    print(\"please run:\")\n    print(\"\")\n    print(\"  pip install websockets\")\n    print(\"\")\n    print(\"before you run this script\")\n    print(\"\")\n    sys.exit(-1)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--server-addr\",\n        type=str,\n        default=\"localhost\",\n        help=\"Address of the server\",\n    )\n\n    parser.add_argument(\n        \"--server-port\",\n        type=int,\n        default=6006,\n        help=\"Port of the server\",\n    )\n\n    return parser.parse_args()\n\n\nasync def inputstream_generator(channels=1):\n    \"\"\"Generator that yields blocks of input data as NumPy arrays.\n\n    See https://python-sounddevice.readthedocs.io/en/0.4.6/examples.html#creating-an-asyncio-generator-for-audio-blocks\n    \"\"\"\n    q_in = asyncio.Queue()\n    loop 
= asyncio.get_event_loop()\n\n    def callback(indata, frame_count, time_info, status):\n        loop.call_soon_threadsafe(q_in.put_nowait, (indata.copy(), status))\n\n    devices = sd.query_devices()\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n    print()\n    print(\"Started! Please speak\")\n\n    stream = sd.InputStream(\n        callback=callback,\n        channels=channels,\n        dtype=\"float32\",\n        samplerate=16000,\n        blocksize=int(0.05 * 16000),  # 0.05 seconds\n    )\n    with stream:\n        while True:\n            indata, status = await q_in.get()\n            yield indata, status\n\n\nasync def receive_results(socket: websockets.WebSocketServerProtocol):\n    last_message = \"\"\n    async for message in socket:\n        if message != \"Done!\":\n            if last_message != message:\n                last_message = message\n\n                if last_message:\n                    print(last_message)\n        else:\n            return last_message\n\n\nasync def run(\n    server_addr: str,\n    server_port: int,\n):\n    async with websockets.connect(\n        f\"ws://{server_addr}:{server_port}\"\n    ) as websocket:  # noqa\n        receive_task = asyncio.create_task(receive_results(websocket))\n        print(\"Started! 
Please Speak\")\n\n        async for indata, status in inputstream_generator():\n            if status:\n                print(status)\n            indata = indata.reshape(-1)\n            indata = np.ascontiguousarray(indata)\n            await websocket.send(indata.tobytes())\n\n        decoding_results = await receive_task\n        print(f\"\\nFinal result is:\\n{decoding_results}\")\n\n\nasync def main():\n    args = get_args()\n    print(vars(args))\n\n    server_addr = args.server_addr\n    server_port = args.server_port\n\n    await run(\n        server_addr=server_addr,\n        server_port=server_port,\n    )\n\n\nif __name__ == \"__main__\":\n    try:\n        asyncio.run(main())\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/online-zipformer-ctc-hlg-decode-file.py",
    "content": "#!/usr/bin/env python3\n\n# This file shows how to use a streaming zipformer CTC model and an HLG\n# graph for decoding.\n#\n# We use the following model as an example\n#\n\"\"\"\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nrm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n\npython3 ./python-api-examples/online-zipformer-ctc-hlg-decode-file.py \\\n  --tokens ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt \\\n  --graph ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst \\\n  --model ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx \\\n  ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/0.wav\n\n\"\"\"\n# (The above model is from https://github.com/k2-fsa/icefall/pull/1557)\n\nimport argparse\nimport time\nimport wave\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the ONNX model\",\n    )\n\n    parser.add_argument(\n        \"--graph\",\n        type=str,\n        required=True,\n        help=\"Path to H.fst, HL.fst, or HLG.fst\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, 
coreml\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=int,\n        default=0,\n        help=\"Valid values: 1, 0\",\n    )\n\n    parser.add_argument(\n        \"sound_file\",\n        type=str,\n        help=\"The input sound file to decode. It must be of WAVE \"\n        \"format with a single channel, and each sample is 16-bit, \"\n        \"i.e., int16_t. \"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. 
Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    assert_file_exists(args.tokens)\n    assert_file_exists(args.graph)\n    assert_file_exists(args.model)\n\n    recognizer = sherpa_onnx.OnlineRecognizer.from_zipformer2_ctc(\n        tokens=args.tokens,\n        model=args.model,\n        num_threads=args.num_threads,\n        provider=args.provider,\n        sample_rate=16000,\n        feature_dim=80,\n        ctc_graph=args.graph,\n    )\n\n    wave_filename = args.sound_file\n    assert_file_exists(wave_filename)\n    samples, sample_rate = read_wave(wave_filename)\n    duration = len(samples) / sample_rate\n\n    print(\"Started\")\n\n    start_time = time.time()\n    s = recognizer.create_stream()\n    s.accept_waveform(sample_rate, samples)\n    tail_paddings = np.zeros(int(0.66 * sample_rate), dtype=np.float32)\n    s.accept_waveform(sample_rate, tail_paddings)\n    s.input_finished()\n    while recognizer.is_ready(s):\n        recognizer.decode_stream(s)\n\n    result = recognizer.get_result(s).lower()\n    end_time = time.time()\n\n    elapsed_seconds = end_time - start_time\n    rtf = elapsed_seconds / duration\n    print(f\"num_threads: {args.num_threads}\")\n    print(f\"Wave duration: {duration:.3f} s\")\n 
   print(f\"Elapsed time: {elapsed_seconds:.3f} s\")\n    print(f\"Real time factor (RTF): {elapsed_seconds:.3f}/{duration:.3f} = {rtf:.3f}\")\n    print(result)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/pocket-tts-play.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API\nfor voice cloning using PocketTTS.\n\nDifferent from ./pocket-tts.py, this file plays back the generated audio\nwhile the model is still generating.\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\npython3 ./pocket-tts-play.py\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\nfor details.\n\"\"\"\n\nimport logging\nimport queue\nimport sys\nimport threading\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef create_tts():\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            pocket=sherpa_onnx.OfflineTtsPocketModelConfig(\n                lm_flow=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\",\n                lm_main=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\",\n                encoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\",\n                decoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\",\n                text_conditioner=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\",\n                vocab_json=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\",\n                token_scores_json=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\",\n            ),\n            debug=True,  # set it to True to see verbose logs\n            num_threads=2,\n            provider=\"cpu\",\n        )\n    )\n    if not tts_config.validate():\n        raise ValueError(\n            \"Please read the previous error messages and re-check your config\"\n        )\n\n    return sherpa_onnx.OfflineTts(tts_config)\n\n\n# buffer saves audio samples to be played\nbuffer = queue.Queue()\n\n# started is set to True once generated_audio_callback is called.\nstarted = False\n\n# stopped is set to True once all the text has been processed\nstopped = False\n\n# killed is set to True once ctrl + C is pressed\nkilled = False\n\n# Note: When started is True, and stopped is True, and buffer is empty,\n# we will exit the program since all audio samples have been played.\n\nsample_rate = None\n\nevent = threading.Event()\n\nfirst_message_time = None\n\n\ndef generated_audio_callback(samples: np.ndarray, progress: float):\n    \"\"\"This function is called whenever max_num_sentences sentences\n    have 
been processed.\n\n    Note that it is passed to C++ and is invoked in C++.\n\n    Args:\n      samples:\n        A 1-D np.float32 array containing audio samples\n    \"\"\"\n    global first_message_time\n    if first_message_time is None:\n        first_message_time = time.time()\n\n    buffer.put(samples)\n    global started\n\n    if started is False:\n        logging.info(\"Start playing ...\")\n    started = True\n\n    # 1 means to keep generating\n    # 0 means to stop generating\n    if killed:\n        return 0\n\n    return 1\n\n\n# see https://python-sounddevice.readthedocs.io/en/0.4.6/api/streams.html#sounddevice.OutputStream\ndef play_audio_callback(\n    outdata: np.ndarray, frames: int, time, status: sd.CallbackFlags\n):\n    if killed or (started and buffer.empty() and stopped):\n        event.set()\n\n    # outdata is of shape (frames, num_channels)\n    if buffer.empty():\n        outdata.fill(0)\n        return\n\n    n = 0\n    while n < frames and not buffer.empty():\n        remaining = frames - n\n        k = buffer.queue[0].shape[0]\n\n        if remaining <= k:\n            outdata[n:, 0] = buffer.queue[0][:remaining]\n            buffer.queue[0] = buffer.queue[0][remaining:]\n            n = frames\n            if buffer.queue[0].shape[0] == 0:\n                buffer.get()\n\n            break\n\n        outdata[n : n + k, 0] = buffer.get()\n        n += k\n\n    if n < frames:\n        outdata[n:, 0] = 0\n\n\n# Please see\n# https://python-sounddevice.readthedocs.io/en/0.4.6/usage.html#device-selection\n# for how to select a device\ndef play_audio():\n    if False:\n        # This if branch can be safely removed. 
It is here to show you how to\n        # change the default output device in case you need that.\n        devices = sd.query_devices()\n        print(devices)\n\n        # sd.default.device[1] is the output device. If you want to\n        # select a different device, say, 3, as the output device, please\n        # use sd.default.device[1] = 3\n\n        default_output_device_idx = sd.default.device[1]\n        print(\n            f'Use default output device: {devices[default_output_device_idx][\"name\"]}'\n        )\n\n    with sd.OutputStream(\n        channels=1,\n        callback=play_audio_callback,\n        dtype=\"float32\",\n        samplerate=sample_rate,\n        blocksize=1024,\n    ):\n        event.wait()\n\n    logging.info(\"Exiting ...\")\n\n\ndef main():\n    reference_audio_file = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n    if not Path(reference_audio_file).is_file():\n        raise ValueError(f\"Reference audio {reference_audio_file} does not exist\")\n\n    logging.info(\"Loading model ...\")\n    tts = create_tts()\n    logging.info(\"Loading model done.\")\n\n    reference_audio, reference_sample_rate = librosa.load(\n        reference_audio_file, sr=tts.sample_rate\n    )\n\n    text = \"\"\"\n    I am happy to join with you today in what will go down in history as the greatest demonstration for freedom in the history of our nation.\n    Five score years ago, a great American, in whose symbolic shadow we stand today, signed the Emancipation Proclamation. This momentous decree came as a great beacon light of hope to millions of Negro slaves who had been seared in the flames of withering injustice. It came as a joyous daybreak to end the long night of their captivity.\n    But one hundred years later, the Negro still is not free. One hundred years later, the life of the Negro is still sadly crippled by the manacles of segregation and the chains of discrimination. 
One hundred years later, the Negro lives on a lonely island of poverty in the midst of a vast ocean of material prosperity. One hundred years later, the Negro is still languished in the corners of American society and finds himself an exile in his own land. And so we've come here today to dramatize a shameful condition.\n    In a sense we've come to our nation's capital to cash a check. When the architects of our republic wrote the magnificent words of the Constitution and the Declaration of Independence, they were signing a promissory note to which every American was to fall heir. This note was a promise that all men, yes, black men as well as white men, would be guaranteed the \"unalienable Rights\" of \"Life, Liberty and the pursuit of Happiness.\" It is obvious today that America has defaulted on this promissory note, insofar as her citizens of color are concerned. Instead of honoring this sacred obligation, America has given the Negro people a bad check, a check which has come back marked insufficient funds.\n    \"\"\"\n\n    global sample_rate\n    sample_rate = tts.sample_rate\n\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.reference_audio = reference_audio\n    gen_config.reference_sample_rate = reference_sample_rate\n    gen_config.num_steps = 5\n\n    play_back_thread = threading.Thread(target=play_audio)\n    play_back_thread.start()\n\n    logging.info(\"Start generating ...\")\n    start_time = time.time()\n    audio = tts.generate(\n        text,\n        gen_config,\n        callback=generated_audio_callback,\n    )\n    end_time = time.time()\n    logging.info(\"Finished generating!\")\n    global stopped\n    stopped = True\n\n    if len(audio.samples) == 0:\n        print(\"Error in generating audios. 
Please read previous error messages.\")\n        global killed\n        killed = True\n        play_back_thread.join()\n        return\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = \"./generated.wav\"\n    sf.write(\n        output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    logging.info(f\"The text is '{text}'\")\n    logging.info(\n        \"Time in seconds to receive the first \"\n        f\"message: {first_message_time-start_time:.3f}\"\n    )\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n    logging.info(f\"***  Saved to {output_filename} ***\")\n\n    print(\"\\n   >>>>>>>>> You can safely press ctrl + C to stop the play <<<<<<<<<<\\n\")\n\n    play_back_thread.join()\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n        killed = True\n        sys.exit(0)\n"
  },
  {
    "path": "python-api-examples/pocket-tts.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API\nfor voice cloning using PocketTTS.\n\n\nDifferent from ./pocket-tts-play.py, this file does not play back the\ngenerated audio.\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\npython3 ./pocket-tts.py\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\nfor details.\n\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_tts():\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            pocket=sherpa_onnx.OfflineTtsPocketModelConfig(\n                lm_flow=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\",\n                lm_main=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\",\n                encoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\",\n                decoder=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\",\n                text_conditioner=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\",\n                vocab_json=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\",\n                token_scores_json=\"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\",\n            ),\n            debug=True,\n            num_threads=2,\n            provider=\"cpu\",\n        )\n    )\n    if not tts_config.validate():\n        raise ValueError(\n            \"Please read the previous error messages and re-check your config\"\n        )\n\n    return sherpa_onnx.OfflineTts(tts_config)\n\n\ndef main():\n    
reference_audio_file = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n    if not Path(reference_audio_file).is_file():\n        raise ValueError(f\"Reference audio {reference_audio_file} does not exist\")\n\n    tts = create_tts()\n\n    reference_audio, sample_rate = librosa.load(\n        reference_audio_file, sr=tts.sample_rate\n    )\n\n    text = \"I am happy to join with you today in what will go down in history as the greatest demonstration for freedom in the history of our nation.\"\n\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.reference_audio = reference_audio\n    gen_config.reference_sample_rate = sample_rate\n    gen_config.num_steps = 5\n\n    start = time.time()\n    audio = tts.generate(text, gen_config)\n    end = time.time()\n\n    if len(audio.samples) == 0:\n        print(\"Error in generating audios. Please read previous error messages.\")\n        return\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = \"./generated.wav\"\n    sf.write(\n        output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\"Saved to {output_filename}\")\n    print(f\"The text is '{text}'\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/simulate-streaming-paraformer-microphone.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2025  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python APIs\nwith VAD and non-streaming Paraformer for real-time speech recognition\nfrom a microphone.\n\nUsage:\n\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-int8-2025-10-07.tar.bz2\ntar xvf sherpa-onnx-paraformer-zh-int8-2025-10-07.tar.bz2\n\n./python-api-examples/simulate-streaming-paraformer-microphone.py  \\\n  --silero-vad-model=./silero_vad.onnx \\\n  --paraformer=./sherpa-onnx-paraformer-zh-int8-2025-10-07/model.int8.onnx \\\n  --tokens=./sherpa-onnx-paraformer-zh-int8-2025-10-07/tokens.txt\n\"\"\"\nimport argparse\nimport queue\nimport sys\nimport threading\nimport time\nfrom pathlib import Path\n\nimport numpy as np\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\nkilled = False\nrecording_thread = None\nsample_rate = 16000  # Please don't change it\n\n# queue holding audio samples captured from the microphone\nsamples_queue = queue.Queue()\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    assert_file_exists(args.paraformer)\n    recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n        paraformer=args.paraformer,\n        tokens=args.tokens,\n        num_threads=args.num_threads,\n        
debug=False,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n\n    return recognizer\n\n\ndef start_recording():\n    # You can use any value you like for samples_per_read\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while not killed:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            samples = np.copy(samples)\n            samples_queue.put(samples)\n\n\ndef main():\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n\n    # If you want to select a different input device, please use\n    # sd.default.device[0] = xxx\n    # where xxx is the device number\n\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    args = get_args()\n    assert_file_exists(args.tokens)\n    assert_file_exists(args.silero_vad_model)\n\n    assert args.num_threads > 0, args.num_threads\n\n    print(\"Creating recognizer. Please wait...\")\n    recognizer = create_recognizer(args)\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.silero_vad.threshold = 0.5\n    config.silero_vad.min_silence_duration = 0.1  # seconds\n    config.silero_vad.min_speech_duration = 0.25  # seconds\n    # If the current segment is larger than this value, then it increases\n    # the threshold to 0.9 internally. 
After detecting this segment,\n    # it resets the threshold to its original value.\n    config.silero_vad.max_speech_duration = 8  # seconds\n    config.sample_rate = sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=100)\n\n    print(\"Started! Please speak\")\n\n    buffer = []\n\n    global recording_thread\n    recording_thread = threading.Thread(target=start_recording)\n    recording_thread.start()\n\n    display = sherpa_onnx.Display()\n\n    started = False\n    started_time = None\n\n    offset = 0\n    while not killed:\n        samples = samples_queue.get()  # a blocking read\n\n        buffer = np.concatenate([buffer, samples])\n        while offset + window_size < len(buffer):\n            vad.accept_waveform(buffer[offset : offset + window_size])\n            if not started and vad.is_speech_detected():\n                started = True\n                started_time = time.time()\n            offset += window_size\n\n        if not started:\n            if len(buffer) > 10 * window_size:\n                offset -= len(buffer) - 10 * window_size\n                buffer = buffer[-10 * window_size :]\n\n        if started and time.time() - started_time > 0.2:\n            stream = recognizer.create_stream()\n            stream.accept_waveform(sample_rate, buffer)\n            recognizer.decode_stream(stream)\n            text = stream.result.text.strip()\n            if text:\n                display.update_text(text)\n                display.display()\n\n            started_time = time.time()\n\n        while not vad.empty():\n            # In general, this while loop is executed only once\n            stream = recognizer.create_stream()\n            stream.accept_waveform(sample_rate, vad.front.samples)\n\n            vad.pop()\n            recognizer.decode_stream(stream)\n\n            text = stream.result.text.strip()\n\n            display.update_text(text)\n\n 
           buffer = []\n            offset = 0\n            started = False\n            started_time = None\n\n            display.finalize_current_sentence()\n            display.display()\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        killed = True\n        if recording_thread:\n            recording_thread.join()\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/simulate-streaming-sense-voice-microphone.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2025  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python APIs\nwith VAD and non-streaming SenseVoice for real-time speech recognition\nfrom a microphone.\n\nUsage:\n\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n./python-api-examples/simulate-streaming-sense-voice-microphone.py  \\\n  --silero-vad-model=./silero_vad.onnx \\\n  --sense-voice=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\n\"\"\"\nimport argparse\nimport queue\nimport sys\nimport threading\nimport time\nfrom pathlib import Path\n\nimport numpy as np\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\nkilled = False\nrecording_thread = None\nsample_rate = 16000  # Please don't change it\n\n# buffer saves audio samples to be played\nsamples_queue = queue.Queue()\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--sense-voice\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from SenseVoice\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        
default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    assert_file_exists(args.sense_voice)\n    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n        model=args.sense_voice,\n        tokens=args.tokens,\n        num_threads=args.num_threads,\n        use_itn=False,\n        debug=False,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n\n    return recognizer\n\n\ndef start_recording():\n    # You can use any value you like for samples_per_read\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while not killed:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            samples = np.copy(samples)\n            samples_queue.put(samples)\n\n\ndef main():\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n\n    # If you want to select a different input device, please use\n    # sd.default.device[0] = xxx\n    # where xxx is the device number\n\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    args = get_args()\n    assert_file_exists(args.tokens)\n    
assert_file_exists(args.silero_vad_model)\n\n    assert args.num_threads > 0, args.num_threads\n\n    print(\"Creating recognizer. Please wait...\")\n    recognizer = create_recognizer(args)\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.silero_vad.threshold = 0.5\n    config.silero_vad.min_silence_duration = 0.1  # seconds\n    config.silero_vad.min_speech_duration = 0.25  # seconds\n    # If the current segment is larger than this value, then it increases\n    # the threshold to 0.9 internally. After detecting this segment,\n    # it resets the threshold to its original value.\n    config.silero_vad.max_speech_duration = 8  # seconds\n    config.sample_rate = sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=100)\n\n    print(\"Started! Please speak\")\n\n    buffer = []\n\n    global recording_thread\n    recording_thread = threading.Thread(target=start_recording)\n    recording_thread.start()\n\n    display = sherpa_onnx.Display()\n\n    started = False\n    started_time = None\n\n    offset = 0\n    while not killed:\n        samples = samples_queue.get()  # a blocking read\n\n        buffer = np.concatenate([buffer, samples])\n        while offset + window_size < len(buffer):\n            vad.accept_waveform(buffer[offset : offset + window_size])\n            if not started and vad.is_speech_detected():\n                started = True\n                started_time = time.time()\n            offset += window_size\n\n        if not started:\n            if len(buffer) > 10 * window_size:\n                offset -= len(buffer) - 10 * window_size\n                buffer = buffer[-10 * window_size :]\n\n        if started and time.time() - started_time > 0.2:\n            stream = recognizer.create_stream()\n            stream.accept_waveform(sample_rate, buffer)\n            recognizer.decode_stream(stream)\n    
        text = stream.result.text.strip()\n            if text:\n                display.update_text(text)\n                display.display()\n\n            started_time = time.time()\n\n        while not vad.empty():\n            # In general, this while loop is executed only once\n            stream = recognizer.create_stream()\n            stream.accept_waveform(sample_rate, vad.front.samples)\n\n            vad.pop()\n            recognizer.decode_stream(stream)\n\n            text = stream.result.text.strip()\n\n            display.update_text(text)\n\n            buffer = []\n            offset = 0\n            started = False\n            started_time = None\n\n            display.finalize_current_sentence()\n            display.display()\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        killed = True\n        if recording_thread:\n            recording_thread.join()\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speaker-identification-with-vad-dynamic.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use Python APIs for speaker identification with\na microphone and a VAD model\n\nUsage:\n\n(1) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_large_sv_zh-cn_3dspeaker_16k.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(2) Download the VAD model\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(3) Run this script\n\npython3 ./python-api-examples/speaker-identification-with-vad-dynamic.py \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --model ./3dspeaker_speech_eres2net_large_sv_zh-cn_3dspeaker_16k.onnx\n\"\"\"\nimport argparse\nimport sys\n\nimport numpy as np\nimport sherpa_onnx\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\ng_sample_rate = 16000\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the speaker embedding model file.\",\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\"--threshold\", type=float, default=0.4)\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=bool,\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    return parser.parse_args()\n\n\ndef load_speaker_embedding_model(args):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=args.model,\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. 
{config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef compute_speaker_embedding(\n    samples: np.ndarray,\n    extractor: sherpa_onnx.SpeakerEmbeddingExtractor,\n) -> np.ndarray:\n    \"\"\"\n    Args:\n      samples:\n        A 1-D float32 array.\n      extractor:\n        The return value of function load_speaker_embedding_model().\n    Returns:\n      Return a 1-D float32 array.\n    \"\"\"\n    if len(samples) < g_sample_rate:\n        print(f\"Your input contains only {len(samples)} samples!\")\n\n    stream = extractor.create_stream()\n    stream.accept_waveform(sample_rate=g_sample_rate, waveform=samples)\n    stream.input_finished()\n\n    assert extractor.is_ready(stream)\n    embedding = extractor.compute(stream)\n    embedding = np.array(embedding)\n    return embedding\n\n\ndef main():\n    args = get_args()\n    print(args)\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    # If you want to select a different device, please change\n    # sd.default.device[0]. 
For instance, if you want to select device 4,\n    # please use\n    #\n    #  sd.default.device[0] = 4\n    #  print(devices)\n    #\n\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    extractor = load_speaker_embedding_model(args)\n\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n\n    vad_config = sherpa_onnx.VadModelConfig()\n    vad_config.silero_vad.model = args.silero_vad_model\n    vad_config.silero_vad.min_silence_duration = 0.25\n    vad_config.silero_vad.min_speech_duration = 1.0\n    vad_config.sample_rate = g_sample_rate\n\n    window_size = vad_config.silero_vad.window_size\n    vad = sherpa_onnx.VoiceActivityDetector(vad_config, buffer_size_in_seconds=100)\n\n    samples_per_read = int(0.1 * g_sample_rate)  # 0.1 second = 100 ms\n\n    print(\"Started! Please speak\")\n\n    line_num = 0\n    speaker_id = 0\n    buffer = []\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=g_sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            buffer = np.concatenate([buffer, samples])\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n\n            while not vad.empty():\n                if len(vad.front.samples) < 0.5 * g_sample_rate:\n                    # this segment is too short, skip it\n                    vad.pop()\n                    continue\n                stream = extractor.create_stream()\n                stream.accept_waveform(\n                    sample_rate=g_sample_rate, waveform=vad.front.samples\n                )\n                vad.pop()\n                stream.input_finished()\n\n                embedding = extractor.compute(stream)\n                embedding = np.array(embedding)\n                name = 
manager.search(embedding, threshold=args.threshold)\n                if not name:\n                    # register it\n                    new_name = f\"speaker_{speaker_id}\"\n                    status = manager.add(new_name, embedding)\n                    if not status:\n                        raise RuntimeError(f\"Failed to register speaker {new_name}\")\n                    print(\n                        f\"{line_num}: Detected new speaker. Register it as {new_name}\"\n                    )\n                    speaker_id += 1\n                else:\n                    print(f\"{line_num}: Detected existing speaker: {name}\")\n                line_num += 1\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speaker-identification-with-vad-non-streaming-asr-alsa.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script works only on Linux. It uses ALSA for recording.\n\nThis script shows how to use Python APIs for speaker identification with\na microphone, a VAD model, and a non-streaming ASR model.\n\nPlease see also ./generate-subtitles.py\n\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. 
An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Download the VAD model\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(4) Please refer to ./generate-subtitles.py\nto download a non-streaming ASR model.\n\n(5) Run this script\n\nAssume the filename of the text file is speaker.txt.\n\npython3 ./python-api-examples/speaker-identification-with-vad-non-streaming-asr-alsa.py \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --speaker-file ./speaker.txt \\\n  --model ./wespeaker_zh_cnceleb_resnet34.onnx \\\n  --device-name plughw:3,0\n\"\"\"\nimport argparse\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ng_sample_rate = 16000\n\n\ndef register_non_streaming_asr_model_args(parser):\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        default=\"\",\n        type=str,\n        
help=\"Path to the CTC model.onnx from WeNet\",\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input file.\n        Example values: en, fr, de, zh, jp.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Valid values are greedy_search and modified_beam_search.\n        modified_beam_search is valid only for transducer models.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--feature-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension. 
Must match the one expected by the model\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    register_non_streaming_asr_model_args(parser)\n\n    parser.add_argument(\n        \"--speaker-file\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the speaker file. Read the help doc at the beginning of this\n        file for the format.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the speaker embedding model file.\",\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\"--threshold\", type=float, default=0.6)\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=bool,\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--device-name\",\n        type=str,\n        required=True,\n        help=\"\"\"\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.paraformer:\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            
sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.wenet_ctc:\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.wenet_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n            model=args.wenet_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.whisper_encoder:\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model\")\n\n    return recognizer\n\n\ndef load_speaker_embedding_model(args):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=args.model,\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. 
{config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef load_speaker_file(args) -> Dict[str, List[str]]:\n    if not Path(args.speaker_file).is_file():\n        raise ValueError(f\"--speaker-file {args.speaker_file} does not exist\")\n\n    ans = defaultdict(list)\n    with open(args.speaker_file) as f:\n        for line in f:\n            line = line.strip()\n            if not line:\n                continue\n\n            fields = line.split()\n            if len(fields) != 2:\n                raise ValueError(f\"Invalid line: {line}. Fields: {fields}\")\n\n            speaker_name, filename = fields\n            ans[speaker_name].append(filename)\n    return ans\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_speaker_embedding(\n    filenames: List[str],\n    extractor: sherpa_onnx.SpeakerEmbeddingExtractor,\n) -> np.ndarray:\n    assert len(filenames) > 0, \"filenames is empty\"\n\n    ans = None\n    for filename in filenames:\n        print(f\"processing {filename}\")\n        samples, sample_rate = load_audio(filename)\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n        stream.input_finished()\n\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        if ans is None:\n            ans = embedding\n        else:\n            ans += embedding\n\n    return ans / len(filenames)\n\n\ndef main():\n    args = get_args()\n    print(args)\n\n    device_name = args.device_name\n    print(f\"device_name: {device_name}\")\n    alsa = sherpa_onnx.Alsa(device_name)\n\n    recognizer = 
create_recognizer(args)\n    extractor = load_speaker_embedding_model(args)\n    speaker_file = load_speaker_file(args)\n\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n    for name, filename_list in speaker_file.items():\n        embedding = compute_speaker_embedding(\n            filenames=filename_list,\n            extractor=extractor,\n        )\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    vad_config = sherpa_onnx.VadModelConfig()\n    vad_config.silero_vad.model = args.silero_vad_model\n    vad_config.silero_vad.min_silence_duration = 0.25\n    vad_config.silero_vad.min_speech_duration = 0.25\n    vad_config.sample_rate = g_sample_rate\n    if not vad_config.validate():\n        raise ValueError(\"Errors in vad config\")\n\n    window_size = vad_config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(vad_config, buffer_size_in_seconds=100)\n\n    samples_per_read = int(0.1 * g_sample_rate)  # 0.1 second = 100 ms\n\n    print(\"Started! 
Please speak\")\n\n    idx = 0\n    buffer = []\n    while True:\n        samples = alsa.read(samples_per_read)  # a blocking read\n        samples = np.array(samples)\n        buffer = np.concatenate([buffer, samples])\n        while len(buffer) > window_size:\n            vad.accept_waveform(buffer[:window_size])\n            buffer = buffer[window_size:]\n\n        while not vad.empty():\n            if len(vad.front.samples) < 0.5 * g_sample_rate:\n                # this segment is too short, skip it\n                vad.pop()\n                continue\n            stream = extractor.create_stream()\n            stream.accept_waveform(\n                sample_rate=g_sample_rate, waveform=vad.front.samples\n            )\n            stream.input_finished()\n\n            embedding = extractor.compute(stream)\n            embedding = np.array(embedding)\n            name = manager.search(embedding, threshold=args.threshold)\n            if not name:\n                name = \"unknown\"\n\n            # Now for non-streaming ASR\n            asr_stream = recognizer.create_stream()\n            asr_stream.accept_waveform(\n                sample_rate=g_sample_rate, waveform=vad.front.samples\n            )\n            recognizer.decode_stream(asr_stream)\n            text = asr_stream.result.text\n\n            vad.pop()\n\n            print(f\"\\r{idx}-{name}: {text}\")\n            idx += 1\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speaker-identification-with-vad-non-streaming-asr.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use Python APIs for speaker identification with\na microphone, a VAD model, and a non-streaming ASR model.\n\nPlease see also ./generate-subtitles.py\n\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. 
An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Download the VAD model\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(4) Please refer to ./generate-subtitles.py\nto download a non-streaming ASR model.\n\n(5) Run this script\n\nAssume the filename of the text file is speaker.txt.\n\npython3 ./python-api-examples/speaker-identification-with-vad-non-streaming-asr.py \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --speaker-file ./speaker.txt \\\n  --model ./wespeaker_zh_cnceleb_resnet34.onnx\n\"\"\"\nimport argparse\nimport sys\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\ng_sample_rate = 16000\n\n\ndef register_non_streaming_asr_model_args(parser):\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the CTC model.onnx from WeNet\",\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input file.\n        Example values: en, fr, de, zh, ja.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        
choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Valid values are greedy_search and modified_beam_search.\n        modified_beam_search is valid only for transducer models.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--feature-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension. Must match the one expected by the model\",\n    )\n\n    parser.add_argument(\n        \"--sense-voice\",\n        default=\"\",\n        type=str,\n        help=\"Path to sense voice model\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    register_non_streaming_asr_model_args(parser)\n\n    parser.add_argument(\n        \"--speaker-file\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the speaker file. 
Read the help doc at the beginning of this\n        file for the format.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the speaker embedding model file.\",\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\"--threshold\", type=float, default=0.6)\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=bool,\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            
sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.paraformer:\n        assert len(args.wenet_ctc) == 0, args.wenet_ctc\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.wenet_ctc:\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n\n        assert_file_exists(args.wenet_ctc)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n            model=args.wenet_ctc,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=g_sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n        )\n    elif args.whisper_encoder:\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n        )\n    elif args.sense_voice:\n        
assert_file_exists(args.sense_voice)\n        recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=args.sense_voice,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            use_itn=True,\n            debug=args.debug,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model\")\n\n    return recognizer\n\n\ndef load_speaker_embedding_model(args):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=args.model,\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. {config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef load_speaker_file(args) -> Dict[str, List[str]]:\n    if not Path(args.speaker_file).is_file():\n        raise ValueError(f\"--speaker-file {args.speaker_file} does not exist\")\n\n    ans = defaultdict(list)\n    with open(args.speaker_file) as f:\n        for line in f:\n            line = line.strip()\n            if not line:\n                continue\n\n            fields = line.split()\n            if len(fields) != 2:\n                raise ValueError(f\"Invalid line: {line}. 
Fields: {fields}\")\n\n            speaker_name, filename = fields\n            ans[speaker_name].append(filename)\n    return ans\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_speaker_embedding(\n    filenames: List[str],\n    extractor: sherpa_onnx.SpeakerEmbeddingExtractor,\n) -> np.ndarray:\n    assert len(filenames) > 0, \"filenames is empty\"\n\n    ans = None\n    for filename in filenames:\n        print(f\"processing {filename}\")\n        samples, sample_rate = load_audio(filename)\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n        stream.input_finished()\n\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        if ans is None:\n            ans = embedding\n        else:\n            ans += embedding\n\n    return ans / len(filenames)\n\n\ndef main():\n    args = get_args()\n    print(args)\n    recognizer = create_recognizer(args)\n    extractor = load_speaker_embedding_model(args)\n    speaker_file = load_speaker_file(args)\n\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n    for name, filename_list in speaker_file.items():\n        embedding = compute_speaker_embedding(\n            filenames=filename_list,\n            extractor=extractor,\n        )\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    vad_config = sherpa_onnx.VadModelConfig()\n    vad_config.silero_vad.model = args.silero_vad_model\n    vad_config.silero_vad.min_silence_duration = 0.25\n    vad_config.silero_vad.min_speech_duration = 0.25\n   
 vad_config.sample_rate = g_sample_rate\n    if not vad_config.validate():\n        raise ValueError(\"Errors in vad config\")\n\n    window_size = vad_config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(vad_config, buffer_size_in_seconds=100)\n\n    samples_per_read = int(0.1 * g_sample_rate)  # 0.1 second = 100 ms\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    print(\"Started! Please speak\")\n\n    idx = 0\n    buffer = []\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=g_sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            buffer = np.concatenate([buffer, samples])\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n\n            while not vad.empty():\n                if len(vad.front.samples) < 0.5 * g_sample_rate:\n                    # this segment is too short, skip it\n                    vad.pop()\n                    continue\n                stream = extractor.create_stream()\n                stream.accept_waveform(\n                    sample_rate=g_sample_rate, waveform=vad.front.samples\n                )\n                stream.input_finished()\n\n                embedding = extractor.compute(stream)\n                embedding = np.array(embedding)\n                name = manager.search(embedding, threshold=args.threshold)\n                if not name:\n                    name = \"unknown\"\n\n                # Now for non-streaming ASR\n                asr_stream = recognizer.create_stream()\n                asr_stream.accept_waveform(\n              
      sample_rate=g_sample_rate, waveform=vad.front.samples\n                )\n                recognizer.decode_stream(asr_stream)\n                text = asr_stream.result.text\n\n                vad.pop()\n\n                print(f\"\\r{idx}-{name}: {text}\")\n                idx += 1\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speaker-identification-with-vad.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use Python APIs for speaker identification with\na microphone and a VAD model\n\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Download the VAD model\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(4) Run this script\n\nAssume the filename of the text file is speaker.txt.\n\npython3 ./python-api-examples/speaker-identification-with-vad.py \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --speaker-file ./speaker.txt \\\n  --model ./wespeaker_zh_cnceleb_resnet34.onnx\n\"\"\"\nimport argparse\nimport sys\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install 
sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--speaker-file\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the speaker file. Read the help doc at the beginning of this\n        file for the format.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the speaker embedding model file.\",\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\"--threshold\", type=float, default=0.6)\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=bool,\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    return parser.parse_args()\n\n\ndef load_speaker_embedding_model(args):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=args.model,\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. 
{config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef load_speaker_file(args) -> Dict[str, List[str]]:\n    if not Path(args.speaker_file).is_file():\n        raise ValueError(f\"--speaker-file {args.speaker_file} does not exist\")\n\n    ans = defaultdict(list)\n    with open(args.speaker_file) as f:\n        for line in f:\n            line = line.strip()\n            if not line:\n                continue\n\n            fields = line.split()\n            if len(fields) != 2:\n                raise ValueError(f\"Invalid line: {line}. Fields: {fields}\")\n\n            speaker_name, filename = fields\n            ans[speaker_name].append(filename)\n    return ans\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_speaker_embedding(\n    filenames: List[str],\n    extractor: sherpa_onnx.SpeakerEmbeddingExtractor,\n) -> np.ndarray:\n    assert len(filenames) > 0, \"filenames is empty\"\n\n    ans = None\n    for filename in filenames:\n        print(f\"processing {filename}\")\n        samples, sample_rate = load_audio(filename)\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n        stream.input_finished()\n\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        if ans is None:\n            ans = embedding\n        else:\n            ans += embedding\n\n    return ans / len(filenames)\n\n\ng_sample_rate = 16000\n\n\ndef main():\n    args = get_args()\n    print(args)\n    extractor = load_speaker_embedding_model(args)\n    speaker_file = load_speaker_file(args)\n\n    manager = 
sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n    for name, filename_list in speaker_file.items():\n        embedding = compute_speaker_embedding(\n            filenames=filename_list,\n            extractor=extractor,\n        )\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    vad_config = sherpa_onnx.VadModelConfig()\n    vad_config.silero_vad.model = args.silero_vad_model\n    vad_config.silero_vad.min_silence_duration = 0.25\n    vad_config.silero_vad.min_speech_duration = 0.25\n    vad_config.sample_rate = g_sample_rate\n\n    window_size = vad_config.silero_vad.window_size\n    vad = sherpa_onnx.VoiceActivityDetector(vad_config, buffer_size_in_seconds=100)\n\n    samples_per_read = int(0.1 * g_sample_rate)  # 0.1 second = 100 ms\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    print(\"Started! 
Please speak\")\n\n    idx = 0\n    buffer = []\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=g_sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            buffer = np.concatenate([buffer, samples])\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n\n            while not vad.empty():\n                if len(vad.front.samples) < 0.5 * g_sample_rate:\n                    # this segment is too short, skip it\n                    vad.pop()\n                    continue\n                stream = extractor.create_stream()\n                stream.accept_waveform(\n                    sample_rate=g_sample_rate, waveform=vad.front.samples\n                )\n                vad.pop()\n                stream.input_finished()\n\n                print(\"Computing\", end=\"\")\n                embedding = extractor.compute(stream)\n                embedding = np.array(embedding)\n                name = manager.search(embedding, threshold=args.threshold)\n                if not name:\n                    name = \"unknown\"\n                print(f\"\\r{idx}: Predicted name: {name}\")\n                idx += 1\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speaker-identification.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use Python APIs for speaker identification with\na microphone.\n\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Run this script\n\nAssume the filename of the text file is speaker.txt.\n\npython3 ./python-api-examples/speaker-identification.py \\\n  --speaker-file ./speaker.txt \\\n  --model ./wespeaker_zh_cnceleb_resnet34.onnx\n\"\"\"\nimport argparse\nimport queue\nimport sys\nimport threading\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--speaker-file\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the speaker file. Read the help doc at the beginning of this\n        file for the format.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the model file.\",\n    )\n\n    parser.add_argument(\"--threshold\", type=float, default=0.6)\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--debug\",\n        type=bool,\n        default=False,\n        help=\"True to show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    return parser.parse_args()\n\n\ndef load_speaker_embedding_model(args):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=args.model,\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. 
{config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef load_speaker_file(args) -> Dict[str, List[str]]:\n    if not Path(args.speaker_file).is_file():\n        raise ValueError(f\"--speaker-file {args.speaker_file} does not exist\")\n\n    ans = defaultdict(list)\n    with open(args.speaker_file) as f:\n        for line in f:\n            line = line.strip()\n            if not line:\n                continue\n\n            fields = line.split()\n            if len(fields) != 2:\n                raise ValueError(f\"Invalid line: {line}. Fields: {fields}\")\n\n            speaker_name, filename = fields\n            ans[speaker_name].append(filename)\n    return ans\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_speaker_embedding(\n    filenames: List[str],\n    extractor: sherpa_onnx.SpeakerEmbeddingExtractor,\n) -> np.ndarray:\n    assert len(filenames) > 0, \"filenames is empty\"\n\n    ans = None\n    for filename in filenames:\n        print(f\"processing {filename}\")\n        samples, sample_rate = load_audio(filename)\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n        stream.input_finished()\n\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        if ans is None:\n            ans = embedding\n        else:\n            ans += embedding\n\n    return ans / len(filenames)\n\n\ng_buffer = queue.Queue()\ng_stop = False\ng_sample_rate = 16000\ng_read_mic_thread = None\n\n\ndef read_mic():\n    print(\"Please speak!\")\n    samples_per_read = int(0.1 * g_sample_rate)  # 0.1 second = 100 
ms\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=g_sample_rate) as s:\n        while not g_stop:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            g_buffer.put(samples)\n\n\ndef main():\n    args = get_args()\n    print(args)\n    extractor = load_speaker_embedding_model(args)\n    speaker_file = load_speaker_file(args)\n\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n    for name, filename_list in speaker_file.items():\n        embedding = compute_speaker_embedding(\n            filenames=filename_list,\n            extractor=extractor,\n        )\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    global g_stop\n    global g_read_mic_thread\n    while True:\n        key = input(\"Press Enter to start recording\")\n        if key.lower() in (\"q\", \"quit\"):\n            g_stop = True\n            break\n\n        g_stop = False\n        g_buffer.queue.clear()\n        g_read_mic_thread = threading.Thread(target=read_mic)\n        g_read_mic_thread.start()\n        input(\"Press Enter to stop recording\")\n        g_stop = True\n        g_read_mic_thread.join()\n        print(\"Compute embedding\")\n        stream = extractor.create_stream()\n        while not g_buffer.empty():\n            samples = g_buffer.get()\n            stream.accept_waveform(sample_rate=g_sample_rate, waveform=samples)\n        stream.input_finished()\n\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        name = manager.search(embedding, threshold=args.threshold)\n        if not name:\n        
    name = \"unknown\"\n        print(f\"Predicted name: {name}\")\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n        g_stop = True\n        if g_read_mic_thread.is_alive():\n            g_read_mic_thread.join()\n"
  },
  {
    "path": "python-api-examples/speech-recognition-from-microphone-with-endpoint-detection-alsa.py",
    "content": "#!/usr/bin/env python3\n\n# Real-time speech recognition from a microphone with sherpa-onnx Python API\n# with endpoint detection.\n#\n# Note: This script uses ALSA and works only on Linux, for example on\n# embedded Linux systems or on Linux running on Windows via WSL.\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n# to download pre-trained models\n\nimport argparse\nfrom pathlib import Path\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        required=True,\n        help=\"Path to the joiner model\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word/phrase per 
line, and for each\n        phrase the bpe/cjkchar are separated by a space. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--device-name\",\n        type=str,\n        required=True,\n        help=\"\"\"\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args):\n    assert_file_exists(args.encoder)\n    assert_file_exists(args.decoder)\n    assert_file_exists(args.joiner)\n    assert_file_exists(args.tokens)\n    # Please replace the model files if needed.\n    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n    # for download links.\n    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n        tokens=args.tokens,\n        encoder=args.encoder,\n        decoder=args.decoder,\n        joiner=args.joiner,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        enable_endpoint_detection=True,\n        rule1_min_trailing_silence=2.4,\n        rule2_min_trailing_silence=1.2,\n        rule3_min_utterance_length=300,  # it essentially disables this rule\n        decoding_method=args.decoding_method,\n        provider=args.provider,\n        hotwords_file=args.hotwords_file,\n        hotwords_score=args.hotwords_score,\n        blank_penalty=args.blank_penalty,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n    return recognizer\n\n\ndef main():\n    args = get_args()\n    device_name = args.device_name\n    print(f\"device_name: {device_name}\")\n    alsa = sherpa_onnx.Alsa(device_name)\n\n    print(\"Creating recognizer\")\n    recognizer = create_recognizer(args)\n    print(\"Started! 
Please speak\")\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    stream = recognizer.create_stream()\n\n    display = sherpa_onnx.Display()\n\n    while True:\n        samples = alsa.read(samples_per_read)  # a blocking read\n        stream.accept_waveform(sample_rate, samples)\n        while recognizer.is_ready(stream):\n            recognizer.decode_stream(stream)\n\n        is_endpoint = recognizer.is_endpoint(stream)\n\n        result = recognizer.get_result(stream)\n\n        display.update_text(result)\n        display.display()\n\n        if is_endpoint:\n            if result:\n                display.finalize_current_sentence()\n                display.display()\n\n            recognizer.reset(stream)\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speech-recognition-from-microphone-with-endpoint-detection.py",
    "content": "#!/usr/bin/env python3\n\n# Real-time speech recognition from a microphone with sherpa-onnx Python API\n# with endpoint detection.\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n# to download pre-trained models\n\nimport argparse\nimport sys\nfrom pathlib import Path\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        required=True,\n        help=\"Path to the joiner model\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        
default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word/phrase per line, and for each\n        phrase the bpe/cjkchar tokens are separated by spaces. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args):\n    assert_file_exists(args.encoder)\n    assert_file_exists(args.decoder)\n    assert_file_exists(args.joiner)\n    assert_file_exists(args.tokens)\n    # Please replace the model files if needed.\n    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n    # for download links.\n    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n        tokens=args.tokens,\n        encoder=args.encoder,\n        decoder=args.decoder,\n        joiner=args.joiner,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        enable_endpoint_detection=True,\n        rule1_min_trailing_silence=2.4,\n 
       rule2_min_trailing_silence=1.2,\n        rule3_min_utterance_length=300,  # it essentially disables this rule\n        decoding_method=args.decoding_method,\n        provider=args.provider,\n        hotwords_file=args.hotwords_file,\n        hotwords_score=args.hotwords_score,\n        blank_penalty=args.blank_penalty,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n    return recognizer\n\n\ndef main():\n    args = get_args()\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    recognizer = create_recognizer(args)\n    print(\"Started! Please speak\")\n\n    # The model is using 16 kHz, we use 48 kHz here to demonstrate that\n    # sherpa-onnx will do resampling inside.\n    sample_rate = 48000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    stream = recognizer.create_stream()\n\n    display = sherpa_onnx.Display()\n\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            stream.accept_waveform(sample_rate, samples)\n            while recognizer.is_ready(stream):\n                recognizer.decode_stream(stream)\n\n            is_endpoint = recognizer.is_endpoint(stream)\n\n            result = recognizer.get_result(stream)\n\n            display.update_text(result)\n            display.display()\n\n            if is_endpoint:\n                if result:\n                    display.finalize_current_sentence()\n                    display.display()\n\n                recognizer.reset(stream)\n\n\nif __name__ == \"__main__\":\n\n    try:\n        main()\n    except 
KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speech-recognition-from-microphone.py",
    "content": "#!/usr/bin/env python3\n\n# Real-time speech recognition from a microphone with sherpa-onnx Python API\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n# to download pre-trained models\n\nimport argparse\nimport sys\nfrom pathlib import Path\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        required=True,\n        help=\"Path to the joiner model\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--max-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        It specifies the number of active paths to keep during decoding.\n        \"\"\",\n    
)\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word/phrase per line, and for each\n        phrase the bpe/cjkchar tokens are separated by spaces. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args):\n    assert_file_exists(args.encoder)\n    assert_file_exists(args.decoder)\n    assert_file_exists(args.joiner)\n    assert_file_exists(args.tokens)\n    # Please replace the model files if needed.\n    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n    # for download links.\n    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n        tokens=args.tokens,\n       
 encoder=args.encoder,\n        decoder=args.decoder,\n        joiner=args.joiner,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        decoding_method=args.decoding_method,\n        max_active_paths=args.max_active_paths,\n        provider=args.provider,\n        hotwords_file=args.hotwords_file,\n        hotwords_score=args.hotwords_score,\n        blank_penalty=args.blank_penalty,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n    return recognizer\n\n\ndef main():\n    args = get_args()\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    recognizer = create_recognizer(args)\n    print(\"Started! Please speak\")\n\n    # The model is using 16 kHz, we use 48 kHz here to demonstrate that\n    # sherpa-onnx will do resampling inside.\n    sample_rate = 48000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n    last_result = \"\"\n    stream = recognizer.create_stream()\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            stream.accept_waveform(sample_rate, samples)\n            while recognizer.is_ready(stream):\n                recognizer.decode_stream(stream)\n            result = recognizer.get_result(stream)\n            if last_result != result:\n                last_result = result\n                print(\"\\r{}\".format(result), end=\"\", flush=True)\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/speech-recognition-from-url.py",
    "content": "#!/usr/bin/env python3\n#\n# Real-time speech recognition from a URL with sherpa-onnx Python API\n#\n# Supported URLs are those supported by ffmpeg.\n#\n# For instance:\n# (1) RTMP\n#     rtmp://localhost/live/livestream\n#\n# (2) A file\n#     https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/wenetspeech/DEV_T0000000000.opus\n#     https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/aishell2/ID0012W0030.wav\n#     file:///Users/fangjun/open-source/sherpa-onnx/a.wav\n#\n#    Note that it supports all file formats supported by ffmpeg\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n# to download pre-trained models\n\nimport argparse\nimport shutil\nimport subprocess\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        required=True,\n        help=\"Path to the joiner model\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"Valid values are greedy_search and 
modified_beam_search\",\n    )\n\n    parser.add_argument(\n        \"--url\",\n        type=str,\n        required=True,\n        help=\"\"\"Example values:\n          rtmp://localhost/live/livestream\n          https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/wenetspeech/DEV_T0000000000.opus\n          https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition/resolve/main/test_wavs/aishell2/ID0012W0030.wav\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word/phrase per line, and for each\n        phrase the bpe/cjkchar tokens are separated by spaces. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. 
Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args):\n    # Please replace the model files if needed.\n    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n    # for download links.\n    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n        tokens=args.tokens,\n        encoder=args.encoder,\n        decoder=args.decoder,\n        joiner=args.joiner,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        decoding_method=args.decoding_method,\n        enable_endpoint_detection=True,\n        rule1_min_trailing_silence=2.4,\n        rule2_min_trailing_silence=1.2,\n        rule3_min_utterance_length=300,  # it essentially disables this rule\n        hotwords_file=args.hotwords_file,\n        hotwords_score=args.hotwords_score,\n        hr_rule_fsts=args.hr_rule_fsts,\n        hr_lexicon=args.hr_lexicon,\n    )\n    return recognizer\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.encoder)\n    assert_file_exists(args.decoder)\n    assert_file_exists(args.joiner)\n    assert_file_exists(args.tokens)\n\n    recognizer = create_recognizer(args)\n\n    ffmpeg_cmd = [\n        \"ffmpeg\",\n        \"-i\",\n        args.url,\n        \"-f\",\n        \"s16le\",\n        \"-acodec\",\n        \"pcm_s16le\",\n        \"-ac\",\n        \"1\",\n        \"-ar\",\n        \"16000\",\n        \"-\",\n    ]\n\n    process = subprocess.Popen(\n        ffmpeg_cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL\n    )\n\n    frames_per_read = 1600  
# 0.1 second\n\n    stream = recognizer.create_stream()\n\n    display = sherpa_onnx.Display()\n\n    print(\"Started!\")\n    while True:\n        # *2 because int16_t has two bytes\n        data = process.stdout.read(frames_per_read * 2)\n        if not data:\n            break\n\n        samples = np.frombuffer(data, dtype=np.int16)\n        samples = samples.astype(np.float32) / 32768\n        stream.accept_waveform(16000, samples)\n\n        while recognizer.is_ready(stream):\n            recognizer.decode_stream(stream)\n\n        is_endpoint = recognizer.is_endpoint(stream)\n\n        result = recognizer.get_result(stream)\n\n        display.update_text(result)\n        display.display()\n\n        if is_endpoint:\n            if result:\n                display.finalize_current_sentence()\n                display.display()\n\n            recognizer.reset(stream)\n\n\nif __name__ == \"__main__\":\n    if shutil.which(\"ffmpeg\") is None:\n        sys.exit(\"Please install ffmpeg first!\")\n    main()\n"
  },
  {
    "path": "python-api-examples/spoken-language-identification.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script shows how to use Python APIs for spoken language identification.\nIt detects the language spoken in the given wave file.\n\nUsage:\n\n1. Download a whisper multilingual model. We use a tiny model below.\nPlease refer to https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nto download more models.\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.tar.bz2\nrm sherpa-onnx-whisper-tiny.tar.bz2\n\nWe only use the int8.onnx models below.\n\n2. Download a test wave.\n\nYou can find many wave files for different languages at\nhttps://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/tree/main/test_wavs\n\nwget https://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/resolve/main/test_wavs/de-german.wav\n\npython3 ./python-api-examples/spoken-language-identification.py \\\n  --whisper-encoder=sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx \\\n  --whisper-decoder=sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx \\\n  --num-threads=1 \\\n  ./de-german.wav\n\"\"\"\n\nimport argparse\nimport logging\nimport time\nimport wave\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        required=True,\n        type=str,\n        help=\"Path to a multilingual whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        required=True,\n        type=str,\n        help=\"Path to a multilingual whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        
\"--debug\",\n        action=\"store_true\",\n        help=\"Show debug messages\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    parser.add_argument(\n        \"sound_file\",\n        type=str,\n        help=\"The input sound file to identify. It must be in WAVE \"\n        \"format with a single channel, and each sample must be 16-bit, \"\n        \"i.e., int16_t. \"\n        \"The sample rate of the file can be arbitrary and does not need to \"\n        \"be 16 kHz\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html to download it\"\n    )\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. 
Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.whisper_encoder)\n    assert_file_exists(args.whisper_decoder)\n    assert args.num_threads > 0, args.num_threads\n    config = sherpa_onnx.SpokenLanguageIdentificationConfig(\n        whisper=sherpa_onnx.SpokenLanguageIdentificationWhisperConfig(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n        ),\n        num_threads=args.num_threads,\n        debug=args.debug,\n        provider=args.provider,\n    )\n    slid = sherpa_onnx.SpokenLanguageIdentification(config)\n\n    samples, sample_rate = read_wave(args.sound_file)\n\n    start_time = time.time()\n    stream = slid.create_stream()\n    stream.accept_waveform(sample_rate=sample_rate, waveform=samples)\n    lang = slid.compute(stream)\n    end_time = time.time()\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(samples) / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    logging.info(f\"File: {args.sound_file}\")\n    logging.info(f\"Detected language: {lang}\")\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        
f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n\n    main()\n"
  },
  {
    "path": "python-api-examples/streaming-paraformer-asr-microphone.py",
    "content": "#!/usr/bin/env python3\n\n# Real-time speech recognition from a microphone with sherpa-onnx Python API\n# with endpoint detection.\n# This script uses a streaming paraformer\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html#\n# to download pre-trained models\n\nimport sys\nfrom pathlib import Path\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/paraformer-models.html to download it\"\n    )\n\n\ndef create_recognizer():\n    encoder = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx\"\n    decoder = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx\"\n    tokens = \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt\"\n    assert_file_exists(encoder)\n    assert_file_exists(decoder)\n    assert_file_exists(tokens)\n    recognizer = sherpa_onnx.OnlineRecognizer.from_paraformer(\n        tokens=tokens,\n        encoder=encoder,\n        decoder=decoder,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        enable_endpoint_detection=True,\n        rule1_min_trailing_silence=2.4,\n        rule2_min_trailing_silence=1.2,\n        rule3_min_utterance_length=300,  # it essentially disables this rule\n    )\n    return recognizer\n\n\ndef main():\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    
print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    recognizer = create_recognizer()\n    print(\"Started! Please speak\")\n\n    # The model is using 16 kHz, we use 48 kHz here to demonstrate that\n    # sherpa-onnx will do resampling inside.\n    sample_rate = 48000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    stream = recognizer.create_stream()\n\n    display = sherpa_onnx.Display()\n\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            stream.accept_waveform(sample_rate, samples)\n            while recognizer.is_ready(stream):\n                recognizer.decode_stream(stream)\n\n            is_endpoint = recognizer.is_endpoint(stream)\n\n            result = recognizer.get_result(stream)\n\n            display.update_text(result)\n            display.display()\n\n            if is_endpoint:\n                if result:\n                    display.finalize_current_sentence()\n                    display.display()\n\n                recognizer.reset(stream)\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/streaming_server.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2022-2023  Xiaomi Corp.\n#\n\"\"\"\nA server for streaming ASR recognition. By streaming it means the audio samples\nare coming in real-time. You don't need to wait until all audio samples are\ncaptured before sending them for recognition.\n\nIt supports multiple clients sending at the same time.\n\nUsage:\n    ./streaming_server.py --help\n\nExample:\n\n(1) Without a certificate\n\npython3 ./python-api-examples/streaming_server.py \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\n\n(2) With a certificate\n\n(a) Generate a certificate first:\n\n    cd python-api-examples/web\n    ./generate-certificate.py\n    cd ../..\n\n(b) Start the server\n\npython3 ./python-api-examples/streaming_server.py \\\n  --encoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \\\n  --decoder ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n  --joiner ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \\\n  --tokens ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n  --certificate ./python-api-examples/web/cert.pem\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/wenet/index.html\nto download pre-trained models.\n\nThe model in the above help messages is 
from\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n\nTo use a WeNet streaming Conformer CTC model, please use\n\npython3 ./python-api-examples/streaming_server.py \\\n  --tokens=./sherpa-onnx-zh-wenet-wenetspeech/tokens.txt \\\n  --wenet-ctc=./sherpa-onnx-zh-wenet-wenetspeech/model-streaming.onnx\n\"\"\"\n\nimport argparse\nimport asyncio\nimport http\nimport json\nimport logging\nimport socket\nimport ssl\nfrom concurrent.futures import ThreadPoolExecutor\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import List, Optional, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport websockets\n\nfrom http_server import HttpServer\n\n\ndef setup_logger(\n    log_filename: str,\n    log_level: str = \"info\",\n    use_console: bool = True,\n) -> None:\n    \"\"\"Setup log level.\n\n    Args:\n      log_filename:\n        The filename to save the log.\n      log_level:\n        The log level to use, e.g., \"debug\", \"info\", \"warning\", \"error\",\n        \"critical\"\n      use_console:\n        True to also print logs to console.\n    \"\"\"\n    now = datetime.now()\n    date_time = now.strftime(\"%Y-%m-%d-%H-%M-%S\")\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n    log_filename = f\"{log_filename}-{date_time}.txt\"\n\n    Path(log_filename).parent.mkdir(parents=True, exist_ok=True)\n\n    level = logging.ERROR\n    if log_level == \"debug\":\n        level = logging.DEBUG\n    elif log_level == \"info\":\n        level = logging.INFO\n    elif log_level == \"warning\":\n        level = logging.WARNING\n    elif log_level == \"critical\":\n        level = logging.CRITICAL\n\n    logging.basicConfig(\n        filename=log_filename,\n        format=formatter,\n        level=level,\n        filemode=\"w\",\n    )\n    if use_console:\n        
console = logging.StreamHandler()\n        console.setLevel(level)\n        console.setFormatter(logging.Formatter(formatter))\n        logging.getLogger(\"\").addHandler(console)\n\n\ndef add_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        help=\"Path to the transducer decoder model.\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        type=str,\n        help=\"Path to the transducer joiner model.\",\n    )\n\n    parser.add_argument(\n        \"--zipformer2-ctc\",\n        type=str,\n        help=\"Path to the model file from zipformer2 ctc\",\n    )\n\n    parser.add_argument(\n        \"--wenet-ctc\",\n        type=str,\n        help=\"Path to the model.onnx from WeNet\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-encoder\",\n        type=str,\n        help=\"Path to the paraformer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-decoder\",\n        type=str,\n        help=\"Path to the paraformer decoder model.\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"Sample rate of the data used to train the model. 
\"\n        \"Caution: If your input sound files have a different sampling rate, \"\n        \"we will do resampling inside\",\n    )\n\n    parser.add_argument(\n        \"--feat-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension of the model\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n\ndef add_decoding_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Decoding method to use. Current supported methods are:\n        - greedy_search\n        - modified_beam_search\n        \"\"\",\n    )\n\n    add_modified_beam_search_args(parser)\n\n\ndef add_hotwords_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one words/phrases per line, and for each\n        phrase the bpe/cjkchar are separated by a space. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--modeling-unit\",\n        type=str,\n        default='cjkchar',\n        help=\"\"\"\n        The modeling unit of the used model. Current supported units are:\n        - cjkchar(for Chinese)\n        - bpe(for English like languages)\n        - cjkchar+bpe(for multilingual models)\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--bpe-vocab\",\n        type=str,\n        default='',\n        help=\"\"\"\n        The bpe vocabulary generated by sentencepiece toolkit. 
\n        It is only used when modeling-unit is bpe or cjkchar+bpe.\n        If you can’t find bpe.vocab in the model directory, please run:\n        python script/export_bpe_vocab.py --bpe-model exp/bpe.model\n        \"\"\",\n    )\n\n\ndef add_modified_beam_search_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--num-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        It specifies the number of active paths to keep during decoding.\n        \"\"\",\n    )\n\ndef add_blank_penalty_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied to the blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\ndef add_endpointing_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--use-endpoint\",\n        type=int,\n        default=1,\n        help=\"1 to enable endpointing. 
0 to disable it\",\n    )\n\n    parser.add_argument(\n        \"--rule1-min-trailing-silence\",\n        type=float,\n        default=2.4,\n        help=\"\"\"This endpointing rule1 requires duration of trailing silence\n        (in seconds) to be >= this value.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--rule2-min-trailing-silence\",\n        type=float,\n        default=1.2,\n        help=\"\"\"This endpointing rule2 requires duration of trailing silence\n        (in seconds) to be >= this value.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--rule3-min-utterance-length\",\n        type=float,\n        default=20,\n        help=\"\"\"This endpointing rule3 requires utterance-length (in seconds)\n        to be >= this value.\"\"\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter,\n    )\n\n    add_model_args(parser)\n    add_decoding_args(parser)\n    add_endpointing_args(parser)\n    add_hotwords_args(parser)\n    add_blank_penalty_args(parser)\n\n    parser.add_argument(\n        \"--port\",\n        type=int,\n        default=6006,\n        help=\"The server will listen on this port\",\n    )\n\n    parser.add_argument(\n        \"--nn-pool-size\",\n        type=int,\n        default=1,\n        help=\"Number of threads for NN computation and decoding.\",\n    )\n\n    parser.add_argument(\n        \"--max-batch-size\",\n        type=int,\n        default=3,\n        help=\"\"\"Max batch size for computation. Note if there are not enough\n        requests in the queue, it will wait for max_wait_ms time. 
After that,\n        even if there are not enough requests, it still sends the\n        available requests in the queue for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-wait-ms\",\n        type=float,\n        default=10,\n        help=\"\"\"Max time in milliseconds to wait to build batches for inference.\n        If there are not enough requests in the stream queue to build a batch\n        of max_batch_size, it waits up to this time before fetching available\n        requests for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-message-size\",\n        type=int,\n        default=(1 << 20),\n        help=\"\"\"Max message size in bytes.\n        The max size per message cannot exceed this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-queue-size\",\n        type=int,\n        default=32,\n        help=\"Max number of messages in the queue for each connection.\",\n    )\n\n    parser.add_argument(\n        \"--max-active-connections\",\n        type=int,\n        default=200,\n        help=\"\"\"Maximum number of active connections. The server will refuse\n        to accept new connections once the current number of active connections\n        equals this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads to run the neural network model\",\n    )\n\n    parser.add_argument(\n        \"--certificate\",\n        type=str,\n        help=\"\"\"Path to the X.509 certificate. 
You need it only if you want to\n        use a secure websocket connection, i.e., use wss:// instead of ws://.\n        You can use ./web/generate-certificate.py\n        to generate the certificate `cert.pem`.\n        Note ./web/generate-certificate.py will generate three files but you\n        only need to pass the generated cert.pem to this option.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--doc-root\",\n        type=str,\n        default=\"./python-api-examples/web\",\n        help=\"Path to the web root\",\n    )\n\n    return parser.parse_args()\n\n\ndef create_recognizer(args) -> sherpa_onnx.OnlineRecognizer:\n    if args.encoder:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n            tokens=args.tokens,\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            max_active_paths=args.num_active_paths,\n            hotwords_score=args.hotwords_score,\n            hotwords_file=args.hotwords_file,\n            blank_penalty=args.blank_penalty,\n            enable_endpoint_detection=args.use_endpoint != 0,\n            rule1_min_trailing_silence=args.rule1_min_trailing_silence,\n            rule2_min_trailing_silence=args.rule2_min_trailing_silence,\n            rule3_min_utterance_length=args.rule3_min_utterance_length,\n            provider=args.provider,\n            modeling_unit=args.modeling_unit,\n            bpe_vocab=args.bpe_vocab\n        )\n    elif args.paraformer_encoder:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_paraformer(\n            tokens=args.tokens,\n            encoder=args.paraformer_encoder,\n            decoder=args.paraformer_decoder,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            
feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            enable_endpoint_detection=args.use_endpoint != 0,\n            rule1_min_trailing_silence=args.rule1_min_trailing_silence,\n            rule2_min_trailing_silence=args.rule2_min_trailing_silence,\n            rule3_min_utterance_length=args.rule3_min_utterance_length,\n            provider=args.provider,\n        )\n    elif args.zipformer2_ctc:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_zipformer2_ctc(\n            tokens=args.tokens,\n            model=args.zipformer2_ctc,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            enable_endpoint_detection=args.use_endpoint != 0,\n            rule1_min_trailing_silence=args.rule1_min_trailing_silence,\n            rule2_min_trailing_silence=args.rule2_min_trailing_silence,\n            rule3_min_utterance_length=args.rule3_min_utterance_length,\n            provider=args.provider,\n        )\n    elif args.wenet_ctc:\n        recognizer = sherpa_onnx.OnlineRecognizer.from_wenet_ctc(\n            tokens=args.tokens,\n            model=args.wenet_ctc,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            enable_endpoint_detection=args.use_endpoint != 0,\n            rule1_min_trailing_silence=args.rule1_min_trailing_silence,\n            rule2_min_trailing_silence=args.rule2_min_trailing_silence,\n            rule3_min_utterance_length=args.rule3_min_utterance_length,\n            provider=args.provider,\n        )\n    else:\n        raise ValueError(\"Please provide a model\")\n\n    return recognizer\n\n\ndef format_timestamps(timestamps: List[float]) -> List[str]:\n    return [\"{:.3f}\".format(t) for t in timestamps]\n\n\nclass 
StreamingServer(object):\n    def __init__(\n        self,\n        recognizer: sherpa_onnx.OnlineRecognizer,\n        nn_pool_size: int,\n        max_wait_ms: float,\n        max_batch_size: int,\n        max_message_size: int,\n        max_queue_size: int,\n        max_active_connections: int,\n        doc_root: str,\n        certificate: Optional[str] = None,\n    ):\n        \"\"\"\n        Args:\n          recognizer:\n            An instance of online recognizer.\n          nn_pool_size:\n            Number of threads for the thread pool that is responsible for\n            neural network computation and decoding.\n          max_wait_ms:\n            Max wait time in milliseconds in order to build a batch of\n            `max_batch_size`.\n          max_batch_size:\n            Max batch size for inference.\n          max_message_size:\n            Max size in bytes per message.\n          max_queue_size:\n            Max number of messages in the queue for each connection.\n          max_active_connections:\n            Max number of active connections. Once the number of active clients\n            equals this limit, the server refuses to accept new connections.\n          doc_root:\n            Path to the directory where files like index.html for the HTTP\n            server are located.\n          certificate:\n            Optional. 
If not None, it will use secure websocket.\n            You can use ./web/generate-certificate.py to generate\n            it (the default generated filename is `cert.pem`).\n        \"\"\"\n        self.recognizer = recognizer\n\n        self.certificate = certificate\n        self.http_server = HttpServer(doc_root)\n\n        self.nn_pool_size = nn_pool_size\n        self.nn_pool = ThreadPoolExecutor(\n            max_workers=nn_pool_size,\n            thread_name_prefix=\"nn\",\n        )\n\n        self.stream_queue = asyncio.Queue()\n\n        self.max_wait_ms = max_wait_ms\n        self.max_batch_size = max_batch_size\n        self.max_message_size = max_message_size\n        self.max_queue_size = max_queue_size\n        self.max_active_connections = max_active_connections\n\n        self.current_active_connections = 0\n\n        self.sample_rate = int(recognizer.config.feat_config.sampling_rate)\n\n    async def stream_consumer_task(self):\n        \"\"\"This function extracts streams from the queue, batches them up, sends\n        them to the neural network model for computation and decoding.\n        \"\"\"\n        while True:\n            if self.stream_queue.empty():\n                await asyncio.sleep(self.max_wait_ms / 1000)\n                continue\n\n            batch = []\n            try:\n                while len(batch) < self.max_batch_size:\n                    item = self.stream_queue.get_nowait()\n\n                    assert self.recognizer.is_ready(item[0])\n\n                    batch.append(item)\n            except asyncio.QueueEmpty:\n                pass\n            stream_list = [b[0] for b in batch]\n            future_list = [b[1] for b in batch]\n\n            loop = asyncio.get_running_loop()\n            await loop.run_in_executor(\n                self.nn_pool,\n                self.recognizer.decode_streams,\n                stream_list,\n            )\n\n            for f in future_list:\n                
self.stream_queue.task_done()\n                f.set_result(None)\n\n    async def compute_and_decode(\n        self,\n        stream: sherpa_onnx.OnlineStream,\n    ) -> None:\n        \"\"\"Put the stream into the queue and wait for it to be processed by the\n        consumer task.\n\n        Args:\n          stream:\n            The stream to be processed. Note: It is changed in-place.\n        \"\"\"\n        loop = asyncio.get_running_loop()\n        future = loop.create_future()\n        await self.stream_queue.put((stream, future))\n        await future\n\n    async def process_request(\n        self,\n        path: str,\n        request_headers: websockets.Headers,\n    ) -> Optional[Tuple[http.HTTPStatus, websockets.Headers, bytes]]:\n        if \"sec-websocket-key\" not in (\n            request_headers.headers  # For new request_headers\n            if hasattr(request_headers, \"headers\")\n            else request_headers  # For old request_headers\n        ):\n            # This is a normal HTTP request\n            if path == \"/\":\n                path = \"/index.html\"\n\n            if path in (\"/upload.html\", \"/offline_record.html\"):\n                response = r\"\"\"\n<!doctype html><html><head>\n<title>Speech recognition with next-gen Kaldi</title></head><body>\n<h2>Only /streaming_record.html is available for the streaming server.</h2>\n<br/>\n<br/>\nGo back to <a href=\"/streaming_record.html\">/streaming_record.html</a>\n</body></html>\n\"\"\"\n                found = True\n                mime_type = \"text/html\"\n            else:\n                found, response, mime_type = self.http_server.process_request(path)\n\n            if isinstance(response, str):\n                response = response.encode(\"utf-8\")\n\n            if not found:\n                status = http.HTTPStatus.NOT_FOUND\n            else:\n                status = http.HTTPStatus.OK\n            header = {\"Content-Type\": mime_type}\n            return status, 
header, response\n\n        if self.current_active_connections < self.max_active_connections:\n            self.current_active_connections += 1\n            return None\n\n        # Refuse new connections\n        status = http.HTTPStatus.SERVICE_UNAVAILABLE  # 503\n        header = {\"Hint\": \"The server is overloaded. Please retry later.\"}\n        response = b\"The server is busy. Please retry later.\"\n\n        return status, header, response\n\n    async def run(self, port: int):\n        tasks = []\n        for i in range(self.nn_pool_size):\n            tasks.append(asyncio.create_task(self.stream_consumer_task()))\n\n        if self.certificate:\n            logging.info(f\"Using certificate: {self.certificate}\")\n            ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)\n            ssl_context.load_cert_chain(self.certificate)\n        else:\n            ssl_context = None\n            logging.info(\"No certificate provided\")\n\n        async with websockets.serve(\n            self.handle_connection,\n            host=\"\",\n            port=port,\n            max_size=self.max_message_size,\n            max_queue=self.max_queue_size,\n            process_request=self.process_request,\n            ssl=ssl_context,\n        ):\n            ip_list = [\"localhost\"]\n            if ssl_context:\n                ip_list += [\"0.0.0.0\", \"127.0.0.1\"]\n                ip_list.append(socket.gethostbyname(socket.gethostname()))\n            proto = \"http://\" if ssl_context is None else \"https://\"\n            s = \"Please visit one of the following addresses:\\n\\n\"\n            for p in ip_list:\n                s += \"  \" + proto + p + f\":{port}\" \"\\n\"\n\n            if not ssl_context:\n                s += \"\\nSince you are not providing a certificate, you cannot \"\n                s += \"use your microphone from within the browser using \"\n                s += \"public IP addresses. 
Only localhost can be used. \"\n                s += \"You also cannot use 0.0.0.0 or 127.0.0.1\"\n\n            logging.info(s)\n\n            await asyncio.Future()  # run forever\n\n        await asyncio.gather(*tasks)  # not reachable\n\n    async def handle_connection(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process them, and send\n        decoding result back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        try:\n            await self.handle_connection_impl(socket)\n        except websockets.exceptions.ConnectionClosedError:\n            logging.info(f\"{socket.remote_address} disconnected\")\n        finally:\n            # Decrement so that it can accept new connections\n            self.current_active_connections -= 1\n\n            logging.info(\n                f\"Disconnected: {socket.remote_address}. \"\n                f\"Number of connections: {self.current_active_connections}/{self.max_active_connections}\"  # noqa\n            )\n\n    async def handle_connection_impl(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process them, and send\n        decoding result back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        logging.info(\n            f\"Connected: {socket.remote_address}. 
\"\n            f\"Number of connections: {self.current_active_connections}/{self.max_active_connections}\"  # noqa\n        )\n\n        stream = self.recognizer.create_stream()\n        segment = 0\n\n        while True:\n            samples = await self.recv_audio_samples(socket)\n            if samples is None:\n                break\n\n            # TODO(fangjun): At present, we assume the sampling rate\n            # of the received audio samples equal to --sample-rate\n            stream.accept_waveform(sample_rate=self.sample_rate, waveform=samples)\n\n            while self.recognizer.is_ready(stream):\n                await self.compute_and_decode(stream)\n                result = self.recognizer.get_result(stream)\n\n                message = {\n                    \"text\": result,\n                    \"segment\": segment,\n                }\n                if self.recognizer.is_endpoint(stream):\n                    self.recognizer.reset(stream)\n                    segment += 1\n\n                await socket.send(json.dumps(message))\n\n        tail_padding = np.zeros(int(self.sample_rate * 0.3)).astype(np.float32)\n        stream.accept_waveform(sample_rate=self.sample_rate, waveform=tail_padding)\n        stream.input_finished()\n        while self.recognizer.is_ready(stream):\n            await self.compute_and_decode(stream)\n\n        result = self.recognizer.get_result(stream)\n\n        message = {\n            \"text\": result,\n            \"segment\": segment,\n        }\n\n        await socket.send(json.dumps(message))\n\n    async def recv_audio_samples(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ) -> Optional[np.ndarray]:\n        \"\"\"Receive a tensor from the client.\n\n        Each message contains either a bytes buffer containing audio samples\n        in 16 kHz or contains \"Done\" meaning the end of utterance.\n\n        Args:\n          socket:\n            The socket for communicating with the 
client.\n        Returns:\n          Return a 1-D np.float32 tensor containing the audio samples or\n          return None.\n        \"\"\"\n        message = await socket.recv()\n        if message == \"Done\":\n            return None\n\n        return np.frombuffer(message, dtype=np.float32)\n\n\ndef check_args(args):\n    if args.encoder:\n        assert Path(args.encoder).is_file(), f\"{args.encoder} does not exist\"\n\n        assert Path(args.decoder).is_file(), f\"{args.decoder} does not exist\"\n\n        assert Path(args.joiner).is_file(), f\"{args.joiner} does not exist\"\n\n        assert args.paraformer_encoder is None, args.paraformer_encoder\n        assert args.paraformer_decoder is None, args.paraformer_decoder\n        assert args.zipformer2_ctc is None, args.zipformer2_ctc\n        assert args.wenet_ctc is None, args.wenet_ctc\n    elif args.paraformer_encoder:\n        assert Path(\n            args.paraformer_encoder\n        ).is_file(), f\"{args.paraformer_encoder} does not exist\"\n\n        assert Path(\n            args.paraformer_decoder\n        ).is_file(), f\"{args.paraformer_decoder} does not exist\"\n    elif args.zipformer2_ctc:\n        assert Path(\n            args.zipformer2_ctc\n        ).is_file(), f\"{args.zipformer2_ctc} does not exist\"\n    elif args.wenet_ctc:\n        assert Path(args.wenet_ctc).is_file(), f\"{args.wenet_ctc} does not exist\"\n    else:\n        raise ValueError(\"Please provide a model\")\n\n    if not Path(args.tokens).is_file():\n        raise ValueError(f\"{args.tokens} does not exist\")\n\n    if args.decoding_method not in (\n        \"greedy_search\",\n        \"modified_beam_search\",\n    ):\n        raise ValueError(f\"Unsupported decoding method {args.decoding_method}\")\n\n    if args.decoding_method == \"modified_beam_search\":\n        assert args.num_active_paths > 0, args.num_active_paths\n\n\ndef main():\n    args = get_args()\n    logging.info(vars(args))\n    check_args(args)\n\n    
recognizer = create_recognizer(args)\n\n    port = args.port\n    nn_pool_size = args.nn_pool_size\n    max_batch_size = args.max_batch_size\n    max_wait_ms = args.max_wait_ms\n    max_message_size = args.max_message_size\n    max_queue_size = args.max_queue_size\n    max_active_connections = args.max_active_connections\n    certificate = args.certificate\n    doc_root = args.doc_root\n\n    if certificate and not Path(certificate).is_file():\n        raise ValueError(f\"{certificate} does not exist\")\n\n    if not Path(doc_root).is_dir():\n        raise ValueError(f\"Directory {doc_root} does not exist\")\n\n    server = StreamingServer(\n        recognizer=recognizer,\n        nn_pool_size=nn_pool_size,\n        max_batch_size=max_batch_size,\n        max_wait_ms=max_wait_ms,\n        max_message_size=max_message_size,\n        max_queue_size=max_queue_size,\n        max_active_connections=max_active_connections,\n        certificate=certificate,\n        doc_root=doc_root,\n    )\n    asyncio.run(server.run(port))\n\n\nif __name__ == \"__main__\":\n    log_filename = \"log/log-streaming-server\"\n    setup_logger(log_filename)\n    main()\n"
  },
  {
    "path": "python-api-examples/supertonic-tts.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API\nfor SupertonicTTS.\n\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nrm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\npython3 ./supertonic-tts.py\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\nfor details.\n\n\"\"\"\n\nimport time\n\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_tts():\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            supertonic=sherpa_onnx.OfflineTtsSupertonicModelConfig(\n                duration_predictor=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx\",\n                text_encoder=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\",\n                vector_estimator=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\",\n                vocoder=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\",\n                tts_json=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\",\n                unicode_indexer=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\",\n                voice_style=\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\",\n            ),\n            debug=False,\n            num_threads=2,\n            provider=\"cpu\",\n        )\n    )\n    if not tts_config.validate():\n        raise ValueError(\n            \"Please read the previous error messages and re-check your config\"\n        )\n\n    return sherpa_onnx.OfflineTts(tts_config)\n\n\ndef main():\n    tts = create_tts()\n\n    text = \"Today as always, men 
fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be, a statesman, a businessman, an official, or a scholar.\"\n\n    gen_config = sherpa_onnx.GenerationConfig()\n\n    # This model has 10 speakers. Valid sid: 0-9\n    gen_config.sid = 6\n    gen_config.num_steps = 5\n    gen_config.speed = 1.25  # larger -> faster\n\n    # We use en for English.\n    # You can also use es, pt, fr, ko.\n    # This single model supports 5 languages.\n    gen_config.extra[\"lang\"] = \"en\"\n\n    start = time.time()\n    audio = tts.generate(text, gen_config)\n    end = time.time()\n\n    if len(audio.samples) == 0:\n        print(\"Error in generating audios. Please read previous error messages.\")\n        return\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = \"./supertonic-en.wav\"\n    sf.write(\n        output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\"Saved to {output_filename}\")\n    print(f\"The text is '{text}'\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/test-sentence-piece-tokenizer.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nPlease download test files\n - vocab.json\n - token_scores.json\nfrom\nhttps://huggingface.co/csukuangfj/sherpa-onnx-test-data/tree/main\n\nThey are generated by ../scripts/pocket-tts/convert_tokenizer.py\nusing the BPE model from\nhttps://huggingface.co/KevinAHM/pocket-tts-onnx/blob/main/tokenizer.model\n\nSee also ../scripts/pocket-tts/test_tokenizer.py\n\"\"\"\n\nfrom pathlib import Path\n\nimport sherpa_onnx\n\n\ndef main():\n    vocab_json = \"./vocab.json\"\n    token_scores_json = \"./token_scores.json\"\n\n    if not Path(vocab_json).is_file() or not Path(token_scores_json).is_file():\n        print(\"Please download test files first\")\n        return\n\n    sp = sherpa_onnx.SentencePieceTokenizer(\n        vocab_json=vocab_json,\n        token_scores_json=token_scores_json,\n    )\n\n    text = \"Yesterday, I bought 3 apples, 2 bananas, and a dozen oranges. Wow! That's amazing—did you see it too? I can't believe it's already 10:30 p.m.\"\n\n    ids = sp.encode(text, out_type=int)\n    tokens = sp.encode(text, out_type=str)\n    print(text)\n    print(tokens)\n    print(ids)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/test-whisper-timestamps.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Posit Software, PBC\n\"\"\"\nTest Whisper timestamps functionality.\n\nThis script tests token-level timestamps using cross-attention DTW alignment.\nNote: Requires models exported with attention outputs.\n\nUsage:\n  # Test without timestamps (default)\n  python test-whisper-timestamps.py \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --audio=/path/to/test.wav\n\n  # Test with timestamps (requires attention-enabled model)\n  python test-whisper-timestamps.py \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --audio=/path/to/test.wav \\\n    --enable-token-timestamps\n\n  # Test with CUDA GPU acceleration\n  python test-whisper-timestamps.py \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --audio=/path/to/test.wav \\\n    --enable-token-timestamps \\\n    --provider=cuda\n\"\"\"\n\nimport argparse\nimport wave\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Read a wave file and return samples as float32 array.\n\n    Args:\n      wave_filename: Path to a wave file. 
Should be single channel, 16-bit.\n\n    Returns:\n      Tuple of (samples as float32 array normalized to [-1, 1], sample_rate)\n    \"\"\"\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # 16-bit\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef test_without_timestamps(args, samples, sample_rate):\n    \"\"\"Test recognition without timestamps.\"\"\"\n    print(\"=\" * 60)\n    print(\"Testing Without Timestamps\")\n    print(\"=\" * 60)\n\n    recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n        encoder=args.encoder,\n        decoder=args.decoder,\n        tokens=args.tokens,\n        enable_token_timestamps=False,\n        provider=args.provider,\n    )\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n    recognizer.decode_stream(stream)\n    result = stream.result\n\n    print(f\"\\nText: {result.text}\")\n    print(f\"Tokens: {result.tokens}\")\n    print(f\"Timestamps: {result.timestamps}\")\n\n    assert len(result.timestamps) == 0, \"Should have no timestamps\"\n\n    print(\"\\nTest without timestamps PASSED!\")\n\n\ndef test_with_timestamps(args, samples, sample_rate, audio_duration, enable_segment_timestamps=False):\n    \"\"\"Test token-level timestamps using cross-attention DTW.\"\"\"\n    print(\"\\n\" + \"=\" * 60)\n    if enable_segment_timestamps:\n        print(\"Testing With Both Token and Segment Timestamps\")\n    else:\n        print(\"Testing With Token Timestamps (cross-attention DTW)\")\n    print(\"=\" * 60)\n\n    recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n        encoder=args.encoder,\n        
decoder=args.decoder,\n        tokens=args.tokens,\n        enable_token_timestamps=True,\n        enable_segment_timestamps=enable_segment_timestamps,\n        provider=args.provider,\n    )\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n    recognizer.decode_stream(stream)\n    result = stream.result\n\n    print(f\"\\nText: {result.text}\")\n    print(f\"Language: {result.lang}\")\n\n    # Check token-level timestamps\n    print(f\"\\nToken timestamps count: {len(result.timestamps)}\")\n    assert len(result.timestamps) == len(result.tokens), (\n        f\"Timestamps count ({len(result.timestamps)}) != \"\n        f\"tokens count ({len(result.tokens)})\"\n    )\n\n    print(\"\\n--- Token-Level Timestamps ---\")\n    timestamps = result.timestamps\n    durations = result.durations\n    tokens = result.tokens\n\n    assert len(durations) == len(tokens), (\n        f\"Durations count ({len(durations)}) != tokens count ({len(tokens)})\"\n    )\n\n    for token, ts, dur in zip(tokens, timestamps, durations):\n        end_ts = ts + dur\n        print(f\"  [{ts:.2f}s - {end_ts:.2f}s] ({dur:.2f}s): {repr(token)}\")\n\n    # Check monotonicity\n    for i in range(1, len(result.timestamps)):\n        assert result.timestamps[i] >= result.timestamps[i - 1], (\n            f\"Timestamps not monotonic at index {i}: \"\n            f\"{result.timestamps[i - 1]} > {result.timestamps[i]}\"\n        )\n\n    # Check range: timestamps bounded by actual audio duration (or 30s if truncated)\n    max_timestamp = min(audio_duration, 30.0)\n    for ts in result.timestamps:\n        assert 0.0 <= ts <= max_timestamp, f\"Timestamp out of range: {ts}\"\n\n    # Note: Word-level timestamps can be derived from token-level data client-side\n    # by grouping tokens that start with a space character into words--or, in the\n    # case of non-space-delimited languages like Chinese, Japanese, etc., treat\n    # each unicode character as a 
word.\n\n    # Check segment timestamps if enabled\n    if enable_segment_timestamps:\n        print(\"\\n--- Segment-Level Timestamps ---\")\n        seg_timestamps = result.segment_timestamps\n        seg_durations = result.segment_durations\n        seg_texts = result.segment_texts\n\n        assert len(seg_timestamps) == len(seg_durations) == len(seg_texts), (\n            f\"Segment vectors have different lengths: \"\n            f\"timestamps={len(seg_timestamps)}, durations={len(seg_durations)}, \"\n            f\"texts={len(seg_texts)}\"\n        )\n\n        for i, (ts, dur, text) in enumerate(\n            zip(seg_timestamps, seg_durations, seg_texts)\n        ):\n            end_ts = ts + dur\n            print(f\"  Segment {i}: [{ts:.2f}s - {end_ts:.2f}s] ({dur:.2f}s)\")\n            print(f\"    Text: {repr(text)}\")\n\n    print(\"\\nTest with timestamps PASSED!\")\n    return True\n\n\ndef main():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--encoder\", required=True, help=\"Path to encoder.onnx\")\n    parser.add_argument(\"--decoder\", required=True, help=\"Path to decoder.onnx\")\n    parser.add_argument(\"--tokens\", required=True, help=\"Path to tokens.txt\")\n    parser.add_argument(\"--audio\", required=True, help=\"Path to audio file (wav)\")\n    parser.add_argument(\n        \"--enable-token-timestamps\",\n        action=\"store_true\",\n        help=\"Enable token-level timestamps (requires attention-enabled model)\",\n    )\n    parser.add_argument(\n        \"--enable-segment-timestamps\",\n        action=\"store_true\",\n        help=\"Enable segment-level timestamps using timestamp tokens\",\n    )\n    parser.add_argument(\n        \"--provider\",\n        default=\"cpu\",\n        help=\"Execution provider: cpu, cuda, coreml, etc. 
(default: cpu)\",\n    )\n    args = parser.parse_args()\n\n    # Handle --enable-segment-timestamps dependency on --enable-token-timestamps\n    if args.enable_segment_timestamps and not args.enable_token_timestamps:\n        parser.error(\n            \"--enable-segment-timestamps requires --enable-token-timestamps to be set\"\n        )\n\n    # Read audio\n    samples, sample_rate = read_wave(args.audio)\n    print(f\"Loaded audio: {len(samples)} samples at {sample_rate} Hz\")\n    print(f\"Duration: {len(samples) / sample_rate:.2f} seconds\\n\")\n\n    # Test without timestamps\n    test_without_timestamps(args, samples, sample_rate)\n\n    # Test with timestamps if requested\n    audio_duration = len(samples) / sample_rate\n    if args.enable_token_timestamps:\n        test_with_timestamps(\n            args,\n            samples,\n            sample_rate,\n            audio_duration,\n            enable_segment_timestamps=args.enable_segment_timestamps,\n        )\n\n    print(\"\\n\" + \"=\" * 60)\n    print(\"All tests passed!\")\n    print(\"=\" * 60)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/two-pass-speech-recognition-from-microphone.py",
    "content": "#!/usr/bin/env python3\n\n# Two-pass real-time speech recognition from a microphone with sherpa-onnx\n# Python API.\n#\n# The first pass uses a streaming model, which has two purposes:\n#\n#  (1) Display a temporary result to users\n#\n#  (2) Endpointing\n#\n# The second pass uses a non-streaming model. It has a higher recognition\n# accuracy than the first pass model and its result is used as the final result.\n#\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n# to download pre-trained models\n\n\"\"\"\nUsage examples:\n\n(1) Chinese: Streaming zipformer (1st pass) + Non-streaming paraformer (2nd pass)\n\npython3 ./python-api-examples/two-pass-speech-recognition-from-microphone.py \\\n  --first-encoder ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/encoder-epoch-99-avg-1.onnx \\\n  --first-decoder ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/decoder-epoch-99-avg-1.onnx \\\n  --first-joiner ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/joiner-epoch-99-avg-1.onnx \\\n  --first-tokens ./sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23/tokens.txt \\\n  \\\n  --second-paraformer ./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx \\\n  --second-tokens ./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\n\n(2) English: Streaming zipformer (1st pass) + Non-streaming whisper (2nd pass)\n\npython3 ./python-api-examples/two-pass-speech-recognition-from-microphone.py \\\n  --first-encoder ./sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/encoder-epoch-99-avg-1.onnx \\\n  --first-decoder ./sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/decoder-epoch-99-avg-1.onnx \\\n  --first-joiner ./sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/joiner-epoch-99-avg-1.onnx \\\n  --first-tokens ./sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/tokens.txt \\\n  \\\n  --second-whisper-encoder ./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \\\n  --second-whisper-decoder 
./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \\\n  --second-tokens ./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\n\"\"\"\n\nimport argparse\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef assert_file_exists(filename: str, message: str):\n    if not filename:\n        raise ValueError(f\"Please specify {message}\")\n\n    if not Path(filename).is_file():\n        raise ValueError(f\"{message} {filename} does not exist\")\n\n\ndef add_first_pass_streaming_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--first-tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt for the first pass\",\n    )\n\n    parser.add_argument(\n        \"--first-encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder model for the first pass\",\n    )\n\n    parser.add_argument(\n        \"--first-decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder model for the first pass\",\n    )\n\n    parser.add_argument(\n        \"--first-joiner\",\n        type=str,\n        help=\"Path to the joiner model for the first pass\",\n    )\n\n    parser.add_argument(\n        \"--first-decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Decoding method for the first pass. 
Valid values are\n        greedy_search and modified_beam_search\"\"\",\n    )\n\n    parser.add_argument(\n        \"--first-max-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --first-decoding-method is modified_beam_search.\n        It specifies number of active paths to keep during decoding.\n        \"\"\",\n    )\n\n\ndef add_second_pass_transducer_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--second-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model for the second pass\",\n    )\n\n    parser.add_argument(\n        \"--second-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model for the second pass\",\n    )\n\n    parser.add_argument(\n        \"--second-joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model for the second pass\",\n    )\n\n\ndef add_second_pass_paraformer_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--second-paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx for Paraformer for the second pass\",\n    )\n\n\ndef add_second_pass_nemo_ctc_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--second-nemo-ctc\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx for NeMo CTC for the second pass\",\n    )\n\n\ndef add_second_pass_whisper_model_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--second-whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model for the second pass\",\n    )\n\n    parser.add_argument(\n        \"--second-whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model for the second pass\",\n    )\n\n    parser.add_argument(\n     
   \"--second-whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the input audio file.\n        Example values: en, fr, de, zh, jp.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--second-whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--second-whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n\ndef add_second_pass_non_streaming_model_args(parser: argparse.ArgumentParser):\n    add_second_pass_transducer_model_args(parser)\n    add_second_pass_nemo_ctc_model_args(parser)\n    add_second_pass_paraformer_model_args(parser)\n    add_second_pass_whisper_model_args(parser)\n\n    parser.add_argument(\n        \"--second-tokens\",\n        type=str,\n        help=\"Path to tokens.txt for the second pass\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n    add_first_pass_streaming_model_args(parser)\n    add_second_pass_non_streaming_model_args(parser)\n\n    return parser.parse_args()\n\n\ndef 
check_first_pass_args(args):\n    assert_file_exists(args.first_tokens, \"--first-tokens\")\n    assert_file_exists(args.first_encoder, \"--first-encoder\")\n    assert_file_exists(args.first_decoder, \"--first-decoder\")\n    assert_file_exists(args.first_joiner, \"--first-joiner\")\n\n\ndef check_second_pass_args(args):\n    assert_file_exists(args.second_tokens, \"--second-tokens\")\n\n    if args.second_encoder:\n        assert_file_exists(args.second_encoder, \"--second-encoder\")\n        assert_file_exists(args.second_decoder, \"--second-decoder\")\n        assert_file_exists(args.second_joiner, \"--second-joiner\")\n    elif args.second_paraformer:\n        assert_file_exists(args.second_paraformer, \"--second-paraformer\")\n    elif args.second_nemo_ctc:\n        assert_file_exists(args.second_nemo_ctc, \"--second-nemo-ctc\")\n    elif args.second_whisper_encoder:\n        assert_file_exists(args.second_whisper_encoder, \"--second-whisper-encoder\")\n        assert_file_exists(args.second_whisper_decoder, \"--second-whisper-decoder\")\n    else:\n        raise ValueError(\"Please specify the model for the second pass\")\n\n\ndef create_first_pass_recognizer(args):\n    # Please replace the model files if needed.\n    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n    # for download links.\n    recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n        tokens=args.first_tokens,\n        encoder=args.first_encoder,\n        decoder=args.first_decoder,\n        joiner=args.first_joiner,\n        num_threads=1,\n        sample_rate=16000,\n        feature_dim=80,\n        decoding_method=args.first_decoding_method,\n        max_active_paths=args.first_max_active_paths,\n        provider=args.provider,\n        enable_endpoint_detection=True,\n        rule1_min_trailing_silence=2.4,\n        rule2_min_trailing_silence=1.2,\n        rule3_min_utterance_length=20,\n    )\n    return recognizer\n\n\ndef 
create_second_pass_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.second_encoder:\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.second_encoder,\n            decoder=args.second_decoder,\n            joiner=args.second_joiner,\n            tokens=args.second_tokens,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n            max_active_paths=4,\n        )\n    elif args.second_paraformer:\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.second_paraformer,\n            tokens=args.second_tokens,\n            num_threads=1,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n        )\n    elif args.second_nemo_ctc:\n        recognizer = sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n            model=args.second_nemo_ctc,\n            tokens=args.second_tokens,\n            num_threads=1,\n            sample_rate=16000,\n            feature_dim=80,\n            decoding_method=\"greedy_search\",\n        )\n    elif args.second_whisper_encoder:\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.second_whisper_encoder,\n            decoder=args.second_whisper_decoder,\n            tokens=args.second_tokens,\n            num_threads=1,\n            decoding_method=\"greedy_search\",\n            language=args.second_whisper_language,\n            task=args.second_whisper_task,\n            tail_paddings=args.second_whisper_tail_paddings,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model for the second pass\")\n\n    return recognizer\n\n\ndef run_second_pass(\n    recognizer: sherpa_onnx.OfflineRecognizer,\n    samples: np.ndarray,\n    sample_rate: int,\n):\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n\n    
recognizer.decode_stream(stream)\n\n    return stream.result.text\n\n\ndef main():\n    args = get_args()\n    check_first_pass_args(args)\n    check_second_pass_args(args)\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n\n    # If you want to select a different input device, please use\n    # sd.default.device[0] = xxx\n    # where xxx is the device number\n\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    print(\"Creating recognizers. Please wait...\")\n    first_recognizer = create_first_pass_recognizer(args)\n    second_recognizer = create_second_pass_recognizer(args)\n\n    print(\"Started! Please speak\")\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n    stream = first_recognizer.create_stream()\n\n    display = sherpa_onnx.Display()\n\n    sample_buffers = []\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n            stream.accept_waveform(sample_rate, samples)\n\n            sample_buffers.append(samples)\n\n            while first_recognizer.is_ready(stream):\n                first_recognizer.decode_stream(stream)\n\n            is_endpoint = first_recognizer.is_endpoint(stream)\n\n            result = first_recognizer.get_result(stream)\n            result = result.lower().strip()\n\n            display.update_text(result)\n            display.display()\n\n            if is_endpoint:\n                if result:\n                    samples = np.concatenate(sample_buffers)\n                    # There are internal sample buffers inside the streaming\n                    # feature extractor, so we cannot send all samples to\n                
    # the 2nd pass. Here 8000 is just an empirical value\n                    # that should work for most streaming models in sherpa-onnx\n                    sample_buffers = [samples[-8000:]]\n                    samples = samples[:-8000]\n                    result = run_second_pass(\n                        recognizer=second_recognizer,\n                        samples=samples,\n                        sample_rate=sample_rate,\n                    )\n                    result = result.lower().strip()\n                    display.update_text(result)\n                    display.finalize_current_sentence()\n                    display.display()\n                else:\n                    sample_buffers = []\n\n                first_recognizer.reset(stream)\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/two-pass-wss.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c) 2025 Minghu Wang\n\"\"\"\n\nA two-pass streaming ASR server with WebSocket support. This server implements\na two-pass recognition strategy where the first pass uses a fast streaming model\nfor real-time recognition, and the second pass uses a more accurate offline model\nto refine the results.\n\nThe first pass provides immediate feedback to users, while the second pass\nimproves accuracy by re-processing the complete utterance with a more powerful\nmodel.\n\nIt supports multiple clients sending audio simultaneously and provides\nreal-time transcription results.\n\nUsage:\n    ./two-pass-wss.py --help\n\nExample:\n\n(1) Without a certificate\n\npython3 ./python-api-examples/two-pass-wss.py \\\n  --paraformer-encoder ./sherpa-onnx-paraformer-zh-2023-09-18/encoder.onnx \\\n  --paraformer-decoder ./sherpa-onnx-paraformer-zh-2023-09-18/decoder.onnx \\\n  --tokens ./sherpa-onnx-paraformer-zh-2023-09-18/tokens.txt \\\n  --second-sense-voice ./sherpa-onnx-sense-voice-zh-2023-09-18/model.onnx \\\n  --second-tokens ./sherpa-onnx-sense-voice-zh-2023-09-18/tokens.txt\n\n(2) With a certificate\n\n(a) Generate a certificate first:\n\n    cd python-api-examples/web\n    ./generate-certificate.py\n    cd ../..\n\n(b) Start the server\n\npython3 ./python-api-examples/two-pass-wss.py \\\n  --paraformer-encoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.onnx \\\n  --paraformer-decoder ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.onnx \\\n  --tokens ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n  --second-sense-voice ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --second-tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --certificate ./python-api-examples/web/cert.pem\n\nPlease refer 
to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nto download pre-trained models.\n\"\"\"\n\nimport argparse\nimport asyncio\nimport http\nimport json\nimport logging\nimport socket\nimport ssl\nfrom concurrent.futures import ThreadPoolExecutor\nfrom datetime import datetime\nfrom pathlib import Path\nfrom typing import List, Optional, Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport websockets\n\ndef setup_logger(\n    log_filename: str,\n    log_level: str = \"info\",\n    use_console: bool = True,\n) -> None:\n    \"\"\"Setup log level.\n\n    Args:\n      log_filename:\n        The filename to save the log.\n      log_level:\n        The log level to use, e.g., \"debug\", \"info\", \"warning\", \"error\",\n        \"critical\"\n      use_console:\n        True to also print logs to console.\n    \"\"\"\n    now = datetime.now()\n    date_time = now.strftime(\"%Y-%m-%d-%H-%M-%S\")\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n    log_filename = f\"{log_filename}-{date_time}.txt\"\n\n    Path(log_filename).parent.mkdir(parents=True, exist_ok=True)\n\n    level = logging.ERROR\n    if log_level == \"debug\":\n        level = logging.DEBUG\n    elif log_level == \"info\":\n        level = logging.INFO\n    elif log_level == \"warning\":\n        level = logging.WARNING\n    elif log_level == \"critical\":\n        level = logging.CRITICAL\n\n    logging.basicConfig(\n        filename=log_filename,\n        format=formatter,\n        level=level,\n        filemode=\"w\",\n    )\n    if use_console:\n        console = logging.StreamHandler()\n        console.setLevel(level)\n        console.setFormatter(logging.Formatter(formatter))\n        logging.getLogger(\"\").addHandler(console)\n\n\ndef add_model_args(parser: 
argparse.ArgumentParser):\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to the transducer decoder model.\",\n    )\n\n\n    parser.add_argument(\n        \"--second-tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to the second pass tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--second-sense-voice\",\n        type=str,\n        default=\"\",\n        help=\"Path to the second pass sense voice model.\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-encoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to the paraformer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer-decoder\",\n        type=str,\n        default=\"\",\n        help=\"Path to the paraformer decoder model.\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        default=\"\",\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"Sample rate of the data used to train the model. \"\n        \"Caution: If your input sound files have a different sampling rate, \"\n        \"we will do resampling inside\",\n    )\n\n    parser.add_argument(\n        \"--feat-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension of the model\",\n    )\n\n    parser.add_argument(\n        \"--provider\",\n        type=str,\n        default=\"cpu\",\n        help=\"Valid values: cpu, cuda, coreml\",\n    )\n\n\ndef add_decoding_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Decoding method to use. 
Currently supported methods are:\n        - greedy_search\n        - modified_beam_search\n        \"\"\",\n    )\n\n    add_modified_beam_search_args(parser)\n\n\ndef add_hotwords_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--hotwords-file\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The file containing hotwords, one word or phrase per line. Within each\n        phrase, the bpe/cjkchar units are separated by spaces. For example:\n\n        ▁HE LL O ▁WORLD\n        你 好 世 界\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--hotwords-score\",\n        type=float,\n        default=1.5,\n        help=\"\"\"\n        The hotword score of each token for biasing a word/phrase. Used only if\n        --hotwords-file is given.\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--modeling-unit\",\n        type=str,\n        default=\"cjkchar\",\n        help=\"\"\"\n        The modeling unit of the model in use. Currently supported units are:\n        - cjkchar (for Chinese)\n        - bpe (for English-like languages)\n        - cjkchar+bpe (for multilingual models)\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--bpe-vocab\",\n        type=str,\n        default=\"\",\n        help=\"\"\"\n        The bpe vocabulary generated by the sentencepiece toolkit. 
\n        It is only used when modeling-unit is bpe or cjkchar+bpe.\n        if you can't find bpe.vocab in the model directory, please run:\n        python script/export_bpe_vocab.py --bpe-model exp/bpe.model\n        \"\"\",\n    )\n\n\ndef add_modified_beam_search_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--num-active-paths\",\n        type=int,\n        default=4,\n        help=\"\"\"Used only when --decoding-method is modified_beam_search.\n        It specifies number of active paths to keep during decoding.\n        \"\"\",\n    )\n\ndef add_blank_penalty_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n        [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\ndef add_endpointing_args(parser: argparse.ArgumentParser):\n    parser.add_argument(\n        \"--rule1-min-trailing-silence\",\n        type=float,\n        default=2.4,\n        help=\"\"\"This endpointing rule1 requires duration of trailing silence\n        in seconds) to be >= this value\"\"\",\n    )\n\n    parser.add_argument(\n        \"--rule2-min-trailing-silence\",\n        type=float,\n        default=1.2,\n        help=\"\"\"This endpointing rule2 requires duration of trailing silence in\n        seconds) to be >= this value.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--rule3-min-utterance-length\",\n        type=float,\n        default=20,\n        help=\"\"\"This endpointing rule3 requires utterance-length (in seconds)\n        to be >= this value.\"\"\",\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter,\n    )\n\n    
add_model_args(parser)\n    add_decoding_args(parser)\n    add_endpointing_args(parser)\n    add_hotwords_args(parser)\n    add_blank_penalty_args(parser)\n\n    parser.add_argument(\n        \"--port\",\n        type=int,\n        default=6006,\n        help=\"The server will listen on this port\",\n    )\n\n    parser.add_argument(\n        \"--nn-pool-size\",\n        type=int,\n        default=1,\n        help=\"Number of threads for NN computation and decoding.\",\n    )\n\n    parser.add_argument(\n        \"--max-batch-size\",\n        type=int,\n        default=3,\n        help=\"\"\"Max batch size for computation. Note: if there are not enough\n        requests in the queue, it waits for up to max_wait_ms milliseconds. After that,\n        even if there are not enough requests, it still sends the\n        available requests in the queue for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-wait-ms\",\n        type=float,\n        default=10,\n        help=\"\"\"Max time in milliseconds to wait to build batches for inference.\n        If there are not enough requests in the stream queue to build a batch\n        of max_batch_size, it waits up to this time before fetching available\n        requests for computation.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-message-size\",\n        type=int,\n        default=(1 << 20),\n        help=\"\"\"Max message size in bytes.\n        The max size per message cannot exceed this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--max-queue-size\",\n        type=int,\n        default=32,\n        help=\"Max number of messages in the queue for each connection.\",\n    )\n\n    parser.add_argument(\n        \"--max-active-connections\",\n        type=int,\n        default=200,\n        help=\"\"\"Maximum number of active connections. 
The server will refuse\n        to accept new connections once the current number of active connections\n        equals to this limit.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads to run the neural network model\",\n    )\n\n    parser.add_argument(\n        \"--second-pass-threads\",\n        type=int,\n        default=2,\n        help=\"Number of threads for second pass processing\",\n    )\n\n    parser.add_argument(\n        \"--certificate\",\n        type=str,\n        help=\"\"\"Path to the X.509 certificate. You need it only if you want to\n        use a secure websocket connection, i.e., use wss:// instead of ws://.\n        You can use ./web/generate-certificate.py\n        to generate the certificate `cert.pem`.\n        Note ./web/generate-certificate.py will generate three files but you\n        only need to pass the generated cert.pem to this option.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\ndef run_second_pass(\n    recognizer: sherpa_onnx.OfflineRecognizer,\n    samples: np.ndarray,\n    sample_rate: int,\n):\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n\n    recognizer.decode_stream(stream)\n\n    return stream.result.text\n\ndef create_first_pass_recognizer(args) -> sherpa_onnx.OnlineRecognizer:\n    recognizer = sherpa_onnx.OnlineRecognizer.from_paraformer(\n            tokens=args.tokens,\n            encoder=args.paraformer_encoder,\n            decoder=args.paraformer_decoder,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feat_dim,\n            decoding_method=args.decoding_method,\n            enable_endpoint_detection=True,\n            rule1_min_trailing_silence=args.rule1_min_trailing_silence,\n            rule2_min_trailing_silence=args.rule2_min_trailing_silence,\n            
rule3_min_utterance_length=args.rule3_min_utterance_length,\n            provider=args.provider,\n        )\n    return recognizer\n\n\ndef create_second_pass_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=args.second_sense_voice,\n            tokens=args.second_tokens,\n            num_threads=1,\n            sample_rate=16000,\n            feature_dim=80,\n            use_itn=True,\n            decoding_method=\"greedy_search\",\n        )\n    return recognizer\n\n\ndef format_timestamps(timestamps: List[float]) -> List[str]:\n    return [\"{:.3f}\".format(t) for t in timestamps]\n\n\nclass StreamingServer(object):\n    def __init__(\n        self,\n        first_pass_recognizer: sherpa_onnx.OnlineRecognizer,\n        second_pass_recognizer: sherpa_onnx.OfflineRecognizer,\n        nn_pool_size: int,\n        max_wait_ms: float,\n        max_batch_size: int,\n        max_message_size: int,\n        max_queue_size: int,\n        max_active_connections: int,\n        second_pass_threads: int = 2,\n        certificate: Optional[str] = None,\n    ):\n        \"\"\"\n        Args:\n          first_pass_recognizer:\n            An instance of online recognizer for first pass.\n          second_pass_recognizer:\n            An instance of offline recognizer for second pass.\n          nn_pool_size:\n            Number of threads for the thread pool that is responsible for\n            neural network computation and decoding.\n          max_wait_ms:\n            Max wait time in milliseconds in order to build a batch of\n            `batch_size`.\n          max_batch_size:\n            Max batch size for inference.\n          max_message_size:\n            Max size in bytes per message.\n          max_queue_size:\n            Max number of messages in the queue for each connection.\n          max_active_connections:\n            Max number of active connections. 
Once number of active client\n            equals to this limit, the server refuses to accept new connections.\n          certificate:\n            Optional. If not None, it will use secure websocket.\n            You can use ./web/generate-certificate.py to generate\n            it (the default generated filename is `cert.pem`).\n        \"\"\"\n        self.first_pass_recognizer = first_pass_recognizer\n        self.second_pass_recognizer = second_pass_recognizer\n\n        self.certificate = certificate\n\n        self.nn_pool_size = nn_pool_size\n        self.nn_pool = ThreadPoolExecutor(\n            max_workers=nn_pool_size,\n            thread_name_prefix=\"nn\",\n        )\n\n        self.second_pass_pool = ThreadPoolExecutor(\n            max_workers=second_pass_threads,\n            thread_name_prefix=\"second_pass\",\n        )\n\n        self.stream_queue = asyncio.Queue()\n\n        self.max_wait_ms = max_wait_ms\n        self.max_batch_size = max_batch_size\n        self.max_message_size = max_message_size\n        self.max_queue_size = max_queue_size\n        self.max_active_connections = max_active_connections\n\n        self.current_active_connections = 0\n\n        self.sample_rate = int(self.first_pass_recognizer.config.feat_config.sampling_rate)\n\n    async def stream_consumer_task(self):\n        \"\"\"This function extracts streams from the queue, batches them up, sends\n        them to the neural network model for computation and decoding.\n        \"\"\"\n        while True:\n            if self.stream_queue.empty():\n                await asyncio.sleep(self.max_wait_ms / 1000)\n                continue\n\n            batch = []\n            try:\n                while len(batch) < self.max_batch_size:\n                    item = self.stream_queue.get_nowait()\n\n                    assert self.first_pass_recognizer.is_ready(item[0])\n\n                    batch.append(item)\n            except asyncio.QueueEmpty:\n                pass\n     
       stream_list = [b[0] for b in batch]\n            future_list = [b[1] for b in batch]\n\n            loop = asyncio.get_running_loop()\n            await loop.run_in_executor(\n                self.nn_pool,\n                self.first_pass_recognizer.decode_streams,\n                stream_list,\n            )\n\n            for f in future_list:\n                self.stream_queue.task_done()\n                f.set_result(None)\n\n    async def compute_and_decode(\n        self,\n        stream: sherpa_onnx.OnlineStream,\n    ) -> None:\n        \"\"\"Put the stream into the queue and wait it to be processed by the\n        consumer task.\n\n        Args:\n          stream:\n            The stream to be processed. Note: It is changed in-place.\n        \"\"\"\n        loop = asyncio.get_running_loop()\n        future = loop.create_future()\n        await self.stream_queue.put((stream, future))\n        await future\n\n    async def run_second_pass_async(\n        self,\n        samples: np.ndarray,\n        sample_rate: int,\n    ) -> str:\n        \"\"\"Run second-pass recognition asynchronously to avoid blocking.\n\n        Args:\n          samples: Audio samples.\n          sample_rate: Sampling rate.\n\n        Returns:\n          Text result from the second-pass recognition.\n        \"\"\"\n        import time\n        start_time = time.time()\n        \n        loop = asyncio.get_running_loop()\n        result = await loop.run_in_executor(\n            self.second_pass_pool,\n            run_second_pass,\n            self.second_pass_recognizer,\n            samples,\n            sample_rate,\n        )\n        \n        end_time = time.time()\n        duration = end_time - start_time\n        logging.info(f\"Second pass processing completed in {duration:.3f}s for {len(samples)/sample_rate:.2f}s audio\")\n        \n        return result.lower().strip()\n\n    async def process_request(\n        self,\n        path: str,\n        request_headers: 
websockets.Headers,\n    ) -> Optional[Tuple[http.HTTPStatus, websockets.Headers, bytes]]:\n        if self.current_active_connections < self.max_active_connections:\n            self.current_active_connections += 1\n            return None\n\n        # Refuse new connections\n        status = http.HTTPStatus.SERVICE_UNAVAILABLE  # 503\n        header = {\"Hint\": \"The server is overloaded. Please retry later.\"}\n        response = b\"The server is busy. Please retry later.\"\n\n        return status, header, response\n\n    async def run(self, port: int):\n        tasks = []\n        for i in range(self.nn_pool_size):\n            tasks.append(asyncio.create_task(self.stream_consumer_task()))\n\n        if self.certificate:\n            logging.info(f\"Using certificate: {self.certificate}\")\n            ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)\n            ssl_context.load_cert_chain(self.certificate)\n        else:\n            ssl_context = None\n            logging.info(\"No certificate provided\")\n\n        try:\n            async with websockets.serve(\n                self.handle_connection,\n                host=\"\",\n                port=port,\n                max_size=self.max_message_size,\n                max_queue=self.max_queue_size,\n                process_request=self.process_request,\n                ssl=ssl_context,\n            ):\n                logging.info(f\"Started server on port {port}\")\n                await asyncio.Future()  # run forever\n        finally:\n            logging.info(\"Shutting down thread pools...\")\n            self.nn_pool.shutdown(wait=True)\n            self.second_pass_pool.shutdown(wait=True)\n            logging.info(\"Thread pools shut down successfully\")\n\n        await asyncio.gather(*tasks)  # not reachable\n\n    async def handle_connection(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process it, and 
send\n        decoding result back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        try:\n            await self.handle_connection_impl(socket)\n        except websockets.exceptions.ConnectionClosed:\n            logging.info(f\"{socket.remote_address} disconnected\")\n        finally:\n            # Decrement so that it can accept new connections\n            self.current_active_connections -= 1\n\n            logging.info(\n                f\"Disconnected: {socket.remote_address}. \"\n                f\"Number of connections: {self.current_active_connections}/{self.max_active_connections}\"  # noqa\n            )\n\n    async def handle_connection_impl(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ):\n        \"\"\"Receive audio samples from the client, process it, and send\n        decoding result back to the client.\n\n        Args:\n          socket:\n            The socket for communicating with the client.\n        \"\"\"\n        stream = self.first_pass_recognizer.create_stream()\n        segment = 0\n        sample_buffers = []\n        while True:\n            samples = await self.recv_audio_samples(socket)\n            if samples is None:\n                break\n            \n            # TODO(fangjun): At present, we assume the sampling rate\n            # of the received audio samples equal to --sample-rate\n            stream.accept_waveform(sample_rate=self.sample_rate, waveform=samples)\n            sample_buffers.append(samples)\n            while self.first_pass_recognizer.is_ready(stream):\n                await self.compute_and_decode(stream)\n                result = self.first_pass_recognizer.get_result(stream)\n\n                message = {\n                    \"text\": result,\n                    \"segment\": segment,\n                }\n                if self.first_pass_recognizer.is_endpoint(stream):\n                    
if result:\n                        samples_for_2nd_pass = np.concatenate(sample_buffers)\n                        sample_buffers = [samples_for_2nd_pass[-8000:]]\n                        samples_for_2nd_pass = samples_for_2nd_pass[:-8000]\n                        second_pass_result = (\n                            await self.run_second_pass_async(\n                                samples=samples_for_2nd_pass,\n                                sample_rate=self.sample_rate,\n                            )\n                        )\n\n                        if second_pass_result:\n                            message[\"text\"] = second_pass_result\n                            message[\"segment\"] = segment\n                    else:\n                        sample_buffers=[]\n\n                    self.first_pass_recognizer.reset(stream)\n                    segment += 1\n                await socket.send(json.dumps(message))\n\n        tail_padding = np.zeros(int(self.sample_rate * 0.3)).astype(np.float32)\n        stream.accept_waveform(sample_rate=self.sample_rate, waveform=tail_padding)\n        stream.input_finished()\n        while self.first_pass_recognizer.is_ready(stream):\n            await self.compute_and_decode(stream)\n\n        result = self.first_pass_recognizer.get_result(stream)\n\n        message = {\n            \"text\": result,\n            \"segment\": segment,\n        }\n        await socket.send(json.dumps(message))\n\n    async def recv_audio_samples(\n        self,\n        socket: websockets.WebSocketServerProtocol,\n    ) -> Optional[np.ndarray]:\n        \"\"\"Receive audio samples from WebSocket connection\n        \n        Args:\n          socket: WebSocket connection\n        \n        Returns:\n          Numpy array containing audio samples, or None indicating end of audio\n        \"\"\"\n        message = await socket.recv()\n        if message == \"Done\":\n            return None\n        return np.frombuffer(message, 
dtype=np.float32)\n\n\ndef check_args(args):\n    if args.encoder:\n        assert Path(args.encoder).is_file(), f\"{args.encoder} does not exist\"\n        assert Path(args.decoder).is_file(), f\"{args.decoder} does not exist\"\n        # These options default to an empty string, so test for emptiness, not None.\n        assert not args.paraformer_encoder, args.paraformer_encoder\n        assert not args.paraformer_decoder, args.paraformer_decoder\n\n    elif args.paraformer_encoder:\n        assert Path(\n            args.paraformer_encoder\n        ).is_file(), f\"{args.paraformer_encoder} does not exist\"\n\n        assert Path(\n            args.paraformer_decoder\n        ).is_file(), f\"{args.paraformer_decoder} does not exist\"\n    else:\n        raise ValueError(\"Please provide a model\")\n\n    if not Path(args.tokens).is_file():\n        raise ValueError(f\"{args.tokens} does not exist\")\n\n    if args.decoding_method not in (\n        \"greedy_search\",\n        \"modified_beam_search\",\n    ):\n        raise ValueError(f\"Unsupported decoding method {args.decoding_method}\")\n\n    if args.decoding_method == \"modified_beam_search\":\n        assert args.num_active_paths > 0, args.num_active_paths\n\n\ndef main():\n    args = get_args()\n    logging.info(vars(args))\n    check_args(args)\n\n    first_pass_recognizer = create_first_pass_recognizer(args)\n    second_pass_recognizer = create_second_pass_recognizer(args)\n\n    port = args.port\n    nn_pool_size = args.nn_pool_size\n    max_batch_size = args.max_batch_size\n    max_wait_ms = args.max_wait_ms\n    max_message_size = args.max_message_size\n    max_queue_size = args.max_queue_size\n    max_active_connections = args.max_active_connections\n    second_pass_threads = args.second_pass_threads\n    certificate = args.certificate\n    # doc_root = args.doc_root\n\n    if certificate and not Path(certificate).is_file():\n        raise ValueError(f\"{certificate} does not exist\")\n\n    server = StreamingServer(\n        
first_pass_recognizer=first_pass_recognizer,\n        second_pass_recognizer=second_pass_recognizer,\n        nn_pool_size=nn_pool_size,\n        max_batch_size=max_batch_size,\n        max_wait_ms=max_wait_ms,\n        max_message_size=max_message_size,\n        max_queue_size=max_queue_size,\n        max_active_connections=max_active_connections,\n        second_pass_threads=second_pass_threads,\n        certificate=certificate,\n        # doc_root=doc_root,\n    )\n    asyncio.run(server.run(port))\n\n\nif __name__ == \"__main__\":\n    log_filename = \"log/log-streaming-server\"\n    setup_logger(log_filename)\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-alsa.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script works only on Linux. It uses ALSA for recording.\n\"\"\"\n\nimport argparse\nfrom pathlib import Path\n\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--device-name\",\n        type=str,\n        required=True,\n        help=\"\"\"\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    if not Path(args.silero_vad_model).is_file():\n        raise RuntimeError(\n            f\"{args.silero_vad_model} does not exist. Please download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\"\n        )\n\n    device_name = args.device_name\n    print(f\"device_name: {device_name}\")\n    alsa = sherpa_onnx.Alsa(device_name)\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.sample_rate = sample_rate\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)\n\n    print(\"Started! Please speak. 
Press Ctrl C to exit\")\n\n    printed = False\n    k = 0\n    try:\n        while True:\n            samples = alsa.read(samples_per_read)  # a blocking read\n\n            vad.accept_waveform(samples)\n\n            if vad.is_speech_detected() and not printed:\n                print(\"Detected speech\")\n                printed = True\n\n            if not vad.is_speech_detected():\n                printed = False\n\n            while not vad.empty():\n                samples = vad.front.samples\n                duration = len(samples) / sample_rate\n                filename = f\"seg-{k}-{duration:.3f}-seconds.wav\"\n                k += 1\n                sherpa_onnx.write_wave(filename, samples, sample_rate)\n                print(f\"Duration: {duration:.3f} seconds\")\n                print(f\"Saved to {filename}\")\n                print(\"----------\")\n\n                vad.pop()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exit\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-microphone.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nimport os\nimport sys\nfrom pathlib import Path\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    if not Path(args.silero_vad_model).is_file():\n        raise RuntimeError(\n            f\"{args.silero_vad_model} does not exist. Please download it from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\"\n        )\n\n    mic_sample_rate = 16000\n    if \"SHERPA_ONNX_MIC_SAMPLE_RATE\" in os.environ:\n        mic_sample_rate = int(os.environ.get(\"SHERPA_ONNX_MIC_SAMPLE_RATE\"))\n        print(f\"Change microphone sample rate to {mic_sample_rate}\")\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.sample_rate = sample_rate\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)\n\n    # python3 -m sounddevice\n    # can also be used to list all devices\n\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        print(\n            \"If you are using Linux and you are sure there is a microphone \"\n            \"on your system, please use \"\n            \"./vad-alsa.py\"\n        )\n        sys.exit(0)\n\n    print(devices)\n\n    if 
\"SHERPA_ONNX_MIC_DEVICE\" in os.environ:\n        input_device_idx = int(os.environ.get(\"SHERPA_ONNX_MIC_DEVICE\"))\n        sd.default.device[0] = input_device_idx\n        print(f'Use selected device: {devices[input_device_idx][\"name\"]}')\n    else:\n        input_device_idx = sd.default.device[0]\n        print(f'Use default device: {devices[input_device_idx][\"name\"]}')\n\n    print(\"Started! Please speak. Press Ctrl C to exit\")\n\n    printed = False\n    k = 0\n    try:\n        with sd.InputStream(\n            channels=1, dtype=\"float32\", samplerate=mic_sample_rate\n        ) as s:\n            while True:\n                samples, _ = s.read(samples_per_read)  # a blocking read\n                samples = samples.reshape(-1)\n\n                if mic_sample_rate != sample_rate:\n                    import librosa\n\n                    samples = librosa.resample(\n                        samples, orig_sr=mic_sample_rate, target_sr=sample_rate\n                    )\n\n                vad.accept_waveform(samples)\n\n                if vad.is_speech_detected() and not printed:\n                    print(\"Detected speech\")\n                    printed = True\n\n                if not vad.is_speech_detected():\n                    printed = False\n\n                while not vad.empty():\n                    samples = vad.front.samples\n                    duration = len(samples) / sample_rate\n                    filename = f\"seg-{k}-{duration:.3f}-seconds.wav\"\n                    k += 1\n                    sherpa_onnx.write_wave(filename, samples, sample_rate)\n                    print(f\"Duration: {duration:.3f} seconds\")\n                    print(f\"Saved to {filename}\")\n                    print(\"----------\")\n\n                    vad.pop()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exit\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-remove-non-speech-segments-alsa.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to remove non-speech segments\nand merge all speech segments into a large segment\nand save it to a file.\n\nDifferent from ./vad-remove-non-speech-segments.py, this file supports only\nLinux.\n\nUsage\n\npython3 ./vad-remove-non-speech-segments-alsa.py \\\n        --silero-vad-model silero_vad.onnx\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\"\"\"\n\nimport argparse\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--device-name\",\n        type=str,\n        required=True,\n        help=\"\"\"\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.silero_vad_model)\n\n    device_name = args.device_name\n    print(f\"device_name: {device_name}\")\n    alsa = sherpa_onnx.Alsa(device_name)\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.sample_rate = sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    buffer = []\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)\n\n    all_samples = []\n\n    print(\"Started! Please speak. Press Ctrl C to exit\")\n\n    try:\n        while True:\n            samples = alsa.read(samples_per_read)  # a blocking read\n            samples = np.array(samples)\n\n            buffer = np.concatenate([buffer, samples])\n\n            all_samples = np.concatenate([all_samples, samples])\n\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. 
Saving & Exiting\")\n\n        speech_samples = []\n        while not vad.empty():\n            speech_samples.extend(vad.front.samples)\n            vad.pop()\n\n        speech_samples = np.array(speech_samples, dtype=np.float32)\n\n        filename_for_speech = time.strftime(\"%Y%m%d-%H%M%S-speech.wav\")\n        sf.write(filename_for_speech, speech_samples, samplerate=sample_rate)\n\n        filename_for_all = time.strftime(\"%Y%m%d-%H%M%S-all.wav\")\n        sf.write(filename_for_all, all_samples, samplerate=sample_rate)\n\n        print(f\"Saved to {filename_for_speech} and {filename_for_all}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-remove-non-speech-segments-from-file.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to remove non-speech segments\nand merge all speech segments into a large segment\nand save it to a file.\n\nUsage\n\npython3 ./vad-remove-non-speech-segments-from-file.py \\\n        --silero-vad-model silero_vad.onnx \\\n        input.wav \\\n        output.wav\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\"\"\"\n\nimport argparse\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"input\",\n        type=str,\n        help=\"Path to input.wav\",\n    )\n\n    parser.add_argument(\n        \"output\",\n        type=str,\n        help=\"Path to output.wav\",\n    )\n\n    return parser.parse_args()\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef main():\n    args = get_args()\n    assert_file_exists(args.silero_vad_model)\n    assert_file_exists(args.input)\n\n    samples, sample_rate = load_audio(args.input)\n    
if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.silero_vad.threshold = 0.5\n    config.silero_vad.min_silence_duration = 0.25  # seconds\n    config.silero_vad.min_speech_duration = 0.25  # seconds\n\n    # If the current segment is larger than this value, then it increases\n    # the threshold to 0.9 internally. After detecting this segment,\n    # it resets the threshold to its original value.\n    config.silero_vad.max_speech_duration = 5  # seconds\n\n    config.sample_rate = sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)\n\n    speech_samples = []\n    while len(samples) > window_size:\n        vad.accept_waveform(samples[:window_size])\n        samples = samples[window_size:]\n\n        while not vad.empty():\n            speech_samples.extend(vad.front.samples)\n            vad.pop()\n\n    vad.flush()\n\n    while not vad.empty():\n        speech_samples.extend(vad.front.samples)\n        vad.pop()\n\n    speech_samples = np.array(speech_samples, dtype=np.float32)\n\n    sf.write(args.output, speech_samples, samplerate=sample_rate)\n\n    print(f\"Saved to {args.output}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-remove-non-speech-segments.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis file shows how to remove non-speech segments\nand merge all speech segments into a large segment\nand save it to a file.\n\nUsage\n\npython3 ./vad-remove-non-speech-segments.py \\\n        --silero-vad-model silero_vad.onnx\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\"\"\"\n\nimport argparse\nimport sys\nimport time\nfrom pathlib import Path\n\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        print(\n            \"If you are using Linux and you are sure there is a microphone \"\n            \"on your system, please use \"\n            \"./vad-remove-non-speech-segments-alsa.py\"\n        )\n        sys.exit(0)\n\n    print(devices)\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    args 
= get_args()\n    assert_file_exists(args.silero_vad_model)\n\n    sample_rate = 16000\n    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.sample_rate = sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    buffer = []\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=30)\n\n    all_samples = []\n\n    print(\"Started! Please speak. Press Ctrl C to exit\")\n\n    try:\n        with sd.InputStream(channels=1, dtype=\"float32\", samplerate=sample_rate) as s:\n            while True:\n                samples, _ = s.read(samples_per_read)  # a blocking read\n                samples = samples.reshape(-1)\n                buffer = np.concatenate([buffer, samples])\n\n                all_samples = np.concatenate([all_samples, samples])\n\n                while len(buffer) > window_size:\n                    vad.accept_waveform(buffer[:window_size])\n                    buffer = buffer[window_size:]\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Saving & Exiting\")\n\n        speech_samples = []\n        while not vad.empty():\n            speech_samples.extend(vad.front.samples)\n            vad.pop()\n\n        speech_samples = np.array(speech_samples, dtype=np.float32)\n\n        filename_for_speech = time.strftime(\"%Y%m%d-%H%M%S-speech.wav\")\n        sf.write(filename_for_speech, speech_samples, samplerate=sample_rate)\n\n        filename_for_all = time.strftime(\"%Y%m%d-%H%M%S-all.wav\")\n        sf.write(filename_for_all, all_samples, samplerate=sample_rate)\n\n        print(f\"Saved to {filename_for_speech} and {filename_for_all}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/vad-with-non-streaming-asr.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2023  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python APIs\nwith VAD and non-streaming ASR models for speech recognition\nfrom a microphone.\n\nNote that you need a non-streaming model for this script.\n\n(1) For paraformer\n\n    ./python-api-examples/vad-with-non-streaming-asr.py  \\\n      --silero-vad-model=/path/to/silero_vad.onnx \\\n      --tokens=/path/to/tokens.txt \\\n      --paraformer=/path/to/paraformer.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80\n\n(2) For transducer models from icefall\n\n    ./python-api-examples/vad-with-non-streaming-asr.py  \\\n      --silero-vad-model=/path/to/silero_vad.onnx \\\n      --tokens=/path/to/tokens.txt \\\n      --encoder=/path/to/encoder.onnx \\\n      --decoder=/path/to/decoder.onnx \\\n      --joiner=/path/to/joiner.onnx \\\n      --num-threads=2 \\\n      --decoding-method=greedy_search \\\n      --debug=false \\\n      --sample-rate=16000 \\\n      --feature-dim=80\n\n(3) For Moonshine models\n\n./python-api-examples/vad-with-non-streaming-asr.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --moonshine-preprocessor=./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx \\\n  --moonshine-encoder=./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx \\\n  --moonshine-uncached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx \\\n  --moonshine-cached-decoder=./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx \\\n  --tokens=./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt \\\n  --num-threads=2\n\n(4) For Whisper models\n\n./python-api-examples/vad-with-non-streaming-asr.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n  
--whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n  --whisper-task=transcribe \\\n  --num-threads=2\n\n(5) For SenseVoice CTC models\n\n./python-api-examples/vad-with-non-streaming-asr.py  \\\n  --silero-vad-model=/path/to/silero_vad.onnx \\\n  --sense-voice=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt \\\n  --num-threads=2\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/index.html\nto install sherpa-onnx and to download non-streaming pre-trained models\nused in this file.\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nto download silero_vad.onnx\n\nFor instance,\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n\"\"\"\nimport argparse\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\nimport sherpa_onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--silero-vad-model\",\n        type=str,\n        required=True,\n        help=\"Path to silero_vad.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer encoder model\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer decoder model\",\n    )\n\n    parser.add_argument(\n        \"--joiner\",\n        default=\"\",\n        type=str,\n        help=\"Path to the transducer joiner model\",\n    )\n\n    parser.add_argument(\n        \"--paraformer\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from Paraformer\",\n    )\n\n    parser.add_argument(\n        \"--sense-voice\",\n        default=\"\",\n        type=str,\n        help=\"Path to the model.onnx from SenseVoice\",\n    )\n\n    parser.add_argument(\n        \"--num-threads\",\n        type=int,\n        default=1,\n        help=\"Number of threads for neural network computation\",\n    )\n\n    parser.add_argument(\n        \"--whisper-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper encoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to whisper decoder model\",\n    )\n\n    parser.add_argument(\n        \"--whisper-language\",\n        default=\"\",\n        type=str,\n        help=\"\"\"It specifies the spoken language in the 
input file.\n        Example values: en, fr, de, zh, jp.\n        Available languages for multilingual models can be found at\n        https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n        If not specified, we infer the language from the input audio file.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-task\",\n        default=\"transcribe\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        help=\"\"\"For multilingual models, if you specify translate, the output\n        will be in English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--whisper-tail-paddings\",\n        default=-1,\n        type=int,\n        help=\"\"\"Number of tail padding frames.\n        We have removed the 30-second constraint from whisper, so you need to\n        choose the amount of tail padding frames by yourself.\n        Use -1 to use a default value for tail padding.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-preprocessor\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine preprocessor model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-encoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine encoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-uncached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine uncached decoder model\",\n    )\n\n    parser.add_argument(\n        \"--moonshine-cached-decoder\",\n        default=\"\",\n        type=str,\n        help=\"Path to moonshine cached decoder model\",\n    )\n\n    parser.add_argument(\n        \"--blank-penalty\",\n        type=float,\n        default=0.0,\n        help=\"\"\"\n        The penalty applied on blank symbol during decoding.\n        Note: It is a positive value that would be applied to logits like\n        this `logits[:, 0] -= blank_penalty` (suppose logits.shape is\n 
       [batch_size, vocab] and blank id is 0).\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--decoding-method\",\n        type=str,\n        default=\"greedy_search\",\n        help=\"\"\"Valid values are greedy_search and modified_beam_search.\n        modified_beam_search is valid only for transducer models.\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--debug\",\n        # type=bool would treat any non-empty string (even \"false\") as True,\n        # so parse the text explicitly.\n        type=lambda s: s.strip().lower() in (\"1\", \"true\", \"yes\"),\n        default=False,\n        help=\"True to show debug messages when loading models.\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"\"\"Sample rate of the feature extractor. Must match the one\n        expected by the model.\"\"\",\n    )\n\n    parser.add_argument(\n        \"--feature-dim\",\n        type=int,\n        default=80,\n        help=\"Feature dimension. Must match the one expected by the model\",\n    )\n\n    parser.add_argument(\n        \"--hr-lexicon\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the lexicon.txt for homophone replacer\",\n    )\n\n    parser.add_argument(\n        \"--hr-rule-fsts\",\n        type=str,\n        default=\"\",\n        help=\"If not empty, it is the replace.fst for homophone replacer\",\n    )\n\n    return parser.parse_args()\n\n\ndef assert_file_exists(filename: str):\n    assert Path(filename).is_file(), (\n        f\"{filename} does not exist!\\n\"\n        \"Please refer to \"\n        \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it\"\n    )\n\n\ndef create_recognizer(args) -> sherpa_onnx.OfflineRecognizer:\n    if args.encoder:\n        assert len(args.paraformer) == 0, args.paraformer\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.moonshine_preprocessor) == 0, 
args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.encoder)\n        assert_file_exists(args.decoder)\n        assert_file_exists(args.joiner)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n            encoder=args.encoder,\n            decoder=args.decoder,\n            joiner=args.joiner,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            blank_penalty=args.blank_penalty,\n            debug=args.debug,\n            hr_rule_fsts=args.hr_rule_fsts,\n            hr_lexicon=args.hr_lexicon,\n        )\n    elif args.paraformer:\n        assert len(args.sense_voice) == 0, args.sense_voice\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.paraformer)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=args.paraformer,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            sample_rate=args.sample_rate,\n            feature_dim=args.feature_dim,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            
hr_rule_fsts=args.hr_rule_fsts,\n            hr_lexicon=args.hr_lexicon,\n        )\n    elif args.sense_voice:\n        assert len(args.whisper_encoder) == 0, args.whisper_encoder\n        assert len(args.whisper_decoder) == 0, args.whisper_decoder\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        assert_file_exists(args.sense_voice)\n        recognizer = sherpa_onnx.OfflineRecognizer.from_sense_voice(\n            model=args.sense_voice,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            use_itn=True,\n            debug=args.debug,\n            hr_rule_fsts=args.hr_rule_fsts,\n            hr_lexicon=args.hr_lexicon,\n        )\n    elif args.whisper_encoder:\n        assert_file_exists(args.whisper_encoder)\n        assert_file_exists(args.whisper_decoder)\n        assert len(args.moonshine_preprocessor) == 0, args.moonshine_preprocessor\n        assert len(args.moonshine_encoder) == 0, args.moonshine_encoder\n        assert (\n            len(args.moonshine_uncached_decoder) == 0\n        ), args.moonshine_uncached_decoder\n        assert len(args.moonshine_cached_decoder) == 0, args.moonshine_cached_decoder\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=args.whisper_encoder,\n            decoder=args.whisper_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            language=args.whisper_language,\n            task=args.whisper_task,\n            tail_paddings=args.whisper_tail_paddings,\n            hr_rule_fsts=args.hr_rule_fsts,\n            
hr_lexicon=args.hr_lexicon,\n        )\n    elif args.moonshine_preprocessor:\n        assert_file_exists(args.moonshine_preprocessor)\n        assert_file_exists(args.moonshine_encoder)\n        assert_file_exists(args.moonshine_uncached_decoder)\n        assert_file_exists(args.moonshine_cached_decoder)\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_moonshine(\n            preprocessor=args.moonshine_preprocessor,\n            encoder=args.moonshine_encoder,\n            uncached_decoder=args.moonshine_uncached_decoder,\n            cached_decoder=args.moonshine_cached_decoder,\n            tokens=args.tokens,\n            num_threads=args.num_threads,\n            decoding_method=args.decoding_method,\n            debug=args.debug,\n            hr_rule_fsts=args.hr_rule_fsts,\n            hr_lexicon=args.hr_lexicon,\n        )\n    else:\n        raise ValueError(\"Please specify at least one model\")\n\n    return recognizer\n\n\ndef main():\n    devices = sd.query_devices()\n    if len(devices) == 0:\n        print(\"No microphone devices found\")\n        sys.exit(0)\n\n    print(devices)\n\n    # If you want to select a different input device, please use\n    # sd.default.device[0] = xxx\n    # where xxx is the device number\n\n    default_input_device_idx = sd.default.device[0]\n    print(f'Use default device: {devices[default_input_device_idx][\"name\"]}')\n\n    args = get_args()\n    assert_file_exists(args.tokens)\n    assert_file_exists(args.silero_vad_model)\n\n    assert args.num_threads > 0, args.num_threads\n\n    assert (\n        args.sample_rate == 16000\n    ), f\"Only sample rate 16000 is supported. Given: {args.sample_rate}\"\n\n    print(\"Creating recognizer. 
Please wait...\")\n    recognizer = create_recognizer(args)\n\n    config = sherpa_onnx.VadModelConfig()\n    config.silero_vad.model = args.silero_vad_model\n    config.silero_vad.min_silence_duration = 0.25\n    config.sample_rate = args.sample_rate\n\n    window_size = config.silero_vad.window_size\n\n    vad = sherpa_onnx.VoiceActivityDetector(config, buffer_size_in_seconds=100)\n\n    samples_per_read = int(0.1 * args.sample_rate)  # 0.1 second = 100 ms\n\n    print(\"Started! Please speak\")\n\n    buffer = []\n    texts = []\n    with sd.InputStream(channels=1, dtype=\"float32\", samplerate=args.sample_rate) as s:\n        while True:\n            samples, _ = s.read(samples_per_read)  # a blocking read\n            samples = samples.reshape(-1)\n\n            buffer = np.concatenate([buffer, samples])\n            while len(buffer) > window_size:\n                vad.accept_waveform(buffer[:window_size])\n                buffer = buffer[window_size:]\n\n            while not vad.empty():\n                stream = recognizer.create_stream()\n                stream.accept_waveform(args.sample_rate, vad.front.samples)\n\n                vad.pop()\n                recognizer.decode_stream(stream)\n\n                text = stream.result.text.strip().lower()\n                if len(text):\n                    idx = len(texts)\n                    texts.append(text)\n                    print(f\"{idx}: {text}\")\n\n\nif __name__ == \"__main__\":\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n"
  },
  {
    "path": "python-api-examples/web/.gitignore",
    "content": "*.pem\n*.key\n*.crt\n"
  },
  {
    "path": "python-api-examples/web/generate-certificate.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\npip install pyopenssl\n\"\"\"\n\nfrom OpenSSL import crypto\n\n# The code in this file is modified from\n# https://stackoverflow.com/questions/27164354/create-a-self-signed-x509-certificate-in-python\n\n\"\"\"\nThis script generates 3 files:\n    - private.key\n    - selfsigned.crt\n    - cert.pem\n\nYou need cert.pem when you start a https server\nor a secure websocket server.\n\nNote: You need to change serialNumber if you want to generate\na new certificate as two different certificates cannot share\nthe same serial number if they are issued by the same organization.\n\nOtherwise, you may get the following error from within you browser:\n\n  An error occurred during a connection to 127.0.0.1:6007. You have received an\n  invalid certificate. Please contact the server administrator or email\n  correspondent and give them the following information: Your certificate\n  contains the same serial number as another certificate issued by the\n  certificate authority. Please get a new certificate containing a unique\n  serial number. 
Error code: SEC_ERROR_REUSED_ISSUER_AND_SERIAL\n\n\"\"\"\n\n\ndef cert_gen(\n    emailAddress=\"https://github.com/k2-fsa/sherpa-onnx\",\n    commonName=\"sherpa-onnx\",\n    countryName=\"CN\",\n    localityName=\"k2-fsa\",\n    stateOrProvinceName=\"k2-fsa\",\n    organizationName=\"k2-fsa\",\n    organizationUnitName=\"k2-fsa\",\n    serialNumber=3,\n    validityStartInSeconds=0,\n    validityEndInSeconds=10 * 365 * 24 * 60 * 60,\n    KEY_FILE=\"private.key\",\n    CERT_FILE=\"selfsigned.crt\",\n    ALL_IN_ONE_FILE=\"cert.pem\",\n):\n    # The generated certificate can be inspected with openssl:\n    # openssl x509 -inform pem -in selfsigned.crt -noout -text\n    # create a key pair\n    k = crypto.PKey()\n    k.generate_key(crypto.TYPE_RSA, 4096)\n    # create a self-signed cert\n    cert = crypto.X509()\n    cert.get_subject().C = countryName\n    cert.get_subject().ST = stateOrProvinceName\n    cert.get_subject().L = localityName\n    cert.get_subject().O = organizationName  # noqa\n    cert.get_subject().OU = organizationUnitName\n    cert.get_subject().CN = commonName\n    cert.get_subject().emailAddress = emailAddress\n    cert.set_serial_number(serialNumber)\n    cert.gmtime_adj_notBefore(validityStartInSeconds)\n    cert.gmtime_adj_notAfter(validityEndInSeconds)\n    cert.set_issuer(cert.get_subject())\n    cert.set_pubkey(k)\n    cert.sign(k, \"sha512\")\n    with open(CERT_FILE, \"wt\") as f:\n        f.write(crypto.dump_certificate(crypto.FILETYPE_PEM, cert).decode(\"utf-8\"))\n    with open(KEY_FILE, \"wt\") as f:\n        f.write(crypto.dump_privatekey(crypto.FILETYPE_PEM, k).decode(\"utf-8\"))\n\n    with open(ALL_IN_ONE_FILE, \"wt\") as f:\n        f.write(crypto.dump_privatekey(crypto.FILETYPE_PEM, k).decode(\"utf-8\"))\n        f.write(crypto.dump_certificate(crypto.FILETYPE_PEM, cert).decode(\"utf-8\"))\n    print(f\"Generated {CERT_FILE}\")\n    print(f\"Generated {KEY_FILE}\")\n    print(f\"Generated {ALL_IN_ONE_FILE}\")\n\n\ncert_gen()\n"
  },
  {
    "path": "python-api-examples/web/index.html",
    "content": "<!doctype html>\n<html lang=\"en\">\n<head>\n  <!-- Required meta tags -->\n  <meta charset=\"utf-8\"></meta>\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\"></meta>\n\n  <!-- Bootstrap CSS -->\n  <link rel=\"stylesheet\"\n        href=\"./css/bootstrap.min.css\"\n        integrity=\"sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T\"\n        crossorigin=\"anonymous\">\n  </link>\n  <link rel=\"icon\"\n      type=\"image/png\"\n      href=\"./k2-logo.png\">\n\n  <script src=\"./js/jquery-3.6.0.min.js\" integrity=\"sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=\" crossorigin=\"anonymous\"></script>\n\n  <title>Next-gen Kaldi demo</title>\n</head>\n\n\n<body>\n  <div id=\"nav\"></div>\n  <script>\n    $(function(){\n      $(\"#nav\").load(\"nav-partial.html\");\n    });\n  </script>\n\n  <ul class=\"list-unstyled\">\n  <li class=\"media\">\n    <div class=\"media-body\">\n      <h5 class=\"mt-0 mb-1\">Upload</h5>\n      <p>Recognition from a selected file</p>\n    </div>\n  <li>\n\n  <li class=\"media\">\n    <div class=\"media-body\">\n      <h5 class=\"mt-0 mb-1\">Streaming_Record</h5>\n      <p>Recognition from real-time recordings</p>\n    </div>\n  </li>\n\n  <li class=\"media\">\n    <div class=\"media-body\">\n      <h5 class=\"mt-0 mb-1\">Offline_Record</h5>\n      <p>Recognition from offline recordings</p>\n    </div>\n  </li>\n  </ul>\n\n  Code is available at\n  <a href=\"https://github.com/k2-fsa/sherpa-onnx\"> https://github.com/k2-fsa/sherpa-onnx</a>\n\n  <!-- Optional JavaScript -->\n  <!-- jQuery first, then Popper.js, then Bootstrap JS -->\n  <script src=\"./js/popper.min.js\"\n          integrity=\"sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/bootstrap.min.js\"\n          integrity=\"sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM\"\n   
       crossorigin=\"anonymous\">\n  </script>\n\n</body>\n</html>\n"
  },
  {
    "path": "python-api-examples/web/js/offline_record.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nvar socket;\n\nconst serverIpInput = document.getElementById('server-ip');\nconst serverPortInput = document.getElementById('server-port');\n\nconst connectBtn = document.getElementById('connect');\nconst uploadBtn = document.getElementById('file');\n\nfunction initWebSocket() {\n  let protocol = 'ws://';\n  if (window.location.protocol == 'https:') {\n    protocol = 'wss://'\n  }\n  let server_ip = serverIpInput.value;\n  let server_port = serverPortInput.value;\n  console.log('protocol: ', protocol);\n  console.log('server_ip: ', server_ip);\n  console.log('server_port: ', server_port);\n\n  let uri = protocol + server_ip + ':' + server_port;\n  console.log('uri', uri);\n  socket = new WebSocket(uri);\n\n  // Connection opened\n  socket.addEventListener('open', function(event) {\n    console.log('connected');\n    recordBtn.disabled = false;\n    connectBtn.disabled = true;\n    connectBtn.innerHTML = 'Connected!';\n  });\n\n  // Connection closed\n  socket.addEventListener('close', function(event) {\n    console.log('disconnected');\n    recordBtn.disabled = true;\n    stopBtn.disabled = true;\n    connectBtn.disabled = false;\n    connectBtn.innerHTML = 'Click me to connect!';\n  });\n\n  // Listen for messages\n  socket.addEventListener('message', function(event) {\n    console.log('Received message: ', event.data);\n\n    document.getElementById('results').value = event.data;\n    socket.send('Done');\n    console.log('Sent Done');\n    socket.close();\n  });\n}\n\nconst recordBtn = document.getElementById('offline_record');\nconst stopBtn = document.getElementById('offline_stop');\nconst clearBtn = document.getElementById('clear');\nconst soundClips = document.getElementById('sound-clips');\nconst canvas = document.getElementById('canvas');\nconst mainSection = 
document.querySelector('.container');\n\nrecordBtn.disabled = true;\nstopBtn.disabled = true;\n\nwindow.onload = (event) => {\n  console.log('page is fully loaded');\n  console.log('protocol', window.location.protocol);\n  console.log('port', window.location.port);\n  if (window.location.protocol == 'https:') {\n    document.getElementById('ws-protocol').textContent = 'wss://';\n  }\n  serverIpInput.value = window.location.hostname;\n  serverPortInput.value = window.location.port;\n};\n\nconnectBtn.onclick = function() {\n  initWebSocket();\n};\n\n\nlet audioCtx;\nconst canvasCtx = canvas.getContext('2d');\nlet mediaStream;\nlet analyser;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nclearBtn.onclick = function() {\n  document.getElementById('results').value = '';\n};\n\nfunction send_header(n) {\n  const header = new ArrayBuffer(8);\n  new DataView(header).setInt32(0, expectedSampleRate, true /* littleEndian */);\n  new DataView(header).setInt32(4, n, true /* littleEndian */);\n  socket.send(new Int32Array(header, 0, 2));\n}\n\n// copied/modified from https://mdn.github.io/web-dictaphone/\n// and\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext();\n    }\n    console.log(audioCtx);\n    recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n    console.log(mediaStream);\n\n    // 
https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 2048;\n    var numberOfInputChannels = 2;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log(recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0));\n      samples = downsampleBuffer(samples, expectedSampleRate);\n      let buf = new Int16Array(samples.length);\n      for (var i = 0; i < samples.length; ++i) {\n        let s = samples[i];\n        if (s >= 1)\n          s = 1;\n        else if (s <= -1)\n          s = -1;\n        buf[i] = s * 32767;\n      }\n      leftchannel.push(buf);\n      // count the samples actually stored (after downsampling)\n      recordingLength += buf.length;\n    };\n\n    visualize(stream);\n    mediaStream.connect(analyser);\n\n    recordBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      mediaStream.connect(analyser);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n      recordBtn.style.background = 'red';\n\n      stopBtn.disabled = false;\n      recordBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      console.log('recorder stopped');\n\n      // stop recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n      mediaStream.disconnect(analyser);\n\n      recordBtn.style.background = '';\n      recordBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      recordBtn.disabled = false;\n\n      const clipName =\n          prompt('Enter a name for your sound clip?', 
'My unnamed clip');\n\n      const clipContainer = document.createElement('article');\n      const clipLabel = document.createElement('p');\n      const audio = document.createElement('audio');\n      const deleteButton = document.createElement('button');\n      clipContainer.classList.add('clip');\n      audio.setAttribute('controls', '');\n      deleteButton.textContent = 'Delete';\n      deleteButton.className = 'delete';\n\n      if (clipName === null) {\n        clipLabel.textContent = 'My unnamed clip';\n      } else {\n        clipLabel.textContent = clipName;\n      }\n\n      clipContainer.appendChild(audio);\n\n      clipContainer.appendChild(clipLabel);\n      clipContainer.appendChild(deleteButton);\n      soundClips.appendChild(clipContainer);\n\n      audio.controls = true;\n      let samples = flatten(leftchannel);\n      let buf = new Float32Array(samples.length);\n      for (var i = 0; i < samples.length; ++i) {\n        let s = samples[i];\n        buf[i] = s / 32767.0;\n      }\n      const blob = toWav(samples);\n\n      leftchannel = [];\n      const audioURL = window.URL.createObjectURL(blob);\n      audio.src = audioURL;\n      console.log('recorder stopped');\n\n      deleteButton.onclick = function(e) {\n        let evtTgt = e.target;\n        evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n      };\n\n      clipLabel.onclick = function() {\n        const existingName = clipLabel.textContent;\n        const newClipName = prompt('Enter a new name for your sound clip?');\n        if (newClipName === null) {\n          clipLabel.textContent = existingName;\n        } else {\n          clipLabel.textContent = newClipName;\n        }\n      };\n\n      buf = buf.buffer\n\n      let n = 1024 * 4;  // send this number of bytes per request.\n      console.log('buf length, ' + buf.byteLength);\n      send_header(buf.byteLength);\n\n      for (let start = 0; start < buf.byteLength; start += n) {\n        socket.send(buf.slice(start, 
start + n));\n      }\n    };\n  };\n\n  let onError = function(err) {\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\nfunction visualize(stream) {\n  if (!audioCtx) {\n    audioCtx = new AudioContext();\n  }\n\n  const source = audioCtx.createMediaStreamSource(stream);\n\n  if (!analyser) {\n    analyser = audioCtx.createAnalyser();\n    analyser.fftSize = 2048;\n  }\n  const bufferLength = analyser.frequencyBinCount;\n  const dataArray = new Uint8Array(bufferLength);\n\n  // source.connect(analyser);\n  // analyser.connect(audioCtx.destination);\n\n  draw()\n\n  function draw() {\n    const WIDTH = canvas.width\n    const HEIGHT = canvas.height;\n\n    requestAnimationFrame(draw);\n\n    analyser.getByteTimeDomainData(dataArray);\n\n    canvasCtx.fillStyle = 'rgb(200, 200, 200)';\n    canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);\n\n    canvasCtx.lineWidth = 2;\n    canvasCtx.strokeStyle = 'rgb(0, 0, 0)';\n\n    canvasCtx.beginPath();\n\n    let sliceWidth = WIDTH * 1.0 / bufferLength;\n    let x = 0;\n\n    for (let i = 0; i < bufferLength; i++) {\n      let v = dataArray[i] / 128.0;\n      let y = v * HEIGHT / 2;\n\n      if (i === 0) {\n        canvasCtx.moveTo(x, y);\n      } else {\n        canvasCtx.lineTo(x, y);\n      }\n\n      x += sliceWidth;\n    }\n\n    canvasCtx.lineTo(canvas.width, canvas.height / 2);\n    canvasCtx.stroke();\n  }\n}\n\nwindow.onresize = function() {\n  canvas.width = mainSection.offsetWidth;\n};\n\nwindow.onresize();\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 
0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = 
Math.round(buffer.length / sampleRateRatio);\n  var result = new Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "python-api-examples/web/js/streaming_record.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nvar socket;\nvar recognition_text = [];\n\nfunction getDisplayResult() {\n  let i = 0;\n  let ans = '';\n  for (let s in recognition_text) {\n    if (recognition_text[s] == '') continue;\n\n    ans += '' + i + ': ' + recognition_text[s] + '\\n';\n    i += 1;\n  }\n  return ans;\n}\n\nfunction initWebSocket() {\n  console.log('Creating websocket')\n  let protocol = 'ws://';\n  if (window.location.protocol == 'https:') {\n    protocol = 'wss://'\n  }\n  let server_ip = serverIpInput.value;\n  let server_port = serverPortInput.value;\n  console.log('protocol: ', protocol);\n  console.log('server_ip: ', server_ip);\n  console.log('server_port: ', server_port);\n\n  let uri = protocol + server_ip + ':' + server_port;\n  console.log('uri', uri);\n  socket = new WebSocket(uri);\n  // socket = new WebSocket('wss://localhost:6006/');\n\n  // Connection opened\n  socket.addEventListener('open', function(event) {\n    console.log('connected');\n    recordBtn.disabled = false;\n    connectBtn.disabled = true;\n    connectBtn.innerHTML = 'Connected!';\n  });\n\n  // Connection closed\n  socket.addEventListener('close', function(event) {\n    console.log('disconnected');\n    recordBtn.disabled = true;\n    connectBtn.disabled = false;\n    connectBtn.innerHTML = 'Click me to connect!';\n  });\n\n  // Listen for messages\n  socket.addEventListener('message', function(event) {\n    let message = JSON.parse(event.data);\n    if (message.segment in recognition_text) {\n      recognition_text[message.segment] = message.text;\n    } else {\n      recognition_text.push(message.text);\n    }\n    let text_area = document.getElementById('results');\n    text_area.value = getDisplayResult();\n    text_area.scrollTop = text_area.scrollHeight;  // auto scroll\n    console.log('Received message: ', event.data);\n 
 });\n}\n\nwindow.onload = (event) => {\n  console.log('page is fully loaded');\n  console.log('protocol', window.location.protocol);\n  console.log('port', window.location.port);\n  if (window.location.protocol == 'https:') {\n    document.getElementById('ws-protocol').textContent = 'wss://';\n  }\n  serverIpInput.value = window.location.hostname;\n  serverPortInput.value = window.location.port;\n};\n\nconst serverIpInput = document.getElementById('server-ip');\nconst serverPortInput = document.getElementById('server-port');\n\nconst connectBtn = document.getElementById('connect');\nconst recordBtn = document.getElementById('streaming_record');\nconst stopBtn = document.getElementById('streaming_stop');\nconst clearBtn = document.getElementById('clear');\nconst soundClips = document.getElementById('sound-clips');\nconst canvas = document.getElementById('canvas');\nconst mainSection = document.querySelector('.container');\n\nstopBtn.disabled = true;\nrecordBtn.disabled = true;\n\nlet audioCtx;\nconst canvasCtx = canvas.getContext('2d');\nlet mediaStream;\nlet analyser;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nclearBtn.onclick = function() {\n  document.getElementById('results').value = '';\n  recognition_text = [];\n};\n\nconnectBtn.onclick = function() {\n  initWebSocket();\n};\n\n// copied/modified from https://mdn.github.io/web-dictaphone/\n// and\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext();\n    }\n    console.log(audioCtx);\n    
recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n    console.log(mediaStream);\n\n    // https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 2048;\n    var numberOfInputChannels = 2;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log(recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0))\n      samples = downsampleBuffer(samples, expectedSampleRate);\n\n      let buf = new Int16Array(samples.length);\n      for (var i = 0; i < samples.length; ++i) {\n        let s = samples[i];\n        if (s >= 1)\n          s = 1;\n        else if (s <= -1)\n          s = -1;\n\n        samples[i] = s;\n        buf[i] = s * 32767;\n      }\n\n      socket.send(samples);\n\n      leftchannel.push(buf);\n      recordingLength += bufferSize;\n    };\n\n    visualize(stream);\n    mediaStream.connect(analyser);\n\n    recordBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      mediaStream.connect(analyser);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n      recordBtn.style.background = 'red';\n\n      stopBtn.disabled = false;\n      recordBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      console.log('recorder stopped');\n\n      socket.send('Done');\n      console.log('Sent Done');\n\n      socket.close();\n\n      // 
stopBtn recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n      mediaStream.disconnect(analyser);\n\n      recordBtn.style.background = '';\n      recordBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      recordBtn.disabled = false;\n\n      const clipName =\n          prompt('Enter a name for your sound clip?', 'My unnamed clip');\n\n      const clipContainer = document.createElement('article');\n      const clipLabel = document.createElement('p');\n      const audio = document.createElement('audio');\n      const deleteButton = document.createElement('button');\n      clipContainer.classList.add('clip');\n      audio.setAttribute('controls', '');\n      deleteButton.textContent = 'Delete';\n      deleteButton.className = 'delete';\n\n      if (clipName === null) {\n        clipLabel.textContent = 'My unnamed clip';\n      } else {\n        clipLabel.textContent = clipName;\n      }\n\n      clipContainer.appendChild(audio);\n\n      clipContainer.appendChild(clipLabel);\n      clipContainer.appendChild(deleteButton);\n      soundClips.appendChild(clipContainer);\n\n      audio.controls = true;\n      let samples = flatten(leftchannel);\n      const blob = toWav(samples);\n\n      leftchannel = [];\n      const audioURL = window.URL.createObjectURL(blob);\n      audio.src = audioURL;\n      console.log('recorder stopped');\n\n      deleteButton.onclick = function(e) {\n        let evtTgt = e.target;\n        evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n      };\n\n      clipLabel.onclick = function() {\n        const existingName = clipLabel.textContent;\n        const newClipName = prompt('Enter a new name for your sound clip?');\n        if (newClipName === null) {\n          clipLabel.textContent = existingName;\n        } else {\n          clipLabel.textContent = newClipName;\n        }\n      };\n    };\n  };\n\n  let onError = function(err) 
{\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\nfunction visualize(stream) {\n  if (!audioCtx) {\n    audioCtx = new AudioContext();\n  }\n\n  const source = audioCtx.createMediaStreamSource(stream);\n\n  if (!analyser) {\n    analyser = audioCtx.createAnalyser();\n    analyser.fftSize = 2048;\n  }\n  const bufferLength = analyser.frequencyBinCount;\n  const dataArray = new Uint8Array(bufferLength);\n\n  // source.connect(analyser);\n  // analyser.connect(audioCtx.destination);\n\n  draw()\n\n  function draw() {\n    const WIDTH = canvas.width\n    const HEIGHT = canvas.height;\n\n    requestAnimationFrame(draw);\n\n    analyser.getByteTimeDomainData(dataArray);\n\n    canvasCtx.fillStyle = 'rgb(200, 200, 200)';\n    canvasCtx.fillRect(0, 0, WIDTH, HEIGHT);\n\n    canvasCtx.lineWidth = 2;\n    canvasCtx.strokeStyle = 'rgb(0, 0, 0)';\n\n    canvasCtx.beginPath();\n\n    let sliceWidth = WIDTH * 1.0 / bufferLength;\n    let x = 0;\n\n    for (let i = 0; i < bufferLength; i++) {\n      let v = dataArray[i] / 128.0;\n      let y = v * HEIGHT / 2;\n\n      if (i === 0) {\n        canvasCtx.moveTo(x, y);\n      } else {\n        canvasCtx.lineTo(x, y);\n      }\n\n      x += sliceWidth;\n    }\n\n    canvasCtx.lineTo(canvas.width, canvas.height / 2);\n    canvasCtx.stroke();\n  }\n}\n\nwindow.onresize = function() {\n  canvas.width = mainSection.offsetWidth;\n};\n\nwindow.onresize();\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    
ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = Math.round(buffer.length / sampleRateRatio);\n  var result = new 
Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "python-api-examples/web/js/upload.js",
    "content": "/**\nReferences\nhttps://developer.mozilla.org/en-US/docs/Web/API/FileList\nhttps://developer.mozilla.org/en-US/docs/Web/API/FileReader\nhttps://javascript.info/arraybuffer-binary-arrays\nhttps://developer.mozilla.org/zh-CN/docs/Web/API/WebSocket\nhttps://developer.mozilla.org/en-US/docs/Web/API/WebSocket/send\n*/\n\nvar socket;\n\nconst serverIpInput = document.getElementById('server-ip');\nconst serverPortInput = document.getElementById('server-port');\n\nconst connectBtn = document.getElementById('connect');\nconst uploadBtn = document.getElementById('file');\n\nfunction initWebSocket() {\n  let protocol = 'ws://';\n  if (window.location.protocol == 'https:') {\n    protocol = 'wss://'\n  }\n  let server_ip = serverIpInput.value;\n  let server_port = serverPortInput.value;\n  console.log('protocol: ', protocol);\n  console.log('server_ip: ', server_ip);\n  console.log('server_port: ', server_port);\n\n\n  let uri = protocol + server_ip + ':' + server_port;\n  console.log('uri', uri);\n  socket = new WebSocket(uri);\n\n  // Connection opened\n  socket.addEventListener('open', function(event) {\n    console.log('connected');\n    uploadBtn.disabled = false;\n    connectBtn.disabled = true;\n    connectBtn.innerHTML = 'Connected!';\n  });\n\n  // Connection closed\n  socket.addEventListener('close', function(event) {\n    console.log('disconnected');\n    uploadBtn.disabled = true;\n    connectBtn.disabled = false;\n    connectBtn.innerHTML = 'Click me to connect!';\n  });\n\n  // Listen for messages\n  socket.addEventListener('message', function(event) {\n    console.log('Received message: ', event.data);\n\n    document.getElementById('results').value = event.data;\n    socket.send('Done');\n    console.log('Sent Done');\n    socket.close();\n  });\n}\n\nwindow.onload = (event) => {\n  console.log('page is fully loaded');\n  console.log('protocol', window.location.protocol);\n  console.log('port', window.location.port);\n  if 
(window.location.protocol == 'https:') {\n    document.getElementById('ws-protocol').textContent = 'wss://';\n  }\n  serverIpInput.value = window.location.hostname;\n  serverPortInput.value = window.location.port;\n};\n\nconnectBtn.onclick = function() {\n  initWebSocket();\n};\n\nfunction send_header(n) {\n  const header = new ArrayBuffer(8);\n  // assume the uploaded wave is 16000 Hz\n  new DataView(header).setInt32(0, 16000, true /* littleEndian */);\n  new DataView(header).setInt32(4, n, true /* littleEndian */);\n  socket.send(new Int32Array(header, 0, 2));\n}\n\nfunction onFileChange() {\n  var files = document.getElementById('file').files;\n\n  if (files.length == 0) {\n    console.log('No file selected');\n    return;\n  }\n\n  console.log('files: ' + files);\n\n  const file = files[0];\n  console.log(file);\n  console.log('file.name ' + file.name);\n  console.log('file.type ' + file.type);\n  console.log('file.size ' + file.size);\n\n  let audioCtx = new AudioContext({sampleRate: 16000});\n\n  let reader = new FileReader();\n  reader.onload = function() {\n    console.log('reading file!');\n    audioCtx.decodeAudioData(reader.result, decodedDone);\n  };\n\n  function decodedDone(decoded) {\n    let typedArray = new Float32Array(decoded.length);\n    let float32_samples = decoded.getChannelData(0);\n    let buf = float32_samples.buffer\n\n    // Send 1024 audio samples per request.\n    //\n    // It has two purposes:\n    //  (1) Simulate streaming\n    //  (2) There is a limit on the number of bytes in the payload that can be\n    //      sent by websocket, which is 1MB, I think. 
We can send a large\n    //      audio file for decoding in this approach.\n    let n = 1024 * 4;  // send this number of bytes per request.\n    console.log('buf length, ' + buf.byteLength);\n    send_header(buf.byteLength);\n    for (let start = 0; start < buf.byteLength; start += n) {\n      socket.send(buf.slice(start, start + n));\n    }\n  }\n\n  reader.readAsArrayBuffer(file);\n}\n\nconst clearBtn = document.getElementById('clear');\nclearBtn.onclick = function() {\n  console.log('clicked');\n  document.getElementById('results').value = '';\n};\n"
  },
  {
    "path": "python-api-examples/web/nav-partial.html",
    "content": "  <nav class=\"navbar navbar-expand-lg navbar-light bg-light\">\n    <a class=\"navbar-brand\" href=\"index.html\">Next-gen Kaldi demo</a>\n      <button class=\"navbar-toggler\" type=\"button\" data-toggle=\"collapse\" data-target=\"#navbarSupportedContent\" aria-controls=\"navbarSupportedContent\" aria-expanded=\"false\" aria-label=\"Toggle navigation\">\n        <span class=\"navbar-toggler-icon\"></span>\n      </button>\n    <div class=\"collapse navbar-collapse\" id=\"navbarSupportedContent\">\n      <ul class=\"navbar-nav mr-auto\">\n        <li class=\"nav-item active\">\n          <a class=\"nav-link\" href=\"index.html\">Home <span class=\"sr-only\">(current)</span></a>\n        </li>\n\n        <li class=\"nav-item\">\n          <a class=\"nav-link\" href=\"upload.html\">Upload</a>\n        </li>\n\n        <li class=\"nav-item\">\n          <a class=\"nav-link\" href=\"streaming_record.html\">Streaming-Record</a>\n        </li>\n\n        <li class=\"nav-item\">\n          <a class=\"nav-link\" href=\"offline_record.html\">Offline-Record</a>\n        </li>\n\n      </ul>\n    </div>\n  </nav>\n"
  },
  {
    "path": "python-api-examples/web/offline_record.html",
    "content": "<!doctype html>\n<html lang=\"en\">\n<head>\n  <!-- Required meta tags -->\n  <meta charset=\"utf-8\"></meta>\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\"></meta>\n\n  <!-- Bootstrap CSS -->\n  <link rel=\"stylesheet\"\n        href=\"./css/bootstrap.min.css\"\n        integrity=\"sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T\"\n        crossorigin=\"anonymous\">\n  </link>\n\n  <script src=\"./js/jquery-3.6.0.min.js\" integrity=\"sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=\" crossorigin=\"anonymous\"></script>\n\n  <title>Next-gen Kaldi demo (Upload file for recognition)</title>\n</head>\n\n\n<body>\n  <div id=\"nav\"></div>\n  <script>\n    $(function(){\n      $(\"#nav\").load(\"nav-partial.html\");\n    });\n  </script>\n\n  <h3>Recognition from offline recordings</h3>\n  <div class=\"container\">\n    <div class=\"input-group mb-1\">\n      <div class=\"input-group-prepend\">\n        <button class=\"btn btn-block btn-primary\" type=\"button\" id=\"connect\">Click me to connect</button>\n      </div>\n      <span class=\"input-group-text\" id=\"ws-protocol\">ws://</span>\n      <input type=\"text\" id=\"server-ip\" class=\"form-control\" placeholder=\"Sherpa-onnx server IP, e.g., localhost\" aria-label=\"sherpa-onnx server IP\">\n      <span class=\"input-group-text\">:</span>\n      <input type=\"text\" id=\"server-port\" class=\"form-control\" placeholder=\"Sherpa-onnx server port, e.g., 6006\" aria-label=\"sherpa-onnx server port\">\n    </div>\n\n    <div class=\"row\">\n       <div class=\"col-12\">\n        <canvas id=\"canvas\" height=\"60px\" display=\"block\" margin-bottom=\"0.5rem\"></canvas>\n      </div>\n    </div>\n    <div class=\"row\">\n       <div class=\"col\">\n        <button class=\"btn btn-primary btn-block\" id=\"offline_record\">Offline-Record</button>\n       </div>\n       <div class=\"col\">\n        <button class=\"btn btn-primary 
btn-block\" id=\"offline_stop\">Offline-Stop</button>\n       </div>\n    </div>\n  </div>\n\n  <div class=\"mb-3\">\n    <label for=\"results\" class=\"form-label\">Recognition results</label>\n    <textarea class=\"form-control\" id=\"results\" rows=\"8\"></textarea>\n  </div>\n\n  <button class=\"btn btn-primary btn-block\" id=\"clear\">Clear results</button>\n\n  <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n  </section>\n\n\n  <!-- Optional JavaScript -->\n  <!-- jQuery first, then Popper.js, then Bootstrap JS -->\n  <script src=\"./js/popper.min.js\"\n          integrity=\"sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/bootstrap.min.js\"\n          integrity=\"sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/offline_record.js\"> </script>\n</body>\n</html>\n"
  },
  {
    "path": "python-api-examples/web/start-https-server.py",
    "content": "#!/usr/bin/env python3\n\n# Code in this file is modified from\n# https://stackoverflow.com/questions/19705785/python-3-simple-https-server\n\nimport argparse\nimport http.server\nimport ssl\nimport sys\nfrom pathlib import Path\n\n\"\"\"\nUsage:\n\n  ./start-https-server.py \\\n    --server-address 0.0.0.0 \\\n    --server-port 6007 \\\n    --cert ./cert.pem\n\"\"\"\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--server-address\",\n        type=str,\n        default=\"0.0.0.0\",\n        help=\"\"\"IP address which this server will bind to\"\"\",\n    )\n\n    parser.add_argument(\n        \"--server-port\",\n        type=int,\n        default=6007,\n        help=\"\"\"Port number on which this server will listen\"\"\",\n    )\n\n    parser.add_argument(\n        \"--certificate\",\n        type=str,\n        default=\"cert.pem\",\n        help=\"\"\"Path to the X.509 certificate. You can use\n        ./generate-certificate.py to generate it\"\"\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(f\"{vars(args)}\")\n    server_address = (args.server_address, args.server_port)\n    httpd = http.server.HTTPServer(\n        server_address, http.server.SimpleHTTPRequestHandler\n    )\n\n    if not Path(args.certificate).is_file():\n        print(\"Please run ./generate-certificate.py to generate a certificate\")\n        sys.exit(-1)\n\n    httpd.socket = ssl.wrap_socket(\n        httpd.socket,\n        server_side=True,\n        certfile=args.certificate,\n        ssl_version=ssl.PROTOCOL_TLS,\n    )\n    print(\n        \"The server is listening at the following address:\\n\"\n        f\"https://{args.server_address}:{args.server_port}\"\n    )\n    httpd.serve_forever()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "python-api-examples/web/streaming_record.html",
    "content": "<!doctype html>\n<html lang=\"en\">\n<head>\n  <!-- Required meta tags -->\n  <meta charset=\"utf-8\"></meta>\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\"></meta>\n\n  <!-- Bootstrap CSS -->\n  <link rel=\"stylesheet\"\n        href=\"./css/bootstrap.min.css\"\n        integrity=\"sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T\"\n        crossorigin=\"anonymous\">\n  </link>\n\n  <script src=\"./js/jquery-3.6.0.min.js\" integrity=\"sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=\" crossorigin=\"anonymous\"></script>\n\n  <title>Next-gen Kaldi demo (Upload file for recognition)</title>\n</head>\n\n\n<body>\n  <div id=\"nav\"></div>\n  <script>\n    $(function(){\n      $(\"#nav\").load(\"nav-partial.html\");\n    });\n  </script>\n\n  <h3>Recognition from real-time recordings</h3>\n  <div class=\"container\">\n    <div class=\"input-group mb-1\">\n      <div class=\"input-group-prepend\">\n        <button class=\"btn btn-block btn-primary\" type=\"button\" id=\"connect\">Click me to connect</button>\n      </div>\n      <span class=\"input-group-text\" id=\"ws-protocol\">ws://</span>\n      <input type=\"text\" id=\"server-ip\" class=\"form-control\" placeholder=\"Sherpa-onnx server IP, e.g., localhost\" aria-label=\"sherpa-onnx server IP\">\n      <span class=\"input-group-text\">:</span>\n      <input type=\"text\" id=\"server-port\" class=\"form-control\" placeholder=\"Sherpa-onnx server port, e.g., 6006\" aria-label=\"sherpa-onnx server port\">\n    </div>\n\n    <div class=\"row\">\n       <div class=\"col-12\">\n        <canvas id=\"canvas\" height=\"60px\" display=\"block\" margin-bottom=\"0.5rem\"></canvas>\n      </div>\n    </div>\n    <div class=\"row\">\n       <div class=\"col\">\n        <button class=\"btn btn-primary btn-block\" id=\"streaming_record\">Streaming-Record</button>\n       </div>\n       <div class=\"col\">\n        <button class=\"btn 
btn-primary btn-block\" id=\"streaming_stop\">Streaming-Stop</button>\n       </div>\n    </div>\n  </div>\n\n  <div class=\"mb-3\">\n    <label for=\"results\" class=\"form-label\">Recognition results</label>\n    <textarea class=\"form-control\" id=\"results\" rows=\"8\"></textarea>\n  </div>\n\n  <button class=\"btn btn-primary btn-block\" id=\"clear\">Clear results</button>\n\n  <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n  </section>\n\n\n  <!-- Optional JavaScript -->\n  <!-- jQuery first, then Popper.js, then Bootstrap JS -->\n  <script src=\"./js/popper.min.js\"\n          integrity=\"sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/bootstrap.min.js\"\n          integrity=\"sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/streaming_record.js\"> </script>\n</body>\n</html>\n"
  },
  {
    "path": "python-api-examples/web/upload.html",
    "content": "<!doctype html>\n<html lang=\"en\">\n<head>\n  <!-- Required meta tags -->\n  <meta charset=\"utf-8\"></meta>\n  <meta name=\"viewport\" content=\"width=device-width, initial-scale=1, shrink-to-fit=no\"></meta>\n\n  <!-- Bootstrap CSS -->\n  <link rel=\"stylesheet\"\n        href=\"./css/bootstrap.min.css\"\n        integrity=\"sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T\"\n        crossorigin=\"anonymous\">\n  </link>\n\n  <script src=\"./js/jquery-3.6.0.min.js\" integrity=\"sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=\" crossorigin=\"anonymous\"></script>\n\n  <title>Next-gen Kaldi demo (Upload file for recognition)</title>\n</head>\n\n\n<body>\n  <div id=\"nav\"></div>\n  <script>\n    $(function(){\n      $(\"#nav\").load(\"nav-partial.html\");\n    });\n  </script>\n\n  <h3>Recognition from a selected file</h3>\n  <div class=\"input-group mb-1\">\n    <div class=\"input-group-prepend\">\n      <button class=\"btn btn-block btn-primary\" type=\"button\" id=\"connect\">Click me to connect</button>\n    </div>\n    <span class=\"input-group-text\" id=\"ws-protocol\">ws://</span>\n    <input type=\"text\" id=\"server-ip\" class=\"form-control\" placeholder=\"Sherpa-onnx server IP, e.g., localhost\" aria-label=\"sherpa-onnx server IP\">\n    <span class=\"input-group-text\">:</span>\n    <input type=\"text\" id=\"server-port\" class=\"form-control\" placeholder=\"Sherpa-onnx server port, e.g., 6006\" aria-label=\"sherpa-onnx server port\">\n  </div>\n\n  <form>\n    <div class=\"mb-3\">\n      <label for=\"file\" class=\"form-label\">Select file</label>\n      <input class=\"form-control\" type=\"file\" id=\"file\" accept=\".wav\" onchange=\"onFileChange()\" disabled=\"true\"></input>\n    </div>\n\n    <div class=\"mb-3\">\n      <label for=\"results\" class=\"form-label\">Recognition results</label>\n      <textarea class=\"form-control\" id=\"results\" rows=\"8\"></textarea>\n    </div>\n\n    <button 
class=\"btn btn-primary btn-block\" id=\"clear\">Clear results</button>\n  </form>\n\n  <!-- Optional JavaScript -->\n  <!-- jQuery first, then Popper.js, then Bootstrap JS -->\n  <script src=\"./js/popper.min.js\"\n          integrity=\"sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/bootstrap.min.js\"\n          integrity=\"sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM\"\n          crossorigin=\"anonymous\">\n  </script>\n\n  <script src=\"./js/upload.js\"> </script>\n</body>\n</html>\n"
  },
  {
    "path": "python-api-examples/zipvoice-tts-play.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API\nfor Chinese/English zero-shot TTS with ZipVoice.\n\nDifferent from ./zipvoice-tts.py, this file plays back the generated audio\nwhile the model is still generating.\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\npython3 ./python-api-examples/zipvoice-tts-play.py\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\nfor details.\n\"\"\"\n\nimport logging\nimport queue\nimport sys\nimport threading\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport numpy as np\nimport sherpa_onnx\nimport soundfile as sf\n\ntry:\n    import sounddevice as sd\nexcept ImportError:\n    print(\"Please install sounddevice first. 
You can use\")\n    print()\n    print(\"  pip install sounddevice\")\n    print()\n    print(\"to install it\")\n    sys.exit(-1)\n\n\ndef create_tts():\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            zipvoice=sherpa_onnx.OfflineTtsZipvoiceModelConfig(\n                tokens=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\",\n                encoder=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\",\n                decoder=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\",\n                data_dir=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\",\n                lexicon=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\",\n                vocoder=\"./vocos_24khz.onnx\",\n            ),\n            debug=False,\n            num_threads=2,\n            provider=\"cpu\",\n        )\n    )\n    if not tts_config.validate():\n        raise ValueError(\n            \"Please read the previous error messages and re-check your config\"\n        )\n\n    return sherpa_onnx.OfflineTts(tts_config)\n\n\nbuffer = queue.Queue()\nstarted = False\nstopped = False\nkilled = False\nsample_rate = None\nevent = threading.Event()\nfirst_message_time = None\n\n\ndef generated_audio_callback(samples: np.ndarray, progress: float):\n    global first_message_time\n    if first_message_time is None:\n        first_message_time = time.time()\n\n    buffer.put(samples)\n\n    global started\n    if started is False:\n        logging.info(\"Start playing ...\")\n    started = True\n\n    if killed:\n        return 0\n\n    return 1\n\n\ndef play_audio_callback(\n    outdata: np.ndarray, frames: int, time, status: sd.CallbackFlags\n):\n    if killed or (started and buffer.empty() and stopped):\n        event.set()\n\n    if buffer.empty():\n        outdata.fill(0)\n        return\n\n    n = 0\n    while n < frames and not buffer.empty():\n    
    remaining = frames - n\n        k = buffer.queue[0].shape[0]\n\n        if remaining <= k:\n            outdata[n:, 0] = buffer.queue[0][:remaining]\n            buffer.queue[0] = buffer.queue[0][remaining:]\n            n = frames\n            if buffer.queue[0].shape[0] == 0:\n                buffer.get()\n\n            break\n\n        outdata[n : n + k, 0] = buffer.get()\n        n += k\n\n    if n < frames:\n        outdata[n:, 0] = 0\n\n\ndef play_audio():\n    if False:\n        devices = sd.query_devices()\n        print(devices)\n\n        default_output_device_idx = sd.default.device[1]\n        print(\n            f'Use default output device: {devices[default_output_device_idx][\"name\"]}'\n        )\n\n    with sd.OutputStream(\n        channels=1,\n        callback=play_audio_callback,\n        dtype=\"float32\",\n        samplerate=sample_rate,\n        blocksize=1024,\n    ):\n        event.wait()\n\n    logging.info(\"Exiting ...\")\n\n\ndef main():\n    reference_audio_file = (\n        \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\"\n    )\n    if not Path(reference_audio_file).is_file():\n        raise ValueError(f\"Reference audio {reference_audio_file} does not exist\")\n\n    logging.info(\"Loading model ...\")\n    tts = create_tts()\n    logging.info(\"Loading model done.\")\n\n    reference_audio, reference_sample_rate = librosa.load(reference_audio_file, sr=None)\n    reference_text = \"那还是三十六年前, 一九八七年. 
我呢考上了武汉大学的计算机系.\"\n    text = \"\"\"\n    小米的价值观是真诚, 热爱.\n    真诚，就是不欺人也不自欺.\n    热爱, 就是全心投入并享受其中.\n    \"\"\"\n\n    global sample_rate\n    sample_rate = tts.sample_rate\n\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.reference_audio = reference_audio\n    gen_config.reference_sample_rate = reference_sample_rate\n    gen_config.reference_text = reference_text\n    gen_config.num_steps = 4\n    gen_config.extra[\"min_char_in_sentence\"] = \"30\"\n\n    play_back_thread = threading.Thread(target=play_audio)\n    play_back_thread.start()\n\n    logging.info(\"Start generating ...\")\n    start_time = time.time()\n    audio = tts.generate(\n        text,\n        gen_config,\n        callback=generated_audio_callback,\n    )\n    end_time = time.time()\n    logging.info(\"Finished generating!\")\n\n    global stopped\n    stopped = True\n\n    if len(audio.samples) == 0:\n        print(\"Error generating audio. Please read the previous error messages.\")\n        global killed\n        killed = True\n        play_back_thread.join()\n        return\n\n    elapsed_seconds = end_time - start_time\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = \"./generated-zipvoice-zh-en-play.wav\"\n    sf.write(\n        output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    logging.info(f\"The text is '{text}'\")\n    logging.info(\n        \"Time in seconds to receive the first \"\n        f\"message: {first_message_time-start_time:.3f}\"\n    )\n    logging.info(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    logging.info(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    logging.info(\n        f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n    )\n\n    logging.info(f\"***  Saved to {output_filename} ***\")\n\n    print(\"\\n   >>>>>>>>> You can safely press ctrl + 
C to stop the playback <<<<<<<<<<\\n\")\n\n    play_back_thread.join()\n\n\nif __name__ == \"__main__\":\n    formatter = \"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\"\n\n    logging.basicConfig(format=formatter, level=logging.INFO)\n    try:\n        main()\n    except KeyboardInterrupt:\n        print(\"\\nCaught Ctrl + C. Exiting\")\n        killed = True\n        sys.exit(0)\n"
  },
  {
    "path": "python-api-examples/zipvoice-tts.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\n\"\"\"\nThis file demonstrates how to use sherpa-onnx Python API\nfor Chinese/English zero-shot TTS with ZipVoice.\n\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\npython3 ./python-api-examples/zipvoice-tts.py\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\nfor details.\n\n\"\"\"\n\nimport time\nfrom pathlib import Path\n\nimport librosa\nimport sherpa_onnx\nimport soundfile as sf\n\n\ndef create_tts():\n    tts_config = sherpa_onnx.OfflineTtsConfig(\n        model=sherpa_onnx.OfflineTtsModelConfig(\n            zipvoice=sherpa_onnx.OfflineTtsZipvoiceModelConfig(\n                tokens=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\",\n                encoder=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\",\n                decoder=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\",\n                data_dir=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\",\n                lexicon=\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\",\n                vocoder=\"./vocos_24khz.onnx\",\n            ),\n            debug=False,\n            num_threads=2,\n            provider=\"cpu\",\n        )\n    )\n    if not tts_config.validate():\n        raise ValueError(\n            \"Please read the previous error messages and re-check your config\"\n        )\n\n    return sherpa_onnx.OfflineTts(tts_config)\n\n\ndef main():\n    reference_audio_file = (\n        
\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\"\n    )\n    if not Path(reference_audio_file).is_file():\n        raise ValueError(f\"Reference audio {reference_audio_file} does not exist\")\n\n    tts = create_tts()\n\n    reference_audio, sample_rate = librosa.load(reference_audio_file, sr=None)\n    reference_text = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n    text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\n    gen_config = sherpa_onnx.GenerationConfig()\n    gen_config.reference_audio = reference_audio\n    gen_config.reference_sample_rate = sample_rate\n    gen_config.reference_text = reference_text\n    gen_config.num_steps = 4\n    gen_config.extra[\"min_char_in_sentence\"] = \"30\"\n\n    start = time.time()\n    audio = tts.generate(text, gen_config)\n    end = time.time()\n\n    if len(audio.samples) == 0:\n        print(\"Error generating audio. Please read the previous error messages.\")\n        return\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio.samples) / audio.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    output_filename = \"./generated-zipvoice-zh-en-python.wav\"\n    sf.write(\n        output_filename,\n        audio.samples,\n        samplerate=audio.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\"Saved to {output_filename}\")\n    print(f\"The text is '{text}'\")\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "release.sh",
    "content": "#!/usr/bin/env bash\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# Please see the end of this file for what files it will generate\n\nset -ex\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\necho \"SHERPA_ONNX_VERSION: ${SHERPA_ONNX_VERSION}\"\ndst=v${SHERPA_ONNX_VERSION}\n\nif [ -d $dst ]; then\n  echo \"$dst exists - skipping\"\n  exit 0\nfi\n\n./build-android-arm64-v8a.sh\n./build-android-armv7-eabi.sh\n./build-android-x86-64.sh\n./build-android-x86.sh\n./build-ios.sh\n\nmkdir -p $dst/jniLibs/arm64-v8a\ncp -v ./build-android-arm64-v8a/install/lib/*.so $dst/jniLibs/arm64-v8a/\n\nmkdir -p $dst/jniLibs/armeabi-v7a\ncp -v ./build-android-armv7-eabi/install/lib/*.so $dst/jniLibs/armeabi-v7a/\n\nmkdir -p $dst/jniLibs/x86_64\ncp -v ./build-android-x86-64/install/lib/*.so $dst/jniLibs/x86_64\n\nmkdir -p $dst/jniLibs/x86\ncp -v ./build-android-x86/install/lib/*.so $dst/jniLibs/x86\n\nmkdir -p $dst/build-ios/\ncp -av ./build-ios/sherpa-onnx.xcframework $dst/build-ios/\n\nmkdir -p $dst/build-ios/ios-onnxruntime\ncp -av ./build-ios/ios-onnxruntime/onnxruntime.xcframework $dst/build-ios/ios-onnxruntime/\n\ncd $dst\n\ntar cjvf sherpa-onnx-v${SHERPA_ONNX_VERSION}-android.tar.bz2 ./jniLibs\n\ntar cjvf sherpa-onnx-v${SHERPA_ONNX_VERSION}-ios.tar.bz2 ./build-ios\n"
  },
  {
    "path": "rust-api-examples/.gitignore",
    "content": "target\n!run-*.sh\n"
  },
  {
    "path": "rust-api-examples/Cargo.toml",
    "content": "[package]\nname = \"rust-api-examples\"\nversion = \"1.12.31\"\nedition = \"2021\"\n\n[dependencies]\nanyhow = \"1.0\"\nclap = { version = \"4.5\", features = [\"derive\"] }\nsherpa-onnx = \"1.12.31\"\n# sherpa-onnx = { path = \"../sherpa-onnx/rust/sherpa-onnx\" }\nserde_json = \"1.0\"\n\ncpal = { version = \"0.16\", optional = true } # cross-platform audio I/O\n\n[features]\n# Default features are empty to avoid building cpal by default\ndefault = []\n\n# Feature for using microphone\nmic = [\"cpal\"]\n\n[[example]]\nname = \"streaming_zipformer_microphone\"\nrequired-features = [\"mic\"]\n"
  },
  {
    "path": "rust-api-examples/README.md",
    "content": "# Introduction\n\nThis folder contains examples that use the Rust API of sherpa-onnx, which is maintained by us.\n\n## Setup library path\n\n### Method 1 (Build from source; only shared libs are supported right now)\n\n```bash\nexport SHERPA_ONNX_LIB_DIR=/Users/fangjun/open-source/sherpa-onnx/build/install/lib\nexport RUSTFLAGS=\"-C link-arg=-Wl,-rpath,$SHERPA_ONNX_LIB_DIR\"\n```\n\n### Method 2 (Download pre-built libs)\n\n```bash\n# You can choose any directory you like\ncd $HOME/Downloads\n\n# We use version v1.12.31 below as an example.\n# Please always use the latest version from\n# https://github.com/k2-fsa/sherpa-onnx/releases\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/v1.12.31/sherpa-onnx-v1.12.31-osx-universal2-shared.tar.bz2\ntar xvf sherpa-onnx-v1.12.31-osx-universal2-shared.tar.bz2\nrm sherpa-onnx-v1.12.31-osx-universal2-shared.tar.bz2\n\nexport SHERPA_ONNX_LIB_DIR=$HOME/Downloads/sherpa-onnx-v1.12.31-osx-universal2-shared/lib\nexport RUSTFLAGS=\"-C link-arg=-Wl,-rpath,$SHERPA_ONNX_LIB_DIR\"\n```\n\n## Examples\n\n| # | Example | Description |\n|---|---------|-------------|\n| 1 | [version](#example-1-show-sherpa-onnx-version) | Show the sherpa-onnx version |\n| 2 | [pocket_tts](#example-2-tts-with-pocket-tts-zero-shot-voice-cloning) | Text-to-speech with zero-shot voice cloning using a reference audio |\n| 3 | [supertonic_tts](#example-3-tts-with-supertonic-tts) | Text-to-speech with Supertonic TTS (multi-speaker, multi-language) |\n| 4 | [zipvoice_tts](#example-4-tts-with-zipvoice-zero-shot-voice-cloning) | Text-to-speech with ZipVoice zero-shot voice cloning |\n| 5 | [vits_tts](#example-5-tts-with-vits-english-piper) | Text-to-speech with a standalone VITS Piper model (English) |\n| 6 | [vits_tts](#example-6-tts-with-vits-german-piper) | Text-to-speech with a standalone VITS Piper model (German) |\n| 7 | [matcha_tts_en](#example-7-tts-with-matcha-english) | Text-to-speech with Matcha TTS (English) |\n| 8 | [matcha_tts_zh](#example-8-tts-with-matcha-chinese) | 
Text-to-speech with Matcha TTS (Chinese) |\n| 9 | [kokoro_tts_en](#example-9-tts-with-kokoro-english) | Text-to-speech with Kokoro TTS (English) |\n| 10 | [kokoro_tts_zh_en](#example-10-tts-with-kokoro-chinese--english) | Text-to-speech with Kokoro TTS (Chinese + English) |\n| 11 | [kitten_tts_en](#example-11-tts-with-kitten-english) | Text-to-speech with Kitten TTS (English) |\n| 12 | [streaming_zipformer_en](#example-12-asr-with-streaming-zipformer-english) | Streaming ASR with zipformer transducer (English) |\n| 13 | [streaming_zipformer_zh_en](#example-13-asr-with-streaming-zipformer-chinese--english) | Streaming ASR with zipformer transducer (Chinese + English) |\n| 14 | [streaming_zipformer_microphone](#example-14-asr-with-streaming-zipformer-with-a-microphone-real-time-asr) | Real-time streaming ASR from microphone input |\n| 15 | [zipformer_en](#example-15-asr-with-non-streaming-zipformer-english) | Non-streaming ASR with zipformer transducer (English) |\n| 16 | [zipformer_zh_en](#example-16-asr-with-non-streaming-zipformer-chinese--english) | Non-streaming ASR with zipformer transducer (Chinese + English) |\n| 17 | [zipformer_vi](#example-17-asr-with-non-streaming-zipformer-vietnamese) | Non-streaming ASR with zipformer transducer (Vietnamese) |\n| 18 | [nemo_parakeet](#example-18-asr-with-non-streaming-nemo-parakeet-english) | Non-streaming ASR with Nemo Parakeet TDT transducer (English) |\n| 19 | [fire_red_asr_ctc](#example-19-asr-with-non-streaming-fireredasr-ctc-chinese--english) | Non-streaming ASR with FireRedASR CTC model (Chinese + English) |\n| 20 | [moonshine_v2](#example-20-asr-with-non-streaming-moonshine-v2-english) | Non-streaming ASR with Moonshine v2 (English) |\n| 21 | [sense_voice](#example-21-asr-with-non-streaming-sensevoice) | Non-streaming ASR with SenseVoice (Chinese, English, Japanese, Korean, Cantonese) |\n| 22 | [silero_vad_remove_silence](#example-22-remove-silences-from-a-file-using-silerovad) | Remove silences from an audio 
file using Silero VAD |\n| 23 | [offline_speech_enhancement_gtcrn](#example-23-offline-speech-enhancement-with-gtcrn) | Offline speech enhancement with GTCRN |\n| 24 | [offline_speech_enhancement_dpdfnet](#example-24-offline-speech-enhancement-with-dpdfnet) | Offline speech enhancement with DPDFNet |\n| 25 | [streaming_speech_enhancement_gtcrn](#example-25-streaming-speech-enhancement-with-gtcrn) | Streaming speech enhancement with GTCRN |\n| 26 | [streaming_speech_enhancement_dpdfnet](#example-26-streaming-speech-enhancement-with-dpdfnet) | Streaming speech enhancement with DPDFNet |\n| 27 | [online_punctuation](#example-27-online-punctuation) | Add punctuation to text using online punctuation model |\n| 28 | [keyword_spotter](#example-28-keyword-spotter) | Detect keywords from audio using a Zipformer KWS model |\n| 29 | [spoken_language_identification](#example-29-spoken-language-identification) | Detect the spoken language in a wave file using Whisper |\n| 30 | [offline_punctuation](#example-30-offline-punctuation) | Add punctuation to text using an offline punctuation model |\n| 31 | [audio_tagging_zipformer](#example-31-audio-tagging-with-a-zipformer-model) | Audio tagging with a Zipformer model |\n| 32 | [audio_tagging_ced](#example-32-audio-tagging-with-a-ced-model) | Audio tagging with a CED model |\n| 33 | [speaker_embedding_extractor](#example-33-speaker-embedding-extractor) | Compute a speaker embedding from a wave file |\n| 34 | [speaker_embedding_manager](#example-34-speaker-embedding-manager) | Register, search, verify, and remove speakers using embeddings |\n| 35 | [speaker_embedding_cosine_similarity](#example-35-speaker-embedding-cosine-similarity) | Compute cosine similarity from three speaker embeddings |\n| 36 | [offline_speaker_diarization](#example-36-offline-speaker-diarization) | Offline speaker diarization with pyannote segmentation and 3D-Speaker embeddings |\n\n## Run it\n\nEach helper script downloads the required files if needed.\n\n### 
Example 1: Show sherpa-onnx version\n\n```bash\n./run-version.sh\n```\n\nFor macOS, you can run\n```\notool -l target/debug/examples/version | grep -A2 LC_RPATH\n```\nto check the RPATH.\n\n### Example 2: TTS with Pocket TTS (zero-shot voice cloning)\n\n```bash\n./run-pocket-tts.sh\n```\n\n### Example 3: TTS with Supertonic TTS\n\n```bash\n./run-supertonic-tts.sh\n```\n\n### Example 4: TTS with ZipVoice zero-shot voice cloning\n\n```bash\n./run-zipvoice-tts.sh\n```\n\n\n### Example 5: TTS with VITS (English Piper)\n\n```bash\n./run-vits-en.sh\n```\n\n### Example 6: TTS with VITS (German Piper)\n\n```bash\n./run-vits-de.sh\n```\n\n### Example 7: TTS with Matcha (English)\n\n```bash\n./run-matcha-tts-en.sh\n```\n\n### Example 8: TTS with Matcha (Chinese)\n\n```bash\n./run-matcha-tts-zh.sh\n```\n\n### Example 9: TTS with Kokoro (English)\n\n```bash\n./run-kokoro-tts-en.sh\n```\n\n### Example 10: TTS with Kokoro (Chinese + English)\n\n```bash\n./run-kokoro-tts-zh-en.sh\n```\n\n### Example 11: TTS with Kitten (English)\n\n```bash\n./run-kitten-tts-en.sh\n```\n\n### Example 12: ASR with streaming zipformer (English)\n\n```bash\n./run-streaming-zipformer-en.sh\n```\n\n### Example 13: ASR with streaming zipformer (Chinese + English)\n\n```bash\n./run-streaming-zipformer-zh-en.sh\n```\n\n### Example 14: ASR with streaming zipformer (with a microphone, real-time ASR)\n\n```bash\n./run-streaming-zipformer-microphone-zh-en.sh\n```\n\n### Example 15: ASR with non-streaming zipformer (English)\n\n```bash\n./run-zipformer-en.sh\n```\n\n### Example 16: ASR with non-streaming zipformer (Chinese + English)\n\n```bash\n./run-zipformer-zh-en.sh\n```\n\n### Example 17: ASR with non-streaming zipformer (Vietnamese)\n\n```bash\n./run-zipformer-vi.sh\n```\n\n### Example 18: ASR with non-streaming Nemo Parakeet (English)\n\n```bash\n./run-nemo-parakeet-en.sh\n```\n\n### Example 19: ASR with non-streaming FireRedASR CTC (Chinese + English)\n\n```bash\n./run-fire-red-asr-ctc.sh\n```\n\n### 
Example 20: ASR with non-streaming Moonshine v2 (English)\n\n```bash\n./run-moonshine-v2.sh\n```\n\n### Example 21: ASR with non-streaming SenseVoice\n\n```bash\n./run-sense-voice.sh\n```\n\n### Example 22: Remove silences from a file using SileroVAD\n\n```bash\n./run-silero-vad-remove-silence.sh\n```\n\n### Example 23: Offline speech enhancement with GTCRN\n\n```bash\n./run-offline-speech-enhancement-gtcrn.sh\n```\n\n### Example 24: Offline speech enhancement with DPDFNet\n\n```bash\n./run-offline-speech-enhancement-dpdfnet.sh\n```\n\n### Example 25: Streaming speech enhancement with GTCRN\n\n```bash\n./run-streaming-speech-enhancement-gtcrn.sh\n```\n\n### Example 26: Streaming speech enhancement with DPDFNet\n\n```bash\n./run-streaming-speech-enhancement-dpdfnet.sh\n```\n\n### Example 27: Online punctuation\n\n```bash\n./run-online-punctuation.sh\n```\n\n### Example 28: Keyword spotter\n\n```bash\n./run-keyword-spotter.sh\n```\n\n### Example 29: Spoken language identification\n\n```bash\n./run-spoken-language-identification.sh\n```\n\n### Example 30: Offline punctuation\n\n```bash\n./run-offline-punctuation.sh\n```\n\n### Example 31: Audio tagging with a Zipformer model\n\n```bash\n./run-audio-tagging-zipformer.sh\n```\n\n### Example 32: Audio tagging with a CED model\n\n```bash\n./run-audio-tagging-ced.sh\n```\n\n\n### Example 33: Speaker embedding extractor\n\n```bash\n./run-speaker-embedding-extractor.sh\n```\n\n### Example 34: Speaker embedding manager\n\n```bash\n./run-speaker-embedding-manager.sh\n```\n\n\n### Example 35: Speaker embedding cosine similarity\n\n```bash\n./run-speaker-embedding-cosine-similarity.sh\n```\n\n\n### Example 36: Offline speaker diarization\n\n```bash\n./run-offline-speaker-diarization.sh\n```\n"
  },
  {
    "path": "rust-api-examples/examples/audio_tagging_ced.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use audio tagging with a CED model\n// through sherpa-onnx's Rust API.\n\nuse sherpa_onnx::{AudioTagging, AudioTaggingConfig, AudioTaggingModelConfig, Wave};\nuse std::time::Instant;\n\nfn main() {\n    let config = AudioTaggingConfig {\n        model: AudioTaggingModelConfig {\n            ced: Some(\"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx\".into()),\n            num_threads: 1,\n            debug: true,\n            provider: Some(\"cpu\".into()),\n            ..Default::default()\n        },\n        labels: Some(\n            \"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/class_labels_indices.csv\".into(),\n        ),\n        top_k: 5,\n    };\n\n    let tagger = AudioTagging::create(&config).expect(\"Failed to create AudioTagging\");\n\n    let wav = Wave::read(\"./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/test_wavs/6.wav\")\n        .expect(\"Failed to read test wave\");\n\n    let start = Instant::now();\n    let stream = tagger.create_stream();\n    stream.accept_waveform(wav.sample_rate(), wav.samples());\n    let result = tagger.compute(&stream, 5);\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let audio_duration = wav.samples().len() as f32 / wav.sample_rate() as f32;\n    let rtf = elapsed_seconds / audio_duration;\n\n    println!(\"Elapsed seconds: {:.3}\", elapsed_seconds);\n    println!(\"Audio duration in seconds: {:.3}\", audio_duration);\n    println!(\"RTF: {:.3}/{:.3} = {:.3}\", elapsed_seconds, audio_duration, rtf);\n    println!();\n    for (i, event) in result.iter().enumerate() {\n        println!(\"{}: {{name: {}, index: {}, prob: {:.3}}}\", i, event.name, event.index, event.prob);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/audio_tagging_zipformer.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use audio tagging with a Zipformer model\n// through sherpa-onnx's Rust API.\n\nuse sherpa_onnx::{\n    AudioTagging, AudioTaggingConfig, AudioTaggingModelConfig,\n    OfflineZipformerAudioTaggingModelConfig, Wave,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = AudioTaggingConfig {\n        model: AudioTaggingModelConfig {\n            zipformer: OfflineZipformerAudioTaggingModelConfig {\n                model: Some(\n                    \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx\"\n                        .into(),\n                ),\n            },\n            num_threads: 1,\n            debug: true,\n            provider: Some(\"cpu\".into()),\n            ..Default::default()\n        },\n        labels: Some(\n            \"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/class_labels_indices.csv\"\n                .into(),\n        ),\n        top_k: 5,\n    };\n\n    let tagger = AudioTagging::create(&config).expect(\"Failed to create AudioTagging\");\n\n    let wav =\n        Wave::read(\"./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/test_wavs/1.wav\")\n            .expect(\"Failed to read test wave\");\n\n    let start = Instant::now();\n    let stream = tagger.create_stream();\n    stream.accept_waveform(wav.sample_rate(), wav.samples());\n    let result = tagger.compute(&stream, 5);\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let audio_duration = wav.samples().len() as f32 / wav.sample_rate() as f32;\n    let rtf = elapsed_seconds / audio_duration;\n\n    println!(\"Elapsed seconds: {:.3}\", elapsed_seconds);\n    println!(\"Audio duration in seconds: {:.3}\", audio_duration);\n    println!(\"RTF: {:.3}/{:.3} = {:.3}\", elapsed_seconds, audio_duration, rtf);\n    println!();\n    for (i, event) in result.iter().enumerate() {\n        println!(\"{}: {{name: {}, index: {}, prob: 
{:.3}}}\", i, event.name, event.index, event.prob);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/fire_red_asr_ctc.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use FireRedAsr CTC with sherpa-onnx's Rust API\n// for offline speech recognition.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{\n    OfflineFireRedAsrCtcModelConfig, OfflineRecognizer, OfflineRecognizerConfig, Wave,\n};\nuse std::time::Instant;\n\n/// FireRedAsr CTC offline example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to WAV file\n    #[arg(long)]\n    wav: String,\n\n    /// Path to FireRedAsr CTC ONNX model\n    #[arg(long)]\n    model: String,\n\n    /// Path to tokens file\n    #[arg(long)]\n    tokens: String,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 2)]\n    num_threads: i32,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let wave = Wave::read(&args.wav).expect(\"Failed to read WAV file\");\n    let audio_duration = wave.samples().len() as f64 / wave.sample_rate() as f64;\n\n    let mut recognizer_config = OfflineRecognizerConfig::default();\n\n    recognizer_config.model_config.fire_red_asr_ctc = OfflineFireRedAsrCtcModelConfig {\n        model: Some(args.model.clone()),\n    };\n\n    recognizer_config.model_config.tokens = Some(args.tokens.clone());\n    recognizer_config.model_config.provider = Some(args.provider.clone());\n    recognizer_config.model_config.debug = args.debug;\n    recognizer_config.model_config.num_threads = args.num_threads;\n\n    // Measure recognizer creation time\n    println!(\"Creating recognizer ...\");\n    let start_creation = Instant::now();\n    let recognizer =\n        OfflineRecognizer::create(&recognizer_config).expect(\"Failed to create OfflineRecognizer\");\n    let creation_elapsed = 
start_creation.elapsed().as_secs_f64();\n    println!(\"Recognizer created in {:.3} seconds.\", creation_elapsed);\n\n    let stream = recognizer.create_stream();\n\n    // Measure recognition time\n    let start_recognition = Instant::now();\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    recognizer.decode(&stream);\n    let recognition_elapsed = start_recognition.elapsed().as_secs_f64();\n\n    // Get recognition result\n    if let Some(result) = stream.get_result() {\n        println!(\"Decoded text: {}\", result.text);\n\n        let total_time = creation_elapsed + recognition_elapsed;\n        let rtf = recognition_elapsed / audio_duration;\n\n        println!(\"\\n=== Performance Summary ===\");\n        println!(\"Audio duration          : {:.3} seconds\", audio_duration);\n        println!(\"Recognizer creation time: {:.3} seconds\", creation_elapsed);\n        println!(\n            \"Recognition time        : {:.3} seconds\",\n            recognition_elapsed\n        );\n        println!(\"Total elapsed time      : {:.3} seconds\", total_time);\n\n        // Detailed RTF computation log\n        println!(\n            \"Real-Time Factor (RTF)  : {:.3} (recognition_elapsed / audio_duration = {:.3} / {:.3})\",\n            rtf, recognition_elapsed, audio_duration\n        );\n\n        println!(\n            \"Number of threads       : {}\",\n            recognizer_config.model_config.num_threads\n        );\n    } else {\n        eprintln!(\"Failed to get recognition result\");\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/keyword_spotter.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use sherpa-onnx's Rust API for keyword spotting.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{KeywordSpotter, KeywordSpotterConfig, Wave};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    wav: String,\n\n    #[arg(long)]\n    encoder: String,\n\n    #[arg(long)]\n    decoder: String,\n\n    #[arg(long)]\n    joiner: String,\n\n    #[arg(long)]\n    tokens: String,\n\n    #[arg(long)]\n    keywords_file: String,\n\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    #[arg(long, default_value_t = 1)]\n    num_threads: i32,\n\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn detect_keywords(\n    kws: &KeywordSpotter,\n    wave: &Wave,\n    title: &str,\n    extra_keywords: Option<&str>,\n) {\n    println!(\"{title}\");\n\n    let stream = if let Some(extra_keywords) = extra_keywords {\n        kws.create_stream_with_keywords(extra_keywords)\n    } else {\n        kws.create_stream()\n    };\n\n    let tail_padding = vec![0.0f32; (wave.sample_rate() / 2) as usize];\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    stream.accept_waveform(wave.sample_rate(), &tail_padding);\n    stream.input_finished();\n\n    let mut detected = false;\n    while kws.is_ready(&stream) {\n        kws.decode(&stream);\n        if let Some(result) = kws.get_result(&stream) {\n            if !result.keyword.is_empty() {\n                detected = true;\n                println!(\"Detected keyword: {}\", result.json);\n                kws.reset(&stream);\n            }\n        }\n    }\n\n    if !detected {\n        println!(\"No keyword detected.\");\n    }\n\n    println!();\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n    let wave = Wave::read(&args.wav).ok_or_else(|| anyhow::anyhow!(\"Failed to 
read WAV file\"))?;\n\n    let mut config = KeywordSpotterConfig::default();\n    config.model_config.transducer.encoder = Some(args.encoder);\n    config.model_config.transducer.decoder = Some(args.decoder);\n    config.model_config.transducer.joiner = Some(args.joiner);\n    config.model_config.tokens = Some(args.tokens);\n    config.model_config.provider = Some(args.provider);\n    config.model_config.num_threads = args.num_threads;\n    config.model_config.debug = args.debug;\n    config.keywords_file = Some(args.keywords_file);\n\n    let kws = KeywordSpotter::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create KeywordSpotter\"))?;\n\n    detect_keywords(\n        &kws,\n        &wave,\n        \"--Test pre-defined keywords from the keywords file--\",\n        None,\n    );\n    detect_keywords(\n        &kws,\n        &wave,\n        \"--Use pre-defined keywords + add a new keyword--\",\n        Some(\"y ǎn y uán @演员\"),\n    );\n    detect_keywords(\n        &kws,\n        &wave,\n        \"--Use pre-defined keywords + add two new keywords--\",\n        Some(\"y ǎn y uán @演员/zh ī m íng @知名\"),\n    );\n\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/kitten_tts_en.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Kitten TTS with sherpa-onnx's Rust API\n// for offline English text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKittenModelConfig,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            kitten: OfflineTtsKittenModelConfig {\n                model: Some(\"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\".into()),\n                voices: Some(\"./kitten-nano-en-v0_1-fp16/voices.bin\".into()),\n                tokens: Some(\"./kitten-nano-en-v0_1-fp16/tokens.txt\".into()),\n                data_dir: Some(\"./kitten-nano-en-v0_1-fp16/espeak-ng-data\".into()),\n                length_scale: 1.0,\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"Today as always, men fall into two groups: slaves and free men. Whoever \\\n        does not have two-thirds of his day for himself, is a slave, whatever \\\n        he may be: a statesman, a businessman, an official, or a scholar. \\\n        Friends fell out often because life was changing so fast. 
The easiest \\\n        thing in the world was to lose touch with someone.\";\n\n    let gen_config = GenerationConfig {\n        sid: 0,\n        speed: 1.0,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-kitten-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/kokoro_tts_en.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Kokoro TTS with sherpa-onnx's Rust API\n// for offline English text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            kokoro: OfflineTtsKokoroModelConfig {\n                model: Some(\"./kokoro-en-v0_19/model.onnx\".into()),\n                voices: Some(\"./kokoro-en-v0_19/voices.bin\".into()),\n                tokens: Some(\"./kokoro-en-v0_19/tokens.txt\".into()),\n                data_dir: Some(\"./kokoro-en-v0_19/espeak-ng-data\".into()),\n                length_scale: 1.0,\n                ..Default::default()\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"Today as always, men fall into two groups: slaves and free men. Whoever \\\n        does not have two-thirds of his day for himself, is a slave, whatever \\\n        he may be: a statesman, a businessman, an official, or a scholar. \\\n        Friends fell out often because life was changing so fast. 
The easiest \\\n        thing in the world was to lose touch with someone.\";\n\n    let gen_config = GenerationConfig {\n        sid: 0,\n        speed: 1.0,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-kokoro-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/kokoro_tts_zh_en.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Kokoro TTS with sherpa-onnx's Rust API\n// for offline Chinese + English text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsKokoroModelConfig,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            kokoro: OfflineTtsKokoroModelConfig {\n                model: Some(\"./kokoro-multi-lang-v1_0/model.onnx\".into()),\n                voices: Some(\"./kokoro-multi-lang-v1_0/voices.bin\".into()),\n                tokens: Some(\"./kokoro-multi-lang-v1_0/tokens.txt\".into()),\n                data_dir: Some(\"./kokoro-multi-lang-v1_0/espeak-ng-data\".into()),\n                dict_dir: Some(\"./kokoro-multi-lang-v1_0/dict\".into()),\n                lexicon: Some(\n                    \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt\".into(),\n                ),\n                length_scale: 1.0,\n                ..Default::default()\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"中英文语音合成测试。This is generated by next generation Kaldi using \\\n        Kokoro without Misaki. 
你觉得中英文说的如何呢？\";\n\n    let gen_config = GenerationConfig {\n        sid: 0,\n        speed: 1.0,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-kokoro-zh-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/matcha_tts_en.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Matcha TTS with sherpa-onnx's Rust API\n// for offline English text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            matcha: OfflineTtsMatchaModelConfig {\n                acoustic_model: Some(\"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\".into()),\n                vocoder: Some(\"./vocos-22khz-univ.onnx\".into()),\n                tokens: Some(\"./matcha-icefall-en_US-ljspeech/tokens.txt\".into()),\n                data_dir: Some(\"./matcha-icefall-en_US-ljspeech/espeak-ng-data\".into()),\n                noise_scale: 0.667,\n                length_scale: 1.0,\n                ..Default::default()\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"Today as always, men fall into two groups: slaves and free men. Whoever \\\n        does not have two-thirds of his day for himself, is a slave, whatever \\\n        he may be: a statesman, a businessman, an official, or a scholar. \\\n        Friends fell out often because life was changing so fast. 
The easiest \\\n        thing in the world was to lose touch with someone.\";\n\n    let gen_config = GenerationConfig {\n        sid: 0,\n        speed: 1.0,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-matcha-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/matcha_tts_zh.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Matcha TTS with sherpa-onnx's Rust API\n// for offline Chinese text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsMatchaModelConfig,\n};\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            matcha: OfflineTtsMatchaModelConfig {\n                acoustic_model: Some(\"./matcha-icefall-zh-baker/model-steps-3.onnx\".into()),\n                vocoder: Some(\"./vocos-22khz-univ.onnx\".into()),\n                lexicon: Some(\"./matcha-icefall-zh-baker/lexicon.txt\".into()),\n                tokens: Some(\"./matcha-icefall-zh-baker/tokens.txt\".into()),\n                dict_dir: Some(\"./matcha-icefall-zh-baker/dict\".into()),\n                noise_scale: 0.667,\n                length_scale: 1.0,\n                ..Default::default()\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        rule_fsts: Some(\n            \"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\".into(),\n        ),\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如\\\n        涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感\\\n        受着生命的奇迹与温柔.\\\n        某某银行的副行长和一些行政领导表示，他们去过长江和长白山; \\\n        经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\";\n\n    let gen_config = GenerationConfig {\n        sid: 0,\n        speed: 1.0,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n      
      Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-matcha-zh-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/moonshine_v2.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use a Moonshine v2 model with sherpa-onnx's Rust API\n// for offline speech recognition.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{OfflineRecognizer, OfflineRecognizerConfig, Wave};\nuse std::time::Instant;\n\n/// Moonshine v2 offline example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to WAV file\n    #[arg(long)]\n    wav: String,\n\n    /// Path to the encoder model\n    #[arg(long)]\n    encoder: String,\n\n    /// Path to the decoder model\n    #[arg(long)]\n    decoder: String,\n\n    /// Path to tokens file\n    #[arg(long)]\n    tokens: String,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 2)]\n    num_threads: i32,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let wave = Wave::read(&args.wav).expect(\"Failed to read WAV file\");\n    let audio_duration = wave.samples().len() as f64 / wave.sample_rate() as f64;\n\n    let mut recognizer_config = OfflineRecognizerConfig::default();\n\n    recognizer_config.model_config.moonshine.encoder = Some(args.encoder.clone());\n    recognizer_config.model_config.moonshine.merged_decoder = Some(args.decoder.clone());\n\n    recognizer_config.model_config.tokens = Some(args.tokens.clone());\n    recognizer_config.model_config.provider = Some(args.provider.clone());\n    recognizer_config.model_config.debug = args.debug;\n    recognizer_config.model_config.num_threads = args.num_threads;\n\n    // Measure recognizer creation time\n    println!(\"Creating recognizer ...\");\n    let start_creation = Instant::now();\n    let recognizer =\n        OfflineRecognizer::create(&recognizer_config).expect(\"Failed to 
create OfflineRecognizer\");\n    let creation_elapsed = start_creation.elapsed().as_secs_f64();\n    println!(\"Recognizer created in {:.3} seconds.\", creation_elapsed);\n\n    let stream = recognizer.create_stream();\n\n    // Measure recognition time\n    let start_recognition = Instant::now();\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    recognizer.decode(&stream);\n    let recognition_elapsed = start_recognition.elapsed().as_secs_f64();\n\n    // Get recognition result\n    if let Some(result) = stream.get_result() {\n        println!(\"Decoded text: {}\", result.text);\n\n        let total_time = creation_elapsed + recognition_elapsed;\n        let rtf = recognition_elapsed / audio_duration;\n\n        println!(\"\\n=== Performance Summary ===\");\n        println!(\"Audio duration          : {:.3} seconds\", audio_duration);\n        println!(\"Recognizer creation time: {:.3} seconds\", creation_elapsed);\n        println!(\n            \"Recognition time        : {:.3} seconds\",\n            recognition_elapsed\n        );\n        println!(\"Total elapsed time      : {:.3} seconds\", total_time);\n\n        // Detailed RTF computation log\n        println!(\n            \"Real-Time Factor (RTF)  : {:.3} (recognition_elapsed / audio_duration = {:.3} / {:.3})\",\n            rtf, recognition_elapsed, audio_duration\n        );\n\n        println!(\n            \"Number of threads       : {}\",\n            recognizer_config.model_config.num_threads\n        );\n    } else {\n        eprintln!(\"Failed to get recognition result\");\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/nemo_parakeet.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use NeMo Parakeet with sherpa-onnx's Rust API\n// for offline speech recognition.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{OfflineRecognizer, OfflineRecognizerConfig, OfflineTransducerModelConfig, Wave};\nuse std::time::Instant;\n\n/// NeMo Parakeet offline example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to WAV file\n    #[arg(long)]\n    wav: String,\n\n    /// Path to encoder ONNX model\n    #[arg(long)]\n    encoder: String,\n\n    /// Path to decoder ONNX model\n    #[arg(long)]\n    decoder: String,\n\n    /// Path to joiner ONNX model\n    #[arg(long)]\n    joiner: String,\n\n    /// Path to tokens file\n    #[arg(long)]\n    tokens: String,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 2)]\n    num_threads: i32,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let wave = Wave::read(&args.wav).expect(\"Failed to read WAV file\");\n    let audio_duration = wave.samples().len() as f64 / wave.sample_rate() as f64;\n\n    // Create default recognizer config\n    let mut recognizer_config = OfflineRecognizerConfig::default();\n\n    // Set the transducer model\n    recognizer_config.model_config.transducer = OfflineTransducerModelConfig {\n        encoder: Some(args.encoder.clone()),\n        decoder: Some(args.decoder.clone()),\n        joiner: Some(args.joiner.clone()),\n    };\n\n    recognizer_config.model_config.tokens = Some(args.tokens.clone());\n    recognizer_config.model_config.provider = Some(args.provider.clone());\n    recognizer_config.model_config.debug = args.debug;\n    recognizer_config.model_config.num_threads = 
args.num_threads;\n\n    // Measure recognizer creation time\n    println!(\"Creating recognizer ...\");\n    let start_creation = Instant::now();\n    let recognizer =\n        OfflineRecognizer::create(&recognizer_config).expect(\"Failed to create OfflineRecognizer\");\n    let creation_elapsed = start_creation.elapsed().as_secs_f64();\n    println!(\"Recognizer created in {:.3} seconds.\", creation_elapsed);\n\n    let stream = recognizer.create_stream();\n\n    // Measure recognition time\n    let start_recognition = Instant::now();\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    recognizer.decode(&stream);\n    let recognition_elapsed = start_recognition.elapsed().as_secs_f64();\n\n    // Get recognition result\n    if let Some(result) = stream.get_result() {\n        println!(\"Decoded text: {}\", result.text);\n\n        let total_elapsed = creation_elapsed + recognition_elapsed;\n        let rtf = recognition_elapsed / audio_duration;\n\n        println!(\"\\n=== Performance Summary ===\");\n        println!(\"Audio duration          : {:.3} seconds\", audio_duration);\n        println!(\"Recognizer creation time: {:.3} seconds\", creation_elapsed);\n        println!(\n            \"Recognition time        : {:.3} seconds\",\n            recognition_elapsed\n        );\n        println!(\"Total elapsed time      : {:.3} seconds\", total_elapsed);\n\n        println!(\n            \"Real-Time Factor (RTF)  : {:.3} (recognition_elapsed / audio_duration = {:.3} / {:.3})\",\n            rtf, recognition_elapsed, audio_duration\n        );\n\n        println!(\n            \"Number of threads       : {}\",\n            recognizer_config.model_config.num_threads\n        );\n    } else {\n        eprintln!(\"Failed to get recognition result\");\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/offline_punctuation.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use sherpa-onnx's Rust API for offline\n// punctuation.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{OfflinePunctuation, OfflinePunctuationConfig, OfflinePunctuationModelConfig};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    ct_transformer: String,\n\n    #[arg(long, default_value_t = 1)]\n    num_threads: i32,\n\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n    let punct = OfflinePunctuation::create(&OfflinePunctuationConfig {\n        model: OfflinePunctuationModelConfig {\n            ct_transformer: Some(args.ct_transformer),\n            num_threads: args.num_threads,\n            provider: Some(args.provider),\n            debug: args.debug,\n        },\n    })\n    .ok_or_else(|| anyhow::anyhow!(\"Failed to create OfflinePunctuation\"))?;\n\n    let texts = [\n        \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n        \"我们都是木头人不会说话不会动\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n    ];\n\n    println!(\"----------\");\n    for text in texts {\n        let out = punct\n            .add_punctuation(text)\n            .ok_or_else(|| anyhow::anyhow!(\"Failed to add punctuation\"))?;\n        println!(\"Input text: {text}\");\n        println!(\"Output text: {out}\");\n        println!(\"----------\");\n    }\n\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/offline_speaker_diarization.rs",
    "content": "use sherpa_onnx::{\n    FastClusteringConfig, OfflineSpeakerDiarization, OfflineSpeakerDiarizationConfig,\n    OfflineSpeakerSegmentationModelConfig, OfflineSpeakerSegmentationPyannoteModelConfig,\n    SpeakerEmbeddingExtractorConfig, Wave,\n};\n\nfn main() {\n    let config = OfflineSpeakerDiarizationConfig {\n        segmentation: OfflineSpeakerSegmentationModelConfig {\n            pyannote: OfflineSpeakerSegmentationPyannoteModelConfig {\n                model: Some(\"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\".into()),\n            },\n            ..Default::default()\n        },\n        embedding: SpeakerEmbeddingExtractorConfig {\n            model: Some(\"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\".into()),\n            ..Default::default()\n        },\n        clustering: FastClusteringConfig {\n            num_clusters: 4,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let sd = OfflineSpeakerDiarization::create(&config)\n        .expect(\"Failed to initialize offline speaker diarization\");\n\n    let wave = Wave::read(\"./0-four-speakers-zh.wav\").expect(\"Failed to read wave\");\n\n    assert_eq!(\n        sd.sample_rate(),\n        wave.sample_rate(),\n        \"Unexpected sample rate\"\n    );\n\n    let result = sd\n        .process(wave.samples())\n        .expect(\"Failed to do speaker diarization\");\n    println!(\"Number of speakers: {}\", result.num_speakers());\n    println!(\"Number of segments: {}\", result.num_segments());\n\n    for s in result.sort_by_start_time() {\n        println!(\"{:.3} -- {:.3} speaker_{:02}\", s.start, s.end, s.speaker);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/offline_speech_enhancement_dpdfnet.rs",
    "content": "use clap::Parser;\nuse sherpa_onnx::{\n    write, OfflineSpeechDenoiser, OfflineSpeechDenoiserConfig,\n    OfflineSpeechDenoiserDpdfNetModelConfig, Wave,\n};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    model: String,\n\n    #[arg(long)]\n    input: String,\n\n    #[arg(long)]\n    output: String,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    let config = OfflineSpeechDenoiserConfig {\n        model: sherpa_onnx::OfflineSpeechDenoiserModelConfig {\n            dpdfnet: OfflineSpeechDenoiserDpdfNetModelConfig {\n                model: Some(args.model),\n            },\n            ..Default::default()\n        },\n    };\n\n    let denoiser = OfflineSpeechDenoiser::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create offline DPDFNet denoiser\"))?;\n    let wave =\n        Wave::read(&args.input).ok_or_else(|| anyhow::anyhow!(\"Failed to read {}\", args.input))?;\n\n    let audio = denoiser.run(wave.samples(), wave.sample_rate());\n    anyhow::ensure!(\n        write(&args.output, &audio.samples, audio.sample_rate),\n        \"Failed to save {}\",\n        args.output\n    );\n\n    println!(\"Saved to {}\", args.output);\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/offline_speech_enhancement_gtcrn.rs",
    "content": "use clap::Parser;\nuse sherpa_onnx::{\n    write, OfflineSpeechDenoiser, OfflineSpeechDenoiserConfig,\n    OfflineSpeechDenoiserGtcrnModelConfig, Wave,\n};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    model: String,\n\n    #[arg(long)]\n    input: String,\n\n    #[arg(long)]\n    output: String,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    let config = OfflineSpeechDenoiserConfig {\n        model: sherpa_onnx::OfflineSpeechDenoiserModelConfig {\n            gtcrn: OfflineSpeechDenoiserGtcrnModelConfig {\n                model: Some(args.model),\n            },\n            ..Default::default()\n        },\n    };\n\n    let denoiser = OfflineSpeechDenoiser::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create offline GTCRN denoiser\"))?;\n    let wave =\n        Wave::read(&args.input).ok_or_else(|| anyhow::anyhow!(\"Failed to read {}\", args.input))?;\n\n    let audio = denoiser.run(wave.samples(), wave.sample_rate());\n    anyhow::ensure!(\n        write(&args.output, &audio.samples, audio.sample_rate),\n        \"Failed to save {}\",\n        args.output\n    );\n\n    println!(\"Saved to {}\", args.output);\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/online_punctuation.rs",
    "content": "// Copyright (c) 2026 zengyw\n//\n// This file demonstrates how to use online punctuation with sherpa-onnx's Rust API.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{OnlinePunctuation, OnlinePunctuationConfig, OnlinePunctuationModelConfig};\n\n/// Online punctuation example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to CNN-BiLSTM ONNX model\n    #[arg(long)]\n    cnn_bilstm: String,\n\n    /// Path to BPE vocabulary file\n    #[arg(long)]\n    bpe_vocab: String,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 1)]\n    num_threads: i32,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    let config = OnlinePunctuationConfig {\n        model: OnlinePunctuationModelConfig {\n            cnn_bilstm: Some(args.cnn_bilstm),\n            bpe_vocab: Some(args.bpe_vocab),\n            num_threads: args.num_threads,\n            provider: Some(args.provider),\n            debug: args.debug,\n            ..Default::default()\n        },\n    };\n\n    let punct = OnlinePunctuation::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create OnlinePunctuation\"))?;\n\n    let texts = [\n        \"how are you i am fine thank you\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n    ];\n\n    println!(\"----------\");\n    for text in texts {\n        let out = punct\n            .add_punctuation(text)\n            .ok_or_else(|| anyhow::anyhow!(\"Failed to add punctuation\"))?;\n\n        println!(\"Input text: {text}\");\n        println!(\"Output text: {out}\");\n        println!(\"----------\");\n    }\n\n    
Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/pocket_tts.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Pocket TTS with sherpa-onnx's Rust API\n// for offline text-to-speech with zero-shot voice cloning.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsPocketModelConfig, Wave,\n};\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            pocket: OfflineTtsPocketModelConfig {\n                lm_flow: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\".into()),\n                lm_main: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\".into()),\n                encoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\".into()),\n                decoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\".into()),\n                text_conditioner: Some(\n                    \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\".into(),\n                ),\n                vocab_json: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\".into()),\n                token_scores_json: Some(\n                    \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\".into(),\n                ),\n                voice_embedding_cache_capacity: 50,\n            },\n            num_threads: 2,\n            debug: false, // set to true to see verbose logs\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"Today as always, men fall into two groups: slaves and free men. 
Whoever \\\n        does not have two-thirds of his day for himself, is a slave, whatever \\\n        he may be: a statesman, a businessman, an official, or a scholar. \\\n        Friends fell out often because life was changing so fast. The easiest \\\n        thing in the world was to lose touch with someone.\";\n\n    // Read reference audio for zero-shot voice cloning\n    let reference_audio_file = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\";\n    let wave = Wave::read(reference_audio_file).expect(\"Failed to read reference audio\");\n\n    let mut extra = HashMap::new();\n    extra.insert(\n        \"max_reference_audio_len\".to_string(),\n        serde_json::json!(10.0),\n    );\n    extra.insert(\"seed\".to_string(), serde_json::json!(42));\n\n    let gen_config = GenerationConfig {\n        speed: 1.0,\n        reference_audio: Some(wave.samples().to_vec()),\n        reference_sample_rate: wave.sample_rate(),\n        extra: Some(extra),\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-pocket-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", 
filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/sense_voice.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use SenseVoice with sherpa-onnx's Rust API\n// for offline speech recognition.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{OfflineRecognizer, OfflineRecognizerConfig, OfflineSenseVoiceModelConfig, Wave};\nuse std::time::Instant;\n\n/// SenseVoice offline example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to WAV file\n    #[arg(long)]\n    wav: String,\n\n    /// Path to SenseVoice ONNX model\n    #[arg(long)]\n    model: String,\n\n    /// Path to tokens file\n    #[arg(long)]\n    tokens: String,\n\n    /// Language, e.g., \"auto\", \"en\", \"zh\"\n    #[arg(long, default_value = \"auto\")]\n    language: String,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n\n    /// Enable inverse text normalization\n    #[arg(long, default_value_t = true)]\n    use_itn: bool,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 2)]\n    num_threads: i32,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let wave = Wave::read(&args.wav).expect(\"Failed to read WAV file\");\n    let audio_duration = wave.samples().len() as f64 / wave.sample_rate() as f64;\n\n    let mut recognizer_config = OfflineRecognizerConfig::default();\n\n    recognizer_config.model_config.sense_voice = OfflineSenseVoiceModelConfig {\n        model: Some(args.model.clone()),\n        language: Some(args.language.clone()),\n        use_itn: args.use_itn,\n    };\n\n    recognizer_config.model_config.tokens = Some(args.tokens.clone());\n    recognizer_config.model_config.provider = Some(args.provider.clone());\n    recognizer_config.model_config.debug = args.debug;\n    recognizer_config.model_config.num_threads = args.num_threads;\n\n    // Measure 
recognizer creation time\n    println!(\"Creating recognizer ...\");\n    let start_creation = Instant::now();\n    let recognizer =\n        OfflineRecognizer::create(&recognizer_config).expect(\"Failed to create OfflineRecognizer\");\n    let creation_elapsed = start_creation.elapsed().as_secs_f64();\n    println!(\"Recognizer created in {:.3} seconds.\", creation_elapsed);\n\n    let stream = recognizer.create_stream();\n\n    // Measure recognition time\n    let start_recognition = Instant::now();\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    recognizer.decode(&stream);\n    let recognition_elapsed = start_recognition.elapsed().as_secs_f64();\n\n    // Get recognition result\n    if let Some(result) = stream.get_result() {\n        println!(\"Decoded text: {}\", result.text);\n\n        let total_time = creation_elapsed + recognition_elapsed;\n        let rtf = recognition_elapsed / audio_duration;\n\n        println!(\"\\n=== Performance Summary ===\");\n        println!(\"Audio duration          : {:.3} seconds\", audio_duration);\n        println!(\"Recognizer creation time: {:.3} seconds\", creation_elapsed);\n        println!(\n            \"Recognition time        : {:.3} seconds\",\n            recognition_elapsed\n        );\n        println!(\"Total elapsed time      : {:.3} seconds\", total_time);\n\n        // Detailed RTF computation log\n        println!(\n            \"Real-Time Factor (RTF)  : {:.3} (recognition_elapsed / audio_duration = {:.3} / {:.3})\",\n            rtf, recognition_elapsed, audio_duration\n        );\n\n        println!(\n            \"Number of threads       : {}\",\n            recognizer_config.model_config.num_threads\n        );\n    } else {\n        eprintln!(\"Failed to get recognition result\");\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/silero_vad_remove_silence.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use silero VAD with sherpa-onnx's\n// Rust API to remove non-speech segments and save speech-only audio.\n//\n// See ../README.md for how to run it\n\nuse clap::Parser;\nuse sherpa_onnx::{self, SileroVadModelConfig, VadModelConfig, VoiceActivityDetector, Wave};\n\n/// Simple VAD example: remove non-speech segments from a WAV file\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to input WAV file\n    #[arg(long)]\n    input: String,\n\n    /// Path to output WAV file\n    #[arg(long)]\n    output: String,\n\n    /// Path to Silero VAD ONNX model\n    #[arg(long)]\n    silero_vad_model: String,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    // Read WAV file\n    let wave = Wave::read(&args.input)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to read WAV file: {}\", &args.input))?;\n    let sample_rate = wave.sample_rate();\n    let input_num_samples = wave.num_samples();\n    let input_duration = input_num_samples as f32 / sample_rate as f32;\n\n    println!(\n        \"Input WAV: sample rate: {}, num samples: {}, duration: {:.2}s\",\n        sample_rate, input_num_samples, input_duration\n    );\n\n    // Configure VAD\n    let mut silero_config = SileroVadModelConfig::default();\n    silero_config.model = Some(args.silero_vad_model);\n\n    // You can tune the values below\n    silero_config.threshold = 0.5;\n    silero_config.min_silence_duration = 0.25;\n    silero_config.min_speech_duration = 0.25;\n    silero_config.max_speech_duration = 5.0;\n\n    let vad_config = VadModelConfig {\n        silero_vad: silero_config,\n        ten_vad: Default::default(),\n        sample_rate,\n        num_threads: 1,\n        provider: Some(\"cpu\".to_string()),\n        debug: false,\n    };\n\n    let vad = VoiceActivityDetector::create(&vad_config, 30.0)\n        .expect(\"Failed to 
create VoiceActivityDetector\");\n\n    let mut speech_samples = Vec::new();\n    const WINDOW_SIZE: usize = 512;\n\n    for chunk in wave.samples().chunks(WINDOW_SIZE) {\n        vad.accept_waveform(chunk);\n\n        while let Some(seg) = vad.front() {\n            speech_samples.extend_from_slice(seg.samples());\n            vad.pop();\n        }\n    }\n\n    // Flush the detector and collect any segments still buffered\n    vad.flush();\n    while let Some(seg) = vad.front() {\n        speech_samples.extend_from_slice(seg.samples());\n        vad.pop();\n    }\n\n    // Write speech-only samples to output WAV\n    let ok = sherpa_onnx::write(&args.output, &speech_samples, sample_rate);\n    if ok {\n        println!(\"Saved speech-only audio to {}\", args.output);\n    } else {\n        eprintln!(\"Failed to save speech-only audio to {}\", args.output);\n    }\n\n    // Summary\n    let output_num_samples = speech_samples.len();\n    let output_duration = output_num_samples as f32 / sample_rate as f32;\n    println!(\"\\n=== Summary ===\");\n    println!(\n        \"Input:  sample rate = {}, samples = {}, duration = {:.2}s\",\n        sample_rate, input_num_samples, input_duration\n    );\n    println!(\n        \"Output: sample rate = {}, samples = {}, duration = {:.2}s\",\n        sample_rate, output_num_samples, output_duration\n    );\n\n    // Avoid printing NaN when the input file contains no samples\n    let removed_percent = if input_duration > 0.0 {\n        100.0 * (1.0 - output_duration / input_duration)\n    } else {\n        0.0\n    };\n    println!(\"Removed non-speech: {:.2}% of input\", removed_percent);\n\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/speaker_embedding_cosine_similarity.rs",
    "content": "use sherpa_onnx::{SpeakerEmbeddingExtractor, SpeakerEmbeddingExtractorConfig, Wave};\n\nfn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {\n    assert_eq!(a.len(), b.len(), \"Vectors must have the same length\");\n\n    let mut dot = 0.0_f32;\n    let mut sum_a = 0.0_f32;\n    let mut sum_b = 0.0_f32;\n\n    for (&x, &y) in a.iter().zip(b.iter()) {\n        dot += x * y;\n        sum_a += x * x;\n        sum_b += y * y;\n    }\n\n    let mag_a = sum_a.sqrt();\n    let mag_b = sum_b.sqrt();\n    if mag_a > 0.0 && mag_b > 0.0 {\n        dot / (mag_a * mag_b)\n    } else {\n        0.0\n    }\n}\n\nfn compute_embedding(extractor: &SpeakerEmbeddingExtractor, wave_filename: &str) -> Vec<f32> {\n    let wave = Wave::read(wave_filename)\n        .unwrap_or_else(|| panic!(\"Failed to read {}\", wave_filename));\n    let stream = extractor.create_stream().expect(\"Failed to create stream\");\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    stream.input_finished();\n\n    if !extractor.is_ready(&stream) {\n        panic!(\"{} is too short\", wave_filename);\n    }\n\n    extractor\n        .compute(&stream)\n        .unwrap_or_else(|| panic!(\"Failed to compute embedding for {}\", wave_filename))\n}\n\nfn main() {\n    let config = SpeakerEmbeddingExtractorConfig {\n        model: Some(\"./wespeaker_zh_cnceleb_resnet34.onnx\".into()),\n        num_threads: 1,\n        debug: true,\n        provider: Some(\"cpu\".into()),\n    };\n\n    let extractor = SpeakerEmbeddingExtractor::create(&config)\n        .expect(\"Failed to create SpeakerEmbeddingExtractor\");\n\n    let embedding1 = compute_embedding(&extractor, \"./fangjun-sr-1.wav\");\n    let embedding2 = compute_embedding(&extractor, \"./fangjun-sr-2.wav\");\n    let embedding3 = compute_embedding(&extractor, \"./leijun-sr-1.wav\");\n\n    let score12 = cosine_similarity(&embedding1, &embedding2);\n    let score13 = cosine_similarity(&embedding1, &embedding3);\n    let score23 = 
cosine_similarity(&embedding2, &embedding3);\n\n    println!(\"Score between spk1 and spk2: {}\", score12);\n    println!(\"Score between spk1 and spk3: {}\", score13);\n    println!(\"Score between spk2 and spk3: {}\", score23);\n}\n"
  },
  {
    "path": "rust-api-examples/examples/speaker_embedding_extractor.rs",
    "content": "use sherpa_onnx::{SpeakerEmbeddingExtractor, SpeakerEmbeddingExtractorConfig, Wave};\n\nfn main() {\n    let config = SpeakerEmbeddingExtractorConfig {\n        model: Some(\"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\".into()),\n        num_threads: 1,\n        debug: true,\n        provider: Some(\"cpu\".into()),\n    };\n\n    let extractor = SpeakerEmbeddingExtractor::create(&config)\n        .expect(\"Failed to create SpeakerEmbeddingExtractor\");\n    println!(\"Embedding dim: {}\", extractor.dim());\n\n    let wave = Wave::read(\"./sr-data/test/fangjun-test-sr-1.wav\").expect(\"Failed to read wave\");\n    let stream = extractor.create_stream().expect(\"Failed to create stream\");\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    stream.input_finished();\n\n    if !extractor.is_ready(&stream) {\n        panic!(\"Input wave is too short\");\n    }\n\n    let embedding = extractor.compute(&stream).expect(\"Failed to compute embedding\");\n    println!(\"Computed embedding with {} values\", embedding.len());\n\n    let n = usize::min(10, embedding.len());\n    println!(\"First {} values: {:?}\", n, &embedding[..n]);\n}\n"
  },
  {
    "path": "rust-api-examples/examples/speaker_embedding_manager.rs",
    "content": "use sherpa_onnx::{\n    SpeakerEmbeddingExtractor, SpeakerEmbeddingExtractorConfig, SpeakerEmbeddingManager, Wave,\n};\n\nfn compute_embedding(extractor: &SpeakerEmbeddingExtractor, filename: &str) -> Vec<f32> {\n    let wave = Wave::read(filename).unwrap_or_else(|| panic!(\"Failed to read {}\", filename));\n    let stream = extractor.create_stream().expect(\"Failed to create stream\");\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    stream.input_finished();\n\n    if !extractor.is_ready(&stream) {\n        panic!(\"The input wave file {} is too short!\", filename);\n    }\n\n    extractor\n        .compute(&stream)\n        .unwrap_or_else(|| panic!(\"Failed to compute embedding for {}\", filename))\n}\n\nfn main() {\n    let config = SpeakerEmbeddingExtractorConfig {\n        model: Some(\"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\".into()),\n        num_threads: 1,\n        debug: true,\n        provider: Some(\"cpu\".into()),\n    };\n\n    let extractor = SpeakerEmbeddingExtractor::create(&config)\n        .expect(\"Failed to create SpeakerEmbeddingExtractor\");\n    let manager = SpeakerEmbeddingManager::create(extractor.dim())\n        .expect(\"Failed to create SpeakerEmbeddingManager\");\n\n    let spk1 = vec![\n        compute_embedding(&extractor, \"./sr-data/enroll/fangjun-sr-1.wav\"),\n        compute_embedding(&extractor, \"./sr-data/enroll/fangjun-sr-2.wav\"),\n        compute_embedding(&extractor, \"./sr-data/enroll/fangjun-sr-3.wav\"),\n    ];\n    let spk2 = vec![\n        compute_embedding(&extractor, \"./sr-data/enroll/leijun-sr-1.wav\"),\n        compute_embedding(&extractor, \"./sr-data/enroll/leijun-sr-2.wav\"),\n    ];\n\n    assert!(manager.add_list(\"fangjun\", &spk1));\n    assert!(manager.contains(\"fangjun\"));\n\n    let flattened_spk2: Vec<f32> = spk2.iter().flat_map(|v| v.iter().copied()).collect();\n    assert!(manager.add_list_flattened(\"leijun\", &flattened_spk2));\n    
assert!(manager.contains(\"leijun\"));\n    assert_eq!(manager.num_speakers(), 2);\n\n    println!(\"Registered speakers: {:?}\", manager.get_all_speakers());\n\n    let v1 = compute_embedding(&extractor, \"./sr-data/test/fangjun-test-sr-1.wav\");\n    let v2 = compute_embedding(&extractor, \"./sr-data/test/leijun-test-sr-1.wav\");\n    let v3 = compute_embedding(&extractor, \"./sr-data/test/liudehua-test-sr-1.wav\");\n\n    let threshold = 0.6;\n\n    println!(\n        \"fangjun-test-sr-1.wav => {}\",\n        manager.search(&v1, threshold).unwrap_or_else(|| \"unknown\".to_string())\n    );\n    println!(\n        \"leijun-test-sr-1.wav => {}\",\n        manager.search(&v2, threshold).unwrap_or_else(|| \"unknown\".to_string())\n    );\n    println!(\n        \"liudehua-test-sr-1.wav => {}\",\n        manager.search(&v3, threshold).unwrap_or_else(|| \"unknown\".to_string())\n    );\n\n    let best_matches = manager.get_best_matches(&v1, threshold, 2);\n    println!(\"Best matches for fangjun-test-sr-1.wav: {:?}\", best_matches);\n\n    println!(\"fangjun verification for v1: {}\", manager.verify(\"fangjun\", &v1, threshold));\n    println!(\"fangjun verification for v2: {}\", manager.verify(\"fangjun\", &v2, threshold));\n\n    assert!(manager.remove(\"fangjun\"));\n    println!(\"After removing fangjun: {:?}\", manager.get_all_speakers());\n\n    assert!(manager.remove(\"leijun\"));\n    println!(\"After removing leijun: {:?}\", manager.get_all_speakers());\n}\n"
  },
  {
    "path": "rust-api-examples/examples/spoken_language_identification.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use sherpa-onnx's Rust API for spoken language\n// identification.\n//\n// See ../README.md for how to run it.\n\nuse clap::Parser;\nuse sherpa_onnx::{\n    SpokenLanguageIdentification, SpokenLanguageIdentificationConfig,\n    SpokenLanguageIdentificationWhisperConfig, Wave,\n};\nuse std::time::Instant;\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    wav: String,\n\n    #[arg(long)]\n    whisper_encoder: String,\n\n    #[arg(long)]\n    whisper_decoder: String,\n\n    #[arg(long, default_value_t = 0)]\n    tail_paddings: i32,\n\n    #[arg(long, default_value_t = 1)]\n    num_threads: i32,\n\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n    let wave = Wave::read(&args.wav).ok_or_else(|| anyhow::anyhow!(\"Failed to read WAV file\"))?;\n    let audio_duration = wave.num_samples() as f64 / wave.sample_rate() as f64;\n\n    let config = SpokenLanguageIdentificationConfig {\n        whisper: SpokenLanguageIdentificationWhisperConfig {\n            encoder: Some(args.whisper_encoder),\n            decoder: Some(args.whisper_decoder),\n            tail_paddings: args.tail_paddings,\n        },\n        num_threads: args.num_threads,\n        provider: Some(args.provider),\n        debug: args.debug,\n    };\n\n    let slid = SpokenLanguageIdentification::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create SpokenLanguageIdentification\"))?;\n\n    let stream = slid.create_stream();\n    let start = Instant::now();\n    stream.accept_waveform(wave.sample_rate(), wave.samples());\n    let result = slid\n        .compute(&stream)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to compute spoken language identification result\"))?;\n    
let elapsed = start.elapsed().as_secs_f64();\n\n    println!(\"File: {}\", args.wav);\n    println!(\"Detected language: {}\", result.lang);\n    println!(\"Elapsed seconds: {:.3}\", elapsed);\n    println!(\"Audio duration in seconds: {:.3}\", audio_duration);\n    println!(\"RTF: {:.3}/{:.3} = {:.3}\", elapsed, audio_duration, elapsed / audio_duration);\n\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/streaming_speech_enhancement_dpdfnet.rs",
    "content": "use clap::Parser;\nuse sherpa_onnx::{\n    write, OfflineSpeechDenoiserDpdfNetModelConfig, OnlineSpeechDenoiser, OnlineSpeechDenoiserConfig,\n    Wave,\n};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    model: String,\n\n    #[arg(long)]\n    input: String,\n\n    #[arg(long)]\n    output: String,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    let config = OnlineSpeechDenoiserConfig {\n        model: sherpa_onnx::OfflineSpeechDenoiserModelConfig {\n            dpdfnet: OfflineSpeechDenoiserDpdfNetModelConfig {\n                model: Some(args.model),\n            },\n            ..Default::default()\n        },\n    };\n\n    let denoiser = OnlineSpeechDenoiser::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create streaming DPDFNet denoiser\"))?;\n    let wave =\n        Wave::read(&args.input).ok_or_else(|| anyhow::anyhow!(\"Failed to read {}\", args.input))?;\n\n    let frame_shift = denoiser.frame_shift_in_samples() as usize;\n    let mut enhanced = Vec::new();\n\n    for chunk in wave.samples().chunks(frame_shift.max(1)) {\n        let audio = denoiser.run(chunk, wave.sample_rate());\n        enhanced.extend_from_slice(&audio.samples);\n    }\n\n    enhanced.extend_from_slice(&denoiser.flush().samples);\n\n    anyhow::ensure!(\n        write(&args.output, &enhanced, denoiser.sample_rate()),\n        \"Failed to save {}\",\n        args.output\n    );\n\n    println!(\"Saved to {}\", args.output);\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/streaming_speech_enhancement_gtcrn.rs",
    "content": "use clap::Parser;\nuse sherpa_onnx::{\n    write, OfflineSpeechDenoiserGtcrnModelConfig, OnlineSpeechDenoiser, OnlineSpeechDenoiserConfig,\n    Wave,\n};\n\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    model: String,\n\n    #[arg(long)]\n    input: String,\n\n    #[arg(long)]\n    output: String,\n}\n\nfn main() -> anyhow::Result<()> {\n    let args = Args::parse();\n\n    let config = OnlineSpeechDenoiserConfig {\n        model: sherpa_onnx::OfflineSpeechDenoiserModelConfig {\n            gtcrn: OfflineSpeechDenoiserGtcrnModelConfig {\n                model: Some(args.model),\n            },\n            ..Default::default()\n        },\n    };\n\n    let denoiser = OnlineSpeechDenoiser::create(&config)\n        .ok_or_else(|| anyhow::anyhow!(\"Failed to create streaming GTCRN denoiser\"))?;\n    let wave =\n        Wave::read(&args.input).ok_or_else(|| anyhow::anyhow!(\"Failed to read {}\", args.input))?;\n\n    let frame_shift = denoiser.frame_shift_in_samples() as usize;\n    let mut enhanced = Vec::new();\n\n    for chunk in wave.samples().chunks(frame_shift.max(1)) {\n        let audio = denoiser.run(chunk, wave.sample_rate());\n        enhanced.extend_from_slice(&audio.samples);\n    }\n\n    enhanced.extend_from_slice(&denoiser.flush().samples);\n\n    anyhow::ensure!(\n        write(&args.output, &enhanced, denoiser.sample_rate()),\n        \"Failed to save {}\",\n        args.output\n    );\n\n    println!(\"Saved to {}\", args.output);\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/streaming_zipformer.rs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file demonstrates how to use streaming Zipformer with sherpa-onnx's\n// Rust API for speech recognition.\n//\n// See ../README.md for how to run it\n//\n// Note that even if we use a wave file as an example, this model supports\n// real-time streaming speech recognition.\n// See ./streaming_zipformer_microphone.rs for how to do real-time\n// streaming speech recognition from a microphone.\n\nuse clap::Parser;\nuse sherpa_onnx::{OnlineRecognizer, OnlineRecognizerConfig, Wave};\n\n/// Simple streaming Zipformer example\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    /// Path to WAV file\n    #[arg(long)]\n    wav: String,\n\n    /// Path to encoder ONNX model\n    #[arg(long)]\n    encoder: String,\n\n    /// Path to decoder ONNX model\n    #[arg(long)]\n    decoder: String,\n\n    /// Path to joiner ONNX model\n    #[arg(long)]\n    joiner: String,\n\n    /// Path to tokens file\n    #[arg(long)]\n    tokens: String,\n\n    /// Provider (default: cpu)\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n\n    /// Enable debug logs\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let wave = Wave::read(&args.wav).expect(\"Failed to read WAV file\");\n\n    let mut recognizer_config = OnlineRecognizerConfig::default();\n    recognizer_config.model_config.transducer.encoder = Some(args.encoder.clone());\n    recognizer_config.model_config.transducer.decoder = Some(args.decoder.clone());\n    recognizer_config.model_config.transducer.joiner = Some(args.joiner.clone());\n    recognizer_config.model_config.tokens = Some(args.tokens.clone());\n    recognizer_config.model_config.provider = Some(args.provider.clone());\n    recognizer_config.enable_endpoint = true;\n    recognizer_config.model_config.debug = args.debug;\n    recognizer_config.decoding_method = 
Some(\"greedy_search\".to_string());\n\n    let recognizer =\n        OnlineRecognizer::create(&recognizer_config).expect(\"Failed to create OnlineRecognizer\");\n\n    let stream = recognizer.create_stream();\n    let mut segment_id = 0;\n\n    // Number of samples fed to the recognizer per call; any positive value works\n    const CHUNK_SIZE: usize = 3200;\n\n    println!(\n        \"Sample rate: {}, num samples: {}, duration: {:.2}s\",\n        wave.sample_rate(),\n        wave.num_samples(),\n        wave.num_samples() as f32 / wave.sample_rate() as f32\n    );\n\n    // Feed the audio chunk by chunk to simulate streaming input\n    for chunk in wave.samples().chunks(CHUNK_SIZE) {\n        stream.accept_waveform(wave.sample_rate(), chunk);\n\n        while recognizer.is_ready(&stream) {\n            recognizer.decode(&stream);\n\n            if let Some(result) = recognizer.get_result(&stream) {\n                if !result.text.is_empty() {\n                    println!(\"Segment {}: {}\", segment_id, result.text);\n                }\n            }\n\n            if recognizer.is_endpoint(&stream) {\n                recognizer.reset(&stream);\n                segment_id += 1;\n            }\n        }\n    }\n\n    // Append ~0.3 s of silence as tail padding before signalling end of input\n    let tail_padding_len = (wave.sample_rate() as f32 * 0.3).round() as usize;\n    let tail_padding = vec![0.0f32; tail_padding_len];\n    stream.accept_waveform(wave.sample_rate(), &tail_padding);\n\n    stream.input_finished();\n\n    // Decode whatever is still pending after the input is finished\n    while recognizer.is_ready(&stream) {\n        recognizer.decode(&stream);\n        if let Some(result) = recognizer.get_result(&stream) {\n            if !result.text.is_empty() {\n                println!(\"Segment {}: {}\", segment_id, result.text);\n            }\n        }\n    }\n\n    println!(\"Transcription finished.\");\n}\n"
  },
  {
    "path": "rust-api-examples/examples/streaming_zipformer_microphone.rs",
    "content": "// Copyright (c)  2026  Xiaomi Corporation\n//\n// This file demonstrates how to use streaming Zipformer with sherpa-onnx's\n// Rust API for real-time streaming speech recognition with a microphone.\n//\n// See ../README.md for how to run it\n//\n// See ./streaming_zipformer.rs for how to recognize a wave file.\n\nuse anyhow::Result;\nuse clap::Parser;\nuse cpal::traits::{DeviceTrait, HostTrait, StreamTrait};\nuse cpal::SampleFormat;\nuse sherpa_onnx::{DisplayManager, OnlineRecognizer, OnlineRecognizerConfig};\nuse std::sync::mpsc;\n\n/// Command-line arguments\n#[derive(Parser, Debug)]\n#[command(author, version, about, long_about = None)]\nstruct Args {\n    #[arg(long)]\n    encoder: String,\n    #[arg(long)]\n    decoder: String,\n    #[arg(long)]\n    joiner: String,\n    #[arg(long)]\n    tokens: String,\n    #[arg(long, default_value = \"cpu\")]\n    provider: String,\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n    #[arg(long, default_value_t = 3200)]\n    chunk_size: usize,\n}\n\n/// List input devices and return the default one\nfn list_input_devices(host: &cpal::Host) -> Result<cpal::Device> {\n    let default_input = host.default_input_device();\n    let default_name = default_input.as_ref().map(|d| d.name().unwrap_or_default());\n\n    println!(\"Available input devices:\");\n    for device in host.input_devices()? 
{\n        let name = device.name().unwrap_or(\"<unknown>\".to_string());\n        let mark = if Some(&name) == default_name.as_ref() {\n            \"*\"\n        } else {\n            \" \"\n        };\n        println!(\"{} {}\", mark, name);\n    }\n\n    let device = default_input.ok_or_else(|| anyhow::anyhow!(\"No default input device\"))?;\n\n    println!(\"\\nUsing default device: {}\", device.name()?);\n    Ok(device)\n}\n\n/// Create and configure the OnlineRecognizer\nfn setup_recognizer(args: &Args) -> OnlineRecognizer {\n    let mut config = OnlineRecognizerConfig::default();\n    config.model_config.transducer.encoder = Some(args.encoder.clone());\n    config.model_config.transducer.decoder = Some(args.decoder.clone());\n    config.model_config.transducer.joiner = Some(args.joiner.clone());\n    config.model_config.tokens = Some(args.tokens.clone());\n    config.model_config.provider = Some(args.provider.clone());\n    config.model_config.debug = args.debug;\n    config.enable_endpoint = true;\n    config.decoding_method = Some(\"greedy_search\".to_string());\n\n    OnlineRecognizer::create(&config).expect(\"Failed to create OnlineRecognizer\")\n}\n\n/// Build the audio input stream (producer)\nfn build_input_stream(device: &cpal::Device, tx: mpsc::Sender<Vec<f32>>) -> Result<cpal::Stream> {\n    let supported = device.default_input_config()?;\n    let config = supported.config();\n    let sample_format = supported.sample_format();\n    let channels = config.channels as usize;\n\n    let err_fn = |err| eprintln!(\"Audio stream error: {:?}\", err);\n\n    println!(\n        \"Input format: {:?}, channels: {}, sample_rate: {}\",\n        sample_format, channels, config.sample_rate.0\n    );\n\n    let stream = match sample_format {\n        SampleFormat::F32 => device.build_input_stream(\n            &config,\n            move |data: &[f32], _| {\n                if data.is_empty() {\n                    return;\n                }\n\n                let 
mono: Vec<f32> = data\n                    .chunks(channels)\n                    .map(|frame| {\n                        let sum: f32 = frame.iter().copied().sum();\n                        sum / channels as f32\n                    })\n                    .collect();\n                let _ = tx.send(mono);\n            },\n            err_fn,\n            None,\n        )?,\n\n        SampleFormat::I16 => device.build_input_stream(\n            &config,\n            move |data: &[i16], _| {\n                if data.is_empty() {\n                    return;\n                }\n\n                let mono: Vec<f32> = data\n                    .chunks(channels)\n                    .map(|frame| {\n                        // Divide by 32768 so that i16::MIN maps to exactly -1.0,\n                        // keeping samples in [-1.0, 1.0) like the U16 branch below\n                        let sum: f32 = frame.iter().map(|&s| s as f32 / 32768.0).sum();\n                        sum / channels as f32\n                    })\n                    .collect();\n\n                let _ = tx.send(mono);\n            },\n            err_fn,\n            None,\n        )?,\n\n        SampleFormat::U16 => device.build_input_stream(\n            &config,\n            move |data: &[u16], _| {\n                if data.is_empty() {\n                    return;\n                }\n\n                let mono: Vec<f32> = data\n                    .chunks(channels)\n                    .map(|frame| {\n                        let sum: f32 = frame\n                            .iter()\n                            .map(|&s| {\n                                let centered = s as f32 - 32768.0;\n                                centered / 32768.0\n                            })\n                            .sum();\n                        sum / channels as f32\n                    })\n                    .collect();\n\n                let _ = tx.send(mono);\n            },\n            err_fn,\n            None,\n        )?,\n\n        other => anyhow::bail!(\"Unsupported sample format: {:?}\", other),\n    };\n\n    Ok(stream)\n}\n\n/// Main 
recognition loop (consumer)\nfn run_recognition_loop(\n    rx: mpsc::Receiver<Vec<f32>>,\n    recognizer: &OnlineRecognizer,\n    stream: &mut sherpa_onnx::OnlineStream,\n    chunk_size: usize,\n    sample_rate: i32,\n) {\n    let mut display = DisplayManager::new();\n    let mut buffer = Vec::<f32>::new();\n\n    loop {\n        match rx.recv() {\n            Ok(samples) => {\n                buffer.extend_from_slice(&samples);\n            }\n            Err(_) => {\n                println!(\"\\nAudio stream closed. Exiting.\");\n                break;\n            }\n        }\n\n        while buffer.len() >= chunk_size {\n            let chunk: Vec<f32> = buffer.drain(..chunk_size).collect();\n            stream.accept_waveform(sample_rate, &chunk);\n\n            while recognizer.is_ready(&stream) {\n                recognizer.decode(&stream);\n\n                if let Some(result) = recognizer.get_result(&stream) {\n                    let text = result.text;\n                    if !text.is_empty() {\n                        display.update_text(&text);\n                    }\n                }\n\n                if recognizer.is_endpoint(&stream) {\n                    if let Some(result) = recognizer.get_result(&stream) {\n                        if !result.text.is_empty() {\n                            display.finalize_sentence();\n                        }\n                    }\n                    recognizer.reset(&stream);\n                }\n            }\n        }\n\n        display.render();\n    }\n}\n\nfn main() -> Result<()> {\n    let args = Args::parse();\n    let host = cpal::default_host();\n\n    let device = list_input_devices(&host)?;\n\n    let supported = device.default_input_config()?;\n    let sample_rate = supported.sample_rate().0 as i32;\n\n    let recognizer = setup_recognizer(&args);\n    let mut stream = recognizer.create_stream();\n\n    let (tx, rx) = mpsc::channel::<Vec<f32>>();\n    let audio_stream = 
build_input_stream(&device, tx)?;\n    audio_stream.play()?;\n\n    println!(\"Streaming microphone ASR... Press Ctrl+C to stop.\");\n\n    run_recognition_loop(rx, &recognizer, &mut stream, args.chunk_size, sample_rate);\n\n    Ok(())\n}\n"
  },
  {
    "path": "rust-api-examples/examples/supertonic_tts.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use Supertonic TTS with sherpa-onnx's Rust API\n// for offline text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsSupertonicModelConfig,\n};\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            supertonic: OfflineTtsSupertonicModelConfig {\n                duration_predictor: Some(\n                    \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx\"\n                        .into(),\n                ),\n                text_encoder: Some(\n                    \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\".into(),\n                ),\n                vector_estimator: Some(\n                    \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\"\n                        .into(),\n                ),\n                vocoder: Some(\n                    \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\".into(),\n                ),\n                tts_json: Some(\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\".into()),\n                unicode_indexer: Some(\n                    \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\".into(),\n                ),\n                voice_style: Some(\"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\".into()),\n            },\n            num_threads: 2,\n            debug: false, // set to true to see verbose logs\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"Today as always, men fall 
into two groups: slaves and free men. Whoever \\\n        does not have two-thirds of his day for himself, is a slave, whatever \\\n        he may be: a statesman, a businessman, an official, or a scholar.\";\n\n    let mut extra = HashMap::new();\n    extra.insert(\"lang\".to_string(), serde_json::json!(\"en\"));\n\n    let gen_config = GenerationConfig {\n        sid: 6,\n        num_steps: 5,\n        speed: 1.25,\n        extra: Some(extra),\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-supertonic-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/version.rs",
    "content": "use sherpa_onnx;\n\nfn main() {\n    println!(\"Version : {}\", sherpa_onnx::version());\n    println!(\"Git SHA1: {}\", sherpa_onnx::git_sha1());\n    println!(\"Git date: {}\", sherpa_onnx::git_date());\n}\n"
  },
  {
    "path": "rust-api-examples/examples/vits_tts.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use a Piper VITS TTS model with sherpa-onnx's\n// Rust API for offline text-to-speech.\n\nuse clap::Parser;\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsVitsModelConfig,\n};\nuse std::time::Instant;\n\n#[derive(Parser, Debug)]\n#[command(author, version, about)]\nstruct Args {\n    /// Path to the VITS/Piper model\n    #[arg(long)]\n    model: String,\n\n    /// Path to tokens.txt\n    #[arg(long)]\n    tokens: String,\n\n    /// Path to espeak-ng-data\n    #[arg(long)]\n    data_dir: String,\n\n    /// Input text to synthesize\n    #[arg(long)]\n    text: String,\n\n    /// Output wave filename\n    #[arg(long, default_value = \"./generated-vits-rust.wav\")]\n    output: String,\n\n    /// Speaker ID for multi-speaker models\n    #[arg(long, default_value_t = 0)]\n    sid: i32,\n\n    /// Speech speed; larger means faster\n    #[arg(long, default_value_t = 1.0)]\n    speed: f32,\n\n    /// Number of threads\n    #[arg(long, default_value_t = 2)]\n    num_threads: i32,\n\n    /// Show debug logs from sherpa-onnx\n    #[arg(long, default_value_t = false)]\n    debug: bool,\n}\n\nfn main() {\n    let args = Args::parse();\n\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            vits: OfflineTtsVitsModelConfig {\n                model: Some(args.model.clone()),\n                tokens: Some(args.tokens.clone()),\n                noise_scale: 0.667,\n                noise_scale_w: 0.8,\n                length_scale: 1.0,\n                data_dir: Some(args.data_dir.clone()),\n                ..Default::default()\n            },\n            num_threads: args.num_threads,\n            debug: args.debug,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    
println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let gen_config = GenerationConfig {\n        sid: args.sid,\n        speed: args.speed,\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            &args.text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    if audio.save(&args.output) {\n        println!(\"Saved to: {}\", args.output);\n    } else {\n        eprintln!(\"Failed to save {}\", args.output);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/examples/zipvoice_tts.rs",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\n//\n// This file demonstrates how to use ZipVoice TTS with sherpa-onnx's Rust API\n// for offline zero-shot text-to-speech.\n\nuse sherpa_onnx::{\n    GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsZipvoiceModelConfig, Wave,\n};\nuse std::collections::HashMap;\nuse std::time::Instant;\n\nfn main() {\n    let config = OfflineTtsConfig {\n        model: sherpa_onnx::OfflineTtsModelConfig {\n            zipvoice: OfflineTtsZipvoiceModelConfig {\n                tokens: Some(\"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\".into()),\n                encoder: Some(\n                    \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\".into(),\n                ),\n                decoder: Some(\n                    \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\".into(),\n                ),\n                vocoder: Some(\"./vocos_24khz.onnx\".into()),\n                data_dir: Some(\n                    \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\".into(),\n                ),\n                lexicon: Some(\n                    \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\".into(),\n                ),\n                feat_scale: 0.1,\n                t_shift: 0.5,\n                target_rms: 0.1,\n                guidance_scale: 1.0,\n            },\n            num_threads: 2,\n            debug: false,\n            ..Default::default()\n        },\n        ..Default::default()\n    };\n\n    let tts = OfflineTts::create(&config).expect(\"Failed to create OfflineTts\");\n\n    println!(\"Sample rate: {}\", tts.sample_rate());\n    println!(\"Num speakers: {}\", tts.num_speakers());\n\n    let text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\";\n    let reference_text = \"那还是三十六年前, 一九八七年. 
我呢考上了武汉大学的计算机系.\";\n    let reference_audio_file =\n        \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\";\n\n    let wave = Wave::read(reference_audio_file).expect(\"Failed to read reference audio\");\n\n    let mut extra = HashMap::new();\n    extra.insert(\"min_char_in_sentence\".to_string(), serde_json::json!(10));\n\n    let gen_config = GenerationConfig {\n        speed: 1.0,\n        reference_audio: Some(wave.samples().to_vec()),\n        reference_sample_rate: wave.sample_rate(),\n        reference_text: Some(reference_text.to_string()),\n        num_steps: 4,\n        extra: Some(extra),\n        ..Default::default()\n    };\n\n    let start = Instant::now();\n\n    let audio = tts\n        .generate_with_config(\n            text,\n            &gen_config,\n            Some(|_samples: &[f32], progress: f32| -> bool {\n                println!(\"Progress: {:.1}%\", progress * 100.0);\n                true\n            }),\n        )\n        .expect(\"Generation failed\");\n\n    let elapsed_seconds = start.elapsed().as_secs_f32();\n    let duration = audio.samples().len() as f32 / audio.sample_rate() as f32;\n    let rtf = elapsed_seconds / duration;\n\n    println!(\"Number of threads: {}\", config.model.num_threads);\n    println!(\"Elapsed seconds: {:.3} s\", elapsed_seconds);\n    println!(\"Audio duration: {:.3} s\", duration);\n    println!(\n        \"Real-time factor (RTF): {:.3}/{:.3} = {:.3}\",\n        elapsed_seconds, duration, rtf\n    );\n\n    let filename = \"./generated-zipvoice-zh-en-rust.wav\";\n    if audio.save(filename) {\n        println!(\"Saved to: {}\", filename);\n    } else {\n        eprintln!(\"Failed to save {}\", filename);\n    }\n}\n"
  },
  {
    "path": "rust-api-examples/run-audio-tagging-ced.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-ced-mini-audio-tagging-2024-04-19/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  tar xvf sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\n  rm sherpa-onnx-ced-mini-audio-tagging-2024-04-19.tar.bz2\nfi\n\ncargo run --example audio_tagging_ced\n"
  },
  {
    "path": "rust-api-examples/run-audio-tagging-zipformer.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipformer-small-audio-tagging-2024-04-15/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  tar xvf sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\n  rm sherpa-onnx-zipformer-small-audio-tagging-2024-04-15.tar.bz2\nfi\n\ncargo run --example audio_tagging_zipformer\n"
  },
  {
    "path": "rust-api-examples/run-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/pretrained.html\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx ]; then\n  curl -SsL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\nfi\n\ncargo run --example fire_red_asr_ctc -- \\\n    --wav ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav \\\n    --model ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx \\\n    --tokens ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-keyword-spotter.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nrepo=sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile\nif [ ! -f ./$repo/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/$repo.tar.bz2\n  tar xvf $repo.tar.bz2\n  rm $repo.tar.bz2\nfi\n\ncargo run --example keyword_spotter --   --wav ./$repo/test_wavs/3.wav   --encoder ./$repo/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx   --decoder ./$repo/decoder-epoch-12-avg-2-chunk-16-left-64.onnx   --joiner ./$repo/joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx   --tokens ./$repo/tokens.txt   --keywords-file ./$repo/test_wavs/test_keywords.txt   --provider cpu   --num-threads 1\n"
  },
  {
    "path": "rust-api-examples/run-kitten-tts-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\ncargo run --example kitten_tts_en\n"
  },
  {
    "path": "rust-api-examples/run-kokoro-tts-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\ncargo run --example kokoro_tts_en\n"
  },
  {
    "path": "rust-api-examples/run-kokoro-tts-zh-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\ncargo run --example kokoro_tts_zh_en\n"
  },
  {
    "path": "rust-api-examples/run-matcha-tts-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ncargo run --example matcha_tts_en\n"
  },
  {
    "path": "rust-api-examples/run-matcha-tts-zh.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\ncargo run --example matcha_tts_zh\n"
  },
  {
    "path": "rust-api-examples/run-moonshine-v2.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/moonshine\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\nfi\n\ncargo run --example moonshine_v2 -- \\\n    --wav ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav \\\n    --encoder ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort \\\n    --decoder ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort \\\n    --tokens ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt \\\n    --num-threads 2\n"
  },
  {
    "path": "rust-api-examples/run-nemo-parakeet-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# See also\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-parakeet-tdt-0-6b-v2-int8-english\nif [ ! -f \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx\" ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n    tar xvf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n    rm sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n    ls -lh sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\nfi\n\n# Run Rust Nemo Parakeet example\ncargo run --example nemo_parakeet -- \\\n    --wav \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/test_wavs/0.wav\" \\\n    --encoder \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx\" \\\n    --decoder \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx\" \\\n    --joiner \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx\" \\\n    --tokens \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt\" \\\n    --provider cpu \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-offline-punctuation.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nrepo=sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8\nif [ ! -f ./$repo/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/$repo.tar.bz2\n  tar xvf $repo.tar.bz2\n  rm $repo.tar.bz2\nfi\n\ncargo run --example offline_punctuation --   --ct-transformer ./$repo/model.int8.onnx   --provider cpu   --num-threads 1\n"
  },
  {
    "path": "rust-api-examples/run-offline-speaker-diarization.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\ncargo run --example offline_speaker_diarization\n"
  },
  {
    "path": "rust-api-examples/run-offline-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ncargo run --example offline_speech_enhancement_dpdfnet -- \\\n  --model ./dpdfnet_baseline.onnx \\\n  --input ./inp_16k.wav \\\n  --output ./enhanced-rust-dpdfnet.wav\n"
  },
  {
    "path": "rust-api-examples/run-offline-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ncargo run --example offline_speech_enhancement_gtcrn -- \\\n  --model ./gtcrn_simple.onnx \\\n  --input ./inp_16k.wav \\\n  --output ./enhanced-rust-gtcrn.wav\n"
  },
  {
    "path": "rust-api-examples/run-online-punctuation.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -d ./sherpa-onnx-online-punct-en-2024-08-06 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nfi\n\ncargo run --example online_punctuation -- \\\n  --cnn-bilstm ./sherpa-onnx-online-punct-en-2024-08-06/model.onnx \\\n  --bpe-vocab ./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\n"
  },
  {
    "path": "rust-api-examples/run-pocket-tts.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xvf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\ncargo run --example pocket_tts\n"
  },
  {
    "path": "rust-api-examples/run-sense-voice.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/sense-voice/pretrained.html#sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-chinese-english-japanese-korean-cantonese\nif [ ! -f ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/model.int8.onnx ]; then\n  curl -SsL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n  ls -lh sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\nfi\n\ncargo run --example sense_voice -- \\\n    --wav ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/test_wavs/en.wav \\\n    --model ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/model.int8.onnx \\\n    --tokens ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/tokens.txt \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-silero-vad-remove-silence.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# https://k2-fsa.github.io/sherpa/onnx/vad/silero-vad.html\nif [ ! -f \"./silero_vad.onnx\" ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -f ./lei-jun-test.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\nfi\n\ncargo run --example silero_vad_remove_silence -- \\\n    --input ./lei-jun-test.wav \\\n    --output ./no-silence.wav \\\n    --silero-vad-model ./silero_vad.onnx\n"
  },
  {
    "path": "rust-api-examples/run-speaker-embedding-cosine-similarity.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./wespeaker_zh_cnceleb_resnet34.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\nfi\n\nif [ ! -f ./fangjun-sr-1.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/fangjun-sr-1.wav\nfi\n\nif [ ! -f ./fangjun-sr-2.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/fangjun-sr-2.wav\nfi\n\nif [ ! -f ./leijun-sr-1.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/leijun-sr-1.wav\nfi\n\ncargo run --example speaker_embedding_cosine_similarity\n"
  },
  {
    "path": "rust-api-examples/run-speaker-embedding-extractor.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\nfi\n\nif [ ! -d ./sr-data ]; then\n  git clone https://github.com/csukuangfj/sr-data\nfi\n\ncargo run --example speaker_embedding_extractor\n"
  },
  {
    "path": "rust-api-examples/run-speaker-embedding-manager.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\nfi\n\nif [ ! -d ./sr-data ]; then\n  git clone https://github.com/csukuangfj/sr-data\nfi\n\ncargo run --example speaker_embedding_manager\n"
  },
  {
    "path": "rust-api-examples/run-spoken-language-identification.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\nif [ ! -f ./spoken-language-identification-test-wavs/en-english.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\n  tar xvf spoken-language-identification-test-wavs.tar.bz2\n  rm spoken-language-identification-test-wavs.tar.bz2\nfi\n\ncargo run --example spoken_language_identification --   --wav ./spoken-language-identification-test-wavs/de-german.wav   --whisper-encoder ./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx   --whisper-decoder ./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx   --provider cpu   --num-threads 1\n"
  },
  {
    "path": "rust-api-examples/run-streaming-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ncargo run --example streaming_speech_enhancement_dpdfnet -- \\\n  --model ./dpdfnet_baseline.onnx \\\n  --input ./inp_16k.wav \\\n  --output ./enhanced-rust-streaming-dpdfnet.wav\n"
  },
  {
    "path": "rust-api-examples/run-streaming-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\ncargo run --example streaming_speech_enhancement_gtcrn -- \\\n  --model ./gtcrn_simple.onnx \\\n  --input ./inp_16k.wav \\\n  --output ./enhanced-rust-streaming-gtcrn.wav\n"
  },
  {
    "path": "rust-api-examples/run-streaming-zipformer-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-06-21-english\nif [ ! -f ./sherpa-onnx-streaming-zipformer-en-2023-06-21/encoder-epoch-99-avg-1.int8.onnx ]; then\n  curl -SsL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n\n  tar xvf sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-en-2023-06-21.tar.bz2\n  ls -lh sherpa-onnx-streaming-zipformer-en-2023-06-21\nfi\n\ncargo run --example streaming_zipformer -- \\\n    --wav sherpa-onnx-streaming-zipformer-en-2023-06-21/test_wavs/1.wav \\\n    --encoder sherpa-onnx-streaming-zipformer-en-2023-06-21/encoder-epoch-99-avg-1.int8.onnx \\\n    --decoder sherpa-onnx-streaming-zipformer-en-2023-06-21/decoder-epoch-99-avg-1.onnx \\\n    --joiner sherpa-onnx-streaming-zipformer-en-2023-06-21/joiner-epoch-99-avg-1.int8.onnx \\\n    --tokens sherpa-onnx-streaming-zipformer-en-2023-06-21/tokens.txt \\\n    --provider cpu \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-streaming-zipformer-microphone-zh-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx ]; then\n  curl -SsL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\nfi\n\ncargo run --example streaming_zipformer_microphone --features mic -- \\\n    --encoder sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n    --decoder sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n    --joiner sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n    --tokens sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n    --provider cpu \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-streaming-zipformer-zh-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\nif [ ! -f ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx ]; then\n  curl -SsL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  ls -lh sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\nfi\n\ncargo run --example streaming_zipformer -- \\\n    --wav sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav \\\n    --encoder sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx \\\n    --decoder sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \\\n    --joiner sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx \\\n    --tokens sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \\\n    --provider cpu \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-supertonic-tts.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xvf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\ncargo run --example supertonic_tts\n"
  },
  {
    "path": "rust-api-examples/run-version.sh",
    "content": "#!/usr/bin/env bash\nset -ex\ncargo run --example version\n"
  },
  {
    "path": "rust-api-examples/run-vits-de.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -d ./vits-piper-de_DE-glados-high ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-glados-high.tar.bz2\n  tar xf vits-piper-de_DE-glados-high.tar.bz2\n  rm vits-piper-de_DE-glados-high.tar.bz2\nfi\n\ncargo run --example vits_tts --   --model ./vits-piper-de_DE-glados-high/de_DE-glados-high.onnx   --tokens ./vits-piper-de_DE-glados-high/tokens.txt   --data-dir ./vits-piper-de_DE-glados-high/espeak-ng-data   --output ./generated-vits-de-rust.wav   --text \"Alles hat ein Ende, nur die Wurst hat zwei.\"\n"
  },
  {
    "path": "rust-api-examples/run-vits-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -d ./vits-piper-en_US-amy-low ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  tar xf vits-piper-en_US-amy-low.tar.bz2\n  rm vits-piper-en_US-amy-low.tar.bz2\nfi\n\ncargo run --example vits_tts --   --model ./vits-piper-en_US-amy-low/en_US-amy-low.onnx   --tokens ./vits-piper-en_US-amy-low/tokens.txt   --data-dir ./vits-piper-en_US-amy-low/espeak-ng-data   --output ./generated-vits-en-rust.wav   --text \"Liliana, the most beautiful and lovely assistant of our team!\"\n"
  },
  {
    "path": "rust-api-examples/run-zipformer-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see also\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-english\nif [ ! -f \"./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt\" ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04.tar.bz2\n\n  tar xvf icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04.tar.bz2\n  rm icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04.tar.bz2\n  ls -lh icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04\nfi\n\n# Run Zipformer transducer\ncargo run --example zipformer -- \\\n    --wav \"./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/test_wavs/1089-134686-0001.wav\" \\\n    --tokens=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/data/lang_bpe_500/tokens.txt \\\n    --encoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/encoder-epoch-30-avg-4.int8.onnx \\\n    --decoder=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/decoder-epoch-30-avg-4.onnx \\\n    --joiner=./icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04/exp/joiner-epoch-30-avg-4.int8.onnx \\\n    --provider cpu \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-zipformer-vi.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see also\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-vi-30m-int8-2026-02-09-vietnamese\nif [ ! -f \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/encoder.int8.onnx\" ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-vi-30M-int8-2026-02-09.tar.bz2\n    tar xvf sherpa-onnx-zipformer-vi-30M-int8-2026-02-09.tar.bz2\n    rm sherpa-onnx-zipformer-vi-30M-int8-2026-02-09.tar.bz2\n    ls -lh sherpa-onnx-zipformer-vi-30M-int8-2026-02-09\nfi\n\n# Run Zipformer transducer\ncargo run --example zipformer -- \\\n    --wav \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/test_wavs/0.wav\" \\\n    --encoder \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/encoder.int8.onnx\" \\\n    --decoder \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/decoder.onnx\" \\\n    --joiner \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/joiner.int8.onnx\" \\\n    --tokens \"./sherpa-onnx-zipformer-vi-30M-int8-2026-02-09/tokens.txt\" \\\n    --provider cpu \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-zipformer-zh-en.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\n# see also\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-zh-en-2023-11-22-chinese-english\nif [ ! -f \"./sherpa-onnx-zipformer-zh-en-2023-11-22/encoder-epoch-34-avg-19.int8.onnx\" ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\n    tar xvf sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\n    rm sherpa-onnx-zipformer-zh-en-2023-11-22.tar.bz2\n    ls -lh sherpa-onnx-zipformer-zh-en-2023-11-22\nfi\n\n# Run Zipformer transducer\ncargo run --example zipformer -- \\\n    --wav \"./sherpa-onnx-zipformer-zh-en-2023-11-22/test_wavs/0.wav\" \\\n    --encoder \"./sherpa-onnx-zipformer-zh-en-2023-11-22/encoder-epoch-34-avg-19.int8.onnx\" \\\n    --decoder \"./sherpa-onnx-zipformer-zh-en-2023-11-22/decoder-epoch-34-avg-19.onnx\" \\\n    --joiner \"./sherpa-onnx-zipformer-zh-en-2023-11-22/joiner-epoch-34-avg-19.int8.onnx\" \\\n    --tokens \"./sherpa-onnx-zipformer-zh-en-2023-11-22/tokens.txt\" \\\n    --provider cpu \\\n    --num-threads 2 \\\n    --debug\n"
  },
  {
    "path": "rust-api-examples/run-zipvoice-tts.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xvf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\ncargo run --example zipvoice_tts\n"
  },
  {
    "path": "scripts/3dspeaker/README.md",
    "content": "# Introduction\n\nThis directory contains scripts\nabout exporting models from https://github.com/alibaba-damo-academy/3D-Speaker\nto `onnx` so that they can be used in `sherpa-onnx`.\n"
  },
  {
    "path": "scripts/3dspeaker/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023-2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport json\nimport os\nimport pathlib\nimport re\nfrom typing import Dict\n\nimport onnx\nimport torch\nfrom infer_sv import supports\nfrom modelscope.hub.snapshot_download import snapshot_download\nfrom speakerlab.utils.builder import dynamic_import\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        choices=[\n            \"speech_campplus_sv_en_voxceleb_16k\",\n            \"speech_campplus_sv_zh-cn_16k-common\",\n            \"speech_campplus_sv_zh_en_16k-common_advanced\",\n            \"speech_eres2net_sv_en_voxceleb_16k\",\n            \"speech_eres2net_sv_zh-cn_16k-common\",\n            \"speech_eres2net_base_200k_sv_zh-cn_16k-common\",\n            \"speech_eres2net_base_sv_zh-cn_3dspeaker_16k\",\n            \"speech_eres2net_large_sv_zh-cn_3dspeaker_16k\",\n            \"speech_eres2netv2_sv_zh-cn_16k-common\",\n        ],\n    )\n    return parser.parse_args()\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    local_model_dir = \"pretrained\"\n    model_id = f\"iic/{args.model}\"\n    conf = supports[model_id]\n    cache_dir = snapshot_download(\n        model_id,\n        revision=conf[\"revision\"],\n    )\n    cache_dir = pathlib.Path(cache_dir)\n\n    save_dir = os.path.join(local_model_dir, model_id.split(\"/\")[1])\n    
save_dir = pathlib.Path(save_dir)\n    save_dir.mkdir(exist_ok=True, parents=True)\n\n    download_files = [\"examples\", conf[\"model_pt\"]]\n    for src in cache_dir.glob(\"*\"):\n        if re.search(\"|\".join(download_files), src.name):\n            dst = save_dir / src.name\n            try:\n                dst.unlink()\n            except FileNotFoundError:\n                pass\n            dst.symlink_to(src)\n    pretrained_model = save_dir / conf[\"model_pt\"]\n    pretrained_state = torch.load(pretrained_model, map_location=\"cpu\")\n\n    model = conf[\"model\"]\n    embedding_model = dynamic_import(model[\"obj\"])(**model[\"args\"])\n    embedding_model.load_state_dict(pretrained_state)\n    embedding_model.eval()\n\n    with open(f\"{cache_dir}/configuration.json\") as f:\n        json_config = json.loads(f.read())\n        print(json_config)\n\n    T = 100\n    C = 80\n    x = torch.rand(1, T, C)\n    filename = f\"{args.model}.onnx\"\n    torch.onnx.export(\n        embedding_model,\n        x,\n        filename,\n        opset_version=13,\n        input_names=[\"x\"],\n        output_names=[\"embedding\"],\n        # The keys below must match the names in input_names/output_names\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"embedding\": {0: \"N\"},\n        },\n    )\n\n    # all models from 3d-speaker expect input samples in the range\n    # [-1, 1]\n    normalize_samples = 1\n\n    # all models from 3d-speaker normalize the features by the global mean\n    feature_normalize_type = \"global-mean\"\n    sample_rate = json_config[\"model\"][\"model_config\"][\"sample_rate\"]\n\n    feat_dim = conf[\"model\"][\"args\"][\"feat_dim\"]\n    assert feat_dim == 80, feat_dim\n\n    output_dim = conf[\"model\"][\"args\"][\"embedding_size\"]\n\n    if \"zh-cn\" in args.model:\n        language = \"Chinese\"\n    elif \"zh_en\" in args.model:\n        language = \"Chinese-English\"\n    elif \"en\" in args.model:\n        language = \"English\"\n    else:\n        raise 
ValueError(f\"Unsupported language for model {args.model}\")\n\n    comment = f\"This model is from iic/{args.model}\"\n    url = f\"https://www.modelscope.cn/models/iic/{args.model}/summary\"\n\n    meta_data = {\n        \"framework\": \"3d-speaker\",\n        \"language\": language,\n        \"url\": url,\n        \"comment\": comment,\n        \"sample_rate\": sample_rate,\n        \"output_dim\": output_dim,\n        \"normalize_samples\": normalize_samples,\n        \"feature_normalize_type\": feature_normalize_type,\n    }\n    print(meta_data)\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n\nmain()\n"
  },
  {
    "path": "scripts/3dspeaker/test-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023-2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script computes speaker similarity score in the range [0-1]\nof two wave files using a speaker embedding model.\n\"\"\"\nimport argparse\nimport wave\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nfrom numpy.linalg import norm\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model. Example value: model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--file1\",\n        type=str,\n        required=True,\n        help=\"Input wave 1\",\n    )\n\n    parser.add_argument(\n        \"--file2\",\n        type=str,\n        required=True,\n        help=\"Input wave 2\",\n    )\n\n    return parser.parse_args()\n\n\ndef read_wavefile(filename, expected_sample_rate: int = 16000) -> np.ndarray:\n    \"\"\"\n    Args:\n      filename:\n        Path to a wave file, which must be of 16-bit and 16kHz.\n     expected_sample_rate:\n       Expected sample rate of the wave file.\n    Returns:\n      Return a 1-D float32 array containing audio samples. 
Each sample is in\n      the range [-1, 1].\n    \"\"\"\n    filename = str(filename)\n    with wave.open(filename) as f:\n        wave_file_sample_rate = f.getframerate()\n        assert wave_file_sample_rate == expected_sample_rate, (\n            wave_file_sample_rate,\n            expected_sample_rate,\n        )\n\n        num_channels = f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_int16 = samples_int16.reshape(-1, num_channels)[:, 0]\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n\n        return samples_float32\n\n\ndef compute_features(samples: np.ndarray, sample_rate: int) -> np.ndarray:\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.samp_freq = sample_rate\n    opts.frame_opts.snip_edges = True\n\n    opts.mel_opts.num_bins = 80\n    opts.mel_opts.debug_mel = False\n\n    fbank = knf.OnlineFbank(opts)\n    fbank.accept_waveform(sample_rate, samples)\n    fbank.input_finished()\n\n    features = []\n    for i in range(fbank.num_frames_ready):\n        f = fbank.get_frame(i)\n        features.append(f)\n    features = np.stack(features, axis=0)\n\n    return features\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.normalize_samples = int(meta[\"normalize_samples\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        
self.output_dim = int(meta[\"output_dim\"])\n        self.feature_normalize_type = meta[\"feature_normalize_type\"]\n\n    def __call__(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"\n        Args:\n          x:\n            A 2-D float32 tensor of shape (T, C).\n        Returns:\n          A 1-D float32 tensor containing the model output.\n        \"\"\"\n        x = np.expand_dims(x, axis=0)\n\n        return self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0][0]\n\n\ndef main():\n    args = get_args()\n    print(args)\n    filename = Path(args.model)\n    file1 = Path(args.file1)\n    file2 = Path(args.file2)\n    assert filename.is_file(), filename\n    assert file1.is_file(), file1\n    assert file2.is_file(), file2\n\n    model = OnnxModel(filename)\n    wave1 = read_wavefile(file1, model.sample_rate)\n    wave2 = read_wavefile(file2, model.sample_rate)\n\n    if not model.normalize_samples:\n        wave1 = wave1 * 32768\n        wave2 = wave2 * 32768\n\n    features1 = compute_features(wave1, model.sample_rate)\n    features2 = compute_features(wave2, model.sample_rate)\n\n    if model.feature_normalize_type == \"global-mean\":\n        features1 -= features1.mean(axis=0, keepdims=True)\n        features2 -= features2.mean(axis=0, keepdims=True)\n\n    output1 = model(features1)\n    output2 = model(features2)\n\n    # Cosine similarity of the two embeddings; it lies in [-1, 1]\n    similarity = np.dot(output1, output2) / (norm(output1) * norm(output2))\n    print(f\"cosine similarity in the range [-1, 1]: {similarity}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/.gitignore",
    "content": "build-apk-tts.sh\n!*.sh.in\n"
  },
  {
    "path": "scripts/apk/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for building Android APKs.\n"
  },
  {
    "path": "scripts/apk/build-apk-asr-2pass.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building streaming ASR two-pass APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\nexport SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for first, second in model_list %}\npushd ./android/SherpaOnnx2Pass/app/src/main/assets/\n\nmodel_name1={{ first.model_name }}\nmodel_name=$model_name1\ntype1={{ first.idx }}\nlang1={{ first.lang }}\nshort_name1={{ first.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name1}.tar.bz2\ntar xvf ${model_name1}.tar.bz2\n\n{{ first.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name1\n\nmodel_name2={{ second.model_name }}\nmodel_name=$model_name2\ntype2={{ second.idx }}\nlang2={{ second.lang }}\nshort_name2={{ second.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name2}.tar.bz2\ntar xvf ${model_name2}.tar.bz2\n\n{{ second.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name2\n\npopd\n# Now we are at the project root directory\n\ngit checkout 
.\npushd android/SherpaOnnx2Pass/app/src/main/java/com/k2fsa/sherpa/onnx\nsed -i.bak s/\"firstType = 9/firstType = $type1/\" ./MainActivity.kt\nsed -i.bak s/\"secondType = 0/secondType = $type2/\" ./MainActivity.kt\n\n{% if first.rule_fsts %}\n  rule_fsts={{ first.rule_fsts }}\n  sed -i.bak s%\"firstRuleFsts = null\"%\"firstRuleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\n{% if second.rule_fsts %}\n  rule_fsts={{ second.rule_fsts }}\n  sed -i.bak s%\"secondRuleFsts = null\"%\"secondRuleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build ASR apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnx2Pass/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnx2Pass\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnx2Pass/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-asr_2pass-$lang1-${short_name1}_${short_name2}.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnx2Pass/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnx2Pass/app/src/main/assets/$model_name1\nrm -rf ./android/SherpaOnnx2Pass/app/src/main/assets/$model_name2\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-asr.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building streaming ASR APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\nexport SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnx/app/src/main/assets/\nmodel_name={{ model.model_name }}\ntype={{ model.idx }}\nlang={{ model.lang }}\nshort_name={{ model.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnx/app/src/main/java/com/k2fsa/sherpa/onnx\nsed -i.bak s/\"type = 0/type = $type/\" ./MainActivity.kt\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak s%\"ruleFsts = null\"%\"ruleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log 
\"------------------------------------------------------------\"\n  log \"build ASR apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnx/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnx\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnx/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-asr-$lang-$short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnx/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnx/app/src/main/assets/$model_name\nrm -rf ./android/SherpaOnnx/app/src/main/assets/*.fst\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-audio-tagging-wearos.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building audio tagging WearOS APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\nexport SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=OFF\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxAudioTaggingWearOs/app/src/main/assets/\nmodel_name={{ model.model_name }}\nshort_name={{ model.short_name }}\ntype={{ model.idx }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\nrm -rfv $model_name/model.onnx\nrm -rfv $model_name/test_wavs\nrm -rf  *.tar.bz2\nls -lh $model_name\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\n# Tagger.kt is a symlink file, so we use SherpaOnnxAudioTagging here instead of SherpaOnnxAudioTaggingWearOs\npushd android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/\nsed -i.bak s/\"type = 0/type = $type/\" ./Tagger.kt\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log 
\"------------------------------------------------------------\"\n  log \"build audio tagging apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxAudioTaggingWearOs\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxAudioTaggingWearOs/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-audio-tagging-$short_name-wearos.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxAudioTaggingWearOs/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxAudioTaggingWearOs/app/src/main/assets/$model_name\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-audio-tagging.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building audio tagging APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\nexport SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=OFF\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxAudioTagging/app/src/main/assets/\nmodel_name={{ model.model_name }}\nshort_name={{ model.short_name }}\ntype={{ model.idx }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\nrm -rfv $model_name/model.onnx\nrm -rfv $model_name/test_wavs\nrm -rf  *.tar.bz2\nls -lh $model_name\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxAudioTagging/app/src/main/java/com/k2fsa/sherpa/onnx/audio/tagging/\nsed -i.bak s/\"type = 0/type = $type/\" ./Tagger.kt\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build audio tagging apk for $arch\"\n  log 
\"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxAudioTagging/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxAudioTagging\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxAudioTagging/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-audio-tagging-$short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxAudioTagging/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxAudioTagging/app/src/main/assets/$model_name\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-qnn-vad-asr-simulate-streaming.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building simulated-streaming VAD + ASR APK + QNN for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nexport SHERPA_ONNX_ENABLE_QNN=ON\n\nlog \"Download qnn header files\"\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models-qnn/qnn-include-2.40.0.251030.tar.bz2\ntar xf qnn-include-2.40.0.251030.tar.bz2\nrm qnn-include-2.40.0.251030.tar.bz2\nls -lh qnn-include-2.40.0.251030\n\nexport QNN_SDK_ROOT=$PWD/qnn-include-2.40.0.251030\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\n\ncp -v ./build-android-arm64-v8a/install/lib/*.so ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/arm64-v8a/\n\nlog \"=======Download qnn libs============\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models-qnn/qnn-libs-2.40.0.251030.tar.bz2\ntar xvf qnn-libs-2.40.0.251030.tar.bz2\nrm qnn-libs-2.40.0.251030.tar.bz2\ncp -v qnn-libs-2.40.0.251030/*.so ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/arm64-v8a/\n\nrm -rf qnn-libs-2.40.0.251030\n\nls -lh ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/arm64-v8a/\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/\nmodel_name={{ model.model_name }}-android-aarch64\ntype={{ model.idx }}\nlang={{ model.lang 
}}\nshort_name={{ model.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models-qnn/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{% if model.use_hr %}\n  if [ ! -f lexicon.txt ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n  fi\n\n  if [ ! -f replace.fst ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  fi\n{% endif %}\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\n\npushd android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/screens\nsed -i.bak s/\"asrModelType = 15/asrModelType = $type/\" ./Home.kt\npopd\n\npushd android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr\n\n{% if model.use_hr %}\n  sed -i.bak s/\"useHr = false/useHr = true/\" ./SimulateStreamingAsr.kt\n{% endif %}\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak s%\"asrRuleFsts = null\"%\"asrRuleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nfor arch in arm64-v8a; do\n  log \"------------------------------------------------------------\"\n  log \"build simulated-streaming ASR apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  pushd ./android/SherpaOnnxSimulateStreamingAsr\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxSimulateStreamingAsr/app/build/outputs/apk/release/app-release-unsigned.apk 
./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-qnn-$arch-simulated_streaming_asr-$lang-$short_name.apk\n  ls -lh apks\ndone\n\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/$model_name\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/lexicon.txt\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/replace.fst\n\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-slid.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building spoken language identification APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\nexport SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION=OFF\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxSpokenLanguageIdentification/app/src/main/assets/\nmodel_name={{ model.model_name }}\nshort_name={{ model.short_name }}\ntype={{ model.idx }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\nrm -rfv $model_name/*-encoder.onnx\nrm -rfv $model_name/*-decoder.onnx\nrm -rfv $model_name/*.py\nrm -rfv $model_name/*.txt\nrm -rfv $model_name/*.md\nrm -rfv $model_name/test_wavs\nrm -rf  *.tar.bz2\nls -lh $model_name\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxSpokenLanguageIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/slid/\nsed -i.bak s/\"type = 0/type = $type/\" ./slid.kt\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log 
\"------------------------------------------------------------\"\n  log \"build spoken language identification apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxSpokenLanguageIdentification\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxSpokenLanguageIdentification/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-slid-$short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxSpokenLanguageIdentification/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxSpokenLanguageIdentification/app/src/main/assets/$model_name\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-speaker-diarization.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building Speaker identification APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for model in model_list %}\n\npushd ./android/SherpaOnnxSpeakerDiarization/app/src/main/assets/\n\nls -lh\n\nsegmentation_model_name={{ model.segmentation.model_name }}\nsegmentation_short_name={{ model.segmentation.short_name }}\n\nembedding_model_name={{ model.embedding.model_name }}\nembedding_short_name={{ model.embedding.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/$segmentation_model_name.tar.bz2\ntar xvf $segmentation_model_name.tar.bz2\nrm $segmentation_model_name.tar.bz2\nmv $segmentation_model_name/model.onnx segmentation.onnx\nrm -rf $segmentation_model_name\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/$embedding_model_name.onnx\nmv $embedding_model_name.onnx embedding.onnx\n\necho \"pwd: $PWD\"\nls -lh\n\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log 
\"------------------------------------------------------------\"\n  log \"build speaker diarization apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxSpeakerDiarization\n  ./gradlew build\n  popd\n\n  mv android/SherpaOnnxSpeakerDiarization/app/build/outputs/apk/debug/app-debug.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-speaker-diarization-$segmentation_short_name-$embedding_short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxSpeakerDiarization/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxSpeakerDiarization/app/src/main/assets/*.onnx\n\n{% endfor %}\n\nls -lh apks\n"
  },
  {
    "path": "scripts/apk/build-apk-speaker-identification.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building Speaker identification APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxSpeakerIdentification/app/src/main/assets/\nmodel_name={{ model.model_name }}\nshort_name={{ model.short_name }}\nlang={{ model.lang }}\nframework={{ model.framework }}\n\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/$model_name\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxSpeakerIdentification/app/src/main/java/com/k2fsa/sherpa/onnx/speaker/identification/\nsed -i.bak s/\"private val modelName.*/private val modelName = \\\"$model_name\\\"/\" ./Speaker.kt\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build speaker identification apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ 
$arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxSpeakerIdentification\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew build\n  popd\n\n  mv android/SherpaOnnxSpeakerIdentification/app/build/outputs/apk/debug/app-debug.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-$lang-speaker-identification-$framework-$short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxSpeakerIdentification/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxSpeakerIdentification/app/src/main/assets/$model_name\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-tts-engine.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building TTS engine APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nexport SHERPA_ONNX_ENABLE_TTS=ON\n\nmkdir -p apks\n\n{% for tts_model in tts_model_list %}\npushd ./android/SherpaOnnxTtsEngine/app/src/main/assets/\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nacoustic_model_name={{ tts_model.acoustic_model_name }}\nvocoder={{ tts_model.vocoder }}\nvoices={{ tts_model.voices }}\nlang={{ tts_model.lang }}\nlang_iso_639_3={{ tts_model.lang_iso_639_3 }}\nlang_iso_639_3_2={{ tts_model.lang_iso_639_3_2 }}\n\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\n{% if tts_model.vocoder %}\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/$vocoder\n{% endif %}\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/engine\nsed -i.bak s/\"modelDir = null\"/\"modelDir = 
\\\"$model_dir\\\"\"/ ./TtsEngine.kt\nsed -i.bak s/\"lang = null\"/\"lang = \\\"$lang_iso_639_3\\\"\"/ ./TtsEngine.kt\n\n{% if tts_model.lang2 %}\n  sed -i.bak s/\"lang2 = null\"/\"lang2 = \\\"$lang_iso_639_3_2\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.is_kitten %}\n  sed -i.bak s/\"isKitten = false\"/\"isKitten = true\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.model_name %}\n  sed -i.bak s/\"modelName = null\"/\"modelName = \\\"$model_name\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.acoustic_model_name %}\n  sed -i.bak s/\"acousticModelName = null\"/\"acousticModelName = \\\"$acoustic_model_name\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.vocoder %}\n  sed -i.bak s/\"vocoder = null\"/\"vocoder = \\\"$vocoder\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.voices %}\n  sed -i.bak s/\"voices = null\"/\"voices = \\\"$voices\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak s%\"ruleFsts = null\"%\"ruleFsts = \\\"$rule_fsts\\\"\"% ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak s%\"ruleFsts = null\"%\"ruleFars = \\\"$rule_fars\\\"\"% ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak s%\"dataDir = null\"%\"dataDir = \\\"$data_dir\\\"\"% ./TtsEngine.kt\n{% elif not tts_model.is_char %}\n  sed -i.bak s/\"lexicon = null\"/\"lexicon = \\\"lexicon.txt\\\"\"/ ./TtsEngine.kt\n{% endif %}\n\n{% if tts_model.lexicon %}\n  lexicon={{ tts_model.lexicon }}\n  sed -i.bak s%\"lexicon = null\"%\"lexicon = \\\"$lexicon\\\"\"% ./TtsEngine.kt\n{% endif %}\n\ngit diff\npopd\n\nif [[ $model_dir == vits-melo-tts-zh_en ]]; then\n  lang=zh_en\nfi\n\nif [[ $model_dir == matcha-icefall-zh-en ]]; then\n  lang=zh_en\nfi\n\nif [[ $model_dir == kokoro-multi-lang-v1_0 || $model_dir == kokoro-multi-lang-v1_1 || $model_dir == kokoro-int8-multi-lang-v1_1 ]]; then\n  
lang=zh_en\nfi\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build tts apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxTtsEngine/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxTtsEngine\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxTtsEngine/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-$lang-tts-engine-$model_dir.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxTtsEngine/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxTtsEngine/app/src/main/assets/$model_dir\nrm -fv ./android/SherpaOnnxTtsEngine/app/src/main/assets/*.onnx\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-tts.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building TTS APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nexport SHERPA_ONNX_ENABLE_TTS=ON\n\nmkdir -p apks\n\n{% for tts_model in tts_model_list %}\npushd ./android/SherpaOnnxTts/app/src/main/assets/\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nacoustic_model_name={{ tts_model.acoustic_model_name }}\nvocoder={{ tts_model.vocoder }}\nvoices={{ tts_model.voices }}\nlang={{ tts_model.lang }}\n\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\n{% if tts_model.vocoder %}\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/$vocoder\n{% endif %}\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxTts/app/src/main/java/com/k2fsa/sherpa/onnx\nsed -i.bak s/\"modelDir = null\"/\"modelDir = \\\"$model_dir\\\"\"/ ./MainActivity.kt\n\n\n{% if tts_model.model_name %}\n  sed -i.bak s/\"modelName = null\"/\"modelName = 
\\\"$model_name\\\"\"/ ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.acoustic_model_name %}\n  sed -i.bak s/\"acousticModelName = null\"/\"acousticModelName = \\\"$acoustic_model_name\\\"\"/ ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.vocoder %}\n  sed -i.bak s/\"vocoder = null\"/\"vocoder = \\\"$vocoder\\\"\"/ ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.voices %}\n  sed -i.bak s/\"voices = null\"/\"voices = \\\"$voices\\\"\"/ ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak s%\"ruleFsts = null\"%\"ruleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak s%\"ruleFsts = null\"%\"ruleFars = \\\"$rule_fars\\\"\"% ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak s%\"dataDir = null\"%\"dataDir = \\\"$data_dir\\\"\"% ./MainActivity.kt\n{% elif not tts_model.is_char %}\n  sed -i.bak s/\"lexicon = null\"/\"lexicon = \\\"lexicon.txt\\\"\"/ ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.lexicon %}\n  lexicon={{ tts_model.lexicon }}\n  sed -i.bak s%\"lexicon = null\"%\"lexicon = \\\"$lexicon\\\"\"% ./MainActivity.kt\n{% endif %}\n\n{% if tts_model.is_kitten %}\n  sed -i.bak s/\"isKitten = false\"/\"isKitten = true\"/ ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nif [[ $model_dir == vits-melo-tts-zh_en ]]; then\n  lang=zh_en\nfi\n\nif [[ $model_dir == matcha-icefall-zh-en ]]; then\n  lang=zh_en\nfi\n\nif [[ $model_dir == kokoro-multi-lang-v1_0 || $model_dir == kokoro-multi-lang-v1_1 || $model_dir == kokoro-int8-multi-lang-v1_1 ]]; then\n  lang=zh_en\nfi\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build tts apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; 
then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxTts/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxTts\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxTts/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-$lang-tts-$model_dir.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxTts/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxTts/app/src/main/assets/$model_dir\nrm -fv ./android/SherpaOnnxTts/app/src/main/assets/*.onnx\n\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-vad-asr-simulate-streaming.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building simulated-streaming VAD + ASR APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/\nmodel_name={{ model.model_name }}\ntype={{ model.idx }}\nlang={{ model.lang }}\nshort_name={{ model.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{% if model.use_hr %}\n  if [ ! -f lexicon.txt ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\n  fi\n\n  if [ ! 
-f replace.fst ]; then\n    curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  fi\n{% endif %}\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\n\npushd android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr/screens\nsed -i.bak s/\"asrModelType = 15/asrModelType = $type/\" ./Home.kt\npopd\n\npushd android/SherpaOnnxSimulateStreamingAsr/app/src/main/java/com/k2fsa/sherpa/onnx/simulate/streaming/asr\n\n{% if model.use_hr %}\n  sed -i.bak s/\"useHr = false/useHr = true/\" ./SimulateStreamingAsr.kt\n{% endif %}\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak s%\"asrRuleFsts = null\"%\"asrRuleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nfor arch in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build simulated-streaming ASR apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxSimulateStreamingAsr\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxSimulateStreamingAsr/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-simulated_streaming_asr-$lang-$short_name.apk\n  ls -lh apks\n  rm -v 
./android/SherpaOnnxSimulateStreamingAsr/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/$model_name\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/lexicon.txt\nrm -rf ./android/SherpaOnnxSimulateStreamingAsr/app/src/main/assets/replace.fst\n\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/build-apk-vad-asr.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable ANDROID_NDK\n# before running this script\n\n# Inside the $ANDROID_NDK directory, you can find a binary ndk-build\n# and some other files like the file \"build/cmake/android.toolchain.cmake\"\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building streaming VAD + ASR APK for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nlog \"====================arm64-v8a=================\"\n./build-android-arm64-v8a.sh\nlog \"====================armv7-eabi================\"\n./build-android-armv7-eabi.sh\nlog \"====================x86-64====================\"\n./build-android-x86-64.sh\nlog \"====================x86====================\"\n./build-android-x86.sh\n\nmkdir -p apks\n\n{% for model in model_list %}\npushd ./android/SherpaOnnxVadAsr/app/src/main/assets/\nmodel_name={{ model.model_name }}\ntype={{ model.idx }}\nlang={{ model.lang }}\nshort_name={{ model.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd android/SherpaOnnxVadAsr/app/src/main/java/com/k2fsa/sherpa/onnx\nsed -i.bak s/\"asrModelType = 0/asrModelType = $type/\" ./MainActivity.kt\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak s%\"asrRuleFsts = null\"%\"asrRuleFsts = \\\"$rule_fsts\\\"\"% ./MainActivity.kt\n{% endif %}\n\ngit diff\npopd\n\nfor arch 
in arm64-v8a armeabi-v7a x86_64 x86; do\n  log \"------------------------------------------------------------\"\n  log \"build ASR apk for $arch\"\n  log \"------------------------------------------------------------\"\n  src_arch=$arch\n  if [ $arch == \"armeabi-v7a\" ]; then\n    src_arch=armv7-eabi\n  elif [ $arch == \"x86_64\" ]; then\n    src_arch=x86-64\n  fi\n\n  ls -lh ./build-android-$src_arch/install/lib/*.so\n\n  cp -v ./build-android-$src_arch/install/lib/*.so ./android/SherpaOnnxVadAsr/app/src/main/jniLibs/$arch/\n\n  pushd ./android/SherpaOnnxVadAsr\n  sed -i.bak s/2048/9012/g ./gradle.properties\n  git diff ./gradle.properties\n  ./gradlew assembleRelease\n  popd\n\n  mv android/SherpaOnnxVadAsr/app/build/outputs/apk/release/app-release-unsigned.apk ./apks/sherpa-onnx-${SHERPA_ONNX_VERSION}-$arch-vad_asr-$lang-$short_name.apk\n  ls -lh apks\n  rm -v ./android/SherpaOnnxVadAsr/app/src/main/jniLibs/$arch/*.so\ndone\n\nrm -rf ./android/SherpaOnnxVadAsr/app/src/main/assets/$model_name\n{% endfor %}\n\ngit checkout .\n\nls -lh apks/\n"
  },
  {
    "path": "scripts/apk/generate-asr-2pass-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    # The type of the model, e..g, 0, 1, 2. It is hardcoded in the kotlin code\n    idx: int\n\n    # e.g., zh, en, zh_en\n    lang: str\n\n    # e.g., whisper, paraformer, zipformer\n    short_name: str = \"\"\n\n    # cmd is used to remove extra file from the model directory\n    cmd: str = \"\"\n    rule_fsts: str = \"\"\n\n\ndef get_2nd_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-whisper-tiny.en\",\n            idx=2,\n            lang=\"en\",\n            short_name=\"whisper_tiny\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv tiny.en-encoder.onnx\n            rm -fv tiny.en-decoder.onnx\n            rm -rf test_wavs\n            rm -fv *.py\n            rm -fv requirements.txt\n            rm -fv .gitignore\n            rm -fv README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-2023-09-14\",\n            idx=0,\n            lang=\"zh\",\n            short_name=\"paraformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -fv README.md\n            rm -rfv test_wavs\n            rm -fv model.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-wenetspeech-20230615\",\n            idx=4,\n            lang=\"zh\",\n            short_name=\"zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            mv -v data/lang_char/tokens.txt ./\n            rm -rfv data/lang_char\n\n            mv -v exp/encoder-epoch-12-avg-4.int8.onnx ./\n            mv -v exp/decoder-epoch-12-avg-4.onnx ./\n            mv -v exp/joiner-epoch-12-avg-4.int8.onnx ./\n            rm -rfv exp\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\",\n            idx=15,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"sense_voice_2024_07_17_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-int8\",\n            idx=21,\n            lang=\"en\",\n            short_name=\"moonshine_tiny_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n          
  model_name=\"sherpa-onnx-moonshine-base-en-int8\",\n            idx=22,\n            lang=\"en\",\n            short_name=\"moonshine_base_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\",\n            idx=25,\n            lang=\"multi_lang\",\n            short_name=\"dolphin_base_ctc\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\",\n            idx=41,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"sense_voice_2025_09_09_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\",\n            idx=42,\n            lang=\"zh_en_yue\",\n            short_name=\"wenetspeech_yue_u2pconformer_ctc_2025_09_10_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n    return models\n\n\ndef get_1st_models():\n    # See also ./generate-asr-apk-script.py\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\",\n            idx=8,\n            lang=\"bilingual_zh_en\",\n            short_name=\"zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -fv bpe.model\n            rm -fv README.md\n            rm -fv .gitattributes\n            rm -fv *state*\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-06-26\",\n            idx=6,\n            lang=\"en\",\n            short_name=\"zipformer2\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1-chunk-16-left-128.onnx\n            rm -fv decoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n            rm -fv joiner-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n\n            rm -fv README.md\n            rm -fv bpe.model\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-streaming-wenetspeech-20230615\",\n            idx=3,\n            lang=\"zh\",\n            short_name=\"zipformer2\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx\n            rm -fv exp/decoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n            rm -fv exp/joiner-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n\n            rm -fv data/lang_char/lexicon.txt\n            rm -fv data/lang_char/words.txt\n            rm -rfv test_wavs\n            rm -fv README.md\n\n            ls -lh exp/\n            ls -lh data/lang_char\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-fr-2023-04-14\",\n            idx=7,\n            lang=\"fr\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-29-avg-9-with-averaged-model.onnx\n            rm -fv decoder-epoch-29-avg-9-with-averaged-model.int8.onnx\n            rm -fv joiner-epoch-29-avg-9-with-averaged-model.int8.onnx\n\n            rm -fv *.sh\n            rm -rf test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23\",\n            idx=9,\n            lang=\"zh\",\n            short_name=\"small_zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -rf test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            idx=10,\n            lang=\"en\",\n            short_name=\"small_zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -rf test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ctc-zh-int8-2025-04-01\",\n            idx=15,\n            lang=\"zh\",\n            short_name=\"int8_small_zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -f bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ctc-zh-2025-04-01\",\n            idx=16,\n            lang=\"zh\",\n            short_name=\"small_zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -f bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n\n    return models\n\n\ndef get_models():\n    first = get_1st_models()\n    second = get_2nd_models()\n\n    combinations = []\n\n    first_zh = [\n        \"sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23\",\n        \"sherpa-onnx-streaming-zipformer-small-ctc-zh-int8-2025-04-01\",\n        \"sherpa-onnx-streaming-zipformer-small-ctc-zh-2025-04-01\",\n    ]\n\n    second_zh = [\n        \"sherpa-onnx-paraformer-zh-2023-09-14\",\n        \"icefall-asr-zipformer-wenetspeech-20230615\",\n        \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\",\n        \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\",\n        \"sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\",\n        \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\",\n    ]\n    for first_m in first_zh:\n        for second_m in second_zh:\n            combinations.append((first_m, second_m))\n\n    combinations += [\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            \"sherpa-onnx-whisper-tiny.en\",\n        ),\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            \"sherpa-onnx-moonshine-tiny-en-int8\",\n        ),\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            \"sherpa-onnx-moonshine-base-en-int8\",\n        ),\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\",\n        ),\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            
\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\",\n        ),\n        (\n            \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\",\n        ),\n    ]\n    models = []\n    for f, s in combinations:\n        t = []\n        for m in first:\n            if m.model_name == f:\n                t.append(m)\n                break\n        assert len(t) == 1, (f, s, first, second)\n\n        for m in second:\n            if m.model_name == s:\n                t.append(m)\n                break\n        assert len(t) == 2, (f, s, first, second)\n\n        models.append(t)\n\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-asr-2pass.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-asr-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    # The type of the model, e..g, 0, 1, 2. It is hardcoded in the kotlin code\n    idx: int\n\n    # e.g., zh, en, zh_en\n    lang: str\n\n    # e.g., whisper, paraformer, zipformer\n    short_name: str = \"\"\n\n    # cmd is used to remove extra file from the model directory\n    cmd: str = \"\"\n\n    rule_fsts: str = \"\"\n\n\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\",\n            idx=8,\n            lang=\"bilingual_zh_en\",\n            short_name=\"zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -fv bpe.model\n            rm -fv README.md\n            rm -fv .gitattributes\n            rm -fv *state*\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-06-26\",\n            idx=6,\n            lang=\"en\",\n            short_name=\"zipformer2\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1-chunk-16-left-128.onnx\n            rm -fv decoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n            rm -fv joiner-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n\n            rm -fv README.md\n            rm -fv bpe.model\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-streaming-wenetspeech-20230615\",\n            idx=3,\n            lang=\"zh\",\n            short_name=\"zipformer2\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx\n            rm -fv exp/decoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n            rm -fv exp/joiner-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n\n            rm -fv data/lang_char/lexicon.txt\n            rm -fv data/lang_char/words.txt\n            rm -rfv test_wavs\n            rm -fv README.md\n\n            ls -lh exp/\n            ls -lh data/lang_char\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-fr-2023-04-14\",\n            idx=7,\n            lang=\"fr\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-29-avg-9-with-averaged-model.onnx\n            rm -fv decoder-epoch-29-avg-9-with-averaged-model.int8.onnx\n            rm -fv joiner-epoch-29-avg-9-with-averaged-model.int8.onnx\n\n            rm -fv *.sh\n            rm -rfv test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23\",\n            idx=9,\n            lang=\"zh\",\n            short_name=\"small_zipformer_14M_2023_02_23\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -rf test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            idx=10,\n            lang=\"en\",\n            short_name=\"small_zipformer_20M_2023_02_17\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv *.sh\n            rm -rf test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms\",\n            idx=11,\n            lang=\"en\",\n            short_name=\"nemo_ctc_80ms\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms\",\n            idx=12,\n            lang=\"en\",\n            short_name=\"nemo_ctc_480ms\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms\",\n            idx=13,\n            lang=\"en\",\n            short_name=\"nemo_ctc_1040ms\",\n            cmd=\"\"\"\n            pushd 
$model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-korean-2024-06-16\",\n            idx=14,\n            lang=\"ko\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            rm -fv bpe.model\n            rm -fv README.md\n            rm -fv .gitattributes\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-paraformer-bilingual-zh-en\",\n            idx=5,\n            lang=\"zh_en\",\n            short_name=\"paraformer\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv decoder.onnx\n            rm -fv encoder.onnx\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ctc-zh-int8-2025-04-01\",\n            idx=15,\n            lang=\"zh\",\n            short_name=\"int8_small_zipformer_2025_04_01\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -f bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ctc-zh-2025-04-01\",\n            idx=16,\n            lang=\"zh\",\n            short_name=\"small_zipformer_2025_04_01\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -f bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-ctc-zh-int8-2025-06-30\",\n            idx=17,\n            lang=\"zh\",\n            short_name=\"large_zipformer_int8\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-ctc-zh-2025-06-30\",\n            idx=18,\n            lang=\"zh\",\n            short_name=\"large_zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-ctc-zh-fp16-2025-06-30\",\n            idx=19,\n            lang=\"zh\",\n            short_name=\"large_zipformer_fp16\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-ctc-zh-int8-2025-06-30\",\n            idx=20,\n            lang=\"zh\",\n            short_name=\"large_zipformer_int8\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv bpe.model\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-kroko-2025-08-06\",\n            idx=21,\n            lang=\"en\",\n            short_name=\"zipformer_kroko_asr\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-es-kroko-2025-08-06\",\n            idx=22,\n            lang=\"es\",\n            short_name=\"zipformer_kroko_asr\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-fr-kroko-2025-08-06\",\n            idx=23,\n            lang=\"fr\",\n            short_name=\"zipformer_kroko_asr\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-de-kroko-2025-08-06\",\n            idx=24,\n            lang=\"de\",\n            short_name=\"zipformer_kroko_asr\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ru-vosk-int8-2025-08-16\",\n            idx=25,\n            lang=\"ru\",\n            short_name=\"small_zipformer_int8_2025_08_16\",\n            cmd=\"\"\"\n            pushd 
$model_name\n            rm -rf test_wavs\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-ru-vosk-2025-08-16\",\n            idx=26,\n            lang=\"ru\",\n            short_name=\"small_zipformer_2025_08_16\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -rf test_wavs\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-t-one-russian-2025-09-08\",\n            idx=27,\n            lang=\"ru\",\n            short_name=\"t_one_ctc_2025_09_08\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -v *.wav\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14\",\n            idx=28,\n            lang=\"en\",\n            short_name=\"nemotron-speech-streaming-en-0.6b-int8-2026-01-14\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-bn-vosk-2026-02-09\",\n            idx=29,\n            lang=\"bn\",\n            short_name=\"bengali_vosk_2026_02_09\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rf test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: 
{num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-asr.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-audio-tagging-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass AudioTaggingModel:\n    model_name: str\n    idx: int\n    short_name: str = \"\"\n\n\ndef get_models():\n    # see https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n    icefall_models = [\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-zipformer-small-audio-tagging-2024-04-15\",\n            idx=0,\n            short_name=\"small_zipformer\",\n        ),\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-zipformer-audio-tagging-2024-04-09\",\n            idx=1,\n            short_name=\"zipformer\",\n        ),\n    ]\n\n    ced_models = [\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-ced-tiny-audio-tagging-2024-04-19\",\n            idx=2,\n            short_name=\"ced_tiny\",\n        ),\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-ced-mini-audio-tagging-2024-04-19\",\n            idx=3,\n            short_name=\"ced_mini\",\n        ),\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-ced-small-audio-tagging-2024-04-19\",\n            idx=4,\n            short_name=\"ced_small\",\n        ),\n        AudioTaggingModel(\n            model_name=\"sherpa-onnx-ced-base-audio-tagging-2024-04-19\",\n            idx=5,\n            short_name=\"ced_base\",\n        ),\n    ]\n\n    return icefall_models + ced_models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    
all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-audio-tagging.sh\",\n        \"./build-apk-audio-tagging-wearos.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-qnn-vad-asr-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    # The type of the model, e..g, 0, 1, 2. It is hardcoded in the kotlin code\n    idx: int\n\n    # e.g., zh, en, zh_en\n    lang: str\n\n    # e.g., whisper, paraformer, zipformer\n    short_name: str = \"\"\n\n    # cmd is used to remove extra file from the model directory\n    cmd: str = \"\"\n\n    rule_fsts: str = \"\"\n\n    use_hr: bool = False\n\n\n# See get_2nd_models() in ./generate-asr-2pass-apk-script.py\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-qnn-5-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9000,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"5-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-8-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9001,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"8-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n  
      Model(\n            model_name=\"sherpa-onnx-qnn-10-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9002,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"10-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-13-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9003,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"13-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-15-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9004,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"15-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-18-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9005,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"18-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-20-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9006,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"20-seconds-sense_voice_2024_07_17_int8\",\n            
use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-23-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9007,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"23-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-25-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9008,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"25-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-28-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9009,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"28-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-30-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\",\n            idx=9010,\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"30-seconds-sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            
model_name=\"sherpa-onnx-qnn-5-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9011,\n            lang=\"zh\",\n            short_name=\"5-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-8-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9012,\n            lang=\"zh\",\n            short_name=\"8-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-10-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9013,\n            lang=\"zh\",\n            short_name=\"10-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-13-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9014,\n            lang=\"zh\",\n            short_name=\"13-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-15-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9015,\n            lang=\"zh\",\n            short_name=\"15-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            
popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-18-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9016,\n            lang=\"zh\",\n            short_name=\"18-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-20-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9017,\n            lang=\"zh\",\n            short_name=\"20-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-23-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9018,\n            lang=\"zh\",\n            short_name=\"23-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-25-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9019,\n            lang=\"zh\",\n            short_name=\"25-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-28-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9020,\n            lang=\"zh\",\n            short_name=\"28-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n  
          rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-30-seconds-zipformer-ctc-zh-2025-07-03-int8\",\n            idx=9021,\n            lang=\"zh\",\n            short_name=\"30-seconds-zipformer_ctc_2025_07_03_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-5-seconds-paraformer-zh-2023-03-28-int8\",\n            idx=9023,\n            lang=\"zh\",\n            short_name=\"5-seconds-paraformer_zh_2023_03_28_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-qnn-5-seconds-paraformer-zh-2025-10-07-int8\",\n            idx=9024,\n            lang=\"zh\",\n            short_name=\"5-seconds-paraformer_zh_2025_10_07_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index 
< remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-qnn-vad-asr-simulate-streaming.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        if not Path(f\"{filename}.in\").is_file():\n            print(f\"skip {filename}\")\n            continue\n\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-slid-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass SlidModel:\n    model_name: str\n    idx: int\n    short_name: str = \"\"\n\n\ndef get_models():\n    # see https://k2-fsa.github.io/sherpa/onnx/spolken-language-identification/pretrained_models.html#pre-trained-models\n    whisper_models = [\n        SlidModel(\n            model_name=\"sherpa-onnx-whisper-tiny\",\n            idx=0,\n            short_name=\"whisper_tiny\",\n        ),\n    ]\n\n    return whisper_models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-slid.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = 
template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-speaker-diarization-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom typing import List\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass SpeakerSegmentationModel:\n    model_name: str\n    short_name: str\n\n\n@dataclass\nclass SpeakerEmbeddingModel:\n    model_name: str\n    short_name: str\n\n\n@dataclass\nclass Model:\n    segmentation: SpeakerSegmentationModel\n    embedding: SpeakerEmbeddingModel\n\n\ndef get_segmentation_models() -> List[SpeakerSegmentationModel]:\n    models = [\n        SpeakerSegmentationModel(\n            model_name=\"sherpa-onnx-pyannote-segmentation-3-0\",\n            short_name=\"pyannote_audio\",\n        ),\n        SpeakerSegmentationModel(\n            model_name=\"sherpa-onnx-reverb-diarization-v1\",\n            short_name=\"revai_v1\",\n        ),\n    ]\n\n    return models\n\n\ndef get_embedding_models() -> List[SpeakerEmbeddingModel]:\n    models = [\n        SpeakerSegmentationModel(\n            model_name=\"3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k\",\n            short_name=\"3dspeaker\",\n        ),\n        SpeakerSegmentationModel(\n            model_name=\"nemo_en_titanet_small\",\n            short_name=\"nemo\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    segmentation_models = get_segmentation_models()\n    embedding_models = get_embedding_models()\n\n    all_model_list = []\n    for s in segmentation_models:\n        for e in embedding_models:\n            
all_model_list.append(Model(segmentation=s, embedding=e))\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\"./build-apk-speaker-diarization.sh\"]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-speaker-identification-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom typing import List\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass SpeakerIdentificationModel:\n    model_name: str\n    short_name: str = \"\"\n    lang: str = \"\"\n    framework: str = \"\"\n\n\ndef get_3dspeaker_models() -> List[SpeakerIdentificationModel]:\n    models = [\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_campplus_sv_en_voxceleb_16k.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_eres2net_base_200k_sv_zh-cn_16k-common.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_eres2net_large_sv_zh-cn_3dspeaker_16k.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_eres2net_sv_en_voxceleb_16k.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"3dspeaker_speech_eres2net_sv_zh-cn_16k-common.onnx\"\n        ),\n    ]\n\n    prefix = \"3dspeaker_speech_\"\n    num = len(prefix)\n    for m in models:\n        m.framework = \"3dspeaker\"\n        m.short_name = m.model_name[num:-5]\n        if \"_zh-cn_\" in m.model_name:\n            m.lang = \"zh\"\n        elif \"_en_\" in m.model_name:\n            m.lang = \"en\"\n        else:\n            raise 
ValueError(m)\n    return models\n\n\ndef get_wespeaker_models() -> List[SpeakerIdentificationModel]:\n    models = [\n        SpeakerIdentificationModel(model_name=\"wespeaker_en_voxceleb_CAM++.onnx\"),\n        SpeakerIdentificationModel(model_name=\"wespeaker_en_voxceleb_CAM++_LM.onnx\"),\n        SpeakerIdentificationModel(\n            model_name=\"wespeaker_en_voxceleb_resnet152_LM.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"wespeaker_en_voxceleb_resnet221_LM.onnx\"\n        ),\n        SpeakerIdentificationModel(\n            model_name=\"wespeaker_en_voxceleb_resnet293_LM.onnx\"\n        ),\n        SpeakerIdentificationModel(model_name=\"wespeaker_en_voxceleb_resnet34.onnx\"),\n        SpeakerIdentificationModel(model_name=\"wespeaker_en_voxceleb_resnet34_LM.onnx\"),\n        SpeakerIdentificationModel(model_name=\"wespeaker_zh_cnceleb_resnet34.onnx\"),\n        SpeakerIdentificationModel(model_name=\"wespeaker_zh_cnceleb_resnet34_LM.onnx\"),\n    ]\n\n    prefix = \"wespeaker_xx_\"\n    num = len(prefix)\n    for m in models:\n        m.framework = \"wespeaker\"\n        m.short_name = m.model_name[num:-5]\n        if \"_zh_\" in m.model_name:\n            m.lang = \"zh\"\n        elif \"_en_\" in m.model_name:\n            m.lang = \"en\"\n        else:\n            raise ValueError(m)\n    return models\n\n\ndef get_nemo_models() -> List[SpeakerIdentificationModel]:\n    models = [\n        SpeakerIdentificationModel(\n            model_name=\"nemo_en_speakerverification_speakernet.onnx\"\n        ),\n        SpeakerIdentificationModel(model_name=\"nemo_en_titanet_large.onnx\"),\n        SpeakerIdentificationModel(model_name=\"nemo_en_titanet_small.onnx\"),\n    ]\n\n    prefix = \"nemo_en_\"\n    num = len(prefix)\n    for m in models:\n        m.framework = \"nemo\"\n        m.short_name = m.model_name[num:-5]\n        if \"_zh_\" in m.model_name:\n            m.lang = \"zh\"\n        elif \"_en_\" in 
m.model_name:\n            m.lang = \"en\"\n        else:\n            raise ValueError(m)\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_3dspeaker_models()\n    all_model_list += get_wespeaker_models()\n    all_model_list += get_nemo_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\"./build-apk-speaker-identification.sh\"]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-tts-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom typing import List, Optional\n\nimport jinja2\n\n# pip install iso639-lang\nfrom iso639 import Lang\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass TtsModel:\n    model_dir: str\n    model_name: str = \"\"  # for vits\n    acoustic_model_name: str = \"\"  # for matcha\n    vocoder: str = \"\"  # for matcha\n    voices: str = \"\"  # for kokoro\n    lang: str = \"\"  # en, zh, fr, de, etc.\n    lang2: str = \"\"  # en, zh, fr, de, etc.\n    rule_fsts: Optional[List[str]] = None\n    rule_fars: Optional[List[str]] = None\n    data_dir: Optional[str] = None\n    dict_dir: Optional[str] = None\n    is_char: bool = False\n    lang_iso_639_3: str = \"\"\n    lang_iso_639_3_2: str = \"\"\n    lexicon: str = \"\"\n    is_kitten: bool = False\n\n\ndef convert_lang_to_iso_639_3(models: List[TtsModel]):\n    for m in models:\n        if m.lang_iso_639_3 == \"\":\n            m.lang_iso_639_3 = Lang(m.lang).pt3\n        if m.lang2 != \"\":\n            m.lang_iso_639_3_2 = Lang(m.lang2).pt3\n\n\ndef get_coqui_models() -> List[TtsModel]:\n    # English (coqui-ai/TTS)\n    models = [\n        TtsModel(model_dir=\"vits-coqui-en-ljspeech\"),\n        TtsModel(model_dir=\"vits-coqui-en-ljspeech-neon\"),\n        TtsModel(model_dir=\"vits-coqui-en-vctk\"),\n        #  TtsModel(model_dir=\"vits-coqui-en-jenny\"),\n    ]\n\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = \"model.onnx\"\n        m.lang = \"en\"\n\n    character_models = [\n        TtsModel(model_dir=\"vits-coqui-bg-cv\", 
lang=\"bg\"),\n        TtsModel(model_dir=\"vits-coqui-bn-custom_female\", lang=\"bn\"),\n        TtsModel(model_dir=\"vits-coqui-cs-cv\", lang=\"cs\"),\n        TtsModel(model_dir=\"vits-coqui-da-cv\", lang=\"da\"),\n        TtsModel(model_dir=\"vits-coqui-de-css10\", lang=\"de\"),\n        TtsModel(model_dir=\"vits-coqui-es-css10\", lang=\"es\"),\n        TtsModel(model_dir=\"vits-coqui-et-cv\", lang=\"et\"),\n        TtsModel(model_dir=\"vits-coqui-fi-css10\", lang=\"fi\"),\n        TtsModel(model_dir=\"vits-coqui-fr-css10\", lang=\"fr\"),\n        TtsModel(model_dir=\"vits-coqui-ga-cv\", lang=\"ga\"),\n        TtsModel(model_dir=\"vits-coqui-hr-cv\", lang=\"hr\"),\n        TtsModel(model_dir=\"vits-coqui-lt-cv\", lang=\"lt\"),\n        TtsModel(model_dir=\"vits-coqui-lv-cv\", lang=\"lv\"),\n        TtsModel(model_dir=\"vits-coqui-mt-cv\", lang=\"mt\"),\n        TtsModel(model_dir=\"vits-coqui-nl-css10\", lang=\"nl\"),\n        TtsModel(model_dir=\"vits-coqui-pl-mai_female\", lang=\"pl\"),\n        TtsModel(model_dir=\"vits-coqui-pt-cv\", lang=\"pt\"),\n        TtsModel(model_dir=\"vits-coqui-ro-cv\", lang=\"ro\"),\n        TtsModel(model_dir=\"vits-coqui-sk-cv\", lang=\"sk\"),\n        TtsModel(model_dir=\"vits-coqui-sl-cv\", lang=\"sl\"),\n        TtsModel(model_dir=\"vits-coqui-sv-cv\", lang=\"sv\"),\n        TtsModel(model_dir=\"vits-coqui-uk-mai\", lang=\"uk\"),\n    ]\n    for m in character_models:\n        m.is_char = True\n        m.model_name = \"model.onnx\"\n\n    return models + character_models\n\n\ndef get_piper_models() -> List[TtsModel]:\n    models = [\n        #  TtsModel(model_dir=\"vits-piper-es_ES-mls_10246-low\"),\n        #  TtsModel(model_dir=\"vits-piper-es_ES-mls_9972-low\"),\n        #  TtsModel(model_dir=\"vits-piper-pl_PL-mls_6892-low\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-kareem-low\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-kareem-medium\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-SA_dii-high\"),\n       
 TtsModel(model_dir=\"vits-piper-ar_JO-SA_miro-high\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-SA_miro_V2-high\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_ona-medium\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_ona-x_low\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_pau-x_low\"),\n        TtsModel(model_dir=\"vits-piper-cs_CZ-jirka-low\"),\n        TtsModel(model_dir=\"vits-piper-cs_CZ-jirka-medium\"),\n        TtsModel(model_dir=\"vits-piper-cy_GB-bu_tts-medium\"),\n        TtsModel(model_dir=\"vits-piper-cy_GB-gwryw_gogleddol-medium\"),\n        TtsModel(model_dir=\"vits-piper-da_DK-talesyntese-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-eva_k-x_low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-karlsson-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-kerstin-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-dii-high\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-miro-high\"),\n        #  TtsModel(model_dir=\"vits-piper-de_DE-mls-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-pavoque-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-ramona-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-high\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten_emotional-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados-high\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados_turret-high\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados_turret-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-glados_turret-medium\"),\n        TtsModel(model_dir=\"vits-piper-el_GR-rapunzelina-low\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-alan-low\"),\n        
TtsModel(model_dir=\"vits-piper-en_GB-alan-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-alba-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-aru-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-cori-high\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-cori-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-dii-high\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-jenny_dioco-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-northern_english_male-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-semaine-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_female-low\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_female-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_male-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-sweetbbak-amy\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-vctk-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-amy-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-amy-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-arctic-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-bryce-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-danny-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-glados\"),\n        TtsModel(model_dir=\"vits-piper-en_US-glados-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-hfc_female-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-hfc_male-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-joe-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-john-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kathleen-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kristin-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kusal-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-l2arctic-medium\"),\n        
TtsModel(model_dir=\"vits-piper-en_US-lessac-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-lessac-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-lessac-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-libritts-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-libritts_r-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ljspeech-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ljspeech-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-norman-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_AR-daniela-high\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-carlfm-x_low\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-davefx-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-glados-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-sharvard-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_MX-ald-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_MX-claude-high\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-amir-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-ganji-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-ganji_adabi-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-gyro-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-reza_ibrahim-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_en-rezahedayatfar-ibrahimwalk-medium\"),\n        TtsModel(model_dir=\"vits-piper-fi_FI-harri-low\"),\n        TtsModel(model_dir=\"vits-piper-fi_FI-harri-medium\"),\n        #  TtsModel(model_dir=\"vits-piper-fr_FR-mls-medium\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-gilles-low\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-miro-high\"),\n  
      TtsModel(model_dir=\"vits-piper-fr_FR-siwis-low\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-siwis-medium\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-tjiho-model1\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-tjiho-model2\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-tjiho-model3\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-tom-medium\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-upmc-medium\"),\n        TtsModel(model_dir=\"vits-piper-hi_IN-pratham-medium\"),\n        TtsModel(model_dir=\"vits-piper-hi_IN-priyamvada-medium\"),\n        TtsModel(model_dir=\"vits-piper-hi_IN-rohan-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-anna-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-berta-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-imre-medium\"),\n        TtsModel(model_dir=\"vits-piper-id_ID-news_tts-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-bui-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-salka-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-steinn-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-ugla-medium\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-dii-high\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-paola-medium\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-riccardo-x_low\"),\n        TtsModel(model_dir=\"vits-piper-ka_GE-natia-medium\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-iseke-x_low\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-issai-high\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-raya-x_low\"),\n        TtsModel(model_dir=\"vits-piper-lv_LV-aivars-medium\"),\n        TtsModel(model_dir=\"vits-piper-lb_LU-marylux-medium\"),\n        TtsModel(model_dir=\"vits-piper-ne_NP-chitwan-medium\"),\n        TtsModel(model_dir=\"vits-piper-ne_NP-google-medium\"),\n        TtsModel(model_dir=\"vits-piper-ne_NP-google-x_low\"),\n        
TtsModel(model_dir=\"vits-piper-nl_BE-nathalie-medium\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-nathalie-x_low\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-rdh-medium\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-rdh-x_low\"),\n        TtsModel(model_dir=\"vits-piper-nl_NL-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-nl_NL-dii-high\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls-medium\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls_5809-low\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls_7432-low\"),\n        TtsModel(model_dir=\"vits-piper-no_NO-talesyntese-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-darkman-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-gosia-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-jarvis_wg_glos-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-justyna_wg_glos-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-mc_speech-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-meski_wg_glos-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-zenski_wg_glos-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-cadu-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-dii-high\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-edresson-low\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-faber-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-jeff-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-pt_PT-dii-high\"),\n        TtsModel(model_dir=\"vits-piper-pt_PT-miro-high\"),\n        TtsModel(model_dir=\"vits-piper-pt_PT-tugao-medium\"),\n        TtsModel(model_dir=\"vits-piper-ro_RO-mihai-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-denis-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-dmitri-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-irina-medium\"),\n        
TtsModel(model_dir=\"vits-piper-ru_RU-ruslan-medium\"),\n        TtsModel(model_dir=\"vits-piper-sk_SK-lili-medium\"),\n        TtsModel(model_dir=\"vits-piper-sl_SI-artur-medium\"),\n        TtsModel(model_dir=\"vits-piper-sr_RS-serbski_institut-medium\"),\n        TtsModel(model_dir=\"vits-piper-sv_SE-lisa-medium\"),\n        TtsModel(model_dir=\"vits-piper-sv_SE-nst-medium\"),\n        TtsModel(model_dir=\"vits-piper-sw_CD-lanfrica-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-dfki-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-fahrettin-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-fettah-medium\"),\n        TtsModel(model_dir=\"vits-piper-uk_UA-lada-x_low\"),\n        TtsModel(model_dir=\"vits-piper-uk_UA-ukrainian_tts-medium\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-25hours_single-low\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-vais1000-medium\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-vivos-x_low\"),\n        TtsModel(model_dir=\"vits-piper-zh_CN-huayan-medium\"),\n    ]\n\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = m.model_dir[len(\"vits-piper-\") :] + \".onnx\"\n        m.lang = m.model_dir.split(\"-\")[2][:2]\n\n    return models\n\n\ndef get_mimic3_models() -> List[TtsModel]:\n    models = [\n        TtsModel(model_dir=\"vits-mimic3-af_ZA-google-nwu_low\"),\n        TtsModel(model_dir=\"vits-mimic3-bn-multi_low\"),\n        TtsModel(model_dir=\"vits-mimic3-es_ES-m-ailabs_low\"),\n        TtsModel(model_dir=\"vits-mimic3-fa-haaniye_low\"),\n        TtsModel(model_dir=\"vits-mimic3-fi_FI-harri-tapani-ylilammi_low\"),\n        TtsModel(model_dir=\"vits-mimic3-gu_IN-cmu-indic_low\"),\n        TtsModel(model_dir=\"vits-mimic3-hu_HU-diana-majlinger_low\"),\n        TtsModel(model_dir=\"vits-mimic3-ko_KO-kss_low\"),\n        TtsModel(model_dir=\"vits-mimic3-ne_NP-ne-google_low\"),\n        
TtsModel(model_dir=\"vits-mimic3-pl_PL-m-ailabs_low\"),\n        TtsModel(model_dir=\"vits-mimic3-tn_ZA-google-nwu_low\"),\n        TtsModel(model_dir=\"vits-mimic3-vi_VN-vais1000_low\"),\n    ]\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = m.model_dir[len(\"vits-mimic3-\") :] + \".onnx\"\n        m.lang = m.model_dir.split(\"-\")[2][:2]\n\n    return models\n\n\ndef get_vits_models() -> List[TtsModel]:\n    chinese_models = [\n        # Chinese\n        TtsModel(\n            model_dir=\"vits-icefall-zh-aishell3\",\n            model_name=\"model.onnx\",\n            lang=\"zh\",\n            rule_fsts=\"vits-icefall-zh-aishell3/phone.fst,vits-icefall-zh-aishell3/date.fst,vits-icefall-zh-aishell3/number.fst,vits-icefall-zh-aishell3/new_heteronym.fst\",\n            rule_fars=\"vits-icefall-zh-aishell3/rule.far\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-aishell3\",\n            model_name=\"vits-aishell3.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-doom\",\n            model_name=\"doom.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-echo\",\n            model_name=\"echo.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-zenyatta\",\n            model_name=\"zenyatta.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-abyssinvoker\",\n            model_name=\"abyssinvoker.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-keqing\",\n            model_name=\"keqing.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-eula\",\n            model_name=\"eula.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-bronya\",\n            
model_name=\"bronya.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-theresa\",\n            model_name=\"theresa.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-wnj\",\n            model_name=\"vits-zh-hf-fanchen-wnj.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-melo-tts-zh_en\",\n            model_name=\"model.onnx\",\n            lang=\"zh\",\n            lang2=\"en\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-C\",\n            model_name=\"vits-zh-hf-fanchen-C.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe\",\n            model_name=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe_new\",\n            model_name=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe_new.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-unity\",\n            model_name=\"vits-zh-hf-fanchen-unity.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"sherpa-onnx-vits-zh-ll\",\n            model_name=\"model.onnx\",\n            lang=\"zh\",\n        ),\n    ]\n\n    rule_fsts = [\"phone.fst\", \"date.fst\", \"number.fst\"]\n    for m in chinese_models:\n        s = [f\"{m.model_dir}/{r}\" for r in rule_fsts]\n        if (\n            \"vits-zh-hf\" in m.model_dir\n            or \"sherpa-onnx-vits-zh-ll\" == m.model_dir\n            or \"melo-tts\" in m.model_dir\n        ):\n            s = s[:-1]\n        else:\n            m.rule_fars = f\"{m.model_dir}/rule.far\"\n\n        m.rule_fsts = \",\".join(s)\n\n    all_models = chinese_models + [\n        TtsModel(\n            model_dir=\"vits-cantonese-hf-xiaomaiiwn\",\n            
model_name=\"vits-cantonese-hf-xiaomaiiwn.onnx\",\n            lang=\"cantonese\",\n            lang_iso_639_3=\"yue\",\n            rule_fsts=\"vits-cantonese-hf-xiaomaiiwn/rule.fst\",\n        ),\n        # English (US)\n        TtsModel(model_dir=\"vits-vctk\", model_name=\"vits-vctk.onnx\", lang=\"en\"),\n        #  TtsModel(model_dir=\"vits-ljs\", model_name=\"vits-ljs.onnx\", lang=\"en\"),\n        # fmt: on\n    ]\n\n    return all_models\n\n\ndef get_matcha_models() -> List[TtsModel]:\n    chinese_models = [\n        TtsModel(\n            model_dir=\"matcha-icefall-zh-baker\",\n            acoustic_model_name=\"model-steps-3.onnx\",\n            lang=\"zh\",\n            lexicon=\"lexicon.txt\",\n        )\n    ]\n    rule_fsts = [\"phone.fst\", \"date.fst\", \"number.fst\"]\n    for m in chinese_models:\n        s = [f\"{m.model_dir}/{r}\" for r in rule_fsts]\n        m.rule_fsts = \",\".join(s)\n        m.vocoder = \"vocos-22khz-univ.onnx\"\n\n    chinese_english_models = [\n        TtsModel(\n            model_dir=\"matcha-icefall-zh-en\",\n            acoustic_model_name=\"model-steps-3.onnx\",\n            lang=\"zh\",\n            lexicon=\"lexicon.txt\",\n        )\n    ]\n    rule_fsts_zh = [\"phone-zh.fst\", \"date-zh.fst\", \"number-zh.fst\"]\n    for m in chinese_english_models:\n        s = [f\"{m.model_dir}/{r}\" for r in rule_fsts_zh]\n        m.rule_fsts = \",\".join(s)\n        m.vocoder = \"vocos-16khz-univ.onnx\"\n        m.data_dir = f\"{m.model_dir}/espeak-ng-data\"\n\n    english_persian_models = [\n        TtsModel(\n            model_dir=\"matcha-icefall-en_US-ljspeech\",\n            acoustic_model_name=\"model-steps-3.onnx\",\n            lang=\"en\",\n        ),\n        TtsModel(\n            model_dir=\"matcha-tts-fa_en-musa\",\n            acoustic_model_name=\"model.onnx\",\n            lang=\"fa\",\n        ),\n        TtsModel(\n            model_dir=\"matcha-tts-fa_en-khadijah\",\n            
acoustic_model_name=\"model.onnx\",\n            lang=\"fa\",\n        ),\n    ]\n    for m in english_persian_models:\n        m.data_dir = f\"{m.model_dir}/espeak-ng-data\"\n        m.vocoder = \"vocos-22khz-univ.onnx\"\n\n    return chinese_models + english_persian_models + chinese_english_models\n\n\ndef get_kokoro_models() -> List[TtsModel]:\n    english_models = [\n        TtsModel(\n            model_dir=\"kokoro-en-v0_19\",\n            model_name=\"model.onnx\",\n            lang=\"en\",\n        )\n    ]\n    for m in english_models:\n        m.data_dir = f\"{m.model_dir}/espeak-ng-data\"\n        m.voices = \"voices.bin\"\n\n    multi_lingual_models = [\n        TtsModel(\n            model_dir=\"kokoro-multi-lang-v1_0\",\n            model_name=\"model.onnx\",\n            lang=\"en\",\n            lang2=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"kokoro-multi-lang-v1_1\",\n            model_name=\"model.onnx\",\n            lang=\"en\",\n            lang2=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"kokoro-int8-multi-lang-v1_1\",\n            model_name=\"model.int8.onnx\",\n            lang=\"en\",\n            lang2=\"zh\",\n        ),\n    ]\n    for m in multi_lingual_models:\n        m.data_dir = f\"{m.model_dir}/espeak-ng-data\"\n        m.voices = \"voices.bin\"\n        m.lexicon = f\"{m.model_dir}/lexicon-us-en.txt,{m.model_dir}/lexicon-zh.txt\"\n        m.rule_fsts = f\"{m.model_dir}/phone-zh.fst,{m.model_dir}/date-zh.fst,{m.model_dir}/number-zh.fst\"\n\n    return english_models + multi_lingual_models\n\n\ndef get_kitten_models() -> List[TtsModel]:\n    english_models = [\n        TtsModel(\n            model_dir=\"kitten-nano-en-v0_1-fp16\",\n            model_name=\"model.fp16.onnx\",\n            lang=\"en\",\n        ),\n        TtsModel(\n            model_dir=\"kitten-nano-en-v0_2-fp16\",\n            model_name=\"model.fp16.onnx\",\n            lang=\"en\",\n        ),\n        TtsModel(\n        
    model_dir=\"kitten-mini-en-v0_1-fp16\",\n            model_name=\"model.fp16.onnx\",\n            lang=\"en\",\n        ),\n    ]\n    for m in english_models:\n        m.data_dir = f\"{m.model_dir}/espeak-ng-data\"\n        m.voices = \"voices.bin\"\n        m.is_kitten = True\n\n    return english_models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n    d = dict()\n\n    all_model_list = get_vits_models()\n    all_model_list += get_piper_models()\n    all_model_list += get_mimic3_models()\n    all_model_list += get_coqui_models()\n    all_model_list += get_matcha_models()\n    all_model_list += get_kokoro_models()\n    all_model_list += get_kitten_models()\n\n    convert_lang_to_iso_639_3(all_model_list)\n    print(all_model_list)\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n    d[\"tts_model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"tts_model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\"./build-apk-tts.sh\", \"./build-apk-tts-engine.sh\"]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/apk/generate-vad-asr-apk-script.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    # The type of the model, e.g., 0, 1, 2. It is hardcoded in the Kotlin code\n    idx: int\n\n    # e.g., zh, en, zh_en\n    lang: str\n    lang2: str\n\n    # e.g., whisper, paraformer, zipformer\n    short_name: str = \"\"\n\n    # cmd is used to remove extra files from the model directory\n    cmd: str = \"\"\n\n    rule_fsts: str = \"\"\n\n    use_hr: bool = False\n\n\n# See get_2nd_models() in ./generate-asr-2pass-apk-script.py\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-whisper-tiny.en\",\n            idx=2,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"whisper_tiny\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv tiny.en-encoder.onnx\n            rm -fv tiny.en-decoder.onnx\n            rm -rf test_wavs\n            rm -fv *.py\n            rm -fv requirements.txt\n            rm -fv .gitignore\n            rm -fv README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-2023-09-14\",\n            idx=0,\n            lang=\"zh_en\",\n            lang2=\"Chinese,English\",\n            short_name=\"paraformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            
if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -fv README.md\n            rm -rfv test_wavs\n            rm -fv model.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\",\n            idx=15,\n            lang=\"zh_en_ko_ja_yue\",\n            lang2=\"中英粤日韩\",\n            short_name=\"sense_voice_2024_07_17_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-small-2024-03-09\",\n            idx=14,\n            lang=\"zh_en\",\n            lang2=\"Chinese,English\",\n            short_name=\"small_paraformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -fv README.md\n            rm -fv *.py\n            rm -fv *.yaml\n            rm -fv *.mvn\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-wenetspeech-20230615\",\n            idx=4,\n            lang=\"zh\",\n            lang2=\"Chinese\",\n            short_name=\"zipformer\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            mv -v data/lang_char/tokens.txt ./\n            rm -rfv data/lang_char\n\n            mv -v exp/encoder-epoch-12-avg-4.int8.onnx ./\n            mv -v exp/decoder-epoch-12-avg-4.onnx ./\n            mv -v exp/joiner-epoch-12-avg-4.int8.onnx ./\n            rm -rfv exp\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\",\n            idx=7,\n            lang=\"be_de_en_es_fr_hr_it_pl_ru_uk\",\n            lang2=\"be_de_en_es_fr_hr_it_pl_ru_uk\",\n            short_name=\"fast_conformer_ctc_20k\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-fast-conformer-ctc-en-24500\",\n            idx=8,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"fast_conformer_ctc_24500\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\",\n            idx=9,\n            lang=\"en_de_es_fr\",\n            lang2=\"English,German,Spanish,French\",\n            short_name=\"fast_conformer_ctc_14288\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-fast-conformer-ctc-es-1424\",\n            idx=10,\n            
lang=\"es\",\n            lang2=\"Spanish\",\n            short_name=\"fast_conformer_ctc_1424\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\",\n            idx=11,\n            lang=\"zh\",\n            lang2=\"Chinese\",\n            short_name=\"telespeech\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! -f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv test.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-thai-2024-06-20\",\n            idx=12,\n            lang=\"th\",\n            lang2=\"Thai\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            rm -fv bpe.model\n\n            rm -fv encoder-epoch-12-avg-5.onnx\n            rm -fv decoder-epoch-12-avg-5.int8.onnx\n            rm joiner-epoch-12-avg-5.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-korean-2024-06-24\",\n            idx=13,\n            lang=\"ko\",\n            lang2=\"Korean\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            rm -fv bpe.model\n\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            ls -lh\n\n            popd\n     
       \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01\",\n            idx=16,\n            lang=\"ja\",\n            lang2=\"Japanese\",\n            short_name=\"zipformer_reazonspeech\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ru-2024-09-18\",\n            idx=17,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv encoder.onnx\n            rm -fv decoder.int8.onnx\n            rm -fv joiner.onnx\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-small-zipformer-ru-2024-09-18\",\n            idx=18,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"small_zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv encoder.onnx\n            rm -fv decoder.int8.onnx\n            rm -fv joiner.onnx\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24\",\n            idx=19,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"nemo_ctc_giga_am\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv *.sh\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n      
  ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24\",\n            idx=20,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"nemo_transducer_giga_am\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv *.sh\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-int8\",\n            idx=21,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"moonshine_tiny_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-en-int8\",\n            idx=22,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"moonshine_base_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-zh-en-2023-11-22\",\n            idx=23,\n            lang=\"zh_en\",\n            lang2=\"Chinese,English\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv encoder-epoch-34-avg-19.onnx\n            rm -fv joiner-epoch-34-avg-19.onnx\n            rm -fv bbpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\",\n            idx=25,\n            lang=\"multi_lang\",\n            lang2=\"multi_lang\",\n            short_name=\"dolphin_base_ctc\",\n            cmd=\"\"\"\n            pushd $model_name\n\n     
       rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-vi-int8-2025-04-20\",\n            idx=26,\n            lang=\"vi\",\n            lang2=\"Vietnamese\",\n            short_name=\"zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19\",\n            idx=27,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"nemo_ctc_giga_am_v2\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv *.sh\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19\",\n            idx=28,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"nemo_transducer_giga_am\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv *.sh\n            rm -fv *.py\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ru-int8-2025-04-20\",\n            idx=29,\n            lang=\"ru\",\n            lang2=\"Russian\",\n            short_name=\"v2_zipformer\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            rm -fv bpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\",\n            idx=30,\n            lang=\"en\",\n            lang2=\"English\",\n            
short_name=\"parakeet_tdt_0.6b_v2\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\",\n            idx=31,\n            lang=\"zh\",\n            lang2=\"Chinese\",\n            short_name=\"zipformer_2025_07_03\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -rfv bbpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\",\n            idx=33,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"parakeet_tdt_ctc_110m\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8\",\n            idx=34,\n            lang=\"ja\",\n            lang2=\"Japanese\",\n            short_name=\"parakeet-tdt_ctc_0.6b_ja\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc-int8\",\n            idx=35,\n            lang=\"pt\",\n            lang2=\"Portuguese\",\n            short_name=\"stt_pt_fastconformer_hybrid_large_pc_transducer_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc-int8\",\n            idx=36,\n            lang=\"pt\",\n      
      lang2=\"Portuguese\",\n            short_name=\"stt_pt_fastconformer_hybrid_large_pc_ctc-int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc-int8\",\n            idx=37,\n            lang=\"de\",\n            lang2=\"German\",\n            short_name=\"stt_de_fastconformer_hybrid_large_pc_transducer_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc-int8\",\n            idx=38,\n            lang=\"de\",\n            lang2=\"German\",\n            short_name=\"stt_de_fastconformer_hybrid_large_pc_ctc-int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ctc-small-zh-int8-2025-07-16\",\n            idx=39,\n            lang=\"zh\",\n            lang2=\"Chinese\",\n            short_name=\"zipformer_ctc_small_2025_07_16\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -rfv bbpe.model\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8\",\n            idx=40,\n            lang=\"multi\",\n            lang2=\"25_languages\",\n            short_name=\"parakeet_tdt_0.6b_v3\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            
model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\",\n            idx=41,\n            lang=\"zh_en_ko_ja_yue\",\n            lang2=\"中英粤日韩\",\n            short_name=\"sense_voice_2025_09_09_int8\",\n            use_hr=True,\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\",\n            idx=42,\n            lang=\"zh_en_yue\",\n            lang2=\"中英粤\",\n            short_name=\"wenetspeech_yue_u2pconformer_ctc_2025_09_10_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-int8-2025-10-07\",\n            idx=43,\n            lang=\"zh\",\n            lang2=\"四川话\",\n            short_name=\"paraformer_四川话\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\",\n            idx=44,\n            lang=\"1600\",\n            lang2=\"1600_languages\",\n            short_name=\"omnilingual_asr_300M_ctc_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-medasr-ctc-en-int8-2025-12-25\",\n            idx=45,\n            lang=\"en\",\n            lang2=\"英语\",\n            short_name=\"google_medasr_ctc_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n         
   \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-funasr-nano-int8-2025-12-30\",\n            idx=46,\n            lang=\"multi\",\n            lang2=\"31_languages\",\n            short_name=\"funasr_nano_int8_2025_12_30\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-int8-2026-02-03\",\n            idx=47,\n            lang=\"wu\",\n            lang2=\"吴语\",\n            short_name=\"wenetspeech_wu_u2pconformer_ctc_2026_02_03_int8\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-2026-02-03\",\n            idx=48,\n            lang=\"wu\",\n            lang2=\"吴语\",\n            short_name=\"wenetspeech_wu_u2pconformer_ctc_2026_02_03\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-vi-30M-int8-2026-02-09\",\n            idx=49,\n            lang=\"vi\",\n            lang2=\"Vietnamese\",\n            short_name=\"zipformer_vi_30M_int8_2026_02_09\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\",\n            idx=50,\n            lang=\"zh_en\",\n            lang2=\"中英\",\n            short_name=\"fire_red_asr2_ctc_int8_2026_02_25\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls 
-lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-ko-quantized-2026-02-27\",\n            idx=51,\n            lang=\"ko\",\n            lang2=\"Korean\",\n            short_name=\"moonshine_tiny_ko_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-ja-quantized-2026-02-27\",\n            idx=52,\n            lang=\"ja\",\n            lang2=\"Japanese\",\n            short_name=\"moonshine_tiny_ja_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\",\n            idx=53,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"moonshine_tiny_en_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-zh-quantized-2026-02-27\",\n            idx=54,\n            lang=\"zh\",\n            lang2=\"Chinese\",\n            short_name=\"moonshine_base_zh_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-vi-quantized-2026-02-27\",\n            idx=55,\n            lang=\"vi\",\n            lang2=\"Vietnamese\",\n            short_name=\"moonshine_base_vi_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n          
  popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-uk-quantized-2026-02-27\",\n            idx=56,\n            lang=\"uk\",\n            lang2=\"Ukrainian\",\n            short_name=\"moonshine_base_uk_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-ja-quantized-2026-02-27\",\n            idx=57,\n            lang=\"ja\",\n            lang2=\"Japanese\",\n            short_name=\"moonshine_base_ja_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-es-quantized-2026-02-27\",\n            idx=58,\n            lang=\"es\",\n            lang2=\"Spanish\",\n            short_name=\"moonshine_base_es_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-en-quantized-2026-02-27\",\n            idx=59,\n            lang=\"en\",\n            lang2=\"English\",\n            short_name=\"moonshine_base_en_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-ar-quantized-2026-02-27\",\n            idx=60,\n            lang=\"ar\",\n            lang2=\"Arabic\",\n            short_name=\"moonshine_base_ar_2026_02_27\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n          
  \"\"\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-apk-vad-asr.sh\",\n        \"./build-hap-vad-asr.sh\",\n        \"./build-apk-vad-asr-simulate-streaming.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        if not Path(f\"{filename}.in\").is_file():\n            print(f\"skip {filename}\")\n            continue\n\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/bbpe/.gitignore",
    "content": "bbpe.cc\n"
  },
  {
    "path": "scripts/bbpe/generate_bbpe_table.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n#\n# See https://github.com/facebookresearch/fairseq/blob/main/fairseq/data/encoders/byte_bpe.py#L28\n# and\n# https://github.com/k2-fsa/icefall/blob/master/icefall/byte_utils.py\n#\n# Caution: The PRINTABLE_LATIN from fairseq is different from PRINTABLE_BASE_CHARS from icefall\n\nimport re\n\nBPE_UNK = chr(8263)\nPRINTABLE_BASE_CHARS = (\n    list(range(256, 287 + 1))\n    + list(range(32, 126 + 1))\n    + list(range(288, 305 + 1))\n    + list(range(308, 318 + 1))\n    + list(range(321, 328 + 1))\n    + list(range(330, 382 + 1))\n    + list(range(384, 422 + 1))\n)\n\n\nBYTE_TO_BCHAR = {b: chr(PRINTABLE_BASE_CHARS[b]) for b in range(256)}\nBCHAR_TO_BYTE = {bc: b for b, bc in BYTE_TO_BCHAR.items()}\nBCHAR_TO_BYTE[BPE_UNK] = 32  # map unk to space\n\n\ndef main():\n    s = \"\"\n    s += \"// sherpa-onnx/csrc/bbpe.cc\\n\"\n    s += \"//\\n\"\n    s += \"// Copyright (c)  2024 Xiaomi Corporation\\n\"\n    s += \"\\n\"\n    s += \"// Auto-generated! 
DO NOT EDIT\\n\"\n    s += \"\\n\"\n    s += '#include \"sherpa-onnx/csrc/bbpe.h\"\\n'\n    s += \"\\n\"\n    s += \"#include <cstdint>\\n\"\n    s += \"#include <string>\\n\"\n    s += \"#include <unordered_map>\\n\"\n    s += \"\\n\"\n    s += \"const std::unordered_map<std::string, uint8_t> &GetByteBpeTable() {\\n\"\n    s += \"  static const std::unordered_map<std::string, uint8_t> table = {\\n\"\n\n    s += \"      \"\n    for i, (k, v) in enumerate(BCHAR_TO_BYTE.items()):\n        s += \"{\"\n        if k == \"\\\\\":\n            s += f'\"\\\\\\\\\", {v}'\n        elif k == '\"':\n            s += f'\"\\\\\"\", {v}'\n        else:\n            s += f'\"{k}\", {v}'\n        s += \"}, \"\n        if i > 0 and i % 7 == 0:\n            s += \"\\n\"\n            s += \"      \"\n    s += \"};\\n\"\n    s += \"\\n\"\n    s += \"  return table;\\n\"\n    s += \"}\\n\"\n\n    s += \"\\n\"\n    s += \"const std::unordered_map<uint8_t, std::string> &GetByteBpeTableId2Token() {\\n\"\n    s += \"  static const std::unordered_map<uint8_t, std::string> table = {\\n\"\n\n    s += \"      \"\n    for i, (k, v) in enumerate(BCHAR_TO_BYTE.items()):\n        s += \"{\"\n        if k == \"\\\\\":\n            s += f'{v}, \"\\\\\\\\\"'\n        elif k == '\"':\n            s += f'{v}, \"\\\\\"\"'\n        else:\n            s += f'{v}, \"{k}\"'\n\n        s += \"}, \"\n        if i > 0 and i % 7 == 0:\n            s += \"\\n\"\n            s += \"      \"\n    s += \"};\\n\"\n    s += \"\\n\"\n    s += \"  return table;\\n\"\n    s += \"}\\n\"\n\n    with open(\"bbpe.cc\", \"w\", encoding=\"utf-8\") as f:\n        f.write(s)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/benchmark/README.md",
    "content": "# Whisper Timestamp Accuracy Benchmark\r\n\r\nThis directory contains tools for benchmarking sherpa-onnx Whisper word timestamp accuracy against ground truth alignments from the Montreal Forced Aligner (MFA).\r\n\r\n## Overview\r\n\r\nThe benchmark suite evaluates how accurately sherpa-onnx predicts word-level timestamps by comparing against MFA alignments on LibriSpeech data. MFA provides high-quality forced alignments that serve as ground truth for measuring timestamp accuracy.\r\n\r\n## Scripts\r\n\r\n### `download_librispeech_test_data.py`\r\n\r\nDownloads and prepares the benchmark dataset:\r\n- LibriSpeech dev-clean audio (converted to 16kHz mono WAV)\r\n- MFA word alignments with precise word boundaries\r\n\r\n**Usage:**\r\n```bash\r\nuv run python scripts/benchmark/download_librispeech_test_data.py [--num-utterances 200]\r\n```\r\n\r\n**Options:**\r\n- `--num-utterances` - Number of utterances to include (default: 200)\r\n- `--output-dir` - Output directory (default: `benchmark_data`)\r\n- `--skip-download` - Skip download step and use existing files\r\n\r\n**Output:**\r\n- `benchmark_data/audio/*.wav` - Audio files\r\n- `benchmark_data/manifest.json` - Mapping of audio files to ground truth timestamps\r\n\r\n**Requirements:**\r\n- `gdown` (for Google Drive downloads)\r\n- `ffmpeg` or `sox` (for audio conversion)\r\n\r\n### `run_timestamp_benchmark.py`\r\n\r\nRuns the timestamp accuracy benchmark against the downloaded ground truth.\r\n\r\n**Usage:**\r\n```bash\r\nPYTHONPATH=build/lib:sherpa-onnx/python uv run python scripts/benchmark/run_timestamp_benchmark.py \\\r\n    --encoder ./whisper-tiny-attention/tiny-encoder.onnx \\\r\n    --decoder ./whisper-tiny-attention/tiny-decoder.onnx \\\r\n    --tokens ./whisper-tiny-attention/tiny-tokens.txt\r\n```\r\n\r\n**Options:**\r\n- `--encoder` - Path to Whisper encoder ONNX model (required)\r\n- `--decoder` - Path to Whisper decoder ONNX model (required)\r\n- `--tokens` - Path to tokens file 
(required)\r\n- `--data-dir` - Directory with manifest and audio (default: `benchmark_data`)\r\n- `--output-dir` - Output directory for results (default: `benchmark_results`)\r\n- `--language` - Language code (default: `en`)\r\n- `--num-workers` - Number of parallel workers (default: 1)\r\n\r\n**Parallel Processing:**\r\n```bash\r\n# Run with 4 workers for faster benchmarking\r\nPYTHONPATH=build/lib:sherpa-onnx/python uv run python scripts/benchmark/run_timestamp_benchmark.py \\\r\n    --encoder ./whisper-tiny-attention/tiny-encoder.onnx \\\r\n    --decoder ./whisper-tiny-attention/tiny-decoder.onnx \\\r\n    --tokens ./whisper-tiny-attention/tiny-tokens.txt \\\r\n    --num-workers 4\r\n```\r\n\r\nNote: Each worker loads its own model copy, so memory usage scales linearly with worker count.\r\n\r\n**Requirements:**\r\n- `numpy`\r\n- `jiwer` (for WER calculation)\r\n- Built sherpa-onnx library\r\n\r\n**Note on PYTHONPATH:** This script uses `PYTHONPATH=build/lib:sherpa-onnx/python` instead of `pip install sherpa-onnx` to allow rapid iteration when developing C++ code. 
After running `make` in the build directory, you can immediately test without reinstalling the package.\r\n\r\n## Output Format\r\n\r\n### `details_YYYYMMDD_HHMMSS.csv`\r\n\r\nPer-word timing errors with columns:\r\n- `utterance_id` - Utterance identifier\r\n- `word_index` - Word position in utterance\r\n- `word` - The word text\r\n- `gt_start`, `gt_end` - Ground truth timestamps (seconds)\r\n- `pred_start`, `pred_end` - Predicted timestamps (seconds)\r\n- `matched` - Whether the word was successfully aligned\r\n- `start_error_ms`, `end_error_ms` - Timing errors in milliseconds\r\n\r\n### `summary_YYYYMMDD_HHMMSS.csv`\r\n\r\nPer-utterance aggregate statistics:\r\n- `utterance_id` - Utterance identifier\r\n- `num_gt_words`, `num_pred_words`, `num_matched` - Word counts\r\n- `match_rate` - Fraction of ground truth words matched\r\n- `wer` - Word Error Rate\r\n- `mean_start_error_ms`, `median_start_error_ms`, `max_start_error_ms` - Start time error statistics\r\n- `mean_end_error_ms`, `median_end_error_ms`, `max_end_error_ms` - End time error statistics\r\n- `pct_within_20ms`, `pct_within_50ms` - Percentage of words within accuracy thresholds\r\n\r\n## Metrics Explained\r\n\r\n- **Start/End Time Error**: Absolute difference between predicted and ground truth timestamps\r\n- **Match Rate**: How many ground truth words were successfully aligned with predictions\r\n- **WER (Word Error Rate)**: Standard ASR accuracy metric (lower is better)\r\n- **Accuracy Thresholds**: Percentage of words with start time error within 20ms, 50ms, or 100ms\r\n\r\n## Example Workflow\r\n\r\n```bash\r\n# 1. Build sherpa-onnx\r\ncd build && make -j8 && cd ..\r\n\r\n# 2. Export a Whisper model with attention outputs\r\nuv run python scripts/whisper/export-onnx.py --model tiny --with-attention --output-dir ./whisper-tiny-attention\r\n\r\n# 3. Download benchmark data\r\nuv run python scripts/benchmark/download_librispeech_test_data.py --num-utterances 200\r\n\r\n# 4. 
Run the benchmark\r\nPYTHONPATH=build/lib:sherpa-onnx/python uv run python scripts/benchmark/run_timestamp_benchmark.py \\\r\n    --encoder ./whisper-tiny-attention/tiny-encoder.onnx \\\r\n    --decoder ./whisper-tiny-attention/tiny-decoder.onnx \\\r\n    --tokens ./whisper-tiny-attention/tiny-tokens.txt \\\r\n    --num-workers 4\r\n\r\n# 5. Review results in benchmark_results/\r\n```\r\n\r\n## Data Sources and Citations\r\n\r\n### LibriSpeech Corpus\r\n\r\nThe audio data comes from the [LibriSpeech](https://www.openslr.org/12/) ASR corpus:\r\n\r\n> Panayotov, V., Chen, G., Povey, D., & Khudanpur, S. (2015). LibriSpeech: An ASR corpus based on public domain audio books. In *2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)* (pp. 5206-5210). IEEE. https://doi.org/10.1109/ICASSP.2015.7178964\r\n\r\nLibriSpeech is derived from read audiobooks from the [LibriVox](https://librivox.org/) project and is freely available under a CC BY 4.0 license.\r\n\r\n### Montreal Forced Aligner (MFA)\r\n\r\nThe ground truth word alignments were generated using the [Montreal Forced Aligner](https://montreal-forced-aligner.readthedocs.io/):\r\n\r\n> McAuliffe, M., Socolof, M., Mihuc, S., Wagner, M., & Sonderegger, M. (2017). Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In *Proceedings of Interspeech 2017* (pp. 498-502). https://doi.org/10.21437/Interspeech.2017-1386\r\n\r\nMFA is an open-source forced alignment tool that uses Kaldi for acoustic modeling.\r\n\r\n### Pre-computed LibriSpeech Alignments\r\n\r\nThe pre-computed MFA alignments for LibriSpeech are provided by the [librispeech-alignments](https://github.com/CorentinJ/librispeech-alignments) project by Corentin Jemine.\r\n\r\n## License\r\n\r\nThe LibriSpeech corpus is released under the [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) license. Please ensure compliance with all applicable licenses when using this benchmark data.\r\n"
  },
  {
    "path": "scripts/benchmark/download_librispeech_test_data.py",
    "content": "#!/usr/bin/env python3\n# /// script\n# dependencies = [\"gdown\"]\n# ///\nfrom __future__ import annotations\n\n\"\"\"\nDownload and prepare LibriSpeech test data for timestamp benchmarking.\n\nDownloads:\n1. LibriSpeech dev-clean audio subset\n2. MFA word alignments from librispeech-alignments repo\n\nOutputs:\n- benchmark_data/audio/*.wav (16kHz mono WAV files)\n- benchmark_data/manifest.json (mapping of audio files to ground truth timestamps)\n\nUsage:\n    python scripts/benchmark/download_librispeech_test_data.py [--num-utterances 200]\n\"\"\"\n\nimport argparse\nimport json\nimport os\nimport re\nimport subprocess\nimport sys\nimport tarfile\nimport tempfile\nimport urllib.request\nimport zipfile\nfrom pathlib import Path\n\n# URLs for downloads\nLIBRISPEECH_DEV_CLEAN_URL = \"https://www.openslr.org/resources/12/dev-clean.tar.gz\"\nMFA_ALIGNMENTS_URL = \"https://drive.google.com/uc?export=download&id=1WYfgr31T-PPwMcxuAq09XZfHQO5Mw8fE\"\n\n# Google Drive file ID for the simple TXT format alignments\nGDRIVE_FILE_ID = \"1WYfgr31T-PPwMcxuAq09XZfHQO5Mw8fE\"\n\n\ndef download_file(url: str, dest_path: Path, description: str = \"file\"):\n    \"\"\"Download a file with progress indication.\"\"\"\n    print(f\"Downloading {description}...\")\n    print(f\"  URL: {url}\")\n    print(f\"  Destination: {dest_path}\")\n\n    def reporthook(block_num, block_size, total_size):\n        if total_size > 0:\n            downloaded = block_num * block_size\n            percent = min(100, downloaded * 100 / total_size)\n            mb_downloaded = downloaded / (1024 * 1024)\n            mb_total = total_size / (1024 * 1024)\n            sys.stdout.write(f\"\\r  Progress: {percent:.1f}% ({mb_downloaded:.1f}/{mb_total:.1f} MB)\")\n            sys.stdout.flush()\n\n    urllib.request.urlretrieve(url, dest_path, reporthook)\n    print()  # newline after progress\n\n\ndef download_from_gdrive(file_id: str, dest_path: Path, description: str = \"file\"):\n    
\"\"\"Download a file from Google Drive using gdown.\"\"\"\n    try:\n        import gdown\n    except ImportError:\n        print(\"ERROR: gdown is required for downloading from Google Drive.\")\n        print(\"Install it with: pip install gdown\")\n        sys.exit(1)\n\n    print(f\"Downloading {description} from Google Drive...\")\n    print(f\"  File ID: {file_id}\")\n    print(f\"  Destination: {dest_path}\")\n\n    url = f\"https://drive.google.com/uc?id={file_id}\"\n    gdown.download(url, str(dest_path), quiet=False)\n\n\ndef extract_tar_gz(archive_path: Path, dest_dir: Path):\n    \"\"\"Extract a tar.gz archive.\"\"\"\n    print(f\"Extracting {archive_path}...\")\n    with tarfile.open(archive_path, \"r:gz\") as tar:\n        tar.extractall(dest_dir)\n    print(f\"  Extracted to {dest_dir}\")\n\n\ndef extract_zip(archive_path: Path, dest_dir: Path):\n    \"\"\"Extract a zip archive.\"\"\"\n    print(f\"Extracting {archive_path}...\")\n    with zipfile.ZipFile(archive_path, \"r\") as z:\n        z.extractall(dest_dir)\n    print(f\"  Extracted to {dest_dir}\")\n\n\ndef convert_flac_to_wav(flac_path: Path, wav_path: Path):\n    \"\"\"Convert FLAC to 16kHz mono WAV using ffmpeg or sox.\"\"\"\n    wav_path.parent.mkdir(parents=True, exist_ok=True)\n\n    # Try ffmpeg first\n    try:\n        subprocess.run(\n            [\n                \"ffmpeg\", \"-y\", \"-i\", str(flac_path),\n                \"-ar\", \"16000\", \"-ac\", \"1\", \"-sample_fmt\", \"s16\",\n                str(wav_path)\n            ],\n            check=True,\n            capture_output=True\n        )\n        return\n    except (subprocess.CalledProcessError, FileNotFoundError):\n        pass\n\n    # Try sox\n    try:\n        subprocess.run(\n            [\"sox\", str(flac_path), \"-r\", \"16000\", \"-c\", \"1\", str(wav_path)],\n            check=True,\n            capture_output=True\n        )\n        return\n    except (subprocess.CalledProcessError, FileNotFoundError):\n        
pass\n\n    print(f\"ERROR: Could not convert {flac_path}\")\n    print(\"Please install ffmpeg or sox\")\n    sys.exit(1)\n\n\ndef parse_alignment_line(line: str) -> dict | None:\n    \"\"\"\n    Parse a single line from the MFA alignment file.\n\n    Format: utterance_id \"word1,word2,...\" \"end_time1,end_time2,...\"\n    Empty words represent silences.\n    Times are END times for each word.\n\n    Returns dict with utterance_id, words (list), and word_times (list of {word, start, end})\n    \"\"\"\n    # Pattern: utterance_id \"words\" \"times\"\n    match = re.match(r'^(\\S+)\\s+\"([^\"]*)\"\\s+\"([^\"]*)\"', line.strip())\n    if not match:\n        return None\n\n    utterance_id = match.group(1)\n    words_str = match.group(2)\n    times_str = match.group(3)\n\n    # Parse words (comma-separated, may have empty entries for silences)\n    words = words_str.split(\",\")\n\n    # Parse end times\n    try:\n        end_times = [float(t) for t in times_str.split(\",\") if t]\n    except ValueError:\n        return None\n\n    if len(words) != len(end_times):\n        return None\n\n    # Convert to word_times with start and end\n    word_times = []\n    prev_end = 0.0\n    for word, end_time in zip(words, end_times):\n        if word:  # Skip empty words (silences)\n            word_times.append({\n                \"word\": word,\n                \"start\": prev_end,\n                \"end\": end_time\n            })\n        prev_end = end_time\n\n    return {\n        \"utterance_id\": utterance_id,\n        \"words\": [w[\"word\"] for w in word_times],\n        \"word_times\": word_times\n    }\n\n\ndef parse_alignment_file(alignment_path: Path) -> dict:\n    \"\"\"Parse an alignment file and return dict mapping utterance_id to alignment data.\"\"\"\n    alignments = {}\n    with open(alignment_path, \"r\") as f:\n        for line in f:\n            parsed = parse_alignment_line(line)\n            if parsed:\n                
alignments[parsed[\"utterance_id\"]] = parsed\n    return alignments\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"Download LibriSpeech benchmark data\")\n    parser.add_argument(\n        \"--num-utterances\",\n        type=int,\n        default=200,\n        help=\"Number of utterances to include in test set (default: 200)\"\n    )\n    parser.add_argument(\n        \"--output-dir\",\n        type=str,\n        default=\"benchmark_data\",\n        help=\"Output directory (default: benchmark_data)\"\n    )\n    parser.add_argument(\n        \"--skip-download\",\n        action=\"store_true\",\n        help=\"Skip download step (use existing files)\"\n    )\n    args = parser.parse_args()\n\n    # Paths\n    script_dir = Path(__file__).parent.resolve()\n    repo_root = script_dir.parent.parent\n    output_dir = repo_root / args.output_dir\n    audio_dir = output_dir / \"audio\"\n    cache_dir = output_dir / \".cache\"\n\n    output_dir.mkdir(parents=True, exist_ok=True)\n    audio_dir.mkdir(parents=True, exist_ok=True)\n    cache_dir.mkdir(parents=True, exist_ok=True)\n\n    # Step 1: Download LibriSpeech dev-clean\n    librispeech_tar = cache_dir / \"dev-clean.tar.gz\"\n    librispeech_dir = cache_dir / \"LibriSpeech\" / \"dev-clean\"\n\n    if not args.skip_download and not librispeech_dir.exists():\n        if not librispeech_tar.exists():\n            download_file(\n                LIBRISPEECH_DEV_CLEAN_URL,\n                librispeech_tar,\n                \"LibriSpeech dev-clean\"\n            )\n        extract_tar_gz(librispeech_tar, cache_dir)\n\n    # Step 2: Download MFA alignments\n    alignments_zip = cache_dir / \"librispeech-alignments.zip\"\n    alignments_dir = cache_dir / \"alignments\"\n\n    if not args.skip_download and not alignments_dir.exists():\n        if not alignments_zip.exists():\n            download_from_gdrive(\n                GDRIVE_FILE_ID,\n                alignments_zip,\n                \"LibriSpeech 
MFA alignments\"\n            )\n        alignments_dir.mkdir(parents=True, exist_ok=True)\n        extract_zip(alignments_zip, alignments_dir)\n\n    # Step 3: Find alignment files for dev-clean\n    print(\"\\nParsing alignment files...\")\n    all_alignments = {}\n\n    # Look for dev-clean alignment files\n    for alignment_file in alignments_dir.rglob(\"*.alignment.txt\"):\n        if \"dev-clean\" in str(alignment_file):\n            print(f\"  Parsing {alignment_file.name}...\")\n            file_alignments = parse_alignment_file(alignment_file)\n            all_alignments.update(file_alignments)\n\n    print(f\"  Found {len(all_alignments)} alignments\")\n\n    # Step 4: Find corresponding audio files and convert\n    print(\"\\nProcessing audio files...\")\n    manifest = []\n    processed = 0\n\n    # Walk through LibriSpeech directory structure: speaker/chapter/utterance.flac\n    for flac_file in sorted(librispeech_dir.rglob(\"*.flac\")):\n        utterance_id = flac_file.stem  # e.g., \"84-121123-0000\"\n\n        if utterance_id not in all_alignments:\n            continue\n\n        alignment = all_alignments[utterance_id]\n\n        # Skip utterances with no words\n        if not alignment[\"word_times\"]:\n            continue\n\n        # Convert to WAV\n        wav_file = audio_dir / f\"{utterance_id}.wav\"\n        if not wav_file.exists():\n            convert_flac_to_wav(flac_file, wav_file)\n\n        # Add to manifest\n        manifest.append({\n            \"utterance_id\": utterance_id,\n            \"audio_path\": str(wav_file.relative_to(output_dir)),\n            \"transcript\": \" \".join(alignment[\"words\"]),\n            \"word_times\": alignment[\"word_times\"]\n        })\n\n        processed += 1\n        if processed % 50 == 0:\n            print(f\"  Processed {processed} utterances...\")\n\n        if processed >= args.num_utterances:\n            break\n\n    print(f\"\\nProcessed {len(manifest)} utterances\")\n\n    # Step 
5: Write manifest\n    manifest_path = output_dir / \"manifest.json\"\n    with open(manifest_path, \"w\") as f:\n        json.dump(manifest, f, indent=2)\n    print(f\"Wrote manifest to {manifest_path}\")\n\n    # Summary\n    print(\"\\n\" + \"=\" * 60)\n    print(\"Download complete!\")\n    print(\"=\" * 60)\n    print(f\"Audio files: {audio_dir}\")\n    print(f\"Manifest: {manifest_path}\")\n    print(f\"Total utterances: {len(manifest)}\")\n\n    # Count the total number of ground truth words across all utterances\n    total_words = sum(len(item[\"word_times\"]) for item in manifest)\n    print(f\"Total words: {total_words}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/benchmark/run_timestamp_benchmark.py",
    "content": "#!/usr/bin/env python3\n# /// script\n# dependencies = [\"numpy\", \"jiwer\"]\n# ///\nfrom __future__ import annotations\n\n\"\"\"\nRun timestamp accuracy benchmark against LibriSpeech ground truth.\n\nCompares sherpa-onnx Whisper word timestamps against MFA alignments.\n\nUsage:\n    PYTHONPATH=build/lib:sherpa-onnx/python python scripts/benchmark/run_timestamp_benchmark.py \\\n        --encoder ./whisper-tiny-attention/tiny-encoder.onnx \\\n        --decoder ./whisper-tiny-attention/tiny-decoder.onnx \\\n        --tokens ./whisper-tiny-attention/tiny-tokens.txt\n\n    # Parallel processing with 4 workers:\n    PYTHONPATH=build/lib:sherpa-onnx/python python scripts/benchmark/run_timestamp_benchmark.py \\\n        --encoder ./whisper-tiny-attention/tiny-encoder.onnx \\\n        --decoder ./whisper-tiny-attention/tiny-decoder.onnx \\\n        --tokens ./whisper-tiny-attention/tiny-tokens.txt \\\n        --num-workers 4\n\nOutputs:\n    benchmark_results/details_YYYYMMDD_HHMMSS.csv - Per-word timing errors\n    benchmark_results/summary_YYYYMMDD_HHMMSS.csv - Aggregate statistics\n\"\"\"\n\nimport argparse\nimport csv\nimport json\nimport multiprocessing\nimport os\nimport re\nimport sys\nimport time\nimport wave\nfrom dataclasses import dataclass\nfrom datetime import datetime\nfrom difflib import SequenceMatcher\nfrom pathlib import Path\n\nimport numpy as np\n\ntry:\n    import sherpa_onnx\nexcept ImportError:\n    print(\"ERROR: sherpa_onnx not found. Please install it using one of the methods at:\")\n    print(\"https://k2-fsa.github.io/sherpa/onnx/python/install.html\")\n    sys.exit(1)\n\ntry:\n    import jiwer\n    from jiwer import wer as compute_wer\nexcept ImportError:\n    print(\"ERROR: jiwer not found. 
Install with: pip install jiwer\")\n    sys.exit(1)\n\n# Text normalization for WER calculation\nwer_transforms = jiwer.Compose([\n    jiwer.ToLowerCase(),\n    jiwer.RemovePunctuation(),\n    jiwer.ExpandCommonEnglishContractions(),\n    jiwer.SubstituteWords({\n        \"mr\": \"mister\",\n        \"mrs\": \"missus\",\n        \"dr\": \"doctor\",\n        \"st\": \"saint\",\n    }),\n    jiwer.RemoveMultipleSpaces(),\n    jiwer.Strip(),\n    jiwer.ReduceToListOfListOfWords(),\n])\n\n\n@dataclass\nclass WordTiming:\n    \"\"\"A word with its timing information.\"\"\"\n    word: str\n    start: float\n    end: float\n\n\n@dataclass\nclass AlignedWord:\n    \"\"\"A pair of ground truth and predicted words that have been aligned.\"\"\"\n    word: str\n    gt_start: float\n    gt_end: float\n    pred_start: float | None\n    pred_end: float | None\n    matched: bool\n\n\ndef normalize_word(word: str) -> str:\n    \"\"\"Normalize word for comparison.\"\"\"\n    # Remove punctuation, lowercase\n    return re.sub(r'[^\\w]', '', word).strip().lower()\n\n\ndef read_wave(wave_filename: str) -> tuple[np.ndarray, int]:\n    \"\"\"Read a wave file and return samples as float32 array.\"\"\"\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f\"Expected mono, got {f.getnchannels()} channels\"\n        assert f.getsampwidth() == 2, f\"Expected 16-bit, got {f.getsampwidth() * 8}-bit\"\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32) / 32768\n        return samples_float32, f.getframerate()\n\n\ndef tokens_to_words(\n    tokens: list[str],\n    timestamps: list[float],\n    durations: list[float]\n) -> list[WordTiming]:\n    \"\"\"\n    Convert token-level timestamps to word-level timestamps.\n\n    Follows OpenAI Whisper's split_tokens_on_spaces logic:\n    - Tokens starting with space begin a new 
word\n    - Punctuation-only tokens begin a new word\n    - Otherwise append to previous word\n    \"\"\"\n    import string\n\n    if not tokens:\n        return []\n\n    words = []\n    current_word = \"\"\n    current_start = None\n    current_end = None\n\n    for token, ts, dur in zip(tokens, timestamps, durations):\n        token_end = ts + dur\n        token_stripped = token.strip()\n\n        # Determine if this token starts a new word\n        with_space = token.startswith(\" \")\n        is_punctuation = token_stripped in string.punctuation\n        is_first = len(words) == 0 and current_word == \"\"\n\n        if with_space or is_punctuation or is_first:\n            # Save previous word if exists\n            if current_word.strip():\n                words.append(WordTiming(\n                    word=current_word.strip(),\n                    start=current_start,\n                    end=current_end\n                ))\n            # Start new word\n            current_word = token\n            current_start = ts\n            current_end = token_end\n        else:\n            # Append to current word\n            current_word += token\n            current_end = token_end\n\n    # Don't forget the last word\n    if current_word.strip():\n        words.append(WordTiming(\n            word=current_word.strip(),\n            start=current_start,\n            end=current_end\n        ))\n\n    return words\n\n\ndef get_sherpa_word_timestamps(\n    recognizer: sherpa_onnx.OfflineRecognizer,\n    audio_path: Path\n) -> list[WordTiming]:\n    \"\"\"Run sherpa-onnx recognition and return word timestamps.\"\"\"\n    samples, sample_rate = read_wave(str(audio_path))\n\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n    recognizer.decode_stream(stream)\n    result = stream.result\n\n    # Convert token timestamps to word timestamps\n    return tokens_to_words(result.tokens, result.timestamps, result.durations)\n\n\n# 
Global recognizer for worker processes\n_worker_recognizer = None\n\n\ndef _init_worker(encoder: str, decoder: str, tokens: str, language: str):\n    \"\"\"Initialize recognizer in worker process.\"\"\"\n    global _worker_recognizer\n    _worker_recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n        encoder=encoder,\n        decoder=decoder,\n        tokens=tokens,\n        language=language,\n        enable_token_timestamps=True,\n    )\n\n\ndef _process_utterance(args: tuple) -> dict:\n    \"\"\"Process a single utterance in a worker process.\"\"\"\n    item, data_dir = args\n    utterance_id = item[\"utterance_id\"]\n    audio_path = Path(data_dir) / item[\"audio_path\"]\n\n    # Parse ground truth\n    gt_words = [\n        WordTiming(word=wt[\"word\"], start=wt[\"start\"], end=wt[\"end\"])\n        for wt in item[\"word_times\"]\n    ]\n    gt_transcript = \" \".join(w.word for w in gt_words)\n\n    # Get predictions\n    pred_words = get_sherpa_word_timestamps(_worker_recognizer, audio_path)\n    pred_transcript = \" \".join(w.word for w in pred_words)\n\n    # Align words\n    aligned = align_words(gt_words, pred_words)\n\n    # Calculate per-utterance stats\n    matched = [a for a in aligned if a.matched]\n\n    if matched:\n        start_errors = [abs(a.pred_start - a.gt_start) * 1000 for a in matched]\n        end_errors = [abs(a.pred_end - a.gt_end) * 1000 for a in matched]\n\n        stats = {\n            \"utterance_id\": utterance_id,\n            \"num_gt_words\": len(gt_words),\n            \"num_pred_words\": len(pred_words),\n            \"num_matched\": len(matched),\n            \"match_rate\": len(matched) / len(gt_words) if gt_words else 0,\n            \"wer\": jiwer.wer(\n                gt_transcript,\n                pred_transcript,\n                reference_transform=wer_transforms,\n                hypothesis_transform=wer_transforms,\n            ),\n            \"mean_start_error_ms\": np.mean(start_errors),\n            
\"median_start_error_ms\": np.median(start_errors),\n            \"max_start_error_ms\": np.max(start_errors),\n            \"mean_end_error_ms\": np.mean(end_errors),\n            \"median_end_error_ms\": np.median(end_errors),\n            \"max_end_error_ms\": np.max(end_errors),\n            \"pct_within_20ms\": sum(1 for e in start_errors if e <= 20) / len(start_errors) * 100,\n            \"pct_within_50ms\": sum(1 for e in start_errors if e <= 50) / len(start_errors) * 100,\n        }\n    else:\n        stats = {\n            \"utterance_id\": utterance_id,\n            \"num_gt_words\": len(gt_words),\n            \"num_pred_words\": len(pred_words),\n            \"num_matched\": 0,\n            \"match_rate\": 0,\n            \"wer\": jiwer.wer(\n                gt_transcript,\n                pred_transcript,\n                reference_transform=wer_transforms,\n                hypothesis_transform=wer_transforms,\n            ) if gt_transcript else 1.0,\n            \"mean_start_error_ms\": None,\n            \"median_start_error_ms\": None,\n            \"max_start_error_ms\": None,\n            \"mean_end_error_ms\": None,\n            \"median_end_error_ms\": None,\n            \"max_end_error_ms\": None,\n            \"pct_within_20ms\": None,\n            \"pct_within_50ms\": None,\n        }\n\n    # Build aligned words for detailed output\n    aligned_words = []\n    for j, a in enumerate(aligned):\n        aligned_words.append({\n            \"utterance_id\": utterance_id,\n            \"word_index\": j,\n            \"word\": a.word,\n            \"gt_start\": a.gt_start,\n            \"gt_end\": a.gt_end,\n            \"pred_start\": a.pred_start if a.pred_start is not None else \"\",\n            \"pred_end\": a.pred_end if a.pred_end is not None else \"\",\n            \"matched\": a.matched,\n            \"start_error_ms\": abs(a.pred_start - a.gt_start) * 1000 if a.matched else \"\",\n            \"end_error_ms\": abs(a.pred_end - 
a.gt_end) * 1000 if a.matched else \"\",\n        })\n\n    return {\"stats\": stats, \"aligned\": aligned_words}\n\n\ndef align_words(\n    gt_words: list[WordTiming],\n    pred_words: list[WordTiming]\n) -> list[AlignedWord]:\n    \"\"\"\n    Align ground truth and predicted words using sequence matching.\n\n    Returns list of AlignedWord with timing comparisons for matched words.\n    \"\"\"\n    # Normalize words for matching\n    gt_normalized = [normalize_word(w.word) for w in gt_words]\n    pred_normalized = [normalize_word(w.word) for w in pred_words]\n\n    # Use SequenceMatcher to find matching blocks\n    matcher = SequenceMatcher(None, gt_normalized, pred_normalized)\n\n    aligned = []\n    matched_pred_indices = set()\n\n    for tag, i1, i2, j1, j2 in matcher.get_opcodes():\n        if tag == 'equal':\n            # Words match\n            for gt_idx, pred_idx in zip(range(i1, i2), range(j1, j2)):\n                gt_word = gt_words[gt_idx]\n                pred_word = pred_words[pred_idx]\n                aligned.append(AlignedWord(\n                    word=gt_word.word,\n                    gt_start=gt_word.start,\n                    gt_end=gt_word.end,\n                    pred_start=pred_word.start,\n                    pred_end=pred_word.end,\n                    matched=True\n                ))\n                matched_pred_indices.add(pred_idx)\n        elif tag in ('replace', 'delete'):\n            # Ground truth words not matched\n            for gt_idx in range(i1, i2):\n                gt_word = gt_words[gt_idx]\n                aligned.append(AlignedWord(\n                    word=gt_word.word,\n                    gt_start=gt_word.start,\n                    gt_end=gt_word.end,\n                    pred_start=None,\n                    pred_end=None,\n                    matched=False\n                ))\n\n    return aligned\n\n\ndef run_benchmark(\n    manifest: list[dict],\n    data_dir: Path,\n    output_dir: Path,\n    encoder: 
str,\n    decoder: str,\n    tokens: str,\n    language: str,\n    num_workers: int = 1\n):\n    \"\"\"Run benchmark on all utterances in manifest.\"\"\"\n    timestamp = datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n    details_path = output_dir / f\"details_{timestamp}.csv\"\n    summary_path = output_dir / f\"summary_{timestamp}.csv\"\n\n    output_dir.mkdir(parents=True, exist_ok=True)\n\n    # Collect all results\n    all_aligned = []\n    utterance_stats = []\n\n    total = len(manifest)\n    start_time = time.time()\n\n    if num_workers > 1:\n        # Parallel processing\n        print(f\"\\nProcessing {total} utterances with {num_workers} workers...\")\n\n        # Prepare arguments for workers\n        work_items = [(item, str(data_dir)) for item in manifest]\n\n        with multiprocessing.Pool(\n            processes=num_workers,\n            initializer=_init_worker,\n            initargs=(encoder, decoder, tokens, language)\n        ) as pool:\n            completed = 0\n            for result in pool.imap(_process_utterance, work_items):\n                utterance_stats.append(result[\"stats\"])\n                all_aligned.extend(result[\"aligned\"])\n\n                completed += 1\n                elapsed = time.time() - start_time\n                avg_per_item = elapsed / completed\n                remaining = total - completed\n                eta_seconds = avg_per_item * remaining\n\n                if eta_seconds >= 3600:\n                    eta_str = f\"{eta_seconds / 3600:.1f}h\"\n                elif eta_seconds >= 60:\n                    eta_str = f\"{eta_seconds / 60:.1f}m\"\n                else:\n                    eta_str = f\"{eta_seconds:.0f}s\"\n\n                print(f\"  [{completed}/{total}] {result['stats']['utterance_id']} - ETA: {eta_str}\", flush=True)\n    else:\n        # Sequential processing (original behavior)\n        print(f\"\\nProcessing {total} utterances...\")\n\n        # Initialize recognizer for sequential 
mode\n        recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n            encoder=encoder,\n            decoder=decoder,\n            tokens=tokens,\n            language=language,\n            enable_token_timestamps=True,\n        )\n\n        for i, item in enumerate(manifest):\n            iter_start = time.time()\n            utterance_id = item[\"utterance_id\"]\n            audio_path = data_dir / item[\"audio_path\"]\n\n            # Parse ground truth\n            gt_words = [\n                WordTiming(word=wt[\"word\"], start=wt[\"start\"], end=wt[\"end\"])\n                for wt in item[\"word_times\"]\n            ]\n            gt_transcript = \" \".join(w.word for w in gt_words)\n\n            # Get predictions\n            pred_words = get_sherpa_word_timestamps(recognizer, audio_path)\n            pred_transcript = \" \".join(w.word for w in pred_words)\n\n            # Align words\n            aligned = align_words(gt_words, pred_words)\n\n            # Calculate per-utterance stats\n            matched = [a for a in aligned if a.matched]\n\n            if matched:\n                start_errors = [abs(a.pred_start - a.gt_start) * 1000 for a in matched]\n                end_errors = [abs(a.pred_end - a.gt_end) * 1000 for a in matched]\n\n                stats = {\n                    \"utterance_id\": utterance_id,\n                    \"num_gt_words\": len(gt_words),\n                    \"num_pred_words\": len(pred_words),\n                    \"num_matched\": len(matched),\n                    \"match_rate\": len(matched) / len(gt_words) if gt_words else 0,\n                    \"wer\": jiwer.wer(\n                        gt_transcript,\n                        pred_transcript,\n                        reference_transform=wer_transforms,\n                        hypothesis_transform=wer_transforms,\n                    ),\n                    \"mean_start_error_ms\": np.mean(start_errors),\n                    
\"median_start_error_ms\": np.median(start_errors),\n                    \"max_start_error_ms\": np.max(start_errors),\n                    \"mean_end_error_ms\": np.mean(end_errors),\n                    \"median_end_error_ms\": np.median(end_errors),\n                    \"max_end_error_ms\": np.max(end_errors),\n                    \"pct_within_20ms\": sum(1 for e in start_errors if e <= 20) / len(start_errors) * 100,\n                    \"pct_within_50ms\": sum(1 for e in start_errors if e <= 50) / len(start_errors) * 100,\n                }\n            else:\n                stats = {\n                    \"utterance_id\": utterance_id,\n                    \"num_gt_words\": len(gt_words),\n                    \"num_pred_words\": len(pred_words),\n                    \"num_matched\": 0,\n                    \"match_rate\": 0,\n                    \"wer\": jiwer.wer(\n                        gt_transcript,\n                        pred_transcript,\n                        reference_transform=wer_transforms,\n                        hypothesis_transform=wer_transforms,\n                    ) if gt_transcript else 1.0,\n                    \"mean_start_error_ms\": None,\n                    \"median_start_error_ms\": None,\n                    \"max_start_error_ms\": None,\n                    \"mean_end_error_ms\": None,\n                    \"median_end_error_ms\": None,\n                    \"max_end_error_ms\": None,\n                    \"pct_within_20ms\": None,\n                    \"pct_within_50ms\": None,\n                }\n\n            utterance_stats.append(stats)\n\n            # Store aligned words for detailed output\n            for j, a in enumerate(aligned):\n                all_aligned.append({\n                    \"utterance_id\": utterance_id,\n                    \"word_index\": j,\n                    \"word\": a.word,\n                    \"gt_start\": a.gt_start,\n                    \"gt_end\": a.gt_end,\n                    
\"pred_start\": a.pred_start if a.pred_start is not None else \"\",\n                    \"pred_end\": a.pred_end if a.pred_end is not None else \"\",\n                    \"matched\": a.matched,\n                    \"start_error_ms\": abs(a.pred_start - a.gt_start) * 1000 if a.matched else \"\",\n                    \"end_error_ms\": abs(a.pred_end - a.gt_end) * 1000 if a.matched else \"\",\n                })\n\n            # Progress with ETA\n            completed = i + 1\n            elapsed = time.time() - start_time\n            avg_per_item = elapsed / completed\n            remaining = total - completed\n            eta_seconds = avg_per_item * remaining\n\n            if eta_seconds >= 3600:\n                eta_str = f\"{eta_seconds / 3600:.1f}h\"\n            elif eta_seconds >= 60:\n                eta_str = f\"{eta_seconds / 60:.1f}m\"\n            else:\n                eta_str = f\"{eta_seconds:.0f}s\"\n\n            iter_time = time.time() - iter_start\n            print(f\"  [{completed}/{total}] {utterance_id} ({iter_time:.1f}s) - ETA: {eta_str}\", flush=True)\n\n    # Sort results by utterance_id to ensure consistent output\n    utterance_stats.sort(key=lambda x: x[\"utterance_id\"])\n    all_aligned.sort(key=lambda x: (x[\"utterance_id\"], x[\"word_index\"]))\n\n    # Write detailed results\n    print(f\"\\nWriting detailed results to {details_path}...\")\n    with open(details_path, \"w\", newline=\"\") as f:\n        writer = csv.DictWriter(f, fieldnames=[\n            \"utterance_id\", \"word_index\", \"word\", \"gt_start\", \"gt_end\",\n            \"pred_start\", \"pred_end\", \"matched\", \"start_error_ms\", \"end_error_ms\"\n        ])\n        writer.writeheader()\n        writer.writerows(all_aligned)\n\n    # Write summary results\n    print(f\"Writing summary to {summary_path}...\")\n    with open(summary_path, \"w\", newline=\"\") as f:\n        writer = csv.DictWriter(f, fieldnames=[\n            \"utterance_id\", 
\"num_gt_words\", \"num_pred_words\", \"num_matched\",\n            \"match_rate\", \"wer\", \"mean_start_error_ms\", \"median_start_error_ms\",\n            \"max_start_error_ms\", \"mean_end_error_ms\", \"median_end_error_ms\",\n            \"max_end_error_ms\", \"pct_within_20ms\", \"pct_within_50ms\"\n        ])\n        writer.writeheader()\n        writer.writerows(utterance_stats)\n\n    # Print aggregate stats\n    matched_stats = [s for s in utterance_stats if s[\"num_matched\"] > 0]\n    if matched_stats:\n        print(\"\\n\" + \"=\" * 60)\n        print(\"AGGREGATE RESULTS\")\n        print(\"=\" * 60)\n        print(f\"Total utterances: {len(manifest)}\")\n        print(f\"Total ground truth words: {sum(s['num_gt_words'] for s in utterance_stats)}\")\n        print(f\"Total matched words: {sum(s['num_matched'] for s in utterance_stats)}\")\n\n        all_start_errors = [\n            float(r[\"start_error_ms\"]) for r in all_aligned\n            if r[\"matched\"] and r[\"start_error_ms\"] != \"\"\n        ]\n        all_end_errors = [\n            float(r[\"end_error_ms\"]) for r in all_aligned\n            if r[\"matched\"] and r[\"end_error_ms\"] != \"\"\n        ]\n\n        if all_start_errors:\n            print(f\"\\nStart Time Errors:\")\n            print(f\"  Mean: {np.mean(all_start_errors):.1f} ms\")\n            print(f\"  Median: {np.median(all_start_errors):.1f} ms\")\n            print(f\"  Max: {np.max(all_start_errors):.1f} ms\")\n            print(f\"  Std: {np.std(all_start_errors):.1f} ms\")\n\n            print(f\"\\nEnd Time Errors:\")\n            print(f\"  Mean: {np.mean(all_end_errors):.1f} ms\")\n            print(f\"  Median: {np.median(all_end_errors):.1f} ms\")\n            print(f\"  Max: {np.max(all_end_errors):.1f} ms\")\n            print(f\"  Std: {np.std(all_end_errors):.1f} ms\")\n\n            print(f\"\\nAccuracy Thresholds (start time):\")\n            print(f\"  Within 20ms: {sum(1 for e in all_start_errors if 
e <= 20) / len(all_start_errors) * 100:.1f}%\")\n            print(f\"  Within 50ms: {sum(1 for e in all_start_errors if e <= 50) / len(all_start_errors) * 100:.1f}%\")\n            print(f\"  Within 100ms: {sum(1 for e in all_start_errors if e <= 100) / len(all_start_errors) * 100:.1f}%\")\n\n        avg_wer = np.mean([s[\"wer\"] for s in utterance_stats])\n        print(f\"\\nWord Error Rate (WER): {avg_wer * 100:.1f}%\")\n\n    return details_path, summary_path\n\n\ndef main():\n    parser = argparse.ArgumentParser(description=\"Run timestamp accuracy benchmark\")\n    parser.add_argument(\"--encoder\", required=True, help=\"Path to encoder.onnx\")\n    parser.add_argument(\"--decoder\", required=True, help=\"Path to decoder.onnx\")\n    parser.add_argument(\"--tokens\", required=True, help=\"Path to tokens.txt\")\n    parser.add_argument(\n        \"--data-dir\",\n        default=\"benchmark_data\",\n        help=\"Directory with manifest.json and audio (default: benchmark_data)\"\n    )\n    parser.add_argument(\n        \"--output-dir\",\n        default=\"benchmark_results\",\n        help=\"Output directory for CSV files (default: benchmark_results)\"\n    )\n    parser.add_argument(\n        \"--language\",\n        default=\"en\",\n        help=\"Language code (default: en)\"\n    )\n    parser.add_argument(\n        \"--num-workers\",\n        type=int,\n        default=1,\n        help=\"Number of parallel workers (default: 1, sequential). \"\n             \"Use higher values to speed up benchmarks on multi-core machines. 
\"\n             \"Each worker loads its own model copy, so memory usage scales linearly.\"\n    )\n    args = parser.parse_args()\n\n    # Resolve paths\n    script_dir = Path(__file__).parent.resolve()\n    repo_root = script_dir.parent.parent\n    data_dir = repo_root / args.data_dir\n    output_dir = repo_root / args.output_dir\n    manifest_path = data_dir / \"manifest.json\"\n\n    # Load manifest\n    print(f\"Loading manifest from {manifest_path}...\")\n    with open(manifest_path) as f:\n        manifest = json.load(f)\n    print(f\"  Found {len(manifest)} utterances\")\n\n    # Print recognizer info\n    print(f\"\\nRecognizer configuration:\")\n    print(f\"  Encoder: {args.encoder}\")\n    print(f\"  Decoder: {args.decoder}\")\n    print(f\"  Tokens: {args.tokens}\")\n    if args.num_workers > 1:\n        print(f\"  Workers: {args.num_workers} (parallel)\")\n    else:\n        print(f\"  Workers: 1 (sequential)\")\n\n    # Run benchmark\n    details_path, summary_path = run_benchmark(\n        manifest=manifest,\n        data_dir=data_dir,\n        output_dir=output_dir,\n        encoder=args.encoder,\n        decoder=args.decoder,\n        tokens=args.tokens,\n        language=args.language,\n        num_workers=args.num_workers,\n    )\n\n    print(\"\\n\" + \"=\" * 60)\n    print(\"Benchmark complete!\")\n    print(\"=\" * 60)\n    print(f\"Details: {details_path}\")\n    print(f\"Summary: {summary_path}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/check_style_cpplint.sh",
    "content": "#!/bin/bash\n#\n# Copyright      2020  Mobvoi Inc. (authors: Fangjun Kuang)\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n#\n# Usage:\n#\n# (1) To check files of the last commit\n#  ./scripts/check_style_cpplint.sh\n#\n# (2) To check changed files not committed yet\n#  ./scripts/check_style_cpplint.sh 1\n#\n# (3) To check all files in the project\n#  ./scripts/check_style_cpplint.sh 2\n\n\ncpplint_version=\"2.0.2\"\ncur_dir=$(cd $(dirname $BASH_SOURCE) && pwd)\nsherpa_onnx_dir=$(cd $cur_dir/.. && pwd)\n\nbuild_dir=$sherpa_onnx_dir/build\nmkdir -p $build_dir\n\ncpplint_src=$build_dir/cpplint-${cpplint_version}/cpplint.py\n\nif [ ! 
-d \"$build_dir/cpplint-${cpplint_version}\" ]; then\n  pushd $build_dir\n  if command -v wget &> /dev/null; then\n    wget https://github.com/cpplint/cpplint/archive/${cpplint_version}.tar.gz\n  elif command -v curl &> /dev/null; then\n    curl -O -SL https://github.com/cpplint/cpplint/archive/${cpplint_version}.tar.gz\n  else\n    echo \"Please install wget or curl to download cpplint\"\n    exit 1\n  fi\n  tar xf ${cpplint_version}.tar.gz\n  rm ${cpplint_version}.tar.gz\n\n  # cpplint will report the following error for: __host__ __device__ (\n  #\n  #     Extra space before ( in function call  [whitespace/parens] [4]\n  #\n  # the following patch disables the above error\n  sed -i \"3490i\\        not Search(r'__host__ __device__\\\\\\s+\\\\\\(', fncall) and\" $cpplint_src\n  popd\nfi\n\nsource $sherpa_onnx_dir/scripts/utils.sh\n\n# return true if the given file is a c++ source file\n# return false otherwise\nfunction is_source_code_file() {\n  case \"$1\" in\n    *.cc|*.h|*.cu)\n      echo true;;\n    *)\n      echo false;;\n  esac\n}\n\nfunction check_style() {\n  if [[ $1 == mfc-example* ]]; then\n    return\n  fi\n  python3 $cpplint_src $1 || abort $1\n}\n\nfunction check_last_commit() {\n  files=$(git diff HEAD^1 --name-only --diff-filter=ACDMRUXB)\n  echo $files\n}\n\nfunction check_current_dir() {\n  files=$(git status -s -uno --porcelain | awk '{\n  if (NF == 4) {\n    # a file has been renamed\n    print $NF\n  } else {\n    print $2\n  }}')\n\n  echo $files\n}\n\nfunction do_check() {\n  case \"$1\" in\n    1)\n      echo \"Check changed files\"\n      files=$(check_current_dir)\n      ;;\n    2)\n      echo \"Check all files\"\n      files=$(find $sherpa_onnx_dir/cxx-api-examples-ignored $sherpa_onnx_dir/c-api-examples-ignored $sherpa_onnx_dir/sherpa-onnx/csrc $sherpa_onnx_dir/sherpa-onnx/python $sherpa_onnx_dir/scripts/node-addon-api/src $sherpa_onnx_dir/sherpa-onnx/jni $sherpa_onnx_dir/sherpa-onnx/c-api -name \"*.h\" -o -name \"*.cc\")\n      
files2=$(find $sherpa_onnx_dir/harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/ -name \"*.cc\")\n      ;;\n    *)\n      echo \"Check last commit\"\n      files=$(check_last_commit)\n      ;;\n  esac\n\n  for f in $files $files2; do\n    need_check=$(is_source_code_file $f)\n    if $need_check; then\n      [[ -f $f ]] && check_style $f\n    fi\n  done\n}\n\nfunction main() {\n  do_check $1\n\n  ok \"Great! Style check passed!\"\n}\n\ncd $sherpa_onnx_dir\n\nmain $1\n"
  },
  {
    "path": "scripts/dart/add-punctuations-pubspec.yaml",
    "content": "name: add_punctuations\n\ndescription: >\n  This example demonstrates how to use the Dart API to add punctuations to text.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/audio-tagging-pubspec.yaml",
    "content": "name: audio_tagging\n\ndescription: >\n  This example demonstrates how to use the Dart API for audio tagging.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/kws-pubspec.yaml",
    "content": "name: keyword_spotter\n\ndescription: >\n  This example demonstrates how to use the Dart API for keyword spotting.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/non-streaming-asr-pubspec.yaml",
    "content": "name: non_streaming_asr\ndescription: >\n  This example demonstrates how to use the Dart API for non-streaming speech recognition. Specifically, we use the following models as examples: whisper, zipformer, and paraformer.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/release.sh",
    "content": "#!/usr/bin/env bash\n\n# see\n# https://dart.dev/tools/pub/automated-publishing\n\nset -ex\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\necho \"SCRIPT_DIR: $SCRIPT_DIR\"\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nsrc_dir=$SHERPA_ONNX_DIR/sherpa-onnx/flutter\npushd $src_dir\n\nv=\"version: $SHERPA_ONNX_VERSION\"\necho \"v: $v\"\nsed -i.bak s\"/^version: .*/$v/\" ./pubspec.yaml\nrm *.bak\nrm notes.md\ngit status\ngit diff\n\nHF_MIRROR=hf.co\nlinux_wheel_filename=sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-manylinux2014_x86_64.whl\nlinux_wheel=$src_dir/$linux_wheel_filename\n\nmacos_wheel_filename=sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-macosx_10_15_universal2.whl\nmacos_wheel=$src_dir/$macos_wheel_filename\n\nwindows_x64_wheel_filename=sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-win_amd64.whl\nwindows_x64_wheel=$src_dir/$windows_x64_wheel_filename\n\nfunction process_linux() {\n  mkdir -p t\n  cd t\n  curl -OL https://$HF_MIRROR/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/$linux_wheel_filename\n  unzip $linux_wheel_filename\n  cp -v sherpa_onnx/lib/*.so* ../linux\n  cd ..\n  rm -rf t\n\n  pushd linux\n\n  popd\n}\n\nfunction process_windows_x64() {\n  mkdir -p t\n  cd t\n  curl -OL https://$HF_MIRROR/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/$windows_x64_wheel_filename\n  unzip $windows_x64_wheel_filename\n  cp -v sherpa_onnx/lib/*.dll ../windows\n  cd ..\n  rm -rf t\n}\n\nfunction process_macos() {\n  mkdir -p t\n  cd t\n  curl -OL https://$HF_MIRROR/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/$macos_wheel_filename\n  unzip $macos_wheel_filename\n  cp -v sherpa_onnx/lib/*.dylib ../macos\n  cd ..\n  rm -rf 
t\n}\n\nprocess_linux\nprocess_windows_x64\nprocess_macos\n"
  },
  {
    "path": "scripts/dart/sherpa-onnx-pubspec.yaml",
    "content": "name: sherpa_onnx\n\ndescription: >\n  Speech recognition, speech synthesis, and speaker recognition using next-gen Kaldi\n  with onnxruntime without Internet connection.\n\nrepository: https://github.com/k2-fsa/sherpa-onnx/tree/master/sherpa-onnx/flutter\n\nissue_tracker: https://github.com/k2-fsa/sherpa-onnx/issues\ndocumentation: https://k2-fsa.github.io/sherpa/onnx/\n\ntopics:\n  - speech-recognition\n  - speech-synthesis\n  - speaker-identification\n  - audio-tagging\n  - voice-activity-detection\n\n# remember to change the version in ../sherpa_onnx_macos/macos/sherpa_onnx.podspec\nversion: 1.10.20\n\nhomepage: https://github.com/k2-fsa/sherpa-onnx\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  ffi: ^2.1.0\n  flutter:\n    sdk: flutter\n\n  sherpa_onnx_android:\n    path: ../sherpa_onnx_android\n\n  sherpa_onnx_macos:\n    path: ../sherpa_onnx_macos\n\n  sherpa_onnx_linux:\n    path: ../sherpa_onnx_linux\n\n  sherpa_onnx_windows:\n    path: ../sherpa_onnx_windows\n\nflutter:\n  plugin:\n    platforms:\n      android:\n        default_package: sherpa_onnx_android\n\n      macos:\n        default_package: sherpa_onnx_macos\n\n      linux:\n        default_package: sherpa_onnx_linux\n\n      windows:\n        default_package: sherpa_onnx_windows\n"
  },
  {
    "path": "scripts/dart/slid-pubspec.yaml",
    "content": "name: spoken_language_identification\n\ndescription: >\n  This example demonstrates how to use the Dart API for spoken language identification.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/speaker-diarization-pubspec.yaml",
    "content": "name: speaker_diarization\ndescription: >\n  This example demonstrates how to use the Dart API for speaker diarization.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/speaker-id-pubspec.yaml",
    "content": "name: speaker_identification\n\ndescription: >\n  This example demonstrates how to use the Dart API for speaker identification.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/speech-enhancement-dpdfnet-pubspec.yaml",
    "content": "name: speech_enhancement_dpdfnet\n\ndescription: >\n  This example demonstrates how to use the Dart API for DPDFNet speech enhancement/denoising.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/speech-enhancement-gtcrn-pubspec.yaml",
    "content": "name: speech_enhancement_gtcrn\n\ndescription: >\n  This example demonstrates how to use the Dart API for speech enhancement/denoising.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/streaming-asr-pubspec.yaml",
    "content": "name: streaming_asr\n\ndescription: >\n  This example demonstrates how to use the Dart API for streaming speech recognition.\n\nversion: 1.0.0\n# repository: https://github.com/my_org/my_repo\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n  test: ^1.24.0\n"
  },
  {
    "path": "scripts/dart/streaming-speech-enhancement-dpdfnet-pubspec.yaml",
    "content": "name: streaming_speech_enhancement_dpdfnet\n\ndescription: >\n  This example demonstrates how to use the Dart API for streaming speech enhancement/denoising with DPDFNet.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/streaming-speech-enhancement-gtcrn-pubspec.yaml",
    "content": "name: streaming_speech_enhancement_gtcrn\n\ndescription: >\n  This example demonstrates how to use the Dart API for streaming speech enhancement/denoising with GTCRN.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/tts-pubspec.yaml",
    "content": "name: tts\ndescription: A sample command-line application.\nversion: 1.0.0\n# repository: https://github.com/my_org/my_repo\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\n# Add regular dependencies here.\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/vad-non-streaming-asr-pubspec.yaml",
    "content": "name: vad_with_non_streaming_asr\n\ndescription: >\n  This example demonstrates how to use the Dart API for VAD (voice activity detection)\n  with non-streaming speech recognition.\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dart/vad-pubspec.yaml",
    "content": "name: vad\n\ndescription: >\n  This example demonstrates how to use the Dart API for VAD (voice activity detection).\n\nversion: 1.0.0\n\nenvironment:\n  sdk: \">=3.0.0 <4.0.0\"\n\ndependencies:\n  sherpa_onnx:\n    path: ../../flutter/sherpa_onnx\n\n  path: ^1.9.0\n  args: ^2.5.0\n\ndev_dependencies:\n  lints: ^3.0.0\n"
  },
  {
    "path": "scripts/dotnet/.gitignore",
    "content": "all\nmacos-arm64\nmacos-x64\nlinux-x64\nlinux-arm64\nwindows-arm64\nwindows-x64\nwindows-x86\npackages\ntmp\n"
  },
  {
    "path": "scripts/dotnet/AudioEvent.cs",
    "content": "﻿/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n\n    public class AudioEvent\n    {\n        public AudioEvent(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n            // PtrToStringUTF8() requires .NET Standard 2.1, so the\n            // NUL-terminated UTF-8 name is decoded manually below instead of:\n            // _name = Marshal.PtrToStringUTF8(impl.Name);\n\n            int length = 0;\n\n            unsafe\n            {\n                byte* buffer = (byte*)impl.Name;\n                while (*buffer != 0)\n                {\n                    ++buffer;\n                    length += 1;\n                }\n            }\n\n            byte[] stringBuffer = new byte[length];\n            Marshal.Copy(impl.Name, stringBuffer, 0, length);\n            _name = Encoding.UTF8.GetString(stringBuffer);\n\n            _index = impl.Index;\n            _prob = impl.Prob;\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Name;\n            public int Index;\n            public float Prob;\n        }\n\n        private String _name;\n        public String Name => _name;\n\n        private int _index;\n        public int Index => _index;\n\n        private float _prob;\n        public float Prob => _prob;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/AudioTagging.cs",
    "content": "﻿/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\nusing System.Collections.Generic;\n\nnamespace SherpaOnnx\n{\n    public class AudioTagging : IDisposable\n    {\n        public AudioTagging(AudioTaggingConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateAudioTagging(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OfflineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxAudioTaggingCreateOfflineStream(_handle.Handle);\n            return new OfflineStream(p);\n        }\n\n        // if topK <= 0, then config.TopK is used\n        // if topK > 0, then config.TopK is ignored\n        public AudioEvent[] Compute(OfflineStream stream, int topK = -1)\n        {\n            IntPtr p = SherpaOnnxAudioTaggingCompute(_handle.Handle, stream.Handle, topK);\n\n            var result = new List<AudioEvent>();\n\n            if (p == IntPtr.Zero)\n            {\n              return result.ToArray();\n            }\n\n            int index = 0;\n            while (true)\n            {\n              IntPtr e = Marshal.ReadIntPtr(p, index * IntPtr.Size);\n              if (e == IntPtr.Zero)\n              {\n                break;\n              }\n\n              AudioEvent ae = new AudioEvent(e);\n              result.Add(ae);\n\n              ++index;\n            }\n\n            SherpaOnnxAudioTaggingFreeResults(p);\n\n            return result.ToArray();\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~AudioTagging()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyAudioTagging(_handle.Handle);\n\n            // Don't permit the 
handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateAudioTagging(ref AudioTaggingConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyAudioTagging(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxAudioTaggingCreateOfflineStream(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxAudioTaggingCompute(IntPtr handle, IntPtr stream, int topK);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxAudioTaggingFreeResults(IntPtr p);\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/AudioTaggingConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct AudioTaggingConfig\n    {\n        public AudioTaggingConfig()\n        {\n            Model = new AudioTaggingModelConfig();\n\n            Labels = \"\";\n            TopK = 5;\n        }\n\n        public AudioTaggingModelConfig Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Labels;\n\n        public int TopK;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/AudioTaggingModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct AudioTaggingModelConfig\n    {\n        public AudioTaggingModelConfig()\n        {\n            Zipformer = new OfflineZipformerAudioTaggingModelConfig();\n\n            CED = \"\";\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        public OfflineZipformerAudioTaggingModelConfig Zipformer;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string CED;\n\n        public int NumThreads;\n\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/CircularBuffer.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class CircularBuffer : IDisposable\n    {\n        public CircularBuffer(int capacity)\n        {\n            IntPtr h = SherpaOnnxCreateCircularBuffer(capacity);\n            _handle = new HandleRef(this, h);\n        }\n\n        public void Push(float[] data)\n        {\n            SherpaOnnxCircularBufferPush(_handle.Handle, data, data.Length);\n        }\n\n        public float[] Get(int startIndex, int n)\n        {\n            IntPtr p = SherpaOnnxCircularBufferGet(_handle.Handle, startIndex, n);\n\n            float[] ans = new float[n];\n            Marshal.Copy(p, ans, 0, n);\n\n            SherpaOnnxCircularBufferFree(p);\n\n            return ans;\n        }\n\n        public void Pop(int n)\n        {\n            SherpaOnnxCircularBufferPop(_handle.Handle, n);\n        }\n\n        public int Size\n        {\n          get\n          {\n              return SherpaOnnxCircularBufferSize(_handle.Handle);\n          }\n        }\n\n        public int Head\n        {\n          get\n          {\n              return SherpaOnnxCircularBufferHead(_handle.Handle);\n          }\n        }\n\n        public void Reset()\n        {\n            SherpaOnnxCircularBufferReset(_handle.Handle);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~CircularBuffer()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyCircularBuffer(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        
[DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateCircularBuffer(int capacity);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyCircularBuffer(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxCircularBufferPush(IntPtr handle, float[] p, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCircularBufferGet(IntPtr handle, int startIndex, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxCircularBufferFree(IntPtr p);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxCircularBufferPop(IntPtr handle, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxCircularBufferSize(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxCircularBufferHead(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxCircularBufferReset(IntPtr handle);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/DenoisedAudio.cs",
    "content": "﻿/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    public class DenoisedAudio\n    {\n        public DenoisedAudio(IntPtr p)\n        {\n            _handle = new HandleRef(this, p);\n        }\n\n        public bool SaveToWaveFile(String filename)\n        {\n            if (Handle == IntPtr.Zero)\n            {\n                return false;\n            }\n\n            Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n            byte[] utf8Filename = Encoding.UTF8.GetBytes(filename);\n            byte[] utf8FilenameWithNull = new byte[utf8Filename.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Filename, utf8FilenameWithNull, utf8Filename.Length);\n            utf8FilenameWithNull[utf8Filename.Length] = 0; // Null terminator\n            int status = SherpaOnnxWriteWave(impl.Samples, impl.NumSamples, impl.SampleRate, utf8FilenameWithNull);\n            return status == 1;\n        }\n\n        ~DenoisedAudio()\n        {\n            Cleanup();\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        private void Cleanup()\n        {\n            if (Handle != IntPtr.Zero)\n            {\n                SherpaOnnxDestroyDenoisedAudio(Handle);\n            }\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Samples;\n            public int NumSamples;\n            public int SampleRate;\n        }\n\n        private HandleRef _handle;\n        public IntPtr Handle => _handle.Handle;\n\n        public int NumSamples\n        
{\n            get\n            {\n                if (Handle == IntPtr.Zero)\n                {\n                    return 0;\n                }\n\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n                return impl.NumSamples;\n            }\n        }\n\n        public int SampleRate\n        {\n            get\n            {\n                if (Handle == IntPtr.Zero)\n                {\n                    return 0;\n                }\n\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n                return impl.SampleRate;\n            }\n        }\n\n        public float[] Samples\n        {\n            get\n            {\n                if (Handle == IntPtr.Zero)\n                {\n                    return new float[0];\n                }\n\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n\n                float[] samples = new float[impl.NumSamples];\n                if (impl.NumSamples > 0 && impl.Samples != IntPtr.Zero)\n                {\n                    Marshal.Copy(impl.Samples, samples, 0, impl.NumSamples);\n                }\n                return samples;\n            }\n        }\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyDenoisedAudio(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxWriteWave(IntPtr samples, int n, int sample_rate, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Filename);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/Dll.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nnamespace SherpaOnnx\n{\n    internal static class Dll\n    {\n        public const string Filename = \"sherpa-onnx-c-api\";\n    }\n}"
  },
  {
    "path": "scripts/dotnet/FastClusteringConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct FastClusteringConfig\n    {\n        public FastClusteringConfig()\n        {\n            NumClusters = -1;\n            Threshold = 0.5F;\n        }\n\n        public int NumClusters;\n        public float Threshold;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/FeatureConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    /// It expects 16 kHz 16-bit single channel wave format.\n    [StructLayout(LayoutKind.Sequential)]\n    public struct FeatureConfig\n    {\n        public FeatureConfig()\n        {\n            SampleRate = 16000;\n            FeatureDim = 80;\n        }\n        /// Sample rate of the input data. MUST match the one expected\n        /// by the model. For instance, it should be 16000 for models provided\n        /// by us.\n        public int SampleRate;\n\n        /// Feature dimension of the model.\n        /// For instance, it should be 80 for models provided by us.\n        public int FeatureDim;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/HomophoneReplacerConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct HomophoneReplacerConfig\n    {\n        public HomophoneReplacerConfig()\n        {\n          DictDir = \"\";\n          Lexicon = \"\";\n          RuleFsts = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DictDir;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lexicon;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFsts;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/KeywordResult.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    public class KeywordResult\n    {\n        public KeywordResult(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n            // PtrToStringUTF8() requires .net standard 2.1\n            // _keyword = Marshal.PtrToStringUTF8(impl.Keyword);\n\n            int length = 0;\n\n            unsafe\n            {\n                byte* buffer = (byte*)impl.Keyword;\n                while (*buffer != 0)\n                {\n                    ++buffer;\n                    length += 1;\n                }\n            }\n\n            byte[] stringBuffer = new byte[length];\n            Marshal.Copy(impl.Keyword, stringBuffer, 0, length);\n            _keyword = Encoding.UTF8.GetString(stringBuffer);\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Keyword;\n        }\n\n        private String _keyword;\n        public String Keyword => _keyword;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/KeywordSpotter.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Collections.Generic;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    // please see\n    // https://www.mono-project.com/docs/advanced/pinvoke/#gc-safe-pinvoke-code\n    // https://www.mono-project.com/docs/advanced/pinvoke/#properly-disposing-of-resources\n    public class KeywordSpotter : IDisposable\n    {\n        public KeywordSpotter(KeywordSpotterConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateKeywordSpotter(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OnlineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxCreateKeywordStream(_handle.Handle);\n            return new OnlineStream(p);\n        }\n\n        public OnlineStream CreateStream(string keywords)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(keywords);\n            byte[] utf8BytesWithNull = new byte[utf8Bytes.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0; // Null terminator\n            IntPtr p = SherpaOnnxCreateKeywordStreamWithKeywords(_handle.Handle, utf8BytesWithNull);\n            return new OnlineStream(p);\n        }\n\n        /// Return true if the passed stream is ready for decoding.\n        public bool IsReady(OnlineStream stream)\n        {\n            return IsReady(_handle.Handle, stream.Handle) != 0;\n        }\n\n        /// You have to ensure that IsReady(stream) returns true before\n        /// you call this method\n        public void Decode(OnlineStream stream)\n        {\n            Decode(_handle.Handle, stream.Handle);\n        }\n\n        public void Reset(OnlineStream stream)\n        {\n            Reset(_handle.Handle, stream.Handle);\n        }\n\n        // The caller should ensure all passed 
streams are ready for decoding.\n        public void Decode(IEnumerable<OnlineStream> streams)\n        {\n            // TargetFramework=net20 does not support System.Linq\n            // IntPtr[] ptrs = streams.Select(s => s.Handle).ToArray();\n            List<IntPtr> list = new List<IntPtr>();\n            foreach (OnlineStream s in streams)\n            {\n              list.Add(s.Handle);\n            }\n\n            IntPtr[] ptrs = list.ToArray();\n            Decode(_handle.Handle, ptrs, ptrs.Length);\n        }\n\n        public KeywordResult GetResult(OnlineStream stream)\n        {\n            IntPtr h = GetResult(_handle.Handle, stream.Handle);\n            KeywordResult result = new KeywordResult(h);\n            DestroyResult(h);\n            return result;\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~KeywordSpotter()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyKeywordSpotter(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateKeywordSpotter(ref KeywordSpotterConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyKeywordSpotter(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateKeywordStream(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateKeywordStreamWithKeywords(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Keywords);\n\n        
[DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxIsKeywordStreamReady\")]\n        private static extern int IsReady(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeKeywordStream\")]\n        private static extern void Decode(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxResetKeywordStream\")]\n        private static extern void Reset(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeMultipleKeywordStreams\")]\n        private static extern void Decode(IntPtr handle, IntPtr[] streams, int n);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxGetKeywordResult\")]\n        private static extern IntPtr GetResult(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDestroyKeywordResult\")]\n        private static extern void DestroyResult(IntPtr result);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/KeywordSpotterConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct KeywordSpotterConfig\n    {\n        public KeywordSpotterConfig()\n        {\n            FeatConfig = new FeatureConfig();\n            ModelConfig = new OnlineModelConfig();\n\n            MaxActivePaths = 4;\n            NumTrailingBlanks = 1;\n            KeywordsScore = 1.0F;\n            KeywordsThreshold = 0.25F;\n            KeywordsFile = \"\";\n            KeywordsBuf= \"\";\n            KeywordsBufSize= 0;\n        }\n        public FeatureConfig FeatConfig;\n        public OnlineModelConfig ModelConfig;\n\n        public int MaxActivePaths;\n        public int NumTrailingBlanks;\n        public float KeywordsScore;\n        public float KeywordsThreshold;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string KeywordsFile;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string KeywordsBuf;\n\n        public int KeywordsBufSize;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineCanaryModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineCanaryModelConfig\n    {\n        public OfflineCanaryModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n            SrcLang = \"en\";\n            TgtLang = \"en\";\n            UsePnc = 1;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string SrcLang;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TgtLang;\n\n        public int UsePnc;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineDolphinModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineDolphinModelConfig\n    {\n        public OfflineDolphinModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineFireRedAsrCtcModel.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineFireRedAsrCtcModelConfig\n    {\n        public OfflineFireRedAsrCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineFireRedAsrModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineFireRedAsrModelConfig\n    {\n        public OfflineFireRedAsrModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineFunAsrNanoModel.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineFunAsrNanoModelConfig\n    {\n        public OfflineFunAsrNanoModelConfig()\n        {\n            EncoderAdaptor = \"\";\n            LLM = \"\";\n            Embedding = \"\";\n            Tokenizer = \"\";\n            SystemPrompt = \"You are a helpful assistant.\";\n            UserPrompt = \"语音转写：\";\n            MaxNewTokens = 512;\n            Temperature = 1e-6F;\n            TopP = 0.8F;\n            Seed = 42;\n            Language = \"\";\n            Itn = 0;\n            Hotwords = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string EncoderAdaptor;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string LLM;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Embedding;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokenizer;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string SystemPrompt;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string UserPrompt;\n\n        public int MaxNewTokens;\n        public float Temperature;\n        public float TopP;\n        public int Seed;\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Language;\n\n        public int Itn;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Hotwords;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineLMConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineLMConfig\n    {\n        public OfflineLMConfig()\n        {\n            Model = \"\";\n            Scale = 0.5F;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        public float Scale;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OfflineMedAsrCtcModel.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineMedAsrCtcModelConfig\n    {\n        public OfflineMedAsrCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineModelConfig\n    {\n        public OfflineModelConfig()\n        {\n            Transducer = new OfflineTransducerModelConfig();\n            Paraformer = new OfflineParaformerModelConfig();\n            NeMoCtc = new OfflineNemoEncDecCtcModelConfig();\n            Whisper = new OfflineWhisperModelConfig();\n            Tdnn = new OfflineTdnnModelConfig();\n            Tokens = \"\";\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n            ModelType = \"\";\n            ModelingUnit = \"cjkchar\";\n            BpeVocab = \"\";\n            TeleSpeechCtc = \"\";\n            SenseVoice = new OfflineSenseVoiceModelConfig();\n            Moonshine = new OfflineMoonshineModelConfig();\n            FireRedAsr = new OfflineFireRedAsrModelConfig();\n            Dolphin = new OfflineDolphinModelConfig();\n            ZipformerCtc = new OfflineZipformerCtcModelConfig();\n            Canary = new OfflineCanaryModelConfig();\n            WenetCtc = new OfflineWenetCtcModelConfig();\n            Omnilingual = new OfflineOmnilingualAsrCtcModelConfig();\n            MedAsr = new OfflineMedAsrCtcModelConfig();\n            FunAsrNano = new OfflineFunAsrNanoModelConfig();\n            FireRedAsrCtc = new OfflineFireRedAsrCtcModelConfig();\n        }\n        public OfflineTransducerModelConfig Transducer;\n        public OfflineParaformerModelConfig Paraformer;\n        public OfflineNemoEncDecCtcModelConfig NeMoCtc;\n        public OfflineWhisperModelConfig Whisper;\n        public OfflineTdnnModelConfig Tdnn;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        public int NumThreads;\n\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n\n        
[MarshalAs(UnmanagedType.LPStr)]\n        public string ModelType;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string ModelingUnit;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string BpeVocab;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TeleSpeechCtc;\n\n        public OfflineSenseVoiceModelConfig SenseVoice;\n        public OfflineMoonshineModelConfig Moonshine;\n        public OfflineFireRedAsrModelConfig FireRedAsr;\n        public OfflineDolphinModelConfig Dolphin;\n        public OfflineZipformerCtcModelConfig ZipformerCtc;\n        public OfflineCanaryModelConfig Canary;\n        public OfflineWenetCtcModelConfig WenetCtc;\n        public OfflineOmnilingualAsrCtcModelConfig Omnilingual;\n        public OfflineMedAsrCtcModelConfig MedAsr;\n        public OfflineFunAsrNanoModelConfig FunAsrNano;\n        public OfflineFireRedAsrCtcModelConfig FireRedAsrCtc;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineMoonshineModelConfig.cs",
    "content": "/// Copyright (c)  2024-2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\n// For Moonshine v1, you need four models:\n//  - preprocessor, encoder, cached_decoder, uncached_decoder\n//\n// For Moonshine v2, you need 2 models:\n//  - encoder, merged_decoder\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineMoonshineModelConfig\n    {\n        public OfflineMoonshineModelConfig()\n        {\n            Preprocessor = \"\";\n            Encoder = \"\";\n            UncachedDecoder = \"\";\n            CachedDecoder = \"\";\n            MergedDecoder = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Preprocessor;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string UncachedDecoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string CachedDecoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string MergedDecoder;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineNemoEncDecCtcModelConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineNemoEncDecCtcModelConfig\n    {\n        public OfflineNemoEncDecCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}"
  },
  {
    "path": "scripts/dotnet/OfflineOmnilingualAsrCtcModel.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineOmnilingualAsrCtcModelConfig\n    {\n        public OfflineOmnilingualAsrCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineParaformerModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineParaformerModelConfig\n    {\n        public OfflineParaformerModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OfflinePunctuation.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\n\nnamespace SherpaOnnx\n{\n    public class OfflinePunctuation : IDisposable\n    {\n        public OfflinePunctuation(OfflinePunctuationConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOfflinePunctuation(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public String AddPunct(String text)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);\n            byte[] utf8BytesWithNull = new byte[utf8Bytes.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0; // Null terminator\n\n            IntPtr p = SherpaOfflinePunctuationAddPunct(_handle.Handle, utf8BytesWithNull);\n\n            string s = \"\";\n            int length = 0;\n\n            unsafe\n            {\n                byte* b = (byte*)p;\n                if (b != null)\n                {\n                    while (*b != 0)\n                    {\n                        ++b;\n                        length += 1;\n                    }\n                }\n            }\n\n            if (length > 0)\n            {\n                byte[] stringBuffer = new byte[length];\n                Marshal.Copy(p, stringBuffer, 0, length);\n                s = Encoding.UTF8.GetString(stringBuffer);\n            }\n\n            SherpaOfflinePunctuationFreeText(p);\n\n            return s;\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OfflinePunctuation()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            
SherpaOnnxDestroyOfflinePunctuation(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOfflinePunctuation(ref OfflinePunctuationConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflinePunctuation(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOfflinePunctuationAddPunct(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Text);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOfflinePunctuationFreeText(IntPtr p);\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflinePunctuationConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflinePunctuationConfig\n    {\n        public OfflinePunctuationConfig()\n        {\n            Model = new OfflinePunctuationModelConfig();\n        }\n        public OfflinePunctuationModelConfig Model;\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflinePunctuationModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflinePunctuationModelConfig\n    {\n        public OfflinePunctuationModelConfig()\n        {\n            CtTransformer = \"\";\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string CtTransformer;\n\n        public int NumThreads;\n\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineRecognizer.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\n\nusing System;\nusing System.Collections.Generic;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class OfflineRecognizer : IDisposable\n    {\n        public OfflineRecognizer(OfflineRecognizerConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOfflineRecognizer(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public void SetConfig(OfflineRecognizerConfig config)\n        {\n            SherpaOnnxOfflineRecognizerSetConfig(_handle.Handle, ref config);\n        }\n\n        public OfflineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxCreateOfflineStream(_handle.Handle);\n            return new OfflineStream(p);\n        }\n\n        public void Decode(OfflineStream stream)\n        {\n            Decode(_handle.Handle, stream.Handle);\n        }\n\n        // The caller should ensure all passed streams are ready for decoding.\n        public void Decode(IEnumerable<OfflineStream> streams)\n        {\n            // TargetFramework=net20 does not support System.Linq\n            // IntPtr[] ptrs = streams.Select(s => s.Handle).ToArray();\n            List<IntPtr> list = new List<IntPtr>();\n            foreach (OfflineStream s in streams)\n            {\n              list.Add(s.Handle);\n            }\n            IntPtr[] ptrs = list.ToArray();\n            Decode(_handle.Handle, ptrs, ptrs.Length);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OfflineRecognizer()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineRecognizer(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, 
IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOfflineRecognizer(ref OfflineRecognizerConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOfflineRecognizerSetConfig(IntPtr handle, ref OfflineRecognizerConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineRecognizer(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOfflineStream(IntPtr handle);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeOfflineStream\")]\n        private static extern void Decode(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeMultipleOfflineStreams\")]\n        private static extern void Decode(IntPtr handle, IntPtr[] streams, int n);\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineRecognizerConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineRecognizerConfig\n    {\n        public OfflineRecognizerConfig()\n        {\n            FeatConfig = new FeatureConfig();\n            ModelConfig = new OfflineModelConfig();\n            LmConfig = new OfflineLMConfig();\n\n            DecodingMethod = \"greedy_search\";\n            MaxActivePaths = 4;\n            HotwordsFile = \"\";\n            HotwordsScore = 1.5F;\n            RuleFsts = \"\";\n            RuleFars = \"\";\n            BlankPenalty = 0.0F;\n            Hr = new HomophoneReplacerConfig();\n        }\n        public FeatureConfig FeatConfig;\n        public OfflineModelConfig ModelConfig;\n        public OfflineLMConfig LmConfig;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DecodingMethod;\n\n        public int MaxActivePaths;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string HotwordsFile;\n\n        public float HotwordsScore;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFsts;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFars;\n\n        public float BlankPenalty;\n\n        public HomophoneReplacerConfig Hr;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineRecognizerResult.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\n\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n\n    public class OfflineRecognizerResult\n    {\n        public OfflineRecognizerResult(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n            // PtrToStringUTF8() requires .net standard 2.1\n            // _text = Marshal.PtrToStringUTF8(impl.Text);\n\n            int length = 0;\n\n            unsafe\n            {\n                byte* buffer = (byte*)impl.Text;\n                while (*buffer != 0)\n                {\n                    ++buffer;\n                    length += 1;\n                }\n            }\n\n            byte[] stringBuffer = new byte[length];\n            Marshal.Copy(impl.Text, stringBuffer, 0, length);\n            _text = Encoding.UTF8.GetString(stringBuffer);\n\n            _tokens = new String[impl.Count];\n\n            unsafe\n            {\n                byte* buf = (byte*)impl.Tokens;\n                for (int i = 0; i < impl.Count; i++)\n                {\n                    length = 0;\n                    byte* start = buf;\n                    while (*buf != 0)\n                    {\n                        ++buf;\n                        length += 1;\n                    }\n                    ++buf;\n\n                    stringBuffer = new byte[length];\n                    fixed (byte* pTarget = stringBuffer)\n                    {\n                        for (int k = 0; k < length; k++)\n                        {\n                            pTarget[k] = start[k];\n                        }\n                    }\n\n                    _tokens[i] = Encoding.UTF8.GetString(stringBuffer);\n                }\n            }\n\n            unsafe\n            {\n              if (impl.Timestamps != IntPtr.Zero)\n              {\n                float *t = (float*)impl.Timestamps;\n                
_timestamps = new float[impl.Count];\n                fixed (float* f = _timestamps)\n                {\n                  for (int k = 0; k < impl.Count; k++)\n                  {\n                    f[k] = t[k];\n                  }\n                }\n              }\n            }\n\n            unsafe\n            {\n              if (impl.Durations != IntPtr.Zero)\n              {\n                float *d = (float*)impl.Durations;\n                _durations = new float[impl.Count];\n                fixed (float* f = _durations)\n                {\n                  for (int k = 0; k < impl.Count; k++)\n                  {\n                    f[k] = d[k];\n                  }\n                }\n              }\n            }\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Text;\n            public IntPtr Timestamps;\n            public int Count;\n            public IntPtr Tokens;\n            public IntPtr Durations;\n        }\n\n        private String _text;\n        public String Text => _text;\n\n        private String[] _tokens;\n        public String[] Tokens => _tokens;\n\n        private float[] _timestamps;\n        public float[] Timestamps => _timestamps;\n\n        private float[] _durations;\n        public float[] Durations => _durations;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSenseVoiceModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSenseVoiceModelConfig\n    {\n        public OfflineSenseVoiceModelConfig()\n        {\n            Model = \"\";\n            Language = \"\";\n            UseInverseTextNormalization = 0;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Language;\n\n        public int UseInverseTextNormalization;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeakerDiarization.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    // IntPtr is actually a `const float*` from C++\n    public delegate int OfflineSpeakerDiarizationProgressCallback(int numProcessedChunks, int numTotalChunks, IntPtr arg);\n\n    public class OfflineSpeakerDiarization : IDisposable\n    {\n        public OfflineSpeakerDiarization(OfflineSpeakerDiarizationConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOfflineSpeakerDiarization(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public void SetConfig(OfflineSpeakerDiarizationConfig config)\n        {\n            SherpaOnnxOfflineSpeakerDiarizationSetConfig(_handle.Handle, ref config);\n        }\n\n        public OfflineSpeakerDiarizationSegment[] Process(float[] samples)\n        {\n            IntPtr result = SherpaOnnxOfflineSpeakerDiarizationProcess(_handle.Handle, samples, samples.Length);\n            return ProcessImpl(result);\n        }\n\n        public OfflineSpeakerDiarizationSegment[] ProcessWithCallback(float[] samples, OfflineSpeakerDiarizationProgressCallback callback, IntPtr arg)\n        {\n            IntPtr result = SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(_handle.Handle, samples, samples.Length, callback, arg);\n            return ProcessImpl(result);\n        }\n\n        private OfflineSpeakerDiarizationSegment[] ProcessImpl(IntPtr result)\n        {\n            if (result == IntPtr.Zero)\n            {\n              return new OfflineSpeakerDiarizationSegment[] {};\n            }\n\n            int numSegments = SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(result);\n            IntPtr p = SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(result);\n\n            OfflineSpeakerDiarizationSegment[] ans = new OfflineSpeakerDiarizationSegment[numSegments];\n            unsafe\n            {\n            
  int size = sizeof(float) * 2 + sizeof(int);\n              for (int i = 0; i != numSegments; ++i)\n              {\n                IntPtr t = new IntPtr((byte*)p + i * size);\n                ans[i] = new OfflineSpeakerDiarizationSegment(t);\n\n                // The following IntPtr.Add() does not support net20\n                // ans[i] = new OfflineSpeakerDiarizationSegment(IntPtr.Add(p, i));\n              }\n            }\n\n\n            SherpaOnnxOfflineSpeakerDiarizationDestroySegment(p);\n            SherpaOnnxOfflineSpeakerDiarizationDestroyResult(result);\n\n            return ans;\n\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OfflineSpeakerDiarization()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineSpeakerDiarization(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        public int SampleRate\n        {\n            get\n            {\n                return SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(_handle.Handle);\n            }\n        }\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOfflineSpeakerDiarization(ref OfflineSpeakerDiarizationConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineSpeakerDiarization(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(IntPtr handle);\n\n        
[DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOfflineSpeakerDiarizationProcess(IntPtr handle, float[] samples, int n);\n\n        [DllImport(Dll.Filename, CallingConvention = CallingConvention.Cdecl)]\n        private static extern IntPtr SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(IntPtr handle, float[] samples, int n, OfflineSpeakerDiarizationProgressCallback callback, IntPtr arg);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOfflineSpeakerDiarizationDestroyResult(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOfflineSpeakerDiarizationDestroySegment(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOfflineSpeakerDiarizationSetConfig(IntPtr handle, ref OfflineSpeakerDiarizationConfig config);\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeakerDiarizationConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeakerDiarizationConfig\n    {\n        public OfflineSpeakerDiarizationConfig()\n        {\n            Segmentation = new OfflineSpeakerSegmentationModelConfig();\n            Embedding = new SpeakerEmbeddingExtractorConfig();\n            Clustering = new FastClusteringConfig();\n\n            MinDurationOn = 0.3F;\n            MinDurationOff = 0.5F;\n        }\n\n        public OfflineSpeakerSegmentationModelConfig Segmentation;\n        public SpeakerEmbeddingExtractorConfig Embedding;\n        public FastClusteringConfig Clustering;\n\n        public float MinDurationOn;\n        public float MinDurationOff;\n    }\n}\n\n\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeakerDiarizationSegment.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n\n    public class OfflineSpeakerDiarizationSegment\n    {\n        public OfflineSpeakerDiarizationSegment(IntPtr handle)\n        {\n          Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n          Start = impl.Start;\n          End = impl.End;\n          Speaker = impl.Speaker;\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public float Start;\n            public float End;\n            public int Speaker;\n        }\n\n        public float Start;\n        public float End;\n        public int Speaker;\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeakerSegmentationModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeakerSegmentationModelConfig\n    {\n        public OfflineSpeakerSegmentationModelConfig()\n        {\n            Pyannote = new OfflineSpeakerSegmentationPyannoteModelConfig();\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        public OfflineSpeakerSegmentationPyannoteModelConfig Pyannote;\n\n        /// Number of threads used to run the neural network model\n        public int NumThreads;\n\n        /// Nonzero (e.g., 1) to print debug information about the model\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n    }\n}\n\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeakerSegmentationPyannoteModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeakerSegmentationPyannoteModelConfig\n    {\n        public OfflineSpeakerSegmentationPyannoteModelConfig()\n        {\n            Model = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeechDenoiser.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class OfflineSpeechDenoiser: IDisposable\n    {\n        public OfflineSpeechDenoiser(OfflineSpeechDenoiserConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOfflineSpeechDenoiser(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public DenoisedAudio Run(float[] samples, int sampleRate)\n        {\n            IntPtr p = SherpaOnnxOfflineSpeechDenoiserRun(_handle.Handle, samples, samples.Length, sampleRate);\n            return new DenoisedAudio(p);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OfflineSpeechDenoiser()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineSpeechDenoiser(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        public int SampleRate\n        {\n            get\n            {\n                return SherpaOnnxOfflineSpeechDenoiserGetSampleRate(_handle.Handle);\n            }\n        }\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOfflineSpeechDenoiser(ref OfflineSpeechDenoiserConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineSpeechDenoiser(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineSpeechDenoiserGetSampleRate(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOfflineSpeechDenoiserRun(IntPtr 
handle, float[] samples, int n, int sampleRate);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeechDenoiserConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeechDenoiserConfig\n    {\n        public OfflineSpeechDenoiserConfig()\n        {\n            Model = new OfflineSpeechDenoiserModelConfig();\n        }\n        public OfflineSpeechDenoiserModelConfig Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeechDenoiserDpdfNetModelConfig.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeechDenoiserDpdfNetModelConfig\n    {\n        public OfflineSpeechDenoiserDpdfNetModelConfig()\n        {\n            Model = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeechDenoiserGtcrnModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeechDenoiserGtcrnModelConfig\n    {\n        public OfflineSpeechDenoiserGtcrnModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineSpeechDenoiserModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineSpeechDenoiserModelConfig\n    {\n        public OfflineSpeechDenoiserModelConfig()\n        {\n            Gtcrn = new OfflineSpeechDenoiserGtcrnModelConfig();\n            Dpdfnet = new OfflineSpeechDenoiserDpdfNetModelConfig();\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        public OfflineSpeechDenoiserGtcrnModelConfig Gtcrn;\n\n        public int NumThreads;\n\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n\n        public OfflineSpeechDenoiserDpdfNetModelConfig Dpdfnet;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineStream.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\n\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    public class OfflineStream : IDisposable\n    {\n        public OfflineStream(IntPtr p)\n        {\n            _handle = new HandleRef(this, p);\n        }\n\n        public void AcceptWaveform(int sampleRate, float[] samples)\n        {\n            AcceptWaveform(Handle, sampleRate, samples, samples.Length);\n        }\n\n        public void SetOption(string key, string value)\n        {\n            SherpaOnnxOfflineStreamSetOption(Handle, key, value);\n        }\n\n        public string GetOption(string key)\n        {\n            IntPtr p = SherpaOnnxOfflineStreamGetOption(Handle, key);\n            return Marshal.PtrToStringAnsi(p) ?? \"\";\n        }\n\n        public bool HasOption(string key)\n        {\n            return SherpaOnnxOfflineStreamHasOption(Handle, key) == 1;\n        }\n\n        public OfflineRecognizerResult Result\n        {\n            get\n            {\n                IntPtr h = GetResult(_handle.Handle);\n                OfflineRecognizerResult result = new OfflineRecognizerResult(h);\n                DestroyResult(h);\n                return result;\n            }\n        }\n\n        ~OfflineStream()\n        {\n            Cleanup();\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineStream(Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n        public IntPtr Handle => _handle.Handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineStream(IntPtr 
handle);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxAcceptWaveformOffline\")]\n        private static extern void AcceptWaveform(IntPtr handle, int sampleRate, float[] samples, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOfflineStreamSetOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key, [MarshalAs(UnmanagedType.LPStr)] string value);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOfflineStreamGetOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineStreamHasOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxGetOfflineStreamResult\")]\n        private static extern IntPtr GetResult(IntPtr handle);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDestroyOfflineRecognizerResult\")]\n        private static extern void DestroyResult(IntPtr handle);\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTdnnModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTdnnModelConfig\n    {\n        public OfflineTdnnModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OfflineTransducerModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTransducerModelConfig\n    {\n        public OfflineTransducerModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n            Joiner = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Joiner;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OfflineTts.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    // IntPtr is actually a `const float*` from C++\n    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]\n    public delegate int OfflineTtsCallback(IntPtr samples, int n);\n\n    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]\n    public delegate int OfflineTtsCallbackProgress(IntPtr samples, int n, float progress);\n\n    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]\n    public delegate int OfflineTtsCallbackProgressWithArg(IntPtr samples, int n, float progress, IntPtr arg);\n\n\n    public class OfflineTts : IDisposable\n    {\n        public OfflineTts(OfflineTtsConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOfflineTts(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OfflineTtsGeneratedAudio Generate(String text, float speed, int speakerId)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);\n            byte[] utf8BytesWithNull = new byte[utf8Bytes.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0; // Null terminator\n            IntPtr p = SherpaOnnxOfflineTtsGenerate(_handle.Handle, utf8BytesWithNull, speakerId, speed);\n            return new OfflineTtsGeneratedAudio(p);\n        }\n\n        public OfflineTtsGeneratedAudio GenerateWithCallback(\n            String text,\n            float speed,\n            int speakerId,\n            OfflineTtsCallback callback)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);\n            byte[] utf8BytesWithNull = new byte[utf8Bytes.Length + 1];\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0;\n\n            GCHandle callbackHandle = default(GCHandle);\n          
  try\n            {\n                callbackHandle = GCHandle.Alloc(callback);\n\n                IntPtr p = SherpaOnnxOfflineTtsGenerateWithCallback(\n                    _handle.Handle,\n                    utf8BytesWithNull,\n                    speakerId,\n                    speed,\n                    callback\n                );\n\n                return new OfflineTtsGeneratedAudio(p);\n            }\n            finally\n            {\n                if (callbackHandle.IsAllocated)\n                    callbackHandle.Free();\n            }\n        }\n\n        public OfflineTtsGeneratedAudio GenerateWithCallbackProgress(\n            String text,\n            float speed,\n            int speakerId,\n            OfflineTtsCallbackProgress callback)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);\n            byte[] utf8BytesWithNull = new byte[utf8Bytes.Length + 1];\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0;\n\n            GCHandle callbackHandle = default(GCHandle);\n            try\n            {\n                callbackHandle = GCHandle.Alloc(callback);\n\n                IntPtr p = SherpaOnnxOfflineTtsGenerateWithProgressCallback(\n                    _handle.Handle,\n                    utf8BytesWithNull,\n                    speakerId,\n                    speed,\n                    callback\n                );\n\n                return new OfflineTtsGeneratedAudio(p);\n            }\n            finally\n            {\n                if (callbackHandle.IsAllocated)\n                    callbackHandle.Free();\n            }\n        }\n\n\n        public OfflineTtsGeneratedAudio GenerateWithConfig(\n            string text,\n            OfflineTtsGenerationConfig config,\n            OfflineTtsCallbackProgressWithArg callback)\n        {\n            byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);\n            byte[] utf8BytesWithNull 
= new byte[utf8Bytes.Length + 1];\n            Array.Copy(utf8Bytes, utf8BytesWithNull, utf8Bytes.Length);\n            utf8BytesWithNull[utf8Bytes.Length] = 0;\n\n            GCHandle callbackHandle = default(GCHandle);\n            GCHandle? audioHandle = null;\n\n            var nativeConfig = config.ToNative(out audioHandle);\n\n            try\n            {\n                callbackHandle = GCHandle.Alloc(callback);\n\n                IntPtr p = SherpaOnnxOfflineTtsGenerateWithConfig(\n                    _handle.Handle,\n                    utf8BytesWithNull,\n                    ref nativeConfig,\n                    callback,\n                    IntPtr.Zero\n                );\n\n                return new OfflineTtsGeneratedAudio(p);\n            }\n            finally\n            {\n                if (callbackHandle.IsAllocated)\n                    callbackHandle.Free();\n\n                if (audioHandle.HasValue)\n                    audioHandle.Value.Free();\n            }\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OfflineTts()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineTts(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        public int SampleRate\n        {\n            get\n            {\n                return SherpaOnnxOfflineTtsSampleRate(_handle.Handle);\n            }\n        }\n\n        public int NumSpeakers\n        {\n            get\n            {\n                return SherpaOnnxOfflineTtsNumSpeakers(_handle.Handle);\n            }\n        }\n\n        [DllImport(Dll.Filename)]\n        private static extern 
IntPtr SherpaOnnxCreateOfflineTts(ref OfflineTtsConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineTts(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineTtsSampleRate(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOfflineTtsNumSpeakers(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOfflineTtsGenerate(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Text, int sid, float speed);\n\n        [DllImport(Dll.Filename, CallingConvention = CallingConvention.Cdecl)]\n        private static extern IntPtr SherpaOnnxOfflineTtsGenerateWithCallback(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Text, int sid, float speed, OfflineTtsCallback callback);\n\n        [DllImport(Dll.Filename, CallingConvention = CallingConvention.Cdecl)]\n        private static extern IntPtr SherpaOnnxOfflineTtsGenerateWithProgressCallback(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Text, int sid, float speed, OfflineTtsCallbackProgress callback);\n\n        [DllImport(Dll.Filename, CallingConvention = CallingConvention.Cdecl)]\n        private static extern IntPtr SherpaOnnxOfflineTtsGenerateWithConfig(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Text, ref OfflineTtsGenerationConfig.NativeStruct config, OfflineTtsCallbackProgressWithArg callback, IntPtr arg);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsConfig\n    {\n        public OfflineTtsConfig()\n        {\n            Model = new OfflineTtsModelConfig();\n            RuleFsts = \"\";\n            MaxNumSentences = 1;\n            RuleFars = \"\";\n            SilenceScale = 0.2F;\n        }\n        public OfflineTtsModelConfig Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFsts;\n\n        public int MaxNumSentences;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFars;\n\n        public float SilenceScale;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsGeneratedAudio.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    public class OfflineTtsGeneratedAudio\n    {\n        public OfflineTtsGeneratedAudio(IntPtr p)\n        {\n            _handle = new HandleRef(this, p);\n        }\n\n        public bool SaveToWaveFile(String filename)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n            byte[] utf8Filename = Encoding.UTF8.GetBytes(filename);\n            byte[] utf8FilenameWithNull = new byte[utf8Filename.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Filename, utf8FilenameWithNull, utf8Filename.Length);\n            utf8FilenameWithNull[utf8Filename.Length] = 0; // Null terminator\n            int status = SherpaOnnxWriteWave(impl.Samples, impl.NumSamples, impl.SampleRate, utf8FilenameWithNull);\n            return status == 1;\n        }\n\n        ~OfflineTtsGeneratedAudio()\n        {\n            Cleanup();\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOfflineTtsGeneratedAudio(Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Samples;\n            public int NumSamples;\n            public int SampleRate;\n        }\n\n        private HandleRef _handle;\n        public IntPtr Handle => _handle.Handle;\n\n        public int NumSamples\n        {\n            get\n            {\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n                return impl.NumSamples;\n            
}\n        }\n\n        public int SampleRate\n        {\n            get\n            {\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n                return impl.SampleRate;\n            }\n        }\n\n        public float[] Samples\n        {\n            get\n            {\n                Impl impl = (Impl)Marshal.PtrToStructure(Handle, typeof(Impl));\n\n                float[] samples = new float[impl.NumSamples];\n                Marshal.Copy(impl.Samples, samples, 0, impl.NumSamples);\n                return samples;\n            }\n        }\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOfflineTtsGeneratedAudio(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxWriteWave(IntPtr samples, int n, int sample_rate, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Filename);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsGenerationConfig.cs",
    "content": "﻿/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\r\n\r\nusing System;\r\nusing System.Collections;\r\nusing System.Runtime.InteropServices;\r\nusing System.Text;\r\n\r\nnamespace SherpaOnnx\r\n{\r\n    public class OfflineTtsGenerationConfig\r\n    {\r\n        public OfflineTtsGenerationConfig()\r\n        {\r\n            SilenceScale = 0.2f;\r\n            Speed = 1.0f;\r\n            Sid = 0;\r\n            ReferenceAudio = null;\r\n            ReferenceSampleRate = 0;\r\n            ReferenceText = \"\";\r\n            NumSteps = 5;\r\n            Extra = new Hashtable();\r\n        }\r\n\r\n        public float SilenceScale;\r\n        public float Speed;\r\n        public int Sid;\r\n\r\n        public float[] ReferenceAudio;\r\n        public int ReferenceSampleRate;\r\n        public string ReferenceText;\r\n        public int NumSteps;\r\n\r\n        /// <summary>\r\n        /// Extra attributes stored as key/value pairs\r\n        /// </summary>\r\n        public Hashtable Extra;\r\n\r\n        /// <summary>\r\n        /// Convert to native struct for P/Invoke\r\n        /// </summary>\r\n        internal NativeStruct ToNative(out GCHandle? 
audioHandle)\r\n        {\r\n            NativeStruct native = new NativeStruct();\r\n            native.SilenceScale = SilenceScale;\r\n            native.Speed = Speed;\r\n            native.Sid = Sid;\r\n\r\n            // Handle ReferenceAudio\r\n            audioHandle = null;\r\n            if (ReferenceAudio != null && ReferenceAudio.Length > 0)\r\n            {\r\n                audioHandle = GCHandle.Alloc(ReferenceAudio, GCHandleType.Pinned);\r\n                native.ReferenceAudio = audioHandle.Value.AddrOfPinnedObject();\r\n                native.ReferenceAudioLen = ReferenceAudio.Length;\r\n            }\r\n            else\r\n            {\r\n                native.ReferenceAudio = IntPtr.Zero;\r\n                native.ReferenceAudioLen = 0;\r\n            }\r\n\r\n            native.ReferenceSampleRate = ReferenceSampleRate;\r\n            native.ReferenceText = ReferenceText ?? \"\";\r\n            native.NumSteps = NumSteps;\r\n\r\n            // Handle Extra JSON\r\n            native.Extra = \"{}\";\r\n            if (Extra != null && Extra.Count > 0)\r\n            {\r\n                StringBuilder json = new StringBuilder();\r\n                json.Append(\"{\");\r\n                bool first = true;\r\n\r\n                foreach (DictionaryEntry kv in Extra)\r\n                {\r\n                    if (!first) json.Append(\",\");\r\n                    first = false;\r\n\r\n                    string key = JsonEscape(kv.Key.ToString());\r\n                    string val;\r\n\r\n                    if (kv.Value is string)\r\n                        val = JsonEscape((string)kv.Value);\r\n                    else if (kv.Value is float || kv.Value is double)\r\n                        val = ((IFormattable)kv.Value).ToString(null, System.Globalization.CultureInfo.InvariantCulture);\r\n                    else if (kv.Value is bool)\r\n                        val = (bool)kv.Value ? 
\"true\" : \"false\";\r\n                    else\r\n                        val = kv.Value.ToString();\r\n\r\n                    json.AppendFormat(\"{0}:{1}\", key, val);\r\n                }\r\n\r\n                json.Append(\"}\");\r\n                native.Extra = json.ToString();\r\n            }\r\n            return native;\r\n        }\r\n\r\n        /// <summary>\r\n        /// Escapes a string for JSON (for .NET 2.0)\r\n        /// </summary>\r\n        private static string JsonEscape(string s)\r\n        {\r\n            if (s == null) return \"\\\"\\\"\";\r\n\r\n            StringBuilder sb = new StringBuilder();\r\n            sb.Append('\"');\r\n            foreach (char c in s)\r\n            {\r\n                switch (c)\r\n                {\r\n                    case '\"': sb.Append(\"\\\\\\\"\"); break;\r\n                    case '\\\\': sb.Append(\"\\\\\\\\\"); break;\r\n                    case '\\b': sb.Append(\"\\\\b\"); break;\r\n                    case '\\f': sb.Append(\"\\\\f\"); break;\r\n                    case '\\n': sb.Append(\"\\\\n\"); break;\r\n                    case '\\r': sb.Append(\"\\\\r\"); break;\r\n                    case '\\t': sb.Append(\"\\\\t\"); break;\r\n                    default:\r\n                        if (c < 32 || c > 126)\r\n                            sb.AppendFormat(\"\\\\u{0:X4}\", (int)c);\r\n                        else\r\n                            sb.Append(c);\r\n                        break;\r\n                }\r\n            }\r\n            sb.Append('\"');\r\n            return sb.ToString();\r\n        }\r\n\r\n        [StructLayout(LayoutKind.Sequential)]\r\n        internal struct NativeStruct\r\n        {\r\n            public float SilenceScale;\r\n            public float Speed;\r\n            public int Sid;\r\n\r\n            public IntPtr ReferenceAudio;\r\n            public int ReferenceAudioLen;\r\n            public int ReferenceSampleRate;\r\n\r\n            
[MarshalAs(UnmanagedType.LPStr)]\r\n            public string ReferenceText;\r\n\r\n            public int NumSteps;\r\n\r\n            [MarshalAs(UnmanagedType.LPStr)]\r\n            public string Extra;\r\n        }\r\n    }\r\n}\r\n\r\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsKittenModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsKittenModelConfig\n    {\n        public OfflineTtsKittenModelConfig()\n        {\n            Model = \"\";\n            Voices = \"\";\n            Tokens = \"\";\n            DataDir = \"\";\n\n            LengthScale = 1.0F;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Voices;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DataDir;\n\n        public float LengthScale;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsKokoroModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsKokoroModelConfig\n    {\n        public OfflineTtsKokoroModelConfig()\n        {\n            Model = \"\";\n            Voices = \"\";\n            Tokens = \"\";\n            DataDir = \"\";\n\n            LengthScale = 1.0F;\n\n            DictDir = \"\";\n            Lexicon = \"\";\n            Lang = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Voices;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DataDir;\n\n        public float LengthScale;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DictDir;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lexicon;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lang;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsMatchaModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsMatchaModelConfig\n    {\n        public OfflineTtsMatchaModelConfig()\n        {\n            AcousticModel = \"\";\n            Vocoder = \"\";\n            Lexicon = \"\";\n            Tokens = \"\";\n            DataDir = \"\";\n\n            NoiseScale = 0.667F;\n            LengthScale = 1.0F;\n\n            DictDir = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string AcousticModel;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Vocoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lexicon;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DataDir;\n\n        public float NoiseScale;\n        public float LengthScale;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DictDir;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsModelConfig\n    {\n        public OfflineTtsModelConfig()\n        {\n            Vits = new OfflineTtsVitsModelConfig();\n            Matcha = new OfflineTtsMatchaModelConfig();\n            Kokoro = new OfflineTtsKokoroModelConfig();\n            Kitten = new OfflineTtsKittenModelConfig();\n            ZipVoice = new OfflineTtsZipVoiceModelConfig();\n            Pocket = new OfflineTtsPocketModelConfig();\n            Supertonic = new OfflineTtsSupertonicModelConfig();\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        public OfflineTtsVitsModelConfig Vits;\n        public int NumThreads;\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n\n        public OfflineTtsMatchaModelConfig Matcha;\n        public OfflineTtsKokoroModelConfig Kokoro;\n        public OfflineTtsKittenModelConfig Kitten;\n        public OfflineTtsZipVoiceModelConfig ZipVoice;\n        public OfflineTtsPocketModelConfig Pocket;\n        public OfflineTtsSupertonicModelConfig Supertonic;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsPocketModelConfig.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsPocketModelConfig\n    {\n        // Default constructor for convenience\n        public OfflineTtsPocketModelConfig()\n        {\n            LmFlow = \"\";\n            LmMain = \"\";\n            Encoder = \"\";\n            Decoder = \"\";\n            TextConditioner = \"\";\n            VocabJson = \"\";\n            TokenScoresJson = \"\";\n            VoiceEmbeddingCacheCapacity = 50;\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string LmFlow;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string LmMain;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TextConditioner;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string VocabJson;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TokenScoresJson;\n\n        public int VoiceEmbeddingCacheCapacity;\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsSupertonicModelConfig.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsSupertonicModelConfig\n    {\n        public OfflineTtsSupertonicModelConfig()\n        {\n            DurationPredictor = \"\";\n            TextEncoder = \"\";\n            VectorEstimator = \"\";\n            Vocoder = \"\";\n            TtsJson = \"\";\n            UnicodeIndexer = \"\";\n            VoiceStyle = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DurationPredictor;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TextEncoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string VectorEstimator;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Vocoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TtsJson;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string UnicodeIndexer;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string VoiceStyle;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineTtsVitsModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsVitsModelConfig\n    {\n        public OfflineTtsVitsModelConfig()\n        {\n            Model = \"\";\n            Lexicon = \"\";\n            Tokens = \"\";\n            DataDir = \"\";\n\n            NoiseScale = 0.667F;\n            NoiseScaleW = 0.8F;\n            LengthScale = 1.0F;\n\n            DictDir = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lexicon;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DataDir;\n\n        public float NoiseScale;\n        public float NoiseScaleW;\n        public float LengthScale;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DictDir;\n    }\n}"
  },
  {
    "path": "scripts/dotnet/OfflineTtsZipVoiceModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineTtsZipVoiceModelConfig\n    {\n        public OfflineTtsZipVoiceModelConfig()\n        {\n            Tokens = \"\";\n            Encoder = \"\";\n            Decoder = \"\";\n            Vocoder = \"\";\n            DataDir = \"\";\n            Lexicon = \"\";\n\n            FeatScale = 0.1F;\n            Tshift = 0.5F;\n            TargetRms = 0.1F;\n            GuidanceScale = 1.0F;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Vocoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DataDir;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Lexicon;\n\n        public float FeatScale;\n        public float Tshift;\n        public float TargetRms;\n        public float GuidanceScale;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineWenetCtcModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineWenetCtcModelConfig\n    {\n        public OfflineWenetCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineWhisperModelConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineWhisperModelConfig\n    {\n        public OfflineWhisperModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n            Language = \"\";\n            Task = \"transcribe\";\n            TailPaddings = -1;\n            EnableTokenTimestamps = 0;\n            EnableSegmentTimestamps = 0;\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Language;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Task;\n\n        public int TailPaddings;\n        public int EnableTokenTimestamps;\n        public int EnableSegmentTimestamps;\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineZipformerAudioTaggingModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineZipformerAudioTaggingModelConfig\n    {\n        public OfflineZipformerAudioTaggingModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OfflineZipformerCtcModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OfflineZipformerCtcModelConfig\n    {\n        public OfflineZipformerCtcModelConfig()\n        {\n            Model = \"\";\n        }\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineCtcFstDecoderConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineCtcFstDecoderConfig\n    {\n        public OnlineCtcFstDecoderConfig()\n        {\n            Graph = \"\";\n            MaxActive = 3000;\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Graph;\n\n        public int MaxActive;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OnlineModelConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineModelConfig\n    {\n        public OnlineModelConfig()\n        {\n            Transducer = new OnlineTransducerModelConfig();\n            Paraformer = new OnlineParaformerModelConfig();\n            Zipformer2Ctc = new OnlineZipformer2CtcModelConfig();\n            Tokens = \"\";\n            NumThreads = 1;\n            Provider = \"cpu\";\n            Debug = 0;\n            ModelType = \"\";\n            ModelingUnit = \"cjkchar\";\n            BpeVocab = \"\";\n            TokensBuf = \"\";\n            TokensBufSize = 0;\n            NemoCtc = new OnlineNemoCtcModelConfig();\n            ToneCtc = new OnlineToneCtcModelConfig();\n        }\n\n        public OnlineTransducerModelConfig Transducer;\n        public OnlineParaformerModelConfig Paraformer;\n        public OnlineZipformer2CtcModelConfig Zipformer2Ctc;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Tokens;\n\n        /// Number of threads used to run the neural network model\n        public int NumThreads;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n\n        /// true to print debug information of the model\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string ModelType;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string ModelingUnit;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string BpeVocab;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string TokensBuf;\n\n        public int TokensBufSize;\n\n        public OnlineNemoCtcModelConfig NemoCtc;\n\n        public OnlineToneCtcModelConfig ToneCtc;\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineNemoCtcModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineNemoCtcModelConfig\n    {\n        public OnlineNemoCtcModelConfig()\n        {\n            Model = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineParaformerModelConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineParaformerModelConfig\n    {\n        public OnlineParaformerModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OnlineRecognizer.cs",
    "content": "﻿/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Collections.Generic;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    // please see\n    // https://www.mono-project.com/docs/advanced/pinvoke/#gc-safe-pinvoke-code\n    // https://www.mono-project.com/docs/advanced/pinvoke/#properly-disposing-of-resources\n    public class OnlineRecognizer : IDisposable\n    {\n        public OnlineRecognizer(OnlineRecognizerConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOnlineRecognizer(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OnlineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxCreateOnlineStream(_handle.Handle);\n            return new OnlineStream(p);\n        }\n\n        /// Return true if the passed stream is ready for decoding.\n        public bool IsReady(OnlineStream stream)\n        {\n            return IsReady(_handle.Handle, stream.Handle) != 0;\n        }\n\n        /// Return true if an endpoint is detected for this stream.\n        /// You probably need to invoke Reset(stream) when this method returns\n        /// true.\n        public bool IsEndpoint(OnlineStream stream)\n        {\n            return SherpaOnnxOnlineStreamIsEndpoint(_handle.Handle, stream.Handle) != 0;\n        }\n\n        /// You have to ensure that IsReady(stream) returns true before\n        /// you call this method\n        public void Decode(OnlineStream stream)\n        {\n            Decode(_handle.Handle, stream.Handle);\n        }\n\n        // The caller should ensure all passed streams are ready for decoding.\n        public void Decode(IEnumerable<OnlineStream> streams)\n        {\n            // TargetFramework=net20 does not support System.Linq\n            // IntPtr[] ptrs = streams.Select(s => s.Handle).ToArray();\n            List<IntPtr> 
list = new List<IntPtr>();\n            foreach (OnlineStream s in streams)\n            {\n                list.Add(s.Handle);\n            }\n\n            IntPtr[] ptrs = list.ToArray();\n            Decode(_handle.Handle, ptrs, ptrs.Length);\n        }\n\n        public OnlineRecognizerResult GetResult(OnlineStream stream)\n        {\n            IntPtr h = GetResult(_handle.Handle, stream.Handle);\n            OnlineRecognizerResult result = new OnlineRecognizerResult(h);\n            DestroyResult(h);\n            return result;\n        }\n\n        /// When this method returns, IsEndpoint(stream) will return false.\n        public void Reset(OnlineStream stream)\n        {\n            SherpaOnnxOnlineStreamReset(_handle.Handle, stream.Handle);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OnlineRecognizer()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOnlineRecognizer(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOnlineRecognizer(ref OnlineRecognizerConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOnlineRecognizer(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOnlineStream(IntPtr handle);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxIsOnlineStreamReady\")]\n        private static extern int IsReady(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeOnlineStream\")]\n        private static 
extern void Decode(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDecodeMultipleOnlineStreams\")]\n        private static extern void Decode(IntPtr handle, IntPtr[] streams, int n);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxGetOnlineStreamResult\")]\n        private static extern IntPtr GetResult(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename, EntryPoint = \"SherpaOnnxDestroyOnlineRecognizerResult\")]\n        private static extern void DestroyResult(IntPtr result);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOnlineStreamReset(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOnlineStreamIsEndpoint(IntPtr handle, IntPtr stream);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineRecognizerConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineRecognizerConfig\n    {\n        public OnlineRecognizerConfig()\n        {\n            FeatConfig = new FeatureConfig();\n            ModelConfig = new OnlineModelConfig();\n            DecodingMethod = \"greedy_search\";\n            MaxActivePaths = 4;\n            EnableEndpoint = 0;\n            Rule1MinTrailingSilence = 1.2F;\n            Rule2MinTrailingSilence = 2.4F;\n            Rule3MinUtteranceLength = 20.0F;\n            HotwordsFile = \"\";\n            HotwordsScore = 1.5F;\n            CtcFstDecoderConfig = new OnlineCtcFstDecoderConfig();\n            RuleFsts = \"\";\n            RuleFars = \"\";\n            BlankPenalty = 0.0F;\n            HotwordsBuf = \"\";\n            HotwordsBufSize = 0;\n            Hr = new HomophoneReplacerConfig();\n        }\n        public FeatureConfig FeatConfig;\n        public OnlineModelConfig ModelConfig;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string DecodingMethod;\n\n        /// Used only when decoding_method is modified_beam_search\n        /// Example value: 4\n        public int MaxActivePaths;\n\n        /// 0 to disable endpoint detection.\n        /// A non-zero value to enable endpoint detection.\n        public int EnableEndpoint;\n\n        /// An endpoint is detected if trailing silence in seconds is larger than\n        /// this value even if nothing has been decoded.\n        /// Used only when enable_endpoint is not 0.\n        public float Rule1MinTrailingSilence;\n\n        /// An endpoint is detected if trailing silence in seconds is larger than\n        /// this value after something that is not blank has been decoded.\n        /// Used only when enable_endpoint is not 0.\n        
public float Rule2MinTrailingSilence;\n\n        /// An endpoint is detected if the utterance in seconds is larger than\n        /// this value.\n        /// Used only when enable_endpoint is not 0.\n        public float Rule3MinUtteranceLength;\n\n        /// Path to the hotwords.\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string HotwordsFile;\n\n        /// Bonus score for each token in hotwords.\n        public float HotwordsScore;\n\n        public OnlineCtcFstDecoderConfig CtcFstDecoderConfig;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFsts;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string RuleFars;\n\n        public float BlankPenalty;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string HotwordsBuf;\n\n        public int HotwordsBufSize;\n\n        public HomophoneReplacerConfig Hr;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineRecognizerResult.cs",
    "content": "﻿/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n\n    public class OnlineRecognizerResult\n    {\n        public OnlineRecognizerResult(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n            // PtrToStringUTF8() requires .net standard 2.1\n            // _text = Marshal.PtrToStringUTF8(impl.Text);\n\n            int length = 0;\n\n            unsafe\n            {\n                byte* buffer = (byte*)impl.Text;\n                while (*buffer != 0)\n                {\n                    ++buffer;\n                    length += 1;\n                }\n            }\n\n            byte[] stringBuffer = new byte[length];\n            Marshal.Copy(impl.Text, stringBuffer, 0, length);\n            _text = Encoding.UTF8.GetString(stringBuffer);\n\n            _tokens = new String[impl.Count];\n\n            unsafe\n            {\n                byte* buf = (byte*)impl.Tokens;\n                for (int i = 0; i < impl.Count; i++)\n                {\n                    length = 0;\n                    byte* start = buf;\n                    while (*buf != 0)\n                    {\n                        ++buf;\n                        length += 1;\n                    }\n                    ++buf;\n\n                    stringBuffer = new byte[length];\n                    fixed (byte* pTarget = stringBuffer)\n                    {\n                        for (int k = 0; k < length; k++)\n                        {\n                            pTarget[k] = start[k];\n                        }\n                    }\n\n                    _tokens[i] = Encoding.UTF8.GetString(stringBuffer);\n                }\n            }\n\n            unsafe\n            {\n                float* t = 
(float*)impl.Timestamps;\n                if (t != null)\n                {\n                    _timestamps = new float[impl.Count];\n                    fixed (float* pTarget = _timestamps)\n                    {\n                        for (int i = 0; i < impl.Count; i++)\n                        {\n                            pTarget[i] = t[i];\n                        }\n                    }\n                }\n                else\n                {\n                    _timestamps = new float[] {};\n                }\n            }\n        }\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Text;\n            public IntPtr Tokens;\n            public IntPtr TokensArr;\n            public IntPtr Timestamps;\n            public int Count;\n        }\n\n        private String _text;\n        public String Text => _text;\n\n        private String[] _tokens;\n        public String[] Tokens => _tokens;\n\n        private float[] _timestamps;\n        public float[] Timestamps => _timestamps;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineSpeechDenoiser.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class OnlineSpeechDenoiser: IDisposable\n    {\n        public OnlineSpeechDenoiser(OnlineSpeechDenoiserConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateOnlineSpeechDenoiser(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public DenoisedAudio Run(float[] samples, int sampleRate)\n        {\n            IntPtr p = SherpaOnnxOnlineSpeechDenoiserRun(_handle.Handle, samples, samples.Length, sampleRate);\n            return new DenoisedAudio(p);\n        }\n\n        public DenoisedAudio Flush()\n        {\n            IntPtr p = SherpaOnnxOnlineSpeechDenoiserFlush(_handle.Handle);\n            return new DenoisedAudio(p);\n        }\n\n        public void Reset()\n        {\n            SherpaOnnxOnlineSpeechDenoiserReset(_handle.Handle);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~OnlineSpeechDenoiser()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOnlineSpeechDenoiser(_handle.Handle);\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        public int SampleRate => SherpaOnnxOnlineSpeechDenoiserGetSampleRate(_handle.Handle);\n\n        public int FrameShiftInSamples =>\n            SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(_handle.Handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateOnlineSpeechDenoiser(ref OnlineSpeechDenoiserConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOnlineSpeechDenoiser(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int 
SherpaOnnxOnlineSpeechDenoiserGetSampleRate(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOnlineSpeechDenoiserRun(IntPtr handle, float[] samples, int n, int sampleRate);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOnlineSpeechDenoiserFlush(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOnlineSpeechDenoiserReset(IntPtr handle);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineSpeechDenoiserConfig.cs",
    "content": "/// Copyright (c)  2026  Xiaomi Corporation (authors: Fangjun Kuang)\n\nnamespace SherpaOnnx\n{\n    public struct OnlineSpeechDenoiserConfig\n    {\n        public OnlineSpeechDenoiserConfig()\n        {\n            Model = new OfflineSpeechDenoiserModelConfig();\n        }\n\n        public OfflineSpeechDenoiserModelConfig Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineStream.cs",
    "content": "﻿/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    public class OnlineStream : IDisposable\n    {\n        public OnlineStream(IntPtr p)\n        {\n            _handle = new HandleRef(this, p);\n        }\n\n        public void AcceptWaveform(int sampleRate, float[] samples)\n        {\n            SherpaOnnxOnlineStreamAcceptWaveform(Handle, sampleRate, samples, samples.Length);\n        }\n\n        public void InputFinished()\n        {\n            SherpaOnnxOnlineStreamInputFinished(Handle);\n        }\n\n        public void SetOption(string key, string value)\n        {\n            SherpaOnnxOnlineStreamSetOption(Handle, key, value);\n        }\n\n        public string GetOption(string key)\n        {\n            IntPtr p = SherpaOnnxOnlineStreamGetOption(Handle, key);\n            return Marshal.PtrToStringAnsi(p) ?? 
\"\";\n        }\n\n        public bool HasOption(string key)\n        {\n            return SherpaOnnxOnlineStreamHasOption(Handle, key) == 1;\n        }\n\n        ~OnlineStream()\n        {\n            Cleanup();\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyOnlineStream(Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n        public IntPtr Handle => _handle.Handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyOnlineStream(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOnlineStreamAcceptWaveform(IntPtr handle, int sampleRate, float[] samples, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOnlineStreamInputFinished(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxOnlineStreamSetOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key, [MarshalAs(UnmanagedType.LPStr)] string value);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxOnlineStreamGetOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxOnlineStreamHasOption(IntPtr handle, [MarshalAs(UnmanagedType.LPStr)] string key);\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineToneCtcModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineToneCtcModelConfig\n    {\n        public OnlineToneCtcModelConfig()\n        {\n            Model = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/OnlineTransducerModelConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineTransducerModelConfig\n    {\n        public OnlineTransducerModelConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n            Joiner = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Joiner;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/OnlineZipformer2CtcModelConfig.cs",
    "content": "/// Copyright (c)  2023  Xiaomi Corporation (authors: Fangjun Kuang)\n/// Copyright (c)  2023 by manyeyes\n/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct OnlineZipformer2CtcModelConfig\n    {\n        public OnlineZipformer2CtcModelConfig()\n        {\n            Model = \"\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n    }\n}"
  },
  {
    "path": "scripts/dotnet/README.md",
    "content": "# Introduction\n\n[sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) is an open-source\nreal-time speech recognition toolkit developed\nby the Next-gen Kaldi team.\n\nIt supports streaming recognition on a variety of\nplatforms such as Android, iOS, Raspberry Pi, Linux, Windows, and macOS.\n\nIt does not require an Internet connection during recognition.\n\nSee the documentation at https://k2-fsa.github.io/sherpa/onnx/index.html\nfor details.\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/tree/dot-net/dotnet-examples\nfor how to use the C# APIs of this package.\n"
  },
  {
    "path": "scripts/dotnet/SileroVadModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct SileroVadModelConfig\n    {\n        public SileroVadModelConfig()\n        {\n            Model = \"\";\n            Threshold = 0.5F;\n            MinSilenceDuration = 0.5F;\n            MinSpeechDuration = 0.25F;\n            WindowSize = 512;\n            MaxSpeechDuration = 5.0F;\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        public float Threshold;\n\n        public float MinSilenceDuration;\n\n        public float MinSpeechDuration;\n\n        public int WindowSize;\n\n        public float MaxSpeechDuration;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/SpeakerEmbeddingExtractor.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class SpeakerEmbeddingExtractor : IDisposable\n    {\n        public SpeakerEmbeddingExtractor(SpeakerEmbeddingExtractorConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateSpeakerEmbeddingExtractor(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OnlineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxSpeakerEmbeddingExtractorCreateStream(_handle.Handle);\n            return new OnlineStream(p);\n        }\n\n        public bool IsReady(OnlineStream stream)\n        {\n            return SherpaOnnxSpeakerEmbeddingExtractorIsReady(_handle.Handle, stream.Handle) != 0;\n        }\n\n        public float[] Compute(OnlineStream stream)\n        {\n            IntPtr p = SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(_handle.Handle, stream.Handle);\n\n            int dim = Dim;\n            float[] ans = new float[dim];\n            Marshal.Copy(p, ans, 0, dim);\n\n            SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(p);\n\n            return ans;\n        }\n\n        public int Dim\n        {\n            get\n            {\n                return SherpaOnnxSpeakerEmbeddingExtractorDim(_handle.Handle);\n            }\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~SpeakerEmbeddingExtractor()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroySpeakerEmbeddingExtractor(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        
[DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateSpeakerEmbeddingExtractor(ref SpeakerEmbeddingExtractorConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroySpeakerEmbeddingExtractor(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingExtractorDim(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpeakerEmbeddingExtractorCreateStream(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingExtractorIsReady(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(IntPtr p);\n    }\n\n}\n"
  },
  {
    "path": "scripts/dotnet/SpeakerEmbeddingExtractorConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct SpeakerEmbeddingExtractorConfig\n    {\n        public SpeakerEmbeddingExtractorConfig()\n        {\n            Model = \"\";\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        public int NumThreads;\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/SpeakerEmbeddingManager.cs",
    "content": "﻿/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Collections.Generic;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    public class SpeakerEmbeddingManager : IDisposable\n    {\n        public SpeakerEmbeddingManager(int dim)\n        {\n            IntPtr h = SherpaOnnxCreateSpeakerEmbeddingManager(dim);\n            _handle = new HandleRef(this, h);\n            this._dim = dim;\n        }\n\n        public bool Add(string name, float[] v)\n        {\n            byte[] utf8Name = Encoding.UTF8.GetBytes(name);\n            byte[] utf8NameWithNull = new byte[utf8Name.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Name, utf8NameWithNull, utf8Name.Length);\n            utf8NameWithNull[utf8Name.Length] = 0; // Null terminator\n            return SherpaOnnxSpeakerEmbeddingManagerAdd(_handle.Handle, utf8NameWithNull, v) == 1;\n        }\n\n        public bool Add(string name, ICollection<float[]> v_list)\n        {\n            int n = v_list.Count;\n            float[] v = new float[n * _dim];\n            int i = 0;\n            foreach (var item in v_list)\n            {\n                item.CopyTo(v, i);\n                i += _dim;\n            }\n\n            byte[] utf8Name = Encoding.UTF8.GetBytes(name);\n            byte[] utf8NameWithNull = new byte[utf8Name.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Name, utf8NameWithNull, utf8Name.Length);\n            utf8NameWithNull[utf8Name.Length] = 0; // Null terminator\n            return SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(_handle.Handle, utf8NameWithNull, v, n) == 1;\n        }\n\n        public bool Remove(string name)\n        {\n            byte[] utf8Name = Encoding.UTF8.GetBytes(name);\n            byte[] utf8NameWithNull = new byte[utf8Name.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Name, utf8NameWithNull, utf8Name.Length);\n            
utf8NameWithNull[utf8Name.Length] = 0; // Null terminator\n            return SherpaOnnxSpeakerEmbeddingManagerRemove(_handle.Handle, utf8NameWithNull) == 1;\n        }\n\n        public string Search(float[] v, float threshold)\n        {\n            IntPtr p = SherpaOnnxSpeakerEmbeddingManagerSearch(_handle.Handle, v, threshold);\n\n            string s = \"\";\n            int length = 0;\n\n            unsafe\n            {\n                byte* b = (byte*)p;\n                if (b != null)\n                {\n                    while (*b != 0)\n                    {\n                        ++b;\n                        length += 1;\n                    }\n                }\n            }\n\n            if (length > 0)\n            {\n                byte[] stringBuffer = new byte[length];\n                Marshal.Copy(p, stringBuffer, 0, length);\n                s = Encoding.UTF8.GetString(stringBuffer);\n            }\n\n            SherpaOnnxSpeakerEmbeddingManagerFreeSearch(p);\n\n            return s;\n        }\n\n        public bool Verify(string name, float[] v, float threshold)\n        {\n            byte[] utf8Name = Encoding.UTF8.GetBytes(name);\n            byte[] utf8NameWithNull = new byte[utf8Name.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Name, utf8NameWithNull, utf8Name.Length);\n            utf8NameWithNull[utf8Name.Length] = 0; // Null terminator\n            return SherpaOnnxSpeakerEmbeddingManagerVerify(_handle.Handle, utf8NameWithNull, v, threshold) == 1;\n        }\n\n        public bool Contains(string name)\n        {\n            byte[] utf8Name = Encoding.UTF8.GetBytes(name);\n            byte[] utf8NameWithNull = new byte[utf8Name.Length + 1]; // +1 for null terminator\n            Array.Copy(utf8Name, utf8NameWithNull, utf8Name.Length);\n            utf8NameWithNull[utf8Name.Length] = 0; // Null terminator\n            return SherpaOnnxSpeakerEmbeddingManagerContains(_handle.Handle, utf8NameWithNull) == 
1;\n        }\n\n        public string[] GetAllSpeakers()\n        {\n            if (NumSpeakers == 0)\n            {\n                return new string[] { };\n            }\n\n            IntPtr names = SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(_handle.Handle);\n\n            string[] ans = new string[NumSpeakers];\n\n            unsafe\n            {\n                byte** p = (byte**)names;\n                for (int i = 0; i != NumSpeakers; i++)\n                {\n                    int length = 0;\n                    byte* s = p[i];\n                    while (*s != 0)\n                    {\n                        ++s;\n                        length += 1;\n                    }\n                    byte[] stringBuffer = new byte[length];\n                    Marshal.Copy((IntPtr)p[i], stringBuffer, 0, length);\n                    ans[i] = Encoding.UTF8.GetString(stringBuffer);\n                }\n            }\n\n            SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(names);\n\n            return ans;\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~SpeakerEmbeddingManager()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroySpeakerEmbeddingManager(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        public int NumSpeakers\n        {\n            get\n            {\n                return SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(_handle.Handle);\n            }\n        }\n\n        private HandleRef _handle;\n        private int _dim;\n\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateSpeakerEmbeddingManager(int dim);\n\n        
[DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroySpeakerEmbeddingManager(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerAdd(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Name, float[] v);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Name, float[] v, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerRemove(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Name);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpeakerEmbeddingManagerSearch(IntPtr handle, float[] v, float threshold);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxSpeakerEmbeddingManagerFreeSearch(IntPtr p);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerVerify(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Name, float[] v, float threshold);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerContains(IntPtr handle, [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1)] byte[] utf8Name);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(IntPtr names);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/SpeechSegment.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class SpeechSegment\n    {\n        public SpeechSegment(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n            _start = impl.Start;\n\n            unsafe\n            {\n                float* t = (float*)impl.Samples;\n                _samples = new float[impl.Count];\n                fixed (float* pTarget = _samples)\n                {\n                    for (int i = 0; i < impl.Count; i++)\n                    {\n                        pTarget[i] = t[i];\n                    }\n                }\n            }\n        }\n\n        public int _start;\n        public int Start => _start;\n\n        private float[] _samples;\n        public float[] Samples => _samples;\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public int Start;\n            public IntPtr Samples;\n            public int Count;\n        }\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/SpokenLanguageIdentification.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class SpokenLanguageIdentification : IDisposable\n    {\n        public SpokenLanguageIdentification(SpokenLanguageIdentificationConfig config)\n        {\n            IntPtr h = SherpaOnnxCreateSpokenLanguageIdentification(ref config);\n            _handle = new HandleRef(this, h);\n        }\n\n        public OfflineStream CreateStream()\n        {\n            IntPtr p = SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(_handle.Handle);\n            return new OfflineStream(p);\n        }\n\n        public SpokenLanguageIdentificationResult Compute(OfflineStream stream)\n        {\n            IntPtr h = SherpaOnnxSpokenLanguageIdentificationCompute(_handle.Handle, stream.Handle);\n            SpokenLanguageIdentificationResult result = new SpokenLanguageIdentificationResult(h);\n            SherpaOnnxDestroySpokenLanguageIdentificationResult(h);\n            return result;\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~SpokenLanguageIdentification()\n        {\n            Cleanup();\n        }\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroySpokenLanguageIdentification(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateSpokenLanguageIdentification(ref SpokenLanguageIdentificationConfig config);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroySpokenLanguageIdentification(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxSpokenLanguageIdentificationCompute(IntPtr handle, IntPtr stream);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroySpokenLanguageIdentificationResult(IntPtr handle);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/SpokenLanguageIdentificationConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public struct SpokenLanguageIdentificationConfig\n    {\n        public SpokenLanguageIdentificationConfig()\n        {\n            Whisper = new SpokenLanguageIdentificationWhisperConfig();\n            NumThreads = 1;\n            Debug = 0;\n            Provider = \"cpu\";\n        }\n        public SpokenLanguageIdentificationWhisperConfig Whisper;\n\n        public int NumThreads;\n        public int Debug;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/SpokenLanguageIdentificationResult.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\nnamespace SherpaOnnx\n{\n    public class SpokenLanguageIdentificationResult\n    {\n        public SpokenLanguageIdentificationResult(IntPtr handle)\n        {\n            Impl impl = (Impl)Marshal.PtrToStructure(handle, typeof(Impl));\n\n            // PtrToStringUTF8() requires .NET Standard 2.1\n            // _lang = Marshal.PtrToStringUTF8(impl.Lang);\n\n            // Count the bytes of the null-terminated UTF-8 string by hand instead.\n            int length = 0;\n\n            unsafe\n            {\n                byte* buffer = (byte*)impl.Lang;\n                while (*buffer != 0)\n                {\n                    ++buffer;\n                    length += 1;\n                }\n            }\n\n            byte[] stringBuffer = new byte[length];\n            Marshal.Copy(impl.Lang, stringBuffer, 0, length);\n            _lang = Encoding.UTF8.GetString(stringBuffer);\n        }\n\n        [StructLayout(LayoutKind.Sequential)]\n        struct Impl\n        {\n            public IntPtr Lang;\n        }\n\n        private String _lang;\n        public String Lang => _lang;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/SpokenLanguageIdentificationWhisperConfig.cs",
    "content": "/// Copyright (c)  2024.5 by 东风破\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct SpokenLanguageIdentificationWhisperConfig\n    {\n        public SpokenLanguageIdentificationWhisperConfig()\n        {\n            Encoder = \"\";\n            Decoder = \"\";\n            TailPaddings = -1;\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Encoder;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Decoder;\n\n        public int TailPaddings;\n    }\n\n}"
  },
  {
    "path": "scripts/dotnet/TenVadModelConfig.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct TenVadModelConfig\n    {\n        public TenVadModelConfig()\n        {\n            Model = \"\";\n            Threshold = 0.5F;\n            MinSilenceDuration = 0.5F;\n            MinSpeechDuration = 0.25F;\n            WindowSize = 256;\n            MaxSpeechDuration = 5.0F;\n        }\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Model;\n\n        public float Threshold;\n\n        public float MinSilenceDuration;\n\n        public float MinSpeechDuration;\n\n        public int WindowSize;\n\n        public float MaxSpeechDuration;\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/VadModelConfig.cs",
    "content": "/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\n\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    [StructLayout(LayoutKind.Sequential)]\n    public struct VadModelConfig\n    {\n        public VadModelConfig()\n        {\n            SileroVad = new SileroVadModelConfig();\n            SampleRate = 16000;\n            NumThreads = 1;\n            Provider = \"cpu\";\n            Debug = 0;\n            TenVad = new TenVadModelConfig();\n        }\n\n        public SileroVadModelConfig SileroVad;\n\n        public int SampleRate;\n\n        public int NumThreads;\n\n        [MarshalAs(UnmanagedType.LPStr)]\n        public string Provider;\n\n        public int Debug;\n\n        public TenVadModelConfig TenVad;\n    }\n}\n\n"
  },
  {
    "path": "scripts/dotnet/VersionInfo.cs",
    "content": "/// Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\nusing System.Text;\n\n\nnamespace SherpaOnnx\n{\n    public class VersionInfo\n    {\n        public static String Version\n        {\n          get\n          {\n            IntPtr p = SherpaOnnxGetVersionStr();\n\n            string s = \"\";\n            int length = 0;\n\n            unsafe\n            {\n                byte* b = (byte*)p;\n                if (b != null)\n                {\n                    while (*b != 0)\n                    {\n                        ++b;\n                        length += 1;\n                    }\n                }\n            }\n\n            if (length > 0)\n            {\n                byte[] stringBuffer = new byte[length];\n                Marshal.Copy(p, stringBuffer, 0, length);\n                s = Encoding.UTF8.GetString(stringBuffer);\n            }\n\n            return s;\n          }\n        }\n\n        public static String GitSha1\n        {\n          get\n          {\n            IntPtr p = SherpaOnnxGetGitSha1();\n\n            string s = \"\";\n            int length = 0;\n\n            unsafe\n            {\n                byte* b = (byte*)p;\n                if (b != null)\n                {\n                    while (*b != 0)\n                    {\n                        ++b;\n                        length += 1;\n                    }\n                }\n            }\n\n            if (length > 0)\n            {\n                byte[] stringBuffer = new byte[length];\n                Marshal.Copy(p, stringBuffer, 0, length);\n                s = Encoding.UTF8.GetString(stringBuffer);\n            }\n\n            return s;\n          }\n        }\n\n        public static String GitDate\n        {\n          get\n          {\n            IntPtr p = SherpaOnnxGetGitDate();\n\n            string s = \"\";\n            int length = 0;\n\n            
unsafe\n            {\n                byte* b = (byte*)p;\n                if (b != null)\n                {\n                    while (*b != 0)\n                    {\n                        ++b;\n                        length += 1;\n                    }\n                }\n            }\n\n            if (length > 0)\n            {\n                byte[] stringBuffer = new byte[length];\n                Marshal.Copy(p, stringBuffer, 0, length);\n                s = Encoding.UTF8.GetString(stringBuffer);\n            }\n\n            return s;\n          }\n        }\n\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxGetVersionStr();\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxGetGitSha1();\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxGetGitDate();\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/VoiceActivityDetector.cs",
    "content": "﻿/// Copyright (c)  2024  Xiaomi Corporation (authors: Fangjun Kuang)\nusing System;\nusing System.Runtime.InteropServices;\n\nnamespace SherpaOnnx\n{\n    public class VoiceActivityDetector : IDisposable\n    {\n        public VoiceActivityDetector(VadModelConfig config, float bufferSizeInSeconds)\n        {\n            IntPtr h = SherpaOnnxCreateVoiceActivityDetector(ref config, bufferSizeInSeconds);\n            _handle = new HandleRef(this, h);\n        }\n\n        public void AcceptWaveform(float[] samples)\n        {\n            SherpaOnnxVoiceActivityDetectorAcceptWaveform(_handle.Handle, samples, samples.Length);\n        }\n\n        public bool IsEmpty()\n        {\n            return SherpaOnnxVoiceActivityDetectorEmpty(_handle.Handle) == 1;\n        }\n\n        public bool IsSpeechDetected()\n        {\n            return SherpaOnnxVoiceActivityDetectorDetected(_handle.Handle) == 1;\n        }\n\n        public void Pop()\n        {\n            SherpaOnnxVoiceActivityDetectorPop(_handle.Handle);\n        }\n\n        public SpeechSegment Front()\n        {\n            IntPtr p = SherpaOnnxVoiceActivityDetectorFront(_handle.Handle);\n\n            SpeechSegment segment = new SpeechSegment(p);\n\n            SherpaOnnxDestroySpeechSegment(p);\n\n            return segment;\n        }\n\n        public void Clear()\n        {\n            SherpaOnnxVoiceActivityDetectorClear(_handle.Handle);\n        }\n\n        public void Reset()\n        {\n            SherpaOnnxVoiceActivityDetectorReset(_handle.Handle);\n        }\n\n        public void Flush()\n        {\n            SherpaOnnxVoiceActivityDetectorFlush(_handle.Handle);\n        }\n\n        public void Dispose()\n        {\n            Cleanup();\n            // Prevent the object from being placed on the\n            // finalization queue\n            System.GC.SuppressFinalize(this);\n        }\n\n        ~VoiceActivityDetector()\n        {\n            Cleanup();\n        
}\n\n        private void Cleanup()\n        {\n            SherpaOnnxDestroyVoiceActivityDetector(_handle.Handle);\n\n            // Don't permit the handle to be used again.\n            _handle = new HandleRef(this, IntPtr.Zero);\n        }\n\n        private HandleRef _handle;\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxCreateVoiceActivityDetector(ref VadModelConfig config, float bufferSizeInSeconds);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroyVoiceActivityDetector(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxVoiceActivityDetectorAcceptWaveform(IntPtr handle, float[] samples, int n);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxVoiceActivityDetectorEmpty(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern int SherpaOnnxVoiceActivityDetectorDetected(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxVoiceActivityDetectorPop(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxVoiceActivityDetectorClear(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern IntPtr SherpaOnnxVoiceActivityDetectorFront(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxDestroySpeechSegment(IntPtr segment);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxVoiceActivityDetectorReset(IntPtr handle);\n\n        [DllImport(Dll.Filename)]\n        private static extern void SherpaOnnxVoiceActivityDetectorFlush(IntPtr handle);\n    }\n}\n"
  },
  {
    "path": "scripts/dotnet/examples/Common.csproj",
    "content": "﻿<Project Sdk=\"Microsoft.NET.Sdk\">\n\n  <PropertyGroup>\n    <TargetFramework>net8.0</TargetFramework>\n    <RestoreSources>/tmp/packages;$(RestoreSources);https://api.nuget.org/v3/index.json</RestoreSources>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx\" Version=\"*\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "scripts/dotnet/examples/README.md",
    "content": "# Introduction\n\nFiles in this directory are used exclusively by CI.\n"
  },
  {
    "path": "scripts/dotnet/generate.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2023  Xiaomi Corporation\n\nimport glob\nimport os\nimport re\nfrom pathlib import Path\n\nimport jinja2\n\nSHERPA_ONNX_DIR = Path(__file__).resolve().parent.parent.parent\n\nsrc_dir = os.environ.get(\"src_dir\", \"/tmp\")\n\n\ndef get_version():\n    cmake_file = SHERPA_ONNX_DIR / \"CMakeLists.txt\"\n    with open(cmake_file) as f:\n        content = f.read()\n\n    version = re.search(r\"set\\(SHERPA_ONNX_VERSION (.*)\\)\", content).group(1)\n    return version.strip('\"')\n\n\ndef read_proj_file(filename):\n    with open(filename) as f:\n        return f.read()\n\n\ndef get_dict():\n    return {\n        \"version\": get_version(),\n    }\n\n\ndef process_linux(s, rid):\n    libs = [\n        \"libonnxruntime.so\",\n        \"libsherpa-onnx-c-api.so\",\n    ]\n    prefix = f\"{src_dir}/linux-{rid}/\"\n    libs = [prefix + lib for lib in libs]\n    libs = \"\\n      ;\".join(libs)\n\n    d = get_dict()\n    d[\"dotnet_rid\"] = f\"linux-{rid}\"\n    d[\"libs\"] = libs\n\n    environment = jinja2.Environment()\n    template = environment.from_string(s)\n    s = template.render(**d)\n    with open(f\"./linux-{rid}/sherpa-onnx.runtime.csproj\", \"w\") as f:\n        f.write(s)\n\n\ndef process_macos(s, rid):\n    lib_dir = os.path.join(src_dir, f\"macos-{rid}\")\n    onnx_libs = glob.glob(os.path.join(lib_dir, \"libonnxruntime*.dylib\"))\n    if not onnx_libs:\n        raise FileNotFoundError(f\"No libonnxruntime*.dylib found in {lib_dir}\")\n\n    other_libs = [os.path.join(lib_dir, \"libsherpa-onnx-c-api.dylib\")]\n    libs = onnx_libs + other_libs\n    libs_str = \"\\n      ;\".join(libs)\n\n    d = get_dict()\n    d[\"dotnet_rid\"] = f\"osx-{rid}\"\n    d[\"libs\"] = libs_str\n\n    environment = jinja2.Environment()\n    template = environment.from_string(s)\n    s = template.render(**d)\n    with open(f\"./macos-{rid}/sherpa-onnx.runtime.csproj\", \"w\") as f:\n        f.write(s)\n\n\ndef 
process_windows(s, rid):\n    libs = [\n        \"onnxruntime.dll\",\n        \"sherpa-onnx-c-api.dll\",\n    ]\n\n    prefix = f\"{src_dir}/windows-{rid}/\"\n    libs = [prefix + lib for lib in libs]\n    libs = \"\\n      ;\".join(libs)\n\n    d = get_dict()\n    d[\"dotnet_rid\"] = f\"win-{rid}\"\n    d[\"libs\"] = libs\n\n    environment = jinja2.Environment()\n    template = environment.from_string(s)\n    s = template.render(**d)\n    with open(f\"./windows-{rid}/sherpa-onnx.runtime.csproj\", \"w\") as f:\n        f.write(s)\n\n\ndef main():\n    s = read_proj_file(\"./sherpa-onnx.csproj.runtime.in\")\n    process_macos(s, \"x64\")\n    process_macos(s, \"arm64\")\n    process_linux(s, \"x64\")\n    process_linux(s, \"arm64\")\n    process_windows(s, \"x64\")\n    process_windows(s, \"x86\")\n    process_windows(s, \"arm64\")\n\n    s = read_proj_file(\"./sherpa-onnx.csproj.in\")\n    d = get_dict()\n    d[\"packages_dir\"] = str(SHERPA_ONNX_DIR / \"scripts/dotnet/packages\")\n\n    environment = jinja2.Environment()\n    template = environment.from_string(s)\n    s = template.render(**d)\n    with open(\"./all/sherpa-onnx.csproj\", \"w\") as f:\n        f.write(s)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/dotnet/sherpa-onnx.csproj.in",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n  <PropertyGroup>\n    <PackageLicenseExpression>Apache-2.0</PackageLicenseExpression>\n    <PackageReadmeFile>README.md</PackageReadmeFile>\n    <OutputType>Library</OutputType>\n    <LangVersion>10.0</LangVersion>\n    <TargetFrameworks>net8.0;net7.0;net6.0;net45;net40;net35;net20;netstandard2.0</TargetFrameworks>\n    <RuntimeIdentifiers>linux-x64;linux-arm64;osx-x64;osx-arm64;win-x64;win-x86;win-arm64</RuntimeIdentifiers>\n    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>\n    <AssemblyName>sherpa-onnx</AssemblyName>\n    <Version>{{ version }}</Version>\n\n    <PackageProjectUrl>https://github.com/k2-fsa/sherpa-onnx</PackageProjectUrl>\n    <RepositoryUrl>https://github.com/k2-fsa/sherpa-onnx</RepositoryUrl>\n    <PackageTags>speech recognition voice audio stt asr speech-to-text AI offline\n      privacy open-sourced next-gen-kaldi k2 kaldi2 sherpa-onnx</PackageTags>\n\n    <Authors>The Next-gen Kaldi development team</Authors>\n    <Owners>The Next-gen Kaldi development team</Owners>\n    <Company>Xiaomi Corporation</Company>\n    <Copyright>Copyright 2019-2023 Xiaomi Corporation</Copyright>\n    <Description>sherpa-onnx is an open-source real-time speech recognition toolkit developed\n    by the Next-gen Kaldi team. 
It supports streaming recognition on a variety of\n    platforms such as Android, iOS, Raspberry Pi, Linux, Windows, macOS, etc.\n\n    It does not require an Internet connection during recognition.\n\n    See the documentation at https://k2-fsa.github.io/sherpa/onnx/index.html\n    for details.\n    </Description>\n\n    <!-- Pack Option -->\n    <Title>sherpa-onnx v{{ version }}</Title>\n    <PackageId>org.k2fsa.sherpa.onnx</PackageId>\n\n    <!-- Signing -->\n    <SignAssembly>false</SignAssembly>\n    <PublicSign>false</PublicSign>\n    <DelaySign>false</DelaySign>\n  </PropertyGroup>\n\n  <PropertyGroup>\n    <RestoreSources>{{ packages_dir }};$(RestoreSources);https://api.nuget.org/v3/index.json</RestoreSources>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <None Include=\"../README.md\" Pack=\"true\" PackagePath=\"/\"/>\n  </ItemGroup>\n\n  <ItemGroup>\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.linux-x64\" Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.linux-arm64\" Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.osx-x64\"   Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.osx-arm64\" Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.win-x64\"   Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.win-x86\"   Version=\"{{ version }}\" />\n    <PackageReference Include=\"org.k2fsa.sherpa.onnx.runtime.win-arm64\" Version=\"{{ version }}\" />\n  </ItemGroup>\n\n</Project>\n"
  },
  {
    "path": "scripts/dotnet/sherpa-onnx.csproj.runtime.in",
    "content": "<Project Sdk=\"Microsoft.NET.Sdk\">\n  <PropertyGroup>\n    <PackageLicenseExpression>Apache-2.0</PackageLicenseExpression>\n    <PackageReadmeFile>README.md</PackageReadmeFile>\n    <OutputType>Library</OutputType>\n    <TargetFrameworks>net8.0;net7.0;net6.0;net45;net40;net35;net20;netstandard2.0</TargetFrameworks>\n    <RuntimeIdentifier>{{ dotnet_rid }}</RuntimeIdentifier>\n    <AssemblyName>sherpa-onnx</AssemblyName>\n    <Version>{{ version }}</Version>\n\n    <PackageProjectUrl>https://github.com/k2-fsa/sherpa-onnx</PackageProjectUrl>\n    <RepositoryUrl>https://github.com/k2-fsa/sherpa-onnx</RepositoryUrl>\n    <PackageTags>speech recognition voice audio stt asr speech-to-text AI offline\n      privacy open-sourced next-gen-kaldi k2 kaldi2 sherpa-onnx</PackageTags>\n\n    <!-- Nuget Properties -->\n    <Description>.NET native {{ dotnet_rid }} wrapper for the sherpa-onnx project.\n\n    In general, you don't need to use this package directly.\n\n    Please use https://www.nuget.org/packages/org.k2fsa.sherpa.onnx instead\n    </Description>\n    <IncludeBuildOutput>false</IncludeBuildOutput>\n\n    <!-- Pack Option -->\n    <Title>sherpa-onnx {{ dotnet_rid }} v{{ version }}</Title>\n    <PackageId>org.k2fsa.sherpa.onnx.runtime.{{ dotnet_rid }}</PackageId>\n\n    <!-- Signing -->\n    <SignAssembly>false</SignAssembly>\n    <PublicSign>false</PublicSign>\n    <DelaySign>false</DelaySign>\n  </PropertyGroup>\n\n  <ItemGroup>\n    <None Include=\"../README.md\" Pack=\"true\" PackagePath=\"/\"/>\n  </ItemGroup>\n\n  <ItemGroup>\n    <!-- Native library must be in native directory... -->\n    <!-- If project is built as a STATIC_LIBRARY (e.g. 
Windows) then we don't have to include it -->\n    <Content Include=\"\n      {{ libs }}\n    \">\n      <PackagePath>runtimes/{{ dotnet_rid }}/native/%(Filename)%(Extension)</PackagePath>\n      <Pack>true</Pack>\n      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>\n    </Content>\n  </ItemGroup>\n</Project>\n"
  },
  {
    "path": "scripts/export_bpe_vocab.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2024  Xiaomi Corp.        (authors: Wei Kang)\n#\n# See ../../../../LICENSE for clarification regarding multiple authors\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n\n# You can install sentencepiece via:\n#\n#  pip install sentencepiece\n#\n# Due to an issue reported in\n# https://github.com/google/sentencepiece/pull/642#issuecomment-857972030\n#\n# Please install a version >=0.1.96\n\nimport argparse\nimport codecs\n\ntry:\n    import sentencepiece as spm\nexcept ImportError:\n    print('Please run')\n    print('  pip install sentencepiece')\n    print('before you continue')\n    raise\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--bpe-model\",\n        type=str,\n        required=True,\n        help=\"The path to the bpe model.\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    model_file = args.bpe_model\n\n    vocab_file = model_file.replace(\".model\", \".vocab\")\n\n    sp = spm.SentencePieceProcessor()\n    sp.Load(model_file)\n    vocabs = [sp.id_to_piece(id) for id in range(sp.get_piece_size())]\n    with codecs.open(vocab_file, \"w\", \"utf-8\") as vfile:\n        for v in vocabs:\n            id = sp.piece_to_id(v)\n            vfile.write(f\"{v}\\t{sp.get_score(id)}\\n\")\n    print(f\"Vocabulary file is written to {vocab_file}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/flutter/.gitignore",
    "content": "!*.sh.in\n"
  },
  {
    "path": "scripts/flutter/build-android-streaming-asr.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x86_64\nfi\n\nlog \"arch: $arch\"\n\n{% for model in model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/streaming_asr/\n\nmodel_name={{ model.model_name }}\nlang={{ model.lang }}\ntype={{ model.idx }}\nshort_name={{ model.short_name }}\n\nrm -rf assets\nmkdir assets\n\npushd assets\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\nls -lh\nls -lh *\n\npopd\n\ngit checkout ./\nsed -i.bak \"s|   - assets/$|   - assets/\\n    - assets/$model_name/|g\" ./pubspec.yaml\n\nsed -i.bak \"s/final type = .*;$/final type = $type;/g\" ./lib/streaming_asr.dart\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts: ''|ruleFsts: await copyAssetFile(\\'assets/$rule_fsts\\')|g\"  ./lib/streaming_asr.dart\n{% endif %}\n\ngit diff .\n\nflutter pub get\n\nflutter build apk --split-per-abi --release\n\npushd build/app/outputs/flutter-apk\nls -lh\n\n\n\nfor arch in armeabi-v7a arm64-v8a x86_64; do\n  src=app-$arch-release.apk\n  dst=$SHERPA_ONNX_DIR/sherpa-onnx-$SHERPA_ONNX_VERSION-$arch-asr-$short_name.apk\n  mv $src $dst\ndone\n\npushd $SHERPA_ONNX_DIR\nls -lh *.apk\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-android-tts.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\n{% for tts_model in tts_model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/tts/\n\ngit checkout .\n\nrm -rf assets\nmkdir assets\npushd assets\n\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nlang={{ tts_model.lang }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\nls -lh\nls -lh *\n\npopd # assets\n# Now we are at the project root directory\n\n./generate-asset-list.py\n\npushd lib\n\nsed -i.bak \"s|modelDir = ''|modelDir = \\\"$model_dir\\\"|\" ./model.dart\nsed -i.bak s/\"modelName = ''\"/\"modelName = \\\"$model_name\\\"\"/ ./model.dart\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts = ''|ruleFsts = \\\"$rule_fsts\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak \"s|ruleFars = ''|ruleFars = \\\"$rule_fars\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak \"s|dataDir = ''|dataDir = \\\"$data_dir\\\"|\" ./model.dart\n{% elif not tts_model.is_char %}\n  sed -i.bak \"s|lexicon = ''|lexicon = \\\"lexicon.txt\\\"|\" ./model.dart\n{% endif %}\n\ngit status\n\ngit diff .\n\npopd #lib\n\nflutter pub get\n\nflutter build apk --split-per-abi 
--release\n\npushd build/app/outputs/flutter-apk\nls -lh\n\nfor arch in armeabi-v7a arm64-v8a x86_64; do\n  src=app-$arch-release.apk\n  dst=$SHERPA_ONNX_DIR/sherpa-onnx-$SHERPA_ONNX_VERSION-$arch-$lang-tts-$model_dir.apk\n  mv $src $dst\ndone\n\npushd $SHERPA_ONNX_DIR\nls -lh *.apk\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-linux-streaming-asr.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x86_64\nfi\n\nlog \"arch: $arch\"\n\n{% for model in model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/streaming_asr/\n\nmodel_name={{ model.model_name }}\nlang={{ model.lang }}\ntype={{ model.idx }}\nshort_name={{ model.short_name }}\n\nrm -rf assets\nmkdir assets\n\npushd assets\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\nls -lh\nls -lh *\n\npopd\n\ngit checkout ./\nsed -i.bak \"s|   - assets/$|   - assets/\\n    - assets/$model_name/|g\" ./pubspec.yaml\n\nsed -i.bak \"s/final type = .*;$/final type = $type;/g\" ./lib/streaming_asr.dart\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts: ''|ruleFsts: await copyAssetFile(\\'assets/$rule_fsts\\')|g\"  ./lib/streaming_asr.dart\n{% endif %}\n\ngit diff .\n\nflutter pub get\n# flutter upgrade\n# flutter pub upgrade\n\nexport FLUTTER_XCODE_ARCHS=$arch\nlog \"FLUTTER_XCODE_ARCHS: $FLUTTER_XCODE_ARCHS\"\n\nflutter build linux\n\npushd build/linux/x64/release/\n\nls -lh\necho \"----\"\nls -lh bundle\n\necho \"----\"\nls -lh bundle/lib\n\necho \"----\"\nls -lh bundle/data\necho \"======\"\nls -lh bundle/data/flutter_assets\n\n\necho \"----\"\nfile bundle/streaming_asr\n\necho \"----\"\nreadelf -d 
bundle/streaming_asr\n\necho \"----\"\nldd bundle/streaming_asr\n\napp=sherpa-onnx-$SHERPA_ONNX_VERSION-linux-$arch-asr-$lang-$short_name\n\nmv bundle $app\ntar cjf $app.tar.bz2 $app\nrm -rf $app\nmv $app.tar.bz2 $SHERPA_ONNX_DIR\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-linux-tts.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x86_64\nfi\n\nlog \"arch: $arch\"\n\n{% for tts_model in tts_model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/tts/\n\ngit checkout .\n\nrm -rf assets\nmkdir assets\n\npushd assets\n\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nlang={{ tts_model.lang }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\nls -lh\nls -lh *\n\npopd # assets\n# Now we are at the project root directory\n\n./generate-asset-list.py\n\npushd lib\n\nsed -i.bak \"s|modelDir = ''|modelDir = \\\"$model_dir\\\"|\" ./model.dart\nsed -i.bak s/\"modelName = ''\"/\"modelName = \\\"$model_name\\\"\"/ ./model.dart\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts = ''|ruleFsts = \\\"$rule_fsts\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak \"s|ruleFars = ''|ruleFars = \\\"$rule_fars\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak \"s|dataDir = ''|dataDir = \\\"$data_dir\\\"|\" ./model.dart\n{% elif not tts_model.is_char %}\n  sed -i.bak \"s|lexicon = ''|lexicon = \\\"lexicon.txt\\\"|\" ./model.dart\n{% endif %}\n\ngit status\n\ngit diff .\n\npopd 
#lib\n\nflutter pub get\n\nflutter build linux\n\npushd build/linux/x64/release/\n\nls -lh\necho \"----\"\nls -lh bundle\n\necho \"----\"\nls -lh bundle/lib\n\necho \"----\"\nls -lh bundle/data\necho \"======\"\nls -lh bundle/data/flutter_assets\n\n\necho \"----\"\nfile bundle/tts\n\necho \"----\"\nreadelf -d bundle/tts\n\necho \"----\"\nldd bundle/tts\n\napp=sherpa-onnx-$SHERPA_ONNX_VERSION-linux-$arch-$lang-tts-$model_dir\n\nmv bundle $app\ntar cjf $app.tar.bz2 $app\nrm -rf $app\nmv $app.tar.bz2 $SHERPA_ONNX_DIR\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-macos-streaming-asr.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x86_64\nfi\n\nlog \"arch: $arch\"\n\n{% for model in model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/streaming_asr/\n\nmodel_name={{ model.model_name }}\nlang={{ model.lang }}\ntype={{ model.idx }}\nshort_name={{ model.short_name }}\n\nrm -rf assets\nmkdir assets\n\npushd assets\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\nls -lh\nls -lh *\n\npopd\n\ngit checkout ./\nsed -i.bak \"s|   - assets/$|   - assets/\\n    - assets/$model_name/|g\" ./pubspec.yaml\n\nsed -i.bak \"s/final type = .*;$/final type = $type;/g\" ./lib/streaming_asr.dart\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts: ''|ruleFsts: await copyAssetFile(\\'assets/$rule_fsts\\')|g\"  ./lib/streaming_asr.dart\n{% endif %}\n\ngit diff .\n\nflutter pub get\n\nexport FLUTTER_XCODE_ARCHS=$arch\nlog \"FLUTTER_XCODE_ARCHS: $FLUTTER_XCODE_ARCHS\"\n\nflutter build macos\n\npushd build/macos/Build/Products/Release/\nls -lh\n\napp=sherpa-onnx-$SHERPA_ONNX_VERSION-osx-$arch-asr-$lang-$short_name.app\nmv streaming_asr.app $app\ntar cjf $app.tar.bz2 $app\nrm -rf $app\nls -lh\nmv $app.tar.bz2 $SHERPA_ONNX_DIR\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n\n{% endfor 
%}\n"
  },
  {
    "path": "scripts/flutter/build-macos-tts.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x86_64\nfi\n\nlog \"arch: $arch\"\n\n{% for tts_model in tts_model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/tts/\n\ngit checkout .\n\nrm -rf assets\nmkdir assets\n\npushd assets\n\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nlang={{ tts_model.lang }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\nls -lh\nls -lh *\n\npopd # assets\n# Now we are at the project root directory\n\n./generate-asset-list.py\n\npushd lib\n\nsed -i.bak \"s|modelDir = ''|modelDir = \\\"$model_dir\\\"|\" ./model.dart\nsed -i.bak s/\"modelName = ''\"/\"modelName = \\\"$model_name\\\"\"/ ./model.dart\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts = ''|ruleFsts = \\\"$rule_fsts\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak \"s|ruleFars = ''|ruleFars = \\\"$rule_fars\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak \"s|dataDir = ''|dataDir = \\\"$data_dir\\\"|\" ./model.dart\n{% elif not tts_model.is_char %}\n  sed -i.bak \"s|lexicon = ''|lexicon = \\\"lexicon.txt\\\"|\" ./model.dart\n{% endif %}\n\ngit status\n\ngit diff .\n\npopd 
#lib\n\nflutter pub get\n\nexport FLUTTER_XCODE_ARCHS=$arch\nlog \"FLUTTER_XCODE_ARCHS: $FLUTTER_XCODE_ARCHS\"\n\nflutter build macos\n\npushd build/macos/Build/Products/Release/\nls -lh\n\napp=sherpa-onnx-$SHERPA_ONNX_VERSION-osx-$arch-$lang-tts-$model_dir.app\n\nmv tts.app $app\ntar cjf $app.tar.bz2 $app\nls -lh\nmv $app.tar.bz2 $SHERPA_ONNX_DIR\nrm -rf $app\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-windows-streaming-asr.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\n{% for model in model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/streaming_asr/\n\nmodel_name={{ model.model_name }}\nlang={{ model.lang }}\ntype={{ model.idx }}\nshort_name={{ model.short_name }}\n\nrm -rf build\nrm -rf assets\nmkdir assets\n\npushd assets\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\nls -lh\nls -lh *\n\npopd\n\ngit checkout ./\nsed -i.bak \"s|   - assets/$|   - assets/\\n    - assets/$model_name/|g\" ./pubspec.yaml\n\nsed -i.bak \"s/final type = .*;$/final type = $type;/g\" ./lib/streaming_asr.dart\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts: ''|ruleFsts: await copyAssetFile(\\'assets/$rule_fsts\\')|g\"  ./lib/streaming_asr.dart\n{% endif %}\n\ngit diff .\n\nflutter pub get\n\nflutter build windows\n\npushd build/windows/x64/runner/\nls -lh\n\ndst=sherpa-onnx-$SHERPA_ONNX_VERSION-win-x64-asr-$lang-$short_name\nmv Release $dst\ntar cjf $dst.tar.bz2 ./$dst\nrm -rf $dst\nmv $dst.tar.bz2 $SHERPA_ONNX_DIR\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/build-windows-tts.sh.in",
    "content": "#!/usr/bin/env bash\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(cd $SCRIPT_DIR/../.. && pwd)\nlog \"SCRIPT_DIR: $SCRIPT_DIR\"\nlog \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\nlog \"SHERPA_ONNX_VERSION: $SHERPA_ONNX_VERSION\"\n\nif [ -z $arch ]; then\n  arch=x64\nfi\n\nlog \"arch: $arch\"\n\n{% for tts_model in tts_model_list %}\npushd $SHERPA_ONNX_DIR/flutter-examples/tts/\n\ngit checkout .\n\nrm -rf assets\n\nmkdir -p assets\n\npushd assets\n\nmodel_dir={{ tts_model.model_dir }}\nmodel_name={{ tts_model.model_name }}\nlang={{ tts_model.lang }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/$model_dir.tar.bz2\ntar xf $model_dir.tar.bz2\nrm $model_dir.tar.bz2\n\nls -lh\n\nls -lh *\n\npopd # assets\n# Now we are at the project root directory\n\n./generate-asset-list.py\n\npushd lib\n\nsed -i.bak \"s|modelDir = ''|modelDir = \\\"$model_dir\\\"|\" ./model.dart\nsed -i.bak s/\"modelName = ''\"/\"modelName = \\\"$model_name\\\"\"/ ./model.dart\n\n{% if tts_model.rule_fsts %}\n  rule_fsts={{ tts_model.rule_fsts }}\n  sed -i.bak \"s|ruleFsts = ''|ruleFsts = \\\"$rule_fsts\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.rule_fars %}\n  rule_fars={{ tts_model.rule_fars }}\n  sed -i.bak \"s|ruleFars = ''|ruleFars = \\\"$rule_fars\\\"|\" ./model.dart\n{% endif %}\n\n{% if tts_model.data_dir %}\n  data_dir={{ tts_model.data_dir }}\n  sed -i.bak \"s|dataDir = ''|dataDir = \\\"$data_dir\\\"|\" ./model.dart\n{% elif not tts_model.is_char %}\n  sed -i.bak \"s|lexicon = ''|lexicon = \\\"lexicon.txt\\\"|\" ./model.dart\n{% endif %}\n\ngit status\n\ngit diff .\n\npopd 
#lib\n\nflutter pub get\n\nflutter build windows\n\npushd build/windows/x64/runner/\nls -lh\necho \"---\"\n\nls -lh ./Release/\n\ndst=sherpa-onnx-$SHERPA_ONNX_VERSION-win-$arch-$lang-tts-$model_dir\nmv Release $dst\ntar cjf $dst.tar.bz2 ./$dst\nrm -rf $dst\nmv $dst.tar.bz2 $SHERPA_ONNX_DIR\n\npushd $SHERPA_ONNX_DIR\nls -lh *.tar.bz2\npopd\n\npopd\n\npopd\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/flutter/generate-streaming-asr.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    # The type of the model, e.g., 0, 1, 2. It is hardcoded in the Flutter code\n    # See flutter-examples/streaming_asr/lib/online_model.dart\n    idx: int\n\n    # e.g., zh, en, zh_en\n    lang: str\n\n    # e.g., whisper, paraformer, zipformer\n    short_name: str = \"\"\n\n    # cmd is used to remove extra files from the model directory\n    cmd: str = \"\"\n\n    rule_fsts: str = \"\"\n\n\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\",\n            idx=0,\n            lang=\"bilingual_zh_en\",\n            short_name=\"zipformer_2023_02_20\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.int8.onnx\n\n            rm -fv *.sh\n            rm -fv bpe.model\n            rm -fv README.md\n            rm -fv .gitattributes\n            rm -fv *state*\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-06-26\",\n            idx=1,\n            lang=\"en\",\n            short_name=\"zipformer2_2023_06_26\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-99-avg-1-chunk-16-left-128.onnx\n            rm -fv decoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n            rm -fv joiner-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n\n            rm -fv README.md\n            rm -fv bpe.model\n            rm -rfv test_wavs\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-streaming-wenetspeech-20230615\",\n            idx=2,\n            lang=\"zh\",\n            short_name=\"zipformer2_wenetspeech_2023_06_15\",\n            rule_fsts=\"itn_zh_number.fst\",\n            cmd=\"\"\"\n            if [ ! 
-f itn_zh_number.fst ]; then\n              curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/itn_zh_number.fst\n            fi\n            pushd $model_name\n            rm -fv exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx\n            rm -fv exp/decoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n            rm -fv exp/joiner-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n\n            rm -fv data/lang_char/lexicon.txt\n            rm -fv data/lang_char/words.txt\n            rm -rfv test_wavs\n            rm -fv README.md\n\n            ls -lh exp/\n            ls -lh data/lang_char\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-fr-2023-04-14\",\n            idx=3,\n            lang=\"fr\",\n            short_name=\"zipformer_2023_04_14\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv encoder-epoch-29-avg-9-with-averaged-model.onnx\n            rm -fv decoder-epoch-29-avg-9-with-averaged-model.int8.onnx\n            rm -fv joiner-epoch-29-avg-9-with-averaged-model.int8.onnx\n\n            rm -fv *.sh\n            rm -rfv test_wavs\n            rm README.md\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n\n    return models\n\n\ndef main():\n    args = get_args()\n\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total 
* num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-macos-streaming-asr.sh\",\n        \"./build-linux-streaming-asr.sh\",\n        \"./build-windows-streaming-asr.sh\",\n        \"./build-android-streaming-asr.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/flutter/generate-tts.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom typing import List, Optional\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass TtsModel:\n    model_dir: str\n    model_name: str = \"\"\n    lang: str = \"\"  # en, zh, fr, de, etc.\n    rule_fsts: Optional[List[str]] = None\n    rule_fars: Optional[List[str]] = None\n    data_dir: Optional[str] = None\n    dict_dir: Optional[str] = None\n    is_char: bool = False\n\n\ndef get_coqui_models() -> List[TtsModel]:\n    # English (coqui-ai/TTS)\n    models = [\n        TtsModel(model_dir=\"vits-coqui-en-ljspeech\"),\n        TtsModel(model_dir=\"vits-coqui-en-ljspeech-neon\"),\n        TtsModel(model_dir=\"vits-coqui-en-vctk\"),\n        #  TtsModel(model_dir=\"vits-coqui-en-jenny\"),\n    ]\n\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = \"model.onnx\"\n        m.lang = \"en\"\n\n    character_models = [\n        TtsModel(model_dir=\"vits-coqui-bg-cv\", lang=\"bg\"),\n        TtsModel(model_dir=\"vits-coqui-bn-custom_female\", lang=\"bn\"),\n        TtsModel(model_dir=\"vits-coqui-cs-cv\", lang=\"cs\"),\n        TtsModel(model_dir=\"vits-coqui-da-cv\", lang=\"da\"),\n        TtsModel(model_dir=\"vits-coqui-de-css10\", lang=\"de\"),\n        TtsModel(model_dir=\"vits-coqui-es-css10\", lang=\"es\"),\n        TtsModel(model_dir=\"vits-coqui-et-cv\", lang=\"et\"),\n        TtsModel(model_dir=\"vits-coqui-fi-css10\", lang=\"fi\"),\n        TtsModel(model_dir=\"vits-coqui-fr-css10\", lang=\"fr\"),\n        TtsModel(model_dir=\"vits-coqui-ga-cv\", lang=\"ga\"),\n      
  TtsModel(model_dir=\"vits-coqui-hr-cv\", lang=\"hr\"),\n        TtsModel(model_dir=\"vits-coqui-lt-cv\", lang=\"lt\"),\n        TtsModel(model_dir=\"vits-coqui-lv-cv\", lang=\"lv\"),\n        TtsModel(model_dir=\"vits-coqui-mt-cv\", lang=\"mt\"),\n        TtsModel(model_dir=\"vits-coqui-nl-css10\", lang=\"nl\"),\n        TtsModel(model_dir=\"vits-coqui-pl-mai_female\", lang=\"pl\"),\n        TtsModel(model_dir=\"vits-coqui-pt-cv\", lang=\"pt\"),\n        TtsModel(model_dir=\"vits-coqui-ro-cv\", lang=\"ro\"),\n        TtsModel(model_dir=\"vits-coqui-sk-cv\", lang=\"sk\"),\n        TtsModel(model_dir=\"vits-coqui-sl-cv\", lang=\"sl\"),\n        TtsModel(model_dir=\"vits-coqui-sv-cv\", lang=\"sv\"),\n        TtsModel(model_dir=\"vits-coqui-uk-mai\", lang=\"uk\"),\n    ]\n    for m in character_models:\n        m.is_char = True\n        m.model_name = \"model.onnx\"\n\n    return models + character_models\n\n\ndef get_piper_models() -> List[TtsModel]:\n    models = [\n        #  TtsModel(model_dir=\"vits-piper-es_ES-mls_10246-low\"),\n        #  TtsModel(model_dir=\"vits-piper-es_ES-mls_9972-low\"),\n        #  TtsModel(model_dir=\"vits-piper-pl_PL-mls_6892-low\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-kareem-low\"),\n        TtsModel(model_dir=\"vits-piper-ar_JO-kareem-medium\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_ona-medium\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_ona-x_low\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_pau-x_low\"),\n        TtsModel(model_dir=\"vits-piper-ca_ES-upc_pau-x_low\"),\n        TtsModel(model_dir=\"vits-piper-cs_CZ-jirka-medium\"),\n        TtsModel(model_dir=\"vits-piper-cy_GB-gwryw_gogleddol-medium\"),\n        TtsModel(model_dir=\"vits-piper-da_DK-talesyntese-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-eva_k-x_low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-karlsson-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-kerstin-low\"),\n        #  
TtsModel(model_dir=\"vits-piper-de_DE-mls-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-pavoque-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-ramona-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-high\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-low\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten-medium\"),\n        TtsModel(model_dir=\"vits-piper-de_DE-thorsten_emotional-medium\"),\n        TtsModel(model_dir=\"vits-piper-el_GR-rapunzelina-low\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-alan-low\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-alan-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-alba-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-aru-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-cori-high\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-cori-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-jenny_dioco-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-northern_english_male-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-semaine-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_female-low\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_female-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-southern_english_male-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-sweetbbak-amy\"),\n        TtsModel(model_dir=\"vits-piper-en_GB-vctk-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-amy-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-amy-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-arctic-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-bryce-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-danny-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-glados\"),\n        TtsModel(model_dir=\"vits-piper-en_US-hfc_female-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-hfc_male-medium\"),\n        
TtsModel(model_dir=\"vits-piper-en_US-joe-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-john-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kathleen-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kristin-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-kusal-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-l2arctic-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-lessac-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-lessac-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-lessac-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-libritts-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-libritts_r-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ljspeech-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ljspeech-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-norman-medium\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-high\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-low\"),\n        TtsModel(model_dir=\"vits-piper-en_US-ryan-medium\"),\n        TtsModel(model_dir=\"vits-piper-es-glados-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-carlfm-x_low\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-davefx-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_ES-sharvard-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_MX-ald-medium\"),\n        TtsModel(model_dir=\"vits-piper-es_MX-claude-high\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-amir-medium\"),\n        TtsModel(model_dir=\"vits-piper-fa_IR-gyro-medium\"),\n        TtsModel(model_dir=\"vits-piper-fi_FI-harri-low\"),\n        TtsModel(model_dir=\"vits-piper-fi_FI-harri-medium\"),\n        #  TtsModel(model_dir=\"vits-piper-fr_FR-mls-medium\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-siwis-low\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-siwis-medium\"),\n        TtsModel(model_dir=\"vits-piper-fr_FR-tom-medium\"),\n        
TtsModel(model_dir=\"vits-piper-fr_FR-upmc-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-anna-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-berta-medium\"),\n        TtsModel(model_dir=\"vits-piper-hu_HU-imre-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-bui-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-salka-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-steinn-medium\"),\n        TtsModel(model_dir=\"vits-piper-is_IS-ugla-medium\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-paola-medium\"),\n        TtsModel(model_dir=\"vits-piper-it_IT-riccardo-x_low\"),\n        TtsModel(model_dir=\"vits-piper-ka_GE-natia-medium\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-iseke-x_low\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-issai-high\"),\n        TtsModel(model_dir=\"vits-piper-kk_KZ-raya-x_low\"),\n        TtsModel(model_dir=\"vits-piper-lb_LU-marylux-medium\"),\n        TtsModel(model_dir=\"vits-piper-ne_NP-google-medium\"),\n        TtsModel(model_dir=\"vits-piper-ne_NP-google-x_low\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-nathalie-medium\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-nathalie-x_low\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-rdh-medium\"),\n        TtsModel(model_dir=\"vits-piper-nl_BE-rdh-x_low\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls-medium\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls_5809-low\"),\n        #  TtsModel(model_dir=\"vits-piper-nl_NL-mls_7432-low\"),\n        TtsModel(model_dir=\"vits-piper-no_NO-talesyntese-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-darkman-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-gosia-medium\"),\n        TtsModel(model_dir=\"vits-piper-pl_PL-mc_speech-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-edresson-low\"),\n        TtsModel(model_dir=\"vits-piper-pt_BR-faber-medium\"),\n        TtsModel(model_dir=\"vits-piper-pt_PT-tugao-medium\"),\n        
TtsModel(model_dir=\"vits-piper-ro_RO-mihai-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-denis-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-dmitri-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-irina-medium\"),\n        TtsModel(model_dir=\"vits-piper-ru_RU-ruslan-medium\"),\n        TtsModel(model_dir=\"vits-piper-sk_SK-lili-medium\"),\n        TtsModel(model_dir=\"vits-piper-sl_SI-artur-medium\"),\n        TtsModel(model_dir=\"vits-piper-sr_RS-serbski_institut-medium\"),\n        TtsModel(model_dir=\"vits-piper-sv_SE-nst-medium\"),\n        TtsModel(model_dir=\"vits-piper-sw_CD-lanfrica-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-dfki-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-fahrettin-medium\"),\n        TtsModel(model_dir=\"vits-piper-tr_TR-fettah-medium\"),\n        TtsModel(model_dir=\"vits-piper-uk_UA-lada-x_low\"),\n        TtsModel(model_dir=\"vits-piper-uk_UA-ukrainian_tts-medium\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-25hours_single-low\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-vais1000-medium\"),\n        TtsModel(model_dir=\"vits-piper-vi_VN-vivos-x_low\"),\n        TtsModel(model_dir=\"vits-piper-zh_CN-huayan-medium\"),\n    ]\n\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = m.model_dir[len(\"vits-piper-\") :] + \".onnx\"\n        m.lang = m.model_dir.split(\"-\")[2][:2]\n\n    return models\n\n\ndef get_mimic3_models() -> List[TtsModel]:\n    models = [\n        TtsModel(model_dir=\"vits-mimic3-af_ZA-google-nwu_low\"),\n        TtsModel(model_dir=\"vits-mimic3-bn-multi_low\"),\n        TtsModel(model_dir=\"vits-mimic3-es_ES-m-ailabs_low\"),\n        TtsModel(model_dir=\"vits-mimic3-fa-haaniye_low\"),\n        TtsModel(model_dir=\"vits-mimic3-fi_FI-harri-tapani-ylilammi_low\"),\n        TtsModel(model_dir=\"vits-mimic3-gu_IN-cmu-indic_low\"),\n        
TtsModel(model_dir=\"vits-mimic3-hu_HU-diana-majlinger_low\"),\n        TtsModel(model_dir=\"vits-mimic3-ko_KO-kss_low\"),\n        TtsModel(model_dir=\"vits-mimic3-ne_NP-ne-google_low\"),\n        TtsModel(model_dir=\"vits-mimic3-pl_PL-m-ailabs_low\"),\n        TtsModel(model_dir=\"vits-mimic3-tn_ZA-google-nwu_low\"),\n        TtsModel(model_dir=\"vits-mimic3-vi_VN-vais1000_low\"),\n    ]\n    for m in models:\n        m.data_dir = m.model_dir + \"/\" + \"espeak-ng-data\"\n        m.model_name = m.model_dir[len(\"vits-mimic3-\") :] + \".onnx\"\n        m.lang = m.model_dir.split(\"-\")[2][:2]\n\n    return models\n\n\ndef get_vits_models() -> List[TtsModel]:\n    chinese_models = [\n        # Chinese\n        TtsModel(\n            model_dir=\"vits-icefall-zh-aishell3\",\n            model_name=\"model.onnx\",\n            lang=\"zh\",\n            rule_fsts=\"vits-icefall-zh-aishell3/phone.fst,vits-icefall-zh-aishell3/date.fst,vits-icefall-zh-aishell3/number.fst,vits-icefall-zh-aishell3/new_heteronym.fst\",\n            rule_fars=\"vits-icefall-zh-aishell3/rule.far\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-aishell3\",\n            model_name=\"vits-aishell3.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-doom\",\n            model_name=\"doom.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-echo\",\n            model_name=\"echo.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-zenyatta\",\n            model_name=\"zenyatta.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-abyssinvoker\",\n            model_name=\"abyssinvoker.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-keqing\",\n            model_name=\"keqing.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n   
         model_dir=\"vits-zh-hf-eula\",\n            model_name=\"eula.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-bronya\",\n            model_name=\"bronya.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-theresa\",\n            model_name=\"theresa.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-wnj\",\n            model_name=\"vits-zh-hf-fanchen-wnj.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-melo-tts-zh_en\",\n            model_name=\"model.onnx\",\n            lang=\"zh_en\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-C\",\n            model_name=\"vits-zh-hf-fanchen-C.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe\",\n            model_name=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe_new\",\n            model_name=\"vits-zh-hf-fanchen-ZhiHuiLaoZhe_new.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"vits-zh-hf-fanchen-unity\",\n            model_name=\"vits-zh-hf-fanchen-unity.onnx\",\n            lang=\"zh\",\n        ),\n        TtsModel(\n            model_dir=\"sherpa-onnx-vits-zh-ll\",\n            model_name=\"model.onnx\",\n            lang=\"zh\",\n        ),\n    ]\n\n    rule_fsts = [\"phone.fst\", \"date.fst\", \"number.fst\"]\n    for m in chinese_models:\n        s = [f\"{m.model_dir}/{r}\" for r in rule_fsts]\n        if (\n            \"vits-zh-hf\" in m.model_dir\n            or \"sherpa-onnx-vits-zh-ll\" == m.model_dir\n            or \"melo-tts\" in m.model_dir\n        ):\n            m.dict_dir = m.model_dir + \"/dict\"\n        else:\n            m.rule_fars = 
f\"{m.model_dir}/rule.far\"\n\n        m.rule_fsts = \",\".join(s)\n\n    all_models = chinese_models + [\n        TtsModel(\n            model_dir=\"vits-cantonese-hf-xiaomaiiwn\",\n            model_name=\"vits-cantonese-hf-xiaomaiiwn.onnx\",\n            lang=\"cantonese\",\n            rule_fsts=\"vits-cantonese-hf-xiaomaiiwn/rule.fst\",\n        ),\n        # English (US)\n        TtsModel(model_dir=\"vits-vctk\", model_name=\"vits-vctk.onnx\", lang=\"en\"),\n        #  TtsModel(model_dir=\"vits-ljs\", model_name=\"vits-ljs.onnx\", lang=\"en\"),\n        # fmt: on\n    ]\n\n    return all_models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n    d = dict()\n\n    all_model_list = get_vits_models()\n    all_model_list += get_piper_models()\n    all_model_list += get_mimic3_models()\n    all_model_list += get_coqui_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(\n        \"{index}/{total}: {start}-{end}/{num_models}\".format(\n            index=index,\n            total=total,\n            start=start,\n            end=end,\n            num_models=num_models,\n        )\n    )\n\n    d[\"tts_model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"tts_model_list\"].append(all_model_list[s])\n        print(\"{s}/{num_models}\".format(s=s, num_models=num_models))\n\n    filename_list = [\n        \"./build-macos-tts.sh\",\n        \"./build-linux-tts.sh\",\n        \"./build-android-tts.sh\",\n        \"./build-windows-tts.sh\",\n    ]\n    for filename in filename_list:\n        environment = 
jinja2.Environment()\n        with open(\"{filename}.in\".format(filename=filename)) as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/go/README.md",
    "content": "# Introduction\n\n- [./_internal](./_internal) is for testing only. General users can\nignore it.\n"
  },
  {
    "path": "scripts/go/_internal/.gitignore",
    "content": "!*.sh\ngo.sum\n"
  },
  {
    "path": "scripts/go/_internal/add-punctuation/go.mod",
    "content": "module add-punctuation\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/add-punctuation-online/go.mod",
    "content": "module add-punctuation-online\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/build_darwin_amd64.go",
    "content": "//go:build darwin && amd64 && !ios\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/x86_64-apple-darwin -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${SRCDIR}/lib/x86_64-apple-darwin\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_darwin_arm64.go",
    "content": "//go:build darwin && arm64 && !ios\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/aarch64-apple-darwin -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${SRCDIR}/lib/aarch64-apple-darwin\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_linux_amd64.go",
    "content": "//go:build !android && linux && amd64 && !musl\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/x86_64-unknown-linux-gnu -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${SRCDIR}/lib/x86_64-unknown-linux-gnu\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_linux_arm.go",
    "content": "//go:build linux && arm && !arm7\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/arm-unknown-linux-gnueabihf -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${SRCDIR}/lib/arm-unknown-linux-gnueabihf\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_linux_arm64.go",
    "content": "//go:build linux && arm64\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/aarch64-unknown-linux-gnu -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${SRCDIR}/lib/aarch64-unknown-linux-gnu\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_windows_386.go",
    "content": "//go:build windows && 386\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/i686-pc-windows-gnu -lsherpa-onnx-c-api -lonnxruntime\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/build_windows_amd64.go",
    "content": "//go:build windows && amd64\n\npackage sherpa_onnx\n\n// #cgo LDFLAGS: -L ${SRCDIR}/lib/x86_64-pc-windows-gnu -lsherpa-onnx-c-api -lonnxruntime\nimport \"C\"\n"
  },
  {
    "path": "scripts/go/_internal/go.mod",
    "content": "module sherpa_onnx\n\ngo 1.17\n"
  },
  {
    "path": "scripts/go/_internal/lib/x86_64-pc-windows-gnu/.gitkeep",
    "content": ""
  },
  {
    "path": "scripts/go/_internal/non-streaming-canary-decode-files/go.mod",
    "content": "module non-streaming-canary-decode-files\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/non-streaming-funasr-nano-decode-files/go.mod",
    "content": "module non-streaming-funasr-nano-decode-files\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/non-streaming-omnilingual-asr-ctc-decode-files/go.mod",
    "content": "module non-streaming-omnilingual-asr-ctc-decode-files\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/non-streaming-speaker-diarization/go.mod",
    "content": "module non-streaming-speaker-diarization\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/supertonic-tts/go.mod",
    "content": "module supertonic-tts\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/vad-speaker-identification/go.mod",
    "content": "module vad-speaker-identification\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/zero-shot-zipvoice-tts/go.mod",
    "content": "module zero-shot-zipvoice-tts\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/_internal/zero-shot-zipvoice-tts-play/go.mod",
    "content": "module zero-shot-zipvoice-tts-play\n\ngo 1.17\n\nreplace github.com/k2-fsa/sherpa-onnx-go/sherpa_onnx => ../\n"
  },
  {
    "path": "scripts/go/defines.go.jinja",
    "content": "{{ golang_header }}\npackage sherpa_onnx\n\n// ============================================================\n// Code Generated Automatically for {{ platform }} platform, DO NOT EDIT MANUALLY!!\n// ============================================================\n\nimport (\n\tsherpa \"github.com/k2-fsa/sherpa-onnx-go-{{ platform }}\"\n)\n\n// ============================================================\n// Struct/Function Defines\n// ============================================================\n{% for define in defines -%}\n{%- if define.type == 'function' %}\nvar {{ define.name }} = sherpa.{{ define.name }}\n{%- else %}\ntype {{ define.name }} = sherpa.{{ define.name }}\n{%- endif %}\n{%- endfor %}"
  },
  {
    "path": "scripts/go/generate.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nimport os\nimport re\n\nimport jinja2\n\n\ndef parse_args():\n    # set the source code file\n    # -s sherpa_onnx.go\n    # set the output folder\n    # -o ./sherpa-onnx-go\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n    # add argv to set source code file\n    parser.add_argument(\"-s\", \"--source\", type=str, required=True)\n    # add argv to set output folder\n    parser.add_argument(\"-o\", \"--output\", type=str, required=True)\n    return parser.parse_args()\n\n\ndef parse_golang(target):\n    with open(target, \"r\") as file:\n        content = file.read()\n    defines = []\n    struct_pattern = re.compile(r\"type\\s+([A-Z]\\w+)\\s+struct\", re.DOTALL)\n    struct_matches = struct_pattern.findall(content)\n    for name in struct_matches:\n        c_define = {\n            \"type\": \"struct\",\n            \"name\": name,\n        }\n        defines.append(c_define)\n    struct_pattern = re.compile(r\"type\\s+([A-Z][^ =]+)\\s*=\", re.DOTALL)\n    struct_matches = struct_pattern.findall(content)\n    for name in struct_matches:\n        c_define = {\n            \"type\": \"struct\",\n            \"name\": name,\n        }\n        defines.append(c_define)\n    func_pattern = re.compile(r\"func\\s+([A-Z][^ \\(]+)\\s*\\(\", re.DOTALL)\n    func_matches = func_pattern.findall(content)\n    for name in func_matches:\n        c_define = {\n            \"type\": \"function\",\n            \"name\": name,\n        }\n        defines.append(c_define)\n    return defines\n\n\ndef render(output, defines, platform):\n    build_info = \"\"\n    if platform == \"windows\":\n        build_info = \"//go:build (windows && amd64) || (windows && 386)\"\n    elif platform == \"linux\":\n        build_info = \"//go:build (!android && linux && arm64) || (!android && linux && amd64 && !musl) || (!android && linux && arm && !arm7) || (!android && arm7) || 
(!android && linux && 386 && !musl) || (!android && musl) || (!android && linux && mips) || (!android && linux && mips64) || (!android && linux && mips64le) || (!android && linux && mipsle)\"\n    elif platform == \"macos\":\n        build_info = \"//go:build (darwin && amd64 && !ios) || (darwin && arm64 && !ios)\"\n    with open(\"./defines.go.jinja\") as f:\n        content = f.read()\n    environment = jinja2.Environment()\n    template = environment.from_string(content)\n    context = {\n        \"platform\": platform,\n        \"defines\": defines,\n        \"golang_header\": build_info,\n    }\n    rendered = template.render(**context)\n    folder = os.path.dirname(output)\n    if not os.path.exists(folder):\n        os.makedirs(folder)\n    with open(output, \"w\") as f:\n        print(rendered, file=f)\n\n\ndef generate(src, output):\n    defines = parse_golang(src)\n    platform = \"linux\"\n    render(f\"{output}/sherpa_onnx/sherpa_onnx_{platform}.go\", defines, platform)\n    platform = \"windows\"\n    render(f\"{output}/sherpa_onnx/sherpa_onnx_{platform}.go\", defines, platform)\n    platform = \"macos\"\n    render(f\"{output}/sherpa_onnx/sherpa_onnx_{platform}.go\", defines, platform)\n\n\nif __name__ == \"__main__\":\n    args = parse_args()\n    generate(args.source, args.output)\n"
  },
  {
    "path": "scripts/go/release.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\ngit config --global user.email \"csukuangfj@gmail.com\"\ngit config --global user.name \"Fangjun Kuang\"\n\nSCRIPT_DIR=$( cd -- \"$( dirname -- \"${BASH_SOURCE[0]}\" )\" &> /dev/null && pwd )\nSHERPA_ONNX_DIR=$(realpath $SCRIPT_DIR/../..)\necho \"SCRIPT_DIR: $SCRIPT_DIR\"\necho \"SHERPA_ONNX_DIR: $SHERPA_ONNX_DIR\"\n\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" $SHERPA_ONNX_DIR/CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\necho \"SHERPA_ONNX_VERSION $SHERPA_ONNX_VERSION\"\n\nfunction linux() {\n  echo \"Process linux\"\n  git clone git@github.com:k2-fsa/sherpa-onnx-go-linux.git\n\n  rm -v ./sherpa-onnx-go-linux/*.go\n\n  cp -v ./sherpa_onnx.go ./sherpa-onnx-go-linux/\n  cp -v ./_internal/c-api.h ./sherpa-onnx-go-linux/\n  cp -v ./_internal/build_linux_*.go ./sherpa-onnx-go-linux/\n\n  rm -rf sherpa-onnx-go-linux/lib/x86_64-unknown-linux-gnu/lib*\n  dst=$(realpath sherpa-onnx-go-linux/lib/x86_64-unknown-linux-gnu)\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-manylinux2014_x86_64.whl\n  unzip sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-manylinux2014_x86_64.whl\n\n  rm -fv $dst/_sherpa*.so\n  cp -v sherpa_onnx/lib/lib*.so* $dst\n\n  cd ..\n  rm -rf t\n\n  rm -rf sherpa-onnx-go-linux/lib/aarch64-unknown-linux-gnu/lib*\n  dst=$(realpath sherpa-onnx-go-linux/lib/aarch64-unknown-linux-gnu)\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-manylinux2014_aarch64.whl\n  unzip ./sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-manylinux2014_aarch64.whl\n\n  rm -fv $dst/_sherpa*.so\n  cp -v sherpa_onnx/lib/lib*.so* $dst\n\n  cd ..\n  rm -rf t\n\n  rm -rf sherpa-onnx-go-linux/lib/arm-unknown-linux-gnueabihf/lib*\n  dst=$(realpath 
sherpa-onnx-go-linux/lib/arm-unknown-linux-gnueabihf)\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-$SHERPA_ONNX_VERSION-py3-none-manylinux_2_35_armv7l.whl\n  unzip ./sherpa_onnx_core-$SHERPA_ONNX_VERSION-py3-none-manylinux_2_35_armv7l.whl\n\n  rm -fv $dst/_sherpa*.so\n  cp -v sherpa_onnx/lib/lib*.so* $dst\n\n  cd ..\n  rm -rf t\n\n  echo \"------------------------------\"\n  cd sherpa-onnx-go-linux\n  git status\n  git add .\n  git commit -m \"Release v$SHERPA_ONNX_VERSION\" && \\\n  git push && \\\n  git tag v$SHERPA_ONNX_VERSION && \\\n  git push origin v$SHERPA_ONNX_VERSION || true\n  cd ..\n  rm -rf sherpa-onnx-go-linux\n}\n\nfunction osx() {\n  echo \"Process osx-x64\"\n  git clone git@github.com:k2-fsa/sherpa-onnx-go-macos.git\n  rm -v ./sherpa-onnx-go-macos/*.go\n  cp -v ./sherpa_onnx.go ./sherpa-onnx-go-macos/\n  cp -v ./_internal/c-api.h ./sherpa-onnx-go-macos/\n  cp -v ./_internal/build_darwin_*.go ./sherpa-onnx-go-macos/\n\n  rm -rf sherpa-onnx-go-macos/lib/x86_64-apple-darwin/lib*\n  dst=$(realpath sherpa-onnx-go-macos/lib/x86_64-apple-darwin/)\n\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-macosx_10_15_x86_64.whl\n  unzip ./sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-macosx_10_15_x86_64.whl\n\n  cp -v sherpa_onnx/lib/*.dylib $dst/\n\n  pushd $dst\n  cp -v libonnxruntime.*.dylib libonnxruntime.dylib\n  popd\n\n  cd ..\n  rm -rf t\n\n  echo \"process macos arm64\"\n  rm -rf sherpa-onnx-go-macos/lib/aarch64-apple-darwin/lib*\n  dst=$(realpath sherpa-onnx-go-macos/lib/aarch64-apple-darwin)\n\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-macosx_11_0_arm64.whl\n  unzip 
./sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-macosx_11_0_arm64.whl\n\n  cp -v sherpa_onnx/lib/*.dylib $dst/\n\n  pushd $dst\n  cp -v libonnxruntime.*.dylib libonnxruntime.dylib\n  popd\n\n  cd ..\n  rm -rf t\n  echo \"------------------------------\"\n  cd sherpa-onnx-go-macos\n  git status\n  git add .\n  git commit -m \"Release v$SHERPA_ONNX_VERSION\" && \\\n  git push && \\\n  git tag v$SHERPA_ONNX_VERSION && \\\n  git push origin v$SHERPA_ONNX_VERSION || true\n  cd ..\n  rm -rf sherpa-onnx-go-macos\n}\n\nfunction windows() {\n  echo \"Process windows\"\n  git clone git@github.com:k2-fsa/sherpa-onnx-go-windows.git\n  rm -v ./sherpa-onnx-go-windows/*.go\n  cp -v ./sherpa_onnx.go ./sherpa-onnx-go-windows/\n  cp -v ./_internal/c-api.h ./sherpa-onnx-go-windows/\n  cp -v ./_internal/build_windows_*.go ./sherpa-onnx-go-windows/\n\n  rm -fv sherpa-onnx-go-windows/lib/x86_64-pc-windows-gnu/*\n  dst=$(realpath sherpa-onnx-go-windows/lib/x86_64-pc-windows-gnu)\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-win_amd64.whl\n  unzip ./sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-win_amd64.whl\n\n  cp -v sherpa_onnx/lib/*.dll $dst\n\n  cd ..\n  rm -rf t\n\n  rm -fv sherpa-onnx-go-windows/lib/i686-pc-windows-gnu/*\n  dst=$(realpath sherpa-onnx-go-windows/lib/i686-pc-windows-gnu)\n  mkdir t\n  cd t\n  wget -q https://huggingface.co/csukuangfj2/sherpa-onnx-wheels/resolve/main/cpu/$SHERPA_ONNX_VERSION/sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-win32.whl\n  unzip ./sherpa_onnx_core-${SHERPA_ONNX_VERSION}-py3-none-win32.whl\n\n  cp -v sherpa_onnx/lib/*.dll $dst\n\n  cd ..\n  rm -rf t\n  echo \"------------------------------\"\n  cd sherpa-onnx-go-windows\n  git status\n  git add .\n  git commit -m \"Release v$SHERPA_ONNX_VERSION\" && \\\n  git push && \\\n  git tag v$SHERPA_ONNX_VERSION && \\\n  git push origin v$SHERPA_ONNX_VERSION || true\n  
cd ..\n  rm -rf sherpa-onnx-go-windows\n}\n\nfunction basic() {\n  echo \"Process sherpa-onnx-go\"\n  git clone git@github.com:k2-fsa/sherpa-onnx-go.git\n\n  python3 ./generate.py -s ./sherpa_onnx.go -o ./sherpa-onnx-go\n\n  echo \"------------------------------\"\n  cd sherpa-onnx-go\n  git status\n  git add .\n  git commit -m \"Release v$SHERPA_ONNX_VERSION\" && \\\n    git push && \\\n    git tag v$SHERPA_ONNX_VERSION && \\\n    git push origin v$SHERPA_ONNX_VERSION\n  cd ..\n  rm -rf sherpa-onnx-go\n}\n\nbasic\nwindows\nlinux\nosx\n\nrm -fv ~/.ssh/github\n"
  },
  {
    "path": "scripts/go/sherpa_onnx.go",
    "content": "/*\nSpeech recognition with [Next-gen Kaldi].\n\n[sherpa-onnx] is an open-source speech recognition framework for [Next-gen Kaldi].\nIt depends only on [onnxruntime], supporting both streaming and non-streaming\nspeech recognition.\n\nIt does not need to access the network during recognition and everything\nruns locally.\n\nIt supports a variety of platforms, such as Linux (x86_64, aarch64, arm),\nWindows (x86_64, x86), macOS (x86_64, arm64), etc.\n\nUsage examples:\n\n 1. Real-time speech recognition from a microphone\n\n    Please see\n    https://github.com/k2-fsa/sherpa-onnx/tree/master/go-api-examples/real-time-speech-recognition-from-microphone\n\n 2. Decode files using a non-streaming model\n\n    Please see\n    https://github.com/k2-fsa/sherpa-onnx/tree/master/go-api-examples/non-streaming-decode-files\n\n 3. Decode files using a streaming model\n\n    Please see\n    https://github.com/k2-fsa/sherpa-onnx/tree/master/go-api-examples/streaming-decode-files\n\n 4. 
Convert text to speech using a non-streaming model\n\n    Please see\n    https://github.com/k2-fsa/sherpa-onnx/tree/master/go-api-examples/non-streaming-tts\n\n[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx\n[onnxruntime]: https://github.com/microsoft/onnxruntime\n[Next-gen Kaldi]: https://github.com/k2-fsa/\n*/\npackage sherpa_onnx\n\n// #include <stdlib.h>\n// #include \"c-api.h\"\n// extern int32_t _cgoGeneratedAudioCallback(float *samples,int32_t n,void *arg);\n// extern int32_t _cgoGeneratedAudioProgressCallback(float *samples, int32_t n, float p, void *arg);\nimport \"C\"\nimport (\n\t\"encoding/json\"\n\t\"runtime/cgo\"\n\t\"unsafe\"\n)\n\n// Configuration for online/streaming transducer models\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\n// to download pre-trained models\ntype OnlineTransducerModelConfig struct {\n\tEncoder string // Path to the encoder model, e.g., encoder.onnx or encoder.int8.onnx\n\tDecoder string // Path to the decoder model.\n\tJoiner  string // Path to the joiner model.\n}\n\n// Configuration for online/streaming paraformer models\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\n// to download pre-trained models\ntype OnlineParaformerModelConfig struct {\n\tEncoder string // Path to the encoder model, e.g., encoder.onnx or encoder.int8.onnx\n\tDecoder string // Path to the decoder model.\n}\n\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html\n// to download pre-trained models\ntype OnlineZipformer2CtcModelConfig struct {\n\tModel string // Path to the onnx model\n}\n\ntype OnlineNemoCtcModelConfig struct {\n\tModel string // Path to the onnx model\n}\n\ntype OnlineToneCtcModelConfig struct {\n\tModel string // Path to the onnx model\n}\n\n// Configuration for online/streaming models\n//\n// Please refer to\n// 
https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\n// to download pre-trained models\ntype OnlineModelConfig struct {\n\tTransducer    OnlineTransducerModelConfig\n\tParaformer    OnlineParaformerModelConfig\n\tZipformer2Ctc OnlineZipformer2CtcModelConfig\n\tNemoCtc       OnlineNemoCtcModelConfig\n\tToneCtc       OnlineToneCtcModelConfig\n\tTokens        string // Path to tokens.txt\n\tNumThreads    int    // Number of threads to use for neural network computation\n\tProvider      string // Optional. Valid values are: cpu, cuda, coreml\n\tDebug         int    // 1 to show model meta information while loading it.\n\tModelType     string // Optional. You can specify it for faster model initialization\n\tModelingUnit  string // Optional. cjkchar, bpe, cjkchar+bpe\n\tBpeVocab      string // Optional.\n\tTokensBuf     string // Optional.\n\tTokensBufSize int    // Optional.\n}\n\n// Configuration for the feature extractor\ntype FeatureConfig struct {\n\t// Sample rate expected by the model. It is 16000 for all\n\t// pre-trained models provided by us\n\tSampleRate int\n\t// Feature dimension expected by the model. It is 80 for all\n\t// pre-trained models provided by us\n\tFeatureDim int\n}\n\ntype OnlineCtcFstDecoderConfig struct {\n\tGraph     string\n\tMaxActive int\n}\n\ntype HomophoneReplacerConfig struct {\n\tDictDir  string // unused\n\tLexicon  string\n\tRuleFsts string\n}\n\n// Configuration for the online/streaming recognizer.\ntype OnlineRecognizerConfig struct {\n\tFeatConfig  FeatureConfig\n\tModelConfig OnlineModelConfig\n\n\t// Valid decoding methods: greedy_search, modified_beam_search\n\tDecodingMethod string\n\n\t// Used only when DecodingMethod is modified_beam_search. 
It specifies\n\t// the maximum number of paths to keep during the search\n\tMaxActivePaths int\n\n\tEnableEndpoint int // 1 to enable endpoint detection.\n\n\t// Please see\n\t// https://k2-fsa.github.io/sherpa/ncnn/endpoint.html\n\t// for the meaning of Rule1MinTrailingSilence, Rule2MinTrailingSilence\n\t// and Rule3MinUtteranceLength.\n\tRule1MinTrailingSilence float32\n\tRule2MinTrailingSilence float32\n\tRule3MinUtteranceLength float32\n\tHotwordsFile            string\n\tHotwordsScore           float32\n\tBlankPenalty            float32\n\tCtcFstDecoderConfig     OnlineCtcFstDecoderConfig\n\tRuleFsts                string\n\tRuleFars                string\n\tHotwordsBuf             string\n\tHotwordsBufSize         int\n\tHr                      HomophoneReplacerConfig\n}\n\n// It contains the recognition result for an online stream.\ntype OnlineRecognizerResult struct {\n\tText string\n}\n\n// The online recognizer class. It wraps a pointer from C.\ntype OnlineRecognizer struct {\n\timpl *C.struct_SherpaOnnxOnlineRecognizer\n}\n\n// The online stream class. 
It wraps a pointer from C.\ntype OnlineStream struct {\n\timpl *C.struct_SherpaOnnxOnlineStream\n}\n\n// Free the internal pointer inside the recognizer to avoid memory leak.\nfunc DeleteOnlineRecognizer(recognizer *OnlineRecognizer) {\n\tC.SherpaOnnxDestroyOnlineRecognizer(recognizer.impl)\n\trecognizer.impl = nil\n}\n\n// The user is responsible to invoke [DeleteOnlineRecognizer]() to free\n// the returned recognizer to avoid memory leak\nfunc NewOnlineRecognizer(config *OnlineRecognizerConfig) *OnlineRecognizer {\n\tc := C.struct_SherpaOnnxOnlineRecognizerConfig{}\n\tc.feat_config.sample_rate = C.int(config.FeatConfig.SampleRate)\n\tc.feat_config.feature_dim = C.int(config.FeatConfig.FeatureDim)\n\n\tc.model_config.transducer.encoder = C.CString(config.ModelConfig.Transducer.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.transducer.encoder))\n\n\tc.model_config.transducer.decoder = C.CString(config.ModelConfig.Transducer.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.transducer.decoder))\n\n\tc.model_config.transducer.joiner = C.CString(config.ModelConfig.Transducer.Joiner)\n\tdefer C.free(unsafe.Pointer(c.model_config.transducer.joiner))\n\n\tc.model_config.paraformer.encoder = C.CString(config.ModelConfig.Paraformer.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.paraformer.encoder))\n\n\tc.model_config.paraformer.decoder = C.CString(config.ModelConfig.Paraformer.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.paraformer.decoder))\n\n\tc.model_config.zipformer2_ctc.model = C.CString(config.ModelConfig.Zipformer2Ctc.Model)\n\tdefer C.free(unsafe.Pointer(c.model_config.zipformer2_ctc.model))\n\n\tc.model_config.nemo_ctc.model = C.CString(config.ModelConfig.NemoCtc.Model)\n\tdefer C.free(unsafe.Pointer(c.model_config.nemo_ctc.model))\n\n\tc.model_config.t_one_ctc.model = C.CString(config.ModelConfig.ToneCtc.Model)\n\tdefer C.free(unsafe.Pointer(c.model_config.t_one_ctc.model))\n\n\tc.model_config.tokens = 
C.CString(config.ModelConfig.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model_config.tokens))\n\n\tc.model_config.tokens_buf = C.CString(config.ModelConfig.TokensBuf)\n\tdefer C.free(unsafe.Pointer(c.model_config.tokens_buf))\n\n\tc.model_config.tokens_buf_size = C.int(config.ModelConfig.TokensBufSize)\n\n\tc.model_config.num_threads = C.int(config.ModelConfig.NumThreads)\n\n\tc.model_config.provider = C.CString(config.ModelConfig.Provider)\n\tdefer C.free(unsafe.Pointer(c.model_config.provider))\n\n\tc.model_config.debug = C.int(config.ModelConfig.Debug)\n\n\tc.model_config.model_type = C.CString(config.ModelConfig.ModelType)\n\tdefer C.free(unsafe.Pointer(c.model_config.model_type))\n\n\tc.model_config.modeling_unit = C.CString(config.ModelConfig.ModelingUnit)\n\tdefer C.free(unsafe.Pointer(c.model_config.modeling_unit))\n\n\tc.model_config.bpe_vocab = C.CString(config.ModelConfig.BpeVocab)\n\tdefer C.free(unsafe.Pointer(c.model_config.bpe_vocab))\n\n\tc.decoding_method = C.CString(config.DecodingMethod)\n\tdefer C.free(unsafe.Pointer(c.decoding_method))\n\n\tc.max_active_paths = C.int(config.MaxActivePaths)\n\tc.enable_endpoint = C.int(config.EnableEndpoint)\n\tc.rule1_min_trailing_silence = C.float(config.Rule1MinTrailingSilence)\n\tc.rule2_min_trailing_silence = C.float(config.Rule2MinTrailingSilence)\n\tc.rule3_min_utterance_length = C.float(config.Rule3MinUtteranceLength)\n\n\tc.hotwords_file = C.CString(config.HotwordsFile)\n\tdefer C.free(unsafe.Pointer(c.hotwords_file))\n\n\tc.hotwords_buf = C.CString(config.HotwordsBuf)\n\tdefer C.free(unsafe.Pointer(c.hotwords_buf))\n\n\tc.hotwords_buf_size = C.int(config.HotwordsBufSize)\n\n\tc.hotwords_score = C.float(config.HotwordsScore)\n\tc.blank_penalty = C.float(config.BlankPenalty)\n\n\tc.rule_fsts = C.CString(config.RuleFsts)\n\tdefer C.free(unsafe.Pointer(c.rule_fsts))\n\n\tc.rule_fars = C.CString(config.RuleFars)\n\tdefer C.free(unsafe.Pointer(c.rule_fars))\n\n\tc.ctc_fst_decoder_config.graph = 
C.CString(config.CtcFstDecoderConfig.Graph)\n\tdefer C.free(unsafe.Pointer(c.ctc_fst_decoder_config.graph))\n\tc.ctc_fst_decoder_config.max_active = C.int(config.CtcFstDecoderConfig.MaxActive)\n\n\tc.hr.lexicon = C.CString(config.Hr.Lexicon)\n\tdefer C.free(unsafe.Pointer(c.hr.lexicon))\n\n\tc.hr.rule_fsts = C.CString(config.Hr.RuleFsts)\n\tdefer C.free(unsafe.Pointer(c.hr.rule_fsts))\n\n\timpl := C.SherpaOnnxCreateOnlineRecognizer(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\trecognizer := &OnlineRecognizer{}\n\trecognizer.impl = impl\n\treturn recognizer\n}\n\n// Delete the internal pointer inside the stream to avoid memory leak.\nfunc DeleteOnlineStream(stream *OnlineStream) {\n\tC.SherpaOnnxDestroyOnlineStream(stream.impl)\n\tstream.impl = nil\n}\n\n// The user is responsible for calling [DeleteOnlineStream] to free\n// the returned stream and avoid a memory leak.\nfunc NewOnlineStream(recognizer *OnlineRecognizer) *OnlineStream {\n\tstream := &OnlineStream{}\n\tstream.impl = C.SherpaOnnxCreateOnlineStream(recognizer.impl)\n\treturn stream\n}\n\n// Input audio samples for the stream.\n//\n// sampleRate is the actual sample rate of the input audio samples. If it\n// is different from the sample rate expected by the feature extractor, we will\n// do resampling inside.\n//\n// samples contains audio samples. 
Each sample is in the range [-1, 1]\nfunc (s *OnlineStream) AcceptWaveform(sampleRate int, samples []float32) {\n\tif len(samples) == 0 {\n\t\t// Nothing to feed; &samples[0] would panic on an empty slice.\n\t\treturn\n\t}\n\tC.SherpaOnnxOnlineStreamAcceptWaveform(s.impl, C.int(sampleRate), (*C.float)(&samples[0]), C.int(len(samples)))\n}\n\n// Signal that there will be no incoming audio samples.\n// After calling this function, you cannot call [OnlineStream.AcceptWaveform] any longer.\n//\n// The main purpose of this function is to flush the remaining audio samples\n// buffered inside for feature extraction.\nfunc (s *OnlineStream) InputFinished() {\n\tC.SherpaOnnxOnlineStreamInputFinished(s.impl)\n}\n\n// Set a key-value option on the online stream.\n// This provides a generic mechanism for passing per-stream runtime parameters\n// to the recognizer (e.g., \"is_final\" for streaming Paraformer).\nfunc (s *OnlineStream) SetOption(key string, value string) {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\tcValue := C.CString(value)\n\tdefer C.free(unsafe.Pointer(cValue))\n\tC.SherpaOnnxOnlineStreamSetOption(s.impl, cKey, cValue)\n}\n\n// Get a key-value option from the online stream.\n// Returns an empty string if the option is not set.\nfunc (s *OnlineStream) GetOption(key string) string {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\treturn C.GoString(C.SherpaOnnxOnlineStreamGetOption(s.impl, cKey))\n}\n\n// Check whether the given option exists in the online stream.\n// Return true if the option exists. Return false otherwise.\nfunc (s *OnlineStream) HasOption(key string) bool {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\treturn C.SherpaOnnxOnlineStreamHasOption(s.impl, cKey) == 1\n}\n\n// Check whether the stream has enough feature frames for decoding.\n// Return true if this stream is ready for decoding. 
Return false otherwise.\n//\n// You will usually use it like below:\n//\n//\tfor recognizer.IsReady(s) {\n//\t   recognizer.Decode(s)\n//\t}\nfunc (recognizer *OnlineRecognizer) IsReady(s *OnlineStream) bool {\n\treturn C.SherpaOnnxIsOnlineStreamReady(recognizer.impl, s.impl) == 1\n}\n\n// Return true if an endpoint is detected.\n//\n// You usually use it like below:\n//\n//\tif recognizer.IsEndpoint(s) {\n//\t   // do your own stuff after detecting an endpoint\n//\n//\t   recognizer.Reset(s)\n//\t}\nfunc (recognizer *OnlineRecognizer) IsEndpoint(s *OnlineStream) bool {\n\treturn C.SherpaOnnxOnlineStreamIsEndpoint(recognizer.impl, s.impl) == 1\n}\n\n// After calling this function, the internal neural network model states\n// are reset and IsEndpoint(s) would return false. GetResult(s) would also\n// return an empty string.\nfunc (recognizer *OnlineRecognizer) Reset(s *OnlineStream) {\n\tC.SherpaOnnxOnlineStreamReset(recognizer.impl, s.impl)\n}\n\n// Decode the stream. Before calling this function, you have to ensure\n// that recognizer.IsReady(s) returns true. Otherwise, you will be SAD.\n//\n// You usually use it like below:\n//\n//\tfor recognizer.IsReady(s) {\n//\t  recognizer.Decode(s)\n//\t}\nfunc (recognizer *OnlineRecognizer) Decode(s *OnlineStream) {\n\tC.SherpaOnnxDecodeOnlineStream(recognizer.impl, s.impl)\n}\n\n// Decode multiple streams in parallel, i.e., in batch.\n// You have to ensure that each stream is ready for decoding. 
Otherwise,\n// you will be SAD.\nfunc (recognizer *OnlineRecognizer) DecodeStreams(s []*OnlineStream) {\n\tif len(s) == 0 {\n\t\t// Nothing to decode; &ss[0] would panic on an empty slice.\n\t\treturn\n\t}\n\tss := make([]*C.struct_SherpaOnnxOnlineStream, len(s))\n\tfor i, v := range s {\n\t\tss[i] = v.impl\n\t}\n\n\tC.SherpaOnnxDecodeMultipleOnlineStreams(recognizer.impl, &ss[0], C.int(len(s)))\n}\n\n// Get the current result of the stream since the last call to Reset()\nfunc (recognizer *OnlineRecognizer) GetResult(s *OnlineStream) *OnlineRecognizerResult {\n\tp := C.SherpaOnnxGetOnlineStreamResult(recognizer.impl, s.impl)\n\tdefer C.SherpaOnnxDestroyOnlineRecognizerResult(p)\n\tresult := &OnlineRecognizerResult{}\n\tresult.Text = C.GoString(p.text)\n\n\treturn result\n}\n\n// Configuration for offline/non-streaming transducer.\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\n// to download pre-trained models\ntype OfflineTransducerModelConfig struct {\n\tEncoder string // Path to the encoder model, e.g., encoder.onnx or encoder.int8.onnx\n\tDecoder string // Path to the decoder model\n\tJoiner  string // Path to the joiner model\n}\n\n// Configuration for offline/non-streaming paraformer.\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\n// to download pre-trained models\ntype OfflineParaformerModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\n// Configuration for offline/non-streaming NeMo CTC models.\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\n// to download pre-trained models\ntype OfflineNemoEncDecCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineZipformerCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineWenetCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or 
model.int8.onnx\n}\n\ntype OfflineOmnilingualAsrCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineMedAsrCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineFireRedAsrCtcModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineDolphinModelConfig struct {\n\tModel string // Path to the model, e.g., model.onnx or model.int8.onnx\n}\n\ntype OfflineWhisperModelConfig struct {\n\tEncoder                 string\n\tDecoder                 string\n\tLanguage                string\n\tTask                    string\n\tTailPaddings            int\n\tEnableTokenTimestamps   int\n\tEnableSegmentTimestamps int\n}\n\ntype OfflineCanaryModelConfig struct {\n\tEncoder string\n\tDecoder string\n\tSrcLang string\n\tTgtLang string\n\tUsePnc  int\n}\n\ntype OfflineFireRedAsrModelConfig struct {\n\tEncoder string\n\tDecoder string\n}\n\ntype OfflineFunASRNanoModelConfig struct {\n\tEncoderAdaptor              string\n\tLLM                         string\n\tEmbedding                   string\n\tTokenizer                   string\n\tSystemPrompt                string\n\tUserPrompt                  string\n\tMaxNewTokens                int\n\tTemperature                 float32\n\tTopP                        float32\n\tSeed                        int\n\tLanguage                    string\n\tUseInverseTextNormalization int\n\tHotwords                    string\n}\n\n// For Moonshine v1, you need 4 models:\n//   - preprocessor, encoder, uncached_decoder, cached_decoder\n//\n// For Moonshine v2, you need 2 models:\n//   - encoder, merged_decoder\ntype OfflineMoonshineModelConfig struct {\n\tPreprocessor    string\n\tEncoder         string\n\tUncachedDecoder string\n\tCachedDecoder   string\n\tMergedDecoder   string\n}\n\ntype OfflineTdnnModelConfig struct {\n\tModel string\n}\n\ntype 
OfflineSenseVoiceModelConfig struct {\n\tModel                       string\n\tLanguage                    string\n\tUseInverseTextNormalization int\n}\n\n// Configuration for offline LM.\ntype OfflineLMConfig struct {\n\tModel string  // Path to the model\n\tScale float32 // scale for LM score\n}\n\ntype OfflineModelConfig struct {\n\tTransducer    OfflineTransducerModelConfig\n\tParaformer    OfflineParaformerModelConfig\n\tNemoCTC       OfflineNemoEncDecCtcModelConfig\n\tWhisper       OfflineWhisperModelConfig\n\tTdnn          OfflineTdnnModelConfig\n\tSenseVoice    OfflineSenseVoiceModelConfig\n\tMoonshine     OfflineMoonshineModelConfig\n\tFireRedAsr    OfflineFireRedAsrModelConfig\n\tFunAsrNano    OfflineFunASRNanoModelConfig\n\tDolphin       OfflineDolphinModelConfig\n\tZipformerCtc  OfflineZipformerCtcModelConfig\n\tCanary        OfflineCanaryModelConfig\n\tWenetCtc      OfflineWenetCtcModelConfig\n\tOmnilingual   OfflineOmnilingualAsrCtcModelConfig\n\tMedAsr        OfflineMedAsrCtcModelConfig\n\tFireRedAsrCtc OfflineFireRedAsrCtcModelConfig\n\tTokens        string // Path to tokens.txt\n\n\t// Number of threads to use for neural network computation\n\tNumThreads int\n\n\t// 1 to print model meta information while loading\n\tDebug int\n\n\t// Optional. Valid values: cpu, cuda, coreml\n\tProvider string\n\n\t// Optional. Specify it for faster model initialization.\n\tModelType string\n\n\tModelingUnit  string // Optional. 
cjkchar, bpe, cjkchar+bpe\n\tBpeVocab      string // Optional.\n\tTeleSpeechCtc string // Optional.\n}\n\n// Configuration for the offline/non-streaming recognizer.\ntype OfflineRecognizerConfig struct {\n\tFeatConfig  FeatureConfig\n\tModelConfig OfflineModelConfig\n\tLmConfig    OfflineLMConfig\n\n\t// Valid decoding method: greedy_search, modified_beam_search\n\tDecodingMethod string\n\n\t// Used only when DecodingMethod is modified_beam_search.\n\tMaxActivePaths int\n\tHotwordsFile   string\n\tHotwordsScore  float32\n\tBlankPenalty   float32\n\tRuleFsts       string\n\tRuleFars       string\n\tHr             HomophoneReplacerConfig\n}\n\n// It wraps a pointer from C\ntype OfflineRecognizer struct {\n\timpl *C.struct_SherpaOnnxOfflineRecognizer\n}\n\n// It wraps a pointer from C\ntype OfflineStream struct {\n\timpl *C.struct_SherpaOnnxOfflineStream\n}\n\n// It contains recognition result of an offline stream.\ntype OfflineRecognizerResult struct {\n\tText       string\n\tTokens     []string\n\tTimestamps []float32\n\tDurations  []float32\n\tLang       string\n\tEmotion    string\n\tEvent      string\n}\n\nfunc newCOfflineRecognizerConfig(config *OfflineRecognizerConfig) *C.struct_SherpaOnnxOfflineRecognizerConfig {\n\tc := C.struct_SherpaOnnxOfflineRecognizerConfig{}\n\tc.feat_config.sample_rate = C.int(config.FeatConfig.SampleRate)\n\tc.feat_config.feature_dim = C.int(config.FeatConfig.FeatureDim)\n\n\tc.model_config.transducer.encoder = C.CString(config.ModelConfig.Transducer.Encoder)\n\tc.model_config.transducer.decoder = C.CString(config.ModelConfig.Transducer.Decoder)\n\tc.model_config.transducer.joiner = C.CString(config.ModelConfig.Transducer.Joiner)\n\n\tc.model_config.paraformer.model = C.CString(config.ModelConfig.Paraformer.Model)\n\n\tc.model_config.nemo_ctc.model = C.CString(config.ModelConfig.NemoCTC.Model)\n\n\tc.model_config.whisper.encoder = C.CString(config.ModelConfig.Whisper.Encoder)\n\tc.model_config.whisper.decoder = 
C.CString(config.ModelConfig.Whisper.Decoder)\n\tc.model_config.whisper.language = C.CString(config.ModelConfig.Whisper.Language)\n\tc.model_config.whisper.task = C.CString(config.ModelConfig.Whisper.Task)\n\tc.model_config.whisper.tail_paddings = C.int(config.ModelConfig.Whisper.TailPaddings)\n\tc.model_config.whisper.enable_token_timestamps = C.int(config.ModelConfig.Whisper.EnableTokenTimestamps)\n\tc.model_config.whisper.enable_segment_timestamps = C.int(config.ModelConfig.Whisper.EnableSegmentTimestamps)\n\n\tc.model_config.tdnn.model = C.CString(config.ModelConfig.Tdnn.Model)\n\n\tc.model_config.sense_voice.model = C.CString(config.ModelConfig.SenseVoice.Model)\n\tc.model_config.sense_voice.language = C.CString(config.ModelConfig.SenseVoice.Language)\n\tc.model_config.sense_voice.use_itn = C.int(config.ModelConfig.SenseVoice.UseInverseTextNormalization)\n\n\tc.model_config.moonshine.preprocessor = C.CString(config.ModelConfig.Moonshine.Preprocessor)\n\tc.model_config.moonshine.encoder = C.CString(config.ModelConfig.Moonshine.Encoder)\n\tc.model_config.moonshine.uncached_decoder = C.CString(config.ModelConfig.Moonshine.UncachedDecoder)\n\tc.model_config.moonshine.cached_decoder = C.CString(config.ModelConfig.Moonshine.CachedDecoder)\n\tc.model_config.moonshine.merged_decoder = C.CString(config.ModelConfig.Moonshine.MergedDecoder)\n\n\tc.model_config.fire_red_asr.encoder = C.CString(config.ModelConfig.FireRedAsr.Encoder)\n\tc.model_config.fire_red_asr.decoder = C.CString(config.ModelConfig.FireRedAsr.Decoder)\n\n\tc.model_config.funasr_nano.encoder_adaptor = C.CString(config.ModelConfig.FunAsrNano.EncoderAdaptor)\n\tc.model_config.funasr_nano.llm = C.CString(config.ModelConfig.FunAsrNano.LLM)\n\tc.model_config.funasr_nano.embedding = C.CString(config.ModelConfig.FunAsrNano.Embedding)\n\tc.model_config.funasr_nano.tokenizer = C.CString(config.ModelConfig.FunAsrNano.Tokenizer)\n\tc.model_config.funasr_nano.system_prompt = 
C.CString(config.ModelConfig.FunAsrNano.SystemPrompt)\n\tc.model_config.funasr_nano.user_prompt = C.CString(config.ModelConfig.FunAsrNano.UserPrompt)\n\tc.model_config.funasr_nano.max_new_tokens = C.int(config.ModelConfig.FunAsrNano.MaxNewTokens)\n\tc.model_config.funasr_nano.temperature = C.float(config.ModelConfig.FunAsrNano.Temperature)\n\tc.model_config.funasr_nano.top_p = C.float(config.ModelConfig.FunAsrNano.TopP)\n\tc.model_config.funasr_nano.seed = C.int(config.ModelConfig.FunAsrNano.Seed)\n\tc.model_config.funasr_nano.language = C.CString(config.ModelConfig.FunAsrNano.Language)\n\tc.model_config.funasr_nano.itn = C.int(config.ModelConfig.FunAsrNano.UseInverseTextNormalization)\n\tc.model_config.funasr_nano.hotwords = C.CString(config.ModelConfig.FunAsrNano.Hotwords)\n\n\tc.model_config.dolphin.model = C.CString(config.ModelConfig.Dolphin.Model)\n\tc.model_config.zipformer_ctc.model = C.CString(config.ModelConfig.ZipformerCtc.Model)\n\n\tc.model_config.canary.encoder = C.CString(config.ModelConfig.Canary.Encoder)\n\tc.model_config.canary.decoder = C.CString(config.ModelConfig.Canary.Decoder)\n\tc.model_config.canary.src_lang = C.CString(config.ModelConfig.Canary.SrcLang)\n\tc.model_config.canary.tgt_lang = C.CString(config.ModelConfig.Canary.TgtLang)\n\tc.model_config.canary.use_pnc = C.int(config.ModelConfig.Canary.UsePnc)\n\n\tc.model_config.wenet_ctc.model = C.CString(config.ModelConfig.WenetCtc.Model)\n\n\tc.model_config.omnilingual.model = C.CString(config.ModelConfig.Omnilingual.Model)\n\tc.model_config.medasr.model = C.CString(config.ModelConfig.MedAsr.Model)\n\tc.model_config.fire_red_asr_ctc.model = C.CString(config.ModelConfig.FireRedAsrCtc.Model)\n\n\tc.model_config.tokens = C.CString(config.ModelConfig.Tokens)\n\n\tc.model_config.num_threads = C.int(config.ModelConfig.NumThreads)\n\n\tc.model_config.debug = C.int(config.ModelConfig.Debug)\n\n\tc.model_config.provider = C.CString(config.ModelConfig.Provider)\n\n\tc.model_config.model_type = 
C.CString(config.ModelConfig.ModelType)\n\n\tc.model_config.modeling_unit = C.CString(config.ModelConfig.ModelingUnit)\n\n\tc.model_config.bpe_vocab = C.CString(config.ModelConfig.BpeVocab)\n\n\tc.model_config.telespeech_ctc = C.CString(config.ModelConfig.TeleSpeechCtc)\n\n\tc.lm_config.model = C.CString(config.LmConfig.Model)\n\tc.lm_config.scale = C.float(config.LmConfig.Scale)\n\n\tc.decoding_method = C.CString(config.DecodingMethod)\n\n\tc.max_active_paths = C.int(config.MaxActivePaths)\n\n\tc.hotwords_file = C.CString(config.HotwordsFile)\n\tc.hotwords_score = C.float(config.HotwordsScore)\n\n\tc.blank_penalty = C.float(config.BlankPenalty)\n\n\tc.rule_fsts = C.CString(config.RuleFsts)\n\tc.rule_fars = C.CString(config.RuleFars)\n\n\tc.hr.lexicon = C.CString(config.Hr.Lexicon)\n\tc.hr.rule_fsts = C.CString(config.Hr.RuleFsts)\n\treturn &c\n}\nfunc freeCOfflineRecognizerConfig(c *C.struct_SherpaOnnxOfflineRecognizerConfig) {\n\tstringFields := []*(*C.char){\n\t\t&c.model_config.transducer.encoder,\n\t\t&c.model_config.transducer.decoder,\n\t\t&c.model_config.transducer.joiner,\n\t\t&c.model_config.paraformer.model,\n\t\t&c.model_config.nemo_ctc.model,\n\t\t&c.model_config.whisper.encoder,\n\t\t&c.model_config.whisper.decoder,\n\t\t&c.model_config.whisper.language,\n\t\t&c.model_config.whisper.task,\n\t\t&c.model_config.tdnn.model,\n\t\t&c.model_config.sense_voice.model,\n\t\t&c.model_config.sense_voice.language,\n\t\t&c.model_config.moonshine.preprocessor,\n\t\t&c.model_config.moonshine.encoder,\n\t\t&c.model_config.moonshine.uncached_decoder,\n\t\t&c.model_config.moonshine.cached_decoder,\n\t\t&c.model_config.moonshine.merged_decoder,\n\t\t&c.model_config.fire_red_asr.encoder,\n\t\t&c.model_config.fire_red_asr.decoder,\n\t\t&c.model_config.funasr_nano.encoder_adaptor,\n\t\t&c.model_config.funasr_nano.llm,\n\t\t&c.model_config.funasr_nano.embedding,\n\t\t&c.model_config.funasr_nano.tokenizer,\n\t\t&c.model_config.funasr_nano.system_prompt,\n\t\t&c.model_config.f
unasr_nano.user_prompt,\n\t\t&c.model_config.funasr_nano.language,\n\t\t&c.model_config.funasr_nano.hotwords,\n\t\t&c.model_config.dolphin.model,\n\t\t&c.model_config.zipformer_ctc.model,\n\t\t&c.model_config.canary.encoder,\n\t\t&c.model_config.canary.decoder,\n\t\t&c.model_config.canary.src_lang,\n\t\t&c.model_config.canary.tgt_lang,\n\t\t&c.model_config.wenet_ctc.model,\n\t\t&c.model_config.medasr.model,\n\t\t&c.model_config.fire_red_asr_ctc.model,\n\t\t&c.model_config.omnilingual.model,\n\t\t&c.model_config.tokens,\n\t\t&c.model_config.provider,\n\t\t&c.model_config.model_type,\n\t\t&c.model_config.modeling_unit,\n\t\t&c.model_config.bpe_vocab,\n\t\t&c.model_config.telespeech_ctc,\n\t\t&c.lm_config.model,\n\t\t&c.decoding_method,\n\t\t&c.hotwords_file,\n\t\t&c.rule_fsts,\n\t\t&c.rule_fars,\n\t\t&c.hr.lexicon,\n\t\t&c.hr.rule_fsts,\n\t}\n\n\tfor _, field := range stringFields {\n\t\tif *field != nil {\n\t\t\tC.free(unsafe.Pointer(*field))\n\t\t\t*field = nil\n\t\t}\n\t}\n}\n\n// Frees the internal pointer of the recognizer to avoid a memory leak.\nfunc DeleteOfflineRecognizer(recognizer *OfflineRecognizer) {\n\tC.SherpaOnnxDestroyOfflineRecognizer(recognizer.impl)\n\trecognizer.impl = nil\n}\n\n// The caller is responsible for invoking [DeleteOfflineRecognizer]() to free\n// the returned recognizer and avoid a memory leak.\nfunc NewOfflineRecognizer(config *OfflineRecognizerConfig) *OfflineRecognizer {\n\tc := newCOfflineRecognizerConfig(config)\n\tdefer freeCOfflineRecognizerConfig(c)\n\n\timpl := C.SherpaOnnxCreateOfflineRecognizer(c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\trecognizer := &OfflineRecognizer{}\n\trecognizer.impl = impl\n\n\treturn recognizer\n}\n\n// SetConfig replaces the recognizer's current configuration with config.\nfunc (r *OfflineRecognizer) SetConfig(config *OfflineRecognizerConfig) {\n\tc := newCOfflineRecognizerConfig(config)\n\tdefer freeCOfflineRecognizerConfig(c)\n\n\tC.SherpaOnnxOfflineRecognizerSetConfig(r.impl, c)\n}\n\n// Frees the internal pointer of the stream to avoid memory 
leak.\nfunc DeleteOfflineStream(stream *OfflineStream) {\n\tC.SherpaOnnxDestroyOfflineStream(stream.impl)\n\tstream.impl = nil\n}\n\n// The caller is responsible for invoking [DeleteOfflineStream]() to free\n// the returned stream and avoid a memory leak.\nfunc NewOfflineStream(recognizer *OfflineRecognizer) *OfflineStream {\n\tstream := &OfflineStream{}\n\tstream.impl = C.SherpaOnnxCreateOfflineStream(recognizer.impl)\n\treturn stream\n}\n\n// AcceptWaveform feeds audio samples to the offline stream.\n// Call it only once per stream; that is, pass all samples in a single call.\n//\n// sampleRate is the sample rate of the input audio samples. If it is different\n// from the value expected by the feature extractor, the samples are resampled internally.\n//\n// samples contains the actual audio samples. Each sample is in the range [-1, 1].\nfunc (s *OfflineStream) AcceptWaveform(sampleRate int, samples []float32) {\n\t// Guard against an empty slice: &samples[0] would panic.\n\tif len(samples) == 0 {\n\t\treturn\n\t}\n\tC.SherpaOnnxAcceptWaveformOffline(s.impl, C.int(sampleRate), (*C.float)(&samples[0]), C.int(len(samples)))\n}\n\n// Set a key-value option on the offline stream.\n// This provides a generic mechanism for passing per-stream runtime parameters\n// to the recognizer (e.g., \"task\", \"prompt\").\nfunc (s *OfflineStream) SetOption(key string, value string) {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\tcValue := C.CString(value)\n\tdefer C.free(unsafe.Pointer(cValue))\n\tC.SherpaOnnxOfflineStreamSetOption(s.impl, cKey, cValue)\n}\n\n// Get a key-value option from the offline stream.\n// Returns an empty string if the option is not set.\nfunc (s *OfflineStream) GetOption(key string) string {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\treturn C.GoString(C.SherpaOnnxOfflineStreamGetOption(s.impl, cKey))\n}\n\n// Check whether the given option exists in the offline stream.\n// Return true if the option exists. 
Return false otherwise.\nfunc (s *OfflineStream) HasOption(key string) bool {\n\tcKey := C.CString(key)\n\tdefer C.free(unsafe.Pointer(cKey))\n\treturn C.SherpaOnnxOfflineStreamHasOption(s.impl, cKey) == 1\n}\n\n// Decode the offline stream.\nfunc (recognizer *OfflineRecognizer) Decode(s *OfflineStream) {\n\tC.SherpaOnnxDecodeOfflineStream(recognizer.impl, s.impl)\n}\n\n// Decode multiple streams in parallel, i.e., in batch.\nfunc (recognizer *OfflineRecognizer) DecodeStreams(s []*OfflineStream) {\n\tss := make([]*C.struct_SherpaOnnxOfflineStream, len(s))\n\tfor i, v := range s {\n\t\tss[i] = v.impl\n\t}\n\n\tC.SherpaOnnxDecodeMultipleOfflineStreams(recognizer.impl, &ss[0], C.int(len(s)))\n}\n\n// Get the recognition result of the offline stream.\nfunc (s *OfflineStream) GetResult() *OfflineRecognizerResult {\n\tp := C.SherpaOnnxGetOfflineStreamResult(s.impl)\n\tdefer C.SherpaOnnxDestroyOfflineRecognizerResult(p)\n\tn := int(p.count)\n\tif n == 0 {\n\t\treturn nil\n\t}\n\tresult := &OfflineRecognizerResult{}\n\tresult.Text = C.GoString(p.text)\n\tresult.Lang = C.GoString(p.lang)\n\tresult.Emotion = C.GoString(p.emotion)\n\tresult.Event = C.GoString(p.event)\n\tresult.Tokens = make([]string, n)\n\ttokens := unsafe.Slice(p.tokens_arr, n)\n\tfor i := 0; i < n; i++ {\n\t\tresult.Tokens[i] = C.GoString(tokens[i])\n\t}\n\tif p.timestamps != nil {\n\t\tresult.Timestamps = make([]float32, n)\n\t\ttimestamps := unsafe.Slice(p.timestamps, n)\n\t\tfor i := 0; i < n; i++ {\n\t\t\tresult.Timestamps[i] = float32(timestamps[i])\n\t\t}\n\t}\n\tif p.durations != nil {\n\t\tresult.Durations = make([]float32, n)\n\t\tdurations := unsafe.Slice(p.durations, n)\n\t\tfor i := 0; i < n; i++ {\n\t\t\tresult.Durations[i] = float32(durations[i])\n\t\t}\n\t}\n\treturn result\n}\n\n// Configuration for offline/non-streaming text-to-speech (TTS).\n//\n// Please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\n// to download pre-trained models\ntype 
OfflineTtsVitsModelConfig struct {\n\tModel       string  // Path to the VITS onnx model\n\tLexicon     string  // Path to lexicon.txt\n\tTokens      string  // Path to tokens.txt\n\tDataDir     string  // Path to espeak-ng-data directory\n\tNoiseScale  float32 // noise scale for vits models. Please use 0.667 in general\n\tNoiseScaleW float32 // noise scale for vits models. Please use 0.8 in general\n\tLengthScale float32 // Please use 1.0 in general. Smaller -> Faster speech speed. Larger -> Slower speech speed\n\tDictDir     string  // unused\n}\n\ntype OfflineTtsMatchaModelConfig struct {\n\tAcousticModel string  // Path to the acoustic model for MatchaTTS\n\tVocoder       string  // Path to the vocoder model for MatchaTTS\n\tLexicon       string  // Path to lexicon.txt\n\tTokens        string  // Path to tokens.txt\n\tDataDir       string  // Path to espeak-ng-data directory\n\tNoiseScale    float32 // noise scale for vits models. Please use 0.667 in general\n\tLengthScale   float32 // Please use 1.0 in general. Smaller -> Faster speech speed. Larger -> Slower speech speed\n\tDictDir       string  // unused\n}\n\ntype OfflineTtsKokoroModelConfig struct {\n\tModel       string  // Path to the model for kokoro\n\tVoices      string  // Path to the voices.bin for kokoro\n\tTokens      string  // Path to tokens.txt\n\tDataDir     string  // Path to espeak-ng-data directory\n\tDictDir     string  // unused\n\tLexicon     string  // Path to lexicon files\n\tLang        string  // Example: es for Spanish, fr-fr for French. Can be empty\n\tLengthScale float32 // Please use 1.0 in general. Smaller -> Faster speech speed. Larger -> Slower speech speed\n}\n\ntype OfflineTtsKittenModelConfig struct {\n\tModel       string  // Path to the model for kitten\n\tVoices      string  // Path to the voices.bin for kitten\n\tTokens      string  // Path to tokens.txt\n\tDataDir     string  // Path to espeak-ng-data directory\n\tLengthScale float32 // Please use 1.0 in general. 
Smaller -> Faster speech speed. Larger -> Slower speech speed\n}\n\ntype OfflineTtsPocketModelConfig struct {\n\tLmFlow                      string // lm_flow\n\tLmMain                      string // lm_main\n\tEncoder                     string // encoder\n\tDecoder                     string // decoder\n\tTextConditioner             string // text_conditioner\n\tVocabJson                   string // vocab_json\n\tTokenScoresJson             string // token_scores_json\n\tVoiceEmbeddingCacheCapacity int    // voice_embedding_cache_capacity\n}\n\ntype OfflineTtsZipvoiceModelConfig struct {\n\tTokens  string // Path to tokens.txt for ZipVoice\n\tEncoder string // Path to text encoder (e.g. encoder.onnx)\n\tDecoder string // Path to flow-matching decoder (e.g. fm_decoder.onnx)\n\tDataDir string // Path to espeak-ng-data\n\tLexicon string // Path to lexicon.txt (needed for zh)\n\tVocoder string // Path to vocoder (e.g. vocos_24khz.onnx)\n\n\tFeatScale     float32 // Feature scale\n\tTShift        float32 // t-shift (<1 shifts to smaller t)\n\tTargetRms     float32 // Target RMS for speech normalization\n\tGuidanceScale float32 // CFG scale\n}\n\ntype OfflineTtsSupertonicModelConfig struct {\n\tDurationPredictor string // Path to duration_predictor.onnx\n\tTextEncoder       string // Path to text_encoder.onnx\n\tVectorEstimator   string // Path to vector_estimator.onnx\n\tVocoder           string // Path to vocoder.onnx\n\tTtsJson           string // Path to tts.json\n\tUnicodeIndexer    string // Path to unicode_indexer.bin\n\tVoiceStyle        string // Path to voice.bin\n}\n\ntype OfflineTtsModelConfig struct {\n\tVits       OfflineTtsVitsModelConfig\n\tMatcha     OfflineTtsMatchaModelConfig\n\tKokoro     OfflineTtsKokoroModelConfig\n\tKitten     OfflineTtsKittenModelConfig\n\tZipvoice   OfflineTtsZipvoiceModelConfig\n\tPocket     OfflineTtsPocketModelConfig\n\tSupertonic OfflineTtsSupertonicModelConfig\n\n\t// Number of threads to use for neural network 
computation\n\tNumThreads int\n\n\t// 1 to print model meta information while loading\n\tDebug int\n\n\t// Optional. Valid values: cpu, cuda, coreml\n\tProvider string\n}\n\ntype OfflineTtsConfig struct {\n\tModel           OfflineTtsModelConfig\n\tRuleFsts        string\n\tRuleFars        string\n\tMaxNumSentences int\n\tSilenceScale    float32\n}\n\ntype GeneratedAudio struct {\n\t// Normalized samples in the range [-1, 1]\n\tSamples []float32\n\n\tSampleRate int\n}\n\ntype GenerationConfig struct {\n\tSilenceScale float32\n\tSpeed        float32\n\tSid          int\n\n\tReferenceAudio      []float32\n\tReferenceSampleRate int\n\tReferenceText       string\n\n\tNumSteps int\n\n\t// Opaque JSON passed directly to C\n\tExtra json.RawMessage\n}\n\n// The offline tts class. It wraps a pointer from C.\ntype OfflineTts struct {\n\timpl *C.struct_SherpaOnnxOfflineTts\n}\n\ntype sherpaOnnxGeneratedAudioCallbackWithArg func(samples []float32) bool\n\n//export _cgoGeneratedAudioCallback\nfunc _cgoGeneratedAudioCallback(\n\tsamples *C.float,\n\tn C.int32_t,\n\targ unsafe.Pointer,\n) C.int32_t {\n\n\th := *(*cgo.Handle)(arg)\n\tcb := h.Value().(sherpaOnnxGeneratedAudioCallbackWithArg)\n\n\tnn := int(n)\n\tarr := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(samples)),\n\t\tnn,\n\t)\n\n\tall := make([]float32, nn)\n\tcopy(all, arr)\n\n\t// Prevent panics from crossing the C boundary\n\tvar ret bool\n\tfunc() {\n\t\tdefer func() {\n\t\t\tif r := recover(); r != nil {\n\t\t\t\tret = false\n\t\t\t}\n\t\t}()\n\t\tret = cb(all)\n\t}()\n\n\tif ret {\n\t\treturn 1\n\t}\n\treturn 0\n}\n\ntype sherpaOnnxGeneratedAudioProgressCallbackWithArg func(samples []float32, p float32) bool\n\n//export _cgoGeneratedAudioProgressCallback\nfunc _cgoGeneratedAudioProgressCallback(\n\tsamples *C.float,\n\tn C.int32_t,\n\tp C.float,\n\targ unsafe.Pointer,\n) C.int32_t {\n\n\th := *(*cgo.Handle)(arg)\n\tcb := h.Value().(sherpaOnnxGeneratedAudioProgressCallbackWithArg)\n\n\tnn := int(n)\n\tarr := 
unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(samples)),\n\t\tnn,\n\t)\n\n\tall := make([]float32, nn)\n\tcopy(all, arr)\n\n\t// Prevent panics from crossing the C boundary\n\tvar ret bool\n\tfunc() {\n\t\tdefer func() {\n\t\t\tif r := recover(); r != nil {\n\t\t\t\tret = false\n\t\t\t}\n\t\t}()\n\t\tret = cb(all, float32(p))\n\t}()\n\n\tif ret {\n\t\treturn 1\n\t}\n\treturn 0\n}\n\n// Free the internal pointer inside the tts to avoid memory leak.\nfunc DeleteOfflineTts(tts *OfflineTts) {\n\tC.SherpaOnnxDestroyOfflineTts(tts.impl)\n\ttts.impl = nil\n}\n\n// The user is responsible to invoke [DeleteOfflineTts]() to free\n// the returned tts to avoid memory leak\nfunc NewOfflineTts(config *OfflineTtsConfig) *OfflineTts {\n\tc := C.struct_SherpaOnnxOfflineTtsConfig{}\n\n\tc.rule_fsts = C.CString(config.RuleFsts)\n\tdefer C.free(unsafe.Pointer(c.rule_fsts))\n\n\tc.rule_fars = C.CString(config.RuleFars)\n\tdefer C.free(unsafe.Pointer(c.rule_fars))\n\n\tc.max_num_sentences = C.int(config.MaxNumSentences)\n\tc.silence_scale = C.float(config.SilenceScale)\n\n\t// vits\n\tc.model.vits.model = C.CString(config.Model.Vits.Model)\n\tdefer C.free(unsafe.Pointer(c.model.vits.model))\n\n\tc.model.vits.lexicon = C.CString(config.Model.Vits.Lexicon)\n\tdefer C.free(unsafe.Pointer(c.model.vits.lexicon))\n\n\tc.model.vits.tokens = C.CString(config.Model.Vits.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model.vits.tokens))\n\n\tc.model.vits.data_dir = C.CString(config.Model.Vits.DataDir)\n\tdefer C.free(unsafe.Pointer(c.model.vits.data_dir))\n\n\tc.model.vits.noise_scale = C.float(config.Model.Vits.NoiseScale)\n\tc.model.vits.noise_scale_w = C.float(config.Model.Vits.NoiseScaleW)\n\tc.model.vits.length_scale = C.float(config.Model.Vits.LengthScale)\n\n\t// matcha\n\tc.model.matcha.acoustic_model = C.CString(config.Model.Matcha.AcousticModel)\n\tdefer C.free(unsafe.Pointer(c.model.matcha.acoustic_model))\n\n\tc.model.matcha.vocoder = C.CString(config.Model.Matcha.Vocoder)\n\tdefer 
C.free(unsafe.Pointer(c.model.matcha.vocoder))\n\n\tc.model.matcha.lexicon = C.CString(config.Model.Matcha.Lexicon)\n\tdefer C.free(unsafe.Pointer(c.model.matcha.lexicon))\n\n\tc.model.matcha.tokens = C.CString(config.Model.Matcha.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model.matcha.tokens))\n\n\tc.model.matcha.data_dir = C.CString(config.Model.Matcha.DataDir)\n\tdefer C.free(unsafe.Pointer(c.model.matcha.data_dir))\n\n\tc.model.matcha.noise_scale = C.float(config.Model.Matcha.NoiseScale)\n\tc.model.matcha.length_scale = C.float(config.Model.Matcha.LengthScale)\n\n\t// kokoro\n\tc.model.kokoro.model = C.CString(config.Model.Kokoro.Model)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.model))\n\n\tc.model.kokoro.voices = C.CString(config.Model.Kokoro.Voices)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.voices))\n\n\tc.model.kokoro.tokens = C.CString(config.Model.Kokoro.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.tokens))\n\n\tc.model.kokoro.data_dir = C.CString(config.Model.Kokoro.DataDir)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.data_dir))\n\n\tc.model.kokoro.lexicon = C.CString(config.Model.Kokoro.Lexicon)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.lexicon))\n\n\tc.model.kokoro.lang = C.CString(config.Model.Kokoro.Lang)\n\tdefer C.free(unsafe.Pointer(c.model.kokoro.lang))\n\n\tc.model.kokoro.length_scale = C.float(config.Model.Kokoro.LengthScale)\n\n\t// kitten\n\tc.model.kitten.model = C.CString(config.Model.Kitten.Model)\n\tdefer C.free(unsafe.Pointer(c.model.kitten.model))\n\n\tc.model.kitten.voices = C.CString(config.Model.Kitten.Voices)\n\tdefer C.free(unsafe.Pointer(c.model.kitten.voices))\n\n\tc.model.kitten.tokens = C.CString(config.Model.Kitten.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model.kitten.tokens))\n\n\tc.model.kitten.data_dir = C.CString(config.Model.Kitten.DataDir)\n\tdefer C.free(unsafe.Pointer(c.model.kitten.data_dir))\n\n\tc.model.kitten.length_scale = C.float(config.Model.Kitten.LengthScale)\n\n\t// 
zipvoice\n\tc.model.zipvoice.tokens = C.CString(config.Model.Zipvoice.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.tokens))\n\n\tc.model.zipvoice.encoder = C.CString(config.Model.Zipvoice.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.encoder))\n\n\tc.model.zipvoice.decoder = C.CString(config.Model.Zipvoice.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.decoder))\n\n\tc.model.zipvoice.vocoder = C.CString(config.Model.Zipvoice.Vocoder)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.vocoder))\n\n\tc.model.zipvoice.data_dir = C.CString(config.Model.Zipvoice.DataDir)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.data_dir))\n\n\tc.model.zipvoice.lexicon = C.CString(config.Model.Zipvoice.Lexicon)\n\tdefer C.free(unsafe.Pointer(c.model.zipvoice.lexicon))\n\n\tc.model.zipvoice.feat_scale = C.float(config.Model.Zipvoice.FeatScale)\n\tc.model.zipvoice.t_shift = C.float(config.Model.Zipvoice.TShift)\n\tc.model.zipvoice.target_rms = C.float(config.Model.Zipvoice.TargetRms)\n\tc.model.zipvoice.guidance_scale = C.float(config.Model.Zipvoice.GuidanceScale)\n\n\t// pocket\n\tc.model.pocket.lm_flow = C.CString(config.Model.Pocket.LmFlow)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.lm_flow))\n\n\tc.model.pocket.lm_main = C.CString(config.Model.Pocket.LmMain)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.lm_main))\n\n\tc.model.pocket.encoder = C.CString(config.Model.Pocket.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.encoder))\n\n\tc.model.pocket.decoder = C.CString(config.Model.Pocket.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.decoder))\n\n\tc.model.pocket.text_conditioner = C.CString(config.Model.Pocket.TextConditioner)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.text_conditioner))\n\n\tc.model.pocket.vocab_json = C.CString(config.Model.Pocket.VocabJson)\n\tdefer C.free(unsafe.Pointer(c.model.pocket.vocab_json))\n\n\tc.model.pocket.token_scores_json = C.CString(config.Model.Pocket.TokenScoresJson)\n\tdefer 
C.free(unsafe.Pointer(c.model.pocket.token_scores_json))\n\n\tc.model.pocket.voice_embedding_cache_capacity = C.int(config.Model.Pocket.VoiceEmbeddingCacheCapacity)\n\n\t// supertonic\n\tc.model.supertonic.duration_predictor = C.CString(config.Model.Supertonic.DurationPredictor)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.duration_predictor))\n\n\tc.model.supertonic.text_encoder = C.CString(config.Model.Supertonic.TextEncoder)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.text_encoder))\n\n\tc.model.supertonic.vector_estimator = C.CString(config.Model.Supertonic.VectorEstimator)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.vector_estimator))\n\n\tc.model.supertonic.vocoder = C.CString(config.Model.Supertonic.Vocoder)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.vocoder))\n\n\tc.model.supertonic.tts_json = C.CString(config.Model.Supertonic.TtsJson)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.tts_json))\n\n\tc.model.supertonic.unicode_indexer = C.CString(config.Model.Supertonic.UnicodeIndexer)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.unicode_indexer))\n\n\tc.model.supertonic.voice_style = C.CString(config.Model.Supertonic.VoiceStyle)\n\tdefer C.free(unsafe.Pointer(c.model.supertonic.voice_style))\n\n\tc.model.num_threads = C.int(config.Model.NumThreads)\n\tc.model.debug = C.int(config.Model.Debug)\n\n\tc.model.provider = C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(c.model.provider))\n\n\timpl := C.SherpaOnnxCreateOfflineTts(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\ttts := &OfflineTts{}\n\ttts.impl = impl\n\treturn tts\n}\n\nfunc (tts *OfflineTts) NumSpeakers() int {\n\treturn int(C.SherpaOnnxOfflineTtsNumSpeakers(tts.impl))\n}\n\nfunc (tts *OfflineTts) SampleRate() int {\n\treturn int(C.SherpaOnnxOfflineTtsSampleRate(tts.impl))\n}\n\nfunc (tts *OfflineTts) Generate(text string, sid int, speed float32) *GeneratedAudio {\n\ts := C.CString(text)\n\tdefer C.free(unsafe.Pointer(s))\n\n\taudio := 
C.SherpaOnnxOfflineTtsGenerate(tts.impl, s, C.int(sid), C.float(speed))\n\n\tif audio == nil {\n\t\treturn nil\n\t}\n\n\tdefer C.SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n\n\tans := &GeneratedAudio{}\n\tans.SampleRate = int(audio.sample_rate)\n\tn := int(audio.n)\n\tans.Samples = make([]float32, n)\n\n\t// View the C sample buffer as a Go slice (no copy); the copy below\n\t// moves the samples into Go-managed memory before the C buffer is freed.\n\tsamples := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(audio.samples)),\n\t\tn,\n\t)\n\n\tcopy(ans.Samples, samples)\n\n\treturn ans\n}\n\n// Deprecated: Use GenerateWithConfig() instead.\nfunc (tts *OfflineTts) GenerateWithZipvoice(\n\ttext, promptText string,\n\tpromptSamples []float32,\n\tpromptSampleRate int,\n\tspeed float32,\n\tnumSteps int,\n) *GeneratedAudio {\n\n\tcText := C.CString(text)\n\tdefer C.free(unsafe.Pointer(cText))\n\n\tcPromptText := C.CString(promptText)\n\tdefer C.free(unsafe.Pointer(cPromptText))\n\n\tvar p *C.float\n\tvar n C.int\n\tif len(promptSamples) > 0 {\n\t\tp = (*C.float)(unsafe.Pointer(&promptSamples[0]))\n\t\tn = C.int(len(promptSamples))\n\t}\n\n\taudio := C.SherpaOnnxOfflineTtsGenerateWithZipvoice(\n\t\ttts.impl,\n\t\tcText,\n\t\tcPromptText,\n\t\tp,\n\t\tn,\n\t\tC.int(promptSampleRate),\n\t\tC.float(speed),\n\t\tC.int(numSteps),\n\t)\n\tif audio == nil {\n\t\treturn nil\n\t}\n\tdefer C.SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n\n\tnn := int(audio.n)\n\tarr := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(audio.samples)),\n\t\tnn,\n\t)\n\n\tans := &GeneratedAudio{\n\t\tSampleRate: int(audio.sample_rate),\n\t\tSamples:    make([]float32, nn),\n\t}\n\tcopy(ans.Samples, arr)\n\n\treturn ans\n}\n\nfunc (tts *OfflineTts) GenerateWithCallback(\n\ttext string,\n\tsid int,\n\tspeed float32,\n\tcb sherpaOnnxGeneratedAudioCallbackWithArg,\n) *GeneratedAudio {\n\n\ts := C.CString(text)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tvar audio *C.struct_SherpaOnnxGeneratedAudio\n\n\tif cb 
!= nil {\n\t\th := cgo.NewHandle(cb)\n\t\tdefer h.Delete()\n\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n\t\t\ttts.impl,\n\t\t\ts,\n\t\t\tC.int(sid),\n\t\t\tC.float(speed),\n\t\t\tC.SherpaOnnxGeneratedAudioCallbackWithArg(C._cgoGeneratedAudioCallback),\n\t\t\tunsafe.Pointer(&h),\n\t\t)\n\t} else {\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n\t\t\ttts.impl,\n\t\t\ts,\n\t\t\tC.int(sid),\n\t\t\tC.float(speed),\n\t\t\tnil,\n\t\t\tnil,\n\t\t)\n\t}\n\n\tif audio == nil {\n\t\treturn nil\n\t}\n\tdefer C.SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n\n\tn := int(audio.n)\n\tsamples := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(audio.samples)),\n\t\tn,\n\t)\n\n\tans := &GeneratedAudio{\n\t\tSampleRate: int(audio.sample_rate),\n\t\tSamples:    make([]float32, n),\n\t}\n\tcopy(ans.Samples, samples)\n\n\treturn ans\n}\n\nfunc (tts *OfflineTts) GenerateWithProgressCallback(\n\ttext string,\n\tsid int,\n\tspeed float32,\n\tcb sherpaOnnxGeneratedAudioProgressCallbackWithArg,\n) *GeneratedAudio {\n\ts := C.CString(text)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tvar audio *C.struct_SherpaOnnxGeneratedAudio\n\n\tif cb != nil {\n\t\th := cgo.NewHandle(cb)\n\t\tdefer h.Delete()\n\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n\t\t\ttts.impl,\n\t\t\ts,\n\t\t\tC.int(sid),\n\t\t\tC.float(speed),\n\t\t\tC.SherpaOnnxGeneratedAudioProgressCallbackWithArg(\n\t\t\t\tC._cgoGeneratedAudioProgressCallback,\n\t\t\t),\n\t\t\tunsafe.Pointer(&h),\n\t\t)\n\t} else {\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n\t\t\ttts.impl,\n\t\t\ts,\n\t\t\tC.int(sid),\n\t\t\tC.float(speed),\n\t\t\tnil,\n\t\t\tnil,\n\t\t)\n\t}\n\n\tif audio == nil {\n\t\treturn nil\n\t}\n\tdefer C.SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n\n\tn := int(audio.n)\n\tsamples := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(audio.samples)),\n\t\tn,\n\t)\n\n\tans := &GeneratedAudio{\n\t\tSampleRate: 
int(audio.sample_rate),\n\t\tSamples:    make([]float32, n),\n\t}\n\tcopy(ans.Samples, samples)\n\n\treturn ans\n}\n\nfunc (tts *OfflineTts) GenerateWithConfig(\n\ttext string,\n\tcfg *GenerationConfig,\n\tcb sherpaOnnxGeneratedAudioProgressCallbackWithArg,\n) *GeneratedAudio {\n\tif cfg == nil {\n\t\tcfg = &GenerationConfig{}\n\t}\n\n\tcText := C.CString(text)\n\tdefer C.free(unsafe.Pointer(cText))\n\n\tvar cCfg C.struct_SherpaOnnxGenerationConfig\n\tcCfg.silence_scale = C.float(cfg.SilenceScale)\n\tcCfg.speed = C.float(cfg.Speed)\n\tcCfg.sid = C.int(cfg.Sid)\n\tcCfg.num_steps = C.int(cfg.NumSteps)\n\n\tvar cReferenceAudio *C.float\n\tif len(cfg.ReferenceAudio) > 0 {\n\t\tcReferenceAudio = (*C.float)(C.malloc(C.size_t(len(cfg.ReferenceAudio)) * C.size_t(unsafe.Sizeof(C.float(0)))))\n\t\tslice := (*[1 << 30]C.float)(unsafe.Pointer(cReferenceAudio))[:len(cfg.ReferenceAudio):len(cfg.ReferenceAudio)]\n\t\tfor i, v := range cfg.ReferenceAudio {\n\t\t\tslice[i] = C.float(v)\n\t\t}\n\t\tcCfg.reference_audio = cReferenceAudio\n\t\tcCfg.reference_audio_len = C.int(len(cfg.ReferenceAudio))\n\t\tcCfg.reference_sample_rate = C.int(cfg.ReferenceSampleRate)\n\t\tdefer C.free(unsafe.Pointer(cReferenceAudio)) // free after use\n\t}\n\n\t// Reference text\n\tif cfg.ReferenceText != \"\" {\n\t\tcCfg.reference_text = C.CString(cfg.ReferenceText)\n\t\tdefer C.free(unsafe.Pointer(cCfg.reference_text))\n\t}\n\n\tvar cExtra *C.char\n\n\tif len(cfg.Extra) > 0 {\n\t\tcExtra = C.CString(string(cfg.Extra)) // copy Go slice to C memory\n\t\tdefer C.free(unsafe.Pointer(cExtra))  // free after use\n\t}\n\n\tcCfg.extra = cExtra\n\n\tvar audio *C.struct_SherpaOnnxGeneratedAudio\n\tif cb != nil {\n\t\th := cgo.NewHandle(cb)\n\t\tdefer h.Delete()\n\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithConfig(\n\t\t\ttts.impl,\n\t\t\tcText,\n\t\t\t&cCfg,\n\t\t\tC.SherpaOnnxGeneratedAudioProgressCallbackWithArg(\n\t\t\t\tC._cgoGeneratedAudioProgressCallback,\n\t\t\t),\n\t\t\tunsafe.Pointer(&h),\n\t\t)\n\t} 
else {\n\t\taudio = C.SherpaOnnxOfflineTtsGenerateWithConfig(\n\t\t\ttts.impl,\n\t\t\tcText,\n\t\t\t&cCfg,\n\t\t\tnil,\n\t\t\tnil,\n\t\t)\n\t}\n\n\tif audio == nil {\n\t\treturn nil\n\t}\n\tdefer C.SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n\n\tn := int(audio.n)\n\tarr := unsafe.Slice(\n\t\t(*float32)(unsafe.Pointer(audio.samples)),\n\t\tn,\n\t)\n\n\tans := &GeneratedAudio{\n\t\tSampleRate: int(audio.sample_rate),\n\t\tSamples:    make([]float32, n),\n\t}\n\tcopy(ans.Samples, arr)\n\n\treturn ans\n}\n\nfunc (audio *GeneratedAudio) Save(filename string) bool {\n\ts := C.CString(filename)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tok := int(C.SherpaOnnxWriteWave((*C.float)(&audio.Samples[0]), C.int(len(audio.Samples)), C.int(audio.SampleRate), s))\n\n\treturn ok == 1\n}\n\nfunc (audio *GeneratedAudio) ToBuffer() []byte {\n\t// Similar to Save(): it writes the wave to an allocated buffer;\n\t// Uses the C API: SHERPA_ONNX_API void SherpaOnnxWriteWaveToBuffer(const float *samples, int32_t n, int32_t sample_rate, char *buffer);\n\tn := len(audio.Samples)\n\tif n == 0 {\n\t\treturn nil\n\t}\n\tfs := C.SherpaOnnxWaveFileSize(C.int(n)) // SHERPA_ONNX_API int64_t SherpaOnnxWaveFileSize(int32_t n_samples);\n\tbuf := make([]byte, fs)\n\tC.SherpaOnnxWriteWaveToBuffer((*C.float)(&audio.Samples[0]), C.int(n), C.int(audio.SampleRate), (*C.char)(unsafe.Pointer(&buf[0])))\n\treturn buf\n}\n\n// ============================================================\n// For VAD\n// ============================================================\ntype SileroVadModelConfig struct {\n\tModel              string\n\tThreshold          float32\n\tMinSilenceDuration float32\n\tMinSpeechDuration  float32\n\tWindowSize         int\n\tMaxSpeechDuration  float32\n}\n\ntype TenVadModelConfig struct {\n\tModel              string\n\tThreshold          float32\n\tMinSilenceDuration float32\n\tMinSpeechDuration  float32\n\tWindowSize         int\n\tMaxSpeechDuration  float32\n}\n\ntype VadModelConfig struct 
{\n\tSileroVad  SileroVadModelConfig\n\tTenVad     TenVadModelConfig\n\tSampleRate int\n\tNumThreads int\n\tProvider   string\n\tDebug      int\n}\n\ntype CircularBuffer struct {\n\timpl *C.struct_SherpaOnnxCircularBuffer\n}\n\nfunc DeleteCircularBuffer(buffer *CircularBuffer) {\n\tC.SherpaOnnxDestroyCircularBuffer(buffer.impl)\n\tbuffer.impl = nil\n}\n\nfunc NewCircularBuffer(capacity int) *CircularBuffer {\n\tcircularBuffer := &CircularBuffer{}\n\tcircularBuffer.impl = C.SherpaOnnxCreateCircularBuffer(C.int(capacity))\n\treturn circularBuffer\n}\n\nfunc (buffer *CircularBuffer) Push(samples []float32) {\n\tC.SherpaOnnxCircularBufferPush(buffer.impl, (*C.float)(&samples[0]), C.int(len(samples)))\n}\n\nfunc (buffer *CircularBuffer) Get(start int, n int) []float32 {\n\tsamples := C.SherpaOnnxCircularBufferGet(buffer.impl, C.int(start), C.int(n))\n\tdefer C.SherpaOnnxCircularBufferFree(samples)\n\n\tresult := make([]float32, n)\n\n\tp := unsafe.Slice(samples, n)\n\tfor i := 0; i < n; i++ {\n\t\tresult[i] = float32(p[i])\n\t}\n\n\treturn result\n}\n\nfunc (buffer *CircularBuffer) Pop(n int) {\n\tC.SherpaOnnxCircularBufferPop(buffer.impl, C.int(n))\n}\n\nfunc (buffer *CircularBuffer) Size() int {\n\treturn int(C.SherpaOnnxCircularBufferSize(buffer.impl))\n}\n\nfunc (buffer *CircularBuffer) Head() int {\n\treturn int(C.SherpaOnnxCircularBufferHead(buffer.impl))\n}\n\nfunc (buffer *CircularBuffer) Reset() {\n\tC.SherpaOnnxCircularBufferReset(buffer.impl)\n}\n\ntype SpeechSegment struct {\n\tStart   int\n\tSamples []float32\n}\n\ntype VoiceActivityDetector struct {\n\timpl *C.struct_SherpaOnnxVoiceActivityDetector\n}\n\nfunc NewVoiceActivityDetector(config *VadModelConfig, bufferSizeInSeconds float32) *VoiceActivityDetector {\n\tc := C.struct_SherpaOnnxVadModelConfig{}\n\n\tc.silero_vad.model = C.CString(config.SileroVad.Model)\n\tdefer C.free(unsafe.Pointer(c.silero_vad.model))\n\n\tc.silero_vad.threshold = 
C.float(config.SileroVad.Threshold)\n\tc.silero_vad.min_silence_duration = C.float(config.SileroVad.MinSilenceDuration)\n\tc.silero_vad.min_speech_duration = C.float(config.SileroVad.MinSpeechDuration)\n\tc.silero_vad.window_size = C.int(config.SileroVad.WindowSize)\n\tc.silero_vad.max_speech_duration = C.float(config.SileroVad.MaxSpeechDuration)\n\n\tc.ten_vad.model = C.CString(config.TenVad.Model)\n\tdefer C.free(unsafe.Pointer(c.ten_vad.model))\n\n\tc.ten_vad.threshold = C.float(config.TenVad.Threshold)\n\tc.ten_vad.min_silence_duration = C.float(config.TenVad.MinSilenceDuration)\n\tc.ten_vad.min_speech_duration = C.float(config.TenVad.MinSpeechDuration)\n\tc.ten_vad.window_size = C.int(config.TenVad.WindowSize)\n\tc.ten_vad.max_speech_duration = C.float(config.TenVad.MaxSpeechDuration)\n\n\tc.sample_rate = C.int(config.SampleRate)\n\tc.num_threads = C.int(config.NumThreads)\n\tc.provider = C.CString(config.Provider)\n\tdefer C.free(unsafe.Pointer(c.provider))\n\n\tc.debug = C.int(config.Debug)\n\n\timpl := C.SherpaOnnxCreateVoiceActivityDetector(&c, C.float(bufferSizeInSeconds))\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tvad := &VoiceActivityDetector{}\n\tvad.impl = impl\n\treturn vad\n}\n\nfunc DeleteVoiceActivityDetector(vad *VoiceActivityDetector) {\n\tC.SherpaOnnxDestroyVoiceActivityDetector(vad.impl)\n\tvad.impl = nil\n}\n\nfunc (vad *VoiceActivityDetector) AcceptWaveform(samples []float32) {\n\tC.SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad.impl, (*C.float)(&samples[0]), C.int(len(samples)))\n}\n\nfunc (vad *VoiceActivityDetector) IsEmpty() bool {\n\treturn int(C.SherpaOnnxVoiceActivityDetectorEmpty(vad.impl)) == 1\n}\n\nfunc (vad *VoiceActivityDetector) IsSpeech() bool {\n\treturn int(C.SherpaOnnxVoiceActivityDetectorDetected(vad.impl)) == 1\n}\n\nfunc (vad *VoiceActivityDetector) Pop() {\n\tC.SherpaOnnxVoiceActivityDetectorPop(vad.impl)\n}\n\nfunc (vad *VoiceActivityDetector) Clear() {\n\tC.SherpaOnnxVoiceActivityDetectorClear(vad.impl)\n}\n\nfunc 
(vad *VoiceActivityDetector) Front() *SpeechSegment {\n\tf := C.SherpaOnnxVoiceActivityDetectorFront(vad.impl)\n\tdefer C.SherpaOnnxDestroySpeechSegment(f)\n\n\tans := &SpeechSegment{}\n\tans.Start = int(f.start)\n\n\tn := int(f.n)\n\tans.Samples = make([]float32, n)\n\n\tsamples := unsafe.Slice(f.samples, n)\n\n\tfor i := 0; i < n; i++ {\n\t\tans.Samples[i] = float32(samples[i])\n\t}\n\n\treturn ans\n}\n\nfunc (vad *VoiceActivityDetector) Reset() {\n\tC.SherpaOnnxVoiceActivityDetectorReset(vad.impl)\n}\n\nfunc (vad *VoiceActivityDetector) Flush() {\n\tC.SherpaOnnxVoiceActivityDetectorFlush(vad.impl)\n}\n\n// Spoken language identification\n\ntype SpokenLanguageIdentificationWhisperConfig struct {\n\tEncoder      string\n\tDecoder      string\n\tTailPaddings int\n}\n\ntype SpokenLanguageIdentificationConfig struct {\n\tWhisper    SpokenLanguageIdentificationWhisperConfig\n\tNumThreads int\n\tDebug      int\n\tProvider   string\n}\n\ntype SpokenLanguageIdentification struct {\n\timpl *C.struct_SherpaOnnxSpokenLanguageIdentification\n}\n\ntype SpokenLanguageIdentificationResult struct {\n\tLang string\n}\n\nfunc NewSpokenLanguageIdentification(config *SpokenLanguageIdentificationConfig) *SpokenLanguageIdentification {\n\tc := C.struct_SherpaOnnxSpokenLanguageIdentificationConfig{}\n\n\tc.whisper.encoder = C.CString(config.Whisper.Encoder)\n\tdefer C.free(unsafe.Pointer(c.whisper.encoder))\n\n\tc.whisper.decoder = C.CString(config.Whisper.Decoder)\n\tdefer C.free(unsafe.Pointer(c.whisper.decoder))\n\n\tc.whisper.tail_paddings = C.int(config.Whisper.TailPaddings)\n\n\tc.num_threads = C.int(config.NumThreads)\n\tc.debug = C.int(config.Debug)\n\n\tc.provider = C.CString(config.Provider)\n\tdefer C.free(unsafe.Pointer(c.provider))\n\n\tslid := &SpokenLanguageIdentification{}\n\tslid.impl = C.SherpaOnnxCreateSpokenLanguageIdentification(&c)\n\n\treturn slid\n}\n\nfunc DeleteSpokenLanguageIdentification(slid *SpokenLanguageIdentification) 
{\n\tC.SherpaOnnxDestroySpokenLanguageIdentification(slid.impl)\n\tslid.impl = nil\n}\n\n// The user has to invoke DeleteOfflineStream() to free the returned value\n// to avoid memory leak\nfunc (slid *SpokenLanguageIdentification) CreateStream() *OfflineStream {\n\tstream := &OfflineStream{}\n\tstream.impl = C.SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(slid.impl)\n\treturn stream\n}\n\nfunc (slid *SpokenLanguageIdentification) Compute(stream *OfflineStream) *SpokenLanguageIdentificationResult {\n\tr := C.SherpaOnnxSpokenLanguageIdentificationCompute(slid.impl, stream.impl)\n\t// defer C.SherpaOnnxDestroySpokenLanguageIdentificationResult(r)\n\n\tans := &SpokenLanguageIdentificationResult{}\n\tans.Lang = C.GoString(r.lang)\n\n\treturn ans\n}\n\n// ============================================================\n// For speaker embedding extraction\n// ============================================================\n\ntype SpeakerEmbeddingExtractorConfig struct {\n\tModel      string\n\tNumThreads int\n\tDebug      int\n\tProvider   string\n}\n\ntype SpeakerEmbeddingExtractor struct {\n\timpl *C.struct_SherpaOnnxSpeakerEmbeddingExtractor\n}\n\n// The user has to invoke [DeleteSpeakerEmbeddingExtractor]() to free the returned value\n// to avoid memory leak\nfunc NewSpeakerEmbeddingExtractor(config *SpeakerEmbeddingExtractorConfig) *SpeakerEmbeddingExtractor {\n\tc := C.struct_SherpaOnnxSpeakerEmbeddingExtractorConfig{}\n\n\tc.model = C.CString(config.Model)\n\tdefer C.free(unsafe.Pointer(c.model))\n\n\tc.num_threads = C.int(config.NumThreads)\n\tc.debug = C.int(config.Debug)\n\n\tc.provider = C.CString(config.Provider)\n\tdefer C.free(unsafe.Pointer(c.provider))\n\n\timpl := C.SherpaOnnxCreateSpeakerEmbeddingExtractor(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tex := &SpeakerEmbeddingExtractor{}\n\tex.impl = impl\n\treturn ex\n}\n\nfunc DeleteSpeakerEmbeddingExtractor(ex *SpeakerEmbeddingExtractor) 
{\n\tC.SherpaOnnxDestroySpeakerEmbeddingExtractor(ex.impl)\n\tex.impl = nil\n}\n\nfunc (ex *SpeakerEmbeddingExtractor) Dim() int {\n\treturn int(C.SherpaOnnxSpeakerEmbeddingExtractorDim(ex.impl))\n}\n\n// The user is responsible to invoke [DeleteOnlineStream]() to free\n// the returned stream to avoid memory leak\nfunc (ex *SpeakerEmbeddingExtractor) CreateStream() *OnlineStream {\n\tstream := &OnlineStream{}\n\tstream.impl = C.SherpaOnnxSpeakerEmbeddingExtractorCreateStream(ex.impl)\n\treturn stream\n}\n\nfunc (ex *SpeakerEmbeddingExtractor) IsReady(stream *OnlineStream) bool {\n\treturn int(C.SherpaOnnxSpeakerEmbeddingExtractorIsReady(ex.impl, stream.impl)) == 1\n}\n\nfunc (ex *SpeakerEmbeddingExtractor) Compute(stream *OnlineStream) []float32 {\n\tembedding := C.SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(ex.impl, stream.impl)\n\tdefer C.SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(embedding)\n\n\tn := ex.Dim()\n\tans := make([]float32, n)\n\n\t// see https://stackoverflow.com/questions/48756732/what-does-1-30c-yourtype-do-exactly-in-cgo\n\t// :n:n means 0:n:n, means low:high:capacity\n\tc := unsafe.Slice(embedding, n)\n\n\tfor i := 0; i < n; i++ {\n\t\tans[i] = float32(c[i])\n\t}\n\n\treturn ans\n}\n\ntype SpeakerEmbeddingManager struct {\n\timpl *C.struct_SherpaOnnxSpeakerEmbeddingManager\n}\n\n// The user has to invoke [DeleteSpeakerEmbeddingManager]() to free the returned\n// value to avoid memory leak\nfunc NewSpeakerEmbeddingManager(dim int) *SpeakerEmbeddingManager {\n\timpl := C.SherpaOnnxCreateSpeakerEmbeddingManager(C.int(dim))\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tm := &SpeakerEmbeddingManager{}\n\tm.impl = impl\n\treturn m\n}\n\nfunc DeleteSpeakerEmbeddingManager(m *SpeakerEmbeddingManager) {\n\tC.SherpaOnnxDestroySpeakerEmbeddingManager(m.impl)\n\tm.impl = nil\n}\n\nfunc (m *SpeakerEmbeddingManager) Register(name string, embedding []float32) bool {\n\ts := C.CString(name)\n\tdefer C.free(unsafe.Pointer(s))\n\n\treturn 
C.int(C.SherpaOnnxSpeakerEmbeddingManagerAdd(m.impl, s, (*C.float)(&embedding[0]))) == 1\n}\n\nfunc (m *SpeakerEmbeddingManager) RegisterV(name string, embeddings [][]float32) bool {\n\ts := C.CString(name)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tif len(embeddings) == 0 {\n\t\treturn false\n\t}\n\n\tdim := len(embeddings[0])\n\tv := make([]float32, 0, dim*len(embeddings))\n\tfor _, embedding := range embeddings {\n\t\tv = append(v, embedding...)\n\t}\n\n\treturn C.int(C.SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(m.impl, s, (*C.float)(&v[0]), C.int(len(embeddings)))) == 1\n}\n\nfunc (m *SpeakerEmbeddingManager) Remove(name string) bool {\n\ts := C.CString(name)\n\tdefer C.free(unsafe.Pointer(s))\n\n\treturn C.int(C.SherpaOnnxSpeakerEmbeddingManagerRemove(m.impl, s)) == 1\n}\n\nfunc (m *SpeakerEmbeddingManager) Search(embedding []float32, threshold float32) string {\n\tvar s string\n\n\tname := C.SherpaOnnxSpeakerEmbeddingManagerSearch(m.impl, (*C.float)(&embedding[0]), C.float(threshold))\n\tdefer C.SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name)\n\n\tif name != nil {\n\t\ts = C.GoString(name)\n\t}\n\n\treturn s\n}\n\nfunc (m *SpeakerEmbeddingManager) Verify(name string, embedding []float32, threshold float32) bool {\n\ts := C.CString(name)\n\tdefer C.free(unsafe.Pointer(s))\n\n\treturn C.int(C.SherpaOnnxSpeakerEmbeddingManagerVerify(m.impl, s, (*C.float)(&embedding[0]), C.float(threshold))) == 1\n}\n\nfunc (m *SpeakerEmbeddingManager) Contains(name string) bool {\n\ts := C.CString(name)\n\tdefer C.free(unsafe.Pointer(s))\n\n\treturn C.int(C.SherpaOnnxSpeakerEmbeddingManagerContains(m.impl, s)) == 1\n}\n\nfunc (m *SpeakerEmbeddingManager) NumSpeakers() int {\n\treturn int(C.SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(m.impl))\n}\n\nfunc (m *SpeakerEmbeddingManager) AllSpeakers() []string {\n\tall_speakers := C.SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(m.impl)\n\tdefer C.SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(all_speakers)\n\n\tn := 
m.NumSpeakers()\n\tif n == 0 {\n\t\treturn nil\n\t}\n\n\t// https://stackoverflow.com/questions/62012070/convert-array-of-strings-from-cgo-in-go\n\tp := unsafe.Slice(all_speakers, n)\n\n\tans := make([]string, n)\n\n\tfor i := 0; i < n; i++ {\n\t\tans[i] = C.GoString(p[i])\n\t}\n\n\treturn ans\n}\n\n// Wave\n\n// single channel wave\ntype Wave = GeneratedAudio\n\nfunc ReadWave(filename string) *Wave {\n\ts := C.CString(filename)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tw := C.SherpaOnnxReadWave(s)\n\tdefer C.SherpaOnnxFreeWave(w)\n\n\tif w == nil {\n\t\treturn nil\n\t}\n\n\tn := int(w.num_samples)\n\tif n == 0 {\n\t\treturn nil\n\t}\n\n\tans := &Wave{}\n\tans.SampleRate = int(w.sample_rate)\n\tsamples := unsafe.Slice(w.samples, n)\n\n\tans.Samples = make([]float32, n)\n\n\tfor i := 0; i < n; i++ {\n\t\tans.Samples[i] = float32(samples[i])\n\t}\n\n\treturn ans\n}\n\n// ============================================================\n// For offline speaker diarization\n// ============================================================\ntype OfflineSpeakerSegmentationPyannoteModelConfig struct {\n\tModel string\n}\n\ntype OfflineSpeakerSegmentationModelConfig struct {\n\tPyannote   OfflineSpeakerSegmentationPyannoteModelConfig\n\tNumThreads int\n\tDebug      int\n\tProvider   string\n}\n\ntype FastClusteringConfig struct {\n\tNumClusters int\n\tThreshold   float32\n}\n\ntype OfflineSpeakerDiarizationConfig struct {\n\tSegmentation   OfflineSpeakerSegmentationModelConfig\n\tEmbedding      SpeakerEmbeddingExtractorConfig\n\tClustering     FastClusteringConfig\n\tMinDurationOn  float32\n\tMinDurationOff float32\n}\n\ntype OfflineSpeakerDiarization struct {\n\timpl *C.struct_SherpaOnnxOfflineSpeakerDiarization\n}\n\nfunc DeleteOfflineSpeakerDiarization(sd *OfflineSpeakerDiarization) {\n\tC.SherpaOnnxDestroyOfflineSpeakerDiarization(sd.impl)\n\tsd.impl = nil\n}\n\nfunc NewOfflineSpeakerDiarization(config *OfflineSpeakerDiarizationConfig) *OfflineSpeakerDiarization {\n\tc := 
C.struct_SherpaOnnxOfflineSpeakerDiarizationConfig{}\n\tc.segmentation.pyannote.model = C.CString(config.Segmentation.Pyannote.Model)\n\tdefer C.free(unsafe.Pointer(c.segmentation.pyannote.model))\n\n\tc.segmentation.num_threads = C.int(config.Segmentation.NumThreads)\n\n\tc.segmentation.debug = C.int(config.Segmentation.Debug)\n\n\tc.segmentation.provider = C.CString(config.Segmentation.Provider)\n\tdefer C.free(unsafe.Pointer(c.segmentation.provider))\n\n\tc.embedding.model = C.CString(config.Embedding.Model)\n\tdefer C.free(unsafe.Pointer(c.embedding.model))\n\n\tc.embedding.num_threads = C.int(config.Embedding.NumThreads)\n\n\tc.embedding.debug = C.int(config.Embedding.Debug)\n\n\tc.embedding.provider = C.CString(config.Embedding.Provider)\n\tdefer C.free(unsafe.Pointer(c.embedding.provider))\n\n\tc.clustering.num_clusters = C.int(config.Clustering.NumClusters)\n\tc.clustering.threshold = C.float(config.Clustering.Threshold)\n\tc.min_duration_on = C.float(config.MinDurationOn)\n\tc.min_duration_off = C.float(config.MinDurationOff)\n\n\tp := C.SherpaOnnxCreateOfflineSpeakerDiarization(&c)\n\n\tif p == nil {\n\t\treturn nil\n\t}\n\n\tsd := &OfflineSpeakerDiarization{}\n\tsd.impl = p\n\n\treturn sd\n}\n\nfunc (sd *OfflineSpeakerDiarization) SampleRate() int {\n\treturn int(C.SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(sd.impl))\n}\n\n// only config.Clustering is used. 
All other fields are ignored\nfunc (sd *OfflineSpeakerDiarization) SetConfig(config *OfflineSpeakerDiarizationConfig) {\n\tc := C.struct_SherpaOnnxOfflineSpeakerDiarizationConfig{}\n\n\tc.clustering.num_clusters = C.int(config.Clustering.NumClusters)\n\tc.clustering.threshold = C.float(config.Clustering.Threshold)\n\n\tC.SherpaOnnxOfflineSpeakerDiarizationSetConfig(sd.impl, &c)\n}\n\ntype OfflineSpeakerDiarizationSegment struct {\n\tStart   float32\n\tEnd     float32\n\tSpeaker int\n}\n\nfunc (sd *OfflineSpeakerDiarization) Process(samples []float32) []OfflineSpeakerDiarizationSegment {\n\tr := C.SherpaOnnxOfflineSpeakerDiarizationProcess(sd.impl, (*C.float)(&samples[0]), C.int(len(samples)))\n\tdefer C.SherpaOnnxOfflineSpeakerDiarizationDestroyResult(r)\n\n\tn := int(C.SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(r))\n\n\tif n == 0 {\n\t\treturn nil\n\t}\n\n\ts := C.SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(r)\n\tdefer C.SherpaOnnxOfflineSpeakerDiarizationDestroySegment(s)\n\n\tans := make([]OfflineSpeakerDiarizationSegment, n)\n\n\tp := unsafe.Slice(s, n)\n\n\tfor i := 0; i < n; i++ {\n\t\tans[i].Start = float32(p[i].start)\n\t\tans[i].End = float32(p[i].end)\n\t\tans[i].Speaker = int(p[i].speaker)\n\t}\n\n\treturn ans\n}\n\n// ============================================================\n// For punctuation\n// ============================================================\ntype OfflinePunctuationModelConfig struct {\n\tCtTransformer string\n\tNumThreads    int\n\tDebug         int // true to print debug information of the model\n\tProvider      string\n}\n\ntype OfflinePunctuationConfig struct {\n\tModel OfflinePunctuationModelConfig\n}\n\ntype OfflinePunctuation struct {\n\timpl *C.struct_SherpaOnnxOfflinePunctuation\n}\n\nfunc NewOfflinePunctuation(config *OfflinePunctuationConfig) *OfflinePunctuation {\n\tcfg := C.struct_SherpaOnnxOfflinePunctuationConfig{}\n\tcfg.model.ct_transformer = C.CString(config.Model.CtTransformer)\n\tdefer 
C.free(unsafe.Pointer(cfg.model.ct_transformer))\n\n\tcfg.model.num_threads = C.int(config.Model.NumThreads)\n\tcfg.model.debug = C.int(config.Model.Debug)\n\tcfg.model.provider = C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(cfg.model.provider))\n\n\timpl := C.SherpaOnnxCreateOfflinePunctuation(&cfg)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tpunc := &OfflinePunctuation{}\n\tpunc.impl = impl\n\treturn punc\n}\n\nfunc DeleteOfflinePunc(punc *OfflinePunctuation) {\n\tC.SherpaOnnxDestroyOfflinePunctuation(punc.impl)\n\tpunc.impl = nil\n}\n\nfunc (punc *OfflinePunctuation) AddPunct(text string) string {\n\tinputText := C.CString(text)\n\tdefer C.free(unsafe.Pointer(inputText))\n\tp := C.SherpaOfflinePunctuationAddPunct(punc.impl, inputText)\n\tif p == nil {\n\t\treturn \"\"\n\t}\n\tdefer C.SherpaOfflinePunctuationFreeText(p)\n\n\ttext_with_punct := C.GoString(p)\n\n\treturn text_with_punct\n}\n\ntype OnlinePunctuationModelConfig struct {\n\tCnnBilstm  string\n\tBpeVocab   string\n\tNumThreads int\n\tDebug      int\n\tProvider   string\n}\n\ntype OnlinePunctuationConfig struct {\n\tModel OnlinePunctuationModelConfig\n}\n\ntype OnlinePunctuation struct {\n\timpl *C.struct_SherpaOnnxOnlinePunctuation\n}\n\nfunc NewOnlinePunctuation(config *OnlinePunctuationConfig) *OnlinePunctuation {\n\tcfg := C.struct_SherpaOnnxOnlinePunctuationConfig{}\n\tcfg.model.cnn_bilstm = C.CString(config.Model.CnnBilstm)\n\tdefer C.free(unsafe.Pointer(cfg.model.cnn_bilstm))\n\n\tcfg.model.bpe_vocab = C.CString(config.Model.BpeVocab)\n\tdefer C.free(unsafe.Pointer(cfg.model.bpe_vocab))\n\n\tcfg.model.num_threads = C.int(config.Model.NumThreads)\n\tcfg.model.debug = C.int(config.Model.Debug)\n\tcfg.model.provider = C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(cfg.model.provider))\n\n\timpl := C.SherpaOnnxCreateOnlinePunctuation(&cfg)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tpunc := &OnlinePunctuation{}\n\tpunc.impl = impl\n\treturn punc\n}\n\nfunc 
DeleteOnlinePunctuation(punc *OnlinePunctuation) {\n\tC.SherpaOnnxDestroyOnlinePunctuation(punc.impl)\n\tpunc.impl = nil\n}\n\nfunc (punc *OnlinePunctuation) AddPunct(text string) string {\n\tinputText := C.CString(text)\n\tdefer C.free(unsafe.Pointer(inputText))\n\n\tp := C.SherpaOnnxOnlinePunctuationAddPunct(punc.impl, inputText)\n\tif p == nil {\n\t\treturn \"\"\n\t}\n\tdefer C.SherpaOnnxOnlinePunctuationFreeText(p)\n\n\ttextWithPunct := C.GoString(p)\n\n\treturn textWithPunct\n}\n\n// Configuration for the keyword spotter.\ntype KeywordSpotterConfig struct {\n\tFeatConfig        FeatureConfig\n\tModelConfig       OnlineModelConfig\n\tMaxActivePaths    int\n\tKeywordsFile      string\n\tKeywordsScore     float32\n\tKeywordsThreshold float32\n\tKeywordsBuf       string\n\tKeywordsBufSize   int\n}\n\ntype KeywordSpotterResult struct {\n\tKeyword string\n}\n\ntype KeywordSpotter struct {\n\timpl *C.struct_SherpaOnnxKeywordSpotter\n}\n\n// Free the internal pointer inside the keyword spotter to avoid memory leak.\nfunc DeleteKeywordSpotter(spotter *KeywordSpotter) {\n\tC.SherpaOnnxDestroyKeywordSpotter(spotter.impl)\n\tspotter.impl = nil\n}\n\n// The user is responsible to invoke [DeleteKeywordSpotter]() to free\n// the returned spotter to avoid memory leak\nfunc NewKeywordSpotter(config *KeywordSpotterConfig) *KeywordSpotter {\n\tc := C.struct_SherpaOnnxKeywordSpotterConfig{}\n\tc.feat_config.sample_rate = C.int(config.FeatConfig.SampleRate)\n\tc.feat_config.feature_dim = C.int(config.FeatConfig.FeatureDim)\n\n\tc.model_config.transducer.encoder = C.CString(config.ModelConfig.Transducer.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.transducer.encoder))\n\n\tc.model_config.transducer.decoder = C.CString(config.ModelConfig.Transducer.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.transducer.decoder))\n\n\tc.model_config.transducer.joiner = C.CString(config.ModelConfig.Transducer.Joiner)\n\tdefer 
C.free(unsafe.Pointer(c.model_config.transducer.joiner))\n\n\tc.model_config.paraformer.encoder = C.CString(config.ModelConfig.Paraformer.Encoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.paraformer.encoder))\n\n\tc.model_config.paraformer.decoder = C.CString(config.ModelConfig.Paraformer.Decoder)\n\tdefer C.free(unsafe.Pointer(c.model_config.paraformer.decoder))\n\n\tc.model_config.zipformer2_ctc.model = C.CString(config.ModelConfig.Zipformer2Ctc.Model)\n\tdefer C.free(unsafe.Pointer(c.model_config.zipformer2_ctc.model))\n\n\tc.model_config.nemo_ctc.model = C.CString(config.ModelConfig.NemoCtc.Model)\n\tdefer C.free(unsafe.Pointer(c.model_config.nemo_ctc.model))\n\n\tc.model_config.tokens = C.CString(config.ModelConfig.Tokens)\n\tdefer C.free(unsafe.Pointer(c.model_config.tokens))\n\n\tc.model_config.num_threads = C.int(config.ModelConfig.NumThreads)\n\n\tc.model_config.provider = C.CString(config.ModelConfig.Provider)\n\tdefer C.free(unsafe.Pointer(c.model_config.provider))\n\n\tc.model_config.debug = C.int(config.ModelConfig.Debug)\n\n\tc.model_config.model_type = C.CString(config.ModelConfig.ModelType)\n\tdefer C.free(unsafe.Pointer(c.model_config.model_type))\n\n\tc.model_config.modeling_unit = C.CString(config.ModelConfig.ModelingUnit)\n\tdefer C.free(unsafe.Pointer(c.model_config.modeling_unit))\n\n\tc.model_config.bpe_vocab = C.CString(config.ModelConfig.BpeVocab)\n\tdefer C.free(unsafe.Pointer(c.model_config.bpe_vocab))\n\n\tc.model_config.tokens_buf = C.CString(config.ModelConfig.TokensBuf)\n\tdefer C.free(unsafe.Pointer(c.model_config.tokens_buf))\n\n\tc.model_config.tokens_buf_size = C.int(config.ModelConfig.TokensBufSize)\n\n\tc.max_active_paths = C.int(config.MaxActivePaths)\n\n\tc.keywords_file = C.CString(config.KeywordsFile)\n\tdefer C.free(unsafe.Pointer(c.keywords_file))\n\n\tc.keywords_score = C.float(config.KeywordsScore)\n\n\tc.keywords_threshold = C.float(config.KeywordsThreshold)\n\n\tc.keywords_buf = 
C.CString(config.KeywordsBuf)\n\tdefer C.free(unsafe.Pointer(c.keywords_buf))\n\n\tc.keywords_buf_size = C.int(config.KeywordsBufSize)\n\n\timpl := C.SherpaOnnxCreateKeywordSpotter(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\tspotter := &KeywordSpotter{}\n\tspotter.impl = impl\n\treturn spotter\n}\n\n// The user is responsible to invoke [DeleteOnlineStream]() to free\n// the returned stream to avoid memory leak\nfunc NewKeywordStream(spotter *KeywordSpotter) *OnlineStream {\n\tstream := &OnlineStream{}\n\tstream.impl = C.SherpaOnnxCreateKeywordStream(spotter.impl)\n\treturn stream\n}\n\n// The user is responsible to invoke [DeleteOnlineStream]() to free\n// the returned stream to avoid memory leak\nfunc NewKeywordStreamWithKeywords(spotter *KeywordSpotter, keywords string) *OnlineStream {\n\tstream := &OnlineStream{}\n\n\ts := C.CString(keywords)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tstream.impl = C.SherpaOnnxCreateKeywordStreamWithKeywords(spotter.impl, s)\n\treturn stream\n}\n\n// Check whether the stream has enough feature frames for decoding.\n// Return true if this stream is ready for decoding. Return false otherwise.\n//\n// You will usually use it like below:\n//\n//\tfor spotter.IsReady(s) {\n//\t   spotter.Decode(s)\n//\t}\nfunc (spotter *KeywordSpotter) IsReady(s *OnlineStream) bool {\n\treturn C.SherpaOnnxIsKeywordStreamReady(spotter.impl, s.impl) == 1\n}\n\n// Decode the stream. Before calling this function, you have to ensure\n// that spotter.IsReady(s) returns true. 
Otherwise, you will be SAD.\n//\n// You usually use it like below:\n//\n//\tfor spotter.IsReady(s) {\n//\t  spotter.Decode(s)\n//\t}\nfunc (spotter *KeywordSpotter) Decode(s *OnlineStream) {\n\tC.SherpaOnnxDecodeKeywordStream(spotter.impl, s.impl)\n}\n\n// You MUST call it right after detecting a keyword\nfunc (spotter *KeywordSpotter) Reset(s *OnlineStream) {\n\tC.SherpaOnnxResetKeywordStream(spotter.impl, s.impl)\n}\n\n// Get the current result of stream since the last invoke of Reset()\nfunc (spotter *KeywordSpotter) GetResult(s *OnlineStream) *KeywordSpotterResult {\n\tp := C.SherpaOnnxGetKeywordResult(spotter.impl, s.impl)\n\tdefer C.SherpaOnnxDestroyKeywordResult(p)\n\tresult := &KeywordSpotterResult{}\n\tresult.Keyword = C.GoString(p.keyword)\n\treturn result\n}\n\n// Configuration for the audio tagging.\ntype OfflineZipformerAudioTaggingModelConfig struct {\n\tModel string\n}\n\ntype AudioTaggingModelConfig struct {\n\tZipformer  OfflineZipformerAudioTaggingModelConfig\n\tCed        string\n\tNumThreads int32\n\tDebug      int32\n\tProvider   string\n}\n\ntype AudioTaggingConfig struct {\n\tModel  AudioTaggingModelConfig\n\tLabels string\n\tTopK   int32\n}\n\ntype AudioTagging struct {\n\timpl *C.struct_SherpaOnnxAudioTagging\n}\n\ntype AudioEvent struct {\n\tName  string\n\tIndex int\n\tProb  float32\n}\n\nfunc DeleteAudioTagging(tagging *AudioTagging) {\n\tC.SherpaOnnxDestroyAudioTagging(tagging.impl)\n\ttagging.impl = nil\n}\n\n// The user is responsible to invoke [DeleteAudioTagging]() to free\n// the returned tagger to avoid memory leak\nfunc NewAudioTagging(config *AudioTaggingConfig) *AudioTagging {\n\tc := C.struct_SherpaOnnxAudioTaggingConfig{}\n\n\tc.model.zipformer.model = C.CString(config.Model.Zipformer.Model)\n\tdefer C.free(unsafe.Pointer(c.model.zipformer.model))\n\n\tc.model.ced = C.CString(config.Model.Ced)\n\tdefer C.free(unsafe.Pointer(c.model.ced))\n\n\tc.model.num_threads = C.int(config.Model.NumThreads)\n\n\tc.model.provider = 
C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(c.model.provider))\n\n\tc.model.debug = C.int(config.Model.Debug)\n\n\tc.labels = C.CString(config.Labels)\n\tdefer C.free(unsafe.Pointer(c.labels))\n\n\tc.top_k = C.int(config.TopK)\n\n\timpl := C.SherpaOnnxCreateAudioTagging(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\ttagging := &AudioTagging{}\n\ttagging.impl = impl\n\treturn tagging\n}\n\n// The user is responsible to invoke [DeleteOfflineStream]() to free\n// the returned stream to avoid memory leak\nfunc NewAudioTaggingStream(tagging *AudioTagging) *OfflineStream {\n\tstream := &OfflineStream{}\n\tstream.impl = C.SherpaOnnxAudioTaggingCreateOfflineStream(tagging.impl)\n\treturn stream\n}\n\nfunc (tagging *AudioTagging) Compute(s *OfflineStream, topK int32) []AudioEvent {\n\tr := C.SherpaOnnxAudioTaggingCompute(tagging.impl, s.impl, C.int(topK))\n\tdefer C.SherpaOnnxAudioTaggingFreeResults(r)\n\tresult := make([]AudioEvent, 0)\n\n\tp := (*[1 << 25]*C.struct_SherpaOnnxAudioEvent)(unsafe.Pointer(r))\n\ti := 0\n\tfor {\n\t\tif p[i] == nil {\n\t\t\tbreak\n\t\t}\n\t\tresult = append(result, AudioEvent{\n\t\t\tName:  C.GoString(p[i].name),\n\t\t\tIndex: int(p[i].index),\n\t\t\tProb:  float32(p[i].prob),\n\t\t})\n\t\ti += 1\n\t}\n\treturn result\n}\n\ntype OfflineSpeechDenoiserGtcrnModelConfig struct {\n\tModel string\n}\n\ntype OfflineSpeechDenoiserDpdfNetModelConfig struct {\n\tModel string\n}\n\ntype OfflineSpeechDenoiserModelConfig struct {\n\tGtcrn      OfflineSpeechDenoiserGtcrnModelConfig\n\tDpdfNet    OfflineSpeechDenoiserDpdfNetModelConfig\n\tNumThreads int32\n\tDebug      int32\n\tProvider   string\n}\n\ntype OfflineSpeechDenoiserConfig struct {\n\tModel OfflineSpeechDenoiserModelConfig\n}\n\ntype OfflineSpeechDenoiser struct {\n\timpl *C.struct_SherpaOnnxOfflineSpeechDenoiser\n}\n\ntype OnlineSpeechDenoiserConfig struct {\n\tModel OfflineSpeechDenoiserModelConfig\n}\n\ntype OnlineSpeechDenoiser struct {\n\timpl 
*C.struct_SherpaOnnxOnlineSpeechDenoiser\n}\n\ntype DenoisedAudio struct {\n\t// Normalized samples in the range [-1, 1]\n\tSamples []float32\n\n\tSampleRate int\n}\n\nfunc floatPointer(samples []float32) *C.float {\n\tif len(samples) == 0 {\n\t\treturn nil\n\t}\n\n\treturn (*C.float)(&samples[0])\n}\n\nfunc denoisedAudioFromPointer(audio *C.struct_SherpaOnnxDenoisedAudio) *DenoisedAudio {\n\tif audio == nil {\n\t\treturn &DenoisedAudio{}\n\t}\n\n\tdefer C.SherpaOnnxDestroyDenoisedAudio(audio)\n\n\tans := &DenoisedAudio{}\n\tans.SampleRate = int(audio.sample_rate)\n\tn := int(audio.n)\n\tans.Samples = make([]float32, n)\n\n\tif n == 0 || audio.samples == nil {\n\t\treturn ans\n\t}\n\n\tdenoisedSamples := unsafe.Slice(audio.samples, n)\n\tfor i := 0; i < n; i++ {\n\t\tans.Samples[i] = float32(denoisedSamples[i])\n\t}\n\n\treturn ans\n}\n\n// Free the internal pointer inside the OfflineSpeechDenoiser to avoid memory leak.\nfunc DeleteOfflineSpeechDenoiser(sd *OfflineSpeechDenoiser) {\n\tC.SherpaOnnxDestroyOfflineSpeechDenoiser(sd.impl)\n\tsd.impl = nil\n}\n\n// The user is responsible to invoke [DeleteOfflineSpeechDenoiser]() to free\n// the returned denoiser to avoid memory leak\nfunc NewOfflineSpeechDenoiser(config *OfflineSpeechDenoiserConfig) *OfflineSpeechDenoiser {\n\tc := C.struct_SherpaOnnxOfflineSpeechDenoiserConfig{}\n\tc.model.gtcrn.model = C.CString(config.Model.Gtcrn.Model)\n\tdefer C.free(unsafe.Pointer(c.model.gtcrn.model))\n\tc.model.dpdfnet.model = C.CString(config.Model.DpdfNet.Model)\n\tdefer C.free(unsafe.Pointer(c.model.dpdfnet.model))\n\n\tc.model.num_threads = C.int(config.Model.NumThreads)\n\tc.model.debug = C.int(config.Model.Debug)\n\n\tc.model.provider = C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(c.model.provider))\n\n\timpl := C.SherpaOnnxCreateOfflineSpeechDenoiser(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\n\tsd := &OfflineSpeechDenoiser{}\n\tsd.impl = impl\n\treturn sd\n}\n\nfunc (sd *OfflineSpeechDenoiser) 
Run(samples []float32, sampleRate int) *DenoisedAudio {\n\taudio := C.SherpaOnnxOfflineSpeechDenoiserRun(sd.impl, floatPointer(samples), C.int(len(samples)), C.int(sampleRate))\n\treturn denoisedAudioFromPointer(audio)\n}\n\nfunc (audio *DenoisedAudio) Save(filename string) bool {\n\ts := C.CString(filename)\n\tdefer C.free(unsafe.Pointer(s))\n\n\tok := int(C.SherpaOnnxWriteWave(floatPointer(audio.Samples), C.int(len(audio.Samples)), C.int(audio.SampleRate), s))\n\n\treturn ok == 1\n}\n\nfunc (sd *OfflineSpeechDenoiser) SampleRate() int {\n\treturn int(C.SherpaOnnxOfflineSpeechDenoiserGetSampleRate(sd.impl))\n}\n\n// Free the internal pointer inside the OnlineSpeechDenoiser to avoid memory leak.\nfunc DeleteOnlineSpeechDenoiser(sd *OnlineSpeechDenoiser) {\n\tC.SherpaOnnxDestroyOnlineSpeechDenoiser(sd.impl)\n\tsd.impl = nil\n}\n\n// The user is responsible to invoke [DeleteOnlineSpeechDenoiser]() to free\n// the returned denoiser to avoid memory leak.\nfunc NewOnlineSpeechDenoiser(config *OnlineSpeechDenoiserConfig) *OnlineSpeechDenoiser {\n\tc := C.struct_SherpaOnnxOnlineSpeechDenoiserConfig{}\n\tc.model.gtcrn.model = C.CString(config.Model.Gtcrn.Model)\n\tdefer C.free(unsafe.Pointer(c.model.gtcrn.model))\n\tc.model.dpdfnet.model = C.CString(config.Model.DpdfNet.Model)\n\tdefer C.free(unsafe.Pointer(c.model.dpdfnet.model))\n\n\tc.model.num_threads = C.int(config.Model.NumThreads)\n\tc.model.debug = C.int(config.Model.Debug)\n\n\tc.model.provider = C.CString(config.Model.Provider)\n\tdefer C.free(unsafe.Pointer(c.model.provider))\n\n\timpl := C.SherpaOnnxCreateOnlineSpeechDenoiser(&c)\n\tif impl == nil {\n\t\treturn nil\n\t}\n\n\tsd := &OnlineSpeechDenoiser{}\n\tsd.impl = impl\n\treturn sd\n}\n\nfunc (sd *OnlineSpeechDenoiser) Run(samples []float32, sampleRate int) *DenoisedAudio {\n\taudio := C.SherpaOnnxOnlineSpeechDenoiserRun(sd.impl, floatPointer(samples), C.int(len(samples)), C.int(sampleRate))\n\treturn denoisedAudioFromPointer(audio)\n}\n\nfunc (sd 
*OnlineSpeechDenoiser) Flush() *DenoisedAudio {\n\taudio := C.SherpaOnnxOnlineSpeechDenoiserFlush(sd.impl)\n\treturn denoisedAudioFromPointer(audio)\n}\n\nfunc (sd *OnlineSpeechDenoiser) Reset() {\n\tC.SherpaOnnxOnlineSpeechDenoiserReset(sd.impl)\n}\n\nfunc (sd *OnlineSpeechDenoiser) SampleRate() int {\n\treturn int(C.SherpaOnnxOnlineSpeechDenoiserGetSampleRate(sd.impl))\n}\n\nfunc (sd *OnlineSpeechDenoiser) FrameShiftInSamples() int {\n\treturn int(C.SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(sd.impl))\n}\n\nfunc GetVersion() string {\n\treturn C.GoString(C.SherpaOnnxGetVersionStr())\n}\n\nfunc GetGitSha1() string {\n\treturn C.GoString(C.SherpaOnnxGetGitSha1())\n}\n\nfunc GetGitDate() string {\n\treturn C.GoString(C.SherpaOnnxGetGitDate())\n}\n"
  },
  {
    "path": "scripts/go/ssh_config",
    "content": "Host github.com\n  Hostname github.com\n  User git\n  IdentityFile ~/.ssh/github\n  StrictHostKeyChecking no\n"
  },
  {
    "path": "scripts/gtcrn/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for adding metadata to, inspecting, and testing\nthe GTCRN speech denoising model from\nhttps://github.com/Xiaobin-Rong/gtcrn/blob/main/stream/onnx_models/gtcrn_simple.onnx\n"
  },
  {
    "path": "scripts/gtcrn/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nNodeArg(name='mix', type='tensor(float)', shape=[1, 257, 1, 2])\nNodeArg(name='conv_cache', type='tensor(float)', shape=[2, 1, 16, 16, 33])\nNodeArg(name='tra_cache', type='tensor(float)', shape=[2, 3, 1, 1, 16])\nNodeArg(name='inter_cache', type='tensor(float)', shape=[2, 1, 33, 16])\n-----\nNodeArg(name='enh', type='tensor(float)', shape=[1, 257, 1, 2])\nNodeArg(name='conv_cache_out', type='tensor(float)', shape=[2, 1, 16, 16, 33])\nNodeArg(name='tra_cache_out', type='tensor(float)', shape=[2, 3, 1, 1, 16])\nNodeArg(name='inter_cache_out', type='tensor(float)', shape=[2, 1, 33, 16])\n\"\"\"\n\nimport onnx\nimport onnxruntime as ort\n\n\ndef show(filename):\n    session_opts = ort.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = ort.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    filename = \"./gtcrn_simple.onnx\"\n    show(filename)\n    model = onnx.load(filename)\n\n    meta_data = {\n        \"model_type\": \"gtcrn\",\n        \"comment\": \"gtcrn_simple\",\n        \"version\": 1,\n        \"sample_rate\": 16000,\n        \"model_url\": \"https://github.com/Xiaobin-Rong/gtcrn/blob/main/stream/onnx_models/gtcrn_simple.onnx\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment2\": \"Please see also https://github.com/Xiaobin-Rong/gtcrn\",\n        \"conv_cache_shape\": \"2,1,16,16,33\",\n        \"tra_cache_shape\": \"2,3,1,1,16\",\n        \"inter_cache_shape\": \"2,1,33,16\",\n        \"n_fft\": 512,\n        \"hop_length\": 256,\n        \"window_length\": 512,\n        \"window_type\": \"hann_sqrt\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = 
model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, filename)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/gtcrn/show.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\nimport onnx\n\n\"\"\"\n[key: \"model_type\"\nvalue: \"gtcrn\"\n, key: \"comment\"\nvalue: \"gtcrn_simple\"\n, key: \"version\"\nvalue: \"1\"\n, key: \"sample_rate\"\nvalue: \"16000\"\n, key: \"model_url\"\nvalue: \"https://github.com/Xiaobin-Rong/gtcrn/blob/main/stream/onnx_models/gtcrn_simple.onnx\"\n, key: \"maintainer\"\nvalue: \"k2-fsa\"\n, key: \"comment2\"\nvalue: \"Please see also https://github.com/Xiaobin-Rong/gtcrn\"\n, key: \"conv_cache_shape\"\nvalue: \"2,1,16,16,33\"\n, key: \"tra_cache_shape\"\nvalue: \"2,3,1,1,16\"\n, key: \"inter_cache_shape\"\nvalue: \"2,1,33,16\"\n, key: \"n_fft\"\nvalue: \"512\"\n, key: \"hop_length\"\nvalue: \"256\"\n, key: \"window_length\"\nvalue: \"512\"\n, key: \"window_type\"\nvalue: \"hann_sqrt\"\n]\n\"\"\"\n\n\"\"\"\nNodeArg(name='mix', type='tensor(float)', shape=[1, 257, 1, 2])\nNodeArg(name='conv_cache', type='tensor(float)', shape=[2, 1, 16, 16, 33])\nNodeArg(name='tra_cache', type='tensor(float)', shape=[2, 3, 1, 1, 16])\nNodeArg(name='inter_cache', type='tensor(float)', shape=[2, 1, 33, 16])\n-----\nNodeArg(name='enh', type='tensor(float)', shape=[1, 257, 1, 2])\nNodeArg(name='conv_cache_out', type='tensor(float)', shape=[2, 1, 16, 16, 33])\nNodeArg(name='tra_cache_out', type='tensor(float)', shape=[2, 3, 1, 1, 16])\nNodeArg(name='inter_cache_out', type='tensor(float)', shape=[2, 1, 33, 16])\n\"\"\"\n\n\ndef show(filename):\n    model = onnx.load(filename)\n    print(model.metadata_props)\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(\n        filename, session_opts, providers=[\"CPUExecutionProvider\"]\n    )\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    show(\"./gtcrn_simple.onnx\")\n\n\nif __name__ 
== \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/gtcrn/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\nclass OnnxModel:\n    def __init__(self):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            \"./gtcrn_simple.onnx\",\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.n_fft = int(meta[\"n_fft\"])\n        self.hop_length = int(meta[\"hop_length\"])\n        self.window_length = int(meta[\"window_length\"])\n        assert meta[\"window_type\"] == \"hann_sqrt\", meta[\"window_type\"]\n\n        self.window = torch.hann_window(self.window_length).pow(0.5)\n\n    def get_init_states(self):\n        meta = self.model.get_modelmeta().custom_metadata_map\n        conv_cache_shape = list(map(int, meta[\"conv_cache_shape\"].split(\",\")))\n        tra_cache_shape = list(map(int, meta[\"tra_cache_shape\"].split(\",\")))\n        inter_cache_shape = list(map(int, meta[\"inter_cache_shape\"].split(\",\")))\n\n        conv_cache_shape = np.zeros(conv_cache_shape, dtype=np.float32)\n        tra_cache = np.zeros(tra_cache_shape, dtype=np.float32)\n        inter_cache = np.zeros(inter_cache_shape, dtype=np.float32)\n\n        return conv_cache_shape, 
tra_cache, inter_cache\n\n    def __call__(self, x, states):\n        \"\"\"\n        Args:\n          x: (1, n_fft/2+1, 1, 2)\n        Returns:\n          o: (1, n_fft/2+1, 1, 2)\n        \"\"\"\n        out, next_conv_cache, next_tra_cache, next_inter_cache = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n                self.model.get_outputs()[3].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: states[0],\n                self.model.get_inputs()[2].name: states[1],\n                self.model.get_inputs()[3].name: states[2],\n            },\n        )\n\n        return out, (next_conv_cache, next_tra_cache, next_inter_cache)\n\n\ndef main():\n    model = OnnxModel()\n\n    filename = \"./inp_16k.wav\"\n    wave, sample_rate = load_audio(filename)\n    if sample_rate != model.sample_rate:\n        import librosa\n\n        wave = librosa.resample(wave, orig_sr=sample_rate, target_sr=model.sample_rate)\n        sample_rate = model.sample_rate\n\n    stft_config = knf.StftConfig(\n        n_fft=model.n_fft,\n        hop_length=model.hop_length,\n        win_length=model.window_length,\n        window=model.window.tolist(),\n    )\n    stft = knf.Stft(stft_config)\n    stft_result = stft(wave)\n    num_frames = stft_result.num_frames\n    real = np.array(stft_result.real, dtype=np.float32).reshape(num_frames, -1)\n    imag = np.array(stft_result.imag, dtype=np.float32).reshape(num_frames, -1)\n\n    states = model.get_init_states()\n    outputs = []\n    for i in range(num_frames):\n        x_real = real[i : i + 1]\n        x_imag = imag[i : i + 1]\n        x = np.vstack([x_real, x_imag]).transpose()\n        x = np.expand_dims(x, axis=0)\n        x = np.expand_dims(x, axis=2)\n\n        o, states = model(x, states)\n        
outputs.append(o)\n\n    outputs = np.concatenate(outputs, axis=2)\n    outputs = outputs.squeeze(0).transpose(1, 0, 2)\n\n    enhanced_real = outputs[:, :, 0]\n    enhanced_imag = outputs[:, :, 1]\n    enhanced_stft_result = knf.StftResult(\n        real=enhanced_real.reshape(-1).tolist(),\n        imag=enhanced_imag.reshape(-1).tolist(),\n        num_frames=enhanced_real.shape[0],\n    )\n\n    istft = knf.IStft(stft_config)\n    enhanced = istft(enhanced_stft_result)\n\n    sf.write(\"./enhanced_16k.wav\", enhanced, model.sample_rate)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/hap/.gitignore",
    "content": "!build-*.in\n"
  },
  {
    "path": "scripts/hap/build-hap-vad-asr.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Auto generated! Please DO NOT EDIT!\n\n# Please set the environment variable COMMANDLINE_TOOLS_DIR\n# before running this script\n\n# Inside the $COMMANDLINE_TOOLS_DIR directory, you can find the following:\n#\n# command-line-tools fangjun$ ls\n# LICENSE.txt NOTICE.txt  bin         codelinter  hstack      hvigor      ohpm        sdk         tool\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\nlog \"Building streaming VAD + ASR Hap for sherpa-onnx v${SHERPA_ONNX_VERSION}\"\n\nexport SHERPA_ONNX_ENABLE_TTS=OFF\n\nif [ ! -f $COMMANDLINE_TOOLS_DIR/bin/hvigorw ]; then\n  echo \"Please first download Command Line Tools for HarmonyOS\"\n  echo \"See https://developer.huawei.com/consumer/cn/download/\"\n  echo \"or\"\n  echo \"https://hf-mirror.com/csukuangfj/harmonyos-commandline-tools/tree/main\"\n  exit 1\nfi\n\njar=$COMMANDLINE_TOOLS_DIR/sdk/default/openharmony/toolchains/lib/hap-sign-tool.jar\n\nexport PATH=$COMMANDLINE_TOOLS_DIR/bin:$PATH\n\nmkdir -p haps\n\n{% for model in model_list %}\npushd ./harmony-os/SherpaOnnxVadAsr/entry/src/main/resources/rawfile\nmodel_name={{ model.model_name }}\ntype={{ model.idx }}\nlang={{ model.lang }}\nlang2={{ model.lang2 }}\nshort_name={{ model.short_name }}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\nrm -rf  *.tar.bz2\nls -lh $model_name\n\nif [ ! 
-f ./silero_vad.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\npopd\n# Now we are at the project root directory\n\ngit checkout .\npushd harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/workers/\nsed -i.bak s/\"const type = 2/const type = $type/\" ./NonStreamingAsrWithVadWorker.ets\n\n{% if model.rule_fsts %}\n  rule_fsts={{ model.rule_fsts }}\n  sed -i.bak s%\"ruleFsts = ''\"%\"ruleFsts = \\\"$rule_fsts\\\"\"% ./NonStreamingAsrWithVadWorker.ets\n{% endif %}\n\ngit diff\npopd\n\npushd harmony-os/SherpaOnnxVadAsr/entry/src/main/ets/pages\nsed -i.bak s/English/$lang2/ ./Index.ets\npopd\n\npushd harmony-os/SherpaOnnxVadAsr\n\ngit diff\n\ncd entry\nohpm install\ncd ..\n\nhvigorw clean --no-daemon\nhvigorw assembleHap --mode module -p product=default -p buildMode=release --no-daemon\n\nls -lh ./entry/build/default/outputs/default/entry-default-unsigned.hap\n\nin_file=$PWD/entry/build/default/outputs/default/entry-default-unsigned.hap\nout_file=$PWD/entry/build/default/outputs/default/entry-default-signed.hap\n\njava -jar $jar sign-app -keyAlias \"$HAP_KEY_ALIAS\" -signAlg \"SHA256withECDSA\" -mode \"localSign\" \\\n  -appCertFile \"/tmp/sherpa_onnx.cer\" -profileFile \"/tmp/sherpa_onnx_profileRelease.p7b\" \\\n  -inFile $in_file -keystoreFile \"/tmp/sherpa_onnx_ohos_key.p12\" \\\n  -outFile $out_file -keyPwd \"$HAP_KEY_PWD\" -keystorePwd \"$HAP_KEY_STORE_PWD\" -signCode \"1\"\n\nls -l $in_file $out_file\nls -lh $in_file $out_file\nrm -rf ./entry/src/main/resources/rawfile/$model_name\npopd\n\n# Use unsigned hap\nmv $in_file ./haps/sherpa-onnx-${SHERPA_ONNX_VERSION}-vad_asr-$lang-$short_name.hap\n# mv $out_file ./haps/sherpa-onnx-${SHERPA_ONNX_VERSION}-vad_asr-$lang-$short_name.hap\n\nls -lh haps\n\n{% endfor %}\n\ngit checkout .\n\nls -lh haps/\n"
  },
  {
    "path": "scripts/kitten-tts/README.md",
    "content": "# Introduction\n\nSee also https://github.com/KittenML/KittenTTS\n"
  },
  {
    "path": "scripts/kitten-tts/mini_v0_1/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport argparse\n\nimport numpy as np\nimport onnx\n\nfrom generate_voices_bin import speaker2id\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\", type=str, required=True, help=\"input and output onnx model\"\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args.model)\n\n    model = onnx.load(args.model)\n\n    style = np.load(\"./voices.npz\")\n    style_shape = style[list(style.keys())[0]].shape\n\n    speaker2id_str = \"\"\n    id2speaker_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kitten-tts\",\n        \"language\": \"English\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 1,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style_shape)),\n        \"n_speakers\": len(speaker2id),\n        \"speaker2id\": speaker2id_str,\n        \"id2speaker\": id2speaker_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://huggingface.co/KittenML/kitten-tts-mini-0.1\",\n        \"see_also\": \"https://github.com/KittenML/KittenTTS\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is kitten-tts-mini-0.1 and supports only English\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, args.model)\n\n    print(f\"Please see {args.model}\")\n\n\nif __name__ 
== \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/mini_v0_1/convert_opset.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nChange the model so that it can be run in onnxruntime 1.17.1\n\"\"\"\n\nimport onnx\n\n\ndef main():\n    model = onnx.load(\"kitten_tts_mini_v0_1.onnx\")\n\n    # Print current opsets\n    for opset in model.opset_import:\n        print(f\"Domain: '{opset.domain}', Version: {opset.version}\")\n\n    # Modify the opset versions (be careful!)\n    for opset in model.opset_import:\n        if opset.domain == \"\":  # ai.onnx domain\n            opset.version = 19  # change from 20 to 19\n        elif opset.domain == \"ai.onnx.ml\":\n            opset.version = 4  # change from 5 to 4\n\n    # Save the modified model\n    onnx.save(model, \"model.fp16.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/mini_v0_1/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(\n            model=\"kitten-mini-en-v0_1-fp16/model.fp16.onnx\",\n            voices=\"kitten-mini-en-v0_1-fp16/voices.bin\",\n            tokens=\"kitten-mini-en-v0_1-fp16/tokens.txt\",\n            data_dir=\"kitten-mini-en-v0_1-fp16/espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kitten/v0.1-mini/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport argparse\n\nimport numpy as np\nimport onnx\n\nfrom generate_voices_bin import speaker2id\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\", type=str, required=True, help=\"input and output onnx model\"\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args.model)\n\n    model = onnx.load(args.model)\n\n    style = np.load(\"./voices.npz\")\n    style_shape = style[list(style.keys())[0]].shape\n\n    speaker2id_str = \"\"\n    id2speaker_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kitten-tts\",\n        \"language\": \"English\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 1,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style_shape)),\n        \"n_speakers\": len(speaker2id),\n        \"speaker2id\": speaker2id_str,\n        \"id2speaker\": id2speaker_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://huggingface.co/KittenML/kitten-tts-nano-0.1\",\n        \"see_also\": \"https://github.com/KittenML/KittenTTS\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is kitten-tts-nano-0.1 and supports only English\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, args.model)\n\n    print(f\"Please see {args.model}\")\n\n\nif __name__ 
== \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/convert_opset.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nChange the model so that it can be run in onnxruntime 1.17.1\n\"\"\"\n\nimport onnx\n\n\ndef main():\n    model = onnx.load(\"kitten_tts_nano_v0_1.onnx\")\n\n    # Print current opsets\n    for opset in model.opset_import:\n        print(f\"Domain: '{opset.domain}', Version: {opset.version}\")\n\n    # Modify the opset versions (be careful!)\n    for opset in model.opset_import:\n        if opset.domain == \"\":  # ai.onnx domain\n            opset.version = 19  # change from 20 to 19\n        elif opset.domain == \"ai.onnx.ml\":\n            opset.version = 4  # change from 5 to 4\n\n    # Save the modified model\n    onnx.save(model, \"model.fp16.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(\n            model=\"kitten-nano-en-v0_1-fp16/model.fp16.onnx\",\n            voices=\"kitten-nano-en-v0_1-fp16/voices.bin\",\n            tokens=\"kitten-nano-en-v0_1-fp16/tokens.txt\",\n            data_dir=\"kitten-nano-en-v0_1-fp16/espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kitten/v0.1-nano/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\ndef get_vocab():\n    # https://github.com/KittenML/KittenTTS/blob/main/kittentts/onnx_model.py#L17\n    _pad = \"$\"\n    _punctuation = ';:,.!?¡¿—…\"«»\"\" '\n    _letters = \"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\"\n    _letters_ipa = \"ɑɐɒæɓʙβɔɕçɗɖðʤəɘɚɛɜɝɞɟʄɡɠɢʛɦɧħɥʜɨɪʝɭɬɫɮʟɱɯɰŋɳɲɴøɵɸθœɶʘɹɺɾɻʀʁɽʂʃʈʧʉʊʋⱱʌɣɤʍχʎʏʑʐʒʔʡʕʢǀǁǂǃˈˌːˑʼʴʰʱʲʷˠˤ˞↓↑→↗↘'̩'ᵻ\"\n\n    symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa)\n    dicts = {}\n    for i in range(len((symbols))):\n        dicts[symbols[i]] = i\n    return dicts\n\n\ndef main():\n    token2id = get_vocab()\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for s, i in token2id.items():\n            f.write(f\"{s} {i}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/generate_voices_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nfrom pathlib import Path\n\nimport numpy as np\n\nspeakers = [\n    \"expr-voice-2-m\",\n    \"expr-voice-2-f\",\n    \"expr-voice-3-m\",\n    \"expr-voice-3-f\",\n    \"expr-voice-4-m\",\n    \"expr-voice-4-f\",\n    \"expr-voice-5-m\",\n    \"expr-voice-5-f\",\n]\n\nid2speaker = {idx: speaker for idx, speaker in enumerate(speakers)}\n\nspeaker2id = {speaker: idx for idx, speaker in id2speaker.items()}\n\n\ndef main():\n    if Path(\"./voices.bin\").is_file():\n        print(\"./voices.bin exists - skip\")\n        return\n\n    voices = np.load(\"./voices.npz\")\n\n    with open(\"voices.bin\", \"wb\") as f:\n        for speaker in speakers:\n            v = voices[speaker]\n            # v.shape (1, 256)\n            f.write(v.tobytes())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/show.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\nimport onnx\n\n\"\"\"\n[key: \"onnx.infer\"\nvalue: \"onnxruntime.quant\"\n, key: \"onnx.quant.pre_process\"\nvalue: \"onnxruntime.quant\"\n]\nNodeArg(name='input_ids', type='tensor(int64)', shape=[1, 'sequence_length'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='waveform', type='tensor(float)', shape=['num_samples'])\nNodeArg(name='duration', type='tensor(int64)', shape=['Castduration_dim_0'])\n\"\"\"\n\n\ndef show(filename):\n    model = onnx.load(filename)\n    print(model.metadata_props)\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(\n        filename, session_opts, providers=[\"CPUExecutionProvider\"]\n    )\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    show(\"./model.fp16.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_1/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport time\nfrom pathlib import Path\nfrom typing import Dict, List\n\nimport numpy as np\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the model\",\n    )\n\n    parser.add_argument(\n        \"--voices-bin\",\n        type=str,\n        required=True,\n        help=\"Path to the voices.bin\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n    return parser.parse_args()\n\n\ndef show(filename):\n    session_opts = ort.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = ort.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef load_tokens(filename: str) -> Dict[str, int]:\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 2:\n                token, idx = fields\n                ans[token] = int(idx)\n            else:\n                assert len(fields) == 1, (len(fields), line)\n                ans[\" \"] = int(fields[0])\n    return ans\n\n\ndef load_voices(speaker_names: List[str], dim: List[int], voices_bin: str):\n    embedding = (\n        np.fromfile(voices_bin, dtype=\"uint8\")\n        .view(np.float32)\n        .reshape(len(speaker_names), *dim)\n    )\n   
 ans = dict()\n    for i in range(len(speaker_names)):\n        ans[speaker_names[i]] = embedding[i]\n\n    return ans\n\n\nclass OnnxModel:\n    def __init__(self, model_filename: str, voices_bin: str, tokens: str):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            model_filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        self.token2id = load_tokens(tokens)\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n        dim = list(map(int, meta[\"style_dim\"].split(\",\")))\n        speaker_names = meta[\"speaker_names\"].split(\",\")\n\n        self.voices = load_voices(\n            speaker_names=speaker_names, dim=dim, voices_bin=voices_bin\n        )\n\n        self.sample_rate = int(meta[\"sample_rate\"])\n\n    def __call__(self, text: str, voice):\n        tokens = phonemize_espeak(text, \"en-us\")\n        # tokens is List[List[str]]\n        # Each sentence is a List[str]\n        # len(tokens) == number of sentences\n\n        flatten = []\n        for t in tokens:\n            flatten.extend(t)\n            # we append a space at the end of a sentence so that there is\n            # a pause in the generated audio\n            flatten.append(\" \")\n\n        tokens = \"\".join(flatten)\n\n        tokens = list(tokens)\n\n        token_ids = [self.token2id[i] for i in tokens]\n\n        style = self.voices[voice]\n\n        token_ids = [0, *token_ids, 0]\n        token_ids = np.array([token_ids], dtype=np.int64)\n\n        speed = np.array([1.0], dtype=np.float32)\n\n        audio = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: token_ids,\n           
     self.model.get_inputs()[1].name: style,\n                self.model.get_inputs()[2].name: speed,\n            },\n        )[0]\n        return audio\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    show(args.model)\n\n    #  tokens = phonemize_espeak(\"how are you doing?\", \"en-us\")\n    # [['h', 'ˌ', 'a', 'ʊ', ' ', 'ɑ', 'ː', 'ɹ', ' ', 'j', 'u', 'ː', ' ', 'd', 'ˈ', 'u', 'ː', 'ɪ', 'ŋ', '?']]\n    m = OnnxModel(\n        model_filename=args.model, voices_bin=args.voices_bin, tokens=args.tokens\n    )\n\n    text = (\n        \"Today as always, men fall into two groups: slaves and free men. \"\n        + \" Whoever does not have two-thirds of his day for himself, \"\n        + \"is a slave, whatever he may be: a statesman, a businessman, \"\n        + \"an official, or a scholar.\"\n    )\n\n    for i, voice in enumerate(m.voices.keys(), 1):\n        print(f\"Testing {i}/{len(m.voices)} - {voice}/{args.model}\")\n\n        start = time.time()\n        audio = m(text, voice=voice)\n        end = time.time()\n\n        elapsed_seconds = end - start\n        audio_duration = len(audio) / m.sample_rate\n        real_time_factor = elapsed_seconds / audio_duration\n\n        filename = f\"{Path(args.model).stem}-{voice}.wav\"\n        sf.write(\n            filename,\n            audio,\n            samplerate=m.sample_rate,\n            subtype=\"PCM_16\",\n        )\n        print(f\" Saved to {filename}\")\n        print(f\" Elapsed seconds: {elapsed_seconds:.3f}\")\n        print(f\" Audio duration in seconds: {audio_duration:.3f}\")\n        print(\n            f\" RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n        )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_2/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport argparse\n\nimport numpy as np\nimport onnx\n\nfrom generate_voices_bin import speaker2id\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\", type=str, required=True, help=\"input and output onnx model\"\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args.model)\n\n    model = onnx.load(args.model)\n\n    style = np.load(\"./voices.npz\")\n    style_shape = style[list(style.keys())[0]].shape\n\n    speaker2id_str = \"\"\n    id2speaker_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kitten-tts\",\n        \"language\": \"English\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 1,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style_shape)),\n        \"n_speakers\": len(speaker2id),\n        \"speaker2id\": speaker2id_str,\n        \"id2speaker\": id2speaker_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://huggingface.co/KittenML/kitten-tts-nano-0.2\",\n        \"see_also\": \"https://github.com/KittenML/KittenTTS\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is kitten-tts-nano-0.2 and supports only English\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, args.model)\n\n    print(f\"Please see {args.model}\")\n\n\nif __name__ 
== \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_2/convert_opset.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nChange the model so that it can be run in onnxruntime 1.17.1\n\"\"\"\n\nimport onnx\n\n\ndef main():\n    model = onnx.load(\"kitten_tts_nano_v0_2.onnx\")\n\n    # Print current opsets\n    for opset in model.opset_import:\n        print(f\"Domain: '{opset.domain}', Version: {opset.version}\")\n\n    # Modify the opset versions (be careful!)\n    for opset in model.opset_import:\n        if opset.domain == \"\":  # ai.onnx domain\n            opset.version = 19  # change from 20 to 19\n        elif opset.domain == \"ai.onnx.ml\":\n            opset.version = 4  # change from 5 to 4\n\n    # Save the modified model\n    onnx.save(model, \"model.fp16.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kitten-tts/nano_v0_2/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kitten=sherpa_onnx.OfflineTtsKittenModelConfig(\n            model=\"kitten-nano-en-v0_2-fp16/model.fp16.onnx\",\n            voices=\"kitten-nano-en-v0_2-fp16/voices.bin\",\n            tokens=\"kitten-nano-en-v0_2-fp16/tokens.txt\",\n            data_dir=\"kitten-nano-en-v0_2-fp16/espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kitten/v0.2-nano/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kokoro/.gitignore",
    "content": "espeak-ng-data\nvoices.json\nvoices.bin\nREADME-new.md\nlexicon-*.txt\nconfig.json\n"
  },
  {
    "path": "scripts/kokoro/README.md",
    "content": "# Introduction\n\nPlease see also\nhttps://huggingface.co/hexgrad/Kokoro-82M\nand\nhttps://huggingface.co/hexgrad/Kokoro-82M/discussions/14\n"
  },
  {
    "path": "scripts/kokoro/v0.19/.gitignore",
    "content": "kLegacy\n"
  },
  {
    "path": "scripts/kokoro/v0.19/__init__.py",
    "content": ""
  },
  {
    "path": "scripts/kokoro/v0.19/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport argparse\n\nimport onnx\nimport torch\n\nfrom generate_voices_bin import speaker2id\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\", type=str, required=True, help=\"input and output onnx model\"\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args.model)\n\n    model = onnx.load(args.model)\n\n    style = torch.load(\n        \"./kLegacy/v0.19/voices/af.pt\", weights_only=True, map_location=\"cpu\"\n    )\n\n    speaker2id_str = \"\"\n    id2speaker_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kokoro\",\n        \"language\": \"English\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 1,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style.shape)),\n        \"n_speakers\": len(speaker2id),\n        \"speaker2id\": speaker2id_str,\n        \"id2speaker\": id2speaker_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://huggingface.co/hexgrad/kLegacy/\",\n        \"see_also\": \"https://huggingface.co/spaces/hexgrad/Kokoro-TTS\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is kokoro v0.19 and supports only English\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, args.model)\n\n    print(f\"Please see {args.model}, ./voices.bin, 
and ./tokens.txt\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v0.19/dynamic_quantization.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom pathlib import Path\n\nimport onnxruntime\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'tokens1'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio0'])\n\"\"\"\n\n\ndef main():\n    show(\"./model.onnx\")\n\n    if not Path(\"./model.int8.onnx\").is_file():\n        quantize_dynamic(\n            model_input=\"model.onnx\",\n            model_output=\"model.int8.onnx\",\n            #  op_types_to_quantize=[\"MatMul\"],\n            weight_type=QuantType.QUInt8,\n        )\n    else:\n        print(\"./model.int8.onnx exists - skip\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v0.19/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(\n            model=\"./model.onnx\",\n            voices=\"./voices.bin\",\n            tokens=\"./tokens.txt\",\n            data_dir=\"./espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kokoro/v0.19/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kokoro/v0.19/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\ndef get_vocab():\n    # https://huggingface.co/hexgrad/kLegacy/blob/main/v0.19/kokoro.py#L75\n    _pad = \"$\"\n    _punctuation = ';:,.!?¡¿—…\"«»“” '\n    _letters = \"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\"\n    _letters_ipa = \"ɑɐɒæɓʙβɔɕçɗɖðʤəɘɚɛɜɝɞɟʄɡɠɢʛɦɧħɥʜɨɪʝɭɬɫɮʟɱɯɰŋɳɲɴøɵɸθœɶʘɹɺɾɻʀʁɽʂʃʈʧʉʊʋⱱʌɣɤʍχʎʏʑʐʒʔʡʕʢǀǁǂǃˈˌːˑʼʴʰʱʲʷˠˤ˞↓↑→↗↘'̩'ᵻ\"\n    symbols = [_pad] + list(_punctuation) + list(_letters) + list(_letters_ipa)\n    dicts = {}\n    for i in range(len((symbols))):\n        dicts[symbols[i]] = i\n    return dicts\n\n\ndef main():\n    token2id = get_vocab()\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for s, i in token2id.items():\n            f.write(f\"{s} {i}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v0.19/generate_voices_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport torch\nfrom pathlib import Path\n\n\nid2speaker = {\n    0: \"af\",\n    1: \"af_bella\",\n    2: \"af_nicole\",\n    3: \"af_sarah\",\n    4: \"af_sky\",\n    5: \"am_adam\",\n    6: \"am_michael\",\n    7: \"bf_emma\",\n    8: \"bf_isabella\",\n    9: \"bm_george\",\n    10: \"bm_lewis\",\n}\n\nspeaker2id = {speaker: idx for idx, speaker in id2speaker.items()}\n\n\ndef main():\n    if Path(\"./voices.bin\").is_file():\n        print(\"./voices.bin exists - skip\")\n        return\n\n    with open(\"voices.bin\", \"wb\") as f:\n        for _, speaker in id2speaker.items():\n            m = torch.load(\n                f\"kLegacy/v0.19/voices/{speaker}.pt\",\n                weights_only=True,\n                map_location=\"cpu\",\n            ).numpy()\n            # m.shape (511, 1, 256)\n\n            f.write(m.tobytes())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v0.19/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nfemale (7)\n'af', 'af_bella', 'af_nicole','af_sarah', 'af_sky',\n'bf_emma', 'bf_isabella',\n\nmale (4)\n'am_adam',  'am_michael', 'bm_george', 'bm_lewis'\n\"\"\"\n\nimport argparse\nimport time\nfrom pathlib import Path\nfrom typing import Dict, List\n\nimport numpy as np\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the model\",\n    )\n\n    parser.add_argument(\n        \"--voices-bin\",\n        type=str,\n        required=True,\n        help=\"Path to the voices.bin\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n    return parser.parse_args()\n\n\ndef show(filename):\n    session_opts = ort.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = ort.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'tokens1'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio0'])\n\"\"\"\n\n\ndef load_tokens(filename: str) -> Dict[str, int]:\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 2:\n 
               token, idx = fields\n                ans[token] = int(idx)\n            else:\n                assert len(fields) == 1, (len(fields), line)\n                ans[\" \"] = int(fields[0])\n    return ans\n\n\ndef load_voices(speaker_names: List[str], dim: List[int], voices_bin: str):\n    embedding = (\n        np.fromfile(voices_bin, dtype=\"uint8\")\n        .view(np.float32)\n        .reshape(len(speaker_names), *dim)\n    )\n    print(\"embedding.shape\", embedding.shape)\n    ans = dict()\n    for i in range(len(speaker_names)):\n        ans[speaker_names[i]] = embedding[i]\n\n    return ans\n\n\nclass OnnxModel:\n    def __init__(self, model_filename: str, voices_bin: str, tokens: str):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            model_filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        self.token2id = load_tokens(tokens)\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n        dim = list(map(int, meta[\"style_dim\"].split(\",\")))\n        speaker_names = meta[\"speaker_names\"].split(\",\")\n\n        self.voices = load_voices(\n            speaker_names=speaker_names, dim=dim, voices_bin=voices_bin\n        )\n\n        self.sample_rate = int(meta[\"sample_rate\"])\n\n        print(list(self.voices.keys()))\n        # ['af', 'af_bella', 'af_nicole', 'af_sarah', 'af_sky', 'am_adam',\n        # 'am_michael', 'bf_emma', 'bf_isabella', 'bm_george', 'bm_lewis']\n        # af -> (511, 1, 256)\n        self.max_len = self.voices[next(iter(self.voices))].shape[0] - 1\n\n    def __call__(self, text: str, voice):\n        tokens = phonemize_espeak(text, \"en-us\")\n        # tokens is List[List[str]]\n        # Each sentence is a List[str]\n        # 
len(tokens) == number of sentences\n\n        tokens = sum(tokens, [])  # flatten\n        tokens = \"\".join(tokens)\n\n        tokens = tokens.replace(\"kəkˈoːɹoʊ\", \"kˈoʊkəɹoʊ\").replace(\n            \"kəkˈɔːɹəʊ\", \"kˈəʊkəɹəʊ\"\n        )\n\n        tokens = list(tokens)\n\n        token_ids = [self.token2id[i] for i in tokens]\n        token_ids = token_ids[: self.max_len]\n\n        style = self.voices[voice][len(token_ids)]\n\n        token_ids = [0, *token_ids, 0]\n        token_ids = np.array([token_ids], dtype=np.int64)\n\n        speed = np.array([1.0], dtype=np.float32)\n\n        audio = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: token_ids,\n                self.model.get_inputs()[1].name: style,\n                self.model.get_inputs()[2].name: speed,\n            },\n        )[0]\n        return audio\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    show(args.model)\n\n    #  tokens = phonemize_espeak(\"how are you doing?\", \"en-us\")\n    # [['h', 'ˌ', 'a', 'ʊ', ' ', 'ɑ', 'ː', 'ɹ', ' ', 'j', 'u', 'ː', ' ', 'd', 'ˈ', 'u', 'ː', 'ɪ', 'ŋ', '?']]\n    m = OnnxModel(\n        model_filename=args.model, voices_bin=args.voices_bin, tokens=args.tokens\n    )\n\n    text = (\n        \"Today as always, men fall into two groups: slaves and free men.\"\n        + \" Whoever does not have two-thirds of his day for himself, \"\n        + \"is a slave, whatever he may be: a statesman, a businessman, \"\n        + \"an official, or a scholar.\"\n    )\n\n    for i, voice in enumerate(m.voices.keys(), 1):\n        print(f\"Testing {i}/{len(m.voices)} - {voice}/{args.model}\")\n\n        start = time.time()\n        audio = m(text, voice=voice)\n        end = time.time()\n\n        elapsed_seconds = end - start\n        audio_duration = len(audio) / m.sample_rate\n        real_time_factor = elapsed_seconds / audio_duration\n\n  
      filename = f\"{Path(args.model).stem}-{voice}.wav\"\n        sf.write(\n            filename,\n            audio,\n            samplerate=m.sample_rate,\n            subtype=\"PCM_16\",\n        )\n        print(f\" Saved to {filename}\")\n        print(f\" Elapsed seconds: {elapsed_seconds:.3f}\")\n        print(f\" Audio duration in seconds: {audio_duration:.3f}\")\n        print(\n            f\" RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\"\n        )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/.gitignore",
    "content": "config.json\n*.json\n*.txt\n.add-meta-data.done\nvoices\n"
  },
  {
    "path": "scripts/kokoro/v1.0/README.md",
    "content": "# Introduction\n\nThis directory is for kokoro v1.0\n"
  },
  {
    "path": "scripts/kokoro/v1.0/__init__.py",
    "content": ""
  },
  {
    "path": "scripts/kokoro/v1.0/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport onnx\nimport torch\n\nfrom generate_voices_bin import speaker2id\n\n\ndef main():\n    model = onnx.load(\"./kokoro.onnx\")\n    style = torch.load(\n        \"./Kokoro-82M/voices/af_alloy.pt\", weights_only=True, map_location=\"cpu\"\n    )\n\n    id2speaker_str = \"\"\n    speaker2id_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kokoro\",\n        \"language\": \"multi-lang, e.g., English, Chinese\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 2,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style.shape)),\n        \"n_speakers\": len(speaker2id),\n        \"id2speaker\": id2speaker_str,\n        \"speaker2id\": speaker2id_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://github.com/thewh1teagle/kokoro-onnx/releases/tag/model-files\",\n        \"see_also\": \"https://huggingface.co/spaces/hexgrad/Kokoro-TTS\",\n        \"see_also_2\": \"https://huggingface.co/hexgrad/Kokoro-82M\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is Kokoro v1.0, a multilingual TTS model, supporting English, Chinese, French, Japanese etc.\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, \"./kokoro.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/dynamic_quantization.py",
    "content": "#!/usr/bin/env python3\nimport argparse\n\nimport onnxruntime\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'sequence_length'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio_length'])\n\"\"\"\n\n\ndef main():\n    show(\"./kokoro.onnx\")\n\n    quantize_dynamic(\n        model_input=\"kokoro.onnx\",\n        model_output=\"kokoro.int8.onnx\",\n        #  op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/export_onnx.py",
    "content": "#!/usr/bin/env python3\n\nimport json\n\nimport torch\nfrom kokoro import KModel\nfrom kokoro.model import KModelForONNX\n\n\n@torch.no_grad()\ndef main():\n    with open(\"Kokoro-82M/config.json\") as f:\n        config = json.load(f)\n\n    model = (\n        KModel(\n            repo_id=\"not-used-any-value-is-ok\",\n            model=\"Kokoro-82M/kokoro-v1_0.pth\",\n            config=config,\n            disable_complex=True,\n        )\n        .to(\"cpu\")\n        .eval()\n    )\n\n    x = torch.randint(1, 100, (48,)).numpy()\n    x = torch.LongTensor([[0, *x, 0]])\n\n    style = torch.rand(1, 256, dtype=torch.float32)\n    speed = torch.rand(1)\n\n    print(x.shape, x.dtype)\n    print(style.shape, style.dtype)\n    print(speed, speed.dtype)\n\n    model2 = KModelForONNX(model)\n\n    torch.onnx.export(\n        model2,\n        (x, style, speed),\n        \"kokoro.onnx\",\n        input_names=[\"tokens\", \"style\", \"speed\"],\n        output_names=[\"audio\"],\n        dynamic_axes={\n            \"tokens\": {1: \"sequence_length\"},\n            \"audio\": {0: \"audio_length\"},\n        },\n        opset_version=14,  # minimum working version for this kokoro model is 14\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/generate_lexicon_en.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport json\nfrom typing import List, Tuple\n\n\ndef generate_english_lexicon(kind: str):\n    assert kind in (\"us\", \"gb\"), kind\n    # If you want to add new words, please add them to\n    # the user_defined dict.\n    user_defined = {\n        \"Kokoro\": \"kˈOkəɹO\",\n        \"Misaki\": \"misˈɑki\",\n    }\n\n    user_defined_lower = dict()\n    for k, v in user_defined.items():\n        user_defined_lower[k.lower()] = v\n\n    with open(f\"./{kind}_gold.json\", encoding=\"utf-8\") as f:\n        gold = json.load(f)\n\n    with open(f\"./{kind}_silver.json\", encoding=\"utf-8\") as f:\n        silver = json.load(f)\n\n    # words in us_gold has a higher priority than those in s_silver, so\n    # we put us_gold after us_silver below\n    english = {**silver, **gold}\n\n    lexicon = dict()\n    for k, v in english.items():\n        k_lower = k.lower()\n\n        if k_lower in user_defined_lower:\n            print(f\"{k} already exist in the user defined dict. Skip adding\")\n            continue\n\n        if isinstance(v, str):\n            lexicon[k_lower] = v\n        else:\n            assert isinstance(v, dict), (k, v)\n            assert \"DEFAULT\" in v, (k, v)\n            lexicon[k_lower] = v[\"DEFAULT\"]\n\n    return list(user_defined_lower.items()) + list(lexicon.items())\n\n\ndef save(filename: str, lexicon: List[Tuple[str, str]]):\n    with open(filename, \"w\", encoding=\"utf-8\") as f:\n        for word, phones in lexicon:\n            tokens = \" \".join(list(phones))\n            f.write(f\"{word} {tokens}\\n\")\n\n\ndef main():\n    us = generate_english_lexicon(\"us\")\n    gb = generate_english_lexicon(\"gb\")\n\n    save(\"lexicon-us-en.txt\", us)\n    save(\"lexicon-gb-en.txt\", gb)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/generate_lexicon_zh.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import List, Tuple\n\nfrom misaki import zh\nfrom pypinyin import load_phrases_dict, phrases_dict, pinyin_dict\n\nuser_dict = {\n    \"还田\": [[\"huan2\"], [\"tian2\"]],\n    \"行长\": [[\"hang2\"], [\"zhang3\"]],\n    \"银行行长\": [[\"yin2\"], [\"hang2\"], [\"hang2\"], [\"zhang3\"]],\n}\n\nload_phrases_dict(user_dict)\n\nphrases_dict.phrases_dict.update(**user_dict)\n\n\ndef generate_chinese_lexicon():\n    word_dict = pinyin_dict.pinyin_dict\n    phrases = phrases_dict.phrases_dict\n\n    g2p = zh.ZHG2P()\n    lexicon = []\n\n    for key in word_dict:\n        if not (0x4E00 <= key <= 0x9FFF):\n            continue\n        w = chr(key)\n        tokens: str = g2p.word2ipa(w)\n        tokens = tokens.replace(chr(815), \"\")\n        lexicon.append((w, tokens))\n\n    for key in phrases:\n        tokens: str = g2p.word2ipa(key)\n        tokens = tokens.replace(chr(815), \"\")\n        lexicon.append((key, tokens))\n    return lexicon\n\n\ndef save(filename: str, lexicon: List[Tuple[str, str]]):\n    with open(filename, \"w\", encoding=\"utf-8\") as f:\n        for word, phones in lexicon:\n            tokens = \" \".join(list(phones))\n            f.write(f\"{word} {tokens}\\n\")\n\n\ndef main():\n    zh = generate_chinese_lexicon()\n\n    save(\"lexicon-zh.txt\", zh)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(\n            model=\"./kokoro.onnx\",\n            voices=\"./voices.bin\",\n            tokens=\"./tokens.txt\",\n            data_dir=\"./espeak-ng-data\",\n            dict_dir=\"./dict\",\n            lexicon=\"./lexicon-zh.txt,./lexicon-us-en.txt\",\n        ),\n        num_threads=2,\n        debug=True,\n    ),\n    rule_fsts=\"./phone-zh.fst,./date-zh.fst,./number-zh.fst\",\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"This model supports both Chinese and English. 小米的核心价值观是什么？答案是真诚热爱！有困难，请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢? 今天是 2025年6月18号.\"\n\nprint(\"text\", text)\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kokoro/v1.0/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kokoro/v1.0/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport json\n\n\ndef main():\n    with open(\"Kokoro-82M/config.json\") as f:\n        config = json.load(f)\n    vocab = config[\"vocab\"]\n\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for k, i in vocab.items():\n            f.write(f\"{k} {i}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/generate_voices_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport torch\nfrom pathlib import Path\n\n\nid2speaker = {\n    0: \"af_alloy\",\n    1: \"af_aoede\",\n    2: \"af_bella\",\n    3: \"af_heart\",\n    4: \"af_jessica\",\n    5: \"af_kore\",\n    6: \"af_nicole\",\n    7: \"af_nova\",\n    8: \"af_river\",\n    9: \"af_sarah\",\n    10: \"af_sky\",\n    11: \"am_adam\",\n    12: \"am_echo\",\n    13: \"am_eric\",\n    14: \"am_fenrir\",\n    15: \"am_liam\",\n    16: \"am_michael\",\n    17: \"am_onyx\",\n    18: \"am_puck\",\n    19: \"am_santa\",\n    20: \"bf_alice\",\n    21: \"bf_emma\",\n    22: \"bf_isabella\",\n    23: \"bf_lily\",\n    24: \"bm_daniel\",\n    25: \"bm_fable\",\n    26: \"bm_george\",\n    27: \"bm_lewis\",\n    28: \"ef_dora\",\n    29: \"em_alex\",\n    30: \"ff_siwis\",\n    31: \"hf_alpha\",\n    32: \"hf_beta\",\n    33: \"hm_omega\",\n    34: \"hm_psi\",\n    35: \"if_sara\",\n    36: \"im_nicola\",\n    37: \"jf_alpha\",\n    38: \"jf_gongitsune\",\n    39: \"jf_nezumi\",\n    40: \"jf_tebukuro\",\n    41: \"jm_kumo\",\n    42: \"pf_dora\",\n    43: \"pm_alex\",\n    44: \"pm_santa\",\n    45: \"zf_xiaobei\",\n    46: \"zf_xiaoni\",\n    47: \"zf_xiaoxiao\",\n    48: \"zf_xiaoyi\",\n    49: \"zm_yunjian\",\n    50: \"zm_yunxi\",\n    51: \"zm_yunxia\",\n    52: \"zm_yunyang\",\n}\n\nspeaker2id = {speaker: idx for idx, speaker in id2speaker.items()}\n\n\ndef main():\n    if Path(\"./voices.bin\").is_file():\n        print(\"./voices.bin exists - skip\")\n        return\n\n    with open(\"voices.bin\", \"wb\") as f:\n        for _, speaker in id2speaker.items():\n            m = torch.load(\n                f\"Kokoro-82M/voices/{speaker}.pt\",\n                weights_only=True,\n                map_location=\"cpu\",\n            ).numpy()\n            # m.shape (510, 1, 256)\n\n            f.write(m.tobytes())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.0/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport re\nimport time\nfrom typing import Dict, List\n\nimport jieba\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\n\ndef show(filename):\n    session_opts = ort.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = ort.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'sequence_length'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio_length'])\n\"\"\"\n\n\ndef load_voices(speaker_names: List[str], dim: List[int], voices_bin: str):\n    embedding = (\n        np.fromfile(voices_bin, dtype=\"uint8\")\n        .view(np.float32)\n        .reshape(len(speaker_names), *dim)\n    )\n    print(\"embedding.shape\", embedding.shape)\n    ans = dict()\n    for i in range(len(speaker_names)):\n        ans[speaker_names[i]] = embedding[i]\n\n    return ans\n\n\ndef load_tokens(filename: str) -> Dict[str, int]:\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 2:\n                token, idx = fields\n                ans[token] = int(idx)\n            else:\n                assert len(fields) == 1, (len(fields), line)\n                ans[\" \"] = int(fields[0])\n    return ans\n\n\ndef load_lexicon(filename: str) -> Dict[str, List[str]]:\n    ans = 
dict()\n    for lexicon in filename.split(\",\"):\n        print(lexicon)\n        with open(lexicon, encoding=\"utf-8\") as f:\n            for line in f:\n                w, tokens = line.strip().split(\" \", maxsplit=1)\n                ans[w] = \"\".join(tokens.split())\n    return ans\n\n\nclass OnnxModel:\n    def __init__(self, model_filename: str, tokens: str, lexicon: str, voices_bin: str):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            model_filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        self.token2id = load_tokens(tokens)\n        self.word2tokens = load_lexicon(lexicon)\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n        dim = list(map(int, meta[\"style_dim\"].split(\",\")))\n        speaker_names = meta[\"speaker_names\"].split(\",\")\n        self.voices = load_voices(\n            speaker_names=speaker_names, dim=dim, voices_bin=voices_bin\n        )\n        # The sample rate comes from the model metadata; do not hardcode it\n        self.sample_rate = int(meta[\"sample_rate\"])\n        print(list(self.voices.keys()))\n\n        self.max_len = self.voices[next(iter(self.voices))].shape[0] - 1\n\n    def __call__(self, text: str, voice: str):\n        punctuations = ';:,.!?-…()\"“”'\n        text = text.lower()\n\n        tokens = \"\"\n\n        for t in re.findall(\"[\\u4E00-\\u9FFF]+|[\\u0000-\\u007f]+\", text):\n            # ASCII run (the regex above matches up to and including 0x7F)\n            if ord(t[0]) <= 0x7F:\n                for w in t.split():\n                    while w:\n                        if w[0] in punctuations:\n                            tokens += w[0] + \" \"\n                            w = w[1:]\n                            continue\n\n                        if w[-1] in punctuations:\n                            if w[:-1] in 
self.word2tokens:\n                                tokens += self.word2tokens[w[:-1]]\n                            else:\n                                # Fall back to espeak-ng rather than silently dropping the word\n                                print(f\"Use espeak-ng for word {w[:-1]}\")\n                                tokens += \"\".join(\n                                    phonemize_espeak(w[:-1], \"en-us\")[0]\n                                )\n                            tokens += w[-1]\n                        else:\n                            if w in self.word2tokens:\n                                tokens += self.word2tokens[w]\n                            else:\n                                print(f\"Use espeak-ng for word {w}\")\n                                tokens += \"\".join(phonemize_espeak(w, \"en-us\")[0])\n\n                        tokens += \" \"\n                        break\n            else:\n                # Chinese\n                for w in jieba.cut(t):\n                    if w in self.word2tokens:\n                        tokens += self.word2tokens[w]\n                    else:\n                        for i in w:\n                            if i in self.word2tokens:\n                                tokens += self.word2tokens[i]\n                            else:\n                                print(f\"skip {i}\")\n\n        token_ids = [self.token2id[i] for i in tokens]\n        token_ids = token_ids[: self.max_len]\n\n        style = self.voices[voice][len(token_ids)]\n\n        token_ids = [0, *token_ids, 0]\n        token_ids = np.array([token_ids], dtype=np.int64)\n\n        speed = np.array([1.0], dtype=np.float32)\n\n        audio = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: token_ids,\n                self.model.get_inputs()[1].name: style,\n                self.model.get_inputs()[2].name: speed,\n            },\n        )[0]\n        return audio\n\n\ndef main():\n    m = OnnxModel(\n        model_filename=\"./kokoro.onnx\",\n        tokens=\"./tokens.txt\",\n        lexicon=\"./lexicon-gb-en.txt,./lexicon-zh.txt\",\n        voices_bin=\"./voices.bin\",\n    )\n    text = \"来听一听, 这个是什么口音? How are you doing? Are you ok? Thank you! 
你觉得中英文说得如何呢?\"\n\n    text = text.lower()\n\n    voice = \"bf_alice\"\n    start = time.time()\n    audio = m(text, voice=voice)\n    end = time.time()\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio) / m.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    filename = f\"kokoro_v1.0_{voice}_zh_en.wav\"\n    sf.write(\n        filename,\n        audio,\n        samplerate=m.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\" Saved to {filename}\")\n    print(f\" Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\" Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\" RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/README.md",
    "content": "# Introduction\n\nThis directory is for kokoro v1.1-zh.\n\nSee also https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport onnx\nimport torch\n\nfrom generate_voices_bin import speaker2id\n\n\ndef main():\n    model = onnx.load(\"./kokoro.onnx\")\n    style = torch.load(\"./voices/zf_001.pt\", weights_only=True, map_location=\"cpu\")\n\n    id2speaker_str = \"\"\n    speaker2id_str = \"\"\n    sep = \"\"\n    for s, i in speaker2id.items():\n        speaker2id_str += f\"{sep}{s}->{i}\"\n        id2speaker_str += f\"{sep}{i}->{s}\"\n        sep = \",\"\n\n    meta_data = {\n        \"model_type\": \"kokoro\",\n        \"language\": \"multi-lang, e.g., English, Chinese\",\n        \"has_espeak\": 1,\n        \"sample_rate\": 24000,\n        \"version\": 2,\n        \"voice\": \"en-us\",\n        \"style_dim\": \",\".join(map(str, style.shape)),\n        \"n_speakers\": len(speaker2id),\n        \"id2speaker\": id2speaker_str,\n        \"speaker2id\": speaker2id_str,\n        \"speaker_names\": \",\".join(map(str, speaker2id.keys())),\n        \"model_url\": \"https://huggingface.co/hexgrad/Kokoro-82M-v1.1-zh\",\n        \"maintainer\": \"k2-fsa\",\n        \"comment\": \"This is Kokoro v1.1-zh, a multilingual TTS model, supporting English, Chinese.\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, \"./kokoro.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/dynamic_quantization.py",
    "content": "#!/usr/bin/env python3\n\nimport onnxruntime\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'sequence_length'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio_length'])\n\"\"\"\n\n\ndef main():\n    show(\"./kokoro.onnx\")\n\n    quantize_dynamic(\n        model_input=\"kokoro.onnx\",\n        model_output=\"kokoro.int8.onnx\",\n        #  op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/export_onnx.py",
    "content": "#!/usr/bin/env python3\n\nimport json\n\nimport torch\nfrom kokoro import KModel\nfrom kokoro.model import KModelForONNX\n\n\n@torch.no_grad()\ndef main():\n    with open(\"config.json\") as f:\n        config = json.load(f)\n\n    model = (\n        KModel(\n            repo_id=\"not-used-any-value-is-ok\",\n            model=\"kokoro-v1_1-zh.pth\",\n            config=config,\n            disable_complex=True,\n        )\n        .to(\"cpu\")\n        .eval()\n    )\n\n    x = torch.randint(1, 100, (48,)).numpy()\n    x = torch.LongTensor([[0, *x, 0]])\n\n    style = torch.rand(1, 256, dtype=torch.float32)\n    speed = torch.rand(1)\n\n    print(x.shape, x.dtype)\n    print(style.shape, style.dtype)\n    print(speed, speed.dtype)\n\n    model2 = KModelForONNX(model)\n\n    torch.onnx.export(\n        model2,\n        (x, style, speed),\n        \"kokoro.onnx\",\n        input_names=[\"tokens\", \"style\", \"speed\"],\n        output_names=[\"audio\"],\n        dynamic_axes={\n            \"tokens\": {1: \"sequence_length\"},\n            \"audio\": {0: \"audio_length\"},\n        },\n        opset_version=14,  # minimum working version for this kokoro model is 14\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/generate_lexicon_zh.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport re\nfrom typing import List, Tuple\n\nfrom misaki import zh\nfrom misaki.token import MToken\nfrom misaki.zh_frontend import ZH_MAP\nfrom pypinyin import load_phrases_dict, phrases_dict, pinyin_dict\n\nuser_dict = {\n    \"还田\": [[\"huan2\"], [\"tian2\"]],\n    \"行长\": [[\"hang2\"], [\"zhang3\"]],\n    \"银行行长\": [[\"yin2\"], [\"hang2\"], [\"hang2\"], [\"zhang3\"]],\n}\n\nload_phrases_dict(user_dict)\n\nphrases_dict.phrases_dict.update(**user_dict)\n\n\ndef process_text(self, text, with_erhua=True):\n    \"\"\"\n    This function is modified from\n    https://github.com/hexgrad/misaki/blob/main/misaki/zh_frontend.py#L155\n\n    Note that we have removed jieba.posseg.lcut().\n    \"\"\"\n    seg_cut = [(text, \"v\")]\n    seg_cut = self.tone_modifier.pre_merge_for_modify(seg_cut)\n    tokens = []\n    seg_cut = self.tone_modifier.pre_merge_for_modify(seg_cut)\n    initials = []\n    finals = []\n    # pypinyin, g2pM\n    for word, pos in seg_cut:\n        if pos == \"x\" and \"\\u4E00\" <= min(word) and max(word) <= \"\\u9FFF\":\n            pos = \"X\"\n        elif pos != \"x\" and word in self.punc:\n            pos = \"x\"\n        tk = MToken(text=word, tag=pos, whitespace=\"\")\n        if pos in (\"x\", \"eng\"):\n            if not word.isspace():\n                if pos == \"x\" and word in self.punc:\n                    tk.phonemes = word\n                tokens.append(tk)\n            elif tokens:\n                tokens[-1].whitespace += word\n            continue\n        elif (\n            tokens and tokens[-1].tag not in (\"x\", \"eng\") and not tokens[-1].whitespace\n        ):\n            tokens[-1].whitespace = \"/\"\n\n        # g2p\n        sub_initials, sub_finals = self._get_initials_finals(word)\n        # tone sandhi\n        sub_finals = self.tone_modifier.modified_tone(word, pos, sub_finals)\n        # er hua\n        if 
with_erhua:\n            sub_initials, sub_finals = self._merge_erhua(\n                sub_initials, sub_finals, word, pos\n            )\n\n        initials.append(sub_initials)\n        finals.append(sub_finals)\n        # assert len(sub_initials) == len(sub_finals) == len(word)\n\n        # sum(iterable[, start])\n        # initials = sum(initials, [])\n        # finals = sum(finals, [])\n\n        phones = []\n        for c, v in zip(sub_initials, sub_finals):\n            # NOTE: post process for pypinyin outputs\n            # we discriminate i, ii and iii\n            if c:\n                phones.append(c)\n            # replace punctuation by ` `\n            # if c and c in self.punc:\n            #     phones.append(c)\n            if v and (v not in self.punc or v != c):  # and v not in self.rhy_phns:\n                phones.append(v)\n        phones = \"_\".join(phones).replace(\"_eR\", \"_er\").replace(\"R\", \"_R\")\n        phones = re.sub(r\"(?=\\d)\", \"_\", phones).split(\"_\")\n        tk.phonemes = \"\".join(ZH_MAP.get(p, self.unk) for p in phones)\n        tokens.append(tk)\n\n    result = \"\".join(\n        (self.unk if tk.phonemes is None else tk.phonemes) + tk.whitespace\n        for tk in tokens\n    )\n\n    return result, tokens\n\n\ndef generate_chinese_lexicon():\n    word_dict = pinyin_dict.pinyin_dict\n    phrases = phrases_dict.phrases_dict\n\n    g2p = zh.ZHG2P(version=\"1.1\")\n\n    lexicon = []\n    for key in word_dict:\n        if not (0x4E00 <= key <= 0x9FFF):\n            continue\n        w = chr(key)\n        tokens: str = process_text(g2p.frontend, w)[0]\n        lexicon.append((w, tokens))\n\n    for key in phrases:\n        tokens: str = process_text(g2p.frontend, key)[0]\n        lexicon.append((key, tokens))\n    return lexicon\n\n\ndef save(filename: str, lexicon: List[Tuple[str, str]]):\n    with open(filename, \"w\", encoding=\"utf-8\") as f:\n        for word, phones in lexicon:\n            tokens = \" 
\".join(list(phones))\n            f.write(f\"{word} {tokens}\\n\")\n\n\ndef main():\n    zh = generate_chinese_lexicon()\n\n    save(\"lexicon-zh.txt\", zh)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\nimport sherpa_onnx\nimport soundfile as sf\n\nfrom generate_voices_bin import speaker2id\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        kokoro=sherpa_onnx.OfflineTtsKokoroModelConfig(\n            model=\"./kokoro.onnx\",\n            voices=\"./voices.bin\",\n            tokens=\"./tokens.txt\",\n            data_dir=\"./espeak-ng-data\",\n            dict_dir=\"./dict\",\n            lexicon=\"./lexicon-zh.txt,./lexicon-us-en.txt\",\n        ),\n        num_threads=2,\n        debug=True,\n    ),\n    rule_fsts=\"./phone-zh.fst,./date-zh.fst,./number-zh.fst\",\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"This model supports both Chinese and English. 小米的核心价值观是什么？答案是真诚热爱！有困难，请拨打110 或者18601200909。I am learning 机器学习. 我在研究 machine learning。What do you think 中英文说的如何呢? 今天是 2025年6月18号.\"\n\nprint(\"text\", text)\n\nfor s, i in speaker2id.items():\n    print(s, i, len(speaker2id))\n    audio = tts.generate(text, sid=i, speed=1.0)\n\n    sf.write(\n        f\"./hf/kokoro/v1.1-zh/mp3/{i}-{s}.mp3\",\n        audio.samples,\n        samplerate=audio.sample_rate,\n    )\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/generate_voices_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport torch\nfrom pathlib import Path\n\n\nspeakers = [\n    \"af_maple\",\n    \"af_sol\",\n    \"bf_vale\",\n]\nfor i in range(1, 99 + 1):\n    name = \"zf_{:03d}\".format(i)\n    if Path(f\"voices/{name}.pt\").is_file():\n        speakers.append(name)\n\nfor i in range(9, 100 + 1):\n    name = \"zm_{:03d}\".format(i)\n    if Path(f\"voices/{name}.pt\").is_file():\n        speakers.append(name)\n\n\nid2speaker = {index: value for index, value in enumerate(speakers)}\n\nspeaker2id = {speaker: idx for idx, speaker in id2speaker.items()}\n\n\ndef main():\n    if Path(\"./voices.bin\").is_file():\n        print(\"./voices.bin exists - skip\")\n        return\n\n    with open(\"voices.bin\", \"wb\") as f:\n        for _, speaker in id2speaker.items():\n            m = torch.load(\n                f\"voices/{speaker}.pt\",\n                weights_only=True,\n                map_location=\"cpu\",\n            ).numpy()\n            # m.shape (510, 1, 256)\n\n            f.write(m.tobytes())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/kokoro/v1.1-zh/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport re\nimport time\nfrom typing import Dict, List\n\nimport jieba\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\n\ndef show(filename):\n    session_opts = ort.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = ort.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nNodeArg(name='tokens', type='tensor(int64)', shape=[1, 'sequence_length'])\nNodeArg(name='style', type='tensor(float)', shape=[1, 256])\nNodeArg(name='speed', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['audio_length'])\n\"\"\"\n\n\ndef load_voices(speaker_names: List[str], dim: List[int], voices_bin: str):\n    embedding = (\n        np.fromfile(voices_bin, dtype=\"uint8\")\n        .view(np.float32)\n        .reshape(len(speaker_names), *dim)\n    )\n    print(\"embedding.shape\", embedding.shape)\n    ans = dict()\n    for i in range(len(speaker_names)):\n        ans[speaker_names[i]] = embedding[i]\n\n    return ans\n\n\ndef load_tokens(filename: str) -> Dict[str, int]:\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 2:\n                token, idx = fields\n                ans[token] = int(idx)\n            else:\n                assert len(fields) == 1, (len(fields), line)\n                ans[\" \"] = int(fields[0])\n    return ans\n\n\ndef load_lexicon(filename: str) -> Dict[str, List[str]]:\n    ans = 
dict()\n    for lexicon in filename.split(\",\"):\n        print(lexicon)\n        with open(lexicon, encoding=\"utf-8\") as f:\n            for line in f:\n                w, tokens = line.strip().split(\" \", maxsplit=1)\n                ans[w] = \"\".join(tokens.split())\n    return ans\n\n\nclass OnnxModel:\n    def __init__(self, model_filename: str, tokens: str, lexicon: str, voices_bin: str):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 3\n        session_opts.intra_op_num_threads = 3\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            model_filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        self.token2id = load_tokens(tokens)\n        self.word2tokens = load_lexicon(lexicon)\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n        dim = list(map(int, meta[\"style_dim\"].split(\",\")))\n        speaker_names = meta[\"speaker_names\"].split(\",\")\n        self.voices = load_voices(\n            speaker_names=speaker_names, dim=dim, voices_bin=voices_bin\n        )\n        self.sample_rate = int(meta[\"sample_rate\"])\n        print(list(self.voices.keys()))\n\n        self.sample_rate = 24000\n        self.max_len = self.voices[next(iter(self.voices))].shape[0] - 1\n\n    def __call__(self, text: str, voice: str):\n        punctuations = ';:,.!?-…()\"“”'\n        text = text.lower()\n\n        tokens = \"\"\n\n        for t in re.findall(\"[\\u4E00-\\u9FFF]+|[\\u0000-\\u007f]+\", text):\n            if ord(t[0]) < 0x7F:\n                for w in t.split():\n                    while w:\n                        if w[0] in punctuations:\n                            tokens += w[0] + \" \"\n                            w = w[1:]\n                            continue\n\n                        if w[-1] in punctuations:\n                            if w[:-1] in 
self.word2tokens:\n                                tokens += self.word2tokens[w[:-1]]\n                            else:\n                                # Fall back to espeak-ng rather than silently dropping the word\n                                print(f\"Use espeak-ng for word {w[:-1]}\")\n                                tokens += \"\".join(\n                                    phonemize_espeak(w[:-1], \"en-us\")[0]\n                                )\n                            tokens += w[-1]\n                        else:\n                            if w in self.word2tokens:\n                                tokens += self.word2tokens[w]\n                            else:\n                                print(f\"Use espeak-ng for word {w}\")\n                                tokens += \"\".join(phonemize_espeak(w, \"en-us\")[0])\n\n                        tokens += \" \"\n                        break\n            else:\n                # Chinese\n                for w in jieba.cut(t):\n                    if w in self.word2tokens:\n                        tokens += self.word2tokens[w]\n                    else:\n                        for i in w:\n                            if i in self.word2tokens:\n                                tokens += self.word2tokens[i]\n                            else:\n                                print(f\"skip {i}\")\n\n        token_ids = [self.token2id[i] for i in tokens]\n        token_ids = token_ids[: self.max_len]\n\n        style = self.voices[voice][len(token_ids)]\n\n        token_ids = [0, *token_ids, 0]\n        token_ids = np.array([token_ids], dtype=np.int64)\n\n        speed = np.array([1.0], dtype=np.float32)\n\n        audio = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: token_ids,\n                self.model.get_inputs()[1].name: style,\n                self.model.get_inputs()[2].name: speed,\n            },\n        )[0]\n        return audio\n\n\ndef main():\n    m = OnnxModel(\n        model_filename=\"./kokoro.onnx\",\n        tokens=\"./tokens.txt\",\n        lexicon=\"./lexicon-us-en.txt,./lexicon-zh.txt\",\n        voices_bin=\"./voices.bin\",\n    )\n    text = \"来听一听, 这个是什么口音? How are you doing? Are you ok? Thank you! 
你觉得中英文说得如何呢?\"\n\n    text = text.lower()\n\n    voice = \"zf_001\"\n    start = time.time()\n    audio = m(text, voice=voice)\n    end = time.time()\n\n    elapsed_seconds = end - start\n    audio_duration = len(audio) / m.sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n\n    filename = f\"kokoro_v1.1_{voice}_zh_en.wav\"\n    sf.write(\n        filename,\n        audio,\n        samplerate=m.sample_rate,\n        subtype=\"PCM_16\",\n    )\n    print(f\" Saved to {filename}\")\n    print(f\" Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\" Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\" RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/lazarus/generate-subtitles.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nfrom typing import List, Optional\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    model_name: str\n    lang: str\n    short_name: str = \"\"\n    cmd: str = \"\"\n\n\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-whisper-tiny.en\",\n            lang=\"en\",\n            short_name=\"whisper_tiny.en\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv tiny.en-encoder.onnx\n            rm -fv tiny.en-decoder.onnx\n\n            mv -v tiny.en-encoder.int8.onnx whisper-encoder.onnx\n            mv -v tiny.en-decoder.int8.onnx whisper-decoder.onnx\n            mv -v tiny.en-tokens.txt tokens.txt\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-int8\",\n            lang=\"en\",\n            short_name=\"moonshine_tiny\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v preprocess.onnx moonshine-preprocessor.onnx\n            mv -v encode.int8.onnx moonshine-encoder.onnx\n            mv -v uncached_decode.int8.onnx moonshine-uncached-decoder.onnx\n            mv -v cached_decode.int8.onnx moonshine-cached-decoder.onnx\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\",\n            lang=\"zh_en_ko_ja_yue\",\n            short_name=\"sense_voice\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv model.onnx\n            mv -v 
model.int8.onnx sense-voice.onnx\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-2023-09-14\",\n            lang=\"zh_en\",\n            short_name=\"paraformer_2023_09_14\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv model.onnx\n            mv -v model.int8.onnx paraformer.onnx\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-small-2024-03-09\",\n            lang=\"zh_en\",\n            short_name=\"paraformer_small_2024_03_09\",\n            cmd=\"\"\"\n            pushd $model_name\n            rm -fv model.onnx\n            mv -v model.int8.onnx paraformer.onnx\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-gigaspeech-2023-12-12\",\n            lang=\"en\",\n            short_name=\"zipformer_gigaspeech_2023_12_12\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv encoder-epoch-30-avg-1.int8.onnx transducer-encoder.onnx\n            mv decoder-epoch-30-avg-1.onnx transducer-decoder.onnx\n            mv joiner-epoch-30-avg-1.int8.onnx transducer-joiner.onnx\n\n            rm -fv encoder-epoch-30-avg-1.onnx\n            rm -fv decoder-epoch-30-avg-1.int8.onnx\n            rm -fv joiner-epoch-30-avg-1.onnx\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-wenetspeech-20230615\",\n            lang=\"zh\",\n            short_name=\"zipformer_wenetspeech\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            mv -v data/lang_char/tokens.txt ./\n            rm -rfv data/lang_char\n\n            mv -v exp/encoder-epoch-12-avg-4.int8.onnx ./\n            mv -v exp/decoder-epoch-12-avg-4.onnx ./\n            mv -v exp/joiner-epoch-12-avg-4.int8.onnx ./\n            rm 
-rfv exp\n\n            mv -v encoder-epoch-12-avg-4.int8.onnx transducer-encoder.onnx\n            mv -v decoder-epoch-12-avg-4.onnx transducer-decoder.onnx\n            mv -v joiner-epoch-12-avg-4.int8.onnx transducer-joiner.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01\",\n            lang=\"ja\",\n            short_name=\"zipformer_reazonspeech_2024_08_01\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv encoder-epoch-99-avg-1.int8.onnx transducer-encoder.onnx\n            mv decoder-epoch-99-avg-1.onnx transducer-decoder.onnx\n            mv joiner-epoch-99-avg-1.int8.onnx transducer-joiner.onnx\n\n            rm -fv encoder-epoch-99-avg-1.onnx\n            rm -fv decoder-epoch-99-avg-1.int8.onnx\n            rm -fv joiner-epoch-99-avg-1.onnx\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-thai-2024-06-20\",\n            lang=\"th\",\n            short_name=\"zipformer_gigaspeech2\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n            rm -fv README.md\n            rm -fv bpe.model\n\n            mv encoder-epoch-12-avg-5.int8.onnx transducer-encoder.onnx\n            mv decoder-epoch-12-avg-5.onnx transducer-decoder.onnx\n            mv joiner-epoch-12-avg-5.int8.onnx transducer-joiner.onnx\n\n            rm -fv encoder-epoch-12-avg-5.onnx\n            rm -fv decoder-epoch-12-avg-5.int8.onnx\n            rm -fv joiner-epoch-12-avg-5.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\",\n            lang=\"zh\",\n            short_name=\"telespeech_ctc\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv model.int8.onnx telespeech.onnx\n            rm -fv 
model.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\",\n            lang=\"en\",\n            short_name=\"parakeet_tdt_0.6b_v2\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            rm -rfv test_wavs\n\n            mv -v encoder.int8.onnx nemo-transducer-encoder.onnx\n            mv -v decoder.int8.onnx nemo-transducer-decoder.onnx\n            mv -v joiner.int8.onnx nemo-transducer-joiner.onnx\n\n            ls -lh\n\n            popd\n            \"\"\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./build-generate-subtitles.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/matcha-tts/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for adding metadata to TTS models\nfrom https://github.com/shivammehta25/Matcha-TTS\n\nNote: If you use icefall to train a MatchaTTS model, you don't need this folder.\n"
  },
  {
    "path": "scripts/matcha-tts/en/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(\n            acoustic_model=\"matcha-icefall-en_US-ljspeech/model-steps-3.onnx\",\n            vocoder=\"vocos-22khz-univ.onnx\",\n            tokens=\"matcha-icefall-en_US-ljspeech/tokens.txt\",\n            lexicon=\"\",\n            data_dir=\"matcha-icefall-en_US-ljspeech/espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"Friends fell out often because life was changing so fast. The easiest thing in the world was to lose touch with someone.\"\n\naudio = tts.generate(text, sid=0, speed=1.0)\n\nsf.write(\n    \"./hf/matcha/icefall-en-ljspeech/mp3/0.mp3\",\n    audio.samples,\n    samplerate=audio.sample_rate,\n)\n"
  },
  {
    "path": "scripts/matcha-tts/fa-en/.gitignore",
    "content": ".add-meta-data.done\n"
  },
  {
    "path": "scripts/matcha-tts/fa-en/README.md",
    "content": "# Introduction\n\nThis folder is for\nhttps://github.com/k2-fsa/sherpa-onnx/issues/1779\n"
  },
  {
    "path": "scripts/matcha-tts/fa-en/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n\nfrom typing import Any, Dict\n\nimport onnx\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef main():\n    meta_data = {\n        \"model_type\": \"matcha-tts\",\n        \"language\": \"Persian+English\",\n        \"voice\": \"fa\",\n        \"has_espeak\": 1,\n        \"jieba\": 0,\n        \"n_speakers\": 1,\n        \"sample_rate\": 22050,\n        \"version\": 1,\n        \"pad_id\": 0,\n        \"use_icefall\": 0,\n        \"model_author\": \"Ali Mahmoudi (@mah92)\",\n        \"maintainer\": \"k2-fsa\",\n        \"use_eos_bos\": 0,\n        \"num_ode_steps\": 5,\n        \"see_also\": \"https://github.com/k2-fsa/sherpa-onnx/issues/1779\",\n    }\n    add_meta_data(\"./female/model.onnx\", meta_data)\n    add_meta_data(\"./male/model.onnx\", meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/matcha-tts/fa-en/test.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nAM\nNodeArg(name='x', type='tensor(int64)', shape=['batch_size', 'time'])\nNodeArg(name='x_lengths', type='tensor(int64)', shape=['batch_size'])\nNodeArg(name='scales', type='tensor(float)', shape=[2])\n-----\nNodeArg(name='mel', type='tensor(float)', shape=['batch_size', 80, 'time'])\nNodeArg(name='mel_lengths', type='tensor(int64)', shape=['batch_size'])\n\nVocoder\nNodeArg(name='mel', type='tensor(float)', shape=['N', 80, 'L'])\n-----\nNodeArg(name='audio', type='tensor(float)', shape=['N', 'L'])\n\"\"\"\n\nimport argparse\n\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--am\", type=str, required=True, help=\"Path to the acoustic model\"\n    )\n\n    parser.add_argument(\n        \"--vocoder\", type=str, required=True, help=\"Path to the vocoder\"\n    )\n    parser.add_argument(\n        \"--tokens\", type=str, required=True, help=\"Path to the tokens.txt\"\n    )\n\n    parser.add_argument(\n        \"--text\", type=str, required=True, help=\"The text for generation\"\n    )\n\n    parser.add_argument(\n        \"--out-wav\", type=str, required=True, help=\"Path to save the generated wav\"\n    )\n    return parser.parse_args()\n\n\ndef load_tokens(filename: str):\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 1:\n                ans[\" \"] = int(fields[0])\n            else:\n                assert len(fields) == 2, (line, fields)\n                ans[fields[0]] = int(fields[1])\n    return ans\n\n\nclass 
OnnxHifiGANModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n\n    def __call__(self, x: np.ndarray):\n        assert x.ndim == 3, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        audio = self.model.run(\n            [self.model.get_outputs()[0].name],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n        # audio: (batch_size, num_samples)\n\n        return audio\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n        tokens: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 2\n\n        self.session_opts = session_opts\n        self.token2id = load_tokens(tokens)\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(f\"{self.model.get_modelmeta().custom_metadata_map}\")\n        metadata = self.model.get_modelmeta().custom_metadata_map\n        self.sample_rate = int(metadata[\"sample_rate\"])\n\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n\n    def __call__(self, x: np.ndarray):\n        assert x.ndim == 2, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        x_lengths = 
np.array([x.shape[1]], dtype=np.int64)\n\n        noise_scale = 1.0\n        length_scale = 1.0\n        scales = np.array([noise_scale, length_scale], dtype=np.float32)\n\n        mel = self.model.run(\n            [self.model.get_outputs()[0].name],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: x_lengths,\n                self.model.get_inputs()[2].name: scales,\n            },\n        )[0]\n        # mel: (batch_size, feat_dim, num_frames)\n\n        return mel\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    am = OnnxModel(args.am, args.tokens)\n    vocoder = OnnxHifiGANModel(args.vocoder)\n\n    phones = phonemize_espeak(args.text, voice=\"fa\")\n    phones = sum(phones, [])\n    phone_ids = [am.token2id[i] for i in phones]\n\n    padded_phone_ids = [0] * (len(phone_ids) * 2 + 1)\n    padded_phone_ids[1::2] = phone_ids\n\n    tokens = np.array([padded_phone_ids], dtype=np.int64)\n    mel = am(tokens)\n    audio = vocoder(mel)\n\n    sf.write(args.out_wav, audio[0], am.sample_rate, \"PCM_16\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/matcha-tts/zh/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(\n            acoustic_model=\"matcha-icefall-zh-baker/model-steps-3.onnx\",\n            vocoder=\"vocos-22khz-univ.onnx\",\n            lexicon=\"matcha-icefall-zh-baker/lexicon.txt\",\n            tokens=\"matcha-icefall-zh-baker/tokens.txt\",\n            dict_dir=\"matcha-icefall-zh-baker/dict\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n    rule_fsts=\"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\",\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。当夜幕降临，星光点点，伴随着微风拂面，我在静谧中感受着时光的流转，思念如涟漪荡漾，梦境如画卷展开，我与自然融为一体，沉静在这片宁静的美丽之中，感受着生命的奇迹与温柔.\"\n\n\naudio = tts.generate(text, sid=0, speed=1.0)\n\nsf.write(\n    \"./hf/matcha/icefall-zh/mp3/0.mp3\",\n    audio.samples,\n    samplerate=audio.sample_rate,\n)\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/.gitignore",
    "content": "vocab_tts.txt\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/README.md",
    "content": "# Introduction\n\nModel files are from\nhttps://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010/summary\n\nNote that you have to use\nvocos-16khz-univ.onnx\n\nYou can download it from\n https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010/resolve/master/vocos-16khz-univ.onnx\nor\n https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-16khz-univ.onnx\n\n```\n{'am': './model-steps-3.onnx', 'vocoder': './vocos-16khz-univ.onnx', 'tokens': './tokens.txt', 'lexicon': './lexicon.txt', 'text': '中英文合成测试. It supports both English 和中文合成', 'out_wav': 'generated.wav'}\n\n{'use_eos_bos': '1', 'modelscope_url': 'https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010', 'sample_rate': '16000', 'language': 'chinese English', 'model_type': 'matcha-tts', 'n_speakers': '1', 'model_author': 'dengcunqin', 'version': '1', 'pad_id': '0', 'voice': 'zh en-us', 'demo_url': 'https://www.tulingyun.com/tts.html', 'num_ode_steps': '3'}\n\nNodeArg(name='x', type='tensor(int64)', shape=['N', 'L'])\nNodeArg(name='x_length', type='tensor(int64)', shape=['N'])\nNodeArg(name='noise_scale', type='tensor(float)', shape=[1])\nNodeArg(name='length_scale', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='mel', type='tensor(float)', shape=['N', 80, 'L'])\n\nvocos {'modelscope_url': 'https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010', 'use_eos_bos': '1', 'n_speakers': '1', 'sample_rate': '16000', 'pad_id': '0', 'language': 'chinese English', 'model_type': 'matcha-tts vocos', 'voice': 'zh en-us', 'version': '1', 'demo_url': 'https://www.tulingyun.com/tts.html', 'model_author': 'dengcunqin'}\n\n----------vocos----------\nNodeArg(name='mels', type='tensor(float)', shape=['batch_size', 80, 'time'])\n-----\nNodeArg(name='mag', type='tensor(float)', shape=['batch_size', 'Clipmag_dim_1', 'time'])\nNodeArg(name='x', type='tensor(float)', shape=['batch_size', 'Cosx_dim_1', 'time'])\nNodeArg(name='y', type='tensor(float)', 
shape=['batch_size', 'Cosx_dim_1', 'time'])\n```\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/generate_lexicon.py",
    "content": "#!/usr/bin/env python3\n\nfrom pypinyin import Style, pinyin, load_phrases_dict, phrases_dict, pinyin_dict\n\nload_phrases_dict(\n    {\n        \"行长\": [[\"hang2\"], [\"zhang3\"]],\n        \"银行行长\": [[\"yin2\"], [\"hang2\"], [\"hang2\"], [\"zhang3\"]],\n    }\n)\nuser_defined = {\n    \"微调\": [\"wei1\", \"tiao2\"],\n    \"这个\": [\"zhe4\", \"ge4\"],\n    \"方便地\": [\"fang1\", \"bian2\", \"de1\"],\n}\n\n\ndef main():\n    filename = \"lexicon.txt\"\n\n    word_dict = pinyin_dict.pinyin_dict\n    phrases = phrases_dict.phrases_dict\n\n    i = 0\n    with open(filename, \"w\", encoding=\"utf-8\") as f:\n        for key in word_dict:\n            if not (0x4E00 <= key <= 0x9FFF):\n                continue\n\n            w = chr(key)\n            tokens = pinyin(w, style=Style.TONE3, neutral_tone_with_five=True)[0][0]\n\n            if tokens == \"shei2\":\n                tokens = \"shui2\"\n\n            if tokens[-1] not in (\"1\", \"2\", \"3\", \"4\", \"5\"):\n                tokens += \"1\"\n\n            f.write(f\"{w} {tokens}\\n\")\n\n        for key, value in user_defined.items():\n            f.write(f\"{key} {' '.join(value)}\\n\")\n\n        for key in phrases:\n            if key in user_defined:\n                continue\n            tokens = pinyin(key, style=Style.TONE3, neutral_tone_with_five=True)\n\n            for i in range(len(tokens)):\n                if tokens[i][0] == \"shei2\":\n                    tokens[i][0] = \"shui2\"\n\n                if tokens[i][0][-1] not in (\"1\", \"2\", \"3\", \"4\", \"5\"):\n                    tokens[i][0] += \"1\"\n\n            flatten = [t[0] for t in tokens]\n\n            tokens = \" \".join(flatten)\n\n            f.write(f\"{key} {tokens}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/generate_samples.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nGenerate samples for\nhttps://k2-fsa.github.io/sherpa/onnx/tts/all/\n\"\"\"\n\n\nimport sherpa_onnx\nimport soundfile as sf\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        matcha=sherpa_onnx.OfflineTtsMatchaModelConfig(\n            acoustic_model=\"matcha-icefall-zh-en/model-steps-3.onnx\",\n            vocoder=\"vocos-16khz-univ.onnx\",\n            lexicon=\"matcha-icefall-zh-en/lexicon.txt\",\n            tokens=\"matcha-icefall-zh-en/tokens.txt\",\n            data_dir=\"matcha-icefall-zh-en/espeak-ng-data\",\n        ),\n        num_threads=2,\n    ),\n    max_num_sentences=1,\n    rule_fsts=\"./matcha-icefall-zh-en/phone-zh.fst,./matcha-icefall-zh-en/date-zh.fst,./matcha-icefall-zh-en/number-zh.fst\",\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\ntext = \"我最近在学习machine learning，希望能够在未来的artificial intelligence领域有所建树。在这次vocation中，我们计划去Paris欣赏埃菲尔铁塔和卢浮宫的美景。某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。开始数字测试。2025年12月4号，拨打110或者189202512043。123456块钱。在这个快速发展的时代，人工智能技术正在改变我们的生活方式。语音合成作为人工智能的重要应用之一，让机器能够用自然流畅的语音与人类进行交流。\"\n\n\naudio = tts.generate(text, sid=0, speed=1.0)\n\nsf.write(\n    \"./hf/matcha/icefall-zh-en/mp3/0.mp3\",\n    audio.samples,\n    samplerate=audio.sample_rate,\n)\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n\ntoken2id = dict()\nwith open(\"./vocab_tts.txt\", encoding=\"utf-8\") as f:\n    for i, line in enumerate(f):\n        fields = line.strip().split()\n        if len(fields) == 0:\n            token2id[\" \"] = i + 1\n        else:\n            token2id[fields[0]] = i + 1\n\nwith open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n    for t, i in token2id.items():\n        f.write(f\"{t} {i}\\n\")\n"
  },
  {
    "path": "scripts/matcha-tts/zh-en/test.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nAM\n\nNodeArg(name='x', type='tensor(int64)', shape=['N', 'L'])\nNodeArg(name='x_length', type='tensor(int64)', shape=['N'])\nNodeArg(name='noise_scale', type='tensor(float)', shape=[1])\nNodeArg(name='length_scale', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='mel', type='tensor(float)', shape=['N', 80, 'L'])\n\nVocoder\n\nNodeArg(name='mels', type='tensor(float)', shape=['batch_size', 80, 'time'])\n-----\nNodeArg(name='mag', type='tensor(float)', shape=['batch_size', 'Clipmag_dim_1', 'time'])\nNodeArg(name='x', type='tensor(float)', shape=['batch_size', 'Cosx_dim_1', 'time'])\nNodeArg(name='y', type='tensor(float)', shape=['batch_size', 'Cosx_dim_1', 'time'])\n\"\"\"\n\nimport argparse\n\nimport re\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--am\",\n        type=str,\n        default=\"./model-steps-3.onnx\",\n        help=\"Path to the acoustic model\",\n    )\n\n    parser.add_argument(\n        \"--vocoder\",\n        type=str,\n        default=\"./vocos-16khz-univ.onnx\",\n        help=\"Path to the vocoder\",\n    )\n    parser.add_argument(\n        \"--tokens\", type=str, default=\"./tokens.txt\", help=\"Path to the tokens.txt\"\n    )\n\n    parser.add_argument(\n        \"--lexicon\", type=str, default=\"./lexicon.txt\", help=\"Path to the lexicon.txt\"\n    )\n\n    parser.add_argument(\n        \"--text\",\n        type=str,\n        #  default=\"这是一个中英文测试. It can also speak English. 你觉得中英文说的如何呀?\",\n        default=\"中英文合成测试. 
It supports both English 和中文合成\",\n        help=\"The text for generation\",\n    )\n\n    parser.add_argument(\n        \"--out-wav\",\n        type=str,\n        default=\"generated.wav\",\n        help=\"Path to save the generated wav\",\n    )\n    return parser.parse_args()\n\n\ndef load_tokens(filename: str):\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 1:\n                ans[\" \"] = int(fields[0])\n            else:\n                assert len(fields) == 2, (line, fields)\n                ans[fields[0]] = int(fields[1])\n    return ans\n\n\ndef load_lexicon(filename: str, token2id):\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            tokens = fields[1:]\n            ids = [token2id[t] for t in tokens]\n            ans[fields[0]] = ids\n    return ans\n\n\nclass OnnxVocosModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        print(f\"vocos {self.model.get_modelmeta().custom_metadata_map}\")\n\n        print(\"----------vocos----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print()\n\n    def __call__(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (N, feat_dim, num_frames)\n        Returns:\n          mag: (N, n_fft/2+1, num_frames)\n          x: (N, n_fft/2+1, num_frames)\n          y: (N, n_fft/2+1, 
num_frames)\n\n        The complex spectrum is mag * (x + j*y)\n        \"\"\"\n        assert x.ndim == 3, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        mag, x, y = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )\n\n        return mag, x, y\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 2\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(f\"{self.model.get_modelmeta().custom_metadata_map}\")\n        metadata = self.model.get_modelmeta().custom_metadata_map\n        self.sample_rate = int(metadata[\"sample_rate\"])\n\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n\n    def __call__(self, x: np.ndarray):\n        assert x.ndim == 2, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        x_lengths = np.array([x.shape[1]], dtype=np.int64)\n\n        noise_scale = 1.0\n        length_scale = 1.0\n\n        mel = self.model.run(\n            [self.model.get_outputs()[0].name],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: x_lengths,\n                self.model.get_inputs()[2].name: np.array(\n                    [noise_scale], dtype=np.float32\n                ),\n                self.model.get_inputs()[3].name: np.array(\n                    [length_scale], 
dtype=np.float32\n                ),\n            },\n        )[0]\n        # mel: (batch_size, feat_dim, num_frames)\n\n        return mel\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    am = OnnxModel(args.am)\n    vocoder = OnnxVocosModel(args.vocoder)\n\n    token2id = load_tokens(args.tokens)\n    id2token = {i: t for t, i in token2id.items()}\n    lexicon = load_lexicon(args.lexicon, token2id)\n\n    text = args.text\n\n    pattern = re.compile(r\"[\\u4e00-\\u9fff]+|[a-zA-Z0-9 ,.!\\?]+\")\n\n    ids = []\n    for match in pattern.finditer(text):\n        segment = match.group()\n        if segment in token2id:\n            print(segment)\n            ids.append(token2id[segment])\n        elif re.match(r\"[\\u4e00-\\u9fff]+\", segment):\n            # process chinese\n            print(segment)\n            for w in segment:\n                if w in lexicon:\n                    ids += lexicon[w]\n                else:\n                    print(f\"Ignore {w}\")\n        else:\n            print(segment)\n            segment = segment.strip()\n            tokens_list = phonemize_espeak(segment, \"en-us\")\n            tokens = sum(tokens_list, [])\n            for t in tokens:\n                ids.append(token2id[t])\n\n    tokens = np.array([ids], dtype=np.int64)\n    mel = am(tokens)\n    print(tokens)\n    print(mel.shape)\n\n    mag, x, y = vocoder(mel)\n    stft_result = knf.StftResult(\n        real=(mag * x)[0].transpose().reshape(-1).tolist(),\n        imag=(mag * y)[0].transpose().reshape(-1).tolist(),\n        num_frames=mag.shape[2],\n    )\n    config = knf.StftConfig(\n        n_fft=1024,\n        hop_length=256,\n        win_length=1024,\n        window_type=\"hann\",\n        center=True,\n        pad_mode=\"reflect\",\n        normalized=False,\n    )\n    istft = knf.IStft(config)\n    audio_vocos = istft(stft_result)\n\n    audio_vocos = np.array(audio_vocos)\n\n    sf.write(args.out_wav, audio_vocos, am.sample_rate, 
\"PCM_16\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/medasr/README.md",
    "content": "---\nlicense: other\nlicense_name: health-ai-developer-foundations\nlicense_link: https://developers.google.com/health-ai-developer-foundations/terms\nlanguage:\n- en\npipeline_tag: automatic-speech-recognition\nlibrary_name: transformers\ntags:\n- medical-asr\n- radiology\n- medical\n---\n\n# Introduction\n\nThis directory includes models sourced from:\n\nhttps://github.com/Google-Health/medasr\n\nAll model files are governed by the Health AI Developer Foundations Terms of Use.\nFor full licensing details, please refer to:\n\nhttps://developers.google.com/health-ai-developer-foundations/terms\n"
  },
  {
    "path": "scripts/medasr/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nMake sure you have set the environment variable\n\n    export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n\nwhere hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx is your Huggingface access token.\n\"\"\"\n\nfrom typing import Any, Dict\n\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom transformers import AutoModelForCTC, AutoProcessor\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    model.metadata_props.clear()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass Wrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x: torch.Tensor, mask: torch.Tensor):\n        \"\"\"\n        Args:\n          x: (N, T, C), dtype float32\n          mask: (N, T), dtype int64. Valid positions are 1. 
Padding positions are 0.\n        Returns:\n          logits: (N, T/4, vocab_size), dtype float32\n          logits_len: (N,), dtype int64\n        \"\"\"\n        o = self.m(x, mask.bool())\n        logits_len = self.m._get_subsampling_output_length(mask.sum(-1)).to(torch.int64)\n        return o.logits, logits_len\n\n\ndef generate_tokens(tokenizer):\n    vocab = tokenizer.get_vocab()\n    id2token = {i: t for t, i in vocab.items()}\n\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(tokenizer.vocab_size):\n            if i == tokenizer.pad_token_id:\n                f.write(f\"<blk> {i}\\n\")\n            else:\n                f.write(f\"{id2token[i]} {i}\\n\")\n    print(\"saved to tokens.txt\")\n\n\n@torch.no_grad()\ndef main():\n    model_id = \"google/medasr\"\n    processor = AutoProcessor.from_pretrained(model_id)\n\n    generate_tokens(processor.tokenizer)\n\n    model = AutoModelForCTC.from_pretrained(model_id)\n\n    w = Wrapper(model)\n    w.eval()\n\n    filename = \"model.onnx\"\n    x = torch.rand(1, 100, 128)\n    mask = torch.ones(1, x.shape[1], dtype=torch.int64)\n    torch.onnx.export(\n        w,\n        (x, mask),\n        filename,\n        input_names=[\"x\", \"mask\"],\n        output_names=[\"logits\", \"logits_len\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"mask\": {0: \"N\", 1: \"T\"},\n            \"logits\": {0: \"N\", 1: \"T_4\"},\n            \"logits_len\": {0: \"N\"},\n        },\n        opset_version=14,\n        external_data=False,\n        dynamo=False,\n    )\n\n    meta_data = {\n        \"model_type\": \"medasr_ctc\",\n        \"version\": \"20251225\",\n        \"model_author\": \"google\",\n        \"maintainer\": \"k2-fsa\",\n        \"vocab_size\": processor.tokenizer.vocab_size,\n        \"subsampling_factor\": 4,\n        \"url\": \"https://github.com/Google-Health/medasr\",\n        \"license\": 
\"https://developers.google.com/health-ai-developer-foundations/terms\",\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        # Note that we have to use QUInt8 here.\n        #\n        # When QInt8 is used, C++ onnxruntime produces incorrect results\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/medasr/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport time\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to onnx model file\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--wav\",\n        type=str,\n        required=True,\n        help=\"Path to test wav\",\n    )\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def __call__(self, x, mask):\n        \"\"\"\n        Args:\n          x: (N, T, C), float32\n          mask: (N, T), int64\n        Returns:\n          logits: (N, T/4, vocab_size), float32\n          logits_len: (N,) int64\n        \"\"\"\n        logits, logits_len = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: mask,\n            },\n        )\n\n        return logits, logits_len\n\n\ndef load_tokens(tokens):\n    id2token = dict()\n    with open(tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = 
line.split()\n            if len(fields) == 1:\n                id2token[int(fields[0])] = \" \"\n            else:\n                t, idx = fields\n                id2token[int(idx)] = t\n    return id2token\n\n\ndef compute_feat(samples):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = True\n    opts.frame_opts.window_type = \"hanning\"\n    opts.frame_opts.samp_freq = 16000\n    opts.frame_opts.preemph_coeff = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.mel_opts.num_bins = 128\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(16000, samples.tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.dtype == np.float32, features.dtype\n\n    features = np.ascontiguousarray(features)\n\n    return features\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    model = OnnxModel(args.model)\n\n    samples, sample_rate = librosa.load(args.wav, sr=16000)\n\n    start = time.time()\n\n    assert sample_rate == 16000, sample_rate\n    features = compute_feat(samples)\n    mask = np.ones(features.shape[0], dtype=np.int64)[None]\n    features = features[None]\n\n    logits, logits_len = model(features, mask)\n    idx = logits[0, : logits_len[0]].argmax(axis=-1)\n\n    end = time.time()\n    elapsed_seconds = end - start\n    audio_duration = samples.shape[0] / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    print(\"idx\", idx)\n\n    unique_ids = []\n    prev = -1\n    for i in idx.tolist():\n        if i == prev:\n            continue\n        unique_ids.append(i)\n        prev = i\n    print(\"unique_ids\", unique_ids)\n    blank_id = 0\n    ids = [i for i in unique_ids if i != blank_id]\n    print(ids)\n\n    id2token = load_tokens(args.tokens)\n\n    tokens = [id2token[i] for i in ids]\n    text = \"\".join(tokens)\n  
  print(text)\n    text = text.replace(\"▁\", \" \")\n    print(text)\n    print(f\"RTF: {real_time_factor}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/melo-tts/README.md",
    "content": "# Introduction\n\nModels in this directory are converted from\nhttps://github.com/myshell-ai/MeloTTS\n\nNote that there is only a single female speaker in the Chinese+English TTS model,\nwhereas there are 5 female speakers in the English TTS model.\n"
  },
  {
    "path": "scripts/melo-tts/export-onnx-en.py",
    "content": "#!/usr/bin/env python3\n# This script exports the English-only TTS model.\n# It has 5 speakers.\n# {'EN-US': 0, 'EN-BR': 1, 'EN_INDIA': 2, 'EN-AU': 3, 'EN-Default': 4}\n\nfrom typing import Any, Dict\n\nimport onnx\nimport torch\nfrom melo.api import TTS\nfrom melo.text import language_id_map, language_tone_start_map\nfrom melo.text.chinese import pinyin_to_symbol_map\nfrom melo.text.english import eng_dict, refine_syllables\nfrom pypinyin import Style, lazy_pinyin, phrases_dict, pinyin_dict\n\n\ndef generate_tokens(symbol_list):\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(symbol_list):\n            f.write(f\"{s} {i}\\n\")\n\n\ndef add_new_english_words(lexicon):\n    \"\"\"\n    Args:\n      lexicon:\n        Please modify it in-place.\n    \"\"\"\n\n    # Please have a look at\n    # https://github.com/myshell-ai/MeloTTS/blob/main/melo/text/cmudict.rep\n\n    # We give several examples below about how to add new words\n\n    # Example 1. Add a new word kaldi\n\n    # cmudict.rep does not contain the word kaldi,\n    # so if we add the following line to cmudict.rep\n    #\n    #  KALDI K AH0 - L D IH0\n    #\n    # then we need to change the lexicon like below\n    lexicon[\"kaldi\"] = [[\"K\", \"AH0\"], [\"L\", \"D\", \"IH0\"]]\n    #\n    # K AH0 and L D IH0 are separated by a dash \"-\", so\n    # [\"K\", \"AH0\"] is in one list and [\"L\", \"D\", \"IH0\"] is in a separate list\n\n    # Note: Either kaldi or KALDI is fine. You can use either lowercase or\n    # uppercase or both\n\n    # Example 2. 
Add a new word SF\n    #\n    # If we add the following line to cmudict.rep\n    #\n    #  SF EH1 S - EH1 F\n    #\n    # to cmudict.rep, then we need to change the lexicon like below:\n    lexicon[\"SF\"] = [[\"EH1\", \"S\"], [\"EH1\", \"F\"]]\n\n    # Please add your new words here\n\n    # No need to return lexicon since it is changed in-place\n\n\ndef generate_lexicon():\n    add_new_english_words(eng_dict)\n    with open(\"lexicon.txt\", \"w\", encoding=\"utf-8\") as f:\n        for word in eng_dict:\n            phones, tones = refine_syllables(eng_dict[word])\n            tones = [t + language_tone_start_map[\"EN\"] for t in tones]\n            tones = [str(t) for t in tones]\n\n            phones = \" \".join(phones)\n            tones = \" \".join(tones)\n\n            f.write(f\"{word.lower()} {phones} {tones}\\n\")\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass ModelWrapper(torch.nn.Module):\n    def __init__(self, model: \"SynthesizerTrn\"):\n        super().__init__()\n        self.model = model\n        self.lang_id = language_id_map[model.language]\n\n    def forward(\n        self,\n        x,\n        x_lengths,\n        tones,\n        sid,\n        noise_scale,\n        length_scale,\n        noise_scale_w,\n        max_len=None,\n    ):\n        \"\"\"\n        Args:\n          x: A 1-D array of dtype np.int64. Its shape is (token_numbers,)\n          tones: A 1-D array of dtype np.int64. 
Its shape is (token_numbers,)\n          lang_id: A 1-D array of dtype np.int64. Its shape is (token_numbers,)\n          sid: an integer\n        \"\"\"\n        bert = torch.zeros(x.shape[0], 1024, x.shape[1], dtype=torch.float32)\n        ja_bert = torch.zeros(x.shape[0], 768, x.shape[1], dtype=torch.float32)\n        lang_id = torch.zeros_like(x)\n        lang_id[:, 1::2] = self.lang_id\n        return self.model.model.infer(\n            x=x,\n            x_lengths=x_lengths,\n            sid=sid,\n            tone=tones,\n            language=lang_id,\n            bert=bert,\n            ja_bert=ja_bert,\n            noise_scale=noise_scale,\n            noise_scale_w=noise_scale_w,\n            length_scale=length_scale,\n        )[0]\n\n\ndef main():\n    generate_lexicon()\n\n    language = \"EN\"\n    model = TTS(language=language, device=\"cpu\")\n\n    generate_tokens(model.hps[\"symbols\"])\n\n    torch_model = ModelWrapper(model)\n\n    opset_version = 13\n    x = torch.randint(low=0, high=10, size=(60,), dtype=torch.int64)\n    print(x.shape)\n    x_lengths = torch.tensor([x.size(0)], dtype=torch.int64)\n    sid = torch.tensor([1], dtype=torch.int64)\n    tones = torch.zeros_like(x)\n\n    noise_scale = torch.tensor([1.0], dtype=torch.float32)\n    length_scale = torch.tensor([1.0], dtype=torch.float32)\n    noise_scale_w = torch.tensor([1.0], dtype=torch.float32)\n\n    x = x.unsqueeze(0)\n    tones = tones.unsqueeze(0)\n\n    filename = \"model.onnx\"\n\n    torch.onnx.export(\n        torch_model,\n        (\n            x,\n            x_lengths,\n            tones,\n            sid,\n            noise_scale,\n            length_scale,\n            noise_scale_w,\n        ),\n        filename,\n        opset_version=opset_version,\n        input_names=[\n            \"x\",\n            \"x_lengths\",\n            \"tones\",\n            \"sid\",\n            \"noise_scale\",\n            \"length_scale\",\n            \"noise_scale_w\",\n        
],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"L\"},\n            \"x_lengths\": {0: \"N\"},\n            \"tones\": {0: \"N\", 1: \"L\"},\n            \"y\": {0: \"N\", 1: \"S\", 2: \"T\"},\n        },\n    )\n\n    meta_data = {\n        \"model_type\": \"melo-vits\",\n        \"comment\": \"melo\",\n        \"version\": 2,\n        \"language\": \"English\",\n        \"add_blank\": int(model.hps.data.add_blank),\n        \"n_speakers\": len(model.hps.data.spk2id),  # 5\n        \"jieba\": 0,\n        \"sample_rate\": model.hps.data.sampling_rate,\n        \"bert_dim\": 1024,\n        \"ja_bert_dim\": 768,\n        \"speaker_id\": 0,\n        \"lang_id\": language_id_map[model.language],\n        \"tone_start\": language_tone_start_map[model.language],\n        \"url\": \"https://github.com/myshell-ai/MeloTTS\",\n        \"license\": \"MIT license\",\n        \"description\": \"MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai\",\n    }\n    add_meta_data(filename, meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/melo-tts/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# This script exports ZH_EN TTS model, which supports both Chinese and English.\n# This model has only 1 speaker.\n\nfrom typing import Any, Dict\n\nimport onnx\nimport torch\nfrom melo.api import TTS\nfrom melo.text import language_id_map, language_tone_start_map\nfrom melo.text.chinese import pinyin_to_symbol_map\nfrom melo.text.english import eng_dict, refine_syllables\nfrom pypinyin import Style, lazy_pinyin, phrases_dict, pinyin_dict\n\nfor k, v in pinyin_to_symbol_map.items():\n    if isinstance(v, list):\n        break\n    pinyin_to_symbol_map[k] = v.split()\n\n\ndef get_initial_final_tone(word: str):\n    initials = lazy_pinyin(word, neutral_tone_with_five=True, style=Style.INITIALS)\n    finals = lazy_pinyin(word, neutral_tone_with_five=True, style=Style.FINALS_TONE3)\n\n    ans_phone = []\n    ans_tone = []\n\n    for c, v in zip(initials, finals):\n        raw_pinyin = c + v\n        v_without_tone = v[:-1]\n        try:\n            tone = v[-1]\n        except:\n            print(\"skip\", word, initials, finals)\n            return [], []\n\n        pinyin = c + v_without_tone\n        assert tone in \"12345\"\n\n        if c:\n            v_rep_map = {\n                \"uei\": \"ui\",\n                \"iou\": \"iu\",\n                \"uen\": \"un\",\n            }\n            if v_without_tone in v_rep_map.keys():\n                pinyin = c + v_rep_map[v_without_tone]\n        else:\n            pinyin_rep_map = {\n                \"ing\": \"ying\",\n                \"i\": \"yi\",\n                \"in\": \"yin\",\n                \"u\": \"wu\",\n            }\n            if pinyin in pinyin_rep_map.keys():\n                pinyin = pinyin_rep_map[pinyin]\n            else:\n                single_rep_map = {\n                    \"v\": \"yu\",\n                    \"e\": \"e\",\n                    \"i\": \"y\",\n                    \"u\": \"w\",\n                }\n                if pinyin[0] in 
single_rep_map.keys():\n                    pinyin = single_rep_map[pinyin[0]] + pinyin[1:]\n                    #  print(word, initials, finals, pinyin)\n\n        if pinyin not in pinyin_to_symbol_map:\n            print(\"skip\", pinyin, word, c, v, raw_pinyin)\n            continue\n        phone = pinyin_to_symbol_map[pinyin]\n        ans_phone += phone\n        ans_tone += [tone] * len(phone)\n\n    return ans_phone, ans_tone\n\n\ndef generate_tokens(symbol_list):\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(symbol_list):\n            f.write(f\"{s} {i}\\n\")\n\n\ndef add_new_english_words(lexicon):\n    \"\"\"\n    Args:\n      lexicon:\n        Please modify it in-place.\n    \"\"\"\n\n    # Please have a look at\n    # https://github.com/myshell-ai/MeloTTS/blob/main/melo/text/cmudict.rep\n\n    # We give several examples below about how to add new words\n\n    # Example 1. Add a new word kaldi\n\n    # cmudict.rep does not contain the word kaldi,\n    # so if we add the following line to cmudict.rep\n    #\n    #  KALDI K AH0 - L D IH0\n    #\n    # then we need to change the lexicon like below\n    lexicon[\"kaldi\"] = [[\"K\", \"AH0\"], [\"L\", \"D\", \"IH0\"]]\n    #\n    # K AH0 and L D IH0 are separated by a dash \"-\", so\n    # [\"K\", \"AH0\"] is in one list and [\"L\", \"D\", \"IH0\"] is in a separate list\n\n    # Note: Either kaldi or KALDI is fine. You can use either lowercase or\n    # uppercase or both\n\n    # Example 2. 
Add a new word SF\n    #\n    # If we add the following line to cmudict.rep\n    #\n    #  SF EH1 S - EH1 F\n    #\n    # to cmudict.rep, then we need to change the lexicon like below:\n    lexicon[\"SF\"] = [[\"EH1\", \"S\"], [\"EH1\", \"F\"]]\n\n    # Please add your new words here\n\n    # No need to return lexicon since it is changed in-place\n\n\ndef generate_lexicon():\n    word_dict = pinyin_dict.pinyin_dict\n    phrases = phrases_dict.phrases_dict\n    add_new_english_words(eng_dict)\n    with open(\"lexicon.txt\", \"w\", encoding=\"utf-8\") as f:\n        for word in eng_dict:\n            phones, tones = refine_syllables(eng_dict[word])\n            tones = [t + language_tone_start_map[\"EN\"] for t in tones]\n            tones = [str(t) for t in tones]\n\n            phones = \" \".join(phones)\n            tones = \" \".join(tones)\n\n            f.write(f\"{word.lower()} {phones} {tones}\\n\")\n\n        for key in word_dict:\n            if not (0x4E00 <= key <= 0x9FA5):\n                continue\n            w = chr(key)\n            phone, tone = get_initial_final_tone(w)\n            if not phone:\n                continue\n            phone = \" \".join(phone)\n            tone = \" \".join(tone)\n            f.write(f\"{w} {phone} {tone}\\n\")\n\n        for w in phrases:\n            phone, tone = get_initial_final_tone(w)\n            if not phone:\n                continue\n            assert len(phone) == len(tone), (len(phone), len(tone), phone, tone)\n            phone = \" \".join(phone)\n            tone = \" \".join(tone)\n            f.write(f\"{w} {phone} {tone}\\n\")\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass ModelWrapper(torch.nn.Module):\n    def __init__(self, model: \"SynthesizerTrn\"):\n        super().__init__()\n        self.model = model\n        self.lang_id = language_id_map[model.language]\n\n    def forward(\n        self,\n        x,\n        x_lengths,\n        tones,\n        sid,\n        noise_scale,\n        length_scale,\n        noise_scale_w,\n        max_len=None,\n    ):\n        \"\"\"\n        Args:\n          x: A 1-D array of dtype np.int64. Its shape is (token_numbers,)\n          tones: A 1-D array of dtype np.int64. Its shape is (token_numbers,)\n          lang_id: A 1-D array of dtype np.int64. 
Its shape is (token_numbers,)\n          sid: an integer\n        \"\"\"\n        bert = torch.zeros(x.shape[0], 1024, x.shape[1], dtype=torch.float32)\n        ja_bert = torch.zeros(x.shape[0], 768, x.shape[1], dtype=torch.float32)\n        lang_id = torch.zeros_like(x)\n        lang_id[:, 1::2] = self.lang_id\n        return self.model.model.infer(\n            x=x,\n            x_lengths=x_lengths,\n            sid=sid,\n            tone=tones,\n            language=lang_id,\n            bert=bert,\n            ja_bert=ja_bert,\n            noise_scale=noise_scale,\n            noise_scale_w=noise_scale_w,\n            length_scale=length_scale,\n        )[0]\n\n\ndef main():\n    generate_lexicon()\n\n    language = \"ZH\"\n    model = TTS(language=language, device=\"cpu\")\n\n    generate_tokens(model.hps[\"symbols\"])\n\n    torch_model = ModelWrapper(model)\n\n    opset_version = 18\n    x = torch.randint(low=0, high=10, size=(60,), dtype=torch.int64)\n    print(x.shape)\n    x_lengths = torch.tensor([x.size(0)], dtype=torch.int64)\n    sid = torch.tensor([1], dtype=torch.int64)\n    tones = torch.zeros_like(x)\n\n    noise_scale = torch.tensor([1.0], dtype=torch.float32)\n    length_scale = torch.tensor([1.0], dtype=torch.float32)\n    noise_scale_w = torch.tensor([1.0], dtype=torch.float32)\n\n    x = x.unsqueeze(0)\n    tones = tones.unsqueeze(0)\n\n    filename = \"model.onnx\"\n\n    torch.onnx.export(\n        torch_model,\n        (\n            x,\n            x_lengths,\n            tones,\n            sid,\n            noise_scale,\n            length_scale,\n            noise_scale_w,\n        ),\n        filename,\n        opset_version=opset_version,\n        input_names=[\n            \"x\",\n            \"x_lengths\",\n            \"tones\",\n            \"sid\",\n            \"noise_scale\",\n            \"length_scale\",\n            \"noise_scale_w\",\n        ],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {0: 
\"N\", 1: \"L\"},\n            \"x_lengths\": {0: \"N\"},\n            \"tones\": {0: \"N\", 1: \"L\"},\n            \"y\": {0: \"N\", 1: \"S\", 2: \"T\"},\n        },\n    )\n\n    meta_data = {\n        \"model_type\": \"melo-vits\",\n        \"comment\": \"melo\",\n        \"version\": 2,\n        \"language\": \"Chinese + English\",\n        \"add_blank\": int(model.hps.data.add_blank),\n        \"n_speakers\": 1,\n        \"jieba\": 1,\n        \"sample_rate\": model.hps.data.sampling_rate,\n        \"bert_dim\": 1024,\n        \"ja_bert_dim\": 768,\n        \"speaker_id\": list(model.hps.data.spk2id.values())[0],\n        \"lang_id\": language_id_map[model.language],\n        \"tone_start\": language_tone_start_map[model.language],\n        \"url\": \"https://github.com/myshell-ai/MeloTTS\",\n        \"license\": \"MIT license\",\n        \"description\": \"MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai\",\n    }\n    add_meta_data(filename, meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/melo-tts/show-info.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n    meta = sess.get_modelmeta().custom_metadata_map\n    print(\"*****************************************\")\n    print(\"meta\\n\", meta)\n\n\ndef main():\n    print(\"=========model==========\")\n    show(\"./model.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n\n\"\"\"\n=========model==========\nNodeArg(name='x', type='tensor(int64)', shape=['N', 'L'])\nNodeArg(name='x_lengths', type='tensor(int64)', shape=['N'])\nNodeArg(name='tones', type='tensor(int64)', shape=['N', 'L'])\nNodeArg(name='sid', type='tensor(int64)', shape=[1])\nNodeArg(name='noise_scale', type='tensor(float)', shape=[1])\nNodeArg(name='length_scale', type='tensor(float)', shape=[1])\nNodeArg(name='noise_scale_w', type='tensor(float)', shape=[1])\n-----\nNodeArg(name='y', type='tensor(float)', shape=['N', 'S', 'T'])\n*****************************************\nmeta\n {'description': 'MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai',\n 'model_type': 'melo-vits', 'license': 'MIT license', 'sample_rate': '44100', 'add_blank': '1',\n 'n_speakers': '1', 'bert_dim': '1024', 'language': 'Chinese + English',\n 'ja_bert_dim': '768', 'speaker_id': '1', 'comment': 'melo', 'lang_id': '3',\n 'tone_start': '0', 'url': 'https://github.com/myshell-ai/MeloTTS'}\n\"\"\"\n"
  },
  {
    "path": "scripts/melo-tts/test.py",
    "content": "#!/usr/bin/env python3\n\nfrom typing import Iterable, List, Tuple\n\nimport jieba\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\nclass Lexicon:\n    def __init__(self, lexion_filename: str, tokens_filename: str):\n        tokens = dict()\n        with open(tokens_filename, encoding=\"utf-8\") as f:\n            for line in f:\n                s, i = line.split()\n                tokens[s] = int(i)\n        # Map \"v\" to \"V\" token ID (same as post_replace_ph in MeloTTS, only for English models)\n        # English models have \"V\" with token ID 14\n        if tokens.get(\"V\") == 14 and \"v\" in tokens:\n            tokens[\"v\"] = tokens[\"V\"]\n\n\n        lexicon = dict()\n        with open(lexion_filename, encoding=\"utf-8\") as f:\n            for line in f:\n                splits = line.split()\n                word_or_phrase = splits[0]\n                phone_tone_list = splits[1:]\n                assert len(phone_tone_list) & 1 == 0, len(phone_tone_list)\n                phones = phone_tone_list[: len(phone_tone_list) // 2]\n                phones = [tokens[p] for p in phones]\n\n                tones = phone_tone_list[len(phone_tone_list) // 2 :]\n                tones = [int(t) for t in tones]\n\n                lexicon[word_or_phrase] = (phones, tones)\n        lexicon[\"呣\"] = lexicon[\"母\"]\n        lexicon[\"嗯\"] = lexicon[\"恩\"]\n        self.lexicon = lexicon\n\n        punctuation = [\"!\", \"?\", \"…\", \",\", \".\", \"'\", \"-\"]\n        for p in punctuation:\n            i = tokens[p]\n            tone = 0\n            self.lexicon[p] = ([i], [tone])\n        self.lexicon[\" \"] = ([tokens[\"_\"]], [0])\n\n    def _convert(self, text: str) -> Tuple[List[int], List[int]]:\n        phones = []\n        tones = []\n\n        if text == \"，\":\n            text = \",\"\n        elif text == \"。\":\n            text = \".\"\n        elif text == \"！\":\n            text = \"!\"\n        elif text == 
\"？\":\n            text = \"?\"\n\n        if text not in self.lexicon:\n            print(\"t\", text)\n            if len(text) > 1:\n                for w in text:\n                    print(\"w\", w)\n                    p, t = self.convert(w)\n                    if p:\n                        phones += p\n                        tones += t\n            return phones, tones\n\n        phones, tones = self.lexicon[text]\n        return phones, tones\n\n    def convert(self, text_list: Iterable[str]) -> Tuple[List[int], List[int]]:\n        phones = []\n        tones = []\n        for text in text_list:\n            print(text)\n            p, t = self._convert(text)\n            phones += p\n            tones += t\n        return phones, tones\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.bert_dim = int(meta[\"bert_dim\"])\n        self.ja_bert_dim = int(meta[\"ja_bert_dim\"])\n        self.add_blank = int(meta[\"add_blank\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.speaker_id = int(meta[\"speaker_id\"])\n        self.lang_id = int(meta[\"lang_id\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n\n    def __call__(self, x, tones):\n        \"\"\"\n        Args:\n          x: 1-D int64 torch tensor\n          tones: 1-D int64 torch tensor\n        \"\"\"\n        x = x.unsqueeze(0)\n        tones = tones.unsqueeze(0)\n\n        print(x.shape, tones.shape)\n        sid = torch.tensor([self.speaker_id], dtype=torch.int64)\n        noise_scale = torch.tensor([0.6], dtype=torch.float32)\n   
     length_scale = torch.tensor([1.0], dtype=torch.float32)\n        noise_scale_w = torch.tensor([0.8], dtype=torch.float32)\n\n        x_lengths = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        y = self.model.run(\n            [\"y\"],\n            {\n                \"x\": x.numpy(),\n                \"x_lengths\": x_lengths.numpy(),\n                \"tones\": tones.numpy(),\n                \"sid\": sid.numpy(),\n                \"noise_scale\": noise_scale.numpy(),\n                \"noise_scale_w\": noise_scale_w.numpy(),\n                \"length_scale\": length_scale.numpy(),\n            },\n        )[0][0][0]\n        return y\n\n\ndef main():\n    lexicon = Lexicon(lexion_filename=\"./lexicon.txt\", tokens_filename=\"./tokens.txt\")\n\n    text = \"这是一个使用 next generation kaldi 的 text to speech 中英文例子. Thank you! 你觉得如何呢? are you ok? Fantastic! How about you?\"\n    s = jieba.cut(text, HMM=True)\n\n    phones, tones = lexicon.convert(s)\n\n    model = OnnxModel(\"./model.onnx\")\n\n    if model.add_blank:\n        new_phones = [0] * (2 * len(phones) + 1)\n        new_tones = [0] * (2 * len(tones) + 1)\n\n        new_phones[1::2] = phones\n        new_tones[1::2] = tones\n\n        phones = new_phones\n        tones = new_tones\n\n    phones = torch.tensor(phones, dtype=torch.int64)\n    tones = torch.tensor(tones, dtype=torch.int64)\n\n    print(phones.shape, tones.shape)\n\n    y = model(x=phones, tones=tones)\n    sf.write(\"./test.wav\", y, model.sample_rate)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/mobile-asr-models/.gitignore",
    "content": "run2.sh\n"
  },
  {
    "path": "scripts/mobile-asr-models/README.md",
    "content": "# Introduction\n\nThis folder contains scripts to convert ASR models for mobile platforms\nsupporting only batch size equal to 1.\n\nThe advantage of fixing the batch size to 1 is that it provides more\nopportunities for model optimization and quantization.\n\nTo give you a concrete example, for the following model\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 315 MB| 174 MB|\n|Batch size fixed to 1| 242 MB | 100 MB |\n\nThe following [colab notebook](https://colab.research.google.com/drive/1RsVZbsxbPjazeGrNNbZNjXCYbEG2F2DU?usp=sharing)\nprovides examples to use the above two models.\n\n**WARNING**: Tested with `onnxruntime==1.16.3 onnx==1.15.0`.\n\n```bash\npip install onnxruntime==1.16.3 onnx==1.15.0\n```\n\n## More examples\n\n### [sherpa-onnx-streaming-zipformer-korean-2024-06-16](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean)\n\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 279 MB| 122 MB|\n|Batch size fixed to 1| 264 MB | 107 MB |\n\n### [sherpa-onnx-streaming-zipformer-en-20M-2023-02-17](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-20m-2023-02-17-english)\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 85 MB| 41 MB|\n|Batch size fixed to 1| 75 MB | 32 MB |\n\n### 
[sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12-chinese)\n\n| | encoder-epoch-20-avg-1-chunk-16-left-128.onnx | encoder-epoch-20-avg-1-chunk-16-left-128.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 249 MB| 67 MB|\n|Batch size fixed to 1| 247 MB | 65 MB |\n\n### [icefall-asr-zipformer-streaming-wenetspeech-20230615](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#pkufool-icefall-asr-zipformer-streaming-wenetspeech-20230615-chinese)\n\n| | encoder-epoch-12-avg-4-chunk-16-left-128.onnx | encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 250 MB| 68 MB|\n|Batch size fixed to 1| 247 MB | 65 MB |\n\n### [sherpa-onnx-streaming-zipformer-en-2023-06-26](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-06-26-english)\n\n\n| | encoder-epoch-99-avg-1-chunk-16-left-128.onnx | encoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 250 MB| 68 MB|\n|Batch size fixed to 1| 247 MB | 65 MB |\n\n### [sherpa-onnx-streaming-zipformer-en-2023-06-21](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-06-21-english)\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 338 MB| 180 MB|\n|Batch size fixed to 1| 264 MB | 107 MB |\n\n### [sherpa-onnx-streaming-zipformer-en-2023-02-21](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-en-2023-02-21-english)\n\n| | encoder-epoch-99-avg-1.onnx | 
encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 279 MB| 122 MB|\n|Batch size fixed to 1| 264 MB | 107 MB |\n\n### [sherpa-onnx-streaming-zipformer-fr-2023-04-14](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french)\n\n| | encoder-epoch-29-avg-9-with-averaged-model.onnx | encoder-epoch-29-avg-9-with-averaged-model.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 279 MB| 121 MB|\n|Batch size fixed to 1| 264 MB | 107 MB |\n\n### [sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english)\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 85 MB| 41 MB|\n|Batch size fixed to 1| 75 MB | 32 MB |\n\n### [sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23-chinese)\n\n| | encoder-epoch-99-avg-1.onnx | encoder-epoch-99-avg-1.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 40 MB| 21 MB|\n|Batch size fixed to 1| 33 MB | 15 MB |\n\n### [sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01](https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html#sherpa-onnx-kws-zipformer-wenetspeech-3-3m-2024-01-01-chinese)\n\n| | encoder-epoch-12-avg-2-chunk-16-left-64.onnx | encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 12 MB| 4.6 MB|\n|Batch size fixed to 1| 11 MB | 3.9 MB |\n\n### [sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01](https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html#sherpa-onnx-kws-zipformer-gigaspeech-3-3m-2024-01-01-english)\n\n| | 
encoder-epoch-12-avg-2-chunk-16-left-64.onnx | encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx|\n|---|---|---|\n|Dynamic batch size| 12 MB| 4.6 MB|\n|Batch size fixed to 1| 11 MB | 3.9 MB |\n"
  },
  {
    "path": "scripts/mobile-asr-models/dynamic_quantization.py",
    "content": "#!/usr/bin/env python3\nimport argparse\n\nimport onnxruntime\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--input\",\n        type=str,\n        required=True,\n        help=\"Input onnx model\",\n    )\n\n    parser.add_argument(\n        \"--output\",\n        type=str,\n        required=True,\n        help=\"Output onnx model\",\n    )\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    print(f\"----------{args.input}----------\")\n    show(args.input)\n    print(\"------------------------------\")\n\n    quantize_dynamic(\n        model_input=args.input,\n        model_output=args.output,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/mobile-asr-models/generate-asr.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    cmd: str\n\n\ndef get_streaming_zipformer_transducer_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-korean-2024-06-16\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-20-avg-1-chunk-16-left-128.onnx \\\n              --output1 $dst/encoder-epoch-20-avg-1-chunk-16-left-128.onnx \\\n              --output2 $dst/encoder-epoch-20-avg-1-chunk-16-left-128.int8.onnx\n\n            cp -v $src/bpe.model $dst/ 
|| true\n            cp -v $src/README.md $dst/\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-20-avg-1-chunk-16-left-128.onnx $dst/\n            cp -v $src/joiner-epoch-20-avg-1-chunk-16-left-128.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-streaming-wenetspeech-20230615\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx \\\n              --output1 $dst/encoder-epoch-12-avg-4-chunk-16-left-128.onnx \\\n              --output2 $dst/encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx\n\n            cp -fv $src/README.md $dst/\n            cp -v $src/data/lang_char/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx $dst/\n            cp -v $src/exp/joiner-epoch-12-avg-4-chunk-16-left-128.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-06-26\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1-chunk-16-left-128.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1-chunk-16-left-128.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/\n            cp -v $src/tokens.txt $dst/\n 
           cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1-chunk-16-left-128.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1-chunk-16-left-128.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n              \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-06-21\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -fv $src/README.md $dst/\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n                \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-2023-02-21\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/ || true\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted 
from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n              \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/README.md $dst/\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-fr-2023-04-14\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-29-avg-9-with-averaged-model.onnx \\\n              --output1 $dst/encoder-epoch-29-avg-9-with-averaged-model.onnx \\\n              --output2 $dst/encoder-epoch-29-avg-9-with-averaged-model.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/ || true\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-29-avg-9-with-averaged-model.onnx $dst/\n            cp -v $src/joiner-epoch-29-avg-9-with-averaged-model.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n              \"\"\",\n        ),\n        
Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            mkdir $dst/{64,96}\n\n            ./run-impl.sh \\\n              --input $src/64/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/64/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/64/encoder-epoch-99-avg-1.int8.onnx\n\n            ./run-impl.sh \\\n              --input $src/96/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/96/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/96/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/ || true\n            cp -av $src/test_wavs $dst/\n\n            cp -v $src/tokens.txt $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cp -v $src/tokens.txt $dst/64/\n            cp -v $src/64/decoder-epoch-99-avg-1.onnx $dst/64/\n            cp -v $src/64/joiner-epoch-99-avg-1.int8.onnx $dst/64/\n\n            cp -v $src/tokens.txt $dst/96/\n            cp -v $src/96/decoder-epoch-99-avg-1.onnx $dst/96/\n            cp -v $src/96/joiner-epoch-99-avg-1.int8.onnx $dst/96/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n              \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx 
\\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/ || true\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-99-avg-1.onnx \\\n              --output1 $dst/encoder-epoch-99-avg-1.onnx \\\n              --output2 $dst/encoder-epoch-99-avg-1.int8.onnx\n\n            cp -v $src/bpe.model $dst/ || true\n            cp -v $src/README.md $dst/ || true\n            cp -v $src/tokens.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-99-avg-1.onnx $dst/\n            cp -v $src/joiner-epoch-99-avg-1.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n            \"\"\",\n        ),\n    ]\n\n    return models\n\n\ndef get_models():\n    return get_streaming_zipformer_transducer_models()\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * 
num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./run2.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/mobile-asr-models/generate-kws.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    # We will download\n    # https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/{model_name}.tar.bz2\n    model_name: str\n\n    cmd: str\n\n\ndef get_kws_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n              --output1 $dst/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n              --output2 $dst/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\n\n            cp -v $src/README.md $dst/\n            cp -v $src/*.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-12-avg-2-chunk-16-left-64.onnx $dst/\n            cp -v $src/joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n                  \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01\",\n            cmd=\"\"\"\n            ./run-impl.sh \\\n              --input $src/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n              --output1 $dst/encoder-epoch-12-avg-2-chunk-16-left-64.onnx \\\n              --output2 
$dst/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\n\n            cp -v $src/bpe.model $dst/\n            cp -v $src/README.md $dst/\n            cp -v $src/*.txt $dst/\n            cp -av $src/test_wavs $dst/\n            cp -v $src/decoder-epoch-12-avg-2-chunk-16-left-64.onnx $dst/\n            cp -v $src/joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx $dst/\n\n            cat > $dst/notes.md <<EOF\n# Introduction\nThis model is converted from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/$src.tar.bz2\nand it supports only batch size equal to 1.\nEOF\n                  \"\"\",\n        ),\n    ]\n    return models\n\n\ndef get_models():\n    return get_kws_models()\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./run2.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/mobile-asr-models/parse_options.sh",
    "content": "#!/usr/bin/env bash\n\n# Copyright 2012  Johns Hopkins University (Author: Daniel Povey);\n#                 Arnab Ghoshal, Karel Vesely\n\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#  http://www.apache.org/licenses/LICENSE-2.0\n#\n# THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n# KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED\n# WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,\n# MERCHANTABLITY OR NON-INFRINGEMENT.\n# See the Apache 2 License for the specific language governing permissions and\n# limitations under the License.\n\n\n# Parse command-line options.\n# To be sourced by another script (as in \". parse_options.sh\").\n# Option format is: --option-name arg\n# and shell variable \"option_name\" gets set to value \"arg.\"\n# The exception is --help, which takes no arguments, but prints the\n# $help_message variable (if defined).\n\n\n###\n### The --config file options have lower priority to command line\n### options, so we need to import them first...\n###\n\n# Now import all the configs specified by command-line, in left-to-right order\nfor ((argpos=1; argpos<$#; argpos++)); do\n  if [ \"${!argpos}\" == \"--config\" ]; then\n    argpos_plus1=$((argpos+1))\n    config=${!argpos_plus1}\n    [ ! -r $config ] && echo \"$0: missing config '$config'\" && exit 1\n    . $config  # source the config file.\n  fi\ndone\n\n\n###\n### Now we process the command line options\n###\nwhile true; do\n  [ -z \"${1:-}\" ] && break;  # break if there are no arguments\n  case \"$1\" in\n    # If the enclosing script is called with --help option, print the help\n    # message and exit.  
Scripts should put help messages in $help_message\n    --help|-h) if [ -z \"$help_message\" ]; then echo \"No help found.\" 1>&2;\n      else printf \"$help_message\\n\" 1>&2 ; fi;\n      exit 0 ;;\n    --*=*) echo \"$0: options to scripts must be of the form --name value, got '$1'\"\n      exit 1 ;;\n    # If the first command-line argument begins with \"--\" (e.g. --foo-bar),\n    # then work out the variable name as $name, which will equal \"foo_bar\".\n    --*) name=`echo \"$1\" | sed s/^--// | sed s/-/_/g`;\n      # Next we test whether the variable in question is undefned-- if so it's\n      # an invalid option and we die.  Note: $0 evaluates to the name of the\n      # enclosing script.\n      # The test [ -z ${foo_bar+xxx} ] will return true if the variable foo_bar\n      # is undefined.  We then have to wrap this test inside \"eval\" because\n      # foo_bar is itself inside a variable ($name).\n      eval '[ -z \"${'$name'+xxx}\" ]' && echo \"$0: invalid option $1\" 1>&2 && exit 1;\n\n      oldval=\"`eval echo \\\\$$name`\";\n      # Work out whether we seem to be expecting a Boolean argument.\n      if [ \"$oldval\" == \"true\" ] || [ \"$oldval\" == \"false\" ]; then\n        was_bool=true;\n      else\n        was_bool=false;\n      fi\n\n      # Set the variable to the right value-- the escaped quotes make it work if\n      # the option had spaces, like --cmd \"queue.pl -sync y\"\n      eval $name=\\\"$2\\\";\n\n      # Check that Boolean-valued arguments are really Boolean.\n      if $was_bool && [[ \"$2\" != \"true\" && \"$2\" != \"false\" ]]; then\n        echo \"$0: expected \\\"true\\\" or \\\"false\\\": $1 $2\" 1>&2\n        exit 1;\n      fi\n      shift 2;\n      ;;\n  *) break;\n  esac\ndone\n\n\n# Check for an empty argument to the --cmd option, which can easily occur as a\n# result of scripting errors.\n[ ! 
-z \"${cmd+xxx}\" ] && [ -z \"$cmd\" ] && echo \"$0: empty argument to --cmd option\" 1>&2 && exit 1;\n\n\ntrue; # so this script returns exit code 0.\n"
  },
  {
    "path": "scripts/mobile-asr-models/run2.sh.in",
    "content": "#!/usr/bin/env bash\nset -e\n\n{% for model in model_list %}\n\nsrc={{ model.model_name }}\n\nif [[ $src == *kws* ]]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/$src.tar.bz2\n\nelse\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/$src.tar.bz2\nfi\n\ntar xvf $src.tar.bz2\nrm $src.tar.bz2\n\ndst=$src-mobile\n\nmkdir -p $dst\n\n{{ model.cmd }}\n\necho \"---$src---\"\nls -lh $src\necho \"---$dst---\"\nls -lh $dst\nrm -rf $src\n\ntar cjfv $dst.tar.bz2 $dst\n\nif [[ $src == *kws* ]]; then\n  mkdir -p ../../kws\n  mv *.tar.bz2 ../../kws/\nelse\n  mv *.tar.bz2 ../../\nfi\nrm -rf $dst\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/moonshine/.gitignore",
    "content": "tokenizer.json\n"
  },
  {
    "path": "scripts/moonshine/README.md",
    "content": "# Introduction\n\nThis directory contains models from\nhttps://github.com/usefulsensors/moonshine\n\nSee its license at\nhttps://github.com/usefulsensors/moonshine/blob/main/LICENSE\n"
  },
  {
    "path": "scripts/moonshine/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom pathlib import Path\n\nimport tokenizers\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef generate_tokens():\n    if Path(\"./tokens.txt\").is_file():\n        return\n    print(\"Generating tokens.txt\")\n    tokenizer = tokenizers.Tokenizer.from_file(\"./tokenizer.json\")\n    vocab_size = tokenizer.get_vocab_size()\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(vocab_size):\n            s = tokenizer.id_to_token(i).strip()\n            f.write(f\"{s}\\t{i}\\n\")\n\n\ndef main():\n    generate_tokens()\n\n    # Note(fangjun): Don't use int8 for the preprocessor since it has\n    # a larger impact on the accuracy\n    for f in [\"uncached_decode\", \"cached_decode\", \"encode\"]:\n        if Path(f\"{f}.int8.onnx\").is_file():\n            continue\n\n        print(\"processing\", f)\n        quantize_dynamic(\n            model_input=f\"{f}.onnx\",\n            model_output=f\"{f}.int8.onnx\",\n            weight_type=QuantType.QInt8,\n        )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/moonshine/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport datetime as dt\n\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef display(sess, name):\n    print(f\"=========={name} Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(f\"=========={name} Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        preprocess: str,\n        encode: str,\n        uncached_decode: str,\n        cached_decode: str,\n    ):\n        self.init_preprocess(preprocess)\n        display(self.preprocess, \"preprocess\")\n\n        self.init_encode(encode)\n        display(self.encode, \"encode\")\n\n        self.init_uncached_decode(uncached_decode)\n        display(self.uncached_decode, \"uncached_decode\")\n\n        self.init_cached_decode(cached_decode)\n        display(self.cached_decode, \"cached_decode\")\n\n    def init_preprocess(self, preprocess):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.preprocess = ort.InferenceSession(\n            preprocess,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_encode(self, encode):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encode = ort.InferenceSession(\n            encode,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_uncached_decode(self, uncached_decode):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.uncached_decode = ort.InferenceSession(\n            
uncached_decode,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_cached_decode(self, cached_decode):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.cached_decode = ort.InferenceSession(\n            cached_decode,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def run_preprocess(self, audio):\n        \"\"\"\n        Args:\n          audio: (batch_size, num_samples), float32\n        Returns:\n          A tensor of shape (batch_size, T, dim), float32\n        \"\"\"\n        return self.preprocess.run(\n            [\n                self.preprocess.get_outputs()[0].name,\n            ],\n            {\n                self.preprocess.get_inputs()[0].name: audio,\n            },\n        )[0]\n\n    def run_encode(self, features):\n        \"\"\"\n        Args:\n          features: (batch_size, T, dim)\n        Returns:\n          A tensor of shape (batch_size, T, dim)\n        \"\"\"\n        features_len = np.array([features.shape[1]], dtype=np.int32)\n\n        return self.encode.run(\n            [\n                self.encode.get_outputs()[0].name,\n            ],\n            {\n                self.encode.get_inputs()[0].name: features,\n                self.encode.get_inputs()[1].name: features_len,\n            },\n        )[0]\n\n    def run_uncached_decode(self, token: int, token_len: int, encoder_out: np.ndarray):\n        \"\"\"\n        Args:\n          token: The current token\n          token_len: Number of predicted tokens so far\n          encoder_out: A tensor fo shape (batch_size, T, dim)\n        Returns:\n          A a tuple:\n            - a tensor of shape (batch_size, 1, dim)\n            - a list of states\n        \"\"\"\n        token_tensor = np.array([[token]], dtype=np.int32)\n        
token_len_tensor = np.array([token_len], dtype=np.int32)\n\n        num_outs = len(self.uncached_decode.get_outputs())\n        out_names = [\n            self.uncached_decode.get_outputs()[i].name for i in range(num_outs)\n        ]\n\n        out = self.uncached_decode.run(\n            out_names,\n            {\n                self.uncached_decode.get_inputs()[0].name: token_tensor,\n                self.uncached_decode.get_inputs()[1].name: encoder_out,\n                self.uncached_decode.get_inputs()[2].name: token_len_tensor,\n            },\n        )\n\n        logits = out[0]\n        states = out[1:]\n\n        return logits, states\n\n    def run_cached_decode(\n        self, token: int, token_len: int, encoder_out: np.ndarray, states\n    ):\n        \"\"\"\n        Args:\n          token: The current token\n          token_len: Number of predicted tokens so far\n          encoder_out: A tensor of shape (batch_size, T, dim)\n          states: previous states\n        Returns:\n          A a tuple:\n            - a tensor of shape (batch_size, 1, dim)\n            - a list of states\n        \"\"\"\n        token_tensor = np.array([[token]], dtype=np.int32)\n        token_len_tensor = np.array([token_len], dtype=np.int32)\n\n        num_outs = len(self.cached_decode.get_outputs())\n        out_names = [self.cached_decode.get_outputs()[i].name for i in range(num_outs)]\n\n        states_inputs = {}\n        for i in range(3, len(self.cached_decode.get_inputs())):\n            name = self.cached_decode.get_inputs()[i].name\n            states_inputs[name] = states[i - 3]\n\n        out = self.cached_decode.run(\n            out_names,\n            {\n                self.cached_decode.get_inputs()[0].name: token_tensor,\n                self.cached_decode.get_inputs()[1].name: encoder_out,\n                self.cached_decode.get_inputs()[2].name: token_len_tensor,\n                **states_inputs,\n            },\n        )\n\n        logits = out[0]\n  
      states = out[1:]\n\n        return logits, states\n\n\ndef main():\n    wave = \"./1.wav\"\n    id2token = dict()\n    token2id = dict()\n    with open(\"./tokens.txt\", encoding=\"utf-8\") as f:\n        for k, line in enumerate(f):\n            t, idx = line.split(\"\\t\")\n            id2token[int(idx)] = t\n            token2id[t] = int(idx)\n\n    model = OnnxModel(\n        preprocess=\"./preprocess.onnx\",\n        encode=\"./encode.int8.onnx\",\n        uncached_decode=\"./uncached_decode.int8.onnx\",\n        cached_decode=\"./cached_decode.int8.onnx\",\n    )\n\n    audio, sample_rate = sf.read(wave, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n    audio = audio[None]  # (1, num_samples)\n    print(\"audio.shape\", audio.shape)  # (1, 159414)\n\n    start_t = dt.datetime.now()\n\n    features = model.run_preprocess(audio)  # (1, 413, 288)\n    print(\"features\", features.shape)\n\n    sos = token2id[\"<s>\"]\n    eos = token2id[\"</s>\"]\n\n    tokens = [sos]\n\n    encoder_out = model.run_encode(features)\n    print(\"encoder_out.shape\", encoder_out.shape)  # (1, 413, 288)\n\n    logits, states = model.run_uncached_decode(\n        token=tokens[-1],\n        token_len=len(tokens),\n        encoder_out=encoder_out,\n    )\n\n    print(\"logits.shape\", logits.shape)  # (1, 1, 32768)\n    print(\"len(states)\", len(states))  # 24\n\n    max_len = int((audio.shape[-1] / 16000) * 6)\n\n    for i in range(max_len):\n        token = logits.squeeze().argmax()\n        if token == eos:\n            break\n        tokens.append(token)\n\n        logits, states = model.run_cached_decode(\n            token=tokens[-1],\n            token_len=len(tokens),\n            encoder_out=encoder_out,\n            states=states,\n       
 )\n\n    tokens = tokens[1:]  # remove sos\n    words = [id2token[i] for i in tokens]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(words).replace(underline, \" \").strip()\n\n    end_t = dt.datetime.now()\n    t = (end_t - start_t).total_seconds()\n    rtf = t * 16000 / audio.shape[-1]\n\n    print(text)\n    print(\"RTF:\", rtf)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/moonshine/v2/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for moonshine v2 models that use\n\n- encoder_model.onnx\n- decoder_model_merged.onnx\n\nor\n\n- encoder_model.ort\n- decoder_model_merged.ort\n\nNote that you need to use [./generate_tokens.py](./generate_tokens.py)\nto generate `tokens.txt` from `tokenizer.bin` for moonshine v2 models.\n\nSee also https://github.com/moonshine-ai/moonshine/pull/73\n"
  },
  {
    "path": "scripts/moonshine/v2/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2026  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport base64\nfrom test import BinTokenizer\n\n\ndef main():\n    tokenizer = BinTokenizer(\"./tokenizer.bin\")\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for idx, token_bytes in enumerate(tokenizer.tokens):\n            b64 = base64.b64encode(token_bytes).decode(\"ascii\")\n            f.write(f\"{b64} {idx}\\n\")\n\n    print(\"Saved to ./tokens.txt\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/moonshine/v2/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2026  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\n\n\nclass BinTokenizer:\n    def __init__(self, path):\n        self.tokens = self._load(path)\n\n    def _load(self, path):\n        tokens = []\n        with open(path, \"rb\") as f:\n            data = f.read()\n\n        i = 0\n        while i < len(data):\n            first = data[i]\n            i += 1\n\n            if first == 0:\n                tokens.append(b\"\")  # store as bytes\n                continue\n\n            if first < 128:\n                length = first\n            else:\n                second = data[i]\n                i += 1\n                length = (second * 128) + (first - 128)\n\n            token_bytes = data[i : i + length]\n            i += length\n            tokens.append(token_bytes)  # store as bytes, do NOT decode here\n\n        return tokens\n\n    def decode(self, ids):\n        # join bytes first, then decode as UTF-8\n        byte_stream = b\"\".join(self.tokens[i] for i in ids if i < len(self.tokens))\n        text = byte_stream.decode(\"utf-8\", errors=\"replace\")\n        return text.replace(\"▁\", \" \").strip()\n\n\nclass OnnxModel:\n    def __init__(self, encoder, decoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(f\"----{encoder} input----\")\n        for i in self.encoder.get_inputs():\n            print(i)\n\n        
print(f\"----{encoder} output----\")\n\n        for i in self.encoder.get_outputs():\n            print(i)\n\n        print(f\"----{decoder} input----\")\n        for i in self.decoder.get_inputs():\n            print(i)\n\n        print(f\"----{decoder} output----\")\n\n        for i in self.decoder.get_outputs():\n            print(i)\n\n        self.need_decoder_attention_mask = False\n\n        for n in self.decoder.get_inputs():\n            if \"key_values\" in n.name and not hasattr(self, \"num_head\"):\n                self.num_head = n.shape[1]\n                self.head_dim = n.shape[3]\n\n            if \"encoder_attention_mask\" in n.name:\n                self.need_decoder_attention_mask = True\n        if self.need_decoder_attention_mask:\n            # [ mask, ids, encoder_out, states, use_cache_branch]\n            self.num_layers = (len(self.decoder.get_inputs()) - 4) // 4\n        else:\n            # [ ids, encoder_out, states, use_cache_branch]\n            self.num_layers = (len(self.decoder.get_inputs()) - 3) // 4\n\n        self.bos = 1\n        self.eos = 2\n\n    def get_decoder_init_states(self):\n        states = []\n        shape = [1, self.num_head, 0, self.head_dim]\n        for i in range(self.num_layers):\n            decoder_key = np.zeros(shape, dtype=np.float32)\n            decoder_value = np.zeros(shape, dtype=np.float32)\n            encoder_key = np.zeros(shape, dtype=np.float32)\n            encoder_value = np.zeros(shape, dtype=np.float32)\n\n            states.append(decoder_key)\n            states.append(decoder_value)\n            states.append(encoder_key)\n            states.append(encoder_value)\n\n        return states\n\n    def run_encoder(self, audio):\n        audio = audio[None, :]  # batch=1\n\n        if len(self.encoder.get_inputs()) > 1:\n            mask = np.ones_like(audio, dtype=np.int64)\n\n            outputs = self.encoder.run(\n                [\n                    
self.encoder.get_outputs()[0].name,\n                ],\n                {\n                    self.encoder.get_inputs()[0].name: audio,\n                    self.encoder.get_inputs()[1].name: mask,\n                },\n            )\n        else:\n            outputs = self.encoder.run(\n                [\n                    self.encoder.get_outputs()[0].name,\n                ],\n                {\n                    self.encoder.get_inputs()[0].name: audio,\n                },\n            )\n        return outputs[0]  # last_hidden_state\n\n    def run_decoder(self, token_id, encoder_out, states):\n        inputs = dict()\n        if self.need_decoder_attention_mask:\n            mask = np.ones((1, encoder_out.shape[1]), dtype=np.int64)\n            inputs[self.decoder.get_inputs()[0].name] = mask\n\n            inputs[self.decoder.get_inputs()[1].name] = np.array(\n                [[token_id]], dtype=np.int64\n            )\n            inputs[self.decoder.get_inputs()[2].name] = encoder_out\n\n            for i in range(len(states)):\n                inputs[self.decoder.get_inputs()[3 + i].name] = states[i]\n\n            inputs[self.decoder.get_inputs()[-1].name] = np.array(\n                [token_id != self.bos], dtype=bool\n            )\n        else:\n            inputs[self.decoder.get_inputs()[0].name] = np.array(\n                [[token_id]], dtype=np.int64\n            )\n            inputs[self.decoder.get_inputs()[1].name] = encoder_out\n\n            for i in range(len(states)):\n                inputs[self.decoder.get_inputs()[2 + i].name] = states[i]\n\n            inputs[self.decoder.get_inputs()[-1].name] = np.array(\n                [token_id != self.bos], dtype=bool\n            )\n\n        outputs = self.decoder.run(None, inputs)\n\n        logits = outputs[0]\n        if token_id == self.bos:\n            states = outputs[1:]\n        else:\n            for i in range(self.num_layers):\n                states[4 * i + 0] = outputs[1 
+ 4 * i + 0]\n                states[4 * i + 1] = outputs[1 + 4 * i + 1]\n\n        return logits, states\n\n\ndef load_audio(filename):\n    audio, sample_rate = librosa.load(filename, sr=16000)\n    assert sample_rate == 16000, sample_rate\n    assert len(audio.shape) == 1, audio.shape\n\n    return np.ascontiguousarray(audio[: 8 * 16000])\n\n\ndef main():\n    model = OnnxModel(\n        encoder=\"./tiny/encoder_model.ort\",\n        decoder=\"./tiny/decoder_model_merged.ort\",\n        #\n        #  encoder=\"./tiny-zh/encoder_model.onnx\",\n        #  decoder=\"./tiny-zh/decoder_model_merged.onnx\",\n        #\n        #  encoder=\"./base-zh/encoder_model.ort\",\n        #  decoder=\"./base-zh/decoder_model_merged.ort\",\n    )\n    samples = load_audio(\"./two_cities.wav\")\n    print(\"samples.shape\", samples.shape)\n    encoder_out = model.run_encoder(samples)\n    print(\"encoder_out.shape\", encoder_out.shape)\n    states = model.get_decoder_init_states()\n    tokens = []\n\n    max_len = int(len(samples) / 16000 * 15)\n\n    token_id = model.bos\n\n    for step in range(max_len):\n        logits, states = model.run_decoder(token_id, encoder_out, states)\n        token_id = int(np.argmax(logits[0, 0]))\n        if token_id == model.eos:\n            break\n        tokens.append(token_id)\n    print(tokens)\n\n    tokenizer = BinTokenizer(\"./base-zh/tokenizer.bin\")\n    text = tokenizer.decode(tokens)\n    print(\"text\", text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/.gitignore",
    "content": "!run-*.sh\n"
  },
  {
    "path": "scripts/nemo/GigaAM/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for converting models from\nhttps://github.com/salute-developers/GigaAM\nto sherpa-onnx.\n\nThe ASR models in this folder are for Russian speech recognition.\n\nPlease see the license of the models at\nhttps://github.com/salute-developers/GigaAM/blob/main/LICENSE\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-ctc-v2.py",
    "content": "#!/usr/bin/env python3\nimport gigaam\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef main() -> None:\n    model_name = \"v2_ctc\"\n    model = gigaam.load_model(\n        model_name, fp16_encoder=False, use_flash=False, download_root=\".\"\n    )\n\n    # use characters\n    # space is 0\n    # <blk> is the last token\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(model.cfg[\"labels\"]):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n    model.to_onnx(\".\")\n    meta_data = {\n        \"vocab_size\": len(model.cfg[\"labels\"]) + 1,\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecCTCModel\",\n        \"version\": \"1\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"is_giga_am\": 1,\n    }\n    add_meta_data(f\"./{model_name}.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-ctc-v3-punct.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport gigaam\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\"\"\"\n==========Input==========\nNodeArg(name='features', type='tensor(float)', shape=['batch_size', 64, 'seq_len'])\nNodeArg(name='feature_lengths', type='tensor(int64)', shape=['batch_size'])\n==========Output==========\nNodeArg(name='log_probs', type='tensor(float)', shape=['batch_size', 'seq_len', 257])\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n\"\"\"\n{'model_class': 'ctc', 'sample_rate': 16000, 'preprocessor': {'_target_': 'gigaam.preprocess.FeatureExtractor',\n'sample_rate': 16000, 'features': 64, 'win_length': 320, 'hop_length': 160, 'mel_scale': 'htk', 'n_fft': 320,\n'mel_norm': None, 'center': False}, 'encoder': {'_target_': 'gigaam.encoder.ConformerEncoder', 'feat_in': 64,\n'n_layers': 16, 'd_model': 768, 'subsampling': 'conv1d', 'subs_kernel_size': 5, 'subsampling_factor': 4,\n'ff_expansion_factor': 4, 'self_attention_model': 'rotary', 'pos_emb_max_len': 5000, 'n_heads': 16,\n'conv_kernel_size': 5, 'flash_attn': False, 'conv_norm_type': 'layer_norm'}, 'head': {'_target_':\n'gigaam.decoder.CTCHead', 'feat_in': 768, 'num_classes': 257}, 'decoding': {'_target_':\n'gigaam.decoding.CTCGreedyDecoding', 'vocabulary': None,\n'model_path': '/root/.cache/gigaam/v3_e2e_ctc_tokenizer.model'},\n'model_name': 'v3_e2e_ctc', 'hashes': 
{'model': 'c15fd0dbca70363a146016d197ee0e2a',\n'tokenizer': '2a9cd0c246db42d076e92abb31055deb'}}\n\"\"\"\n\n\ndef main() -> None:\n    model_name = \"v3_e2e_ctc\"\n    model = gigaam.load_model(model_name)\n\n    # <blk> is the last token\n    sp = model.decoding.tokenizer.model\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n            f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n    model.to_onnx(\".\")\n    meta_data = {\n        \"vocab_size\": sp.vocab_size() + 1,\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecCTCModel\",\n        \"version\": \"1\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"comment\": \"v3 with punctuation\",\n        \"is_giga_am\": 1,\n    }\n    add_meta_data(f\"./{model_name}.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-ctc-v3.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport gigaam\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\"\"\"\nNodeArg(name='features', type='tensor(float)', shape=['batch_size', 64, 'seq_len'])\nNodeArg(name='feature_lengths', type='tensor(int64)', shape=['batch_size'])\n-----\nNodeArg(name='log_probs', type='tensor(float)', shape=['batch_size', 'seq_len', 34])\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n\"\"\"\n{'model_class': 'ctc', 'sample_rate': 16000,\n'preprocessor': {'_target_': 'gigaam.preprocess.FeatureExtractor', 'sample_rate': 16000, 'features': 64,\n'win_length': 320, 'hop_length': 160, 'mel_scale': 'htk', 'n_fft': 320, 'mel_norm': None, 'center': False},\n'encoder': {'_target_': 'gigaam.encoder.ConformerEncoder', 'feat_in': 64, 'n_layers': 16, 'd_model': 768,\n'subsampling': 'conv1d', 'subs_kernel_size': 5, 'subsampling_factor': 4, 'ff_expansion_factor': 4,\n'self_attention_model': 'rotary', 'pos_emb_max_len': 5000, 'n_heads': 16, 'conv_kernel_size': 5,\n'flash_attn': False, 'conv_norm_type': 'layer_norm'}, 'head': {'_target_': 'gigaam.decoder.CTCHead',\n'feat_in': 768, 'num_classes': 34}, 'decoding': {'_target_': 'gigaam.decoding.CTCGreedyDecoding',\n'vocabulary': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н', 'о', 'п', 'р', 'с',\n'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 
'ю', 'я']}, 'model_name': 'v3_ctc',\n'hashes': {'model': '1bdc12052560591b7cdf35bef02619fa'}}\n\"\"\"\n\n\ndef main() -> None:\n    model_name = \"v3_ctc\"\n    model = gigaam.load_model(model_name)\n\n    # use characters\n    # space is 0\n    # <blk> is the last token\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(model.cfg[\"decoding\"][\"vocabulary\"]):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n    model.to_onnx(\".\")\n    meta_data = {\n        \"vocab_size\": len(model.cfg[\"decoding\"][\"vocabulary\"]) + 1,\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecCTCModel\",\n        \"version\": \"1\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"comment\": \"v3\",\n        \"is_giga_am\": 1,\n    }\n    add_meta_data(f\"./{model_name}.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-ctc.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nfrom typing import Dict\n\nimport onnx\nimport torch\nimport torchaudio\nfrom nemo.collections.asr.models import EncDecCTCModel\nfrom nemo.collections.asr.modules.audio_preprocessing import (\n    AudioToMelSpectrogramPreprocessor as NeMoAudioToMelSpectrogramPreprocessor,\n)\nfrom nemo.collections.asr.parts.preprocessing.features import (\n    FilterbankFeaturesTA as NeMoFilterbankFeaturesTA,\n)\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\nclass FilterbankFeaturesTA(NeMoFilterbankFeaturesTA):\n    def __init__(self, mel_scale: str = \"htk\", wkwargs=None, **kwargs):\n        if \"window_size\" in kwargs:\n            del kwargs[\"window_size\"]\n        if \"window_stride\" in kwargs:\n            del kwargs[\"window_stride\"]\n\n        super().__init__(**kwargs)\n\n        self._mel_spec_extractor: torchaudio.transforms.MelSpectrogram = (\n            torchaudio.transforms.MelSpectrogram(\n                sample_rate=self._sample_rate,\n                win_length=self.win_length,\n                hop_length=self.hop_length,\n                n_mels=kwargs[\"nfilt\"],\n                window_fn=self.torch_windows[kwargs[\"window\"]],\n                mel_scale=mel_scale,\n                norm=kwargs[\"mel_norm\"],\n                n_fft=kwargs[\"n_fft\"],\n                f_max=kwargs.get(\"highfreq\", None),\n                f_min=kwargs.get(\"lowfreq\", 0),\n                wkwargs=wkwargs,\n            )\n        )\n\n\nclass AudioToMelSpectrogramPreprocessor(NeMoAudioToMelSpectrogramPreprocessor):\n    def __init__(self, mel_scale: str = \"htk\", **kwargs):\n        super().__init__(**kwargs)\n        kwargs[\"nfilt\"] = kwargs[\"features\"]\n        del kwargs[\"features\"]\n        self.featurizer = (\n            FilterbankFeaturesTA(  # Deprecated arguments; kept for config compatibility\n                
mel_scale=mel_scale,\n                **kwargs,\n            )\n        )\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    model = EncDecCTCModel.from_config_file(\"./ctc_model_config.yaml\")\n    ckpt = torch.load(\"./ctc_model_weights.ckpt\", map_location=\"cpu\")\n    model.load_state_dict(ckpt, strict=False)\n    model.eval()\n\n    # use characters\n    # space is 0\n    # <blk> is the last token\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, t in enumerate(model.cfg.labels):\n            f.write(f\"{t} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n\n    filename = \"model.onnx\"\n    model.export(filename)\n\n    meta_data = {\n        \"vocab_size\": len(model.cfg.labels) + 1,\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecCTCModel\",\n        \"version\": \"1\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/GigaAM%20License_NC.pdf\",\n        \"language\": \"Russian\",\n        \"is_giga_am\": 1,\n    }\n    add_meta_data(filename, meta_data)\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-rnnt-v2.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport os\n\nimport gigaam\nimport onnx\nimport torch\nfrom gigaam.utils import onnx_converter\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor\n\n\"\"\"\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['batch_size', 64, 'seq_len'])\nNodeArg(name='length', type='tensor(int64)', shape=['batch_size'])\n==========Output==========\nNodeArg(name='encoded', type='tensor(float)', shape=['batch_size', 768, 'Transposeencoded_dim_2'])\nNodeArg(name='encoded_len', type='tensor(int32)', shape=['batch_size'])\n\n==========Input==========\nNodeArg(name='x', type='tensor(int32)', shape=[1, 1])\nNodeArg(name='unused_x_len.1', type='tensor(int32)', shape=[1])\nNodeArg(name='h.1', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c.1', type='tensor(float)', shape=[1, 1, 320])\n==========Output==========\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\nNodeArg(name='unused_x_len', type='tensor(int32)', shape=[1])\nNodeArg(name='h', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c', type='tensor(float)', shape=[1, 1, 320])\n\n==========Input==========\nNodeArg(name='enc', type='tensor(float)', shape=[1, 768, 1])\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\n==========Output==========\nNodeArg(name='joint', type='tensor(float)', shape=[1, 1, 1, 34])\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass EncoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, audio_signal: Tensor, length: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/encoder.py#L499\n        out, out_len = self.m.encoder(audio_signal, length)\n\n        return out, out_len.to(torch.int64)\n\n    def to_onnx(self, dir_path: str = \".\"):\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_encoder\",\n            out_dir=dir_path,\n            module=self.m.encoder,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n        )\n\n\nclass DecoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x: Tensor, unused_x_len: Tensor, h: Tensor, c: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/decoder.py#L110C17-L110C54\n        emb = self.m.head.decoder.embed(x)\n        g, (h, c) = self.m.head.decoder.lstm(emb.transpose(0, 1), (h, c))\n        return g.permute(1, 2, 0), unused_x_len + 1, h, c\n\n    def to_onnx(self, dir_path: str = \".\"):\n        label, hidden_h, hidden_c = self.m.head.decoder.input_example()\n        label = label.to(torch.int32)\n        label_len = torch.zeros(1, dtype=torch.int32)\n\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_decoder\",\n            out_dir=dir_path,\n            module=self,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n     
       inputs=(label, label_len, hidden_h, hidden_c),\n            input_names=[\"x\", \"unused_x_len.1\", \"h.1\", \"c.1\"],\n            output_names=[\"dec\", \"unused_x_len\", \"h\", \"c\"],\n        )\n\n\ndef main() -> None:\n    model_name = \"v2_rnnt\"\n    model = gigaam.load_model(\n        model_name, fp16_encoder=False, use_flash=False, download_root=\".\"\n    )\n\n    # use characters\n    # space is 0\n    # <blk> is the last token\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(model.cfg[\"labels\"]):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    EncoderWrapper(model).to_onnx(\".\")\n    DecoderWrapper(model).to_onnx(\".\")\n\n    onnx_converter(\n        model_name=f\"{model.cfg.model_name}_joint\",\n        out_dir=\".\",\n        module=model.head.joint,\n    )\n    meta_data = {\n        # vocab_size does not include the blank\n        # we will increase vocab_size by 1 in the c++ code\n        \"vocab_size\": model.cfg[\"head\"][\"decoder\"][\"num_classes\"] - 1,\n        \"pred_rnn_layers\": model.cfg[\"head\"][\"decoder\"][\"pred_rnn_layers\"],\n        \"pred_hidden\": model.cfg[\"head\"][\"decoder\"][\"pred_hidden\"],\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"2\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"is_giga_am\": 1,\n    }\n\n    add_meta_data(f\"./{model_name}_encoder.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}_encoder.onnx\",\n        model_output=\"./encoder.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n    os.rename(f\"./{model_name}_decoder.onnx\", \"decoder.onnx\")\n    
os.rename(f\"./{model_name}_joint.onnx\", \"joiner.onnx\")\n    os.remove(f\"./{model_name}_encoder.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-rnnt-v3-punct.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport os\n\nimport gigaam\nimport onnx\nimport torch\nfrom gigaam.utils import onnx_converter\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor\n\n# encoder input length should be of type int64\n# encoder output length can be int64 or int32\n\n\"\"\"\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['batch_size', 64, 'seq_len'])\nNodeArg(name='length', type='tensor(int64)', shape=['batch_size'])\n==========Output==========\nNodeArg(name='encoded', type='tensor(float)', shape=['batch_size', 768, 'Transposeencoded_dim_2'])\nNodeArg(name='encoded_len', type='tensor(int32)', shape=['batch_size'])\n==========Input==========\nNodeArg(name='x', type='tensor(int32)', shape=[1, 1])\nNodeArg(name='unused_x_len.1', type='tensor(int32)', shape=[1])\nNodeArg(name='h.1', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c.1', type='tensor(float)', shape=[1, 1, 320])\n==========Output==========\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\nNodeArg(name='unused_x_len', type='tensor(int32)', shape=[1])\nNodeArg(name='h', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c', type='tensor(float)', shape=[1, 1, 320])\n==========Input==========\nNodeArg(name='enc', type='tensor(float)', shape=[1, 768, 1])\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\n==========Output==========\nNodeArg(name='joint', type='tensor(float)', shape=[1, 1, 1, 1025])\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass EncoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, audio_signal: Tensor, length: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/encoder.py#L499\n        out, out_len = self.m.encoder(audio_signal, length)\n\n        return out, out_len.to(torch.int64)\n\n    def to_onnx(self, dir_path: str = \".\"):\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_encoder\",\n            out_dir=dir_path,\n            module=self.m.encoder,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n        )\n\n\nclass DecoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x: Tensor, unused_x_len: Tensor, h: Tensor, c: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/decoder.py#L110C17-L110C54\n        emb = self.m.head.decoder.embed(x)\n        g, (h, c) = self.m.head.decoder.lstm(emb.transpose(0, 1), (h, c))\n        return g.permute(1, 2, 0), unused_x_len + 1, h, c\n\n    def to_onnx(self, dir_path: str = \".\"):\n        label, hidden_h, hidden_c = self.m.head.decoder.input_example()\n        label = label.to(torch.int32)\n        label_len = torch.zeros(1, dtype=torch.int32)\n\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_decoder\",\n            out_dir=dir_path,\n            module=self,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n     
       inputs=(label, label_len, hidden_h, hidden_c),\n            input_names=[\"x\", \"unused_x_len.1\", \"h.1\", \"c.1\"],\n            output_names=[\"dec\", \"unused_x_len\", \"h\", \"c\"],\n        )\n\n\n\"\"\"\n{'model_class': 'rnnt', 'sample_rate': 16000,\n'preprocessor': {'_target_': 'gigaam.preprocess.FeatureExtractor', 'sample_rate': 16000,\n'features': 64, 'win_length': 320, 'hop_length': 160, 'mel_scale': 'htk', 'n_fft': 320,\n'mel_norm': None, 'center': False},\n'encoder': {'_target_': 'gigaam.encoder.ConformerEncoder', 'feat_in': 64, 'n_layers': 16,\n'd_model': 768, 'subsampling_factor': 4, 'ff_expansion_factor': 4,\n'self_attention_model': 'rotary', 'pos_emb_max_len': 5000, 'n_heads': 16,\n'conv_kernel_size': 5, 'flash_attn': False, 'subs_kernel_size': 5,\n'subsampling': 'conv1d', 'conv_norm_type': 'layer_norm'},\n'head': {'_target_': 'gigaam.decoder.RNNTHead',\n'decoder': {'pred_hidden': 320, 'pred_rnn_layers': 1, 'num_classes': 1025},\n'joint': {'enc_hidden': 768, 'pred_hidden': 320, 'joint_hidden': 320, 'num_classes': 1025}},\n'decoding': {'_target_': 'gigaam.decoding.RNNTGreedyDecoding',\n'vocabulary': None, 'model_path': '/root/.cache/gigaam/v3_e2e_rnnt_tokenizer.model'}, 'model_name': 'v3_e2e_rnnt', 'hashes': {'model': '72e2a9b5c7caad963b2bbfd2f298c252', 'tokenizer': '3b3bf8370e882885d79731592fc99f98'}}\n\"\"\"\n\n\ndef main() -> None:\n    model_name = \"v3_e2e_rnnt\"\n    model = gigaam.load_model(model_name)\n\n    # <blk> is the last token\n    sp = model.decoding.tokenizer.model\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n            f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    EncoderWrapper(model).to_onnx(\".\")\n    DecoderWrapper(model).to_onnx(\".\")\n\n    onnx_converter(\n        model_name=f\"{model.cfg.model_name}_joint\",\n        out_dir=\".\",\n        module=model.head.joint,\n   
 )\n    meta_data = {\n        # vocab_size does not include the blank\n        # we will increase vocab_size by 1 in the c++ code\n        \"vocab_size\": model.cfg[\"head\"][\"decoder\"][\"num_classes\"] - 1,\n        \"pred_rnn_layers\": model.cfg[\"head\"][\"decoder\"][\"pred_rnn_layers\"],\n        \"pred_hidden\": model.cfg[\"head\"][\"decoder\"][\"pred_hidden\"],\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"3\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"comment\": \"v3\",\n        \"is_giga_am\": 1,\n    }\n\n    add_meta_data(f\"./{model_name}_encoder.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}_encoder.onnx\",\n        model_output=\"./encoder.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n    os.rename(f\"./{model_name}_decoder.onnx\", \"decoder.onnx\")\n    os.rename(f\"./{model_name}_joint.onnx\", \"joiner.onnx\")\n    os.remove(f\"./{model_name}_encoder.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-rnnt-v3.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport os\n\nimport gigaam\nimport onnx\nimport torch\nfrom gigaam.utils import onnx_converter\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor\n\n# encoder input length should be of type int64\n# encoder output length can be int64 or int32\n\n\"\"\"\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['batch_size', 64, 'seq_len'])\nNodeArg(name='length', type='tensor(int64)', shape=['batch_size'])\n==========Output==========\nNodeArg(name='encoded', type='tensor(float)', shape=['batch_size', 768, 'Transposeencoded_dim_2'])\nNodeArg(name='encoded_len', type='tensor(int32)', shape=['batch_size'])\n==========Input==========\nNodeArg(name='x', type='tensor(int32)', shape=[1, 1])\nNodeArg(name='unused_x_len.1', type='tensor(int32)', shape=[1])\nNodeArg(name='h.1', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c.1', type='tensor(float)', shape=[1, 1, 320])\n==========Output==========\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\nNodeArg(name='unused_x_len', type='tensor(int32)', shape=[1])\nNodeArg(name='h', type='tensor(float)', shape=[1, 1, 320])\nNodeArg(name='c', type='tensor(float)', shape=[1, 1, 320])\n==========Input==========\nNodeArg(name='enc', type='tensor(float)', shape=[1, 768, 1])\nNodeArg(name='dec', type='tensor(float)', shape=[1, 320, 1])\n==========Output==========\nNodeArg(name='joint', type='tensor(float)', shape=[1, 1, 1, 34])\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass EncoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, audio_signal: Tensor, length: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/encoder.py#L499\n        out, out_len = self.m.encoder(audio_signal, length)\n\n        return out, out_len.to(torch.int64)\n\n    def to_onnx(self, dir_path: str = \".\"):\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_encoder\",\n            out_dir=dir_path,\n            module=self.m.encoder,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n        )\n\n\nclass DecoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x: Tensor, unused_x_len: Tensor, h: Tensor, c: Tensor):\n        # https://github.com/salute-developers/GigaAM/blob/main/gigaam/decoder.py#L110C17-L110C54\n        emb = self.m.head.decoder.embed(x)\n        g, (h, c) = self.m.head.decoder.lstm(emb.transpose(0, 1), (h, c))\n        return g.permute(1, 2, 0), unused_x_len + 1, h, c\n\n    def to_onnx(self, dir_path: str = \".\"):\n        label, hidden_h, hidden_c = self.m.head.decoder.input_example()\n        label = label.to(torch.int32)\n        label_len = torch.zeros(1, dtype=torch.int32)\n\n        onnx_converter(\n            model_name=f\"{self.m.cfg.model_name}_decoder\",\n            out_dir=dir_path,\n            module=self,\n            dynamic_axes=self.m.encoder.dynamic_axes(),\n     
       inputs=(label, label_len, hidden_h, hidden_c),\n            input_names=[\"x\", \"unused_x_len.1\", \"h.1\", \"c.1\"],\n            output_names=[\"dec\", \"unused_x_len\", \"h\", \"c\"],\n        )\n\n\n\"\"\"\n{'model_class': 'rnnt', 'sample_rate': 16000,\n'preprocessor': {'_target_': 'gigaam.preprocess.FeatureExtractor', 'sample_rate': 16000,\n'features': 64, 'win_length': 320, 'hop_length': 160, 'mel_scale': 'htk', 'n_fft': 320,\n'mel_norm': None, 'center': False},\n'encoder': {'_target_': 'gigaam.encoder.ConformerEncoder', 'feat_in': 64, 'n_layers': 16,\n'd_model': 768, 'subsampling_factor': 4, 'ff_expansion_factor': 4,\n'self_attention_model': 'rotary', 'pos_emb_max_len': 5000, 'n_heads': 16,\n'conv_kernel_size': 5, 'flash_attn': False, 'subs_kernel_size': 5,\n'subsampling': 'conv1d', 'conv_norm_type': 'layer_norm'},\n'head': {'_target_': 'gigaam.decoder.RNNTHead',\n'decoder': {'pred_hidden': 320, 'pred_rnn_layers': 1, 'num_classes': 34},\n'joint': {'enc_hidden': 768, 'pred_hidden': 320, 'joint_hidden': 320, 'num_classes': 34}},\n'decoding': {'_target_': 'gigaam.decoding.RNNTGreedyDecoding',\n'vocabulary': [' ', 'а', 'б', 'в', 'г', 'д', 'е', 'ж', 'з', 'и', 'й', 'к', 'л', 'м', 'н',\n'о', 'п', 'р', 'с', 'т', 'у', 'ф', 'х', 'ц', 'ч', 'ш', 'щ', 'ъ', 'ы', 'ь', 'э', 'ю', 'я']},\n'model_name': 'v3_rnnt', 'hashes': {'model': 'be62a7bc46de1311ec288d3bf8ee2818'}}\n\"\"\"\n\n\ndef main() -> None:\n    model_name = \"v3_rnnt\"\n    model = gigaam.load_model(model_name)\n\n    # use characters\n    # space is 0\n    # <blk> is the last token\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(model.cfg[\"decoding\"][\"vocabulary\"]):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    EncoderWrapper(model).to_onnx(\".\")\n    DecoderWrapper(model).to_onnx(\".\")\n\n    onnx_converter(\n        model_name=f\"{model.cfg.model_name}_joint\",\n        
out_dir=\".\",\n        module=model.head.joint,\n    )\n    meta_data = {\n        # vocab_size does not include the blank\n        # we will increase vocab_size by 1 in the c++ code\n        \"vocab_size\": model.cfg[\"head\"][\"decoder\"][\"num_classes\"] - 1,\n        \"pred_rnn_layers\": model.cfg[\"head\"][\"decoder\"][\"pred_rnn_layers\"],\n        \"pred_hidden\": model.cfg[\"head\"][\"decoder\"][\"pred_hidden\"],\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"3\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/LICENSE\",\n        \"language\": \"Russian\",\n        \"comment\": \"v3\",\n        \"is_giga_am\": 1,\n    }\n\n    add_meta_data(f\"./{model_name}_encoder.onnx\", meta_data)\n    quantize_dynamic(\n        model_input=f\"./{model_name}_encoder.onnx\",\n        model_output=\"./encoder.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n    os.rename(f\"./{model_name}_decoder.onnx\", \"decoder.onnx\")\n    os.rename(f\"./{model_name}_joint.onnx\", \"joiner.onnx\")\n    os.remove(f\"./{model_name}_encoder.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/export-onnx-rnnt.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Dict\n\nimport onnx\nimport torch\nimport torchaudio\nfrom nemo.collections.asr.models import EncDecRNNTBPEModel\nfrom nemo.collections.asr.modules.audio_preprocessing import (\n    AudioToMelSpectrogramPreprocessor as NeMoAudioToMelSpectrogramPreprocessor,\n)\nfrom nemo.collections.asr.parts.preprocessing.features import (\n    FilterbankFeaturesTA as NeMoFilterbankFeaturesTA,\n)\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nclass FilterbankFeaturesTA(NeMoFilterbankFeaturesTA):\n    def __init__(self, mel_scale: str = \"htk\", wkwargs=None, **kwargs):\n        if \"window_size\" in kwargs:\n            del kwargs[\"window_size\"]\n        if \"window_stride\" in kwargs:\n            del kwargs[\"window_stride\"]\n\n        super().__init__(**kwargs)\n\n        self._mel_spec_extractor: torchaudio.transforms.MelSpectrogram = (\n            torchaudio.transforms.MelSpectrogram(\n                sample_rate=self._sample_rate,\n                win_length=self.win_length,\n                hop_length=self.hop_length,\n                n_mels=kwargs[\"nfilt\"],\n                window_fn=self.torch_windows[kwargs[\"window\"]],\n                mel_scale=mel_scale,\n                norm=kwargs[\"mel_norm\"],\n                n_fft=kwargs[\"n_fft\"],\n                
f_max=kwargs.get(\"highfreq\", None),\n                f_min=kwargs.get(\"lowfreq\", 0),\n                wkwargs=wkwargs,\n            )\n        )\n\n\nclass AudioToMelSpectrogramPreprocessor(NeMoAudioToMelSpectrogramPreprocessor):\n    def __init__(self, mel_scale: str = \"htk\", **kwargs):\n        super().__init__(**kwargs)\n        kwargs[\"nfilt\"] = kwargs[\"features\"]\n        del kwargs[\"features\"]\n        self.featurizer = (\n            FilterbankFeaturesTA(  # Deprecated arguments; kept for config compatibility\n                mel_scale=mel_scale,\n                **kwargs,\n            )\n        )\n\n\n@torch.no_grad()\ndef main():\n    model = EncDecRNNTBPEModel.from_config_file(\"./rnnt_model_config.yaml\")\n    ckpt = torch.load(\"./rnnt_model_weights.ckpt\", map_location=\"cpu\")\n    model.load_state_dict(ckpt, strict=False)\n    model.eval()\n\n    # use bpe\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    model.encoder.export(\"encoder.onnx\")\n    model.decoder.export(\"decoder.onnx\")\n    model.joint.export(\"joiner.onnx\")\n\n    meta_data = {\n        # not including the blank\n        # we increase vocab_size in the C++ code\n        \"vocab_size\": model.decoder.vocab_size,\n        \"pred_rnn_layers\": model.decoder.pred_rnn_layers,\n        \"pred_hidden\": model.decoder.pred_hidden,\n        \"normalize_type\": \"\",\n        \"subsampling_factor\": 4,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"https://github.com/salute-developers/GigaAM\",\n        \"license\": \"https://github.com/salute-developers/GigaAM/blob/main/GigaAM%20License_NC.pdf\",\n        \"language\": \"Russian\",\n        \"is_giga_am\": 1,\n    }\n    add_meta_data(\"encoder.onnx\", meta_data)\n\n    
quantize_dynamic(\n        model_input=\"encoder.onnx\",\n        model_output=\"encoder.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-ctc-v2.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-ctc-v2.py\nls -lh\npython3 ./test-onnx-ctc.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-ctc-v3-punct.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-ctc-v3-punct.py\nls -lh\npython3 ./test-onnx-ctc.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-ctc-v3.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-ctc-v3.py\nls -lh\npython3 ./test-onnx-ctc.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-ctc.sh",
    "content": "#!/usr/bin/env bash\n# Copyright    2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nfunction install_nemo() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n  pip install -qq ipython\n\n  # sudo apt-get install -q -y sox libsndfile1 ffmpeg python3-pip ipython\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/ctc_model_weights.ckpt\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/ctc_model_config.yaml\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/example.wav\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/long_example.wav\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/ctc/ctc_model_weights.ckpt\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/ctc/ctc_model_config.yaml\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/long_example.wav\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/GigaAM%20License_NC.pdf\n}\n\ninstall_nemo\ndownload_files\n\npython3 ./export-onnx-ctc.py\nls -lh\npython3 ./test-onnx-ctc.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-rnnt-v2.sh",
    "content": "#!/usr/bin/env bash\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-rnnt-v2.py\nls -lh\npython3 ./test-onnx-rnnt.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-rnnt-v3-punct.sh",
    "content": "#!/usr/bin/env bash\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-rnnt-v3-punct.py\nls -lh\npython3 ./test-onnx-rnnt.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-rnnt-v3.sh",
    "content": "#!/usr/bin/env bash\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nfunction install_gigaam() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/salute-developers/GigaAM.git@$BRANCH#egg=gigaam\n\n  python3 -m pip install -qq kaldi-native-fbank\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://raw.githubusercontent.com/salute-developers/GigaAM/main/LICENSE\n}\n\ninstall_gigaam\ndownload_files\n\npython3 ./export-onnx-rnnt-v3.py\nls -lh\npython3 ./test-onnx-rnnt.py\n"
  },
  {
    "path": "scripts/nemo/GigaAM/run-rnnt.sh",
    "content": "#!/usr/bin/env bash\n# Copyright    2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nfunction install_nemo() {\n  curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py\n  python3 get-pip.py\n\n  pip install torch==2.4.0 torchaudio==2.4.0 -f https://download.pytorch.org/whl/torch_stable.html\n\n  pip install -qq wget text-unidecode \"matplotlib>=3.3.2\" onnx onnxruntime==1.17.1 pybind11 Cython einops kaldi-native-fbank soundfile librosa\n  pip install -qq ipython\n\n  # sudo apt-get install -q -y sox libsndfile1 ffmpeg python3-pip ipython\n\n  BRANCH='main'\n  python3 -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[asr]\n\n  pip install numpy==1.26.4\n}\n\nfunction download_files() {\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/rnnt_model_weights.ckpt\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/rnnt_model_config.yaml\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/example.wav\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/long_example.wav\n  # curl -SL -O https://n-ws-q0bez.s3pd12.sbercloud.ru/b-ws-q0bez-jpv/GigaAM/tokenizer_all_sets.tar\n\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/rnnt/rnnt_model_weights.ckpt\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/rnnt/rnnt_model_config.yaml\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/example.wav\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/long_example.wav\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/GigaAM%20License_NC.pdf\n  curl -SL -O https://huggingface.co/csukuangfj/tmp-files/resolve/main/GigaAM/rnnt/tokenizer_all_sets.tar\n  tar -xf tokenizer_all_sets.tar && rm tokenizer_all_sets.tar\n  ls -lh\n  echo \"---\"\n  ls -lh tokenizer_all_sets\n  echo 
\"---\"\n}\n\ninstall_nemo\ndownload_files\n\npython3 ./export-onnx-rnnt.py\nls -lh\npython3 ./test-onnx-rnnt.py\nrm -v encoder.onnx\nls -lh\n"
  },
  {
    "path": "scripts/nemo/GigaAM/test-onnx-ctc.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# https://github.com/salute-developers/GigaAM\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.preemph_coeff = 0\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.frame_opts.round_to_power_of_two = False\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.high_freq = 8000\n    opts.mel_opts.num_bins = 64\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank) -> np.ndarray:\n    \"\"\"\n    Args:\n      audio: (num_samples,), np.float32\n      fbank: the fbank extractor\n    Returns:\n      features: (num_frames, feat_dim), np.float32\n    \"\"\"\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\ndef display(sess):\n    print(\"==========Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(\"==========Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['audio_signal_dynamic_axes_1', 64, 'audio_signal_dynamic_axes_2'])\nNodeArg(name='length', type='tensor(int64)', shape=['length_dynamic_axes_1'])\n==========Output==========\nNodeArg(name='logprobs', type='tensor(float)', shape=['logprobs_dynamic_axes_1', 'logprobs_dynamic_axes_2', 34])\n\"\"\"\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        
session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        display(self.model)\n\n    def __call__(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        log_probs = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )[0]\n        # [batch_size, T, dim]\n        return log_probs\n\n\ndef main():\n    filename = \"./model.int8.onnx\"\n    tokens = \"./tokens.txt\"\n    wav = \"./example.wav\"\n\n    model = OnnxModel(filename)\n\n    id2token = dict()\n    with open(tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.split()\n            if len(fields) == 1:\n                id2token[int(fields[0])] = \" \"\n            else:\n                t, idx = fields\n                id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    features = compute_features(audio, fbank)\n    print(\"features.shape\", features.shape)\n\n    blank = len(id2token) - 1\n    prev = -1\n    ans = []\n    log_probs = model(features)\n    print(\"log_probs\", log_probs.shape)\n    log_probs = torch.from_numpy(log_probs)[0]\n    ids = torch.argmax(log_probs, dim=1).tolist()\n    
for i in ids:\n        if i != blank and i != prev:\n            ans.append(i)\n        prev = i\n\n    tokens = [id2token[i] for i in ans]\n\n    text = \"\".join(tokens)\n    print(wav)\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/GigaAM/test-onnx-rnnt.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.preemph_coeff = 0\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.frame_opts.round_to_power_of_two = False\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.high_freq = 8000\n    opts.mel_opts.num_bins = 64\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\ndef display(sess):\n    print(\"==========Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(\"==========Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['audio_signal_dynamic_axes_1', 64, 'audio_signal_dynamic_axes_2'])\nNodeArg(name='length', type='tensor(int64)', shape=['length_dynamic_axes_1'])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 768, 'outputs_dynamic_axes_2'])\nNodeArg(name='encoded_lengths', type='tensor(int64)', shape=['encoded_lengths_dynamic_axes_1'])\n==========Input==========\nNodeArg(name='targets', type='tensor(int32)', shape=['targets_dynamic_axes_1', 'targets_dynamic_axes_2'])\nNodeArg(name='target_length', type='tensor(int32)', shape=['target_length_dynamic_axes_1'])\nNodeArg(name='states.1', 
type='tensor(float)', shape=[1, 'states.1_dim_1', 320])\nNodeArg(name='onnx::LSTM_3', type='tensor(float)', shape=[1, 1, 320])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 320, 'outputs_dynamic_axes_2'])\nNodeArg(name='prednet_lengths', type='tensor(int32)', shape=['prednet_lengths_dynamic_axes_1'])\nNodeArg(name='states', type='tensor(float)', shape=[1, 'states_dynamic_axes_1', 320])\nNodeArg(name='74', type='tensor(float)', shape=[1, 'states_dynamic_axes_1', 320])\n==========Input==========\nNodeArg(name='encoder_outputs', type='tensor(float)', shape=['encoder_outputs_dynamic_axes_1', 768, 'encoder_outputs_dynamic_axes_2'])\nNodeArg(name='decoder_outputs', type='tensor(float)', shape=['decoder_outputs_dynamic_axes_1', 320, 'decoder_outputs_dynamic_axes_2'])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 'outputs_dynamic_axes_2', 'outputs_dynamic_axes_3', 513])\n\"\"\"\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n    ):\n        self.init_encoder(encoder)\n        display(self.encoder)\n        self.init_decoder(decoder)\n        display(self.decoder)\n        self.init_joiner(joiner)\n        display(self.joiner)\n\n    def init_encoder(self, encoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.normalize_type = meta[\"normalize_type\"]\n        print(meta)\n\n        self.pred_rnn_layers = int(meta[\"pred_rnn_layers\"])\n        self.pred_hidden = int(meta[\"pred_hidden\"])\n\n    def init_decoder(self, decoder):\n       
 session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_joiner(self, joiner):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.joiner = ort.InferenceSession(\n            joiner,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def get_decoder_state(self):\n        batch_size = 1\n        state0 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        state1 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        return state0, state1\n\n    def run_encoder(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        (encoder_out, out_len) = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: x.numpy(),\n                self.encoder.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )\n        # [batch_size, dim, T]\n        return encoder_out\n\n    def run_decoder(\n        self,\n        token: int,\n        state0: np.ndarray,\n        state1: np.ndarray,\n    ):\n        target = torch.tensor([[token]], dtype=torch.int32).numpy()\n        target_len = torch.tensor([1], dtype=torch.int32).numpy()\n\n        (decoder_out, decoder_out_length, state0_next, state1_next,) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n            
    self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                self.decoder.get_outputs()[3].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: target,\n                self.decoder.get_inputs()[1].name: target_len,\n                self.decoder.get_inputs()[2].name: state0,\n                self.decoder.get_inputs()[3].name: state1,\n            },\n        )\n        return decoder_out, state0_next, state1_next\n\n    def run_joiner(\n        self,\n        encoder_out: np.ndarray,\n        decoder_out: np.ndarray,\n    ):\n        # encoder_out: [batch_size,  dim, 1]\n        # decoder_out: [batch_size,  dim, 1]\n        logit = self.joiner.run(\n            [\n                self.joiner.get_outputs()[0].name,\n            ],\n            {\n                self.joiner.get_inputs()[0].name: encoder_out,\n                self.joiner.get_inputs()[1].name: decoder_out,\n            },\n        )[0]\n        # logit: [batch_size, 1, 1, vocab_size]\n        return logit\n\n\ndef main():\n    model = OnnxModel(\"encoder.int8.onnx\", \"decoder.onnx\", \"joiner.onnx\")\n\n    id2token = dict()\n    with open(\"./tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.split()\n            if len(fields) == 1:\n                id2token[int(fields[0])] = \" \"\n            else:\n                t, idx = fields\n                id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(\"./example.wav\", dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    tail_padding = np.zeros(sample_rate * 2)\n\n    audio = np.concatenate([audio, tail_padding])\n\n    blank = len(id2token) - 1\n    ans = 
[blank]\n    state0, state1 = model.get_decoder_state()\n    decoder_out, state0_next, state1_next = model.run_decoder(ans[-1], state0, state1)\n\n    features = compute_features(audio, fbank)\n    print(\"audio.shape\", audio.shape)\n    print(\"features.shape\", features.shape)\n\n    encoder_out = model.run_encoder(features)\n    # encoder_out: [batch_size, dim, T]\n    for t in range(encoder_out.shape[2]):\n        encoder_out_t = encoder_out[:, :, t : t + 1]\n        logits = model.run_joiner(encoder_out_t, decoder_out)\n        logits = torch.from_numpy(logits)\n        logits = logits.squeeze()\n        idx = torch.argmax(logits, dim=-1).item()\n        if idx != blank:\n            ans.append(idx)\n            state0 = state0_next\n            state1 = state1_next\n            decoder_out, state0_next, state1_next = model.run_decoder(\n                ans[-1], state0, state1\n            )\n\n    ans = ans[1:]  # remove the first blank\n    print(ans)\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(tokens).replace(underline, \" \").strip()\n    print(\"./example.wav\")\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/README.md",
    "content": "# Introduction\n\nThis directory contains scripts for exporting models\nfrom [NeMo](https://github.com/NVIDIA/NeMo/) to ONNX\nso that you can use them in `sherpa-onnx`.\n\n- [./speaker-verification](./speaker-verification) contains models for speaker verification.\n"
  },
  {
    "path": "scripts/nemo/canary/export_onnx_180m_flash.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\n<|en|>\n<|pnc|>\n<|noitn|>\n<|nodiarize|>\n<|notimestamp|>\n\"\"\"\n\nimport os\nfrom typing import Dict, Tuple\n\nimport nemo\nimport onnx\nimport torch\nfrom nemo.collections.common.parts import NEG_INF\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\"\"\"\nNotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED :\nCould not find an implementation for Trilu(14) node with name '/Trilu'\n\nSee also https://github.com/microsoft/onnxruntime/issues/16189#issuecomment-1722219631\n\nSo we use fixed_form_attention_mask() to replace\nthe original form_attention_mask()\n\"\"\"\n\n\ndef fixed_form_attention_mask(input_mask, diagonal=None):\n    \"\"\"\n    Fixed: Build attention mask with optional masking of future tokens we forbid\n    to attend to (e.g. as it is in Transformer decoder).\n\n    Args:\n        input_mask: binary mask of size B x L with 1s corresponding to valid\n            tokens and 0s corresponding to padding tokens\n        diagonal: diagonal where triangular future mask starts\n            None -- do not mask anything\n            0 -- regular translation or language modeling future masking\n            1 -- query stream masking as in XLNet architecture\n    Returns:\n        attention_mask: mask of size B x 1 x L x L with 0s corresponding to\n            tokens we plan to attend to and -10000 otherwise\n    \"\"\"\n\n    if input_mask is None:\n        return None\n    attn_shape = (1, input_mask.shape[1], input_mask.shape[1])\n    attn_mask = input_mask.to(dtype=bool).unsqueeze(1)\n    if diagonal is not None:\n        future_mask = torch.tril(\n            torch.ones(\n                attn_shape,\n                dtype=torch.int64,  # it was torch.bool\n                # but onnxruntime does not support torch.int32 or torch.bool\n                # in torch.tril\n                device=input_mask.device,\n      
      ),\n            diagonal,\n        ).bool()\n        attn_mask = attn_mask & future_mask\n    attention_mask = (1 - attn_mask.to(torch.float)) * NEG_INF\n    return attention_mask.unsqueeze(1)\n\n\nnemo.collections.common.parts.form_attention_mask = fixed_form_attention_mask\n\nfrom nemo.collections.asr.models import EncDecMultiTaskModel\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef lens_to_mask(lens, max_length):\n    \"\"\"\n    Create a mask from a tensor of lengths.\n    \"\"\"\n    batch_size = lens.shape[0]\n    arange = torch.arange(max_length, device=lens.device)\n    mask = arange.expand(batch_size, max_length) < lens.unsqueeze(1)\n    return mask\n\n\nclass EncoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.encoder = m.encoder\n        self.encoder_decoder_proj = m.encoder_decoder_proj\n\n    def forward(\n        self, x: torch.Tensor, x_len: torch.Tensor\n    ) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:\n        \"\"\"\n        Args:\n          x: (N, T, C)\n          x_len: (N,)\n        Returns:\n          - enc_states: (N, T, C)\n          - encoded_len: (N,)\n          - enc_mask: (N, T)\n        \"\"\"\n        x = x.permute(0, 2, 1)\n        # x: (N, C, T)\n        encoded, encoded_len = self.encoder(audio_signal=x, length=x_len)\n\n        enc_states = encoded.permute(0, 2, 1)\n\n        enc_states = self.encoder_decoder_proj(enc_states)\n\n        enc_mask = 
lens_to_mask(encoded_len, enc_states.shape[1])\n\n        return enc_states, encoded_len, enc_mask\n\n\nclass DecoderWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.decoder = m.transf_decoder\n        self.log_softmax = m.log_softmax\n\n        # We use only greedy search, so there is no need to compute log_softmax\n        self.log_softmax.mlp.log_softmax = False\n\n    def forward(\n        self,\n        decoder_input_ids: torch.Tensor,\n        decoder_mems_list_0: torch.Tensor,\n        decoder_mems_list_1: torch.Tensor,\n        decoder_mems_list_2: torch.Tensor,\n        decoder_mems_list_3: torch.Tensor,\n        decoder_mems_list_4: torch.Tensor,\n        decoder_mems_list_5: torch.Tensor,\n        enc_states: torch.Tensor,\n        enc_mask: torch.Tensor,\n    ):\n        \"\"\"\n        Args:\n          decoder_input_ids: (N, num_tokens), torch.int32\n          decoder_mems_list_i: (N, num_tokens, 1024)\n          enc_states: (N, T, 1024)\n          enc_mask: (N, T)\n        Returns:\n          - logits: (N, 1, vocab_size)\n          - decoder_mems_list_i: (N, num_tokens_2, 1024)\n        \"\"\"\n        pos = decoder_input_ids[0][-1].item()\n        decoder_input_ids = decoder_input_ids[:, :-1]\n\n        decoder_hidden_states = self.decoder.embedding.forward(\n            decoder_input_ids, start_pos=pos\n        )\n        decoder_input_mask = torch.ones_like(decoder_input_ids).float()\n\n        decoder_mems_list = self.decoder.decoder.forward(\n            decoder_hidden_states,\n            decoder_input_mask,\n            enc_states,\n            enc_mask,\n            [\n                decoder_mems_list_0,\n                decoder_mems_list_1,\n                decoder_mems_list_2,\n                decoder_mems_list_3,\n                decoder_mems_list_4,\n                decoder_mems_list_5,\n            ],\n            return_mems=True,\n        )\n        logits = 
self.log_softmax(hidden_states=decoder_mems_list[-1][:, -1:])\n\n        return logits, decoder_mems_list\n\n\ndef export_encoder(canary_model):\n    encoder = EncoderWrapper(canary_model)\n    x = torch.rand(1, 4000, 128)\n    x_lens = torch.tensor([x.shape[1]], dtype=torch.int64)\n\n    encoder_filename = \"encoder.onnx\"\n    torch.onnx.export(\n        encoder,\n        (x, x_lens),\n        encoder_filename,\n        input_names=[\"x\", \"x_len\"],\n        output_names=[\"enc_states\", \"enc_len\", \"enc_mask\"],\n        opset_version=14,\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"x_len\": {0: \"N\"},\n            \"enc_states\": {0: \"N\", 1: \"T\"},\n            \"enc_len\": {0: \"N\"},\n            \"enc_mask\": {0: \"N\", 1: \"T\"},\n        },\n    )\n\n\ndef export_decoder(canary_model):\n    decoder = DecoderWrapper(canary_model)\n    decoder_input_ids = torch.tensor([[1, 0]], dtype=torch.int32)\n\n    decoder_mems_list_0 = torch.zeros(1, 10, 1024)\n    decoder_mems_list_1 = torch.zeros(1, 10, 1024)\n    decoder_mems_list_2 = torch.zeros(1, 10, 1024)\n    decoder_mems_list_3 = torch.zeros(1, 10, 1024)\n    decoder_mems_list_4 = torch.zeros(1, 10, 1024)\n    decoder_mems_list_5 = torch.zeros(1, 10, 1024)\n\n    enc_states = torch.zeros(1, 1000, 1024)\n    enc_mask = torch.ones(1, 1000).bool()\n\n    torch.onnx.export(\n        decoder,\n        (\n            decoder_input_ids,\n            decoder_mems_list_0,\n            decoder_mems_list_1,\n            decoder_mems_list_2,\n            decoder_mems_list_3,\n            decoder_mems_list_4,\n            decoder_mems_list_5,\n            enc_states,\n            enc_mask,\n        ),\n        \"decoder.onnx\",\n        dynamo=True,\n        opset_version=14,\n        external_data=False,\n        input_names=[\n            \"decoder_input_ids\",\n            \"decoder_mems_list_0\",\n            \"decoder_mems_list_1\",\n            \"decoder_mems_list_2\",\n    
        \"decoder_mems_list_3\",\n            \"decoder_mems_list_4\",\n            \"decoder_mems_list_5\",\n            \"enc_states\",\n            \"enc_mask\",\n        ],\n        output_names=[\n            \"logits\",\n            \"next_decoder_mem_list_0\",\n            \"next_decoder_mem_list_1\",\n            \"next_decoder_mem_list_2\",\n            \"next_decoder_mem_list_3\",\n            \"next_decoder_mem_list_4\",\n            \"next_decoder_mem_list_5\",\n        ],\n        dynamic_axes={\n            \"decoder_input_ids\": {1: \"num_tokens\"},\n            \"decoder_mems_list_0\": {1: \"num_tokens\"},\n            \"decoder_mems_list_1\": {1: \"num_tokens\"},\n            \"decoder_mems_list_2\": {1: \"num_tokens\"},\n            \"decoder_mems_list_3\": {1: \"num_tokens\"},\n            \"decoder_mems_list_4\": {1: \"num_tokens\"},\n            \"decoder_mems_list_5\": {1: \"num_tokens\"},\n            \"enc_states\": {1: \"T\"},\n            \"enc_mask\": {1: \"T\"},\n        },\n    )\n\n\ndef export_tokens(canary_model):\n    underline = \"▁\"\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(canary_model.tokenizer.vocab_size):\n            s = canary_model.tokenizer.ids_to_text([i])\n\n            if s[0] == \" \":\n                s = underline + s[1:]\n\n            f.write(f\"{s} {i}\\n\")\n        print(\"Saved to tokens.txt\")\n\n\n@torch.no_grad()\ndef main():\n    canary_model = EncDecMultiTaskModel.from_pretrained(\"nvidia/canary-180m-flash\")\n    canary_model.eval()\n\n    preprocessor = canary_model.cfg[\"preprocessor\"]\n    sample_rate = preprocessor[\"sample_rate\"]\n    normalize_type = preprocessor[\"normalize\"]\n    window_size = preprocessor[\"window_size\"]  # ms\n    window_stride = preprocessor[\"window_stride\"]  # ms\n    window = preprocessor[\"window\"]\n    features = preprocessor[\"features\"]\n    n_fft = preprocessor[\"n_fft\"]\n    vocab_size = 
canary_model.tokenizer.vocab_size  # 5248\n\n    subsampling_factor = canary_model.cfg[\"encoder\"][\"subsampling_factor\"]\n\n    assert sample_rate == 16000, sample_rate\n    assert normalize_type == \"per_feature\", normalize_type\n    assert window_size == 0.025, window_size\n    assert window_stride == 0.01, window_stride\n    assert window == \"hann\", window\n    assert features == 128, features\n    assert n_fft == 512, n_fft\n    assert subsampling_factor == 8, subsampling_factor\n\n    export_tokens(canary_model)\n    export_encoder(canary_model)\n    export_decoder(canary_model)\n\n    for m in [\"encoder\", \"decoder\"]:\n        quantize_dynamic(\n            model_input=f\"./{m}.onnx\",\n            model_output=f\"./{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8,\n        )\n\n    meta_data = {\n        \"vocab_size\": vocab_size,\n        \"normalize_type\": normalize_type,\n        \"subsampling_factor\": subsampling_factor,\n        \"model_type\": \"EncDecMultiTaskModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        \"url\": \"https://huggingface.co/nvidia/canary-180m-flash\",\n        \"feat_dim\": features,\n    }\n\n    add_meta_data(\"encoder.onnx\", meta_data)\n    add_meta_data(\"encoder.int8.onnx\", meta_data)\n\n    \"\"\"\n    To fix the following error with onnxruntime 1.17.1 and 1.16.3:\n\n    onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 :FAIL : Load model from ./decoder.int8.onnx failed:/Users/runner/work/1/s/onnxruntime/core/graph/model.cc:150 onnxruntime::Model::Model(onnx::ModelProto &&, const onnxruntime::PathString &, const onnxruntime::IOnnxRuntimeOpSchemaRegistryList *, const logging::Logger &, const onnxruntime::ModelOptions &)\n    Unsupported model IR version: 10, max supported IR version: 9\n    \"\"\"\n    for filename in [\"./decoder.onnx\", \"./decoder.int8.onnx\"]:\n        model = onnx.load(filename)\n        print(\"old\", model.ir_version)\n       
 model.ir_version = 9\n        print(\"new\", model.ir_version)\n        onnx.save(model, filename)\n\n    os.system(\"ls -lh *.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/canary/run_180m_flash.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/de.wav\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/en.wav\n\npip install \\\n  nemo_toolkit['asr'] \\\n  \"numpy<2\" \\\n  ipython \\\n  kaldi-native-fbank \\\n  librosa \\\n  onnx==1.17.0 \\\n  onnxruntime==1.17.1 \\\n  onnxscript \\\n  soundfile\n\npython3 ./export_onnx_180m_flash.py\nls -lh *.onnx\n\n\nlog \"-----fp32------\"\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.onnx \\\n  --decoder ./decoder.onnx \\\n  --source-lang en \\\n  --target-lang en \\\n  --tokens ./tokens.txt \\\n  --wav ./en.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.onnx \\\n  --decoder ./decoder.onnx \\\n  --source-lang en \\\n  --target-lang de \\\n  --tokens ./tokens.txt \\\n  --wav ./en.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.onnx \\\n  --decoder ./decoder.onnx \\\n  --source-lang de \\\n  --target-lang de \\\n  --tokens ./tokens.txt \\\n  --wav ./de.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.onnx \\\n  --decoder ./decoder.onnx \\\n  --source-lang de \\\n  --target-lang en \\\n  --tokens ./tokens.txt \\\n  --wav ./de.wav\n\n\nlog \"-----int8------\"\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.int8.onnx \\\n  --decoder ./decoder.int8.onnx \\\n  --source-lang en \\\n  --target-lang en \\\n  --tokens ./tokens.txt \\\n  --wav ./en.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.int8.onnx \\\n  --decoder ./decoder.int8.onnx \\\n  --source-lang en \\\n  --target-lang de \\\n  --tokens ./tokens.txt \\\n  --wav ./en.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.int8.onnx \\\n  
--decoder ./decoder.int8.onnx \\\n  --source-lang de \\\n  --target-lang de \\\n  --tokens ./tokens.txt \\\n  --wav ./de.wav\n\npython3 ./test_180m_flash.py \\\n  --encoder ./encoder.int8.onnx \\\n  --decoder ./decoder.int8.onnx \\\n  --source-lang de \\\n  --target-lang en \\\n  --tokens ./tokens.txt \\\n  --wav ./de.wav\n"
  },
  {
    "path": "scripts/nemo/canary/test_180m_flash.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport time\nfrom pathlib import Path\nfrom typing import List\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\", type=str, required=True, help=\"Path to encoder.onnx\"\n    )\n    parser.add_argument(\n        \"--decoder\", type=str, required=True, help=\"Path to decoder.onnx\"\n    )\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\n        \"--source-lang\",\n        type=str,\n        help=\"Language of the input wav. Valid values are: en, de, es, fr\",\n    )\n    parser.add_argument(\n        \"--target-lang\",\n        type=str,\n        help=\"Language of the recognition result. Valid values are: en, de, es, fr\",\n    )\n    parser.add_argument(\n        \"--use-pnc\",\n        type=int,\n        default=1,\n        help=\"1 to enable cases and punctuations. 
0 to disable that\",\n    )\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef display(sess, model):\n    print(f\"=========={model} Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(f\"=========={model} Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n    ):\n        self.init_encoder(encoder)\n        display(self.encoder, \"encoder\")\n\n        self.init_decoder(decoder)\n        display(self.decoder, \"decoder\")\n\n    def init_encoder(self, encoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.normalize_type = meta[\"normalize_type\"]\n        print(meta)\n\n    def init_decoder(self, decoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def run_encoder(self, x: np.ndarray, x_lens: np.ndarray):\n        \"\"\"\n        Args:\n          x: (N, T, C), np.float\n          x_lens: (N,), np.int64\n        Returns:\n          enc_states: (N, T, C)\n          enc_lens: (N,), np.int64\n          enc_masks: (N, T), np.bool\n        \"\"\"\n        enc_states, enc_lens, enc_masks = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n     
           self.encoder.get_outputs()[2].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: x,\n                self.encoder.get_inputs()[1].name: x_lens,\n            },\n        )\n        return enc_states, enc_lens, enc_masks\n\n    def run_decoder(\n        self,\n        decoder_input_ids: np.ndarray,\n        decoder_mems_list: List[np.ndarray],\n        enc_states: np.ndarray,\n        enc_mask: np.ndarray,\n    ):\n        \"\"\"\n        Args:\n          decoder_input_ids: (N, num_tokens), int32\n          decoder_mems_list: a list of tensors, each of which is (N, num_tokens, C)\n          enc_states: (N, T, C), float\n          enc_mask: (N, T), bool\n        Returns:\n          logits: (1, 1, vocab_size), float\n          new_decoder_mems_list:\n        \"\"\"\n        (logits, *new_decoder_mems_list) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n                self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                self.decoder.get_outputs()[3].name,\n                self.decoder.get_outputs()[4].name,\n                self.decoder.get_outputs()[5].name,\n                self.decoder.get_outputs()[6].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: decoder_input_ids,\n                self.decoder.get_inputs()[1].name: decoder_mems_list[0],\n                self.decoder.get_inputs()[2].name: decoder_mems_list[1],\n                self.decoder.get_inputs()[3].name: decoder_mems_list[2],\n                self.decoder.get_inputs()[4].name: decoder_mems_list[3],\n                self.decoder.get_inputs()[5].name: decoder_mems_list[4],\n                self.decoder.get_inputs()[6].name: decoder_mems_list[5],\n                self.decoder.get_inputs()[7].name: enc_states,\n                self.decoder.get_inputs()[8].name: enc_mask,\n            },\n        )\n        return logits, 
new_decoder_mems_list\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 128\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\ndef main():\n    args = get_args()\n    assert Path(args.encoder).is_file(), args.encoder\n    assert Path(args.decoder).is_file(), args.decoder\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    id2token = dict()\n    token2id = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.split()\n            if len(fields) == 2:\n                t, idx = fields[0], int(fields[1])\n                if line[0] == \" \":\n                    t = \" \" + t\n            else:\n                t = \" \"\n                idx = int(fields[0])\n\n            id2token[idx] = t\n            token2id[t] = idx\n\n    model = OnnxModel(args.encoder, args.decoder)\n\n    fbank = create_fbank()\n\n    start = time.time()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    features = compute_features(audio, fbank)\n    if model.normalize_type != \"\":\n        assert model.normalize_type == 
\"per_feature\", model.normalize_type\n        mean = features.mean(axis=0, keepdims=True)\n        stddev = features.std(axis=0, keepdims=True) + 1e-5\n        features = (features - mean) / stddev\n\n    features = np.expand_dims(features, axis=0)\n    # features.shape: (1, 291, 128)\n\n    features_len = np.array([features.shape[1]], dtype=np.int64)\n\n    enc_states, _, enc_masks = model.run_encoder(features, features_len)\n\n    decoder_input_ids = []\n    decoder_input_ids.append(token2id[\"<|startofcontext|>\"])\n    decoder_input_ids.append(token2id[\"<|startoftranscript|>\"])\n    decoder_input_ids.append(token2id[\"<|emo:undefined|>\"])\n    if args.source_lang in (\"en\", \"es\", \"de\", \"fr\"):\n        decoder_input_ids.append(token2id[f\"<|{args.source_lang}|>\"])\n    else:\n        decoder_input_ids.append(token2id[f\"<|en|>\"])\n\n    if args.target_lang in (\"en\", \"es\", \"de\", \"fr\"):\n        decoder_input_ids.append(token2id[f\"<|{args.target_lang}|>\"])\n    else:\n        decoder_input_ids.append(token2id[f\"<|en|>\"])\n\n    if args.use_pnc:\n        decoder_input_ids.append(token2id[f\"<|pnc|>\"])\n    else:\n        decoder_input_ids.append(token2id[f\"<|nopnc|>\"])\n\n    decoder_input_ids.append(token2id[f\"<|noitn|>\"])\n    decoder_input_ids.append(token2id[\"<|notimestamp|>\"])\n    decoder_input_ids.append(token2id[\"<|nodiarize|>\"])\n\n    decoder_mems_list = [np.zeros((1, 0, 1024), dtype=np.float32) for _ in range(6)]\n\n    for pos, decoder_input_id in enumerate(decoder_input_ids):\n        logits, decoder_mems_list = model.run_decoder(\n            np.array([[decoder_input_id, pos]], dtype=np.int32),\n            decoder_mems_list,\n            enc_states,\n            enc_masks,\n        )\n    tokens = [logits.argmax()]\n    print(\"decoder_input_ids\", decoder_input_ids)\n    eos = token2id[\"<|endoftext|>\"]\n\n    for i in range(1, 200):\n        decoder_input_ids = [tokens[-1], i]\n        logits, decoder_mems_list = 
model.run_decoder(\n            np.array([decoder_input_ids], dtype=np.int32),\n            decoder_mems_list,\n            enc_states,\n            enc_masks,\n        )\n        t = logits.argmax()\n        if t == eos:\n            break\n        tokens.append(t)\n    print(\"len(tokens)\", len(tokens))\n    print(\"tokens\", tokens)\n\n    text = \"\".join([id2token[i] for i in tokens])\n\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n\n    text = text.replace(underline, \" \").strip()\n    print(\"text:\", text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for exporting models from\n\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_80ms\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_480ms\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_1040ms\n\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_ctc_large\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_enes_conformer_transducer_large_codesw\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_transducer_large\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_enzh_fastconformer_transducer_large_codesw\n\n\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_fa_fastconformer_hybrid_large\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_it_fastconformer_hybrid_large_pc\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_pl_fastconformer_hybrid_large_pc\n  - # https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_ua_fastconformer_hybrid_large_pc\n\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_pc\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_es_fastconformer_hybrid_large_pc\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc_blend_eu\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc\n\n  - https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/parakeet-tdt_ctc-110m\n  - https://huggingface.co/nvidia/parakeet-tdt_ctc-0.6b-ja\n  - https://huggingface.co/nvidia/stt_pt_fastconformer_hybrid_large_pc\n  - 
https://huggingface.co/nvidia/stt_de_fastconformer_hybrid_large_pc\n\nto `sherpa-onnx`.\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-ctc-non-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport argparse\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n    )\n    parser.add_argument(\n        \"--doc\",\n        type=str,\n        default=\"\",\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    model_name = args.model\n\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)\n    print(asr_model.cfg)\n    print(asr_model)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    decoder_type = \"ctc\"\n    asr_model.change_decoding_strategy(decoder_type=decoder_type)\n    asr_model.eval()\n\n    asr_model.set_export_config({\"decoder_type\": \"ctc\"})\n\n    filename = \"model.onnx\"\n\n    asr_model.export(filename)\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        
\"vocab_size\": asr_model.decoder.vocab_size,\n        \"normalize_type\": normalize_type,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        \"url\": f\"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/{model_name}\"\n        if \"/\" in model_name\n        else f\"https://huggingface.co/{model_name}\",\n        \"comment\": \"Only the CTC branch is exported\",\n        \"doc\": args.doc,\n    }\n    add_meta_data(filename, meta_data)\n\n    quantize_dynamic(\n        model_input=\"./model.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n    print(\"preprocessor\", asr_model.cfg.preprocessor)\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-ctc.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport argparse\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        choices=[\"80\", \"480\", \"1040\"],\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    model_name = f\"stt_en_fastconformer_hybrid_large_streaming_{args.model}ms\"\n\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    decoder_type = \"ctc\"\n    asr_model.change_decoding_strategy(decoder_type=decoder_type)\n    asr_model.eval()\n\n    assert asr_model.encoder.streaming_cfg is not None\n    if isinstance(asr_model.encoder.streaming_cfg.chunk_size, list):\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size[1]\n    else:\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size\n\n    if 
isinstance(asr_model.encoder.streaming_cfg.pre_encode_cache_size, list):\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size[1]\n    else:\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size\n    window_size = chunk_size + pre_encode_cache_size\n\n    print(\"chunk_size\", chunk_size)\n    print(\"pre_encode_cache_size\", pre_encode_cache_size)\n    print(\"window_size\", window_size)\n\n    chunk_shift = chunk_size\n\n    # cache_last_channel: (batch_size, dim1, dim2, dim3)\n    cache_last_channel_dim1 = len(asr_model.encoder.layers)\n    cache_last_channel_dim2 = asr_model.encoder.streaming_cfg.last_channel_cache_size\n    cache_last_channel_dim3 = asr_model.encoder.d_model\n\n    # cache_last_time: (batch_size, dim1, dim2, dim3)\n    cache_last_time_dim1 = len(asr_model.encoder.layers)\n    cache_last_time_dim2 = asr_model.encoder.d_model\n    cache_last_time_dim3 = asr_model.encoder.conv_context_size[0]\n\n    asr_model.set_export_config({\"decoder_type\": \"ctc\", \"cache_support\": True})\n\n    filename = \"model.onnx\"\n\n    asr_model.export(filename)\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"window_size\": window_size,\n        \"chunk_shift\": chunk_shift,\n        \"normalize_type\": normalize_type,\n        \"cache_last_channel_dim1\": cache_last_channel_dim1,\n        \"cache_last_channel_dim2\": cache_last_channel_dim2,\n        \"cache_last_channel_dim3\": cache_last_channel_dim3,\n        \"cache_last_time_dim1\": cache_last_time_dim1,\n        \"cache_last_time_dim2\": cache_last_time_dim2,\n        \"cache_last_time_dim3\": cache_last_time_dim3,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n       
 \"url\": f\"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/{model_name}\",\n        \"comment\": \"Only the CTC branch is exported\",\n    }\n    add_meta_data(filename, meta_data)\n    quantize_dynamic(\n        model_input=\"./model.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-transducer-non-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport argparse\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n    )\n    parser.add_argument(\n        \"--doc\",\n        type=str,\n        default=\"\",\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    model_name = args.model\n\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    decoder_type = \"rnnt\"\n    asr_model.change_decoding_strategy(decoder_type=decoder_type)\n    asr_model.eval()\n\n    asr_model.set_export_config({\"decoder_type\": \"rnnt\"})\n\n    # asr_model.export(\"model.onnx\")\n    asr_model.encoder.export(\"encoder.onnx\")\n    asr_model.decoder.export(\"decoder.onnx\")\n    asr_model.joint.export(\"joiner.onnx\")\n    # model.onnx is a suffix.\n    # It will generate two files:\n    # 
encoder-model.onnx\n    # decoder_joint-model.onnx\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"normalize_type\": normalize_type,\n        \"pred_rnn_layers\": asr_model.decoder.pred_rnn_layers,\n        \"pred_hidden\": asr_model.decoder.pred_hidden,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        # A slash in the name means a Hugging Face repo id; otherwise it is an NGC model name.\n        \"url\": f\"https://huggingface.co/{model_name}\"\n        if \"/\" in model_name\n        else f\"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/{model_name}\",\n        \"comment\": \"Only the transducer branch is exported\",\n        \"doc\": args.doc,\n    }\n    add_meta_data(\"encoder.onnx\", meta_data)\n\n    for m in [\"encoder\", \"decoder\", \"joiner\"]:\n        quantize_dynamic(\n            model_input=f\"{m}.onnx\",\n            model_output=f\"{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8,\n        )\n\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/export-onnx-transducer.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport argparse\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        choices=[\"80\", \"480\", \"1040\"],\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    model_name = f\"stt_en_fastconformer_hybrid_large_streaming_{args.model}ms\"\n\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    decoder_type = \"rnnt\"\n    asr_model.change_decoding_strategy(decoder_type=decoder_type)\n    asr_model.eval()\n\n    assert asr_model.encoder.streaming_cfg is not None\n    if isinstance(asr_model.encoder.streaming_cfg.chunk_size, list):\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size[1]\n    else:\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size\n\n    if 
isinstance(asr_model.encoder.streaming_cfg.pre_encode_cache_size, list):\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size[1]\n    else:\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size\n    window_size = chunk_size + pre_encode_cache_size\n\n    print(\"chunk_size\", chunk_size)\n    print(\"pre_encode_cache_size\", pre_encode_cache_size)\n    print(\"window_size\", window_size)\n\n    chunk_shift = chunk_size\n\n    # cache_last_channel: (batch_size, dim1, dim2, dim3)\n    cache_last_channel_dim1 = len(asr_model.encoder.layers)\n    cache_last_channel_dim2 = asr_model.encoder.streaming_cfg.last_channel_cache_size\n    cache_last_channel_dim3 = asr_model.encoder.d_model\n\n    # cache_last_time: (batch_size, dim1, dim2, dim3)\n    cache_last_time_dim1 = len(asr_model.encoder.layers)\n    cache_last_time_dim2 = asr_model.encoder.d_model\n    cache_last_time_dim3 = asr_model.encoder.conv_context_size[0]\n\n    asr_model.set_export_config({\"decoder_type\": \"rnnt\", \"cache_support\": True})\n\n    # asr_model.export(\"model.onnx\")\n    asr_model.encoder.export(\"encoder.onnx\")\n    asr_model.decoder.export(\"decoder.onnx\")\n    asr_model.joint.export(\"joiner.onnx\")\n    # model.onnx is a suffix.\n    # It will generate two files:\n    # encoder-model.onnx\n    # decoder_joint-model.onnx\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"window_size\": window_size,\n        \"chunk_shift\": chunk_shift,\n        \"normalize_type\": normalize_type,\n        \"cache_last_channel_dim1\": cache_last_channel_dim1,\n        \"cache_last_channel_dim2\": cache_last_channel_dim2,\n        \"cache_last_channel_dim3\": cache_last_channel_dim3,\n        \"cache_last_time_dim1\": cache_last_time_dim1,\n        \"cache_last_time_dim2\": 
cache_last_time_dim2,\n        \"cache_last_time_dim3\": cache_last_time_dim3,\n        \"pred_rnn_layers\": asr_model.decoder.pred_rnn_layers,\n        \"pred_hidden\": asr_model.decoder.pred_hidden,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        \"url\": f\"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/{model_name}\",\n        \"comment\": \"Only the transducer branch is exported\",\n    }\n    add_meta_data(\"encoder.onnx\", meta_data)\n\n    for m in [\"encoder\", \"decoder\", \"joiner\"]:\n        quantize_dynamic(\n            model_input=f\"{m}.onnx\",\n            model_output=f\"{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8,\n        )\n\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-ctc-non-streaming-2.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\n# 2200 hours of Portuguese speech\nurl=https://huggingface.co/nvidia/stt_pt_fastconformer_hybrid_large_pc\nname=$(basename $url)\nname=\"nvidia/$name\"\ndoc=\"STT PT FastConformer Hybrid Transducer-CTC Large transcribes text in upper and lower case Portuguese alphabet along with spaces, period, comma, question mark. This collection contains the Brazilian Portuguese FastConformer Hybrid (Transducer and CTC) Large model (around 115M parameters) with punctuation and capitalization trained on around 2200h hours of Portuguese speech. \"\n\nlog \"Process $name at $url\"\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nmkdir test_wavs\npushd test_wavs\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/pt_br.wav\npopd\ncp -a test_wavs $d\n\nd=sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\nmv test_wavs $d\n\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/pt_br.wav\n\n\n# 2500 hours of German speech\nurl=https://huggingface.co/nvidia/stt_de_fastconformer_hybrid_large_pc\nname=$(basename $url)\nname=\"nvidia/$name\"\ndoc=\"This model transcribes speech in upper and lower case German alphabet along with spaces, periods, commas, and question marks. It is a 'large' version of FastConformer Transducer-CTC (around 115M parameters) model. 
This is a hybrid model trained on two losses: Transducer (default) and CTC.\"\n\nlog \"Process $name at $url\"\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nmkdir test_wavs\npushd test_wavs\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/de.wav\npopd\ncp -a test_wavs $d\n\nd=sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\nmv test_wavs $d\n\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/de.wav\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-ctc-non-streaming.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\n# 36000 hours of English data\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/parakeet-tdt_ctc-110m\nname=$(basename $url)\ndoc=\"parakeet-tdt_ctc-110m is an ASR model that transcribes speech with Punctuations and Capitalizations of the English alphabet. It was trained on 36K hours of English speech collected and prepared by NVIDIA NeMo and Suno teams.\"\n\nlog \"Process $name at $url\"\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\n# 8500 hours of English speech\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the English FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization on NeMo ASRSet En PC with around 8500 hours of English speech (SPGI 1k, VoxPopuli, MCV11, Europarl-ASR, Fisher, LibriSpeech, NSC1, MLS). It utilizes a Google SentencePiece [1] tokenizer with a vocabulary size of 1024. 
It transcribes text in upper and lower case English alphabet along with spaces, periods, commas, question marks, and a few other characters.\"\n\nlog \"Process $name at $url\"\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-24500\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-24500-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_es_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the Spanish FastConformer Hybrid (CTC and Transducer) Large model (around 114M parameters) with Punctuation and Capitalization. It is trained on the NeMo PnC ES ASRSET (Fisher, MCV12, MLS, Voxpopuli) containing 1424 hours of Spanish speech. It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 1024, and transcribes text in upper and lower case Spanish alphabet along with spaces, period, comma, question mark and inverted question mark.\"\n\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-es-1424\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-es-1424-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc_blend_eu\nname=$(basename $url)\ndoc=\"This collection contains the Multilingual FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization. It is trained on the NeMo PnC German, English, Spanish, and French ASR sets that contain 14,288 hours of speech in total. 
It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 256 per language and transcribes text in upper and lower case along with spaces, periods, commas, question marks and a few other language-specific characters. The total tokenizer size is 2560, of which 1024 tokens are allocated to English, German, French, and Spanish. The remaining tokens are reserved for future languages.\"\n\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the Multilingual FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization. It is trained on the NeMo PnC Belarusian, German, English, Spanish, French, Croatian, Italian, Polish, Russian, and Ukrainian ASR sets that contain ~20,000 hours of speech in total. 
It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 256 per language (2560 total), and transcribes text in upper and lower case along with spaces, periods, commas, question marks and a few other language-specific characters.\"\n\n./export-onnx-ctc-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\nmkdir -p $d\nmv -v model.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\nmkdir -p $d\nmv -v model.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\n# Now test the exported model\nlog \"Download test data\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\ntar xvf spoken-language-identification-test-wavs.tar.bz2\nrm spoken-language-identification-test-wavs.tar.bz2\ndata=spoken-language-identification-test-wavs\n\ncurl -SL -O https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav\nmv 2086-149220-0033.wav en.wav\n\nd=sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\n\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs/1.wav\n\nd=sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\n\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs/1.wav\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-24500\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-24500-int8\npython3 
./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-es-1424\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/es-spanish.wav\nmkdir -p $d/test_wavs\ncp -v $data/es-spanish.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-es-1424-int8\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/es-spanish.wav\nmkdir -p $d/test_wavs\ncp -v $data/es-spanish.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav; do\n  python3 ./test-onnx-ctc-non-streaming.py \\\n    --model $d/model.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288-int8\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav; do\n  python3 ./test-onnx-ctc-non-streaming.py \\\n    --model $d/model.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav hr-croatian.wav it-italian.wav po-polish.wav ru-russian.wav uk-ukrainian.wav; do\n  python3 ./test-onnx-ctc-non-streaming.py \\\n    --model $d/model.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav hr-croatian.wav it-italian.wav po-polish.wav 
ru-russian.wav uk-ukrainian.wav; do\n  python3 ./test-onnx-ctc-non-streaming.py \\\n    --model $d/model.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-ctc.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nif [ ! -e ./0.wav ]; then\n  # curl -SL -O https://hf-mirror.com/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/0.wav\n  curl -SL -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/0.wav\nfi\n\nms=(\n80\n480\n1040\n)\n\nfor m in ${ms[@]}; do\n  ./export-onnx-ctc.py --model $m\n  d=sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-${m}ms\n\n  d_int8=sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-${m}ms-int8\n\n  if [ ! -f $d/model.onnx ]; then\n    mkdir -p $d $d_int8\n    mv -v model.onnx $d/\n    cp -v tokens.txt $d/\n\n    mv -v model.int8.onnx $d_int8/\n    mv -v tokens.txt $d_int8/\n\n    echo \"---$d---\"\n    ls -lh $d\n\n    echo \"---$d_int8---\"\n    ls -lh $d_int8\n  fi\ndone\n\n# Now test the exported models\n\nfor m in ${ms[@]}; do\n  d=sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-${m}ms\n  echo \"---$d---\"\n  python3 ./test-onnx-ctc.py \\\n    --model $d/model.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav ./0.wav\n\n  d=sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-${m}ms-int8\n  echo \"---$d---\"\n  python3 ./test-onnx-ctc.py \\\n    --model $d/model.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav ./0.wav\ndone\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-transducer-non-streaming-2.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\n# 2200 hours of Portuguese speech\nurl=https://huggingface.co/nvidia/stt_pt_fastconformer_hybrid_large_pc\nname=$(basename $url)\nname=\"nvidia/$name\"\ndoc=\"STT PT FastConformer Hybrid Transducer-CTC Large transcribes text in upper and lower case Portuguese alphabet along with spaces, period, comma, question mark. This collection contains the Brazilian Portuguese FastConformer Hybrid (Transducer and CTC) Large model (around 115M parameters) with punctuation and capitalization trained on around 2200h hours of Portuguese speech. \"\n\nlog \"Process $name at $url\"\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nmkdir test_wavs\npushd test_wavs\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/pt_br.wav\npopd\ncp -a test_wavs $d\n\nd=sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\nmv test_wavs $d\n\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/pt_br.wav\n\n# 2500 hours of German speech\nurl=https://huggingface.co/nvidia/stt_de_fastconformer_hybrid_large_pc\nname=$(basename $url)\nname=\"nvidia/$name\"\ndoc=\"This model transcribes speech in upper and lower case German alphabet along with spaces, periods, commas, and question 
marks. It is a 'large' version of FastConformer Transducer-CTC (around 115M parameters) model. This is a hybrid model trained on two losses: Transducer (default) and CTC.\"\n\nlog \"Process $name at $url\"\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nmkdir test_wavs\npushd test_wavs\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/de.wav\npopd\ncp -a test_wavs $d\n\nd=sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\nmv test_wavs $d\n\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/de.wav\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-transducer-non-streaming.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\n# 36000 hours of English data\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/parakeet-tdt_ctc-110m\nname=$(basename $url)\ndoc=\"parakeet-tdt_ctc-110m is an ASR model that transcribes speech with Punctuations and Capitalizations of the English alphabet. It was trained on 36K hours of English speech collected and prepared by NVIDIA NeMo and Suno teams.\"\n\nlog \"Process $name at $url\"\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\nd=sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\n# 8500 hours of English speech\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the English FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization on NeMo ASRSet En PC with around 8500 hours of English speech (SPGI 1k, VoxPopuli, MCV11, Europarl-ASR, Fisher, LibriSpeech, NSC1, MLS). It utilizes a Google SentencePiece [1] tokenizer with a vocabulary size of 1024. 
It transcribes text in upper and lower case English alphabet along with spaces, periods, commas, question marks, and a few other characters.\"\n\nlog \"Process $name at $url\"\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-24500\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-24500-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_es_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the Spanish FastConformer Hybrid (CTC and Transducer) Large model (around 114M parameters) with Punctuation and Capitalization. It is trained on the NeMo PnC ES ASRSET (Fisher, MCV12, MLS, Voxpopuli) containing 1424 hours of Spanish speech. It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 1024, and transcribes text in upper and lower case Spanish alphabet along with spaces, period, comma, question mark and inverted question mark.\"\n\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-es-1424\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-es-1424-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc_blend_eu\nname=$(basename $url)\ndoc=\"This collection contains the Multilingual FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization. 
It is trained on the NeMo PnC German, English, Spanish, and French ASR sets that contain 14,288 hours of speech in total. It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 256 per language and transcribes text in upper and lower case along with spaces, periods, commas, question marks and a few other language-specific characters. The total tokenizer size is 2560, of which 1024 tokens are allocated to English, German, French, and Spanish. The remaining tokens are reserved for future languages.\"\n\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\nurl=https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_multilingual_fastconformer_hybrid_large_pc\nname=$(basename $url)\ndoc=\"This collection contains the Multilingual FastConformer Hybrid (Transducer and CTC) Large model (around 114M parameters) with Punctuation and Capitalization. It is trained on the NeMo PnC Belarusian, German, English, Spanish, French, Croatian, Italian, Polish, Russian, and Ukrainian ASR sets that contain ~20,000 hours of speech in total. 
It utilizes a Google SentencePiece [1] tokenizer with vocabulary size 256 per language (2560 total), and transcribes text in upper and lower case along with spaces, periods, commas, question marks and a few other language-specific characters.\"\n\n./export-onnx-transducer-non-streaming.py --model $name --doc \"$doc\"\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k\nmkdir -p $d\nmv -v encoder.onnx $d/\nmv -v decoder.onnx $d/\nmv -v joiner.onnx $d/\ncp -v tokens.txt $d/\nls -lh $d\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\nmkdir -p $d\nmv -v encoder.int8.onnx $d/\nmv -v decoder.int8.onnx $d/\nmv -v joiner.int8.onnx $d/\nmv -v tokens.txt $d/\nls -lh $d\n\n# Now test the exported model\nlog \"Download test data\"\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/spoken-language-identification-test-wavs.tar.bz2\ntar xvf spoken-language-identification-test-wavs.tar.bz2\nrm spoken-language-identification-test-wavs.tar.bz2\ndata=spoken-language-identification-test-wavs\n\ncurl -SL -O https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav\nmv 2086-149220-0033.wav en.wav\n\nd=sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.onnx \\\n  --decoder $d/decoder.onnx \\\n  --joiner $d/joiner.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\n\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.onnx \\\n  --decoder $d/decoder.onnx \\\n  --joiner $d/joiner.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav ./en.wav\n\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-parakeet_tdt_transducer_110m-en-36000-int8\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens 
$d/tokens.txt \\\n  --wav $data/en-english.wav\n\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav ./en.wav\n\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-24500\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.onnx \\\n  --decoder $d/decoder.onnx \\\n  --joiner $d/joiner.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-24500-int8\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/en-english.wav\nmkdir -p $d/test_wavs\ncp en.wav $d/test_wavs/0.wav\ncp -v $data/en-english.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-es-1424\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.onnx \\\n  --decoder $d/decoder.onnx \\\n  --joiner $d/joiner.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/es-spanish.wav\nmkdir -p $d/test_wavs\ncp -v $data/es-spanish.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-es-1424-int8\npython3 ./test-onnx-transducer-non-streaming.py \\\n  --encoder $d/encoder.int8.onnx \\\n  --decoder $d/decoder.int8.onnx \\\n  --joiner $d/joiner.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $data/es-spanish.wav\nmkdir -p $d/test_wavs\ncp -v $data/es-spanish.wav $d/test_wavs\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav; do\n  python3 ./test-onnx-transducer-non-streaming.py \\\n    --encoder 
$d/encoder.onnx \\\n    --decoder $d/decoder.onnx \\\n    --joiner $d/joiner.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-en-de-es-fr-14288-int8\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav; do\n  python3 ./test-onnx-transducer-non-streaming.py \\\n    --encoder $d/encoder.int8.onnx \\\n    --decoder $d/decoder.int8.onnx \\\n    --joiner $d/joiner.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav hr-croatian.wav it-italian.wav po-polish.wav ru-russian.wav uk-ukrainian.wav; do\n  python3 ./test-onnx-transducer-non-streaming.py \\\n    --encoder $d/encoder.onnx \\\n    --decoder $d/decoder.onnx \\\n    --joiner $d/joiner.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n\nd=sherpa-onnx-nemo-fast-conformer-transducer-be-de-en-es-fr-hr-it-pl-ru-uk-20k-int8\nmkdir -p $d/test_wavs\nfor w in en-english.wav de-german.wav es-spanish.wav fr-french.wav hr-croatian.wav it-italian.wav po-polish.wav ru-russian.wav uk-ukrainian.wav; do\n  python3 ./test-onnx-transducer-non-streaming.py \\\n    --encoder $d/encoder.int8.onnx \\\n    --decoder $d/decoder.int8.onnx \\\n    --joiner $d/joiner.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav $data/$w\n  cp -v $data/$w $d/test_wavs\ndone\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/run-transducer.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nset -ex\n\nif [ ! -e ./0.wav ]; then\n  # curl -SL -O https://hf-mirror.com/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/0.wav\n  curl -SL -O https://huggingface.co/csukuangfj/icefall-asr-librispeech-streaming-zipformer-small-2024-03-18/resolve/main/test_wavs/0.wav\nfi\n\nms=(\n80\n480\n1040\n)\n\nfor m in ${ms[@]}; do\n  ./export-onnx-transducer.py --model $m\n  d=sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-${m}ms\n  d_int8=sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-${m}ms-int8\n  if [ ! -f $d/encoder.onnx ]; then\n    mkdir -p $d $d_int8\n    mv -v encoder.onnx $d/\n    mv -v decoder.onnx $d/\n    mv -v joiner.onnx $d/\n    cp -v tokens.txt $d/\n\n    mv -v encoder.int8.onnx $d_int8/\n    mv -v decoder.int8.onnx $d_int8/\n    mv -v joiner.int8.onnx $d_int8/\n    mv -v tokens.txt $d_int8/\n\n    echo \"---$d---\"\n    ls -lh $d\n\n    echo \"---$d_int8---\"\n    ls -lh $d_int8\n  fi\ndone\n\n# Now test the exported models\n\nfor m in ${ms[@]}; do\n  d=sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-${m}ms\n  python3 ./test-onnx-transducer.py \\\n    --encoder $d/encoder.onnx \\\n    --decoder $d/decoder.onnx \\\n    --joiner $d/joiner.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav ./0.wav\n\n  d=sherpa-onnx-nemo-streaming-fast-conformer-transducer-en-${m}ms-int8\n  python3 ./test-onnx-transducer.py \\\n    --encoder $d/encoder.int8.onnx \\\n    --decoder $d/decoder.int8.onnx \\\n    --joiner $d/joiner.int8.onnx \\\n    --tokens $d/tokens.txt \\\n    --wav ./0.wav\ndone\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/show-onnx-transudcer.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    print(\"=========encoder==========\")\n    show(\"./encoder.onnx\")\n\n    print(\"=========decoder==========\")\n    show(\"./decoder.onnx\")\n\n    print(\"=========joiner==========\")\n    show(\"./joiner.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n\n\"\"\"\n=========encoder==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['audio_signal_dynamic_axes_1', 80, 'audio_signal_dynamic_axes_2'])\nNodeArg(name='length', type='tensor(int64)', shape=['length_dynamic_axes_1'])\nNodeArg(name='cache_last_channel', type='tensor(float)', shape=['cache_last_channel_dynamic_axes_1', 17, 'cache_last_channel_dynamic_axes_2', 512])\nNodeArg(name='cache_last_time', type='tensor(float)', shape=['cache_last_time_dynamic_axes_1', 17, 512, 'cache_last_time_dynamic_axes_2'])\nNodeArg(name='cache_last_channel_len', type='tensor(int64)', shape=['cache_last_channel_len_dynamic_axes_1'])\n-----\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 512, 'outputs_dynamic_axes_2'])\nNodeArg(name='encoded_lengths', type='tensor(int64)', shape=['encoded_lengths_dynamic_axes_1'])\nNodeArg(name='cache_last_channel_next', type='tensor(float)', shape=['cache_last_channel_next_dynamic_axes_1', 17, 'cache_last_channel_next_dynamic_axes_2', 512])\nNodeArg(name='cache_last_time_next', type='tensor(float)', shape=['cache_last_time_next_dynamic_axes_1', 17, 512, 'cache_last_time_next_dynamic_axes_2'])\nNodeArg(name='cache_last_channel_next_len', type='tensor(int64)', 
shape=['cache_last_channel_next_len_dynamic_axes_1'])\n=========decoder==========\nNodeArg(name='targets', type='tensor(int32)', shape=['targets_dynamic_axes_1', 'targets_dynamic_axes_2'])\nNodeArg(name='target_length', type='tensor(int32)', shape=['target_length_dynamic_axes_1'])\nNodeArg(name='states.1', type='tensor(float)', shape=[1, 'states.1_dim_1', 640])\nNodeArg(name='onnx::LSTM_3', type='tensor(float)', shape=[1, 1, 640])\n-----\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 640, 'outputs_dynamic_axes_2'])\nNodeArg(name='prednet_lengths', type='tensor(int32)', shape=['prednet_lengths_dynamic_axes_1'])\nNodeArg(name='states', type='tensor(float)', shape=[1, 'states_dynamic_axes_1', 640])\nNodeArg(name='74', type='tensor(float)', shape=[1, 'LSTM74_dim_1', 640])\n=========joiner==========\nNodeArg(name='encoder_outputs', type='tensor(float)', shape=['encoder_outputs_dynamic_axes_1', 512, 'encoder_outputs_dynamic_axes_2'])\nNodeArg(name='decoder_outputs', type='tensor(float)', shape=['decoder_outputs_dynamic_axes_1', 640, 'decoder_outputs_dynamic_axes_2'])\n-----\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 'outputs_dynamic_axes_2', 'outputs_dynamic_axes_3', 1025])\n\n\"\"\"\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-ctc-non-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport torch\nimport soundfile as sf\nimport librosa\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--model\", type=str, required=True, help=\"Path to model.onnx\")\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 80\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        print(\"==========Input==========\")\n        for i in self.model.get_inputs():\n            print(i)\n        print(\"==========Output==========\")\n        for i in self.model.get_outputs():\n            
print(i)\n        \"\"\"\n        ==========Input==========\n        NodeArg(name='audio_signal', type='tensor(float)', shape=['audio_signal_dynamic_axes_1', 80, 'audio_signal_dynamic_axes_2'])\n        NodeArg(name='length', type='tensor(int64)', shape=['length_dynamic_axes_1'])\n        ==========Output==========\n        NodeArg(name='logprobs', type='tensor(float)', shape=['logprobs_dynamic_axes_1', 'logprobs_dynamic_axes_2', 1025])\n        \"\"\"\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.normalize_type = meta[\"normalize_type\"]\n        print(meta)\n\n    def __call__(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        log_probs = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )[0]\n        # [batch_size, T, vocab_size]\n        return torch.from_numpy(log_probs)\n\n\ndef main():\n    args = get_args()\n    assert Path(args.model).is_file(), args.model\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    model = OnnxModel(args.model)\n\n    id2token = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = line.split()\n            id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    blank = 
len(id2token) - 1\n    ans = []\n    prev = -1\n\n    print(audio.shape)\n    features = compute_features(audio, fbank)\n    if model.normalize_type != \"\":\n        assert model.normalize_type == \"per_feature\", model.normalize_type\n        features = torch.from_numpy(features)\n        mean = features.mean(dim=0, keepdims=True)\n        stddev = features.std(dim=0, keepdims=True) + 1e-5\n        features = (features - mean) / stddev\n        features = features.numpy()\n\n    print(\"features.shape\", features.shape)\n    log_probs = model(features)\n\n    print(\"log_probs.shape\", log_probs.shape)\n\n    log_probs = log_probs[0, :, :]  # remove batch dim\n    ids = torch.argmax(log_probs, dim=1).tolist()\n    for k in ids:\n        if k != blank and k != prev:\n            ans.append(k)\n        prev = k\n\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(tokens).replace(underline, \" \").strip()\n    print(args.wav)\n    print(text)\n\n\nmain()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-ctc.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport torch\nimport soundfile as sf\nimport librosa\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--model\", type=str, required=True, help=\"Path to model.onnx\")\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 80\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n\n        self.window_size = int(meta[\"window_size\"])\n        self.chunk_shift = int(meta[\"chunk_shift\"])\n\n        
self.cache_last_channel_dim1 = int(meta[\"cache_last_channel_dim1\"])\n        self.cache_last_channel_dim2 = int(meta[\"cache_last_channel_dim2\"])\n        self.cache_last_channel_dim3 = int(meta[\"cache_last_channel_dim3\"])\n\n        self.cache_last_time_dim1 = int(meta[\"cache_last_time_dim1\"])\n        self.cache_last_time_dim2 = int(meta[\"cache_last_time_dim2\"])\n        self.cache_last_time_dim3 = int(meta[\"cache_last_time_dim3\"])\n\n        self.init_cache_state()\n\n    def init_cache_state(self):\n        self.cache_last_channel = torch.zeros(\n            1,\n            self.cache_last_channel_dim1,\n            self.cache_last_channel_dim2,\n            self.cache_last_channel_dim3,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.cache_last_time = torch.zeros(\n            1,\n            self.cache_last_time_dim1,\n            self.cache_last_time_dim2,\n            self.cache_last_time_dim3,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.cache_last_channel_len = torch.zeros([1], dtype=torch.int64).numpy()\n\n    def __call__(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        (\n            log_probs,\n            log_probs_len,\n            cache_last_channel_next,\n            cache_last_time_next,\n            cache_last_channel_len_next,\n        ) = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n                self.model.get_outputs()[3].name,\n                self.model.get_outputs()[4].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: x_lens.numpy(),\n                self.model.get_inputs()[2].name: 
self.cache_last_channel,\n                self.model.get_inputs()[3].name: self.cache_last_time,\n                self.model.get_inputs()[4].name: self.cache_last_channel_len,\n            },\n        )\n        self.cache_last_channel = cache_last_channel_next\n        self.cache_last_time = cache_last_time_next\n        self.cache_last_channel_len = cache_last_channel_len_next\n\n        # [T, vocab_size]\n        return torch.from_numpy(log_probs).squeeze(0)\n\n\ndef main():\n    args = get_args()\n    assert Path(args.model).is_file(), args.model\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    model = OnnxModel(args.model)\n\n    id2token = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = line.split()\n            id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    window_size = model.window_size\n    chunk_shift = model.chunk_shift\n\n    blank = len(id2token) - 1\n    prev = -1\n    ans = []\n\n    features = compute_features(audio, fbank)\n    num_chunks = (features.shape[0] - window_size) // chunk_shift + 1\n    for i in range(num_chunks):\n        start = i * chunk_shift\n        end = start + window_size\n        chunk = features[start:end, :]\n\n        log_probs = model(chunk)\n        ids = torch.argmax(log_probs, dim=1).tolist()\n        for i in ids:\n            if i != blank and i != prev:\n                ans.append(i)\n            prev = i\n\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = 
\"\".join(tokens).replace(underline, \" \").strip()\n    print(args.wav)\n    print(text)\n\n\nmain()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-transducer-non-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\", type=str, required=True, help=\"Path to encoder.onnx\"\n    )\n    parser.add_argument(\n        \"--decoder\", type=str, required=True, help=\"Path to decoder.onnx\"\n    )\n    parser.add_argument(\"--joiner\", type=str, required=True, help=\"Path to joiner.onnx\")\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 80\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\ndef display(sess):\n    print(\"==========Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(\"==========Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\n\"\"\"\nencoder\n==========Input==========\nNodeArg(name='audio_signal', type='tensor(float)', shape=['audio_signal_dynamic_axes_1', 80, 'audio_signal_dynamic_axes_2'])\nNodeArg(name='length', type='tensor(int64)', 
shape=['length_dynamic_axes_1'])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 512, 'outputs_dynamic_axes_2'])\nNodeArg(name='encoded_lengths', type='tensor(int64)', shape=['encoded_lengths_dynamic_axes_1'])\n\ndecoder\n==========Input==========\nNodeArg(name='targets', type='tensor(int32)', shape=['targets_dynamic_axes_1', 'targets_dynamic_axes_2'])\nNodeArg(name='target_length', type='tensor(int32)', shape=['target_length_dynamic_axes_1'])\nNodeArg(name='states.1', type='tensor(float)', shape=[1, 'states.1_dim_1', 640])\nNodeArg(name='onnx::LSTM_3', type='tensor(float)', shape=[1, 1, 640])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 640, 'outputs_dynamic_axes_2'])\nNodeArg(name='prednet_lengths', type='tensor(int32)', shape=['prednet_lengths_dynamic_axes_1'])\nNodeArg(name='states', type='tensor(float)', shape=[1, 'states_dynamic_axes_1', 640])\nNodeArg(name='74', type='tensor(float)', shape=[1, 'LSTM74_dim_1', 640])\n\njoiner\n==========Input==========\nNodeArg(name='encoder_outputs', type='tensor(float)', shape=['encoder_outputs_dynamic_axes_1', 512, 'encoder_outputs_dynamic_axes_2'])\nNodeArg(name='decoder_outputs', type='tensor(float)', shape=['decoder_outputs_dynamic_axes_1', 640, 'decoder_outputs_dynamic_axes_2'])\n==========Output==========\nNodeArg(name='outputs', type='tensor(float)', shape=['outputs_dynamic_axes_1', 'outputs_dynamic_axes_2', 'outputs_dynamic_axes_3', 1025])\n\"\"\"\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n    ):\n        self.init_encoder(encoder)\n        display(self.encoder)\n        self.init_decoder(decoder)\n        display(self.decoder)\n        self.init_joiner(joiner)\n        display(self.joiner)\n\n    def init_encoder(self, encoder):\n        session_opts = ort.SessionOptions()\n        
session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.normalize_type = meta[\"normalize_type\"]\n        print(meta)\n\n        self.pred_rnn_layers = int(meta[\"pred_rnn_layers\"])\n        self.pred_hidden = int(meta[\"pred_hidden\"])\n\n    def init_decoder(self, decoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_joiner(self, joiner):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.joiner = ort.InferenceSession(\n            joiner,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def get_decoder_state(self):\n        batch_size = 1\n        state0 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        state1 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        return state0, state1\n\n    def run_encoder(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        (encoder_out, out_len) = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: x.numpy(),\n                
self.encoder.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )\n        # [batch_size, dim, T]\n        return encoder_out\n\n    def run_decoder(\n        self,\n        token: int,\n        state0: np.ndarray,\n        state1: np.ndarray,\n    ):\n        target = torch.tensor([[token]], dtype=torch.int32).numpy()\n        target_len = torch.tensor([1], dtype=torch.int32).numpy()\n\n        (decoder_out, decoder_out_length, state0_next, state1_next,) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n                self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                self.decoder.get_outputs()[3].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: target,\n                self.decoder.get_inputs()[1].name: target_len,\n                self.decoder.get_inputs()[2].name: state0,\n                self.decoder.get_inputs()[3].name: state1,\n            },\n        )\n        return decoder_out, state0_next, state1_next\n\n    def run_joiner(\n        self,\n        encoder_out: np.ndarray,\n        decoder_out: np.ndarray,\n    ):\n        # encoder_out: [batch_size,  dim, 1]\n        # decoder_out: [batch_size,  dim, 1]\n        logit = self.joiner.run(\n            [\n                self.joiner.get_outputs()[0].name,\n            ],\n            {\n                self.joiner.get_inputs()[0].name: encoder_out,\n                self.joiner.get_inputs()[1].name: decoder_out,\n            },\n        )[0]\n        # logit: [batch_size, 1, 1, vocab_size]\n        return logit\n\n\ndef main():\n    args = get_args()\n    assert Path(args.encoder).is_file(), args.encoder\n    assert Path(args.decoder).is_file(), args.decoder\n    assert Path(args.joiner).is_file(), args.joiner\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    model = 
OnnxModel(args.encoder, args.decoder, args.joiner)\n\n    id2token = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = line.split()\n            id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    tail_padding = np.zeros(sample_rate * 2)\n\n    audio = np.concatenate([audio, tail_padding])\n\n    blank = len(id2token) - 1\n    ans = [blank]\n    state0, state1 = model.get_decoder_state()\n    decoder_out, state0_next, state1_next = model.run_decoder(ans[-1], state0, state1)\n\n    features = compute_features(audio, fbank)\n    if model.normalize_type != \"\":\n        assert model.normalize_type == \"per_feature\", model.normalize_type\n        features = torch.from_numpy(features)\n        mean = features.mean(dim=0, keepdims=True)\n        stddev = features.std(dim=0, keepdims=True) + 1e-5\n        features = (features - mean) / stddev\n        features = features.numpy()\n    print(audio.shape)\n    print(\"features.shape\", features.shape)\n\n    encoder_out = model.run_encoder(features)\n    # encoder_out:[batch_size, dim, T)\n    for t in range(encoder_out.shape[2]):\n        encoder_out_t = encoder_out[:, :, t : t + 1]\n        logits = model.run_joiner(encoder_out_t, decoder_out)\n        logits = torch.from_numpy(logits)\n        logits = logits.squeeze()\n        idx = torch.argmax(logits, dim=-1).item()\n        if idx != blank:\n            ans.append(idx)\n            state0 = state0_next\n            state1 = state1_next\n            decoder_out, state0_next, state1_next = model.run_decoder(\n                ans[-1], state0, state1\n            )\n\n    ans = ans[1:]  # 
remove the first blank\n    print(ans)\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(tokens).replace(underline, \" \").strip()\n    print(args.wav)\n    print(text)\n\n\nmain()\n"
  },
  {
    "path": "scripts/nemo/fast-conformer-hybrid-transducer-ctc/test-onnx-transducer.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\", type=str, required=True, help=\"Path to encoder.onnx\"\n    )\n    parser.add_argument(\n        \"--decoder\", type=str, required=True, help=\"Path to decoder.onnx\"\n    )\n    parser.add_argument(\"--joiner\", type=str, required=True, help=\"Path to joiner.onnx\")\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 80\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n    ):\n        self.init_encoder(encoder)\n        self.init_decoder(decoder)\n        self.init_joiner(joiner)\n\n    def init_encoder(self, encoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encoder 
= ort.InferenceSession(\n            encoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        print(meta)\n\n        self.window_size = int(meta[\"window_size\"])\n        self.chunk_shift = int(meta[\"chunk_shift\"])\n\n        self.cache_last_channel_dim1 = int(meta[\"cache_last_channel_dim1\"])\n        self.cache_last_channel_dim2 = int(meta[\"cache_last_channel_dim2\"])\n        self.cache_last_channel_dim3 = int(meta[\"cache_last_channel_dim3\"])\n\n        self.cache_last_time_dim1 = int(meta[\"cache_last_time_dim1\"])\n        self.cache_last_time_dim2 = int(meta[\"cache_last_time_dim2\"])\n        self.cache_last_time_dim3 = int(meta[\"cache_last_time_dim3\"])\n\n        self.pred_rnn_layers = int(meta[\"pred_rnn_layers\"])\n        self.pred_hidden = int(meta[\"pred_hidden\"])\n\n        self.init_cache_state()\n\n    def init_decoder(self, decoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_joiner(self, joiner):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.joiner = ort.InferenceSession(\n            joiner,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def get_decoder_state(self):\n        batch_size = 1\n        state0 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        state1 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        return state0, state1\n\n    def init_cache_state(self):\n        self.cache_last_channel 
= torch.zeros(\n            1,\n            self.cache_last_channel_dim1,\n            self.cache_last_channel_dim2,\n            self.cache_last_channel_dim3,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.cache_last_time = torch.zeros(\n            1,\n            self.cache_last_time_dim1,\n            self.cache_last_time_dim2,\n            self.cache_last_time_dim3,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.cache_last_channel_len = torch.zeros([1], dtype=torch.int64).numpy()\n\n    def run_encoder(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n        (\n            encoder_out,\n            out_len,\n            cache_last_channel_next,\n            cache_last_time_next,\n            cache_last_channel_len_next,\n        ) = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n                self.encoder.get_outputs()[2].name,\n                self.encoder.get_outputs()[3].name,\n                self.encoder.get_outputs()[4].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: x.numpy(),\n                self.encoder.get_inputs()[1].name: x_lens.numpy(),\n                self.encoder.get_inputs()[2].name: self.cache_last_channel,\n                self.encoder.get_inputs()[3].name: self.cache_last_time,\n                self.encoder.get_inputs()[4].name: self.cache_last_channel_len,\n            },\n        )\n        self.cache_last_channel = cache_last_channel_next\n        self.cache_last_time = cache_last_time_next\n        self.cache_last_channel_len = cache_last_channel_len_next\n\n        # [batch_size, dim, T]\n        return encoder_out\n\n    def run_decoder(\n        self,\n        token: int,\n        state0: np.ndarray,\n       
 state1: np.ndarray,\n    ):\n        target = torch.tensor([[token]], dtype=torch.int32).numpy()\n        target_len = torch.tensor([1], dtype=torch.int32).numpy()\n\n        (\n            decoder_out,\n            decoder_out_length,\n            state0_next,\n            state1_next,\n        ) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n                self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                self.decoder.get_outputs()[3].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: target,\n                self.decoder.get_inputs()[1].name: target_len,\n                self.decoder.get_inputs()[2].name: state0,\n                self.decoder.get_inputs()[3].name: state1,\n            },\n        )\n        return decoder_out, state0_next, state1_next\n\n    def run_joiner(\n        self,\n        encoder_out: np.ndarray,\n        decoder_out: np.ndarray,\n    ):\n        # encoder_out: [batch_size,  dim, 1]\n        # decoder_out: [batch_size,  dim, 1]\n        logit = self.joiner.run(\n            [\n                self.joiner.get_outputs()[0].name,\n            ],\n            {\n                self.joiner.get_inputs()[0].name: encoder_out,\n                self.joiner.get_inputs()[1].name: decoder_out,\n            },\n        )[0]\n        # logit: [batch_size, 1, 1, vocab_size]\n        return logit\n\n\ndef main():\n    args = get_args()\n    assert Path(args.encoder).is_file(), args.encoder\n    assert Path(args.decoder).is_file(), args.decoder\n    assert Path(args.joiner).is_file(), args.joiner\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    model = OnnxModel(args.encoder, args.decoder, args.joiner)\n\n    id2token = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = 
line.split()\n            id2token[int(idx)] = t\n\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    tail_padding = np.zeros(sample_rate * 2)\n\n    audio = np.concatenate([audio, tail_padding])\n\n    window_size = model.window_size\n    chunk_shift = model.chunk_shift\n\n    blank = len(id2token) - 1\n    ans = [blank]\n    state0, state1 = model.get_decoder_state()\n    decoder_out, state0_next, state1_next = model.run_decoder(ans[-1], state0, state1)\n\n    features = compute_features(audio, fbank)\n    num_chunks = (features.shape[0] - window_size) // chunk_shift + 1\n    for i in range(num_chunks):\n        start = i * chunk_shift\n        end = start + window_size\n        chunk = features[start:end, :]\n\n        encoder_out = model.run_encoder(chunk)\n        # encoder_out:[batch_size, dim, T)\n        for t in range(encoder_out.shape[2]):\n            encoder_out_t = encoder_out[:, :, t : t + 1]\n            logits = model.run_joiner(encoder_out_t, decoder_out)\n            logits = torch.from_numpy(logits)\n            logits = logits.squeeze()\n            idx = torch.argmax(logits, dim=-1).item()\n            if idx != blank:\n                ans.append(idx)\n                state0 = state0_next\n                state1 = state1_next\n                decoder_out, state0_next, state1_next = model.run_decoder(\n                    ans[-1], state0, state1\n                )\n\n    ans = ans[1:]  # remove the first blank\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(tokens).replace(underline, \" \").strip()\n    print(args.wav)\n    print(text)\n\n\nmain()\n"
  },
  {
    "path": "scripts/nemo/generate_bpe_vocab.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2026  github.com/nullbio\n#\n# Generate bpe.vocab file from a NeMo model for use with hotwords in sherpa-onnx.\n#\n# The bpe.vocab file contains BPE tokens with their scores (merge priorities),\n# which is required for hotword/keyword boosting with modified beam search.\n#\n# Usage:\n#   # From a pretrained model name:\n#   python generate_bpe_vocab.py --model nvidia/parakeet-tdt-0.6b-v2\n#\n#   # From a local .nemo file:\n#   python generate_bpe_vocab.py --model ./parakeet-tdt-0.6b-v2.nemo\n#\n#   # Specify output path:\n#   python generate_bpe_vocab.py --model nvidia/parakeet-tdt-0.6b-v2 --output ./bpe.vocab\n\nimport argparse\nfrom pathlib import Path\n\n\ndef generate_bpe_vocab_from_tokenizer(sp, output_path: str):\n    \"\"\"\n    Generate bpe.vocab file from a sentencepiece processor.\n\n    Uses the original scores from the SentencePiece model, which represent\n    BPE merge priorities. These scores ensure correct tokenization order\n    when encoding hotwords.\n\n    Args:\n        sp: SentencePiece processor object (from tokenizer.tokenizer)\n        output_path: Output path for bpe.vocab file\n    \"\"\"\n    vocab_size = sp.get_piece_size()\n    print(f\"Vocabulary size: {vocab_size}\")\n\n    print(f\"Writing bpe.vocab to: {output_path}\")\n    with open(output_path, \"w\", encoding=\"utf-8\") as f:\n        for token_id in range(vocab_size):\n            token = sp.id_to_piece(token_id)\n            score = sp.get_score(token_id)\n            f.write(f\"{token}\\t{score}\\n\")\n\n    print(\"Done!\")\n    return output_path\n\n\ndef generate_bpe_vocab_from_model(asr_model, output_path: str):\n    \"\"\"\n    Generate bpe.vocab file from a loaded NeMo ASR model.\n\n    Args:\n        asr_model: Loaded NeMo ASR model object\n        output_path: Output path for bpe.vocab file\n    \"\"\"\n    sp = asr_model.tokenizer.tokenizer\n    return generate_bpe_vocab_from_tokenizer(sp, output_path)\n\n\ndef 
generate_bpe_vocab(model_path: str, output_path: str):\n    \"\"\"\n    Generate bpe.vocab file from a NeMo ASR model.\n\n    Args:\n        model_path: Path to .nemo file or HuggingFace model name (e.g., nvidia/parakeet-tdt-0.6b-v2)\n        output_path: Output path for bpe.vocab file\n    \"\"\"\n    import nemo.collections.asr as nemo_asr\n\n    # Load model\n    print(f\"Loading model: {model_path}\")\n    if Path(model_path).is_file():\n        asr_model = nemo_asr.models.ASRModel.restore_from(restore_path=model_path)\n    else:\n        asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_path)\n\n    return generate_bpe_vocab_from_model(asr_model, output_path)\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Generate bpe.vocab file from a NeMo ASR model for hotword support\",\n        formatter_class=argparse.RawDescriptionHelpFormatter,\n        epilog=\"\"\"\nExamples:\n  # From HuggingFace model:\n  python generate_bpe_vocab.py --model nvidia/parakeet-tdt-0.6b-v2\n\n  # From local .nemo file:\n  python generate_bpe_vocab.py --model ./my_model.nemo --output ./bpe.vocab\n        \"\"\",\n    )\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"NeMo model name (e.g., nvidia/parakeet-tdt-0.6b-v2) or path to .nemo file\",\n    )\n    parser.add_argument(\n        \"--output\",\n        type=str,\n        default=\"./bpe.vocab\",\n        help=\"Output path for bpe.vocab file (default: ./bpe.vocab)\",\n    )\n\n    args = parser.parse_args()\n\n    generate_bpe_vocab(\n        model_path=args.model,\n        output_path=args.output,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/nemotron-speech-streaming-en-0.6b/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2026  Xiaomi Corp.        (authors: Fangjun Kuang)\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\"\"\"\n'_target_': 'nemo.collections.asr.modules.AudioToMelSpectrogramPreprocessor',\n'sample_rate': 16000, 'normalize': 'NA', 'window_size': 0.025, 'window_stride': 0.01,\n'window': 'hann', 'features': 128, 'n_fft': 512, 'log': True,\n'frame_splicing': 1, 'dither': 1e-05, 'pad_to': 0, 'pad_value': 0.0}\n\n\"\"\"\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    external_filename = filename.split(\".onnx\")[0]\n    onnx.save(\n        model,\n        filename,\n        save_as_external_data=True,\n        all_tensors_to_one_file=True,\n        location=external_filename + \".data\",\n    )\n\n\n@torch.no_grad()\ndef main():\n    model_name = \"nvidia/nemotron-speech-streaming-en-0.6b\"\n\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    asr_model.eval()\n\n    assert asr_model.encoder.streaming_cfg is not None\n    if isinstance(asr_model.encoder.streaming_cfg.chunk_size, list):\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size[1]\n    
else:\n        chunk_size = asr_model.encoder.streaming_cfg.chunk_size\n\n    if isinstance(asr_model.encoder.streaming_cfg.pre_encode_cache_size, list):\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size[1]\n    else:\n        pre_encode_cache_size = asr_model.encoder.streaming_cfg.pre_encode_cache_size\n    window_size = chunk_size + pre_encode_cache_size\n\n    print(\"chunk_size\", chunk_size)\n    print(\"pre_encode_cache_size\", pre_encode_cache_size)\n    print(\"window_size\", window_size)\n\n    chunk_shift = chunk_size\n\n    # cache_last_channel: (batch_size, dim1, dim2, dim3)\n    cache_last_channel_dim1 = len(asr_model.encoder.layers)\n    cache_last_channel_dim2 = asr_model.encoder.streaming_cfg.last_channel_cache_size\n    cache_last_channel_dim3 = asr_model.encoder.d_model\n\n    # cache_last_time: (batch_size, dim1, dim2, dim3)\n    cache_last_time_dim1 = len(asr_model.encoder.layers)\n    cache_last_time_dim2 = asr_model.encoder.d_model\n    cache_last_time_dim3 = asr_model.encoder.conv_context_size[0]\n\n    asr_model.set_export_config({\"cache_support\": True})\n\n    asr_model.encoder.export(\"encoder.onnx\")\n    asr_model.decoder.export(\"decoder.onnx\")\n    asr_model.joint.export(\"joiner.onnx\")\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"window_size\": window_size,\n        \"chunk_shift\": chunk_shift,\n        \"normalize_type\": normalize_type,\n        \"cache_last_channel_dim1\": cache_last_channel_dim1,\n        \"cache_last_channel_dim2\": cache_last_channel_dim2,\n        \"cache_last_channel_dim3\": cache_last_channel_dim3,\n        \"cache_last_time_dim1\": cache_last_time_dim1,\n        \"cache_last_time_dim2\": cache_last_time_dim2,\n        \"cache_last_time_dim3\": cache_last_time_dim3,\n        \"pred_rnn_layers\": 
asr_model.decoder.pred_rnn_layers,\n        \"pred_hidden\": asr_model.decoder.pred_hidden,\n        \"subsampling_factor\": 8,\n        \"feat_dim\": 128,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        \"url\": \"https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b\",\n        \"comment\": \"Only the transducer branch is exported\",\n    }\n    add_meta_data(\"encoder.onnx\", meta_data)\n\n    for m in [\"encoder\", \"decoder\", \"joiner\"]:\n        quantize_dynamic(\n            model_input=f\"{m}.onnx\",\n            model_output=f\"{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8,\n        )\n\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/parakeet-tdt-0.6b-v2/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport os\nimport sys\nfrom pathlib import Path\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n# Add parent directory to path to import generate_bpe_vocab\nsys.path.insert(0, str(Path(__file__).parent.parent))\nfrom generate_bpe_vocab import generate_bpe_vocab_from_model\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    if filename == \"encoder.onnx\":\n        external_filename = \"encoder\"\n        onnx.save(\n            model,\n            filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=external_filename + \".weights\",\n        )\n    else:\n        onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    if Path(\"./parakeet-tdt-0.6b-v2.nemo\").is_file():\n        asr_model = nemo_asr.models.ASRModel.restore_from(\n            restore_path=\"./parakeet-tdt-0.6b-v2.nemo\"\n        )\n    else:\n        asr_model = nemo_asr.models.ASRModel.from_pretrained(\n            model_name=\"nvidia/parakeet-tdt-0.6b-v2\"\n        )\n\n    asr_model.eval()\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    
# Generate bpe.vocab for hotword support\n    print(\"Generating bpe.vocab for hotword support...\")\n    generate_bpe_vocab_from_model(\n        asr_model=asr_model,\n        output_path=\"./bpe.vocab\",\n    )\n\n    asr_model.encoder.export(\"encoder.onnx\")\n    asr_model.decoder.export(\"decoder.onnx\")\n    asr_model.joint.export(\"joiner.onnx\")\n    os.system(\"ls -lh *.onnx\")\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"normalize_type\": normalize_type,\n        \"pred_rnn_layers\": asr_model.decoder.pred_rnn_layers,\n        \"pred_hidden\": asr_model.decoder.pred_hidden,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"2\",\n        \"model_author\": \"NeMo\",\n        \"url\": \"https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2\",\n        \"comment\": \"Only the transducer branch is exported\",\n        \"feat_dim\": 128,\n    }\n\n    for m in [\"encoder\", \"decoder\", \"joiner\"]:\n        quantize_dynamic(\n            model_input=f\"./{m}.onnx\",\n            model_output=f\"./{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8 if m == \"encoder\" else QuantType.QInt8,\n        )\n        os.system(\"ls -lh *.onnx\")\n\n    add_meta_data(\"encoder.int8.onnx\", meta_data)\n    add_meta_data(\"encoder.onnx\", meta_data)\n    print(\"meta_data\", meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/parakeet-tdt-0.6b-v2/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport argparse\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\nimport time\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\", type=str, required=True, help=\"Path to encoder.onnx\"\n    )\n    parser.add_argument(\n        \"--decoder\", type=str, required=True, help=\"Path to decoder.onnx\"\n    )\n    parser.add_argument(\"--joiner\", type=str, required=True, help=\"Path to joiner.onnx\")\n\n    parser.add_argument(\"--tokens\", type=str, required=True, help=\"Path to tokens.txt\")\n\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef create_fbank():\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.remove_dc_offset = False\n    opts.frame_opts.window_type = \"hann\"\n\n    opts.mel_opts.low_freq = 0\n    opts.mel_opts.num_bins = 128\n\n    opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(opts)\n    return fbank\n\n\ndef compute_features(audio, fbank):\n    assert len(audio.shape) == 1, audio.shape\n    fbank.accept_waveform(16000, audio)\n    ans = []\n    processed = 0\n    while processed < fbank.num_frames_ready:\n        ans.append(np.array(fbank.get_frame(processed)))\n        processed += 1\n    ans = np.stack(ans)\n    return ans\n\n\ndef display(sess, model):\n    print(f\"=========={model} Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(f\"=========={model }Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n    ):\n        self.init_encoder(encoder)\n        
display(self.encoder, \"encoder\")\n        self.init_decoder(decoder)\n        display(self.decoder, \"decoder\")\n        self.init_joiner(joiner)\n        display(self.joiner, \"joiner\")\n\n    def init_encoder(self, encoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.normalize_type = meta[\"normalize_type\"]\n        print(meta)\n\n        self.pred_rnn_layers = int(meta[\"pred_rnn_layers\"])\n        self.pred_hidden = int(meta[\"pred_hidden\"])\n\n    def init_decoder(self, decoder):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_joiner(self, joiner):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.joiner = ort.InferenceSession(\n            joiner,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def get_decoder_state(self):\n        batch_size = 1\n        state0 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        state1 = torch.zeros(self.pred_rnn_layers, batch_size, self.pred_hidden).numpy()\n        return state0, state1\n\n    def run_encoder(self, x: np.ndarray):\n        # x: (T, C)\n        x = torch.from_numpy(x)\n        x = x.t().unsqueeze(0)\n        # x: [1, C, T]\n        x_lens = torch.tensor([x.shape[-1]], dtype=torch.int64)\n\n  
      (encoder_out, out_len) = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: x.numpy(),\n                self.encoder.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )\n        # [batch_size, dim, T]\n        return encoder_out\n\n    def run_decoder(\n        self,\n        token: int,\n        state0: np.ndarray,\n        state1: np.ndarray,\n    ):\n        target = torch.tensor([[token]], dtype=torch.int32).numpy()\n        target_len = torch.tensor([1], dtype=torch.int32).numpy()\n\n        (decoder_out, decoder_out_length, state0_next, state1_next,) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n                self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                self.decoder.get_outputs()[3].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: target,\n                self.decoder.get_inputs()[1].name: target_len,\n                self.decoder.get_inputs()[2].name: state0,\n                self.decoder.get_inputs()[3].name: state1,\n            },\n        )\n        return decoder_out, state0_next, state1_next\n\n    def run_joiner(\n        self,\n        encoder_out: np.ndarray,\n        decoder_out: np.ndarray,\n    ):\n        # encoder_out: [batch_size,  dim, 1]\n        # decoder_out: [batch_size,  dim, 1]\n        logit = self.joiner.run(\n            [\n                self.joiner.get_outputs()[0].name,\n            ],\n            {\n                self.joiner.get_inputs()[0].name: encoder_out,\n                self.joiner.get_inputs()[1].name: decoder_out,\n            },\n        )[0]\n        # logit: [batch_size, 1, 1, vocab_size]\n        return logit\n\n\ndef main():\n    args = get_args()\n    assert 
Path(args.encoder).is_file(), args.encoder\n    assert Path(args.decoder).is_file(), args.decoder\n    assert Path(args.joiner).is_file(), args.joiner\n    assert Path(args.tokens).is_file(), args.tokens\n    assert Path(args.wav).is_file(), args.wav\n\n    print(vars(args))\n\n    model = OnnxModel(args.encoder, args.decoder, args.joiner)\n\n    id2token = dict()\n    with open(args.tokens, encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = line.split()\n            id2token[int(idx)] = t\n    vocab_size = len(id2token)\n\n    start = time.time()\n    fbank = create_fbank()\n    audio, sample_rate = sf.read(args.wav, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != 16000:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=16000,\n        )\n        sample_rate = 16000\n\n    tail_padding = np.zeros(sample_rate * 2)\n\n    audio = np.concatenate([audio, tail_padding])\n\n    blank = len(id2token) - 1\n    ans = [blank]\n    state0, state1 = model.get_decoder_state()\n    decoder_out, state0_next, state1_next = model.run_decoder(ans[-1], state0, state1)\n\n    features = compute_features(audio, fbank)\n    if model.normalize_type != \"\":\n        assert model.normalize_type == \"per_feature\", model.normalize_type\n        features = torch.from_numpy(features)\n        mean = features.mean(dim=0, keepdims=True)\n        stddev = features.std(dim=0, keepdims=True) + 1e-5\n        features = (features - mean) / stddev\n        features = features.numpy()\n    print(audio.shape)\n    print(\"features.shape\", features.shape)\n\n    encoder_out = model.run_encoder(features)\n    # encoder_out:[batch_size, dim, T)\n    t = 0\n    while t < encoder_out.shape[2]:\n        encoder_out_t = encoder_out[:, :, t : t + 1]\n        logits = model.run_joiner(encoder_out_t, decoder_out)\n        logits = torch.from_numpy(logits)\n        
logits = logits.squeeze()\n\n        token_logits = logits[:vocab_size]\n        duration_logits = logits[vocab_size:]\n\n        idx = torch.argmax(token_logits, dim=-1).item()\n        skip = torch.argmax(duration_logits, dim=-1).item()\n        if skip == 0:\n            skip = 1\n\n        if idx != blank:\n            ans.append(idx)\n            state0 = state0_next\n            state1 = state1_next\n            decoder_out, state0_next, state1_next = model.run_decoder(\n                ans[-1], state0, state1\n            )\n        t += skip\n\n    end = time.time()\n\n    elapsed_seconds = end - start\n    audio_duration = audio.shape[0] / 16000\n    real_time_factor = elapsed_seconds / audio_duration\n\n    ans = ans[1:]  # remove the first blank\n    tokens = [id2token[i] for i in ans]\n    underline = \"▁\"\n    #  underline = b\"\\xe2\\x96\\x81\".decode()\n    text = \"\".join(tokens).replace(underline, \" \").strip()\n\n    print(ans)\n    print(args.wav)\n    print(text)\n    print(f\"RTF: {real_time_factor}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/parakeet-tdt-0.6b-v3/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport os\nimport sys\nfrom pathlib import Path\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n# Add parent directory to path to import generate_bpe_vocab\nsys.path.insert(0, str(Path(__file__).parent.parent))\nfrom generate_bpe_vocab import generate_bpe_vocab_from_model\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    if filename == \"encoder.onnx\":\n        external_filename = \"encoder\"\n        onnx.save(\n            model,\n            filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=external_filename + \".weights\",\n        )\n    else:\n        onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    if Path(\"./parakeet-tdt-0.6b-v3.nemo\").is_file():\n        asr_model = nemo_asr.models.ASRModel.restore_from(\n            restore_path=\"./parakeet-tdt-0.6b-v3.nemo\"\n        )\n    else:\n        asr_model = nemo_asr.models.ASRModel.from_pretrained(\n            model_name=\"nvidia/parakeet-tdt-0.6b-v3\"\n        )\n\n    asr_model.eval()\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    
# Generate bpe.vocab for hotword support\n    print(\"Generating bpe.vocab for hotword support...\")\n    generate_bpe_vocab_from_model(\n        asr_model=asr_model,\n        output_path=\"./bpe.vocab\",\n    )\n\n    asr_model.encoder.export(\"encoder.onnx\")\n    asr_model.decoder.export(\"decoder.onnx\")\n    asr_model.joint.export(\"joiner.onnx\")\n    os.system(\"ls -lh *.onnx\")\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"normalize_type\": normalize_type,\n        \"pred_rnn_layers\": asr_model.decoder.pred_rnn_layers,\n        \"pred_hidden\": asr_model.decoder.pred_hidden,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecRNNTBPEModel\",\n        \"version\": \"2\",\n        \"model_author\": \"NeMo\",\n        \"url\": \"https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3\",\n        \"comment\": \"Only the transducer branch is exported\",\n        \"feat_dim\": 128,\n    }\n\n    for m in [\"encoder\", \"decoder\", \"joiner\"]:\n        quantize_dynamic(\n            model_input=f\"./{m}.onnx\",\n            model_output=f\"./{m}.int8.onnx\",\n            weight_type=QuantType.QUInt8 if m == \"encoder\" else QuantType.QInt8,\n        )\n        os.system(\"ls -lh *.onnx\")\n\n    add_meta_data(\"encoder.int8.onnx\", meta_data)\n    add_meta_data(\"encoder.onnx\", meta_data)\n    print(\"meta_data\", meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/parakeet-tdt_ctc-0.6b-ja/export-onnx-ctc.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport os\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    asr_model = nemo_asr.models.ASRModel.from_pretrained(\n        model_name=\"nvidia/parakeet-tdt_ctc-0.6b-ja\"\n    )\n\n    print(asr_model.cfg)\n    print(asr_model)\n\n    with open(\"./tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(asr_model.joint.vocabulary):\n            f.write(f\"{s} {i}\\n\")\n        f.write(f\"<blk> {i+1}\\n\")\n        print(\"Saved to tokens.txt\")\n\n    decoder_type = \"ctc\"\n    asr_model.change_decoding_strategy(decoder_type=decoder_type)\n    asr_model.eval()\n\n    asr_model.set_export_config({\"decoder_type\": \"ctc\"})\n\n    filename = \"model.onnx\"\n\n    asr_model.export(filename, onnx_opset_version=18)\n\n    normalize_type = asr_model.cfg.preprocessor.normalize\n    if normalize_type == \"NA\":\n        normalize_type = \"\"\n\n    meta_data = {\n        \"vocab_size\": asr_model.decoder.vocab_size,\n        \"normalize_type\": normalize_type,\n        \"subsampling_factor\": 8,\n        \"model_type\": \"EncDecHybridRNNTCTCBPEModel\",\n        \"version\": \"1\",\n        \"model_author\": \"NeMo\",\n        \"url\": 
\"https://huggingface.co/nvidia/parakeet-tdt_ctc-0.6b-ja\",\n        \"comment\": \"Only the CTC branch is exported\",\n        \"doc\": \"See https://huggingface.co/nvidia/parakeet-tdt_ctc-0.6b-ja\",\n    }\n\n    os.system(\"ls -lh *.onnx\")\n\n    quantize_dynamic(\n        model_input=\"./model.onnx\",\n        model_output=\"./model.int8.onnx\",\n        weight_type=QuantType.QUInt8,\n    )\n\n    add_meta_data(\"model.int8.onnx\", meta_data)\n\n    os.system(\"ls -lh *.onnx\")\n\n    print(\"preprocessor\", asr_model.cfg.preprocessor)\n    print(meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/parakeet-tdt_ctc-0.6b-ja/run-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\npython3 ./export-onnx-ctc.py\n\nls -lh *.onnx\n\nmkdir -p test_wavs\npushd test_wavs\ncurl -SL -O https://huggingface.co/csukuangfj/reazonspeech-k2-v2-ja-en/resolve/main/test_wavs/transcripts.txt\ncurl -SL -O https://hf-mirror.com/csukuangfj/reazonspeech-k2-v2-ja-en/resolve/main/test_wavs/test_ja_1.wav\ncurl -SL -O https://hf-mirror.com/csukuangfj/reazonspeech-k2-v2-ja-en/resolve/main/test_wavs/test_ja_2.wav\npopd\n\nd=sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8\n\nmkdir -p $d\nmv -v model.int8.onnx $d/\ncp -v tokens.txt $d/\ncp -av test_wavs $d\nls -lh $d\n\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/test_ja_1.wav\n\npython3 ./test-onnx-ctc-non-streaming.py \\\n  --model $d/model.int8.onnx \\\n  --tokens $d/tokens.txt \\\n  --wav $d/test_wavs/test_ja_2.wav\n"
  },
  {
    "path": "scripts/nemo/speaker-verification/README.md",
    "content": "# Introduction\n\nThis directory contains script for exporting speaker verification models\nfrom [NeMo](https://github.com/NVIDIA/NeMo/) to onnx\nso that you can use them in `sherpa-onnx`.\n\nSpecifically, the following 4 models are exported to `sherpa-onnx`\nfrom\n[this page](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_recognition/results.html#speaker-recognition-models):\n\n  - [titanet_large](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/titanet_large),\n  - [titanet_small](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/titanet_small)\n  - [speakerverification_speakernet](https://ngc.nvidia.com/catalog/models/nvidia:nemo:speakerverification_speakernet)\n  - [ecapa_tdnn](https://ngc.nvidia.com/catalog/models/nvidia:nemo:ecapa_tdnn)\n"
  },
  {
    "path": "scripts/nemo/speaker-verification/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Dict\n\nimport nemo.collections.asr as nemo_asr\nimport onnx\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        choices=[\n            \"speakerverification_speakernet\",\n            \"titanet_large\",\n            \"titanet_small\",\n            \"ecapa_tdnn\",\n        ],\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    speaker_model_config = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(\n        model_name=args.model, return_config=True\n    )\n    preprocessor_config = speaker_model_config[\"preprocessor\"]\n\n    print(args.model)\n    print(speaker_model_config)\n    print(preprocessor_config)\n\n    assert preprocessor_config[\"n_fft\"] == 512, preprocessor_config\n\n    assert (\n        preprocessor_config[\"_target_\"]\n        == \"nemo.collections.asr.modules.AudioToMelSpectrogramPreprocessor\"\n    ), preprocessor_config\n\n    assert preprocessor_config[\"frame_splicing\"] == 1, preprocessor_config\n\n    speaker_model = nemo_asr.models.EncDecSpeakerLabelModel.from_pretrained(\n        model_name=args.model\n    )\n    speaker_model.eval()\n    filename = f\"nemo_en_{args.model}.onnx\"\n    speaker_model.export(filename)\n\n    print(f\"Adding metadata 
to {filename}\")\n\n    comment = \"This model is from NeMo.\"\n    url = {\n        \"titanet_large\": \"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/titanet_large\",\n        \"titanet_small\": \"https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/titanet_small\",\n        \"speakerverification_speakernet\": \"https://ngc.nvidia.com/catalog/models/nvidia:nemo:speakerverification_speakernet\",\n        \"ecapa_tdnn\": \"https://ngc.nvidia.com/catalog/models/nvidia:nemo:ecapa_tdnn\",\n    }[args.model]\n\n    language = \"English\"\n\n    meta_data = {\n        \"framework\": \"nemo\",\n        \"language\": language,\n        \"url\": url,\n        \"comment\": comment,\n        \"sample_rate\": preprocessor_config[\"sample_rate\"],\n        \"output_dim\": speaker_model_config[\"decoder\"][\"emb_sizes\"],\n        \"feature_normalize_type\": preprocessor_config[\"normalize\"],\n        \"window_size_ms\": int(float(preprocessor_config[\"window_size\"]) * 1000),\n        \"window_stride_ms\": int(float(preprocessor_config[\"window_stride\"]) * 1000),\n        \"window_type\": preprocessor_config[\"window\"],  # e.g., hann\n        \"feat_dim\": preprocessor_config[\"features\"],\n    }\n    print(meta_data)\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/nemo/speaker-verification/test-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023-2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script computes speaker similarity score in the range [0-1]\nof two wave files using a speaker embedding model.\n\"\"\"\nimport argparse\nimport wave\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nfrom numpy.linalg import norm\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model. Example value: model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--file1\",\n        type=str,\n        required=True,\n        help=\"Input wave 1\",\n    )\n\n    parser.add_argument(\n        \"--file2\",\n        type=str,\n        required=True,\n        help=\"Input wave 2\",\n    )\n\n    return parser.parse_args()\n\n\ndef read_wavefile(filename, expected_sample_rate: int = 16000) -> np.ndarray:\n    \"\"\"\n    Args:\n      filename:\n        Path to a wave file, which must be of 16-bit and 16kHz.\n     expected_sample_rate:\n       Expected sample rate of the wave file.\n    Returns:\n      Return a 1-D float32 array containing audio samples. 
Each sample is in\n      the range [-1, 1].\n    \"\"\"\n    filename = str(filename)\n    with wave.open(filename) as f:\n        wave_file_sample_rate = f.getframerate()\n        assert wave_file_sample_rate == expected_sample_rate, (\n            wave_file_sample_rate,\n            expected_sample_rate,\n        )\n\n        num_channels = f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_int16 = samples_int16.reshape(-1, num_channels)[:, 0]\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n\n        return samples_float32\n\n\ndef compute_features(samples: np.ndarray, model: \"OnnxModel\") -> np.ndarray:\n    fbank_opts = knf.FbankOptions()\n    fbank_opts.frame_opts.samp_freq = model.sample_rate\n    fbank_opts.frame_opts.frame_length_ms = model.window_size_ms\n    fbank_opts.frame_opts.frame_shift_ms = model.window_stride_ms\n    fbank_opts.frame_opts.dither = 0\n    fbank_opts.frame_opts.remove_dc_offset = False\n    fbank_opts.frame_opts.window_type = model.window_type\n\n    fbank_opts.mel_opts.num_bins = model.feat_dim\n    fbank_opts.mel_opts.low_freq = 0\n    fbank_opts.mel_opts.is_librosa = True\n\n    fbank = knf.OnlineFbank(fbank_opts)\n    fbank.accept_waveform(model.sample_rate, samples)\n    fbank.input_finished()\n\n    features = []\n    for i in range(fbank.num_frames_ready):\n        f = fbank.get_frame(i)\n        features.append(f)\n    features = np.stack(features, axis=0)\n    # at this point, the shape of features is (T, C)\n\n    if model.feature_normalize_type != \"\":\n        assert model.feature_normalize_type == \"per_feature\"\n        mean = np.mean(features, axis=0, keepdims=True)\n        std = np.std(features, axis=0, keepdims=True)\n        features = (features - 
mean) / std\n\n    feature_len = features.shape[0]\n    pad = 16 - feature_len % 16\n\n    if pad > 0:\n        padding = np.zeros((pad, features.shape[1]), dtype=np.float32)\n        features = np.concatenate([features, padding])\n\n    features = np.expand_dims(features, axis=0)\n\n    return features, feature_len\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.framework = meta[\"framework\"]\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.output_dim = int(meta[\"output_dim\"])\n        self.feature_normalize_type = meta[\"feature_normalize_type\"]\n        self.window_size_ms = int(meta[\"window_size_ms\"])\n        self.window_stride_ms = int(meta[\"window_stride_ms\"])\n        self.window_type = meta[\"window_type\"]\n        self.feat_dim = int(meta[\"feat_dim\"])\n        print(meta)\n\n        assert self.framework == \"nemo\", self.framework\n\n    def __call__(self, x: np.ndarray, x_lens: int) -> np.ndarray:\n        \"\"\"\n        Args:\n          x:\n            A 2-D float32 tensor of shape (T, C).\n          y:\n            A 1-D float32 tensor containing model output.\n        \"\"\"\n        x = x.transpose(0, 2, 1)  # (B, T, C) -> (B, C, T)\n        x_lens = np.asarray([x_lens], dtype=np.int64)\n\n        return self.model.run(\n            [\n                self.model.get_outputs()[1].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: x_lens,\n            },\n        )[0][0]\n\n\ndef main():\n    args = get_args()\n   
 print(args)\n    filename = Path(args.model)\n    file1 = Path(args.file1)\n    file2 = Path(args.file2)\n    assert filename.is_file(), filename\n    assert file1.is_file(), file1\n    assert file2.is_file(), file2\n\n    model = OnnxModel(filename)\n    wave1 = read_wavefile(file1, model.sample_rate)\n    wave2 = read_wavefile(file2, model.sample_rate)\n\n    features1, features1_len = compute_features(wave1, model)\n    features2, features2_len = compute_features(wave2, model)\n\n    output1 = model(features1, features1_len)\n    output2 = model(features2, features2_len)\n\n    # Cosine similarity of the two speaker embeddings; it lies in [-1, 1]\n    similarity = np.dot(output1, output2) / (norm(output1) * norm(output2))\n    print(f\"cosine similarity in the range [-1, 1]: {similarity}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/node-addon-api/.gitignore",
    "content": "docs\n"
  },
  {
    "path": "scripts/node-addon-api/CMakeLists.txt",
    "content": "# See also https://github.com/cmake-js/cmake-js\n# npm install cmake-js\n# ./node_modules/.bin/cmake-js --help\n# ./node_modules/.bin/cmake-js --version\n# ./node_modules/.bin/cmake-js compile --help\n# ./node_modules/.bin/cmake-js compile --log-level\n# ./node_modules/.bin/cmake-js compile --log-level verbose\ncmake_minimum_required(VERSION 3.15)\ncmake_policy(SET CMP0091 NEW)\ncmake_policy(SET CMP0042 NEW)\n\nproject(sherpa-onnx)\n\nset(CMAKE_CXX_STANDARD 17)\n\nif(NOT WIN32)\n  set(CMAKE_SKIP_BUILD_RPATH FALSE)\n  set(BUILD_RPATH_USE_ORIGIN TRUE)\n  set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)\nendif()\n\nif(NOT APPLE)\n  set(SHERPA_ONNX_RPATH_ORIGIN \"$ORIGIN\")\nelse()\n  set(SHERPA_ONNX_RPATH_ORIGIN \"@loader_path\")\nendif()\n\nif(NOT WIN32)\n  set(CMAKE_INSTALL_RPATH ${SHERPA_ONNX_RPATH_ORIGIN})\n  set(CMAKE_BUILD_RPATH ${SHERPA_ONNX_RPATH_ORIGIN})\nendif()\n\ninclude_directories(${CMAKE_JS_INC})\n\nset(srcs\n  src/audio-tagging.cc\n  src/keyword-spotting.cc\n  src/non-streaming-asr.cc\n  src/non-streaming-speaker-diarization.cc\n  src/non-streaming-speech-denoiser.cc\n  src/non-streaming-tts.cc\n  src/offline-punctuation.cc\n  src/online-punctuation.cc\n  src/streaming-speech-denoiser.cc\n  src/sherpa-onnx-node-addon-api.cc\n  src/speaker-identification.cc\n  src/spoken-language-identification.cc\n  src/streaming-asr.cc\n  src/vad.cc\n  src/version.cc\n  src/wave-reader.cc\n  src/wave-writer.cc\n)\n\nif(NOT DEFINED ENV{SHERPA_ONNX_INSTALL_DIR})\n  message(FATAL_ERROR \"\nPlease run:\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\nmkdir build\ncd build\ncmake -DBUILD_SHARED_LIBS=ON -DCMAKE_INSTALL_PREFIX=./install ..\nmake install\nexport SHERPA_ONNX_INSTALL_DIR=$PWD/install\n  \")\nendif()\n\ninclude_directories($ENV{SHERPA_ONNX_INSTALL_DIR}/include)\n\n# See https://nodejs.github.io/node-addon-examples/build-tools/cmake-js\n# Include Node-API wrappers\nexecute_process(\n  COMMAND node -p 
\"require('node-addon-api').include\"\n    WORKING_DIRECTORY ${CMAKE_SOURCE_DIR}\n    OUTPUT_VARIABLE NODE_ADDON_API_DIR\n)\n\nstring(REPLACE \"\\n\" \"\" NODE_ADDON_API_DIR ${NODE_ADDON_API_DIR})\nstring(REPLACE \"\\\"\" \"\" NODE_ADDON_API_DIR ${NODE_ADDON_API_DIR})\ninclude_directories(${NODE_ADDON_API_DIR})\n\nlink_directories($ENV{SHERPA_ONNX_INSTALL_DIR}/lib)\n\nadd_library(${PROJECT_NAME} SHARED ${srcs} ${CMAKE_JS_SRC})\nset_target_properties(${PROJECT_NAME} PROPERTIES PREFIX \"\" SUFFIX \".node\")\ntarget_link_libraries(${PROJECT_NAME} ${CMAKE_JS_LIB})\n\ntarget_link_libraries(${PROJECT_NAME}\n  sherpa-onnx-c-api\n  onnxruntime\n  -Wl,-rpath,$ENV{SHERPA_ONNX_INSTALL_DIR}/lib\n  -Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}\n)\n\nif(MSVC AND CMAKE_JS_NODELIB_DEF AND CMAKE_JS_NODELIB_TARGET)\n  # Generate node.lib\n  execute_process(COMMAND ${CMAKE_AR} /def:${CMAKE_JS_NODELIB_DEF} /out:${CMAKE_JS_NODELIB_TARGET} ${CMAKE_STATIC_LINKER_FLAGS})\nendif()\n"
  },
  {
    "path": "scripts/node-addon-api/README.md",
    "content": "# Introduction\n\nThis folder contains `node-addon-api` wrapper for `sherpa-onnx`.\n\nCaution: This folder is for developer only.\n\n## Usage\n\n```bash\ngit clone https://github.com/k2-fsa/sherpa-onnx\ncd sherpa-onnx\nmkdir build\ncd build\ncmake -DCMAKE_INSTALL_PREFIX=./install -DBUILD_SHARED_LIBS=ON ..\nmake -j install\nexport PKG_CONFIG_PATH=$PWD/install:$PKG_CONFIG_PATH\ncd ../scripts/node-addon-api/\nnpm i\n./node_modules/.bin/cmake-js compile --log-level verbose\n\n# see test/test_asr_streaming_transducer.js\n# for usages\n```\n"
  },
  {
    "path": "scripts/node-addon-api/lib/addon-static-import.js",
    "content": "const os = require('os');\r\n\r\nlet addon = null;\r\n\r\nconst platform = os.platform() === 'win32' ? 'win' : os.platform();\r\nconst arch = os.arch();\r\n\r\ntry {\r\n  if (arch === 'x64') {\r\n    if (platform === 'win') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-win-x64/sherpa-onnx.node')\r\n    } else if (platform === 'darwin') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-darwin-x64/sherpa-onnx.node')\r\n    } else if (platform === 'linux') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-linux-x64/sherpa-onnx.node')\r\n    }\r\n  } else if (arch === 'arm64') {\r\n    if (platform === 'darwin') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-darwin-arm64/sherpa-onnx.node')\r\n    } else if (platform === 'linux') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-linux-arm64/sherpa-onnx.node')\r\n    }\r\n  } else if (arch === 'ia32') {\r\n    if (platform === 'win') {\r\n      // @ts-expect-error\r\n      addon = require('../sherpa-onnx-win-ia32/sherpa-onnx.node')\r\n    }\r\n  }\r\n} catch (error) {\r\n  //\r\n}\r\n\r\nif (!addon) {\r\n  try {\r\n    if (arch === 'x64') {\r\n      if (platform === 'win') {\r\n        // @ts-expect-error\r\n        addon = require('./node_modules/sherpa-onnx-win-x64/sherpa-onnx.node')\r\n      } else if (platform === 'darwin') {\r\n        // @ts-expect-error\r\n        addon = require('./node_modules/sherpa-onnx-darwin-x64/sherpa-onnx.node')\r\n      } else if (platform === 'linux') {\r\n        // @ts-expect-error\r\n        addon = require('./node_modules/sherpa-onnx-linux-x64/sherpa-onnx.node')\r\n      }\r\n    } else if (arch === 'arm64') {\r\n      if (platform === 'darwin') {\r\n        // @ts-expect-error\r\n        addon = require('./node_modules/sherpa-onnx-darwin-arm64/sherpa-onnx.node')\r\n      } else if (platform === 'linux') {\r\n        // @ts-expect-error\r\n        addon 
= require('./node_modules/sherpa-onnx-linux-arm64/sherpa-onnx.node')\r\n      }\r\n    } else if (arch === 'ia32') {\r\n      if (platform === 'win') {\r\n        // @ts-expect-error\r\n        addon = require('./node_modules/sherpa-onnx-win-ia32/sherpa-onnx.node')\r\n      }\r\n    }\r\n  } catch (error) {\r\n    //\r\n  }\r\n}\r\n \r\nmodule.exports = addon;"
  },
  {
    "path": "scripts/node-addon-api/lib/addon.js",
    "content": "/** @typedef {import('./types').WaveObject} WaveObject */\n\nconst os = require('os');\nconst path = require('path');\nconst addonStaticImport = require('./addon-static-import');\n\n// Package name triggered spam for sherpa-onnx-win32-x64\n// so we have renamed it to sherpa-onnx-win-x64\nconst platform = os.platform() === 'win32' ? 'win' : os.platform();\nconst arch = os.arch();\nconst platform_arch = `${platform}-${arch}`;\nconst possible_paths = [\n  '../build/Release/sherpa-onnx.node',\n  '../build/Debug/sherpa-onnx.node',\n  `./node_modules/sherpa-onnx-${platform_arch}/sherpa-onnx.node`,\n  `../sherpa-onnx-${platform_arch}/sherpa-onnx.node`,\n  './sherpa-onnx.node',\n];\n\nlet addon = addonStaticImport;\n\nif (!addon) {\n  for (const p of possible_paths) {\n    try {\n      addon = require(p);\n      break;\n    } catch (error) {\n      // do nothing; try the next option\n      ;\n    }\n  }\n}\n\nmodule.exports = addon;\n\nif (!addon) {\n  let addon_path =\n      `${process.env.PWD}/node_modules/sherpa-onnx-${platform_arch}`;\n  const pnpmIndex = __dirname.indexOf(`node_modules${path.sep}.pnpm`);\n  if (pnpmIndex !== -1) {\n    const parts = __dirname.slice(pnpmIndex).split(path.sep);\n    parts.pop();\n    addon_path =\n        `${process.env.PWD}/${parts.join('/')}/sherpa-onnx-${platform_arch}`;\n  }\n\n  let msg = `Could not find sherpa-onnx-node. 
Tried\\n\\n  ${\n      possible_paths.join('\\n  ')}\\n`\n  if (os.platform() == 'darwin' &&\n      (!process.env.DYLD_LIBRARY_PATH ||\n       !process.env.DYLD_LIBRARY_PATH.includes(\n           `node_modules/sherpa-onnx-${platform_arch}`))) {\n    msg +=\n        'Please remember to set the following environment variable and try again:\\n';\n\n    msg += `export DYLD_LIBRARY_PATH=${addon_path}`;\n\n    msg += ':$DYLD_LIBRARY_PATH\\n';\n  }\n\n  if (os.platform() == 'linux' &&\n      (!process.env.LD_LIBRARY_PATH ||\n       !process.env.LD_LIBRARY_PATH.includes(\n           `node_modules/sherpa-onnx-${platform_arch}`))) {\n    msg +=\n        'Please remember to set the following environment variable and try again:\\n';\n\n    msg += `export LD_LIBRARY_PATH=${addon_path}`;\n\n    msg += ':$LD_LIBRARY_PATH\\n';\n  }\n\n  throw new Error(msg)\n}\n\n/**\n * Read a wave file from disk.\n * @function module.exports.readWave\n * @param {string} filename\n * @param {boolean} [enableExternalBuffer=true]\n * @returns {WaveObject}\n */\n\n/**\n * Read a wave from binary buffer.\n * @function module.exports.readWaveFromBinary\n * @param {Uint8Array} data - Binary contents of a wave file.\n * @param {boolean} [enableExternalBuffer=true]\n * @returns {WaveObject}\n */\n\n/**\n * Write a wave file to disk.\n * @function module.exports.writeWave\n * @param {string} filename\n * @param {WaveObject} obj - { samples: Float32Array, sampleRate: number }\n * @returns {boolean}\n */\n"
  },
  {
    "path": "scripts/node-addon-api/lib/audio-tagg.js",
    "content": "/** @typedef {import('./types').AudioTaggingConfig} AudioTaggingConfig */\n/** @typedef {import('./types').AudioEvent} AudioEvent */\n/** @typedef {import('./types').AudioTaggingHandle} AudioTaggingHandle */\n/** @typedef {import('./non-streaming-asr').OfflineStream} OfflineStream */\n\nconst addon = require('./addon.js');\nconst non_streaming_asr = require('./non-streaming-asr.js');\n\n/**\n * AudioTagging utility.\n * @class\n */\nclass AudioTagging {\n  /**\n   * Create an AudioTagging instance.\n   * @param {AudioTaggingConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createAudioTagging(config);\n    this.config = config;\n  }\n\n  /**\n   * Create an offline stream bound to this AudioTagging instance.\n   * @returns {OfflineStream}\n   */\n  createStream() {\n    return new non_streaming_asr.OfflineStream(\n        addon.audioTaggingCreateOfflineStream(this.handle));\n  }\n\n  /**\n   * Compute audio tags from an offline stream.\n   * @param {OfflineStream} stream - An offline stream created by `AudioTagging.createStream()`.\n   * @param {number} [topK=-1] - Return top K results; -1 for all.\n   * @returns {AudioEvent[]}\n   */\n  compute(stream, topK = -1) {\n    return addon.audioTaggingCompute(this.handle, stream.handle, topK);\n  }\n}\n\nmodule.exports = {\n  AudioTagging,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/keyword-spotter.js",
    "content": "/** @typedef {import('./types').KeywordSpotterConfig} KeywordSpotterConfig */\n/** @typedef {import('./types').KeywordResult} KeywordResult */\n/** @typedef {import('./streaming-asr').OnlineStream} OnlineStream */\n\nconst addon = require('./addon.js');\nconst streaming_asr = require('./streaming-asr.js');\n\n/**\n * KeywordSpotter handles keyword detection.\n */\nclass KeywordSpotter {\n  /**\n   * @param {KeywordSpotterConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createKeywordSpotter(config);\n    this.config = config\n  }\n\n  /**\n   * Create an OnlineStream for the spotter.\n   * @returns {OnlineStream}\n   */\n  createStream() {\n    const handle = addon.createKeywordStream(this.handle);\n    return new streaming_asr.OnlineStream(handle);\n  }\n\n  /**\n   * @param {OnlineStream} stream\n   * @returns {boolean}\n   */\n  isReady(stream) {\n    return addon.isKeywordStreamReady(this.handle, stream.handle);\n  }\n\n  /**\n   * Trigger decode on a stream.\n   * @param {OnlineStream} stream\n   */\n  decode(stream) {\n    addon.decodeKeywordStream(this.handle, stream.handle);\n  }\n\n  /**\n   * Reset a stream.\n   * @param {OnlineStream} stream\n   */\n  reset(stream) {\n    addon.resetKeywordStream(this.handle, stream.handle);\n  }\n\n  /**\n   * Get the keyword result for a stream.\n   * @param {OnlineStream} stream\n   * @returns {KeywordResult}\n   */\n  getResult(stream) {\n    const jsonStr = addon.getKeywordResultAsJson(this.handle, stream.handle);\n\n    return JSON.parse(jsonStr);\n  }\n}\n\nmodule.exports = {\n  KeywordSpotter,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/non-streaming-asr.js",
    "content": "/** @typedef {import('./types').OfflineStreamObject} OfflineStreamObject */\n/** @typedef {import('./types').Waveform} Waveform */\n/**\n * @typedef {import('./types').OfflineRecognizerConfig} OfflineRecognizerConfig\n */\n/**\n * @typedef {import('./types').OfflineRecognizerResult} OfflineRecognizerResult\n */\n\nconst addon = require('./addon.js');\n\n/**\n * Internal symbol to mark async-created recognizers.\n * Not accessible unless someone has a reference to this Symbol.\n */\nconst kFromAsyncFactory = Symbol('OfflineRecognizer.fromAsync');\n\n/**\n * OfflineStream represents a synchronous offline audio stream.\n */\nclass OfflineStream {\n  /**\n   * @param {OfflineStreamObject|Object} handle\n   */\n  constructor(handle) {\n    this.handle = handle;\n  }\n\n  /**\n   * Accept a chunk of waveform samples.\n   * @param {Waveform} obj - { samples: Float32Array, sampleRate: number }\n   */\n  acceptWaveform(obj) {\n    addon.acceptWaveformOffline(this.handle, obj);\n  }\n}\n\n/**\n * OfflineRecognizer wraps the native offline recognizer.\n */\nclass OfflineRecognizer {\n  /**\n   * Constructor (SYNC path).\n   *\n   * Users call:\n   *   new OfflineRecognizer(config)\n   *\n   * Async factory calls this with an internal descriptor.\n   *\n   * @param {OfflineRecognizerConfig | Object} configOrInternal\n   */\n  constructor(configOrInternal) {\n    // ----- async factory path -----\n    if (configOrInternal && typeof configOrInternal === 'object' &&\n        configOrInternal[kFromAsyncFactory]) {\n      this.handle = configOrInternal.handle;\n      this.config = configOrInternal.config;\n      return;\n    }\n\n    // ----- sync constructor path -----\n    this.config = configOrInternal;\n    this.handle = addon.createOfflineRecognizer(this.config);\n  }\n\n  /**\n   * Create an OfflineRecognizer asynchronously (non-blocking).\n   *\n   * @param {OfflineRecognizerConfig} config\n   * @returns {Promise<OfflineRecognizer>}\n   */\n  static async 
createAsync(config) {\n    const handle = await addon.createOfflineRecognizerAsync(config);\n\n    return new OfflineRecognizer({\n      [kFromAsyncFactory]: true,\n      handle,\n      config,\n    });\n  }\n\n  /**\n   * Create a new OfflineStream bound to this recognizer.\n   * @returns {OfflineStream}\n   */\n  createStream() {\n    const handle = addon.createOfflineStream(this.handle);\n    return new OfflineStream(handle);\n  }\n\n  /**\n   * Replace the recognizer config at runtime.\n   * @param {OfflineRecognizerConfig} config\n   */\n  setConfig(config) {\n    this.config = config;\n    addon.offlineRecognizerSetConfig(this.handle, config);\n  }\n\n  /**\n   * Decode an offline stream (synchronous).\n   * @param {OfflineStream} stream\n   */\n  decode(stream) {\n    addon.decodeOfflineStream(this.handle, stream.handle);\n  }\n\n  /**\n   * Decode an offline stream asynchronously (non-blocking).\n   * @param {OfflineStream} stream\n   * @returns {Promise<OfflineRecognizerResult>}\n   */\n  async decodeAsync(stream) {\n    const jsonStr =\n        await addon.decodeOfflineStreamAsync(this.handle, stream.handle);\n    return JSON.parse(jsonStr);\n  }\n\n  /**\n   * Get recognition result for a stream.\n   * @param {OfflineStream} stream\n   * @returns {OfflineRecognizerResult}\n   */\n  getResult(stream) {\n    const jsonStr = addon.getOfflineStreamResultAsJson(stream.handle);\n    return JSON.parse(jsonStr);\n  }\n}\n\nmodule.exports = {\n  OfflineRecognizer,\n  OfflineStream,\n};\n"
  },
  {
    "path": "scripts/node-addon-api/lib/non-streaming-speaker-diarization.js",
    "content": "/** @typedef {import('./types').OfflineSpeakerDiarizationConfig} OfflineSpeakerDiarizationConfig */\n/** @typedef {import('./types').SpeakerDiarizationSegment} SpeakerDiarizationSegment */\n\nconst addon = require('./addon.js');\n\nclass OfflineSpeakerDiarization {\n  /**\n   * @param {OfflineSpeakerDiarizationConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createOfflineSpeakerDiarization(config);\n    this.config = config;\n\n    this.sampleRate = addon.getOfflineSpeakerDiarizationSampleRate(this.handle);\n  }\n\n  /**\n   * @param {Float32Array} samples - 1-D float32 array in [-1, 1]\n   * @returns {SpeakerDiarizationSegment[]}\n   */\n  process(samples) {\n    return addon.offlineSpeakerDiarizationProcess(this.handle, samples);\n  }\n\n  /**\n   * Set clustering configuration.\n   * @param {{clustering: import('./types').FastClusteringConfig}} config\n   */\n  setConfig(config) {\n    addon.offlineSpeakerDiarizationSetConfig(this.handle, config);\n    this.config.clustering = config.clustering;\n  }\n}\n\nmodule.exports = {\n  OfflineSpeakerDiarization,\n} "
  },
  {
    "path": "scripts/node-addon-api/lib/non-streaming-speech-denoiser.js",
    "content": "/** @typedef {import('./types').OfflineSpeechDenoiserConfig} OfflineSpeechDenoiserConfig */\n/** @typedef {import('./types').GeneratedAudio} GeneratedAudio */\n/** @typedef {import('./types').AudioProcessRequest} AudioProcessRequest */\n\nconst addon = require('./addon.js');\n\nclass OfflineSpeechDenoiser {\n  /**\n   * @param {OfflineSpeechDenoiserConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createOfflineSpeechDenoiser(config);\n    this.config = config;\n\n    this.sampleRate =\n        addon.offlineSpeechDenoiserGetSampleRateWrapper(this.handle);\n  }\n\n  /**\n   * Run denoiser synchronously.\n   * @param {AudioProcessRequest} obj - { samples: Float32Array, sampleRate: number, enableExternalBuffer?: boolean }\n   * @returns {GeneratedAudio}\n   */\n  run(obj) {\n    return addon.offlineSpeechDenoiserRunWrapper(this.handle, obj);\n  }\n}\n\nmodule.exports = {\n  OfflineSpeechDenoiser,\n} "
  },
  {
    "path": "scripts/node-addon-api/lib/non-streaming-tts.js",
    "content": "/** @typedef {import('./types').OfflineTtsConfig} OfflineTtsConfig */\n/** @typedef {import('./types').TtsRequest} TtsRequest */\n/** @typedef {import('./types').GeneratedAudio} GeneratedAudio */\n\nconst addon = require('./addon.js');\n\n/**\n * Internal symbol to mark async-created TTS instances.\n */\nconst kFromAsyncFactory = Symbol('OfflineTts.fromAsync');\n\n\nclass GenerationConfig {\n  constructor(opts = {}) {\n    Object.assign(this, opts);\n  }\n}\n\n\nclass OfflineTts {\n  /**\n   * Constructor (sync path).\n   *\n   * Users call:\n   *   new OfflineTts(config)\n   *\n   * Async factory calls this with an internal descriptor.\n   *\n   * @param {OfflineTtsConfig|Object} configOrInternal\n   */\n  constructor(configOrInternal) {\n    if (configOrInternal && typeof configOrInternal === 'object' &&\n        configOrInternal[kFromAsyncFactory]) {\n      // ----- async factory path -----\n      this.handle = configOrInternal.handle;\n      this.config = configOrInternal.config;\n    } else {\n      // ----- sync constructor path -----\n      this.config = configOrInternal;\n      this.handle = addon.createOfflineTts(this.config);\n    }\n\n    // Common initialization\n    this.numSpeakers = addon.getOfflineTtsNumSpeakers(this.handle);\n    this.sampleRate = addon.getOfflineTtsSampleRate(this.handle);\n  }\n\n  /**\n   * Create an OfflineTts asynchronously (non-blocking).\n   * @param {OfflineTtsConfig} config\n   * @returns {Promise<OfflineTts>}\n   */\n  static async createAsync(config) {\n    const handle = await addon.createOfflineTtsAsync(config);\n    return new OfflineTts({\n      [kFromAsyncFactory]: true,\n      handle,\n      config,\n    });\n  }\n\n  /**\n   * Generate audio synchronously.\n   * @param {TtsRequest} obj\n   * @returns {GeneratedAudio}\n   */\n  generate(obj) {\n    if (!obj || typeof obj !== 'object') {\n      throw new TypeError('generate() expects an object');\n    }\n\n    // If generationConfig is present, use 
new API\n    if (obj.generationConfig !== undefined) {\n      return addon.offlineTtsGenerateWithConfig(this.handle, obj);\n    }\n\n    // Fallback to legacy path\n    return addon.offlineTtsGenerate(this.handle, obj);\n  }\n  /**\n   * Generate audio asynchronously with optional generationConfig and progress\n   * callback\n   *\n   * The progress callback receives streaming audio chunks.\n   *\n   * @param {TtsRequest & { generationConfig?: object, onProgress?: (info: {\n   *     samples: Float32Array, progress: number }) => number | boolean | void\n   *     }} obj\n   * @returns {Promise<GeneratedAudio>}\n   */\n  generateAsync(obj) {\n    const {onProgress, ...rest} = obj;\n\n    const hasConfig = obj.generationConfig !== undefined;\n\n    const fn = hasConfig ? addon.offlineTtsGenerateAsyncWithConfig :\n                           addon.offlineTtsGenerateAsync;\n\n    return fn(this.handle, {\n      ...rest,\n      callback: typeof onProgress === 'function' ?\n          (info) => {\n            const ret = onProgress(info);\n            return ret === 0 || ret === false ? 0 : 1;\n          } :\n          undefined,\n    });\n  }\n}\n\n\nmodule.exports = {\n  OfflineTts,\n  GenerationConfig,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/online-speech-denoiser.js",
    "content": "/** @typedef {import('./types').OnlineSpeechDenoiserConfig} OnlineSpeechDenoiserConfig */\n/** @typedef {import('./types').GeneratedAudio} GeneratedAudio */\n/** @typedef {import('./types').AudioProcessRequest} AudioProcessRequest */\n\nconst addon = require('./addon.js');\n\nclass OnlineSpeechDenoiser {\n  /**\n   * @param {OnlineSpeechDenoiserConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createOnlineSpeechDenoiser(config);\n    this.config = config;\n\n    this.sampleRate =\n        addon.onlineSpeechDenoiserGetSampleRateWrapper(this.handle);\n    this.frameShiftInSamples =\n        addon.onlineSpeechDenoiserGetFrameShiftInSamplesWrapper(this.handle);\n  }\n\n  /**\n   * @param {AudioProcessRequest} obj\n   * @returns {GeneratedAudio}\n   */\n  run(obj) {\n    return addon.onlineSpeechDenoiserRunWrapper(this.handle, obj);\n  }\n\n  /**\n   * @param {boolean} [enableExternalBuffer=true]\n   * @returns {GeneratedAudio}\n   */\n  flush(enableExternalBuffer = true) {\n    return addon.onlineSpeechDenoiserFlushWrapper(\n        this.handle, enableExternalBuffer);\n  }\n\n  reset() {\n    addon.onlineSpeechDenoiserResetWrapper(this.handle);\n  }\n}\n\nmodule.exports = {\n  OnlineSpeechDenoiser,\n};\n"
  },
  {
    "path": "scripts/node-addon-api/lib/punctuation.js",
    "content": "/** @typedef {import('./types').OfflinePunctuationHandle} OfflinePunctuationHandle */\n/** @typedef {import('./types').OfflinePunctuationConfig} OfflinePunctuationConfig */\n/** @typedef {import('./types').OnlinePunctuationConfig} OnlinePunctuationConfig */\n\nconst addon = require('./addon.js');\n\nclass OfflinePunctuation {\n  /**\n   * @param {OfflinePunctuationConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createOfflinePunctuation(config);\n    this.config = config;\n  }\n  /**\n   * Add punctuation to `text` and return the punctuated text.\n   * @param {string} text\n   * @returns {string}\n   */\n  addPunct(text) {\n    return addon.offlinePunctuationAddPunct(this.handle, text);\n  }\n}\n\nclass OnlinePunctuation {\n  /**\n   * @param {OnlinePunctuationConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createOnlinePunctuation(config);\n    this.config = config;\n  }\n  /**\n   * Add punctuation to `text` and return the punctuated text.\n   * @param {string} text\n   * @returns {string}\n   */\n  addPunct(text) {\n    return addon.onlinePunctuationAddPunct(this.handle, text);\n  }\n}\n\nmodule.exports = {\n  OfflinePunctuation,\n  OnlinePunctuation,\n};\n"
  },
  {
    "path": "scripts/node-addon-api/lib/sherpa-onnx.js",
    "content": "/** @typedef {import('./types').WaveObject} WaveObject */\n/**\n * @typedef {import('./types').OnlineRecognizerResult} OnlineRecognizerResult\n */\n/**\n * @typedef {import('./types').OfflineRecognizerResult} OfflineRecognizerResult\n */\n\nconst addon = require('./addon.js')\nconst streaming_asr = require('./streaming-asr.js');\nconst non_streaming_asr = require('./non-streaming-asr.js');\nconst non_streaming_tts = require('./non-streaming-tts.js');\nconst vad = require('./vad.js');\nconst slid = require('./spoken-language-identification.js');\nconst sid = require('./speaker-identification.js');\nconst at = require('./audio-tagg.js');\nconst punct = require('./punctuation.js');\nconst kws = require('./keyword-spotter.js');\nconst sd = require('./non-streaming-speaker-diarization.js');\nconst speech_denoiser = require('./non-streaming-speech-denoiser.js');\nconst online_speech_denoiser = require('./online-speech-denoiser.js');\n\nmodule.exports = {\n  OnlineRecognizer : streaming_asr.OnlineRecognizer,\n  OfflineRecognizer : non_streaming_asr.OfflineRecognizer,\n  OfflineTts : non_streaming_tts.OfflineTts,\n  GenerationConfig : non_streaming_tts.GenerationConfig,\n  readWave : addon.readWave,\n  writeWave : addon.writeWave,\n  Display : streaming_asr.Display,\n  Vad : vad.Vad,\n  CircularBuffer : vad.CircularBuffer,\n  SpokenLanguageIdentification : slid.SpokenLanguageIdentification,\n  SpeakerEmbeddingExtractor : sid.SpeakerEmbeddingExtractor,\n  SpeakerEmbeddingManager : sid.SpeakerEmbeddingManager,\n  AudioTagging : at.AudioTagging,\n  OfflinePunctuation : punct.OfflinePunctuation,\n  OnlinePunctuation : punct.OnlinePunctuation,\n  KeywordSpotter : kws.KeywordSpotter,\n  OfflineSpeakerDiarization : sd.OfflineSpeakerDiarization,\n  OfflineSpeechDenoiser : speech_denoiser.OfflineSpeechDenoiser,\n  OnlineSpeechDenoiser : online_speech_denoiser.OnlineSpeechDenoiser,\n  version : addon.version,\n  gitSha1 : addon.gitSha1,\n  gitDate : 
addon.gitDate,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/speaker-identification.js",
    "content": "/** @typedef {import('./types').SpeakerEmbeddingEntry} SpeakerEmbeddingEntry */\n/** @typedef {import('./types').SpeakerEmbeddingManagerSearchObj} SpeakerEmbeddingManagerSearchObj */\n/** @typedef {import('./types').SpeakerEmbeddingManagerVerifyObj} SpeakerEmbeddingManagerVerifyObj */\n/** @typedef {import('./types').SpeakerEmbeddingExtractorConfig} SpeakerEmbeddingExtractorConfig */\n/** @typedef {import('./streaming-asr').OnlineStream} OnlineStream */\n\nconst addon = require('./addon.js');\nconst streaming_asr = require('./streaming-asr.js');\n\n/**\n * SpeakerEmbeddingExtractor wraps native speaker embedding extractor.\n */\nclass SpeakerEmbeddingExtractor {\n  /**\n   * @param {SpeakerEmbeddingExtractorConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createSpeakerEmbeddingExtractor(config);\n    this.config = config;\n    this.dim = addon.speakerEmbeddingExtractorDim(this.handle);\n  }\n\n  /**\n   * @returns {OnlineStream}\n   */\n  createStream() {\n    return new streaming_asr.OnlineStream(\n        addon.speakerEmbeddingExtractorCreateStream(this.handle));\n  }\n\n  /**\n   * @param {OnlineStream} stream\n   * @returns {boolean}\n   */\n  isReady(stream) {\n    return addon.speakerEmbeddingExtractorIsReady(this.handle, stream.handle);\n  }\n\n  /**\n   * Compute embedding and return a Float32Array\n   * @param {OnlineStream} stream\n   * @param {boolean} [enableExternalBuffer=true]\n   * @returns {Float32Array}\n   */\n  compute(stream, enableExternalBuffer = true) {\n    return addon.speakerEmbeddingExtractorComputeEmbedding(\n        this.handle, stream.handle, enableExternalBuffer);\n  }\n}\n\n/**\n * Flattens an array of Float32Arrays into a single Float32Array.\n * @param {Float32Array[]} arrayList\n * @returns {Float32Array}\n */\nfunction flatten(arrayList) {\n  let n = 0;\n  for (let i = 0; i < arrayList.length; ++i) {\n    n += arrayList[i].length;\n  }\n  let ans = new Float32Array(n);\n\n  let offset = 0;\n 
 for (let i = 0; i < arrayList.length; ++i) {\n    ans.set(arrayList[i], offset);\n    offset += arrayList[i].length;\n  }\n  return ans;\n}\n\n/**\n * Manager for speaker embeddings.\n */\nclass SpeakerEmbeddingManager {\n  /**\n   * @param {number} dim - The embedding dimension\n   */\n  constructor(dim) {\n    this.handle = addon.createSpeakerEmbeddingManager(dim);\n    this.dim = dim;\n  }\n\n  /**\n   * @param {SpeakerEmbeddingEntry} obj\n   * @returns {boolean}\n   */\n  add(obj) {\n    return addon.speakerEmbeddingManagerAdd(this.handle, obj);\n  }\n\n  /**\n   * @param {{name:string, v: Float32Array[]}} obj\n   * @returns {boolean}\n   */\n  addMulti(obj) {\n    const c = {\n      name: obj.name,\n      vv: flatten(obj.v),\n      n: obj.v.length,\n    };\n    return addon.speakerEmbeddingManagerAddListFlattened(this.handle, c);\n  }\n\n  /**\n   * @param {string} name\n   * @returns {boolean}\n   */\n  remove(name) {\n    return addon.speakerEmbeddingManagerRemove(this.handle, name);\n  }\n\n  /**\n   * @param {SpeakerEmbeddingManagerSearchObj} obj\n   * @returns {string}\n   */\n  search(obj) {\n    return addon.speakerEmbeddingManagerSearch(this.handle, obj);\n  }\n\n  /**\n   * @param {SpeakerEmbeddingManagerVerifyObj} obj\n   * @returns {boolean}\n   */\n  verify(obj) {\n    return addon.speakerEmbeddingManagerVerify(this.handle, obj);\n  }\n\n  /**\n   * @param {string} name\n   * @returns {boolean}\n   */\n  contains(name) {\n    return addon.speakerEmbeddingManagerContains(this.handle, name);\n  }\n\n  /** @returns {number} */\n  getNumSpeakers() {\n    return addon.speakerEmbeddingManagerNumSpeakers(this.handle);\n  }\n\n  /** @returns {string[]} */\n  getAllSpeakerNames() {\n    return addon.speakerEmbeddingManagerGetAllSpeakers(this.handle);\n  }\n}\n\nmodule.exports = {\n  SpeakerEmbeddingExtractor,\n  SpeakerEmbeddingManager,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/spoken-language-identification.js",
    "content": "/** @typedef {import('./types').SpokenLanguageIdentificationConfig} SpokenLanguageIdentificationConfig */\n/** @typedef {import('./non-streaming-asr').OfflineStream} OfflineStream */\n\nconst addon = require('./addon.js');\nconst non_streaming_asr = require('./non-streaming-asr.js');\n\nclass SpokenLanguageIdentification {\n  /**\n   * @param {SpokenLanguageIdentificationConfig} config\n   */\n  constructor(config) {\n    this.handle = addon.createSpokenLanguageIdentification(config);\n    this.config = config;\n  }\n\n  /**\n   * @returns {OfflineStream}\n   */\n  createStream() {\n    return new non_streaming_asr.OfflineStream(\n        addon.createSpokenLanguageIdentificationOfflineStream(this.handle));\n  }\n\n  /**\n   * Return a 2-letter language code, e.g. 'en', 'de', 'fr', 'es', 'zh'.\n   * @param {OfflineStream} stream\n   * @returns {string}\n   */\n  compute(stream) {\n    return addon.spokenLanguageIdentificationCompute(\n        this.handle, stream.handle);\n  }\n}\n\nmodule.exports = {\n  SpokenLanguageIdentification,\n};\n"
  },
  {
    "path": "scripts/node-addon-api/lib/streaming-asr.js",
    "content": "/** @typedef {import('./types').OnlineStreamObject} OnlineStreamObject */\n/** @typedef {import('./types').OnlineRecognizerHandle} OnlineRecognizerHandle */\n/** @typedef {import('./types').DisplayObject} DisplayObject */\n/** @typedef {import('./types').OnlineRecognizerConfig} OnlineRecognizerConfig */\n/** @typedef {import('./types').Waveform} Waveform */\n/** @typedef {import('./types').OnlineRecognizerResult} OnlineRecognizerResult */\n\nconst addon = require('./addon.js');\n\n/**\n * Display helper for printing recognized words.\n */\nclass Display {\n  /**\n   * @param {number} maxWordPerline\n   */\n  constructor(maxWordPerline) {\n    this.handle = addon.createDisplay(maxWordPerline);\n  }\n\n  /**\n   * Print text to display.\n   * @param {number} idx\n   * @param {string} text\n   */\n  print(idx, text) {\n    addon.print(this.handle, idx, text)\n  }\n}\n\n/**\n * OnlineStream holds an active online stream handle.\n */\nclass OnlineStream {\n  /**\n   * @param {OnlineStreamObject|Object} handle - object with `handle` property\n   */\n  constructor(handle) {\n    this.handle = handle;\n  }\n\n  /**\n   * Accept waveform data\n   * @param {Waveform} obj - { samples: Float32Array, sampleRate: number }\n   */\n  acceptWaveform(obj) {\n    addon.acceptWaveformOnline(this.handle, obj)\n  }\n\n  /** Notify the stream input has finished. 
*/\n  inputFinished() {\n    addon.inputFinished(this.handle)\n  }\n}\n\n/**\n * OnlineRecognizer wraps native online recognizer.\n */\nclass OnlineRecognizer {\n  /**\n   * @param {OnlineRecognizerConfig} config - online recognizer config (see C++ for fields)\n   */\n  constructor(config) {\n    this.handle = addon.createOnlineRecognizer(config);\n    this.config = config\n  }\n\n  /**\n   * Create a new OnlineStream.\n   * @returns {OnlineStream}\n   */\n  createStream() {\n    const handle = addon.createOnlineStream(this.handle);\n    return new OnlineStream(handle);\n  }\n\n  /**\n   * Check whether a stream is ready.\n   * @param {OnlineStream} stream\n   * @returns {boolean}\n   */\n  isReady(stream) {\n    return addon.isOnlineStreamReady(this.handle, stream.handle);\n  }\n\n  /**\n   * Trigger decoding on a stream.\n   * @param {OnlineStream} stream\n   */\n  decode(stream) {\n    addon.decodeOnlineStream(this.handle, stream.handle);\n  }\n\n  /**\n   * Check endpoint condition for a stream.\n   * @param {OnlineStream} stream\n   * @returns {boolean}\n   */\n  isEndpoint(stream) {\n    return addon.isEndpoint(this.handle, stream.handle);\n  }\n\n  /**\n   * Reset a stream.\n   * @param {OnlineStream} stream\n   */\n  reset(stream) {\n    addon.reset(this.handle, stream.handle);\n  }\n\n  /**\n   * Get recognition result for a stream.\n   * @param {OnlineStream} stream\n   * @returns {OnlineRecognizerResult}\n   */\n  getResult(stream) {\n    const jsonStr =\n        addon.getOnlineStreamResultAsJson(this.handle, stream.handle);\n\n    return JSON.parse(jsonStr);\n  }\n}\n\nmodule.exports = {\n  OnlineRecognizer,\n  OnlineStream,\n  Display\n}\n"
  },
  {
    "path": "scripts/node-addon-api/lib/types.js",
    "content": "/**\r\n * Centralized JSDoc typedefs for the Node addon API.\r\n * These typedefs mirror the shapes produced/consumed by the C++ bindings\r\n * in `scripts/node-addon-api/src/*` and by the underlying SherpaOnnx C API.\r\n *\r\n * Keep these typedefs specialized (no `any`/`unknown`) and concise.\r\n */\r\n\r\n/**\r\n * Opaque handle types returned by native constructors. These are opaque\r\n * JavaScript objects backed by native pointers. Do not introspect or\r\n * mutate their internals; pass them to the API functions as-is.\r\n *\r\n * @typedef {Object} OfflineStreamHandle\r\n * @see src/non-streaming-asr.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} OnlineStreamHandle\r\n * @see src/streaming-asr.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflineRecognizerHandle\r\n * @see src/non-streaming-asr.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} OnlineRecognizerHandle\r\n * @see src/streaming-asr.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} DisplayHandle\r\n * @see src/streaming-asr.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} CircularBufferHandle\r\n * @see src/vad.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} VoiceActivityDetectorHandle\r\n * @see src/vad.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} AudioTaggingHandle\r\n * @see src/audio-tagging.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflinePunctuationHandle\r\n * @see src/offline-punctuation.cc\n */\r\n\r\n/**\r\n * A single audio event returned by AudioTagging.compute().\r\n * @typedef {Object} AudioEvent\r\n * @property {string} name - The event name.\r\n * @property {number} prob - Probability in [0,1].\r\n * @property {number} index - Index (integer) of the event.\r\n */\r\n\r\n/**\r\n * AudioTagging specific model config for Zipformer variant\r\n * @typedef {Object} AudioTaggingZipformerModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * AudioTagging model config.\r\n * @typedef {Object} AudioTaggingModelConfig\r\n * @property {AudioTaggingZipformerModelConfig} 
[zipformer]\r\n * @property {string} [ced]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * AudioTagging configuration passed to constructor.\r\n * @typedef {Object} AudioTaggingConfig\r\n * @property {AudioTaggingModelConfig} [model]\r\n * @property {string} [labels]\r\n * @property {number} [topK]\r\n */\r\n\r\n/**\r\n * Waveform input object used by acceptWaveform methods.\r\n * @typedef {Object} Waveform\r\n * @property {Float32Array} samples - Float32Array of samples in [-1, 1].\r\n * @property {number} sampleRate - Sample rate as an integer (e.g., 16000).\r\n */\r\n\r\n/**\r\n * Feature config used by recognizers and models.\r\n * @typedef {Object} FeatureConfig\r\n * @property {number} [sampleRate]\r\n * @property {number} [featureDim]\r\n */\r\n\r\n/**\r\n * Silero VAD model config\r\n * @typedef {Object} SileroVadModelConfig\r\n * @property {string} [model]\r\n * @property {number} [threshold]\r\n * @property {number} [minSilenceDuration]\r\n * @property {number} [minSpeechDuration]\r\n * @property {number} [windowSize]\r\n * @property {number} [maxSpeechDuration]\r\n */\r\n\r\n/**\r\n * Ten-VAD model config\r\n * @typedef {Object} TenVadModelConfig\r\n * @property {string} [model]\r\n * @property {number} [threshold]\r\n * @property {number} [minSilenceDuration]\r\n * @property {number} [minSpeechDuration]\r\n * @property {number} [windowSize]\r\n * @property {number} [maxSpeechDuration]\r\n */\r\n\r\n/**\r\n * Voice activity detector configuration.\r\n * @typedef {Object} VadConfig\r\n * @property {SileroVadModelConfig} [sileroVad]\r\n * @property {TenVadModelConfig} [tenVad]\r\n * @property {number} [sampleRate]\r\n * @property {number} [numThreads]\r\n * @property {string} [provider]\r\n * @property {boolean|number} [debug]\r\n */\r\n\r\n/**\r\n * Offline Transducer model config\r\n * @typedef {Object} OfflineTransducerModelConfig\r\n * @property {string} 
[encoder]\r\n * @property {string} [decoder]\r\n * @property {string} [joiner]\r\n */\r\n\r\n/**\r\n * Offline Paraformer model config\r\n * @typedef {Object} OfflineParaformerModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Zipformer CTC model config\r\n * @typedef {Object} OfflineZipformerCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Wenet CTC model config\r\n * @typedef {Object} OfflineWenetCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Omnilingual ASR CTC model config\r\n * @typedef {Object} OfflineOmnilingualAsrCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Med ASR CTC model config\r\n * @typedef {Object} OfflineMedAsrCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Dolphin model config\r\n * @typedef {Object} OfflineDolphinModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline NeMo CTC model config\r\n * @typedef {Object} OfflineNeMoCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline Canary model config\r\n * @typedef {Object} OfflineCanaryModelConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n * @property {string} [srcLang]\r\n * @property {string} [tgtLang]\r\n * @property {number} [usePnc]\r\n */\r\n\r\n/**\r\n * Offline Whisper model config\r\n * @typedef {Object} OfflineWhisperModelConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n * @property {string} [language]\r\n * @property {string} [task]\r\n * @property {number} [tailPaddings]\r\n */\r\n\r\n/**\r\n * Offline FireRed ASR model config\r\n * @typedef {Object} OfflineFireRedAsrModelConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n */\r\n\r\n/**\r\n * Offline Moonshine model config\r\n * @typedef {Object} OfflineMoonshineModelConfig\r\n * @property {string} [preprocessor]\r\n * @property {string} 
[encoder]\r\n * @property {string} [uncachedDecoder]\r\n * @property {string} [cachedDecoder]\r\n */\r\n\r\n/**\r\n * Offline TDNN model config\r\n * @typedef {Object} OfflineTdnnModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline SenseVoice model config\r\n * @typedef {Object} OfflineSenseVoiceModelConfig\r\n * @property {string} [model]\r\n * @property {string} [language]\r\n * @property {number} [useInverseTextNormalization]\r\n */\r\n\r\n/**\r\n * Offline model config.\r\n * @typedef {Object} OfflineModelConfig\r\n * @property {OfflineTransducerModelConfig} [transducer]\r\n * @property {OfflineParaformerModelConfig} [paraformer]\r\n * @property {OfflineZipformerCtcModelConfig} [zipformerCtc]\r\n * @property {OfflineWenetCtcModelConfig} [wenetCtc]\r\n * @property {OfflineOmnilingualAsrCtcModelConfig} [omnilingual]\r\n * @property {OfflineMedAsrCtcModelConfig} [medasr]\r\n * @property {OfflineDolphinModelConfig} [dolphin]\r\n * @property {OfflineNeMoCtcModelConfig} [nemoCtc]\r\n * @property {OfflineCanaryModelConfig} [canary]\r\n * @property {OfflineWhisperModelConfig} [whisper]\r\n * @property {OfflineFireRedAsrModelConfig} [fireRedAsr]\r\n * @property {OfflineMoonshineModelConfig} [moonshine]\r\n * @property {OfflineTdnnModelConfig} [tdnn]\r\n * @property {OfflineSenseVoiceModelConfig} [senseVoice]\r\n * @property {string} [tokens]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Transducer model config\r\n * @typedef {Object} TransducerModelConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n * @property {string} [joiner]\r\n */\r\n\r\n/**\r\n * Paraformer model config\r\n * @typedef {Object} ParaformerModelConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n */\r\n\r\n/**\r\n * Zipformer2 CTC model config\r\n * @typedef {Object} Zipformer2CtcModelConfig\r\n * @property {string} [model]\r\n 
*/\r\n\r\n/**\r\n * NeMo CTC model config\r\n * @typedef {Object} NemoCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Tone CTC model config\r\n * @typedef {Object} ToneCtcModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Online model config (subset of C++ `OnlineModelConfig`).\r\n * @typedef {Object} OnlineModelConfig\r\n * @property {TransducerModelConfig} [transducer]\r\n * @property {ParaformerModelConfig} [paraformer]\r\n * @property {Zipformer2CtcModelConfig} [zipformer2Ctc]\r\n * @property {NemoCtcModelConfig} [nemoCtc]\r\n * @property {ToneCtcModelConfig} [toneCtc]\r\n * @property {string} [tokens]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n * @property {string} [modelType]\r\n * @property {string} [modelingUnit]\r\n * @property {string} [bpeVocab]\r\n * @property {string} [tokensBuf]\r\n * @property {number} [tokensBufSize]\r\n */\r\n\r\n/**\r\n * Homophone replacer configuration used both in online and offline recognizers.\r\n * @typedef {Object} HomophoneReplacerConfig\r\n * @property {string} [lexicon]\r\n * @property {string} [ruleFsts]\r\n */\r\n\r\n/**\r\n * Online recognizer configuration passed to createOnlineRecognizer.\r\n * @typedef {Object} OnlineRecognizerConfig\r\n * @property {FeatureConfig} [featConfig]\r\n * @property {OnlineModelConfig} [modelConfig]\r\n * @property {HomophoneReplacerConfig} [hr]\r\n * @property {string} [decodingMethod]\r\n * @property {number} [maxActivePaths]\r\n * @property {boolean|number} [enableEndpoint]\r\n * @property {number} [rule1MinTrailingSilence]\r\n * @property {number} [rule2MinTrailingSilence]\r\n * @property {number} [rule3MinUtteranceLength]\r\n * @property {string} [hotwordsFile]\r\n * @property {number} [hotwordsScore]\r\n * @property {string} [ruleFsts]\r\n * @property {string} [ruleFars]\r\n * @property {number} [blankPenalty]\r\n */\r\n\r\n/**\r\n * Offline recognizer config passed 
to createOfflineRecognizer.\r\n * @typedef {Object} OfflineRecognizerConfig\r\n * @property {FeatureConfig} [featConfig]\r\n * @property {OfflineModelConfig} [modelConfig]\r\n */\r\n\r\n/**\r\n * Wave object returned by readWave and used by writeWave.\r\n * @typedef {Object} WaveObject\r\n * @property {Float32Array} samples - 1-D float32 samples in [-1, 1].\r\n * @property {number} sampleRate - Sample rate as an integer (e.g., 16000).\r\n * @see src/wave-reader.cc\r\n */\r\n\r\n/**\r\n * Speech segment returned by Vad.front().\r\n * @typedef {Object} SpeechSegment\r\n * @property {number} start - Start index (int32) of this segment.\r\n * @property {Float32Array} samples - Float32Array of samples.\r\n * @see src/vad.cc\r\n */\r\n\r\n/**\r\n * Audio returned by TTS and speech denoiser.\r\n * @typedef {Object} GeneratedAudio\r\n * @property {Float32Array} samples - The generated/denoised audio samples.\r\n * @property {number} sampleRate - Sample rate in Hz.\r\n * @see src/non-streaming-tts.cc\r\n * @see src/non-streaming-speech-denoiser.cc\r\n */\r\n\r\n/**\r\n * @typedef {Object} GenerationConfig\r\n * @property {number=} silenceScale\r\n * @property {number=} speed\r\n * @property {number=} sid\r\n * @property {number=} numSteps\r\n *\r\n * @property {Float32Array=} referenceAudio\r\n * @property {number=} referenceSampleRate\r\n * @property {string=} referenceText\r\n *\r\n * @property {{[key: string]: number | string}} [extra]\r\n */\r\n\r\n\r\n/**\r\n * TTS request object passed to generate/generateAsync.\r\n * @typedef {Object} TtsRequest\r\n * @property {string} text - Input text to synthesize.\r\n * @property {number} sid - Speaker id (integer).\r\n * @property {number} speed - Playback speed (float).\r\n * @property {boolean} [enableExternalBuffer=true] - Whether to use an external\r\n *           buffer.\r\n * @property {GenerationConfig=} generationConfig - Optional\r\n */\r\n\r\n/**\r\n * Spoken Language ID whisper config\r\n * @typedef {Object} 
SpokenLanguageIdentificationWhisperConfig\r\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n * @property {number} [tailPaddings]\r\n */\r\n\r\n/**\r\n * SpokenLanguageIdentification config\r\n * @typedef {Object} SpokenLanguageIdentificationConfig\r\n * @property {SpokenLanguageIdentificationWhisperConfig} [whisper]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Speaker embedding extractor config\r\n * @typedef {Object} SpeakerEmbeddingExtractorConfig\r\n * @property {string} [model]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Offline punctuation model config\r\n * @typedef {Object} OfflinePunctuationModelConfig\r\n * @property {string} [ctTransformer]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Offline punctuation config\r\n * @typedef {Object} OfflinePunctuationConfig\r\n * @property {OfflinePunctuationModelConfig} [model]\r\n */\r\n\r\n/**\r\n * Online punctuation model config\r\n * @typedef {Object} OnlinePunctuationModelConfig\r\n * @property {string} [cnnBilstm]\r\n * @property {string} [bpeVocab]\r\n * @property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Online punctuation config\r\n * @typedef {Object} OnlinePunctuationConfig\r\n * @property {OnlinePunctuationModelConfig} [model]\r\n */\r\n\r\n/**\r\n * Generic audio processing request used by denoisers/tts generators.\r\n * @typedef {Object} AudioProcessRequest\r\n * @property {Float32Array} samples\r\n * @property {number} sampleRate\r\n * @property {boolean} [enableExternalBuffer]\r\n */\r\n\r\n/**\r\n * Offline TTS model configs\r\n * @typedef {Object} OfflineTtsVitsModelConfig\r\n * @property {string} [model]\r\n * @property 
{string} [lexicon]\r\n * @property {string} [tokens]\r\n * @property {string} [dataDir]\r\n * @property {number} [noiseScale]\r\n * @property {number} [noiseScaleW]\r\n * @property {number} [lengthScale]\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflineTtsMatchaModelConfig\r\n * @property {string} [acousticModel]\r\n * @property {string} [vocoder]\r\n * @property {string} [lexicon]\r\n * @property {string} [tokens]\r\n * @property {string} [dataDir]\r\n * @property {number} [noiseScale]\r\n * @property {number} [lengthScale]\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflineTtsKokoroModelConfig\r\n * @property {string} [model]\r\n * @property {string} [voices]\r\n * @property {string} [tokens]\r\n * @property {string} [dataDir]\r\n * @property {number} [lengthScale]\r\n * @property {string} [lexicon]\r\n * @property {string} [lang]\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflineTtsKittenModelConfig\r\n * @property {string} [model]\r\n * @property {string} [voices]\r\n * @property {string} [tokens]\r\n * @property {string} [dataDir]\r\n * @property {number} [lengthScale]\r\n */\r\n\r\n/**\n * @typedef {Object} OfflineTtsZipvoiceModelConfig\n * @property {string} [tokens]\n * @property {string} [encoder]\n * @property {string} [decoder]\n * @property {string} [vocoder]\n * @property {string} [dataDir]\n * @property {string} [lexicon]\n * @property {number} [featScale]\n * @property {number} [tShift]\n * @property {number} [targetRms]\n * @property {number} [guidanceScale]\n */\n\n/**\n * @typedef {Object} OfflineTtsPocketModelConfig\n * @property {string} [lmFlow]\n * @property {string} [lmMain]\n * @property {string} [encoder]\r\n * @property {string} [decoder]\r\n * @property {string} [textConditioner]\r\n * @property {string} [vocabJson]\r\n * @property {string} [tokenScoresJson]\r\n * @property {number} [voiceEmbeddingCacheCapacity]\r\n */\r\n\r\n/**\r\n * Offline TTS model config\r\n * @typedef {Object} OfflineTtsModelConfig\r\n * @property 
{OfflineTtsVitsModelConfig} [vits]\r\n * @property {OfflineTtsMatchaModelConfig} [matcha]\n * @property {OfflineTtsKokoroModelConfig} [kokoro]\n * @property {OfflineTtsKittenModelConfig} [kitten]\n * @property {OfflineTtsZipvoiceModelConfig} [zipvoice]\n * @property {OfflineTtsPocketModelConfig} [pocket]\n */\n\r\n/**\r\n * Offline TTS configuration (partial, commonly used props).\r\n * @typedef {Object} OfflineTtsConfig\r\n * @property {OfflineTtsModelConfig} [model]\r\n * @property {number} [maxNumSentences]\r\n * @property {number} [silenceScale]\r\n * @property {number} [numThreads]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\n * Offline Speech Denoiser model config\n * @typedef {Object} OfflineSpeechDenoiserGtcrnModelConfig\n * @property {string} [model]\n */\n\n/**\n * Offline Speech Denoiser model config\n * @typedef {Object} OfflineSpeechDenoiserDpdfNetModelConfig\n * @property {string} [model]\n */\n\n/**\n * Offline Speech Denoiser model config\n * @typedef {Object} OfflineSpeechDenoiserModelConfig\n * @property {OfflineSpeechDenoiserGtcrnModelConfig} [gtcrn]\n * @property {OfflineSpeechDenoiserDpdfNetModelConfig} [dpdfnet]\n * @property {number} [numThreads]\n * @property {boolean|number} [debug]\n * @property {string} [provider]\n */\n\n/**\n * Offline Speech Denoiser configuration (partial).\n * @typedef {Object} OfflineSpeechDenoiserConfig\n * @property {OfflineSpeechDenoiserModelConfig} [model]\n */\n\n/**\n * Online Speech Denoiser configuration (partial).\n * @typedef {Object} OnlineSpeechDenoiserConfig\n * @property {OfflineSpeechDenoiserModelConfig} [model]\n */\n\r\n/**\r\n * Offline speaker segmentation (pyannote) model config\r\n * @typedef {Object} OfflineSpeakerSegmentationPyannoteModelConfig\r\n * @property {string} [model]\r\n */\r\n\r\n/**\r\n * Offline speaker segmentation model config\r\n * @typedef {Object} OfflineSpeakerSegmentationModelConfig\r\n * @property {OfflineSpeakerSegmentationPyannoteModelConfig} [pyannote]\r\n * 
@property {number} [numThreads]\r\n * @property {boolean|number} [debug]\r\n * @property {string} [provider]\r\n */\r\n\r\n/**\r\n * Offline Speaker Diarization configuration (partial).\r\n * @typedef {Object} OfflineSpeakerDiarizationConfig\r\n * @property {OfflineSpeakerSegmentationModelConfig} [segmentation]\r\n * @property {SpeakerEmbeddingExtractorConfig} [embedding]\r\n * @property {FastClusteringConfig} [clustering]\r\n * @property {number} [minDurationOn]\r\n * @property {number} [minDurationOff]\r\n */\r\n\r\n/**\r\n * Fast clustering configuration used by diarization.\r\n * @typedef {Object} FastClusteringConfig\r\n * @property {number} [numClusters]\r\n * @property {number} [threshold]\r\n */\r\n\r\n/**\r\n * SpeakerEmbeddingManager add-multi flattened object\r\n * @typedef {Object} SpeakerEmbeddingManagerAddListFlattenedObj\r\n * @property {string} name\r\n * @property {Float32Array} vv\r\n * @property {number} n\r\n */\r\n\r\n/**\r\n * SpeakerEmbeddingManager search object\r\n * @typedef {Object} SpeakerEmbeddingManagerSearchObj\r\n * @property {Float32Array} v\r\n * @property {number} threshold\r\n */\r\n\r\n/**\r\n * SpeakerEmbeddingManager verify object\r\n * @typedef {Object} SpeakerEmbeddingManagerVerifyObj\r\n * @property {string} name\r\n * @property {Float32Array} v\r\n * @property {number} threshold\r\n */\r\n\r\n/**\r\n * KeywordSpotter config (partial)\r\n * @typedef {Object} KeywordSpotterConfig\r\n * @property {FeatureConfig} [featConfig]\r\n * @property {OnlineModelConfig} [modelConfig]\r\n * @property {number} [maxActivePaths]\r\n * @property {number} [numTrailingBlanks]\r\n * @property {number} [keywordsScore]\r\n * @property {number} [keywordsThreshold]\r\n * @property {string} [keywordsFile]\r\n */\r\n\r\n/**\r\n * Offline recognition result returned by `getOfflineStreamResultAsJson`.\r\n * See `OfflineRecognitionResult::AsJsonString()` in C++ for precise fields.\r\n * @typedef {Object} OfflineRecognizerResult\r\n * @property 
{string} lang\r\n * @property {string} emotion\r\n * @property {string} event\r\n * @property {string} text\r\n * @property {number[]} timestamps\r\n * @property {number[]} durations\r\n * @property {string[]} tokens\r\n * @property {number[]} ys_log_probs\r\n * @property {number[]} words\r\n */\r\n\r\n/**\r\n * Online recognition result returned by `getOnlineStreamResultAsJson`.\r\n * See `OnlineRecognizerResult::AsJsonString()` in C++.\r\n * @typedef {Object} OnlineRecognizerResult\r\n * @property {string} text\r\n * @property {string[]} tokens\r\n * @property {number[]} timestamps\r\n * @property {number[]} ys_probs\r\n * @property {number[]} lm_probs\r\n * @property {number[]} context_scores\r\n * @property {number} segment\r\n * @property {number[]} words\r\n * @property {number} start_time\r\n * @property {boolean} is_final\r\n * @property {boolean} is_eof\r\n */\r\n\r\n/**\r\n * Keyword spotter result returned by `getKeywordResultAsJson`.\r\n * @typedef {Object} KeywordResult\r\n * @property {number} start_time\r\n * @property {string} keyword\r\n * @property {number[]} timestamps\r\n * @property {string[]} tokens\r\n */\r\n\r\n/**\r\n * Speaker diarization segment returned by `offlineSpeakerDiarizationProcess`.\r\n * @typedef {Object} SpeakerDiarizationSegment\r\n * @property {number} start - start time in seconds\r\n * @property {number} end - end time in seconds\r\n * @property {number} speaker - speaker id (integer)\r\n */\r\n\r\n/**\r\n * Speaker embedding entry used by SpeakerEmbeddingManager.add\r\n * @typedef {Object} SpeakerEmbeddingEntry\r\n * @property {string} name - speaker name\r\n * @property {Float32Array} v - embedding vector\r\n */\r\n\r\n/**\r\n * @typedef {Object} OfflineStreamObject\r\n * @property {OfflineStreamHandle} handle\r\n */\r\n\r\n/**\r\n * @typedef {Object} OnlineStreamObject\r\n * @property {OnlineStreamHandle} handle\r\n */\r\n\r\n/**\r\n * @typedef {Object} DisplayObject\r\n * @property {DisplayHandle} handle\r\n 
*/\r\n\r\n// Export typedefs so they can be referenced by require('./types.js')\r\nmodule.exports = {};\r\n"
  },
  {
    "path": "scripts/node-addon-api/lib/vad.js",
    "content": "/** @typedef {import('./types').CircularBufferHandle} CircularBufferHandle */\n/** @typedef {import('./types').SpeechSegment} SpeechSegment */\n/** @typedef {import('./types').VadConfig} VadConfig */\n\nconst addon = require('./addon.js');\n\n/**\n * CircularBuffer stores float32 samples internally.\n */\nclass CircularBuffer {\n  /**\n   * @param {number} capacity - capacity in samples (integer)\n   */\n  constructor(capacity) {\n    this.handle = addon.createCircularBuffer(capacity);\n  }\n\n  /**\n   * Push samples into the buffer.\n   * @param {Float32Array} samples\n   */\n  push(samples) {\n    addon.circularBufferPush(this.handle, samples);\n  }\n\n  /**\n   * Get a slice of samples.\n   * @param {number} startIndex\n   * @param {number} n\n   * @param {boolean} [enableExternalBuffer=true]\n   * @returns {Float32Array}\n   */\n  get(startIndex, n, enableExternalBuffer = true) {\n    return addon.circularBufferGet(\n        this.handle, startIndex, n, enableExternalBuffer);\n  }\n\n  /**\n   * Pop n samples from the buffer.\n   * @param {number} n\n   */\n  pop(n) {\n    return addon.circularBufferPop(this.handle, n);\n  }\n\n  /**\n   * Get current size in samples.\n   * @returns {number}\n   */\n  size() {\n    return addon.circularBufferSize(this.handle);\n  }\n\n  /**\n   * Get head index.\n   * @returns {number}\n   */\n  head() {\n    return addon.circularBufferHead(this.handle);\n  }\n\n  /** Reset the buffer. 
*/\n  reset() {\n    addon.circularBufferReset(this.handle);\n  }\n}\n\n/**\n * Voice Activity Detector (VAD).\n */\nclass Vad {\n  /**\n   * @param {VadConfig} config\n   * @param {number} bufferSizeInSeconds\n   */\n  constructor(config, bufferSizeInSeconds) {\n    this.handle =\n        addon.createVoiceActivityDetector(config, bufferSizeInSeconds);\n    this.config = config;\n  }\n\n  /**\n   * Accept raw waveform samples.\n   * @param {Float32Array} samples\n   */\n  acceptWaveform(samples) {\n    addon.voiceActivityDetectorAcceptWaveform(this.handle, samples);\n  }\n\n  /** @returns {boolean} */\n  isEmpty() {\n    return addon.voiceActivityDetectorIsEmpty(this.handle);\n  }\n\n  /** @returns {boolean} */\n  isDetected() {\n    return addon.voiceActivityDetectorIsDetected(this.handle);\n  }\n\n  /** Pop the earliest detected speech segment. */\n  pop() {\n    addon.voiceActivityDetectorPop(this.handle);\n  }\n\n  /** Clear internal state. */\n  clear() {\n    addon.voiceActivityDetectorClear(this.handle);\n  }\n\n  /**\n   * Get the front speech segment.\n   * @param {boolean} [enableExternalBuffer=true]\n   * @returns {SpeechSegment}\n   */\n  front(enableExternalBuffer = true) {\n    return addon.voiceActivityDetectorFront(this.handle, enableExternalBuffer);\n  }\n\n  /** Reset detector state. */\n  reset() {\n    addon.voiceActivityDetectorReset(this.handle);\n  }\n\n  /** Flush pending internal buffer. */\n  flush() {\n    addon.voiceActivityDetectorFlush(this.handle);\n  }\n}\n\nmodule.exports = {\n  Vad,\n  CircularBuffer,\n}\n"
  },
  {
    "path": "scripts/node-addon-api/package.json",
    "content": "{\n  \"main\": \"lib/sherpa-onnx.js\",\n  \"version\": \"1.0.0\",\n  \"description\": \"Speech-to-text, text-to-speech, and speaker diarization using Next-gen Kaldi without internet connection\",\n  \"dependencies\": {\n    \"cmake-js\": \"^7.3.0\",\n    \"node-addon-api\": \"^8.3.0\",\n    \"perf_hooks\": \"*\"\n  },\n  \"scripts\": {\n    \"install\": \"cmake-js compile --log-level verbose\",\n    \"postinstall\": \"npm run typecheck\",\n    \"test\": \"node --napi-modules ./test/test_binding.js\",\n    \"typecheck\": \"tsc\"\n  },\n  \"repository\": {\n    \"type\": \"git\",\n    \"url\": \"git+https://github.com/k2-fsa/sherpa-onnx.git\"\n  },\n  \"keywords\": [\n    \"speech to text\",\n    \"text to speech\",\n    \"transcription\",\n    \"real-time speech recognition\",\n    \"without internet connection\",\n    \"locally\",\n    \"local\",\n    \"embedded systems\",\n    \"open source\",\n    \"diarization\",\n    \"speaker diarization\",\n    \"speaker recognition\",\n    \"speaker\",\n    \"speaker segmentation\",\n    \"speaker verification\",\n    \"spoken language identification\",\n    \"sherpa\",\n    \"zipformer\",\n    \"asr\",\n    \"tts\",\n    \"stt\",\n    \"c++\",\n    \"onnxruntime\",\n    \"onnx\",\n    \"ai\",\n    \"next-gen kaldi\",\n    \"offline\",\n    \"privacy\",\n    \"open source\",\n    \"streaming speech recognition\",\n    \"speech\",\n    \"recognition\",\n    \"vad\",\n    \"node-addon-api\",\n    \"speaker id\",\n    \"language id\"\n  ],\n  \"author\": \"The next-gen Kaldi team\",\n  \"license\": \"Apache-2.0\",\n  \"gypfile\": true,\n  \"name\": \"sherpa-onnx-node-addon-api\",\n  \"bugs\": {\n    \"url\": \"https://github.com/k2-fsa/sherpa-onnx/issues\"\n  },\n  \"homepage\": \"https://github.com/k2-fsa/sherpa-onnx#readme\",\n  \"devDependencies\": {\n    \"@types/node\": \"^24.10.4\",\n    \"typescript\": \"^5.9.3\"\n  }\n}\n"
  },
  {
    "path": "scripts/node-addon-api/test/test_asr_streaming_transducer.js",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\nconst sherpa_onnx = require('../lib/sherpa-onnx.js');\nconst performance = require('perf_hooks').performance;\n\n// Please download test files from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nconst config = {\n  'featConfig': {\n    'sampleRate': 16000,\n    'featureDim': 80,\n  },\n  'modelConfig': {\n    'transducer': {\n      'encoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx',\n      'decoder':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx',\n      'joiner':\n          './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx',\n    },\n    'tokens':\n        './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt',\n    'numThreads': 2,\n    'provider': 'cpu',\n    'debug': 1,\n    'modelType': 'zipformer',\n  }\n};\n\nconst waveFilename =\n    './sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav';\n\nconst recognizer = new sherpa_onnx.OnlineRecognizer(config);\nconsole.log('Started')\nlet start = performance.now();\nconst stream = recognizer.createStream();\nconst wave = sherpa_onnx.readWave(waveFilename);\nstream.acceptWaveform({samples: wave.samples, sampleRate: wave.sampleRate});\n\nconst tailPadding = new Float32Array(wave.sampleRate * 0.4);\nstream.acceptWaveform({samples: tailPadding, sampleRate: wave.sampleRate});\n\nwhile (recognizer.isReady(stream)) {\n  recognizer.decode(stream);\n}\nresult = recognizer.getResult(stream)\nlet stop = performance.now();\nconsole.log('Done')\n\nconst elapsed_seconds = (stop - start) / 1000;\nconst duration = wave.samples.length / wave.sampleRate;\nconst real_time_factor = elapsed_seconds / duration;\nconsole.log('Wave duration', duration.toFixed(3), 'secodns')\nconsole.log('Elapsed', elapsed_seconds.toFixed(3), 'secodns')\nconsole.log('RTF', 
real_time_factor.toFixed(3))\nconsole.log('result', result.text)\n"
  },
  {
    "path": "scripts/node-addon-api/test/test_binding.js",
    "content": "const sherpa_onnx = require('../lib/sherpa-onnx.js');\nconsole.log(sherpa_onnx)\n\nconsole.log('Tests passed- everything looks OK!');\n"
  },
  {
    "path": "scripts/node-addon-api/tsconfig.json",
    "content": "{\r\n  \"compilerOptions\": {\r\n    \"allowJs\": true,\r\n    \"checkJs\": true,\r\n    \"noEmit\": true,\r\n    \"skipLibCheck\": true,\r\n    \"esModuleInterop\": true,\r\n    \"module\": \"commonjs\",\r\n    \"target\": \"ES2019\",\r\n    \"lib\": [\"ES2020\"],\r\n    \"types\": [\"node\"]\r\n  },\r\n  \"include\": [\"lib/**/*.js\"]\r\n}\r\n"
  },
  {
    "path": "scripts/nodejs/README.md",
    "content": "# Introduction\n\nText-to-speech and speech-to-text with [Next-gen Kaldi](https://github.com/k2-fsa/).\n\nIt processes everything locally without accessing the Internet.\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/nodejs-examples\nfor examples.\n\nYou need Node >= 18 for this package.\n"
  },
  {
    "path": "scripts/nodejs/index.js",
    "content": "// Copyright (c)  2023-2024  Xiaomi Corporation (authors: Fangjun Kuang)\n'use strict'\n\nconst wasmModule = require('./sherpa-onnx-wasm-nodejs.js')();\nconst sherpa_onnx_asr = require('./sherpa-onnx-asr.js');\nconst sherpa_onnx_tts = require('./sherpa-onnx-tts.js');\nconst sherpa_onnx_kws = require('./sherpa-onnx-kws.js');\nconst sherpa_onnx_wave = require('./sherpa-onnx-wave.js');\nconst sherpa_onnx_vad = require('./sherpa-onnx-vad.js');\nconst sherpa_onnx_speaker_diarization =\n    require('./sherpa-onnx-speaker-diarization.js');\nconst sherpa_onnx_speech_enhancement =\n    require('./sherpa-onnx-speech-enhancement.js');\n\n\n\nfunction createOnlineRecognizer(config) {\n  return sherpa_onnx_asr.createOnlineRecognizer(wasmModule, config);\n}\n\nfunction createOfflineRecognizer(config) {\n  return new sherpa_onnx_asr.OfflineRecognizer(config, wasmModule);\n}\n\nfunction createOfflineTts(config) {\n  return sherpa_onnx_tts.createOfflineTts(wasmModule, config);\n}\n\nfunction createKws(config) {\n  return sherpa_onnx_kws.createKws(wasmModule, config);\n}\n\nfunction createCircularBuffer(capacity) {\n  return new sherpa_onnx_vad.CircularBuffer(capacity, wasmModule);\n}\n\nfunction createVad(config) {\n  return sherpa_onnx_vad.createVad(wasmModule, config);\n}\n\nfunction createOfflineSpeakerDiarization(config) {\n  return sherpa_onnx_speaker_diarization.createOfflineSpeakerDiarization(\n      wasmModule, config);\n}\n\nfunction readWave(filename) {\n  return sherpa_onnx_wave.readWave(filename, wasmModule);\n}\n\nfunction writeWave(filename, data) {\n  sherpa_onnx_wave.writeWave(filename, data, wasmModule);\n}\n\nfunction readWaveFromBinaryData(uint8Array) {\n  return sherpa_onnx_wave.readWaveFromBinaryData(uint8Array, wasmModule);\n}\n\nfunction createOfflineSpeechDenoiser(config) {\n  return sherpa_onnx_speech_enhancement.createOfflineSpeechDenoiser(\n      wasmModule, config);\n}\n\nfunction createOnlineSpeechDenoiser(config) {\n  return 
sherpa_onnx_speech_enhancement.createOnlineSpeechDenoiser(\n      wasmModule, config);\n}\n\nfunction getVersion() {\n  const v = wasmModule._SherpaOnnxGetVersionStr();\n  return wasmModule.UTF8ToString(v);\n}\n\nfunction getGitSha1() {\n  const v = wasmModule._SherpaOnnxGetGitSha1();\n  return wasmModule.UTF8ToString(v);\n}\n\nfunction getGitDate() {\n  const v = wasmModule._SherpaOnnxGetGitDate();\n  return wasmModule.UTF8ToString(v);\n}\n\n// Note: online means streaming and offline means non-streaming here.\n// Both of them don't require internet connection.\nmodule.exports = {\n  createOnlineRecognizer,\n  createOfflineRecognizer,\n  createOfflineTts,\n  createKws,\n  readWave,\n  readWaveFromBinaryData,\n  writeWave,\n  createCircularBuffer,\n  createVad,\n  createOfflineSpeakerDiarization,\n  createOfflineSpeechDenoiser,\n  createOnlineSpeechDenoiser,\n  version: getVersion(),\n  gitSha1: getGitSha1(),\n  gitDate: getGitDate(),\n};\n"
  },
  {
    "path": "scripts/nodejs/package.json",
    "content": "{\n  \"name\": \"sherpa-onnx\",\n  \"version\": \"SHERPA_ONNX_VERSION\",\n  \"description\": \"Speech-to-text, text-to-speech, speaker diarization, and speech enhancement using Next-gen Kaldi without internet connection\",\n  \"main\": \"index.js\",\n  \"scripts\": {\n    \"test\": \"echo \\\"Error: no test specified\\\" && exit 1\"\n  },\n  \"repository\": {\n    \"type\": \"git\",\n    \"url\": \"git+https://github.com/k2-fsa/sherpa-onnx.git\"\n  },\n  \"keywords\": [\n    \"speech to text\",\n    \"text to speech\",\n    \"transcription\",\n    \"real-time speech recognition\",\n    \"without internet connection\",\n    \"embedded systems\",\n    \"open source\",\n    \"zipformer\",\n    \"asr\",\n    \"tts\",\n    \"stt\",\n    \"c++\",\n    \"onnxruntime\",\n    \"onnx\",\n    \"ai\",\n    \"next-gen kaldi\",\n    \"offline\",\n    \"privacy\",\n    \"open source\",\n    \"streaming speech recognition\",\n    \"speech\",\n    \"recognition\",\n    \"WebAssembly\",\n    \"wasm\",\n    \"speech enhancement\",\n    \"denoising\"\n  ],\n  \"author\": \"The next-gen Kaldi team\",\n  \"license\": \"Apache-2.0\",\n  \"bugs\": {\n    \"url\": \"https://github.com/k2-fsa/sherpa-onnx/issues\"\n  },\n  \"homepage\": \"https://github.com/k2-fsa/sherpa-onnx#readme\",\n  \"dependencies\": {\n  }\n}\n"
  },
  {
    "path": "scripts/omnilingual-asr/README.md",
    "content": "# Introduction\n\nThis folder contains script to export\nhttps://github.com/facebookresearch/omnilingual-asr\nto sherpa-onnx\n\nSee\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/export-omnilingual-asr-to-onnx.yaml\nfor usage.\n\n```\nnum_frames = round(num_samples / 318 - 1.5)\nnum_samples = round(318 * num_frames + 477)\n\nor\nnum_frames = round(num_samples / 320)\n\n```\n\n20ms per frame\n\n\n\n"
  },
  {
    "path": "scripts/omnilingual-asr/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Dict\n\nimport onnx\nimport torch\nfrom fairseq2.nn.batch_layout import BatchLayout\nfrom omnilingual_asr.models.inference.pipeline import ASRInferencePipeline\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter,\n    )\n    parser.add_argument(\n        \"--model-card\",\n        type=str,\n        required=True,\n        choices=[\n            \"omniASR_CTC_300M\",\n            \"omniASR_CTC_300M_v2\",\n            \"omniASR_CTC_1B\",\n            \"omniASR_CTC_1B_v2\",\n        ],\n        help=\"The model card to export.\",\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str], model_card: str):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    if \"300M\" in model_card:\n        onnx.save(model, filename)\n    else:\n        external_filename = filename.split(\".onnx\")[0]\n        onnx.save(\n            model,\n            filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=external_filename + \".weights\",\n        )\n\n\nclass ModelWrapper(torch.nn.Module):\n    def __init__(self, model):\n        super().__init__()\n        self.model = model\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n          x: (N, num_samples), float32\n        \"\"\"\n    
    batch_layout = BatchLayout(shape=x.shape, seq_lens=[x.shape[1]])\n        logits, _ = self.model(x, batch_layout)\n        return logits\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n    pipeline = ASRInferencePipeline(\n        model_card=args.model_card,\n        device=\"cpu\",\n        dtype=torch.float32,\n    )\n\n    vocab_size = pipeline.tokenizer._model.vocabulary_size\n\n    with open(\"tokens.txt\", \"w\") as f:\n        for i in range(pipeline.tokenizer._model.vocabulary_size):\n            f.write(f\"{pipeline.tokenizer._model.index_to_token(i)} {i}\\n\")\n\n    print(\"saved to tokens.txt\")\n\n    wrapper = ModelWrapper(pipeline.model)\n    wrapper.eval()\n\n    x = torch.rand(1, 16000 * 10)\n    torch.onnx.export(\n        wrapper,\n        x,\n        \"model.onnx\",\n        opset_version=14,\n        input_names=[\"x\"],\n        output_names=[\"logits\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"num_samples\"},\n            \"logits\": {0: \"N\", 1: \"num_frames\"},\n        },\n    )\n\n    meta_data = {\n        \"vocab_size\": vocab_size,\n        \"model_type\": \"omnilingual-asr\",\n        \"version\": \"1\",\n        \"sample_rate\": 16000,\n        \"model_author\": \"facebookresearch\",\n        \"url\": \"https://github.com/facebookresearch/omnilingual-asr\",\n        \"comment\": \"300M-CTC\",\n    }\n\n    add_meta_data(\"model.onnx\", meta_data, args.model_card)\n    print(\"saved to model.onnx\")\n\n    quantize_dynamic(\n        model_input=\"./model.onnx\",\n        model_output=\"./model.int8.onnx\",\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QUInt8,\n    )\n    print(\"saved to model.int8.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/omnilingual-asr/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport time\n\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef display(sess):\n    print(\"==========Input==========\")\n    for i in sess.get_inputs():\n        print(i)\n    print(\"==========Output==========\")\n    for i in sess.get_outputs():\n        print(i)\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        display(self.model)\n\n    def __call__(self, x: np.ndarray):\n        logits = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n        # [batch_size, T, vocab_size]\n        return logits\n\n\ndef load_tokens():\n    id2token = dict()\n    with open(\"./tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.split()\n            if len(fields) == 1:\n                id2token[int(fields[0])] = \" \"\n            else:\n                t, idx = fields\n                id2token[int(idx)] = t\n    return id2token\n\n\ndef load_audio(filename):\n    samples, sr = sf.read(filename, always_2d=True, dtype=\"float32\")\n    samples = samples[:, 0]  # only use the first channel\n    if sr != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sr, target_sr=16000)\n    if len(samples) / 16000 > 40:\n        raise ValueError(f\"{filename} is too long. 
Support at most 40 seconds\")\n\n    mean = np.mean(samples, axis=0, keepdims=True)\n    var = np.var(samples, axis=0, keepdims=True)\n\n    eps = 1e-5\n    return (samples - mean) / np.sqrt(var + eps)\n\n\ndef test(filename, wav_file_list, num_iter=1):\n    id2token = load_tokens()\n    model = OnnxModel(filename)\n\n    for it in range(num_iter):\n        for wav in wav_file_list:\n            print(f\"---test {filename} with {wav}----iter---{it}\")\n            start = time.time()\n            samples = load_audio(wav)\n\n            logits = model(samples[None])\n            ids = logits[0].argmax(axis=-1)\n            ans = []\n            prev = -1\n            blank = 0\n            for i in ids:\n                if i != blank and i != prev:\n                    ans.append(i)\n                prev = i\n\n            words = [id2token[k] for k in ans]\n            end = time.time()\n            elapsed_seconds = end - start\n            audio_duration = samples.shape[0] / 16000\n            real_time_factor = elapsed_seconds / audio_duration\n\n            print(\"---> text is----\", \"\".join(words))\n            print(f\"RTF: {real_time_factor}\")\n            print()\n\n\ndef main():\n    wav_file_list = [\"./en.wav\", \"./de.wav\", \"./es.wav\", \"./fr.wav\"]\n    test(\"./model.onnx\", wav_file_list)\n\n    test(\"./model.int8.onnx\", wav_file_list)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/.gitignore",
    "content": "seg_dict\ntokens.json\n"
  },
  {
    "path": "scripts/paraformer/ascend-npu/export_decoder_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport torch\n\nfrom export_encoder_onnx import load_model\n\n\n@torch.no_grad()\ndef main():\n    print(\"loading model\")\n    model = load_model()\n\n    encoder_out = torch.randn(1, 100, 512, dtype=torch.float32)\n    acoustic_embedding = torch.randn(1, 50, 512, dtype=torch.float32)\n\n    opset_version = 14\n    filename = \"decoder.onnx\"\n    torch.onnx.export(\n        model.decoder,\n        (encoder_out, acoustic_embedding),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"encoder_out\", \"acoustic_embedding\"],\n        output_names=[\"decoder_out\"],\n        dynamic_axes={\n            \"encoder_out\": {1: \"T\"},\n            \"acoustic_embedding\": {1: \"num_tokens\"},\n            \"decoder_out\": {1: \"num_tokens\"},\n        },\n    )\n    print(f\"Saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251008)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/ascend-npu/export_encoder_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nfrom typing import List, Tuple\n\nimport torch\nimport yaml\n\nfrom torch_model import Paraformer\n\n\ndef load_cmvn(filename) -> Tuple[List[float], List[float]]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = list(map(lambda x: float(x), t))\n            else:\n                inv_stddev = list(map(lambda x: float(x), t))\n\n    return neg_mean, inv_stddev\n\n\ndef load_model():\n    with open(\"./config.yaml\", \"r\", encoding=\"utf-8\") as f:\n        config = yaml.safe_load(f)\n\n    print(\"creating model\")\n\n    neg_mean, inv_stddev = load_cmvn(\"./am.mvn\")\n\n    neg_mean = torch.tensor(neg_mean, dtype=torch.float32)\n    inv_stddev = torch.tensor(inv_stddev, dtype=torch.float32)\n\n    m = Paraformer(\n        neg_mean=neg_mean,\n        inv_stddev=inv_stddev,\n        input_size=560,\n        vocab_size=8404,\n        encoder_conf=config[\"encoder_conf\"],\n        decoder_conf=config[\"decoder_conf\"],\n        predictor_conf=config[\"predictor_conf\"],\n    )\n    m.eval()\n\n    print(\"loading state dict\")\n    state_dict = torch.load(\"./model_state_dict.pt\", map_location=\"cpu\")\n    if \"state_dict\" in state_dict:\n        state_dict = state_dict[\"state_dict\"]\n\n    m.load_state_dict(state_dict)\n    del state_dict\n\n    return m\n\n\n@torch.no_grad()\ndef main():\n    print(\"loading model\")\n    model = load_model()\n\n    x = torch.randn(1, 100, 560, dtype=torch.float32)\n\n    opset_version = 14\n    filename = \"encoder.onnx\"\n    torch.onnx.export(\n        model.encoder,\n        x,\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\"],\n        output_names=[\"encoder_out\"],\n    
    dynamic_axes={\n            \"x\": {1: \"T\"},\n            \"encoder_out\": {1: \"T\"},\n        },\n    )\n\n    print(f\"Saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251013)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/ascend-npu/export_predictor_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport torch\n\nfrom export_encoder_onnx import load_model\nfrom torch_model import CifPredictorV2\n\nif __name__ == \"__main__\":\n\n    def modified_predictor_forward(self: CifPredictorV2, hidden: torch.Tensor):\n        h = hidden\n        context = h.transpose(1, 2)\n        queries = self.pad(context)\n        output = torch.relu(self.cif_conv1d(queries))\n        output = output.transpose(1, 2)\n\n        output = self.cif_output(output)\n        alphas = torch.sigmoid(output)\n        alphas = torch.nn.functional.relu(\n            alphas * self.smooth_factor - self.noise_threshold\n        )\n\n        alphas = alphas.squeeze(-1)\n\n        return alphas\n\n    CifPredictorV2.forward = modified_predictor_forward\n\n\n@torch.no_grad()\ndef main():\n    print(\"loading model\")\n    model = load_model()\n\n    x = torch.randn(1, 100, 512, dtype=torch.float32)\n\n    opset_version = 14\n    filename = \"predictor.onnx\"\n    torch.onnx.export(\n        model.predictor,\n        x,\n        filename,\n        opset_version=opset_version,\n        input_names=[\"encoder_out\"],\n        output_names=[\"alphas\"],\n        dynamic_axes={\n            \"encoder_out\": {1: \"T\"},\n            \"alphas\": {1: \"T\"},\n            },\n    )\n    print(f\"Saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251008)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/ascend-npu/test_om.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nfrom ais_bench.infer.interface import InferSession\n\n\ndef compute_feat(filename):\n    sample_rate = 16000\n    samples, _ = librosa.load(filename, sr=sample_rate)\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n    print(\"features sum\", features.sum(), features.shape)\n\n    window_size = 7  # lfr_m\n    window_shift = 6  # lfr_n\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n    return np.copy(features)\n\n\ndef load_tokens():\n    ans = dict()\n    i = 0\n    with open(\"tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\nclass OmModel:\n    def __init__(self):\n        print(\"init encoder\")\n        self.encoder = InferSession(device_id=0, model_path=\"./encoder.om\", debug=False)\n        self.decoder = InferSession(device_id=1, model_path=\"./decoder.om\", debug=False)\n        self.predictor = InferSession(\n            device_id=0, model_path=\"./predictor.om\", debug=False\n        )\n\n        print(\"---encoder---\")\n        for i in self.encoder.get_inputs():\n            print(i.name, 
i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.encoder.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"---decoder---\")\n        for i in self.decoder.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.decoder.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"---predictor---\")\n        for i in self.predictor.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.predictor.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n    def run_encoder(self, features):\n        encoder_out = self.encoder.infer([features], mode=\"dymshape\")[0]\n        return encoder_out\n\n    def run_predictor(self, encoder_out):\n        alphas = self.predictor.infer([encoder_out], mode=\"dymshape\")[0]\n        return alphas\n\n    def run_decoder(self, encoder_out, acoustic_embedding):\n        decoder_out = self.decoder.infer(\n            [encoder_out, acoustic_embedding], mode=\"dymshape\"\n        )[0]\n        return decoder_out\n\n\ndef get_acoustic_embedding(alpha: np.array, hidden: np.array):\n    \"\"\"\n    Args:\n      alpha: (T,)\n      hidden: (T, C)\n    Returns:\n      acoustic_embeds: (num_tokens, C)\n    \"\"\"\n    alpha = alpha.tolist()\n    acc = 0\n\n    embeddings = []\n    cur_embedding = np.zeros((hidden.shape[1],), dtype=np.float32)\n\n    for i, w in enumerate(alpha):\n        acc += w\n        if acc >= 1:\n            overflow = acc - 1\n            remain = w - overflow\n            cur_embedding += remain * hidden[i]\n            embeddings.append(cur_embedding)\n\n            cur_embedding = overflow * hidden[i]\n            acc = overflow\n        else:\n            cur_embedding += w * hidden[i]\n\n    if len(embeddings) == 0:\n        raise ValueError(\"No speech in the audio file\")\n\n    embeddings = 
np.array(embeddings)\n    return embeddings\n\n\ndef main():\n    features = compute_feat(\"./test_wavs/1.wav\")\n    print(\"here\", features.shape, features.shape[0] > 83)\n\n    print(\"features.shape\", features.shape)\n\n    print(\"sum\", features.sum(), features.mean())\n\n    model = OmModel()\n\n    encoder_out = model.run_encoder(features[None])\n    print(\"encoder_out.shape\", encoder_out.shape)\n    print(\"encoder_out.sum\", encoder_out.sum(), encoder_out.mean())\n\n    alpha = model.run_predictor(encoder_out)\n    print(\"alpha.shape\", alpha.shape)\n    print(\"alpha.sum()\", alpha.sum(), alpha.mean())\n\n    acoustic_embedding = get_acoustic_embedding(alpha[0], encoder_out[0])\n    print(\"acoustic_embedding.shape\", acoustic_embedding.shape)\n    num_tokens = acoustic_embedding.shape[0]\n    print(\"num_tokens\", num_tokens)\n\n    print(\"acoustic_embedding.sum\", acoustic_embedding.sum(), acoustic_embedding.mean())\n\n    decoder_out = model.run_decoder(encoder_out, acoustic_embedding[None])\n    print(\"decoder_out\", decoder_out.shape)\n    print(\"decoder_out.sum\", decoder_out.sum(), decoder_out.mean())\n    yseq = decoder_out[0, :num_tokens].argmax(axis=-1).tolist()\n    print(yseq, \"-->\", len(yseq))\n\n    tokens = load_tokens()\n    words = [tokens[i] for i in yseq if i not in (1, 2)]\n    print(words)\n    text = \"\".join(words)\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/qnn/.gitignore",
    "content": "*.raw\n*-list.txt\n"
  },
  {
    "path": "scripts/paraformer/qnn/convert_decoder.sh",
    "content": "#!/usr/bin/env bash\n\nif [ -z $t ]; then\n  echo \"Please run export t=num_input_seconds\"\n  exit 1\nfi\n\nif [ -z $soc ]; then\n  echo \"Please run export soc=SM8850, etc.\"\n  exit 1\nfi\n\nif [ -z $QNN_SDK_ROOT ]; then\n  echo \"Please run setup QNN first\"\n  exit 1\nfi\n\necho \"Export to onnx with num_seconds $t\"\n\npython3 ./export_decoder_onnx.py --input-len-in-seconds $t --opset-version 17 --float-mask 0\n\nls -lh decoder-*.onnx\n\npython3 ../../pyannote/segmentation/show-onnx.py --filename ./decoder-$t-seconds.onnx\n\necho \"Generate test data\"\n\npython3 ./generate_decoder_data.py --input-len-in-seconds $t\n\nls -lh decoder-*\n\necho \"---\"\ncat ./decoder-input-list.txt\necho \"---\"\n\necho \"Convert onnx to qnn\"\n\n\nqnn-onnx-converter \\\n  --input_network ./decoder-$t-seconds.onnx \\\n  --output_path ./decoder-$t-seconds-quantized \\\n  --input_list ./decoder-input-list.txt \\\n  --use_native_input_files  \\\n  --input_dtype encoder_out float32 \\\n  --input_dtype acoustic_embedding float32 \\\n  --input_dtype mask int32 \\\n  --act_bitwidth 16 \\\n  --bias_bitwidth 32\n\n  # Note(fangjun): It throws an error if we specify the layout for decoder inputs.\n  # --input_layout encoder_out NTF\n\nls -lh\n\nmv -v decoder-$t-seconds-quantized decoder-$t-seconds-quantized.cpp\n\npython3 ../../qnn/generate_config.py \\\n    --soc $soc \\\n    --graph-name \"decoder_${t}_seconds_quantized\" \\\n    --output-dir ./my-config-3 \\\n    --qnn-sdk-root $QNN_SDK_ROOT\n\nls -lh my-config-3\n\nhead -n100 ./my-config-3/*.json\n\npython3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n    -c \"decoder-$t-seconds-quantized.cpp\" \\\n    -b \"decoder-$t-seconds-quantized.bin\" \\\n    -o model_libs\n    # -t x86_64-linux-clang \\\n\nls -lh model_libs/x86_64-linux-clang/\n\n$QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n  --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n  --model 
./model_libs/x86_64-linux-clang/libdecoder-${t}-seconds-quantized.so \\\n  --output_dir ./binary \\\n  --binary_file decoder \\\n  --config_file ./my-config-3/htp_backend_extensions.json\n\nls -lh binary\n\necho \"Finished exporting decoder\"\n"
  },
  {
    "path": "scripts/paraformer/qnn/convert_encoder.sh",
    "content": "#!/usr/bin/env bash\n\nif [ -z $t ]; then\n  echo \"Please run export t=num_input_seconds\"\n  exit 1\nfi\n\nif [ -z $soc ]; then\n  echo \"Please run export soc=SM8850, etc.\"\n  exit 1\nfi\n\nif [ -z $QNN_SDK_ROOT ]; then\n  echo \"Please run setup QNN first\"\n  exit 1\nfi\n\necho \"Export to onnx with num_seconds $t\"\n\npython3 ./export_encoder_onnx.py --input-len-in-seconds $t --opset-version 17\n\nls -lh encoder-*.onnx\n\npython3 ../../pyannote/segmentation/show-onnx.py --filename ./encoder-$t-seconds.onnx\n\necho \"Generate test data\"\n\npython3 ./generate_encoder_data.py --input-len-in-seconds $t\n\nls -lh encoder-*\n\necho \"---\"\ncat ./encoder-input-list.txt\necho \"---\"\n\necho \"Convert onnx to qnn\"\n\n\nqnn-onnx-converter \\\n  --input_network ./encoder-$t-seconds.onnx \\\n  --output_path ./encoder-$t-seconds-quantized \\\n  --out_node encoder_out \\\n  --input_list ./encoder-input-list.txt \\\n  --use_native_input_files  \\\n  --input_dtype x float32 \\\n  --act_bitwidth 16 \\\n  --bias_bitwidth 32 \\\n  --input_layout x NTF\n\nls -lh\n\nmv -v encoder-$t-seconds-quantized encoder-$t-seconds-quantized.cpp\n\npython3 ../../qnn/generate_config.py \\\n    --soc $soc \\\n    --graph-name \"encoder_${t}_seconds_quantized\" \\\n    --output-dir ./my-config \\\n    --qnn-sdk-root $QNN_SDK_ROOT\n\nls -lh my-config\n\nhead -n100 ./my-config/*.json\n\npython3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n    -c \"encoder-$t-seconds-quantized.cpp\" \\\n    -b \"encoder-$t-seconds-quantized.bin\" \\\n    -o model_libs\n    # -t x86_64-linux-clang \\\n\nls -lh model_libs/x86_64-linux-clang/\n\n$QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n  --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n  --model ./model_libs/x86_64-linux-clang/libencoder-${t}-seconds-quantized.so \\\n  --output_dir ./binary \\\n  --binary_file encoder \\\n  --config_file 
./my-config/htp_backend_extensions.json\n\nls -lh binary\n\necho \"Finished exporting encoder\"\n"
  },
  {
    "path": "scripts/paraformer/qnn/convert_predictor.sh",
    "content": "#!/usr/bin/env bash\n\nif [ -z $t ]; then\n  echo \"Please run export t=num_input_seconds\"\n  exit 1\nfi\n\nif [ -z $soc ]; then\n  echo \"Please run export soc=SM8850, etc.\"\n  exit 1\nfi\n\nif [ -z $QNN_SDK_ROOT ]; then\n  echo \"Please run setup QNN first\"\n  exit 1\nfi\n\necho \"Export to onnx with num_seconds $t\"\n\npython3 ./export_predictor_onnx.py --input-len-in-seconds $t --opset-version 17\n\nls -lh predictor-*.onnx\n\npython3 ../../pyannote/segmentation/show-onnx.py --filename ./predictor-$t-seconds.onnx\n\necho \"Generate test data\"\n\npython3 ./generate_predictor_data.py --input-len-in-seconds $t\n\nls -lh predictor-*\n\necho \"---\"\ncat ./predictor-input-list.txt\necho \"---\"\n\necho \"Convert onnx to qnn\"\n\n\nqnn-onnx-converter \\\n  --input_network ./predictor-$t-seconds.onnx \\\n  --output_path ./predictor-$t-seconds-quantized \\\n  --input_list ./predictor-input-list.txt \\\n  --use_native_input_files  \\\n  --input_dtype encoder_out float32 \\\n  --act_bitwidth 16 \\\n  --bias_bitwidth 32\n\n  # Note(fangjun): It throws an error if we specify the layout for predictor input.\n  # --input_layout encoder_out NTF\n\nls -lh\n\nmv -v predictor-$t-seconds-quantized predictor-$t-seconds-quantized.cpp\n\npython3 ../../qnn/generate_config.py \\\n    --soc $soc \\\n    --graph-name \"predictor_${t}_seconds_quantized\" \\\n    --output-dir ./my-config-2 \\\n    --qnn-sdk-root $QNN_SDK_ROOT\n\nls -lh my-config-2\n\nhead -n100 ./my-config-2/*.json\n\npython3 \"${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-model-lib-generator\" \\\n    -c \"predictor-$t-seconds-quantized.cpp\" \\\n    -b \"predictor-$t-seconds-quantized.bin\" \\\n    -o model_libs\n    # -t x86_64-linux-clang \\\n\nls -lh model_libs/x86_64-linux-clang/\n\n$QNN_SDK_ROOT/bin/x86_64-linux-clang/qnn-context-binary-generator \\\n  --backend $QNN_SDK_ROOT/lib/x86_64-linux-clang/libQnnHtp.so \\\n  --model ./model_libs/x86_64-linux-clang/libpredictor-${t}-seconds-quantized.so \\\n 
 --output_dir ./binary \\\n  --binary_file predictor \\\n  --config_file ./my-config-2/htp_backend_extensions.json\n\nls -lh binary\n\necho \"Finished exporting predictor\"\n"
  },
  {
    "path": "scripts/paraformer/qnn/generate_decoder_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport glob\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\n\nfrom export_encoder_onnx import get_args, get_num_input_frames, load_model\nfrom export_predictor_onnx import modified_predictor_forward\nfrom test_onnx import compute_feat, get_acoustic_embedding\nfrom torch_model import CifPredictorV2\n\nCifPredictorV2.forward = modified_predictor_forward\n\n\ndef pad(features, max_len):\n    if features.shape[0] > max_len:\n        return features[:max_len]\n    elif features.shape[0] < max_len:\n        features = np.pad(\n            features,\n            ((0, max_len - features.shape[0]), (0, 0)),\n            mode=\"constant\",\n            constant_values=0,\n        )\n    return features\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    wav_files = glob.glob(\"*.wav\")\n\n    model = load_model()\n\n    name_list = []\n    for w in wav_files:\n        f = compute_feat(w)\n        print(w, f.shape)\n        f = pad(f, num_input_frames)\n        f = f[None]\n        print(f.shape)\n\n        f = torch.from_numpy(f)\n\n        encoder_out = model.encoder(f)\n        alpha = model.predictor(encoder_out)\n\n        acoustic_embedding = get_acoustic_embedding(\n            alpha[0].numpy(), encoder_out[0].numpy()\n        )\n        acoustic_embedding = torch.from_numpy(acoustic_embedding[None])\n        num_tokens = acoustic_embedding.shape[1]\n\n        acoustic_embedding = torch.nn.functional.pad(\n            acoustic_embedding,\n            (0, 0, 0, encoder_out.shape[1] - num_tokens),\n            \"constant\",\n            0,\n        )\n\n        mask = torch.zeros(1, encoder_out.shape[1], dtype=torch.int32)\n\n        mask[0, :num_tokens] = 1\n\n        # NOTE(Fangjun): We have to transpose the 
data since QNN expects\n        # (N, C, T) for the decoder model\n        # Not sure why it has such a requirement.\n\n        encoder_out = encoder_out.permute(0, 2, 1).clone().numpy()\n        acoustic_embedding = acoustic_embedding.permute(0, 2, 1).clone().numpy()\n\n        print(\"inputs: \", encoder_out.shape, acoustic_embedding.shape, mask.shape)\n\n        name = Path(w).stem\n\n        first = f\"decoder-input-{name}-0.raw\"\n        second = f\"decoder-input-{name}-1.raw\"\n        third = f\"decoder-input-{name}-2.raw\"\n        encoder_out.tofile(first)\n        acoustic_embedding.tofile(second)\n        mask.numpy().tofile(third)\n\n        name_list.append((first, second, third))\n\n    with open(\"decoder-input-list.txt\", \"w\") as f:\n        for first, second, third in name_list:\n            f.write(f\"{first} {second} {third}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/qnn/generate_encoder_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport glob\nfrom pathlib import Path\n\nimport numpy as np\n\nfrom export_encoder_onnx import get_args, get_num_input_frames\nfrom test_onnx import compute_feat\n\n\ndef pad(features, max_len):\n    if features.shape[0] > max_len:\n        return features[:max_len]\n    elif features.shape[0] < max_len:\n        features = np.pad(\n            features,\n            ((0, max_len - features.shape[0]), (0, 0)),\n            mode=\"constant\",\n            constant_values=0,\n        )\n    return features\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    wav_files = glob.glob(\"*.wav\")\n    features_name = []\n    for w in wav_files:\n        f = compute_feat(w)\n        print(w, f.shape)\n        f = pad(f, num_input_frames)\n        print(f.shape)\n        print()\n        name = Path(w).stem\n\n        s = f\"encoder-input-{name}.raw\"\n        f.tofile(s)\n        features_name.append(s)\n\n    with open(\"encoder-input-list.txt\", \"w\") as f:\n        for line in features_name:\n            f.write(f\"{line}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/qnn/generate_predictor_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport glob\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\n\nfrom export_encoder_onnx import get_args, get_num_input_frames, load_model\nfrom export_predictor_onnx import modified_predictor_forward\nfrom test_onnx import compute_feat\nfrom torch_model import CifPredictorV2\n\nCifPredictorV2.forward = modified_predictor_forward\n\n\ndef pad(features, max_len):\n    if features.shape[0] > max_len:\n        return features[:max_len]\n    elif features.shape[0] < max_len:\n        features = np.pad(\n            features,\n            ((0, max_len - features.shape[0]), (0, 0)),\n            mode=\"constant\",\n            constant_values=0,\n        )\n    return features\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    wav_files = glob.glob(\"*.wav\")\n\n    model = load_model()\n\n    name_list = []\n    for w in wav_files:\n        f = compute_feat(w)\n        print(w, f.shape)\n        f = pad(f, num_input_frames)\n        f = f[None]\n        print(f.shape)\n\n        f = torch.from_numpy(f)\n\n        encoder_out = model.encoder(f)\n\n        # NOTE(Fangjun): We have to transpose the data since QNN expects\n        # (N, C, T) for the predictor model\n        # Not sure why it has such a requirement.\n\n        encoder_out = encoder_out.transpose(1, 2).clone().numpy()\n\n        print(\"encoder_out\", encoder_out.shape)\n\n        name = Path(w).stem\n\n        s = f\"predictor-input-{name}.raw\"\n        encoder_out.tofile(s)\n        name_list.append(s)\n        print(encoder_out.shape)\n\n    with open(\"predictor-input-list.txt\", \"w\") as f:\n        for line in name_list:\n            f.write(f\"{line}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/qnn/test_qnn.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport numpy as np\nimport torch\n\nfrom export_encoder_onnx import load_model\nfrom export_predictor_onnx import modified_predictor_forward\nfrom test_onnx import get_acoustic_embedding\nfrom torch_model import CifPredictorV2\n\nCifPredictorV2.forward = modified_predictor_forward\n\n\ndef load_tokens():\n    id2token = dict()\n    with open(\"./tokens.txt\") as f:\n        for line in f:\n            fields = line.strip().split()\n            id2token[int(fields[1])] = fields[0]\n    return id2token\n\n\n@torch.no_grad()\ndef main():\n    model = load_model()\n    encoder_params = sum(p.numel() for p in model.encoder.parameters())\n    predictor_params = sum(p.numel() for p in model.predictor.parameters())\n    decoder_params = sum(p.numel() for p in model.decoder.parameters())\n    print(\"encoder params (M)\", encoder_params / 1024 / 1024)\n    print(\"predictor params (M)\", predictor_params / 1024 / 1024)\n    print(\"decoder params (M)\", decoder_params / 1024 / 1024)\n\n    features = np.fromfile(\"./encoder-input-zh.raw\", dtype=np.float32).reshape(\n        (1, -1, 560)\n    )\n    features = torch.from_numpy(features)\n    encoder_out = model.encoder(features)\n    encoder_out.permute(0, 2, 1).numpy().tofile(\"predictor-in.raw\")\n\n    alpha = model.predictor(encoder_out)\n\n    acoustic_embedding = get_acoustic_embedding(\n        alpha[0].numpy(), encoder_out[0].numpy()\n    )\n    acoustic_embedding = torch.from_numpy(acoustic_embedding[None])\n\n    num_tokens = acoustic_embedding.shape[1]\n\n    acoustic_embedding = torch.nn.functional.pad(\n        acoustic_embedding,\n        (0, 0, 0, encoder_out.shape[1] - num_tokens),\n        \"constant\",\n        0,\n    )\n\n    mask = torch.zeros(1, encoder_out.shape[1], dtype=torch.float32)\n\n    mask[0, :num_tokens] = 1\n    logits = model.decoder(encoder_out, acoustic_embedding, mask)\n    
print(\"encoder_out\", encoder_out.shape)\n    print(\"acoustic_embedding\", acoustic_embedding.shape)\n    print(\"mask\", mask.shape)\n\n    encoder_out.permute(0, 2, 1).numpy().tofile(\"encoder_out.raw\")\n    acoustic_embedding.permute(0, 2, 1).numpy().tofile(\"acoustic_embedding.raw\")\n    mask.to(torch.int32).numpy().tofile(\"mask.raw\")\n\n    yseq = logits[0, :num_tokens].argmax(axis=-1).tolist()\n    print(yseq, \"-->\", len(yseq))\n\n    id2token = load_tokens()\n    text = [id2token[i] for i in yseq]\n    print(text)\n\n    if False:\n        qnn_encoder_out = np.fromfile(\"./encoder_out.raw\", dtype=np.float32).reshape(\n            1, -1, 512\n        )\n\n        qnn_encoder_out = torch.from_numpy(qnn_encoder_out)\n\n        qnn_alpha = np.fromfile(\"./alphas.raw\", dtype=np.float32).reshape(1, -1)\n        qnn_alpha = torch.from_numpy(qnn_alpha)\n\n        acoustic_embedding = get_acoustic_embedding(\n            qnn_alpha[0].numpy(), qnn_encoder_out[0].numpy()\n        )\n        acoustic_embedding = torch.from_numpy(acoustic_embedding[None])\n\n        num_tokens = acoustic_embedding.shape[1]\n\n        acoustic_embedding = torch.nn.functional.pad(\n            acoustic_embedding,\n            (0, 0, 0, qnn_encoder_out.shape[1] - num_tokens),\n            \"constant\",\n            0,\n        )\n\n        mask = torch.zeros(1, qnn_encoder_out.shape[1], dtype=torch.float32)\n\n        mask[0, :num_tokens] = 1\n\n        logits = model.decoder(qnn_encoder_out, acoustic_embedding, mask)\n    else:\n        logits = np.fromfile(\"./decoder_out.raw\", dtype=np.float32).reshape(\n            1,\n            -1,\n            encoder_out.shape[1],\n        )\n        logits = torch.from_numpy(logits)\n        logits = logits.permute(0, 2, 1)\n\n    yseq = logits[0, :num_tokens].argmax(axis=-1).tolist()\n    print(yseq, \"-->\", len(yseq))\n    text = [id2token[i] for i in yseq]\n    print(text)\n\n\nif __name__ == \"__main__\":\n    
torch.manual_seed(20251013)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/download-example-model.sh",
    "content": "#!/usr/bin/env bash\n\nwget https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/am.mvn\nwget https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/config.yaml\nwget https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/tokens.json\nwget https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/seg_dict\nwget https://hf-mirror.com/csukuangfj/WSChuan-ASR/resolve/main/Paraformer-large-Chuan/model_state_dict.pt\n\npython3 ./export_encoder_onnx.py  --input-len-in-seconds 5\npython3 ./export_rknn.py --target-platform rk3588 --in-model ./encoder-5-seconds.onnx --out-model ./encoder-5-seconds.rknn\n\npython3 ./export_predictor_onnx.py  --input-len-in-seconds 5\npython3 ./export_rknn.py --target-platform rk3588 --in-model ./predictor-5-seconds.onnx --out-model ./predictor-5-seconds.rknn\n\npython3 ./export_decoder_onnx.py  --input-len-in-seconds 5\npython3 ./export_rknn.py --target-platform rk3588 --in-model ./decoder-5-seconds.onnx --out-model ./decoder-5-seconds.rknn\n"
  },
  {
    "path": "scripts/paraformer/rknn/export_decoder_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport torch\n\nfrom export_encoder_onnx import load_model, get_num_input_frames\n\nimport argparse\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--input-len-in-seconds\",\n        type=int,\n        required=True,\n        help=\"\"\"RKNN/QNN does not support dynamic shape, so we need to hard-code\n        how long the model can process.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--float-mask\",\n        type=int,\n        default=1,\n        help=\"1 to use float mask. 0 to use int32 mask\",\n    )\n\n    parser.add_argument(\n        \"--opset-version\",\n        type=int,\n        default=14,\n    )\n    return parser.parse_args()\n\n\n@torch.no_grad()\ndef main():\n    print(\"loading model\")\n    model = load_model()\n\n    args = get_args()\n\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    encoder_out = torch.randn(1, num_input_frames, 512, dtype=torch.float32)\n    acoustic_embedding = torch.randn(1, num_input_frames, 512, dtype=torch.float32)\n    if args.float_mask == 1:\n        mask = torch.ones([num_input_frames], dtype=torch.float32)\n    else:\n        mask = torch.ones([num_input_frames], dtype=torch.int32)\n\n    d = model.decoder(encoder_out, acoustic_embedding)\n    print(\"d\", d.shape)\n\n    opset_version = args.opset_version\n    filename = f\"decoder-{input_len_in_seconds}-seconds.onnx\"\n    torch.onnx.export(\n        model.decoder,\n        (encoder_out, acoustic_embedding, mask),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"encoder_out\", \"acoustic_embedding\", \"mask\"],\n        output_names=[\"decoder_out\"],\n        dynamic_axes={},\n    )\n    print(f\"Saved to {filename}\")\n\n\nif 
__name__ == \"__main__\":\n    torch.manual_seed(20251008)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/export_encoder_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport argparse\nimport os\nfrom typing import Any, Dict, List, Tuple\n\nimport onnx\nimport torch\nimport yaml\n\nfrom torch_model import Paraformer, SANMEncoder\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--input-len-in-seconds\",\n        type=int,\n        required=True,\n        help=\"\"\"RKNN does not support dynamic shape, so we need to hard-code\n        how long the model can process.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--opset-version\",\n        type=int,\n        default=14,\n    )\n    return parser.parse_args()\n\n\ndef load_cmvn(filename) -> Tuple[List[float], List[float]]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = list(map(lambda x: float(x), t))\n            else:\n                inv_stddev = list(map(lambda x: float(x), t))\n\n    return neg_mean, inv_stddev\n\n\nif __name__ == \"__main__\":\n\n    def modified_sanm_encoder_forward(\n        self: SANMEncoder, xs_pad: torch.Tensor, pos: torch.Tensor\n    ):\n        print(\"xs pad\", xs_pad.shape)\n        xs_pad = (xs_pad + self.neg_mean) * self.inv_stddev\n\n        xs_pad = xs_pad * self.output_size() ** 0.5\n\n        xs_pad = xs_pad + pos\n\n        xs_pad = self.encoders0(xs_pad)[0]\n\n        xs_pad = self.encoders(xs_pad)[0]\n\n        if self.normalize_before:\n            xs_pad = self.after_norm(xs_pad)\n\n        print(\"xs pad--->\", xs_pad.shape, pos.shape)\n\n        return xs_pad\n\n    #  SANMEncoder.forward = modified_sanm_encoder_forward\n\n\ndef load_model():\n    with open(\"./config.yaml\", \"r\", 
encoding=\"utf-8\") as f:\n        config = yaml.safe_load(f)\n\n    print(\"creating model\")\n\n    neg_mean, inv_stddev = load_cmvn(\"./am.mvn\")\n\n    neg_mean = torch.tensor(neg_mean, dtype=torch.float32)\n    inv_stddev = torch.tensor(inv_stddev, dtype=torch.float32)\n\n    m = Paraformer(\n        neg_mean=neg_mean,\n        inv_stddev=inv_stddev,\n        input_size=560,\n        vocab_size=8404,\n        encoder_conf=config[\"encoder_conf\"],\n        decoder_conf=config[\"decoder_conf\"],\n        predictor_conf=config[\"predictor_conf\"],\n    )\n    m.eval()\n\n    print(\"loading state dict\")\n    state_dict = torch.load(\"./model_state_dict.pt\", map_location=\"cpu\")\n    if \"state_dict\" in state_dict:\n        state_dict = state_dict[\"state_dict\"]\n\n    m.load_state_dict(state_dict)\n    del state_dict\n\n    return m\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\nlfr_window_size = 7\nlfr_window_shift = 6\n\n\ndef get_num_input_frames(input_len_in_seconds):\n    num_frames = input_len_in_seconds * 100\n    print(\"num_frames\", num_frames)\n\n    # num_input_frames is an approximate number\n    num_input_frames = int(num_frames / lfr_window_shift + 0.5)\n    print(\"num_input_frames\", num_input_frames)\n    return num_input_frames\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    print(\"loading model\")\n    model = load_model()\n\n    # frame shift is 10ms, 1 second has about 100 feature frames\n    
input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    x = torch.randn(1, num_input_frames, 560, dtype=torch.float32)\n    pos_emb = torch.rand(1, x.shape[1], 560, dtype=torch.float32)\n\n    opset_version = args.opset_version\n    filename = f\"encoder-{input_len_in_seconds}-seconds.onnx\"\n    torch.onnx.export(\n        model.encoder,\n        #  (x, pos_emb),\n        x,\n        filename,\n        opset_version=opset_version,\n        #  input_names=[\"x\", \"pos_emb\"],\n        input_names=[\"x\"],\n        output_names=[\"encoder_out\"],\n        dynamic_axes={},\n    )\n\n    model_author = os.environ.get(\"model_author\", \"iic\")\n    comment = os.environ.get(\n        \"comment\",\n        \"iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch\",\n    )\n    url = os.environ.get(\"url\", \"https://github.com/alibaba-damo-academy/FunASR\")\n\n    meta_data = {\n        \"lfr_window_size\": lfr_window_size,\n        \"lfr_window_shift\": lfr_window_shift,\n        \"num_input_frames\": num_input_frames,\n        \"normalize_samples\": 0,  # input should be in the range [-32768, 32767]\n        \"model_type\": \"paraformer\",\n        \"version\": \"1\",\n        \"model_author\": model_author,\n        \"maintainer\": \"k2-fsa\",\n        \"vocab_size\": 8404,\n        \"comment\": comment,\n        \"url\": url,\n        \"rknn\": 1,\n    }\n\n    add_meta_data(filename=filename, meta_data=meta_data)\n    print(f\"Saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251013)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/export_predictor_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport torch\n\nfrom export_encoder_onnx import load_model, get_args, get_num_input_frames\nfrom torch_model import CifPredictorV2\n\n\ndef modified_predictor_forward(self: CifPredictorV2, hidden: torch.Tensor):\n    h = hidden\n    context = h.transpose(1, 2)\n    queries = self.pad(context)\n    output = torch.relu(self.cif_conv1d(queries))\n    output = output.transpose(1, 2)\n\n    output = self.cif_output(output)\n    alphas = torch.sigmoid(output)\n    alphas = torch.nn.functional.relu(\n        alphas * self.smooth_factor - self.noise_threshold\n    )\n\n    alphas = alphas.squeeze(-1)\n\n    return alphas\n\n\nif __name__ == \"__main__\":\n    CifPredictorV2.forward = modified_predictor_forward\n\n\n@torch.no_grad()\ndef main():\n    print(\"loading model\")\n    model = load_model()\n\n    args = get_args()\n\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_input_frames = get_num_input_frames(input_len_in_seconds)\n\n    x = torch.randn(1, num_input_frames, 512, dtype=torch.float32)\n\n    opset_version = args.opset_version\n    filename = f\"predictor-{input_len_in_seconds}-seconds.onnx\"\n    torch.onnx.export(\n        model.predictor,\n        x,\n        filename,\n        opset_version=opset_version,\n        input_names=[\"encoder_out\"],\n        output_names=[\"alphas\"],\n        dynamic_axes={},\n    )\n    print(f\"Saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251008)\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/export_rknn.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nimport argparse\nimport logging\nfrom pathlib import Path\n\nfrom rknn.api import RKNN\n\nlogging.basicConfig(level=logging.WARNING)\n\ng_platforms = [\n    #  \"rk3562\",\n    #  \"rk3566\",\n    #  \"rk3568\",\n    #  \"rk3576\",\n    \"rk3588\",\n]\n\n\ndef get_parser():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--target-platform\",\n        type=str,\n        required=True,\n        help=f\"Supported values are: {','.join(g_platforms)}\",\n    )\n\n    parser.add_argument(\n        \"--in-model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model\",\n    )\n\n    parser.add_argument(\n        \"--out-model\",\n        type=str,\n        required=True,\n        help=\"Path to the output rknn model\",\n    )\n\n    return parser\n\n\ndef get_meta_data(model: str):\n    import onnxruntime\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.inter_op_num_threads = 1\n    session_opts.intra_op_num_threads = 1\n\n    m = onnxruntime.InferenceSession(\n        model,\n        sess_options=session_opts,\n        providers=[\"CPUExecutionProvider\"],\n    )\n\n    for i in m.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in m.get_outputs():\n        print(i)\n    print()\n\n    meta = m.get_modelmeta().custom_metadata_map\n    s = \"\"\n    sep = \"\"\n    for key, value in meta.items():\n        s = s + sep + f\"{key}={value}\"\n        sep = \";\"\n    assert len(s) < 1024, len(s)\n\n    print(\"len(s)\", len(s), s)\n\n    return s\n\n\ndef export_rknn(rknn, filename):\n    ret = rknn.export_rknn(filename)\n    if ret != 0:\n        exit(f\"Export rknn model to {filename} failed!\")\n\n\ndef init_model(filename: str, target_platform: str, custom_string=None):\n    rknn = 
RKNN(verbose=False)\n\n    rknn.config(\n        optimization_level=0,\n        target_platform=target_platform,\n        custom_string=custom_string,\n    )\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    ret = rknn.load_onnx(model=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn.build(do_quantization=False)\n    if ret != 0:\n        exit(f\"Build model {filename} failed!\")\n\n    return rknn\n\n\nclass RKNNModel:\n    def __init__(\n        self,\n        model: str,\n        target_platform: str,\n    ):\n        meta = get_meta_data(model)\n        print(meta)\n\n        self.model = init_model(\n            model,\n            target_platform=target_platform,\n            custom_string=meta,\n        )\n\n    def export_rknn(self, model):\n        export_rknn(self.model, model)\n\n    def release(self):\n        self.model.release()\n\n\ndef main():\n    args = get_parser().parse_args()\n    print(vars(args))\n\n    model = RKNNModel(\n        model=args.in_model,\n        target_platform=args.target_platform,\n    )\n\n    model.export_rknn(\n        model=args.out_model,\n    )\n\n    model.release()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport kaldi_native_fbank as knf\nimport onnxruntime as ort\nimport librosa\nimport torch\nimport numpy as np\n\n\nclass SinusoidalPositionEncoder(torch.nn.Module):\n    def encode(\n        self,\n        positions: torch.Tensor = None,\n        depth: int = None,\n        dtype: torch.dtype = torch.float32,\n    ):\n        \"\"\"\n        Args:\n          positions: (batch_size, )\n        \"\"\"\n        batch_size = positions.size(0)\n        positions = positions.type(dtype)\n        device = positions.device\n        log_timescale_increment = torch.log(\n            torch.tensor([10000], dtype=dtype, device=device)\n        ) / (depth / 2 - 1)\n        inv_timescales = torch.exp(\n            torch.arange(depth / 2, device=device).type(dtype)\n            * (-log_timescale_increment)\n        )\n        inv_timescales = torch.reshape(inv_timescales, [batch_size, -1])\n        scaled_time = torch.reshape(positions, [1, -1, 1]) * torch.reshape(\n            inv_timescales, [1, 1, -1]\n        )\n        encoding = torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], dim=2)\n        return encoding.type(dtype)\n\n    def forward(self, batch_size, timesteps, input_dim):\n        positions = torch.arange(1, timesteps + 1)[None, :]\n        position_encoding = self.encode(positions, input_dim, torch.float32)\n\n        return position_encoding\n\n\ndef compute_feat(filename):\n    sample_rate = 16000\n    samples, _ = librosa.load(filename, sr=sample_rate)\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in 
range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n    #  print(\"features sum\", features.sum(), features.shape)\n\n    window_size = 7  # lfr_m\n    window_shift = 6  # lfr_n\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n    return np.copy(features)\n\n\ndef load_tokens():\n    ans = dict()\n    i = 0\n    with open(\"tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\nclass OnnxModel:\n    def __init__(self):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        print(\"init encoder\")\n        self.encoder = ort.InferenceSession(\n            \"./encoder-5-seconds.onnx\",\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(\"init decoder\")\n        self.decoder = ort.InferenceSession(\n            \"./decoder-5-seconds.onnx\",\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(\"init predictor\")\n        self.predictor = ort.InferenceSession(\n            \"./predictor-5-seconds.onnx\",\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(\"---encoder---\")\n        for i in self.encoder.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.encoder.get_outputs():\n            print(i)\n\n        print(\"---decoder---\")\n        for i in self.decoder.get_inputs():\n            
print(i)\n\n        print(\"-----\")\n\n        for i in self.decoder.get_outputs():\n            print(i)\n\n        print(\"---predictor---\")\n        for i in self.predictor.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.predictor.get_outputs():\n            print(i)\n\n    #  def run_encoder(self, features, pos_emb):\n    def run_encoder(self, features):\n        (encoder_out,) = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n            ],\n            {\n                self.encoder.get_inputs()[0].name: features,\n                #  self.encoder.get_inputs()[1].name: pos_emb,\n            },\n        )\n        return encoder_out\n\n    def run_predictor(self, encoder_out):\n        (alphas,) = self.predictor.run(\n            [\n                self.predictor.get_outputs()[0].name,\n            ],\n            {\n                self.predictor.get_inputs()[0].name: encoder_out,\n            },\n        )\n        return alphas\n\n    #  def run_decoder(self, encoder_out, acoustic_embedding, mask):\n    def run_decoder(self, encoder_out, acoustic_embedding, mask):\n        print(\n            self.decoder.get_outputs()[0].name,\n            self.decoder.get_inputs()[0].name,\n            self.decoder.get_inputs()[1].name,\n        )\n        (decoder_out,) = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n            ],\n            {\n                self.decoder.get_inputs()[0].name: encoder_out,\n                self.decoder.get_inputs()[1].name: acoustic_embedding,\n                self.decoder.get_inputs()[2].name: mask,\n            },\n        )\n        return decoder_out\n\n\ndef get_acoustic_embedding(alpha: np.array, hidden: np.array):\n    \"\"\"\n    Args:\n      alpha: (T,)\n      hidden: (T, C)\n    Returns:\n      acoustic_embeds: (num_tokens, C)\n    \"\"\"\n    alpha = alpha.tolist()\n    acc = 0\n    num_tokens = 0\n\n  
  embeddings = []\n    cur_embedding = np.zeros((hidden.shape[1],), dtype=np.float32)\n\n    for i, w in enumerate(alpha):\n        acc += w\n        if acc >= 1:\n            overflow = acc - 1\n            remain = w - overflow\n            cur_embedding += remain * hidden[i]\n            embeddings.append(cur_embedding)\n\n            cur_embedding = overflow * hidden[i]\n            acc = overflow\n        else:\n            cur_embedding += w * hidden[i]\n\n    if len(embeddings) == 0:\n        raise ValueError(\"No speech in the audio file\")\n\n    embeddings = np.array(embeddings)\n    return embeddings\n\n\ndef main():\n    features = compute_feat(\"./1.wav\")\n    print(\"here\", features.shape, features.shape[0] > 83)\n    if features.shape[0] >= 83:\n        features = features[:83]\n    else:\n        padding = features[-(83 - features.shape[0]) :]\n        print(\"padding\", features.shape, padding.shape)\n        features = np.concatenate([features, padding])\n\n    pos_emb = (\n        SinusoidalPositionEncoder()(1, features.shape[0], features.shape[1])\n        .squeeze(0)\n        .numpy()\n    )\n\n    print(\"features.shape\", features.shape, pos_emb.shape)\n\n    print(\"sum\", features.sum(), features.mean(), pos_emb.sum(), pos_emb.mean())\n\n    model = OnnxModel()\n\n    #  encoder_out = model.run_encoder(features[None], pos_emb[None])\n    encoder_out = model.run_encoder(features[None])\n    print(\"encoder_out.shape\", encoder_out.shape)\n    print(\"encoder_out.sum\", encoder_out.sum(), encoder_out.mean())\n\n    alpha = model.run_predictor(encoder_out)\n    print(\"alpha.shape\", alpha.shape)\n    print(\"alpha.sum()\", alpha.sum(), alpha.mean())\n\n    acoustic_embedding = get_acoustic_embedding(alpha[0], encoder_out[0])\n    print(\"acoustic_embedding.shape\", acoustic_embedding.shape)\n    num_tokens = acoustic_embedding.shape[0]\n\n    padding = np.zeros((83 - acoustic_embedding.shape[0], 512), dtype=np.float32)\n    
print(\"padding.shape\", padding.shape, acoustic_embedding.shape)\n\n    acoustic_embedding = np.concatenate([acoustic_embedding, padding], axis=0)\n    print(\"acoustic_embedding.shape\", acoustic_embedding.shape)\n    print(\"acoustic_embedding.sum\", acoustic_embedding.sum(), acoustic_embedding.mean())\n\n    mask = np.zeros((83,), dtype=np.float32)\n    mask[:num_tokens] = 1\n    print(mask)\n\n    decoder_out = model.run_decoder(encoder_out, acoustic_embedding[None], mask)\n    #  decoder_out = model.run_decoder(encoder_out, acoustic_embedding[None])\n    print(\"decoder_out\", decoder_out.shape)\n    print(\"decoder_out.sum\", decoder_out.sum(), decoder_out.mean())\n    yseq = decoder_out[0, :num_tokens].argmax(axis=-1).tolist()\n    print(yseq, \"-->\", len(yseq))\n\n    tokens = load_tokens()\n    words = [tokens[i] for i in yseq if i not in (1, 2)]\n    print(words)\n    text = \"\".join(words)\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/paraformer/rknn/torch_model.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nCode in this file is copied and modified from\nhttps://github.com/modelscope/FunASR\n\"\"\"\nimport math\nfrom typing import Dict, List, Optional, Tuple\n\nimport torch\nimport torch.nn as nn\n\n\nclass EncoderLayerSANM(nn.Module):\n    def __init__(\n        self,\n        in_size,\n        size,\n        self_attn,\n        feed_forward,\n        dropout_rate,\n        normalize_before=True,\n        concat_after=False,\n        stochastic_depth_rate=0.0,\n    ):\n        \"\"\"Construct an EncoderLayer object.\"\"\"\n        super().__init__()\n        self.self_attn = self_attn\n        self.feed_forward = feed_forward\n        self.norm1 = torch.nn.LayerNorm(in_size)\n        self.norm2 = torch.nn.LayerNorm(size)\n        self.dropout = nn.Dropout(dropout_rate)\n        self.in_size = in_size\n        self.size = size\n        self.normalize_before = normalize_before\n        self.concat_after = concat_after\n        if self.concat_after:\n            self.concat_linear = nn.Linear(size + size, size)\n        self.stochastic_depth_rate = stochastic_depth_rate\n        self.dropout_rate = dropout_rate\n\n    def forward(\n        self,\n        x,\n        mask=None,\n        cache=None,\n        mask_shfit_chunk=None,\n        mask_att_chunk_encoder=None,\n    ):\n        \"\"\"Compute encoded features.\n\n        Args:\n            x_input (torch.Tensor): Input tensor (#batch, time, size).\n            mask (torch.Tensor): Mask tensor for the input (#batch, time).\n            cache (torch.Tensor): Cache tensor of the input (#batch, time - 1, size).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time, size).\n            torch.Tensor: Mask tensor (#batch, time).\n\n        \"\"\"\n        residual = x\n        if self.normalize_before:\n            x = self.norm1(x)\n\n        if self.in_size == self.size:\n            x = residual + self.dropout(\n                self.self_attn(\n         
           x,\n                    mask,\n                    mask_shfit_chunk=mask_shfit_chunk,\n                    mask_att_chunk_encoder=mask_att_chunk_encoder,\n                )\n            )\n        else:\n            x = self.dropout(\n                self.self_attn(\n                    x,\n                    mask,\n                    mask_shfit_chunk=mask_shfit_chunk,\n                    mask_att_chunk_encoder=mask_att_chunk_encoder,\n                )\n            )\n\n        if not self.normalize_before:\n            x = self.norm1(x)\n\n        residual = x\n        if self.normalize_before:\n            x = self.norm2(x)\n\n        x = residual + self.dropout(self.feed_forward(x))\n\n        if not self.normalize_before:\n            x = self.norm2(x)\n\n        x = torch.clamp(x, -60000.0, 60000.0)\n\n        return x, mask, cache, mask_shfit_chunk, mask_att_chunk_encoder\n\n\nclass MultiSequential(torch.nn.Sequential):\n    \"\"\"Multi-input multi-output torch.nn.Sequential.\"\"\"\n\n    def __init__(self, *args, layer_drop_rate=0.0):\n        \"\"\"Initialize MultiSequential with layer_drop.\n\n        Args:\n            layer_drop_rate (float): Probability of dropping out each fn (layer).\n\n        \"\"\"\n        super().__init__(*args)\n        self.layer_drop_rate = layer_drop_rate\n\n    def forward(self, *args):\n        \"\"\"Repeat.\"\"\"\n        for idx, m in enumerate(self):\n            args = m(*args)\n        return args\n\n\ndef repeat(N, fn, layer_drop_rate=0.0):\n    \"\"\"Repeat module N times.\n\n    Args:\n        N (int): Number of repeat time.\n        fn (Callable): Function to generate module.\n        layer_drop_rate (float): Probability of dropping out each fn (layer).\n\n    Returns:\n        MultiSequential: Repeated model instance.\n\n    \"\"\"\n    return MultiSequential(*[fn(n) for n in range(N)], layer_drop_rate=layer_drop_rate)\n\n\nclass MultiHeadedAttentionSANM(nn.Module):\n    \"\"\"Multi-Head Attention 
layer.\n\n    Args:\n        n_head (int): The number of heads.\n        n_feat (int): The number of features.\n        dropout_rate (float): Dropout rate.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        n_head,\n        in_feat,\n        n_feat,\n        dropout_rate,\n        kernel_size,\n        sanm_shfit=0,\n        lora_list=None,\n        lora_rank=8,\n        lora_alpha=16,\n        lora_dropout=0.1,\n    ):\n        \"\"\"Construct an MultiHeadedAttention object.\"\"\"\n        super().__init__()\n\n        assert lora_list is None\n\n        assert n_feat % n_head == 0, (n_feat, n_head)\n\n        # We assume d_v always equals d_k\n        self.d_k = n_feat // n_head\n        self.h = n_head\n        # self.linear_q = nn.Linear(n_feat, n_feat)\n        # self.linear_k = nn.Linear(n_feat, n_feat)\n        # self.linear_v = nn.Linear(n_feat, n_feat)\n\n        self.linear_out = nn.Linear(n_feat, n_feat)\n        self.linear_q_k_v = nn.Linear(in_feat, n_feat * 3)\n        self.dropout = nn.Dropout(p=dropout_rate)\n\n        self.fsmn_block = nn.Conv1d(\n            n_feat, n_feat, kernel_size, stride=1, padding=0, groups=n_feat, bias=False\n        )\n        # padding\n        left_padding = (kernel_size - 1) // 2\n        if sanm_shfit > 0:\n            left_padding = left_padding + sanm_shfit\n        right_padding = kernel_size - 1 - left_padding\n        self.pad_fn = nn.ConstantPad1d((left_padding, right_padding), 0.0)\n\n    def forward_fsmn(self, inputs, mask=None, mask_shfit_chunk=None):\n        b, t, d = inputs.size()\n        if mask is not None:\n            mask = torch.reshape(mask, (b, -1, 1))\n            if mask_shfit_chunk is not None:\n                mask = mask * mask_shfit_chunk\n            inputs = inputs * mask\n\n        x = inputs.transpose(1, 2)\n        x = self.pad_fn(x)\n        x = self.fsmn_block(x)\n        x = x.transpose(1, 2)\n        x += inputs\n        x = self.dropout(x)\n        if mask is not None:\n    
        x = x * mask\n        return x\n\n    def forward_qkv(self, x):\n        \"\"\"Transform query, key and value.\n\n        Args:\n            query (torch.Tensor): Query tensor (#batch, time1, size).\n            key (torch.Tensor): Key tensor (#batch, time2, size).\n            value (torch.Tensor): Value tensor (#batch, time2, size).\n\n        Returns:\n            torch.Tensor: Transformed query tensor (#batch, n_head, time1, d_k).\n            torch.Tensor: Transformed key tensor (#batch, n_head, time2, d_k).\n            torch.Tensor: Transformed value tensor (#batch, n_head, time2, d_k).\n\n        \"\"\"\n        b, t, d = x.size()\n        q_k_v = self.linear_q_k_v(x)\n        q, k, v = torch.split(q_k_v, int(self.h * self.d_k), dim=-1)\n        q_h = torch.reshape(q, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time1, d_k)\n        k_h = torch.reshape(k, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n        v_h = torch.reshape(v, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n\n        return q_h, k_h, v_h, v\n\n    def forward_attention(self, value, scores, mask=None, mask_att_chunk_encoder=None):\n        \"\"\"Compute attention context vector.\n\n        Args:\n            value (torch.Tensor): Transformed value (#batch, n_head, time2, d_k).\n            scores (torch.Tensor): Attention score (#batch, n_head, time1, time2).\n            mask (torch.Tensor): Mask (#batch, 1, time2) or (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Transformed value (#batch, time1, d_model)\n                weighted by the attention score (#batch, time1, time2).\n\n        \"\"\"\n        n_batch = value.size(0)\n        if mask is not None:\n            if mask_att_chunk_encoder is not None:\n                mask = mask * mask_att_chunk_encoder\n\n            mask = mask.unsqueeze(1).eq(0)  # (batch, 1, 
*, time2)\n\n            min_value = -float(\n                \"inf\"\n            )  # float(numpy.finfo(torch.tensor(0, dtype=scores.dtype).numpy().dtype).min)\n            scores = scores.masked_fill(mask, min_value)\n            attn = torch.softmax(scores, dim=-1).masked_fill(\n                mask, 0.0\n            )  # (batch, head, time1, time2)\n        else:\n            attn = torch.softmax(scores, dim=-1)  # (batch, head, time1, time2)\n\n        p_attn = self.dropout(attn)\n        x = torch.matmul(p_attn, value)  # (batch, head, time1, d_k)\n        x = (\n            x.transpose(1, 2).contiguous().view(n_batch, -1, self.h * self.d_k)\n        )  # (batch, time1, d_model)\n\n        return self.linear_out(x)  # (batch, time1, d_model)\n\n    def forward(self, x, mask=None, mask_shfit_chunk=None, mask_att_chunk_encoder=None):\n        \"\"\"Compute scaled dot product attention.\n\n        Args:\n            query (torch.Tensor): Query tensor (#batch, time1, size).\n            key (torch.Tensor): Key tensor (#batch, time2, size).\n            value (torch.Tensor): Value tensor (#batch, time2, size).\n            mask (torch.Tensor): Mask tensor (#batch, 1, time2) or\n                (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time1, d_model).\n\n        \"\"\"\n        q_h, k_h, v_h, v = self.forward_qkv(x)\n        fsmn_memory = self.forward_fsmn(v, mask, mask_shfit_chunk)\n        q_h = q_h * self.d_k ** (-0.5)\n        scores = torch.matmul(q_h, k_h.transpose(-2, -1))\n        att_outs = self.forward_attention(v_h, scores, mask, mask_att_chunk_encoder)\n        return att_outs + fsmn_memory\n\n\nclass SinusoidalPositionEncoder(torch.nn.Module):\n    \"\"\" \"\"\"\n\n    def __init__(self, d_model=80, dropout_rate=0.1):\n        super().__init__()\n        pass\n\n    def encode(\n        self,\n        positions: torch.Tensor = None,\n        depth: int = None,\n        dtype: torch.dtype = 
torch.float32,\n    ):\n        batch_size = positions.size(0)\n        positions = positions.type(dtype)\n        device = positions.device\n        log_timescale_increment = torch.log(\n            torch.tensor([10000], dtype=dtype, device=device)\n        ) / (depth / 2 - 1)\n        inv_timescales = torch.exp(\n            torch.arange(depth / 2, device=device).type(dtype)\n            * (-log_timescale_increment)\n        )\n        inv_timescales = torch.reshape(inv_timescales, [batch_size, -1])\n        scaled_time = torch.reshape(positions, [1, -1, 1]) * torch.reshape(\n            inv_timescales, [1, 1, -1]\n        )\n        encoding = torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], dim=2)\n        return encoding.type(dtype)\n\n    def forward(self, x):\n        batch_size, timesteps, input_dim = x.size()\n        positions = torch.arange(1, timesteps + 1, device=x.device)[None, :]\n        position_encoding = self.encode(positions, input_dim, x.dtype).to(x.device)\n\n        return x + position_encoding\n\n\nclass PositionwiseFeedForward(torch.nn.Module):\n    \"\"\"Positionwise feed forward layer.\n\n    Args:\n        idim (int): Input dimension.\n        hidden_units (int): The number of hidden units.\n        dropout_rate (float): Dropout rate.\n\n    \"\"\"\n\n    def __init__(self, idim, hidden_units, dropout_rate, activation=torch.nn.ReLU()):\n        \"\"\"Construct an PositionwiseFeedForward object.\"\"\"\n        super().__init__()\n        self.w_1 = torch.nn.Linear(idim, hidden_units)\n        self.w_2 = torch.nn.Linear(hidden_units, idim)\n        self.dropout = torch.nn.Dropout(dropout_rate)\n        self.activation = activation\n\n    def forward(self, x):\n        \"\"\"Forward function.\"\"\"\n        return self.w_2(self.dropout(self.activation(self.w_1(x))))\n\n\nclass SANMEncoder(nn.Module):\n    \"\"\"\n    Author: Zhifu Gao, Shiliang Zhang, Ming Lei, Ian McLoughlin\n    San-m: Memory equipped self-attention for 
end-to-end speech recognition\n    https://arxiv.org/abs/2006.01713\n    \"\"\"\n\n    def __init__(\n        self,\n        neg_mean: torch.Tensor,\n        inv_stddev: torch.Tensor,\n        input_size: int,\n        output_size: int = 256,\n        attention_heads: int = 4,\n        linear_units: int = 2048,\n        num_blocks: int = 6,\n        dropout_rate: float = 0.1,\n        positional_dropout_rate: float = 0.1,\n        attention_dropout_rate: float = 0.0,\n        input_layer: Optional[str] = \"conv2d\",\n        pos_enc_class=SinusoidalPositionEncoder,\n        normalize_before: bool = True,\n        concat_after: bool = False,\n        positionwise_layer_type: str = \"linear\",\n        positionwise_conv_kernel_size: int = 1,\n        padding_idx: int = -1,\n        interctc_layer_idx: List[int] = [],\n        interctc_use_conditioning: bool = False,\n        kernel_size: int = 11,\n        sanm_shfit: int = 0,\n        lora_list: List[str] = None,\n        lora_rank: int = 8,\n        lora_alpha: int = 16,\n        lora_dropout: float = 0.1,\n        selfattention_layer_type: str = \"sanm\",\n        tf2torch_tensor_name_prefix_torch: str = \"encoder\",\n        tf2torch_tensor_name_prefix_tf: str = \"seq2seq/encoder\",\n    ):\n        super().__init__()\n        self.neg_mean = neg_mean\n        self.inv_stddev = inv_stddev\n        self._output_size = output_size\n        assert input_layer == \"pe\", input_layer\n\n        self.embed = SinusoidalPositionEncoder()\n        self.normalize_before = normalize_before\n\n        assert positionwise_layer_type == \"linear\", positionwise_layer_type\n        positionwise_layer = PositionwiseFeedForward\n        positionwise_layer_args = (\n            output_size,\n            linear_units,\n            dropout_rate,\n        )\n\n        assert selfattention_layer_type == \"sanm\", selfattention_layer_type\n\n        encoder_selfattn_layer = MultiHeadedAttentionSANM\n        encoder_selfattn_layer_args0 
= (\n            attention_heads,\n            input_size,\n            output_size,\n            attention_dropout_rate,\n            kernel_size,\n            sanm_shfit,\n            lora_list,\n            lora_rank,\n            lora_alpha,\n            lora_dropout,\n        )\n\n        encoder_selfattn_layer_args = (\n            attention_heads,\n            output_size,\n            output_size,\n            attention_dropout_rate,\n            kernel_size,\n            sanm_shfit,\n            lora_list,\n            lora_rank,\n            lora_alpha,\n            lora_dropout,\n        )\n\n        self.encoders0 = repeat(\n            1,\n            lambda lnum: EncoderLayerSANM(\n                input_size,\n                output_size,\n                encoder_selfattn_layer(*encoder_selfattn_layer_args0),\n                positionwise_layer(*positionwise_layer_args),\n                dropout_rate,\n                normalize_before,\n                concat_after,\n            ),\n        )\n\n        self.encoders = repeat(\n            num_blocks - 1,\n            lambda lnum: EncoderLayerSANM(\n                output_size,\n                output_size,\n                encoder_selfattn_layer(*encoder_selfattn_layer_args),\n                positionwise_layer(*positionwise_layer_args),\n                dropout_rate,\n                normalize_before,\n                concat_after,\n            ),\n        )\n\n        if self.normalize_before:\n            self.after_norm = torch.nn.LayerNorm(output_size)\n\n        self.interctc_layer_idx = interctc_layer_idx\n\n        assert len(interctc_layer_idx) == 0, len(interctc_layer_idx)\n        self.interctc_use_conditioning = interctc_use_conditioning\n        self.conditioning_layer = None\n        self.dropout = nn.Dropout(dropout_rate)\n        self.tf2torch_tensor_name_prefix_torch = tf2torch_tensor_name_prefix_torch\n        self.tf2torch_tensor_name_prefix_tf = tf2torch_tensor_name_prefix_tf\n\n  
  def output_size(self) -> int:\n        return self._output_size\n\n    def forward(\n        self,\n        xs_pad: torch.Tensor,\n    ) -> torch.Tensor:\n        \"\"\"Normalize the features, add positional encoding, and run the encoder layers.\n\n        Args:\n            xs_pad: input tensor (B, L, D)\n        Returns:\n            encoder output tensor (B, L, output_size)\n        \"\"\"\n        print(\"in xs_pad.shape\", xs_pad.shape)\n        xs_pad = (xs_pad + self.neg_mean) * self.inv_stddev\n        masks = None\n        xs_pad = xs_pad * self.output_size() ** 0.5\n\n        xs_pad = self.embed(xs_pad)\n\n        # xs_pad = self.dropout(xs_pad)\n        encoder_outs = self.encoders0(xs_pad, masks)\n        xs_pad, masks = encoder_outs[0], encoder_outs[1]\n        encoder_outs = self.encoders(xs_pad, masks)\n        xs_pad, masks = encoder_outs[0], encoder_outs[1]\n\n        if self.normalize_before:\n            xs_pad = self.after_norm(xs_pad)\n\n        print(\"out xs_pad.shape\", xs_pad.shape)\n        return xs_pad\n\n\ndef _pre_hook(\n    state_dict,\n    prefix,\n    local_metadata,\n    strict,\n    missing_keys,\n    unexpected_keys,\n    error_msgs,\n):\n    \"\"\"Perform pre-hook in load_state_dict for backward compatibility.\n\n    Note:\n        We saved self.pe until v.0.5.2 but we have omitted it later.\n        Therefore, we remove the item \"pe\" from `state_dict` for backward compatibility.\n\n    \"\"\"\n    k = prefix + \"pe\"\n    if k in state_dict:\n        state_dict.pop(k)\n\n\nclass DecoderLayerSANM(torch.nn.Module):\n    \"\"\"Single decoder layer module.\n\n    Args:\n        size (int): Input dimension.\n        self_attn (torch.nn.Module): Self-attention module instance.\n            `MultiHeadedAttention` instance can be used as the argument.\n        src_attn (torch.nn.Module): Source-attention (cross-attention) module instance.\n            `MultiHeadedAttention` instance can be used as the argument.\n        feed_forward (torch.nn.Module): Feed-forward module instance.\n            
`PositionwiseFeedForward`, `MultiLayeredConv1d`, or `Conv1dLinear` instance\n            can be used as the argument.\n        dropout_rate (float): Dropout rate.\n        normalize_before (bool): Whether to use layer_norm before the first block.\n        concat_after (bool): Whether to concat attention layer's input and output.\n            if True, additional linear will be applied.\n            i.e. x -> x + linear(concat(x, att(x)))\n            if False, no additional linear will be applied. i.e. x -> x + att(x)\n\n\n    \"\"\"\n\n    def __init__(\n        self,\n        size,\n        self_attn,\n        src_attn,\n        feed_forward,\n        dropout_rate,\n        normalize_before=True,\n        concat_after=False,\n    ):\n        \"\"\"Construct an DecoderLayer object.\"\"\"\n        super(DecoderLayerSANM, self).__init__()\n        self.size = size\n        self.self_attn = self_attn\n        self.src_attn = src_attn\n        self.feed_forward = feed_forward\n        self.norm1 = torch.nn.LayerNorm(size)\n        if self_attn is not None:\n            self.norm2 = torch.nn.LayerNorm(size)\n        if src_attn is not None:\n            self.norm3 = torch.nn.LayerNorm(size)\n        self.dropout = torch.nn.Dropout(dropout_rate)\n        self.normalize_before = normalize_before\n        self.concat_after = concat_after\n        if self.concat_after:\n            self.concat_linear1 = torch.nn.Linear(size + size, size)\n            self.concat_linear2 = torch.nn.Linear(size + size, size)\n        self.reserve_attn = False\n\n    def forward(self, tgt, tgt_mask, memory, memory_mask=None, cache=None):\n        \"\"\"Compute decoded features.\n\n        Args:\n            tgt (torch.Tensor): Input tensor (#batch, maxlen_out, size).\n            tgt_mask (torch.Tensor): Mask for input tensor (#batch, maxlen_out).\n            memory (torch.Tensor): Encoded memory, float32 (#batch, maxlen_in, size).\n            memory_mask (torch.Tensor): Encoded memory mask 
(#batch, maxlen_in).\n            cache (List[torch.Tensor]): List of cached tensors.\n                Each tensor shape should be (#batch, maxlen_out - 1, size).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, maxlen_out, size).\n            torch.Tensor: Mask for output tensor (#batch, maxlen_out).\n            torch.Tensor: Encoded memory (#batch, maxlen_in, size).\n            torch.Tensor: Encoded memory mask (#batch, maxlen_in).\n            List[torch.Tensor]: The cache, returned unchanged.\n\n        \"\"\"\n        # tgt = self.dropout(tgt)\n        residual = tgt\n        if self.normalize_before:\n            tgt = self.norm1(tgt)\n        tgt = self.feed_forward(tgt)\n\n        x = tgt\n        if self.self_attn:\n            if self.normalize_before:\n                tgt = self.norm2(tgt)\n            x, _ = self.self_attn(tgt, tgt_mask)\n            x = residual + self.dropout(x)\n\n        if self.src_attn is not None:\n            residual = x\n            if self.normalize_before:\n                x = self.norm3(x)\n\n            x_src_attn = self.src_attn(x, memory, memory_mask, ret_attn=False)\n            x = residual + self.dropout(x_src_attn)\n            # x = residual + self.dropout(self.src_attn(x, memory, memory_mask))\n\n        return x, tgt_mask, memory, memory_mask, cache\n\n\nclass MultiHeadedAttentionSANMDecoder(nn.Module):\n    \"\"\"FSMN memory block used in place of self-attention in the decoder.\n\n    Despite the class name, this module has no attention heads: it applies a\n    depthwise 1-D convolution over the time axis.\n\n    Args:\n        n_feat (int): The number of features.\n        dropout_rate (float): Dropout rate.\n        kernel_size (int): Kernel size of the depthwise convolution.\n        sanm_shfit (int): Extra left padding added to the symmetric padding.\n\n    \"\"\"\n\n    def __init__(self, n_feat, dropout_rate, kernel_size, sanm_shfit=0):\n        \"\"\"Construct a MultiHeadedAttentionSANMDecoder object.\"\"\"\n        super().__init__()\n\n        self.dropout = nn.Dropout(p=dropout_rate)\n\n        self.fsmn_block = nn.Conv1d(\n            n_feat, n_feat, kernel_size, stride=1, padding=0, groups=n_feat, bias=False\n        )\n        # Zero-pad the time axis so the convolution keeps the sequence length.\n        left_padding = (kernel_size - 1) // 2\n        if 
sanm_shfit > 0:\n            left_padding = left_padding + sanm_shfit\n        right_padding = kernel_size - 1 - left_padding\n        self.pad_fn = nn.ConstantPad1d((left_padding, right_padding), 0.0)\n        self.kernel_size = kernel_size\n\n    def forward(self, inputs, mask, cache=None, mask_shfit_chunk=None):\n        \"\"\"\n        :param x: (#batch, time1, size).\n        :param mask: Mask tensor (#batch, 1, time)\n        :return:\n        \"\"\"\n        # print(\"in fsmn, inputs\", inputs.size())\n        b, t, d = inputs.size()\n        # logging.info(\n        #     \"mask: {}\".format(mask.size()))\n        if mask is not None:\n            mask = torch.reshape(mask, (b, -1, 1))\n            # logging.info(\"in fsmn, mask: {}, {}\".format(mask.size(), mask[0:100:50, :, :]))\n            if mask_shfit_chunk is not None:\n                # logging.info(\"in fsmn, mask_fsmn: {}, {}\".format(mask_shfit_chunk.size(), mask_shfit_chunk[0:100:50, :, :]))\n                mask = mask * mask_shfit_chunk\n            # logging.info(\"in fsmn, mask_after_fsmn: {}, {}\".format(mask.size(), mask[0:100:50, :, :]))\n            # print(\"in fsmn, mask\", mask.size())\n            # print(\"in fsmn, inputs\", inputs.size())\n            inputs = inputs * mask\n\n        x = inputs.transpose(1, 2)\n        b, d, t = x.size()\n        if cache is None:\n            # print(\"in fsmn, cache is None, x\", x.size())\n\n            x = self.pad_fn(x)\n            if not self.training:\n                cache = x\n        else:\n            # print(\"in fsmn, cache is not None, x\", x.size())\n            # x = torch.cat((x, cache), dim=2)[:, :, :-1]\n            # if t < self.kernel_size:\n            #     x = self.pad_fn(x)\n            x = torch.cat((cache[:, :, 1:], x), dim=2)\n            x = x[:, :, -(self.kernel_size + t - 1) :]\n            # print(\"in fsmn, cache is not None, x_cat\", x.size())\n            cache = x\n        x = self.fsmn_block(x)\n        x = 
x.transpose(1, 2)\n        # print(\"in fsmn, fsmn_out\", x.size())\n        if x.size(1) != inputs.size(1):\n            inputs = inputs[:, -1, :]\n\n        x = x + inputs\n        x = self.dropout(x)\n        if mask is not None:\n            x = x * mask\n        return x, cache\n\n\nclass MultiHeadedAttentionCrossAtt(nn.Module):\n    \"\"\"Multi-Head Attention layer.\n\n    Args:\n        n_head (int): The number of heads.\n        n_feat (int): The number of features.\n        dropout_rate (float): Dropout rate.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        n_head,\n        n_feat,\n        dropout_rate,\n        lora_list=None,\n        lora_rank=8,\n        lora_alpha=16,\n        lora_dropout=0.1,\n        encoder_output_size=None,\n    ):\n        \"\"\"Construct an MultiHeadedAttention object.\"\"\"\n        super().__init__()\n        assert n_feat % n_head == 0\n        # We assume d_v always equals d_k\n        self.d_k = n_feat // n_head\n        self.h = n_head\n        self.linear_q = nn.Linear(n_feat, n_feat)\n        self.linear_k_v = nn.Linear(\n            n_feat if encoder_output_size is None else encoder_output_size,\n            n_feat * 2,\n        )\n        self.linear_out = nn.Linear(n_feat, n_feat)\n        self.attn = None\n        self.dropout = nn.Dropout(p=dropout_rate)\n\n    def forward_qkv(self, x, memory):\n        \"\"\"Transform query, key and value.\n\n        Args:\n            query (torch.Tensor): Query tensor (#batch, time1, size).\n            key (torch.Tensor): Key tensor (#batch, time2, size).\n            value (torch.Tensor): Value tensor (#batch, time2, size).\n\n        Returns:\n            torch.Tensor: Transformed query tensor (#batch, n_head, time1, d_k).\n            torch.Tensor: Transformed key tensor (#batch, n_head, time2, d_k).\n            torch.Tensor: Transformed value tensor (#batch, n_head, time2, d_k).\n\n        \"\"\"\n\n        # print(\"in forward_qkv, x\", x.size())\n        b = 
x.size(0)\n        q = self.linear_q(x)\n        q_h = torch.reshape(q, (b, -1, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time1, d_k)\n\n        k_v = self.linear_k_v(memory)\n        k, v = torch.split(k_v, int(self.h * self.d_k), dim=-1)\n        k_h = torch.reshape(k, (b, -1, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n        v_h = torch.reshape(v, (b, -1, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n\n        return q_h, k_h, v_h\n\n    def forward_attention(self, value, scores, mask, ret_attn=False):\n        \"\"\"Compute attention context vector.\n\n        Args:\n            value (torch.Tensor): Transformed value (#batch, n_head, time2, d_k).\n            scores (torch.Tensor): Attention score (#batch, n_head, time1, time2).\n            mask (torch.Tensor): Mask (#batch, 1, time2) or (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Transformed value (#batch, time1, d_model)\n                weighted by the attention score (#batch, time1, time2).\n\n        \"\"\"\n        n_batch = value.size(0)\n        if mask is not None:\n            mask = mask.unsqueeze(1).eq(0)  # (batch, 1, *, time2)\n            min_value = -float(\n                \"inf\"\n            )  # float(numpy.finfo(torch.tensor(0, dtype=scores.dtype).numpy().dtype).min)\n            # logging.info(\n            #     \"scores: {}, mask_size: {}\".format(scores.size(), mask.size()))\n            scores = scores.masked_fill(mask, min_value)\n            attn = torch.softmax(scores, dim=-1).masked_fill(\n                mask, 0.0\n            )  # (batch, head, time1, time2)\n        else:\n            attn = torch.softmax(scores, dim=-1)  # (batch, head, time1, time2)\n        p_attn = self.dropout(attn)\n        x = torch.matmul(p_attn, value)  # (batch, head, time1, d_k)\n        x = (\n            x.transpose(1, 2).contiguous().view(n_batch, 
-1, self.h * self.d_k)\n        )  # (batch, time1, d_model)\n        if ret_attn:\n            return self.linear_out(x), attn  # (batch, time1, d_model)\n        return self.linear_out(x)  # (batch, time1, d_model)\n\n    def forward(self, x, memory, memory_mask, ret_attn=False):\n        \"\"\"Compute scaled dot product attention.\n\n        Args:\n            x (torch.Tensor): Query input tensor (#batch, time1, size).\n            memory (torch.Tensor): Memory tensor used to compute keys and\n                values (#batch, time2, size).\n            memory_mask (torch.Tensor): Mask tensor (#batch, 1, time2) or\n                (#batch, time1, time2).\n            ret_attn (bool): If True, also return the attention weights.\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time1, d_model).\n\n        \"\"\"\n        q_h, k_h, v_h = self.forward_qkv(x, memory)\n        q_h = q_h * self.d_k ** (-0.5)\n        scores = torch.matmul(q_h, k_h.transpose(-2, -1))\n        return self.forward_attention(v_h, scores, memory_mask, ret_attn=ret_attn)\n\n\nclass PositionwiseFeedForwardDecoderSANM(torch.nn.Module):\n    \"\"\"Positionwise feed forward layer.\n\n    Args:\n        idim (int): Input dimension.\n        hidden_units (int): The number of hidden units.\n        dropout_rate (float): Dropout rate.\n        adim (int, optional): Output dimension. Defaults to idim.\n        activation (torch.nn.Module): Activation function.\n\n    \"\"\"\n\n    def __init__(\n        self, idim, hidden_units, dropout_rate, adim=None, activation=torch.nn.ReLU()\n    ):\n        \"\"\"Construct a PositionwiseFeedForwardDecoderSANM object.\"\"\"\n        super(PositionwiseFeedForwardDecoderSANM, self).__init__()\n        self.w_1 = torch.nn.Linear(idim, hidden_units)\n        self.w_2 = torch.nn.Linear(\n            hidden_units, idim if adim is None else adim, bias=False\n        )\n        self.dropout = torch.nn.Dropout(dropout_rate)\n        self.activation = activation\n        self.norm = torch.nn.LayerNorm(hidden_units)\n\n    def forward(self, x):\n        \"\"\"Forward function.\"\"\"\n        return 
self.w_2(self.norm(self.dropout(self.activation(self.w_1(x)))))\n\n\nclass ParaformerSANMDecoder(torch.nn.Module):\n    \"\"\"\n    Author: Speech Lab of DAMO Academy, Alibaba Group\n    Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition\n    https://arxiv.org/abs/2006.01713\n    \"\"\"\n\n    def __init__(\n        self,\n        vocab_size: int,\n        encoder_output_size: int,\n        attention_heads: int = 4,\n        linear_units: int = 2048,\n        num_blocks: int = 6,\n        dropout_rate: float = 0.1,\n        positional_dropout_rate: float = 0.1,\n        self_attention_dropout_rate: float = 0.0,\n        src_attention_dropout_rate: float = 0.0,\n        input_layer: str = \"embed\",\n        use_output_layer: bool = True,\n        wo_input_layer: bool = False,\n        pos_enc_class=\"PositionalEncoding\",\n        normalize_before: bool = True,\n        concat_after: bool = False,\n        att_layer_num: int = 6,\n        kernel_size: int = 21,\n        sanm_shfit: int = 0,\n        lora_list: List[str] = None,\n        lora_rank: int = 8,\n        lora_alpha: int = 16,\n        lora_dropout: float = 0.1,\n        chunk_multiply_factor: tuple = (1,),\n        tf2torch_tensor_name_prefix_torch: str = \"decoder\",\n        tf2torch_tensor_name_prefix_tf: str = \"seq2seq/decoder\",\n    ):\n        super().__init__()\n\n        attention_dim = encoder_output_size\n\n        assert wo_input_layer is False\n        assert input_layer == \"embed\", input_layer\n\n        # Note: self.embed is not used\n        self.embed = torch.nn.Sequential(\n            torch.nn.Embedding(vocab_size, attention_dim),\n            # pos_enc_class(attention_dim, positional_dropout_rate),\n        )\n\n        self.normalize_before = normalize_before\n        if self.normalize_before:\n            self.after_norm = torch.nn.LayerNorm(attention_dim)\n        if use_output_layer:\n            self.output_layer = 
torch.nn.Linear(attention_dim, vocab_size)\n        else:\n            self.output_layer = None\n\n        self.att_layer_num = att_layer_num\n        self.num_blocks = num_blocks\n        if sanm_shfit is None:\n            sanm_shfit = (kernel_size - 1) // 2\n        self.decoders = repeat(\n            att_layer_num,\n            lambda lnum: DecoderLayerSANM(\n                attention_dim,\n                MultiHeadedAttentionSANMDecoder(\n                    attention_dim,\n                    self_attention_dropout_rate,\n                    kernel_size,\n                    sanm_shfit=sanm_shfit,\n                ),\n                MultiHeadedAttentionCrossAtt(\n                    attention_heads,\n                    attention_dim,\n                    src_attention_dropout_rate,\n                    lora_list,\n                    lora_rank,\n                    lora_alpha,\n                    lora_dropout,\n                ),\n                PositionwiseFeedForwardDecoderSANM(\n                    attention_dim, linear_units, dropout_rate\n                ),\n                dropout_rate,\n                normalize_before,\n                concat_after,\n            ),\n        )\n        if num_blocks - att_layer_num <= 0:\n            self.decoders2 = None\n        else:\n            self.decoders2 = repeat(\n                num_blocks - att_layer_num,\n                lambda lnum: DecoderLayerSANM(\n                    attention_dim,\n                    MultiHeadedAttentionSANMDecoder(\n                        attention_dim,\n                        self_attention_dropout_rate,\n                        kernel_size,\n                        sanm_shfit=0,\n                    ),\n                    None,\n                    PositionwiseFeedForwardDecoderSANM(\n                        attention_dim, linear_units, dropout_rate\n                    ),\n                    dropout_rate,\n                    normalize_before,\n                    
concat_after,\n                ),\n            )\n\n        self.decoders3 = repeat(\n            1,\n            lambda lnum: DecoderLayerSANM(\n                attention_dim,\n                None,\n                None,\n                PositionwiseFeedForwardDecoderSANM(\n                    attention_dim, linear_units, dropout_rate\n                ),\n                dropout_rate,\n                normalize_before,\n                concat_after,\n            ),\n        )\n        self.tf2torch_tensor_name_prefix_torch = tf2torch_tensor_name_prefix_torch\n        self.tf2torch_tensor_name_prefix_tf = tf2torch_tensor_name_prefix_tf\n        self.chunk_multiply_factor = chunk_multiply_factor\n\n    def forward(\n        self,\n        hs_pad: torch.Tensor,\n        ys_in_pad: torch.Tensor,\n        tgt_mask: torch.Tensor = None,\n        chunk_mask: torch.Tensor = None,\n        return_hidden: bool = False,\n        return_both: bool = False,\n    ) -> Tuple[torch.Tensor, torch.Tensor]:\n        \"\"\"Forward decoder.\n\n        Args:\n            hs_pad: encoded memory, float32  (batch, maxlen_in, feat)\n            ys_in_pad:\n                input token ids, int64 (batch, maxlen_out)\n                if input_layer == \"embed\"\n                input tensor (batch, maxlen_out, #mels) in the other cases\n        Returns:\n            (tuple): tuple containing:\n\n            x: decoded token score before softmax (batch, maxlen_out, token)\n                if use_output_layer is True,\n        \"\"\"\n        tgt = ys_in_pad\n\n        memory = hs_pad\n        memory_mask = None\n\n        x = tgt\n        x, tgt_mask, memory, memory_mask, _ = self.decoders(\n            x, tgt_mask, memory, memory_mask\n        )\n        if self.decoders2 is not None:\n            x, tgt_mask, memory, memory_mask, _ = self.decoders2(\n                x, tgt_mask, memory, memory_mask\n            )\n        x, tgt_mask, memory, memory_mask, _ = self.decoders3(\n            x, 
tgt_mask, memory, memory_mask\n        )\n        # Apply the final layer norm only in the pre-norm configuration.\n        hidden = self.after_norm(x) if self.normalize_before else x\n\n        if self.output_layer is None:\n            return hidden\n        return self.output_layer(hidden)\n\n\ndef cif_wo_hidden_v1(alphas, threshold, return_fire_idxs=False):\n    batch_size, len_time = alphas.size()\n    device = alphas.device\n    dtype = alphas.dtype\n\n    threshold = torch.tensor([threshold], dtype=alphas.dtype).to(alphas.device)\n\n    fires = torch.zeros(batch_size, len_time, dtype=dtype, device=device)\n\n    # prefix_sum = torch.cumsum(alphas, dim=1)\n    prefix_sum = torch.cumsum(alphas, dim=1, dtype=torch.float64).to(\n        torch.float32\n    )  # accumulate in float64: a float32 cumsum can lose precision and give wrong results in extreme cases\n    prefix_sum_floor = torch.floor(prefix_sum)\n    dislocation_prefix_sum = torch.roll(prefix_sum, 1, dims=1)\n    dislocation_prefix_sum_floor = torch.floor(dislocation_prefix_sum)\n\n    dislocation_prefix_sum_floor[:, 0] = 0\n    dislocation_diff = prefix_sum_floor - dislocation_prefix_sum_floor\n\n    fire_idxs = dislocation_diff > 0\n    fires[fire_idxs] = 1\n    fires = fires + prefix_sum - prefix_sum_floor\n    if return_fire_idxs:\n        return fires, fire_idxs\n    return fires\n\n\ndef cif_v1(hidden, alphas, threshold):\n    fires, fire_idxs = cif_wo_hidden_v1(alphas, threshold, return_fire_idxs=True)\n\n    device = hidden.device\n    dtype = hidden.dtype\n    batch_size, len_time, hidden_size = hidden.size()\n    # frames = torch.zeros(batch_size, len_time, hidden_size, dtype=dtype, device=device)\n    # prefix_sum_hidden = torch.cumsum(alphas.unsqueeze(-1).tile((1, 1, hidden_size)) * hidden, dim=1)\n    frames = torch.zeros(batch_size, len_time, hidden_size, dtype=dtype, device=device)\n    prefix_sum_hidden = torch.cumsum(\n        alphas.unsqueeze(-1).repeat((1, 1, hidden_size)) * hidden, dim=1\n    )\n\n    frames = 
prefix_sum_hidden[fire_idxs]\n    shift_frames = torch.roll(frames, 1, dims=0)\n\n    batch_len = fire_idxs.sum(1)\n    batch_idxs = torch.cumsum(batch_len, dim=0)\n    shift_batch_idxs = torch.roll(batch_idxs, 1, dims=0)\n    shift_batch_idxs[0] = 0\n    shift_frames[shift_batch_idxs] = 0\n\n    remains = fires - torch.floor(fires)\n    # remain_frames = remains[fire_idxs].unsqueeze(-1).tile((1, hidden_size)) * hidden[fire_idxs]\n    remain_frames = (\n        remains[fire_idxs].unsqueeze(-1).repeat((1, hidden_size)) * hidden[fire_idxs]\n    )\n\n    shift_remain_frames = torch.roll(remain_frames, 1, dims=0)\n    shift_remain_frames[shift_batch_idxs] = 0\n\n    frames = frames - shift_frames + shift_remain_frames - remain_frames\n\n    # max_label_len = batch_len.max()\n    max_label_len = (\n        torch.round(alphas.sum(-1)).int().max()\n    )  # torch.round to calculate the max length\n\n    # frame_fires = torch.zeros(batch_size, max_label_len, hidden_size, dtype=dtype, device=device)\n    frame_fires = torch.zeros(\n        batch_size, max_label_len, hidden_size, dtype=dtype, device=device\n    )\n    indices = torch.arange(max_label_len, device=device).expand(batch_size, -1)\n    frame_fires_idxs = indices < batch_len.unsqueeze(1)\n    frame_fires[frame_fires_idxs] = frames\n    return frame_fires, fires\n\n\nclass CifPredictorV2(torch.nn.Module):\n    def __init__(\n        self,\n        idim,\n        l_order,\n        r_order,\n        threshold=1.0,\n        dropout=0.1,\n        smooth_factor=1.0,\n        noise_threshold=0,\n        tail_threshold=0.0,\n        tf2torch_tensor_name_prefix_torch=\"predictor\",\n        tf2torch_tensor_name_prefix_tf=\"seq2seq/cif\",\n        tail_mask=True,\n    ):\n        super().__init__()\n\n        self.pad = torch.nn.ConstantPad1d((l_order, r_order), 0)\n        self.cif_conv1d = torch.nn.Conv1d(idim, idim, l_order + r_order + 1)\n        self.cif_output = torch.nn.Linear(idim, 1)\n        self.dropout = 
torch.nn.Dropout(p=dropout)\n        self.threshold = threshold\n        self.smooth_factor = smooth_factor\n        self.noise_threshold = noise_threshold\n        self.tail_threshold = tail_threshold\n        self.tf2torch_tensor_name_prefix_torch = tf2torch_tensor_name_prefix_torch\n        self.tf2torch_tensor_name_prefix_tf = tf2torch_tensor_name_prefix_tf\n        self.tail_mask = tail_mask\n\n    def forward(\n        self,\n        hidden,\n        target_label=None,\n        mask=None,\n        ignore_id=-1,\n        mask_chunk_predictor=None,\n        target_label_length=None,\n    ):\n        h = hidden\n        context = h.transpose(1, 2)\n        queries = self.pad(context)\n        output = torch.relu(self.cif_conv1d(queries))\n        output = output.transpose(1, 2)\n\n        output = self.cif_output(output)\n        alphas = torch.sigmoid(output)\n        alphas = torch.nn.functional.relu(\n            alphas * self.smooth_factor - self.noise_threshold\n        )\n        if mask is not None:\n            mask = mask.transpose(-1, -2).float()\n            alphas = alphas * mask\n        if mask_chunk_predictor is not None:\n            alphas = alphas * mask_chunk_predictor\n\n        alphas = alphas.squeeze(-1)\n        if mask is not None:\n            mask = mask.squeeze(-1)\n\n        if target_label_length is not None:\n            target_length = target_label_length.squeeze(-1)\n        elif target_label is not None:\n            target_length = (target_label != ignore_id).float().sum(-1)\n        else:\n            target_length = None\n        token_num = alphas.sum(-1)\n        if target_length is not None:\n            alphas *= (target_length / token_num)[:, None].repeat(1, alphas.size(1))\n        elif self.tail_threshold > 0.0:\n            if self.tail_mask:\n                hidden, alphas, token_num = self.tail_process_fn(\n                    hidden, alphas, token_num, mask=mask\n                )\n            else:\n                
hidden, alphas, token_num = self.tail_process_fn(\n                    hidden, alphas, token_num, mask=None\n                )\n\n        acoustic_embeds, cif_peak = cif_v1(hidden, alphas, self.threshold)\n        if target_length is None and self.tail_threshold > 0.0:\n            token_num_int = torch.max(token_num).type(torch.int32).item()\n            acoustic_embeds = acoustic_embeds[:, :token_num_int, :]\n\n        return acoustic_embeds, token_num, alphas, cif_peak\n\n    def tail_process_fn(self, hidden, alphas, token_num=None, mask=None):\n        b, t, d = hidden.size()\n        tail_threshold = self.tail_threshold\n        if mask is not None:\n            zeros_t = torch.zeros((b, 1), dtype=torch.float32, device=alphas.device)\n            ones_t = torch.ones_like(zeros_t)\n            mask_1 = torch.cat([mask, zeros_t], dim=1)\n            mask_2 = torch.cat([ones_t, mask], dim=1)\n            mask = mask_2 - mask_1\n            tail_threshold = mask * tail_threshold\n            alphas = torch.cat([alphas, zeros_t], dim=1)\n            alphas = torch.add(alphas, tail_threshold)\n        else:\n            tail_threshold = torch.tensor([tail_threshold], dtype=alphas.dtype).to(\n                alphas.device\n            )\n            tail_threshold = torch.reshape(tail_threshold, (1, 1))\n            if b > 1:\n                alphas = torch.cat([alphas, tail_threshold.repeat(b, 1)], dim=1)\n            else:\n                alphas = torch.cat([alphas, tail_threshold], dim=1)\n        zeros = torch.zeros((b, 1, d), dtype=hidden.dtype).to(hidden.device)\n        hidden = torch.cat([hidden, zeros], dim=1)\n        token_num = alphas.sum(dim=-1)\n        token_num_floor = torch.floor(token_num)\n\n        return hidden, alphas, token_num_floor\n\n\nclass Paraformer(torch.nn.Module):\n    \"\"\"\n    Author: Speech Lab of DAMO Academy, Alibaba Group\n    Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech 
Recognition\n    https://arxiv.org/abs/2206.08317\n    \"\"\"\n\n    def __init__(\n        self,\n        neg_mean: torch.Tensor,\n        inv_stddev: torch.Tensor,\n        input_size: int,\n        vocab_size: int,\n        ignore_id=-1,\n        encoder_conf: Optional[Dict] = None,\n        decoder_conf: Optional[Dict] = None,\n        predictor_conf: Optional[Dict] = None,\n    ):\n        super().__init__()\n\n        self.ignore_id = ignore_id\n        self.encoder = SANMEncoder(\n            neg_mean, inv_stddev, input_size=input_size, **encoder_conf\n        )\n        encoder_output_size = self.encoder.output_size()\n\n        self.decoder = ParaformerSANMDecoder(\n            vocab_size=vocab_size,\n            encoder_output_size=encoder_output_size,\n            **decoder_conf,\n        )\n        self.predictor = CifPredictorV2(**predictor_conf)\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n          x: (N, T, C)\n        \"\"\"\n        encoder_out = self.encoder(x)\n\n        encoder_out_mask = None\n\n        pre_acoustic_embeds, pre_token_length, alphas, pre_peak_index = self.predictor(\n            encoder_out, None, encoder_out_mask, ignore_id=self.ignore_id\n        )\n        # pre_acoustic_embeds: (N, num_tokens, C)\n        # pre_token_length: [num_tokens,]\n        # alphas: (N, T)\n        # pre_peak_index: (N, T)\n\n        pre_token_length = pre_token_length.round().long()\n        if torch.max(pre_token_length) < 1:\n            return []\n\n        decoder_outs = self.decoder(encoder_out, pre_acoustic_embeds)\n        # decoder_outs: (N, num_tokens, vocab_size)\n        return decoder_outs, pre_token_length\n\n\n@torch.no_grad()\ndef test():\n    import yaml\n\n    with open(\"./config.yaml\", \"r\", encoding=\"utf-8\") as f:\n        config = yaml.safe_load(f)\n    print(config[\"encoder_conf\"])\n\n    neg_mean = torch.rand(560)\n    inv_stddev = torch.rand(560)\n\n    m = Paraformer(\n        neg_mean=neg_mean,\n      
  inv_stddev=inv_stddev,\n        input_size=560,\n        vocab_size=8404,\n        encoder_conf=config[\"encoder_conf\"],\n        decoder_conf=config[\"decoder_conf\"],\n        predictor_conf=config[\"predictor_conf\"],\n    )\n    m.eval()\n    print(m.decoder)\n\n    state_dict = torch.load(\"./model_state_dict.pt\", map_location=\"cpu\")[\"state_dict\"]\n    m.load_state_dict(state_dict)\n    del state_dict\n    print(m)\n\n\nif __name__ == \"__main__\":\n    test()\n"
  },
  {
    "path": "scripts/peng-cheng-starling/.gitignore",
    "content": "bpe.model\n*.wav\n*.onnx\n"
  },
  {
    "path": "scripts/peng-cheng-starling/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for files from\nhttps://github.com/yangb05/PengChengStarling\n"
  },
  {
    "path": "scripts/peng-cheng-starling/quantize_models.py",
    "content": "#!/usr/bin/env python3\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom pathlib import Path\n\n\ndef main():\n    suffix = \"epoch-75-avg-11-chunk-16-left-128\"\n\n    for m in [\"encoder\", \"joiner\"]:\n        if Path(f\"{m}-{suffix}.int8.onnx\").is_file():\n            continue\n\n        quantize_dynamic(\n            model_input=f\"./{m}-{suffix}.onnx\",\n            model_output=f\"./{m}-{suffix}.int8.onnx\",\n            op_types_to_quantize=[\"MatMul\"],\n            weight_type=QuantType.QInt8,\n        )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/piper/.gitignore",
    "content": "*.sh\n*.onnx\n*.json\nMODEL_CARD\ngenerate_samples-vits-piper*.py\n"
  },
  {
    "path": "scripts/piper/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport json\nfrom typing import Any, Dict\n\nimport onnx\nfrom iso639 import Lang\n\n\ndef get_args():\n    # For en_GB-semaine-medium\n    # --name semaine\n    # --kind medium\n    # --lang en_GB\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n    parser.add_argument(\n        \"--name\",\n        type=str,\n        required=True,\n    )\n\n    parser.add_argument(\n        \"--kind\",\n        type=str,\n        required=True,\n    )\n\n    parser.add_argument(\n        \"--lang\",\n        type=str,\n        required=True,\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef load_config(filename):\n    with open(filename, \"r\") as file:\n        config = json.load(file)\n    return config\n\n\ndef generate_tokens(config):\n    id_map = config[\"phoneme_id_map\"]\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for s, i in id_map.items():\n            if s == \"\\n\":\n                continue\n            if isinstance(i, list):\n                i = i[0]\n            print(f\"{s} {i}\")\n            f.write(f\"{s} {i}\\n\")\n    print(\"Generated tokens.txt\")\n\n\n# for en_US-lessac-medium.onnx\n# export LANG=en_US\n# export TYPE=lessac\n# export NAME=medium\ndef main():\n    args = get_args()\n    
print(args)\n    lang = args.lang\n\n    lang_iso = Lang(lang.split(\"_\")[0])\n    print(lang, lang_iso)\n\n    kind = args.kind\n\n    name = args.name\n\n    # e.g., en_GB-alan-low.onnx.json\n    config = load_config(f\"{lang}-{name}-{kind}.onnx.json\")\n\n    print(\"generate tokens\")\n    generate_tokens(config)\n\n    sample_rate = config[\"audio\"][\"sample_rate\"]\n    # A value of 22500 in the config is treated as a typo for 22050.\n    if sample_rate == 22500:\n        print(\"Change sample rate from 22500 to 22050\")\n        sample_rate = 22050\n\n    if \"lang_code\" in config:\n        voice = config[\"lang_code\"]\n    else:\n        voice = config[\"espeak\"][\"voice\"]\n\n    print(\"add model metadata\")\n    meta_data = {\n        \"model_type\": \"vits\",\n        \"comment\": \"piper\",  # must be piper for models from piper\n        \"language\": lang_iso.name,\n        \"voice\": voice,  # e.g., en-us\n        \"has_espeak\": 1,\n        \"n_speakers\": config[\"num_speakers\"],\n        \"sample_rate\": sample_rate,\n    }\n    print(meta_data)\n    add_meta_data(f\"{lang}-{name}-{kind}.onnx\", meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/piper/dynamic_quantization.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\n\nimport onnxmltools\nfrom onnxmltools.utils.float16_converter import convert_float_to_float16\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--input\",\n        type=str,\n        required=True,\n    )\n    parser.add_argument(\n        \"--output-fp16\",\n        type=str,\n        required=True,\n    )\n\n    parser.add_argument(\n        \"--output-int8\",\n        type=str,\n        required=True,\n    )\n    return parser.parse_args()\n\n\n# for op_block_list, see also\n# https://github.com/microsoft/onnxruntime/blob/089c52e4522491312e6839af146a276f2351972e/onnxruntime/python/tools/transformers/float16.py#L115\n#\n# libc++abi: terminating with uncaught exception of type Ort::Exception:\n# Type Error: Type (tensor(float16)) of output arg (/dp/RandomNormalLike_output_0)\n# of node (/dp/RandomNormalLike) does not match expected type (tensor(float)).\n#\n# libc++abi: terminating with uncaught exception of type Ort::Exception:\n# This is an invalid model. 
Type Error: Type 'tensor(float16)' of input\n# parameter (/enc_p/encoder/attn_layers.0/Constant_84_output_0) of\n# operator (Range) in node (/Range_1) is invalid.\ndef export_onnx_fp16(onnx_fp32_path, onnx_fp16_path):\n    onnx_fp32_model = onnxmltools.utils.load_model(onnx_fp32_path)\n    onnx_fp16_model = convert_float_to_float16(\n        onnx_fp32_model,\n        keep_io_types=True,\n        op_block_list=[\n            \"RandomNormalLike\",\n            \"Range\",\n        ],\n    )\n    onnxmltools.utils.save_model(onnx_fp16_model, onnx_fp16_path)\n\n\ndef main():\n    args = get_args()\n    print(args)\n\n    in_filename = args.input\n    output_fp16 = args.output_fp16\n    output_int8 = args.output_int8\n\n    quantize_dynamic(\n        model_input=in_filename,\n        model_output=output_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n    export_onnx_fp16(in_filename, output_fp16)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/piper/generate.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom dataclasses import dataclass\nfrom pathlib import Path\n\nimport jinja2\n\n\"\"\"\nTODO:\n - add https://huggingface.co/csukuangfj/vits-piper-en_US-glados\n\"\"\"\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass PiperModel:\n    # For en_GB-semaine-medium\n    name: str  # semaine\n    kind: str  # e.g. medium\n    sr: int  # sample rate\n    ns: int  # number of speakers\n    lang: str = \"\"  # e.g., en_GB\n    cmd: str = \"\"\n    model_name: str = \"\"\n    text: str = \"\"\n    index: int = 0\n    url: str = \"\"\n\n\n# arabic\ndef get_ar_models():\n    ar_jo = [\n        PiperModel(name=\"kareem\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"kareem\", kind=\"medium\", sr=22050, ns=1),\n    ]\n    ar_jo += [\n        PiperModel(\n            name=\"SA_miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak/blob/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC 
BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak/resolve/main/miro_ar-SA.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak/resolve/main/miro_ar-SA.onnx.json\n                   mv miro_ar-SA.onnx ar_JO-SA_miro-high.onnx\n                   mv miro_ar-SA.onnx.json ar_JO-SA_miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak\",\n        ),\n        PiperModel(\n            name=\"SA_dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- 
✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak/resolve/main/dii_ar-SA.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak/resolve/main/dii_ar-SA.onnx.json\n                   mv dii_ar-SA.onnx ar_JO-SA_dii-high.onnx\n                   mv dii_ar-SA.onnx.json ar_JO-SA_dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_dii_espeak\",\n        ),\n        PiperModel(\n            name=\"SA_miro_V2\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is 
not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2/resolve/main/miro_ar-SA.onnx.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2/resolve/main/miro_ar-SA.onnx.json\n                   mv miro_ar-SA.onnx.onnx ar_JO-SA_miro_V2-high.onnx\n                   mv miro_ar-SA.onnx.json ar_JO-SA_miro_V2-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/phoonnx_ar-SA_miro_espeak_V2\",\n        ),\n    ]\n\n    for m in ar_jo:\n        m.lang = \"ar_JO\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = ar_jo\n\n    for m in ans:\n        m.text = \"كيف حالك اليوم؟\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# catalan\ndef get_ca_models():\n    ca_es = [\n        PiperModel(name=\"upc_ona\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"upc_ona\", kind=\"x_low\", sr=16000, ns=1),\n        PiperModel(name=\"upc_pau\", 
kind=\"x_low\", sr=16000, ns=1),\n    ]\n\n    for m in ca_es:\n        m.lang = \"ca_ES\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = ca_es\n\n    for m in ans:\n        m.text = \"Si vols estar ben servit, fes-te tu mateix el llit\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# czech\ndef get_cs_models():\n    cs_cz = [\n        PiperModel(name=\"jirka\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"jirka\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in cs_cz:\n        m.lang = \"cs_CZ\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = cs_cz\n\n    for m in ans:\n        m.text = \"Co můžeš udělat dnes, neodkládej na zítřek. 
\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# welsh\ndef get_cy_models():\n    cy_gb = [\n        PiperModel(name=\"bu_tts\", kind=\"medium\", sr=22050, ns=7),\n        PiperModel(name=\"gwryw_gogleddol\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in cy_gb:\n        m.lang = \"cy_GB\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = cy_gb\n\n    for m in ans:\n        m.text = \"Ni all y gwynt ei hunan ei ddilyn, ac felly mae’n rhaid i’r gŵyr ddod i’r gorwel i weld y llwybr yn gyfarwydd\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# danish\ndef get_da_models():\n    da_dk = [\n        PiperModel(name=\"talesyntese\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in da_dk:\n        m.lang 
= \"da_DK\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = da_dk\n\n    for m in ans:\n        m.text = (\n            \"Hvis du går langsomt, men aldrig stopper, når du ender frem til dit mål.\"\n        )\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# greek\ndef get_el_models():\n    el_gr = [\n        PiperModel(name=\"rapunzelina\", kind=\"low\", sr=16000, ns=1),\n    ]\n\n    for m in el_gr:\n        m.lang = \"el_GR\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = el_gr\n\n    for m in ans:\n        m.text = (\n            \"Όταν το δέντρο είναι μικρό, το στρέβλεις· όταν είναι μεγάλο, το λυγίζεις.\"\n        )\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    
return ans\n\n\n# spanish\ndef get_es_models():\n    es_ES = [\n        PiperModel(name=\"carlfm\", kind=\"x_low\", sr=16000, ns=1),\n        PiperModel(name=\"davefx\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"sharvard\", kind=\"medium\", sr=22050, ns=2),\n    ]\n\n    es_ES.extend(\n        [\n            # https://github.com/rhasspy/piper/issues/187#issuecomment-1802216304\n            # https://drive.google.com/file/d/12tNCCyd0Hf5jsyqCw8828kLSHHx5LOw9/view\n            PiperModel(\n                name=\"glados\",\n                kind=\"medium\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-es_ES-glados-medium/resolve/main/es_ES-glados-medium.onnx\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-es_ES-glados-medium/resolve/main/es_ES-glados-medium.onnx.json\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-es_ES-glados-medium/resolve/main/README.md\n                   \"\"\",\n                url=\"https://github.com/rhasspy/piper/issues/187#issuecomment-1802216304\",\n            ),\n        ]\n    )\n\n    es_ES.extend(\n        [\n            PiperModel(\n                name=\"miro\",\n                kind=\"high\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 
4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro/resolve/main/miro_es-ES.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro/resolve/main/miro_es-ES.onnx.json\n\n                   mv miro_es-ES.onnx es_ES-miro-high.onnx\n                   mv miro_es-ES.onnx.json es_ES-miro-high.onnx.json\n                   \"\"\",\n                url=\"https://huggingface.co/OpenVoiceOS/pipertts_es-ES_miro\",\n            ),\n        ]\n    )\n\n    es_MX = [\n        PiperModel(name=\"ald\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"claude\", kind=\"high\", sr=22050, ns=1),\n    ]\n\n    # Argentina\n    es_AR = [\n        PiperModel(name=\"daniela\", kind=\"high\", sr=22050, ns=1),\n    ]\n\n    for m in es_ES:\n        m.lang = \"es_ES\"\n\n    for m in es_MX:\n        m.lang = \"es_MX\"\n\n    for m in es_AR:\n        m.lang = \"es_AR\"\n\n    ans = es_ES + es_MX + es_AR\n\n    for m in ans:\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        m.text = \"Cuando te encuentres ante una puerta cerrada, no olvides que a veces el destino cierra una puerta para que te desvíes hacia un camino que lleva a una ventana que nunca habrías encontrado por tu cuenta.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n           
 m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# persian\ndef get_fa_models():\n    fa_IR = [\n        PiperModel(name=\"amir\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ganji\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ganji_adabi\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"gyro\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"reza_ibrahim\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in fa_IR:\n        m.lang = \"fa_IR\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = fa_IR\n\n    for m in ans:\n        m.text = \"همانطور که کوه ها در برابر باد و باران پایدارند، اما به مرور زمان خرد و پخش می شوند، انسان نیز باید در برابر مشکلات قوی باشد، اما با خرد و خویشتن داری در زندگی به پیش برود.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = 
f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# finnish\ndef get_fi_models():\n    fi_FI = [\n        PiperModel(name=\"harri\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"harri\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in fi_FI:\n        m.lang = \"fi_FI\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = fi_FI\n\n    for m in ans:\n        m.text = \"Sateenkaaren päässä on kultaa, mutta vain ne, jotka siihen uskovat, voivat sen löytää.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# french\ndef get_fr_models():\n    fr_FR = [\n        PiperModel(name=\"gilles\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"siwis\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"siwis\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"tom\", kind=\"medium\", sr=44100, ns=1),\n        PiperModel(name=\"upmc\", kind=\"medium\", sr=22050, ns=2),\n    ]\n\n    fr_FR.extend(\n        [\n            PiperModel(\n                name=\"tjiho\",\n                kind=f\"model{k}\",\n                sr=44100,\n                ns=1,\n                cmd=f\"\"\"\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model{k}/resolve/main/fr_FR-tjiho-model{k}.onnx\n      
             wget -qq https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model{k}/resolve/main/fr_FR-tjiho-model{k}.onnx.json\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model{k}/resolve/main/LICENSE.txt\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model{k}/resolve/main/MODEL_CARD\n                   \"\"\",\n                url=f\"https://huggingface.co/csukuangfj/vits-piper-fr_FR-tjiho-model{k}/tree/main\",\n            )\n            for k in [1, 2, 3]\n        ]\n    )\n\n    fr_FR += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq 
https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro/resolve/main/miro_fr-FR.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro/resolve/main/miro_fr-FR.onnx.json\n\n                   mv miro_fr-FR.onnx fr_FR-miro-high.onnx\n                   mv miro_fr-FR.onnx.json fr_FR-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_fr-FR_miro\",\n        ),\n    ]\n\n    for m in fr_FR:\n        m.lang = \"fr_FR\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = fr_FR\n\n    for m in ans:\n        m.text = \"Pas de nouvelles, bonnes nouvelles.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# hindi\ndef get_hi_models():\n    hi_IN = [\n        PiperModel(name=\"pratham\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"priyamvada\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"rohan\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in hi_IN:\n        m.lang = \"hi_IN\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = hi_IN\n\n    for m in ans:\n        m.text = \"यह मत पूछो कि तुम्हारा देश तुम्हारे लिए क्या कर सकता है। यह पूछो कि तुम अपने देश के लिए क्या कर सकते हो।\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n 
           m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# hungarian\ndef get_hu_models():\n    hu_HU = [\n        PiperModel(name=\"anna\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"berta\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"imre\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in hu_HU:\n        m.lang = \"hu_HU\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = hu_HU\n\n    for m in ans:\n        m.text = \"Ha északról fúj a szél, a lányok nem lógnak együtt.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# icelandic\ndef get_is_models():\n    is_IS = [\n        PiperModel(name=\"bui\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"salka\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"steinn\", 
kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ugla\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in is_IS:\n        m.lang = \"is_IS\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = is_IS\n\n    for m in ans:\n        m.text = \"Farðu með allt, eða farðu ekki.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# italian\ndef get_it_models():\n    it_IT = [\n        PiperModel(name=\"paola\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"riccardo\", kind=\"x_low\", sr=16000, ns=1),\n    ]\n\n    it_IT += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC 
BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro/resolve/main/miro_it-IT.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro/resolve/main/miro_it-IT.onnx.json\n\n                   mv miro_it-IT.onnx it_IT-miro-high.onnx\n                   mv miro_it-IT.onnx.json it_IT-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_it-IT_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular 
(non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii/resolve/main/dii_it-IT.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii/resolve/main/dii_it-IT.onnx.json\n\n                   mv dii_it-IT.onnx it_IT-dii-high.onnx\n                   mv dii_it-IT.onnx.json it_IT-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_it-IT_dii\",\n        ),\n    ]\n\n    for m in it_IT:\n        m.lang = \"it_IT\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = it_IT\n\n    for m in ans:\n        m.text = (\n            \"Se vuoi andare veloce, vai da solo; se vuoi andare lontano, vai insieme.\"\n        )\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# georgian\ndef get_ka_models():\n    ka_GE = [\n        PiperModel(name=\"natia\", kind=\"medium\", 
sr=22050, ns=1),\n    ]\n\n    for m in ka_GE:\n        m.lang = \"ka_GE\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = ka_GE\n\n    for m in ans:\n        m.text = \"ღვინო თბილისში, საქართველო სამტრედში\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# kazakh\ndef get_kk_models():\n    kk_KZ = [\n        PiperModel(name=\"iseke\", kind=\"x_low\", sr=16000, ns=1),\n        PiperModel(name=\"issai\", kind=\"high\", sr=22050, ns=6),\n        PiperModel(name=\"raya\", kind=\"x_low\", sr=16000, ns=1),\n    ]\n\n    for m in kk_KZ:\n        m.lang = \"kk_KZ\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = kk_KZ\n\n    for m in ans:\n        m.text = \"Әлемнің жұлдыздары сенің көзің, жаным.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = 
f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# luxembourgish\ndef get_lb_models():\n    lb_LU = [\n        PiperModel(name=\"marylux\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in lb_LU:\n        m.lang = \"lb_LU\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = lb_LU\n\n    for m in ans:\n        m.text = \"Op der Haaptstrooss sinn all Stroossen Brécken, awer d'Dier kann iwwerall erreecht ginn.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# latvian\ndef get_lv_models():\n    lv_LV = [\n        PiperModel(name=\"aivars\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in lv_LV:\n        m.lang = \"lv_LV\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = lv_LV\n\n    for m in ans:\n        m.text = \"Zeme nenes augļus, ja tēvs sēj, bet māte auž.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq 
https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# malayalam\ndef get_ml_models():\n    ml_IN = [\n        PiperModel(name=\"arjun\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"meera\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in ml_IN:\n        m.lang = \"ml_IN\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = ml_IN\n\n    for m in ans:\n        m.text = \"മണ്ണ് മരിക്കുമ്പോൾ കാട്ടിലെ വെള്ളവും മരിക്കുന്നു.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Nepali\ndef get_ne_models():\n    ne_NP = [\n        PiperModel(name=\"chitwan\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"google\", kind=\"medium\", sr=22050, ns=18),\n        PiperModel(name=\"google\", kind=\"x_low\", sr=16000, ns=18),\n    ]\n\n    for m in ne_NP:\n        m.lang = \"ne_NP\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = ne_NP\n\n    for m in ans:\n        m.text = \"घाँसको पातले पहाडलाई अभिवादन गर्दै झुक्छ।\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = 
f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# dutch\ndef get_nl_models():\n    nl_BE = [\n        PiperModel(name=\"nathalie\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"nathalie\", kind=\"x_low\", sr=16000, ns=1),\n    ]\n\n    nl_NL = [\n        PiperModel(name=\"pim\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ronnie\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    nl_NL += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" 
>> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro/resolve/main/miro_nl-NL.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro/resolve/main/miro_nl-NL.onnx.json\n\n                   mv miro_nl-NL.onnx nl_NL-miro-high.onnx\n                   mv miro_nl-NL.onnx.json nl_NL-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them 
stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii/resolve/main/dii_nl-NL.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii/resolve/main/dii_nl-NL.onnx.json\n\n                   mv dii_nl-NL.onnx nl_NL-dii-high.onnx\n                   mv dii_nl-NL.onnx.json nl_NL-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_nl-NL_dii\",\n        ),\n    ]\n\n    for m in nl_BE:\n        m.lang = \"nl_BE\"\n\n    for m in nl_NL:\n        m.lang = \"nl_NL\"\n\n    ans = nl_BE + nl_NL\n\n    for m in ans:\n        m.text = \"God schiep het water, maar de Nederlander schiep de dijk\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Norwegian\ndef get_no_models():\n    no_NO = [\n        PiperModel(name=\"talesyntese\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in no_NO:\n        m.lang = \"no_NO\"\n\n    ans = no_NO\n\n    for m in ans:\n        m.text = \"Uskyldig kan stormen veroorzaken\"\n\n        if m.model_name == \"\":\n            m.model_name = 
f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# polish\ndef get_pl_models():\n    pl_PL = [\n        PiperModel(name=\"darkman\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"gosia\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"mc_speech\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    pl_PL.extend(\n        [\n            PiperModel(\n                name=\"jarvis_wg_glos\",\n                kind=\"medium\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-jarvis_wg_glos-medium.onnx\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-jarvis_wg_glos-medium.onnx.json\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/README.md\n                   \"\"\",\n                url=\"https://github.com/k2-fsa/sherpa-onnx/issues/2402\",\n            ),\n            PiperModel(\n                name=\"justyna_wg_glos\",\n                kind=\"medium\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-justyna_wg_glos-medium.onnx\n                   wget -qq 
https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-justyna_wg_glos-medium.onnx.json\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/README.md\n                   \"\"\",\n                url=\"https://github.com/k2-fsa/sherpa-onnx/issues/2402\",\n            ),\n            PiperModel(\n                name=\"meski_wg_glos\",\n                kind=\"medium\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-meski_wg_glos-medium.onnx\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-meski_wg_glos-medium.onnx.json\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/README.md\n                   \"\"\",\n                url=\"https://github.com/k2-fsa/sherpa-onnx/issues/2402\",\n            ),\n            PiperModel(\n                name=\"zenski_wg_glos\",\n                kind=\"medium\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-zenski_wg_glos-medium.onnx\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/pl_PL-zenski_wg_glos-medium.onnx.json\n                   wget -qq https://huggingface.co/WitoldG/polish_piper_models/resolve/main/README.md\n                   \"\"\",\n                url=\"https://github.com/k2-fsa/sherpa-onnx/issues/2402\",\n            ),\n        ]\n    )\n\n    for m in pl_PL:\n        m.lang = \"pl_PL\"\n\n    ans = pl_PL\n\n    for m in ans:\n        m.text = \"Nieważne, za kogo walczysz, i tak popełnisz błąd\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = 
f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Portuguese\ndef get_pt_models():\n    pt_BR = [\n        PiperModel(name=\"cadu\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"edresson\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"faber\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"jeff\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    pt_PT = [\n        PiperModel(\n            name=\"tugao\",\n            kind=\"medium\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                    wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/pt/pt_PT/tugão/medium/pt_PT-tugão-medium.onnx\n                    wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/pt/pt_PT/tugão/medium/pt_PT-tugão-medium.onnx.json\n                    wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/pt/pt_PT/tugão/medium/MODEL_CARD\n\n                    mv pt_PT-tugão-medium.onnx pt_PT-tugao-medium.onnx\n                    mv pt_PT-tugão-medium.onnx.json pt_PT-tugao-medium.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/rhasspy/piper-voices/tree/main/pt/pt_PT/tugão/medium\",\n        ),\n    ]\n\n    pt_PT += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq 
https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro/resolve/main/miro_pt-PT.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro/resolve/main/miro_pt-PT.onnx.json\n\n                   mv miro_pt-PT.onnx pt_PT-miro-high.onnx\n                   mv miro_pt-PT.onnx.json pt_PT-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee 
https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii/resolve/main/dii_pt-PT.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii/resolve/main/dii_pt-PT.onnx.json\n\n                   mv dii_pt-PT.onnx pt_PT-dii-high.onnx\n                   mv dii_pt-PT.onnx.json pt_PT-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_pt-PT_dii\",\n        ),\n    ]\n\n    pt_BR += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro\" >> README.md\n                   echo \"and 
https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro/resolve/main/miro_pt-BR.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro/resolve/main/miro_pt-BR.onnx.json\n\n                   mv miro_pt-BR.onnx pt_BR-miro-high.onnx\n                   mv miro_pt-BR.onnx.json pt_BR-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                
   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii/resolve/main/dii_pt-BR.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii/resolve/main/dii_pt-BR.onnx.json\n\n                   mv dii_pt-BR.onnx pt_BR-dii-high.onnx\n                   mv dii_pt-BR.onnx.json pt_BR-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_pt-BR_dii\",\n        ),\n    ]\n\n    for m in pt_BR:\n        m.lang = \"pt_BR\"\n\n    for m in pt_PT:\n        m.lang = \"pt_PT\"\n\n    ans = pt_BR + pt_PT\n\n    for m in ans:\n        m.text = \"Marinha sem vento, não chega a porto\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq 
https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Romanian\ndef get_ro_models():\n    ro_RO = [\n        PiperModel(name=\"mihai\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in ro_RO:\n        m.lang = \"ro_RO\"\n\n    ans = ro_RO\n\n    for m in ans:\n        m.text = \"Un foc fără lemne se stinge, o lume fără poveste moare.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Russian\ndef get_ru_models():\n    ru_RU = [\n        PiperModel(name=\"denis\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"dmitri\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"irina\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ruslan\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in ru_RU:\n        m.lang = \"ru_RU\"\n\n    ans = ru_RU\n\n    for m in ans:\n        m.text = \"Если курица укусит, ей отрубят голову.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd 
= f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Slovak\ndef get_sk_models():\n    sk_SK = [\n        PiperModel(name=\"lili\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in sk_SK:\n        m.lang = \"sk_SK\"\n\n    ans = sk_SK\n\n    for m in ans:\n        m.text = \"Kto nepozná strach, nepozná vôľu.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Slovenian\ndef get_sl_models():\n    sl_SI = [\n        PiperModel(name=\"artur\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in sl_SI:\n        m.lang = \"sl_SI\"\n\n    ans = sl_SI\n\n    for m in ans:\n        m.text = \"Kdor se ne boji, ni neumen.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = 
m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Serbian\ndef get_sr_models():\n    sr_RS = [\n        PiperModel(name=\"serbski_institut\", kind=\"medium\", sr=22050, ns=2),\n    ]\n\n    for m in sr_RS:\n        m.lang = \"sr_RS\"\n\n    ans = sr_RS\n\n    for m in ans:\n        m.text = \"Круг не може постојати без свог центра, а нација не може постојати без својих хероја.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Swedish\ndef get_sv_models():\n    sv_SE = [\n        PiperModel(name=\"lisa\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"nst\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in sv_SE:\n        m.lang = \"sv_SE\"\n\n    ans = sv_SE\n\n    for m 
in ans:\n        m.text = \"Liten skog, med många träd\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Swahili\ndef get_sw_models():\n    sw_CD = [\n        PiperModel(name=\"lanfrica\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in sw_CD:\n        m.lang = \"sw_CD\"\n\n    ans = sw_CD\n\n    for m in ans:\n        m.text = \"Mtu mmoja hawezi kuiba mazingira.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Turkish\ndef get_tr_models():\n    tr_TR = [\n        PiperModel(name=\"dfki\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"fahrettin\", 
kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"fettah\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in tr_TR:\n        m.lang = \"tr_TR\"\n\n    ans = tr_TR\n\n    for m in ans:\n        m.text = \"Bir evin duvarları, bir adamın sözü, bir kadının gülü kırılmaz\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Ukrainian\ndef get_uk_models():\n    uk_UA = [\n        PiperModel(name=\"lada\", kind=\"x_low\", sr=16000, ns=1),\n        PiperModel(name=\"ukrainian_tts\", kind=\"medium\", sr=22050, ns=3),\n    ]\n\n    for m in uk_UA:\n        m.lang = \"uk_UA\"\n\n    ans = uk_UA\n\n    for m in ans:\n        m.text = \"Ви не можете навчити коня, якщо не відвикнете від годівлі.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n           
 \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Vietnamese\ndef get_vi_models():\n    vi_VN = [\n        PiperModel(name=\"25hours_single\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"vais1000\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"vivos\", kind=\"x_low\", sr=16000, ns=65),\n    ]\n\n    for m in vi_VN:\n        m.lang = \"vi_VN\"\n\n    ans = vi_VN\n\n    for m in ans:\n        m.text = \"Nước cũ đào gỗ mới, sông cũ chảy nước mới\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\n# Indonesian\ndef get_id_models():\n    id_ID = [\n        PiperModel(name=\"news_tts\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    for m in id_ID:\n        m.lang = \"id_ID\"\n\n    ans = id_ID\n\n    for m in ans:\n        m.text = \"Jangan tanyakan apa yang negara bisa berikan kepadamu, tapi tanyakan apa yang bisa kamu berikan untuk negaramu.\"\n\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq 
https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\ndef get_en_models():\n    en_gb = [\n        PiperModel(name=\"alan\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"alan\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"alba\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"aru\", kind=\"medium\", sr=22050, ns=12),\n        PiperModel(name=\"cori\", kind=\"high\", sr=22050, ns=1),\n        PiperModel(name=\"cori\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"jenny_dioco\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"northern_english_male\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"semaine\", kind=\"medium\", sr=22050, ns=4),\n        PiperModel(name=\"southern_english_female\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"vctk\", kind=\"medium\", sr=22050, ns=109),\n    ]\n    en_us = [\n        PiperModel(name=\"amy\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"amy\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"arctic\", kind=\"medium\", sr=22050, ns=18),\n        PiperModel(name=\"bryce\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"danny\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"hfc_female\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"hfc_male\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"joe\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"john\", 
kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"kathleen\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"kristin\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"kusal\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"l2arctic\", kind=\"medium\", sr=22050, ns=24),\n        PiperModel(name=\"lessac\", kind=\"high\", sr=22050, ns=1),\n        PiperModel(name=\"lessac\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"lessac\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"libritts\", kind=\"high\", sr=22050, ns=904),\n        PiperModel(name=\"libritts_r\", kind=\"medium\", sr=22050, ns=904),\n        PiperModel(name=\"ljspeech\", kind=\"high\", sr=22050, ns=1),\n        PiperModel(name=\"ljspeech\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"norman\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"reza_ibrahim\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"ryan\", kind=\"high\", sr=22050, ns=1),\n        PiperModel(name=\"ryan\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"ryan\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"sam\", kind=\"medium\", sr=22050, ns=1),\n    ]\n\n    en_gb.extend(\n        [\n            PiperModel(\n                name=\"southern_english_female\",\n                kind=\"medium\",\n                sr=22050,\n                ns=6,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_female-medium/resolve/main/en_GB-southern_english_female-medium.onnx\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_female-medium/resolve/main/en_GB-southern_english_female-medium.onnx.json\n                   \"\"\",\n                url=\"https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_female-medium\",\n            ),\n            PiperModel(\n                
name=\"southern_english_male\",\n                kind=\"medium\",\n                sr=22050,\n                ns=8,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_male-medium/resolve/main/en_GB-southern_english_male-medium.onnx\n                   wget -qq https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_male-medium/resolve/main/en_GB-southern_english_male-medium.onnx.json\n                   \"\"\",\n                url=\"https://huggingface.co/csukuangfj/vits-piper-en_GB-southern_english_male-medium\",\n            ),\n        ]\n    )\n\n    en_gb += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any 
derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro/resolve/main/miro_en-GB.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro/resolve/main/miro_en-GB.onnx.json\n\n                   mv miro_en-GB.onnx en_GB-miro-high.onnx\n                   mv miro_en-GB.onnx.json en_GB-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_en-GB_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii/resolve/main/dii_en-GB.onnx\n    
               wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii/resolve/main/dii_en-GB.onnx.json\n\n                   mv dii_en-GB.onnx en_GB-dii-high.onnx\n                   mv dii_en-GB.onnx.json en_GB-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_en-GB_dii\",\n        ),\n    ]\n\n    en_us.extend(\n        [\n            # https://github.com/rhasspy/piper/issues/187#issuecomment-1805709037\n            # https://drive.google.com/file/d/1t2D7zP-e2flduS5duHm__UMB9RjuGqWK/view\n            PiperModel(\n                name=\"glados\",\n                kind=\"high\",\n                sr=22050,\n                ns=1,\n                cmd=\"\"\"\n                   wget -qq https://huggingface.co/csukuangfj/en_US-glados-high/resolve/main/en_US-glados-high.onnx\n                   wget -qq https://huggingface.co/csukuangfj/en_US-glados-high/resolve/main/en_US-glados-high.onnx.json\n                   wget -qq https://huggingface.co/csukuangfj/en_US-glados-high/resolve/main/README.md\n                   wget -qq https://huggingface.co/csukuangfj/en_US-glados-high/resolve/main/MODEL_CARD\n                   \"\"\",\n                url=\"https://github.com/rhasspy/piper/issues/187#issuecomment-1805709037\",\n            ),\n        ]\n    )\n\n    en_us += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n              
     echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro/resolve/main/miro_en-US.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro/resolve/main/miro_en-US.onnx.json\n\n                   mv miro_en-US.onnx en_US-miro-high.onnx\n                   mv miro_en-US.onnx.json en_US-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_en-US_miro\",\n        ),\n    ]\n\n    for m in en_gb:\n        m.lang = \"en_GB\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    for m in en_us:\n        m.lang = \"en_US\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = en_gb + en_us\n\n    for m in ans:\n        m.text = \"Friends fell out often because life was changing so fast. 
The easiest thing in the world was to lose touch with someone.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\ndef get_de_models():\n    de_de = [\n        PiperModel(name=\"eva_k\", kind=\"x_low\", sr=16000, ns=1),\n        PiperModel(name=\"karlsson\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"kerstin\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"pavoque\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"ramona\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"thorsten\", kind=\"high\", sr=22050, ns=1),\n        PiperModel(name=\"thorsten\", kind=\"low\", sr=16000, ns=1),\n        PiperModel(name=\"thorsten\", kind=\"medium\", sr=22050, ns=1),\n        PiperModel(name=\"thorsten_emotional\", kind=\"medium\", sr=22050, ns=8),\n        # https://github.com/rhasspy/piper/issues/187#issuecomment-2691653607\n        PiperModel(\n            name=\"glados\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/high/de_DE-glados-high.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/high/de_DE-glados-high.onnx.json\n               wget -qq 
https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/high/MODEL_CARD\n               wget -qq https://huggingface.co/csukuangfj/vits-piper-de_DE-glados-high/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n        PiperModel(\n            name=\"glados\",\n            kind=\"low\",\n            sr=16000,\n            ns=1,\n            cmd=\"\"\"\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/low/de_DE-glados-low.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/low/de_DE-glados-low.onnx.json\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/low/MODEL_CARD\n               wget -qq https://huggingface.co/csukuangfj/vits-piper-de_DE-glados-low/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n        PiperModel(\n            name=\"glados\",\n            kind=\"medium\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/medium/de_DE-glados-medium.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/medium/de_DE-glados-medium.onnx.json\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados/medium/MODEL_CARD\n               wget -qq https://huggingface.co/csukuangfj/vits-piper-de_DE-glados-medium/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n        PiperModel(\n            name=\"glados_turret\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            
cmd=\"\"\"\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/high/de_DE-glados-turret-high.onnx\n               mv de_DE-glados-turret-high.onnx de_DE-glados_turret-high.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/high/de_DE-glados-turret-high.onnx.json\n               mv de_DE-glados-turret-high.onnx.json de_DE-glados_turret-high.onnx.json\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/high/MODEL_CARD\n               wget https://huggingface.co/csukuangfj/vits-piper-de_DE-glados_turret-high/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n        PiperModel(\n            name=\"glados_turret\",\n            kind=\"low\",\n            sr=16000,\n            ns=1,\n            cmd=\"\"\"\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/low/de_DE-glados-turret-low.onnx\n               mv de_DE-glados-turret-low.onnx de_DE-glados_turret-low.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/low/de_DE-glados-turret-low.onnx.json\n               mv de_DE-glados-turret-low.onnx.json de_DE-glados_turret-low.onnx.json\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/low/MODEL_CARD\n               wget https://huggingface.co/csukuangfj/vits-piper-de_DE-glados_turret-low/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n        PiperModel(\n            name=\"glados_turret\",\n            kind=\"medium\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n               wget -qq 
https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/medium/de_DE-glados-turret-medium.onnx\n               mv de_DE-glados-turret-medium.onnx de_DE-glados_turret-medium.onnx\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/medium/de_DE-glados-turret-medium.onnx.json\n               mv de_DE-glados-turret-medium.onnx.json de_DE-glados_turret-medium.onnx.json\n               wget -qq https://huggingface.co/systemofapwne/piper-de-glados/resolve/main/de/de_DE/glados-turret/medium/MODEL_CARD\n               wget https://huggingface.co/csukuangfj/vits-piper-de_DE-glados_turret-medium/resolve/main/README.md\n               \"\"\",\n            url=\"https://huggingface.co/systemofapwne/piper-de-glados\",\n        ),\n    ]\n\n    de_de += [\n        PiperModel(\n            name=\"miro\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow 
commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro/resolve/main/miro_de-DE.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro/resolve/main/miro_de-DE.onnx.json\n\n                   mv miro_de-DE.onnx de_DE-miro-high.onnx\n                   mv miro_de-DE.onnx.json de_DE-miro-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_de-DE_miro\",\n        ),\n        PiperModel(\n            name=\"dii\",\n            kind=\"high\",\n            sr=22050,\n            ns=1,\n            cmd=\"\"\"\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii/resolve/main/README.md\n\n                   echo \"\\n\\nSee https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii\" >> README.md\n                   echo \"and https://github.com/OHF-Voice/piper1-gpl/discussions/27\" >> README.md\n                   echo \"\\n\\n# License\\n\\n\" >> README.md\n\n                   echo \"See also https://github.com/k2-fsa/sherpa-onnx/pull/2480\\n\\n\" >> README.md\n                   echo \"This model is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/).\\n\" >> README.md\n\n                   echo \"- ✅ Always free for regular (non-commercial) users  \\n\" >> README.md\n                   echo \"- ❌ Commercial use is not allowed at this time  \\n\" >> README.md\n                   echo \"- 🔄 The author may relax the restrictions in the future (e.g., allow commercial use), but will not make them stricter  \\n\\n\" >> README.md\n                   echo \"**Important:** You must include this license when 
redistributing the model or any derivatives.\\n\" >> README.md\n\n\n\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii/resolve/main/dii_de-DE.onnx\n                   wget -qq https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii/resolve/main/dii_de-DE.onnx.json\n\n                   mv dii_de-DE.onnx de_DE-dii-high.onnx\n                   mv dii_de-DE.onnx.json de_DE-dii-high.onnx.json\n                   \"\"\",\n            url=\"https://huggingface.co/OpenVoiceOS/pipertts_de-DE_dii\",\n        ),\n    ]\n\n    for m in de_de:\n        m.lang = \"de_DE\"\n        if m.model_name == \"\":\n            m.model_name = f\"{m.lang}-{m.name}-{m.kind}.onnx\"\n\n    ans = de_de\n\n    for m in ans:\n        m.text = \"Alles hat ein Ende, nur die Wurst hat zwei.\"\n        code = m.lang[:2]\n        if m.cmd == \"\":\n            m.cmd = f\"\"\"\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/{m.model_name}.json\n            wget -qq https://huggingface.co/rhasspy/piper-voices/resolve/main/{code}/{m.lang}/{m.name}/{m.kind}/MODEL_CARD\n            \"\"\"\n\n        if m.url == \"\":\n            m.url = f\"https://huggingface.co/rhasspy/piper-voices/tree/main/{code}/{m.lang}/{m.name}/{m.kind}\"\n\n    return ans\n\n\ndef get_all_models():\n    ans = []\n    ans += get_ar_models()\n    ans += get_ca_models()\n    ans += get_cs_models()\n    ans += get_cy_models()\n    ans += get_da_models()\n    ans += get_de_models()\n    ans += get_el_models()\n    ans += get_en_models()\n    ans += get_es_models()\n    ans += get_fa_models()\n    ans += get_fi_models()\n    ans += get_fr_models()\n    ans += get_hi_models()\n    ans += get_id_models()\n    ans += get_hu_models()\n    ans += get_is_models()\n    ans += get_it_models()\n    ans += get_ka_models()\n  
  ans += get_kk_models()\n    ans += get_lb_models()\n    ans += get_lv_models()\n    ans += get_ml_models()\n    ans += get_ne_models()\n    ans += get_nl_models()\n    ans += get_no_models()\n    ans += get_pl_models()\n    ans += get_pt_models()\n    ans += get_ro_models()\n    ans += get_ru_models()\n    ans += get_sk_models()\n    ans += get_sl_models()\n    ans += get_sr_models()\n    ans += get_sv_models()\n    ans += get_sw_models()\n    ans += get_tr_models()\n    ans += get_uk_models()\n    ans += get_vi_models()\n\n\n    for i, m in enumerate(ans):\n        m.index = i\n\n    return ans\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_all_models()\n\n    print(all_model_list)\n\n    num_models = len(all_model_list)\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./generate.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        if not Path(f\"{filename}.in\").is_file():\n            print(f\"skip {filename}\")\n            continue\n\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n    print(f\"There are {len(all_model_list)} models\")\n    for m in 
all_model_list:\n        print(m.index, m.model_name)\n\n    if Path(\"hf\").is_dir():\n        with open(\"./generate_samples.py.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n        for m in all_model_list:\n            model_dir = f\"vits-piper-{m.lang}-{m.name}-{m.kind}\"\n            d = {\n                \"model\": f\"{model_dir}/{m.model_name}\",\n                \"data_dir\": f\"{model_dir}/espeak-ng-data\",\n                \"tokens\": f\"{model_dir}/tokens.txt\",\n                \"text\": m.text,\n            }\n            for i in range(m.ns):\n                s = template.render(\n                    **d,\n                    sid=i,\n                    output_filename=f\"hf/piper/mp3/{m.lang}/{model_dir}/{i}.mp3\",\n                )\n\n                with open(f\"generate_samples-{model_dir}-{i}.py\", \"w\") as f:\n                    print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/piper/generate.sh.in",
    "content": "#!/usr/bin/env bash\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n#\n# Auto generated! Do NOT edit!\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nwget -qq https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/espeak-ng-data.tar.bz2\ntar xf espeak-ng-data.tar.bz2\nrm espeak-ng-data.tar.bz2\n\nmkdir -p release\n\n{% for model in model_list %}\n\nname={{ model.name }}\nkind={{ model.kind }}\nlang={{ model.lang }}\nmodel_name={{ model.model_name }}\ntext=\"{{ model.text }}\"\nnum_speakers={{ model.ns }}\nsample_rate={{ model.sr }}\n\n{{ model.cmd }}\n\necho \"files\"\n\nls -lh\necho \"---\"\n\npython3 ./add_meta_data.py \\\n  --name $name \\\n  --kind $kind \\\n  --lang $lang\n\ndst=vits-piper-$lang-$name-$kind\ndst_int8=vits-piper-$lang-$name-$kind-int8\ndst_fp16=vits-piper-$lang-$name-$kind-fp16\nmkdir -p $dst\n\nmv -v tokens.txt  $dst/\nmv -v MODEL_CARD $dst/ || true\nmv -v README $dst/ || true\nmv -v README.md $dst/ || true\nmv -v LICENSE.txt $dst/ || true\nmv -v *.json  $dst/\ncp -a ./espeak-ng-data $dst/\n\ncp -a $dst $dst_int8\ncp -a $dst $dst_fp16\n\nmv -v *.onnx  $dst/\n\npython3 ./dynamic_quantization.py \\\n  --input $dst/$model_name \\\n  --output-int8 $dst_int8/$model_name \\\n  --output-fp16 $dst_fp16/$model_name >/dev/null 2>&1\n\necho \"---fp32---\"\nls -lh $dst\n\necho \"---int8---\"\nls -lh $dst_int8\n\necho \"---fp16---\"\nls -lh $dst_fp16\n\ntar cjf ${dst}.tar.bz2 $dst\ntar cjf ${dst_int8}.tar.bz2 $dst_int8\ntar cjf ${dst_fp16}.tar.bz2 $dst_fp16\n\nif [ -d hf ]; then\n  mkdir -p hf/piper/mp3/$lang/vits-piper-$lang-$name-$kind\n  for i in $(seq $num_speakers); do\n    i=$((i-1))\n    python3 ./generate_samples-$dst-$i.py\n  done\n  ls -lh hf/piper/mp3/$lang/vits-piper-$lang-$name-$kind\nfi\n\nmv $dst release\nmv $dst_int8 release\nmv $dst_fp16 release\n\nls 
-lh release/*\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/piper/generate_samples.py.in",
    "content": "import sherpa_onnx\nimport soundfile as sf\n\nconfig = sherpa_onnx.OfflineTtsConfig(\n    model=sherpa_onnx.OfflineTtsModelConfig(\n        vits=sherpa_onnx.OfflineTtsVitsModelConfig(\n            model=\"{{ model }}\",\n            lexicon=\"\",\n            data_dir=\"{{ data_dir }}\",\n            tokens=\"{{ tokens }}\",\n        ),\n        num_threads=1,\n    ),\n)\n\nif not config.validate():\n    raise ValueError(\"Please check your config\")\n\ntts = sherpa_onnx.OfflineTts(config)\naudio = tts.generate(text=\"{{text}}\", sid={{sid}}, speed=1.0)\n\nsf.write(\"{{ output_filename }}\", audio.samples, samplerate=audio.sample_rate)\n"
  },
  {
    "path": "scripts/pocket-tts/.gitignore",
    "content": "*.json\n*.model\n"
  },
  {
    "path": "scripts/pocket-tts/README.md",
    "content": "# Introduction\n\n- [./convert_tokenizer.py](./convert_tokenizer.py) It produces `./token_scores.json`\n  and `./vocab.json` from [./tokenizer.model](https://huggingface.co/KevinAHM/pocket-tts-onnx/resolve/main/tokenizer.model)\n\n- [./test_tokenizer.py](./test_tokenizer.py) is used to test the exported `./token_scores.json`\n  and `./vocab.json`\n\nIn C++, we don't need to use the [sentencepiece](https://github.com/google/sentencepiece) or protobuf for the tokenizer.\n"
  },
  {
    "path": "scripts/pocket-tts/convert_tokenizer.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\nimport json\n\nimport sentencepiece as spm\n\nsp = spm.SentencePieceProcessor(model_file=\"tokenizer.model\")\n\ntoken2id = {}\ntoken2score = {}\n\nfor i in range(sp.get_piece_size()):\n    tok = sp.id_to_piece(i)\n    token2id[tok] = i\n    token2score[tok] = sp.get_score(i)\n\nwith open(\"vocab.json\", \"w\", encoding=\"utf-8\") as f:\n    json.dump(token2id, f, indent=2, ensure_ascii=False)\n\nwith open(\"token_scores.json\", \"w\", encoding=\"utf-8\") as f:\n    json.dump(token2score, f, indent=2, ensure_ascii=False)\n"
  },
  {
    "path": "scripts/pocket-tts/test_tokenizer.py",
    "content": "#!/usr/bin/env python3\n#\n# Copyright (c)  2026  Xiaomi Corporation\n\nimport json\n\nimport sentencepiece as spm\n\n\nclass SentencePieceBPETokenizer:\n    def __init__(self, vocab_json, token_scores_json):\n        with open(vocab_json, encoding=\"utf-8\") as f:\n            self.token2id = json.load(f)\n\n        with open(token_scores_json, encoding=\"utf-8\") as f:\n            self.token2score = json.load(f)\n\n        self.id2token = {v: k for k, v in self.token2id.items()}\n\n        # index tokens by first char for speed\n        self.by_first_char = {}\n        for tok in self.token2id:\n            if tok:\n                self.by_first_char.setdefault(tok[0], []).append(tok)\n\n        # byte fallback <0xNN>\n        self.byte_token = {b: f\"<0x{b:02X}>\" for b in range(256)}\n\n    def encode(self, text, return_type=\"ids\"):\n        text = text.replace(\" \", \"▁\")\n        if not text.startswith(\"▁\"):\n            text = \"▁\" + text\n\n        n = len(text)\n        dp = [-1e30] * (n + 1)\n        back = [None] * (n + 1)\n        dp[n] = 0.0\n\n        for i in range(n - 1, -1, -1):\n            c = text[i]\n\n            for tok in self.by_first_char.get(c, []):\n                if text.startswith(tok, i):\n                    j = i + len(tok)\n                    score = self.token2score[tok] + dp[j]\n                    if score > dp[i]:\n                        dp[i] = score\n                        back[i] = tok\n\n            # byte fallback\n            if back[i] is None:\n                b = text[i].encode(\"utf-8\")[0]\n                tok = self.byte_token[b]\n                dp[i] = self.token2score[tok] + dp[i + 1]\n                back[i] = tok\n\n        # reconstruct\n        tokens = []\n        i = 0\n        while i < n:\n            tok = back[i]\n            tokens.append(tok)\n            i += len(tok)\n\n        if return_type == \"tokens\":\n            return tokens\n        return [self.token2id[t] for 
t in tokens]\n\n\ndef main():\n    tokenizer = SentencePieceBPETokenizer(\n        vocab_json=\"./vocab.json\", token_scores_json=\"./token_scores.json\"\n    )\n    s = \"Yesterday, I bought 3 apples, 2 bananas, and a dozen oranges. Wow! That's amazing—did you see it too? I can't believe it's already 10:30 p.m.\"\n\n    tokens = tokenizer.encode(s, return_type=\"tokens\")\n    token_ids = tokenizer.encode(s, return_type=\"int\")\n    print(tokens)\n    print(token_ids)\n    sp = spm.SentencePieceProcessor(model_file=\"tokenizer.model\")\n    #  print(help(sp.encode))\n\n    gt_tokens = sp.encode(s, out_type=str)\n    gt_token_ids = sp.encode(s, out_type=int)\n    print(gt_tokens)\n    print(len(tokens), len(gt_tokens))\n    a = []\n    for k, p in zip(tokens, gt_tokens):\n        a.append(k == p)\n    print(a)\n\n    print(token_ids)\n    print(gt_token_ids)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/.gitignore",
    "content": "*.bin\n*.onnx\n"
  },
  {
    "path": "scripts/pyannote/segmentation/README.md",
    "content": "# File description\n\nPlease download test wave files from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\n\n## 0-four-speakers-zh.wav\n\nIt is recorded by @csukuangfj\n\n## 1-two-speakers-en.wav\n\nThis file is from\nhttps://github.com/pengzhendong/pyannote-onnx/blob/master/data/test_16k.wav\nand it contains speeches from two speakers.\n\nNote that we have renamed it from `test_16k.wav` to `1-two-speakers-en.wav`\n\n\n## 2-two-speakers-en.wav\nThis file is from\nhttps://huggingface.co/spaces/Xenova/whisper-speaker-diarization\n\nNote that the original file is `./fcf059e3-689f-47ec-a000-bdace87f0113.mp4`.\nWe use the following commands to convert it to `2-two-speakers-en.wav`.\n\n```bash\nffmpeg -i ./fcf059e3-689f-47ec-a000-bdace87f0113.mp4 -ac 1 -ar 16000 ./2-two-speakers-en.wav\n```\n\n## 3-two-speakers-en.wav\n\nThis file is from\nhttps://aws.amazon.com/blogs/machine-learning/deploy-a-hugging-face-pyannote-speaker-diarization-model-on-amazon-sagemaker-as-an-asynchronous-endpoint/\n\nNote that the original file is `ML16091-Audio.mp3`. We use the following\ncommands to convert it to `3-two-speakers-en.wav`\n\n\n```bash\nsox ML16091-Audio.mp3 -r 16k 3-two-speakers-en.wav\n```\n"
  },
  {
    "path": "scripts/pyannote/segmentation/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport os\nfrom typing import Any, Dict\n\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom pyannote.audio import Model\nfrom pyannote.audio.core.task import Problem, Resolution\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    # You can download ./pytorch_model.bin from\n    # https://hf-mirror.com/csukuangfj/pyannote-models/tree/main/segmentation-3.0\n    # or from\n    # https://huggingface.co/Revai/reverb-diarization-v1/tree/main\n    pt_filename = \"./pytorch_model.bin\"\n    model = Model.from_pretrained(pt_filename)\n    model.eval()\n    assert model.dimension == 7, model.dimension\n    print(model.specifications)\n\n    assert (\n        model.specifications.problem == Problem.MONO_LABEL_CLASSIFICATION\n    ), model.specifications.problem\n\n    assert (\n        model.specifications.resolution == Resolution.FRAME\n    ), model.specifications.resolution\n\n    assert model.specifications.duration == 10.0, model.specifications.duration\n\n    assert model.audio.sample_rate == 16000, model.audio.sample_rate\n\n    # (batch, num_channels, num_samples)\n    assert list(model.example_input_array.shape) == [\n        1,\n        1,\n        16000 * 10,\n    ], model.example_input_array.shape\n\n    example_output = model(model.example_input_array)\n\n    # (batch, 
num_frames, num_classes)\n    assert list(example_output.shape) == [1, 589, 7], example_output.shape\n\n    assert model.receptive_field.step == 0.016875, model.receptive_field.step\n    assert model.receptive_field.duration == 0.0619375, model.receptive_field.duration\n    assert model.receptive_field.step * 16000 == 270, model.receptive_field.step * 16000\n    assert model.receptive_field.duration * 16000 == 991, (\n        model.receptive_field.duration * 16000\n    )\n\n    opset_version = 13\n\n    filename = \"model.onnx\"\n    torch.onnx.export(\n        model,\n        model.example_input_array,\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\"],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 2: \"T\"},\n            \"y\": {0: \"N\", 1: \"T\"},\n        },\n    )\n\n    sample_rate = model.audio.sample_rate\n\n    window_size = int(model.specifications.duration) * 16000\n    receptive_field_size = int(model.receptive_field.duration * 16000)\n    receptive_field_shift = int(model.receptive_field.step * 16000)\n\n    is_revai = os.getenv(\"SHERPA_ONNX_IS_REVAI\", \"\")\n    if is_revai == \"\":\n        url_1 = \"https://huggingface.co/pyannote/segmentation-3.0\"\n        url_2 = \"https://huggingface.co/csukuangfj/pyannote-models/tree/main/segmentation-3.0\"\n        license_url = (\n            \"https://huggingface.co/pyannote/segmentation-3.0/blob/main/LICENSE\"\n        )\n        model_author = \"pyannote-audio\"\n    else:\n        url_1 = \"https://huggingface.co/Revai/reverb-diarization-v1\"\n        url_2 = \"https://huggingface.co/csukuangfj/sherpa-onnx-reverb-diarization-v1\"\n        license_url = (\n            \"https://huggingface.co/Revai/reverb-diarization-v1/blob/main/LICENSE\"\n        )\n        model_author = \"Revai\"\n\n    meta_data = {\n        \"num_speakers\": len(model.specifications.classes),\n        \"powerset_max_classes\": 
model.specifications.powerset_max_classes,\n        \"num_classes\": model.dimension,\n        \"sample_rate\": sample_rate,\n        \"window_size\": window_size,\n        \"receptive_field_size\": receptive_field_size,\n        \"receptive_field_shift\": receptive_field_shift,\n        \"model_type\": \"pyannote-segmentation-3.0\",\n        \"version\": \"1\",\n        \"model_author\": model_author,\n        \"maintainer\": \"k2-fsa\",\n        \"url_1\": url_1,\n        \"url_2\": url_2,\n        \"license\": license_url,\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    print(\"Generate int8 quantization models\")\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n    print(f\"Saved to {filename} and {filename_int8}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/notes.md",
    "content": "\n# config.yaml\n\n\n```yaml\ntask:\n  _target_: pyannote.audio.tasks.SpeakerDiarization\n  duration: 10.0\n  max_speakers_per_chunk: 3\n  max_speakers_per_frame: 2\nmodel:\n  _target_: pyannote.audio.models.segmentation.PyanNet\n  sample_rate: 16000\n  num_channels: 1\n  sincnet:\n    stride: 10\n  lstm:\n    hidden_size: 128\n    num_layers: 4\n    bidirectional: true\n    monolithic: true\n  linear:\n    hidden_size: 128\n    num_layers: 2\n```\n\n# Model architecture of ./pytorch_model.bin\n\n`print(model)`:\n\n```python3\nPyanNet(\n  (sincnet): SincNet(\n    (wav_norm1d): InstanceNorm1d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)\n    (conv1d): ModuleList(\n      (0): Encoder(\n        (filterbank): ParamSincFB()\n      )\n      (1): Conv1d(80, 60, kernel_size=(5,), stride=(1,))\n      (2): Conv1d(60, 60, kernel_size=(5,), stride=(1,))\n    )\n    (pool1d): ModuleList(\n      (0-2): 3 x MaxPool1d(kernel_size=3, stride=3, padding=0, dilation=1, ceil_mode=False)\n    )\n    (norm1d): ModuleList(\n      (0): InstanceNorm1d(80, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)\n      (1-2): 2 x InstanceNorm1d(60, eps=1e-05, momentum=0.1, affine=True, track_running_stats=False)\n    )\n  )\n  (lstm): LSTM(60, 128, num_layers=4, batch_first=True, dropout=0.5, bidirectional=True)\n  (linear): ModuleList(\n    (0): Linear(in_features=256, out_features=128, bias=True)\n    (1): Linear(in_features=128, out_features=128, bias=True)\n  )\n  (classifier): Linear(in_features=128, out_features=7, bias=True)\n  (activation): LogSoftmax(dim=-1)\n)\n```\n\n```python3\n>>> list(model.specifications)\n[Specifications(problem=<Problem.MONO_LABEL_CLASSIFICATION: 1>, resolution=<Resolution.FRAME: 1>, duration=10.0, min_duration=None, warm_up=(0.0, 0.0), classes=['speaker#1', 'speaker#2', 'speaker#3'], powerset_max_classes=2, permutation_invariant=True)]\n```\n\n```python3\n>>> model.hparams\n\"linear\":       {'hidden_size': 
128, 'num_layers': 2}\n\"lstm\":         {'hidden_size': 128, 'num_layers': 4, 'bidirectional': True, 'monolithic': True, 'dropout': 0.5, 'batch_first': True}\n\"num_channels\": 1\n\"sample_rate\":  16000\n\"sincnet\":      {'stride': 10, 'sample_rate': 16000}\n```\n\n## Papers\n\n- [pyannote.audio 2.1 speaker diarization pipeline: principle, benchmark, and recipe](https://hal.science/hal-04247212/document)\n- [pyannote.audio speaker diarization pipeline at VoxSRC 2023](https://mmai.io/datasets/voxceleb/voxsrc/data_workshop_2023/reports/pyannote_report.pdf)\n\n"
  },
  {
    "path": "scripts/pyannote/segmentation/preprocess.sh",
    "content": "#!/usr/bin/env bash\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\npython3 -m onnxruntime.quantization.preprocess --input model.onnx --output tmp.preprocessed.onnx\nmv ./tmp.preprocessed.onnx ./model.onnx\n./show-onnx.py --filename ./model.onnx\n\n# The here-document below is only a note; ':' is a no-op that discards it.\n: <<'EOF'\n=========./model.onnx==========\nNodeArg(name='x', type='tensor(float)', shape=[1, 1, 'T'])\n-----\nNodeArg(name='y', type='tensor(float)', shape=[1, 'floor(floor(floor(floor(T/10 - 251/10)/3 - 2/3)/3)/3 - 8/3) + 1', 7])\n\nFor an integer x and positive integers a, b we use\n  floor(x/a) - k = floor((x - k*a)/a)  and  floor(floor(x/a)/b) = floor(x/(a*b))\n\n  floor(floor(floor(floor(T/10 - 251/10)/3 - 2/3)/3)/3 - 8/3) + 1\n= floor(floor(floor(floor((T - 251)/10)/3 - 2/3)/3)/3 - 8/3) + 1\n= floor(floor(floor((floor((T - 251)/10) - 2)/3)/3)/3 - 8/3) + 1\n= floor(floor(floor(floor((T - 271)/10)/3)/3)/3 - 8/3) + 1\n= floor(floor(floor((T - 271)/30)/3)/3 - 8/3) + 1\n= floor(floor((T - 271)/90)/3 - 8/3) + 1\n= floor((floor((T - 271)/90) - 8)/3) + 1\n= floor(floor((T - 271 - 720)/90)/3) + 1\n= floor(floor((T - 991)/90)/3) + 1\n= floor((T - 991)/270) + 1\n\nIt means:\n - The input needs at least 991 samples (the receptive field size) to produce one frame\n - One frame corresponds to 270 samples. (If we use T + 270, it outputs one more frame)\n - For T = 160000 (10 seconds at 16 kHz): floor(159009/270) + 1 = 589 frames\nEOF\n"
  },
  {
    "path": "scripts/pyannote/segmentation/show-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\nimport argparse\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--filename\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx\",\n    )\n\n    return parser.parse_args()\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    args = get_args()\n    print(f\"========={args.filename}==========\")\n    show(args.filename)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/speaker-diarization-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/speaker-diarization.yaml\nfor usages.\n\"\"\"\n\nimport argparse\nfrom datetime import timedelta\nfrom pathlib import Path\nfrom typing import List\n\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport sherpa_onnx\nimport soundfile as sf\nfrom numpy.lib.stride_tricks import as_strided\n\n\nclass Segment:\n    def __init__(\n        self,\n        start,\n        end,\n        speaker,\n    ):\n        assert start < end\n        self.start = start\n        self.end = end\n        self.speaker = speaker\n\n    def merge(self, other, gap=0.5):\n        assert self.speaker == other.speaker, (self.speaker, other.speaker)\n        if self.end < other.start and self.end + gap >= other.start:\n            return Segment(start=self.start, end=other.end, speaker=self.speaker)\n        elif other.end < self.start and other.end + gap >= self.start:\n            return Segment(start=other.start, end=self.end, speaker=self.speaker)\n        else:\n            return None\n\n    @property\n    def duration(self):\n        return self.end - self.start\n\n    def __str__(self):\n        s = f\"{timedelta(seconds=self.start)}\"[:-3]\n        s += \" --> \"\n        s += f\"{timedelta(seconds=self.end)}\"[:-3]\n        s += f\" speaker_{self.speaker:02d}\"\n        return s\n\n\ndef merge_segment_list(in_out: List[Segment], min_duration_off: float):\n    changed = True\n    while changed:\n        changed = False\n        for i in range(len(in_out)):\n            if i + 1 >= len(in_out):\n                continue\n\n            new_segment = in_out[i].merge(in_out[i + 1], gap=min_duration_off)\n            if new_segment is None:\n                continue\n            del in_out[i + 1]\n            in_out[i] = new_segment\n            changed = True\n            
break\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--seg-model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx for segmentation\",\n    )\n    parser.add_argument(\n        \"--speaker-embedding-model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx for speaker embedding extractor\",\n    )\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\nclass OnnxSegmentationModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n\n        self.window_size = int(meta[\"window_size\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.window_shift = int(0.1 * self.window_size)\n        self.receptive_field_size = int(meta[\"receptive_field_size\"])\n        self.receptive_field_shift = int(meta[\"receptive_field_shift\"])\n        self.num_speakers = int(meta[\"num_speakers\"])\n        self.powerset_max_classes = int(meta[\"powerset_max_classes\"])\n        self.num_classes = int(meta[\"num_classes\"])\n\n    def __call__(self, x):\n        \"\"\"\n        Args:\n          x: (N, num_samples)\n        Returns:\n          A tensor of shape (N, num_frames, num_classes)\n        \"\"\"\n        x = np.expand_dims(x, axis=1)\n\n        (y,) = self.model.run(\n            [self.model.get_outputs()[0].name], {self.model.get_inputs()[0].name: x}\n        )\n\n        return y\n\n\ndef load_wav(filename, expected_sample_rate) -> np.ndarray:\n    audio, 
sample_rate = sf.read(filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != expected_sample_rate:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=expected_sample_rate,\n        )\n    return audio\n\n\ndef get_powerset_mapping(num_classes, num_speakers, powerset_max_classes):\n    mapping = np.zeros((num_classes, num_speakers))\n\n    k = 1\n    for i in range(1, powerset_max_classes + 1):\n        if i == 1:\n            for j in range(0, num_speakers):\n                mapping[k, j] = 1\n                k += 1\n        elif i == 2:\n            for j in range(0, num_speakers):\n                for m in range(j + 1, num_speakers):\n                    mapping[k, j] = 1\n                    mapping[k, m] = 1\n                    k += 1\n        elif i == 3:\n            raise RuntimeError(\"Unsupported\")\n\n    return mapping\n\n\ndef to_multi_label(y, mapping):\n    \"\"\"\n    Args:\n      y: (num_chunks, num_frames, num_classes)\n    Returns:\n      A tensor of shape (num_chunks, num_frames, num_speakers)\n    \"\"\"\n    y = np.argmax(y, axis=-1)\n    labels = mapping[y.reshape(-1)].reshape(y.shape[0], y.shape[1], -1)\n    return labels\n\n\n# speaker count per frame\ndef speaker_count(labels, seg_m):\n    \"\"\"\n    Args:\n      labels: (num_chunks, num_frames, num_speakers)\n      seg_m: Segmentation model\n    Returns:\n      A integer array of shape (num_total_frames,)\n    \"\"\"\n    labels = labels.sum(axis=-1)\n    # Now labels: (num_chunks, num_frames)\n\n    num_frames = (\n        int(\n            (seg_m.window_size + (labels.shape[0] - 1) * seg_m.window_shift)\n            / seg_m.receptive_field_shift\n        )\n        + 1\n    )\n    ans = np.zeros((num_frames,))\n    count = np.zeros((num_frames,))\n\n    for i in range(labels.shape[0]):\n        this_chunk = labels[i]\n        start = int(i * 
seg_m.window_shift / seg_m.receptive_field_shift + 0.5)\n        end = start + this_chunk.shape[0]\n        ans[start:end] += this_chunk\n        count[start:end] += 1\n\n    ans /= np.maximum(count, 1e-12)\n\n    return (ans + 0.5).astype(np.int8)\n\n\ndef load_speaker_embedding_model(filename):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=filename,\n        num_threads=1,\n        debug=0,\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. {config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef get_embeddings(embedding_filename, audio, labels, seg_m, exclude_overlap):\n    \"\"\"\n    Args:\n      embedding_filename: Path to the speaker embedding extractor model\n      audio: (num_samples,)\n      labels: (num_chunks, num_frames, num_speakers)\n      seg_m: segmentation model\n      exclude_overlap: If True, ignore frames where two speakers overlap\n    Returns:\n      A tuple (chunk_speaker_pairs, embeddings): a list of\n      [chunk_idx, speaker_idx] pairs and an array of shape\n      (num_pairs, embedding_dim)\n    \"\"\"\n    if exclude_overlap:\n        labels = labels * (labels.sum(axis=-1, keepdims=True) < 2)\n\n    extractor = load_speaker_embedding_model(embedding_filename)\n    buffer = np.empty(seg_m.window_size)\n    num_chunks, num_frames, num_speakers = labels.shape\n\n    ans_chunk_speaker_pair = []\n    ans_embeddings = []\n\n    for i in range(num_chunks):\n        labels_T = labels[i].T\n        # labels_T: (num_speakers, num_frames)\n\n        sample_offset = i * seg_m.window_shift\n\n        for j in range(num_speakers):\n            frames = labels_T[j]\n            if frames.sum() < 10:\n                # skip speakers active for fewer than 10 frames, i.e., about 0.2 seconds\n                continue\n\n            start = None\n            start_samples = 0\n            idx = 0\n            for k in range(num_frames):\n                if frames[k] != 0:\n                    if start is None:\n                        start = k\n                elif start is not None:\n                    start_samples = (\n               
         int(start / num_frames * seg_m.window_size) + sample_offset\n                    )\n                    end_samples = (\n                        int(k / num_frames * seg_m.window_size) + sample_offset\n                    )\n                    num_samples = end_samples - start_samples\n                    buffer[idx : idx + num_samples] = audio[start_samples:end_samples]\n                    idx += num_samples\n\n                    start = None\n            if start is not None:\n                start_samples = (\n                    int(start / num_frames * seg_m.window_size) + sample_offset\n                )\n                end_samples = int(k / num_frames * seg_m.window_size) + sample_offset\n                num_samples = end_samples - start_samples\n                buffer[idx : idx + num_samples] = audio[start_samples:end_samples]\n                idx += num_samples\n\n            stream = extractor.create_stream()\n            stream.accept_waveform(sample_rate=seg_m.sample_rate, waveform=buffer[:idx])\n            stream.input_finished()\n\n            assert extractor.is_ready(stream)\n            embedding = extractor.compute(stream)\n            embedding = np.array(embedding)\n\n            ans_chunk_speaker_pair.append([i, j])\n            ans_embeddings.append(embedding)\n\n    assert len(ans_chunk_speaker_pair) == len(ans_embeddings), (\n        len(ans_chunk_speaker_pair),\n        len(ans_embeddings),\n    )\n    return ans_chunk_speaker_pair, np.array(ans_embeddings)\n\n\ndef main():\n    args = get_args()\n    assert Path(args.seg_model).is_file(), args.seg_model\n    assert Path(args.wav).is_file(), args.wav\n\n    seg_m = OnnxSegmentationModel(args.seg_model)\n    audio = load_wav(args.wav, seg_m.sample_rate)\n    # audio: (num_samples,)\n\n    num = (audio.shape[0] - seg_m.window_size) // seg_m.window_shift + 1\n\n    samples = as_strided(\n        audio,\n        shape=(num, seg_m.window_size),\n        strides=(seg_m.window_shift 
* audio.strides[0], audio.strides[0]),\n    )\n\n    # or use torch.Tensor.unfold\n    #  samples = torch.from_numpy(audio).unfold(0, seg_m.window_size, seg_m.window_shift).numpy()\n\n    if (\n        audio.shape[0] < seg_m.window_size\n        or (audio.shape[0] - seg_m.window_size) % seg_m.window_shift > 0\n    ):\n        has_last_chunk = True\n    else:\n        has_last_chunk = False\n\n    num_chunks = samples.shape[0]\n    batch_size = 32\n    output = []\n    for i in range(0, num_chunks, batch_size):\n        start = i\n        end = i + batch_size\n        # it's perfectly ok to use end > num_chunks\n        y = seg_m(samples[start:end])\n        output.append(y)\n\n    if has_last_chunk:\n        last_chunk = audio[num_chunks * seg_m.window_shift :]  # noqa\n        pad_size = seg_m.window_size - last_chunk.shape[0]\n        last_chunk = np.pad(last_chunk, (0, pad_size))\n        last_chunk = np.expand_dims(last_chunk, axis=0)\n        y = seg_m(last_chunk)\n        output.append(y)\n\n    y = np.vstack(output)\n    # y: (num_chunks, num_frames, num_classes)\n\n    mapping = get_powerset_mapping(\n        num_classes=seg_m.num_classes,\n        num_speakers=seg_m.num_speakers,\n        powerset_max_classes=seg_m.powerset_max_classes,\n    )\n    labels = to_multi_label(y, mapping=mapping)\n    # labels: (num_chunks, num_frames, num_speakers)\n\n    inactive = (labels.sum(axis=1) == 0).astype(np.int8)\n    # inactive: (num_chunks, num_speakers)\n\n    speakers_per_frame = speaker_count(labels=labels, seg_m=seg_m)\n    # speakers_per_frame: (num_frames, speakers_per_frame)\n\n    if speakers_per_frame.max() == 0:\n        print(\"No speakers found in the audio file!\")\n        return\n\n    # if users specify only 1 speaker for clustering, then return the\n    # result directly\n\n    # Now, get embeddings\n    chunk_speaker_pair, embeddings = get_embeddings(\n        args.speaker_embedding_model,\n        audio=audio,\n        labels=labels,\n        
seg_m=seg_m,\n        #  exclude_overlap=True,\n        exclude_overlap=False,\n    )\n    # chunk_speaker_pair: a list of (chunk_idx, speaker_idx)\n    # embeddings: (batch_size, embedding_dim)\n\n    # Please change num_clusters or threshold by yourself.\n    clustering_config = sherpa_onnx.FastClusteringConfig(num_clusters=2)\n    #  clustering_config = sherpa_onnx.FastClusteringConfig(threshold=0.8)\n    clustering = sherpa_onnx.FastClustering(clustering_config)\n    cluster_labels = clustering(embeddings)\n\n    chunk_speaker_to_cluster = dict()\n    for (chunk_idx, speaker_idx), cluster_idx in zip(\n        chunk_speaker_pair, cluster_labels\n    ):\n        if inactive[chunk_idx, speaker_idx] == 1:\n            print(\"skip \", chunk_idx, speaker_idx)\n            continue\n        chunk_speaker_to_cluster[(chunk_idx, speaker_idx)] = cluster_idx\n\n    num_speakers = max(cluster_labels) + 1\n    relabels = np.zeros((labels.shape[0], labels.shape[1], num_speakers))\n    for i in range(labels.shape[0]):\n        for j in range(labels.shape[1]):\n            for k in range(labels.shape[2]):\n                if (i, k) not in chunk_speaker_to_cluster:\n                    continue\n                t = chunk_speaker_to_cluster[(i, k)]\n\n                if labels[i, j, k] == 1:\n                    relabels[i, j, t] = 1\n\n    num_frames = (\n        int(\n            (seg_m.window_size + (relabels.shape[0] - 1) * seg_m.window_shift)\n            / seg_m.receptive_field_shift\n        )\n        + 1\n    )\n\n    count = np.zeros((num_frames, relabels.shape[-1]))\n    for i in range(relabels.shape[0]):\n        this_chunk = relabels[i]\n        start = int(i * seg_m.window_shift / seg_m.receptive_field_shift + 0.5)\n        end = start + this_chunk.shape[0]\n        count[start:end] += this_chunk\n\n    if has_last_chunk:\n        stop_frame = int(audio.shape[0] / seg_m.receptive_field_shift)\n        count = count[:stop_frame]\n\n    sorted_count = 
np.argsort(-count, axis=-1)\n    final = np.zeros((count.shape[0], count.shape[1]))\n\n    for i, (c, sc) in enumerate(zip(speakers_per_frame, sorted_count)):\n        for k in range(c):\n            final[i, sc[k]] = 1\n\n    min_duration_off = 0.5\n    min_duration_on = 0.3\n    onset = 0.5\n    offset = 0.5\n    # final: (num_frames, num_speakers)\n\n    final = final.T\n    for kk in range(final.shape[0]):\n        segment_list = []\n        frames = final[kk]\n\n        is_active = frames[0] > onset\n\n        start = None\n        if is_active:\n            start = 0\n        scale = seg_m.receptive_field_shift / seg_m.sample_rate\n        scale_offset = seg_m.receptive_field_size / seg_m.sample_rate * 0.5\n        for i in range(1, len(frames)):\n            if is_active:\n                if frames[i] < offset:\n                    segment = Segment(\n                        start=start * scale + scale_offset,\n                        end=i * scale + scale_offset,\n                        speaker=kk,\n                    )\n                    segment_list.append(segment)\n                    is_active = False\n            else:\n                if frames[i] > onset:\n                    start = i\n                    is_active = True\n\n        if is_active:\n            segment = Segment(\n                start=start * scale + scale_offset,\n                end=(len(frames) - 1) * scale + scale_offset,\n                speaker=kk,\n            )\n            segment_list.append(segment)\n\n        # Merging needs at least two segments, but segments are always printed,\n        # so a speaker with exactly one segment is still reported.\n        if len(segment_list) > 1:\n            merge_segment_list(segment_list, min_duration_off=min_duration_off)\n        for s in segment_list:\n            if s.duration < min_duration_on:\n                continue\n            print(s)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/speaker-diarization-torch.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/speaker-diarization.yaml\nfor usages.\n\"\"\"\n\n\"\"\"\n1. Go to https://huggingface.co/hbredin/wespeaker-voxceleb-resnet34-LM/tree/main\nwget https://huggingface.co/hbredin/wespeaker-voxceleb-resnet34-LM/resolve/main/speaker-embedding.onnx\n\n2. Change line 166 of pyannote/audio/pipelines/speaker_diarization.py\n\n```\n            #  self._embedding = PretrainedSpeakerEmbedding(\n            #      self.embedding, use_auth_token=use_auth_token\n            #  )\n            self._embedding = embedding\n```\n\"\"\"\n\nimport argparse\nfrom pathlib import Path\n\nimport torch\nfrom pyannote.audio import Model\nfrom pyannote.audio.pipelines import SpeakerDiarization as SpeakerDiarizationPipeline\nfrom pyannote.audio.pipelines.speaker_verification import (\n    ONNXWeSpeakerPretrainedSpeakerEmbedding,\n)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\ndef build_pipeline():\n    embedding_filename = \"./speaker-embedding.onnx\"\n    if Path(embedding_filename).is_file():\n        # You need to modify line 166\n        # of pyannote/audio/pipelines/speaker_diarization.py\n        # Please see the comments at the start of this script for details\n        embedding = ONNXWeSpeakerPretrainedSpeakerEmbedding(embedding_filename)\n    else:\n        embedding = \"hbredin/wespeaker-voxceleb-resnet34-LM\"\n\n    pt_filename = \"./pytorch_model.bin\"\n    segmentation = Model.from_pretrained(pt_filename)\n    segmentation.eval()\n\n    pipeline = SpeakerDiarizationPipeline(\n        segmentation=segmentation,\n        embedding=embedding,\n        embedding_exclude_overlap=True,\n    )\n\n    params = {\n        \"clustering\": {\n          
  \"method\": \"centroid\",\n            \"min_cluster_size\": 12,\n            \"threshold\": 0.7045654963945799,\n        },\n        \"segmentation\": {\"min_duration_off\": 0.5},\n    }\n\n    pipeline.instantiate(params)\n    return pipeline\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    assert Path(args.wav).is_file(), args.wav\n    pipeline = build_pipeline()\n    print(pipeline)\n    t = pipeline(args.wav)\n    print(type(t))\n    print(t)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/vad-onnx.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\n./export-onnx.py\n./preprocess.sh\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n./vad-onnx.py --model ./model.onnx --wav ./lei-jun-test.wav\n\"\"\"\n\nimport argparse\nfrom pathlib import Path\n\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nfrom numpy.lib.stride_tricks import as_strided\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--model\", type=str, required=True, help=\"Path to model.onnx\")\n    parser.add_argument(\"--wav\", type=str, required=True, help=\"Path to test.wav\")\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        print(meta)\n\n        self.window_size = int(meta[\"window_size\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.window_shift = int(0.1 * self.window_size)\n        self.receptive_field_size = int(meta[\"receptive_field_size\"])\n        self.receptive_field_shift = int(meta[\"receptive_field_shift\"])\n        self.num_speakers = int(meta[\"num_speakers\"])\n        self.powerset_max_classes = int(meta[\"powerset_max_classes\"])\n        self.num_classes = int(meta[\"num_classes\"])\n\n    def __call__(self, x):\n        \"\"\"\n        Args:\n          x: (N, num_samples)\n        Returns:\n          A tensor of shape (N, num_frames, num_classes)\n        \"\"\"\n        x = np.expand_dims(x, axis=1)\n\n        (y,) = self.model.run(\n            
[self.model.get_outputs()[0].name], {self.model.get_inputs()[0].name: x}\n        )\n\n        return y\n\n\ndef load_wav(filename, expected_sample_rate) -> np.ndarray:\n    audio, sample_rate = sf.read(filename, dtype=\"float32\", always_2d=True)\n    audio = audio[:, 0]  # only use the first channel\n    if sample_rate != expected_sample_rate:\n        audio = librosa.resample(\n            audio,\n            orig_sr=sample_rate,\n            target_sr=expected_sample_rate,\n        )\n    return audio\n\n\ndef get_powerset_mapping(num_classes, num_speakers, powerset_max_classes):\n    mapping = np.zeros((num_classes, num_speakers))\n\n    k = 1\n    for i in range(1, powerset_max_classes + 1):\n        if i == 1:\n            for j in range(0, num_speakers):\n                mapping[k, j] = 1\n                k += 1\n        elif i == 2:\n            for j in range(0, num_speakers):\n                for m in range(j + 1, num_speakers):\n                    mapping[k, j] = 1\n                    mapping[k, m] = 1\n                    k += 1\n        elif i == 3:\n            raise RuntimeError(\"Unsupported\")\n\n    return mapping\n\n\ndef to_multi_label(y, mapping):\n    \"\"\"\n    Args:\n      y: (num_chunks, num_frames, num_classes)\n    Returns:\n      A tensor of shape (num_chunks, num_frames, num_speakers)\n    \"\"\"\n    y = np.argmax(y, axis=-1)\n    labels = mapping[y.reshape(-1)].reshape(y.shape[0], y.shape[1], -1)\n    return labels\n\n\ndef main():\n    args = get_args()\n    assert Path(args.model).is_file(), args.model\n    assert Path(args.wav).is_file(), args.wav\n\n    m = OnnxModel(args.model)\n    audio = load_wav(args.wav, m.sample_rate)\n    # audio: (num_samples,)\n    print(\"audio\", audio.shape, audio.min(), audio.max(), audio.sum())\n\n    num = (audio.shape[0] - m.window_size) // m.window_shift + 1\n\n    samples = as_strided(\n        audio,\n        shape=(num, m.window_size),\n        strides=(m.window_shift * audio.strides[0], 
audio.strides[0]),\n    )\n\n    # or use torch.Tensor.unfold\n    #  samples = torch.from_numpy(audio).unfold(0, m.window_size, m.window_shift).numpy()\n\n    print(\n        \"samples\",\n        samples.shape,\n        samples.mean(),\n        samples.sum(),\n        samples[:3, :3].sum(axis=-1),\n    )\n\n    if (\n        audio.shape[0] < m.window_size\n        or (audio.shape[0] - m.window_size) % m.window_shift > 0\n    ):\n        has_last_chunk = True\n    else:\n        has_last_chunk = False\n\n    num_chunks = samples.shape[0]\n    batch_size = 32\n    output = []\n    for i in range(0, num_chunks, batch_size):\n        start = i\n        end = i + batch_size\n        # it's perfectly ok to use end > num_chunks\n        y = m(samples[start:end])\n        output.append(y)\n\n    if has_last_chunk:\n        last_chunk = audio[num_chunks * m.window_shift :]  # noqa\n        pad_size = m.window_size - last_chunk.shape[0]\n        last_chunk = np.pad(last_chunk, (0, pad_size))\n        last_chunk = np.expand_dims(last_chunk, axis=0)\n        y = m(last_chunk)\n        output.append(y)\n\n    y = np.vstack(output)\n    # y: (num_chunks, num_frames, num_classes)\n\n    mapping = get_powerset_mapping(\n        num_classes=m.num_classes,\n        num_speakers=m.num_speakers,\n        powerset_max_classes=m.powerset_max_classes,\n    )\n    labels = to_multi_label(y, mapping=mapping)\n    # labels: (num_chunks, num_frames, num_speakers)\n\n    # binary classification\n    labels = np.max(labels, axis=-1)\n    # labels: (num_chunk, num_frames)\n\n    num_frames = (\n        int(\n            (m.window_size + (labels.shape[0] - 1) * m.window_shift)\n            / m.receptive_field_shift\n        )\n        + 1\n    )\n\n    count = np.zeros((num_frames,))\n    classification = np.zeros((num_frames,))\n    weight = np.hamming(labels.shape[1])\n\n    for i in range(labels.shape[0]):\n        this_chunk = labels[i]\n        start = int(i * m.window_shift / 
m.receptive_field_shift + 0.5)\n        end = start + this_chunk.shape[0]\n\n        classification[start:end] += this_chunk * weight\n        count[start:end] += weight\n\n    classification /= np.maximum(count, 1e-12)\n\n    if has_last_chunk:\n        stop_frame = int(audio.shape[0] / m.receptive_field_shift)\n        classification = classification[:stop_frame]\n\n    classification = classification.tolist()\n\n    onset = 0.5\n    offset = 0.5\n\n    is_active = classification[0] > onset\n    start = None\n    if is_active:\n        start = 0\n\n    scale = m.receptive_field_shift / m.sample_rate\n    scale_offset = m.receptive_field_size / m.sample_rate * 0.5\n\n    for i in range(len(classification)):\n        if is_active:\n            if classification[i] < offset:\n                print(\n                    f\"{start*scale + scale_offset:.3f} -- {i*scale + scale_offset:.3f}\"\n                )\n                is_active = False\n        else:\n            if classification[i] > onset:\n                start = i\n                is_active = True\n\n    if is_active:\n        print(\n            f\"{start*scale + scale_offset:.3f} -- {(len(classification)-1)*scale + scale_offset:.3f}\"\n        )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/pyannote/segmentation/vad-torch.py",
    "content": "#!/usr/bin/env python3\n\nimport torch\nfrom pyannote.audio import Model\nfrom pyannote.audio.pipelines import (\n    VoiceActivityDetection as VoiceActivityDetectionPipeline,\n)\n\n\n@torch.no_grad()\ndef main():\n    # Please download it from\n    # https://huggingface.co/csukuangfj/pyannote-models/tree/main/segmentation-3.0\n    pt_filename = \"./pytorch_model.bin\"\n    model = Model.from_pretrained(pt_filename)\n    model.eval()\n\n    pipeline = VoiceActivityDetectionPipeline(segmentation=model)\n\n    # https://huggingface.co/pyannote/voice-activity-detection/blob/main/config.yaml\n    # https://github.com/pyannote/pyannote-audio/issues/1215\n    initial_params = {\n        \"min_duration_on\": 0.0,\n        \"min_duration_off\": 0.0,\n    }\n    pipeline.onset = 0.5\n    pipeline.offset = 0.5\n\n    pipeline.instantiate(initial_params)\n\n    # wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n    t = pipeline(\"./lei-jun-test.wav\")\n    print(type(t))\n    print(t)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/qnn/__init__.py",
    "content": ""
  },
  {
    "path": "scripts/qnn/device_info.py",
    "content": "#!/usr/bin/env python3\nfrom dataclasses import dataclass\nfrom enum import IntEnum, unique\n\n\"\"\"\nSee also\nhttps://docs.qualcomm.com/doc/80-63442-10/topic/QNN_general_overview.html#supported-snapdragon-devices\n\nSA8255 soc_id    52 dsp_arch     v73 vtcm_size (MB)      8\nSA8295 soc_id    39 dsp_arch     v68 vtcm_size (MB)      8\nSM8350 soc_id    35 dsp_arch     v68 vtcm_size (MB)      4\nSM8450 soc_id    36 dsp_arch     v69 vtcm_size (MB)      8\nSM8475 soc_id    42 dsp_arch     v69 vtcm_size (MB)      8\nSM8550 soc_id    43 dsp_arch     v73 vtcm_size (MB)      8\nSM8650 soc_id    57 dsp_arch     v75 vtcm_size (MB)      8\nSM8750 soc_id    69 dsp_arch     v79 vtcm_size (MB)      8\nSM8850 soc_id    87 dsp_arch     v81 vtcm_size (MB)      8\nSSG2115P soc_id  46 dsp_arch     v73 vtcm_size (MB)      2\nSSG2125P soc_id  58 dsp_arch     v73 vtcm_size (MB)      2\nSXR1230P soc_id  45 dsp_arch     v73 vtcm_size (MB)      2\nSXR2230P soc_id  53 dsp_arch     v69 vtcm_size (MB)      8\nSXR2330P soc_id  75 dsp_arch     v79 vtcm_size (MB)      8\nQCS9100 soc_id   77 dsp_arch     v73 vtcm_size (MB)      8\nSAR2230P soc_id  95 dsp_arch     v81 vtcm_size (MB)      4\nSW6100 soc_id    96 dsp_arch     v81 vtcm_size (MB)      4\n\"\"\"\n\n\n@unique\nclass Chipset(IntEnum):\n    # see https://github.com/pytorch/executorch/blob/main/backends/qualcomm/serialization/qc_schema.py#L41\n    # SA8255, soc_id 52,  dsp_arch v73\n    SA8255 = 52  # v73\n    SA8295 = 39  # v68\n    SM8350 = 35  # v68\n    SM8450 = 36  # v69\n    SM8475 = 42  # v69\n    SM8550 = 43  # v73\n    SM8650 = 57  # v75\n    SM8750 = 69  # v79\n    SM8850 = 87  # v81\n    #  SSG2115P = 46  # v73\n    #  SSG2125P = 58  # v73\n    #  SXR1230P = 45  # v73\n    #  SXR2230P = 53  # v69\n    #  SXR2330P = 75  # v79\n    QCS9100 = 77  # v73\n    #  SAR2230P = 95  # v81\n    #  SW6100 = 96  # v81\n\n\n@unique\nclass HtpArch(IntEnum):\n    v68 = 68\n    v69 = 69\n    v73 = 73\n    v75 = 75\n    v79 = 79\n 
   v81 = 81\n    v87 = 87\n\n\n@dataclass\nclass HtpInfo:\n    arch: HtpArch\n    vtcm_size_in_mb: int\n\n\n@dataclass\nclass SocInfo:\n    model: Chipset\n    info: HtpInfo\n\n\nsoc_info_list = [\n    SocInfo(Chipset.SA8255, HtpInfo(HtpArch.v73, 8)),\n    SocInfo(Chipset.SA8295, HtpInfo(HtpArch.v68, 8)),\n    SocInfo(Chipset.SM8350, HtpInfo(HtpArch.v68, 4)),\n    SocInfo(Chipset.SM8450, HtpInfo(HtpArch.v69, 8)),\n    SocInfo(Chipset.SM8475, HtpInfo(HtpArch.v69, 8)),\n    SocInfo(Chipset.SM8550, HtpInfo(HtpArch.v73, 8)),\n    SocInfo(Chipset.SM8650, HtpInfo(HtpArch.v75, 8)),\n    SocInfo(Chipset.SM8750, HtpInfo(HtpArch.v79, 8)),\n    SocInfo(Chipset.SM8850, HtpInfo(HtpArch.v81, 8)),\n    #  SocInfo(Chipset.SSG2115P, HtpInfo(HtpArch.v73, 2)),\n    #  SocInfo(Chipset.SSG2125P, HtpInfo(HtpArch.v73, 2)),\n    #  SocInfo(Chipset.SXR1230P, HtpInfo(HtpArch.v73, 2)),\n    #  SocInfo(Chipset.SXR2230P, HtpInfo(HtpArch.v69, 8)),\n    #  SocInfo(Chipset.SXR2330P, HtpInfo(HtpArch.v79, 8)),\n    SocInfo(Chipset.QCS9100, HtpInfo(HtpArch.v73, 8)),\n    #  SocInfo(Chipset.SAR2230P, HtpInfo(HtpArch.v81, 4)),\n    #  SocInfo(Chipset.SW6100, HtpInfo(HtpArch.v81, 4)),\n]\n\nsoc_info_dict = {soc.model.name: soc for soc in soc_info_list}\n\n\ndef _test():\n    for soc in soc_info_list:\n        print(\n            soc.model.name,\n            \"soc_id\\t\",\n            soc.model.value,\n            \"dsp_arch\\t\",\n            soc.info.arch.name,\n            \"vtcm_size (MB)\\t\",\n            soc.info.vtcm_size_in_mb,\n        )\n\n\nif __name__ == \"__main__\":\n    _test()\n"
  },
  {
    "path": "scripts/qnn/generate_config.py",
    "content": "#!/usr/bin/env python3\n\n# see\n# https://github.com/MollySophia/rwkv-qualcomm/blob/2a82c641c90ee130cbd7038ca7449b2fa818de71/utils/htp_devices_config.py\n# https://docs.qualcomm.com/bundle/publicresource/topics/80-64748-1/model_prep_linux.html#QNN-HTP-context-binary\n\nimport argparse\nimport json\nfrom pathlib import Path\n\nfrom device_info import soc_info_dict\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--soc\",\n        type=str,\n        required=True,\n        help=\"SM8850, SA8295, etc\",\n    )\n\n    parser.add_argument(\n        \"--graph-name\",\n        type=str,\n        required=True,\n        help=\"Graph name\",\n    )\n\n    parser.add_argument(\n        \"--output-dir\",\n        type=str,\n        required=True,\n        help=\"Output directory to save the generated json files\",\n    )\n\n    parser.add_argument(\n        \"--qnn-sdk-root\",\n        type=str,\n        required=True,\n        help=\"Path to qnn sdk\",\n    )\n\n    return parser.parse_args()\n\n\ndef generate_config(\n    soc_name: str,\n    graph_name: str,\n    output_dir: str,\n    qnn_sdk_root: str,\n):\n    if soc_name not in soc_info_dict:\n        raise ValueError(\n            f\"Unsupported SOC {soc_name}. 
Supported: - {sorted(list(soc_info_dict.keys()))}\"\n        )\n    soc = soc_info_dict[soc_name]\n\n    output_dir = Path(output_dir).absolute()\n    output_dir.mkdir(parents=True, exist_ok=True)\n\n    htp_backend_extensions_data = {\n        \"backend_extensions\": {\n            \"shared_library_path\": f\"{qnn_sdk_root}/lib/x86_64-linux-clang/libQnnHtpNetRunExtensions.so\",\n            \"config_file_path\": f\"{output_dir}/htp_config.json\",\n        }\n    }\n\n    htp_backend_config_data = {\n        \"graphs\": [\n            {\n                \"vtcm_mb\": soc.info.vtcm_size_in_mb,\n                \"O\": 3,\n                \"graph_names\": [graph_name],\n            }\n        ],\n        \"devices\": [\n            {\n                \"device_id\": 0,\n                \"soc_id\": soc.model.value,\n                \"dsp_arch\": soc.info.arch.name,\n                \"cores\": [\n                    {\n                        \"core_id\": 0,\n                        \"perf_profile\": \"burst\",\n                        \"rpc_control_latency\": 200,\n                    }\n                ],\n            }\n        ],\n    }\n\n    with open(str(output_dir / \"htp_backend_extensions.json\"), \"w\") as f:\n        json.dump(htp_backend_extensions_data, f, indent=4)\n\n    with open(str(output_dir / \"htp_config.json\"), \"w\") as f:\n        json.dump(htp_backend_config_data, f, indent=4)\n\n\ndef _test():\n    qnn_sdk_root = \"/home/fangjun/open-source/qairt/2.40.0.251030\"\n    generate_config(\n        soc_name=\"SM8850\",\n        graph_name=\"model_10_seconds_quantized\",\n        output_dir=\"./tmp\",\n        qnn_sdk_root=qnn_sdk_root,\n    )\n\n\nif __name__ == \"__main__\":\n    #  _test()\n\n    args = get_args()\n    print(vars(args))\n    generate_config(\n        soc_name=args.soc,\n        graph_name=args.graph_name,\n        output_dir=args.output_dir,\n        qnn_sdk_root=args.qnn_sdk_root,\n    )\n\n# ./generate_config.py  --soc SM8850 
--graph-name abc --output-dir ./tmp2 --qnn-sdk-root $QNN_SDK_ROOT\n"
  },
  {
    "path": "scripts/sense-voice/README-nano.md",
    "content": "# Introduction\n\nThis directory contains models converted from\nhttps://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512\n\n## Core Features\n\n> From  https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512\n\n    - Far-field High-noise Recognition: Deeply optimized for far-distance sound pickup and high-noise scenarios (such as conference rooms, in-vehicle environments, industrial sites, etc.), improving recognition accuracy to 93%.\n\n    - Chinese Dialects and Regional Accents:\n\n        - Supports 7 major dialects: Wu, Cantonese, Min, Hakka, Gan, Xiang, Jin\n        - Covers 26 regional accents: including Henan, Shaanxi, Hubei, Sichuan, Chongqing, Yunnan, Guizhou, Guangdong, Guangxi and more than 20 other regions\n\n    - Multi-language Free Speech: Supports recognition of 31 languages, with focused optimization on East and Southeast Asian languages, supporting free language switching and mixed recognition.\n    - Music Background Lyric Recognition: Enhanced speech recognition performance under music background interference, supporting accurate recognition of lyric content in songs.\n\n\n\n## 核心特性\n\n> From https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512/blob/main/README_zh.md\n\n    - 远场高噪声识别： 针对远距离拾音及高噪声场景（如会议室、车载环境、工业现场等）进行深度优化，识别准确率提升至 **93%**。\n    - 中文方言与地方口音：\n\n        - 支持 7 大方言：吴语、粤语、闽语、客家话、赣语、湘语、晋语\n        - 覆盖 26 个地区口音：包括河南、陕西、湖北、四川、重庆、云南、贵州、广东、广西等 20 多个地区\n\n    - 多语言自由说： 支持 31 种语言识别，重点优化东亚与东南亚语种，支持语种自由切换和混合识别。\n    - 音乐背景歌词识别： 强化在音乐背景干扰下的语音识别性能，支持对歌曲中歌词内容的精准识别。\n"
  },
  {
    "path": "scripts/sense-voice/README.md",
    "content": "# Introduction\n\nThis directory contains models converted from\nhttps://github.com/FunAudioLLM/SenseVoice\n"
  },
  {
    "path": "scripts/sense-voice/ascend-npu/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import List, Tuple\n\nimport sentencepiece as spm\nimport torch\n\nfrom torch_model import SenseVoiceSmall\n\n\ndef load_cmvn(filename) -> Tuple[List[float], List[float]]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = list(map(lambda x: float(x), t))\n            else:\n                inv_stddev = list(map(lambda x: float(x), t))\n\n    return neg_mean, inv_stddev\n\n\ndef generate_tokens(sp):\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n            f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n    print(\"saved to tokens.txt\")\n\n\nclass ModelWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x, prompt):\n        logits = self.m(x[None], prompt)[0]\n        part1 = logits[:4]\n        part2 = logits[4:]\n        part1 = part1.reshape(4, 25055)\n        part2 = part2.reshape(x.size(0), 25055)\n        return part1, part2\n\n\n@torch.no_grad()\ndef main():\n    sp = spm.SentencePieceProcessor()\n    sp.load(\"./chn_jpn_yue_eng_ko_spectok.bpe.model\")\n    generate_tokens(sp)\n\n    print(\"loading model\")\n\n    state_dict = torch.load(\"./model.pt\", map_location=\"cpu\")\n    if \"state_dict\" in state_dict:\n        state_dict = state_dict[\"state_dict\"]\n\n    neg_mean, inv_stddev = load_cmvn(\"./am.mvn\")\n\n    neg_mean = torch.tensor(neg_mean, dtype=torch.float32)\n    inv_stddev = torch.tensor(inv_stddev, dtype=torch.float32)\n\n    model = SenseVoiceSmall(neg_mean=neg_mean, inv_stddev=inv_stddev)\n    model.load_state_dict(state_dict)\n    model.eval()\n    del 
state_dict\n\n    model = ModelWrapper(model)\n    model.eval()\n\n    x = torch.randn(1, 93, 560, dtype=torch.float32)\n\n    language = 3\n    text_norm = 15\n    prompt = torch.tensor([language, 1, 2, text_norm], dtype=torch.int32)\n\n    opset_version = 14\n    filename = \"model.onnx\"\n    torch.onnx.export(\n        model.m,\n        (x, prompt),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"prompt\"],\n        output_names=[\"logits\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"logits\": {0: \"N\", 1: \"T_4\"},\n        },\n    )\n    print(f\"saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251018)\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/ascend-npu/export_onnx_static_shape.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import List, Tuple\n\nimport sentencepiece as spm\nimport torch\n\nfrom torch_model import SenseVoiceSmall\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--input-len-in-seconds\",\n        type=int,\n        required=True,\n        help=\"\"\"Some Ascend NPU does not support dynamic shape, so we need to hard-code\n        how long the model can process.\n        \"\"\",\n    )\n    return parser.parse_args()\n\n\ndef load_cmvn(filename) -> Tuple[List[float], List[float]]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = list(map(lambda x: float(x), t))\n            else:\n                inv_stddev = list(map(lambda x: float(x), t))\n\n    return neg_mean, inv_stddev\n\n\ndef generate_tokens(sp):\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n            f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n    print(\"saved to tokens.txt\")\n\n\nclass ModelWrapper(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def forward(self, x, prompt):\n        logits = self.m(x[None], prompt)[0]\n        part1 = logits[:4]\n        part2 = logits[4:]\n        part1 = part1.reshape(4, 25055)\n        part2 = part2.reshape(x.size(0), 25055)\n        return part1, part2\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    sp = spm.SentencePieceProcessor()\n    sp.load(\"./chn_jpn_yue_eng_ko_spectok.bpe.model\")\n    generate_tokens(sp)\n\n    
print(\"loading model\")\n\n    state_dict = torch.load(\"./model.pt\", map_location=\"cpu\")\n    if \"state_dict\" in state_dict:\n        state_dict = state_dict[\"state_dict\"]\n\n    neg_mean, inv_stddev = load_cmvn(\"./am.mvn\")\n\n    neg_mean = torch.tensor(neg_mean, dtype=torch.float32)\n    inv_stddev = torch.tensor(inv_stddev, dtype=torch.float32)\n\n    model = SenseVoiceSmall(neg_mean=neg_mean, inv_stddev=inv_stddev)\n    model.load_state_dict(state_dict)\n    model.eval()\n    del state_dict\n\n    model = ModelWrapper(model)\n    model.eval()\n\n    lfr_window_size = 7\n    lfr_window_shift = 6\n\n    # frame shift is 10ms, 1 second has about 100 feature frames\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_frames = input_len_in_seconds * 100\n    print(\"num_frames\", num_frames)\n\n    # num_input_frames is an approximate number\n    num_input_frames = int(num_frames / lfr_window_shift + 0.5)\n    print(\"num_input_frames\", num_input_frames)\n\n    x = torch.randn(1, num_input_frames, 560, dtype=torch.float32)\n    print(\"x.shape\", x.shape)\n\n    language = 3\n    text_norm = 15\n    prompt = torch.tensor([language, 1, 2, text_norm], dtype=torch.int32)\n\n    opset_version = 14\n    filename = \"model.onnx\"\n    torch.onnx.export(\n        model.m,\n        (x, prompt),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"prompt\"],\n        output_names=[\"logits\"],\n        dynamic_axes={},\n    )\n    print(f\"saved to {filename}\")\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251018)\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/ascend-npu/test_om.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport soundfile as sf\nfrom ais_bench.infer.interface import InferSession\n\n\nclass OmModel:\n    def __init__(self):\n        self.model = InferSession(device_id=0, model_path=\"./model.om\", debug=False)\n\n        print(\"---model---\")\n        for i in self.model.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n    def __call__(self, x, prompt=None, language=None, text_norm=None):\n        return self.model.infer([x, prompt], mode=\"dymshape\", custom_sizes=10000000)[0][\n            0\n        ]\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    window_size: int = 7,  # lfr_m\n    window_shift: int = 6,  # lfr_n\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"hamming\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    
assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    # Stack window_size consecutive frames every window_shift frames (LFR).\n    # The strides are in bytes; 4 is the size of one float32 value.\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n\n    return np.copy(features)\n\n\ndef main():\n    samples, sample_rate = load_audio(\"./test_wavs/zh.wav\")\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OmModel()\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n    )\n    print(\"features.shape\", features.shape)\n\n    language_auto = 0\n    language_zh = 3\n    language_en = 4\n    language_yue = 7\n    language_ja = 11\n    language_ko = 12\n    language_nospeech = 13\n\n    language = language_auto\n\n    with_itn = 14\n    without_itn = 15\n\n    text_norm = with_itn\n\n    prompt = np.array([language, 1, 2, text_norm], dtype=np.int32)\n\n    print(\"prompt\", prompt.shape)\n\n    logits = model(\n        x=features[None],\n        prompt=prompt,\n    )\n    print(\"logits.shape\", logits.shape, type(logits))\n\n    # Greedy CTC decoding: collapse repeated ids, then drop id 0\n    # (treated here as the blank token).\n    idx = logits.argmax(axis=-1)\n    print(idx)\n    print(len(idx))\n    prev = -1\n    ids = []\n    for i in idx:\n        if i != prev:\n            ids.append(i)\n        prev = i\n    ids = [i for i in ids if i != 0]\n    print(ids)\n\n    tokens = load_tokens(\"./tokens.txt\")\n    text = \"\".join([tokens[i] for i in ids])\n\n    text = text.replace(\"▁\", \" \")\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/ascend-npu/test_om_static.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport soundfile as sf\nimport torch\nfrom ais_bench.infer.interface import InferSession\n\n\nclass OmModel:\n    def __init__(self):\n        self.model = InferSession(device_id=0, model_path=\"./model.om\", debug=False)\n\n        print(\"---model---\")\n        for i in self.model.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n        self.num_frames = self.model.get_inputs()[0].shape[1]\n\n    def __call__(self, x, prompt=None, language=None, text_norm=None):\n        return self.model.infer([x, prompt], mode=\"static\", custom_sizes=10000000)[0][0]\n        return logits\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    window_size: int = 7,  # lfr_m\n    window_shift: int = 6,  # lfr_n\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"hamming\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n   
     [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n\n    return np.copy(features)\n\n\ndef main():\n    samples, sample_rate = load_audio(\"./test_wavs/zh.wav\")\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OmModel()\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n    )\n    print(\"features.shape\", features.shape)\n    if model.num_frames > 0:\n        if features.shape[0] < model.num_frames:\n            features = np.pad(\n                features,\n                ((0, model.num_frames - features.shape[0]), (0, 0)),\n                mode=\"constant\",\n                constant_values=0,\n            )\n        elif features.shape[0] > model.num_frames:\n            features = features[: model.num_frames]\n\n        print(\"features.shape (new)\", features.shape)\n\n    language_auto = 0\n    language_zh = 3\n    language_en = 4\n    language_yue = 7\n    language_ya = 11\n    language_ko = 12\n    language_nospeech = 13\n\n    language = language_auto\n\n    with_itn = 14\n    without_itn = 15\n\n    text_norm = with_itn\n\n    prompt = np.array([language, 1, 2, text_norm], dtype=np.int32)\n    # language = np.array([language], dtype=np.int32)\n    # text_norm = np.array([text_norm], dtype=np.int32)\n\n    print(\"prompt\", prompt.shape)\n\n    logits = model(\n        x=features[None],\n        prompt=prompt,\n        # language=language,\n        ##text_norm=text_norm,\n    )\n    
print(\"logits.shape\", logits.shape, type(logits))\n\n    idx = logits.argmax(axis=-1)\n    print(idx)\n    print(len(idx))\n    prev = -1\n    ids = []\n    for i in idx:\n        if i != prev:\n            ids.append(i)\n        prev = i\n    ids = [i for i in ids if i != 0]\n    print(ids)\n\n    tokens = load_tokens(\"./tokens.txt\")\n    text = \"\".join([tokens[i] for i in ids])\n\n    text = text.replace(\"▁\", \" \")\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nWe use\nhttps://hf-mirror.com/yuekai/model_repo_sense_voice_small/blob/main/export_onnx.py\nas a reference while writing this file.\n\nThanks to https://github.com/yuekaizhang for making the file public.\n\nYou should install FunASR before you run this file.\n\"\"\"\n\nimport os\nfrom typing import Any, Dict, Tuple\n\nimport onnx\nimport torch\nfrom model import SenseVoiceSmall\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef modified_forward(\n    self,\n    x: torch.Tensor,\n    x_length: torch.Tensor,\n    language: torch.Tensor,\n    text_norm: torch.Tensor,\n):\n    \"\"\"\n    Args:\n      x:\n        A 3-D tensor of shape (N, T, C) with dtype torch.float32\n      x_length:\n        A 1-D tensor of shape (N,) with dtype torch.int32\n      language:\n        A 1-D tensor of shape (N,) with dtype torch.int32\n        See also https://github.com/FunAudioLLM/SenseVoice/blob/a80e676461b24419cf1130a33d4dd2f04053e5cc/model.py#L640\n      text_norm:\n        A 1-D tensor of shape (N,) with dtype torch.int32\n        See also https://github.com/FunAudioLLM/SenseVoice/blob/a80e676461b24419cf1130a33d4dd2f04053e5cc/model.py#L642\n    \"\"\"\n    language_query = self.embed(language).unsqueeze(1)\n    text_norm_query = self.embed(text_norm).unsqueeze(1)\n\n    event_emo_query = 
self.embed(torch.LongTensor([[1, 2]])).repeat(x.size(0), 1, 1)\n\n    x = torch.cat((language_query, event_emo_query, text_norm_query, x), dim=1)\n    x_length += 4\n\n    encoder_out, encoder_out_lens = self.encoder(x, x_length)\n    if isinstance(encoder_out, tuple):\n        encoder_out = encoder_out[0]\n\n    ctc_logits = self.ctc.ctc_lo(encoder_out)\n\n    return ctc_logits\n\n\ndef load_cmvn(filename) -> Tuple[str, str]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = \",\".join(t)\n            else:\n                inv_stddev = \",\".join(t)\n\n    return neg_mean, inv_stddev\n\n\ndef generate_tokens(params):\n    sp = params[\"tokenizer\"].sp\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n            f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n\n    os.system(\"head tokens.txt; tail -n200 tokens.txt\")\n\n\ndef display_params(params):\n    print(\"----------params----------\")\n    print(params)\n\n    print(\"----------frontend_conf----------\")\n    print(params[\"frontend_conf\"])\n\n    os.system(f\"cat {params['frontend_conf']['cmvn_file']}\")\n\n    print(\"----------config----------\")\n    print(params[\"config\"])\n\n    os.system(f\"cat {params['config']}\")\n\n\n@torch.no_grad()\ndef main():\n    model, params = SenseVoiceSmall.from_pretrained(\n        model=\"iic/SenseVoiceSmall\", device=\"cpu\"\n    )\n    model.eval()\n\n    display_params(params)\n\n    generate_tokens(params)\n\n    model.__class__.forward = modified_forward\n\n    x = torch.randn(2, 100, 560, dtype=torch.float32)\n    x_length = torch.tensor([80, 100], dtype=torch.int32)\n    language = torch.tensor([0, 3], dtype=torch.int32)\n    text_norm = torch.tensor([14, 15], 
dtype=torch.int32)\n\n    opset_version = 13\n    filename = \"model.onnx\"\n    torch.onnx.export(\n        model,\n        (x, x_length, language, text_norm),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"x_length\", \"language\", \"text_norm\"],\n        output_names=[\"logits\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"x_length\": {0: \"N\"},\n            \"language\": {0: \"N\"},\n            \"text_norm\": {0: \"N\"},\n            \"logits\": {0: \"N\", 1: \"T\"},\n        },\n    )\n\n    lfr_window_size = params[\"frontend_conf\"][\"lfr_m\"]\n    lfr_window_shift = params[\"frontend_conf\"][\"lfr_n\"]\n\n    neg_mean, inv_stddev = load_cmvn(params[\"frontend_conf\"][\"cmvn_file\"])\n    vocab_size = params[\"tokenizer\"].sp.vocab_size()\n\n    meta_data = {\n        \"lfr_window_size\": lfr_window_size,\n        \"lfr_window_shift\": lfr_window_shift,\n        \"normalize_samples\": 0,  # input should be in the range [-32768, 32767]\n        \"neg_mean\": neg_mean,\n        \"inv_stddev\": inv_stddev,\n        \"model_type\": \"sense_voice_ctc\",\n        # version 1: Use QInt8\n        # version 2: Use QUInt8\n        \"version\": \"2\",\n        \"model_author\": \"iic\",\n        \"maintainer\": \"k2-fsa\",\n        \"vocab_size\": vocab_size,\n        \"comment\": \"iic/SenseVoiceSmall\",\n        \"lang_auto\": model.lid_dict[\"auto\"],\n        \"lang_zh\": model.lid_dict[\"zh\"],\n        \"lang_en\": model.lid_dict[\"en\"],\n        \"lang_yue\": model.lid_dict[\"yue\"],  # cantonese\n        \"lang_ja\": model.lid_dict[\"ja\"],\n        \"lang_ko\": model.lid_dict[\"ko\"],\n        \"lang_nospeech\": model.lid_dict[\"nospeech\"],\n        \"with_itn\": model.textnorm_dict[\"withitn\"],\n        \"without_itn\": model.textnorm_dict[\"woitn\"],\n        \"url\": \"https://huggingface.co/FunAudioLLM/SenseVoiceSmall\",\n    }\n    add_meta_data(filename=filename, 
meta_data=meta_data)\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        # Note that we have to use QUInt8 here.\n        #\n        # When QInt8 is used, C++ onnxruntime produces incorrect results\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20240717)\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/export_onnx_nano.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport os\nfrom typing import Any, Dict\n\nimport onnx\nimport torch\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\nfrom test_nano_torch import load_tokens, load_torch_model\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--opset-version\",\n        type=int,\n        default=13,\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n    id2tokens = load_tokens()\n\n    vocab_size = len(id2tokens)\n    blank_id = vocab_size - 1\n\n    print(\"loading model\")\n\n    model = load_torch_model()\n    model.eval()\n\n    x = torch.randn(1, 30, 560, dtype=torch.float32)\n\n    opset_version = args.opset_version\n    filename = \"model.onnx\"\n    torch.onnx.export(\n        model,\n        x,\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\"],\n        output_names=[\"logits\"],\n        dynamic_axes={\n            \"x\": {1: \"T\"},\n        },\n    )\n\n    model_author = \"FunAudioLLM\"\n    comment = os.environ.get(\"comment\", \"FunAudioLLM/Fun-ASR-Nano-2512\")\n    url = \"https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512\"\n\n    meta_data = {\n      
  \"lfr_window_size\": 7,\n        \"lfr_window_shift\": 6,\n        \"normalize_samples\": 0,  # input should be in the range [-32768, 32767]\n        \"model_type\": \"sense_voice_ctc\",\n        \"version\": \"1\",\n        \"model_author\": model_author,\n        \"maintainer\": \"k2-fsa\",\n        \"vocab_size\": vocab_size,\n        \"blank_id\": blank_id,\n        \"comment\": comment,\n        \"url\": url,\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        # Note that we have to use QUInt8 here.\n        #\n        # When QInt8 is used, C++ onnxruntime produces incorrect results\n        weight_type=QuantType.QUInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20251217)\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/qnn/.gitignore",
    "content": "*.raw\n"
  },
  {
    "path": "scripts/sense-voice/qnn/decode_logits.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport numpy as np\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\nlogits = np.fromfile(\"./logits.raw\", dtype=np.float32).reshape((-1, 25055))\n\nidx = logits.argmax(axis=-1)\nprint(\"idx\", idx)\nprint(len(idx))\nprev = -1\nids = []\nfor i in idx:\n    if i != prev:\n        ids.append(i)\n    prev = i\nids = [i for i in ids if i != 0]\nprint(ids)\n\ntokens = load_tokens(\"./tokens.txt\")\ntext = \"\".join([tokens[i] for i in ids])\n\n# SentencePiece marks word boundaries with U+2581, not an ASCII underscore\ntext = text.replace(\"▁\", \" \")\nprint(text)\n"
  },
  {
    "path": "scripts/sense-voice/qnn/generate_test_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--num-frames\",\n        type=int,\n        required=True,\n    )\n\n    parser.add_argument(\n        \"--wav\",\n        type=str,\n        required=True,\n    )\n    return parser.parse_args()\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    window_size: int = 7,  # lfr_m\n    window_shift: int = 6,  # lfr_n\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"hamming\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n\n    return np.copy(features)\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    
samples, sample_rate = load_audio(args.wav)\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n    )\n    print(\"features.shape\", features.shape)\n    if features.shape[0] > args.num_frames:\n        features = features[: args.num_frames]\n    elif features.shape[0] < args.num_frames:\n        pad_width = ((0, args.num_frames - features.shape[0]), (0, 0))\n        features = np.pad(features, pad_width, mode=\"constant\", constant_values=0)\n\n    features.tofile(\"input0.raw\")\n\n    language_auto = 0\n    language_zh = 3\n    language_en = 4\n    language_yue = 7\n    language_ja = 11\n    language_ko = 12\n    language_nospeech = 13\n\n    language = language_auto\n\n    with_itn = 14\n    without_itn = 15\n\n    text_norm = with_itn\n\n    prompt = np.array([language, 1, 2, text_norm], dtype=np.int32)\n    prompt.tofile(\"input1.raw\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/rknn/adaptor.py",
    "content": "import torch\nfrom torch import nn\n\nimport torch_model\n\n\nclass MultiHeadedAttention(nn.Module):\n    \"\"\"\n    This class is copied and modified from\n    https://github.com/modelscope/FunASR/blob/main/funasr/models/transformer/attention.py\n    \"\"\"\n\n    def __init__(self, n_head, n_feat, dropout_rate):\n        super().__init__()\n        assert n_feat % n_head == 0\n\n        # We assume d_v always equals d_k\n        self.d_k = n_feat // n_head\n        self.h = n_head\n        self.linear_q = nn.Linear(n_feat, n_feat)\n        self.linear_k = nn.Linear(n_feat, n_feat)\n        self.linear_v = nn.Linear(n_feat, n_feat)\n        self.linear_out = nn.Linear(n_feat, n_feat)\n        self.attn = None\n        self.dropout = nn.Dropout(p=dropout_rate)\n\n    def forward_qkv(self, query, key, value):\n        \"\"\"Transform query, key and value.\n\n        Args:\n            query (torch.Tensor): Query tensor (#batch, time1, size).\n            key (torch.Tensor): Key tensor (#batch, time2, size).\n            value (torch.Tensor): Value tensor (#batch, time2, size).\n\n        Returns:\n            torch.Tensor: Transformed query tensor (#batch, n_head, time1, d_k).\n            torch.Tensor: Transformed key tensor (#batch, n_head, time2, d_k).\n            torch.Tensor: Transformed value tensor (#batch, n_head, time2, d_k).\n\n        \"\"\"\n        n_batch = query.size(0)\n        q = self.linear_q(query).view(n_batch, -1, self.h, self.d_k)\n        k = self.linear_k(key).view(n_batch, -1, self.h, self.d_k)\n        v = self.linear_v(value).view(n_batch, -1, self.h, self.d_k)\n        q = q.transpose(1, 2)  # (batch, head, time1, d_k)\n        k = k.transpose(1, 2)  # (batch, head, time2, d_k)\n        v = v.transpose(1, 2)  # (batch, head, time2, d_k)\n\n        return q, k, v\n\n    def forward_attention(self, value, scores, mask):\n        \"\"\"Compute attention context vector.\n\n        Args:\n            value (torch.Tensor): 
Transformed value (#batch, n_head, time2, d_k).\n            scores (torch.Tensor): Attention score (#batch, n_head, time1, time2).\n            mask (torch.Tensor): Mask (#batch, 1, time2) or (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Transformed value (#batch, time1, d_model)\n                weighted by the attention score (#batch, time1, time2).\n\n        \"\"\"\n        n_batch = value.size(0)\n        if mask is not None:\n            mask = mask.unsqueeze(1).eq(0)  # (batch, 1, *, time2)\n\n            min_value = -float(\n                \"inf\"\n            )  # min_value = float(np.finfo(torch.tensor(0, dtype=qk.dtype).numpy().dtype).min)\n            scores = scores.masked_fill(mask, min_value)\n            attn = torch.softmax(scores, dim=-1).masked_fill(\n                mask, 0.0\n            )  # (batch, head, time1, time2)\n        else:\n            attn = torch.softmax(scores, dim=-1)  # (batch, head, time1, time2)\n\n        p_attn = self.dropout(attn)\n        x = torch.matmul(p_attn, value)  # (batch, head, time1, d_k)\n        x = (\n            x.transpose(1, 2).contiguous().view(n_batch, -1, self.h * self.d_k)\n        )  # (batch, time1, d_model)\n\n        return self.linear_out(x)  # (batch, time1, d_model)\n\n    def forward(self, query, key, value, mask):\n        \"\"\"Compute scaled dot product attention.\n\n        Args:\n            query (torch.Tensor): Query tensor (#batch, time1, size).\n            key (torch.Tensor): Key tensor (#batch, time2, size).\n            value (torch.Tensor): Value tensor (#batch, time2, size).\n            mask (torch.Tensor): Mask tensor (#batch, 1, time2) or\n                (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time1, d_model).\n\n        \"\"\"\n        q, k, v = self.forward_qkv(query, key, value)\n        #  scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(self.d_k)\n        scores = torch.matmul(q, 
k.transpose(-2, -1)) * self.d_k ** (-0.5)\n\n        return self.forward_attention(v, scores, mask)\n\n\nclass EncoderLayer(nn.Module):\n    \"\"\"\n    This class is copied and modified from\n    https://github.com/modelscope/FunASR/blob/main/funasr/models/transformer/encoder.py\n    \"\"\"\n\n    def __init__(\n        self,\n        size,\n        self_attn,\n        feed_forward,\n        dropout_rate,\n        normalize_before=True,\n        concat_after=False,\n        stochastic_depth_rate=0.0,\n    ):\n        super().__init__()\n\n        self.self_attn = self_attn\n        self.feed_forward = feed_forward\n        self.norm1 = nn.LayerNorm(size, eps=1e-12)\n        self.norm2 = nn.LayerNorm(size, eps=1e-12)\n        self.dropout = nn.Dropout(dropout_rate)\n        self.size = size\n        self.normalize_before = normalize_before\n        self.concat_after = concat_after\n        if self.concat_after:\n            self.concat_linear = nn.Linear(size + size, size)\n        self.stochastic_depth_rate = stochastic_depth_rate\n\n    def forward(self, x, mask=None, cache=None):\n        \"\"\"Compute encoded features.\n\n        Args:\n            x_input (torch.Tensor): Input tensor (#batch, time, size).\n            mask (torch.Tensor): Mask tensor for the input (#batch, time).\n            cache (torch.Tensor): Cache tensor of the input (#batch, time - 1, size).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time, size).\n            torch.Tensor: Mask tensor (#batch, time).\n\n        \"\"\"\n        skip_layer = False\n        # with stochastic depth, residual connection `x + f(x)` becomes\n        # `x <- x + 1 / (1 - p) * f(x)` at training time.\n        stoch_layer_coeff = 1.0\n\n        if skip_layer:\n            if cache is not None:\n                x = torch.cat([cache, x], dim=1)\n            return x, mask\n\n        residual = x\n        if self.normalize_before:\n            x = self.norm1(x)\n\n        if cache is 
None:\n            x_q = x\n        else:\n            assert cache.shape == (x.shape[0], x.shape[1] - 1, self.size)\n            x_q = x[:, -1:, :]\n            residual = residual[:, -1:, :]\n            mask = None if mask is None else mask[:, -1:, :]\n\n        if self.concat_after:\n            x_concat = torch.cat((x, self.self_attn(x_q, x, x, mask)), dim=-1)\n            x = residual + stoch_layer_coeff * self.concat_linear(x_concat)\n        else:\n            x = residual + stoch_layer_coeff * self.dropout(\n                self.self_attn(x_q, x, x, mask)\n            )\n        if not self.normalize_before:\n            x = self.norm1(x)\n\n        residual = x\n        if self.normalize_before:\n            x = self.norm2(x)\n        x = residual + stoch_layer_coeff * self.dropout(self.feed_forward(x))\n        if not self.normalize_before:\n            x = self.norm2(x)\n\n        if cache is not None:\n            x = torch.cat([cache, x], dim=1)\n\n        return x, mask\n\n\nclass Transformer(nn.Module):\n    # This class is copied and modified from\n    # https://github.com/modelscope/FunASR/blob/main/funasr/models/llm_asr/adaptor.py\n    def __init__(\n        self,\n        downsample_rate=1,\n        encoder_dim=512,\n        llm_dim=512,\n        ffn_dim: int = 2048,\n        n_layer: int = 5,\n        **kwargs\n    ):\n        super().__init__()\n        assert downsample_rate == 1, downsample_rate\n        self.k = downsample_rate\n        self.encoder_dim = encoder_dim\n        self.llm_dim = llm_dim\n        self.linear1 = nn.Linear(self.encoder_dim * self.k, ffn_dim)\n        self.relu = nn.ReLU()\n        self.linear2 = nn.Linear(ffn_dim, self.llm_dim)\n\n        self.blocks = None\n        if n_layer > 0:\n            self.blocks = nn.ModuleList(\n                [\n                    EncoderLayer(\n                        llm_dim,\n                        MultiHeadedAttention(\n                            kwargs.get(\"attention_heads\", 
8),\n                            llm_dim,\n                            kwargs.get(\"attention_dropout_rate\", 0.0),\n                        ),\n                        torch_model.PositionwiseFeedForward(\n                            llm_dim,\n                            llm_dim // 4,\n                            kwargs.get(\"dropout_rate\", 0.0),\n                        ),\n                        kwargs.get(\"dropout_rate\", 0.0),\n                    )\n                    for i in range(n_layer)\n                ]\n            )\n\n    def forward(self, x):\n        x = self.linear1(x)\n        x = self.relu(x)\n        x = self.linear2(x)\n\n        masks = None\n\n        if self.blocks is not None:\n            for layer, block in enumerate(self.blocks):\n                x, masks = block(x, masks)\n        return x\n"
  },
  {
    "path": "scripts/sense-voice/rknn/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport os\nfrom typing import Any, Dict, List, Tuple\n\nimport onnx\nimport sentencepiece as spm\nimport torch\n\nfrom torch_model import SenseVoiceSmall\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--input-len-in-seconds\",\n        type=int,\n        required=True,\n        help=\"\"\"RKNN does not support dynamic shape, so we need to hard-code\n        how long the model can process.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--opset-version\",\n        type=int,\n        default=13,\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef load_cmvn(filename) -> Tuple[List[float], List[float]]:\n    neg_mean = None\n    inv_stddev = None\n\n    with open(filename) as f:\n        for line in f:\n            if not line.startswith(\"<LearnRateCoef>\"):\n                continue\n            t = line.split()[3:-1]\n\n            if neg_mean is None:\n                neg_mean = list(map(lambda x: float(x), t))\n            else:\n                inv_stddev = list(map(lambda x: float(x), t))\n\n    return neg_mean, inv_stddev\n\n\ndef generate_tokens(sp):\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i in range(sp.vocab_size()):\n    
        f.write(f\"{sp.id_to_piece(i)} {i}\\n\")\n    print(\"saved to tokens.txt\")\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    sp = spm.SentencePieceProcessor()\n    sp.load(\"./chn_jpn_yue_eng_ko_spectok.bpe.model\")\n    vocab_size = sp.vocab_size()\n    generate_tokens(sp)\n\n    print(\"loading model\")\n\n    state_dict = torch.load(\"./model.pt\", map_location=\"cpu\")\n    if \"state_dict\" in state_dict:\n        state_dict = state_dict[\"state_dict\"]\n\n    neg_mean, inv_stddev = load_cmvn(\"./am.mvn\")\n\n    neg_mean = torch.tensor(neg_mean, dtype=torch.float32)\n    inv_stddev = torch.tensor(inv_stddev, dtype=torch.float32)\n\n    model = SenseVoiceSmall(neg_mean=neg_mean, inv_stddev=inv_stddev)\n    model.load_state_dict(state_dict)\n    model.eval()\n    del state_dict\n\n    lfr_window_size = 7\n    lfr_window_shift = 6\n\n    # frame shift is 10ms, 1 second has about 100 feature frames\n    input_len_in_seconds = int(args.input_len_in_seconds)\n    num_frames = input_len_in_seconds * 100\n    print(\"num_frames\", num_frames)\n\n    # num_input_frames is an approximate number\n    num_input_frames = int(num_frames / lfr_window_shift + 0.5)\n    print(\"num_input_frames\", num_input_frames)\n\n    x = torch.randn(1, num_input_frames, 560, dtype=torch.float32)\n\n    language = 3\n    text_norm = 15\n    prompt = torch.tensor([language, 1, 2, text_norm], dtype=torch.int32)\n\n    opset_version = args.opset_version\n    filename = f\"model-{input_len_in_seconds}-seconds.onnx\"\n    torch.onnx.export(\n        model,\n        (x, prompt),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"prompt\"],\n        output_names=[\"logits\"],\n        dynamic_axes={},\n    )\n\n    model_author = os.environ.get(\"model_author\", \"iic\")\n    comment = os.environ.get(\"comment\", \"iic/SenseVoiceSmall\")\n    url = os.environ.get(\"url\", 
\"https://huggingface.co/FunAudioLLM/SenseVoiceSmall\")\n\n    meta_data = {\n        \"lfr_window_size\": lfr_window_size,\n        \"lfr_window_shift\": lfr_window_shift,\n        \"num_input_frames\": num_input_frames,\n        \"normalize_samples\": 0,  # input should be in the range [-32768, 32767]\n        \"model_type\": \"sense_voice_ctc\",\n        \"version\": \"1\",\n        \"model_author\": model_author,\n        \"maintainer\": \"k2-fsa\",\n        \"vocab_size\": vocab_size,\n        \"comment\": comment,\n        \"lang_auto\": model.lid_dict[\"auto\"],\n        \"lang_zh\": model.lid_dict[\"zh\"],\n        \"lang_en\": model.lid_dict[\"en\"],\n        \"lang_yue\": model.lid_dict[\"yue\"],  # cantonese\n        \"lang_ja\": model.lid_dict[\"ja\"],\n        \"lang_ko\": model.lid_dict[\"ko\"],\n        \"lang_nospeech\": model.lid_dict[\"nospeech\"],\n        \"with_itn\": model.textnorm_dict[\"withitn\"],\n        \"without_itn\": model.textnorm_dict[\"woitn\"],\n        \"url\": url,\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n\nif __name__ == \"__main__\":\n    torch.manual_seed(20250717)\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/rknn/export-rknn.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nimport argparse\nimport logging\nfrom pathlib import Path\n\nfrom rknn.api import RKNN\n\nlogging.basicConfig(level=logging.WARNING)\n\ng_platforms = [\n    #  \"rv1103\",\n    #  \"rv1103b\",\n    #  \"rv1106\",\n    #  \"rk2118\",\n    \"rk3562\",\n    \"rk3566\",\n    \"rk3568\",\n    \"rk3576\",\n    \"rk3588\",\n]\n\n\ndef get_parser():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--target-platform\",\n        type=str,\n        required=True,\n        help=f\"Supported values are: {','.join(g_platforms)}\",\n    )\n\n    parser.add_argument(\n        \"--in-model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model\",\n    )\n\n    parser.add_argument(\n        \"--out-model\",\n        type=str,\n        required=True,\n        help=\"Path to the output rknn model\",\n    )\n\n    return parser\n\n\ndef get_meta_data(model: str):\n    import onnxruntime\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.inter_op_num_threads = 1\n    session_opts.intra_op_num_threads = 1\n\n    m = onnxruntime.InferenceSession(\n        model,\n        sess_options=session_opts,\n        providers=[\"CPUExecutionProvider\"],\n    )\n\n    for i in m.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in m.get_outputs():\n        print(i)\n    print()\n\n    meta = m.get_modelmeta().custom_metadata_map\n    s = \"\"\n    sep = \"\"\n    for key, value in meta.items():\n        if key in (\"neg_mean\", \"inv_stddev\"):\n            continue\n        s = s + sep + f\"{key}={value}\"\n        sep = \";\"\n    assert len(s) < 1024, len(s)\n\n    print(\"len(s)\", len(s), s)\n\n    return s\n\n\ndef export_rknn(rknn, filename):\n    ret = rknn.export_rknn(filename)\n    if ret != 0:\n        
exit(f\"Export rknn model to {filename} failed!\")\n\n\ndef init_model(filename: str, target_platform: str, custom_string=None):\n    rknn = RKNN(verbose=False)\n\n    rknn.config(\n        optimization_level=0,\n        target_platform=target_platform,\n        custom_string=custom_string,\n    )\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    ret = rknn.load_onnx(model=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn.build(do_quantization=False)\n    if ret != 0:\n        exit(f\"Build model {filename} failed!\")\n\n    return rknn\n\n\nclass RKNNModel:\n    def __init__(\n        self,\n        model: str,\n        target_platform: str,\n    ):\n        meta = get_meta_data(model)\n        print(meta)\n\n        self.model = init_model(\n            model,\n            target_platform=target_platform,\n            custom_string=meta,\n        )\n\n    def export_rknn(self, model):\n        export_rknn(self.model, model)\n\n    def release(self):\n        self.model.release()\n\n\ndef main():\n    args = get_parser().parse_args()\n    print(vars(args))\n\n    model = RKNNModel(\n        model=args.in_model,\n        target_platform=args.target_platform,\n    )\n\n    model.export_rknn(\n        model=args.out_model,\n    )\n\n    model.release()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/rknn/nano.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom torch import nn\n\nimport adaptor\nimport torch_model\n\n\nclass Nano(nn.Module):\n    def __init__(self, vocab_size: int = 60515):\n        super().__init__()\n        self.audio_encoder = torch_model.SenseVoiceEncoderSmall()\n        self.ctc_decoder = adaptor.Transformer()\n        # blank is 60514, i.e., the last token id\n        self.ctc = torch_model.CTC(\n            odim=vocab_size,\n            encoder_output_size=self.audio_encoder.output_size,\n        )\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n          x: (N, T, C)\n        Returns:\n          - logits: (N, T, vocab_size)\n        \"\"\"\n        encoder_out = self.audio_encoder(x)\n        encoder_out = self.ctc_decoder(encoder_out)\n        logits = self.ctc.ctc_lo(encoder_out)\n        return logits\n"
  },
  {
    "path": "scripts/sense-voice/rknn/test_nano_torch.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport base64\nfrom pathlib import Path\n\nimport torch\n\nimport nano\nimport test_onnx\n\n\ndef load_tokens(filename: str = \"./tokens.txt\"):\n    id2token = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            try:\n                f = line.strip().split()\n                if len(f) == 2:\n                    t, i = f\n                else:\n                    t = \" \"\n                    i = f[0]\n                id2token[int(i)] = t\n            except Exception as ex:\n                print(ex)\n                raise\n    return id2token\n\n\ndef load_torch_model():\n    if not Path(\"./model.pt\").is_file():\n        raise ValueError(\n            \"Please download files from https://huggingface.co/csukuangfj/funasr-nano-with-ctc\"\n        )\n    model = nano.Nano()\n\n    state_dict = torch.load(\"./model.pt\", map_location=\"cpu\")\n\n    to_delete = [k for k in state_dict if \"llm\" in k or \"audio_adaptor\" in k]\n\n    for k in to_delete:\n        del state_dict[k]\n\n    model.load_state_dict(state_dict, strict=True)\n    model.eval()\n\n    del state_dict\n\n    return model\n\n\n@torch.no_grad()\ndef main():\n    model = load_torch_model()\n    num_params = sum(p.numel() for p in model.parameters())\n    print(\"num_params (M)\", num_params, num_params / 1000000)\n\n    samples, sample_rate = test_onnx.load_audio(\"./zh.wav\")\n    assert sample_rate == 16000, sample_rate\n\n    features = test_onnx.compute_feat(samples=samples, sample_rate=sample_rate)\n    x = torch.from_numpy(features)[None]\n    logits = model(x)\n\n    idx = logits.squeeze(0).argmax(dim=-1)\n    print(idx)\n    idx = torch.unique_consecutive(idx).tolist()\n    print(idx)\n\n    id2token = load_tokens(\"./tokens.txt\")\n    blank_id = len(id2token) - 1\n\n    idx = [i for i in idx if i != blank_id]\n    print(idx)\n\n    s = 
b\"\"\n    for i in idx:\n        s += base64.b64decode(id2token[i])\n\n    text = s.decode().strip()\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/rknn/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nNote: This is for testing the onnx models that would be later used to export\nto RKNN\n\"\"\"\n\nimport argparse\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--wave\",\n        type=str,\n        required=True,\n        help=\"The input wave to be recognized\",\n    )\n\n    parser.add_argument(\n        \"--language\",\n        type=str,\n        default=\"auto\",\n        help=\"the language of the input wav file. Supported values: zh, en, ja, ko, yue, auto\",\n    )\n\n    parser.add_argument(\n        \"--use-itn\",\n        type=int,\n        default=0,\n        help=\"1 to use inverse text normalization. 
0 to not use inverse text normalization\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n\n        self.window_size = int(meta[\"lfr_window_size\"])  # lfr_m\n        self.window_shift = int(meta[\"lfr_window_shift\"])  # lfr_n\n\n        lang_zh = int(meta[\"lang_zh\"])\n        lang_en = int(meta[\"lang_en\"])\n        lang_ja = int(meta[\"lang_ja\"])\n        lang_ko = int(meta[\"lang_ko\"])\n        lang_yue = int(meta[\"lang_yue\"])\n        lang_auto = int(meta[\"lang_auto\"])\n\n        self.lang_id = {\n            \"zh\": lang_zh,\n            \"en\": lang_en,\n            \"ja\": lang_ja,\n            \"ko\": lang_ko,\n            \"yue\": lang_yue,\n            \"auto\": lang_auto,\n        }\n        self.with_itn = int(meta[\"with_itn\"])\n        self.without_itn = int(meta[\"without_itn\"])\n\n        self.max_len = self.model.get_inputs()[0].shape[1]\n\n    def __call__(self, x, prompt):\n        logits = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: prompt.numpy(),\n            },\n        )[0]\n\n        return torch.from_numpy(logits)\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    
return samples, sample_rate\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    max_len: int = -1,\n    window_size: int = 7,  # lfr_m\n    window_shift: int = 6,  # lfr_n\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"hamming\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n\n    print(\"features.shape\", features.shape)\n\n    if max_len > 0:\n        if features.shape[0] > max_len:\n            features = features[:max_len]\n        elif features.shape[0] < max_len:\n            features = np.pad(\n                features,\n                ((0, max_len - features.shape[0]), (0, 0)),\n                mode=\"constant\",\n                constant_values=0,\n            )\n\n    print(\"features.shape\", features.shape)\n    features = np.ascontiguousarray(features)\n\n    return features\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    samples, sample_rate = load_audio(args.wave)\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, 
orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OnnxModel(filename=args.model)\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n        max_len=model.max_len,\n        window_size=model.window_size,\n        window_shift=model.window_shift,\n    )\n\n    features = torch.from_numpy(features).unsqueeze(0)\n\n    language = model.lang_id[\"auto\"]\n    if args.language in model.lang_id:\n        language = model.lang_id[args.language]\n    else:\n        print(f\"Invalid language: '{args.language}'\")\n        print(\"Use auto\")\n\n    if args.use_itn:\n        text_norm = model.with_itn\n    else:\n        text_norm = model.without_itn\n\n    prompt = torch.tensor([language, 1, 2, text_norm], dtype=torch.int32)\n\n    logits = model(\n        x=features,\n        prompt=prompt,\n    )\n\n    idx = logits.squeeze(0).argmax(dim=-1)\n    # idx is of shape (T,)\n    idx = torch.unique_consecutive(idx)\n\n    blank_id = 0\n    idx = idx[idx != blank_id].tolist()\n\n    tokens = load_tokens(args.tokens)\n    text = \"\".join([tokens[i] for i in idx])\n\n    text = text.replace(\"▁\", \" \")\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/rknn/torch_model.py",
    "content": "# This file is modified from\n# https://github.com/modelscope/FunASR/blob/main/funasr/models/sense_voice/model.py\n\nimport torch\nimport torch.nn\nimport torch.nn as nn\nimport torch.nn.functional as F\n\n\nclass SinusoidalPositionEncoder(nn.Module):\n    def __init__(self, d_model=80, dropout_rate=0.1):\n        super().__init__()\n\n    def encode(\n        self,\n        positions: torch.Tensor = None,\n        depth: int = None,\n        dtype: torch.dtype = torch.float32,\n    ):\n        \"\"\"\n        Args:\n          positions: (batch_size, )\n        \"\"\"\n        batch_size = positions.size(0)\n        positions = positions.type(dtype)\n        device = positions.device\n        log_timescale_increment = torch.log(\n            torch.tensor([10000], dtype=dtype, device=device)\n        ) / (depth / 2 - 1)\n        inv_timescales = torch.exp(\n            torch.arange(depth / 2, device=device).type(dtype)\n            * (-log_timescale_increment)\n        )\n        inv_timescales = torch.reshape(inv_timescales, [batch_size, -1])\n        scaled_time = torch.reshape(positions, [1, -1, 1]) * torch.reshape(\n            inv_timescales, [1, 1, -1]\n        )\n        encoding = torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], dim=2)\n        return encoding.type(dtype)\n\n    def forward(self, x):\n        batch_size, timesteps, input_dim = x.size()\n        positions = torch.arange(1, timesteps + 1, device=x.device)[None, :]\n        position_encoding = self.encode(positions, input_dim, x.dtype).to(x.device)\n\n        return x + position_encoding\n\n\nclass PositionwiseFeedForward(nn.Module):\n    \"\"\"Positionwise feed forward layer.\n\n    Args:\n        idim (int): Input dimension.\n        hidden_units (int): The number of hidden units.\n        dropout_rate (float): Dropout rate.\n\n    \"\"\"\n\n    def __init__(self, idim, hidden_units, dropout_rate, activation=None):\n        super().__init__()\n        self.w_1 = 
torch.nn.Linear(idim, hidden_units)\n        self.w_2 = torch.nn.Linear(hidden_units, idim)\n        self.dropout = torch.nn.Dropout(dropout_rate)\n        if activation is None:\n            activation = torch.nn.ReLU()\n        self.activation = activation\n\n    def forward(self, x):\n        \"\"\"Forward function.\"\"\"\n        return self.w_2(self.dropout(self.activation(self.w_1(x))))\n\n\nclass MultiHeadedAttentionSANM(nn.Module):\n    \"\"\"Multi-Head Attention layer.\n\n    Args:\n        n_head (int): The number of heads.\n        n_feat (int): The number of features.\n        dropout_rate (float): Dropout rate.\n\n    \"\"\"\n\n    def __init__(\n        self,\n        n_head,\n        in_feat,\n        n_feat,\n        dropout_rate,\n        kernel_size,\n        sanm_shfit=0,\n        lora_list=None,\n        lora_rank=8,\n        lora_alpha=16,\n        lora_dropout=0.1,\n    ):\n        super().__init__()\n        assert n_feat % n_head == 0\n        # We assume d_v always equals d_k\n        self.d_k = n_feat // n_head\n        self.h = n_head\n        self.linear_out = nn.Linear(n_feat, n_feat)\n        self.linear_q_k_v = nn.Linear(in_feat, n_feat * 3)\n        self.attn = None\n        self.dropout = nn.Dropout(p=dropout_rate)\n\n        self.fsmn_block = nn.Conv1d(\n            n_feat, n_feat, kernel_size, stride=1, padding=0, groups=n_feat, bias=False\n        )\n        # padding\n        left_padding = (kernel_size - 1) // 2\n        if sanm_shfit > 0:\n            left_padding = left_padding + sanm_shfit\n        right_padding = kernel_size - 1 - left_padding\n        self.pad_fn = nn.ConstantPad1d((left_padding, right_padding), 0.0)\n\n    def forward_fsmn(self, inputs, mask, mask_shfit_chunk=None):\n        b, t, d = inputs.size()\n        if mask is not None:\n            mask = torch.reshape(mask, (b, -1, 1))\n            if mask_shfit_chunk is not None:\n                mask = mask * mask_shfit_chunk\n            inputs = inputs * 
mask\n\n        x = inputs.transpose(1, 2)\n        x = self.pad_fn(x)\n        x = self.fsmn_block(x)\n        x = x.transpose(1, 2)\n        x += inputs\n        x = self.dropout(x)\n        if mask is not None:\n            x = x * mask\n        return x\n\n    def forward_qkv(self, x):\n        \"\"\"Project the input to query, key and value.\n\n        Args:\n            x (torch.Tensor): Input tensor (#batch, time, in_feat).\n\n        Returns:\n            torch.Tensor: Transformed query tensor (#batch, n_head, time, d_k).\n            torch.Tensor: Transformed key tensor (#batch, n_head, time, d_k).\n            torch.Tensor: Transformed value tensor (#batch, n_head, time, d_k).\n            torch.Tensor: Value tensor before the head split (#batch, time, n_feat).\n\n        \"\"\"\n        b, t, d = x.size()\n        q_k_v = self.linear_q_k_v(x)\n        q, k, v = torch.split(q_k_v, int(self.h * self.d_k), dim=-1)\n        q_h = torch.reshape(q, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time1, d_k)\n        k_h = torch.reshape(k, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n        v_h = torch.reshape(v, (b, t, self.h, self.d_k)).transpose(\n            1, 2\n        )  # (batch, head, time2, d_k)\n\n        return q_h, k_h, v_h, v\n\n    def forward_attention(self, value, scores, mask, mask_att_chunk_encoder=None):\n        \"\"\"Compute attention context vector.\n\n        Args:\n            value (torch.Tensor): Transformed value (#batch, n_head, time2, d_k).\n            scores (torch.Tensor): Attention score (#batch, n_head, time1, time2).\n            mask (torch.Tensor): Mask (#batch, 1, time2) or (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Transformed value (#batch, time1, d_model)\n                weighted by the attention score (#batch, time1, time2).\n\n        \"\"\"\n        n_batch 
= value.size(0)\n        if mask is not None:\n            if mask_att_chunk_encoder is not None:\n                mask = mask * mask_att_chunk_encoder\n\n            mask = mask.unsqueeze(1).eq(0)  # (batch, 1, *, time2)\n\n            min_value = -float(\n                \"inf\"\n            )  # float(numpy.finfo(torch.tensor(0, dtype=scores.dtype).numpy().dtype).min)\n            scores = scores.masked_fill(mask, min_value)\n            attn = torch.softmax(scores, dim=-1).masked_fill(\n                mask, 0.0\n            )  # (batch, head, time1, time2)\n        else:\n            attn = torch.softmax(scores, dim=-1)  # (batch, head, time1, time2)\n\n        p_attn = self.dropout(attn)\n        x = torch.matmul(p_attn, value)  # (batch, head, time1, d_k)\n        x = (\n            x.transpose(1, 2).contiguous().view(n_batch, -1, self.h * self.d_k)\n        )  # (batch, time1, d_model)\n\n        return self.linear_out(x)  # (batch, time1, d_model)\n\n    def forward(self, x, mask, mask_shfit_chunk=None, mask_att_chunk_encoder=None):\n        \"\"\"Compute scaled dot product self-attention plus the FSMN memory branch.\n\n        Args:\n            x (torch.Tensor): Input tensor (#batch, time, size).\n            mask (torch.Tensor): Mask tensor (#batch, 1, time2) or\n                (#batch, time1, time2).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time1, d_model).\n\n        \"\"\"\n        q_h, k_h, v_h, v = self.forward_qkv(x)\n        fsmn_memory = self.forward_fsmn(v, mask, mask_shfit_chunk)\n        q_h = q_h * self.d_k ** (-0.5)\n        scores = torch.matmul(q_h, k_h.transpose(-2, -1))\n        att_outs = self.forward_attention(v_h, scores, mask, mask_att_chunk_encoder)\n        return att_outs + fsmn_memory\n\n\nclass EncoderLayerSANM(nn.Module):\n    def __init__(\n        self,\n        in_size,\n        size,\n 
       self_attn,\n        feed_forward,\n        dropout_rate,\n        normalize_before=True,\n        concat_after=False,\n        stochastic_depth_rate=0.0,\n    ):\n        super().__init__()\n        self.self_attn = self_attn\n        self.feed_forward = feed_forward\n        self.norm1 = LayerNorm(in_size)\n        self.norm2 = LayerNorm(size)\n        self.dropout = nn.Dropout(dropout_rate)\n        self.in_size = in_size\n        self.size = size\n        self.normalize_before = normalize_before\n        self.concat_after = concat_after\n        if self.concat_after:\n            self.concat_linear = nn.Linear(size + size, size)\n        self.stochastic_depth_rate = stochastic_depth_rate\n        self.dropout_rate = dropout_rate\n\n    def forward(\n        self, x, mask, cache=None, mask_shfit_chunk=None, mask_att_chunk_encoder=None\n    ):\n        \"\"\"Compute encoded features.\n\n        Args:\n            x_input (torch.Tensor): Input tensor (#batch, time, size).\n            mask (torch.Tensor): Mask tensor for the input (#batch, time).\n            cache (torch.Tensor): Cache tensor of the input (#batch, time - 1, size).\n\n        Returns:\n            torch.Tensor: Output tensor (#batch, time, size).\n            torch.Tensor: Mask tensor (#batch, time).\n\n        \"\"\"\n        skip_layer = False\n        # with stochastic depth, residual connection `x + f(x)` becomes\n        # `x <- x + 1 / (1 - p) * f(x)` at training time.\n        stoch_layer_coeff = 1.0\n        if self.training and self.stochastic_depth_rate > 0:\n            skip_layer = torch.rand(1).item() < self.stochastic_depth_rate\n            stoch_layer_coeff = 1.0 / (1 - self.stochastic_depth_rate)\n\n        if skip_layer:\n            if cache is not None:\n                x = torch.cat([cache, x], dim=1)\n            return x, mask\n\n        residual = x\n        if self.normalize_before:\n            x = self.norm1(x)\n\n        if self.concat_after:\n            x_concat 
= torch.cat(\n                (\n                    x,\n                    self.self_attn(\n                        x,\n                        mask,\n                        mask_shfit_chunk=mask_shfit_chunk,\n                        mask_att_chunk_encoder=mask_att_chunk_encoder,\n                    ),\n                ),\n                dim=-1,\n            )\n            if self.in_size == self.size:\n                x = residual + stoch_layer_coeff * self.concat_linear(x_concat)\n            else:\n                x = stoch_layer_coeff * self.concat_linear(x_concat)\n        else:\n            if self.in_size == self.size:\n                x = residual + stoch_layer_coeff * self.dropout(\n                    self.self_attn(\n                        x,\n                        mask,\n                        mask_shfit_chunk=mask_shfit_chunk,\n                        mask_att_chunk_encoder=mask_att_chunk_encoder,\n                    )\n                )\n            else:\n                x = stoch_layer_coeff * self.dropout(\n                    self.self_attn(\n                        x,\n                        mask,\n                        mask_shfit_chunk=mask_shfit_chunk,\n                        mask_att_chunk_encoder=mask_att_chunk_encoder,\n                    )\n                )\n                return x, mask\n        if not self.normalize_before:\n            x = self.norm1(x)\n\n        residual = x\n        if self.normalize_before:\n            x = self.norm2(x)\n        x = residual + stoch_layer_coeff * self.dropout(self.feed_forward(x))\n        if not self.normalize_before:\n            x = self.norm2(x)\n\n        return x, mask, cache, mask_shfit_chunk, mask_att_chunk_encoder\n\n\nclass LayerNorm(nn.LayerNorm):\n    def __init__(self, *args, **kwargs):\n        super().__init__(*args, **kwargs)\n\n    def forward(self, input):\n        output = F.layer_norm(\n            input.float(),\n            self.normalized_shape,\n            
self.weight.float() if self.weight is not None else None,\n            self.bias.float() if self.bias is not None else None,\n            self.eps,\n        )\n        return output.type_as(input)\n\n\nclass SenseVoiceEncoderSmall(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.input_size = 80 * 7\n        self.output_size = 512\n        self.attention_heads = 4\n        self.linear_units = 2048\n        self.num_blocks = 50\n        self.tp_blocks = 20\n        self.input_layer = \"pe\"\n        self.pos_enc_class = \"SinusoidalPositionEncoder\"\n        self.normalize_before = True\n        self.kernel_size = 11\n        self.sanm_shfit = 0\n        self.concat_after = False\n        self.positionwise_layer_type = \"linear\"\n        self.positionwise_conv_kernel_size = 1\n        self.padding_idx = -1\n        self.selfattention_layer_type = \"sanm\"\n        self.dropout_rate = 0.1\n        self.attention_dropout_rate = 0.1\n\n        self._output_size = self.output_size\n\n        self.embed = SinusoidalPositionEncoder()\n\n        positionwise_layer = PositionwiseFeedForward\n        positionwise_layer_args = (\n            self.output_size,\n            self.linear_units,\n            self.dropout_rate,\n        )\n\n        encoder_selfattn_layer = MultiHeadedAttentionSANM\n        encoder_selfattn_layer_args0 = (\n            self.attention_heads,\n            self.input_size,\n            self.output_size,\n            self.attention_dropout_rate,\n            self.kernel_size,\n            self.sanm_shfit,\n        )\n        encoder_selfattn_layer_args = (\n            self.attention_heads,\n            self.output_size,\n            self.output_size,\n            self.attention_dropout_rate,\n            self.kernel_size,\n            self.sanm_shfit,\n        )\n\n        self.encoders0 = nn.ModuleList(\n            [\n                EncoderLayerSANM(\n                    self.input_size,\n                    
self.output_size,\n                    encoder_selfattn_layer(*encoder_selfattn_layer_args0),\n                    positionwise_layer(*positionwise_layer_args),\n                    self.dropout_rate,\n                )\n                for i in range(1)\n            ]\n        )\n\n        self.encoders = nn.ModuleList(\n            [\n                EncoderLayerSANM(\n                    self.output_size,\n                    self.output_size,\n                    encoder_selfattn_layer(*encoder_selfattn_layer_args),\n                    positionwise_layer(*positionwise_layer_args),\n                    self.dropout_rate,\n                )\n                for i in range(self.num_blocks - 1)\n            ]\n        )\n\n        self.tp_encoders = nn.ModuleList(\n            [\n                EncoderLayerSANM(\n                    self.output_size,\n                    self.output_size,\n                    encoder_selfattn_layer(*encoder_selfattn_layer_args),\n                    positionwise_layer(*positionwise_layer_args),\n                    self.dropout_rate,\n                )\n                for i in range(self.tp_blocks)\n            ]\n        )\n\n        self.after_norm = LayerNorm(self.output_size)\n\n        self.tp_norm = LayerNorm(self.output_size)\n\n    def forward(\n        self,\n        xs_pad: torch.Tensor,\n    ):\n        masks = None\n\n        xs_pad *= self.output_size**0.5\n\n        xs_pad = self.embed(xs_pad)\n\n        # forward encoder1\n        for layer_idx, encoder_layer in enumerate(self.encoders0):\n            encoder_outs = encoder_layer(xs_pad, masks)\n            xs_pad, masks = encoder_outs[0], encoder_outs[1]\n\n        for layer_idx, encoder_layer in enumerate(self.encoders):\n            encoder_outs = encoder_layer(xs_pad, masks)\n            xs_pad, masks = encoder_outs[0], encoder_outs[1]\n\n        xs_pad = self.after_norm(xs_pad)\n\n        for layer_idx, encoder_layer in enumerate(self.tp_encoders):\n          
  encoder_outs = encoder_layer(xs_pad, masks)\n            xs_pad, masks = encoder_outs[0], encoder_outs[1]\n\n        xs_pad = self.tp_norm(xs_pad)\n        return xs_pad\n\n\nclass CTC(nn.Module):\n    def __init__(\n        self,\n        odim: int,\n        encoder_output_size: int,\n        dropout_rate: float = 0.0,\n        ctc_type: str = \"builtin\",\n        reduce: bool = True,\n        ignore_nan_grad: bool = True,\n        extra_linear: bool = True,\n    ):\n        super().__init__()\n        eprojs = encoder_output_size\n        self.dropout_rate = dropout_rate\n\n        if extra_linear:\n            self.ctc_lo = torch.nn.Linear(eprojs, odim)\n        else:\n            self.ctc_lo = None\n\n    def softmax(self, hs_pad):\n        \"\"\"softmax of frame activations\n\n        Args:\n            Tensor hs_pad: 3d tensor (B, Tmax, eprojs)\n        Returns:\n            torch.Tensor: softmax applied 3d tensor (B, Tmax, odim)\n        \"\"\"\n        if self.ctc_lo is not None:\n            return F.softmax(self.ctc_lo(hs_pad), dim=2)\n        else:\n            return F.softmax(hs_pad, dim=2)\n\n    def log_softmax(self, hs_pad):\n        \"\"\"log_softmax of frame activations\n\n        Args:\n            Tensor hs_pad: 3d tensor (B, Tmax, eprojs)\n        Returns:\n            torch.Tensor: log softmax applied 3d tensor (B, Tmax, odim)\n        \"\"\"\n        if self.ctc_lo is not None:\n            return F.log_softmax(self.ctc_lo(hs_pad), dim=2)\n        else:\n            return F.log_softmax(hs_pad, dim=2)\n\n    def argmax(self, hs_pad):\n        \"\"\"argmax of frame activations\n\n        Args:\n            torch.Tensor hs_pad: 3d tensor (B, Tmax, eprojs)\n        Returns:\n            torch.Tensor: argmax applied 2d tensor (B, Tmax)\n        \"\"\"\n        if self.ctc_lo is not None:\n            return torch.argmax(self.ctc_lo(hs_pad), dim=2)\n        else:\n            return torch.argmax(hs_pad, dim=2)\n\n\nclass 
SenseVoiceSmall(nn.Module):\n    def __init__(self, neg_mean: torch.Tensor, inv_stddev: torch.Tensor):\n        super().__init__()\n        self.sos = 1\n        self.eos = 2\n        self.length_normalized_loss = True\n        self.ignore_id = -1\n        self.blank_id = 0\n        self.input_size = 80 * 7\n        self.vocab_size = 25055\n\n        self.neg_mean = neg_mean.unsqueeze(0).unsqueeze(0)\n        self.inv_stddev = inv_stddev.unsqueeze(0).unsqueeze(0)\n\n        self.lid_dict = {\n            \"auto\": 0,\n            \"zh\": 3,\n            \"en\": 4,\n            \"yue\": 7,\n            \"ja\": 11,\n            \"ko\": 12,\n            \"nospeech\": 13,\n        }\n        self.lid_int_dict = {\n            24884: 3,\n            24885: 4,\n            24888: 7,\n            24892: 11,\n            24896: 12,\n            24992: 13,\n        }\n        self.textnorm_dict = {\"withitn\": 14, \"woitn\": 15}\n        self.textnorm_int_dict = {25016: 14, 25017: 15}\n\n        self.emo_dict = {\n            \"unk\": 25009,\n            \"happy\": 25001,\n            \"sad\": 25002,\n            \"angry\": 25003,\n            \"neutral\": 25004,\n        }\n\n        self.encoder = SenseVoiceEncoderSmall()\n        self.ctc = CTC(\n            odim=self.vocab_size,\n            encoder_output_size=self.encoder.output_size,\n        )\n        self.embed = torch.nn.Embedding(\n            7 + len(self.lid_dict) + len(self.textnorm_dict), self.input_size\n        )\n\n    def forward(self, x, prompt):\n        input_query = self.embed(prompt).unsqueeze(0)\n\n        # for export, we always assume x and self.neg_mean are on CPU\n        x = (x + self.neg_mean) * self.inv_stddev\n        x = torch.cat((input_query, x), dim=1)\n\n        encoder_out = self.encoder(x)\n        logits = self.ctc.ctc_lo(encoder_out)\n\n        return logits\n"
  },
  {
    "path": "scripts/sense-voice/show-info.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\n\n\ndef show(filename):\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n    meta = sess.get_modelmeta().custom_metadata_map\n    print(\"*****************************************\")\n    print(\"meta\\n\", meta)\n\n\ndef main():\n    print(\"=========model==========\")\n    show(\"./model.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n\"\"\"\n=========model==========\nNodeArg(name='x', type='tensor(float)', shape=['N', 'T', 560])\nNodeArg(name='x_length', type='tensor(int32)', shape=['N'])\nNodeArg(name='language', type='tensor(int32)', shape=['N'])\nNodeArg(name='text_norm', type='tensor(int32)', shape=['N'])\n-----\nNodeArg(name='logits', type='tensor(float)', shape=['N', 'T', 25055])\n*****************************************\n\"\"\"\n"
  },
  {
    "path": "scripts/sense-voice/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--wave\",\n        type=str,\n        required=True,\n        help=\"The input wave to be recognized\",\n    )\n\n    parser.add_argument(\n        \"--language\",\n        type=str,\n        default=\"auto\",\n        help=\"the language of the input wav file. Supported values: zh, en, ja, ko, yue, auto\",\n    )\n\n    parser.add_argument(\n        \"--use-itn\",\n        type=int,\n        default=0,\n        help=\"1 to use inverse text normalization. 
0 to not use inverse text normalization\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n\n        self.window_size = int(meta[\"lfr_window_size\"])  # lfr_m\n        self.window_shift = int(meta[\"lfr_window_shift\"])  # lfr_n\n\n        lang_zh = int(meta[\"lang_zh\"])\n        lang_en = int(meta[\"lang_en\"])\n        lang_ja = int(meta[\"lang_ja\"])\n        lang_ko = int(meta[\"lang_ko\"])\n        lang_auto = int(meta[\"lang_auto\"])\n\n        self.lang_id = {\n            \"zh\": lang_zh,\n            \"en\": lang_en,\n            \"ja\": lang_ja,\n            \"ko\": lang_ko,\n            \"auto\": lang_auto,\n        }\n        self.with_itn = int(meta[\"with_itn\"])\n        self.without_itn = int(meta[\"without_itn\"])\n\n        neg_mean = meta[\"neg_mean\"].split(\",\")\n        neg_mean = list(map(lambda x: float(x), neg_mean))\n\n        inv_stddev = meta[\"inv_stddev\"].split(\",\")\n        inv_stddev = list(map(lambda x: float(x), inv_stddev))\n\n        self.neg_mean = np.array(neg_mean, dtype=np.float32)\n        self.inv_stddev = np.array(inv_stddev, dtype=np.float32)\n\n    def __call__(self, x, x_length, language, text_norm):\n        logits = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: x_length.numpy(),\n                self.model.get_inputs()[2].name: language.numpy(),\n                
self.model.get_inputs()[3].name: text_norm.numpy(),\n            },\n        )[0]\n\n        return torch.from_numpy(logits)\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    neg_mean: np.ndarray,\n    inv_stddev: np.ndarray,\n    window_size: int = 7,  # lfr_m\n    window_shift: int = 6,  # lfr_n\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"hamming\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, (samples * 32768).tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    T = (features.shape[0] - window_size) // window_shift + 1\n    features = np.lib.stride_tricks.as_strided(\n        features,\n        shape=(T, features.shape[1] * window_size),\n        strides=((window_shift * features.shape[1]) * 4, 4),\n    )\n\n    features = (features + neg_mean) * inv_stddev\n\n    return features\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    samples, sample_rate = load_audio(args.wave)\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, 
orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OnnxModel(filename=args.model)\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n        neg_mean=model.neg_mean,\n        inv_stddev=model.inv_stddev,\n        window_size=model.window_size,\n        window_shift=model.window_shift,\n    )\n\n    features = torch.from_numpy(features).unsqueeze(0)\n    features_length = torch.tensor([features.size(1)], dtype=torch.int32)\n\n    language = model.lang_id[\"auto\"]\n    if args.language in model.lang_id:\n        language = model.lang_id[args.language]\n    else:\n        print(f\"Invalid language: '{args.language}'\")\n        print(\"Use auto\")\n\n    if args.use_itn:\n        text_norm = model.with_itn\n    else:\n        text_norm = model.without_itn\n\n    language = torch.tensor([language], dtype=torch.int32)\n    text_norm = torch.tensor([text_norm], dtype=torch.int32)\n\n    logits = model(\n        x=features,\n        x_length=features_length,\n        language=language,\n        text_norm=text_norm,\n    )\n\n    idx = logits.squeeze(0).argmax(dim=-1)\n    # idx is of shape (T,)\n    idx = torch.unique_consecutive(idx)\n\n    blank_id = 0\n    idx = idx[idx != blank_id].tolist()\n\n    tokens = load_tokens(args.tokens)\n    text = \"\".join([tokens[i] for i in idx])\n\n    text = text.replace(\"▁\", \" \")\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/sense-voice/test_onnx_nano.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\n=========./model.onnx==========\nNodeArg(name='x', type='tensor(float)', shape=[1, 'T', 560])\n-----\nNodeArg(name='logits', type='tensor(float)', shape=['Addlogits_dim_0', 'Addlogits_dim_1', 60515])\n\n=========./model.int8.onnx==========\nNodeArg(name='x', type='tensor(float)', shape=[1, 'T', 560])\n-----\nNodeArg(name='logits', type='tensor(float)', shape=['Addlogits_dim_0', 'Addlogits_dim_1', 60515])\n\"\"\"\n\nimport argparse\nimport base64\nfrom typing import Tuple\n\nfrom test_onnx import compute_feat, load_audio\n\nimport onnxruntime as ort\nimport librosa\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--wave\",\n        type=str,\n        required=True,\n        help=\"The input wave to be recognized\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n\n        self.window_size = int(meta[\"lfr_window_size\"])  # lfr_m\n        self.window_shift = int(meta[\"lfr_window_shift\"])  # lfr_n\n        self.blank_id = int(meta[\"blank_id\"])\n\n    def __call__(self, x):\n        logits = 
self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n\n        return logits\n\n\ndef load_tokens(filename: str):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    samples, sample_rate = load_audio(args.wave)\n    if sample_rate != 16000:\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OnnxModel(filename=args.model)\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n        window_size=model.window_size,\n        window_shift=model.window_shift,\n    )\n\n    logits = model(\n        x=features[None],\n    )\n\n    idx = logits[0].argmax(axis=-1)\n    print(\"initial ids\", idx)\n    id2token = load_tokens(args.tokens)\n    blank_id = model.blank_id\n    print(\"blank_id\", blank_id)\n\n    unique_ids = []\n    prev = -1\n    for i in idx:\n        if i == prev:\n            continue\n        unique_ids.append(i)\n        prev = i\n    print(\"unique_ids\", unique_ids)\n\n    ids = [i for i in unique_ids if i != blank_id]\n\n    print(\"ids without blank\", ids)\n    s = b\"\"\n    for i in ids:\n        s += base64.b64decode(id2token[i])\n\n    text = s.decode().strip()\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/silero_vad/v4/README.md",
    "content": "# Introduction\n\nThis folder contains script for exporting\n[silero_vad v4](https://github.com/snakers4/silero-vad/tree/v4.0)\nto rknn.\n\n# Steps to run\n\n## 1. Download a jit model\nYou can download it from <https://github.com/snakers4/silero-vad/blob/v4.0/files/silero_vad.jit>\n\n```bash\nwget https://github.com/snakers4/silero-vad/raw/refs/tags/v4.0/files/silero_vad.jit\n```\n\n```bash\nls -lh silero_vad.jit\n-rw-r--r-- 1 kuangfangjun root 1.4M Mar 30 11:04 silero_vad.jit\n```\n\n## 2. Export it to onnx\n```bash\n./export-onnx.py\n```\n\nIt will generate a file `./m.onnx`\n\n```bash\n ls -lh m.onnx\n-rw-r--r-- 1 kuangfangjun root 627K Mar 30 11:13 m.onnx\n```\n\n## 3. Test the onnx model\n\n```bash\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav\n./test-onnx.py  --model ./m.onnx --wav ./lei-jun-test.wav\n```\n\n## 4. Convert the onnx model to RKNN format\n\nWe assume you have installed rknn toolkit 2.1\n```bash\n./export-rknn.py --in-model ./m.onnx --out-model m.rknn  --target-platform rk3588\n```\n\nIt will generate a file `./m.rknn`\n\n```bash\nls -lh m.rknn\n-rw-r--r-- 1 kuangfangjun root 2.2M Mar 30 11:19 m.rknn\n```\n"
  },
  {
    "path": "scripts/silero_vad/v4/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnx\nimport torch\nfrom onnxsim import simplify\n\nimport torch\nfrom torch import Tensor\n\n\ndef simple_pad(x: Tensor, pad: int) -> Tensor:\n    #  _0 = torch.slice(torch.slice(torch.slice(x), 1), 2, 1, torch.add(1, pad))\n    _0 = x[:, :, 1 : 1 + pad]\n\n    left_pad = torch.flip(_0, [-1])\n    #  _1 = torch.slice(torch.slice(torch.slice(x), 1), 2, torch.sub(-1, pad), -1)\n\n    _1 = x[:, :, (-1 - pad) : -1]\n\n    right_pad = torch.flip(_1, [-1])\n    _2 = torch.cat([left_pad, x, right_pad], 2)\n    return _2\n\n\nclass MyModule(torch.nn.Module):\n    def __init__(self, m):\n        super().__init__()\n        self.m = m\n\n    def adaptive_normalization_forward(self, spect):\n        m = self.m._model.adaptive_normalization\n        _0 = simple_pad\n\n        # Note(fangjun): rknn uses fp16 by default, whose max value is 65504\n        # so we need to re-write the computation for spect0\n        #  spect0 = torch.log1p(torch.mul(spect, 1048576))\n        spect0 = torch.log1p(spect) + 13.86294\n\n        _1 = torch.eq(len(spect0.shape), 2)\n        if _1:\n            _2 = torch.unsqueeze(spect0, 0)\n            spect1 = _2\n        else:\n            spect1 = spect0\n        mean = torch.mean(spect1, [1], True)\n        to_pad = m.to_pad\n        mean0 = _0(\n            mean,\n            to_pad,\n        )\n        filter_ = m.filter_\n        mean1 = torch.conv1d(mean0, filter_)\n        mean_mean = torch.mean(mean1, [-1], True)\n        spect2 = torch.add(spect1, torch.neg(mean_mean))\n        return spect2\n\n    def forward(self, x: torch.Tensor, h: torch.Tensor, c: torch.Tensor):\n        m = self.m._model\n\n        feature_extractor = m.feature_extractor\n        x0 = (feature_extractor).forward(\n            x,\n        )\n        norm = self.adaptive_normalization_forward(x0)\n        x1 = torch.cat([x0, norm], 1)\n        first_layer 
= m.first_layer\n        x2 = (first_layer).forward(\n            x1,\n        )\n        encoder = m.encoder\n        x3 = (encoder).forward(\n            x2,\n        )\n        decoder = m.decoder\n        x4, h0, c0, = (decoder).forward(\n            x3,\n            h,\n            c,\n        )\n        _0 = torch.mean(torch.squeeze(x4, 1), [1])\n        out = torch.unsqueeze(_0, 1)\n        return (out, h0, c0)\n\n\n@torch.no_grad()\ndef main():\n    m = torch.jit.load(\"./silero_vad.jit\")\n    m = MyModule(m)\n    x = torch.rand((1, 512), dtype=torch.float32)\n    h = torch.rand((2, 1, 64), dtype=torch.float32)\n    c = torch.rand((2, 1, 64), dtype=torch.float32)\n    m = torch.jit.script(m)\n    torch.onnx.export(\n        m,\n        (x, h, c),\n        \"m.onnx\",\n        input_names=[\"x\", \"h\", \"c\"],\n        output_names=[\"prob\", \"next_h\", \"next_c\"],\n    )\n\n    print(\"simplifying ...\")\n    model = onnx.load(\"m.onnx\")\n\n    meta_data = {\n        \"model_type\": \"silero-vad-v4\",\n        \"sample_rate\": 16000,\n        \"version\": 4,\n        \"h_shape\": \"2,1,64\",\n        \"c_shape\": \"2,1,64\",\n    }\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n    print(model.metadata_props)\n\n    model_simp, check = simplify(model)\n    onnx.save(model_simp, \"m.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/silero_vad/v4/export-rknn.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nimport argparse\nimport logging\nfrom pathlib import Path\n\nfrom rknn.api import RKNN\n\nlogging.basicConfig(level=logging.WARNING)\n\ng_platforms = [\n    #  \"rv1103\",\n    #  \"rv1103b\",\n    #  \"rv1106\",\n    #  \"rk2118\",\n    \"rk3562\",\n    \"rk3566\",\n    \"rk3568\",\n    \"rk3576\",\n    \"rk3588\",\n]\n\n\ndef get_parser():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--target-platform\",\n        type=str,\n        required=True,\n        help=f\"Supported values are: {','.join(g_platforms)}\",\n    )\n\n    parser.add_argument(\n        \"--in-model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model\",\n    )\n\n    parser.add_argument(\n        \"--out-model\",\n        type=str,\n        required=True,\n        help=\"Path to the output rknn model\",\n    )\n\n    return parser\n\n\ndef get_meta_data(model: str):\n    import onnxruntime\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.inter_op_num_threads = 1\n    session_opts.intra_op_num_threads = 1\n\n    m = onnxruntime.InferenceSession(\n        model,\n        sess_options=session_opts,\n        providers=[\"CPUExecutionProvider\"],\n    )\n\n    for i in m.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in m.get_outputs():\n        print(i)\n    print()\n\n    meta = m.get_modelmeta().custom_metadata_map\n    s = \"\"\n    sep = \"\"\n    for key, value in meta.items():\n        s = s + sep + f\"{key}={value}\"\n        sep = \";\"\n    assert len(s) < 1024\n\n    return s\n\n\ndef export_rknn(rknn, filename):\n    ret = rknn.export_rknn(filename)\n    if ret != 0:\n        exit(\"Export rknn model to {filename} failed!\")\n\n\ndef init_model(filename: str, target_platform: str, 
custom_string=None):\n    rknn = RKNN(verbose=False)\n\n    rknn.config(\n        optimization_level=0,\n        target_platform=target_platform,\n        custom_string=custom_string,\n    )\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    ret = rknn.load_onnx(model=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn.build(do_quantization=False)\n    if ret != 0:\n        exit(f\"Build model {filename} failed!\")\n\n    return rknn\n\n\nclass RKNNModel:\n    def __init__(\n        self,\n        model: str,\n        target_platform: str,\n    ):\n        meta = get_meta_data(model)\n        print(meta)\n\n        self.model = init_model(\n            model,\n            target_platform=target_platform,\n            custom_string=meta,\n        )\n\n    def export_rknn(self, model):\n        export_rknn(self.model, model)\n\n    def release(self):\n        self.model.release()\n\n\ndef main():\n    args = get_parser().parse_args()\n    print(vars(args))\n\n    model = RKNNModel(\n        model=args.in_model,\n        target_platform=args.target_platform,\n    )\n\n    model.export_rknn(\n        model=args.out_model,\n    )\n\n    model.release()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/silero_vad/v4/show.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\nimport onnx\n\n\"\"\"\n[key: \"model_type\"\nvalue: \"silero-vad-v4\"\n, key: \"sample_rate\"\nvalue: \"16000\"\n, key: \"version\"\nvalue: \"4\"\n, key: \"h_shape\"\nvalue: \"2,1,64\"\n, key: \"c_shape\"\nvalue: \"2,1,64\"\n]\nNodeArg(name='x', type='tensor(float)', shape=[1, 512])\nNodeArg(name='h', type='tensor(float)', shape=[2, 1, 64])\nNodeArg(name='c', type='tensor(float)', shape=[2, 1, 64])\n-----\nNodeArg(name='prob', type='tensor(float)', shape=[1, 1])\nNodeArg(name='next_h', type='tensor(float)', shape=[2, 1, 64])\nNodeArg(name='next_c', type='tensor(float)', shape=[2, 1, 64])\n\"\"\"\n\n\ndef show(filename):\n    model = onnx.load(filename)\n    print(model.metadata_props)\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(\n        filename, session_opts, providers=[\"CPUExecutionProvider\"]\n    )\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    show(\"./m.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/silero_vad/v4/test-on-rk3588-board.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# Please run this file on your rk3588 board\n\ntry:\n    from rknnlite.api import RKNNLite\nexcept:\n    print(\"Please run this file on your board (linux + aarch64 + npu)\")\n    print(\"You need to install rknn_toolkit_lite2\")\n    print(\n        \" from https://github.com/airockchip/rknn-toolkit2/tree/master/rknn-toolkit-lite2/packages\"\n    )\n    print(\n        \"https://github.com/airockchip/rknn-toolkit2/blob/v2.1.0/rknn-toolkit-lite2/packages/rknn_toolkit_lite2-2.1.0-cp310-cp310-linux_aarch64.whl\"\n    )\n    print(\"is known to work\")\n    raise\n\nimport time\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport soundfile as sf\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef init_model(filename, target_platform=\"rk3588\"):\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    rknn_lite = RKNNLite(verbose=False)\n    ret = rknn_lite.load_rknn(path=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_0)\n    if ret != 0:\n        exit(f\"Failed to init rknn runtime for {filename}\")\n    return rknn_lite\n\n\nclass RKNNModel:\n    def __init__(self, model: str, target_platform=\"rk3588\"):\n        self.model = init_model(model)\n\n    def release(self):\n        self.model.release()\n\n    def __call__(self, x: np.ndarray, h: np.ndarray, c: np.ndarray):\n        \"\"\"\n        Args:\n          x: (1, 512), np.float32\n          h: (2, 1, 64), np.float32\n          c: (2, 1, 64), np.float32\n        Returns:\n          prob:\n     
     next_h:\n          next_c\n        \"\"\"\n        out, next_h, next_c = self.model.inference(inputs=[x, h, c])\n        return out.item(), next_h, next_c\n\n\ndef main():\n    model = RKNNModel(model=\"./m.rknn\")\n    for i in range(1):\n        test(model)\n\n\ndef test(model):\n    print(\"started\")\n    start = time.time()\n    samples, sample_rate = load_audio(\"./lei-jun-test.wav\")\n    assert sample_rate == 16000, sample_rate\n\n    window_size = 512\n\n    h = np.zeros((2, 1, 64), dtype=np.float32)\n    c = np.zeros((2, 1, 64), dtype=np.float32)\n\n    threshold = 0.5\n    num_windows = samples.shape[0] // window_size\n    out = []\n    for i in range(num_windows):\n        print(i, num_windows)\n        this_samples = samples[i * window_size : (i + 1) * window_size]\n        prob, h, c = model(this_samples[None], h, c)\n        out.append(prob > threshold)\n\n    min_speech_duration = 0.25 * sample_rate / window_size\n    min_silence_duration = 0.25 * sample_rate / window_size\n\n    result = []\n    last = -1\n    for k, f in enumerate(out):\n        if f >= threshold:\n            if last == -1:\n                last = k\n        elif last != -1:\n            if k - last > min_speech_duration:\n                result.append((last, k))\n            last = -1\n\n    if last != -1 and k - last > min_speech_duration:\n        result.append((last, k))\n\n    if not result:\n        print(\"Empty for ./lei-jun-test.wav\")\n        return\n\n    print(result)\n\n    final = [result[0]]\n    for r in result[1:]:\n        f = final[-1]\n        if r[0] - f[1] < min_silence_duration:\n            final[-1] = (f[0], r[1])\n        else:\n            final.append(r)\n\n    for f in final:\n        start = f[0] * window_size / sample_rate\n        end = f[1] * window_size / sample_rate\n        print(\"{:.3f} -- {:.3f}\".format(start, end))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/silero_vad/v4/test-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport onnxruntime as ort\nimport argparse\nimport soundfile as sf\nfrom typing import Tuple\nimport numpy as np\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the onnx model\",\n    )\n\n    parser.add_argument(\n        \"--wav\",\n        type=str,\n        required=True,\n        help=\"Path to the input wav\",\n    )\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        model: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n        self.model = ort.InferenceSession(\n            model,\n            sess_options=session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def get_init_states(self):\n        h = np.zeros((2, 1, 64), dtype=np.float32)\n        c = np.zeros((2, 1, 64), dtype=np.float32)\n        return h, c\n\n    def __call__(self, x, h, c):\n        \"\"\"\n        Args:\n          x: (1, 512)\n          h: (2, 1, 64)\n          c: (2, 1, 64)\n        Returns:\n          prob: (1, 1)\n          next_h: (2, 1, 64)\n          next_c: (2, 1, 64)\n        \"\"\"\n        x = x[None]\n        out, next_h, next_c = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: h,\n                self.model.get_inputs()[2].name: c,\n            },\n        )\n        return out, next_h, next_c\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        
filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef main():\n    args = get_args()\n\n    samples, sample_rate = load_audio(args.wav)\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    model = OnnxModel(args.model)\n    probs = []\n    h, c = model.get_init_states()\n    window_size = 512\n    num_windows = samples.shape[0] // window_size\n\n    for i in range(num_windows):\n        start = i * window_size\n        end = start + window_size\n\n        p, h, c = model(samples[start:end], h, c)\n\n        probs.append(p[0].item())\n\n    threshold = 0.5\n    out = np.array(probs) > threshold\n    out = out.tolist()\n    min_speech_duration = 0.25 * sample_rate / window_size\n    min_silence_duration = 0.25 * sample_rate / window_size\n\n    result = []\n    last = -1\n    for k, f in enumerate(out):\n        if f >= threshold:\n            if last == -1:\n                last = k\n        elif last != -1:\n            if k - last > min_speech_duration:\n                result.append((last, k))\n            last = -1\n\n    if last != -1 and k - last > min_speech_duration:\n        result.append((last, k))\n\n    if not result:\n        print(f\"Empty for {args.wav}\")\n        return\n\n    print(result)\n\n    final = [result[0]]\n    for r in result[1:]:\n        f = final[-1]\n        if r[0] - f[1] < min_silence_duration:\n            final[-1] = (f[0], r[1])\n        else:\n            final.append(r)\n\n    for f in final:\n        start = f[0] * window_size / sample_rate\n        end = f[1] * window_size / sample_rate\n        print(\"{:.3f} -- {:.3f}\".format(start, end))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/spleeter/.gitignore",
    "content": "2stems.tar.gz\n2stems\n"
  },
  {
    "path": "scripts/spleeter/__init__.py",
    "content": ""
  },
  {
    "path": "scripts/spleeter/convert_to_pb.py",
    "content": "#!/usr/bin/env python3\n\n# Code in this file is modified from\n# https://blog.metaflow.fr/tensorflow-how-to-freeze-a-model-and-serve-it-with-a-python-api-d4f3596b3adc\n#\n# Please see ./run.sh for usages\nimport argparse\n\nimport tensorflow as tf\n\n\ndef freeze_graph(model_dir, output_node_names, output_filename):\n    \"\"\"Extract the sub graph defined by the output nodes and convert all its\n    variables into constant\n\n    Args:\n      model_dir:\n        the root folder containing the checkpoint state file\n      output_node_names:\n        a string, containing all the output node's names, comma separated\n      output_filename:\n        Filename to save the graph.\n    \"\"\"\n    if not tf.compat.v1.gfile.Exists(model_dir):\n        raise AssertionError(\n            \"Export directory doesn't exists. Please specify an export \"\n            \"directory: %s\" % model_dir\n        )\n\n    if not output_node_names:\n        print(\"You need to supply the name of a node to --output_node_names.\")\n        return -1\n\n    # We retrieve our checkpoint fullpath\n    checkpoint = tf.train.get_checkpoint_state(model_dir)\n    input_checkpoint = checkpoint.model_checkpoint_path\n\n    # We precise the file fullname of our freezed graph\n    output_graph = output_filename\n\n    # We clear devices to allow TensorFlow to control on which device it will load operations\n    clear_devices = True\n\n    # We start a session using a temporary fresh Graph\n    with tf.compat.v1.Session(graph=tf.Graph()) as sess:\n        # We import the meta graph in the current default Graph\n        saver = tf.compat.v1.train.import_meta_graph(\n            input_checkpoint + \".meta\", clear_devices=clear_devices\n        )\n\n        # We restore the weights\n        saver.restore(sess, input_checkpoint)\n\n        # We use a built-in TF helper to export variables to constants\n        output_graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(\n    
        sess,  # The session is used to retrieve the weights\n            tf.compat.v1.get_default_graph().as_graph_def(),  # The graph_def is used to retrieve the nodes\n            output_node_names.split(\n                \",\"\n            ),  # The output node names are used to select the useful nodes\n        )\n\n        # Finally we serialize and dump the output graph to the filesystem\n        with tf.compat.v1.gfile.GFile(output_graph, \"wb\") as f:\n            f.write(output_graph_def.SerializeToString())\n        print(\"%d ops in the final graph.\" % len(output_graph_def.node))\n\n    return output_graph_def\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model-dir\", type=str, default=\"\", help=\"Model folder to export\"\n    )\n    parser.add_argument(\n        \"--output-node-names\",\n        type=str,\n        default=\"vocals_spectrogram/mul,accompaniment_spectrogram/mul\",\n        help=\"The name of the output nodes, comma separated.\",\n    )\n\n    parser.add_argument(\n        \"--output-filename\",\n        type=str,\n    )\n    args = parser.parse_args()\n\n    freeze_graph(args.model_dir, args.output_node_names, args.output_filename)\n"
  },
  {
    "path": "scripts/spleeter/convert_to_torch.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# Please see ./run.sh for usage\n\nimport argparse\n\nimport numpy as np\nimport tensorflow as tf\nimport torch\n\nfrom unet import UNet\n\n\ndef load_graph(frozen_graph_filename):\n    # This function is modified from\n    # https://blog.metaflow.fr/tensorflow-how-to-freeze-a-model-and-serve-it-with-a-python-api-d4f3596b3adc\n\n    # We load the protobuf file from the disk and parse it to retrieve the\n    # unserialized graph_def\n    with tf.compat.v1.gfile.GFile(frozen_graph_filename, \"rb\") as f:\n        graph_def = tf.compat.v1.GraphDef()\n        graph_def.ParseFromString(f.read())\n\n    # Then, we import the graph_def into a new Graph and returns it\n    with tf.Graph().as_default() as graph:\n        # The name var will prefix every op/nodes in your graph\n        # Since we load everything in a new graph, this is not needed\n        #  tf.import_graph_def(graph_def, name=\"prefix\")\n        tf.import_graph_def(graph_def, name=\"\")\n    return graph\n\n\ndef generate_waveform():\n    np.random.seed(20230821)\n    waveform = np.random.rand(60 * 44100).astype(np.float32)\n\n    # (num_samples, num_channels)\n    waveform = waveform.reshape(-1, 2)\n    return waveform\n\n\ndef get_param(graph, name):\n    with tf.compat.v1.Session(graph=graph) as sess:\n        constant_ops = [op for op in sess.graph.get_operations() if op.type == \"Const\"]\n        for constant_op in constant_ops:\n            if constant_op.name != name:\n                continue\n\n            value = sess.run(constant_op.outputs[0])\n            return torch.from_numpy(value)\n\n\n@torch.no_grad()\ndef main(name):\n    graph = load_graph(f\"./2stems/frozen_{name}_model.pb\")\n    #  for op in graph.get_operations():\n    #      print(op.name)\n    x = graph.get_tensor_by_name(\"waveform:0\")\n    #  y = graph.get_tensor_by_name(\"Reshape:0\")\n    y0 = 
graph.get_tensor_by_name(\"strided_slice_3:0\")\n    #  y1 = graph.get_tensor_by_name(\"leaky_re_lu_5/LeakyRelu:0\")\n    #  y1 = graph.get_tensor_by_name(\"conv2d_5/BiasAdd:0\")\n    #  y1 = graph.get_tensor_by_name(\"conv2d_transpose/BiasAdd:0\")\n    #  y1 = graph.get_tensor_by_name(\"re_lu/Relu:0\")\n    #  y1 = graph.get_tensor_by_name(\"batch_normalization_6/cond/FusedBatchNorm_1:0\")\n    #  y1 = graph.get_tensor_by_name(\"concatenate/concat:0\")\n    #  y1 = graph.get_tensor_by_name(\"concatenate_1/concat:0\")\n    #  y1 = graph.get_tensor_by_name(\"concatenate_4/concat:0\")\n    #  y1 = graph.get_tensor_by_name(\"batch_normalization_11/cond/FusedBatchNorm_1:0\")\n    #  y1 = graph.get_tensor_by_name(\"conv2d_6/Sigmoid:0\")\n    y1 = graph.get_tensor_by_name(f\"{name}_spectrogram/mul:0\")\n\n    unet = UNet()\n    unet.eval()\n\n    # For the conv2d in tensorflow, weight shape is (kernel_h, kernel_w, in_channel, out_channel)\n    # default input shape is NHWC\n\n    # For the conv2d in torch, weight shape is (out_channel, in_channel, kernel_h, kernel_w)\n    # default input shape is NCHW\n    state_dict = unet.state_dict()\n    #  print(list(state_dict.keys()))\n\n    if name == \"vocals\":\n        state_dict[\"conv.weight\"] = get_param(graph, \"conv2d/kernel\").permute(\n            3, 2, 0, 1\n        )\n        state_dict[\"conv.bias\"] = get_param(graph, \"conv2d/bias\")\n\n        state_dict[\"bn.weight\"] = get_param(graph, \"batch_normalization/gamma\")\n        state_dict[\"bn.bias\"] = get_param(graph, \"batch_normalization/beta\")\n        state_dict[\"bn.running_mean\"] = get_param(\n            graph, \"batch_normalization/moving_mean\"\n        )\n        state_dict[\"bn.running_var\"] = get_param(\n            graph, \"batch_normalization/moving_variance\"\n        )\n\n        conv_offset = 0\n        bn_offset = 0\n    else:\n        state_dict[\"conv.weight\"] = get_param(graph, \"conv2d_7/kernel\").permute(\n            3, 2, 0, 1\n      
  )\n        state_dict[\"conv.bias\"] = get_param(graph, \"conv2d_7/bias\")\n\n        state_dict[\"bn.weight\"] = get_param(graph, \"batch_normalization_12/gamma\")\n        state_dict[\"bn.bias\"] = get_param(graph, \"batch_normalization_12/beta\")\n        state_dict[\"bn.running_mean\"] = get_param(\n            graph, \"batch_normalization_12/moving_mean\"\n        )\n        state_dict[\"bn.running_var\"] = get_param(\n            graph, \"batch_normalization_12/moving_variance\"\n        )\n        conv_offset = 7\n        bn_offset = 12\n\n    for i in range(1, 6):\n        state_dict[f\"conv{i}.weight\"] = get_param(\n            graph, f\"conv2d_{i+conv_offset}/kernel\"\n        ).permute(3, 2, 0, 1)\n        state_dict[f\"conv{i}.bias\"] = get_param(graph, f\"conv2d_{i+conv_offset}/bias\")\n        if i >= 5:\n            continue\n        state_dict[f\"bn{i}.weight\"] = get_param(\n            graph, f\"batch_normalization_{i+bn_offset}/gamma\"\n        )\n        state_dict[f\"bn{i}.bias\"] = get_param(\n            graph, f\"batch_normalization_{i+bn_offset}/beta\"\n        )\n        state_dict[f\"bn{i}.running_mean\"] = get_param(\n            graph, f\"batch_normalization_{i+bn_offset}/moving_mean\"\n        )\n        state_dict[f\"bn{i}.running_var\"] = get_param(\n            graph, f\"batch_normalization_{i+bn_offset}/moving_variance\"\n        )\n\n    if name == \"vocals\":\n        state_dict[\"up1.weight\"] = get_param(graph, \"conv2d_transpose/kernel\").permute(\n            3, 2, 0, 1\n        )\n        state_dict[\"up1.bias\"] = get_param(graph, \"conv2d_transpose/bias\")\n\n        state_dict[\"bn5.weight\"] = get_param(graph, \"batch_normalization_6/gamma\")\n        state_dict[\"bn5.bias\"] = get_param(graph, \"batch_normalization_6/beta\")\n        state_dict[\"bn5.running_mean\"] = get_param(\n            graph, \"batch_normalization_6/moving_mean\"\n        )\n        state_dict[\"bn5.running_var\"] = get_param(\n            
graph, \"batch_normalization_6/moving_variance\"\n        )\n        conv_offset = 0\n        bn_offset = 0\n    else:\n        state_dict[\"up1.weight\"] = get_param(\n            graph, \"conv2d_transpose_6/kernel\"\n        ).permute(3, 2, 0, 1)\n        state_dict[\"up1.bias\"] = get_param(graph, \"conv2d_transpose_6/bias\")\n\n        state_dict[\"bn5.weight\"] = get_param(graph, \"batch_normalization_18/gamma\")\n        state_dict[\"bn5.bias\"] = get_param(graph, \"batch_normalization_18/beta\")\n        state_dict[\"bn5.running_mean\"] = get_param(\n            graph, \"batch_normalization_18/moving_mean\"\n        )\n        state_dict[\"bn5.running_var\"] = get_param(\n            graph, \"batch_normalization_18/moving_variance\"\n        )\n        conv_offset = 6\n        bn_offset = 12\n\n    for i in range(1, 6):\n        state_dict[f\"up{i+1}.weight\"] = get_param(\n            graph, f\"conv2d_transpose_{i+conv_offset}/kernel\"\n        ).permute(3, 2, 0, 1)\n\n        state_dict[f\"up{i+1}.bias\"] = get_param(\n            graph, f\"conv2d_transpose_{i+conv_offset}/bias\"\n        )\n\n        state_dict[f\"bn{5+i}.weight\"] = get_param(\n            graph, f\"batch_normalization_{6+i+bn_offset}/gamma\"\n        )\n        state_dict[f\"bn{5+i}.bias\"] = get_param(\n            graph, f\"batch_normalization_{6+i+bn_offset}/beta\"\n        )\n        state_dict[f\"bn{5+i}.running_mean\"] = get_param(\n            graph, f\"batch_normalization_{6+i+bn_offset}/moving_mean\"\n        )\n        state_dict[f\"bn{5+i}.running_var\"] = get_param(\n            graph, f\"batch_normalization_{6+i+bn_offset}/moving_variance\"\n        )\n\n    if name == \"vocals\":\n        state_dict[\"up7.weight\"] = get_param(graph, \"conv2d_6/kernel\").permute(\n            3, 2, 0, 1\n        )\n        state_dict[\"up7.bias\"] = get_param(graph, \"conv2d_6/bias\")\n    else:\n        state_dict[\"up7.weight\"] = get_param(graph, \"conv2d_13/kernel\").permute(\n         
   3, 2, 0, 1\n        )\n        state_dict[\"up7.bias\"] = get_param(graph, \"conv2d_13/bias\")\n\n    unet.load_state_dict(state_dict)\n\n    with tf.compat.v1.Session(graph=graph) as sess:\n        y0_out, y1_out = sess.run([y0, y1], feed_dict={x: generate_waveform()})\n        #  y0_out = sess.run(y0, feed_dict={x: generate_waveform()})\n        #  y1_out = sess.run(y1, feed_dict={x: generate_waveform()})\n        #  print(y0_out.shape)\n        #  print(y1_out.shape)\n\n    # for the batchnormalization in tensorflow,\n    # default input shape is NHWC\n\n    # for the batchnormalization in torch,\n    # default input shape is NCHW\n\n    torch_y1_out = unet(torch.from_numpy(y0_out).permute(3, 0, 1, 2))\n    torch_y1_out = torch_y1_out.permute(1, 0, 2, 3)\n\n    #  print(torch_y1_out.shape, torch.from_numpy(y1_out).permute(0, 3, 1, 2).shape)\n    assert torch.allclose(\n        torch_y1_out, torch.from_numpy(y1_out).permute(0, 3, 1, 2), atol=1e-1\n    ), ((torch_y1_out - torch.from_numpy(y1_out).permute(0, 3, 1, 2)).abs().max())\n    torch.save(unet.state_dict(), f\"2stems/{name}.pt\")\n\n\nif __name__ == \"__main__\":\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--name\",\n        type=str,\n        required=True,\n        choices=[\"vocals\", \"accompaniment\"],\n    )\n    args = parser.parse_args()\n    print(vars(args))\n    main(args.name)\n"
  },
  {
    "path": "scripts/spleeter/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnx\nimport onnxmltools\nimport torch\nfrom onnxmltools.utils.float16_converter import convert_float_to_float16\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\nfrom unet import UNet\n\n\ndef export_onnx_fp16(onnx_fp32_path, onnx_fp16_path):\n    onnx_fp32_model = onnxmltools.utils.load_model(onnx_fp32_path)\n    onnx_fp16_model = convert_float_to_float16(onnx_fp32_model, keep_io_types=True)\n    onnxmltools.utils.save_model(onnx_fp16_model, onnx_fp16_path)\n\n\ndef add_meta_data(filename, prefix):\n    meta_data = {\n        \"model_type\": \"spleeter\",\n        \"sample_rate\": 41000,\n        \"version\": 1,\n        \"model_url\": \"https://github.com/deezer/spleeter\",\n        \"stems\": 2,\n        \"comment\": prefix,\n        \"model_name\": \"2stems.tar.gz\",\n    }\n    model = onnx.load(filename)\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, filename)\n\n\ndef export(model, prefix):\n    num_splits = 1\n    x = torch.rand(2, num_splits, 512, 1024, dtype=torch.float32)\n\n    filename = f\"./2stems/{prefix}.onnx\"\n    torch.onnx.export(\n        model,\n        x,\n        filename,\n        input_names=[\"x\"],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {1: \"num_splits\"},\n        },\n        opset_version=13,\n    )\n\n    add_meta_data(filename, prefix)\n\n    filename_int8 = f\"./2stems/{prefix}.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n    filename_fp16 = 
f\"./2stems/{prefix}.fp16.onnx\"\n    export_onnx_fp16(filename, filename_fp16)\n\n\n@torch.no_grad()\ndef main():\n    vocals = UNet()\n    state_dict = torch.load(\"./2stems/vocals.pt\", map_location=\"cpu\")\n    vocals.load_state_dict(state_dict)\n    vocals.eval()\n\n    accompaniment = UNet()\n    state_dict = torch.load(\"./2stems/accompaniment.pt\", map_location=\"cpu\")\n    accompaniment.load_state_dict(state_dict)\n    accompaniment.eval()\n\n    export(vocals, \"vocals\")\n    export(accompaniment, \"accompaniment\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/spleeter/separate.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# Please see ./run.sh for usage\n\nfrom typing import Optional\n\nimport ffmpeg\nimport numpy as np\nimport soundfile as sf\nimport torch\nfrom pydub import AudioSegment\n\nfrom unet import UNet\n\n\ndef load_audio(filename, sample_rate: Optional[int] = 44100):\n    probe = ffmpeg.probe(filename)\n    if \"streams\" not in probe or len(probe[\"streams\"]) == 0:\n        raise ValueError(\"No stream was found with ffprobe\")\n\n    metadata = next(\n        stream for stream in probe[\"streams\"] if stream[\"codec_type\"] == \"audio\"\n    )\n    n_channels = metadata[\"channels\"]\n\n    if sample_rate is None:\n        sample_rate = metadata[\"sample_rate\"]\n\n    process = (\n        ffmpeg.input(filename)\n        .output(\"pipe:\", format=\"f32le\", ar=sample_rate)\n        .run_async(pipe_stdout=True, pipe_stderr=True)\n    )\n    buffer, _ = process.communicate()\n    waveform = np.frombuffer(buffer, dtype=\"<f4\").reshape(-1, n_channels)\n\n    waveform = torch.from_numpy(np.copy(waveform)).to(torch.float32)\n    if n_channels == 1:\n        waveform = waveform.tile(1, 2)\n\n    if n_channels > 2:\n        waveform = waveform[:, :2]\n\n    return waveform, sample_rate\n\n\n@torch.no_grad()\ndef main():\n    vocals = UNet()\n    vocals.eval()\n    state_dict = torch.load(\"./2stems/vocals.pt\", map_location=\"cpu\")\n    vocals.load_state_dict(state_dict)\n\n    accompaniment = UNet()\n    accompaniment.eval()\n    state_dict = torch.load(\"./2stems/accompaniment.pt\", map_location=\"cpu\")\n    accompaniment.load_state_dict(state_dict)\n\n    #\n    #  waveform, sample_rate = load_audio(\"./audio_example.mp3\")\n\n    # You can download the following two mp3 from\n    # https://huggingface.co/spaces/csukuangfj/music-source-separation/tree/main/examples\n    waveform, sample_rate = load_audio(\"./qi-feng-le.mp3\")\n    #  waveform, sample_rate = 
load_audio(\"./Yesterday_Once_More-Carpenters.mp3\")\n    assert waveform.shape[1] == 2, waveform.shape\n\n    waveform = torch.nn.functional.pad(waveform, (0, 0, 0, 4096))\n\n    # torch.stft requires a 2-D input of shape (N, T), so we transpose waveform\n    stft = torch.stft(\n        waveform.t(),\n        n_fft=4096,\n        hop_length=1024,\n        window=torch.hann_window(4096, periodic=True),\n        center=False,\n        onesided=True,\n        return_complex=True,\n    )\n    print(\"stft\", stft.shape)\n\n    # stft: (2, 2049, 465)\n    # stft is a complex tensor\n    y = stft.permute(2, 1, 0)\n    print(\"y0\", y.shape)\n    # (465, 2049, 2)\n\n    y = y[:, :1024, :]\n    # (465, 1024, 2)\n\n    tensor_size = y.shape[0] - int(y.shape[0] / 512) * 512\n    pad_size = 512 - tensor_size\n    y = torch.nn.functional.pad(y, (0, 0, 0, 0, 0, pad_size))\n    # (512, 1024, 2)\n    print(\"y1\", y.shape, y.dtype)\n\n    num_splits = int(y.shape[0] / 512)\n    y = y.reshape([num_splits, 512] + list(y.shape[1:]))\n    # y: (1, 512, 1024, 2)\n    print(\"y2\", y.shape, y.dtype)\n\n    y = y.abs()\n\n    y = y.permute(3, 0, 1, 2)\n    # (2, 1, 512, 1024)\n    print(\"y3\", y.shape, y.dtype)\n\n    vocals_spec = vocals(y)\n    accompaniment_spec = accompaniment(y)\n\n    vocals_spec = vocals_spec.permute(1, 0, 2, 3)\n    accompaniment_spec = accompaniment_spec.permute(1, 0, 2, 3)\n\n    sum_spec = (vocals_spec**2 + accompaniment_spec**2) + 1e-10\n    print(\n        \"vocals_spec\",\n        vocals_spec.shape,\n        accompaniment_spec.shape,\n        sum_spec.shape,\n        vocals_spec.dtype,\n    )\n\n    vocals_spec = (vocals_spec**2 + 1e-10 / 2) / sum_spec\n    # (1, 2, 512, 1024)\n\n    accompaniment_spec = (accompaniment_spec**2 + 1e-10 / 2) / sum_spec\n    # (1, 2, 512, 1024)\n\n    for name, spec in zip(\n        [\"vocals\", \"accompaniment\"], [vocals_spec, accompaniment_spec]\n    ):\n        spec = torch.nn.functional.pad(spec, (0, 2049 - 1024, 0, 0, 
0, 0, 0, 0))\n        # (1, 2, 512, 2049)\n\n        spec = spec.permute(0, 2, 3, 1)\n        # (1, 512, 2049, 2)\n        print(\"here00\", spec.shape)\n\n        spec = spec.reshape(-1, spec.shape[2], spec.shape[3])\n        # (512, 2049, 2)\n\n        print(\"here2\", spec.shape)\n        # (512, 2049, 2)\n\n        spec = spec[: stft.shape[2], :, :]\n        # (465, 2049, 2)\n        print(\"here 3\", spec.shape, stft.shape)\n\n        spec = spec.permute(2, 1, 0)\n        # (2, 2049, 465)\n\n        masked_stft = spec * stft\n\n        wave = torch.istft(\n            masked_stft,\n            4096,\n            1024,\n            window=torch.hann_window(4096, periodic=True),\n            onesided=True,\n        ) * (2 / 3)\n\n        print(wave.shape, wave.dtype)\n        sf.write(f\"{name}.wav\", wave.t(), 44100)\n\n        wave = (wave.t() * 32768).to(torch.int16)\n        sound = AudioSegment(\n            data=wave.numpy().tobytes(), sample_width=2, frame_rate=44100, channels=2\n        )\n        sound.export(f\"{name}.mp3\", format=\"mp3\", bitrate=\"128k\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/spleeter/separate_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\nimport time\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\nfrom separate import load_audio\n\n\"\"\"\n----------inputs for ./2stems/vocals.onnx----------\nNodeArg(name='x', type='tensor(float)', shape=[2, 'num_splits', 512, 1024])\n----------outputs for ./2stems/vocals.onnx----------\nNodeArg(name='y', type='tensor(float)', shape=[2, 'Transposey_dim_1', 512, 1024])\n\n----------inputs for ./2stems/accompaniment.onnx----------\nNodeArg(name='x', type='tensor(float)', shape=[2, 'num_splits', 512, 1024])\n----------outputs for ./2stems/accompaniment.onnx----------\nNodeArg(name='y', type='tensor(float)', shape=[2, 'Transposey_dim_1', 512, 1024])\n\"\"\"\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(f\"----------inputs for {filename}----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(f\"----------outputs for {filename}----------\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print(\"--------------------\")\n\n    def __call__(self, x):\n        \"\"\"\n        Args:\n          x: (num_splits, 2, 512, 1024)\n        \"\"\"\n        spec = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n            },\n        )[0]\n\n        return torch.from_numpy(spec)\n\n\ndef main():\n    vocals = 
OnnxModel(\"./2stems/vocals.onnx\")\n    accompaniment = OnnxModel(\"./2stems/accompaniment.onnx\")\n\n    waveform, sample_rate = load_audio(\"./qi-feng-le.mp3\")\n    waveform = waveform[: 44100 * 10, :]\n\n    stft_config = knf.StftConfig(\n        n_fft=4096,\n        hop_length=1024,\n        win_length=4096,\n        center=False,\n        window_type=\"hann\",\n    )\n    knf_stft = knf.Stft(stft_config)\n    knf_istft = knf.IStft(stft_config)\n\n    start = time.time()\n\n    stft_result_c0 = knf_stft(waveform[:, 0].tolist())\n    stft_result_c1 = knf_stft(waveform[:, 1].tolist())\n    print(\"c0 stft\", stft_result_c0.num_frames)\n\n    orig_real0 = np.array(stft_result_c0.real, dtype=np.float32).reshape(\n        stft_result_c0.num_frames, -1\n    )\n    orig_imag0 = np.array(stft_result_c0.imag, dtype=np.float32).reshape(\n        stft_result_c0.num_frames, -1\n    )\n\n    orig_real1 = np.array(stft_result_c1.real, dtype=np.float32).reshape(\n        stft_result_c1.num_frames, -1\n    )\n    orig_imag1 = np.array(stft_result_c1.imag, dtype=np.float32).reshape(\n        stft_result_c1.num_frames, -1\n    )\n\n    real0 = torch.from_numpy(orig_real0)\n    imag0 = torch.from_numpy(orig_imag0)\n    real1 = torch.from_numpy(orig_real1)\n    imag1 = torch.from_numpy(orig_imag1)\n    # (num_frames, n_fft/2_1)\n    print(\"real0\", real0.shape)\n\n    # keep only the first 1024 bins\n    real0 = real0[:, :1024]\n    imag0 = imag0[:, :1024]\n    real1 = real1[:, :1024]\n    imag1 = imag1[:, :1024]\n\n    stft0 = (real0.square() + imag0.square()).sqrt()\n    stft1 = (real1.square() + imag1.square()).sqrt()\n\n    # pad it to multiple of 512\n    padding = 512 - real0.shape[0] % 512\n    print(\"padding\", padding)\n    if padding > 0:\n        stft0 = torch.nn.functional.pad(stft0, (0, 0, 0, padding))\n        stft1 = torch.nn.functional.pad(stft1, (0, 0, 0, padding))\n    stft0 = stft0.reshape(1, -1, 512, 1024)\n    stft1 = stft1.reshape(1, -1, 512, 1024)\n\n    
stft_01 = torch.cat([stft0, stft1], axis=0)\n\n    print(\"stft_01\", stft_01.shape, stft_01.dtype)\n\n    vocals_spec = vocals(stft_01)\n    accompaniment_spec = accompaniment(stft_01)\n    # (num_channels, num_splits, 512, 1024)\n\n    sum_spec = (vocals_spec.square() + accompaniment_spec.square()) + 1e-10\n\n    vocals_spec = (vocals_spec**2 + 1e-10 / 2) / sum_spec\n    accompaniment_spec = (accompaniment_spec**2 + 1e-10 / 2) / sum_spec\n\n    for name, spec in zip(\n        [\"vocals\", \"accompaniment\"], [vocals_spec, accompaniment_spec]\n    ):\n        spec_c0 = spec[0]\n        spec_c1 = spec[1]\n\n        spec_c0 = spec_c0.reshape(-1, 1024)\n        spec_c1 = spec_c1.reshape(-1, 1024)\n\n        spec_c0 = spec_c0[: stft_result_c0.num_frames, :]\n        spec_c1 = spec_c1[: stft_result_c0.num_frames, :]\n\n        spec_c0 = torch.nn.functional.pad(spec_c0, (0, 2049 - 1024, 0, 0))\n        spec_c1 = torch.nn.functional.pad(spec_c1, (0, 2049 - 1024, 0, 0))\n\n        spec_c0_real = spec_c0 * orig_real0\n        spec_c0_imag = spec_c0 * orig_imag0\n\n        spec_c1_real = spec_c1 * orig_real1\n        spec_c1_imag = spec_c1 * orig_imag1\n\n        result0 = knf.StftResult(\n            real=spec_c0_real.reshape(-1).tolist(),\n            imag=spec_c0_imag.reshape(-1).tolist(),\n            num_frames=orig_real0.shape[0],\n        )\n\n        result1 = knf.StftResult(\n            real=spec_c1_real.reshape(-1).tolist(),\n            imag=spec_c1_imag.reshape(-1).tolist(),\n            num_frames=orig_real1.shape[0],\n        )\n\n        wav0 = knf_istft(result0)\n        wav1 = knf_istft(result1)\n\n        wav = np.array([wav0, wav1], dtype=np.float32)\n        wav = np.transpose(wav)\n        # now wav is (num_samples, num_channels)\n\n        sf.write(f\"./onnx-{name}.wav\", wav, 44100)\n\n        print(f\"Saved to ./onnx-{name}.wav\")\n\n    end = time.time()\n    elapsed_seconds = end - start\n    audio_duration = waveform.shape[0] / sample_rate\n    
real_time_factor = elapsed_seconds / audio_duration\n\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/spleeter/unet.py",
    "content": "# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport torch\n\n\nclass UNet(torch.nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.conv = torch.nn.Conv2d(2, 16, kernel_size=5, stride=(2, 2), padding=0)\n        self.bn = torch.nn.BatchNorm2d(\n            16, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n        #\n        self.conv1 = torch.nn.Conv2d(16, 32, kernel_size=5, stride=(2, 2), padding=0)\n        self.bn1 = torch.nn.BatchNorm2d(\n            32, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.conv2 = torch.nn.Conv2d(32, 64, kernel_size=5, stride=(2, 2), padding=0)\n        self.bn2 = torch.nn.BatchNorm2d(\n            64, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.conv3 = torch.nn.Conv2d(64, 128, kernel_size=5, stride=(2, 2), padding=0)\n        self.bn3 = torch.nn.BatchNorm2d(\n            128, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.conv4 = torch.nn.Conv2d(128, 256, kernel_size=5, stride=(2, 2), padding=0)\n        self.bn4 = torch.nn.BatchNorm2d(\n            256, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.conv5 = torch.nn.Conv2d(256, 512, kernel_size=5, stride=(2, 2), padding=0)\n\n        self.up1 = torch.nn.ConvTranspose2d(512, 256, kernel_size=5, stride=2)\n        self.bn5 = torch.nn.BatchNorm2d(\n            256, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.up2 = torch.nn.ConvTranspose2d(512, 128, kernel_size=5, stride=2)\n        self.bn6 = torch.nn.BatchNorm2d(\n            128, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.up3 = torch.nn.ConvTranspose2d(256, 64, kernel_size=5, stride=2)\n        self.bn7 = torch.nn.BatchNorm2d(\n            64, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.up4 = torch.nn.ConvTranspose2d(128, 32, 
kernel_size=5, stride=2)\n        self.bn8 = torch.nn.BatchNorm2d(\n            32, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.up5 = torch.nn.ConvTranspose2d(64, 16, kernel_size=5, stride=2)\n        self.bn9 = torch.nn.BatchNorm2d(\n            16, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        self.up6 = torch.nn.ConvTranspose2d(32, 1, kernel_size=5, stride=2)\n        self.bn10 = torch.nn.BatchNorm2d(\n            1, track_running_stats=True, eps=1e-3, momentum=0.01\n        )\n\n        # output logit is False, so we need self.up7\n        self.up7 = torch.nn.Conv2d(1, 2, kernel_size=4, dilation=2, padding=3)\n\n    def forward(self, x):\n        \"\"\"\n        Args:\n          x: (num_audio_channels, num_splits, 512, 1024)\n        Returns:\n          y: (num_audio_channels, num_splits, 512, 1024)\n        \"\"\"\n        x = x.permute(1, 0, 2, 3)\n\n        in_x = x\n        # in_x is (3, 2, 512, 1024) = (T, 2, 512, 1024)\n        x = torch.nn.functional.pad(x, (1, 2, 1, 2), \"constant\", 0)\n        conv1 = self.conv(x)\n        batch1 = self.bn(conv1)\n        rel1 = torch.nn.functional.leaky_relu(batch1, negative_slope=0.2)\n\n        x = torch.nn.functional.pad(rel1, (1, 2, 1, 2), \"constant\", 0)\n        conv2 = self.conv1(x)  # (3, 32, 128, 256)\n        batch2 = self.bn1(conv2)\n        rel2 = torch.nn.functional.leaky_relu(\n            batch2, negative_slope=0.2\n        )  # (3, 32, 128, 256)\n\n        x = torch.nn.functional.pad(rel2, (1, 2, 1, 2), \"constant\", 0)\n        conv3 = self.conv2(x)  # (3, 64, 64, 128)\n        batch3 = self.bn2(conv3)\n        rel3 = torch.nn.functional.leaky_relu(\n            batch3, negative_slope=0.2\n        )  # (3, 64, 64, 128)\n\n        x = torch.nn.functional.pad(rel3, (1, 2, 1, 2), \"constant\", 0)\n        conv4 = self.conv3(x)  # (3, 128, 32, 64)\n        batch4 = self.bn3(conv4)\n        rel4 = torch.nn.functional.leaky_relu(\n            
batch4, negative_slope=0.2\n        )  # (3, 128, 32, 64)\n\n        x = torch.nn.functional.pad(rel4, (1, 2, 1, 2), \"constant\", 0)\n        conv5 = self.conv4(x)  # (3, 256, 16, 32)\n        batch5 = self.bn4(conv5)\n        rel6 = torch.nn.functional.leaky_relu(\n            batch5, negative_slope=0.2\n        )  # (3, 256, 16, 32)\n\n        x = torch.nn.functional.pad(rel6, (1, 2, 1, 2), \"constant\", 0)\n        conv6 = self.conv5(x)  # (3, 512, 8, 16)\n\n        up1 = self.up1(conv6)\n        up1 = up1[:, :, 1:-2, 1:-2]  # (3, 256, 16, 32)\n        up1 = torch.nn.functional.relu(up1)\n        batch7 = self.bn5(up1)\n        merge1 = torch.cat([conv5, batch7], axis=1)  # (3, 512, 16, 32)\n\n        up2 = self.up2(merge1)\n        up2 = up2[:, :, 1:-2, 1:-2]\n        up2 = torch.nn.functional.relu(up2)\n        batch8 = self.bn6(up2)\n\n        merge2 = torch.cat([conv4, batch8], axis=1)  # (3, 256, 32, 64)\n\n        up3 = self.up3(merge2)\n        up3 = up3[:, :, 1:-2, 1:-2]\n        up3 = torch.nn.functional.relu(up3)\n        batch9 = self.bn7(up3)\n\n        merge3 = torch.cat([conv3, batch9], axis=1)  # (3, 128, 64, 128)\n\n        up4 = self.up4(merge3)\n        up4 = up4[:, :, 1:-2, 1:-2]\n        up4 = torch.nn.functional.relu(up4)\n        batch10 = self.bn8(up4)\n\n        merge4 = torch.cat([conv2, batch10], axis=1)  # (3, 64, 128, 256)\n\n        up5 = self.up5(merge4)\n        up5 = up5[:, :, 1:-2, 1:-2]\n        up5 = torch.nn.functional.relu(up5)\n        batch11 = self.bn9(up5)\n\n        merge5 = torch.cat([conv1, batch11], axis=1)  # (3, 32, 256, 512)\n\n        up6 = self.up6(merge5)\n        up6 = up6[:, :, 1:-2, 1:-2]\n        up6 = torch.nn.functional.relu(up6)\n        batch12 = self.bn10(up6)  # (3, 1, 512, 1024)  = (T, 1, 512, 1024)\n\n        up7 = self.up7(batch12)\n        up7 = torch.sigmoid(up7)  # (3, 2, 512, 1024)\n\n        ans = up7 * in_x\n        return ans.permute(1, 0, 2, 3)\n"
  },
  {
    "path": "scripts/supertonic/README.md",
    "content": "# Supertonic TTS INT8 Quantization\n\nQuantize [Supertonic](https://github.com/supertone-inc/supertonic) TTS ONNX models to INT8 for on-device deployment.\n\n## Overview\n\n- **Pipeline**: `gen_calib_configs` → `dump_inputs` → `convert`; stage 4 generates **.bin** assets when JSONs exist: `generate_voices_bin.py`, `generate_indexer_bin.py`. Runtime loads **tts.json** for TTS config.\n- **Quantization**: duration_predictor, text_encoder, vector_estimator → dynamic INT8; vocoder → static INT8 (calibration from dumped data).\n- **Voice**: Runtime loads one **`voice.bin`**. Generate with `python3 generate_voices_bin.py [input_dir] [output_bin]`. Pass `--supertonic-voice-style=/path/to/voice.bin`. Use `--sid` 0..N-1 to select speaker.\n- **Unicode indexer**: Runtime uses **`unicode_indexer.bin`**. Generate with `python3 generate_indexer_bin.py [json_path] [bin_path]`. Pass `--supertonic-unicode-indexer=/path/to/unicode_indexer.bin`.\n- **TTS config**: Runtime loads **`tts.json`**. Pass `--supertonic-tts-json=/path/to/tts.json`.\n\n## Usage\n\n```bash\n./run.sh              # Run all stages (0–4)\n./run.sh 4            # Only generate voice.bin, unicode_indexer.bin\n```\n\n**Stages:** 0 = download models, 1 = gen calib configs, 2 = dump calib data, 3 = quantize, 4 = generate `voice.bin`, `unicode_indexer.bin`. \n"
  },
  {
    "path": "scripts/supertonic/convert.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2026 zengyw\n\n\"\"\"\nQuantize Supertonic TTS ONNX models (duration_predictor, text_encoder,\nvector_estimator, vocoder) to int8.\nSee also https://github.com/supertone-inc/supertonic\n\"\"\"\n\nimport argparse\nimport glob\nimport inspect\nimport os\nimport shutil\nimport tempfile\nfrom typing import Dict, List, Optional, Tuple\n\nimport numpy as np\nimport onnx\nfrom onnx import numpy_helper\nimport onnxruntime as ort\nfrom onnxruntime.quantization import (\n    CalibrationDataReader,\n    QuantFormat,\n    QuantType,\n    quantize_dynamic,\n    quantize_static,\n)\n\ntry:\n    from onnxruntime.quantization import CalibrationMethod\nexcept Exception:\n    CalibrationMethod = None\n\n_quant_pre_process = None\ntry:\n    from onnxruntime.quantization.shape_inference import quant_pre_process as _qpp\n    _quant_pre_process = _qpp\nexcept Exception:\n    try:\n        from onnxruntime.quantization import quant_pre_process as _qpp\n        _quant_pre_process = _qpp\n    except Exception:\n        _quant_pre_process = None\n\n\ndef ensure_graph_names(m: onnx.ModelProto) -> None:\n    def fix_graph(g: onnx.GraphProto, prefix: str) -> None:\n        if not g.name:\n            g.name = prefix\n        for node in g.node:\n            for attr in node.attribute:\n                if attr.type == onnx.AttributeProto.GRAPH and attr.g is not None:\n                    fix_graph(attr.g, f\"{prefix}_{node.name or node.op_type}_g\")\n                elif attr.type == onnx.AttributeProto.GRAPHS:\n                    for i, sg in enumerate(attr.graphs):\n                        fix_graph(sg, f\"{prefix}_{node.name or node.op_type}_gs{i}\")\n\n    if not m.graph.name:\n        m.graph.name = \"graph\"\n    fix_graph(m.graph, m.graph.name)\n\n\ndef ensure_node_names(m: onnx.ModelProto) -> None:\n    for i, n in enumerate(m.graph.node):\n        if not n.name:\n            n.name = f\"{n.op_type}_{i}\"\n\n\ndef 
save_clean(path: str) -> None:\n    m = onnx.load(path)\n    ensure_graph_names(m)\n    ensure_node_names(m)\n    onnx.save_model(m, path, save_as_external_data=False)\n\n\ndef preprocess(src: str, dst: str, mode: str) -> str:\n    if mode == \"none\":\n        return src\n    if mode == \"onnx\":\n        m = onnx.load(src)\n        ensure_graph_names(m)\n        ensure_node_names(m)\n        try:\n            m = onnx.shape_inference.infer_shapes(m)\n        except Exception:\n            pass\n        onnx.save_model(m, dst, save_as_external_data=False)\n        return dst\n    if mode == \"ort\":\n        if _quant_pre_process is None:\n            return preprocess(src, dst, \"onnx\")\n        sig = inspect.signature(_quant_pre_process)\n        allowed = set(sig.parameters.keys())\n        kwargs = {}\n        if \"skip_symbolic_shape_inference\" in allowed:\n            kwargs[\"skip_symbolic_shape_inference\"] = True\n        if \"skip_onnx_shape_inference\" in allowed:\n            kwargs[\"skip_onnx_shape_inference\"] = False\n        if \"skip_optimization\" in allowed:\n            kwargs[\"skip_optimization\"] = False\n        try:\n            _quant_pre_process(src, dst, **kwargs)\n            save_clean(dst)\n            return dst\n        except Exception:\n            return preprocess(src, dst, \"onnx\")\n    raise ValueError(f\"Unknown preprocess mode: {mode}\")\n\n\ndef pick_calib_method(name: str):\n    # fallback to name (str) when CalibrationMethod unavailable\n    if CalibrationMethod is None:\n        print(f\"CalibrationMethod is None, using {name}\")\n        return name\n    return getattr(CalibrationMethod, name, CalibrationMethod.MinMax)\n\n\ndef get_io_names(model_path: str) -> Tuple[List[str], List[str]]:\n    so = ort.SessionOptions()\n    so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL\n    sess = ort.InferenceSession(model_path, sess_options=so, providers=[\"CPUExecutionProvider\"])\n    ins = [i.name 
for i in sess.get_inputs()]\n    outs = [o.name for o in sess.get_outputs()]\n    return ins, outs\n\n\ndef onnx_int8_name(src_name: str) -> str:\n    return os.path.splitext(src_name)[0] + \".int8.onnx\"\n\n\ndef _detect_variable_axis(shapes: List[Tuple[int, ...]]) -> Optional[int]:\n    # return axis index if exactly one axis varies across shapes, else None\n    if not shapes:\n        return None\n    nd = len(shapes[0])\n    if any(len(s) != nd for s in shapes):\n        return None\n    var_axes = []\n    for ax in range(nd):\n        vals = {s[ax] for s in shapes}\n        if len(vals) > 1:\n            var_axes.append(ax)\n    if len(var_axes) == 1:\n        return var_axes[0]\n    return None\n\n\ndef _crop_center(arr: np.ndarray, axis: int, target: int) -> np.ndarray:\n    cur = arr.shape[axis]\n    if cur <= target:\n        return arr\n    start = (cur - target) // 2\n    sl = [slice(None)] * arr.ndim\n    sl[axis] = slice(start, start + target)\n    return arr[tuple(sl)]\n\n\ndef _pad(arr: np.ndarray, axis: int, target: int, pad_value: float) -> np.ndarray:\n    cur = arr.shape[axis]\n    if cur >= target:\n        return arr\n    pad_width = [(0, 0)] * arr.ndim\n    pad_width[axis] = (0, target - cur)\n    return np.pad(arr, pad_width, mode=\"constant\", constant_values=pad_value)\n\n\ndef _pad_or_crop(arr: np.ndarray, axis: int, target: int, pad_value: float) -> np.ndarray:\n    cur = arr.shape[axis]\n    if cur > target:\n        return _crop_center(arr, axis, target)\n    if cur < target:\n        return _pad(arr, axis, target, pad_value)\n    return arr\n\n\ndef _pad_value_for(name: str, dtype: np.dtype):\n    n = name.lower()\n    if \"mask\" in n:\n        return 0\n    if np.issubdtype(dtype, np.integer):\n        return 0\n    return 0.0\n\n\ndef _build_pad_plan_percentile(\n    folder: str,\n    input_names: List[str],\n    limit: int,\n    pad_percentile: int,\n    pad_max: int,\n) -> Dict[str, Tuple[int, int]]:\n    files = 
sorted(glob.glob(os.path.join(folder, \"*.npz\")))\n    files = files[:limit] if limit > 0 else files\n    if not files:\n        raise RuntimeError(f\"No npz in: {folder}\")\n\n    shapes_per_in: Dict[str, List[Tuple[int, ...]]] = {n: [] for n in input_names}\n    for f in files:\n        d = np.load(f, allow_pickle=False)\n        for n in input_names:\n            if n not in d:\n                raise KeyError(f\"{f} missing '{n}', keys={list(d.keys())}\")\n            shapes_per_in[n].append(tuple(d[n].shape))\n\n    plan: Dict[str, Tuple[int, int]] = {}\n    for n, shapes in shapes_per_in.items():\n        ax = _detect_variable_axis(shapes)\n        if ax is None:\n            continue\n        lens = np.array([s[ax] for s in shapes], dtype=np.int64)\n        tgt = int(np.percentile(lens, pad_percentile))\n        tgt = max(1, tgt)\n        if pad_max > 0:\n            tgt = min(tgt, pad_max)\n        plan[n] = (ax, tgt)\n    return plan\n\n\nclass PaddedNpzDataReader(CalibrationDataReader):\n    def __init__(self, folder: str, input_names: List[str], limit: int, pad_percentile: int, pad_max: int):\n        self.files = sorted(glob.glob(os.path.join(folder, \"*.npz\")))\n        if limit > 0:\n            self.files = self.files[:limit]\n        if not self.files:\n            raise RuntimeError(f\"No calibration npz in: {folder}\")\n        self.input_names = input_names\n        self.pad_plan = _build_pad_plan_percentile(folder, input_names, limit, pad_percentile, pad_max)\n        self._iter = iter(self.files)\n\n    def get_next(self) -> Optional[Dict[str, np.ndarray]]:\n        try:\n            p = next(self._iter)\n        except StopIteration:\n            return None\n        d = np.load(p, allow_pickle=False)\n        feeds: Dict[str, np.ndarray] = {}\n        for n in self.input_names:\n            x = d[n]\n            if x.dtype == np.float64:\n                x = x.astype(np.float32)\n            if n in self.pad_plan:\n                axis, tgt 
= self.pad_plan[n]\n                pv = _pad_value_for(n, x.dtype)\n                x = _pad_or_crop(x, axis, tgt, pv)\n            feeds[n] = x\n        return feeds\n\n    def rewind(self) -> None:\n        self._iter = iter(self.files)\n\n\ndef safe_copy(src: str, dst: str) -> None:\n    shutil.copy2(src, dst)\n    try:\n        save_clean(dst)\n    except Exception:\n        pass\n\n\ndef quantize_dynamic_safe(fp32_path: str, out_path: str, op_types: List[str], wt_type: QuantType) -> None:\n    try:\n        quantize_dynamic(\n            model_input=fp32_path,\n            model_output=out_path,\n            op_types_to_quantize=op_types,\n            weight_type=wt_type,\n            per_channel=False,\n            reduce_range=False,\n            use_external_data_format=False,\n        )\n        save_clean(out_path)\n    except Exception as e:\n        print(f\"[WARN] dynamic quant failed for {os.path.basename(fp32_path)}: {e} -> fallback copy\")\n        safe_copy(fp32_path, out_path)\n\n\ndef quantize_static_safe(\n    fp32_path: str,\n    out_path: str,\n    calib_folder: str,\n    preprocess_mode: str,\n    calib_limit: int,\n    calibrate_method: str,\n    act_type: QuantType,\n    wt_type: QuantType,\n    per_channel: bool,\n    reduce_range: bool,\n    op_types: List[str],\n    nodes_to_exclude: Optional[List[str]],\n    pad_percentile: int,\n    pad_max: int,\n) -> None:\n    with tempfile.TemporaryDirectory(prefix=\"st_q_\") as td:\n        pre_path = os.path.join(td, \"pre.onnx\")\n        fp32_for_quant = preprocess(fp32_path, pre_path, preprocess_mode)\n        ins, _ = get_io_names(fp32_for_quant)\n\n        extra = {\"WeightSymmetric\": True}\n        extra[\"ActivationSymmetric\"] = (act_type == QuantType.QInt8)\n\n        def _run(method: str) -> None:\n            sig = inspect.signature(quantize_static)\n            allowed = set(sig.parameters.keys())\n            kwargs = dict(\n                quant_format=QuantFormat.QDQ,\n           
     op_types_to_quantize=op_types,\n                per_channel=per_channel,\n                reduce_range=reduce_range,\n                activation_type=act_type,\n                weight_type=wt_type,\n                optimize_model=False,\n                use_external_data_format=False,\n                extra_options=extra,\n                calibration_providers=[\"CPUExecutionProvider\"],\n            )\n            cm = pick_calib_method(method)\n            if \"calibrate_method\" in allowed:\n                kwargs[\"calibrate_method\"] = cm\n            if nodes_to_exclude and \"nodes_to_exclude\" in allowed:\n                kwargs[\"nodes_to_exclude\"] = nodes_to_exclude\n            kwargs = {k: v for k, v in kwargs.items() if k in allowed}\n\n            dr = PaddedNpzDataReader(calib_folder, ins, calib_limit, pad_percentile, pad_max)\n            quantize_static(fp32_for_quant, out_path, dr, **kwargs)\n            save_clean(out_path)\n\n        try:\n            _run(calibrate_method)\n        except Exception as e:\n            msg = str(e)\n            if \"inhomogeneous shape\" in msg or \"setting an array element with a sequence\" in msg:\n                print(f\"[WARN] calib shape issue on {os.path.basename(fp32_path)} -> fallback MinMax\")\n                _run(\"MinMax\")\n            else:\n                print(f\"[WARN] static quant failed for {os.path.basename(fp32_path)}: {e} -> fallback copy\")\n                safe_copy(fp32_path, out_path)\n\n\ndef _name_exists(model: onnx.ModelProto, name: str) -> bool:\n    for t in model.graph.initializer:\n        if t.name == name:\n            return True\n    for v in list(model.graph.value_info) + list(model.graph.input) + list(model.graph.output):\n        if v.name == name:\n            return True\n    for n in model.graph.node:\n        if name in n.output:\n            return True\n    return False\n\n\ndef _unique_name(model: onnx.ModelProto, base: str) -> str:\n    if not 
_name_exists(model, base):\n        return base\n    i = 0\n    while True:\n        cand = f\"{base}_{i}\"\n        if not _name_exists(model, cand):\n            return cand\n        i += 1\n\n\ndef _w8dq_quantize_per_channel_s8(w: np.ndarray, axis: int = 0) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:\n    w = w.astype(np.float32)\n    w_abs = np.max(np.abs(w), axis=tuple(i for i in range(w.ndim) if i != axis), keepdims=False)\n    w_abs = np.maximum(w_abs, 1e-8)\n    scale = (w_abs / 127.0).astype(np.float32)\n    zp = np.zeros_like(scale, dtype=np.int8)\n\n    shape = [1] * w.ndim\n    shape[axis] = w.shape[axis]\n    scale_b = scale.reshape(shape)\n    w_q = np.round(w / scale_b).clip(-127, 127).astype(np.int8)\n    return w_q, scale, zp\n\n\ndef apply_w8dq_to_conv_weights(\n    model_in: str,\n    model_out: str,\n    exclude_last_conv: int,\n    only_fp32: bool = True,\n) -> None:\n    m = onnx.load(model_in)\n    ensure_graph_names(m)\n    ensure_node_names(m)\n\n    convs_all = [n for n in m.graph.node if n.op_type == \"Conv\"]\n    if exclude_last_conv > 0 and len(convs_all) >= exclude_last_conv:\n        convs_use = convs_all[:-exclude_last_conv]\n    else:\n        convs_use = convs_all\n\n    imap = {t.name: t for t in m.graph.initializer}\n\n    def remove_initializer(name: str) -> None:\n        keep = [t for t in m.graph.initializer if t.name != name]\n        del m.graph.initializer[:]\n        m.graph.initializer.extend(keep)\n\n    new_nodes = []\n    changed = 0\n\n    for node in m.graph.node:\n        if node.op_type != \"Conv\":\n            continue\n        if node not in convs_use:\n            continue\n        if len(node.input) < 2:\n            continue\n\n        w_name = node.input[1]\n        if w_name not in imap:\n            continue\n        w_t = imap[w_name]\n        w = numpy_helper.to_array(w_t)\n        if only_fp32 and w.dtype != np.float32:\n            continue\n\n        w_q, scale, zp = 
_w8dq_quantize_per_channel_s8(w, axis=0)\n\n        wq_name = _unique_name(m, w_name + \"_wq\")\n        sc_name = _unique_name(m, w_name + \"_scale\")\n        zp_name = _unique_name(m, w_name + \"_zp\")\n        dq_out = _unique_name(m, w_name + \"_dq\")\n\n        m.graph.initializer.extend([numpy_helper.from_array(w_q, name=wq_name)])\n        m.graph.initializer.extend([numpy_helper.from_array(scale.astype(np.float32), name=sc_name)])\n        m.graph.initializer.extend([numpy_helper.from_array(zp.astype(np.int8), name=zp_name)])\n\n        dq = onnx.helper.make_node(\n            \"DequantizeLinear\",\n            inputs=[wq_name, sc_name, zp_name],\n            outputs=[dq_out],\n            name=_unique_name(m, \"DQ_\" + w_name),\n            axis=0,\n        )\n        new_nodes.append(dq)\n\n        node.input[1] = dq_out\n        remove_initializer(w_name)\n        changed += 1\n\n    if new_nodes:\n        old_nodes = list(m.graph.node)\n        del m.graph.node[:]\n        m.graph.node.extend(new_nodes + old_nodes)\n\n    onnx.checker.check_model(m)\n    onnx.save_model(m, model_out, save_as_external_data=False)\n    save_clean(model_out)\n    print(f\"[W8-DQ] conv weights compressed: {changed} (exclude_last_conv={exclude_last_conv})\")\n\n\ndef infer_vocoder_latent_shape(vocoder_fp32: str, voc_calib_dir: str) -> Optional[Tuple[int, ...]]:\n    try:\n        voc_in, _ = get_io_names(vocoder_fp32)\n        if len(voc_in) != 1:\n            return None\n        inp = voc_in[0]\n        files = sorted(glob.glob(os.path.join(voc_calib_dir, \"*.npz\")))\n        if not files:\n            return None\n        d = np.load(files[0], allow_pickle=False)\n        if inp not in d:\n            return None\n        return tuple(d[inp].shape)\n    except Exception:\n        return None\n\n\ndef pick_ve_output_index(ve_model_path: str, ve_calib_dir: str, voc_latent_shape: Optional[Tuple[int, ...]]) -> int:\n    ve_in, _ = get_io_names(ve_model_path)\n    files = 
sorted(glob.glob(os.path.join(ve_calib_dir, \"*.npz\")))\n    if not files:\n        return 0\n    d = np.load(files[0], allow_pickle=False)\n    feeds = {}\n    for n in ve_in:\n        x = d[n]\n        if x.dtype == np.float64:\n            x = x.astype(np.float32)\n        feeds[n] = x\n\n    so = ort.SessionOptions()\n    so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL\n    sess = ort.InferenceSession(ve_model_path, sess_options=so, providers=[\"CPUExecutionProvider\"])\n    outs = sess.run(None, feeds)\n\n    best = 0\n    if voc_latent_shape is not None:\n        vrank = len(voc_latent_shape)\n        for i, y in enumerate(outs):\n            y = np.asarray(y)\n            if not np.issubdtype(y.dtype, np.floating):\n                continue\n            if y.ndim != vrank:\n                continue\n            # Supertonic VE latent dim 512, pick output matching vocoder input\n            if 512 in y.shape:\n                best = i\n                break\n        return best\n\n    for i, y in enumerate(outs):\n        y = np.asarray(y)\n        if np.issubdtype(y.dtype, np.floating) and y.ndim == 3 and (512 in y.shape):  # latent dim\n            best = i\n            break\n    return best\n\n\ndef build_vocoder_calib_from_ve(\n    ve_model_path: str,\n    ve_calib_dir: str,\n    vocoder_fp32: str,\n    out_dir: str,\n    ve_output_index: int,\n    limit: int,\n    pad_percentile: int,\n    pad_max: int,\n) -> None:\n    os.makedirs(out_dir, exist_ok=True)\n    voc_in, _ = get_io_names(vocoder_fp32)\n    if len(voc_in) != 1:\n        raise RuntimeError(f\"vocoder inputs != 1, got {voc_in}\")\n    voc_in_name = voc_in[0]\n\n    ve_in, _ = get_io_names(ve_model_path)\n    files = sorted(glob.glob(os.path.join(ve_calib_dir, \"*.npz\")))\n    files = files[:limit] if limit > 0 else files\n    if not files:\n        raise RuntimeError(f\"No npz in {ve_calib_dir}\")\n\n    so = ort.SessionOptions()\n    so.graph_optimization_level = 
ort.GraphOptimizationLevel.ORT_ENABLE_ALL\n    sess = ort.InferenceSession(ve_model_path, sess_options=so, providers=[\"CPUExecutionProvider\"])\n\n    ve_pad_plan = _build_pad_plan_percentile(ve_calib_dir, ve_in, limit, pad_percentile, pad_max)\n\n    latents = []\n    for f in files:\n        d = np.load(f, allow_pickle=False)\n        feeds = {}\n        for n in ve_in:\n            x = d[n]\n            if x.dtype == np.float64:\n                x = x.astype(np.float32)\n            if n in ve_pad_plan:\n                axis, tgt = ve_pad_plan[n]\n                pv = _pad_value_for(n, x.dtype)\n                x = _pad_or_crop(x, axis, tgt, pv)\n            feeds[n] = x\n        y = np.asarray(sess.run(None, feeds)[ve_output_index], dtype=np.float32)\n        latents.append(y)\n\n    shapes = [tuple(z.shape) for z in latents]\n    ax = _detect_variable_axis(shapes)\n    if ax is not None:\n        lens = np.array([s[ax] for s in shapes], dtype=np.int64)\n        tgt = int(np.percentile(lens, pad_percentile))\n        tgt = max(1, tgt)\n        if pad_max > 0:\n            tgt = min(tgt, pad_max)\n        latents = [_pad_or_crop(z, ax, tgt, 0.0) for z in latents]\n\n    for i, y in enumerate(latents):\n        np.savez(os.path.join(out_dir, f\"{i:05d}.npz\"), **{voc_in_name: y})\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--src-dir\", type=str, required=True, help=\"source model dir\"\n    )\n    parser.add_argument(\n        \"--dst-dir\", type=str, required=True, help=\"output model dir\"\n    )\n    parser.add_argument(\n        \"--calib-dir\", type=str, required=True, help=\"calibration npz dir\"\n    )\n    parser.add_argument(\n        \"--preprocess\", choices=[\"onnx\", \"ort\", \"none\"], default=\"ort\"\n    )\n    parser.add_argument(\"--duration-predictor\", default=\"duration_predictor.onnx\")\n    parser.add_argument(\"--text-encoder\", default=\"text_encoder.onnx\")\n    
parser.add_argument(\"--vector-estimator\", default=\"vector_estimator.onnx\")\n    parser.add_argument(\"--vocoder\", default=\"vocoder.onnx\")\n    parser.add_argument(\"--dp-mode\", choices=[\"copy\", \"dynamic\"], default=\"copy\")\n    parser.add_argument(\"--te-mode\", choices=[\"copy\", \"dynamic\"], default=\"copy\")\n    parser.add_argument(\n        \"--dp-te-weight-type\", choices=[\"qint8\", \"quint8\"], default=\"qint8\"\n    )\n    parser.add_argument(\"--ve-mode\", choices=[\"copy\", \"dynamic\"], default=\"dynamic\")\n    parser.add_argument(\"--ve-conv-w8dq\", action=\"store_true\", default=True)\n    parser.add_argument(\"--ve-w8dq-exclude-last-conv\", type=int, default=6)\n    parser.add_argument(\"--ve-calib-limit\", type=int, default=100)\n    parser.add_argument(\"--vocoder-calib-limit\", type=int, default=100)\n    parser.add_argument(\n        \"--vocoder-calibrate-method\",\n        choices=[\"MinMax\", \"Entropy\", \"Percentile\"],\n        default=\"Percentile\",\n    )\n    parser.add_argument(\"--vocoder-act\", choices=[\"qint8\", \"quint8\"], default=\"quint8\")\n    parser.add_argument(\"--vocoder-wt\", choices=[\"qint8\", \"quint8\"], default=\"qint8\")\n    parser.add_argument(\"--vocoder-per-channel\", action=\"store_true\", default=True)\n    parser.add_argument(\"--vocoder-reduce-range\", action=\"store_true\", default=True)\n    parser.add_argument(\"--exclude-last-conv\", type=int, default=8)\n    parser.add_argument(\"--vocoder-tail-w8dq\", action=\"store_true\", default=True)\n    parser.add_argument(\n        \"--vocoder-tail-w8dq-exclude-last-conv\", type=int, default=0\n    )\n    parser.add_argument(\"--vocoder-calib-from-ve\", action=\"store_true\", default=True)\n    parser.add_argument(\"--ve-output-index\", type=int, default=-1)\n    parser.add_argument(\"--pad-percentile\", type=int, default=90)\n    parser.add_argument(\"--pad-max\", type=int, default=0)\n    return parser.parse_args()\n\n\ndef main():\n    args = 
get_args()\n    os.makedirs(args.dst_dir, exist_ok=True)\n\n    print(\"ORT:\", ort.__version__, \"providers:\", ort.get_available_providers())\n\n    dp_fp32 = os.path.join(args.src_dir, args.duration_predictor)\n    te_fp32 = os.path.join(args.src_dir, args.text_encoder)\n    ve_fp32 = os.path.join(args.src_dir, args.vector_estimator)\n    voc_fp32 = os.path.join(args.src_dir, args.vocoder)\n\n    dp_out = os.path.join(args.dst_dir, onnx_int8_name(args.duration_predictor))\n    te_out = os.path.join(args.dst_dir, onnx_int8_name(args.text_encoder))\n    ve_out = os.path.join(args.dst_dir, onnx_int8_name(args.vector_estimator))\n    voc_out = os.path.join(args.dst_dir, onnx_int8_name(args.vocoder))\n\n    dp_te_wt = QuantType.QInt8 if args.dp_te_weight_type == \"qint8\" else QuantType.QUInt8\n    voc_act = QuantType.QInt8 if args.vocoder_act == \"qint8\" else QuantType.QUInt8\n    voc_wt = QuantType.QInt8 if args.vocoder_wt == \"qint8\" else QuantType.QUInt8\n\n    if args.dp_mode == \"copy\":\n        safe_copy(dp_fp32, dp_out)\n    else:\n        quantize_dynamic_safe(dp_fp32, dp_out, [\"MatMul\", \"Gemm\"], dp_te_wt)\n\n    if args.te_mode == \"copy\":\n        safe_copy(te_fp32, te_out)\n    else:\n        quantize_dynamic_safe(te_fp32, te_out, [\"MatMul\", \"Gemm\"], dp_te_wt)\n\n    if args.ve_mode == \"copy\":\n        safe_copy(ve_fp32, ve_out)\n    else:\n        quantize_dynamic_safe(ve_fp32, ve_out, [\"MatMul\", \"Gemm\"], QuantType.QInt8)\n\n    if args.ve_conv_w8dq:\n        apply_w8dq_to_conv_weights(\n            model_in=ve_out,\n            model_out=ve_out,\n            exclude_last_conv=args.ve_w8dq_exclude_last_conv,\n            only_fp32=True,\n        )\n\n    ve_calib = os.path.join(args.calib_dir, os.path.splitext(args.vector_estimator)[0])\n    voc_calib_dir = os.path.join(args.calib_dir, os.path.splitext(args.vocoder)[0])\n\n    voc_lat_shape = infer_vocoder_latent_shape(voc_fp32, voc_calib_dir)\n\n    nodes_excl = None\n    if 
args.exclude_last_conv > 0:\n        with tempfile.TemporaryDirectory(prefix=\"voc_pre_\") as td:\n            pre_voc = os.path.join(td, \"voc_pre.onnx\")\n            voc_for = preprocess(voc_fp32, pre_voc, args.preprocess)\n            m = onnx.load(voc_for)\n            ensure_node_names(m)\n            convs = [n.name for n in m.graph.node if n.op_type == \"Conv\"]\n            if len(convs) >= args.exclude_last_conv:\n                nodes_excl = convs[-args.exclude_last_conv:]\n\n    def _run_vocoder_quantize(calib_folder: str) -> None:\n        quantize_static_safe(\n            fp32_path=voc_fp32,\n            out_path=voc_out,\n            calib_folder=calib_folder,\n            preprocess_mode=args.preprocess,\n            calib_limit=args.vocoder_calib_limit,\n            calibrate_method=args.vocoder_calibrate_method,\n            act_type=voc_act,\n            wt_type=voc_wt,\n            per_channel=args.vocoder_per_channel,\n            reduce_range=args.vocoder_reduce_range,\n            op_types=[\"Conv\"],\n            nodes_to_exclude=nodes_excl,\n            pad_percentile=args.pad_percentile,\n            pad_max=args.pad_max,\n        )\n        if args.vocoder_tail_w8dq and args.exclude_last_conv > 0:\n            apply_w8dq_to_conv_weights(\n                model_in=voc_out,\n                model_out=voc_out,\n                exclude_last_conv=args.vocoder_tail_w8dq_exclude_last_conv,\n                only_fp32=True,\n            )\n\n    if args.vocoder_calib_from_ve:\n        with tempfile.TemporaryDirectory(prefix=\"vocoder_calib_\") as tmp_voc_calib:\n            ve_idx = args.ve_output_index\n            if ve_idx < 0:\n                ve_idx = pick_ve_output_index(ve_out, ve_calib, voc_lat_shape)\n            print(f\"[INFO] VE output index for vocoder calib: {ve_idx}\")\n            build_vocoder_calib_from_ve(\n                ve_model_path=ve_out,\n                ve_calib_dir=ve_calib,\n                vocoder_fp32=voc_fp32,\n    
            out_dir=tmp_voc_calib,\n                ve_output_index=ve_idx,\n                limit=args.vocoder_calib_limit,\n                pad_percentile=args.pad_percentile,\n                pad_max=args.pad_max,\n            )\n            _run_vocoder_quantize(tmp_voc_calib)\n    else:\n        _run_vocoder_quantize(voc_calib_dir)\n\n    print(\"Quantization completed!\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/supertonic/dump_inputs.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2026 zengyw\n\n\"\"\"\nDump Supertonic TTS model inputs to npz for calibration.\nSee also https://github.com/supertone-inc/supertonic\n\"\"\"\n\nimport argparse\nimport os\n\nimport numpy as np\nimport onnxruntime as ort\n\nfrom helper import (\n    UnicodeProcessor,\n    Style,\n    TextToSpeech,\n    load_onnx_all,\n    load_cfgs,\n    load_text_processor,\n    load_voice_style,\n    chunk_text\n)\n\n\nclass DumpTextToSpeech(TextToSpeech):\n    \"\"\"TTS with input dumping capability.\"\"\"\n\n    def __init__(\n        self,\n        cfgs: dict,\n        text_processor: UnicodeProcessor,\n        dp_ort: ort.InferenceSession,\n        text_enc_ort: ort.InferenceSession,\n        vector_est_ort: ort.InferenceSession,\n        vocoder_ort: ort.InferenceSession,\n        dump_dir: str = \"calib\",\n    ):\n        super().__init__(\n            cfgs, text_processor, dp_ort, text_enc_ort, vector_est_ort, vocoder_ort\n        )\n        self.dump_dir = dump_dir\n\n        self.dump_dirs = {\n            \"duration_predictor\": os.path.join(dump_dir, \"duration_predictor\"),\n            \"text_encoder\": os.path.join(dump_dir, \"text_encoder\"),\n            \"vector_estimator\": os.path.join(dump_dir, \"vector_estimator\"),\n            \"vocoder\": os.path.join(dump_dir, \"vocoder\"),\n        }\n        for d in self.dump_dirs.values():\n            os.makedirs(d, exist_ok=True)\n        self.counters = {k: 0 for k in self.dump_dirs}\n\n    def _save_inputs(self, model_name: str, inputs: dict):\n        \"\"\"Save input tensors to npz file.\"\"\"\n        counter = self.counters[model_name]\n        output_path = os.path.join(self.dump_dirs[model_name], f\"{counter:03d}.npz\")\n        np.savez(output_path, **inputs)\n        self.counters[model_name] += 1\n        print(f\"  Saved {model_name} inputs to {output_path}\")\n\n    def _infer(\n        self,\n        text_list: list[str],\n        lang_list: 
list[str],\n        style: Style,\n        total_step: int,\n        speed: float = 1.05,\n    ) -> tuple[np.ndarray, np.ndarray]:\n        \"\"\"Run inference with input dumping.\"\"\"\n        assert (\n            len(text_list) == style.ttl.shape[0]\n        ), \"Number of texts must match number of style vectors\"\n        bsz = len(text_list)\n\n        text_ids, text_mask = self.text_processor(text_list, lang_list)\n        dp_inputs = {\n            \"text_ids\": text_ids,\n            \"style_dp\": style.dp,\n            \"text_mask\": text_mask,\n        }\n        self._save_inputs(\"duration_predictor\", dp_inputs)\n        dur_onnx, *_ = self.dp_ort.run(None, dp_inputs)\n        dur_onnx = dur_onnx / speed\n        text_emb_onnx, *_ = self.text_enc_ort.run(\n            None,\n            {\n                \"text_ids\": text_ids,\n                \"style_ttl\": style.ttl,\n                \"text_mask\": text_mask,\n            },\n        )\n        self._save_inputs(\"text_encoder\", {\n            \"text_ids\": text_ids,\n            \"style_ttl\": style.ttl,\n            \"text_mask\": text_mask,\n        })\n        xt, latent_mask = self.sample_noisy_latent(dur_onnx)\n        total_step_np = np.array([total_step] * bsz, dtype=np.float32)\n\n        # dump vector_estimator inputs at last step (most informative)\n        for step in range(total_step):\n            current_step = np.array([step] * bsz, dtype=np.float32)\n            ve_inputs = {\n                \"noisy_latent\": xt,\n                \"text_emb\": text_emb_onnx,\n                \"style_ttl\": style.ttl,\n                \"text_mask\": text_mask,\n                \"latent_mask\": latent_mask,\n                \"current_step\": current_step,\n                \"total_step\": total_step_np,\n            }\n            if step == total_step - 1:\n                self._save_inputs(\"vector_estimator\", ve_inputs)\n            xt, *_ = self.vector_est_ort.run(None, ve_inputs)\n\n        
# Vocoder inputs and run\n        vocoder_inputs = {\"latent\": xt}\n        self._save_inputs(\"vocoder\", vocoder_inputs)\n        wav, *_ = self.vocoder_ort.run(None, vocoder_inputs)\n\n        return wav, dur_onnx\n\n    def __call__(\n        self,\n        text: str,\n        lang: str,\n        style: Style,\n        total_step: int,\n        speed: float = 1.05,\n        silence_duration: float = 0.3,\n    ) -> tuple[np.ndarray, np.ndarray]:\n        \"\"\"Single text to speech with input dumping.\"\"\"\n        assert (\n            style.ttl.shape[0] == 1\n        ), \"Single speaker text to speech only supports single style\"\n        max_len = 120 if lang == \"ko\" else 300\n        text_list = chunk_text(text, max_len=max_len)\n        wav_cat = None\n        dur_cat = None\n\n        for i, text_chunk in enumerate(text_list):\n            print(f\"Processing chunk {i+1}/{len(text_list)}: '{text_chunk[:50]}...'\")\n            wav, dur_onnx = self._infer([text_chunk], [lang], style, total_step, speed)\n            if wav_cat is None:\n                wav_cat = wav\n                dur_cat = dur_onnx\n            else:\n                silence = np.zeros(\n                    (1, int(silence_duration * self.sample_rate)), dtype=np.float32\n                )\n                wav_cat = np.concatenate([wav_cat, silence, wav], axis=1)\n                dur_cat += dur_onnx + silence_duration\n        return wav_cat, dur_cat\n\n    def batch(\n        self,\n        text_list: list[str],\n        lang_list: list[str],\n        style: Style,\n        total_step: int,\n        speed: float = 1.05,\n    ) -> tuple[np.ndarray, np.ndarray]:\n        \"\"\"Batch inference with input dumping.\"\"\"\n        return self._infer(text_list, lang_list, style, total_step, speed)\n\n\ndef load_dump_text_to_speech(\n    onnx_dir: str, dump_dir: str = \"calib\", use_gpu: bool = False\n) -> DumpTextToSpeech:\n    \"\"\"Load TTS model for dumping inputs.\"\"\"\n    opts = 
ort.SessionOptions()\n    if use_gpu:\n        raise NotImplementedError(\"GPU mode is not fully tested\")\n    else:\n        providers = [\"CPUExecutionProvider\"]\n        print(\"Using CPU for inference\")\n\n    cfgs = load_cfgs(onnx_dir)\n    dp_ort, text_enc_ort, vector_est_ort, vocoder_ort = load_onnx_all(\n        onnx_dir, opts, providers\n    )\n    text_processor = load_text_processor(onnx_dir)\n    return DumpTextToSpeech(\n        cfgs, text_processor, dp_ort, text_enc_ort, vector_est_ort, vocoder_ort, dump_dir\n    )\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--onnx-dir\", type=str, default=\"assets/onnx\", help=\"onnx model dir\"\n    )\n    parser.add_argument(\n        \"--dump-dir\", type=str, default=\"calib\", help=\"output npz dir\"\n    )\n    parser.add_argument(\n        \"--total-step\", type=int, default=5, help=\"denoising steps\"\n    )\n    parser.add_argument(\n        \"--speed\", type=float, default=1.05, help=\"speech speed\"\n    )\n    parser.add_argument(\n        \"--n_test\", type=int, default=1, help=\"num sentences\"\n    )\n    parser.add_argument(\"--batch\", action=\"store_true\", help=\"batch mode\")\n    parser.add_argument(\n        \"--voice_style\",\n        type=str,\n        nargs=\"+\",\n        default=[\"assets/voice_styles/M1.json\"],\n        help=\"voice style json path(s)\",\n    )\n    parser.add_argument(\n        \"--text\",\n        type=str,\n        nargs=\"+\",\n        default=[\n            \"This morning, I took a walk in the park, and the sound of the birds and the breeze was so pleasant.\"\n        ],\n        help=\"text(s) to synthesize\",\n    )\n    parser.add_argument(\n        \"--lang\", type=str, nargs=\"+\", default=[\"en\"], help=\"language(s)\"\n    )\n    parser.add_argument(\"--clear\", action=\"store_true\", help=\"clear dump dir\")\n    parser.add_argument(\n        \"--config-file\",\n        type=str, default=None, 
dest=\"config_file\", help=\"batch config json\"\n    )\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n\n    if args.clear and os.path.exists(args.dump_dir):\n        import shutil\n        shutil.rmtree(args.dump_dir)\n        print(f\"Cleared existing directory: {args.dump_dir}\")\n\n    # Load TTS with dumping\n    print(f\"Loading models from {args.onnx_dir}...\")\n\n    if args.config_file:\n        import json\n        # Configs may contain non-ASCII text (e.g. Korean), so read as UTF-8 explicitly\n        with open(args.config_file, \"r\", encoding=\"utf-8\") as f:\n            configs = json.load(f)\n\n        print(f\"Loaded {len(configs)} configurations from {args.config_file}\")\n\n        # Process each configuration one by one\n        tts = load_dump_text_to_speech(args.onnx_dir, args.dump_dir, use_gpu=False)\n\n        print(f\"\\nProcessing {len(configs)} sentence(s)...\")\n        for i, cfg in enumerate(configs):\n            print(f\"\\n[{i+1}/{len(configs)}] voice={cfg['voice'].split('/')[-1]}, lang={cfg['lang']}\")\n            voice = load_voice_style([cfg[\"voice\"]])\n            _wav, _duration = tts(cfg[\"text\"], cfg[\"lang\"], voice, args.total_step, args.speed)\n    else:\n        # Validate inputs for batch mode: one voice style per text\n        if args.batch:\n            assert len(args.voice_style) == len(args.text), (\n                f\"Number of voice styles ({len(args.voice_style)}) must match \"\n                f\"number of texts ({len(args.text)})\"\n            )\n\n        tts = load_dump_text_to_speech(args.onnx_dir, args.dump_dir, use_gpu=False)\n\n        # Load voice style\n        style = load_voice_style(args.voice_style, verbose=True)\n\n        # Process sentences\n        print(f\"\\nProcessing {args.n_test} sentence(s)...\")\n        for n in range(args.n_test):\n            print(f\"\\n[{n+1}/{args.n_test}]\")\n\n            if args.batch:\n                wav, duration = tts.batch(args.text, args.lang, style, args.total_step, args.speed)\n            else:\n                wav, duration = tts(args.text[0],
args.lang[0], style, args.total_step, args.speed)\n\n    # Print summary\n    print(\"\\n\" + \"=\" * 50)\n    print(\"Dumping completed!\")\n    print(\"=\" * 50)\n    print(\"\\nGenerated files:\")\n    for model_name, counter in tts.counters.items():\n        dump_dir = tts.dump_dirs[model_name]\n        if os.path.exists(dump_dir):\n            files = sorted(os.listdir(dump_dir))\n            print(f\"  {model_name}: {len(files)} files in {dump_dir}/\")\n            for f in files[:5]:\n                print(f\"    - {f}\")\n            if len(files) > 5:\n                print(f\"    ... and {len(files) - 5} more\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/supertonic/gen_calib_configs.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2026 zengyw\n# Generate calibration configs (voice/text/lang) with diverse voices and text.\n\nimport json\nimport random\nfrom collections import Counter\n\nSENTENCES = {\n    \"en\": [\n        \"Hello world.\",\n        \"How are you today?\",\n        \"The sky is blue.\",\n        \"I love machine learning.\",\n        \"Python is awesome.\",\n        \"Good morning everyone.\",\n        \"Artificial intelligence is growing.\",\n        \"Speech synthesis is fascinating.\",\n        \"Neural networks are powerful.\",\n        \"Text to speech converts text to audio.\",\n        \"The quick brown fox jumps over the lazy dog.\",\n        \"Machine learning enables computers to learn from data.\",\n        \"Natural language processing helps machines understand text.\",\n        \"Deep learning has revolutionized artificial intelligence.\",\n        \"Speech synthesis technology has advanced significantly.\",\n        \"Neural voice cloning can replicate speaking styles.\",\n        \"Text normalization is important for proper pronunciation.\",\n        \"Voice assistants help us interact with technology naturally.\",\n        \"Modern TTS systems use deep learning for high-quality speech.\",\n        \"Human computer interaction has become more intuitive.\",\n    ],\n    \"es\": [\n        \"Hola mundo.\",\n        \"¿Cómo estás hoy?\",\n        \"El cielo es azul.\",\n        \"Me encanta el aprendizaje automático.\",\n        \"Python es increíble.\",\n        \"Buenos días a todos.\",\n        \"La inteligencia artificial está creciendo.\",\n        \"La síntesis de voz es fascinante.\",\n        \"Las redes neuronales son poderosas.\",\n        \"El texto a voz convierte texto en audio.\",\n        \"El veloz marrón salta sobre el perro perezoso.\",\n        \"El aprendizaje automático permite a las computadoras aprender.\",\n        \"El procesamiento del lenguaje natural ayuda a las máquinas.\",\n    
    \"El aprendizaje profundo ha revolucionado la inteligencia artificial.\",\n        \"La tecnología de síntesis de voz ha avanzado significativamente.\",\n        \"La clonación de voz neuronal puede replicar estilos de habla.\",\n        \"La normalización de texto es importante para la pronunciación.\",\n        \"Los asistentes de voz nos ayudan a interactuar con la tecnología.\",\n        \"Los sistemas TTS modernos utilizan aprendizaje profundo.\",\n        \"La interacción humano computadora se ha vuelto más intuitiva.\",\n    ],\n    \"pt\": [\n        \"Olá mundo.\",\n        \"Como você está hoje?\",\n        \"O céu é azul.\",\n        \"Eu amo aprendizado de máquina.\",\n        \"Python é incrível.\",\n        \"Bom dia a todos.\",\n        \"A inteligência artificial está crescendo.\",\n        \"A síntese de voz é fascinante.\",\n        \"As redes neurais são poderosas.\",\n        \"Texto para voz converte texto em áudio.\",\n        \"A rápida raposa marrom salta sobre o cachorro preguiçoso.\",\n        \"O aprendizado de máquina permite que computadores aprendam.\",\n        \"O processamento de linguagem natural ajuda máquinas a entender.\",\n        \"O aprendizado profundo revolucionou a inteligência artificial.\",\n        \"A tecnologia de síntese de voz avançou significativamente.\",\n        \"A clonagem de voz neural pode replicar estilos de fala.\",\n        \"A normalização de texto é importante para pronúncia.\",\n        \"Assistentes de voz nos ajudam a interagir com tecnologia.\",\n        \"Sistemas TTS modernos usam aprendizado profundo para áudio.\",\n        \"A interação humano computador tornou-se mais intuitiva.\",\n    ],\n    \"fr\": [\n        \"Bonjour le monde.\",\n        \"Comment allez-vous aujourd'hui?\",\n        \"Le ciel est bleu.\",\n        \"J'aime l'apprentissage automatique.\",\n        \"Python est incroyable.\",\n        \"Bonjour à tous.\",\n        \"L'intelligence artificielle grandit.\",\n        \"La 
synthèse vocale est fascinante.\",\n        \"Les réseaux neuronaux sont puissants.\",\n        \"Le texte en voix convertit le texte en audio.\",\n        \"Le rapide renard brun saute par-dessus le chien paresseux.\",\n        \"L'apprentissage automatique permet aux ordinateurs d'apprendre.\",\n        \"Le traitement du langage naturel aide les machines à comprendre.\",\n        \"L'apprentissage profond a révolutionné l'intelligence artificielle.\",\n        \"La technologie de synthèse vocale a considérablement progressé.\",\n        \"Le clonage vocal neuronal peut reproduire les styles de parole.\",\n        \"La normalisation du texte est importante pour la prononciation.\",\n        \"Les assistants vocaux nous aident à interagir avec la technologie.\",\n        \"Les systèmes TTS modernes utilisent l'apprentissage profond.\",\n        \"L'interaction homme machine est devenue plus intuitive.\",\n    ],\n    \"ko\": [\n        \"안녕하세요 세계.\",\n        \"오늘 어떻게 지내세요?\",\n        \"하늘이 푸릅니다.\",\n        \"기계학습을 사랑합니다.\",\n        \"파이썬은 놀라워요.\",\n        \"모든 분께 좋은 아침입니다.\",\n        \"인공지능이 성장하고 있습니다.\",\n        \"음성 합성은 매력적입니다.\",\n        \"신경망은 강력합니다.\",\n        \"텍스트 음성 변환이 텍스트를 오디오로 변환합니다.\",\n        \"빠른 갈색 여우가 게으른 개를 뛰어넘습니다.\",\n        \"기계학습이 컴퓨터가 데이터로 학습할 수 있게 합니다.\",\n        \"자연어 처리가 기계를 이해하도록 돕습니다.\",\n        \"딥러닝이 인공지능을 혁신했습니다.\",\n        \"음성 합성 기술이 크게 발전했습니다.\",\n        \"음성 클로닝이 음성 스타일을 복제할 수 있습니다.\",\n        \"텍스트 정규화가 올바른 발음에 중요합니다.\",\n        \"음성 비서가 기술과 상호작용하는 데 도움이 됩니다.\",\n        \"최신 TTS 시스템이 고품질 음성을 생성합니다.\",\n        \"인간 컴퓨터 상호작용이 더 직관적이 되었습니다.\",\n    ],\n}\n\nVOICE_STYLES = {\n    \"M\": [\n        \"assets/voice_styles/M1.json\",\n        \"assets/voice_styles/M2.json\",\n        \"assets/voice_styles/M3.json\",\n        \"assets/voice_styles/M4.json\",\n        \"assets/voice_styles/M5.json\",\n    ],\n    \"F\": [\n        \"assets/voice_styles/F1.json\",\n        \"assets/voice_styles/F2.json\",\n        
\"assets/voice_styles/F3.json\",\n        \"assets/voice_styles/F4.json\",\n        \"assets/voice_styles/F5.json\",\n    ],\n}\n\nSAMPLES_PER_LANG = 20\n\ndef generate_config():\n    configs = []\n    random.seed(42)\n\n    for lang, sentences in SENTENCES.items():\n        voice_pool = VOICE_STYLES[\"M\"] + VOICE_STYLES[\"F\"]\n        random.shuffle(voice_pool)\n\n        for i in range(SAMPLES_PER_LANG):\n            voice = voice_pool[i % len(voice_pool)]\n            sentence_idx = i % len(sentences)\n            sentence = sentences[sentence_idx]\n\n            if i % 3 == 0:\n                sentence2 = sentences[(sentence_idx + 1) % len(sentences)]\n                sentence = sentence + \" \" + sentence2\n            if i % 5 == 0:\n                sentence3 = sentences[(sentence_idx + 2) % len(sentences)]\n                sentence = sentence + \" \" + sentence3\n\n            configs.append({\n                \"voice\": voice,\n                \"text\": sentence,\n                \"lang\": lang,\n            })\n\n    random.shuffle(configs)\n    return configs\n\n\ndef main():\n    configs = generate_config()\n    with open(\"calib_configs.json\", \"w\", encoding=\"utf-8\") as f:\n        json.dump(configs, f, ensure_ascii=False, indent=2)\n\n    print(f\"Generated {len(configs)} configurations saved to calib_configs.json\")\n    print(\"\\nDistribution:\")\n    voices = [c[\"voice\"].split(\"/\")[-1] for c in configs]\n    langs = [c[\"lang\"] for c in configs]\n    lens = [len(c[\"text\"]) for c in configs]\n\n    print(\"\\nVoice distribution:\")\n    for v, c in Counter(voices).items():\n        print(f\"  {v}: {c}\")\n\n    print(\"\\nLanguage distribution:\")\n    for lang, c in Counter(langs).items():\n        print(f\"  {lang}: {c}\")\n\n    print(\"\\nText length stats:\")\n    print(f\"  min: {min(lens)}, max: {max(lens)}, avg: {sum(lens)/len(lens):.1f}\")\n\n    print(\"\\nSample configs:\")\n    for i in range(0, len(configs), 20):\n        c 
= configs[i]\n        print(f\"  [{i//20 + 1}] lang={c['lang']}, voice={c['voice'].split('/')[-1]}, text='{c['text'][:30]}...'\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/supertonic/generate_indexer_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2026  zengyw\n# Generate unicode_indexer.bin from unicode_indexer.json.\n\nimport json\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\n\n\ndef main():\n    script_dir = Path(__file__).parent\n    default_json = script_dir.parent.parent / \"assets\" / \"onnx\" / \"unicode_indexer.json\"\n    json_path = Path(sys.argv[1]) if len(sys.argv) > 1 else default_json\n    bin_path = Path(sys.argv[2]) if len(sys.argv) > 2 else json_path.with_suffix(\".bin\")\n\n    if not json_path.exists():\n        print(f\"Error: {json_path} does not exist\")\n        return 1\n\n    try:\n        with open(json_path, \"r\", encoding=\"utf-8\") as f:\n            arr = json.load(f)\n    except Exception as e:\n        print(f\"Error: failed to read JSON {json_path}: {e}\")\n        return 1\n\n    if not isinstance(arr, list):\n        print(f\"Error: JSON must be an array of integers, got {type(arr)}\")\n        return 1\n\n    for i, x in enumerate(arr):\n        if isinstance(x, bool) or not isinstance(x, (int, np.integer)):\n            print(f\"Error: JSON element {i} is not an integer: {x} (type={type(x)})\")\n            return 1\n        if x < np.iinfo(np.int32).min or x > np.iinfo(np.int32).max:\n            print(f\"Error: JSON element {i} out of int32 range: {x}\")\n            return 1\n\n    array = np.asarray(arr, dtype=np.int32)\n\n    try:\n        with open(bin_path, \"wb\") as f:\n            f.write(array.tobytes(order=\"C\"))\n    except Exception as e:\n        print(f\"Error: failed to write {bin_path}: {e}\")\n        return 1\n\n    print(f\"Wrote {array.size} int32 -> {bin_path}\")\n    return 0\n\n\nif __name__ == \"__main__\":\n    sys.exit(main())\n"
  },
  {
    "path": "scripts/supertonic/generate_voices_bin.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2026  zengyw\n# Merge Supertonic voice style JSONs from a directory into one voice.bin\n\nimport json\nimport sys\nfrom pathlib import Path\n\nimport numpy as np\n\n\ndef load_one_json(json_path):\n    with open(json_path, \"r\", encoding=\"utf-8\") as f:\n        data = json.load(f)\n\n    if \"style_ttl\" not in data:\n        raise ValueError(f\"{json_path}: missing key 'style_ttl'\")\n    if \"style_dp\" not in data:\n        raise ValueError(f\"{json_path}: missing key 'style_dp'\")\n\n    style_ttl = data[\"style_ttl\"]\n    if \"dims\" not in style_ttl or \"data\" not in style_ttl:\n        raise ValueError(f\"{json_path}: 'style_ttl' must contain keys 'dims' and 'data'\")\n    ttl_dims = tuple(int(x) for x in style_ttl[\"dims\"])\n    ttl_arr = np.asarray(style_ttl[\"data\"], dtype=np.float32)\n\n    ttl_size = int(np.prod(ttl_dims)) if len(ttl_dims) > 0 else 0\n    if ttl_arr.size != ttl_size:\n        raise ValueError(\n            f\"{json_path}: ttl size {ttl_arr.size} != prod(ttl_dims) {ttl_size} (ttl_dims={ttl_dims})\"\n        )\n    ttl_arr = ttl_arr.reshape(ttl_dims)\n    if not np.all(np.isfinite(ttl_arr)):\n        raise ValueError(f\"{json_path}: ttl contains NaN/Inf\")\n\n    style_dp = data[\"style_dp\"]\n    if \"dims\" not in style_dp or \"data\" not in style_dp:\n        raise ValueError(f\"{json_path}: 'style_dp' must contain keys 'dims' and 'data'\")\n    dp_dims = tuple(int(x) for x in style_dp[\"dims\"])\n    dp_arr = np.asarray(style_dp[\"data\"], dtype=np.float32)\n\n    dp_size = int(np.prod(dp_dims)) if len(dp_dims) > 0 else 0\n    if dp_arr.size != dp_size:\n        raise ValueError(\n            f\"{json_path}: dp size {dp_arr.size} != prod(dp_dims) {dp_size} (dp_dims={dp_dims})\"\n        )\n    dp_arr = dp_arr.reshape(dp_dims)\n    if not np.all(np.isfinite(dp_arr)):\n        raise ValueError(f\"{json_path}: dp contains NaN/Inf\")\n    return ttl_dims, ttl_arr, dp_dims, 
dp_arr\n\n\ndef merge_jsons_to_binary(json_paths, output_path):\n    if not json_paths:\n        raise ValueError(\"No JSON paths given\")\n    ttl_arrays = []\n    dp_arrays = []\n    ref_ttl = ref_dp = None\n    for p in json_paths:\n        ttl_dims, ttl_arr, dp_dims, dp_arr = load_one_json(p)\n        if len(ttl_dims) != 3 or ttl_dims[0] != 1:\n            raise ValueError(\n                f\"{p}: expected ttl dims [1, d1, d2], got {ttl_dims}\"\n            )\n        if len(dp_dims) != 3 or dp_dims[0] != 1:\n            raise ValueError(\n                f\"{p}: expected dp dims [1, d1, d2], got {dp_dims}\"\n            )\n        if ref_ttl is None:\n            ref_ttl, ref_dp = ttl_dims, dp_dims\n        elif ttl_dims[1:] != ref_ttl[1:] or dp_dims[1:] != ref_dp[1:]:\n            raise ValueError(\n                f\"File {p} has dims ttl{ttl_dims} dp{dp_dims}; \"\n                f\"expected ttl[1:]={ref_ttl[1:]}, dp[1:]={ref_dp[1:]}\"\n            )\n        ttl_arrays.append(ttl_arr)\n        dp_arrays.append(dp_arr)\n\n    n = len(json_paths)\n    ttl_stack = np.concatenate(ttl_arrays, axis=0)\n    dp_stack = np.concatenate(dp_arrays, axis=0)\n    out_ttl_dims = np.array([n, ref_ttl[1], ref_ttl[2]], dtype=np.int64)\n    out_dp_dims = np.array([n, ref_dp[1], ref_dp[2]], dtype=np.int64)\n\n    with open(output_path, \"wb\") as f:\n        f.write(out_ttl_dims.tobytes())\n        f.write(out_dp_dims.tobytes())\n        f.write(ttl_stack.ravel().tobytes())\n        f.write(dp_stack.ravel().tobytes())\n    print(f\"Merged {n} voice(s) -> {output_path} (sid 0..{n - 1})\")\n\n\ndef main():\n    script_dir = Path(__file__).parent\n    default_input = script_dir / \"assets\" / \"voice_styles\"\n    input_dir = Path(sys.argv[1]) if len(sys.argv) > 1 else default_input\n    if len(sys.argv) > 2:\n        output_path = Path(sys.argv[2])\n    else:\n        output_path = input_dir / \"voice.bin\"\n\n    if not input_dir.exists() or not input_dir.is_dir():\n        
print(f\"Error: input dir does not exist or is not a directory: {input_dir}\")\n        return 1\n    json_files = sorted(input_dir.glob(\"*.json\"))\n\n    if not json_files:\n        print(f\"No JSON files found in {input_dir}\")\n        return 1\n\n    try:\n        merge_jsons_to_binary([str(p) for p in json_files], str(output_path))\n    except Exception as e:\n        print(f\"Error: {e}\")\n        return 1\n\n    return 0\n\n\nif __name__ == \"__main__\":\n    sys.exit(main())\n"
  },
  {
    "path": "scripts/t-one/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for exporting models from\nhttps://github.com/voicekit-team/T-one\nto sherpa-onnx.\n"
  },
  {
    "path": "scripts/t-one/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport onnx\n\n\ndef main():\n    meta_data = {\n        \"model_type\": \"t-one\",\n        \"language\": \"Russian\",\n        \"version\": 1,\n        \"maintainer\": \"k2-fsa\",\n        \"sample_rate\": 8000,\n        \"frame_length_ms\": 300,  # chunk_duration_ms\n        \"state_dim\": 219729,\n        \"comment\": \"This is a streaming CTC model for Russian with expected audio sample rate 8000\",\n        \"url\": \"https://github.com/voicekit-team/T-one\",\n        \"see_also\": \"https://huggingface.co/t-tech/T-one\",\n    }\n    model = onnx.load(\"./model.onnx\")\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, \"./model.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/t-one/generate_tokens.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport json\n\n\ndef main():\n    with open(\"vocab.json\", encoding=\"utf-8\") as f:\n        token2id = json.load(f)\n\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for s, i in token2id.items():\n            if s == \"|\":\n                s = \" \"\n            if s == \"[PAD]\":\n                s = \"<blk>\"\n\n            f.write(f\"{s} {i}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/t-one/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to tokens.txt\",\n    )\n\n    parser.add_argument(\n        \"--wave\",\n        type=str,\n        required=True,\n        help=\"The input wave to be recognized\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n\n        self.frame_length_ms = int(meta[\"frame_length_ms\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        self.state_dim = int(meta[\"state_dim\"])\n\n    def get_init_state(self, batch_size=1):\n        return np.zeros((batch_size, self.state_dim), dtype=np.float16)\n\n    def __call__(self, x, state):\n        \"\"\"\n        Args:\n          x: (batch_size, num_samples, 1), int32\n          state: (batch_size, 219729)\n        Returns:\n          log_probs: (batch_size, num_frames, vocab_size)\n          next_state: (batch_size, 219729)\n        \"\"\"\n        log_prob, next_state = self.model.run(\n            [\n                
self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: state,\n            },\n        )\n        return log_prob, next_state\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef load_tokens(filename):\n    ans = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 1:\n                ans[int(fields[0])] = \" \"\n            else:\n                ans[int(fields[1])] = fields[0]\n    return ans\n\n\ndef compute_feat(\n    samples,\n    sample_rate,\n    frame_length_ms: int,\n):\n    opts = knf.RawAudioSamplesOptions()\n    opts.frame_opts.samp_freq = sample_rate\n    opts.frame_opts.frame_length_ms = frame_length_ms\n    opts.frame_opts.frame_shift_ms = frame_length_ms\n\n    raw_audio_samples = knf.OnlineRawAudioSamples(opts)\n\n    raw_audio_samples.accept_waveform(sample_rate, samples)\n    raw_audio_samples.input_finished()\n\n    features = []\n\n    for i in range(raw_audio_samples.num_frames_ready):\n        f = raw_audio_samples.get_frame(i)\n        features.append(f)\n\n    return (np.array(features, dtype=np.float32) * 32768).astype(np.int32)\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    model = OnnxModel(filename=args.model)\n\n    samples, sample_rate = load_audio(args.wave)\n    if sample_rate != model.sample_rate:\n        import librosa\n\n        samples = librosa.resample(\n            samples, orig_sr=sample_rate, target_sr=model.sample_rate\n        )\n        sample_rate = model.sample_rate\n\n    # Pad 
2400 samples (0.3 seconds at 8000 Hz) on each side\n    samples = np.pad(samples, (2400, 2400))\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n        frame_length_ms=model.frame_length_ms,\n    )\n\n    id2token = load_tokens(args.tokens)\n\n    blank = -2\n    for idx, token in id2token.items():\n        if token == \"<blk>\":\n            blank = idx\n\n    state = model.get_init_state()\n    token_id_list = []\n    for f in features:\n        log_probs, state = model(f[None, :, None], state)\n\n        max_token_ids = log_probs[0].argmax(axis=-1).tolist()\n        token_id_list += max_token_ids\n\n    unique_ids = []\n    prev = -1\n    for t in token_id_list:\n        if t == blank:\n            prev = t\n            continue\n\n        if t == prev:\n            continue\n\n        prev = t\n        unique_ids.append(prev)\n    text = \"\".join([id2token[i] for i in unique_ids])\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/tele-speech/.gitignore",
    "content": "*.json\n"
  },
  {
    "path": "scripts/tele-speech/README.md",
    "content": "# Introduction\n\nThis folder contains scripts for adding metadata to\nONNX models from\nhttps://hf-mirror.com/lovemefan/telespeech/tree/main\n\nPlease see\n\n  - https://github.com/Tele-AI/TeleSpeech-ASR\n  - https://github.com/lovemefan/telespeech-asr-python\n  - [TeleSpeech模型社区许可协议.pdf](https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/TeleSpeech%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf)\n\nfor more details.\n"
  },
  {
    "path": "scripts/tele-speech/add-metadata.py",
    "content": "#!/usr/bin/env python3\n\nimport json\nfrom typing import Dict\n\nimport onnx\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = value\n\n    onnx.save(model, filename)\n\n\ndef main():\n    with open(\"./vocab.json\", \"r\", encoding=\"utf-8\") as f:\n        tokens = json.load(f)\n\n    vocab_size = len(tokens)\n    with open(\"tokens.txt\", \"w\", encoding=\"utf-8\") as f:\n        for token, idx in tokens.items():\n            if idx == 0:\n                f.write(\"<blk> 0\\n\")\n            else:\n                f.write(f\"{token} {idx}\\n\")\n\n    filename = \"model.onnx\"\n    meta_data = {\n        \"model_type\": \"telespeech_ctc\",\n        \"version\": \"1\",\n        \"model_author\": \"Tele-AI\",\n        \"comment\": \"See also https://github.com/lovemefan/telespeech-asr-python\",\n        \"license\": \"https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/TeleSpeech%E6%A8%A1%E5%9E%8B%E7%A4%BE%E5%8C%BA%E8%AE%B8%E5%8F%AF%E5%8D%8F%E8%AE%AE.pdf\",\n        \"url\": \"https://github.com/Tele-AI/TeleSpeech-ASR\",\n    }\n\n    add_meta_data(filename, meta_data)\n\n    filename_int8 = f\"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n    #  filename_uint8 = f\"model.uint8.onnx\"\n    #  quantize_dynamic(\n    #      model_input=filename,\n    #      
model_output=filename_uint8,\n    #      op_types_to_quantize=[\"MatMul\"],\n    #      weight_type=QuantType.QUInt8,\n    #  )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/tele-speech/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2024  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\"\"\"\nNodeArg(name='feats', type='tensor(float)', shape=[1, 'T', 40])\n-----\nNodeArg(name='logits', type='tensor(float)', shape=['Addlogits_dim_0', 1, 7535])\n\"\"\"\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        self.show()\n\n    def show(self):\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n\n    def __call__(self, x):\n        \"\"\"\n        Args:\n          x: a float32 tensor of shape (N, T, C)\n        \"\"\"\n        logits = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n\n        return logits\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef get_features(test_wav_filename):\n    samples, sample_rate = load_audio(test_wav_filename)\n\n    if sample_rate != 16000:\n        import librosa\n\n        samples = librosa.resample(samples, orig_sr=sample_rate, target_sr=16000)\n        sample_rate 
= 16000\n\n    samples *= 32768\n\n    opts = knf.MfccOptions()\n    # See https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/mfcc_hires.conf\n    opts.frame_opts.dither = 0\n\n    opts.num_ceps = 40\n    opts.use_energy = False\n\n    opts.mel_opts.num_bins = 40\n    opts.mel_opts.low_freq = 40\n    opts.mel_opts.high_freq = -200\n\n    mfcc = knf.OnlineMfcc(opts)\n    mfcc.accept_waveform(16000, samples)\n    frames = []\n    for i in range(mfcc.num_frames_ready):\n        frames.append(mfcc.get_frame(i))\n\n    frames = np.stack(frames, axis=0)\n    return frames\n\n\ndef cmvn(features):\n    # See https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/wenet_representation/conf/train_d2v2_ark_conformer.yaml#L70\n    # https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/wenet_representation/wenet/dataset/dataset.py#L184\n    # https://github.com/Tele-AI/TeleSpeech-ASR/blob/master/wenet_representation/wenet/dataset/processor.py#L278\n    mean = features.mean(axis=0, keepdims=True)\n    std = features.std(axis=0, keepdims=True)\n    return (features - mean) / (std + 1e-5)\n\n\ndef main():\n    # Please download the test data from\n    # https://hf-mirror.com/csukuangfj/sherpa-onnx-paraformer-zh-small-2024-03-09/tree/main/test_wavs\n    test_wav_filename = \"./3-sichuan.wav\"\n    test_wav_filename = \"./4-tianjin.wav\"\n    test_wav_filename = \"./5-henan.wav\"\n\n    features = get_features(test_wav_filename)\n\n    features = cmvn(features)\n\n    features = np.expand_dims(features, axis=0)  # (T, C) -> (N, T, C)\n\n    model_filename = \"./model.int8.onnx\"\n    model = OnnxModel(model_filename)\n    logits = model(features)\n    logits = logits.squeeze(axis=1)  # remove batch axis\n    ids = logits.argmax(axis=-1)\n\n    id2token = dict()\n    with open(\"./tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            t, idx = line.split()\n            id2token[int(idx)] = t\n\n    tokens = []\n\n    blank = 0\n    prev = -1\n\n    for k in 
ids:\n        if k != blank and k != prev:\n            tokens.append(k)\n        prev = k\n\n    tokens = [id2token[i] for i in tokens]\n    text = \"\".join(tokens)\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/text2token.py",
    "content": "#!/usr/bin/env python3\n\n\"\"\"\nThis script encodes the texts (given line by line through `text`) to tokens and\nwrites the results to the file given by ``output``.\n\nUsage:\nIf the tokens_type is bpe:\n\npython3 ./text2token.py \\\n          --text texts.txt \\\n          --tokens tokens.txt \\\n          --tokens-type bpe \\\n          --bpe-model bpe.model \\\n          --output hotwords.txt\n\nIf the tokens_type is cjkchar:\n\npython3 ./text2token.py \\\n          --text texts.txt \\\n          --tokens tokens.txt \\\n          --tokens-type cjkchar \\\n          --output hotwords.txt\n\nIf the tokens_type is cjkchar+bpe:\n\npython3 ./text2token.py \\\n          --text texts.txt \\\n          --tokens tokens.txt \\\n          --tokens-type cjkchar+bpe \\\n          --bpe-model bpe.model \\\n          --output hotwords.txt\n\n\"\"\"\nimport argparse\n\nfrom sherpa_onnx import text2token\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--text\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the input texts.\n\n        Each line in the texts contains the original phrase; it might also contain some\n        extra items, for example, the boosting score (starting with :), the triggering\n        threshold (starting with #, only used in keyword spotting task) and the original\n        phrase (starting with @). 
Note: extra items will be kept in the output.\n\n        example input 1 (tokens_type = ppinyin):\n\n        小爱同学 :2.0 #0.6 @小爱同学\n        你好问问 :3.5 @你好问问\n        小艺小艺 #0.6 @小艺小艺\n\n        example output 1:\n\n        x iǎo ài t óng x ué :2.0 #0.6 @小爱同学\n        n ǐ h ǎo w èn w èn :3.5 @你好问问\n        x iǎo y ì x iǎo y ì #0.6 @小艺小艺\n\n        example input 2 (tokens_type = bpe):\n\n        HELLO WORLD :1.5 #0.4\n        HI GOOGLE :2.0 #0.8\n        HEY SIRI #0.35\n\n        example output 2:\n\n        ▁HE LL O ▁WORLD :1.5 #0.4\n        ▁HI ▁GO O G LE :2.0 #0.8\n        ▁HE Y ▁S I RI #0.35\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"The path to tokens.txt.\",\n    )\n\n    parser.add_argument(\n        \"--tokens-type\",\n        type=str,\n        required=True,\n        choices=[\n            \"cjkchar\",\n            \"bpe\",\n            \"cjkchar+bpe\",\n            \"fpinyin\",\n            \"ppinyin\",\n            \"phone+ppinyin\",\n        ],\n        help=\"\"\"The type of modeling units, should be cjkchar, bpe, cjkchar+bpe, fpinyin\n        ppinyin or phone+ppinyin.\n        fpinyin means full pinyin, each cjkchar has a pinyin(with tone).\n        ppinyin means partial pinyin, it splits pinyin into initial and final,\n        phone means English phonemes in CMU dictionary format.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--bpe-model\",\n        type=str,\n        help=\"The path to bpe.model. Only required when tokens-type is bpe or cjkchar+bpe.\",\n    )\n\n    parser.add_argument(\n        \"--lexicon\",\n        type=str,\n        help=\"The path to lexicon.txt. 
Only required when tokens-type is phone+ppinyin.\",\n    )\n\n    parser.add_argument(\n        \"--output\",\n        type=str,\n        required=True,\n        help=\"Path where the encoded tokens will be written to.\",\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n\n    texts = []\n    # extra information like boosting score (start with :), triggering threshold (start with #)\n    # original keyword (start with @)\n    extra_info = []\n    with open(args.text, \"r\", encoding=\"utf8\") as f:\n        for line in f:\n            extra = []\n            text = []\n            toks = line.strip().split()\n            for tok in toks:\n                if tok[0] == \":\" or tok[0] == \"#\" or tok[0] == \"@\":\n                    extra.append(tok)\n                else:\n                    text.append(tok)\n            texts.append(\" \".join(text))\n            extra_info.append(extra)\n    encoded_texts = text2token(\n        texts,\n        tokens=args.tokens,\n        tokens_type=args.tokens_type,\n        bpe_model=args.bpe_model,\n        lexicon=args.lexicon,\n    )\n    with open(args.output, \"w\", encoding=\"utf8\") as f:\n        for i, txt in enumerate(encoded_texts):\n            txt += extra_info[i]\n            f.write(\" \".join(txt) + \"\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/utils.sh",
    "content": "#!/bin/bash\n\ndefault='\\033[0m'\nbold='\\033[1m'\nred='\\033[31m'\ngreen='\\033[32m'\n\nfunction ok() {\n  printf \"${bold}${green}[OK]${default} %b\\n\" \"$1\"\n}\n\nfunction error() {\n  printf \"${bold}${red}[FAILED]${default} %b\\n\" \"$1\"\n}\n\nfunction abort() {\n  printf \"${bold}${red}[FAILED]${default} %b\\n\" \"$1\"\n  exit 1\n}\n"
  },
  {
    "path": "scripts/uvr_mdx/READEME.md",
    "content": "# Introduction\n\nThis folder contains scripts for converting models from\nhttps://github.com/TRvlvr/model_repo/releases/tag/all_public_uvr_models\nto sherpa-onnx.\n"
  },
  {
    "path": "scripts/uvr_mdx/add_meta_data_and_quantize.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nfrom pathlib import Path\n\nimport onnx\nimport onnxmltools\nimport onnxruntime\nfrom onnxmltools.utils.float16_converter import convert_float_to_float16\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--filename\",\n        type=str,\n        required=True,\n        help=\"Path to onnx model\",\n    )\n\n    return parser.parse_args()\n\n\ndef export_onnx_fp16(onnx_fp32_path, onnx_fp16_path):\n    onnx_fp32_model = onnxmltools.utils.load_model(onnx_fp32_path)\n    onnx_fp16_model = convert_float_to_float16(onnx_fp32_model, keep_io_types=True)\n    onnxmltools.utils.save_model(onnx_fp16_model, onnx_fp16_path)\n\n\ndef validate(model: onnxruntime.InferenceSession):\n    for i in model.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in model.get_outputs():\n        print(i)\n\n    assert len(model.get_inputs()) == 1, len(model.get_inputs())\n    assert len(model.get_outputs()) == 1, len(model.get_outputs())\n\n    inp = model.get_inputs()[0]\n    outp = model.get_outputs()[0]\n\n    assert len(inp.shape) == 4, inp.shape\n    assert len(outp.shape) == 4, outp.shape\n\n    assert inp.shape[1:] == outp.shape[1:], (inp.shape, outp.shape)\n\n\ndef add_meta_data(filename, meta_data):\n    model = onnx.load(filename)\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, filename)\n\n\ndef main():\n    args = get_args()\n    filename = 
Path(args.filename)\n    if not filename.is_file():\n        raise ValueError(f\"{filename} does not exist\")\n\n    name = filename.stem\n    print(\"name\", name)\n\n    model = onnx.load(str(filename))\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(\n        str(filename), session_opts, providers=[\"CPUExecutionProvider\"]\n    )\n    validate(sess)\n\n    inp = sess.get_inputs()[0]\n    outp = sess.get_outputs()[0]\n\n    meta_data = {\n        \"model_type\": \"UVR\",\n        \"model_name\": name,\n        \"sample_rate\": 44100,\n        \"comment\": \"This model is downloaded from https://github.com/TRvlvr/model_repo/releases\",\n        \"n_fft\": inp.shape[2] * 2,\n        \"center\": 1,\n        \"window_type\": \"hann\",\n        \"win_length\": inp.shape[2] * 2,\n        \"hop_length\": 1024,\n        \"dim_t\": inp.shape[3],\n        \"dim_f\": inp.shape[2],\n        \"dim_c\": inp.shape[1],\n        \"stems\": 2,\n    }\n    add_meta_data(str(filename), meta_data)\n\n    filename_fp16 = f\"./{name}.fp16.onnx\"\n    export_onnx_fp16(filename, filename_fp16)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/uvr_mdx/show.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport onnxruntime\nimport onnx\n\n\"\"\"\n[]\nNodeArg(name='input', type='tensor(float)', shape=['batch_size', 4, 3072, 256])\n-----\nNodeArg(name='output', type='tensor(float)', shape=['batch_size', 4, 3072, 256])\n\"\"\"\n\n\ndef show(filename):\n    model = onnx.load(filename)\n    print(model.metadata_props)\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3\n    sess = onnxruntime.InferenceSession(\n        filename, session_opts, providers=[\"CPUExecutionProvider\"]\n    )\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in sess.get_outputs():\n        print(i)\n\n\ndef main():\n    #  show(\"./UVR-MDX-NET-Voc_FT.onnx\")\n    show(\"./UVR_MDXNET_1_9703.onnx\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/uvr_mdx/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport time\n\nimport argparse\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--model-filename\",\n        type=str,\n        required=True,\n        help=\"Path to onnx model\",\n    )\n\n    parser.add_argument(\n        \"--audio-filename\",\n        type=str,\n        required=True,\n        help=\"Path to input audio file\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 4\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        self.dim_t = self.model.get_outputs()[0].shape[3]\n\n        self.dim_f = self.model.get_outputs()[0].shape[2]\n\n        self.n_fft = self.dim_f * 2\n\n        self.dim_c = self.model.get_outputs()[0].shape[1]\n        assert self.dim_c == 4, self.dim_c\n\n        self.hop = 1024\n        self.n_bins = self.n_fft // 2 + 1\n        self.chunk_size = self.hop * (self.dim_t - 1)\n\n        self.freq_pad = np.zeros([1, self.dim_c, self.n_bins - self.dim_f, self.dim_t])\n\n        print(f\"----------inputs for {filename}----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(f\"----------outputs for {filename}----------\")\n\n        for i in self.model.get_outputs():\n            print(i)\n            print(i.shape)\n        print(\"--------------------\")\n\n    def __call__(self, x):\n        
\"\"\"\n        Args:\n          x: (batch_size, 4, self.dim_f, self.dim_t)\n        Returns:\n          spec: (batch_size, 4, self.dim_f, self.dim_t)\n        \"\"\"\n        spec = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n\n        return spec\n\n\ndef main():\n    args = get_args()\n    m = OnnxModel(args.model_filename)\n\n    stft_config = knf.StftConfig(\n        n_fft=m.n_fft,\n        hop_length=m.hop,\n        win_length=m.n_fft,\n        center=True,\n        window_type=\"hann\",\n    )\n    knf_stft = knf.Stft(stft_config)\n    knf_istft = knf.IStft(stft_config)\n\n    sample_rate = 44100\n\n    samples, rate = librosa.load(args.audio_filename, mono=False, sr=sample_rate)\n\n    start_time = time.time()\n\n    assert rate == sample_rate, (rate, sample_rate)\n\n    # samples: (2, 479832) , (num_channels, num_samples), 44100, 10.88\n    print(\"samples\", samples.shape, rate, samples.shape[1] / rate)\n\n    assert samples.ndim == 2, samples.shape\n    assert samples.shape[0] == 2, samples.shape\n\n    margin = sample_rate\n\n    num_chunks = 15\n    chunk_size = num_chunks * sample_rate\n\n    # if there are too few samples, reset chunk_size\n    if samples.shape[1] < chunk_size:\n        chunk_size = samples.shape[1]\n\n    if margin > chunk_size:\n        margin = chunk_size\n\n    segments = []\n    for skip in range(0, samples.shape[1], chunk_size):\n        start = max(0, skip - margin)\n        end = min(skip + chunk_size + margin, samples.shape[1])\n        segments.append(samples[:, start:end])\n        if end == samples.shape[1]:\n            break\n\n    sources = []\n    for kk, s in enumerate(segments):\n        num_samples = s.shape[1]\n        trim = m.n_fft // 2\n        gen_size = m.chunk_size - 2 * trim\n        pad = gen_size - s.shape[1] % gen_size\n        mix_p = np.concatenate(\n   
         (\n                np.zeros((2, trim)),\n                s,\n                np.zeros((2, pad)),\n                np.zeros((2, trim)),\n            ),\n            axis=1,\n        )\n\n        chunk_list = []\n        i = 0\n        while i < s.shape[1] + pad:\n            chunk_list.append(mix_p[:, i : i + m.chunk_size])\n            i += gen_size\n\n        mix_waves = np.array(chunk_list)\n\n        mix_waves_reshaped = mix_waves.reshape(-1, m.chunk_size)\n        stft_results = []\n        for w in mix_waves_reshaped:\n            stft = knf_stft(w)\n            stft_results.append(stft)\n        real = np.array(\n            [np.array(s.real).reshape(s.num_frames, -1) for s in stft_results],\n            dtype=np.float32,\n        )[:, :, :-1]\n        # real: (6, 256, 3072)\n\n        real = real.transpose(0, 2, 1)\n        # real: (6, 3072, 256)\n\n        imag = np.array(\n            [np.array(s.imag).reshape(s.num_frames, -1) for s in stft_results],\n            dtype=np.float32,\n        )[:, :, :-1]\n        imag = imag.transpose(0, 2, 1)\n        # imag: (6, 3072, 256)\n\n        x = np.stack([real, imag], axis=1)\n        # x: (6, 2, 3072, 256) -> (batch_size, real_imag, 3072, 256)\n        x = x.reshape(-1, m.dim_c, m.dim_f, m.dim_t)\n        # x: (3, 4, 3072, 256)\n        spec = m(x)\n\n        freq_pad = np.repeat(m.freq_pad, spec.shape[0], axis=0)\n\n        x = np.concatenate([spec, freq_pad], axis=2)\n        # x: (3, 4, 3073, 256)\n        x = x.reshape(-1, 2, m.n_bins, m.dim_t)\n        # x: (6, 2, 3073, 256)\n        x = x.transpose(0, 1, 3, 2)\n        # x: (6, 2, 256, 3073)\n        num_frames = x.shape[2]\n\n        x = x.reshape(x.shape[0], x.shape[1], -1)\n        wav_list = []\n        for k in range(x.shape[0]):\n            istft_result = knf.StftResult(\n                real=x[k, 0].reshape(-1).tolist(),\n                imag=x[k, 1].reshape(-1).tolist(),\n                num_frames=num_frames,\n            )\n            
wav = knf_istft(istft_result)\n            wav_list.append(wav)\n        wav = np.array(wav_list, dtype=np.float32)\n        # wav: (6, 261120)\n\n        wav = wav.reshape(-1, 2, wav.shape[-1])\n        # wav: (3, 2, 261120)\n\n        wav = wav[:, :, trim:-trim]\n        # wav: (3, 2, 254976)\n\n        wav = wav.transpose(1, 0, 2)\n        # wav: (2, 3, 254976)\n\n        wav = wav.reshape(2, -1)\n        # wav: (2, 764928)\n\n        wav = wav[:, :-pad]\n        # wav: (2, 705600)\n        if kk == 0:\n            start = 0\n        else:\n            start = margin\n\n        if kk == len(segments) - 1:\n            end = None\n        else:\n            end = -margin\n\n        sources.append(wav[:, start:end])\n\n    sources = np.concatenate(sources, axis=-1)\n\n    vocals = sources\n    non_vocals = samples - vocals\n    end_time = time.time()\n    elapsed_seconds = end_time - start_time\n\n    audio_duration = samples.shape[1] / sample_rate\n    real_time_factor = elapsed_seconds / audio_duration\n    print(f\"Elapsed seconds: {elapsed_seconds:.3f}\")\n    print(f\"Audio duration in seconds: {audio_duration:.3f}\")\n    print(f\"RTF: {elapsed_seconds:.3f}/{audio_duration:.3f} = {real_time_factor:.3f}\")\n\n    sf.write(\"./vocals.mp3\", np.transpose(vocals), sample_rate)\n    sf.write(\"./non_vocals.mp3\", np.transpose(non_vocals), sample_rate)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/vits/.gitignore",
    "content": "tokens-ljs.txt\ntokens-vctk.txt\n"
  },
  {
    "path": "scripts/vits/__init__.py",
    "content": ""
  },
  {
    "path": "scripts/vits/export-onnx-ljs.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script converts vits models trained using the LJ Speech dataset.\n\nUsage:\n\n(1) Download vits\n\ncd /Users/fangjun/open-source\ngit clone https://github.com/jaywalnut310/vits\n\n(2) Download pre-trained models from\nhttps://huggingface.co/csukuangfj/vits-ljs/tree/main\n\nwget https://huggingface.co/csukuangfj/vits-ljs/resolve/main/pretrained_ljs.pth\n\n(3) Run this file\n\n./export-onnx-ljs.py  \\\n  --config ~/open-source//vits/configs/ljs_base.json \\\n  --checkpoint ~/open-source/icefall-models/vits-ljs/pretrained_ljs.pth\n\nIt will generate the following two files:\n\n$ ls -lh *.onnx\n-rw-r--r--  1 fangjun  staff    36M Oct 10 20:48 vits-ljs.int8.onnx\n-rw-r--r--  1 fangjun  staff   109M Oct 10 20:48 vits-ljs.onnx\n\"\"\"\nimport sys\n\n# Please change this line to point to the vits directory.\n# You can download vits from\n# https://github.com/jaywalnut310/vits\nsys.path.insert(0, \"/Users/fangjun/open-source/vits\")  # noqa\n\nimport argparse\nfrom pathlib import Path\nfrom typing import Dict, Any\n\nimport commons\nimport onnx\nimport torch\nimport utils\nfrom models import SynthesizerTrn\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom text import text_to_sequence\nfrom text.symbols import symbols\nfrom text.symbols import _punctuation\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--config\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to ljs_base.json.\n        You can find it at\n        https://huggingface.co/csukuangfj/vits-ljs/resolve/main/ljs_base.json\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--checkpoint\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the checkpoint file.\n        You can find it at\n        
https://huggingface.co/csukuangfj/vits-ljs/resolve/main/pretrained_ljs.pth\n\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel(torch.nn.Module):\n    def __init__(self, model: SynthesizerTrn):\n        super().__init__()\n        self.model = model\n\n    def forward(\n        self,\n        x,\n        x_lengths,\n        noise_scale=1,\n        length_scale=1,\n        noise_scale_w=1.0,\n        sid=None,\n        max_len=None,\n    ):\n        return self.model.infer(\n            x=x,\n            x_lengths=x_lengths,\n            sid=sid,\n            noise_scale=noise_scale,\n            length_scale=length_scale,\n            noise_scale_w=noise_scale_w,\n            max_len=max_len,\n        )[0]\n\n\ndef get_text(text, hps):\n    text_norm = text_to_sequence(text, hps.data.text_cleaners)\n    if hps.data.add_blank:\n        text_norm = commons.intersperse(text_norm, 0)\n    text_norm = torch.LongTensor(text_norm)\n    return text_norm\n\n\ndef check_args(args):\n    assert Path(args.config).is_file(), args.config\n    assert Path(args.checkpoint).is_file(), args.checkpoint\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef generate_tokens():\n    with open(\"tokens-ljs.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(symbols):\n            f.write(f\"{s} {i}\\n\")\n    print(\"Generated tokens-ljs.txt\")\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    check_args(args)\n\n    generate_tokens()\n\n    hps = utils.get_hparams_from_file(args.config)\n\n    net_g = SynthesizerTrn(\n        len(symbols),\n        hps.data.filter_length // 2 + 1,\n        hps.train.segment_size // hps.data.hop_length,\n        **hps.model,\n    )\n    _ = net_g.eval()\n\n    _ = utils.load_checkpoint(args.checkpoint, net_g, None)\n\n    x = get_text(\"Liliana is the most beautiful assistant\", hps)\n    x = x.unsqueeze(0)\n\n    x_length = torch.tensor([x.shape[1]], dtype=torch.int64)\n    noise_scale = torch.tensor([1], dtype=torch.float32)\n    length_scale = torch.tensor([1], dtype=torch.float32)\n    noise_scale_w = torch.tensor([1], dtype=torch.float32)\n\n    model = OnnxModel(net_g)\n\n    opset_version = 13\n\n    filename = \"vits-ljs.onnx\"\n\n    torch.onnx.export(\n        model,\n        (x, x_length, noise_scale, length_scale, noise_scale_w),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"x_length\", \"noise_scale\", \"length_scale\", \"noise_scale_w\"],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"L\"},  # N is the batch size\n            \"x_length\": {0: \"N\"},\n            \"y\": {0: \"N\", 2: \"L\"},\n        },\n    )\n    meta_data = {\n        \"model_type\": \"vits\",\n      
  \"comment\": \"ljspeech\",\n        \"language\": \"English\",\n        \"add_blank\": int(hps.data.add_blank),\n        \"n_speakers\": int(hps.data.n_speakers),\n        \"sample_rate\": hps.data.sampling_rate,\n        \"punctuation\": \" \".join(list(_punctuation)),\n    }\n    print(\"meta_data\", meta_data)\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    print(\"Generate int8 quantization models\")\n\n    filename_int8 = \"vits-ljs.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n    print(f\"Saved to {filename} and {filename_int8}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/vits/export-onnx-vctk.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script converts vits models trained using the VCTK dataset.\n\nUsage:\n\n(1) Download vits\n\ncd /Users/fangjun/open-source\ngit clone https://github.com/jaywalnut310/vits\n\n(2) Download pre-trained models from\nhttps://huggingface.co/csukuangfj/vits-vctk/tree/main\n\nwget https://huggingface.co/csukuangfj/vits-vctk/resolve/main/pretrained_vctk.pth\n\n(3) Run this file\n\n./export-onnx-vctk.py  \\\n  --config ~/open-source//vits/configs/vctk_base.json \\\n  --checkpoint ~/open-source/icefall-models/vits-vctk/pretrained_vctk.pth\n\nIt will generate the following two files:\n\n$ ls -lh *.onnx\n-rw-r--r--  1 fangjun  staff    37M Oct 16 10:57 vits-vctk.int8.onnx\n-rw-r--r--  1 fangjun  staff   116M Oct 16 10:57 vits-vctk.onnx\n\"\"\"\nimport sys\n\n# Please change this line to point to the vits directory.\n# You can download vits from\n# https://github.com/jaywalnut310/vits\nsys.path.insert(0, \"/Users/fangjun/open-source/vits\")  # noqa\n\nimport argparse\nfrom pathlib import Path\nfrom typing import Dict, Any\n\nimport commons\nimport onnx\nimport torch\nimport utils\nfrom models import SynthesizerTrn\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom text import text_to_sequence\nfrom text.symbols import symbols\nfrom text.symbols import _punctuation\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--config\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to vctk_base.json.\n        You can find it at\n        https://huggingface.co/csukuangfj/vits-vctk/resolve/main/vctk_base.json\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--checkpoint\",\n        type=str,\n        required=True,\n        help=\"\"\"Path to the checkpoint file.\n        You can find it at\n        
https://huggingface.co/csukuangfj/vits-vctk/resolve/main/pretrained_vctk.pth\n        \"\"\",\n    )\n\n    return parser.parse_args()\n\n\nclass OnnxModel(torch.nn.Module):\n    def __init__(self, model: SynthesizerTrn):\n        super().__init__()\n        self.model = model\n\n    def forward(\n        self,\n        x,\n        x_lengths,\n        noise_scale=1,\n        length_scale=1,\n        noise_scale_w=1.0,\n        sid=0,\n        max_len=None,\n    ):\n        return self.model.infer(\n            x=x,\n            x_lengths=x_lengths,\n            sid=sid,\n            noise_scale=noise_scale,\n            length_scale=length_scale,\n            noise_scale_w=noise_scale_w,\n            max_len=max_len,\n        )[0]\n\n\ndef get_text(text, hps):\n    text_norm = text_to_sequence(text, hps.data.text_cleaners)\n    if hps.data.add_blank:\n        text_norm = commons.intersperse(text_norm, 0)\n    text_norm = torch.LongTensor(text_norm)\n    return text_norm\n\n\ndef check_args(args):\n    assert Path(args.config).is_file(), args.config\n    assert Path(args.checkpoint).is_file(), args.checkpoint\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef generate_tokens():\n    with open(\"tokens-vctk.txt\", \"w\", encoding=\"utf-8\") as f:\n        for i, s in enumerate(symbols):\n            f.write(f\"{s} {i}\\n\")\n    print(\"Generated tokens-vctk.txt\")\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    check_args(args)\n\n    generate_tokens()\n\n    hps = utils.get_hparams_from_file(args.config)\n\n    net_g = SynthesizerTrn(\n        len(symbols),\n        hps.data.filter_length // 2 + 1,\n        hps.train.segment_size // hps.data.hop_length,\n        n_speakers=hps.data.n_speakers,\n        **hps.model,\n    )\n    _ = net_g.eval()\n\n    _ = utils.load_checkpoint(args.checkpoint, net_g, None)\n\n    x = get_text(\"Liliana is the most beautiful assistant\", hps)\n    x = x.unsqueeze(0)\n\n    x_length = torch.tensor([x.shape[1]], dtype=torch.int64)\n    noise_scale = torch.tensor([1], dtype=torch.float32)\n    length_scale = torch.tensor([1], dtype=torch.float32)\n    noise_scale_w = torch.tensor([1], dtype=torch.float32)\n    sid = torch.tensor([0], dtype=torch.int64)\n\n    model = OnnxModel(net_g)\n\n    opset_version = 13\n\n    filename = \"vits-vctk.onnx\"\n\n    torch.onnx.export(\n        model,\n        (x, x_length, noise_scale, length_scale, noise_scale_w, sid),\n        filename,\n        opset_version=opset_version,\n        input_names=[\n            \"x\",\n            \"x_length\",\n            \"noise_scale\",\n            \"length_scale\",\n            \"noise_scale_w\",\n            \"sid\",\n        ],\n        output_names=[\"y\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"L\"},  # 
N is the batch size\n            \"x_length\": {0: \"N\"},\n            \"y\": {0: \"N\", 2: \"L\"},\n        },\n    )\n    meta_data = {\n        \"model_type\": \"vits\",\n        \"comment\": \"vctk\",\n        \"language\": \"English\",\n        \"add_blank\": int(hps.data.add_blank),\n        \"n_speakers\": int(hps.data.n_speakers),\n        \"sample_rate\": hps.data.sampling_rate,\n        \"punctuation\": \" \".join(list(_punctuation)),\n    }\n    print(\"meta_data\", meta_data)\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    print(\"Generate int8 quantization models\")\n\n    filename_int8 = \"vits-vctk.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        weight_type=QuantType.QUInt8,\n    )\n\n    print(f\"Saved to {filename} and {filename_int8}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/vocos/README.md",
    "content": "# Introduction\n\nThis folder contains script to export the ONNX model from\nhttps://huggingface.co/BSC-LT\nto sherpa-onnx\n"
  },
  {
    "path": "scripts/vocos/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport argparse\n\nimport onnx\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\"--in-model\", type=str, required=True, help=\"input onnx model\")\n\n    parser.add_argument(\n        \"--out-model\", type=str, required=True, help=\"output onnx model\"\n    )\n\n    return parser.parse_args()\n\n\ndef main():\n    args = get_args()\n    print(args.in_model, args.out_model)\n\n    model = onnx.load(args.in_model)\n\n    meta_data = {\n        \"model_type\": \"vocos\",\n        \"model_filename\": \"mel_spec_22khz_univ.onnx\",\n        \"sample_rate\": 22050,\n        \"version\": 1,\n        \"model_author\": \"BSC-LT\",\n        \"maintainer\": \"k2-fsa\",\n        \"n_fft\": 1024,\n        \"hop_length\": 256,\n        \"win_length\": 1024,\n        \"window_type\": \"hann\",\n        \"center\": 1,\n        \"pad_mode\": \"reflect\",\n        \"normalized\": 0,\n        \"url1\": \"https://huggingface.co/BSC-LT/vocos-mel-22khz\",\n        \"url2\": \"https://github.com/gemelo-ai/vocos\",\n    }\n\n    print(model.metadata_props)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n    print(\"--------------------\")\n\n    print(model.metadata_props)\n\n    onnx.save(model, args.out_model)\n\n    print(f\"Saved to {args.out_model}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/vocos/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport datetime as dt\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\ntry:\n    from piper_phonemize import phonemize_espeak\nexcept Exception as ex:\n    raise RuntimeError(\n        f\"{ex}\\nPlease run\\n\"\n        \"pip install piper_phonemize -f https://k2-fsa.github.io/icefall/piper_phonemize.html\"\n    )\n\n\nclass OnnxVocosModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(\"----------vocos----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print()\n\n    def __call__(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (N, feat_dim, num_frames)\n        Returns:\n          mag: (N, n_fft/2+1, num_frames)\n          x: (N, n_fft/2+1, num_frames)\n          y: (N, n_fft/2+1, num_frames)\n\n        The complex spectrum is mag * (x + j*y)\n        \"\"\"\n        assert x.ndim == 3, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        mag, x, y = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )\n\n        return mag, x, y\n\n\nclass OnnxHifiGANModel:\n    def __init__(\n        self,\n        filename: str,\n   
 ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        print(\"----------hifigan----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print()\n\n    def __call__(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (N, feat_dim, num_frames)\n        Returns:\n          audio: (N, num_samples)\n        \"\"\"\n        assert x.ndim == 3, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        audio = self.model.run(\n            [self.model.get_outputs()[0].name],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0]\n        # audio: (batch_size, num_samples)\n\n        return audio\n\n\ndef load_tokens(filename):\n    token2id = dict()\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.strip().split()\n            if len(fields) == 1:\n                t = \" \"\n                idx = int(fields[0])\n            else:\n                t, idx = line.strip().split()\n            token2id[t] = int(idx)\n    return token2id\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n        tokens: str,\n    ):\n        self.token2id = load_tokens(tokens)\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n    
    )\n\n        print(f\"{self.model.get_modelmeta().custom_metadata_map}\")\n        metadata = self.model.get_modelmeta().custom_metadata_map\n        self.sample_rate = int(metadata[\"sample_rate\"])\n\n        print(\"----------matcha----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print()\n\n    def __call__(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (1, num_tokens), int64 token IDs\n        Returns:\n          mel: (1, feat_dim, num_frames)\n        \"\"\"\n        assert x.ndim == 2, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        x_lengths = np.array([x.shape[1]], dtype=np.int64)\n\n        noise_scale = np.array([1.0], dtype=np.float32)\n        length_scale = np.array([1.0], dtype=np.float32)\n\n        mel = self.model.run(\n            [self.model.get_outputs()[0].name],\n            {\n                self.model.get_inputs()[0].name: x,\n                self.model.get_inputs()[1].name: x_lengths,\n                self.model.get_inputs()[2].name: noise_scale,\n                self.model.get_inputs()[3].name: length_scale,\n            },\n        )[0]\n        # mel: (batch_size, feat_dim, num_frames)\n\n        return mel\n\n\ndef main():\n    am = OnnxModel(\n        filename=\"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\",\n        tokens=\"./matcha-icefall-en_US-ljspeech/tokens.txt\",\n    )\n    vocoder = OnnxHifiGANModel(\"./hifigan_v2.onnx\")\n    vocos = OnnxVocosModel(\"./mel_spec_22khz_univ.onnx\")\n\n    text = \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n    tokens_list = phonemize_espeak(text, \"en-us\")\n    print(tokens_list)\n    tokens = []\n    for t in tokens_list:\n        tokens.extend(t)\n\n    token_ids = []\n    for t in tokens:\n        if t not in am.token2id:\n            print(f\"Skip OOV '{t}'\")\n            continue\n        token_ids.append(am.token2id[t])\n\n    token_ids2 = [am.token2id[\"_\"]] * (len(token_ids) * 2 + 1)\n    token_ids2[1::2] = token_ids\n    token_ids = token_ids2\n    x = np.array([token_ids], dtype=np.int64)\n\n    mel_start_t = dt.datetime.now()\n    mel = am(x)\n    mel_end_t = dt.datetime.now()\n\n    print(\"mel\", mel.shape)\n    # mel:(1, 80, 78)\n\n    vocos_start_t = dt.datetime.now()\n    mag, x, y = vocos(mel)\n    stft_result = knf.StftResult(\n        real=(mag * x)[0].transpose().reshape(-1).tolist(),\n        imag=(mag * y)[0].transpose().reshape(-1).tolist(),\n        num_frames=mag.shape[2],\n    )\n    config = knf.StftConfig(\n        n_fft=1024,\n        hop_length=256,\n        win_length=1024,\n        window_type=\"hann\",\n        center=True,\n        pad_mode=\"reflect\",\n        normalized=False,\n    )\n    istft = knf.IStft(config)\n    audio_vocos = istft(stft_result)\n    vocos_end_t = dt.datetime.now()\n\n    audio_vocos = np.array(audio_vocos)\n    #  audio = audio / 2\n    print(\"vocos max/min\", np.max(audio_vocos), np.min(audio_vocos))\n\n    sf.write(\"vocos.wav\", audio_vocos, am.sample_rate, \"PCM_16\")\n\n    hifigan_start_t = dt.datetime.now()\n    audio_hifigan = vocoder(mel)\n    hifigan_end_t = dt.datetime.now()\n    audio_hifigan = audio_hifigan.squeeze()\n\n    print(\"hifigan max/min\", np.max(audio_hifigan), np.min(audio_hifigan))\n\n    sample_rate = am.sample_rate\n    sf.write(\"hifigan-v2.wav\", audio_hifigan, sample_rate, \"PCM_16\")\n\n    am_t = (mel_end_t - 
mel_start_t).total_seconds()\n    vocos_t = (vocos_end_t - vocos_start_t).total_seconds()\n    hifigan_t = (hifigan_end_t - hifigan_start_t).total_seconds()\n\n    mean_audio_duration = (\n        (audio_vocos.shape[-1] + audio_hifigan.shape[-1]) / 2 / sample_rate\n    )\n    rtf_am = am_t / mean_audio_duration\n\n    rtf_vocos = vocos_t * sample_rate / audio_vocos.shape[-1]\n    rtf_hifigan = hifigan_t * sample_rate / audio_hifigan.shape[-1]\n\n    print(\n        \"Audio duration for vocos {:.3f} s\".format(audio_vocos.shape[-1] / sample_rate)\n    )\n    print(\n        \"Audio duration for hifigan {:.3f} s\".format(\n            audio_hifigan.shape[-1] / sample_rate\n        )\n    )\n    print(\"Mean audio duration: {:.3f} s\".format(mean_audio_duration))\n    print(\"RTF for acoustic model {:.3f}\".format(rtf_am))\n    print(\"RTF for vocos {:.3f}\".format(rtf_vocos))\n    print(\"RTF for hifigan {:.3f}\".format(rtf_hifigan))\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wasm/generate-tts.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    model_name: str\n    hf: str  # huggingface space name\n    ms: str  # modelscope space name\n    cmd: str = \"\"\n\n\ndef get_models():\n    models = [\n        Model(\n            model_name=\"vits-piper-de_DE-thorsten_emotional-medium\",\n            hf=\"k2-fsa/web-assembly-tts-sherpa-onnx-de\",\n            ms=\"k2-fsa/web-assembly-tts-sherpa-onnx-de\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v *.onnx ../\n            mv -v tokens.txt ../\n            mv -v espeak-ng-data ../\n            popd\n\n\n            git checkout .\n\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"vits-piper-en_US-libritts_r-medium\",\n            hf=\"k2-fsa/web-assembly-tts-sherpa-onnx-en\",\n            ms=\"k2-fsa/web-assembly-tts-sherpa-onnx-en\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v *.onnx ../\n            mv -v tokens.txt ../\n            mv -v espeak-ng-data ../\n            popd\n\n\n            git checkout .\n\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"matcha-icefall-zh-en\",\n            hf=\"k2-fsa/web-assembly-zh-en-tts-matcha\",\n            ms=\"csukuangfj/web-assembly-zh-en-tts-matcha\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v *.fst ../\n            mv -v *.onnx ../\n            mv -v tokens.txt 
../\n            mv -v lexicon.txt ../\n            mv -v espeak-ng-data ../\n            popd\n\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-16khz-univ.onnx\n\n            git checkout .\n            sed -i.bak 's/let modelType = 0/let modelType = 1/g' ../sherpa-onnx-tts.js\n\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"matcha-icefall-zh-baker\",\n            hf=\"k2-fsa/web-assembly-zh-tts-matcha\",\n            ms=\"csukuangfj/web-assembly-zh-tts-matcha\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v *.fst ../\n            mv -v *.onnx ../\n            mv -v tokens.txt ../\n            mv -v lexicon.txt ../\n            popd\n\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n\n            git checkout .\n            sed -i.bak 's/let modelType = 0/let modelType = 2/g' ../sherpa-onnx-tts.js\n\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"matcha-icefall-en_US-ljspeech\",\n            hf=\"k2-fsa/web-assembly-en-tts-matcha\",\n            ms=\"csukuangfj/web-assembly-en-tts-matcha\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v *.onnx ../\n            mv -v tokens.txt ../\n            mv -v espeak-ng-data ../\n            popd\n\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\n\n\n            git checkout .\n            sed -i.bak 's/let modelType = 0/let modelType = 3/g' ../sherpa-onnx-tts.js\n\n             rm -rf $model_name\n             git diff\n             \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\",\n            hf=\"k2-fsa/web-assembly-zh-en-tts-zipvoice\",\n            
ms=\"csukuangfj/web-assembly-zh-en-tts-zipvoice\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v encoder.int8.onnx ../\n            mv -v decoder.int8.onnx ../\n            mv -v tokens.txt ../\n            mv -v lexicon.txt ../\n            mv -v espeak-ng-data ../\n            popd\n\n            curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n            git checkout .\n            sed -i.bak 's/let modelType = 0/let modelType = 4/g' ../sherpa-onnx-tts.js\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-pocket-tts-int8-2026-01-26\",\n            hf=\"k2-fsa/web-assembly-en-tts-pocket\",\n            ms=\"csukuangfj/web-assembly-en-tts-pocket\",\n            cmd=\"\"\"\n            pushd $model_name\n\n            mv -v lm_flow.int8.onnx ../\n            mv -v lm_main.int8.onnx ../\n            mv -v encoder.onnx ../\n            mv -v decoder.int8.onnx ../\n            mv -v text_conditioner.onnx ../\n            mv -v vocab.json ../\n            mv -v token_scores.json ../\n            popd\n\n            git checkout .\n            sed -i.bak 's/let modelType = 0/let modelType = 5/g' ../sherpa-onnx-tts.js\n            rm -rf $model_name\n            git diff\n            \"\"\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    
d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./run-tts.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wasm/generate-vad-asr.py",
    "content": "#!/usr/bin/env python3\n\nimport argparse\nfrom dataclasses import dataclass\n\nimport jinja2\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--total\",\n        type=int,\n        default=1,\n        help=\"Number of runners\",\n    )\n    parser.add_argument(\n        \"--index\",\n        type=int,\n        default=0,\n        help=\"Index of the current runner\",\n    )\n    return parser.parse_args()\n\n\n@dataclass\nclass Model:\n    model_name: str\n    hf: str  # huggingface space name\n    ms: str  # modelscope space name\n    short_name: str\n    cmd: str = \"\"\n\n\ndef get_models():\n    models = [\n        Model(\n            model_name=\"sherpa-onnx-whisper-tiny.en\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny\",\n            short_name=\"vad-asr-en-whisper_tiny\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v tiny.en-encoder.int8.onnx ../whisper-encoder.onnx\n            mv -v tiny.en-decoder.int8.onnx ../whisper-decoder.onnx\n            mv -v tiny.en-tokens.txt ../tokens.txt\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Whisper tiny.en supporting English 英文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-int8\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny\",\n            short_name=\"vad-asr-en-moonshine_tiny\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v preprocess.onnx ../moonshine-preprocessor.onnx\n            mv -v encode.int8.onnx ../moonshine-encoder.onnx\n            mv -v uncached_decode.int8.onnx ../moonshine-uncached-decoder.onnx\n            mv -v 
cached_decode.int8.onnx ../moonshine-cached-decoder.onnx\n            mv -v tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine tiny supporting English 英文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-en\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-en\",\n            short_name=\"vad-asr-moonshine-v2-tiny-en\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 tiny-en supporting English 英语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-ja-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-ja\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-ja\",\n            short_name=\"vad-asr-moonshine-v2-tiny-ja\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 tiny-ja supporting Japanese 日语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-tiny-ko-quantized-2026-02-27\",\n            
hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-ko\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-tiny-ko\",\n            short_name=\"vad-asr-moonshine-v2-tiny-ko\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 tiny-ko supporting Korean 韩语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-en-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-en\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-en\",\n            short_name=\"vad-asr-moonshine-v2-base-en\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-en supporting English 英语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-zh-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-zh\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-zh\",\n            short_name=\"vad-asr-moonshine-v2-base-zh\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt 
../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-zh supporting Chinese 普通话/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-ja-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-ja\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-ja\",\n            short_name=\"vad-asr-moonshine-v2-base-ja\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-ja supporting Japanese 日文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-vi-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-vi\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-vi\",\n            short_name=\"vad-asr-moonshine-v2-base-vi\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-vi supporting Vietnamese 越南语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-es-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-es\",\n            
ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-es\",\n            short_name=\"vad-asr-moonshine-v2-base-es\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-es supporting Spanish 西班牙语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-ar-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-ar\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-ar\",\n            short_name=\"vad-asr-moonshine-v2-base-ar\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-ar supporting Arabic 阿拉伯语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-moonshine-base-uk-quantized-2026-02-27\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-uk\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-moonshine-v2-base-uk\",\n            short_name=\"vad-asr-moonshine-v2-base-uk\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v encoder_model.ort ../moonshine-encoder.ort\n            mv -v decoder_model_merged.ort ../moonshine-merged-decoder.ort\n            mv -v tokens.txt ../\n            mv -v LICENSE ../\n            popd\n            rm -rf 
$model_name\n            sed -i.bak 's/Zipformer/Moonshine v2 base-uk supporting Ukrainian 乌克兰语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice\",\n            short_name=\"vad-asr-zh_en_ja_ko_cantonese-sense_voice_small\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v model.int8.onnx ../sense-voice.onnx\n            mv -v tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/SenseVoice Small supporting English, Chinese, Japanese, Korean, Cantonese 中英日韩粤/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-2023-09-14\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\",\n            ms=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer\",\n            short_name=\"vad-asr-zh_en-paraformer_large\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v model.int8.onnx ../paraformer.onnx\n            mv -v tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Paraformer supporting Chinese, English 中英/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-paraformer-zh-small-2024-03-09\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\",\n            ms=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small\",\n            short_name=\"vad-asr-zh_en-paraformer_small\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v model.int8.onnx ../paraformer.onnx\n            
mv -v tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Paraformer-small supporting Chinese, English 中英文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-gigaspeech-2023-12-12\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\",\n            ms=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech\",\n            short_name=\"vad-asr-en-zipformer_gigaspeech\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv encoder-epoch-30-avg-1.int8.onnx ../transducer-encoder.onnx\n            mv decoder-epoch-30-avg-1.onnx ../transducer-decoder.onnx\n            mv joiner-epoch-30-avg-1.int8.onnx ../transducer-joiner.onnx\n            mv tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Zipformer supporting English 英语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"icefall-asr-zipformer-wenetspeech-20230615\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\",\n            ms=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech\",\n            short_name=\"vad-asr-zh-zipformer_wenetspeech\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv -v data/lang_char/tokens.txt ../\n            mv -v exp/encoder-epoch-12-avg-4.int8.onnx ../transducer-encoder.onnx\n            mv -v exp/decoder-epoch-12-avg-4.onnx ../transducer-decoder.onnx\n            mv -v exp/joiner-epoch-12-avg-4.int8.onnx ../transducer-joiner.onnx\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Zipformer supporting Chinese 中文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            
model_name=\"sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-ja-zipformer\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-ja-zipformer\",\n            short_name=\"vad-asr-ja-zipformer_reazonspeech\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv encoder-epoch-99-avg-1.int8.onnx ../transducer-encoder.onnx\n            mv decoder-epoch-99-avg-1.onnx ../transducer-decoder.onnx\n            mv joiner-epoch-99-avg-1.int8.onnx ../transducer-joiner.onnx\n            mv tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Zipformer supporting Japanese 日语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-thai-2024-06-20\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-th-zipformer\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-th-zipformer\",\n            short_name=\"vad-asr-th-zipformer_gigaspeech2\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv encoder-epoch-12-avg-5.int8.onnx ../transducer-encoder.onnx\n            mv decoder-epoch-12-avg-5.onnx ../transducer-decoder.onnx\n            mv joiner-epoch-12-avg-5.int8.onnx ../transducer-joiner.onnx\n            mv tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Zipformer supporting Thai 泰语/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech\",\n            ms=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech\",\n            short_name=\"vad-asr-zh-telespeech\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv model.int8.onnx ../telespeech.onnx\n            mv tokens.txt 
../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/TeleSpeech-ASR supporting Chinese 多种中文方言/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-multi-lang-dophin-ctc\",\n            short_name=\"vad-asr-multi_lang-dolphin_ctc\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv model.int8.onnx ../dolphin.onnx\n            mv tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's%Zipformer%<a href=\"https://github.com/DataoceanAI/Dolphin\">Dolphin</a> (多种中文方言及非常多种语言)%g' ../index.html\n            git diff\n            \"\"\",\n        ),\n        Model(\n            model_name=\"sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\",\n            hf=\"k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\",\n            ms=\"csukuangfj/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-ctc\",\n            short_name=\"vad-asr-zh-zipformer-ctc\",\n            cmd=\"\"\"\n            pushd $model_name\n            mv model.int8.onnx ../zipformer-ctc.onnx\n            mv tokens.txt ../\n            popd\n            rm -rf $model_name\n            sed -i.bak 's/Zipformer/Zipformer CTC supporting Chinese 中文/g' ../index.html\n            git diff\n            \"\"\",\n        ),\n    ]\n    return models\n\n\ndef main():\n    args = get_args()\n    index = args.index\n    total = args.total\n    assert 0 <= index < total, (index, total)\n\n    all_model_list = get_models()\n\n    num_models = len(all_model_list)\n\n    num_per_runner = num_models // total\n    if num_per_runner <= 0:\n        raise ValueError(f\"num_models: {num_models}, num_runners: {total}\")\n\n    start = index * num_per_runner\n    end = start + 
num_per_runner\n\n    remaining = num_models - args.total * num_per_runner\n\n    print(f\"{index}/{total}: {start}-{end}/{num_models}\")\n\n    d = dict()\n    d[\"model_list\"] = all_model_list[start:end]\n    if index < remaining:\n        s = args.total * num_per_runner + index\n        d[\"model_list\"].append(all_model_list[s])\n        print(f\"{s}/{num_models}\")\n\n    filename_list = [\n        \"./run-vad-asr.sh\",\n    ]\n    for filename in filename_list:\n        environment = jinja2.Environment()\n        with open(f\"{filename}.in\") as f:\n            s = f.read()\n        template = environment.from_string(s)\n\n        s = template.render(**d)\n        with open(filename, \"w\") as f:\n            print(s, file=f)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wasm/run-tts.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Build WebAssembly APPs for huggingface spaces and modelscope spaces\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n\n{% for model in model_list %}\nmodel_name={{ model.model_name }}\nhf_name={{ model.hf }}\nms_name={{ model.ms }}\n\npushd wasm/tts\ngit checkout .\nrm -rf assets\nmkdir assets\ncd assets\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\nrm ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\npopd\n\nls -lh wasm/tts/assets\n\nrm -rf build-wasm-simd-tts/install\nrm -rf build-wasm-simd-tts/wasm\n\n./build-wasm-simd-tts.sh\n\ndst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-${model_name}\nmv build-wasm-simd-tts/install/bin/wasm/tts $dst\nls -lh $dst\ntar cjfv $dst.tar.bz2 ./$dst\nls -lh *.tar.bz2\n\ngit config --global user.email \"csukuangfj@gmail.com\"\ngit config --global user.name \"Fangjun Kuang\"\n\nexport GIT_LFS_SKIP_SMUDGE=1\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\nrm -rf ms\ngit clone https://www.modelscope.cn/studios/$ms_name.git ms\n\ncd ms\ncp -v ../$dst/* .\n\ngit status\ngit lfs track \"*.data\"\ngit lfs track \"*.wasm\"\nls -lh\n\ngit add .\ngit commit -m \"update model\" || true\ngit push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/$ms_name.git || true\ncd ..\nrm -rf ms\n\nrm -rf huggingface\n\ngit clone https://huggingface.co/spaces/$hf_name huggingface\ncd huggingface\ncp -v ../$dst/* .\n\ngit status\ngit lfs track \"*.data\"\ngit lfs track \"*.wasm\"\nls -lh\n\ngit add .\ngit commit -m \"update model\" || true\ngit push https://csukuangfj:$HF_TOKEN@huggingface.co/spaces/$hf_name main || true\ncd ..\nrm -rf huggingface\nrm -rf $dst\n\nls -lh *.tar.bz2\n\n{% 
endfor %}\n"
  },
  {
    "path": "scripts/wasm/run-vad-asr.sh.in",
    "content": "#!/usr/bin/env bash\n#\n# Build WebAssembly APPs for huggingface spaces and modelscope spaces\n\nset -ex\n\nlog() {\n  # This function is from espnet\n  local fname=${BASH_SOURCE[1]##*/}\n  echo -e \"$(date '+%Y-%m-%d %H:%M:%S') (${fname}:${BASH_LINENO[0]}:${FUNCNAME[1]}) $*\"\n}\n\nSHERPA_ONNX_VERSION=$(grep \"SHERPA_ONNX_VERSION\" ./CMakeLists.txt  | cut -d \" \" -f 2  | cut -d '\"' -f 2)\n\n\n{% for model in model_list %}\nmodel_name={{ model.model_name }}\nshort_name={{ model.short_name }}\nhf_name={{ model.hf }}\nms_name={{ model.ms }}\n\npushd wasm/vad-asr\ngit checkout .\nrm -rf assets\nmkdir assets\ncd assets\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/${model_name}.tar.bz2\ntar xvf ${model_name}.tar.bz2\nrm ${model_name}.tar.bz2\n\n{{ model.cmd }}\n\npopd\n\nls -lh wasm/vad-asr/assets\n\nrm -rf build-wasm-simd-vad-asr/install\nrm -rf build-wasm-simd-vad-asr/wasm\n\n./build-wasm-simd-vad-asr.sh\n\ndst=sherpa-onnx-wasm-simd-${SHERPA_ONNX_VERSION}-${short_name}\nmv build-wasm-simd-vad-asr/install/bin/wasm/vad-asr $dst\nls -lh $dst\ntar cjfv $dst.tar.bz2 ./$dst\nls -lh *.tar.bz2\n\ngit config --global user.email \"csukuangfj@gmail.com\"\ngit config --global user.name \"Fangjun Kuang\"\n\nexport GIT_LFS_SKIP_SMUDGE=1\nexport GIT_CLONE_PROTECTION_ACTIVE=false\n\nif [ x\"$ms_name\" != x\"\" ]; then\n  rm -rf ms\n  git clone https://www.modelscope.cn/studios/$ms_name.git ms\n\n  cd ms\n  cp -v ../$dst/* .\n\n  git status\n  git lfs track \"*.data\"\n  git lfs track \"*.wasm\"\n  ls -lh\n\n  git add .\n  git commit -m \"update model\" || true\n  git push https://oauth2:${MS_TOKEN}@www.modelscope.cn/studios/$ms_name.git || true\n  cd ..\n  rm -rf ms\nfi\n\nrm -rf huggingface\n\ngit clone https://huggingface.co/spaces/$hf_name huggingface\ncd huggingface\ncp -v ../$dst/* .\n\ngit status\ngit lfs track \"*.data\"\ngit lfs 
track \"*.wasm\"\nls -lh\n\ngit add .\ngit commit -m \"update model\" || true\ngit push https://csukuangfj2:$HF_TOKEN@huggingface.co/spaces/$hf_name main || true\ncd ..\nrm -rf huggingface\nrm -rf $dst\n\nls -lh *.tar.bz2\n\n{% endfor %}\n"
  },
  {
    "path": "scripts/wenet/README.md",
    "content": "# Introduction\n\nThis folder contains script for exporting models\nfrom [wenet](https://github.com/wenet-e2e/wenet)\nto onnx. You can use the exported models in sherpa-onnx.\n\nNote that both **streaming** and **non-streaming** models are supported.\n\nWe only use the CTC branch. Rescore with the attention decoder\nis not supported, though decoding with H, HL, and HLG is supported.\n"
  },
  {
    "path": "scripts/wenet/export-onnx-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# pip install git+https://github.com/wenet-e2e/wenet.git\n# pip install onnxruntime onnx pyyaml\n# cp -a ~/open-source/wenet/wenet/transducer/search .\n# cp -a ~/open-source//wenet/wenet/e_branchformer .\n# cp -a ~/open-source/wenet/wenet/ctl_model .\n\nimport os\nfrom typing import Dict\n\nimport onnx\nimport torch\nimport yaml\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\nfrom wenet.utils.init_model import init_model\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    #  model = onnx.version_converter.convert_version(model, 21)\n\n    onnx.save(model, filename)\n\n\nclass OnnxModel(torch.nn.Module):\n    def __init__(self, encoder: torch.nn.Module, ctc: torch.nn.Module):\n        super().__init__()\n        self.encoder = encoder\n        self.ctc = ctc\n\n    def forward(\n        self,\n        x: torch.Tensor,\n        offset: torch.Tensor,\n        required_cache_size: torch.Tensor,\n        attn_cache: torch.Tensor,\n        conv_cache: torch.Tensor,\n        attn_mask: torch.Tensor,\n    ):\n        \"\"\"\n        Args:\n          x:\n            A 3-D float32 tensor of shape (N, T, C). 
It supports only N == 1.\n          offset:\n            A scalar of dtype torch.int64.\n          required_cache_size:\n            A scalar of dtype torch.int64.\n          attn_cache:\n            A 4-D float32 tensor of shape (num_blocks, head, required_cache_size, encoder_output_size / head * 2).\n          conv_cache:\n            A 4-D float32 tensor of shape (num_blocks, N, encoder_output_size, cnn_module_kernel - 1).\n          attn_mask:\n            A 3-D bool tensor of shape (N, 1, required_cache_size + chunk_size)\n        Returns:\n          Return a tuple of 3 tensors:\n            - A 3-D float32 tensor of shape (N, T, C) containing log_probs\n            - next_att_cache\n            - next_conv_cache\n        \"\"\"\n        encoder_out, next_att_cache, next_conv_cache = self.encoder.forward_chunk(\n            xs=x,\n            offset=offset,\n            required_cache_size=required_cache_size,\n            att_cache=attn_cache,\n            cnn_cache=conv_cache,\n            att_mask=attn_mask,\n        )\n        log_probs = self.ctc.log_softmax(encoder_out)\n\n        return log_probs, next_att_cache, next_conv_cache\n\n\nclass Foo:\n    pass\n\n\n@torch.no_grad()\ndef main():\n    args = Foo()\n    args.checkpoint = \"./final.pt\"\n    config_file = \"./train.yaml\"\n\n    with open(config_file, \"r\") as fin:\n        configs = yaml.load(fin, Loader=yaml.FullLoader)\n    torch_model, configs = init_model(args, configs)\n    torch_model.eval()\n\n    head = configs[\"encoder_conf\"][\"attention_heads\"]\n    num_blocks = configs[\"encoder_conf\"][\"num_blocks\"]\n    output_size = configs[\"encoder_conf\"][\"output_size\"]\n    cnn_module_kernel = configs[\"encoder_conf\"].get(\"cnn_module_kernel\", 1)\n\n    right_context = torch_model.right_context()\n    subsampling_factor = torch_model.encoder.embed.subsampling_rate\n    chunk_size = 16\n    left_chunks = 4\n\n    decoding_window = (chunk_size - 1) * subsampling_factor + right_context + 
1\n\n    required_cache_size = chunk_size * left_chunks\n\n    offset = required_cache_size\n\n    attn_cache = torch.zeros(\n        num_blocks,\n        head,\n        required_cache_size,\n        output_size // head * 2,\n        dtype=torch.float32,\n    )\n\n    attn_mask = torch.ones(1, 1, required_cache_size + chunk_size, dtype=torch.bool)\n    attn_mask[:, :, :required_cache_size] = 0\n\n    conv_cache = torch.zeros(\n        num_blocks, 1, output_size, cnn_module_kernel - 1, dtype=torch.float32\n    )\n\n    sos = torch_model.sos_symbol()\n    eos = torch_model.eos_symbol()\n\n    onnx_model = OnnxModel(\n        encoder=torch_model.encoder,\n        ctc=torch_model.ctc,\n    )\n    filename = \"model-streaming.onnx\"\n\n    N = 1\n    T = decoding_window\n    C = 80\n    x = torch.rand(N, T, C, dtype=torch.float32)\n    offset = torch.tensor([offset], dtype=torch.int64)\n    required_cache_size = torch.tensor([required_cache_size], dtype=torch.int64)\n\n    opset_version = 13\n    torch.onnx.export(\n        onnx_model,\n        (x, offset, required_cache_size, attn_cache, conv_cache, attn_mask),\n        filename,\n        opset_version=opset_version,\n        input_names=[\n            \"x\",\n            \"offset\",\n            \"required_cache_size\",\n            \"attn_cache\",\n            \"conv_cache\",\n            \"attn_mask\",\n        ],\n        output_names=[\"log_probs\", \"next_att_cache\", \"next_conv_cache\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"attn_cache\": {2: \"T\"},\n            \"attn_mask\": {2: \"T\"},\n            \"log_probs\": {0: \"N\"},\n            \"next_att_cache\": {2: \"T\"},\n        },\n    )\n\n    # https://wenet.org.cn/downloads?models=wenet&version=aishell_u2pp_conformer_exp.tar.gz\n    url = os.environ.get(\"WENET_URL\", \"\")\n    meta_data = {\n        \"model_type\": \"wenet_ctc\",\n        \"version\": \"1\",\n        \"model_author\": \"wenet\",\n        
\"comment\": \"streaming\",\n        \"url\": \"https://wenet.org.cn/downloads?models=wenet&version=aishell_u2pp_conformer_exp.tar.gz\",\n        \"chunk_size\": chunk_size,\n        \"left_chunks\": left_chunks,\n        \"head\": head,\n        \"num_blocks\": num_blocks,\n        \"output_size\": output_size,\n        \"cnn_module_kernel\": cnn_module_kernel,\n        \"right_context\": right_context,\n        \"subsampling_factor\": subsampling_factor,\n        \"vocab_size\": torch_model.ctc.ctc_lo.weight.shape[0],\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    print(\"Generate int8 quantization models\")\n\n    filename_int8 = f\"model-streaming.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wenet/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n# pip install git+https://github.com/wenet-e2e/wenet.git\n# pip install onnxruntime onnx pyyaml\n# cp -a ~/open-source/wenet/wenet/transducer/search .\n# cp -a ~/open-source//wenet/wenet/e_branchformer .\n# cp -a ~/open-source/wenet/wenet/ctl_model .\n\nimport os\nfrom typing import Dict\n\nimport onnx\nimport torch\nimport yaml\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\n\nfrom wenet.utils.init_model import init_model\n\n\nclass Foo:\n    pass\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    #  model = onnx.version_converter.convert_version(model, 21)\n\n    onnx.save(model, filename)\n\n\nclass OnnxModel(torch.nn.Module):\n    def __init__(self, encoder: torch.nn.Module, ctc: torch.nn.Module):\n        super().__init__()\n        self.encoder = encoder\n        self.ctc = ctc\n\n    def forward(self, x, x_lens):\n        \"\"\"\n        Args:\n          x:\n            A 3-D tensor of shape (N, T, C)\n          x_lens:\n            A 1-D tensor of shape (N,) containing valid lengths in x before\n            padding. 
Its type is torch.int64\n        \"\"\"\n        encoder_out, encoder_out_mask = self.encoder(\n            x,\n            x_lens,\n            decoding_chunk_size=-1,\n            num_decoding_left_chunks=-1,\n        )\n        log_probs = self.ctc.log_softmax(encoder_out)\n        log_probs_lens = encoder_out_mask.int().squeeze(1).sum(1)\n\n        return log_probs, log_probs_lens\n\n\n@torch.no_grad()\ndef main():\n    args = Foo()\n    args.checkpoint = \"./final.pt\"\n    config_file = \"./train.yaml\"\n\n    with open(config_file, \"r\") as fin:\n        configs = yaml.load(fin, Loader=yaml.FullLoader)\n    torch_model, configs = init_model(args, configs)\n    torch_model.eval()\n\n    onnx_model = OnnxModel(encoder=torch_model.encoder, ctc=torch_model.ctc)\n    filename = \"model.onnx\"\n\n    N = 1\n    T = 1000\n    C = 80\n    x = torch.rand(N, T, C, dtype=torch.float)\n    x_lens = torch.full((N,), fill_value=T, dtype=torch.int64)\n\n    # https://github.com/pytorch/pytorch/issues/114801\n    opset_version = 13\n    onnx_model = torch.jit.script(onnx_model)\n    torch.onnx.export(\n        onnx_model,\n        (x, x_lens),\n        filename,\n        opset_version=opset_version,\n        input_names=[\"x\", \"x_lens\"],\n        output_names=[\"log_probs\", \"log_probs_lens\"],\n        dynamic_axes={\n            \"x\": {0: \"N\", 1: \"T\"},\n            \"x_lens\": {0: \"N\"},\n            \"log_probs\": {0: \"N\", 1: \"T\"},\n            \"log_probs_lens\": {0: \"N\"},\n        },\n    )\n\n    # https://wenet.org.cn/downloads?models=wenet&version=aishell_u2pp_conformer_exp.tar.gz\n    url = os.environ.get(\"WENET_URL\", \"\")\n    meta_data = {\n        \"model_type\": \"wenet_ctc\",\n        \"version\": \"1\",\n        \"model_author\": \"wenet\",\n        \"comment\": \"non-streaming\",\n        \"subsampling_factor\": torch_model.encoder.embed.subsampling_rate,\n        \"vocab_size\": torch_model.ctc.ctc_lo.weight.shape[0],\n        \"url\": 
url,\n    }\n    add_meta_data(filename=filename, meta_data=meta_data)\n\n    print(\"Generate int8 quantization models\")\n\n    filename_int8 = \"model.int8.onnx\"\n    quantize_dynamic(\n        model_input=filename,\n        model_output=filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wenet/test-onnx-streaming.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport kaldi_native_fbank as knf\nimport onnxruntime as ort\nimport torch\nimport torchaudio\nfrom torch.nn.utils.rnn import pad_sequence\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.left_chunks = int(meta[\"left_chunks\"])\n        self.num_blocks = int(meta[\"num_blocks\"])\n        self.chunk_size = int(meta[\"chunk_size\"])\n        self.head = int(meta[\"head\"])\n        self.output_size = int(meta[\"output_size\"])\n        self.cnn_module_kernel = int(meta[\"cnn_module_kernel\"])\n        self.right_context = int(meta[\"right_context\"])\n        self.subsampling_factor = int(meta[\"subsampling_factor\"])\n\n        self._init_cache()\n\n    def _init_cache(self):\n        required_cache_size = self.chunk_size * self.left_chunks\n\n        self.attn_cache = torch.zeros(\n            self.num_blocks,\n            self.head,\n            required_cache_size,\n            self.output_size // self.head * 2,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.conv_cache = torch.zeros(\n            self.num_blocks,\n            1,\n            self.output_size,\n            self.cnn_module_kernel - 1,\n            dtype=torch.float32,\n        ).numpy()\n\n        self.offset = torch.tensor([required_cache_size], dtype=torch.int64).numpy()\n\n        self.required_cache_size = torch.tensor(\n            [self.chunk_size * self.left_chunks], dtype=torch.int64\n       
 ).numpy()\n\n    def __call__(self, x: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        Args:\n          x:\n            A 2-D tensor of shape (T, C)\n        Returns:\n          Return a 2-D tensor of shape (T, C) containing log_probs.\n        \"\"\"\n        attn_mask = torch.ones(\n            1, 1, int(self.required_cache_size + self.chunk_size), dtype=torch.bool\n        )\n        chunk_idx = self.offset // self.chunk_size - self.left_chunks\n        if chunk_idx < self.left_chunks:\n            attn_mask[\n                :, :, : int(self.required_cache_size - chunk_idx * self.chunk_size)\n            ] = False\n\n        log_probs, new_attn_cache, new_conv_cache = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x.unsqueeze(0).numpy(),\n                self.model.get_inputs()[1].name: self.offset,\n                self.model.get_inputs()[2].name: self.required_cache_size,\n                self.model.get_inputs()[3].name: self.attn_cache,\n                self.model.get_inputs()[4].name: self.conv_cache,\n                self.model.get_inputs()[5].name: attn_mask.numpy(),\n            },\n        )\n\n        self.attn_cache = new_attn_cache\n        self.conv_cache = new_conv_cache\n\n        log_probs = torch.from_numpy(log_probs)\n\n        self.offset += log_probs.shape[1]\n\n        return log_probs.squeeze(0)\n\n\ndef get_features(test_wav_filename):\n    wave, sample_rate = torchaudio.load(test_wav_filename)\n    audio = wave[0].contiguous()  # only use the first channel\n    if sample_rate != 16000:\n        audio = torchaudio.functional.resample(\n            audio, orig_freq=sample_rate, new_freq=16000\n        )\n    audio *= 32768\n\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.mel_opts.num_bins = 
80\n    opts.frame_opts.snip_edges = False\n    opts.mel_opts.debug_mel = False\n\n    fbank = knf.OnlineFbank(opts)\n    fbank.accept_waveform(16000, audio.numpy())\n    frames = []\n    for i in range(fbank.num_frames_ready):\n        frames.append(torch.from_numpy(fbank.get_frame(i)))\n    frames = torch.stack(frames)\n    return frames\n\n\ndef main():\n    model_filename = \"./model-streaming.onnx\"\n    model = OnnxModel(model_filename)\n\n    filename = \"./0.wav\"\n    x = get_features(filename)\n\n    padding = torch.zeros(50, 80)\n    x = torch.cat([x, padding], dim=0)\n\n    chunk_length = (\n        (model.chunk_size - 1) * model.subsampling_factor + model.right_context + 1\n    )\n    chunk_length = int(chunk_length)\n    chunk_shift = int(model.chunk_size * model.subsampling_factor)\n    print(chunk_length, chunk_shift)\n\n    num_frames = x.shape[0]\n    n = (num_frames - chunk_length) // chunk_shift + 1\n    tokens = []\n    for i in range(n):\n        start = i * chunk_shift\n        end = start + chunk_length\n        frames = x[start:end, :]\n        log_probs = model(frames)\n\n        indexes = log_probs.argmax(dim=1)\n        indexes = torch.unique_consecutive(indexes)\n        indexes = indexes[indexes != 0].tolist()\n        if indexes:\n            tokens.extend(indexes)\n\n    id2word = dict()\n    with open(\"./units.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            word, idx = line.strip().split()\n            id2word[int(idx)] = word\n    text = \"\".join([id2word[i] for i in tokens])\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wenet/test-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport kaldi_native_fbank as knf\nimport onnxruntime as ort\nimport torch\nimport torchaudio\nfrom torch.nn.utils.rnn import pad_sequence\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def __call__(self, x: torch.Tensor, x_lens: torch.Tensor) -> torch.Tensor:\n        \"\"\"\n        Args:\n          x:\n            A 3-D tensor of shape (N, T, C)\n          x_lens:\n            A 1-D tensor of shape (N,). Its dtype is torch.int64\n        Returns:\n          Return a 3-D tensor of shape (N, T, C) containing log_probs.\n        \"\"\"\n        log_probs, log_probs_lens = self.model.run(\n            [self.model.get_outputs()[0].name, self.model.get_outputs()[1].name],\n            {\n                self.model.get_inputs()[0].name: x.numpy(),\n                self.model.get_inputs()[1].name: x_lens.numpy(),\n            },\n        )\n        return torch.from_numpy(log_probs), torch.from_numpy(log_probs_lens)\n\n\ndef get_features(test_wav_filename):\n    wave, sample_rate = torchaudio.load(test_wav_filename)\n    audio = wave[0].contiguous()  # only use the first channel\n    if sample_rate != 16000:\n        audio = torchaudio.functional.resample(\n            audio, orig_freq=sample_rate, new_freq=16000\n        )\n    audio *= 32768\n\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.mel_opts.num_bins = 80\n    opts.frame_opts.snip_edges = False\n    opts.mel_opts.debug_mel = False\n\n    fbank = knf.OnlineFbank(opts)\n    
fbank.accept_waveform(16000, audio.numpy())\n    frames = []\n    for i in range(fbank.num_frames_ready):\n        frames.append(torch.from_numpy(fbank.get_frame(i)))\n    frames = torch.stack(frames)\n    return frames\n\n\ndef main():\n    model_filename = \"./model.onnx\"\n    model = OnnxModel(model_filename)\n\n    filename = \"./0.wav\"\n    x = get_features(filename)\n    x = x.unsqueeze(0)\n\n    # Note: It supports only batch size == 1\n    x_lens = torch.tensor([x.shape[1]], dtype=torch.int64)\n\n    print(x.shape, x_lens)\n\n    log_probs, log_probs_lens = model(x, x_lens)\n    log_probs = log_probs[0]\n    print(log_probs.shape)\n\n    indexes = log_probs.argmax(dim=1)\n    print(indexes)\n    indexes = torch.unique_consecutive(indexes)\n    indexes = indexes[indexes != 0].tolist()\n\n    id2word = dict()\n    with open(\"./units.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            word, idx = line.strip().split()\n            id2word[int(idx)] = word\n    text = \"\".join([id2word[i] for i in indexes])\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wespeaker/README.md",
    "content": "# Introduction\n\nThis folder contains script for adding meta data to onnx models from\nhttps://github.com/wenet-e2e/wespeaker/blob/master/docs/pretrained.md\n\nYou can use the models with metadata in sherpa-onnx.\n\n\n**Caution**: You have to add model meta data to `*.onnx` since we plan\nto support models from different frameworks.\n"
  },
  {
    "path": "scripts/wespeaker/add_meta_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script adds meta data to a model so that it can be used in sherpa-onnx.\n\nUsage:\n./add_meta_data.py --model ./voxceleb_resnet34.onnx  --language English\n\"\"\"\n\nimport argparse\nfrom pathlib import Path\nfrom typing import Dict\n\nimport onnx\nimport onnxruntime\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model. Example value: model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--language\",\n        type=str,\n        required=True,\n        help=\"\"\"Supported language of the input model.\n        Example value: Chinese, English.\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--url\",\n        type=str,\n        default=\"https://github.com/wenet-e2e/wespeaker/blob/master/docs/pretrained.md\",\n        help=\"Where the model is downloaded\",\n    )\n\n    parser.add_argument(\n        \"--comment\",\n        type=str,\n        default=\"no comment\",\n        help=\"Comment about the model\",\n    )\n\n    parser.add_argument(\n        \"--sample-rate\",\n        type=int,\n        default=16000,\n        help=\"Sample rate expected by the model\",\n    )\n\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, str]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    onnx.save(model, filename)\n\n\ndef get_output_dim(filename) -> int:\n    filename = str(filename)\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.log_severity_level = 3  # error level\n    sess = onnxruntime.InferenceSession(filename, session_opts)\n\n    for i in sess.get_inputs():\n        print(i)\n\n    print(\"----------\")\n\n    for o in sess.get_outputs():\n        print(o)\n\n    print(\"----------\")\n\n    assert len(sess.get_inputs()) == 1\n    assert len(sess.get_outputs()) == 1\n\n    i = sess.get_inputs()[0]\n    o = sess.get_outputs()[0]\n\n    assert i.shape[:2] == [\"B\", \"T\"], i.shape\n    assert o.shape[0] == \"B\"\n\n    assert i.shape[2] == 80, i.shape\n\n    return o.shape[1]\n\n\ndef main():\n    args = get_args()\n    model = Path(args.model)\n    language = args.language\n    url = args.url\n    comment = args.comment\n    sample_rate = args.sample_rate\n\n    if not model.is_file():\n        raise ValueError(f\"{model} does not exist\")\n\n    assert len(language) > 0, len(language)\n    assert len(url) > 0, len(url)\n\n    output_dim = get_output_dim(model)\n\n    # all models from wespeaker expect input samples in the range\n    # [-32768, 32767]\n    normalize_samples = 0\n\n    meta_data = {\n        \"framework\": \"wespeaker\",\n        \"language\": language,\n        \"url\": url,\n        \"comment\": comment,\n        \"sample_rate\": sample_rate,\n        \"output_dim\": output_dim,\n        \"normalize_samples\": normalize_samples,\n    }\n    print(meta_data)\n    add_meta_data(filename=str(model), meta_data=meta_data)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wespeaker/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nThis script computes speaker similarity score in the range [0-1]\nof two wave files using a speaker embedding model.\n\"\"\"\nimport argparse\nimport wave\nfrom pathlib import Path\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nfrom numpy.linalg import norm\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model. Example value: model.onnx\",\n    )\n\n    parser.add_argument(\n        \"--file1\",\n        type=str,\n        required=True,\n        help=\"Input wave 1\",\n    )\n\n    parser.add_argument(\n        \"--file2\",\n        type=str,\n        required=True,\n        help=\"Input wave 2\",\n    )\n\n    return parser.parse_args()\n\n\ndef read_wavefile(filename, expected_sample_rate: int = 16000) -> np.ndarray:\n    \"\"\"\n    Args:\n      filename:\n        Path to a wave file, which must be of 16-bit and 16kHz.\n     expected_sample_rate:\n       Expected sample rate of the wave file.\n    Returns:\n      Return a 1-D float32 array containing audio samples. 
Each sample is in\n      the range [-1, 1].\n    \"\"\"\n    filename = str(filename)\n    with wave.open(filename) as f:\n        wave_file_sample_rate = f.getframerate()\n        assert wave_file_sample_rate == expected_sample_rate, (\n            wave_file_sample_rate,\n            expected_sample_rate,\n        )\n\n        num_channels = f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_int16 = samples_int16.reshape(-1, num_channels)[:, 0]\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n\n        return samples_float32\n\n\ndef compute_features(samples: np.ndarray, sample_rate: int) -> np.ndarray:\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.samp_freq = sample_rate\n    opts.frame_opts.snip_edges = False\n\n    opts.mel_opts.num_bins = 80\n    opts.mel_opts.debug_mel = False\n\n    fbank = knf.OnlineFbank(opts)\n    fbank.accept_waveform(sample_rate, samples)\n    fbank.input_finished()\n\n    features = []\n    for i in range(fbank.num_frames_ready):\n        f = fbank.get_frame(i)\n        features.append(f)\n    features = np.stack(features, axis=0)\n\n    return features\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n        )\n\n        meta = self.model.get_modelmeta().custom_metadata_map\n        self.normalize_samples = int(meta[\"normalize_samples\"])\n        self.sample_rate = int(meta[\"sample_rate\"])\n        
self.output_dim = int(meta[\"output_dim\"])\n\n    def __call__(self, x: np.ndarray) -> np.ndarray:\n        \"\"\"\n        Args:\n          x:\n            A 2-D float32 array of shape (T, C).\n        Returns:\n          A 1-D float32 array containing the model output.\n        \"\"\"\n        x = np.expand_dims(x, axis=0)\n\n        return self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )[0][0]\n\n\ndef main():\n    args = get_args()\n    filename = Path(args.model)\n    file1 = Path(args.file1)\n    file2 = Path(args.file2)\n    assert filename.is_file(), filename\n    assert file1.is_file(), file1\n    assert file2.is_file(), file2\n\n    model = OnnxModel(filename)\n    wave1 = read_wavefile(file1, model.sample_rate)\n    wave2 = read_wavefile(file2, model.sample_rate)\n\n    if not model.normalize_samples:\n        wave1 = wave1 * 32768\n        wave2 = wave2 * 32768\n\n    features1 = compute_features(wave1, model.sample_rate)\n    features2 = compute_features(wave2, model.sample_rate)\n\n    output1 = model(features1)\n    output2 = model(features2)\n\n    similarity = np.dot(output1, output2) / (norm(output1) * norm(output2))\n    print(f\"cosine similarity in the range [-1, 1]: {similarity}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wheel/README.md",
    "content": "# Introduction\n\nThis folder is for developers only.\n\n## sherpa-onnx-core\n\nIt contains the scripts for building the package sherpa-onnx-core.\n\n```\npython3 setup.py bdist_wheel --plat-name=macosx_10_15_x86_64\npython3 setup.py bdist_wheel --plat-name=macosx_11_0_arm64\npython3 setup.py bdist_wheel --plat-name=macosx_11_0_universal2\npython3 setup.py bdist_wheel --plat-name=macosx_10_15_universal2\n\npython3 setup.py bdist_wheel --plat-name=win_amd64\npython3 setup.py bdist_wheel --plat-name=win32\n\npython3 setup.py bdist_wheel --plat-name=manylinux2014_x86_64\npython3 setup.py bdist_wheel --plat-name=manylinux2014_aarch64\npython3 setup.py bdist_wheel --plat-name=linux_armv7l\n```\n\n## sherpa-onnx-bin\n"
  },
  {
    "path": "scripts/wheel/patch_wheel.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport argparse\nimport glob\nimport shutil\nimport subprocess\nimport sys\nfrom pathlib import Path\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--in-dir\",\n        type=Path,\n        required=True,\n        help=\"Input directory.\",\n    )\n\n    parser.add_argument(\n        \"--out-dir\",\n        type=Path,\n        required=True,\n        help=\"Output directory.\",\n    )\n    return parser.parse_args()\n\n\ndef process(out_dir: Path, whl: Path):\n    tmp_dir = out_dir / \"tmp\"\n    subprocess.check_call(f\"unzip {whl} -d {tmp_dir}\", shell=True)\n    if \"cp37\" in str(whl):\n        py_version = \"3.7\"\n    elif \"cp38\" in str(whl):\n        py_version = \"3.8\"\n    elif \"cp39\" in str(whl):\n        py_version = \"3.9\"\n    elif \"cp310\" in str(whl):\n        py_version = \"3.10\"\n    elif \"cp311\" in str(whl):\n        py_version = \"3.11\"\n    elif \"cp312\" in str(whl):\n        py_version = \"3.12\"\n    elif \"cp313\" in str(whl):\n        py_version = \"3.13\"\n    elif \"cp314\" in str(whl):\n        py_version = \"3.14\"\n    elif \"py3-none\" in str(whl):\n        py_version = None\n    else:\n        assert False, f\"Unknown python version in {whl}\"\n\n    if py_version:\n        rpath_list = [\n            f\"$ORIGIN/../lib/python{py_version}/site-packages/sherpa_onnx/lib\",\n            f\"$ORIGIN/../lib/python{py_version}/dist-packages/sherpa_onnx/lib\",\n            #\n            f\"$ORIGIN/../lib/python{py_version}/site-packages/sherpa_onnx/lib64\",\n            f\"$ORIGIN/../lib/python{py_version}/dist-packages/sherpa_onnx/lib64\",\n            #\n            f\"$ORIGIN/../lib/python{py_version}/site-packages/sherpa_onnx.libs\",\n        ]\n    else:\n        rpath_list = []\n        for p in [\"3.8\", \"3.9\", \"3.10\", \"3.11\", \"3.12\", \"3.13\", \"3.14\"]:\n    
        rpath_list.extend(\n                [\n                    f\"$ORIGIN/../lib/python{p}/site-packages/sherpa_onnx/lib\",\n                    f\"$ORIGIN/../lib/python{p}/dist-packages/sherpa_onnx/lib\",\n                ]\n            )\n\n    rpaths = \":\".join(rpath_list)\n\n    for filename in glob.glob(f\"{tmp_dir}/sherpa_onnx*data/data/bin/*\", recursive=True):\n        print(filename)\n        existing_rpath = (\n            subprocess.check_output([\"patchelf\", \"--print-rpath\", filename])\n            .decode()\n            .strip()\n        )\n        target_rpaths = rpaths + \":\" + existing_rpath\n        subprocess.check_call(\n            f\"patchelf --force-rpath --set-rpath '{target_rpaths}' {filename}\",\n            shell=True,\n        )\n\n    outwheel = Path(shutil.make_archive(whl, \"zip\", tmp_dir))\n    Path(outwheel).rename(out_dir / whl.name)\n\n    shutil.rmtree(tmp_dir)\n\n\ndef main():\n    args = get_args()\n    print(args)\n    in_dir = args.in_dir\n    out_dir = args.out_dir\n    out_dir.mkdir(exist_ok=True, parents=True)\n\n    for whl in in_dir.glob(\"*.whl\"):\n        process(out_dir, whl)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wheel/sherpa-onnx-bin/setup.py",
    "content": "import glob\nimport platform\n\nfrom setuptools import setup\n\n\ndef is_windows():\n    return platform.system() == \"Windows\"\n\n\nbin_files = glob.glob(\"bin/*\")\nprint(\"bin_files\", bin_files)\n\nsetup(\n    name=\"sherpa-onnx-bin\",\n    version=\"1.12.31\",\n    description=\"Binary executables for sherpa-onnx\",\n    author=\"The sherpa-onnx development team\",\n    url=\"https://github.com/k2-fsa/sherpa-onnx\",\n    author_email=\"dpovey@gmail.com\",\n    zip_safe=False,\n    license=\"Apache 2.0\",\n    packages=[],\n    data_files=[(\"Scripts\", bin_files) if is_windows() else (\"bin\", bin_files)],\n    install_requires=[\n        \"sherpa-onnx-core==1.12.31\",\n    ],\n    classifiers=[\n        \"Programming Language :: Python :: 3\",\n        \"Operating System :: Microsoft :: Windows\",\n        \"Operating System :: POSIX :: Linux\",\n        \"Operating System :: MacOS :: MacOS X\",\n        \"Topic :: Scientific/Engineering :: Artificial Intelligence\",\n    ],\n)\n"
  },
  {
    "path": "scripts/wheel/sherpa-onnx-core/.gitignore",
    "content": ""
  },
  {
    "path": "scripts/wheel/sherpa-onnx-core/MANIFEST.in",
    "content": "recursive-include sherpa_onnx/lib *\nrecursive-include sherpa_onnx/include *\n"
  },
  {
    "path": "scripts/wheel/sherpa-onnx-core/setup.py",
    "content": "import platform\n\nfrom setuptools import setup\n\n\ndef is_windows():\n    return platform.system() == \"Windows\"\n\n\ndef get_binaries():\n    if not is_windows():\n        return None\n    libs = [\n        \"onnxruntime.dll\",\n        \"sherpa-onnx-c-api.dll\",\n        \"sherpa-onnx-cxx-api.dll\",\n        \"sherpa-onnx-c-api.lib\",\n        \"sherpa-onnx-cxx-api.lib\",\n    ]\n    prefix = \"./sherpa_onnx/lib\"\n    return [f\"{prefix}/{lib}\" for lib in libs]\n\n\nsetup(\n    name=\"sherpa-onnx-core\",\n    version=\"1.12.31\",\n    description=\"Core shared libraries for sherpa-onnx\",\n    packages=[\"sherpa_onnx\"],\n    include_package_data=True,\n    data_files=[(\"Scripts\", get_binaries())] if get_binaries() else None,\n    author=\"The sherpa-onnx development team\",\n    url=\"https://github.com/k2-fsa/sherpa-onnx\",\n    author_email=\"dpovey@gmail.com\",\n    zip_safe=False,\n    license=\"Apache-2.0\",\n    classifiers=[\n        \"Programming Language :: Python :: 3\",\n        \"Operating System :: Microsoft :: Windows\",\n        \"Operating System :: POSIX :: Linux\",\n        \"Operating System :: MacOS :: MacOS X\",\n        \"Topic :: Scientific/Engineering :: Artificial Intelligence\",\n    ],\n)\n"
  },
  {
    "path": "scripts/wheel/sherpa-onnx-core/sherpa_onnx/__main__.py",
    "content": "import sys\nfrom . import _info\n\n\ndef main():\n    args = sys.argv[1:]\n    if not args:\n        print(\n            \"Usage: python3 -m sherpa_onnx [--cflags|--c-api-libs|--c-api-libs-only-L|--c-api-libs-only-l|--cxx-api-libs|--cxx-api-libs-only-L|--cxx-api-libs-only-l]\"\n        )\n        sys.exit(1)\n\n    if \"--cflags\" in args:\n        print(f\"-I{_info.get_include_dir()}\")\n    elif \"--c-api-libs\" in args:\n        lib_flags = \" \".join(f\"-l{lib}\" for lib in _info.get_c_api_libs())\n        print(f\"-L{_info.get_libs_dir()} {lib_flags}\")\n    elif \"--c-api-libs-only-L\" in args:\n        print(f\"-L{_info.get_libs_dir()}\")\n    elif \"--c-api-libs-only-l\" in args:\n        print(\" \".join(f\"-l{lib}\" for lib in _info.get_c_api_libs()))\n    elif \"--cxx-api-libs\" in args:\n        lib_flags = \" \".join(f\"-l{lib}\" for lib in _info.get_cxx_api_libs())\n        print(f\"-L{_info.get_libs_dir()} {lib_flags}\")\n    elif \"--cxx-api-libs-only-L\" in args:\n        print(f\"-L{_info.get_libs_dir()}\")\n    elif \"--cxx-api-libs-only-l\" in args:\n        print(\" \".join(f\"-l{lib}\" for lib in _info.get_cxx_api_libs()))\n    else:\n        print(\"Unknown option:\", args[0])\n        sys.exit(1)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/wheel/sherpa-onnx-core/sherpa_onnx/_info.py",
    "content": "from pathlib import Path\nfrom typing import List\n\n_pkg_dir = Path(__file__).parent\nlibs_dir = _pkg_dir / \"lib\"\ninclude_dir = _pkg_dir / \"include\"\n\n# List of libraries (without \"lib\" prefix, without extension)\n# Adjust to match your actual .so/.dll/.dylib files\nonnxruntime_lib = [\"onnxruntime\"]\nc_lib = [\"sherpa-onnx-c-api\"] + onnxruntime_lib\ncxx_lib = [\"sherpa-onnx-cxx-api\"] + c_lib\n\n\ndef get_include_dir() -> str:\n    return str(include_dir)\n\n\ndef get_libs_dir() -> str:\n    return str(libs_dir)\n\n\ndef get_c_api_libs() -> List[str]:\n    return c_lib\n\n\ndef get_cxx_api_libs() -> List[str]:\n    return cxx_lib\n"
  },
  {
    "path": "scripts/whisper/.gitignore",
    "content": "*.onnx\n*.config\n*.ort\n*-tokens.txt\n*.bias\n*.weights\n*.weight\n*.*embedding\n_Const*\nonnx__*\n"
  },
  {
    "path": "scripts/whisper/README.md",
    "content": "# Introduction\n\nThis folder contains code showing how to convert [Whisper][whisper] to onnx\nand use onnxruntime to replace PyTorch for speech recognition.\n\nYou can use [sherpa-onnx][sherpa-onnx] to run the converted model.\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/export-onnx.html\nfor details.\n\n## Finding Alignment Heads for Word Timestamps\n\nThe `export-onnx-with-attention.py` script exports Whisper models with\ncross-attention weights for word-level timestamps. It requires knowing which\nattention heads are \"alignment heads\" - heads that show monotonically increasing\nattention patterns useful for aligning audio to text.\n\nFor standard OpenAI Whisper models, alignment heads are defined in the\n`ALIGNMENT_HEADS` dict in the export script. For new or custom models (like\ndistil-whisper variants), you can discover alignment heads using:\n\n```bash\npython find_alignment_heads.py --model <model-name> --audio <test-audio.wav>\n```\n\nThis script analyzes all attention heads and ranks them by:\n- **Monotonicity**: Whether attention peaks move forward as tokens are decoded\n- **Diagonal score**: Correlation with expected diagonal attention pattern\n\nExample output:\n```\nTop 15 alignment head candidates:\n------------------------------------------------------------\n Layer   Head    Monotonic     Diagonal     Combined\n------------------------------------------------------------\n     3      2        0.846        0.985        0.915\n     0      0        0.962        0.617        0.789\n     ...\n```\n\nHeads with high combined scores (>0.7) are good candidates. A single head with\na very high diagonal score (>0.9) is often sufficient for accurate timestamps.\n\n[whisper]: https://github.com/openai/whisper\n[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx\n"
  },
  {
    "path": "scripts/whisper/ascend-npu/test_om.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2026  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nUsage example:\n\n./test_om.py \\\n  --encoder ./tiny.en-encoder.om \\\n  --decoder ./tiny.en-decoder.om \\\n  --tokens ./tiny.en-tokens.txt \\\n  --wav  ./test_wavs/0.wav\n\"\"\"\n\nimport argparse\nimport base64\nfrom typing import List\n\nimport kaldi_native_fbank as knf\nimport librosa\nimport numpy as np\nfrom ais_bench.infer.interface import InferSession\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to the tokens\",\n    )\n\n    parser.add_argument(\n        \"--wav\",\n        type=str,\n        required=True,\n        help=\"Path to the test wav\",\n    )\n\n    return parser.parse_args()\n\n\ndef causal_mask_1d(n: int, L: int):\n    \"\"\"\n    Returns a 1-D int mask of shape (L,) with:\n      0 -> allowed\n      1 -> masked (will be converted to -inf later)\n    \"\"\"\n    mask = np.ones((L,), dtype=np.int32)\n    if n > 0:\n        mask[:n] = 0\n    return mask\n\n\ndef load_audio(filename: str) -> np.ndarray:\n    samples, _ = librosa.load(filename, sr=16000)\n\n    samples = np.ascontiguousarray(samples)\n    return samples\n\n\ndef compute_features(samples: np.ndarray, dim: int = 80) -> np.ndarray:\n    \"\"\"\n    Returns:\n      Return a 1-D float32 tensor of shape (1, 80, 3000) containing the features.\n    \"\"\"\n    features = []\n    opts = knf.WhisperFeatureOptions()\n    opts.dim = dim\n    online_whisper_fbank = knf.OnlineWhisperFbank(opts)\n    online_whisper_fbank.accept_waveform(16000, samples)\n    
online_whisper_fbank.input_finished()\n\n    features = np.stack(\n        [\n            online_whisper_fbank.get_frame(i)\n            for i in range(online_whisper_fbank.num_frames_ready)\n        ]\n    )\n    log_spec = np.log10(np.clip(features, a_min=1e-10, a_max=None))\n    log_spec = np.maximum(log_spec, log_spec.max() - 8.0)\n    mel = (log_spec + 4.0) / 4.0\n    num_frames = mel.shape[0]\n    target = 3000\n    if num_frames < target:\n        mel = np.pad(\n            mel,\n            pad_width=((0, target - num_frames), (0, 0)),\n            mode=\"constant\",\n            constant_values=0,\n        )\n\n    mel = np.expand_dims(mel.T, axis=0)\n    mel = np.ascontiguousarray(mel)\n\n    return mel\n\n\ndef load_tokens(filename):\n    tokens = dict()\n    with open(filename, \"r\") as f:\n        for line in f:\n            t, i = line.split()\n            tokens[int(i)] = t\n    return tokens\n\n\nclass OmModel:\n    def __init__(self, encoder: str, decoder: str):\n        self.encoder = InferSession(device_id=0, model_path=encoder, debug=False)\n        self.decoder = InferSession(device_id=0, model_path=decoder, debug=False)\n\n        name = self.encoder.get_inputs()[0].name\n\n        if \".en\" in name:\n            self.sot_sequence = [50257, 50362]\n            self.eot = 50256\n        else:\n            self.sot_sequence = [50258, 50259, 50359, 50363]\n            self.eot = 50257\n\n        if \"tiny\" in name:\n            self.n_text_layer = 4\n            self.n_text_ctx = 448\n            self.n_text_state = 384\n        elif \"base\" in name:\n            self.n_text_layer = 6\n            self.n_text_ctx = 448\n            self.n_text_state = 512\n        elif \"small\" in name:\n            self.n_text_layer = 12\n            self.n_text_ctx = 448\n            self.n_text_state = 768\n        elif \"medium\" in name:\n            self.n_text_layer = 24\n            self.n_text_ctx = 448\n            self.n_text_state = 1024\n        
else:\n            assert False, f\"Unsupported encoder input {name}\"\n\n        print(\"---encoder---\")\n        for i in self.encoder.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.encoder.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"---decoder---\")\n        for i in self.decoder.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.decoder.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n    def get_self_cache(self) -> List[np.ndarray]:\n        self_cache = []\n        batch_size = 1\n        for i in range(self.n_text_layer):\n            k = np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            v = np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            self_cache.extend([k, v])\n        return self_cache\n\n    def run_encoder(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (1, 80, 3000), np.float32\n        Returns:\n          cross_kv:\n           - (k, v) for layer 0\n           - (k, v) for layer 1\n           - (k, v) for layer 2\n           - (k, v) for layer 3\n        \"\"\"\n        out = self.encoder.infer([x])\n        return out\n\n    def run_decoder(self, tokens: np.ndarray, self_kv, cross_kv, offset, mask):\n        \"\"\"\n        Args:\n          tokens: (1, 1), np.int32\n          offset: (1,), np.int32\n          mask: (model.n_text_ctx,), np.int32\n        Returns:\n          logit: (1, 1, vocab_size)\n          this_self_kv\n        \"\"\"\n        return self.decoder.infer([tokens] + self_kv + cross_kv + [offset, mask])\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n    samples = load_audio(args.wav)\n    features = compute_features(samples)\n    print(\"features\", features.shape)\n\n   
 model = OmModel(args.encoder, args.decoder)\n\n    cross_kv = model.run_encoder(features)\n\n    self_kv = model.get_self_cache()\n\n    offset = np.array([0], dtype=np.int32)\n    for t in model.sot_sequence:\n        token = np.array([[t]], dtype=np.int32)  # sot\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx)\n        print(t, model.sot_sequence, token, mask.shape, len(cross_kv), len(self_kv))\n\n        out = model.run_decoder(\n            tokens=token, self_kv=self_kv, cross_kv=cross_kv, offset=offset, mask=mask\n        )\n\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n\n    idx = out[0][0, 0].argmax()\n\n    eot = model.eot\n\n    ans = []\n\n    while idx != eot and offset.item() < 100:\n        ans.append(idx)\n        token = np.array([[idx]], dtype=np.int32)\n\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx)\n\n        out = model.run_decoder(\n            tokens=token, self_kv=self_kv, cross_kv=cross_kv, offset=offset, mask=mask\n        )\n\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n        idx = out[0][0, 0].argmax()\n\n    print(ans)\n    id2token = load_tokens(args.tokens)\n\n    s = b\"\"\n    for i in ans:\n        if i in id2token:\n            s += base64.b64decode(id2token[i])\n\n    print(s.decode().strip())\n    return\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/export-onnx-with-attention.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Posit Software, PBC\n# flake8: noqa\n\n\"\"\"\nExport Whisper ONNX models with cross-attention weights for word-level timestamps.\n\nThis script exports Whisper models that include cross-attention weights from\nalignment heads as an additional decoder output. These weights can be used\nwith Dynamic Time Warping (DTW) to compute word-level timestamps.\n\nBased on the original export-onnx.py script.\n\nUsage:\n  python export-onnx-with-attention.py --model tiny\n\nThe exported decoder will have 4 outputs instead of 3:\n  - logits\n  - out_n_layer_self_k_cache\n  - out_n_layer_self_v_cache\n  - cross_attention_weights  (NEW: shape [n_alignment_heads, n_audio_ctx])\n\"\"\"\n\nimport argparse\nimport importlib.util\nimport os\nfrom pathlib import Path\nfrom typing import Dict, List, Optional, Tuple\n\nimport onnx\nimport torch\nimport torch.nn.functional as F\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor, nn\n\nimport whisper\nfrom whisper.model import (\n    MultiHeadAttention,\n    ResidualAttentionBlock,\n    TextDecoder,\n)\n\nfrom export_onnx import add_meta_data, load_model, AudioEncoderTensorCache\n\n\n# Sentinel value indicating alignment heads should be read from model metadata\nUSE_MODEL_METADATA = True\n\n# Alignment heads for each model variant.\n# For official OpenAI models, we use USE_MODEL_METADATA to read from the model.\n# For distil-whisper models, we use empirically-determined heads since their\n# metadata includes all heads in certain layers rather than curated ones.\nALIGNMENT_HEADS = {\n    # TODO: [\"medium-aishell\"]\n    # Official OpenAI models - trust their metadata\n    \"tiny.en\": USE_MODEL_METADATA,\n    \"tiny\": USE_MODEL_METADATA,\n    \"base.en\": USE_MODEL_METADATA,\n    \"base\": USE_MODEL_METADATA,\n    \"small.en\": USE_MODEL_METADATA,\n    \"small\": USE_MODEL_METADATA,\n    \"medium.en\": USE_MODEL_METADATA,\n    \"medium\": 
USE_MODEL_METADATA,\n    \"large-v1\": USE_MODEL_METADATA,\n    \"large-v2\": USE_MODEL_METADATA,\n    \"large-v3\": USE_MODEL_METADATA,\n    \"large\": USE_MODEL_METADATA,\n    \"turbo\": USE_MODEL_METADATA,\n    # Distil-whisper models (alignment heads discovered empirically)\n    # distil-small.en has 4 decoder layers; head (3,2) has 0.985 diagonal score\n    \"distil-small.en\": [(3, 2)],\n    # distil-medium.en has 2 decoder layers; head (1,11) has 0.804 diagonal score\n    \"distil-medium.en\": [(1, 11)],\n    # distil-large-v2 has 2 decoder layers; head (1,12) has 0.806 diagonal score\n    \"distil-large-v2\": [(1, 12)],\n    # distil-large-v3 has 2 decoder layers; head (1,3) has 0.623 diagonal score\n    \"distil-large-v3\": [(1, 3)],\n    # distil-large-v3.5 has 2 decoder layers; head (1,3) has 0.483 diagonal score\n    \"distil-large-v3.5\": [(1, 3)],\n}\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        choices=list(ALIGNMENT_HEADS.keys()),\n        help=\"Whisper model name (must have known alignment heads)\",\n    )\n    return parser.parse_args()\n\n\ndef extract_alignment_heads_from_model(model) -> List[Tuple[int, int]]:\n    \"\"\"Extract alignment heads from model metadata.\n\n    Official OpenAI whisper models store alignment heads as a sparse boolean\n    tensor with shape (n_layers, n_heads) where True indicates an alignment head.\n\n    Returns:\n        List of (layer, head) tuples.\n\n    Raises:\n        ValueError: If alignment heads cannot be extracted from model.\n    \"\"\"\n    if not hasattr(model, \"alignment_heads\") or model.alignment_heads is None:\n        raise ValueError(\"Model does not have alignment_heads metadata\")\n\n    ah = model.alignment_heads\n    if not hasattr(ah, \"indices\"):\n        raise ValueError(\"Model alignment_heads is not a sparse tensor\")\n\n    indices = ah.indices()\n    return 
list(zip(indices[0].tolist(), indices[1].tolist()))\n\n\ndef get_alignment_heads(name: str, model) -> List[Tuple[int, int]]:\n    \"\"\"Get alignment heads for a model.\n\n    If ALIGNMENT_HEADS[name] is USE_MODEL_METADATA, alignment heads are read\n    from the model's metadata. Otherwise, the explicit list is used.\n\n    Args:\n        name: Model name\n        model: Loaded whisper model\n\n    Returns:\n        List of (layer, head) tuples for alignment heads.\n\n    Raises:\n        ValueError: If no alignment heads can be determined for the model.\n    \"\"\"\n    if name not in ALIGNMENT_HEADS:\n        raise ValueError(\n            f\"No alignment heads defined for model '{name}'. \"\n            f\"Supported models: {', '.join(sorted(ALIGNMENT_HEADS.keys()))}\"\n        )\n\n    heads = ALIGNMENT_HEADS[name]\n\n    if heads is USE_MODEL_METADATA:\n        print(\"Reading alignment heads from model metadata\")\n        return extract_alignment_heads_from_model(model)\n    else:\n        print(\"Using alignment heads from ALIGNMENT_HEADS table\")\n        return heads\n\n\ndef convert_tokens(name: str, model):\n    \"\"\"Convert and save tokens file.\"\"\"\n    whisper_dir = Path(whisper.__file__).parent\n    multilingual = model.is_multilingual\n    tokenizer = (\n        whisper_dir\n        / \"assets\"\n        / (multilingual and \"multilingual.tiktoken\" or \"gpt2.tiktoken\")\n    )\n    if not tokenizer.is_file():\n        raise ValueError(f\"Cannot find {tokenizer}\")\n\n    with open(tokenizer, \"r\") as f:\n        contents = f.read()\n        tokens = {\n            token: int(rank)\n            for token, rank in (line.split() for line in contents.splitlines() if line)\n        }\n\n    output_path = f\"{name}-tokens.txt\"\n    with open(output_path, \"w\") as f:\n        for t, i in tokens.items():\n            f.write(f\"{t} {i}\\n\")\n\n\n# =============================================================================\n# Attention-enabled 
decoder classes\n# =============================================================================\n\n\nclass MultiHeadAttentionCrossWithWeights(nn.Module):\n    \"\"\"Cross-attention that returns both output and attention weights.\"\"\"\n\n    def __init__(\n        self,\n        inMultiHeadAttention: MultiHeadAttention,\n        layer_index: int,\n        alignment_heads: List[Tuple[int, int]],\n    ):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n        self.layer_index = layer_index\n        # Find which heads in this layer are alignment heads\n        self.alignment_head_indices = [\n            head_idx for (layer_idx, head_idx) in alignment_heads\n            if layer_idx == layer_index\n        ]\n        self.n_head = inMultiHeadAttention.n_head\n\n    def forward(\n        self,\n        x: Tensor,\n        k: Tensor,\n        v: Tensor,\n    ) -> Tuple[Tensor, Optional[Tensor]]:\n        q = self.multiHeadAttention.query(x)\n\n        # Compute attention weights manually (don't use SDPA)\n        n_batch, n_ctx, n_state = q.shape\n        scale = (n_state // self.n_head) ** -0.25\n\n        q = q.view(*q.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n        k_reshaped = k.view(*k.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n        v_reshaped = v.view(*v.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n\n        # Compute QK^T with scaling\n        qk = (q * scale) @ (k_reshaped * scale).transpose(-1, -2)\n        qk = qk.float()\n\n        # Softmax to get attention weights\n        w = F.softmax(qk, dim=-1).to(q.dtype)\n\n        # Compute output\n        out = (w @ v_reshaped).permute(0, 2, 1, 3).flatten(start_dim=2)\n        out = self.multiHeadAttention.out(out)\n\n        # Extract alignment head weights if this layer has any\n        if self.alignment_head_indices:\n            # w shape: (batch, n_head, n_ctx, n_audio_ctx)\n            # Select only the alignment heads for this layer\n            # 
Output shape: (batch, n_alignment_heads, n_ctx, n_audio_ctx)\n            alignment_weights = w[:, self.alignment_head_indices, :, :]\n        else:\n            alignment_weights = None\n\n        return out, alignment_weights\n\n\nclass MultiHeadAttentionSelfManual(nn.Module):\n    \"\"\"Self-attention with KV cache support and manual attention computation.\"\"\"\n\n    def __init__(self, inMultiHeadAttention: MultiHeadAttention):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n        self.n_head = inMultiHeadAttention.n_head\n\n    def forward(\n        self,\n        x: Tensor,\n        k_cache: Tensor,\n        v_cache: Tensor,\n        mask: Tensor,\n    ):\n        q = self.multiHeadAttention.query(x)\n        k = self.multiHeadAttention.key(x)\n        v = self.multiHeadAttention.value(x)\n\n        k_cache[:, -k.shape[1] :, :] = k\n        v_cache[:, -v.shape[1] :, :] = v\n\n        # Manual attention computation (avoid SDPA for ONNX compatibility)\n        n_batch, n_ctx, n_state = q.shape\n        scale = (n_state // self.n_head) ** -0.25\n\n        q = q.view(*q.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n        k_reshaped = k_cache.view(*k_cache.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n        v_reshaped = v_cache.view(*v_cache.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n\n        qk = (q * scale) @ (k_reshaped * scale).transpose(-1, -2)\n        if mask is not None:\n            qk = qk + mask[:n_ctx, :n_ctx]\n        qk = qk.float()\n\n        w = F.softmax(qk, dim=-1).to(q.dtype)\n        out = (w @ v_reshaped).permute(0, 2, 1, 3).flatten(start_dim=2)\n\n        return self.multiHeadAttention.out(out), k_cache, v_cache\n\n\nclass ResidualAttentionBlockWithWeights(nn.Module):\n    \"\"\"Residual attention block that returns cross-attention weights.\"\"\"\n\n    def __init__(\n        self,\n        inResidualAttentionBlock: ResidualAttentionBlock,\n        layer_index: int,\n        
alignment_heads: List[Tuple[int, int]],\n    ):\n        super().__init__()\n        self.originalBlock = inResidualAttentionBlock\n        self.attn = MultiHeadAttentionSelfManual(inResidualAttentionBlock.attn)\n        self.cross_attn = (\n            MultiHeadAttentionCrossWithWeights(\n                inResidualAttentionBlock.cross_attn,\n                layer_index,\n                alignment_heads,\n            )\n            if inResidualAttentionBlock.cross_attn\n            else None\n        )\n\n    def forward(\n        self,\n        x: Tensor,\n        self_k_cache: Tensor,\n        self_v_cache: Tensor,\n        cross_k: Tensor,\n        cross_v: Tensor,\n        mask: Tensor,\n    ) -> Tuple[Tensor, Tensor, Tensor, Optional[Tensor]]:\n        self_attn_x, self_k_cache_updated, self_v_cache_updated = self.attn(\n            self.originalBlock.attn_ln(x), self_k_cache, self_v_cache, mask=mask\n        )\n        x = x + self_attn_x\n\n        cross_attention_weights = None\n        if self.cross_attn:\n            cross_out, cross_attention_weights = self.cross_attn(\n                self.originalBlock.cross_attn_ln(x), cross_k, cross_v\n            )\n            x = x + cross_out\n\n        x = x + self.originalBlock.mlp(self.originalBlock.mlp_ln(x))\n        return x, self_k_cache_updated, self_v_cache_updated, cross_attention_weights\n\n\nclass TextDecoderWithAttention(nn.Module):\n    \"\"\"Text decoder that outputs cross-attention weights from alignment heads.\"\"\"\n\n    def __init__(\n        self,\n        inTextDecoder: TextDecoder,\n        in_n_ctx: int,\n        alignment_heads: List[Tuple[int, int]],\n    ):\n        super().__init__()\n        self.textDecoder = inTextDecoder\n        self.n_ctx = in_n_ctx\n        self.alignment_heads = alignment_heads\n\n        self.blocks = nn.ModuleList()\n        for i, original_block in enumerate(self.textDecoder.blocks):\n            self.blocks.append(\n                
ResidualAttentionBlockWithWeights(original_block, i, alignment_heads)\n            )\n\n    def forward(\n        self,\n        tokens: Tensor,\n        n_layer_self_k_cache: Tensor,\n        n_layer_self_v_cache: Tensor,\n        n_layer_cross_k: Tensor,\n        n_layer_cross_v: Tensor,\n        offset: Tensor,\n    ) -> Tuple[Tensor, Tensor, Tensor, Tensor]:\n        x = (\n            self.textDecoder.token_embedding(tokens)\n            + self.textDecoder.positional_embedding[\n                offset[0] : offset[0] + tokens.shape[-1]\n            ]\n        )\n        x = x.to(n_layer_cross_k[0].dtype)\n\n        # Collect attention weights from alignment heads across all layers\n        all_attention_weights = []\n\n        for i, block in enumerate(self.blocks):\n            self_k_cache = n_layer_self_k_cache[i, :, : offset[0] + tokens.shape[-1], :]\n            self_v_cache = n_layer_self_v_cache[i, :, : offset[0] + tokens.shape[-1], :]\n\n            x, self_k_cache, self_v_cache, attn_weights = block(\n                x,\n                self_k_cache=self_k_cache,\n                self_v_cache=self_v_cache,\n                cross_k=n_layer_cross_k[i],\n                cross_v=n_layer_cross_v[i],\n                mask=self.textDecoder.mask,\n            )\n\n            n_layer_self_k_cache[i, :, : offset[0] + tokens.shape[-1], :] = self_k_cache\n            n_layer_self_v_cache[i, :, : offset[0] + tokens.shape[-1], :] = self_v_cache\n\n            if attn_weights is not None:\n                all_attention_weights.append(attn_weights)\n\n        x = self.textDecoder.ln(x)\n\n        logits = (\n            torch.matmul(\n                self.textDecoder.token_embedding.weight.to(x.dtype),\n                x.permute(0, 2, 1),\n            )\n            .permute(0, 2, 1)\n            .float()\n        )\n\n        # Stack attention weights from all alignment heads\n        # Shape: (batch, total_alignment_heads, n_tokens, n_audio_ctx)\n        if 
all_attention_weights:\n            cross_attention_weights = torch.cat(all_attention_weights, dim=1)\n        else:\n            # Fallback: create dummy tensor if no alignment heads configured\n            cross_attention_weights = torch.zeros(\n                tokens.shape[0], 1, tokens.shape[1], n_layer_cross_k.shape[2],\n                device=tokens.device, dtype=logits.dtype\n            )\n\n        return logits, n_layer_self_k_cache, n_layer_self_v_cache, cross_attention_weights\n\n\n# =============================================================================\n# Main export function\n# =============================================================================\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    name = args.model\n\n    print(f\"Exporting {name} with cross-attention weights\")\n\n    opset_version = 13\n\n    # Load model\n    model = load_model(name)\n    print(f\"Model dimensions: {model.dims}\")\n    print(f\"Total parameters: {sum(p.numel() for p in model.parameters()):,}\")\n\n    # Get alignment heads for this model\n    alignment_heads = get_alignment_heads(name, model)\n    print(f\"Using {len(alignment_heads)} alignment heads: {alignment_heads}\")\n\n    convert_tokens(name=name, model=model)\n\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    model.eval()\n\n    # Prepare test input\n    audio = torch.rand(16000 * 2)\n    audio = whisper.pad_or_trim(audio)\n\n    if name in (\"distil-large-v3\", \"distil-large-v3.5\"):\n        n_mels = 128\n    elif name in (\"large\", \"large-v3\", \"turbo\"):\n        n_mels = 128\n    else:\n        n_mels = 80\n\n    mel = whisper.log_mel_spectrogram(audio, n_mels=n_mels).to(model.device).unsqueeze(0)\n    batch_size = 1\n\n    # Export encoder (same as original)\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n    n_layer_cross_k, n_layer_cross_v = encoder(mel)\n\n    
encoder_filename = f\"{name}-encoder.onnx\"\n    torch.onnx.export(\n        encoder,\n        mel,\n        encoder_filename,\n        opset_version=opset_version,\n        input_names=[\"mel\"],\n        output_names=[\"n_layer_cross_k\", \"n_layer_cross_v\"],\n        dynamic_axes={\n            \"mel\": {0: \"n_audio\", 2: \"T\"},\n            \"n_layer_cross_k\": {1: \"n_audio\", 2: \"T\"},\n            \"n_layer_cross_v\": {1: \"n_audio\", 2: \"T\"},\n        },\n    )\n\n    encoder_meta_data = {\n        \"model_type\": f\"whisper-{name}\",\n        \"version\": \"2\",  # Version 2 indicates attention-enabled\n        \"maintainer\": \"k2-fsa\",\n        \"n_mels\": model.dims.n_mels,\n        \"n_audio_ctx\": model.dims.n_audio_ctx,\n        \"n_audio_state\": model.dims.n_audio_state,\n        \"n_audio_head\": model.dims.n_audio_head,\n        \"n_audio_layer\": model.dims.n_audio_layer,\n        \"n_vocab\": model.dims.n_vocab,\n        \"n_text_ctx\": model.dims.n_text_ctx,\n        \"n_text_state\": model.dims.n_text_state,\n        \"n_text_head\": model.dims.n_text_head,\n        \"n_text_layer\": model.dims.n_text_layer,\n        \"sot_sequence\": \",\".join(list(map(str, tokenizer.sot_sequence))),\n        \"all_language_tokens\": \",\".join(list(map(str, tokenizer.all_language_tokens))),\n        \"all_language_codes\": \",\".join(tokenizer.all_language_codes),\n        \"sot\": tokenizer.sot,\n        \"sot_index\": tokenizer.sot_sequence.index(tokenizer.sot),\n        \"eot\": tokenizer.eot,\n        \"blank_id\": tokenizer.encode(\" \")[0],\n        \"is_multilingual\": int(model.is_multilingual),\n        \"no_speech\": tokenizer.no_speech,\n        \"non_speech_tokens\": \",\".join(list(map(str, tokenizer.non_speech_tokens))),\n        \"transcribe\": tokenizer.transcribe,\n        \"translate\": tokenizer.translate,\n        \"sot_prev\": tokenizer.sot_prev,\n        \"sot_lm\": tokenizer.sot_lm,\n        \"no_timestamps\": 
tokenizer.no_timestamps,\n        # Attention-specific metadata\n        \"n_alignment_heads\": len(alignment_heads),\n        \"alignment_heads\": \",\".join([f\"{l}:{h}\" for l, h in alignment_heads]),\n    }\n    print(f\"Encoder metadata: {encoder_meta_data}\")\n    add_meta_data(filename=encoder_filename, meta_data=encoder_meta_data)\n\n    # Export decoder with attention outputs\n    n_audio = mel.shape[0]\n    tokens = torch.tensor(\n        [[tokenizer.sot, tokenizer.sot, tokenizer.sot]] * n_audio\n    ).to(mel.device)\n\n    decoder = TextDecoderWithAttention(\n        model.decoder, model.dims.n_text_ctx, alignment_heads\n    )\n\n    n_layer_self_k_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    n_layer_self_v_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n\n    # Test forward pass\n    logits, _, _, cross_attn_weights = decoder(\n        tokens,\n        n_layer_self_k_cache.clone(),\n        n_layer_self_v_cache.clone(),\n        n_layer_cross_k,\n        n_layer_cross_v,\n        offset,\n    )\n\n    print(f\"Logits shape: {logits.shape}\")\n    print(f\"Cross-attention weights shape: {cross_attn_weights.shape}\")\n    assert cross_attn_weights.shape == (\n        n_audio, len(alignment_heads), tokens.shape[1], model.dims.n_audio_ctx\n    ), f\"Unexpected attention shape: {cross_attn_weights.shape}\"\n\n    # Export with single token input (for autoregressive decoding)\n    offset = torch.tensor([tokens.shape[1]], dtype=torch.int64).to(mel.device)\n    tokens_single = torch.tensor([[tokenizer.sot]] * n_audio).to(mel.device)\n\n    decoder_filename = 
f\"{name}-decoder.onnx\"\n    torch.onnx.export(\n        decoder,\n        (\n            tokens_single,\n            n_layer_self_k_cache,\n            n_layer_self_v_cache,\n            n_layer_cross_k,\n            n_layer_cross_v,\n            offset,\n        ),\n        decoder_filename,\n        opset_version=opset_version,\n        input_names=[\n            \"tokens\",\n            \"in_n_layer_self_k_cache\",\n            \"in_n_layer_self_v_cache\",\n            \"n_layer_cross_k\",\n            \"n_layer_cross_v\",\n            \"offset\",\n        ],\n        output_names=[\n            \"logits\",\n            \"out_n_layer_self_k_cache\",\n            \"out_n_layer_self_v_cache\",\n            \"cross_attention_weights\",\n        ],\n        dynamic_axes={\n            \"tokens\": {0: \"n_audio\", 1: \"n_tokens\"},\n            \"in_n_layer_self_k_cache\": {1: \"n_audio\"},\n            \"in_n_layer_self_v_cache\": {1: \"n_audio\"},\n            \"n_layer_cross_k\": {1: \"n_audio\", 2: \"T\"},\n            \"n_layer_cross_v\": {1: \"n_audio\", 2: \"T\"},\n            \"cross_attention_weights\": {0: \"n_audio\", 2: \"n_tokens\", 3: \"T\"},\n        },\n    )\n\n    if \"large\" in name:\n        decoder_external_filename = decoder_filename.split(\".onnx\")[0]\n        decoder_model = onnx.load(decoder_filename)\n        onnx.save(\n            decoder_model,\n            decoder_filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=decoder_external_filename + \".weights\",\n        )\n\n    # Generate int8 quantized models\n    print(\"Generating int8 quantized models...\")\n\n    encoder_filename_int8 = f\"{name}-encoder.int8.onnx\"\n    quantize_dynamic(\n        model_input=encoder_filename,\n        model_output=encoder_filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n    decoder_filename_int8 = f\"{name}-decoder.int8.onnx\"\n 
   quantize_dynamic(\n        model_input=decoder_filename,\n        model_output=decoder_filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n    print(f\"\\nExported files:\")\n    print(f\"  - {encoder_filename}\")\n    print(f\"  - {encoder_filename_int8}\")\n    print(f\"  - {decoder_filename}\")\n    print(f\"  - {decoder_filename_int8}\")\n    print(f\"  - {name}-tokens.txt\")\n    print(f\"\\nDecoder has 4 outputs including cross_attention_weights\")\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    # To fix\n    # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n    # See also https://github.com/k2-fsa/sherpa-onnx/issues/1764\n    from whisper.model import disable_sdpa\n\n    with disable_sdpa():\n        main()\n"
  },
  {
    "path": "scripts/whisper/export-onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n# flake8: noqa\n\n\"\"\"\nNote: Code in this file is modified from\nhttps://github.com/TadaoYamaoka/whisper/blob/main/to_onnx.py\n\nThanks to https://github.com/TadaoYamaoka\nfor making the onnx export script public.\n\nNote that we have removed the 30 seconds constraint from whisper. You can\nuse any T <= 30.\n\"\"\"\n\nimport argparse\nimport os\nfrom pathlib import Path\nfrom typing import Any, Dict, Optional\n\nimport onnx\nimport torch\nimport torch.nn.functional as F\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor, nn\n\nimport whisper\nfrom whisper.model import (\n    AudioEncoder,\n    MultiHeadAttention,\n    ResidualAttentionBlock,\n    TextDecoder,\n)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        # fmt: off\n        choices=[\n            \"tiny\", \"tiny.en\", \"base\", \"base.en\",\n            \"small\", \"small.en\", \"medium\", \"medium.en\",\n            \"large-v1\", \"large-v2\",\n            \"large\", \"large-v3\", \"turbo\", # these three have feature dim 128\n            \"distil-medium.en\", \"distil-small.en\", \"distil-large-v2\",\n            \"distil-large-v3\",\n            \"distil-large-v3.5\",\n            # for fine-tuned models from icefall\n            \"medium-aishell\",\n            ],\n        # fmt: on\n    )\n    return parser.parse_args()\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    \"\"\"Add meta data to an ONNX model. 
It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    if \"large\" in filename or \"turbo\" in filename:\n        external_filename = filename.split(\".onnx\")[0]\n        onnx.save(\n            model,\n            filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=external_filename + \".weights\",\n        )\n    else:\n        onnx.save(model, filename)\n\n\ndef modified_audio_encoder_forward(self: AudioEncoder, x: torch.Tensor):\n    \"\"\"\n    x : torch.Tensor, shape = (batch_size, n_mels, n_ctx)\n        the mel spectrogram of the audio\n    \"\"\"\n    x = F.gelu(self.conv1(x))\n    x = F.gelu(self.conv2(x))\n    x = x.permute(0, 2, 1)\n\n    if False:\n        # This branch contains the original code\n        assert x.shape[1:] == self.positional_embedding.shape, \"incorrect audio shape\"\n        x = (x + self.positional_embedding).to(x.dtype)\n    else:\n        # This branch contains the actual changes\n        assert (\n            x.shape[2] == self.positional_embedding.shape[1]\n        ), f\"incorrect audio shape: {x.shape}, {self.positional_embedding.shape}\"\n        assert (\n            x.shape[1] == self.positional_embedding.shape[0]\n        ), f\"incorrect audio shape: {x.shape}, {self.positional_embedding.shape}\"\n        x = (x + self.positional_embedding[: x.shape[1]]).to(x.dtype)\n\n    for block in self.blocks:\n        x = block(x)\n\n    x = self.ln_post(x)\n    return x\n\n\nAudioEncoder.forward = modified_audio_encoder_forward\n\n\nclass AudioEncoderTensorCache(nn.Module):\n    def __init__(self, 
inAudioEncoder: AudioEncoder, inTextDecoder: TextDecoder):\n        super().__init__()\n        self.audioEncoder = inAudioEncoder\n        self.textDecoder = inTextDecoder\n\n    def forward(self, x: Tensor):\n        audio_features = self.audioEncoder(x)\n\n        n_layer_cross_k_list = []\n        n_layer_cross_v_list = []\n        for block in self.textDecoder.blocks:\n            n_layer_cross_k_list.append(block.cross_attn.key(audio_features))\n            n_layer_cross_v_list.append(block.cross_attn.value(audio_features))\n\n        return torch.stack(n_layer_cross_k_list), torch.stack(n_layer_cross_v_list)\n\n\nclass MultiHeadAttentionCross(nn.Module):\n    def __init__(self, inMultiHeadAttention: MultiHeadAttention):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n\n    def forward(\n        self,\n        x: Tensor,\n        k: Tensor,\n        v: Tensor,\n        mask: Optional[Tensor] = None,\n    ):\n        q = self.multiHeadAttention.query(x)\n        wv, qk = self.multiHeadAttention.qkv_attention(q, k, v, mask)\n        return self.multiHeadAttention.out(wv)\n\n\nclass MultiHeadAttentionSelf(nn.Module):\n    def __init__(self, inMultiHeadAttention: MultiHeadAttention):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n\n    def forward(\n        self,\n        x: Tensor,  # (b, n_ctx      , n_state)\n        k_cache: Tensor,  # (b, n_ctx_cache, n_state)\n        v_cache: Tensor,  # (b, n_ctx_cache, n_state)\n        mask: Tensor,\n    ):\n        q = self.multiHeadAttention.query(x)  # (b, n_ctx, n_state)\n        k = self.multiHeadAttention.key(x)  # (b, n_ctx, n_state)\n        v = self.multiHeadAttention.value(x)  # (b, n_ctx, n_state)\n\n        k_cache[:, -k.shape[1] :, :] = k  # (b, n_ctx_cache + n_ctx, n_state)\n        v_cache[:, -v.shape[1] :, :] = v  # (b, n_ctx_cache + n_ctx, n_state)\n\n        wv, qk = self.multiHeadAttention.qkv_attention(q, k_cache, v_cache, 
mask)\n        return self.multiHeadAttention.out(wv), k_cache, v_cache\n\n\nclass ResidualAttentionBlockTensorCache(nn.Module):\n    def __init__(self, inResidualAttentionBlock: ResidualAttentionBlock):\n        super().__init__()\n        self.originalBlock = inResidualAttentionBlock\n        self.attn = MultiHeadAttentionSelf(inResidualAttentionBlock.attn)\n        self.cross_attn = (\n            MultiHeadAttentionCross(inResidualAttentionBlock.cross_attn)\n            if inResidualAttentionBlock.cross_attn\n            else None\n        )\n\n    def forward(\n        self,\n        x: Tensor,\n        self_k_cache: Tensor,\n        self_v_cache: Tensor,\n        cross_k: Tensor,\n        cross_v: Tensor,\n        mask: Tensor,\n    ):\n        self_attn_x, self_k_cache_updated, self_v_cache_updated = self.attn(\n            self.originalBlock.attn_ln(x), self_k_cache, self_v_cache, mask=mask\n        )\n        x = x + self_attn_x\n\n        if self.cross_attn:\n            x = x + self.cross_attn(\n                self.originalBlock.cross_attn_ln(x), cross_k, cross_v\n            )\n\n        x = x + self.originalBlock.mlp(self.originalBlock.mlp_ln(x))\n        return x, self_k_cache_updated, self_v_cache_updated\n\n\nclass TextDecoderTensorCache(nn.Module):\n    def __init__(self, inTextDecoder: TextDecoder, in_n_ctx: int):\n        super().__init__()\n        self.textDecoder = inTextDecoder\n        self.n_ctx = in_n_ctx\n\n        self.blocks = []\n        for orginal_block in self.textDecoder.blocks:\n            self.blocks.append(ResidualAttentionBlockTensorCache(orginal_block))\n\n    def forward(\n        self,\n        tokens: Tensor,\n        n_layer_self_k_cache: Tensor,\n        n_layer_self_v_cache: Tensor,\n        n_layer_cross_k: Tensor,\n        n_layer_cross_v: Tensor,\n        offset: Tensor,\n    ):\n        x = (\n            self.textDecoder.token_embedding(tokens)\n            + self.textDecoder.positional_embedding[\n                
offset[0] : offset[0] + tokens.shape[-1]\n            ]\n        )\n        x = x.to(n_layer_cross_k[0].dtype)\n\n        i = 0\n        for block in self.blocks:\n            self_k_cache = n_layer_self_k_cache[i, :, : offset[0] + tokens.shape[-1], :]\n            self_v_cache = n_layer_self_v_cache[i, :, : offset[0] + tokens.shape[-1], :]\n            x, self_k_cache, self_v_cache = block(\n                x,\n                self_k_cache=self_k_cache,\n                self_v_cache=self_v_cache,\n                cross_k=n_layer_cross_k[i],\n                cross_v=n_layer_cross_v[i],\n                mask=self.textDecoder.mask,\n            )\n            n_layer_self_k_cache[i, :, : offset[0] + tokens.shape[-1], :] = self_k_cache\n            n_layer_self_v_cache[i, :, : offset[0] + tokens.shape[-1], :] = self_v_cache\n            i += 1\n\n        x = self.textDecoder.ln(x)\n\n        if False:\n            # x.shape (1, 3, 384)\n            # weight.shape (51684, 384)\n\n            logits = (\n                x\n                @ torch.transpose(\n                    self.textDecoder.token_embedding.weight.to(x.dtype), 0, 1\n                )\n            ).float()\n        else:\n            logits = (\n                torch.matmul(\n                    self.textDecoder.token_embedding.weight.to(x.dtype),\n                    x.permute(0, 2, 1),\n                )\n                .permute(0, 2, 1)\n                .float()\n            )\n\n        return logits, n_layer_self_k_cache, n_layer_self_v_cache\n\n\n# ref: https://github.com/ggerganov/whisper.cpp/blob/master/models/convert-pt-to-ggml.py#L232\ndef convert_tokens(name, model):\n    whisper_dir = Path(whisper.__file__).parent\n    multilingual = model.is_multilingual\n    tokenizer = (\n        whisper_dir\n        / \"assets\"\n        / (multilingual and \"multilingual.tiktoken\" or \"gpt2.tiktoken\")\n    )\n    if not tokenizer.is_file():\n        raise ValueError(f\"Cannot find 
{tokenizer}\")\n\n    #  import base64\n\n    with open(tokenizer, \"r\") as f:\n        contents = f.read()\n        #  tokens = {\n        #      base64.b64decode(token): int(rank)\n        #      for token, rank in (line.split() for line in contents.splitlines() if line)\n        #  }\n        tokens = {\n            token: int(rank)\n            for token, rank in (line.split() for line in contents.splitlines() if line)\n        }\n\n    with open(f\"{name}-tokens.txt\", \"w\") as f:\n        for t, i in tokens.items():\n            f.write(f\"{t} {i}\\n\")\n\n\ndef load_model(name: str):\n    \"\"\"Load a Whisper model by name.\n\n    For standard OpenAI models (tiny, base, small, medium, large, etc.),\n    this uses whisper.load_model() directly.\n\n    For distil-whisper and fine-tuned models, this expects the checkpoint\n    file to be pre-downloaded to the current directory with a specific name.\n\n    Args:\n        name: Model name (e.g., \"tiny\", \"distil-small.en\", \"medium-aishell\")\n\n    Returns:\n        The loaded whisper model.\n\n    Raises:\n        ValueError: If a required checkpoint file is not found.\n    \"\"\"\n    if name == \"distil-medium.en\":\n        filename = \"./distil-medium-en-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-medium.en\n                to download original-model.bin\n                You can use the following command to do that:\n\n                wget -O distil-medium-en-original-model.bin https://huggingface.co/distil-whisper/distil-medium.en/resolve/main/original-model.bin\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    elif name == \"distil-large-v2\":\n        filename = \"./distil-large-v2-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                
Please go to https://huggingface.co/distil-whisper/distil-large-v2\n                to download original-model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v2-original-model.bin https://huggingface.co/distil-whisper/distil-large-v2/resolve/main/original-model.bin\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    elif name == \"distil-large-v3\":\n        filename = \"./distil-large-v3-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-large-v3-openai\n                to download model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v3-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3-openai/resolve/main/model.bin\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    elif name == \"distil-large-v3.5\":\n        filename = \"./distil-large-v3.5-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-large-v3.5-openai/\n                to download model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v3.5-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3.5-openai/resolve/main/model.bin\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    elif name == \"distil-small.en\":\n        filename = \"./distil-small-en-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-small.en\n                to download original-model.bin\n                You 
can use the following command to do that:\n\n                wget -O distil-small-en-original-model.bin https://huggingface.co/distil-whisper/distil-small.en/resolve/main/original-model.bin\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    elif name == \"medium-aishell\":\n        filename = \"./medium-aishell.pt\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/yuekai/icefall_asr_aishell_whisper/tree/main/exp_medium\n                to download whisper-medium-aishell1-epoch-10-avg-4.pt\n                You can use the following command to do that:\n\n                wget -O medium-aishell.pt https://huggingface.co/yuekai/icefall_asr_aishell_whisper/resolve/main/exp_medium/whisper-medium-aishell1-epoch-10-avg-4.pt\n            \"\"\"\n            )\n        return whisper.load_model(filename)\n    else:\n        return whisper.load_model(name)\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    name = args.model\n    print(args)\n    print(name)\n\n    opset_version = 17\n\n    model = load_model(name)\n    print(model.dims)\n\n    print(\n        f\"number of model parameters: {name}\",\n        sum(p.numel() for p in model.parameters()),\n    )\n    print(\n        f\"number of encoder parameters: {name}\",\n        sum(p.numel() for p in model.encoder.parameters()),\n    )\n    print(\n        f\"number of decoder parameters: {name}\",\n        sum(p.numel() for p in model.decoder.parameters()),\n    )\n\n    convert_tokens(name=name, model=model)\n\n    # write tokens\n\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    model.eval()\n    print(model.dims)\n    audio = torch.rand(16000 * 2)\n    audio = whisper.pad_or_trim(audio)\n    assert audio.shape == (16000 * 30,), audio.shape\n\n    if args.model in (\"distil-large-v3\", 
\"distil-large-v3.5\"):\n        n_mels = 128\n    elif args.model in (\n        \"large\",\n        \"large-v3\",\n        \"turbo\",\n    ):\n        n_mels = 128\n    else:\n        n_mels = 80\n\n    mel = (\n        whisper.log_mel_spectrogram(audio, n_mels=n_mels).to(model.device).unsqueeze(0)\n    )\n    batch_size = 1\n    assert mel.shape == (batch_size, n_mels, 30 * 100), mel.shape\n\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n\n    n_layer_cross_k, n_layer_cross_v = encoder(mel)\n    assert n_layer_cross_k.shape == (\n        model.dims.n_text_layer,\n        batch_size,\n        model.dims.n_audio_ctx,\n        model.dims.n_text_state,\n    ), (n_layer_cross_k.shape, model.dims)\n    assert n_layer_cross_v.shape == (\n        model.dims.n_text_layer,\n        batch_size,\n        model.dims.n_audio_ctx,\n        model.dims.n_text_state,\n    ), (n_layer_cross_v.shape, model.dims)\n\n    encoder_filename = f\"{name}-encoder.onnx\"\n    torch.onnx.export(\n        encoder,\n        mel,\n        encoder_filename,\n        opset_version=opset_version,\n        input_names=[\"mel\"],\n        output_names=[\"n_layer_cross_k\", \"n_layer_cross_v\"],\n        dynamic_axes={\n            \"mel\": {0: \"n_audio\", 2: \"T\"},  # n_audio is also known as batch_size\n            \"n_layer_cross_k\": {1: \"n_audio\", 2: \"T\"},\n            \"n_layer_cross_v\": {1: \"n_audio\", 2: \"T\"},\n        },\n    )\n\n    encoder_meta_data = {\n        \"model_type\": f\"whisper-{name}\",\n        \"version\": \"1\",\n        \"maintainer\": \"k2-fsa\",\n        \"n_mels\": model.dims.n_mels,\n        \"n_audio_ctx\": model.dims.n_audio_ctx,\n        \"n_audio_state\": model.dims.n_audio_state,\n        \"n_audio_head\": model.dims.n_audio_head,\n        \"n_audio_layer\": model.dims.n_audio_layer,\n        \"n_vocab\": model.dims.n_vocab,\n        \"n_text_ctx\": model.dims.n_text_ctx,\n        \"n_text_state\": model.dims.n_text_state,\n        
\"n_text_head\": model.dims.n_text_head,\n        \"n_text_layer\": model.dims.n_text_layer,\n        \"sot_sequence\": \",\".join(list(map(str, tokenizer.sot_sequence))),\n        \"all_language_tokens\": \",\".join(\n            list(map(str, tokenizer.all_language_tokens))\n        ),  # a list of ids\n        \"all_language_codes\": \",\".join(\n            tokenizer.all_language_codes\n        ),  # e.g., en, de, zh, fr\n        \"sot\": tokenizer.sot,\n        \"sot_index\": tokenizer.sot_sequence.index(tokenizer.sot),\n        \"eot\": tokenizer.eot,\n        \"blank_id\": tokenizer.encode(\" \")[0],\n        \"is_multilingual\": int(model.is_multilingual),\n        \"no_speech\": tokenizer.no_speech,\n        \"non_speech_tokens\": \",\".join(list(map(str, tokenizer.non_speech_tokens))),\n        \"transcribe\": tokenizer.transcribe,\n        \"translate\": tokenizer.translate,\n        \"sot_prev\": tokenizer.sot_prev,\n        \"sot_lm\": tokenizer.sot_lm,\n        \"no_timestamps\": tokenizer.no_timestamps,\n    }\n    print(f\"encoder_meta_data: {encoder_meta_data}\")\n    add_meta_data(filename=encoder_filename, meta_data=encoder_meta_data)\n\n    n_audio = mel.shape[0]\n    tokens = torch.tensor([[tokenizer.sot, tokenizer.sot, tokenizer.sot]] * n_audio).to(\n        mel.device\n    )  # [n_audio, 3]\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n    n_layer_self_k_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    n_layer_self_v_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n    logits, n_layer_self_k_cache, n_layer_self_v_cache 
= decoder(\n        tokens,\n        n_layer_self_k_cache,\n        n_layer_self_v_cache,\n        n_layer_cross_k,\n        n_layer_cross_v,\n        offset,\n    )\n    assert logits.shape == (n_audio, tokens.shape[1], model.dims.n_vocab)\n    assert n_layer_self_k_cache.shape == (\n        model.dims.n_text_layer,\n        n_audio,\n        model.dims.n_text_ctx,\n        model.dims.n_text_state,\n    )\n    assert n_layer_self_v_cache.shape == (\n        model.dims.n_text_layer,\n        n_audio,\n        model.dims.n_text_ctx,\n        model.dims.n_text_state,\n    )\n\n    offset = torch.tensor([tokens.shape[1]], dtype=torch.int64).to(mel.device)\n    tokens = torch.tensor([[tokenizer.sot]] * n_audio).to(mel.device)  # [n_audio, 1]\n\n    logits, out_n_layer_self_k_cache, out_n_layer_self_v_cache = decoder(\n        tokens,\n        n_layer_self_k_cache,\n        n_layer_self_v_cache,\n        n_layer_cross_k,\n        n_layer_cross_v,\n        offset,\n    )\n\n    decoder_filename = f\"{name}-decoder.onnx\"\n    torch.onnx.export(\n        decoder,\n        (\n            tokens,\n            n_layer_self_k_cache,\n            n_layer_self_v_cache,\n            n_layer_cross_k,\n            n_layer_cross_v,\n            offset,\n        ),\n        decoder_filename,\n        opset_version=opset_version,\n        input_names=[\n            \"tokens\",\n            \"in_n_layer_self_k_cache\",\n            \"in_n_layer_self_v_cache\",\n            \"n_layer_cross_k\",\n            \"n_layer_cross_v\",\n            \"offset\",\n        ],\n        output_names=[\"logits\", \"out_n_layer_self_k_cache\", \"out_n_layer_self_v_cache\"],\n        dynamic_axes={\n            \"tokens\": {0: \"n_audio\", 1: \"n_tokens\"},\n            \"in_n_layer_self_k_cache\": {1: \"n_audio\"},\n            \"in_n_layer_self_v_cache\": {1: \"n_audio\"},\n            \"n_layer_cross_k\": {1: \"n_audio\", 2: \"T\"},\n            \"n_layer_cross_v\": {1: \"n_audio\", 2: \"T\"},\n     
   },\n    )\n\n    if \"large\" in args.model:\n        decoder_external_filename = decoder_filename.split(\".onnx\")[0]\n        decoder_model = onnx.load(decoder_filename)\n        onnx.save(\n            decoder_model,\n            decoder_filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=decoder_external_filename + \".weights\",\n        )\n\n    # Generate int8 quantization models\n    # See https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#data-type-selection\n\n    print(\"Generate int8 quantization models\")\n\n    encoder_filename_int8 = f\"{name}-encoder.int8.onnx\"\n    quantize_dynamic(\n        model_input=encoder_filename,\n        model_output=encoder_filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n    decoder_filename_int8 = f\"{name}-decoder.int8.onnx\"\n    quantize_dynamic(\n        model_input=decoder_filename,\n        model_output=decoder_filename_int8,\n        op_types_to_quantize=[\"MatMul\"],\n        weight_type=QuantType.QInt8,\n    )\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    # To fix\n    # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n    # See also https://github.com/k2-fsa/sherpa-onnx/issues/1764\n    from whisper.model import disable_sdpa\n\n    with disable_sdpa():\n        main()\n"
  },
  {
    "path": "scripts/whisper/find_alignment_heads.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Posit Software, PBC\n\n\"\"\"\nFind alignment heads for a Whisper model by analyzing cross-attention patterns.\n\nAlignment heads are attention heads that show monotonically increasing patterns,\nmeaning they attend to progressively later parts of the audio as more text tokens\nare decoded. These heads are useful for computing word-level timestamps.\n\nUsage:\n    python find_alignment_heads.py --model distil-small.en --audio 0.wav\n\"\"\"\n\nimport argparse\nfrom collections import defaultdict\nfrom typing import Dict, List, Tuple\n\nimport numpy as np\nimport torch\nimport whisper\nfrom whisper.audio import load_audio, log_mel_spectrogram, pad_or_trim\n\nfrom export_onnx import load_model\n\n\ndef get_args():\n    parser = argparse.ArgumentParser(description=\"Find alignment heads in Whisper models\")\n    parser.add_argument(\"--model\", type=str, required=True, help=\"Model name (e.g., distil-small.en)\")\n    parser.add_argument(\"--audio\", type=str, required=True, help=\"Path to audio file\")\n    parser.add_argument(\"--top-k\", type=int, default=10, help=\"Number of top heads to report\")\n    return parser.parse_args()\n\n\n@torch.no_grad()\ndef compute_cross_attention_weights(\n    model: whisper.Whisper,\n    audio_path: str,\n) -> Tuple[Dict[Tuple[int, int], np.ndarray], List[int], str]:\n    \"\"\"\n    Run transcription and capture cross-attention weights from all heads.\n\n    Returns:\n        attention_weights: Dict mapping (layer, head) to attention matrix [n_tokens, n_audio_frames]\n        token_ids: List of decoded token IDs\n        text: Transcribed text\n    \"\"\"\n    # Load and preprocess audio\n    audio = load_audio(audio_path)\n    audio = pad_or_trim(audio)\n\n    n_mels = model.dims.n_mels\n    mel = log_mel_spectrogram(audio, n_mels=n_mels).to(model.device)\n\n    # Encode audio\n    audio_features = model.encoder(mel.unsqueeze(0))\n\n    # Get tokenizer\n    tokenizer = 
whisper.tokenizer.get_tokenizer(\n        model.is_multilingual,\n        num_languages=getattr(model, 'num_languages', None) or (99 if model.is_multilingual else None),\n        task=\"transcribe\",\n    )\n\n    # Initial tokens (SOT sequence + no_timestamps)\n    # The no_timestamps token is required for proper decoding\n    tokens = list(tokenizer.sot_sequence) + [tokenizer.no_timestamps]\n\n    # Storage for attention weights per (layer, head)\n    all_attention_weights: Dict[Tuple[int, int], List[np.ndarray]] = defaultdict(list)\n\n    n_layers = len(model.decoder.blocks)\n    n_heads = model.dims.n_text_head\n\n    print(f\"Model has {n_layers} decoder layers with {n_heads} attention heads each\")\n\n    # Decode with attention capture\n    max_tokens = 448  # max context length\n\n    for i in range(max_tokens):\n        tokens_tensor = torch.tensor([tokens]).to(model.device)\n\n        # We need to manually run through decoder blocks to capture attention\n        x = model.decoder.token_embedding(tokens_tensor) + model.decoder.positional_embedding[:tokens_tensor.shape[1]]\n        x = x.to(audio_features.dtype)\n\n        for layer_idx, block in enumerate(model.decoder.blocks):\n            # Self-attention (we don't need this for alignment)\n            x = x + block.attn(block.attn_ln(x), mask=model.decoder.mask)[0]\n\n            # Cross-attention - compute manually to get weights\n            cross_attn = block.cross_attn\n            ln_output = block.cross_attn_ln(x)\n\n            q = cross_attn.query(ln_output)\n            k = cross_attn.key(audio_features)\n            v = cross_attn.value(audio_features)\n\n            # Reshape for multi-head attention\n            batch_size, n_ctx, n_state = q.shape\n            head_dim = n_state // n_heads\n\n            q = q.view(batch_size, n_ctx, n_heads, head_dim).permute(0, 2, 1, 3)\n            k = k.view(batch_size, -1, n_heads, head_dim).permute(0, 2, 1, 3)\n            v = v.view(batch_size, -1, 
n_heads, head_dim).permute(0, 2, 1, 3)\n\n            # Compute attention weights\n            scale = head_dim ** -0.25\n            qk = (q * scale) @ (k * scale).transpose(-1, -2)\n            attn_weights = torch.softmax(qk.float(), dim=-1)  # [batch, heads, n_ctx, n_audio]\n\n            # Store attention weights for each head (only the last token's attention)\n            for head_idx in range(n_heads):\n                # Get attention from the last decoded token\n                weights = attn_weights[0, head_idx, -1, :].detach().cpu().numpy()\n                all_attention_weights[(layer_idx, head_idx)].append(weights)\n\n            # Compute attention output\n            attn_output = (attn_weights.to(v.dtype) @ v).permute(0, 2, 1, 3).flatten(start_dim=2)\n            attn_output = cross_attn.out(attn_output)\n            x = x + attn_output\n\n            # MLP\n            x = x + block.mlp(block.mlp_ln(x))\n\n        x = model.decoder.ln(x)\n        logits = (x @ model.decoder.token_embedding.weight.T).float()\n\n        # Get next token\n        next_token = logits[0, -1].argmax().item()\n\n        if next_token == tokenizer.eot:\n            break\n\n        tokens.append(next_token)\n\n    # Convert to numpy arrays [n_tokens, n_audio_frames]\n    attention_matrices = {}\n    for key, weights_list in all_attention_weights.items():\n        attention_matrices[key] = np.stack(weights_list, axis=0)\n\n    # Decode text\n    text = tokenizer.decode(tokens[len(tokenizer.sot_sequence):])\n\n    return attention_matrices, tokens, text\n\n\ndef compute_monotonicity_score(attention: np.ndarray) -> float:\n    \"\"\"\n    Compute how monotonically increasing the attention pattern is.\n\n    For each token, find the frame with maximum attention (argmax).\n    A good alignment head should have these argmax positions increasing\n    monotonically (or nearly so) as tokens progress.\n\n    Returns a score between 0 and 1, where 1 is perfectly monotonic.\n    
\"\"\"\n    n_tokens, n_frames = attention.shape\n\n    if n_tokens < 2:\n        return 0.0\n\n    # Get the frame with maximum attention for each token\n    peak_positions = np.argmax(attention, axis=1)\n\n    # Count how many times position increases (or stays same)\n    increases = 0\n    for i in range(1, len(peak_positions)):\n        if peak_positions[i] >= peak_positions[i - 1]:\n            increases += 1\n\n    monotonicity = increases / (len(peak_positions) - 1)\n    return monotonicity\n\n\ndef compute_diagonal_score(attention: np.ndarray) -> float:\n    \"\"\"\n    Compute how diagonal the attention pattern is.\n\n    A diagonal pattern means token i attends mostly to audio frame i*scale,\n    where scale = n_frames / n_tokens.\n    \"\"\"\n    n_tokens, n_frames = attention.shape\n\n    if n_tokens < 2:\n        return 0.0\n\n    # Expected diagonal positions\n    scale = n_frames / n_tokens\n    expected_positions = np.arange(n_tokens) * scale\n\n    # Actual peak positions\n    peak_positions = np.argmax(attention, axis=1)\n\n    # Compute correlation between expected and actual\n    if np.std(peak_positions) < 1e-6:\n        return 0.0\n\n    correlation = np.corrcoef(expected_positions, peak_positions)[0, 1]\n\n    # Handle NaN\n    if np.isnan(correlation):\n        return 0.0\n\n    return max(0, correlation)  # Only positive correlations indicate good alignment\n\n\ndef analyze_attention_heads(\n    attention_matrices: Dict[Tuple[int, int], np.ndarray],\n    top_k: int = 10,\n) -> List[Tuple[Tuple[int, int], float, float, float]]:\n    \"\"\"\n    Analyze all attention heads and rank them by alignment quality.\n\n    Returns list of ((layer, head), monotonicity_score, diagonal_score, combined_score)\n    sorted by combined score descending.\n    \"\"\"\n    results = []\n\n    for (layer, head), attention in attention_matrices.items():\n        mono_score = compute_monotonicity_score(attention)\n        diag_score = 
compute_diagonal_score(attention)\n        combined_score = (mono_score + diag_score) / 2\n        results.append(((layer, head), mono_score, diag_score, combined_score))\n\n    # Sort by combined score (descending)\n    results.sort(key=lambda x: x[3], reverse=True)\n\n    return results[:top_k]\n\n\ndef main():\n    args = get_args()\n\n    # Load model\n    print(f\"Loading model: {args.model}\")\n    model = load_model(args.model)\n    model.eval()  # Set to evaluation mode\n\n    print(f\"Model dimensions: {model.dims}\")\n\n    # Check if model already has alignment heads\n    if hasattr(model, 'alignment_heads') and model.alignment_heads is not None:\n        indices = model.alignment_heads.indices()\n        existing_heads = list(zip(indices[0].tolist(), indices[1].tolist()))\n        print(f\"Model has pre-defined alignment heads: {existing_heads}\")\n\n    # Run transcription and capture attention\n    print(f\"\\nTranscribing: {args.audio}\")\n    attention_matrices, tokens, text = compute_cross_attention_weights(model, args.audio)\n\n    print(f\"\\nTranscription: {text}\")\n    print(f\"Number of tokens: {len(tokens)}\")\n\n    # Analyze heads\n    print(f\"\\nAnalyzing {len(attention_matrices)} attention heads...\")\n    top_heads = analyze_attention_heads(attention_matrices, args.top_k)\n\n    print(f\"\\nTop {args.top_k} alignment head candidates:\")\n    print(\"-\" * 60)\n    print(f\"{'Layer':>6} {'Head':>6} {'Monotonic':>12} {'Diagonal':>12} {'Combined':>12}\")\n    print(\"-\" * 60)\n\n    for (layer, head), mono, diag, combined in top_heads:\n        print(f\"{layer:>6} {head:>6} {mono:>12.3f} {diag:>12.3f} {combined:>12.3f}\")\n\n    # Generate Python code for the best heads\n    print(\"\\n\" + \"=\" * 60)\n    print(\"Suggested ALIGNMENT_HEADS entry:\")\n    print(\"=\" * 60)\n\n    # Use heads with combined score > 0.7 (or top 6 if fewer qualify)\n    good_heads = [(l, h) for (l, h), m, d, c in top_heads if c > 0.7]\n    if len(good_heads) 
< 6:\n        good_heads = [(l, h) for (l, h), m, d, c in top_heads[:6]]\n\n    print(f'\"{args.model}\": {good_heads},')\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/model-info.md",
    "content": "# tiny/tiny.en\n```\nModelDimensions(\n    n_mels=80,\n    n_audio_ctx=1500,\n    n_audio_state=384,\n    n_audio_head=6,\n    n_audio_layer=4,\n    n_vocab=51865,\n    n_text_ctx=448,\n    n_text_state=384,\n    n_text_head=6,\n    n_text_layer=4\n)\n```\n\n# base/base.en\n```\nModelDimensions(\n    n_mels=80,\n    n_audio_ctx=1500,\n    n_audio_state=512,\n    n_audio_head=8,\n    n_audio_layer=6,\n    n_vocab=51865,\n    n_text_ctx=448,\n    n_text_state=512,\n    n_text_head=8,\n    n_text_layer=6\n)\n```\n\n# small/small.en\n```\nModelDimensions(\n    n_mels=80,\n    n_audio_ctx=1500,\n    n_audio_state=768,\n    n_audio_head=12,\n    n_audio_layer=12,\n    n_vocab=51865,\n    n_text_ctx=448,\n    n_text_state=768,\n    n_text_head=12,\n    n_text_layer=12\n)\n```\n\n# medium/medium.en\n```\nModelDimensions(\n    n_mels=80,\n    n_audio_ctx=1500,\n    n_audio_state=1024,\n    n_audio_head=16,\n    n_audio_layer=24,\n    n_vocab=51865,\n    n_text_ctx=448,\n    n_text_state=1024,\n    n_text_head=16,\n    n_text_layer=24\n)\n```\n\n# large-v1/large-v2\n```\nModelDimensions(\n    n_mels=80,\n    n_audio_ctx=1500,\n    n_audio_state=1280,\n    n_audio_head=20,\n    n_audio_layer=32,\n    n_vocab=51865,\n    n_text_ctx=448,\n    n_text_state=1280,\n    n_text_head=20,\n    n_text_layer=32\n)\n```\n"
  },
  {
    "path": "scripts/whisper/requirements.txt",
    "content": "openai-whisper\n"
  },
  {
    "path": "scripts/whisper/rknn/README.md",
    "content": "# Usage\n\nPre-exported RKNN models for the RK3588 are available at\n\nhttps://modelscope.cn/models/csukuangfj/2026-01-05-rknn/files\n\n## Download test wave files\n\n```\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/en.wav\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/en-16k.wav\n```\n\n## Export to ONNX\n\n```\n./export_onnx.py --model tiny.en\n```\n\n## Test the ONNX models\n\n```\n./test_onnx.py --model tiny.en\n```\n\n## Export to RKNN\n\n```\npython3 ./export_rknn.py --target-platform rk3588 --in-model ./tiny.en-encoder.onnx --out-model ./tiny.en-encoder.rknn\n\npython3 ./export_rknn.py --target-platform rk3588 --in-model ./tiny.en-decoder.onnx --out-model ./tiny.en-decoder.rknn\n```\n\nThe exported files should look like this:\n\n```\nls -lh tiny.en-*.rknn\n\n-rw-r--r-- 1 kuangfangjun root 95M Jan  5 16:16 tiny.en-decoder.rknn\n-rw-r--r-- 1 kuangfangjun root 22M Jan  5 16:15 tiny.en-encoder.rknn\n```\n\n## Run it on your RK3588 board\n\n```\nwget https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en/resolve/main/tiny.en-tokens.txt\n\n./test_on_rk3588_board.py --encoder ./tiny.en-encoder.rknn --decoder ./tiny.en-decoder.rknn --tokens ./tiny.en-tokens.txt --wav ./en-16k.wav\n```\n"
  },
  {
    "path": "scripts/whisper/rknn/export_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n# flake8: noqa\n\n\"\"\"\nNote: Code in this file is modified from\nhttps://github.com/TadaoYamaoka/whisper/blob/main/to_onnx.py\n\nThanks to https://github.com/TadaoYamaoka\nfor making the onnx export script public.\n\nNote that we have removed the 30 seconds constraint from whisper. You can\nuse any T <= 30.\n\"\"\"\n\nimport argparse\nimport inspect\nimport os\nfrom pathlib import Path\nfrom typing import Any, Dict, List, Optional, Tuple\n\nimport onnx\nimport torch\nimport torch.nn.functional as F\nimport whisper\nfrom onnxruntime.quantization import QuantType, quantize_dynamic\nfrom torch import Tensor, nn\nfrom whisper.model import (\n    AudioEncoder,\n    MultiHeadAttention,\n    ResidualAttentionBlock,\n    TextDecoder,\n)\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--model\",\n        type=str,\n        required=True,\n        # fmt: off\n        choices=[\n            \"tiny\", \"tiny.en\", \"base\", \"base.en\",\n            \"small\", \"small.en\", \"medium\", \"medium.en\",\n            \"large-v1\", \"large-v2\",\n            \"large\", \"large-v3\", \"turbo\", # these three have feature dim 128\n            \"distil-medium.en\", \"distil-small.en\", \"distil-large-v2\",\n            \"distil-large-v3\",\n            \"distil-large-v3.5\",\n            # for fine-tuned models from icefall\n            \"medium-aishell\",\n            ],\n        # fmt: on\n    )\n    return parser.parse_args()\n\n\ndef causal_mask_1d(n: int, L: int, device=None, dtype=torch.int32):\n    \"\"\"\n    Returns a 1-D int mask of shape (L,) with:\n      0 -> allowed\n      1 -> masked (will be converted to -inf later)\n    \"\"\"\n    mask = torch.ones((L,), device=device, dtype=dtype)\n    if n > 0:\n        mask[:n] = 0\n    return mask\n\n\ndef add_meta_data(filename: str, meta_data: Dict[str, Any]):\n    
\"\"\"Add meta data to an ONNX model. It is changed in-place.\n\n    Args:\n      filename:\n        Filename of the ONNX model to be changed.\n      meta_data:\n        Key-value pairs.\n    \"\"\"\n    model = onnx.load(filename)\n\n    while len(model.metadata_props):\n        model.metadata_props.pop()\n\n    for key, value in meta_data.items():\n        meta = model.metadata_props.add()\n        meta.key = key\n        meta.value = str(value)\n\n    if \"large\" in filename or \"turbo\" in filename:\n        external_filename = filename.split(\".onnx\")[0]\n        onnx.save(\n            model,\n            filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            location=external_filename + \".weights\",\n        )\n    else:\n        onnx.save(model, filename)\n\n\ndef modified_self_qkv_attention(\n    self,\n    q: Tensor,\n    k_cache: Tensor,\n    v_cache: Tensor,\n    k1: Tensor,\n    v1: Tensor,\n    mask: Tensor,\n) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:\n    assert mask is not None\n\n    n_batch, n_ctx, n_state = q.shape\n\n    scale = (n_state // self.n_head) ** -0.25\n    q = q.view(*q.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n    k_cache = k_cache.view(*k_cache.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n    v_cache = v_cache.view(*v_cache.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n\n    k1 = k1.view(*k1.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n    v1 = v1.view(*v1.shape[:2], self.n_head, -1).permute(0, 2, 1, 3)\n\n    qk = (q * scale) @ (k_cache * scale).transpose(-1, -2)  # (1, 6, 1, 448)\n\n    qk1 = (q * scale) @ (k1 * scale).transpose(-1, -2)  # (1, 6, 1, 1)\n\n    #  qk = qk + mask\n    #  qk.masked_fill_(mask.to(torch.bool), float(\"-inf\"))\n    qk.masked_fill_(mask.to(torch.bool), -60000)\n\n    qk = qk.float()\n    qk1 = qk1.float()\n\n    qk_total = torch.cat([qk, qk1], dim=-1)\n\n    w_total = F.softmax(qk_total, dim=-1).to(q.dtype)\n    w = 
w_total[:, :, :, :-1]\n    w1 = w_total[:, :, :, -1:]\n\n    out = (w @ v_cache).permute(0, 2, 1, 3).flatten(start_dim=2)\n    out1 = (w1 @ v1).permute(0, 2, 1, 3).flatten(start_dim=2)\n    out = out + out1\n\n    qk = qk.detach()\n\n    return out, qk\n\n\nMultiHeadAttention.qkv_attention_self = modified_self_qkv_attention\n\n\ndef modified_audio_encoder_forward(self: AudioEncoder, x: torch.Tensor):\n    \"\"\"\n    x : torch.Tensor, shape = (batch_size, n_mels, n_ctx)\n        the mel spectrogram of the audio\n    \"\"\"\n    x = F.gelu(self.conv1(x))\n    x = F.gelu(self.conv2(x))\n    x = x.permute(0, 2, 1)\n\n    if False:\n        # This branch contains the original code\n        assert x.shape[1:] == self.positional_embedding.shape, \"incorrect audio shape\"\n        x = (x + self.positional_embedding).to(x.dtype)\n    else:\n        #  print(x.shape, self.positional_embedding.shape)\n        # This branch contains the actual changes\n        assert (\n            x.shape[2] == self.positional_embedding.shape[1]\n        ), f\"incorrect audio shape: {x.shape}, {self.positional_embedding.shape}\"\n        assert (\n            x.shape[1] == self.positional_embedding.shape[0]\n        ), f\"incorrect audio shape: {x.shape}, {self.positional_embedding.shape}\"\n        x = (x + self.positional_embedding[: x.shape[1]]).to(x.dtype)\n\n    for block in self.blocks:\n        x = block(x)\n\n    x = self.ln_post(x)\n    return x\n\n\nAudioEncoder.forward = modified_audio_encoder_forward\n\n\nclass AudioEncoderTensorCache(nn.Module):\n    def __init__(self, inAudioEncoder: AudioEncoder, inTextDecoder: TextDecoder):\n        super().__init__()\n        self.audioEncoder = inAudioEncoder\n        self.textDecoder = inTextDecoder\n\n    def forward(self, x: Tensor) -> List[Tuple[Tensor, Tensor]]:\n        \"\"\"\n        Args:\n          x: (1, 80, 3000)\n          cross_kv_pair:\n            - the i-th entry contains kv cache for the i-th layer\n        \"\"\"\n        
audio_features = self.audioEncoder(x)\n\n        n_layer_cross_k_list = []\n        n_layer_cross_v_list = []\n\n        cross_kv_pair = []\n        for block in self.textDecoder.blocks:\n            k = block.cross_attn.key(audio_features)  # (batch_size, 1500, 384)\n            v = block.cross_attn.value(audio_features)  # (batch_size, 1500, 384)\n\n            cross_kv_pair.append((k, v))\n\n        return cross_kv_pair\n\n\nclass MultiHeadAttentionCross(nn.Module):\n    def __init__(self, inMultiHeadAttention: MultiHeadAttention):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n\n    def forward(\n        self,\n        x: Tensor,\n        k: Tensor,\n        v: Tensor,\n        mask: Optional[Tensor] = None,\n    ):\n        q = self.multiHeadAttention.query(x)\n        wv, qk = self.multiHeadAttention.qkv_attention(q, k, v, mask)\n        return self.multiHeadAttention.out(wv)\n\n\nclass MultiHeadAttentionSelf(nn.Module):\n    def __init__(self, inMultiHeadAttention: MultiHeadAttention):\n        super().__init__()\n        self.multiHeadAttention = inMultiHeadAttention\n\n    def forward(\n        self,\n        x: Tensor,  # (1, 1      , 384)\n        k_cache: Tensor,  # (1, 448, 384)\n        v_cache: Tensor,  # (1, 448, 384)\n        mask: Tensor,  # (448,)\n    ):\n        q = self.multiHeadAttention.query(x)  # (1, 1, 384)\n        k = self.multiHeadAttention.key(x)  # (1, 1, 384)\n        v = self.multiHeadAttention.value(x)  # (1, 1, 384)\n\n        #  k_cache[:, offset : offset + 1, :] = k  # (b, n_ctx_cache + n_ctx, n_state)\n        #  v_cache[:, offset : offset + 1, :] = v  # (b, n_ctx_cache + n_ctx, n_state)\n\n        wv, qk = self.multiHeadAttention.qkv_attention_self(\n            q,\n            k_cache=k_cache,\n            v_cache=v_cache,\n            k1=k,\n            v1=v,\n            mask=mask,\n        )\n\n        return self.multiHeadAttention.out(wv), k, v\n\n\nclass 
ResidualAttentionBlockTensorCache(nn.Module):\n    def __init__(self, inResidualAttentionBlock: ResidualAttentionBlock):\n        super().__init__()\n        self.originalBlock = inResidualAttentionBlock\n        self.attn = MultiHeadAttentionSelf(inResidualAttentionBlock.attn)\n        self.cross_attn = (\n            MultiHeadAttentionCross(inResidualAttentionBlock.cross_attn)\n            if inResidualAttentionBlock.cross_attn\n            else None\n        )\n\n    def forward(\n        self,\n        x: Tensor,\n        self_k_cache: Tensor,\n        self_v_cache: Tensor,\n        cross_k: Tensor,\n        cross_v: Tensor,\n        offset: Tensor,\n        mask: Tensor,\n    ):\n        self_attn_x, self_k, self_v = self.attn(\n            self.originalBlock.attn_ln(x),\n            self_k_cache,\n            self_v_cache,\n            mask=mask,\n        )\n        x = x + self_attn_x\n\n        if self.cross_attn:\n            x = x + self.cross_attn(\n                self.originalBlock.cross_attn_ln(x), cross_k, cross_v\n            )\n\n        x = x + self.originalBlock.mlp(self.originalBlock.mlp_ln(x))\n        return x, self_k, self_v\n\n\nclass TextDecoderTensorCache(nn.Module):\n    def __init__(self, inTextDecoder: TextDecoder, in_n_ctx: int):\n        super().__init__()\n        self.textDecoder = inTextDecoder\n        self.n_ctx = in_n_ctx\n\n        self.blocks = []\n        for original_block in self.textDecoder.blocks:\n            self.blocks.append(ResidualAttentionBlockTensorCache(original_block))\n\n    def forward(\n        self,\n        tokens: Tensor,\n        self_kv_pair: List[Tuple[Tensor, Tensor]],\n        cross_kv_pair: List[Tuple[Tensor, Tensor]],\n        offset: Tensor,\n        mask: Tensor,\n    ) -> Tuple[Tensor, List[Tuple[Tensor, Tensor]]]:\n        \"\"\"\n        tokens: (batch_size, 1)\n        self_kv_pair:\n            - [i][0]: layer_i_self_k_cache, (batch_size, 448, dim)\n            - [i][1]: layer_i_self_v_cache, 
(batch_size, 448, dim)\n        Returns:\n          - logits\n          - this_self_kv_pair\n        \"\"\"\n        assert tokens.shape == (1, 1), tokens.shape\n        x = self.textDecoder.token_embedding(\n            tokens\n        ) + self.textDecoder.positional_embedding[offset.to(torch.int64)].unsqueeze(0)\n\n        i = 0\n        this_self_kv_pair = []\n        for block in self.blocks:\n            self_k_cache = self_kv_pair[i][0]\n            self_v_cache = self_kv_pair[i][1]\n\n            x, self_k, self_v = block(\n                x,\n                #  self_k_cache=self_k_cache[:, : offset + 1],\n                #  self_v_cache=self_v_cache[:, : offset + 1],\n                self_k_cache=self_k_cache,\n                self_v_cache=self_v_cache,\n                cross_k=cross_kv_pair[i][0],\n                cross_v=cross_kv_pair[i][1],\n                offset=offset,\n                #  mask=self.textDecoder.mask,\n                mask=mask,\n            )\n            #  self_k_cache[:, : offset + 1] = updated_self_k_cache\n            #  self_v_cache[:, : offset + 1] = updated_self_v_cache\n            #  updated_self_kv_pair.append((self_k_cache, self_v_cache))\n            this_self_kv_pair.append((self_k, self_v))\n\n            i += 1\n\n        x = self.textDecoder.ln(x)\n\n        if False:\n            # x.shape (1, 3, 384)\n            # weight.shape (51684, 384)\n\n            logits = (\n                x\n                @ torch.transpose(\n                    self.textDecoder.token_embedding.weight.to(x.dtype), 0, 1\n                )\n            ).float()\n        else:\n            logits = (\n                torch.matmul(\n                    self.textDecoder.token_embedding.weight.to(x.dtype),\n                    x.permute(0, 2, 1),\n                )\n                .permute(0, 2, 1)\n                .float()\n            )\n\n        return logits, this_self_kv_pair\n\n\n# ref: 
https://github.com/ggerganov/whisper.cpp/blob/master/models/convert-pt-to-ggml.py#L232\ndef convert_tokens(name, model):\n    whisper_dir = Path(whisper.__file__).parent\n    multilingual = model.is_multilingual\n    tokenizer = (\n        whisper_dir\n        / \"assets\"\n        / (multilingual and \"multilingual.tiktoken\" or \"gpt2.tiktoken\")\n    )\n    if not tokenizer.is_file():\n        raise ValueError(f\"Cannot find {tokenizer}\")\n\n    #  import base64\n\n    with open(tokenizer, \"r\") as f:\n        contents = f.read()\n        #  tokens = {\n        #      base64.b64decode(token): int(rank)\n        #      for token, rank in (line.split() for line in contents.splitlines() if line)\n        #  }\n        tokens = {\n            token: int(rank)\n            for token, rank in (line.split() for line in contents.splitlines() if line)\n        }\n\n    with open(f\"{name}-tokens.txt\", \"w\") as f:\n        for t, i in tokens.items():\n            f.write(f\"{t} {i}\\n\")\n    print(f\"Saved to {name}-tokens.txt\")\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    name = args.model\n    print(args)\n    print(name)\n\n    opset_version = 17\n\n    if name == \"distil-medium.en\":\n        filename = \"./distil-medium-en-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-medium.en\n                to download original-model.bin\n                You can use the following command to do that:\n\n                wget -O distil-medium-en-original-model.bin https://huggingface.co/distil-whisper/distil-medium.en/resolve/main/original-model.bin\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    elif name == \"distil-large-v2\":\n        filename = \"./distil-large-v2-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n               
 \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-large-v2\n                to download original-model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v2-original-model.bin https://huggingface.co/distil-whisper/distil-large-v2/resolve/main/original-model.bin\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    elif name == \"distil-large-v3\":\n        filename = \"./distil-large-v3-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-large-v3-openai\n                to download model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v3-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3-openai/resolve/main/model.bin\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    elif name == \"distil-large-v3.5\":\n        filename = \"./distil-large-v3.5-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-large-v3.5-openai/\n                to download model.bin\n                You can use the following command to do that:\n\n                wget -O distil-large-v3.5-original-model.bin https://huggingface.co/distil-whisper/distil-large-v3.5-openai/resolve/main/model.bin\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    elif name == \"distil-small.en\":\n        filename = \"./distil-small-en-original-model.bin\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/distil-whisper/distil-small.en\n                to download 
original-model.bin\n                You can use the following command to do that:\n\n                wget -O distil-small-en-original-model.bin https://huggingface.co/distil-whisper/distil-small.en/resolve/main/original-model.bin\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    elif name == \"medium-aishell\":\n        filename = \"./medium-aishell.pt\"\n        if not Path(filename).is_file():\n            raise ValueError(\n                \"\"\"\n                Please go to https://huggingface.co/yuekai/icefall_asr_aishell_whisper/tree/main/exp_medium\n                to download whisper-medium-aishell1-epoch-10-avg-4.pt\n                You can use the following command to do that:\n\n                wget -O medium-aishell.pt https://huggingface.co/yuekai/icefall_asr_aishell_whisper/resolve/main/exp_medium/whisper-medium-aishell1-epoch-10-avg-4.pt\n            \"\"\"\n            )\n        model = whisper.load_model(filename)\n    else:\n        model = whisper.load_model(name)\n    model.to(\"cpu\")\n\n    num_params = sum(p.numel() for p in model.parameters())\n    num_encoder_params = sum(p.numel() for p in model.encoder.parameters())\n    num_decoder_params = sum(p.numel() for p in model.decoder.parameters())\n    print(f\"{name} model parameters: {num_params} (or {num_params/1000/1000} M)\")\n    print(\n        f\"{name} encoder parameters: {num_encoder_params} (or {num_encoder_params/1000/1000} M)\"\n    )\n    print(\n        f\"{name} decoder parameters: {num_decoder_params} (or {num_decoder_params/1000/1000} M)\"\n    )\n\n    convert_tokens(name=name, model=model)\n\n    # write tokens\n\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n    # tiny: <|startoftranscript|><|en|><|transcribe|> (50258, 50259, 50359)\n    # base: <|startoftranscript|><|en|><|transcribe|> (50258, 50259, 50359)\n    # tiny.en: <|startoftranscript|> (50257,)\n    
print(tokenizer.decode(tokenizer.sot_sequence), tokenizer.sot_sequence)\n\n    # tiny: <|notimestamps|> 50363\n    # base: <|notimestamps|> 50363\n    # tiny.en: <|notimestamps|> 50362\n    print(tokenizer.decode([tokenizer.no_timestamps]), tokenizer.no_timestamps)\n\n    model.eval()\n    print(model.dims)\n    audio = torch.rand(16000 * 2)\n    audio = whisper.pad_or_trim(audio)\n    assert audio.shape == (16000 * 30,), audio.shape\n\n    if args.model in (\"distil-large-v3\", \"distil-large-v3.5\"):\n        n_mels = 128\n    elif args.model in (\n        \"large\",\n        \"large-v3\",\n        \"turbo\",\n    ):\n        n_mels = 128\n    else:\n        n_mels = 80\n\n    mel = (\n        whisper.log_mel_spectrogram(audio, n_mels=n_mels).to(model.device).unsqueeze(0)\n    )\n    batch_size = 1\n    assert mel.shape == (batch_size, n_mels, 30 * 100), mel.shape\n\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n\n    cross_kv_pair = encoder(mel)\n    assert len(cross_kv_pair) == model.dims.n_text_layer, (\n        len(cross_kv_pair),\n        model.dims.n_text_layer,\n    )\n\n    output_names = []\n    for i in range(model.dims.n_text_layer):\n        k = f\"cross_k_{i}\"\n        v = f\"cross_v_{i}\"\n        output_names.append(k)\n        output_names.append(v)\n\n    export_sig = inspect.signature(torch.onnx.export)\n\n    kwargs = dict()\n    if \"dynamo\" in export_sig.parameters:\n        kwargs[\"dynamo\"] = False\n\n    if \"external_data\" in export_sig.parameters:\n        kwargs[\"external_data\"] = False\n\n    encoder_filename = f\"{name}-encoder.onnx\"\n    torch.onnx.export(\n        encoder,\n        mel,\n        encoder_filename,\n        opset_version=opset_version,\n        input_names=[f\"{name}-mel\"],\n        output_names=output_names,\n        **kwargs,\n    )\n\n    encoder_meta_data = {\n        \"model_type\": f\"whisper-{name}\",\n        \"version\": \"1\",\n        \"maintainer\": \"k2-fsa\",\n        
\"n_mels\": model.dims.n_mels,\n        \"n_audio_ctx\": model.dims.n_audio_ctx,\n        \"n_audio_state\": model.dims.n_audio_state,\n        \"n_audio_head\": model.dims.n_audio_head,\n        \"n_audio_layer\": model.dims.n_audio_layer,\n        \"n_vocab\": model.dims.n_vocab,\n        \"n_text_ctx\": model.dims.n_text_ctx,\n        \"n_text_state\": model.dims.n_text_state,\n        \"n_text_head\": model.dims.n_text_head,\n        \"n_text_layer\": model.dims.n_text_layer,\n        \"sot_sequence\": \",\".join(list(map(str, tokenizer.sot_sequence))),\n        #  \"all_language_tokens\": \",\".join(\n        #      list(map(str, tokenizer.all_language_tokens))\n        #  ),  # a list of ids\n        #  \"all_language_codes\": \",\".join(\n        #      tokenizer.all_language_codes\n        #  ),  # e.g., en, de, zh, fr\n        \"sot\": tokenizer.sot,\n        \"sot_index\": tokenizer.sot_sequence.index(tokenizer.sot),\n        \"eot\": tokenizer.eot,\n        \"blank_id\": tokenizer.encode(\" \")[0],\n        \"is_multilingual\": int(model.is_multilingual),\n        \"no_speech\": tokenizer.no_speech,\n        \"non_speech_tokens\": \",\".join(list(map(str, tokenizer.non_speech_tokens))),\n        \"transcribe\": tokenizer.transcribe,\n        \"translate\": tokenizer.translate,\n        \"sot_prev\": tokenizer.sot_prev,\n        \"sot_lm\": tokenizer.sot_lm,\n        \"no_timestamps\": tokenizer.no_timestamps,\n    }\n    print(f\"encoder_meta_data: {encoder_meta_data}\")\n    add_meta_data(filename=encoder_filename, meta_data=encoder_meta_data)\n\n    tokens = torch.tensor([[tokenizer.sot]], dtype=torch.int32)\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n\n    self_kv_pair = []\n    batch_size = 1\n    for i in range(model.dims.n_text_layer):\n        k = torch.zeros(batch_size, model.dims.n_text_ctx, model.dims.n_text_state)\n        v = torch.zeros(batch_size, model.dims.n_text_ctx, model.dims.n_text_state)\n        
self_kv_pair.append((k, v))\n\n    offset = torch.zeros(1, dtype=torch.int32)\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    logits, this_self_kv_pair = decoder(\n        tokens,\n        self_kv_pair,\n        cross_kv_pair,\n        offset,\n        mask,\n    )\n\n    assert logits.shape == (batch_size, tokens.shape[1], model.dims.n_vocab)\n    assert len(this_self_kv_pair) == model.dims.n_text_layer, (\n        len(this_self_kv_pair),\n        model.dims.n_text_layer,\n    )\n\n    input_names = [f\"{name}-tokens\"]\n    for i in range(model.dims.n_text_layer):\n        k = f\"{name}-self_k_{i}\"\n        v = f\"{name}-self_v_{i}\"\n        input_names.append(k)\n        input_names.append(v)\n\n    for i in range(model.dims.n_text_layer):\n        k = f\"{name}-cross_k_{i}\"\n        v = f\"{name}-cross_v_{i}\"\n        input_names.append(k)\n        input_names.append(v)\n    input_names.append(f\"{name}-offset\")\n    input_names.append(f\"{name}-mask\")\n\n    output_names = [f\"{name}-logits\"]\n    for i in range(model.dims.n_text_layer):\n        k = f\"{name}-this_self_k_{i}\"\n        v = f\"{name}-this_self_v_{i}\"\n        output_names.append(k)\n        output_names.append(v)\n\n    decoder_filename = f\"{name}-decoder.onnx\"\n    torch.onnx.export(\n        decoder,\n        (\n            tokens,\n            self_kv_pair,\n            cross_kv_pair,\n            offset,\n            mask,\n        ),\n        decoder_filename,\n        opset_version=opset_version,\n        input_names=input_names,\n        output_names=output_names,\n        **kwargs,\n    )\n\n    if \"large\" in args.model:\n        decoder_external_filename = decoder_filename.split(\".onnx\")[0]\n        decoder_model = onnx.load(decoder_filename)\n        onnx.save(\n            decoder_model,\n            decoder_filename,\n            save_as_external_data=True,\n            all_tensors_to_one_file=True,\n            
location=decoder_external_filename + \".weights\",\n        )\n\n    if False:\n        # Generate int8 quantization models\n        # See https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#data-type-selection\n\n        print(\"Generate int8 quantization models\")\n\n        encoder_filename_int8 = f\"{name}-encoder.int8.onnx\"\n        quantize_dynamic(\n            model_input=encoder_filename,\n            model_output=encoder_filename_int8,\n            op_types_to_quantize=[\"MatMul\"],\n            weight_type=QuantType.QInt8,\n        )\n\n        decoder_filename_int8 = f\"{name}-decoder.int8.onnx\"\n        quantize_dynamic(\n            model_input=decoder_filename,\n            model_output=decoder_filename_int8,\n            op_types_to_quantize=[\"MatMul\"],\n            weight_type=QuantType.QInt8,\n        )\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    try:\n        # To fix\n        # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n        # See also https://github.com/k2-fsa/sherpa-onnx/issues/1764\n        from whisper.model import disable_sdpa\n    except ImportError:\n        # This whisper version does not provide disable_sdpa, so export directly.\n        main()\n    else:\n        # Only the import is guarded, so a failure inside main() is reported\n        # once instead of silently triggering a second export attempt.\n        with disable_sdpa():\n            main()\n"
  },
  {
    "path": "scripts/whisper/rknn/export_rknn.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation (authors: Fangjun Kuang)\n\nimport argparse\nimport logging\nfrom pathlib import Path\n\nfrom rknn.api import RKNN\n\nlogging.basicConfig(level=logging.WARNING)\n\ng_platforms = [\n    \"rk3562\",\n    \"rk3566\",\n    \"rk3568\",\n    \"rk3576\",\n    \"rk3588\",\n]\n\n\ndef get_parser():\n    parser = argparse.ArgumentParser(\n        formatter_class=argparse.ArgumentDefaultsHelpFormatter\n    )\n\n    parser.add_argument(\n        \"--target-platform\",\n        type=str,\n        required=True,\n        help=f\"Supported values are: {','.join(g_platforms)}\",\n    )\n\n    parser.add_argument(\n        \"--in-model\",\n        type=str,\n        required=True,\n        help=\"Path to the input onnx model\",\n    )\n\n    parser.add_argument(\n        \"--out-model\",\n        type=str,\n        required=True,\n        help=\"Path to the output rknn model\",\n    )\n\n    return parser\n\n\ndef get_meta_data(model: str):\n    import onnxruntime\n\n    session_opts = onnxruntime.SessionOptions()\n    session_opts.inter_op_num_threads = 1\n    session_opts.intra_op_num_threads = 1\n\n    m = onnxruntime.InferenceSession(\n        model,\n        sess_options=session_opts,\n        providers=[\"CPUExecutionProvider\"],\n    )\n\n    for i in m.get_inputs():\n        print(i)\n\n    print(\"-----\")\n\n    for i in m.get_outputs():\n        print(i)\n    print()\n\n    meta = m.get_modelmeta().custom_metadata_map\n    s = \"\"\n    sep = \"\"\n    for key, value in meta.items():\n        s = s + sep + f\"{key}={value}\"\n        sep = \";\"\n    assert len(s) < 1024, len(s)\n\n    print(\"len(s)\", len(s), s)\n\n    return s\n\n\ndef export_rknn(rknn, filename):\n    ret = rknn.export_rknn(filename)\n    if ret != 0:\n        exit(f\"Export rknn model to {filename} failed!\")\n\n\ndef init_model(filename: str, target_platform: str, custom_string=None):\n    rknn = 
RKNN(verbose=False)\n\n    rknn.config(\n        optimization_level=0,\n        target_platform=target_platform,\n        custom_string=custom_string,\n    )\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    ret = rknn.load_onnx(model=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn.build(do_quantization=False)\n    if ret != 0:\n        exit(f\"Build model {filename} failed!\")\n\n    return rknn\n\n\nclass RKNNModel:\n    def __init__(\n        self,\n        model: str,\n        target_platform: str,\n    ):\n        meta = get_meta_data(model)\n        print(meta)\n\n        self.model = init_model(\n            model,\n            target_platform=target_platform,\n            custom_string=meta,\n        )\n\n    def export_rknn(self, model):\n        export_rknn(self.model, model)\n\n    def release(self):\n        self.model.release()\n\n\ndef main():\n    args = get_parser().parse_args()\n    print(vars(args))\n\n    model = RKNNModel(\n        model=args.in_model,\n        target_platform=args.target_platform,\n    )\n\n    model.export_rknn(\n        model=args.out_model,\n    )\n\n    model.release()\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/rknn/generate_decoder_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport glob\nfrom dataclasses import dataclass\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport numpy as np\nimport torch\nimport whisper\n\nfrom export_onnx import AudioEncoderTensorCache, TextDecoderTensorCache, causal_mask_1d\nfrom test_torch import compute_feat\n\n# we need to transpose cross_kv to (1, 384, 1500) when using it as an input\n# we need to transpose self_kv to (1, 384, 448) when using it as an input\n\n\ndef deepcopy_pair(pair):\n    return [(a.clone(), b.clone()) for a, b in pair]\n\n\ndef to_file(tensor, filename, debug):\n    if debug:\n        print(filename, tensor.shape, tensor.dtype)\n    tensor.numpy().tofile(filename)\n\n\n@dataclass\nclass DecoderInput:\n    tokens: torch.Tensor\n    self_kv_pair: List[Tuple[torch.Tensor, torch.Tensor]]\n    cross_kv_pair: List[Tuple[torch.Tensor, torch.Tensor]]\n    offset: torch.Tensor\n    mask: torch.Tensor\n\n    def save_to_file(self, prefix, debug):\n        ans = []\n        to_file(self.tokens.to(torch.int32), f\"{prefix}-tokens.raw\", debug)\n        ans.append(f\"{prefix}-tokens.raw\")\n\n        for i, (k, v) in enumerate(self.self_kv_pair):\n            to_file(k.permute(0, 2, 1), f\"{prefix}-self_k_{i}.raw\", debug)\n            ans.append(f\"{prefix}-self_k_{i}.raw\")\n\n            to_file(v.permute(0, 2, 1), f\"{prefix}-self_v_{i}.raw\", debug)\n            ans.append(f\"{prefix}-self_v_{i}.raw\")\n\n        for i, (k, v) in enumerate(self.cross_kv_pair):\n            to_file(k.permute(0, 2, 1), f\"{prefix}-cross_k_{i}.raw\", debug)\n            ans.append(f\"{prefix}-cross_k_{i}.raw\")\n\n            to_file(v.permute(0, 2, 1), f\"{prefix}-cross_v_{i}.raw\", debug)\n            ans.append(f\"{prefix}-cross_v_{i}.raw\")\n\n        to_file(self.offset.to(torch.int32), f\"{prefix}-offset.raw\", debug)\n        ans.append(f\"{prefix}-offset.raw\")\n\n        
to_file(self.mask.to(torch.int32), f\"{prefix}-mask.raw\", debug)\n        ans.append(f\"{prefix}-mask.raw\")\n\n        return ans\n\n\ndef process(model, tokenizer, w):\n    mel = compute_feat(w)\n\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n    cross_kv_pair = encoder(mel)\n\n    # cross_kv_pair[0][0]: (1, 1500, 384)\n    # cross_kv_pair[0][1]: (1, 1500, 384)\n\n    ans = []\n\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n\n    batch_size = 1\n    self_kv_pair = []\n    for i in range(model.dims.n_text_layer):\n        k = torch.zeros(batch_size, model.dims.n_text_ctx, model.dims.n_text_state)\n        v = torch.zeros(batch_size, model.dims.n_text_ctx, model.dims.n_text_state)\n\n        self_kv_pair.append((k, v))\n    # self_kv_pair[0][0]: (1, 448, 384)\n    # self_kv_pair[0][1]: (1, 448, 384)\n\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = torch.tensor([[tokenizer.sot]])\n\n    ans.append(\n        DecoderInput(\n            tokens=tokens.clone(),\n            self_kv_pair=deepcopy_pair(self_kv_pair),\n            cross_kv_pair=deepcopy_pair(cross_kv_pair),\n            offset=offset.clone(),\n            mask=mask.clone(),\n        )\n    )\n\n    logits, this_self_kv_pair = decoder(\n        tokens,\n        self_kv_pair,\n        cross_kv_pair,\n        offset,\n        mask,\n    )\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    offset += 1\n\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = torch.tensor([[tokenizer.no_timestamps]])\n    logits, this_self_kv_pair = decoder(\n        tokens, self_kv_pair, cross_kv_pair, offset, mask\n    )\n\n    ans.append(\n        DecoderInput(\n            tokens=tokens.clone(),\n            
self_kv_pair=deepcopy_pair(self_kv_pair),\n            cross_kv_pair=deepcopy_pair(cross_kv_pair),\n            offset=offset.clone(),\n            mask=mask.clone(),\n        )\n    )\n\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    assert logits.shape == (1, tokens.shape[1], model.dims.n_vocab)\n\n    print(\"logits.shape\", logits.shape)  # (1, 3, 51864)\n    idx = logits[0, -1].argmax().item()\n\n    steps = 0\n    results = []\n    while idx != tokenizer.eot and steps < 50:\n        results.append(idx)\n        tokens = torch.tensor([[results[-1]]])\n\n        offset += 1\n        mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n        logits, this_self_kv_pair = decoder(\n            tokens, self_kv_pair, cross_kv_pair, offset, mask\n        )\n\n        ans.append(\n            DecoderInput(\n                tokens=tokens.clone(),\n                self_kv_pair=deepcopy_pair(self_kv_pair),\n                cross_kv_pair=deepcopy_pair(cross_kv_pair),\n                offset=offset.clone(),\n                mask=mask.clone(),\n            )\n        )\n\n        for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n            k_cache[:, offset : offset + 1] = k\n            v_cache[:, offset : offset + 1] = v\n\n        idx = logits[0, -1].argmax().item()\n        steps += 1\n\n    print(results)\n    print(tokenizer.decode(results))\n    return ans\n\n\n@torch.no_grad()\ndef main():\n    model = whisper.load_model(\"tiny.en\")\n    model.eval()\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    wav_files = glob.glob(\"*.wav\")\n    features_name = []\n    for w in wav_files:\n        decoder_input_list = process(model, tokenizer, w)\n        print(len(decoder_input_list))\n\n        name = Path(w).stem\n        files = [\n  
          d.save_to_file(f\"{name}-decoder-iter-{k:02d}\", k == 0)\n            for k, d in enumerate(decoder_input_list)\n        ]\n\n        features_name.extend(files)\n\n    with open(\"decoder-input-list.txt\", \"w\") as f:\n        for line in features_name:\n            line = \" \".join(line)\n            f.write(f\"{line}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/rknn/generate_encoder_data.py",
    "content": "#!/usr/bin/env python3\n# Copyright (c)  2025  Xiaomi Corporation\n\nimport glob\nfrom pathlib import Path\n\nimport numpy as np\nimport torch\n\nfrom test_torch import compute_feat\n\n\n@torch.no_grad()\ndef main():\n    wav_files = glob.glob(\"*.wav\")\n    features_name = []\n    for w in wav_files:\n        f = compute_feat(w)\n\n        # Note: qnn expects (1, 3000, 80) as input\n        f = f.permute(0, 2, 1)  # (1, 80, 3000) -> (1, 3000, 80)\n\n        f = f.numpy()\n        print(w, f.shape)\n        name = Path(w).stem\n\n        s = f\"encoder-input-{name}.raw\"\n        f.tofile(s)\n        features_name.append(s)\n\n    with open(\"encoder-input-list.txt\", \"w\") as f:\n        for line in features_name:\n            f.write(f\"{line}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/rknn/notes.md",
    "content": "# Note\n\n## Encoder\n```\n=========./tiny.en-encoder.onnx==========\nNodeArg(name='tiny.en-mel', type='tensor(float)', shape=[1, 80, 3000])\n-----\nNodeArg(name='cross_k_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_3', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_3', type='tensor(float)', shape=[1, 1500, 384])\n```\n\n## Decoder\n\n```\n=========./tiny.en-decoder.onnx==========\nNodeArg(name='tiny.en-tokens', type='tensor(int32)', shape=[1, 1])\nNodeArg(name='tiny.en-self_k_0', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_0', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_1', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_1', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_2', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_2', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_3', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_3', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-cross_k_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_3', 
type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_3', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-offset', type='tensor(int64)', shape=[1])\nNodeArg(name='tiny.en-mask', type='tensor(float)', shape=[448])\n-----\nNodeArg(name='tiny.en-logits', type='tensor(float)', shape=['Casttiny.en-logits_dim_0', 'Casttiny.en-logits_dim_1', 51864])\nNodeArg(name='tiny.en-this_self_k_0', type='tensor(float)', shape=[1, 'MatMultiny.en-this_self_k_0_dim_1', 384])\nNodeArg(name='tiny.en-this_self_v_0', type='tensor(float)', shape=[1, 'MatMultiny.en-this_self_k_0_dim_1', 384])\nNodeArg(name='tiny.en-this_self_k_1', type='tensor(float)', shape=['MatMultiny.en-this_self_k_1_dim_0', 'MatMultiny.en-this_self_k_1_dim_1', 384])\nNodeArg(name='tiny.en-this_self_v_1', type='tensor(float)', shape=['MatMultiny.en-this_self_k_1_dim_0', 'MatMultiny.en-this_self_k_1_dim_1', 384])\nNodeArg(name='tiny.en-this_self_k_2', type='tensor(float)', shape=['MatMultiny.en-this_self_k_2_dim_0', 'MatMultiny.en-this_self_k_2_dim_1', 384])\nNodeArg(name='tiny.en-this_self_v_2', type='tensor(float)', shape=['MatMultiny.en-this_self_k_2_dim_0', 'MatMultiny.en-this_self_k_2_dim_1', 384])\nNodeArg(name='tiny.en-this_self_k_3', type='tensor(float)', shape=['MatMultiny.en-this_self_k_3_dim_0', 'MatMultiny.en-this_self_k_3_dim_1', 384])\nNodeArg(name='tiny.en-this_self_v_3', type='tensor(float)', shape=['MatMultiny.en-this_self_k_3_dim_0', 'MatMultiny.en-this_self_k_3_dim_1', 384])\n```\n"
  },
  {
    "path": "scripts/whisper/rknn/test_on_rk3588_board.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\"\"\"\nusage:\n\n./test_on_rk3588_board.py  --encoder ./base-encoder.rknn --decoder ./base-decoder.rknn --tokens ./base-tokens.txt --wav ./en-16k.wav\n\n./test_on_rk3588_board.py  --encoder ./base.en-encoder.rknn --decoder ./base.en-decoder.rknn --tokens ./base.en-tokens.txt --wav ./en-16k.wav\n\"\"\"\n\ntry:\n    from rknnlite.api import RKNNLite\nexcept:\n    print(\"Please run this file on your board (linux + aarch64 + npu)\")\n    print(\"You need to install rknn_toolkit_lite2\")\n    print(\n        \" from https://github.com/airockchip/rknn-toolkit2/tree/master/rknn-toolkit-lite2/packages\"\n    )\n    print(\n        \"https://github.com/airockchip/rknn-toolkit2/blob/v2.1.0/rknn-toolkit-lite2/packages/rknn_toolkit_lite2-2.1.0-cp310-cp310-linux_aarch64.whl\"\n    )\n    print(\"is known to work\")\n    raise\n\nimport argparse\nimport base64\nimport time\nfrom pathlib import Path\nfrom typing import List, Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to the tokens\",\n    )\n\n    parser.add_argument(\n        \"--wav\",\n        type=str,\n        required=True,\n        help=\"Path to the test wav\",\n    )\n\n    return parser.parse_args()\n\n\ndef causal_mask_1d(n: int, L: int):\n    \"\"\"\n    Returns a 1-D int mask of shape (L,) with:\n      0 -> allowed\n      1 -> masked (will be converted to -inf later)\n    \"\"\"\n    mask = np.ones((L,), 
dtype=np.int32)\n    if n > 0:\n        mask[:n] = 0\n    return mask\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_features(samples: np.ndarray, dim: int = 80) -> torch.Tensor:\n    \"\"\"\n    Returns:\n      Return a 3-D float32 tensor of shape (1, dim, 3000) containing the features.\n    \"\"\"\n    features = []\n    opts = knf.WhisperFeatureOptions()\n    opts.dim = dim\n    online_whisper_fbank = knf.OnlineWhisperFbank(opts)\n    online_whisper_fbank.accept_waveform(16000, samples)\n    online_whisper_fbank.input_finished()\n    for i in range(online_whisper_fbank.num_frames_ready):\n        f = online_whisper_fbank.get_frame(i)\n        f = torch.from_numpy(f)\n        features.append(f)\n\n    features = torch.stack(features)\n\n    log_spec = torch.clamp(features, min=1e-10).log10()\n    log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)\n    mel = (log_spec + 4.0) / 4.0\n    # mel: (T, dim)\n\n    # We pad 1500 frames at the end so that it is able to detect eot\n    # You can use another value instead of 1500.\n    mel = torch.nn.functional.pad(mel, (0, 0, 0, 1500), \"constant\", 0)\n    # Note that if it throws for a multilingual model,\n    # please use a larger value, say 300\n\n    target = 3000\n    if mel.shape[0] > target:\n        # -50 so that there are some zero tail paddings.\n        mel = mel[: target - 50]\n        mel = torch.nn.functional.pad(mel, (0, 0, 0, 50), \"constant\", 0)\n    elif mel.shape[0] < target:\n        mel = torch.nn.functional.pad(\n            mel, (0, 0, 0, target - mel.shape[0]), \"constant\", 0\n        )\n\n    mel = mel.t().unsqueeze(0)\n\n    return mel\n\n\ndef load_tokens(filename):\n    tokens = dict()\n    with open(filename, \"r\") as 
f:\n        for line in f:\n            t, i = line.split()\n            tokens[int(i)] = t\n    return tokens\n\n\ndef init_model(filename, target_platform=\"rk3588\"):\n\n    if not Path(filename).is_file():\n        exit(f\"{filename} does not exist\")\n\n    rknn_lite = RKNNLite(verbose=False)\n    ret = rknn_lite.load_rknn(path=filename)\n    if ret != 0:\n        exit(f\"Load model {filename} failed!\")\n\n    ret = rknn_lite.init_runtime(core_mask=RKNNLite.NPU_CORE_0)\n    if ret != 0:\n        exit(f\"Failed to init rknn runtime for {filename}\")\n    return rknn_lite\n\n\nclass RKNNModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n        sot_sequence: List[int],\n        eot: int,\n        n_text_layer: int,\n        n_text_ctx: int,\n        n_text_state: int,\n        target_platform=\"rk3588\",\n    ):\n        self.sot_sequence = sot_sequence\n        self.eot = eot\n        self.n_text_layer = n_text_layer\n        self.n_text_ctx = n_text_ctx\n        self.n_text_state = n_text_state\n\n        print(\"sot_sequence\", self.sot_sequence)\n        print(\"eot\", self.eot)\n\n        self.encoder = init_model(encoder)\n        self.decoder = init_model(decoder)\n\n    def release(self):\n        self.encoder.release()\n        self.decoder.release()\n\n    def run_encoder(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (1, 80, 3000), np.float32\n        Returns:\n          cross_kv:\n           - (k, v) for layer 0\n           - (k, v) for layer 1\n           - (k, v) for layer 2\n           - (k, v) for layer 3\n        \"\"\"\n        out = self.encoder.inference(inputs=[x.numpy()])\n        return out\n\n    def get_self_cache(self) -> List[np.ndarray]:\n        self_cache = []\n        batch_size = 1\n        for i in range(self.n_text_layer):\n            k = np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            v = 
np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            self_cache.extend([k, v])\n        return self_cache\n\n    def run_decoder(self, tokens: np.ndarray, self_kv, cross_kv, offset, mask):\n        \"\"\"\n        Args:\n          tokens: (1, 1), np.int32\n          offset: (1,), np.int32\n          mask: (model.n_text_ctx,), np.int32\n        Returns:\n          logit: (1, 1, vocab_size)\n          this_self_kv\n        \"\"\"\n        return self.decoder.inference(\n            inputs=[tokens] + self_kv + cross_kv + [offset, mask]\n        )\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    id2token = load_tokens(args.tokens)\n\n    if \".en\" in args.encoder:\n        sot_sequence = [50257, 50362]\n        eot = 50256\n    else:\n        sot_sequence = [50258, 50259, 50359, 50363]\n        eot = 50257\n\n    if \"tiny\" in args.encoder:\n        n_text_layer = 4\n        n_text_ctx = 448\n        n_text_state = 384\n    elif \"base\" in args.encoder:\n        n_text_layer = 6\n        n_text_ctx = 448\n        n_text_state = 512\n    elif \"small\" in args.encoder:\n        n_text_layer = 12\n        n_text_ctx = 448\n        n_text_state = 768\n    elif \"medium\" in args.encoder:\n        n_text_layer = 24\n        n_text_ctx = 448\n        n_text_state = 1024\n    else:\n        assert False, f\"Unsupported encoder {args.encoder}\"\n\n    model = RKNNModel(\n        encoder=args.encoder,\n        decoder=args.decoder,\n        sot_sequence=sot_sequence,\n        eot=eot,\n        n_text_layer=n_text_layer,\n        n_text_ctx=n_text_ctx,\n        n_text_state=n_text_state,\n    )\n\n    for i in range(1):\n        test(model, id2token, args.wav)\n\n\ndef test(model, id2token, wav):\n\n    start = time.time()\n    samples, sample_rate = load_audio(wav)\n    assert sample_rate == 16000, sample_rate\n\n    features = compute_features(samples)\n    print(features.shape)\n    
cross_kv = model.run_encoder(features)\n\n    self_kv = model.get_self_cache()\n\n    offset = np.array([0], dtype=np.int32)\n    for t in model.sot_sequence:\n        token = np.array([[t]], dtype=np.int32)  # sot\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx)\n\n        out = model.run_decoder(\n            tokens=token, self_kv=self_kv, cross_kv=cross_kv, offset=offset, mask=mask\n        )\n\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n\n    idx = out[0][0, 0].argmax()\n\n    eot = model.eot\n\n    ans = []\n\n    while idx != eot and offset.item() < 100:\n        ans.append(idx)\n        token = np.array([[idx]], dtype=np.int32)\n\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx)\n\n        out = model.run_decoder(\n            tokens=token, self_kv=self_kv, cross_kv=cross_kv, offset=offset, mask=mask\n        )\n\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n        idx = out[0][0, 0].argmax()\n\n    print(ans)\n\n    s = b\"\"\n    for i in ans:\n        if i in id2token:\n            s += base64.b64decode(id2token[i])\n\n    print(s.decode().strip())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/rknn/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import List, Tuple\n\nimport numpy as np\nimport onnxruntime as ort\nimport torch\nimport whisper\n\nfrom test_torch import compute_feat\nfrom export_onnx import causal_mask_1d, get_args\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n\n        self.init_encoder(encoder)\n        self.init_decoder(decoder)\n\n    def init_encoder(self, encoder: str):\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        self.encoder_input_names = []\n        self.encoder_output_names = []\n\n        print(f\"-----{encoder}-----\")\n        print(f\"----input----\")\n        for i in self.encoder.get_inputs():\n            print(i)\n            self.encoder_input_names.append(i.name)\n\n        print(\"-----output-----\")\n\n        for i in self.encoder.get_outputs():\n            print(i)\n            self.encoder_output_names.append(i.name)\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.n_text_layer = int(meta[\"n_text_layer\"])\n        self.n_text_ctx = int(meta[\"n_text_ctx\"])\n        self.n_text_state = int(meta[\"n_text_state\"])\n\n    def init_decoder(self, decoder: str):\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        self.decoder_input_names = []\n        self.decoder_output_names = []\n\n        print(f\"-----{decoder}-----\")\n        print(f\"----input----\")\n        for i in self.decoder.get_inputs():\n    
        print(i)\n            self.decoder_input_names.append(i.name)\n\n        print(\"-----output-----\")\n\n        for i in self.decoder.get_outputs():\n            print(i)\n            self.decoder_output_names.append(i.name)\n\n    def run_encoder(\n        self,\n        mel: np.ndarray,\n    ) -> List[np.ndarray]:\n        cross_kv = self.encoder.run(\n            self.encoder_output_names,\n            {\n                self.encoder.get_inputs()[0].name: mel,\n            },\n        )\n        return cross_kv\n\n    def run_decoder(self, inputs: List[np.ndarray]) -> List[np.ndarray]:\n        feed = {\n            self.decoder.get_inputs()[i].name: inputs[i] for i in range(len(inputs))\n        }\n\n        out = self.decoder.run(\n            self.decoder_output_names,\n            feed,\n        )\n        return out\n\n    def get_self_cache(self) -> List[np.ndarray]:\n        self_cache = []\n        batch_size = 1\n        for i in range(self.n_text_layer):\n            k = np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            v = np.zeros(\n                (batch_size, self.n_text_ctx, self.n_text_state), dtype=np.float32\n            )\n            self_cache.extend([k, v])\n        return self_cache\n\n\ndef main():\n    args = get_args()\n    print(vars(args))\n\n    torch_model = whisper.load_model(args.model)\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        torch_model.is_multilingual, num_languages=torch_model.num_languages\n    )\n\n    mel = compute_feat(\"./en-16k.wav\").numpy()\n    print(mel.shape)  # (1, 80. 
3000)\n    model = OnnxModel(f\"./{args.model}-encoder.onnx\", f\"./{args.model}-decoder.onnx\")\n\n    sot_sequence = list(tokenizer.sot_sequence) + [tokenizer.no_timestamps]\n\n    # tiny.en: [50257, 50362]\n    # tiny: [50258, 50259, 50359, 50363]\n    print(\"sot sequence\", sot_sequence)\n\n    cross_kv = model.run_encoder(mel)\n    print(len(cross_kv))  # 8\n\n    self_kv = model.get_self_cache()\n\n    # tiny.en: 50256\n    # tiny: 50257\n    eot = tokenizer.eot\n    print(\"eot\", eot)\n\n    offset = np.array([0], dtype=np.int32)\n    for t in sot_sequence:\n        token = np.array([[t]], dtype=np.int32)  # sot\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx).numpy()\n\n        out = model.run_decoder([token] + self_kv + cross_kv + [offset, mask])\n\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n\n    idx = out[0][0, 0].argmax()\n\n    ans = []\n\n    while idx != eot and offset.item() < 200:\n        ans.append(idx)\n        token = np.array([[idx]], dtype=np.int32)  # previously predicted token\n\n        mask = causal_mask_1d(offset.item(), model.n_text_ctx).numpy()\n\n        out = model.run_decoder([token] + self_kv + cross_kv + [offset, mask])\n\n        # Store this step's k/v at the position of the token just decoded,\n        # matching the sot loop above.\n        for i in range(1, len(out)):\n            self_kv[i - 1][:, offset.item() : offset.item() + 1, :] = out[i]\n\n        offset += 1\n        idx = out[0][0, 0].argmax()\n\n    print(ans)\n    text = \"\".join(tokenizer.decode(ans)).strip()\n    print(text)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/rknn/test_qnn.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport numpy as np\nimport soundfile as sf\nimport torch\nimport whisper\n\nfrom export_onnx import AudioEncoderTensorCache, TextDecoderTensorCache, causal_mask_1d\nfrom test_torch import compute_feat\n\n\n@torch.no_grad()\ndef main():\n    mel = compute_feat(\"en.wav\")\n\n    model = whisper.load_model(\"tiny.en\")\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    model.eval()\n\n    cross_kv_pair = []\n    for i in range(4):\n        k = features = np.fromfile(f\"./cross_k_{i}.raw\", dtype=np.float32).reshape(\n            1, 1500, 384\n        )\n        v = features = np.fromfile(f\"./cross_v_{i}.raw\", dtype=np.float32).reshape(\n            1, 1500, 384\n        )\n\n        k = torch.from_numpy(k)\n        v = torch.from_numpy(v)\n\n        cross_kv_pair.append((k, v))\n\n    n_audio = mel.shape[0]\n\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n\n    self_kv_pair = []\n    for i in range(model.dims.n_text_layer):\n        k = torch.zeros(n_audio, model.dims.n_text_ctx, model.dims.n_text_state)\n        v = torch.zeros(n_audio, model.dims.n_text_ctx, model.dims.n_text_state)\n        self_kv_pair.append((k, v))\n\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = torch.tensor([[tokenizer.sot]])\n    logits, this_self_kv_pair = decoder(\n        tokens,\n        self_kv_pair,\n        cross_kv_pair,\n        offset,\n        mask,\n    )\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    offset += 1\n\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = 
torch.tensor([[tokenizer.no_timestamps]])\n    logits, this_self_kv_pair = decoder(\n        tokens, self_kv_pair, cross_kv_pair, offset, mask\n    )\n\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    assert logits.shape == (n_audio, tokens.shape[1], model.dims.n_vocab)\n\n    print(\"logits.shape\", logits.shape)  # (1, 3, 51864)\n    idx = logits[0, -1].argmax().item()\n\n    steps = 0\n    results = []\n    while idx != tokenizer.eot and steps < 50:\n        results.append(idx)\n        tokens = torch.tensor([[results[-1]]])\n\n        offset += 1\n        mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n        logits, this_self_kv_pair = decoder(\n            tokens, self_kv_pair, cross_kv_pair, offset, mask\n        )\n\n        for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n            k_cache[:, offset : offset + 1] = k\n            v_cache[:, offset : offset + 1] = v\n\n        idx = logits[0, -1].argmax().item()\n        steps += 1\n\n    print(results)\n    print(tokenizer.decode(results))\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    # To fix\n    # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n    # See also https://github.com/k2-fsa/sherpa-onnx/issues/1764\n    from whisper.model import disable_sdpa\n\n    with disable_sdpa():\n        main()\n"
  },
  {
    "path": "scripts/whisper/rknn/test_torch.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport numpy as np\nimport soundfile as sf\nimport torch\nimport whisper\n\nfrom export_onnx import (\n    AudioEncoderTensorCache,\n    TextDecoderTensorCache,\n    causal_mask_1d,\n    get_args,\n)\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_feat(filename: str):\n    wave, sample_rate = load_audio(filename)\n    if sample_rate != 16000:\n        import librosa\n\n        wave = librosa.resample(wave, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    audio = whisper.pad_or_trim(wave)\n    assert audio.shape == (16000 * 30,), audio.shape\n\n    mel = whisper.log_mel_spectrogram(audio, n_mels=80).unsqueeze(0)\n    assert mel.shape == (1, 80, 3000), mel.shape\n\n    return mel\n\n\n@torch.no_grad()\ndef main():\n    args = get_args()\n    print(vars(args))\n    mel = compute_feat(\"en.wav\")\n\n    model = whisper.load_model(args.model, device=\"cpu\")\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    model.eval()\n\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n\n    cross_kv_pair = encoder(mel)\n\n    n_audio = mel.shape[0]\n\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n\n    self_kv_pair = []\n    for i in range(model.dims.n_text_layer):\n        k = torch.zeros(n_audio, model.dims.n_text_ctx, model.dims.n_text_state)\n        v = torch.zeros(n_audio, model.dims.n_text_ctx, model.dims.n_text_state)\n        self_kv_pair.append((k, v))\n\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n\n  
  mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = torch.tensor([[tokenizer.sot]])\n    logits, this_self_kv_pair = decoder(\n        tokens,\n        self_kv_pair,\n        cross_kv_pair,\n        offset,\n        mask,\n    )\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    offset += 1\n\n    mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n    tokens = torch.tensor([[tokenizer.no_timestamps]])\n    logits, this_self_kv_pair = decoder(\n        tokens, self_kv_pair, cross_kv_pair, offset, mask\n    )\n\n    for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n        k_cache[:, offset : offset + 1] = k\n        v_cache[:, offset : offset + 1] = v\n\n    assert logits.shape == (n_audio, tokens.shape[1], model.dims.n_vocab)\n\n    print(\"logits.shape\", logits.shape)  # (1, 3, 51864)\n    idx = logits[0, -1].argmax().item()\n\n    steps = 0\n    results = []\n    while idx != tokenizer.eot and steps < 50:\n        results.append(idx)\n        tokens = torch.tensor([[results[-1]]])\n\n        offset += 1\n        mask = causal_mask_1d(offset.item(), model.dims.n_text_ctx)\n\n        logits, this_self_kv_pair = decoder(\n            tokens, self_kv_pair, cross_kv_pair, offset, mask\n        )\n\n        for (k_cache, v_cache), (k, v) in zip(self_kv_pair, this_self_kv_pair):\n            k_cache[:, offset : offset + 1] = k\n            v_cache[:, offset : offset + 1] = v\n\n        idx = logits[0, -1].argmax().item()\n        steps += 1\n\n    print(results)\n    print(tokenizer.decode(results))\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    try:\n        # To fix\n        # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n        # See also 
https://github.com/k2-fsa/sherpa-onnx/issues/1764\n        from whisper.model import disable_sdpa\n\n        with disable_sdpa():\n            main()\n    except ImportError:\n        main()\n"
  },
  {
    "path": "scripts/whisper/rknn/tiny-en-onnx-info.md",
    "content": "# tiny.en encoder\n\n```\n----input----\nNodeArg(name='tiny.en-mel', type='tensor(float)', shape=[1, 80, 3000])\n\n-----output-----\nNodeArg(name='cross_k_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_k_3', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='cross_v_3', type='tensor(float)', shape=[1, 1500, 384])\n```\n\n# tiny.en decoder\n\n```\n----input----\nNodeArg(name='tiny.en-tokens', type='tensor(int32)', shape=[1, 1])\nNodeArg(name='tiny.en-self_k_0', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_0', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_1', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_1', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_2', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_2', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_k_3', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-self_v_3', type='tensor(float)', shape=[1, 448, 384])\nNodeArg(name='tiny.en-cross_k_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_0', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_1', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_v_2', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-cross_k_3', type='tensor(float)', shape=[1, 1500, 
384])\nNodeArg(name='tiny.en-cross_v_3', type='tensor(float)', shape=[1, 1500, 384])\nNodeArg(name='tiny.en-offset', type='tensor(int32)', shape=[1])\nNodeArg(name='tiny.en-mask', type='tensor(int32)', shape=[448])\n\n-----output-----\n\nNodeArg(name='tiny.en-logits', type='tensor(float)', shape=[1, 1, 51864])\nNodeArg(name='tiny.en-this_self_k_0', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_v_0', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_k_1', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_v_1', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_k_2', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_v_2', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_k_3', type='tensor(float)', shape=[1, 1, 384])\nNodeArg(name='tiny.en-this_self_v_3', type='tensor(float)', shape=[1, 1, 384])\n```\n"
  },
  {
    "path": "scripts/whisper/test.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2023  Xiaomi Corp.        (authors: Fangjun Kuang)\n\"\"\"\nPlease first run ./export-onnx.py\nbefore you run this script\n\"\"\"\nimport argparse\nimport base64\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\nimport torch\n\n\ndef get_args():\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        \"--encoder\",\n        type=str,\n        required=True,\n        help=\"Path to the encoder\",\n    )\n\n    parser.add_argument(\n        \"--decoder\",\n        type=str,\n        required=True,\n        help=\"Path to the decoder\",\n    )\n\n    parser.add_argument(\n        \"--tokens\",\n        type=str,\n        required=True,\n        help=\"Path to the tokens\",\n    )\n\n    parser.add_argument(\n        \"--language\",\n        type=str,\n        help=\"\"\"The actual spoken language in the audio.\n        Example values, en, de, zh, jp, fr.\n        If None, we will detect the language using the first 30s of the\n        input audio\n        \"\"\",\n    )\n\n    parser.add_argument(\n        \"--task\",\n        choices=[\"transcribe\", \"translate\"],\n        type=str,\n        default=\"transcribe\",\n        help=\"Valid values are: transcribe, translate\",\n    )\n\n    parser.add_argument(\n        \"--test-attention\",\n        action=\"store_true\",\n        help=\"Test cross-attention outputs (requires attention-enabled model)\",\n    )\n\n    parser.add_argument(\n        \"sound_file\",\n        type=str,\n        help=\"Path to the test wave\",\n    )\n    return parser.parse_args()\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        encoder: str,\n        decoder: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 4\n\n        self.session_opts = session_opts\n\n        
self.init_encoder(encoder)\n        self.init_decoder(decoder)\n\n    def init_encoder(self, encoder: str):\n        self.encoder = ort.InferenceSession(\n            encoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n        meta = self.encoder.get_modelmeta().custom_metadata_map\n        self.n_text_layer = int(meta[\"n_text_layer\"])\n        self.n_text_ctx = int(meta[\"n_text_ctx\"])\n        self.n_text_state = int(meta[\"n_text_state\"])\n        self.n_mels = int(meta[\"n_mels\"])\n        self.sot = int(meta[\"sot\"])\n        self.eot = int(meta[\"eot\"])\n        self.translate = int(meta[\"translate\"])\n        self.transcribe = int(meta[\"transcribe\"])\n        self.no_timestamps = int(meta[\"no_timestamps\"])\n        self.no_speech = int(meta[\"no_speech\"])\n        self.blank = int(meta[\"blank_id\"])\n\n        self.sot_sequence = list(map(int, meta[\"sot_sequence\"].split(\",\")))\n        self.sot_sequence.append(self.no_timestamps)\n\n        self.all_language_tokens = list(\n            map(int, meta[\"all_language_tokens\"].split(\",\"))\n        )\n        self.all_language_codes = meta[\"all_language_codes\"].split(\",\")\n        self.lang2id = dict(zip(self.all_language_codes, self.all_language_tokens))\n        self.id2lang = dict(zip(self.all_language_tokens, self.all_language_codes))\n\n        self.is_multilingual = int(meta[\"is_multilingual\"]) == 1\n\n    def init_decoder(self, decoder: str):\n        self.decoder = ort.InferenceSession(\n            decoder,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def run_encoder(\n        self,\n        mel: torch.Tensor,\n    ) -> Tuple[torch.Tensor, torch.Tensor]:\n        n_layer_cross_k, n_layer_cross_v = self.encoder.run(\n            [\n                self.encoder.get_outputs()[0].name,\n                self.encoder.get_outputs()[1].name,\n         
   ],\n            {\n                self.encoder.get_inputs()[0].name: mel.numpy(),\n            },\n        )\n        return torch.from_numpy(n_layer_cross_k), torch.from_numpy(n_layer_cross_v)\n\n    def run_decoder(\n        self,\n        tokens: torch.Tensor,\n        n_layer_self_k_cache: torch.Tensor,\n        n_layer_self_v_cache: torch.Tensor,\n        n_layer_cross_k: torch.Tensor,\n        n_layer_cross_v: torch.Tensor,\n        offset: torch.Tensor,\n        return_attention: bool = False,\n    ):\n        # Caller must verify decoder has 4 outputs before passing return_attention=True\n        logits, out_n_layer_self_k_cache, out_n_layer_self_v_cache, *rest = self.decoder.run(\n            [\n                self.decoder.get_outputs()[0].name,\n                self.decoder.get_outputs()[1].name,\n                self.decoder.get_outputs()[2].name,\n                *([self.decoder.get_outputs()[3].name] if return_attention else []),\n            ],\n            {\n                self.decoder.get_inputs()[0].name: tokens.numpy(),\n                self.decoder.get_inputs()[1].name: n_layer_self_k_cache.numpy(),\n                self.decoder.get_inputs()[2].name: n_layer_self_v_cache.numpy(),\n                self.decoder.get_inputs()[3].name: n_layer_cross_k.numpy(),\n                self.decoder.get_inputs()[4].name: n_layer_cross_v.numpy(),\n                self.decoder.get_inputs()[5].name: offset.numpy(),\n            },\n        )\n        return (\n            torch.from_numpy(logits),\n            torch.from_numpy(out_n_layer_self_k_cache),\n            torch.from_numpy(out_n_layer_self_v_cache),\n            torch.from_numpy(rest[0]) if return_attention else None,\n        )\n\n    def get_self_cache(self) -> Tuple[torch.Tensor, torch.Tensor]:\n        batch_size = 1\n        n_layer_self_k_cache = torch.zeros(\n            self.n_text_layer,\n            batch_size,\n            self.n_text_ctx,\n            self.n_text_state,\n        )\n    
    n_layer_self_v_cache = torch.zeros(\n            self.n_text_layer,\n            batch_size,\n            self.n_text_ctx,\n            self.n_text_state,\n        )\n        return n_layer_self_k_cache, n_layer_self_v_cache\n\n    def suppress_tokens(self, logits, is_initial: bool) -> None:\n        # suppress blank\n        if is_initial:\n            logits[self.eot] = float(\"-inf\")\n            logits[self.blank] = float(\"-inf\")\n\n        # suppress <|notimestamps|>\n        logits[self.no_timestamps] = float(\"-inf\")\n\n        logits[self.sot] = float(\"-inf\")\n        logits[self.no_speech] = float(\"-inf\")\n\n        # logits is changed in-place\n        logits[self.translate] = float(\"-inf\")\n\n    def detect_language(\n        self, n_layer_cross_k: torch.Tensor, n_layer_cross_v: torch.Tensor\n    ) -> int:\n        tokens = torch.tensor([[self.sot]], dtype=torch.int64)\n        offset = torch.zeros(1, dtype=torch.int64)\n        n_layer_self_k_cache, n_layer_self_v_cache = self.get_self_cache()\n\n        logits, n_layer_self_k_cache, n_layer_self_v_cache, _ = self.run_decoder(\n            tokens=tokens,\n            n_layer_self_k_cache=n_layer_self_k_cache,\n            n_layer_self_v_cache=n_layer_self_v_cache,\n            n_layer_cross_k=n_layer_cross_k,\n            n_layer_cross_v=n_layer_cross_v,\n            offset=offset,\n        )\n        logits = logits.reshape(-1)\n        mask = torch.ones(logits.shape[0], dtype=torch.int64)\n        mask[self.all_language_tokens] = 0\n        logits[mask != 0] = float(\"-inf\")\n        lang_id = logits.argmax().item()\n        print(\"detected language: \", self.id2lang[lang_id])\n        return lang_id\n\n\ndef load_tokens(filename):\n    tokens = dict()\n    with open(filename, \"r\") as f:\n        for line in f:\n            t, i = line.split()\n            tokens[int(i)] = t\n    return tokens\n\n\ndef verify_attention(attention_weights, n_audio_ctx, tokens, token_table):\n    
\"\"\"Verify attention weights and print approximate timestamps.\"\"\"\n    if not attention_weights:\n        print(\"No attention weights to verify\")\n        return\n\n    n_heads = attention_weights[0].shape[1]\n    print(\"\\n--- Attention Verification ---\")\n    print(f\"Alignment heads: {n_heads}, Audio frames: {n_audio_ctx}, Tokens: {len(tokens)}\")\n\n    for i, attn in enumerate(attention_weights):\n        expected = (1, n_heads, 1, n_audio_ctx)\n        if tuple(attn.shape) != expected:\n            print(f\"  Token {i}: expected shape {expected}, got {tuple(attn.shape)}\")\n\n    print(\"\\n--- Approximate Timestamps ---\")\n    for i, attn in enumerate(attention_weights):\n        peak_frame = attn.mean(dim=1).squeeze().argmax().item()\n        timestamp = peak_frame * 0.02\n        token_str = token_table.get(tokens[i], f\"<{tokens[i]}>\")\n        try:\n            token_display = base64.b64decode(token_str).decode()\n        except Exception:\n            token_display = token_str\n        print(f\"  Token {i} ({token_display!r}): ~{timestamp:.2f}s\")\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_features(filename: str, dim: int = 80) -> torch.Tensor:\n    \"\"\"\n    Args:\n      filename:\n        Path to an audio file.\n      dim:\n        Number of mel bins.\n    Returns:\n      Return a 3-D float32 tensor of shape (1, dim, T), with T <= 3000,\n      containing the features.\n    \"\"\"\n    wave, sample_rate = load_audio(filename)\n    if sample_rate != 16000:\n        import librosa\n\n        wave = librosa.resample(wave, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    features = []\n    opts = knf.WhisperFeatureOptions()\n    opts.dim = dim\n    online_whisper_fbank = knf.OnlineWhisperFbank(opts)\n    
online_whisper_fbank.accept_waveform(16000, wave)\n    online_whisper_fbank.input_finished()\n    for i in range(online_whisper_fbank.num_frames_ready):\n        f = online_whisper_fbank.get_frame(i)\n        f = torch.from_numpy(f)\n        features.append(f)\n\n    features = torch.stack(features)\n\n    log_spec = torch.clamp(features, min=1e-10).log10()\n    log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)\n    mel = (log_spec + 4.0) / 4.0\n    # mel: (T, dim)\n\n    # We pad 1500 frames at the end so that the model is able to detect eot.\n    # You can use another value instead of 1500.\n    mel = torch.nn.functional.pad(mel, (0, 0, 0, 1500), \"constant\", 0)\n    # Note that if it throws for a multilingual model,\n    # please try a different padding value.\n\n    target = 3000\n    if mel.shape[0] > target:\n        # -50 so that there is some zero padding at the tail.\n        mel = mel[: target - 50]\n        mel = torch.nn.functional.pad(mel, (0, 0, 0, 50), \"constant\", 0)\n\n    # We don't need to pad it to 30 seconds now!\n    #  mel = torch.nn.functional.pad(mel, (0, 0, 0, target - mel.shape[0]), \"constant\", 0)\n\n    mel = mel.t().unsqueeze(0)\n\n    return mel\n\n\ndef main():\n    args = get_args()\n\n    model = OnnxModel(args.encoder, args.decoder)\n\n    if args.test_attention and len(model.decoder.get_outputs()) < 4:\n        raise RuntimeError(\n            \"--test-attention requires a model with cross-attention outputs. \"\n            \"Use export-onnx-with-attention.py to export a compatible model.\"\n        )\n\n    n_mels = model.n_mels\n\n    mel = compute_features(args.sound_file, dim=n_mels)\n\n    n_layer_cross_k, n_layer_cross_v = model.run_encoder(mel)\n\n    if args.language is not None:\n        if model.is_multilingual is False and args.language != \"en\":\n            print(f\"This model supports only English. 
Given: {args.language}\")\n            return\n\n        if args.language not in model.lang2id:\n            print(f\"Invalid language: {args.language}\")\n            print(f\"Valid values are: {list(model.lang2id.keys())}\")\n            return\n\n        # [sot, lang, task, notimestamps]\n        model.sot_sequence[1] = model.lang2id[args.language]\n    elif model.is_multilingual is True:\n        print(\"detecting language\")\n        lang = model.detect_language(n_layer_cross_k, n_layer_cross_v)\n        model.sot_sequence[1] = lang\n\n    if args.task is not None:\n        if model.is_multilingual is False and args.task != \"transcribe\":\n            print(\"This model supports only English. Please use --task=transcribe\")\n            return\n        assert args.task in [\"transcribe\", \"translate\"], args.task\n\n        if args.task == \"translate\":\n            model.sot_sequence[2] = model.translate\n\n    n_layer_self_k_cache, n_layer_self_v_cache = model.get_self_cache()\n\n    print(model.sot_sequence)\n    tokens = torch.tensor([model.sot_sequence], dtype=torch.int64)\n    offset = torch.zeros(1, dtype=torch.int64)\n    logits, n_layer_self_k_cache, n_layer_self_v_cache, _ = model.run_decoder(\n        tokens=tokens,\n        n_layer_self_k_cache=n_layer_self_k_cache,\n        n_layer_self_v_cache=n_layer_self_v_cache,\n        n_layer_cross_k=n_layer_cross_k,\n        n_layer_cross_v=n_layer_cross_v,\n        offset=offset,\n    )\n    offset += len(model.sot_sequence)\n    # logits.shape (batch_size, tokens.shape[1], vocab_size)\n    logits = logits[0, -1]\n    model.suppress_tokens(logits, is_initial=True)\n    #  logits = logits.softmax(dim=-1)\n    # for greedy search, we don't need to compute softmax or log_softmax\n    max_token_id = logits.argmax(dim=-1)\n    results = []\n    all_attention_weights = []\n    for i in range(model.n_text_ctx):\n        if max_token_id == model.eot:\n            break\n        
results.append(max_token_id.item())\n        tokens = torch.tensor([[results[-1]]])\n\n        logits, n_layer_self_k_cache, n_layer_self_v_cache, attn = model.run_decoder(\n            tokens=tokens,\n            n_layer_self_k_cache=n_layer_self_k_cache,\n            n_layer_self_v_cache=n_layer_self_v_cache,\n            n_layer_cross_k=n_layer_cross_k,\n            n_layer_cross_v=n_layer_cross_v,\n            offset=offset,\n            return_attention=args.test_attention,\n        )\n        if attn is not None:\n            all_attention_weights.append(attn)\n        offset += 1\n        logits = logits[0, -1]\n        model.suppress_tokens(logits, is_initial=False)\n        max_token_id = logits.argmax(dim=-1)\n    token_table = load_tokens(args.tokens)\n    s = b\"\"\n    for i in results:\n        if i in token_table:\n            s += base64.b64decode(token_table[i])\n\n    print(s.decode().strip())\n\n    if args.test_attention:\n        n_audio_ctx = n_layer_cross_k.shape[2]\n        verify_attention(all_attention_weights, n_audio_ctx, results, token_table)\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/whisper/test_torch.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nimport torch\n\n\nfrom export_onnx import AudioEncoderTensorCache, TextDecoderTensorCache\nfrom test import load_audio\n\nimport whisper\n\n\n@torch.no_grad()\ndef main():\n    wave, sample_rate = load_audio(\"en.wav\")\n    if sample_rate != 16000:\n        import librosa\n\n        wave = librosa.resample(wave, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    audio = whisper.pad_or_trim(wave)\n    assert audio.shape == (16000 * 30,), audio.shape\n\n    mel = whisper.log_mel_spectrogram(audio, n_mels=80).unsqueeze(0)\n    assert mel.shape == (1, 80, 3000), mel.shape\n\n    model = whisper.load_model(\"tiny.en\")\n    tokenizer = whisper.tokenizer.get_tokenizer(\n        model.is_multilingual, num_languages=model.num_languages\n    )\n\n    model.eval()\n\n    encoder = AudioEncoderTensorCache(model.encoder, model.decoder)\n\n    n_layer_cross_k, n_layer_cross_v = encoder(mel)\n    print(\"n_layer_cross_k\", n_layer_cross_k.shape)  # (4, 1, 1500, 384)\n    print(\"n_layer_cross_v\", n_layer_cross_v.shape)  # (4, 1, 1500, 384)\n\n    n_audio = mel.shape[0]\n    tokens = torch.tensor([[tokenizer.sot, tokenizer.sot, tokenizer.sot]] * n_audio).to(\n        mel.device\n    )  # [n_audio, 3]\n\n    decoder = TextDecoderTensorCache(model.decoder, model.dims.n_text_ctx)\n\n    n_layer_self_k_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    n_layer_self_v_cache = torch.zeros(\n        (\n            len(model.decoder.blocks),\n            n_audio,\n            model.dims.n_text_ctx,\n            model.dims.n_text_state,\n        ),\n        device=mel.device,\n    )\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n    logits, n_layer_self_k_cache, 
n_layer_self_v_cache = decoder(\n        tokens,\n        n_layer_self_k_cache,\n        n_layer_self_v_cache,\n        n_layer_cross_k,\n        n_layer_cross_v,\n        offset,\n    )\n    assert logits.shape == (n_audio, tokens.shape[1], model.dims.n_vocab)\n    assert n_layer_self_k_cache.shape == (\n        model.dims.n_text_layer,\n        n_audio,\n        model.dims.n_text_ctx,\n        model.dims.n_text_state,\n    )\n    assert n_layer_self_v_cache.shape == (\n        model.dims.n_text_layer,\n        n_audio,\n        model.dims.n_text_ctx,\n        model.dims.n_text_state,\n    )\n\n    offset = torch.zeros(1, dtype=torch.int64).to(mel.device)\n\n    offset += len(tokenizer.sot_sequence)\n    print(\"logits.shape\", logits.shape)  # (1, 3, 51864)\n    idx = logits[0, -1].argmax().item()\n\n    steps = 0\n    results = []\n    while idx != tokenizer.eot and steps < 50:\n        results.append(idx)\n        tokens = torch.tensor([[results[-1]]])\n        offset += 1\n\n        logits, n_layer_self_k_cache, n_layer_self_v_cache = decoder(\n            tokens,\n            n_layer_self_k_cache,\n            n_layer_self_v_cache,\n            n_layer_cross_k,\n            n_layer_cross_v,\n            offset,\n        )\n        idx = logits[0, -1].argmax().item()\n        print(\"idx\", idx, \"step\", steps)\n        steps += 1\n\n    print(results)\n    print(tokenizer.decode(results))\n\n\nif __name__ == \"__main__\":\n    torch.set_num_threads(1)\n    torch.set_num_interop_threads(1)\n    # To fix\n    # TypeError: scaled_dot_product_attention(): argument 'is_causal' must be bool, not Tensor\n    # See also https://github.com/k2-fsa/sherpa-onnx/issues/1764\n    from whisper.model import disable_sdpa\n\n    with disable_sdpa():\n        main()\n"
  },
  {
    "path": "scripts/whisper/tools/timestamp_viewer.html",
    "content": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n    <meta charset=\"UTF-8\">\n    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n    <title>Whisper Timestamp Viewer</title>\n    <style>\n        * {\n            box-sizing: border-box;\n        }\n        body {\n            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;\n            max-width: 1400px;\n            margin: 0 auto;\n            padding: 20px;\n            background: #f5f5f5;\n        }\n        h1 {\n            text-align: center;\n            color: #333;\n        }\n        .controls {\n            background: white;\n            padding: 20px;\n            border-radius: 8px;\n            margin-bottom: 20px;\n            box-shadow: 0 2px 4px rgba(0,0,0,0.1);\n        }\n        .controls label {\n            display: block;\n            margin-bottom: 10px;\n            font-weight: 600;\n        }\n        .controls input[type=\"file\"] {\n            margin-bottom: 15px;\n        }\n        .player-container {\n            background: white;\n            padding: 20px;\n            border-radius: 8px;\n            margin-bottom: 20px;\n            box-shadow: 0 2px 4px rgba(0,0,0,0.1);\n            position: sticky;\n            top: 0;\n            z-index: 100;\n        }\n        audio {\n            width: 100%;\n        }\n        .current-time {\n            text-align: center;\n            font-size: 24px;\n            font-weight: bold;\n            color: #333;\n            margin-top: 10px;\n            font-family: monospace;\n        }\n        .columns-container {\n            display: flex;\n            gap: 20px;\n            overflow-x: auto;\n        }\n        .column {\n            flex: 1;\n            min-width: 250px;\n            background: white;\n            border-radius: 8px;\n            box-shadow: 0 2px 4px rgba(0,0,0,0.1);\n            overflow: hidden;\n        }\n        
.column-header {\n            background: #4a90d9;\n            color: white;\n            padding: 10px 15px;\n            font-weight: 600;\n            position: sticky;\n            top: 0;\n            z-index: 10;\n        }\n        .column-content {\n        }\n        .token-row {\n            display: flex;\n            padding: 6px 12px;\n            border-bottom: 1px solid #eee;\n            font-size: 14px;\n            transition: background-color 0.1s;\n            cursor: pointer;\n        }\n        .token-row:hover {\n            background: #f0f0f0;\n        }\n        .token-row:active {\n            background: #e0e0e0;\n        }\n        .token-row:focus {\n            outline: 2px solid #4a90d9;\n            outline-offset: -2px;\n            background: #e8f0fa;\n        }\n        .token-row.active {\n            background: #fff3cd;\n        }\n        .token-row.past {\n            background: #e8f5e9;\n        }\n        .token-text {\n            flex: 1;\n            font-family: monospace;\n            white-space: pre;\n        }\n        .token-time {\n            color: #666;\n            font-family: monospace;\n            font-size: 12px;\n            margin-left: 10px;\n        }\n        .instructions {\n            background: #e3f2fd;\n            padding: 15px;\n            border-radius: 8px;\n            margin-bottom: 20px;\n            color: #1565c0;\n        }\n        .csv-list {\n            margin-top: 10px;\n        }\n        .csv-item {\n            display: inline-flex;\n            align-items: center;\n            background: #e0e0e0;\n            padding: 5px 10px;\n            border-radius: 4px;\n            margin: 5px 5px 5px 0;\n        }\n        .csv-item button {\n            background: none;\n            border: none;\n            color: #666;\n            cursor: pointer;\n            margin-left: 8px;\n            font-size: 16px;\n        }\n        .csv-item button:hover {\n            color: 
#c00;\n        }\n    </style>\n</head>\n<body>\n    <h1>Whisper Timestamp Viewer</h1>\n\n    <div class=\"instructions\">\n        <strong>Instructions:</strong> Upload a WAV file and one or more CSV files containing token timestamps.\n        CSV files should have columns: token, timestamp, duration.\n        <br><br>\n        <strong>Click</strong> a token to play just that token's audio.\n        <strong>Shift+click</strong> to play continuously from that point.\n    </div>\n\n    <div class=\"controls\">\n        <label>\n            Audio File (WAV):\n            <input type=\"file\" id=\"audioInput\" accept=\".wav,.mp3,.ogg,.m4a\">\n        </label>\n\n        <label>\n            CSV Files (can select multiple):\n            <input type=\"file\" id=\"csvInput\" accept=\".csv\" multiple>\n        </label>\n\n        <div class=\"csv-list\" id=\"csvList\"></div>\n    </div>\n\n    <div class=\"player-container\" id=\"playerContainer\" style=\"display: none;\">\n        <audio id=\"audioPlayer\" controls></audio>\n        <div class=\"current-time\" id=\"currentTime\">0.000s</div>\n    </div>\n\n    <div class=\"columns-container\" id=\"columnsContainer\"></div>\n\n    <script>\n        const audioInput = document.getElementById('audioInput');\n        const csvInput = document.getElementById('csvInput');\n        const csvList = document.getElementById('csvList');\n        const playerContainer = document.getElementById('playerContainer');\n        const audioPlayer = document.getElementById('audioPlayer');\n        const currentTimeDisplay = document.getElementById('currentTime');\n        const columnsContainer = document.getElementById('columnsContainer');\n\n        let csvData = [];  // Array of {name, tokens: [{token, timestamp, duration}]}\n\n        // Handle audio file\n        audioInput.addEventListener('change', (e) => {\n            const file = e.target.files[0];\n            if (file) {\n                // Revoke previous URL if exists\n        
        if (audioPlayer.src.startsWith('blob:')) {\n                    URL.revokeObjectURL(audioPlayer.src);\n                }\n                const url = URL.createObjectURL(file);\n                audioPlayer.src = url;\n                playerContainer.style.display = 'block';\n            }\n        });\n\n        // Handle CSV files\n        csvInput.addEventListener('change', async (e) => {\n            const files = Array.from(e.target.files);\n            for (const file of files) {\n                const text = await file.text();\n                const tokens = parseCSV(text);\n                csvData.push({name: file.name, tokens});\n            }\n            updateCsvList();\n            renderColumns();\n        });\n\n        function parseCSV(text) {\n            const lines = text.trim().split('\\n');\n            const tokens = [];\n\n            // Skip header\n            for (let i = 1; i < lines.length; i++) {\n                const line = lines[i];\n                // Handle CSV properly (quoted strings may contain commas)\n                let token, rest;\n\n                if (line.startsWith('\"')) {\n                    // Token is quoted\n                    const endQuote = line.indexOf('\",', 1);\n                    if (endQuote !== -1) {\n                        token = line.substring(1, endQuote);\n                        rest = line.substring(endQuote + 2);\n                    } else {\n                        continue;\n                    }\n                } else {\n                    // Token is not quoted\n                    const firstComma = line.indexOf(',');\n                    if (firstComma !== -1) {\n                        token = line.substring(0, firstComma);\n                        rest = line.substring(firstComma + 1);\n                    } else {\n                        continue;\n                    }\n                }\n\n                const [timestamp, duration] = rest.split(',').map(s => 
parseFloat(s.trim()));\n                if (!isNaN(timestamp) && !isNaN(duration)) {\n                    tokens.push({token, timestamp, duration});\n                }\n            }\n\n            return tokens;\n        }\n\n        function updateCsvList() {\n            csvList.innerHTML = csvData.map((csv, i) => `\n                <span class=\"csv-item\">\n                    ${escapeHtml(csv.name)}\n                    <button onclick=\"removeCSV(${i})\">&times;</button>\n                </span>\n            `).join('');\n        }\n\n        function removeCSV(index) {\n            csvData.splice(index, 1);\n            updateCsvList();\n            renderColumns();\n        }\n\n        function escapeHtml(text) {\n            const div = document.createElement('div');\n            div.textContent = text;\n            return div.innerHTML;\n        }\n\n        // Play a segment from start to end time, then pause and seek back to start\n        let segmentStartTime = null;\n        let segmentEndTime = null;\n        let isPlayingSegment = false;\n\n        function playSegment(start, end) {\n            isPlayingSegment = true;\n            segmentStartTime = start;\n            segmentEndTime = end;\n            audioPlayer.currentTime = start;\n            audioPlayer.play();\n        }\n\n        function checkSegmentEnd() {\n            if (segmentEndTime !== null && audioPlayer.currentTime >= segmentEndTime) {\n                audioPlayer.pause();\n                const seekBackTo = segmentStartTime;\n                segmentStartTime = null;\n                segmentEndTime = null;\n                if (seekBackTo !== null) {\n                    audioPlayer.currentTime = seekBackTo;\n                }\n            }\n        }\n\n        function renderColumns() {\n            columnsContainer.innerHTML = csvData.map((csv, colIndex) => `\n                <div class=\"column\">\n                    <div 
class=\"column-header\">${escapeHtml(csv.name)}</div>\n                    <div class=\"column-content\" id=\"column-${colIndex}\">\n                        ${csv.tokens.map((t, i) => `\n                            <div class=\"token-row\" data-col=\"${colIndex}\" data-idx=\"${i}\"\n                                 data-start=\"${t.timestamp}\" data-end=\"${t.timestamp + t.duration}\"\n                                 tabindex=\"0\" role=\"button\">\n                                <span class=\"token-text\">${escapeHtml(t.token)}</span>\n                                <span class=\"token-time\">${t.timestamp.toFixed(2)}s</span>\n                            </div>\n                        `).join('')}\n                    </div>\n                </div>\n            `).join('');\n\n            // Add click and keyboard handlers to play just that token's timespan\n            // Shift+click plays continuously from that point\n            document.querySelectorAll('.token-row').forEach(row => {\n                function activateRow(e) {\n                    const start = parseFloat(row.dataset.start);\n                    const end = parseFloat(row.dataset.end);\n                    if (e.shiftKey) {\n                        // Shift+click: play continuously from this point\n                        segmentEndTime = null;\n                        audioPlayer.currentTime = start;\n                        audioPlayer.play();\n                    } else {\n                        // Regular click: play just this token\n                        playSegment(start, end);\n                    }\n                }\n\n                row.addEventListener('click', activateRow);\n                row.addEventListener('keydown', (e) => {\n                    if (e.key === 'Enter' || e.key === ' ') {\n                        e.preventDefault();\n                        activateRow(e);\n                    }\n                });\n            });\n        }\n\n        // Update 
highlighting during playback\n        audioPlayer.addEventListener('timeupdate', () => {\n            const currentTime = audioPlayer.currentTime;\n            currentTimeDisplay.textContent = currentTime.toFixed(3) + 's';\n\n            document.querySelectorAll('.token-row').forEach(row => {\n                const start = parseFloat(row.dataset.start);\n                const end = parseFloat(row.dataset.end);\n\n                row.classList.remove('active', 'past');\n\n                if (currentTime >= start && currentTime < end) {\n                    row.classList.add('active');\n                } else if (currentTime >= end) {\n                    row.classList.add('past');\n                }\n            });\n        });\n\n        // More frequent updates for smoother highlighting\n        let animationFrame;\n        let lastScrolledToken = null;\n\n        audioPlayer.addEventListener('play', () => {\n            function update() {\n                const currentTime = audioPlayer.currentTime;\n                currentTimeDisplay.textContent = currentTime.toFixed(3) + 's';\n\n                let firstActiveRow = null;\n                document.querySelectorAll('.token-row').forEach(row => {\n                    const start = parseFloat(row.dataset.start);\n                    const end = parseFloat(row.dataset.end);\n\n                    row.classList.remove('active', 'past');\n\n                    if (currentTime >= start && currentTime < end) {\n                        row.classList.add('active');\n                        if (!firstActiveRow) {\n                            firstActiveRow = row;\n                        }\n                    } else if (currentTime >= end) {\n                        row.classList.add('past');\n                    }\n                });\n\n                // Auto-scroll to keep active token visible\n                if (firstActiveRow && firstActiveRow !== lastScrolledToken) {\n                    const rect = 
firstActiveRow.getBoundingClientRect();\n                    const playerHeight = playerContainer.offsetHeight;\n                    // If active token is below the visible area or too close to player\n                    if (rect.top < playerHeight + 20 || rect.bottom > window.innerHeight - 50) {\n                        firstActiveRow.scrollIntoView({behavior: 'smooth', block: 'center'});\n                        lastScrolledToken = firstActiveRow;\n                    }\n                }\n\n                // Check if we should stop at segment end\n                checkSegmentEnd();\n\n                if (!audioPlayer.paused) {\n                    animationFrame = requestAnimationFrame(update);\n                }\n            }\n            update();\n        });\n\n        audioPlayer.addEventListener('pause', () => {\n            cancelAnimationFrame(animationFrame);\n        });\n\n        audioPlayer.addEventListener('seeked', () => {\n            // Clear segment on manual seek (but not if triggered by playSegment)\n            if (!isPlayingSegment && segmentEndTime !== null) {\n                segmentStartTime = null;\n                segmentEndTime = null;\n            }\n            // Clear the flag now that seek has completed\n            isPlayingSegment = false;\n            // Update highlighting immediately after seek\n            const currentTime = audioPlayer.currentTime;\n            lastScrolledToken = null;  // Reset so we scroll to new position\n            let firstActiveRow = null;\n\n            document.querySelectorAll('.token-row').forEach(row => {\n                const start = parseFloat(row.dataset.start);\n                const end = parseFloat(row.dataset.end);\n\n                row.classList.remove('active', 'past');\n\n                if (currentTime >= start && currentTime < end) {\n                    row.classList.add('active');\n                    if (!firstActiveRow) {\n                        firstActiveRow = row;\n   
                 }\n                } else if (currentTime >= end) {\n                    row.classList.add('past');\n                }\n            });\n\n            // Scroll to active token on seek\n            if (firstActiveRow) {\n                firstActiveRow.scrollIntoView({behavior: 'smooth', block: 'center'});\n            }\n        });\n    </script>\n</body>\n</html>\n"
  },
  {
    "path": "scripts/whisper/tools/whisper_timestamps_csv.py",
    "content": "#!/usr/bin/env python3\n\"\"\"\nGenerate CSV file with token timestamps from a Whisper model.\n\nUsage:\n    python whisper_timestamps_csv.py \\\n        --encoder path/to/encoder.onnx \\\n        --decoder path/to/decoder.onnx \\\n        --tokens path/to/tokens.txt \\\n        --audio path/to/audio.wav \\\n        --output timestamps.csv \\\n        [--enable-segment-timestamps]\n\"\"\"\n\nimport argparse\nimport csv\nimport wave\nimport numpy as np\nimport sherpa_onnx\n\n\ndef main():\n    parser = argparse.ArgumentParser(\n        description=\"Generate CSV with token timestamps from Whisper model\"\n    )\n    parser.add_argument(\"--encoder\", required=True, help=\"Path to encoder ONNX model\")\n    parser.add_argument(\"--decoder\", required=True, help=\"Path to decoder ONNX model\")\n    parser.add_argument(\"--tokens\", required=True, help=\"Path to tokens.txt file\")\n    parser.add_argument(\"--audio\", required=True, help=\"Path to input WAV file\")\n    parser.add_argument(\"--output\", required=True, help=\"Path to output CSV file\")\n    parser.add_argument(\n        \"--enable-segment-timestamps\",\n        action=\"store_true\",\n        help=\"Enable segment-level timestamps\",\n    )\n    parser.add_argument(\n        \"--language\", default=\"en\", help=\"Language code (default: en)\"\n    )\n    parser.add_argument(\n        \"--num-threads\", type=int, default=4, help=\"Number of threads (default: 4)\"\n    )\n    args = parser.parse_args()\n\n    # Create recognizer\n    recognizer = sherpa_onnx.OfflineRecognizer.from_whisper(\n        encoder=args.encoder,\n        decoder=args.decoder,\n        tokens=args.tokens,\n        language=args.language,\n        task=\"transcribe\",\n        enable_token_timestamps=True,\n        enable_segment_timestamps=args.enable_segment_timestamps,\n        num_threads=args.num_threads,\n    )\n\n    # Load audio\n    with wave.open(args.audio, \"rb\") as f:\n        assert f.getnchannels() == 
1, \"Audio must be mono\"\n        assert f.getsampwidth() == 2, \"Audio must be 16-bit\"\n        sample_rate = f.getframerate()\n        samples = f.readframes(f.getnframes())\n\n    samples = np.frombuffer(samples, dtype=np.int16).astype(np.float32) / 32768.0\n\n    # Run recognition\n    stream = recognizer.create_stream()\n    stream.accept_waveform(sample_rate, samples)\n    recognizer.decode_stream(stream)\n    result = stream.result\n\n    # Write CSV\n    with open(args.output, \"w\", newline=\"\", encoding=\"utf-8\") as f:\n        writer = csv.writer(f)\n        writer.writerow([\"token\", \"timestamp\", \"duration\"])\n        for token, ts, dur in zip(result.tokens, result.timestamps, result.durations):\n            writer.writerow([token, f\"{ts:.3f}\", f\"{dur:.3f}\"])\n\n    print(f\"Wrote {len(result.tokens)} tokens to {args.output}\")\n    print(f\"Full text: {result.text}\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/zipformer-ctc/ascend/2025-07-03/onnx_test.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom typing import Tuple\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\nBPE_UNK = chr(8263)\nPRINTABLE_BASE_CHARS = (\n    list(range(256, 287 + 1))\n    + list(range(32, 126 + 1))\n    + list(range(288, 305 + 1))\n    + list(range(308, 318 + 1))\n    + list(range(321, 328 + 1))\n    + list(range(330, 382 + 1))\n    + list(range(384, 422 + 1))\n)\n\n\nBYTE_TO_BCHAR = {b: chr(PRINTABLE_BASE_CHARS[b]) for b in range(256)}\nBCHAR_TO_BYTE = {bc: b for b, bc in BYTE_TO_BCHAR.items()}\nBCHAR_TO_BYTE[BPE_UNK] = 32  # map unk to space\n\n\ndef load_tokens(filename):\n    ans = dict()\n    i = 0\n    with open(filename, encoding=\"utf-8\") as f:\n        for line in f:\n            ans[i] = line.strip().split()[0]\n            i += 1\n    return ans\n\n\ndef load_audio(filename: str) -> Tuple[np.ndarray, int]:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n\n    if sample_rate != 16000:\n        import librosa\n\n        data = librosa.resample(data, orig_sr=sample_rate, target_sr=16000)\n        sample_rate = 16000\n\n    samples = np.ascontiguousarray(data)\n    return samples, sample_rate\n\n\ndef compute_feat(\n    samples: np.ndarray,\n    sample_rate: int,\n    max_len: int,\n):\n    opts = knf.FbankOptions()\n    opts.frame_opts.dither = 0\n    opts.frame_opts.snip_edges = False\n    opts.frame_opts.window_type = \"povey\"\n    opts.frame_opts.samp_freq = sample_rate\n    opts.mel_opts.num_bins = 80\n\n    online_fbank = knf.OnlineFbank(opts)\n    online_fbank.accept_waveform(sample_rate, samples.tolist())\n    online_fbank.input_finished()\n\n    features = np.stack(\n        [online_fbank.get_frame(i) for i in range(online_fbank.num_frames_ready)]\n    )\n\n    if 
features.shape[0] > max_len:\n        features = features[:max_len]\n    elif features.shape[0] < max_len:\n        features = np.pad(\n            features,\n            ((0, max_len - features.shape[0]), (0, 0)),\n            mode=\"constant\",\n            constant_values=0,\n        )\n\n    features = np.ascontiguousarray(features)\n\n    assert features.data.contiguous is True\n    assert features.dtype == np.float32, features.dtype\n\n    return features\n\n\nclass OnnxModel:\n    def __init__(self, filename):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        shape = self.model.get_inputs()[0].shape\n        self.max_len = shape[1]\n\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n\n    def __call__(self, x):\n        log_probs = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n            ],\n            {self.model.get_inputs()[0].name: x[None]},\n        )[0]\n\n        return log_probs\n\n\ndef main():\n    wave = \"./0.wav\"\n    wave = \"./1.wav\"\n    samples, sample_rate = load_audio(wave)\n\n    model = OnnxModel(\"./model.onnx\")\n\n    features = compute_feat(\n        samples=samples,\n        sample_rate=sample_rate,\n        max_len=model.max_len,\n    )\n    print(\"features\", features.shape)\n\n    log_probs = model(features)\n\n    idx = log_probs[0].argmax(axis=-1)\n    print(\"idx\", idx)\n    print(len(idx))\n    prev = -1\n    ids = []\n    for i in idx:\n        if i != prev:\n            ids.append(i)\n        prev = i\n    ids = [i for i in ids if i != 0]\n    print(ids)\n\n   
 tokens = load_tokens(\"./tokens.txt\")\n    text = \"\".join([tokens[i] for i in ids])\n\n    s = b\"\"\n    for t in text:\n        if t == \"▁\":\n            continue\n        elif t in BCHAR_TO_BYTE:\n            s += bytes([BCHAR_TO_BYTE[t]])\n        else:\n            print(\"skip OOV\", t)\n\n    print(s.decode())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/zipformer-ctc/ascend/2025-07-03/test_om.py",
    "content": "#!/usr/bin/env python3\n# Copyright      2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\nfrom ais_bench.infer.interface import InferSession\n\nfrom onnx_test import BCHAR_TO_BYTE, compute_feat, load_audio, load_tokens\n\n\nclass OmModel:\n    def __init__(self):\n        self.model = InferSession(device_id=0, model_path=\"./model.om\", debug=False)\n\n        self.max_len = self.model.get_inputs()[0].shape[1]\n        print(\"---model---\")\n        for i in self.model.get_inputs():\n            print(i.name, i.datatype, i.shape)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i.name, i.datatype, i.shape)\n\n    def __call__(self, x):\n        \"\"\"\n        Args:\n          x: (N, T, C)\n        Returns:\n          log_probs: (N, T, vocab_size)\n        \"\"\"\n        return self.model.infer([x], mode=\"static\", custom_sizes=10000000)[0]\n\n\ndef main():\n    samples, sample_rate = load_audio(\"./test_wavs/0.wav\")\n    model = OmModel()\n\n    features = compute_feat(\n        samples=samples, sample_rate=sample_rate, max_len=model.max_len\n    )\n    print(\"features.shape\", features.shape)\n\n    log_probs = model(x=features[None])\n    print(\"log_probs.shape\", log_probs.shape, type(log_probs))\n\n    idx = log_probs[0].argmax(axis=-1)\n    print(\"idx\", idx)\n    print(len(idx))\n    prev = -1\n    ids = []\n    for i in idx:\n        if i != prev:\n            ids.append(i)\n        prev = i\n    ids = [i for i in ids if i != 0]\n    print(ids)\n\n    tokens = load_tokens(\"./tokens.txt\")\n    text = \"\".join([tokens[i] for i in ids])\n\n    s = b\"\"\n    for t in text:\n        if t == \"▁\":\n            continue\n        elif t in BCHAR_TO_BYTE:\n            s += bytes([BCHAR_TO_BYTE[t]])\n        else:\n            print(\"skip OOV\", t)\n\n    print(s.decode())\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/zipvoice/zh-en/generate_lexicon.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nfrom pypinyin import Style, lazy_pinyin, load_phrases_dict, phrases_dict, pinyin_dict\nfrom pypinyin.contrib.tone_convert import to_finals_tone3, to_initials\n\nload_phrases_dict(\n    {\n        \"行长\": [[\"hang2\"], [\"zhang3\"]],\n        \"银行行长\": [[\"yin2\"], [\"hang2\"], [\"hang2\"], [\"zhang3\"]],\n    }\n)\nuser_defined = {\n    \"微调\": [\"wei1\", \"tiao2\"],\n    \"这个\": [\"zhe4\", \"ge4\"],\n    \"方便地\": [\"fang1\", \"bian2\", \"de1\"],\n}\n\n\ndef get_initial_final(token):\n    if isinstance(token, list):\n        ans = \"\"\n        sep = \"\"\n        for t in token:\n            ans += sep + get_initial_final(t)\n            sep = \" \"\n        return ans\n\n    initial = to_initials(token, strict=False)\n\n    final = to_finals_tone3(\n        token,\n        strict=False,\n        neutral_tone_with_five=True,\n    )\n\n    ans = \"\"\n    if initial:\n        ans = initial + \"0\"\n\n    if final:\n        ans += f\" {final}\"\n\n    return ans\n\n\ndef main():\n    filename = \"lexicon.txt\"\n\n    word_dict = pinyin_dict.pinyin_dict\n    phrases = phrases_dict.phrases_dict\n\n    with open(filename, \"w\", encoding=\"utf-8\") as f:\n        for key in word_dict:\n            if not (0x4E00 <= key <= 0x9FFF):\n                continue\n\n            w = chr(key)\n            token = lazy_pinyin(\n                w,\n                style=Style.TONE3,\n                tone_sandhi=True,\n                neutral_tone_with_five=True,\n            )[0]\n\n            initial_final = get_initial_final(token)\n\n            f.write(f\"{w} {initial_final}\\n\")\n\n        for key, value in user_defined.items():\n            initial_final = get_initial_final(value)\n            f.write(f\"{key} {initial_final}\\n\")\n\n        for key in phrases:\n            if key in user_defined:\n                continue\n            token = lazy_pinyin(\n     
           key,\n                style=Style.TONE3,\n                tone_sandhi=True,\n                neutral_tone_with_five=True,\n            )\n            initial_final = get_initial_final(token)\n\n            f.write(f\"{key} {initial_final}\\n\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "scripts/zipvoice/zh-en/test_onnx.py",
    "content": "#!/usr/bin/env python3\n# Copyright    2025  Xiaomi Corp.        (authors: Fangjun Kuang)\n\n\nimport kaldi_native_fbank as knf\nimport numpy as np\nimport onnxruntime as ort\nimport soundfile as sf\n\n\ndef compute_features(samples):\n    stft_config = knf.StftConfig(\n        n_fft=1024,\n        hop_length=256,\n        win_length=1024,\n        center=True,\n        window_type=\"hann\",\n    )\n    knf_stft = knf.Stft(stft_config)\n    stft_result = knf_stft(samples.tolist())\n    real = np.array(stft_result.real, dtype=np.float32).reshape(\n        stft_result.num_frames, -1\n    )\n    imag = np.array(stft_result.imag, dtype=np.float32).reshape(\n        stft_result.num_frames, -1\n    )\n\n    mag = np.sqrt(real * real + imag * imag).astype(np.float32)\n\n    mel_opts = knf.MelBanksOptions()\n    mel_opts.num_bins = 100\n    mel_opts.low_freq = 0\n    mel_opts.high_freq = 24000 // 2\n    mel_opts.is_librosa = True\n    mel_opts.norm = \"\"\n    mel_opts.use_slaney_mel_scale = False\n\n    frame_opts = knf.FrameExtractionOptions()\n    frame_opts.samp_freq = 24000\n    #  frame_opts.frame_length_ms = 1024 * 1000 / 24000\n    #  frame_opts.frame_shift_ms = 256 * 1000 / 24000\n\n    mel_filters = knf.MelBanks(mel_opts, frame_opts)\n    mel_features = np.zeros((mag.shape[0], 100))\n    for i in range(mag.shape[0]):\n        mel_features[i] = mel_filters.compute(mag[i])\n    print(\"sum\", np.sum(mel_features), np.mean(mel_features))\n\n    mel_features = np.log(mel_features + 1e-10)\n    return mel_features\n\n\nclass OnnxModel:\n    def __init__(\n        self,\n        text_encoder_path: str,\n        fm_decoder_path: str,\n        num_thread: int = 1,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = num_thread\n        session_opts.intra_op_num_threads = num_thread\n\n        self.session_opts = session_opts\n\n        self.init_text_encoder(text_encoder_path)\n        
self.init_fm_decoder(fm_decoder_path)\n\n    def init_text_encoder(self, model_path: str):\n        self.text_encoder = ort.InferenceSession(\n            model_path,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n\n    def init_fm_decoder(self, model_path: str):\n        self.fm_decoder = ort.InferenceSession(\n            model_path,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        meta = self.fm_decoder.get_modelmeta().custom_metadata_map\n        self.feat_dim = int(meta[\"feat_dim\"])\n\n    def run_text_encoder(\n        self,\n        tokens: np.ndarray,\n        prompt_tokens: np.ndarray,\n        prompt_features_len: np.ndarray,\n        speed: np.ndarray,\n    ) -> np.ndarray:\n        out = self.text_encoder.run(\n            [\n                self.text_encoder.get_outputs()[0].name,\n            ],\n            {\n                self.text_encoder.get_inputs()[0].name: tokens,\n                self.text_encoder.get_inputs()[1].name: prompt_tokens,\n                self.text_encoder.get_inputs()[2].name: prompt_features_len,\n                self.text_encoder.get_inputs()[3].name: speed,\n            },\n        )\n        return out[0]\n\n    def run_fm_decoder(\n        self,\n        t: np.ndarray,\n        x: np.ndarray,\n        text_condition: np.ndarray,\n        speech_condition: np.ndarray,\n        guidance_scale: np.ndarray,\n    ) -> np.ndarray:\n        out = self.fm_decoder.run(\n            [\n                self.fm_decoder.get_outputs()[0].name,\n            ],\n            {\n                self.fm_decoder.get_inputs()[0].name: t,\n                self.fm_decoder.get_inputs()[1].name: x,\n                self.fm_decoder.get_inputs()[2].name: text_condition,\n                self.fm_decoder.get_inputs()[3].name: speech_condition,\n                self.fm_decoder.get_inputs()[4].name: guidance_scale,\n          
  },\n        )\n        return out[0]\n\n\nclass OnnxVocosModel:\n    def __init__(\n        self,\n        filename: str,\n    ):\n        session_opts = ort.SessionOptions()\n        session_opts.inter_op_num_threads = 1\n        session_opts.intra_op_num_threads = 1\n\n        self.session_opts = session_opts\n        self.model = ort.InferenceSession(\n            filename,\n            sess_options=self.session_opts,\n            providers=[\"CPUExecutionProvider\"],\n        )\n        print(f\"vocos {self.model.get_modelmeta().custom_metadata_map}\")\n\n        print(\"----------vocos----------\")\n        for i in self.model.get_inputs():\n            print(i)\n\n        print(\"-----\")\n\n        for i in self.model.get_outputs():\n            print(i)\n        print()\n\n    def __call__(self, x: np.ndarray):\n        \"\"\"\n        Args:\n          x: (N, feat_dim, num_frames)\n        Returns:\n          mag: (N, n_fft/2+1, num_frames)\n          x: (N, n_fft/2+1, num_frames)\n          y: (N, n_fft/2+1, num_frames)\n\n        The complex spectrum is mag * (x + j*y)\n        \"\"\"\n        assert x.ndim == 3, x.shape\n        assert x.shape[0] == 1, x.shape\n\n        mag, x, y = self.model.run(\n            [\n                self.model.get_outputs()[0].name,\n                self.model.get_outputs()[1].name,\n                self.model.get_outputs()[2].name,\n            ],\n            {\n                self.model.get_inputs()[0].name: x,\n            },\n        )\n\n        return mag, x, y\n\n\ndef get_phones(text):\n    if text[-1] != \".\":\n        text += \".\"\n\n    word2tokens = dict()\n    with open(\"./lexicon.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            fields = line.split()\n            word = fields[0]\n            tokens = fields[1:]\n            word2tokens[word] = tokens\n\n    token2id = dict()\n    with open(\"./tokens.txt\", encoding=\"utf-8\") as f:\n        for line in f:\n            fields = 
line.strip().split()\n            if len(fields) == 1:\n                token2id[\" \"] = int(fields[0])\n            else:\n                token2id[fields[0]] = int(fields[1])\n\n    tokens = []\n    for w in text:\n        if w in word2tokens:\n            tokens += word2tokens[w]\n        else:\n            tokens.append(w)\n    ids = []\n    for t in tokens:\n        if t in token2id:\n            ids.append(token2id[t])\n        else:\n            print(f\"skip {t}\")\n\n    return ids\n\n\ndef compute_rms(features):\n    return np.sqrt(np.mean(np.square(features)))\n\n\ndef get_timestamps(num_steps, t_shift=1):\n    steps = np.linspace(0, 1, num_steps + 1)\n    if t_shift != 1:\n        steps = t_shift * steps / (1 + (t_shift - 1) * steps)\n\n    return steps.tolist()\n\n\ndef trim_leading_silence_energy(samples, frame_size=2048, hop=512, energy_thresh=0.5):\n    energies = [\n        np.sum(np.abs(samples[i : i + frame_size]) ** 2)\n        for i in range(0, len(samples) - frame_size, hop)\n    ]\n    #  print(energies)\n    # First frame whose energy exceeds threshold\n    frame_index = next((i for i, e in enumerate(energies) if e > energy_thresh), 0)\n    frame_index = max(frame_index - 3, 0)\n    start_sample = frame_index * hop\n    return samples[start_sample:]\n\n\ndef main():\n    vocoder = OnnxVocosModel(\"./vocos_24khz.onnx\")\n\n    prompt_text = \"各位村民, 大家新年好! 近期, 湖北省武汉市等多个地区\"\n    prompt_wav_filename = \"news-female.wav\"\n\n    prompt_text = \"本台消息, 中共中央国务院, 近日印发关于构建数据基础制度, 更好发挥数据要素作用的意见.\"\n    prompt_wav_filename = \"news-female-2.wav\"\n\n    prompt_text = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n    prompt_wav_filename = \"leijun-1.wav\"\n\n    prompt_ids = get_phones(prompt_text)\n\n    text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.\"\n\n    ids = get_phones(text)\n\n    data, sample_rate = sf.read(\n        prompt_wav_filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    if sample_rate != 24000:\n        import librosa\n\n        samples = librosa.resample(\n            samples,\n            orig_sr=sample_rate,\n            target_sr=24000,\n        )\n        sample_rate = 24000\n\n    assert len(samples.shape) == 1, samples.shape\n\n    rms = compute_rms(samples)\n    print(\"rms\", rms)\n\n    target_rms = 0.1\n    if rms < target_rms:\n        samples = samples * target_rms / rms\n    new_rms = compute_rms(samples)\n\n    print(\"new_rms\", new_rms)\n\n    prompt_features = compute_features(samples)\n    print(\"features.shape\", prompt_features.shape)\n\n    feat_scale = 0.1\n    prompt_features = prompt_features * feat_scale\n\n    model = OnnxModel(\n        text_encoder_path=\"./text_encoder_int8.onnx\",\n        fm_decoder_path=\"./fm_decoder_int8.onnx\",\n    )\n\n    tokens = np.array([ids], dtype=np.int64)\n    assert len(tokens.shape) == 2, tokens.shape\n\n    prompt_tokens = np.array([prompt_ids], dtype=np.int64)\n    assert len(prompt_tokens.shape) == 2, prompt_tokens.shape\n    prompt_features_len = np.array(prompt_features.shape[0], dtype=np.int64)\n    speed = np.array(1.0, dtype=np.float32)\n\n    print(tokens.shape, prompt_tokens.shape, prompt_features_len)\n\n    text_condition = model.run_text_encoder(\n        tokens=tokens,\n        prompt_tokens=prompt_tokens,\n        prompt_features_len=prompt_features_len,\n        speed=speed,\n    )\n\n    x = np.random.randn(*text_condition.shape).astype(np.float32)\n\n    speech_condition = np.pad(\n        prompt_features,\n        pad_width=((0, x.shape[1] - prompt_features.shape[0]), (0, 0)),\n        mode=\"constant\",\n        constant_values=0,\n    )[None].astype(np.float32)\n\n    
print(speech_condition.shape, prompt_features.shape)\n\n    guidance_scale = np.array(1.0, dtype=np.float32)\n\n    # Integrate the flow-matching ODE from t=0 to t=1 with fixed-step Euler updates\n    num_steps = 8\n    steps = get_timestamps(num_steps=num_steps, t_shift=0.5)\n    for i in range(num_steps):\n        t = np.array(steps[i], dtype=np.float32)\n        v = model.run_fm_decoder(\n            t=t,\n            x=x,\n            text_condition=text_condition,\n            speech_condition=speech_condition,\n            guidance_scale=guidance_scale,\n        )\n        x = x + v * (steps[i + 1] - steps[i])\n    print(\"prompt_features\", prompt_features.shape)\n    # Drop the prompt frames and keep only the newly generated frames\n    x = x[:, prompt_features.shape[0] :]\n    print(\"x\", x.shape)\n\n    # Undo the feature scaling applied to the prompt features\n    x = x / feat_scale\n    # (N, num_frames, feat_dim) -> (N, feat_dim, num_frames), as the vocoder expects\n    mel = x.transpose(0, 2, 1)\n    mag, x, y = vocoder(mel)\n    print(\"mag\", mag.shape, x.shape, y.shape)\n\n    # Rebuild the complex spectrum mag * (x + j*y) and invert it to a waveform\n    stft_result = knf.StftResult(\n        real=(mag * x)[0].transpose().reshape(-1).tolist(),\n        imag=(mag * y)[0].transpose().reshape(-1).tolist(),\n        num_frames=mag.shape[2],\n    )\n    config = knf.StftConfig(\n        n_fft=1024,\n        hop_length=256,\n        win_length=1024,\n        window_type=\"hann\",\n        center=True,\n        pad_mode=\"reflect\",\n        normalized=False,\n    )\n    istft = knf.IStft(config)\n    audio_vocos = istft(stft_result)\n\n    audio_vocos = np.array(audio_vocos)\n    audio_vocos = trim_leading_silence_energy(audio_vocos)\n\n    #  if rms < target_rms:\n    #      audio_vocos = audio_vocos / target_rms * rms\n\n    sf.write(\"generated.wav\", audio_vocos, sample_rate, \"PCM_16\")\n\n\nif __name__ == \"__main__\":\n    main()\n"
  },
  {
    "path": "setup.py",
    "content": "#!/usr/bin/env python3\n\nimport os\nimport re\nfrom pathlib import Path\n\nimport setuptools\n\nfrom cmake.cmake_extension import (\n    BuildExtension,\n    bdist_wheel,\n    cmake_extension,\n    get_binaries,\n    is_windows,\n    need_split_package,\n)\n\n\ndef read_long_description():\n    with open(\"README.md\", encoding=\"utf8\") as f:\n        readme = f.read()\n    return readme\n\n\ndef get_package_version():\n    with open(\"CMakeLists.txt\") as f:\n        content = f.read()\n\n    match = re.search(r\"set\\(SHERPA_ONNX_VERSION (.*)\\)\", content)\n    latest_version = match.group(1).strip('\"')\n\n    cmake_args = os.environ.get(\"SHERPA_ONNX_CMAKE_ARGS\", \"\")\n    extra_version = \"\"\n    if \"-DSHERPA_ONNX_ENABLE_GPU=ON\" in cmake_args:\n        extra_version = \"+cuda\"\n\n    cuda_version = os.environ.get(\"SHERPA_ONNX_CUDA_VERSION\", \"\")\n    if cuda_version:\n        extra_version += cuda_version\n\n    latest_version += extra_version\n\n    return latest_version\n\n\npackage_name = \"sherpa_onnx\"\n\nwith open(\"sherpa-onnx/python/sherpa_onnx/__init__.py\", \"a\") as f:\n    f.write(f\"__version__ = '{get_package_version()}'\\n\")\n\n\ndef get_binaries_to_install():\n    if need_split_package():\n        return None\n\n    cmake_args = os.environ.get(\"SHERPA_ONNX_CMAKE_ARGS\", \"\")\n    if \"-DSHERPA_ONNX_ENABLE_BINARY=OFF\" in cmake_args:\n        return None\n\n    bin_dir = Path(\"build\") / \"sherpa_onnx\" / \"bin\"\n    bin_dir.mkdir(parents=True, exist_ok=True)\n    suffix = \".exe\" if is_windows() else \"\"\n\n    binaries = get_binaries()\n\n    exe = []\n    for f in binaries:\n        suffix = \"\" if (\".dll\" in f or \".lib\" in f) else suffix\n        t = bin_dir / (f + suffix)\n        exe.append(str(t))\n    return exe\n\n\nsetuptools.setup(\n    name=package_name,\n    python_requires=\">=3.7\",\n    version=get_package_version(),\n    author=\"The sherpa-onnx development team\",\n    
author_email=\"dpovey@gmail.com\",\n    package_dir={\n        \"sherpa_onnx\": \"sherpa-onnx/python/sherpa_onnx\",\n    },\n    packages=[\"sherpa_onnx\"],\n    data_files=(\n        [\n            (\n                (\"Scripts\", get_binaries_to_install())\n                if is_windows()\n                else (\"bin\", get_binaries_to_install())\n            )\n        ]\n        if get_binaries_to_install()\n        else None\n    ),\n    url=\"https://github.com/k2-fsa/sherpa-onnx\",\n    long_description=read_long_description(),\n    long_description_content_type=\"text/markdown\",\n    ext_modules=[cmake_extension(\"_sherpa_onnx\")],\n    cmdclass={\"build_ext\": BuildExtension, \"bdist_wheel\": bdist_wheel},\n    zip_safe=False,\n    classifiers=[\n        \"Programming Language :: C++\",\n        \"Programming Language :: Python\",\n        \"Topic :: Scientific/Engineering :: Artificial Intelligence\",\n    ],\n    entry_points={\n        \"console_scripts\": [\n            \"sherpa-onnx-cli=sherpa_onnx.cli:cli\",\n        ],\n    },\n    license=\"Apache licensed, as found in the LICENSE file\",\n    install_requires=[\"sherpa-onnx-core==1.12.31\"] if need_split_package() else None,\n)\n\n# Remove the temporary __version__ line that was appended to __init__.py before\n# setup() ran, so the source tree is left unchanged.\nwith open(\"sherpa-onnx/python/sherpa_onnx/__init__.py\", \"r\") as f:\n    lines = f.readlines()\n\nwith open(\"sherpa-onnx/python/sherpa_onnx/__init__.py\", \"w\") as f:\n    for line in lines:\n        if \"__version__\" in line:\n            # drop the generated line of the form __version__ = 'x.x.x'\n            continue\n        f.write(line)\n"
  },
  {
    "path": "sherpa-onnx/CMakeLists.txt",
    "content": "add_subdirectory(csrc)\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  add_subdirectory(python)\nendif()\n\nif(SHERPA_ONNX_ENABLE_JNI)\n  add_subdirectory(jni)\nendif()\n\nif(SHERPA_ONNX_ENABLE_C_API)\n  add_subdirectory(c-api)\nendif()\n"
  },
  {
    "path": "sherpa-onnx/c-api/CMakeLists.txt",
    "content": "include_directories(${PROJECT_SOURCE_DIR})\nadd_library(sherpa-onnx-c-api c-api.cc)\ntarget_link_libraries(sherpa-onnx-c-api sherpa-onnx-core)\ntarget_include_directories(sherpa-onnx-c-api PUBLIC ${PROJECT_SOURCE_DIR})\n\nif(BUILD_SHARED_LIBS)\n  target_compile_definitions(sherpa-onnx-c-api PUBLIC SHERPA_ONNX_BUILD_SHARED_LIBS=1)\n  target_compile_definitions(sherpa-onnx-c-api PUBLIC SHERPA_ONNX_BUILD_MAIN_LIB=1)\nendif()\n\nadd_library(sherpa-onnx-cxx-api cxx-api.cc)\ntarget_link_libraries(sherpa-onnx-cxx-api sherpa-onnx-c-api)\ntarget_include_directories(sherpa-onnx-cxx-api PUBLIC ${PROJECT_SOURCE_DIR})\n\nif(ANDROID OR (UNIX AND NOT APPLE))\n  set_target_properties(sherpa-onnx-c-api PROPERTIES\n    LINK_FLAGS \"-Wl,--version-script=${CMAKE_CURRENT_SOURCE_DIR}/sherpa-onnx-symbols-c.lds\"\n  )\nelseif(APPLE)\n  set_target_properties(sherpa-onnx-c-api PROPERTIES\n    LINK_FLAGS \"-Wl,-exported_symbols_list,${CMAKE_CURRENT_SOURCE_DIR}/sherpa-onnx-symbols-c.exp\"\n  )\nendif()\n\ninstall(\n  TARGETS\n    sherpa-onnx-c-api\n    sherpa-onnx-cxx-api\n  DESTINATION\n    lib\n)\n\ninstall(\n  FILES\n    c-api.h\n    cxx-api.h\n  DESTINATION\n    include/sherpa-onnx/c-api\n)\n"
  },
  {
    "path": "sherpa-onnx/c-api/Doxyfile",
    "content": "# Doxygen configuration for sherpa-onnx C and C++ public APIs.\n# Run from this directory with:\n#\n#   doxygen Doxyfile\n#\n# HTML output is generated under ./doxygen-docs/html/.\n\nDOXYFILE_ENCODING      = UTF-8\nPROJECT_NAME           = \"sherpa-onnx C API\"\nPROJECT_BRIEF          = \"Public C API and C++ wrapper for sherpa-onnx\"\nPROJECT_NUMBER         = 1.0\nOUTPUT_DIRECTORY       = doxygen-docs\nCREATE_SUBDIRS         = NO\nALLOW_UNICODE_NAMES    = YES\nOUTPUT_LANGUAGE        = English\nBRIEF_MEMBER_DESC      = YES\nREPEAT_BRIEF           = NO\nALWAYS_DETAILED_SEC    = NO\nINLINE_INHERITED_MEMB  = NO\nFULL_PATH_NAMES        = YES\nSTRIP_FROM_PATH        = ..\nSHORT_NAMES            = NO\nJAVADOC_AUTOBRIEF      = NO\nQT_AUTOBRIEF           = NO\nMULTILINE_CPP_IS_BRIEF = NO\nINHERIT_DOCS           = YES\nSEPARATE_MEMBER_PAGES  = NO\nTAB_SIZE               = 2\nOPTIMIZE_OUTPUT_FOR_C  = NO\nOPTIMIZE_OUTPUT_JAVA   = NO\nOPTIMIZE_FOR_FORTRAN   = NO\nOPTIMIZE_OUTPUT_VHDL   = NO\nMARKDOWN_SUPPORT       = YES\nAUTOLINK_SUPPORT       = YES\nBUILTIN_STL_SUPPORT    = YES\nCPP_CLI_SUPPORT        = NO\nSIP_SUPPORT            = NO\nIDL_PROPERTY_SUPPORT   = YES\nDISTRIBUTE_GROUP_DOC   = NO\nGROUP_NESTED_COMPOUNDS = NO\nSUBGROUPING            = YES\nINLINE_GROUPED_CLASSES = NO\nINLINE_SIMPLE_STRUCTS  = NO\nTYPEDEF_HIDES_STRUCT   = NO\nLOOKUP_CACHE_SIZE      = 2\n\nEXTRACT_ALL            = YES\nEXTRACT_PRIVATE        = NO\nEXTRACT_PRIV_VIRTUAL   = NO\nEXTRACT_PACKAGE        = NO\nEXTRACT_STATIC         = NO\nEXTRACT_LOCAL_CLASSES  = YES\nEXTRACT_LOCAL_METHODS  = NO\nEXTRACT_ANON_NSPACES   = NO\nHIDE_UNDOC_MEMBERS     = NO\nHIDE_UNDOC_CLASSES     = NO\nHIDE_FRIEND_COMPOUNDS  = NO\nHIDE_IN_BODY_DOCS      = NO\nINTERNAL_DOCS          = NO\nCASE_SENSE_NAMES       = YES\nHIDE_SCOPE_NAMES       = NO\nHIDE_COMPOUND_REFERENCE = NO\nSHOW_HEADERFILE        = YES\nSHOW_INCLUDE_FILES     = YES\nSHOW_GROUPED_MEMB_INC  = NO\nFORCE_LOCAL_INCLUDES   = NO\nINLINE_INFO         
   = YES\nSORT_MEMBER_DOCS       = YES\nSORT_BRIEF_DOCS        = NO\nSORT_MEMBERS_CTORS_1ST = NO\nSORT_GROUP_NAMES       = NO\nSORT_BY_SCOPE_NAME     = NO\nSTRICT_PROTO_MATCHING  = NO\nGENERATE_TODOLIST      = YES\nGENERATE_TESTLIST      = NO\nGENERATE_BUGLIST       = NO\nGENERATE_DEPRECATEDLIST = YES\nENABLED_SECTIONS       =\nMAX_INITIALIZER_LINES  = 30\nSHOW_USED_FILES        = YES\nSHOW_FILES             = YES\nSHOW_NAMESPACES        = YES\nFILE_VERSION_FILTER    =\nLAYOUT_FILE            =\nCITE_BIB_FILES         =\n\nQUIET                  = NO\nWARNINGS               = YES\nWARN_IF_UNDOCUMENTED   = NO\nWARN_IF_DOC_ERROR      = YES\nWARN_IF_INCOMPLETE_DOC = YES\nWARN_NO_PARAMDOC       = NO\nWARN_AS_ERROR          = NO\nWARN_FORMAT            = \"$file:$line: $text\"\nWARN_LOGFILE           =\n\nINPUT                  = mainpage.md \\\n                         c-api.h \\\n                         cxx-api.h\nINPUT_ENCODING         = UTF-8\nFILE_PATTERNS          = *.h\nRECURSIVE              = NO\nEXCLUDE                =\nEXCLUDE_SYMLINKS       = NO\nEXCLUDE_PATTERNS       =\nEXCLUDE_SYMBOLS        =\nEXAMPLE_PATH           = ../../c-api-examples \\\n                         ../../cxx-api-examples\nEXAMPLE_PATTERNS       = *.c \\\n                         *.cc \\\n                         *.h\nEXAMPLE_RECURSIVE      = NO\nIMAGE_PATH             =\nINPUT_FILTER           =\nFILTER_PATTERNS        =\nFILTER_SOURCE_FILES    = NO\nFILTER_SOURCE_PATTERNS =\nUSE_MDFILE_AS_MAINPAGE = mainpage.md\nSOURCE_BROWSER         = YES\nINLINE_SOURCES         = NO\nSTRIP_CODE_COMMENTS    = YES\nREFERENCED_BY_RELATION = YES\nREFERENCES_RELATION    = YES\nREFERENCES_LINK_SOURCE = YES\nSOURCE_TOOLTIPS        = YES\nUSE_HTAGS              = NO\nVERBATIM_HEADERS       = YES\n\nCLANG_ASSISTED_PARSING = NO\nCLANG_ADD_INC_PATHS    = YES\nCLANG_OPTIONS          =\n\nALPHABETICAL_INDEX     = YES\nCOLS_IN_ALPHA_INDEX    = 5\nIGNORE_PREFIX          = SherpaOnnx\n\nHTML_OUTPUT            = 
html\nHTML_FILE_EXTENSION    = .html\nHTML_HEADER            =\nHTML_FOOTER            =\nHTML_STYLESHEET        =\nHTML_EXTRA_STYLESHEET  =\nHTML_EXTRA_FILES       =\nHTML_COLORSTYLE        = LIGHT\nHTML_COLORSTYLE_HUE    = 220\nHTML_COLORSTYLE_SAT    = 100\nHTML_COLORSTYLE_GAMMA  = 80\nHTML_TIMESTAMP         = NO\nHTML_DYNAMIC_MENUS     = YES\nHTML_DYNAMIC_SECTIONS  = YES\nHTML_INDEX_NUM_ENTRIES = 100\nGENERATE_DOCSET        = NO\nGENERATE_HTMLHELP      = NO\nGENERATE_CHI           = NO\nGENERATE_QHP           = NO\nGENERATE_ECLIPSEHELP   = NO\nDISABLE_INDEX          = NO\nGENERATE_TREEVIEW      = YES\nENUM_VALUES_PER_LINE   = 1\nTREEVIEW_WIDTH         = 250\nEXT_LINKS_IN_WINDOW    = NO\nOBFUSCATE_EMAILS       = YES\nHTML_FORMULA_FORMAT    = svg\nFORMULA_FONTSIZE       = 10\nFORMULA_MACROFILE      =\nUSE_MATHJAX            = NO\nSEARCHENGINE           = YES\nSERVER_BASED_SEARCH    = NO\nEXTERNAL_SEARCH        = NO\nSEARCHENGINE_URL       =\nSEARCHDATA_FILE        = searchdata.xml\nEXTERNAL_SEARCH_ID     =\nEXTRA_SEARCH_MAPPINGS  =\n\nLATEX_OUTPUT           = latex\nGENERATE_LATEX         = NO\nGENERATE_RTF           = NO\nGENERATE_MAN           = NO\nGENERATE_XML           = NO\nGENERATE_DOCBOOK       = NO\nGENERATE_AUTOGEN_DEF   = NO\nGENERATE_PERLMOD       = NO\n\nENABLE_PREPROCESSING   = YES\nMACRO_EXPANSION        = YES\nEXPAND_ONLY_PREDEF     = YES\nSEARCH_INCLUDES        = YES\nINCLUDE_PATH           = ..\nINCLUDE_FILE_PATTERNS  =\nPREDEFINED             = SHERPA_ONNX_API= \\\n                         SHERPA_ONNX_EXPORT= \\\n                         SHERPA_ONNX_IMPORT= \\\n                         SHERPA_ONNX_DEPRECATED(x)=\nEXPAND_AS_DEFINED      =\nSKIP_FUNCTION_MACROS   = YES\n\nTAGFILES               =\nGENERATE_TAGFILE       =\nALLEXTERNALS           = NO\nEXTERNAL_GROUPS        = YES\nEXTERNAL_PAGES         = YES\n\nCLASS_DIAGRAMS         = YES\nHIDE_UNDOC_RELATIONS   = YES\nHAVE_DOT               = YES\nCLASS_GRAPH            = 
YES\nCOLLABORATION_GRAPH    = YES\nGROUP_GRAPHS           = YES\nUML_LOOK               = NO\nUML_LIMIT_NUM_FIELDS   = 10\nDOT_NUM_THREADS        = 0\nDOT_FONTNAME           = Helvetica\nDOT_FONTSIZE           = 10\nDOT_FONTPATH           =\nCLASS_GRAPH_WIDTH      = 1024\nDOT_GRAPH_MAX_NODES    = 50\nMAX_DOT_GRAPH_DEPTH    = 0\nDOT_TRANSPARENT        = NO\nDOT_MULTI_TARGETS      = NO\nGENERATE_LEGEND        = YES\nDOT_CLEANUP            = YES\n"
  },
  {
    "path": "sherpa-onnx/c-api/README.md",
    "content": "# Introduction\n\n\n## View doc\n\nYou can find documentation for C API and CXX API at the following address:\n<https://k2-fsa.github.io/sherpa/onnx/c-api/html/index.html>\n\n## Generate doc\n\n```bash\nsudo apt install doxygen graphviz      # Ubuntu/Debian\nbrew install doxygen graphviz          # macOS\n```\n\n```bash\ndoxygen ./Doxyfile\n```\n"
  },
  {
    "path": "sherpa-onnx/c-api/c-api.cc",
    "content": "// sherpa-onnx/c-api/c-api.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n#include <algorithm>\n#include <cstring>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"nlohmann/json.hpp\"\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/display.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/version.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\n#if SHERPA_ONNX_ENABLE_TTS == 1\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\n#endif\n\nconst char *SherpaOnnxGetVersionStr() { return sherpa_onnx::GetVersionStr(); }\nconst char *SherpaOnnxGetGitSha1() { return sherpa_onnx::GetGitSha1(); }\nconst char *SherpaOnnxGetGitDate() { return sherpa_onnx::GetGitDate(); }\n\nstruct SherpaOnnxOnlineRecognizer {\n  std::unique_ptr<sherpa_onnx::OnlineRecognizer> impl;\n};\n\nstruct SherpaOnnxOnlineStream 
{\n  std::unique_ptr<sherpa_onnx::OnlineStream> impl;\n  explicit SherpaOnnxOnlineStream(std::unique_ptr<sherpa_onnx::OnlineStream> p)\n      : impl(std::move(p)) {}\n};\n\nstruct SherpaOnnxDisplay {\n  std::unique_ptr<sherpa_onnx::Display> impl;\n};\n\n#define SHERPA_ONNX_OR(x, y) (x ? x : y)\n\nstatic sherpa_onnx::OnlineRecognizerConfig GetOnlineRecognizerConfig(\n    const SherpaOnnxOnlineRecognizerConfig *config) {\n  sherpa_onnx::OnlineRecognizerConfig recognizer_config;\n\n  recognizer_config.feat_config.sampling_rate =\n      SHERPA_ONNX_OR(config->feat_config.sample_rate, 16000);\n  recognizer_config.feat_config.feature_dim =\n      SHERPA_ONNX_OR(config->feat_config.feature_dim, 80);\n\n  recognizer_config.model_config.transducer.encoder =\n      SHERPA_ONNX_OR(config->model_config.transducer.encoder, \"\");\n  recognizer_config.model_config.transducer.decoder =\n      SHERPA_ONNX_OR(config->model_config.transducer.decoder, \"\");\n  recognizer_config.model_config.transducer.joiner =\n      SHERPA_ONNX_OR(config->model_config.transducer.joiner, \"\");\n\n  recognizer_config.model_config.paraformer.encoder =\n      SHERPA_ONNX_OR(config->model_config.paraformer.encoder, \"\");\n  recognizer_config.model_config.paraformer.decoder =\n      SHERPA_ONNX_OR(config->model_config.paraformer.decoder, \"\");\n\n  recognizer_config.model_config.zipformer2_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.zipformer2_ctc.model, \"\");\n\n  recognizer_config.model_config.tokens =\n      SHERPA_ONNX_OR(config->model_config.tokens, \"\");\n  if (config->model_config.tokens_buf &&\n      config->model_config.tokens_buf_size > 0) {\n    recognizer_config.model_config.tokens_buf = std::string(\n        config->model_config.tokens_buf, config->model_config.tokens_buf_size);\n  }\n\n  recognizer_config.model_config.nemo_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.nemo_ctc.model, \"\");\n\n  recognizer_config.model_config.t_one_ctc.model =\n      
SHERPA_ONNX_OR(config->model_config.t_one_ctc.model, \"\");\n\n  recognizer_config.model_config.num_threads =\n      SHERPA_ONNX_OR(config->model_config.num_threads, 1);\n  recognizer_config.model_config.provider_config.provider =\n      SHERPA_ONNX_OR(config->model_config.provider, \"cpu\");\n\n  if (recognizer_config.model_config.provider_config.provider.empty()) {\n    recognizer_config.model_config.provider_config.provider = \"cpu\";\n  }\n\n  recognizer_config.model_config.model_type =\n      SHERPA_ONNX_OR(config->model_config.model_type, \"\");\n  recognizer_config.model_config.debug = config->model_config.debug;\n  recognizer_config.model_config.modeling_unit =\n      SHERPA_ONNX_OR(config->model_config.modeling_unit, \"cjkchar\");\n\n  if (recognizer_config.model_config.modeling_unit.empty()) {\n    recognizer_config.model_config.modeling_unit = \"cjkchar\";\n  }\n\n  recognizer_config.model_config.bpe_vocab =\n      SHERPA_ONNX_OR(config->model_config.bpe_vocab, \"\");\n\n  recognizer_config.decoding_method =\n      SHERPA_ONNX_OR(config->decoding_method, \"greedy_search\");\n  if (recognizer_config.decoding_method.empty()) {\n    recognizer_config.decoding_method = \"greedy_search\";\n  }\n\n  recognizer_config.max_active_paths =\n      SHERPA_ONNX_OR(config->max_active_paths, 4);\n\n  recognizer_config.enable_endpoint =\n      SHERPA_ONNX_OR(config->enable_endpoint, 0);\n\n  recognizer_config.endpoint_config.rule1.min_trailing_silence =\n      SHERPA_ONNX_OR(config->rule1_min_trailing_silence, 2.4);\n\n  recognizer_config.endpoint_config.rule2.min_trailing_silence =\n      SHERPA_ONNX_OR(config->rule2_min_trailing_silence, 1.2);\n\n  recognizer_config.endpoint_config.rule3.min_utterance_length =\n      SHERPA_ONNX_OR(config->rule3_min_utterance_length, 20);\n\n  recognizer_config.hotwords_file = SHERPA_ONNX_OR(config->hotwords_file, \"\");\n  recognizer_config.hotwords_score =\n      SHERPA_ONNX_OR(config->hotwords_score, 1.5);\n  if 
(config->hotwords_buf && config->hotwords_buf_size > 0) {\n    recognizer_config.hotwords_buf =\n        std::string(config->hotwords_buf, config->hotwords_buf_size);\n  }\n\n  recognizer_config.blank_penalty = config->blank_penalty;\n\n  recognizer_config.ctc_fst_decoder_config.graph =\n      SHERPA_ONNX_OR(config->ctc_fst_decoder_config.graph, \"\");\n  recognizer_config.ctc_fst_decoder_config.max_active =\n      SHERPA_ONNX_OR(config->ctc_fst_decoder_config.max_active, 3000);\n\n  recognizer_config.rule_fsts = SHERPA_ONNX_OR(config->rule_fsts, \"\");\n  recognizer_config.rule_fars = SHERPA_ONNX_OR(config->rule_fars, \"\");\n\n  recognizer_config.hr.lexicon = SHERPA_ONNX_OR(config->hr.lexicon, \"\");\n  recognizer_config.hr.rule_fsts = SHERPA_ONNX_OR(config->hr.rule_fsts, \"\");\n\n  if (config->model_config.debug) {\n#if __OHOS__\n    auto str_vec = sherpa_onnx::SplitString(recognizer_config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", s.c_str());\n      SHERPA_ONNX_LOGE(\"%s\\n\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", recognizer_config.ToString().c_str());\n#endif\n  }\n\n  return recognizer_config;\n}\n\nconst SherpaOnnxOnlineRecognizer *SherpaOnnxCreateOnlineRecognizer(\n    const SherpaOnnxOnlineRecognizerConfig *config) {\n  sherpa_onnx::OnlineRecognizerConfig recognizer_config =\n      GetOnlineRecognizerConfig(config);\n\n  if (!recognizer_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config!\");\n    return nullptr;\n  }\n\n  SherpaOnnxOnlineRecognizer *recognizer = new SherpaOnnxOnlineRecognizer;\n\n  recognizer->impl =\n      std::make_unique<sherpa_onnx::OnlineRecognizer>(recognizer_config);\n\n  return recognizer;\n}\n\nvoid SherpaOnnxDestroyOnlineRecognizer(\n    const SherpaOnnxOnlineRecognizer *recognizer) {\n  if (!recognizer) return;\n  delete recognizer;\n}\n\nconst SherpaOnnxOnlineStream *SherpaOnnxCreateOnlineStream(\n    const SherpaOnnxOnlineRecognizer 
*recognizer) {\n  SherpaOnnxOnlineStream *stream =\n      new SherpaOnnxOnlineStream(recognizer->impl->CreateStream());\n  return stream;\n}\n\nconst SherpaOnnxOnlineStream *SherpaOnnxCreateOnlineStreamWithHotwords(\n    const SherpaOnnxOnlineRecognizer *recognizer, const char *hotwords) {\n  SherpaOnnxOnlineStream *stream =\n      new SherpaOnnxOnlineStream(recognizer->impl->CreateStream(hotwords));\n  return stream;\n}\n\nvoid SherpaOnnxDestroyOnlineStream(const SherpaOnnxOnlineStream *stream) {\n  if (!stream) return;\n  delete stream;\n}\n\nvoid SherpaOnnxOnlineStreamAcceptWaveform(const SherpaOnnxOnlineStream *stream,\n                                          int32_t sample_rate,\n                                          const float *samples, int32_t n) {\n  stream->impl->AcceptWaveform(sample_rate, samples, n);\n}\n\nint32_t SherpaOnnxIsOnlineStreamReady(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream) {\n  return recognizer->impl->IsReady(stream->impl.get());\n}\n\nvoid SherpaOnnxDecodeOnlineStream(const SherpaOnnxOnlineRecognizer *recognizer,\n                                  const SherpaOnnxOnlineStream *stream) {\n  recognizer->impl->DecodeStream(stream->impl.get());\n}\n\nvoid SherpaOnnxDecodeMultipleOnlineStreams(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream **streams, int32_t n) {\n  std::vector<sherpa_onnx::OnlineStream *> ss(n);\n  for (int32_t i = 0; i != n; ++i) {\n    ss[i] = streams[i]->impl.get();\n  }\n  recognizer->impl->DecodeStreams(ss.data(), n);\n}\n\nconst SherpaOnnxOnlineRecognizerResult *SherpaOnnxGetOnlineStreamResult(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream) {\n  sherpa_onnx::OnlineRecognizerResult result =\n      recognizer->impl->GetResult(stream->impl.get());\n  const auto &text = result.text;\n\n  auto r = new SherpaOnnxOnlineRecognizerResult;\n  memset(r, 0, 
sizeof(SherpaOnnxOnlineRecognizerResult));\n\n  // copy text\n  char *pText = new char[text.size() + 1];\n  std::copy(text.begin(), text.end(), pText);\n  pText[text.size()] = 0;\n  r->text = pText;\n\n  // copy json\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  r->json = pJson;\n\n  // copy tokens\n  auto count = result.tokens.size();\n  if (count > 0) {\n    size_t total_length = 0;\n    for (const auto &token : result.tokens) {\n      // +1 for the null character at the end of each token\n      total_length += token.size() + 1;\n    }\n\n    r->count = count;\n    // Each token ends with a null character\n    char *tokens = new char[total_length]{};\n    char **tokens_temp = new char *[r->count];\n    int32_t pos = 0;\n    for (int32_t i = 0; i < r->count; ++i) {\n      tokens_temp[i] = tokens + pos;\n      memcpy(tokens + pos, result.tokens[i].c_str(), result.tokens[i].size());\n      // +1 to move past the null character\n      pos += result.tokens[i].size() + 1;\n    }\n    r->tokens_arr = tokens_temp;\n\n    if (!result.timestamps.empty() && result.timestamps.size() == r->count) {\n      r->timestamps = new float[r->count];\n      std::copy(result.timestamps.begin(), result.timestamps.end(),\n                r->timestamps);\n    } else {\n      r->timestamps = nullptr;\n    }\n\n    r->tokens = tokens;\n  } else {\n    r->count = 0;\n    r->timestamps = nullptr;\n    r->tokens = nullptr;\n    r->tokens_arr = nullptr;\n  }\n\n  return r;\n}\n\nvoid SherpaOnnxDestroyOnlineRecognizerResult(\n    const SherpaOnnxOnlineRecognizerResult *r) {\n  if (r) {\n    delete[] r->text;\n    delete[] r->json;\n    delete[] r->tokens;\n    delete[] r->tokens_arr;\n    delete[] r->timestamps;\n    delete r;\n  }\n}\n\nconst char *SherpaOnnxGetOnlineStreamResultAsJson(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream) {\n  
sherpa_onnx::OnlineRecognizerResult result =\n      recognizer->impl->GetResult(stream->impl.get());\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  return pJson;\n}\n\nvoid SherpaOnnxDestroyOnlineStreamResultJson(const char *s) {\n  if (!s) return;\n  delete[] s;\n}\n\nvoid SherpaOnnxOnlineStreamReset(const SherpaOnnxOnlineRecognizer *recognizer,\n                                 const SherpaOnnxOnlineStream *stream) {\n  recognizer->impl->Reset(stream->impl.get());\n}\n\nvoid SherpaOnnxOnlineStreamInputFinished(const SherpaOnnxOnlineStream *stream) {\n  stream->impl->InputFinished();\n}\n\nvoid SherpaOnnxOnlineStreamSetOption(const SherpaOnnxOnlineStream *stream,\n                                     const char *key, const char *value) {\n  if (!stream || !key || !value) return;\n  stream->impl->SetOption(key, value);\n}\n\nconst char *SherpaOnnxOnlineStreamGetOption(\n    const SherpaOnnxOnlineStream *stream, const char *key) {\n  if (!stream || !key) return nullptr;\n  return stream->impl->GetOption(key).c_str();\n}\n\nint32_t SherpaOnnxOnlineStreamHasOption(const SherpaOnnxOnlineStream *stream,\n                                        const char *key) {\n  if (!stream || !key) return 0;\n  return stream->impl->HasOption(key);\n}\n\nint32_t SherpaOnnxOnlineStreamIsEndpoint(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream) {\n  return recognizer->impl->IsEndpoint(stream->impl.get());\n}\n\nconst SherpaOnnxDisplay *SherpaOnnxCreateDisplay(int32_t max_word_per_line) {\n  SherpaOnnxDisplay *ans = new SherpaOnnxDisplay;\n  ans->impl = std::make_unique<sherpa_onnx::Display>(max_word_per_line);\n  return ans;\n}\n\nvoid SherpaOnnxDestroyDisplay(const SherpaOnnxDisplay *display) {\n  if (!display) return;\n  delete display;\n}\n\nvoid SherpaOnnxPrint(const SherpaOnnxDisplay *display, int32_t idx,\n            
         const char *s) {\n  display->impl->Print(idx, s);\n}\n\n// ============================================================\n// For offline ASR (i.e., non-streaming ASR)\n// ============================================================\n//\nstruct SherpaOnnxOfflineRecognizer {\n  std::unique_ptr<sherpa_onnx::OfflineRecognizer> impl;\n};\n\nstruct SherpaOnnxOfflineStream {\n  std::unique_ptr<sherpa_onnx::OfflineStream> impl;\n  explicit SherpaOnnxOfflineStream(\n      std::unique_ptr<sherpa_onnx::OfflineStream> p)\n      : impl(std::move(p)) {}\n};\n\nstatic sherpa_onnx::OfflineRecognizerConfig GetOfflineRecognizerConfig(\n    const SherpaOnnxOfflineRecognizerConfig *config) {\n  sherpa_onnx::OfflineRecognizerConfig recognizer_config;\n\n  recognizer_config.feat_config.sampling_rate =\n      SHERPA_ONNX_OR(config->feat_config.sample_rate, 16000);\n\n  recognizer_config.feat_config.feature_dim =\n      SHERPA_ONNX_OR(config->feat_config.feature_dim, 80);\n\n  recognizer_config.model_config.transducer.encoder_filename =\n      SHERPA_ONNX_OR(config->model_config.transducer.encoder, \"\");\n\n  recognizer_config.model_config.transducer.decoder_filename =\n      SHERPA_ONNX_OR(config->model_config.transducer.decoder, \"\");\n\n  recognizer_config.model_config.transducer.joiner_filename =\n      SHERPA_ONNX_OR(config->model_config.transducer.joiner, \"\");\n\n  recognizer_config.model_config.paraformer.model =\n      SHERPA_ONNX_OR(config->model_config.paraformer.model, \"\");\n\n  recognizer_config.model_config.nemo_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.nemo_ctc.model, \"\");\n\n  recognizer_config.model_config.whisper.encoder =\n      SHERPA_ONNX_OR(config->model_config.whisper.encoder, \"\");\n\n  recognizer_config.model_config.whisper.decoder =\n      SHERPA_ONNX_OR(config->model_config.whisper.decoder, \"\");\n\n  recognizer_config.model_config.whisper.language =\n      SHERPA_ONNX_OR(config->model_config.whisper.language, \"\");\n\n  
recognizer_config.model_config.whisper.task =\n      SHERPA_ONNX_OR(config->model_config.whisper.task, \"transcribe\");\n  if (recognizer_config.model_config.whisper.task.empty()) {\n    recognizer_config.model_config.whisper.task = \"transcribe\";\n  }\n\n  recognizer_config.model_config.whisper.tail_paddings =\n      SHERPA_ONNX_OR(config->model_config.whisper.tail_paddings, -1);\n\n  recognizer_config.model_config.whisper.enable_token_timestamps =\n      config->model_config.whisper.enable_token_timestamps;\n\n  recognizer_config.model_config.whisper.enable_segment_timestamps =\n      config->model_config.whisper.enable_segment_timestamps;\n\n  recognizer_config.model_config.tdnn.model =\n      SHERPA_ONNX_OR(config->model_config.tdnn.model, \"\");\n\n  recognizer_config.model_config.tokens =\n      SHERPA_ONNX_OR(config->model_config.tokens, \"\");\n  recognizer_config.model_config.num_threads =\n      SHERPA_ONNX_OR(config->model_config.num_threads, 1);\n  recognizer_config.model_config.debug = config->model_config.debug;\n  recognizer_config.model_config.provider =\n      SHERPA_ONNX_OR(config->model_config.provider, \"cpu\");\n  if (recognizer_config.model_config.provider.empty()) {\n    recognizer_config.model_config.provider = \"cpu\";\n  }\n\n  recognizer_config.model_config.model_type =\n      SHERPA_ONNX_OR(config->model_config.model_type, \"\");\n  recognizer_config.model_config.modeling_unit =\n      SHERPA_ONNX_OR(config->model_config.modeling_unit, \"cjkchar\");\n\n  if (recognizer_config.model_config.modeling_unit.empty()) {\n    recognizer_config.model_config.modeling_unit = \"cjkchar\";\n  }\n\n  recognizer_config.model_config.bpe_vocab =\n      SHERPA_ONNX_OR(config->model_config.bpe_vocab, \"\");\n\n  recognizer_config.model_config.telespeech_ctc =\n      SHERPA_ONNX_OR(config->model_config.telespeech_ctc, \"\");\n\n  recognizer_config.model_config.sense_voice.model =\n      SHERPA_ONNX_OR(config->model_config.sense_voice.model, \"\");\n\n  
recognizer_config.model_config.sense_voice.language =\n      SHERPA_ONNX_OR(config->model_config.sense_voice.language, \"\");\n\n  recognizer_config.model_config.sense_voice.use_itn =\n      config->model_config.sense_voice.use_itn;\n\n  recognizer_config.model_config.moonshine.preprocessor =\n      SHERPA_ONNX_OR(config->model_config.moonshine.preprocessor, \"\");\n\n  recognizer_config.model_config.moonshine.encoder =\n      SHERPA_ONNX_OR(config->model_config.moonshine.encoder, \"\");\n\n  recognizer_config.model_config.moonshine.uncached_decoder =\n      SHERPA_ONNX_OR(config->model_config.moonshine.uncached_decoder, \"\");\n\n  recognizer_config.model_config.moonshine.cached_decoder =\n      SHERPA_ONNX_OR(config->model_config.moonshine.cached_decoder, \"\");\n\n  recognizer_config.model_config.moonshine.merged_decoder =\n      SHERPA_ONNX_OR(config->model_config.moonshine.merged_decoder, \"\");\n\n  recognizer_config.model_config.fire_red_asr.encoder =\n      SHERPA_ONNX_OR(config->model_config.fire_red_asr.encoder, \"\");\n\n  recognizer_config.model_config.fire_red_asr.decoder =\n      SHERPA_ONNX_OR(config->model_config.fire_red_asr.decoder, \"\");\n\n  recognizer_config.model_config.dolphin.model =\n      SHERPA_ONNX_OR(config->model_config.dolphin.model, \"\");\n\n  recognizer_config.model_config.zipformer_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.zipformer_ctc.model, \"\");\n\n  recognizer_config.model_config.canary.encoder =\n      SHERPA_ONNX_OR(config->model_config.canary.encoder, \"\");\n\n  recognizer_config.model_config.canary.decoder =\n      SHERPA_ONNX_OR(config->model_config.canary.decoder, \"\");\n\n  recognizer_config.model_config.canary.src_lang =\n      SHERPA_ONNX_OR(config->model_config.canary.src_lang, \"\");\n\n  recognizer_config.model_config.canary.tgt_lang =\n      SHERPA_ONNX_OR(config->model_config.canary.tgt_lang, \"\");\n\n  recognizer_config.model_config.canary.use_pnc =\n      
config->model_config.canary.use_pnc;\n\n  recognizer_config.model_config.wenet_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.wenet_ctc.model, \"\");\n\n  recognizer_config.model_config.omnilingual.model =\n      SHERPA_ONNX_OR(config->model_config.omnilingual.model, \"\");\n\n  recognizer_config.model_config.medasr.model =\n      SHERPA_ONNX_OR(config->model_config.medasr.model, \"\");\n\n  recognizer_config.model_config.funasr_nano.encoder_adaptor =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.encoder_adaptor, \"\");\n  recognizer_config.model_config.funasr_nano.llm =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.llm, \"\");\n  recognizer_config.model_config.funasr_nano.embedding =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.embedding, \"\");\n  recognizer_config.model_config.funasr_nano.tokenizer =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.tokenizer, \"\");\n  recognizer_config.model_config.funasr_nano.system_prompt =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.system_prompt,\n                     \"You are a helpful assistant.\");\n  recognizer_config.model_config.funasr_nano.user_prompt = SHERPA_ONNX_OR(\n      config->model_config.funasr_nano.user_prompt, \"语音转写：\");\n  recognizer_config.model_config.funasr_nano.language =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.language, \"\");\n  recognizer_config.model_config.funasr_nano.itn =\n      config->model_config.funasr_nano.itn;\n  recognizer_config.model_config.funasr_nano.hotwords =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.hotwords, \"\");\n  recognizer_config.model_config.funasr_nano.max_new_tokens =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.max_new_tokens, 512);\n  recognizer_config.model_config.funasr_nano.temperature =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.temperature, 1e-6f);\n  recognizer_config.model_config.funasr_nano.top_p =\n      
SHERPA_ONNX_OR(config->model_config.funasr_nano.top_p, 0.8f);\n  recognizer_config.model_config.funasr_nano.seed =\n      SHERPA_ONNX_OR(config->model_config.funasr_nano.seed, 42);\n\n  recognizer_config.model_config.fire_red_asr_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.fire_red_asr_ctc.model, \"\");\n\n  recognizer_config.lm_config.model =\n      SHERPA_ONNX_OR(config->lm_config.model, \"\");\n  recognizer_config.lm_config.scale =\n      SHERPA_ONNX_OR(config->lm_config.scale, 1.0);\n\n  recognizer_config.decoding_method =\n      SHERPA_ONNX_OR(config->decoding_method, \"greedy_search\");\n\n  if (recognizer_config.decoding_method.empty()) {\n    recognizer_config.decoding_method = \"greedy_search\";\n  }\n\n  recognizer_config.max_active_paths =\n      SHERPA_ONNX_OR(config->max_active_paths, 4);\n\n  recognizer_config.hotwords_file = SHERPA_ONNX_OR(config->hotwords_file, \"\");\n  recognizer_config.hotwords_score =\n      SHERPA_ONNX_OR(config->hotwords_score, 1.5);\n\n  recognizer_config.blank_penalty = config->blank_penalty;\n\n  recognizer_config.rule_fsts = SHERPA_ONNX_OR(config->rule_fsts, \"\");\n  recognizer_config.rule_fars = SHERPA_ONNX_OR(config->rule_fars, \"\");\n\n  recognizer_config.hr.lexicon = SHERPA_ONNX_OR(config->hr.lexicon, \"\");\n  recognizer_config.hr.rule_fsts = SHERPA_ONNX_OR(config->hr.rule_fsts, \"\");\n\n  if (config->model_config.debug) {\n#if __OHOS__\n    auto str_vec = sherpa_onnx::SplitString(recognizer_config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", s.c_str());\n      SHERPA_ONNX_LOGE(\"%s\\n\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", recognizer_config.ToString().c_str());\n#endif\n  }\n\n  return recognizer_config;\n}\n\nconst SherpaOnnxOfflineRecognizer *SherpaOnnxCreateOfflineRecognizer(\n    const SherpaOnnxOfflineRecognizerConfig *config) {\n  sherpa_onnx::OfflineRecognizerConfig recognizer_config =\n      
GetOfflineRecognizerConfig(config);\n\n  if (!recognizer_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxOfflineRecognizer *recognizer = new SherpaOnnxOfflineRecognizer;\n\n  recognizer->impl =\n      std::make_unique<sherpa_onnx::OfflineRecognizer>(recognizer_config);\n\n  return recognizer;\n}\n\nvoid SherpaOnnxOfflineRecognizerSetConfig(\n    const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineRecognizerConfig *config) {\n  sherpa_onnx::OfflineRecognizerConfig recognizer_config =\n      GetOfflineRecognizerConfig(config);\n  recognizer->impl->SetConfig(recognizer_config);\n}\n\nvoid SherpaOnnxDestroyOfflineRecognizer(\n    const SherpaOnnxOfflineRecognizer *recognizer) {\n  if (!recognizer) return;\n  delete recognizer;\n}\n\nconst SherpaOnnxOfflineStream *SherpaOnnxCreateOfflineStream(\n    const SherpaOnnxOfflineRecognizer *recognizer) {\n  SherpaOnnxOfflineStream *stream =\n      new SherpaOnnxOfflineStream(recognizer->impl->CreateStream());\n  return stream;\n}\n\nconst SherpaOnnxOfflineStream *SherpaOnnxCreateOfflineStreamWithHotwords(\n    const SherpaOnnxOfflineRecognizer *recognizer, const char *hotwords) {\n  SherpaOnnxOfflineStream *stream =\n      new SherpaOnnxOfflineStream(recognizer->impl->CreateStream(hotwords));\n  return stream;\n}\n\nvoid SherpaOnnxDestroyOfflineStream(const SherpaOnnxOfflineStream *stream) {\n  if (!stream) return;\n  delete stream;\n}\n\nvoid SherpaOnnxAcceptWaveformOffline(const SherpaOnnxOfflineStream *stream,\n                                     int32_t sample_rate, const float *samples,\n                                     int32_t n) {\n  stream->impl->AcceptWaveform(sample_rate, samples, n);\n}\n\nvoid SherpaOnnxOfflineStreamSetOption(const SherpaOnnxOfflineStream *stream,\n                                      const char *key, const char *value) {\n  if (!stream || !key || !value) return;\n  stream->impl->SetOption(key, 
value);\n}\n\nconst char *SherpaOnnxOfflineStreamGetOption(\n    const SherpaOnnxOfflineStream *stream, const char *key) {\n  if (!stream || !key) return nullptr;\n  return stream->impl->GetOption(key).c_str();\n}\n\nint32_t SherpaOnnxOfflineStreamHasOption(const SherpaOnnxOfflineStream *stream,\n                                         const char *key) {\n  if (!stream || !key) return 0;\n  return stream->impl->HasOption(key);\n}\n\nvoid SherpaOnnxDecodeOfflineStream(\n    const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineStream *stream) {\n  recognizer->impl->DecodeStream(stream->impl.get());\n}\n\nvoid SherpaOnnxDecodeMultipleOfflineStreams(\n    const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineStream **streams, int32_t n) {\n  std::vector<sherpa_onnx::OfflineStream *> ss(n);\n  for (int32_t i = 0; i != n; ++i) {\n    ss[i] = streams[i]->impl.get();\n  }\n  recognizer->impl->DecodeStreams(ss.data(), n);\n}\n\nconst SherpaOnnxOfflineRecognizerResult *SherpaOnnxGetOfflineStreamResult(\n    const SherpaOnnxOfflineStream *stream) {\n  const sherpa_onnx::OfflineRecognitionResult &result =\n      stream->impl->GetResult();\n  const auto &text = result.text;\n\n  auto r = new SherpaOnnxOfflineRecognizerResult;\n  memset(r, 0, sizeof(SherpaOnnxOfflineRecognizerResult));\n\n  char *pText = new char[text.size() + 1];\n  std::copy(text.begin(), text.end(), pText);\n  pText[text.size()] = 0;\n  r->text = pText;\n\n  // lang\n  const auto &lang = result.lang;\n  char *c_lang = new char[lang.size() + 1];\n  std::copy(lang.begin(), lang.end(), c_lang);\n  c_lang[lang.size()] = '\\0';\n  r->lang = c_lang;\n\n  // emotion\n  const auto &emotion = result.emotion;\n  char *c_emotion = new char[emotion.size() + 1];\n  std::copy(emotion.begin(), emotion.end(), c_emotion);\n  c_emotion[emotion.size()] = '\\0';\n  r->emotion = c_emotion;\n\n  // event\n  const auto &event = result.event;\n  char *c_event = new char[event.size() + 
1];\n  std::copy(event.begin(), event.end(), c_event);\n  c_event[event.size()] = '\\0';\n  r->event = c_event;\n\n  // copy json\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  r->json = pJson;\n\n  // copy tokens\n  auto count = result.tokens.size();\n  if (count > 0) {\n    size_t total_length = 0;\n    for (const auto &token : result.tokens) {\n      // +1 for the null character at the end of each token\n      total_length += token.size() + 1;\n    }\n\n    r->count = count;\n    // Each token ends with a null character\n    char *tokens = new char[total_length]{};\n    char **tokens_temp = new char *[r->count];\n    int32_t pos = 0;\n    for (int32_t i = 0; i < r->count; ++i) {\n      tokens_temp[i] = tokens + pos;\n      memcpy(tokens + pos, result.tokens[i].c_str(), result.tokens[i].size());\n      // +1 to move past the null character\n      pos += result.tokens[i].size() + 1;\n    }\n    r->tokens_arr = tokens_temp;\n\n    if (!result.timestamps.empty() && result.timestamps.size() == r->count) {\n      r->timestamps = new float[r->count];\n      std::copy(result.timestamps.begin(), result.timestamps.end(),\n                r->timestamps);\n    } else {\n      r->timestamps = nullptr;\n    }\n\n    if (!result.durations.empty() && result.durations.size() == r->count) {\n      r->durations = new float[r->count];\n      std::copy(result.durations.begin(), result.durations.end(), r->durations);\n    } else {\n      r->durations = nullptr;\n    }\n\n    if (!result.ys_log_probs.empty() &&\n        result.ys_log_probs.size() == r->count) {\n      r->ys_log_probs = new float[r->count];\n      std::copy(result.ys_log_probs.begin(), result.ys_log_probs.end(),\n                r->ys_log_probs);\n    } else {\n      r->ys_log_probs = nullptr;\n    }\n\n    r->tokens = tokens;\n  } else {\n    r->count = 0;\n    r->timestamps = nullptr;\n    r->tokens = 
nullptr;\n    r->tokens_arr = nullptr;\n    r->ys_log_probs = nullptr;\n  }\n\n  // Copy segment-level timestamps (from Whisper with segment timestamps)\n  auto segment_count = result.segment_texts.size();\n  if (segment_count > 0 && result.segment_timestamps.size() == segment_count &&\n      result.segment_durations.size() == segment_count) {\n    r->segment_count = segment_count;\n\n    // Copy segment timestamps\n    float *timestamps = new float[segment_count];\n    std::copy(result.segment_timestamps.begin(),\n              result.segment_timestamps.end(), timestamps);\n    r->segment_timestamps = timestamps;\n\n    // Copy segment durations\n    float *durations = new float[segment_count];\n    std::copy(result.segment_durations.begin(), result.segment_durations.end(),\n              durations);\n    r->segment_durations = durations;\n\n    // Copy segment texts (similar to tokens)\n    size_t total_length = 0;\n    for (const auto &seg_text : result.segment_texts) {\n      total_length += seg_text.size() + 1;  // +1 for null terminator\n    }\n\n    char *segment_texts = new char[total_length]{};\n    char **segment_texts_temp = new char *[segment_count];\n    int32_t pos = 0;\n    for (int32_t i = 0; i < static_cast<int32_t>(segment_count); ++i) {\n      segment_texts_temp[i] = segment_texts + pos;\n      memcpy(segment_texts + pos, result.segment_texts[i].c_str(),\n             result.segment_texts[i].size());\n      pos += result.segment_texts[i].size() + 1;\n    }\n    r->segment_texts = segment_texts;\n    r->segment_texts_arr = segment_texts_temp;\n  } else {\n    r->segment_count = 0;\n    r->segment_timestamps = nullptr;\n    r->segment_durations = nullptr;\n    r->segment_texts = nullptr;\n    r->segment_texts_arr = nullptr;\n  }\n\n  return r;\n}\n\nvoid SherpaOnnxDestroyOfflineRecognizerResult(\n    const SherpaOnnxOfflineRecognizerResult *r) {\n  if (r) {\n    delete[] r->text;\n    delete[] r->timestamps;\n    delete[] r->durations;\n    
delete[] r->ys_log_probs;\n    delete[] r->tokens;\n    delete[] r->tokens_arr;\n    delete[] r->json;\n    delete[] r->lang;\n    delete[] r->emotion;\n    delete[] r->event;\n    delete[] r->segment_timestamps;\n    delete[] r->segment_durations;\n    delete[] r->segment_texts;\n    delete[] r->segment_texts_arr;\n    delete r;\n  }\n}\n\nconst char *SherpaOnnxGetOfflineStreamResultAsJson(\n    const SherpaOnnxOfflineStream *stream) {\n  const sherpa_onnx::OfflineRecognitionResult &result =\n      stream->impl->GetResult();\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  return pJson;\n}\n\nvoid SherpaOnnxDestroyOfflineStreamResultJson(const char *s) {\n  if (!s) return;\n  delete[] s;\n}\n\n// ============================================================\n// For Keyword Spot\n// ============================================================\n\nstruct SherpaOnnxKeywordSpotter {\n  std::unique_ptr<sherpa_onnx::KeywordSpotter> impl;\n};\n\nstatic sherpa_onnx::KeywordSpotterConfig GetKeywordSpotterConfig(\n    const SherpaOnnxKeywordSpotterConfig *config) {\n  sherpa_onnx::KeywordSpotterConfig spotter_config;\n\n  spotter_config.feat_config.sampling_rate =\n      SHERPA_ONNX_OR(config->feat_config.sample_rate, 16000);\n  spotter_config.feat_config.feature_dim =\n      SHERPA_ONNX_OR(config->feat_config.feature_dim, 80);\n\n  spotter_config.model_config.transducer.encoder =\n      SHERPA_ONNX_OR(config->model_config.transducer.encoder, \"\");\n  spotter_config.model_config.transducer.decoder =\n      SHERPA_ONNX_OR(config->model_config.transducer.decoder, \"\");\n  spotter_config.model_config.transducer.joiner =\n      SHERPA_ONNX_OR(config->model_config.transducer.joiner, \"\");\n\n  spotter_config.model_config.paraformer.encoder =\n      SHERPA_ONNX_OR(config->model_config.paraformer.encoder, \"\");\n  spotter_config.model_config.paraformer.decoder =\n 
     SHERPA_ONNX_OR(config->model_config.paraformer.decoder, \"\");\n\n  spotter_config.model_config.zipformer2_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.zipformer2_ctc.model, \"\");\n\n  spotter_config.model_config.nemo_ctc.model =\n      SHERPA_ONNX_OR(config->model_config.nemo_ctc.model, \"\");\n\n  spotter_config.model_config.tokens =\n      SHERPA_ONNX_OR(config->model_config.tokens, \"\");\n  if (config->model_config.tokens_buf &&\n      config->model_config.tokens_buf_size > 0) {\n    spotter_config.model_config.tokens_buf = std::string(\n        config->model_config.tokens_buf, config->model_config.tokens_buf_size);\n  }\n\n  spotter_config.model_config.num_threads =\n      SHERPA_ONNX_OR(config->model_config.num_threads, 1);\n  spotter_config.model_config.provider_config.provider =\n      SHERPA_ONNX_OR(config->model_config.provider, \"cpu\");\n  if (spotter_config.model_config.provider_config.provider.empty()) {\n    spotter_config.model_config.provider_config.provider = \"cpu\";\n  }\n\n  spotter_config.model_config.model_type =\n      SHERPA_ONNX_OR(config->model_config.model_type, \"\");\n  spotter_config.model_config.debug = config->model_config.debug;\n\n  spotter_config.max_active_paths = SHERPA_ONNX_OR(config->max_active_paths, 4);\n\n  spotter_config.num_trailing_blanks =\n      SHERPA_ONNX_OR(config->num_trailing_blanks, 1);\n\n  spotter_config.keywords_score = SHERPA_ONNX_OR(config->keywords_score, 1.0);\n\n  spotter_config.keywords_threshold =\n      SHERPA_ONNX_OR(config->keywords_threshold, 0.25);\n\n  spotter_config.keywords_file = SHERPA_ONNX_OR(config->keywords_file, \"\");\n  if (config->keywords_buf && config->keywords_buf_size > 0) {\n    spotter_config.keywords_buf =\n        std::string(config->keywords_buf, config->keywords_buf_size);\n  }\n\n  if (spotter_config.model_config.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", spotter_config.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", 
spotter_config.ToString().c_str());\n#endif\n  }\n\n  return spotter_config;\n}\n\nconst SherpaOnnxKeywordSpotter *SherpaOnnxCreateKeywordSpotter(\n    const SherpaOnnxKeywordSpotterConfig *config) {\n  auto spotter_config = GetKeywordSpotterConfig(config);\n  if (!spotter_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config!\");\n    return nullptr;\n  }\n\n  SherpaOnnxKeywordSpotter *spotter = new SherpaOnnxKeywordSpotter;\n\n  spotter->impl = std::make_unique<sherpa_onnx::KeywordSpotter>(spotter_config);\n\n  return spotter;\n}\n\nvoid SherpaOnnxDestroyKeywordSpotter(const SherpaOnnxKeywordSpotter *spotter) {\n  if (!spotter) return;\n  delete spotter;\n}\n\nconst SherpaOnnxOnlineStream *SherpaOnnxCreateKeywordStream(\n    const SherpaOnnxKeywordSpotter *spotter) {\n  SherpaOnnxOnlineStream *stream =\n      new SherpaOnnxOnlineStream(spotter->impl->CreateStream());\n  return stream;\n}\n\nconst SherpaOnnxOnlineStream *SherpaOnnxCreateKeywordStreamWithKeywords(\n    const SherpaOnnxKeywordSpotter *spotter, const char *keywords) {\n  SherpaOnnxOnlineStream *stream =\n      new SherpaOnnxOnlineStream(spotter->impl->CreateStream(keywords));\n  return stream;\n}\n\nint32_t SherpaOnnxIsKeywordStreamReady(const SherpaOnnxKeywordSpotter *spotter,\n                                       const SherpaOnnxOnlineStream *stream) {\n  return spotter->impl->IsReady(stream->impl.get());\n}\n\nvoid SherpaOnnxDecodeKeywordStream(const SherpaOnnxKeywordSpotter *spotter,\n                                   const SherpaOnnxOnlineStream *stream) {\n  spotter->impl->DecodeStream(stream->impl.get());\n}\n\nvoid SherpaOnnxResetKeywordStream(const SherpaOnnxKeywordSpotter *spotter,\n                                  const SherpaOnnxOnlineStream *stream) {\n  spotter->impl->Reset(stream->impl.get());\n}\n\nvoid SherpaOnnxDecodeMultipleKeywordStreams(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream **streams, int32_t n) {\n  
std::vector<sherpa_onnx::OnlineStream *> ss(n);\n  for (int32_t i = 0; i != n; ++i) {\n    ss[i] = streams[i]->impl.get();\n  }\n  spotter->impl->DecodeStreams(ss.data(), n);\n}\n\nconst SherpaOnnxKeywordResult *SherpaOnnxGetKeywordResult(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream) {\n  const sherpa_onnx::KeywordResult &result =\n      spotter->impl->GetResult(stream->impl.get());\n  const auto &keyword = result.keyword;\n\n  auto r = new SherpaOnnxKeywordResult;\n  memset(r, 0, sizeof(SherpaOnnxKeywordResult));\n\n  r->start_time = result.start_time;\n\n  // copy keyword\n  char *pKeyword = new char[keyword.size() + 1];\n  std::copy(keyword.begin(), keyword.end(), pKeyword);\n  pKeyword[keyword.size()] = 0;\n  r->keyword = pKeyword;\n\n  // copy json\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  r->json = pJson;\n\n  // copy tokens\n  auto count = result.tokens.size();\n  if (count > 0) {\n    size_t total_length = 0;\n    for (const auto &token : result.tokens) {\n      // +1 for the null character at the end of each token\n      total_length += token.size() + 1;\n    }\n\n    r->count = count;\n    // Each token ends with a null character\n    char *pTokens = new char[total_length]{};\n    char **tokens_temp = new char *[r->count];\n    int32_t pos = 0;\n    for (int32_t i = 0; i < r->count; ++i) {\n      tokens_temp[i] = pTokens + pos;\n      memcpy(pTokens + pos, result.tokens[i].c_str(), result.tokens[i].size());\n      // +1 to move past the null character\n      pos += result.tokens[i].size() + 1;\n    }\n    r->tokens = pTokens;\n    r->tokens_arr = tokens_temp;\n\n    if (!result.timestamps.empty()) {\n      r->timestamps = new float[result.timestamps.size()];\n      std::copy(result.timestamps.begin(), result.timestamps.end(),\n                r->timestamps);\n    } else {\n      r->timestamps = 
nullptr;\n    }\n\n  } else {\n    r->count = 0;\n    r->timestamps = nullptr;\n    r->tokens = nullptr;\n    r->tokens_arr = nullptr;\n  }\n\n  return r;\n}\n\nvoid SherpaOnnxDestroyKeywordResult(const SherpaOnnxKeywordResult *r) {\n  if (r) {\n    delete[] r->keyword;\n    delete[] r->json;\n    delete[] r->tokens;\n    delete[] r->tokens_arr;\n    delete[] r->timestamps;\n    delete r;\n  }\n}\n\nconst char *SherpaOnnxGetKeywordResultAsJson(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream) {\n  const sherpa_onnx::KeywordResult &result =\n      spotter->impl->GetResult(stream->impl.get());\n\n  std::string json = result.AsJsonString();\n  char *pJson = new char[json.size() + 1];\n  std::copy(json.begin(), json.end(), pJson);\n  pJson[json.size()] = 0;\n  return pJson;\n}\n\nvoid SherpaOnnxFreeKeywordResultJson(const char *s) {\n  if (!s) return;\n  delete[] s;\n}\n\n// ============================================================\n// For VAD\n// ============================================================\n//\nstruct SherpaOnnxCircularBuffer {\n  std::unique_ptr<sherpa_onnx::CircularBuffer> impl;\n};\n\nconst SherpaOnnxCircularBuffer *SherpaOnnxCreateCircularBuffer(\n    int32_t capacity) {\n  SherpaOnnxCircularBuffer *buffer = new SherpaOnnxCircularBuffer;\n  buffer->impl = std::make_unique<sherpa_onnx::CircularBuffer>(capacity);\n  return buffer;\n}\n\nvoid SherpaOnnxDestroyCircularBuffer(const SherpaOnnxCircularBuffer *buffer) {\n  if (!buffer) return;\n  delete buffer;\n}\n\nvoid SherpaOnnxCircularBufferPush(const SherpaOnnxCircularBuffer *buffer,\n                                  const float *p, int32_t n) {\n  buffer->impl->Push(p, n);\n}\n\nconst float *SherpaOnnxCircularBufferGet(const SherpaOnnxCircularBuffer *buffer,\n                                         int32_t start_index, int32_t n) {\n  std::vector<float> v = buffer->impl->Get(start_index, n);\n\n  float *p = new float[n];\n  std::copy(v.begin(), v.end(), 
p);\n  return p;\n}\n\nvoid SherpaOnnxCircularBufferFree(const float *p) {\n  if (!p) return;\n  delete[] p;\n}\n\nvoid SherpaOnnxCircularBufferPop(const SherpaOnnxCircularBuffer *buffer,\n                                 int32_t n) {\n  buffer->impl->Pop(n);\n}\n\nint32_t SherpaOnnxCircularBufferSize(const SherpaOnnxCircularBuffer *buffer) {\n  return buffer->impl->Size();\n}\n\nint32_t SherpaOnnxCircularBufferHead(const SherpaOnnxCircularBuffer *buffer) {\n  return buffer->impl->Head();\n}\n\nvoid SherpaOnnxCircularBufferReset(const SherpaOnnxCircularBuffer *buffer) {\n  buffer->impl->Reset();\n}\n\nstruct SherpaOnnxVoiceActivityDetector {\n  std::unique_ptr<sherpa_onnx::VoiceActivityDetector> impl;\n};\n\nstatic sherpa_onnx::VadModelConfig GetVadModelConfig(\n    const SherpaOnnxVadModelConfig *config) {\n  sherpa_onnx::VadModelConfig vad_config;\n\n  vad_config.silero_vad.model = SHERPA_ONNX_OR(config->silero_vad.model, \"\");\n  vad_config.silero_vad.threshold =\n      SHERPA_ONNX_OR(config->silero_vad.threshold, 0.5);\n\n  vad_config.silero_vad.min_silence_duration =\n      SHERPA_ONNX_OR(config->silero_vad.min_silence_duration, 0.5);\n\n  vad_config.silero_vad.min_speech_duration =\n      SHERPA_ONNX_OR(config->silero_vad.min_speech_duration, 0.25);\n\n  vad_config.silero_vad.window_size =\n      SHERPA_ONNX_OR(config->silero_vad.window_size, 512);\n\n  vad_config.silero_vad.max_speech_duration =\n      SHERPA_ONNX_OR(config->silero_vad.max_speech_duration, 20);\n\n  vad_config.ten_vad.model = SHERPA_ONNX_OR(config->ten_vad.model, \"\");\n  vad_config.ten_vad.threshold = SHERPA_ONNX_OR(config->ten_vad.threshold, 0.5);\n\n  vad_config.ten_vad.min_silence_duration =\n      SHERPA_ONNX_OR(config->ten_vad.min_silence_duration, 0.5);\n\n  vad_config.ten_vad.min_speech_duration =\n      SHERPA_ONNX_OR(config->ten_vad.min_speech_duration, 0.25);\n\n  vad_config.ten_vad.window_size =\n      SHERPA_ONNX_OR(config->ten_vad.window_size, 256);\n\n  
vad_config.ten_vad.max_speech_duration =\n      SHERPA_ONNX_OR(config->ten_vad.max_speech_duration, 20);\n\n  vad_config.sample_rate = SHERPA_ONNX_OR(config->sample_rate, 16000);\n  vad_config.num_threads = SHERPA_ONNX_OR(config->num_threads, 1);\n  vad_config.provider = SHERPA_ONNX_OR(config->provider, \"cpu\");\n  if (vad_config.provider.empty()) {\n    vad_config.provider = \"cpu\";\n  }\n\n  vad_config.debug = config->debug;\n\n  if (vad_config.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", vad_config.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", vad_config.ToString().c_str());\n#endif\n  }\n\n  return vad_config;\n}\n\nconst SherpaOnnxVoiceActivityDetector *SherpaOnnxCreateVoiceActivityDetector(\n    const SherpaOnnxVadModelConfig *config, float buffer_size_in_seconds) {\n  if (!config) {\n    SHERPA_ONNX_LOGE(\"vad config is nullptr\");\n    return nullptr;\n  }\n\n  auto vad_config = GetVadModelConfig(config);\n\n  if (!vad_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxVoiceActivityDetector *p = new SherpaOnnxVoiceActivityDetector;\n  p->impl = std::make_unique<sherpa_onnx::VoiceActivityDetector>(\n      vad_config, buffer_size_in_seconds);\n\n  return p;\n}\n\nvoid SherpaOnnxDestroyVoiceActivityDetector(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) return;\n  delete p;\n}\n\nvoid SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n    const SherpaOnnxVoiceActivityDetector *p, const float *samples, int32_t n) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return;\n  }\n\n  if (!samples) {\n    SHERPA_ONNX_LOGE(\"samples is nullptr\");\n    return;\n  }\n\n  p->impl->AcceptWaveform(samples, n);\n}\n\nint32_t SherpaOnnxVoiceActivityDetectorEmpty(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return 1;  // 1 means it is empty\n  }\n\n  return p->impl->Empty();\n}\n\nint32_t 
SherpaOnnxVoiceActivityDetectorDetected(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return 0;\n  }\n\n  return p->impl->IsSpeechDetected();\n}\n\nvoid SherpaOnnxVoiceActivityDetectorPop(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return;\n  }\n\n  p->impl->Pop();\n}\n\nvoid SherpaOnnxVoiceActivityDetectorClear(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return;\n  }\n\n  p->impl->Clear();\n}\n\nconst SherpaOnnxSpeechSegment *SherpaOnnxVoiceActivityDetectorFront(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return nullptr;\n  }\n\n  if (SherpaOnnxVoiceActivityDetectorEmpty(p)) {\n    return nullptr;\n  }\n\n  const sherpa_onnx::SpeechSegment &segment = p->impl->Front();\n\n  SherpaOnnxSpeechSegment *ans = new SherpaOnnxSpeechSegment;\n  ans->start = segment.start;\n  ans->samples = new float[segment.samples.size()];\n  std::copy(segment.samples.begin(), segment.samples.end(), ans->samples);\n  ans->n = segment.samples.size();\n\n  return ans;\n}\n\nvoid SherpaOnnxDestroySpeechSegment(const SherpaOnnxSpeechSegment *p) {\n  if (p) {\n    delete[] p->samples;\n    delete p;\n  }\n}\n\nvoid SherpaOnnxVoiceActivityDetectorReset(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return;\n  }\n\n  p->impl->Reset();\n}\n\nvoid SherpaOnnxVoiceActivityDetectorFlush(\n    const SherpaOnnxVoiceActivityDetector *p) {\n  if (!p) {\n    SHERPA_ONNX_LOGE(\"vad is nullptr\");\n    return;\n  }\n\n  p->impl->Flush();\n}\n\n#if SHERPA_ONNX_ENABLE_TTS == 1\nstruct SherpaOnnxOfflineTts {\n  std::unique_ptr<sherpa_onnx::OfflineTts> impl;\n};\n\nstatic sherpa_onnx::OfflineTtsConfig GetOfflineTtsConfig(\n    const SherpaOnnxOfflineTtsConfig *config) {\n  
sherpa_onnx::OfflineTtsConfig tts_config;\n\n  // vits\n  tts_config.model.vits.model = SHERPA_ONNX_OR(config->model.vits.model, \"\");\n  tts_config.model.vits.lexicon =\n      SHERPA_ONNX_OR(config->model.vits.lexicon, \"\");\n  tts_config.model.vits.tokens = SHERPA_ONNX_OR(config->model.vits.tokens, \"\");\n  tts_config.model.vits.data_dir =\n      SHERPA_ONNX_OR(config->model.vits.data_dir, \"\");\n  tts_config.model.vits.noise_scale =\n      SHERPA_ONNX_OR(config->model.vits.noise_scale, 0.667);\n  tts_config.model.vits.noise_scale_w =\n      SHERPA_ONNX_OR(config->model.vits.noise_scale_w, 0.8);\n  tts_config.model.vits.length_scale =\n      SHERPA_ONNX_OR(config->model.vits.length_scale, 1.0);\n\n  // matcha\n  tts_config.model.matcha.acoustic_model =\n      SHERPA_ONNX_OR(config->model.matcha.acoustic_model, \"\");\n  tts_config.model.matcha.vocoder =\n      SHERPA_ONNX_OR(config->model.matcha.vocoder, \"\");\n  tts_config.model.matcha.lexicon =\n      SHERPA_ONNX_OR(config->model.matcha.lexicon, \"\");\n  tts_config.model.matcha.tokens =\n      SHERPA_ONNX_OR(config->model.matcha.tokens, \"\");\n  tts_config.model.matcha.data_dir =\n      SHERPA_ONNX_OR(config->model.matcha.data_dir, \"\");\n  tts_config.model.matcha.noise_scale =\n      SHERPA_ONNX_OR(config->model.matcha.noise_scale, 0.667);\n  tts_config.model.matcha.length_scale =\n      SHERPA_ONNX_OR(config->model.matcha.length_scale, 1.0);\n\n  // kokoro\n  tts_config.model.kokoro.model =\n      SHERPA_ONNX_OR(config->model.kokoro.model, \"\");\n  tts_config.model.kokoro.voices =\n      SHERPA_ONNX_OR(config->model.kokoro.voices, \"\");\n  tts_config.model.kokoro.tokens =\n      SHERPA_ONNX_OR(config->model.kokoro.tokens, \"\");\n  tts_config.model.kokoro.data_dir =\n      SHERPA_ONNX_OR(config->model.kokoro.data_dir, \"\");\n  tts_config.model.kokoro.length_scale =\n      SHERPA_ONNX_OR(config->model.kokoro.length_scale, 1.0);\n  tts_config.model.kokoro.lexicon =\n      
SHERPA_ONNX_OR(config->model.kokoro.lexicon, \"\");\n  tts_config.model.kokoro.lang = SHERPA_ONNX_OR(config->model.kokoro.lang, \"\");\n\n  // kitten\n  tts_config.model.kitten.model =\n      SHERPA_ONNX_OR(config->model.kitten.model, \"\");\n  tts_config.model.kitten.voices =\n      SHERPA_ONNX_OR(config->model.kitten.voices, \"\");\n  tts_config.model.kitten.tokens =\n      SHERPA_ONNX_OR(config->model.kitten.tokens, \"\");\n  tts_config.model.kitten.data_dir =\n      SHERPA_ONNX_OR(config->model.kitten.data_dir, \"\");\n  tts_config.model.kitten.length_scale =\n      SHERPA_ONNX_OR(config->model.kitten.length_scale, 1.0);\n\n  // zipvoice\n  tts_config.model.zipvoice.tokens =\n      SHERPA_ONNX_OR(config->model.zipvoice.tokens, \"\");\n  tts_config.model.zipvoice.encoder =\n      SHERPA_ONNX_OR(config->model.zipvoice.encoder, \"\");\n  tts_config.model.zipvoice.decoder =\n      SHERPA_ONNX_OR(config->model.zipvoice.decoder, \"\");\n  tts_config.model.zipvoice.vocoder =\n      SHERPA_ONNX_OR(config->model.zipvoice.vocoder, \"\");\n  tts_config.model.zipvoice.data_dir =\n      SHERPA_ONNX_OR(config->model.zipvoice.data_dir, \"\");\n  tts_config.model.zipvoice.lexicon =\n      SHERPA_ONNX_OR(config->model.zipvoice.lexicon, \"\");\n  tts_config.model.zipvoice.feat_scale =\n      SHERPA_ONNX_OR(config->model.zipvoice.feat_scale, 0.1f);\n  tts_config.model.zipvoice.t_shift =\n      SHERPA_ONNX_OR(config->model.zipvoice.t_shift, 0.5f);\n  tts_config.model.zipvoice.target_rms =\n      SHERPA_ONNX_OR(config->model.zipvoice.target_rms, 0.1f);\n  tts_config.model.zipvoice.guidance_scale =\n      SHERPA_ONNX_OR(config->model.zipvoice.guidance_scale, 1.0f);\n\n  // pocket\n  tts_config.model.pocket.lm_flow =\n      SHERPA_ONNX_OR(config->model.pocket.lm_flow, \"\");\n  tts_config.model.pocket.lm_main =\n      SHERPA_ONNX_OR(config->model.pocket.lm_main, \"\");\n  tts_config.model.pocket.encoder =\n      SHERPA_ONNX_OR(config->model.pocket.encoder, \"\");\n  
tts_config.model.pocket.decoder =\n      SHERPA_ONNX_OR(config->model.pocket.decoder, \"\");\n  tts_config.model.pocket.text_conditioner =\n      SHERPA_ONNX_OR(config->model.pocket.text_conditioner, \"\");\n  tts_config.model.pocket.vocab_json =\n      SHERPA_ONNX_OR(config->model.pocket.vocab_json, \"\");\n  tts_config.model.pocket.token_scores_json =\n      SHERPA_ONNX_OR(config->model.pocket.token_scores_json, \"\");\n  if (config->model.pocket.voice_embedding_cache_capacity >= 0) {\n    tts_config.model.pocket.voice_embedding_cache_capacity =\n        config->model.pocket.voice_embedding_cache_capacity;\n  } else {\n    tts_config.model.pocket.voice_embedding_cache_capacity = 50;\n  }\n\n  // supertonic\n  tts_config.model.supertonic.duration_predictor =\n      SHERPA_ONNX_OR(config->model.supertonic.duration_predictor, \"\");\n  tts_config.model.supertonic.text_encoder =\n      SHERPA_ONNX_OR(config->model.supertonic.text_encoder, \"\");\n  tts_config.model.supertonic.vector_estimator =\n      SHERPA_ONNX_OR(config->model.supertonic.vector_estimator, \"\");\n  tts_config.model.supertonic.vocoder =\n      SHERPA_ONNX_OR(config->model.supertonic.vocoder, \"\");\n  tts_config.model.supertonic.tts_json =\n      SHERPA_ONNX_OR(config->model.supertonic.tts_json, \"\");\n  tts_config.model.supertonic.unicode_indexer =\n      SHERPA_ONNX_OR(config->model.supertonic.unicode_indexer, \"\");\n  tts_config.model.supertonic.voice_style =\n      SHERPA_ONNX_OR(config->model.supertonic.voice_style, \"\");\n\n  tts_config.model.num_threads = SHERPA_ONNX_OR(config->model.num_threads, 1);\n  tts_config.model.debug = config->model.debug;\n  tts_config.model.provider = SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n  if (tts_config.model.provider.empty()) {\n    tts_config.model.provider = \"cpu\";\n  }\n\n  tts_config.rule_fsts = SHERPA_ONNX_OR(config->rule_fsts, \"\");\n  tts_config.rule_fars = SHERPA_ONNX_OR(config->rule_fars, \"\");\n  tts_config.max_num_sentences = 
SHERPA_ONNX_OR(config->max_num_sentences, 1);\n  tts_config.silence_scale = SHERPA_ONNX_OR(config->silence_scale, 0.2);\n\n  if (tts_config.model.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", tts_config.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", tts_config.ToString().c_str());\n#endif\n  }\n\n  return tts_config;\n}\n\nconst SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTts(\n    const SherpaOnnxOfflineTtsConfig *config) {\n  auto tts_config = GetOfflineTtsConfig(config);\n\n  if (!tts_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxOfflineTts *tts = new SherpaOnnxOfflineTts;\n\n  tts->impl = std::make_unique<sherpa_onnx::OfflineTts>(tts_config);\n\n  return tts;\n}\n\nvoid SherpaOnnxDestroyOfflineTts(const SherpaOnnxOfflineTts *tts) {\n  if (!tts) return;\n  delete tts;\n}\n\nint32_t SherpaOnnxOfflineTtsSampleRate(const SherpaOnnxOfflineTts *tts) {\n  return tts->impl->SampleRate();\n}\n\nint32_t SherpaOnnxOfflineTtsNumSpeakers(const SherpaOnnxOfflineTts *tts) {\n  return tts->impl->NumSpeakers();\n}\n\nstatic const SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateInternal(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    std::function<int32_t(const float *, int32_t, float)> callback) {\n  sherpa_onnx::GeneratedAudio audio =\n      tts->impl->Generate(text, sid, speed, callback);\n\n  if (audio.samples.empty()) {\n    return nullptr;\n  }\n\n  SherpaOnnxGeneratedAudio *ans = new SherpaOnnxGeneratedAudio;\n\n  float *samples = new float[audio.samples.size()];\n  std::copy(audio.samples.begin(), audio.samples.end(), samples);\n\n  ans->samples = samples;\n  ans->n = audio.samples.size();\n  ans->sample_rate = audio.sample_rate;\n\n  return ans;\n}\n\nstatic const SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateInternal(\n    const SherpaOnnxOfflineTts *tts, const char *text,\n    const SherpaOnnxGenerationConfig *config,\n    
std::function<int32_t(const float *, int32_t, float)> callback) {\n  sherpa_onnx::GenerationConfig cfg;\n  if (config->reference_audio) {\n    if (config->reference_audio_len <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid reference audio len: %d\",\n                       config->reference_audio_len);\n      return nullptr;\n    }\n\n    cfg.reference_audio.assign(\n        config->reference_audio,\n        config->reference_audio + config->reference_audio_len);\n  }\n\n  cfg.silence_scale = SHERPA_ONNX_OR(config->silence_scale, 0.2);\n  cfg.speed = SHERPA_ONNX_OR(config->speed, 1.0);\n  cfg.sid = config->sid;\n\n  cfg.reference_sample_rate = config->reference_sample_rate;\n\n  cfg.reference_text = SHERPA_ONNX_OR(config->reference_text, \"\");\n  cfg.num_steps = SHERPA_ONNX_OR(config->num_steps, 5);\n\n  if (config->extra && !std::string(config->extra).empty()) {\n    try {\n      auto json = nlohmann::json::parse(config->extra);\n      for (auto &[k, v] : json.items()) {\n        std::string val = v.is_string() ? 
v.get<std::string>() : v.dump();\n        cfg.extra.insert_or_assign(std::string(k), std::move(val));\n      }\n    } catch (const nlohmann::json::parse_error &e) {\n      SHERPA_ONNX_LOGE(\"Failed to parse extra JSON: '%s'\", e.what());\n      SHERPA_ONNX_LOGE(\"Ignore the extra opt\");\n    }\n  }\n\n  sherpa_onnx::GeneratedAudio audio = tts->impl->Generate(text, cfg, callback);\n\n  if (audio.samples.empty()) {\n    return nullptr;\n  }\n\n  SherpaOnnxGeneratedAudio *ans = new SherpaOnnxGeneratedAudio;\n\n  float *samples = new float[audio.samples.size()];\n  std::copy(audio.samples.begin(), audio.samples.end(), samples);\n\n  ans->samples = samples;\n  ans->n = audio.samples.size();\n  ans->sample_rate = audio.sample_rate;\n\n  return ans;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerate(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n    float speed) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed, nullptr);\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallback(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioCallback callback) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (callback) {\n    auto wrapper = [callback](const float *samples, int32_t n,\n                              float /*progress*/) {\n      return callback(samples, n);\n    };\n\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed,\n                                                std::move(wrapper));\n  } else {\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed, nullptr);\n  }\n}\n\nconst 
SherpaOnnxGeneratedAudio *\nSherpaOnnxOfflineTtsGenerateWithProgressCallback(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioProgressCallback callback) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (callback) {\n    auto wrapper = [callback](const float *samples, int32_t n, float progress) {\n      return callback(samples, n, progress);\n    };\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed,\n                                                std::move(wrapper));\n  } else {\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed, nullptr);\n  }\n}\n\nconst SherpaOnnxGeneratedAudio *\nSherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioProgressCallbackWithArg callback, void *arg) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (callback) {\n    auto wrapper = [callback, arg](const float *samples, int32_t n,\n                                   float progress) {\n      return callback(samples, n, progress, arg);\n    };\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed,\n                                                std::move(wrapper));\n  } else {\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed, nullptr);\n  }\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioCallbackWithArg callback, void *arg) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n 
   SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (callback) {\n    auto wrapper = [callback, arg](const float *samples, int32_t n,\n                                   float /*progress*/) {\n      return callback(samples, n, arg);\n    };\n\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed,\n                                                std::move(wrapper));\n  } else {\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, sid, speed, nullptr);\n  }\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithZipvoice(\n    const SherpaOnnxOfflineTts *tts, const char *text, const char *prompt_text,\n    const float *prompt_samples, int32_t n_prompt, int32_t prompt_sr,\n    float speed, int32_t num_steps) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (!prompt_text) {\n    SHERPA_ONNX_LOGE(\"prompt_text is nullptr\");\n    return nullptr;\n  }\n\n  if (!prompt_samples) {\n    SHERPA_ONNX_LOGE(\"prompt_samples is nullptr\");\n    return nullptr;\n  }\n\n  std::string text_s = text;\n  std::string ptext_s = prompt_text;\n\n  std::vector<float> prompt_vec;\n  if (n_prompt > 0) {\n    prompt_vec.assign(prompt_samples,\n                      prompt_samples + static_cast<size_t>(n_prompt));\n  }\n\n  auto out = tts->impl->Generate(text_s, ptext_s, prompt_vec, prompt_sr, speed,\n                                 num_steps,\n                                 /*callback=*/nullptr);\n\n  if (out.samples.empty()) {\n    return nullptr;\n  }\n\n  auto *ans = new SherpaOnnxGeneratedAudio;\n  ans->sample_rate = static_cast<int32_t>(out.sample_rate);\n  ans->n = static_cast<int32_t>(out.samples.size());\n\n  float *buf = new float[out.samples.size()];\n  std::copy(out.samples.begin(), out.samples.end(), buf);\n  ans->samples = buf;\n\n  return ans;\n}\n\nconst SherpaOnnxGeneratedAudio 
*SherpaOnnxOfflineTtsGenerateWithConfig(\n    const SherpaOnnxOfflineTts *tts, const char *text,\n    const SherpaOnnxGenerationConfig *config,\n    SherpaOnnxGeneratedAudioProgressCallbackWithArg callback, void *arg) {\n  if (!tts) {\n    SHERPA_ONNX_LOGE(\"tts is nullptr\");\n    return nullptr;\n  }\n\n  if (!text) {\n    SHERPA_ONNX_LOGE(\"text is nullptr\");\n    return nullptr;\n  }\n\n  if (!config) {\n    SHERPA_ONNX_LOGE(\"config is nullptr\");\n    return nullptr;\n  }\n\n  if (callback) {\n    auto wrapper = [callback, arg](const float *samples, int32_t n,\n                                   float progress) {\n      return callback(samples, n, progress, arg);\n    };\n\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, config,\n                                                std::move(wrapper));\n  } else {\n    return SherpaOnnxOfflineTtsGenerateInternal(tts, text, config, nullptr);\n  }\n}\n\nvoid SherpaOnnxDestroyOfflineTtsGeneratedAudio(\n    const SherpaOnnxGeneratedAudio *p) {\n  if (p) {\n    delete[] p->samples;\n    delete p;\n  }\n}\n#else\nconst SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTts(\n    const SherpaOnnxOfflineTtsConfig *config) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nvoid SherpaOnnxDestroyOfflineTts(const SherpaOnnxOfflineTts *tts) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n}\n\nint32_t SherpaOnnxOfflineTtsSampleRate(const SherpaOnnxOfflineTts *tts) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return 0;\n}\n\nint32_t SherpaOnnxOfflineTtsNumSpeakers(const SherpaOnnxOfflineTts *tts) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return 0;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerate(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n    float speed) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. 
Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallback(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioCallback callback) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *\nSherpaOnnxOfflineTtsGenerateWithProgressCallback(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioProgressCallback callback) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *\nSherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioProgressCallbackWithArg callback, void *arg) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n    const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid, float speed,\n    SherpaOnnxGeneratedAudioCallbackWithArg callback, void *arg) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithZipvoice(\n    const SherpaOnnxOfflineTts *tts, const char *text, const char *prompt_text,\n    const float *prompt_samples, int32_t n_prompt, int32_t prompt_sr,\n    float speed, int32_t num_steps) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. 
Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithConfig(\n    const SherpaOnnxOfflineTts *tts, const char *text,\n    const SherpaOnnxGenerationConfig *config,\n    SherpaOnnxGeneratedAudioProgressCallbackWithArg callback, void *arg) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nvoid SherpaOnnxDestroyOfflineTtsGeneratedAudio(\n    const SherpaOnnxGeneratedAudio *p) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n}\n#endif  // SHERPA_ONNX_ENABLE_TTS == 1\n\nint32_t SherpaOnnxWriteWave(const float *samples, int32_t n,\n                            int32_t sample_rate, const char *filename) {\n  return sherpa_onnx::WriteWave(filename, sample_rate, samples, n);\n}\n\nint64_t SherpaOnnxWaveFileSize(int32_t n_samples) {\n  return sherpa_onnx::WaveFileSize(n_samples);\n}\n\nvoid SherpaOnnxWriteWaveToBuffer(const float *samples, int32_t n,\n                                 int32_t sample_rate, char *buffer) {\n  sherpa_onnx::WriteWave(buffer, sample_rate, samples, n);\n}\n\nconst SherpaOnnxWave *SherpaOnnxReadWave(const char *filename) {\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(filename, &sample_rate, &is_ok);\n  if (!is_ok) {\n    return nullptr;\n  }\n\n  float *c_samples = new float[samples.size()];\n  std::copy(samples.begin(), samples.end(), c_samples);\n\n  SherpaOnnxWave *wave = new SherpaOnnxWave;\n  wave->samples = c_samples;\n  wave->sample_rate = sample_rate;\n  wave->num_samples = samples.size();\n  return wave;\n}\n\nconst SherpaOnnxWave *SherpaOnnxReadWaveFromBinaryData(const char *data,\n                                                       int32_t n) {\n  if (!data || n <= 0) {\n    return nullptr;\n  }\n\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n\n  std::istringstream is(std::string(data, n));\n\n  std::vector<float> samples 
= sherpa_onnx::ReadWave(is, &sample_rate, &is_ok);\n  if (!is_ok) {\n    return nullptr;\n  }\n\n  float *c_samples = new float[samples.size()];\n  std::copy(samples.begin(), samples.end(), c_samples);\n\n  SherpaOnnxWave *wave = new SherpaOnnxWave;\n  wave->samples = c_samples;\n  wave->sample_rate = sample_rate;\n  wave->num_samples = samples.size();\n  return wave;\n}\n\nvoid SherpaOnnxFreeWave(const SherpaOnnxWave *wave) {\n  if (wave) {\n    delete[] wave->samples;\n    delete wave;\n  }\n}\n\nstruct SherpaOnnxSpokenLanguageIdentification {\n  std::unique_ptr<sherpa_onnx::SpokenLanguageIdentification> impl;\n};\n\nconst SherpaOnnxSpokenLanguageIdentification *\nSherpaOnnxCreateSpokenLanguageIdentification(\n    const SherpaOnnxSpokenLanguageIdentificationConfig *config) {\n  sherpa_onnx::SpokenLanguageIdentificationConfig slid_config;\n  slid_config.whisper.encoder = SHERPA_ONNX_OR(config->whisper.encoder, \"\");\n  slid_config.whisper.decoder = SHERPA_ONNX_OR(config->whisper.decoder, \"\");\n  slid_config.whisper.tail_paddings =\n      SHERPA_ONNX_OR(config->whisper.tail_paddings, -1);\n  slid_config.num_threads = SHERPA_ONNX_OR(config->num_threads, 1);\n  slid_config.debug = config->debug;\n  slid_config.provider = SHERPA_ONNX_OR(config->provider, \"cpu\");\n  if (slid_config.provider.empty()) {\n    slid_config.provider = \"cpu\";\n  }\n\n  if (slid_config.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", slid_config.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", slid_config.ToString().c_str());\n#endif\n  }\n\n  if (!slid_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxSpokenLanguageIdentification *slid =\n      new SherpaOnnxSpokenLanguageIdentification;\n  slid->impl =\n      std::make_unique<sherpa_onnx::SpokenLanguageIdentification>(slid_config);\n\n  return slid;\n}\n\nvoid SherpaOnnxDestroySpokenLanguageIdentification(\n    const 
SherpaOnnxSpokenLanguageIdentification *slid) {\n  if (!slid) return;\n  delete slid;\n}\n\nSherpaOnnxOfflineStream *\nSherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(\n    const SherpaOnnxSpokenLanguageIdentification *slid) {\n  SherpaOnnxOfflineStream *stream =\n      new SherpaOnnxOfflineStream(slid->impl->CreateStream());\n  return stream;\n}\n\nconst SherpaOnnxSpokenLanguageIdentificationResult *\nSherpaOnnxSpokenLanguageIdentificationCompute(\n    const SherpaOnnxSpokenLanguageIdentification *slid,\n    const SherpaOnnxOfflineStream *s) {\n  std::string lang = slid->impl->Compute(s->impl.get());\n  char *c_lang = new char[lang.size() + 1];\n  std::copy(lang.begin(), lang.end(), c_lang);\n  c_lang[lang.size()] = '\\0';\n  SherpaOnnxSpokenLanguageIdentificationResult *r =\n      new SherpaOnnxSpokenLanguageIdentificationResult;\n  r->lang = c_lang;\n  return r;\n}\n\nvoid SherpaOnnxDestroySpokenLanguageIdentificationResult(\n    const SherpaOnnxSpokenLanguageIdentificationResult *r) {\n  if (r) {\n    delete[] r->lang;\n    delete r;\n  }\n}\n\nstruct SherpaOnnxSpeakerEmbeddingExtractor {\n  std::unique_ptr<sherpa_onnx::SpeakerEmbeddingExtractor> impl;\n};\n\nstatic sherpa_onnx::SpeakerEmbeddingExtractorConfig\nGetSpeakerEmbeddingExtractorConfig(\n    const SherpaOnnxSpeakerEmbeddingExtractorConfig *config) {\n  sherpa_onnx::SpeakerEmbeddingExtractorConfig c;\n  c.model = SHERPA_ONNX_OR(config->model, \"\");\n\n  c.num_threads = SHERPA_ONNX_OR(config->num_threads, 1);\n  c.debug = config->debug;\n  c.provider = SHERPA_ONNX_OR(config->provider, \"cpu\");\n  if (c.provider.empty()) {\n    c.provider = \"cpu\";\n  }\n\n  if (config->debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", c.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", c.ToString().c_str());\n#endif\n  }\n\n  return c;\n}\n\nconst SherpaOnnxSpeakerEmbeddingExtractor *\nSherpaOnnxCreateSpeakerEmbeddingExtractor(\n    const SherpaOnnxSpeakerEmbeddingExtractorConfig 
*config) {\n  auto c = GetSpeakerEmbeddingExtractorConfig(config);\n\n  if (!c.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config!\");\n    return nullptr;\n  }\n\n  auto p = new SherpaOnnxSpeakerEmbeddingExtractor;\n\n  p->impl = std::make_unique<sherpa_onnx::SpeakerEmbeddingExtractor>(c);\n\n  return p;\n}\n\nvoid SherpaOnnxDestroySpeakerEmbeddingExtractor(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p) {\n  if (!p) return;\n  delete p;\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingExtractorDim(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p) {\n  return p->impl->Dim();\n}\n\nconst SherpaOnnxOnlineStream *SherpaOnnxSpeakerEmbeddingExtractorCreateStream(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p) {\n  SherpaOnnxOnlineStream *stream =\n      new SherpaOnnxOnlineStream(p->impl->CreateStream());\n  return stream;\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingExtractorIsReady(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p,\n    const SherpaOnnxOnlineStream *s) {\n  return p->impl->IsReady(s->impl.get());\n}\n\nconst float *SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p,\n    const SherpaOnnxOnlineStream *s) {\n  std::vector<float> v = p->impl->Compute(s->impl.get());\n  float *ans = new float[v.size()];\n  std::copy(v.begin(), v.end(), ans);\n  return ans;\n}\n\nvoid SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(const float *v) {\n  if (!v) return;\n  delete[] v;\n}\n\nstruct SherpaOnnxSpeakerEmbeddingManager {\n  std::unique_ptr<sherpa_onnx::SpeakerEmbeddingManager> impl;\n};\n\nconst SherpaOnnxSpeakerEmbeddingManager *\nSherpaOnnxCreateSpeakerEmbeddingManager(int32_t dim) {\n  auto p = new SherpaOnnxSpeakerEmbeddingManager;\n  p->impl = std::make_unique<sherpa_onnx::SpeakerEmbeddingManager>(dim);\n  return p;\n}\n\nvoid SherpaOnnxDestroySpeakerEmbeddingManager(\n    const SherpaOnnxSpeakerEmbeddingManager *p) {\n  if (!p) return;\n  delete p;\n}\n\nint32_t 
SherpaOnnxSpeakerEmbeddingManagerAdd(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float *v) {\n  return p->impl->Add(name, v);\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerAddList(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float **v) {\n  int32_t n = 0;\n  auto q = v;\n  while (q && q[0]) {\n    ++n;\n    ++q;\n  }\n\n  if (n == 0) {\n    SHERPA_ONNX_LOGE(\"Empty embedding!\");\n    return 0;\n  }\n\n  std::vector<std::vector<float>> vec(n);\n  int32_t dim = p->impl->Dim();\n\n  for (int32_t i = 0; i != n; ++i) {\n    vec[i] = std::vector<float>(v[i], v[i] + dim);\n  }\n\n  return p->impl->Add(name, vec);\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float *v, int32_t n) {\n  std::vector<std::vector<float>> vec(n);\n\n  int32_t dim = p->impl->Dim();\n\n  for (int32_t i = 0; i != n; ++i, v += dim) {\n    vec[i] = std::vector<float>(v, v + dim);\n  }\n\n  return p->impl->Add(name, vec);\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerRemove(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name) {\n  return p->impl->Remove(name);\n}\n\nconst char *SherpaOnnxSpeakerEmbeddingManagerSearch(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const float *v,\n    float threshold) {\n  auto r = p->impl->Search(v, threshold);\n  if (r.empty()) {\n    return nullptr;\n  }\n\n  char *name = new char[r.size() + 1];\n  std::copy(r.begin(), r.end(), name);\n  name[r.size()] = '\\0';\n\n  return name;\n}\n\nvoid SherpaOnnxSpeakerEmbeddingManagerFreeSearch(const char *name) {\n  if (!name) return;\n  delete[] name;\n}\n\nconst SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult *\nSherpaOnnxSpeakerEmbeddingManagerGetBestMatches(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const float *v, float threshold,\n    int32_t n) {\n  auto matches = p->impl->GetBestMatches(v, threshold, n);\n\n  if 
(matches.empty()) {\n    return nullptr;\n  }\n\n  auto resultMatches =\n      new SherpaOnnxSpeakerEmbeddingManagerSpeakerMatch[matches.size()];\n  for (size_t i = 0; i < matches.size(); ++i) {\n    resultMatches[i].score = matches[i].score;\n\n    char *name = new char[matches[i].name.size() + 1];\n    std::copy(matches[i].name.begin(), matches[i].name.end(), name);\n    name[matches[i].name.size()] = '\\0';\n\n    resultMatches[i].name = name;\n  }\n\n  auto *result = new SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult();\n  result->count = static_cast<int32_t>(matches.size());\n  result->matches = resultMatches;\n\n  return result;\n}\n\nvoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches(\n    const SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult *r) {\n  if (r == nullptr) {\n    return;\n  }\n\n  for (int32_t i = 0; i < r->count; ++i) {\n    delete[] r->matches[i].name;\n  }\n  delete[] r->matches;\n  delete r;\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerVerify(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float *v, float threshold) {\n  return p->impl->Verify(name, v, threshold);\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerContains(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name) {\n  return p->impl->Contains(name);\n}\n\nint32_t SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(\n    const SherpaOnnxSpeakerEmbeddingManager *p) {\n  return p->impl->NumSpeakers();\n}\n\nconst char *const *SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(\n    const SherpaOnnxSpeakerEmbeddingManager *manager) {\n  std::vector<std::string> all_speakers = manager->impl->GetAllSpeakers();\n  int32_t num_speakers = all_speakers.size();\n  char **p = new char *[num_speakers + 1];\n  p[num_speakers] = nullptr;\n\n  int32_t i = 0;\n  for (const auto &name : all_speakers) {\n    p[i] = new char[name.size() + 1];\n    std::copy(name.begin(), name.end(), p[i]);\n    p[i][name.size()] = '\\0';\n\n    i += 1;\n  }\n  return p;\n}\n\nvoid
SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(\n    const char *const *names) {\n  auto p = names;\n\n  while (p && p[0]) {\n    delete[] p[0];\n    ++p;\n  }\n\n  delete[] names;\n}\n\nstruct SherpaOnnxAudioTagging {\n  std::unique_ptr<sherpa_onnx::AudioTagging> impl;\n};\n\nconst SherpaOnnxAudioTagging *SherpaOnnxCreateAudioTagging(\n    const SherpaOnnxAudioTaggingConfig *config) {\n  sherpa_onnx::AudioTaggingConfig ac;\n  ac.model.zipformer.model = SHERPA_ONNX_OR(config->model.zipformer.model, \"\");\n  ac.model.ced = SHERPA_ONNX_OR(config->model.ced, \"\");\n  ac.model.num_threads = SHERPA_ONNX_OR(config->model.num_threads, 1);\n  ac.model.debug = config->model.debug;\n  ac.model.provider = SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n  if (ac.model.provider.empty()) {\n    ac.model.provider = \"cpu\";\n  }\n\n  ac.labels = SHERPA_ONNX_OR(config->labels, \"\");\n  ac.top_k = SHERPA_ONNX_OR(config->top_k, 5);\n\n  if (ac.model.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", ac.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", ac.ToString().c_str());\n#endif\n  }\n\n  if (!ac.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxAudioTagging *tagger = new SherpaOnnxAudioTagging;\n  tagger->impl = std::make_unique<sherpa_onnx::AudioTagging>(ac);\n\n  return tagger;\n}\n\nvoid SherpaOnnxDestroyAudioTagging(const SherpaOnnxAudioTagging *tagger) {\n  if (!tagger) return;\n  delete tagger;\n}\n\nconst SherpaOnnxOfflineStream *SherpaOnnxAudioTaggingCreateOfflineStream(\n    const SherpaOnnxAudioTagging *tagger) {\n  const SherpaOnnxOfflineStream *stream =\n      new SherpaOnnxOfflineStream(tagger->impl->CreateStream());\n  return stream;\n}\n\nconst SherpaOnnxAudioEvent *const *SherpaOnnxAudioTaggingCompute(\n    const SherpaOnnxAudioTagging *tagger, const SherpaOnnxOfflineStream *s,\n    int32_t top_k) {\n  std::vector<sherpa_onnx::AudioEvent> events =\n      
tagger->impl->Compute(s->impl.get(), top_k);\n\n  int32_t n = static_cast<int32_t>(events.size());\n  SherpaOnnxAudioEvent **ans = new SherpaOnnxAudioEvent *[n + 1];\n  ans[n] = nullptr;\n\n  int32_t i = 0;\n  for (const auto &e : events) {\n    SherpaOnnxAudioEvent *p = new SherpaOnnxAudioEvent;\n\n    char *name = new char[e.name.size() + 1];\n    std::copy(e.name.begin(), e.name.end(), name);\n    name[e.name.size()] = 0;\n\n    p->name = name;\n\n    p->index = e.index;\n    p->prob = e.prob;\n\n    ans[i] = p;\n    i += 1;\n  }\n\n  return ans;\n}\n\nvoid SherpaOnnxAudioTaggingFreeResults(\n    const SherpaOnnxAudioEvent *const *events) {\n  auto p = events;\n\n  while (p && *p) {\n    auto e = *p;\n\n    delete[] e->name;\n    delete e;\n\n    ++p;\n  }\n\n  delete[] events;\n}\n\nstruct SherpaOnnxOfflinePunctuation {\n  std::unique_ptr<sherpa_onnx::OfflinePunctuation> impl;\n};\n\nstatic sherpa_onnx::OfflinePunctuationConfig GetOfflinePunctuationConfig(\n    const SherpaOnnxOfflinePunctuationConfig *config) {\n  sherpa_onnx::OfflinePunctuationConfig c;\n  c.model.ct_transformer = SHERPA_ONNX_OR(config->model.ct_transformer, \"\");\n  c.model.num_threads = SHERPA_ONNX_OR(config->model.num_threads, 1);\n  c.model.debug = config->model.debug;\n  c.model.provider = SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n  if (c.model.provider.empty()) {\n    c.model.provider = \"cpu\";\n  }\n\n  if (config->model.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", c.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", c.ToString().c_str());\n#endif\n  }\n\n  return c;\n}\n\nconst SherpaOnnxOfflinePunctuation *SherpaOnnxCreateOfflinePunctuation(\n    const SherpaOnnxOfflinePunctuationConfig *config) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  auto c = GetOfflinePunctuationConfig(config);\n\n  if (!c.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxOfflinePunctuation *punct = new 
SherpaOnnxOfflinePunctuation;\n  punct->impl = std::make_unique<sherpa_onnx::OfflinePunctuation>(c);\n\n  return punct;\n}\n\nvoid SherpaOnnxDestroyOfflinePunctuation(\n    const SherpaOnnxOfflinePunctuation *punct) {\n  if (!punct) return;\n  delete punct;\n}\n\nconst char *SherpaOfflinePunctuationAddPunct(\n    const SherpaOnnxOfflinePunctuation *punct, const char *text) {\n  if (!punct || !text) return nullptr;\n  std::string text_with_punct = punct->impl->AddPunctuation(text);\n\n  char *ans = new char[text_with_punct.size() + 1];\n  std::copy(text_with_punct.begin(), text_with_punct.end(), ans);\n  ans[text_with_punct.size()] = 0;\n\n  return ans;\n}\n\nvoid SherpaOfflinePunctuationFreeText(const char *text) {\n  if (!text) return;\n  delete[] text;\n}\n\nstruct SherpaOnnxOnlinePunctuation {\n  std::unique_ptr<sherpa_onnx::OnlinePunctuation> impl;\n};\n\nstatic sherpa_onnx::OnlinePunctuationConfig GetOnlinePunctuationConfig(\n    const SherpaOnnxOnlinePunctuationConfig *config) {\n  sherpa_onnx::OnlinePunctuationConfig punctuation_config;\n  punctuation_config.model.cnn_bilstm =\n      SHERPA_ONNX_OR(config->model.cnn_bilstm, \"\");\n  punctuation_config.model.bpe_vocab =\n      SHERPA_ONNX_OR(config->model.bpe_vocab, \"\");\n  punctuation_config.model.num_threads =\n      SHERPA_ONNX_OR(config->model.num_threads, 1);\n  punctuation_config.model.debug = config->model.debug;\n  punctuation_config.model.provider =\n      SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n\n  if (config->model.debug) {\n#if __OHOS__\n    auto str_vec = sherpa_onnx::SplitString(punctuation_config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", punctuation_config.ToString().c_str());\n#endif\n  }\n\n  return punctuation_config;\n}\n\nconst SherpaOnnxOnlinePunctuation *SherpaOnnxCreateOnlinePunctuation(\n    const
SherpaOnnxOnlinePunctuationConfig *config) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  auto punctuation_config = GetOnlinePunctuationConfig(config);\n  if (!punctuation_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  auto *p = new SherpaOnnxOnlinePunctuation;\n  p->impl =\n      std::make_unique<sherpa_onnx::OnlinePunctuation>(punctuation_config);\n  return p;\n}\n\nvoid SherpaOnnxDestroyOnlinePunctuation(const SherpaOnnxOnlinePunctuation *p) {\n  if (!p) return;\n  delete p;\n}\n\nconst char *SherpaOnnxOnlinePunctuationAddPunct(\n    const SherpaOnnxOnlinePunctuation *punctuation, const char *text) {\n  if (!punctuation || !text) return nullptr;\n\n  try {\n    std::string s = punctuation->impl->AddPunctuationWithCase(text);\n    char *p = new char[s.size() + 1];\n    std::copy(s.begin(), s.end(), p);\n    p[s.size()] = '\\0';\n    return p;\n  } catch (const std::exception &e) {\n    SHERPA_ONNX_LOGE(\"Failed to add punctuation: %s\", e.what());\n    return nullptr;\n  }\n}\n\nvoid SherpaOnnxOnlinePunctuationFreeText(const char *text) {\n  if (!text) return;\n  delete[] text;\n}\n\nstruct SherpaOnnxLinearResampler {\n  std::unique_ptr<sherpa_onnx::LinearResample> impl;\n};\n\nconst SherpaOnnxLinearResampler *SherpaOnnxCreateLinearResampler(\n    int32_t samp_rate_in_hz, int32_t samp_rate_out_hz, float filter_cutoff_hz,\n    int32_t num_zeros) {\n  SherpaOnnxLinearResampler *p = new SherpaOnnxLinearResampler;\n  p->impl = std::make_unique<sherpa_onnx::LinearResample>(\n      samp_rate_in_hz, samp_rate_out_hz, filter_cutoff_hz, num_zeros);\n\n  return p;\n}\n\nvoid SherpaOnnxDestroyLinearResampler(const SherpaOnnxLinearResampler *p) {\n  if (!p) return;\n  delete p;\n}\n\nconst SherpaOnnxResampleOut *SherpaOnnxLinearResamplerResample(\n    const SherpaOnnxLinearResampler *p, const float *input, int32_t input_dim,\n    int32_t flush) {\n  std::vector<float> o;\n  p->impl->Resample(input, input_dim, 
flush, &o);\n\n  float *s = new float[o.size()];\n  std::copy(o.begin(), o.end(), s);\n\n  SherpaOnnxResampleOut *ans = new SherpaOnnxResampleOut;\n  ans->samples = s;\n  ans->n = static_cast<int32_t>(o.size());\n\n  return ans;\n}\n\nvoid SherpaOnnxLinearResamplerResampleFree(const SherpaOnnxResampleOut *p) {\n  if (!p) return;\n  delete[] p->samples;\n  delete p;\n}\n\nint32_t SherpaOnnxLinearResamplerResampleGetInputSampleRate(\n    const SherpaOnnxLinearResampler *p) {\n  return p->impl->GetInputSamplingRate();\n}\n\nint32_t SherpaOnnxLinearResamplerResampleGetOutputSampleRate(\n    const SherpaOnnxLinearResampler *p) {\n  return p->impl->GetOutputSamplingRate();\n}\n\nvoid SherpaOnnxLinearResamplerReset(const SherpaOnnxLinearResampler *p) {\n  p->impl->Reset();\n}\n\nint32_t SherpaOnnxFileExists(const char *filename) {\n  return sherpa_onnx::FileExists(filename);\n}\n\nstruct SherpaOnnxOfflineSpeechDenoiser {\n  std::unique_ptr<sherpa_onnx::OfflineSpeechDenoiser> impl;\n};\n\nstatic const SherpaOnnxDenoisedAudio *CreateDenoisedAudio(\n    const sherpa_onnx::DenoisedAudio &audio) {\n  auto ans = new SherpaOnnxDenoisedAudio;\n\n  float *denoised_samples = nullptr;\n  if (!audio.samples.empty()) {\n    denoised_samples = new float[audio.samples.size()];\n    std::copy(audio.samples.begin(), audio.samples.end(), denoised_samples);\n  }\n\n  ans->samples = denoised_samples;\n  ans->n = audio.samples.size();\n  ans->sample_rate = audio.sample_rate;\n\n  return ans;\n}\n\nstatic sherpa_onnx::OfflineSpeechDenoiserConfig GetOfflineSpeechDenoiserConfig(\n    const SherpaOnnxOfflineSpeechDenoiserConfig *config) {\n  sherpa_onnx::OfflineSpeechDenoiserConfig c;\n  c.model.gtcrn.model = SHERPA_ONNX_OR(config->model.gtcrn.model, \"\");\n  c.model.num_threads = SHERPA_ONNX_OR(config->model.num_threads, 1);\n  c.model.debug = config->model.debug;\n  c.model.provider = SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n  c.model.dpdfnet.model = 
SHERPA_ONNX_OR(config->model.dpdfnet.model, \"\");\n\n  if (c.model.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", c.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", c.ToString().c_str());\n#endif\n  }\n\n  return c;\n}\n\nconst SherpaOnnxOfflineSpeechDenoiser *SherpaOnnxCreateOfflineSpeechDenoiser(\n    const SherpaOnnxOfflineSpeechDenoiserConfig *config) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  auto sd_config = GetOfflineSpeechDenoiserConfig(config);\n\n  if (!sd_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxOfflineSpeechDenoiser *sd = new SherpaOnnxOfflineSpeechDenoiser;\n\n  sd->impl = std::make_unique<sherpa_onnx::OfflineSpeechDenoiser>(sd_config);\n\n  return sd;\n}\n\nvoid SherpaOnnxDestroyOfflineSpeechDenoiser(\n    const SherpaOnnxOfflineSpeechDenoiser *sd) {\n  if (!sd) return;\n  delete sd;\n}\n\nint32_t SherpaOnnxOfflineSpeechDenoiserGetSampleRate(\n    const SherpaOnnxOfflineSpeechDenoiser *sd) {\n  if (sd == nullptr) {\n    return 0;\n  }\n\n  return sd->impl->GetSampleRate();\n}\n\nconst SherpaOnnxDenoisedAudio *SherpaOnnxOfflineSpeechDenoiserRun(\n    const SherpaOnnxOfflineSpeechDenoiser *sd, const float *samples, int32_t n,\n    int32_t sample_rate) {\n  if (sd == nullptr) {\n    return nullptr;\n  }\n\n  if (samples == nullptr && n > 0) {\n    return nullptr;\n  }\n\n  auto audio = sd->impl->Run(samples, n, sample_rate);\n  return CreateDenoisedAudio(audio);\n}\n\nvoid SherpaOnnxDestroyDenoisedAudio(const SherpaOnnxDenoisedAudio *p) {\n  if (!p) return;\n  delete[] p->samples;\n  delete p;\n}\n\nstruct SherpaOnnxOnlineSpeechDenoiser {\n  std::unique_ptr<sherpa_onnx::OnlineSpeechDenoiser> impl;\n};\n\nstatic sherpa_onnx::OnlineSpeechDenoiserConfig GetOnlineSpeechDenoiserConfig(\n    const SherpaOnnxOnlineSpeechDenoiserConfig *config) {\n  sherpa_onnx::OnlineSpeechDenoiserConfig c;\n  c.model.gtcrn.model = 
SHERPA_ONNX_OR(config->model.gtcrn.model, \"\");\n  c.model.num_threads = SHERPA_ONNX_OR(config->model.num_threads, 1);\n  c.model.debug = config->model.debug;\n  c.model.provider = SHERPA_ONNX_OR(config->model.provider, \"cpu\");\n  c.model.dpdfnet.model = SHERPA_ONNX_OR(config->model.dpdfnet.model, \"\");\n\n  if (c.model.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", c.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", c.ToString().c_str());\n#endif\n  }\n\n  return c;\n}\n\nconst SherpaOnnxOnlineSpeechDenoiser *SherpaOnnxCreateOnlineSpeechDenoiser(\n    const SherpaOnnxOnlineSpeechDenoiserConfig *config) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  auto sd_config = GetOnlineSpeechDenoiserConfig(config);\n\n  if (!sd_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  auto *sd = new SherpaOnnxOnlineSpeechDenoiser;\n  sd->impl = std::make_unique<sherpa_onnx::OnlineSpeechDenoiser>(sd_config);\n  return sd;\n}\n\nvoid SherpaOnnxDestroyOnlineSpeechDenoiser(\n    const SherpaOnnxOnlineSpeechDenoiser *sd) {\n  if (!sd) return;\n  delete sd;\n}\n\nint32_t SherpaOnnxOnlineSpeechDenoiserGetSampleRate(\n    const SherpaOnnxOnlineSpeechDenoiser *sd) {\n  if (sd == nullptr) {\n    return 0;\n  }\n\n  return sd->impl->GetSampleRate();\n}\n\nint32_t SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(\n    const SherpaOnnxOnlineSpeechDenoiser *sd) {\n  if (sd == nullptr) {\n    return 0;\n  }\n\n  return sd->impl->GetFrameShiftInSamples();\n}\n\nconst SherpaOnnxDenoisedAudio *SherpaOnnxOnlineSpeechDenoiserRun(\n    const SherpaOnnxOnlineSpeechDenoiser *sd, const float *samples, int32_t n,\n    int32_t sample_rate) {\n  if (sd == nullptr) {\n    return nullptr;\n  }\n\n  if (samples == nullptr && n > 0) {\n    return nullptr;\n  }\n\n  auto audio = sd->impl->Run(samples, n, sample_rate);\n\n  if (audio.samples.empty()) {\n    return nullptr;\n  }\n\n  return 
CreateDenoisedAudio(audio);\n}\n\nconst SherpaOnnxDenoisedAudio *SherpaOnnxOnlineSpeechDenoiserFlush(\n    const SherpaOnnxOnlineSpeechDenoiser *sd) {\n  if (sd == nullptr) {\n    return nullptr;\n  }\n\n  auto audio = sd->impl->Flush();\n\n  if (audio.samples.empty()) {\n    return nullptr;\n  }\n\n  return CreateDenoisedAudio(audio);\n}\n\nvoid SherpaOnnxOnlineSpeechDenoiserReset(\n    const SherpaOnnxOnlineSpeechDenoiser *sd) {\n  if (sd == nullptr) {\n    return;\n  }\n\n  sd->impl->Reset();\n}\n\n#if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\n\nstruct SherpaOnnxOfflineSpeakerDiarization {\n  std::unique_ptr<sherpa_onnx::OfflineSpeakerDiarization> impl;\n};\n\nstruct SherpaOnnxOfflineSpeakerDiarizationResult {\n  sherpa_onnx::OfflineSpeakerDiarizationResult impl;\n};\n\nstatic sherpa_onnx::OfflineSpeakerDiarizationConfig\nGetOfflineSpeakerDiarizationConfig(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config) {\n  sherpa_onnx::OfflineSpeakerDiarizationConfig sd_config;\n\n  sd_config.segmentation.pyannote.model =\n      SHERPA_ONNX_OR(config->segmentation.pyannote.model, \"\");\n  sd_config.segmentation.num_threads =\n      SHERPA_ONNX_OR(config->segmentation.num_threads, 1);\n  sd_config.segmentation.debug = config->segmentation.debug;\n  sd_config.segmentation.provider =\n      SHERPA_ONNX_OR(config->segmentation.provider, \"cpu\");\n  if (sd_config.segmentation.provider.empty()) {\n    sd_config.segmentation.provider = \"cpu\";\n  }\n\n  sd_config.embedding.model = SHERPA_ONNX_OR(config->embedding.model, \"\");\n  sd_config.embedding.num_threads =\n      SHERPA_ONNX_OR(config->embedding.num_threads, 1);\n  sd_config.embedding.debug = config->embedding.debug;\n  sd_config.embedding.provider =\n      SHERPA_ONNX_OR(config->embedding.provider, \"cpu\");\n  if (sd_config.embedding.provider.empty()) {\n    sd_config.embedding.provider = \"cpu\";\n  }\n\n  sd_config.clustering.num_clusters =\n      SHERPA_ONNX_OR(config->clustering.num_clusters, 
-1);\n\n  sd_config.clustering.threshold =\n      SHERPA_ONNX_OR(config->clustering.threshold, 0.5);\n\n  sd_config.min_duration_on = SHERPA_ONNX_OR(config->min_duration_on, 0.3);\n\n  sd_config.min_duration_off = SHERPA_ONNX_OR(config->min_duration_off, 0.5);\n\n  if (sd_config.segmentation.debug || sd_config.embedding.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", sd_config.ToString().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", sd_config.ToString().c_str());\n#endif\n  }\n\n  return sd_config;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config) {\n  auto sd_config = GetOfflineSpeakerDiarizationConfig(config);\n\n  if (!sd_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in config\");\n    return nullptr;\n  }\n\n  SherpaOnnxOfflineSpeakerDiarization *sd =\n      new SherpaOnnxOfflineSpeakerDiarization;\n\n  sd->impl =\n      std::make_unique<sherpa_onnx::OfflineSpeakerDiarization>(sd_config);\n\n  return sd;\n}\n\nvoid SherpaOnnxDestroyOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarization *sd) {\n  if (!sd) return;\n  delete sd;\n}\n\nint32_t SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(\n    const SherpaOnnxOfflineSpeakerDiarization *sd) {\n  return sd->impl->SampleRate();\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationSetConfig(\n    const SherpaOnnxOfflineSpeakerDiarization *sd,\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config) {\n  sherpa_onnx::OfflineSpeakerDiarizationConfig sd_config;\n\n  sd_config.clustering.num_clusters =\n      SHERPA_ONNX_OR(config->clustering.num_clusters, -1);\n\n  sd_config.clustering.threshold =\n      SHERPA_ONNX_OR(config->clustering.threshold, 0.5);\n\n  sd->impl->SetConfig(sd_config);\n}\n\nint32_t SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  return r->impl.NumSpeakers();\n}\n\nint32_t 
SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  return r->impl.NumSegments();\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationSegment *\nSherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  if (r->impl.NumSegments() == 0) {\n    return nullptr;\n  }\n\n  auto segments = r->impl.SortByStartTime();\n\n  int32_t n = segments.size();\n  SherpaOnnxOfflineSpeakerDiarizationSegment *ans =\n      new SherpaOnnxOfflineSpeakerDiarizationSegment[n];\n\n  for (int32_t i = 0; i != n; ++i) {\n    const auto &s = segments[i];\n\n    ans[i].start = s.Start();\n    ans[i].end = s.End();\n    ans[i].speaker = s.Speaker();\n  }\n\n  return ans;\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationDestroySegment(\n    const SherpaOnnxOfflineSpeakerDiarizationSegment *s) {\n  if (!s) return;\n  delete[] s;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcess(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n) {\n  auto ans = new SherpaOnnxOfflineSpeakerDiarizationResult;\n  ans->impl = sd->impl->Process(samples, n);\n\n  return ans;\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationDestroyResult(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  if (!r) return;\n  delete r;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n, SherpaOnnxOfflineSpeakerDiarizationProgressCallback callback,\n    void *arg) {\n  auto ans = new SherpaOnnxOfflineSpeakerDiarizationResult;\n  ans->impl = sd->impl->Process(samples, n, callback, arg);\n\n  return ans;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float 
*samples,\n    int32_t n,\n    SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg callback) {\n  auto wrapper = [callback](int32_t num_processed_chunks,\n                            int32_t num_total_chunks, void *) {\n    return callback(num_processed_chunks, num_total_chunks);\n  };\n\n  auto ans = new SherpaOnnxOfflineSpeakerDiarizationResult;\n  ans->impl = sd->impl->Process(samples, n, wrapper);\n\n  return ans;\n}\n#else\n\nconst SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nvoid SherpaOnnxDestroyOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarization *sd) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n}\n\nint32_t SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(\n    const SherpaOnnxOfflineSpeakerDiarization *sd) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return 0;\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationSetConfig(\n    const SherpaOnnxOfflineSpeakerDiarization *sd,\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n}\n\nint32_t SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return 0;\n}\n\nint32_t SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. 
Please rebuild sherpa-onnx\");\n  return 0;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationSegment *\nSherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationDestroySegment(\n    const SherpaOnnxOfflineSpeakerDiarizationSegment *s) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcess(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n, SherpaOnnxOfflineSpeakerDiarizationProgressCallback callback,\n    void *arg) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nconst SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n,\n    SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg callback) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\nvoid SherpaOnnxOfflineSpeakerDiarizationDestroyResult(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. 
Please rebuild sherpa-onnx\");\n}\n\n#endif\n\n#ifdef __OHOS__\n\nconst SherpaOnnxOfflineSpeechDenoiser *\nSherpaOnnxCreateOfflineSpeechDenoiserOHOS(\n    const SherpaOnnxOfflineSpeechDenoiserConfig *config,\n    NativeResourceManager *mgr) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  if (!mgr) {\n    return SherpaOnnxCreateOfflineSpeechDenoiser(config);\n  }\n\n  auto sd_config = GetOfflineSpeechDenoiserConfig(config);\n\n  SherpaOnnxOfflineSpeechDenoiser *sd = new SherpaOnnxOfflineSpeechDenoiser;\n\n  sd->impl =\n      std::make_unique<sherpa_onnx::OfflineSpeechDenoiser>(mgr, sd_config);\n\n  return sd;\n}\n\nconst SherpaOnnxOnlineSpeechDenoiser *SherpaOnnxCreateOnlineSpeechDenoiserOHOS(\n    const SherpaOnnxOnlineSpeechDenoiserConfig *config,\n    NativeResourceManager *mgr) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  if (mgr == nullptr) {\n    return SherpaOnnxCreateOnlineSpeechDenoiser(config);\n  }\n\n  auto sd_config = GetOnlineSpeechDenoiserConfig(config);\n\n  auto *sd = new SherpaOnnxOnlineSpeechDenoiser;\n  sd->impl =\n      std::make_unique<sherpa_onnx::OnlineSpeechDenoiser>(mgr, sd_config);\n\n  return sd;\n}\n\nconst SherpaOnnxOnlineRecognizer *SherpaOnnxCreateOnlineRecognizerOHOS(\n    const SherpaOnnxOnlineRecognizerConfig *config,\n    NativeResourceManager *mgr) {\n  if (!mgr) {\n    return SherpaOnnxCreateOnlineRecognizer(config);\n  }\n\n  sherpa_onnx::OnlineRecognizerConfig recognizer_config =\n      GetOnlineRecognizerConfig(config);\n\n  SherpaOnnxOnlineRecognizer *recognizer = new SherpaOnnxOnlineRecognizer;\n\n  recognizer->impl =\n      std::make_unique<sherpa_onnx::OnlineRecognizer>(mgr, recognizer_config);\n\n  return recognizer;\n}\n\nconst SherpaOnnxOnlinePunctuation *SherpaOnnxCreateOnlinePunctuationOHOS(\n    const SherpaOnnxOnlinePunctuationConfig *config,\n    NativeResourceManager *mgr) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  if (mgr == nullptr) {\n    return 
SherpaOnnxCreateOnlinePunctuation(config);\n  }\n\n  auto punctuation_config = GetOnlinePunctuationConfig(config);\n  auto *p = new SherpaOnnxOnlinePunctuation;\n  p->impl =\n      std::make_unique<sherpa_onnx::OnlinePunctuation>(mgr, punctuation_config);\n  return p;\n}\n\nconst SherpaOnnxOfflineRecognizer *SherpaOnnxCreateOfflineRecognizerOHOS(\n    const SherpaOnnxOfflineRecognizerConfig *config,\n    NativeResourceManager *mgr) {\n  if (mgr == nullptr) {\n    return SherpaOnnxCreateOfflineRecognizer(config);\n  }\n\n  sherpa_onnx::OfflineRecognizerConfig recognizer_config =\n      GetOfflineRecognizerConfig(config);\n\n  SherpaOnnxOfflineRecognizer *recognizer = new SherpaOnnxOfflineRecognizer;\n\n  recognizer->impl =\n      std::make_unique<sherpa_onnx::OfflineRecognizer>(mgr, recognizer_config);\n\n  return recognizer;\n}\n\nconst SherpaOnnxVoiceActivityDetector *\nSherpaOnnxCreateVoiceActivityDetectorOHOS(\n    const SherpaOnnxVadModelConfig *config, float buffer_size_in_seconds,\n    NativeResourceManager *mgr) {\n  if (mgr == nullptr) {\n    return SherpaOnnxCreateVoiceActivityDetector(config,\n                                                 buffer_size_in_seconds);\n  }\n\n  auto vad_config = GetVadModelConfig(config);\n\n  SherpaOnnxVoiceActivityDetector *p = new SherpaOnnxVoiceActivityDetector;\n  p->impl = std::make_unique<sherpa_onnx::VoiceActivityDetector>(\n      mgr, vad_config, buffer_size_in_seconds);\n\n  return p;\n}\n\nconst SherpaOnnxSpeakerEmbeddingExtractor *\nSherpaOnnxCreateSpeakerEmbeddingExtractorOHOS(\n    const SherpaOnnxSpeakerEmbeddingExtractorConfig *config,\n    NativeResourceManager *mgr) {\n  if (!mgr) {\n    return SherpaOnnxCreateSpeakerEmbeddingExtractor(config);\n  }\n\n  auto c = GetSpeakerEmbeddingExtractorConfig(config);\n\n  auto p = new SherpaOnnxSpeakerEmbeddingExtractor;\n\n  p->impl = std::make_unique<sherpa_onnx::SpeakerEmbeddingExtractor>(mgr, c);\n\n  return p;\n}\n\nconst SherpaOnnxKeywordSpotter 
*SherpaOnnxCreateKeywordSpotterOHOS(\n    const SherpaOnnxKeywordSpotterConfig *config, NativeResourceManager *mgr) {\n  if (!mgr) {\n    return SherpaOnnxCreateKeywordSpotter(config);\n  }\n\n  auto spotter_config = GetKeywordSpotterConfig(config);\n\n  SherpaOnnxKeywordSpotter *spotter = new SherpaOnnxKeywordSpotter;\n\n  spotter->impl =\n      std::make_unique<sherpa_onnx::KeywordSpotter>(mgr, spotter_config);\n\n  return spotter;\n}\n\n#if SHERPA_ONNX_ENABLE_TTS == 1\nconst SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTtsOHOS(\n    const SherpaOnnxOfflineTtsConfig *config, NativeResourceManager *mgr) {\n  if (!mgr) {\n    return SherpaOnnxCreateOfflineTts(config);\n  }\n\n  auto tts_config = GetOfflineTtsConfig(config);\n\n  SherpaOnnxOfflineTts *tts = new SherpaOnnxOfflineTts;\n\n  tts->impl = std::make_unique<sherpa_onnx::OfflineTts>(mgr, tts_config);\n\n  return tts;\n}\n#else\nconst SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTtsOHOS(\n    const SherpaOnnxOfflineTtsConfig *config, NativeResourceManager *mgr) {\n  SHERPA_ONNX_LOGE(\"TTS is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n#endif  // #if SHERPA_ONNX_ENABLE_TTS == 1\n\nconst SherpaOnnxOfflinePunctuation *SherpaOnnxCreateOfflinePunctuationOHOS(\n    const SherpaOnnxOfflinePunctuationConfig *config,\n    NativeResourceManager *mgr) {\n  if (config == nullptr) {\n    return nullptr;\n  }\n\n  if (!mgr) {\n    return SherpaOnnxCreateOfflinePunctuation(config);\n  }\n\n  auto c = GetOfflinePunctuationConfig(config);\n  if (c.model.ct_transformer.empty()) {\n    SHERPA_ONNX_LOGE(\"Please specify a punctuation model! 
Return a null pointer\");\n    return nullptr;\n  }\n\n  auto *punct = new SherpaOnnxOfflinePunctuation;\n  punct->impl = std::make_unique<sherpa_onnx::OfflinePunctuation>(mgr, c);\n\n  return punct;\n}\n\n#if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\nconst SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarizationOHOS(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config,\n    NativeResourceManager *mgr) {\n  if (!mgr) {\n    return SherpaOnnxCreateOfflineSpeakerDiarization(config);\n  }\n\n  auto sd_config = GetOfflineSpeakerDiarizationConfig(config);\n\n  SherpaOnnxOfflineSpeakerDiarization *sd =\n      new SherpaOnnxOfflineSpeakerDiarization;\n\n  sd->impl =\n      std::make_unique<sherpa_onnx::OfflineSpeakerDiarization>(mgr, sd_config);\n\n  return sd;\n}\n#else\n\nconst SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarizationOHOS(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config,\n    NativeResourceManager *mgr) {\n  SHERPA_ONNX_LOGE(\n      \"Speaker diarization is not enabled. Please rebuild sherpa-onnx\");\n  return nullptr;\n}\n\n#endif  // #if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\n\n#endif  // #ifdef __OHOS__\n"
  },
  {
    "path": "sherpa-onnx/c-api/c-api.h",
    "content": "// sherpa-onnx/c-api/c-api.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n/**\n * @file c-api.h\n * @brief Public C API for sherpa-onnx.\n *\n * This header exposes the main sherpa-onnx inference features through a stable\n * C interface. It is intended for native C/C++ applications and for language\n * bindings that need a C ABI.\n *\n * The file is organized by feature family. The major API groups are:\n *\n * - Utility helpers: version/build information, file checks, WAVE I/O, and a\n *   display helper for incremental text output\n * - Streaming ASR: online recognizers, online streams, endpointing, and\n *   per-stream runtime options\n * - Non-streaming ASR: offline recognizers, offline streams, batch decode, and\n *   result retrieval\n * - Keyword spotting: streaming keyword detection, custom keyword streams, and\n *   keyword result snapshots\n * - Voice activity detection: Silero/Ten VAD models, speech segment buffers,\n *   and detector state management\n * - Text-to-speech: offline TTS model families, generation configuration, and\n *   generated audio helpers\n * - Spoken language identification\n * - Speaker embedding extraction and speaker enrollment/search/verification\n * - Audio tagging\n * - Offline and online punctuation restoration\n * - Linear resampling\n * - Offline speaker diarization\n * - Offline and online speech enhancement / denoising\n * - HarmonyOS-specific constructor variants\n *\n * Common ownership rules:\n *\n * - Opaque handles created by `SherpaOnnxCreate*()` functions are generally\n *   destroyed with a matching `SherpaOnnxDestroy*()` function\n * - Snapshot/result objects returned by query functions usually need explicit\n *   destruction as documented on each API\n * - Strings or arrays returned by helper/query functions are either:\n *   - statically owned by the library and must not be freed, or\n *   - heap-allocated for the caller and must be released with the matching\n *     `Free`/`Destroy` API\n 
*\n * General usage pattern:\n *\n * 1. Zero-initialize a config struct with `memset(&config, 0, sizeof(config))`\n * 2. Fill in the required model paths and runtime options\n * 3. Create the corresponding engine with `SherpaOnnxCreate*()`\n * 4. Create a stream if the feature uses one\n * 5. Feed audio or text, run the compute/decode API, and retrieve results\n * 6. Release every returned object with the documented matching API\n *\n * The examples in `c-api-examples/` show complete end-to-end usage. Useful\n * starting points include:\n *\n * - `decode-file-c-api.c` for ASR\n * - `kws-c-api.c` for keyword spotting\n * - `vad-whisper-c-api.c` for VAD\n * - `offline-tts-c-api.c` and `kokoro-tts-en-c-api.c` for TTS\n * - `speaker-identification-c-api.c` for speaker embedding and verification\n * - `audio-tagging-c-api.c` for audio tagging\n * - `add-punctuation-c-api.c` and `add-punctuation-online-c-api.c` for\n *   punctuation\n * - `offline-sepaker-diarization-c-api.c` for diarization\n * - `speech-enhancement-gtcrn-c-api.c` and\n *   `online-speech-enhancement-gtcrn-c-api.c` for speech enhancement\n */\n\n#ifndef SHERPA_ONNX_C_API_C_API_H_\n#define SHERPA_ONNX_C_API_C_API_H_\n\n#include <stdint.h>\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\n// See https://github.com/pytorch/pytorch/blob/main/c10/macros/Export.h\n// We will set SHERPA_ONNX_BUILD_SHARED_LIBS and SHERPA_ONNX_BUILD_MAIN_LIB in\n// CMakeLists.txt\n\n#if defined(__GNUC__)\n#pragma GCC diagnostic push\n#pragma GCC diagnostic ignored \"-Wattributes\"\n#endif\n\n#if defined(_WIN32)\n#if defined(SHERPA_ONNX_BUILD_SHARED_LIBS)\n#define SHERPA_ONNX_EXPORT __declspec(dllexport)\n#define SHERPA_ONNX_IMPORT __declspec(dllimport)\n#else\n#define SHERPA_ONNX_EXPORT\n#define SHERPA_ONNX_IMPORT\n#endif\n#else  // WIN32\n#define SHERPA_ONNX_EXPORT __attribute__((visibility(\"default\")))\n\n#define SHERPA_ONNX_IMPORT SHERPA_ONNX_EXPORT\n#endif  // WIN32\n\n#if defined(SHERPA_ONNX_BUILD_MAIN_LIB)\n#define 
SHERPA_ONNX_API SHERPA_ONNX_EXPORT\n#else\n#define SHERPA_ONNX_API SHERPA_ONNX_IMPORT\n#endif\n\n#ifndef SHERPA_ONNX_DEPRECATED\n#if defined(_MSC_VER)\n#define SHERPA_ONNX_DEPRECATED(msg) __declspec(deprecated(msg))\n#elif defined(__GNUC__) || defined(__clang__)\n#define SHERPA_ONNX_DEPRECATED(msg) __attribute__((deprecated(msg)))\n#else\n#define SHERPA_ONNX_DEPRECATED(msg)\n#endif\n#endif\n\n/**\n * @brief Return the sherpa-onnx version string.\n *\n * The returned pointer refers to statically allocated memory owned by the\n * library. Do not free it and do not modify it.\n *\n * @return Version string, for example `\"1.12.1\"`.\n *\n * @code\n * printf(\"sherpa-onnx version: %s\\n\", SherpaOnnxGetVersionStr());\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetVersionStr();\n\n/**\n * @brief Return the Git SHA1 used to build the library.\n *\n * The returned pointer refers to statically allocated memory owned by the\n * library. Do not free it and do not modify it.\n *\n * @return Short Git SHA1 string, for example `\"6982b86c\"`.\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetGitSha1();\n\n/**\n * @brief Return the Git build date used to build the library.\n *\n * The returned pointer refers to statically allocated memory owned by the\n * library. 
Do not free it and do not modify it.\n *\n * @return Build date string, for example `\"Fri Jun 20 11:22:52 2025\"`.\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetGitDate();\n\n/**\n * @brief Check whether a file exists.\n *\n * @param filename File path to test.\n * @return 1 if the file exists; otherwise 0.\n *\n * @code\n * if (!SherpaOnnxFileExists(\"./Obama.wav\")) {\n *   fprintf(stderr, \"Please download Obama.wav\\n\");\n * }\n * @endcode\n */\nSHERPA_ONNX_API int32_t SherpaOnnxFileExists(const char *filename);\n\n/**\n * @brief Configuration for a streaming transducer model.\n *\n * Please refer to\n * https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n * to download compatible pre-trained models.\n */\ntypedef struct SherpaOnnxOnlineTransducerModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. */\n  const char *decoder;\n  /** Path to the joiner ONNX model. */\n  const char *joiner;\n} SherpaOnnxOnlineTransducerModelConfig;\n\n/**\n * @brief Configuration for a streaming Paraformer model.\n *\n * Please visit\n * https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\n * to download compatible models.\n */\ntypedef struct SherpaOnnxOnlineParaformerModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. */\n  const char *decoder;\n} SherpaOnnxOnlineParaformerModelConfig;\n\n/**\n * @brief Configuration for a streaming Zipformer2 CTC model.\n */\ntypedef struct SherpaOnnxOnlineZipformer2CtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOnlineZipformer2CtcModelConfig;\n\n/** @brief Configuration for a streaming NeMo CTC model. */\ntypedef struct SherpaOnnxOnlineNemoCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOnlineNemoCtcModelConfig;\n\n/** @brief Configuration for a streaming T-One CTC model. 
*/\ntypedef struct SherpaOnnxOnlineToneCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOnlineToneCtcModelConfig;\n\n/**\n * @brief Model configuration shared by streaming ASR recognizers.\n *\n * Zero-initialize this struct before use, then fill in the sub-config for the\n * model family you want to use together with the shared fields such as\n * @c tokens, @c provider, and @c num_threads.\n *\n * Exactly one model family should be configured for each recognizer. For\n * example, set only one of @c transducer, @c paraformer, @c zipformer2_ctc,\n * @c nemo_ctc, or @c t_one_ctc.\n *\n * If multiple model families are configured at the same time, the\n * implementation will choose one of them, and which one is used is\n * implementation-defined. Do not rely on any precedence rule.\n */\ntypedef struct SherpaOnnxOnlineModelConfig {\n  /** Streaming transducer model files. */\n  SherpaOnnxOnlineTransducerModelConfig transducer;\n  /** Streaming Paraformer model files. */\n  SherpaOnnxOnlineParaformerModelConfig paraformer;\n  /** Streaming Zipformer2 CTC model files. */\n  SherpaOnnxOnlineZipformer2CtcModelConfig zipformer2_ctc;\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Number of threads used by the ONNX Runtime backend. */\n  int32_t num_threads;\n  /** Execution provider, for example \"cpu\", \"cuda\", or \"coreml\". */\n  const char *provider;\n  /** Non-zero to print model debug information. */\n  int32_t debug;\n  /** Optional explicit model type override. */\n  const char *model_type;\n  /**\n   * Modeling unit used by the tokens.\n   *\n   * Valid values include:\n   * - \"cjkchar\"\n   * - \"bpe\"\n   * - \"cjkchar+bpe\"\n   */\n  const char *modeling_unit;\n  /** Path to the BPE vocabulary file when BPE is used. */\n  const char *bpe_vocab;\n  /** Optional in-memory tokens data. 
Used instead of @c tokens when non-NULL.\n   */\n  const char *tokens_buf;\n  /** Size in bytes of @c tokens_buf, excluding the trailing '\\0'. */\n  int32_t tokens_buf_size;\n  /** Streaming NeMo CTC model files. */\n  SherpaOnnxOnlineNemoCtcModelConfig nemo_ctc;\n  /** Streaming T-One CTC model files. */\n  SherpaOnnxOnlineToneCtcModelConfig t_one_ctc;\n} SherpaOnnxOnlineModelConfig;\n\n/**\n * @brief Feature extraction settings for ASR.\n *\n * The bundled ASR models typically expect 16 kHz mono audio and 80-bin\n * features.\n */\ntypedef struct SherpaOnnxFeatureConfig {\n  /** Sample rate expected by the model, for example 16000. */\n  int32_t sample_rate;\n\n  /** Feature dimension expected by the model, for example 80. */\n  int32_t feature_dim;\n} SherpaOnnxFeatureConfig;\n\n/** @brief Configuration for HLG/FST-based online CTC decoding. */\ntypedef struct SherpaOnnxOnlineCtcFstDecoderConfig {\n  /** Path to the decoding graph. */\n  const char *graph;\n  /** Decoder max-active setting. */\n  int32_t max_active;\n} SherpaOnnxOnlineCtcFstDecoderConfig;\n\n/** @brief Configuration for homophone replacement. */\ntypedef struct SherpaOnnxHomophoneReplacerConfig {\n  /** Unused legacy field kept for ABI compatibility. */\n  const char *dict_dir;\n  /** Path to the lexicon used by the homophone replacer. */\n  const char *lexicon;\n  /** Path to the replacement rule FST file. */\n  const char *rule_fsts;\n} SherpaOnnxHomophoneReplacerConfig;\n\n/**\n * @brief Configuration for a streaming ASR recognizer.\n *\n * Zero-initialize this struct before use. 
Then fill in @c feat_config,\n * @c model_config, and any optional decoding, endpoint, or hotword settings.\n *\n * Example model package:\n * `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`\n *\n * @code\n * SherpaOnnxOnlineRecognizerConfig config;\n * memset(&config, 0, sizeof(config));\n *\n * config.feat_config.sample_rate = 16000;\n * config.feat_config.feature_dim = 80;\n *\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"encoder-epoch-99-avg-1.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"decoder-epoch-99-avg-1.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"joiner-epoch-99-avg-1.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"tokens.txt\";\n * config.model_config.provider = \"cpu\";\n * config.model_config.num_threads = 1;\n *\n * config.decoding_method = \"greedy_search\";\n * @endcode\n */\ntypedef struct SherpaOnnxOnlineRecognizerConfig {\n  /** Feature extraction settings. */\n  SherpaOnnxFeatureConfig feat_config;\n  /** Streaming model configuration. */\n  SherpaOnnxOnlineModelConfig model_config;\n\n  /** Decoding method, for example \"greedy_search\" or \"modified_beam_search\". */\n  const char *decoding_method;\n\n  /** Number of active paths for modified beam search. */\n  int32_t max_active_paths;\n\n  /** Set to non-zero to enable endpoint detection. */\n  int32_t enable_endpoint;\n\n  /** Endpoint rule 1 trailing silence threshold in seconds. */\n  float rule1_min_trailing_silence;\n\n  /** Endpoint rule 2 trailing silence threshold in seconds. */\n  float rule2_min_trailing_silence;\n\n  /** Endpoint rule 3 utterance-length threshold in seconds. 
*/\n  float rule3_min_utterance_length;\n\n  /** Path to a hotwords file. */\n  const char *hotwords_file;\n\n  /** Bonus score added to each hotword token during decoding. */\n  float hotwords_score;\n\n  /** Optional HLG/FST online CTC decoder configuration. */\n  SherpaOnnxOnlineCtcFstDecoderConfig ctc_fst_decoder_config;\n  /** Path to punctuation or text-processing rule FSTs. */\n  const char *rule_fsts;\n  /** Path to FAR archives used by text-processing rules. */\n  const char *rule_fars;\n  /** Optional blank penalty applied during decoding. */\n  float blank_penalty;\n\n  /** Optional in-memory hotwords text used instead of @c hotwords_file. */\n  const char *hotwords_buf;\n  /** Size in bytes of @c hotwords_buf, excluding the trailing '\\0'. */\n  int32_t hotwords_buf_size;\n  /** Optional homophone replacement configuration. */\n  SherpaOnnxHomophoneReplacerConfig hr;\n} SherpaOnnxOnlineRecognizerConfig;\n\n/**\n * @brief Incremental recognition result for a streaming ASR stream.\n *\n * All pointers in this struct are owned by the result object returned from\n * SherpaOnnxGetOnlineStreamResult() and become invalid after\n * SherpaOnnxDestroyOnlineRecognizerResult() is called.\n */\ntypedef struct SherpaOnnxOnlineRecognizerResult {\n  /** Recognized text accumulated so far. */\n  const char *text;\n\n  /**\n   * Contiguous memory block containing token strings separated by '\\0'.\n   *\n   * Use @c tokens_arr for convenient indexed access.\n   */\n  const char *tokens;\n\n  /** Array of @c count pointers into @c tokens. */\n  const char *const *tokens_arr;\n\n  /**\n   * Optional token timestamps in seconds.\n   *\n   * This field may be NULL when the model does not provide timestamps.\n   * When non-NULL, it contains @c count entries and is parallel to\n   * @c tokens_arr.\n   */\n  float *timestamps;\n\n  /** Number of entries in @c tokens_arr and, when available, @c timestamps. */\n  int32_t count;\n\n  /** JSON serialization of the result. 
*/\n  const char *json;\n} SherpaOnnxOnlineRecognizerResult;\n\n/** @brief Streaming recognizer handle. */\ntypedef struct SherpaOnnxOnlineRecognizer SherpaOnnxOnlineRecognizer;\n/** @brief Streaming decoding state for one utterance or stream. */\ntypedef struct SherpaOnnxOnlineStream SherpaOnnxOnlineStream;\n\n/**\n * @brief Create a streaming ASR recognizer.\n *\n * The returned recognizer runs locally and does not require Internet access.\n *\n * @param config Recognizer configuration.\n * @return A recognizer handle on success, or NULL if the configuration is\n *         invalid. The caller owns the returned object and must free it with\n *         SherpaOnnxDestroyOnlineRecognizer().\n *\n * @code\n * SherpaOnnxOnlineRecognizerConfig config;\n * memset(&config, 0, sizeof(config));\n * config.feat_config.sample_rate = 16000;\n * config.feat_config.feature_dim = 80;\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"encoder-epoch-99-avg-1.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"decoder-epoch-99-avg-1.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"joiner-epoch-99-avg-1.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"tokens.txt\";\n * config.model_config.provider = \"cpu\";\n * config.model_config.num_threads = 1;\n * config.decoding_method = \"greedy_search\";\n *\n * const SherpaOnnxOnlineRecognizer *recognizer =\n *     SherpaOnnxCreateOnlineRecognizer(&config);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineRecognizer *\nSherpaOnnxCreateOnlineRecognizer(\n    const SherpaOnnxOnlineRecognizerConfig *config);\n\n/**\n * @brief Destroy a streaming recognizer.\n *\n * @param recognizer A pointer returned by 
SherpaOnnxCreateOnlineRecognizer().\n *\n * @code\n * SherpaOnnxDestroyOnlineRecognizer(recognizer);\n * recognizer = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlineRecognizer(\n    const SherpaOnnxOnlineRecognizer *recognizer);\n\n/**\n * @brief Create a streaming ASR state object.\n *\n * One stream corresponds to one decoding state. Reuse the same recognizer to\n * create multiple streams.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @return A newly created stream. The caller owns the returned object and must\n *         free it with SherpaOnnxDestroyOnlineStream().\n *\n * @code\n * const SherpaOnnxWave *wave = SherpaOnnxReadWave(\n *     \"./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\");\n * const SherpaOnnxOnlineStream *stream =\n *     SherpaOnnxCreateOnlineStream(recognizer);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineStream *SherpaOnnxCreateOnlineStream(\n    const SherpaOnnxOnlineRecognizer *recognizer);\n\n/**\n * @brief Create a streaming ASR state object with per-stream hotwords.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param hotwords Hotwords text to associate with the stream.\n * @return A newly created stream. 
The caller owns the returned object and must\n *         free it with SherpaOnnxDestroyOnlineStream().\n *\n * @code\n * const SherpaOnnxOnlineStream *stream =\n *     SherpaOnnxCreateOnlineStreamWithHotwords(recognizer,\n *                                              \"▁HELLO ▁WORLD\");\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineStream *\nSherpaOnnxCreateOnlineStreamWithHotwords(\n    const SherpaOnnxOnlineRecognizer *recognizer, const char *hotwords);\n\n/**\n * @brief Destroy a streaming ASR state object.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream() or\n *               SherpaOnnxCreateOnlineStreamWithHotwords().\n *\n * @code\n * SherpaOnnxDestroyOnlineStream(stream);\n * stream = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlineStream(\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Append audio samples to a streaming ASR stream.\n *\n * The input is mono floating-point PCM normalized to the range [-1, 1].\n * If @p sample_rate differs from the recognizer feature sample rate,\n * sherpa-onnx resamples internally.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @param sample_rate Sample rate of @p samples.\n * @param samples Pointer to @p n samples in the range [-1, 1].\n * @param n Number of samples.\n *\n * @code\n * int32_t start = 0;\n * int32_t chunk_size = 3200;  // 0.2 seconds at 16 kHz\n * SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n *                                      wave->samples + start, chunk_size);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOnlineStreamAcceptWaveform(\n    const SherpaOnnxOnlineStream *stream, int32_t sample_rate,\n    const float *samples, int32_t n);\n\n/**\n * @brief Check whether a streaming ASR stream is ready to decode.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @return 
1 if enough frames are available for decoding; otherwise 0.\n *\n * @code\n * if (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n *   SherpaOnnxDecodeOnlineStream(recognizer, stream);\n * }\n * @endcode\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxIsOnlineStreamReady(const SherpaOnnxOnlineRecognizer *recognizer,\n                              const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Decode one step of a streaming ASR stream.\n *\n * Call this only when SherpaOnnxIsOnlineStreamReady() returns 1.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n *\n * @code\n * SherpaOnnxOnlineStreamAcceptWaveform(stream, sample_rate, samples, n);\n * while (SherpaOnnxIsOnlineStreamReady(recognizer, stream)) {\n *   SherpaOnnxDecodeOnlineStream(recognizer, stream);\n * }\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeOnlineStream(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Decode multiple streaming ASR streams in parallel.\n *\n * The caller must ensure every stream in @p streams is ready before calling\n * this function.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param streams Array of @p n stream pointers.\n * @param n Number of streams in @p streams.\n *\n * @code\n * const SherpaOnnxOnlineStream *streams[2] = {stream1, stream2};\n * SherpaOnnxDecodeMultipleOnlineStreams(recognizer, streams, 2);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeMultipleOnlineStreams(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream **streams, int32_t n);\n\n/**\n * @brief Get the current streaming ASR result for a stream.\n *\n * The returned snapshot is independent from the stream state. 
The caller owns\n * it and must free it with SherpaOnnxDestroyOnlineRecognizerResult().\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @return A newly allocated result snapshot.\n *\n * @code\n * const SherpaOnnxOnlineRecognizerResult *r =\n *     SherpaOnnxGetOnlineStreamResult(recognizer, stream);\n * printf(\"%s\\n\", r->text);\n * // r->tokens_arr[i] and r->timestamps[i] are parallel when timestamps\n * // are available.\n * SherpaOnnxDestroyOnlineRecognizerResult(r);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineRecognizerResult *\nSherpaOnnxGetOnlineStreamResult(const SherpaOnnxOnlineRecognizer *recognizer,\n                                const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Destroy a result returned by SherpaOnnxGetOnlineStreamResult().\n *\n * @param r A pointer returned by SherpaOnnxGetOnlineStreamResult().\n *\n * @code\n * SherpaOnnxDestroyOnlineRecognizerResult(r);\n * r = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlineRecognizerResult(\n    const SherpaOnnxOnlineRecognizerResult *r);\n\n/**\n * @brief Get the current streaming ASR result as JSON.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @return A newly allocated JSON string. 
Free it with\n *         SherpaOnnxDestroyOnlineStreamResultJson().\n *\n * @code\n * const char *json =\n *     SherpaOnnxGetOnlineStreamResultAsJson(recognizer, stream);\n * puts(json);\n * SherpaOnnxDestroyOnlineStreamResultJson(json);\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetOnlineStreamResultAsJson(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Free a JSON string returned by\n * SherpaOnnxGetOnlineStreamResultAsJson().\n *\n * @param s A pointer returned by SherpaOnnxGetOnlineStreamResultAsJson().\n *\n * @code\n * SherpaOnnxDestroyOnlineStreamResultJson(json);\n * json = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlineStreamResultJson(const char *s);\n\n/**\n * @brief Reset a streaming ASR stream after an endpoint or utterance boundary.\n *\n * This clears the decoder state for the stream so that it can be reused for a\n * new utterance.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n *\n * @code\n * if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n *   SherpaOnnxOnlineStreamReset(recognizer, stream);\n * }\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOnlineStreamReset(\n    const SherpaOnnxOnlineRecognizer *recognizer,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Signal end-of-input for a streaming ASR stream.\n *\n * After calling this function, do not append more samples to the stream.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n *\n * @code\n * SherpaOnnxOnlineStreamInputFinished(stream);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOnlineStreamInputFinished(\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Set a per-stream runtime option.\n *\n * This is a generic extension point for model-specific or runtime-specific\n * options such as \"is_final\" for 
streaming Paraformer.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @param key Option name.\n * @param value Option value represented as text.\n *\n * @code\n * SherpaOnnxOnlineStreamSetOption(stream, \"is_final\", \"1\");\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOnlineStreamSetOption(\n    const SherpaOnnxOnlineStream *stream, const char *key, const char *value);\n\n/**\n * @brief Get a per-stream runtime option.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @param key Option name.\n * @return The option value. The returned pointer is owned by the stream, must\n *         not be freed by the caller, and may be invalidated if the option is\n *         overwritten or the stream is destroyed.\n *\n * @code\n * const char *value = SherpaOnnxOnlineStreamGetOption(stream, \"is_final\");\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxOnlineStreamGetOption(\n    const SherpaOnnxOnlineStream *stream, const char *key);\n\n/**\n * @brief Check whether a per-stream runtime option exists.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @param key Option name.\n * @return 1 if the option exists; otherwise 0.\n *\n * @code\n * int32_t has_option = SherpaOnnxOnlineStreamHasOption(stream, \"is_final\");\n * @endcode\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOnlineStreamHasOption(\n    const SherpaOnnxOnlineStream *stream, const char *key);\n\n/**\n * @brief Check whether endpoint detection has triggered for a stream.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOnlineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOnlineStream().\n * @return 1 if an endpoint is detected; otherwise 0.\n *\n * @code\n * if (SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream)) {\n *   SherpaOnnxOnlineStreamReset(recognizer, stream);\n * }\n * @endcode\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxOnlineStreamIsEndpoint(const 
SherpaOnnxOnlineRecognizer *recognizer,\n                                 const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Helper for pretty-printing incremental recognition results.\n *\n * This utility is mainly used by example programs on Linux and macOS.\n */\ntypedef struct SherpaOnnxDisplay SherpaOnnxDisplay;\n\n/**\n * @brief Create a display helper.\n *\n * @param max_word_per_line Maximum number of words to show per line.\n * @return A newly allocated display helper. Free it with\n *         SherpaOnnxDestroyDisplay().\n *\n * @code\n * const SherpaOnnxDisplay *display = SherpaOnnxCreateDisplay(50);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxDisplay *SherpaOnnxCreateDisplay(\n    int32_t max_word_per_line);\n\n/**\n * @brief Destroy a display helper.\n *\n * @param display A pointer returned by SherpaOnnxCreateDisplay().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyDisplay(const SherpaOnnxDisplay *display);\n\n/**\n * @brief Print one line of text using the display helper.\n *\n * @param display A pointer returned by SherpaOnnxCreateDisplay().\n * @param idx Segment or utterance index to print.\n * @param s Text to print.\n *\n * @code\n * SherpaOnnxPrint(display, segment_id, r->text);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxPrint(const SherpaOnnxDisplay *display,\n                                     int32_t idx, const char *s);\n// ============================================================\n// For offline ASR (i.e., non-streaming ASR)\n// ============================================================\n\n/**\n * @brief Configuration for a non-streaming transducer model.\n */\ntypedef struct SherpaOnnxOfflineTransducerModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. */\n  const char *decoder;\n  /** Path to the joiner ONNX model. */\n  const char *joiner;\n} SherpaOnnxOfflineTransducerModelConfig;\n\n/** @brief Configuration for a non-streaming Paraformer model. 
*/\ntypedef struct SherpaOnnxOfflineParaformerModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineParaformerModelConfig;\n\n/** @brief Configuration for a non-streaming NeMo CTC model. */\ntypedef struct SherpaOnnxOfflineNemoEncDecCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineNemoEncDecCtcModelConfig;\n\n/**\n * @brief Configuration for a non-streaming Whisper model.\n */\ntypedef struct SherpaOnnxOfflineWhisperModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. */\n  const char *decoder;\n  /** Optional language hint, for example \"en\" or \"zh\". */\n  const char *language;\n  /** Optional Whisper task such as \"transcribe\" or \"translate\". */\n  const char *task;\n  /** Number of tail padding frames appended internally. */\n  int32_t tail_paddings;\n\n  /** Non-zero to enable token-level timestamps when supported by the model. */\n  int32_t enable_token_timestamps;\n\n  /** Non-zero to enable Whisper segment-level timestamps. */\n  int32_t enable_segment_timestamps;\n} SherpaOnnxOfflineWhisperModelConfig;\n\n/** @brief Configuration for a Canary model. */\ntypedef struct SherpaOnnxOfflineCanaryModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. */\n  const char *decoder;\n  /** Source language hint. */\n  const char *src_lang;\n  /** Target language hint. */\n  const char *tgt_lang;\n  /** Non-zero to enable punctuation and capitalization when supported. */\n  int32_t use_pnc;\n} SherpaOnnxOfflineCanaryModelConfig;\n\n/** @brief Configuration for a FireRedAsr encoder/decoder model. */\ntypedef struct SherpaOnnxOfflineFireRedAsrModelConfig {\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the decoder ONNX model. 
*/\n  const char *decoder;\n} SherpaOnnxOfflineFireRedAsrModelConfig;\n\n/** @brief Configuration for a FireRedAsr CTC model. */\ntypedef struct SherpaOnnxOfflineFireRedAsrCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineFireRedAsrCtcModelConfig;\n\n/** @brief Configuration for a Moonshine model. */\ntypedef struct SherpaOnnxOfflineMoonshineModelConfig {\n  /** Path to the preprocessor ONNX model. */\n  const char *preprocessor;\n  /** Path to the encoder ONNX model. */\n  const char *encoder;\n  /** Path to the uncached decoder ONNX model. */\n  const char *uncached_decoder;\n  /** Path to the cached decoder ONNX model. */\n  const char *cached_decoder;\n  /** Path to the merged decoder ONNX model. */\n  const char *merged_decoder;\n} SherpaOnnxOfflineMoonshineModelConfig;\n\n/** @brief Configuration for a TDNN model. */\ntypedef struct SherpaOnnxOfflineTdnnModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineTdnnModelConfig;\n\n/** @brief Configuration for an offline language model. */\ntypedef struct SherpaOnnxOfflineLMConfig {\n  /** Path to the language model. */\n  const char *model;\n  /** Interpolation scale for the language model. */\n  float scale;\n} SherpaOnnxOfflineLMConfig;\n\n/** @brief Configuration for a SenseVoice model. */\ntypedef struct SherpaOnnxOfflineSenseVoiceModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n  /** Optional language hint. */\n  const char *language;\n  /** Non-zero to enable inverse text normalization. */\n  int32_t use_itn;\n} SherpaOnnxOfflineSenseVoiceModelConfig;\n\n/** @brief Configuration for a Dolphin model. */\ntypedef struct SherpaOnnxOfflineDolphinModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineDolphinModelConfig;\n\n/** @brief Configuration for an offline Zipformer CTC model. */\ntypedef struct SherpaOnnxOfflineZipformerCtcModelConfig {\n  /** Path to the ONNX model. 
*/\n  const char *model;\n} SherpaOnnxOfflineZipformerCtcModelConfig;\n\n/** @brief Configuration for an offline WeNet CTC model. */\ntypedef struct SherpaOnnxOfflineWenetCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineWenetCtcModelConfig;\n\n/** @brief Configuration for an omnilingual offline CTC model. */\ntypedef struct SherpaOnnxOfflineOmnilingualAsrCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineOmnilingualAsrCtcModelConfig;\n\n/** @brief Configuration for an offline FunASR Nano model. */\ntypedef struct SherpaOnnxOfflineFunASRNanoModelConfig {\n  /** Path to the encoder adaptor. */\n  const char *encoder_adaptor;\n  /** Path to the LLM ONNX model. */\n  const char *llm;\n  /** Path to the embedding model. */\n  const char *embedding;\n  /** Path to the tokenizer file. */\n  const char *tokenizer;\n  /** System prompt. */\n  const char *system_prompt;\n  /** User prompt. */\n  const char *user_prompt;\n  /** Maximum number of generated tokens. */\n  int32_t max_new_tokens;\n  /** Sampling temperature. */\n  float temperature;\n  /** Top-p sampling threshold. */\n  float top_p;\n  /** Random seed. */\n  int32_t seed;\n  /** Optional language hint. */\n  const char *language;\n  /** Non-zero to enable inverse text normalization. */\n  int32_t itn;\n  /** Optional hotwords text. */\n  const char *hotwords;\n} SherpaOnnxOfflineFunASRNanoModelConfig;\n\n/** @brief Configuration for a MedASR CTC model. */\ntypedef struct SherpaOnnxOfflineMedAsrCtcModelConfig {\n  /** Path to the ONNX model. */\n  const char *model;\n} SherpaOnnxOfflineMedAsrCtcModelConfig;\n\n/**\n * @brief Model configuration shared by offline ASR recognizers.\n *\n * Zero-initialize this struct before use, then fill in exactly the sub-config\n * needed by the model family you want to run.\n *\n * Exactly one model family should be configured for each recognizer. 
For\n * example, set only one of @c transducer, @c paraformer, @c nemo_ctc,\n * @c whisper, @c tdnn, @c sense_voice, @c moonshine, @c fire_red_asr,\n * @c dolphin, @c zipformer_ctc, @c canary, @c wenet_ctc, @c omnilingual,\n * @c medasr, @c funasr_nano, or @c fire_red_asr_ctc.\n *\n * If multiple model families are configured at the same time, the\n * implementation will choose one of them, and which one is used is\n * implementation-defined. Do not rely on any precedence rule.\n */\ntypedef struct SherpaOnnxOfflineModelConfig {\n  /** Non-streaming transducer model files. */\n  SherpaOnnxOfflineTransducerModelConfig transducer;\n  /** Non-streaming Paraformer model files. */\n  SherpaOnnxOfflineParaformerModelConfig paraformer;\n  /** Non-streaming NeMo CTC model files. */\n  SherpaOnnxOfflineNemoEncDecCtcModelConfig nemo_ctc;\n  /** Whisper model files and options. */\n  SherpaOnnxOfflineWhisperModelConfig whisper;\n  /** TDNN model files. */\n  SherpaOnnxOfflineTdnnModelConfig tdnn;\n\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Number of backend threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider, for example \"cpu\" or \"cuda\". */\n  const char *provider;\n  /** Optional explicit model type override. */\n  const char *model_type;\n  /** Modeling unit, such as \"cjkchar\", \"bpe\", or \"cjkchar+bpe\". */\n  const char *modeling_unit;\n  /** Path to the BPE vocabulary file when BPE is used. */\n  const char *bpe_vocab;\n  /** Path to the TeleSpeech CTC model. */\n  const char *telespeech_ctc;\n  /** SenseVoice configuration. */\n  SherpaOnnxOfflineSenseVoiceModelConfig sense_voice;\n  /** Moonshine configuration. */\n  SherpaOnnxOfflineMoonshineModelConfig moonshine;\n  /** FireRedAsr configuration. */\n  SherpaOnnxOfflineFireRedAsrModelConfig fire_red_asr;\n  /** Dolphin configuration. 
*/\n  SherpaOnnxOfflineDolphinModelConfig dolphin;\n  /** Zipformer CTC configuration. */\n  SherpaOnnxOfflineZipformerCtcModelConfig zipformer_ctc;\n  /** Canary configuration. */\n  SherpaOnnxOfflineCanaryModelConfig canary;\n  /** WeNet CTC configuration. */\n  SherpaOnnxOfflineWenetCtcModelConfig wenet_ctc;\n  /** Omnilingual CTC configuration. */\n  SherpaOnnxOfflineOmnilingualAsrCtcModelConfig omnilingual;\n  /** MedASR configuration. */\n  SherpaOnnxOfflineMedAsrCtcModelConfig medasr;\n  /** FunASR Nano configuration. */\n  SherpaOnnxOfflineFunASRNanoModelConfig funasr_nano;\n  /** FireRedAsr CTC configuration. */\n  SherpaOnnxOfflineFireRedAsrCtcModelConfig fire_red_asr_ctc;\n} SherpaOnnxOfflineModelConfig;\n\n/**\n * @brief Configuration for a non-streaming ASR recognizer.\n *\n * Zero-initialize this struct before use.\n *\n * Example using Whisper:\n *\n * @code\n * SherpaOnnxOfflineRecognizerConfig config;\n * memset(&config, 0, sizeof(config));\n *\n * config.feat_config.sample_rate = 16000;\n * config.feat_config.feature_dim = 80;\n *\n * config.model_config.whisper.encoder =\n *     \"./sherpa-onnx-whisper-tiny/tiny-encoder.onnx\";\n * config.model_config.whisper.decoder =\n *     \"./sherpa-onnx-whisper-tiny/tiny-decoder.onnx\";\n * config.model_config.whisper.language = \"en\";\n * config.model_config.whisper.task = \"transcribe\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-whisper-tiny/tiny-tokens.txt\";\n * config.model_config.provider = \"cpu\";\n * config.model_config.num_threads = 1;\n *\n * config.decoding_method = \"greedy_search\";\n * @endcode\n *\n * Example using SenseVoice:\n *\n * @code\n * config.model_config.sense_voice.model =\n *     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx\";\n * config.model_config.sense_voice.language = \"auto\";\n * config.model_config.sense_voice.use_itn = 1;\n * config.model_config.tokens =\n *     
\"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt\";\n * @endcode\n *\n * Example using Parakeet TDT:\n *\n * @code\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/encoder.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/decoder.int8.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/joiner.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/tokens.txt\";\n * config.model_config.model_type = \"nemo_transducer\";\n * @endcode\n */\ntypedef struct SherpaOnnxOfflineRecognizerConfig {\n  /** Feature extraction settings. */\n  SherpaOnnxFeatureConfig feat_config;\n  /** Offline model configuration. */\n  SherpaOnnxOfflineModelConfig model_config;\n  /** Optional language model configuration. */\n  SherpaOnnxOfflineLMConfig lm_config;\n\n  /** Decoding method, for example \"greedy_search\" or \"modified_beam_search\". */\n  const char *decoding_method;\n  /** Number of active paths for modified beam search. */\n  int32_t max_active_paths;\n\n  /** Path to a hotwords file. */\n  const char *hotwords_file;\n\n  /** Bonus score added to each hotword token. */\n  float hotwords_score;\n  /** Path to punctuation or text-processing rule FSTs. */\n  const char *rule_fsts;\n  /** Path to FAR archives used by text-processing rules. */\n  const char *rule_fars;\n  /** Optional blank penalty applied during decoding. */\n  float blank_penalty;\n\n  /** Optional homophone replacement configuration. */\n  SherpaOnnxHomophoneReplacerConfig hr;\n} SherpaOnnxOfflineRecognizerConfig;\n\n/** @brief Non-streaming recognizer handle. */\ntypedef struct SherpaOnnxOfflineRecognizer SherpaOnnxOfflineRecognizer;\n\n/** @brief Non-streaming decoding state for one utterance. 
*/\ntypedef struct SherpaOnnxOfflineStream SherpaOnnxOfflineStream;\n\n/**\n * @brief Create a non-streaming ASR recognizer.\n *\n * @param config Recognizer configuration.\n * @return A recognizer handle on success, or NULL if the configuration is\n *         invalid. The caller owns the returned object and must free it with\n *         SherpaOnnxDestroyOfflineRecognizer().\n *\n * Whisper example:\n *\n * @code\n * SherpaOnnxOfflineRecognizerConfig config;\n * memset(&config, 0, sizeof(config));\n * config.feat_config.sample_rate = 16000;\n * config.feat_config.feature_dim = 80;\n * config.model_config.whisper.encoder =\n *     \"./sherpa-onnx-whisper-tiny/tiny-encoder.onnx\";\n * config.model_config.whisper.decoder =\n *     \"./sherpa-onnx-whisper-tiny/tiny-decoder.onnx\";\n * config.model_config.whisper.language = \"en\";\n * config.model_config.whisper.task = \"transcribe\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-whisper-tiny/tiny-tokens.txt\";\n * config.model_config.provider = \"cpu\";\n * config.model_config.num_threads = 1;\n * config.decoding_method = \"greedy_search\";\n *\n * const SherpaOnnxOfflineRecognizer *recognizer =\n *     SherpaOnnxCreateOfflineRecognizer(&config);\n * @endcode\n *\n * SenseVoice example:\n *\n * @code\n * config.model_config.sense_voice.model =\n *     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx\";\n * config.model_config.sense_voice.language = \"auto\";\n * config.model_config.sense_voice.use_itn = 1;\n * config.model_config.tokens =\n *     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt\";\n * @endcode\n *\n * Parakeet TDT example:\n *\n * @code\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/encoder.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/decoder.int8.onnx\";\n * config.model_config.transducer.joiner =\n *     
\"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/joiner.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/tokens.txt\";\n * config.model_config.model_type = \"nemo_transducer\";\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineRecognizer *\nSherpaOnnxCreateOfflineRecognizer(\n    const SherpaOnnxOfflineRecognizerConfig *config);\n\n/**\n * @brief Update the configuration of an existing offline recognizer.\n *\n * @param recognizer Recognizer handle.\n * @param config New recognizer configuration.\n *\n * @code\n * SherpaOnnxOfflineRecognizerSetConfig(recognizer, &config);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOfflineRecognizerSetConfig(\n    const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineRecognizerConfig *config);\n\n/**\n * @brief Destroy a non-streaming recognizer.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOfflineRecognizer().\n *\n * @code\n * SherpaOnnxDestroyOfflineRecognizer(recognizer);\n * recognizer = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineRecognizer(\n    const SherpaOnnxOfflineRecognizer *recognizer);\n\n/**\n * @brief Create a non-streaming ASR input stream.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOfflineRecognizer().\n * @return A newly created stream. 
The caller owns the returned object and must\n *         free it with SherpaOnnxDestroyOfflineStream().\n *\n * @code\n * const SherpaOnnxWave *wave =\n *     SherpaOnnxReadWave(\"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\");\n * const SherpaOnnxOfflineStream *stream =\n *     SherpaOnnxCreateOfflineStream(recognizer);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineStream *SherpaOnnxCreateOfflineStream(\n    const SherpaOnnxOfflineRecognizer *recognizer);\n\n/**\n * @brief Create a non-streaming ASR input stream with per-stream hotwords.\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOfflineRecognizer().\n * @param hotwords Hotwords text to associate with the stream.\n * @return A newly created stream. The caller owns the returned object and must\n *         free it with SherpaOnnxDestroyOfflineStream().\n *\n * @code\n * const SherpaOnnxOfflineStream *stream =\n *     SherpaOnnxCreateOfflineStreamWithHotwords(recognizer,\n *                                               \"▁HELLO ▁WORLD\");\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineStream *\nSherpaOnnxCreateOfflineStreamWithHotwords(\n    const SherpaOnnxOfflineRecognizer *recognizer, const char *hotwords);\n\n/**\n * @brief Destroy a non-streaming ASR stream.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream() or\n *               SherpaOnnxCreateOfflineStreamWithHotwords().\n *\n * @code\n * SherpaOnnxDestroyOfflineStream(stream);\n * stream = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineStream(\n    const SherpaOnnxOfflineStream *stream);\n\n/**\n * @brief Provide the full utterance to an offline ASR stream.\n *\n * The input is mono floating-point PCM normalized to the range [-1, 1].\n * If @p sample_rate differs from the recognizer feature sample rate,\n * sherpa-onnx resamples internally.\n *\n * @warning Call this function at most once for each offline stream. 
Offline\n * recognition expects the entire utterance in a single call.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @param sample_rate Sample rate of @p samples.\n * @param samples Pointer to @p n samples in the range [-1, 1].\n * @param n Number of samples.\n *\n * @code\n * const SherpaOnnxWave *wave =\n *     SherpaOnnxReadWave(\"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\");\n * const SherpaOnnxOfflineStream *stream =\n *     SherpaOnnxCreateOfflineStream(recognizer);\n * SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate,\n *                                 wave->samples, wave->num_samples);\n * SherpaOnnxDecodeOfflineStream(recognizer, stream);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxAcceptWaveformOffline(\n    const SherpaOnnxOfflineStream *stream, int32_t sample_rate,\n    const float *samples, int32_t n);\n\n/**\n * @brief Set a per-stream runtime option for offline ASR.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @param key Option name.\n * @param value Option value represented as text.\n *\n * @code\n * SherpaOnnxOfflineStreamSetOption(stream, \"language\", \"en\");\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxOfflineStreamSetOption(\n    const SherpaOnnxOfflineStream *stream, const char *key, const char *value);\n\n/**\n * @brief Get a per-stream runtime option for offline ASR.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @param key Option name.\n * @return The option value. 
The returned pointer is owned by the stream, must\n *         not be freed by the caller, and may be invalidated if the option is\n *         overwritten or the stream is destroyed.\n *\n * @code\n * const char *value = SherpaOnnxOfflineStreamGetOption(stream, \"language\");\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxOfflineStreamGetOption(\n    const SherpaOnnxOfflineStream *stream, const char *key);\n\n/**\n * @brief Check whether a per-stream runtime option exists.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @param key Option name.\n * @return 1 if the option exists; otherwise 0.\n *\n * @code\n * int32_t has_language =\n *     SherpaOnnxOfflineStreamHasOption(stream, \"language\");\n * @endcode\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOfflineStreamHasOption(\n    const SherpaOnnxOfflineStream *stream, const char *key);\n\n/**\n * @brief Run offline ASR on one stream.\n *\n * Call this after SherpaOnnxAcceptWaveformOffline().\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOfflineRecognizer().\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n *\n * @code\n * SherpaOnnxDecodeOfflineStream(recognizer, stream);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeOfflineStream(\n    const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineStream *stream);\n\n/**\n * @brief Run offline ASR on multiple streams in parallel.\n *\n * The caller must have already provided one utterance to each stream via\n * SherpaOnnxAcceptWaveformOffline().\n *\n * @param recognizer A pointer returned by SherpaOnnxCreateOfflineRecognizer().\n * @param streams Array of @p n offline stream pointers.\n * @param n Number of streams in @p streams.\n *\n * @code\n * const SherpaOnnxOfflineStream *streams[2] = {stream1, stream2};\n * SherpaOnnxDecodeMultipleOfflineStreams(recognizer, streams, 2);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeMultipleOfflineStreams(\n    
const SherpaOnnxOfflineRecognizer *recognizer,\n    const SherpaOnnxOfflineStream **streams, int32_t n);\n\n/**\n * @brief Recognition result for a non-streaming ASR stream.\n *\n * All pointers in this struct are owned by the result object returned from\n * SherpaOnnxGetOfflineStreamResult() and become invalid after\n * SherpaOnnxDestroyOfflineRecognizerResult() is called.\n */\ntypedef struct SherpaOnnxOfflineRecognizerResult {\n  /** Recognized text. */\n  const char *text;\n\n  /**\n   * Optional token timestamps in seconds.\n   *\n   * This field may be NULL when the model does not provide token timestamps.\n   * When non-NULL, it contains @c count entries and is parallel to\n   * @c tokens_arr.\n   */\n  float *timestamps;\n\n  /** Number of token entries in @c tokens_arr and related per-token arrays. */\n  int32_t count;\n\n  /**\n   * Contiguous memory block containing token strings separated by '\\0'.\n   *\n   * Use @c tokens_arr for convenient indexed access.\n   */\n  const char *tokens;\n\n  /** Array of @c count pointers into @c tokens. */\n  const char *const *tokens_arr;\n\n  /** JSON serialization of the result. */\n  const char *json;\n\n  /** Optional recognized language label. */\n  const char *lang;\n\n  /** Optional recognized emotion label. */\n  const char *emotion;\n\n  /** Optional recognized event label. */\n  const char *event;\n\n  /** Optional token durations in seconds, parallel to @c tokens_arr. */\n  float *durations;\n\n  /** Optional token log probabilities, parallel to @c tokens_arr. */\n  float *ys_log_probs;\n\n  /** Optional segment start times in seconds, parallel to @c segment_texts_arr.\n   */\n  const float *segment_timestamps;\n\n  /** Optional segment durations in seconds, parallel to @c segment_texts_arr.\n   */\n  const float *segment_durations;\n\n  /** Contiguous memory block containing segment texts separated by '\\0'. 
*/\n  const char *segment_texts;\n\n  /** Array of @c segment_count pointers into @c segment_texts. */\n  const char *const *segment_texts_arr;\n\n  /** Number of segment entries in the segment-level arrays. */\n  int32_t segment_count;\n} SherpaOnnxOfflineRecognizerResult;\n\n/**\n * @brief Get the recognition result for an offline ASR stream.\n *\n * Call this after SherpaOnnxDecodeOfflineStream() or\n * SherpaOnnxDecodeMultipleOfflineStreams().\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @return A newly allocated result snapshot. Free it with\n *         SherpaOnnxDestroyOfflineRecognizerResult().\n *\n * @code\n * const SherpaOnnxOfflineRecognizerResult *r =\n *     SherpaOnnxGetOfflineStreamResult(stream);\n * printf(\"%s\\n\", r->text);\n * if (r->timestamps && r->count > 0) {\n *   printf(\"First token starts at %.3f seconds\\n\", r->timestamps[0]);\n * }\n * SherpaOnnxDestroyOfflineRecognizerResult(r);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineRecognizerResult *\nSherpaOnnxGetOfflineStreamResult(const SherpaOnnxOfflineStream *stream);\n\n/**\n * @brief Destroy a result returned by SherpaOnnxGetOfflineStreamResult().\n *\n * @param r A pointer returned by SherpaOnnxGetOfflineStreamResult().\n *\n * @code\n * SherpaOnnxDestroyOfflineRecognizerResult(r);\n * r = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineRecognizerResult(\n    const SherpaOnnxOfflineRecognizerResult *r);\n\n/**\n * @brief Get the offline ASR result as JSON.\n *\n * @param stream A pointer returned by SherpaOnnxCreateOfflineStream().\n * @return A newly allocated JSON string. 
Free it with\n *         SherpaOnnxDestroyOfflineStreamResultJson().\n *\n * @code\n * const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream);\n * puts(json);\n * SherpaOnnxDestroyOfflineStreamResultJson(json);\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetOfflineStreamResultAsJson(\n    const SherpaOnnxOfflineStream *stream);\n\n/**\n * @brief Free a JSON string returned by\n * SherpaOnnxGetOfflineStreamResultAsJson().\n *\n * @param s A pointer returned by SherpaOnnxGetOfflineStreamResultAsJson().\n *\n * @code\n * SherpaOnnxDestroyOfflineStreamResultJson(json);\n * json = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineStreamResultJson(const char *s);\n\n// ============================================================\n// For keyword spotting\n// ============================================================\n/**\n * @brief Snapshot of the current keyword spotting result.\n *\n * Free this object with SherpaOnnxDestroyKeywordResult().\n */\ntypedef struct SherpaOnnxKeywordResult {\n  /**\n   * Triggered keyword text.\n   *\n   * For English models this is usually space-separated words. For Chinese\n   * models it is typically the surface form without spaces.\n   */\n  const char *keyword;\n\n  /**\n   * Token sequence as a single string.\n   *\n   * For BPE-based models this contains the decoded BPE tokens.\n   */\n  const char *tokens;\n\n  /**\n   * Token sequence as an array.\n   *\n   * The array length is @c count. Each string is owned by this result object.\n   */\n  const char *const *tokens_arr;\n\n  /** Number of decoded tokens in @c tokens_arr and @c timestamps. */\n  int32_t count;\n\n  /**\n   * Per-token timestamps in seconds.\n   *\n   * This array has @c count elements. Element @c i corresponds to\n   * `tokens_arr[i]`.\n   */\n  float *timestamps;\n\n  /** Start time of the current segment in seconds. 
*/\n  float start_time;\n\n  /**\n   * JSON representation of the result.\n   *\n   * The JSON includes `keyword`, `tokens`, `timestamps`, and `start_time`.\n   */\n  const char *json;\n} SherpaOnnxKeywordResult;\n\n/**\n * @brief Configuration for keyword spotting.\n *\n * The acoustic model is configured through @c model_config. In practice this is\n * usually a streaming transducer model.\n *\n * Keyword definitions can be provided either through @c keywords_file or\n * through @c keywords_buf/@c keywords_buf_size. If both are set, the buffer is\n * used.\n *\n * Example using\n * `sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile`:\n *\n * @code\n * SherpaOnnxKeywordSpotterConfig config;\n * memset(&config, 0, sizeof(config));\n *\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n *     \"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n *     \"decoder-epoch-12-avg-2-chunk-16-left-64.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n *     \"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n *     \"tokens.txt\";\n * config.model_config.provider = \"cpu\";\n * config.model_config.num_threads = 1;\n *\n * config.keywords_file =\n *     \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/\"\n *     \"test_wavs/test_keywords.txt\";\n * config.max_active_paths = 4;\n * config.keywords_score = 3.0f;\n * config.keywords_threshold = 0.1f;\n * @endcode\n */\ntypedef struct SherpaOnnxKeywordSpotterConfig {\n  /** Feature extraction parameters. */\n  SherpaOnnxFeatureConfig feat_config;\n  /** Streaming acoustic model configuration. 
*/\n  SherpaOnnxOnlineModelConfig model_config;\n  /** Maximum number of active decoding paths. */\n  int32_t max_active_paths;\n  /** Number of trailing blank symbols required before trigger finalization. */\n  int32_t num_trailing_blanks;\n  /** Bonus score applied to keywords during search. */\n  float keywords_score;\n  /** Detection threshold. Larger values are more conservative. */\n  float keywords_threshold;\n  /** Optional keyword file. */\n  const char *keywords_file;\n  /** Optional in-memory keyword data. If non-null, it overrides @c\n   * keywords_file. */\n  const char *keywords_buf;\n  /** Size in bytes of @c keywords_buf, excluding any trailing `'\\0'`. */\n  int32_t keywords_buf_size;\n} SherpaOnnxKeywordSpotterConfig;\n\n/** @brief Opaque keyword spotter handle. */\ntypedef struct SherpaOnnxKeywordSpotter SherpaOnnxKeywordSpotter;\n\n/**\n * @brief Create a keyword spotter.\n *\n * @param config Keyword spotter configuration.\n * @return A newly allocated keyword spotter on success, or NULL on error. Free\n *         it with SherpaOnnxDestroyKeywordSpotter().\n */\nSHERPA_ONNX_API const SherpaOnnxKeywordSpotter *SherpaOnnxCreateKeywordSpotter(\n    const SherpaOnnxKeywordSpotterConfig *config);\n\n/**\n * @brief Destroy a keyword spotter.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyKeywordSpotter(\n    const SherpaOnnxKeywordSpotter *spotter);\n\n/**\n * @brief Create a keyword spotting stream using the spotter's built-in keyword\n * list.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @return A newly allocated stream. 
Free it with\n * SherpaOnnxDestroyOnlineStream().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineStream *SherpaOnnxCreateKeywordStream(\n    const SherpaOnnxKeywordSpotter *spotter);\n\n/**\n * @brief Create a keyword spotting stream with extra or replacement keywords.\n *\n * The @p keywords string uses the same textual format as the keyword files used\n * by the examples. For instance:\n *\n * @code\n * const SherpaOnnxOnlineStream *stream =\n *     SherpaOnnxCreateKeywordStreamWithKeywords(\n *         kws, \"y ǎn y uán @演员/zh ī m íng @知名\");\n * @endcode\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param keywords Inline keyword definition string.\n * @return A newly allocated stream. Free it with\n * SherpaOnnxDestroyOnlineStream().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineStream *\nSherpaOnnxCreateKeywordStreamWithKeywords(\n    const SherpaOnnxKeywordSpotter *spotter, const char *keywords);\n\n/**\n * @brief Check whether a keyword stream has enough audio for decoding.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param stream A pointer returned by SherpaOnnxCreateKeywordStream() or\n *               SherpaOnnxCreateKeywordStreamWithKeywords().\n * @return 1 if the stream is ready to decode; otherwise 0.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxIsKeywordStreamReady(const SherpaOnnxKeywordSpotter *spotter,\n                               const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Decode one ready keyword stream.\n *\n * Call this only when SherpaOnnxIsKeywordStreamReady() returns 1.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param stream A pointer returned by SherpaOnnxCreateKeywordStream() or\n *               SherpaOnnxCreateKeywordStreamWithKeywords().\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeKeywordStream(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Reset a 
keyword stream after a keyword is detected.\n *\n * The examples call this immediately after a successful trigger so the next\n * keyword can be detected independently.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param stream A pointer returned by SherpaOnnxCreateKeywordStream() or\n *               SherpaOnnxCreateKeywordStreamWithKeywords().\n */\nSHERPA_ONNX_API void SherpaOnnxResetKeywordStream(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Decode multiple ready keyword streams in parallel.\n *\n * The caller must ensure every stream in @p streams is ready before calling\n * this function.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param streams Array of ready streams.\n * @param n Number of elements in @p streams.\n */\nSHERPA_ONNX_API void SherpaOnnxDecodeMultipleKeywordStreams(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream **streams, int32_t n);\n\n/**\n * @brief Get the current keyword spotting result for a stream.\n *\n * The returned snapshot may represent either \"no trigger yet\" or a detected\n * keyword. A common pattern is to check whether `strlen(r->keyword) != 0`.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param stream A pointer returned by SherpaOnnxCreateKeywordStream() or\n *               SherpaOnnxCreateKeywordStreamWithKeywords().\n * @return A newly allocated result snapshot. 
Free it with\n *         SherpaOnnxDestroyKeywordResult().\n *\n * @code\n * const SherpaOnnxKeywordResult *r = SherpaOnnxGetKeywordResult(kws, stream);\n * if (r && r->json && strlen(r->keyword)) {\n *   fprintf(stderr, \"Detected keyword: %s\\n\", r->json);\n *   SherpaOnnxResetKeywordStream(kws, stream);\n * }\n * SherpaOnnxDestroyKeywordResult(r);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxKeywordResult *SherpaOnnxGetKeywordResult(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Destroy a keyword result snapshot.\n *\n * @param r A pointer returned by SherpaOnnxGetKeywordResult().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyKeywordResult(\n    const SherpaOnnxKeywordResult *r);\n\n/**\n * @brief Get the current keyword spotting result as JSON.\n *\n * @param spotter A pointer returned by SherpaOnnxCreateKeywordSpotter().\n * @param stream A pointer returned by SherpaOnnxCreateKeywordStream() or\n *               SherpaOnnxCreateKeywordStreamWithKeywords().\n * @return A newly allocated JSON string. Free it with\n *         SherpaOnnxFreeKeywordResultJson().\n */\nSHERPA_ONNX_API const char *SherpaOnnxGetKeywordResultAsJson(\n    const SherpaOnnxKeywordSpotter *spotter,\n    const SherpaOnnxOnlineStream *stream);\n\n/**\n * @brief Free a JSON string returned by SherpaOnnxGetKeywordResultAsJson().\n *\n * @param s A pointer returned by SherpaOnnxGetKeywordResultAsJson().\n */\nSHERPA_ONNX_API void SherpaOnnxFreeKeywordResultJson(const char *s);\n\n// ============================================================\n// For VAD\n// ============================================================\n\n/** @brief Configuration for a Silero VAD model. */\ntypedef struct SherpaOnnxSileroVadModelConfig {\n  /** Path to `silero_vad.onnx`. */\n  const char *model;\n  /** Speech probability threshold. Frames above this value are speech. 
*/\n  float threshold;\n  /** Minimum silence duration in seconds used to close a speech segment. */\n  float min_silence_duration;\n  /** Minimum speech duration in seconds to keep a detected segment. */\n  float min_speech_duration;\n  /** Input window size in samples. A common value is 512. */\n  int32_t window_size;\n  /**\n   * Maximum speech duration in seconds.\n   *\n   * When a segment exceeds this value, the detector temporarily uses a higher\n   * threshold to encourage a split.\n   */\n  float max_speech_duration;\n} SherpaOnnxSileroVadModelConfig;\n\n/** @brief Configuration for a Ten VAD model. */\ntypedef struct SherpaOnnxTenVadModelConfig {\n  /** Path to `ten-vad.onnx`. */\n  const char *model;\n  /** Speech probability threshold. Frames above this value are speech. */\n  float threshold;\n  /** Minimum silence duration in seconds used to close a speech segment. */\n  float min_silence_duration;\n  /** Minimum speech duration in seconds to keep a detected segment. */\n  float min_speech_duration;\n  /** Input window size in samples. A common value is 256. */\n  int32_t window_size;\n  /**\n   * Maximum speech duration in seconds.\n   *\n   * When a segment exceeds this value, the detector temporarily uses a higher\n   * threshold to encourage a split.\n   */\n  float max_speech_duration;\n} SherpaOnnxTenVadModelConfig;\n\n/**\n * @brief Configuration shared by voice activity detectors.\n *\n * Exactly one VAD model family should be configured. Set either\n * @c silero_vad.model or @c ten_vad.model.\n *\n * If both are configured, the implementation will choose one of them, and\n * which one is used is implementation-defined. 
Do not rely on any precedence\n * rule.\n *\n * Example model files:\n * - `./silero_vad.onnx`\n * - `./ten-vad.onnx`\n *\n * @code\n * SherpaOnnxVadModelConfig config;\n * memset(&config, 0, sizeof(config));\n *\n * config.silero_vad.model = \"./silero_vad.onnx\";\n * config.silero_vad.threshold = 0.25f;\n * config.silero_vad.min_silence_duration = 0.5f;\n * config.silero_vad.min_speech_duration = 0.5f;\n * config.silero_vad.max_speech_duration = 10.0f;\n * config.silero_vad.window_size = 512;\n *\n * config.sample_rate = 16000;\n * config.num_threads = 1;\n * config.provider = \"cpu\";\n * config.debug = 0;\n * @endcode\n */\ntypedef struct SherpaOnnxVadModelConfig {\n  /** Silero VAD configuration. */\n  SherpaOnnxSileroVadModelConfig silero_vad;\n  /** Input sample rate expected by the detector, usually 16000. */\n  int32_t sample_rate;\n  /** Number of backend threads. */\n  int32_t num_threads;\n  /** Execution provider, for example \"cpu\" or \"cuda\". */\n  const char *provider;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Ten VAD configuration. */\n  SherpaOnnxTenVadModelConfig ten_vad;\n} SherpaOnnxVadModelConfig;\n\n/** @brief Opaque circular-buffer handle used by helper APIs. */\ntypedef struct SherpaOnnxCircularBuffer SherpaOnnxCircularBuffer;\n\n/**\n * @brief Create a floating-point circular buffer.\n *\n * @param capacity Maximum number of samples the buffer can keep.\n * @return A newly allocated buffer. 
Free it with\n *         SherpaOnnxDestroyCircularBuffer().\n *\n * @code\n * const SherpaOnnxCircularBuffer *buffer =\n *     SherpaOnnxCreateCircularBuffer(16000 * 30);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxCircularBuffer *SherpaOnnxCreateCircularBuffer(\n    int32_t capacity);\n\n/**\n * @brief Destroy a circular buffer.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n *\n * @code\n * SherpaOnnxDestroyCircularBuffer(buffer);\n * buffer = NULL;\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyCircularBuffer(\n    const SherpaOnnxCircularBuffer *buffer);\n\n/**\n * @brief Append samples to a circular buffer.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n * @param p Pointer to @p n samples.\n * @param n Number of samples.\n *\n * @code\n * SherpaOnnxCircularBufferPush(buffer, wave->samples, wave->num_samples);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxCircularBufferPush(\n    const SherpaOnnxCircularBuffer *buffer, const float *p, int32_t n);\n\n/**\n * @brief Copy out a slice of samples from a circular buffer.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n * @param start_index Absolute start index in the buffer timeline.\n * @param n Number of samples to copy.\n * @return A newly allocated array containing @p n samples. Free it with\n *         SherpaOnnxCircularBufferFree().\n *\n * @code\n * const float *samples = SherpaOnnxCircularBufferGet(buffer, start, 3200);\n * SherpaOnnxCircularBufferFree(samples);\n * @endcode\n */\nSHERPA_ONNX_API const float *SherpaOnnxCircularBufferGet(\n    const SherpaOnnxCircularBuffer *buffer, int32_t start_index, int32_t n);\n\n/** @brief Free an array returned by SherpaOnnxCircularBufferGet(). 
*/\nSHERPA_ONNX_API void SherpaOnnxCircularBufferFree(const float *p);\n\n/**\n * @brief Drop samples from the front of a circular buffer.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n * @param n Number of samples to remove.\n */\nSHERPA_ONNX_API void SherpaOnnxCircularBufferPop(\n    const SherpaOnnxCircularBuffer *buffer, int32_t n);\n\n/**\n * @brief Return the number of currently stored samples.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n * @return Number of samples currently in the buffer.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxCircularBufferSize(const SherpaOnnxCircularBuffer *buffer);\n\n/**\n * @brief Return the current head index of the buffer timeline.\n *\n * The value is monotonically non-decreasing until\n * SherpaOnnxCircularBufferReset() is called.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n * @return The current head index.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxCircularBufferHead(const SherpaOnnxCircularBuffer *buffer);\n\n/**\n * @brief Clear a circular buffer and reset its head index.\n *\n * @param buffer A pointer returned by SherpaOnnxCreateCircularBuffer().\n */\nSHERPA_ONNX_API void SherpaOnnxCircularBufferReset(\n    const SherpaOnnxCircularBuffer *buffer);\n\n/**\n * @brief One detected speech segment returned by the VAD.\n *\n * The segment owns @c samples. Free the whole object with\n * SherpaOnnxDestroySpeechSegment().\n */\ntypedef struct SherpaOnnxSpeechSegment {\n  /** Start index, in input samples, of this segment. */\n  int32_t start;\n  /** Newly allocated mono samples for this segment. */\n  float *samples;\n  /** Number of samples in @c samples. */\n  int32_t n;\n} SherpaOnnxSpeechSegment;\n\n/** @brief Opaque voice activity detector handle. 
*/\ntypedef struct SherpaOnnxVoiceActivityDetector SherpaOnnxVoiceActivityDetector;\n\n/**\n * @brief Create a voice activity detector.\n *\n * Example model files are shown in `c-api-examples/vad-whisper-c-api.c`.\n *\n * @param config VAD configuration.\n * @param buffer_size_in_seconds Internal buffering capacity in seconds.\n * @return A newly allocated detector on success, or NULL on configuration\n *         error. Free it with SherpaOnnxDestroyVoiceActivityDetector().\n *\n * @code\n * SherpaOnnxVadModelConfig config;\n * memset(&config, 0, sizeof(config));\n * config.silero_vad.model = \"./silero_vad.onnx\";\n * config.silero_vad.threshold = 0.25f;\n * config.silero_vad.min_silence_duration = 0.5f;\n * config.silero_vad.min_speech_duration = 0.5f;\n * config.silero_vad.max_speech_duration = 10.0f;\n * config.silero_vad.window_size = 512;\n * config.sample_rate = 16000;\n * config.num_threads = 1;\n *\n * const SherpaOnnxVoiceActivityDetector *vad =\n *     SherpaOnnxCreateVoiceActivityDetector(&config, 30.0f);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxVoiceActivityDetector *\nSherpaOnnxCreateVoiceActivityDetector(const SherpaOnnxVadModelConfig *config,\n                                      float buffer_size_in_seconds);\n\n/**\n * @brief Destroy a voice activity detector.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyVoiceActivityDetector(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Feed audio samples to the VAD.\n *\n * Input samples are mono floating-point PCM in the range [-1, 1].\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n * @param samples Pointer to @p n samples.\n * @param n Number of samples.\n *\n * @code\n * SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad,\n *                                               wave->samples + i,\n *                                               window_size);\n * @endcode\n 
*/\nSHERPA_ONNX_API void SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n    const SherpaOnnxVoiceActivityDetector *p, const float *samples, int32_t n);\n\n/**\n * @brief Check whether the queue of completed speech segments is empty.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n * @return 1 if no completed speech segment is available; otherwise 0.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxVoiceActivityDetectorEmpty(const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Check whether the detector is currently inside speech.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n * @return 1 if speech is currently detected; otherwise 0.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxVoiceActivityDetectorDetected(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Remove the front speech segment from the detector queue.\n *\n * Call this after consuming the segment returned by\n * SherpaOnnxVoiceActivityDetectorFront().\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n *\n * @code\n * const SherpaOnnxSpeechSegment *segment =\n *     SherpaOnnxVoiceActivityDetectorFront(vad);\n * // ... use segment ...\n * SherpaOnnxDestroySpeechSegment(segment);\n * SherpaOnnxVoiceActivityDetectorPop(vad);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxVoiceActivityDetectorPop(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Remove all queued speech segments.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n */\nSHERPA_ONNX_API void SherpaOnnxVoiceActivityDetectorClear(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Get the first queued speech segment.\n *\n * The returned segment is a copy owned by the caller. 
Free it with\n * SherpaOnnxDestroySpeechSegment().\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n * @return The first queued speech segment, or NULL if none is available.\n *\n * @code\n * while (!SherpaOnnxVoiceActivityDetectorEmpty(vad)) {\n *   const SherpaOnnxSpeechSegment *segment =\n *       SherpaOnnxVoiceActivityDetectorFront(vad);\n *   printf(\"start=%d, samples=%d\\n\", segment->start, segment->n);\n *   SherpaOnnxDestroySpeechSegment(segment);\n *   SherpaOnnxVoiceActivityDetectorPop(vad);\n * }\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxSpeechSegment *\nSherpaOnnxVoiceActivityDetectorFront(const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Destroy a speech segment returned by\n * SherpaOnnxVoiceActivityDetectorFront().\n *\n * @param p A pointer returned by SherpaOnnxVoiceActivityDetectorFront().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroySpeechSegment(\n    const SherpaOnnxSpeechSegment *p);\n\n/**\n * @brief Reset a voice activity detector so it can process a new stream.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n */\nSHERPA_ONNX_API void SherpaOnnxVoiceActivityDetectorReset(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n/**\n * @brief Flush buffered tail samples and force final segmentation.\n *\n * Call this after the last chunk of input has been fed.\n *\n * @param p A pointer returned by SherpaOnnxCreateVoiceActivityDetector().\n *\n * @code\n * SherpaOnnxVoiceActivityDetectorFlush(vad);\n * @endcode\n */\nSHERPA_ONNX_API void SherpaOnnxVoiceActivityDetectorFlush(\n    const SherpaOnnxVoiceActivityDetector *p);\n\n// ============================================================\n// For offline Text-to-Speech (i.e., non-streaming TTS)\n// ============================================================\n\n/** @brief Configuration for a VITS TTS model. 
*/\ntypedef struct SherpaOnnxOfflineTtsVitsModelConfig {\n  /** Path to the VITS ONNX model, for example `./vits-ljs.onnx`. */\n  const char *model;\n  /** Path to the lexicon file. Ignored if @c data_dir is provided. */\n  const char *lexicon;\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Optional path to espeak-ng-data. */\n  const char *data_dir;\n  /** VITS noise scale. */\n  float noise_scale;\n  /** VITS duration noise scale. */\n  float noise_scale_w;\n  /** Duration scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale;\n  /** Unused legacy field kept for ABI compatibility. */\n  const char *dict_dir;\n} SherpaOnnxOfflineTtsVitsModelConfig;\n\n/** @brief Configuration for a Matcha TTS model. */\ntypedef struct SherpaOnnxOfflineTtsMatchaModelConfig {\n  /** Path to the Matcha acoustic model. */\n  const char *acoustic_model;\n  /** Path to the vocoder model, for example `./vocos-22khz-univ.onnx`. */\n  const char *vocoder;\n  /** Path to the lexicon file. */\n  const char *lexicon;\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Optional path to espeak-ng-data. */\n  const char *data_dir;\n  /** Matcha noise scale. */\n  float noise_scale;\n  /** Duration scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale;\n  /** Unused legacy field kept for ABI compatibility. */\n  const char *dict_dir;\n} SherpaOnnxOfflineTtsMatchaModelConfig;\n\n/** @brief Configuration for a Kokoro TTS model. */\ntypedef struct SherpaOnnxOfflineTtsKokoroModelConfig {\n  /** Path to the Kokoro model, for example `./kokoro-en-v0_19/model.onnx`. */\n  const char *model;\n  /** Path to the Kokoro voices file. */\n  const char *voices;\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Optional path to espeak-ng-data. */\n  const char *data_dir;\n  /** Duration scale. Values < 1 are faster; values > 1 are slower. 
*/\n  float length_scale;\n  /** Unused legacy field kept for ABI compatibility. */\n  const char *dict_dir;\n  /** Optional lexicon file. */\n  const char *lexicon;\n  /** Optional language hint. */\n  const char *lang;\n} SherpaOnnxOfflineTtsKokoroModelConfig;\n\n/** @brief Configuration for a Kitten TTS model. */\ntypedef struct SherpaOnnxOfflineTtsKittenModelConfig {\n  /** Path to the Kitten model. */\n  const char *model;\n  /** Path to the Kitten voices file. */\n  const char *voices;\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Optional path to espeak-ng-data. */\n  const char *data_dir;\n  /** Duration scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale;\n} SherpaOnnxOfflineTtsKittenModelConfig;\n\n/** @brief Configuration for a ZipVoice TTS model. */\ntypedef struct SherpaOnnxOfflineTtsZipvoiceModelConfig {\n  /** Path to the tokens file. */\n  const char *tokens;\n  /** Path to the ZipVoice encoder model. */\n  const char *encoder;\n  /** Path to the ZipVoice decoder model. */\n  const char *decoder;\n  /** Path to the vocoder model. */\n  const char *vocoder;\n  /** Optional path to espeak-ng-data. */\n  const char *data_dir;\n  /** Path to the lexicon file. */\n  const char *lexicon;\n  /** Feature scaling factor. */\n  float feat_scale;\n  /** Time shift parameter. */\n  float t_shift;\n  /** Target RMS parameter. */\n  float target_rms;\n  /** Guidance scale parameter. */\n  float guidance_scale;\n} SherpaOnnxOfflineTtsZipvoiceModelConfig;\n\n/** @brief Configuration for a Pocket TTS model. */\ntypedef struct SherpaOnnxOfflineTtsPocketModelConfig {\n  /** Path to `lm_flow*.onnx`. */\n  const char *lm_flow;\n  /** Path to `lm_main*.onnx`. */\n  const char *lm_main;\n  /** Path to the Pocket encoder model. */\n  const char *encoder;\n  /** Path to the Pocket decoder model. */\n  const char *decoder;\n  /** Path to the text conditioner model. 
*/\n  const char *text_conditioner;\n  /** Path to `vocab.json`. */\n  const char *vocab_json;\n  /** Path to `token_scores.json`. */\n  const char *token_scores_json;\n  /** Voice embedding cache capacity. */\n  int32_t voice_embedding_cache_capacity;\n} SherpaOnnxOfflineTtsPocketModelConfig;\n\n/** @brief Configuration for a Supertonic TTS model. */\ntypedef struct SherpaOnnxOfflineTtsSupertonicModelConfig {\n  /** Path to the duration predictor model. */\n  const char *duration_predictor;\n  /** Path to the text encoder model. */\n  const char *text_encoder;\n  /** Path to the vector estimator model. */\n  const char *vector_estimator;\n  /** Path to the vocoder model. */\n  const char *vocoder;\n  /** Path to `tts.json`. */\n  const char *tts_json;\n  /** Path to the unicode indexer file. */\n  const char *unicode_indexer;\n  /** Path to the voice style file. */\n  const char *voice_style;\n} SherpaOnnxOfflineTtsSupertonicModelConfig;\n\n/**\n * @brief Configuration shared by offline TTS models.\n *\n * Exactly one TTS model family should be configured. For example, set only one\n * of @c vits, @c matcha, @c kokoro, @c kitten, @c zipvoice, @c pocket, or\n * @c supertonic.\n *\n * If multiple model families are configured at the same time, the\n * implementation will choose one of them, and which one is used is\n * implementation-defined. Do not rely on any precedence rule.\n *\n * Concrete example model packages in this repository include:\n * - `kokoro-en-v0_19`\n * - `sherpa-onnx-pocket-tts-int8-2026-01-26`\n * - `matcha-icefall-en_US-ljspeech`\n * - `sherpa-onnx-zipvoice-distill-int8-zh-en-emilia`\n */\ntypedef struct SherpaOnnxOfflineTtsModelConfig {\n  /** VITS configuration. */\n  SherpaOnnxOfflineTtsVitsModelConfig vits;\n  /** Number of backend threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider, for example \"cpu\" or \"cuda\". 
*/\n  const char *provider;\n  /** Matcha configuration. */\n  SherpaOnnxOfflineTtsMatchaModelConfig matcha;\n  /** Kokoro configuration. */\n  SherpaOnnxOfflineTtsKokoroModelConfig kokoro;\n  /** Kitten configuration. */\n  SherpaOnnxOfflineTtsKittenModelConfig kitten;\n  /** ZipVoice configuration. */\n  SherpaOnnxOfflineTtsZipvoiceModelConfig zipvoice;\n  /** Pocket configuration. */\n  SherpaOnnxOfflineTtsPocketModelConfig pocket;\n  /** Supertonic configuration. */\n  SherpaOnnxOfflineTtsSupertonicModelConfig supertonic;\n} SherpaOnnxOfflineTtsModelConfig;\n\n/**\n * @brief Configuration for offline text-to-speech.\n *\n * @code\n * SherpaOnnxOfflineTtsConfig config;\n * memset(&config, 0, sizeof(config));\n *\n * config.model.kokoro.model = \"./kokoro-en-v0_19/model.onnx\";\n * config.model.kokoro.voices = \"./kokoro-en-v0_19/voices.bin\";\n * config.model.kokoro.tokens = \"./kokoro-en-v0_19/tokens.txt\";\n * config.model.kokoro.data_dir = \"./kokoro-en-v0_19/espeak-ng-data\";\n * config.model.num_threads = 2;\n * config.model.provider = \"cpu\";\n * config.model.debug = 0;\n * config.max_num_sentences = 2;\n * @endcode\n */\ntypedef struct SherpaOnnxOfflineTtsConfig {\n  /** TTS model configuration. */\n  SherpaOnnxOfflineTtsModelConfig model;\n  /** Optional comma-separated rule FST list. */\n  const char *rule_fsts;\n  /** Maximum number of sentences processed per chunk. */\n  int32_t max_num_sentences;\n  /** Optional FAR archives used by text normalization rules. */\n  const char *rule_fars;\n  /** Default silence scale between sentences. */\n  float silence_scale;\n} SherpaOnnxOfflineTtsConfig;\n\n/**\n * @brief Generated waveform returned by TTS APIs.\n *\n * The returned structure owns @c samples. Free the whole object with\n * SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n */\ntypedef struct SherpaOnnxGeneratedAudio {\n  /** Generated mono samples in the range [-1, 1]. */\n  const float *samples;\n  /** Number of samples in @c samples. 
*/\n  int32_t n;\n  /** Output sample rate. */\n  int32_t sample_rate;\n} SherpaOnnxGeneratedAudio;\n\n/**\n * @brief Callback invoked during incremental generation.\n *\n * Return 1 to continue generation. Return 0 to stop early.\n *\n * The @p samples pointer is only valid during the callback. Copy the samples if\n * you need to keep them after the callback returns.\n */\ntypedef int32_t (*SherpaOnnxGeneratedAudioCallback)(const float *samples,\n                                                    int32_t n);\n\n/**\n * @brief Same as SherpaOnnxGeneratedAudioCallback but with an extra user\n * pointer.\n */\ntypedef int32_t (*SherpaOnnxGeneratedAudioCallbackWithArg)(const float *samples,\n                                                           int32_t n,\n                                                           void *arg);\n\n/**\n * @brief Progress callback invoked during incremental generation.\n *\n * @param samples Newly generated samples valid only during the callback.\n * @param n Number of samples in @p samples.\n * @param p Progress in the range [0, 1].\n * @return Return 1 to continue generation. Return 0 to stop early.\n */\ntypedef int32_t (*SherpaOnnxGeneratedAudioProgressCallback)(\n    const float *samples, int32_t n, float p);\n\n/**\n * @brief Same as SherpaOnnxGeneratedAudioProgressCallback but with an extra\n * user pointer.\n */\ntypedef int32_t (*SherpaOnnxGeneratedAudioProgressCallbackWithArg)(\n    const float *samples, int32_t n, float p, void *arg);\n\n/** @brief Opaque offline TTS handle. */\ntypedef struct SherpaOnnxOfflineTts SherpaOnnxOfflineTts;\n\n/**\n * @brief Create an offline TTS engine.\n *\n * @param config TTS configuration.\n * @return A newly allocated TTS engine on success, or NULL on configuration\n *         error. 
Free it with SherpaOnnxDestroyOfflineTts().\n *\n * @code\n * SherpaOnnxOfflineTtsConfig config;\n * memset(&config, 0, sizeof(config));\n * config.model.kokoro.model = \"./kokoro-en-v0_19/model.onnx\";\n * config.model.kokoro.voices = \"./kokoro-en-v0_19/voices.bin\";\n * config.model.kokoro.tokens = \"./kokoro-en-v0_19/tokens.txt\";\n * config.model.kokoro.data_dir = \"./kokoro-en-v0_19/espeak-ng-data\";\n * config.model.num_threads = 2;\n *\n * const SherpaOnnxOfflineTts *tts = SherpaOnnxCreateOfflineTts(&config);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTts(\n    const SherpaOnnxOfflineTtsConfig *config);\n\n/**\n * @brief Destroy an offline TTS engine.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineTts(\n    const SherpaOnnxOfflineTts *tts);\n\n/**\n * @brief Return the output sample rate of a TTS engine.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @return Output sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxOfflineTtsSampleRate(const SherpaOnnxOfflineTts *tts);\n\n/**\n * @brief Return the number of available speaker IDs.\n *\n * Single-speaker models often return 1.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @return Number of speakers supported by the model.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxOfflineTtsNumSpeakers(const SherpaOnnxOfflineTts *tts);\n\n/**\n * @brief Generate speech from text using the simple sid/speed interface.\n *\n * @deprecated Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param sid Speaker ID for multi-speaker models.\n * @param speed Speech rate. Values > 1 are faster.\n * @return Generated audio, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n *\n * @code\n * const SherpaOnnxGeneratedAudio *audio =\n *     SherpaOnnxOfflineTtsGenerate(tts, \"Hello from sherpa-onnx!\", 0, 1.0f);\n * SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,\n *                     \"./generated.wav\");\n * SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n * @endcode\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerate(\n        const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n        float speed);\n\n/**\n * @brief Generate speech and receive incremental audio chunks through a\n * callback.\n *\n * @deprecated Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n *\n * The callback receives newly generated samples. The sample pointer is valid\n * only for the duration of the callback.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param sid Speaker ID for multi-speaker models.\n * @param speed Speech rate. Values > 1 are faster.\n * @param callback Incremental callback. Return 0 to stop generation early.\n * @return Final generated audio, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallback(\n        const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n        float speed, SherpaOnnxGeneratedAudioCallback callback);\n\n/**\n * @brief Generate speech with a progress callback.\n *\n * @deprecated Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param sid Speaker ID for multi-speaker models.\n * @param speed Speech rate. Values > 1 are faster.\n * @param callback Progress callback. Return 0 to stop generation early.\n * @return Final generated audio, or NULL on error. Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n *\n * @code\n * int32_t Progress(const float *samples, int32_t n, float p) {\n *   fprintf(stderr, \"Progress: %.2f%%\\n\", p * 100);\n *   return 1;\n * }\n *\n * const SherpaOnnxGeneratedAudio *audio =\n *     SherpaOnnxOfflineTtsGenerateWithProgressCallback(tts, text, 0, 1.0f,\n *                                                      Progress);\n * @endcode\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithProgressCallback(\n        const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n        float speed, SherpaOnnxGeneratedAudioProgressCallback callback);\n\n/**\n * @brief Generate speech with a progress callback that receives a user pointer.\n *\n * @deprecated Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param sid Speaker ID for multi-speaker models.\n * @param speed Speech rate. 
Values > 1 are faster.\n * @param callback Progress callback with user pointer. Return 0 to stop early.\n * @param arg User pointer forwarded to @p callback.\n * @return Final generated audio, or NULL on error. Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio\n        *SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n            const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n            float speed,\n            SherpaOnnxGeneratedAudioProgressCallbackWithArg callback,\n            void *arg);\n\n/**\n * @brief Same as SherpaOnnxOfflineTtsGenerateWithCallback() but with a user\n * pointer.\n *\n * @deprecated Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param sid Speaker ID for multi-speaker models.\n * @param speed Speech rate. Values > 1 are faster.\n * @param callback Incremental callback with user pointer.\n * @param arg User pointer forwarded to @p callback.\n * @return Final generated audio, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n        const SherpaOnnxOfflineTts *tts, const char *text, int32_t sid,\n        float speed, SherpaOnnxGeneratedAudioCallbackWithArg callback,\n        void *arg);\n\n/**\n * @brief Deprecated ZipVoice-specific generation API.\n *\n * Use SherpaOnnxOfflineTtsGenerateWithConfig() instead.\n */\nSHERPA_ONNX_API SHERPA_ONNX_DEPRECATED(\n    \"Use SherpaOnnxOfflineTtsGenerateWithConfig() instead\") const\n    SherpaOnnxGeneratedAudio *SherpaOnnxOfflineTtsGenerateWithZipvoice(\n        const SherpaOnnxOfflineTts *tts, const char *text,\n        const char *prompt_text, const float *prompt_samples, int32_t n_prompt,\n        int32_t prompt_sr, float speed, int32_t num_steps);\n\n/**\n * @brief Generation-time parameters shared by advanced TTS APIs.\n *\n * This struct supports both simple multi-speaker synthesis and more advanced\n * zero-shot or reference-conditioned models.\n *\n * Example for Pocket TTS:\n *\n * @code\n * SherpaOnnxGenerationConfig cfg;\n * memset(&cfg, 0, sizeof(cfg));\n * cfg.speed = 1.0f;\n * cfg.reference_audio = wave->samples;\n * cfg.reference_audio_len = wave->num_samples;\n * cfg.reference_sample_rate = wave->sample_rate;\n * cfg.extra = \"{\\\"max_reference_audio_len\\\": 10.0, \\\"seed\\\": 42}\";\n * @endcode\n */\ntypedef struct SherpaOnnxGenerationConfig {\n  /** Silence scale between sentences. */\n  float silence_scale;\n  /** Speech rate. Used only by models that support it. */\n  float speed;\n  /** Speaker ID for multi-speaker models. */\n  int32_t sid;\n  /** Optional reference audio for zero-shot or voice-cloning models. */\n  const float *reference_audio;\n  /** Length of @c reference_audio in samples. 
*/\n  int32_t reference_audio_len;\n  /** Sample rate of @c reference_audio. */\n  int32_t reference_sample_rate;\n  /** Optional reference text associated with @c reference_audio. */\n  const char *reference_text;\n  /** Optional number of flow-matching steps. */\n  int32_t num_steps;\n  /** Optional model-specific JSON string with extra key/value pairs. */\n  const char *extra;\n} SherpaOnnxGenerationConfig;\n\n/**\n * @brief Generate speech using the advanced configuration interface.\n *\n * This is the preferred API for new integrations. It supports callback-based\n * progress reporting and model-specific options such as reference audio.\n *\n * @param tts A pointer returned by SherpaOnnxCreateOfflineTts().\n * @param text Input text.\n * @param config Generation-time configuration.\n * @param callback Optional progress callback with user pointer. Return 0 to\n *                 stop early.\n * @param arg User pointer forwarded to @p callback.\n * @return Generated audio, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOfflineTtsGeneratedAudio().\n *\n * @code\n * SherpaOnnxGenerationConfig cfg;\n * memset(&cfg, 0, sizeof(cfg));\n * cfg.sid = 0;\n * cfg.speed = 1.0f;\n * cfg.silence_scale = 0.2f;\n *\n * const SherpaOnnxGeneratedAudio *audio =\n *     SherpaOnnxOfflineTtsGenerateWithConfig(tts,\n *         \"Today as always, men fall into two groups.\",\n *         &cfg, NULL, NULL);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxGeneratedAudio *\nSherpaOnnxOfflineTtsGenerateWithConfig(\n    const SherpaOnnxOfflineTts *tts, const char *text,\n    const SherpaOnnxGenerationConfig *config,\n    SherpaOnnxGeneratedAudioProgressCallbackWithArg callback, void *arg);\n\n/**\n * @brief Destroy audio returned by a TTS generation API.\n *\n * @param p A pointer returned by one of the SherpaOnnxOfflineTtsGenerate*\n *          functions.\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineTtsGeneratedAudio(\n    const SherpaOnnxGeneratedAudio *p);\n\n/**\n * @brief Write floating-point PCM to a mono 16-bit WAVE file.\n *\n * @param samples Pointer to @p n samples in the range [-1, 1].\n * @param n Number of samples.\n * @param sample_rate Sample rate in Hz.\n * @param filename Output filename.\n * @return 1 on success; 0 on failure.\n *\n * @code\n * SherpaOnnxWriteWave(audio->samples, audio->n, audio->sample_rate,\n *                     \"./generated-kokoro-en.wav\");\n * @endcode\n */\nSHERPA_ONNX_API int32_t SherpaOnnxWriteWave(const float *samples, int32_t n,\n                                            int32_t sample_rate,\n                                            const char *filename);\n\n/**\n * @brief Return the number of bytes needed for a mono 16-bit WAVE file.\n *\n * @param n_samples Number of PCM samples.\n * @return Required buffer size in bytes.\n */\nSHERPA_ONNX_API int64_t SherpaOnnxWaveFileSize(int32_t n_samples);\n\n/**\n * @brief Write a mono 16-bit WAVE file to a caller-provided buffer.\n *\n * Allocate at least 
SherpaOnnxWaveFileSize(@p n) bytes before calling.\n *\n * @param samples Pointer to @p n samples in the range [-1, 1].\n * @param n Number of samples.\n * @param sample_rate Sample rate in Hz.\n * @param buffer Output buffer.\n */\nSHERPA_ONNX_API void SherpaOnnxWriteWaveToBuffer(const float *samples,\n                                                 int32_t n, int32_t sample_rate,\n                                                 char *buffer);\n\n/**\n * @brief Decoded mono WAVE file content.\n *\n * Free this object with SherpaOnnxFreeWave().\n */\ntypedef struct SherpaOnnxWave {\n  /** Samples normalized to the range [-1, 1]. */\n  const float *samples;\n  /** Sample rate in Hz. */\n  int32_t sample_rate;\n  /** Number of samples. */\n  int32_t num_samples;\n} SherpaOnnxWave;\n\n/**\n * @brief Read a mono 16-bit PCM WAVE file.\n *\n * @param filename Input WAVE filename.\n * @return A newly allocated wave object, or NULL on error. Free it with\n *         SherpaOnnxFreeWave().\n *\n * @code\n * const SherpaOnnxWave *wave = SherpaOnnxReadWave(\"./Obama.wav\");\n * if (wave) {\n *   printf(\"sample_rate=%d, num_samples=%d\\n\",\n *          wave->sample_rate, wave->num_samples);\n *   SherpaOnnxFreeWave(wave);\n * }\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxWave *SherpaOnnxReadWave(const char *filename);\n\n/**\n * @brief Read a mono 16-bit PCM WAVE file from binary memory.\n *\n * @param data Pointer to the WAVE file bytes.\n * @param n Size of @p data in bytes.\n * @return A newly allocated wave object, or NULL on error. 
Free it with\n *         SherpaOnnxFreeWave().\n */\nSHERPA_ONNX_API const SherpaOnnxWave *SherpaOnnxReadWaveFromBinaryData(\n    const char *data, int32_t n);\n\n/**\n * @brief Destroy a wave object returned by SherpaOnnxReadWave() or\n * SherpaOnnxReadWaveFromBinaryData().\n */\nSHERPA_ONNX_API void SherpaOnnxFreeWave(const SherpaOnnxWave *wave);\n\n// ============================================================\n// For spoken language identification\n// ============================================================\n\n/**\n * @brief Whisper-based model files for spoken language identification.\n *\n * Example:\n *\n * @code\n * SherpaOnnxSpokenLanguageIdentificationWhisperConfig whisper;\n * memset(&whisper, 0, sizeof(whisper));\n * whisper.encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\";\n * whisper.decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\";\n * @endcode\n */\ntypedef struct SherpaOnnxSpokenLanguageIdentificationWhisperConfig {\n  /** Whisper encoder model. */\n  const char *encoder;\n  /** Whisper decoder model. */\n  const char *decoder;\n  /** Optional tail padding in samples appended internally before inference. */\n  int32_t tail_paddings;\n} SherpaOnnxSpokenLanguageIdentificationWhisperConfig;\n\n/**\n * @brief Configuration for spoken language identification.\n *\n * The current implementation uses Whisper-based models.\n *\n * Example using `sherpa-onnx-whisper-tiny`:\n *\n * @code\n * SherpaOnnxSpokenLanguageIdentificationConfig config;\n * memset(&config, 0, sizeof(config));\n * config.whisper.encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\";\n * config.whisper.decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\";\n * config.num_threads = 1;\n * config.provider = \"cpu\";\n * @endcode\n */\ntypedef struct SherpaOnnxSpokenLanguageIdentificationConfig {\n  /** Whisper model configuration. */\n  SherpaOnnxSpokenLanguageIdentificationWhisperConfig whisper;\n  /** Number of inference threads. 
*/\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxSpokenLanguageIdentificationConfig;\n\n/** @brief Opaque spoken-language identification handle. */\ntypedef struct SherpaOnnxSpokenLanguageIdentification\n    SherpaOnnxSpokenLanguageIdentification;\n\n/**\n * @brief Create a spoken-language identifier.\n *\n * @param config Spoken-language identification configuration.\n * @return A newly allocated identifier on success, or NULL on error. Free it\n *         with SherpaOnnxDestroySpokenLanguageIdentification().\n */\nSHERPA_ONNX_API const SherpaOnnxSpokenLanguageIdentification *\nSherpaOnnxCreateSpokenLanguageIdentification(\n    const SherpaOnnxSpokenLanguageIdentificationConfig *config);\n\n/**\n * @brief Destroy a spoken-language identifier.\n *\n * @param slid A pointer returned by\n * SherpaOnnxCreateSpokenLanguageIdentification().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroySpokenLanguageIdentification(\n    const SherpaOnnxSpokenLanguageIdentification *slid);\n\n/**\n * @brief Create an offline stream for spoken-language identification.\n *\n * Feed audio to the returned stream with SherpaOnnxAcceptWaveformOffline(), and\n * then call SherpaOnnxSpokenLanguageIdentificationCompute().\n *\n * @param slid A pointer returned by\n * SherpaOnnxCreateSpokenLanguageIdentification().\n * @return A newly allocated offline stream. 
Free it with\n *         SherpaOnnxDestroyOfflineStream().\n */\nSHERPA_ONNX_API SherpaOnnxOfflineStream *\nSherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(\n    const SherpaOnnxSpokenLanguageIdentification *slid);\n\n/**\n * @brief Result of spoken-language identification.\n *\n * Free this object with SherpaOnnxDestroySpokenLanguageIdentificationResult().\n */\ntypedef struct SherpaOnnxSpokenLanguageIdentificationResult {\n  /**\n   * Predicted language code such as `\"en\"`, `\"de\"`, `\"zh\"`, or `\"es\"`.\n   */\n  const char *lang;\n} SherpaOnnxSpokenLanguageIdentificationResult;\n\n/**\n * @brief Run spoken-language identification on an offline stream.\n *\n * Example:\n *\n * @code\n * SherpaOnnxOfflineStream *stream =\n *     SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(slid);\n * SherpaOnnxAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,\n *                                 wave->num_samples);\n * const SherpaOnnxSpokenLanguageIdentificationResult *result =\n *     SherpaOnnxSpokenLanguageIdentificationCompute(slid, stream);\n * printf(\"lang=%s\\n\", result->lang);\n * SherpaOnnxDestroySpokenLanguageIdentificationResult(result);\n * SherpaOnnxDestroyOfflineStream(stream);\n * @endcode\n *\n * @param slid A pointer returned by\n * SherpaOnnxCreateSpokenLanguageIdentification().\n * @param s A pointer returned by\n *          SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream().\n * @return A newly allocated result object. 
Free it with\n *         SherpaOnnxDestroySpokenLanguageIdentificationResult().\n */\nSHERPA_ONNX_API const SherpaOnnxSpokenLanguageIdentificationResult *\nSherpaOnnxSpokenLanguageIdentificationCompute(\n    const SherpaOnnxSpokenLanguageIdentification *slid,\n    const SherpaOnnxOfflineStream *s);\n\n/**\n * @brief Destroy a spoken-language identification result.\n *\n * @param r A pointer returned by\n * SherpaOnnxSpokenLanguageIdentificationCompute().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroySpokenLanguageIdentificationResult(\n    const SherpaOnnxSpokenLanguageIdentificationResult *r);\n\n// ============================================================\n// For speaker embedding extraction\n// ============================================================\n/**\n * @brief Configuration for speaker embedding extraction.\n *\n * Example using\n * `3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx`:\n *\n * @code\n * SherpaOnnxSpeakerEmbeddingExtractorConfig config;\n * memset(&config, 0, sizeof(config));\n * config.model = \"./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx\";\n * config.num_threads = 1;\n * config.provider = \"cpu\";\n * @endcode\n */\ntypedef struct SherpaOnnxSpeakerEmbeddingExtractorConfig {\n  /** Speaker embedding model file. */\n  const char *model;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxSpeakerEmbeddingExtractorConfig;\n\n/** @brief Opaque speaker embedding extractor handle. */\ntypedef struct SherpaOnnxSpeakerEmbeddingExtractor\n    SherpaOnnxSpeakerEmbeddingExtractor;\n\n/**\n * @brief Create a speaker embedding extractor.\n *\n * @param config Speaker embedding extractor configuration.\n * @return A newly allocated extractor on success, or NULL on error. 
Free it\n *         with SherpaOnnxDestroySpeakerEmbeddingExtractor().\n */\nSHERPA_ONNX_API const SherpaOnnxSpeakerEmbeddingExtractor *\nSherpaOnnxCreateSpeakerEmbeddingExtractor(\n    const SherpaOnnxSpeakerEmbeddingExtractorConfig *config);\n\n/**\n * @brief Destroy a speaker embedding extractor.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingExtractor().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroySpeakerEmbeddingExtractor(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p);\n\n/**\n * @brief Return the embedding dimension produced by the extractor.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingExtractor().\n * @return Embedding dimension.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingExtractorDim(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p);\n\n/**\n * @brief Create a streaming feature buffer for embedding extraction.\n *\n * Feed samples with SherpaOnnxOnlineStreamAcceptWaveform(), then call\n * SherpaOnnxSpeakerEmbeddingExtractorIsReady() and\n * SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding().\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingExtractor().\n * @return A newly allocated online stream. 
Free it with\n *         SherpaOnnxDestroyOnlineStream().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineStream *\nSherpaOnnxSpeakerEmbeddingExtractorCreateStream(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p);\n\n/**\n * @brief Check whether enough audio has been provided to compute an embedding.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingExtractor().\n * @param s A pointer returned by\n * SherpaOnnxSpeakerEmbeddingExtractorCreateStream().\n * @return 1 if the stream is ready; otherwise 0.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingExtractorIsReady(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p,\n    const SherpaOnnxOnlineStream *s);\n\n/**\n * @brief Compute the embedding for a stream.\n *\n * The returned vector has `SherpaOnnxSpeakerEmbeddingExtractorDim(p)` elements.\n * Free it with SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding().\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingExtractor().\n * @param s A pointer returned by\n * SherpaOnnxSpeakerEmbeddingExtractorCreateStream().\n * @return A newly allocated embedding vector.\n *\n * @code\n * const SherpaOnnxOnlineStream *stream =\n *     SherpaOnnxSpeakerEmbeddingExtractorCreateStream(ex);\n * SherpaOnnxOnlineStreamAcceptWaveform(stream, wave->sample_rate,\n *                                      wave->samples, wave->num_samples);\n * SherpaOnnxOnlineStreamInputFinished(stream);\n * if (SherpaOnnxSpeakerEmbeddingExtractorIsReady(ex, stream)) {\n *   const float *v =\n *       SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(ex, stream);\n *   SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v);\n * }\n * SherpaOnnxDestroyOnlineStream(stream);\n * @endcode\n */\nSHERPA_ONNX_API const float *\nSherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(\n    const SherpaOnnxSpeakerEmbeddingExtractor *p,\n    const SherpaOnnxOnlineStream *s);\n\n/**\n * @brief Destroy an embedding vector returned by\n * SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding().\n *\n * @param v 
A pointer returned by\n *          SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding().\n */\nSHERPA_ONNX_API void SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(\n    const float *v);\n\n/** @brief Opaque speaker embedding manager handle. */\ntypedef struct SherpaOnnxSpeakerEmbeddingManager\n    SherpaOnnxSpeakerEmbeddingManager;\n\n/**\n * @brief Create a speaker embedding manager.\n *\n * The manager stores enrolled speaker embeddings and supports speaker search\n * and verification.\n *\n * @param dim Embedding dimension. This should match\n *            SherpaOnnxSpeakerEmbeddingExtractorDim().\n * @return A newly allocated manager. Free it with\n *         SherpaOnnxDestroySpeakerEmbeddingManager().\n */\nSHERPA_ONNX_API const SherpaOnnxSpeakerEmbeddingManager *\nSherpaOnnxCreateSpeakerEmbeddingManager(int32_t dim);\n\n/**\n * @brief Destroy a speaker embedding manager.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroySpeakerEmbeddingManager(\n    const SherpaOnnxSpeakerEmbeddingManager *p);\n\n/**\n * @brief Add one enrollment embedding for a speaker.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name.\n * @param v Embedding vector with exactly `dim` elements.\n * @return 1 on success; 0 on error.\n */\nSHERPA_ONNX_API int32_t\nSherpaOnnxSpeakerEmbeddingManagerAdd(const SherpaOnnxSpeakerEmbeddingManager *p,\n                                     const char *name, const float *v);\n\n/**\n * @brief Add multiple enrollment embeddings for one speaker.\n *\n * @p v is a NULL-terminated array of embedding pointers:\n * `v[0]`, `v[1]`, ..., `v[n - 1]`, followed by `v[n] == NULL`.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name.\n * @param v NULL-terminated array of embedding pointers.\n * @return 1 on success; 0 on error.\n *\n * @code\n * const float 
*spk1_vec[4] = {e1, e2, e3, NULL};\n * SherpaOnnxSpeakerEmbeddingManagerAddList(manager, \"fangjun\", spk1_vec);\n * @endcode\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerAddList(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float **v);\n\n/**\n * @brief Add multiple enrollment embeddings packed in one flat array.\n *\n * The input contains @p n embeddings laid out consecutively, so the total\n * array length must be `n * dim`.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name.\n * @param v Flattened embedding array.\n * @param n Number of embeddings in @p v.\n * @return 1 on success; 0 on error.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float *v, int32_t n);\n\n/**\n * @brief Remove a speaker from the manager.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name to remove.\n * @return 1 if removed; otherwise 0. Returns 0 if the speaker does not exist.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerRemove(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name);\n\n/**\n * @brief Search for the best matching enrolled speaker.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param v Query embedding vector.\n * @param threshold Minimum similarity threshold in the range [0, 1].\n * @return A newly allocated speaker name on match, or NULL if no speaker\n *         passes the threshold. 
Free the returned name with\n *         SherpaOnnxSpeakerEmbeddingManagerFreeSearch().\n */\nSHERPA_ONNX_API const char *SherpaOnnxSpeakerEmbeddingManagerSearch(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const float *v,\n    float threshold);\n\n/**\n * @brief Free a string returned by SherpaOnnxSpeakerEmbeddingManagerSearch().\n *\n * @param name A pointer returned by\n *             SherpaOnnxSpeakerEmbeddingManagerSearch().\n */\nSHERPA_ONNX_API void SherpaOnnxSpeakerEmbeddingManagerFreeSearch(\n    const char *name);\n\n/**\n * @brief One speaker match returned by the best-matches API.\n */\ntypedef struct SherpaOnnxSpeakerEmbeddingManagerSpeakerMatch {\n  /** Similarity score. Larger means more similar. */\n  float score;\n  /** Speaker name. */\n  const char *name;\n} SherpaOnnxSpeakerEmbeddingManagerSpeakerMatch;\n\n/**\n * @brief Collection of best speaker matches.\n *\n * Free this object with SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches().\n */\ntypedef struct SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult {\n  /** Pointer to an array of @c count matches. */\n  const SherpaOnnxSpeakerEmbeddingManagerSpeakerMatch *matches;\n  /** Number of valid entries in @c matches. 
*/\n  int32_t count;\n} SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult;\n\n/**\n * @brief Return up to @p n best matches above a similarity threshold.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param v Query embedding vector.\n * @param threshold Minimum similarity threshold in the range [0, 1].\n * @param n Maximum number of matches to return.\n * @return A newly allocated result object, or NULL if no matches are found.\n *         Free it with SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches().\n */\nSHERPA_ONNX_API const SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult *\nSherpaOnnxSpeakerEmbeddingManagerGetBestMatches(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const float *v, float threshold,\n    int32_t n);\n\n/**\n * @brief Destroy a best-matches result.\n *\n * @param r A pointer returned by\n * SherpaOnnxSpeakerEmbeddingManagerGetBestMatches().\n */\nSHERPA_ONNX_API void SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches(\n    const SherpaOnnxSpeakerEmbeddingManagerBestMatchesResult *r);\n\n/**\n * @brief Verify whether a query embedding matches a named speaker.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name to compare against.\n * @param v Query embedding vector.\n * @param threshold Minimum similarity threshold in the range [0, 1].\n * @return 1 if the speaker matches; otherwise 0.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerVerify(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name,\n    const float *v, float threshold);\n\n/**\n * @brief Check whether a speaker is enrolled.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @param name Speaker name.\n * @return 1 if the speaker exists; otherwise 0.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerContains(\n    const SherpaOnnxSpeakerEmbeddingManager *p, const char *name);\n\n/**\n * @brief Return the 
number of enrolled speakers.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @return Number of enrolled speakers.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(\n    const SherpaOnnxSpeakerEmbeddingManager *p);\n\n/**\n * @brief Return all enrolled speaker names.\n *\n * The returned array is NULL-terminated. If no speakers are enrolled, the\n * returned array still exists and its first element is NULL.\n *\n * @param p A pointer returned by SherpaOnnxCreateSpeakerEmbeddingManager().\n * @return A newly allocated NULL-terminated array of speaker names. Free it\n *         with SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers().\n */\nSHERPA_ONNX_API const char *const *\nSherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(\n    const SherpaOnnxSpeakerEmbeddingManager *p);\n\n/**\n * @brief Free an array returned by\n * SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers().\n *\n * @param names A pointer returned by\n * SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers().\n */\nSHERPA_ONNX_API void SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(\n    const char *const *names);\n\n// ============================================================\n// For audio tagging\n// ============================================================\n/** @brief Zipformer audio-tagging model configuration. */\ntypedef struct SherpaOnnxOfflineZipformerAudioTaggingModelConfig {\n  /** Model filename. */\n  const char *model;\n} SherpaOnnxOfflineZipformerAudioTaggingModelConfig;\n\n/**\n * @brief Audio-tagging model configuration.\n *\n * Configure exactly one model family. 
If multiple model families are provided,\n * one of them will be used and the choice is implementation-defined.\n *\n * Example using\n * `sherpa-onnx-zipformer-audio-tagging-2024-04-09`:\n *\n * @code\n * SherpaOnnxAudioTaggingModelConfig model;\n * memset(&model, 0, sizeof(model));\n * model.zipformer.model =\n *     \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx\";\n * model.num_threads = 1;\n * model.provider = \"cpu\";\n * @endcode\n */\ntypedef struct SherpaOnnxAudioTaggingModelConfig {\n  /** Zipformer model configuration. */\n  SherpaOnnxOfflineZipformerAudioTaggingModelConfig zipformer;\n  /** Alternative CED model file. */\n  const char *ced;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxAudioTaggingModelConfig;\n\n/**\n * @brief Configuration for audio tagging.\n *\n * @code\n * SherpaOnnxAudioTaggingConfig config;\n * memset(&config, 0, sizeof(config));\n * config.model.zipformer.model =\n *     \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx\";\n * config.model.num_threads = 1;\n * config.model.provider = \"cpu\";\n * config.labels =\n *     \"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv\";\n * config.top_k = 5;\n * @endcode\n */\ntypedef struct SherpaOnnxAudioTaggingConfig {\n  /** Acoustic model configuration. */\n  SherpaOnnxAudioTaggingModelConfig model;\n  /** CSV file containing class labels. */\n  const char *labels;\n  /** Default number of results to return when `top_k == -1` at inference time.\n   */\n  int32_t top_k;\n} SherpaOnnxAudioTaggingConfig;\n\n/**\n * @brief One audio-tagging prediction.\n */\ntypedef struct SherpaOnnxAudioEvent {\n  /** Event label. */\n  const char *name;\n  /** Integer label index. */\n  int32_t index;\n  /** Probability or confidence score. 
*/\n  float prob;\n} SherpaOnnxAudioEvent;\n\n/** @brief Opaque audio tagger handle. */\ntypedef struct SherpaOnnxAudioTagging SherpaOnnxAudioTagging;\n\n/**\n * @brief Create an audio tagger.\n *\n * @param config Audio-tagging configuration.\n * @return A newly allocated audio tagger on success, or NULL on error. Free it\n *         with SherpaOnnxDestroyAudioTagging().\n */\nSHERPA_ONNX_API const SherpaOnnxAudioTagging *SherpaOnnxCreateAudioTagging(\n    const SherpaOnnxAudioTaggingConfig *config);\n\n/**\n * @brief Destroy an audio tagger.\n *\n * @param tagger A pointer returned by SherpaOnnxCreateAudioTagging().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyAudioTagging(\n    const SherpaOnnxAudioTagging *tagger);\n\n/**\n * @brief Create an offline stream for audio tagging.\n *\n * @param tagger A pointer returned by SherpaOnnxCreateAudioTagging().\n * @return A newly allocated offline stream. Free it with\n *         SherpaOnnxDestroyOfflineStream().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineStream *\nSherpaOnnxAudioTaggingCreateOfflineStream(const SherpaOnnxAudioTagging *tagger);\n\n/**\n * @brief Run audio tagging on an offline stream.\n *\n * The returned array is NULL-terminated. If @p top_k is -1, the value stored in\n * `config.top_k` is used instead.\n *\n * @param tagger A pointer returned by SherpaOnnxCreateAudioTagging().\n * @param s A pointer returned by SherpaOnnxAudioTaggingCreateOfflineStream().\n * @param top_k Number of top results to return, or -1 to use the configured\n *              default.\n * @return A newly allocated NULL-terminated array of result pointers ordered by\n *         descending probability. 
Free it with\n *         SherpaOnnxAudioTaggingFreeResults().\n *\n * @code\n * const SherpaOnnxAudioEvent *const *results =\n *     SherpaOnnxAudioTaggingCompute(tagger, stream, 5);\n * for (int32_t i = 0; results[i] != NULL; ++i) {\n *   printf(\"%d %.3f %s\\n\", results[i]->index, results[i]->prob,\n *          results[i]->name);\n * }\n * SherpaOnnxAudioTaggingFreeResults(results);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxAudioEvent *const *\nSherpaOnnxAudioTaggingCompute(const SherpaOnnxAudioTagging *tagger,\n                              const SherpaOnnxOfflineStream *s, int32_t top_k);\n\n/**\n * @brief Destroy results returned by SherpaOnnxAudioTaggingCompute().\n *\n * @param p A pointer returned by SherpaOnnxAudioTaggingCompute().\n */\nSHERPA_ONNX_API void SherpaOnnxAudioTaggingFreeResults(\n    const SherpaOnnxAudioEvent *const *p);\n\n// ============================================================\n// For punctuation\n// ============================================================\n\n/**\n * @brief Offline punctuation model configuration.\n *\n * Example:\n *\n * @code\n * SherpaOnnxOfflinePunctuationModelConfig model;\n * memset(&model, 0, sizeof(model));\n * model.ct_transformer =\n *     \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\";\n * model.num_threads = 1;\n * model.provider = \"cpu\";\n * @endcode\n */\ntypedef struct SherpaOnnxOfflinePunctuationModelConfig {\n  /** Offline punctuation model file. */\n  const char *ct_transformer;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxOfflinePunctuationModelConfig;\n\n/** @brief Configuration for offline punctuation. */\ntypedef struct SherpaOnnxOfflinePunctuationConfig {\n  /** Model configuration. 
*/\n  SherpaOnnxOfflinePunctuationModelConfig model;\n} SherpaOnnxOfflinePunctuationConfig;\n\n/** @brief Opaque offline punctuation handle. */\ntypedef struct SherpaOnnxOfflinePunctuation SherpaOnnxOfflinePunctuation;\n\n/**\n * @brief Create an offline punctuation processor.\n *\n * @param config Offline punctuation configuration.\n * @return A newly allocated punctuation processor on success, or NULL on\n *         error. Free it with SherpaOnnxDestroyOfflinePunctuation().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflinePunctuation *\nSherpaOnnxCreateOfflinePunctuation(\n    const SherpaOnnxOfflinePunctuationConfig *config);\n\n/**\n * @brief Destroy an offline punctuation processor.\n *\n * @param punct A pointer returned by SherpaOnnxCreateOfflinePunctuation().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflinePunctuation(\n    const SherpaOnnxOfflinePunctuation *punct);\n\n/**\n * @brief Add punctuation to a complete input text.\n *\n * @param punct A pointer returned by SherpaOnnxCreateOfflinePunctuation().\n * @param text Input text without punctuation.\n * @return A newly allocated punctuated string. 
Free it with\n *         SherpaOfflinePunctuationFreeText().\n */\nSHERPA_ONNX_API const char *SherpaOfflinePunctuationAddPunct(\n    const SherpaOnnxOfflinePunctuation *punct, const char *text);\n\n/**\n * @brief Free a string returned by SherpaOfflinePunctuationAddPunct().\n *\n * @param text A pointer returned by SherpaOfflinePunctuationAddPunct().\n */\nSHERPA_ONNX_API void SherpaOfflinePunctuationFreeText(const char *text);\n\n/**\n * @brief Online punctuation model configuration.\n *\n * Example using `sherpa-onnx-online-punct-en-2024-08-06`:\n *\n * @code\n * SherpaOnnxOnlinePunctuationModelConfig model;\n * memset(&model, 0, sizeof(model));\n * model.cnn_bilstm =\n *     \"./sherpa-onnx-online-punct-en-2024-08-06/model.int8.onnx\";\n * model.bpe_vocab = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\";\n * model.num_threads = 1;\n * model.provider = \"cpu\";\n * @endcode\n */\ntypedef struct SherpaOnnxOnlinePunctuationModelConfig {\n  /** Online punctuation model file. */\n  const char *cnn_bilstm;\n  /** BPE vocabulary used by the model. */\n  const char *bpe_vocab;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxOnlinePunctuationModelConfig;\n\n/** @brief Configuration for online punctuation. */\ntypedef struct SherpaOnnxOnlinePunctuationConfig {\n  /** Model configuration. */\n  SherpaOnnxOnlinePunctuationModelConfig model;\n} SherpaOnnxOnlinePunctuationConfig;\n\n/** @brief Opaque online punctuation handle. */\ntypedef struct SherpaOnnxOnlinePunctuation SherpaOnnxOnlinePunctuation;\n\n/**\n * @brief Create an online punctuation processor.\n *\n * @param config Online punctuation configuration.\n * @return A newly allocated punctuation processor on success, or NULL on\n *         error. 
Free it with SherpaOnnxDestroyOnlinePunctuation().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlinePunctuation *\nSherpaOnnxCreateOnlinePunctuation(\n    const SherpaOnnxOnlinePunctuationConfig *config);\n\n/**\n * @brief Destroy an online punctuation processor.\n *\n * @param punctuation A pointer returned by SherpaOnnxCreateOnlinePunctuation().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlinePunctuation(\n    const SherpaOnnxOnlinePunctuation *punctuation);\n\n/**\n * @brief Add punctuation to one text chunk using the online punctuation model.\n *\n * @param punctuation A pointer returned by SherpaOnnxCreateOnlinePunctuation().\n * @param text Input text chunk.\n * @return A newly allocated punctuated string. Free it with\n *         SherpaOnnxOnlinePunctuationFreeText().\n *\n * @code\n * const char *out =\n *     SherpaOnnxOnlinePunctuationAddPunct(punct,\n *         \"how are you i am fine thank you\");\n * printf(\"%s\\n\", out);\n * SherpaOnnxOnlinePunctuationFreeText(out);\n * @endcode\n */\nSHERPA_ONNX_API const char *SherpaOnnxOnlinePunctuationAddPunct(\n    const SherpaOnnxOnlinePunctuation *punctuation, const char *text);\n\n/**\n * @brief Free a string returned by SherpaOnnxOnlinePunctuationAddPunct().\n *\n * @param text A pointer returned by SherpaOnnxOnlinePunctuationAddPunct().\n */\nSHERPA_ONNX_API void SherpaOnnxOnlinePunctuationFreeText(const char *text);\n\n// For resampling\n/** @brief Opaque linear resampler handle. */\ntypedef struct SherpaOnnxLinearResampler SherpaOnnxLinearResampler;\n\n/**\n * @brief Create a linear resampler.\n *\n * A common choice is:\n *\n * @code\n * float min_freq = samp_rate_in_hz < samp_rate_out_hz ? 
samp_rate_in_hz\n *                                                 : samp_rate_out_hz;\n * float filter_cutoff_hz = 0.99f * 0.5f * min_freq;\n * int32_t num_zeros = 6;\n * @endcode\n *\n * @param samp_rate_in_hz Input sample rate in Hz.\n * @param samp_rate_out_hz Output sample rate in Hz.\n * @param filter_cutoff_hz Low-pass cutoff frequency in Hz.\n * @param num_zeros Low-pass filter width control parameter.\n * @return A newly allocated resampler. Free it with\n *         SherpaOnnxDestroyLinearResampler().\n */\nSHERPA_ONNX_API const SherpaOnnxLinearResampler *\nSherpaOnnxCreateLinearResampler(int32_t samp_rate_in_hz,\n                                int32_t samp_rate_out_hz,\n                                float filter_cutoff_hz, int32_t num_zeros);\n\n/**\n * @brief Destroy a linear resampler.\n *\n * @param p A pointer returned by SherpaOnnxCreateLinearResampler().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyLinearResampler(\n    const SherpaOnnxLinearResampler *p);\n\n/**\n * @brief Reset a linear resampler to its initial state.\n *\n * @param p A pointer returned by SherpaOnnxCreateLinearResampler().\n */\nSHERPA_ONNX_API void SherpaOnnxLinearResamplerReset(\n    const SherpaOnnxLinearResampler *p);\n\n/**\n * @brief Output chunk returned by SherpaOnnxLinearResamplerResample().\n *\n * Free this object with SherpaOnnxLinearResamplerResampleFree().\n */\ntypedef struct SherpaOnnxResampleOut {\n  /** Output samples. */\n  const float *samples;\n  /** Number of output samples. */\n  int32_t n;\n} SherpaOnnxResampleOut;\n\n/**\n * @brief Resample one chunk of input audio.\n *\n * Set @p flush to 1 for the final chunk so buffered samples are emitted.\n *\n * @param p A pointer returned by SherpaOnnxCreateLinearResampler().\n * @param input Input sample array.\n * @param input_dim Number of input samples.\n * @param flush 1 if this is the final chunk; otherwise 0.\n * @return A newly allocated output chunk. 
Free it with\n *         SherpaOnnxLinearResamplerResampleFree().\n */\nSHERPA_ONNX_API const SherpaOnnxResampleOut *SherpaOnnxLinearResamplerResample(\n    const SherpaOnnxLinearResampler *p, const float *input, int32_t input_dim,\n    int32_t flush);\n\n/**\n * @brief Destroy a resampler output chunk.\n *\n * @param p A pointer returned by SherpaOnnxLinearResamplerResample().\n */\nSHERPA_ONNX_API void SherpaOnnxLinearResamplerResampleFree(\n    const SherpaOnnxResampleOut *p);\n\n/**\n * @brief Return the resampler input sample rate.\n *\n * @param p A pointer returned by SherpaOnnxCreateLinearResampler().\n * @return Input sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxLinearResamplerResampleGetInputSampleRate(\n    const SherpaOnnxLinearResampler *p);\n\n/**\n * @brief Return the resampler output sample rate.\n *\n * @param p A pointer returned by SherpaOnnxCreateLinearResampler().\n * @return Output sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxLinearResamplerResampleGetOutputSampleRate(\n    const SherpaOnnxLinearResampler *p);\n\n// =========================================================================\n// For offline speaker diarization (i.e., non-streaming speaker diarization)\n// =========================================================================\n/** @brief Pyannote speaker-segmentation model configuration. */\ntypedef struct SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig {\n  /** Segmentation model filename. */\n  const char *model;\n} SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig;\n\n/**\n * @brief Segmentation model configuration for offline speaker diarization.\n *\n * Configure exactly one model family. If multiple model families are provided,\n * one is chosen and the choice is implementation-defined.\n */\ntypedef struct SherpaOnnxOfflineSpeakerSegmentationModelConfig {\n  /** Pyannote segmentation model configuration. 
*/\n  SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig pyannote;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n} SherpaOnnxOfflineSpeakerSegmentationModelConfig;\n\n/**\n * @brief Fast clustering configuration.\n *\n * If @c num_clusters is greater than 0, @c threshold is ignored. When the\n * number of speakers is known in advance, setting @c num_clusters is strongly\n * recommended.\n */\ntypedef struct SherpaOnnxFastClusteringConfig {\n  /** Known number of speakers. If > 0, threshold-based clustering is bypassed.\n   */\n  int32_t num_clusters;\n  /** Distance threshold used when the number of speakers is unknown. */\n  float threshold;\n} SherpaOnnxFastClusteringConfig;\n\n/**\n * @brief Configuration for offline speaker diarization.\n *\n * Example based on `offline-sepaker-diarization-c-api.c`:\n *\n * @code\n * SherpaOnnxOfflineSpeakerDiarizationConfig config;\n * memset(&config, 0, sizeof(config));\n * config.segmentation.pyannote.model =\n *     \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\";\n * config.embedding.model =\n *     \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\";\n * config.clustering.num_clusters = 4;\n * @endcode\n */\ntypedef struct SherpaOnnxOfflineSpeakerDiarizationConfig {\n  /** Speaker segmentation model configuration. */\n  SherpaOnnxOfflineSpeakerSegmentationModelConfig segmentation;\n  /** Speaker embedding extractor configuration. */\n  SherpaOnnxSpeakerEmbeddingExtractorConfig embedding;\n  /** Clustering configuration. */\n  SherpaOnnxFastClusteringConfig clustering;\n  /** Segments shorter than this duration in seconds are discarded. */\n  float min_duration_on;\n  /** Small gaps shorter than this duration in seconds may be merged. 
*/\n  float min_duration_off;\n} SherpaOnnxOfflineSpeakerDiarizationConfig;\n\n/** @brief Opaque offline speaker diarization handle. */\ntypedef struct SherpaOnnxOfflineSpeakerDiarization\n    SherpaOnnxOfflineSpeakerDiarization;\n\n/**\n * @brief Create an offline speaker diarization pipeline.\n *\n * @param config Offline speaker diarization configuration.\n * @return A newly allocated diarizer on success, or NULL on error. Free it\n *         with SherpaOnnxDestroyOfflineSpeakerDiarization().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config);\n\n/**\n * @brief Destroy an offline speaker diarizer.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineSpeakerDiarization(\n    const SherpaOnnxOfflineSpeakerDiarization *sd);\n\n/**\n * @brief Return the expected input sample rate.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n * @return Required input sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(\n    const SherpaOnnxOfflineSpeakerDiarization *sd);\n\n/**\n * @brief Update clustering-related settings of an existing diarizer.\n *\n * Only `config->clustering` is used. Other fields are ignored.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n * @param config Configuration whose `clustering` field will be applied.\n */\nSHERPA_ONNX_API void SherpaOnnxOfflineSpeakerDiarizationSetConfig(\n    const SherpaOnnxOfflineSpeakerDiarization *sd,\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config);\n\n/** @brief Opaque offline speaker diarization result. 
*/\ntypedef struct SherpaOnnxOfflineSpeakerDiarizationResult\n    SherpaOnnxOfflineSpeakerDiarizationResult;\n\n/**\n * @brief One diarization segment.\n */\ntypedef struct SherpaOnnxOfflineSpeakerDiarizationSegment {\n  /** Segment start time in seconds. */\n  float start;\n  /** Segment end time in seconds. */\n  float end;\n  /** Speaker label, typically an integer cluster ID. */\n  int32_t speaker;\n} SherpaOnnxOfflineSpeakerDiarizationSegment;\n\n/**\n * @brief Return the number of speakers in a diarization result.\n *\n * @param r A pointer returned by one of the\n *          SherpaOnnxOfflineSpeakerDiarizationProcess*() functions.\n * @return Number of speaker clusters.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r);\n\n/**\n * @brief Return the number of diarization segments.\n *\n * @param r A pointer returned by one of the\n *          SherpaOnnxOfflineSpeakerDiarizationProcess*() functions.\n * @return Number of segments.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r);\n\n/**\n * @brief Return segments sorted by start time.\n *\n * The returned array contains exactly\n * SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments() entries.\n *\n * @param r A pointer returned by one of the\n *          SherpaOnnxOfflineSpeakerDiarizationProcess*() functions.\n * @return A newly allocated segment array. 
Free it with\n *         SherpaOnnxOfflineSpeakerDiarizationDestroySegment().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarizationSegment *\nSherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r);\n\n/**\n * @brief Destroy a segment array returned by\n * SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime().\n *\n * @param s A pointer returned by\n *          SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime().\n */\nSHERPA_ONNX_API void SherpaOnnxOfflineSpeakerDiarizationDestroySegment(\n    const SherpaOnnxOfflineSpeakerDiarizationSegment *s);\n\n/**\n * @brief Progress callback for offline speaker diarization.\n *\n * The current implementation reports progress but ignores the callback's\n * return value.\n */\ntypedef int32_t (*SherpaOnnxOfflineSpeakerDiarizationProgressCallback)(\n    int32_t num_processed_chunks, int32_t num_total_chunks, void *arg);\n\n/**\n * @brief Same as SherpaOnnxOfflineSpeakerDiarizationProgressCallback but\n * without a user pointer.\n */\ntypedef int32_t (*SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg)(\n    int32_t num_processed_chunks, int32_t num_total_chunks);\n\n/**\n * @brief Run offline speaker diarization.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n * @param samples Input mono PCM samples normalized to [-1, 1].\n * @param n Number of input samples.\n * @return A newly allocated diarization result. 
Free it with\n *         SherpaOnnxOfflineSpeakerDiarizationDestroyResult().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcess(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n);\n\n/**\n * @brief Run offline speaker diarization with a progress callback.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n * @param samples Input mono PCM samples normalized to [-1, 1].\n * @param n Number of input samples.\n * @param callback Progress callback.\n * @param arg User pointer forwarded to @p callback.\n * @return A newly allocated diarization result. Free it with\n *         SherpaOnnxOfflineSpeakerDiarizationDestroyResult().\n *\n * @code\n * static int32_t ProgressCallback(int32_t done, int32_t total, void *arg) {\n *   fprintf(stderr, \"progress %.2f%%\\n\", 100.0f * done / total);\n *   return 0;\n * }\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallback(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n, SherpaOnnxOfflineSpeakerDiarizationProgressCallback callback,\n    void *arg);\n\n/**\n * @brief Run offline speaker diarization with a progress callback that has no\n * user pointer.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeakerDiarization().\n * @param samples Input mono PCM samples normalized to [-1, 1].\n * @param n Number of input samples.\n * @param callback Progress callback.\n * @return A newly allocated diarization result. 
Free it with\n *         SherpaOnnxOfflineSpeakerDiarizationDestroyResult().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarizationResult *\nSherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg(\n    const SherpaOnnxOfflineSpeakerDiarization *sd, const float *samples,\n    int32_t n,\n    SherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg callback);\n\n/**\n * @brief Destroy a diarization result.\n *\n * @param r A pointer returned by one of the\n *          SherpaOnnxOfflineSpeakerDiarizationProcess*() functions.\n */\nSHERPA_ONNX_API void SherpaOnnxOfflineSpeakerDiarizationDestroyResult(\n    const SherpaOnnxOfflineSpeakerDiarizationResult *r);\n\n// =========================================================================\n// For offline speech enhancement\n// =========================================================================\n/** @brief GTCRN offline denoiser model configuration. */\ntypedef struct SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig {\n  /** Model filename. */\n  const char *model;\n} SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig;\n\n/** @brief DPDFNet offline denoiser model configuration. */\ntypedef struct SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig {\n  /** Model filename. */\n  const char *model;\n} SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig;\n\n/**\n * @brief Speech denoiser model configuration shared by offline and online APIs.\n *\n * Configure exactly one model family. If multiple model families are provided,\n * one is chosen and the choice is implementation-defined.\n */\ntypedef struct SherpaOnnxOfflineSpeechDenoiserModelConfig {\n  /** GTCRN model configuration. */\n  SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n  /** Number of inference threads. */\n  int32_t num_threads;\n  /** Non-zero to print debug information. */\n  int32_t debug;\n  /** Execution provider such as `\"cpu\"`. */\n  const char *provider;\n  /** DPDFNet model configuration. 
*/\n  SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n} SherpaOnnxOfflineSpeechDenoiserModelConfig;\n\n/** @brief Configuration for offline speech denoising. */\ntypedef struct SherpaOnnxOfflineSpeechDenoiserConfig {\n  /** Model configuration. */\n  SherpaOnnxOfflineSpeechDenoiserModelConfig model;\n} SherpaOnnxOfflineSpeechDenoiserConfig;\n\n/** @brief Opaque offline speech denoiser handle. */\ntypedef struct SherpaOnnxOfflineSpeechDenoiser SherpaOnnxOfflineSpeechDenoiser;\n\n/**\n * @brief Create an offline speech denoiser.\n *\n * Example using `gtcrn_simple.onnx`:\n *\n * @code\n * SherpaOnnxOfflineSpeechDenoiserConfig config;\n * memset(&config, 0, sizeof(config));\n * config.model.gtcrn.model = \"./gtcrn_simple.onnx\";\n * @endcode\n *\n * @param config Offline denoiser configuration.\n * @return A newly allocated denoiser on success, or NULL on error. Free it\n *         with SherpaOnnxDestroyOfflineSpeechDenoiser().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeechDenoiser *\nSherpaOnnxCreateOfflineSpeechDenoiser(\n    const SherpaOnnxOfflineSpeechDenoiserConfig *config);\n\n/**\n * @brief Destroy an offline speech denoiser.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeechDenoiser().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOfflineSpeechDenoiser(\n    const SherpaOnnxOfflineSpeechDenoiser *sd);\n\n/**\n * @brief Return the expected sample rate for the denoiser.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeechDenoiser().\n * @return Required input sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOfflineSpeechDenoiserGetSampleRate(\n    const SherpaOnnxOfflineSpeechDenoiser *sd);\n\n/**\n * @brief Denoised audio returned by offline or online speech enhancement APIs.\n *\n * Free this object with SherpaOnnxDestroyDenoisedAudio().\n */\ntypedef struct SherpaOnnxDenoisedAudio {\n  /** Output samples in the range [-1, 1]. */\n  const float *samples;\n  /** Number of output samples. 
*/\n  int32_t n;\n  /** Output sample rate in Hz. */\n  int32_t sample_rate;\n} SherpaOnnxDenoisedAudio;\n\n/**\n * @brief Run offline speech denoising on a complete waveform.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOfflineSpeechDenoiser().\n * @param samples Input mono PCM samples normalized to [-1, 1].\n * @param n Number of input samples.\n * @param sample_rate Input sample rate in Hz.\n * @return A newly allocated denoised waveform. Free it with\n *         SherpaOnnxDestroyDenoisedAudio().\n *\n * @code\n * const SherpaOnnxDenoisedAudio *denoised =\n *     SherpaOnnxOfflineSpeechDenoiserRun(sd, wave->samples, wave->num_samples,\n *                                        wave->sample_rate);\n * SherpaOnnxWriteWave(denoised->samples, denoised->n, denoised->sample_rate,\n *                     \"./enhanced.wav\");\n * SherpaOnnxDestroyDenoisedAudio(denoised);\n * @endcode\n */\nSHERPA_ONNX_API const SherpaOnnxDenoisedAudio *\nSherpaOnnxOfflineSpeechDenoiserRun(const SherpaOnnxOfflineSpeechDenoiser *sd,\n                                   const float *samples, int32_t n,\n                                   int32_t sample_rate);\n\n/**\n * @brief Destroy denoised audio returned by a speech enhancement API.\n *\n * @param p A pointer returned by SherpaOnnxOfflineSpeechDenoiserRun(),\n *          SherpaOnnxOnlineSpeechDenoiserRun(), or\n *          SherpaOnnxOnlineSpeechDenoiserFlush().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyDenoisedAudio(\n    const SherpaOnnxDenoisedAudio *p);\n\n// =========================================================================\n// For streaming speech enhancement\n// =========================================================================\n/** @brief Configuration for streaming speech denoising. */\ntypedef struct SherpaOnnxOnlineSpeechDenoiserConfig {\n  /** Model configuration. 
*/\n  SherpaOnnxOfflineSpeechDenoiserModelConfig model;\n} SherpaOnnxOnlineSpeechDenoiserConfig;\n\n/** @brief Opaque online speech denoiser handle. */\ntypedef struct SherpaOnnxOnlineSpeechDenoiser SherpaOnnxOnlineSpeechDenoiser;\n\n/**\n * @brief Create an online speech denoiser.\n *\n * @param config Online denoiser configuration.\n * @return A newly allocated denoiser on success, or NULL on error. Free it\n *         with SherpaOnnxDestroyOnlineSpeechDenoiser().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineSpeechDenoiser *\nSherpaOnnxCreateOnlineSpeechDenoiser(\n    const SherpaOnnxOnlineSpeechDenoiserConfig *config);\n\n/**\n * @brief Destroy an online speech denoiser.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n */\nSHERPA_ONNX_API void SherpaOnnxDestroyOnlineSpeechDenoiser(\n    const SherpaOnnxOnlineSpeechDenoiser *sd);\n\n/**\n * @brief Return the expected input sample rate for the online denoiser.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n * @return Required input sample rate in Hz.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOnlineSpeechDenoiserGetSampleRate(\n    const SherpaOnnxOnlineSpeechDenoiser *sd);\n\n/**\n * @brief Return the recommended chunk size in samples for streaming input.\n *\n * Example programs feed audio to the online denoiser in this chunk size.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n * @return Frame shift in samples.\n */\nSHERPA_ONNX_API int32_t SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(\n    const SherpaOnnxOnlineSpeechDenoiser *sd);\n\n/**\n * @brief Process one chunk of streaming audio.\n *\n * This function is not thread-safe. 
It may return NULL when not enough input\n * has been accumulated to produce denoised output yet.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n * @param samples Input chunk normalized to [-1, 1].\n * @param n Number of input samples.\n * @param sample_rate Input sample rate in Hz.\n * @return A newly allocated denoised chunk, or NULL if no output is available\n *         yet. Free non-NULL results with SherpaOnnxDestroyDenoisedAudio().\n */\nSHERPA_ONNX_API const SherpaOnnxDenoisedAudio *\nSherpaOnnxOnlineSpeechDenoiserRun(const SherpaOnnxOnlineSpeechDenoiser *sd,\n                                  const float *samples, int32_t n,\n                                  int32_t sample_rate);\n\n/**\n * @brief Flush buffered samples and reset the online denoiser.\n *\n * This also resets the denoiser so it can be reused for a new utterance.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n * @return A newly allocated denoised chunk, or NULL if no buffered output\n *         remains. 
Free non-NULL results with SherpaOnnxDestroyDenoisedAudio().\n */\nSHERPA_ONNX_API const SherpaOnnxDenoisedAudio *\nSherpaOnnxOnlineSpeechDenoiserFlush(const SherpaOnnxOnlineSpeechDenoiser *sd);\n\n/**\n * @brief Reset an online denoiser so it can process a new stream.\n *\n * @param sd A pointer returned by SherpaOnnxCreateOnlineSpeechDenoiser().\n */\nSHERPA_ONNX_API void SherpaOnnxOnlineSpeechDenoiserReset(\n    const SherpaOnnxOnlineSpeechDenoiser *sd);\n\n#ifdef __OHOS__\n\n/**\n * @brief HarmonyOS native resource manager type.\n *\n * Pass the resource manager provided by the HarmonyOS application runtime when\n * using the `*OHOS()` constructors below.\n */\ntypedef struct NativeResourceManager NativeResourceManager;\n\n/**\n * @brief Create an offline speech denoiser on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOfflineSpeechDenoiser().\n *\n * @param config Offline denoiser configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated denoiser, or NULL on error. Free it with\n *         SherpaOnnxDestroyOfflineSpeechDenoiser().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeechDenoiser *\nSherpaOnnxCreateOfflineSpeechDenoiserOHOS(\n    const SherpaOnnxOfflineSpeechDenoiserConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create an online speech denoiser on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOnlineSpeechDenoiser().\n *\n * @param config Online denoiser configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated denoiser, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOnlineSpeechDenoiser().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineSpeechDenoiser *\nSherpaOnnxCreateOnlineSpeechDenoiserOHOS(\n    const SherpaOnnxOnlineSpeechDenoiserConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create an online recognizer on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOnlineRecognizer().\n *\n * @param config Recognizer configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated recognizer, or NULL on error. Free it with\n *         SherpaOnnxDestroyOnlineRecognizer().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlineRecognizer *\nSherpaOnnxCreateOnlineRecognizerOHOS(\n    const SherpaOnnxOnlineRecognizerConfig *config, NativeResourceManager *mgr);\n\n/**\n * @brief Create an offline recognizer on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOfflineRecognizer().\n *\n * @param config Recognizer configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated recognizer, or NULL on error. Free it with\n *         SherpaOnnxDestroyOfflineRecognizer().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineRecognizer *\nSherpaOnnxCreateOfflineRecognizerOHOS(\n    const SherpaOnnxOfflineRecognizerConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create a voice activity detector on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateVoiceActivityDetector().\n *\n * @param config VAD model configuration.\n * @param buffer_size_in_seconds Internal buffer duration in seconds.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated VAD instance, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyVoiceActivityDetector().\n */\nSHERPA_ONNX_API const SherpaOnnxVoiceActivityDetector *\nSherpaOnnxCreateVoiceActivityDetectorOHOS(\n    const SherpaOnnxVadModelConfig *config, float buffer_size_in_seconds,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create an offline TTS engine on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOfflineTts().\n *\n * @param config Offline TTS configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated TTS engine, or NULL on error. Free it with\n *         SherpaOnnxDestroyOfflineTts().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineTts *SherpaOnnxCreateOfflineTtsOHOS(\n    const SherpaOnnxOfflineTtsConfig *config, NativeResourceManager *mgr);\n\n/**\n * @brief Create an offline punctuation processor on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOfflinePunctuation().\n *\n * @param config Offline punctuation configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated punctuation processor, or NULL on error. Free it\n *         with SherpaOnnxDestroyOfflinePunctuation().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflinePunctuation *\nSherpaOnnxCreateOfflinePunctuationOHOS(\n    const SherpaOnnxOfflinePunctuationConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create an online punctuation processor on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateOnlinePunctuation().\n *\n * @param config Online punctuation configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated punctuation processor, or NULL on error. 
Free it\n *         with SherpaOnnxDestroyOnlinePunctuation().\n */\nSHERPA_ONNX_API const SherpaOnnxOnlinePunctuation *\nSherpaOnnxCreateOnlinePunctuationOHOS(\n    const SherpaOnnxOnlinePunctuationConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create a speaker embedding extractor on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of\n * SherpaOnnxCreateSpeakerEmbeddingExtractor().\n *\n * @param config Speaker embedding extractor configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated extractor, or NULL on error. Free it with\n *         SherpaOnnxDestroySpeakerEmbeddingExtractor().\n */\nSHERPA_ONNX_API const SherpaOnnxSpeakerEmbeddingExtractor *\nSherpaOnnxCreateSpeakerEmbeddingExtractorOHOS(\n    const SherpaOnnxSpeakerEmbeddingExtractorConfig *config,\n    NativeResourceManager *mgr);\n\n/**\n * @brief Create a keyword spotter on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of SherpaOnnxCreateKeywordSpotter().\n *\n * @param config Keyword spotter configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated keyword spotter, or NULL on error. Free it with\n *         SherpaOnnxDestroyKeywordSpotter().\n */\nSHERPA_ONNX_API const SherpaOnnxKeywordSpotter *\nSherpaOnnxCreateKeywordSpotterOHOS(const SherpaOnnxKeywordSpotterConfig *config,\n                                   NativeResourceManager *mgr);\n\n/**\n * @brief Create an offline speaker diarizer on HarmonyOS.\n *\n * This is the HarmonyOS counterpart of\n * SherpaOnnxCreateOfflineSpeakerDiarization().\n *\n * @param config Offline speaker diarization configuration.\n * @param mgr HarmonyOS resource manager used to resolve bundled assets.\n * @return A newly allocated diarizer, or NULL on error. 
Free it with\n *         SherpaOnnxDestroyOfflineSpeakerDiarization().\n */\nSHERPA_ONNX_API const SherpaOnnxOfflineSpeakerDiarization *\nSherpaOnnxCreateOfflineSpeakerDiarizationOHOS(\n    const SherpaOnnxOfflineSpeakerDiarizationConfig *config,\n    NativeResourceManager *mgr);\n#endif\n\n#if defined(__GNUC__)\n#pragma GCC diagnostic pop\n#endif\n\n#ifdef __cplusplus\n} /* extern \"C\" */\n#endif\n\n#endif  // SHERPA_ONNX_C_API_C_API_H_\n"
  },
  {
    "path": "sherpa-onnx/c-api/cxx-api.cc",
    "content": "// sherpa-onnx/c-api/cxx-api.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/c-api/cxx-api.h\"\n\n#include <algorithm>\n#include <cstring>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"nlohmann/json.hpp\"\n\nnamespace sherpa_onnx::cxx {\n\nstatic void FillSpeechDenoiserModelConfig(\n    const OfflineSpeechDenoiserModelConfig &src,\n    SherpaOnnxOfflineSpeechDenoiserModelConfig *dst) {\n  memset(dst, 0, sizeof(*dst));\n  dst->gtcrn.model = src.gtcrn.model.c_str();\n  dst->dpdfnet.model = src.dpdfnet.model.c_str();\n  dst->num_threads = src.num_threads;\n  dst->provider = src.provider.c_str();\n  dst->debug = src.debug;\n}\n\nWave ReadWave(const std::string &filename) {\n  auto p = SherpaOnnxReadWave(filename.c_str());\n\n  Wave ans;\n  if (p) {\n    ans.samples.resize(p->num_samples);\n\n    std::copy(p->samples, p->samples + p->num_samples, ans.samples.data());\n\n    ans.sample_rate = p->sample_rate;\n    SherpaOnnxFreeWave(p);\n  }\n\n  return ans;\n}\n\nbool WriteWave(const std::string &filename, const Wave &wave) {\n  return SherpaOnnxWriteWave(wave.samples.data(), wave.samples.size(),\n                             wave.sample_rate, filename.c_str());\n}\n\nOnlineStream::OnlineStream(const SherpaOnnxOnlineStream *p)\n    : MoveOnly<OnlineStream, SherpaOnnxOnlineStream>(p) {}\n\nvoid OnlineStream::Destroy(const SherpaOnnxOnlineStream *p) const {\n  SherpaOnnxDestroyOnlineStream(p);\n}\n\nvoid OnlineStream::AcceptWaveform(int32_t sample_rate, const float *samples,\n                                  int32_t n) const {\n  SherpaOnnxOnlineStreamAcceptWaveform(p_, sample_rate, samples, n);\n}\n\nvoid OnlineStream::InputFinished() const {\n  SherpaOnnxOnlineStreamInputFinished(p_);\n}\n\nvoid OnlineStream::SetOption(const char *key, const char *value) const {\n  SherpaOnnxOnlineStreamSetOption(p_, key, value);\n}\n\nconst char *OnlineStream::GetOption(const char *key) const {\n  
return SherpaOnnxOnlineStreamGetOption(p_, key);\n}\n\nint32_t OnlineStream::HasOption(const char *key) const {\n  return SherpaOnnxOnlineStreamHasOption(p_, key);\n}\n\nOnlineRecognizer OnlineRecognizer::Create(\n    const OnlineRecognizerConfig &config) {\n  struct SherpaOnnxOnlineRecognizerConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.feat_config.sample_rate = config.feat_config.sample_rate;\n  c.feat_config.feature_dim = config.feat_config.feature_dim;\n\n  c.model_config.transducer.encoder =\n      config.model_config.transducer.encoder.c_str();\n  c.model_config.transducer.decoder =\n      config.model_config.transducer.decoder.c_str();\n  c.model_config.transducer.joiner =\n      config.model_config.transducer.joiner.c_str();\n\n  c.model_config.paraformer.encoder =\n      config.model_config.paraformer.encoder.c_str();\n  c.model_config.paraformer.decoder =\n      config.model_config.paraformer.decoder.c_str();\n\n  c.model_config.zipformer2_ctc.model =\n      config.model_config.zipformer2_ctc.model.c_str();\n\n  c.model_config.nemo_ctc.model = config.model_config.nemo_ctc.model.c_str();\n  c.model_config.t_one_ctc.model = config.model_config.t_one_ctc.model.c_str();\n\n  c.model_config.tokens = config.model_config.tokens.c_str();\n  c.model_config.num_threads = config.model_config.num_threads;\n  c.model_config.provider = config.model_config.provider.c_str();\n  c.model_config.debug = config.model_config.debug;\n  c.model_config.model_type = config.model_config.model_type.c_str();\n  c.model_config.modeling_unit = config.model_config.modeling_unit.c_str();\n  c.model_config.bpe_vocab = config.model_config.bpe_vocab.c_str();\n  c.model_config.tokens_buf = config.model_config.tokens_buf.c_str();\n  c.model_config.tokens_buf_size = config.model_config.tokens_buf.size();\n\n  c.decoding_method = config.decoding_method.c_str();\n  c.max_active_paths = config.max_active_paths;\n  c.enable_endpoint = config.enable_endpoint;\n  c.rule1_min_trailing_silence = 
config.rule1_min_trailing_silence;\n  c.rule2_min_trailing_silence = config.rule2_min_trailing_silence;\n  c.rule3_min_utterance_length = config.rule3_min_utterance_length;\n  c.hotwords_file = config.hotwords_file.c_str();\n  c.hotwords_score = config.hotwords_score;\n\n  c.ctc_fst_decoder_config.graph = config.ctc_fst_decoder_config.graph.c_str();\n  c.ctc_fst_decoder_config.max_active =\n      config.ctc_fst_decoder_config.max_active;\n\n  c.rule_fsts = config.rule_fsts.c_str();\n  c.rule_fars = config.rule_fars.c_str();\n\n  c.blank_penalty = config.blank_penalty;\n\n  c.hotwords_buf = config.hotwords_buf.c_str();\n  c.hotwords_buf_size = config.hotwords_buf.size();\n\n  c.hr.lexicon = config.hr.lexicon.c_str();\n  c.hr.rule_fsts = config.hr.rule_fsts.c_str();\n\n  auto p = SherpaOnnxCreateOnlineRecognizer(&c);\n  return OnlineRecognizer(p);\n}\n\nOnlineRecognizer::OnlineRecognizer(const SherpaOnnxOnlineRecognizer *p)\n    : MoveOnly<OnlineRecognizer, SherpaOnnxOnlineRecognizer>(p) {}\n\nvoid OnlineRecognizer::Destroy(const SherpaOnnxOnlineRecognizer *p) const {\n  SherpaOnnxDestroyOnlineRecognizer(p);\n}\n\nOnlineStream OnlineRecognizer::CreateStream() const {\n  auto s = SherpaOnnxCreateOnlineStream(p_);\n  return OnlineStream{s};\n}\n\nOnlineStream OnlineRecognizer::CreateStream(const std::string &hotwords) const {\n  auto s = SherpaOnnxCreateOnlineStreamWithHotwords(p_, hotwords.c_str());\n  return OnlineStream{s};\n}\n\nbool OnlineRecognizer::IsReady(const OnlineStream *s) const {\n  return SherpaOnnxIsOnlineStreamReady(p_, s->Get());\n}\n\nvoid OnlineRecognizer::Decode(const OnlineStream *s) const {\n  SherpaOnnxDecodeOnlineStream(p_, s->Get());\n}\n\nvoid OnlineRecognizer::Reset(const OnlineStream *s) const {\n  SherpaOnnxOnlineStreamReset(p_, s->Get());\n}\n\nbool OnlineRecognizer::IsEndpoint(const OnlineStream *s) const {\n  return SherpaOnnxOnlineStreamIsEndpoint(p_, s->Get());\n}\n\nvoid OnlineRecognizer::Decode(const OnlineStream *ss, int32_t n) 
const {\n  if (n <= 0) {\n    return;\n  }\n\n  std::vector<const SherpaOnnxOnlineStream *> streams(n);\n  for (int32_t i = 0; i != n; ++i) {\n    streams[i] = ss[i].Get();\n  }\n\n  SherpaOnnxDecodeMultipleOnlineStreams(p_, streams.data(), n);\n}\n\nOnlineRecognizerResult OnlineRecognizer::GetResult(\n    const OnlineStream *s) const {\n  auto r = SherpaOnnxGetOnlineStreamResult(p_, s->Get());\n\n  OnlineRecognizerResult ans;\n  ans.text = r->text;\n\n  ans.tokens.resize(r->count);\n  for (int32_t i = 0; i != r->count; ++i) {\n    ans.tokens[i] = r->tokens_arr[i];\n  }\n\n  if (r->timestamps) {\n    ans.timestamps.resize(r->count);\n    std::copy(r->timestamps, r->timestamps + r->count, ans.timestamps.data());\n  }\n\n  ans.json = r->json;\n\n  SherpaOnnxDestroyOnlineRecognizerResult(r);\n\n  return ans;\n}\n\n// ============================================================================\n// Non-streaming ASR\n// ============================================================================\nOfflineStream::OfflineStream(const SherpaOnnxOfflineStream *p)\n    : MoveOnly<OfflineStream, SherpaOnnxOfflineStream>(p) {}\n\nvoid OfflineStream::Destroy(const SherpaOnnxOfflineStream *p) const {\n  SherpaOnnxDestroyOfflineStream(p);\n}\n\nvoid OfflineStream::AcceptWaveform(int32_t sample_rate, const float *samples,\n                                   int32_t n) const {\n  SherpaOnnxAcceptWaveformOffline(p_, sample_rate, samples, n);\n}\n\nvoid OfflineStream::SetOption(const char *key, const char *value) const {\n  SherpaOnnxOfflineStreamSetOption(p_, key, value);\n}\n\nconst char *OfflineStream::GetOption(const char *key) const {\n  return SherpaOnnxOfflineStreamGetOption(p_, key);\n}\n\nint32_t OfflineStream::HasOption(const char *key) const {\n  return SherpaOnnxOfflineStreamHasOption(p_, key);\n}\n\nstatic SherpaOnnxOfflineRecognizerConfig Convert(\n    const OfflineRecognizerConfig &config) {\n  struct SherpaOnnxOfflineRecognizerConfig c;\n  memset(&c, 0, sizeof(c));\n\n 
 c.feat_config.sample_rate = config.feat_config.sample_rate;\n  c.feat_config.feature_dim = config.feat_config.feature_dim;\n  c.model_config.transducer.encoder =\n      config.model_config.transducer.encoder.c_str();\n  c.model_config.transducer.decoder =\n      config.model_config.transducer.decoder.c_str();\n  c.model_config.transducer.joiner =\n      config.model_config.transducer.joiner.c_str();\n\n  c.model_config.paraformer.model =\n      config.model_config.paraformer.model.c_str();\n\n  c.model_config.nemo_ctc.model = config.model_config.nemo_ctc.model.c_str();\n\n  c.model_config.whisper.encoder = config.model_config.whisper.encoder.c_str();\n  c.model_config.whisper.decoder = config.model_config.whisper.decoder.c_str();\n  c.model_config.whisper.language =\n      config.model_config.whisper.language.c_str();\n  c.model_config.whisper.task = config.model_config.whisper.task.c_str();\n  c.model_config.whisper.tail_paddings =\n      config.model_config.whisper.tail_paddings;\n  c.model_config.whisper.enable_token_timestamps =\n      config.model_config.whisper.enable_token_timestamps;\n  c.model_config.whisper.enable_segment_timestamps =\n      config.model_config.whisper.enable_segment_timestamps;\n\n  c.model_config.tdnn.model = config.model_config.tdnn.model.c_str();\n\n  c.model_config.tokens = config.model_config.tokens.c_str();\n  c.model_config.num_threads = config.model_config.num_threads;\n  c.model_config.debug = config.model_config.debug;\n  c.model_config.provider = config.model_config.provider.c_str();\n  c.model_config.model_type = config.model_config.model_type.c_str();\n  c.model_config.modeling_unit = config.model_config.modeling_unit.c_str();\n  c.model_config.bpe_vocab = config.model_config.bpe_vocab.c_str();\n  c.model_config.telespeech_ctc = config.model_config.telespeech_ctc.c_str();\n\n  c.model_config.sense_voice.model =\n      config.model_config.sense_voice.model.c_str();\n  c.model_config.sense_voice.language =\n      
config.model_config.sense_voice.language.c_str();\n  c.model_config.sense_voice.use_itn = config.model_config.sense_voice.use_itn;\n\n  c.model_config.moonshine.preprocessor =\n      config.model_config.moonshine.preprocessor.c_str();\n  c.model_config.moonshine.encoder =\n      config.model_config.moonshine.encoder.c_str();\n  c.model_config.moonshine.uncached_decoder =\n      config.model_config.moonshine.uncached_decoder.c_str();\n  c.model_config.moonshine.cached_decoder =\n      config.model_config.moonshine.cached_decoder.c_str();\n  c.model_config.moonshine.merged_decoder =\n      config.model_config.moonshine.merged_decoder.c_str();\n\n  c.model_config.fire_red_asr.encoder =\n      config.model_config.fire_red_asr.encoder.c_str();\n  c.model_config.fire_red_asr.decoder =\n      config.model_config.fire_red_asr.decoder.c_str();\n\n  c.model_config.dolphin.model = config.model_config.dolphin.model.c_str();\n\n  c.model_config.zipformer_ctc.model =\n      config.model_config.zipformer_ctc.model.c_str();\n\n  c.model_config.canary.encoder = config.model_config.canary.encoder.c_str();\n  c.model_config.canary.decoder = config.model_config.canary.decoder.c_str();\n  c.model_config.canary.src_lang = config.model_config.canary.src_lang.c_str();\n  c.model_config.canary.tgt_lang = config.model_config.canary.tgt_lang.c_str();\n  c.model_config.canary.use_pnc = config.model_config.canary.use_pnc;\n\n  c.model_config.wenet_ctc.model = config.model_config.wenet_ctc.model.c_str();\n\n  c.model_config.omnilingual.model =\n      config.model_config.omnilingual.model.c_str();\n\n  c.model_config.funasr_nano.encoder_adaptor =\n      config.model_config.funasr_nano.encoder_adaptor.c_str();\n  c.model_config.funasr_nano.llm = config.model_config.funasr_nano.llm.c_str();\n  c.model_config.funasr_nano.embedding =\n      config.model_config.funasr_nano.embedding.c_str();\n  c.model_config.funasr_nano.tokenizer =\n      config.model_config.funasr_nano.tokenizer.c_str();\n  
c.model_config.funasr_nano.system_prompt =\n      config.model_config.funasr_nano.system_prompt.c_str();\n  c.model_config.funasr_nano.user_prompt =\n      config.model_config.funasr_nano.user_prompt.c_str();\n  c.model_config.funasr_nano.max_new_tokens =\n      config.model_config.funasr_nano.max_new_tokens;\n  c.model_config.funasr_nano.temperature =\n      config.model_config.funasr_nano.temperature;\n  c.model_config.funasr_nano.top_p = config.model_config.funasr_nano.top_p;\n  c.model_config.funasr_nano.seed = config.model_config.funasr_nano.seed;\n  c.model_config.funasr_nano.language =\n      config.model_config.funasr_nano.language.c_str();\n  c.model_config.funasr_nano.itn = config.model_config.funasr_nano.itn ? 1 : 0;\n  c.model_config.funasr_nano.hotwords =\n      config.model_config.funasr_nano.hotwords.c_str();\n  c.model_config.medasr.model = config.model_config.medasr.model.c_str();\n\n  c.model_config.fire_red_asr_ctc.model =\n      config.model_config.fire_red_asr_ctc.model.c_str();\n\n  c.lm_config.model = config.lm_config.model.c_str();\n  c.lm_config.scale = config.lm_config.scale;\n\n  c.decoding_method = config.decoding_method.c_str();\n  c.max_active_paths = config.max_active_paths;\n  c.hotwords_file = config.hotwords_file.c_str();\n  c.hotwords_score = config.hotwords_score;\n\n  c.rule_fsts = config.rule_fsts.c_str();\n  c.rule_fars = config.rule_fars.c_str();\n\n  c.blank_penalty = config.blank_penalty;\n\n  c.hr.lexicon = config.hr.lexicon.c_str();\n  c.hr.rule_fsts = config.hr.rule_fsts.c_str();\n\n  return c;\n}\n\nOfflineRecognizer OfflineRecognizer::Create(\n    const OfflineRecognizerConfig &config) {\n  auto c = Convert(config);\n\n  auto p = SherpaOnnxCreateOfflineRecognizer(&c);\n  return OfflineRecognizer(p);\n}\n\nvoid OfflineRecognizer::SetConfig(const OfflineRecognizerConfig &config) const {\n  auto c = Convert(config);\n  SherpaOnnxOfflineRecognizerSetConfig(p_, &c);\n}\n\nOfflineRecognizer::OfflineRecognizer(const 
SherpaOnnxOfflineRecognizer *p)\n    : MoveOnly<OfflineRecognizer, SherpaOnnxOfflineRecognizer>(p) {}\n\nvoid OfflineRecognizer::Destroy(const SherpaOnnxOfflineRecognizer *p) const {\n  SherpaOnnxDestroyOfflineRecognizer(p);\n}\n\nOfflineStream OfflineRecognizer::CreateStream() const {\n  auto s = SherpaOnnxCreateOfflineStream(p_);\n  return OfflineStream{s};\n}\n\nOfflineStream OfflineRecognizer::CreateStream(\n    const std::string &hotwords) const {\n  auto s = SherpaOnnxCreateOfflineStreamWithHotwords(p_, hotwords.c_str());\n  return OfflineStream{s};\n}\n\nvoid OfflineRecognizer::Decode(const OfflineStream *s) const {\n  SherpaOnnxDecodeOfflineStream(p_, s->Get());\n}\n\nvoid OfflineRecognizer::Decode(const OfflineStream *ss, int32_t n) const {\n  if (n <= 0) {\n    return;\n  }\n\n  std::vector<const SherpaOnnxOfflineStream *> streams(n);\n  for (int32_t i = 0; i != n; ++i) {\n    streams[i] = ss[i].Get();\n  }\n\n  SherpaOnnxDecodeMultipleOfflineStreams(p_, streams.data(), n);\n}\n\nOfflineRecognizerResult OfflineRecognizer::GetResult(\n    const OfflineStream *s) const {\n  auto r = SherpaOnnxGetOfflineStreamResult(s->Get());\n\n  OfflineRecognizerResult ans;\n  if (r) {\n    ans.text = r->text;\n\n    if (r->timestamps) {\n      ans.timestamps.resize(r->count);\n      std::copy(r->timestamps, r->timestamps + r->count, ans.timestamps.data());\n    }\n\n    ans.tokens.resize(r->count);\n    for (int32_t i = 0; i != r->count; ++i) {\n      ans.tokens[i] = r->tokens_arr[i];\n    }\n\n    ans.json = r->json;\n    ans.lang = r->lang ? r->lang : \"\";\n    ans.emotion = r->emotion ? r->emotion : \"\";\n    ans.event = r->event ? 
r->event : \"\";\n\n    if (r->durations) {\n      ans.durations.resize(r->count);\n      std::copy(r->durations, r->durations + r->count, ans.durations.data());\n    }\n  }\n\n  SherpaOnnxDestroyOfflineRecognizerResult(r);\n\n  return ans;\n}\n\nstd::shared_ptr<OfflineRecognizerResult> OfflineRecognizer::GetResultPtr(\n    const OfflineStream *s) const {\n  auto r = GetResult(s);\n  return std::make_shared<OfflineRecognizerResult>(r);\n}\n\nOfflineTts OfflineTts::Create(const OfflineTtsConfig &config) {\n  struct SherpaOnnxOfflineTtsConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.model.vits.model = config.model.vits.model.c_str();\n  c.model.vits.lexicon = config.model.vits.lexicon.c_str();\n  c.model.vits.tokens = config.model.vits.tokens.c_str();\n  c.model.vits.data_dir = config.model.vits.data_dir.c_str();\n  c.model.vits.noise_scale = config.model.vits.noise_scale;\n  c.model.vits.noise_scale_w = config.model.vits.noise_scale_w;\n  c.model.vits.length_scale = config.model.vits.length_scale;\n\n  c.model.matcha.acoustic_model = config.model.matcha.acoustic_model.c_str();\n  c.model.matcha.vocoder = config.model.matcha.vocoder.c_str();\n  c.model.matcha.lexicon = config.model.matcha.lexicon.c_str();\n  c.model.matcha.tokens = config.model.matcha.tokens.c_str();\n  c.model.matcha.data_dir = config.model.matcha.data_dir.c_str();\n  c.model.matcha.noise_scale = config.model.matcha.noise_scale;\n  c.model.matcha.length_scale = config.model.matcha.length_scale;\n\n  c.model.kokoro.model = config.model.kokoro.model.c_str();\n  c.model.kokoro.voices = config.model.kokoro.voices.c_str();\n  c.model.kokoro.tokens = config.model.kokoro.tokens.c_str();\n  c.model.kokoro.data_dir = config.model.kokoro.data_dir.c_str();\n  c.model.kokoro.length_scale = config.model.kokoro.length_scale;\n  c.model.kokoro.lexicon = config.model.kokoro.lexicon.c_str();\n  c.model.kokoro.lang = config.model.kokoro.lang.c_str();\n\n  c.model.kitten.model = config.model.kitten.model.c_str();\n  
c.model.kitten.voices = config.model.kitten.voices.c_str();\n  c.model.kitten.tokens = config.model.kitten.tokens.c_str();\n  c.model.kitten.data_dir = config.model.kitten.data_dir.c_str();\n  c.model.kitten.length_scale = config.model.kitten.length_scale;\n\n  c.model.zipvoice.tokens = config.model.zipvoice.tokens.c_str();\n  c.model.zipvoice.encoder = config.model.zipvoice.encoder.c_str();\n  c.model.zipvoice.decoder = config.model.zipvoice.decoder.c_str();\n  c.model.zipvoice.vocoder = config.model.zipvoice.vocoder.c_str();\n  c.model.zipvoice.data_dir = config.model.zipvoice.data_dir.c_str();\n  c.model.zipvoice.lexicon = config.model.zipvoice.lexicon.c_str();\n  c.model.zipvoice.feat_scale = config.model.zipvoice.feat_scale;\n  c.model.zipvoice.t_shift = config.model.zipvoice.t_shift;\n  c.model.zipvoice.target_rms = config.model.zipvoice.target_rms;\n  c.model.zipvoice.guidance_scale = config.model.zipvoice.guidance_scale;\n\n  c.model.pocket.lm_flow = config.model.pocket.lm_flow.c_str();\n  c.model.pocket.lm_main = config.model.pocket.lm_main.c_str();\n  c.model.pocket.encoder = config.model.pocket.encoder.c_str();\n  c.model.pocket.decoder = config.model.pocket.decoder.c_str();\n  c.model.pocket.text_conditioner =\n      config.model.pocket.text_conditioner.c_str();\n\n  c.model.pocket.vocab_json = config.model.pocket.vocab_json.c_str();\n\n  c.model.pocket.token_scores_json =\n      config.model.pocket.token_scores_json.c_str();\n\n  c.model.pocket.voice_embedding_cache_capacity =\n      config.model.pocket.voice_embedding_cache_capacity;\n\n  c.model.supertonic.duration_predictor =\n      config.model.supertonic.duration_predictor.c_str();\n  c.model.supertonic.text_encoder =\n      config.model.supertonic.text_encoder.c_str();\n  c.model.supertonic.vector_estimator =\n      config.model.supertonic.vector_estimator.c_str();\n  c.model.supertonic.vocoder = config.model.supertonic.vocoder.c_str();\n  c.model.supertonic.tts_json = 
config.model.supertonic.tts_json.c_str();\n  c.model.supertonic.unicode_indexer =\n      config.model.supertonic.unicode_indexer.c_str();\n  c.model.supertonic.voice_style = config.model.supertonic.voice_style.c_str();\n\n  c.model.num_threads = config.model.num_threads;\n  c.model.debug = config.model.debug;\n  c.model.provider = config.model.provider.c_str();\n\n  c.rule_fsts = config.rule_fsts.c_str();\n  c.max_num_sentences = config.max_num_sentences;\n  c.silence_scale = config.silence_scale;\n  c.rule_fars = config.rule_fars.c_str();\n\n  auto p = SherpaOnnxCreateOfflineTts(&c);\n  return OfflineTts(p);\n}\n\nOfflineTts::OfflineTts(const SherpaOnnxOfflineTts *p)\n    : MoveOnly<OfflineTts, SherpaOnnxOfflineTts>(p) {}\n\nvoid OfflineTts::Destroy(const SherpaOnnxOfflineTts *p) const {\n  SherpaOnnxDestroyOfflineTts(p);\n}\n\nint32_t OfflineTts::SampleRate() const {\n  return SherpaOnnxOfflineTtsSampleRate(p_);\n}\n\nint32_t OfflineTts::NumSpeakers() const {\n  return SherpaOnnxOfflineTtsNumSpeakers(p_);\n}\n\nGeneratedAudio OfflineTts::Generate(const std::string &text,\n                                    int32_t sid /*= 0*/, float speed /*= 1.0*/,\n                                    OfflineTtsCallback callback /*= nullptr*/,\n                                    void *arg /*= nullptr*/) const {\n  const SherpaOnnxGeneratedAudio *audio;\n  if (!callback) {\n    audio = SherpaOnnxOfflineTtsGenerate(p_, text.c_str(), sid, speed);\n  } else {\n    audio = SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg(\n        p_, text.c_str(), sid, speed, callback, arg);\n  }\n\n  GeneratedAudio ans;\n\n  if (!audio) {\n    return ans;\n  }\n\n  ans.samples = std::vector<float>{audio->samples, audio->samples + audio->n};\n  ans.sample_rate = audio->sample_rate;\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  return ans;\n}\n\nGeneratedAudio OfflineTts::Generate(const std::string &text,\n                                    const GenerationConfig &config,\n      
                              OfflineTtsCallback callback /*= nullptr*/,\n                                    void *arg /*= nullptr*/) const {\n  SherpaOnnxGenerationConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.silence_scale = config.silence_scale;\n  c.speed = config.speed;\n  c.sid = config.sid;\n  c.reference_audio = config.reference_audio.data();\n  c.reference_audio_len = config.reference_audio.size();\n  c.reference_sample_rate = config.reference_sample_rate;\n  c.reference_text = config.reference_text.c_str();\n  c.num_steps = config.num_steps;\n\n  nlohmann::json j = config.extra;\n  std::string s = j.dump();\n  c.extra = s.c_str();\n\n  const SherpaOnnxGeneratedAudio *audio =\n      SherpaOnnxOfflineTtsGenerateWithConfig(p_, text.c_str(), &c, callback,\n                                             arg);\n  GeneratedAudio ans;\n\n  if (!audio) {\n    return ans;\n  }\n\n  ans.samples = std::vector<float>{audio->samples, audio->samples + audio->n};\n  ans.sample_rate = audio->sample_rate;\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio);\n  return ans;\n}\n\nstd::shared_ptr<GeneratedAudio> OfflineTts::Generate2(\n    const std::string &text, int32_t sid /*= 0*/, float speed /*= 1.0*/,\n    OfflineTtsCallback callback /*= nullptr*/, void *arg /*= nullptr*/) const {\n  auto audio = Generate(text, sid, speed, callback, arg);\n\n  GeneratedAudio *ans = new GeneratedAudio;\n  ans->samples = std::move(audio.samples);\n  ans->sample_rate = audio.sample_rate;\n\n  return std::shared_ptr<GeneratedAudio>(ans);\n}\n\nstd::shared_ptr<GeneratedAudio> OfflineTts::Generate2(\n    const std::string &text, const GenerationConfig &config,\n    OfflineTtsCallback callback /*= nullptr*/, void *arg /*= nullptr*/) const {\n  auto audio = Generate(text, config, callback, arg);\n\n  GeneratedAudio *ans = new GeneratedAudio;\n  ans->samples = std::move(audio.samples);\n  ans->sample_rate = audio.sample_rate;\n\n  return std::shared_ptr<GeneratedAudio>(ans);\n}\n\nKeywordSpotter 
KeywordSpotter::Create(const KeywordSpotterConfig &config) {\n  struct SherpaOnnxKeywordSpotterConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.feat_config.sample_rate = config.feat_config.sample_rate;\n\n  c.model_config.transducer.encoder =\n      config.model_config.transducer.encoder.c_str();\n  c.model_config.transducer.decoder =\n      config.model_config.transducer.decoder.c_str();\n  c.model_config.transducer.joiner =\n      config.model_config.transducer.joiner.c_str();\n  c.feat_config.feature_dim = config.feat_config.feature_dim;\n\n  c.model_config.paraformer.encoder =\n      config.model_config.paraformer.encoder.c_str();\n  c.model_config.paraformer.decoder =\n      config.model_config.paraformer.decoder.c_str();\n\n  c.model_config.zipformer2_ctc.model =\n      config.model_config.zipformer2_ctc.model.c_str();\n\n  c.model_config.nemo_ctc.model = config.model_config.nemo_ctc.model.c_str();\n\n  c.model_config.tokens = config.model_config.tokens.c_str();\n  c.model_config.num_threads = config.model_config.num_threads;\n  c.model_config.provider = config.model_config.provider.c_str();\n  c.model_config.debug = config.model_config.debug;\n  c.model_config.model_type = config.model_config.model_type.c_str();\n  c.model_config.modeling_unit = config.model_config.modeling_unit.c_str();\n  c.model_config.bpe_vocab = config.model_config.bpe_vocab.c_str();\n  c.model_config.tokens_buf = config.model_config.tokens_buf.c_str();\n  c.model_config.tokens_buf_size = config.model_config.tokens_buf.size();\n\n  c.max_active_paths = config.max_active_paths;\n  c.num_trailing_blanks = config.num_trailing_blanks;\n  c.keywords_score = config.keywords_score;\n  c.keywords_threshold = config.keywords_threshold;\n  c.keywords_file = config.keywords_file.c_str();\n  c.keywords_buf = config.keywords_buf.c_str();\n  c.keywords_buf_size = static_cast<int32_t>(config.keywords_buf.size());\n\n  auto p = SherpaOnnxCreateKeywordSpotter(&c);\n  return 
KeywordSpotter(p);\n}\n\nKeywordSpotter::KeywordSpotter(const SherpaOnnxKeywordSpotter *p)\n    : MoveOnly<KeywordSpotter, SherpaOnnxKeywordSpotter>(p) {}\n\nvoid KeywordSpotter::Destroy(const SherpaOnnxKeywordSpotter *p) const {\n  SherpaOnnxDestroyKeywordSpotter(p);\n}\n\nOnlineStream KeywordSpotter::CreateStream() const {\n  auto s = SherpaOnnxCreateKeywordStream(p_);\n  return OnlineStream{s};\n}\n\nOnlineStream KeywordSpotter::CreateStream(const std::string &keywords) const {\n  auto s = SherpaOnnxCreateKeywordStreamWithKeywords(p_, keywords.c_str());\n  return OnlineStream{s};\n}\n\nbool KeywordSpotter::IsReady(const OnlineStream *s) const {\n  return SherpaOnnxIsKeywordStreamReady(p_, s->Get());\n}\n\nvoid KeywordSpotter::Decode(const OnlineStream *s) const {\n  SherpaOnnxDecodeKeywordStream(p_, s->Get());\n}\n\nvoid KeywordSpotter::Decode(const OnlineStream *ss, int32_t n) const {\n  if (n <= 0) {\n    return;\n  }\n\n  std::vector<const SherpaOnnxOnlineStream *> streams(n);\n  for (int32_t i = 0; i != n; ++i) {\n    streams[i] = ss[i].Get();\n  }\n\n  SherpaOnnxDecodeMultipleKeywordStreams(p_, streams.data(), n);\n}\n\nKeywordResult KeywordSpotter::GetResult(const OnlineStream *s) const {\n  auto r = SherpaOnnxGetKeywordResult(p_, s->Get());\n\n  KeywordResult ans;\n  ans.keyword = r->keyword;\n\n  ans.tokens.resize(r->count);\n  for (int32_t i = 0; i < r->count; ++i) {\n    ans.tokens[i] = r->tokens_arr[i];\n  }\n\n  if (r->timestamps) {\n    ans.timestamps.resize(r->count);\n    std::copy(r->timestamps, r->timestamps + r->count, ans.timestamps.data());\n  }\n\n  ans.start_time = r->start_time;\n  ans.json = r->json;\n\n  SherpaOnnxDestroyKeywordResult(r);\n\n  return ans;\n}\n\nvoid KeywordSpotter::Reset(const OnlineStream *s) const {\n  SherpaOnnxResetKeywordStream(p_, s->Get());\n}\n\n// ============================================================\n// For Offline Speech Enhancement\n// 
============================================================\n\nOfflineSpeechDenoiser OfflineSpeechDenoiser::Create(\n    const OfflineSpeechDenoiserConfig &config) {\n  struct SherpaOnnxOfflineSpeechDenoiserConfig c;\n  FillSpeechDenoiserModelConfig(config.model, &c.model);\n\n  auto p = SherpaOnnxCreateOfflineSpeechDenoiser(&c);\n\n  return OfflineSpeechDenoiser(p);\n}\n\nvoid OfflineSpeechDenoiser::Destroy(\n    const SherpaOnnxOfflineSpeechDenoiser *p) const {\n  SherpaOnnxDestroyOfflineSpeechDenoiser(p);\n}\n\nOfflineSpeechDenoiser::OfflineSpeechDenoiser(\n    const SherpaOnnxOfflineSpeechDenoiser *p)\n    : MoveOnly<OfflineSpeechDenoiser, SherpaOnnxOfflineSpeechDenoiser>(p) {}\n\nDenoisedAudio OfflineSpeechDenoiser::Run(const float *samples, int32_t n,\n                                         int32_t sample_rate) const {\n  auto audio = SherpaOnnxOfflineSpeechDenoiserRun(p_, samples, n, sample_rate);\n  if (audio == nullptr) {\n    return {};\n  }\n\n  DenoisedAudio ans;\n  ans.samples = {audio->samples, audio->samples + audio->n};\n  ans.sample_rate = audio->sample_rate;\n  SherpaOnnxDestroyDenoisedAudio(audio);\n\n  return ans;\n}\n\nint32_t OfflineSpeechDenoiser::GetSampleRate() const {\n  return SherpaOnnxOfflineSpeechDenoiserGetSampleRate(p_);\n}\n\nOnlineSpeechDenoiser OnlineSpeechDenoiser::Create(\n    const OnlineSpeechDenoiserConfig &config) {\n  struct SherpaOnnxOnlineSpeechDenoiserConfig c;\n  FillSpeechDenoiserModelConfig(config.model, &c.model);\n\n  auto p = SherpaOnnxCreateOnlineSpeechDenoiser(&c);\n  return OnlineSpeechDenoiser(p);\n}\n\nvoid OnlineSpeechDenoiser::Destroy(\n    const SherpaOnnxOnlineSpeechDenoiser *p) const {\n  SherpaOnnxDestroyOnlineSpeechDenoiser(p);\n}\n\nOnlineSpeechDenoiser::OnlineSpeechDenoiser(\n    const SherpaOnnxOnlineSpeechDenoiser *p)\n    : MoveOnly<OnlineSpeechDenoiser, SherpaOnnxOnlineSpeechDenoiser>(p) {}\n\nDenoisedAudio OnlineSpeechDenoiser::Run(const float *samples, int32_t n,\n                             
           int32_t sample_rate) const {\n  auto audio = SherpaOnnxOnlineSpeechDenoiserRun(p_, samples, n, sample_rate);\n  if (audio == nullptr) {\n    return {};\n  }\n\n  DenoisedAudio ans;\n  ans.samples = {audio->samples, audio->samples + audio->n};\n  ans.sample_rate = audio->sample_rate;\n  SherpaOnnxDestroyDenoisedAudio(audio);\n  return ans;\n}\n\nDenoisedAudio OnlineSpeechDenoiser::Flush() const {\n  auto audio = SherpaOnnxOnlineSpeechDenoiserFlush(p_);\n  if (audio == nullptr) {\n    return {};\n  }\n\n  DenoisedAudio ans;\n  ans.samples = {audio->samples, audio->samples + audio->n};\n  ans.sample_rate = audio->sample_rate;\n  SherpaOnnxDestroyDenoisedAudio(audio);\n  return ans;\n}\n\nvoid OnlineSpeechDenoiser::Reset() const {\n  SherpaOnnxOnlineSpeechDenoiserReset(p_);\n}\n\nint32_t OnlineSpeechDenoiser::GetSampleRate() const {\n  return SherpaOnnxOnlineSpeechDenoiserGetSampleRate(p_);\n}\n\nint32_t OnlineSpeechDenoiser::GetFrameShiftInSamples() const {\n  return SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(p_);\n}\n\nCircularBuffer CircularBuffer::Create(int32_t capacity) {\n  auto p = SherpaOnnxCreateCircularBuffer(capacity);\n  return CircularBuffer(p);\n}\n\nCircularBuffer::CircularBuffer(const SherpaOnnxCircularBuffer *p)\n    : MoveOnly<CircularBuffer, SherpaOnnxCircularBuffer>(p) {}\n\nvoid CircularBuffer::Destroy(const SherpaOnnxCircularBuffer *p) const {\n  SherpaOnnxDestroyCircularBuffer(p);\n}\n\nvoid CircularBuffer::Push(const float *samples, int32_t n) const {\n  SherpaOnnxCircularBufferPush(p_, samples, n);\n}\n\nstd::vector<float> CircularBuffer::Get(int32_t start_index, int32_t n) const {\n  const float *samples = SherpaOnnxCircularBufferGet(p_, start_index, n);\n  std::vector<float> ans(n);\n  std::copy(samples, samples + n, ans.begin());\n\n  SherpaOnnxCircularBufferFree(samples);\n  return ans;\n}\n\nvoid CircularBuffer::Pop(int32_t n) const {\n  SherpaOnnxCircularBufferPop(p_, n);\n}\n\nint32_t CircularBuffer::Size() const 
{\n  return SherpaOnnxCircularBufferSize(p_);\n}\n\nint32_t CircularBuffer::Head() const {\n  return SherpaOnnxCircularBufferHead(p_);\n}\n\nvoid CircularBuffer::Reset() const { SherpaOnnxCircularBufferReset(p_); }\n\nVoiceActivityDetector VoiceActivityDetector::Create(\n    const VadModelConfig &config, float buffer_size_in_seconds) {\n  struct SherpaOnnxVadModelConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.silero_vad.model = config.silero_vad.model.c_str();\n  c.silero_vad.threshold = config.silero_vad.threshold;\n  c.silero_vad.min_silence_duration = config.silero_vad.min_silence_duration;\n  c.silero_vad.min_speech_duration = config.silero_vad.min_speech_duration;\n  c.silero_vad.window_size = config.silero_vad.window_size;\n  c.silero_vad.max_speech_duration = config.silero_vad.max_speech_duration;\n\n  c.ten_vad.model = config.ten_vad.model.c_str();\n  c.ten_vad.threshold = config.ten_vad.threshold;\n  c.ten_vad.min_silence_duration = config.ten_vad.min_silence_duration;\n  c.ten_vad.min_speech_duration = config.ten_vad.min_speech_duration;\n  c.ten_vad.window_size = config.ten_vad.window_size;\n  c.ten_vad.max_speech_duration = config.ten_vad.max_speech_duration;\n\n  c.sample_rate = config.sample_rate;\n  c.num_threads = config.num_threads;\n  c.provider = config.provider.c_str();\n  c.debug = config.debug;\n\n  auto p = SherpaOnnxCreateVoiceActivityDetector(&c, buffer_size_in_seconds);\n  return VoiceActivityDetector(p);\n}\n\nVoiceActivityDetector::VoiceActivityDetector(\n    const SherpaOnnxVoiceActivityDetector *p)\n    : MoveOnly<VoiceActivityDetector, SherpaOnnxVoiceActivityDetector>(p) {}\n\nvoid VoiceActivityDetector::Destroy(\n    const SherpaOnnxVoiceActivityDetector *p) const {\n  SherpaOnnxDestroyVoiceActivityDetector(p);\n}\n\nvoid VoiceActivityDetector::AcceptWaveform(const float *samples,\n                                           int32_t n) const {\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform(p_, samples, n);\n}\n\nbool 
VoiceActivityDetector::IsEmpty() const {\n  return SherpaOnnxVoiceActivityDetectorEmpty(p_);\n}\n\nbool VoiceActivityDetector::IsDetected() const {\n  return SherpaOnnxVoiceActivityDetectorDetected(p_);\n}\n\nvoid VoiceActivityDetector::Pop() const {\n  SherpaOnnxVoiceActivityDetectorPop(p_);\n}\n\nvoid VoiceActivityDetector::Clear() const {\n  SherpaOnnxVoiceActivityDetectorClear(p_);\n}\n\nSpeechSegment VoiceActivityDetector::Front() const {\n  auto f = SherpaOnnxVoiceActivityDetectorFront(p_);\n\n  SpeechSegment segment;\n  if (!f) return segment;\n  segment.start = f->start;\n  segment.samples = std::vector<float>{f->samples, f->samples + f->n};\n\n  SherpaOnnxDestroySpeechSegment(f);\n\n  return segment;\n}\n\nstd::shared_ptr<SpeechSegment> VoiceActivityDetector::FrontPtr() const {\n  auto segment = Front();\n  return std::make_shared<SpeechSegment>(segment);\n}\n\nvoid VoiceActivityDetector::Reset() const {\n  SherpaOnnxVoiceActivityDetectorReset(p_);\n}\n\nvoid VoiceActivityDetector::Flush() const {\n  SherpaOnnxVoiceActivityDetectorFlush(p_);\n}\n\nLinearResampler LinearResampler::Create(int32_t samp_rate_in_hz,\n                                        int32_t samp_rate_out_hz,\n                                        float filter_cutoff_hz,\n                                        int32_t num_zeros) {\n  auto p = SherpaOnnxCreateLinearResampler(samp_rate_in_hz, samp_rate_out_hz,\n                                           filter_cutoff_hz, num_zeros);\n  return LinearResampler(p);\n}\n\nLinearResampler::LinearResampler(const SherpaOnnxLinearResampler *p)\n    : MoveOnly<LinearResampler, SherpaOnnxLinearResampler>(p) {}\n\nvoid LinearResampler::Destroy(const SherpaOnnxLinearResampler *p) const {\n  SherpaOnnxDestroyLinearResampler(p);\n}\n\nvoid LinearResampler::Reset() const { SherpaOnnxLinearResamplerReset(p_); }\n\nstd::vector<float> LinearResampler::Resample(const float *input,\n                                             int32_t input_dim,\n          
                                   bool flush) const {\n  auto out = SherpaOnnxLinearResamplerResample(p_, input, input_dim, flush);\n\n  std::vector<float> ans{out->samples, out->samples + out->n};\n\n  SherpaOnnxLinearResamplerResampleFree(out);\n\n  return ans;\n}\n\nint32_t LinearResampler::GetInputSamplingRate() const {\n  return SherpaOnnxLinearResamplerResampleGetInputSampleRate(p_);\n}\n\nint32_t LinearResampler::GetOutputSamplingRate() const {\n  return SherpaOnnxLinearResamplerResampleGetOutputSampleRate(p_);\n}\n\nstd::string GetVersionStr() { return SherpaOnnxGetVersionStr(); }\n\nstd::string GetGitSha1() { return SherpaOnnxGetGitSha1(); }\n\nstd::string GetGitDate() { return SherpaOnnxGetGitDate(); }\n\nbool FileExists(const std::string &filename) {\n  return SherpaOnnxFileExists(filename.c_str());\n}\n\n// ============================================================\n// For Offline Punctuation\n// ============================================================\nOfflinePunctuation OfflinePunctuation::Create(\n    const OfflinePunctuationConfig &config) {\n  struct SherpaOnnxOfflinePunctuationConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model.ct_transformer = config.model.ct_transformer.c_str();\n  c.model.num_threads = config.model.num_threads;\n  c.model.debug = config.model.debug;\n  c.model.provider = config.model.provider.c_str();\n\n  const SherpaOnnxOfflinePunctuation *punct =\n      SherpaOnnxCreateOfflinePunctuation(&c);\n  return OfflinePunctuation(punct);\n}\n\nOfflinePunctuation::OfflinePunctuation(const SherpaOnnxOfflinePunctuation *p)\n    : MoveOnly<OfflinePunctuation, SherpaOnnxOfflinePunctuation>(p) {}\n\nvoid OfflinePunctuation::Destroy(const SherpaOnnxOfflinePunctuation *p) const {\n  SherpaOnnxDestroyOfflinePunctuation(p);\n}\n\nstd::string OfflinePunctuation::AddPunctuation(const std::string &text) const {\n  const char *result = SherpaOfflinePunctuationAddPunct(p_, text.c_str());\n  if (!result) return {};\n  std::string ans(result);\n 
 SherpaOfflinePunctuationFreeText(result);\n  return ans;\n}\n\n// ============================================================\n// For Online Punctuation\n// ============================================================\nOnlinePunctuation OnlinePunctuation::Create(\n    const OnlinePunctuationConfig &config) {\n  struct SherpaOnnxOnlinePunctuationConfig c;\n  memset(&c, 0, sizeof(c));\n  c.model.cnn_bilstm = config.model.cnn_bilstm.c_str();\n  c.model.bpe_vocab = config.model.bpe_vocab.c_str();\n  c.model.num_threads = config.model.num_threads;\n  c.model.debug = config.model.debug;\n  c.model.provider = config.model.provider.c_str();\n\n  const SherpaOnnxOnlinePunctuation *punct =\n      SherpaOnnxCreateOnlinePunctuation(&c);\n  return OnlinePunctuation(punct);\n}\n\nOnlinePunctuation::OnlinePunctuation(const SherpaOnnxOnlinePunctuation *p)\n    : MoveOnly<OnlinePunctuation, SherpaOnnxOnlinePunctuation>(p) {}\n\nvoid OnlinePunctuation::Destroy(const SherpaOnnxOnlinePunctuation *p) const {\n  SherpaOnnxDestroyOnlinePunctuation(p);\n}\n\nstd::string OnlinePunctuation::AddPunctuation(const std::string &text) const {\n  const char *result = SherpaOnnxOnlinePunctuationAddPunct(p_, text.c_str());\n  if (!result) return {};\n  std::string ans(result);\n  SherpaOnnxOnlinePunctuationFreeText(result);\n  return ans;\n}\n\n// ============================================================\n// For Audio tagging\n// ============================================================\nAudioTagging AudioTagging::Create(const AudioTaggingConfig &config) {\n  struct SherpaOnnxAudioTaggingConfig c;\n  memset(&c, 0, sizeof(c));\n\n  c.model.zipformer.model = config.model.zipformer.model.c_str();\n  c.model.ced = config.model.ced.c_str();\n  c.model.num_threads = config.model.num_threads;\n  c.model.debug = config.model.debug;\n  c.model.provider = config.model.provider.c_str();\n  c.labels = config.labels.c_str();\n  c.top_k = config.top_k;\n\n  const SherpaOnnxAudioTagging *tagger = 
SherpaOnnxCreateAudioTagging(&c);\n  return AudioTagging(tagger);\n}\n\nAudioTagging::AudioTagging(const SherpaOnnxAudioTagging *p)\n    : MoveOnly<AudioTagging, SherpaOnnxAudioTagging>(p) {}\n\nvoid AudioTagging::Destroy(const SherpaOnnxAudioTagging *p) const {\n  SherpaOnnxDestroyAudioTagging(p);\n}\n\nOfflineStream AudioTagging::CreateStream() const {\n  auto s = SherpaOnnxAudioTaggingCreateOfflineStream(p_);\n  return OfflineStream{s};\n}\n\nstd::vector<AudioEvent> AudioTagging::Compute(const OfflineStream *s,\n                                              int32_t top_k /*= -1*/) {\n  auto events = SherpaOnnxAudioTaggingCompute(p_, s->Get(), top_k);\n  std::vector<AudioEvent> ans;\n\n  auto pe = events;\n  while (pe && *pe) {\n    AudioEvent e;\n    e.name = (*pe)->name;\n    e.index = (*pe)->index;\n    e.prob = (*pe)->prob;\n    ans.push_back(std::move(e));\n    ++pe;\n  }\n\n  SherpaOnnxAudioTaggingFreeResults(events);\n\n  return ans;\n}\n\nstd::shared_ptr<std::vector<AudioEvent>> AudioTagging::ComputePtr(\n    const OfflineStream *s, int32_t top_k /*= -1*/) {\n  auto events = Compute(s, top_k);\n  return std::make_shared<std::vector<AudioEvent>>(events);\n}\n\n}  // namespace sherpa_onnx::cxx\n"
  },
  {
    "path": "sherpa-onnx/c-api/cxx-api.h",
    "content": "// sherpa-onnx/c-api/cxx-api.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n/**\n * @file cxx-api.h\n * @brief Public C++ wrapper for the sherpa-onnx C API.\n *\n * This header provides a lightweight C++ interface on top of `c-api.h`. The\n * wrapper follows a few simple design rules:\n *\n * - Configuration objects are plain structs with `std::string`,\n *   `std::vector`, and default values\n * - Runtime handles are move-only RAII classes that automatically release the\n *   underlying C handle\n * - Result objects are copied into standard C++ containers so callers do not\n *   need to manage C-allocated memory manually\n * - The API mirrors the C API closely, while offering a more idiomatic C++\n *   surface\n *\n * Major feature families available in this file:\n *\n * - Streaming ASR\n * - Non-streaming ASR\n * - Non-streaming TTS\n * - Keyword spotting\n * - Offline and online speech enhancement\n * - VAD and circular buffering\n * - Linear resampling\n * - Version/file/WAVE helpers\n * - Offline and online punctuation\n * - Audio tagging\n *\n * Typical usage pattern:\n *\n * 1. Fill a config struct\n * 2. Create the corresponding RAII wrapper with `Class::Create(...)`\n * 3. Check `wrapper.Get()` for success\n * 4. Feed audio or text, run inference, and retrieve results as C++ objects\n * 5. Let destructors clean up automatically\n *\n * Example programs are available in `cxx-api-examples/` and show concrete model\n * packages and end-to-end usage.\n */\n#ifndef SHERPA_ONNX_C_API_CXX_API_H_\n#define SHERPA_ONNX_C_API_CXX_API_H_\n\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nnamespace sherpa_onnx::cxx {\n\n// ============================================================================\n// Streaming ASR\n// ============================================================================\n/** @brief Streaming transducer model files. 
*/\nstruct OnlineTransducerModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n  /** Joiner ONNX model. */\n  std::string joiner;\n};\n\n/** @brief Streaming Paraformer model files. */\nstruct OnlineParaformerModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n};\n\n/** @brief Streaming Zipformer2 CTC model file. */\nstruct OnlineZipformer2CtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Streaming NeMo CTC model file. */\nstruct OnlineNemoCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Streaming T-One CTC model file. */\nstruct OnlineToneCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/**\n * @brief Acoustic model configuration for streaming ASR.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * of them will be chosen and the choice is implementation-defined.\n *\n * Example using\n * `sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20`:\n *\n * @code\n * OnlineModelConfig model;\n * model.transducer.encoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"encoder-epoch-99-avg-1.int8.onnx\";\n * model.transducer.decoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"decoder-epoch-99-avg-1.onnx\";\n * model.transducer.joiner =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"joiner-epoch-99-avg-1.int8.onnx\";\n * model.tokens =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n * model.num_threads = 1;\n * @endcode\n */\nstruct OnlineModelConfig {\n  /** Streaming transducer configuration. */\n  OnlineTransducerModelConfig transducer;\n  /** Streaming Paraformer configuration. 
*/\n  OnlineParaformerModelConfig paraformer;\n  /** Streaming Zipformer2 CTC configuration. */\n  OnlineZipformer2CtcModelConfig zipformer2_ctc;\n  /** Streaming NeMo CTC configuration. */\n  OnlineNemoCtcModelConfig nemo_ctc;\n  /** Streaming T-One CTC configuration. */\n  OnlineToneCtcModelConfig t_one_ctc;\n  /** Token file path. */\n  std::string tokens;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Optional explicit model type hint. */\n  std::string model_type;\n  /** Modeling unit such as `\"cjkchar\"` or `\"bpe\"`. */\n  std::string modeling_unit = \"cjkchar\";\n  /** Optional BPE vocabulary. */\n  std::string bpe_vocab;\n  /** Optional in-memory token content. If non-empty, it is used instead of a\n   * file. */\n  std::string tokens_buf;\n};\n\n/** @brief Feature extraction settings shared by ASR and KWS wrappers. */\nstruct FeatureConfig {\n  /** Input sample rate in Hz. */\n  int32_t sample_rate = 16000;\n  /** Number of features per frame. */\n  int32_t feature_dim = 80;\n};\n\n/** @brief Decoder graph configuration for online CTC + FST decoding. */\nstruct OnlineCtcFstDecoderConfig {\n  /** FST graph file. */\n  std::string graph;\n  /** Maximum number of active states during search. */\n  int32_t max_active = 3000;\n};\n\n/** @brief Homophone replacement resources used by some Chinese ASR setups. */\nstruct HomophoneReplacerConfig {\n  /** Reserved field. Currently unused by the wrapper. */\n  std::string dict_dir;\n  /** Lexicon file used by the replacer. */\n  std::string lexicon;\n  /** Rule FST file used for replacement. 
*/\n  std::string rule_fsts;\n};\n\n/**\n * @brief Configuration for streaming ASR.\n *\n * Example:\n *\n * @code\n * OnlineRecognizerConfig config;\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"encoder-epoch-99-avg-1.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"decoder-epoch-99-avg-1.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\"\n *     \"joiner-epoch-99-avg-1.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\";\n * config.model_config.num_threads = 1;\n * config.hr.lexicon = \"./lexicon.txt\";\n * config.hr.rule_fsts = \"./replace.fst\";\n * @endcode\n */\nstruct OnlineRecognizerConfig {\n  /** Feature extraction configuration. */\n  FeatureConfig feat_config;\n  /** Acoustic model configuration. */\n  OnlineModelConfig model_config;\n\n  /** Decoding method such as `\"greedy_search\"` or `\"modified_beam_search\"`. */\n  std::string decoding_method = \"greedy_search\";\n\n  /** Maximum number of active paths for beam-search-style decoding. */\n  int32_t max_active_paths = 4;\n\n  /** Enable endpoint detection. */\n  bool enable_endpoint = false;\n\n  /** Endpointing rule 1 trailing silence threshold in seconds. */\n  float rule1_min_trailing_silence = 2.4;\n\n  /** Endpointing rule 2 trailing silence threshold in seconds. */\n  float rule2_min_trailing_silence = 1.2;\n\n  /** Endpointing rule 3 minimum utterance length in seconds. */\n  float rule3_min_utterance_length = 20;\n\n  /** Optional hotword file. */\n  std::string hotwords_file;\n\n  /** Hotword boost score. */\n  float hotwords_score = 1.5;\n\n  /** Optional CTC+FST decoder configuration. 
*/\n  OnlineCtcFstDecoderConfig ctc_fst_decoder_config;\n  /** Optional ITN rule FST archive. */\n  std::string rule_fsts;\n  /** Optional ITN rule FAR archive. */\n  std::string rule_fars;\n  /** Optional blank penalty applied during decoding. */\n  float blank_penalty = 0;\n\n  /** Optional in-memory hotword definitions. */\n  std::string hotwords_buf;\n  /** Optional homophone replacement configuration. */\n  HomophoneReplacerConfig hr;\n};\n\n/** @brief Current streaming ASR result copied into C++ containers. */\nstruct OnlineRecognizerResult {\n  /** Decoded text. */\n  std::string text;\n  /** Token sequence. */\n  std::vector<std::string> tokens;\n  /** Per-token timestamps in seconds. */\n  std::vector<float> timestamps;\n  /** JSON representation of the result. */\n  std::string json;\n};\n\n/** @brief Mono PCM waveform used by the helper I/O functions. */\nstruct Wave {\n  /** Samples normalized to `[-1, 1]`. */\n  std::vector<float> samples;\n  /** Sample rate in Hz. */\n  int32_t sample_rate = 0;\n};\n\n/**\n * @brief Read a mono WAVE file into a C++ value object.\n *\n * On failure, the returned wave has `samples.empty() == true`.\n *\n * @param filename Input WAVE filename.\n * @return Decoded wave data.\n */\nSHERPA_ONNX_API Wave ReadWave(const std::string &filename);\n\n/**\n * @brief Write a mono WAVE file from a C++ value object.\n *\n * @param filename Output filename.\n * @param wave PCM samples and sample rate to write.\n * @return `true` on success; `false` on failure.\n */\nSHERPA_ONNX_API bool WriteWave(const std::string &filename, const Wave &wave);\n\n/**\n * @brief Base class for move-only RAII wrappers around C handles.\n *\n * Derived classes implement `Destroy(const T *) const` and inherit automatic\n * destruction, `Get()`, and `Release()`.\n */\ntemplate <typename Derived, typename T>\nclass SHERPA_ONNX_API MoveOnly {\n public:\n  /** @brief Construct an empty wrapper. 
*/\n  MoveOnly() = default;\n  /** @brief Construct a wrapper from a raw C handle. */\n  explicit MoveOnly(const T *p) : p_(p) {}\n\n  /** @brief Destroy the wrapped handle if present. */\n  ~MoveOnly() { Destroy(); }\n\n  MoveOnly(const MoveOnly &) = delete;\n\n  MoveOnly &operator=(const MoveOnly &) = delete;\n\n  MoveOnly(MoveOnly &&other) : p_(other.Release()) {}\n\n  MoveOnly &operator=(MoveOnly &&other) {\n    if (&other == this) {\n      return *this;\n    }\n\n    Destroy();\n\n    p_ = other.Release();\n\n    return *this;\n  }\n\n  /** @brief Return the wrapped raw pointer without transferring ownership. */\n  const T *Get() const { return p_; }\n\n  /** @brief Release ownership of the wrapped raw pointer. */\n  const T *Release() {\n    const T *p = p_;\n    p_ = nullptr;\n    return p;\n  }\n\n private:\n  void Destroy() {\n    if (p_ == nullptr) {\n      return;\n    }\n\n    static_cast<Derived *>(this)->Destroy(p_);\n\n    p_ = nullptr;\n  }\n\n protected:\n  const T *p_ = nullptr;\n};\n\nclass SHERPA_ONNX_API OnlineStream\n    : public MoveOnly<OnlineStream, SherpaOnnxOnlineStream> {\n public:\n  /** @brief Wrap an existing C online stream handle. */\n  explicit OnlineStream(const SherpaOnnxOnlineStream *p);\n\n  /** @brief Append audio samples to the stream. */\n  void AcceptWaveform(int32_t sample_rate, const float *samples,\n                      int32_t n) const;\n\n  /** @brief Indicate that no more input audio will be provided. */\n  void InputFinished() const;\n\n  /** @brief Set a per-stream string option. */\n  void SetOption(const char *key, const char *value) const;\n  /** @brief Get a per-stream string option. */\n  const char *GetOption(const char *key) const;\n  /** @brief Check whether a per-stream option exists. */\n  int32_t HasOption(const char *key) const;\n\n  /** @brief Destroy the wrapped C handle. 
*/\n  void Destroy(const SherpaOnnxOnlineStream *p) const;\n};\n\n/**\n * @brief RAII wrapper for a streaming recognizer.\n *\n * Example:\n *\n * @code\n * OnlineRecognizer recognizer = OnlineRecognizer::Create(config);\n * OnlineStream stream = recognizer.CreateStream();\n * stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),\n *                       wave.samples.size());\n * stream.InputFinished();\n * while (recognizer.IsReady(&stream)) {\n *   recognizer.Decode(&stream);\n * }\n * auto result = recognizer.GetResult(&stream);\n * @endcode\n */\nclass SHERPA_ONNX_API OnlineRecognizer\n    : public MoveOnly<OnlineRecognizer, SherpaOnnxOnlineRecognizer> {\n public:\n  /** @brief Create a streaming recognizer from a config struct. */\n  static OnlineRecognizer Create(const OnlineRecognizerConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOnlineRecognizer *p) const;\n\n  /** @brief Create a stream that uses the recognizer's configured hotwords. */\n  OnlineStream CreateStream() const;\n\n  /** @brief Create a stream with inline hotwords. */\n  OnlineStream CreateStream(const std::string &hotwords) const;\n\n  /** @brief Check whether the given stream has enough data to decode. */\n  bool IsReady(const OnlineStream *s) const;\n\n  /** @brief Decode one ready stream. */\n  void Decode(const OnlineStream *s) const;\n\n  /** @brief Decode multiple ready streams in parallel. */\n  void Decode(const OnlineStream *ss, int32_t n) const;\n\n  /** @brief Return the current recognition result for a stream. */\n  OnlineRecognizerResult GetResult(const OnlineStream *s) const;\n\n  /** @brief Reset a stream after endpointing or utterance completion. */\n  void Reset(const OnlineStream *s) const;\n\n  /** @brief Check whether endpointing has triggered for a stream. 
*/\n  bool IsEndpoint(const OnlineStream *s) const;\n\n private:\n  explicit OnlineRecognizer(const SherpaOnnxOnlineRecognizer *p);\n};\n\n// ============================================================================\n// Non-streaming ASR\n// ============================================================================\n/** @brief Offline transducer model files. */\nstruct OfflineTransducerModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n  /** Joiner ONNX model. */\n  std::string joiner;\n};\n\n/** @brief Offline Paraformer model file. */\nstruct OfflineParaformerModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline NeMo EncDec CTC model file. */\nstruct OfflineNemoEncDecCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline Whisper model configuration. */\nstruct OfflineWhisperModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n  /** Whisper language string such as `\"en\"` or `\"zh\"`. */\n  std::string language;\n  /** Task such as `\"transcribe\"` or `\"translate\"`. */\n  std::string task = \"transcribe\";\n  /** Optional tail paddings in samples. */\n  int32_t tail_paddings = -1;\n  /** Enable token timestamps in the result. */\n  bool enable_token_timestamps = false;\n  /** Enable segment timestamps in the result JSON. */\n  bool enable_segment_timestamps = false;\n};\n\n/** @brief Offline Canary model configuration. */\nstruct OfflineCanaryModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n  /** Source language code. */\n  std::string src_lang;\n  /** Target language code. */\n  std::string tgt_lang;\n  /** Whether punctuation/casing is enabled by the model. */\n  bool use_pnc = true;\n};\n\n/** @brief Offline FireRed ASR model files. 
*/\nstruct OfflineFireRedAsrModelConfig {\n  /** Encoder ONNX model. */\n  std::string encoder;\n  /** Decoder ONNX model. */\n  std::string decoder;\n};\n\n/** @brief Offline FireRed ASR CTC model file. */\nstruct OfflineFireRedAsrCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline TDNN model file. */\nstruct OfflineTdnnModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline SenseVoice model configuration. */\nstruct OfflineSenseVoiceModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n  /** Language hint. */\n  std::string language;\n  /** Enable inverse text normalization. */\n  bool use_itn = false;\n};\n\n/** @brief Offline Dolphin model file. */\nstruct OfflineDolphinModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline Zipformer CTC model file. */\nstruct OfflineZipformerCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline WeNet CTC model file. */\nstruct OfflineWenetCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline omnilingual ASR CTC model file. */\nstruct OfflineOmnilingualAsrCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline MedASR CTC model file. */\nstruct OfflineMedAsrCtcModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief Offline Moonshine model configuration. */\nstruct OfflineMoonshineModelConfig {\n  /** Preprocessor model file. */\n  std::string preprocessor;\n  /** Encoder model file. */\n  std::string encoder;\n  /** Uncached decoder model file. */\n  std::string uncached_decoder;\n  /** Cached decoder model file. */\n  std::string cached_decoder;\n  /** Merged decoder model file. */\n  std::string merged_decoder;\n};\n\n/** @brief Offline FunASR Nano model configuration. */\nstruct OfflineFunASRNanoModelConfig {\n  /** Encoder adaptor model file. 
*/\n  std::string encoder_adaptor;\n  /** LLM model file. */\n  std::string llm;\n  /** Embedding model file. */\n  std::string embedding;\n  /** Tokenizer file. */\n  std::string tokenizer;\n  /** System prompt passed to the model. */\n  std::string system_prompt = \"You are a helpful assistant.\";\n  /** User prompt prefix passed to the model. */\n  std::string user_prompt = \"语音转写：\";\n  /** Maximum number of generated tokens. */\n  int32_t max_new_tokens = 512;\n  /** Sampling temperature. */\n  float temperature = 1e-6f;\n  /** Top-p sampling parameter. */\n  float top_p = 0.8f;\n  /** Random seed. */\n  int32_t seed = 42;\n  /** Language hint. */\n  std::string language;\n  /** Enable inverse text normalization. */\n  bool itn = true;\n  /** Optional hotwords string. */\n  std::string hotwords;\n};\n\n/**\n * @brief Acoustic model configuration for offline ASR.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * is chosen and the choice is implementation-defined.\n */\nstruct OfflineModelConfig {\n  /** Offline transducer configuration. */\n  OfflineTransducerModelConfig transducer;\n  /** Offline Paraformer configuration. */\n  OfflineParaformerModelConfig paraformer;\n  /** Offline NeMo CTC configuration. */\n  OfflineNemoEncDecCtcModelConfig nemo_ctc;\n  /** Offline Whisper configuration. */\n  OfflineWhisperModelConfig whisper;\n  /** Offline TDNN configuration. */\n  OfflineTdnnModelConfig tdnn;\n\n  /** Token file. */\n  std::string tokens;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n  /** Optional explicit model type hint. */\n  std::string model_type;\n  /** Modeling unit such as `\"cjkchar\"` or `\"bpe\"`. */\n  std::string modeling_unit = \"cjkchar\";\n  /** Optional BPE vocabulary. */\n  std::string bpe_vocab;\n  /** Telespeech CTC model file. 
*/\n  std::string telespeech_ctc;\n  /** SenseVoice configuration. */\n  OfflineSenseVoiceModelConfig sense_voice;\n  /** Moonshine configuration. */\n  OfflineMoonshineModelConfig moonshine;\n  /** FireRed ASR encoder-decoder configuration. */\n  OfflineFireRedAsrModelConfig fire_red_asr;\n  /** Dolphin configuration. */\n  OfflineDolphinModelConfig dolphin;\n  /** Zipformer CTC configuration. */\n  OfflineZipformerCtcModelConfig zipformer_ctc;\n  /** Canary configuration. */\n  OfflineCanaryModelConfig canary;\n  /** WeNet CTC configuration. */\n  OfflineWenetCtcModelConfig wenet_ctc;\n  /** Omnilingual ASR configuration. */\n  OfflineOmnilingualAsrCtcModelConfig omnilingual;\n  /** MedASR configuration. */\n  OfflineMedAsrCtcModelConfig medasr;\n  /** FunASR Nano configuration. */\n  OfflineFunASRNanoModelConfig funasr_nano;\n  /** FireRed CTC configuration. */\n  OfflineFireRedAsrCtcModelConfig fire_red_asr_ctc;\n};\n\n/** @brief Optional language-model rescoring configuration for offline ASR. */\nstruct OfflineLMConfig {\n  /** LM model file. */\n  std::string model;\n  /** LM scale. 
*/\n  float scale = 1.0;\n};\n\n/**\n * @brief Configuration for offline ASR.\n *\n * Example using SenseVoice:\n *\n * @code\n * OfflineRecognizerConfig config;\n * config.model_config.sense_voice.model =\n *     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx\";\n * config.model_config.sense_voice.language = \"auto\";\n * config.model_config.sense_voice.use_itn = true;\n * config.model_config.tokens =\n *     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt\";\n * config.model_config.num_threads = 1;\n * @endcode\n *\n * Example using Parakeet TDT v2:\n *\n * @code\n * OfflineRecognizerConfig config;\n * config.model_config.transducer.encoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx\";\n * config.model_config.transducer.decoder =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx\";\n * config.model_config.transducer.joiner =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx\";\n * config.model_config.tokens =\n *     \"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt\";\n * config.model_config.model_type = \"nemo_transducer\";\n * config.model_config.num_threads = 1;\n * @endcode\n */\nstruct OfflineRecognizerConfig {\n  /** Feature extraction configuration. */\n  FeatureConfig feat_config;\n  /** Acoustic model configuration. */\n  OfflineModelConfig model_config;\n  /** Optional LM configuration. */\n  OfflineLMConfig lm_config;\n\n  /** Decoding method such as `\"greedy_search\"` or `\"modified_beam_search\"`. */\n  std::string decoding_method = \"greedy_search\";\n  /** Maximum number of active paths for beam-search-style decoding. */\n  int32_t max_active_paths = 4;\n\n  /** Optional hotword file. */\n  std::string hotwords_file;\n\n  /** Hotword boost score. */\n  float hotwords_score = 1.5;\n  /** Optional ITN rule FST archive. */\n  std::string rule_fsts;\n  /** Optional ITN rule FAR archive. 
*/\n  std::string rule_fars;\n  /** Optional blank penalty applied during decoding. */\n  float blank_penalty = 0;\n  /** Optional homophone replacement configuration. */\n  HomophoneReplacerConfig hr;\n};\n\n/** @brief Offline ASR result copied into C++ containers. */\nstruct OfflineRecognizerResult {\n  /** Decoded text. */\n  std::string text;\n  /** Per-token timestamps in seconds when available. */\n  std::vector<float> timestamps;\n  /** Token sequence. */\n  std::vector<std::string> tokens;\n  /** JSON representation of the result. */\n  std::string json;\n  /** Detected language when provided by the model. */\n  std::string lang;\n  /** Detected emotion when provided by the model. */\n  std::string emotion;\n  /** Detected event when provided by the model. */\n  std::string event;\n\n  /** Non-empty only for TDT-style models. */\n  std::vector<float> durations;\n};\n\n/** @brief RAII wrapper for an offline decoding stream. */\nclass SHERPA_ONNX_API OfflineStream\n    : public MoveOnly<OfflineStream, SherpaOnnxOfflineStream> {\n public:\n  /** @brief Wrap an existing C offline stream handle. */\n  explicit OfflineStream(const SherpaOnnxOfflineStream *p);\n\n  /** @brief Provide the complete waveform for offline decoding. */\n  void AcceptWaveform(int32_t sample_rate, const float *samples,\n                      int32_t n) const;\n\n  /** @brief Set a per-stream string option. */\n  void SetOption(const char *key, const char *value) const;\n  /** @brief Get a per-stream string option. */\n  const char *GetOption(const char *key) const;\n  /** @brief Check whether a per-stream option exists. */\n  int32_t HasOption(const char *key) const;\n\n  /** @brief Destroy the wrapped C handle. 
*/\n  void Destroy(const SherpaOnnxOfflineStream *p) const;\n};\n\n/**\n * @brief RAII wrapper for an offline recognizer.\n *\n * For most offline models, call `AcceptWaveform()` once per stream, then call\n * `Decode()` and `GetResult()`.\n */\nclass SHERPA_ONNX_API OfflineRecognizer\n    : public MoveOnly<OfflineRecognizer, SherpaOnnxOfflineRecognizer> {\n public:\n  /** @brief Create an offline recognizer from a config struct. */\n  static OfflineRecognizer Create(const OfflineRecognizerConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOfflineRecognizer *p) const;\n\n  /** @brief Create a stream using the recognizer's configured hotwords. */\n  OfflineStream CreateStream() const;\n\n  /** @brief Create a stream with inline hotwords. */\n  OfflineStream CreateStream(const std::string &hotwords) const;\n\n  /** @brief Decode one offline stream. */\n  void Decode(const OfflineStream *s) const;\n\n  /** @brief Decode multiple offline streams in parallel. */\n  void Decode(const OfflineStream *ss, int32_t n) const;\n\n  /** @brief Return the copied recognition result for one stream. */\n  OfflineRecognizerResult GetResult(const OfflineStream *s) const;\n\n  /**\n   * @brief Convenience wrapper that returns the result inside a shared pointer.\n   *\n   * This helper exists mainly for integration environments that prefer owning\n   * pointers, such as Unreal Engine.\n   */\n  std::shared_ptr<OfflineRecognizerResult> GetResultPtr(\n      const OfflineStream *s) const;\n\n  /** @brief Update recognizer runtime configuration after creation. */\n  void SetConfig(const OfflineRecognizerConfig &config) const;\n\n private:\n  explicit OfflineRecognizer(const SherpaOnnxOfflineRecognizer *p);\n};\n\n// ============================================================================\n// Non-streaming TTS\n// ============================================================================\n/** @brief VITS model configuration. 
*/\nstruct OfflineTtsVitsModelConfig {\n  /** Acoustic model file. */\n  std::string model;\n  /** Lexicon file. */\n  std::string lexicon;\n  /** Token file. */\n  std::string tokens;\n  /** Data directory such as `espeak-ng-data`. */\n  std::string data_dir;\n  /** Reserved field. Currently unused by the wrapper. */\n  std::string dict_dir;\n\n  /** VITS noise scale. */\n  float noise_scale = 0.667;\n  /** VITS noise scale for duration prediction. */\n  float noise_scale_w = 0.8;\n  /** Length scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale = 1.0;\n};\n\n/** @brief Matcha model configuration. */\nstruct OfflineTtsMatchaModelConfig {\n  /** Acoustic model file. */\n  std::string acoustic_model;\n  /** Vocoder model file. */\n  std::string vocoder;\n  /** Lexicon file. */\n  std::string lexicon;\n  /** Token file. */\n  std::string tokens;\n  /** Data directory such as `espeak-ng-data`. */\n  std::string data_dir;\n  /** Reserved field. Currently unused by the wrapper. */\n  std::string dict_dir;\n\n  /** Matcha noise scale. */\n  float noise_scale = 0.667;\n  /** Length scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale = 1.0;\n};\n\n/** @brief Kokoro model configuration. */\nstruct OfflineTtsKokoroModelConfig {\n  /** Acoustic model file. */\n  std::string model;\n  /** Voices file. */\n  std::string voices;\n  /** Token file. */\n  std::string tokens;\n  /** Data directory such as `espeak-ng-data`. */\n  std::string data_dir;\n  /** Reserved field. Currently unused by the wrapper. */\n  std::string dict_dir;\n  /** Optional lexicon file. */\n  std::string lexicon;\n  /** Language/voice family hint. */\n  std::string lang;\n\n  /** Length scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale = 1.0;\n};\n\n/** @brief Kitten model configuration. */\nstruct OfflineTtsKittenModelConfig {\n  /** Acoustic model file. */\n  std::string model;\n  /** Voices file. 
*/\n  std::string voices;\n  /** Token file. */\n  std::string tokens;\n  /** Data directory. */\n  std::string data_dir;\n\n  /** Length scale. Values < 1 are faster; values > 1 are slower. */\n  float length_scale = 1.0;\n};\n\n/** @brief ZipVoice model configuration. */\nstruct OfflineTtsZipvoiceModelConfig {\n  /** Token file. */\n  std::string tokens;\n  /** Encoder model file. */\n  std::string encoder;\n  /** Decoder model file. */\n  std::string decoder;\n  /** Vocoder model file. */\n  std::string vocoder;\n  /** Data directory. */\n  std::string data_dir;\n  /** Lexicon file. */\n  std::string lexicon;\n\n  /** Feature scale. */\n  float feat_scale = 0.1;\n  /** Time shift. */\n  float t_shift = 0.5;\n  /** Target RMS. */\n  float target_rms = 0.1;\n  /** Guidance scale. */\n  float guidance_scale = 1.0;\n};\n\n/** @brief Pocket TTS model configuration. */\nstruct OfflineTtsPocketModelConfig {\n  /** Flow model file. */\n  std::string lm_flow;\n  /** Main language model file. */\n  std::string lm_main;\n  /** Encoder model file. */\n  std::string encoder;\n  /** Decoder model file. */\n  std::string decoder;\n  /** Text conditioner model file. */\n  std::string text_conditioner;\n\n  /** Vocabulary JSON file. */\n  std::string vocab_json;\n  /** Token scores JSON file. */\n  std::string token_scores_json;\n  /** Voice embedding cache size. */\n  int32_t voice_embedding_cache_capacity = 50;\n};\n\n/** @brief Supertonic model configuration. */\nstruct OfflineTtsSupertonicModelConfig {\n  /** Duration predictor model file. */\n  std::string duration_predictor;\n  /** Text encoder model file. */\n  std::string text_encoder;\n  /** Vector estimator model file. */\n  std::string vector_estimator;\n  /** Vocoder model file. */\n  std::string vocoder;\n  /** Model metadata JSON. */\n  std::string tts_json;\n  /** Unicode indexer resource. */\n  std::string unicode_indexer;\n  /** Voice style resource. 
*/\n  std::string voice_style;\n};\n\n/**\n * @brief Model configuration for offline TTS.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * is chosen and the choice is implementation-defined.\n */\nstruct OfflineTtsModelConfig {\n  /** VITS configuration. */\n  OfflineTtsVitsModelConfig vits;\n  /** Matcha configuration. */\n  OfflineTtsMatchaModelConfig matcha;\n  /** Kokoro configuration. */\n  OfflineTtsKokoroModelConfig kokoro;\n  /** Kitten configuration. */\n  OfflineTtsKittenModelConfig kitten;\n  /** ZipVoice configuration. */\n  OfflineTtsZipvoiceModelConfig zipvoice;\n  /** Pocket configuration. */\n  OfflineTtsPocketModelConfig pocket;\n  /** Supertonic configuration. */\n  OfflineTtsSupertonicModelConfig supertonic;\n\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n};\n\n/** @brief Generation-time options for advanced TTS synthesis. */\nstruct GenerationConfig {\n  /** Silence scale between sentences. */\n  float silence_scale = 0.2;\n  /** Speech speed. Used only by some models. */\n  float speed = 1.0;\n  /** Speaker ID for multi-speaker models. */\n  int32_t sid = 0;\n  /** Reference audio samples for zero-shot or voice-cloning models. */\n  std::vector<float> reference_audio;\n  /** Sample rate of `reference_audio`. */\n  int32_t reference_sample_rate = 0;\n  /** Optional reference text. Not all models require it. */\n  std::string reference_text;\n  /** Number of flow-matching steps when supported. */\n  int32_t num_steps = 5;\n\n  /** Model-specific extra attributes serialized to JSON internally. */\n  std::unordered_map<std::string, std::string> extra;\n};\n\n/** @brief Configuration for offline TTS. */\nstruct OfflineTtsConfig {\n  /** Model configuration. */\n  OfflineTtsModelConfig model;\n  /** Optional text-normalization rule FST archive. 
*/\n  std::string rule_fsts;\n  /** Optional text-normalization rule FAR archive. */\n  std::string rule_fars;\n  /** Sentence chunking limit for generation. */\n  int32_t max_num_sentences = 1;\n  /** Silence scale between generated sentences. */\n  float silence_scale = 0.2;\n};\n\n/** @brief Generated audio returned by the C++ TTS wrapper. */\nstruct GeneratedAudio {\n  /** Output samples normalized to `[-1, 1]`. */\n  std::vector<float> samples;\n  /** Output sample rate in Hz. */\n  int32_t sample_rate = 0;\n};\n\n/**\n * @brief TTS progress callback.\n *\n * Return 1 to continue generating and 0 to stop early.\n */\nusing OfflineTtsCallback = int32_t (*)(const float *samples,\n                                       int32_t num_samples, float progress,\n                                       void *arg);\n\n/**\n * @brief RAII wrapper for offline TTS.\n *\n * Example using Pocket TTS:\n *\n * @code\n * OfflineTtsConfig config;\n * config.model.pocket.lm_flow =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\";\n * config.model.pocket.lm_main =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\";\n * config.model.pocket.encoder =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\";\n * config.model.pocket.decoder =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\";\n * config.model.pocket.text_conditioner =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\";\n * config.model.pocket.vocab_json =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\";\n * config.model.pocket.token_scores_json =\n *     \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\";\n * @endcode\n */\nclass SHERPA_ONNX_API OfflineTts\n    : public MoveOnly<OfflineTts, SherpaOnnxOfflineTts> {\n public:\n  /** @brief Create an offline TTS engine. */\n  static OfflineTts Create(const OfflineTtsConfig &config);\n\n  /** @brief Destroy the wrapped C handle. 
*/\n  void Destroy(const SherpaOnnxOfflineTts *p) const;\n\n  /** @brief Return the output sample rate of generated audio. */\n  int32_t SampleRate() const;\n\n  /** @brief Return the number of supported speakers. */\n  int32_t NumSpeakers() const;\n\n  /**\n   * @brief Generate speech using the simple speaker-id and speed interface.\n   *\n   * This overload mirrors the legacy/simple TTS API. Prefer the\n   * `GenerationConfig` overload for new code.\n   */\n  GeneratedAudio Generate(const std::string &text, int32_t sid = 0,\n                          float speed = 1.0,\n                          OfflineTtsCallback callback = nullptr,\n                          void *arg = nullptr) const;\n\n  /** @brief Generate speech using the advanced generation configuration. */\n  GeneratedAudio Generate(const std::string &text,\n                          const GenerationConfig &config,\n                          OfflineTtsCallback callback = nullptr,\n                          void *arg = nullptr) const;\n\n  /** @brief Like Generate(), but returns a shared pointer to the result. */\n  std::shared_ptr<GeneratedAudio> Generate2(\n      const std::string &text, int32_t sid = 0, float speed = 1.0,\n      OfflineTtsCallback callback = nullptr, void *arg = nullptr) const;\n\n  /** @brief Like the advanced Generate() overload, but returns a shared\n   * pointer. */\n  std::shared_ptr<GeneratedAudio> Generate2(\n      const std::string &text, const GenerationConfig &config,\n      OfflineTtsCallback callback = nullptr, void *arg = nullptr) const;\n\n private:\n  explicit OfflineTts(const SherpaOnnxOfflineTts *p);\n};\n\n// ============================================================\n// For Keyword Spotter\n// ============================================================\n\n/** @brief Current keyword spotting result copied into C++ containers. */\nstruct KeywordResult {\n  /** Triggered keyword text. */\n  std::string keyword;\n  /** Decoded token sequence. 
*/\n  std::vector<std::string> tokens;\n  /** Per-token timestamps in seconds. */\n  std::vector<float> timestamps;\n  /** Segment start time in seconds. */\n  float start_time = 0.0f;\n  /** JSON representation of the result. */\n  std::string json;\n};\n\n/** @brief Configuration for the C++ keyword spotting wrapper. */\nstruct KeywordSpotterConfig {\n  /** Feature extraction configuration. */\n  FeatureConfig feat_config;\n  /** Streaming acoustic model configuration. */\n  OnlineModelConfig model_config;\n  /** Maximum number of active paths. */\n  int32_t max_active_paths = 4;\n  /** Number of trailing blanks required before finalizing a trigger. */\n  int32_t num_trailing_blanks = 1;\n  /** Keyword score bonus. */\n  float keywords_score = 1.0f;\n  /** Detection threshold. */\n  float keywords_threshold = 0.25f;\n  /** Keyword file. */\n  std::string keywords_file;\n  /** In-memory keyword definitions. */\n  std::string keywords_buf;\n};\n\n/** @brief RAII wrapper for keyword spotting. */\nclass SHERPA_ONNX_API KeywordSpotter\n    : public MoveOnly<KeywordSpotter, SherpaOnnxKeywordSpotter> {\n public:\n  /** @brief Create a keyword spotter from a config struct. */\n  static KeywordSpotter Create(const KeywordSpotterConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxKeywordSpotter *p) const;\n\n  /** @brief Create a keyword stream using configured keywords. */\n  OnlineStream CreateStream() const;\n\n  /** @brief Create a keyword stream with inline extra or replacement keywords.\n   */\n  OnlineStream CreateStream(const std::string &keywords) const;\n\n  /** @brief Check whether the stream has enough data to decode. */\n  bool IsReady(const OnlineStream *s) const;\n\n  /** @brief Decode one ready stream. */\n  void Decode(const OnlineStream *s) const;\n\n  /** @brief Decode multiple ready streams in parallel. 
*/\n  void Decode(const OnlineStream *ss, int32_t n) const;\n\n  /** @brief Reset a stream after a keyword trigger. */\n  void Reset(const OnlineStream *s) const;\n\n  /** @brief Return the copied keyword spotting result for a stream. */\n  KeywordResult GetResult(const OnlineStream *s) const;\n\n private:\n  explicit KeywordSpotter(const SherpaOnnxKeywordSpotter *p);\n};\n\n/** @brief GTCRN speech denoiser model configuration. */\nstruct OfflineSpeechDenoiserGtcrnModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/** @brief DPDFNet speech denoiser model configuration. */\nstruct OfflineSpeechDenoiserDpdfNetModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n};\n\n/**\n * @brief Speech denoiser model configuration.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * is chosen and the choice is implementation-defined.\n */\nstruct OfflineSpeechDenoiserModelConfig {\n  /** GTCRN configuration. */\n  OfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n  /** DPDFNet configuration. */\n  OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n};\n\n/** @brief Configuration for offline speech denoising. */\nstruct OfflineSpeechDenoiserConfig {\n  /** Model configuration. */\n  OfflineSpeechDenoiserModelConfig model;\n};\n\n/** @brief Denoised waveform returned by speech enhancement wrappers. */\nstruct DenoisedAudio {\n  /** Output samples normalized to `[-1, 1]`. */\n  std::vector<float> samples;\n  /** Output sample rate in Hz. */\n  int32_t sample_rate = 0;\n};\n\n/** @brief RAII wrapper for offline speech denoising. */\nclass SHERPA_ONNX_API OfflineSpeechDenoiser\n    : public MoveOnly<OfflineSpeechDenoiser, SherpaOnnxOfflineSpeechDenoiser> {\n public:\n  /** @brief Create an offline speech denoiser. 
*/\n  static OfflineSpeechDenoiser Create(\n      const OfflineSpeechDenoiserConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOfflineSpeechDenoiser *p) const;\n\n  /** @brief Run denoising on a complete waveform. */\n  DenoisedAudio Run(const float *samples, int32_t n, int32_t sample_rate) const;\n\n  /** @brief Return the expected input sample rate. */\n  int32_t GetSampleRate() const;\n\n private:\n  explicit OfflineSpeechDenoiser(const SherpaOnnxOfflineSpeechDenoiser *p);\n};\n\n/** @brief Configuration for online speech denoising. */\nstruct OnlineSpeechDenoiserConfig {\n  /** Model configuration. */\n  OfflineSpeechDenoiserModelConfig model;\n};\n\n/** @brief RAII wrapper for online speech denoising. */\nclass SHERPA_ONNX_API OnlineSpeechDenoiser\n    : public MoveOnly<OnlineSpeechDenoiser, SherpaOnnxOnlineSpeechDenoiser> {\n public:\n  /** @brief Create an online speech denoiser. */\n  static OnlineSpeechDenoiser Create(const OnlineSpeechDenoiserConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOnlineSpeechDenoiser *p) const;\n\n  /** @brief Process one chunk of streaming audio. */\n  DenoisedAudio Run(const float *samples, int32_t n, int32_t sample_rate) const;\n\n  /** @brief Flush buffered audio and reset the denoiser. */\n  DenoisedAudio Flush() const;\n\n  /** @brief Reset the denoiser for a new stream. */\n  void Reset() const;\n\n  /** @brief Return the expected input sample rate. */\n  int32_t GetSampleRate() const;\n\n  /** @brief Return the recommended frame shift in samples for streaming input.\n   */\n  int32_t GetFrameShiftInSamples() const;\n\n private:\n  explicit OnlineSpeechDenoiser(const SherpaOnnxOnlineSpeechDenoiser *p);\n};\n\n// ==============================\n// VAD\n// ==============================\n\n/** @brief Silero VAD model configuration. */\nstruct SileroVadModelConfig {\n  /** Model ONNX file. 
*/\n  std::string model;\n  /** Detection threshold. */\n  float threshold = 0.5;\n  /** Minimum silence duration in seconds. */\n  float min_silence_duration = 0.5;\n  /** Minimum speech duration in seconds. */\n  float min_speech_duration = 0.25;\n  /** Window size in samples. */\n  int32_t window_size = 512;\n  /** Maximum speech duration in seconds before forced split. */\n  float max_speech_duration = 20;\n};\n\n/** @brief Ten VAD model configuration. */\nstruct TenVadModelConfig {\n  /** Model ONNX file. */\n  std::string model;\n  /** Detection threshold. */\n  float threshold = 0.5;\n  /** Minimum silence duration in seconds. */\n  float min_silence_duration = 0.5;\n  /** Minimum speech duration in seconds. */\n  float min_speech_duration = 0.25;\n  /** Window size in samples. */\n  int32_t window_size = 256;\n  /** Maximum speech duration in seconds before forced split. */\n  float max_speech_duration = 20;\n};\n\n/**\n * @brief VAD model configuration.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * is chosen and the choice is implementation-defined.\n */\nstruct VadModelConfig {\n  /** Silero VAD configuration. */\n  SileroVadModelConfig silero_vad;\n  /** Ten VAD configuration. */\n  TenVadModelConfig ten_vad;\n\n  /** Input sample rate in Hz. */\n  int32_t sample_rate = 16000;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n  /** Enable verbose debug logging. */\n  bool debug = false;\n};\n\n/** @brief One speech segment produced by the VAD wrapper. */\nstruct SpeechSegment {\n  /** Start sample index relative to the processed audio timeline. */\n  int32_t start = 0;\n  /** Speech samples for the segment. */\n  std::vector<float> samples;\n};\n\n/** @brief RAII wrapper for the circular buffer helper used by VAD. 
*/\nclass SHERPA_ONNX_API CircularBuffer\n    : public MoveOnly<CircularBuffer, SherpaOnnxCircularBuffer> {\n public:\n  /** @brief Create a circular buffer with the given capacity in samples. */\n  static CircularBuffer Create(int32_t capacity);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxCircularBuffer *p) const;\n\n  /** @brief Append samples to the buffer. */\n  void Push(const float *p, int32_t n) const;\n\n  /** @brief Copy a contiguous span from the buffer. */\n  std::vector<float> Get(int32_t start_index, int32_t n) const;\n\n  /** @brief Remove samples from the head of the buffer. */\n  void Pop(int32_t n) const;\n\n  /** @brief Return the number of stored samples. */\n  int32_t Size() const;\n\n  /** @brief Return the current head index. */\n  int32_t Head() const;\n\n  /** @brief Reset the buffer to empty. */\n  void Reset() const;\n\n private:\n  explicit CircularBuffer(const SherpaOnnxCircularBuffer *p);\n};\n\n/**\n * @brief RAII wrapper for voice activity detection.\n *\n * The wrapper collects detected speech segments internally. Use `IsEmpty()`,\n * `Front()`, and `Pop()` to consume them.\n */\nclass SHERPA_ONNX_API VoiceActivityDetector\n    : public MoveOnly<VoiceActivityDetector, SherpaOnnxVoiceActivityDetector> {\n public:\n  /** @brief Create a VAD instance. */\n  static VoiceActivityDetector Create(const VadModelConfig &config,\n                                      float buffer_size_in_seconds);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxVoiceActivityDetector *p) const;\n\n  /** @brief Feed more audio samples to the detector. */\n  void AcceptWaveform(const float *samples, int32_t n) const;\n\n  /** @brief Check whether no speech segments are currently queued. */\n  bool IsEmpty() const;\n\n  /** @brief Check whether speech is currently detected. */\n  bool IsDetected() const;\n\n  /** @brief Remove the front queued speech segment. 
*/\n  void Pop() const;\n\n  /** @brief Remove all queued speech segments. */\n  void Clear() const;\n\n  /** @brief Return the front queued speech segment. */\n  SpeechSegment Front() const;\n\n  /** @brief Like Front(), but returns the segment in a shared pointer. */\n  std::shared_ptr<SpeechSegment> FrontPtr() const;\n\n  /** @brief Reset the detector state. */\n  void Reset() const;\n\n  /** @brief Flush buffered context at end of input. */\n  void Flush() const;\n\n private:\n  explicit VoiceActivityDetector(const SherpaOnnxVoiceActivityDetector *p);\n};\n\n/** @brief RAII wrapper for linear resampling. */\nclass SHERPA_ONNX_API LinearResampler\n    : public MoveOnly<LinearResampler, SherpaOnnxLinearResampler> {\n public:\n  /** @brief Construct an empty wrapper. */\n  LinearResampler() = default;\n  /** @brief Create a linear resampler. */\n  static LinearResampler Create(int32_t samp_rate_in_hz,\n                                int32_t samp_rate_out_hz,\n                                float filter_cutoff_hz, int32_t num_zeros);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxLinearResampler *p) const;\n\n  /** @brief Reset the resampler state. */\n  void Reset() const;\n\n  /** @brief Resample one chunk of input audio. */\n  std::vector<float> Resample(const float *input, int32_t input_dim,\n                              bool flush) const;\n\n  /** @brief Return the input sample rate in Hz. */\n  int32_t GetInputSamplingRate() const;\n  /** @brief Return the output sample rate in Hz. */\n  int32_t GetOutputSamplingRate() const;\n\n private:\n  explicit LinearResampler(const SherpaOnnxLinearResampler *p);\n};\n\n/** @brief Return the sherpa-onnx version string as a C++ string. */\nSHERPA_ONNX_API std::string GetVersionStr();\n/** @brief Return the build Git SHA1 as a C++ string. */\nSHERPA_ONNX_API std::string GetGitSha1();\n/** @brief Return the build Git date as a C++ string. 
*/\nSHERPA_ONNX_API std::string GetGitDate();\n/** @brief Return `true` if a file exists. */\nSHERPA_ONNX_API bool FileExists(const std::string &filename);\n\n// ============================================================================\n// Offline Punctuation\n// ============================================================================\n/** @brief Offline punctuation model configuration. */\nstruct OfflinePunctuationModelConfig {\n  /** Model file. */\n  std::string ct_transformer;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n};\n\n/** @brief Configuration for offline punctuation. */\nstruct OfflinePunctuationConfig {\n  /** Model configuration. */\n  OfflinePunctuationModelConfig model;\n};\n\n/** @brief RAII wrapper for offline punctuation restoration. */\nclass SHERPA_ONNX_API OfflinePunctuation\n    : public MoveOnly<OfflinePunctuation, SherpaOnnxOfflinePunctuation> {\n public:\n  /** @brief Create an offline punctuation model. */\n  static OfflinePunctuation Create(const OfflinePunctuationConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOfflinePunctuation *p) const;\n\n  /** @brief Add punctuation to a complete input text. */\n  std::string AddPunctuation(const std::string &text) const;\n\n private:\n  explicit OfflinePunctuation(const SherpaOnnxOfflinePunctuation *p);\n};\n\n// ============================================================================\n// Online Punctuation\n// ============================================================================\n/** @brief Online punctuation model configuration. */\nstruct OnlinePunctuationModelConfig {\n  /** Model file. */\n  std::string cnn_bilstm;\n  /** BPE vocabulary file. */\n  std::string bpe_vocab;\n  /** Number of inference threads. 
*/\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n};\n\n/** @brief Configuration for online punctuation. */\nstruct OnlinePunctuationConfig {\n  /** Model configuration. */\n  OnlinePunctuationModelConfig model;\n};\n\n/** @brief RAII wrapper for online punctuation restoration. */\nclass SHERPA_ONNX_API OnlinePunctuation\n    : public MoveOnly<OnlinePunctuation, SherpaOnnxOnlinePunctuation> {\n public:\n  /** @brief Create an online punctuation model. */\n  static OnlinePunctuation Create(const OnlinePunctuationConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxOnlinePunctuation *p) const;\n\n  /** @brief Add punctuation to one input text chunk. */\n  std::string AddPunctuation(const std::string &text) const;\n\n private:\n  explicit OnlinePunctuation(const SherpaOnnxOnlinePunctuation *p);\n};\n\n// ============================================================================\n// Audio tagging\n// ============================================================================\n/** @brief Zipformer audio-tagging model configuration. */\nstruct OfflineZipformerAudioTaggingModelConfig {\n  /** Model file. */\n  std::string model;\n};\n\n/**\n * @brief Audio-tagging model configuration.\n *\n * Configure exactly one model family. If multiple model families are set, one\n * is chosen and the choice is implementation-defined.\n */\nstruct AudioTaggingModelConfig {\n  /** Zipformer model configuration. */\n  OfflineZipformerAudioTaggingModelConfig zipformer;\n  /** Alternative CED model file. */\n  std::string ced;\n  /** Number of inference threads. */\n  int32_t num_threads = 1;\n  /** Enable verbose debug logging. */\n  bool debug = false;\n  /** Execution provider such as `\"cpu\"`. */\n  std::string provider = \"cpu\";\n};\n\n/** @brief Configuration for audio tagging. 
*/\nstruct AudioTaggingConfig {\n  /** Model configuration. */\n  AudioTaggingModelConfig model;\n  /** CSV file containing label names. */\n  std::string labels;\n  /** Default number of results to return. */\n  int32_t top_k = 5;\n};\n\n/** @brief One audio-tagging event returned by the C++ wrapper. */\nstruct AudioEvent {\n  /** Event label. */\n  std::string name;\n  /** Class index. */\n  int32_t index;\n  /** Probability or confidence score. */\n  float prob;\n};\n\n/** @brief RAII wrapper for audio tagging. */\nclass SHERPA_ONNX_API AudioTagging\n    : public MoveOnly<AudioTagging, SherpaOnnxAudioTagging> {\n public:\n  /** @brief Create an audio tagger. */\n  static AudioTagging Create(const AudioTaggingConfig &config);\n\n  /** @brief Destroy the wrapped C handle. */\n  void Destroy(const SherpaOnnxAudioTagging *p) const;\n\n  /** @brief Create an offline stream for tagging. */\n  OfflineStream CreateStream() const;\n  /**\n   * @brief Run audio tagging and return copied results.\n   *\n   * When `top_k == -1`, the wrapper uses `config.top_k`. When `top_k > 0`,\n   * that argument overrides the configured default.\n   */\n  std::vector<AudioEvent> Compute(const OfflineStream *s, int32_t top_k = -1);\n\n  /** @brief Like Compute(), but returns the result vector in a shared pointer.\n   */\n  std::shared_ptr<std::vector<AudioEvent>> ComputePtr(const OfflineStream *s,\n                                                      int32_t top_k = -1);\n\n private:\n  explicit AudioTagging(const SherpaOnnxAudioTagging *p);\n};\n\n}  // namespace sherpa_onnx::cxx\n\n#endif  // SHERPA_ONNX_C_API_CXX_API_H_\n"
  },
  {
    "path": "sherpa-onnx/c-api/generate.sh",
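    "content": "#!/usr/bin/env bash\n# Regenerate sherpa-onnx-symbols-c.exp from the built C API dylib.\n# Keeps only global text (T) symbols whose names start with _Sherpa.\nset -ex\n\nnm -g ../../build/lib/libsherpa-onnx-c-api.dylib | awk '$2==\"T\" && $3 ~ /^_Sherpa/ {print $3}' | sort > ./sherpa-onnx-symbols-c.exp\n"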
  },
  {
    "path": "sherpa-onnx/c-api/mainpage.md",
    "content": "# sherpa-onnx public API documentation\n\nThis documentation covers the public native APIs shipped in:\n\n- `c-api.h` — the C API\n- `cxx-api.h` — the C++ wrapper built on top of the C API\n\nThese headers expose the main sherpa-onnx inference features for native\napplications and for language bindings that need a stable ABI.\n\n## What is documented here\n\nThe generated docs include the public APIs for:\n\n- streaming ASR\n- non-streaming ASR\n- keyword spotting\n- voice activity detection\n- offline text-to-speech\n- spoken language identification\n- speaker embedding extraction and speaker management\n- audio tagging\n- offline and online punctuation\n- linear resampling\n- offline speaker diarization\n- offline and online speech enhancement\n\nThe C API also includes HarmonyOS-specific constructor variants where\napplicable.\n\n## Which header should I use?\n\nUse `c-api.h` if you are:\n\n- writing C code\n- building FFI bindings for other languages\n- integrating through a plain C ABI\n\nUse `cxx-api.h` if you are:\n\n- writing C++ code directly\n- preferring RAII wrappers over manual destroy/free calls\n- preferring `std::string`, `std::vector`, and move-only wrapper classes\n\n## Common ownership rules\n\nFor the C API:\n\n- objects created by `SherpaOnnxCreate*()` are usually destroyed with a\n  matching `SherpaOnnxDestroy*()`\n- result snapshots, returned strings, and returned arrays must be released with\n  the specific matching free/destroy function documented on each API\n- some helpers return pointers to statically owned strings; those must not be\n  freed\n\nFor the C++ API:\n\n- wrapper classes are move-only and use RAII\n- copied result objects are returned as standard C++ value types\n- callers normally do not need to manage the underlying C pointers directly\n\n## Typical workflow\n\nFor both APIs, the usual flow is:\n\n1. create and fill a config object\n2. create the engine or recognizer\n3. 
create a stream if the feature is stream-based\n4. feed audio or text\n5. run decode/compute/generate\n6. read back results\n7. destroy resources, or let the C++ wrappers clean them up automatically\n\n## Recommended entry points\n\nStart with:\n\n- [`c-api.h`](https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/c-api/c-api.h)\n  for the plain C API\n- [`cxx-api.h`](https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/c-api/cxx-api.h)\n  for the C++ wrapper\n\nRepresentative example programs live in:\n\n- [`c-api-examples/`](https://github.com/k2-fsa/sherpa-onnx/tree/master/c-api-examples)\n- [`cxx-api-examples/`](https://github.com/k2-fsa/sherpa-onnx/tree/master/cxx-api-examples)\n\nUseful examples include:\n\n- [`decode-file-c-api.c`](https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c)\n- [`whisper-c-api.c`](https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/whisper-c-api.c)\n- [`sense-voice-c-api.c`](https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/sense-voice-c-api.c)\n- [`nemo-parakeet-c-api.c`](https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/nemo-parakeet-c-api.c)\n- [`streaming-zipformer-with-hr-cxx-api.cc`](https://github.com/k2-fsa/sherpa-onnx/blob/master/cxx-api-examples/streaming-zipformer-with-hr-cxx-api.cc)\n- [`sense-voice-cxx-api.cc`](https://github.com/k2-fsa/sherpa-onnx/blob/master/cxx-api-examples/sense-voice-cxx-api.cc)\n- [`pocket-tts-en-cxx-api.cc`](https://github.com/k2-fsa/sherpa-onnx/blob/master/cxx-api-examples/pocket-tts-en-cxx-api.cc)\n- [`vad-cxx-api.cc`](https://github.com/k2-fsa/sherpa-onnx/blob/master/cxx-api-examples/vad-cxx-api.cc)\n\n## Generating the documentation\n\nFrom `sherpa-onnx/c-api/`, run:\n\n```bash\ndoxygen Doxyfile\n```\n\nHTML output is written to:\n\n```text\ndoxygen-docs/html/\n```\n"
  },
  {
    "path": "sherpa-onnx/c-api/sherpa-onnx-symbols-c.exp",
    "content": "_SherpaOfflinePunctuationAddPunct\n_SherpaOfflinePunctuationFreeText\n_SherpaOnnxAcceptWaveformOffline\n_SherpaOnnxAudioTaggingCompute\n_SherpaOnnxAudioTaggingCreateOfflineStream\n_SherpaOnnxAudioTaggingFreeResults\n_SherpaOnnxCircularBufferFree\n_SherpaOnnxCircularBufferGet\n_SherpaOnnxCircularBufferHead\n_SherpaOnnxCircularBufferPop\n_SherpaOnnxCircularBufferPush\n_SherpaOnnxCircularBufferReset\n_SherpaOnnxCircularBufferSize\n_SherpaOnnxCreateAudioTagging\n_SherpaOnnxCreateCircularBuffer\n_SherpaOnnxCreateDisplay\n_SherpaOnnxCreateKeywordSpotter\n_SherpaOnnxCreateKeywordStream\n_SherpaOnnxCreateKeywordStreamWithKeywords\n_SherpaOnnxCreateLinearResampler\n_SherpaOnnxCreateOfflinePunctuation\n_SherpaOnnxCreateOfflineRecognizer\n_SherpaOnnxCreateOfflineSpeakerDiarization\n_SherpaOnnxCreateOfflineSpeechDenoiser\n_SherpaOnnxCreateOfflineStream\n_SherpaOnnxCreateOfflineStreamWithHotwords\n_SherpaOnnxCreateOfflineTts\n_SherpaOnnxCreateOnlineSpeechDenoiser\n_SherpaOnnxCreateOnlinePunctuation\n_SherpaOnnxCreateOnlineRecognizer\n_SherpaOnnxCreateOnlineStream\n_SherpaOnnxCreateOnlineStreamWithHotwords\n_SherpaOnnxCreateSpeakerEmbeddingExtractor\n_SherpaOnnxCreateSpeakerEmbeddingManager\n_SherpaOnnxCreateSpokenLanguageIdentification\n_SherpaOnnxCreateVoiceActivityDetector\n_SherpaOnnxDecodeKeywordStream\n_SherpaOnnxDecodeMultipleKeywordStreams\n_SherpaOnnxDecodeMultipleOfflineStreams\n_SherpaOnnxDecodeMultipleOnlineStreams\n_SherpaOnnxDecodeOfflineStream\n_SherpaOnnxDecodeOnlineStream\n_SherpaOnnxDestroyAudioTagging\n_SherpaOnnxDestroyCircularBuffer\n_SherpaOnnxDestroyDenoisedAudio\n_SherpaOnnxDestroyDisplay\n_SherpaOnnxDestroyKeywordResult\n_SherpaOnnxDestroyKeywordSpotter\n_SherpaOnnxDestroyLinearResampler\n_SherpaOnnxDestroyOfflinePunctuation\n_SherpaOnnxDestroyOfflineRecognizer\n_SherpaOnnxDestroyOfflineRecognizerResult\n_SherpaOnnxDestroyOfflineSpeakerDiarization\n_SherpaOnnxDestroyOfflineSpeechDenoiser\n_SherpaOnnxDestroyOfflineStream\n_SherpaOnnxDestroy
OfflineStreamResultJson\n_SherpaOnnxDestroyOfflineTts\n_SherpaOnnxDestroyOfflineTtsGeneratedAudio\n_SherpaOnnxDestroyOnlineSpeechDenoiser\n_SherpaOnnxDestroyOnlinePunctuation\n_SherpaOnnxDestroyOnlineRecognizer\n_SherpaOnnxDestroyOnlineRecognizerResult\n_SherpaOnnxDestroyOnlineStream\n_SherpaOnnxDestroyOnlineStreamResultJson\n_SherpaOnnxDestroySpeakerEmbeddingExtractor\n_SherpaOnnxDestroySpeakerEmbeddingManager\n_SherpaOnnxDestroySpeechSegment\n_SherpaOnnxDestroySpokenLanguageIdentification\n_SherpaOnnxDestroySpokenLanguageIdentificationResult\n_SherpaOnnxDestroyVoiceActivityDetector\n_SherpaOnnxFileExists\n_SherpaOnnxFreeKeywordResultJson\n_SherpaOnnxFreeWave\n_SherpaOnnxGetGitDate\n_SherpaOnnxGetGitSha1\n_SherpaOnnxGetKeywordResult\n_SherpaOnnxGetKeywordResultAsJson\n_SherpaOnnxGetOfflineStreamResult\n_SherpaOnnxGetOfflineStreamResultAsJson\n_SherpaOnnxGetOnlineStreamResult\n_SherpaOnnxGetOnlineStreamResultAsJson\n_SherpaOnnxGetVersionStr\n_SherpaOnnxIsKeywordStreamReady\n_SherpaOnnxIsOnlineStreamReady\n_SherpaOnnxLinearResamplerResample\n_SherpaOnnxLinearResamplerResampleFree\n_SherpaOnnxLinearResamplerResampleGetInputSampleRate\n_SherpaOnnxLinearResamplerResampleGetOutputSampleRate\n_SherpaOnnxLinearResamplerReset\n_SherpaOnnxOfflineRecognizerSetConfig\n_SherpaOnnxOfflineStreamGetOption\n_SherpaOnnxOfflineStreamHasOption\n_SherpaOnnxOfflineStreamSetOption\n_SherpaOnnxOfflineSpeakerDiarizationDestroyResult\n_SherpaOnnxOfflineSpeakerDiarizationDestroySegment\n_SherpaOnnxOfflineSpeakerDiarizationGetSampleRate\n_SherpaOnnxOfflineSpeakerDiarizationProcess\n_SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback\n_SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg\n_SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments\n_SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers\n_SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime\n_SherpaOnnxOfflineSpeakerDiarizationSetConfig\n_SherpaOnnxOfflineSpeechDenoiserGetSampleRate\n_SherpaOnnxOfflineSpeechDenois
erRun\n_SherpaOnnxOfflineTtsGenerate\n_SherpaOnnxOfflineTtsGenerateWithCallback\n_SherpaOnnxOfflineTtsGenerateWithCallbackWithArg\n_SherpaOnnxOfflineTtsGenerateWithConfig\n_SherpaOnnxOfflineTtsGenerateWithProgressCallback\n_SherpaOnnxOfflineTtsGenerateWithProgressCallbackWithArg\n_SherpaOnnxOfflineTtsGenerateWithZipvoice\n_SherpaOnnxOfflineTtsNumSpeakers\n_SherpaOnnxOfflineTtsSampleRate\n_SherpaOnnxOnlinePunctuationAddPunct\n_SherpaOnnxOnlinePunctuationFreeText\n_SherpaOnnxOnlineSpeechDenoiserFlush\n_SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples\n_SherpaOnnxOnlineSpeechDenoiserGetSampleRate\n_SherpaOnnxOnlineSpeechDenoiserReset\n_SherpaOnnxOnlineSpeechDenoiserRun\n_SherpaOnnxOnlineStreamAcceptWaveform\n_SherpaOnnxOnlineStreamGetOption\n_SherpaOnnxOnlineStreamHasOption\n_SherpaOnnxOnlineStreamInputFinished\n_SherpaOnnxOnlineStreamSetOption\n_SherpaOnnxOnlineStreamIsEndpoint\n_SherpaOnnxOnlineStreamReset\n_SherpaOnnxPrint\n_SherpaOnnxReadWave\n_SherpaOnnxReadWaveFromBinaryData\n_SherpaOnnxResetKeywordStream\n_SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding\n_SherpaOnnxSpeakerEmbeddingExtractorCreateStream\n_SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding\n_SherpaOnnxSpeakerEmbeddingExtractorDim\n_SherpaOnnxSpeakerEmbeddingExtractorIsReady\n_SherpaOnnxSpeakerEmbeddingManagerAdd\n_SherpaOnnxSpeakerEmbeddingManagerAddList\n_SherpaOnnxSpeakerEmbeddingManagerAddListFlattened\n_SherpaOnnxSpeakerEmbeddingManagerContains\n_SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers\n_SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches\n_SherpaOnnxSpeakerEmbeddingManagerFreeSearch\n_SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers\n_SherpaOnnxSpeakerEmbeddingManagerGetBestMatches\n_SherpaOnnxSpeakerEmbeddingManagerNumSpeakers\n_SherpaOnnxSpeakerEmbeddingManagerRemove\n_SherpaOnnxSpeakerEmbeddingManagerSearch\n_SherpaOnnxSpeakerEmbeddingManagerVerify\n_SherpaOnnxSpokenLanguageIdentificationCompute\n_SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream\n_SherpaOnnxVoiceActi
vityDetectorAcceptWaveform\n_SherpaOnnxVoiceActivityDetectorClear\n_SherpaOnnxVoiceActivityDetectorDetected\n_SherpaOnnxVoiceActivityDetectorEmpty\n_SherpaOnnxVoiceActivityDetectorFlush\n_SherpaOnnxVoiceActivityDetectorFront\n_SherpaOnnxVoiceActivityDetectorPop\n_SherpaOnnxVoiceActivityDetectorReset\n_SherpaOnnxWaveFileSize\n_SherpaOnnxWriteWave\n_SherpaOnnxWriteWaveToBuffer\n"
  },
  {
    "path": "sherpa-onnx/c-api/sherpa-onnx-symbols-c.lds",
    "content": "{\n  global:\n    SherpaOnnx*;\n    # For offline punctuation.\n    SherpaOffline*;\n  local:\n    *;\n};\n"
  },
  {
    "path": "sherpa-onnx/csrc/.gitignore",
    "content": "*.cc-bak\n*.h-bak\n"
  },
  {
    "path": "sherpa-onnx/csrc/CMakeLists.txt",
    "content": "include_directories(${PROJECT_SOURCE_DIR})\n\nif(SHERPA_ONNX_ENABLE_PYTHON)\n  message(STATUS \"PYTHON_EXECUTABLE: ${PYTHON_EXECUTABLE}\")\n  execute_process(\n    COMMAND \"${PYTHON_EXECUTABLE}\" -c \"import sys; print('.'.join(sys.version.split('.')[:2]))\"\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n    OUTPUT_VARIABLE PYTHON_VERSION\n  )\n  message(STATUS \"PYTHON_VERSION: ${PYTHON_VERSION}\")\nendif()\n\nset(sources\n  base64-decode.cc\n  bbpe.cc\n  cat.cc\n  circular-buffer.cc\n  context-graph.cc\n  endpoint.cc\n  features.cc\n  file-utils.cc\n  fst-utils.cc\n  homophone-replacer.cc\n  hypothesis.cc\n  keyword-spotter-impl.cc\n  keyword-spotter.cc\n  lodr-fst.cc\n  math.cc\n  normal-data-generator.cc\n  offline-canary-model-config.cc\n  offline-canary-model.cc\n  offline-ctc-fst-decoder-config.cc\n  offline-ctc-fst-decoder.cc\n  offline-ctc-greedy-search-decoder.cc\n  offline-ctc-model.cc\n  offline-dolphin-model-config.cc\n  offline-dolphin-model.cc\n  offline-fire-red-asr-ctc-model-config.cc\n  offline-fire-red-asr-ctc-model.cc\n  offline-fire-red-asr-greedy-search-decoder.cc\n  offline-fire-red-asr-model-config.cc\n  offline-fire-red-asr-model.cc\n  offline-lm-config.cc\n  offline-lm.cc\n  offline-medasr-ctc-model-config.cc\n  offline-medasr-ctc-model.cc\n  offline-model-config.cc\n  offline-moonshine-greedy-search-decoder.cc\n  offline-moonshine-v2-greedy-search-decoder.cc\n  offline-moonshine-model-config.cc\n  offline-moonshine-model-v2.cc\n  offline-moonshine-model.cc\n  offline-nemo-enc-dec-ctc-model-config.cc\n  offline-nemo-enc-dec-ctc-model.cc\n  offline-omnilingual-asr-ctc-model-config.cc\n  offline-omnilingual-asr-ctc-model.cc\n  offline-paraformer-greedy-search-decoder.cc\n  offline-paraformer-model-config.cc\n  offline-paraformer-model.cc\n  offline-recognizer-impl.cc\n  offline-recognizer.cc\n  offline-rnn-lm.cc\n  offline-sense-voice-model-config.cc\n  offline-sense-voice-model.cc\n  offline-source-separation-impl.cc\n  
offline-source-separation-model-config.cc\n  offline-source-separation-spleeter-model-config.cc\n  offline-source-separation-spleeter-model.cc\n  offline-source-separation-uvr-model-config.cc\n  offline-source-separation-uvr-model.cc\n  offline-source-separation.cc\n  offline-stream.cc\n  offline-tdnn-ctc-model.cc\n  offline-tdnn-model-config.cc\n  offline-telespeech-ctc-model.cc\n  offline-transducer-greedy-search-decoder.cc\n  offline-transducer-greedy-search-nemo-decoder.cc\n  offline-transducer-model-config.cc\n  offline-transducer-model.cc\n  offline-transducer-modified-beam-search-decoder.cc\n  offline-transducer-modified-beam-search-nemo-decoder.cc\n  offline-transducer-nemo-model.cc\n  offline-wenet-ctc-model-config.cc\n  offline-wenet-ctc-model.cc\n  offline-whisper-dtw.cc\n  offline-whisper-greedy-search-decoder.cc\n  offline-whisper-model-config.cc\n  offline-whisper-model.cc\n  offline-whisper-timestamp-rules.cc\n  offline-zipformer-ctc-model-config.cc\n  offline-zipformer-ctc-model.cc\n  online-conformer-transducer-model.cc\n  online-ctc-fst-decoder-config.cc\n  online-ctc-fst-decoder.cc\n  online-ctc-greedy-search-decoder.cc\n  online-ctc-model.cc\n  online-ebranchformer-transducer-model.cc\n  online-lm-config.cc\n  online-lm.cc\n  online-lstm-transducer-model.cc\n  online-model-config.cc\n  online-nemo-ctc-model-config.cc\n  online-nemo-ctc-model.cc\n  online-paraformer-model-config.cc\n  online-paraformer-model.cc\n  online-recognizer-impl.cc\n  online-recognizer.cc\n  online-rnn-lm.cc\n  online-stream.cc\n  online-t-one-ctc-model-config.cc\n  online-t-one-ctc-model.cc\n  online-transducer-decoder.cc\n  online-transducer-greedy-search-decoder.cc\n  online-transducer-greedy-search-nemo-decoder.cc\n  online-transducer-model-config.cc\n  online-transducer-model.cc\n  online-transducer-modified-beam-search-decoder.cc\n  online-transducer-nemo-model.cc\n  online-wenet-ctc-model-config.cc\n  online-wenet-ctc-model.cc\n  
online-zipformer-transducer-model.cc\n  online-zipformer2-ctc-model-config.cc\n  online-zipformer2-ctc-model.cc\n  online-zipformer2-transducer-model.cc\n  onnx-utils.cc\n  packed-sequence.cc\n  pad-sequence.cc\n  parse-options.cc\n  phrase-matcher.cc\n  provider-config.cc\n  provider.cc\n  resample.cc\n  session.cc\n  silero-vad-model-config.cc\n  silero-vad-model.cc\n  slice.cc\n  spoken-language-identification-impl.cc\n  spoken-language-identification.cc\n  stack.cc\n  symbol-table.cc\n  ten-vad-model-config.cc\n  ten-vad-model.cc\n  text-utils.cc\n  timer.cc\n  transducer-keyword-decoder.cc\n  transpose.cc\n  unbind.cc\n  utils.cc\n  vad-model-config.cc\n  vad-model.cc\n  version.cc\n  voice-activity-detector.cc\n  wave-reader.cc\n  wave-writer.cc\n)\n\n# speaker embedding extractor\nlist(APPEND sources\n  speaker-embedding-extractor-impl.cc\n  speaker-embedding-extractor-model.cc\n  speaker-embedding-extractor-nemo-model.cc\n  speaker-embedding-extractor.cc\n  speaker-embedding-manager.cc\n)\n\n# audio tagging\nlist(APPEND sources\n  audio-tagging-impl.cc\n  audio-tagging-label-file.cc\n  audio-tagging-model-config.cc\n  audio-tagging.cc\n  offline-ced-model.cc\n  offline-zipformer-audio-tagging-model-config.cc\n  offline-zipformer-audio-tagging-model.cc\n)\n\nlist(APPEND sources\n  qnn-config.cc\n)\n\n# punctuation\nlist(APPEND sources\n  offline-ct-transformer-model.cc\n  offline-punctuation-impl.cc\n  offline-punctuation-model-config.cc\n  offline-punctuation.cc\n  online-cnn-bilstm-model.cc\n  online-punctuation-impl.cc\n  online-punctuation-model-config.cc\n  online-punctuation.cc\n)\nif(SHERPA_ONNX_ENABLE_RKNN)\n  list(APPEND sources\n    ./rknn/context-blocking-queue-rknn.cc\n    ./rknn/offline-sense-voice-model-rknn.cc\n    ./rknn/offline-paraformer-model-rknn.cc\n    ./rknn/online-stream-rknn.cc\n    ./rknn/online-transducer-greedy-search-decoder-rknn.cc\n    ./rknn/online-transducer-modified-beam-search-decoder-rknn.cc\n    
./rknn/online-zipformer-ctc-model-rknn.cc\n    ./rknn/online-zipformer-transducer-model-rknn.cc\n    ./rknn/silero-vad-model-rknn.cc\n    ./rknn/transducer-keyword-decoder-rknn.cc\n    ./rknn/utils.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXERA)\n  list(APPEND sources\n    ./axera/ax-engine-guard.cc\n    ./axera/offline-sense-voice-model-axera.cc\n    ./axera/utils.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXCL)\n  list(APPEND sources\n    ./axcl/axcl-engine-guard.cc\n    ./axcl/axcl-engine-io-guard.cc\n    ./axcl/axcl-engine-io-info-guard.cc\n    ./axcl/axcl-manager.cc\n    ./axcl/axcl-model.cc\n    ./axcl/offline-sense-voice-model-axcl.cc\n    ./axcl/utils.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_RKNN OR SHERPA_ONNX_ENABLE_ASCEND_NPU OR SHERPA_ONNX_ENABLE_QNN OR SHERPA_ONNX_ENABLE_AXERA OR SHERPA_ONNX_ENABLE_AXCL)\n  list(APPEND sources\n    ./rknn/offline-ctc-greedy-search-decoder-rknn.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_ASCEND_NPU)\n  list(APPEND sources\n    ./ascend/offline-paraformer-model-ascend.cc\n    ./ascend/offline-sense-voice-model-ascend.cc\n    ./ascend/offline-whisper-model-ascend.cc\n    ./ascend/offline-zipformer-ctc-model-ascend.cc\n    ./ascend/utils.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_QNN)\n  list(APPEND sources\n    ./qnn/offline-sense-voice-model-qnn.cc\n    ./qnn/offline-paraformer-model-qnn.cc\n    ./qnn/offline-zipformer-ctc-model-qnn.cc\n    ./qnn/qnn-backend.cc\n    ./qnn/qnn-model.cc\n    ./qnn/utils.cc\n  )\nendif()\n\nlist(APPEND sources\n  offline-funasr-nano-model-config.cc\n  offline-funasr-nano-model.cc\n  offline-recognizer-funasr-nano-impl.cc\n  funasr-nano-tokenizer.cc\n)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  list(APPEND sources\n    character-lexicon.cc\n    hifigan-vocoder.cc\n    kokoro-multi-lang-lexicon.cc\n    lexicon.cc\n    matcha-tts-lexicon.cc\n    melo-tts-lexicon.cc\n    offline-tts-character-frontend.cc\n    offline-tts-frontend.cc\n    offline-tts-impl.cc\n    offline-tts-kitten-model-config.cc\n    
offline-tts-kitten-model.cc\n    offline-tts-kokoro-model-config.cc\n    offline-tts-kokoro-model.cc\n    offline-tts-matcha-model-config.cc\n    offline-tts-matcha-model.cc\n    offline-tts-model-config.cc\n    offline-tts-pocket-model-config.cc\n    offline-tts-pocket-model.cc\n    offline-tts-supertonic-impl.cc\n    offline-tts-supertonic-model-config.cc\n    offline-tts-supertonic-model.cc\n    offline-tts-supertonic-unicode-processor.cc\n    offline-tts-vits-model-config.cc\n    offline-tts-vits-model.cc\n    offline-tts-zipvoice-model-config.cc\n    offline-tts-zipvoice-model.cc\n    offline-tts.cc\n    piper-phonemize-lexicon.cc\n    sentence-piece-tokenizer.cc\n    vocoder.cc\n    vocos-vocoder.cc\n  )\nendif()\n\nlist(APPEND sources\n  offline-speech-denoiser-dpdfnet-model-config.cc\n  offline-speech-denoiser-dpdfnet-model.cc\n  offline-speech-denoiser-gtcrn-model-config.cc\n  offline-speech-denoiser-gtcrn-model.cc\n  offline-speech-denoiser-impl.cc\n  offline-speech-denoiser-model-config.cc\n  offline-speech-denoiser.cc\n  online-speech-denoiser-impl.cc\n  online-speech-denoiser.cc\n)\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  list(APPEND sources\n    fast-clustering-config.cc\n    fast-clustering.cc\n    offline-speaker-diarization-impl.cc\n    offline-speaker-diarization-result.cc\n    offline-speaker-diarization.cc\n    offline-speaker-segmentation-model-config.cc\n    offline-speaker-segmentation-pyannote-model-config.cc\n    offline-speaker-segmentation-pyannote-model.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_CHECK)\n  list(APPEND sources log.cc)\nendif()\n\n# Always static build\nadd_library(sherpa-onnx-core STATIC ${sources})\n\n\nif(WIN32 AND SHERPA_ONNX_LINK_D3D)\n    target_link_libraries(sherpa-onnx-core dxguid.lib d3d12.lib dxgi.lib dxcore.lib)\nendif()\n\n\nif(TARGET directml)\n    target_link_libraries(sherpa-onnx-core directml)\nendif()\n\nset_target_properties(\n    sherpa-onnx-core\n  PROPERTIES\n    POSITION_INDEPENDENT_CODE ON\n   
 C_VISIBILITY_PRESET hidden\n    CXX_VISIBILITY_PRESET hidden\n)\n\nif(APPLE)\n  target_compile_options(sherpa-onnx-core PRIVATE\n    -Wno-deprecated-declarations\n  )\nendif()\n\nif(ANDROID_NDK)\n  target_link_libraries(sherpa-onnx-core android log)\nendif()\n\ntarget_link_libraries(sherpa-onnx-core\n  kaldi-native-fbank-core\n  kaldi-decoder-core\n  ssentencepiece_core\n)\nif(DEFINED OHOS AND x${OHOS} STREQUAL xOHOS)\n  target_link_libraries(sherpa-onnx-core\n    hilog_ndk.z\n    rawfile.z\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_RKNN)\n  if(DEFINED ENV{SHERPA_ONNX_RKNN_TOOLKIT2_LIB_DIR})\n    target_link_libraries(sherpa-onnx-core -L$ENV{SHERPA_ONNX_RKNN_TOOLKIT2_LIB_DIR} -lrknnrt)\n  else()\n    target_link_libraries(sherpa-onnx-core rknnrt)\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXERA)\n  if(DEFINED ENV{SHERPA_ONNX_AXERA_LIB_DIR})\n    target_link_libraries(sherpa-onnx-core\n      -L$ENV{SHERPA_ONNX_AXERA_LIB_DIR}\n      -lax_engine\n      -lax_interpreter\n      -lax_sys\n      -lpthread\n    )\n  else()\n    target_link_libraries(sherpa-onnx-core\n      ax_engine\n      ax_interpreter\n      ax_sys\n      pthread\n    )\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_AXCL)\n  if(DEFINED ENV{SHERPA_ONNX_AXCL_LIB_DIR})\n    target_link_libraries(sherpa-onnx-core\n      -L$ENV{SHERPA_ONNX_AXCL_LIB_DIR}\n      -laxcl_rt\n      )\n  else()\n    target_link_libraries(sherpa-onnx-core\n      axcl_rt\n    )\n  endif()\nendif()\n\nif(SHERPA_ONNX_ENABLE_ASCEND_NPU)\n    target_include_directories(sherpa-onnx-core PRIVATE ${ASCEND_TOOLKIT_HOME}/include)\n    target_link_libraries(sherpa-onnx-core\n      -L${ASCEND_TOOLKIT_HOME}/lib64\n      -lascendcl\n    )\nendif()\n\nif(SHERPA_ONNX_ENABLE_QNN)\n  target_include_directories(sherpa-onnx-core PRIVATE ${QNN_SDK_ROOT}/include/QNN)\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPACEMIT)\n  if(TARGET spacemit_ep)\n    target_link_libraries(sherpa-onnx-core spacemit_ep)\n  else()\n    target_link_libraries(sherpa-onnx-core 
${spacemit_ep_lib_files})\n  endif()\nendif()\n\nif(TARGET onnxruntime)\n  target_link_libraries(sherpa-onnx-core onnxruntime)\nelse()\n  target_link_libraries(sherpa-onnx-core ${onnxruntime_lib_files})\nendif()\n\nif(NOT WIN32)\n  target_link_libraries(sherpa-onnx-core -lm)\nendif()\n\nif(NOT BUILD_SHARED_LIBS AND APPLE)\n  target_link_libraries(sherpa-onnx-core \"-framework Foundation\")\nendif()\n\ntarget_link_libraries(sherpa-onnx-core fstfar fst)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  target_link_libraries(sherpa-onnx-core\n    piper_phonemize)\nendif()\n\nif(SHERPA_ONNX_ENABLE_CHECK)\n  target_compile_definitions(sherpa-onnx-core PUBLIC SHERPA_ONNX_ENABLE_CHECK=1)\n\n  if(SHERPA_ONNX_HAVE_EXECINFO_H)\n    target_compile_definitions(sherpa-onnx-core PRIVATE SHERPA_ONNX_HAVE_EXECINFO_H=1)\n  endif()\n\n  if(SHERPA_ONNX_HAVE_CXXABI_H)\n    target_compile_definitions(sherpa-onnx-core PRIVATE SHERPA_ONNX_HAVE_CXXABI_H=1)\n  endif()\nendif()\n\nif(NOT BUILD_SHARED_LIBS AND CMAKE_SYSTEM_NAME STREQUAL Linux)\n  # This is for linux arm32 and arm64\n  target_link_libraries(sherpa-onnx-core -ldl)\nendif()\n\nif(NOT WIN32 AND NOT SHERPA_ONNX_ENABLE_WASM AND CMAKE_SYSTEM_NAME STREQUAL Linux)\n  target_link_libraries(sherpa-onnx-core -pthread)\nendif()\n\nif(SHERPA_ONNX_ENABLE_BINARY)\n  add_executable(sherpa-onnx sherpa-onnx.cc)\n  add_executable(sherpa-onnx-keyword-spotter sherpa-onnx-keyword-spotter.cc)\n  add_executable(sherpa-onnx-offline sherpa-onnx-offline.cc)\n  add_executable(sherpa-onnx-offline-audio-tagging sherpa-onnx-offline-audio-tagging.cc)\n  add_executable(sherpa-onnx-offline-denoiser sherpa-onnx-offline-denoiser.cc)\n  add_executable(sherpa-onnx-offline-language-identification sherpa-onnx-offline-language-identification.cc)\n  add_executable(sherpa-onnx-offline-parallel sherpa-onnx-offline-parallel.cc)\n  add_executable(sherpa-onnx-offline-punctuation sherpa-onnx-offline-punctuation.cc)\n  add_executable(sherpa-onnx-offline-source-separation 
sherpa-onnx-offline-source-separation.cc)\n  add_executable(sherpa-onnx-online-denoiser sherpa-onnx-online-denoiser.cc)\n  add_executable(sherpa-onnx-online-punctuation sherpa-onnx-online-punctuation.cc)\n  add_executable(sherpa-onnx-version sherpa-onnx-version.cc version.cc)\n  add_executable(sherpa-onnx-vad sherpa-onnx-vad.cc)\n\n  if(SHERPA_ONNX_ENABLE_TTS)\n    add_executable(sherpa-onnx-offline-tts sherpa-onnx-offline-tts.cc)\n  endif()\n\n  if(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n    add_executable(sherpa-onnx-offline-speaker-diarization sherpa-onnx-offline-speaker-diarization.cc)\n  endif()\n\n  set(main_exes\n    sherpa-onnx\n    sherpa-onnx-keyword-spotter\n    sherpa-onnx-offline\n    sherpa-onnx-offline-audio-tagging\n    sherpa-onnx-offline-denoiser\n    sherpa-onnx-offline-language-identification\n    sherpa-onnx-offline-parallel\n    sherpa-onnx-offline-punctuation\n    sherpa-onnx-offline-source-separation\n    sherpa-onnx-online-denoiser\n    sherpa-onnx-online-punctuation\n    sherpa-onnx-vad\n  )\n  if(SHERPA_ONNX_ENABLE_TTS)\n    list(APPEND main_exes\n      sherpa-onnx-offline-tts\n    )\n  endif()\n\n  if(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n    list(APPEND main_exes\n      sherpa-onnx-offline-speaker-diarization\n    )\n  endif()\n\n  foreach(exe IN LISTS main_exes)\n    target_link_libraries(${exe} sherpa-onnx-core)\n  endforeach()\n\n  if(NOT WIN32)\n    foreach(exe IN LISTS main_exes)\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n\n      if(SHERPA_ONNX_ENABLE_PYTHON)\n        target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n      elseif(SHERPA_ONNX_SPLIT_PYTHON_PACKAGE)\n        foreach(ver IN ITEMS 3.8 3.9 3.10 3.11 3.12 3.13 3.14)\n          target_link_libraries(${exe} 
\"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n        endforeach()\n      endif()\n    endforeach()\n  endif()\nendif()\n\nif(NOT BUILD_SHARED_LIBS)\n  install(TARGETS sherpa-onnx-core DESTINATION lib)\nendif()\n\nif(SHERPA_ONNX_ENABLE_BINARY)\n  install(\n    TARGETS\n      ${main_exes}\n      sherpa-onnx-version\n    DESTINATION\n      bin\n  )\nendif()\n\nif(SHERPA_ONNX_HAS_ALSA AND SHERPA_ONNX_ENABLE_BINARY)\n  add_executable(sherpa-onnx-alsa sherpa-onnx-alsa.cc alsa.cc)\n  add_executable(sherpa-onnx-alsa-offline sherpa-onnx-alsa-offline.cc alsa.cc)\n  add_executable(sherpa-onnx-alsa-offline-audio-tagging sherpa-onnx-alsa-offline-audio-tagging.cc alsa.cc)\n  add_executable(sherpa-onnx-alsa-offline-speaker-identification sherpa-onnx-alsa-offline-speaker-identification.cc alsa.cc)\n  add_executable(sherpa-onnx-keyword-spotter-alsa sherpa-onnx-keyword-spotter-alsa.cc alsa.cc)\n  add_executable(sherpa-onnx-vad-alsa sherpa-onnx-vad-alsa.cc alsa.cc)\n  add_executable(sherpa-onnx-vad-alsa-offline-asr sherpa-onnx-vad-alsa-offline-asr.cc alsa.cc)\n\n\n  if(SHERPA_ONNX_ENABLE_TTS)\n    add_executable(sherpa-onnx-offline-tts-play-alsa sherpa-onnx-offline-tts-play-alsa.cc alsa-play.cc)\n  endif()\n\n  set(exes\n    sherpa-onnx-alsa\n    sherpa-onnx-alsa-offline\n    sherpa-onnx-alsa-offline-speaker-identification\n    sherpa-onnx-keyword-spotter-alsa\n    sherpa-onnx-vad-alsa\n    sherpa-onnx-vad-alsa-offline-asr\n    sherpa-onnx-alsa-offline-audio-tagging\n  )\n\n  if(SHERPA_ONNX_ENABLE_TTS)\n    list(APPEND exes\n      sherpa-onnx-offline-tts-play-alsa\n    )\n  endif()\n\n  #   # To fix the following error for Windows when building exe\n  #   #  mismatch detected for 'RuntimeLibrary': value 'MT_StaticRelease' doesn't match value 'MD_Dynamic Release'\n\n  foreach(exe IN LISTS exes)\n    target_link_libraries(${exe} sherpa-onnx-core)\n  endforeach()\n\n  foreach(exe IN LISTS exes)\n    if(DEFINED 
ENV{SHERPA_ONNX_ALSA_LIB_DIR})\n      target_link_libraries(${exe} -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\n    else()\n      target_link_libraries(${exe} asound)\n    endif()\n  endforeach()\n\n  if(NOT WIN32)\n    foreach(exe IN LISTS exes)\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n    endforeach()\n\n    if(SHERPA_ONNX_ENABLE_PYTHON)\n      foreach(exe IN LISTS exes)\n        target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n      endforeach()\n    elseif(SHERPA_ONNX_SPLIT_PYTHON_PACKAGE)\n      foreach(exe IN LISTS exes)\n        foreach(ver IN ITEMS 3.8 3.9 3.10 3.11 3.12 3.13 3.14)\n          target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n        endforeach()\n      endforeach()\n    endif()\n  endif()\n\n  install(\n    TARGETS ${exes}\n    DESTINATION\n      bin\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_PORTAUDIO AND SHERPA_ONNX_ENABLE_BINARY)\n  if(SHERPA_ONNX_ENABLE_TTS)\n    add_executable(sherpa-onnx-offline-tts-play\n      sherpa-onnx-offline-tts-play.cc\n      microphone.cc\n    )\n  endif()\n\n  add_executable(sherpa-onnx-keyword-spotter-microphone\n    sherpa-onnx-keyword-spotter-microphone.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-microphone\n    sherpa-onnx-microphone.cc\n    microphone.cc\n  )\n\n\n  add_executable(sherpa-onnx-microphone-offline\n    sherpa-onnx-microphone-offline.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-vad-microphone\n    sherpa-onnx-vad-microphone.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-vad-microphone-simulated-streaming-asr\n    sherpa-onnx-vad-microphone-simulated-streaming-asr.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-vad-with-offline-asr\n    
sherpa-onnx-vad-with-offline-asr.cc\n  )\n\n  add_executable(sherpa-onnx-vad-with-online-asr\n    sherpa-onnx-vad-with-online-asr.cc\n  )\n\n  add_executable(sherpa-onnx-vad-microphone-offline-asr\n    sherpa-onnx-vad-microphone-offline-asr.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-microphone-offline-speaker-identification\n    sherpa-onnx-microphone-offline-speaker-identification.cc\n    microphone.cc\n  )\n\n  add_executable(sherpa-onnx-microphone-offline-audio-tagging\n    sherpa-onnx-microphone-offline-audio-tagging.cc\n    microphone.cc\n  )\n\n  set(exes\n    sherpa-onnx-keyword-spotter-microphone\n    sherpa-onnx-microphone\n    sherpa-onnx-microphone-offline\n    sherpa-onnx-microphone-offline-audio-tagging\n    sherpa-onnx-microphone-offline-speaker-identification\n    sherpa-onnx-vad-microphone\n    sherpa-onnx-vad-microphone-simulated-streaming-asr\n    sherpa-onnx-vad-microphone-offline-asr\n    sherpa-onnx-vad-with-offline-asr\n    sherpa-onnx-vad-with-online-asr\n  )\n  if(SHERPA_ONNX_ENABLE_TTS)\n    list(APPEND exes\n      sherpa-onnx-offline-tts-play\n    )\n  endif()\n\n  foreach(exe IN LISTS exes)\n    target_link_libraries(${exe} portaudio_static sherpa-onnx-core)\n  endforeach()\n\n  if(NOT WIN32)\n    foreach(exe IN LISTS exes)\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n      target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n    endforeach()\n\n    if(SHERPA_ONNX_ENABLE_PYTHON)\n      foreach(exe IN LISTS exes)\n        target_link_libraries(${exe} \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n      endforeach()\n    elseif(SHERPA_ONNX_SPLIT_PYTHON_PACKAGE)\n      foreach(exe IN LISTS exes)\n        foreach(ver in ITEMS 3.8 3.9 3.10 3.11 3.12 3.13 3.14)\n          target_link_libraries(${exe} 
\"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n        endforeach()\n      endforeach()\n    endif()\n  endif()\n\n  install(\n    TARGETS ${exes}\n    DESTINATION\n      bin\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_WEBSOCKET AND SHERPA_ONNX_ENABLE_BINARY)\n  add_definitions(-DASIO_STANDALONE)\n  add_definitions(-D_WEBSOCKETPP_CPP11_STL_)\n\n  add_executable(sherpa-onnx-online-websocket-server\n    online-websocket-server-impl.cc\n    online-websocket-server.cc\n  )\n  target_link_libraries(sherpa-onnx-online-websocket-server sherpa-onnx-core)\n\n  add_executable(sherpa-onnx-online-websocket-client\n    online-websocket-client.cc\n  )\n  target_link_libraries(sherpa-onnx-online-websocket-client sherpa-onnx-core)\n\n  if(NOT WIN32)\n    target_compile_options(sherpa-onnx-online-websocket-server PRIVATE -Wno-deprecated-declarations)\n\n    target_compile_options(sherpa-onnx-online-websocket-client PRIVATE -Wno-deprecated-declarations)\n  endif()\n\n  # For offline websocket\n  add_executable(sherpa-onnx-offline-websocket-server\n    offline-websocket-server-impl.cc\n    offline-websocket-server.cc\n  )\n  target_link_libraries(sherpa-onnx-offline-websocket-server sherpa-onnx-core)\n\n  if(NOT WIN32)\n    target_compile_options(sherpa-onnx-offline-websocket-server PRIVATE -Wno-deprecated-declarations)\n  endif()\n\n  if(NOT WIN32)\n    target_link_libraries(sherpa-onnx-online-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n    target_link_libraries(sherpa-onnx-online-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n\n    target_link_libraries(sherpa-onnx-online-websocket-client \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n    target_link_libraries(sherpa-onnx-online-websocket-client \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n\n    target_link_libraries(sherpa-onnx-offline-websocket-server 
\"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib\")\n    target_link_libraries(sherpa-onnx-offline-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../../../sherpa_onnx/lib\")\n\n    if(SHERPA_ONNX_ENABLE_PYTHON AND NOT WIN32)\n      target_link_libraries(sherpa-onnx-online-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n      target_link_libraries(sherpa-onnx-online-websocket-client \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n      target_link_libraries(sherpa-onnx-offline-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${PYTHON_VERSION}/site-packages/sherpa_onnx/lib\")\n    elseif(SHERPA_ONNX_SPLIT_PYTHON_PACKAGE)\n        foreach(ver in ITEMS 3.8 3.9 3.10 3.11 3.12 3.13 3.14)\n          target_link_libraries(sherpa-onnx-online-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n          target_link_libraries(sherpa-onnx-online-websocket-client \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n          target_link_libraries(sherpa-onnx-offline-websocket-server \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/../lib/python${ver}/site-packages/sherpa_onnx/lib\")\n        endforeach()\n    endif()\n  endif()\n\n  install(\n    TARGETS\n      sherpa-onnx-online-websocket-server\n      sherpa-onnx-online-websocket-client\n      sherpa-onnx-offline-websocket-server\n    DESTINATION\n      bin\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_TESTS)\n  set(sherpa_onnx_test_srcs\n    cat-test.cc\n    circular-buffer-test.cc\n    context-graph-test.cc\n    math-test.cc\n    offline-whisper-timestamp-rules-test.cc\n    packed-sequence-test.cc\n    pad-sequence-test.cc\n    regex-lang-test.cc\n    slice-test.cc\n    stack-test.cc\n    text-utils-test.cc\n    text2token-test.cc\n    transpose-test.cc\n    unbind-test.cc\n   
 utfcpp-test.cc\n    wave-reader-test.cc\n  )\n  if(SHERPA_ONNX_ENABLE_TTS)\n    list(APPEND sherpa_onnx_test_srcs\n      sentence-piece-tokenizer-test.cc\n      piper-phonemize-test.cc\n    )\n  endif()\n\n  if(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n    list(APPEND sherpa_onnx_test_srcs\n      fast-clustering-test.cc\n    )\n  endif()\n\n  list(APPEND sherpa_onnx_test_srcs\n    speaker-embedding-manager-test.cc\n  )\n\n  function(sherpa_onnx_add_test source)\n    get_filename_component(name ${source} NAME_WE)\n    set(target_name ${name})\n    add_executable(${target_name} \"${source}\")\n\n    target_link_libraries(${target_name}\n      PRIVATE\n        gtest\n        gtest_main\n        sherpa-onnx-core\n    )\n\n    add_test(NAME \"${target_name}\"\n      COMMAND\n        $<TARGET_FILE:${target_name}>\n    )\n  endfunction()\n\n  foreach(source IN LISTS sherpa_onnx_test_srcs)\n    sherpa_onnx_add_test(${source})\n  endforeach()\nendif()\n\nset(srcs_to_check)\nforeach(s IN LISTS sources)\n  list(APPEND srcs_to_check ${CMAKE_CURRENT_LIST_DIR}/${s})\nendforeach()\n\n# For clang-tidy\nadd_custom_target(\n  clang-tidy-check\n  clang-tidy -p ${CMAKE_BINARY_DIR}/compile_commands.json --config-file ${PROJECT_SOURCE_DIR}/.clang-tidy ${srcs_to_check}\n  DEPENDS ${sources})\n\nadd_custom_target(check DEPENDS clang-tidy-check)\n"
  },
  {
    "path": "sherpa-onnx/csrc/CPPLINT.cfg",
    "content": "exclude_files=tee-stream.h\n"
  },
  {
    "path": "sherpa-onnx/csrc/README.md",
    "content": "# File descriptions\n\n- [./sherpa-onnx-alsa.cc](./sherpa-onnx-alsa.cc)\n  For Linux only, especially embedded Linux, e.g., Raspberry Pi; it uses a\n  streaming model for real-time speech recognition with a microphone.\n\n- [./sherpa-onnx-microphone.cc](./sherpa-onnx-microphone.cc)\n  For Linux/Windows/macOS; it uses a streaming model for real-time speech\n  recognition with a microphone.\n\n- [./sherpa-onnx-microphone-offline.cc](./sherpa-onnx-microphone-offline.cc)\n  For Linux/Windows/macOS; it uses a non-streaming model for speech\n  recognition with a microphone.\n\n- [./sherpa-onnx.cc](./sherpa-onnx.cc)\n  It uses a streaming model to decode wave files.\n\n- [./sherpa-onnx-offline.cc](./sherpa-onnx-offline.cc)\n  It uses a non-streaming model to decode wave files.\n\n- [./online-websocket-server.cc](./online-websocket-server.cc)\n  WebSocket server for streaming models.\n\n- [./offline-websocket-server.cc](./offline-websocket-server.cc)\n  WebSocket server for non-streaming models.\n\n- [./sherpa-onnx-vad-microphone.cc](./sherpa-onnx-vad-microphone.cc)\n  It uses Silero VAD to detect speech with a microphone.\n\n"
  },
  {
    "path": "sherpa-onnx/csrc/alsa-play.cc",
    "content": "// sherpa-onnx/csrc/alsa-play.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifdef SHERPA_ONNX_ENABLE_ALSA\n\n#include \"sherpa-onnx/csrc/alsa-play.h\"\n\n#include <algorithm>\n#include <cstdio>\n#include <memory>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nAlsaPlay::AlsaPlay(const char *device_name, int32_t sample_rate) {\n  int32_t err = snd_pcm_open(&handle_, device_name, SND_PCM_STREAM_PLAYBACK, 0);\n\n  if (err) {\n    fprintf(stderr, \"Unable to open: %s. %s\\n\", device_name, snd_strerror(err));\n    exit(-1);\n  }\n\n  SetParameters(sample_rate);\n}\n\nAlsaPlay::~AlsaPlay() {\n  if (handle_) {\n    int32_t err = snd_pcm_close(handle_);\n    if (err < 0) {\n      printf(\"Failed to close pcm: %s\\n\", snd_strerror(err));\n    }\n  }\n}\n\nvoid AlsaPlay::SetParameters(int32_t sample_rate) {\n  // set the following parameters\n  // 1. sample_rate\n  // 2. sample format: int16_t\n  // 3. num_channels: 1\n  snd_pcm_hw_params_t *params;\n  snd_pcm_hw_params_alloca(&params);\n  snd_pcm_hw_params_any(handle_, params);\n\n  int32_t err = snd_pcm_hw_params_set_access(handle_, params,\n                                             SND_PCM_ACCESS_RW_INTERLEAVED);\n  if (err < 0) {\n    printf(\"SND_PCM_ACCESS_RW_INTERLEAVED is not supported: %s\\n\",\n           snd_strerror(err));\n    exit(-1);\n  }\n\n  err = snd_pcm_hw_params_set_format(handle_, params, SND_PCM_FORMAT_S16_LE);\n\n  if (err < 0) {\n    printf(\"Can't set format to 16-bit: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  err = snd_pcm_hw_params_set_channels(handle_, params, 1);\n\n  if (err < 0) {\n    printf(\"Can't set channel number to 1: %s\\n\", snd_strerror(err));\n  }\n\n  uint32_t rate = sample_rate;\n  err = snd_pcm_hw_params_set_rate_near(handle_, params, &rate, 0);\n  if (err < 0) {\n    printf(\"Can't set rate to %d. 
%s\\n\", rate, snd_strerror(err));\n  }\n\n  err = snd_pcm_hw_params(handle_, params);\n  if (err < 0) {\n    printf(\"Can't set hardware parameters. %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  uint32_t tmp;\n  snd_pcm_hw_params_get_rate(params, &tmp, 0);\n  int32_t actual_sample_rate = tmp;\n  if (actual_sample_rate != sample_rate) {\n    fprintf(stderr,\n            \"Creating a resampler:\\n\"\n            \"   in_sample_rate: %d\\n\"\n            \"   output_sample_rate: %d\\n\",\n            sample_rate, actual_sample_rate);\n\n    float min_freq = std::min(actual_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler_ = std::make_unique<LinearResample>(\n        sample_rate, actual_sample_rate, lowpass_cutoff, lowpass_filter_width);\n  }\n\n  snd_pcm_uframes_t frames;\n  snd_pcm_hw_params_get_period_size(params, &frames, 0);\n  buf_.resize(frames);\n}\n\nvoid AlsaPlay::Play(const std::vector<float> &samples) {\n  std::vector<float> tmp;\n  const float *p = samples.data();\n  int32_t num_samples = samples.size();\n  if (resampler_) {\n    resampler_->Resample(samples.data(), samples.size(), false, &tmp);\n    p = tmp.data();\n    num_samples = tmp.size();\n  }\n\n  int32_t frames = buf_.size();\n  int32_t i = 0;\n  for (; i + frames < num_samples; i += frames) {\n    for (int32_t k = 0; k != frames; ++k) {\n      buf_[k] = p[i + k] * 32767;\n    }\n\n    int32_t err = snd_pcm_writei(handle_, buf_.data(), frames);\n    if (err == -EPIPE) {\n      printf(\"XRUN.\\n\");\n      snd_pcm_prepare(handle_);\n    } else if (err < 0) {\n      printf(\"Can't write to PCM device: %s\\n\", snd_strerror(err));\n      exit(-1);\n    }\n  }\n\n  if (i < num_samples) {\n    for (int32_t k = 0; k + i < num_samples; ++k) {\n      buf_[k] = p[i + k] * 32767;\n    }\n\n    int32_t err = snd_pcm_writei(handle_, buf_.data(), num_samples - i);\n    if (err == -EPIPE) {\n      
printf(\"XRUN.\\n\");\n      snd_pcm_prepare(handle_);\n    } else if (err < 0) {\n      printf(\"Can't write to PCM device: %s\\n\", snd_strerror(err));\n      exit(-1);\n    }\n  }\n}\n\nvoid AlsaPlay::Drain() {\n  int32_t err = snd_pcm_drain(handle_);\n  if (err < 0) {\n    printf(\"Failed to drain pcm. %s\\n\", snd_strerror(err));\n  }\n}\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_ENABLE_ALSA\n"
  },
  {
    "path": "sherpa-onnx/csrc/alsa-play.h",
    "content": "// sherpa-onnx/csrc/alsa-play.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ALSA_PLAY_H_\n#define SHERPA_ONNX_CSRC_ALSA_PLAY_H_\n\n#include <cstdint>\n#include <memory>\n#include <vector>\n\n#include \"alsa/asoundlib.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nclass AlsaPlay {\n public:\n  AlsaPlay(const char *device_name, int32_t sample_rate);\n  ~AlsaPlay();\n  void Play(const std::vector<float> &samples);\n\n  // wait for all the samples to be played\n  void Drain();\n\n private:\n  void SetParameters(int32_t sample_rate);\n\n private:\n  snd_pcm_t *handle_ = nullptr;\n  std::unique_ptr<LinearResample> resampler_;\n  std::vector<int16_t> buf_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ALSA_PLAY_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/alsa.cc",
    "content": "// sherpa-onnx/csrc/sherpa-alsa.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifdef SHERPA_ONNX_ENABLE_ALSA\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n\n#include <algorithm>\n#include <cstdio>\n#include <memory>\n#include <vector>\n\n#include \"alsa/asoundlib.h\"\n\nnamespace sherpa_onnx {\n\nvoid ToFloat(const std::vector<int16_t> &in, int32_t num_channels,\n             std::vector<float> *out) {\n  out->resize(in.size() / num_channels);\n\n  int32_t n = in.size();\n  for (int32_t i = 0, k = 0; i < n; i += num_channels, ++k) {\n    (*out)[k] = in[i] / 32768.;\n  }\n}\n\nAlsa::Alsa(const char *device_name) {\n  const char *kDeviceHelp = R\"(\nPlease use the command:\n\n  arecord -l\n\nto list all available devices. For instance, if the output is:\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\n  )\";\n\n  int32_t err =\n      snd_pcm_open(&capture_handle_, device_name, SND_PCM_STREAM_CAPTURE, 0);\n  if (err) {\n    fprintf(stderr, \"Unable to open: %s. 
%s\\n\", device_name, snd_strerror(err));\n    fprintf(stderr, \"%s\\n\", kDeviceHelp);\n    exit(-1);\n  }\n\n  snd_pcm_hw_params_t *hw_params;\n  snd_pcm_hw_params_alloca(&hw_params);\n\n  err = snd_pcm_hw_params_any(capture_handle_, hw_params);\n  if (err) {\n    fprintf(stderr, \"Failed to initialize hw_params: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  err = snd_pcm_hw_params_set_access(capture_handle_, hw_params,\n                                     SND_PCM_ACCESS_RW_INTERLEAVED);\n  if (err) {\n    fprintf(stderr, \"Failed to set access type: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  err = snd_pcm_hw_params_set_format(capture_handle_, hw_params,\n                                     SND_PCM_FORMAT_S16_LE);\n  if (err) {\n    fprintf(stderr, \"Failed to set format: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  // mono\n  err = snd_pcm_hw_params_set_channels(capture_handle_, hw_params, 1);\n  if (err) {\n    fprintf(stderr, \"Failed to set number of channels to 1. %s\\n\",\n            snd_strerror(err));\n\n    err = snd_pcm_hw_params_set_channels(capture_handle_, hw_params, 2);\n    if (err) {\n      fprintf(stderr, \"Failed to set number of channels to 2. %s\\n\",\n              snd_strerror(err));\n\n      exit(-1);\n    }\n    actual_channel_count_ = 2;\n    fprintf(stderr,\n            \"Channel count is set to 2. 
Will use only 1 channel of it.\\n\");\n  }\n\n  uint32_t actual_sample_rate = expected_sample_rate_;\n\n  int32_t dir = 0;\n  err = snd_pcm_hw_params_set_rate_near(capture_handle_, hw_params,\n                                        &actual_sample_rate, &dir);\n  if (err) {\n    fprintf(stderr, \"Failed to set sample rate to, %d: %s\\n\",\n            expected_sample_rate_, snd_strerror(err));\n    exit(-1);\n  }\n  actual_sample_rate_ = actual_sample_rate;\n\n  if (actual_sample_rate_ != expected_sample_rate_) {\n    fprintf(stderr, \"Failed to set sample rate to %d\\n\", expected_sample_rate_);\n    fprintf(stderr, \"Current sample rate is %d\\n\", actual_sample_rate_);\n    fprintf(stderr,\n            \"Creating a resampler:\\n\"\n            \"   in_sample_rate: %d\\n\"\n            \"   output_sample_rate: %d\\n\",\n            actual_sample_rate_, expected_sample_rate_);\n\n    float min_freq = std::min(actual_sample_rate_, expected_sample_rate_);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler_ = std::make_unique<LinearResample>(\n        actual_sample_rate_, expected_sample_rate_, lowpass_cutoff,\n        lowpass_filter_width);\n  } else {\n    fprintf(stderr, \"Current sample rate: %d\\n\", actual_sample_rate_);\n  }\n\n  err = snd_pcm_hw_params(capture_handle_, hw_params);\n  if (err) {\n    fprintf(stderr, \"Failed to set hw params: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  err = snd_pcm_prepare(capture_handle_);\n  if (err) {\n    fprintf(stderr, \"Failed to prepare for recording: %s\\n\", snd_strerror(err));\n    exit(-1);\n  }\n\n  fprintf(stderr, \"Recording started!\\n\");\n}\n\nAlsa::~Alsa() { snd_pcm_close(capture_handle_); }\n\nconst std::vector<float> &Alsa::Read(int32_t num_samples) {\n  samples_.resize(num_samples * actual_channel_count_);\n\n  // count is in frames. 
Each frame contains actual_channel_count_ samples\n  int32_t count = snd_pcm_readi(capture_handle_, samples_.data(), num_samples);\n  if (count == -EPIPE) {\n    static int32_t n = 0;\n    if (++n > 5) {\n      fprintf(\n          stderr,\n          \"Too many overruns. It is very likely that the RTF on your board is \"\n          \"larger than 1. Please use ./bin/sherpa-onnx to compute the RTF.\\n\");\n      exit(-1);\n    }\n    fprintf(stderr, \"XRUN.\\n\");\n    snd_pcm_prepare(capture_handle_);\n\n    static std::vector<float> tmp;\n    return tmp;\n  } else if (count < 0) {\n    fprintf(stderr, \"Can't read PCM device: %s\\n\", snd_strerror(count));\n    exit(-1);\n  }\n\n  samples_.resize(count * actual_channel_count_);\n\n  ToFloat(samples_, actual_channel_count_, &samples1_);\n\n  if (!resampler_) {\n    return samples1_;\n  }\n\n  // samples1_ holds one float per frame, so use its size (not samples_.size(),\n  // which is count * actual_channel_count_) to avoid reading past the end.\n  resampler_->Resample(samples1_.data(), samples1_.size(), false, &samples2_);\n  return samples2_;\n}\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_ENABLE_ALSA\n"
  },
  {
    "path": "sherpa-onnx/csrc/alsa.h",
    "content": "// sherpa-onnx/csrc/alsa.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ALSA_H_\n#define SHERPA_ONNX_CSRC_ALSA_H_\n\n#include <cstdint>\n#include <memory>\n#include <vector>\n\n#include \"alsa/asoundlib.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nclass Alsa {\n public:\n  explicit Alsa(const char *device_name);\n  ~Alsa();\n\n  // This is a blocking read.\n  //\n  // @param num_samples  Number of samples to read.\n  //\n  // The returned value is valid until the next call to Read().\n  const std::vector<float> &Read(int32_t num_samples);\n\n  int32_t GetExpectedSampleRate() const { return expected_sample_rate_; }\n  int32_t GetActualSampleRate() const { return actual_sample_rate_; }\n\n private:\n  snd_pcm_t *capture_handle_;\n  int32_t expected_sample_rate_ = 16000;\n  int32_t actual_sample_rate_;\n\n  int32_t actual_channel_count_ = 1;\n\n  std::unique_ptr<LinearResample> resampler_;\n  std::vector<int16_t> samples_;  // directly from the microphone\n  std::vector<float> samples1_;   // normalized version of samples_\n  std::vector<float> samples2_;   // possibly resampled from samples1_\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ALSA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/macros.h",
    "content": "// sherpa-onnx/csrc/ascend/macros.h\n//\n// Copyright      2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ASCEND_MACROS_H_\n#define SHERPA_ONNX_CSRC_ASCEND_MACROS_H_\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\n#define SHERPA_ONNX_ASCEND_CHECK(ret, msg, ...)    \\\n  do {                                             \\\n    if ((ret) != ACL_ERROR_NONE) {                 \\\n      const char *_msg = aclGetRecentErrMsg();     \\\n      SHERPA_ONNX_LOGE(\"Return code is: %d\", ret); \\\n      SHERPA_ONNX_LOGE(\"Error message: %s\", _msg); \\\n      SHERPA_ONNX_LOGE(msg, ##__VA_ARGS__);        \\\n      SHERPA_ONNX_EXIT(-1);                        \\\n    }                                              \\\n  } while (0)\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_MACROS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.cc",
    "content": "// sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// References:\n// https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1alpha003/API/appdevgapi/aclcppdevg_03_0298.html\n#include \"sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/ascend/macros.h\"\n#include \"sherpa-onnx/csrc/ascend/utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelAscend::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n\n    std::vector<std::string> filenames;\n    SplitStringToVector(config_.paraformer.model, \",\", false, &filenames);\n    if (filenames.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Invalid paraformer ascend NPU model '%s'\",\n                       config_.paraformer.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitEncoder(filenames[0]);\n    InitPredictor(filenames[1]);\n    InitDecoder(filenames[2]);\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n\n    std::vector<std::string> filenames;\n    SplitStringToVector(config_.paraformer.model, \",\", false, &filenames);\n    if (filenames.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Invalid paraformer ascend NPU model '%s'\",\n                       config_.paraformer.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      auto 
buf = ReadFile(mgr, filenames[0]);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, filenames[1]);\n      InitPredictor(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, filenames[2]);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    PostInit();\n  }\n\n  std::vector<float> Run(std::vector<float> features) {\n    // TODO(fangjun): Support multi clients\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    int32_t num_frames = features.size() / 560;\n\n    RunEncoder(std::move(features));\n\n    std::vector<float> encoder_out_cpu(num_frames * encoder_dim_);\n    aclError ret = aclrtMemcpy(\n        encoder_out_cpu.data(), num_frames * encoder_dim_ * sizeof(float),\n        *encoder_out_ptr_, num_frames * encoder_dim_ * sizeof(float),\n        ACL_MEMCPY_DEVICE_TO_HOST);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    RunPredictor(num_frames);\n\n    std::vector<float> alphas_cpu(num_frames);\n\n    ret =\n        aclrtMemcpy(alphas_cpu.data(), num_frames * sizeof(float), *alphas_ptr_,\n                    num_frames * sizeof(float), ACL_MEMCPY_DEVICE_TO_HOST);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    std::vector<float> acoustic_embedding =\n        ComputeAcousticEmbedding(encoder_out_cpu, alphas_cpu, encoder_dim_);\n    if (acoustic_embedding.empty()) {\n      // no speech in the audio file\n      return {};\n    }\n\n    encoder_out_cpu.clear();\n    alphas_cpu.clear();\n\n    int32_t num_tokens = acoustic_embedding.size() / encoder_dim_;\n\n    RunDecoder(num_frames, std::move(acoustic_embedding));\n\n    std::vector<float> logits(num_tokens * vocab_size_);\n\n    ret = aclrtMemcpy(logits.data(), num_tokens * vocab_size_ * sizeof(float),\n                      *logits_ptr_, num_tokens * vocab_size_ * sizeof(float),\n             
         ACL_MEMCPY_DEVICE_TO_HOST);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    return logits;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n private:\n  void RunEncoder(std::vector<float> features) {\n    int32_t num_frames = features.size() / 560;\n\n    aclError ret = aclrtMemcpy(*features_ptr_, features.size() * sizeof(float),\n                               features.data(), features.size() * sizeof(float),\n                               ACL_MEMCPY_HOST_TO_DEVICE);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    AclMdlDataset input_dataset;\n    AclDataBuffer features_buf(*features_ptr_, features.size() * sizeof(float));\n    input_dataset.AddBuffer(features_buf);\n\n    // dynamic shape input\n    // https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1alpha003/appdevg/acldevg/aclcppdevg_000044.html\n\n    std::array<int64_t, 3> features_shape = {1, num_frames, 560};\n    AclTensorDesc features_desc(ACL_FLOAT, features_shape.size(),\n                                features_shape.data(), ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(features_desc, 0);\n\n    AclMdlDataset output_dataset;\n\n    AclDataBuffer encoder_out(*encoder_out_ptr_,\n                              num_frames * encoder_dim_ * sizeof(float));\n    output_dataset.AddBuffer(encoder_out);\n\n    ret = aclmdlExecute(*encoder_model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute for encoder\");\n  }\n\n  void RunPredictor(int32_t num_frames) {\n    AclMdlDataset input_dataset;\n    AclDataBuffer encoder_out_buf(*encoder_out_ptr_,\n                                  num_frames * encoder_dim_ * sizeof(float));\n    input_dataset.AddBuffer(encoder_out_buf);\n\n    std::array<int64_t, 3> encoder_out_shape = {1, num_frames, encoder_dim_};\n    AclTensorDesc encoder_out_desc(ACL_FLOAT, encoder_out_shape.size(),\n                                   
encoder_out_shape.data(), ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(encoder_out_desc, 0);\n\n    AclMdlDataset output_dataset;\n    AclDataBuffer alphas_buf(*alphas_ptr_, num_frames * sizeof(float));\n    output_dataset.AddBuffer(alphas_buf);\n\n    aclError ret =\n        aclmdlExecute(*predictor_model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute for predictor\");\n  }\n\n  void RunDecoder(int32_t num_frames, std::vector<float> acoustic_embedding) {\n    aclError ret = aclrtMemcpy(\n        *acoustic_embedding_ptr_, acoustic_embedding.size() * sizeof(float),\n        acoustic_embedding.data(), acoustic_embedding.size() * sizeof(float),\n        ACL_MEMCPY_HOST_TO_DEVICE);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    int32_t num_tokens = acoustic_embedding.size() / encoder_dim_;\n\n    AclMdlDataset input_dataset;\n    AclDataBuffer encoder_out_buf(*encoder_out_ptr_,\n                                  num_frames * encoder_dim_ * sizeof(float));\n    input_dataset.AddBuffer(encoder_out_buf);\n\n    std::array<int64_t, 3> encoder_out_shape = {1, num_frames, encoder_dim_};\n    AclTensorDesc encoder_out_desc(ACL_FLOAT, encoder_out_shape.size(),\n                                   encoder_out_shape.data(), ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(encoder_out_desc, 0);\n\n    AclDataBuffer acoustic_embedding_buf(\n        *acoustic_embedding_ptr_, num_tokens * encoder_dim_ * sizeof(float));\n    input_dataset.AddBuffer(acoustic_embedding_buf);\n\n    std::array<int64_t, 3> acoustic_embedding_shape = {1, num_tokens,\n                                                       encoder_dim_};\n    AclTensorDesc acoustic_embedding_desc(\n        ACL_FLOAT, acoustic_embedding_shape.size(),\n        acoustic_embedding_shape.data(), ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(acoustic_embedding_desc, 1);\n\n    AclMdlDataset output_dataset;\n    AclDataBuffer 
logits_buf(*logits_ptr_,\n                             num_tokens * vocab_size_ * sizeof(float));\n    output_dataset.AddBuffer(logits_buf);\n\n    ret = aclmdlExecute(*decoder_model_, input_dataset, output_dataset);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute for decoder\");\n  }\n\n  void InitEncoder(const std::string &filename) {\n    encoder_model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = encoder_model_->GetInfo();\n\n      SHERPA_ONNX_LOGE(\"----encoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitPredictor(const std::string &filename) {\n    predictor_model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = predictor_model_->GetInfo();\n\n      SHERPA_ONNX_LOGE(\"----predictor----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitDecoder(const std::string &filename) {\n    decoder_model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = decoder_model_->GetInfo();\n\n      SHERPA_ONNX_LOGE(\"----decoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitEncoder(void *data, size_t size) {\n    encoder_model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = encoder_model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"----encoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitPredictor(void *data, size_t size) {\n    predictor_model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = predictor_model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"----predictor----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitDecoder(void *data, size_t size) {\n    decoder_model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = decoder_model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"----decoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void PreInit() {\n    int32_t device_id = 0;\n    aclError ret = aclrtSetDevice(device_id);\n    
SHERPA_ONNX_ASCEND_CHECK(\n        ret, \"Failed to call aclrtSetDevice with device id: %d\", device_id);\n\n    context_ = std::make_unique<AclContext>(device_id);\n\n    ret = aclrtSetCurrentContext(*context_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtSetCurrentContext\");\n  }\n\n  void PostInit() {\n    encoder_dim_ = encoder_model_->GetOutputShapes()[0].back();\n    vocab_size_ = decoder_model_->GetOutputShapes()[0].back();\n\n    Preallocate();\n  }\n\n  void Preallocate() {\n    // max 30 seconds\n    max_num_frames_ = (30 * 100 - 7) / 6 + 1;\n\n    features_ptr_ = std::make_unique<AclDevicePtr>(max_num_frames_ * feat_dim_ *\n                                                   sizeof(float));\n\n    encoder_out_ptr_ = std::make_unique<AclDevicePtr>(\n        max_num_frames_ * encoder_dim_ * sizeof(float));\n\n    alphas_ptr_ =\n        std::make_unique<AclDevicePtr>(max_num_frames_ * sizeof(float));\n\n    acoustic_embedding_ptr_ = std::make_unique<AclDevicePtr>(\n        max_num_frames_ * encoder_dim_ * sizeof(float));\n\n    logits_ptr_ = std::make_unique<AclDevicePtr>(max_num_frames_ * vocab_size_ *\n                                                 sizeof(float));\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = 7;\n    int32_t lfr_window_shift = 6;\n    int32_t in_feat_dim = 80;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > max_num_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, max_num_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. 
Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = max_num_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(out_num_frames * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n private:\n  std::mutex mutex_;\n  Acl acl_;\n\n  std::unique_ptr<AclContext> context_;\n\n  OfflineModelConfig config_;\n\n  std::unique_ptr<AclModel> encoder_model_;\n  std::unique_ptr<AclModel> predictor_model_;\n  std::unique_ptr<AclModel> decoder_model_;\n\n  int32_t encoder_dim_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t max_num_frames_ = 0;\n  int32_t feat_dim_ = 560;\n\n  std::unique_ptr<AclDevicePtr> features_ptr_;\n  std::unique_ptr<AclDevicePtr> encoder_out_ptr_;\n  std::unique_ptr<AclDevicePtr> alphas_ptr_;\n  std::unique_ptr<AclDevicePtr> acoustic_embedding_ptr_;\n  std::unique_ptr<AclDevicePtr> logits_ptr_;\n};\n\nOfflineParaformerModelAscend::OfflineParaformerModelAscend(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineParaformerModelAscend::OfflineParaformerModelAscend(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineParaformerModelAscend::~OfflineParaformerModelAscend() = default;\n\nstd::vector<float> OfflineParaformerModelAscend::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineParaformerModelAscend::VocabSize() const {\n  return impl_->VocabSize();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineParaformerModelAscend::OfflineParaformerModelAscend(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if 
__OHOS__\ntemplate OfflineParaformerModelAscend::OfflineParaformerModelAscend(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.h",
    "content": "// sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ASCEND_OFFLINE_PARAFORMER_MODEL_ASCEND_H_\n#define SHERPA_ONNX_CSRC_ASCEND_OFFLINE_PARAFORMER_MODEL_ASCEND_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelAscend {\n public:\n  ~OfflineParaformerModelAscend();\n\n  explicit OfflineParaformerModelAscend(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineParaformerModelAscend(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features) const;\n\n  int32_t VocabSize() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_OFFLINE_PARAFORMER_MODEL_ASCEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-recognizer-zipformer-ctc-ascend-impl.h",
    "content": "// sherpa-onnx/csrc/ascend/offline-recognizer-zipformer-ctc-ascend-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ASCEND_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_ASCEND_IMPL_H_\n#define SHERPA_ONNX_CSRC_ASCEND_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_ASCEND_IMPL_H_\n\n#include <ios>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ../offline-recognizer-ctc-impl.h\nOfflineRecognitionResult Convert(const OfflineCtcDecoderResult &src,\n                                 const SymbolTable &sym_table,\n                                 int32_t frame_shift_ms,\n                                 int32_t subsampling_factor);\n\nclass OfflineRecognizerZipformerCtcAscendImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerZipformerCtcAscendImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineZipformerCtcModelAscend>(\n            config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerZipformerCtcAscendImpl(Manager *mgr,\n                                          const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineZipformerCtcModelAscend>(\n            mgr, config.model_config)) {\n    
Init();\n  }\n\n  void Init() {\n    if (config_.decoding_method == \"greedy_search\") {\n      if (!symbol_table_.Contains(\"<blk>\") &&\n          !symbol_table_.Contains(\"<eps>\") &&\n          !symbol_table_.Contains(\"<blank>\") &&\n          config_.model_config.omnilingual.model.empty()) {\n        // for omnilingual asr, its blank id is 0\n        SHERPA_ONNX_LOGE(\n            \"We expect that tokens.txt contains \"\n            \"the symbol <blk> or <eps> or <blank> and its ID.\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      int32_t blank_id = 0;\n      if (symbol_table_.Contains(\"<blk>\")) {\n        blank_id = symbol_table_[\"<blk>\"];\n      } else if (symbol_table_.Contains(\"<eps>\")) {\n        // for tdnn models of the yesno recipe from icefall\n        blank_id = symbol_table_[\"<eps>\"];\n      } else if (symbol_table_.Contains(\"<blank>\")) {\n        // for Wenet CTC models\n        blank_id = symbol_table_[\"<blank>\"];\n      }\n\n      decoder_ = std::make_unique<OfflineCtcGreedySearchDecoderRknn>(blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  // Decode a single stream.\n  // Some models do not support batch size > 1, e.g., WeNet CTC models.\n  void DecodeStream(OfflineStream *s) const {\n    std::vector<float> f = s->GetFrames();\n\n    int32_t vocab_size = model_->VocabSize();\n\n    std::vector<float> log_probs = model_->Run(std::move(f));\n    int32_t num_out_frames = log_probs.size() / vocab_size;\n\n    auto result =\n        decoder_->Decode(log_probs.data(), num_out_frames, vocab_size);\n\n    int32_t frame_shift_ms = 10;\n\n    auto r = Convert(result, symbol_table_, frame_shift_ms,\n                     model_->SubsamplingFactor());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineZipformerCtcModelAscend> model_;\n  std::unique_ptr<OfflineCtcGreedySearchDecoderRknn> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_ASCEND_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.cc",
    "content": "// sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// References:\n// https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1alpha003/API/appdevgapi/aclcppdevg_03_0298.html\n#include \"sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/ascend/macros.h\"\n#include \"sherpa-onnx/csrc/ascend/utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAscend::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n    InitModel(config_.sense_voice.model);\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n    {\n      auto buf = ReadFile(mgr, config_.sense_voice.model);\n      InitModel(buf.data(), buf.size());\n    }\n    PostInit();\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) {\n    // TODO(fangjun): Support multi clients\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    int32_t num_frames = features.size() / 560;\n\n    aclError ret =\n        aclrtMemcpy(*x_ptr_, features.size() * sizeof(float), features.data(),\n                    features.size() * sizeof(float), ACL_MEMCPY_HOST_TO_DEVICE);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call 
aclrtMemcpy\");\n\n    std::array<int32_t, 4> prompt_array{language, 1, 2, text_norm};\n    ret = aclrtMemcpy(*prompt_ptr_, prompt_ptr_->Size(), prompt_array.data(),\n                      prompt_ptr_->Size(), ACL_MEMCPY_HOST_TO_DEVICE);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    AclMdlDataset input_dataset;\n    AclDataBuffer x_buf(*x_ptr_, features.size() * sizeof(float));\n    input_dataset.AddBuffer(x_buf);\n\n    AclDataBuffer prompt_buf(*prompt_ptr_, prompt_ptr_->Size());\n    input_dataset.AddBuffer(prompt_buf);\n\n    // dynamic shape input\n    // https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1alpha003/appdevg/acldevg/aclcppdevg_000044.html\n\n    std::array<int64_t, 3> x_shape = {1, num_frames, 560};\n    AclTensorDesc x_desc(ACL_FLOAT, x_shape.size(), x_shape.data(),\n                         ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(x_desc, 0);\n\n    std::array<int64_t, 1> prompt_shape = {4};\n    AclTensorDesc prompt_desc(ACL_INT32, prompt_shape.size(),\n                              prompt_shape.data(), ACL_FORMAT_ND);\n    input_dataset.SetTensorDesc(prompt_desc, 1);\n\n    AclMdlDataset output_dataset;\n\n    AclDataBuffer logits_buf(*logits_ptr_,\n                             num_frames * vocab_size_ * sizeof(float));\n    output_dataset.AddBuffer(logits_buf);\n\n    ret = aclmdlExecute(*model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute\");\n\n    std::vector<float> logits(num_frames * vocab_size_);\n    ret = aclrtMemcpy(logits.data(), num_frames * vocab_size_ * sizeof(float),\n                      *logits_ptr_, num_frames * vocab_size_ * sizeof(float),\n                      ACL_MEMCPY_DEVICE_TO_HOST);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    return logits;\n  }\n\n private:\n  void InitModel(const std::string &filename) {\n    model_ = std::make_unique<AclModel>(filename);\n    if 
(config_.debug) {\n      auto s = model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n  }\n\n  void InitModel(void *data, size_t size) {\n    model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n  }\n\n  void PreInit() {\n    int32_t device_id = 0;\n    aclError ret = aclrtSetDevice(device_id);\n    SHERPA_ONNX_ASCEND_CHECK(\n        ret, \"Failed to call aclrtSetDevice with device id: %d\", device_id);\n\n    context_ = std::make_unique<AclContext>(device_id);\n\n    ret = aclrtSetCurrentContext(*context_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtSetCurrentContext\");\n  }\n\n  void PostInit() {\n    vocab_size_ = model_->GetOutputShapes()[0].back();\n\n    Preallocate();\n  }\n\n  void Preallocate() {\n    // max 30 seconds\n    max_num_frames_ = (30 * 100 - 7) / 6 + 1;\n    x_ptr_ = std::make_unique<AclDevicePtr>(max_num_frames_ * feat_dim_ *\n                                            sizeof(float));\n\n    prompt_ptr_ = std::make_unique<AclDevicePtr>(4 * sizeof(int32_t));\n\n    logits_ptr_ = std::make_unique<AclDevicePtr>((max_num_frames_ + 4) *\n                                                 vocab_size_ * sizeof(float));\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = meta_data_.window_size;\n    int32_t lfr_window_shift = meta_data_.window_shift;\n    int32_t in_feat_dim = 80;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > max_num_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. 
Truncate it to %d frames.\",\n          out_num_frames, max_num_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = max_num_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(out_num_frames * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n private:\n  std::mutex mutex_;\n  Acl acl_;\n\n  std::unique_ptr<AclContext> context_;\n\n  OfflineModelConfig config_;\n  OfflineSenseVoiceModelMetaData meta_data_;\n\n  std::unique_ptr<AclModel> model_;\n  int32_t vocab_size_ = 0;\n  int32_t max_num_frames_ = 0;\n  int32_t feat_dim_ = 560;\n\n  std::unique_ptr<AclDevicePtr> x_ptr_;\n  std::unique_ptr<AclDevicePtr> prompt_ptr_;\n  std::unique_ptr<AclDevicePtr> logits_ptr_;\n};\n\nOfflineSenseVoiceModelAscend::OfflineSenseVoiceModelAscend(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModelAscend::OfflineSenseVoiceModelAscend(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineSenseVoiceModelAscend::~OfflineSenseVoiceModelAscend() = default;\n\nstd::vector<float> OfflineSenseVoiceModelAscend::Run(\n    std::vector<float> features, int32_t language, int32_t text_norm) const {\n  return impl_->Run(std::move(features), language, text_norm);\n}\n\nconst OfflineSenseVoiceModelMetaData &\nOfflineSenseVoiceModelAscend::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModelAscend::OfflineSenseVoiceModelAscend(\n    
AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModelAscend::OfflineSenseVoiceModelAscend(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.h",
    "content": "// sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ASCEND_OFFLINE_SENSE_VOICE_MODEL_ASCEND_H_\n#define SHERPA_ONNX_CSRC_ASCEND_OFFLINE_SENSE_VOICE_MODEL_ASCEND_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAscend {\n public:\n  ~OfflineSenseVoiceModelAscend();\n\n  explicit OfflineSenseVoiceModelAscend(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModelAscend(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @param language\n   * @param text_norm\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_OFFLINE_SENSE_VOICE_MODEL_ASCEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.cc",
    "content": "// sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/ascend/macros.h\"\n#include \"sherpa-onnx/csrc/ascend/utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\n// masked positions: 1\n// unmasked positions: 0\nstatic void UpdateCausalMask(int32_t offset, int32_t capacity, int32_t *p) {\n  std::fill(p, p + offset, 0);\n  std::fill(p + offset, p + capacity, 1);\n}\n\nstatic WhisperModelType ParseWhisperModelFromString(const std::string &s) {\n  auto pos = s.find('-');\n  if (pos == std::string::npos) {\n    SHERPA_ONNX_LOGE(\"Unexpected model input '%s'\", s.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (s.substr(pos + 1) != \"mel\") {\n    SHERPA_ONNX_LOGE(\"Unexpected model input '%s'\", s.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (pos == 0) {\n    SHERPA_ONNX_LOGE(\"Empty model name in '%s'\", s.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  return ParseWhisperModelType(s.substr(0, pos));\n}\n\nclass OfflineWhisperModelAscend::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n\n    InitEncoder(config_.whisper.encoder);\n    InitDecoder(config_.whisper.decoder);\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n\n    {\n      auto buf = ReadFile(mgr, 
config_.whisper.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config_.whisper.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    PostInit();\n  }\n\n  OfflineWhisperDecoderResult Run(std::vector<float> features) {\n    // TODO(fangjun): Support multi clients\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    OfflineWhisperDecoderResult r;\n\n    if (features.empty()) {\n      return r;\n    }\n\n    int32_t num_frames = features.size() / feat_dim_;\n    if (num_frames > num_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          num_frames, num_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios or use VAD to cut your audio into \"\n          \"small chunks.\");\n\n      num_frames = num_frames_;\n    }\n\n    // assume at most 6 tokens per second\n    int32_t num_possible_tokens = num_frames / 100.0 * 6;\n    num_possible_tokens =\n        std::min<int32_t>(num_possible_tokens, n_text_ctx_ / 2);\n\n    features.resize(num_frames_ * feat_dim_, 0);\n\n    // (num_frames_, feat_dim_) -> (feat_dim_, num_frames_)\n    features = Transpose(features.data(), num_frames_, feat_dim_);\n\n    RunEncoder(std::move(features));\n\n    // Note(fangjun): No need to initialize the self kv cache to 0\n\n    std::vector<int32_t> sot_sequence(sot_sequence_);\n\n    if (IsMultilingual(model_type_)) {\n      if (config_.whisper.task == \"translate\") {\n        sot_sequence[2] = translate_;\n      } else if (config_.whisper.task != \"transcribe\") {\n        SHERPA_ONNX_LOGE(\n            \"Valid task values are: translate, transcribe. 
Given: '%s'\",\n            config_.whisper.task.c_str());\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (!config_.whisper.language.empty()) {\n        int32_t lang_id = GetWhisperLanguageTokenId(config_.whisper.language);\n        if (lang_id < 0) {\n          SHERPA_ONNX_LOGE(\"Unsupported language: '%s'\",\n                           config_.whisper.language.c_str());\n          SHERPA_ONNX_EXIT(-1);\n        }\n        r.lang = config_.whisper.language;\n\n        sot_sequence[1] = lang_id;\n      } else {\n        // detect language\n        if (config_.debug) {\n          SHERPA_ONNX_LOGE(\"Detecting language.\");\n        }\n        token_offset_mask_cpu_[0] = sot_sequence_[0];\n        token_offset_mask_cpu_[1] = 0;\n        UpdateCausalMask(0, n_text_ctx_, token_offset_mask_cpu_.data() + 2);\n\n        int32_t lang_id = DetectLanguage();\n        r.lang = GetWhisperLanguageCode(lang_id);\n\n        if (config_.debug) {\n          SHERPA_ONNX_LOGE(\"Detected Language: %s\", r.lang.c_str());\n        }\n\n        sot_sequence[1] = lang_id;\n      }\n    }\n\n    int32_t &token = token_offset_mask_cpu_[0];\n    int32_t &offset = token_offset_mask_cpu_[1];\n    offset = 0;\n\n    int32_t *p_mask = token_offset_mask_cpu_.data() + 2;\n    UpdateCausalMask(offset, n_text_ctx_, p_mask);\n\n    for (int32_t i = 0; i < sot_sequence.size(); ++i) {\n      token = sot_sequence[i];\n      token = RunDecoder();\n      p_mask[offset] = 0;\n\n      offset += 1;\n    }\n\n    if (token == eot_) {\n      return r;\n    }\n\n    r.tokens.reserve(num_possible_tokens);\n\n    while (offset < num_possible_tokens && token != eot_) {\n      r.tokens.push_back(token);\n      token = RunDecoder();\n\n      p_mask[offset] = 0;\n      offset += 1;\n    }\n\n    return r;\n  }\n\n  int32_t FeatureDim() const { return feat_dim_; }\n\n private:\n  void RunEncoder(std::vector<float> features) {\n    aclError ret = aclrtMemcpy(features_ptr_, features.size() * sizeof(float),\n         
                      features.data(), features.size() * sizeof(float),\n                               ACL_MEMCPY_HOST_TO_DEVICE);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    AclMdlDataset input_dataset;\n    input_dataset.AddBuffer(encoder_input_buffer_[0]);\n\n    AclMdlDataset output_dataset;\n\n    for (auto &p : encoder_output_buffer_) {\n      output_dataset.AddBuffer(p);\n    }\n\n    ret = aclmdlExecute(*encoder_model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute\");\n  }\n\n  int32_t RunDecoder() {\n    RunDecoderImpl();\n\n    UpdateSelfKvCache();\n\n    auto ret = aclrtMemcpy(\n        logits_cpu_.data(), logits_cpu_.size() * sizeof(float), logits_ptr_,\n        logits_cpu_.size() * sizeof(float), ACL_MEMCPY_DEVICE_TO_HOST);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    return MaxElementIndex(logits_cpu_.data(), logits_cpu_.size());\n  }\n\n  int32_t DetectLanguage() {\n    RunDecoderImpl();\n\n    // No need to update the Self KV cache\n\n    auto ret = aclrtMemcpy(\n        logits_cpu_.data(), logits_cpu_.size() * sizeof(float), logits_ptr_,\n        logits_cpu_.size() * sizeof(float), ACL_MEMCPY_DEVICE_TO_HOST);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    const auto &all_lang_ids = GetAllWhisperLanguageTokenIds();\n    int32_t lang_id = all_lang_ids[0];\n    float this_logit = logits_cpu_[lang_id];\n\n    for (int32_t i = 1; i != all_lang_ids.size(); ++i) {\n      int32_t id = all_lang_ids[i];\n      float p = logits_cpu_[id];\n\n      if (p > this_logit) {\n        this_logit = p;\n        lang_id = id;\n      }\n    }\n\n    return lang_id;\n  }\n\n  void RunDecoderImpl() {\n    aclError ret =\n        aclrtMemcpy(token_ptr_, token_offset_mask_cpu_.size() * sizeof(int32_t),\n                    token_offset_mask_cpu_.data(),\n                    token_offset_mask_cpu_.size() * sizeof(int32_t),\n      
              ACL_MEMCPY_HOST_TO_DEVICE);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    AclMdlDataset input_dataset;\n\n    for (auto &p : decoder_input_buffer_) {\n      input_dataset.AddBuffer(p);\n    }\n\n    AclMdlDataset output_dataset;\n\n    for (auto &p : decoder_output_buffer_) {\n      output_dataset.AddBuffer(p);\n    }\n\n    ret = aclmdlExecute(*decoder_model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute\");\n  }\n\n  void UpdateSelfKvCache() {\n    int32_t offset = token_offset_mask_cpu_[1];\n    for (int32_t i = 0; i < n_text_layer_ * 2; ++i) {\n      const float *src = delta_kv_ptr_[i];\n      float *dst = self_kv_ptr_[i] + offset * n_text_state_;\n\n      auto ret = aclrtMemcpy(dst, n_text_state_ * sizeof(float), src,\n                             n_text_state_ * sizeof(float),\n                             ACL_MEMCPY_DEVICE_TO_DEVICE);\n      SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n    }\n  }\n\n  void PreInit() {\n    int32_t device_id = 0;\n    aclError ret = aclrtSetDevice(device_id);\n    SHERPA_ONNX_ASCEND_CHECK(\n        ret, \"Failed to call aclrtSetDevice with device id: %d\", device_id);\n\n    context_ = std::make_unique<AclContext>(device_id);\n\n    ret = aclrtSetCurrentContext(*context_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtSetCurrentContext\");\n  }\n\n  void PostInit() {\n    PostInitEncoder();\n    PostInitDecoder();\n    Preallocate();\n    InitSotSequence();\n\n    InitEncoderBuffer();\n    InitDecoderBuffer();\n  }\n\n  void InitEncoderBuffer() {\n    AclDataBuffer features_buf(features_ptr_,\n                               feat_dim_ * num_frames_ * sizeof(float));\n    encoder_input_buffer_.clear();\n    encoder_input_buffer_.push_back(std::move(features_buf));\n\n    encoder_output_buffer_.reserve(cross_kv_ptr_.size());\n    for (auto p : cross_kv_ptr_) {\n      AclDataBuffer tmp_buffer(p,\n        
                       num_out_frames_ * n_text_state_ * sizeof(float));\n      encoder_output_buffer_.push_back(std::move(tmp_buffer));\n    }\n  }\n\n  void InitDecoderBuffer() {\n    decoder_input_buffer_.reserve(1 + 2 * n_text_layer_ + 2 * n_text_layer_ +\n                                  1 + 1);\n    // token, self_kv, cross_kv, offset, mask\n\n    AclDataBuffer token_buf(token_ptr_, sizeof(int32_t));\n    decoder_input_buffer_.push_back(std::move(token_buf));\n\n    for (auto &p : self_kv_ptr_) {\n      AclDataBuffer tmp_buffer(p, n_text_ctx_ * n_text_state_ * sizeof(float));\n      decoder_input_buffer_.push_back(std::move(tmp_buffer));\n    }\n\n    for (auto &p : cross_kv_ptr_) {\n      AclDataBuffer tmp_buffer(p,\n                               num_out_frames_ * n_text_state_ * sizeof(float));\n      decoder_input_buffer_.push_back(std::move(tmp_buffer));\n    }\n\n    AclDataBuffer offset_buf(offset_ptr_, sizeof(int32_t));\n    decoder_input_buffer_.push_back(std::move(offset_buf));\n\n    AclDataBuffer mask_buf(mask_ptr_, n_text_ctx_ * sizeof(int32_t));\n    decoder_input_buffer_.push_back(std::move(mask_buf));\n\n    decoder_output_buffer_.reserve(1 + 2 * n_text_layer_);\n    AclDataBuffer logits_buf(logits_ptr_, vocab_size_ * sizeof(float));\n    decoder_output_buffer_.push_back(std::move(logits_buf));\n\n    for (auto &p : delta_kv_ptr_) {\n      AclDataBuffer tmp_buffer(p, n_text_state_ * sizeof(float));\n      decoder_output_buffer_.push_back(std::move(tmp_buffer));\n    }\n  }\n\n  void InitSotSequence() {\n    switch (model_type_) {\n      case WhisperModelType::TinyEn:\n        // fallthrough\n      case WhisperModelType::BaseEn:\n        // fallthrough\n      case WhisperModelType::SmallEn:\n        // fallthrough\n      case WhisperModelType::MediumEn:\n        // fallthrough\n        // <|startoftranscript|><|notimestamps|>\n        sot_sequence_ = {50257, 50362};\n        eot_ = 50256;\n        break;\n      case WhisperModelType::Tiny:\n   
   case WhisperModelType::Base:\n        // fallthrough\n      case WhisperModelType::Small:\n        // fallthrough\n      case WhisperModelType::Medium:\n        // fallthrough\n      case WhisperModelType::Large:\n        // <|startoftranscript|><|en|><|transcribe|><|notimestamps|>\n        sot_sequence_ = {50258, 50259, 50359, 50363};\n        eot_ = 50257;\n        translate_ = 50358;\n        break;\n      default:\n        SHERPA_ONNX_LOGE(\"Unsupported model type: '%s'\",\n                         ToString(model_type_).c_str());\n        SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"sot_sequence: \";\n      for (auto i : sot_sequence_) {\n        os << i << \" \";\n      }\n      os << \"\\n\";\n      os << \"eot: \" << eot_ << \"\\n\";\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    }\n  }\n\n  void Preallocate() {\n    // Allocate a single big block.\n    int32_t total = 0;\n\n    // features: (1, feat_dim_, num_frames_)\n    total += num_frames_ * feat_dim_ * sizeof(float);\n    // token: (1,)\n    total += sizeof(int32_t);\n    // offset: (1,)\n    total += sizeof(int32_t);\n\n    // mask: (1, n_text_ctx_)\n    total += n_text_ctx_ * sizeof(int32_t);\n\n    // logits: (1, 1, vocab_size_)\n    total += vocab_size_ * sizeof(float);\n\n    // cross_kv: n_text_layer_ * 2 * (num_out_frames_, n_text_state_)\n\n    total +=\n        n_text_layer_ * 2 * num_out_frames_ * n_text_state_ * sizeof(float);\n\n    // self_kv: n_text_layer_ * 2 * (n_text_ctx_, n_text_state_)\n    total += n_text_layer_ * 2 * n_text_ctx_ * n_text_state_ * sizeof(float);\n\n    // delta_kv: n_text_layer_ * 2 * (1, 1, n_text_state_)\n    total += n_text_layer_ * 2 * n_text_state_ * sizeof(float);\n\n    ptr_ = std::make_unique<AclDevicePtr>(total);\n    float *start = ptr_->Get<float>();\n    int32_t *start_int32 = ptr_->Get<int32_t>();\n    int32_t offset = 0;\n\n    // (1, feat_dim_, num_frames_)\n    features_ptr_ = start 
+ offset;\n    offset += feat_dim_ * num_frames_;  // in float or in int32_t, not in bytes\n\n    // make sure token, offset, and mask are contiguous in device memory\n\n    // (1,)\n    token_ptr_ = start_int32 + offset;\n    offset += 1;\n\n    // (1,)\n    offset_ptr_ = start_int32 + offset;\n    offset += 1;\n\n    // (1, n_text_ctx_)\n    mask_ptr_ = start_int32 + offset;\n    offset += n_text_ctx_;\n\n    // (1, 1, vocab_size_)\n    logits_ptr_ = start + offset;\n    offset += vocab_size_;\n\n    // (1, num_out_frames_, n_text_state_)\n    cross_kv_ptr_.reserve(n_text_layer_ * 2);\n    for (int32_t i = 0; i < n_text_layer_ * 2; ++i) {\n      auto p = start + offset;\n      offset += num_out_frames_ * n_text_state_;\n      cross_kv_ptr_.push_back(std::move(p));\n    }\n\n    // (1, n_text_ctx_, n_text_state_)\n    self_kv_ptr_.reserve(n_text_layer_ * 2);\n    for (int32_t i = 0; i < n_text_layer_ * 2; ++i) {\n      auto p = start + offset;\n      offset += n_text_ctx_ * n_text_state_;\n      self_kv_ptr_.push_back(std::move(p));\n    }\n\n    // (1, 1, n_text_state_)\n    delta_kv_ptr_.reserve(n_text_layer_ * 2);\n    for (int32_t i = 0; i < n_text_layer_ * 2; ++i) {\n      auto p = start + offset;\n      offset += n_text_state_;\n      delta_kv_ptr_.push_back(std::move(p));\n    }\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Allocated %d bytes, or %.3f MB\", total,\n                       total / 1024. / 1024.);\n    }\n  }\n\n  void PostInitEncoder() {\n    const std::vector<std::string> &names = encoder_model_->GetInputNames();\n    model_type_ = ParseWhisperModelFromString(names[0]);\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"model type: %s\", ToString(model_type_).c_str());\n    }\n\n    const std::vector<std::vector<int64_t>> &input_shapes =\n        encoder_model_->GetInputShapes();\n\n    const auto &mel_shape = input_shapes[0];\n    if (mel_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"It supports only batch size == 1. 
Given: %d\",\n                       static_cast<int32_t>(mel_shape[0]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    feat_dim_ = mel_shape[1];\n    num_frames_ = mel_shape[2];\n\n    const std::vector<std::vector<int64_t>> &output_shapes =\n        encoder_model_->GetOutputShapes();\n\n    n_text_layer_ = output_shapes.size() / 2;\n\n    num_out_frames_ = output_shapes[0][1];\n    n_text_state_ = output_shapes[0].back();\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"feat_dim_: %d\", feat_dim_);\n      SHERPA_ONNX_LOGE(\"num_frames_: %d\", num_frames_);\n      SHERPA_ONNX_LOGE(\"num_out_frames_: %d\", num_out_frames_);\n      SHERPA_ONNX_LOGE(\"n_text_layer_: %d\", n_text_layer_);\n      SHERPA_ONNX_LOGE(\"n_text_state_: %d\", n_text_state_);\n    }\n  }\n\n  void PostInitDecoder() {\n    const std::vector<std::vector<int64_t>> &input_shapes =\n        decoder_model_->GetInputShapes();\n    // tokens, self_kv, cross_kv, offset, mask\n    int32_t expected_num_inputs = 1 + 2 * n_text_layer_ + 2 * n_text_layer_ + 2;\n    if (input_shapes.size() != expected_num_inputs) {\n      SHERPA_ONNX_LOGE(\"Expect %d inputs. Actual: %d\", expected_num_inputs,\n                       static_cast<int32_t>(input_shapes.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    const auto &s = input_shapes[1];\n    if (s[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only batch size 1. Given: %d\",\n                       static_cast<int32_t>(s[0]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    n_text_ctx_ = s[1];\n    token_offset_mask_cpu_.resize(1 + 1 + n_text_ctx_);\n\n    if (s[2] != n_text_state_) {\n      SHERPA_ONNX_LOGE(\"Expect n_text_state_ %d. 
Given: %d\", n_text_state_,\n                       static_cast<int32_t>(s[2]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"n_text_ctx_: %d\", n_text_ctx_);\n    }\n\n    const std::vector<std::vector<int64_t>> &output_shapes =\n        decoder_model_->GetOutputShapes();\n\n    vocab_size_ = output_shapes[0].back();\n    logits_cpu_.resize(vocab_size_);\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n    }\n  }\n\n  void InitEncoder(const std::string &filename) {\n    encoder_model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = encoder_model_->GetInfo();\n\n      SHERPA_ONNX_LOGE(\"----encoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitEncoder(void *data, size_t size) {\n    encoder_model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = encoder_model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"----encoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitDecoder(const std::string &filename) {\n    decoder_model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = decoder_model_->GetInfo();\n\n      SHERPA_ONNX_LOGE(\"----decoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n  void InitDecoder(void *data, size_t size) {\n    decoder_model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = decoder_model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"----decoder----\\n%s\\n\", s.c_str());\n    }\n  }\n\n private:\n  std::mutex mutex_;\n  Acl acl_;\n\n  std::unique_ptr<AclContext> context_;\n\n  std::unique_ptr<AclModel> encoder_model_;\n  std::unique_ptr<AclModel> decoder_model_;\n\n  OfflineModelConfig config_;\n\n  // tiny, tiny.en, base.en, base, etc\n  WhisperModelType model_type_;\n  int32_t feat_dim_ = 0;\n  int32_t num_frames_ = 0;\n  int32_t num_out_frames_ = 0;\n  int32_t n_text_layer_ = 0;\n  int32_t n_text_ctx_ = 0;\n  int32_t n_text_state_ = 
0;\n  int32_t vocab_size_ = 0;\n\n  std::unique_ptr<AclDevicePtr> ptr_;\n\n  // All of the following raw pointers will point to some already allocated\n  // device memory. No need to free them.\n  float *features_ptr_ = nullptr;\n  int32_t *token_ptr_ = nullptr;\n  int32_t *offset_ptr_ = nullptr;\n  int32_t *mask_ptr_ = nullptr;\n  float *logits_ptr_ = nullptr;\n\n  std::vector<float *> cross_kv_ptr_;\n  std::vector<float *> self_kv_ptr_;\n  std::vector<float *> delta_kv_ptr_;\n\n  std::vector<int32_t> token_offset_mask_cpu_;\n  std::vector<float> logits_cpu_;\n\n  std::vector<int32_t> sot_sequence_;\n  int32_t eot_ = 0;\n  int32_t translate_ = 0;\n\n  std::vector<AclDataBuffer> encoder_input_buffer_;\n  std::vector<AclDataBuffer> encoder_output_buffer_;\n\n  std::vector<AclDataBuffer> decoder_input_buffer_;\n  std::vector<AclDataBuffer> decoder_output_buffer_;\n};\n\nOfflineWhisperModelAscend::OfflineWhisperModelAscend(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineWhisperModelAscend::OfflineWhisperModelAscend(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineWhisperModelAscend::~OfflineWhisperModelAscend() = default;\n\nOfflineWhisperDecoderResult OfflineWhisperModelAscend::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineWhisperModelAscend::FeatureDim() const {\n  return impl_->FeatureDim();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineWhisperModelAscend::OfflineWhisperModelAscend(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineWhisperModelAscend::OfflineWhisperModelAscend(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.h",
    "content": "// sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ASCEND_OFFLINE_WHISPER_MODEL_ASCEND_H_\n#define SHERPA_ONNX_CSRC_ASCEND_OFFLINE_WHISPER_MODEL_ASCEND_H_\n\n#include <cstdint>\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineWhisperModelAscend {\n public:\n  ~OfflineWhisperModelAscend();\n\n  explicit OfflineWhisperModelAscend(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineWhisperModelAscend(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (1, num_frames, feat_dim)\n   */\n  OfflineWhisperDecoderResult Run(std::vector<float> features) const;\n\n  int32_t FeatureDim() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_OFFLINE_WHISPER_MODEL_ASCEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.cc",
    "content": "// sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n// References:\n// https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/83RC1alpha003/API/appdevgapi/aclcppdevg_03_0298.html\n#include \"sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>  // NOLINT\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/ascend/macros.h\"\n#include \"sherpa-onnx/csrc/ascend/utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerCtcModelAscend::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n    InitModel(config_.zipformer_ctc.model);\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    PreInit();\n    {\n      auto buf = ReadFile(mgr, config_.zipformer_ctc.model);\n      InitModel(buf.data(), buf.size());\n    }\n    PostInit();\n  }\n\n  std::vector<float> Run(std::vector<float> features) {\n    // TODO(fangjun): Support multi clients\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    int32_t num_frames = features.size() / feat_dim_;\n\n    if (num_frames != max_num_frames_) {\n      if (num_frames > max_num_frames_) {\n        SHERPA_ONNX_LOGE(\n            \"Number of input frames %d is too large. Truncate it to %d frames.\",\n            num_frames, max_num_frames_);\n\n        SHERPA_ONNX_LOGE(\n            \"Recognition result may be truncated/incomplete. 
Please select a \"\n            \"model accepting longer audios.\");\n      }\n\n      features.resize(max_num_frames_ * feat_dim_, 0);\n\n      num_frames = max_num_frames_;\n    }\n\n    aclError ret =\n        aclrtMemcpy(*x_ptr_, features.size() * sizeof(float), features.data(),\n                    features.size() * sizeof(float), ACL_MEMCPY_HOST_TO_DEVICE);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    AclMdlDataset input_dataset;\n    AclDataBuffer x_buf(*x_ptr_, features.size() * sizeof(float));\n    input_dataset.AddBuffer(x_buf);\n\n    AclMdlDataset output_dataset;\n\n    AclDataBuffer logits_buf(*log_probs_ptr_,\n                             num_output_frames_ * vocab_size_ * sizeof(float));\n    output_dataset.AddBuffer(logits_buf);\n\n    ret = aclmdlExecute(*model_, input_dataset, output_dataset);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlExecute\");\n\n    std::vector<float> log_probs(num_output_frames_ * vocab_size_);\n    ret = aclrtMemcpy(\n        log_probs.data(), num_output_frames_ * vocab_size_ * sizeof(float),\n        *log_probs_ptr_, num_output_frames_ * vocab_size_ * sizeof(float),\n        ACL_MEMCPY_DEVICE_TO_HOST);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMemcpy\");\n\n    return log_probs;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n private:\n  void InitModel(const std::string &filename) {\n    model_ = std::make_unique<AclModel>(filename);\n    if (config_.debug) {\n      auto s = model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n  }\n\n  void InitModel(void *data, size_t size) {\n    model_ = std::make_unique<AclModel>(data, size);\n    if (config_.debug) {\n      auto s = model_->GetInfo();\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n  }\n\n  void PreInit() {\n    int32_t device_id = 0;\n    aclError ret = aclrtSetDevice(device_id);\n    
SHERPA_ONNX_ASCEND_CHECK(\n        ret, \"Failed to call aclrtSetDevice with device id: %d\", device_id);\n\n    context_ = std::make_unique<AclContext>(device_id);\n\n    ret = aclrtSetCurrentContext(*context_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtSetCurrentContext\");\n  }\n\n  void PostInit() {\n    auto in_shape = model_->GetInputShapes()[0];\n\n    max_num_frames_ = in_shape[1];\n    feat_dim_ = in_shape[2];\n\n    auto out_shape = model_->GetOutputShapes()[0];\n\n    num_output_frames_ = out_shape[1];\n    vocab_size_ = out_shape[2];\n\n    subsampling_factor_ = max_num_frames_ / out_shape[1];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"max_num_frames: %d\", max_num_frames_);\n      SHERPA_ONNX_LOGE(\"feat_dim: %d\", feat_dim_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n      SHERPA_ONNX_LOGE(\"subsampling_factor: %d\", subsampling_factor_);\n    }\n\n    Preallocate();\n  }\n\n  void Preallocate() {\n    x_ptr_ = std::make_unique<AclDevicePtr>(max_num_frames_ * feat_dim_ *\n                                            sizeof(float));\n\n    log_probs_ptr_ = std::make_unique<AclDevicePtr>(\n        num_output_frames_ * vocab_size_ * sizeof(float));\n  }\n\n private:\n  std::mutex mutex_;\n  Acl acl_;\n\n  std::unique_ptr<AclContext> context_;\n\n  OfflineModelConfig config_;\n\n  std::unique_ptr<AclModel> model_;\n  int32_t vocab_size_ = 0;\n  int32_t max_num_frames_ = 0;\n  int32_t num_output_frames_ = 0;\n  int32_t feat_dim_ = 0;\n  int32_t subsampling_factor_ = 0;\n\n  std::unique_ptr<AclDevicePtr> x_ptr_;\n  std::unique_ptr<AclDevicePtr> log_probs_ptr_;\n};\n\nOfflineZipformerCtcModelAscend::OfflineZipformerCtcModelAscend(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineZipformerCtcModelAscend::OfflineZipformerCtcModelAscend(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) 
{}\n\nOfflineZipformerCtcModelAscend::~OfflineZipformerCtcModelAscend() = default;\n\nstd::vector<float> OfflineZipformerCtcModelAscend::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineZipformerCtcModelAscend::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nint32_t OfflineZipformerCtcModelAscend::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineZipformerCtcModelAscend::OfflineZipformerCtcModelAscend(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineZipformerCtcModelAscend::OfflineZipformerCtcModelAscend(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.h",
    "content": "// sherpa-onnx/csrc/ascend/offline-zipformer-ctc-model-ascend.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ASCEND_OFFLINE_ZIPFORMER_CTC_MODEL_ASCEND_H_\n#define SHERPA_ONNX_CSRC_ASCEND_OFFLINE_ZIPFORMER_CTC_MODEL_ASCEND_H_\n\n#include <cstdint>\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerCtcModelAscend {\n public:\n  ~OfflineZipformerCtcModelAscend();\n\n  explicit OfflineZipformerCtcModelAscend(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineZipformerCtcModelAscend(Manager *mgr,\n                                 const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   * @returns A tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features) const;\n\n  int32_t VocabSize() const;\n  int32_t SubsamplingFactor() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_OFFLINE_ZIPFORMER_CTC_MODEL_ASCEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/utils.cc",
    "content": "// sherpa-onnx/csrc/ascend/utils.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/ascend/utils.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/ascend/macros.h\"\n\nnamespace sherpa_onnx {\n\nstatic const char *AclDataTypeToString(aclDataType data_type) {\n  switch (data_type) {\n    case ACL_DT_UNDEFINED:\n      return \"ACL_DT_UNDEFINED\";\n    case ACL_FLOAT:\n      return \"ACL_FLOAT\";\n    case ACL_FLOAT16:\n      return \"ACL_FLOAT16\";\n    case ACL_INT8:\n      return \"ACL_INT8\";\n    case ACL_INT32:\n      return \"ACL_INT32\";\n    case ACL_UINT8:\n      return \"ACL_UINT8\";\n    case ACL_INT16:\n      return \"ACL_INT16\";\n    case ACL_UINT16:\n      return \"ACL_UINT16\";\n    case ACL_UINT32:\n      return \"ACL_UINT32\";\n    case ACL_INT64:\n      return \"ACL_INT64\";\n    case ACL_UINT64:\n      return \"ACL_UINT64\";\n    case ACL_DOUBLE:\n      return \"ACL_DOUBLE\";\n    case ACL_BOOL:\n      return \"ACL_BOOL\";\n    case ACL_STRING:\n      return \"ACL_STRING\";\n    case ACL_COMPLEX64:\n      return \"ACL_COMPLEX64\";\n    case ACL_COMPLEX128:\n      return \"ACL_COMPLEX128\";\n    case ACL_BF16:\n      return \"ACL_BF16\";\n#if defined(ACL_INT4)\n    case ACL_INT4:\n      return \"ACL_INT4\";\n#endif\n    case ACL_UINT1:\n      return \"ACL_UINT1\";\n    case ACL_COMPLEX32:\n      return \"ACL_COMPLEX32\";\n    default:\n      return \"unknown\";\n  }\n}\n\nstatic const char *AclFormatToString(aclFormat format) {\n  switch (format) {\n    case ACL_FORMAT_UNDEFINED:\n      return \"ACL_FORMAT_UNDEFINED\";\n    case ACL_FORMAT_NCHW:\n      return \"ACL_FORMAT_NCHW\";\n    case ACL_FORMAT_NHWC:\n      return \"ACL_FORMAT_NHWC\";\n    case ACL_FORMAT_ND:\n      return \"ACL_FORMAT_ND\";\n    case ACL_FORMAT_NC1HWC0:\n      return \"ACL_FORMAT_NC1HWC0\";\n    case ACL_FORMAT_FRACTAL_Z:\n      return 
\"ACL_FORMAT_FRACTAL_Z\";\n    case ACL_FORMAT_NC1HWC0_C04:\n      return \"ACL_FORMAT_NC1HWC0_C04\";\n    case ACL_FORMAT_HWCN:\n      return \"ACL_FORMAT_HWCN\";\n    case ACL_FORMAT_NDHWC:\n      return \"ACL_FORMAT_NDHWC\";\n    case ACL_FORMAT_FRACTAL_NZ:\n      return \"ACL_FORMAT_FRACTAL_NZ\";\n    case ACL_FORMAT_NCDHW:\n      return \"ACL_FORMAT_NCDHW\";\n    case ACL_FORMAT_NDC1HWC0:\n      return \"ACL_FORMAT_NDC1HWC0\";\n    case ACL_FRACTAL_Z_3D:\n      return \"ACL_FRACTAL_Z_3D\";\n    case ACL_FORMAT_NC:\n      return \"ACL_FORMAT_NC\";\n    case ACL_FORMAT_NCL:\n      return \"ACL_FORMAT_NCL\";\n    default:\n      return \"unknown\";\n  }\n}\n\nAcl::Acl() {\n  aclError ret = aclInit(nullptr);\n  SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclInit\");\n  initialized_ = true;\n}\n\nAcl::~Acl() {\n  if (initialized_) {\n    aclError ret = aclFinalize();\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclFinalize\");\n  }\n}\n\nAclContext::AclContext(int32_t device_id) {\n  aclError ret = aclrtCreateContext(&context_, device_id);\n  SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtCreateContext\");\n}\n\nAclContext::~AclContext() {\n  if (context_) {\n    aclError ret = aclrtDestroyContext(context_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtDestroyContext\");\n  }\n}\n\naclrtContext AclContext::Get() const { return context_; }\n\nAclDevicePtr::AclDevicePtr(\n    size_t size, aclrtMemMallocPolicy policy /*= ACL_MEM_MALLOC_HUGE_FIRST*/) {\n  if (size > 0) {\n    aclError ret = aclrtMalloc(&p_, size, policy);\n\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtMalloc with size: %zu\",\n                             size);\n  }\n  size_ = size;\n}\n\nAclDevicePtr::~AclDevicePtr() {\n  if (p_) {\n    aclError ret = aclrtFree(p_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclrtFree\");\n  }\n}\n\nAclModelDesc::AclModelDesc(uint32_t model_id) {\n  p_ = aclmdlCreateDesc();\n  if (!p_) {\n    SHERPA_ONNX_LOGE(\"Failed 
to call aclmdlCreateDesc\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  aclError ret = aclmdlGetDesc(p_, model_id);\n  SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlGetDesc\");\n}\n\nAclModelDesc::~AclModelDesc() {\n  if (p_) {\n    aclError ret = aclmdlDestroyDesc(p_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlDestroyDesc\");\n  }\n}\n\nAclModel::AclModel(const std::string &model_path) {\n  aclError ret = aclmdlLoadFromFile(model_path.c_str(), &model_id_);\n  SHERPA_ONNX_ASCEND_CHECK(ret,\n                           \"Failed to call aclmdlLoadFromFile from file '%s'\",\n                           model_path.c_str());\n\n  Init();\n}\n\nAclModel::AclModel(const void *model, size_t model_size) {\n  aclError ret = aclmdlLoadFromMem(model, model_size, &model_id_);\n  SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlLoadFromMem\");\n\n  Init();\n}\n\nAclModel::~AclModel() {\n  if (model_id_ != 0) {\n    aclError ret = aclmdlUnload(model_id_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlUnload\");\n  }\n}\n\nvoid AclModel::Init() {\n  desc_ = std::make_unique<AclModelDesc>(model_id_);\n\n  InitInputNames();\n  InitInputShapes();\n\n  InitOutputNames();\n  InitOutputShapes();\n}\n\nvoid AclModel::InitInputNames() {\n  size_t num_inputs = aclmdlGetNumInputs(desc_->Get());\n  input_names_.resize(num_inputs);\n\n  for (int32_t i = 0; i < num_inputs; ++i) {\n    const char *name = aclmdlGetInputNameByIndex(desc_->Get(), i);\n    input_names_[i] = name;\n  }\n}\n\nvoid AclModel::InitInputShapes() {\n  size_t num_inputs = aclmdlGetNumInputs(desc_->Get());\n  input_shapes_.resize(num_inputs);\n\n  std::vector<int64_t> shape;\n  for (int32_t i = 0; i < num_inputs; ++i) {\n    aclmdlIODims dims;\n    aclError ret = aclmdlGetInputDims(desc_->Get(), i, &dims);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlGetInputDims\");\n\n    shape.resize(dims.dimCount);\n    for (int32_t k = 0; k < dims.dimCount; ++k) {\n      shape[k] = 
dims.dims[k];\n    }\n    input_shapes_[i] = std::move(shape);\n  }\n}\n\nvoid AclModel::InitOutputNames() {\n  size_t num_outputs = aclmdlGetNumOutputs(desc_->Get());\n  output_names_.resize(num_outputs);\n  for (int32_t i = 0; i < num_outputs; ++i) {\n    const char *name = aclmdlGetOutputNameByIndex(desc_->Get(), i);\n    output_names_[i] = name;\n  }\n}\n\nvoid AclModel::InitOutputShapes() {\n  size_t num_outputs = aclmdlGetNumOutputs(desc_->Get());\n  output_shapes_.resize(num_outputs);\n\n  std::vector<int64_t> shape;\n  for (int32_t i = 0; i < num_outputs; ++i) {\n    aclmdlIODims dims;\n    aclError ret = aclmdlGetOutputDims(desc_->Get(), i, &dims);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlGetOutputDims\");\n\n    shape.resize(dims.dimCount);\n    for (int32_t k = 0; k < dims.dimCount; ++k) {\n      shape[k] = dims.dims[k];\n    }\n    output_shapes_[i] = std::move(shape);\n  }\n}\n\nstd::string AclModel::GetInfo() const {\n  size_t num_inputs = aclmdlGetNumInputs(desc_->Get());\n  size_t num_outputs = aclmdlGetNumOutputs(desc_->Get());\n\n  std::ostringstream os;\n  os << \"Model id: \" << model_id_ << \"\\n\";\n  os << \"Num inputs: \" << num_inputs << \"\\n\";\n  os << \"Num outputs: \" << num_outputs << \"\\n\";\n\n  for (int32_t i = 0; i < num_inputs; ++i) {\n    os << \"---input \" << i << \"---\\n\";\n\n    size_t size_in_bytes = aclmdlGetInputSizeByIndex(desc_->Get(), i);\n\n    os << \" size in bytes: \" << size_in_bytes << \"\\n\";\n    os << \" size in MB:    \" << size_in_bytes / 1024. 
/ 1024 << \"\\n\";\n\n    const char *name = aclmdlGetInputNameByIndex(desc_->Get(), i);\n    os << \" name: \" << name << \"\\n\";\n\n    aclFormat format = aclmdlGetInputFormat(desc_->Get(), i);\n\n    os << \" format: \" << AclFormatToString(format) << \"\\n\";\n    aclDataType type = aclmdlGetInputDataType(desc_->Get(), i);\n    os << \" data type: \" << AclDataTypeToString(type) << \"\\n\";\n\n    aclmdlIODims dims;\n    aclError ret = aclmdlGetInputDims(desc_->Get(), i, &dims);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlGetInputDims\");\n    os << \" dim: \" << dims.dimCount << \"\\n\";\n    for (size_t d = 0; d < dims.dimCount; ++d) {\n      os << \"  \" << d << \" -> \" << dims.name << \", \" << dims.dims[d] << \"\\n\";\n    }\n  }\n\n  for (int32_t i = 0; i < num_outputs; ++i) {\n    os << \"---output \" << i << \"---\\n\";\n\n    size_t size_out_bytes = aclmdlGetOutputSizeByIndex(desc_->Get(), i);\n\n    os << \" size out bytes: \" << size_out_bytes << \"\\n\";\n    // Use floating-point division so that sizes below 1 MB are not reported as 0\n    os << \" size out MB:    \" << size_out_bytes / 1024. / 1024 << \"\\n\";\n\n    const char *name = aclmdlGetOutputNameByIndex(desc_->Get(), i);\n    os << \" name: \" << name << \"\\n\";\n\n    aclFormat format = aclmdlGetOutputFormat(desc_->Get(), i);\n\n    os << \" format: \" << AclFormatToString(format) << \"\\n\";\n    aclDataType type = aclmdlGetOutputDataType(desc_->Get(), i);\n    os << \" data type: \" << AclDataTypeToString(type) << \"\\n\";\n\n    aclmdlIODims dims;\n    aclError ret = aclmdlGetOutputDims(desc_->Get(), i, &dims);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlGetOutputDims\");\n    os << \" dim: \" << dims.dimCount << \"\\n\";\n    for (size_t d = 0; d < dims.dimCount; ++d) {\n      os << \"  \" << d << \" -> \" << dims.name << \", \" << dims.dims[d] << \"\\n\";\n    }\n  }\n\n  return os.str();\n}\n\nAclMdlDataset::AclMdlDataset() {\n  p_ = aclmdlCreateDataset();\n  if (!p_) {\n    SHERPA_ONNX_LOGE(\"Failed to call aclmdlCreateDataset\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\nAclMdlDataset::~AclMdlDataset() {\n  if (p_) {\n    aclError ret = aclmdlDestroyDataset(p_);\n    
SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlDestroyDataset\");\n  }\n}\n\nvoid AclMdlDataset::AddBuffer(aclDataBuffer *buffer) const {\n  aclError ret = aclmdlAddDatasetBuffer(p_, buffer);\n  SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclmdlAddDatasetBuffer\");\n}\n\nvoid AclMdlDataset::SetTensorDesc(aclTensorDesc *tensor_desc,\n                                  size_t index) const {\n  aclError ret = aclmdlSetDatasetTensorDesc(p_, tensor_desc, index);\n\n  SHERPA_ONNX_ASCEND_CHECK(\n      ret, \"Failed to call aclmdlSetDatasetTensorDesc for input %zu\", index);\n}\n\nAclDataBuffer::AclDataBuffer(void *data, size_t size) {\n  p_ = aclCreateDataBuffer(data, size);\n\n  if (!p_) {\n    SHERPA_ONNX_LOGE(\"Failed to call aclCreateDataBuffer\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\nAclDataBuffer::~AclDataBuffer() { Release(); }\n\nvoid AclDataBuffer::Release() {\n  if (p_) {\n    aclError ret = aclDestroyDataBuffer(p_);\n    SHERPA_ONNX_ASCEND_CHECK(ret, \"Failed to call aclDestroyDataBuffer\");\n  }\n  p_ = nullptr;\n}\n\nAclTensorDesc::AclTensorDesc(aclDataType data_type, int num_dims,\n                             const int64_t *dims, aclFormat format) {\n  p_ = aclCreateTensorDesc(data_type, num_dims, dims, format);\n  if (!p_) {\n    SHERPA_ONNX_LOGE(\"Failed to call aclCreateTensorDesc\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\nAclTensorDesc::~AclTensorDesc() {\n  if (p_) {\n    aclDestroyTensorDesc(p_);\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ascend/utils.h",
    "content": "// sherpa-onnx/csrc/ascend/utils.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ASCEND_UTILS_H_\n#define SHERPA_ONNX_CSRC_ASCEND_UTILS_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"acl/acl.h\"\n\nnamespace sherpa_onnx {\n\nclass Acl {\n public:\n  Acl();\n  ~Acl();\n\n  Acl(const Acl &) = delete;\n  Acl &operator=(const Acl &) = delete;\n\n  Acl(Acl &&) = delete;\n  Acl &operator=(Acl &&) = delete;\n\n private:\n  bool initialized_ = false;\n};\n\nclass AclContext {\n public:\n  explicit AclContext(int32_t device_id);\n\n  ~AclContext();\n\n  AclContext(const AclContext &) = delete;\n  AclContext &operator=(const AclContext &) = delete;\n\n  AclContext(AclContext &&) = delete;\n  AclContext &operator=(AclContext &&) = delete;\n\n  aclrtContext Get() const;\n  operator aclrtContext() { return context_; }\n\n private:\n  aclrtContext context_ = nullptr;\n};\n\nclass AclDevicePtr {\n public:\n  explicit AclDevicePtr(\n      size_t size, aclrtMemMallocPolicy policy = ACL_MEM_MALLOC_HUGE_FIRST);\n\n  ~AclDevicePtr();\n\n  AclDevicePtr(const AclDevicePtr &) = delete;\n  AclDevicePtr &operator=(const AclDevicePtr &) = delete;\n\n  AclDevicePtr(AclDevicePtr &&) = delete;\n  AclDevicePtr &operator=(AclDevicePtr &&) = delete;\n\n  void *Get() const { return p_; }\n\n  template <typename T>\n  T *Get() const {\n    return reinterpret_cast<T *>(p_);\n  }\n\n  operator void *() { return p_; }\n\n  size_t Size() const { return size_; }\n\n private:\n  void *p_ = nullptr;\n  size_t size_ = 0;\n};\n\nclass AclModelDesc {\n public:\n  explicit AclModelDesc(uint32_t model_id);\n\n  ~AclModelDesc();\n\n  AclModelDesc(const AclModelDesc &) = delete;\n  AclModelDesc &operator=(const AclModelDesc &) = delete;\n\n  AclModelDesc(AclModelDesc &&) = delete;\n  AclModelDesc &operator=(AclModelDesc &&) = delete;\n\n  aclmdlDesc *Get() const { return p_; }\n  operator aclmdlDesc *() const { return p_; }\n\n  size_t 
Size() const { return size_; }\n\n private:\n  aclmdlDesc *p_ = nullptr;\n  size_t size_ = 0;\n};\n\nclass AclModel {\n public:\n  explicit AclModel(const std::string &model_path);\n  AclModel(const void *model, size_t model_size);\n  ~AclModel();\n\n  uint32_t Get() const { return model_id_; }\n  operator uint32_t() const { return model_id_; }\n\n  AclModel(const AclModel &) = delete;\n  AclModel &operator=(const AclModel &) = delete;\n\n  AclModel(AclModel &&) = delete;\n  AclModel &operator=(AclModel &&) = delete;\n\n  std::string GetInfo() const;\n\n  const std::vector<std::string> &GetInputNames() const { return input_names_; }\n\n  const std::vector<std::vector<int64_t>> &GetInputShapes() const {\n    return input_shapes_;\n  }\n\n  const std::vector<std::string> &GetOutputNames() const {\n    return output_names_;\n  }\n\n  const std::vector<std::vector<int64_t>> &GetOutputShapes() const {\n    return output_shapes_;\n  }\n\n private:\n  void Init();\n  void InitInputNames();\n  void InitInputShapes();\n\n  void InitOutputNames();\n  void InitOutputShapes();\n\n private:\n  uint32_t model_id_ = 0;\n  std::unique_ptr<AclModelDesc> desc_;\n\n  std::vector<std::string> input_names_;\n  std::vector<std::vector<int64_t>> input_shapes_;\n\n  std::vector<std::string> output_names_;\n  std::vector<std::vector<int64_t>> output_shapes_;\n};\n\nclass AclMdlDataset {\n public:\n  AclMdlDataset();\n  ~AclMdlDataset();\n\n  AclMdlDataset(const AclMdlDataset &) = delete;\n  AclMdlDataset &operator=(const AclMdlDataset &) = delete;\n\n  AclMdlDataset(AclMdlDataset &&) = delete;\n  AclMdlDataset &operator=(AclMdlDataset &&) = delete;\n\n  void AddBuffer(aclDataBuffer *buffer) const;\n  void SetTensorDesc(aclTensorDesc *tensor_desc, size_t index) const;\n\n  aclmdlDataset *Get() const { return p_; }\n  operator aclmdlDataset *() const { return p_; }\n\n private:\n  aclmdlDataset *p_ = nullptr;\n};\n\nclass AclDataBuffer {\n public:\n  AclDataBuffer(void *data, size_t size);\n 
 ~AclDataBuffer();\n\n  AclDataBuffer(const AclDataBuffer &) = delete;\n  AclDataBuffer &operator=(const AclDataBuffer &) = delete;\n\n  AclDataBuffer(AclDataBuffer &&other) {\n    p_ = other.p_;\n    other.p_ = nullptr;\n  }\n  AclDataBuffer &operator=(AclDataBuffer &&other) {\n    if (this == &other) {\n      return *this;\n    }\n\n    Release();\n\n    p_ = other.p_;\n    other.p_ = nullptr;\n    return *this;\n  }\n\n  void Release();\n\n  aclDataBuffer *Get() const { return p_; }\n  operator aclDataBuffer *() const { return p_; }\n\n private:\n  aclDataBuffer *p_ = nullptr;\n};\n\nclass AclTensorDesc {\n public:\n  AclTensorDesc(aclDataType data_type, int num_dims, const int64_t *dims,\n                aclFormat format);\n  ~AclTensorDesc();\n\n  AclTensorDesc(const AclTensorDesc &) = delete;\n  AclTensorDesc &operator=(const AclTensorDesc &) = delete;\n\n  AclTensorDesc(AclTensorDesc &&) = delete;\n  AclTensorDesc &operator=(AclTensorDesc &&) = delete;\n\n  aclTensorDesc *Get() const { return p_; }\n  operator aclTensorDesc *() const { return p_; }\n\n private:\n  aclTensorDesc *p_ = nullptr;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ASCEND_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-ced-impl.h",
    "content": "// sherpa-onnx/csrc/audio-tagging-ced-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_CED_IMPL_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_CED_IMPL_H_\n\n#include <assert.h>\n\n#include <array>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging-impl.h\"\n#include \"sherpa-onnx/csrc/audio-tagging-label-file.h\"\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-ced-model.h\"\n\nnamespace sherpa_onnx {\n\nclass AudioTaggingCEDImpl : public AudioTaggingImpl {\n public:\n  explicit AudioTaggingCEDImpl(const AudioTaggingConfig &config)\n      : config_(config), model_(config.model), labels_(config.labels) {\n    if (model_.NumEventClasses() != labels_.NumEventClasses()) {\n      SHERPA_ONNX_LOGE(\"number of classes: %d (model) != %d (label file)\",\n                       model_.NumEventClasses(), labels_.NumEventClasses());\n      exit(-1);\n    }\n  }\n\n#if __ANDROID_API__ >= 9\n  explicit AudioTaggingCEDImpl(AAssetManager *mgr,\n                               const AudioTaggingConfig &config)\n      : config_(config),\n        model_(mgr, config.model),\n        labels_(mgr, config.labels) {\n    if (model_.NumEventClasses() != labels_.NumEventClasses()) {\n      SHERPA_ONNX_LOGE(\"number of classes: %d (model) != %d (label file)\",\n                       model_.NumEventClasses(), labels_.NumEventClasses());\n      exit(-1);\n    }\n  }\n#endif\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(CEDTag{});\n  }\n\n  std::vector<AudioEvent> Compute(OfflineStream *s,\n                                  int32_t top_k = -1) const override {\n    if (top_k < 0) {\n      top_k = config_.top_k;\n  
  }\n\n    int32_t num_event_classes = model_.NumEventClasses();\n\n    if (top_k > num_event_classes) {\n      top_k = num_event_classes;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    // WARNING(fangjun): It is fixed to 64 for CED models\n    int32_t feat_dim = 64;\n    std::vector<float> f = s->GetFrames();\n\n    int32_t num_frames = f.size() / feat_dim;\n    assert(feat_dim * num_frames == static_cast<int32_t>(f.size()));\n\n    std::array<int64_t, 3> shape = {1, num_frames, feat_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    Ort::Value probs = model_.Forward(std::move(x));\n\n    const float *p = probs.GetTensorData<float>();\n\n    std::vector<int32_t> top_k_indexes = TopkIndex(p, num_event_classes, top_k);\n\n    std::vector<AudioEvent> ans(top_k);\n\n    int32_t i = 0;\n\n    for (int32_t index : top_k_indexes) {\n      ans[i].name = labels_.GetEventName(index);\n      ans[i].index = index;\n      ans[i].prob = p[index];\n      i += 1;\n    }\n\n    return ans;\n  }\n\n private:\n  AudioTaggingConfig config_;\n  OfflineCEDModel model_;\n  AudioTaggingLabels labels_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_CED_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-impl.cc",
    "content": "// sherpa-onnx/csrc/audio-tagging-impl.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/audio-tagging-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging-ced-impl.h\"\n#include \"sherpa-onnx/csrc/audio-tagging-zipformer-impl.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<AudioTaggingImpl> AudioTaggingImpl::Create(\n    const AudioTaggingConfig &config) {\n  if (!config.model.zipformer.model.empty()) {\n    return std::make_unique<AudioTaggingZipformerImpl>(config);\n  } else if (!config.model.ced.empty()) {\n    return std::make_unique<AudioTaggingCEDImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"Please specify an audio tagging model! Return a null pointer\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\nstd::unique_ptr<AudioTaggingImpl> AudioTaggingImpl::Create(\n    AAssetManager *mgr, const AudioTaggingConfig &config) {\n  if (!config.model.zipformer.model.empty()) {\n    return std::make_unique<AudioTaggingZipformerImpl>(mgr, config);\n  } else if (!config.model.ced.empty()) {\n    return std::make_unique<AudioTaggingCEDImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"Please specify an audio tagging model! Return a null pointer\");\n  return nullptr;\n}\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-impl.h",
    "content": "// sherpa-onnx/csrc/audio-tagging-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_IMPL_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_IMPL_H_\n\n#include <memory>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n\nnamespace sherpa_onnx {\n\nclass AudioTaggingImpl {\n public:\n  virtual ~AudioTaggingImpl() = default;\n\n  static std::unique_ptr<AudioTaggingImpl> Create(\n      const AudioTaggingConfig &config);\n\n#if __ANDROID_API__ >= 9\n  static std::unique_ptr<AudioTaggingImpl> Create(\n      AAssetManager *mgr, const AudioTaggingConfig &config);\n#endif\n\n  virtual std::unique_ptr<OfflineStream> CreateStream() const = 0;\n\n  virtual std::vector<AudioEvent> Compute(OfflineStream *s,\n                                          int32_t top_k = -1) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-label-file.cc",
    "content": "// sherpa-onnx/csrc/audio-tagging-label-file.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/audio-tagging-label-file.h\"\n\n#include <fstream>\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nAudioTaggingLabels::AudioTaggingLabels(const std::string &filename) {\n  std::ifstream is(filename);\n  Init(is);\n}\n\n#if __ANDROID_API__ >= 9\nAudioTaggingLabels::AudioTaggingLabels(AAssetManager *mgr,\n                                       const std::string &filename) {\n  auto buf = ReadFile(mgr, filename);\n  std::istringstream is(std::string(buf.data(), buf.size()));\n  Init(is);\n}\n#endif\n\n// Format of a label file\n/*\nindex,mid,display_name\n0,/m/09x0r,\"Speech\"\n1,/m/05zppz,\"Male speech, man speaking\"\n*/\nvoid AudioTaggingLabels::Init(std::istream &is) {\n  std::string line;\n  std::getline(is, line);  // skip the header\n\n  std::string index;\n  std::string tmp;\n  std::string name;\n\n  while (std::getline(is, line)) {\n    index.clear();\n    name.clear();\n    std::istringstream input2(line);\n\n    std::getline(input2, index, ',');\n    std::getline(input2, tmp, ',');\n    std::getline(input2, name);\n\n    std::size_t pos{};\n    int32_t i = std::stoi(index, &pos);\n    if (index.empty() || pos != index.size()) {\n      SHERPA_ONNX_LOGE(\"Invalid line: %s\", line.c_str());\n      exit(-1);\n    }\n\n    if (i != static_cast<int32_t>(names_.size())) {\n      SHERPA_ONNX_LOGE(\n          \"Index should be sorted and contiguous. 
Expected index: %d, given: \"\n          \"%d.\",\n          static_cast<int32_t>(names_.size()), i);\n    }\n    if (name.empty() || name.front() != '\"' || name.back() != '\"') {\n      SHERPA_ONNX_LOGE(\"Invalid line: %s\", line.c_str());\n      exit(-1);\n    }\n\n    names_.emplace_back(name.begin() + 1, name.end() - 1);\n  }\n}\n\nconst std::string &AudioTaggingLabels::GetEventName(int32_t index) const {\n  return names_.at(index);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-label-file.h",
    "content": "// sherpa-onnx/csrc/audio-tagging-label-file.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_LABEL_FILE_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_LABEL_FILE_H_\n\n#include <istream>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nclass AudioTaggingLabels {\n public:\n  explicit AudioTaggingLabels(const std::string &filename);\n#if __ANDROID_API__ >= 9\n  AudioTaggingLabels(AAssetManager *mgr, const std::string &filename);\n#endif\n\n  // Return the event name for the given index.\n  // The returned reference is valid as long as this object is alive\n  const std::string &GetEventName(int32_t index) const;\n  int32_t NumEventClasses() const { return names_.size(); }\n\n private:\n  void Init(std::istream &is);\n\n private:\n  std::vector<std::string> names_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_LABEL_FILE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-model-config.cc",
    "content": "// sherpa-onnx/csrc/audio-tagging-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/audio-tagging-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid AudioTaggingModelConfig::Register(ParseOptions *po) {\n  zipformer.Register(po);\n\n  po->Register(\"ced-model\", &ced,\n               \"Path to CED model. Only need to pass one of --zipformer-model \"\n               \"or --ced-model\");\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool AudioTaggingModelConfig::Validate() const {\n  if (!zipformer.model.empty() && !zipformer.Validate()) {\n    return false;\n  }\n\n  if (!ced.empty() && !FileExists(ced)) {\n    SHERPA_ONNX_LOGE(\"CED model file '%s' does not exist\", ced.c_str());\n    return false;\n  }\n\n  if (zipformer.model.empty() && ced.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide either --zipformer-model or --ced-model\");\n    return false;\n  }\n\n  return true;\n}\n\nstd::string AudioTaggingModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"AudioTaggingModelConfig(\";\n  os << \"zipformer=\" << zipformer.ToString() << \", \";\n  os << \"ced=\\\"\" << ced << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-model-config.h",
    "content": "// sherpa-onnx/csrc/audio-tagging-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct AudioTaggingModelConfig {\n  struct OfflineZipformerAudioTaggingModelConfig zipformer;\n  std::string ced;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  AudioTaggingModelConfig() = default;\n\n  AudioTaggingModelConfig(\n      const OfflineZipformerAudioTaggingModelConfig &zipformer,\n      const std::string &ced, int32_t num_threads, bool debug,\n      const std::string &provider)\n      : zipformer(zipformer),\n        ced(ced),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging-zipformer-impl.h",
    "content": "// sherpa-onnx/csrc/audio-tagging-zipformer-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_ZIPFORMER_IMPL_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_ZIPFORMER_IMPL_H_\n\n#include <assert.h>\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging-impl.h\"\n#include \"sherpa-onnx/csrc/audio-tagging-label-file.h\"\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.h\"\n\nnamespace sherpa_onnx {\n\nclass AudioTaggingZipformerImpl : public AudioTaggingImpl {\n public:\n  explicit AudioTaggingZipformerImpl(const AudioTaggingConfig &config)\n      : config_(config), model_(config.model), labels_(config.labels) {\n    if (model_.NumEventClasses() != labels_.NumEventClasses()) {\n      SHERPA_ONNX_LOGE(\"number of classes: %d (model) != %d (label file)\",\n                       model_.NumEventClasses(), labels_.NumEventClasses());\n      exit(-1);\n    }\n  }\n\n#if __ANDROID_API__ >= 9\n  explicit AudioTaggingZipformerImpl(AAssetManager *mgr,\n                                     const AudioTaggingConfig &config)\n      : config_(config),\n        model_(mgr, config.model),\n        labels_(mgr, config.labels) {\n    if (model_.NumEventClasses() != labels_.NumEventClasses()) {\n      SHERPA_ONNX_LOGE(\"number of classes: %d (model) != %d (label file)\",\n                       model_.NumEventClasses(), labels_.NumEventClasses());\n      exit(-1);\n    }\n  }\n#endif\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>();\n  }\n\n  std::vector<AudioEvent> Compute(OfflineStream *s,\n                                  int32_t top_k = -1) const override 
{\n    if (top_k < 0) {\n      top_k = config_.top_k;\n    }\n\n    int32_t num_event_classes = model_.NumEventClasses();\n\n    if (top_k > num_event_classes) {\n      top_k = num_event_classes;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    // WARNING(fangjun): It is fixed to 80 for all models from icefall\n    int32_t feat_dim = 80;\n    std::vector<float> f = s->GetFrames();\n\n    int32_t num_frames = f.size() / feat_dim;\n\n    assert(feat_dim * num_frames == static_cast<int32_t>(f.size()));\n\n    std::array<int64_t, 3> shape = {1, num_frames, feat_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    int64_t x_length_scalar = num_frames;\n    std::array<int64_t, 1> x_length_shape = {1};\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &x_length_scalar, 1,\n                                 x_length_shape.data(), x_length_shape.size());\n\n    Ort::Value probs = model_.Forward(std::move(x), std::move(x_length));\n\n    const float *p = probs.GetTensorData<float>();\n\n    std::vector<int32_t> top_k_indexes = TopkIndex(p, num_event_classes, top_k);\n\n    std::vector<AudioEvent> ans(top_k);\n\n    int32_t i = 0;\n\n    for (int32_t index : top_k_indexes) {\n      ans[i].name = labels_.GetEventName(index);\n      ans[i].index = index;\n      ans[i].prob = p[index];\n      i += 1;\n    }\n\n    return ans;\n  }\n\n private:\n  AudioTaggingConfig config_;\n  OfflineZipformerAudioTaggingModel model_;\n  AudioTaggingLabels labels_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_ZIPFORMER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging.cc",
    "content": "// sherpa-onnx/csrc/audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging-impl.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::string AudioEvent::ToString() const {\n  std::ostringstream os;\n  os << \"AudioEvent(\";\n  os << \"name=\\\"\" << name << \"\\\", \";\n  os << \"index=\" << index << \", \";\n  os << \"prob=\" << prob << \")\";\n  return os.str();\n}\n\nvoid AudioTaggingConfig::Register(ParseOptions *po) {\n  model.Register(po);\n  po->Register(\"labels\", &labels, \"Event label file\");\n  po->Register(\"top-k\", &top_k, \"Top k events to return in the result\");\n}\n\nbool AudioTaggingConfig::Validate() const {\n  if (!model.Validate()) {\n    return false;\n  }\n\n  if (top_k < 1) {\n    SHERPA_ONNX_LOGE(\"--top-k should be >= 1. 
Given: %d\", top_k);\n    return false;\n  }\n\n  if (labels.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --labels\");\n    return false;\n  }\n\n  if (!FileExists(labels)) {\n    SHERPA_ONNX_LOGE(\"--labels '%s' does not exist\", labels.c_str());\n    return false;\n  }\n\n  return true;\n}\nstd::string AudioTaggingConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"AudioTaggingConfig(\";\n  os << \"model=\" << model.ToString() << \", \";\n  os << \"labels=\\\"\" << labels << \"\\\", \";\n  os << \"top_k=\" << top_k << \")\";\n\n  return os.str();\n}\n\nAudioTagging::AudioTagging(const AudioTaggingConfig &config)\n    : impl_(AudioTaggingImpl::Create(config)) {}\n\n#if __ANDROID_API__ >= 9\nAudioTagging::AudioTagging(AAssetManager *mgr, const AudioTaggingConfig &config)\n    : impl_(AudioTaggingImpl::Create(mgr, config)) {}\n#endif\n\nAudioTagging::~AudioTagging() = default;\n\nstd::unique_ptr<OfflineStream> AudioTagging::CreateStream() const {\n  return impl_->CreateStream();\n}\n\nstd::vector<AudioEvent> AudioTagging::Compute(OfflineStream *s,\n                                              int32_t top_k /*= -1*/) const {\n  return impl_->Compute(s, top_k);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/audio-tagging.h",
    "content": "// sherpa-onnx/csrc/audio-tagging.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_AUDIO_TAGGING_H_\n#define SHERPA_ONNX_CSRC_AUDIO_TAGGING_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/audio-tagging-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct AudioTaggingConfig {\n  AudioTaggingModelConfig model;\n  std::string labels;\n\n  int32_t top_k = 5;\n\n  AudioTaggingConfig() = default;\n\n  AudioTaggingConfig(const AudioTaggingModelConfig &model,\n                     const std::string &labels, int32_t top_k)\n      : model(model), labels(labels), top_k(top_k) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nstruct AudioEvent {\n  std::string name;  // name of the event\n  int32_t index;     // index of the event in the label file\n  float prob;        // probability of the event\n\n  std::string ToString() const;\n};\n\nclass AudioTaggingImpl;\n\nclass AudioTagging {\n public:\n  explicit AudioTagging(const AudioTaggingConfig &config);\n\n#if __ANDROID_API__ >= 9\n  AudioTagging(AAssetManager *mgr, const AudioTaggingConfig &config);\n#endif\n\n  ~AudioTagging();\n\n  std::unique_ptr<OfflineStream> CreateStream() const;\n\n  // If top_k is -1, then config.top_k is used.\n  // Otherwise, config.top_k is ignored\n  //\n  // Return top_k AudioEvent. ans[0].prob is the largest of all returned events.\n  std::vector<AudioEvent> Compute(OfflineStream *s, int32_t top_k = -1) const;\n\n private:\n  std::unique_ptr<AudioTaggingImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AUDIO_TAGGING_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-guard.cc",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-guard.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-guard.h\"\n\n#include <cstdint>\n\n#include \"axcl.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nAxclEngineGuard::AxclEngineGuard(\n    axclrtEngineVNpuKind npuKind /*= AXCL_VNPU_DISABLE*/) {\n  axclError ret = axclrtEngineInit(npuKind);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"Failed to call axclrtEngineInit(). Return code is: %d\",\n                     static_cast<int32_t>(ret));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  initialized_ = true;\n}\n\nAxclEngineGuard::~AxclEngineGuard() {\n  if (initialized_) {\n    auto ret = axclrtEngineFinalize();\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineFinalize(). Return code is: %d\",\n          static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-guard.h",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-guard.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_GUARD_H_\n#define SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_GUARD_H_\n#include \"axcl.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass AxclEngineGuard {\n public:\n  explicit AxclEngineGuard(axclrtEngineVNpuKind npuKind = AXCL_VNPU_DISABLE);\n  ~AxclEngineGuard();\n\n  AxclEngineGuard(const AxclEngineGuard &) = delete;\n  AxclEngineGuard &operator=(const AxclEngineGuard &) = delete;\n  AxclEngineGuard(AxclEngineGuard &&) = delete;\n  AxclEngineGuard &operator=(AxclEngineGuard &&) = delete;\n\n private:\n  bool initialized_ = false;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_GUARD_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-io-guard.cc",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-io-guard.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-io-guard.h\"\n\n#include <cstdint>\n\n#include \"axcl.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nAxclEngineIOGuard::AxclEngineIOGuard(axclrtEngineIOInfo io_info) {\n  axclError ret = axclrtEngineCreateIO(io_info, &io_);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to call axclrtEngineCreateIO(). Return code is: %d\",\n        static_cast<int32_t>(ret));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  initialized_ = true;\n}\n\nAxclEngineIOGuard::~AxclEngineIOGuard() {\n  if (initialized_) {\n    auto ret = axclrtEngineDestroyIO(io_);\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineDestroyIO(). Return code is: %d\",\n          static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-io-guard.h",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-io-guard.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_GUARD_H_\n#define SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_GUARD_H_\n#include \"axcl.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass AxclEngineIOGuard {\n public:\n  explicit AxclEngineIOGuard(axclrtEngineIOInfo io_info);\n  ~AxclEngineIOGuard();\n\n  AxclEngineIOGuard(const AxclEngineIOGuard &) = delete;\n  AxclEngineIOGuard &operator=(const AxclEngineIOGuard &) = delete;\n  AxclEngineIOGuard(AxclEngineIOGuard &&) = delete;\n  AxclEngineIOGuard &operator=(AxclEngineIOGuard &&) = delete;\n\n  operator axclrtEngineIO() { return io_; }\n\n private:\n  bool initialized_ = false;\n  axclrtEngineIO io_ = nullptr;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_GUARD_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.cc",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.h\"\n\n#include <cstdint>\n\n#include \"axcl.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nAxclEngineIOInfoGuard::AxclEngineIOInfoGuard(uint64_t model_id) {\n  axclError ret = axclrtEngineGetIOInfo(model_id, &io_info_);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to call axclrtEngineGetIOInfo(). Return code is: %d\",\n        static_cast<int32_t>(ret));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  initialized_ = true;\n}\n\nAxclEngineIOInfoGuard::~AxclEngineIOInfoGuard() {\n  if (initialized_) {\n    auto ret = axclrtEngineDestroyIOInfo(io_info_);\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineDestroyIOInfo(). Return code is: %d\",\n          static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.h",
    "content": "// sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_INFO_GUARD_H_\n#define SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_INFO_GUARD_H_\n#include <cstdint>\n\n#include \"axcl.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass AxclEngineIOInfoGuard {\n public:\n  explicit AxclEngineIOInfoGuard(uint64_t model_id);\n  ~AxclEngineIOInfoGuard();\n\n  AxclEngineIOInfoGuard(const AxclEngineIOInfoGuard &) = delete;\n  AxclEngineIOInfoGuard &operator=(const AxclEngineIOInfoGuard &) = delete;\n  AxclEngineIOInfoGuard(AxclEngineIOInfoGuard &&) = delete;\n  AxclEngineIOInfoGuard &operator=(AxclEngineIOInfoGuard &&) = delete;\n\n  operator axclrtEngineIOInfo() { return io_info_; }\n\n private:\n  bool initialized_ = false;\n  axclrtEngineIOInfo io_info_ = nullptr;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_AXCL_ENGINE_IO_INFO_GUARD_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-manager.cc",
    "content": "// sherpa-onnx/csrc/axcl/axcl-manager.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/axcl-manager.h\"\n\n#include <cstdint>\n\n#include \"axcl.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::mutex AxclManager::mutex_;\n\nint32_t AxclManager::count_{0};\n\nAxclManager::AxclManager(const char *config /*= nullptr*/) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  if (count_ == 0) {\n    auto ret = axclInit(config);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call axclInit(). Return code: %d\",\n                       static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  ++count_;\n}\n\nAxclManager::~AxclManager() {\n  std::lock_guard<std::mutex> lock(mutex_);\n  if (--count_ == 0) {\n    auto ret = axclFinalize();\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call axclFinalize(). Return code: %d\",\n                       static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-manager.h",
    "content": "// sherpa-onnx/csrc/axcl/axcl-manager.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_AXCL_MANAGER_H_\n#define SHERPA_ONNX_CSRC_AXCL_AXCL_MANAGER_H_\n\n#include <cstdint>\n#include <mutex>\n\nnamespace sherpa_onnx {\n\nclass AxclManager {\n public:\n  explicit AxclManager(const char *config = nullptr);\n  ~AxclManager();\n\n  AxclManager(const AxclManager &) = delete;\n  AxclManager &operator=(const AxclManager &) = delete;\n\n  AxclManager(AxclManager &&) = delete;\n  AxclManager &operator=(AxclManager &&) = delete;\n\n private:\n  static std::mutex mutex_;\n  static int32_t count_;\n};\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_AXCL_AXCL_MANAGER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-model.cc",
    "content": "// sherpa-onnx/csrc/axcl/axcl-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/axcl-model.h\"\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"axcl.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-guard.h\"\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-io-guard.h\"\n#include \"sherpa-onnx/csrc/axcl/axcl-engine-io-info-guard.h\"\n#include \"sherpa-onnx/csrc/axcl/axcl-manager.h\"\n#include \"sherpa-onnx/csrc/axcl/utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\n/*\nInitialization step:\n\n1. AxclInit()\n2. set device\n3. init engine\n4. axclrtEngineLoadFromMem or axclrtEngineLoadFromFile\n5. axclrtEngineCreateContext\n */\n\nclass AxclModel::Impl {\n public:\n  Impl(const std::string &filename, int32_t device_id) {\n    if (!SetDevice(device_id)) {\n      return;\n    }\n\n    InitEngine();\n\n    axclError ret = axclrtEngineLoadFromFile(filename.c_str(), &model_id_);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineLoadFromFile() with file: %s. Return \"\n          \"code is: %d\",\n          filename.c_str(), static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    model_loaded_ = true;\n\n    PostInit();\n  }\n\n  Impl(const void *cpu_buf, size_t buf_len_in_bytes, int32_t device_id) {\n    if (!SetDevice(device_id)) {\n      return;\n    }\n\n    InitEngine();\n\n    {\n      AxclDevicePtr device_ptr(buf_len_in_bytes, AXCL_MEM_MALLOC_NORMAL_ONLY);\n      auto ret = axclrtMemcpy(device_ptr, cpu_buf, buf_len_in_bytes,\n                              AXCL_MEMCPY_HOST_TO_DEVICE);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\"Failed to call axclrtMemcpy(). 
Return code is: %d\",\n                         static_cast<int32_t>(ret));\n        return;\n      }\n\n      ret = axclrtEngineLoadFromMem(device_ptr, buf_len_in_bytes, &model_id_);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineLoadFromMem(). Return code is: %d\",\n            static_cast<int32_t>(ret));\n        return;\n      }\n    }\n\n    model_loaded_ = true;\n\n    PostInit();\n  }\n\n  ~Impl() {\n    if (model_loaded_) {\n      axclError ret = axclrtEngineUnload(model_id_);\n\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineUnload(). Return code is: %d\",\n            static_cast<int32_t>(ret));\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n  }\n\n  const std::vector<std::string> &InputTensorNames() const {\n    return input_tensor_names_;\n  }\n  const std::vector<std::string> &OutputTensorNames() const {\n    return output_tensor_names_;\n  }\n\n  std::vector<int32_t> TensorShape(const std::string &name) const {\n    for (size_t i = 0; i < input_tensor_names_.size(); ++i) {\n      if (input_tensor_names_[i] == name) {\n        return input_tensor_shapes_[i];\n      }\n    }\n\n    for (size_t i = 0; i < output_tensor_names_.size(); ++i) {\n      if (output_tensor_names_[i] == name) {\n        return output_tensor_shapes_[i];\n      }\n    }\n\n    SHERPA_ONNX_LOGE(\"Found no tensor with name: '%s'\", name.c_str());\n    return {};\n  }\n\n  int32_t TensorSizeInBytes(const std::string &name) const {\n    for (size_t i = 0; i < input_tensor_names_.size(); ++i) {\n      if (input_tensor_names_[i] == name) {\n        return input_tensors_[i].Size();\n      }\n    }\n\n    for (size_t i = 0; i < output_tensor_names_.size(); ++i) {\n      if (output_tensor_names_[i] == name) {\n        return output_tensors_[i].Size();\n      }\n    }\n\n    SHERPA_ONNX_LOGE(\"Found no tensor with name: '%s'\", name.c_str());\n    return 0;\n  }\n\n  bool HasTensor(const std::string 
&name) const {\n    for (size_t i = 0; i < input_tensor_names_.size(); ++i) {\n      if (input_tensor_names_[i] == name) {\n        return true;\n      }\n    }\n\n    for (size_t i = 0; i < output_tensor_names_.size(); ++i) {\n      if (output_tensor_names_[i] == name) {\n        return true;\n      }\n    }\n\n    return false;\n  }\n\n  template <typename T>\n  bool SetInputTensorData(const std::string &name, const T *p,\n                          int32_t n) const {\n    for (size_t i = 0; i < input_tensor_names_.size(); ++i) {\n      if (input_tensor_names_[i] == name) {\n        if (n * sizeof(T) != input_tensors_[i].Size()) {\n          SHERPA_ONNX_LOGE(\"Expected size: %zu, given: %zu\",\n                           input_tensors_[i].Size(), n * sizeof(T));\n          return false;\n        }\n\n        auto ret =\n            axclrtMemcpy(input_tensors_[i].Get(), p, input_tensors_[i].Size(),\n                         AXCL_MEMCPY_HOST_TO_DEVICE);\n        if (ret != 0) {\n          SHERPA_ONNX_LOGE(\n              \"Failed to call axclrtMemcpy(). tensor name: '%s', return code: \"\n              \"%d\",\n              name.c_str(), static_cast<int32_t>(ret));\n          return false;\n        }\n\n        return true;\n      }\n    }\n\n    SHERPA_ONNX_LOGE(\"Found no tensor with name: '%s'\", name.c_str());\n\n    return false;\n  }\n\n  std::vector<float> GetOutputTensorData(const std::string &name) const {\n    for (size_t i = 0; i < output_tensor_names_.size(); ++i) {\n      if (output_tensor_names_[i] == name) {\n        size_t bytes = output_tensors_[i].Size();\n        std::vector<float> out(bytes / sizeof(float));\n\n        auto ret = axclrtMemcpy(out.data(), output_tensors_[i].Get(), bytes,\n                                AXCL_MEMCPY_DEVICE_TO_HOST);\n        if (ret != 0) {\n          SHERPA_ONNX_LOGE(\n              \"Failed to call axclrtMemcpy(). 
tensor name: '%s', return code: \"\n              \"%d\",\n              name.c_str(), static_cast<int32_t>(ret));\n          return {};\n        }\n\n        return out;\n      }\n    }\n\n    SHERPA_ONNX_LOGE(\"Found no tensor with name: '%s'\", name.c_str());\n\n    return {};\n  }\n\n  bool Run() const {\n    uint32_t group = 0;\n    auto ret =\n        axclrtEngineExecute(model_id_, context_id_, group, *engine_io_guard_);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call axclrtEngineExecute(), return code: %d\",\n                       static_cast<int32_t>(ret));\n      return false;\n    }\n    return true;\n  }\n\n  bool IsInitialized() const { return model_loaded_; }\n\n private:\n  bool SetDevice(int32_t device_id) {\n    axclrtDeviceList lst;\n    auto ret = axclrtGetDeviceList(&lst);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtGetDeviceList(). Return code is: %d\",\n          static_cast<int32_t>(ret));\n      return false;\n    }\n\n    if (lst.num == 0) {\n      SHERPA_ONNX_LOGE(\"Found 0 device.\");\n      return false;\n    }\n\n    // device_id counts from 0\n    if (device_id < 0 || device_id >= lst.num) {\n      SHERPA_ONNX_LOGE(\"Invalid device_id: %d. Valid range: 0-%d\", device_id,\n                       lst.num - 1);\n      return false;\n    }\n\n    ret = axclrtSetDevice(lst.devices[device_id]);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call axclrtSetDevice(). 
Return code is: %d\",\n                       static_cast<int32_t>(ret));\n      return false;\n    }\n\n    return true;\n  }\n\n  void InitEngine() { engine_guard_ = std::make_unique<AxclEngineGuard>(); }\n\n  void PostInit() {\n    InitContext();\n\n    io_info_guard_ = std::make_unique<AxclEngineIOInfoGuard>(model_id_);\n\n    int32_t count = 0;\n    auto ret = axclrtEngineGetShapeGroupsCount(*io_info_guard_, &count);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineGetShapeGroupsCount(). Return code is: \"\n          \"%d\",\n          static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (count != 1) {\n      SHERPA_ONNX_LOGE(\"Only support 1 group at present. Given: %d\", count);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    engine_io_guard_ = std::make_unique<AxclEngineIOGuard>(*io_info_guard_);\n\n    InitInput();\n    InitOutput();\n  }\n\n  void InitContext() {\n    // Note(fangjun): No need to destroy context_id_\n    auto ret = axclrtEngineCreateContext(model_id_, &context_id_);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to call axclrtEngineCreateContext(). Return code is: %d\",\n          static_cast<int32_t>(ret));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void InitInput() {\n    uint32_t group = 0;\n\n    int32_t num_inputs = axclrtEngineGetNumInputs(*io_info_guard_);\n\n    input_tensor_names_.resize(num_inputs);\n    input_tensor_shapes_.reserve(num_inputs);\n\n    for (int32_t i = 0; i < num_inputs; ++i) {\n      size_t size_in_bytes =\n          axclrtEngineGetInputSizeByIndex(*io_info_guard_, group, i);\n      input_tensors_.emplace_back(size_in_bytes, AXCL_MEM_MALLOC_HUGE_FIRST);\n\n      axclrtEngineIODims dims;\n      auto ret = axclrtEngineGetInputDims(*io_info_guard_, group, i, &dims);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineGetInputDims(). 
Return code is: %d\",\n            static_cast<int32_t>(ret));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      input_tensor_shapes_.emplace_back(dims.dims, dims.dims + dims.dimCount);\n\n      input_tensor_names_[i] =\n          axclrtEngineGetInputNameByIndex(*io_info_guard_, i);\n\n      ret = axclrtEngineSetInputBufferByIndex(*engine_io_guard_, i,\n                                              input_tensors_[i], size_in_bytes);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineSetInputBufferByIndex(). Return code \"\n            \"is: %d\",\n            static_cast<int32_t>(ret));\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n  }\n\n  void InitOutput() {\n    uint32_t group = 0;\n\n    int32_t num_outputs = axclrtEngineGetNumOutputs(*io_info_guard_);\n\n    output_tensor_names_.resize(num_outputs);\n    output_tensor_shapes_.reserve(num_outputs);\n\n    for (int32_t i = 0; i < num_outputs; ++i) {\n      auto size_in_bytes =\n          axclrtEngineGetOutputSizeByIndex(*io_info_guard_, group, i);\n      output_tensors_.emplace_back(size_in_bytes, AXCL_MEM_MALLOC_HUGE_FIRST);\n\n      axclrtEngineIODims dims;\n      auto ret = axclrtEngineGetOutputDims(*io_info_guard_, group, i, &dims);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineGetOutputDims(). Return code is: %d\",\n            static_cast<int32_t>(ret));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      output_tensor_shapes_.emplace_back(dims.dims, dims.dims + dims.dimCount);\n      output_tensor_names_[i] =\n          axclrtEngineGetOutputNameByIndex(*io_info_guard_, i);\n\n      ret = axclrtEngineSetOutputBufferByIndex(\n          *engine_io_guard_, i, output_tensors_[i], size_in_bytes);\n      if (ret != 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to call axclrtEngineSetOutputBufferByIndex(). 
Return code \"\n            \"is: %d\",\n            static_cast<int32_t>(ret));\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n  }\n\n private:\n  AxclManager manager_;\n  std::unique_ptr<AxclEngineGuard> engine_guard_;\n  std::unique_ptr<AxclEngineIOGuard> engine_io_guard_;\n  std::unique_ptr<AxclEngineIOInfoGuard> io_info_guard_;\n\n  bool model_loaded_ = false;\n  uint64_t model_id_ = 0;\n  uint64_t context_id_ = 0;\n\n  std::vector<std::string> input_tensor_names_;\n  std::vector<std::string> output_tensor_names_;\n\n  std::vector<AxclDevicePtr> input_tensors_;\n  std::vector<AxclDevicePtr> output_tensors_;\n\n  std::vector<std::vector<int32_t>> input_tensor_shapes_;\n  std::vector<std::vector<int32_t>> output_tensor_shapes_;\n};\n\nAxclModel::AxclModel(const std::string &filename, int32_t device_id /*= 0*/)\n    : impl_(std::make_unique<Impl>(filename, device_id)) {}\n\nAxclModel::AxclModel(const void *cpu_buf, size_t buf_len_in_bytes,\n                     int32_t device_id /*= 0*/)\n    : impl_(std::make_unique<Impl>(cpu_buf, buf_len_in_bytes, device_id)) {}\n\nAxclModel::~AxclModel() = default;\n\nconst std::vector<std::string> &AxclModel::InputTensorNames() const {\n  return impl_->InputTensorNames();\n}\nconst std::vector<std::string> &AxclModel::OutputTensorNames() const {\n  return impl_->OutputTensorNames();\n}\n\nstd::vector<int32_t> AxclModel::TensorShape(const std::string &name) const {\n  return impl_->TensorShape(name);\n}\n\nint32_t AxclModel::TensorSizeInBytes(const std::string &name) const {\n  return impl_->TensorSizeInBytes(name);\n}\n\nbool AxclModel::HasTensor(const std::string &name) const {\n  return impl_->HasTensor(name);\n}\n\nbool AxclModel::SetInputTensorData(const std::string &name, const float *p,\n                                   int32_t n) const {\n  return impl_->SetInputTensorData(name, p, n);\n}\n\nbool AxclModel::SetInputTensorData(const std::string &name, const int32_t *p,\n                                   int32_t n) 
const {\n  return impl_->SetInputTensorData(name, p, n);\n}\n\nstd::vector<float> AxclModel::GetOutputTensorData(\n    const std::string &name) const {\n  return impl_->GetOutputTensorData(name);\n}\n\nbool AxclModel::Run() const { return impl_->Run(); }\n\nbool AxclModel::IsInitialized() const { return impl_->IsInitialized(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/axcl-model.h",
    "content": "// sherpa-onnx/csrc/axcl/axcl-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_AXCL_MODEL_H_\n#define SHERPA_ONNX_CSRC_AXCL_AXCL_MODEL_H_\n\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass AxclModel {\n public:\n  explicit AxclModel(const std::string &filename, int32_t device_id = 0);\n\n  AxclModel(const void *cpu_buf, size_t buf_len_in_bytes,\n            int32_t device_id = 0);\n  ~AxclModel();\n\n  const std::vector<std::string> &InputTensorNames() const;\n  const std::vector<std::string> &OutputTensorNames() const;\n\n  std::vector<int32_t> TensorShape(const std::string &name) const;\n  int32_t TensorSizeInBytes(const std::string &name) const;\n\n  bool HasTensor(const std::string &name) const;\n\n  bool SetInputTensorData(const std::string &name, const float *p,\n                          int32_t n) const;\n\n  bool SetInputTensorData(const std::string &name, const int32_t *p,\n                          int32_t n) const;\n\n  std::vector<float> GetOutputTensorData(const std::string &name) const;\n\n  bool Run() const;\n  bool IsInitialized() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_AXCL_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.cc",
    "content": "// sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.cc\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#include \"sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.h\"\n\n#include <algorithm>\n#include <array>\n#include <cstring>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/axcl/axcl-model.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAxcl::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    model_ = std::make_unique<AxclModel>(config_.sense_voice.model);\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(mgr, config_.sense_voice.model);\n    model_ = std::make_unique<AxclModel>(buf.data(), buf.size());\n\n    PostInit();\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) {\n    features = ApplyLFR(std::move(features));\n    std::array<int32_t, 4> prompt{language, 1, 2, text_norm};\n\n    model_->SetInputTensorData(\"x\", features.data(), features.size());\n    model_->SetInputTensorData(\"prompt\", prompt.data(), prompt.size());\n    model_->Run();\n    return model_->GetOutputTensorData(\"logits\");\n  }\n\n private:\n  void PostInit() {\n    if (!model_->IsInitialized()) {\n      SHERPA_ONNX_LOGE(\"Failed to initialize the model with '%s'\",\n                       config_.sense_voice.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    num_input_frames_ = model_->TensorShape(\"x\")[1];\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"  num_input_frames_ = %d\", num_input_frames_);\n    }\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    
int32_t lfr_window_size = meta_data_.window_size;\n    int32_t lfr_window_shift = meta_data_.window_shift;\n    int32_t in_feat_dim = 80;\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > num_input_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, num_input_frames_);\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n      out_num_frames = num_input_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n    std::vector<float> out(num_input_frames_ * out_feat_dim);\n    const float *p_in = in.data();\n    float *p_out = out.data();\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n    return out;\n  }\n\n private:\n  OfflineModelConfig config_;\n  std::unique_ptr<AxclModel> model_;\n  OfflineSenseVoiceModelMetaData meta_data_;\n  int32_t num_input_frames_ = -1;\n};\n\nOfflineSenseVoiceModelAxcl::~OfflineSenseVoiceModelAxcl() = default;\n\nOfflineSenseVoiceModelAxcl::OfflineSenseVoiceModelAxcl(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModelAxcl::OfflineSenseVoiceModelAxcl(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<float> OfflineSenseVoiceModelAxcl::Run(std::vector<float> features,\n                                                   int32_t language,\n                                                   int32_t text_norm) const {\n  return impl_->Run(std::move(features), language, text_norm);\n}\n\nconst 
OfflineSenseVoiceModelMetaData &\nOfflineSenseVoiceModelAxcl::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModelAxcl::OfflineSenseVoiceModelAxcl(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModelAxcl::OfflineSenseVoiceModelAxcl(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.h",
    "content": "// sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.h\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_OFFLINE_SENSE_VOICE_MODEL_AXCL_H_\n#define SHERPA_ONNX_CSRC_AXCL_OFFLINE_SENSE_VOICE_MODEL_AXCL_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAxcl {\n public:\n  ~OfflineSenseVoiceModelAxcl();\n\n  explicit OfflineSenseVoiceModelAxcl(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModelAxcl(Manager *mgr, const OfflineModelConfig &config);\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_OFFLINE_SENSE_VOICE_MODEL_AXCL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/utils.cc",
    "content": "// sherpa-onnx/csrc/axcl/utils.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axcl/utils.h\"\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nAxclDevicePtr::AxclDevicePtr(\n    size_t size,\n    axclrtMemMallocPolicy policy /*= AXCL_MEM_MALLOC_HUGE_FIRST*/) {\n  auto ret = axclrtMalloc(&p_, size, policy);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"Failed to call axclrtMalloc(). Return code: %d\",\n                     static_cast<int32_t>(ret));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  size_ = size;\n}\n\nvoid AxclDevicePtr::Release() {\n  if (!p_) {\n    return;\n  }\n\n  auto ret = axclrtFree(p_);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"Failed to call axclrtFree(). Return code: %d\",\n                     static_cast<int32_t>(ret));\n    SHERPA_ONNX_EXIT(-1);\n  }\n  p_ = nullptr;\n  size_ = 0;\n}\n\nAxclDevicePtr::~AxclDevicePtr() { Release(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axcl/utils.h",
    "content": "// sherpa-onnx/csrc/axcl/utils.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXCL_UTILS_H_\n#define SHERPA_ONNX_CSRC_AXCL_UTILS_H_\n\n#include \"axcl.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass AxclDevicePtr {\n public:\n  explicit AxclDevicePtr(\n      size_t size, axclrtMemMallocPolicy policy = AXCL_MEM_MALLOC_HUGE_FIRST);\n\n  ~AxclDevicePtr();\n\n  AxclDevicePtr(const AxclDevicePtr &) = delete;\n  AxclDevicePtr &operator=(const AxclDevicePtr &) = delete;\n\n  AxclDevicePtr(AxclDevicePtr &&other) {\n    p_ = other.p_;\n    size_ = other.size_;\n\n    other.p_ = nullptr;\n    other.size_ = 0;\n  }\n  AxclDevicePtr &operator=(AxclDevicePtr &&other) {\n    if (this == &other) {\n      return *this;\n    }\n    Release();\n    p_ = other.p_;\n    size_ = other.size_;\n\n    other.p_ = nullptr;\n    other.size_ = 0;\n\n    return *this;\n  }\n\n  void Release();\n\n  void *Get() const { return p_; }\n  operator void *() { return p_; }\n\n  size_t Size() const { return size_; }\n\n private:\n  void *p_ = nullptr;\n  size_t size_ = 0;  // in bytes\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXCL_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/ax-engine-guard.cc",
    "content": "// sherpa-onnx/csrc/axera/ax-engine-guard.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/axera/ax-engine-guard.h\"\n\n#include <cstring>\n\n#include \"ax_engine_api.h\"  // NOLINT\n#include \"ax_sys_api.h\"     // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nthread_local int32_t AxEngineGuard::count_ = 0;\n\nAxEngineGuard::AxEngineGuard() {\n  if (count_ == 0) {\n    auto ret = AX_SYS_Init();\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call AX_SYS_Init. ret code: %d\",\n                       static_cast<int32_t>(ret));\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    AX_ENGINE_NPU_ATTR_T npu_attr;\n    memset(&npu_attr, 0, sizeof(npu_attr));\n    npu_attr.eHardMode = AX_ENGINE_VIRTUAL_NPU_DISABLE;\n    ret = AX_ENGINE_Init(&npu_attr);\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to call AX_ENGINE_Init. ret code: %d\",\n                       static_cast<int32_t>(ret));\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  ++count_;\n}\n\nAxEngineGuard::~AxEngineGuard() {\n  --count_;\n  if (count_ == 0) {\n    AX_ENGINE_Deinit();\n    AX_SYS_Deinit();\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/ax-engine-guard.h",
    "content": "// sherpa-onnx/csrc/axera/ax-engine-guard.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_AXERA_AX_ENGINE_GUARD_H_\n#define SHERPA_ONNX_CSRC_AXERA_AX_ENGINE_GUARD_H_\n#include <cstdint>\n\nnamespace sherpa_onnx {\n\nclass AxEngineGuard {\n public:\n  AxEngineGuard();\n  ~AxEngineGuard();\n\n  AxEngineGuard(const AxEngineGuard &) = delete;\n  AxEngineGuard &operator=(const AxEngineGuard &) = delete;\n\n  AxEngineGuard(AxEngineGuard &&) = delete;\n  AxEngineGuard &operator=(AxEngineGuard &&) = delete;\n\n private:\n  static thread_local int32_t count_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXERA_AX_ENGINE_GUARD_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.cc",
    "content": "// sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.cc\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#include \"sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.h\"\n\n#include <algorithm>\n#include <array>\n#include <cstring>\n#include <mutex>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"ax_engine_api.h\"  // NOLINT\n#include \"ax_sys_api.h\"     // NOLINT\n#include \"sherpa-onnx/csrc/axera/ax-engine-guard.h\"\n#include \"sherpa-onnx/csrc/axera/utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAxera::Impl {\n public:\n  ~Impl() {\n    FreeIO(&io_data_);\n    if (handle_) {\n      AX_ENGINE_DestroyHandle(handle_);\n    }\n  }\n\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(mgr, config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) {\n    // TODO(fangjun): Support multi clients\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    features = ApplyLFR(std::move(features));\n\n    std::array<int32_t, 4> prompt{language, 1, 2, text_norm};\n\n    const auto &in0_meta = io_info_->pInputs[0];\n    size_t bytes0 = in0_meta.nSize;\n\n    if (bytes0 != features.size() * sizeof(float)) {\n      SHERPA_ONNX_LOGE(\n          \"Feature size mismatch. 
model expects %u bytes, but got %zu bytes\",\n          in0_meta.nSize, features.size() * sizeof(float));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::memcpy(io_data_.pInputs[0].pVirAddr, features.data(), bytes0);\n\n    const auto &in1_meta = io_info_->pInputs[1];\n    size_t bytes1 = in1_meta.nSize;\n    if (bytes1 != prompt.size() * sizeof(int32_t)) {\n      SHERPA_ONNX_LOGE(\n          \"Prompt size mismatch. model expects %u bytes, but got %zu bytes\",\n          in1_meta.nSize, prompt.size() * sizeof(int32_t));\n      SHERPA_ONNX_EXIT(-1);\n    }\n    std::memcpy(io_data_.pInputs[1].pVirAddr, prompt.data(), bytes1);\n\n    auto ret = AX_ENGINE_RunSync(handle_, &io_data_);\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"AX_ENGINE_RunSync failed, ret = %d\", ret);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    const auto &out_meta = io_info_->pOutputs[0];\n    auto &out_buf = io_data_.pOutputs[0];\n\n    size_t out_elems = out_meta.nSize / sizeof(float);\n    std::vector<float> out(out_elems);\n\n    std::memcpy(out.data(), out_buf.pVirAddr, out_meta.nSize);\n\n    return out;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &handle_);\n\n    InitInputOutputAttrs(handle_, config_.debug, &io_info_);\n\n    PrepareIO(io_info_, &io_data_, config_.debug);\n\n    if (!io_info_ || io_info_->nInputSize != 2 || !io_info_->pInputs) {\n      SHERPA_ONNX_LOGE(\"Expected 2 input tensors in Axera model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    auto &in0 = io_info_->pInputs[0];\n    if (in0.nShapeSize < 2) {\n      SHERPA_ONNX_LOGE(\"Input tensor rank is too small (nShapeSize = %u)\",\n                       in0.nShapeSize);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    num_input_frames_ = in0.pShape[1];\n\n    if (io_info_->nOutputSize != 1) {\n      SHERPA_ONNX_LOGE(\"Axera sense voice model expected only 1 output tensor\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      
SHERPA_ONNX_LOGE(\"Axera SenseVoice model init done.\");\n      SHERPA_ONNX_LOGE(\"  num_input_frames_ = %d\", num_input_frames_);\n    }\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = meta_data_.window_size;\n    int32_t lfr_window_shift = meta_data_.window_shift;\n    int32_t in_feat_dim = 80;\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > num_input_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, num_input_frames_);\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n      out_num_frames = num_input_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n    std::vector<float> out(num_input_frames_ * out_feat_dim);\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n private:\n  std::mutex mutex_;\n  AxEngineGuard ax_engine_guard_;\n\n  OfflineModelConfig config_;\n  AX_ENGINE_HANDLE handle_ = nullptr;\n  AX_ENGINE_IO_INFO_T *io_info_ = nullptr;\n  AX_ENGINE_IO_T io_data_;\n  OfflineSenseVoiceModelMetaData meta_data_;\n  int32_t num_input_frames_ = -1;\n};\n\nOfflineSenseVoiceModelAxera::~OfflineSenseVoiceModelAxera() = default;\n\nOfflineSenseVoiceModelAxera::OfflineSenseVoiceModelAxera(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModelAxera::OfflineSenseVoiceModelAxera(\n    Manager *mgr, const OfflineModelConfig &config)\n    : 
impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<float> OfflineSenseVoiceModelAxera::Run(std::vector<float> features,\n                                                    int32_t language,\n                                                    int32_t text_norm) const {\n  return impl_->Run(std::move(features), language, text_norm);\n}\n\nconst OfflineSenseVoiceModelMetaData &\nOfflineSenseVoiceModelAxera::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModelAxera::OfflineSenseVoiceModelAxera(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModelAxera::OfflineSenseVoiceModelAxera(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.h",
    "content": "// sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.h\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#ifndef SHERPA_ONNX_CSRC_AXERA_OFFLINE_SENSE_VOICE_MODEL_AXERA_H_\n#define SHERPA_ONNX_CSRC_AXERA_OFFLINE_SENSE_VOICE_MODEL_AXERA_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelAxera {\n public:\n  ~OfflineSenseVoiceModelAxera();\n\n  explicit OfflineSenseVoiceModelAxera(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModelAxera(Manager *mgr, const OfflineModelConfig &config);\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXERA_OFFLINE_SENSE_VOICE_MODEL_AXERA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/utils.cc",
    "content": "// sherpa-onnx/csrc/axera/utils.cc\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#include \"sherpa-onnx/csrc/axera/utils.h\"\n\n#include <string.h>\n\n#include <sstream>\n#include <string>\n#include <utility>\n\n#include \"ax_engine_api.h\"   // NOLINT\n#include \"ax_engine_type.h\"  // NOLINT\n#include \"ax_sys_api.h\"      // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\n#define SHERPA_ONNX_TO_STRING(type) \\\n  case type:                        \\\n    return #type\n\nnamespace sherpa_onnx {\n\nstatic constexpr int32_t kCmnAlignSize = 128;\nstatic const char *kSherpaOnnxAxeraSessionName = \"sherpa-onnx-axera\";\n\nstatic std::string VectorToString(AX_S32 *arr, AX_U8 n) {\n  std::ostringstream os;\n  std::string sep;\n  os << \"[\";\n  for (AX_U8 i = 0; i < n; ++i) {\n    os << sep << arr[i];\n    sep = \", \";\n  }\n  os << \"]\";\n\n  return os.str();\n}\n\nstatic const char *AxEngineDataTypeToString(AX_ENGINE_DATA_TYPE_T type) {\n  switch (type) {\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UNKNOWN);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT8);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT16);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_FLOAT32);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_SINT16);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_SINT8);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_SINT32);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT32);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_FLOAT64);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT10_PACKED);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT12_PACKED);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT14_PACKED);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_DT_UINT16_PACKED);\n    default:\n      return \"Unknown data type\";\n  }\n}\n\nstatic const char *AxEngineTensorLayoutToString(\n    AX_ENGINE_TENSOR_LAYOUT_T layout) {\n  switch (layout) {\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_TENSOR_LAYOUT_UNKNOWN);\n    
SHERPA_ONNX_TO_STRING(AX_ENGINE_TENSOR_LAYOUT_NHWC);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_TENSOR_LAYOUT_NCHW);\n    default:\n      return \"Unknown data layout\";\n  }\n}\n\nstatic const char *AxEngineMemoryTypeToString(AX_ENGINE_MEMORY_TYPE_T type) {\n  switch (type) {\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_MT_PHYSICAL);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_MT_VIRTUAL);\n    SHERPA_ONNX_TO_STRING(AX_ENGINE_MT_OCM);\n    default:\n      return \"Unknown memory type\";\n  }\n}\n\n/*\nnum_inputs: 2\nnum_outputs: 1\nmax_batch_size: 1\ndynamic_batch_size: false\n---input 0---\n name: x\n shape: [1, 167, 560]\n layout: AX_ENGINE_TENSOR_LAYOUT_NCHW\n memory_type: AX_ENGINE_MT_PHYSICAL\n data_type: AX_ENGINE_DT_FLOAT32\n n_size (number of bytes): 374080\n---input 1---\n name: prompt\n shape: [4]\n layout: AX_ENGINE_TENSOR_LAYOUT_NCHW\n memory_type: AX_ENGINE_MT_PHYSICAL\n data_type: AX_ENGINE_DT_SINT32\n n_size (number of bytes): 16\n\n---output 0---\n name: logits\n shape: [1, 171, 25055]\n layout: AX_ENGINE_TENSOR_LAYOUT_UNKNOWN\n memory_type: AX_ENGINE_MT_PHYSICAL\n data_type: AX_ENGINE_DT_FLOAT32\n n_size: 17137620\n */\nstatic std::string ToString(const AX_ENGINE_IO_INFO_T *io_info) {\n  std::ostringstream os;\n  os << \"num_inputs: \" << io_info->nInputSize << \"\\n\";\n  os << \"num_outputs: \" << io_info->nOutputSize << \"\\n\";\n  os << \"max_batch_size: \" << io_info->nMaxBatchSize << \"\\n\";\n  os << \"dynamic_batch_size: \" << (io_info->bDynamicBatchSize ? 
\"true\" : \"false\")\n     << \"\\n\";\n\n  for (AX_U32 i = 0; i < io_info->nInputSize; ++i) {\n    const auto &input = io_info->pInputs[i];\n    os << \"---input \" << i << \"---\\n\";\n    os << \" name: \" << input.pName << \"\\n\";\n    os << \" shape: \" << VectorToString(input.pShape, input.nShapeSize) << \"\\n\";\n    os << \" layout: \" << AxEngineTensorLayoutToString(input.eLayout) << \"\\n\";\n    os << \" memory_type: \" << AxEngineMemoryTypeToString(input.eMemoryType)\n       << \"\\n\";\n    os << \" data_type: \" << AxEngineDataTypeToString(input.eDataType) << \"\\n\";\n    os << \" n_size (number of bytes): \" << input.nSize << \"\\n\";\n  }\n  os << \"\\n\";\n\n  for (AX_U32 i = 0; i < io_info->nOutputSize; ++i) {\n    const auto &output = io_info->pOutputs[i];\n    os << \"---output \" << i << \"---\\n\";\n    os << \" name: \" << output.pName << \"\\n\";\n    os << \" shape: \" << VectorToString(output.pShape, output.nShapeSize)\n       << \"\\n\";\n    os << \" layout: \" << AxEngineTensorLayoutToString(output.eLayout) << \"\\n\";\n    os << \" memory_type: \" << AxEngineMemoryTypeToString(output.eMemoryType)\n       << \"\\n\";\n    os << \" data_type: \" << AxEngineDataTypeToString(output.eDataType) << \"\\n\";\n    os << \" n_size: \" << output.nSize << \"\\n\";\n  }\n\n  return os.str();\n}\n\nvoid InitContext(const void *model_data, size_t model_data_length, bool debug,\n                 AX_ENGINE_HANDLE *handle) {\n  if (!handle) {\n    SHERPA_ONNX_LOGE(\"InitContext: handle is null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  auto ret = AX_ENGINE_CreateHandle(handle, model_data, model_data_length);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_CreateHandle failed, ret = %d\", ret);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (debug) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_CreateHandle done. 
handle = %p\", *handle);\n  }\n\n  ret = AX_ENGINE_CreateContext(*handle);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_CreateContext failed, ret = %d\", ret);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (debug) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_CreateContext done.\");\n  }\n}\n\nvoid InitInputOutputAttrs(AX_ENGINE_HANDLE handle, bool debug,\n                          AX_ENGINE_IO_INFO_T **io_info) {\n  if (!io_info) {\n    SHERPA_ONNX_LOGE(\"InitInputOutputAttrs: io_info is null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  // Note(fangjun): No need to free *io_info\n  auto ret = AX_ENGINE_GetIOInfo(handle, io_info);\n  if (ret != 0) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_GetIOInfo failed, ret = %d\", ret);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (debug) {\n    SHERPA_ONNX_LOGE(\"AX_ENGINE_GetIOInfo done.\");\n    SHERPA_ONNX_LOGE(\"IO_INFO:\\n%s\", ToString(*io_info).c_str());\n  }\n}\n\nvoid PrepareIO(AX_ENGINE_IO_INFO_T *io_info, AX_ENGINE_IO_T *io_data,\n               bool debug) {\n  if (!io_info || !io_data) {\n    SHERPA_ONNX_LOGE(\"PrepareIO: io_info or io_data is null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  memset(io_data, 0, sizeof(AX_ENGINE_IO_T));\n\n  io_data->pInputs = new AX_ENGINE_IO_BUFFER_T[io_info->nInputSize];\n\n  memset(io_data->pInputs, 0,\n         sizeof(AX_ENGINE_IO_BUFFER_T) * io_info->nInputSize);\n\n  io_data->nInputSize = io_info->nInputSize;\n\n  for (AX_U32 i = 0; i < io_info->nInputSize; ++i) {\n    const auto &input = io_info->pInputs[i];\n    auto &buffer = io_data->pInputs[i];\n\n    buffer.nSize = input.nSize;\n\n    auto ret = AX_SYS_MemAlloc(\n        reinterpret_cast<AX_U64 *>(&buffer.phyAddr), &buffer.pVirAddr,\n        input.nSize, kCmnAlignSize,\n        reinterpret_cast<const AX_S8 *>(kSherpaOnnxAxeraSessionName));\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to allocate memory for Input %d\",\n                       static_cast<int32_t>(i));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  io_data->pOutputs = 
new AX_ENGINE_IO_BUFFER_T[io_info->nOutputSize];\n\n  memset(io_data->pOutputs, 0,\n         sizeof(AX_ENGINE_IO_BUFFER_T) * io_info->nOutputSize);\n\n  io_data->nOutputSize = io_info->nOutputSize;\n\n  for (AX_U32 i = 0; i < io_info->nOutputSize; ++i) {\n    const auto &output = io_info->pOutputs[i];\n    auto &buffer = io_data->pOutputs[i];\n    buffer.nSize = output.nSize;\n    auto ret = AX_SYS_MemAllocCached(\n        reinterpret_cast<AX_U64 *>(&buffer.phyAddr), &buffer.pVirAddr,\n        output.nSize, kCmnAlignSize,\n        reinterpret_cast<const AX_S8 *>(kSherpaOnnxAxeraSessionName));\n\n    if (ret != 0) {\n      SHERPA_ONNX_LOGE(\"Failed to allocate memory for Output %d\",\n                       static_cast<int32_t>(i));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\nvoid FreeIO(AX_ENGINE_IO_T *io_data) {\n  for (AX_U32 i = 0; i < io_data->nInputSize; ++i) {\n    auto &buf = io_data->pInputs[i];\n    AX_SYS_MemFree(buf.phyAddr, buf.pVirAddr);\n  }\n\n  for (AX_U32 i = 0; i < io_data->nOutputSize; ++i) {\n    auto &buf = io_data->pOutputs[i];\n    AX_SYS_MemFree(buf.phyAddr, buf.pVirAddr);\n  }\n  delete[] io_data->pInputs;\n  delete[] io_data->pOutputs;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/axera/utils.h",
    "content": "// sherpa-onnx/csrc/axera/utils.h\n//\n// Copyright (c)  2025  M5Stack Technology CO LTD\n\n#ifndef SHERPA_ONNX_CSRC_AXERA_UTILS_H_\n#define SHERPA_ONNX_CSRC_AXERA_UTILS_H_\n\n#include <cstddef>\n\n#include \"ax_engine_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nvoid InitContext(const void *model_data, size_t model_data_length, bool debug,\n                 AX_ENGINE_HANDLE *handle);\n\nvoid InitInputOutputAttrs(AX_ENGINE_HANDLE handle, bool debug,\n                          AX_ENGINE_IO_INFO_T **io_info);\n\nvoid PrepareIO(AX_ENGINE_IO_INFO_T *io_info, AX_ENGINE_IO_T *io_data,\n               bool debug);\n\nvoid FreeIO(AX_ENGINE_IO_T *io_data);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_AXERA_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/base64-decode.cc",
    "content": "// sherpa-onnx/csrc/base64-decode.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/base64-decode.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstatic int32_t Ord(char c) {\n  if (c >= 'A' && c <= 'Z') {\n    return c - 'A';\n  } else if (c >= 'a' && c <= 'z') {\n    return c - 'a' + ('Z' - 'A') + 1;\n  } else if (c >= '0' && c <= '9') {\n    return c - '0' + ('Z' - 'A') + ('z' - 'a') + 2;\n  } else if (c == '+') {\n    return 62;\n  } else if (c == '/') {\n    return 63;\n  }\n\n  SHERPA_ONNX_LOGE(\"Unknown character %d, %c\\n\", c, c);\n\n  exit(-1);\n}\n\n// see\n// https://github.com/ReneNyffenegger/cpp-base64/blob/master/base64.cpp#L243\nstd::string Base64Decode(const std::string &s) {\n  if (s.empty()) {\n    SHERPA_ONNX_LOGE(\"Empty string!\");\n    exit(-1);\n  }\n\n  int32_t n = static_cast<int32_t>(s.size()) / 4 * 3;\n\n  std::string ans;\n  ans.reserve(n);\n\n  int32_t i = 0;\n  while (i < static_cast<int32_t>(s.size())) {\n    if (s[i] == '=') {\n      return \" \";\n    }\n\n    int32_t first = (Ord(s[i]) << 2) + ((Ord(s[i + 1]) & 0x30) >> 4);\n    ans.push_back(static_cast<char>(first));\n\n    if (i + 2 < static_cast<int32_t>(s.size()) && s[i + 2] != '=') {\n      int32_t second =\n          ((Ord(s[i + 1]) & 0x0f) << 4) + ((Ord(s[i + 2]) & 0x3c) >> 2);\n      ans.push_back(static_cast<char>(second));\n\n      if (i + 3 < static_cast<int32_t>(s.size()) && s[i + 3] != '=') {\n        int32_t third = ((Ord(s[i + 2]) & 0x03) << 6) + Ord(s[i + 3]);\n        ans.push_back(static_cast<char>(third));\n      }\n    }\n    i += 4;\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/base64-decode.h",
    "content": "// sherpa-onnx/csrc/base64-decode.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_BASE64_DECODE_H_\n#define SHERPA_ONNX_CSRC_BASE64_DECODE_H_\n\n#include <string>\n\nnamespace sherpa_onnx {\n\n/** @param s A base64 encoded string.\n *  @return Return the decoded string.\n */\nstd::string Base64Decode(const std::string &s);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_BASE64_DECODE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/bbpe.cc",
    "content": "// sherpa-onnx/csrc/bbpe.cc\n//\n// Copyright (c)  2024 Xiaomi Corporation\n\n// Auto-generated! DO NOT EDIT\n\n#include \"sherpa-onnx/csrc/bbpe.h\"\n\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n\nconst std::unordered_map<std::string, uint8_t> &GetByteBpeTable() {\n  static const std::unordered_map<std::string, uint8_t> table = {\n      {\"Ā\", 0},   {\"ā\", 1},   {\"Ă\", 2},   {\"ă\", 3},   {\"Ą\", 4},   {\"ą\", 5},\n      {\"Ć\", 6},   {\"ć\", 7},   {\"Ĉ\", 8},   {\"ĉ\", 9},   {\"Ċ\", 10},  {\"ċ\", 11},\n      {\"Č\", 12},  {\"č\", 13},  {\"Ď\", 14},  {\"ď\", 15},  {\"Đ\", 16},  {\"đ\", 17},\n      {\"Ē\", 18},  {\"ē\", 19},  {\"Ĕ\", 20},  {\"ĕ\", 21},  {\"Ė\", 22},  {\"ė\", 23},\n      {\"Ę\", 24},  {\"ę\", 25},  {\"Ě\", 26},  {\"ě\", 27},  {\"Ĝ\", 28},  {\"ĝ\", 29},\n      {\"Ğ\", 30},  {\"ğ\", 31},  {\" \", 32},  {\"!\", 33},  {\"\\\"\", 34}, {\"#\", 35},\n      {\"$\", 36},  {\"%\", 37},  {\"&\", 38},  {\"'\", 39},  {\"(\", 40},  {\")\", 41},\n      {\"*\", 42},  {\"+\", 43},  {\",\", 44},  {\"-\", 45},  {\".\", 46},  {\"/\", 47},\n      {\"0\", 48},  {\"1\", 49},  {\"2\", 50},  {\"3\", 51},  {\"4\", 52},  {\"5\", 53},\n      {\"6\", 54},  {\"7\", 55},  {\"8\", 56},  {\"9\", 57},  {\":\", 58},  {\";\", 59},\n      {\"<\", 60},  {\"=\", 61},  {\">\", 62},  {\"?\", 63},  {\"@\", 64},  {\"A\", 65},\n      {\"B\", 66},  {\"C\", 67},  {\"D\", 68},  {\"E\", 69},  {\"F\", 70},  {\"G\", 71},\n      {\"H\", 72},  {\"I\", 73},  {\"J\", 74},  {\"K\", 75},  {\"L\", 76},  {\"M\", 77},\n      {\"N\", 78},  {\"O\", 79},  {\"P\", 80},  {\"Q\", 81},  {\"R\", 82},  {\"S\", 83},\n      {\"T\", 84},  {\"U\", 85},  {\"V\", 86},  {\"W\", 87},  {\"X\", 88},  {\"Y\", 89},\n      {\"Z\", 90},  {\"[\", 91},  {\"\\\\\", 92}, {\"]\", 93},  {\"^\", 94},  {\"_\", 95},\n      {\"`\", 96},  {\"a\", 97},  {\"b\", 98},  {\"c\", 99},  {\"d\", 100}, {\"e\", 101},\n      {\"f\", 102}, {\"g\", 103}, {\"h\", 104}, {\"i\", 105}, {\"j\", 106}, {\"k\", 107},\n    
  {\"l\", 108}, {\"m\", 109}, {\"n\", 110}, {\"o\", 111}, {\"p\", 112}, {\"q\", 113},\n      {\"r\", 114}, {\"s\", 115}, {\"t\", 116}, {\"u\", 117}, {\"v\", 118}, {\"w\", 119},\n      {\"x\", 120}, {\"y\", 121}, {\"z\", 122}, {\"{\", 123}, {\"|\", 124}, {\"}\", 125},\n      {\"~\", 126}, {\"Ġ\", 127}, {\"ġ\", 128}, {\"Ģ\", 129}, {\"ģ\", 130}, {\"Ĥ\", 131},\n      {\"ĥ\", 132}, {\"Ħ\", 133}, {\"ħ\", 134}, {\"Ĩ\", 135}, {\"ĩ\", 136}, {\"Ī\", 137},\n      {\"ī\", 138}, {\"Ĭ\", 139}, {\"ĭ\", 140}, {\"Į\", 141}, {\"į\", 142}, {\"İ\", 143},\n      {\"ı\", 144}, {\"Ĵ\", 145}, {\"ĵ\", 146}, {\"Ķ\", 147}, {\"ķ\", 148}, {\"ĸ\", 149},\n      {\"Ĺ\", 150}, {\"ĺ\", 151}, {\"Ļ\", 152}, {\"ļ\", 153}, {\"Ľ\", 154}, {\"ľ\", 155},\n      {\"Ł\", 156}, {\"ł\", 157}, {\"Ń\", 158}, {\"ń\", 159}, {\"Ņ\", 160}, {\"ņ\", 161},\n      {\"Ň\", 162}, {\"ň\", 163}, {\"Ŋ\", 164}, {\"ŋ\", 165}, {\"Ō\", 166}, {\"ō\", 167},\n      {\"Ŏ\", 168}, {\"ŏ\", 169}, {\"Ő\", 170}, {\"ő\", 171}, {\"Œ\", 172}, {\"œ\", 173},\n      {\"Ŕ\", 174}, {\"ŕ\", 175}, {\"Ŗ\", 176}, {\"ŗ\", 177}, {\"Ř\", 178}, {\"ř\", 179},\n      {\"Ś\", 180}, {\"ś\", 181}, {\"Ŝ\", 182}, {\"ŝ\", 183}, {\"Ş\", 184}, {\"ş\", 185},\n      {\"Š\", 186}, {\"š\", 187}, {\"Ţ\", 188}, {\"ţ\", 189}, {\"Ť\", 190}, {\"ť\", 191},\n      {\"Ŧ\", 192}, {\"ŧ\", 193}, {\"Ũ\", 194}, {\"ũ\", 195}, {\"Ū\", 196}, {\"ū\", 197},\n      {\"Ŭ\", 198}, {\"ŭ\", 199}, {\"Ů\", 200}, {\"ů\", 201}, {\"Ű\", 202}, {\"ű\", 203},\n      {\"Ų\", 204}, {\"ų\", 205}, {\"Ŵ\", 206}, {\"ŵ\", 207}, {\"Ŷ\", 208}, {\"ŷ\", 209},\n      {\"Ÿ\", 210}, {\"Ź\", 211}, {\"ź\", 212}, {\"Ż\", 213}, {\"ż\", 214}, {\"Ž\", 215},\n      {\"ž\", 216}, {\"ƀ\", 217}, {\"Ɓ\", 218}, {\"Ƃ\", 219}, {\"ƃ\", 220}, {\"Ƅ\", 221},\n      {\"ƅ\", 222}, {\"Ɔ\", 223}, {\"Ƈ\", 224}, {\"ƈ\", 225}, {\"Ɖ\", 226}, {\"Ɗ\", 227},\n      {\"Ƌ\", 228}, {\"ƌ\", 229}, {\"ƍ\", 230}, {\"Ǝ\", 231}, {\"Ə\", 232}, {\"Ɛ\", 233},\n      {\"Ƒ\", 234}, {\"ƒ\", 235}, {\"Ɠ\", 236}, {\"Ɣ\", 237}, {\"ƕ\", 238}, {\"Ɩ\", 239},\n  
    {\"Ɨ\", 240}, {\"Ƙ\", 241}, {\"ƙ\", 242}, {\"ƚ\", 243}, {\"ƛ\", 244}, {\"Ɯ\", 245},\n      {\"Ɲ\", 246}, {\"ƞ\", 247}, {\"Ɵ\", 248}, {\"Ơ\", 249}, {\"ơ\", 250}, {\"Ƣ\", 251},\n      {\"ƣ\", 252}, {\"Ƥ\", 253}, {\"ƥ\", 254}, {\"Ʀ\", 255}, {\"⁇\", 32},\n  };\n\n  return table;\n}\n\nconst std::unordered_map<uint8_t, std::string> &GetByteBpeTableId2Token() {\n  static const std::unordered_map<uint8_t, std::string> table = {\n      {0, \"Ā\"},   {1, \"ā\"},   {2, \"Ă\"},   {3, \"ă\"},   {4, \"Ą\"},   {5, \"ą\"},\n      {6, \"Ć\"},   {7, \"ć\"},   {8, \"Ĉ\"},   {9, \"ĉ\"},   {10, \"Ċ\"},  {11, \"ċ\"},\n      {12, \"Č\"},  {13, \"č\"},  {14, \"Ď\"},  {15, \"ď\"},  {16, \"Đ\"},  {17, \"đ\"},\n      {18, \"Ē\"},  {19, \"ē\"},  {20, \"Ĕ\"},  {21, \"ĕ\"},  {22, \"Ė\"},  {23, \"ė\"},\n      {24, \"Ę\"},  {25, \"ę\"},  {26, \"Ě\"},  {27, \"ě\"},  {28, \"Ĝ\"},  {29, \"ĝ\"},\n      {30, \"Ğ\"},  {31, \"ğ\"},  {32, \" \"},  {33, \"!\"},  {34, \"\\\"\"}, {35, \"#\"},\n      {36, \"$\"},  {37, \"%\"},  {38, \"&\"},  {39, \"'\"},  {40, \"(\"},  {41, \")\"},\n      {42, \"*\"},  {43, \"+\"},  {44, \",\"},  {45, \"-\"},  {46, \".\"},  {47, \"/\"},\n      {48, \"0\"},  {49, \"1\"},  {50, \"2\"},  {51, \"3\"},  {52, \"4\"},  {53, \"5\"},\n      {54, \"6\"},  {55, \"7\"},  {56, \"8\"},  {57, \"9\"},  {58, \":\"},  {59, \";\"},\n      {60, \"<\"},  {61, \"=\"},  {62, \">\"},  {63, \"?\"},  {64, \"@\"},  {65, \"A\"},\n      {66, \"B\"},  {67, \"C\"},  {68, \"D\"},  {69, \"E\"},  {70, \"F\"},  {71, \"G\"},\n      {72, \"H\"},  {73, \"I\"},  {74, \"J\"},  {75, \"K\"},  {76, \"L\"},  {77, \"M\"},\n      {78, \"N\"},  {79, \"O\"},  {80, \"P\"},  {81, \"Q\"},  {82, \"R\"},  {83, \"S\"},\n      {84, \"T\"},  {85, \"U\"},  {86, \"V\"},  {87, \"W\"},  {88, \"X\"},  {89, \"Y\"},\n      {90, \"Z\"},  {91, \"[\"},  {92, \"\\\\\"}, {93, \"]\"},  {94, \"^\"},  {95, \"_\"},\n      {96, \"`\"},  {97, \"a\"},  {98, \"b\"},  {99, \"c\"},  {100, \"d\"}, {101, \"e\"},\n      {102, \"f\"}, {103, \"g\"}, 
{104, \"h\"}, {105, \"i\"}, {106, \"j\"}, {107, \"k\"},\n      {108, \"l\"}, {109, \"m\"}, {110, \"n\"}, {111, \"o\"}, {112, \"p\"}, {113, \"q\"},\n      {114, \"r\"}, {115, \"s\"}, {116, \"t\"}, {117, \"u\"}, {118, \"v\"}, {119, \"w\"},\n      {120, \"x\"}, {121, \"y\"}, {122, \"z\"}, {123, \"{\"}, {124, \"|\"}, {125, \"}\"},\n      {126, \"~\"}, {127, \"Ġ\"}, {128, \"ġ\"}, {129, \"Ģ\"}, {130, \"ģ\"}, {131, \"Ĥ\"},\n      {132, \"ĥ\"}, {133, \"Ħ\"}, {134, \"ħ\"}, {135, \"Ĩ\"}, {136, \"ĩ\"}, {137, \"Ī\"},\n      {138, \"ī\"}, {139, \"Ĭ\"}, {140, \"ĭ\"}, {141, \"Į\"}, {142, \"į\"}, {143, \"İ\"},\n      {144, \"ı\"}, {145, \"Ĵ\"}, {146, \"ĵ\"}, {147, \"Ķ\"}, {148, \"ķ\"}, {149, \"ĸ\"},\n      {150, \"Ĺ\"}, {151, \"ĺ\"}, {152, \"Ļ\"}, {153, \"ļ\"}, {154, \"Ľ\"}, {155, \"ľ\"},\n      {156, \"Ł\"}, {157, \"ł\"}, {158, \"Ń\"}, {159, \"ń\"}, {160, \"Ņ\"}, {161, \"ņ\"},\n      {162, \"Ň\"}, {163, \"ň\"}, {164, \"Ŋ\"}, {165, \"ŋ\"}, {166, \"Ō\"}, {167, \"ō\"},\n      {168, \"Ŏ\"}, {169, \"ŏ\"}, {170, \"Ő\"}, {171, \"ő\"}, {172, \"Œ\"}, {173, \"œ\"},\n      {174, \"Ŕ\"}, {175, \"ŕ\"}, {176, \"Ŗ\"}, {177, \"ŗ\"}, {178, \"Ř\"}, {179, \"ř\"},\n      {180, \"Ś\"}, {181, \"ś\"}, {182, \"Ŝ\"}, {183, \"ŝ\"}, {184, \"Ş\"}, {185, \"ş\"},\n      {186, \"Š\"}, {187, \"š\"}, {188, \"Ţ\"}, {189, \"ţ\"}, {190, \"Ť\"}, {191, \"ť\"},\n      {192, \"Ŧ\"}, {193, \"ŧ\"}, {194, \"Ũ\"}, {195, \"ũ\"}, {196, \"Ū\"}, {197, \"ū\"},\n      {198, \"Ŭ\"}, {199, \"ŭ\"}, {200, \"Ů\"}, {201, \"ů\"}, {202, \"Ű\"}, {203, \"ű\"},\n      {204, \"Ų\"}, {205, \"ų\"}, {206, \"Ŵ\"}, {207, \"ŵ\"}, {208, \"Ŷ\"}, {209, \"ŷ\"},\n      {210, \"Ÿ\"}, {211, \"Ź\"}, {212, \"ź\"}, {213, \"Ż\"}, {214, \"ż\"}, {215, \"Ž\"},\n      {216, \"ž\"}, {217, \"ƀ\"}, {218, \"Ɓ\"}, {219, \"Ƃ\"}, {220, \"ƃ\"}, {221, \"Ƅ\"},\n      {222, \"ƅ\"}, {223, \"Ɔ\"}, {224, \"Ƈ\"}, {225, \"ƈ\"}, {226, \"Ɖ\"}, {227, \"Ɗ\"},\n      {228, \"Ƌ\"}, {229, \"ƌ\"}, {230, \"ƍ\"}, {231, \"Ǝ\"}, {232, \"Ə\"}, {233, \"Ɛ\"},\n      {234, \"Ƒ\"}, {235, 
\"ƒ\"}, {236, \"Ɠ\"}, {237, \"Ɣ\"}, {238, \"ƕ\"}, {239, \"Ɩ\"},\n      {240, \"Ɨ\"}, {241, \"Ƙ\"}, {242, \"ƙ\"}, {243, \"ƚ\"}, {244, \"ƛ\"}, {245, \"Ɯ\"},\n      {246, \"Ɲ\"}, {247, \"ƞ\"}, {248, \"Ɵ\"}, {249, \"Ơ\"}, {250, \"ơ\"}, {251, \"Ƣ\"},\n      {252, \"ƣ\"}, {253, \"Ƥ\"}, {254, \"ƥ\"}, {255, \"Ʀ\"},\n  };\n\n  return table;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/bbpe.h",
    "content": "// sherpa-onnx/csrc/bbpe.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_BBPE_H_\n#define SHERPA_ONNX_CSRC_BBPE_H_\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n\n// It is equivalent to the map BCHAR_TO_BYTE\n// from\n// https://github.com/k2-fsa/icefall/blob/master/icefall/byte_utils.py#L280\nconst std::unordered_map<std::string, uint8_t> &GetByteBpeTable();\n\nconst std::unordered_map<uint8_t, std::string> &GetByteBpeTableId2Token();\n\n#endif  // SHERPA_ONNX_CSRC_BBPE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/cat-test.cc",
    "content": "// sherpa-onnx/csrc/cat-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/cat.h\"\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Cat, Test1DTensors) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 1> a_shape{3};\n  std::array<int64_t, 1> b_shape{6};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 0);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0]]);\n  }\n\n  Print1D(&a);\n  Print1D(&b);\n  Print1D(&ans);\n}\n\nTEST(Cat, Test2DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 2> a_shape{2, 3};\n  std::array<int64_t, 2> b_shape{4, 3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    pa[i] = 
i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 0);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0] * a_shape[1]]);\n  }\n\n  Print2D(&a);\n  Print2D(&b);\n  Print2D(&ans);\n}\n\nTEST(Cat, Test2DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 2> a_shape{4, 3};\n  std::array<int64_t, 2> b_shape{4, 2};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 1);\n\n  const float *pans = ans.GetTensorData<float>();\n\n  for (int32_t r = 0; r != static_cast<int32_t>(a_shape[0]); ++r) {\n    for (int32_t i = 0; i != static_cast<int32_t>(a_shape[1]);\n         ++i, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t i = 0; i != static_cast<int32_t>(b_shape[1]);\n         ++i, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n\n  Print2D(&a);\n  Print2D(&b);\n  Print2D(&ans);\n}\n\nTEST(Cat, Test3DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 3, 2};\n  std::array<int64_t, 3> b_shape{4, 3, 2};\n\n  Ort::Value a = 
Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 0);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0] * a_shape[1] * a_shape[2]]);\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print3D(&ans);\n}\n\nTEST(Cat, Test3DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 2, 3};\n  std::array<int64_t, 3> b_shape{2, 4, 3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 1);\n\n  const float *pans = 
ans.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    for (int32_t k = 0; k != static_cast<int32_t>(a_shape[1] * a_shape[2]);\n         ++k, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t k = 0; k != static_cast<int32_t>(b_shape[1] * b_shape[2]);\n         ++k, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print3D(&ans);\n}\n\nTEST(Cat, Test3DTensorsDim2) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 3, 4};\n  std::array<int64_t, 3> b_shape{2, 3, 5};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Cat(allocator, {&a, &b}, 2);\n\n  const float *pans = ans.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    for (int32_t k = 0; k != static_cast<int32_t>(a_shape[2]);\n         ++k, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t k = 0; k != static_cast<int32_t>(b_shape[2]);\n         ++k, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print3D(&ans);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/cat.cc",
    "content": "// sherpa-onnx/csrc/cat.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/cat.h\"\n\n#include <algorithm>\n#include <functional>\n#include <numeric>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic bool Compare(const std::vector<int64_t> &a,\n                    const std::vector<int64_t> &b, int32_t skip_dim) {\n  if (a.size() != b.size()) return false;\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a.size()); ++i) {\n    if (i == skip_dim) continue;\n\n    if (a[i] != b[i]) return false;\n  }\n\n  return true;\n}\n\nstatic void PrintShape(const std::vector<int64_t> &a) {\n  std::ostringstream os;\n  for (auto i : a) {\n    os << i << \" \";\n  }\n  os << \"\\n\";\n  SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n}\n\ntemplate <typename T /*=float*/>\nOrt::Value Cat(OrtAllocator *allocator,\n               const std::vector<const Ort::Value *> &values, int32_t dim) {\n  if (values.size() == 1u) {\n    return Clone(allocator, values[0]);\n  }\n\n  std::vector<int64_t> v0_shape =\n      values[0]->GetTensorTypeAndShapeInfo().GetShape();\n\n  int64_t total_dim = v0_shape[dim];\n\n  for (int32_t i = 1; i != static_cast<int32_t>(values.size()); ++i) {\n    auto s = values[i]->GetTensorTypeAndShapeInfo().GetShape();\n    total_dim += s[dim];\n\n    bool ret = Compare(v0_shape, s, dim);\n    if (!ret) {\n      SHERPA_ONNX_LOGE(\"Incorrect shape in Cat !\\n\");\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor 0: \");\n      PrintShape(v0_shape);\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor %d: \", i);\n      PrintShape(s);\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::vector<int64_t> ans_shape;\n  ans_shape.reserve(v0_shape.size());\n  ans_shape.insert(ans_shape.end(), v0_shape.data(), v0_shape.data() + dim);\n  ans_shape.push_back(total_dim);\n  ans_shape.insert(ans_shape.end(), 
v0_shape.data() + dim + 1,\n                   v0_shape.data() + v0_shape.size());\n\n  auto leading_size = static_cast<int32_t>(std::accumulate(\n      v0_shape.begin(), v0_shape.begin() + dim, 1, std::multiplies<int64_t>()));\n\n  auto trailing_size = static_cast<int32_t>(\n      std::accumulate(v0_shape.begin() + dim + 1, v0_shape.end(), 1,\n                      std::multiplies<int64_t>()));\n\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n  T *dst = ans.GetTensorMutableData<T>();\n\n  for (int32_t i = 0; i != leading_size; ++i) {\n    for (auto value : values) {\n      auto this_dim = value->GetTensorTypeAndShapeInfo().GetShape()[dim];\n      const T *src = value->GetTensorData<T>();\n      src += i * this_dim * trailing_size;\n\n      std::copy(src, src + this_dim * trailing_size, dst);\n      dst += this_dim * trailing_size;\n    }\n  }\n\n  return ans;\n}\n\ntemplate Ort::Value Cat<float>(OrtAllocator *allocator,\n                               const std::vector<const Ort::Value *> &values,\n                               int32_t dim);\n\ntemplate Ort::Value Cat<uint16_t>(OrtAllocator *allocator,\n                                  const std::vector<const Ort::Value *> &values,\n                                  int32_t dim);\n\ntemplate Ort::Value Cat<int64_t>(OrtAllocator *allocator,\n                                 const std::vector<const Ort::Value *> &values,\n                                 int32_t dim);\n\nOrt::Value CatFloat16(OrtAllocator *allocator,\n                      const std::vector<const Ort::Value *> &values,\n                      int32_t dim) {\n  if (values.size() == 1u) {\n    return Clone(allocator, values[0]);\n  }\n\n  std::vector<int64_t> v0_shape =\n      values[0]->GetTensorTypeAndShapeInfo().GetShape();\n\n  int64_t total_dim = v0_shape[dim];\n\n  for (int32_t i = 1; i != static_cast<int32_t>(values.size()); ++i) {\n    auto s = 
values[i]->GetTensorTypeAndShapeInfo().GetShape();\n    total_dim += s[dim];\n\n    bool ret = Compare(v0_shape, s, dim);\n    if (!ret) {\n      SHERPA_ONNX_LOGE(\"Incorrect shape in Cat !\\n\");\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor 0: \");\n      PrintShape(v0_shape);\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor %d: \", i);\n      PrintShape(s);\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::vector<int64_t> ans_shape;\n  ans_shape.reserve(v0_shape.size());\n  ans_shape.insert(ans_shape.end(), v0_shape.data(), v0_shape.data() + dim);\n  ans_shape.push_back(total_dim);\n  ans_shape.insert(ans_shape.end(), v0_shape.data() + dim + 1,\n                   v0_shape.data() + v0_shape.size());\n\n  auto leading_size = static_cast<int32_t>(std::accumulate(\n      v0_shape.begin(), v0_shape.begin() + dim, 1, std::multiplies<int64_t>()));\n\n  auto trailing_size = static_cast<int32_t>(\n      std::accumulate(v0_shape.begin() + dim + 1, v0_shape.end(), 1,\n                      std::multiplies<int64_t>()));\n\n  Ort::Value ans =\n      Ort::Value::CreateTensor(allocator, ans_shape.data(), ans_shape.size(),\n                               ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n  using T = uint16_t;\n\n  T *dst = ans.GetTensorMutableData<T>();\n\n  for (int32_t i = 0; i != leading_size; ++i) {\n    for (auto value : values) {\n      auto this_dim = value->GetTensorTypeAndShapeInfo().GetShape()[dim];\n      const T *src = value->GetTensorData<T>();\n      src += i * this_dim * trailing_size;\n\n      std::copy(src, src + this_dim * trailing_size, dst);\n      dst += this_dim * trailing_size;\n    }\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/cat.h",
    "content": "// sherpa-onnx/csrc/cat.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_CAT_H_\n#define SHERPA_ONNX_CSRC_CAT_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/** Cat a list of tensors along the given dim.\n *\n * @param allocator Allocator to allocate space for the returned tensor\n * @param values  Pointer to a list of tensors. The shape of the tensor must\n *                be the same except on the dim to be concatenated.\n * @param dim  The dim along which to concatenate the input tensors\n *\n * @return Return the concatenated tensor\n */\ntemplate <typename T = float>\nOrt::Value Cat(OrtAllocator *allocator,\n               const std::vector<const Ort::Value *> &values, int32_t dim);\n\nOrt::Value CatFloat16(OrtAllocator *allocator,\n                      const std::vector<const Ort::Value *> &values,\n                      int32_t dim);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_CAT_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/character-lexicon.cc",
    "content": "// sherpa-onnx/csrc/character-lexicon.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/character-lexicon.h\"\n\n#include <algorithm>\n#include <fstream>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass CharacterLexicon::Impl {\n public:\n  Impl(const std::string &lexicon, const std::string &tokens, bool debug)\n      : debug_(debug) {\n    if (lexicon.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide lexicon.txt for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      std::ifstream is(tokens);\n      InitTokens(is);\n    }\n\n    {\n      std::ifstream is(lexicon);\n      InitLexicon(is);\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const std::string &lexicon, const std::string &tokens,\n       bool debug)\n      : debug_(debug) {\n    if (lexicon.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide lexicon.txt for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      auto buf = ReadFile(mgr, tokens);\n      std::istringstream is(std::string(buf.data(), buf.size()));\n\n      InitTokens(is);\n    }\n\n    {\n      auto buf = ReadFile(mgr, lexicon);\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitLexicon(is);\n    }\n  }\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(const std::string &text) const {\n    // see\n    // 
https://github.com/Plachtaa/VITS-fast-fine-tuning/blob/main/text/mandarin.py#L244\n    std::regex punct_re{\"：|、|；\"};\n    std::string s = std::regex_replace(text, punct_re, \"，\");\n\n    std::regex punct_re2(\"[.]\");\n    s = std::regex_replace(s, punct_re2, \"。\");\n\n    std::regex punct_re3(\"[?]\");\n    s = std::regex_replace(s, punct_re3, \"？\");\n\n    std::regex punct_re4(\"[!]\");\n    s = std::regex_replace(s, punct_re4, \"！\");\n\n    std::vector<std::string> words = SplitUtf8(text);\n\n    if (debug_) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"input text:\\n%{public}s\", text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%{public}s\", s.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"input text:\\n%s\", text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%s\", s.c_str());\n#endif\n\n      std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%s\", os.str().c_str());\n#endif\n    }\n\n    // remove spaces after punctuations\n    std::vector<std::string> words2 = std::move(words);\n    words.reserve(words2.size());\n\n    for (int32_t i = 0; i < words2.size(); ++i) {\n      if (i == 0) {\n        words.push_back(std::move(words2[i]));\n      } else if (words2[i] == \" \") {\n        if (words.back() == \" \" || IsPunct(words.back())) {\n          continue;\n        } else {\n          words.push_back(std::move(words2[i]));\n        }\n      } else if (IsPunct(words2[i])) {\n        if (words.back() == \" \" || IsPunct(words.back())) {\n          continue;\n        } else {\n          words.push_back(std::move(words2[i]));\n        }\n      } else {\n        words.push_back(std::move(words2[i]));\n      }\n    }\n\n    if (debug_) {\n     
 std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"after removing spaces after punctuations:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after removing spaces after punctuations:\\n%s\",\n                       os.str().c_str());\n#endif\n    }\n\n    std::vector<TokenIDs> ans;\n    std::vector<int64_t> this_sentence;\n\n    PhraseMatcher matcher(&all_words_, words, debug_);\n\n    for (const std::string &w : matcher) {\n      auto ids = ConvertWordToIds(w);\n      if (ids.empty()) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%{public}s'\", w.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%s'\", w.c_str());\n#endif\n        continue;\n      }\n\n      this_sentence.insert(this_sentence.end(), ids.begin(), ids.end());\n\n      if (IsPunct(w)) {\n        ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n      }\n    }  // for (const std::string &w : matcher)\n\n    if (!this_sentence.empty()) {\n      ans.emplace_back(std::move(this_sentence));\n    }\n\n    return ans;\n  }\n\n private:\n  std::vector<int32_t> ConvertWordToIds(const std::string &w) const {\n    std::vector<int32_t> ans;\n\n    if (word2ids_.count(w)) {\n      ans = word2ids_.at(w);\n    } else if (token2id_.count(w)) {\n      ans = {token2id_.at(w)};\n    } else {\n      std::vector<std::string> words = SplitUtf8(w);\n      for (const auto &word : words) {\n        if (word2ids_.count(word)) {\n          auto ids = ConvertWordToIds(word);\n          ans.insert(ans.end(), ids.begin(), ids.end());\n        }\n      }\n    }\n    if (debug_) {\n      std::ostringstream os;\n      os << w << \": \";\n      for (auto i : ans) {\n        os << id2token_.at(i) << \" \";\n      }\n      os << \"\\n\";\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", 
os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    return ans;\n  }\n\n  void InitTokens(std::istream &is) {\n    token2id_ = ReadTokens(is);\n\n    std::vector<std::pair<std::string, std::string>> puncts = {\n        {\",\", \"，\"}, {\".\", \"。\"}, {\"!\", \"！\"}, {\"?\", \"？\"}, {\":\", \"：\"},\n        {\"\\\"\", \"“\"}, {\"\\\"\", \"”\"}, {\"'\", \"‘\"},  {\"'\", \"’\"},  {\";\", \"；\"},\n    };\n\n    for (const auto &p : puncts) {\n      if (token2id_.count(p.first) && !token2id_.count(p.second)) {\n        token2id_[p.second] = token2id_[p.first];\n      }\n\n      if (!token2id_.count(p.first) && token2id_.count(p.second)) {\n        token2id_[p.first] = token2id_[p.second];\n      }\n    }\n\n    if (!token2id_.count(\"、\") && token2id_.count(\"，\")) {\n      token2id_[\"、\"] = token2id_[\"，\"];\n    }\n\n    if (!token2id_.count(\";\") && token2id_.count(\",\")) {\n      token2id_[\";\"] = token2id_[\",\"];\n    }\n\n    if (debug_) {\n      for (const auto &p : token2id_) {\n        id2token_[p.second] = p.first;\n      }\n    }\n  }\n\n  void InitLexicon(std::istream &is) {\n    std::string word;\n    std::vector<std::string> token_list;\n    std::string line;\n    std::string phone;\n    int32_t line_num = 0;\n\n    while (std::getline(is, line)) {\n      ++line_num;\n      if (line.find_first_not_of(\" \\t\\n\\v\\f\\r\") == std::string::npos) {\n        // Line is empty or only spaces/tabs, skip it\n        continue;\n      }\n\n      std::istringstream iss(line);\n\n      token_list.clear();\n\n      iss >> word;\n      ToLowerCase(&word);\n\n      if (word2ids_.count(word)) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\n            \"Duplicated word: %{public}s at line %{public}d:%{public}s. Ignore \"\n            \"it.\",\n            word.c_str(), line_num, line.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Duplicated word: %s at line %d:%s. 
Ignore it.\",\n                         word.c_str(), line_num, line.c_str());\n#endif\n        continue;\n      }\n\n      while (iss >> phone) {\n        token_list.push_back(std::move(phone));\n      }\n\n      std::vector<int32_t> ids = ConvertTokensToIds(token2id_, token_list);\n      if (ids.empty()) {\n        if (debug_) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"Empty token ids for '%{public}s'\", line.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"Empty token ids for '%s'\", line.c_str());\n#endif\n        }\n        continue;\n      }\n\n      word2ids_.insert({std::move(word), std::move(ids)});\n    }\n\n    for (const auto &[key, _] : word2ids_) {\n      all_words_.insert(key);\n    }\n  }\n\n private:\n  // lexicon.txt is saved in word2ids_\n  std::unordered_map<std::string, std::vector<int32_t>> word2ids_;\n  std::unordered_set<std::string> all_words_;\n\n  // tokens.txt is saved in token2id_\n  std::unordered_map<std::string, int32_t> token2id_;\n\n  std::unordered_map<int32_t, std::string> id2token_;\n\n  bool debug_ = false;\n};\n\nCharacterLexicon::~CharacterLexicon() = default;\n\nCharacterLexicon::CharacterLexicon(const std::string &lexicon,\n                                   const std::string &tokens, bool debug)\n    : impl_(std::make_unique<Impl>(lexicon, tokens, debug)) {}\n\ntemplate <typename Manager>\nCharacterLexicon::CharacterLexicon(Manager *mgr, const std::string &lexicon,\n                                   const std::string &tokens, bool debug)\n    : impl_(std::make_unique<Impl>(mgr, lexicon, tokens, debug)) {}\n\nstd::vector<TokenIDs> CharacterLexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string & /*unused_voice = \"\"*/) const {\n  return impl_->ConvertTextToTokenIds(text);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate CharacterLexicon::CharacterLexicon(AAssetManager *mgr,\n                                            const std::string &lexicon,\n                                            const 
std::string &tokens,\n                                            bool debug);\n#endif\n\n#if __OHOS__\ntemplate CharacterLexicon::CharacterLexicon(NativeResourceManager *mgr,\n                                            const std::string &lexicon,\n                                            const std::string &tokens,\n                                            bool debug);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/character-lexicon.h",
    "content": "// sherpa-onnx/csrc/character-lexicon.h\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_CHARACTER_LEXICON_H_\n#define SHERPA_ONNX_CSRC_CHARACTER_LEXICON_H_\n\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n\nnamespace sherpa_onnx {\n\nclass CharacterLexicon : public OfflineTtsFrontend {\n public:\n  ~CharacterLexicon() override;\n\n  CharacterLexicon(const std::string &lexicon, const std::string &tokens,\n                   bool debug);\n\n  template <typename Manager>\n  CharacterLexicon(Manager *mgr, const std::string &lexicon,\n                   const std::string &tokens, bool debug);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text,\n      const std::string &unused_voice = \"\") const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_CHARACTER_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/circular-buffer-test.cc",
    "content": "// sherpa-onnx/csrc/circular-buffer-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nTEST(CircularBuffer, Push) {\n  CircularBuffer buffer(10);\n  EXPECT_EQ(buffer.Size(), 0);\n  EXPECT_EQ(buffer.Head(), 0);\n  EXPECT_EQ(buffer.Tail(), 0);\n\n  std::vector<float> a = {0, 1, 2, 3, 4, 5};\n  buffer.Push(a.data(), a.size());\n\n  EXPECT_EQ(buffer.Size(), 6);\n  EXPECT_EQ(buffer.Head(), 0);\n  EXPECT_EQ(buffer.Tail(), 6);\n\n  auto c = buffer.Get(0, a.size());\n  EXPECT_EQ(a.size(), c.size());\n  for (int32_t i = 0; i != a.size(); ++i) {\n    EXPECT_EQ(a[i], c[i]);\n  }\n\n  std::vector<float> d = {-6, -7, -8, -9};\n  buffer.Push(d.data(), d.size());\n\n  c = buffer.Get(a.size(), d.size());\n  EXPECT_EQ(d.size(), c.size());\n  for (int32_t i = 0; i != d.size(); ++i) {\n    EXPECT_EQ(d[i], c[i]);\n  }\n}\n\nTEST(CircularBuffer, PushAndPop) {\n  CircularBuffer buffer(5);\n  std::vector<float> a = {0, 1, 2, 3};\n  buffer.Push(a.data(), a.size());\n\n  EXPECT_EQ(buffer.Size(), 4);\n  EXPECT_EQ(buffer.Head(), 0);\n  EXPECT_EQ(buffer.Tail(), 4);\n\n  buffer.Pop(2);\n\n  EXPECT_EQ(buffer.Size(), 2);\n  EXPECT_EQ(buffer.Head(), 2);\n  EXPECT_EQ(buffer.Tail(), 4);\n\n  auto c = buffer.Get(2, 2);\n  EXPECT_EQ(c.size(), 2);\n  EXPECT_EQ(c[0], 2);\n  EXPECT_EQ(c[1], 3);\n\n  a = {10, 20, 30};\n  buffer.Push(a.data(), a.size());\n  EXPECT_EQ(buffer.Size(), 5);\n  EXPECT_EQ(buffer.Head(), 2);\n  EXPECT_EQ(buffer.Tail(), 7);\n\n  c = buffer.Get(2, 5);\n  EXPECT_EQ(c.size(), 5);\n  EXPECT_EQ(c[0], 2);\n  EXPECT_EQ(c[1], 3);\n  EXPECT_EQ(c[2], 10);\n  EXPECT_EQ(c[3], 20);\n  EXPECT_EQ(c[4], 30);\n\n  c = buffer.Get(3, 4);\n  EXPECT_EQ(c.size(), 4);\n  EXPECT_EQ(c[0], 3);\n  EXPECT_EQ(c[1], 10);\n  EXPECT_EQ(c[2], 20);\n  EXPECT_EQ(c[3], 30);\n\n  c = buffer.Get(4, 3);\n  EXPECT_EQ(c.size(), 
3);\n  EXPECT_EQ(c[0], 10);\n  EXPECT_EQ(c[1], 20);\n  EXPECT_EQ(c[2], 30);\n\n  buffer.Pop(4);\n  EXPECT_EQ(buffer.Size(), 1);\n  EXPECT_EQ(buffer.Head(), 6);\n  EXPECT_EQ(buffer.Tail(), 7);\n\n  c = buffer.Get(6, 1);\n  EXPECT_EQ(c.size(), 1);\n  EXPECT_EQ(c[0], 30);\n\n  a = {100, 200, 300, 400};\n  buffer.Push(a.data(), a.size());\n  EXPECT_EQ(buffer.Size(), 5);\n\n  EXPECT_EQ(buffer.Size(), 5);\n  EXPECT_EQ(buffer.Head(), 6);\n  EXPECT_EQ(buffer.Tail(), 11);\n\n  c = buffer.Get(6, 5);\n  EXPECT_EQ(c.size(), 5);\n  EXPECT_EQ(c[0], 30);\n  EXPECT_EQ(c[1], 100);\n  EXPECT_EQ(c[2], 200);\n  EXPECT_EQ(c[3], 300);\n  EXPECT_EQ(c[4], 400);\n\n  buffer.Pop(3);\n  EXPECT_EQ(buffer.Size(), 2);\n  EXPECT_EQ(buffer.Head(), 9);\n  EXPECT_EQ(buffer.Tail(), 11);\n\n  c = buffer.Get(10, 1);\n  EXPECT_EQ(c.size(), 1);\n  EXPECT_EQ(c[0], 400);\n\n  a = {1000, 2000, 3000};\n  buffer.Push(a.data(), a.size());\n\n  EXPECT_EQ(buffer.Size(), 5);\n  EXPECT_EQ(buffer.Head(), 9);\n  EXPECT_EQ(buffer.Tail(), 14);\n\n  buffer.Pop(1);\n\n  EXPECT_EQ(buffer.Size(), 4);\n  EXPECT_EQ(buffer.Head(), 10);\n  EXPECT_EQ(buffer.Tail(), 14);\n\n  a = {4000};\n\n  buffer.Push(a.data(), a.size());\n  EXPECT_EQ(buffer.Size(), 5);\n  EXPECT_EQ(buffer.Head(), 10);\n  EXPECT_EQ(buffer.Tail(), 15);\n\n  c = buffer.Get(13, 2);\n  EXPECT_EQ(c.size(), 2);\n  EXPECT_EQ(c[0], 3000);\n  EXPECT_EQ(c[1], 4000);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/circular-buffer.cc",
    "content": "// sherpa-onnx/csrc/circular-buffer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n\n#include <algorithm>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nCircularBuffer::CircularBuffer(int32_t capacity) {\n  if (capacity <= 0) {\n    SHERPA_ONNX_LOGE(\"Please specify a positive capacity. Given: %d\\n\",\n                     capacity);\n    exit(-1);\n  }\n  buffer_.resize(capacity);\n}\n\nvoid CircularBuffer::Resize(int32_t new_capacity) {\n  int32_t capacity = static_cast<int32_t>(buffer_.size());\n  if (new_capacity <= capacity) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\n        \"new_capacity (%{public}d) <= original capacity (%{public}d). Skip it.\",\n        new_capacity, capacity);\n#else\n    SHERPA_ONNX_LOGE(\"new_capacity (%d) <= original capacity (%d). Skip it.\",\n                     new_capacity, capacity);\n#endif\n    return;\n  }\n\n  int32_t size = Size();\n  if (size == 0) {\n    buffer_.resize(new_capacity);\n    return;\n  }\n\n  std::vector<float> new_buffer(new_capacity);\n  int32_t start = head_ % capacity;\n  int32_t dest = head_ % new_capacity;\n\n  if (start + size <= capacity) {\n    if (dest + size <= new_capacity) {\n      std::copy(buffer_.begin() + start, buffer_.begin() + start + size,\n                new_buffer.begin() + dest);\n    } else {\n      int32_t part1_size = new_capacity - dest;\n\n      // copy [start, start+part1_size] to new_buffer\n      std::copy(buffer_.begin() + start, buffer_.begin() + start + part1_size,\n                new_buffer.begin() + dest);\n\n      // copy [start+part1_size, start+size] to new_buffer\n      std::copy(buffer_.begin() + start + part1_size,\n                buffer_.begin() + start + size, new_buffer.begin());\n    }\n  } else {\n    int32_t part1_size = capacity - start;\n    int32_t part2_size = size - part1_size;\n\n    // copy [start, start+part1_size] to new_buffer\n    if 
(dest + part1_size <= new_capacity) {\n      std::copy(buffer_.begin() + start, buffer_.begin() + start + part1_size,\n                new_buffer.begin() + dest);\n    } else {\n      int32_t first_part = new_capacity - dest;\n      std::copy(buffer_.begin() + start, buffer_.begin() + start + first_part,\n                new_buffer.begin() + dest);\n\n      std::copy(buffer_.begin() + start + first_part,\n                buffer_.begin() + start + part1_size, new_buffer.begin());\n    }\n\n    int32_t new_dest = (dest + part1_size) % new_capacity;\n\n    if (new_dest + part2_size <= new_capacity) {\n      std::copy(buffer_.begin(), buffer_.begin() + part2_size,\n                new_buffer.begin() + new_dest);\n    } else {\n      int32_t first_part = new_capacity - new_dest;\n      std::copy(buffer_.begin(), buffer_.begin() + first_part,\n                new_buffer.begin() + new_dest);\n      std::copy(buffer_.begin() + first_part, buffer_.begin() + part2_size,\n                new_buffer.begin());\n    }\n  }\n  buffer_.swap(new_buffer);\n}\n\nvoid CircularBuffer::Push(const float *p, int32_t n) {\n  int32_t capacity = static_cast<int32_t>(buffer_.size());\n  int32_t size = Size();\n  if (n + size > capacity) {\n    int32_t new_capacity = std::max(capacity * 2, n + size);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\n        \"Overflow! n: %{public}d, size: %{public}d, n+size: %{public}d, \"\n        \"capacity: %{public}d. Increase \"\n        \"capacity to: %{public}d. (Original data is copied. No data loss!)\",\n        n, size, n + size, capacity, new_capacity);\n#else\n    SHERPA_ONNX_LOGE(\n        \"Overflow! n: %d, size: %d, n+size: %d, capacity: %d. Increase \"\n        \"capacity to: %d. (Original data is copied. 
No data loss!)\",\n        n, size, n + size, capacity, new_capacity);\n#endif\n    Resize(new_capacity);\n\n    capacity = new_capacity;\n  }\n\n  int32_t start = tail_ % capacity;\n\n  tail_ += n;\n\n  if (start + n < capacity) {\n    std::copy(p, p + n, buffer_.begin() + start);\n    return;\n  }\n\n  int32_t part1_size = capacity - start;\n\n  std::copy(p, p + part1_size, buffer_.begin() + start);\n\n  std::copy(p + part1_size, p + n, buffer_.begin());\n}\n\nstd::vector<float> CircularBuffer::Get(int32_t start_index, int32_t n) const {\n  if (start_index < head_ || start_index >= tail_) {\n    SHERPA_ONNX_LOGE(\"Invalid start_index: %d. head_: %d, tail_: %d\",\n                     start_index, head_, tail_);\n    return {};\n  }\n\n  int32_t size = Size();\n  if (n < 0 || n > size) {\n    SHERPA_ONNX_LOGE(\"Invalid n: %d. size: %d\", n, size);\n    return {};\n  }\n\n  int32_t capacity = static_cast<int32_t>(buffer_.size());\n\n  if (start_index - head_ + n > size) {\n    SHERPA_ONNX_LOGE(\"Invalid start_index: %d and n: %d. head_: %d, size: %d\",\n                     start_index, n, head_, size);\n    return {};\n  }\n\n  int32_t start = start_index % capacity;\n\n  if (start + n < capacity) {\n    return {buffer_.begin() + start, buffer_.begin() + start + n};\n  }\n\n  std::vector<float> ans(n);\n\n  std::copy(buffer_.begin() + start, buffer_.end(), ans.begin());\n\n  int32_t part1_size = capacity - start;\n  int32_t part2_size = n - part1_size;\n  std::copy(buffer_.begin(), buffer_.begin() + part2_size,\n            ans.begin() + part1_size);\n\n  return ans;\n}\n\nvoid CircularBuffer::Pop(int32_t n) {\n  int32_t size = Size();\n  if (n < 0 || n > size) {\n    SHERPA_ONNX_LOGE(\"Invalid n: %d. size: %d\", n, size);\n    return;\n  }\n\n  head_ += n;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/circular-buffer.h",
    "content": "// sherpa-onnx/csrc/circular-buffer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_CIRCULAR_BUFFER_H_\n#define SHERPA_ONNX_CSRC_CIRCULAR_BUFFER_H_\n\n#include <cstdint>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass CircularBuffer {\n public:\n  // Capacity of this buffer. Should be large enough.\n  // If it is full, we just print a message and exit the program.\n  explicit CircularBuffer(int32_t capacity);\n\n  // Push an array\n  //\n  // @param p Pointer to the start address of the array\n  // @param n Number of elements in the array\n  //\n  // Note: If n + Size() > capacity, we print an error message and exit.\n  void Push(const float *p, int32_t n);\n\n  // @param start_index Should in the range [head_, tail_)\n  // @param n Number of elements to get\n  // @return Return a vector of size n containing the requested elements\n  std::vector<float> Get(int32_t start_index, int32_t n) const;\n\n  // Remove n elements from the buffer\n  //\n  // @param n Should be in the range [0, size_]\n  void Pop(int32_t n);\n\n  // Number of elements in the buffer.\n  int32_t Size() const { return tail_ - head_; }\n\n  // Current position of the head\n  int32_t Head() const { return head_; }\n\n  // Current position of the tail\n  int32_t Tail() const { return tail_; }\n\n  void Reset() {\n    head_ = 0;\n    tail_ = 0;\n  }\n\n  void Resize(int32_t new_capacity);\n\n private:\n  std::vector<float> buffer_;\n\n  int32_t head_ = 0;  // linear index; always increasing; never wraps around\n  int32_t tail_ = 0;  // linear index, always increasing; never wraps around.\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_CIRCULAR_BUFFER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/context-graph-test.cc",
    "content": "// sherpa-onnx/csrc/context-graph-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/context-graph.h\"\n\n#include <chrono>\n#include <cmath>\n#include <map>\n#include <random>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstatic void TestHelper(const std::map<std::string, float> &queries, float score,\n                       bool strict_mode) {\n  std::vector<std::string> contexts_str(\n      {\"S\", \"HE\", \"SHE\", \"SHELL\", \"HIS\", \"HERS\", \"HELLO\", \"THIS\", \"THEM\"});\n  std::vector<std::vector<int32_t>> contexts;\n  std::vector<float> scores;\n  for (int32_t i = 0; i < contexts_str.size(); ++i) {\n    contexts.emplace_back(contexts_str[i].begin(), contexts_str[i].end());\n    scores.push_back(std::round(score / contexts_str[i].size() * 100) / 100);\n  }\n  auto context_graph = ContextGraph(contexts, 1, scores);\n\n  for (const auto &iter : queries) {\n    float total_scores = 0;\n    auto state = context_graph.Root();\n    for (auto q : iter.first) {\n      auto res = context_graph.ForwardOneStep(state, q, strict_mode);\n      total_scores += std::get<0>(res);\n      state = std::get<1>(res);\n    }\n    auto res = context_graph.Finalize(state);\n    EXPECT_EQ(res.second->token, -1);\n    total_scores += res.first;\n    EXPECT_EQ(total_scores, iter.second);\n  }\n}\n\nTEST(ContextGraph, TestBasic) {\n  auto queries = std::map<std::string, float>{\n      {\"HEHERSHE\", 14}, {\"HERSHE\", 12}, {\"HISHE\", 9},\n      {\"SHED\", 6},      {\"SHELF\", 6},   {\"HELL\", 2},\n      {\"HELLO\", 7},     {\"DHRHISQ\", 4}, {\"THEN\", 2}};\n  TestHelper(queries, 0, true);\n}\n\nTEST(ContextGraph, TestBasicNonStrict) {\n  auto queries = std::map<std::string, float>{\n      {\"HEHERSHE\", 7}, {\"HERSHE\", 5}, {\"HISHE\", 5},   {\"SHED\", 3}, {\"SHELF\", 3},\n      {\"HELL\", 2},     {\"HELLO\", 2},  
{\"DHRHISQ\", 3}, {\"THEN\", 2}};\n  TestHelper(queries, 0, false);\n}\n\nTEST(ContextGraph, TestCustomize) {\n  auto queries = std::map<std::string, float>{\n      {\"HEHERSHE\", 35.84}, {\"HERSHE\", 30.84},  {\"HISHE\", 24.18},\n      {\"SHED\", 18.34},     {\"SHELF\", 18.34},   {\"HELL\", 5},\n      {\"HELLO\", 13},       {\"DHRHISQ\", 10.84}, {\"THEN\", 5}};\n  TestHelper(queries, 5, true);\n}\n\nTEST(ContextGraph, TestCustomizeNonStrict) {\n  auto queries = std::map<std::string, float>{\n      {\"HEHERSHE\", 20}, {\"HERSHE\", 15},    {\"HISHE\", 10.84},\n      {\"SHED\", 10},     {\"SHELF\", 10},     {\"HELL\", 5},\n      {\"HELLO\", 5},     {\"DHRHISQ\", 5.84}, {\"THEN\", 5}};\n  TestHelper(queries, 5, false);\n}\n\nTEST(ContextGraph, Benchmark) {\n  std::random_device rd;\n  std::mt19937 mt(rd());\n  std::uniform_int_distribution<int32_t> char_dist(0, 25);\n  std::uniform_int_distribution<int32_t> len_dist(3, 8);\n  for (int32_t num = 10; num <= 10000; num *= 10) {\n    std::vector<std::vector<int32_t>> contexts;\n    for (int32_t i = 0; i < num; ++i) {\n      std::vector<int32_t> tmp;\n      int32_t word_len = len_dist(mt);\n      for (int32_t j = 0; j < word_len; ++j) {\n        tmp.push_back(char_dist(mt));\n      }\n      contexts.push_back(std::move(tmp));\n    }\n    auto start = std::chrono::high_resolution_clock::now();\n    auto context_graph = ContextGraph(contexts, 1);\n    auto stop = std::chrono::high_resolution_clock::now();\n    auto duration =\n        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);\n    SHERPA_ONNX_LOGE(\"Construct context graph for %d item takes %d us.\", num,\n                     static_cast<int32_t>(duration.count()));\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/context-graph.cc",
    "content": "// sherpa-onnx/csrc/context-graph.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/context-graph.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <memory>\n#include <queue>\n#include <string>\n#include <tuple>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\nvoid ContextGraph::Build(const std::vector<std::vector<int32_t>> &token_ids,\n                         const std::vector<float> &scores,\n                         const std::vector<std::string> &phrases,\n                         const std::vector<float> &ac_thresholds) const {\n  if (!scores.empty()) {\n    SHERPA_ONNX_CHECK_EQ(token_ids.size(), scores.size());\n  }\n  if (!phrases.empty()) {\n    SHERPA_ONNX_CHECK_EQ(token_ids.size(), phrases.size());\n  }\n  if (!ac_thresholds.empty()) {\n    SHERPA_ONNX_CHECK_EQ(token_ids.size(), ac_thresholds.size());\n  }\n  for (int32_t i = 0; i < static_cast<int32_t>(token_ids.size()); ++i) {\n    auto node = root_.get();\n    float score = scores.empty() ? 0.0f : scores[i];\n    score = score == 0.0f ? context_score_ : score;\n    float ac_threshold = ac_thresholds.empty() ? 0.0f : ac_thresholds[i];\n    ac_threshold = ac_threshold == 0.0f ? ac_threshold_ : ac_threshold;\n    std::string phrase = phrases.empty() ? std::string() : phrases[i];\n\n    for (int32_t j = 0; j < static_cast<int32_t>(token_ids[i].size()); ++j) {\n      int32_t token = token_ids[i][j];\n      if (0 == node->next.count(token)) {\n        bool is_end = j == (static_cast<int32_t>(token_ids[i].size()) - 1);\n        node->next[token] = std::make_unique<ContextState>(\n            token, score, node->node_score + score,\n            is_end ? node->node_score + score : 0, j + 1,\n            is_end ? ac_threshold : 0.0f, is_end,\n            is_end ? 
phrase : std::string());\n      } else {\n        float token_score = std::max(score, node->next[token]->token_score);\n        node->next[token]->token_score = token_score;\n        float node_score = node->node_score + token_score;\n        node->next[token]->node_score = node_score;\n        bool is_end = (j == static_cast<int32_t>(token_ids[i].size()) - 1) ||\n                      node->next[token]->is_end;\n        node->next[token]->output_score = is_end ? node_score : 0.0f;\n        node->next[token]->is_end = is_end;\n        if (j == static_cast<int32_t>(token_ids[i].size()) - 1) {\n          node->next[token]->phrase = phrase;\n          node->next[token]->ac_threshold = ac_threshold;\n        }\n      }\n      node = node->next[token].get();\n    }\n  }\n  FillFailOutput();\n}\n\nstd::tuple<float, const ContextState *, const ContextState *>\nContextGraph::ForwardOneStep(const ContextState *state, int32_t token,\n                             bool strict_mode /*= true*/) const {\n  const ContextState *node = nullptr;\n  float score = 0;\n  if (1 == state->next.count(token)) {\n    node = state->next.at(token).get();\n    score = node->token_score;\n  } else {\n    node = state->fail;\n    while (0 == node->next.count(token)) {\n      node = node->fail;\n      if (-1 == node->token) break;  // root\n    }\n    if (1 == node->next.count(token)) {\n      node = node->next.at(token).get();\n    }\n    score = node->node_score - state->node_score;\n  }\n\n  if (!node) {\n    SHERPA_ONNX_LOGE(\"Some bad things happened.\");\n    exit(-1);\n  }\n\n  const ContextState *matched_node =\n      node->is_end ? node : (node->output != nullptr ? node->output : nullptr);\n\n  if (!strict_mode && node->output_score != 0) {\n    SHERPA_ONNX_CHECK(nullptr != matched_node);\n    float output_score =\n        node->is_end ? node->node_score\n                     : (node->output != nullptr ? 
node->output->node_score\n                                                : node->node_score);\n    return std::make_tuple(score + output_score - node->node_score, root_.get(),\n                           matched_node);\n  }\n  return std::make_tuple(score + node->output_score, node, matched_node);\n}\n\nstd::pair<float, const ContextState *> ContextGraph::Finalize(\n    const ContextState *state) const {\n  float score = -state->node_score;\n  return std::make_pair(score, root_.get());\n}\n\nstd::pair<bool, const ContextState *> ContextGraph::IsMatched(\n    const ContextState *state) const {\n  bool status = false;\n  const ContextState *node = nullptr;\n  if (state->is_end) {\n    status = true;\n    node = state;\n  } else {\n    if (state->output != nullptr) {\n      status = true;\n      node = state->output;\n    }\n  }\n  return std::make_pair(status, node);\n}\n\nvoid ContextGraph::FillFailOutput() const {\n  std::queue<const ContextState *> node_queue;\n  for (auto &kv : root_->next) {\n    kv.second->fail = root_.get();\n    node_queue.push(kv.second.get());\n  }\n  while (!node_queue.empty()) {\n    auto current_node = node_queue.front();\n    node_queue.pop();\n    for (auto &kv : current_node->next) {\n      auto fail = current_node->fail;\n      if (1 == fail->next.count(kv.first)) {\n        fail = fail->next.at(kv.first).get();\n      } else {\n        fail = fail->fail;\n        while (0 == fail->next.count(kv.first)) {\n          fail = fail->fail;\n          if (-1 == fail->token) break;\n        }\n        if (1 == fail->next.count(kv.first))\n          fail = fail->next.at(kv.first).get();\n      }\n      kv.second->fail = fail;\n      // fill the output arc\n      auto output = fail;\n      while (!output->is_end) {\n        output = output->fail;\n        if (-1 == output->token) {\n          output = nullptr;\n          break;\n        }\n      }\n      kv.second->output = output;\n      kv.second->output_score += output == nullptr ? 
0 : output->output_score;\n      node_queue.push(kv.second.get());\n    }\n  }\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/context-graph.h",
    "content": "// sherpa-onnx/csrc/context-graph.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_CONTEXT_GRAPH_H_\n#define SHERPA_ONNX_CSRC_CONTEXT_GRAPH_H_\n\n#include <memory>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/log.h\"\n\nnamespace sherpa_onnx {\n\nclass ContextGraph;\nusing ContextGraphPtr = std::shared_ptr<ContextGraph>;\n\nstruct ContextState {\n  int32_t token;\n  float token_score;\n  float node_score;\n  float output_score;\n  int32_t level;\n  float ac_threshold;\n  bool is_end;\n  std::string phrase;\n  std::unordered_map<int32_t, std::unique_ptr<ContextState>> next;\n  const ContextState *fail = nullptr;\n  const ContextState *output = nullptr;\n\n  ContextState() = default;\n  ContextState(int32_t token, float token_score, float node_score,\n               float output_score, int32_t level = 0, float ac_threshold = 0.0f,\n               bool is_end = false, const std::string &phrase = {})\n      : token(token),\n        token_score(token_score),\n        node_score(node_score),\n        output_score(output_score),\n        level(level),\n        ac_threshold(ac_threshold),\n        is_end(is_end),\n        phrase(phrase) {}\n};\n\nclass ContextGraph {\n public:\n  ContextGraph() = default;\n  ContextGraph(const std::vector<std::vector<int32_t>> &token_ids,\n               float context_score, float ac_threshold,\n               const std::vector<float> &scores = {},\n               const std::vector<std::string> &phrases = {},\n               const std::vector<float> &ac_thresholds = {})\n      : context_score_(context_score), ac_threshold_(ac_threshold) {\n    root_ = std::make_unique<ContextState>(-1, 0, 0, 0);\n    root_->fail = root_.get();\n    Build(token_ids, scores, phrases, ac_thresholds);\n  }\n\n  ContextGraph(const std::vector<std::vector<int32_t>> &token_ids,\n               float context_score, const 
std::vector<float> &scores = {})\n      : ContextGraph(token_ids, context_score, 0.0f, scores,\n                     std::vector<std::string>(), std::vector<float>()) {}\n\n  std::tuple<float, const ContextState *, const ContextState *> ForwardOneStep(\n      const ContextState *state, int32_t token_id,\n      bool strict_mode = true) const;\n\n  std::pair<bool, const ContextState *> IsMatched(\n      const ContextState *state) const;\n\n  std::pair<float, const ContextState *> Finalize(\n      const ContextState *state) const;\n\n  const ContextState *Root() const { return root_.get(); }\n\n private:\n  float context_score_;\n  float ac_threshold_;\n  std::unique_ptr<ContextState> root_;\n  void Build(const std::vector<std::vector<int32_t>> &token_ids,\n             const std::vector<float> &scores,\n             const std::vector<std::string> &phrases,\n             const std::vector<float> &ac_thresholds) const;\n  void FillFailOutput() const;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_CONTEXT_GRAPH_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/display.h",
    "content": "// sherpa-onnx/csrc/display.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_DISPLAY_H_\n#define SHERPA_ONNX_CSRC_DISPLAY_H_\n#include <stdio.h>\n\n#include <string>\n\nnamespace sherpa_onnx {\n\nclass Display {\n public:\n  explicit Display(int32_t max_word_per_line = 60)\n      : max_word_per_line_(max_word_per_line) {}\n\n  void Print(int32_t segment_id, const std::string &s) {\n#ifdef _MSC_VER\n    if (segment_id != -1) {\n      fprintf(stderr, \"%d:%s\\n\", segment_id, s.c_str());\n    } else {\n      fprintf(stderr, \"%s\\n\", s.c_str());\n    }\n    return;\n#endif\n    if (last_segment_ == segment_id) {\n      Clear();\n    } else {\n      if (last_segment_ != -1) {\n        fprintf(stderr, \"\\n\\r\");\n      }\n      last_segment_ = segment_id;\n      num_previous_lines_ = 0;\n    }\n\n    if (segment_id != -1) {\n      fprintf(stderr, \"\\r%d:\", segment_id);\n    }\n\n    int32_t i = 0;\n    for (size_t n = 0; n < s.size();) {\n      if (s[n] > 0 && s[n] < 0x7f) {\n        fprintf(stderr, \"%c\", s[n]);\n        ++n;\n      } else {\n        // Each Chinese character occupies 3 bytes for UTF-8 encoding.\n        std::string tmp(s.begin() + n, s.begin() + n + 3);\n        fprintf(stderr, \"%s\", tmp.data());\n        n += 3;\n      }\n\n      ++i;\n      if (i >= max_word_per_line_ && n + 1 < s.size() &&\n          (s[n] == ' ' || s[n] < 0)) {\n        fprintf(stderr, \"\\n\\r \");\n        ++num_previous_lines_;\n        i = 0;\n      }\n    }\n  }\n\n private:\n  // Clear the output for the current segment\n  void Clear() {\n    ClearCurrentLine();\n    while (num_previous_lines_ > 0) {\n      GoUpOneLine();\n      ClearCurrentLine();\n      --num_previous_lines_;\n    }\n  }\n\n  // Clear the current line\n  void ClearCurrentLine() const { fprintf(stderr, \"\\33[2K\\r\"); }\n\n  // Move the cursor to the previous line\n  void GoUpOneLine() const { fprintf(stderr, \"\\033[1A\\r\"); }\n\n private:\n  
int32_t max_word_per_line_;\n  int32_t num_previous_lines_ = 0;\n  int32_t last_segment_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_DISPLAY_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/endpoint.cc",
    "content": "// sherpa-onnx/csrc/endpoint.cc\n//\n// Copyright (c)  2022  (authors: Pingfeng Luo)\n//                2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/endpoint.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstatic bool RuleActivated(const EndpointRule &rule,\n                          const std::string &rule_name, float trailing_silence,\n                          float utterance_length) {\n  bool contain_nonsilence = utterance_length > trailing_silence;\n  bool ans = (contain_nonsilence || !rule.must_contain_nonsilence) &&\n             trailing_silence >= rule.min_trailing_silence &&\n             utterance_length >= rule.min_utterance_length;\n  if (ans) {\n    SHERPA_ONNX_LOG(DEBUG) << \"Endpointing rule \" << rule_name << \" activated: \"\n                           << (contain_nonsilence ? \"true\" : \"false\") << ','\n                           << trailing_silence << ',' << utterance_length;\n  }\n  return ans;\n}\n\nstatic void RegisterEndpointRule(ParseOptions *po, EndpointRule *rule,\n                                 const std::string &rule_name) {\n  po->Register(\n      rule_name + \"-must-contain-nonsilence\", &rule->must_contain_nonsilence,\n      \"If True, for this endpointing \" + rule_name +\n          \" to apply there must be nonsilence in the best-path traceback. 
\"\n          \"For decoding, a non-blank token is considered as non-silence\");\n  po->Register(rule_name + \"-min-trailing-silence\", &rule->min_trailing_silence,\n               \"This endpointing \" + rule_name +\n                   \" requires duration of trailing silence in seconds) to \"\n                   \"be >= this value.\");\n  po->Register(rule_name + \"-min-utterance-length\", &rule->min_utterance_length,\n               \"This endpointing \" + rule_name +\n                   \" requires utterance-length (in seconds) to be >= this \"\n                   \"value.\");\n}\n\nstd::string EndpointRule::ToString() const {\n  std::ostringstream os;\n\n  os << \"EndpointRule(\";\n  os << \"must_contain_nonsilence=\"\n     << (must_contain_nonsilence ? \"True\" : \"False\") << \", \";\n  os << \"min_trailing_silence=\" << min_trailing_silence << \", \";\n  os << \"min_utterance_length=\" << min_utterance_length << \")\";\n\n  return os.str();\n}\n\nvoid EndpointConfig::Register(ParseOptions *po) {\n  RegisterEndpointRule(po, &rule1, \"rule1\");\n  RegisterEndpointRule(po, &rule2, \"rule2\");\n  RegisterEndpointRule(po, &rule3, \"rule3\");\n}\n\nstd::string EndpointConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"EndpointConfig(\";\n  os << \"rule1=\" << rule1.ToString() << \", \";\n  os << \"rule2=\" << rule2.ToString() << \", \";\n  os << \"rule3=\" << rule3.ToString() << \")\";\n\n  return os.str();\n}\n\nbool Endpoint::IsEndpoint(int32_t num_frames_decoded,\n                          int32_t trailing_silence_frames,\n                          float frame_shift_in_seconds) const {\n  float utterance_length =\n      static_cast<float>(num_frames_decoded) * frame_shift_in_seconds;\n\n  float trailing_silence =\n      static_cast<float>(trailing_silence_frames) * frame_shift_in_seconds;\n\n  if (RuleActivated(config_.rule1, \"rule1\", trailing_silence,\n                    utterance_length) ||\n      RuleActivated(config_.rule2, \"rule2\", 
trailing_silence,\n                    utterance_length) ||\n      RuleActivated(config_.rule3, \"rule3\", trailing_silence,\n                    utterance_length)) {\n    return true;\n  }\n  return false;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/endpoint.h",
    "content": "// sherpa-onnx/csrc/endpoint.h\n//\n// Copyright (c)  2022  (authors: Pingfeng Luo)\n//                2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ENDPOINT_H_\n#define SHERPA_ONNX_CSRC_ENDPOINT_H_\n\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct EndpointRule {\n  // If True, for this endpointing rule to apply there must\n  // be nonsilence in the best-path traceback.\n  // For decoding, a non-blank token is considered as non-silence\n  bool must_contain_nonsilence = true;\n  // This endpointing rule requires duration of trailing silence\n  // (in seconds) to be >= this value.\n  float min_trailing_silence = 2.0;\n  // This endpointing rule requires utterance-length (in seconds)\n  // to be >= this value.\n  float min_utterance_length = 0.0f;\n\n  EndpointRule() = default;\n\n  EndpointRule(bool must_contain_nonsilence, float min_trailing_silence,\n               float min_utterance_length)\n      : must_contain_nonsilence(must_contain_nonsilence),\n        min_trailing_silence(min_trailing_silence),\n        min_utterance_length(min_utterance_length) {}\n\n  std::string ToString() const;\n};\n\nclass ParseOptions;\n\nstruct EndpointConfig {\n  // For default setting,\n  // rule1 times out after 2.4 seconds of silence, even if we decoded nothing.\n  // rule2 times out after 1.2 seconds of silence after decoding something.\n  // rule3 times out after the utterance is 20 seconds long, regardless of\n  // anything else.\n  EndpointRule rule1;\n  EndpointRule rule2;\n  EndpointRule rule3;\n\n  void Register(ParseOptions *po);\n\n  EndpointConfig()\n      : rule1{false, 2.4, 0}, rule2{true, 1.2, 0}, rule3{false, 0, 20} {}\n\n  EndpointConfig(const EndpointRule &rule1, const EndpointRule &rule2,\n                 const EndpointRule &rule3)\n      : rule1(rule1), rule2(rule2), rule3(rule3) {}\n\n  std::string ToString() const;\n};\n\nclass Endpoint {\n public:\n  explicit Endpoint(const EndpointConfig &config) : 
config_(config) {}\n\n  /// This function returns true if this set of endpointing rules thinks we\n  /// should terminate decoding.\n  bool IsEndpoint(int32_t num_frames_decoded, int32_t trailing_silence_frames,\n                  float frame_shift_in_seconds) const;\n\n private:\n  EndpointConfig config_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ENDPOINT_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/fast-clustering-config.cc",
    "content": "// sherpa-onnx/csrc/fast-clustering-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/fast-clustering-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\nstd::string FastClusteringConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"FastClusteringConfig(\";\n  os << \"num_clusters=\" << num_clusters << \", \";\n  os << \"threshold=\" << threshold << \")\";\n\n  return os.str();\n}\n\nvoid FastClusteringConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"num-clusters\", &num_clusters,\n      \"Number of cluster. If greater than 0, then cluster threshold is \"\n      \"ignored. Please provide it if you know the actual number of \"\n      \"clusters in advance.\");\n\n  po->Register(\"cluster-threshold\", &threshold,\n               \"If num_clusters is not specified, then it specifies the \"\n               \"distance threshold for clustering. smaller value -> more \"\n               \"clusters. larger value -> fewer clusters\");\n}\n\nbool FastClusteringConfig::Validate() const {\n  if (num_clusters < 1 && threshold < 0) {\n    SHERPA_ONNX_LOGE(\"Please provide either num_clusters or threshold\");\n    return false;\n  }\n\n  return true;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/fast-clustering-config.h",
    "content": "// sherpa-onnx/csrc/fast-clustering-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_FAST_CLUSTERING_CONFIG_H_\n#define SHERPA_ONNX_CSRC_FAST_CLUSTERING_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct FastClusteringConfig {\n  // If greater than 0, then threshold is ignored.\n  //\n  // We strongly recommend that you set it if you know the number of clusters\n  // in advance\n  int32_t num_clusters = -1;\n\n  // distance threshold.\n  //\n  // The smaller, the more clusters it will generate.\n  // The larger, the fewer clusters it will generate.\n  float threshold = 0.5;\n\n  FastClusteringConfig() = default;\n\n  FastClusteringConfig(int32_t num_clusters, float threshold)\n      : num_clusters(num_clusters), threshold(threshold) {}\n\n  std::string ToString() const;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_FAST_CLUSTERING_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/fast-clustering-test.cc",
    "content": "// sherpa-onnx/csrc/fast-clustering-test.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/fast-clustering.h\"\n\n#include <iostream>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\nTEST(FastClustering, TestTwoClusters) {\n  std::vector<float> features = {\n      // point 0\n      0.1,\n      0.1,\n      // point 2\n      0.4,\n      -0.5,\n      // point 3\n      0.6,\n      -0.7,\n      // point 1\n      0.2,\n      0.3,\n  };\n\n  FastClusteringConfig config;\n  config.num_clusters = 2;\n\n  FastClustering clustering(config);\n  auto labels = clustering.Cluster(features.data(), 4, 2);\n  int32_t k = 0;\n  for (auto i : labels) {\n    std::cout << \"point \" << k << \": label \" << i << \"\\n\";\n    ++k;\n  }\n}\n\nTEST(FastClustering, TestClusteringWithThreshold) {\n  std::vector<float> features = {\n      // point 0\n      0.1,\n      0.1,\n      // point 2\n      0.4,\n      -0.5,\n      // point 3\n      0.6,\n      -0.7,\n      // point 1\n      0.2,\n      0.3,\n  };\n\n  FastClusteringConfig config;\n  config.threshold = 0.5;\n\n  FastClustering clustering(config);\n  auto labels = clustering.Cluster(features.data(), 4, 2);\n  int32_t k = 0;\n  for (auto i : labels) {\n    std::cout << \"point \" << k << \": label \" << i << \"\\n\";\n    ++k;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/fast-clustering.cc",
    "content": "// sherpa-onnx/csrc/fast-clustering.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/fast-clustering.h\"\n\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"fastcluster-all-in-one.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass FastClustering::Impl {\n public:\n  explicit Impl(const FastClusteringConfig &config) : config_(config) {}\n\n  std::vector<int32_t> Cluster(float *features, int32_t num_rows,\n                               int32_t num_cols) const {\n    if (num_rows <= 0) {\n      return {};\n    }\n\n    if (num_rows == 1) {\n      return {0};\n    }\n\n    Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n        m(features, num_rows, num_cols);\n    m.rowwise().normalize();\n\n    std::vector<double> distance((num_rows * (num_rows - 1)) / 2);\n\n    int32_t k = 0;\n    for (int32_t i = 0; i != num_rows; ++i) {\n      auto v = m.row(i);\n      for (int32_t j = i + 1; j != num_rows; ++j) {\n        double cosine_similarity = v.dot(m.row(j));\n        double consine_dissimilarity = 1 - cosine_similarity;\n\n        if (consine_dissimilarity < 0) {\n          consine_dissimilarity = 0;\n        }\n\n        distance[k] = consine_dissimilarity;\n        ++k;\n      }\n    }\n\n    std::vector<int32_t> merge(2 * (num_rows - 1));\n    std::vector<double> height(num_rows - 1);\n\n    fastclustercpp::hclust_fast(num_rows, distance.data(),\n                                fastclustercpp::HCLUST_METHOD_COMPLETE,\n                                merge.data(), height.data());\n\n    std::vector<int32_t> labels(num_rows);\n    if (config_.num_clusters > 0) {\n      fastclustercpp::cutree_k(num_rows, merge.data(), config_.num_clusters,\n                               labels.data());\n    } else {\n      fastclustercpp::cutree_cdist(num_rows, merge.data(), height.data(),\n                                   config_.threshold, labels.data());\n    }\n\n    return 
labels;\n  }\n\n private:\n  FastClusteringConfig config_;\n};\n\nFastClustering::FastClustering(const FastClusteringConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\nFastClustering::~FastClustering() = default;\n\nstd::vector<int32_t> FastClustering::Cluster(float *features, int32_t num_rows,\n                                             int32_t num_cols) const {\n  return impl_->Cluster(features, num_rows, num_cols);\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/fast-clustering.h",
    "content": "// sherpa-onnx/csrc/fast-clustering.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_FAST_CLUSTERING_H_\n#define SHERPA_ONNX_CSRC_FAST_CLUSTERING_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/fast-clustering-config.h\"\n\nnamespace sherpa_onnx {\n\nclass FastClustering {\n public:\n  explicit FastClustering(const FastClusteringConfig &config);\n  ~FastClustering();\n\n  /**\n   * @param features Pointer to a 2-D feature matrix in row major. Each row\n   *                 is a feature frame. It is changed in-place. We will\n   *                 convert each feature frame to a normalized vector.\n   *                 That is, the L2-norm of each vector will be equal to 1.\n   *                 It uses cosine dissimilarity,\n   *                 which is 1 - (cosine similarity)\n   * @param num_rows Number of feature frames\n   * @param num-cols The feature dimension.\n   *\n   * @return Return a vector of size num_rows. ans[i] contains the label\n   *         for the i-th feature frame, i.e., the i-th row of the feature\n   *         matrix.\n   */\n  std::vector<int32_t> Cluster(float *features, int32_t num_rows,\n                               int32_t num_cols) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_FAST_CLUSTERING_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/features.cc",
    "content": "// sherpa-onnx/csrc/features.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/features.h\"\n\n#include <algorithm>\n#include <memory>\n#include <mutex>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"kaldi-native-fbank/csrc/online-feature.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nvoid FeatureExtractorConfig::Register(ParseOptions *po) {\n  po->Register(\"sample-rate\", &sampling_rate,\n               \"Sampling rate of the input waveform. \"\n               \"Note: You can have a different \"\n               \"sample rate for the input waveform. We will do resampling \"\n               \"inside the feature extractor\");\n\n  po->Register(\"feat-dim\", &feature_dim,\n               \"Feature dimension. Must match the one expected by the model. \"\n               \"Not used by whisper and CED models\");\n\n  po->Register(\"low-freq\", &low_freq, \"Low cutoff frequency for mel bins\");\n\n  po->Register(\"high-freq\", &high_freq,\n               \"High cutoff frequency for mel bins \"\n               \"(if <= 0, offset from Nyquist)\");\n\n  po->Register(\"dither\", &dither,\n               \"Dithering constant (0.0 means no dither). \"\n               \"By default the audio samples are in range [-1,+1], \"\n               \"so 0.00003 is a good value, \"\n               \"equivalent to the default 1.0 from kaldi\");\n}\n\nstd::string FeatureExtractorConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"FeatureExtractorConfig(\";\n  os << \"sampling_rate=\" << sampling_rate << \", \";\n  os << \"feature_dim=\" << feature_dim << \", \";\n  os << \"low_freq=\" << low_freq << \", \";\n  os << \"high_freq=\" << high_freq << \", \";\n  os << \"dither=\" << dither << \", \";\n  os << \"normalize_samples=\" << (normalize_samples ? \"True\" : \"False\") << \", \";\n  os << \"snip_edges=\" << (snip_edges ? 
\"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\nclass FeatureExtractor::Impl {\n public:\n  explicit Impl(const FeatureExtractorConfig &config) : config_(config) {\n    if (config_.is_mfcc) {\n      InitMfcc();\n    } else if (config_.is_whisper) {\n      InitWhisper();\n    } else if (config_.is_t_one) {\n      InitRawAudioSamples();\n    } else {\n      InitFbank();\n    }\n  }\n\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform, int32_t n) {\n    if (config_.normalize_samples) {\n      AcceptWaveformImpl(sampling_rate, waveform, n);\n    } else {\n      std::vector<float> buf(n);\n      for (int32_t i = 0; i != n; ++i) {\n        buf[i] = waveform[i] * 32768;\n      }\n      AcceptWaveformImpl(sampling_rate, buf.data(), n);\n    }\n  }\n\n  void AcceptWaveformImpl(int32_t sampling_rate, const float *waveform,\n                          int32_t n) {\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    if (resampler_) {\n      if (sampling_rate != resampler_->GetInputSamplingRate()) {\n        SHERPA_ONNX_LOGE(\n            \"You changed the input sampling rate!! 
Expected: %d, given: \"\n            \"%d\",\n            resampler_->GetInputSamplingRate(), sampling_rate);\n        exit(-1);\n      }\n\n      std::vector<float> samples;\n      resampler_->Resample(waveform, n, false, &samples);\n\n      AcceptWaveformWrapper(config_.sampling_rate, samples.data(),\n                            samples.size());\n      return;\n    }\n\n    if (sampling_rate != config_.sampling_rate) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\\n\",\n          sampling_rate, static_cast<int32_t>(config_.sampling_rate));\n\n      float min_freq = std::min<int32_t>(sampling_rate, config_.sampling_rate);\n      float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      resampler_ = std::make_unique<LinearResample>(\n          sampling_rate, config_.sampling_rate, lowpass_cutoff,\n          lowpass_filter_width);\n\n      std::vector<float> samples;\n      resampler_->Resample(waveform, n, false, &samples);\n\n      AcceptWaveformWrapper(config_.sampling_rate, samples.data(),\n                            samples.size());\n\n      return;\n    }\n\n    AcceptWaveformWrapper(sampling_rate, waveform, n);\n  }\n\n  void InputFinished() const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    if (fbank_) {\n      fbank_->InputFinished();\n      return;\n    } else if (whisper_fbank_) {\n      whisper_fbank_->InputFinished();\n      return;\n    } else if (raw_audio_) {\n      raw_audio_->InputFinished();\n      return;\n    } else if (mfcc_) {\n      mfcc_->InputFinished();\n      return;\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  int32_t NumFramesReady() const {\n    if (fbank_) {\n      return fbank_->NumFramesReady();\n    } else if (whisper_fbank_) {\n      return whisper_fbank_->NumFramesReady();\n    } else if (raw_audio_) {\n      return 
raw_audio_->NumFramesReady();\n    } else if (mfcc_) {\n      return mfcc_->NumFramesReady();\n    }\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n    return -1;\n  }\n\n  bool IsLastFrame(int32_t frame) const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    if (fbank_) {\n      return fbank_->IsLastFrame(frame);\n    } else if (whisper_fbank_) {\n      return whisper_fbank_->IsLastFrame(frame);\n    } else if (raw_audio_) {\n      return raw_audio_->IsLastFrame(frame);\n    } else if (mfcc_) {\n      return mfcc_->IsLastFrame(frame);\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n    return false;\n  }\n\n  std::vector<float> GetFrames(int32_t frame_index, int32_t n) {\n    std::lock_guard<std::mutex> lock(mutex_);\n    if (frame_index + n > NumFramesReady()) {\n      SHERPA_ONNX_LOGE(\"%d + %d > %d\\n\", frame_index, n, NumFramesReady());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    int32_t discard_num = frame_index - last_frame_index_;\n    if (discard_num < 0) {\n      SHERPA_ONNX_LOGE(\"last_frame_index_: %d, frame_index_: %d\",\n                       last_frame_index_, frame_index);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    PopWrapper(discard_num);\n\n    int32_t feature_dim = FeatureDim();\n    std::vector<float> features(feature_dim * n);\n\n    float *p = features.data();\n\n    for (int32_t i = 0; i != n; ++i) {\n      const float *f = GetFrameWrapper(i + frame_index);\n      std::copy(f, f + feature_dim, p);\n      p += feature_dim;\n    }\n\n    last_frame_index_ = frame_index;\n\n    return features;\n  }\n\n  int32_t FeatureDim() const {\n    if (fbank_ || whisper_fbank_) {\n      return opts_.mel_opts.num_bins;\n    } else if (mfcc_) {\n      return mfcc_opts_.num_ceps;\n    } else if (raw_audio_) {\n      return raw_audio_->Dim();\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n    return -1;\n  }\n\n private:\n  void AcceptWaveformWrapper(float 
sampling_rate, const float *waveform,\n                             int32_t n) const {\n    if (fbank_) {\n      fbank_->AcceptWaveform(sampling_rate, waveform, n);\n      return;\n    } else if (whisper_fbank_) {\n      whisper_fbank_->AcceptWaveform(sampling_rate, waveform, n);\n      return;\n    } else if (raw_audio_) {\n      raw_audio_->AcceptWaveform(sampling_rate, waveform, n);\n      return;\n    } else if (mfcc_) {\n      mfcc_->AcceptWaveform(sampling_rate, waveform, n);\n      return;\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  const float *GetFrameWrapper(int32_t frame_index) const {\n    if (fbank_) {\n      return fbank_->GetFrame(frame_index);\n    } else if (whisper_fbank_) {\n      return whisper_fbank_->GetFrame(frame_index);\n    } else if (raw_audio_) {\n      return raw_audio_->GetFrame(frame_index);\n    } else if (mfcc_) {\n      return mfcc_->GetFrame(frame_index);\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n  }\n\n  void PopWrapper(int32_t discard_num) const {\n    if (fbank_) {\n      fbank_->Pop(discard_num);\n      return;\n    } else if (whisper_fbank_) {\n      whisper_fbank_->Pop(discard_num);\n      return;\n    } else if (raw_audio_) {\n      raw_audio_->Pop(discard_num);\n      return;\n    } else if (mfcc_) {\n      mfcc_->Pop(discard_num);\n      return;\n    }\n\n    SHERPA_ONNX_LOGE(\"unreachable code\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  void InitFbank() {\n    opts_.frame_opts.dither = config_.dither;\n    opts_.frame_opts.snip_edges = config_.snip_edges;\n    opts_.frame_opts.samp_freq = config_.sampling_rate;\n    opts_.frame_opts.frame_shift_ms = config_.frame_shift_ms;\n    opts_.frame_opts.frame_length_ms = config_.frame_length_ms;\n    opts_.frame_opts.remove_dc_offset = config_.remove_dc_offset;\n    opts_.frame_opts.preemph_coeff = config_.preemph_coeff;\n    opts_.frame_opts.window_type = 
config_.window_type;\n    opts_.frame_opts.round_to_power_of_two = config_.round_to_power_of_two;\n\n    opts_.mel_opts.num_bins = config_.feature_dim;\n\n    opts_.mel_opts.high_freq = config_.high_freq;\n    opts_.mel_opts.low_freq = config_.low_freq;\n\n    opts_.mel_opts.is_librosa = config_.is_librosa;\n\n    fbank_ = std::make_unique<knf::OnlineFbank>(opts_);\n  }\n\n  void InitMfcc() {\n    mfcc_opts_.frame_opts.dither = config_.dither;\n    mfcc_opts_.frame_opts.snip_edges = config_.snip_edges;\n    mfcc_opts_.frame_opts.samp_freq = config_.sampling_rate;\n    mfcc_opts_.frame_opts.frame_shift_ms = config_.frame_shift_ms;\n    mfcc_opts_.frame_opts.frame_length_ms = config_.frame_length_ms;\n    mfcc_opts_.frame_opts.remove_dc_offset = config_.remove_dc_offset;\n    mfcc_opts_.frame_opts.preemph_coeff = config_.preemph_coeff;\n    mfcc_opts_.frame_opts.window_type = config_.window_type;\n    mfcc_opts_.frame_opts.round_to_power_of_two = config_.round_to_power_of_two;\n\n    mfcc_opts_.mel_opts.num_bins = config_.feature_dim;\n\n    mfcc_opts_.mel_opts.high_freq = config_.high_freq;\n    mfcc_opts_.mel_opts.low_freq = config_.low_freq;\n\n    mfcc_opts_.mel_opts.is_librosa = config_.is_librosa;\n\n    mfcc_opts_.num_ceps = config_.num_ceps;\n    mfcc_opts_.use_energy = config_.use_energy;\n\n    mfcc_ = std::make_unique<knf::OnlineMfcc>(mfcc_opts_);\n  }\n\n  void InitWhisper() {\n    config_.normalize_samples = true;\n    opts_.frame_opts.samp_freq = 16000;\n    opts_.mel_opts.num_bins = config_.feature_dim;\n\n    knf::WhisperFeatureOptions whisper_opts;\n    whisper_opts.frame_opts = opts_.frame_opts;\n    whisper_opts.dim = config_.feature_dim;\n\n    whisper_fbank_ = std::make_unique<knf::OnlineWhisperFbank>(whisper_opts);\n    config_.sampling_rate = opts_.frame_opts.samp_freq;\n  }\n\n  void InitRawAudioSamples() {\n    opts_raw_audio_.frame_opts.samp_freq = config_.sampling_rate;\n    opts_raw_audio_.frame_opts.frame_length_ms = 
config_.frame_length_ms;\n    opts_raw_audio_.frame_opts.frame_shift_ms = config_.frame_shift_ms;\n\n    raw_audio_ = std::make_unique<knf::OnlineRawAudioSamples>(opts_raw_audio_);\n  }\n\n private:\n  std::unique_ptr<knf::OnlineFbank> fbank_;\n  std::unique_ptr<knf::OnlineMfcc> mfcc_;\n  std::unique_ptr<knf::OnlineWhisperFbank> whisper_fbank_;\n  std::unique_ptr<knf::OnlineRawAudioSamples> raw_audio_;\n  knf::FbankOptions opts_;\n  knf::RawAudioSamplesOptions opts_raw_audio_;\n  knf::MfccOptions mfcc_opts_;\n  FeatureExtractorConfig config_;\n  mutable std::mutex mutex_;\n  std::unique_ptr<LinearResample> resampler_;\n  int32_t last_frame_index_ = 0;\n};\n\nFeatureExtractor::FeatureExtractor(const FeatureExtractorConfig &config /*={}*/)\n    : impl_(std::make_unique<Impl>(config)) {}\n\nFeatureExtractor::~FeatureExtractor() = default;\n\nvoid FeatureExtractor::AcceptWaveform(int32_t sampling_rate,\n                                      const float *waveform, int32_t n) const {\n  impl_->AcceptWaveform(sampling_rate, waveform, n);\n}\n\nvoid FeatureExtractor::InputFinished() const { impl_->InputFinished(); }\n\nint32_t FeatureExtractor::NumFramesReady() const {\n  return impl_->NumFramesReady();\n}\n\nbool FeatureExtractor::IsLastFrame(int32_t frame) const {\n  return impl_->IsLastFrame(frame);\n}\n\nstd::vector<float> FeatureExtractor::GetFrames(int32_t frame_index,\n                                               int32_t n) const {\n  return impl_->GetFrames(frame_index, n);\n}\n\nint32_t FeatureExtractor::FeatureDim() const { return impl_->FeatureDim(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/features.h",
    "content": "// sherpa-onnx/csrc/features.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_FEATURES_H_\n#define SHERPA_ONNX_CSRC_FEATURES_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct FeatureExtractorConfig {\n  // Sampling rate used by the feature extractor. If it is different from\n  // the sampling rate of the input waveform, we will do resampling inside.\n  int32_t sampling_rate = 16000;\n\n  // num_mel_bins\n  //\n  // Note: for mfcc, this value is also for num_mel_bins.\n  // The actual feature dimension is num_ceps\n  int32_t feature_dim = 80;\n\n  // minimal frequency for Mel-filterbank, in Hz\n  float low_freq = 20.0f;\n\n  // maximal frequency of Mel-filterbank\n  // in Hz; negative value is subtracted from Nyquist freq.:\n  // i.e. for sampling_rate 16000 / 2 - 400 = 7600Hz\n  //\n  // Please see\n  // https://github.com/lhotse-speech/lhotse/blob/master/lhotse/features/fbank.py#L27\n  // and\n  // https://github.com/k2-fsa/sherpa-onnx/issues/514\n  float high_freq = -400.0f;\n\n  // dithering constant, useful for signals with hard-zeroes in non-speech parts\n  // this prevents large negative values in log-mel filterbanks\n  //\n  // In k2, audio samples are in range [-1..+1], in kaldi the range was\n  // [-32k..+32k], so the value 0.00003 is equivalent to kaldi default 1.0\n  //\n  float dither = 0.0f;  // dithering disabled by default\n\n  // Set internally by some models, e.g., paraformer sets it to false.\n  // This parameter is not exposed to users from the commandline\n  // If true, the feature extractor expects inputs to be normalized to\n  // the range [-1, 1].\n  // If false, we will multiply the inputs by 32768\n  bool normalize_samples = true;\n\n  bool snip_edges = false;\n  float frame_shift_ms = 10.0f;   // in milliseconds.\n  float frame_length_ms = 25.0f;  // in milliseconds.\n  bool is_librosa = false;\n  
bool remove_dc_offset = true;       // Subtract mean of wave before FFT.\n  float preemph_coeff = 0.97f;        // Preemphasis coefficient.\n  std::string window_type = \"povey\";  // e.g. Hamming window\n\n  // For models from NeMo\n  // This option is not exposed and is set internally when loading models.\n  // Possible values:\n  // - per_feature\n  // - all_features (not implemented yet)\n  // - fixed_mean (not implemented)\n  // - fixed_std (not implemented)\n  // - or just leave it to empty\n  // See\n  // https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/parts/preprocessing/features.py#L59\n  // for details\n  std::string nemo_normalize_type;\n\n  // for MFCC\n  int32_t num_ceps = 13;\n  bool use_energy = true;\n\n  bool is_mfcc = false;\n\n  bool is_whisper = false;\n\n  bool is_t_one = false;\n\n  bool round_to_power_of_two = true;\n\n  std::string ToString() const;\n\n  void Register(ParseOptions *po);\n};\n\nclass FeatureExtractor {\n public:\n  explicit FeatureExtractor(const FeatureExtractorConfig &config = {});\n  ~FeatureExtractor();\n\n  /**\n     @param sampling_rate The sampling_rate of the input waveform. If it does\n                          not equal to  config.sampling_rate, we will do\n                          resampling inside.\n     @param waveform Pointer to a 1-D array of size n. It must be normalized to\n                     the range [-1, 1].\n     @param n Number of entries in waveform\n   */\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform,\n                      int32_t n) const;\n\n  /**\n   * InputFinished() tells the class you won't be providing any\n   * more waveform.  
This will help flush out the last frame or two\n   * of features, in the case where snip-edges == false; it also\n   * affects the return value of IsLastFrame().\n   */\n  void InputFinished() const;\n\n  int32_t NumFramesReady() const;\n\n  /** Note: IsLastFrame() will only ever return true if you have called\n   * InputFinished() (and this frame is the last frame).\n   */\n  bool IsLastFrame(int32_t frame) const;\n\n  /** Get n frames starting from the given frame index.\n   *\n   * @param frame_index  The starting frame index\n   * @param n  Number of frames to get.\n   * @return Return a 2-D tensor of shape (n, feature_dim).\n   *         which is flattened into a 1-D vector (flattened in row major)\n   */\n  std::vector<float> GetFrames(int32_t frame_index, int32_t n) const;\n\n  /// Return feature dim of this extractor\n  int32_t FeatureDim() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_FEATURES_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/file-utils.cc",
    "content": "// sherpa-onnx/csrc/file-utils.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n\n#include <fstream>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#ifdef _WIN32\n#include <windows.h>\n#else\n#include <limits.h>\n#include <stdlib.h>\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nbool FileExists(const std::string &filename) {\n  return std::ifstream(filename).good();\n}\n\nvoid AssertFileExists(const std::string &filename) {\n  if (!FileExists(filename)) {\n    SHERPA_ONNX_LOGE(\"filename '%s' does not exist\", filename.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\nstd::vector<char> ReadFile(const std::string &filename) {\n  std::ifstream file(filename, std::ios::binary | std::ios::ate);\n  if (!file.is_open()) {\n    return {};\n  }\n\n  std::streamsize size = file.tellg();\n  file.seekg(0, std::ios::beg);\n\n  std::vector<char> buffer(size);\n  if (!file.read(buffer.data(), size)) {\n    return {};\n  }\n\n  return buffer;\n}\n\n#if __ANDROID_API__ >= 9\nstd::vector<char> ReadFile(AAssetManager *mgr, const std::string &filename) {\n  if (!filename.empty() && filename[0] == '/') {\n    SHERPA_ONNX_LOGE(\n        \"You are using an absolute path '%s', but assetManager is NOT set to \"\n        \"null.\",\n        filename.c_str());\n\n    SHERPA_ONNX_LOGE(\n        \"Please set assetManager to null when you load model files from the SD \"\n        \"card\");\n\n    SHERPA_ONNX_LOGE(\n        \"See also https://github.com/k2-fsa/sherpa-onnx/issues/2562\");\n  }\n\n  AAsset *asset = AAssetManager_open(mgr, filename.c_str(), AASSET_MODE_BUFFER);\n  if (!asset) {\n    __android_log_print(ANDROID_LOG_FATAL, \"sherpa-onnx\",\n                        \"Read binary file: Load '%s' failed\", filename.c_str());\n    exit(-1);\n  }\n\n  auto p = reinterpret_cast<const char *>(AAsset_getBuffer(asset));\n  size_t asset_length = 
AAsset_getLength(asset);\n\n  std::vector<char> buffer(p, p + asset_length);\n  AAsset_close(asset);\n\n  return buffer;\n}\n#endif\n\n#if __OHOS__\nstd::vector<char> ReadFile(NativeResourceManager *mgr,\n                           const std::string &filename) {\n  std::unique_ptr<RawFile, decltype(&OH_ResourceManager_CloseRawFile)> fp(\n      OH_ResourceManager_OpenRawFile(mgr, filename.c_str()),\n      OH_ResourceManager_CloseRawFile);\n\n  if (!fp) {\n    std::ostringstream os;\n    os << \"Read file '\" << filename << \"' failed.\";\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    return {};\n  }\n\n  auto len = static_cast<int32_t>(OH_ResourceManager_GetRawFileSize(fp.get()));\n\n  std::vector<char> buffer(len);\n\n  int32_t n = OH_ResourceManager_ReadRawFile(fp.get(), buffer.data(), len);\n\n  if (n != len) {\n    std::ostringstream os;\n    os << \"Read file '\" << filename << \"' failed. Number of bytes read: \" << n\n       << \". Expected bytes to read: \" << len;\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    return {};\n  }\n\n  return buffer;\n}\n#endif\n\nstd::string ResolveAbsolutePath(const std::string &path) {\n  if (path.empty()) {\n    return path;\n  }\n\n#ifdef _WIN32\n  // Check if path is already absolute (drive letter or UNC path)\n  if ((path.size() > 1 && path[1] == ':') ||\n      (path.size() > 1 && path[0] == '\\\\' && path[1] == '\\\\')) {\n    return path;\n  }\n\n  char buffer[MAX_PATH];\n  if (GetFullPathNameA(path.c_str(), MAX_PATH, buffer, nullptr)) {\n    return std::string(buffer);\n  }\n\n  return path;  // fallback on failure\n\n#else\n  // POSIX: absolute paths start with '/'\n  if (path[0] == '/') {\n    return path;\n  }\n\n  char buffer[PATH_MAX];\n  if (realpath(path.c_str(), buffer)) {\n    return std::string(buffer);\n  }\n\n  return path;  // fallback on failure\n#endif\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/file-utils.h",
    "content": "// sherpa-onnx/csrc/file-utils.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_FILE_UTILS_H_\n#define SHERPA_ONNX_CSRC_FILE_UTILS_H_\n\n#include <fstream>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\nnamespace sherpa_onnx {\n\n/** Check whether a given path is a file or not\n *\n * @param filename Path to check.\n * @return Return true if the given path is a file; return false otherwise.\n */\nbool FileExists(const std::string &filename);\n\n/** Abort if the file does not exist.\n *\n * @param filename The file to check.\n */\nvoid AssertFileExists(const std::string &filename);\n\nstd::vector<char> ReadFile(const std::string &filename);\n\n#if __ANDROID_API__ >= 9\nstd::vector<char> ReadFile(AAssetManager *mgr, const std::string &filename);\n#endif\n\n#if __OHOS__\nstd::vector<char> ReadFile(NativeResourceManager *mgr,\n                           const std::string &filename);\n#endif\n\nstd::string ResolveAbsolutePath(const std::string &path);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_FILE_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/fst-utils.cc",
    "content": "// sherpa-onnx/csrc/fst-utils.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/fst-utils.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\n// This function is copied from kaldi.\n//\n// @param filename Path to a StdVectorFst or StdConstFst graph\n// @return The caller should free the returned pointer using `delete` to\n//         avoid memory leak.\nfst::Fst<fst::StdArc> *ReadGraph(const std::string &filename) {\n  // read decoding network FST\n  std::ifstream is(filename, std::ios::binary);\n  if (!is.good()) {\n    SHERPA_ONNX_LOGE(\"Could not open decoding-graph FST %s\", filename.c_str());\n  }\n\n  fst::FstHeader hdr;\n  if (!hdr.Read(is, \"<unknown>\")) {\n    SHERPA_ONNX_LOGE(\"Reading FST: error reading FST header.\");\n  }\n\n  if (hdr.ArcType() != fst::StdArc::Type()) {\n    SHERPA_ONNX_LOGE(\"FST with arc type %s not supported\",\n                     hdr.ArcType().c_str());\n  }\n  fst::FstReadOptions ropts(\"<unspecified>\", &hdr);\n\n  fst::Fst<fst::StdArc> *decode_fst = nullptr;\n\n  if (hdr.FstType() == \"vector\") {\n    decode_fst = fst::VectorFst<fst::StdArc>::Read(is, ropts);\n  } else if (hdr.FstType() == \"const\") {\n    decode_fst = fst::ConstFst<fst::StdArc>::Read(is, ropts);\n  } else {\n    SHERPA_ONNX_LOGE(\"Reading FST: unsupported FST type: %s\",\n                     hdr.FstType().c_str());\n  }\n\n  if (decode_fst == nullptr) {  // fst code will warn.\n    SHERPA_ONNX_LOGE(\"Error reading FST (after reading header).\");\n    return nullptr;\n  } else {\n    return decode_fst;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/fst-utils.h",
    "content": "// sherpa-onnx/csrc/fst-utils.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_FST_UTILS_H_\n#define SHERPA_ONNX_CSRC_FST_UTILS_H_\n\n#include <string>\n\n#include \"fst/fstlib.h\"\n\nnamespace sherpa_onnx {\n\nfst::Fst<fst::StdArc> *ReadGraph(const std::string &filename);\n\n}\n\n#endif  // SHERPA_ONNX_CSRC_FST_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/funasr-nano-tokenizer.cc",
    "content": "// sherpa-onnx/csrc/funasr-nano-tokenizer.cc\n//\n// Copyright (c)  2025  zengyw\n\n#include \"sherpa-onnx/csrc/funasr-nano-tokenizer.h\"\n\n#include <algorithm>\n#include <cctype>\n#include <cstdint>\n#include <cstring>\n#include <limits>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nstatic std::string FindTokenizerJson(const std::string &tokenizer_dir) {\n  std::string p = tokenizer_dir + \"/tokenizer.json\";\n  if (FileExists(p)) return p;\n  return \"\";\n}\n\nstatic std::string FindVocabJson(const std::string &tokenizer_dir) {\n  std::string p = tokenizer_dir + \"/vocab.json\";\n  if (FileExists(p)) return p;\n  return \"\";\n}\n\nstatic std::string FindMergesTxt(const std::string &tokenizer_dir) {\n  std::string p = tokenizer_dir + \"/merges.txt\";\n  if (FileExists(p)) return p;\n  return \"\";\n}\n\nstatic std::string LoadBytesFromFile(const std::string &path) {\n  std::vector<char> data = ReadFile(path);\n  if (data.empty()) return \"\";\n  return std::string(data.data(), data.size());\n}\n\n#if __ANDROID_API__ >= 9\nstatic std::string LoadBytesFromFile(AAssetManager *mgr,\n                                     const std::string &path) {\n  std::vector<char> data = ReadFile(mgr, path);\n  if (data.empty()) return \"\";\n  return std::string(data.data(), data.size());\n}\n#endif\n\n#if __OHOS__\nstatic std::string LoadBytesFromFile(NativeResourceManager *mgr,\n                                     const std::string &path) {\n  std::vector<char> data = ReadFile(mgr, path);\n  if (data.empty()) return \"\";\n  return std::string(data.data(), data.size());\n}\n#endif\n\nstatic inline void TrimInPlace(std::string *s) {\n  if (!s) return;\n  auto &x = *s;\n  size_t b = x.find_first_not_of(\" \\t\\r\\n\");\n  if (b == std::string::npos) 
{\n    x.clear();\n    return;\n  }\n  size_t e = x.find_last_not_of(\" \\t\\r\\n\");\n  x = x.substr(b, e - b + 1);\n}\n\nstatic inline void AppendUtf8(uint32_t cp, std::string *out) {\n  if (!out) return;\n  if (cp <= 0x7Fu) {\n    out->push_back(static_cast<char>(cp));\n  } else if (cp <= 0x7FFu) {\n    out->push_back(static_cast<char>(0xC0u | ((cp >> 6) & 0x1Fu)));\n    out->push_back(static_cast<char>(0x80u | (cp & 0x3Fu)));\n  } else if (cp <= 0xFFFFu) {\n    out->push_back(static_cast<char>(0xE0u | ((cp >> 12) & 0x0Fu)));\n    out->push_back(static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu)));\n    out->push_back(static_cast<char>(0x80u | (cp & 0x3Fu)));\n  } else {\n    out->push_back(static_cast<char>(0xF0u | ((cp >> 18) & 0x07u)));\n    out->push_back(static_cast<char>(0x80u | ((cp >> 12) & 0x3Fu)));\n    out->push_back(static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu)));\n    out->push_back(static_cast<char>(0x80u | (cp & 0x3Fu)));\n  }\n}\n\nstatic inline bool Utf8Next(const std::string &s, size_t *i, uint32_t *cp,\n                            size_t *nbytes) {\n  if (!i || !cp || !nbytes) return false;\n  if (*i >= s.size()) return false;\n  const unsigned char c = static_cast<unsigned char>(s[*i]);\n  if (c < 0x80) {\n    *cp = c;\n    *nbytes = 1;\n    return true;\n  }\n  if ((c >> 5) == 0x6) {  // 110xxxxx\n    if (*i + 1 >= s.size()) return false;\n    const unsigned char c1 = static_cast<unsigned char>(s[*i + 1]);\n    if ((c1 >> 6) != 0x2) return false;\n    *cp = ((c & 0x1F) << 6) | (c1 & 0x3F);\n    *nbytes = 2;\n    return true;\n  }\n  if ((c >> 4) == 0xE) {  // 1110xxxx\n    if (*i + 2 >= s.size()) return false;\n    const unsigned char c1 = static_cast<unsigned char>(s[*i + 1]);\n    const unsigned char c2 = static_cast<unsigned char>(s[*i + 2]);\n    if ((c1 >> 6) != 0x2 || (c2 >> 6) != 0x2) return false;\n    *cp = ((c & 0x0F) << 12) | ((c1 & 0x3F) << 6) | (c2 & 0x3F);\n    *nbytes = 3;\n    return true;\n  }\n  if ((c >> 3) == 0x1E) {  // 
11110xxx\n    if (*i + 3 >= s.size()) return false;\n    const unsigned char c1 = static_cast<unsigned char>(s[*i + 1]);\n    const unsigned char c2 = static_cast<unsigned char>(s[*i + 2]);\n    const unsigned char c3 = static_cast<unsigned char>(s[*i + 3]);\n    if ((c1 >> 6) != 0x2 || (c2 >> 6) != 0x2 || (c3 >> 6) != 0x2) return false;\n    *cp = ((c & 0x07) << 18) | ((c1 & 0x3F) << 12) | ((c2 & 0x3F) << 6) |\n          (c3 & 0x3F);\n    *nbytes = 4;\n    return true;\n  }\n  return false;\n}\n\nenum class Utf8ConsumeStatus {\n  kOk = 0,\n  kIncomplete = 1,\n  kInvalid = 2,\n};\n\nstruct Utf8ConsumeResult {\n  std::string prefix;\n  Utf8ConsumeStatus status;\n};\n\nstatic Utf8ConsumeResult ConsumeValidUtf8Prefix(std::string *pending) {\n  Utf8ConsumeResult r;\n  if (!pending || pending->empty()) {\n    r.status = Utf8ConsumeStatus::kOk;\n    return r;\n  }\n\n  const auto is_cont = [](uint8_t b) -> bool { return (b & 0xC0u) == 0x80u; };\n\n  const std::string &s = *pending;\n  const size_t n = s.size();\n\n  size_t i = 0;\n  size_t last_good = 0;\n\n  while (i < n) {\n    uint8_t b0 = static_cast<uint8_t>(s[i]);\n\n    if (b0 < 0x80u) {\n      ++i;\n      last_good = i;\n      continue;\n    }\n\n    size_t need = 0;\n\n    if (b0 >= 0xC2u && b0 <= 0xDFu) {\n      need = 2;\n      if (i + need > n) {\n        r.status = Utf8ConsumeStatus::kIncomplete;\n        break;\n      }\n      uint8_t b1 = static_cast<uint8_t>(s[i + 1]);\n      if (!is_cont(b1)) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n      i += need;\n      last_good = i;\n      continue;\n    }\n\n    if (b0 >= 0xE0u && b0 <= 0xEFu) {\n      need = 3;\n      if (i + need > n) {\n        r.status = Utf8ConsumeStatus::kIncomplete;\n        break;\n      }\n      uint8_t b1 = static_cast<uint8_t>(s[i + 1]);\n      uint8_t b2 = static_cast<uint8_t>(s[i + 2]);\n      if (!is_cont(b1) || !is_cont(b2)) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      
}\n\n      if (b0 == 0xE0u && b1 < 0xA0u) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n      if (b0 == 0xEDu && b1 > 0x9Fu) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n\n      i += need;\n      last_good = i;\n      continue;\n    }\n\n    if (b0 >= 0xF0u && b0 <= 0xF4u) {\n      need = 4;\n      if (i + need > n) {\n        r.status = Utf8ConsumeStatus::kIncomplete;\n        break;\n      }\n      uint8_t b1 = static_cast<uint8_t>(s[i + 1]);\n      uint8_t b2 = static_cast<uint8_t>(s[i + 2]);\n      uint8_t b3 = static_cast<uint8_t>(s[i + 3]);\n      if (!is_cont(b1) || !is_cont(b2) || !is_cont(b3)) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n\n      if (b0 == 0xF0u && b1 < 0x90u) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n      if (b0 == 0xF4u && b1 > 0x8Fu) {\n        r.status = Utf8ConsumeStatus::kInvalid;\n        break;\n      }\n\n      i += need;\n      last_good = i;\n      continue;\n    }\n\n    r.status = Utf8ConsumeStatus::kInvalid;\n    break;\n  }\n\n  if (i == n) {\n    r.status = Utf8ConsumeStatus::kOk;\n    last_good = n;\n  }\n\n  if (last_good > 0) {\n    r.prefix = pending->substr(0, last_good);\n    pending->erase(0, last_good);\n  } else {\n    r.prefix.clear();\n  }\n\n  return r;\n}\n\nstatic inline void ByteLevelDecodeTokenToBytes(\n    const std::string &token,\n    const std::unordered_map<std::string, uint8_t> &unicode_to_byte,\n    std::string *out_bytes) {\n  if (!out_bytes) return;\n\n  size_t i = 0;\n  while (i < token.size()) {\n    size_t t = i;\n    uint32_t cp = 0;\n    size_t n = 0;\n    if (!Utf8Next(token, &t, &cp, &n) || n == 0) {\n      out_bytes->push_back(token[i]);\n      i += 1;\n      continue;\n    }\n    std::string ch = token.substr(i, n);\n    auto it = unicode_to_byte.find(ch);\n    if (it != unicode_to_byte.end()) {\n      out_bytes->push_back(static_cast<char>(it->second));\n    } 
else {\n      out_bytes->append(ch);\n    }\n    i += n;\n  }\n}\n\nstatic inline bool IsNewline(uint32_t cp) { return cp == '\\n' || cp == '\\r'; }\n\nstatic inline bool IsAsciiSpace(uint32_t cp) { return cp == ' '; }\n\nstatic inline bool IsWhitespace(uint32_t cp) {\n  return cp == ' ' || cp == '\\t' || cp == '\\n' || cp == '\\r' || cp == '\\v' ||\n         cp == '\\f';\n}\n\nstatic inline bool IsAsciiAlpha(uint32_t cp) {\n  return (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z');\n}\n\nstatic inline bool IsAsciiDigit(uint32_t cp) {\n  return (cp >= '0' && cp <= '9');\n}\n\n// A light-weight unicode letter/number approximation good enough for\n// Qwen3(English/Chinese/Japanese/Korean + common scripts).\nstatic inline bool IsLetter(uint32_t cp) {\n  if (IsAsciiAlpha(cp)) return true;\n\n  // CJK Unified Ideographs\n  if (cp >= 0x4E00 && cp <= 0x9FFF) return true;\n  // CJK Extension A\n  if (cp >= 0x3400 && cp <= 0x4DBF) return true;\n  // Hiragana/Katakana\n  if (cp >= 0x3040 && cp <= 0x30FF) return true;\n  // Hangul syllables\n  if (cp >= 0xAC00 && cp <= 0xD7AF) return true;\n  // Hangul Jamo\n  if (cp >= 0x1100 && cp <= 0x11FF) return true;\n\n  // Latin-1 Supplement + Latin Extended (covers most European letters)\n  if (cp >= 0x00C0 && cp <= 0x02AF) return true;\n\n  return false;\n}\n\nstatic inline bool IsNumber(uint32_t cp) {\n  if (IsAsciiDigit(cp)) return true;\n  // Fullwidth digits\n  if (cp >= 0xFF10 && cp <= 0xFF19) return true;\n  return false;\n}\n\nclass JsonReader {\n public:\n  explicit JsonReader(const std::string &s) : s_(s), p_(0) {}\n\n  bool SeekToKey(const std::string &key) {\n    std::string needle = \"\\\"\" + key + \"\\\"\";\n    size_t pos = s_.find(needle);\n    if (pos == std::string::npos) return false;\n    p_ = pos + needle.size();\n    return true;\n  }\n\n  void SkipWs() {\n    while (p_ < s_.size()) {\n      char c = s_[p_];\n      if (c == ' ' || c == '\\t' || c == '\\r' || c == '\\n') {\n        ++p_;\n      } else {\n    
    break;\n      }\n    }\n  }\n\n  bool Consume(char c) {\n    SkipWs();\n    if (p_ < s_.size() && s_[p_] == c) {\n      ++p_;\n      return true;\n    }\n    return false;\n  }\n\n  bool Peek(char *c) const {\n    if (!c) return false;\n    size_t q = p_;\n    while (q < s_.size()) {\n      char x = s_[q];\n      if (x == ' ' || x == '\\t' || x == '\\r' || x == '\\n') {\n        ++q;\n        continue;\n      }\n      *c = x;\n      return true;\n    }\n    return false;\n  }\n\n  bool ParseString(std::string *out) {\n    if (!out) return false;\n    SkipWs();\n    if (p_ >= s_.size() || s_[p_] != '\"') return false;\n    ++p_;\n    std::string r;\n    while (p_ < s_.size()) {\n      char c = s_[p_++];\n      if (c == '\"') {\n        *out = std::move(r);\n        return true;\n      }\n      if (c != '\\\\') {\n        r.push_back(c);\n        continue;\n      }\n      if (p_ >= s_.size()) return false;\n      char esc = s_[p_++];\n      switch (esc) {\n        case '\"':\n          r.push_back('\"');\n          break;\n        case '\\\\':\n          r.push_back('\\\\');\n          break;\n        case '/':\n          r.push_back('/');\n          break;\n        case 'b':\n          r.push_back('\\b');\n          break;\n        case 'f':\n          r.push_back('\\f');\n          break;\n        case 'n':\n          r.push_back('\\n');\n          break;\n        case 'r':\n          r.push_back('\\r');\n          break;\n        case 't':\n          r.push_back('\\t');\n          break;\n        case 'u': {\n          if (p_ + 4 > s_.size()) return false;\n          uint32_t u = 0;\n          for (int i = 0; i < 4; ++i) {\n            char h = s_[p_++];\n            u <<= 4;\n            if (h >= '0' && h <= '9')\n              u |= (h - '0');\n            else if (h >= 'a' && h <= 'f')\n              u |= (h - 'a' + 10);\n            else if (h >= 'A' && h <= 'F')\n              u |= (h - 'A' + 10);\n            else\n              return false;\n          
}\n          if (u >= 0xD800 && u <= 0xDBFF) {\n            size_t save = p_;\n            if (p_ + 6 <= s_.size() && s_[p_] == '\\\\' && s_[p_ + 1] == 'u') {\n              p_ += 2;\n              uint32_t v = 0;\n              for (int i = 0; i < 4; ++i) {\n                char h = s_[p_++];\n                v <<= 4;\n                if (h >= '0' && h <= '9')\n                  v |= (h - '0');\n                else if (h >= 'a' && h <= 'f')\n                  v |= (h - 'a' + 10);\n                else if (h >= 'A' && h <= 'F')\n                  v |= (h - 'A' + 10);\n                else\n                  return false;\n              }\n              if (v >= 0xDC00 && v <= 0xDFFF) {\n                uint32_t cp = 0x10000 + (((u - 0xD800) << 10) | (v - 0xDC00));\n                AppendUtf8(cp, &r);\n                break;\n              }\n            }\n            p_ = save;\n          }\n          AppendUtf8(u, &r);\n          break;\n        }\n        default:\n          return false;\n      }\n    }\n    return false;\n  }\n\n  bool ParseBool(bool *out) {\n    if (!out) return false;\n    SkipWs();\n    if (p_ + 4 <= s_.size() && s_.compare(p_, 4, \"true\") == 0) {\n      p_ += 4;\n      *out = true;\n      return true;\n    }\n    if (p_ + 5 <= s_.size() && s_.compare(p_, 5, \"false\") == 0) {\n      p_ += 5;\n      *out = false;\n      return true;\n    }\n    return false;\n  }\n\n  bool ParseInt64(int64_t *out) {\n    if (!out) return false;\n    SkipWs();\n    if (p_ >= s_.size()) return false;\n    bool neg = false;\n    if (s_[p_] == '-') {\n      neg = true;\n      ++p_;\n    }\n    if (p_ >= s_.size() || !std::isdigit(static_cast<unsigned char>(s_[p_]))) {\n      return false;\n    }\n    int64_t v = 0;\n    while (p_ < s_.size() && std::isdigit(static_cast<unsigned char>(s_[p_]))) {\n      int d = s_[p_] - '0';\n      if (v > (std::numeric_limits<int64_t>::max() - d) / 10) return false;\n      v = v * 10 + d;\n      ++p_;\n    }\n    *out = neg ? 
-v : v;\n    return true;\n  }\n\n  bool SkipValue() {\n    SkipWs();\n    if (p_ >= s_.size()) return false;\n    char c = s_[p_];\n    if (c == '\"') {\n      std::string tmp;\n      return ParseString(&tmp);\n    }\n    if (c == '{') return SkipObject();\n    if (c == '[') return SkipArray();\n    if (c == 't' || c == 'f') {\n      bool b = false;\n      return ParseBool(&b);\n    }\n    if (c == 'n') {\n      if (p_ + 4 <= s_.size() && s_.compare(p_, 4, \"null\") == 0) {\n        p_ += 4;\n        return true;\n      }\n      return false;\n    }\n    int64_t v = 0;\n    return ParseInt64(&v);\n  }\n\n private:\n  bool SkipObject() {\n    if (!Consume('{')) return false;\n    SkipWs();\n    if (Consume('}')) return true;\n    while (true) {\n      std::string k;\n      if (!ParseString(&k)) return false;\n      if (!Consume(':')) return false;\n      if (!SkipValue()) return false;\n      SkipWs();\n      if (Consume('}')) return true;\n      if (!Consume(',')) return false;\n    }\n  }\n\n  bool SkipArray() {\n    if (!Consume('[')) return false;\n    SkipWs();\n    if (Consume(']')) return true;\n    while (true) {\n      if (!SkipValue()) return false;\n      SkipWs();\n      if (Consume(']')) return true;\n      if (!Consume(',')) return false;\n    }\n  }\n\n private:\n  const std::string &s_;\n  size_t p_;\n};\n\nnamespace {\nstatic inline int64_t TokenToIdOrDefault(\n    const std::unordered_map<std::string, int32_t> &vocab,\n    const std::string &tok, int64_t def_val) {\n  auto it = vocab.find(tok);\n  if (it == vocab.end()) return def_val;\n  return static_cast<int64_t>(it->second);\n}\n}  // namespace\n\n// Build bytes_to_unicode mapping (ByteLevel encoder/decoder).\nstatic void BuildBytesToUnicode(\n    std::string byte_to_unicode[256],\n    std::unordered_map<std::string, uint8_t> *unicode_to_byte) {\n  std::vector<uint32_t> bs;\n  bs.reserve(256);\n  for (uint32_t c = 33; c <= 126; ++c) bs.push_back(c);\n  for (uint32_t c = 161; c <= 172; ++c) 
bs.push_back(c);\n  for (uint32_t c = 174; c <= 255; ++c) bs.push_back(c);\n\n  std::vector<uint32_t> cs = bs;\n  cs.reserve(256);\n  uint32_t n = 0;\n  auto contains = [&](uint32_t b) -> bool {\n    return std::find(bs.begin(), bs.end(), b) != bs.end();\n  };\n  for (uint32_t b = 0; b <= 255; ++b) {\n    if (!contains(b)) {\n      bs.push_back(b);\n      cs.push_back(256 + n);\n      ++n;\n    }\n  }\n\n  if (unicode_to_byte) unicode_to_byte->clear();\n  for (size_t i = 0; i < bs.size(); ++i) {\n    uint32_t b = bs[i];\n    uint32_t c = cs[i];\n    std::string u;\n    AppendUtf8(c, &u);\n    byte_to_unicode[b] = u;\n    if (unicode_to_byte) {\n      (*unicode_to_byte)[u] = static_cast<uint8_t>(b);\n    }\n  }\n}\n\n// Parse vocab.json: {\"token\": id, ...}\nstatic bool ParseVocabJson(const std::string &blob,\n                           std::unordered_map<std::string, int32_t> *out) {\n  if (!out) return false;\n  out->clear();\n  JsonReader r(blob);\n  r.SkipWs();\n  if (!r.Consume('{')) return false;\n  r.SkipWs();\n  if (r.Consume('}')) return true;\n\n  while (true) {\n    std::string key;\n    if (!r.ParseString(&key)) return false;\n    if (!r.Consume(':')) return false;\n    int64_t id64 = 0;\n    if (!r.ParseInt64(&id64)) return false;\n    if (id64 < 0 || id64 > std::numeric_limits<int32_t>::max()) return false;\n    (*out)[key] = static_cast<int32_t>(id64);\n\n    r.SkipWs();\n    if (r.Consume('}')) return true;\n    if (!r.Consume(',')) return false;\n  }\n}\n\n// Parse merges.txt: each non-comment line: \"left right\"\nstatic bool ParseMergesTxt(const std::string &blob,\n                           std::unordered_map<std::string, int32_t> *out) {\n  if (!out) return false;\n  out->clear();\n  std::istringstream is(blob);\n  std::string line;\n  int32_t rank = 0;\n  while (std::getline(is, line)) {\n    if (line.empty()) continue;\n    if (line.rfind(\"#version\", 0) == 0) continue;\n    std::string left, right;\n    {\n      std::istringstream 
ls(line);\n      if (!(ls >> left >> right)) continue;\n    }\n    std::string key = left;\n    key.push_back('\\t');\n    key.append(right);\n    (*out)[key] = rank++;\n  }\n  return true;\n}\n\nstatic inline bool IsWordChar(uint32_t cp) {\n  return IsLetter(cp) || IsNumber(cp) || cp == '_';\n}\n\n// A manual approximation for Qwen3 tokenizer Split regex.\n// The regex is in tokenizer.json pre_tokenizer Split. We avoid std::regex\n// due to missing \\p{L}/\\p{N} support in libc++/libstdc++ regex.\nstatic std::vector<std::string> SplitByQwen3Pattern(const std::string &text) {\n  std::vector<std::string> out;\n  out.reserve(text.size() / 2 + 1);\n\n  size_t i = 0;\n  while (i < text.size()) {\n    if (text[i] == '\\'') {\n      auto lower = [](char c) -> char {\n        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));\n      };\n      if (i + 1 < text.size()) {\n        char c1 = lower(text[i + 1]);\n        if (c1 == 's' || c1 == 't' || c1 == 'm' || c1 == 'd') {\n          out.push_back(text.substr(i, 2));\n          i += 2;\n          continue;\n        }\n        if (i + 2 < text.size()) {\n          char c2 = lower(text[i + 2]);\n          if (c1 == 'r' && c2 == 'e') {\n            out.push_back(text.substr(i, 3));\n            i += 3;\n            continue;\n          }\n          if (c1 == 'v' && c2 == 'e') {\n            out.push_back(text.substr(i, 3));\n            i += 3;\n            continue;\n          }\n          if (c1 == 'l' && c2 == 'l') {\n            out.push_back(text.substr(i, 3));\n            i += 3;\n            continue;\n          }\n        }\n      }\n    }\n\n    size_t cur = i;\n    uint32_t cp = 0;\n    size_t n = 0;\n    if (!Utf8Next(text, &cur, &cp, &n) || n == 0) {\n      out.push_back(text.substr(i, 1));\n      i += 1;\n      continue;\n    }\n\n    auto peek_next_cp = [&](size_t pos, uint32_t *cp2, size_t *n2) -> bool {\n      size_t t = pos;\n      uint32_t x = 0;\n      size_t nn = 0;\n      if 
(!Utf8Next(text, &t, &x, &nn)) return false;\n      if (cp2) *cp2 = x;\n      if (n2) *n2 = nn;\n      return true;\n    };\n\n    {\n      uint32_t next_cp = 0;\n      size_t next_n = 0;\n      bool has_next = peek_next_cp(i + n, &next_cp, &next_n);\n\n      bool cur_ok_prefix = (!IsNewline(cp) && !IsLetter(cp) && !IsNumber(cp));\n      bool cur_is_letter = IsLetter(cp);\n\n      if (cur_is_letter || (cur_ok_prefix && has_next && IsLetter(next_cp))) {\n        size_t start = i;\n        size_t j = i;\n        if (!cur_is_letter) {\n          j += n;\n          while (j < text.size()) {\n            size_t t = j;\n            uint32_t cpl = 0;\n            size_t nl = 0;\n            if (!Utf8Next(text, &t, &cpl, &nl)) break;\n            if (!IsLetter(cpl)) break;\n            j += nl;\n          }\n        } else {\n          j = i;\n          while (j < text.size()) {\n            size_t t = j;\n            uint32_t cpl = 0;\n            size_t nl = 0;\n            if (!Utf8Next(text, &t, &cpl, &nl)) break;\n            if (!IsLetter(cpl)) break;\n            j += nl;\n          }\n        }\n        out.push_back(text.substr(start, j - start));\n        i = j;\n        continue;\n      }\n    }\n\n    if (IsNumber(cp)) {\n      out.push_back(text.substr(i, n));\n      i += n;\n      continue;\n    }\n\n    {\n      bool starts_with_space_prefix = IsAsciiSpace(cp);\n      size_t start = i;\n      size_t j = i;\n\n      auto is_punct_like = [&](uint32_t x) -> bool {\n        return (!IsWhitespace(x) && !IsLetter(x) && !IsNumber(x));\n      };\n\n      if (starts_with_space_prefix) {\n        uint32_t next_cp = 0;\n        size_t next_n = 0;\n        if (peek_next_cp(i + n, &next_cp, &next_n) && is_punct_like(next_cp)) {\n          j += n;\n          while (j < text.size()) {\n            size_t t = j;\n            uint32_t cx = 0;\n            size_t nx = 0;\n            if (!Utf8Next(text, &t, &cx, &nx)) break;\n            if (!is_punct_like(cx)) break;\n       
     j += nx;\n          }\n          while (j < text.size()) {\n            size_t t = j;\n            uint32_t cx = 0;\n            size_t nx = 0;\n            if (!Utf8Next(text, &t, &cx, &nx)) break;\n            if (!IsNewline(cx)) break;\n            j += nx;\n          }\n          out.push_back(text.substr(start, j - start));\n          i = j;\n          continue;\n        }\n      } else if (is_punct_like(cp)) {\n        while (j < text.size()) {\n          size_t t = j;\n          uint32_t cx = 0;\n          size_t nx = 0;\n          if (!Utf8Next(text, &t, &cx, &nx)) break;\n          if (!is_punct_like(cx)) break;\n          j += nx;\n        }\n        while (j < text.size()) {\n          size_t t = j;\n          uint32_t cx = 0;\n          size_t nx = 0;\n          if (!Utf8Next(text, &t, &cx, &nx)) break;\n          if (!IsNewline(cx)) break;\n          j += nx;\n        }\n        out.push_back(text.substr(start, j - start));\n        i = j;\n        continue;\n      }\n    }\n\n    {\n      if (IsWhitespace(cp)) {\n        size_t start = i;\n        size_t j = i;\n\n        bool saw_newline = false;\n        while (j < text.size()) {\n          size_t t = j;\n          uint32_t cx = 0;\n          size_t nx = 0;\n          if (!Utf8Next(text, &t, &cx, &nx)) break;\n          if (IsNewline(cx)) {\n            saw_newline = true;\n            break;\n          }\n          if (!IsWhitespace(cx)) break;\n          j += nx;\n        }\n\n        if (saw_newline) {\n          while (j < text.size()) {\n            size_t t = j;\n            uint32_t cx = 0;\n            size_t nx = 0;\n            if (!Utf8Next(text, &t, &cx, &nx)) break;\n            if (!IsNewline(cx)) break;\n            j += nx;\n          }\n          out.push_back(text.substr(start, j - start));\n          i = j;\n          continue;\n        }\n      }\n    }\n\n    if (IsWhitespace(cp)) {\n      bool only_ws_to_end = true;\n      size_t j = i;\n      while (j < text.size()) {\n   
     size_t t = j;\n        uint32_t cx = 0;\n        size_t nx = 0;\n        if (!Utf8Next(text, &t, &cx, &nx)) break;\n        if (!IsWhitespace(cx)) {\n          only_ws_to_end = false;\n          break;\n        }\n        j += nx;\n      }\n      if (only_ws_to_end) {\n        out.push_back(text.substr(i));\n        break;\n      }\n    }\n\n    if (IsWhitespace(cp)) {\n      size_t start = i;\n      size_t j = i;\n      while (j < text.size()) {\n        size_t t = j;\n        uint32_t cx = 0;\n        size_t nx = 0;\n        if (!Utf8Next(text, &t, &cx, &nx)) break;\n        if (!IsWhitespace(cx)) break;\n        j += nx;\n      }\n      out.push_back(text.substr(start, j - start));\n      i = j;\n      continue;\n    }\n\n    out.push_back(text.substr(i, n));\n    i += n;\n  }\n\n  return out;\n}\n\nstatic std::vector<std::string> SplitUtf8ToChars(const std::string &s) {\n  std::vector<std::string> out;\n  out.reserve(s.size());\n  size_t i = 0;\n  while (i < s.size()) {\n    size_t t = i;\n    uint32_t cp = 0;\n    size_t n = 0;\n    if (!Utf8Next(s, &t, &cp, &n) || n == 0) {\n      out.push_back(s.substr(i, 1));\n      i += 1;\n      continue;\n    }\n    out.push_back(s.substr(i, n));\n    i += n;\n  }\n  return out;\n}\n\nstatic inline std::string MakeMergeKey(const std::string &a,\n                                       const std::string &b) {\n  std::string k = a;\n  k.push_back('\\t');\n  k.append(b);\n  return k;\n}\n\n}  // namespace\n\n// Parse tokenizer.json added_tokens: extract objects with {id, content, ...}\nstatic bool ParseAddedTokensFromTokenizerJson(\n    const std::string &blob,\n    std::vector<FunASRNanoTokenizer::AddedToken> *out) {\n  if (!out) return false;\n  out->clear();\n\n  JsonReader r(blob);\n  if (!r.SeekToKey(\"added_tokens\")) return true;\n  if (!r.Consume(':')) return false;\n  if (!r.Consume('[')) return false;\n\n  r.SkipWs();\n  if (r.Consume(']')) return true;\n\n  while (true) {\n    if (!r.Consume('{')) return 
false;\n    FunASRNanoTokenizer::AddedToken t;\n\n    r.SkipWs();\n    if (!r.Consume('}')) {\n      while (true) {\n        std::string k;\n        if (!r.ParseString(&k)) return false;\n        if (!r.Consume(':')) return false;\n\n        if (k == \"id\") {\n          int64_t v = 0;\n          if (!r.ParseInt64(&v)) return false;\n          t.id = static_cast<int32_t>(v);\n        } else if (k == \"content\") {\n          if (!r.ParseString(&t.content)) return false;\n        } else if (k == \"single_word\") {\n          if (!r.ParseBool(&t.single_word)) return false;\n        } else if (k == \"lstrip\") {\n          if (!r.ParseBool(&t.lstrip)) return false;\n        } else if (k == \"rstrip\") {\n          if (!r.ParseBool(&t.rstrip)) return false;\n        } else if (k == \"normalized\") {\n          if (!r.ParseBool(&t.normalized)) return false;\n        } else if (k == \"special\") {\n          if (!r.ParseBool(&t.special)) return false;\n        } else {\n          if (!r.SkipValue()) return false;\n        }\n\n        r.SkipWs();\n        if (r.Consume('}')) break;\n        if (!r.Consume(',')) return false;\n      }\n    }\n\n    if (t.id >= 0 && !t.content.empty()) {\n      out->push_back(std::move(t));\n    }\n\n    r.SkipWs();\n    if (r.Consume(']')) return true;\n    if (!r.Consume(',')) return false;\n  }\n}\n\n// Build trie for AddedTokens longest match (byte-wise).\nvoid BuildAddedTokensTrie(\n    const std::vector<FunASRNanoTokenizer::AddedToken> &tokens,\n    std::vector<FunASRNanoTokenizer::TrieNode> *trie) {\n  if (!trie) return;\n  trie->clear();\n  trie->push_back(FunASRNanoTokenizer::TrieNode{});\n  for (int32_t i = 0; i < static_cast<int32_t>(tokens.size()); ++i) {\n    const auto &tok = tokens[i];\n    int32_t node = 0;\n    for (uint8_t b :\n         std::vector<uint8_t>(tok.content.begin(), tok.content.end())) {\n      auto it = (*trie)[node].next.find(b);\n      if (it == (*trie)[node].next.end()) {\n        int32_t new_node = 
static_cast<int32_t>(trie->size());\n        trie->push_back(FunASRNanoTokenizer::TrieNode{});\n        (*trie)[node].next.emplace(b, new_node);\n        node = new_node;\n      } else {\n        node = it->second;\n      }\n    }\n    (*trie)[node].token_index = i;\n  }\n}\n\nstatic void MergeVocabAndAddedTokens(\n    std::unordered_map<std::string, int32_t> *vocab,\n    const std::vector<FunASRNanoTokenizer::AddedToken> &added,\n    std::unordered_set<std::string> *added_contents) {\n  if (!vocab) return;\n  if (added_contents) added_contents->clear();\n\n  int32_t overwritten = 0;\n  for (const auto &t : added) {\n    if (t.id < 0 || t.content.empty()) continue;\n    if (added_contents) added_contents->insert(t.content);\n\n    auto it = vocab->find(t.content);\n    if (it != vocab->end() && it->second != t.id) {\n      ++overwritten;\n    }\n    (*vocab)[t.content] = t.id;\n  }\n\n  if (overwritten > 0) {\n    SHERPA_ONNX_LOGE(\n        \"AddedTokens overwrote %d vocab entries with different ids. 
\"\n        \"This is expected for some tokenizers; keeping added-token ids.\",\n        overwritten);\n  }\n}\n\nvoid BuildIdToToken(const std::unordered_map<std::string, int32_t> &vocab,\n                    const std::unordered_set<std::string> &added_contents,\n                    std::vector<std::string> *id2token) {\n  if (!id2token) return;\n  int32_t max_id = -1;\n  for (const auto &kv : vocab) {\n    max_id = std::max(max_id, kv.second);\n  }\n  if (max_id < 0) {\n    id2token->clear();\n    return;\n  }\n  id2token->assign(static_cast<size_t>(max_id) + 1, std::string{});\n\n  int32_t dup = 0;\n  for (const auto &kv : vocab) {\n    const std::string &tok = kv.first;\n    int32_t id = kv.second;\n    if (id < 0) continue;\n    std::string &slot = (*id2token)[static_cast<size_t>(id)];\n    if (slot.empty()) {\n      slot = tok;\n      continue;\n    }\n    if (slot == tok) continue;\n\n    bool slot_is_added = added_contents.count(slot) > 0;\n    bool tok_is_added = added_contents.count(tok) > 0;\n    if (!slot_is_added && tok_is_added) {\n      slot = tok;\n    }\n    ++dup;\n  }\n\n  if (dup > 0) {\n    SHERPA_ONNX_LOGE(\n        \"Detected %d duplicated id->token collisions while building id2token. 
\"\n        \"Kept added_tokens' string when possible.\",\n        dup);\n  }\n}\n\n// Try to match an AddedToken at byte-position `pos`.\n// Returns (matched_len_bytes, token_index) or (0, -1) if no match.\nstd::pair<int32_t, int32_t> MatchAddedToken(\n    const std::string &text, size_t pos,\n    const std::vector<FunASRNanoTokenizer::TrieNode> &trie) {\n  if (trie.empty()) return {0, -1};\n  int32_t node = 0;\n  int32_t best_idx = -1;\n  int32_t best_len = 0;\n\n  size_t i = pos;\n  while (i < text.size()) {\n    uint8_t b = static_cast<uint8_t>(text[i]);\n    auto it = trie[node].next.find(b);\n    if (it == trie[node].next.end()) break;\n    node = it->second;\n    ++i;\n    if (trie[node].token_index >= 0) {\n      best_idx = trie[node].token_index;\n      best_len = static_cast<int32_t>(i - pos);\n    }\n  }\n  return {best_len, best_idx};\n}\n\nFunASRNanoTokenizer::FunASRNanoTokenizer(const std::string &tokenizer_dir) {\n  Init(tokenizer_dir);\n}\n\n#if __ANDROID_API__ >= 9\nFunASRNanoTokenizer::FunASRNanoTokenizer(AAssetManager *mgr,\n                                         const std::string &tokenizer_dir) {\n  Init(mgr, tokenizer_dir);\n}\n#endif\n\n#if __OHOS__\nFunASRNanoTokenizer::FunASRNanoTokenizer(NativeResourceManager *mgr,\n                                         const std::string &tokenizer_dir) {\n  Init(mgr, tokenizer_dir);\n}\n#endif\n\nvoid FunASRNanoTokenizer::Init(const std::string &tokenizer_dir) {\n  std::string tok_json = FindTokenizerJson(tokenizer_dir);\n  if (tok_json.empty()) {\n    SHERPA_ONNX_LOGE(\"Cannot find tokenizer.json in: %s\",\n                     tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  std::string vocab_json = FindVocabJson(tokenizer_dir);\n  if (vocab_json.empty()) {\n    SHERPA_ONNX_LOGE(\"Cannot find vocab.json in: %s\", tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  std::string merges_txt = FindMergesTxt(tokenizer_dir);\n  if (merges_txt.empty()) {\n    SHERPA_ONNX_LOGE(\"Cannot find 
merges.txt in: %s\", tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  const std::string tok_blob = LoadBytesFromFile(tok_json);\n  const std::string vocab_blob = LoadBytesFromFile(vocab_json);\n  const std::string merges_blob = LoadBytesFromFile(merges_txt);\n\n  if (tok_blob.empty() || vocab_blob.empty() || merges_blob.empty()) {\n    SHERPA_ONNX_LOGE(\"Failed to read tokenizer files from: %s\",\n                     tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  // Build ByteLevel bytes_to_unicode mapping\n  BuildBytesToUnicode(byte_to_unicode_, &unicode_to_byte_);\n\n  if (!ParseVocabJson(vocab_blob, &token2id_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse vocab.json: %s\", vocab_json.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (!ParseMergesTxt(merges_blob, &merges_rank_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse merges.txt: %s\", merges_txt.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (!ParseAddedTokensFromTokenizerJson(tok_blob, &added_tokens_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse added_tokens from tokenizer.json: %s\",\n                     tok_json.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  MergeVocabAndAddedTokens(&token2id_, added_tokens_, &added_token_contents_);\n\n  BuildIdToToken(token2id_, added_token_contents_, &id2token_);\n\n  BuildAddedTokensTrie(added_tokens_, &trie_);\n\n  FinalizeSpecialIds();\n}\n\n#if __ANDROID_API__ >= 9\nvoid FunASRNanoTokenizer::Init(AAssetManager *mgr,\n                               const std::string &tokenizer_dir) {\n  std::string tok_json = tokenizer_dir + \"/tokenizer.json\";\n  std::string vocab_json = tokenizer_dir + \"/vocab.json\";\n  std::string merges_txt = tokenizer_dir + \"/merges.txt\";\n\n  const std::string tok_blob = LoadBytesFromFile(mgr, tok_json);\n  const std::string vocab_blob = LoadBytesFromFile(mgr, vocab_json);\n  const std::string merges_blob = LoadBytesFromFile(mgr, merges_txt);\n\n  if (tok_blob.empty() || vocab_blob.empty() || merges_blob.empty()) {\n  
  SHERPA_ONNX_LOGE(\"Failed to read tokenizer files from assets: %s\",\n                     tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  BuildBytesToUnicode(byte_to_unicode_, &unicode_to_byte_);\n\n  if (!ParseVocabJson(vocab_blob, &token2id_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse vocab.json from assets: %s\",\n                     vocab_json.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (!ParseMergesTxt(merges_blob, &merges_rank_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse merges.txt from assets: %s\",\n                     merges_txt.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (!ParseAddedTokensFromTokenizerJson(tok_blob, &added_tokens_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse added_tokens from assets tokenizer.json\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  MergeVocabAndAddedTokens(&token2id_, added_tokens_, &added_token_contents_);\n  BuildIdToToken(token2id_, added_token_contents_, &id2token_);\n  BuildAddedTokensTrie(added_tokens_, &trie_);\n  FinalizeSpecialIds();\n}\n#endif\n\n#if __OHOS__\nvoid FunASRNanoTokenizer::Init(NativeResourceManager *mgr,\n                               const std::string &tokenizer_dir) {\n  std::string tok_json = tokenizer_dir + \"/tokenizer.json\";\n  std::string vocab_json = tokenizer_dir + \"/vocab.json\";\n  std::string merges_txt = tokenizer_dir + \"/merges.txt\";\n\n  const std::string tok_blob = LoadBytesFromFile(mgr, tok_json);\n  const std::string vocab_blob = LoadBytesFromFile(mgr, vocab_json);\n  const std::string merges_blob = LoadBytesFromFile(mgr, merges_txt);\n\n  if (tok_blob.empty() || vocab_blob.empty() || merges_blob.empty()) {\n    SHERPA_ONNX_LOGE(\"Failed to read tokenizer files from rawfile: %s\",\n                     tokenizer_dir.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  BuildBytesToUnicode(byte_to_unicode_, &unicode_to_byte_);\n\n  if (!ParseVocabJson(vocab_blob, &token2id_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse vocab.json from rawfile: %s\",\n                     
vocab_json.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (!ParseMergesTxt(merges_blob, &merges_rank_)) {\n    SHERPA_ONNX_LOGE(\"Failed to parse merges.txt from rawfile: %s\",\n                     merges_txt.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (!ParseAddedTokensFromTokenizerJson(tok_blob, &added_tokens_)) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to parse added_tokens from rawfile tokenizer.json\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  MergeVocabAndAddedTokens(&token2id_, added_tokens_, &added_token_contents_);\n  BuildIdToToken(token2id_, added_token_contents_, &id2token_);\n  BuildAddedTokensTrie(added_tokens_, &trie_);\n  FinalizeSpecialIds();\n}\n#endif\n\nvoid FunASRNanoTokenizer::FinalizeSpecialIds() {\n  im_end_token_id_ = TokenToIdOrDefault(token2id_, \"<|im_end|>\", 151645);\n  eos_token_id_ = TokenToIdOrDefault(token2id_, \"<|endoftext|>\", -1);\n  if (eos_token_id_ < 0) eos_token_id_ = im_end_token_id_;\n\n  pad_token_id_ = TokenToIdOrDefault(token2id_, \"<|pad|>\", -1);\n  if (pad_token_id_ < 0) pad_token_id_ = eos_token_id_;\n\n  special_ids_.clear();\n  special_ids_.insert(static_cast<int32_t>(eos_token_id_));\n  special_ids_.insert(static_cast<int32_t>(im_end_token_id_));\n  special_ids_.insert(static_cast<int32_t>(pad_token_id_));\n\n  int64_t im_start = TokenToIdOrDefault(token2id_, \"<|im_start|>\", -1);\n  if (im_start >= 0) special_ids_.insert(static_cast<int32_t>(im_start));\n}\n\nstatic inline bool CheckSingleWordBoundary(const std::string &text, size_t pos,\n                                           size_t end) {\n  auto prev_is_word = [&]() -> bool {\n    if (pos == 0) return false;\n    size_t j = pos;\n    while (j > 0 && (static_cast<unsigned char>(text[j - 1]) & 0xC0) == 0x80)\n      --j;\n    if (j == 0) return false;\n    size_t t = j - 1;\n    while (t > 0 && (static_cast<unsigned char>(text[t]) & 0xC0) == 0x80) --t;\n    size_t k = t;\n    uint32_t cp = 0;\n    size_t nb = 0;\n    if (!Utf8Next(text, &k, &cp, &nb)) 
return false;\n    return IsWordChar(cp);\n  };\n\n  auto next_is_word = [&]() -> bool {\n    if (end >= text.size()) return false;\n    size_t k = end;\n    uint32_t cp = 0;\n    size_t nb = 0;\n    if (!Utf8Next(text, &k, &cp, &nb)) return false;\n    return IsWordChar(cp);\n  };\n\n  return !(prev_is_word() || next_is_word());\n}\n\n// ByteLevel encode: map each byte to unicode char (bytes_to_unicode).\nstatic inline std::string ByteLevelEncode(\n    const std::string &token, const std::string byte_to_unicode[256]) {\n  std::string out;\n  out.reserve(token.size() * 2);\n  for (unsigned char b : token) {\n    out.append(byte_to_unicode[b]);\n  }\n  return out;\n}\n\n// BPE encode (with cache): bytelevel_word to merged token strings.\nstatic std::vector<std::string> BpeEncodeWithCache(\n    const std::string &word,\n    const std::unordered_map<std::string, int32_t> &merges_rank,\n    std::unordered_map<std::string, std::vector<std::string>> *cache) {\n  if (!cache) return {};\n  auto it = cache->find(word);\n  if (it != cache->end()) return it->second;\n\n  std::vector<std::string> symbols = SplitUtf8ToChars(word);\n  if (symbols.empty()) {\n    (*cache)[word] = {};\n    return {};\n  }\n  if (symbols.size() == 1) {\n    (*cache)[word] = symbols;\n    return symbols;\n  }\n\n  while (symbols.size() > 1) {\n    int32_t best_rank = std::numeric_limits<int32_t>::max();\n    int32_t best_pos = -1;\n\n    for (int32_t i = 0; i + 1 < static_cast<int32_t>(symbols.size()); ++i) {\n      std::string key = MakeMergeKey(symbols[i], symbols[i + 1]);\n      auto it2 = merges_rank.find(key);\n      if (it2 != merges_rank.end()) {\n        int32_t r = it2->second;\n        if (r < best_rank) {\n          best_rank = r;\n          best_pos = i;\n        }\n      }\n    }\n\n    if (best_pos < 0) break;\n\n    // Merge best pair\n    symbols[best_pos].append(symbols[best_pos + 1]);\n    symbols.erase(symbols.begin() + best_pos + 1);\n  }\n\n  (*cache)[word] = symbols;\n  return 
symbols;\n}\n\nstd::vector<int64_t> FunASRNanoTokenizer::Encode(const std::string &text) {\n  if (token2id_.empty()) {\n    SHERPA_ONNX_LOGE(\"Tokenizer not initialized\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  std::vector<int64_t> out;\n  if (text.empty()) return out;\n\n  size_t pos = 0;\n  size_t last = 0;\n  while (pos < text.size()) {\n    auto m = MatchAddedToken(text, pos, trie_);\n    int32_t mlen = m.first;\n    int32_t tidx = m.second;\n\n    if (mlen > 0 && tidx >= 0) {\n      const auto &tok = added_tokens_[static_cast<size_t>(tidx)];\n\n      if (tok.single_word) {\n        if (!CheckSingleWordBoundary(text, pos, pos + mlen)) {\n          mlen = 0;\n          tidx = -1;\n        }\n      }\n    }\n\n    if (mlen > 0 && tidx >= 0) {\n      if (pos > last) {\n        std::string seg = text.substr(last, pos - last);\n        auto pieces = SplitByQwen3Pattern(seg);\n        for (const auto &p : pieces) {\n          std::string bl = ByteLevelEncode(p, byte_to_unicode_);\n          auto bpe_toks = BpeEncodeWithCache(bl, merges_rank_, &bpe_cache_);\n          for (const auto &bt : bpe_toks) {\n            auto it = token2id_.find(bt);\n            if (it == token2id_.end()) {\n              continue;\n            }\n            out.push_back(static_cast<int64_t>(it->second));\n          }\n        }\n      }\n\n      const auto &atok = added_tokens_[static_cast<size_t>(tidx)];\n      out.push_back(static_cast<int64_t>(atok.id));\n\n      pos += static_cast<size_t>(mlen);\n      last = pos;\n      continue;\n    }\n\n    ++pos;\n  }\n\n  if (last < text.size()) {\n    std::string seg = text.substr(last);\n    auto pieces = SplitByQwen3Pattern(seg);\n    for (const auto &p : pieces) {\n      std::string bl = ByteLevelEncode(p, byte_to_unicode_);\n      auto bpe_toks = BpeEncodeWithCache(bl, merges_rank_, &bpe_cache_);\n      for (const auto &bt : bpe_toks) {\n        auto it = token2id_.find(bt);\n        if (it == token2id_.end()) continue;\n        
out.push_back(static_cast<int64_t>(it->second));\n      }\n    }\n  }\n\n  return out;\n}\n\nstd::string FunASRNanoTokenizer::GetTokenStringStreaming(\n    int64_t token_id, std::string *pending_bytes) const {\n  if (!pending_bytes) return \"\";\n\n  if (id2token_.empty()) {\n    SHERPA_ONNX_LOGE(\"Tokenizer not initialized\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  int32_t id = static_cast<int32_t>(token_id);\n  if (id < 0 || static_cast<size_t>(id) >= id2token_.size()) return \"\";\n\n  if (!special_ids_.empty() && special_ids_.count(id)) return \"\";\n\n  const std::string &token = id2token_[static_cast<size_t>(id)];\n  if (token.empty()) return \"\";\n\n  ByteLevelDecodeTokenToBytes(token, unicode_to_byte_, pending_bytes);\n\n  std::string out;\n\n  while (!pending_bytes->empty()) {\n    Utf8ConsumeResult c = ConsumeValidUtf8Prefix(pending_bytes);\n    out.append(c.prefix);\n\n    if (c.status == Utf8ConsumeStatus::kOk) {\n      break;\n    }\n\n    if (c.status == Utf8ConsumeStatus::kIncomplete) {\n      break;\n    }\n\n    if (c.status == Utf8ConsumeStatus::kInvalid) {\n      if (!pending_bytes->empty()) {\n        pending_bytes->erase(0, 1);\n      }\n      out.append(\"\\xEF\\xBF\\xBD\");\n      continue;\n    }\n  }\n\n  return out;\n}\n\nstd::string FunASRNanoTokenizer::Decode(const std::vector<int64_t> &token_ids) {\n  if (id2token_.empty()) {\n    SHERPA_ONNX_LOGE(\"Tokenizer not initialized\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (token_ids.empty()) return \"\";\n\n  std::vector<std::string> toks;\n  toks.reserve(token_ids.size());\n  for (int64_t v : token_ids) {\n    if (v < 0) continue;\n    if (v > static_cast<int64_t>(std::numeric_limits<int32_t>::max())) continue;\n    int32_t id = static_cast<int32_t>(v);\n    if (!special_ids_.empty() && special_ids_.count(id)) continue;\n    if (id < 0 || static_cast<size_t>(id) >= id2token_.size()) continue;\n    const std::string &t = id2token_[static_cast<size_t>(id)];\n    if (!t.empty()) 
toks.push_back(t);\n  }\n\n  std::string merged;\n  {\n    size_t total = 0;\n    for (const auto &t : toks) total += t.size();\n    merged.reserve(total);\n    for (const auto &t : toks) merged.append(t);\n  }\n\n  std::vector<uint8_t> bytes;\n  bytes.reserve(merged.size());\n\n  size_t i = 0;\n  while (i < merged.size()) {\n    size_t t = i;\n    uint32_t cp = 0;\n    size_t n = 0;\n    if (!Utf8Next(merged, &t, &cp, &n) || n == 0) {\n      bytes.push_back(static_cast<uint8_t>(merged[i]));\n      i += 1;\n      continue;\n    }\n    std::string ch = merged.substr(i, n);\n    auto it = unicode_to_byte_.find(ch);\n    if (it != unicode_to_byte_.end()) {\n      bytes.push_back(it->second);\n    } else {\n      for (unsigned char b : ch) bytes.push_back(b);\n    }\n    i += n;\n  }\n\n  std::string out(reinterpret_cast<const char *>(bytes.data()), bytes.size());\n\n  for (const char *sp : {\"<|im_end|>\", \"<|im_start|>\", \"<|endoftext|>\"}) {\n    std::string needle(sp);\n    size_t pos = 0;\n    while ((pos = out.find(needle, pos)) != std::string::npos) {\n      out.erase(pos, needle.size());\n    }\n  }\n\n  TrimInPlace(&out);\n  return out;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/funasr-nano-tokenizer.h",
    "content": "// sherpa-onnx/csrc/funasr-nano-tokenizer.h\n//\n// Copyright (c)  2025  zengyw\n//\n// A self-contained Qwen3 ByteLevel-BPE tokenizer implementation.\n// - No dependency on tokenizers-cpp / HF tokenizers\n// - Loads vocab.json + merges.txt + tokenizer.json(added_tokens)\n// - Supports AddedTokens via Trie longest-match\n// - ByteLevel bytes_to_unicode encode/decode\n\n#ifndef SHERPA_ONNX_CSRC_FUNASR_NANO_TOKENIZER_H_\n#define SHERPA_ONNX_CSRC_FUNASR_NANO_TOKENIZER_H_\n\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include <android/asset_manager.h>\n#endif\n\n#if __OHOS__\nstruct NativeResourceManager;\n#endif\n\nnamespace sherpa_onnx {\n\nclass FunASRNanoTokenizer {\n public:\n  explicit FunASRNanoTokenizer(const std::string &tokenizer_dir);\n\n#if __ANDROID_API__ >= 9\n  FunASRNanoTokenizer(AAssetManager *mgr, const std::string &tokenizer_dir);\n#endif\n\n#if __OHOS__\n  FunASRNanoTokenizer(NativeResourceManager *mgr,\n                      const std::string &tokenizer_dir);\n#endif\n\n  std::vector<int64_t> Encode(const std::string &text);\n  std::string Decode(const std::vector<int64_t> &token_ids);\n  std::string GetTokenStringStreaming(int64_t token_id,\n                                      std::string *pending_bytes) const;\n\n  int64_t GetEosTokenId() const { return eos_token_id_; }\n  int64_t GetPadTokenId() const { return pad_token_id_; }\n  int64_t GetImEndTokenId() const { return im_end_token_id_; }\n\n  // Public structures for helper functions\n  struct AddedToken {\n    std::string content;\n    int32_t id = -1;\n    bool single_word = false;\n    bool lstrip = false;\n    bool rstrip = false;\n    bool normalized = false;\n    bool special = false;\n  };\n\n  struct TrieNode {\n    std::unordered_map<uint8_t, int32_t> next;\n    int32_t token_index = -1;  // index in added_tokens_ if terminal\n  };\n\n private:\n  void Init(const 
std::string &tokenizer_dir);\n\n#if __ANDROID_API__ >= 9\n  void Init(AAssetManager *mgr, const std::string &tokenizer_dir);\n#endif\n\n#if __OHOS__\n  void Init(NativeResourceManager *mgr, const std::string &tokenizer_dir);\n#endif\n\n  void FinalizeSpecialIds();\n\n private:\n  // Special ids\n  int64_t eos_token_id_ = -1;\n  int64_t pad_token_id_ = -1;\n  int64_t im_end_token_id_ = -1;\n\n  std::unordered_set<int32_t> special_ids_;\n\n  // Vocab: token <-> id\n  std::unordered_map<std::string, int32_t> token2id_;\n  std::vector<std::string> id2token_;\n\n  // merges ranks: \"left\\tright\" -> rank\n  std::unordered_map<std::string, int32_t> merges_rank_;\n\n  // BPE cache: bytelevel_word -> list of merged tokens\n  std::unordered_map<std::string, std::vector<std::string>> bpe_cache_;\n\n  // bytes_to_unicode mapping (ByteLevel)\n  std::string byte_to_unicode_[256];\n  std::unordered_map<std::string, uint8_t> unicode_to_byte_;\n\n  // AddedTokens\n  std::vector<AddedToken> added_tokens_;\n  std::vector<TrieNode> trie_;\n  std::unordered_set<std::string> added_token_contents_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_FUNASR_NANO_TOKENIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/hifigan-vocoder.cc",
    "content": "// sherpa-onnx/csrc/hifigan-vocoder.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/hifigan-vocoder.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass HifiganVocoder::Impl {\n public:\n  explicit Impl(int32_t num_threads, const std::string &provider,\n                const std::string &model)\n      : env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(num_threads, provider)),\n        allocator_{} {\n    auto buf = ReadFile(model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  explicit Impl(Manager *mgr, int32_t num_threads, const std::string &provider,\n                const std::string &model)\n      : env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(num_threads, provider)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<float> Run(Ort::Value mel) const {\n    auto out = sess_->Run({}, input_names_ptr_.data(), &mel, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<int64_t> audio_shape =\n        out[0].GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t total = 1;\n    // The output shape may be (1, 1, total) or (1, total) or (total,)\n    for (auto i : audio_shape) {\n      total *= i;\n    }\n\n    const float *p = out[0].GetTensorData<float>();\n    return {p, p + total};\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, 
model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n  }\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nHifiganVocoder::HifiganVocoder(int32_t num_threads, const std::string &provider,\n                               const std::string &model)\n    : impl_(std::make_unique<Impl>(num_threads, provider, model)) {}\n\ntemplate <typename Manager>\nHifiganVocoder::HifiganVocoder(Manager *mgr, int32_t num_threads,\n                               const std::string &provider,\n                               const std::string &model)\n    : impl_(std::make_unique<Impl>(mgr, num_threads, provider, model)) {}\n\nHifiganVocoder::~HifiganVocoder() = default;\n\nstd::vector<float> HifiganVocoder::Run(Ort::Value mel) const {\n  return impl_->Run(std::move(mel));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate HifiganVocoder::HifiganVocoder(AAssetManager *mgr, int32_t num_threads,\n                                        const std::string &provider,\n                                        const std::string &model);\n#endif\n\n#if __OHOS__\ntemplate HifiganVocoder::HifiganVocoder(NativeResourceManager *mgr,\n                                        int32_t num_threads,\n                                        const std::string &provider,\n                                        const std::string &model);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/hifigan-vocoder.h",
    "content": "// sherpa-onnx/csrc/hifigan-vocoder.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_HIFIGAN_VOCODER_H_\n#define SHERPA_ONNX_CSRC_HIFIGAN_VOCODER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/vocoder.h\"\n\nnamespace sherpa_onnx {\n\nclass HifiganVocoder : public Vocoder {\n public:\n  ~HifiganVocoder() override;\n\n  HifiganVocoder(int32_t num_threads, const std::string &provider,\n                 const std::string &model);\n\n  template <typename Manager>\n  HifiganVocoder(Manager *mgr, int32_t num_threads, const std::string &provider,\n                 const std::string &model);\n\n  /** @param mel A float32 tensor of shape (batch_size, feat_dim, num_frames).\n   *  @return Return a float32 tensor of shape (batch_size, num_samples).\n   */\n  std::vector<float> Run(Ort::Value mel) const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_HIFIGAN_VOCODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/homophone-replacer.cc",
    "content": "// sherpa-onnx/csrc/homophone-replacer.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n\n#include <cctype>\n#include <fstream>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid HomophoneReplacerConfig::Register(ParseOptions *po) {\n  po->Register(\"hr-dict-dir\", &dict_dir,\n               \"Not used. You don't need to provide a value for it\");\n\n  po->Register(\"hr-lexicon\", &lexicon,\n               \"Path to lexicon.txt used by HomophoneReplacer.\");\n\n  po->Register(\"hr-rule-fsts\", &rule_fsts,\n               \"Fst files for HomophoneReplacer. If there are multiple, they \"\n               \"are separated by a comma. E.g., a.fst,b.fst,c.fst\");\n}\n\nbool HomophoneReplacerConfig::Validate() const {\n  if (!dict_dir.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"From sherpa-onnx v1.12.15, you don't need to provide dict_dir for \"\n        \"this model. 
Ignore it\");\n  }\n\n  if (!lexicon.empty() && !FileExists(lexicon)) {\n    SHERPA_ONNX_LOGE(\"--hr-lexicon: '%s' does not exist\", lexicon.c_str());\n    return false;\n  }\n\n  if (!rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fsts, \",\", false, &files);\n\n    if (files.size() > 1) {\n      SHERPA_ONNX_LOGE(\"Only 1 file is supported now.\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule fst '%s' does not exist. \", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  return true;\n}\n\nstd::string HomophoneReplacerConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"HomophoneReplacerConfig(\";\n  os << \"lexicon=\\\"\" << lexicon << \"\\\", \";\n  os << \"rule_fsts=\\\"\" << rule_fsts << \"\\\")\";\n\n  return os.str();\n}\n\nclass HomophoneReplacer::Impl {\n public:\n  explicit Impl(const HomophoneReplacerConfig &config) : config_(config) {\n    {\n      std::ifstream is(config.lexicon);\n      InitLexicon(is);\n    }\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      replacer_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config_.debug) {\n          SHERPA_ONNX_LOGE(\"hr rule fst: %s\", f.c_str());\n        }\n        replacer_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const HomophoneReplacerConfig &config) : config_(config) {\n    {\n      auto buf = ReadFile(mgr, config.lexicon);\n\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitLexicon(is);\n    }\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      replacer_list_.reserve(files.size());\n      for 
(const auto &f : files) {\n        if (config_.debug) {\n          SHERPA_ONNX_LOGE(\"hr rule fst: %s\", f.c_str());\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        replacer_list_.push_back(\n            std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n  }\n\n  std::string Apply(const std::string &text) const {\n    std::string ans;\n\n    if (text.empty()) {\n      return ans;\n    }\n\n    std::vector<std::string> words = SplitUtf8(text);\n\n    if (config_.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Input text: '%{public}s'\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Input text: '%s'\", text.c_str());\n#endif\n      std::ostringstream os;\n      os << \"After splitting into UTF8: \";\n      std::string sep;\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    // convert words to pronunciations\n    std::vector<std::string> current_words;\n    std::vector<std::string> current_pronunciations;\n\n    PhraseMatcher matcher(&all_words_, words, config_.debug);\n\n    for (const std::string &w : matcher) {\n      if (w.size() < 3 ||\n          reinterpret_cast<const uint8_t *>(w.data())[0] < 128) {\n        if (!current_words.empty()) {\n          ans += ApplyImpl(current_words, current_pronunciations);\n          current_words.clear();\n          current_pronunciations.clear();\n        }\n        ans += w;\n        if (isalpha(w[0])) {\n          ans.push_back(' ');\n        }\n        continue;\n      }\n\n      auto p = ConvertWordToPronunciation(w);\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"%s %s\", w.c_str(), p.c_str());\n      }\n\n      current_words.push_back(w);\n      current_pronunciations.push_back(std::move(p));\n    }  // for (const 
std::string &w : matcher) {\n\n    if (!current_words.empty()) {\n      ans += ApplyImpl(current_words, current_pronunciations);\n    }\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Output text: '%s'\", ans.c_str());\n    }\n\n    if (!ans.empty() && ans.back() == ' ') {\n      ans.pop_back();\n    }\n\n    return ans;\n  }\n\n private:\n  std::string ApplyImpl(const std::vector<std::string> &words,\n                        const std::vector<std::string> &pronunciations) const {\n    std::string ans;\n    for (const auto &r : replacer_list_) {\n      ans = r->Normalize(words, pronunciations);\n      // TODO(fangjun): We support only 1 rule fst at present.\n      break;\n    }\n    return ans;\n  }\n  std::string ConvertWordToPronunciation(const std::string &word) const {\n    if (word2pron_.count(word)) {\n      return word2pron_.at(word);\n    }\n\n    if (word.size() <= 3) {\n      // not a Chinese character\n      return word;\n    }\n\n    std::vector<std::string> words = SplitUtf8(word);\n    std::string ans;\n    for (const auto &w : words) {\n      if (word2pron_.count(w)) {\n        ans.append(word2pron_.at(w));\n      } else {\n        ans.append(w);\n      }\n    }\n\n    return ans;\n  }\n\n  void InitLexicon(std::istream &is) {\n    std::string word;\n    std::string pron;\n    std::string p;\n\n    std::string line;\n    int32_t line_num = 0;\n    int32_t num_warn = 0;\n    while (std::getline(is, line)) {\n      ++line_num;\n      std::istringstream iss(line);\n\n      pron.clear();\n      iss >> word;\n      ToLowerCase(&word);\n\n      if (word2pron_.count(word)) {\n        num_warn += 1;\n        if (num_warn < 10) {\n          SHERPA_ONNX_LOGE(\"Duplicated word: %s at line %d:%s. 
Ignore it.\",\n                           word.c_str(), line_num, line.c_str());\n        }\n        continue;\n      }\n\n      while (iss >> p) {\n        if (p.back() > '4') {\n          p.push_back('1');\n        }\n        pron.append(std::move(p));\n      }\n\n      if (pron.empty()) {\n        SHERPA_ONNX_LOGE(\n            \"Empty pronunciation for word '%s' at line %d:%s. Ignore it.\",\n            word.c_str(), line_num, line.c_str());\n        continue;\n      }\n\n      word2pron_.insert({std::move(word), std::move(pron)});\n    }\n\n    for (const auto &[key, _] : word2pron_) {\n      all_words_.insert(key);\n    }\n  }\n\n private:\n  HomophoneReplacerConfig config_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> replacer_list_;\n  std::unordered_map<std::string, std::string> word2pron_;\n  std::unordered_set<std::string> all_words_;\n};\n\nHomophoneReplacer::HomophoneReplacer(const HomophoneReplacerConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nHomophoneReplacer::HomophoneReplacer(Manager *mgr,\n                                     const HomophoneReplacerConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nHomophoneReplacer::~HomophoneReplacer() = default;\n\nstd::string HomophoneReplacer::Apply(const std::string &text) const {\n  return RemoveInvalidUtf8Sequences(impl_->Apply(text));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate HomophoneReplacer::HomophoneReplacer(\n    AAssetManager *mgr, const HomophoneReplacerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate HomophoneReplacer::HomophoneReplacer(\n    NativeResourceManager *mgr, const HomophoneReplacerConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/homophone-replacer.h",
    "content": "// sherpa-onnx/csrc/homophone-replacer.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_HOMOPHONE_REPLACER_H_\n#define SHERPA_ONNX_CSRC_HOMOPHONE_REPLACER_H_\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct HomophoneReplacerConfig {\n  std::string dict_dir;  // unused\n  std::string lexicon;\n\n  // comma separated fst files, e.g. a.fst,b.fst,c.fst\n  std::string rule_fsts;\n\n  bool debug;\n\n  HomophoneReplacerConfig() = default;\n\n  HomophoneReplacerConfig(const std::string &dict_dir,\n                          const std::string &lexicon,\n                          const std::string &rule_fsts, bool debug)\n      : dict_dir(dict_dir),\n        lexicon(lexicon),\n        rule_fsts(rule_fsts),\n        debug(debug) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass HomophoneReplacer {\n public:\n  explicit HomophoneReplacer(const HomophoneReplacerConfig &config);\n\n  template <typename Manager>\n  HomophoneReplacer(Manager *mgr, const HomophoneReplacerConfig &config);\n\n  ~HomophoneReplacer();\n\n  std::string Apply(const std::string &text) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_HOMOPHONE_REPLACER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/hypothesis.cc",
    "content": "/**\n * Copyright (c)  2023  Xiaomi Corporation\n * Copyright (c)  2023  Pingfeng Luo\n */\n\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nvoid Hypotheses::Add(Hypothesis hyp) {\n  auto key = hyp.Key();\n  auto it = hyps_dict_.find(key);\n  if (it == hyps_dict_.end()) {\n    hyps_dict_[key] = std::move(hyp);\n  } else {\n    it->second.log_prob = LogAdd<double>()(it->second.log_prob, hyp.log_prob);\n  }\n}\n\nHypothesis Hypotheses::GetMostProbable(bool length_norm) const {\n  if (length_norm == false) {\n    return std::max_element(hyps_dict_.begin(), hyps_dict_.end(),\n                            [](const auto &left, auto &right) -> bool {\n                              return left.second.TotalLogProb() <\n                                     right.second.TotalLogProb();\n                            })\n        ->second;\n  } else {\n    // for length_norm is true\n    return std::max_element(\n               hyps_dict_.begin(), hyps_dict_.end(),\n               [](const auto &left, const auto &right) -> bool {\n                 return left.second.TotalLogProb() / left.second.ys.size() <\n                        right.second.TotalLogProb() / right.second.ys.size();\n               })\n        ->second;\n  }\n}\n\nstd::vector<Hypothesis> Hypotheses::GetTopK(int32_t k, bool length_norm) const {\n  k = std::max(k, 1);\n  k = std::min(k, Size());\n\n  std::vector<Hypothesis> all_hyps = Vec();\n\n  if (length_norm == false) {\n    std::partial_sort(all_hyps.begin(), all_hyps.begin() + k, all_hyps.end(),\n                      [](const auto &a, const auto &b) {\n                        return a.TotalLogProb() > b.TotalLogProb();\n                      });\n  } else {\n    // for length_norm is true\n    std::partial_sort(all_hyps.begin(), all_hyps.begin() + k, all_hyps.end(),\n                      [](const auto &a, const auto &b) {\n                        return 
a.TotalLogProb() / a.ys.size() >\n                               b.TotalLogProb() / b.ys.size();\n                      });\n  }\n\n  return {all_hyps.begin(), all_hyps.begin() + k};\n}\n\nconst std::vector<int32_t> GetHypsRowSplits(\n    const std::vector<Hypotheses> &hyps) {\n  std::vector<int32_t> row_splits;\n  row_splits.reserve(hyps.size() + 1);\n\n  row_splits.push_back(0);\n  int32_t s = 0;\n  for (const auto &h : hyps) {\n    s += h.Size();\n    row_splits.push_back(s);\n  }\n\n  return row_splits;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/hypothesis.h",
    "content": "/**\n * Copyright (c)  2023  Xiaomi Corporation\n * Copyright (c)  2023  Pingfeng Luo\n *\n */\n\n#ifndef SHERPA_ONNX_CSRC_HYPOTHESIS_H_\n#define SHERPA_ONNX_CSRC_HYPOTHESIS_H_\n\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/context-graph.h\"\n#include \"sherpa-onnx/csrc/lodr-fst.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstruct Hypothesis {\n  // The predicted tokens so far. Newly predicated tokens are appended.\n  std::vector<int64_t> ys;\n\n  // timestamps[i] contains the frame number after subsampling\n  // on which ys[i] is decoded.\n  std::vector<int32_t> timestamps;\n\n  // The acoustic probability for each token in ys.\n  // Used for keyword spotting task.\n  // For transducer modified beam-search and greedy-search,\n  // this is filled with log_posterior scores.\n  std::vector<float> ys_probs;\n\n  // lm_probs[i] contains the lm score for each token in ys.\n  // Used only in transducer modified beam-search.\n  // Elements filled only if LM is used.\n  std::vector<float> lm_probs;\n\n  // context_scores[i] contains the context-graph score for each token in ys.\n  // Used only in transducer modified beam-search.\n  // Elements filled only if `ContextGraph` is used.\n  std::vector<float> context_scores;\n\n  // The total score of ys in log space.\n  // It contains only acoustic scores\n  double log_prob = 0;\n\n  // LM log prob if any.\n  double lm_log_prob = 0;\n\n  // the nn lm score for next token given the current ys,\n  // when using shallow fusion\n  CopyableOrtValue nn_lm_scores;\n\n  // cur scored tokens by RNN LM, when rescoring\n  int32_t cur_scored_pos = 0;\n\n  // the nn lm states\n  std::vector<CopyableOrtValue> nn_lm_states;\n\n  // the LODR states\n  std::shared_ptr<LodrStateCost> lodr_state;\n\n  const 
ContextState *context_state;\n\n  // TODO(fangjun): Make it configurable\n  // the minimum of tokens in a chunk for streaming RNN LM\n  int32_t lm_rescore_min_chunk = 2;  // a const\n\n  int32_t num_trailing_blanks = 0;\n\n  Hypothesis() = default;\n  Hypothesis(const std::vector<int64_t> &ys, double log_prob,\n             const ContextState *context_state = nullptr)\n      : ys(ys), log_prob(log_prob), context_state(context_state) {}\n\n  double TotalLogProb() const { return log_prob + lm_log_prob; }\n\n  // If two Hypotheses have the same `Key`, then they contain\n  // the same token sequence.\n  std::string Key() const {\n    // TODO(fangjun): Use a hash function?\n    std::ostringstream os;\n    std::string sep;\n    for (auto i : ys) {\n      os << sep << i;\n      sep = \"-\";\n    }\n    return os.str();\n  }\n\n  // For debugging\n  std::string ToString() const {\n    std::ostringstream os;\n    os << \"(\" << Key() << \", \" << log_prob << \")\";\n    return os.str();\n  }\n};\n\nclass Hypotheses {\n public:\n  Hypotheses() = default;\n\n  explicit Hypotheses(std::vector<Hypothesis> hyps) {\n    for (auto &h : hyps) {\n      hyps_dict_[h.Key()] = std::move(h);\n    }\n  }\n\n  explicit Hypotheses(std::unordered_map<std::string, Hypothesis> hyps_dict)\n      : hyps_dict_(std::move(hyps_dict)) {}\n\n  // Add hyp to this object. 
If it already exists, its log_prob\n  // is updated with the given hyp using log-sum-exp.\n  void Add(Hypothesis hyp);\n\n  // Get the hyp that has the largest TotalLogProb().\n  // If length_norm is true, TotalLogProb() is divided by\n  // len(hyp.ys) before comparison.\n  Hypothesis GetMostProbable(bool length_norm) const;\n\n  // Get the k hyps that have the largest TotalLogProb().\n  // If length_norm is true, TotalLogProb() is divided by\n  // len(hyp.ys) before comparison.\n  std::vector<Hypothesis> GetTopK(int32_t k, bool length_norm) const;\n\n  int32_t Size() const { return hyps_dict_.size(); }\n\n  std::string ToString() const {\n    std::ostringstream os;\n    for (const auto &p : hyps_dict_) {\n      os << p.second.ToString() << \"\\n\";\n    }\n    return os.str();\n  }\n\n  auto begin() const { return hyps_dict_.begin(); }\n  auto end() const { return hyps_dict_.end(); }\n\n  auto begin() { return hyps_dict_.begin(); }\n  auto end() { return hyps_dict_.end(); }\n\n  void Clear() { hyps_dict_.clear(); }\n\n  // Return a list of hyps contained in this object.\n  std::vector<Hypothesis> Vec() const {\n    std::vector<Hypothesis> ans;\n    ans.reserve(hyps_dict_.size());\n    for (const auto &p : hyps_dict_) {\n      ans.push_back(p.second);\n    }\n    return ans;\n  }\n\n private:\n  using Map = std::unordered_map<std::string, Hypothesis>;\n  Map hyps_dict_;\n};\n\nconst std::vector<int32_t> GetHypsRowSplits(\n    const std::vector<Hypotheses> &hyps);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_HYPOTHESIS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/keyword-spotter-impl.cc",
    "content": "// sherpa-onnx/csrc/keyword-spotter-impl.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/keyword-spotter-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/keyword-spotter-transducer-impl.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\n#if SHERPA_ONNX_ENABLE_RKNN\n#include \"sherpa-onnx/csrc/rknn/keyword-spotter-transducer-rknn-impl.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<KeywordSpotterImpl> KeywordSpotterImpl::Create(\n    const KeywordSpotterConfig &config) {\n  if (config.model_config.provider_config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.model_config.transducer.encoder.empty()) {\n      return std::make_unique<KeywordSpotterTransducerRknnImpl>(config);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.model_config.transducer.encoder.empty()) {\n    return std::make_unique<KeywordSpotterTransducerImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a model\");\n  SHERPA_ONNX_EXIT(-1);\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<KeywordSpotterImpl> KeywordSpotterImpl::Create(\n    Manager *mgr, const KeywordSpotterConfig &config) {\n  if (config.model_config.provider_config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.model_config.transducer.encoder.empty()) {\n      return std::make_unique<KeywordSpotterTransducerRknnImpl>(mgr, config);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if 
(!config.model_config.transducer.encoder.empty()) {\n    return std::make_unique<KeywordSpotterTransducerImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a model\");\n  SHERPA_ONNX_EXIT(-1);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<KeywordSpotterImpl> KeywordSpotterImpl::Create(\n    AAssetManager *mgr, const KeywordSpotterConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<KeywordSpotterImpl> KeywordSpotterImpl::Create(\n    NativeResourceManager *mgr, const KeywordSpotterConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/keyword-spotter-impl.h",
    "content": "// sherpa-onnx/csrc/keyword-spotter-impl.h\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_IMPL_H_\n#define SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\nnamespace sherpa_onnx {\n\nclass KeywordSpotterImpl {\n public:\n  static std::unique_ptr<KeywordSpotterImpl> Create(\n      const KeywordSpotterConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<KeywordSpotterImpl> Create(\n      Manager *mgr, const KeywordSpotterConfig &config);\n\n  virtual ~KeywordSpotterImpl() = default;\n\n  virtual std::unique_ptr<OnlineStream> CreateStream() const = 0;\n\n  virtual std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &keywords) const = 0;\n\n  virtual bool IsReady(OnlineStream *s) const = 0;\n\n  virtual void Reset(OnlineStream *s) const = 0;\n\n  virtual void DecodeStreams(OnlineStream **ss, int32_t n) const = 0;\n\n  virtual KeywordResult GetResult(OnlineStream *s) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/keyword-spotter-transducer-impl.h",
    "content": "// sherpa-onnx/csrc/keyword-spotter-transducer-impl.h\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_TRANSDUCER_IMPL_H_\n#define SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_TRANSDUCER_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <regex>  // NOLINT\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter-impl.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transducer-keyword-decoder.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n\nnamespace sherpa_onnx {\n\nKeywordResult Convert(const TransducerKeywordResult &src,\n                      const SymbolTable &sym_table, float frame_shift_ms,\n                      int32_t subsampling_factor, int32_t frames_since_start) {\n  KeywordResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.tokens.size());\n  r.keyword = src.keyword;\n  bool from_tokens = src.keyword.empty();\n\n  for (auto i : src.tokens) {\n    auto sym = sym_table[i];\n    if (from_tokens) {\n      r.keyword.append(sym);\n    }\n    r.tokens.push_back(std::move(sym));\n  }\n  if (from_tokens && r.keyword.size()) {\n    r.keyword = r.keyword.substr(1);\n  }\n\n  float frame_shift_s = frame_shift_ms / 1000. 
* subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  r.start_time = frames_since_start * frame_shift_ms / 1000.;\n\n  return r;\n}\n\nclass KeywordSpotterTransducerImpl : public KeywordSpotterImpl {\n public:\n  explicit KeywordSpotterTransducerImpl(const KeywordSpotterConfig &config)\n      : config_(config),\n        model_(OnlineTransducerModel::Create(config.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    model_->SetFeatureDim(config.feat_config.feature_dim);\n\n    if (config.keywords_buf.empty()) {\n      InitKeywords();\n    } else {\n      InitKeywordsFromBufStr();\n    }\n\n    decoder_ = std::make_unique<TransducerKeywordDecoder>(\n        model_.get(), config_.max_active_paths, config_.num_trailing_blanks,\n        unk_id_);\n  }\n\n  template <typename Manager>\n  KeywordSpotterTransducerImpl(Manager *mgr, const KeywordSpotterConfig &config)\n      : config_(config),\n        model_(OnlineTransducerModel::Create(mgr, config.model_config)),\n        sym_(mgr, config.model_config.tokens) {\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    model_->SetFeatureDim(config.feat_config.feature_dim);\n\n    InitKeywords(mgr);\n\n    decoder_ = std::make_unique<TransducerKeywordDecoder>(\n        model_.get(), config_.max_active_paths, config_.num_trailing_blanks,\n        unk_id_);\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream =\n        std::make_unique<OnlineStream>(config_.feat_config, keywords_graph_);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  
std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &keywords) const override {\n    auto kws = std::regex_replace(keywords, std::regex(\"/\"), \"\\n\");\n    std::istringstream is(kws);\n\n    std::vector<std::vector<int32_t>> current_ids;\n    std::vector<std::string> current_kws;\n    std::vector<float> current_scores;\n    std::vector<float> current_thresholds;\n\n    if (!EncodeKeywords(is, sym_, &current_ids, &current_kws, &current_scores,\n                        &current_thresholds)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Encode keywords '%{public}s' failed.\",\n                       keywords.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Encode keywords '%s' failed.\", keywords.c_str());\n#endif\n      return nullptr;\n    }\n\n    int32_t num_kws = current_ids.size();\n    int32_t num_default_kws = keywords_id_.size();\n\n    current_ids.insert(current_ids.end(), keywords_id_.begin(),\n                       keywords_id_.end());\n\n    if (!current_kws.empty() && !keywords_.empty()) {\n      current_kws.insert(current_kws.end(), keywords_.begin(), keywords_.end());\n    } else if (!current_kws.empty() && keywords_.empty()) {\n      current_kws.insert(current_kws.end(), num_default_kws, std::string());\n    } else if (current_kws.empty() && !keywords_.empty()) {\n      current_kws.insert(current_kws.end(), num_kws, std::string());\n      current_kws.insert(current_kws.end(), keywords_.begin(), keywords_.end());\n    } else {\n      // Do nothing.\n    }\n\n    if (!current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else if (!current_scores.empty() && boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_default_kws,\n                            config_.keywords_score);\n    } else if (current_scores.empty() && !boost_scores_.empty()) {\n      
current_scores.insert(current_scores.end(), num_kws,\n                            config_.keywords_score);\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else {\n      // Do nothing.\n    }\n\n    if (!current_thresholds.empty() && !thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), thresholds_.begin(),\n                                thresholds_.end());\n    } else if (!current_thresholds.empty() && thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), num_default_kws,\n                                config_.keywords_threshold);\n    } else if (current_thresholds.empty() && !thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), num_kws,\n                                config_.keywords_threshold);\n      current_thresholds.insert(current_thresholds.end(), thresholds_.begin(),\n                                thresholds_.end());\n    } else {\n      // Do nothing.\n    }\n\n    auto keywords_graph = std::make_shared<ContextGraph>(\n        current_ids, config_.keywords_score, config_.keywords_threshold,\n        current_scores, current_kws, current_thresholds);\n\n    auto stream =\n        std::make_unique<OnlineStream>(config_.feat_config, keywords_graph);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n  void Reset(OnlineStream *s) const override { InitOnlineStream(s); }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      auto s = ss[i];\n      auto r = s->GetKeywordResult(true);\n      int32_t num_trailing_blanks = r.num_trailing_blanks;\n      // assume subsampling_factor is 4\n      // assume frameshift is 0.01 second\n      float trailing_silence = 
num_trailing_blanks * 4 * 0.01;\n\n      // it resets automatically after detecting 1.5 seconds of silence\n      float threshold = 1.5;\n      if (trailing_silence > threshold) {\n        Reset(s);\n      }\n    }\n\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feature_dim = ss[0]->FeatureDim();\n\n    std::vector<TransducerKeywordResult> results(n);\n    std::vector<float> features_vec(n * chunk_size * feature_dim);\n    std::vector<std::vector<Ort::Value>> states_vec(n);\n    std::vector<int64_t> all_processed_frames(n);\n\n    for (int32_t i = 0; i != n; ++i) {\n      SHERPA_ONNX_CHECK(ss[i]->GetContextGraph() != nullptr);\n\n      const auto num_processed_frames = ss[i]->GetNumProcessedFrames();\n      std::vector<float> features =\n          ss[i]->GetFrames(num_processed_frames, chunk_size);\n\n      // Question: should num_processed_frames include chunk_shift?\n      ss[i]->GetNumProcessedFrames() += chunk_shift;\n\n      std::copy(features.begin(), features.end(),\n                features_vec.data() + i * chunk_size * feature_dim);\n\n      results[i] = std::move(ss[i]->GetKeywordResult());\n      states_vec[i] = std::move(ss[i]->GetStates());\n      all_processed_frames[i] = num_processed_frames;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{n, chunk_size, feature_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, features_vec.data(),\n                                            features_vec.size(), x_shape.data(),\n                                            x_shape.size());\n\n    std::array<int64_t, 1> processed_frames_shape{\n        static_cast<int64_t>(all_processed_frames.size())};\n\n    Ort::Value processed_frames = Ort::Value::CreateTensor(\n        memory_info, all_processed_frames.data(), all_processed_frames.size(),\n        processed_frames_shape.data(), 
processed_frames_shape.size());\n\n    auto states = model_->StackStates(states_vec);\n\n    auto pair = model_->RunEncoder(std::move(x), std::move(states),\n                                   std::move(processed_frames));\n\n    decoder_->Decode(std::move(pair.first), ss, &results);\n\n    std::vector<std::vector<Ort::Value>> next_states =\n        model_->UnStackStates(pair.second);\n\n    for (int32_t i = 0; i != n; ++i) {\n      ss[i]->SetKeywordResult(results[i]);\n      ss[i]->SetStates(std::move(next_states[i]));\n    }\n  }\n\n  KeywordResult GetResult(OnlineStream *s) const override {\n    TransducerKeywordResult decoder_result = s->GetKeywordResult(true);\n\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    return Convert(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                   s->GetNumFramesSinceStart());\n  }\n\n private:\n  void InitKeywords(std::istream &is) {\n    if (!EncodeKeywords(is, sym_, &keywords_id_, &keywords_, &boost_scores_,\n                        &thresholds_)) {\n      SHERPA_ONNX_LOGE(\"Encode keywords failed.\");\n      exit(-1);\n    }\n    keywords_graph_ = std::make_shared<ContextGraph>(\n        keywords_id_, config_.keywords_score, config_.keywords_threshold,\n        boost_scores_, keywords_, thresholds_);\n  }\n\n  void InitKeywords() {\n#ifdef SHERPA_ONNX_ENABLE_WASM_KWS\n    // Due to the limitations of the wasm file system,\n    // the keyword_file variable is directly parsed as a string of keywords\n    // if WASM KWS on\n    std::istringstream is(config_.keywords_file);\n    InitKeywords(is);\n#else\n    // each line in keywords_file contains space-separated words\n    std::ifstream is(config_.keywords_file);\n    if (!is) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: '%{public}s'\",\n                       config_.keywords_file.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Open keywords 
file failed: '%s'\",\n                       config_.keywords_file.c_str());\n#endif\n      exit(-1);\n    }\n    InitKeywords(is);\n#endif\n  }\n\n  template <typename Manager>\n  void InitKeywords(Manager *mgr) {\n    // each line in keywords_file contains space-separated words\n\n    auto buf = ReadFile(mgr, config_.keywords_file);\n\n    std::istringstream is(std::string(buf.data(), buf.size()));\n\n    if (!is) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: '%{public}s'\",\n                       config_.keywords_file.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: '%s'\",\n                       config_.keywords_file.c_str());\n#endif\n      exit(-1);\n    }\n    InitKeywords(is);\n  }\n\n  void InitKeywordsFromBufStr() {\n    // keywords_buf's content is supposed to be same as the keywords_file's\n    std::istringstream is(config_.keywords_buf);\n    InitKeywords(is);\n  }\n\n  void InitOnlineStream(OnlineStream *stream) const {\n    auto r = decoder_->GetEmptyResult();\n    SHERPA_ONNX_CHECK_EQ(r.hyps.Size(), 1);\n\n    SHERPA_ONNX_CHECK(stream->GetContextGraph() != nullptr);\n    r.hyps.begin()->second.context_state = stream->GetContextGraph()->Root();\n\n    stream->SetKeywordResult(r);\n    stream->SetStates(model_->GetEncoderInitStates());\n  }\n\n private:\n  KeywordSpotterConfig config_;\n  std::vector<std::vector<int32_t>> keywords_id_;\n  std::vector<float> boost_scores_;\n  std::vector<float> thresholds_;\n  std::vector<std::string> keywords_;\n  ContextGraphPtr keywords_graph_;\n  std::unique_ptr<OnlineTransducerModel> model_;\n  std::unique_ptr<TransducerKeywordDecoder> decoder_;\n  SymbolTable sym_;\n  int32_t unk_id_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_TRANSDUCER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/keyword-spotter.cc",
    "content": "// sherpa-onnx/csrc/keyword-spotter.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <fstream>\n#include <iomanip>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/keyword-spotter-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::string KeywordResult::AsJsonString() const {\n  std::ostringstream os;\n  os << \"{\";\n  os << \"\\\"start_time\\\":\" << std::fixed << std::setprecision(2) << start_time\n     << \", \";\n\n  os << \"\\\"keyword\\\"\"\n     << \": \";\n  os << \"\\\"\" << keyword << \"\\\"\"\n     << \", \";\n\n  os << \"\\\"\"\n     << \"timestamps\"\n     << \"\\\"\"\n     << \": \";\n  os << \"[\";\n\n  std::string sep = \"\";\n  for (auto t : timestamps) {\n    os << sep << std::fixed << std::setprecision(2) << t;\n    sep = \", \";\n  }\n  os << \"], \";\n\n  os << \"\\\"\"\n     << \"tokens\"\n     << \"\\\"\"\n     << \":\";\n  os << \"[\";\n\n  sep = \"\";\n  auto oldFlags = os.flags();\n  for (const auto &t : tokens) {\n    if (t.size() == 1 && static_cast<uint8_t>(t[0]) > 0x7f) {\n      const uint8_t *p = reinterpret_cast<const uint8_t *>(t.c_str());\n      os << sep << \"\\\"\"\n         << \"<0x\" << std::hex << std::uppercase << static_cast<uint32_t>(p[0])\n         << \">\"\n         << \"\\\"\";\n      os.flags(oldFlags);\n    } else {\n      os << sep << \"\\\"\" << t << \"\\\"\";\n    }\n    sep = \", \";\n  }\n  os << \"]\";\n  os << \"}\";\n\n  return os.str();\n}\n\nvoid KeywordSpotterConfig::Register(ParseOptions *po) {\n  feat_config.Register(po);\n  model_config.Register(po);\n\n  po->Register(\"max-active-paths\", &max_active_paths,\n               \"beam 
size used in modified beam search.\");\n  po->Register(\"num-trailing-blanks\", &num_trailing_blanks,\n               \"The number of trailing blanks required after the keyword.\");\n  po->Register(\"keywords-score\", &keywords_score,\n               \"The bonus score for each token in context word/phrase.\");\n  po->Register(\"keywords-threshold\", &keywords_threshold,\n               \"The acoustic threshold (probability) to trigger the keywords.\");\n  po->Register(\n      \"keywords-file\", &keywords_file,\n      \"The file containing keywords, one word/phrase per line, and for each \"\n      \"phrase the bpe/cjkchar are separated by a space. For example: \"\n      \"▁HE LL O ▁WORLD or \"\n      \"你 好 世 界\");\n}\n\nbool KeywordSpotterConfig::Validate() const {\n  if (!keywords_file.empty() && !keywords_buf.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"You cannot provide both keywords_buf and a keywords file ('%s') \"\n        \"at the same time.\",\n        keywords_file.c_str());\n    return false;\n  }\n\n  if (keywords_file.empty() && keywords_buf.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"Please provide either a keywords-file or the keywords-buf\");\n    return false;\n  }\n\n#ifndef SHERPA_ONNX_ENABLE_WASM_KWS\n  // Due to the limitations of the wasm file system, the keywords file is\n  // packaged into the sherpa-onnx-wasm-kws-main.data file, so in WASM builds\n  // keywords_file is parsed directly as a string of keywords and this\n  // existence check is skipped.\n  if (keywords_buf.empty() && !std::ifstream(keywords_file.c_str()).good()) {\n    SHERPA_ONNX_LOGE(\"Keywords file '%s' does not exist.\",\n                     keywords_file.c_str());\n    return false;\n  }\n#endif\n\n  return model_config.Validate();\n}\n\nstd::string KeywordSpotterConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"KeywordSpotterConfig(\";\n  os << \"feat_config=\" << feat_config.ToString() << \", \";\n  os << \"model_config=\" << model_config.ToString() << \", \";\n  os << 
\"max_active_paths=\" << max_active_paths << \", \";\n  os << \"num_trailing_blanks=\" << num_trailing_blanks << \", \";\n  os << \"keywords_score=\" << keywords_score << \", \";\n  os << \"keywords_threshold=\" << keywords_threshold << \", \";\n  os << \"keywords_file=\\\"\" << keywords_file << \"\\\")\";\n\n  return os.str();\n}\n\nKeywordSpotter::KeywordSpotter(const KeywordSpotterConfig &config)\n    : impl_(KeywordSpotterImpl::Create(config)) {}\n\ntemplate <typename Manager>\nKeywordSpotter::KeywordSpotter(Manager *mgr, const KeywordSpotterConfig &config)\n    : impl_(KeywordSpotterImpl::Create(mgr, config)) {}\n\nKeywordSpotter::~KeywordSpotter() = default;\n\nstd::unique_ptr<OnlineStream> KeywordSpotter::CreateStream() const {\n  return impl_->CreateStream();\n}\n\nstd::unique_ptr<OnlineStream> KeywordSpotter::CreateStream(\n    const std::string &keywords) const {\n  return impl_->CreateStream(keywords);\n}\n\nbool KeywordSpotter::IsReady(OnlineStream *s) const {\n  return impl_->IsReady(s);\n}\n\nvoid KeywordSpotter::Reset(OnlineStream *s) const { impl_->Reset(s); }\n\nvoid KeywordSpotter::DecodeStreams(OnlineStream **ss, int32_t n) const {\n  impl_->DecodeStreams(ss, n);\n}\n\nKeywordResult KeywordSpotter::GetResult(OnlineStream *s) const {\n  return impl_->GetResult(s);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate KeywordSpotter::KeywordSpotter(AAssetManager *mgr,\n                                        const KeywordSpotterConfig &config);\n#endif\n\n#if __OHOS__\ntemplate KeywordSpotter::KeywordSpotter(NativeResourceManager *mgr,\n                                        const KeywordSpotterConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/keyword-spotter.h",
    "content": "// sherpa-onnx/csrc/keyword-spotter.h\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_H_\n#define SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct KeywordResult {\n  /// The triggered keyword.\n  /// For English, it consists of space separated words.\n  /// For Chinese, it consists of Chinese words without spaces.\n  /// Example 1: \"hello world\"\n  /// Example 2: \"你好世界\"\n  std::string keyword;\n\n  /// Decoded results at the token level.\n  /// For instance, for BPE-based models it consists of a list of BPE tokens.\n  std::vector<std::string> tokens;\n\n  /// timestamps.size() == tokens.size()\n  /// timestamps[i] records the time in seconds when tokens[i] is decoded.\n  std::vector<float> timestamps;\n\n  /// Starting time of this segment.\n  /// When an endpoint is detected, it will change\n  float start_time = 0;\n\n  /** Return a json string.\n   *\n   * The returned string contains:\n   *   {\n   *     \"keyword\": \"The triggered keyword\",\n   *     \"tokens\": [x, x, x],\n   *     \"timestamps\": [x, x, x],\n   *     \"start_time\": x,\n   *   }\n   */\n  std::string AsJsonString() const;\n};\n\nstruct KeywordSpotterConfig {\n  FeatureExtractorConfig feat_config;\n  OnlineModelConfig model_config;\n\n  int32_t max_active_paths = 4;\n\n  int32_t num_trailing_blanks = 1;\n\n  float keywords_score = 1.0;\n\n  float keywords_threshold = 0.25;\n\n  std::string keywords_file;\n\n  /// if keywords_buf is non-empty,\n  /// the keywords will be loaded from the buffer instead of from the\n  /// \"keywrods_file\"\n  std::string keywords_buf;\n\n  KeywordSpotterConfig() 
= default;\n\n  KeywordSpotterConfig(const FeatureExtractorConfig &feat_config,\n                       const OnlineModelConfig &model_config,\n                       int32_t max_active_paths, int32_t num_trailing_blanks,\n                       float keywords_score, float keywords_threshold,\n                       const std::string &keywords_file)\n      : feat_config(feat_config),\n        model_config(model_config),\n        max_active_paths(max_active_paths),\n        num_trailing_blanks(num_trailing_blanks),\n        keywords_score(keywords_score),\n        keywords_threshold(keywords_threshold),\n        keywords_file(keywords_file) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass KeywordSpotterImpl;\n\nclass KeywordSpotter {\n public:\n  explicit KeywordSpotter(const KeywordSpotterConfig &config);\n\n  template <typename Manager>\n  KeywordSpotter(Manager *mgr, const KeywordSpotterConfig &config);\n\n  ~KeywordSpotter();\n\n  /** Create a stream for decoding.\n   *\n   */\n  std::unique_ptr<OnlineStream> CreateStream() const;\n\n  /** Create a stream for decoding.\n   *\n   *  @param keywords The keywords for this stream. It may contain several\n   *         keywords separated by \"/\". Each keyword consists of cjkchars or\n   *         bpes separated by a space (\" \"). For example, the keywords\n   *         I LOVE YOU and HELLO WORLD look like:\n   *\n   *         \"▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD\"\n   */\n  std::unique_ptr<OnlineStream> CreateStream(const std::string &keywords) const;\n\n  /**\n   * Return true if the given stream has enough frames for decoding.\n   * Return false otherwise.\n   */\n  bool IsReady(OnlineStream *s) const;\n\n  // Remember to call it after detecting a keyword\n  void Reset(OnlineStream *s) const;\n\n  /** Decode a single stream. 
*/\n  void DecodeStream(OnlineStream *s) const {\n    OnlineStream *ss[1] = {s};\n    DecodeStreams(ss, 1);\n  }\n\n  /** Decode multiple streams in parallel\n   *\n   * @param ss Pointer array containing streams to be decoded.\n   * @param n Number of streams in `ss`.\n   */\n  void DecodeStreams(OnlineStream **ss, int32_t n) const;\n\n  KeywordResult GetResult(OnlineStream *s) const;\n\n private:\n  std::unique_ptr<KeywordSpotterImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_KEYWORD_SPOTTER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/kokoro-multi-lang-lexicon.cc",
    "content": "// sherpa-onnx/csrc/kokoro-multi-lang-lexicon.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/kokoro-multi-lang-lexicon.h\"\n\n#include <fstream>\n#include <regex>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"espeak-ng/speak_lib.h\"\n#include \"phoneme_ids.hpp\"  // NOLINT\n#include \"phonemize.hpp\"    // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid CallPhonemizeEspeak(const std::string &text,\n                         piper::eSpeakPhonemeConfig &config,  // NOLINT\n                         std::vector<std::vector<piper::Phoneme>> *phonemes);\n\nclass KokoroMultiLangLexicon::Impl {\n public:\n  Impl(const std::string &tokens, const std::string &lexicon,\n       const std::string &data_dir,\n       const OfflineTtsKokoroModelMetaData &meta_data, bool debug)\n      : meta_data_(meta_data), debug_(debug) {\n    InitTokens(tokens);\n\n    InitLexicon(lexicon);\n\n    InitEspeak(data_dir);  // See ./piper-phonemize-lexicon.cc\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const std::string &tokens, const std::string &lexicon,\n       const std::string &data_dir,\n       const OfflineTtsKokoroModelMetaData &meta_data, bool debug)\n      : meta_data_(meta_data), debug_(debug) {\n    InitTokens(mgr, tokens);\n\n    InitLexicon(mgr, lexicon);\n\n    // we assume you have copied data_dir from assets to some path\n\n    InitEspeak(data_dir);  // See ./piper-phonemize-lexicon.cc\n  }\n\n  std::vector<TokenIDs> 
ConvertTextToTokenIds(const std::string &_text,\n                                              const std::string &voice) const {\n    // we cannot convert text to lowercase here since it will affect\n    // how piper_phonemize handles punctuations inside the text\n    std::string text = _text;\n\n    std::vector<std::pair<std::string, std::string>> replace_str_pairs = {\n        {\"，\", \",\"}, {\":\", \",\"},  {\"、\", \",\"}, {\"；\", \";\"},   {\"：\", \":\"},\n        {\"。\", \".\"}, {\"？\", \"?\"}, {\"！\", \"!\"}, {\"\\\\s+\", \" \"},\n    };\n    for (const auto &p : replace_str_pairs) {\n      std::regex re(p.first);\n      text = std::regex_replace(text, re, p.second);\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"After replacing punctuations and merging spaces:\\n%s\",\n                       text.c_str());\n    }\n\n    // https://en.cppreference.com/w/cpp/regex\n    // https://stackoverflow.com/questions/37989081/how-to-use-unicode-range-in-c-regex\n    std::string expr_chinese = \"([\\\\u4e00-\\\\u9fff]+)\";\n    std::string expr_not_chinese = \"([^\\\\u4e00-\\\\u9fff]+)\";\n\n    std::string expr_both = expr_chinese + \"|\" + expr_not_chinese;\n\n    auto ws = ToWideString(text);\n    std::wstring wexpr_both = ToWideString(expr_both);\n    std::wregex we_both(wexpr_both);\n\n    std::wstring wexpr_zh = ToWideString(expr_chinese);\n    std::wregex we_zh(wexpr_zh);\n\n    auto begin = std::wsregex_iterator(ws.begin(), ws.end(), we_both);\n    auto end = std::wsregex_iterator();\n\n    std::vector<TokenIDs> ans;\n\n    for (std::wsregex_iterator i = begin; i != end; ++i) {\n      std::wsmatch match = *i;\n      std::wstring match_str = match.str();\n\n      auto ms = ToString(match_str);\n      uint8_t c = reinterpret_cast<const uint8_t *>(ms.data())[0];\n\n      std::vector<std::vector<int32_t>> ids_vec;\n      if (std::regex_match(match_str, we_zh)) {\n        if (debug_) {\n          SHERPA_ONNX_LOGE(\"Chinese: %s\", ms.c_str());\n        }\n      
  ids_vec = ConvertChineseToTokenIDs(ms);\n      } else {\n        if (debug_) {\n          SHERPA_ONNX_LOGE(\"Non-Chinese: %s\", ms.c_str());\n        }\n\n        ids_vec = ConvertNonChineseToTokenIDs(ms, voice);\n      }\n\n      for (const auto &ids : ids_vec) {\n        if (ids.size() > 10 + 2) {\n          ans.emplace_back(ids);\n        } else {\n          if (ans.empty()) {\n            ans.emplace_back(ids);\n          } else {\n            if ((ans.back().tokens.size() + ids.size() < 50) ||\n                (ids.size() < 5)) {\n              ans.back().tokens.back() = ids[1];\n              ans.back().tokens.insert(ans.back().tokens.end(), ids.begin() + 2,\n                                       ids.end());\n            } else {\n              ans.emplace_back(ids);\n            }\n          }\n        }\n      }\n    }\n\n    if (debug_) {\n      for (const auto &v : ans) {\n        std::ostringstream os;\n        os << \"\\n\";\n        std::string sep;\n        for (auto i : v.tokens) {\n          os << sep << i;\n          sep = \" \";\n        }\n        os << \"\\n\";\n        SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n      }\n    }\n\n    return ans;\n  }\n\n private:\n  bool IsPunctuation(const std::string &text) const {\n    if (text == \";\" || text == \":\" || text == \",\" || text == \".\" ||\n        text == \"!\" || text == \"?\" || text == \"—\" || text == \"…\" ||\n        text == \"\\\"\" || text == \"(\" || text == \")\" || text == \"“\" ||\n        text == \"”\") {\n      return true;\n    }\n\n    return false;\n  }\n\n  std::vector<int32_t> ConvertWordToIds(const std::string &w) const {\n    std::vector<int32_t> ans;\n    if (word2ids_.count(w)) {\n      ans = word2ids_.at(w);\n    } else {\n      std::vector<std::string> words = SplitUtf8(w);\n      for (const auto &word : words) {\n        if (word2ids_.count(word)) {\n          auto ids = ConvertWordToIds(word);\n          ans.insert(ans.end(), ids.begin(), ids.end());\n        
} else {\n          if (debug_) {\n            SHERPA_ONNX_LOGE(\"Skip OOV: '%s'\", word.c_str());\n          }\n        }\n      }\n    }\n\n    if (debug_ && !ans.empty()) {\n      std::ostringstream os;\n      os << w << \": \";\n      for (auto i : ans) {\n        os << id2token_.at(i) << \" \";\n      }\n      os << \"\\n\";\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    return ans;\n  }\n\n  std::vector<std::vector<int32_t>> ConvertChineseToTokenIDs(\n      const std::string &text) const {\n    std::vector<std::string> words = SplitUtf8(text);\n\n    if (debug_) {\n      std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%s\", os.str().c_str());\n#endif\n    }\n\n    std::vector<std::vector<int32_t>> ans;\n    std::vector<int32_t> this_sentence;\n    int32_t max_len = meta_data_.max_token_len;\n\n    this_sentence.push_back(0);\n\n    PhraseMatcher matcher(&all_words_, words, debug_);\n\n    for (const std::string &w : matcher) {\n      auto ids = ConvertWordToIds(w);\n      if (ids.empty()) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%{public}s'\", w.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%s'\", w.c_str());\n#endif\n        continue;\n      }\n\n      if (this_sentence.size() + ids.size() > max_len - 2) {\n        this_sentence.push_back(0);\n        ans.push_back(std::move(this_sentence));\n\n        this_sentence.push_back(0);\n      }\n\n      this_sentence.insert(this_sentence.end(), ids.begin(), ids.end());\n    }  // for (const std::string &w : matcher)\n\n    if (this_sentence.size() > 1) {\n      this_sentence.push_back(0);\n      
ans.push_back(std::move(this_sentence));\n    }\n\n    if (debug_) {\n      for (const auto &v : ans) {\n        std::ostringstream os;\n        os << \"\\n\";\n        std::string sep;\n        for (auto i : v) {\n          os << sep << i;\n          sep = \" \";\n        }\n        os << \"\\n\";\n        SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n      }\n    }\n\n    return ans;\n  }\n\n  std::vector<std::vector<int32_t>> ConvertTextToTokenIDsWithEspeak(\n      const std::string &text, const std::string &voice) const {\n    auto temp = ConvertTextToTokenIdsKokoroOrKitten(\n        phoneme2id_, meta_data_.max_token_len, text, voice);\n    std::vector<std::vector<int32_t>> ans;\n    ans.reserve(temp.size());\n\n    for (const auto &i : temp) {\n      ans.emplace_back(i.tokens.begin(), i.tokens.end());\n    }\n\n    return ans;\n  }\n\n  std::vector<std::vector<int32_t>> ConvertNonChineseToTokenIDs(\n      const std::string &text, const std::string &voice) const {\n    if (IsPunctuation(text)) {\n      return {std::vector<int32_t>{0, token2id_.at(text), 0}};\n    }\n\n    if (!voice.empty()) {\n      return ConvertTextToTokenIDsWithEspeak(text, voice);\n    }\n\n    // If voice is empty, we split the text into words and use the lexicon\n    // to lookup the pronunciation of each word, fallback to espeak if\n    // a word is not in the lexicon.\n\n    std::vector<std::string> words = SplitUtf8(text);\n    if (debug_) {\n      std::ostringstream os;\n      os << \"After splitting to words: \";\n      std::string sep;\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    }\n\n    std::vector<std::vector<int32_t>> ans;\n    int32_t max_len = meta_data_.max_token_len;\n    std::vector<int32_t> this_sentence;\n\n    int32_t space_id = token2id_.at(\" \");\n\n    this_sentence.push_back(0);\n\n    for (const auto &_word : words) {\n      auto word = ToLowerCase(_word);\n      
if (IsPunctuation(word)) {\n        this_sentence.push_back(token2id_.at(word));\n\n        if (this_sentence.size() > max_len - 2) {\n          // this sentence is too long, split it\n          this_sentence.push_back(0);\n          ans.push_back(std::move(this_sentence));\n\n          this_sentence.push_back(0);\n          continue;\n        }\n\n        if (word == \".\" || word == \"!\" || word == \"?\" || word == \";\") {\n          // Note: You can add more punctuations here to split the text\n          // into sentences. We just use four here: .!?;\n          this_sentence.push_back(0);\n          ans.push_back(std::move(this_sentence));\n\n          this_sentence.push_back(0);\n        }\n      } else if (word2ids_.count(word)) {\n        const auto &ids = word2ids_.at(word);\n        if (this_sentence.size() + ids.size() + 3 > max_len - 2) {\n          this_sentence.push_back(0);\n          ans.push_back(std::move(this_sentence));\n\n          this_sentence.push_back(0);\n        }\n\n        this_sentence.insert(this_sentence.end(), ids.begin(), ids.end());\n        this_sentence.push_back(space_id);\n      } else {\n        if (debug_) {\n          SHERPA_ONNX_LOGE(\"Use espeak-ng to handle the OOV: '%s'\",\n                           word.c_str());\n        }\n\n        piper::eSpeakPhonemeConfig config;\n\n        config.voice = meta_data_.voice;\n\n        std::vector<std::vector<piper::Phoneme>> phonemes;\n\n        CallPhonemizeEspeak(word, config, &phonemes);\n        // Note phonemes[i] contains a vector of unicode codepoints;\n        // we need to convert them to utf8\n\n        std::vector<int32_t> ids;\n        for (const auto &v : phonemes) {\n          for (const auto p : v) {\n            auto token = Utf32ToUtf8(p);\n            if (token2id_.count(token)) {\n              ids.push_back(token2id_.at(token));\n            } else {\n              if (debug_) {\n                SHERPA_ONNX_LOGE(\"Skip OOV token '%s' from '%s'\", 
token.c_str(),\n                                 word.c_str());\n              }\n            }\n          }\n        }\n\n        if (this_sentence.size() + ids.size() + 3 > max_len - 2) {\n          this_sentence.push_back(0);\n          ans.push_back(std::move(this_sentence));\n\n          this_sentence.push_back(0);\n        }\n\n        this_sentence.insert(this_sentence.end(), ids.begin(), ids.end());\n        this_sentence.push_back(space_id);\n      }\n    }\n\n    if (this_sentence.size() > 1) {\n      this_sentence.push_back(0);\n      ans.push_back(std::move(this_sentence));\n    }\n\n    if (debug_) {\n      for (const auto &v : ans) {\n        std::ostringstream os;\n        os << \"\\n\";\n        std::string sep;\n        for (auto i : v) {\n          os << sep << i;\n          sep = \" \";\n        }\n        os << \"\\n\";\n        SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n      }\n    }\n\n    return ans;\n  }\n\n  void InitTokens(const std::string &tokens) {\n    std::ifstream is(tokens);\n    InitTokens(is);\n  }\n\n  template <typename Manager>\n  void InitTokens(Manager *mgr, const std::string &tokens) {\n    auto buf = ReadFile(mgr, tokens);\n\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    InitTokens(is);\n  }\n\n  void InitTokens(std::istream &is) {\n    token2id_ = ReadTokens(is);  // defined in ./symbol-table.cc\n\n    if (debug_) {\n      for (const auto &p : token2id_) {\n        id2token_[p.second] = p.first;\n      }\n    }\n\n    std::u32string s;\n    for (const auto &p : token2id_) {\n      s = Utf8ToUtf32(p.first);\n\n      if (s.size() != 1) {\n        SHERPA_ONNX_LOGE(\"Error for token %s with id %d\", p.first.c_str(),\n                         p.second);\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      char32_t c = s[0];\n      phoneme2id_.insert({c, p.second});\n    }\n  }\n\n  void InitLexicon(const std::string &lexicon) {\n    if (lexicon.empty()) {\n      return;\n    }\n\n    
std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      std::ifstream is(f);\n      InitLexicon(is);\n    }\n  }\n\n  template <typename Manager>\n  void InitLexicon(Manager *mgr, const std::string &lexicon) {\n    if (lexicon.empty()) {\n      return;\n    }\n\n    std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      auto buf = ReadFile(mgr, f);\n\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitLexicon(is);\n    }\n  }\n\n  void InitLexicon(std::istream &is) {\n    std::string word;\n    std::vector<std::string> token_list;\n    std::string token;\n\n    std::string line;\n    int32_t line_num = 0;\n    int32_t num_warn = 0;\n    while (std::getline(is, line)) {\n      ++line_num;\n      std::istringstream iss(line);\n\n      token_list.clear();\n      iss >> word;\n      ToLowerCase(&word);\n\n      if (word2ids_.count(word)) {\n        num_warn += 1;\n        if (num_warn < 10) {\n          SHERPA_ONNX_LOGE(\"Duplicated word: %s at line %d:%s. Ignore it.\",\n                           word.c_str(), line_num, line.c_str());\n        }\n        continue;\n      }\n\n      while (iss >> token) {\n        token_list.push_back(std::move(token));\n      }\n\n      std::vector<int32_t> ids = ConvertTokensToIds(token2id_, token_list);\n\n      if (ids.empty() && word != \"呣\") {\n        SHERPA_ONNX_LOGE(\n            \"Invalid pronunciation for word '%s' at line %d:%s. 
Ignore it\",\n            word.c_str(), line_num, line.c_str());\n        continue;\n      }\n\n      word2ids_.insert({std::move(word), std::move(ids)});\n    }\n\n    for (const auto &[key, _] : word2ids_) {\n      all_words_.insert(key);\n    }\n  }\n\n private:\n  OfflineTtsKokoroModelMetaData meta_data_;\n\n  // word to token IDs\n  std::unordered_map<std::string, std::vector<int32_t>> word2ids_;\n  std::unordered_set<std::string> all_words_;\n\n  // tokens.txt is saved in token2id_\n  std::unordered_map<std::string, int32_t> token2id_;\n  std::unordered_map<int32_t, std::string> id2token_;\n\n  std::unordered_map<char32_t, int32_t> phoneme2id_;\n\n  bool debug_ = false;\n};\n\nKokoroMultiLangLexicon::~KokoroMultiLangLexicon() = default;\n\nKokoroMultiLangLexicon::KokoroMultiLangLexicon(\n    const std::string &tokens, const std::string &lexicon,\n    const std::string &data_dir, const OfflineTtsKokoroModelMetaData &meta_data,\n    bool debug)\n    : impl_(std::make_unique<Impl>(tokens, lexicon, data_dir, meta_data,\n                                   debug)) {}  // NOLINT\n\ntemplate <typename Manager>\nKokoroMultiLangLexicon::KokoroMultiLangLexicon(\n    Manager *mgr, const std::string &tokens, const std::string &lexicon,\n    const std::string &data_dir, const OfflineTtsKokoroModelMetaData &meta_data,\n    bool debug)\n    : impl_(std::make_unique<Impl>(mgr, tokens, lexicon, data_dir, meta_data,\n                                   debug)) {}  // NOLINT\n\nstd::vector<TokenIDs> KokoroMultiLangLexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string &voice /*= \"\"*/) const {\n  return impl_->ConvertTextToTokenIds(text, voice);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate KokoroMultiLangLexicon::KokoroMultiLangLexicon(\n    AAssetManager *mgr, const std::string &tokens, const std::string &lexicon,\n    const std::string &data_dir, const OfflineTtsKokoroModelMetaData &meta_data,\n    bool debug);\n#endif\n\n#if __OHOS__\ntemplate 
KokoroMultiLangLexicon::KokoroMultiLangLexicon(\n    NativeResourceManager *mgr, const std::string &tokens,\n    const std::string &lexicon, const std::string &data_dir,\n    const OfflineTtsKokoroModelMetaData &meta_data, bool debug);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/kokoro-multi-lang-lexicon.h",
    "content": "// sherpa-onnx/csrc/kokoro-multi-lang-lexicon.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_KOKORO_MULTI_LANG_LEXICON_H_\n#define SHERPA_ONNX_CSRC_KOKORO_MULTI_LANG_LEXICON_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass KokoroMultiLangLexicon : public OfflineTtsFrontend {\n public:\n  ~KokoroMultiLangLexicon() override;\n\n  KokoroMultiLangLexicon(const std::string &tokens, const std::string &lexicon,\n                         const std::string &data_dir,\n                         const OfflineTtsKokoroModelMetaData &meta_data,\n                         bool debug);\n\n  template <typename Manager>\n  KokoroMultiLangLexicon(Manager *mgr, const std::string &tokens,\n                         const std::string &lexicon,\n                         const std::string &data_dir,\n                         const OfflineTtsKokoroModelMetaData &meta_data,\n                         bool debug);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text, const std::string &voice = \"\") const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_KOKORO_MULTI_LANG_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/lexicon.cc",
    "content": "// sherpa-onnx/csrc/lexicon.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/lexicon.h\"\n\n#include <algorithm>\n#include <cctype>\n#include <fstream>\n#include <iomanip>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic std::vector<std::string> ProcessHeteronyms(\n    const std::vector<std::string> &words) {\n  std::vector<std::string> ans;\n  ans.reserve(words.size());\n\n  int32_t num_words = static_cast<int32_t>(words.size());\n  int32_t i = 0;\n  int32_t prev = -1;\n  while (i < num_words) {\n    // start of a phrase #$|\n    if ((i + 2 < num_words) && words[i] == \"#\" && words[i + 1] == \"$\" &&\n        words[i + 2] == \"|\") {\n      if (prev == -1) {\n        prev = i + 3;\n      }\n      i = i + 3;\n      continue;\n    }\n\n    // end of a phrase |$#\n    if ((i + 2 < num_words) && words[i] == \"|\" && words[i + 1] == \"$\" &&\n        words[i + 2] == \"#\") {\n      if (prev != -1) {\n        std::ostringstream os;\n        for (int32_t k = prev; k < i; ++k) {\n          if (words[k] != \"|\" && words[k] != \"$\" && words[k] != \"#\") {\n            os << words[k];\n          }\n        }\n        ans.push_back(os.str());\n\n        prev = -1;\n      }\n\n      i += 3;\n      continue;\n    }\n\n    if (prev == -1) {\n      // not inside a phrase\n      ans.push_back(words[i]);\n    }\n\n    ++i;\n  }\n\n  return ans;\n}\n\nstd::vector<int32_t> ConvertTokensToIds(\n    const std::unordered_map<std::string, int32_t> &token2id,\n    const 
std::vector<std::string> &tokens) {\n  std::vector<int32_t> ids;\n  ids.reserve(tokens.size());\n  for (const auto &s : tokens) {\n    if (!token2id.count(s)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Unknown token: %{public}s\", s.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Unknown token: %s\", s.c_str());\n#endif\n      return {};\n    }\n    int32_t id = token2id.at(s);\n    ids.push_back(id);\n  }\n\n  return ids;\n}\n\nLexicon::Lexicon(const std::string &lexicon, const std::string &tokens,\n                 const std::string &punctuations, const std::string &language,\n                 bool debug /*= false*/)\n    : debug_(debug) {\n  InitLanguage(language);\n\n  {\n    std::ifstream is(tokens);\n    InitTokens(is);\n  }\n\n  {\n    std::ifstream is(lexicon);\n    InitLexicon(is);\n  }\n\n  InitPunctuations(punctuations);\n}\n\ntemplate <typename Manager>\nLexicon::Lexicon(Manager *mgr, const std::string &lexicon,\n                 const std::string &tokens, const std::string &punctuations,\n                 const std::string &language, bool debug /*= false*/\n                 )\n    : debug_(debug) {\n  InitLanguage(language);\n\n  {\n    auto buf = ReadFile(mgr, tokens);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    InitTokens(is);\n  }\n\n  {\n    auto buf = ReadFile(mgr, lexicon);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    InitLexicon(is);\n  }\n\n  InitPunctuations(punctuations);\n}\n\nstd::vector<TokenIDs> Lexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string & /*voice*/ /*= \"\"*/) const {\n  switch (language_) {\n    case Language::kChinese:\n      return ConvertTextToTokenIdsChinese(text);\n    case Language::kNotChinese:\n      return ConvertTextToTokenIdsNotChinese(text);\n    default:\n      SHERPA_ONNX_LOGE(\"Unknown language: %d\", static_cast<int32_t>(language_));\n      SHERPA_ONNX_EXIT(-1);\n  }\n\n  return {};\n}\n\nstd::vector<TokenIDs> 
Lexicon::ConvertTextToTokenIdsChinese(\n    const std::string &_text) const {\n  std::string text(_text);\n  ToLowerCase(&text);\n\n  std::vector<std::string> words = SplitUtf8(text);\n  words = ProcessHeteronyms(words);\n\n  if (debug_) {\n    std::ostringstream os;\n\n    os << \"Input text in string: \" << text << \"\\n\";\n    os << \"Input text in bytes:\";\n    for (uint8_t c : text) {\n      os << \" 0x\" << std::setfill('0') << std::setw(2) << std::right << std::hex\n         << static_cast<int32_t>(c);\n    }\n    os << \"\\n\";\n    os << \"After splitting to words:\";\n    for (const auto &w : words) {\n      os << \" \" << w;\n    }\n    os << \"\\n\";\n\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  std::vector<TokenIDs> ans;\n  std::vector<int64_t> this_sentence;\n\n  int32_t sil = -1;\n  int32_t eos = -1;\n  if (token2id_.count(\"sil\")) {\n    sil = token2id_.at(\"sil\");\n    eos = token2id_.at(\"eos\");\n  }\n\n  int32_t pad = -1;\n  if (token2id_.count(\"#0\")) {\n    pad = token2id_.at(\"#0\");\n  }\n\n  if (sil != -1) {\n    this_sentence.push_back(sil);\n  }\n\n  for (const auto &w : words) {\n    if (w == \".\" || w == \";\" || w == \"!\" || w == \"?\" || w == \"-\" || w == \":\" ||\n        w == \"。\" || w == \"；\" || w == \"！\" || w == \"？\" || w == \"：\" ||\n        w == \"”\" ||\n        // not sentence break\n        w == \",\" || w == \"“\" || w == \"，\" || w == \"、\") {\n      if (punctuations_.count(w)) {\n        if (token2id_.count(w)) {\n          this_sentence.push_back(token2id_.at(w));\n        } else if (pad != -1) {\n          this_sentence.push_back(pad);\n        } else if (sil != -1) {\n          this_sentence.push_back(sil);\n        }\n      }\n\n      if (w != \",\" && w != \"“\" && w != \"，\" && w != \"、\") {\n        if (eos != -1) {\n          this_sentence.push_back(eos);\n        }\n        
ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n\n        if (sil != -1) {\n          this_sentence.push_back(sil);\n        }\n      }\n      continue;\n    }\n\n    if (!word2ids_.count(w)) {\n      SHERPA_ONNX_LOGE(\"OOV %s. Ignore it!\", w.c_str());\n      continue;\n    }\n\n    const auto &token_ids = word2ids_.at(w);\n    this_sentence.insert(this_sentence.end(), token_ids.begin(),\n                         token_ids.end());\n  }\n\n  if (sil != -1) {\n    this_sentence.push_back(sil);\n  }\n\n  if (eos != -1) {\n    this_sentence.push_back(eos);\n  }\n\n  if (!this_sentence.empty()) {\n    ans.emplace_back(std::move(this_sentence));\n  }\n\n  return ans;\n}\n\nstd::vector<TokenIDs> Lexicon::ConvertTextToTokenIdsNotChinese(\n    const std::string &_text) const {\n  std::string text(_text);\n  ToLowerCase(&text);\n\n  std::vector<std::string> words = SplitUtf8(text);\n\n  if (debug_) {\n    std::ostringstream os;\n\n    os << \"Input text (lowercase) in string: \" << text << \"\\n\";\n    os << \"Input text in bytes:\";\n    for (uint8_t c : text) {\n      os << \" 0x\" << std::setfill('0') << std::setw(2) << std::right << std::hex\n         << static_cast<int32_t>(c);\n    }\n    os << \"\\n\";\n    os << \"After splitting to words:\";\n    for (const auto &w : words) {\n      os << \" \" << w;\n    }\n    os << \"\\n\";\n\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  int32_t blank = token2id_.at(\" \");\n\n  std::vector<TokenIDs> ans;\n  std::vector<int64_t> this_sentence;\n\n  for (const auto &w : words) {\n    if (w == \".\" || w == \";\" || w == \"!\" || w == \"?\" || w == \"-\" || w == \":\" ||\n        // not sentence break\n        w == \",\") {\n      if (punctuations_.count(w)) {\n        this_sentence.push_back(token2id_.at(w));\n      }\n\n      if (w != \",\") {\n        this_sentence.push_back(blank);\n        
ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n      }\n\n      continue;\n    }\n\n    if (!word2ids_.count(w)) {\n      SHERPA_ONNX_LOGE(\"OOV %s. Ignore it!\", w.c_str());\n      continue;\n    }\n\n    const auto &token_ids = word2ids_.at(w);\n    this_sentence.insert(this_sentence.end(), token_ids.begin(),\n                         token_ids.end());\n    this_sentence.push_back(blank);\n  }\n\n  if (!this_sentence.empty()) {\n    // remove the last blank\n    this_sentence.resize(this_sentence.size() - 1);\n  }\n\n  if (!this_sentence.empty()) {\n    ans.emplace_back(std::move(this_sentence));\n  }\n\n  return ans;\n}\n\nvoid Lexicon::InitTokens(std::istream &is) { token2id_ = ReadTokens(is); }\n\nvoid Lexicon::InitLanguage(const std::string &_lang) {\n  std::string lang(_lang);\n  ToLowerCase(&lang);\n  if (lang == \"chinese\") {\n    language_ = Language::kChinese;\n  } else if (!lang.empty()) {\n    language_ = Language::kNotChinese;\n  } else {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"Unknown language: %{public}s\", _lang.c_str());\n#else\n    SHERPA_ONNX_LOGE(\"Unknown language: %s\", _lang.c_str());\n#endif\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\nvoid Lexicon::InitLexicon(std::istream &is) {\n  std::string word;\n  std::vector<std::string> token_list;\n  std::string line;\n  std::string phone;\n\n  while (std::getline(is, line)) {\n    std::istringstream iss(line);\n\n    token_list.clear();\n\n    iss >> word;\n    ToLowerCase(&word);\n\n    if (word2ids_.count(word)) {\n      SHERPA_ONNX_LOGE(\"Duplicated word: %s. 
Ignore it.\", word.c_str());\n      continue;\n    }\n\n    while (iss >> phone) {\n      token_list.push_back(std::move(phone));\n    }\n\n    std::vector<int32_t> ids = ConvertTokensToIds(token2id_, token_list);\n    if (ids.empty()) {\n      continue;\n    }\n\n    word2ids_.insert({std::move(word), std::move(ids)});\n  }\n}\n\nvoid Lexicon::InitPunctuations(const std::string &punctuations) {\n  std::vector<std::string> punctuation_list;\n  SplitStringToVector(punctuations, \" \", false, &punctuation_list);\n  for (auto &s : punctuation_list) {\n    punctuations_.insert(std::move(s));\n  }\n}\n\n#if __ANDROID_API__ >= 9\ntemplate Lexicon::Lexicon(AAssetManager *mgr, const std::string &lexicon,\n                          const std::string &tokens,\n                          const std::string &punctuations,\n                          const std::string &language, bool debug = false);\n#endif\n\n#if __OHOS__\ntemplate Lexicon::Lexicon(NativeResourceManager *mgr,\n                          const std::string &lexicon, const std::string &tokens,\n                          const std::string &punctuations,\n                          const std::string &language, bool debug = false);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/lexicon.h",
    "content": "// sherpa-onnx/csrc/lexicon.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_LEXICON_H_\n#define SHERPA_ONNX_CSRC_LEXICON_H_\n\n#include <cstdint>\n#include <istream>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n\nnamespace sherpa_onnx {\n\nclass Lexicon : public OfflineTtsFrontend {\n public:\n  Lexicon() = default;  // for subclasses\n                        //\n  // Note: for models from piper, we won't use this class.\n  Lexicon(const std::string &lexicon, const std::string &tokens,\n          const std::string &punctuations, const std::string &language,\n          bool debug = false);\n\n  template <typename Manager>\n  Lexicon(Manager *mgr, const std::string &lexicon, const std::string &tokens,\n          const std::string &punctuations, const std::string &language,\n          bool debug = false);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text, const std::string &voice = \"\") const override;\n\n private:\n  std::vector<TokenIDs> ConvertTextToTokenIdsNotChinese(\n      const std::string &text) const;\n\n  std::vector<TokenIDs> ConvertTextToTokenIdsChinese(\n      const std::string &text) const;\n\n  void InitLanguage(const std::string &lang);\n  void InitTokens(std::istream &is);\n  void InitLexicon(std::istream &is);\n  void InitPunctuations(const std::string &punctuations);\n\n private:\n  enum class Language {\n    kNotChinese,\n    kChinese,\n    kUnknown,\n  };\n\n private:\n  std::unordered_map<std::string, std::vector<int32_t>> word2ids_;\n  std::unordered_set<std::string> punctuations_;\n  std::unordered_map<std::string, int32_t> token2id_;\n  Language language_ = Language::kUnknown;\n  bool debug_ = false;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/lodr-fst.cc",
    "content": "// sherpa-onnx/csrc/lodr-fst.cc\n//\n// Contains code copied from icefall/utils/ngram_lm.py\n// Copyright (c)  2023 Xiaomi Corporation\n//\n// Copyright (c)  2025 Tilde SIA (Askars Salimbajevs)\n\n#include \"sherpa-onnx/csrc/lodr-fst.h\"\n\n#include <algorithm>\n#include <limits>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nint32_t LodrFst::FindBackoffId() {\n  // assume that the backoff id is the only input label with epsilon output\n\n  for (int32_t state = 0; state < fst_->NumStates(); ++state) {\n    fst::ArcIterator<fst::StdConstFst> arc_iter(*fst_, state);\n    for (; !arc_iter.Done(); arc_iter.Next()) {\n      const auto &arc = arc_iter.Value();\n      if (arc.olabel == 0) {  // Check if the output label is epsilon (0)\n        return arc.ilabel;    // Return the input label\n      }\n    }\n  }\n\n  return -1;  // Return -1 if no such input symbol is found\n}\n\nLodrFst::LodrFst(const std::string &fst_path, int32_t backoff_id)\n    : backoff_id_(backoff_id) {\n  fst_ = std::unique_ptr<fst::StdConstFst>(\n      CastOrConvertToConstFst(fst::StdVectorFst::Read(fst_path)));\n\n  if (backoff_id < 0) {\n    // backoff_id_ is not provided, find it automatically\n    backoff_id_ = FindBackoffId();\n    if (backoff_id_ < 0) {\n      std::string err_msg = \"Failed to initialize LODR: No backoff arc found\";\n      SHERPA_ONNX_LOGE(\"%s\", err_msg.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n}\n\nstd::vector<std::tuple<int32_t, float>> LodrFst::ProcessBackoffArcs(\n    int32_t state, float cost) {\n  std::vector<std::tuple<int32_t, float>> ans;\n  auto next = GetNextStatesCostsNoBackoff(state, backoff_id_);\n  if (!next.has_value()) {\n    return ans;\n  }\n  auto [next_state, next_cost] = next.value();\n  ans.emplace_back(next_state, 
next_cost + cost);\n  auto recursive_result = ProcessBackoffArcs(next_state, next_cost + cost);\n  ans.insert(ans.end(), recursive_result.begin(), recursive_result.end());\n  return ans;\n}\n\nstd::optional<std::tuple<int32_t, float>> LodrFst::GetNextStatesCostsNoBackoff(\n    int32_t state, int32_t label) {\n  fst::ArcIterator<fst::StdConstFst> arc_iter(*fst_, state);\n  int32_t num_arcs = fst_->NumArcs(state);\n\n  int32_t left = 0, right = num_arcs - 1;\n  while (left <= right) {\n    int32_t mid = (left + right) / 2;\n    arc_iter.Seek(mid);\n    auto arc = arc_iter.Value();\n    if (arc.ilabel < label) {\n      left = mid + 1;\n    } else if (arc.ilabel > label) {\n      right = mid - 1;\n    } else {\n      return std::make_tuple(arc.nextstate, arc.weight.Value());\n    }\n  }\n  return std::nullopt;\n}\n\nstd::pair<std::vector<int32_t>, std::vector<float>> LodrFst::GetNextStateCosts(\n    int32_t state, int32_t label) {\n  std::vector<int32_t> states = {state};\n  std::vector<float> costs = {0};\n\n  auto extra_states_costs = ProcessBackoffArcs(state, 0);\n  for (const auto &[s, c] : extra_states_costs) {\n    states.push_back(s);\n    costs.push_back(c);\n  }\n\n  std::vector<int32_t> next_states;\n  std::vector<float> next_costs;\n  for (size_t i = 0; i < states.size(); ++i) {\n    auto next = GetNextStatesCostsNoBackoff(states[i], label);\n    if (next.has_value()) {\n      auto [ns, nc] = next.value();\n      next_states.push_back(ns);\n      next_costs.push_back(costs[i] + nc);\n    }\n  }\n\n  return std::make_pair(next_states, next_costs);\n}\n\nvoid LodrFst::ComputeScore(float scale, Hypothesis *hyp, int32_t offset) {\n  if (scale == 0) {\n    return;\n  }\n\n  hyp->lodr_state = std::make_unique<LodrStateCost>(this);\n\n  // Walk through the FST with the input text from the hypothesis\n  for (size_t i = offset; i < hyp->ys.size(); ++i) {\n    *hyp->lodr_state = hyp->lodr_state->ForwardOneStep(hyp->ys[i]);\n  }\n\n  float lodr_score = 
hyp->lodr_state->FinalScore();\n\n  if (lodr_score == -std::numeric_limits<float>::infinity()) {\n    SHERPA_ONNX_LOGE(\"Failed to compute LODR. Empty or mismatched FST?\");\n    return;\n  }\n\n  // Update the hyp score\n  hyp->log_prob += scale * lodr_score;\n}\n\nfloat LodrFst::GetFinalCost(int32_t state) {\n  auto final_weight = fst_->Final(state);\n  if (final_weight == fst::StdArc::Weight::Zero()) {\n    return 0.0;\n  }\n  return final_weight.Value();\n}\n\nLodrStateCost::LodrStateCost(\n    LodrFst *fst, const std::unordered_map<int32_t, float> &state_cost)\n    : fst_(fst) {\n  if (state_cost.empty()) {\n    state_cost_[0] = 0.0;\n  } else {\n    state_cost_ = state_cost;\n  }\n}\n\nLodrStateCost LodrStateCost::ForwardOneStep(int32_t label) {\n  std::unordered_map<int32_t, float> state_cost;\n  for (const auto &[s, c] : state_cost_) {\n    auto [next_states, next_costs] = fst_->GetNextStateCosts(s, label);\n    for (size_t i = 0; i < next_states.size(); ++i) {\n      int32_t ns = next_states[i];\n      float nc = next_costs[i];\n      if (state_cost.find(ns) == state_cost.end()) {\n        state_cost[ns] = std::numeric_limits<float>::infinity();\n      }\n      state_cost[ns] = std::min(state_cost[ns], c + nc);\n    }\n  }\n  return LodrStateCost(fst_, state_cost);\n}\n\nfloat LodrStateCost::Score() const {\n  if (state_cost_.empty()) {\n    return -std::numeric_limits<float>::infinity();\n  }\n  auto min_cost = std::min_element(\n      state_cost_.begin(), state_cost_.end(),\n      [](const auto &a, const auto &b) { return a.second < b.second; });\n  return -min_cost->second;\n}\n\nfloat LodrStateCost::FinalScore() const {\n  if (state_cost_.empty()) {\n    return -std::numeric_limits<float>::infinity();\n  }\n  auto min_cost = std::min_element(\n      state_cost_.begin(), state_cost_.end(),\n      [](const auto &a, const auto &b) { return a.second < b.second; });\n  return -(min_cost->second + fst_->GetFinalCost(min_cost->first));\n}\n\n}  // namespace 
sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/lodr-fst.h",
    "content": "// sherpa-onnx/csrc/lodr-fst.h\n//\n// Contains code copied from icefall/utils/ngram_lm.py\n// Copyright (c)  2023 Xiaomi Corporation\n//\n// Copyright (c)  2025 Tilde SIA (Askars Salimbajevs)\n\n\n#ifndef SHERPA_ONNX_CSRC_LODR_FST_H_\n#define SHERPA_ONNX_CSRC_LODR_FST_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n#include <optional>\n#include <tuple>\n#include <unordered_map>\n#include <limits>\n#include <algorithm>\n#include <utility>\n\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n\nnamespace sherpa_onnx {\n\nstruct Hypothesis;\n\nclass LodrFst {\n public:\n  explicit LodrFst(const std::string &fst_path, int32_t backoff_id = -1);\n\n  std::pair<std::vector<int32_t>, std::vector<float>> GetNextStateCosts(\n    int32_t state, int32_t label);\n\n  float GetFinalCost(int32_t state);\n\n  void ComputeScore(float scale, Hypothesis *hyp, int32_t offset);\n\n private:\n  fst::StdVectorFst YsToFst(const std::vector<int64_t> &ys, int32_t offset);\n\n  std::vector<std::tuple<int32_t, float>> ProcessBackoffArcs(\n    int32_t state, float cost);\n\n  std::optional<std::tuple<int32_t, float>> GetNextStatesCostsNoBackoff(\n    int32_t state, int32_t label);\n\n  int32_t FindBackoffId();\n\n\n  int32_t backoff_id_ = -1;\n  std::unique_ptr<fst::StdConstFst> fst_;  // owned by this class\n};\n\nclass LodrStateCost {\n public:\n  explicit LodrStateCost(\n    LodrFst* fst,\n    const std::unordered_map<int32_t, float> &state_cost = {});\n\n    LodrStateCost ForwardOneStep(int32_t label);\n\n  float Score() const;\n  float FinalScore() const;\n\n private:\n  // The fst_ is not owned by this class and borrowed from the caller\n  // (e.g. OnlineRnnLM).\n  LodrFst* fst_;\n  std::unordered_map<int32_t, float> state_cost_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_LODR_FST_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/log.cc",
    "content": "// sherpa-onnx/csrc/log.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/log.h\"\n\n#ifdef SHERPA_ONNX_HAVE_EXECINFO_H\n#include <execinfo.h>  // To get stack trace in error messages.\n#ifdef SHERPA_ONNX_HAVE_CXXABI_H\n#include <cxxabi.h>  // For name demangling.\n// Useful to decode the stack trace, but only used if we have execinfo.h\n#endif  // SHERPA_ONNX_HAVE_CXXABI_H\n#endif  // SHERPA_ONNX_HAVE_EXECINFO_H\n\n#include <stdlib.h>\n\n#include <ctime>\n#include <iomanip>\n#include <string>\n\nnamespace sherpa_onnx {\n\nstd::string GetDateTimeStr() {\n  std::ostringstream os;\n  std::time_t t = std::time(nullptr);\n  std::tm tm = *std::localtime(&t);\n  os << std::put_time(&tm, \"%F %T\");  // yyyy-mm-dd hh:mm:ss\n  return os.str();\n}\n\nstatic bool LocateSymbolRange(const std::string &trace_name, std::size_t *begin,\n                              std::size_t *end) {\n  // Find the first '_' with leading ' ' or '('.\n  *begin = std::string::npos;\n  for (std::size_t i = 1; i < trace_name.size(); ++i) {\n    if (trace_name[i] != '_') {\n      continue;\n    }\n    if (trace_name[i - 1] == ' ' || trace_name[i - 1] == '(') {\n      *begin = i;\n      break;\n    }\n  }\n  if (*begin == std::string::npos) {\n    return false;\n  }\n  *end = trace_name.find_first_of(\" +\", *begin);\n  return *end != std::string::npos;\n}\n\n#ifdef SHERPA_ONNX_HAVE_EXECINFO_H\nstatic std::string Demangle(const std::string &trace_name) {\n#ifndef SHERPA_ONNX_HAVE_CXXABI_H\n  return trace_name;\n#else   // SHERPA_ONNX_HAVE_CXXABI_H\n  // Try demangle the symbol. 
We are trying to support the following formats\n  // produced by different platforms:\n  //\n  // Linux:\n  //   ./kaldi-error-test(_ZN5kaldi13UnitTestErrorEv+0xb) [0x804965d]\n  //\n  // Mac:\n  //   0 server 0x000000010f67614d _ZNK5kaldi13MessageLogger10LogMessageEv + 813\n  //\n  // We want to extract the name e.g., '_ZN5kaldi13UnitTestErrorEv' and\n  // demangle it into a readable name like kaldi::UnitTestError.\n  std::size_t begin, end;\n  if (!LocateSymbolRange(trace_name, &begin, &end)) {\n    return trace_name;\n  }\n  std::string symbol = trace_name.substr(begin, end - begin);\n  int status;\n  char *demangled_name = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status);\n  if (status == 0 && demangled_name != nullptr) {\n    symbol = demangled_name;\n    free(demangled_name);\n  }\n  return trace_name.substr(0, begin) + symbol +\n         trace_name.substr(end, std::string::npos);\n#endif  // SHERPA_ONNX_HAVE_CXXABI_H\n}\n#endif  // SHERPA_ONNX_HAVE_EXECINFO_H\n\nstd::string GetStackTrace() {\n  std::string ans;\n#ifdef SHERPA_ONNX_HAVE_EXECINFO_H\n  constexpr const std::size_t kMaxTraceSize = 50;\n  constexpr const std::size_t kMaxTracePrint = 50;  // Must be even.\n                                                    // Buffer for the trace.\n  void *trace[kMaxTraceSize];\n  // Get the trace.\n  std::size_t size = backtrace(trace, kMaxTraceSize);\n  // Get the trace symbols.\n  char **trace_symbol = backtrace_symbols(trace, size);\n  if (trace_symbol == nullptr) return ans;\n\n  // Compose a human-readable backtrace string.\n  ans += \"[ Stack-Trace: ]\\n\";\n  if (size <= kMaxTracePrint) {\n    for (std::size_t i = 0; i < size; ++i) {\n      ans += Demangle(trace_symbol[i]) + \"\\n\";\n    }\n  } else {  // Print out first+last (e.g.) 
5.\n    for (std::size_t i = 0; i < kMaxTracePrint / 2; ++i) {\n      ans += Demangle(trace_symbol[i]) + \"\\n\";\n    }\n    ans += \".\\n.\\n.\\n\";\n    for (std::size_t i = size - kMaxTracePrint / 2; i < size; ++i) {\n      ans += Demangle(trace_symbol[i]) + \"\\n\";\n    }\n    if (size == kMaxTraceSize)\n      ans += \".\\n.\\n.\\n\";  // Stack was too long, probably a bug.\n  }\n\n  // We must free the array of pointers allocated by backtrace_symbols(),\n  // but not the strings themselves.\n  free(trace_symbol);\n#endif  // SHERPA_ONNX_HAVE_EXECINFO_H\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/log.h",
    "content": "// sherpa-onnx/csrc/log.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_LOG_H_\n#define SHERPA_ONNX_CSRC_LOG_H_\n\n#include <stdio.h>\n\n#include <mutex>  // NOLINT\n#include <sstream>\n#include <string>\n\nnamespace sherpa_onnx {\n\n#if SHERPA_ONNX_ENABLE_CHECK\n\n#if defined(NDEBUG)\nconstexpr bool kDisableDebug = true;\n#else\nconstexpr bool kDisableDebug = false;\n#endif\n\nenum class LogLevel {\n  kTrace = 0,\n  kDebug = 1,\n  kInfo = 2,\n  kWarning = 3,\n  kError = 4,\n  kFatal = 5,  // print message and abort the program\n};\n\n// They are used in SHERPA_ONNX_LOG(xxx), so their names\n// do not follow the google c++ code style\n//\n// You can use them in the following way:\n//\n//  SHERPA_ONNX_LOG(TRACE) << \"some message\";\n//  SHERPA_ONNX_LOG(DEBUG) << \"some message\";\n#ifndef _MSC_VER\nconstexpr LogLevel TRACE = LogLevel::kTrace;\nconstexpr LogLevel DEBUG = LogLevel::kDebug;\nconstexpr LogLevel INFO = LogLevel::kInfo;\nconstexpr LogLevel WARNING = LogLevel::kWarning;\nconstexpr LogLevel ERROR = LogLevel::kError;\nconstexpr LogLevel FATAL = LogLevel::kFatal;\n#else\n#define TRACE LogLevel::kTrace\n#define DEBUG LogLevel::kDebug\n#define INFO LogLevel::kInfo\n#define WARNING LogLevel::kWarning\n#define ERROR LogLevel::kError\n#define FATAL LogLevel::kFatal\n#endif\n\nstd::string GetStackTrace();\n\n/* Return the current log level.\n\n\n   If the current log level is TRACE, then all logged messages are printed out.\n\n   If the current log level is DEBUG, log messages with \"TRACE\" level are not\n   shown and all other levels are printed out.\n\n   Similarly, if the current log level is INFO, log message with \"TRACE\" and\n   \"DEBUG\" are not shown and all other levels are printed out.\n\n   If it is FATAL, then only FATAL messages are shown.\n */\ninline LogLevel GetCurrentLogLevel() {\n  static LogLevel log_level = INFO;\n  static std::once_flag init_flag;\n  std::call_once(init_flag, []() {\n    const 
char *env_log_level = std::getenv(\"SHERPA_ONNX_LOG_LEVEL\");\n    if (env_log_level == nullptr) return;\n\n    std::string s = env_log_level;\n    if (s == \"TRACE\")\n      log_level = TRACE;\n    else if (s == \"DEBUG\")\n      log_level = DEBUG;\n    else if (s == \"INFO\")\n      log_level = INFO;\n    else if (s == \"WARNING\")\n      log_level = WARNING;\n    else if (s == \"ERROR\")\n      log_level = ERROR;\n    else if (s == \"FATAL\")\n      log_level = FATAL;\n    else\n      fprintf(stderr,\n              \"Unknown SHERPA_ONNX_LOG_LEVEL: %s\"\n              \"\\nSupported values are: \"\n              \"TRACE, DEBUG, INFO, WARNING, ERROR, FATAL\",\n              s.c_str());\n  });\n  return log_level;\n}\n\ninline bool EnableAbort() {\n  static std::once_flag init_flag;\n  static bool enable_abort = false;\n  std::call_once(init_flag, []() {\n    enable_abort = (std::getenv(\"SHERPA_ONNX_ABORT\") != nullptr);\n  });\n  return enable_abort;\n}\n\nclass Logger {\n public:\n  Logger(const char *filename, const char *func_name, uint32_t line_num,\n         LogLevel level)\n      : filename_(filename),\n        func_name_(func_name),\n        line_num_(line_num),\n        level_(level) {\n    cur_level_ = GetCurrentLogLevel();\n    switch (level) {\n      case TRACE:\n        if (cur_level_ <= TRACE) fprintf(stderr, \"[T] \");\n        break;\n      case DEBUG:\n        if (cur_level_ <= DEBUG) fprintf(stderr, \"[D] \");\n        break;\n      case INFO:\n        if (cur_level_ <= INFO) fprintf(stderr, \"[I] \");\n        break;\n      case WARNING:\n        if (cur_level_ <= WARNING) fprintf(stderr, \"[W] \");\n        break;\n      case ERROR:\n        if (cur_level_ <= ERROR) fprintf(stderr, \"[E] \");\n        break;\n      case FATAL:\n        if (cur_level_ <= FATAL) fprintf(stderr, \"[F] \");\n        break;\n    }\n\n    if (cur_level_ <= level_) {\n      fprintf(stderr, \"%s:%u:%s \", filename, line_num, func_name);\n    }\n  }\n\n  ~Logger() 
noexcept(false) {\n    static constexpr const char *kErrMsg = R\"(\n    Some bad things happened. Please read the above error messages and stack\n    trace. If you are using Python, the following command may be helpful:\n\n      gdb --args python /path/to/your/code.py\n\n    (You can use `gdb` to debug the code. Please consider compiling\n    a debug version of sherpa_onnx.).\n\n    If you are unable to fix it, please open an issue at:\n\n      https://github.com/csukuangfj/kaldi-native-fbank/issues/new\n    )\";\n    if (level_ == FATAL) {\n      fprintf(stderr, \"\\n\");\n      std::string stack_trace = GetStackTrace();\n      if (!stack_trace.empty()) {\n        fprintf(stderr, \"\\n\\n%s\\n\", stack_trace.c_str());\n      }\n\n      fflush(nullptr);\n\n#ifndef __ANDROID_API__\n      if (EnableAbort()) {\n        // NOTE: abort() will terminate the program immediately without\n        // printing the Python stack backtrace.\n        abort();\n      }\n\n      throw std::runtime_error(kErrMsg);\n#else\n      abort();\n#endif\n    }\n  }\n\n  const Logger &operator<<(bool b) const {\n    if (cur_level_ <= level_) {\n      fprintf(stderr, b ? 
\"true\" : \"false\");\n    }\n    return *this;\n  }\n\n  const Logger &operator<<(int8_t i) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%d\", i);\n    return *this;\n  }\n\n  const Logger &operator<<(const char *s) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%s\", s);\n    return *this;\n  }\n\n  const Logger &operator<<(int32_t i) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%d\", i);\n    return *this;\n  }\n\n  const Logger &operator<<(uint32_t i) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%u\", i);\n    return *this;\n  }\n\n  const Logger &operator<<(uint64_t i) const {\n    if (cur_level_ <= level_)\n      fprintf(stderr, \"%llu\", (long long unsigned int)i);  // NOLINT\n    return *this;\n  }\n\n  const Logger &operator<<(int64_t i) const {\n    if (cur_level_ <= level_)\n      fprintf(stderr, \"%lli\", (long long int)i);  // NOLINT\n    return *this;\n  }\n\n  const Logger &operator<<(float f) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%f\", f);\n    return *this;\n  }\n\n  const Logger &operator<<(double d) const {\n    if (cur_level_ <= level_) fprintf(stderr, \"%f\", d);\n    return *this;\n  }\n\n  template <typename T>\n  const Logger &operator<<(const T &t) const {\n    // require T overloads operator<<\n    std::ostringstream os;\n    os << t;\n    return *this << os.str().c_str();\n  }\n\n  // specialization to fix compile error: `stringstream << nullptr` is ambiguous\n  const Logger &operator<<(const std::nullptr_t &null) const {\n    if (cur_level_ <= level_) *this << \"(null)\";\n    return *this;\n  }\n\n private:\n  const char *filename_;\n  const char *func_name_;\n  uint32_t line_num_;\n  LogLevel level_;\n  LogLevel cur_level_;\n};\n#endif  // SHERPA_ONNX_ENABLE_CHECK\n\nclass Voidifier {\n public:\n#if SHERPA_ONNX_ENABLE_CHECK\n  void operator&(const Logger &) const {}\n#endif\n};\n#if !defined(SHERPA_ONNX_ENABLE_CHECK)\ntemplate <typename T>\nconst Voidifier 
&operator<<(const Voidifier &v, T &&) {\n  return v;\n}\n#endif\n\n}  // namespace sherpa_onnx\n\n#define SHERPA_ONNX_STATIC_ASSERT(x) static_assert(x, \"\")\n\n#ifdef SHERPA_ONNX_ENABLE_CHECK\n\n#if defined(__clang__) || defined(__GNUC__) || defined(__GNUG__) || \\\n    defined(__PRETTY_FUNCTION__)\n// for clang and GCC\n#define SHERPA_ONNX_FUNC __PRETTY_FUNCTION__\n#else\n// for other compilers\n#define SHERPA_ONNX_FUNC __func__\n#endif\n\n#define SHERPA_ONNX_CHECK(x)                                            \\\n  (x) ? (void)0                                                         \\\n      : ::sherpa_onnx::Voidifier() &                                    \\\n            ::sherpa_onnx::Logger(__FILE__, SHERPA_ONNX_FUNC, __LINE__, \\\n                                  ::sherpa_onnx::FATAL)                 \\\n                << \"Check failed: \" << #x << \" \"\n\n// WARNING: x and y may be evaluated multiple times, but this happens only\n// when the check fails. Since the program aborts if it fails, we don't think\n// the extra evaluation of x and y matters.\n//\n// CAUTION: we recommend the following use case:\n//\n//      auto x = Foo();\n//      auto y = Bar();\n//      SHERPA_ONNX_CHECK_EQ(x, y) << \"Some message\";\n//\n//  And please avoid\n//\n//      SHERPA_ONNX_CHECK_EQ(Foo(), Bar());\n//\n//  if `Foo()` or `Bar()` causes some side effects, e.g., changing some\n//  local static variables or global variables.\n#define _SHERPA_ONNX_CHECK_OP(x, y, op)                                        \\\n  ((x)op(y)) ? 
(void)0                                                         \\\n             : ::sherpa_onnx::Voidifier() &                                    \\\n                   ::sherpa_onnx::Logger(__FILE__, SHERPA_ONNX_FUNC, __LINE__, \\\n                                         ::sherpa_onnx::FATAL)                 \\\n                       << \"Check failed: \" << #x << \" \" << #op << \" \" << #y    \\\n                       << \" (\" << (x) << \" vs. \" << (y) << \") \"\n\n#define SHERPA_ONNX_CHECK_EQ(x, y) _SHERPA_ONNX_CHECK_OP(x, y, ==)\n#define SHERPA_ONNX_CHECK_NE(x, y) _SHERPA_ONNX_CHECK_OP(x, y, !=)\n#define SHERPA_ONNX_CHECK_LT(x, y) _SHERPA_ONNX_CHECK_OP(x, y, <)\n#define SHERPA_ONNX_CHECK_LE(x, y) _SHERPA_ONNX_CHECK_OP(x, y, <=)\n#define SHERPA_ONNX_CHECK_GT(x, y) _SHERPA_ONNX_CHECK_OP(x, y, >)\n#define SHERPA_ONNX_CHECK_GE(x, y) _SHERPA_ONNX_CHECK_OP(x, y, >=)\n\n#define SHERPA_ONNX_LOG(x) \\\n  ::sherpa_onnx::Logger(__FILE__, SHERPA_ONNX_FUNC, __LINE__, ::sherpa_onnx::x)\n\n// ------------------------------------------------------------\n//       For debug check\n// ------------------------------------------------------------\n// If you define the macro \"-D NDEBUG\" while compiling sherpa-onnx,\n// the following macros are in fact empty and do nothing.\n\n#define SHERPA_ONNX_DCHECK(x) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK(x)\n\n#define SHERPA_ONNX_DCHECK_EQ(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK_EQ(x, y)\n\n#define SHERPA_ONNX_DCHECK_NE(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK_NE(x, y)\n\n#define SHERPA_ONNX_DCHECK_LT(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK_LT(x, y)\n\n#define SHERPA_ONNX_DCHECK_LE(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK_LE(x, y)\n\n#define SHERPA_ONNX_DCHECK_GT(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? 
(void)0 : SHERPA_ONNX_CHECK_GT(x, y)\n\n#define SHERPA_ONNX_DCHECK_GE(x, y) \\\n  ::sherpa_onnx::kDisableDebug ? (void)0 : SHERPA_ONNX_CHECK_GE(x, y)\n\n#define SHERPA_ONNX_DLOG(x)    \\\n  ::sherpa_onnx::kDisableDebug \\\n      ? (void)0                \\\n      : ::sherpa_onnx::Voidifier() & SHERPA_ONNX_LOG(x)\n\n#else\n\n#define SHERPA_ONNX_CHECK(x) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_LOG(x) ::sherpa_onnx::Voidifier()\n\n#define SHERPA_ONNX_CHECK_EQ(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_CHECK_NE(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_CHECK_LT(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_CHECK_LE(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_CHECK_GT(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_CHECK_GE(x, y) ::sherpa_onnx::Voidifier()\n\n#define SHERPA_ONNX_DCHECK(x) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DLOG(x) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_EQ(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_NE(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_LT(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_LE(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_GT(x, y) ::sherpa_onnx::Voidifier()\n#define SHERPA_ONNX_DCHECK_GE(x, y) ::sherpa_onnx::Voidifier()\n\n#endif  // SHERPA_ONNX_ENABLE_CHECK\n\n#endif  // SHERPA_ONNX_CSRC_LOG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/macros.h",
    "content": "// sherpa-onnx/csrc/macros.h\n//\n// Copyright      2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_MACROS_H_\n#define SHERPA_ONNX_CSRC_MACROS_H_\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <utility>\n#if __OHOS__\n#include \"hilog/log.h\"\n\n#undef LOG_DOMAIN\n#undef LOG_TAG\n\n// https://gitee.com/openharmony/docs/blob/145a084f0b742e4325915e32f8184817927d1251/en/contribute/OpenHarmony-Log-guide.md#hilog-api-usage-specifications\n#define LOG_DOMAIN 0x6666\n#define LOG_TAG \"sherpa_onnx\"\n#endif\n\n#if __ANDROID_API__ >= 8\n#include \"android/log.h\"\n#define SHERPA_ONNX_LOGE(...)                                                  \\\n  do {                                                                         \\\n    fprintf(stderr, \"%s:%s:%d \", __FILE__, __func__,                           \\\n            static_cast<int32_t>(__LINE__));                                   \\\n    fprintf(stderr, ##__VA_ARGS__);                                            \\\n    fprintf(stderr, \"\\n\");                                                     \\\n    __android_log_print(ANDROID_LOG_WARN, \"sherpa-onnx\", \"%s:%s:%d\", __FILE__, \\\n                        __func__, static_cast<int32_t>(__LINE__));             \\\n    __android_log_print(ANDROID_LOG_WARN, \"sherpa-onnx\", ##__VA_ARGS__);       \\\n  } while (0)\n#elif defined(__OHOS__)\n#define SHERPA_ONNX_LOGE(...) OH_LOG_INFO(LOG_APP, ##__VA_ARGS__)\n#elif SHERPA_ONNX_ENABLE_WASM\n#define SHERPA_ONNX_LOGE(...)                        \\\n  do {                                               \\\n    fprintf(stdout, \"%s:%s:%d \", __FILE__, __func__, \\\n            static_cast<int>(__LINE__));             \\\n    fprintf(stdout, ##__VA_ARGS__);                  \\\n    fprintf(stdout, \"\\n\");                           \\\n  } while (0)\n#else\n#define SHERPA_ONNX_LOGE(...)                        
\\\n  do {                                               \\\n    fprintf(stderr, \"%s:%s:%d \", __FILE__, __func__, \\\n            static_cast<int>(__LINE__));             \\\n    fprintf(stderr, ##__VA_ARGS__);                  \\\n    fprintf(stderr, \"\\n\");                           \\\n  } while (0)\n#endif\n\n#define SHERPA_ONNX_EXIT(code) exit(code)\n\n// Read an integer\n#define SHERPA_ONNX_READ_META_DATA(dst, src_key)                           \\\n  do {                                                                     \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n    if (value.empty()) {                                                   \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the metadata\", src_key);    \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n                                                                           \\\n    dst = atoi(value.c_str());                                             \\\n    if (dst < 0) {                                                         \\\n      SHERPA_ONNX_LOGE(\"Invalid value %d for '%s'\", dst, src_key);         \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n  } while (0)\n\n#define SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(dst, src_key, default_value) \\\n  do {                                                                       \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator);   \\\n    if (value.empty()) {                                                     \\\n      dst = default_value;                                                   \\\n    } else {                                                                 \\\n      dst = atoi(value.c_str());                                             \\\n 
     if (dst < 0) {                                                         \\\n        SHERPA_ONNX_LOGE(\"Invalid value %d for '%s'\", dst, src_key);         \\\n        SHERPA_ONNX_EXIT(-1);                                                \\\n      }                                                                      \\\n    }                                                                        \\\n  } while (0)\n\n// read a vector of integers\n#define SHERPA_ONNX_READ_META_DATA_VEC(dst, src_key)                           \\\n  do {                                                                         \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator);     \\\n    if (value.empty()) {                                                       \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the metadata\", src_key);        \\\n      SHERPA_ONNX_EXIT(-1);                                                    \\\n    }                                                                          \\\n                                                                               \\\n    bool ret = SplitStringToIntegers(value.c_str(), \",\", true, &dst);          \\\n    if (!ret) {                                                                \\\n      SHERPA_ONNX_LOGE(\"Invalid value '%s' for '%s'\", value.c_str(), src_key); \\\n      SHERPA_ONNX_EXIT(-1);                                                    \\\n    }                                                                          \\\n  } while (0)\n\n// read a vector of floats\n#define SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(dst, src_key)                     \\\n  do {                                                                         \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator);     \\\n    if (value.empty()) {                                                       \\\n      SHERPA_ONNX_LOGE(\"%s does not exist in the metadata\", src_key);          
\\\n      SHERPA_ONNX_EXIT(-1);                                                    \\\n    }                                                                          \\\n                                                                               \\\n    bool ret = SplitStringToFloats(value.c_str(), \",\", true, &dst);            \\\n    if (!ret) {                                                                \\\n      SHERPA_ONNX_LOGE(\"Invalid value '%s' for '%s'\", value.c_str(), src_key); \\\n      SHERPA_ONNX_EXIT(-1);                                                    \\\n    }                                                                          \\\n  } while (0)\n\n// read a vector of strings\n#define SHERPA_ONNX_READ_META_DATA_VEC_STRING(dst, src_key)                \\\n  do {                                                                     \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n    if (value.empty()) {                                                   \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the metadata\", src_key);    \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n    SplitStringToVector(value.c_str(), \",\", false, &dst);                  \\\n                                                                           \\\n    if (dst.empty()) {                                                     \\\n      SHERPA_ONNX_LOGE(\"Invalid value '%s' for '%s'. 
Empty vector!\",       \\\n                       value.c_str(), src_key);                            \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n  } while (0)\n\n// read a vector of strings separated by sep\n#define SHERPA_ONNX_READ_META_DATA_VEC_STRING_SEP(dst, src_key, sep)       \\\n  do {                                                                     \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n    if (value.empty()) {                                                   \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the metadata\", src_key);    \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n    SplitStringToVector(value.c_str(), sep, false, &dst);                  \\\n                                                                           \\\n    if (dst.empty()) {                                                     \\\n      SHERPA_ONNX_LOGE(\"Invalid value '%s' for '%s'. 
Empty vector!\",       \\\n                       value.c_str(), src_key);                            \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n  } while (0)\n\n// Read a string\n#define SHERPA_ONNX_READ_META_DATA_STR(dst, src_key)                       \\\n  do {                                                                     \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n    if (value.empty()) {                                                   \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the metadata\", src_key);    \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n                                                                           \\\n    dst = std::move(value);                                                \\\n    if (dst.empty()) {                                                     \\\n      SHERPA_ONNX_LOGE(\"Invalid value for '%s'\\n\", src_key);               \\\n      SHERPA_ONNX_EXIT(-1);                                                \\\n    }                                                                      \\\n  } while (0)\n\n#define SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(dst, src_key)           \\\n  do {                                                                     \\\n    auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n                                                                           \\\n    dst = std::move(value);                                                \\\n  } while (0)\n\n#define SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(dst, src_key,          \\\n                                                    default_value)         \\\n  do {                                                                     \\\n   
 auto value = LookupCustomModelMetaData(meta_data, src_key, allocator); \\\n    if (value.empty()) {                                                   \\\n      dst = default_value;                                                 \\\n    } else {                                                               \\\n      dst = std::move(value);                                              \\\n      if (dst.empty()) {                                                   \\\n        SHERPA_ONNX_LOGE(\"Invalid value for '%s'\\n\", src_key);             \\\n        SHERPA_ONNX_EXIT(-1);                                              \\\n      }                                                                    \\\n    }                                                                      \\\n  } while (0)\n\n#endif  // SHERPA_ONNX_CSRC_MACROS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/matcha-tts-lexicon.cc",
    "content": "// sherpa-onnx/csrc/matcha-tts-lexicon.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/matcha-tts-lexicon.h\"\n\n#include <ctype.h>\n\n#include <algorithm>\n#include <fstream>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"espeak-ng/speak_lib.h\"\n#include \"phoneme_ids.hpp\"  // NOLINT\n#include \"phonemize.hpp\"    // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n// Please see https://github.com/k2-fsa/sherpa-onnx/pull/2853\n// for why we need to do the replacement\nstatic const std::vector<std::pair<std::string, std::string>> kReplacements = {\n    {\"ɝ\", \"ɜɹ\"}, {\"ɚ\", \"əɹ\"},\n\n    {\"eɪ\", \"A\"}, {\"aɪ\", \"I\"}, {\"ɔɪ\", \"Y\"},\n    {\"oʊ\", \"O\"}, {\"əʊ\", \"O\"}, {\"aʊ\", \"W\"},\n\n    {\"tʃ\", \"ʧ\"}, {\"dʒ\", \"ʤ\"},\n\n    {\"ː\", \"\"},\n\n    {\"g\", \"ɡ\"},  {\"r\", \"ɹ\"},\n\n    {\"e\", \"ɛ\"},\n};\n\nstd::vector<std::string> ConvertPhonemesToUTF8(\n    const std::vector<std::vector<char32_t>> &phonemes) {\n  std::vector<std::string> out;\n\n  for (const auto &word : phonemes) {\n    for (char32_t cp : word) {\n      out.push_back(Utf32ToUtf8(cp));\n    }\n  }\n\n  return out;\n}\n\nstd::string ApplyReplacements(std::string s) {\n  for (const auto &p : kReplacements) {\n    const std::string &from = p.first;\n    const std::string &to = p.second;\n\n    size_t pos = 0;\n    while ((pos = s.find(from, pos)) 
!= std::string::npos) {\n      s.replace(pos, from.size(), to);\n      pos += to.size();\n    }\n  }\n  return s;\n}\n\nstd::vector<std::string> SplitTokensUTF8(const std::string &s) {\n  std::vector<std::string> out;\n\n  for (size_t i = 0; i < s.size();) {\n    unsigned char c = s[i];\n    size_t len = (c < 0x80) ? 1 : (c < 0xE0) ? 2 : (c < 0xF0) ? 3 : 4;\n\n    out.push_back(s.substr(i, len));\n    i += len;\n  }\n\n  return out;\n}\n\nstd::vector<std::string> ProcessPhonemes(\n    const std::vector<std::vector<char32_t>> &phonemes, bool skip_replacement) {\n  auto tokens = ConvertPhonemesToUTF8(phonemes);\n  if (skip_replacement) {\n    return tokens;\n  }\n\n  std::string joined = Join(tokens);\n  std::string replaced = ApplyReplacements(joined);\n  return SplitTokensUTF8(replaced);\n}\n\n}  // namespace\n\nvoid CallPhonemizeEspeak(const std::string &text,\n                         piper::eSpeakPhonemeConfig &config,  // NOLINT\n                         std::vector<std::vector<piper::Phoneme>> *phonemes);\n\nclass MatchaTtsLexicon::Impl {\n public:\n  Impl(const std::string &lexicon, const std::string &tokens,\n       const std::string &data_dir, bool debug, bool skip_replacement)\n      : debug_(debug), skip_replacement_(skip_replacement) {\n    if (lexicon.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide lexicon.txt for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      std::ifstream is(tokens);\n      InitTokens(is);\n    }\n\n    InitLexicon(lexicon);\n\n    if (data_dir.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide data dir for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitEspeak(data_dir);  // See ./piper-phonemize-lexicon.cc\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const std::string &lexicon, const std::string &tokens,\n       const std::string &data_dir, bool debug, bool skip_replacement)\n      : debug_(debug), skip_replacement_(skip_replacement) {\n    if (lexicon.empty()) {\n      
SHERPA_ONNX_LOGE(\"Please provide lexicon.txt for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      auto buf = ReadFile(mgr, tokens);\n      std::istringstream is(std::string(buf.data(), buf.size()));\n\n      InitTokens(is);\n    }\n\n    std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      auto buf = ReadFile(mgr, f);\n\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitLexicon(is);\n    }\n\n    if (data_dir.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide data dir for this model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitEspeak(data_dir);  // See ./piper-phonemize-lexicon.cc\n  }\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(const std::string &_text) const {\n    std::string text = _text;\n    std::vector<std::pair<std::string, std::string>> replace_str_pairs = {\n        {\"，\", \",\"}, {\"、\", \",\"}, {\"；\", \";\"}, {\"：\", \",\"},   {\":\", \",\"},\n        {\"。\", \".\"}, {\"？\", \"?\"}, {\"！\", \"!\"}, {\"\\\\s+\", \" \"},\n    };\n    for (const auto &p : replace_str_pairs) {\n      std::regex re(p.first);\n      text = std::regex_replace(text, re, p.second);\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"After replacing punctuations and merging spaces:\\n%s\",\n                       text.c_str());\n    }\n\n    std::vector<std::string> words = SplitUtf8(text);\n\n    if (debug_) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"input text:\\n%{public}s\", _text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%{public}s\",\n                       text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"input text:\\n%s\", _text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%s\", text.c_str());\n#endif\n\n      std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      
SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%s\", os.str().c_str());\n#endif\n    }\n\n    // remove spaces after punctuations\n    std::vector<std::string> words2 = std::move(words);\n    words.reserve(words2.size());\n\n    for (int32_t i = 0; i < words2.size(); ++i) {\n      if (i == 0) {\n        words.push_back(std::move(words2[i]));\n      } else if (words2[i] == \" \") {\n        if (words.back() == \" \" || IsPunct(words.back())) {\n          continue;\n        } else {\n          words.push_back(std::move(words2[i]));\n        }\n      } else if (IsPunct(words2[i])) {\n        if (words.back() == \" \" || IsPunct(words.back())) {\n          continue;\n        } else {\n          words.push_back(std::move(words2[i]));\n        }\n      } else {\n        words.push_back(std::move(words2[i]));\n      }\n    }\n\n    if (debug_) {\n      std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"after removing spaces after punctuations:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after removing spaces after punctuations:\\n%s\",\n                       os.str().c_str());\n#endif\n    }\n\n    std::vector<TokenIDs> ans;\n    std::vector<int64_t> this_sentence;\n\n    PhraseMatcher matcher(&all_words_, words, debug_);\n\n    int32_t blank = token2id_.at(\" \");\n\n    std::vector<int32_t> ids;\n    std::string last_word;\n    for (const std::string &w : matcher) {\n      ids = ConvertWordToIds(w);\n\n      if (ids.empty()) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%{public}s'\", w.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%s'\", w.c_str());\n#endif\n\n        last_word = w;\n        continue;\n      }\n\n      if (!last_word.empty() && 
isalpha(last_word[0])) {\n        this_sentence.push_back(blank);\n      }\n\n      this_sentence.insert(this_sentence.end(), ids.begin(), ids.end());\n\n      if (IsPunct(w)) {\n        if (debug_) {\n          std::ostringstream os;\n          std::string sep;\n          os << \"new sentence: [\";\n          for (auto i : this_sentence) {\n            os << sep << i;\n            sep = \", \";\n          }\n          os << \"]\";\n          SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n        }\n\n        ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n      }\n\n      last_word = w;\n    }  // for (const std::string &w : matcher)\n\n    if (!this_sentence.empty()) {\n      ans.emplace_back(std::move(this_sentence));\n    }\n\n    return ans;\n  }\n\n private:\n  std::vector<int32_t> ConvertWordToIds(const std::string &w) const {\n    std::vector<int32_t> ans;\n    if (word2ids_.count(w)) {\n      ans = word2ids_.at(w);\n    } else if (token2id_.count(w)) {\n      ans = {token2id_.at(w)};\n    } else {\n      if (ContainsCJK(w)) {\n        std::vector<std::string> words = SplitUtf8(w);\n        for (const auto &word : words) {\n          if (word2ids_.count(word)) {\n            auto ids = ConvertWordToIds(word);\n            ans.insert(ans.end(), ids.begin(), ids.end());\n          }\n        }\n      } else {\n        if (debug_) {\n          SHERPA_ONNX_LOGE(\"use espeak for %s\", w.c_str());\n        }\n        // use espeak\n        piper::eSpeakPhonemeConfig config;\n        config.voice = \"en-us\";\n        std::vector<std::vector<piper::Phoneme>> phonemes;\n        CallPhonemizeEspeak(w, config, &phonemes);\n\n        auto pp = ProcessPhonemes(phonemes, skip_replacement_);\n\n        for (const auto &p : pp) {\n          if (token2id_.count(p)) {\n            ans.push_back(token2id_.at(p));\n          } else {\n            SHERPA_ONNX_LOGE(\"Skip token: %s\", p.c_str());\n          }\n        }\n      }\n    }\n\n    if (debug_) 
{\n      std::ostringstream os;\n      os << w << \": \";\n      for (auto i : ans) {\n        os << \"'\" << id2token_.at(i) << \"'(\" << i << \")\" << \",\";\n      }\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    return ans;\n  }\n\n  void InitTokens(std::istream &is) {\n    token2id_ = ReadTokens(is);\n\n    if (debug_) {\n      for (const auto &p : token2id_) {\n        id2token_[p.second] = p.first;\n      }\n    }\n  }\n\n  void InitLexicon(const std::string &lexicon) {\n    if (lexicon.empty()) {\n      SHERPA_ONNX_LOGE(\"Empty lexicon!\");\n      return;\n    }\n\n    std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      std::ifstream is(f);\n      InitLexicon(is);\n    }\n  }\n\n  void InitLexicon(std::istream &is) {\n    std::string word;\n    std::vector<std::string> token_list;\n    std::string line;\n    std::string phone;\n    int32_t line_num = 0;\n\n    while (std::getline(is, line)) {\n      ++line_num;\n\n      std::istringstream iss(line);\n\n      token_list.clear();\n\n      iss >> word;\n      ToLowerCase(&word);\n\n      if (word2ids_.count(word)) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\n            \"Duplicated word: %{public}s at line %{public}d:%{public}s. Ignore \"\n            \"it.\",\n            word.c_str(), line_num, line.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Duplicated word: %s at line %d:%s. 
Ignore it.\",\n                         word.c_str(), line_num, line.c_str());\n#endif\n        continue;\n      }\n\n      while (iss >> phone) {\n        token_list.push_back(std::move(phone));\n      }\n\n      std::vector<int32_t> ids = ConvertTokensToIds(token2id_, token_list);\n      if (ids.empty()) {\n        if (debug_) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"Empty token ids for '%{public}s'\", line.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"Empty token ids for '%s'\", line.c_str());\n#endif\n        }\n        continue;\n      }\n\n      word2ids_.insert({std::move(word), std::move(ids)});\n    }\n\n    for (const auto &[key, _] : word2ids_) {\n      all_words_.insert(key);\n    }\n  }\n\n private:\n  // lexicon.txt is saved in word2ids_\n  std::unordered_map<std::string, std::vector<int32_t>> word2ids_;\n  std::unordered_set<std::string> all_words_;\n\n  // tokens.txt is saved in token2id_\n  std::unordered_map<std::string, int32_t> token2id_;\n\n  // Reverse of token2id_; populated only when debug_ is true\n  std::unordered_map<int32_t, std::string> id2token_;\n\n  bool debug_ = false;\n  bool skip_replacement_ = false;\n};  // class MatchaTtsLexicon::Impl\n\nMatchaTtsLexicon::~MatchaTtsLexicon() = default;\n\nMatchaTtsLexicon::MatchaTtsLexicon(const std::string &lexicon,\n                                   const std::string &tokens,\n                                   const std::string &data_dir, bool debug,\n                                   bool skip_replacement)\n    : impl_(std::make_unique<Impl>(lexicon, tokens, data_dir, debug,\n                                   skip_replacement)) {}  // NOLINT\n\ntemplate <typename Manager>\nMatchaTtsLexicon::MatchaTtsLexicon(Manager *mgr, const std::string &lexicon,\n                                   const std::string &tokens,\n                                   const std::string &data_dir, bool debug,\n                                   bool skip_replacement)\n    : impl_(std::make_unique<Impl>(mgr, lexicon, tokens, data_dir, debug,\n                                 
  skip_replacement)) {}  // NOLINT\n\nstd::vector<TokenIDs> MatchaTtsLexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string & /*unused_voice = \"\"*/) const {\n  return impl_->ConvertTextToTokenIds(text);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate MatchaTtsLexicon::MatchaTtsLexicon(AAssetManager *mgr,\n                                            const std::string &lexicon,\n                                            const std::string &tokens,\n                                            const std::string &data_dir,\n                                            bool debug, bool skip_replacement);\n#endif\n\n#if __OHOS__\ntemplate MatchaTtsLexicon::MatchaTtsLexicon(NativeResourceManager *mgr,\n                                            const std::string &lexicon,\n                                            const std::string &tokens,\n                                            const std::string &data_dir,\n                                            bool debug, bool skip_replacement);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/matcha-tts-lexicon.h",
    "content": "// sherpa-onnx/csrc/matcha-tts-lexicon.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_MATCHA_TTS_LEXICON_H_\n#define SHERPA_ONNX_CSRC_MATCHA_TTS_LEXICON_H_\n\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n\nnamespace sherpa_onnx {\n\n// For Chinese+English matcha tts\nclass MatchaTtsLexicon : public OfflineTtsFrontend {\n public:\n  ~MatchaTtsLexicon() override;\n\n  MatchaTtsLexicon(const std::string &lexicon, const std::string &tokens,\n                   const std::string &data_dir, bool debug,\n                   bool skip_replacement);\n\n  template <typename Manager>\n  MatchaTtsLexicon(Manager *mgr, const std::string &lexicon,\n                   const std::string &tokens, const std::string &data_dir,\n                   bool debug, bool skip_replacement);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text,\n      const std::string &unused_voice = \"\") const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_MATCHA_TTS_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/math-test.cc",
    "content": "// sherpa-onnx/csrc/math-test.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/math.h\"\n\n#include <vector>\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Transpose, Case1) {\n  // 0 1 2\n  // 3 4 5\n  std::vector<float> in = {0, 1, 2, 3, 4, 5};\n  std::vector<float> out = Transpose(in.data(), 2, 3);\n\n  // 0 3\n  // 1 4\n  // 2 5\n  std::vector<float> expected_out = {0, 3, 1, 4, 2, 5};\n  EXPECT_EQ(out, expected_out);\n}\n\nTEST(Transpose, Case2) {\n  // 0 1\n  // 2 3\n  // 4 5\n  std::vector<float> in = {0, 1, 2, 3, 4, 5};\n  std::vector<float> out = Transpose(in.data(), 3, 2);\n\n  // 0 2 4\n  // 1 3 5\n  std::vector<float> expected_out = {0, 2, 4, 1, 3, 5};\n  EXPECT_EQ(out, expected_out);\n}\n\nTEST(ScaleAdd, Case1) {\n  std::vector<float> src = {1, 2, 3};\n  float scale = 10;\n  std::vector<float> in_out = {5, 6, 0};\n  ScaleAdd(src.data(), scale, src.size(), in_out.data());\n\n  std::vector<float> expected = {10 + 5, 20 + 6, 30 + 0};\n  EXPECT_EQ(in_out, expected);\n}\n\nTEST(Scale, Case1) {\n  std::vector<float> src = {1, 2, 3};\n  float scale = 10;\n  std::vector<float> in_out = {5, 6, 0};\n  Scale(src.data(), scale, src.size(), in_out.data());\n\n  std::vector<float> expected = {10, 20, 30};\n  EXPECT_EQ(in_out, expected);\n}\n\nTEST(Scale, Case2InPlace) {\n  std::vector<float> src = {1, 2, 3};\n  float scale = 10;\n  Scale(src.data(), scale, src.size(), src.data());\n\n  std::vector<float> expected = {10, 20, 30};\n  EXPECT_EQ(src, expected);\n}\n\n/*\n\nimport numpy as np\n\ndef compute_mean_and_inv_std(p: np.ndarray):\n    mean = p.mean(axis=0)\n    var = np.maximum((p**2).mean(axis=0) - mean**2, 0.0)\n    std = np.sqrt(var)\n    inv_std = 1.0 / (std + 1e-5)\n    return mean.astype(np.float32), inv_std.astype(np.float32)\n\ndef dump_cpp_vector(name: str, arr: np.ndarray):\n    flat = arr.flatten()\n    print(f\"std::vector<float> {name} = {{\")\n    line = \"\"\n    for i, v in 
enumerate(flat):\n        line += f\"{v:.8f}f, \"\n        if (i + 1) % 8 == 0:\n            print(\"  \" + line)\n            line = \"\"\n    if line:\n        print(\"  \" + line)\n    print(\"};\\n\")\n\nnp.random.seed(42)\nnum_rows, num_cols = 4, 6\nx = np.random.randn(num_rows, num_cols).astype(np.float32)\n\nmean, inv_std = compute_mean_and_inv_std(x)\n\ndump_cpp_vector(\"x\", x)\ndump_cpp_vector(\"mean\", mean)\ndump_cpp_vector(\"inv_std\", inv_std)\n\n */\n\nTEST(ComputeMeanAndInvStd, Case1) {\n  std::vector<float> x = {\n      0.49671414f,  -0.13826430f, 0.64768857f, 1.52302980f,  -0.23415338f,\n      -0.23413695f, 1.57921278f,  0.76743472f, -0.46947438f, 0.54256004f,\n      -0.46341768f, -0.46572974f, 0.24196227f, -1.91328025f, -1.72491789f,\n      -0.56228751f, -1.01283109f, 0.31424734f, -0.90802407f, -1.41230369f,\n      1.46564877f,  -0.22577630f, 0.06752820f, -1.42474818f,\n  };\n\n  std::vector<float> expected_mean = {\n      0.35246629f, -0.67410338f, -0.02026373f,\n      0.31938151f, -0.41071847f, -0.45259190f,\n  };\n\n  std::vector<float> expected_inv_std = {\n      1.13103926f, 0.94854516f, 0.83320111f,\n      1.24679470f, 2.52932906f, 1.59057319f,\n  };\n\n  std::vector<float> mean;\n  std::vector<float> inv_std;\n\n  int32_t num_rows = 4;\n  int32_t num_cols = 6;\n\n  ComputeMeanAndInvStd(x.data(), num_rows, num_cols, &mean, &inv_std);\n\n  ASSERT_EQ(mean.size(), num_cols);\n  ASSERT_EQ(inv_std.size(), num_cols);\n\n  for (int32_t i = 0; i < num_cols; ++i) {\n    EXPECT_NEAR(mean[i], expected_mean[i], 1e-6f) << \"at index \" << i;\n    EXPECT_NEAR(inv_std[i], expected_inv_std[i], 1e-6f) << \"at index \" << i;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/math.cc",
    "content": "// sherpa-onnx/csrc/math.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/math.h\"\n\n#include <vector>\n\n#include \"Eigen/Dense\"\n\nnamespace sherpa_onnx {\n\nvoid ScaleAdd(const float *src, float scale, int32_t n, float *in_out) {\n  Eigen::Map<const Eigen::ArrayXf> src_vec(src, n);\n  Eigen::Map<Eigen::ArrayXf> inout_vec(in_out, n);\n\n  inout_vec += scale * src_vec;\n}\n\nvoid Scale(const float *src, float scale, int32_t n, float *out) {\n  Eigen::Map<const Eigen::ArrayXf> src_vec(src, n);\n  Eigen::Map<Eigen::ArrayXf> out_vec(out, n);\n\n  out_vec = scale * src_vec;\n}\n\nstd::vector<float> MakeVorbisWindow(int32_t window_length) {\n  constexpr float kPi = 3.14159265358979323846f;\n  std::vector<float> window(window_length);\n  const float half = window_length / 2.0f;\n  for (int32_t i = 0; i != window_length; ++i) {\n    float s = std::sin(0.5f * kPi * (i + 0.5f) / half);\n    window[i] = std::sin(0.5f * kPi * s * s);\n  }\n\n  return window;\n}\n\n// this if for Paraformer\nstd::vector<float> ComputeAcousticEmbedding(\n    const std::vector<float> &encoder_out, const std::vector<float> &alphas,\n    int32_t encoder_dim) {\n  std::vector<float> ans;\n  ans.reserve(encoder_out.size());\n\n  float acc = 0;\n  std::vector<float> cur_emb(encoder_dim);\n  for (int32_t i = 0; i < static_cast<int32_t>(alphas.size()); ++i) {\n    float w = alphas[i];\n\n    acc += w;\n    if (acc >= 1) {\n      float overflow = acc - 1;\n      float remain = w - overflow;\n\n      ScaleAdd(encoder_out.data() + i * encoder_dim, remain, encoder_dim,\n               cur_emb.data());\n\n      ans.insert(ans.end(), cur_emb.begin(), cur_emb.end());\n\n      Scale(encoder_out.data() + i * encoder_dim, overflow, encoder_dim,\n            cur_emb.data());\n\n      acc = overflow;\n    } else {\n      ScaleAdd(encoder_out.data() + i * encoder_dim, w, encoder_dim,\n               cur_emb.data());\n    }\n  }\n  // TODO(fangjun): The last cur_emb 
is not used\n\n  return ans;\n}\n\nstd::vector<float> Transpose(const float *input, int32_t rows, int32_t cols) {\n  std::vector<float> output(cols * rows);\n\n  Eigen::Map<const Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic,\n                                 Eigen::RowMajor>>\n      in(input, rows, cols);\n\n  Eigen::Map<\n      Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n      out(output.data(), cols, rows);\n\n  out.noalias() = in.transpose();\n\n  return output;\n}\n\nvoid ComputeMeanAndInvStd(const float *p, int32_t num_rows, int32_t num_cols,\n                          std::vector<float> *mean,\n                          std::vector<float> *inv_stddev) {\n  using RowMajorMat =\n      Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;\n\n  Eigen::Map<const RowMajorMat> X(p, num_rows, num_cols);\n\n  Eigen::RowVectorXf mean_vec = X.colwise().mean();\n\n  Eigen::RowVectorXf mean_sq = X.array().square().colwise().mean();\n\n  Eigen::RowVectorXf var = mean_sq.array() - mean_vec.array().square();\n\n  Eigen::RowVectorXf stddev = var.array().max(0.0f).sqrt();\n\n  Eigen::RowVectorXf inv_std = (stddev.array() + 1e-5f).inverse();\n\n  mean->assign(mean_vec.data(), mean_vec.data() + num_cols);\n\n  inv_stddev->assign(inv_std.data(), inv_std.data() + num_cols);\n}\n\nvoid NormalizeWhisperFeatures(float *features, int32_t num_frames,\n                              int32_t feat_dim) {\n  // log_spec = torch.clamp(features, min=1e-10).log10()\n  // log_spec = torch.maximum(log_spec, log_spec.max() - 8.0)\n  // mel = (log_spec + 4.0) / 4.0\n\n  using Eigen::ArrayXXf;\n  using Eigen::Map;\n\n  Map<ArrayXXf, Eigen::RowMajor> feats(features, num_frames, feat_dim);\n\n  feats = feats.max(1e-10f).log10();\n\n  float max_v = feats.maxCoeff() - 8.0f;\n\n  feats = feats.max(max_v);\n  feats = (feats + 4.0f) / 4.0f;\n}\n\nint32_t MaxElementIndex(const float *v, int32_t n) {\n  // Map raw pointer to an Eigen vector (no copy)\n  
Eigen::Map<const Eigen::VectorXf> vec(v, n);\n\n  Eigen::Index maxIndex;\n  vec.maxCoeff(&maxIndex);\n\n  return static_cast<int32_t>(maxIndex);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/math.h",
    "content": "/**\n * Copyright (c)  2022  Xiaomi Corporation (authors: Daniel Povey)\n * Copyright (c)  2023                     (Pingfeng Luo)\n *\n */\n// This file is copied from k2/csrc/utils.h\n#ifndef SHERPA_ONNX_CSRC_MATH_H_\n#define SHERPA_ONNX_CSRC_MATH_H_\n\n#include <algorithm>\n#include <cassert>\n#include <cstdint>\n#include <cmath>\n#include <numeric>\n#include <vector>\n\n#include \"Eigen/Dense\"\n\nnamespace sherpa_onnx {\n\n// logf(FLT_EPSILON)\n#define SHERPA_ONNX_MIN_LOG_DIFF_FLOAT -15.9423847198486328125f\n\n// log(DBL_EPSILON)\n#define SHERPA_ONNX_MIN_LOG_DIFF_DOUBLE \\\n  -36.0436533891171535515240975655615329742431640625\n\ntemplate <typename T>\nstruct LogAdd;\n\ntemplate <>\nstruct LogAdd<double> {\n  double operator()(double x, double y) const {\n    double diff;\n\n    if (x < y) {\n      diff = x - y;\n      x = y;\n    } else {\n      diff = y - x;\n    }\n    // diff is negative.  x is now the larger one.\n\n    if (diff >= SHERPA_ONNX_MIN_LOG_DIFF_DOUBLE) {\n      double res;\n      res = x + log1p(exp(diff));\n      return res;\n    }\n\n    return x;  // return the larger one.\n  }\n};\n\ntemplate <>\nstruct LogAdd<float> {\n  float operator()(float x, float y) const {\n    float diff;\n\n    if (x < y) {\n      diff = x - y;\n      x = y;\n    } else {\n      diff = y - x;\n    }\n    // diff is negative.  
x is now the larger one.\n\n    if (diff >= SHERPA_ONNX_MIN_LOG_DIFF_FLOAT) {\n      float res;\n      res = x + log1pf(expf(diff));\n      return res;\n    }\n\n    return x;  // return the larger one.\n  }\n};\n\ntemplate <class T>\nvoid LogSoftmax(T *input, int32_t input_len) {\n  assert(input);\n\n  T m = *std::max_element(input, input + input_len);\n\n  T sum = 0.0;\n  for (int32_t i = 0; i < input_len; i++) {\n    sum += exp(input[i] - m);\n  }\n\n  T offset = m + log(sum);\n  for (int32_t i = 0; i < input_len; i++) {\n    input[i] -= offset;\n  }\n}\n\ntemplate <typename T>\nvoid LogSoftmax(T *in, int32_t w, int32_t h) {\n  for (int32_t i = 0; i != h; ++i) {\n    LogSoftmax(in, w);\n    in += w;\n  }\n}\n\ntemplate <typename T>\nvoid SubtractBlank(T *in, int32_t w, int32_t h, int32_t blank_idx,\n                   float blank_penalty) {\n  for (int32_t i = 0; i != h; ++i) {\n    in[blank_idx] -= blank_penalty;\n    in += w;\n  }\n}\n\ntemplate <class T>\nstd::vector<int32_t> TopkIndex(const T *vec, int32_t size, int32_t topk) {\n  std::vector<int32_t> vec_index(size);\n  std::iota(vec_index.begin(), vec_index.end(), 0);\n\n  // Clamp topk first so that the middle iterator never goes past end()\n  int32_t k_num = std::min<int32_t>(size, topk);\n\n  std::partial_sort(vec_index.begin(), vec_index.begin() + k_num,\n                    vec_index.end(), [vec](int32_t index_1, int32_t index_2) {\n                      return vec[index_1] > vec[index_2];\n                    });\n\n  return {vec_index.begin(), vec_index.begin() + k_num};\n}\n\ntemplate <class T>\nstd::vector<int32_t> TopkIndex(const std::vector<std::vector<T>> &vec,\n                               int32_t topk) {\n  std::vector<T> flatten;\n  flatten.reserve(vec.size() * vec[0].size());\n  for (const auto &v : vec) {\n    flatten.insert(flatten.end(), v.begin(), v.end());\n  }\n\n  return TopkIndex(flatten.data(), flatten.size(), topk);\n}\n\n// in_out[i] += src[i] * scale\nvoid ScaleAdd(const float *src, float scale, int32_t n, float *in_out);\n\n// out[i] = src[i] * 
scale\nvoid Scale(const float *src, float scale, int32_t n, float *out);\n\nstd::vector<float> MakeVorbisWindow(int32_t window_length);\n\n// For Paraformer\nstd::vector<float> ComputeAcousticEmbedding(\n    const std::vector<float> &encoder_out, const std::vector<float> &alphas,\n    int32_t encoder_dim);\n\n// Transpose a 2-D matrix in row-major\nstd::vector<float> Transpose(const float *input, int32_t rows, int32_t cols);\n\n/* Compute mean and inverse stddev over rows.\n *\n * @param p  A pointer to a 2-d array of shape (num_rows, num_cols)\n * @param num_rows Number of rows\n * @param num_cols Number of columns\n * @param mean On return, it contains p.mean(axis=0). You don't need to\n *             pre-allocate space for it.\n * @param inv_stddev On return, it contains 1/(p.std(axis=0) + 1e-5). You\n *                   don't need to pre-allocate space for it.\n */\nvoid ComputeMeanAndInvStd(const float *p, int32_t num_rows, int32_t num_cols,\n                          std::vector<float> *mean,\n                          std::vector<float> *inv_stddev);\n\nvoid NormalizeWhisperFeatures(float *features, int32_t num_frames,\n                              int32_t feat_dim);\n\nint32_t MaxElementIndex(const float *v, int32_t n);\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_MATH_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/melo-tts-lexicon.cc",
    "content": "// sherpa-onnx/csrc/melo-tts-lexicon.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/melo-tts-lexicon.h\"\n\n#include <fstream>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass MeloTtsLexicon::Impl {\n public:\n  Impl(const std::string &lexicon, const std::string &tokens,\n       const OfflineTtsVitsModelMetaData &meta_data, bool debug)\n      : meta_data_(meta_data), debug_(debug) {\n    {\n      std::ifstream is(tokens);\n      InitTokens(is);\n    }\n\n    {\n      std::ifstream is(lexicon);\n      InitLexicon(is);\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const std::string &lexicon, const std::string &tokens,\n       const OfflineTtsVitsModelMetaData &meta_data, bool debug)\n      : meta_data_(meta_data), debug_(debug) {\n    {\n      auto buf = ReadFile(mgr, tokens);\n\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitTokens(is);\n    }\n\n    {\n      auto buf = ReadFile(mgr, lexicon);\n\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      InitLexicon(is);\n    }\n  }\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(const std::string &_text) const {\n    std::string text = ToLowerCase(_text);\n    // see\n    // https://github.com/Plachtaa/VITS-fast-fine-tuning/blob/main/text/mandarin.py#L244\n    std::regex punct_re{\"：|、|；\"};\n    std::string s = 
std::regex_replace(text, punct_re, \",\");\n\n    std::regex punct_re2(\"。\");\n    s = std::regex_replace(s, punct_re2, \".\");\n\n    std::regex punct_re3(\"？\");\n    s = std::regex_replace(s, punct_re3, \"?\");\n\n    std::regex punct_re4(\"！\");\n    s = std::regex_replace(s, punct_re4, \"!\");\n\n    std::vector<std::string> words = SplitUtf8(text);\n\n    if (debug_) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"input text:\\n%{public}s\", text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%{public}s\", s.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"input text:\\n%s\", text.c_str());\n      SHERPA_ONNX_LOGE(\"after replacing punctuations:\\n%s\", s.c_str());\n#endif\n\n      std::ostringstream os;\n      std::string sep = \"\";\n      for (const auto &w : words) {\n        os << sep << w;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%{public}s\",\n                       os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"after splitting into UTF8:\\n%s\", os.str().c_str());\n#endif\n    }\n\n    std::vector<TokenIDs> ans;\n    TokenIDs this_sentence;\n\n    PhraseMatcher matcher(&all_words_, words, debug_);\n\n    for (const std::string &w : matcher) {\n      auto ids = ConvertWordToIds(w);\n      if (ids.tokens.empty()) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%{public}s'\", w.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Ignore OOV '%s'\", w.c_str());\n#endif\n        continue;\n      }\n\n      if (debug_) {\n        std::ostringstream os;\n        os << w << \": \";\n        for (auto i : ids.tokens) {\n          os << id2token_.at(i) << \" \";\n        }\n\n        for (auto i : ids.tones) {\n          os << i << \" \";\n        }\n        os << \"\\n\";\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n        SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n      }\n\n      
this_sentence.tokens.insert(this_sentence.tokens.end(),\n                                  ids.tokens.begin(), ids.tokens.end());\n      this_sentence.tones.insert(this_sentence.tones.end(), ids.tones.begin(),\n                                 ids.tones.end());\n\n      if (w == \".\" || w == \"!\" || w == \"?\" || w == \",\" || w == \"。\" ||\n          w == \"！\" || w == \"？\" || w == \"，\") {\n        ans.push_back(std::move(this_sentence));\n        this_sentence = {};\n      }\n    }  // for (const std::string &w : matcher)\n\n    if (!this_sentence.tokens.empty()) {\n      ans.push_back(std::move(this_sentence));\n    }\n\n    return ans;\n  }\n\n private:\n  TokenIDs ConvertWordToIds(const std::string &w) const {\n    if (word2ids_.count(w)) {\n      return word2ids_.at(w);\n    }\n\n    if (token2id_.count(w)) {\n      return {{token2id_.at(w)}, {0}};\n    }\n\n    TokenIDs ans;\n\n    std::vector<std::string> words = SplitUtf8(w);\n    for (const auto &word : words) {\n      if (word2ids_.count(word)) {\n        auto ids = ConvertWordToIds(word);\n        ans.tokens.insert(ans.tokens.end(), ids.tokens.begin(),\n                          ids.tokens.end());\n        ans.tones.insert(ans.tones.end(), ids.tones.begin(), ids.tones.end());\n      } else {\n        // If the lexicon does not contain the word, we split the word into\n        // characters.\n        //\n        // For instance, if the word is TTS and it does not exist\n        // in the lexicon, we split it into 3 characters: T T S\n        std::string s;\n        for (char c : word) {\n          s = c;\n          if (word2ids_.count(s)) {\n            const auto &t = word2ids_.at(s);\n            ans.tokens.insert(ans.tokens.end(), t.tokens.begin(),\n                              t.tokens.end());\n            ans.tones.insert(ans.tones.end(), t.tones.begin(), t.tones.end());\n          }\n        }\n      }\n    }\n\n    return ans;\n  }\n\n  void InitTokens(std::istream &is) {\n    token2id_ = 
ReadTokens(is);\n\n    if (debug_) {\n      for (const auto &p : token2id_) {\n        id2token_[p.second] = p.first;\n      }\n    }\n\n    token2id_[\" \"] = token2id_[\"_\"];\n\n    std::vector<std::pair<std::string, std::string>> puncts = {\n        {\",\", \"，\"}, {\".\", \"。\"}, {\"!\", \"！\"}, {\"?\", \"？\"}};\n\n    for (const auto &p : puncts) {\n      if (token2id_.count(p.first) && !token2id_.count(p.second)) {\n        token2id_[p.second] = token2id_[p.first];\n      }\n\n      if (!token2id_.count(p.first) && token2id_.count(p.second)) {\n        token2id_[p.first] = token2id_[p.second];\n      }\n    }\n\n    if (!token2id_.count(\"、\") && token2id_.count(\"，\")) {\n      token2id_[\"、\"] = token2id_[\"，\"];\n    }\n\n    // Map 'v' to 'V' token (same as post_replace_ph in MeloTTS)\n    // Only for English models\n    if (meta_data_.language == \"en\" && token2id_.count(\"V\")) {\n      token2id_[\"v\"] = token2id_[\"V\"];\n    }\n  }\n\n  void InitLexicon(std::istream &is) {\n    std::string word;\n    std::vector<std::string> token_list;\n\n    std::vector<std::string> phone_list;\n    std::vector<int64_t> tone_list;\n\n    std::string line;\n    std::string phone;\n    int32_t line_num = 0;\n\n    while (std::getline(is, line)) {\n      ++line_num;\n\n      std::istringstream iss(line);\n\n      token_list.clear();\n      phone_list.clear();\n      tone_list.clear();\n\n      iss >> word;\n      ToLowerCase(&word);\n\n      if (word2ids_.count(word)) {\n        SHERPA_ONNX_LOGE(\"Duplicated word: %s at line %d:%s. 
Ignore it.\",\n                         word.c_str(), line_num, line.c_str());\n        continue;\n      }\n\n      while (iss >> phone) {\n        token_list.push_back(std::move(phone));\n      }\n\n      if ((token_list.size() & 1) != 0) {\n        SHERPA_ONNX_LOGE(\"Invalid line %d: '%s'\", line_num, line.c_str());\n        exit(-1);\n      }\n\n      int32_t num_phones = token_list.size() / 2;\n      phone_list.reserve(num_phones);\n      tone_list.reserve(num_phones);\n\n      for (int32_t i = 0; i != num_phones; ++i) {\n        phone_list.push_back(std::move(token_list[i]));\n        tone_list.push_back(std::stoi(token_list[i + num_phones], nullptr));\n        if (tone_list.back() < 0 || tone_list.back() > 50) {\n          SHERPA_ONNX_LOGE(\"Invalid line %d: '%s'\", line_num, line.c_str());\n          exit(-1);\n        }\n      }\n\n      std::vector<int32_t> ids = ConvertTokensToIds(token2id_, phone_list);\n      if (ids.empty()) {\n        continue;\n      }\n\n      if (ids.size() != num_phones) {\n        SHERPA_ONNX_LOGE(\"Invalid line %d: '%s'\", line_num, line.c_str());\n        exit(-1);\n      }\n\n      std::vector<int64_t> ids64{ids.begin(), ids.end()};\n\n      word2ids_.insert(\n          {std::move(word), TokenIDs{std::move(ids64), std::move(tone_list)}});\n    }\n\n    // For Chinese+English MeloTTS\n    word2ids_[\"呣\"] = word2ids_[\"母\"];\n    word2ids_[\"嗯\"] = word2ids_[\"恩\"];\n\n    for (const auto &[key, _] : word2ids_) {\n      all_words_.insert(key);\n    }\n  }\n\n private:\n  // lexicon.txt is saved in word2ids_\n  std::unordered_map<std::string, TokenIDs> word2ids_;\n  std::unordered_set<std::string> all_words_;\n\n  // tokens.txt is saved in token2id_\n  std::unordered_map<std::string, int32_t> token2id_;\n  std::unordered_map<int32_t, std::string> id2token_;\n\n  OfflineTtsVitsModelMetaData meta_data_;\n\n  bool debug_ = false;\n};\n\nMeloTtsLexicon::~MeloTtsLexicon() = default;\n\nMeloTtsLexicon::MeloTtsLexicon(const std::string 
&lexicon,\n                               const std::string &tokens,\n                               const OfflineTtsVitsModelMetaData &meta_data,\n                               bool debug)\n    : impl_(std::make_unique<Impl>(lexicon, tokens, meta_data, debug)) {}\n\ntemplate <typename Manager>\nMeloTtsLexicon::MeloTtsLexicon(Manager *mgr, const std::string &lexicon,\n                               const std::string &tokens,\n                               const OfflineTtsVitsModelMetaData &meta_data,\n                               bool debug)\n    : impl_(std::make_unique<Impl>(mgr, lexicon, tokens, meta_data, debug)) {}\n\nstd::vector<TokenIDs> MeloTtsLexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string & /*unused_voice = \"\"*/) const {\n  return impl_->ConvertTextToTokenIds(text);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate MeloTtsLexicon::MeloTtsLexicon(\n    AAssetManager *mgr, const std::string &lexicon, const std::string &tokens,\n    const OfflineTtsVitsModelMetaData &meta_data, bool debug);\n#endif\n\n#if __OHOS__\ntemplate MeloTtsLexicon::MeloTtsLexicon(\n    NativeResourceManager *mgr, const std::string &lexicon,\n    const std::string &tokens, const OfflineTtsVitsModelMetaData &meta_data,\n    bool debug);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/melo-tts-lexicon.h",
    "content": "// sherpa-onnx/csrc/melo-tts-lexicon.h\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_MELO_TTS_LEXICON_H_\n#define SHERPA_ONNX_CSRC_MELO_TTS_LEXICON_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass MeloTtsLexicon : public OfflineTtsFrontend {\n public:\n  ~MeloTtsLexicon() override;\n  MeloTtsLexicon(const std::string &lexicon, const std::string &tokens,\n                 const OfflineTtsVitsModelMetaData &meta_data, bool debug);\n\n  template <typename Manager>\n  MeloTtsLexicon(Manager *mgr, const std::string &lexicon,\n                 const std::string &tokens,\n                 const OfflineTtsVitsModelMetaData &meta_data, bool debug);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text,\n      const std::string &unused_voice = \"\") const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_MELO_TTS_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/microphone.cc",
    "content": "// sherpa-onnx/csrc/microphone.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/microphone.h\"\n\n#include <stdio.h>\n#include <stdlib.h>\n\nnamespace sherpa_onnx {\n\nMicrophone::Microphone() {\n  PaError err = Pa_Initialize();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    exit(-1);\n  }\n}\n\nMicrophone::~Microphone() {\n  CloseDevice();\n  PaError err = Pa_Terminate();\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n  }\n}\n\nint Microphone::GetDeviceCount() const { return Pa_GetDeviceCount(); }\n\nint Microphone::GetDefaultInputDevice() const {\n  return Pa_GetDefaultInputDevice();\n}\n\nvoid Microphone::PrintDevices(int device_index) const {\n  int num_devices = Pa_GetDeviceCount();\n  fprintf(stderr, \"Num devices: %d\\n\", num_devices);\n  for (int i = 0; i != num_devices; ++i) {\n    const PaDeviceInfo *info = Pa_GetDeviceInfo(i);\n    fprintf(stderr, \" %s %d %s\\n\", (i == device_index) ? 
\"*\" : \" \", i,\n            info->name);\n  }\n}\n\nbool Microphone::OpenDevice(int index, int sample_rate, int channel,\n                            PaStreamCallback cb, void *userdata) {\n  if (index < 0 || index >= Pa_GetDeviceCount()) {\n    fprintf(stderr, \"Invalid device index: %d\\n\", index);\n    return false;\n  }\n\n  const PaDeviceInfo *info = Pa_GetDeviceInfo(index);\n  if (!info) {\n    fprintf(stderr, \"No device info found for index: %d\\n\", index);\n    return false;\n  }\n\n  CloseDevice();\n\n  fprintf(stderr, \"Use device: %d\\n\", index);\n  fprintf(stderr, \"  Name: %s\\n\", info->name);\n  fprintf(stderr, \"  Max input channels: %d\\n\", info->maxInputChannels);\n\n  PaStreamParameters param;\n  param.device = index;\n  param.channelCount = channel;\n  param.sampleFormat = paFloat32;\n  param.suggestedLatency = info->defaultLowInputLatency;\n  param.hostApiSpecificStreamInfo = nullptr;\n\n  PaError err =\n      Pa_OpenStream(&stream, &param, nullptr, /* &outputParameters, */\n                    sample_rate,\n                    0,          // frames per buffer\n                    paClipOff,  // we won't output out of range samples\n                                // so don't bother clipping them\n                    cb, userdata);\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    return false;\n  }\n\n  err = Pa_StartStream(stream);\n  fprintf(stderr, \"Started\\n\");\n\n  if (err != paNoError) {\n    fprintf(stderr, \"portaudio error: %s\\n\", Pa_GetErrorText(err));\n    CloseDevice();\n    return false;\n  }\n  return true;\n}\n\nvoid Microphone::CloseDevice() {\n  if (stream) {\n    PaError err = Pa_CloseStream(stream);\n    if (err != paNoError) {\n      fprintf(stderr, \"Pa_CloseStream error: %s\\n\", Pa_GetErrorText(err));\n    }\n    stream = nullptr;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/microphone.h",
    "content": "// sherpa-onnx/csrc/microphone.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_MICROPHONE_H_\n#define SHERPA_ONNX_CSRC_MICROPHONE_H_\n#include <cstdint>\n\n#include \"portaudio.h\"  // NOLINT\nnamespace sherpa_onnx {\n\nclass Microphone {\n public:\n  Microphone();\n  ~Microphone();\n\n  int32_t GetDeviceCount() const;\n  int32_t GetDefaultInputDevice() const;\n  void PrintDevices(int32_t sel) const;\n\n  bool OpenDevice(int32_t index, int32_t sample_rate, int32_t channel,\n                  PaStreamCallback cb, void *userdata);\n\n  void CloseDevice();\n\n private:\n  PaStream *stream = nullptr;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_MICROPHONE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/normal-data-generator.cc",
    "content": "// sherpa-onnx/csrc/normal-data-generator.cc\n//\n// Copyright      2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/normal-data-generator.h\"\n\n#include <random>\n#include <thread>\n\nnamespace sherpa_onnx {\n\n// Helper type hidden in translation unit\nnamespace {\nstruct RNGHolder {\n  std::mt19937 rng;\n  std::normal_distribution<float> dist;\n\n  RNGHolder()\n      : rng([] {\n          std::random_device rd;\n          std::seed_seq seq{rd(),\n                            static_cast<unsigned>(std::hash<std::thread::id>{}(\n                                std::this_thread::get_id()))};\n          return std::mt19937(seq);\n        }()),\n        dist() {}\n};\n}  // namespace\n\nNormalDataGenerator::NormalDataGenerator(float mean /* = 0.0f*/,\n                                         float stddev /* = 1.0f*/,\n                                         int32_t seed /* = -1*/)\n    : mean_(mean), stddev_(stddev), seed_(seed) {\n  if (seed_ >= 0) {\n    rng_.seed(static_cast<unsigned>(seed_));\n  }\n}\n\nvoid NormalDataGenerator::Fill(float *data, std::size_t size) const {\n  if (seed_ >= 0) {\n    // Deterministic mode: use instance-level RNG\n    std::normal_distribution<float> dist(mean_, stddev_);\n    for (std::size_t i = 0; i < size; ++i) {\n      data[i] = dist(rng_);\n    }\n  } else {\n    // Original behavior: thread-local random device\n    static thread_local RNGHolder holder;\n\n    holder.dist.param(\n        std::normal_distribution<float>::param_type(mean_, stddev_));\n\n    for (std::size_t i = 0; i < size; ++i) {\n      data[i] = holder.dist(holder.rng);\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/normal-data-generator.h",
    "content": "// sherpa-onnx/csrc/normal-data-generator.h\n//\n// Copyright      2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_NORMAL_DATA_GENERATOR_H_\n#define SHERPA_ONNX_CSRC_NORMAL_DATA_GENERATOR_H_\n\n#include <cstddef>\n#include <cstdint>\n#include <random>\n\nnamespace sherpa_onnx {\n\nclass NormalDataGenerator {\n public:\n  explicit NormalDataGenerator(float mean = 0.0f, float stddev = 1.0f,\n                               int32_t seed = -1);\n\n  // Fill pre-allocated memory\n  void Fill(float *data, std::size_t size) const;\n\n private:\n  float mean_;\n  float stddev_;\n  int32_t seed_ = -1;         // -1 = use thread-local random device (default)\n  mutable std::mt19937 rng_;  // used if seed_ >= 0\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_NORMAL_DATA_GENERATOR_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-canary-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-canary-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-canary-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineCanaryModelConfig::Register(ParseOptions *po) {\n  po->Register(\"canary-encoder\", &encoder,\n               \"Path to onnx encoder of Canary, e.g., encoder.int8.onnx\");\n\n  po->Register(\"canary-decoder\", &decoder,\n               \"Path to onnx decoder of Canary, e.g., decoder.int8.onnx\");\n\n  po->Register(\"canary-src-lang\", &src_lang,\n               \"Valid values: en, de, es, fr. If empty, default to use en\");\n\n  po->Register(\"canary-tgt-lang\", &tgt_lang,\n               \"Valid values: en, de, es, fr. If empty, default to use en\");\n\n  po->Register(\"canary-use-pnc\", &use_pnc,\n               \"true to enable punctuations and casing. 
false to disable them\");\n}\n\nbool OfflineCanaryModelConfig::Validate() const {\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --canary-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"Canary encoder file '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --canary-decoder\");\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"Canary decoder file '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  if (!src_lang.empty()) {\n    if (src_lang != \"en\" && src_lang != \"de\" && src_lang != \"es\" &&\n        src_lang != \"fr\") {\n      SHERPA_ONNX_LOGE(\"Please use en, de, es, or fr for --canary-src-lang\");\n      return false;\n    }\n  }\n\n  if (!tgt_lang.empty()) {\n    if (tgt_lang != \"en\" && tgt_lang != \"de\" && tgt_lang != \"es\" &&\n        tgt_lang != \"fr\") {\n      SHERPA_ONNX_LOGE(\"Please use en, de, es, or fr for --canary-tgt-lang\");\n      return false;\n    }\n  }\n\n  return true;\n}\n\nstd::string OfflineCanaryModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineCanaryModelConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"src_lang=\\\"\" << src_lang << \"\\\", \";\n  os << \"tgt_lang=\\\"\" << tgt_lang << \"\\\", \";\n  os << \"use_pnc=\" << (use_pnc ? \"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-canary-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-canary-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineCanaryModelConfig {\n  std::string encoder;\n  std::string decoder;\n\n  // en, de, es, fr, or leave it empty to use en\n  std::string src_lang;\n\n  // en, de, es, fr, or leave it empty to use en\n  std::string tgt_lang;\n\n  // true to enable punctuations and casing\n  // false to disable punctuations and casing\n  bool use_pnc = true;\n\n  OfflineCanaryModelConfig() = default;\n  OfflineCanaryModelConfig(const std::string &encoder,\n                           const std::string &decoder,\n                           const std::string &src_lang,\n                           const std::string &tgt_lang, bool use_pnc)\n      : encoder(encoder),\n        decoder(decoder),\n        src_lang(src_lang),\n        tgt_lang(tgt_lang),\n        use_pnc(use_pnc) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-canary-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-canary-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineCanaryModelMetaData {\n  int32_t vocab_size;\n  int32_t subsampling_factor = 8;\n  int32_t feat_dim = 120;\n  int32_t num_decoder_layers = 6;\n  int32_t decoder_hidden_size = 1024;\n  std::string normalize_type;\n  std::unordered_map<std::string, int32_t> lang2id;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-canary-model.cc",
    "content": "// sherpa-onnx/csrc/offline-canary-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-canary-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-canary-model-meta-data.h\"\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCanaryModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.canary.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.canary.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.canary.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.canary.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> ForwardEncoder(Ort::Value features,\n                                         Ort::Value features_length) {\n    std::array<Ort::Value, 2> encoder_inputs = {std::move(features),\n                                                
std::move(features_length)};\n\n    auto encoder_out = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n        encoder_inputs.size(), encoder_output_names_ptr_.data(),\n        encoder_output_names_ptr_.size());\n\n    return encoder_out;\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardDecoder(\n      Ort::Value tokens, std::vector<Ort::Value> decoder_states,\n      Ort::Value encoder_states, Ort::Value enc_mask) {\n    std::vector<Ort::Value> decoder_inputs;\n    decoder_inputs.reserve(3 + decoder_states.size());\n\n    decoder_inputs.push_back(std::move(tokens));\n    for (auto &s : decoder_states) {\n      decoder_inputs.push_back(std::move(s));\n    }\n\n    decoder_inputs.push_back(std::move(encoder_states));\n    decoder_inputs.push_back(std::move(enc_mask));\n\n    auto decoder_outputs = decoder_sess_->Run(\n        {}, decoder_input_names_ptr_.data(), decoder_inputs.data(),\n        decoder_inputs.size(), decoder_output_names_ptr_.data(),\n        decoder_output_names_ptr_.size());\n\n    Ort::Value logits = std::move(decoder_outputs[0]);\n\n    std::vector<Ort::Value> output_decoder_states;\n    output_decoder_states.reserve(decoder_states.size());\n\n    int32_t i = 0;\n    for (auto &s : decoder_outputs) {\n      i += 1;\n      if (i == 1) {\n        continue;\n      }\n      output_decoder_states.push_back(std::move(s));\n    }\n\n    return {std::move(logits), std::move(output_decoder_states)};\n  }\n\n  std::vector<Ort::Value> GetInitialDecoderStates() {\n    int32_t num_layers = meta_.num_decoder_layers;\n    int64_t hidden_size = meta_.decoder_hidden_size;\n    std::array<int64_t, 3> shape{1, 0, hidden_size};\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(num_layers);\n    for (int32_t i = 0; i < num_layers; ++i) {\n      Ort::Value state = Ort::Value::CreateTensor<float>(\n          Allocator(), shape.data(), shape.size());\n\n      ans.push_back(std::move(state));\n    }\n\n    return 
ans;\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  const OfflineCanaryModelMetaData &GetModelMetadata() const { return meta_; }\n\n  OfflineCanaryModelMetaData &GetModelMetadata() { return meta_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n\n    if (model_type != \"EncDecMultiTaskModel\") {\n      SHERPA_ONNX_LOGE(\n          \"Expected model type 'EncDecMultiTaskModel'. 
Given: '%s'\",\n          model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_.vocab_size, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(meta_.normalize_type,\n                                               \"normalize_type\");\n    SHERPA_ONNX_READ_META_DATA(meta_.subsampling_factor, \"subsampling_factor\");\n    SHERPA_ONNX_READ_META_DATA(meta_.feat_dim, \"feat_dim\");\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.num_decoder_layers,\n                                            \"num_decoder_layers\", 6);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.decoder_hidden_size,\n                                            \"decoder_hidden_size\", 1024);\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n  }\n\n private:\n  OfflineCanaryModelMetaData meta_;\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n};\n\nOfflineCanaryModel::OfflineCanaryModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) 
{}\n\ntemplate <typename Manager>\nOfflineCanaryModel::OfflineCanaryModel(Manager *mgr,\n                                       const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineCanaryModel::~OfflineCanaryModel() = default;\n\nstd::vector<Ort::Value> OfflineCanaryModel::ForwardEncoder(\n    Ort::Value features, Ort::Value features_length) const {\n  return impl_->ForwardEncoder(std::move(features), std::move(features_length));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOfflineCanaryModel::ForwardDecoder(Ort::Value tokens,\n                                   std::vector<Ort::Value> decoder_states,\n                                   Ort::Value encoder_states,\n                                   Ort::Value enc_mask) const {\n  return impl_->ForwardDecoder(std::move(tokens), std::move(decoder_states),\n                               std::move(encoder_states), std::move(enc_mask));\n}\n\nstd::vector<Ort::Value> OfflineCanaryModel::GetInitialDecoderStates() const {\n  return impl_->GetInitialDecoderStates();\n}\n\nOrtAllocator *OfflineCanaryModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nconst OfflineCanaryModelMetaData &OfflineCanaryModel::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\nOfflineCanaryModelMetaData &OfflineCanaryModel::GetModelMetadata() {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineCanaryModel::OfflineCanaryModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineCanaryModel::OfflineCanaryModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-canary-model.h",
    "content": "// sherpa-onnx/csrc/offline-canary-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-canary-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n// see\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/nemo/canary/test_180m_flash.py\nclass OfflineCanaryModel {\n public:\n  explicit OfflineCanaryModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineCanaryModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineCanaryModel();\n\n  /** Run the encoder.\n   *\n   * @param features  A tensor of shape (N, T, C) of dtype float32.\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - encoder_states: A 3-D tensor of shape (N, T', encoder_dim)\n   *  - encoder_len: A 1-D tensor of shape (N,) containing number\n   *                        of frames in `encoder_out` before padding.\n   *                        Its dtype is int64_t\n   *  - enc_mask: A 2-D tensor of shape (N, T') with dtype bool\n   */\n  std::vector<Ort::Value> ForwardEncoder(Ort::Value features,\n                                         Ort::Value features_length) const;\n\n  /** Run the decoder model.\n   *\n   * @param tokens A int32 tensor of shape (N, num_tokens)\n   * @param decoder_states std::vector<Ort::Value>\n   * @param encoder_states Output from ForwardEncoder()\n   * @param enc_mask Output from ForwardEncoder()\n   *\n   * @return Return a pair:\n   *\n   *  - logits A 3-D tensor of 
shape (N, num_words, vocab_size)\n   *  - new_decoder_states: Can be used as input for ForwardDecoder()\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardDecoder(\n      Ort::Value tokens, std::vector<Ort::Value> decoder_states,\n      Ort::Value encoder_states, Ort::Value enc_mask) const;\n\n  // The return value can be used as input for ForwardDecoder()\n  std::vector<Ort::Value> GetInitialDecoderStates() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  const OfflineCanaryModelMetaData &GetModelMetadata() const;\n\n  OfflineCanaryModelMetaData &GetModelMetadata();\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CANARY_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ced-model.cc",
    "content": "// sherpa-onnx/csrc/offline-ced-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ced-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCEDModel::Impl {\n public:\n  explicit Impl(const AudioTaggingModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.ced);\n    Init(buf.data(), buf.size());\n  }\n\n#if __ANDROID_API__ >= 9\n  Impl(AAssetManager *mgr, const AudioTaggingModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.ced);\n    Init(buf.data(), buf.size());\n  }\n#endif\n\n  Ort::Value Forward(Ort::Value features) {\n    features = Transpose12(allocator_, &features);\n\n    auto ans = sess_->Run({}, input_names_ptr_.data(), &features, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(ans[0]);\n  }\n\n  int32_t NumEventClasses() const { return num_event_classes_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n   
   std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n    }\n\n    // get num_event_classes from the output[0].shape,\n    // which is (N, num_event_classes)\n    num_event_classes_ =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[1];\n  }\n\n private:\n  AudioTaggingModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t num_event_classes_ = 0;\n};\n\nOfflineCEDModel::OfflineCEDModel(const AudioTaggingModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\n#if __ANDROID_API__ >= 9\nOfflineCEDModel::OfflineCEDModel(AAssetManager *mgr,\n                                 const AudioTaggingModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n#endif\n\nOfflineCEDModel::~OfflineCEDModel() = default;\n\nOrt::Value OfflineCEDModel::Forward(Ort::Value features) const {\n  return impl_->Forward(std::move(features));\n}\n\nint32_t OfflineCEDModel::NumEventClasses() const {\n  return impl_->NumEventClasses();\n}\n\nOrtAllocator *OfflineCEDModel::Allocator() const { return impl_->Allocator(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ced-model.h",
    "content": "// sherpa-onnx/csrc/offline-ced-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CED_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CED_MODEL_H_\n#include <memory>\n#include <utility>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/audio-tagging-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the CED model from\n * https://github.com/RicherMans/CED/blob/main/export_onnx.py\n */\nclass OfflineCEDModel {\n public:\n  explicit OfflineCEDModel(const AudioTaggingModelConfig &config);\n\n#if __ANDROID_API__ >= 9\n  OfflineCEDModel(AAssetManager *mgr, const AudioTaggingModelConfig &config);\n#endif\n\n  ~OfflineCEDModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   *\n   * @return Return a tensor\n   *  - probs: A 2-D tensor of shape (N, num_event_classes).\n   */\n  Ort::Value Forward(Ort::Value features) const;\n\n  /** Return the number of event classes of the model\n   */\n  int32_t NumEventClasses() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CED_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ct-transformer-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-ct-transformer-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineCtTransformerModelMetaData {\n  std::unordered_map<std::string, int32_t> token2id;\n  std::unordered_map<std::string, int32_t> punct2id;\n  std::vector<std::string> id2punct;\n\n  int32_t unk_id;\n  int32_t dot_id;\n  int32_t comma_id;\n  int32_t quest_id;\n  int32_t pause_id;\n  int32_t underline_id;\n  int32_t num_punctuations;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ct-transformer-model.cc",
    "content": "// sherpa-onnx/csrc/offline-ct-transformer-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ct-transformer-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCtTransformerModel::Impl {\n public:\n  explicit Impl(const OfflinePunctuationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.ct_transformer);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflinePunctuationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.ct_transformer);\n    Init(buf.data(), buf.size());\n  }\n\n  Ort::Value Forward(Ort::Value text, Ort::Value text_len) {\n    std::array<Ort::Value, 2> inputs = {std::move(text), std::move(text_len)};\n\n    auto ans =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(ans[0]);\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  const OfflineCtTransformerModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                      
                     sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::vector<std::string> tokens;\n    SHERPA_ONNX_READ_META_DATA_VEC_STRING_SEP(tokens, \"tokens\", \"|\");\n\n    int32_t vocab_size = 0;\n    SHERPA_ONNX_READ_META_DATA(vocab_size, \"vocab_size\");\n    if (static_cast<int32_t>(tokens.size()) != vocab_size) {\n      SHERPA_ONNX_LOGE(\"tokens.size() %d != vocab_size %d\",\n                       static_cast<int32_t>(tokens.size()), vocab_size);\n      exit(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA_VEC_STRING_SEP(meta_data_.id2punct,\n                                              \"punctuations\", \"|\");\n\n    std::string unk_symbol;\n    SHERPA_ONNX_READ_META_DATA_STR(unk_symbol, \"unk_symbol\");\n\n    // output shape is (N, T, num_punctuations)\n    meta_data_.num_punctuations =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[2];\n\n    int32_t i = 0;\n    for (const auto &t : tokens) {\n      meta_data_.token2id[t] = i;\n      i += 1;\n    }\n\n    i = 0;\n    for (const auto &p : meta_data_.id2punct) {\n      meta_data_.punct2id[p] = i;\n      i += 1;\n    }\n\n    meta_data_.unk_id = meta_data_.token2id.at(unk_symbol);\n\n    meta_data_.dot_id = meta_data_.punct2id.at(\"。\");\n    meta_data_.comma_id = meta_data_.punct2id.at(\"，\");\n    meta_data_.quest_id = meta_data_.punct2id.at(\"？\");\n    meta_data_.pause_id = meta_data_.punct2id.at(\"、\");\n    meta_data_.underline_id = meta_data_.punct2id.at(\"_\");\n\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"vocab_size: \" << meta_data_.token2id.size() << \"\\n\";\n      os << \"num_punctuations: \" << meta_data_.num_punctuations << \"\\n\";\n      os << 
\"punctuations: \";\n      for (const auto &s : meta_data_.id2punct) {\n        os << s << \" \";\n      }\n      os << \"\\n\";\n      SHERPA_ONNX_LOGE(\"\\n%s\\n\", os.str().c_str());\n    }\n  }\n\n private:\n  OfflinePunctuationModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineCtTransformerModelMetaData meta_data_;\n};\n\nOfflineCtTransformerModel::OfflineCtTransformerModel(\n    const OfflinePunctuationModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineCtTransformerModel::OfflineCtTransformerModel(\n    Manager *mgr, const OfflinePunctuationModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineCtTransformerModel::OfflineCtTransformerModel(\n    AAssetManager *mgr, const OfflinePunctuationModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineCtTransformerModel::OfflineCtTransformerModel(\n    NativeResourceManager *mgr, const OfflinePunctuationModelConfig &config);\n#endif\n\nOfflineCtTransformerModel::~OfflineCtTransformerModel() = default;\n\nOrt::Value OfflineCtTransformerModel::Forward(Ort::Value text,\n                                              Ort::Value text_len) const {\n  return impl_->Forward(std::move(text), std::move(text_len));\n}\n\nOrtAllocator *OfflineCtTransformerModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nconst OfflineCtTransformerModelMetaData &\nOfflineCtTransformerModel::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ct-transformer-model.h",
    "content": "// sherpa-onnx/csrc/offline-ct-transformer-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_H_\n#include <memory>\n#include <utility>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ct-transformer-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements\n * https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/python/onnxruntime/funasr_onnx/punc_bin.py#L17\n * from FunASR\n */\nclass OfflineCtTransformerModel {\n public:\n  explicit OfflineCtTransformerModel(\n      const OfflinePunctuationModelConfig &config);\n\n  template <typename Manager>\n  OfflineCtTransformerModel(Manager *mgr,\n                            const OfflinePunctuationModelConfig &config);\n\n  ~OfflineCtTransformerModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param text  A tensor of shape (N, T) of dtype int32.\n   * @param text  A tensor of shape (N) of dtype int32.\n   *\n   * @return Return a tensor\n   *  - punctuation_ids: A 2-D tensor of shape (N, T).\n   */\n  Ort::Value Forward(Ort::Value text, Ort::Value text_len) const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  const OfflineCtTransformerModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CT_TRANSFORMER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-ctc-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CTC_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CTC_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct OfflineCtcDecoderResult {\n  /// The decoded token IDs\n  std::vector<int64_t> tokens;\n\n  /// The decoded word IDs\n  /// Note: tokens.size() is usually not equal to words.size()\n  /// words is empty for greedy search decoding.\n  /// it is not empty when an HLG graph or an HLG graph is used.\n  std::vector<int32_t> words;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  /// Note: The index is after subsampling\n  ///\n  /// tokens.size() == timestamps.size()\n  std::vector<int32_t> timestamps;\n};\n\nclass OfflineCtcDecoder {\n public:\n  virtual ~OfflineCtcDecoder() = default;\n\n  /** Run CTC decoding given the output from the encoder model.\n   *\n   * @param log_probs A 3-D tensor of shape (N, T, vocab_size) containing\n   *                  lob_probs.\n   * @param log_probs_length A 1-D tensor of shape (N,) containing number\n   *                         of valid frames in log_probs before padding.\n   *\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineCtcDecoderResult> Decode(\n      Ort::Value log_probs, Ort::Value log_probs_length) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CTC_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-fst-decoder-config.cc",
    "content": "// sherpa-onnx/csrc/offline-ctc-fst-decoder-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::string OfflineCtcFstDecoderConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineCtcFstDecoderConfig(\";\n  os << \"graph=\\\"\" << graph << \"\\\", \";\n  os << \"max_active=\" << max_active << \")\";\n\n  return os.str();\n}\n\nvoid OfflineCtcFstDecoderConfig::Register(ParseOptions *po) {\n  std::string prefix = \"ctc\";\n  ParseOptions p(prefix, po);\n\n  p.Register(\"graph\", &graph, \"Path to H.fst, HL.fst, or HLG.fst\");\n\n  p.Register(\"max-active\", &max_active,\n             \"Decoder max active states.  Larger->slower; more accurate\");\n}\n\nbool OfflineCtcFstDecoderConfig::Validate() const {\n  if (!graph.empty() && !FileExists(graph)) {\n    SHERPA_ONNX_LOGE(\"graph: '%s' does not exist\", graph.c_str());\n    return false;\n  }\n  return true;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h",
    "content": "// sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineCtcFstDecoderConfig {\n  // Path to H.fst, HL.fst or HLG.fst\n  std::string graph;\n  int32_t max_active = 3000;\n\n  OfflineCtcFstDecoderConfig() = default;\n\n  OfflineCtcFstDecoderConfig(const std::string &graph, int32_t max_active)\n      : graph(graph), max_active(max_active) {}\n\n  std::string ToString() const;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-fst-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-ctc-fst-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder.h\"\n\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"fst/fstlib.h\"\n#include \"kaldi-decoder/csrc/decodable-ctc.h\"\n#include \"kaldi-decoder/csrc/eigen.h\"\n#include \"kaldi-decoder/csrc/faster-decoder.h\"\n#include \"sherpa-onnx/csrc/fst-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\n/**\n * @param decoder\n * @param p Pointer to a 2-d array of shape (num_frames, vocab_size)\n * @param num_frames Number of rows in the 2-d array.\n * @param vocab_size Number of columns in the 2-d array.\n * @return Return the decoded result.\n */\nstatic OfflineCtcDecoderResult DecodeOne(kaldi_decoder::FasterDecoder *decoder,\n                                         const float *p, int32_t num_frames,\n                                         int32_t vocab_size) {\n  OfflineCtcDecoderResult r;\n  kaldi_decoder::DecodableCtc decodable(p, num_frames, vocab_size);\n\n  decoder->Decode(&decodable);\n\n  if (!decoder->ReachedFinal()) {\n    SHERPA_ONNX_LOGE(\"Not reached final!\");\n    return r;\n  }\n\n  fst::VectorFst<fst::LatticeArc> decoded;  // linear FST.\n  decoder->GetBestPath(&decoded);\n\n  if (decoded.NumStates() == 0) {\n    SHERPA_ONNX_LOGE(\"Empty best path!\");\n    return r;\n  }\n\n  auto cur_state = decoded.Start();\n\n  int32_t blank_id = 0;\n\n  for (int32_t t = 0, prev = -1; decoded.NumArcs(cur_state) == 1; ++t) {\n    fst::ArcIterator<fst::Fst<fst::LatticeArc>> iter(decoded, cur_state);\n    const auto &arc = iter.Value();\n\n    cur_state = arc.nextstate;\n\n    if (arc.ilabel == prev) {\n      continue;\n    }\n\n    // 0 is epsilon here\n    if (arc.ilabel == 0 || arc.ilabel == blank_id + 1) {\n      prev = arc.ilabel;\n      continue;\n    }\n\n    // -1 here since the input labels are incremented during graph\n    // construction\n    
r.tokens.push_back(arc.ilabel - 1);\n    if (arc.olabel != 0) {\n      r.words.push_back(arc.olabel);\n    }\n\n    r.timestamps.push_back(t);\n    prev = arc.ilabel;\n  }\n\n  return r;\n}\n\nOfflineCtcFstDecoder::OfflineCtcFstDecoder(\n    const OfflineCtcFstDecoderConfig &config)\n    : config_(config), fst_(ReadGraph(config_.graph)) {}\n\nstd::vector<OfflineCtcDecoderResult> OfflineCtcFstDecoder::Decode(\n    Ort::Value log_probs, Ort::Value log_probs_length) {\n  std::vector<int64_t> shape = log_probs.GetTensorTypeAndShapeInfo().GetShape();\n\n  assert(static_cast<int32_t>(shape.size()) == 3);\n  int32_t batch_size = shape[0];\n  int32_t T = shape[1];\n  int32_t vocab_size = shape[2];\n\n  std::vector<int64_t> length_shape =\n      log_probs_length.GetTensorTypeAndShapeInfo().GetShape();\n  assert(static_cast<int32_t>(length_shape.size()) == 1);\n\n  assert(shape[0] == length_shape[0]);\n\n  kaldi_decoder::FasterDecoderOptions opts;\n  opts.max_active = config_.max_active;\n  kaldi_decoder::FasterDecoder faster_decoder(*fst_, opts);\n\n  const float *start = log_probs.GetTensorData<float>();\n\n  std::vector<OfflineCtcDecoderResult> ans;\n  ans.reserve(batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    const float *p = start + i * T * vocab_size;\n    int32_t num_frames = log_probs_length.GetTensorData<int64_t>()[i];\n    auto r = DecodeOne(&faster_decoder, p, num_frames, vocab_size);\n    ans.push_back(std::move(r));\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-fst-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-ctc-fst-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_H_\n\n#include <memory>\n#include <vector>\n\n#include \"fst/fst.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCtcFstDecoder : public OfflineCtcDecoder {\n public:\n  explicit OfflineCtcFstDecoder(const OfflineCtcFstDecoderConfig &config);\n\n  std::vector<OfflineCtcDecoderResult> Decode(\n      Ort::Value log_probs, Ort::Value log_probs_length) override;\n\n private:\n  OfflineCtcFstDecoderConfig config_;\n\n  std::unique_ptr<fst::Fst<fst::StdArc>> fst_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CTC_FST_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineCtcDecoderResult> OfflineCtcGreedySearchDecoder::Decode(\n    Ort::Value log_probs, Ort::Value log_probs_length) {\n  std::vector<int64_t> shape = log_probs.GetTensorTypeAndShapeInfo().GetShape();\n  int32_t batch_size = static_cast<int32_t>(shape[0]);\n  int32_t num_frames = static_cast<int32_t>(shape[1]);\n  int32_t vocab_size = static_cast<int32_t>(shape[2]);\n\n  const int64_t *p_log_probs_length = log_probs_length.GetTensorData<int64_t>();\n\n  std::vector<OfflineCtcDecoderResult> ans;\n  ans.reserve(batch_size);\n\n  for (int32_t b = 0; b != batch_size; ++b) {\n    const float *p_log_probs =\n        log_probs.GetTensorData<float>() + b * num_frames * vocab_size;\n\n    OfflineCtcDecoderResult r;\n    int64_t prev_id = -1;\n\n    for (int32_t t = 0; t != static_cast<int32_t>(p_log_probs_length[b]); ++t) {\n      auto y = static_cast<int64_t>(std::distance(\n          static_cast<const float *>(p_log_probs),\n          std::max_element(\n              static_cast<const float *>(p_log_probs),\n              static_cast<const float *>(p_log_probs) + vocab_size)));\n      p_log_probs += vocab_size;\n\n      if (y != blank_id_ && y != prev_id) {\n        r.tokens.push_back(y);\n        r.timestamps.push_back(t);\n      }\n      prev_id = y;\n    }  // for (int32_t t = 0; ...)\n\n    ans.push_back(std::move(r));\n  }\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CTC_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CTC_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-ctc-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCtcGreedySearchDecoder : public OfflineCtcDecoder {\n public:\n  explicit OfflineCtcGreedySearchDecoder(int32_t blank_id)\n      : blank_id_(blank_id) {}\n\n  std::vector<OfflineCtcDecoderResult> Decode(\n      Ort::Value log_probs, Ort::Value log_probs_length) override;\n\n private:\n  int32_t blank_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CTC_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-ctc-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-dolphin-model.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-medasr-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-tdnn-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-telespeech-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-wenet-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-zipformer-ctc-model.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace {\n\nenum class ModelType : std::uint8_t {\n  kEncDecCTCModelBPE,\n  kEncDecCTCModel,\n  kEncDecHybridRNNTCTCBPEModel,\n  kTdnn,\n  kZipformerCtc,\n  kWenetCtc,\n  kTeleSpeechCtc,\n  kUnknown,\n};\n\n}  // namespace\n\nnamespace sherpa_onnx {\n\nstatic ModelType GetModelType(char *model_data, size_t model_data_length,\n                              bool debug) {\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  auto sess = std::make_unique<Ort::Session>(env, model_data, model_data_length,\n                                             sess_opts);\n\n  Ort::ModelMetadata meta_data = sess->GetModelMetadata();\n  if (debug) {\n    std::ostringstream os;\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", 
os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\"\n        \"If you are using models from NeMo, please refer to\\n\"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-nemo-ctc-en-citrinet-512/blob/main/add-model-metadata.py\\n\"\n        \"or \"\n        \"https://github.com/k2-fsa/sherpa-onnx/tree/master/scripts/nemo/\"\n        \"fast-conformer-hybrid-transducer-ctc\\n\"\n        \"If you are using models from WeNet, please refer to\\n\"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/\"\n        \"run.sh\\n\"\n        \"If you are using models from TeleSpeech, please refer to\\n\"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/tele-speech/\"\n        \"add-metadata.py\"\n        \"\\n\"\n        \"for how to add metadata to model.onnx\\n\");\n    return ModelType::kUnknown;\n  }\n\n  if (model_type == \"EncDecCTCModelBPE\") {\n    return ModelType::kEncDecCTCModelBPE;\n  } else if (model_type == \"EncDecCTCModel\") {\n    return ModelType::kEncDecCTCModel;\n  } else if (model_type == \"EncDecHybridRNNTCTCBPEModel\") {\n    return ModelType::kEncDecHybridRNNTCTCBPEModel;\n  } else if (model_type == \"tdnn\") {\n    return ModelType::kTdnn;\n  } else if (model_type == \"zipformer2_ctc\") {\n    return ModelType::kZipformerCtc;\n  } else if (model_type == \"wenet_ctc\") {\n    return ModelType::kWenetCtc;\n  } else if (model_type == \"telespeech_ctc\") {\n    return ModelType::kTeleSpeechCtc;\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %s\", model_type.c_str());\n    return ModelType::kUnknown;\n  }\n}\n\nstd::unique_ptr<OfflineCtcModel> OfflineCtcModel::Create(\n    const OfflineModelConfig &config) {\n 
 if (!config.dolphin.model.empty()) {\n    return std::make_unique<OfflineDolphinModel>(config);\n  } else if (!config.nemo_ctc.model.empty()) {\n    return std::make_unique<OfflineNemoEncDecCtcModel>(config);\n  } else if (!config.tdnn.model.empty()) {\n    return std::make_unique<OfflineTdnnCtcModel>(config);\n  } else if (!config.zipformer_ctc.model.empty()) {\n    return std::make_unique<OfflineZipformerCtcModel>(config);\n  } else if (!config.wenet_ctc.model.empty()) {\n    return std::make_unique<OfflineWenetCtcModel>(config);\n  } else if (!config.telespeech_ctc.empty()) {\n    return std::make_unique<OfflineTeleSpeechCtcModel>(config);\n  } else if (!config.omnilingual.model.empty()) {\n    return std::make_unique<OfflineOmnilingualAsrCtcModel>(config);\n  } else if (!config.medasr.model.empty()) {\n    return std::make_unique<OfflineMedAsrCtcModel>(config);\n  } else if (!config.fire_red_asr_ctc.model.empty()) {\n    return std::make_unique<OfflineFireRedAsrCtcModel>(config);\n  }\n\n  // TODO(fangjun): Refactor it. 
We don't need to use model_type here\n  ModelType model_type = ModelType::kUnknown;\n\n  std::string filename;\n  if (!config.nemo_ctc.model.empty()) {\n    filename = config.nemo_ctc.model;\n  } else if (!config.tdnn.model.empty()) {\n    filename = config.tdnn.model;\n  } else if (!config.zipformer_ctc.model.empty()) {\n    filename = config.zipformer_ctc.model;\n  } else if (!config.wenet_ctc.model.empty()) {\n    filename = config.wenet_ctc.model;\n  } else if (!config.telespeech_ctc.empty()) {\n    filename = config.telespeech_ctc;\n  } else {\n    SHERPA_ONNX_LOGE(\"Please specify a CTC model\");\n    exit(-1);\n  }\n\n  {\n    auto buffer = ReadFile(filename);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kEncDecCTCModelBPE:\n    case ModelType::kEncDecCTCModel:\n      return std::make_unique<OfflineNemoEncDecCtcModel>(config);\n    case ModelType::kEncDecHybridRNNTCTCBPEModel:\n      return std::make_unique<OfflineNemoEncDecHybridRNNTCTCBPEModel>(config);\n    case ModelType::kTdnn:\n      return std::make_unique<OfflineTdnnCtcModel>(config);\n    case ModelType::kZipformerCtc:\n      return std::make_unique<OfflineZipformerCtcModel>(config);\n    case ModelType::kWenetCtc:\n      return std::make_unique<OfflineWenetCtcModel>(config);\n    case ModelType::kTeleSpeechCtc:\n      return std::make_unique<OfflineTeleSpeechCtcModel>(config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in offline CTC!\");\n      return nullptr;\n  }\n\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineCtcModel> OfflineCtcModel::Create(\n    Manager *mgr, const OfflineModelConfig &config) {\n  if (!config.dolphin.model.empty()) {\n    return std::make_unique<OfflineDolphinModel>(mgr, config);\n  } else if (!config.nemo_ctc.model.empty()) {\n    return std::make_unique<OfflineNemoEncDecCtcModel>(mgr, config);\n  } else if 
(!config.tdnn.model.empty()) {\n    return std::make_unique<OfflineTdnnCtcModel>(mgr, config);\n  } else if (!config.zipformer_ctc.model.empty()) {\n    return std::make_unique<OfflineZipformerCtcModel>(mgr, config);\n  } else if (!config.wenet_ctc.model.empty()) {\n    return std::make_unique<OfflineWenetCtcModel>(mgr, config);\n  } else if (!config.telespeech_ctc.empty()) {\n    return std::make_unique<OfflineTeleSpeechCtcModel>(mgr, config);\n  } else if (!config.omnilingual.model.empty()) {\n    return std::make_unique<OfflineOmnilingualAsrCtcModel>(mgr, config);\n  } else if (!config.medasr.model.empty()) {\n    return std::make_unique<OfflineMedAsrCtcModel>(mgr, config);\n  } else if (!config.fire_red_asr_ctc.model.empty()) {\n    return std::make_unique<OfflineFireRedAsrCtcModel>(mgr, config);\n  }\n\n  // TODO(fangjun): Refactor it. We don't need to use model_type here\n  ModelType model_type = ModelType::kUnknown;\n\n  std::string filename;\n  if (!config.nemo_ctc.model.empty()) {\n    filename = config.nemo_ctc.model;\n  } else if (!config.tdnn.model.empty()) {\n    filename = config.tdnn.model;\n  } else if (!config.zipformer_ctc.model.empty()) {\n    filename = config.zipformer_ctc.model;\n  } else if (!config.wenet_ctc.model.empty()) {\n    filename = config.wenet_ctc.model;\n  } else if (!config.telespeech_ctc.empty()) {\n    filename = config.telespeech_ctc;\n  } else {\n    SHERPA_ONNX_LOGE(\"Please specify a CTC model\");\n    exit(-1);\n  }\n\n  {\n    auto buffer = ReadFile(mgr, filename);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kEncDecCTCModelBPE:\n    case ModelType::kEncDecCTCModel:\n      return std::make_unique<OfflineNemoEncDecCtcModel>(mgr, config);\n    case ModelType::kEncDecHybridRNNTCTCBPEModel:\n      return std::make_unique<OfflineNemoEncDecHybridRNNTCTCBPEModel>(mgr,\n                                                                      
config);\n    case ModelType::kTdnn:\n      return std::make_unique<OfflineTdnnCtcModel>(mgr, config);\n    case ModelType::kZipformerCtc:\n      return std::make_unique<OfflineZipformerCtcModel>(mgr, config);\n    case ModelType::kWenetCtc:\n      return std::make_unique<OfflineWenetCtcModel>(mgr, config);\n    case ModelType::kTeleSpeechCtc:\n      return std::make_unique<OfflineTeleSpeechCtcModel>(mgr, config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in offline CTC!\");\n      return nullptr;\n  }\n\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineCtcModel> OfflineCtcModel::Create(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflineCtcModel> OfflineCtcModel::Create(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-ctc-model.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_CTC_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCtcModel {\n public:\n  virtual ~OfflineCtcModel() = default;\n\n  static std::unique_ptr<OfflineCtcModel> Create(\n      const OfflineModelConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineCtcModel> Create(\n      Manager *mgr, const OfflineModelConfig &config);\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  virtual std::vector<Ort::Value> Forward(Ort::Value features,\n                                          Ort::Value features_length) = 0;\n\n  /** Return the vocabulary size of the model\n   */\n  virtual int32_t VocabSize() const = 0;\n\n  /** Return the subsampling factor of the model.\n   *\n   * For NeMo Citrinet, the subsampling factor is usually 4.\n   * For NeMo Conformer CTC, the subsampling factor is usually 8.\n   * The default implementation returns 1.\n   */\n  virtual int32_t SubsamplingFactor() const { return 1; }\n\n  /** Return an allocator for allocating memory\n   */\n  virtual OrtAllocator *Allocator() const = 0;\n\n  /** Some models, e.g., those from NeMo, require feature preprocessing.\n   * Return the name of the normalization method; the default is an empty\n   * string.\n   */\n  virtual std::string FeatureNormalizationMethod() const { return {}; }\n\n  // Return true if the model supports batch size > 1\n  virtual bool SupportBatchProcessing() const { return true; }\n\n  // Return true for models from https://github.com/salute-developers/GigaAM,\n  // false otherwise.\n  virtual bool IsGigaAM() const { return false; }\n\n  // Apply global CMVN to `features` in place, treated as a row-major\n  // (num_frames, feat_dim) matrix. Dolphin and FireRedASR CTC models override\n  // this; the default implementation does nothing.\n  virtual void NormalizeFeatures(float *features, int32_t num_frames,\n                                 int32_t feat_dim) const {}\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-dolphin-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-dolphin-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-dolphin-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineDolphinModelConfig::Register(ParseOptions *po) {\n  po->Register(\"dolphin-model\", &model,\n               \"Path to model.onnx of Dolphin CTC branch.\");\n}\n\nbool OfflineDolphinModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"Dolphin model '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineDolphinModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineDolphinModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-dolphin-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-dolphin-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineDolphinModelConfig {\n  std::string model;\n\n  OfflineDolphinModelConfig() = default;\n  explicit OfflineDolphinModelConfig(const std::string &model) : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-dolphin-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-dolphin-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_META_DATA_H_\n\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineDolphinModelMetaData {\n  int32_t vocab_size;\n  int32_t subsampling_factor = 4;\n  std::vector<float> mean;\n  std::vector<float> inv_stddev;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-dolphin-model.cc",
    "content": "// sherpa-onnx/csrc/offline-dolphin-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-dolphin-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineDolphinModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.dolphin.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.dolphin.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {\n        std::move(features),\n        std::move(features_length),\n    };\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return meta_data_.vocab_size; }\n\n  int32_t SubsamplingFactor() const { return meta_data_.subsampling_factor; }\n\n  void NormalizeFeatures(float *features, int32_t num_frames,\n                     
    int32_t feat_dim) const {\n    using RowMajorMat =\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;\n    Eigen::Map<RowMajorMat> x(features, num_frames, feat_dim);\n\n    Eigen::Map<const Eigen::RowVectorXf> mean(meta_data_.mean.data(), feat_dim);\n    Eigen::Map<const Eigen::RowVectorXf> inv_std(meta_data_.inv_stddev.data(),\n                                                 feat_dim);\n    x.array() =\n        (x.array().rowwise() - mean.array()).rowwise() * inv_std.array();\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.vocab_size, \"vocab_size\");\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.mean, \"mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.inv_stddev, \"invstd\");\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineDolphinModelMetaData 
meta_data_;\n};\n\nOfflineDolphinModel::OfflineDolphinModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineDolphinModel::OfflineDolphinModel(Manager *mgr,\n                                         const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineDolphinModel::~OfflineDolphinModel() = default;\n\nstd::vector<Ort::Value> OfflineDolphinModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineDolphinModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OfflineDolphinModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nvoid OfflineDolphinModel::NormalizeFeatures(float *features, int32_t num_frames,\n                                            int32_t feat_dim) const {\n  return impl_->NormalizeFeatures(features, num_frames, feat_dim);\n}\n\nOrtAllocator *OfflineDolphinModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineDolphinModel::OfflineDolphinModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineDolphinModel::OfflineDolphinModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-dolphin-model.h",
    "content": "// sherpa-onnx/csrc/offline-dolphin-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_H_\n\n#include <memory>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-dolphin-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineDolphinModel : public OfflineCtcModel {\n public:\n  explicit OfflineDolphinModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineDolphinModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineDolphinModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return the subsampling factor of the model.\n   *\n   * It is taken from OfflineDolphinModelMetaData, where it defaults to 4.\n   */\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  bool SupportBatchProcessing() const override { return true; }\n\n  void NormalizeFeatures(float *features, int32_t num_frames,\n                         int32_t feat_dim) const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_DOLPHIN_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineFireRedAsrCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"fire-red-asr-ctc\", &model,\n      \"Path to model.onnx from FireRedASR CTC. \"\n      \"Please see \"\n      \"https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/pretrained.html \"\n      \"for available FireRedASR CTC models\");\n}\n\nbool OfflineFireRedAsrCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"FireRedASR CTC model: '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineFireRedAsrCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineFireRedAsrCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineFireRedAsrCtcModelConfig {\n  std::string model;\n\n  OfflineFireRedAsrCtcModelConfig() = default;\n  explicit OfflineFireRedAsrCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineFireRedAsrCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.fire_red_asr_ctc.model),\n        sess_opts_);\n    Init(nullptr, 0);\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.fire_red_asr_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return 
subsampling_factor_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  void NormalizeFeatures(float *features, int32_t num_frames,\n                         int32_t feat_dim) const {\n    if (static_cast<int32_t>(mean_.size()) != feat_dim) {\n      SHERPA_ONNX_LOGE(\"Bad things happened\");\n      SHERPA_ONNX_LOGE(\"Wrong feat dim %d. Expect: %d\", feat_dim,\n                       static_cast<int32_t>(mean_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    using RowMajorMat =\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;\n    Eigen::Map<RowMajorMat> x(features, num_frames, feat_dim);\n\n    Eigen::Map<const Eigen::RowVectorXf> mean(mean_.data(), feat_dim);\n    Eigen::Map<const Eigen::RowVectorXf> inv_std(inv_stddev_.data(), feat_dim);\n    x.array() =\n        (x.array().rowwise() - mean.array()).rowwise() * inv_std.array();\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                             model_data_length, sess_opts_);\n    } else if (!sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize session outside of this \"\n          \"function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != 
\"fire-red-asr-2-ctc\") {\n      SHERPA_ONNX_LOGE(\"Expect model type fire-red-asr-2-ctc. Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(subsampling_factor_,\n                                            \"subsampling_factor\", 4);\n\n    auto shape =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();\n    vocab_size_ = shape.back();\n\n    if (config_.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"subsampling_factor: %{public}d\", subsampling_factor_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %{public}d\", vocab_size_);\n#else\n      SHERPA_ONNX_LOGE(\"subsampling_factor: %d\", subsampling_factor_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n#endif\n    }\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(mean_, \"cmvn_mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(inv_stddev_, \"cmvn_inv_stddev\");\n    if (mean_.size() != inv_stddev_.size()) {\n      SHERPA_ONNX_LOGE(\"Incorrect cmvn. 
mean size: %d, inv_stddev size: %d\",\n                       static_cast<int32_t>(mean_.size()),\n                       static_cast<int32_t>(inv_stddev_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 0;\n\n  std::vector<float> mean_;\n  std::vector<float> inv_stddev_;\n};\n\nOfflineFireRedAsrCtcModel::OfflineFireRedAsrCtcModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineFireRedAsrCtcModel::OfflineFireRedAsrCtcModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineFireRedAsrCtcModel::~OfflineFireRedAsrCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineFireRedAsrCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineFireRedAsrCtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nint32_t OfflineFireRedAsrCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineFireRedAsrCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nvoid OfflineFireRedAsrCtcModel::NormalizeFeatures(float *features,\n                                                  int32_t num_frames,\n                                                  int32_t feat_dim) const {\n  return impl_->NormalizeFeatures(features, num_frames, feat_dim);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineFireRedAsrCtcModel::OfflineFireRedAsrCtcModel(\n    
AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineFireRedAsrCtcModel::OfflineFireRedAsrCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-ctc-model.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the CTC model from FIRE_RED_ASR.\n */\nclass OfflineFireRedAsrCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineFireRedAsrCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineFireRedAsrCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineFireRedAsrCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  void NormalizeFeatures(float *features, int32_t num_frames,\n                         int32_t feat_dim) const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-decoder.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_DECODER_H_\n\n#include <cstdint>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct OfflineFireRedAsrDecoderResult {\n  /// The decoded token IDs\n  std::vector<int32_t> tokens;\n};\n\nclass OfflineFireRedAsrDecoder {\n public:\n  virtual ~OfflineFireRedAsrDecoder() = default;\n\n  /** Run beam search given the output from the FireRedAsr encoder model.\n   *\n   * @param n_layer_cross_k       A 4-D tensor of shape\n   *                              (num_decoder_layers, N, T, d_model).\n   * @param n_layer_cross_v       A 4-D tensor of shape\n   *                              (num_decoder_layers, N, T, d_model).\n   *\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineFireRedAsrDecoderResult> Decode(\n      Ort::Value n_layer_cross_k, Ort::Value n_layer_cross_v,\n      int32_t num_feature_frames) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <tuple>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\n// Note: this functions works only for batch size == 1 at present\nstd::vector<OfflineFireRedAsrDecoderResult>\nOfflineFireRedAsrGreedySearchDecoder::Decode(Ort::Value cross_k,\n                                             Ort::Value cross_v,\n                                             int32_t num_feature_frames) {\n  const auto &meta_data = model_->GetModelMetadata();\n\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  // For multilingual models, initial_tokens contains [sot, language, task]\n  //   - language is English by default\n  //   - task is transcribe by default\n  //\n  // For non-multilingual models, initial_tokens contains [sot]\n  std::array<int64_t, 2> token_shape = {1, 1};\n  int64_t token = meta_data.sos_id;\n\n  int32_t batch_size = 1;\n\n  Ort::Value tokens = Ort::Value::CreateTensor(\n      memory_info, &token, 1, token_shape.data(), token_shape.size());\n\n  std::array<int64_t, 1> offset_shape{1};\n  Ort::Value offset = Ort::Value::CreateTensor<int64_t>(\n      model_->Allocator(), offset_shape.data(), offset_shape.size());\n  *(offset.GetTensorMutableData<int64_t>()) = 0;\n\n  std::vector<OfflineFireRedAsrDecoderResult> ans(1);\n\n  auto self_kv_cache = model_->GetInitialSelfKVCache();\n\n  std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n             Ort::Value>\n      decoder_out = {Ort::Value{nullptr},\n                     std::move(self_kv_cache.first),\n                     std::move(self_kv_cache.second),\n                     std::move(cross_k),\n                  
   std::move(cross_v),\n                     std::move(offset)};\n\n  // assume at most 6 tokens per second\n  int32_t num_possible_tokens = num_feature_frames / 100.0 * 6;\n  num_possible_tokens =\n      std::min<int32_t>(num_possible_tokens, meta_data.max_len / 2);\n\n  for (int32_t i = 0; i < num_possible_tokens; ++i) {\n    decoder_out = model_->ForwardDecoder(View(&tokens),\n                                         std::move(std::get<1>(decoder_out)),\n                                         std::move(std::get<2>(decoder_out)),\n                                         std::move(std::get<3>(decoder_out)),\n                                         std::move(std::get<4>(decoder_out)),\n                                         std::move(std::get<5>(decoder_out)));\n\n    const auto &logits = std::get<0>(decoder_out);\n    const float *p_logits = logits.GetTensorData<float>();\n\n    auto logits_shape = logits.GetTensorTypeAndShapeInfo().GetShape();\n    int32_t vocab_size = logits_shape[2];\n\n    int32_t max_token_id = static_cast<int32_t>(std::distance(\n        p_logits, std::max_element(p_logits, p_logits + vocab_size)));\n    if (max_token_id == meta_data.eos_id) {\n      break;\n    }\n\n    ans[0].tokens.push_back(max_token_id);\n\n    token = max_token_id;\n\n    // increment offset\n    *(std::get<5>(decoder_out).GetTensorMutableData<int64_t>()) += 1;\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineFireRedAsrGreedySearchDecoder : public OfflineFireRedAsrDecoder {\n public:\n  explicit OfflineFireRedAsrGreedySearchDecoder(OfflineFireRedAsrModel *model)\n      : model_(model) {}\n\n  std::vector<OfflineFireRedAsrDecoderResult> Decode(\n      Ort::Value cross_k, Ort::Value cross_v,\n      int32_t num_feature_frames) override;\n\n private:\n  OfflineFireRedAsrModel *model_;  // not owned\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineFireRedAsrModelConfig::Register(ParseOptions *po) {\n  po->Register(\"fire-red-asr-encoder\", &encoder,\n               \"Path to onnx encoder of FireRedAsr\");\n\n  po->Register(\"fire-red-asr-decoder\", &decoder,\n               \"Path to onnx decoder of FireRedAsr\");\n}\n\nbool OfflineFireRedAsrModelConfig::Validate() const {\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --fire-red-asr-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"FireRedAsr encoder file '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --fire-red-asr-decoder\");\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"FireRedAsr decoder file '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineFireRedAsrModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineFireRedAsrModelConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\n// see https://github.com/FireRedTeam/FireRedASR\nstruct OfflineFireRedAsrModelConfig {\n  std::string encoder;\n  std::string decoder;\n\n  OfflineFireRedAsrModelConfig() = default;\n  OfflineFireRedAsrModelConfig(const std::string &encoder,\n                               const std::string &decoder)\n      : encoder(encoder), decoder(decoder) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineFireRedAsrModelMetaData {\n  int32_t sos_id;\n  int32_t eos_id;\n  int32_t max_len;\n\n  int32_t num_decoder_layers;\n  int32_t num_head;\n  int32_t head_dim;\n\n  std::vector<float> mean;\n  std::vector<float> inv_stddev;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-model.cc",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nstatic inline bool IsCudaProvider(const std::string &provider) {\n  return provider == \"cuda\";\n}\n\n}  // namespace\n\nclass OfflineFireRedAsrModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    {\n      auto buf = ReadFile(config.fire_red_asr.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.fire_red_asr.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    InitCudaIOBinding();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || 
config.provider.empty()) {\n    {\n      auto buf = ReadFile(mgr, config.fire_red_asr.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.fire_red_asr.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    InitCudaIOBinding();\n  }\n\n  std::pair<Ort::Value, Ort::Value> ForwardEncoder(Ort::Value features,\n                                                   Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs{std::move(features),\n                                     std::move(features_length)};\n\n    std::vector<Ort::Value> encoder_out;\n\n    if (use_cuda_iobinding_) {\n      // Encoder outputs (cross_k, cross_v) are used multiple times in decoder\n      // steps, so keep them on GPU to avoid device<->host copies.\n      Ort::IoBinding binding(*encoder_sess_);\n      binding.BindInput(encoder_input_names_ptr_[0], inputs[0]);\n      binding.BindInput(encoder_input_names_ptr_[1], inputs[1]);\n\n      binding.BindOutput(encoder_output_names_ptr_[0], *cuda_mem_info_);\n      binding.BindOutput(encoder_output_names_ptr_[1], *cuda_mem_info_);\n\n      binding.SynchronizeInputs();\n      encoder_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      encoder_out = binding.GetOutputValues();\n    } else {\n      encoder_out = encoder_sess_->Run(\n          {}, encoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n          encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n    }\n\n    return {std::move(encoder_out[0]), std::move(encoder_out[1])};\n  }\n\n  std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n             Ort::Value>\n  ForwardDecoder(Ort::Value tokens, Ort::Value n_layer_self_k_cache,\n                 Ort::Value n_layer_self_v_cache, Ort::Value n_layer_cross_k,\n                 Ort::Value n_layer_cross_v, Ort::Value offset) {\n    std::array<Ort::Value, 6> decoder_input = 
{std::move(tokens),\n                                               std::move(n_layer_self_k_cache),\n                                               std::move(n_layer_self_v_cache),\n                                               std::move(n_layer_cross_k),\n                                               std::move(n_layer_cross_v),\n                                               std::move(offset)};\n\n    std::vector<Ort::Value> decoder_out;\n\n    if (use_cuda_iobinding_) {\n      // CPU-side sampling needs logits on CPU, while self KV cache should\n      // remain on GPU to avoid large device<->host copies between decode steps.\n      Ort::IoBinding binding(*decoder_sess_);\n      for (size_t i = 0; i < decoder_input.size(); ++i) {\n        binding.BindInput(decoder_input_names_ptr_[i], decoder_input[i]);\n      }\n\n      binding.BindOutput(decoder_output_names_ptr_[0], cpu_mem_info_);\n      binding.BindOutput(decoder_output_names_ptr_[1], *cuda_mem_info_);\n      binding.BindOutput(decoder_output_names_ptr_[2], *cuda_mem_info_);\n\n      binding.SynchronizeInputs();\n      decoder_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      decoder_out = binding.GetOutputValues();\n    } else {\n      decoder_out = decoder_sess_->Run(\n          {}, decoder_input_names_ptr_.data(), decoder_input.data(),\n          decoder_input.size(), decoder_output_names_ptr_.data(),\n          decoder_output_names_ptr_.size());\n    }\n\n    return std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n                      Ort::Value, Ort::Value>{\n        std::move(decoder_out[0]),   std::move(decoder_out[1]),\n        std::move(decoder_out[2]),   std::move(decoder_input[3]),\n        std::move(decoder_input[4]), std::move(decoder_input[5])};\n  }\n\n  std::pair<Ort::Value, Ort::Value> GetInitialSelfKVCache() {\n    int32_t batch_size = 1;\n    std::array<int64_t, 5> shape{meta_data_.num_decoder_layers, batch_size,\n                      
           meta_data_.max_len, meta_data_.num_head,\n                                 meta_data_.head_dim};\n\n    Ort::Value n_layer_self_k_cache = Ort::Value::CreateTensor<float>(\n        Allocator(), shape.data(), shape.size());\n\n    Ort::Value n_layer_self_v_cache = Ort::Value::CreateTensor<float>(\n        Allocator(), shape.data(), shape.size());\n\n    auto n = shape[0] * shape[1] * shape[2] * shape[3] * shape[4];\n\n    float *p_k = n_layer_self_k_cache.GetTensorMutableData<float>();\n    float *p_v = n_layer_self_v_cache.GetTensorMutableData<float>();\n\n    memset(p_k, 0, sizeof(float) * n);\n    memset(p_v, 0, sizeof(float) * n);\n\n    return {std::move(n_layer_self_k_cache), std::move(n_layer_self_v_cache)};\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  const OfflineFireRedAsrModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_decoder_layers,\n                               \"num_decoder_layers\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_head, \"num_head\");\n    
SHERPA_ONNX_READ_META_DATA(meta_data_.head_dim, \"head_dim\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sos_id, \"sos\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.eos_id, \"eos\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.max_len, \"max_len\");\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.mean, \"cmvn_mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.inv_stddev,\n                                         \"cmvn_inv_stddev\");\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n  }\n\n  void InitCudaIOBinding() {\n    use_cuda_iobinding_ =\n        (!is_cpu_provider_ && IsCudaProvider(config_.provider));\n    if (use_cuda_iobinding_) {\n      // Use device 0 by default. 
SessionOptions() in sherpa-onnx usually\n      // configures the CUDA EP device; binding here only affects output memory.\n      cuda_mem_info_ = std::make_unique<Ort::MemoryInfo>(\n          \"Cuda\", OrtDeviceAllocator, 0, OrtMemTypeDefault);\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  Ort::MemoryInfo cpu_mem_info_;\n  std::unique_ptr<Ort::MemoryInfo> cuda_mem_info_;\n  bool use_cuda_iobinding_ = false;\n  bool is_cpu_provider_ = false;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  OfflineFireRedAsrModelMetaData meta_data_;\n};\n\nOfflineFireRedAsrModel::OfflineFireRedAsrModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineFireRedAsrModel::OfflineFireRedAsrModel(Manager *mgr,\n                                               const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineFireRedAsrModel::~OfflineFireRedAsrModel() = default;\n\nstd::pair<Ort::Value, Ort::Value> OfflineFireRedAsrModel::ForwardEncoder(\n    Ort::Value features, Ort::Value features_length) const {\n  return impl_->ForwardEncoder(std::move(features), std::move(features_length));\n}\n\nstd::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n           Ort::Value>\nOfflineFireRedAsrModel::ForwardDecoder(Ort::Value tokens,\n                                       Ort::Value 
n_layer_self_k_cache,\n                                       Ort::Value n_layer_self_v_cache,\n                                       Ort::Value n_layer_cross_k,\n                                       Ort::Value n_layer_cross_v,\n                                       Ort::Value offset) const {\n  return impl_->ForwardDecoder(\n      std::move(tokens), std::move(n_layer_self_k_cache),\n      std::move(n_layer_self_v_cache), std::move(n_layer_cross_k),\n      std::move(n_layer_cross_v), std::move(offset));\n}\n\nstd::pair<Ort::Value, Ort::Value>\nOfflineFireRedAsrModel::GetInitialSelfKVCache() const {\n  return impl_->GetInitialSelfKVCache();\n}\n\nOrtAllocator *OfflineFireRedAsrModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nconst OfflineFireRedAsrModelMetaData &OfflineFireRedAsrModel::GetModelMetadata()\n    const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineFireRedAsrModel::OfflineFireRedAsrModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineFireRedAsrModel::OfflineFireRedAsrModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-fire-red-asr-model.h",
    "content": "// sherpa-onnx/csrc/offline-fire-red-asr-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineFireRedAsrModel {\n public:\n  explicit OfflineFireRedAsrModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineFireRedAsrModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineFireRedAsrModel();\n\n  /** Run the encoder model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_len  A tensor of shape (N,) with dtype int64.\n   *\n   * @return Return a pair containing:\n   *  - n_layer_cross_k: A 4-D tensor of shape\n   *                     (num_decoder_layers, N, T, d_model)\n   *  - n_layer_cross_v: A 4-D tensor of shape\n   *                     (num_decoder_layers, N, T, d_model)\n   */\n  std::pair<Ort::Value, Ort::Value> ForwardEncoder(\n      Ort::Value features, Ort::Value features_length) const;\n\n  /** Run the decoder model.\n   *\n   * @param tokens A int64 tensor of shape (N, num_words)\n   * @param n_layer_self_k_cache  A 5-D tensor of shape\n   *                       (num_decoder_layers, N, max_len, num_head, head_dim).\n   * @param n_layer_self_v_cache  A 5-D tensor of shape\n   *                       (num_decoder_layers, N, max_len, num_head, head_dim).\n   * @param n_layer_cross_k       A 5-D tensor of shape\n   *                              (num_decoder_layers, N, T, d_model).\n   * @param n_layer_cross_v       A 5-D tensor of shape\n   *                              (num_decoder_layers, N, T, d_model).\n   * @param 
offset An int64 tensor of shape (N,)\n   *\n   * @return Return a tuple containing 6 tensors:\n   *\n   *  - logits A 3-D tensor of shape (N, num_words, vocab_size)\n   *  - out_n_layer_self_k_cache Same shape as n_layer_self_k_cache\n   *  - out_n_layer_self_v_cache Same shape as n_layer_self_v_cache\n   *  - out_n_layer_cross_k Same as n_layer_cross_k\n   *  - out_n_layer_cross_v Same as n_layer_cross_v\n   *  - out_offset Same as offset\n   */\n  std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n             Ort::Value>\n  ForwardDecoder(Ort::Value tokens, Ort::Value n_layer_self_k_cache,\n                 Ort::Value n_layer_self_v_cache, Ort::Value n_layer_cross_k,\n                 Ort::Value n_layer_cross_v, Ort::Value offset) const;\n\n  /** Return the initial self kv cache in a pair\n   *  - n_layer_self_k_cache A 5-D tensor of shape\n   *                       (num_decoder_layers, N, max_len, num_head, head_dim).\n   *  - n_layer_self_v_cache A 5-D tensor of shape\n   *                       (num_decoder_layers, N, max_len, num_head, head_dim).\n   */\n  std::pair<Ort::Value, Ort::Value> GetInitialSelfKVCache() const;\n\n  const OfflineFireRedAsrModelMetaData &GetModelMetadata() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-funasr-nano-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-funasr-nano-model-config.cc\n//\n// Copyright (c)  2025  zengyw\n\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineFunASRNanoModelConfig::Register(ParseOptions *po) {\n  po->Register(\"funasr-nano-encoder-adaptor\", &encoder_adaptor,\n               \"Path to encoder_adaptor.onnx for FunASR-nano\");\n\n  po->Register(\"funasr-nano-llm\", &llm,\n               \"Path to llm.onnx for FunASR-nano (KV cache mode)\");\n\n  po->Register(\"funasr-nano-embedding\", &embedding,\n               \"Path to embedding.onnx for FunASR-nano\");\n\n  po->Register(\n      \"funasr-nano-tokenizer\", &tokenizer,\n      \"Path to tokenizer directory (e.g., Qwen3-0.6B) for FunASR-nano\");\n\n  po->Register(\"funasr-nano-system-prompt\", &system_prompt,\n               \"System prompt for FunASR-nano\");\n\n  po->Register(\"funasr-nano-user-prompt\", &user_prompt,\n               \"User prompt template for FunASR-nano\");\n\n  po->Register(\"funasr-nano-max-new-tokens\", &max_new_tokens,\n               \"Maximum number of new tokens to generate for FunASR-nano\");\n\n  po->Register(\"funasr-nano-temperature\", &temperature,\n               \"Sampling temperature for FunASR-nano\");\n\n  po->Register(\"funasr-nano-top-p\", &top_p,\n               \"Top-p (nucleus) sampling threshold for FunASR-nano\");\n\n  po->Register(\"funasr-nano-seed\", &seed, \"Random seed for FunASR-nano\");\n\n  po->Register(\"funasr-nano-language\", &language,\n               \"Language for transcription (empty string means None)\");\n\n  po->Register(\"funasr-nano-itn\", &itn,\n               \"Whether to apply inverse text normalization (default: true)\");\n\n  po->Register(\"funasr-nano-hotwords\", &hotwords,\n               \"Hotwords (comma-separated, e.g., 
\\\"Sherpa,FunASR\\\")\");\n}\n\nbool OfflineFunASRNanoModelConfig::Validate() const {\n  if (encoder_adaptor.empty()) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-encoder-adaptor is required\");\n    return false;\n  }\n\n  if (!FileExists(encoder_adaptor)) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-encoder-adaptor: '%s' does not exist\",\n                     encoder_adaptor.c_str());\n    return false;\n  }\n\n  if (llm.empty()) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-llm is required\");\n    return false;\n  }\n\n  if (!FileExists(llm)) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-llm: '%s' does not exist\", llm.c_str());\n    return false;\n  }\n\n  if (tokenizer.empty()) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-tokenizer is required\");\n    return false;\n  }\n\n  if (!FileExists(tokenizer + \"/vocab.json\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/vocab.json' does not exist. Please check --funasr-nano-tokenizer\",\n        tokenizer.c_str());\n    return false;\n  }\n\n  if (!FileExists(tokenizer + \"/merges.txt\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/merges.txt' does not exist. Please check --funasr-nano-tokenizer\",\n        tokenizer.c_str());\n    return false;\n  }\n\n  if (!FileExists(tokenizer + \"/tokenizer.json\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/tokenizer.json' does not exist. Please check \"\n        \"--funasr-nano-tokenizer\",\n        tokenizer.c_str());\n    return false;\n  }\n\n  if (embedding.empty()) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-embedding is required\");\n    return false;\n  }\n\n  if (!FileExists(embedding)) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-embedding: '%s' does not exist\",\n                     embedding.c_str());\n    return false;\n  }\n\n  if (max_new_tokens <= 0) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-max-new-tokens should be > 0. Given: %d\",\n                     max_new_tokens);\n    return false;\n  }\n\n  if (temperature < 0.0f) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-temperature should be >= 0.0. 
Given: %f\",\n                     temperature);\n    return false;\n  }\n\n  if (top_p < 0.0f || top_p > 1.0f) {\n    SHERPA_ONNX_LOGE(\"--funasr-nano-top-p should be in [0.0, 1.0]. Given: %f\",\n                     top_p);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineFunASRNanoModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineFunASRNanoModelConfig(\";\n  os << \"encoder_adaptor=\\\"\" << encoder_adaptor << \"\\\", \";\n  os << \"llm=\\\"\" << llm << \"\\\", \";\n  os << \"embedding=\\\"\" << embedding << \"\\\", \";\n  os << \"tokenizer=\\\"\" << tokenizer << \"\\\", \";\n  os << \"system_prompt=\\\"\" << system_prompt << \"\\\", \";\n  os << \"user_prompt=\\\"\" << user_prompt << \"\\\", \";\n  os << \"max_new_tokens=\" << max_new_tokens << \", \";\n  os << \"temperature=\" << temperature << \", \";\n  os << \"top_p=\" << top_p << \", \";\n  os << \"seed=\" << seed << \", \";\n  os << \"language=\\\"\" << language << \"\\\", \";\n  os << \"itn=\" << (itn ? \"True\" : \"False\") << \", \";\n  os << \"hotwords=\\\"\" << hotwords << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-funasr-nano-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-funasr-nano-model-config.h\n//\n// Copyright (c)  2025  zengyw\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineFunASRNanoModelConfig {\n  // Path to encoder_adaptor.onnx\n  std::string encoder_adaptor;\n\n  // Path to llm.onnx (KV cache model)\n  std::string llm;\n\n  // Path to embedding.onnx\n  std::string embedding;\n\n  // Path to tokenizer directory (e.g., Qwen3-0.6B)\n  std::string tokenizer;\n\n  // System prompt\n  std::string system_prompt = \"You are a helpful assistant.\";\n\n  // User prompt template (will be filled with audio tokens)\n  std::string user_prompt = \"语音转写：\";\n\n  // Maximum number of new tokens to generate\n  int32_t max_new_tokens = 512;\n\n  // Sampling temperature\n  float temperature = 1e-6f;\n\n  // Top-p (nucleus) sampling threshold\n  float top_p = 0.8f;\n\n  // Random seed for reproducibility\n  int32_t seed = 42;\n\n  // Language for transcription (empty string means None)\n  std::string language;\n\n  // Whether to apply inverse text normalization (ITN)\n  bool itn = true;\n\n  // Hotwords\n  std::string hotwords;\n\n  OfflineFunASRNanoModelConfig() = default;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-funasr-nano-model.cc",
    "content": "// sherpa-onnx/csrc/offline-funasr-nano-model.cc\n//\n// Copyright (c)  2025  zengyw\n\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model.h\"\n\n#include <algorithm>\n#include <cctype>\n#include <cmath>\n#include <cstdint>\n#include <cstring>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\n// Calculate the total number of elements from a tensor shape.\nstatic inline size_t NumelFromShape(const std::vector<int64_t> &shape) {\n  if (shape.empty()) return 0;\n  size_t n = 1;\n  for (auto d : shape) {\n    if (d <= 0) return 0;\n    n *= static_cast<size_t>(d);\n  }\n  return n;\n}\n\n#if ORT_API_VERSION >= 14\nstatic inline void AssertTensorIsCpu(const Ort::Value &v, const char *what) {\n  if (!v.IsTensor()) return;\n  auto mi = v.GetTensorMemoryInfo();\n  if (mi.GetDeviceType() != OrtMemoryInfoDeviceType_CPU) {\n    SHERPA_ONNX_LOGE(\n        \"%s: expected CPU tensor but got device_type=%d device_id=%d\", what,\n        (int)mi.GetDeviceType(), mi.GetDeviceId());\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n#else\nstatic inline void AssertTensorIsCpu(const Ort::Value &v, const char *what) {\n  if (!v.IsTensor()) return;\n\n  const OrtValue *v_ptr = reinterpret_cast<const OrtValue *>(&v);\n  const OrtMemoryInfo *memory_info = nullptr;\n\n  // 1. 
Get memory info\n  OrtStatus *status = Ort::GetApi().GetTensorMemoryInfo(v_ptr, &memory_info);\n  if (status) {\n    const char *msg = Ort::GetApi().GetErrorMessage(status);\n    Ort::GetApi().ReleaseStatus(status);\n    SHERPA_ONNX_LOGE(\"%s: failed to get tensor memory info: %s\", what, msg);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  // 2. Get memory type (OrtMemType)\n  OrtMemType mem_type;\n  status = Ort::GetApi().MemoryInfoGetMemType(memory_info, &mem_type);\n  if (status) {\n    const char *msg = Ort::GetApi().GetErrorMessage(status);\n    Ort::GetApi().ReleaseStatus(status);\n    SHERPA_ONNX_LOGE(\"%s: failed to get mem type: %s\", what, msg);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  // 3. Check CPU\n  if (mem_type != OrtMemTypeCPU) {\n    int device_id = 0;\n    status = Ort::GetApi().MemoryInfoGetId(memory_info, &device_id);\n    if (status) {\n      const char *msg = Ort::GetApi().GetErrorMessage(status);\n      Ort::GetApi().ReleaseStatus(status);\n      SHERPA_ONNX_LOGE(\"%s: failed to get device id: %s\", what, msg);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_LOGE(\"%s: expected CPU tensor but got mem_type=%d device_id=%d\",\n                     what, static_cast<int>(mem_type), device_id);\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n#endif\n\nstatic inline std::string ToLower(std::string s) {\n  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) -> char {\n    return static_cast<char>(std::tolower(c));\n  });\n  return s;\n}\n\nstatic inline bool IsCudaProvider(const std::string &provider) {\n  auto p = ToLower(provider);\n  // Keep it conservative. 
We only enable IO binding policy below when we\n  // are on CUDA; other EPs keep the existing behavior.\n  return p == \"cuda\" || (p.size() > 4 && p.find(\"cuda\") == 0);\n}\n\n// Get the element type of a session input tensor.\nstatic inline ONNXTensorElementDataType GetSessionInputElemType(\n    Ort::Session *sess, size_t input_index) {\n  auto ti = sess->GetInputTypeInfo(input_index);\n  auto t = ti.GetTensorTypeAndShapeInfo();\n  return static_cast<ONNXTensorElementDataType>(t.GetElementType());\n}\n\ntemplate <typename T>\nstatic Ort::Value AllocTensor(OrtAllocator *alloc,\n                              const std::vector<int64_t> &shape) {\n  return Ort::Value::CreateTensor<T>(alloc, shape.data(), shape.size());\n}\n\ntemplate <>\nOrt::Value AllocTensor<uint16_t>(OrtAllocator *alloc,\n                                 const std::vector<int64_t> &shape) {\n  return Ort::Value::CreateTensor(alloc, shape.data(), shape.size(),\n                                  ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n}\n\n// Allocate tensor by ONNX elem type (float/float16 only).\nstatic inline Ort::Value AllocTensorByElemType(\n    OrtAllocator *alloc, const std::vector<int64_t> &shape,\n    ONNXTensorElementDataType t) {\n  if (t == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n    return AllocTensor<float>(alloc, shape);\n  }\n  if (t == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 ||\n      t == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16) {\n    return AllocTensor<uint16_t>(alloc, shape);\n  }\n  SHERPA_ONNX_LOGE(\"AllocTensorByElemType: unsupported elem_type=%d\", (int)t);\n  SHERPA_ONNX_EXIT(-1);\n  return AllocTensor<float>(alloc, shape);\n}\n\n// Convert tensor to float32, handling both float16 and float32 inputs.\n// NOTE: This helper assumes the input tensor is on CPU memory.\n// The caller must ensure the tensor is on CPU (e.g., via IO Binding).\nstatic Ort::Value CastToFloat32(Ort::Value in, OrtAllocator *alloc) {\n  if (!in.IsTensor()) return in;\n  auto info = 
in.GetTensorTypeAndShapeInfo();\n  auto shape = info.GetShape();\n  size_t n = NumelFromShape(shape);\n  if (n == 0) return in;\n  auto et = info.GetElementType();\n\n  AssertTensorIsCpu(in, \"CastToFloat32\");\n\n  Ort::Value out = AllocTensor<float>(alloc, shape);\n  float *dst = out.GetTensorMutableData<float>();\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n    const float *src = in.GetTensorData<float>();\n    std::memcpy(dst, src, n * sizeof(float));\n    return out;\n  }\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 ||\n      et == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16) {\n    const uint16_t *src = in.GetTensorData<uint16_t>();\n    for (size_t i = 0; i < n; ++i) dst[i] = HalfBitsToFloat(src[i]);\n    return out;\n  }\n  SHERPA_ONNX_LOGE(\"CastToFloat32: unsupported input elem_type=%d\", (int)et);\n  return in;\n}\n\n// Convert tensor to float16, handling both float16 and float32 inputs.\n// NOTE: This helper assumes the input tensor is on CPU memory.\nstatic Ort::Value CastToFloat16(Ort::Value in, OrtAllocator *alloc) {\n  if (!in.IsTensor()) return in;\n  auto info = in.GetTensorTypeAndShapeInfo();\n  auto shape = info.GetShape();\n  size_t n = NumelFromShape(shape);\n  if (n == 0) return in;\n  auto et = static_cast<ONNXTensorElementDataType>(info.GetElementType());\n\n  AssertTensorIsCpu(in, \"CastToFloat16\");\n\n  Ort::Value out = AllocTensor<uint16_t>(alloc, shape);\n  uint16_t *dst = out.GetTensorMutableData<uint16_t>();\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 ||\n      et == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16) {\n    const uint16_t *src = in.GetTensorData<uint16_t>();\n    std::memcpy(dst, src, n * sizeof(uint16_t));\n    return out;\n  }\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n    const float *src = in.GetTensorData<float>();\n    for (size_t i = 0; i < n; ++i) dst[i] = FloatToHalfBits(src[i]);\n    return out;\n  }\n  SHERPA_ONNX_LOGE(\"CastToFloat16: unsupported input elem_type=%d\", (int)et);\n  return 
in;\n}\n\n// Cast tensor to the expected element type (float16 or float32).\n// Returns the input unchanged if it already matches the expected type.\nstatic Ort::Value CastFloatLikeForExpected(Ort::Value in,\n                                           ONNXTensorElementDataType expected,\n                                           OrtAllocator *alloc) {\n  if (!in.IsTensor()) return in;\n  auto info = in.GetTensorTypeAndShapeInfo();\n  auto actual = static_cast<ONNXTensorElementDataType>(info.GetElementType());\n  if (actual == expected) return in;\n  if (expected == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16) {\n    return CastToFloat16(std::move(in), alloc);\n  }\n  if (expected == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n    return CastToFloat32(std::move(in), alloc);\n  }\n  SHERPA_ONNX_LOGE(\n      \"CastFloatLikeForExpected: unsupported expected elem_type=%d\",\n      (int)expected);\n  return in;\n}\n\nstatic inline bool NeedsTypeConversion(Ort::Value &in,\n                                       ONNXTensorElementDataType expected) {\n  if (!in.IsTensor()) return false;\n  auto info = in.GetTensorTypeAndShapeInfo();\n  auto actual = static_cast<ONNXTensorElementDataType>(info.GetElementType());\n  return actual != expected;\n}\n\n// Cast attention mask tensor to int64 if needed.\n// Supports int32 to int64 conversion.\n// NOTE: This helper assumes the input tensor is on CPU memory.\nstatic Ort::Value CastMaskToInt64IfNeeded(Ort::Value in, OrtAllocator *alloc) {\n  if (!in.IsTensor()) return in;\n  auto info = in.GetTensorTypeAndShapeInfo();\n  auto shape = info.GetShape();\n  size_t n = NumelFromShape(shape);\n  if (n == 0) return in;\n  auto et = static_cast<ONNXTensorElementDataType>(info.GetElementType());\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64) return in;\n\n  AssertTensorIsCpu(in, \"CastMaskToInt64IfNeeded\");\n\n  if (et == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32) {\n    const int32_t *src = in.GetTensorData<int32_t>();\n    Ort::Value out = 
AllocTensor<int64_t>(alloc, shape);\n    int64_t *dst = out.GetTensorMutableData<int64_t>();\n    for (size_t i = 0; i < n; ++i) dst[i] = static_cast<int64_t>(src[i]);\n    return out;\n  }\n\n  SHERPA_ONNX_LOGE(\"attention_mask elem_type=%d not supported, expected int64\",\n                   (int)et);\n  return in;\n}\n\n// Ensure attention_mask is [batch, target_len] on CPU, int64.\n// If shorter: pad with 0. If longer: truncate.\nstatic Ort::Value NormalizeAttentionMask(Ort::Value mask, int64_t target_len,\n                                         OrtAllocator *alloc) {\n  if (!mask.IsTensor()) return mask;\n  AssertTensorIsCpu(mask, \"NormalizeAttentionMask\");\n\n  auto info = mask.GetTensorTypeAndShapeInfo();\n  auto shape = info.GetShape();\n  if (shape.size() != 2) return mask;\n\n  int64_t b = shape[0];\n  int64_t l = shape[1];\n  if (b <= 0 || l <= 0) return mask;\n\n  if (static_cast<ONNXTensorElementDataType>(info.GetElementType()) !=\n      ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64) {\n    mask = CastMaskToInt64IfNeeded(std::move(mask), alloc);\n    info = mask.GetTensorTypeAndShapeInfo();\n    shape = info.GetShape();\n    if (shape.size() != 2) return mask;\n    b = shape[0];\n    l = shape[1];\n  }\n\n  if (l == target_len) return mask;\n\n  std::vector<int64_t> new_shape = {b, target_len};\n  Ort::Value out = AllocTensor<int64_t>(alloc, new_shape);\n  int64_t *dst = out.GetTensorMutableData<int64_t>();\n  const int64_t *src = mask.GetTensorData<int64_t>();\n\n  std::memset(dst, 0,\n              static_cast<size_t>(b) * static_cast<size_t>(target_len) *\n                  sizeof(int64_t));\n\n  int64_t copy_len = std::min<int64_t>(l, target_len);\n  for (int64_t bi = 0; bi < b; ++bi) {\n    const int64_t *srow = src + bi * l;\n    int64_t *drow = dst + bi * target_len;\n    std::memcpy(drow, srow, static_cast<size_t>(copy_len) * sizeof(int64_t));\n  }\n\n  return out;\n}\n\n}  // namespace\n\n// Implementation class for OfflineFunASRNanoModel.\n// 
Manages ONNX sessions for encoder, KV cache LLM, and embedding models.\nclass OfflineFunASRNanoModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR, \"funasr-nano\"),\n        sess_opts_encoder_(GetSessionOptions(config)),\n        sess_opts_llm_(GetSessionOptions(config)),\n        sess_opts_embedding_(GetSessionOptions(config)),\n        allocator_(),\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    const auto &c = config_.funasr_nano;\n\n    if (c.encoder_adaptor.empty()) {\n      SHERPA_ONNX_LOGE(\"funasr_nano.encoder_adaptor is empty\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (c.llm.empty()) {\n      SHERPA_ONNX_LOGE(\"funasr_nano.llm is required for KV cache mode\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitEncoderAdaptor(c.encoder_adaptor);\n    InitLLM(c.llm);\n    InitEmbedding(c.embedding);\n    has_embedding_model_ = true;\n\n    // FunASR-nano uses CPU-side sampling. When running on CUDA, we bind\n    // logits to CPU (so sampling can read it safely).\n    use_cuda_iobinding_ =\n        (!is_cpu_provider_ && IsCudaProvider(config_.provider));\n    if (use_cuda_iobinding_) {\n      // Use device 0 by default. 
SessionOptions() in sherpa-onnx usually\n      // configures the CUDA EP device; binding here only affects output memory.\n      cuda_mem_info_ = std::make_unique<Ort::MemoryInfo>(\n          \"Cuda\", OrtDeviceAllocator, 0, OrtMemTypeDefault);\n    }\n    CheckFp16OnCuda();\n  }\n\n  void InitEncoderAdaptorFromMemory(void *model_data,\n                                    size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_encoder_);\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n    encoder_in_type_ = GetSessionInputElemType(encoder_sess_.get(), 0);\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(lfr_window_size_, \"lfr_window_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_shift_, \"lfr_window_shift\");\n    SHERPA_ONNX_READ_META_DATA(hidden_size_, \"llm_dim\");\n  }\n\n  void SetupLlmFromSession() {\n    GetInputNames(llm_sess_.get(), &llm_input_names_, &llm_input_names_ptr_);\n    GetOutputNames(llm_sess_.get(), &llm_output_names_, &llm_output_names_ptr_);\n\n    llm_embeds_in_type_ = GetSessionInputElemType(llm_sess_.get(), 0);\n    if (llm_embeds_in_type_ != ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n      SHERPA_ONNX_LOGE(\"LLM inputs_embeds must be float32, got elem_type=%d\",\n                       (int)llm_embeds_in_type_);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    Ort::ModelMetadata meta_data = 
llm_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"LLM model metadata:\\n%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"LLM model metadata:\\n%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    if (hidden_size_ == 0) {\n      SHERPA_ONNX_READ_META_DATA(hidden_size_, \"hidden_size\");\n    }\n\n    // Detect KV delta model type (model_type metadata should contain\n    // \"kv_delta\")\n    auto model_type_value =\n        LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n    is_kv_delta_model_ =\n        (!model_type_value.empty() &&\n         model_type_value.find(\"kv_delta\") != std::string::npos);\n\n    int32_t num_outputs = static_cast<int32_t>(llm_output_names_.size());\n    if (num_outputs < 1 || (num_outputs - 1) % 2 != 0) {\n      SHERPA_ONNX_LOGE(\n          \"LLM model must have 1 logits output + 2*num_layers KV outputs, got \"\n          \"%d outputs\",\n          num_outputs);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    int32_t inferred_layers = (num_outputs - 1) / 2;\n\n    auto num_layers_value =\n        LookupCustomModelMetaData(meta_data, \"num_layers\", allocator);\n    if (!num_layers_value.empty()) {\n      num_layers_ = atoi(num_layers_value.c_str());\n      if (num_layers_ <= 0) {\n        SHERPA_ONNX_LOGE(\"Invalid num_layers=%d from metadata\", num_layers_);\n        SHERPA_ONNX_EXIT(-1);\n      }\n      if (num_layers_ != inferred_layers) {\n        SHERPA_ONNX_LOGE(\"LLM num_layers mismatch: metadata=%d, inferred=%d\",\n                         num_layers_, inferred_layers);\n        SHERPA_ONNX_EXIT(-1);\n      }\n    } else {\n      num_layers_ = inferred_layers;\n    }\n\n    // Read KV cache capacity from metadata.\n    auto max_total_len_value =\n        
LookupCustomModelMetaData(meta_data, \"max_total_len\", allocator);\n    if (!max_total_len_value.empty()) {\n      max_total_len_ = atoi(max_total_len_value.c_str());\n    } else {\n      auto attn_len_value =\n          LookupCustomModelMetaData(meta_data, \"attention_mask_len\", allocator);\n      if (!attn_len_value.empty())\n        max_total_len_ = atoi(attn_len_value.c_str());\n    }\n    if (max_total_len_ <= 0) {\n      // Fallback: use input[1] shape\n      auto ti = llm_sess_->GetInputTypeInfo(1);\n      auto shp = ti.GetTensorTypeAndShapeInfo().GetShape();\n      if (shp.size() == 2 && shp[1] > 0) {\n        max_total_len_ = static_cast<int32_t>(shp[1]);\n      }\n      if (max_total_len_ <= 0) {\n        SHERPA_ONNX_LOGE(\n            \"Failed to determine max_total_len from metadata or input shape\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    // Only KV delta models are supported\n    if (!is_kv_delta_model_) {\n      SHERPA_ONNX_LOGE(\n          \"Only KV delta models are supported, but model_type does not contain \"\n          \"'kv_delta'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // Validate input layout: 0 embeds, 1 attention_mask, 2 cache_position, 3+\n    // KV cache\n    if (llm_input_names_.size() < 3u) {\n      SHERPA_ONNX_LOGE(\n          \"LLM model inputs must be >=3 (embeds,mask,cache_position)\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    cache_position_input_index_ = 2;\n    past_kv_input_start_index_ = 3;\n\n    int32_t expected_inputs = 3 + 2 * num_layers_;\n    int32_t actual_inputs = static_cast<int32_t>(llm_input_names_.size());\n    if (actual_inputs != expected_inputs) {\n      if (actual_inputs == 2 + 2 * num_layers_) {\n        SHERPA_ONNX_LOGE(\n            \"LLM model inputs mismatch: expected %d (=3+2*num_layers with \"\n            \"cache_position) \"\n            \"got %d (=2+2*num_layers without cache_position). 
\"\n            \"Please use a model exported with cache_position support.\",\n            expected_inputs, actual_inputs);\n      } else {\n        SHERPA_ONNX_LOGE(\n            \"LLM model inputs mismatch: expected %d (=3+2*num_layers) got %d\",\n            expected_inputs, actual_inputs);\n      }\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // KV input element type (should be float16 or float32).\n    kv_in_type_ =\n        GetSessionInputElemType(llm_sess_.get(), past_kv_input_start_index_);\n    kv_in_type_v_ = GetSessionInputElemType(llm_sess_.get(),\n                                            past_kv_input_start_index_ + 1);\n    if (!(kv_in_type_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT ||\n          kv_in_type_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 ||\n          kv_in_type_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16)) {\n      SHERPA_ONNX_LOGE(\"LLM past_key elem_type=%d not supported\",\n                       (int)kv_in_type_);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    if (!(kv_in_type_v_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT ||\n          kv_in_type_v_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16 ||\n          kv_in_type_v_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16)) {\n      SHERPA_ONNX_LOGE(\"LLM past_value elem_type=%d not supported\",\n                       (int)kv_in_type_v_);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // Templates for KV shapes from session inputs.\n    auto past_key_ti = llm_sess_->GetInputTypeInfo(past_kv_input_start_index_);\n    past_key_shape_tpl_ = past_key_ti.GetTensorTypeAndShapeInfo().GetShape();\n\n    auto past_value_ti =\n        llm_sess_->GetInputTypeInfo(past_kv_input_start_index_ + 1);\n    past_value_shape_tpl_ =\n        past_value_ti.GetTensorTypeAndShapeInfo().GetShape();\n\n    // Pre-allocate buffers for CPU IoBinding (decode step: [1, 1, vocab_size]\n    // and [1, 1, kv_h, hd])\n    int64_t kv_h = past_key_shape_tpl_[2];\n    int64_t hd = past_key_shape_tpl_[3];\n    std::vector<int64_t> logits_shape = {1, 1,\n  
                                       static_cast<int64_t>(vocab_size_)};\n    logits_buffer_ = AllocTensor<float>(allocator_, logits_shape);\n\n    kv_delta_buffers_.reserve(num_layers_);\n    std::vector<int64_t> kv_delta_shape = {1, 1, kv_h, hd};\n    for (int32_t i = 0; i < num_layers_; ++i) {\n      Ort::Value key_delta =\n          AllocTensorByElemType(allocator_, kv_delta_shape, kv_in_type_);\n      Ort::Value value_delta =\n          AllocTensorByElemType(allocator_, kv_delta_shape, kv_in_type_v_);\n      kv_delta_buffers_.emplace_back(std::move(key_delta),\n                                     std::move(value_delta));\n    }\n    has_decode_buffers_ = true;\n  }\n\n  void InitLLMFromMemory(void *model_data, size_t model_data_length) {\n    try {\n      llm_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_llm_);\n    } catch (const Ort::Exception &e) {\n      SHERPA_ONNX_LOGE(\"InitLLMFromMemory: failed to create session: %s\",\n                       e.what());\n      if (std::string(e.what()).find(\"external data\") != std::string::npos ||\n          std::string(e.what()).find(\"External data\") != std::string::npos) {\n        SHERPA_ONNX_LOGE(\n            \"LLM model requires external data (.data file) but loaded from \"\n            \"memory. 
\"\n            \"Please use fp16/int8 single-file model or load by file path \"\n            \"instead.\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n      throw;\n    }\n\n    SetupLlmFromSession();\n  }\n\n  void InitEmbeddingFromMemory(void *model_data, size_t model_data_length) {\n    embedding_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_embedding_);\n    GetInputNames(embedding_sess_.get(), &embedding_input_names_,\n                  &embedding_input_names_ptr_);\n    GetOutputNames(embedding_sess_.get(), &embedding_output_names_,\n                   &embedding_output_names_ptr_);\n    Ort::ModelMetadata meta_data = embedding_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    if (hidden_size_ == 0) {\n      SHERPA_ONNX_READ_META_DATA(hidden_size_, \"hidden_size\");\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR, \"funasr-nano\"),\n        sess_opts_encoder_(GetSessionOptions(config)),\n        sess_opts_llm_(GetSessionOptions(config)),\n        sess_opts_embedding_(GetSessionOptions(config)),\n        allocator_(),\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    const auto &c = config_.funasr_nano;\n\n    if (c.encoder_adaptor.empty()) {\n      SHERPA_ONNX_LOGE(\"funasr_nano.encoder_adaptor is empty\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (c.llm.empty()) {\n      SHERPA_ONNX_LOGE(\"funasr_nano.llm is required for KV cache mode\");\n      
SHERPA_ONNX_EXIT(-1);\n    }\n\n    auto buf_encoder = ReadFile(mgr, c.encoder_adaptor);\n    InitEncoderAdaptorFromMemory(buf_encoder.data(), buf_encoder.size());\n\n    auto buf_llm = ReadFile(mgr, c.llm);\n    InitLLMFromMemory(buf_llm.data(), buf_llm.size());\n\n    auto buf_embedding = ReadFile(mgr, c.embedding);\n    InitEmbeddingFromMemory(buf_embedding.data(), buf_embedding.size());\n    has_embedding_model_ = true;\n\n    use_cuda_iobinding_ =\n        (!is_cpu_provider_ && IsCudaProvider(config_.provider));\n    if (use_cuda_iobinding_) {\n      cuda_mem_info_ = std::make_unique<Ort::MemoryInfo>(\n          \"Cuda\", OrtDeviceAllocator, 0, OrtMemTypeDefault);\n    }\n    CheckFp16OnCuda();\n  }\n\n  // Forward pass through encoder adaptor model.\n  // Converts audio features to embeddings compatible with the LLM.\n  Ort::Value ForwardEncoderAdaptor(Ort::Value features) {\n    if (NeedsTypeConversion(features, encoder_in_type_)) {\n      features = CastFloatLikeForExpected(std::move(features), encoder_in_type_,\n                                          allocator_);\n    }\n\n    // Encoder output is consumed by CPU-side code (embedding packing), so we\n    // bind it to CPU when running on CUDA to avoid returning a CUDA pointer.\n    if (use_cuda_iobinding_) {\n      Ort::IoBinding binding(*encoder_sess_);\n      binding.BindInput(encoder_input_names_ptr_[0], features);\n      binding.BindOutput(encoder_output_names_ptr_[0], cpu_mem_info_);\n      binding.SynchronizeInputs();\n      encoder_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      auto outs = binding.GetOutputValues();\n\n      if (outs.empty()) {\n        SHERPA_ONNX_LOGE(\"ForwardEncoderAdaptor: empty outputs\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n      return std::move(outs[0]);\n    }\n\n    std::array<Ort::Value, 1> inputs = {std::move(features)};\n    auto outputs = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), 
inputs.data(), inputs.size(),\n        encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  std::vector<std::pair<Ort::Value, Ort::Value>> CreateEmptyKVCache(\n      int64_t batch) {\n    std::vector<std::pair<Ort::Value, Ort::Value>> kv_cache;\n    kv_cache.reserve(num_layers_);\n\n    // Read kv_h, hd from input shape template (dim2, dim3)\n    auto &tpl = past_key_shape_tpl_;\n    if (tpl.size() < 4) {\n      SHERPA_ONNX_LOGE(\"Invalid KV cache shape template, expected >=4 dims\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    int64_t kv_h = tpl[2];\n    int64_t hd = tpl[3];\n    std::vector<int64_t> key_shape = {\n        batch, static_cast<int64_t>(max_total_len_), kv_h, hd};\n    std::vector<int64_t> value_shape = key_shape;\n\n    size_t key_numel = NumelFromShape(key_shape);\n    size_t value_numel = NumelFromShape(value_shape);\n\n    for (int32_t i = 0; i < num_layers_; ++i) {\n      Ort::Value key_tensor =\n          AllocTensorByElemType(allocator_, key_shape, kv_in_type_);\n      Ort::Value value_tensor =\n          AllocTensorByElemType(allocator_, value_shape, kv_in_type_v_);\n\n      // Zero-initialize cache\n      if (key_numel > 0) {\n        if (kv_in_type_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n          std::memset(key_tensor.GetTensorMutableData<float>(), 0,\n                      key_numel * sizeof(float));\n        } else {\n          std::memset(key_tensor.GetTensorMutableData<uint16_t>(), 0,\n                      key_numel * sizeof(uint16_t));\n        }\n      }\n\n      if (value_numel > 0) {\n        if (kv_in_type_v_ == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n          std::memset(value_tensor.GetTensorMutableData<float>(), 0,\n                      value_numel * sizeof(float));\n        } else {\n          std::memset(value_tensor.GetTensorMutableData<uint16_t>(), 0,\n                      value_numel * sizeof(uint16_t));\n        }\n      }\n\n      
kv_cache.emplace_back(std::move(key_tensor), std::move(value_tensor));\n    }\n    return kv_cache;\n  }\n\n  std::pair<Ort::Value, std::vector<std::pair<Ort::Value, Ort::Value>>>\n  ForwardLLM(Ort::Value inputs_embeds, Ort::Value attention_mask,\n             const Ort::Value &cache_position,\n             const std::vector<std::pair<Ort::Value, Ort::Value>> &cache_kv) {\n    if (static_cast<int32_t>(cache_kv.size()) != num_layers_) {\n      SHERPA_ONNX_LOGE(\"ForwardLLM: cache_kv size (%zu) != num_layers (%d)\",\n                       cache_kv.size(), num_layers_);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (!inputs_embeds.IsTensor()) {\n      SHERPA_ONNX_LOGE(\"ForwardLLM: inputs_embeds is not a tensor\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    auto embeds_info = inputs_embeds.GetTensorTypeAndShapeInfo();\n    auto embeds_type =\n        static_cast<ONNXTensorElementDataType>(embeds_info.GetElementType());\n    if (embeds_type != ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n      SHERPA_ONNX_LOGE(\n          \"ForwardLLM: inputs_embeds must be float32, got elem_type=%d\",\n          (int)embeds_type);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // Prepare attention_mask: int64, truncate if length exceeds max_total_len\n    if (attention_mask.IsTensor()) {\n      auto mask_info = attention_mask.GetTensorTypeAndShapeInfo();\n      auto mask_type =\n          static_cast<ONNXTensorElementDataType>(mask_info.GetElementType());\n      if (mask_type != ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64) {\n        attention_mask =\n            CastMaskToInt64IfNeeded(std::move(attention_mask), allocator_);\n        mask_info = attention_mask.GetTensorTypeAndShapeInfo();\n      }\n\n      auto mask_shape = mask_info.GetShape();\n      if (mask_shape.size() == 2 && mask_shape[1] > max_total_len_) {\n        // Truncate attention_mask if it exceeds max_total_len\n        attention_mask = NormalizeAttentionMask(std::move(attention_mask),\n                                               
 max_total_len_, allocator_);\n      }\n    }\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(3 + 2 * cache_kv.size());\n    inputs.push_back(std::move(inputs_embeds));\n    inputs.push_back(std::move(attention_mask));\n    inputs.push_back(View(const_cast<Ort::Value *>(&cache_position)));\n\n    for (const auto &kv : cache_kv) {\n      inputs.push_back(View(const_cast<Ort::Value *>(&kv.first)));\n      inputs.push_back(View(const_cast<Ort::Value *>(&kv.second)));\n    }\n\n    std::vector<const char *> input_names_ptr;\n    input_names_ptr.reserve(3 + 2 * cache_kv.size());\n    input_names_ptr.push_back(llm_input_names_ptr_[0]);  // inputs_embeds\n    input_names_ptr.push_back(llm_input_names_ptr_[1]);  // attention_mask\n    input_names_ptr.push_back(llm_input_names_ptr_[2]);  // cache_position\n    for (size_t i = 0; i < cache_kv.size(); ++i) {\n      input_names_ptr.push_back(\n          llm_input_names_ptr_[past_kv_input_start_index_ + 2 * i]);\n      input_names_ptr.push_back(\n          llm_input_names_ptr_[past_kv_input_start_index_ + 2 * i + 1]);\n    }\n\n    // Check if this is a decode step (seq_len == 1) for CPU buffer reuse\n    auto embeds_shape = embeds_info.GetShape();\n    bool is_decode_step = (embeds_shape.size() == 3 && embeds_shape[1] == 1);\n    bool use_cpu_decode_buffers =\n        (is_decode_step && has_decode_buffers_ && !use_cuda_iobinding_);\n\n    std::vector<Ort::Value> outputs;\n\n    if (use_cuda_iobinding_) {\n      Ort::IoBinding binding(*llm_sess_);\n      for (size_t i = 0; i < inputs.size(); ++i) {\n        binding.BindInput(input_names_ptr[i], inputs[i]);\n      }\n\n      // logits must be CPU (we will read it on CPU).\n      binding.BindOutput(llm_output_names_ptr_[0], cpu_mem_info_);\n\n      // KV outputs: bind to CPU so ApplyKvDeltaInplace can work with CPU cache\n      for (size_t i = 1; i < llm_output_names_ptr_.size(); ++i) {\n        binding.BindOutput(llm_output_names_ptr_[i], cpu_mem_info_);\n      }\n\n  
    binding.SynchronizeInputs();\n      llm_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      outputs = binding.GetOutputValues();\n    } else if (use_cpu_decode_buffers) {\n      // CPU path: use IoBinding with pre-allocated buffers for decode step\n      Ort::IoBinding binding(*llm_sess_);\n      for (size_t i = 0; i < inputs.size(); ++i) {\n        binding.BindInput(input_names_ptr[i], inputs[i]);\n      }\n\n      // Bind outputs to pre-allocated buffers\n      binding.BindOutput(llm_output_names_ptr_[0], logits_buffer_);\n      for (size_t i = 0; i < kv_delta_buffers_.size(); ++i) {\n        binding.BindOutput(llm_output_names_ptr_[1 + 2 * i],\n                           kv_delta_buffers_[i].first);\n        binding.BindOutput(llm_output_names_ptr_[1 + 2 * i + 1],\n                           kv_delta_buffers_[i].second);\n      }\n\n      binding.SynchronizeInputs();\n      llm_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      outputs = binding.GetOutputValues();\n    } else {\n      // Prefill step or buffers not initialized: use regular Run\n      outputs = llm_sess_->Run({}, input_names_ptr.data(), inputs.data(),\n                               inputs.size(), llm_output_names_ptr_.data(),\n                               llm_output_names_ptr_.size());\n    }\n\n    Ort::Value logits{nullptr};\n    if (use_cpu_decode_buffers) {\n      // For decode step with pre-allocated buffer, create a view to return\n      // (outputs will be destroyed but buffer persists)\n      logits = View(&logits_buffer_);\n    } else {\n      if (outputs.empty()) {\n        SHERPA_ONNX_LOGE(\"ForwardLLM: empty outputs\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n      logits = std::move(outputs[0]);\n    }\n\n    if (!logits.IsTensor()) {\n      SHERPA_ONNX_LOGE(\"ForwardLLM: logits is not a tensor\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    AssertTensorIsCpu(logits, \"ForwardLLM logits\");\n\n    
auto logits_info = logits.GetTensorTypeAndShapeInfo();\n    auto logits_type =\n        static_cast<ONNXTensorElementDataType>(logits_info.GetElementType());\n    if (logits_type != ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n      SHERPA_ONNX_LOGE(\"ForwardLLM: logits must be float32, got elem_type=%d\",\n                       (int)logits_type);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    int32_t inferred_layers;\n    if (use_cpu_decode_buffers) {\n      // For decode step with pre-allocated buffers, we know the number of\n      // layers\n      inferred_layers = num_layers_;\n    } else {\n      if ((outputs.size() - 1) % 2 != 0) {\n        SHERPA_ONNX_LOGE(\"ForwardLLM: invalid KV cache outputs size=%d\",\n                         static_cast<int>(outputs.size()));\n        SHERPA_ONNX_EXIT(-1);\n      }\n      inferred_layers = static_cast<int32_t>((outputs.size() - 1) / 2);\n      if (inferred_layers != num_layers_) {\n        SHERPA_ONNX_LOGE(\n            \"ForwardLLM: KV outputs layers mismatch: expected=%d, got=%d\",\n            num_layers_, inferred_layers);\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    std::vector<std::pair<Ort::Value, Ort::Value>> kv_outputs;\n    kv_outputs.reserve(num_layers_);\n    if (use_cpu_decode_buffers) {\n      // For decode step with pre-allocated buffers, create views\n      for (int32_t i = 0; i < num_layers_; ++i) {\n        kv_outputs.emplace_back(View(&kv_delta_buffers_[i].first),\n                                View(&kv_delta_buffers_[i].second));\n      }\n    } else {\n      for (int32_t i = 0; i < num_layers_; ++i) {\n        kv_outputs.emplace_back(std::move(outputs[1 + 2 * i]),\n                                std::move(outputs[1 + 2 * i + 1]));\n      }\n    }\n\n    return {std::move(logits), std::move(kv_outputs)};\n  }\n\n  // Apply KV delta in-place to the KV cache.\n  // Copy key_delta/value_delta into cache_key/value at positions [pos0:pos0+S)\n  void ApplyKvDeltaInplace(\n      
std::vector<std::pair<Ort::Value, Ort::Value>> *cache_kv,\n      const std::vector<std::pair<Ort::Value, Ort::Value>> &kv_delta,\n      const Ort::Value &cache_position) const {\n    if (!cache_kv || cache_kv->size() != static_cast<size_t>(num_layers_) ||\n        kv_delta.size() != static_cast<size_t>(num_layers_)) {\n      SHERPA_ONNX_LOGE(\n          \"ApplyKvDeltaInplace: invalid kv sizes: cache=%zu delta=%zu\",\n          cache_kv ? cache_kv->size() : 0, kv_delta.size());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // cache_position: [S], first element is pos0 (contiguous write)\n    auto pos_info = cache_position.GetTensorTypeAndShapeInfo();\n    auto pos_shape = pos_info.GetShape();\n    int64_t S = pos_shape.empty() ? 0 : pos_shape[0];\n    if (S <= 0) {\n      SHERPA_ONNX_LOGE(\"ApplyKvDeltaInplace: cache_position has invalid shape\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    const int64_t *pos_data = cache_position.GetTensorData<int64_t>();\n    int64_t pos0 = pos_data[0];\n\n    if (pos0 < 0) {\n      SHERPA_ONNX_LOGE(\"ApplyKvDeltaInplace: pos0 < 0 (%d)\",\n                       static_cast<int32_t>(pos0));\n      SHERPA_ONNX_EXIT(-1);\n    }\n    if (pos0 + S > max_total_len_) {\n      SHERPA_ONNX_LOGE(\n          \"ApplyKvDeltaInplace: pos0+S exceeds max_total_len_ (%d + %d > \"\n          \"%d), clamping S\",\n          static_cast<int32_t>(pos0), static_cast<int32_t>(S), max_total_len_);\n      S = max_total_len_ - pos0;\n      if (S <= 0) return;\n    }\n\n    for (int32_t layer = 0; layer < num_layers_; ++layer) {\n      Ort::Value &cache_key = (*cache_kv)[layer].first;\n      Ort::Value &cache_val = (*cache_kv)[layer].second;\n\n      const Ort::Value &delta_key = kv_delta[layer].first;\n      const Ort::Value &delta_val = kv_delta[layer].second;\n\n      auto ck_info = cache_key.GetTensorTypeAndShapeInfo();\n      auto dk_info = delta_key.GetTensorTypeAndShapeInfo();\n\n      auto ck_shape = ck_info.GetShape();  // [B, max_total_len, kv_h, 
hd]\n      auto dk_shape = dk_info.GetShape();  // [B, S, kv_h, hd]\n\n      int64_t B = ck_shape[0];\n      int64_t kv_h = ck_shape[2];\n      int64_t hd = ck_shape[3];\n\n      // bytes per element\n      auto elem_type = ck_info.GetElementType();\n      size_t elem_bytes = 0;\n      switch (elem_type) {\n        case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT:\n          elem_bytes = 4;\n          break;\n        case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16:\n        case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16:\n          elem_bytes = 2;\n          break;\n        default:\n          SHERPA_ONNX_LOGE(\"ApplyKvDeltaInplace: unsupported elem_type=%d\",\n                           elem_type);\n          SHERPA_ONNX_EXIT(-1);\n      }\n\n      size_t bytes_per_pos =\n          static_cast<size_t>(kv_h) * static_cast<size_t>(hd) * elem_bytes;\n\n      void *dst_k = cache_key.GetTensorMutableData<void>();\n      void *dst_v = cache_val.GetTensorMutableData<void>();\n      const void *src_k = delta_key.GetTensorData<void>();\n      const void *src_v = delta_val.GetTensorData<void>();\n\n      for (int64_t b = 0; b < B; ++b) {\n        size_t dst_off =\n            (static_cast<size_t>(b) * static_cast<size_t>(max_total_len_) +\n             static_cast<size_t>(pos0)) *\n            bytes_per_pos;\n        size_t src_off =\n            (static_cast<size_t>(b) * static_cast<size_t>(dk_shape[1])) *\n            bytes_per_pos;\n\n        size_t copy_bytes = static_cast<size_t>(S) * bytes_per_pos;\n\n        uint8_t *dst_k_ptr = static_cast<uint8_t *>(dst_k) + dst_off;\n        uint8_t *dst_v_ptr = static_cast<uint8_t *>(dst_v) + dst_off;\n        const uint8_t *src_k_ptr =\n            static_cast<const uint8_t *>(src_k) + src_off;\n        const uint8_t *src_v_ptr =\n            static_cast<const uint8_t *>(src_v) + src_off;\n\n        std::memcpy(dst_k_ptr, src_k_ptr, copy_bytes);\n        std::memcpy(dst_v_ptr, src_v_ptr, copy_bytes);\n      }\n    }\n  }\n\n  // Forward pass 
through embedding model.\n  // Converts token IDs to embeddings.\n  Ort::Value ForwardEmbedding(Ort::Value input_ids) {\n    // Embedding output is consumed by CPU-side packing code; bind it to CPU\n    // when running on CUDA to avoid returning a CUDA pointer.\n    if (use_cuda_iobinding_) {\n      Ort::IoBinding binding(*embedding_sess_);\n      binding.BindInput(embedding_input_names_ptr_[0], input_ids);\n      binding.BindOutput(embedding_output_names_ptr_[0], cpu_mem_info_);\n      binding.SynchronizeInputs();\n      embedding_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      auto outs = binding.GetOutputValues();\n\n      if (outs.empty()) {\n        SHERPA_ONNX_LOGE(\"ForwardEmbedding: empty outputs\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n      return std::move(outs[0]);\n    }\n\n    std::array<Ort::Value, 1> inputs = {std::move(input_ids)};\n    auto outputs = embedding_sess_->Run(\n        {}, embedding_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        embedding_output_names_ptr_.data(), embedding_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n  int32_t HiddenSize() const { return hidden_size_; }\n  int32_t GetMaxTotalLen() const { return max_total_len_; }\n  int32_t LfrWindowSize() const { return lfr_window_size_; }\n  int32_t LfrWindowShift() const { return lfr_window_shift_; }\n  OrtAllocator *Allocator() { return allocator_; }\n  bool HasEmbeddingModel() const { return has_embedding_model_; }\n  bool UseKVCache() const { return true; }\n  bool IsCpuProvider() const { return is_cpu_provider_; }\n\n private:\n  void CheckFp16OnCuda() {\n    if (use_cuda_iobinding_) {\n      Ort::ModelMetadata meta_data = llm_sess_->GetModelMetadata();\n      Ort::AllocatorWithDefaultOptions allocator;\n      auto quant_type =\n          LookupCustomModelMetaData(meta_data, \"quantization_type\", allocator);\n\n      if 
(!quant_type.empty() && quant_type == \"fp16\") {\n        SHERPA_ONNX_LOGE(\n            \"fp16 LLM models are not supported on CUDA yet. Please use \"\n            \"fp32/int8 models.\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n  }\n\n  void InitEncoderAdaptor(const std::string &model_path) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(model_path), sess_opts_encoder_);\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n    encoder_in_type_ = GetSessionInputElemType(encoder_sess_.get(), 0);\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(lfr_window_size_, \"lfr_window_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_shift_, \"lfr_window_shift\");\n    SHERPA_ONNX_READ_META_DATA(hidden_size_, \"llm_dim\");\n  }\n\n  void InitLLM(const std::string &model_path) {\n    // For fp32 models: check for .data file by replacing .onnx with .data\n    // int8 and fp16 models don't have .data files, so no need to check\n    std::string data_path = model_path;\n    if (data_path.size() >= 5 &&\n        data_path.substr(data_path.size() - 5) == \".onnx\") {\n      data_path = data_path.substr(0, data_path.size() - 5) + \".data\";\n    } else {\n      data_path = model_path + \".data\";\n    }\n    bool has_external_data = FileExists(data_path);\n\n    // Resolve absolute path for model file\n    std::string abs_model_path = ResolveAbsolutePath(model_path);\n\n    if 
(has_external_data) {\n      // When external data exists, use absolute file path to create session.\n      // ONNX Runtime will automatically find .data file in the same directory\n      // as the model file when using absolute path.\n      llm_sess_ = std::make_unique<Ort::Session>(\n          env_, SHERPA_ONNX_TO_ORT_PATH(abs_model_path), sess_opts_llm_);\n    } else {\n      // No external data: load entire model into memory\n      std::vector<char> model_data = ReadFile(model_path);\n      llm_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data.data(), model_data.size(), sess_opts_llm_);\n    }\n\n    SetupLlmFromSession();\n  }\n\n  void InitEmbedding(const std::string &model_path) {\n    embedding_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(model_path), sess_opts_embedding_);\n    GetInputNames(embedding_sess_.get(), &embedding_input_names_,\n                  &embedding_input_names_ptr_);\n    GetOutputNames(embedding_sess_.get(), &embedding_output_names_,\n                   &embedding_output_names_ptr_);\n    Ort::ModelMetadata meta_data = embedding_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    if (hidden_size_ == 0) {\n      SHERPA_ONNX_READ_META_DATA(hidden_size_, \"hidden_size\");\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_encoder_;\n  Ort::SessionOptions sess_opts_llm_;\n  Ort::SessionOptions sess_opts_embedding_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  Ort::MemoryInfo cpu_mem_info_;\n  std::unique_ptr<Ort::MemoryInfo> cuda_mem_info_;\n  bool use_cuda_iobinding_ = false;\n\n  std::unique_ptr<Ort::Session> 
encoder_sess_;\n  std::unique_ptr<Ort::Session> llm_sess_;\n  std::unique_ptr<Ort::Session> embedding_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> llm_input_names_;\n  std::vector<const char *> llm_input_names_ptr_;\n  std::vector<std::string> llm_output_names_;\n  std::vector<const char *> llm_output_names_ptr_;\n\n  std::vector<std::string> embedding_input_names_;\n  std::vector<const char *> embedding_input_names_ptr_;\n  std::vector<std::string> embedding_output_names_;\n  std::vector<const char *> embedding_output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t hidden_size_ = 0;\n  int32_t lfr_window_size_ = 0;\n  int32_t lfr_window_shift_ = 0;\n\n  int32_t num_layers_ = 0;\n  int32_t max_total_len_ = 0;  // attention_mask length / cache capacity\n  bool has_embedding_model_ = false;\n\n  ONNXTensorElementDataType encoder_in_type_ =\n      ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;\n  ONNXTensorElementDataType llm_embeds_in_type_ =\n      ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;\n\n  // KV input element types (for CreateEmptyKVCache).\n  ONNXTensorElementDataType kv_in_type_ = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT;\n  ONNXTensorElementDataType kv_in_type_v_ = ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT;\n\n  // Input indices for KV cache LLM.\n  size_t cache_position_input_index_ = 2;\n  size_t past_kv_input_start_index_ = 3;\n\n  std::vector<int64_t> past_key_shape_tpl_;\n  std::vector<int64_t> past_value_shape_tpl_;\n\n  bool is_cpu_provider_ = false;\n  bool is_kv_delta_model_ = false;\n\n  // Pre-allocated buffers for CPU IoBinding (decode step reuse)\n  bool has_decode_buffers_ = false;\n  Ort::Value logits_buffer_{nullptr};\n  std::vector<std::pair<Ort::Value, Ort::Value>> kv_delta_buffers_;\n};\n\nOfflineFunASRNanoModel::OfflineFunASRNanoModel(const 
OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineFunASRNanoModel::OfflineFunASRNanoModel(Manager *mgr,\n                                               const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineFunASRNanoModel::~OfflineFunASRNanoModel() = default;\n\nOrt::Value OfflineFunASRNanoModel::ForwardEncoderAdaptor(Ort::Value features) {\n  return impl_->ForwardEncoderAdaptor(std::move(features));\n}\n\nstd::pair<Ort::Value, std::vector<std::pair<Ort::Value, Ort::Value>>>\nOfflineFunASRNanoModel::ForwardLLM(\n    Ort::Value inputs_embeds, Ort::Value attention_mask,\n    const Ort::Value &cache_position,\n    const std::vector<std::pair<Ort::Value, Ort::Value>> &cache_kv) {\n  // cache_position is a const reference, so it is forwarded as-is;\n  // std::move on it would have no effect.\n  return impl_->ForwardLLM(std::move(inputs_embeds), std::move(attention_mask),\n                           cache_position, cache_kv);\n}\n\nstd::vector<std::pair<Ort::Value, Ort::Value>>\nOfflineFunASRNanoModel::CreateEmptyKVCache(int64_t batch) {\n  return impl_->CreateEmptyKVCache(batch);\n}\n\nvoid OfflineFunASRNanoModel::ApplyKvDeltaInplace(\n    std::vector<std::pair<Ort::Value, Ort::Value>> *cache_kv,\n    const std::vector<std::pair<Ort::Value, Ort::Value>> &kv_delta,\n    const Ort::Value &cache_position) {\n  impl_->ApplyKvDeltaInplace(cache_kv, kv_delta, cache_position);\n}\n\nbool OfflineFunASRNanoModel::UseKVCache() const { return impl_->UseKVCache(); }\n\nOrt::Value OfflineFunASRNanoModel::ForwardEmbedding(Ort::Value input_ids) {\n  return impl_->ForwardEmbedding(std::move(input_ids));\n}\n\nint32_t OfflineFunASRNanoModel::VocabSize() const { return impl_->VocabSize(); }\nint32_t OfflineFunASRNanoModel::HiddenSize() const {\n  return impl_->HiddenSize();\n}\nint32_t OfflineFunASRNanoModel::GetMaxTotalLen() const {\n  return impl_->GetMaxTotalLen();\n}\n\nint32_t OfflineFunASRNanoModel::LfrWindowSize() const {\n  return impl_->LfrWindowSize();\n}\nint32_t 
OfflineFunASRNanoModel::LfrWindowShift() const {\n  return impl_->LfrWindowShift();\n}\n\nOrtAllocator *OfflineFunASRNanoModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nbool OfflineFunASRNanoModel::HasEmbeddingModel() const {\n  return impl_->HasEmbeddingModel();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineFunASRNanoModel::OfflineFunASRNanoModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineFunASRNanoModel::OfflineFunASRNanoModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-funasr-nano-model.h",
    "content": "// sherpa-onnx/csrc/offline-funasr-nano-model.h\n//\n// Copyright (c)  2025  zengyw\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_H_\n\n#include <cstdint>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineFunASRNanoModel {\n public:\n  explicit OfflineFunASRNanoModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineFunASRNanoModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineFunASRNanoModel();\n\n  /** Run the encoder+adaptor model.\n   *\n   * @param features  A tensor of shape (N, T, C). Audio features.\n   * @return Return embeddings of shape (N, T', hidden_size)\n   */\n  Ort::Value ForwardEncoderAdaptor(Ort::Value features);\n\n  /** Run the LLM model (KV cache mode).\n   *\n   * @param inputs_embeds  A tensor of shape (N, T, hidden_size), float32.\n   * @param attention_mask  A tensor of shape (N, T) containing attention mask,\n   * int64.\n   * @param cache_position  A tensor of shape (T,) containing cache positions,\n   * int64.\n   * @param cache_kv  Fixed-size KV cache, vector of (key, value) pairs.\n   * @return Return tuple (logits, kv_outputs...). Logits shape (N, T,\n   * vocab_size), float32. 
kv_outputs is a vector of (key_delta, value_delta)\n   * pairs for each layer.\n   */\n  std::pair<Ort::Value, std::vector<std::pair<Ort::Value, Ort::Value>>>\n  ForwardLLM(Ort::Value inputs_embeds, Ort::Value attention_mask,\n             const Ort::Value &cache_position,\n             const std::vector<std::pair<Ort::Value, Ort::Value>> &cache_kv);\n\n  /** Create fixed-size KV cache buffer.\n   *\n   * @param batch  Batch size (usually 1).\n   * @return Return vector of (key, value) pairs with fixed cache dimensions [B,\n   * max_total_len, kv_h, hd].\n   */\n  std::vector<std::pair<Ort::Value, Ort::Value>> CreateEmptyKVCache(\n      int64_t batch);\n\n  /** Apply KV delta in-place to KV cache buffer.\n   *\n   * @param cache_kv  Fixed-size KV cache to update, vector of (key, value)\n   * pairs.\n   * @param kv_delta  KV deltas from current step, vector of (key_delta,\n   * value_delta) pairs.\n   * @param cache_position  Cache position tensor indicating where to write\n   * deltas.\n   */\n  void ApplyKvDeltaInplace(\n      std::vector<std::pair<Ort::Value, Ort::Value>> *cache_kv,\n      const std::vector<std::pair<Ort::Value, Ort::Value>> &kv_delta,\n      const Ort::Value &cache_position);\n\n  /** Check if using KV cache mode. 
Always returns true for FunASR-nano.\n   */\n  bool UseKVCache() const;\n\n  /** Run the embedding model.\n   *\n   * @param input_ids  A tensor of shape (N, T) containing token IDs.\n   * @return Return embeddings of shape (N, T, hidden_size)\n   */\n  Ort::Value ForwardEmbedding(Ort::Value input_ids);\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const;\n\n  /** Return the hidden size of the model\n   */\n  int32_t HiddenSize() const;\n\n  /** Return the maximum total sequence length (from metadata)\n   */\n  int32_t GetMaxTotalLen() const;\n\n  /** It is lfr_window_size in metadata\n   */\n  int32_t LfrWindowSize() const;\n\n  /** It is lfr_window_shift in metadata\n   */\n  int32_t LfrWindowShift() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  /** Check if embedding model is available\n   */\n  bool HasEmbeddingModel() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_FUNASR_NANO_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-lm-config.cc",
    "content": "// sherpa-onnx/csrc/offline-lm-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineLMConfig::Register(ParseOptions *po) {\n  po->Register(\"lm\", &model, \"Path to LM model.\");\n  po->Register(\"lm-scale\", &scale, \"LM scale.\");\n  po->Register(\"lm-num-threads\", &lm_num_threads,\n               \"Number of threads to run the neural network of LM model\");\n  po->Register(\"lm-provider\", &lm_provider,\n               \"Specify a provider to LM model use: cpu, cuda, coreml\");\n  po->Register(\"lodr-fst\", &lodr_fst, \"Path to LODR FST model.\");\n  po->Register(\"lodr-scale\", &lodr_scale, \"LODR scale.\");\n  po->Register(\"lodr-backoff-id\", &lodr_backoff_id,\n               \"ID of the backoff in the LODR FST. -1 means autodetect\");\n}\n\nbool OfflineLMConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"'%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (!lodr_fst.empty() && !FileExists(lodr_fst)) {\n    SHERPA_ONNX_LOGE(\"'%s' does not exist\", lodr_fst.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineLMConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineLMConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"scale=\" << scale << \", \";\n  os << \"lodr_scale=\" << lodr_scale << \", \";\n  os << \"lodr_fst=\\\"\" << lodr_fst << \"\\\", \";\n  os << \"lodr_backoff_id=\" << lodr_backoff_id << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-lm-config.h",
    "content": "// sherpa-onnx/csrc/offline-lm-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_LM_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_LM_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineLMConfig {\n  // path to the onnx model\n  std::string model;\n\n  // LM scale\n  float scale = 0.5;\n  int32_t lm_num_threads = 1;\n  std::string lm_provider = \"cpu\";\n\n  // LODR\n  std::string lodr_fst;\n  float lodr_scale = 0.01;\n  int32_t lodr_backoff_id = -1;  // -1 means not set\n\n  OfflineLMConfig() = default;\n\n  OfflineLMConfig(const std::string &model, float scale, int32_t lm_num_threads,\n                  const std::string &lm_provider, const std::string &lodr_fst,\n                  float lodr_scale, int32_t lodr_backoff_id)\n      : model(model),\n        scale(scale),\n        lm_num_threads(lm_num_threads),\n        lm_provider(lm_provider),\n        lodr_fst(lodr_fst),\n        lodr_scale(lodr_scale),\n        lodr_backoff_id(lodr_backoff_id) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_LM_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-lm.cc",
    "content": "// sherpa-onnx/csrc/offline-lm.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-lm.h\"\n\n#include <algorithm>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/lodr-fst.h\"\n#include \"sherpa-onnx/csrc/offline-rnn-lm.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflineLM> OfflineLM::Create(const OfflineLMConfig &config) {\n  return std::make_unique<OfflineRnnLM>(config);\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineLM> OfflineLM::Create(Manager *mgr,\n                                             const OfflineLMConfig &config) {\n  return std::make_unique<OfflineRnnLM>(mgr, config);\n}\n\nvoid OfflineLM::ComputeLMScore(float scale, int32_t context_size,\n                               std::vector<Hypotheses> *hyps) {\n  // compute the max token seq so that we know how much space to allocate\n  int32_t max_token_seq = 0;\n  int32_t num_hyps = 0;\n\n  // we subtract context_size below since each token sequence is prepended\n  // with context_size blanks\n  for (const auto &h : *hyps) {\n    num_hyps += h.Size();\n    for (const auto &t : h) {\n      max_token_seq =\n          std::max<int32_t>(max_token_seq, t.second.ys.size() - context_size);\n    }\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 2> x_shape{num_hyps, max_token_seq};\n  Ort::Value x = Ort::Value::CreateTensor<int64_t>(allocator, x_shape.data(),\n                                                   x_shape.size());\n\n  std::array<int64_t, 1> x_lens_shape{num_hyps};\n  Ort::Value x_lens = Ort::Value::CreateTensor<int64_t>(\n      allocator, x_lens_shape.data(), x_lens_shape.size());\n\n  int64_t *p = x.GetTensorMutableData<int64_t>();\n  std::fill(p, p + num_hyps * max_token_seq, 
0);\n\n  int64_t *p_lens = x_lens.GetTensorMutableData<int64_t>();\n\n  for (const auto &h : *hyps) {\n    for (const auto &t : h) {\n      const auto &ys = t.second.ys;\n      int32_t len = ys.size() - context_size;\n      std::copy(ys.begin() + context_size, ys.end(), p);\n      *p_lens = len;\n\n      p += max_token_seq;\n      ++p_lens;\n    }\n  }\n  auto negative_loglike = Rescore(std::move(x), std::move(x_lens));\n  const float *p_nll = negative_loglike.GetTensorData<float>();\n  // We scale LODR scale with LM scale to replicate Icefall code\n  auto lodr_scale = config_.lodr_scale * scale;\n  for (auto &h : *hyps) {\n    for (auto &t : h) {\n      // Use -scale here since we want to change negative loglike to loglike.\n      t.second.lm_log_prob = -scale * (*p_nll);\n      ++p_nll;\n      // apply LODR to hyp score\n      if (lodr_fst_ != nullptr) {\n        lodr_fst_->ComputeScore(lodr_scale, &t.second, context_size);\n      }\n    }\n  }\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineLM> OfflineLM::Create(\n    AAssetManager *mgr, const OfflineLMConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflineLM> OfflineLM::Create(\n    NativeResourceManager *mgr, const OfflineLMConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-lm.h",
    "content": "// sherpa-onnx/csrc/offline-lm.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_LM_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_LM_H_\n\n#include <memory>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/lodr-fst.h\"\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineLM {\n public:\n  explicit OfflineLM(const OfflineLMConfig &config) : config_(config) {\n    if (!config_.lodr_fst.empty()) {\n      try {\n        lodr_fst_ = std::make_unique<LodrFst>(LodrFst(config_.lodr_fst,\n                                                    config_.lodr_backoff_id));\n      } catch (const std::exception& e) {\n        throw std::runtime_error(\"Failed to load LODR FST from: \" +\n                                  config_.lodr_fst + \". Error: \" + e.what());\n      }\n    }\n  }\n  virtual ~OfflineLM() = default;\n\n  static std::unique_ptr<OfflineLM> Create(const OfflineLMConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineLM> Create(Manager *mgr,\n                                           const OfflineLMConfig &config);\n\n  /** Rescore a batch of sentences.\n   *\n   * @param x A 2-D tensor of shape (N, L) with data type int64.\n   * @param x_lens A 1-D tensor of shape (N,) with data type int64.\n   *               It contains number of valid tokens in x before padding.\n   * @return Return a 1-D tensor of shape (N,) containing the negative log\n   *         likelihood of each utterance. 
Its data type is float32.\n   *\n   * Caution: It returns negative log likelihood (nll), not log likelihood\n   */\n  virtual Ort::Value Rescore(Ort::Value x, Ort::Value x_lens) = 0;\n\n  // This function updates hyp.lm_log_prob of hyps.\n  //\n  // @param scale LM scale\n  // @param context_size Context size of the transducer decoder model\n  // @param hyps It is changed in-place.\n  void ComputeLMScore(float scale, int32_t context_size,\n                      std::vector<Hypotheses> *hyps);\n\n private:\n  std::unique_ptr<LodrFst> lodr_fst_;\n  float lodr_scale_;\n  OfflineLMConfig config_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_LM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-medasr-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-medasr-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-medasr-ctc-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineMedAsrCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"medasr\", &model,\n      \"Path to model.onnx from MedASR. Please see \"\n      \"https://github.com/k2-fsa/sherpa-onnx/pull/2934 for available models\");\n}\n\nbool OfflineMedAsrCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"MedASR model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineMedAsrCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineMedAsrCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-medasr-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-medasr-ctc-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineMedAsrCtcModelConfig {\n  std::string model;\n\n  OfflineMedAsrCtcModelConfig() = default;\n  explicit OfflineMedAsrCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-medasr-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-medasr-ctc-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-medasr-ctc-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nstd::vector<int64_t> GetMask(Ort::Value length) {\n  auto shape = length.GetTensorTypeAndShapeInfo().GetShape();\n  if (shape.size() != 1) {\n    SHERPA_ONNX_LOGE(\"Invalid length dim %zu\", shape.size());\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  auto batch_size = shape[0];\n\n  const int64_t *p = length.GetTensorData<int64_t>();\n\n  int64_t max_len = *std::max_element(p, p + batch_size);\n\n  std::vector<int64_t> ans(batch_size * max_len, 0);\n\n  int64_t *p_mask = ans.data();\n\n  for (int32_t i = 0; i < batch_size; ++i) {\n    auto len = p[i];\n    std::fill(p_mask, p_mask + len, 1);\n\n    p_mask += max_len;\n  }\n\n  return ans;\n}\n\n}  // namespace\n\nclass OfflineMedAsrCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.medasr.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.medasr.model);\n    
Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::vector<int64_t> mask = GetMask(std::move(features_length));\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> shape =\n        features.GetTensorTypeAndShapeInfo().GetShape();\n    shape.resize(2);\n\n    Ort::Value mask_tensor = Ort::Value::CreateTensor<int64_t>(\n        memory_info, mask.data(), mask.size(), shape.data(), shape.size());\n\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(mask_tensor)};\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"medasr_ctc\") {\n      
SHERPA_ONNX_LOGE(\"Expect model type medasr_ctc. Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(subsampling_factor_,\n                                            \"subsampling_factor\", 4);\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 0;\n};\n\nOfflineMedAsrCtcModel::OfflineMedAsrCtcModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineMedAsrCtcModel::OfflineMedAsrCtcModel(Manager *mgr,\n                                             const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineMedAsrCtcModel::~OfflineMedAsrCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineMedAsrCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineMedAsrCtcModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OfflineMedAsrCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineMedAsrCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineMedAsrCtcModel::OfflineMedAsrCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineMedAsrCtcModel::OfflineMedAsrCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // 
namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-medasr-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-medasr-ctc-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the CTC model from MedASR.\n *\n * See\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/medasr/export_onnx.py\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/medasr/test_onnx.py\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/medasr/run.sh\n *\n */\nclass OfflineMedAsrCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineMedAsrCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineMedAsrCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineMedAsrCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MEDASR_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineModelConfig::Register(ParseOptions *po) {\n  transducer.Register(po);\n  paraformer.Register(po);\n  nemo_ctc.Register(po);\n  whisper.Register(po);\n  fire_red_asr.Register(po);\n  tdnn.Register(po);\n  zipformer_ctc.Register(po);\n  wenet_ctc.Register(po);\n  sense_voice.Register(po);\n  moonshine.Register(po);\n  dolphin.Register(po);\n  canary.Register(po);\n  omnilingual.Register(po);\n  funasr_nano.Register(po);\n  medasr.Register(po);\n  fire_red_asr_ctc.Register(po);\n\n  po->Register(\"telespeech-ctc\", &telespeech_ctc,\n               \"Path to model.onnx for telespeech ctc\");\n\n  po->Register(\"tokens\", &tokens, \"Path to tokens.txt\");\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n\n  po->Register(\"model-type\", &model_type,\n               \"Specify it to reduce model initialization time. \"\n               \"Valid values are: transducer, paraformer, nemo_ctc, whisper, \"\n               \"tdnn, zipformer2_ctc, telespeech_ctc, fire_red_asr.\"\n               \"All other values lead to loading the model twice.\");\n  po->Register(\n      \"modeling-unit\", &modeling_unit,\n      \"The modeling unit of the model, commonly used units are bpe, \"\n      \"bbpe, cjkchar, cjkchar+bpe, etc. 
Currently, it is needed only when \"\n      \"hotwords are provided, we need it to encode the hotwords into \"\n      \"token sequence.\");\n  po->Register(\"bpe-vocab\", &bpe_vocab,\n               \"The vocabulary generated by google's sentencepiece program. \"\n               \"It is a file has two columns, one is the token, the other is \"\n               \"the log probability, you can get it from the directory where \"\n               \"your bpe model is generated. Only used when hotwords provided \"\n               \"and the modeling unit is bpe, bbpe, or cjkchar+bpe\");\n}\n\nbool OfflineModelConfig::Validate() const {\n  // For RK NPU, we reinterpret num_threads:\n  //\n  // For RK3588 only\n  // num_threads == 1 -> Select a core randomly\n  // num_threads == 0 -> Use NPU core 0\n  // num_threads == -1 -> Use NPU core 1\n  // num_threads == -2 -> Use NPU core 2\n  // num_threads == -3 -> Use NPU core 0 and core 1\n  // num_threads == -4 -> Use NPU core 0, core 1, and core 2\n  if (provider != \"rknn\") {\n    if (num_threads < 1) {\n      SHERPA_ONNX_LOGE(\"num_threads should be > 0. Given %d\", num_threads);\n      return false;\n    }\n    if (!sense_voice.model.empty() && (EndsWith(sense_voice.model, \".rknn\"))) {\n      SHERPA_ONNX_LOGE(\n          \"--provider is %s, which is not rknn, but you pass a rknn model \"\n          \"filename. model: '%s'\",\n          provider.c_str(), sense_voice.model.c_str());\n      return false;\n    }\n  }\n\n  if (provider == \"rknn\") {\n    if (!sense_voice.model.empty() && (EndsWith(sense_voice.model, \".onnx\"))) {\n      SHERPA_ONNX_LOGE(\n          \"--provider is rknn, but you pass an onnx model \"\n          \"filename. 
model: '%s'\",\n          sense_voice.model.c_str());\n      return false;\n    }\n  }\n\n  // For FunASR-nano, tokens file is not required (tokenizer is loaded from\n  // directory) Check tokens file only if not using funasr_nano\n  if (funasr_nano.encoder_adaptor.empty()) {\n    if (!FileExists(tokens)) {\n      SHERPA_ONNX_LOGE(\"tokens: '%s' does not exist\", tokens.c_str());\n      return false;\n    }\n  }\n\n  if (!modeling_unit.empty() &&\n      (modeling_unit == \"bpe\" || modeling_unit == \"cjkchar+bpe\" ||\n       modeling_unit == \"bbpe\")) {\n    if (!FileExists(bpe_vocab)) {\n      SHERPA_ONNX_LOGE(\"bpe_vocab: '%s' does not exist\", bpe_vocab.c_str());\n      return false;\n    }\n  }\n\n  if (!paraformer.model.empty()) {\n    return paraformer.Validate();\n  }\n\n  if (!nemo_ctc.model.empty()) {\n    return nemo_ctc.Validate();\n  }\n\n  if (!whisper.encoder.empty()) {\n    return whisper.Validate();\n  }\n\n  if (!fire_red_asr.encoder.empty()) {\n    return fire_red_asr.Validate();\n  }\n\n  if (!tdnn.model.empty()) {\n    return tdnn.Validate();\n  }\n\n  if (!zipformer_ctc.model.empty()) {\n    return zipformer_ctc.Validate();\n  }\n\n  if (!wenet_ctc.model.empty()) {\n    return wenet_ctc.Validate();\n  }\n\n  if (!sense_voice.model.empty() ||\n      !sense_voice.qnn_config.context_binary.empty()) {\n    return sense_voice.Validate();\n  }\n\n  if (!moonshine.encoder.empty()) {\n    return moonshine.Validate();\n  }\n\n  if (!dolphin.model.empty()) {\n    return dolphin.Validate();\n  }\n\n  if (!canary.encoder.empty()) {\n    return canary.Validate();\n  }\n\n  if (!omnilingual.model.empty()) {\n    return omnilingual.Validate();\n  }\n\n  if (!funasr_nano.encoder_adaptor.empty()) {\n    return funasr_nano.Validate();\n  }\n\n  if (!medasr.model.empty()) {\n    return medasr.Validate();\n  }\n\n  if (!fire_red_asr_ctc.model.empty()) {\n    return fire_red_asr_ctc.Validate();\n  }\n\n  if (!telespeech_ctc.empty() && !FileExists(telespeech_ctc)) 
{\n    SHERPA_ONNX_LOGE(\"telespeech_ctc: '%s' does not exist\",\n                     telespeech_ctc.c_str());\n    return false;\n  }\n\n  if (!transducer.encoder_filename.empty()) {\n    return transducer.Validate();\n  }\n\n  return true;\n}\n\nstd::string OfflineModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineModelConfig(\";\n  os << \"transducer=\" << transducer.ToString() << \", \";\n  os << \"paraformer=\" << paraformer.ToString() << \", \";\n  os << \"nemo_ctc=\" << nemo_ctc.ToString() << \", \";\n  os << \"whisper=\" << whisper.ToString() << \", \";\n  os << \"fire_red_asr=\" << fire_red_asr.ToString() << \", \";\n  os << \"tdnn=\" << tdnn.ToString() << \", \";\n  os << \"zipformer_ctc=\" << zipformer_ctc.ToString() << \", \";\n  os << \"wenet_ctc=\" << wenet_ctc.ToString() << \", \";\n  os << \"sense_voice=\" << sense_voice.ToString() << \", \";\n  os << \"moonshine=\" << moonshine.ToString() << \", \";\n  os << \"dolphin=\" << dolphin.ToString() << \", \";\n  os << \"canary=\" << canary.ToString() << \", \";\n  os << \"omnilingual=\" << omnilingual.ToString() << \", \";\n  os << \"funasr_nano=\" << funasr_nano.ToString() << \", \";\n  os << \"medasr=\" << medasr.ToString() << \", \";\n  os << \"fire_red_asr_ctc=\" << fire_red_asr_ctc.ToString() << \", \";\n  os << \"telespeech_ctc=\\\"\" << telespeech_ctc << \"\\\", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\", \";\n  os << \"model_type=\\\"\" << model_type << \"\\\", \";\n  os << \"modeling_unit=\\\"\" << modeling_unit << \"\\\", \";\n  os << \"bpe_vocab=\\\"\" << bpe_vocab << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-canary-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-dolphin-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-medasr-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-paraformer-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tdnn-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-wenet-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-zipformer-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineModelConfig {\n  OfflineTransducerModelConfig transducer;\n  OfflineParaformerModelConfig paraformer;\n  OfflineNemoEncDecCtcModelConfig nemo_ctc;\n  OfflineWhisperModelConfig whisper;\n  OfflineFireRedAsrModelConfig fire_red_asr;\n  OfflineTdnnModelConfig tdnn;\n  OfflineZipformerCtcModelConfig zipformer_ctc;\n  OfflineWenetCtcModelConfig wenet_ctc;\n  OfflineSenseVoiceModelConfig sense_voice;\n  OfflineMoonshineModelConfig moonshine;\n  OfflineDolphinModelConfig dolphin;\n  OfflineCanaryModelConfig canary;\n  OfflineOmnilingualAsrCtcModelConfig omnilingual;\n  OfflineFunASRNanoModelConfig funasr_nano;\n  OfflineMedAsrCtcModelConfig medasr;\n  
OfflineFireRedAsrCtcModelConfig fire_red_asr_ctc;\n  std::string telespeech_ctc;\n\n  std::string tokens;\n  int32_t num_threads = 2;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  // With the help of this field, we only need to load the model once\n  // instead of twice; and therefore it reduces initialization time.\n  //\n  // Valid values:\n  //  - transducer. The given model is from icefall\n  //  - paraformer. It is a paraformer model\n  //  - nemo_ctc. It is a NeMo CTC model.\n  //\n  // All other values are invalid and lead to loading the model twice.\n  std::string model_type;\n\n  std::string modeling_unit = \"cjkchar\";\n  std::string bpe_vocab;\n\n  OfflineModelConfig() = default;\n  OfflineModelConfig(const OfflineTransducerModelConfig &transducer,\n                     const OfflineParaformerModelConfig &paraformer,\n                     const OfflineNemoEncDecCtcModelConfig &nemo_ctc,\n                     const OfflineWhisperModelConfig &whisper,\n                     const OfflineFireRedAsrModelConfig &fire_red_asr,\n                     const OfflineTdnnModelConfig &tdnn,\n                     const OfflineZipformerCtcModelConfig &zipformer_ctc,\n                     const OfflineWenetCtcModelConfig &wenet_ctc,\n                     const OfflineSenseVoiceModelConfig &sense_voice,\n                     const OfflineMoonshineModelConfig &moonshine,\n                     const OfflineDolphinModelConfig &dolphin,\n                     const OfflineCanaryModelConfig &canary,\n                     const OfflineOmnilingualAsrCtcModelConfig &omnilingual,\n                     const OfflineFunASRNanoModelConfig &funasr_nano,\n                     const OfflineMedAsrCtcModelConfig &medasr,\n                     const OfflineFireRedAsrCtcModelConfig &fire_red_asr_ctc,\n                     const std::string &telespeech_ctc,\n                     const std::string &tokens, int32_t num_threads, bool debug,\n                     const 
std::string &provider, const std::string &model_type,\n                     const std::string &modeling_unit,\n                     const std::string &bpe_vocab)\n      : transducer(transducer),\n        paraformer(paraformer),\n        nemo_ctc(nemo_ctc),\n        whisper(whisper),\n        fire_red_asr(fire_red_asr),\n        tdnn(tdnn),\n        zipformer_ctc(zipformer_ctc),\n        wenet_ctc(wenet_ctc),\n        sense_voice(sense_voice),\n        moonshine(moonshine),\n        dolphin(dolphin),\n        canary(canary),\n        omnilingual(omnilingual),\n        funasr_nano(funasr_nano),\n        medasr(medasr),\n        fire_red_asr_ctc(fire_red_asr_ctc),\n        telespeech_ctc(telespeech_ctc),\n        tokens(tokens),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider),\n        model_type(model_type),\n        modeling_unit(modeling_unit),\n        bpe_vocab(bpe_vocab) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct OfflineMoonshineDecoderResult {\n  /// The decoded token IDs\n  std::vector<int32_t> tokens;\n};\n\nclass OfflineMoonshineDecoder {\n public:\n  virtual ~OfflineMoonshineDecoder() = default;\n\n  /** Run beam search given the output from the moonshine encoder model.\n   *\n   * @param encoder_out A 3-D tensor of shape (batch_size, T, dim)\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineMoonshineDecoderResult> Decode(\n      Ort::Value encoder_out) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineMoonshineDecoderResult>\nOfflineMoonshineGreedySearchDecoder::Decode(Ort::Value encoder_out) {\n  auto encoder_out_shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n  if (encoder_out_shape[0] != 1) {\n    SHERPA_ONNX_LOGE(\"Support only batch size == 1. Given: %d\\n\",\n                     static_cast<int32_t>(encoder_out_shape[0]));\n    return {};\n  }\n\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  // encoder_out_shape[1] * 384 is the number of audio samples\n  // 16000 is the sample rate\n  //\n  //\n  // 384 is from the moonshine paper\n  int32_t max_len =\n      static_cast<int32_t>(encoder_out_shape[1] * 384 / 16000.0 * 6);\n\n  int32_t sos = 1;\n  int32_t eos = 2;\n  int32_t seq_len = 1;\n\n  std::vector<int32_t> tokens;\n\n  std::array<int64_t, 2> token_shape = {1, 1};\n  int64_t seq_len_shape = 1;\n\n  Ort::Value token_tensor = Ort::Value::CreateTensor(\n      memory_info, &sos, 1, token_shape.data(), token_shape.size());\n\n  Ort::Value seq_len_tensor =\n      Ort::Value::CreateTensor(memory_info, &seq_len, 1, &seq_len_shape, 1);\n\n  Ort::Value logits{nullptr};\n  std::vector<Ort::Value> states;\n\n  std::tie(logits, states) = model_->ForwardUnCachedDecoder(\n      std::move(token_tensor), std::move(seq_len_tensor), View(&encoder_out));\n\n  int32_t vocab_size = logits.GetTensorTypeAndShapeInfo().GetShape()[2];\n\n  for (int32_t i = 0; i != max_len; ++i) {\n    const float *p = logits.GetTensorData<float>();\n\n    int32_t max_token_id = static_cast<int32_t>(\n        std::distance(p, 
std::max_element(p, p + vocab_size)));\n    if (max_token_id == eos) {\n      break;\n    }\n    tokens.push_back(max_token_id);\n\n    seq_len += 1;\n\n    token_tensor = Ort::Value::CreateTensor(\n        memory_info, &tokens.back(), 1, token_shape.data(), token_shape.size());\n\n    seq_len_tensor =\n        Ort::Value::CreateTensor(memory_info, &seq_len, 1, &seq_len_shape, 1);\n\n    // To fix the false alarm of clang-tidy\n    // error: 'states' used after it was moved\n    // [bugprone-use-after-move,-warnings-as-errors]\n    // we use a tmp_states here\n    std::vector<Ort::Value> tmp_states{std::move(states)};\n\n    std::tie(logits, states) = model_->ForwardCachedDecoder(\n        std::move(token_tensor), std::move(seq_len_tensor), View(&encoder_out),\n        std::move(tmp_states));\n  }\n\n  OfflineMoonshineDecoderResult ans;\n  ans.tokens = std::move(tokens);\n\n  return {ans};\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-moonshine-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineMoonshineGreedySearchDecoder : public OfflineMoonshineDecoder {\n public:\n  explicit OfflineMoonshineGreedySearchDecoder(OfflineMoonshineModel *model)\n      : model_(model) {}\n\n  std::vector<OfflineMoonshineDecoderResult> Decode(\n      Ort::Value encoder_out) override;\n\n private:\n  OfflineMoonshineModel *model_;  // not owned\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineMoonshineModelConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"moonshine-preprocessor\", &preprocessor,\n      \"Path to onnx preprocessor of moonshine v1, e.g., preprocess.onnx\");\n\n  po->Register(\"moonshine-encoder\", &encoder,\n               \"Path to onnx encoder of moonshine v1 or v2, e.g., encode.onnx \"\n               \"for v1, encoder_model.onnx for v2\");\n\n  po->Register(\"moonshine-uncached-decoder\", &uncached_decoder,\n               \"Path to onnx uncached_decoder of moonshine v1, e.g., \"\n               \"uncached_decode.onnx\");\n\n  po->Register(\n      \"moonshine-cached-decoder\", &cached_decoder,\n      \"Path to onnx cached_decoder of moonshine v1, e.g., cached_decode.onnx\");\n\n  po->Register(\"moonshine-merged-decoder\", &merged_decoder,\n               \"Path to onnx merged decoder of moonshine v2, e.g., \"\n               \"decoder_model_merged.onnx\");\n}\n\nbool OfflineMoonshineModelConfig::Validate() const {\n  // both v1 and v2 require a encoder model\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --moonshine-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"moonshine encoder file '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (merged_decoder.empty()) {\n    // for v1\n    if (preprocessor.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Please provide --moonshine-preprocessor for v1 or \"\n          \"--moonshine-merged-decoder for v2\");\n      return false;\n    }\n\n    if (!FileExists(preprocessor)) {\n      SHERPA_ONNX_LOGE(\"moonshine preprocessor file '%s' does not 
exist\",\n                       preprocessor.c_str());\n      return false;\n    }\n\n    if (uncached_decoder.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide --moonshine-uncached-decoder for v1\");\n      return false;\n    }\n\n    if (!FileExists(uncached_decoder)) {\n      SHERPA_ONNX_LOGE(\"moonshine uncached decoder file '%s' does not exist\",\n                       uncached_decoder.c_str());\n      return false;\n    }\n\n    if (cached_decoder.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide --moonshine-cached-decoder for v1\");\n      return false;\n    }\n\n    if (!FileExists(cached_decoder)) {\n      SHERPA_ONNX_LOGE(\"moonshine cached decoder file '%s' does not exist\",\n                       cached_decoder.c_str());\n      return false;\n    }\n  } else {\n    // for v2\n    if (!preprocessor.empty()) {\n      SHERPA_ONNX_LOGE(\"Please don't provide preprocessor for moonshine v2\");\n      return false;\n    }\n\n    if (!uncached_decoder.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Please don't provide uncached decoder for moonshine v2\");\n      return false;\n    }\n\n    if (!cached_decoder.empty()) {\n      SHERPA_ONNX_LOGE(\"Please don't provide cached decoder for moonshine v2\");\n      return false;\n    }\n\n    if (!FileExists(merged_decoder)) {\n      SHERPA_ONNX_LOGE(\n          \"moonshine v2 merged decoder file '%s' does not exist\",\n          merged_decoder.c_str());\n      return false;\n    }\n  }\n\n  return true;\n}\n\nstd::string OfflineMoonshineModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineMoonshineModelConfig(\";\n  os << \"preprocessor=\\\"\" << preprocessor << \"\\\", \";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"uncached_decoder=\\\"\" << uncached_decoder << \"\\\", \";\n  os << \"cached_decoder=\\\"\" << cached_decoder << \"\\\", \";\n  os << \"merged_decoder=\\\"\" << merged_decoder << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineMoonshineModelConfig {\n  // For moonshine v1, it has 4 models:\n  // preprocessor, encoder, uncached_decoder, cached_decoder\n  //\n  // For moonshine v2, it has 2 models:\n  // encoder, merged_decoder\n  //\n  // You can choose either v1 by providing 4 models or\n  // select v2 by providing 2 models, but not both\n\n  std::string preprocessor;\n  std::string encoder;\n  std::string uncached_decoder;\n  std::string cached_decoder;\n\n  std::string merged_decoder;\n\n  OfflineMoonshineModelConfig() = default;\n  OfflineMoonshineModelConfig(const std::string &preprocessor,\n                              const std::string &encoder,\n                              const std::string &uncached_decoder,\n                              const std::string &cached_decoder,\n                              const std::string &merged_decoder)\n      : preprocessor(preprocessor),\n        encoder(encoder),\n        uncached_decoder(uncached_decoder),\n        cached_decoder(cached_decoder),\n        merged_decoder(merged_decoder) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model-v2.cc",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model-v2.cc\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-model-v2.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineMoonshineModelV2::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.moonshine.encoder), sess_opts_);\n    InitEncoder(nullptr, 0);\n\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.moonshine.merged_decoder),\n        sess_opts_);\n    InitDecoder(nullptr, 0);\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.moonshine.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.moonshine.merged_decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n  }\n\n  Ort::Value ForwardEncoder(Ort::Value audio) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> mask;\n    std::vector<Ort::Value> inputs;\n\n    
inputs.push_back(std::move(audio));\n\n    if (encoder_input_names_.size() > 1) {\n      std::vector<int64_t> shape =\n          inputs.back().GetTensorTypeAndShapeInfo().GetShape();\n\n      mask.resize(shape[1], 1);\n\n      Ort::Value mask_tensor = Ort::Value::CreateTensor<int64_t>(\n          memory_info, mask.data(), mask.size(), shape.data(), shape.size());\n      inputs.push_back(std::move(mask_tensor));\n    }\n\n    auto features = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n\n    return std::move(features[0]);\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardDecoder(\n      Ort::Value tokens, Ort::Value encoder_out,\n      std::vector<Ort::Value> states) {\n    auto encoder_seq_len = states[2].GetTensorTypeAndShapeInfo().GetShape()[2];\n    bool use_cache_branch = encoder_seq_len > 1;\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> mask;\n\n    std::vector<Ort::Value> inputs;\n\n    inputs.reserve(4 + states.size());\n\n    if (decoder_needs_mask_) {\n      mask.resize(encoder_out.GetTensorTypeAndShapeInfo().GetShape()[1], 1);\n      std::array<int64_t, 2> shape = {\n          1, encoder_out.GetTensorTypeAndShapeInfo().GetShape()[1]};\n\n      Ort::Value mask_tensor = Ort::Value::CreateTensor<int64_t>(\n          memory_info, mask.data(), mask.size(), shape.data(), shape.size());\n\n      inputs.push_back(std::move(mask_tensor));\n    }\n\n    inputs.push_back(std::move(tokens));\n    inputs.push_back(std::move(encoder_out));\n\n    for (auto &s : states) {\n      inputs.push_back(View(&s));\n    }\n\n    int64_t shape = 1;\n\n    Ort::Value tensor = Ort::Value::CreateTensor<bool>(\n        memory_info, &use_cache_branch, 1, &shape, 1);\n\n    inputs.push_back(std::move(tensor));\n\n    auto out = decoder_sess_->Run(\n        {}, 
decoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n\n    if (!use_cache_branch) {\n      // update encoder and decoder\n      for (int32_t i = 0; i < static_cast<int32_t>(states_.size()); ++i) {\n        states[i] = std::move(out[1 + i]);\n      }\n    } else {\n      // only update decoder kv\n      for (int32_t i = 0; i < num_layers_; ++i) {\n        states[4 * i + 0] = std::move(out[1 + 4 * i + 0]);\n        states[4 * i + 1] = std::move(out[1 + 4 * i + 1]);\n      }\n    }\n\n    return {std::move(out[0]), std::move(states)};\n  }\n\n  std::vector<Ort::Value> GetDecoderInitStates() {\n    std::vector<Ort::Value> ans;\n\n    ans.reserve(states_.size());\n\n    for (auto &s : states_) {\n      ans.push_back(View(&s));\n    }\n\n    return ans;\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      encoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!encoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass model data or initialize the encoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      decoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!decoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass model data or initialize the decoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    
GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n\n    for (const auto &s : decoder_input_names_) {\n      if (Contains(s, \"encoder_attention_mask\")) {\n        decoder_needs_mask_ = true;\n      }\n    }\n\n    int32_t k = 0;\n    for (const auto &s : decoder_input_names_) {\n      if (Contains(s, \"key_values\")) {\n        auto shape = decoder_sess_->GetInputTypeInfo(k)\n                         .GetTensorTypeAndShapeInfo()\n                         .GetShape();\n        if (static_cast<int32_t>(shape.size()) != 4) {\n          SHERPA_ONNX_LOGE(\"The shape for %s should be 4-d. Given: %d-d\",\n                           s.c_str(), static_cast<int32_t>(shape.size()));\n          SHERPA_ONNX_EXIT(-1);\n        }\n\n        num_head_ = shape[1];\n        head_dim_ = shape[3];\n        break;\n      }\n      k += 1;\n    }\n\n    if (decoder_needs_mask_) {\n      // [ mask, ids, encoder_out, states, use_cache_branch]\n      num_layers_ = (static_cast<int32_t>(decoder_input_names_.size()) - 4) / 4;\n    } else {\n      // [ ids, encoder_out, states, use_cache_branch]\n      num_layers_ = (static_cast<int32_t>(decoder_input_names_.size()) - 3) / 4;\n    }\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"need attention mask: %d\",\n                       static_cast<int32_t>(decoder_needs_mask_));\n      SHERPA_ONNX_LOGE(\"num_head: %d\", num_head_);\n      SHERPA_ONNX_LOGE(\"head_dim: %d\", head_dim_);\n      SHERPA_ONNX_LOGE(\"num_layers: %d\", num_layers_);\n    }\n\n    InitDecoderStates();\n  }\n\n  void InitDecoderStates() {\n    states_.reserve(num_layers_ * 4);\n    std::array<int64_t, 4> shape{1, num_head_, 0, head_dim_};\n\n    auto n = shape[0] * shape[1] * shape[2] * shape[3];\n\n    for (int32_t i = 0; i < 4 * num_layers_; ++i) {\n      Ort::Value v = 
Ort::Value::CreateTensor<float>(Allocator(), shape.data(),\n                                                     shape.size());\n\n      float *p = v.GetTensorMutableData<float>();\n      memset(p, 0, sizeof(float) * n);\n      states_.push_back(std::move(v));\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<Ort::Value> states_;\n\n  int32_t num_head_ = 0;\n  int32_t head_dim_ = 0;\n  int32_t num_layers_ = 0;\n  bool decoder_needs_mask_ = false;\n};\n\nOfflineMoonshineModelV2::OfflineMoonshineModelV2(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineMoonshineModelV2::OfflineMoonshineModelV2(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineMoonshineModelV2::~OfflineMoonshineModelV2() = default;\n\nOrt::Value OfflineMoonshineModelV2::ForwardEncoder(Ort::Value audio) const {\n  return impl_->ForwardEncoder(std::move(audio));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOfflineMoonshineModelV2::ForwardDecoder(Ort::Value token,\n                                        Ort::Value encoder_out,\n                                        std::vector<Ort::Value> states) const {\n  return impl_->ForwardDecoder(std::move(token), std::move(encoder_out),\n                               
std::move(states));\n}\n\nstd::vector<Ort::Value> OfflineMoonshineModelV2::GetDecoderInitStates() const {\n  return impl_->GetDecoderInitStates();\n}\n\nOrtAllocator *OfflineMoonshineModelV2::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineMoonshineModelV2::OfflineMoonshineModelV2(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineMoonshineModelV2::OfflineMoonshineModelV2(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model-v2.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model-v2.h\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_V2_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_V2_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n// please see\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/moonshine/v2/test.py\nclass OfflineMoonshineModelV2 {\n public:\n  explicit OfflineMoonshineModelV2(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineMoonshineModelV2(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineMoonshineModelV2();\n\n  /** Run the encoder model.\n   *\n   * @param audio A float32 tensor of shape (batch_size, num_samples)\n   *\n   * @return Return a float32 tensor of shape (batch_size, T, dim) that\n   *         can be used as the input of ForwardDecoder()\n   *\n   * Note it currently supports only batch size 1.\n   */\n  Ort::Value ForwardEncoder(Ort::Value audio) const;\n\n  /** Run the merged decoder.\n   *\n   * @param token A int64 tensor of shape (batch_size, num_tokens)\n   * @param encoder_out A float32 tensor of shape (batch_size, T, dim)\n   * @param states Model States\n   *\n   * @returns Return a pair:\n   *\n   *          - logits, a float32 tensor of shape (batch_size, 1, dim)\n   *          - states, a list of states\n   *\n   * Note it supports only batch_size 1.\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardDecoder(\n      Ort::Value token, Ort::Value encoder_out,\n      std::vector<Ort::Value> states) const;\n\n  std::vector<Ort::Value> GetDecoderInitStates() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace 
sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_V2_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model.cc",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineMoonshineModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.moonshine.preprocessor);\n      InitPreprocessor(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.moonshine.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.moonshine.uncached_decoder);\n      InitUnCachedDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.moonshine.cached_decoder);\n      InitCachedDecoder(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.moonshine.preprocessor);\n      InitPreprocessor(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.moonshine.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.moonshine.uncached_decoder);\n      InitUnCachedDecoder(buf.data(), 
buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.moonshine.cached_decoder);\n      InitCachedDecoder(buf.data(), buf.size());\n    }\n  }\n\n  Ort::Value ForwardPreprocessor(Ort::Value audio) {\n    auto features = preprocessor_sess_->Run(\n        {}, preprocessor_input_names_ptr_.data(), &audio, 1,\n        preprocessor_output_names_ptr_.data(),\n        preprocessor_output_names_ptr_.size());\n\n    return std::move(features[0]);\n  }\n\n  Ort::Value ForwardEncoder(Ort::Value features, Ort::Value features_len) {\n    std::array<Ort::Value, 2> encoder_inputs{std::move(features),\n                                             std::move(features_len)};\n    auto encoder_out = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n        encoder_inputs.size(), encoder_output_names_ptr_.data(),\n        encoder_output_names_ptr_.size());\n\n    return std::move(encoder_out[0]);\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardUnCachedDecoder(\n      Ort::Value tokens, Ort::Value seq_len, Ort::Value encoder_out) {\n    std::array<Ort::Value, 3> uncached_decoder_input = {\n        std::move(tokens),\n        std::move(encoder_out),\n        std::move(seq_len),\n    };\n\n    auto uncached_decoder_out = uncached_decoder_sess_->Run(\n        {}, uncached_decoder_input_names_ptr_.data(),\n        uncached_decoder_input.data(), uncached_decoder_input.size(),\n        uncached_decoder_output_names_ptr_.data(),\n        uncached_decoder_output_names_ptr_.size());\n\n    std::vector<Ort::Value> states;\n    states.reserve(uncached_decoder_out.size() - 1);\n\n    int32_t i = -1;\n    for (auto &s : uncached_decoder_out) {\n      ++i;\n      if (i == 0) {\n        continue;\n      }\n\n      states.push_back(std::move(s));\n    }\n\n    return {std::move(uncached_decoder_out[0]), std::move(states)};\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardCachedDecoder(\n      Ort::Value tokens, 
Ort::Value seq_len, Ort::Value encoder_out,\n      std::vector<Ort::Value> states) {\n    std::vector<Ort::Value> cached_decoder_input;\n    cached_decoder_input.reserve(3 + states.size());\n    cached_decoder_input.push_back(std::move(tokens));\n    cached_decoder_input.push_back(std::move(encoder_out));\n    cached_decoder_input.push_back(std::move(seq_len));\n\n    for (auto &s : states) {\n      cached_decoder_input.push_back(std::move(s));\n    }\n\n    auto cached_decoder_out = cached_decoder_sess_->Run(\n        {}, cached_decoder_input_names_ptr_.data(), cached_decoder_input.data(),\n        cached_decoder_input.size(), cached_decoder_output_names_ptr_.data(),\n        cached_decoder_output_names_ptr_.size());\n\n    std::vector<Ort::Value> next_states;\n    next_states.reserve(cached_decoder_out.size() - 1);\n\n    int32_t i = -1;\n    for (auto &s : cached_decoder_out) {\n      ++i;\n      if (i == 0) {\n        continue;\n      }\n\n      next_states.push_back(std::move(s));\n    }\n\n    return {std::move(cached_decoder_out[0]), std::move(next_states)};\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void InitPreprocessor(void *model_data, size_t model_data_length) {\n    preprocessor_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(preprocessor_sess_.get(), &preprocessor_input_names_,\n                  &preprocessor_input_names_ptr_);\n\n    GetOutputNames(preprocessor_sess_.get(), &preprocessor_output_names_,\n                   &preprocessor_output_names_ptr_);\n  }\n\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   
&encoder_output_names_ptr_);\n  }\n\n  void InitUnCachedDecoder(void *model_data, size_t model_data_length) {\n    uncached_decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(uncached_decoder_sess_.get(), &uncached_decoder_input_names_,\n                  &uncached_decoder_input_names_ptr_);\n\n    GetOutputNames(uncached_decoder_sess_.get(),\n                   &uncached_decoder_output_names_,\n                   &uncached_decoder_output_names_ptr_);\n  }\n\n  void InitCachedDecoder(void *model_data, size_t model_data_length) {\n    cached_decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(cached_decoder_sess_.get(), &cached_decoder_input_names_,\n                  &cached_decoder_input_names_ptr_);\n\n    GetOutputNames(cached_decoder_sess_.get(), &cached_decoder_output_names_,\n                   &cached_decoder_output_names_ptr_);\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> preprocessor_sess_;\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> uncached_decoder_sess_;\n  std::unique_ptr<Ort::Session> cached_decoder_sess_;\n\n  std::vector<std::string> preprocessor_input_names_;\n  std::vector<const char *> preprocessor_input_names_ptr_;\n\n  std::vector<std::string> preprocessor_output_names_;\n  std::vector<const char *> preprocessor_output_names_ptr_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> uncached_decoder_input_names_;\n  std::vector<const char *> uncached_decoder_input_names_ptr_;\n\n  std::vector<std::string> 
uncached_decoder_output_names_;\n  std::vector<const char *> uncached_decoder_output_names_ptr_;\n\n  std::vector<std::string> cached_decoder_input_names_;\n  std::vector<const char *> cached_decoder_input_names_ptr_;\n\n  std::vector<std::string> cached_decoder_output_names_;\n  std::vector<const char *> cached_decoder_output_names_ptr_;\n};\n\nOfflineMoonshineModel::OfflineMoonshineModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineMoonshineModel::OfflineMoonshineModel(Manager *mgr,\n                                             const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineMoonshineModel::~OfflineMoonshineModel() = default;\n\nOrt::Value OfflineMoonshineModel::ForwardPreprocessor(Ort::Value audio) const {\n  return impl_->ForwardPreprocessor(std::move(audio));\n}\n\nOrt::Value OfflineMoonshineModel::ForwardEncoder(\n    Ort::Value features, Ort::Value features_len) const {\n  return impl_->ForwardEncoder(std::move(features), std::move(features_len));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOfflineMoonshineModel::ForwardUnCachedDecoder(Ort::Value token,\n                                              Ort::Value seq_len,\n                                              Ort::Value encoder_out) const {\n  return impl_->ForwardUnCachedDecoder(std::move(token), std::move(seq_len),\n                                       std::move(encoder_out));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOfflineMoonshineModel::ForwardCachedDecoder(\n    Ort::Value token, Ort::Value seq_len, Ort::Value encoder_out,\n    std::vector<Ort::Value> states) const {\n  return impl_->ForwardCachedDecoder(std::move(token), std::move(seq_len),\n                                     std::move(encoder_out), std::move(states));\n}\n\nOrtAllocator *OfflineMoonshineModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 
9\ntemplate OfflineMoonshineModel::OfflineMoonshineModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineMoonshineModel::OfflineMoonshineModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-model.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n// please see\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/moonshine/test.py\nclass OfflineMoonshineModel {\n public:\n  explicit OfflineMoonshineModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineMoonshineModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineMoonshineModel();\n\n  /** Run the preprocessor model.\n   *\n   * @param audio A float32 tensor of shape (batch_size, num_samples)\n   *\n   * @return Return a float32 tensor of shape (batch_size, T, dim) that\n   *         can be used as the input of ForwardEncoder()\n   */\n  Ort::Value ForwardPreprocessor(Ort::Value audio) const;\n\n  /** Run the encoder model.\n   *\n   * @param features A float32 tensor of shape (batch_size, T, dim)\n   * @param features_len A int32 tensor of shape (batch_size,)\n   * @returns A float32 tensor of shape (batch_size, T, dim).\n   */\n  Ort::Value ForwardEncoder(Ort::Value features, Ort::Value features_len) const;\n\n  /** Run the uncached decoder.\n   *\n   * @param token A int32 tensor of shape (batch_size, num_tokens)\n   * @param seq_len A int32 tensor of shape (batch_size,) containing number\n   *                of predicted tokens so far\n   * @param encoder_out A float32 tensor of shape (batch_size, T, dim)\n   *\n   * @returns Return a pair:\n   *\n   *          - logits, a float32 tensor of shape (batch_size, 1, dim)\n   *          - states, a list of states\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardUnCachedDecoder(\n      Ort::Value token, Ort::Value seq_len, 
Ort::Value encoder_out) const;\n\n  /** Run the cached decoder.\n   *\n   * @param token An int32 tensor of shape (batch_size, num_tokens)\n   * @param seq_len An int32 tensor of shape (batch_size,) containing number\n   *                of predicted tokens so far\n   * @param encoder_out A float32 tensor of shape (batch_size, T, dim)\n   * @param states A list of previous states\n   *\n   * @returns Return a pair:\n   *          - logits, a float32 tensor of shape (batch_size, 1, dim)\n   *          - states, a list of new states\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> ForwardCachedDecoder(\n      Ort::Value token, Ort::Value seq_len, Ort::Value encoder_out,\n      std::vector<Ort::Value> states) const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.cc\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineMoonshineDecoderResult>\nOfflineMoonshineV2GreedySearchDecoder::Decode(Ort::Value encoder_out) {\n  auto encoder_out_shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n  if (encoder_out_shape[0] != 1) {\n    SHERPA_ONNX_LOGE(\"Support only batch size == 1. Given: %d\\n\",\n                     static_cast<int32_t>(encoder_out_shape[0]));\n    return {};\n  }\n\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  // encoder_out_shape[1] * 384 is the number of audio samples\n  // 16000 is the sample rate\n  //\n  //\n  // 384 is from the moonshine paper\n  int32_t max_len =\n      static_cast<int32_t>(encoder_out_shape[1] * 384 / 16000.0 * 15);\n\n  int64_t sos = 1;\n  int32_t eos = 2;\n  int32_t seq_len = 1;\n\n  std::vector<int32_t> tokens;\n\n  std::array<int64_t, 2> token_shape = {1, 1};\n\n  Ort::Value token_tensor = Ort::Value::CreateTensor(\n      memory_info, &sos, 1, token_shape.data(), token_shape.size());\n\n  Ort::Value logits{nullptr};\n  std::vector<Ort::Value> states = model_->GetDecoderInitStates();\n\n  // To fix the false alarm of clang-tidy\n  // error: 'states' used after it was moved\n  // [bugprone-use-after-move,-warnings-as-errors]\n  // we use a tmp_states here\n  std::vector<Ort::Value> tmp_states{std::move(states)};\n\n  std::tie(logits, states) = model_->ForwardDecoder(\n      std::move(token_tensor), View(&encoder_out), std::move(tmp_states));\n\n  int32_t vocab_size = logits.GetTensorTypeAndShapeInfo().GetShape()[2];\n\n  int64_t max_token_id;\n\n  for (int32_t i = 0; i != 
max_len; ++i) {\n    const float *p = logits.GetTensorData<float>();\n\n    max_token_id = static_cast<int64_t>(\n        std::distance(p, std::max_element(p, p + vocab_size)));\n\n    if (max_token_id == eos) {\n      break;\n    }\n\n    tokens.push_back(max_token_id);\n\n    seq_len += 1;\n\n    token_tensor = Ort::Value::CreateTensor(\n        memory_info, &max_token_id, 1, token_shape.data(), token_shape.size());\n\n    tmp_states = std::move(states);\n\n    std::tie(logits, states) = model_->ForwardDecoder(\n        std::move(token_tensor), View(&encoder_out), std::move(tmp_states));\n  }\n\n  OfflineMoonshineDecoderResult ans;\n  ans.tokens = std::move(tokens);\n\n  return {ans};\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.h\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_V2_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_V2_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-moonshine-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-model-v2.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineMoonshineV2GreedySearchDecoder : public OfflineMoonshineDecoder {\n public:\n  explicit OfflineMoonshineV2GreedySearchDecoder(OfflineMoonshineModelV2 *model)\n      : model_(model) {}\n\n  std::vector<OfflineMoonshineDecoderResult> Decode(\n      Ort::Value encoder_out) override;\n\n private:\n  OfflineMoonshineModelV2 *model_;  // not owned\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_MOONSHINE_V2_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineNemoEncDecCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"nemo-ctc-model\", &model,\n               \"Path to model.onnx of Nemo EncDecCtcModel.\");\n}\n\nbool OfflineNemoEncDecCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"NeMo model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineNemoEncDecCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineNemoEncDecCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineNemoEncDecCtcModelConfig {\n  std::string model;\n\n  OfflineNemoEncDecCtcModelConfig() = default;\n  explicit OfflineNemoEncDecCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineNemoEncDecCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.nemo_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.nemo_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::vector<int64_t> shape =\n        features_length.GetTensorTypeAndShapeInfo().GetShape();\n\n    Ort::Value out_features_length = Ort::Value::CreateTensor<int64_t>(\n        allocator_, shape.data(), shape.size());\n\n    const int64_t *src = features_length.GetTensorData<int64_t>();\n    int64_t *dst = out_features_length.GetTensorMutableData<int64_t>();\n    for (int64_t i = 0; i != shape[0]; ++i) {\n      dst[i] = src[i] / subsampling_factor_;\n    }\n\n    // (B, T, C) 
-> (B, C, T)\n    features = Transpose12(allocator_, &features);\n\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(2);\n    ans.push_back(std::move(out[0]));\n    ans.push_back(std::move(out_features_length));\n    return ans;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  std::string FeatureNormalizationMethod() const { return normalize_type_; }\n\n  bool IsGigaAM() const { return is_giga_am_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(normalize_type_,\n                                               \"normalize_type\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(is_giga_am_, \"is_giga_am\", 0);\n  }\n\n private:\n  
OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 0;\n  std::string normalize_type_;\n\n  // it is 1 for models from\n  // https://github.com/salute-developers/GigaAM\n  int32_t is_giga_am_ = 0;\n};\n\nOfflineNemoEncDecCtcModel::OfflineNemoEncDecCtcModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineNemoEncDecCtcModel::OfflineNemoEncDecCtcModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineNemoEncDecCtcModel::~OfflineNemoEncDecCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineNemoEncDecCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineNemoEncDecCtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\nint32_t OfflineNemoEncDecCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineNemoEncDecCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::string OfflineNemoEncDecCtcModel::FeatureNormalizationMethod() const {\n  return impl_->FeatureNormalizationMethod();\n}\n\nbool OfflineNemoEncDecCtcModel::IsGigaAM() const { return impl_->IsGigaAM(); }\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineNemoEncDecCtcModel::OfflineNemoEncDecCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineNemoEncDecCtcModel::OfflineNemoEncDecCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // 
namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the EncDecCTCModelBPE model from NeMo.\n *\n * See\n * https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/models/ctc_bpe_models.py\n * https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/models/ctc_models.py\n */\nclass OfflineNemoEncDecCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineNemoEncDecCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineNemoEncDecCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineNemoEncDecCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return the subsampling factor of the model\n   *\n   * For Citrinet, the subsampling factor is usually 4.\n   * For Conformer CTC, the subsampling factor is usually 8.\n   */\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // Feature normalization method read from the model metadata.\n  // Possible values:\n  // - per_feature\n  // - all_features (not implemented yet)\n  // - fixed_mean (not implemented)\n  // - fixed_std (not implemented)\n  // - or leave it empty\n  // See\n  // https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/parts/preprocessing/features.py#L59\n  // for details\n  std::string FeatureNormalizationMethod() const override;\n\n  // Return true for models from\n  // https://github.com/salute-developers/GigaAM\n  bool IsGigaAM() const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\nusing OfflineNemoEncDecHybridRNNTCTCBPEModel = OfflineNemoEncDecCtcModel;\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineOmnilingualAsrCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"omnilingual-asr-model\", &model,\n               \"Path to Omnilingual ASR CTC model\");\n}\n\nbool OfflineOmnilingualAsrCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"Omnilingual ASR CTC model file '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineOmnilingualAsrCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineOmnilingualAsrCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\n// for\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/omnilingual-asr/test.py\nstruct OfflineOmnilingualAsrCtcModelConfig {\n  std::string model;\n\n  OfflineOmnilingualAsrCtcModelConfig() = default;\n\n  explicit OfflineOmnilingualAsrCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineOmnilingualAsrCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config_.omnilingual.model), sess_opts_);\n    Init(nullptr, 0);\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.omnilingual.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value /*/features_length*/) {\n    auto out_vec =\n        sess_->Run({}, input_names_ptr_.data(), &features, 1,\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    std::vector<int64_t> logits_shape =\n        out_vec[0].GetTensorTypeAndShapeInfo().GetShape();\n\n    std::vector<int64_t> num_frames(logits_shape[0], logits_shape[1]);\n\n    int64_t 
shape = logits_shape[0];\n\n    Ort::Value logits_len =\n        Ort::Value::CreateTensor<int64_t>(allocator_, &shape, 1);\n    std::copy(num_frames.begin(), num_frames.end(),\n              logits_len.GetTensorMutableData<int64_t>());\n\n    out_vec.push_back(std::move(logits_len));\n\n    return out_vec;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  static void NormalizeFeatures(float *features, int32_t num_frames,\n                                int32_t feat_dim) {\n    if (num_frames != 1) {\n      SHERPA_ONNX_LOGE(\n          \"Unexpected error in collecting samples for Omnilingual ASR models!\");\n      return;\n    }\n\n    // Map the single-row feature vector\n    Eigen::Map<Eigen::ArrayXf> x(features, feat_dim);\n    float mean = x.mean();\n    float var = (x.square().mean() - mean * mean);\n    var = std::max(var, 0.0f);\n    float inv_stddev = 1.0f / std::sqrt(var + 1e-5f);\n\n    x = (x - mean) * inv_stddev;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    // For models with 1B parameters, weights are saved externally\n    // in model.weights\n    // We cannot create session from buffer in this case.\n    if (model_data) {\n      sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                             model_data_length, sess_opts_);\n    } else if (!sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize session outside of this \"\n          \"function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", 
os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    // get vocab size from the output[0].shape, which is (N, T, vocab_size)\n    vocab_size_ =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[2];\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n};\n\nOfflineOmnilingualAsrCtcModel::OfflineOmnilingualAsrCtcModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineOmnilingualAsrCtcModel::OfflineOmnilingualAsrCtcModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineOmnilingualAsrCtcModel::~OfflineOmnilingualAsrCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineOmnilingualAsrCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineOmnilingualAsrCtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nOrtAllocator *OfflineOmnilingualAsrCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nvoid OfflineOmnilingualAsrCtcModel::NormalizeFeatures(float *features,\n                                                      int32_t num_frames,\n                                                      int32_t feat_dim) const {\n  return impl_->NormalizeFeatures(features, num_frames, feat_dim);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineOmnilingualAsrCtcModel::OfflineOmnilingualAsrCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if 
__OHOS__\ntemplate OfflineOmnilingualAsrCtcModel::OfflineOmnilingualAsrCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_H_\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the Omnilingual ASR CTC model\n * from\n * https://github.com/facebookresearch/omnilingual-asr\n *\n * See\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/omnilingual-asr/export-onnx.py\n */\nclass OfflineOmnilingualAsrCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineOmnilingualAsrCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineOmnilingualAsrCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineOmnilingualAsrCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  void NormalizeFeatures(float *features, int32_t num_frames,\n                         int32_t feat_dim) const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-paraformer-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct OfflineParaformerDecoderResult {\n  /// The decoded token IDs\n  std::vector<int64_t> tokens;\n\n  // it contains the start time of each token in seconds\n  //\n  // len(timestamps) == len(tokens)\n  std::vector<float> timestamps;\n};\n\nclass OfflineParaformerDecoder {\n public:\n  virtual ~OfflineParaformerDecoder() = default;\n\n  /** Run beam search given the output from the paraformer model.\n   *\n   * @param log_probs A 3-D tensor of shape (N, T, vocab_size)\n   * @param token_num A 1-D tensor of shape (N). token_num equals to T.\n   *\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineParaformerDecoderResult> Decode(\n      Ort::Value log_probs, Ort::Value token_num,\n      Ort::Value us_cif_peak = Ort::Value(nullptr)) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineParaformerDecoderResult>\nOfflineParaformerGreedySearchDecoder::Decode(\n    Ort::Value log_probs, Ort::Value /*token_num*/,\n    Ort::Value us_cif_peak /*=Ort::Value(nullptr)*/\n) {\n  std::vector<int64_t> shape = log_probs.GetTensorTypeAndShapeInfo().GetShape();\n  int32_t batch_size = shape[0];\n  int32_t num_tokens = shape[1];\n  int32_t vocab_size = shape[2];\n\n  std::vector<OfflineParaformerDecoderResult> results(batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    const float *p =\n        log_probs.GetTensorData<float>() + i * num_tokens * vocab_size;\n    for (int32_t k = 0; k != num_tokens; ++k) {\n      auto max_idx = static_cast<int64_t>(\n          std::distance(p, std::max_element(p, p + vocab_size)));\n      if (max_idx == eos_id_) {\n        break;\n      }\n\n      results[i].tokens.push_back(max_idx);\n\n      p += vocab_size;\n    }\n\n    if (us_cif_peak) {\n      int32_t dim = us_cif_peak.GetTensorTypeAndShapeInfo().GetShape().back();\n\n      const auto *peak = us_cif_peak.GetTensorData<float>() + i * dim;\n      std::vector<float> timestamps;\n      timestamps.reserve(results[i].tokens.size());\n\n      // 10.0: frameshift is 10 milliseconds\n      // 6: LfrWindowSize\n      // 3: us_cif_peak is upsampled by a factor of 3\n      // 1000: milliseconds to seconds\n      float scale = 10.0 * 6 / 3 / 1000;\n\n      for (int32_t k = 0; k != dim; ++k) {\n        if (peak[k] > 1 - 1e-4) {\n          timestamps.push_back(k * scale);\n        }\n      }\n\n      if (!timestamps.empty()) {\n        timestamps.pop_back();\n      }\n\n      if (timestamps.size() == 
results[i].tokens.size()) {\n        results[i].timestamps = std::move(timestamps);\n      }\n    }\n  }\n\n  return results;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-paraformer-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerGreedySearchDecoder : public OfflineParaformerDecoder {\n public:\n  explicit OfflineParaformerGreedySearchDecoder(int32_t eos_id)\n      : eos_id_(eos_id) {}\n\n  std::vector<OfflineParaformerDecoderResult> Decode(\n      Ort::Value log_probs, Ort::Value token_num,\n      Ort::Value us_cif_peak = Ort::Value(nullptr)) override;\n\n private:\n  int32_t eos_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-paraformer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-paraformer-model-config.h\"\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineParaformerModelConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"paraformer\", &model,\n      \"Path to model.onnx of Paraformer. If you use Ascend NPU, it is \"\n      \"/path/to/encoder.om,/path/to/predictor.om,/path/to/decoder.om\"\n      \"If you use RK NPU, it is \"\n      \"/path/to/encoder.rknn,/path/to/predictor.rknn,/path/to/decoder.rknn\");\n\n  std::string prefix = \"paraformer\";\n  ParseOptions p(prefix, po);\n\n  qnn_config.Register(&p);\n}\n\nbool OfflineParaformerModelConfig::Validate() const {\n  if (EndsWith(model, \".onnx\")) {\n    if (!FileExists(model)) {\n      SHERPA_ONNX_LOGE(\"Paraformer model '%s' does not exist\", model.c_str());\n      return false;\n    }\n    return true;\n  }\n\n  if (EndsWith(model, \".om\")) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(model, \",\", false, &filenames);\n    if (filenames.size() != 3 || !EndsWith(filenames[0], \"encoder.om\") ||\n        !EndsWith(filenames[1], \"predictor.om\") ||\n        !EndsWith(filenames[2], \"decoder.om\")) {\n      SHERPA_ONNX_LOGE(\n          \"For Ascend NPU, you should pass \"\n          \"/path/to/encoder.om,/path/to/predictor.om,/path/to/decoder.om. 
\"\n          \"Given '%s'\",\n          model.c_str());\n      return false;\n    }\n\n    for (const auto &name : filenames) {\n      if (!FileExists(name)) {\n        SHERPA_ONNX_LOGE(\"Paraformer model '%s' does not exist\", name.c_str());\n        return false;\n      }\n    }\n\n    return true;\n  }\n\n  if (EndsWith(model, \".rknn\")) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(model, \",\", false, &filenames);\n    if (filenames.size() != 3 || !EndsWith(filenames[0], \"encoder.rknn\") ||\n        !EndsWith(filenames[1], \"predictor.rknn\") ||\n        !EndsWith(filenames[2], \"decoder.rknn\")) {\n      SHERPA_ONNX_LOGE(\n          \"For RKNN, you should pass \"\n          \"/path/encoder.rknn,/path/predictor.rknn,/path/decoder.rknn. \"\n          \"Given '%s'\",\n          model.c_str());\n      return false;\n    }\n\n    for (const auto &name : filenames) {\n      if (!FileExists(name)) {\n        SHERPA_ONNX_LOGE(\"Paraformer model '%s' does not exist\", name.c_str());\n        return false;\n      }\n    }\n\n    return true;\n  }\n\n  if (EndsWith(model, \".so\")) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(model, \",\", false, &filenames);\n    if (filenames.size() != 3 || !EndsWith(filenames[0], \"encoder.so\") ||\n        !EndsWith(filenames[1], \"predictor.so\") ||\n        !EndsWith(filenames[2], \"decoder.so\")) {\n      SHERPA_ONNX_LOGE(\n          \"For QNN, you should pass \"\n          \"/path/libencoder.so,/path/libpredictor.so,/path/libdecoder.so. 
\"\n          \"Given '%s'\",\n          model.c_str());\n      return false;\n    }\n\n    for (const auto &name : filenames) {\n      if (!FileExists(name)) {\n        SHERPA_ONNX_LOGE(\"Paraformer model '%s' does not exist\", name.c_str());\n        return false;\n      }\n    }\n\n    if (!qnn_config.Validate()) {\n      return false;\n    }\n\n    return true;\n  }\n\n  if (model.empty() && !qnn_config.context_binary.empty()) {\n    // we require that the context_binary exists\n    if (!FileExists(qnn_config.context_binary)) {\n      SHERPA_ONNX_LOGE(\n          \"Model is empty, but you provide a context binary that does not \"\n          \"exist\");\n      return false;\n    }\n\n    std::vector<std::string> filenames;\n    SplitStringToVector(model, \",\", false, &filenames);\n    if (filenames.size() != 3) {\n      SHERPA_ONNX_LOGE(\n          \"For Paraformer with QNN, you should pass \"\n          \"/path/encoder.bin,/path/predictor.bin,/path/decoder.bin\"\n          \"Given '%s'\",\n          model.c_str());\n      return false;\n    }\n\n    for (const auto &name : filenames) {\n      if (!FileExists(name)) {\n        SHERPA_ONNX_LOGE(\"Paraformer context binary '%s' does not exist\",\n                         name.c_str());\n        return false;\n      }\n    }\n\n    if (!qnn_config.Validate()) {\n      return false;\n    }\n\n    return true;\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"Please pass *.onnx, *.om, *.rknn, or *.so models. Given '%s'\",\n      model.c_str());\n  return false;\n}\n\nstd::string OfflineParaformerModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineParaformerModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\"\";\n\n  if (!qnn_config.backend_lib.empty()) {\n    os << \", qnn_config=\" << qnn_config.ToString();\n  }\n\n  os << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-paraformer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/qnn-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineParaformerModelConfig {\n  // for ascend npu,\n  // model is \"/path/to/encoder.om,/path/to/predictor.om,/path/to/decoder.om\"\n  //\n  // for rknn,\n  // model is\n  // \"/path/to/encoder.rknn,/path/to/predictor.rknn,/path/to/decoder.rknn\"\n  //\n  // for qnn with shared libs, model is\n  // model is\n  // \"/path/to/libencoder.so,/path/to/libpredictor.so,/path/to/libdecoder.so\"\n  std::string model;\n\n  QnnConfig qnn_config;\n\n  OfflineParaformerModelConfig() = default;\n  explicit OfflineParaformerModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-model.cc",
    "content": "// sherpa-onnx/csrc/offline-paraformer-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-paraformer-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.paraformer.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.paraformer.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t LfrWindowSize() const { return lfr_window_size_; }\n\n  int32_t LfrWindowShift() const { return lfr_window_shift_; }\n\n  const std::vector<float> 
&NegativeMean() const { return neg_mean_; }\n\n  const std::vector<float> &InverseStdDev() const { return inv_stddev_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_size_, \"lfr_window_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_shift_, \"lfr_window_shift\");\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(neg_mean_, \"neg_mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(inv_stddev_, \"inv_stddev\");\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  std::vector<float> neg_mean_;\n  std::vector<float> inv_stddev_;\n\n  int32_t vocab_size_ = 0;  // initialized in Init\n  int32_t lfr_window_size_ = 0;\n  int32_t lfr_window_shift_ = 0;\n};\n\nOfflineParaformerModel::OfflineParaformerModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) 
{}\n\ntemplate <typename Manager>\nOfflineParaformerModel::OfflineParaformerModel(Manager *mgr,\n                                               const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineParaformerModel::~OfflineParaformerModel() = default;\n\nstd::vector<Ort::Value> OfflineParaformerModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineParaformerModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OfflineParaformerModel::LfrWindowSize() const {\n  return impl_->LfrWindowSize();\n}\nint32_t OfflineParaformerModel::LfrWindowShift() const {\n  return impl_->LfrWindowShift();\n}\nconst std::vector<float> &OfflineParaformerModel::NegativeMean() const {\n  return impl_->NegativeMean();\n}\nconst std::vector<float> &OfflineParaformerModel::InverseStdDev() const {\n  return impl_->InverseStdDev();\n}\n\nOrtAllocator *OfflineParaformerModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineParaformerModel::OfflineParaformerModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineParaformerModel::OfflineParaformerModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-paraformer-model.h",
    "content": "// sherpa-onnx/csrc/offline-paraformer-model.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_H_\n\n#include <memory>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModel {\n public:\n  explicit OfflineParaformerModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineParaformerModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineParaformerModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C). It is changed in-place.\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int32_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size)\n   *  - token_num: A 1-D tensor of shape (N,) containing the number\n   *               of valid tokens in each utterance. 
Its dtype is int64_t.\n   *  If it is a model supporting timestamps, then there are additional two\n   *  outputs:\n   *   - us_alphas\n   *   - us_cif_peak\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length);\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const;\n\n  /** It is lfr_m in config.yaml\n   */\n  int32_t LfrWindowSize() const;\n\n  /** It is lfr_n in config.yaml\n   */\n  int32_t LfrWindowShift() const;\n\n  /** Return negative mean for CMVN\n   */\n  const std::vector<float> &NegativeMean() const;\n\n  /** Return inverse stddev for CMVN\n   */\n  const std::vector<float> &InverseStdDev() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PARAFORMER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation-ct-transformer-impl.h",
    "content": "// sherpa-onnx/csrc/offline-punctuation-ct-transformer-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_CT_TRANSFORMER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_CT_TRANSFORMER_IMPL_H_\n\n#include <math.h>\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-ct-transformer-model.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation-impl.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflinePunctuationCtTransformerImpl : public OfflinePunctuationImpl {\n public:\n  explicit OfflinePunctuationCtTransformerImpl(\n      const OfflinePunctuationConfig &config)\n      : config_(config), model_(config.model) {}\n\n  template <typename Manager>\n  OfflinePunctuationCtTransformerImpl(Manager *mgr,\n                                      const OfflinePunctuationConfig &config)\n      : config_(config), model_(mgr, config.model) {}\n\n  std::string AddPunctuation(const std::string &text) const override {\n    if (text.empty()) {\n      return {};\n    }\n\n    std::vector<std::string> tokens = SplitUtf8(text);\n    std::vector<int32_t> token_ids;\n    token_ids.reserve(tokens.size());\n\n    const auto &meta_data = model_.GetModelMetadata();\n\n    for (const auto &t : tokens) {\n      std::string token = ToLowerCase(t);\n      if (meta_data.token2id.count(token)) {\n        token_ids.push_back(meta_data.token2id.at(token));\n      } else {\n        token_ids.push_back(meta_data.unk_id);\n      }\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t segment_size = 20;\n    int32_t max_len = 200;\n    int32_t num_segments =\n        ceil((static_cast<float>(token_ids.size()) + segment_size - 1) /\n       
      segment_size);\n\n    std::vector<int32_t> punctuations;\n    int32_t last = -1;\n    for (int32_t i = 0; i != num_segments; ++i) {\n      int32_t this_start = i * segment_size;         // included\n      int32_t this_end = this_start + segment_size;  // not included\n      if (this_end > static_cast<int32_t>(token_ids.size())) {\n        this_end = token_ids.size();\n      }\n\n      if (last != -1) {\n        this_start = last;\n      }\n      // token_ids[this_start:this_end] is sent to the model\n\n      std::array<int64_t, 2> x_shape = {1, this_end - this_start};\n      Ort::Value x =\n          Ort::Value::CreateTensor(memory_info, token_ids.data() + this_start,\n                                   x_shape[1], x_shape.data(), x_shape.size());\n\n      int64_t len_shape = 1;\n      int32_t len = x_shape[1];\n      Ort::Value x_len =\n          Ort::Value::CreateTensor(memory_info, &len, 1, &len_shape, 1);\n\n      Ort::Value out = model_.Forward(std::move(x), std::move(x_len));\n\n      // [N, T, num_punctuations]\n      std::vector<int64_t> out_shape =\n          out.GetTensorTypeAndShapeInfo().GetShape();\n\n      assert(out_shape[0] == 1);\n      assert(out_shape[1] == len);\n      assert(out_shape[2] == meta_data.num_punctuations);\n\n      std::vector<int32_t> this_punctuations;\n      this_punctuations.reserve(len);\n\n      const float *p = out.GetTensorData<float>();\n      for (int32_t k = 0; k != len; ++k, p += meta_data.num_punctuations) {\n        auto index = static_cast<int32_t>(std::distance(\n            p, std::max_element(p, p + meta_data.num_punctuations)));\n        this_punctuations.push_back(index);\n      }  // for (int32_t k = 0; k != len; ++k, p += meta_data.num_punctuations)\n\n      int32_t dot_index = -1;\n      int32_t comma_index = -1;\n\n      for (int32_t m = static_cast<int32_t>(this_punctuations.size()) - 2;\n           m >= 1; --m) {\n        int32_t punct_id = this_punctuations[m];\n\n        if (punct_id == 
meta_data.dot_id || punct_id == meta_data.quest_id) {\n          dot_index = m;\n          break;\n        }\n\n        if (comma_index == -1 && punct_id == meta_data.comma_id) {\n          comma_index = m;\n        }\n      }  // for (int32_t m = this_punctuations.size() - 2; m >= 1; --m)\n\n      if (dot_index == -1 && len >= max_len && comma_index != -1) {\n        dot_index = comma_index;\n        this_punctuations[dot_index] = meta_data.dot_id;\n      }\n\n      if (dot_index == -1) {\n        if (last == -1) {\n          last = this_start;\n        }\n\n        if (i == num_segments - 1) {\n          dot_index = static_cast<int32_t>(this_punctuations.size()) - 1;\n        }\n      } else {\n        last = this_start + dot_index + 1;\n      }\n\n      if (dot_index != -1) {\n        punctuations.insert(punctuations.end(), this_punctuations.begin(),\n                            this_punctuations.begin() + (dot_index + 1));\n      }\n    }  // for (int32_t i = 0; i != num_segments; ++i)\n\n    if (punctuations.empty()) {\n      return text + meta_data.id2punct[meta_data.dot_id];\n    }\n    std::vector<std::string> words_punct;\n\n    for (int32_t i = 0; i != static_cast<int32_t>(punctuations.size()); ++i) {\n      if (i >= static_cast<int32_t>(tokens.size())) {\n        break;\n      }\n      std::string &w = tokens[i];\n      if (i > 0 && !(words_punct.back()[0] & 0x80) && !(w[0] & 0x80)) {\n        words_punct.push_back(\" \");\n      }\n      words_punct.push_back(std::move(w));\n\n      if (punctuations[i] != meta_data.underline_id) {\n        words_punct.push_back(meta_data.id2punct[punctuations[i]]);\n      }\n    }\n\n    if (words_punct.back() == meta_data.id2punct[meta_data.comma_id] ||\n        words_punct.back() == meta_data.id2punct[meta_data.pause_id]) {\n      words_punct.back() = meta_data.id2punct[meta_data.dot_id];\n    }\n\n    if (words_punct.back() != meta_data.id2punct[meta_data.dot_id] &&\n        words_punct.back() != 
meta_data.id2punct[meta_data.quest_id]) {\n      words_punct.push_back(meta_data.id2punct[meta_data.dot_id]);\n    }\n\n    std::string ans;\n    for (const auto &w : words_punct) {\n      ans.append(w);\n    }\n    return ans;\n  }\n\n private:\n  OfflinePunctuationConfig config_;\n  OfflineCtTransformerModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_CT_TRANSFORMER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-punctuation-impl.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-punctuation-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation-ct-transformer-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflinePunctuationImpl> OfflinePunctuationImpl::Create(\n    const OfflinePunctuationConfig &config) {\n  if (!config.model.ct_transformer.empty()) {\n    return std::make_unique<OfflinePunctuationCtTransformerImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a punctuation model! Return a null pointer\");\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflinePunctuationImpl> OfflinePunctuationImpl::Create(\n    Manager *mgr, const OfflinePunctuationConfig &config) {\n  if (!config.model.ct_transformer.empty()) {\n    return std::make_unique<OfflinePunctuationCtTransformerImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a punctuation model! Return a null pointer\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflinePunctuationImpl> OfflinePunctuationImpl::Create(\n    AAssetManager *mgr, const OfflinePunctuationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflinePunctuationImpl> OfflinePunctuationImpl::Create(\n    NativeResourceManager *mgr, const OfflinePunctuationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation-impl.h",
    "content": "// sherpa-onnx/csrc/offline-punctuation-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflinePunctuationImpl {\n public:\n  virtual ~OfflinePunctuationImpl() = default;\n\n  static std::unique_ptr<OfflinePunctuationImpl> Create(\n      const OfflinePunctuationConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflinePunctuationImpl> Create(\n      Manager *mgr, const OfflinePunctuationConfig &config);\n\n  virtual std::string AddPunctuation(const std::string &text) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-punctuation-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-punctuation-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflinePunctuationModelConfig::Register(ParseOptions *po) {\n  po->Register(\"ct-transformer\", &ct_transformer,\n               \"Path to the controllable time-delay (CT) transformer model\");\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OfflinePunctuationModelConfig::Validate() const {\n  if (ct_transformer.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --ct-transformer\");\n    return false;\n  }\n\n  if (!FileExists(ct_transformer)) {\n    SHERPA_ONNX_LOGE(\"--ct-transformer %s does not exist\",\n                     ct_transformer.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflinePunctuationModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflinePunctuationModelConfig(\";\n  os << \"ct_transformer=\\\"\" << ct_transformer << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-punctuation-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflinePunctuationModelConfig {\n  std::string ct_transformer;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OfflinePunctuationModelConfig() = default;\n\n  OfflinePunctuationModelConfig(const std::string &ct_transformer,\n                                int32_t num_threads, bool debug,\n                                const std::string &provider)\n      : ct_transformer(ct_transformer),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation.cc",
    "content": "// sherpa-onnx/csrc/offline-punctuation.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-punctuation-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflinePunctuationConfig::Register(ParseOptions *po) {\n  model.Register(po);\n}\n\nbool OfflinePunctuationConfig::Validate() const {\n  if (!model.Validate()) {\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflinePunctuationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflinePunctuationConfig(\";\n  os << \"model=\" << model.ToString() << \")\";\n\n  return os.str();\n}\n\nOfflinePunctuation::OfflinePunctuation(const OfflinePunctuationConfig &config)\n    : impl_(OfflinePunctuationImpl::Create(config)) {}\n\ntemplate <typename Manager>\nOfflinePunctuation::OfflinePunctuation(Manager *mgr,\n                                       const OfflinePunctuationConfig &config)\n    : impl_(OfflinePunctuationImpl::Create(mgr, config)) {}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflinePunctuation::OfflinePunctuation(\n    AAssetManager *mgr, const OfflinePunctuationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflinePunctuation::OfflinePunctuation(\n    NativeResourceManager *mgr, const OfflinePunctuationConfig &config);\n#endif\n\nOfflinePunctuation::~OfflinePunctuation() = default;\n\nstd::string OfflinePunctuation::AddPunctuation(const std::string &text) const {\n  return impl_->AddPunctuation(text);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-punctuation.h",
    "content": "// sherpa-onnx/csrc/offline-punctuation.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-punctuation-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflinePunctuationConfig {\n  OfflinePunctuationModelConfig model;\n\n  OfflinePunctuationConfig() = default;\n\n  explicit OfflinePunctuationConfig(const OfflinePunctuationModelConfig &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OfflinePunctuationImpl;\n\nclass OfflinePunctuation {\n public:\n  explicit OfflinePunctuation(const OfflinePunctuationConfig &config);\n\n  template <typename Manager>\n  OfflinePunctuation(Manager *mgr, const OfflinePunctuationConfig &config);\n\n  ~OfflinePunctuation();\n\n  // Add punctuation to the input text and return it.\n  std::string AddPunctuation(const std::string &text) const;\n\n private:\n  std::unique_ptr<OfflinePunctuationImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_PUNCTUATION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-canary-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-canary-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CANARY_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CANARY_IMPL_H_\n\n#include <algorithm>\n#include <ios>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-canary-model.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRecognizerCanaryImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerCanaryImpl(const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineCanaryModel>(config_.model_config)) {\n    PostInit();\n  }\n\n  template <typename Manager>\n  explicit OfflineRecognizerCanaryImpl(Manager *mgr,\n                                       const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(\n            std::make_unique<OfflineCanaryModel>(mgr, config_.model_config)) {\n    PostInit();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  void DecodeStream(OfflineStream *s) const {\n    auto meta = model_->GetModelMetadata();\n    auto enc_out = RunEncoder(s);\n    Ort::Value enc_states = std::move(enc_out[0]);\n    Ort::Value 
enc_mask = std::move(enc_out[2]);\n    // enc_out[1] is discarded\n    std::vector<int32_t> decoder_input = GetInitialDecoderInput();\n    auto decoder_states = model_->GetInitialDecoderStates();\n    Ort::Value logits{nullptr};\n\n    for (int32_t i = 0; i < decoder_input.size(); ++i) {\n      std::tie(logits, decoder_states) =\n          RunDecoder(decoder_input[i], i, std::move(decoder_states),\n                     View(&enc_states), View(&enc_mask));\n    }\n\n    int32_t max_token_id = GetMaxTokenId(&logits);\n    int32_t eos = symbol_table_[\"<|endoftext|>\"];\n\n    int32_t num_feature_frames =\n        enc_states.GetTensorTypeAndShapeInfo().GetShape()[1] *\n        meta.subsampling_factor;\n\n    std::vector<int32_t> tokens = {max_token_id};\n\n    // Assume 30 tokens per second. It is to avoid the following for loop\n    // running indefinitely.\n    int32_t num_tokens =\n        static_cast<int32_t>(num_feature_frames / 100.0 * 30) + 1;\n\n    for (int32_t i = 1; i <= num_tokens; ++i) {\n      if (tokens.back() == eos) {\n        break;\n      }\n\n      std::tie(logits, decoder_states) =\n          RunDecoder(tokens.back(), i, std::move(decoder_states),\n                     View(&enc_states), View(&enc_mask));\n      tokens.push_back(GetMaxTokenId(&logits));\n    }\n\n    // remove the last eos token\n    tokens.pop_back();\n\n    auto r = Convert(tokens);\n\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n\n    s->SetResult(r);\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n  void SetConfig(const OfflineRecognizerConfig &config) override {\n    config_.model_config.canary.src_lang = config.model_config.canary.src_lang;\n    config_.model_config.canary.tgt_lang = config.model_config.canary.tgt_lang;\n    config_.model_config.canary.use_pnc = config.model_config.canary.use_pnc;\n\n    // we don't change the config_ in the base class\n  }\n\n 
private:\n  OfflineRecognitionResult Convert(const std::vector<int32_t> &tokens) const {\n    OfflineRecognitionResult r;\n    r.tokens.reserve(tokens.size());\n\n    std::string text;\n    for (auto i : tokens) {\n      if (!symbol_table_.Contains(i)) {\n        continue;\n      }\n\n      const auto &s = symbol_table_[i];\n      text += s;\n      r.tokens.push_back(s);\n    }\n\n    r.text = std::move(text);\n\n    return r;\n  }\n\n  int32_t GetMaxTokenId(Ort::Value *logits) const {\n    // logits is of shape (1, 1, vocab_size)\n    auto meta = model_->GetModelMetadata();\n    const float *p_logits = logits->GetTensorData<float>();\n\n    int32_t max_token_id = static_cast<int32_t>(std::distance(\n        p_logits, std::max_element(p_logits, p_logits + meta.vocab_size)));\n\n    return max_token_id;\n  }\n\n  std::vector<Ort::Value> RunEncoder(OfflineStream *s) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = config_.feat_config.feature_dim;\n    std::vector<float> f = s->GetFrames();\n\n    int32_t num_frames = f.size() / feat_dim;\n\n    std::array<int64_t, 3> shape = {1, num_frames, feat_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    int64_t x_length_scalar = num_frames;\n    std::array<int64_t, 1> x_length_shape = {1};\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &x_length_scalar, 1,\n                                 x_length_shape.data(), x_length_shape.size());\n    return model_->ForwardEncoder(std::move(x), std::move(x_length));\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunDecoder(\n      int32_t token, int32_t pos, std::vector<Ort::Value> decoder_states,\n      Ort::Value enc_states, Ort::Value enc_mask) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, 
OrtMemTypeDefault);\n\n    std::array<int64_t, 2> shape = {1, 2};\n    std::array<int32_t, 2> _decoder_input = {token, pos};\n\n    Ort::Value decoder_input = Ort::Value::CreateTensor(\n        memory_info, _decoder_input.data(), _decoder_input.size(), shape.data(),\n        shape.size());\n\n    return model_->ForwardDecoder(std::move(decoder_input),\n                                  std::move(decoder_states),\n                                  std::move(enc_states), std::move(enc_mask));\n  }\n\n  // see\n  // https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/nemo/canary/test_180m_flash.py#L242\n  std::vector<int32_t> GetInitialDecoderInput() const {\n    auto canary_config = config_.model_config.canary;\n    const auto &meta = model_->GetModelMetadata();\n\n    std::vector<int32_t> decoder_input(9);\n    decoder_input[0] = symbol_table_[\"<|startofcontext|>\"];\n    decoder_input[1] = symbol_table_[\"<|startoftranscript|>\"];\n    decoder_input[2] = symbol_table_[\"<|emo:undefined|>\"];\n\n    if (canary_config.src_lang.empty() ||\n        !meta.lang2id.count(canary_config.src_lang)) {\n      decoder_input[3] = meta.lang2id.at(\"en\");\n    } else {\n      decoder_input[3] = meta.lang2id.at(canary_config.src_lang);\n    }\n\n    if (canary_config.tgt_lang.empty() ||\n        !meta.lang2id.count(canary_config.tgt_lang)) {\n      decoder_input[4] = meta.lang2id.at(\"en\");\n    } else {\n      decoder_input[4] = meta.lang2id.at(canary_config.tgt_lang);\n    }\n\n    if (canary_config.use_pnc) {\n      decoder_input[5] = symbol_table_[\"<|pnc|>\"];\n    } else {\n      decoder_input[5] = symbol_table_[\"<|nopnc|>\"];\n    }\n\n    decoder_input[6] = symbol_table_[\"<|noitn|>\"];\n    decoder_input[7] = symbol_table_[\"<|notimestamp|>\"];\n    decoder_input[8] = symbol_table_[\"<|nodiarize|>\"];\n\n    return decoder_input;\n  }\n\n private:\n  void PostInit() {\n    auto &meta = model_->GetModelMetadata();\n    config_.feat_config.feature_dim = 
meta.feat_dim;\n\n    config_.feat_config.nemo_normalize_type = meta.normalize_type;\n\n    config_.feat_config.dither = 0;\n    config_.feat_config.remove_dc_offset = false;\n    config_.feat_config.low_freq = 0;\n    config_.feat_config.window_type = \"hann\";\n    config_.feat_config.is_librosa = true;\n\n    meta.lang2id[\"en\"] = symbol_table_[\"<|en|>\"];\n    meta.lang2id[\"es\"] = symbol_table_[\"<|es|>\"];\n    meta.lang2id[\"de\"] = symbol_table_[\"<|de|>\"];\n    meta.lang2id[\"fr\"] = symbol_table_[\"<|fr|>\"];\n\n    if (symbol_table_.NumSymbols() != meta.vocab_size) {\n      SHERPA_ONNX_LOGE(\"number of lines in tokens.txt %d != %d (vocab_size)\",\n                       symbol_table_.NumSymbols(), meta.vocab_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineCanaryModel> model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CANARY_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-ctc-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-ctc-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CTC_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CTC_IMPL_H_\n\n#include <ios>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nOfflineRecognitionResult Convert(const OfflineCtcDecoderResult &src,\n                                 const SymbolTable &sym_table,\n                                 int32_t frame_shift_ms,\n                                 int32_t subsampling_factor) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.timestamps.size());\n\n  std::string text;\n\n  for (int32_t i = 0; i != src.tokens.size(); ++i) {\n    if (sym_table.Contains(\"SIL\") && src.tokens[i] == sym_table[\"SIL\"]) {\n      // tdnn models from yesno have a SIL token, we should remove it.\n      continue;\n    }\n\n    if (sym_table.Contains(\"</s>\") && src.tokens[i] == sym_table[\"</s>\"]) {\n      // Skip </s> for Google MedASR\n      continue;\n    }\n    auto sym = sym_table[src.tokens[i]];\n    text.append(sym);\n\n    if (sym.size() == 1 && (sym[0] < 0x20 || sym[0] > 0x7e)) {\n      // for bpe models with byte_fallback\n      // (but don't rewrite printable characters 0x20..0x7e,\n      //  which collide with standard BPE units)\n      std::ostringstream os;\n      os << \"<0x\" << std::hex << std::uppercase\n         << (static_cast<int32_t>(sym[0]) & 0xff) << 
\">\";\n      sym = os.str();\n    }\n\n    r.tokens.push_back(std::move(sym));\n  }\n\n  if (sym_table.IsByteBpe()) {\n    text = sym_table.DecodeByteBpe(text);\n  }\n\n  if (!text.empty() && text.front() == ' ') {\n    text.erase(0, 1);\n  }\n\n  r.text = std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. * subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  r.words = std::move(src.words);\n\n  return r;\n}\n\nclass OfflineRecognizerCtcImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerCtcImpl(const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(OfflineCtcModel::Create(config_.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerCtcImpl(Manager *mgr, const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(OfflineCtcModel::Create(mgr, config_.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    if (!config_.model_config.telespeech_ctc.empty()) {\n      config_.feat_config.snip_edges = true;\n      config_.feat_config.num_ceps = 40;\n      config_.feat_config.feature_dim = 40;\n      config_.feat_config.low_freq = 40;\n      config_.feat_config.high_freq = -200;\n      config_.feat_config.use_energy = false;\n      config_.feat_config.normalize_samples = false;\n      config_.feat_config.is_mfcc = true;\n    }\n\n    if (!config_.model_config.nemo_ctc.model.empty()) {\n      if (model_->IsGigaAM()) {\n        config_.feat_config.low_freq = 0;\n        config_.feat_config.high_freq = 8000;\n        config_.feat_config.remove_dc_offset = false;\n        config_.feat_config.preemph_coeff = 0;\n        config_.feat_config.window_type = \"hann\";\n        
config_.feat_config.feature_dim = 64;\n\n        // see\n        // https://github.com/salute-developers/GigaAM/blob/main/gigaam/preprocess.py#L68\n        //\n        // GigaAM uses n_fft 400\n        config_.feat_config.round_to_power_of_two = false;\n      } else {\n        config_.feat_config.low_freq = 0;\n        config_.feat_config.high_freq = 0;\n        config_.feat_config.is_librosa = true;\n        config_.feat_config.remove_dc_offset = false;\n        config_.feat_config.window_type = \"hann\";\n      }\n    }\n\n    if (!config_.model_config.dolphin.model.empty()) {\n      config_.feat_config.low_freq = 0;\n      config_.feat_config.high_freq = 8000;\n      config_.feat_config.remove_dc_offset = false;\n      config_.feat_config.dither = 0;\n      config_.feat_config.preemph_coeff = 0;\n      config_.feat_config.window_type = \"hann\";\n      config_.feat_config.feature_dim = 80;\n      config_.feat_config.is_librosa = true;\n      config_.feat_config.frame_length_ms = 31.25;  // 16000/512 = 31.25\n      config_.feat_config.snip_edges = false;\n    }\n\n    if (!config_.model_config.wenet_ctc.model.empty()) {\n      // WeNet CTC models assume input samples are in the range\n      // [-32768, 32767], so we set normalize_samples to false\n      config_.feat_config.normalize_samples = false;\n      config_.feat_config.dither = 1;\n    }\n\n    if (!config_.model_config.medasr.model.empty()) {\n      config_.feat_config.low_freq = 125;\n      config_.feat_config.high_freq = 7500;\n      config_.feat_config.remove_dc_offset = false;\n      config_.feat_config.dither = 0;\n      config_.feat_config.preemph_coeff = 0;\n      config_.feat_config.window_type = \"hanning\";\n      config_.feat_config.feature_dim = 128;\n      config_.feat_config.snip_edges = true;\n    }\n\n    if (!config_.model_config.fire_red_asr_ctc.model.empty()) {\n      config_.feat_config.normalize_samples = false;\n      config_.feat_config.high_freq = 0;\n      
config_.feat_config.snip_edges = true;\n    }\n\n    config_.feat_config.nemo_normalize_type =\n        model_->FeatureNormalizationMethod();\n\n    if (!config_.ctc_fst_decoder_config.graph.empty()) {\n      // TODO(fangjun): Support android to read the graph from\n      // asset_manager\n      decoder_ = std::make_unique<OfflineCtcFstDecoder>(\n          config_.ctc_fst_decoder_config);\n    } else if (config_.decoding_method == \"greedy_search\") {\n      if (!symbol_table_.Contains(\"<blk>\") &&\n          !symbol_table_.Contains(\"<eps>\") &&\n          !symbol_table_.Contains(\"<blank>\") &&\n          config_.model_config.omnilingual.model.empty()) {\n        // for omnilingual asr, its blank id is 0\n        SHERPA_ONNX_LOGE(\n            \"We expect that tokens.txt contains \"\n            \"the symbol <blk> or <eps> or <blank> and its ID.\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      int32_t blank_id = 0;\n      if (symbol_table_.Contains(\"<blk>\")) {\n        blank_id = symbol_table_[\"<blk>\"];\n      } else if (symbol_table_.Contains(\"<eps>\")) {\n        // for tdnn models of the yesno recipe from icefall\n        blank_id = symbol_table_[\"<eps>\"];\n      } else if (symbol_table_.Contains(\"<blank>\")) {\n        // for Wenet CTC models\n        blank_id = symbol_table_[\"<blank>\"];\n      }\n\n      decoder_ = std::make_unique<OfflineCtcGreedySearchDecoder>(blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    if (config_.model_config.omnilingual.model.empty()) {\n      return std::make_unique<OfflineStream>(config_.feat_config);\n    } else {\n      return std::make_unique<OfflineStream>(OmnilingualAsrTag{});\n    }\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    if (!model_->SupportBatchProcessing() || (n == 1) ||\n        !config_.model_config.omnilingual.model.empty()) {\n      // If the model does not support batch processing,\n      // we process each stream independently.\n      //\n      // omnilingual asr is disabled for batch processing at present\n      for (int32_t i = 0; i != n; ++i) {\n        DecodeStream(ss[i]);\n      }\n      return;\n    }\n\n    // Even if the omnilingual asr model can process batch input, the following\n    // code does not support batching raw audio samples.\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = ss[0]->FeatureDim();\n\n    std::vector<Ort::Value> features;\n    features.reserve(n);\n\n    std::vector<std::vector<float>> features_vec(n);\n    std::vector<int64_t> features_length_vec(n);\n\n    for (int32_t i = 0; i != n; ++i) {\n      std::vector<float> f = ss[i]->GetFrames();\n\n      int32_t num_frames = f.size() / feat_dim;\n\n      model_->NormalizeFeatures(f.data(), num_frames, feat_dim);\n\n      features_vec[i] = std::move(f);\n\n      features_length_vec[i] = num_frames;\n\n      std::array<int64_t, 2> shape = {num_frames, feat_dim};\n\n      Ort::Value x = Ort::Value::CreateTensor(\n          memory_info, features_vec[i].data(), features_vec[i].size(),\n          shape.data(), shape.size());\n      features.push_back(std::move(x));\n    }  // for (int32_t i = 0; i != n; ++i)\n\n    std::vector<const Ort::Value *> 
features_pointer(n);\n    for (int32_t i = 0; i != n; ++i) {\n      features_pointer[i] = &features[i];\n    }\n\n    std::array<int64_t, 1> features_length_shape = {n};\n    Ort::Value x_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    // Pad with -23.025850929940457, which is log(1e-10).\n    Ort::Value x = PadSequence(model_->Allocator(), features_pointer,\n                               -23.025850929940457f);\n    auto t = model_->Forward(std::move(x), std::move(x_length));\n\n    auto results = decoder_->Decode(std::move(t[0]), std::move(t[1]));\n\n    int32_t frame_shift_ms = 10;\n    for (int32_t i = 0; i != n; ++i) {\n      auto r = Convert(results[i], symbol_table_, frame_shift_ms,\n                       model_->SubsamplingFactor());\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n      ss[i]->SetResult(r);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  // Decode a single stream.\n  // Some models do not support batch size > 1, e.g., WeNet CTC models.\n  void DecodeStream(OfflineStream *s) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = s->FeatureDim();\n    std::vector<float> f = s->GetFrames();\n\n    int32_t num_frames = f.size() / feat_dim;\n\n    model_->NormalizeFeatures(f.data(), num_frames, feat_dim);\n\n    std::vector<int64_t> shape = {1, num_frames, feat_dim};\n    if (!config_.model_config.omnilingual.model.empty()) {\n      shape = {1, feat_dim};\n    }\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    int64_t x_length_scalar = num_frames;\n    std::array<int64_t, 1> x_length_shape = {1};\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, 
&x_length_scalar, 1,\n                                 x_length_shape.data(), x_length_shape.size());\n\n    auto t = model_->Forward(std::move(x), std::move(x_length));\n    auto results = decoder_->Decode(std::move(t[0]), std::move(t[1]));\n    int32_t frame_shift_ms = 10;\n\n    if (!config_.model_config.omnilingual.model.empty()) {\n      frame_shift_ms = 20;\n    }\n\n    auto r = Convert(results[0], symbol_table_, frame_shift_ms,\n                     model_->SubsamplingFactor());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineCtcModel> model_;\n  std::unique_ptr<OfflineCtcDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_CTC_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FIRE_RED_ASR_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FIRE_RED_ASR_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nstatic OfflineRecognitionResult Convert(\n    const OfflineFireRedAsrDecoderResult &src, const SymbolTable &sym_table) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    if (!sym_table.Contains(i)) {\n      continue;\n    }\n\n    const auto &s = sym_table[i];\n    text += s;\n    r.tokens.push_back(s);\n  }\n\n  r.text = std::move(text);\n\n  return r;\n}\n\nclass OfflineRecognizerFireRedAsrImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerFireRedAsrImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineFireRedAsrModel>(config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerFireRedAsrImpl(Manager *mgr,\n                                  const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, 
config_.model_config.tokens),\n        model_(std::make_unique<OfflineFireRedAsrModel>(mgr,\n                                                        config.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ =\n          std::make_unique<OfflineFireRedAsrGreedySearchDecoder>(model_.get());\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only greedy_search is supported at present for FireRedAsr. Given %s\",\n          config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    config_.feat_config.normalize_samples = false;\n    config_.feat_config.high_freq = 0;\n    config_.feat_config.snip_edges = true;\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // batch decoding is not implemented yet\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void DecodeStream(OfflineStream *s) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = s->FeatureDim();\n    std::vector<float> f = s->GetFrames();\n    ApplyCMVN(&f);\n\n    int64_t num_frames = f.size() / feat_dim;\n\n    std::array<int64_t, 3> shape{1, num_frames, feat_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    int64_t len_shape = 1;\n    Ort::Value x_len =\n        Ort::Value::CreateTensor(memory_info, &num_frames, 1, &len_shape, 1);\n\n    auto cross_kv = model_->ForwardEncoder(std::move(x), std::move(x_len));\n\n    auto results = decoder_->Decode(std::move(cross_kv.first),\n    
                                std::move(cross_kv.second), num_frames);\n\n    auto r = Convert(results[0], symbol_table_);\n\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n  void ApplyCMVN(std::vector<float> *v) const {\n    const auto &meta_data = model_->GetModelMetadata();\n    const auto &mean_vec = meta_data.mean;\n    const auto &inv_stddev_vec = meta_data.inv_stddev;\n    int32_t feat_dim = static_cast<int32_t>(mean_vec.size());\n    int32_t num_frames = static_cast<int32_t>(v->size()) / feat_dim;\n    Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n        mat(v->data(), num_frames, feat_dim);\n    Eigen::Map<const Eigen::RowVectorXf> mean(mean_vec.data(), feat_dim);\n    Eigen::Map<const Eigen::RowVectorXf> inv_std(inv_stddev_vec.data(),\n                                                 feat_dim);\n\n    mat.array() =\n        (mat.array().rowwise() - mean.array()).rowwise() * inv_std.array();\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineFireRedAsrModel> model_;\n  std::unique_ptr<OfflineFireRedAsrDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FIRE_RED_ASR_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.cc\n//\n// Copyright (c)  2025  zengyw\n\n#include \"sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.h\"\n\n#include <algorithm>\n#include <cctype>\n#include <cmath>\n#include <cstdint>\n#include <cstring>\n#include <limits>\n#include <memory>\n#include <random>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n// Build cache_position tensor from attention_mask.\n// Creates a [S] int64_t tensor where the first element is the starting position\n// (pos0) for writing KV deltas. The remaining elements are consecutive\n// positions [pos0, pos0+1, ..., pos0+S-1].\n// For prefill: pos0 = 0, S = context_len\n// For decode: pos0 = valid_len, S = 1 (mask_len = valid_len + 1)\nstatic Ort::Value BuildCachePositionFromMask(const Ort::Value &attention_mask,\n                                             int32_t seq_len,\n                                             OrtAllocator *allocator) {\n  auto mask_info = attention_mask.GetTensorTypeAndShapeInfo();\n  auto mask_shape = mask_info.GetShape();\n\n  // Get the current position from attention_mask length\n  // mask_shape is [1, mask_len], where mask_len = past_len + seq_len\n  int64_t pos0 = 0;\n  if (mask_shape.size() == 2 && mask_shape[1] > 0) {\n    // pos0 is the current position in cache (past length = mask_len - seq_len)\n    pos0 = static_cast<int64_t>(mask_shape[1]) - seq_len;\n  }\n  if (pos0 < 0) pos0 = 0;\n\n  // Create tensor using allocator\n  std::array<int64_t, 1> pos_shape{seq_len};\n  Ort::Value cache_position = Ort::Value::CreateTensor<int64_t>(\n      allocator, 
pos_shape.data(), pos_shape.size());\n\n  // Fill the tensor with position values\n  int64_t *p = cache_position.GetTensorMutableData<int64_t>();\n  for (int32_t i = 0; i < seq_len; ++i) {\n    p[i] = pos0 + i;\n  }\n\n  return cache_position;\n}\n\n// Create attention_mask tensor view from pre-allocated buffer.\n// Returns a tensor with shape [1, mask_len] (dynamic length).\nstatic Ort::Value CreateAttentionMaskView(\n    std::vector<int64_t> *attention_mask_vec, int32_t mask_len,\n    const Ort::MemoryInfo &memory_info, bool update_new_pos = false) {\n  if (update_new_pos && mask_len > 0) {\n    (*attention_mask_vec)[mask_len - 1] = 1;\n  }\n  std::array<int64_t, 2> mask_shape{1, mask_len};\n  return Ort::Value::CreateTensor<int64_t>(\n      memory_info, attention_mask_vec->data(), static_cast<size_t>(mask_len),\n      mask_shape.data(), mask_shape.size());\n}\n\nstatic inline void TrimInplace(std::string *s) {\n  if (!s) return;\n  auto &str = *s;\n  auto not_space = [](unsigned char c) { return !std::isspace(c); };\n\n  str.erase(str.begin(), std::find_if(str.begin(), str.end(), not_space));\n  str.erase(std::find_if(str.rbegin(), str.rend(), not_space).base(),\n            str.end());\n}\n\nstatic std::vector<std::string> ParseHotwordsCsv(const std::string &csv) {\n  std::vector<std::string> out;\n  std::string cur;\n  cur.reserve(csv.size());\n\n  for (size_t i = 0; i < csv.size(); ++i) {\n    unsigned char ch = static_cast<unsigned char>(csv[i]);\n    // Support both ASCII and Chinese separators\n    // Check for Chinese comma (，) and semicolon (；) - UTF-8 encoding\n    bool is_separator = false;\n    if (ch == ',' || ch == ';' || ch == '\\n' || ch == '\\r' || ch == '\\t') {\n      is_separator = true;\n    } else if (ch == 0xEF) {\n      // Check for UTF-8 encoded Chinese comma (，) = EF BC 8C or semicolon (；)\n      // = EF BC 9B. Otherwise consume full 3-byte sequence to avoid corrupting\n      // other UTF-8 chars (e.g. 
0xEF 0xBE 0xAD).\n      if (i + 2 < csv.size()) {\n        unsigned char ch1 = static_cast<unsigned char>(csv[i + 1]);\n        unsigned char ch2 = static_cast<unsigned char>(csv[i + 2]);\n        if (ch1 == 0xBC && (ch2 == 0x8C || ch2 == 0x9B)) {\n          is_separator = true;\n          i += 2;  // Skip the remaining UTF-8 bytes\n        } else if (ch1 >= 0x80 && ch1 <= 0xBF && ch2 >= 0x80 && ch2 <= 0xBF) {\n          cur.push_back(csv[i]);\n          cur.push_back(csv[i + 1]);\n          cur.push_back(csv[i + 2]);\n          i += 2;\n          continue;\n        }\n      }\n    }\n\n    if (is_separator) {\n      TrimInplace(&cur);\n      if (!cur.empty()) out.push_back(cur);\n      cur.clear();\n    } else {\n      cur.push_back(csv[i]);\n    }\n  }\n  TrimInplace(&cur);\n  if (!cur.empty()) out.push_back(cur);\n  return out;\n}\n\nstatic std::string JoinWithComma(const std::vector<std::string> &xs) {\n  std::string s;\n  for (size_t i = 0; i < xs.size(); ++i) {\n    if (i) s += \", \";\n    s += xs[i];\n  }\n  return s;\n}\n\n// Build user prompt based on hotwords, language, and itn settings.\nstatic std::string BuildUserPrompt(const std::vector<std::string> &hotwords,\n                                   const std::string *language, bool itn,\n                                   const std::string *user_prompt) {\n  const bool has_override =\n      !hotwords.empty() || (language && !language->empty()) || !itn;\n  if (user_prompt && !user_prompt->empty() && !has_override) {\n    return *user_prompt;\n  }\n\n  std::string prefix;\n  if (!hotwords.empty()) {\n    std::string hw = JoinWithComma(hotwords);\n    prefix =\n        \"请结合上下文信息，更加准确地完成语音转写任务。如果没有相关信息，我们会\"\n        \"留空。\\n\\n\\n\"\n        \"**上下文信息：**\\n\\n\\n\";\n    prefix += \"热词列表：[\" + hw + \"]\\n\";\n  }\n\n  std::string task =\n      (!language || language->empty()) ? 
\"语音转写\" : \"语音转写成\" + *language;\n  if (!itn) {\n    task += \"，不进行文本规整\";\n  }\n  task += \"：\";\n\n  return prefix + task;\n}\n\n}  // namespace\n\nOfflineRecognizerFunASRNanoImpl::OfflineRecognizerFunASRNanoImpl(\n    const OfflineRecognizerConfig &config)\n    : OfflineRecognizerImpl(config),\n      config_(config),\n      model_(std::make_unique<OfflineFunASRNanoModel>(config.model_config)),\n      tokenizer_(std::make_unique<FunASRNanoTokenizer>(\n          config.model_config.funasr_nano.tokenizer)),\n      rng_(config.model_config.funasr_nano.seed) {\n  InitFeatConfig();\n}\n\ntemplate <typename Manager>\nOfflineRecognizerFunASRNanoImpl::OfflineRecognizerFunASRNanoImpl(\n    Manager *mgr, const OfflineRecognizerConfig &config)\n    : OfflineRecognizerImpl(mgr, config),\n      config_(config),\n      model_(\n          std::make_unique<OfflineFunASRNanoModel>(mgr, config.model_config)),\n      tokenizer_(std::make_unique<FunASRNanoTokenizer>(\n          mgr, config.model_config.funasr_nano.tokenizer)),\n      rng_(config.model_config.funasr_nano.seed) {\n  InitFeatConfig();\n}\n\nstd::unique_ptr<OfflineStream> OfflineRecognizerFunASRNanoImpl::CreateStream()\n    const {\n  return std::make_unique<OfflineStream>(config_.feat_config);\n}\n\n// Initialize feature extraction configuration for FunASR-nano.\n// Sets normalization, window type, and disables edge snipping and dithering\n// to match the model's expected input format.\nvoid OfflineRecognizerFunASRNanoImpl::InitFeatConfig() {\n  config_.feat_config.normalize_samples = false;\n  config_.feat_config.window_type = \"hamming\";\n  config_.feat_config.snip_edges = false;\n  config_.feat_config.dither = 0.0f;\n}\n\n// Apply Low Frame Rate (LFR) processing to reduce temporal resolution.\n// Concatenates multiple consecutive frames into a single frame.\nstd::vector<float> OfflineRecognizerFunASRNanoImpl::ApplyLFR(\n    const std::vector<float> &in) const {\n  int32_t lfr_window_size = 
model_->LfrWindowSize();\n  int32_t lfr_window_shift = model_->LfrWindowShift();\n  int32_t in_feat_dim = config_.feat_config.feature_dim;\n  int32_t in_num_frames = static_cast<int32_t>(in.size() / in_feat_dim);\n  int32_t out_num_frames =\n      (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n  if (out_num_frames <= 0) return {};\n  int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n  std::vector<float> out(out_num_frames * out_feat_dim);\n  const float *p_in = in.data();\n  float *p_out = out.data();\n  for (int32_t i = 0; i != out_num_frames; ++i) {\n    std::copy(p_in, p_in + out_feat_dim, p_out);\n    p_out += out_feat_dim;\n    p_in += lfr_window_shift * in_feat_dim;\n  }\n  return out;\n}\n\n// Build source token IDs with chat template format:\n// [system_prompt] [user_prompt] [audio_tokens] [assistant_prompt]\n// Returns the token sequence and sets fbank_beg_idx to the start position\n// of audio tokens in the sequence.\nstd::vector<int64_t> OfflineRecognizerFunASRNanoImpl::BuildSourceIds(\n    const std::string &system_prompt, const std::string &user_prompt,\n    int32_t audio_token_len, int32_t &fbank_beg_idx,\n    int32_t &fake_token_len) const {\n  const std::string system_text =\n      \"<|im_start|>system\\n\" + system_prompt + \"<|im_end|>\\n\";\n  const std::string user_text = \"<|im_start|>user\\n\" + user_prompt;\n  const std::string after_text = \"<|im_end|>\\n<|im_start|>assistant\\n\";\n  std::vector<int64_t> ids_before = tokenizer_->Encode(system_text + user_text);\n  std::vector<int64_t> ids_after = tokenizer_->Encode(after_text);\n  fbank_beg_idx = static_cast<int32_t>(ids_before.size());\n  fake_token_len = audio_token_len;\n  int64_t pad_id = tokenizer_->GetPadTokenId();\n  if (pad_id < 0) pad_id = tokenizer_->GetEosTokenId();\n  std::vector<int64_t> source_ids;\n  source_ids.reserve(ids_before.size() + audio_token_len + ids_after.size());\n  source_ids.insert(source_ids.end(), ids_before.begin(), ids_before.end());\n  // 
Use pad tokens as placeholders for audio embeddings\n  source_ids.insert(source_ids.end(), audio_token_len, pad_id);\n  source_ids.insert(source_ids.end(), ids_after.begin(), ids_after.end());\n  return source_ids;\n}\n\n// Sample token from logits using greedy decoding (argmax).\n// Handles both FP16 and FP32 logits, skipping NaN/Inf values.\n// Returns token ID 0 as fallback if all logits are invalid.\nint64_t OfflineRecognizerFunASRNanoImpl::SampleTokenFromLogitsFp16OrFp32(\n    const void *logits, bool is_fp16, int32_t vocab_size) const {\n  int32_t best = 0;\n  float best_val = -1e30f;\n  bool found_valid = false;\n  if (is_fp16) {\n    const uint16_t *p = reinterpret_cast<const uint16_t *>(logits);\n    for (int32_t i = 0; i < vocab_size; ++i) {\n      float v = HalfBitsToFloat(p[i]);\n      if (std::isfinite(v) && v > best_val) {\n        best_val = v;\n        best = i;\n        found_valid = true;\n      }\n    }\n  } else {\n    const float *p = reinterpret_cast<const float *>(logits);\n    for (int32_t i = 0; i < vocab_size; ++i) {\n      if (std::isfinite(p[i]) && p[i] > best_val) {\n        best_val = p[i];\n        best = i;\n        found_valid = true;\n      }\n    }\n  }\n  if (!found_valid) {\n    return 0;\n  }\n  return static_cast<int64_t>(best);\n}\n\n// Sample token from logits using temperature and top-p (nucleus) sampling.\n// Handles both FP16 and FP32 logits.\n// Returns token ID 0 as fallback if all logits are invalid.\n// If temperature is very small (<= 1e-6) or invalid, falls back to greedy\n// decoding. 
If top_p >= 1.0, samples from all tokens without sorting (full\n// vocabulary).\nint64_t OfflineRecognizerFunASRNanoImpl::SampleTokenWithTemperatureAndTopP(\n    const void *logits, bool is_fp16, int32_t vocab_size, float temperature,\n    float top_p) const {\n  if (temperature <= 1e-6f || !std::isfinite(temperature)) {\n    return SampleTokenFromLogitsFp16OrFp32(logits, is_fp16, vocab_size);\n  }\n\n  if (!std::isfinite(top_p) || top_p <= 0.0f) {\n    return SampleTokenFromLogitsFp16OrFp32(logits, is_fp16, vocab_size);\n  }\n  if (top_p > 1.0f) top_p = 1.0f;\n\n  thread_local std::vector<float> probs;\n  thread_local std::vector<int32_t> idx;\n\n  probs.resize(vocab_size);\n  idx.resize(vocab_size);\n\n  float max_logit = -std::numeric_limits<float>::infinity();\n  bool found_valid = false;\n\n  if (is_fp16) {\n    const uint16_t *p = reinterpret_cast<const uint16_t *>(logits);\n    for (int32_t i = 0; i < vocab_size; ++i) {\n      float v = HalfBitsToFloat(p[i]);\n      if (std::isfinite(v)) {\n        v /= temperature;\n        probs[i] = v;\n        if (v > max_logit) max_logit = v;\n        found_valid = true;\n      } else {\n        probs[i] = -1e30f;\n      }\n      idx[i] = i;\n    }\n  } else {\n    const float *p = reinterpret_cast<const float *>(logits);\n    for (int32_t i = 0; i < vocab_size; ++i) {\n      float v = p[i];\n      if (std::isfinite(v)) {\n        v /= temperature;\n        probs[i] = v;\n        if (v > max_logit) max_logit = v;\n        found_valid = true;\n      } else {\n        probs[i] = -1e30f;\n      }\n      idx[i] = i;\n    }\n  }\n\n  if (!found_valid) return 0;\n\n  float sum_exp = 0.0f;\n  for (int32_t i = 0; i < vocab_size; ++i) {\n    float e = std::exp(probs[i] - max_logit);\n    probs[i] = e;\n    sum_exp += e;\n  }\n  if (sum_exp <= 0.0f || !std::isfinite(sum_exp)) return 0;\n  for (int32_t i = 0; i < vocab_size; ++i) {\n    probs[i] /= sum_exp;\n  }\n\n  if (top_p >= 1.0f) {\n    std::uniform_real_distribution<float> 
dist(0.0f, 1.0f);\n    float sample = dist(rng_);\n    float cumsum = 0.0f;\n    for (int32_t i = 0; i < vocab_size; ++i) {\n      cumsum += probs[i];\n      if (sample <= cumsum) return static_cast<int64_t>(i);\n    }\n    return static_cast<int64_t>(vocab_size - 1);\n  }\n\n  int32_t k = std::min<int32_t>(256, vocab_size);\n  float cum_k = 0.0f;\n  while (true) {\n    std::partial_sort(\n        idx.begin(), idx.begin() + k, idx.end(),\n        [&](int32_t a, int32_t b) { return probs[a] > probs[b]; });\n\n    cum_k = 0.0f;\n    for (int32_t i = 0; i < k; ++i) cum_k += probs[idx[i]];\n\n    if (cum_k >= top_p || k == vocab_size) break;\n\n    int32_t new_k = std::min(vocab_size, k * 2);\n    if (new_k == k) break;\n    k = new_k;\n  }\n\n  float cumsum = 0.0f;\n  int32_t cutoff = k;\n  for (int32_t i = 0; i < k; ++i) {\n    cumsum += probs[idx[i]];\n    if (cumsum >= top_p) {\n      cutoff = i + 1;\n      break;\n    }\n  }\n\n  float renorm_sum = 0.0f;\n  for (int32_t i = 0; i < cutoff; ++i) renorm_sum += probs[idx[i]];\n  if (renorm_sum <= 0.0f) return 0;\n\n  std::uniform_real_distribution<float> dist(0.0f, renorm_sum);\n  float sample = dist(rng_);\n  float cumsum_sample = 0.0f;\n  for (int32_t i = 0; i < cutoff; ++i) {\n    cumsum_sample += probs[idx[i]];\n    if (sample <= cumsum_sample) return static_cast<int64_t>(idx[i]);\n  }\n  return static_cast<int64_t>(idx[cutoff - 1]);\n}\n\nOfflineRecognitionResult OfflineRecognizerFunASRNanoImpl::GenerateText(\n    Ort::Value encoder_out, const std::string &system_prompt,\n    const std::string &user_prompt) const {\n  OfflineRecognitionResult result;\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n  const auto &funasr_config = config_.model_config.funasr_nano;\n  auto enc_shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n  int32_t audio_token_len = static_cast<int32_t>(enc_shape[1]);\n  int32_t hidden_size = static_cast<int32_t>(enc_shape[2]);\n  int32_t 
fbank_beg_idx = 0;\n  int32_t fake_token_len = 0;\n  std::vector<int64_t> source_ids =\n      BuildSourceIds(system_prompt, user_prompt, audio_token_len, fbank_beg_idx,\n                     fake_token_len);\n  int32_t context_len = static_cast<int32_t>(source_ids.size());\n\n  // Create KV cache buffer [B, max_total_len, kv_h, hd].\n  // This stores the accumulated KV cache. Model outputs are deltas that get\n  // applied in-place.\n  std::vector<std::pair<Ort::Value, Ort::Value>> cache_kv =\n      model_->CreateEmptyKVCache(1);\n  int32_t max_seq_len = model_->GetMaxTotalLen();\n  if (max_seq_len <= 0) {\n    SHERPA_ONNX_LOGE(\"Invalid max_seq_len=%d\", max_seq_len);\n    result.text = \"\";\n    return result;\n  }\n\n  // If context exceeds KV capacity: prioritize truncating audio placeholders\n  // (keep prompt scaffold intact).\n  if (context_len > max_seq_len) {\n    int32_t before_len = fbank_beg_idx;\n    int32_t after_len = context_len - before_len - fake_token_len;\n    if (after_len < 0) after_len = 0;\n\n    int32_t keep_audio = max_seq_len - before_len - after_len;\n    if (keep_audio < 0) {\n      SHERPA_ONNX_LOGE(\n          \"Context_len (%d) too large for KV capacity (%d) and prompts already \"\n          \"exceed capacity. \"\n          \"Falling back to keep last %d tokens.\",\n          context_len, max_seq_len, max_seq_len);\n      SHERPA_ONNX_LOGE(\n          \"The model max_total_len (%d) limits total context (prompt + audio \"\n          \"tokens). 
Suggestions:\",\n          max_seq_len);\n      SHERPA_ONNX_LOGE(\n          \"  1) Reduce hotwords: fewer or shorter hotwords shorten the \"\n          \"prompt.\");\n      SHERPA_ONNX_LOGE(\n          \"  2) Shorten audio: use shorter clips so audio_token_len \"\n          \"decreases.\");\n      SHERPA_ONNX_LOGE(\n          \"  3) Use a model with larger max_total_len: export with \"\n          \"max_total_len>%d via scripts in \"\n          \"https://github.com/Wasser1462/FunASR-nano-onnx , or download \"\n          \"from https://modelscope.cn/models/zengshuishui/FunASR-nano-onnx/\",\n          max_seq_len);\n      // Fallback: keep the suffix.\n      source_ids.erase(source_ids.begin(), source_ids.end() - max_seq_len);\n      // Audio alignment is no longer controllable, skip injecting audio\n      // embeddings.\n      fbank_beg_idx = -1;\n      fake_token_len = 0;\n      context_len = static_cast<int32_t>(source_ids.size());\n    } else {\n      if (keep_audio > audio_token_len) keep_audio = audio_token_len;\n\n      SHERPA_ONNX_LOGE(\n          \"Context_len (%d) exceeds KV capacity (%d). Truncating audio \"\n          \"placeholders: \"\n          \"audio_token_len=%d -> keep_audio=%d (before=%d after=%d).\",\n          context_len, max_seq_len, audio_token_len, keep_audio, before_len,\n          after_len);\n      SHERPA_ONNX_LOGE(\n          \"The model max_total_len (%d) limits total context (prompt + audio \"\n          \"tokens). 
Suggestions:\",\n          max_seq_len);\n      SHERPA_ONNX_LOGE(\n          \"  1) Reduce hotwords: fewer or shorter hotwords shorten the \"\n          \"prompt.\");\n      SHERPA_ONNX_LOGE(\n          \"  2) Shorten audio: use shorter clips so audio_token_len \"\n          \"decreases.\");\n      SHERPA_ONNX_LOGE(\n          \"  3) Use a model with larger max_total_len: export with \"\n          \"max_total_len>%d via scripts in \"\n          \"https://github.com/Wasser1462/FunASR-nano-onnx , or download \"\n          \"from https://modelscope.cn/models/zengshuishui/FunASR-nano-onnx/\",\n          max_seq_len);\n\n      // Rebuild ids_before/ids_after using slices.\n      std::vector<int64_t> ids_before(source_ids.begin(),\n                                      source_ids.begin() + before_len);\n      std::vector<int64_t> ids_after(source_ids.end() - after_len,\n                                     source_ids.end());\n\n      int64_t pad_id = tokenizer_->GetPadTokenId();\n      if (pad_id < 0) pad_id = tokenizer_->GetEosTokenId();\n\n      source_ids.clear();\n      source_ids.reserve(before_len + keep_audio + after_len);\n      source_ids.insert(source_ids.end(), ids_before.begin(), ids_before.end());\n      source_ids.insert(source_ids.end(), keep_audio, pad_id);\n      source_ids.insert(source_ids.end(), ids_after.begin(), ids_after.end());\n\n      fake_token_len = keep_audio;\n      fbank_beg_idx = before_len;\n      context_len = static_cast<int32_t>(source_ids.size());\n    }\n  }\n\n  // Get text embeddings for the prompt tokens\n  std::vector<int64_t> input_ids = source_ids;\n  std::array<int64_t, 2> ids_shape{1, context_len};\n  Ort::Value input_ids_tensor =\n      Ort::Value::CreateTensor(memory_info, input_ids.data(), input_ids.size(),\n                               ids_shape.data(), ids_shape.size());\n\n  Ort::Value text_embeds =\n      model_->ForwardEmbedding(std::move(input_ids_tensor));\n\n  auto te_info = 
text_embeds.GetTensorTypeAndShapeInfo();\n  const auto te_type = te_info.GetElementType();\n  const bool te_fp16 = (te_type == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n\n  // Allocate inputs_embeds only for prefill (context_len * hidden_size).\n  // Decode steps will use a separate reusable buffer.\n  std::vector<float> inputs_embeds_fp32(\n      static_cast<size_t>(context_len) * hidden_size, 0.0f);\n\n  // Copy text embeddings.\n  if (te_fp16) {\n    const uint16_t *p = text_embeds.GetTensorData<uint16_t>();\n    const size_t total = static_cast<size_t>(context_len) * hidden_size;\n    for (size_t i = 0; i < total; ++i) {\n      inputs_embeds_fp32[i] = HalfBitsToFloat(p[i]);\n    }\n  } else {\n    const float *p = text_embeds.GetTensorData<float>();\n    const size_t total = static_cast<size_t>(context_len) * hidden_size;\n    std::memcpy(inputs_embeds_fp32.data(), p, total * sizeof(float));\n  }\n\n  // Inject audio embeddings into placeholder region (if alignment is still\n  // possible).\n  auto enc_info2 = encoder_out.GetTensorTypeAndShapeInfo();\n  auto enc_et =\n      static_cast<ONNXTensorElementDataType>(enc_info2.GetElementType());\n  int32_t copy_len = std::min(fake_token_len, audio_token_len);\n\n  if (copy_len > 0 && fbank_beg_idx >= 0) {\n    if (enc_et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16) {\n      const uint16_t *enc = encoder_out.GetTensorData<uint16_t>();\n      const size_t hidden_size_u = static_cast<size_t>(hidden_size);\n      for (int32_t t = 0; t < copy_len; ++t) {\n        const uint16_t *src = enc + static_cast<size_t>(t) * hidden_size_u;\n        float *dst = inputs_embeds_fp32.data() +\n                     static_cast<size_t>(fbank_beg_idx + t) * hidden_size_u;\n        for (size_t d = 0; d < hidden_size_u; ++d) {\n          dst[d] = HalfBitsToFloat(src[d]);\n        }\n      }\n    } else if (enc_et == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT) {\n      const float *enc = encoder_out.GetTensorData<float>();\n      const size_t 
hidden_size_u = static_cast<size_t>(hidden_size);\n      for (int32_t t = 0; t < copy_len; ++t) {\n        const float *src = enc + static_cast<size_t>(t) * hidden_size_u;\n        float *dst = inputs_embeds_fp32.data() +\n                     static_cast<size_t>(fbank_beg_idx + t) * hidden_size_u;\n        std::memcpy(dst, src, hidden_size_u * sizeof(float));\n      }\n    } else {\n      SHERPA_ONNX_LOGE(\"encoder_out elem_type=%d not supported\", (int)enc_et);\n      result.text = \"\";\n      return result;\n    }\n  }\n\n  // Pre-allocate attention_mask buffer to avoid per-step allocations\n  std::vector<int64_t> attention_mask_vec(static_cast<size_t>(max_seq_len), 0);\n  // Initialize first context_len positions to 1 for prefill\n  std::fill(attention_mask_vec.begin(),\n            attention_mask_vec.begin() + context_len, 1);\n\n  // Pre-allocate reusable buffer for decode step embeddings (hidden_size)\n  std::vector<float> next_embed_fp32(static_cast<size_t>(hidden_size));\n\n  int32_t valid_len = context_len;\n\n  std::vector<int64_t> generated_ids;\n  generated_ids.reserve(funasr_config.max_new_tokens);\n\n  const int64_t eos_id = tokenizer_->GetEosTokenId();\n  const int64_t im_end_id = tokenizer_->GetImEndTokenId();\n  const int32_t max_new_tokens = funasr_config.max_new_tokens;\n\n  bool is_first_step = true;\n\n  for (int32_t step = 0; step < max_new_tokens; ++step) {\n    // valid_len represents the mask_len for the next decode step (= past +\n    // current).\n    if (valid_len >= max_seq_len) break;\n\n    Ort::Value logits{nullptr};\n\n    if (is_first_step) {\n      // Prefill: seq = context_len, mask_len = context_len.\n      if (config_.model_config.debug) {\n        SHERPA_ONNX_LOGE(\n            \"GenerateText: starting prefill with context_len=%d, \"\n            \"inputs_embeds_fp32.size()=%zu\",\n            context_len, inputs_embeds_fp32.size());\n      }\n\n      std::array<int64_t, 3> embeds_shape{1, context_len, hidden_size};\n      
Ort::Value inputs_embeds_tensor = Ort::Value::CreateTensor<float>(\n          memory_info, inputs_embeds_fp32.data(),\n          static_cast<size_t>(context_len) * hidden_size, embeds_shape.data(),\n          embeds_shape.size());\n\n      // Use pre-allocated attention_mask buffer (first context_len positions\n      // already set to 1)\n      Ort::Value attention_mask_view = CreateAttentionMaskView(\n          &attention_mask_vec, context_len, memory_info, false);\n\n      Ort::Value cache_position = BuildCachePositionFromMask(\n          attention_mask_view, context_len, model_->Allocator());\n\n      auto tmp = model_->ForwardLLM(std::move(inputs_embeds_tensor),\n                                    std::move(attention_mask_view),\n                                    cache_position, cache_kv);\n      logits = std::move(tmp.first);\n      auto kv_outputs = std::move(tmp.second);\n\n      // Apply KV deltas to cache buffer in-place.\n      // kv_outputs contains deltas that update cache_kv at positions specified\n      // by cache_position.\n      model_->ApplyKvDeltaInplace(&cache_kv, kv_outputs, cache_position);\n\n    } else {\n      // Decode: seq = 1, mask_len = valid_len + 1 (past + current)\n      int64_t last_token_id = generated_ids.back();\n      std::vector<int64_t> one_id{last_token_id};\n      std::array<int64_t, 2> one_shape{1, 1};\n      Ort::Value one_tensor =\n          Ort::Value::CreateTensor(memory_info, one_id.data(), one_id.size(),\n                                   one_shape.data(), one_shape.size());\n\n      Ort::Value next_embed = model_->ForwardEmbedding(std::move(one_tensor));\n      auto ne_info = next_embed.GetTensorTypeAndShapeInfo();\n      bool ne_fp16 =\n          (ne_info.GetElementType() == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n\n      // Reuse pre-allocated buffer for decode step embedding\n      if (ne_fp16) {\n        const uint16_t *src = next_embed.GetTensorData<uint16_t>();\n        for (size_t d = 0; d < 
static_cast<size_t>(hidden_size); ++d) {\n          next_embed_fp32[d] = HalfBitsToFloat(src[d]);\n        }\n      } else {\n        const float *src = next_embed.GetTensorData<float>();\n        std::memcpy(next_embed_fp32.data(), src,\n                    static_cast<size_t>(hidden_size) * sizeof(float));\n      }\n\n      std::array<int64_t, 3> embeds_shape{1, 1, hidden_size};\n      Ort::Value inputs_embeds_tensor = Ort::Value::CreateTensor<float>(\n          memory_info, next_embed_fp32.data(), static_cast<size_t>(hidden_size),\n          embeds_shape.data(), embeds_shape.size());\n\n      // mask_len must equal kv_seq_len (= past + current = valid_len + 1).\n      // Use pre-allocated attention_mask buffer, update new position to 1\n      int32_t mask_len = valid_len + 1;\n      Ort::Value attention_mask_view = CreateAttentionMaskView(\n          &attention_mask_vec, mask_len, memory_info, true);\n\n      Ort::Value cache_position = BuildCachePositionFromMask(\n          attention_mask_view, 1, model_->Allocator());\n\n      auto tmp = model_->ForwardLLM(std::move(inputs_embeds_tensor),\n                                    std::move(attention_mask_view),\n                                    cache_position, cache_kv);\n      logits = std::move(tmp.first);\n      auto kv_outputs = std::move(tmp.second);\n\n      // Apply KV deltas to cache buffer in-place.\n      model_->ApplyKvDeltaInplace(&cache_kv, kv_outputs, cache_position);\n    }\n\n    auto log_info = logits.GetTensorTypeAndShapeInfo();\n    auto log_shape = log_info.GetShape();\n\n    // logits are [B, S, V]. 
Always pick the last available step.\n    if (log_shape.size() < 3) {\n      SHERPA_ONNX_LOGE(\"Unexpected logits rank=%zu\", log_shape.size());\n      result.text = \"\";\n      return result;\n    }\n\n    int32_t time_dim = static_cast<int32_t>(log_shape[1]);\n    int32_t vocab_size = static_cast<int32_t>(log_shape[2]);\n    if (time_dim <= 0 || vocab_size <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid logits shape [%d,%d,%d]\",\n                       static_cast<int32_t>(log_shape[0]),\n                       static_cast<int32_t>(log_shape[1]),\n                       static_cast<int32_t>(log_shape[2]));\n      result.text = \"\";\n      return result;\n    }\n\n    const bool log_fp16 =\n        (log_info.GetElementType() == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n\n    int32_t last_idx = time_dim - 1;\n\n    const void *base = nullptr;\n    if (log_fp16)\n      base = logits.GetTensorData<uint16_t>();\n    else\n      base = logits.GetTensorData<float>();\n\n    const size_t offset = static_cast<size_t>(last_idx) * vocab_size;\n    const void *last_logits =\n        log_fp16 ? 
static_cast<const void *>(\n                       reinterpret_cast<const uint16_t *>(base) + offset)\n                 : static_cast<const void *>(\n                       reinterpret_cast<const float *>(base) + offset);\n\n    int64_t next_id = SampleTokenWithTemperatureAndTopP(\n        last_logits, log_fp16, vocab_size, funasr_config.temperature,\n        funasr_config.top_p);\n\n    if (next_id == eos_id || next_id == im_end_id) break;\n\n    generated_ids.push_back(next_id);\n\n    if (is_first_step) is_first_step = false;\n\n    // valid_len represents the kv_seq_len for the next decode step.\n    valid_len += 1;\n  }\n\n  result.text = tokenizer_->Decode(generated_ids);\n\n  if (funasr_config.itn) {\n    result.text = ApplyInverseTextNormalization(std::move(result.text));\n    result.text = ApplyHomophoneReplacer(std::move(result.text));\n  }\n\n  if (config_.model_config.debug) {\n    SHERPA_ONNX_LOGE(\"GenerateText: generated %zu tokens: %s\",\n                     generated_ids.size(), result.text.c_str());\n    std::string token_str;\n    for (size_t i = 0; i < generated_ids.size() && i < 10; ++i) {\n      if (i > 0) token_str += \",\";\n      token_str += std::to_string(generated_ids[i]);\n    }\n    SHERPA_ONNX_LOGE(\"GenerateText: token ids: %s%s\", token_str.c_str(),\n                     generated_ids.size() > 10 ? 
\"...\" : \"\");\n  }\n\n  if (!generated_ids.empty()) {\n    result.tokens.reserve(generated_ids.size());\n    std::string pending_bytes;\n    for (int64_t token_id : generated_ids) {\n      // Use GetTokenStringStreaming() to handle cross-token UTF-8 sequences.\n      // This properly handles cases where a single character is split across\n      // multiple BPE tokens.\n      std::string s =\n          tokenizer_->GetTokenStringStreaming(token_id, &pending_bytes);\n      result.tokens.push_back(std::move(s));\n    }\n\n    if (!pending_bytes.empty() && !result.tokens.empty()) {\n      // Handle any remaining bytes from the last token, treating them as\n      // invalid.\n      std::string replacement_chars;\n      replacement_chars.reserve(pending_bytes.size() * 3);\n      for (size_t i = 0; i < pending_bytes.size(); ++i) {\n        replacement_chars.append(\"\\xEF\\xBF\\xBD\");\n      }\n      result.tokens.back().append(replacement_chars);\n    }\n\n    // Calculate timestamps based on the effective audio coverage duration.\n    // Use copy_len (the actual injected audio token count) to determine that\n    // duration.\n    result.timestamps.reserve(generated_ids.size());\n    if (fbank_beg_idx >= 0 && copy_len > 0 && !generated_ids.empty()) {\n      float frame_shift_ms = config_.feat_config.frame_shift_ms;\n\n      int32_t lfr_shift = model_->LfrWindowShift();\n      float token_time_sec =\n          frame_shift_ms * static_cast<float>(lfr_shift) / 1000.0f;\n\n      float effective_audio_duration =\n          static_cast<float>(copy_len) * token_time_sec;\n\n      if (effective_audio_duration > 0) {\n        if (generated_ids.size() == 1) {\n          result.timestamps.push_back(effective_audio_duration / 2.0f);\n        } else {\n          // Distribute timestamps evenly across effective_audio_duration.\n          // Use (size - 1) so the last timestamp equals\n          // effective_audio_duration.\n          float time_per_token = effective_audio_duration /\n                                 
static_cast<float>(generated_ids.size() - 1);\n          for (size_t i = 0; i < generated_ids.size(); ++i) {\n            result.timestamps.push_back(static_cast<float>(i) * time_per_token);\n          }\n        }\n      }\n    }\n  }\n\n  return result;\n}\n\n// Decode multiple audio streams in batch.\n// Applies LFR processing, runs encoder, and generates text for each stream.\nvoid OfflineRecognizerFunASRNanoImpl::DecodeStreams(OfflineStream **ss,\n                                                    int32_t n) const {\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n  const auto &funasr_config = config_.model_config.funasr_nano;\n  for (int32_t i = 0; i != n; ++i) {\n    std::vector<float> f = ss[i]->GetFrames();\n    f = ApplyLFR(f);\n    int32_t num_frames = static_cast<int32_t>(\n        f.size() / (config_.feat_config.feature_dim * model_->LfrWindowSize()));\n    if (num_frames <= 0) {\n      OfflineRecognitionResult r;\n      r.text = \"\";\n      ss[i]->SetResult(r);\n      continue;\n    }\n\n    std::array<int64_t, 3> shape{1, num_frames,\n                                 static_cast<int64_t>(f.size() / num_frames)};\n\n    Ort::Value features = Ort::Value::CreateTensor<float>(\n        memory_info, const_cast<float *>(f.data()), f.size(), shape.data(),\n        shape.size());\n\n    Ort::Value encoder_out = model_->ForwardEncoderAdaptor(std::move(features));\n\n    // Parse hotwords parameter\n    std::vector<std::string> hotwords =\n        ParseHotwordsCsv(funasr_config.hotwords);\n\n    // language is empty means None\n    const std::string *lang_ptr =\n        funasr_config.language.empty() ? 
nullptr : &funasr_config.language;\n\n    // Build user prompt: respect funasr_config.user_prompt; merge with\n    // hotwords/language/itn when provided.\n    std::string user_prompt_dyn = BuildUserPrompt(\n        hotwords, lang_ptr, funasr_config.itn, &funasr_config.user_prompt);\n\n    if (config_.model_config.debug) {\n      SHERPA_ONNX_LOGE(\n          \"DecodeStreams: hotwords=%zu, language=%s, itn=%d\", hotwords.size(),\n          funasr_config.language.empty() ? \"(empty)\"\n                                         : funasr_config.language.c_str(),\n          funasr_config.itn ? 1 : 0);\n      SHERPA_ONNX_LOGE(\"DecodeStreams: user_prompt_dyn=%s\",\n                       user_prompt_dyn.c_str());\n    }\n\n    OfflineRecognitionResult r = GenerateText(\n        std::move(encoder_out), funasr_config.system_prompt, user_prompt_dyn);\n\n    ss[i]->SetResult(r);\n  }\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineRecognizerFunASRNanoImpl::OfflineRecognizerFunASRNanoImpl(\n    AAssetManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineRecognizerFunASRNanoImpl::OfflineRecognizerFunASRNanoImpl(\n    NativeResourceManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.h\n//\n// Copyright (c)  2025  zengyw\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FUNASR_NANO_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FUNASR_NANO_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <random>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/funasr-nano-tokenizer.h\"\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRecognizerFunASRNanoImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerFunASRNanoImpl(\n      const OfflineRecognizerConfig &config);\n\n  template <typename Manager>\n  OfflineRecognizerFunASRNanoImpl(Manager *mgr,\n                                  const OfflineRecognizerConfig &config);\n\n  std::unique_ptr<OfflineStream> CreateStream() const override;\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override;\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void InitFeatConfig();\n  std::vector<float> ApplyLFR(const std::vector<float> &in) const;\n\n  std::vector<int64_t> BuildSourceIds(const std::string &system_prompt,\n                                      const std::string &user_prompt,\n                                      int32_t audio_token_len,\n                                      int32_t &fbank_beg_idx,\n                                      int32_t &fake_token_len) const;\n\n  int64_t SampleTokenFromLogitsFp16OrFp32(const void *logits,\n                                         bool is_fp16,\n                                         int32_t vocab_size) const;\n\n  int64_t SampleTokenWithTemperatureAndTopP(const void *logits,\n                              
              bool is_fp16,\n                                            int32_t vocab_size,\n                                            float temperature,\n                                            float top_p) const;\n\n  OfflineRecognitionResult GenerateText(Ort::Value encoder_out,\n                                       const std::string &system_prompt,\n                                       const std::string &user_prompt) const;\n\n  OfflineRecognizerConfig config_;\n  std::unique_ptr<OfflineFunASRNanoModel> model_;\n  std::unique_ptr<FunASRNanoTokenizer> tokenizer_;\n  mutable std::mt19937 rng_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_FUNASR_NANO_IMPL_H_\n\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-recognizer-impl.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-canary-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-ctc-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-fire-red-asr-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-funasr-nano-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-moonshine-v2-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-paraformer-tpl-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-sense-voice-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-sense-voice-tpl-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-transducer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-transducer-nemo-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-whisper-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-whisper-tpl-impl.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\n#if SHERPA_ONNX_ENABLE_RKNN\n#include \"sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_AXERA\n#include \"sherpa-onnx/csrc/axera/offline-sense-voice-model-axera.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_AXCL\n#include 
\"sherpa-onnx/csrc/axcl/offline-sense-voice-model-axcl.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_ASCEND_NPU\n#include \"sherpa-onnx/csrc/ascend/offline-paraformer-model-ascend.h\"\n#include \"sherpa-onnx/csrc/ascend/offline-recognizer-zipformer-ctc-ascend-impl.h\"\n#include \"sherpa-onnx/csrc/ascend/offline-sense-voice-model-ascend.h\"\n#include \"sherpa-onnx/csrc/ascend/offline-whisper-model-ascend.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_QNN\n#include \"sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.h\"\n#include \"sherpa-onnx/csrc/qnn/offline-recognizer-zipformer-ctc-qnn-impl.h\"\n#include \"sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflineRecognizerImpl> OfflineRecognizerImpl::Create(\n    const OfflineRecognizerConfig &config) {\n  if (config.model_config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelRknn>>(\n          config);\n    } else if (!config.model_config.paraformer.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelRknn>>(\n          config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice and Paraformer models are currently supported \"\n          \"by rknn for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/rknn/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"axera\") {\n#if SHERPA_ONNX_ENABLE_AXERA\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAxera>>(\n          config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice models are currently supported by Axera NPU for \"\n          \"non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_AXERA=ON if you \"\n        \"want to use axera. See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/axera/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"axcl\") {\n#if SHERPA_ONNX_ENABLE_AXCL\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAxcl>>(\n          config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice models are currently supported by axcl for \"\n          \"non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_AXCL=ON if you \"\n        \"want to use axcl. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/axcl/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"ascend\") {\n#if SHERPA_ONNX_ENABLE_ASCEND_NPU\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAscend>>(\n          config);\n    } else if (!config.model_config.paraformer.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelAscend>>(\n          config);\n    } else if (!config.model_config.zipformer_ctc.model.empty()) {\n      return std::make_unique<OfflineRecognizerZipformerCtcAscendImpl>(config);\n    } else if (!config.model_config.whisper.encoder.empty()) {\n      return std::make_unique<\n          OfflineRecognizerWhisperTplImpl<OfflineWhisperModelAscend>>(config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice, Paraformer, Whisper, and Zipformer CTC models are \"\n          \"currently supported by Ascend NPU for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_ASCEND_NPU=ON if \"\n        \"you want to use Ascend NPU. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/ascend/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"qnn\") {\n#if SHERPA_ONNX_ENABLE_QNN\n    if (!config.model_config.sense_voice.model.empty() ||\n        !config.model_config.sense_voice.qnn_config.context_binary.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelQnn>>(\n          config);\n    } else if (!config.model_config.zipformer_ctc.model.empty() ||\n               !config.model_config.zipformer_ctc.qnn_config.context_binary\n                    .empty()) {\n      return std::make_unique<OfflineRecognizerZipformerCtcQnnImpl>(config);\n    } else if (!config.model_config.paraformer.model.empty() ||\n               !config.model_config.paraformer.qnn_config.context_binary\n                    .empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelQnn>>(\n          config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice, Paraformer, and Zipformer CTC models are currently \"\n          \"supported by QNN for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_QNN=ON if \"\n        \"you want to use qnn. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/qnn/build.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.model_config.sense_voice.model.empty()) {\n    return std::make_unique<OfflineRecognizerSenseVoiceImpl>(config);\n  }\n\n  if (!config.model_config.funasr_nano.encoder_adaptor.empty()) {\n    return std::make_unique<OfflineRecognizerFunASRNanoImpl>(config);\n  }\n\n  if (!config.model_config.paraformer.model.empty()) {\n    return std::make_unique<OfflineRecognizerParaformerImpl>(config);\n  }\n\n  if (!config.model_config.nemo_ctc.model.empty() ||\n      !config.model_config.zipformer_ctc.model.empty() ||\n      !config.model_config.tdnn.model.empty() ||\n      !config.model_config.wenet_ctc.model.empty() ||\n      !config.model_config.omnilingual.model.empty() ||\n      !config.model_config.medasr.model.empty() ||\n      !config.model_config.fire_red_asr_ctc.model.empty() ||\n      !config.model_config.dolphin.model.empty()) {\n    return std::make_unique<OfflineRecognizerCtcImpl>(config);\n  }\n\n  if (!config.model_config.whisper.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerWhisperImpl>(config);\n  }\n\n  if (!config.model_config.fire_red_asr.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerFireRedAsrImpl>(config);\n  }\n\n  if (!config.model_config.moonshine.preprocessor.empty()) {\n    return std::make_unique<OfflineRecognizerMoonshineImpl>(config);\n  }\n\n  if (!config.model_config.moonshine.merged_decoder.empty()) {\n    return std::make_unique<OfflineRecognizerMoonshineV2Impl>(config);\n  }\n\n  if (!config.model_config.canary.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerCanaryImpl>(config);\n  }\n\n  // TODO(fangjun): Refactor it. We only need to use model type for the\n  // following models:\n  //  1. 
transducer and nemo_transducer\n  if (!config.model_config.model_type.empty()) {\n    const auto &model_type = config.model_config.model_type;\n    if (model_type == \"transducer\") {\n      return std::make_unique<OfflineRecognizerTransducerImpl>(config);\n    } else if (model_type == \"nemo_transducer\") {\n      return std::make_unique<OfflineRecognizerTransducerNeMoImpl>(config);\n    } else if (model_type == \"paraformer\") {\n      return std::make_unique<OfflineRecognizerParaformerImpl>(config);\n    } else if (model_type == \"nemo_ctc\" || model_type == \"tdnn\" ||\n               model_type == \"zipformer2_ctc\" || model_type == \"wenet_ctc\" ||\n               model_type == \"telespeech_ctc\") {\n      return std::make_unique<OfflineRecognizerCtcImpl>(config);\n    } else if (model_type == \"whisper\") {\n      // unreachable\n      return std::make_unique<OfflineRecognizerWhisperImpl>(config);\n    } else if (model_type == \"moonshine\") {\n      // unreachable\n      return std::make_unique<OfflineRecognizerMoonshineImpl>(config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid model_type: %s. 
Trying to load the model to get its type\",\n          model_type.c_str());\n    }\n  }\n\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  std::string model_filename;\n  if (!config.model_config.transducer.encoder_filename.empty()) {\n    model_filename = config.model_config.transducer.encoder_filename;\n  } else if (!config.model_config.paraformer.model.empty()) {\n    model_filename = config.model_config.paraformer.model;\n  } else if (!config.model_config.nemo_ctc.model.empty()) {\n    model_filename = config.model_config.nemo_ctc.model;\n  } else if (!config.model_config.telespeech_ctc.empty()) {\n    model_filename = config.model_config.telespeech_ctc;\n  } else if (!config.model_config.tdnn.model.empty()) {\n    model_filename = config.model_config.tdnn.model;\n  } else if (!config.model_config.zipformer_ctc.model.empty()) {\n    model_filename = config.model_config.zipformer_ctc.model;\n  } else if (!config.model_config.wenet_ctc.model.empty()) {\n    model_filename = config.model_config.wenet_ctc.model;\n  } else if (!config.model_config.whisper.encoder.empty()) {\n    model_filename = config.model_config.whisper.encoder;\n  } else {\n    SHERPA_ONNX_LOGE(\"Please provide a model\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  auto buf = ReadFile(model_filename);\n\n  auto encoder_sess =\n      std::make_unique<Ort::Session>(env, buf.data(), buf.size(), sess_opts);\n\n  Ort::ModelMetadata meta_data = encoder_sess->GetModelMetadata();\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\\n\"\n        \"Please refer to the following URLs to add metadata\"\n        \"\\n\"\n        \"(0) Transducer models from icefall\"\n        \"\\n    \"\n      
  \"https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/\"\n        \"pruned_transducer_stateless7/export-onnx.py#L303\"\n        \"\\n\"\n        \"(1) Nemo CTC models\\n    \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-nemo-ctc-en-citrinet-512/blob/main/add-model-metadata.py\"\n        \"\\n\"\n        \"(2) Paraformer\"\n        \"\\n    \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"paraformer-onnxruntime-python-example/blob/main/add-model-metadata.py\"\n        \"\\n\"\n        \"(3) Whisper\"\n        \"\\n\"\n        \"(4) Tdnn models of the yesno recipe from icefall\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/icefall/tree/master/egs/yesno/ASR/tdnn\"\n        \"\\n\"\n        \"(5) Zipformer CTC models from icefall\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/\"\n        \"zipformer/export-onnx-ctc.py\"\n        \"\\n\"\n        \"(6) CTC models from WeNet\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/run.sh\"\n        \"\\n\"\n        \"(7) CTC models from TeleSpeech\"\n        \"\\n    \"\n        \"https://github.com/Tele-AI/TeleSpeech-ASR\"\n        \"\\n\"\n        \"\\n\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (model_type == \"conformer\" || model_type == \"zipformer\" ||\n      model_type == \"zipformer2\") {\n    return std::make_unique<OfflineRecognizerTransducerImpl>(config);\n  }\n\n  if (model_type == \"paraformer\") {\n    return std::make_unique<OfflineRecognizerParaformerImpl>(config);\n  }\n\n  if ((model_type == \"EncDecHybridRNNTCTCBPEModel\" ||\n       model_type == \"EncDecRNNTBPEModel\") &&\n      !config.model_config.transducer.decoder_filename.empty() &&\n      !config.model_config.transducer.joiner_filename.empty()) {\n    return std::make_unique<OfflineRecognizerTransducerNeMoImpl>(config);\n  }\n\n  if (model_type == \"EncDecCTCModelBPE\" || 
model_type == \"EncDecCTCModel\" ||\n      model_type == \"EncDecHybridRNNTCTCBPEModel\" || model_type == \"tdnn\" ||\n      model_type == \"zipformer2_ctc\" || model_type == \"wenet_ctc\" ||\n      model_type == \"telespeech_ctc\") {\n    return std::make_unique<OfflineRecognizerCtcImpl>(config);\n  }\n\n  if (strncmp(model_type.c_str(), \"whisper\", 7) == 0) {\n    return std::make_unique<OfflineRecognizerWhisperImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"\\nUnsupported model_type: %s\\n\"\n      \"We support only the following model types at present: \\n\"\n      \" - Non-streaming transducer models from icefall\\n\"\n      \" - Non-streaming Paraformer models from FunASR\\n\"\n      \" - EncDecCTCModelBPE models from NeMo\\n\"\n      \" - EncDecCTCModel models from NeMo\\n\"\n      \" - EncDecHybridRNNTCTCBPEModel models from NeMo\\n\"\n      \" - EncDecRNNTBPEModel models from NeMo\\n\"\n      \" - Whisper models\\n\"\n      \" - Tdnn models\\n\"\n      \" - Zipformer CTC models\\n\"\n      \" - WeNet CTC models\\n\"\n      \" - TeleSpeech CTC models\\n\",\n      model_type.c_str());\n\n  SHERPA_ONNX_EXIT(-1);\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineRecognizerImpl> OfflineRecognizerImpl::Create(\n    Manager *mgr, const OfflineRecognizerConfig &config) {\n  if (config.model_config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelRknn>>(\n          mgr, config);\n    } else if (!config.model_config.paraformer.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelRknn>>(\n          mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice and Paraformer models are currently supported \"\n          \"by rknn for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    
}\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn. See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/rknn/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"axera\") {\n#if SHERPA_ONNX_ENABLE_AXERA\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAxera>>(\n          mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice models are currently supported by Axera NPU for \"\n          \"non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_AXERA=ON if you \"\n        \"want to use axera. See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/axera/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"axcl\") {\n#if SHERPA_ONNX_ENABLE_AXCL\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAxcl>>(\n          mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice models are currently supported by axcl for \"\n          \"non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_AXCL=ON if you \"\n        \"want to use axcl. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/axcl/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"ascend\") {\n#if SHERPA_ONNX_ENABLE_ASCEND_NPU\n    if (!config.model_config.sense_voice.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelAscend>>(\n          mgr, config);\n    } else if (!config.model_config.paraformer.model.empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelAscend>>(\n          mgr, config);\n    } else if (!config.model_config.zipformer_ctc.model.empty()) {\n      return std::make_unique<OfflineRecognizerZipformerCtcAscendImpl>(mgr,\n                                                                       config);\n    } else if (!config.model_config.whisper.encoder.empty()) {\n      return std::make_unique<\n          OfflineRecognizerWhisperTplImpl<OfflineWhisperModelAscend>>(mgr,\n                                                                      config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice, Paraformer, Whisper, and Zipformer CTC models are \"\n          \"currently supported by Ascend NPU for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_ASCEND_NPU=ON if \"\n        \"you want to use Ascend NPU. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/ascend/install.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (config.model_config.provider == \"qnn\") {\n#if SHERPA_ONNX_ENABLE_QNN\n    if (!config.model_config.sense_voice.model.empty() ||\n        !config.model_config.sense_voice.qnn_config.context_binary.empty()) {\n      return std::make_unique<\n          OfflineRecognizerSenseVoiceTplImpl<OfflineSenseVoiceModelQnn>>(\n          mgr, config);\n    } else if (!config.model_config.zipformer_ctc.model.empty() ||\n               !config.model_config.zipformer_ctc.qnn_config.context_binary\n                    .empty()) {\n      return std::make_unique<OfflineRecognizerZipformerCtcQnnImpl>(mgr,\n                                                                    config);\n    } else if (!config.model_config.paraformer.model.empty() ||\n               !config.model_config.paraformer.qnn_config.context_binary\n                    .empty()) {\n      return std::make_unique<\n          OfflineRecognizerParaformerTplImpl<OfflineParaformerModelQnn>>(\n          mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only SenseVoice, Paraformer, and Zipformer CTC models are currently \"\n          \"supported by QNN for non-streaming ASR.\");\n      SHERPA_ONNX_EXIT(-1);\n      return nullptr;\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_QNN=ON if \"\n        \"you want to use qnn. 
See also \"\n        \"https://k2-fsa.github.io/sherpa/onnx/qnn/build.html\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.model_config.sense_voice.model.empty()) {\n    return std::make_unique<OfflineRecognizerSenseVoiceImpl>(mgr, config);\n  }\n\n  if (!config.model_config.funasr_nano.encoder_adaptor.empty()) {\n    return std::make_unique<OfflineRecognizerFunASRNanoImpl>(mgr, config);\n  }\n\n  if (!config.model_config.paraformer.model.empty()) {\n    return std::make_unique<OfflineRecognizerParaformerImpl>(mgr, config);\n  }\n\n  if (!config.model_config.nemo_ctc.model.empty() ||\n      !config.model_config.zipformer_ctc.model.empty() ||\n      !config.model_config.tdnn.model.empty() ||\n      !config.model_config.wenet_ctc.model.empty() ||\n      !config.model_config.omnilingual.model.empty() ||\n      !config.model_config.medasr.model.empty() ||\n      !config.model_config.fire_red_asr_ctc.model.empty() ||\n      !config.model_config.dolphin.model.empty()) {\n    return std::make_unique<OfflineRecognizerCtcImpl>(mgr, config);\n  }\n\n  if (!config.model_config.whisper.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerWhisperImpl>(mgr, config);\n  }\n\n  if (!config.model_config.fire_red_asr.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerFireRedAsrImpl>(mgr, config);\n  }\n\n  if (!config.model_config.moonshine.preprocessor.empty()) {\n    return std::make_unique<OfflineRecognizerMoonshineImpl>(mgr, config);\n  }\n\n  if (!config.model_config.moonshine.merged_decoder.empty()) {\n    return std::make_unique<OfflineRecognizerMoonshineV2Impl>(mgr, config);\n  }\n\n  if (!config.model_config.canary.encoder.empty()) {\n    return std::make_unique<OfflineRecognizerCanaryImpl>(mgr, config);\n  }\n\n  // TODO(fangjun): Refactor it. We only need to use model type for the\n  // following models:\n  //  1. 
transducer and nemo_transducer\n  if (!config.model_config.model_type.empty()) {\n    const auto &model_type = config.model_config.model_type;\n    if (model_type == \"transducer\") {\n      return std::make_unique<OfflineRecognizerTransducerImpl>(mgr, config);\n    } else if (model_type == \"nemo_transducer\") {\n      return std::make_unique<OfflineRecognizerTransducerNeMoImpl>(mgr, config);\n    } else if (model_type == \"paraformer\") {\n      return std::make_unique<OfflineRecognizerParaformerImpl>(mgr, config);\n    } else if (model_type == \"nemo_ctc\" || model_type == \"tdnn\" ||\n               model_type == \"zipformer2_ctc\" || model_type == \"wenet_ctc\" ||\n               model_type == \"telespeech_ctc\") {\n      return std::make_unique<OfflineRecognizerCtcImpl>(mgr, config);\n    } else if (model_type == \"whisper\") {\n      return std::make_unique<OfflineRecognizerWhisperImpl>(mgr, config);\n    } else if (model_type == \"moonshine\") {\n      // unreachable code\n      return std::make_unique<OfflineRecognizerMoonshineImpl>(mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid model_type: %s. 
Trying to load the model to get its type\",\n          model_type.c_str());\n    }\n  }\n\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  std::string model_filename;\n  if (!config.model_config.transducer.encoder_filename.empty()) {\n    model_filename = config.model_config.transducer.encoder_filename;\n  } else if (!config.model_config.paraformer.model.empty()) {\n    model_filename = config.model_config.paraformer.model;\n  } else if (!config.model_config.nemo_ctc.model.empty()) {\n    model_filename = config.model_config.nemo_ctc.model;\n  } else if (!config.model_config.tdnn.model.empty()) {\n    model_filename = config.model_config.tdnn.model;\n  } else if (!config.model_config.zipformer_ctc.model.empty()) {\n    model_filename = config.model_config.zipformer_ctc.model;\n  } else if (!config.model_config.wenet_ctc.model.empty()) {\n    model_filename = config.model_config.wenet_ctc.model;\n  } else if (!config.model_config.telespeech_ctc.empty()) {\n    model_filename = config.model_config.telespeech_ctc;\n  } else if (!config.model_config.whisper.encoder.empty()) {\n    model_filename = config.model_config.whisper.encoder;\n  } else {\n    SHERPA_ONNX_LOGE(\"Please provide a model\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  auto buf = ReadFile(mgr, model_filename);\n\n  auto encoder_sess =\n      std::make_unique<Ort::Session>(env, buf.data(), buf.size(), sess_opts);\n\n  Ort::ModelMetadata meta_data = encoder_sess->GetModelMetadata();\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\\n\"\n        \"Please refer to the following URLs to add metadata\"\n        \"\\n\"\n        \"(0) Transducer models from icefall\"\n        \"\\n    \"\n 
       \"https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/\"\n        \"pruned_transducer_stateless7/export-onnx.py#L303\"\n        \"\\n\"\n        \"(1) Nemo CTC models\\n    \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"sherpa-onnx-nemo-ctc-en-citrinet-512/blob/main/add-model-metadata.py\"\n        \"\\n\"\n        \"(2) Paraformer\"\n        \"\\n    \"\n        \"https://huggingface.co/csukuangfj/\"\n        \"paraformer-onnxruntime-python-example/blob/main/add-model-metadata.py\"\n        \"\\n    \"\n        \"(3) Whisper\"\n        \"\\n    \"\n        \"(4) Tdnn models of the yesno recipe from icefall\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/icefall/tree/master/egs/yesno/ASR/tdnn\"\n        \"\\n\"\n        \"(5) Zipformer CTC models from icefall\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/\"\n        \"zipformer/export-onnx-ctc.py\"\n        \"\\n\"\n        \"(6) CTC models from WeNet\"\n        \"\\n    \"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/run.sh\"\n        \"\\n\"\n        \"(7) CTC models from TeleSpeech\"\n        \"\\n    \"\n        \"https://github.com/Tele-AI/TeleSpeech-ASR\"\n        \"\\n\"\n        \"\\n\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  if (model_type == \"conformer\" || model_type == \"zipformer\" ||\n      model_type == \"zipformer2\") {\n    return std::make_unique<OfflineRecognizerTransducerImpl>(mgr, config);\n  }\n\n  if (model_type == \"paraformer\") {\n    return std::make_unique<OfflineRecognizerParaformerImpl>(mgr, config);\n  }\n\n  if ((model_type == \"EncDecHybridRNNTCTCBPEModel\" ||\n       model_type == \"EncDecRNNTBPEModel\") &&\n      !config.model_config.transducer.decoder_filename.empty() &&\n      !config.model_config.transducer.joiner_filename.empty()) {\n    return std::make_unique<OfflineRecognizerTransducerNeMoImpl>(mgr, config);\n  }\n\n  if (model_type == 
\"EncDecCTCModelBPE\" || model_type == \"EncDecCTCModel\" ||\n      model_type == \"EncDecHybridRNNTCTCBPEModel\" || model_type == \"tdnn\" ||\n      model_type == \"zipformer2_ctc\" || model_type == \"wenet_ctc\" ||\n      model_type == \"telespeech_ctc\") {\n    return std::make_unique<OfflineRecognizerCtcImpl>(mgr, config);\n  }\n\n  if (strncmp(model_type.c_str(), \"whisper\", 7) == 0) {\n    return std::make_unique<OfflineRecognizerWhisperImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"\\nUnsupported model_type: %s\\n\"\n      \"We support only the following model types at present: \\n\"\n      \" - Non-streaming transducer models from icefall\\n\"\n      \" - Non-streaming Paraformer models from FunASR\\n\"\n      \" - EncDecCTCModelBPE models from NeMo\\n\"\n      \" - EncDecCTCModel models from NeMo\\n\"\n      \" - EncDecHybridRNNTCTCBPEModel models from NeMo\\n\"\n      \" - EncDecRNNTBPEModel models from NeMo\\n\"\n      \" - Whisper models\\n\"\n      \" - Tdnn models\\n\"\n      \" - Zipformer CTC models\\n\"\n      \" - WeNet CTC models\\n\"\n      \" - TeleSpeech CTC models\\n\",\n      model_type.c_str());\n\n  SHERPA_ONNX_EXIT(-1);\n}\n\nOfflineRecognizerImpl::OfflineRecognizerImpl(\n    const OfflineRecognizerConfig &config)\n    : config_(config) {\n  // TODO(fangjun): Refactor this function\n\n  if (!config.rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fsts, \",\", false, &files);\n    itn_list_.reserve(files.size());\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n      }\n      itn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n    }\n  }\n\n  if (!config.rule_fars.empty()) {\n    if (config.model_config.debug) {\n      SHERPA_ONNX_LOGE(\"Loading FST archives\");\n    }\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n    
itn_list_.reserve(files.size() + itn_list_.size());\n\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n      }\n      std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n          fst::FarReader<fst::StdArc>::Open(f));\n      for (; !reader->Done(); reader->Next()) {\n        std::unique_ptr<fst::StdConstFst> r(\n            fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n        itn_list_.push_back(\n            std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n      }\n    }\n\n    if (config.model_config.debug) {\n      SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n    }\n  }\n\n  if (!config.hr.lexicon.empty() && !config.hr.rule_fsts.empty()) {\n    auto hr_config = config.hr;\n    hr_config.debug = config.model_config.debug;\n    hr_ = std::make_unique<HomophoneReplacer>(hr_config);\n  }\n}\n\ntemplate <typename Manager>\nOfflineRecognizerImpl::OfflineRecognizerImpl(\n    Manager *mgr, const OfflineRecognizerConfig &config)\n    : config_(config) {\n  if (!config.rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fsts, \",\", false, &files);\n    itn_list_.reserve(files.size());\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n      }\n      auto buf = ReadFile(mgr, f);\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      itn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n    }\n  }\n\n  if (!config.rule_fars.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fars, \",\", false, &files);\n    itn_list_.reserve(files.size() + itn_list_.size());\n\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n      }\n\n      auto buf = ReadFile(mgr, f);\n\n      std::unique_ptr<std::istream> 
s(\n          new std::istringstream(std::string(buf.data(), buf.size())));\n\n      std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n          fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n      for (; !reader->Done(); reader->Next()) {\n        std::unique_ptr<fst::StdConstFst> r(\n            fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n        itn_list_.push_back(\n            std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n      }  // for (; !reader->Done(); reader->Next())\n    }  // for (const auto &f : files)\n  }  // if (!config.rule_fars.empty())\n\n  if (!config.hr.lexicon.empty() && !config.hr.rule_fsts.empty()) {\n    auto hr_config = config.hr;\n    hr_config.debug = config.model_config.debug;\n    hr_ = std::make_unique<HomophoneReplacer>(mgr, hr_config);\n  }\n}\n\nstd::string OfflineRecognizerImpl::ApplyInverseTextNormalization(\n    std::string text) const {\n  text = RemoveInvalidUtf8Sequences(text);\n\n  if (!itn_list_.empty()) {\n    for (const auto &tn : itn_list_) {\n      text = tn->Normalize(text);\n    }\n  }\n\n  return text;\n}\n\nstd::string OfflineRecognizerImpl::ApplyHomophoneReplacer(\n    std::string text) const {\n  if (hr_) {\n    text = hr_->Apply(text);\n  }\n\n  return text;\n}\n\nvoid OfflineRecognizerImpl::SetConfig(const OfflineRecognizerConfig &config) {\n  config_ = config;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineRecognizerImpl::OfflineRecognizerImpl(\n    AAssetManager *mgr, const OfflineRecognizerConfig &config);\n\ntemplate std::unique_ptr<OfflineRecognizerImpl> OfflineRecognizerImpl::Create(\n    AAssetManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineRecognizerImpl::OfflineRecognizerImpl(\n    NativeResourceManager *mgr, const OfflineRecognizerConfig &config);\ntemplate std::unique_ptr<OfflineRecognizerImpl> OfflineRecognizerImpl::Create(\n    NativeResourceManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n}  // 
namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-impl.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerImpl(const OfflineRecognizerConfig &config);\n\n  static std::unique_ptr<OfflineRecognizerImpl> Create(\n      const OfflineRecognizerConfig &config);\n\n  template <typename Manager>\n  OfflineRecognizerImpl(Manager *mgr, const OfflineRecognizerConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineRecognizerImpl> Create(\n      Manager *mgr, const OfflineRecognizerConfig &config);\n\n  virtual ~OfflineRecognizerImpl() = default;\n\n  virtual std::unique_ptr<OfflineStream> CreateStream(\n      const std::string &hotwords) const {\n    SHERPA_ONNX_LOGE(\"Only transducer models support contextual biasing.\");\n    exit(-1);\n  }\n\n  virtual std::unique_ptr<OfflineStream> CreateStream() const = 0;\n\n  virtual void DecodeStreams(OfflineStream **ss, int32_t n) const = 0;\n\n  virtual void SetConfig(const OfflineRecognizerConfig &config);\n\n  virtual OfflineRecognizerConfig GetConfig() const = 0;\n\n  std::string ApplyInverseTextNormalization(std::string text) const;\n\n  std::string ApplyHomophoneReplacer(std::string text) const;\n\n protected:\n  OfflineRecognizerConfig config_;\n  // for inverse text normalization. 
Used only if\n  // config.rule_fsts is not empty or\n  // config.rule_fars is not empty\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> itn_list_;\n  std::unique_ptr<HomophoneReplacer> hr_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-moonshine-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-model.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nOfflineRecognitionResult Convert(const OfflineMoonshineDecoderResult &src,\n                                 const SymbolTable &sym_table) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    if (!sym_table.Contains(i)) {\n      continue;\n    }\n\n    const auto &s = sym_table[i];\n    text += s;\n    r.tokens.push_back(s);\n  }\n\n  r.text = text;\n\n  return r;\n}\n\nclass OfflineRecognizerMoonshineImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerMoonshineImpl(const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineMoonshineModel>(config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerMoonshineImpl(Manager *mgr,\n                                 const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n     
   model_(\n            std::make_unique<OfflineMoonshineModel>(mgr, config.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ =\n          std::make_unique<OfflineMoonshineGreedySearchDecoder>(model_.get());\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only greedy_search is supported at present for moonshine. Given %s\",\n          config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    MoonshineTag tag;\n    return std::make_unique<OfflineStream>(tag);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // batch decoding is not implemented yet\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void DecodeStream(OfflineStream *s) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<float> audio = s->GetFrames();\n\n    try {\n      std::array<int64_t, 2> shape{1, static_cast<int64_t>(audio.size())};\n\n      Ort::Value audio_tensor = Ort::Value::CreateTensor(\n          memory_info, audio.data(), audio.size(), shape.data(), shape.size());\n\n      Ort::Value features =\n          model_->ForwardPreprocessor(std::move(audio_tensor));\n\n      int32_t features_len = features.GetTensorTypeAndShapeInfo().GetShape()[1];\n\n      int64_t features_shape = 1;\n\n      Ort::Value features_len_tensor = Ort::Value::CreateTensor(\n          memory_info, &features_len, 1, &features_shape, 1);\n\n      Ort::Value encoder_out = model_->ForwardEncoder(\n          std::move(features), std::move(features_len_tensor));\n\n      auto results = decoder_->Decode(std::move(encoder_out));\n\n      auto r = Convert(results[0], symbol_table_);\n      r.text = 
ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n      s->SetResult(r);\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\n          \"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result. Number of \"\n          \"audio samples: %d\",\n          ex.what(), static_cast<int32_t>(audio.size()));\n      return;\n    }\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineMoonshineModel> model_;\n  std::unique_ptr<OfflineMoonshineDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-moonshine-v2-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-moonshine-v2-impl.h\n//\n// Copyright (c)  2024-2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_V2_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_V2_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-model-v2.h\"\n#include \"sherpa-onnx/csrc/offline-moonshine-v2-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ./offline-recognizer-moonshine-impl.h\nOfflineRecognitionResult Convert(const OfflineMoonshineDecoderResult &src,\n                                 const SymbolTable &sym_table);\n\nclass OfflineRecognizerMoonshineV2Impl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerMoonshineV2Impl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineMoonshineModelV2>(config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerMoonshineV2Impl(Manager *mgr,\n                                   const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineMoonshineModelV2>(mgr,\n                                                         config.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    // tokens.txt from whisper is base64 
encoded, so we need to decode it\n    // See also ../../scripts/moonshine/v2/generate_tokens.py\n    symbol_table_.ApplyBase64Decode();\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ =\n          std::make_unique<OfflineMoonshineV2GreedySearchDecoder>(model_.get());\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only greedy_search is supported at present for moonshine. Given %s\",\n          config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    MoonshineTag tag;\n    return std::make_unique<OfflineStream>(tag);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // batch decoding is not implemented yet\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void DecodeStream(OfflineStream *s) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<float> audio = s->GetFrames();\n\n    try {\n      std::array<int64_t, 2> shape{1, static_cast<int64_t>(audio.size())};\n\n      Ort::Value audio_tensor = Ort::Value::CreateTensor(\n          memory_info, audio.data(), audio.size(), shape.data(), shape.size());\n\n      Ort::Value encoder_out = model_->ForwardEncoder(std::move(audio_tensor));\n\n      auto results = decoder_->Decode(std::move(encoder_out));\n\n      auto r = Convert(results[0], symbol_table_);\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n      s->SetResult(r);\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\n          \"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result. 
Number of \"\n          \"audio samples: %d\",\n          ex.what(), static_cast<int32_t>(audio.size()));\n      return;\n    }\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineMoonshineModelV2> model_;\n  std::unique_ptr<OfflineMoonshineDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_MOONSHINE_V2_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-paraformer-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-paraformer-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-paraformer-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-paraformer-model.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nOfflineRecognitionResult Convert(const OfflineParaformerDecoderResult &src,\n                                 const SymbolTable &sym_table) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps = src.timestamps;\n\n  std::string text;\n\n  // When the current token ends with \"@@\" we set mergeable to true\n  bool mergeable = false;\n\n  for (int32_t i = 0; i != src.tokens.size(); ++i) {\n    auto sym = sym_table[src.tokens[i]];\n    r.tokens.push_back(sym);\n\n    if ((sym.back() != '@') || (sym.size() > 2 && sym[sym.size() - 2] != '@')) {\n      // sym does not end with \"@@\"\n      const uint8_t *p = reinterpret_cast<const uint8_t *>(sym.c_str());\n      if (p[0] < 0x80) {\n        // an ascii\n        if (mergeable) {\n          mergeable = false;\n          text.append(sym);\n        } else {\n          text.append(\" \");\n          text.append(sym);\n        }\n      } else {\n        // not an ascii\n        mergeable = false;\n\n        if (i > 0) {\n          const uint8_t p = reinterpret_cast<const uint8_t *>(\n              sym_table[src.tokens[i - 1]].c_str())[0];\n       
   if (p < 0x80) {\n            // put a space between ascii and non-ascii\n            text.append(\" \");\n          }\n        }\n        text.append(sym);\n      }\n    } else {\n      // this sym ends with @@\n      sym = std::string(sym.data(), sym.size() - 2);\n      if (mergeable) {\n        text.append(sym);\n      } else {\n        text.append(\" \");\n        text.append(sym);\n        mergeable = true;\n      }\n    }\n  }\n  r.text = std::move(text);\n\n  return r;\n}\n\nclass OfflineRecognizerParaformerImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerParaformerImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineParaformerModel>(config.model_config)) {\n    if (config.decoding_method == \"greedy_search\") {\n      int32_t eos_id = symbol_table_[\"</s>\"];\n      decoder_ = std::make_unique<OfflineParaformerGreedySearchDecoder>(eos_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerParaformerImpl(Manager *mgr,\n                                  const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineParaformerModel>(mgr,\n                                                        config.model_config)) {\n    if (config.decoding_method == \"greedy_search\") {\n      int32_t eos_id = symbol_table_[\"</s>\"];\n      decoder_ = std::make_unique<OfflineParaformerGreedySearchDecoder>(eos_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // 1. Apply LFR\n    // 2. Apply CMVN\n    //\n    // Please refer to\n    // https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45555.pdf\n    // for what LFR means\n    //\n    // \"Lower Frame Rate Neural Network Acoustic Models\"\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<Ort::Value> features;\n    features.reserve(n);\n\n    int32_t feat_dim =\n        config_.feat_config.feature_dim * model_->LfrWindowSize();\n\n    std::vector<std::vector<float>> features_vec(n);\n    std::vector<int32_t> features_length_vec(n);\n    for (int32_t i = 0; i != n; ++i) {\n      std::vector<float> f = ss[i]->GetFrames();\n\n      f = ApplyLFR(f);\n      ApplyCMVN(&f);\n\n      int32_t num_frames = f.size() / feat_dim;\n      features_vec[i] = std::move(f);\n\n      features_length_vec[i] = num_frames;\n\n      std::array<int64_t, 2> shape = {num_frames, feat_dim};\n\n      Ort::Value x = Ort::Value::CreateTensor(\n          memory_info, features_vec[i].data(), features_vec[i].size(),\n          shape.data(), shape.size());\n      features.push_back(std::move(x));\n    }\n\n    std::vector<const Ort::Value *> features_pointer(n);\n    for (int32_t i = 0; i != n; ++i) {\n      features_pointer[i] = &features[i];\n    }\n\n    std::array<int64_t, 1> features_length_shape = {n};\n    Ort::Value x_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    // Caution(fangjun): We cannot pad it with log(eps),\n    // i.e., 
-23.025850929940457f\n    Ort::Value x = PadSequence(model_->Allocator(), features_pointer, 0);\n\n    std::vector<Ort::Value> t;\n    try {\n      t = model_->Forward(std::move(x), std::move(x_length));\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result\",\n                       ex.what());\n      return;\n    }\n\n    std::vector<OfflineParaformerDecoderResult> results;\n    if (t.size() == 2) {\n      results = decoder_->Decode(std::move(t[0]), std::move(t[1]));\n    } else {\n      results =\n          decoder_->Decode(std::move(t[0]), std::move(t[1]), std::move(t[3]));\n    }\n\n    for (int32_t i = 0; i != n; ++i) {\n      auto r = Convert(results[i], symbol_table_);\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n      ss[i]->SetResult(r);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void InitFeatConfig() {\n    // Paraformer models assume input samples are in the range\n    // [-32768, 32767], so we set normalize_samples to false\n    config_.feat_config.normalize_samples = false;\n    config_.feat_config.window_type = \"hamming\";\n    config_.feat_config.high_freq = 0;\n    config_.feat_config.snip_edges = true;\n  }\n\n  std::vector<float> ApplyLFR(const std::vector<float> &in) const {\n    int32_t lfr_window_size = model_->LfrWindowSize();\n    int32_t lfr_window_shift = model_->LfrWindowShift();\n    int32_t in_feat_dim = config_.feat_config.feature_dim;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(out_num_frames * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; 
++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  void ApplyCMVN(std::vector<float> *v) const {\n    const std::vector<float> &neg_mean = model_->NegativeMean();\n    const std::vector<float> &inv_stddev = model_->InverseStdDev();\n    int32_t dim = static_cast<int32_t>(neg_mean.size());\n    int32_t num_frames = static_cast<int32_t>(v->size()) / dim;\n\n    Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n        mat(v->data(), num_frames, dim);\n\n    Eigen::Map<const Eigen::RowVectorXf> neg_mean_vec(neg_mean.data(), dim);\n    Eigen::Map<const Eigen::RowVectorXf> inv_stddev_vec(inv_stddev.data(), dim);\n\n    mat.array() = (mat.array().rowwise() + neg_mean_vec.array()).rowwise() *\n                  inv_stddev_vec.array();\n  }\n\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineParaformerModel> model_;\n  std::unique_ptr<OfflineParaformerDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-paraformer-tpl-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-paraformer-tpl-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_TPL_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_TPL_IMPL_H_\n\n#include <algorithm>\n#include <iterator>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ../offline-recognizer-paraformer-impl.h\nOfflineRecognitionResult Convert(const OfflineParaformerDecoderResult &src,\n                                 const SymbolTable &sym_table);\n\ntemplate <typename ParaformerModel>\nclass OfflineRecognizerParaformerTplImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerParaformerTplImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<ParaformerModel>(config.model_config)) {\n    if (config.decoding_method != \"greedy_search\") {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerParaformerTplImpl(Manager *mgr,\n                                     const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<ParaformerModel>(mgr, config.model_config)) {\n    if (config.decoding_method != \"greedy_search\") {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      DecodeOneStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void InitFeatConfig() {\n    config_.feat_config.normalize_samples = false;\n    config_.feat_config.window_type = \"hamming\";\n    config_.feat_config.high_freq = 0;\n    config_.feat_config.snip_edges = true;\n  }\n\n  void DecodeOneStream(OfflineStream *s) const {\n    std::vector<float> f = s->GetFrames();\n\n    std::vector<float> logits = model_->Run(std::move(f));\n    if (logits.empty()) {\n      SHERPA_ONNX_LOGE(\"No speech detected\");\n      return;\n    }\n\n    int32_t vocab_size = model_->VocabSize();\n    int32_t num_tokens = logits.size() / vocab_size;\n\n    int32_t eos_id = symbol_table_[\"</s>\"];\n\n    OfflineParaformerDecoderResult r;\n    const float *p = logits.data();\n    for (int32_t i = 0; i < num_tokens; ++i) {\n      auto max_idx = static_cast<int64_t>(\n          std::distance(p, std::max_element(p, p + vocab_size)));\n\n      if (max_idx == eos_id) {\n        break;\n      }\n      r.tokens.push_back(max_idx);\n      p += vocab_size;\n    }\n\n    auto result = Convert(r, symbol_table_);\n    result.text = ApplyInverseTextNormalization(std::move(result.text));\n    result.text = ApplyHomophoneReplacer(std::move(result.text));\n    s->SetResult(result);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<ParaformerModel> model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_PARAFORMER_TPL_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-sense-voice-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-sense-voice-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_IMPL_H_\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nOfflineRecognitionResult ConvertSenseVoiceResult(\n    const OfflineCtcDecoderResult &src, const SymbolTable &sym_table,\n    int32_t frame_shift_ms, int32_t subsampling_factor,\n    bool is_funasr_nano = false) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.timestamps.size());\n\n  std::string text;\n\n  // FunASR Nano does not support emotion, event, language, etc., so no\n  // leading special tokens need to be skipped for it.\n  int32_t start = is_funasr_nano ? 0 : 4;\n\n  for (int32_t i = start; i < src.tokens.size(); ++i) {\n    auto sym = sym_table[src.tokens[i]];\n    text.append(sym);\n\n    r.tokens.push_back(std::move(sym));\n  }\n  r.text = std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. 
* subsampling_factor;\n\n  for (int32_t i = start; i < src.timestamps.size(); ++i) {\n    float time = frame_shift_s * (src.timestamps[i] - start);\n    r.timestamps.push_back(time);\n  }\n\n  // src is const, so this copies; std::move would have no effect here.\n  r.words = src.words;\n\n  if (!is_funasr_nano) {\n    // parse lang, emotion and event from tokens.\n    if (src.tokens.size() >= 3) {\n      r.lang = sym_table[src.tokens[0]];\n      r.emotion = sym_table[src.tokens[1]];\n      r.event = sym_table[src.tokens[2]];\n    }\n  }\n\n  return r;\n}\n\nclass OfflineRecognizerSenseVoiceImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerSenseVoiceImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineSenseVoiceModel>(config.model_config)) {\n    const auto &meta_data = model_->GetModelMetadata();\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ =\n          std::make_unique<OfflineCtcGreedySearchDecoder>(meta_data.blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerSenseVoiceImpl(Manager *mgr,\n                                  const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineSenseVoiceModel>(mgr,\n                                                        config.model_config)) {\n    const auto &meta_data = model_->GetModelMetadata();\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ =\n          std::make_unique<OfflineCtcGreedySearchDecoder>(meta_data.blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    PostInit();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    if (meta_data.is_funasr_nano) {\n      for (int32_t i = 0; i < n; ++i) {\n        DecodeOneStreamFunAsrNano(ss[i]);\n      }\n\n      return;\n    }\n\n    if (n == 1) {\n      DecodeOneStream(ss[0]);\n      return;\n    }\n\n    // 1. Apply LFR\n    // 2. Apply CMVN\n    //\n    // Please refer to\n    // https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45555.pdf\n    // for what LFR means\n    //\n    // \"Lower Frame Rate Neural Network Acoustic Models\"\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<Ort::Value> features;\n    features.reserve(n);\n\n    int32_t feat_dim = config_.feat_config.feature_dim * meta_data.window_size;\n\n    std::vector<std::vector<float>> features_vec(n);\n    std::vector<int32_t> features_length_vec(n);\n    for (int32_t i = 0; i != n; ++i) {\n      std::vector<float> f = ss[i]->GetFrames();\n\n      f = ApplyLFR(f);\n      ApplyCMVN(&f);\n\n      int32_t num_frames = f.size() / feat_dim;\n      features_vec[i] = std::move(f);\n\n      features_length_vec[i] = num_frames;\n\n      std::array<int64_t, 2> shape = {num_frames, feat_dim};\n\n      Ort::Value x = Ort::Value::CreateTensor(\n          memory_info, features_vec[i].data(), features_vec[i].size(),\n          shape.data(), shape.size());\n      features.push_back(std::move(x));\n    }\n\n    std::vector<const Ort::Value *> features_pointer(n);\n    for (int32_t i = 0; i != n; ++i) {\n      features_pointer[i] = &features[i];\n    }\n\n    std::array<int64_t, 1> 
features_length_shape = {n};\n    Ort::Value x_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    // Caution(fangjun): We cannot pad it with log(eps),\n    // i.e., -23.025850929940457f\n    Ort::Value x = PadSequence(model_->Allocator(), features_pointer, 0);\n\n    int32_t language = 0;\n    if (config_.model_config.sense_voice.language.empty()) {\n      language = 0;\n    } else if (meta_data.lang2id.count(\n                   config_.model_config.sense_voice.language)) {\n      language =\n          meta_data.lang2id.at(config_.model_config.sense_voice.language);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unknown language: %s. Use 0 instead.\",\n                       config_.model_config.sense_voice.language.c_str());\n    }\n\n    std::vector<int32_t> language_array(n);\n    std::fill(language_array.begin(), language_array.end(), language);\n\n    std::vector<int32_t> text_norm_array(n);\n    std::fill(text_norm_array.begin(), text_norm_array.end(),\n              config_.model_config.sense_voice.use_itn\n                  ? 
meta_data.with_itn_id\n                  : meta_data.without_itn_id);\n\n    Ort::Value language_tensor = Ort::Value::CreateTensor(\n        memory_info, language_array.data(), n, features_length_shape.data(),\n        features_length_shape.size());\n\n    Ort::Value text_norm_tensor = Ort::Value::CreateTensor(\n        memory_info, text_norm_array.data(), n, features_length_shape.data(),\n        features_length_shape.size());\n\n    Ort::Value logits{nullptr};\n    try {\n      logits = model_->Forward(std::move(x), std::move(x_length),\n                               std::move(language_tensor),\n                               std::move(text_norm_tensor));\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result\",\n                       ex.what());\n      return;\n    }\n\n    // decoder_->Decode() requires that logits_length is of dtype int64\n    std::vector<int64_t> features_length_vec_64;\n    features_length_vec_64.reserve(n);\n    for (auto i : features_length_vec) {\n      i += 4;\n      features_length_vec_64.push_back(i);\n    }\n\n    Ort::Value logits_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec_64.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    auto results =\n        decoder_->Decode(std::move(logits), std::move(logits_length));\n\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = meta_data.window_shift;\n    for (int32_t i = 0; i != n; ++i) {\n      auto r = ConvertSenseVoiceResult(results[i], symbol_table_,\n                                       frame_shift_ms, subsampling_factor);\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n      ss[i]->SetResult(r);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void DecodeOneStreamFunAsrNano(OfflineStream *s) const 
{\n    const auto &meta_data = model_->GetModelMetadata();\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = config_.feat_config.feature_dim * meta_data.window_size;\n    std::vector<float> f = s->GetFrames();\n    f = ApplyLFR(f);\n\n    int32_t num_frames = f.size() / feat_dim;\n    std::array<int64_t, 3> shape = {1, num_frames, feat_dim};\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    Ort::Value logits{nullptr};\n    try {\n      logits = model_->Forward(std::move(x));\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result\",\n                       ex.what());\n      return;\n    }\n\n    int64_t new_num_frames = logits.GetTensorTypeAndShapeInfo().GetShape()[1];\n    int64_t num_frame_shape = 1;\n    Ort::Value logits_length = Ort::Value::CreateTensor(\n        memory_info, &new_num_frames, 1, &num_frame_shape, 1);\n\n    auto results =\n        decoder_->Decode(std::move(logits), std::move(logits_length));\n\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = meta_data.window_shift;\n    auto r = ConvertSenseVoiceResult(results[0], symbol_table_, frame_shift_ms,\n                                     subsampling_factor, true);\n\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n  void DecodeOneStream(OfflineStream *s) const {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = config_.feat_config.feature_dim * meta_data.window_size;\n    std::vector<float> f = s->GetFrames();\n    f = ApplyLFR(f);\n    ApplyCMVN(&f);\n    int32_t num_frames = f.size() / 
feat_dim;\n    std::array<int64_t, 3> shape = {1, num_frames, feat_dim};\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, f.data(), f.size(),\n                                            shape.data(), shape.size());\n\n    int64_t scale_shape = 1;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &num_frames, 1, &scale_shape, 1);\n\n    int32_t language = 0;\n    if (config_.model_config.sense_voice.language.empty()) {\n      language = 0;\n    } else if (meta_data.lang2id.count(\n                   config_.model_config.sense_voice.language)) {\n      language =\n          meta_data.lang2id.at(config_.model_config.sense_voice.language);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unknown language: %s. Use 0 instead.\",\n                       config_.model_config.sense_voice.language.c_str());\n    }\n\n    int32_t text_norm = config_.model_config.sense_voice.use_itn\n                            ? meta_data.with_itn_id\n                            : meta_data.without_itn_id;\n\n    Ort::Value language_tensor =\n        Ort::Value::CreateTensor(memory_info, &language, 1, &scale_shape, 1);\n\n    Ort::Value text_norm_tensor =\n        Ort::Value::CreateTensor(memory_info, &text_norm, 1, &scale_shape, 1);\n\n    Ort::Value logits{nullptr};\n    try {\n      logits = model_->Forward(std::move(x), std::move(x_length),\n                               std::move(language_tensor),\n                               std::move(text_norm_tensor));\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result\",\n                       ex.what());\n      return;\n    }\n\n    int64_t new_num_frames = num_frames + 4;\n    Ort::Value logits_length = Ort::Value::CreateTensor(\n        memory_info, &new_num_frames, 1, &scale_shape, 1);\n\n    auto results =\n        decoder_->Decode(std::move(logits), std::move(logits_length));\n\n    int32_t frame_shift_ms = 10;\n    int32_t 
subsampling_factor = meta_data.window_shift;\n    auto r = ConvertSenseVoiceResult(results[0], symbol_table_, frame_shift_ms,\n                                     subsampling_factor);\n\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n  void PostInit() {\n    InitFeatConfig();\n\n    const auto &meta_data = model_->GetModelMetadata();\n    if (meta_data.is_funasr_nano) {\n      symbol_table_.ApplyBase64Decode();\n    }\n  }\n\n  void InitFeatConfig() {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    config_.feat_config.normalize_samples = meta_data.normalize_samples;\n    config_.feat_config.window_type = \"hamming\";\n    config_.feat_config.high_freq = 0;\n    config_.feat_config.snip_edges = true;\n  }\n\n  std::vector<float> ApplyLFR(const std::vector<float> &in) const {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    int32_t lfr_window_size = meta_data.window_size;\n    int32_t lfr_window_shift = meta_data.window_shift;\n    int32_t in_feat_dim = config_.feat_config.feature_dim;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(out_num_frames * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  void ApplyCMVN(std::vector<float> *v) const {\n    const auto &meta_data = model_->GetModelMetadata();\n    const std::vector<float> &neg_mean = meta_data.neg_mean;\n    const std::vector<float> &inv_stddev = meta_data.inv_stddev;\n    int32_t dim = static_cast<int32_t>(neg_mean.size());\n    
int32_t num_frames = static_cast<int32_t>(v->size()) / dim;\n    Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n        mat(v->data(), num_frames, dim);\n    Eigen::Map<const Eigen::RowVectorXf> neg_mean_vec(neg_mean.data(), dim);\n\n    Eigen::Map<const Eigen::RowVectorXf> inv_stddev_vec(inv_stddev.data(), dim);\n    mat.array() = (mat.array().rowwise() + neg_mean_vec.array()).rowwise() *\n                  inv_stddev_vec.array();\n  }\n\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineSenseVoiceModel> model_;\n  std::unique_ptr<OfflineCtcDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-sense-voice-tpl-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-sense-voice-tpl-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_TPL_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_TPL_IMPL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ../offline-recognizer-sense-voice-impl.h\nOfflineRecognitionResult ConvertSenseVoiceResult(\n    const OfflineCtcDecoderResult &src, const SymbolTable &sym_table,\n    int32_t frame_shift_ms, int32_t subsampling_factor,\n    bool is_funasr_nano /*= false*/);\n\ntemplate <typename SenseVoiceModel>\nclass OfflineRecognizerSenseVoiceTplImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerSenseVoiceTplImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<SenseVoiceModel>(config.model_config)) {\n    const auto &meta_data = model_->GetModelMetadata();\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineCtcGreedySearchDecoderRknn>(\n          meta_data.blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerSenseVoiceTplImpl(Manager *mgr,\n                                     const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<SenseVoiceModel>(mgr, config.model_config)) {\n    const auto &meta_data = model_->GetModelMetadata();\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineCtcGreedySearchDecoderRknn>(\n          meta_data.blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. Given %s\",\n                       config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    InitFeatConfig();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      DecodeOneStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void InitFeatConfig() {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    config_.feat_config.normalize_samples = meta_data.normalize_samples;\n    config_.feat_config.window_type = \"hamming\";\n    config_.feat_config.high_freq = 0;\n    config_.feat_config.snip_edges = true;\n  }\n\n  void DecodeOneStream(OfflineStream *s) const {\n    const auto &meta_data = model_->GetModelMetadata();\n\n    std::vector<float> f = s->GetFrames();\n\n    int32_t language = 0;\n    if (config_.model_config.sense_voice.language.empty()) {\n      language = 0;\n    } else if (meta_data.lang2id.count(\n                   config_.model_config.sense_voice.language)) {\n  
    language =\n          meta_data.lang2id.at(config_.model_config.sense_voice.language);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unknown language: %s. Use 0 instead.\",\n                       config_.model_config.sense_voice.language.c_str());\n    }\n\n    int32_t text_norm = config_.model_config.sense_voice.use_itn\n                            ? meta_data.with_itn_id\n                            : meta_data.without_itn_id;\n\n    std::vector<float> logits = model_->Run(std::move(f), language, text_norm);\n    if (logits.empty()) {\n      return;\n    }\n\n    int32_t num_out_frames = logits.size() / meta_data.vocab_size;\n\n    auto result =\n        decoder_->Decode(logits.data(), num_out_frames, meta_data.vocab_size);\n\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = meta_data.window_shift;\n    auto r = ConvertSenseVoiceResult(result, symbol_table_, frame_shift_ms,\n                                     subsampling_factor);\n\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<SenseVoiceModel> model_;\n  std::unique_ptr<OfflineCtcGreedySearchDecoderRknn> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_SENSE_VOICE_TPL_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-transducer-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-transducer-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n\n#include <fstream>\n#include <ios>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/context-graph.h\"\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-model.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\nstatic OfflineRecognitionResult Convert(\n    const OfflineTransducerDecoderResult &src, const SymbolTable &sym_table,\n    int32_t frame_shift_ms, int32_t subsampling_factor) {\n  OfflineRecognitionResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.timestamps.size());\n  r.durations.reserve(src.durations.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    auto sym = sym_table[i];\n    text.append(sym);\n\n    if (sym.size() == 1 && (sym[0] < 0x20 || sym[0] > 0x7e)) {\n      // for bpe models with byte_fallback,\n      // (but don't rewrite printable characters 0x20..0x7e,\n      //  which collide with standard BPE units)\n      std::ostringstream os;\n      os << \"<0x\" << std::hex << std::uppercase\n         << (static_cast<int32_t>(sym[0]) & 0xff) << \">\";\n      sym = os.str();\n    }\n\n    
r.tokens.push_back(std::move(sym));\n  }\n  if (sym_table.IsByteBpe()) {\n    text = sym_table.DecodeByteBpe(text);\n  }\n\n  r.text = std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. * subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  // Copy durations (if present)\n  for (auto d : src.durations) {\n    r.durations.push_back(d * frame_shift_s);\n  }\n\n  // Copy token log probabilities (confidence scores)\n  r.ys_log_probs = src.ys_log_probs;\n\n  return r;\n}\n\nclass OfflineRecognizerTransducerImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerTransducerImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineTransducerModel>(config_.model_config)) {\n    if (symbol_table_.Contains(\"<unk>\")) {\n      unk_id_ = symbol_table_[\"<unk>\"];\n    }\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineTransducerGreedySearchDecoder>(\n          model_.get(), unk_id_, config_.blank_penalty);\n    } else if (config_.decoding_method == \"modified_beam_search\") {\n      if (!config_.lm_config.model.empty()) {\n        lm_ = OfflineLM::Create(config.lm_config);\n      }\n\n      if (!config_.model_config.bpe_vocab.empty()) {\n        bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(\n            config_.model_config.bpe_vocab);\n      }\n\n      if (!config_.hotwords_file.empty()) {\n        InitHotwords();\n      }\n\n      decoder_ = std::make_unique<OfflineTransducerModifiedBeamSearchDecoder>(\n          model_.get(), lm_.get(), config_.max_active_paths,\n          config_.lm_config.scale, unk_id_, config_.blank_penalty);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       
config_.decoding_method.c_str());\n      exit(-1);\n    }\n  }\n\n  template <typename Manager>\n  explicit OfflineRecognizerTransducerImpl(\n      Manager *mgr, const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineTransducerModel>(mgr,\n                                                        config_.model_config)) {\n    if (symbol_table_.Contains(\"<unk>\")) {\n      unk_id_ = symbol_table_[\"<unk>\"];\n    }\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineTransducerGreedySearchDecoder>(\n          model_.get(), unk_id_, config_.blank_penalty);\n    } else if (config_.decoding_method == \"modified_beam_search\") {\n      if (!config_.lm_config.model.empty()) {\n        lm_ = OfflineLM::Create(mgr, config.lm_config);\n      }\n\n      if (!config_.model_config.bpe_vocab.empty()) {\n        auto buf = ReadFile(mgr, config_.model_config.bpe_vocab);\n        std::istringstream iss(std::string(buf.begin(), buf.end()));\n        bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(iss);\n      }\n\n      if (!config_.hotwords_file.empty()) {\n        InitHotwords(mgr);\n      }\n\n      decoder_ = std::make_unique<OfflineTransducerModifiedBeamSearchDecoder>(\n          model_.get(), lm_.get(), config_.max_active_paths,\n          config_.lm_config.scale, unk_id_, config_.blank_penalty);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config_.decoding_method.c_str());\n      exit(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream(\n      const std::string &hotwords) const override {\n    auto hws = std::regex_replace(hotwords, std::regex(\"/\"), \"\\n\");\n    std::istringstream is(hws);\n    std::vector<std::vector<int32_t>> current;\n    std::vector<float> current_scores;\n    if 
(!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        bpe_encoder_.get(), &current, &current_scores)) {\n      SHERPA_ONNX_LOGE(\"Encode hotwords failed, skipping, hotwords are : '%s'\",\n                       hotwords.c_str());\n    }\n\n    int32_t num_default_hws = hotwords_.size();\n    int32_t num_hws = current.size();\n\n    current.insert(current.end(), hotwords_.begin(), hotwords_.end());\n\n    if (!current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else if (!current_scores.empty() && boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_default_hws,\n                            config_.hotwords_score);\n    } else if (current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_hws,\n                            config_.hotwords_score);\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else {\n      // Do nothing.\n    }\n\n    auto context_graph = std::make_shared<ContextGraph>(\n        current, config_.hotwords_score, current_scores);\n    return std::make_unique<OfflineStream>(config_.feat_config, context_graph);\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config,\n                                           hotwords_graph_);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = ss[0]->FeatureDim();\n\n    std::vector<Ort::Value> features;\n\n    features.reserve(n);\n\n    std::vector<std::vector<float>> features_vec(n);\n    std::vector<int64_t> features_length_vec(n);\n    for (int32_t i = 0; i 
!= n; ++i) {\n      auto f = ss[i]->GetFrames();\n      int32_t num_frames = f.size() / feat_dim;\n\n      features_length_vec[i] = num_frames;\n      features_vec[i] = std::move(f);\n\n      std::array<int64_t, 2> shape = {num_frames, feat_dim};\n\n      Ort::Value x = Ort::Value::CreateTensor(\n          memory_info, features_vec[i].data(), features_vec[i].size(),\n          shape.data(), shape.size());\n      features.push_back(std::move(x));\n    }\n\n    std::vector<const Ort::Value *> features_pointer(n);\n    for (int32_t i = 0; i != n; ++i) {\n      features_pointer[i] = &features[i];\n    }\n\n    std::array<int64_t, 1> features_length_shape = {n};\n    Ort::Value x_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    Ort::Value x = PadSequence(model_->Allocator(), features_pointer,\n                               -23.025850929940457f);\n\n    auto t = model_->RunEncoder(std::move(x), std::move(x_length));\n    auto results =\n        decoder_->Decode(std::move(t.first), std::move(t.second), ss, n);\n\n    int32_t frame_shift_ms = 10;\n    for (int32_t i = 0; i != n; ++i) {\n      auto r = Convert(results[i], symbol_table_, frame_shift_ms,\n                       model_->SubsamplingFactor());\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n\n      ss[i]->SetResult(r);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n  void InitHotwords() {\n    // each line in hotwords_file contains space-separated words\n\n    std::ifstream is(config_.hotwords_file);\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Open hotwords file failed: '%s'\",\n                       config_.hotwords_file.c_str());\n      exit(-1);\n    }\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        
bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Some hotwords failed to encode and were skipped. See above for \"\n          \"details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n  template <typename Manager>\n  void InitHotwords(Manager *mgr) {\n    // each line in hotwords_file contains space-separated words\n\n    auto buf = ReadFile(mgr, config_.hotwords_file);\n\n    std::istringstream is(std::string(buf.begin(), buf.end()));\n\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Open hotwords file failed: '%s'\",\n                       config_.hotwords_file.c_str());\n      exit(-1);\n    }\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Some hotwords failed to encode and were skipped. See above for \"\n          \"details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::vector<std::vector<int32_t>> hotwords_;\n  std::vector<float> boost_scores_;\n  ContextGraphPtr hotwords_graph_;\n  std::unique_ptr<ssentencepiece::Ssentencepiece> bpe_encoder_;\n  std::unique_ptr<OfflineTransducerModel> model_;\n  std::unique_ptr<OfflineTransducerDecoder> decoder_;\n  std::unique_ptr<OfflineLM> lm_;\n  int32_t unk_id_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-transducer-nemo-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-transducer-nemo-impl.h\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n\n#include <fstream>\n#include <ios>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-nemo-model.h\"\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ./offline-recognizer-transducer-impl.h\nOfflineRecognitionResult Convert(const OfflineTransducerDecoderResult &src,\n                                 const SymbolTable &sym_table,\n                                 int32_t frame_shift_ms,\n                                 int32_t subsampling_factor);\n\nclass OfflineRecognizerTransducerNeMoImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerTransducerNeMoImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineTransducerNeMoModel>(\n            config_.model_config)) {\n    if (symbol_table_.Contains(\"<unk>\")) {\n      unk_id_ = symbol_table_[\"<unk>\"];\n    }\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = 
std::make_unique<OfflineTransducerGreedySearchNeMoDecoder>(\n          model_.get(), config_.blank_penalty, model_->IsTDT());\n    } else if (config_.decoding_method == \"modified_beam_search\") {\n      // Initialize BPE encoder if provided\n      if (!config_.model_config.bpe_vocab.empty()) {\n        bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(\n            config_.model_config.bpe_vocab);\n      }\n\n      // Initialize hotwords if provided\n      if (!config_.hotwords_file.empty()) {\n        InitHotwords();\n      }\n\n      decoder_ =\n          std::make_unique<OfflineTransducerModifiedBeamSearchNeMoDecoder>(\n              model_.get(), config_.max_active_paths, unk_id_,\n              config_.blank_penalty, model_->IsTDT(), config_.hotwords_score);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    PostInit();\n  }\n\n  template <typename Manager>\n  explicit OfflineRecognizerTransducerNeMoImpl(\n      Manager *mgr, const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineTransducerNeMoModel>(\n            mgr, config_.model_config)) {\n    if (symbol_table_.Contains(\"<unk>\")) {\n      unk_id_ = symbol_table_[\"<unk>\"];\n    }\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineTransducerGreedySearchNeMoDecoder>(\n          model_.get(), config_.blank_penalty, model_->IsTDT());\n    } else if (config_.decoding_method == \"modified_beam_search\") {\n      // Initialize BPE encoder if provided\n      if (!config_.model_config.bpe_vocab.empty()) {\n        auto buf = ReadFile(mgr, config_.model_config.bpe_vocab);\n        std::istringstream iss(std::string(buf.begin(), buf.end()));\n        bpe_encoder_ = 
std::make_unique<ssentencepiece::Ssentencepiece>(iss);\n      }\n\n      // Initialize hotwords if provided\n      if (!config_.hotwords_file.empty()) {\n        InitHotwords(mgr);\n      }\n\n      decoder_ =\n          std::make_unique<OfflineTransducerModifiedBeamSearchNeMoDecoder>(\n              model_.get(), config_.max_active_paths, unk_id_,\n              config_.blank_penalty, model_->IsTDT(), config_.hotwords_score);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    PostInit();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream(\n      const std::string &hotwords) const override {\n    auto hws = std::regex_replace(hotwords, std::regex(\"/\"), \"\\n\");\n    std::istringstream is(hws);\n    std::vector<std::vector<int32_t>> current;\n    std::vector<float> current_scores;\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        bpe_encoder_.get(), &current, &current_scores)) {\n      SHERPA_ONNX_LOGE(\"Encode hotwords failed, skipping, hotwords are : '%s'\",\n                       hotwords.c_str());\n    }\n\n    int32_t num_default_hws = hotwords_.size();\n    int32_t num_hws = current.size();\n\n    current.insert(current.end(), hotwords_.begin(), hotwords_.end());\n\n    if (!current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else if (!current_scores.empty() && boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_default_hws,\n                            config_.hotwords_score);\n    } else if (current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_hws,\n                            config_.hotwords_score);\n      current_scores.insert(current_scores.end(), 
boost_scores_.begin(),\n                            boost_scores_.end());\n    } else {\n      // Do nothing.\n    }\n\n    auto context_graph = std::make_shared<ContextGraph>(\n        current, config_.hotwords_score, current_scores);\n    return std::make_unique<OfflineStream>(config_.feat_config, context_graph);\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config,\n                                           hotwords_graph_);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = ss[0]->FeatureDim();\n\n    std::vector<Ort::Value> features;\n\n    features.reserve(n);\n\n    std::vector<std::vector<float>> features_vec(n);\n    std::vector<int64_t> features_length_vec(n);\n    for (int32_t i = 0; i != n; ++i) {\n      auto f = ss[i]->GetFrames();\n      int32_t num_frames = f.size() / feat_dim;\n\n      features_length_vec[i] = num_frames;\n      features_vec[i] = std::move(f);\n\n      std::array<int64_t, 2> shape = {num_frames, feat_dim};\n\n      Ort::Value x = Ort::Value::CreateTensor(\n          memory_info, features_vec[i].data(), features_vec[i].size(),\n          shape.data(), shape.size());\n      features.push_back(std::move(x));\n    }\n\n    std::vector<const Ort::Value *> features_pointer(n);\n    for (int32_t i = 0; i != n; ++i) {\n      features_pointer[i] = &features[i];\n    }\n\n    std::array<int64_t, 1> features_length_shape = {n};\n    Ort::Value x_length = Ort::Value::CreateTensor(\n        memory_info, features_length_vec.data(), n,\n        features_length_shape.data(), features_length_shape.size());\n\n    Ort::Value x = PadSequence(model_->Allocator(), features_pointer, 0);\n\n    auto t = model_->RunEncoder(std::move(x), std::move(x_length));\n    // t[0] encoder_out, float tensor, (batch_size, dim, T)\n    
// t[1] encoder_out_length, int64 tensor, (batch_size,)\n\n    Ort::Value encoder_out = Transpose12(model_->Allocator(), &t[0]);\n\n    auto results =\n        decoder_->Decode(std::move(encoder_out), std::move(t[1]), ss, n);\n\n    int32_t frame_shift_ms = 10;\n    for (int32_t i = 0; i != n; ++i) {\n      auto r = Convert(results[i], symbol_table_, frame_shift_ms,\n                       model_->SubsamplingFactor());\n\n      // Remove leading space from BPE tokenization\n      if (!r.text.empty() && r.text.front() == ' ') {\n        r.text.erase(0, 1);\n      }\n\n      r.text = ApplyInverseTextNormalization(std::move(r.text));\n      r.text = ApplyHomophoneReplacer(std::move(r.text));\n\n      ss[i]->SetResult(r);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void PostInit() {\n    int32_t feat_dim = model_->FeatureDim();\n\n    if (feat_dim > 0) {\n      config_.feat_config.feature_dim = feat_dim;\n    }\n\n    config_.feat_config.nemo_normalize_type =\n        model_->FeatureNormalizationMethod();\n\n    if (model_->IsGigaAM()) {\n      config_.feat_config.low_freq = 0;\n      config_.feat_config.high_freq = 8000;\n      config_.feat_config.remove_dc_offset = false;\n      config_.feat_config.preemph_coeff = 0;\n      config_.feat_config.window_type = \"hann\";\n      config_.feat_config.feature_dim = 64;\n\n      // see\n      // https://github.com/salute-developers/GigaAM/blob/main/gigaam/preprocess.py#L68\n      //\n      // GigaAM uses n_fft 400\n      config_.feat_config.round_to_power_of_two = false;\n    } else {\n      config_.feat_config.low_freq = 0;\n      // config_.feat_config.high_freq = 8000;\n      config_.feat_config.is_librosa = true;\n      config_.feat_config.remove_dc_offset = false;\n      // config_.feat_config.window_type = \"hann\";\n    }\n\n    int32_t vocab_size = model_->VocabSize();\n\n    // check the blank ID\n    if (!symbol_table_.Contains(\"<blk>\")) {\n      
SHERPA_ONNX_LOGE(\"tokens.txt does not include the blank token <blk>\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (symbol_table_[\"<blk>\"] != vocab_size - 1) {\n      SHERPA_ONNX_LOGE(\"<blk> is not the last token!\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (symbol_table_.NumSymbols() != vocab_size) {\n      SHERPA_ONNX_LOGE(\"number of lines in tokens.txt %d != %d (vocab_size)\",\n                       symbol_table_.NumSymbols(), vocab_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void InitHotwords() {\n    // each line in hotwords_file contains space-separated words\n\n    std::ifstream is(config_.hotwords_file);\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Open hotwords file failed: '%s'\",\n                       config_.hotwords_file.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Some hotwords failed to encode and were skipped. See above for \"\n          \"details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n  template <typename Manager>\n  void InitHotwords(Manager *mgr) {\n    // each line in hotwords_file contains space-separated words\n\n    auto buf = ReadFile(mgr, config_.hotwords_file);\n\n    std::istringstream is(std::string(buf.begin(), buf.end()));\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, symbol_table_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Some hotwords failed to encode and were skipped. 
See above for \"\n          \"details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::vector<std::vector<int32_t>> hotwords_;\n  std::vector<float> boost_scores_;\n  ContextGraphPtr hotwords_graph_;\n  std::unique_ptr<ssentencepiece::Ssentencepiece> bpe_encoder_;\n  std::unique_ptr<OfflineTransducerNeMoModel> model_;\n  std::unique_ptr<OfflineTransducerDecoder> decoder_;\n  int32_t unk_id_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-whisper-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-whisper-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-dtw.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRecognizerWhisperImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerWhisperImpl(const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineWhisperModel>(config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerWhisperImpl(Manager *mgr,\n                               const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(\n            std::make_unique<OfflineWhisperModel>(mgr, config.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    // tokens.txt from whisper is base64 encoded, so we need to decode it\n    symbol_table_.ApplyBase64Decode();\n\n    if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OfflineWhisperGreedySearchDecoder>(\n          config_.model_config.whisper, model_.get());\n    } else {\n      
SHERPA_ONNX_LOGE(\n          \"Only greedy_search is supported at present for whisper. Given %s\",\n          config_.decoding_method.c_str());\n      exit(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    WhisperTag tag;\n    tag.dim = model_->FeatureDim();\n    return std::make_unique<OfflineStream>(tag);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // batch decoding is not implemented yet\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  void SetConfig(const OfflineRecognizerConfig &config) override {\n    config_.model_config.whisper = config.model_config.whisper;\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  void DecodeStream(OfflineStream *s) const {\n    decoder_->SetConfig(config_.model_config.whisper);\n\n    int32_t max_num_frames = 3000;\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = s->FeatureDim();\n    std::vector<float> f = s->GetFrames();\n    int32_t num_frames = f.size() / feat_dim;\n\n    // we use 50 here so that there will be some zero tail paddings\n    if (num_frames >= max_num_frames - 50) {\n      SHERPA_ONNX_LOGE(\n          \"Only waves less than 30 seconds are supported. 
We process only the \"\n          \"first 30 seconds and discard the remaining data\");\n      num_frames = max_num_frames - 50;\n    }\n\n    model_->NormalizeFeatures(f.data(), num_frames, feat_dim);\n\n    // Note that 1000 is an empirical value.\n    // You can replace 1000 with other values, say, 100.\n    //\n    // Since we have removed the 30 seconds constraint, we need\n    // tail_padding_frames so that whisper is able to detect the eot token.\n    int32_t tail_padding_frames = 1000;\n\n    if (config_.model_config.whisper.tail_paddings > 0) {\n      tail_padding_frames = config_.model_config.whisper.tail_paddings;\n    }\n\n    int32_t actual_frames =\n        std::min(num_frames + tail_padding_frames, max_num_frames);\n\n    std::array<int64_t, 3> shape{1, actual_frames, feat_dim};\n\n    Ort::Value mel = Ort::Value::CreateTensor<float>(\n        model_->Allocator(), shape.data(), shape.size());\n\n    float *p_mel = mel.GetTensorMutableData<float>();\n    std::copy(f.data(), f.data() + num_frames * feat_dim, p_mel);\n\n    std::fill_n(p_mel + num_frames * feat_dim,\n                (actual_frames - num_frames) * feat_dim, 0);\n\n    mel = Transpose12(model_->Allocator(), &mel);\n\n    try {\n      auto cross_kv = model_->ForwardEncoder(std::move(mel));\n\n      auto results = decoder_->Decode(std::move(cross_kv.first),\n                                      std::move(cross_kv.second), num_frames);\n\n      auto r = Convert(results[0], symbol_table_);\n      s->SetResult(r);\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\n          \"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result. Number of \"\n          \"input frames: %d, Current tail \"\n          \"paddings: %d. 
If you see a lot of such exceptions, please consider \"\n          \"using a larger --whisper-tail-paddings\",\n          ex.what(), num_frames, tail_padding_frames);\n      return;\n    }\n  }\n\n private:\n  OfflineRecognitionResult Convert(const OfflineWhisperDecoderResult &src,\n                                   const SymbolTable &sym_table) const {\n    OfflineRecognitionResult r;\n    r.tokens.reserve(src.tokens.size());\n\n    std::string text;\n\n    // Get timestamp begin token ID to filter out timestamp tokens\n    int32_t timestamp_begin = model_->TimestampBegin();\n    bool enable_segment_timestamps =\n        config_.model_config.whisper.enable_segment_timestamps;\n\n    // Build text, skipping timestamp tokens if in segment timestamp mode\n    for (auto i : src.tokens) {\n      // Skip timestamp tokens (they are >= timestamp_begin)\n      if (enable_segment_timestamps && i >= timestamp_begin) {\n        continue;\n      }\n\n      if (!sym_table.Contains(i)) {\n        continue;\n      }\n\n      std::string s = sym_table[i];\n      s = ApplyInverseTextNormalization(s);\n      s = ApplyHomophoneReplacer(std::move(s));\n\n      text += s;\n      r.tokens.push_back(s);\n    }\n\n    r.text = text;\n    r.lang = src.lang;\n\n    // Convert segments from segment timestamp mode to parallel vectors\n    if (enable_segment_timestamps && !src.segments.empty()) {\n      r.segment_timestamps.reserve(src.segments.size());\n      r.segment_durations.reserve(src.segments.size());\n      r.segment_texts.reserve(src.segments.size());\n\n      // Total audio duration for fallback when segment has no explicit end time\n      float total_audio_duration = src.num_audio_frames * 0.02f;\n\n      for (const auto &seg : src.segments) {\n        r.segment_timestamps.push_back(seg.start_time);\n        // Use remaining audio duration if end_time is sentinel (-1.0f)\n        float duration = (seg.end_time == -1.0f)\n                             ? 
(total_audio_duration - seg.start_time)\n                             : (seg.end_time - seg.start_time);\n        // Clamp to non-negative to handle rounding/model quirks\n        duration = std::max(0.0f, duration);\n        r.segment_durations.push_back(duration);\n\n        // Convert token IDs to text\n        std::string seg_text;\n        for (int32_t tok_id : seg.token_ids) {\n          if (sym_table.Contains(tok_id)) {\n            std::string s = sym_table[tok_id];\n            s = ApplyInverseTextNormalization(s);\n            s = ApplyHomophoneReplacer(std::move(s));\n            seg_text += s;\n          }\n        }\n        r.segment_texts.push_back(std::move(seg_text));\n      }\n    }\n\n    // Compute token-level timestamps using DTW if enabled\n    if (config_.model_config.whisper.enable_token_timestamps &&\n        !src.attention_weights.empty() &&\n        !r.tokens.empty()) {\n      ComputeTimestamps(src, r);\n    }\n\n    return r;\n  }\n\n  // Compute token-level timestamps using cross-attention DTW\n  void ComputeTimestamps(const OfflineWhisperDecoderResult &src,\n                         OfflineRecognitionResult &r) const {\n    WhisperDTW dtw;\n\n    // Note: src.attention includes all tokens (initial + decoded)\n    // The first few are SOT sequence tokens which DTW will skip.\n    // Initial tokens are: [sot, lang, task, no_timestamps] for multilingual,\n    // or [sot, no_timestamps] for English-only models.\n    int32_t sot_sequence_length =\n        static_cast<int32_t>(model_->GetInitialTokens().size());\n\n    // Use ComputeTokenTimings which extracts both start times and durations\n    // directly from the DTW jump_times, following OpenAI's approach:\n    //   start_times[i] = jump_times[i]\n    //   end_times[i] = jump_times[i+1]\n    //   durations[i] = end_times[i] - start_times[i]\n    // Pass timestamp_token_indices to filter out timestamp tokens from DTW\n    // (needed when enable_segment_timestamps=true to avoid alignment 
issues)\n    TokenTimingResult timing = dtw.ComputeTokenTimings(\n        src.attention_weights.data(), src.attention_n_heads,\n        src.attention_n_tokens, src.attention_n_frames, src.num_audio_frames,\n        sot_sequence_length, static_cast<int32_t>(r.tokens.size()),\n        src.timestamp_token_indices);\n\n    // Populate timestamps and durations\n    r.timestamps = std::move(timing.start_times);\n    r.durations = std::move(timing.durations);\n\n    // Ensure vectors match token count\n    if (r.timestamps.size() != r.tokens.size()) {\n      SHERPA_ONNX_LOGE(\n          \"DTW returned %zu timestamps for %zu tokens, padding/truncating\",\n          r.timestamps.size(), r.tokens.size());\n    }\n    float fill_time = r.timestamps.empty() ? 0.0f : r.timestamps.back();\n    r.timestamps.resize(r.tokens.size(), fill_time);\n    r.durations.resize(r.tokens.size(), 0.0f);\n\n    // Clamp token end times to segment boundaries (like OpenAI timing.py)\n    // If a token ends more than 0.5s after segment end, truncate it.\n    // This prevents DTW-derived timings from extending past segment bounds.\n    if (!src.segments.empty() && !r.timestamps.empty()) {\n      float segment_end = src.segments.back().end_time;\n      if (segment_end > 0) {\n        for (size_t i = 0; i < r.timestamps.size(); ++i) {\n          float token_end = r.timestamps[i] + r.durations[i];\n          // Like OpenAI: if token_end > segment_end + 0.5, clamp it\n          if (token_end > segment_end + 0.5f) {\n            r.durations[i] = std::max(0.0f, segment_end - r.timestamps[i]);\n          }\n        }\n      }\n    }\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineWhisperModel> model_;\n  std::unique_ptr<OfflineWhisperDecoder> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer-whisper-tpl-impl.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer-whisper-tpl-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_TPL_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_TPL_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\ntemplate <typename WhisperModel>\nclass OfflineRecognizerWhisperTplImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerWhisperTplImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<WhisperModel>(config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerWhisperTplImpl(Manager *mgr,\n                                  const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<WhisperModel>(mgr, config.model_config)) {\n    Init();\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    WhisperTag tag;\n    tag.dim = model_->FeatureDim();\n    return std::make_unique<OfflineStream>(tag);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    // batch decoding is not implemented yet\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  void SetConfig(const OfflineRecognizerConfig &config) override {\n    config_.model_config.whisper = config.model_config.whisper;\n  }\n\n  OfflineRecognizerConfig 
GetConfig() const override { return config_; }\n\n private:\n  void Init() {\n    // tokens.txt from whisper is base64 encoded, so we need to decode it\n    symbol_table_.ApplyBase64Decode();\n\n    if (config_.decoding_method == \"greedy_search\") {\n      SHERPA_ONNX_LOGE(\"use greedy_search\");\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Only greedy_search is supported at present for whisper. Given '%s'\",\n          config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void DecodeStream(OfflineStream *s) const {\n    int32_t feat_dim = s->FeatureDim();\n    std::vector<float> f = s->GetFrames();\n    int32_t num_frames = f.size() / feat_dim;\n\n    NormalizeWhisperFeatures(f.data(), num_frames, feat_dim);\n\n    auto r = model_->Run(std::move(f));\n    auto res = Convert(r, symbol_table_);\n\n    s->SetResult(res);\n  }\n\n  OfflineRecognitionResult Convert(const OfflineWhisperDecoderResult &src,\n                                   const SymbolTable &sym_table) const {\n    OfflineRecognitionResult r;\n    r.tokens.reserve(src.tokens.size());\n\n    std::string text;\n    for (auto i : src.tokens) {\n      if (!sym_table.Contains(i)) {\n        continue;\n      }\n\n      std::string s = sym_table[i];\n      s = ApplyInverseTextNormalization(s);\n      s = ApplyHomophoneReplacer(std::move(s));\n\n      text += s;\n      r.tokens.push_back(s);\n    }\n\n    r.text = text;\n    r.lang = src.lang;\n\n    return r;\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<WhisperModel> model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_WHISPER_TPL_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer.cc",
    "content": "// sherpa-onnx/csrc/offline-recognizer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineRecognizerConfig::Register(ParseOptions *po) {\n  feat_config.Register(po);\n  model_config.Register(po);\n  lm_config.Register(po);\n  ctc_fst_decoder_config.Register(po);\n  hr.Register(po);\n\n  po->Register(\n      \"decoding-method\", &decoding_method,\n      \"decoding method,\"\n      \"Valid values: greedy_search, modified_beam_search. \"\n      \"modified_beam_search is applicable only for transducer models.\");\n\n  po->Register(\"max-active-paths\", &max_active_paths,\n               \"Used only when decoding_method is modified_beam_search\");\n\n  po->Register(\"blank-penalty\", &blank_penalty,\n               \"The penalty applied on blank symbol during decoding. \"\n               \"Note: It is a positive value. \"\n               \"Increasing value will lead to lower deletion at the cost\"\n               \"of higher insertions. \"\n               \"Currently only applicable for transducer models.\");\n\n  po->Register(\n      \"hotwords-file\", &hotwords_file,\n      \"The file containing hotwords, one words/phrases per line, For example: \"\n      \"HELLO WORLD\"\n      \"你好世界\");\n\n  po->Register(\"hotwords-score\", &hotwords_score,\n               \"The bonus score for each token in context word/phrase. 
\"\n               \"Used only when decoding_method is modified_beam_search\");\n\n  po->Register(\n      \"rule-fsts\", &rule_fsts,\n      \"If not empty, it specifies fsts for inverse text normalization. \"\n      \"If there are multiple fsts, they are separated by a comma.\");\n\n  po->Register(\n      \"rule-fars\", &rule_fars,\n      \"If not empty, it specifies fst archives for inverse text normalization. \"\n      \"If there are multiple archives, they are separated by a comma.\");\n}\n\nbool OfflineRecognizerConfig::Validate() const {\n  if (decoding_method == \"modified_beam_search\" && !lm_config.model.empty()) {\n    if (max_active_paths <= 0) {\n      SHERPA_ONNX_LOGE(\"max_active_paths is less than 0! Given: %d\",\n                       max_active_paths);\n      return false;\n    }\n    if (!lm_config.Validate()) {\n      return false;\n    }\n  }\n\n  if (!hotwords_file.empty() && decoding_method != \"modified_beam_search\") {\n    SHERPA_ONNX_LOGE(\n        \"Please use --decoding-method=modified_beam_search if you\"\n        \" provide --hotwords-file. Given --decoding-method='%s'\",\n        decoding_method.c_str());\n    return false;\n  }\n\n  if (!ctc_fst_decoder_config.graph.empty() &&\n      !ctc_fst_decoder_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in fst_decoder\");\n    return false;\n  }\n\n  if (!hotwords_file.empty() && !FileExists(hotwords_file)) {\n    SHERPA_ONNX_LOGE(\"--hotwords-file: '%s' does not exist\",\n                     hotwords_file.c_str());\n    return false;\n  }\n\n  if (!rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fsts, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule fst '%s' does not exist. 
\", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!rule_fars.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fars, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule far '%s' does not exist. \", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!hr.lexicon.empty() && !hr.rule_fsts.empty() && !hr.Validate()) {\n    return false;\n  }\n\n  return model_config.Validate();\n}\n\nstd::string OfflineRecognizerConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineRecognizerConfig(\";\n  os << \"feat_config=\" << feat_config.ToString() << \", \";\n  os << \"model_config=\" << model_config.ToString() << \", \";\n  os << \"lm_config=\" << lm_config.ToString() << \", \";\n  os << \"ctc_fst_decoder_config=\" << ctc_fst_decoder_config.ToString() << \", \";\n\n  os << \"decoding_method=\\\"\" << decoding_method << \"\\\", \";\n  os << \"max_active_paths=\" << max_active_paths << \", \";\n  os << \"hotwords_file=\\\"\" << hotwords_file << \"\\\", \";\n  os << \"hotwords_score=\" << hotwords_score << \", \";\n  os << \"blank_penalty=\" << blank_penalty << \", \";\n  os << \"rule_fsts=\\\"\" << rule_fsts << \"\\\", \";\n  os << \"rule_fars=\\\"\" << rule_fars << \"\\\", \";\n  os << \"hr=\" << hr.ToString() << \")\";\n\n  return os.str();\n}\n\ntemplate <typename Manager>\nOfflineRecognizer::OfflineRecognizer(Manager *mgr,\n                                     const OfflineRecognizerConfig &config)\n    : impl_(OfflineRecognizerImpl::Create(mgr, config)) {}\n\nOfflineRecognizer::OfflineRecognizer(const OfflineRecognizerConfig &config)\n    : impl_(OfflineRecognizerImpl::Create(config)) {}\n\nOfflineRecognizer::~OfflineRecognizer() = default;\n\nstd::unique_ptr<OfflineStream> OfflineRecognizer::CreateStream(\n    const std::string &hotwords) const {\n  return impl_->CreateStream(hotwords);\n}\n\nstd::unique_ptr<OfflineStream> 
OfflineRecognizer::CreateStream() const {\n  return impl_->CreateStream();\n}\n\nvoid OfflineRecognizer::DecodeStreams(OfflineStream **ss, int32_t n) const {\n  impl_->DecodeStreams(ss, n);\n}\n\nvoid OfflineRecognizer::SetConfig(const OfflineRecognizerConfig &config) {\n  impl_->SetConfig(config);\n}\n\nOfflineRecognizerConfig OfflineRecognizer::GetConfig() const {\n  return impl_->GetConfig();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineRecognizer::OfflineRecognizer(\n    AAssetManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineRecognizer::OfflineRecognizer(\n    NativeResourceManager *mgr, const OfflineRecognizerConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-recognizer.h",
    "content": "// sherpa-onnx/csrc/offline-recognizer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h\"\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineRecognitionResult;\n\nstruct OfflineRecognizerConfig {\n  FeatureExtractorConfig feat_config;\n  OfflineModelConfig model_config;\n  OfflineLMConfig lm_config;\n  OfflineCtcFstDecoderConfig ctc_fst_decoder_config;\n\n  std::string decoding_method = \"greedy_search\";\n  int32_t max_active_paths = 4;\n\n  std::string hotwords_file;\n  float hotwords_score = 1.5;\n\n  float blank_penalty = 0.0;\n\n  // If there are multiple rules, they are applied from left to right.\n  std::string rule_fsts;\n\n  // If there are multiple FST archives, they are applied from left to right.\n  std::string rule_fars;\n  HomophoneReplacerConfig hr;\n\n  // only greedy_search is implemented\n  // TODO(fangjun): Implement modified_beam_search\n\n  OfflineRecognizerConfig() = default;\n  OfflineRecognizerConfig(\n      const FeatureExtractorConfig &feat_config,\n      const OfflineModelConfig &model_config, const OfflineLMConfig &lm_config,\n      const OfflineCtcFstDecoderConfig &ctc_fst_decoder_config,\n      const std::string &decoding_method, int32_t max_active_paths,\n      const std::string &hotwords_file, float hotwords_score,\n      float blank_penalty, const std::string &rule_fsts,\n      const std::string &rule_fars, const HomophoneReplacerConfig &hr)\n      : 
feat_config(feat_config),\n        model_config(model_config),\n        lm_config(lm_config),\n        ctc_fst_decoder_config(ctc_fst_decoder_config),\n        decoding_method(decoding_method),\n        max_active_paths(max_active_paths),\n        hotwords_file(hotwords_file),\n        hotwords_score(hotwords_score),\n        blank_penalty(blank_penalty),\n        rule_fsts(rule_fsts),\n        rule_fars(rule_fars),\n        hr(hr) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OfflineRecognizerImpl;\n\nclass OfflineRecognizer {\n public:\n  ~OfflineRecognizer();\n\n  template <typename Manager>\n  OfflineRecognizer(Manager *mgr, const OfflineRecognizerConfig &config);\n\n  explicit OfflineRecognizer(const OfflineRecognizerConfig &config);\n\n  /// Create a stream for decoding.\n  std::unique_ptr<OfflineStream> CreateStream() const;\n\n  /** Create a stream for decoding.\n   *\n   *  @param hotwords The hotwords for this stream. It may contain several\n   *         hotwords, separated by \"/\". 
In each of the hotwords, there\n   *         are cjkchars or bpes, which are separated by a space (\" \").\n   *         For example, the hotwords I LOVE YOU and HELLO WORLD look like:\n   *\n   *         \"▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD\"\n   */\n  std::unique_ptr<OfflineStream> CreateStream(\n      const std::string &hotwords) const;\n\n  /** Decode a single stream\n   *\n   * @param s The stream to decode.\n   */\n  void DecodeStream(OfflineStream *s) const {\n    OfflineStream *ss[1] = {s};\n    DecodeStreams(ss, 1);\n  }\n\n  /** Decode a list of streams.\n   *\n   * @param ss Pointer to an array of streams.\n   * @param n  Size of the input array.\n   */\n  void DecodeStreams(OfflineStream **ss, int32_t n) const;\n\n  /** Onnxruntime Session objects are not affected by this method.\n   * The exact behavior can be defined by a specific recognizer impl.\n   * For instance, for the whisper recognizer, you can retrieve the language and\n   * task from the config and ignore any remaining fields in `config`.\n   */\n  void SetConfig(const OfflineRecognizerConfig &config);\n\n  OfflineRecognizerConfig GetConfig() const;\n\n private:\n  std::unique_ptr<OfflineRecognizerImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RECOGNIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-rnn-lm.cc",
    "content": "// sherpa-onnx/csrc/offline-rnn-lm.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-rnn-lm.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRnnLM::Impl {\n public:\n  explicit Impl(const OfflineLMConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_{GetSessionOptions(config)},\n        allocator_{} {\n    auto buf = ReadFile(config_.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineLMConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_{GetSessionOptions(config)},\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.model);\n    Init(buf.data(), buf.size());\n  }\n\n  Ort::Value Rescore(Ort::Value x, Ort::Value x_lens) {\n    std::array<Ort::Value, 2> inputs = {std::move(x), std::move(x_lens)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, 
&output_names_ptr_);\n  }\n\n private:\n  OfflineLMConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nOfflineRnnLM::OfflineRnnLM(const OfflineLMConfig &config)\n    : OfflineLM(config), impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineRnnLM::OfflineRnnLM(Manager *mgr, const OfflineLMConfig &config)\n    : OfflineLM(config), impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineRnnLM::~OfflineRnnLM() = default;\n\nOrt::Value OfflineRnnLM::Rescore(Ort::Value x, Ort::Value x_lens) {\n  return impl_->Rescore(std::move(x), std::move(x_lens));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineRnnLM::OfflineRnnLM(AAssetManager *mgr,\n                                    const OfflineLMConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineRnnLM::OfflineRnnLM(NativeResourceManager *mgr,\n                                    const OfflineLMConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-rnn-lm.h",
    "content": "// sherpa-onnx/csrc/offline-rnn-lm.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_RNN_LM_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_RNN_LM_H_\n\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n#include \"sherpa-onnx/csrc/offline-lm.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineRnnLM : public OfflineLM {\n public:\n  ~OfflineRnnLM() override;\n\n  explicit OfflineRnnLM(const OfflineLMConfig &config);\n\n  template <typename Manager>\n  OfflineRnnLM(Manager *mgr, const OfflineLMConfig &config);\n\n  /** Rescore a batch of sentences.\n   *\n   * @param x A 2-D tensor of shape (N, L) with data type int64.\n   * @param x_lens A 1-D tensor of shape (N,) with data type int64.\n   *               It contains number of valid tokens in x before padding.\n   * @return Return a 1-D tensor of shape (N,) containing the log likelihood\n   *         of each utterance. Its data type is float32.\n   *\n   * Caution: It returns log likelihood, not negative log likelihood (nll).\n   */\n  Ort::Value Rescore(Ort::Value x, Ort::Value x_lens) override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_RNN_LM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-sense-voice-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-sense-voice-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSenseVoiceModelConfig::Register(ParseOptions *po) {\n  po->Register(\"sense-voice-model\", &model,\n               \"Path to model.onnx of SenseVoice.\");\n  po->Register(\n      \"sense-voice-language\", &language,\n      \"Valid values: auto, zh, en, ja, ko, yue. If left empty, auto is used\");\n  po->Register(\n      \"sense-voice-use-itn\", &use_itn,\n      \"True to enable inverse text normalization. False to disable it.\");\n\n  std::string prefix = \"sense-voice\";\n  ParseOptions p(prefix, po);\n\n  qnn_config.Register(&p);\n}\n\nbool OfflineSenseVoiceModelConfig::Validate() const {\n  if (qnn_config.context_binary.empty()) {\n    if (model.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide a senseVoice model\");\n      return false;\n    }\n\n    if (!FileExists(model)) {\n      SHERPA_ONNX_LOGE(\"SenseVoice model '%s' does not exist\", model.c_str());\n      return false;\n    }\n  }\n\n  if (!language.empty()) {\n    if (language != \"auto\" && language != \"zh\" && language != \"en\" &&\n        language != \"ja\" && language != \"ko\" && language != \"yue\") {\n      SHERPA_ONNX_LOGE(\n          \"Invalid sense-voice-language: '%s'. Valid values are: auto, zh, en, \"\n          \"ja, ko, yue. 
Or you can leave it empty to use 'auto'\",\n          language.c_str());\n\n      return false;\n    }\n  }\n\n  if (model.empty() && !qnn_config.context_binary.empty()) {\n    // we require that the context_binary exists\n    if (!FileExists(qnn_config.context_binary)) {\n      SHERPA_ONNX_LOGE(\n          \"Model is empty, but you provide a context binary that does not \"\n          \"exist\");\n      return false;\n    }\n  }\n\n  if (EndsWith(model, \".so\") || EndsWith(model, \".bin\") ||\n      (model.empty() && !qnn_config.context_binary.empty())) {\n    return qnn_config.Validate();\n  }\n\n  return true;\n}\n\nstd::string OfflineSenseVoiceModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSenseVoiceModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n\n  if (!qnn_config.backend_lib.empty()) {\n    os << \"qnn_config=\" << qnn_config.ToString() << \", \";\n  }\n\n  os << \"language=\\\"\" << language << \"\\\", \";\n  os << \"use_itn=\" << (use_itn ? \"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-sense-voice-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-sense-voice-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/qnn-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSenseVoiceModelConfig {\n  std::string model;\n\n  // \"\" or \"auto\" to let the model recognize the language\n  // valid values:\n  //  zh, en, ja, ko, yue, auto\n  std::string language = \"auto\";\n\n  // true to use inverse text normalization\n  // false to not use inverse text normalization\n  bool use_itn = false;\n\n  QnnConfig qnn_config;\n\n  OfflineSenseVoiceModelConfig() = default;\n  OfflineSenseVoiceModelConfig(const std::string &model,\n                               const std::string &language, bool use_itn)\n      : model(model), language(language), use_itn(use_itn) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineSenseVoiceModelMetaData {\n  // ID for using inverse text normalization\n  int32_t with_itn_id = 14;\n\n  // ID for not using inverse text normalization\n  int32_t without_itn_id = 15;\n\n  int32_t window_size = 7;   // lfr_m\n  int32_t window_shift = 6;  // lfr_n\n  int32_t vocab_size = 25055;\n\n  int32_t subsampling_factor = 1;\n\n  // Usually 0 for SenseVoice models.\n  // 0 means samples are scaled to [-32768, 32767] before are sent to the\n  // feature extractor\n  int32_t normalize_samples = 0;\n\n  int32_t blank_id = 0;\n\n  // possible values:\n  // zh, en, ja, ko, yue, auto\n  // where\n  //  zh is Chinese (Mandarin)\n  //  en is English\n  //  ja is Japanese\n  //  ko is Korean\n  //  yue is Cantonese\n  //  auto is to let the model recognize the language\n  std::unordered_map<std::string, int32_t> lang2id{\n      {\"auto\", 0}, {\"zh\", 3}, {\"en\", 4}, {\"yue\", 7}, {\"ja\", 11}, {\"ko\", 12},\n  };\n\n  std::vector<float> neg_mean;    // not used in rk npu and ascend npu\n  std::vector<float> inv_stddev;  // not used in rk npu and ascend npu\n\n  bool is_funasr_nano = false;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-sense-voice-model.cc",
    "content": "// sherpa-onnx/csrc/offline-sense-voice-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-sense-voice-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n  }\n\n  Ort::Value Forward(Ort::Value features, Ort::Value features_length,\n                     Ort::Value language, Ort::Value text_norm) {\n    std::array<Ort::Value, 4> inputs = {\n        std::move(features),\n        std::move(features_length),\n        std::move(language),\n        std::move(text_norm),\n    };\n\n    auto ans =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(ans[0]);\n  }\n\n  Ort::Value Forward(Ort::Value features) {\n    auto ans = sess_->Run({}, input_names_ptr_.data(), &features, 
1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(ans[0]);\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string comment;\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(comment, \"comment\");\n\n    meta_data_.is_funasr_nano = Contains(comment, \"Nano\");\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.vocab_size, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.blank_id, \"blank_id\", 0);\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.window_size, \"lfr_window_size\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.window_shift, \"lfr_window_shift\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.normalize_samples,\n                               \"normalize_samples\");\n\n    if (!meta_data_.is_funasr_nano) {\n      SHERPA_ONNX_READ_META_DATA(meta_data_.with_itn_id, \"with_itn\");\n\n      SHERPA_ONNX_READ_META_DATA(meta_data_.without_itn_id, \"without_itn\");\n\n      int32_t lang_auto = 0;\n      int32_t lang_zh = 0;\n      int32_t lang_en = 0;\n      int32_t lang_ja = 0;\n      int32_t lang_ko = 0;\n      int32_t 
lang_yue = 0;\n\n      SHERPA_ONNX_READ_META_DATA(lang_auto, \"lang_auto\");\n      SHERPA_ONNX_READ_META_DATA(lang_zh, \"lang_zh\");\n      SHERPA_ONNX_READ_META_DATA(lang_en, \"lang_en\");\n      SHERPA_ONNX_READ_META_DATA(lang_ja, \"lang_ja\");\n      SHERPA_ONNX_READ_META_DATA(lang_ko, \"lang_ko\");\n      SHERPA_ONNX_READ_META_DATA(lang_yue, \"lang_yue\");\n\n      meta_data_.lang2id = {\n          {\"auto\", lang_auto}, {\"zh\", lang_zh}, {\"en\", lang_en},\n          {\"ja\", lang_ja},     {\"ko\", lang_ko}, {\"yue\", lang_yue},\n      };\n\n      SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.neg_mean, \"neg_mean\");\n      SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_data_.inv_stddev, \"inv_stddev\");\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineSenseVoiceModelMetaData meta_data_;\n};\n\nOfflineSenseVoiceModel::OfflineSenseVoiceModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModel::OfflineSenseVoiceModel(Manager *mgr,\n                                               const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineSenseVoiceModel::~OfflineSenseVoiceModel() = default;\n\nOrt::Value OfflineSenseVoiceModel::Forward(Ort::Value features,\n                                           Ort::Value features_length,\n                                           Ort::Value language,\n                                           Ort::Value text_norm) const {\n  return impl_->Forward(std::move(features), std::move(features_length),\n                        std::move(language), 
std::move(text_norm));\n}\n\nOrt::Value OfflineSenseVoiceModel::Forward(Ort::Value features) const {\n  return impl_->Forward(std::move(features));\n}\n\nconst OfflineSenseVoiceModelMetaData &OfflineSenseVoiceModel::GetModelMetadata()\n    const {\n  return impl_->GetModelMetadata();\n}\n\nOrtAllocator *OfflineSenseVoiceModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModel::OfflineSenseVoiceModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModel::OfflineSenseVoiceModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-sense-voice-model.h",
    "content": "// sherpa-onnx/csrc/offline-sense-voice-model.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_H_\n\n#include <memory>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModel {\n public:\n  explicit OfflineSenseVoiceModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineSenseVoiceModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C). It is changed in-place.\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int32_t.\n   * @param language A 1-D tensor of shape (N,) with dtype int32_t\n   * @param text_norm A 1-D tensor of shape (N,) with dtype int32_t\n   *\n   * @return Return logits of shape (N, T, C) with dtype float\n   *\n   * Note: The subsampling factor is 1 for SenseVoice, so there is\n   *       no need to output logits_length.\n   */\n  Ort::Value Forward(Ort::Value features, Ort::Value features_length,\n                     Ort::Value language, Ort::Value text_norm) const;\n\n  /** For FunASR-Nano\n   *\n   * @param features A tensor of shape (1, T, C) with dtype float32\n   * @return Return logits of shape (1, T, C) with dtype float32\n   */\n  Ort::Value Forward(Ort::Value features) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace 
sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SENSE_VOICE_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-impl.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-impl.h\"\n\n#include <algorithm>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-impl.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-impl.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflineSourceSeparationImpl>\nOfflineSourceSeparationImpl::Create(\n    const OfflineSourceSeparationConfig &config) {\n  if (!config.model.spleeter.vocals.empty()) {\n    return std::make_unique<OfflineSourceSeparationSpleeterImpl>(config);\n  }\n\n  if (!config.model.uvr.model.empty()) {\n    return std::make_unique<OfflineSourceSeparationUvrImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a separation model!\");\n\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineSourceSeparationImpl>\nOfflineSourceSeparationImpl::Create(\n    Manager *mgr, const OfflineSourceSeparationConfig &config) {\n  if (!config.model.spleeter.vocals.empty()) {\n    return std::make_unique<OfflineSourceSeparationSpleeterImpl>(mgr, config);\n  }\n\n  if (!config.model.uvr.model.empty()) {\n    return std::make_unique<OfflineSourceSeparationUvrImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a separation model!\");\n\n  return nullptr;\n}\n\nOfflineSourceSeparationInput OfflineSourceSeparationImpl::Resample(\n    const OfflineSourceSeparationInput &input, bool debug /*= false*/) const {\n  const OfflineSourceSeparationInput *p_input = &input;\n  OfflineSourceSeparationInput tmp_input;\n\n  int32_t output_sample_rate = GetOutputSampleRate();\n\n  if (input.sample_rate != 
output_sample_rate) {\n    SHERPA_ONNX_LOGE(\n        \"Creating a resampler:\\n\"\n        \"   in_sample_rate: %d\\n\"\n        \"   output_sample_rate: %d\\n\",\n        input.sample_rate, output_sample_rate);\n\n    float min_freq = std::min<int32_t>(input.sample_rate, output_sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    auto resampler =\n        std::make_unique<LinearResample>(input.sample_rate, output_sample_rate,\n                                         lowpass_cutoff, lowpass_filter_width);\n\n    std::vector<float> s;\n    for (const auto &samples : input.samples.data) {\n      resampler->Reset();\n      resampler->Resample(samples.data(), samples.size(), true, &s);\n      tmp_input.samples.data.push_back(std::move(s));\n    }\n\n    tmp_input.sample_rate = output_sample_rate;\n    p_input = &tmp_input;\n  }\n\n  if (p_input->samples.data.size() > 1) {\n    if (debug) {\n      SHERPA_ONNX_LOGE(\"input ch1 samples size: %d\",\n                       static_cast<int32_t>(p_input->samples.data[1].size()));\n    }\n\n    if (p_input->samples.data[0].size() != p_input->samples.data[1].size()) {\n      SHERPA_ONNX_LOGE(\"ch0 samples size %d vs ch1 samples size %d\",\n                       static_cast<int32_t>(p_input->samples.data[0].size()),\n                       static_cast<int32_t>(p_input->samples.data[1].size()));\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  return *p_input;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineSourceSeparationImpl>\nOfflineSourceSeparationImpl::Create(\n    AAssetManager *mgr, const OfflineSourceSeparationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflineSourceSeparationImpl>\nOfflineSourceSeparationImpl::Create(\n    NativeResourceManager *mgr, const OfflineSourceSeparationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-impl.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_IMPL_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationImpl {\n public:\n  static std::unique_ptr<OfflineSourceSeparationImpl> Create(\n      const OfflineSourceSeparationConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineSourceSeparationImpl> Create(\n      Manager *mgr, const OfflineSourceSeparationConfig &config);\n\n  virtual ~OfflineSourceSeparationImpl() = default;\n\n  virtual OfflineSourceSeparationOutput Process(\n      const OfflineSourceSeparationInput &input) const = 0;\n\n  virtual int32_t GetOutputSampleRate() const = 0;\n\n  virtual int32_t GetNumberOfStems() const = 0;\n\n  OfflineSourceSeparationInput Resample(\n      const OfflineSourceSeparationInput &input, bool debug = false) const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSourceSeparationModelConfig::Register(ParseOptions *po) {\n  spleeter.Register(po);\n  uvr.Register(po);\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OfflineSourceSeparationModelConfig::Validate() const {\n  if (!spleeter.vocals.empty()) {\n    return spleeter.Validate();\n  }\n\n  if (!uvr.model.empty()) {\n    return uvr.Validate();\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a source separation model\");\n\n  return false;\n}\n\nstd::string OfflineSourceSeparationModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSourceSeparationModelConfig(\";\n  os << \"spleeter=\" << spleeter.ToString() << \", \";\n  os << \"uvr=\" << uvr.ToString() << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSourceSeparationModelConfig {\n  OfflineSourceSeparationSpleeterModelConfig spleeter;\n  OfflineSourceSeparationUvrModelConfig uvr;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OfflineSourceSeparationModelConfig() = default;\n\n  OfflineSourceSeparationModelConfig(\n      const OfflineSourceSeparationSpleeterModelConfig &spleeter,\n      const OfflineSourceSeparationUvrModelConfig &uvr, int32_t num_threads,\n      bool debug, const std::string &provider)\n      : spleeter(spleeter),\n        uvr(uvr),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-impl.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_IMPL_H_\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"kaldi-native-fbank/csrc/istft.h\"\n#include \"kaldi-native-fbank/csrc/stft.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationSpleeterImpl : public OfflineSourceSeparationImpl {\n public:\n  explicit OfflineSourceSeparationSpleeterImpl(\n      const OfflineSourceSeparationConfig &config)\n      : config_(config), model_(config_.model) {}\n\n  template <typename Manager>\n  OfflineSourceSeparationSpleeterImpl(\n      Manager *mgr, const OfflineSourceSeparationConfig &config)\n      : config_(config), model_(mgr, config_.model) {}\n\n  OfflineSourceSeparationOutput Process(\n      const OfflineSourceSeparationInput &_input) const override {\n    auto input = Resample(_input, config_.model.debug);\n\n    auto stft_ch0 = ComputeStft(input, 0);\n\n    auto stft_ch1 = ComputeStft(input, 1);\n    knf::StftResult *p_stft_ch1 = stft_ch1.real.empty() ? &stft_ch0 : &stft_ch1;\n\n    int32_t num_frames = stft_ch0.num_frames;\n    int32_t fft_bins = stft_ch0.real.size() / num_frames;\n\n    int32_t pad = 512 - (stft_ch0.num_frames % 512);\n    if (pad < 512) {\n      num_frames += pad;\n    }\n\n    if (num_frames % 512) {\n      SHERPA_ONNX_LOGE(\"num_frames should be multiple of 512, actual: %d. 
%d\",\n                       num_frames, num_frames % 512);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    Eigen::VectorXf real(2 * num_frames * 1024);\n    Eigen::VectorXf imag(2 * num_frames * 1024);\n    real.setZero();\n    imag.setZero();\n\n    float *p_real = &real[0];\n    float *p_imag = &imag[0];\n\n    // copy stft result of channel 0\n    for (int32_t i = 0; i != stft_ch0.num_frames; ++i) {\n      std::copy(stft_ch0.real.data() + i * fft_bins,\n                stft_ch0.real.data() + i * fft_bins + 1024, p_real + 1024 * i);\n\n      std::copy(stft_ch0.imag.data() + i * fft_bins,\n                stft_ch0.imag.data() + i * fft_bins + 1024, p_imag + 1024 * i);\n    }\n\n    p_real += num_frames * 1024;\n    p_imag += num_frames * 1024;\n\n    // copy stft result of channel 1\n    for (int32_t i = 0; i != stft_ch1.num_frames; ++i) {\n      std::copy(p_stft_ch1->real.data() + i * fft_bins,\n                p_stft_ch1->real.data() + i * fft_bins + 1024,\n                p_real + 1024 * i);\n\n      std::copy(p_stft_ch1->imag.data() + i * fft_bins,\n                p_stft_ch1->imag.data() + i * fft_bins + 1024,\n                p_imag + 1024 * i);\n    }\n\n    Eigen::VectorXf x = (real.array().square() + imag.array().square()).sqrt();\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 4> x_shape{2, num_frames / 512, 512, 1024};\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, &x[0], x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value vocals_spec_tensor = model_.RunVocals(View(&x_tensor));\n    Ort::Value accompaniment_spec_tensor =\n        model_.RunAccompaniment(std::move(x_tensor));\n\n    Eigen::VectorXf vocals_spec = Eigen::Map<Eigen::VectorXf>(\n        vocals_spec_tensor.GetTensorMutableData<float>(), x.size());\n\n    Eigen::VectorXf accompaniment_spec = Eigen::Map<Eigen::VectorXf>(\n        
accompaniment_spec_tensor.GetTensorMutableData<float>(), x.size());\n\n    Eigen::VectorXf sum_spec = vocals_spec.array().square() +\n                               accompaniment_spec.array().square() + 1e-10;\n\n    vocals_spec = (vocals_spec.array().square() + 1e-10 / 2) / sum_spec.array();\n\n    accompaniment_spec =\n        (accompaniment_spec.array().square() + 1e-10 / 2) / sum_spec.array();\n\n    auto vocals_samples_ch0 = ProcessSpec(vocals_spec, stft_ch0, 0);\n    auto vocals_samples_ch1 = ProcessSpec(vocals_spec, *p_stft_ch1, 1);\n\n    auto accompaniment_samples_ch0 =\n        ProcessSpec(accompaniment_spec, stft_ch0, 0);\n    auto accompaniment_samples_ch1 =\n        ProcessSpec(accompaniment_spec, *p_stft_ch1, 1);\n\n    OfflineSourceSeparationOutput ans;\n    ans.sample_rate = GetOutputSampleRate();\n\n    ans.stems.resize(2);\n    ans.stems[0].data.reserve(2);\n    ans.stems[1].data.reserve(2);\n\n    ans.stems[0].data.push_back(std::move(vocals_samples_ch0));\n    ans.stems[0].data.push_back(std::move(vocals_samples_ch1));\n\n    ans.stems[1].data.push_back(std::move(accompaniment_samples_ch0));\n    ans.stems[1].data.push_back(std::move(accompaniment_samples_ch1));\n\n    return ans;\n  }\n\n  int32_t GetOutputSampleRate() const override {\n    return model_.GetMetaData().sample_rate;\n  }\n\n  int32_t GetNumberOfStems() const override {\n    return model_.GetMetaData().num_stems;\n  }\n\n private:\n  // spec is of shape (2, num_chunks, 512, 1024)\n  std::vector<float> ProcessSpec(const Eigen::VectorXf &spec,\n                                 const knf::StftResult &stft,\n                                 int32_t channel) const {\n    int32_t fft_bins = stft.real.size() / stft.num_frames;\n\n    Eigen::VectorXf mask(stft.real.size());\n    mask.setZero();\n\n    float *p_mask = &mask[0];\n\n    // assume there are 2 channels\n    const float *p_spec = &spec[0] + (spec.size() / 2) * channel;\n\n    for (int32_t i = 0; i != stft.num_frames; ++i) {\n   
   std::copy(p_spec + i * 1024, p_spec + (i + 1) * 1024,\n                p_mask + i * fft_bins);\n    }\n\n    knf::StftResult masked_stft;\n\n    masked_stft.num_frames = stft.num_frames;\n    masked_stft.real.resize(stft.real.size());\n    masked_stft.imag.resize(stft.imag.size());\n\n    Eigen::Map<Eigen::VectorXf>(masked_stft.real.data(),\n                                masked_stft.real.size()) =\n        mask.array() *\n        Eigen::Map<Eigen::VectorXf>(const_cast<float *>(stft.real.data()),\n                                    stft.real.size())\n            .array();\n\n    Eigen::Map<Eigen::VectorXf>(masked_stft.imag.data(),\n                                masked_stft.imag.size()) =\n        mask.array() *\n        Eigen::Map<Eigen::VectorXf>(const_cast<float *>(stft.imag.data()),\n                                    stft.imag.size())\n            .array();\n\n    auto stft_config = GetStftConfig();\n    knf::IStft istft(stft_config);\n\n    return istft.Compute(masked_stft);\n  }\n\n  knf::StftResult ComputeStft(const OfflineSourceSeparationInput &input,\n                              int32_t ch) const {\n    if (ch >= input.samples.data.size()) {\n      SHERPA_ONNX_LOGE(\"Invalid channel %d. 
Max %d\", ch,\n                       static_cast<int32_t>(input.samples.data.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input.samples.data[ch].empty()) {\n      return {};\n    }\n\n    return ComputeStft(input.samples.data[ch]);\n  }\n\n  knf::StftResult ComputeStft(const std::vector<float> &samples) const {\n    auto stft_config = GetStftConfig();\n    knf::Stft stft(stft_config);\n\n    return stft.Compute(samples.data(), samples.size());\n  }\n\n  knf::StftConfig GetStftConfig() const {\n    const auto &meta = model_.GetMetaData();\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = meta.n_fft;\n    stft_config.hop_length = meta.hop_length;\n    stft_config.win_length = meta.window_length;\n    stft_config.window_type = meta.window_type;\n    stft_config.center = meta.center;\n\n    return stft_config;\n  }\n\n private:\n  OfflineSourceSeparationConfig config_;\n  OfflineSourceSeparationSpleeterModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSourceSeparationSpleeterModelConfig::Register(ParseOptions *po) {\n  po->Register(\"spleeter-vocals\", &vocals, \"Path to the spleeter vocals model\");\n\n  po->Register(\"spleeter-accompaniment\", &accompaniment,\n               \"Path to the spleeter accompaniment model\");\n}\n\nbool OfflineSourceSeparationSpleeterModelConfig::Validate() const {\n  if (vocals.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --spleeter-vocals\");\n    return false;\n  }\n\n  if (!FileExists(vocals)) {\n    SHERPA_ONNX_LOGE(\"spleeter vocals '%s' does not exist. \", vocals.c_str());\n    return false;\n  }\n\n  if (accompaniment.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --spleeter-accompaniment\");\n    return false;\n  }\n\n  if (!FileExists(accompaniment)) {\n    SHERPA_ONNX_LOGE(\"spleeter accompaniment '%s' does not exist. \",\n                     accompaniment.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSourceSeparationSpleeterModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSourceSeparationSpleeterModelConfig(\";\n  os << \"vocals=\\\"\" << vocals << \"\\\", \";\n  os << \"accompaniment=\\\"\" << accompaniment << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSourceSeparationSpleeterModelConfig {\n  std::string vocals;\n\n  std::string accompaniment;\n\n  OfflineSourceSeparationSpleeterModelConfig() = default;\n\n  OfflineSourceSeparationSpleeterModelConfig(const std::string &vocals,\n                                             const std::string &accompaniment)\n      : vocals(vocals), accompaniment(accompaniment) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// See also\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/spleeter/separate_onnx.py\nstruct OfflineSourceSeparationSpleeterModelMetaData {\n  int32_t sample_rate = 44100;\n  int32_t num_stems = 2;\n\n  int32_t n_fft = 4096;\n  int32_t hop_length = 1024;\n  int32_t window_length = 4096;\n  bool center = false;\n  std::string window_type = \"hann\";\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-model.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationSpleeterModel::Impl {\n public:\n  explicit Impl(const OfflineSourceSeparationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.spleeter.vocals);\n      InitVocals(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.spleeter.accompaniment);\n      InitAccompaniment(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineSourceSeparationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.spleeter.vocals);\n      InitVocals(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.spleeter.accompaniment);\n      InitAccompaniment(buf.data(), buf.size());\n    }\n  }\n\n  const OfflineSourceSeparationSpleeterModelMetaData &GetMetaData() const {\n    return meta_;\n  }\n\n  Ort::Value RunVocals(Ort::Value x) const {\n    auto out = vocals_sess_->Run({}, vocals_input_names_ptr_.data(), &x, 1,\n                                 vocals_output_names_ptr_.data(),\n                                 
vocals_output_names_ptr_.size());\n    return std::move(out[0]);\n  }\n\n  Ort::Value RunAccompaniment(Ort::Value x) const {\n    auto out =\n        accompaniment_sess_->Run({}, accompaniment_input_names_ptr_.data(), &x,\n                                 1, accompaniment_output_names_ptr_.data(),\n                                 accompaniment_output_names_ptr_.size());\n    return std::move(out[0]);\n  }\n\n private:\n  void InitVocals(void *model_data, size_t model_data_length) {\n    vocals_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(vocals_sess_.get(), &vocals_input_names_,\n                  &vocals_input_names_ptr_);\n\n    GetOutputNames(vocals_sess_.get(), &vocals_output_names_,\n                   &vocals_output_names_ptr_);\n\n    Ort::ModelMetadata meta_data = vocals_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---vocals model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : vocals_input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : vocals_output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"spleeter\") {\n      SHERPA_ONNX_LOGE(\"Expect model type 'spleeter'. 
Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_.num_stems, \"stems\");\n    if (meta_.num_stems != 2) {\n      SHERPA_ONNX_LOGE(\"Only 2stems is supported. Given %d stems\",\n                       meta_.num_stems);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void InitAccompaniment(void *model_data, size_t model_data_length) {\n    accompaniment_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(accompaniment_sess_.get(), &accompaniment_input_names_,\n                  &accompaniment_input_names_ptr_);\n\n    GetOutputNames(accompaniment_sess_.get(), &accompaniment_output_names_,\n                   &accompaniment_output_names_ptr_);\n  }\n\n private:\n  OfflineSourceSeparationModelConfig config_;\n  OfflineSourceSeparationSpleeterModelMetaData meta_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> vocals_sess_;\n\n  std::vector<std::string> vocals_input_names_;\n  std::vector<const char *> vocals_input_names_ptr_;\n\n  std::vector<std::string> vocals_output_names_;\n  std::vector<const char *> vocals_output_names_ptr_;\n\n  std::unique_ptr<Ort::Session> accompaniment_sess_;\n\n  std::vector<std::string> accompaniment_input_names_;\n  std::vector<const char *> accompaniment_input_names_ptr_;\n\n  std::vector<std::string> accompaniment_output_names_;\n  std::vector<const char *> accompaniment_output_names_ptr_;\n};\n\nOfflineSourceSeparationSpleeterModel::~OfflineSourceSeparationSpleeterModel() =\n    default;  // NOLINT\n\nOfflineSourceSeparationSpleeterModel::OfflineSourceSeparationSpleeterModel(\n    const OfflineSourceSeparationModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSourceSeparationSpleeterModel::OfflineSourceSeparationSpleeterModel(\n    Manager 
*mgr, const OfflineSourceSeparationModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOrt::Value OfflineSourceSeparationSpleeterModel::RunVocals(Ort::Value x) const {\n  return impl_->RunVocals(std::move(x));\n}\n\nOrt::Value OfflineSourceSeparationSpleeterModel::RunAccompaniment(\n    Ort::Value x) const {\n  return impl_->RunAccompaniment(std::move(x));\n}\n\nconst OfflineSourceSeparationSpleeterModelMetaData &\nOfflineSourceSeparationSpleeterModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSourceSeparationSpleeterModel::\n    OfflineSourceSeparationSpleeterModel(\n        AAssetManager *mgr, const OfflineSourceSeparationModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSourceSeparationSpleeterModel::\n    OfflineSourceSeparationSpleeterModel(\n        NativeResourceManager *mgr,\n        const OfflineSourceSeparationModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-spleeter-model.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-spleeter-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_H_\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-source-separation-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationSpleeterModel {\n public:\n  ~OfflineSourceSeparationSpleeterModel();\n\n  explicit OfflineSourceSeparationSpleeterModel(\n      const OfflineSourceSeparationModelConfig &config);\n\n  template <typename Manager>\n  OfflineSourceSeparationSpleeterModel(\n      Manager *mgr, const OfflineSourceSeparationModelConfig &config);\n\n  Ort::Value RunVocals(Ort::Value x) const;\n  Ort::Value RunAccompaniment(Ort::Value x) const;\n\n  const OfflineSourceSeparationSpleeterModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-impl.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_IMPL_H_\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"kaldi-native-fbank/csrc/istft.h\"\n#include \"kaldi-native-fbank/csrc/stft.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationUvrImpl : public OfflineSourceSeparationImpl {\n public:\n  explicit OfflineSourceSeparationUvrImpl(\n      const OfflineSourceSeparationConfig &config)\n      : config_(config), model_(config_.model) {}\n\n  template <typename Manager>\n  OfflineSourceSeparationUvrImpl(Manager *mgr,\n                                 const OfflineSourceSeparationConfig &config)\n      : config_(config), model_(mgr, config_.model) {}\n\n  OfflineSourceSeparationOutput Process(\n      const OfflineSourceSeparationInput &_input) const override {\n    auto input = Resample(_input, config_.model.debug);\n\n    auto chunks_ch0 = SplitIntoChunks(input.samples.data[0]);\n\n    std::vector<std::vector<float>> chunks_ch1;\n    if (input.samples.data.size() > 1) {\n      chunks_ch1 = SplitIntoChunks(input.samples.data[1]);\n    }\n\n    std::vector<float> samples_ch0;\n    std::vector<float> samples_ch1;\n\n    for (int32_t i = 0; i != static_cast<int32_t>(chunks_ch0.size()); ++i) {\n      bool is_first_chunk = (i == 0);\n      bool is_last_chunk = (i == static_cast<int32_t>(chunks_ch0.size()) - 1);\n\n      auto s = ProcessChunk(\n          chunks_ch0[i],\n          chunks_ch1.empty() ? 
std::vector<float>{} : chunks_ch1[i],\n          is_first_chunk, is_last_chunk);\n\n      samples_ch0.insert(samples_ch0.end(), s.first.begin(), s.first.end());\n      samples_ch1.insert(samples_ch1.end(), s.second.begin(), s.second.end());\n    }\n\n    auto &vocals_ch0 = samples_ch0;\n    auto &vocals_ch1 = samples_ch1;\n\n    std::vector<float> non_vocals_ch0(vocals_ch0.size());\n    std::vector<float> non_vocals_ch1(vocals_ch1.size());\n\n    Eigen::Map<Eigen::VectorXf>(non_vocals_ch0.data(), non_vocals_ch0.size()) =\n        Eigen::Map<Eigen::VectorXf>(input.samples.data[0].data(),\n                                    input.samples.data[0].size())\n            .array() -\n        Eigen::Map<Eigen::VectorXf>(vocals_ch0.data(), vocals_ch0.size())\n            .array();\n\n    if (input.samples.data.size() > 1) {\n      Eigen::Map<Eigen::VectorXf>(non_vocals_ch1.data(),\n                                  non_vocals_ch1.size()) =\n          Eigen::Map<Eigen::VectorXf>(input.samples.data[1].data(),\n                                      input.samples.data[1].size())\n              .array() -\n          Eigen::Map<Eigen::VectorXf>(vocals_ch1.data(), vocals_ch1.size())\n              .array();\n    } else {\n      Eigen::Map<Eigen::VectorXf>(non_vocals_ch1.data(),\n                                  non_vocals_ch1.size()) =\n          Eigen::Map<Eigen::VectorXf>(input.samples.data[0].data(),\n                                      input.samples.data[0].size())\n              .array() -\n          Eigen::Map<Eigen::VectorXf>(vocals_ch1.data(), vocals_ch1.size())\n              .array();\n    }\n\n    OfflineSourceSeparationOutput ans;\n    ans.sample_rate = GetOutputSampleRate();\n\n    ans.stems.resize(2);\n    ans.stems[0].data.reserve(2);\n    ans.stems[1].data.reserve(2);\n\n    ans.stems[0].data.push_back(std::move(vocals_ch0));\n    ans.stems[0].data.push_back(std::move(vocals_ch1));\n\n    ans.stems[1].data.push_back(std::move(non_vocals_ch0));\n    
ans.stems[1].data.push_back(std::move(non_vocals_ch1));\n\n    return ans;\n  }\n\n  int32_t GetOutputSampleRate() const override {\n    return model_.GetMetaData().sample_rate;\n  }\n\n  int32_t GetNumberOfStems() const override {\n    return model_.GetMetaData().num_stems;\n  }\n\n private:\n  std::pair<std::vector<float>, std::vector<float>> ProcessChunk(\n      const std::vector<float> &chunk_ch0, const std::vector<float> &chunk_ch1,\n      bool is_first_chunk, bool is_last_chunk) const {\n    int32_t pad0 = 0;\n\n    auto stft_results_ch0 = ComputeStft(chunk_ch0, &pad0);\n\n    int32_t pad1 = pad0;\n    std::vector<knf::StftResult> stft_results_ch1;\n\n    if (!chunk_ch1.empty()) {\n      stft_results_ch1 = ComputeStft(chunk_ch1, &pad1);\n    } else {\n      stft_results_ch1 = stft_results_ch0;\n    }\n\n    const auto &meta_ = model_.GetMetaData();\n\n    int32_t num_frames = stft_results_ch0[0].num_frames;\n    int32_t dim_f = meta_.dim_f;\n    int32_t dim_t = meta_.dim_t;\n    int32_t n_fft_bin = meta_.n_fft / 2 + 1;\n    if (num_frames != dim_t) {\n      SHERPA_ONNX_LOGE(\"num_frames(%d) != dim_t(%d)\", num_frames, dim_t);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // the first 2: number of channels\n    // the second 2: real and image\n    std::vector<float> x(stft_results_ch0.size() * 2 * 2 * dim_f * dim_t);\n    float *px = x.data();\n\n    for (int32_t i = 0; i != static_cast<int32_t>(stft_results_ch0.size());\n         ++i) {\n      const auto &ch0 = stft_results_ch0[i];\n      const auto &ch1 = stft_results_ch1[i];\n\n      const float *p_real_ch0 = ch0.real.data();\n      const float *p_imag_ch0 = ch0.imag.data();\n\n      const float *p_real_ch1 = ch1.real.data();\n      const float *p_imag_ch1 = ch1.imag.data();\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          *px = p_real_ch0[k * n_fft_bin + j];\n          ++px;\n        }\n      }\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n     
   for (int32_t k = 0; k != num_frames; ++k) {\n          *px = p_imag_ch0[k * n_fft_bin + j];\n          ++px;\n        }\n      }\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          *px = p_real_ch1[k * n_fft_bin + j];\n          ++px;\n        }\n      }\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          *px = p_imag_ch1[k * n_fft_bin + j];\n          ++px;\n        }\n      }\n    }  // for (int32_t i = 0; i !=\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 4> x_shape{\n        static_cast<int32_t>(stft_results_ch0.size()) * 4 / meta_.dim_c,\n        meta_.dim_c, dim_f, dim_t};\n\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value spec = model_.Run(std::move(x_tensor));\n\n    const float *p_spec = spec.GetTensorData<float>();\n\n    for (int32_t i = 0; i != static_cast<int32_t>(stft_results_ch0.size());\n         ++i) {\n      auto &ch0 = stft_results_ch0[i];\n      auto &ch1 = stft_results_ch1[i];\n\n      float *p_real_ch0 = ch0.real.data();\n      float *p_imag_ch0 = ch0.imag.data();\n\n      float *p_real_ch1 = ch1.real.data();\n      float *p_imag_ch1 = ch1.imag.data();\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          p_real_ch0[k * n_fft_bin + j] = *p_spec;\n          ++p_spec;\n        }\n      }\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          p_imag_ch0[k * n_fft_bin + j] = *p_spec;\n          ++p_spec;\n        }\n      }\n\n      for (int32_t j = 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          p_real_ch1[k * n_fft_bin + j] = *p_spec;\n          ++p_spec;\n        }\n      }\n\n      for (int32_t j 
= 0; j != dim_f; ++j) {\n        for (int32_t k = 0; k != num_frames; ++k) {\n          p_imag_ch1[k * n_fft_bin + j] = *p_spec;\n          ++p_spec;\n        }\n      }\n\n      for (int32_t k = 0; k != num_frames; ++k) {\n        for (int32_t j = dim_f; j != n_fft_bin; ++j) {\n          p_real_ch0[k * n_fft_bin + j] = 0;\n          p_real_ch1[k * n_fft_bin + j] = 0;\n\n          p_imag_ch0[k * n_fft_bin + j] = 0;\n          p_imag_ch1[k * n_fft_bin + j] = 0;\n        }\n      }\n    }\n\n    auto samples_ch0 = ComputeInverseStft(stft_results_ch0, pad0,\n                                          is_first_chunk, is_last_chunk);\n\n    auto samples_ch1 = ComputeInverseStft(stft_results_ch1, pad1,\n                                          is_first_chunk, is_last_chunk);\n\n    return {std::move(samples_ch0), std::move(samples_ch1)};\n  }\n\n  std::vector<float> ComputeInverseStft(\n      const std::vector<knf::StftResult> &stft_result, int32_t pad,\n      bool is_first_chunk, bool is_last_chunk) const {\n    const auto &meta_ = model_.GetMetaData();\n    int32_t trim = meta_.n_fft / 2;\n\n    int32_t margin = meta_.margin;\n\n    int32_t chunk_size = meta_.num_chunks * meta_.sample_rate;\n\n    if (margin > chunk_size) {\n      margin = chunk_size;\n    }\n\n    auto stft_config = GetStftConfig();\n    knf::IStft istft(stft_config);\n\n    std::vector<float> ans;\n\n    for (int32_t i = 0; i != static_cast<int32_t>(stft_result.size()); ++i) {\n      auto samples = istft.Compute(stft_result[i]);\n      int32_t num_samples = static_cast<int32_t>(samples.size());\n\n      ans.insert(ans.end(), samples.begin() + trim,\n                 samples.begin() + (num_samples - trim));\n    }\n\n    int32_t start = is_first_chunk ? 0 : margin;\n    int32_t end =\n        is_last_chunk ? 
(ans.size() - pad) : (ans.size() - pad - margin);\n\n    return {ans.begin() + start, ans.begin() + end};\n  }\n\n  std::vector<knf::StftResult> ComputeStft(const std::vector<float> &chunk,\n                                           int32_t *pad) const {\n    const auto &meta_ = model_.GetMetaData();\n\n    int32_t num_samples = static_cast<int32_t>(chunk.size());\n    int32_t trim = meta_.n_fft / 2;\n    int32_t chunk_size = meta_.hop_length * (meta_.dim_t - 1);\n    int32_t gen_size = chunk_size - 2 * trim;\n    *pad = gen_size - num_samples % gen_size;\n\n    std::vector<float> samples(trim + chunk.size() + *pad + trim);\n    std::copy(chunk.begin(), chunk.end(), samples.begin() + trim);\n\n    auto stft_config = GetStftConfig();\n    knf::Stft stft(stft_config);\n\n    std::vector<knf::StftResult> stft_results;\n    // split the chunk into short segments\n    for (int32_t i = 0; i < num_samples + *pad; i += gen_size) {\n      auto r = stft.Compute(samples.data() + i, chunk_size);\n      stft_results.push_back(std::move(r));\n    }\n\n    return stft_results;\n  }\n\n  std::vector<std::vector<float>> SplitIntoChunks(\n      const std::vector<float> &samples) const {\n    std::vector<std::vector<float>> ans;\n\n    if (samples.empty()) {\n      return ans;\n    }\n\n    const auto &meta_ = model_.GetMetaData();\n    int32_t margin = meta_.margin;\n\n    int32_t chunk_size = meta_.num_chunks * meta_.sample_rate;\n\n    if (static_cast<int32_t>(samples.size()) < chunk_size) {\n      chunk_size = samples.size();\n    }\n\n    if (margin > chunk_size) {\n      margin = chunk_size;\n    }\n\n    for (int32_t i = 0; i < static_cast<int32_t>(samples.size());\n         i += chunk_size) {\n      int32_t start = std::max<int32_t>(0, i - margin);\n      int32_t end = std::min<int32_t>(i + chunk_size + margin,\n                                      static_cast<int32_t>(samples.size()));\n      if (start >= end) {\n        break;\n      }\n\n      
ans.emplace_back(samples.begin() + start, samples.begin() + end);\n\n      if (end == static_cast<int32_t>(samples.size())) {\n        break;\n      }\n    }\n\n    return ans;\n  }\n\n  knf::StftConfig GetStftConfig() const {\n    const auto &meta = model_.GetMetaData();\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = meta.n_fft;\n    stft_config.hop_length = meta.hop_length;\n    stft_config.win_length = meta.window_length;\n    stft_config.window_type = meta.window_type;\n    stft_config.center = meta.center;\n\n    return stft_config;\n  }\n\n private:\n  OfflineSourceSeparationConfig config_;\n  OfflineSourceSeparationUvrModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSourceSeparationUvrModelConfig::Register(ParseOptions *po) {\n  po->Register(\"uvr-model\", &model, \"Path to the UVR model\");\n}\n\nbool OfflineSourceSeparationUvrModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --uvr-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"UVR model '%s' does not exist. \", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSourceSeparationUvrModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSourceSeparationUvrModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSourceSeparationUvrModelConfig {\n  std::string model;\n\n  OfflineSourceSeparationUvrModelConfig() = default;\n\n  explicit OfflineSourceSeparationUvrModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_META_DATA_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// See also\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/uvr_mdx/test.py\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/uvr_mdx/add_meta_data_and_quantize.py\nstruct OfflineSourceSeparationUvrModelMetaData {\n  int32_t sample_rate = 44100;\n  int32_t num_stems = 2;\n  int32_t dim_c = -1;\n  int32_t dim_f = -1;\n  int32_t dim_t = -1;\n\n  int32_t n_fft = -1;\n  int32_t hop_length = 1024;\n\n  int32_t window_length = -1;\n  int32_t center = 1;\n  std::string window_type = \"hann\";\n\n  // the following fields are preconfigured. Please see\n  // https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/uvr_mdx/test.py\n  int32_t margin = 0;  // changed in ./offline-source-separation-uvr-model.cc\n  const int32_t num_chunks = 15;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-model.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationUvrModel::Impl {\n public:\n  explicit Impl(const OfflineSourceSeparationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config.uvr.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineSourceSeparationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config.uvr.model);\n    Init(buf.data(), buf.size());\n  }\n\n  const OfflineSourceSeparationUvrModelMetaData &GetMetaData() const {\n    return meta_;\n  }\n\n  Ort::Value Run(Ort::Value x) const {\n    auto out = sess_->Run({}, input_names_ptr_.data(), &x, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, 
&output_names_ptr_);\n\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---UVR model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"UVR\") {\n      SHERPA_ONNX_LOGE(\"Expect model type 'UVR'. Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_.num_stems, \"stems\");\n    if (meta_.num_stems != 2) {\n      SHERPA_ONNX_LOGE(\"Only 2stems is supported. 
Given %d stems\",\n                       meta_.num_stems);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_.n_fft, \"n_fft\");\n    SHERPA_ONNX_READ_META_DATA(meta_.center, \"center\");\n    SHERPA_ONNX_READ_META_DATA(meta_.window_length, \"win_length\");\n    SHERPA_ONNX_READ_META_DATA(meta_.hop_length, \"hop_length\");\n    SHERPA_ONNX_READ_META_DATA(meta_.dim_t, \"dim_t\");\n    SHERPA_ONNX_READ_META_DATA(meta_.dim_f, \"dim_f\");\n    SHERPA_ONNX_READ_META_DATA(meta_.dim_c, \"dim_c\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_.window_type, \"window_type\");\n\n    meta_.margin = meta_.sample_rate;\n  }\n\n private:\n  OfflineSourceSeparationModelConfig config_;\n  OfflineSourceSeparationUvrModelMetaData meta_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nOfflineSourceSeparationUvrModel::~OfflineSourceSeparationUvrModel() = default;\n\nOfflineSourceSeparationUvrModel::OfflineSourceSeparationUvrModel(\n    const OfflineSourceSeparationModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSourceSeparationUvrModel::OfflineSourceSeparationUvrModel(\n    Manager *mgr, const OfflineSourceSeparationModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOrt::Value OfflineSourceSeparationUvrModel::Run(Ort::Value x) const {\n  return impl_->Run(std::move(x));\n}\n\nconst OfflineSourceSeparationUvrModelMetaData &\nOfflineSourceSeparationUvrModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSourceSeparationUvrModel::OfflineSourceSeparationUvrModel(\n    
AAssetManager *mgr, const OfflineSourceSeparationModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSourceSeparationUvrModel::OfflineSourceSeparationUvrModel(\n    NativeResourceManager *mgr,\n    const OfflineSourceSeparationModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation-uvr-model.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation-uvr-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_H_\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-source-separation-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSourceSeparationUvrModel {\n public:\n  ~OfflineSourceSeparationUvrModel();\n\n  explicit OfflineSourceSeparationUvrModel(\n      const OfflineSourceSeparationModelConfig &config);\n\n  template <typename Manager>\n  OfflineSourceSeparationUvrModel(\n      Manager *mgr, const OfflineSourceSeparationModelConfig &config);\n\n  Ort::Value Run(Ort::Value x) const;\n\n  const OfflineSourceSeparationUvrModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation.cc",
    "content": "// sherpa-onnx/csrc/offline-source-separation.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-impl.h\"\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nvoid OfflineSourceSeparationConfig::Register(ParseOptions *po) {\n  model.Register(po);\n}\n\nbool OfflineSourceSeparationConfig::Validate() const {\n  return model.Validate();\n}\n\nstd::string OfflineSourceSeparationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSourceSeparationConfig(\";\n  os << \"model=\" << model.ToString() << \")\";\n\n  return os.str();\n}\n\ntemplate <typename Manager>\nOfflineSourceSeparation::OfflineSourceSeparation(\n    Manager *mgr, const OfflineSourceSeparationConfig &config)\n    : impl_(OfflineSourceSeparationImpl::Create(mgr, config)) {}\n\nOfflineSourceSeparation::OfflineSourceSeparation(\n    const OfflineSourceSeparationConfig &config)\n    : impl_(OfflineSourceSeparationImpl::Create(config)) {}\n\nOfflineSourceSeparation::~OfflineSourceSeparation() = default;\n\nOfflineSourceSeparationOutput OfflineSourceSeparation::Process(\n    const OfflineSourceSeparationInput &input) const {\n  return impl_->Process(input);\n}\n\nint32_t OfflineSourceSeparation::GetOutputSampleRate() const {\n  return impl_->GetOutputSampleRate();\n}\n\n// e.g., it is 2 for 2stems from spleeter\nint32_t OfflineSourceSeparation::GetNumberOfStems() const {\n  return impl_->GetNumberOfStems();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSourceSeparation::OfflineSourceSeparation(\n    AAssetManager *mgr, const OfflineSourceSeparationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSourceSeparation::OfflineSourceSeparation(\n    
NativeResourceManager *mgr, const OfflineSourceSeparationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-source-separation.h",
    "content": "// sherpa-onnx/csrc/offline-source-separation.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSourceSeparationConfig {\n  OfflineSourceSeparationModelConfig model;\n\n  OfflineSourceSeparationConfig() = default;\n\n  explicit OfflineSourceSeparationConfig(\n      const OfflineSourceSeparationModelConfig &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nstruct MultiChannelSamples {\n  // data[i] is for the i-th channel\n  //\n  // each sample is in the range [-1, 1]\n  std::vector<std::vector<float>> data;\n};\n\nstruct OfflineSourceSeparationInput {\n  MultiChannelSamples samples;\n\n  int32_t sample_rate;\n};\n\nstruct OfflineSourceSeparationOutput {\n  std::vector<MultiChannelSamples> stems;\n\n  int32_t sample_rate;\n};\n\nclass OfflineSourceSeparationImpl;\n\nclass OfflineSourceSeparation {\n public:\n  ~OfflineSourceSeparation();\n\n  explicit OfflineSourceSeparation(const OfflineSourceSeparationConfig &config);\n\n  template <typename Manager>\n  OfflineSourceSeparation(Manager *mgr,\n                          const OfflineSourceSeparationConfig &config);\n\n  OfflineSourceSeparationOutput Process(\n      const OfflineSourceSeparationInput &input) const;\n\n  int32_t GetOutputSampleRate() const;\n\n  // e.g., it is 2 for 2stems from spleeter\n  int32_t GetNumberOfStems() const;\n\n private:\n  std::unique_ptr<OfflineSourceSeparationImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization-impl.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-pyannote-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflineSpeakerDiarizationImpl>\nOfflineSpeakerDiarizationImpl::Create(\n    const OfflineSpeakerDiarizationConfig &config) {\n  if (!config.segmentation.pyannote.model.empty()) {\n    return std::make_unique<OfflineSpeakerDiarizationPyannoteImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a speaker segmentation model.\");\n\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineSpeakerDiarizationImpl>\nOfflineSpeakerDiarizationImpl::Create(\n    Manager *mgr, const OfflineSpeakerDiarizationConfig &config) {\n  if (!config.segmentation.pyannote.model.empty()) {\n    return std::make_unique<OfflineSpeakerDiarizationPyannoteImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a speaker segmentation model.\");\n\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineSpeakerDiarizationImpl>\nOfflineSpeakerDiarizationImpl::Create(\n    AAssetManager *mgr, const OfflineSpeakerDiarizationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflineSpeakerDiarizationImpl>\nOfflineSpeakerDiarizationImpl::Create(\n    NativeResourceManager *mgr, const OfflineSpeakerDiarizationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization-impl.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_IMPL_H_\n\n#include <functional>\n#include <memory>\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\nnamespace sherpa_onnx {\n\nclass OfflineSpeakerDiarizationImpl {\n public:\n  static std::unique_ptr<OfflineSpeakerDiarizationImpl> Create(\n      const OfflineSpeakerDiarizationConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineSpeakerDiarizationImpl> Create(\n      Manager *mgr, const OfflineSpeakerDiarizationConfig &config);\n\n  virtual ~OfflineSpeakerDiarizationImpl() = default;\n\n  virtual int32_t SampleRate() const = 0;\n\n  // Note: Only config.clustering is used. All other fields in config are\n  // ignored\n  virtual void SetConfig(const OfflineSpeakerDiarizationConfig &config) = 0;\n\n  virtual OfflineSpeakerDiarizationResult Process(\n      const float *audio, int32_t n,\n      OfflineSpeakerDiarizationProgressCallback callback = nullptr,\n      void *callback_arg = nullptr) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization-pyannote-impl.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization-pyannote-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_PYANNOTE_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_PYANNOTE_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/fast-clustering.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-impl.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {  // NOLINT\n\n// copied from https://github.com/k2-fsa/k2/blob/master/k2/csrc/host/util.h#L41\ntemplate <class T>\ninline void hash_combine(std::size_t *seed, const T &v) {  // NOLINT\n  std::hash<T> hasher;\n  *seed ^= hasher(v) + 0x9e3779b9 + ((*seed) << 6) + ((*seed) >> 2);  // NOLINT\n}\n\n// copied from https://github.com/k2-fsa/k2/blob/master/k2/csrc/host/util.h#L47\nstruct PairHash {\n  template <class T1, class T2>\n  std::size_t operator()(const std::pair<T1, T2> &pair) const {\n    std::size_t result = 0;\n    hash_combine(&result, pair.first);\n    hash_combine(&result, pair.second);\n    return result;\n  }\n};\n}  // namespace\n\nusing Matrix2D = Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic,\n                               Eigen::RowMajor>;  // NOLINT\n\nusing Matrix2DInt32 = Eigen::Matrix<int32_t, Eigen::Dynamic, Eigen::Dynamic,\n                                    Eigen::RowMajor>;  // NOLINT\n\nusing FloatRowVector = Eigen::Matrix<float, 1, Eigen::Dynamic>;\nusing Int32RowVector = Eigen::Matrix<int32_t, 1, Eigen::Dynamic>;\n\nusing Int32Pair = std::pair<int32_t, int32_t>;\n\nclass OfflineSpeakerDiarizationPyannoteImpl\n    : public OfflineSpeakerDiarizationImpl {\n public:\n  
~OfflineSpeakerDiarizationPyannoteImpl() override = default;\n\n  explicit OfflineSpeakerDiarizationPyannoteImpl(\n      const OfflineSpeakerDiarizationConfig &config)\n      : config_(config),\n        segmentation_model_(config_.segmentation),\n        embedding_extractor_(config_.embedding),\n        clustering_(std::make_unique<FastClustering>(config_.clustering)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineSpeakerDiarizationPyannoteImpl(\n      Manager *mgr, const OfflineSpeakerDiarizationConfig &config)\n      : config_(config),\n        segmentation_model_(mgr, config_.segmentation),\n        embedding_extractor_(mgr, config_.embedding),\n        clustering_(std::make_unique<FastClustering>(config_.clustering)) {\n    Init();\n  }\n\n  int32_t SampleRate() const override {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n\n    return meta_data.sample_rate;\n  }\n\n  void SetConfig(const OfflineSpeakerDiarizationConfig &config) override {\n    if (!config.clustering.Validate()) {\n      SHERPA_ONNX_LOGE(\"Invalid clustering config. 
Skip it\");\n      return;\n    }\n    clustering_ = std::make_unique<FastClustering>(config.clustering);\n    config_.clustering = config.clustering;\n  }\n\n  OfflineSpeakerDiarizationResult Process(\n      const float *audio, int32_t n,\n      OfflineSpeakerDiarizationProgressCallback callback = nullptr,\n      void *callback_arg = nullptr) const override {\n    std::vector<Matrix2D> segmentations = RunSpeakerSegmentationModel(audio, n);\n    // segmentations[i] is for chunk_i\n    // Each matrix is of shape (num_frames, num_powerset_classes)\n    if (segmentations.empty()) {\n      return {};\n    }\n\n    std::vector<Matrix2DInt32> labels;\n    labels.reserve(segmentations.size());\n\n    for (const auto &m : segmentations) {\n      labels.push_back(ToMultiLabel(m));\n    }\n\n    segmentations.clear();\n\n    if (labels.size() == 1) {\n      if (callback) {\n        callback(1, 1, callback_arg);\n      }\n\n      return HandleOneChunkSpecialCase(labels[0], n);\n    }\n\n    // labels[i] is a 0-1 matrix of shape (num_frames, num_speakers)\n\n    // speaker count per frame\n    Int32RowVector speakers_per_frame = ComputeSpeakersPerFrame(labels);\n\n    if (speakers_per_frame.maxCoeff() == 0) {\n      SHERPA_ONNX_LOGE(\"No speakers found in the audio samples\");\n      return {};\n    }\n\n    auto chunk_speaker_samples_list_pair = GetChunkSpeakerSampleIndexes(labels);\n\n    // The embedding model may output NaN. 
valid_indexes contains indexes\n    // in chunk_speaker_samples_list_pair.second that don't lead to\n    // NaN embeddings.\n    std::vector<int32_t> valid_indexes;\n    valid_indexes.reserve(chunk_speaker_samples_list_pair.second.size());\n\n    Matrix2D embeddings =\n        ComputeEmbeddings(audio, n, chunk_speaker_samples_list_pair.second,\n                          &valid_indexes, std::move(callback), callback_arg);\n\n    if (valid_indexes.size() != chunk_speaker_samples_list_pair.second.size()) {\n      std::vector<Int32Pair> chunk_speaker_pair;\n      std::vector<std::vector<Int32Pair>> sample_indexes;\n\n      chunk_speaker_pair.reserve(valid_indexes.size());\n      sample_indexes.reserve(valid_indexes.size());\n      for (auto i : valid_indexes) {\n        chunk_speaker_pair.push_back(chunk_speaker_samples_list_pair.first[i]);\n        sample_indexes.push_back(\n            std::move(chunk_speaker_samples_list_pair.second[i]));\n      }\n\n      chunk_speaker_samples_list_pair.first = std::move(chunk_speaker_pair);\n      chunk_speaker_samples_list_pair.second = std::move(sample_indexes);\n    }\n\n    std::vector<int32_t> cluster_labels = clustering_->Cluster(\n        &embeddings(0, 0), embeddings.rows(), embeddings.cols());\n\n    if (cluster_labels.empty()) {\n      SHERPA_ONNX_LOGE(\"No speakers found in the audio samples\");\n      return {};\n    }\n\n    int32_t max_cluster_index =\n        *std::max_element(cluster_labels.begin(), cluster_labels.end());\n\n    auto chunk_speaker_to_cluster = ConvertChunkSpeakerToCluster(\n        chunk_speaker_samples_list_pair.first, cluster_labels);\n\n    auto new_labels =\n        ReLabel(labels, max_cluster_index, chunk_speaker_to_cluster);\n\n    Matrix2DInt32 speaker_count = ComputeSpeakerCount(new_labels, n);\n\n    Matrix2DInt32 final_labels =\n        FinalizeLabels(speaker_count, speakers_per_frame);\n\n    auto result = ComputeResult(final_labels);\n\n    return result;\n  }\n\n private:\n  void 
Init() { InitPowersetMapping(); }\n\n  // see also\n  // https://github.com/pyannote/pyannote-audio/blob/develop/pyannote/audio/utils/powerset.py#L68\n  void InitPowersetMapping() {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t num_classes = meta_data.num_classes;\n    int32_t powerset_max_classes = meta_data.powerset_max_classes;\n    int32_t num_speakers = meta_data.num_speakers;\n\n    powerset_mapping_ = Matrix2DInt32(num_classes, num_speakers);\n    powerset_mapping_.setZero();\n\n    int32_t k = 1;\n    for (int32_t i = 1; i <= powerset_max_classes; ++i) {\n      if (i == 1) {\n        for (int32_t j = 0; j != num_speakers; ++j, ++k) {\n          powerset_mapping_(k, j) = 1;\n        }\n      } else if (i == 2) {\n        for (int32_t j = 0; j != num_speakers; ++j) {\n          for (int32_t m = j + 1; m < num_speakers; ++m, ++k) {\n            powerset_mapping_(k, j) = 1;\n            powerset_mapping_(k, m) = 1;\n          }\n        }\n      } else {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\n            \"powerset_max_classes = %{public}d is currently not supported!\", i);\n#else\n        SHERPA_ONNX_LOGE(\n            \"powerset_max_classes = %d is currently not supported!\", i);\n#endif\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n  }\n\n  std::vector<Matrix2D> RunSpeakerSegmentationModel(const float *audio,\n                                                    int32_t n) const {\n    std::vector<Matrix2D> ans;\n\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n    int32_t window_shift = meta_data.window_shift;\n\n    if (n <= 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"number of audio samples is %{public}d (<= 0). Please provide a \"\n          \"positive number\",\n          n);\n#else\n      SHERPA_ONNX_LOGE(\n          \"number of audio samples is %d (<= 0). 
Please provide a positive \"\n          \"number\",\n          n);\n#endif\n      return {};\n    }\n\n    if (n <= window_size) {\n      std::vector<float> buf(window_size);\n      // NOTE: buf is zero initialized by default\n\n      std::copy(audio, audio + n, buf.data());\n\n      Matrix2D m = ProcessChunk(buf.data());\n\n      ans.push_back(std::move(m));\n\n      return ans;\n    }\n\n    int32_t num_chunks = (n - window_size) / window_shift + 1;\n    bool has_last_chunk = ((n - window_size) % window_shift) > 0;\n\n    ans.reserve(num_chunks + has_last_chunk);\n\n    const float *p = audio;\n\n    for (int32_t i = 0; i != num_chunks; ++i, p += window_shift) {\n      Matrix2D m = ProcessChunk(p);\n\n      ans.push_back(std::move(m));\n    }\n\n    if (has_last_chunk) {\n      std::vector<float> buf(window_size);\n      std::copy(p, audio + n, buf.data());\n\n      Matrix2D m = ProcessChunk(buf.data());\n\n      ans.push_back(std::move(m));\n    }\n\n    return ans;\n  }\n\n  Matrix2D ProcessChunk(const float *p) const {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> shape = {1, 1, window_size};\n\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, const_cast<float *>(p),\n                                 window_size, shape.data(), shape.size());\n\n    Ort::Value out = segmentation_model_.Forward(std::move(x));\n    std::vector<int64_t> out_shape = out.GetTensorTypeAndShapeInfo().GetShape();\n    Matrix2D m(out_shape[1], out_shape[2]);\n    std::copy(out.GetTensorData<float>(), out.GetTensorData<float>() + m.size(),\n              &m(0, 0));\n    return m;\n  }\n\n  Matrix2DInt32 ToMultiLabel(const Matrix2D &m) const {\n    int32_t num_rows = m.rows();\n    Matrix2DInt32 ans(num_rows, powerset_mapping_.cols());\n\n    std::ptrdiff_t col_id;\n\n   
 for (int32_t i = 0; i != num_rows; ++i) {\n      m.row(i).maxCoeff(&col_id);\n      ans.row(i) = powerset_mapping_.row(col_id);\n    }\n\n    return ans;\n  }\n\n  // See also\n  // https://github.com/pyannote/pyannote-audio/blob/develop/pyannote/audio/pipelines/utils/diarization.py#L122\n  Int32RowVector ComputeSpeakersPerFrame(\n      const std::vector<Matrix2DInt32> &labels) const {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n    int32_t window_shift = meta_data.window_shift;\n    int32_t receptive_field_shift = meta_data.receptive_field_shift;\n\n    int32_t num_chunks = labels.size();\n\n    int32_t num_frames = (window_size + (num_chunks - 1) * window_shift) /\n                             receptive_field_shift +\n                         1;\n\n    FloatRowVector count(num_frames);\n    FloatRowVector weight(num_frames);\n    count.setZero();\n    weight.setZero();\n\n    for (int32_t i = 0; i != num_chunks; ++i) {\n      int32_t start =\n          static_cast<float>(i) * window_shift / receptive_field_shift + 0.5;\n\n      auto seq = Eigen::seqN(start, labels[i].rows());\n\n      count(seq).array() += labels[i].rowwise().sum().array().cast<float>();\n\n      weight(seq).array() += 1;\n    }\n\n    return ((count.array() / (weight.array() + 1e-12f)) + 0.5).cast<int32_t>();\n  }\n\n  // ans.first: a list of (chunk_id, speaker_id)\n  // ans.second: a list of list of (start_sample_index, end_sample_index)\n  //\n  // ans.first[i] corresponds to ans.second[i]\n  std::pair<std::vector<Int32Pair>, std::vector<std::vector<Int32Pair>>>\n  GetChunkSpeakerSampleIndexes(const std::vector<Matrix2DInt32> &labels) const {\n    auto new_labels = ExcludeOverlap(labels);\n\n    std::vector<Int32Pair> chunk_speaker_list;\n    std::vector<std::vector<Int32Pair>> samples_index_list;\n\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n 
   int32_t window_shift = meta_data.window_shift;\n    int32_t receptive_field_shift = meta_data.receptive_field_shift;\n    int32_t num_speakers = meta_data.num_speakers;\n\n    int32_t chunk_index = 0;\n    for (const auto &label : new_labels) {\n      Matrix2DInt32 tmp = label.transpose();\n      // tmp: (num_speakers, num_frames)\n      int32_t num_frames = tmp.cols();\n\n      int32_t sample_offset = chunk_index * window_shift;\n\n      for (int32_t speaker_index = 0; speaker_index != num_speakers;\n           ++speaker_index) {\n        auto d = tmp.row(speaker_index);\n        if (d.sum() < 10) {\n          // skip segments less than 10 frames\n          continue;\n        }\n\n        Int32Pair this_chunk_speaker = {chunk_index, speaker_index};\n        std::vector<Int32Pair> this_speaker_samples;\n\n        bool is_active = false;\n        int32_t start_index;\n\n        for (int32_t k = 0; k != num_frames; ++k) {\n          if (d[k] != 0) {\n            if (!is_active) {\n              is_active = true;\n              start_index = k;\n            }\n          } else if (is_active) {\n            is_active = false;\n\n            int32_t start_samples =\n                static_cast<float>(start_index) / num_frames * window_size +\n                sample_offset;\n            int32_t end_samples =\n                static_cast<float>(k) / num_frames * window_size +\n                sample_offset;\n\n            this_speaker_samples.emplace_back(start_samples, end_samples);\n          }\n        }\n\n        if (is_active) {\n          int32_t start_samples =\n              static_cast<float>(start_index) / num_frames * window_size +\n              sample_offset;\n          int32_t end_samples =\n              static_cast<float>(num_frames - 1) / num_frames * window_size +\n              sample_offset;\n          this_speaker_samples.emplace_back(start_samples, end_samples);\n        }\n\n        chunk_speaker_list.push_back(std::move(this_chunk_speaker));\n  
      samples_index_list.push_back(std::move(this_speaker_samples));\n      }  // for (int32_t speaker_index = 0;\n      chunk_index += 1;\n    }  // for (const auto &label : new_labels)\n\n    return {chunk_speaker_list, samples_index_list};\n  }\n\n  // If there are multiple speakers at a frame, then this frame is excluded.\n  std::vector<Matrix2DInt32> ExcludeOverlap(\n      const std::vector<Matrix2DInt32> &labels) const {\n    int32_t num_chunks = labels.size();\n    std::vector<Matrix2DInt32> ans;\n    ans.reserve(num_chunks);\n\n    for (const auto &label : labels) {\n      Matrix2DInt32 new_label(label.rows(), label.cols());\n      new_label.setZero();\n      Int32RowVector v = label.rowwise().sum();\n\n      for (int32_t i = 0; i != v.cols(); ++i) {\n        if (v[i] < 2) {\n          new_label.row(i) = label.row(i);\n        }\n      }\n\n      ans.push_back(std::move(new_label));\n    }\n\n    return ans;\n  }\n\n  /**\n   * @param sample_indexes[i] contains the sample segment start and end indexes\n   *                          for the i-th (chunk, speaker) pair\n   * @return Return a matrix of shape (sample_indexes.size(), embedding_dim)\n   *         where ans.row[i] contains the embedding for the\n   *         i-th (chunk, speaker) pair\n   */\n  Matrix2D ComputeEmbeddings(\n      const float *audio, int32_t n,\n      const std::vector<std::vector<Int32Pair>> &sample_indexes,\n      std::vector<int32_t> *valid_indexes,\n      OfflineSpeakerDiarizationProgressCallback callback,\n      void *callback_arg) const {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t sample_rate = meta_data.sample_rate;\n    Matrix2D ans(sample_indexes.size(), embedding_extractor_.Dim());\n\n    auto IsNaNWrapper = [](float f) -> bool { return std::isnan(f); };\n\n    int32_t k = 0;\n    int32_t cur_row_index = 0;\n    for (const auto &v : sample_indexes) {\n      auto stream = embedding_extractor_.CreateStream();\n      for (const auto &p : 
v) {\n        int32_t end = (p.second <= n) ? p.second : n;\n        int32_t num_samples = end - p.first;\n\n        if (num_samples > 0) {\n          stream->AcceptWaveform(sample_rate, audio + p.first, num_samples);\n        }\n      }\n\n      stream->InputFinished();\n      if (!embedding_extractor_.IsReady(stream.get())) {\n        SHERPA_ONNX_LOGE(\n            \"This segment is too short, which should not happen since we have \"\n            \"already filtered short segments\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      std::vector<float> embedding = embedding_extractor_.Compute(stream.get());\n\n      if (std::none_of(embedding.begin(), embedding.end(), IsNaNWrapper)) {\n        // a valid embedding\n        std::copy(embedding.begin(), embedding.end(), &ans(cur_row_index, 0));\n        cur_row_index += 1;\n        valid_indexes->push_back(k);\n      }\n\n      k += 1;\n\n      if (callback) {\n        callback(k, ans.rows(), callback_arg);\n      }\n    }\n\n    if (k != cur_row_index) {\n      auto seq = Eigen::seqN(0, cur_row_index);\n      ans = ans(seq, Eigen::all);\n    }\n\n    return ans;\n  }\n\n  std::unordered_map<Int32Pair, int32_t, PairHash> ConvertChunkSpeakerToCluster(\n      const std::vector<Int32Pair> &chunk_speaker_pair,\n      const std::vector<int32_t> &cluster_labels) const {\n    std::unordered_map<Int32Pair, int32_t, PairHash> ans;\n\n    int32_t k = 0;\n    for (const auto &p : chunk_speaker_pair) {\n      ans[p] = cluster_labels[k];\n      k += 1;\n    }\n\n    return ans;\n  }\n\n  std::vector<Matrix2DInt32> ReLabel(\n      const std::vector<Matrix2DInt32> &labels, int32_t max_cluster_index,\n      std::unordered_map<Int32Pair, int32_t, PairHash> chunk_speaker_to_cluster)\n      const {\n    std::vector<Matrix2DInt32> new_labels;\n    new_labels.reserve(labels.size());\n\n    int32_t chunk_index = 0;\n    for (const auto &label : labels) {\n      Matrix2DInt32 new_label(label.rows(), max_cluster_index + 1);\n      
new_label.setZero();\n\n      Matrix2DInt32 t = label.transpose();\n      // t: (num_speakers, num_frames)\n\n      for (int32_t speaker_index = 0; speaker_index != t.rows();\n           ++speaker_index) {\n        if (chunk_speaker_to_cluster.count({chunk_index, speaker_index}) == 0) {\n          continue;\n        }\n\n        int32_t new_speaker_index =\n            chunk_speaker_to_cluster.at({chunk_index, speaker_index});\n\n        for (int32_t k = 0; k != t.cols(); ++k) {\n          if (t(speaker_index, k) == 1) {\n            new_label(k, new_speaker_index) = 1;\n          }\n        }\n      }\n\n      new_labels.push_back(std::move(new_label));\n\n      chunk_index += 1;\n    }\n\n    return new_labels;\n  }\n\n  Matrix2DInt32 ComputeSpeakerCount(const std::vector<Matrix2DInt32> &labels,\n                                    int32_t num_samples) const {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n    int32_t window_shift = meta_data.window_shift;\n    int32_t receptive_field_shift = meta_data.receptive_field_shift;\n\n    int32_t num_chunks = labels.size();\n\n    int32_t num_frames = (window_size + (num_chunks - 1) * window_shift) /\n                             receptive_field_shift +\n                         1;\n\n    Matrix2DInt32 count(num_frames, labels[0].cols());\n    count.setZero();\n\n    for (int32_t i = 0; i != num_chunks; ++i) {\n      int32_t start =\n          static_cast<float>(i) * window_shift / receptive_field_shift + 0.5;\n\n      auto seq = Eigen::seqN(start, labels[i].rows());\n\n      count(seq, Eigen::all).array() += labels[i].array();\n    }\n\n    bool has_last_chunk = ((num_samples - window_size) % window_shift) > 0;\n\n    if (!has_last_chunk) {\n      return count;\n    }\n\n    int32_t last_frame = num_samples / receptive_field_shift;\n    return count(Eigen::seq(0, last_frame), Eigen::all);\n  }\n\n  Matrix2DInt32 FinalizeLabels(const Matrix2DInt32 
&count,\n                               const Int32RowVector &speakers_per_frame) const {\n    int32_t num_rows = count.rows();\n    int32_t num_cols = count.cols();\n\n    Matrix2DInt32 ans(num_rows, num_cols);\n    ans.setZero();\n\n    for (int32_t i = 0; i != num_rows; ++i) {\n      int32_t k = speakers_per_frame[i];\n      if (k == 0) {\n        continue;\n      }\n      auto top_k = TopkIndex(&count(i, 0), num_cols, k);\n\n      for (int32_t m : top_k) {\n        ans(i, m) = 1;\n      }\n    }\n\n    return ans;\n  }\n\n  OfflineSpeakerDiarizationResult ComputeResult(\n      const Matrix2DInt32 &final_labels) const {\n    Matrix2DInt32 final_labels_t = final_labels.transpose();\n    int32_t num_speakers = final_labels_t.rows();\n    int32_t num_frames = final_labels_t.cols();\n\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n    int32_t window_shift = meta_data.window_shift;\n    int32_t receptive_field_shift = meta_data.receptive_field_shift;\n    int32_t receptive_field_size = meta_data.receptive_field_size;\n    int32_t sample_rate = meta_data.sample_rate;\n\n    float scale = static_cast<float>(receptive_field_shift) / sample_rate;\n    float scale_offset = 0.5 * receptive_field_size / sample_rate;\n\n    OfflineSpeakerDiarizationResult ans;\n\n    for (int32_t speaker_index = 0; speaker_index != num_speakers;\n         ++speaker_index) {\n      std::vector<OfflineSpeakerDiarizationSegment> this_speaker;\n\n      bool is_active = final_labels_t(speaker_index, 0) > 0;\n      int32_t start_index = is_active ? 
0 : -1;\n\n      for (int32_t frame_index = 1; frame_index != num_frames; ++frame_index) {\n        if (is_active) {\n          if (final_labels_t(speaker_index, frame_index) == 0) {\n            float start_time = start_index * scale + scale_offset;\n            float end_time = frame_index * scale + scale_offset;\n\n            OfflineSpeakerDiarizationSegment segment(start_time, end_time,\n                                                     speaker_index);\n            this_speaker.push_back(segment);\n\n            is_active = false;\n          }\n        } else if (final_labels_t(speaker_index, frame_index) == 1) {\n          is_active = true;\n          start_index = frame_index;\n        }\n      }\n\n      if (is_active) {\n        float start_time = start_index * scale + scale_offset;\n        float end_time = (num_frames - 1) * scale + scale_offset;\n\n        OfflineSpeakerDiarizationSegment segment(start_time, end_time,\n                                                 speaker_index);\n        this_speaker.push_back(segment);\n      }\n\n      // merge segments if the gap between them is less than min_duration_off\n      MergeSegments(&this_speaker);\n\n      for (const auto &seg : this_speaker) {\n        if (seg.Duration() > config_.min_duration_on) {\n          ans.Add(seg);\n        }\n      }\n    }  // for (int32_t speaker_index = 0; speaker_index != num_speakers;\n\n    return ans;\n  }\n\n  OfflineSpeakerDiarizationResult HandleOneChunkSpecialCase(\n      const Matrix2DInt32 &final_labels, int32_t num_samples) const {\n    const auto &meta_data = segmentation_model_.GetModelMetaData();\n    int32_t window_size = meta_data.window_size;\n    int32_t window_shift = meta_data.window_shift;\n    int32_t receptive_field_shift = meta_data.receptive_field_shift;\n\n    bool has_last_chunk = (num_samples - window_size) % window_shift > 0;\n    if (!has_last_chunk) {\n      return ComputeResult(final_labels);\n    }\n\n    int32_t num_frames = 
final_labels.rows();\n\n    int32_t new_num_frames = num_samples / receptive_field_shift;\n\n    num_frames = (new_num_frames <= num_frames) ? new_num_frames : num_frames;\n\n    return ComputeResult(final_labels(Eigen::seq(0, num_frames), Eigen::all));\n  }\n\n  void MergeSegments(\n      std::vector<OfflineSpeakerDiarizationSegment> *segments) const {\n    float min_duration_off = config_.min_duration_off;\n    bool changed = true;\n    while (changed) {\n      changed = false;\n      for (int32_t i = 0; i < static_cast<int32_t>(segments->size()) - 1; ++i) {\n        auto s = (*segments)[i].Merge((*segments)[i + 1], min_duration_off);\n        if (s) {\n          (*segments)[i] = s.value();\n          segments->erase(segments->begin() + i + 1);\n\n          changed = true;\n          break;\n        }\n      }\n    }\n  }\n\n private:\n  OfflineSpeakerDiarizationConfig config_;\n  OfflineSpeakerSegmentationPyannoteModel segmentation_model_;\n  SpeakerEmbeddingExtractor embedding_extractor_;\n  std::unique_ptr<FastClustering> clustering_;\n  Matrix2DInt32 powerset_mapping_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_PYANNOTE_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization-result.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization-result.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-result.h\"\n\n#include <algorithm>\n#include <array>\n#include <cstdio>\n#include <sstream>\n#include <string>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nOfflineSpeakerDiarizationSegment::OfflineSpeakerDiarizationSegment(\n    float start, float end, int32_t speaker, const std::string &text /*= {}*/) {\n  if (start > end) {\n    SHERPA_ONNX_LOGE(\"start %.3f should be less than end %.3f\", start, end);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  start_ = start;\n  end_ = end;\n  speaker_ = speaker;\n  text_ = text;\n}\n\nstd::optional<OfflineSpeakerDiarizationSegment>\nOfflineSpeakerDiarizationSegment::Merge(\n    const OfflineSpeakerDiarizationSegment &other, float gap) const {\n  if (other.speaker_ != speaker_) {\n    SHERPA_ONNX_LOGE(\n        \"The two segments should have the same speaker. 
this->speaker: %d, \"\n        \"other.speaker: %d\",\n        speaker_, other.speaker_);\n    return std::nullopt;\n  }\n\n  if (end_ < other.start_ && end_ + gap >= other.start_) {\n    return OfflineSpeakerDiarizationSegment(start_, other.end_, speaker_);\n  } else if (other.end_ < start_ && other.end_ + gap >= start_) {\n    return OfflineSpeakerDiarizationSegment(other.start_, end_, speaker_);\n  } else {\n    return std::nullopt;\n  }\n}\n\nstd::string OfflineSpeakerDiarizationSegment::ToString() const {\n  std::array<char, 128> s{};\n\n  snprintf(s.data(), s.size(), \"%.3f -- %.3f speaker_%02d\", start_, end_,\n           speaker_);\n\n  std::ostringstream os;\n  os << s.data();\n\n  if (!text_.empty()) {\n    os << \" \" << text_;\n  }\n\n  return os.str();\n}\n\nvoid OfflineSpeakerDiarizationResult::Add(\n    const OfflineSpeakerDiarizationSegment &segment) {\n  segments_.push_back(segment);\n}\n\nint32_t OfflineSpeakerDiarizationResult::NumSpeakers() const {\n  std::unordered_set<int32_t> count;\n  for (const auto &s : segments_) {\n    count.insert(s.Speaker());\n  }\n\n  return count.size();\n}\n\nint32_t OfflineSpeakerDiarizationResult::NumSegments() const {\n  return segments_.size();\n}\n\n// Return a list of segments sorted by segment.start time\nstd::vector<OfflineSpeakerDiarizationSegment>\nOfflineSpeakerDiarizationResult::SortByStartTime() const {\n  auto ans = segments_;\n  std::sort(ans.begin(), ans.end(), [](const auto &a, const auto &b) {\n    return (a.Start() < b.Start()) ||\n           ((a.Start() == b.Start()) && (a.Speaker() < b.Speaker()));\n  });\n\n  return ans;\n}\n\nstd::vector<std::vector<OfflineSpeakerDiarizationSegment>>\nOfflineSpeakerDiarizationResult::SortBySpeaker() const {\n  auto tmp = segments_;\n  std::sort(tmp.begin(), tmp.end(), [](const auto &a, const auto &b) {\n    return (a.Speaker() < b.Speaker()) ||\n           ((a.Speaker() == b.Speaker()) && (a.Start() < b.Start()));\n  });\n\n  
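// Sized by the largest speaker ID rather than NumSpeakers(), so that\n  // ans[s.Speaker()] stays in range even if speaker IDs are not contiguous.\n  // tmp is sorted by speaker, so tmp.back() holds the largest ID.\n  std::vector<std::vector<OfflineSpeakerDiarizationSegment>> ans(\n      tmp.empty() ? 0 : tmp.back().Speaker() + 1);\n  for (auto &s : tmp) {\n    ans[s.Speaker()].push_back(std::move(s));\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"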
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization-result.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization-result.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n\n#include <cstdint>\n#include <optional>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeakerDiarizationSegment {\n public:\n  OfflineSpeakerDiarizationSegment(float start, float end, int32_t speaker,\n                                   const std::string &text = {});\n\n  // If the gap between the two segments is less than the given gap, then we\n  // merge them and return a new segment. Otherwise, it returns null.\n  std::optional<OfflineSpeakerDiarizationSegment> Merge(\n      const OfflineSpeakerDiarizationSegment &other, float gap) const;\n\n  float Start() const { return start_; }\n  float End() const { return end_; }\n  int32_t Speaker() const { return speaker_; }\n  const std::string &Text() const { return text_; }\n  float Duration() const { return end_ - start_; }\n\n  void SetText(const std::string &text) { text_ = text; }\n\n  std::string ToString() const;\n\n private:\n  float start_;       // in seconds\n  float end_;         // in seconds\n  int32_t speaker_;   // ID of the speaker, starting from 0\n  std::string text_;  // If not empty, it contains the speech recognition result\n                      // of this segment\n};\n\nclass OfflineSpeakerDiarizationResult {\n public:\n  // Add a new segment\n  void Add(const OfflineSpeakerDiarizationSegment &segment);\n\n  // Number of distinct speakers contained in this object at this point\n  int32_t NumSpeakers() const;\n\n  int32_t NumSegments() const;\n\n  // Return a list of segments sorted by segment.start time\n  std::vector<OfflineSpeakerDiarizationSegment> SortByStartTime() const;\n\n  // ans.size() == NumSpeakers().\n  // ans[i] is for speaker_i and is sorted by start time\n  
std::vector<std::vector<OfflineSpeakerDiarizationSegment>> SortBySpeaker()\n      const;\n\n private:\n  std::vector<OfflineSpeakerDiarizationSegment> segments_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\n\n#include <string>\n#include <utility>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeakerDiarizationConfig::Register(ParseOptions *po) {\n  ParseOptions po_segmentation(\"segmentation\", po);\n  segmentation.Register(&po_segmentation);\n\n  ParseOptions po_embedding(\"embedding\", po);\n  embedding.Register(&po_embedding);\n\n  ParseOptions po_clustering(\"clustering\", po);\n  clustering.Register(&po_clustering);\n\n  po->Register(\"min-duration-on\", &min_duration_on,\n               \"if a segment is less than this value, then it is discarded. \"\n               \"Set it to 0 so that no segment is discarded\");\n\n  po->Register(\"min-duration-off\", &min_duration_off,\n               \"if the gap between to segments of the same speaker is less \"\n               \"than this value, then these two segments are merged into a \"\n               \"single segment. 
We do it recursively.\");\n}\n\nbool OfflineSpeakerDiarizationConfig::Validate() const {\n  if (!segmentation.Validate()) {\n    return false;\n  }\n\n  if (!embedding.Validate()) {\n    return false;\n  }\n\n  if (!clustering.Validate()) {\n    return false;\n  }\n\n  if (min_duration_on < 0) {\n    SHERPA_ONNX_LOGE(\"min_duration_on %.3f is negative\", min_duration_on);\n    return false;\n  }\n\n  if (min_duration_off < 0) {\n    SHERPA_ONNX_LOGE(\"min_duration_off %.3f is negative\", min_duration_off);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSpeakerDiarizationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeakerDiarizationConfig(\";\n  os << \"segmentation=\" << segmentation.ToString() << \", \";\n  os << \"embedding=\" << embedding.ToString() << \", \";\n  os << \"clustering=\" << clustering.ToString() << \", \";\n  os << \"min_duration_on=\" << min_duration_on << \", \";\n  os << \"min_duration_off=\" << min_duration_off << \")\";\n\n  return os.str();\n}\n\nOfflineSpeakerDiarization::OfflineSpeakerDiarization(\n    const OfflineSpeakerDiarizationConfig &config)\n    : impl_(OfflineSpeakerDiarizationImpl::Create(config)) {}\n\ntemplate <typename Manager>\nOfflineSpeakerDiarization::OfflineSpeakerDiarization(\n    Manager *mgr, const OfflineSpeakerDiarizationConfig &config)\n    : impl_(OfflineSpeakerDiarizationImpl::Create(mgr, config)) {}\n\nOfflineSpeakerDiarization::~OfflineSpeakerDiarization() = default;\n\nint32_t OfflineSpeakerDiarization::SampleRate() const {\n  return impl_->SampleRate();\n}\n\nvoid OfflineSpeakerDiarization::SetConfig(\n    const OfflineSpeakerDiarizationConfig &config) {\n  impl_->SetConfig(config);\n}\n\nOfflineSpeakerDiarizationResult OfflineSpeakerDiarization::Process(\n    const float *audio, int32_t n,\n    OfflineSpeakerDiarizationProgressCallback callback /*= nullptr*/,\n    void *callback_arg /*= nullptr*/) const {\n  return impl_->Process(audio, n, std::move(callback), 
callback_arg);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSpeakerDiarization::OfflineSpeakerDiarization(\n    AAssetManager *mgr, const OfflineSpeakerDiarizationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSpeakerDiarization::OfflineSpeakerDiarization(\n    NativeResourceManager *mgr, const OfflineSpeakerDiarizationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-diarization.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-diarization.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n\n#include <functional>\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/fast-clustering-config.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-result.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeakerDiarizationConfig {\n  OfflineSpeakerSegmentationModelConfig segmentation;\n  SpeakerEmbeddingExtractorConfig embedding;\n  FastClusteringConfig clustering;\n\n  // if a segment is less than this value, then it is discarded\n  float min_duration_on = 0.3;  // in seconds\n\n  // if the gap between two segments of the same speaker is less than this\n  // value, then these two segments are merged into a single segment.\n  // We do this recursively.\n  float min_duration_off = 0.5;  // in seconds\n\n  OfflineSpeakerDiarizationConfig() = default;\n\n  OfflineSpeakerDiarizationConfig(\n      const OfflineSpeakerSegmentationModelConfig &segmentation,\n      const SpeakerEmbeddingExtractorConfig &embedding,\n      const FastClusteringConfig &clustering, float min_duration_on,\n      float min_duration_off)\n      : segmentation(segmentation),\n        embedding(embedding),\n        clustering(clustering),\n        min_duration_on(min_duration_on),\n        min_duration_off(min_duration_off) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n  std::string ToString() const;\n};\n\nclass OfflineSpeakerDiarizationImpl;\n\nusing OfflineSpeakerDiarizationProgressCallback = std::function<int32_t(\n    int32_t processed_chunks, int32_t num_chunks, void *arg)>;\n\nclass OfflineSpeakerDiarization {\n public:\n  explicit OfflineSpeakerDiarization(\n      const 
OfflineSpeakerDiarizationConfig &config);\n\n  template <typename Manager>\n  OfflineSpeakerDiarization(Manager *mgr,\n                            const OfflineSpeakerDiarizationConfig &config);\n\n  ~OfflineSpeakerDiarization();\n\n  // Expected sample rate of the input audio samples\n  int32_t SampleRate() const;\n\n  // Note: Only config.clustering is used. All other fields in config are\n  // ignored\n  void SetConfig(const OfflineSpeakerDiarizationConfig &config);\n\n  OfflineSpeakerDiarizationResult Process(\n      const float *audio, int32_t n,\n      OfflineSpeakerDiarizationProgressCallback callback = nullptr,\n      void *callback_arg = nullptr) const;\n\n private:\n  std::unique_ptr<OfflineSpeakerDiarizationImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeakerSegmentationModelConfig::Register(ParseOptions *po) {\n  pyannote.Register(po);\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OfflineSpeakerSegmentationModelConfig::Validate() const {\n  if (num_threads < 1) {\n    SHERPA_ONNX_LOGE(\"num_threads should be > 0. Given %d\", num_threads);\n    return false;\n  }\n\n  if (!pyannote.model.empty()) {\n    return pyannote.Validate();\n  }\n\n  if (pyannote.model.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"You have to provide at least one speaker segmentation model\");\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSpeakerSegmentationModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeakerSegmentationModelConfig(\";\n  os << \"pyannote=\" << pyannote.ToString() << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeakerSegmentationModelConfig {\n  OfflineSpeakerSegmentationPyannoteModelConfig pyannote;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OfflineSpeakerSegmentationModelConfig() = default;\n\n  explicit OfflineSpeakerSegmentationModelConfig(\n      const OfflineSpeakerSegmentationPyannoteModelConfig &pyannote,\n      int32_t num_threads, bool debug, const std::string &provider)\n      : pyannote(pyannote),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeakerSegmentationPyannoteModelConfig::Register(ParseOptions *po) {\n  po->Register(\"pyannote-model\", &model,\n               \"Path to model.onnx of the Pyannote segmentation model.\");\n}\n\nbool OfflineSpeakerSegmentationPyannoteModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"Pyannote segmentation model: '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSpeakerSegmentationPyannoteModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeakerSegmentationPyannoteModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_CONFIG_H_\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeakerSegmentationPyannoteModelConfig {\n  std::string model;\n\n  OfflineSpeakerSegmentationPyannoteModelConfig() = default;\n\n  explicit OfflineSpeakerSegmentationPyannoteModelConfig(\n      const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// If you are not sure what each field means, please\n// have a look at the Python file in the model directory that\n// you have downloaded.\nstruct OfflineSpeakerSegmentationPyannoteModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t window_size = 0;            // in samples\n  int32_t window_shift = 0;           // in samples\n  int32_t receptive_field_size = 0;   // in samples\n  int32_t receptive_field_shift = 0;  // in samples\n  int32_t num_speakers = 0;\n  int32_t powerset_max_classes = 0;\n  int32_t num_classes = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.cc",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeakerSegmentationPyannoteModel::Impl {\n public:\n  explicit Impl(const OfflineSpeakerSegmentationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.pyannote.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineSpeakerSegmentationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.pyannote.model);\n    Init(buf.data(), buf.size());\n  }\n\n  const OfflineSpeakerSegmentationPyannoteModelMetaData &GetModelMetaData()\n      const {\n    return meta_data_;\n  }\n\n  Ort::Value Forward(Ort::Value x) {\n    auto out = sess_->Run({}, input_names_ptr_.data(), &x, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), 
&output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.window_size, \"window_size\");\n\n    meta_data_.window_shift =\n        static_cast<int32_t>(0.1 * meta_data_.window_size);\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.receptive_field_size,\n                               \"receptive_field_size\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.receptive_field_shift,\n                               \"receptive_field_shift\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_speakers, \"num_speakers\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.powerset_max_classes,\n                               \"powerset_max_classes\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_classes, \"num_classes\");\n  }\n\n private:\n  OfflineSpeakerSegmentationModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineSpeakerSegmentationPyannoteModelMetaData meta_data_;\n};\n\nOfflineSpeakerSegmentationPyannoteModel::\n    OfflineSpeakerSegmentationPyannoteModel(  // NOLINT\n        const OfflineSpeakerSegmentationModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}  // NOLINT\n\ntemplate <typename Manager>\nOfflineSpeakerSegmentationPyannoteModel::\n    
OfflineSpeakerSegmentationPyannoteModel(  // NOLINT\n        Manager *mgr, const OfflineSpeakerSegmentationModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}  // NOLINT\n\nOfflineSpeakerSegmentationPyannoteModel::\n    ~OfflineSpeakerSegmentationPyannoteModel() = default;  // NOLINT\n\nconst OfflineSpeakerSegmentationPyannoteModelMetaData &\nOfflineSpeakerSegmentationPyannoteModel::GetModelMetaData() const {\n  return impl_->GetModelMetaData();\n}\n\nOrt::Value OfflineSpeakerSegmentationPyannoteModel::Forward(\n    Ort::Value x) const {\n  return impl_->Forward(std::move(x));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSpeakerSegmentationPyannoteModel::\n    OfflineSpeakerSegmentationPyannoteModel(  // NOLINT\n        AAssetManager *mgr,\n        const OfflineSpeakerSegmentationModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSpeakerSegmentationPyannoteModel::\n    OfflineSpeakerSegmentationPyannoteModel(  // NOLINT\n        NativeResourceManager *mgr,\n        const OfflineSpeakerSegmentationModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_H_\n\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeakerSegmentationPyannoteModel {\n public:\n  explicit OfflineSpeakerSegmentationPyannoteModel(\n      const OfflineSpeakerSegmentationModelConfig &config);\n\n  template <typename Manager>\n  OfflineSpeakerSegmentationPyannoteModel(\n      Manager *mgr, const OfflineSpeakerSegmentationModelConfig &config);\n\n  ~OfflineSpeakerSegmentationPyannoteModel();\n\n  const OfflineSpeakerSegmentationPyannoteModelMetaData &GetModelMetaData()\n      const;\n\n  /**\n   * @param x A 3-D float tensor of shape (batch_size, 1, num_samples)\n   * @return Return a float tensor of\n   *         shape (batch_size, num_frames, num_speakers). Note that\n   *         num_speakers here uses powerset encoding.\n   */\n  Ort::Value Forward(Ort::Value x) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEAKER_SEGMENTATION_PYANNOTE_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-impl.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-impl.h\n//\n// Copyright (c)  2026  Ceva Inc\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n\n#include <algorithm>\n#include <array>\n#include <cmath>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"kaldi-native-fbank/csrc/istft.h\"\n#include \"kaldi-native-fbank/csrc/stft.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-impl.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserDpdfNetImpl : public OfflineSpeechDenoiserImpl {\n public:\n  explicit OfflineSpeechDenoiserDpdfNetImpl(\n      const OfflineSpeechDenoiserConfig &config)\n      : model_(config.model) {}\n\n  template <typename Manager>\n  OfflineSpeechDenoiserDpdfNetImpl(Manager *mgr,\n                                   const OfflineSpeechDenoiserConfig &config)\n      : model_(mgr, config.model) {}\n\n  DenoisedAudio Run(const float *samples, int32_t n,\n                    int32_t sample_rate) const override {\n    const auto &meta = model_.GetMetaData();\n\n    std::vector<float> tmp;\n    auto p = samples;\n\n    if (sample_rate != meta.sample_rate) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\\n\",\n          sample_rate, meta.sample_rate);\n\n      float min_freq = std::min<int32_t>(sample_rate, meta.sample_rate);\n      float lowpass_cutoff = 0.99f * 0.5f * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      auto resampler = std::make_unique<LinearResample>(\n          sample_rate, meta.sample_rate, lowpass_cutoff, lowpass_filter_width);\n      resampler->Resample(samples, n, true, &tmp);\n      
p = tmp.data();\n      n = tmp.size();\n    }\n\n    auto stft_config = GetStftConfig();\n    knf::Stft stft(stft_config);\n    knf::StftResult stft_result = stft.Compute(p, n);\n\n    auto state = model_.GetInitState();\n    Ort::Value next_state{nullptr};\n\n    knf::StftResult enhanced_stft_result;\n    enhanced_stft_result.num_frames = stft_result.num_frames;\n    for (int32_t i = 0; i < stft_result.num_frames; ++i) {\n      auto frame = Process(stft_result, i, std::move(state), &next_state);\n      state = std::move(next_state);\n\n      enhanced_stft_result.real.insert(enhanced_stft_result.real.end(),\n                                       frame.first.begin(), frame.first.end());\n      enhanced_stft_result.imag.insert(enhanced_stft_result.imag.end(),\n                                       frame.second.begin(),\n                                       frame.second.end());\n    }\n\n    knf::IStft istft(stft_config);\n\n    DenoisedAudio denoised_audio;\n    denoised_audio.sample_rate = meta.sample_rate;\n    denoised_audio.samples = ShiftWaveform(istft.Compute(enhanced_stft_result),\n                                           meta.window_length * 2);\n    return denoised_audio;\n  }\n\n  int32_t GetSampleRate() const override {\n    return model_.GetMetaData().sample_rate;\n  }\n\n private:\n  static std::vector<float> ShiftWaveform(std::vector<float> samples,\n                                          int32_t shift) {\n    if (samples.size() > static_cast<size_t>(shift)) {\n      std::copy(samples.begin() + shift, samples.end(), samples.begin());\n      samples.resize(samples.size() - shift);\n    } else {\n      samples.clear();\n    }\n\n    samples.resize(samples.size() + shift, 0.0f);\n    return samples;\n  }\n\n  knf::StftConfig GetStftConfig() const {\n    const auto &meta = model_.GetMetaData();\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = meta.n_fft;\n    stft_config.hop_length = meta.hop_length;\n    stft_config.win_length = 
meta.window_length;\n    stft_config.normalized = meta.normalized;\n    stft_config.center = meta.center;\n    stft_config.pad_mode = meta.pad_mode;\n    stft_config.window_type = meta.window_type;\n    stft_config.window = MakeVorbisWindow(meta.window_length);\n\n    return stft_config;\n  }\n\n  std::pair<std::vector<float>, std::vector<float>> Process(\n      const knf::StftResult &stft_result, int32_t frame_index, Ort::Value state,\n      Ort::Value *next_state) const {\n    const auto &meta = model_.GetMetaData();\n    const int32_t n_fft = meta.n_fft;\n\n    std::vector<float> x((n_fft / 2 + 1) * 2);\n\n    const float *p_real =\n        stft_result.real.data() + frame_index * (n_fft / 2 + 1);\n    const float *p_imag =\n        stft_result.imag.data() + frame_index * (n_fft / 2 + 1);\n\n    for (int32_t i = 0; i < n_fft / 2 + 1; ++i) {\n      x[2 * i] = p_real[i];\n      x[2 * i + 1] = p_imag[i];\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 4> x_shape{1, 1, n_fft / 2 + 1, 2};\n    Ort::Value x_tensor = Ort::Value::CreateTensor<float>(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value output{nullptr};\n    std::tie(output, *next_state) =\n        model_.Run(std::move(x_tensor), std::move(state));\n\n    std::vector<float> real(n_fft / 2 + 1);\n    std::vector<float> imag(n_fft / 2 + 1);\n    const auto *p = output.GetTensorData<float>();\n    for (int32_t i = 0; i < n_fft / 2 + 1; ++i) {\n      real[i] = p[2 * i];\n      imag[i] = p[2 * i + 1];\n    }\n\n    return {std::move(real), std::move(imag)};\n  }\n\n private:\n  OfflineSpeechDenoiserDpdfNetModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.cc\n//\n// Copyright (c)  2026  Ceva Inc\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeechDenoiserDpdfNetModelConfig::Register(ParseOptions *po) {\n  po->Register(\"speech-denoiser-dpdfnet-model\", &model,\n               \"Path to a DPDFNet ONNX model for speech denoising, e.g. \"\n               \"baseline/dpdfnet2/dpdfnet4/dpdfnet8 (16 kHz) or \"\n               \"dpdfnet2_48khz_hr (48 kHz). Download DPDFNet models from the \"\n               \"sherpa-onnx GitHub release or the official Hugging Face hub: \"\n               \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/\"\n               \"speech-enhancement-models or \"\n               \"https://huggingface.co/Ceva-IP/DPDFNet\");\n}\n\nbool OfflineSpeechDenoiserDpdfNetModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --speech-denoiser-dpdfnet-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"dpdfnet model file '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineSpeechDenoiserDpdfNetModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeechDenoiserDpdfNetModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.h\n//\n// Copyright (c)  2026  Ceva Inc\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeechDenoiserDpdfNetModelConfig {\n  std::string model;\n  OfflineSpeechDenoiserDpdfNetModelConfig() = default;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-meta-data.h\n//\n// Copyright (c)  2026  Ceva Inc\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeechDenoiserDpdfNetModelMetaData {\n  int32_t version = 1;\n  int32_t sample_rate = 0;\n  int32_t n_fft = 0;\n  int32_t hop_length = 0;\n  int32_t window_length = 0;\n  bool normalized = false;\n  bool center = true;\n  std::string window_type = \"vorbis\";\n  std::string pad_mode = \"reflect\";\n  int32_t freq_bins = 0;\n  int32_t erb_bins = 0;\n  int32_t spec_bins = 0;\n  int32_t state_size = 0;\n  int32_t erb_norm_state_size = 0;\n  int32_t spec_norm_state_size = 0;\n  std::string profile;\n  std::vector<float> erb_norm_init;\n  std::vector<float> spec_norm_init;\n\n  std::vector<int64_t> spec_shape;\n  std::vector<int64_t> state_shape;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.cc\n//\n// Copyright (c)  2026  Ceva Inc\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nstd::vector<int64_t> GetInputShape(Ort::Session *sess, size_t index) {\n  return sess->GetInputTypeInfo(index).GetTensorTypeAndShapeInfo().GetShape();\n}\n\nstd::vector<int64_t> GetOutputShape(Ort::Session *sess, size_t index) {\n  return sess->GetOutputTypeInfo(index).GetTensorTypeAndShapeInfo().GetShape();\n}\n\n}  // namespace\n\nclass OfflineSpeechDenoiserDpdfNetModel::Impl {\n public:\n  explicit Impl(const OfflineSpeechDenoiserModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config.dpdfnet.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineSpeechDenoiserModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config.dpdfnet.model);\n    Init(buf.data(), buf.size());\n  }\n\n  Ort::Value GetInitState() {\n    Ort::Value state = Ort::Value::CreateTensor<float>(\n        allocator_, meta_.state_shape.data(), meta_.state_shape.size());\n\n    auto *p = 
state.GetTensorMutableData<float>();\n    std::fill_n(p, meta_.state_size, 0.f);\n    std::copy(meta_.erb_norm_init.begin(), meta_.erb_norm_init.end(), p);\n    std::copy(meta_.spec_norm_init.begin(), meta_.spec_norm_init.end(),\n              p + meta_.erb_norm_state_size);\n\n    return state;\n  }\n\n  std::pair<Ort::Value, Ort::Value> Run(Ort::Value x, Ort::Value state) const {\n    std::array<Ort::Value, 2> inputs{std::move(x), std::move(state)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return {std::move(out[0]), std::move(out[1])};\n  }\n\n  const OfflineSpeechDenoiserDpdfNetModelMetaData &GetMetaData() const {\n    return meta_;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macros below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"dpdfnet\") {\n      SHERPA_ONNX_LOGE(\"Expect model type 'dpdfnet'. 
Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.version, \"version\", 1);\n    SHERPA_ONNX_READ_META_DATA_STR(meta_.profile, \"profile\");\n    SHERPA_ONNX_READ_META_DATA(meta_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_.n_fft, \"n_fft\");\n    SHERPA_ONNX_READ_META_DATA(meta_.hop_length, \"hop_length\");\n    SHERPA_ONNX_READ_META_DATA(meta_.window_length, \"window_length\");\n    int32_t normalized = 0;\n    int32_t center = 1;\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(normalized, \"normalized\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(center, \"center\", 1);\n    SHERPA_ONNX_READ_META_DATA_STR(meta_.window_type, \"window_type\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_.pad_mode, \"pad_mode\");\n    SHERPA_ONNX_READ_META_DATA(meta_.freq_bins, \"freq_bins\");\n    SHERPA_ONNX_READ_META_DATA(meta_.erb_bins, \"erb_bins\");\n    SHERPA_ONNX_READ_META_DATA(meta_.spec_bins, \"spec_bins\");\n    SHERPA_ONNX_READ_META_DATA(meta_.state_size, \"state_size\");\n    SHERPA_ONNX_READ_META_DATA(meta_.erb_norm_state_size,\n                               \"erb_norm_state_size\");\n    SHERPA_ONNX_READ_META_DATA(meta_.spec_norm_state_size,\n                               \"spec_norm_state_size\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_.erb_norm_init, \"erb_norm_init\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(meta_.spec_norm_init,\n                                         \"spec_norm_init\");\n\n    if (normalized > 1 || center > 1) {\n      SHERPA_ONNX_LOGE(\n          \"Invalid boolean metadata values. 
normalized=%d, center=%d.\",\n          normalized, center);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    meta_.normalized = normalized != 0;\n    meta_.center = center != 0;\n\n    if (meta_.sample_rate <= 0 || meta_.n_fft <= 0 || meta_.hop_length <= 0 ||\n        meta_.window_length <= 0 || meta_.freq_bins <= 1 ||\n        meta_.erb_bins <= 0 || meta_.spec_bins <= 0 || meta_.state_size <= 0) {\n      SHERPA_ONNX_LOGE(\n          \"Invalid DPDFNet metadata. sample_rate=%d, n_fft=%d, \"\n          \"hop_length=%d, window_length=%d, freq_bins=%d, erb_bins=%d, \"\n          \"spec_bins=%d, state_size=%d.\",\n          meta_.sample_rate, meta_.n_fft, meta_.hop_length, meta_.window_length,\n          meta_.freq_bins, meta_.erb_bins, meta_.spec_bins, meta_.state_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_names_.size() != 2 || output_names_.size() != 2) {\n      SHERPA_ONNX_LOGE(\n          \"Expect the dpdfnet model to have 2 inputs and 2 outputs. \"\n          \"Got %zu inputs and %zu outputs.\",\n          input_names_.size(), output_names_.size());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    auto spec_shape = GetInputShape(sess_.get(), 0);\n    auto state_shape = GetInputShape(sess_.get(), 1);\n    auto out_spec_shape = GetOutputShape(sess_.get(), 0);\n    auto out_state_shape = GetOutputShape(sess_.get(), 1);\n\n    if (spec_shape.size() != 4 || state_shape.size() != 1 ||\n        out_spec_shape.size() != 4 || out_state_shape.size() != 1) {\n      SHERPA_ONNX_LOGE(\n          \"Unexpected dpdfnet ONNX signature. Expected \"\n          \"(spec:[B,T,F,2], state:[S]) -> (spec_e:[B,T,F,2], state_out:[S]). 
\"\n          \"Got spec ndim=%d, state ndim=%d, out_spec ndim=%d, out_state \"\n          \"ndim=%d.\",\n          static_cast<int32_t>(spec_shape.size()),\n          static_cast<int32_t>(state_shape.size()),\n          static_cast<int32_t>(out_spec_shape.size()),\n          static_cast<int32_t>(out_state_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    const int64_t freq_bins = spec_shape[2];\n    const int64_t complex_dim = spec_shape[3];\n    const int64_t state_size = state_shape[0];\n\n    if (freq_bins <= 1 || complex_dim != 2 || state_size <= 0) {\n      SHERPA_ONNX_LOGE(\n          \"Unsupported dpdfnet model shapes. spec ndim=%d, state ndim=%d, \"\n          \"freq_bins=%d, complex_dim=%d, state_size=%d.\",\n          static_cast<int32_t>(spec_shape.size()),\n          static_cast<int32_t>(state_shape.size()),\n          static_cast<int32_t>(freq_bins), static_cast<int32_t>(complex_dim),\n          static_cast<int32_t>(state_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    meta_.spec_shape = std::move(spec_shape);\n    meta_.state_shape = std::move(state_shape);\n\n    if (meta_.freq_bins != freq_bins) {\n      SHERPA_ONNX_LOGE(\n          \"Mismatch between metadata and ONNX graph for freq_bins. \"\n          \"metadata=%d, graph=%d.\",\n          meta_.freq_bins, static_cast<int32_t>(freq_bins));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta_.n_fft != static_cast<int32_t>((freq_bins - 1) * 2)) {\n      SHERPA_ONNX_LOGE(\n          \"Mismatch between metadata and ONNX graph for n_fft. metadata=%d, \"\n          \"graph=%d.\",\n          meta_.n_fft, static_cast<int32_t>((freq_bins - 1) * 2));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta_.state_size != state_size) {\n      SHERPA_ONNX_LOGE(\n          \"Mismatch between metadata and ONNX graph for state_size. 
\"\n          \"metadata=%d, graph=%d.\",\n          meta_.state_size, static_cast<int32_t>(state_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta_.erb_norm_state_size !=\n        static_cast<int32_t>(meta_.erb_norm_init.size())) {\n      SHERPA_ONNX_LOGE(\n          \"Mismatch between erb_norm_state_size (%d) and erb_norm_init size \"\n          \"(%zu).\",\n          meta_.erb_norm_state_size, meta_.erb_norm_init.size());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta_.spec_norm_state_size !=\n        static_cast<int32_t>(meta_.spec_norm_init.size())) {\n      SHERPA_ONNX_LOGE(\n          \"Mismatch between spec_norm_state_size (%d) and spec_norm_init size \"\n          \"(%zu).\",\n          meta_.spec_norm_state_size, meta_.spec_norm_init.size());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    const int32_t init_prefix_state_size =\n        meta_.erb_norm_state_size + meta_.spec_norm_state_size;\n    if (meta_.erb_norm_state_size <= 0 || meta_.spec_norm_state_size <= 0) {\n      SHERPA_ONNX_LOGE(\n          \"Invalid normalization state sizes in the metadata. \"\n          \"erb_norm_state_size=%d, spec_norm_state_size=%d.\",\n          meta_.erb_norm_state_size, meta_.spec_norm_state_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta_.state_size < init_prefix_state_size) {\n      SHERPA_ONNX_LOGE(\n          \"The dpdfnet state tensor is too small: %d. It must be at least %d.\",\n          meta_.state_size, init_prefix_state_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (out_spec_shape[2] != freq_bins || out_spec_shape[3] != 2 ||\n        out_state_shape[0] != state_size) {\n      SHERPA_ONNX_LOGE(\n          \"Unexpected dpdfnet output shapes. 
out_spec[2]=%d, out_spec[3]=%d, \"\n          \"out_state[0]=%d, expected freq_bins=%d, complex_dim=2, \"\n          \"state_size=%d.\",\n          static_cast<int32_t>(out_spec_shape[2]),\n          static_cast<int32_t>(out_spec_shape[3]),\n          static_cast<int32_t>(out_state_shape[0]),\n          static_cast<int32_t>(freq_bins), static_cast<int32_t>(state_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---dpdfnet model---\\n\";\n      PrintModelMetadata(os, meta_data);\n      os << \"input names:\\n\";\n      for (int32_t i = 0; i != static_cast<int32_t>(input_names_.size()); ++i) {\n        os << i << \" \" << input_names_[i] << \"\\n\";\n      }\n\n      os << \"output names:\\n\";\n      for (int32_t i = 0; i != static_cast<int32_t>(output_names_.size());\n           ++i) {\n        os << i << \" \" << output_names_[i] << \"\\n\";\n      }\n\n      os << \"spec shape: \";\n      for (auto d : meta_.spec_shape) {\n        os << d << \" \";\n      }\n      os << \"\\nstate shape: \";\n      for (auto d : meta_.state_shape) {\n        os << d << \" \";\n      }\n      os << \"\\nprofile: \" << meta_.profile;\n      os << \"\\nsample_rate: \" << meta_.sample_rate;\n      os << \"\\nn_fft: \" << meta_.n_fft;\n      os << \"\\nfreq_bins: \" << meta_.freq_bins;\n      os << \"\\nerb_bins: \" << meta_.erb_bins;\n      os << \"\\nspec_bins: \" << meta_.spec_bins;\n      os << \"\\nstate_size: \" << meta_.state_size;\n      os << \"\\nnormalized: \" << static_cast<int32_t>(meta_.normalized);\n      os << \"\\ncenter: \" << static_cast<int32_t>(meta_.center);\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n  }\n\n private:\n  OfflineSpeechDenoiserModelConfig config_;\n  OfflineSpeechDenoiserDpdfNetModelMetaData meta_;\n\n  Ort::Env env_;\n  Ort::SessionOptions 
sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nOfflineSpeechDenoiserDpdfNetModel::~OfflineSpeechDenoiserDpdfNetModel() =\n    default;  // NOLINT\n\nOfflineSpeechDenoiserDpdfNetModel::OfflineSpeechDenoiserDpdfNetModel(\n    const OfflineSpeechDenoiserModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSpeechDenoiserDpdfNetModel::OfflineSpeechDenoiserDpdfNetModel(\n    Manager *mgr, const OfflineSpeechDenoiserModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOrt::Value OfflineSpeechDenoiserDpdfNetModel::GetInitState() const {\n  return impl_->GetInitState();\n}\n\nstd::pair<Ort::Value, Ort::Value> OfflineSpeechDenoiserDpdfNetModel::Run(\n    Ort::Value x, Ort::Value state) const {\n  return impl_->Run(std::move(x), std::move(state));\n}\n\nconst OfflineSpeechDenoiserDpdfNetModelMetaData &\nOfflineSpeechDenoiserDpdfNetModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSpeechDenoiserDpdfNetModel::OfflineSpeechDenoiserDpdfNetModel(\n    AAssetManager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSpeechDenoiserDpdfNetModel::OfflineSpeechDenoiserDpdfNetModel(\n    NativeResourceManager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.h\n//\n// Copyright (c)  2026  Ceva Inc\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_H_\n\n#include <memory>\n#include <utility>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserDpdfNetModel {\n public:\n  ~OfflineSpeechDenoiserDpdfNetModel();\n  explicit OfflineSpeechDenoiserDpdfNetModel(\n      const OfflineSpeechDenoiserModelConfig &config);\n\n  template <typename Manager>\n  OfflineSpeechDenoiserDpdfNetModel(\n      Manager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n\n  Ort::Value GetInitState() const;\n\n  std::pair<Ort::Value, Ort::Value> Run(Ort::Value x, Ort::Value state) const;\n\n  const OfflineSpeechDenoiserDpdfNetModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-impl.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"kaldi-native-fbank/csrc/feature-window.h\"\n#include \"kaldi-native-fbank/csrc/istft.h\"\n#include \"kaldi-native-fbank/csrc/stft.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-impl.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserGtcrnImpl : public OfflineSpeechDenoiserImpl {\n public:\n  explicit OfflineSpeechDenoiserGtcrnImpl(\n      const OfflineSpeechDenoiserConfig &config)\n      : model_(config.model) {}\n\n  template <typename Manager>\n  OfflineSpeechDenoiserGtcrnImpl(Manager *mgr,\n                                 const OfflineSpeechDenoiserConfig &config)\n      : model_(mgr, config.model) {}\n\n  DenoisedAudio Run(const float *samples, int32_t n,\n                    int32_t sample_rate) const override {\n    const auto &meta = model_.GetMetaData();\n\n    std::vector<float> tmp;\n    auto p = samples;\n\n    if (sample_rate != meta.sample_rate) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\\n\",\n          sample_rate, meta.sample_rate);\n\n      float min_freq = std::min<int32_t>(sample_rate, meta.sample_rate);\n      float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      auto resampler = std::make_unique<LinearResample>(\n          sample_rate, meta.sample_rate, lowpass_cutoff, lowpass_filter_width);\n      resampler->Resample(samples, n, true, &tmp);\n      p = 
tmp.data();\n      n = tmp.size();\n    }\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = meta.n_fft;\n    stft_config.hop_length = meta.hop_length;\n    stft_config.win_length = meta.window_length;\n    stft_config.window_type = meta.window_type;\n    if (stft_config.window_type == \"hann_sqrt\") {\n      auto window = knf::GetWindow(\"hann\", stft_config.win_length);\n      for (auto &w : window) {\n        w = std::sqrt(w);\n      }\n      stft_config.window = std::move(window);\n    }\n\n    knf::Stft stft(stft_config);\n    knf::StftResult stft_result = stft.Compute(p, n);\n\n    auto states = model_.GetInitStates();\n    OfflineSpeechDenoiserGtcrnModel::States next_states;\n\n    knf::StftResult enhanced_stft_result;\n    enhanced_stft_result.num_frames = stft_result.num_frames;\n    for (int32_t i = 0; i < stft_result.num_frames; ++i) {\n      auto p = Process(stft_result, i, std::move(states), &next_states);\n      states = std::move(next_states);\n\n      enhanced_stft_result.real.insert(enhanced_stft_result.real.end(),\n                                       p.first.begin(), p.first.end());\n      enhanced_stft_result.imag.insert(enhanced_stft_result.imag.end(),\n                                       p.second.begin(), p.second.end());\n    }\n\n    knf::IStft istft(stft_config);\n\n    DenoisedAudio denoised_audio;\n    denoised_audio.sample_rate = meta.sample_rate;\n    denoised_audio.samples = istft.Compute(enhanced_stft_result);\n    return denoised_audio;\n  }\n\n  int32_t GetSampleRate() const override {\n    return model_.GetMetaData().sample_rate;\n  }\n\n private:\n  std::pair<std::vector<float>, std::vector<float>> Process(\n      const knf::StftResult &stft_result, int32_t frame_index,\n      OfflineSpeechDenoiserGtcrnModel::States states,\n      OfflineSpeechDenoiserGtcrnModel::States *next_states) const {\n    const auto &meta = model_.GetMetaData();\n    int32_t n_fft = meta.n_fft;\n    std::vector<float> x((n_fft / 2 + 1) * 
2);\n\n    const float *p_real =\n        stft_result.real.data() + frame_index * (n_fft / 2 + 1);\n    const float *p_imag =\n        stft_result.imag.data() + frame_index * (n_fft / 2 + 1);\n\n    for (int32_t i = 0; i < n_fft / 2 + 1; ++i) {\n      x[2 * i] = p_real[i];\n      x[2 * i + 1] = p_imag[i];\n    }\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 4> x_shape{1, n_fft / 2 + 1, 1, 2};\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value output{nullptr};\n    std::tie(output, *next_states) =\n        model_.Run(std::move(x_tensor), std::move(states));\n\n    std::vector<float> real(n_fft / 2 + 1);\n    std::vector<float> imag(n_fft / 2 + 1);\n    const auto *p = output.GetTensorData<float>();\n    for (int32_t i = 0; i < n_fft / 2 + 1; ++i) {\n      real[i] = p[2 * i];\n      imag[i] = p[2 * i + 1];\n    }\n\n    return {std::move(real), std::move(imag)};\n  }\n\n private:\n  OfflineSpeechDenoiserGtcrnModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeechDenoiserGtcrnModelConfig::Register(ParseOptions *po) {\n  po->Register(\"speech-denoiser-gtcrn-model\", &model,\n               \"Path to the gtcrn model for speech denoising\");\n}\n\nbool OfflineSpeechDenoiserGtcrnModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --speech-denoiser-gtcrn-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"gtcrn model file '%s' does not exist\", model.c_str());\n    return false;\n  }\n  return true;\n}\n\nstd::string OfflineSpeechDenoiserGtcrnModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeechDenoiserGtcrnModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeechDenoiserGtcrnModelConfig {\n  std::string model;\n  OfflineSpeechDenoiserGtcrnModelConfig() = default;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// please refer to\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/gtcrn/add_meta_data.py\nstruct OfflineSpeechDenoiserGtcrnModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t version = 1;\n  int32_t n_fft = 0;\n  int32_t hop_length = 0;\n  int32_t window_length = 0;\n  std::string window_type;\n\n  std::vector<int64_t> conv_cache_shape;\n  std::vector<int64_t> tra_cache_shape;\n  std::vector<int64_t> inter_cache_shape;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserGtcrnModel::Impl {\n public:\n  explicit Impl(const OfflineSpeechDenoiserModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.gtcrn.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineSpeechDenoiserModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.gtcrn.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  const OfflineSpeechDenoiserGtcrnModelMetaData &GetMetaData() const {\n    return meta_;\n  }\n\n  States GetInitStates() {\n    Ort::Value conv_cache = Ort::Value::CreateTensor<float>(\n        allocator_, meta_.conv_cache_shape.data(),\n        meta_.conv_cache_shape.size());\n\n    Ort::Value tra_cache = Ort::Value::CreateTensor<float>(\n        allocator_, meta_.tra_cache_shape.data(), meta_.tra_cache_shape.size());\n\n    Ort::Value inter_cache = Ort::Value::CreateTensor<float>(\n        allocator_, meta_.inter_cache_shape.data(),\n        meta_.inter_cache_shape.size());\n\n    Fill<float>(&conv_cache, 0);\n   
 Fill<float>(&tra_cache, 0);\n    Fill<float>(&inter_cache, 0);\n\n    std::vector<Ort::Value> states;\n\n    states.reserve(3);\n    states.push_back(std::move(conv_cache));\n    states.push_back(std::move(tra_cache));\n    states.push_back(std::move(inter_cache));\n\n    return states;\n  }\n\n  std::pair<Ort::Value, States> Run(Ort::Value x, States states) const {\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(1 + states.size());\n    inputs.push_back(std::move(x));\n    for (auto &s : states) {\n      inputs.push_back(std::move(s));\n    }\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<Ort::Value> next_states;\n    next_states.reserve(out.size() - 1);\n    for (int32_t k = 1; k < out.size(); ++k) {\n      next_states.push_back(std::move(out[k]));\n    }\n\n    return {std::move(out[0]), std::move(next_states)};\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---gtcrn model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      
SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"gtcrn\") {\n      SHERPA_ONNX_LOGE(\"Expect model type 'gtcrn'. Given: '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_.n_fft, \"n_fft\");\n    SHERPA_ONNX_READ_META_DATA(meta_.hop_length, \"hop_length\");\n    SHERPA_ONNX_READ_META_DATA(meta_.window_length, \"window_length\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_.window_type, \"window_type\");\n    SHERPA_ONNX_READ_META_DATA(meta_.version, \"version\");\n\n    SHERPA_ONNX_READ_META_DATA_VEC(meta_.conv_cache_shape, \"conv_cache_shape\");\n    SHERPA_ONNX_READ_META_DATA_VEC(meta_.tra_cache_shape, \"tra_cache_shape\");\n    SHERPA_ONNX_READ_META_DATA_VEC(meta_.inter_cache_shape,\n                                   \"inter_cache_shape\");\n  }\n\n private:\n  OfflineSpeechDenoiserModelConfig config_;\n  OfflineSpeechDenoiserGtcrnModelMetaData meta_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nOfflineSpeechDenoiserGtcrnModel::~OfflineSpeechDenoiserGtcrnModel() = default;\n\nOfflineSpeechDenoiserGtcrnModel::OfflineSpeechDenoiserGtcrnModel(\n    const OfflineSpeechDenoiserModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSpeechDenoiserGtcrnModel::OfflineSpeechDenoiserGtcrnModel(\n    Manager *mgr, const OfflineSpeechDenoiserModelConfig &config)\n    : 
impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineSpeechDenoiserGtcrnModel::States\nOfflineSpeechDenoiserGtcrnModel::GetInitStates() const {\n  return impl_->GetInitStates();\n}\n\nstd::pair<Ort::Value, OfflineSpeechDenoiserGtcrnModel::States>\nOfflineSpeechDenoiserGtcrnModel::Run(Ort::Value x, States states) const {\n  return impl_->Run(std::move(x), std::move(states));\n}\n\nconst OfflineSpeechDenoiserGtcrnModelMetaData &\nOfflineSpeechDenoiserGtcrnModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSpeechDenoiserGtcrnModel::OfflineSpeechDenoiserGtcrnModel(\n    AAssetManager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSpeechDenoiserGtcrnModel::OfflineSpeechDenoiserGtcrnModel(\n    NativeResourceManager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_H_\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserGtcrnModel {\n public:\n  ~OfflineSpeechDenoiserGtcrnModel();\n  explicit OfflineSpeechDenoiserGtcrnModel(\n      const OfflineSpeechDenoiserModelConfig &config);\n\n  template <typename Manager>\n  OfflineSpeechDenoiserGtcrnModel(\n      Manager *mgr, const OfflineSpeechDenoiserModelConfig &config);\n\n  using States = std::vector<Ort::Value>;\n\n  States GetInitStates() const;\n\n  std::pair<Ort::Value, States> Run(Ort::Value x, States states) const;\n\n  const OfflineSpeechDenoiserGtcrnModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-impl.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-impl.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OfflineSpeechDenoiserImpl> OfflineSpeechDenoiserImpl::Create(\n    const OfflineSpeechDenoiserConfig &config) {\n  const bool has_gtcrn = !config.model.gtcrn.model.empty();\n  const bool has_dpdfnet = !config.model.dpdfnet.model.empty();\n\n  if (has_gtcrn) {\n    return std::make_unique<OfflineSpeechDenoiserGtcrnImpl>(config);\n  } else if (has_dpdfnet) {\n    return std::make_unique<OfflineSpeechDenoiserDpdfNetImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide one speech denoising model.\");\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineSpeechDenoiserImpl> OfflineSpeechDenoiserImpl::Create(\n    Manager *mgr, const OfflineSpeechDenoiserConfig &config) {\n  const bool has_gtcrn = !config.model.gtcrn.model.empty();\n  const bool has_dpdfnet = !config.model.dpdfnet.model.empty();\n\n  if (has_gtcrn) {\n    return std::make_unique<OfflineSpeechDenoiserGtcrnImpl>(mgr, config);\n  } else if (has_dpdfnet) {\n    return std::make_unique<OfflineSpeechDenoiserDpdfNetImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide one speech denoising model.\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineSpeechDenoiserImpl>\nOfflineSpeechDenoiserImpl::Create(AAssetManager *mgr,\n                                  const OfflineSpeechDenoiserConfig &config);\n#endif\n\n#if __OHOS__\ntemplate 
std::unique_ptr<OfflineSpeechDenoiserImpl>\nOfflineSpeechDenoiserImpl::Create(NativeResourceManager *mgr,\n                                  const OfflineSpeechDenoiserConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-impl.h",
    "content": "// sherpa-onnx/csrc/offline-speaker-speech-denoiser-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_IMPL_H_\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSpeechDenoiserImpl {\n public:\n  virtual ~OfflineSpeechDenoiserImpl() = default;\n\n  static std::unique_ptr<OfflineSpeechDenoiserImpl> Create(\n      const OfflineSpeechDenoiserConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineSpeechDenoiserImpl> Create(\n      Manager *mgr, const OfflineSpeechDenoiserConfig &config);\n\n  virtual DenoisedAudio Run(const float *samples, int32_t n,\n                            int32_t sample_rate) const = 0;\n\n  virtual int32_t GetSampleRate() const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeechDenoiserModelConfig::Register(ParseOptions *po) {\n  gtcrn.Register(po);\n  dpdfnet.Register(po);\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OfflineSpeechDenoiserModelConfig::Validate() const {\n  if (gtcrn.model.empty() && dpdfnet.model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide a speech denoising model.\");\n    return false;\n  }\n\n  if (!gtcrn.model.empty()) {\n    return gtcrn.Validate();\n  }\n\n  return dpdfnet.Validate();\n}\n\nstd::string OfflineSpeechDenoiserModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeechDenoiserModelConfig(\";\n  os << \"gtcrn=\" << gtcrn.ToString() << \", \";\n  os << \"dpdfnet=\" << dpdfnet.ToString() << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineSpeechDenoiserModelConfig {\n  OfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n  OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OfflineSpeechDenoiserModelConfig() = default;\n\n  OfflineSpeechDenoiserModelConfig(\n      const OfflineSpeechDenoiserGtcrnModelConfig &gtcrn,\n      const OfflineSpeechDenoiserDpdfNetModelConfig &dpdfnet,\n      int32_t num_threads, bool debug, const std::string &provider)\n      : gtcrn(gtcrn),\n        dpdfnet(dpdfnet),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser.cc",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineSpeechDenoiserConfig::Register(ParseOptions *po) {\n  model.Register(po);\n}\n\nbool OfflineSpeechDenoiserConfig::Validate() const { return model.Validate(); }\n\nstd::string OfflineSpeechDenoiserConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineSpeechDenoiserConfig(\";\n  os << \"model=\" << model.ToString() << \")\";\n  return os.str();\n}\n\ntemplate <typename Manager>\nOfflineSpeechDenoiser::OfflineSpeechDenoiser(\n    Manager *mgr, const OfflineSpeechDenoiserConfig &config)\n    : impl_(OfflineSpeechDenoiserImpl::Create(mgr, config)) {}\n\nOfflineSpeechDenoiser::OfflineSpeechDenoiser(\n    const OfflineSpeechDenoiserConfig &config)\n    : impl_(OfflineSpeechDenoiserImpl::Create(config)) {}\n\nOfflineSpeechDenoiser::~OfflineSpeechDenoiser() = default;\n\nDenoisedAudio OfflineSpeechDenoiser::Run(const float *samples, int32_t n,\n                                         int32_t sample_rate) const {\n  return impl_->Run(samples, n, sample_rate);\n}\n\nint32_t OfflineSpeechDenoiser::GetSampleRate() const {\n  return impl_->GetSampleRate();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSpeechDenoiser::OfflineSpeechDenoiser(\n    AAssetManager *mgr, const OfflineSpeechDenoiserConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSpeechDenoiser::OfflineSpeechDenoiser(\n    NativeResourceManager *mgr, const OfflineSpeechDenoiserConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-speech-denoiser.h",
    "content": "// sherpa-onnx/csrc/offline-speech-denoiser.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct DenoisedAudio {\n  std::vector<float> samples;\n  int32_t sample_rate;\n};\n\nstruct OfflineSpeechDenoiserConfig {\n  OfflineSpeechDenoiserModelConfig model;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OfflineSpeechDenoiserImpl;\n\nclass OfflineSpeechDenoiser {\n public:\n  explicit OfflineSpeechDenoiser(const OfflineSpeechDenoiserConfig &config);\n  ~OfflineSpeechDenoiser();\n\n  template <typename Manager>\n  OfflineSpeechDenoiser(Manager *mgr,\n                        const OfflineSpeechDenoiserConfig &config);\n\n  /*\n   * @param samples 1-D array of audio samples. Each sample is in the\n   *                range [-1, 1].\n   * @param n Number of samples\n   * @param sample_rate Sample rate of the input samples\n   *\n   */\n  DenoisedAudio Run(const float *samples, int32_t n, int32_t sample_rate) const;\n\n  /*\n   * Return the sample rate of the denoised audio\n   */\n  int32_t GetSampleRate() const;\n\n private:\n  std::unique_ptr<OfflineSpeechDenoiserImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-stream.cc",
    "content": "// sherpa-onnx/csrc/offline-stream.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <iomanip>\n#include <limits>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Core\"\n#include \"kaldi-native-fbank/csrc/online-feature.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineStream::Impl {\n public:\n  explicit Impl(const FeatureExtractorConfig &config,\n                ContextGraphPtr context_graph)\n      : config_(config), context_graph_(std::move(context_graph)) {\n    if (config.is_mfcc) {\n      mfcc_opts_.frame_opts.dither = config_.dither;\n      mfcc_opts_.frame_opts.snip_edges = config_.snip_edges;\n      mfcc_opts_.frame_opts.samp_freq = config_.sampling_rate;\n      mfcc_opts_.frame_opts.frame_shift_ms = config_.frame_shift_ms;\n      mfcc_opts_.frame_opts.frame_length_ms = config_.frame_length_ms;\n      mfcc_opts_.frame_opts.remove_dc_offset = config_.remove_dc_offset;\n      mfcc_opts_.frame_opts.window_type = config_.window_type;\n\n      mfcc_opts_.mel_opts.num_bins = config_.feature_dim;\n\n      mfcc_opts_.mel_opts.high_freq = config_.high_freq;\n      mfcc_opts_.mel_opts.low_freq = config_.low_freq;\n\n      mfcc_opts_.mel_opts.is_librosa = config_.is_librosa;\n\n      mfcc_opts_.num_ceps = config_.num_ceps;\n      mfcc_opts_.use_energy = config_.use_energy;\n\n      mfcc_ = std::make_unique<knf::OnlineMfcc>(mfcc_opts_);\n    } else {\n      opts_.frame_opts.dither = config.dither;\n      opts_.frame_opts.snip_edges = config.snip_edges;\n      opts_.frame_opts.samp_freq = config.sampling_rate;\n      opts_.frame_opts.frame_shift_ms 
= config.frame_shift_ms;\n      opts_.frame_opts.frame_length_ms = config.frame_length_ms;\n      opts_.frame_opts.remove_dc_offset = config.remove_dc_offset;\n      opts_.frame_opts.window_type = config.window_type;\n\n      opts_.mel_opts.num_bins = config.feature_dim;\n\n      opts_.mel_opts.high_freq = config.high_freq;\n      opts_.mel_opts.low_freq = config.low_freq;\n\n      opts_.mel_opts.is_librosa = config.is_librosa;\n\n      fbank_ = std::make_unique<knf::OnlineFbank>(opts_);\n    }\n  }\n\n  explicit Impl(WhisperTag tag) {\n    config_.normalize_samples = true;\n    opts_.frame_opts.samp_freq = 16000;\n    opts_.mel_opts.num_bins = tag.dim;\n\n    knf::WhisperFeatureOptions whisper_opts;\n    whisper_opts.frame_opts = opts_.frame_opts;\n    whisper_opts.dim = tag.dim;\n\n    whisper_fbank_ = std::make_unique<knf::OnlineWhisperFbank>(whisper_opts);\n    config_.sampling_rate = opts_.frame_opts.samp_freq;\n  }\n\n  explicit Impl(CEDTag /*tag*/) : is_ced_(true) {\n    // see\n    // https://github.com/RicherMans/CED/blob/main/onnx_inference_with_kaldi.py\n\n    opts_.frame_opts.frame_length_ms = 32;\n    opts_.frame_opts.dither = 0;\n    opts_.frame_opts.preemph_coeff = 0;\n    opts_.frame_opts.remove_dc_offset = false;\n    opts_.frame_opts.window_type = \"hann\";\n    opts_.frame_opts.snip_edges = false;\n\n    opts_.frame_opts.samp_freq = 16000;  // fixed to 16000\n    opts_.mel_opts.num_bins = 64;\n    opts_.mel_opts.low_freq = 0;\n    opts_.mel_opts.high_freq = 8000;\n    opts_.use_log_fbank = false;\n\n    config_.sampling_rate = opts_.frame_opts.samp_freq;\n\n    fbank_ = std::make_unique<knf::OnlineFbank>(opts_);\n  }\n\n  explicit Impl(MoonshineTag /*tag*/) : is_moonshine_(true) {\n    config_.sampling_rate = 16000;\n  }\n\n  explicit Impl(OmnilingualAsrTag /*tag*/) : is_omnilingual_asr_(true) {\n    config_.sampling_rate = 16000;\n  }\n\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform, int32_t n) {\n    if 
(config_.normalize_samples) {\n      AcceptWaveformImpl(sampling_rate, waveform, n);\n    } else {\n      std::vector<float> buf(n);\n      for (int32_t i = 0; i != n; ++i) {\n        buf[i] = waveform[i] * 32768;\n      }\n      AcceptWaveformImpl(sampling_rate, buf.data(), n);\n    }\n  }\n\n  void AcceptWaveformImpl(int32_t sampling_rate, const float *waveform,\n                          int32_t n) {\n    if (sampling_rate != config_.sampling_rate) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\\n\",\n          sampling_rate, static_cast<int32_t>(config_.sampling_rate));\n\n      float min_freq = std::min<int32_t>(sampling_rate, config_.sampling_rate);\n      float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      auto resampler = std::make_unique<LinearResample>(\n          sampling_rate, config_.sampling_rate, lowpass_cutoff,\n          lowpass_filter_width);\n      std::vector<float> samples;\n      resampler->Resample(waveform, n, true, &samples);\n\n      if (is_moonshine_ || is_omnilingual_asr_) {\n        samples_.insert(samples_.end(), samples.begin(), samples.end());\n      } else if (fbank_) {\n        fbank_->AcceptWaveform(config_.sampling_rate, samples.data(),\n                               samples.size());\n        fbank_->InputFinished();\n      } else if (mfcc_) {\n        mfcc_->AcceptWaveform(config_.sampling_rate, samples.data(),\n                              samples.size());\n        mfcc_->InputFinished();\n      } else {\n        whisper_fbank_->AcceptWaveform(config_.sampling_rate, samples.data(),\n                                       samples.size());\n        whisper_fbank_->InputFinished();\n      }\n\n      return;\n    }  // if (sampling_rate != config_.sampling_rate)\n\n    if (is_moonshine_ || is_omnilingual_asr_) {\n      samples_.insert(samples_.end(), waveform, waveform + n);\n    } else if 
(fbank_) {\n      fbank_->AcceptWaveform(sampling_rate, waveform, n);\n      fbank_->InputFinished();\n    } else if (mfcc_) {\n      mfcc_->AcceptWaveform(sampling_rate, waveform, n);\n      mfcc_->InputFinished();\n    } else {\n      whisper_fbank_->AcceptWaveform(sampling_rate, waveform, n);\n      whisper_fbank_->InputFinished();\n    }\n  }\n\n  int32_t FeatureDim() const {\n    if (is_moonshine_ || is_omnilingual_asr_) {\n      return samples_.size();\n    }\n\n    return mfcc_ ? mfcc_opts_.num_ceps : opts_.mel_opts.num_bins;\n  }\n\n  std::vector<float> GetFrames() const {\n    if (is_moonshine_ || is_omnilingual_asr_) {\n      return samples_;\n    }\n\n    int32_t n = fbank_  ? fbank_->NumFramesReady()\n                : mfcc_ ? mfcc_->NumFramesReady()\n                        : whisper_fbank_->NumFramesReady();\n    assert(n > 0 && \"Please first call AcceptWaveform()\");\n\n    int32_t feature_dim = FeatureDim();\n\n    std::vector<float> features(n * feature_dim);\n\n    float *p = features.data();\n\n    for (int32_t i = 0; i != n; ++i) {\n      const float *f = fbank_  ? fbank_->GetFrame(i)\n                       : mfcc_ ? 
mfcc_->GetFrame(i)\n                               : whisper_fbank_->GetFrame(i);\n      std::copy(f, f + feature_dim, p);\n      p += feature_dim;\n    }\n\n    NemoNormalizeFeatures(features.data(), n, feature_dim);\n\n    if (is_ced_) {\n      AmplitudeToDB(features.data(), features.size());\n    }\n\n    return features;\n  }\n\n  void SetResult(const OfflineRecognitionResult &r) { r_ = r; }\n\n  const OfflineRecognitionResult &GetResult() const { return r_; }\n\n  const ContextGraphPtr &GetContextGraph() const { return context_graph_; }\n\n  void SetOption(const std::string &key, const std::string &value) {\n    options_[key] = value;\n  }\n\n  bool HasOption(const std::string &key) const {\n    return options_.count(key) != 0;\n  }\n\n  const std::string &GetOption(const std::string &key) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return it->second;\n    }\n    static const std::string kEmpty;\n    return kEmpty;\n  }\n\n  int32_t GetOptionInt(const std::string &key, int32_t default_value) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return ToIntOrDefault(it->second, default_value);\n    }\n    return default_value;\n  }\n\n  float GetOptionFloat(const std::string &key, float default_value) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return ToFloatOrDefault(it->second, default_value);\n    }\n    return default_value;\n  }\n\n private:\n  // see\n  // https://github.com/pytorch/audio/blob/main/src/torchaudio/functional/functional.py#L359\n  void AmplitudeToDB(float *p, int32_t n) const {\n    float multiplier = 10;\n    float top_db = 120;\n    float amin = 1e-10;\n\n    // Use lowest(), not min(): min() is the smallest positive float, but the\n    // dB values computed below can be negative.\n    float max_x = std::numeric_limits<float>::lowest();\n\n    for (int32_t i = 0; i != n; ++i) {\n      float x = p[i];\n      x = (x > amin) ? x : amin;\n      x = log10f(x) * multiplier;\n\n      max_x = (x > max_x) ? 
x : max_x;\n      p[i] = x;\n    }\n\n    float d = max_x - top_db;\n    for (int32_t i = 0; i != n; ++i) {\n      float x = p[i];\n      x = (x > d) ? x : d;\n      p[i] = x;\n    }\n  }\n\n  void NemoNormalizeFeatures(float *p, int32_t num_frames,\n                             int32_t feature_dim) const {\n    if (config_.nemo_normalize_type.empty()) {\n      return;\n    }\n\n    if (config_.nemo_normalize_type != \"per_feature\") {\n      SHERPA_ONNX_LOGE(\n          \"Only normalize_type=per_feature is implemented. Given: %s\",\n          config_.nemo_normalize_type.c_str());\n      exit(-1);\n    }\n\n    NemoNormalizePerFeature(p, num_frames, feature_dim);\n  }\n\n  static void NemoNormalizePerFeature(float *p, int32_t num_frames,\n                                      int32_t feature_dim) {\n    using RowMajorMat =\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;\n\n    Eigen::Map<RowMajorMat> x(p, num_frames, feature_dim);\n\n    Eigen::RowVectorXf mean = x.colwise().mean();\n    Eigen::RowVectorXf var =\n        (x.array().square().colwise().mean() - mean.array().square())\n            .max(0.0f);  // avoid negative due to FP error\n\n    Eigen::RowVectorXf inv_std = (var.array().sqrt() + 1e-5f).inverse();\n\n    x.array() =\n        (x.array().rowwise() - mean.array()).rowwise() * inv_std.array();\n  }\n\n private:\n  FeatureExtractorConfig config_;\n  std::unique_ptr<knf::OnlineFbank> fbank_;\n  std::unique_ptr<knf::OnlineMfcc> mfcc_;\n  std::unique_ptr<knf::OnlineWhisperFbank> whisper_fbank_;\n  knf::FbankOptions opts_;\n  knf::MfccOptions mfcc_opts_;\n  OfflineRecognitionResult r_;\n  ContextGraphPtr context_graph_;\n  bool is_ced_ = false;\n  bool is_moonshine_ = false;\n  bool is_omnilingual_asr_ = false;\n\n  // used only when (is_moonshine_ || is_omnilingual_asr_) == true\n  std::vector<float> samples_;\n\n  std::unordered_map<std::string, std::string> options_;\n};\n\nOfflineStream::OfflineStream(const 
FeatureExtractorConfig &config /*= {}*/,\n                             ContextGraphPtr context_graph /*= nullptr*/)\n    : impl_(std::make_unique<Impl>(config, std::move(context_graph))) {}\n\nOfflineStream::OfflineStream(WhisperTag tag)\n    : impl_(std::make_unique<Impl>(tag)) {}\n\nOfflineStream::OfflineStream(CEDTag tag) : impl_(std::make_unique<Impl>(tag)) {}\n\nOfflineStream::OfflineStream(MoonshineTag tag)\n    : impl_(std::make_unique<Impl>(tag)) {}\n\nOfflineStream::OfflineStream(OmnilingualAsrTag tag)\n    : impl_(std::make_unique<Impl>(tag)) {}\n\nOfflineStream::~OfflineStream() = default;\n\nvoid OfflineStream::AcceptWaveform(int32_t sampling_rate, const float *waveform,\n                                   int32_t n) const {\n  impl_->AcceptWaveform(sampling_rate, waveform, n);\n}\n\nint32_t OfflineStream::FeatureDim() const { return impl_->FeatureDim(); }\n\nstd::vector<float> OfflineStream::GetFrames() const {\n  return impl_->GetFrames();\n}\n\nvoid OfflineStream::SetResult(const OfflineRecognitionResult &r) {\n  impl_->SetResult(r);\n}\n\nconst ContextGraphPtr &OfflineStream::GetContextGraph() const {\n  return impl_->GetContextGraph();\n}\n\nconst OfflineRecognitionResult &OfflineStream::GetResult() const {\n  return impl_->GetResult();\n}\n\nvoid OfflineStream::SetOption(const std::string &key,\n                              const std::string &value) {\n  impl_->SetOption(key, value);\n}\n\nbool OfflineStream::HasOption(const std::string &key) const {\n  return impl_->HasOption(key);\n}\n\nconst std::string &OfflineStream::GetOption(const std::string &key) const {\n  return impl_->GetOption(key);\n}\n\nint32_t OfflineStream::GetOptionInt(const std::string &key,\n                                    int32_t default_value) const {\n  return impl_->GetOptionInt(key, default_value);\n}\n\nfloat OfflineStream::GetOptionFloat(const std::string &key,\n                                    float default_value) const {\n  return impl_->GetOptionFloat(key, 
default_value);\n}\n\nstd::string OfflineRecognitionResult::AsJsonString() const {\n  std::ostringstream os;\n  os << \"{\";\n\n  os << \"\\\"lang\\\"\"\n     << \": \";\n  os << std::quoted(lang) << \", \";\n\n  os << \"\\\"emotion\\\"\"\n     << \": \";\n  os << std::quoted(emotion) << \", \";\n\n  os << \"\\\"event\\\"\"\n     << \": \";\n  os << std::quoted(event) << \", \";\n\n  os << \"\\\"text\\\"\"\n     << \": \";\n  os << std::quoted(text) << \", \";\n\n  os << \"\\\"\"\n     << \"timestamps\"\n     << \"\\\"\"\n     << \": \";\n  os << \"[\";\n\n  std::string sep = \"\";\n  for (auto t : timestamps) {\n    os << sep << std::fixed << std::setprecision(2) << t;\n    sep = \", \";\n  }\n  os << \"], \";\n\n  os << \"\\\"\"\n     << \"durations\"\n     << \"\\\"\"\n     << \": \";\n  os << \"[\";\n  sep = \"\";\n  for (auto d : durations) {\n    os << sep << std::fixed << std::setprecision(2) << d;\n    sep = \", \";\n  }\n  os << \"], \";\n\n  os << \"\\\"\"\n     << \"tokens\"\n     << \"\\\"\"\n     << \":\";\n  os << \"[\";\n\n  sep = \"\";\n  auto oldFlags = os.flags();\n  for (const auto &t : tokens) {\n    if (t.size() == 1 && static_cast<uint8_t>(t[0]) > 0x7f) {\n      const uint8_t *p = reinterpret_cast<const uint8_t *>(t.c_str());\n      os << sep << \"\\\"\"\n         << \"<0x\" << std::hex << std::uppercase << static_cast<uint32_t>(p[0])\n         << \">\"\n         << \"\\\"\";\n      os.flags(oldFlags);\n    } else {\n      os << sep << std::quoted(t);\n    }\n    sep = \", \";\n  }\n  os << \"], \";\n\n  os << \"\\\"\"\n     << \"ys_log_probs\"\n     << \"\\\"\"\n     << \": \";\n  os << \"[\";\n  sep = \"\";\n  for (auto p : ys_log_probs) {\n    os << sep << std::fixed << std::setprecision(6) << p;\n    sep = \", \";\n  }\n  os << \"], \";\n\n  sep = \"\";\n\n  os << \"\\\"\"\n     << \"words\"\n     << \"\\\"\"\n     << \": \";\n  os << \"[\";\n  for (int32_t w : words) {\n    os << sep << w;\n    sep = \", \";\n  }\n  os << \"]\";\n\n  // 
Add segment-level data if present (from Whisper timestamp token mode)\n  if (!segment_timestamps.empty()) {\n    os << \", \";\n\n    os << \"\\\"segment_timestamps\\\": [\";\n    sep = \"\";\n    for (auto t : segment_timestamps) {\n      os << sep << std::fixed << std::setprecision(2) << t;\n      sep = \", \";\n    }\n    os << \"], \";\n\n    os << \"\\\"segment_durations\\\": [\";\n    sep = \"\";\n    for (auto d : segment_durations) {\n      os << sep << std::fixed << std::setprecision(2) << d;\n      sep = \", \";\n    }\n    os << \"], \";\n\n    os << \"\\\"segment_texts\\\": [\";\n    sep = \"\";\n    for (const auto &t : segment_texts) {\n      os << sep << std::quoted(t);\n      sep = \", \";\n    }\n    os << \"]\";\n  }\n\n  os << \"}\";\n\n  return os.str();\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-stream.h",
    "content": "// sherpa-onnx/csrc/offline-stream.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_STREAM_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_STREAM_H_\n#include <stdint.h>\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/context-graph.h\"\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineRecognitionResult {\n  // Recognition results.\n  // For English, it consists of space separated words.\n  // For Chinese, it consists of Chinese words without spaces.\n  std::string text;\n\n  // Decoded results at the token level.\n  // For instance, for BPE-based models it consists of a list of BPE tokens.\n  std::vector<std::string> tokens;\n\n  std::string lang;\n\n  // emotion target of the audio.\n  std::string emotion;\n\n  // event target of the audio.\n  std::string event;\n\n  /// timestamps.size() == tokens.size()\n  /// timestamps[i] records the time in seconds when tokens[i] is decoded.\n  std::vector<float> timestamps;\n\n  /// durations[i] contains the duration (in seconds) for tokens[i] (TDT models\n  /// only)\n  std::vector<float> durations;\n\n  /// ys_log_probs[i] contains the log probability (confidence) for tokens[i].\n  std::vector<float> ys_log_probs;\n\n  // Word IDs from FST decoding (CTC models with FST decoder only).\n  std::vector<int32_t> words;\n\n  // Segment-level data (from Whisper with segment timestamps enabled).\n  // These are parallel vectors: segment_timestamps.size() ==\n  // segment_durations.size() == segment_texts.size()\n  std::vector<float> segment_timestamps;   // start time of each segment\n  std::vector<float> segment_durations;    // duration of each segment\n  std::vector<std::string> segment_texts;  // text of each segment\n\n  std::string AsJsonString() const;\n};\n\nstruct WhisperTag {\n  int32_t dim = 80;\n};\n\nstruct CEDTag {};\n\n// It uses a neural network 
model, a preprocessor, to convert\n// audio samples to features\nstruct MoonshineTag {};\n\n// It is based on Wav2Vec, accepting raw audio samples as input\nstruct OmnilingualAsrTag {};\n\nclass OfflineStream {\n public:\n  explicit OfflineStream(const FeatureExtractorConfig &config = {},\n                         ContextGraphPtr context_graph = {});\n\n  explicit OfflineStream(WhisperTag tag);\n  explicit OfflineStream(CEDTag tag);\n  explicit OfflineStream(MoonshineTag tag);\n  explicit OfflineStream(OmnilingualAsrTag tag);\n  ~OfflineStream();\n\n  /**\n     @param sampling_rate The sampling rate of the input waveform. If it\n                          differs from config.sampling_rate, the waveform is\n                          resampled internally.\n     @param waveform Pointer to a 1-D array of size n. It must be normalized to\n                     the range [-1, 1].\n     @param n Number of entries in waveform\n\n     Caution: This function can be invoked only once, so pass all the\n              samples in a single call.\n   */\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform,\n                      int32_t n) const;\n\n  /// Return feature dim of this extractor.\n  ///\n  /// Note: for Moonshine and Omnilingual ASR streams, it returns the number\n  /// of audio samples received so far.\n  int32_t FeatureDim() const;\n\n  // Get all the feature frames of this stream in a 1-D array, which is\n  // flattened from a 2-D array of shape (num_frames, feat_dim).\n  // For Moonshine and Omnilingual ASR streams, it returns the raw audio\n  // samples instead.\n  std::vector<float> GetFrames() const;\n\n  /** Set the recognition result for this stream. 
*/\n  void SetResult(const OfflineRecognitionResult &r);\n\n  /** Get the recognition result of this stream */\n  const OfflineRecognitionResult &GetResult() const;\n\n  /** Get the ContextGraph of this stream */\n  const ContextGraphPtr &GetContextGraph() const;\n\n  // Generic per-stream option mechanism (key-value string pairs).\n  void SetOption(const std::string &key, const std::string &value);\n  bool HasOption(const std::string &key) const;\n\n  // Returns the value for the given key, or an empty string if the key\n  // does not exist. No exception is thrown for missing keys.\n  const std::string &GetOption(const std::string &key) const;\n  int32_t GetOptionInt(const std::string &key,\n                       int32_t default_value = 0) const;\n  float GetOptionFloat(const std::string &key,\n                       float default_value = 0.0f) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_STREAM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tdnn-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tdnn-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tdnn-ctc-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTdnnCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.tdnn.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.tdnn.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features) {\n    auto nnet_out =\n        sess_->Run({}, input_names_ptr_.data(), &features, 1,\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<int64_t> nnet_out_shape =\n        nnet_out[0].GetTensorTypeAndShapeInfo().GetShape();\n\n    std::vector<int64_t> out_length_vec(nnet_out_shape[0], nnet_out_shape[1]);\n    std::vector<int64_t> out_length_shape(1, nnet_out_shape[0]);\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    Ort::Value nnet_out_length = Ort::Value::CreateTensor(\n        
memory_info, out_length_vec.data(), out_length_vec.size(),\n        out_length_shape.data(), out_length_shape.size());\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(2);\n    ans.push_back(std::move(nnet_out[0]));\n    ans.push_back(Clone(Allocator(), &nnet_out_length));\n    return ans;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n};\n\nOfflineTdnnCtcModel::OfflineTdnnCtcModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTdnnCtcModel::OfflineTdnnCtcModel(Manager *mgr,\n                                         const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) 
{}\n\nOfflineTdnnCtcModel::~OfflineTdnnCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineTdnnCtcModel::Forward(\n    Ort::Value features, Ort::Value /*features_length*/) {\n  return impl_->Forward(std::move(features));\n}\n\nint32_t OfflineTdnnCtcModel::VocabSize() const { return impl_->VocabSize(); }\n\nOrtAllocator *OfflineTdnnCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTdnnCtcModel::OfflineTdnnCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTdnnCtcModel::OfflineTdnnCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tdnn-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-tdnn-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TDNN_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TDNN_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the tdnn model of the yesno recipe from icefall.\n *\n * See\n * https://github.com/k2-fsa/icefall/tree/master/egs/yesno/ASR/tdnn\n */\nclass OfflineTdnnCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineTdnnCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineTdnnCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineTdnnCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t. This model ignores it.\n   *\n   * @return A vector with two elements:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length: A 1-D tensor of shape (N,). Its dtype is int64_t.\n   *    Every entry is set to T'.\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value /*features_length*/) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TDNN_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tdnn-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tdnn-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tdnn-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTdnnModelConfig::Register(ParseOptions *po) {\n  po->Register(\"tdnn-model\", &model, \"Path to onnx model\");\n}\n\nbool OfflineTdnnModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"tdnn model file %s does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineTdnnModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTdnnModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tdnn-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tdnn-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\n// for https://github.com/k2-fsa/icefall/tree/master/egs/yesno/ASR/tdnn\nstruct OfflineTdnnModelConfig {\n  std::string model;\n\n  OfflineTdnnModelConfig() = default;\n  explicit OfflineTdnnModelConfig(const std::string &model) : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-telespeech-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-telespeech-ctc-model.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-telespeech-ctc-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTeleSpeechCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.telespeech_ctc);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.telespeech_ctc);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value /*features_length*/) {\n    std::vector<int64_t> shape =\n        features.GetTensorTypeAndShapeInfo().GetShape();\n\n    if (static_cast<int32_t>(shape[0]) != 1) {\n      SHERPA_ONNX_LOGE(\"This model supports only batch size 1. 
Given %d\",\n                       static_cast<int32_t>(shape[0]));\n    }\n\n    auto out = sess_->Run({}, input_names_ptr_.data(), &features, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<int64_t> logits_shape = {1};\n    Ort::Value logits_length = Ort::Value::CreateTensor<int64_t>(\n        allocator_, logits_shape.data(), logits_shape.size());\n\n    int64_t *dst = logits_length.GetTensorMutableData<int64_t>();\n    dst[0] = out[0].GetTensorTypeAndShapeInfo().GetShape()[0];\n\n    // (T, B, C) -> (B, T, C)\n    Ort::Value logits = Transpose01(allocator_, &out[0]);\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(2);\n    ans.push_back(std::move(logits));\n    ans.push_back(std::move(logits_length));\n\n    return ans;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    {\n      auto shape =\n          sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();\n      vocab_size_ = shape[2];\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> 
sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 4;\n};\n\nOfflineTeleSpeechCtcModel::OfflineTeleSpeechCtcModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTeleSpeechCtcModel::OfflineTeleSpeechCtcModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTeleSpeechCtcModel::~OfflineTeleSpeechCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineTeleSpeechCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineTeleSpeechCtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\nint32_t OfflineTeleSpeechCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineTeleSpeechCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTeleSpeechCtcModel::OfflineTeleSpeechCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTeleSpeechCtcModel::OfflineTeleSpeechCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-telespeech-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-telespeech-ctc-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TELESPEECH_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TELESPEECH_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the CTC model from\n * https://github.com/Tele-AI/TeleSpeech-ASR.\n *\n * See\n * https://github.com/lovemefan/telespeech-asr-python/blob/main/telespeechasr/onnx/onnx_infer.py\n * and\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/tele-speech/test.py\n */\nclass OfflineTeleSpeechCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineTeleSpeechCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineTeleSpeechCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineTeleSpeechCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** SubsamplingFactor of the model\n   */\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // TeleSpeech CTC models do not support batch size > 1\n  bool SupportBatchProcessing() const override { return false; }\n\n  std::string FeatureNormalizationMethod() const override {\n    return \"per_feature\";\n  }\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TELESPEECH_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTransducerDecoderResult {\n  /// The decoded token IDs\n  std::vector<int64_t> tokens;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  /// Note: The index is after subsampling\n  std::vector<int32_t> timestamps;\n\n  /// durations[i] contains the duration for tokens[i] in output frames\n  /// (post-subsampling). It is converted to seconds by higher layers\n  /// (e.g., Convert() in offline-recognizer-transducer-impl.h).\n  std::vector<float> durations;\n\n  /// ys_log_probs[i] contains the log probability (confidence) for tokens[i].\n  std::vector<float> ys_log_probs;\n};\n\nclass OfflineTransducerDecoder {\n public:\n  virtual ~OfflineTransducerDecoder() = default;\n\n  /** Run transducer beam search given the output from the encoder model.\n   *\n   * @param encoder_out A 3-D tensor of shape (N, T, joiner_dim)\n   * @param encoder_out_length A 1-D tensor of shape (N,) containing number\n   *                           of valid frames in encoder_out before padding.\n   *\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineTransducerDecoderResult> Decode(\n      Ort::Value encoder_out, Ort::Value encoder_out_length,\n      OfflineStream **ss = nullptr, int32_t n = 0) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <iterator>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/packed-sequence.h\"\n#include \"sherpa-onnx/csrc/slice.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineTransducerDecoderResult>\nOfflineTransducerGreedySearchDecoder::Decode(Ort::Value encoder_out,\n                                             Ort::Value encoder_out_length,\n                                             OfflineStream **ss /*= nullptr*/,\n                                             int32_t n /*= 0*/) {\n  PackedSequence packed_encoder_out = PackPaddedSequence(\n      model_->Allocator(), &encoder_out, &encoder_out_length);\n\n  int32_t batch_size =\n      static_cast<int32_t>(packed_encoder_out.sorted_indexes.size());\n\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n\n  std::vector<OfflineTransducerDecoderResult> ans(batch_size);\n  for (auto &r : ans) {\n    r.tokens.resize(context_size, -1);\n    // 0 is the ID of the blank token\n    r.tokens.back() = 0;\n  }\n\n  auto decoder_input = model_->BuildDecoderInput(ans, ans.size());\n  Ort::Value decoder_out = model_->RunDecoder(std::move(decoder_input));\n\n  int32_t start = 0;\n  int32_t t = 0;\n  for (auto n : packed_encoder_out.batch_sizes) {\n    Ort::Value cur_encoder_out = packed_encoder_out.Get(start, n);\n    Ort::Value cur_decoder_out = Slice(model_->Allocator(), &decoder_out, 0, n);\n    start += n;\n    Ort::Value logit = model_->RunJoiner(std::move(cur_encoder_out),\n                                         std::move(cur_decoder_out));\n    float *p_logit = logit.GetTensorMutableData<float>();\n    bool emitted = false;\n    for (int32_t i = 
0; i != n; ++i) {\n      if (blank_penalty_ > 0.0) {\n        p_logit[0] -= blank_penalty_;  // assuming blank id is 0\n      }\n\n      LogSoftmax(p_logit, vocab_size);\n\n      auto y = static_cast<int32_t>(std::distance(\n          p_logit, std::max_element(p_logit, p_logit + vocab_size)));\n\n      float log_prob = p_logit[y];\n\n      p_logit += vocab_size;\n      // blank id is hardcoded to 0\n      // also, it treats unk as blank\n      if (y != 0 && y != unk_id_) {\n        ans[i].tokens.push_back(y);\n        ans[i].timestamps.push_back(t);\n        ans[i].ys_log_probs.push_back(log_prob);\n        emitted = true;\n      }\n    }\n    if (emitted) {\n      Ort::Value decoder_input = model_->BuildDecoderInput(ans, n);\n      decoder_out = model_->RunDecoder(std::move(decoder_input));\n    }\n    ++t;\n  }\n\n  for (auto &r : ans) {\n    r.tokens = {r.tokens.begin() + context_size, r.tokens.end()};\n  }\n\n  std::vector<OfflineTransducerDecoderResult> unsorted_ans(batch_size);\n  for (int32_t i = 0; i != batch_size; ++i) {\n    unsorted_ans[packed_encoder_out.sorted_indexes[i]] = std::move(ans[i]);\n  }\n\n  return unsorted_ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTransducerGreedySearchDecoder : public OfflineTransducerDecoder {\n public:\n  OfflineTransducerGreedySearchDecoder(OfflineTransducerModel *model,\n                                       int32_t unk_id,\n                                       float blank_penalty)\n      : model_(model), unk_id_(unk_id), blank_penalty_(blank_penalty) {}\n\n  std::vector<OfflineTransducerDecoderResult> Decode(\n      Ort::Value encoder_out, Ort::Value encoder_out_length,\n      OfflineStream **ss = nullptr, int32_t n = 0) override;\n\n private:\n  OfflineTransducerModel *model_;  // Not owned\n  int32_t unk_id_;\n  float blank_penalty_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.h\"\n\n#include <algorithm>\n#include <iterator>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic std::pair<Ort::Value, Ort::Value> BuildDecoderInput(\n    int32_t token, OrtAllocator *allocator) {\n  std::array<int64_t, 2> shape{1, 1};\n\n  Ort::Value decoder_input =\n      Ort::Value::CreateTensor<int32_t>(allocator, shape.data(), shape.size());\n\n  std::array<int64_t, 1> length_shape{1};\n  Ort::Value decoder_input_length = Ort::Value::CreateTensor<int32_t>(\n      allocator, length_shape.data(), length_shape.size());\n\n  int32_t *p = decoder_input.GetTensorMutableData<int32_t>();\n\n  int32_t *p_length = decoder_input_length.GetTensorMutableData<int32_t>();\n\n  p[0] = token;\n\n  p_length[0] = 1;\n\n  return {std::move(decoder_input), std::move(decoder_input_length)};\n}\n\nstatic OfflineTransducerDecoderResult DecodeOne(\n    const float *p, int32_t num_rows, int32_t num_cols,\n    OfflineTransducerNeMoModel *model, float blank_penalty) {\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  OfflineTransducerDecoderResult ans;\n\n  int32_t vocab_size = model->VocabSize();\n  int32_t blank_id = vocab_size - 1;\n  int32_t max_symbols_per_frame = 10;\n\n  auto decoder_input_pair = BuildDecoderInput(blank_id, model->Allocator());\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> decoder_output_pair =\n      model->RunDecoder(std::move(decoder_input_pair.first),\n                        std::move(decoder_input_pair.second),\n                        model->GetDecoderInitStates(1));\n\n  std::array<int64_t, 3> encoder_shape{1, num_cols, 1};\n\n  for (int32_t t = 
0; t != num_rows; ++t) {\n    Ort::Value cur_encoder_out = Ort::Value::CreateTensor(\n        memory_info, const_cast<float *>(p) + t * num_cols, num_cols,\n        encoder_shape.data(), encoder_shape.size());\n\n    for (int32_t q = 0; q != max_symbols_per_frame; ++q) {\n      Ort::Value logit = model->RunJoiner(View(&cur_encoder_out),\n                                          View(&decoder_output_pair.first));\n\n      float *p_logit = logit.GetTensorMutableData<float>();\n      if (blank_penalty > 0) {\n        p_logit[blank_id] -= blank_penalty;\n      }\n\n      auto y = static_cast<int32_t>(std::distance(\n          static_cast<const float *>(p_logit),\n          std::max_element(static_cast<const float *>(p_logit),\n                           static_cast<const float *>(p_logit) + vocab_size)));\n\n      // Apply LogSoftmax and get log probability for selected token\n      LogSoftmax(p_logit, vocab_size);\n      float log_prob = p_logit[y];\n\n      if (y != blank_id) {\n        ans.tokens.push_back(y);\n        ans.timestamps.push_back(t);\n        ans.ys_log_probs.push_back(log_prob);\n\n        decoder_input_pair = BuildDecoderInput(y, model->Allocator());\n\n        decoder_output_pair =\n            model->RunDecoder(std::move(decoder_input_pair.first),\n                              std::move(decoder_input_pair.second),\n                              std::move(decoder_output_pair.second));\n      } else {\n        break;\n      }  // if (y != blank_id)\n    }\n  }  // for (int32_t t = 0; t != num_rows; ++t)\n\n  return ans;\n}\n\nstatic OfflineTransducerDecoderResult DecodeOneTDT(\n    const float *p, int32_t num_rows, int32_t num_cols,\n    OfflineTransducerNeMoModel *model, float blank_penalty) {\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  OfflineTransducerDecoderResult ans;\n\n  int32_t vocab_size = model->VocabSize();\n  int32_t blank_id = vocab_size - 1;\n\n  auto decoder_input_pair = 
BuildDecoderInput(blank_id, model->Allocator());\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> decoder_output_pair =\n      model->RunDecoder(std::move(decoder_input_pair.first),\n                        std::move(decoder_input_pair.second),\n                        model->GetDecoderInitStates(1));\n\n  std::array<int64_t, 3> encoder_shape{1, num_cols, 1};\n\n  int32_t max_tokens_per_frame = 5;\n  int32_t tokens_this_frame = 0;\n\n  int32_t skip = 0;\n  std::vector<float> token_logits_copy(\n      vocab_size);  // Reusable buffer for LogSoftmax\n  for (int32_t t = 0; t < num_rows; t += skip) {\n    Ort::Value cur_encoder_out = Ort::Value::CreateTensor(\n        memory_info, const_cast<float *>(p) + t * num_cols, num_cols,\n        encoder_shape.data(), encoder_shape.size());\n\n    Ort::Value logit = model->RunJoiner(View(&cur_encoder_out),\n                                        View(&decoder_output_pair.first));\n\n    auto shape = logit.GetTensorTypeAndShapeInfo().GetShape();\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n    if (blank_penalty > 0) {\n      p_logit[blank_id] -= blank_penalty;\n    }\n\n    int32_t output_size = shape.back();\n    int32_t num_durations = output_size - vocab_size;\n\n    // Split logits into token and duration logits\n    const float *token_logits = p_logit;\n    const float *duration_logits = p_logit + vocab_size;\n\n    auto y = static_cast<int32_t>(std::distance(\n        token_logits,\n        std::max_element(token_logits, token_logits + vocab_size)));\n\n    // Apply LogSoftmax to token logits and get log probability\n    std::copy(token_logits, token_logits + vocab_size,\n              token_logits_copy.begin());\n    LogSoftmax(token_logits_copy.data(), vocab_size);\n    float log_prob = token_logits_copy[y];\n\n    // note that skip can be 0\n    skip = static_cast<int32_t>(std::distance(\n        duration_logits,\n        std::max_element(duration_logits, duration_logits + num_durations)));\n\n    
if (y != blank_id) {\n      ans.tokens.push_back(y);\n      ans.timestamps.push_back(t);\n      ans.durations.push_back(skip);\n      ans.ys_log_probs.push_back(log_prob);\n\n      decoder_input_pair = BuildDecoderInput(y, model->Allocator());\n\n      decoder_output_pair =\n          model->RunDecoder(std::move(decoder_input_pair.first),\n                            std::move(decoder_input_pair.second),\n                            std::move(decoder_output_pair.second));\n\n      tokens_this_frame += 1;\n    }\n\n    if (skip > 0) {\n      tokens_this_frame = 0;\n    }\n\n    if (tokens_this_frame >= max_tokens_per_frame) {\n      tokens_this_frame = 0;\n      skip = 1;\n    }\n\n    if (y == blank_id && skip == 0) {\n      tokens_this_frame = 0;\n      skip = 1;\n    }\n  }  // for (int32_t t = 0; t < num_rows; t += skip)\n\n  return ans;\n}\n\nstd::vector<OfflineTransducerDecoderResult>\nOfflineTransducerGreedySearchNeMoDecoder::Decode(\n    Ort::Value encoder_out, Ort::Value encoder_out_length,\n    OfflineStream ** /*ss = nullptr*/, int32_t /*n= 0*/) {\n  auto shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n\n  int32_t batch_size = static_cast<int32_t>(shape[0]);\n  int32_t dim1 = static_cast<int32_t>(shape[1]);\n  int32_t dim2 = static_cast<int32_t>(shape[2]);\n\n  auto length_type =\n      encoder_out_length.GetTensorTypeAndShapeInfo().GetElementType();\n  if ((length_type != ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32) &&\n      (length_type != ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64)) {\n    SHERPA_ONNX_LOGE(\"Unsupported encoder_out_length data type: %d\",\n                     static_cast<int32_t>(length_type));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  const float *p = encoder_out.GetTensorData<float>();\n\n  std::vector<OfflineTransducerDecoderResult> ans(batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    const float *this_p = p + dim1 * dim2 * i;\n    int32_t this_len = length_type == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32\n                   
        ? encoder_out_length.GetTensorData<int32_t>()[i]\n                           : encoder_out_length.GetTensorData<int64_t>()[i];\n\n    if (is_tdt_) {\n      ans[i] = DecodeOneTDT(this_p, this_len, dim2, model_, blank_penalty_);\n    } else {\n      ans[i] = DecodeOne(this_p, this_len, dim2, model_, blank_penalty_);\n    }\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-greedy-search-nemo-decoder.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-nemo-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTransducerGreedySearchNeMoDecoder\n    : public OfflineTransducerDecoder {\n public:\n  OfflineTransducerGreedySearchNeMoDecoder(OfflineTransducerNeMoModel *model,\n                                           float blank_penalty, bool is_tdt)\n      : model_(model), blank_penalty_(blank_penalty), is_tdt_(is_tdt) {}\n\n  std::vector<OfflineTransducerDecoderResult> Decode(\n      Ort::Value encoder_out, Ort::Value encoder_out_length,\n      OfflineStream **ss = nullptr, int32_t n = 0) override;\n\n private:\n  OfflineTransducerNeMoModel *model_;  // Not owned\n  float blank_penalty_;\n  bool is_tdt_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-transducer-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTransducerModelConfig::Register(ParseOptions *po) {\n  po->Register(\"encoder\", &encoder_filename, \"Path to encoder.onnx\");\n  po->Register(\"decoder\", &decoder_filename, \"Path to decoder.onnx\");\n  po->Register(\"joiner\", &joiner_filename, \"Path to joiner.onnx\");\n}\n\nbool OfflineTransducerModelConfig::Validate() const {\n  if (!FileExists(encoder_filename)) {\n    SHERPA_ONNX_LOGE(\"transducer encoder: '%s' does not exist\",\n                     encoder_filename.c_str());\n    return false;\n  }\n\n  if (!FileExists(decoder_filename)) {\n    SHERPA_ONNX_LOGE(\"transducer decoder: '%s' does not exist\",\n                     decoder_filename.c_str());\n    return false;\n  }\n\n  if (!FileExists(joiner_filename)) {\n    SHERPA_ONNX_LOGE(\"transducer joiner: '%s' does not exist\",\n                     joiner_filename.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineTransducerModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTransducerModelConfig(\";\n  os << \"encoder_filename=\\\"\" << encoder_filename << \"\\\", \";\n  os << \"decoder_filename=\\\"\" << decoder_filename << \"\\\", \";\n  os << \"joiner_filename=\\\"\" << joiner_filename << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTransducerModelConfig {\n  std::string encoder_filename;\n  std::string decoder_filename;\n  std::string joiner_filename;\n\n  OfflineTransducerModelConfig() = default;\n  OfflineTransducerModelConfig(const std::string &encoder_filename,\n                               const std::string &decoder_filename,\n                               const std::string &joiner_filename)\n      : encoder_filename(encoder_filename),\n        decoder_filename(decoder_filename),\n        joiner_filename(joiner_filename) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-transducer-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTransducerModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.transducer.encoder_filename);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.decoder_filename);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.joiner_filename);\n      InitJoiner(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.transducer.encoder_filename);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.decoder_filename);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.joiner_filename);\n      InitJoiner(buf.data(), buf.size());\n    }\n  }\n\n  std::pair<Ort::Value, Ort::Value> RunEncoder(Ort::Value 
features,\n                                               Ort::Value features_length) {\n    std::array<Ort::Value, 2> encoder_inputs = {std::move(features),\n                                                std::move(features_length)};\n\n    auto encoder_out = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n        encoder_inputs.size(), encoder_output_names_ptr_.data(),\n        encoder_output_names_ptr_.size());\n\n    return {std::move(encoder_out[0]), std::move(encoder_out[1])};\n  }\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) {\n    auto decoder_out = decoder_sess_->Run(\n        {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n        decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n    return std::move(decoder_out[0]);\n  }\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) {\n    std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                              std::move(decoder_out)};\n    auto logit = joiner_sess_->Run({}, joiner_input_names_ptr_.data(),\n                                   joiner_input.data(), joiner_input.size(),\n                                   joiner_output_names_ptr_.data(),\n                                   joiner_output_names_ptr_.size());\n\n    return std::move(logit[0]);\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n  int32_t ContextSize() const { return context_size_; }\n  int32_t SubsamplingFactor() const { return 4; }\n  OrtAllocator *Allocator() { return allocator_; }\n\n  Ort::Value BuildDecoderInput(\n      const std::vector<OfflineTransducerDecoderResult> &results,\n      int32_t end_index) {\n    assert(end_index <= results.size());\n\n    int32_t batch_size = end_index;\n    int32_t context_size = ContextSize();\n    std::array<int64_t, 2> shape{batch_size, context_size};\n\n    Ort::Value decoder_input = Ort::Value::CreateTensor<int64_t>(\n        Allocator(), 
shape.data(), shape.size());\n    int64_t *p = decoder_input.GetTensorMutableData<int64_t>();\n\n    for (int32_t i = 0; i != batch_size; ++i) {\n      const auto &r = results[i];\n      const int64_t *begin = r.tokens.data() + r.tokens.size() - context_size;\n      const int64_t *end = r.tokens.data() + r.tokens.size();\n      std::copy(begin, end, p);\n      p += context_size;\n    }\n\n    return decoder_input;\n  }\n\n  Ort::Value BuildDecoderInput(const std::vector<Hypothesis> &results,\n                               int32_t end_index) {\n    assert(end_index <= results.size());\n\n    int32_t batch_size = end_index;\n    int32_t context_size = ContextSize();\n    std::array<int64_t, 2> shape{batch_size, context_size};\n\n    Ort::Value decoder_input = Ort::Value::CreateTensor<int64_t>(\n        Allocator(), shape.data(), shape.size());\n    int64_t *p = decoder_input.GetTensorMutableData<int64_t>();\n\n    for (int32_t i = 0; i != batch_size; ++i) {\n      const auto &r = results[i];\n      const int64_t *begin = r.ys.data() + r.ys.size() - context_size;\n      const int64_t *end = r.ys.data() + r.ys.size();\n      std::copy(begin, end, p);\n      p += context_size;\n    }\n\n    return decoder_input;\n  }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", 
os.str().c_str());\n#endif\n    }\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---decoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n  }\n\n  void InitJoiner(void *model_data, size_t model_data_length) {\n    joiner_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                  &joiner_input_names_ptr_);\n\n    GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                   &joiner_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---joiner---\\n\";\n      PrintModelMetadata(os, meta_data);\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  
std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  int32_t vocab_size_ = 0;    // initialized in InitDecoder\n  int32_t context_size_ = 0;  // initialized in InitDecoder\n};\n\nOfflineTransducerModel::OfflineTransducerModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTransducerModel::OfflineTransducerModel(Manager *mgr,\n                                               const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTransducerModel::~OfflineTransducerModel() = default;\n\nstd::pair<Ort::Value, Ort::Value> OfflineTransducerModel::RunEncoder(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->RunEncoder(std::move(features), std::move(features_length));\n}\n\nOrt::Value OfflineTransducerModel::RunDecoder(Ort::Value decoder_input) {\n  return impl_->RunDecoder(std::move(decoder_input));\n}\n\nOrt::Value OfflineTransducerModel::RunJoiner(Ort::Value encoder_out,\n                                             Ort::Value decoder_out) {\n  return impl_->RunJoiner(std::move(encoder_out), std::move(decoder_out));\n}\n\nint32_t OfflineTransducerModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OfflineTransducerModel::ContextSize() const {\n  return impl_->ContextSize();\n}\n\nint32_t OfflineTransducerModel::SubsamplingFactor() const {\n  return 
impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineTransducerModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nOrt::Value OfflineTransducerModel::BuildDecoderInput(\n    const std::vector<OfflineTransducerDecoderResult> &results,\n    int32_t end_index) const {\n  return impl_->BuildDecoderInput(results, end_index);\n}\n\nOrt::Value OfflineTransducerModel::BuildDecoderInput(\n    const std::vector<Hypothesis> &results, int32_t end_index) const {\n  return impl_->BuildDecoderInput(results, end_index);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTransducerModel::OfflineTransducerModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTransducerModel::OfflineTransducerModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-model.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTransducerDecoderResult;\n\nclass OfflineTransducerModel {\n public:\n  explicit OfflineTransducerModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineTransducerModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineTransducerModel();\n\n  /** Run the encoder.\n   *\n   * @param features  A tensor of shape (N, T, C). It is changed in-place.\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a pair containing:\n   *  - encoder_out: A 3-D tensor of shape (N, T', encoder_dim)\n   *  - encoder_out_length: A 1-D tensor of shape (N,) containing number\n   *                        of frames in `encoder_out` before padding.\n   */\n  std::pair<Ort::Value, Ort::Value> RunEncoder(Ort::Value features,\n                                               Ort::Value features_length);\n\n  /** Run the decoder network.\n   *\n   * Caution: We assume there are no recurrent connections in the decoder and\n   *          the decoder is stateless. 
See\n   * https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless2/decoder.py\n   *          for an example\n   *\n   * @param decoder_input It is usually of shape (N, context_size)\n   * @return Return a tensor of shape (N, decoder_dim).\n   */\n  Ort::Value RunDecoder(Ort::Value decoder_input);\n\n  /** Run the joint network.\n   *\n   * @param encoder_out Output of the encoder network. A tensor of shape\n   *                    (N, joiner_dim).\n   * @param decoder_out Output of the decoder network. A tensor of shape\n   *                    (N, joiner_dim).\n   * @return Return a tensor of shape (N, vocab_size). In icefall, the\n   *         last layer of the joint network is `nn.Linear`,\n   *         not `nn.LogSoftmax`.\n   */\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out);\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const;\n\n  /** Return the context_size of the decoder model.\n   */\n  int32_t ContextSize() const;\n\n  /** Return the subsampling factor of the model.\n   */\n  int32_t SubsamplingFactor() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  /** Build decoder_input from the current results.\n   *\n   * @param results Current decoded results.\n   * @param end_index We only use results[0:end_index] to build\n   *                  the decoder_input. results[end_index] is not used.\n   * @return Return a tensor of shape (end_index, ContextSize())\n   */\n  Ort::Value BuildDecoderInput(\n      const std::vector<OfflineTransducerDecoderResult> &results,\n      int32_t end_index) const;\n\n  Ort::Value BuildDecoderInput(const std::vector<Hypothesis> &results,\n                               int32_t end_index) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.h\"\n\n#include <deque>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/context-graph.h\"\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/packed-sequence.h\"\n#include \"sherpa-onnx/csrc/slice.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<OfflineTransducerDecoderResult>\nOfflineTransducerModifiedBeamSearchDecoder::Decode(\n    Ort::Value encoder_out, Ort::Value encoder_out_length,\n    OfflineStream **ss /*=nullptr */, int32_t n /*= 0*/) {\n  PackedSequence packed_encoder_out = PackPaddedSequence(\n      model_->Allocator(), &encoder_out, &encoder_out_length);\n\n  int32_t batch_size =\n      static_cast<int32_t>(packed_encoder_out.sorted_indexes.size());\n\n  if (ss != nullptr) SHERPA_ONNX_CHECK_EQ(batch_size, n);\n\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = 0;\n\n  std::deque<Hypotheses> finalized;\n  std::vector<Hypotheses> cur;\n  std::vector<Hypothesis> prev;\n\n  std::vector<ContextGraphPtr> context_graphs(batch_size, nullptr);\n\n  for (int32_t i = 0; i < batch_size; ++i) {\n    const ContextState *context_state = nullptr;\n    if (ss != nullptr) {\n      context_graphs[i] =\n          ss[packed_encoder_out.sorted_indexes[i]]->GetContextGraph();\n      if (context_graphs[i] != nullptr)\n        context_state = context_graphs[i]->Root();\n    }\n    Hypotheses blank_hyp({{blanks, 0, context_state}});\n    cur.emplace_back(std::move(blank_hyp));\n  }\n\n  int32_t start = 0;\n  int32_t t = 0;\n  for (auto n : packed_encoder_out.batch_sizes) {\n    Ort::Value cur_encoder_out = packed_encoder_out.Get(start, 
n);\n    start += n;\n\n    if (n < static_cast<int32_t>(cur.size())) {\n      for (int32_t k = static_cast<int32_t>(cur.size()) - 1; k >= n; --k) {\n        finalized.push_front(std::move(cur[k]));\n      }\n\n      cur.erase(cur.begin() + n, cur.end());\n    }  // if (n < static_cast<int32_t>(cur.size()))\n\n    // Due to merging paths with identical token sequences,\n    // not all utterances have \"max_active_paths\" paths.\n    auto hyps_row_splits = GetHypsRowSplits(cur);\n    int32_t num_hyps = hyps_row_splits.back();\n\n    prev.clear();\n    prev.reserve(num_hyps);\n\n    for (auto &hyps : cur) {\n      for (auto &h : hyps) {\n        prev.push_back(std::move(h.second));\n      }\n    }\n    cur.clear();\n    cur.reserve(n);\n\n    auto decoder_input = model_->BuildDecoderInput(prev, num_hyps);\n    // decoder_input shape: (num_hyps, context_size)\n\n    auto decoder_out = model_->RunDecoder(std::move(decoder_input));\n    // decoder_out is (num_hyps, joiner_dim)\n\n    cur_encoder_out =\n        Repeat(model_->Allocator(), &cur_encoder_out, hyps_row_splits);\n    // now cur_encoder_out is of shape (num_hyps, joiner_dim)\n\n    Ort::Value logit =\n        model_->RunJoiner(std::move(cur_encoder_out), View(&decoder_out));\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n    if (blank_penalty_ > 0.0) {\n      // assuming blank id is 0\n      SubtractBlank(p_logit, vocab_size, num_hyps, 0, blank_penalty_);\n    }\n    LogSoftmax(p_logit, vocab_size, num_hyps);\n\n    // now p_logit contains log_softmax output, we rename it to p_logprob\n    // to match what it actually contains\n    float *p_logprob = p_logit;\n\n    // add log_prob of each hypothesis to p_logprob before taking top_k\n    for (int32_t i = 0; i != num_hyps; ++i) {\n      float log_prob = prev[i].log_prob;\n      for (int32_t k = 0; k != vocab_size; ++k, ++p_logprob) {\n        *p_logprob += log_prob;\n      }\n    }\n    p_logprob = p_logit;  // we changed p_logprob in the above 
for loop\n\n    // Now compute top_k for each utterance\n    for (int32_t i = 0; i != n; ++i) {\n      int32_t start = hyps_row_splits[i];\n      int32_t end = hyps_row_splits[i + 1];\n      auto topk =\n          TopkIndex(p_logprob, vocab_size * (end - start), max_active_paths_);\n\n      Hypotheses hyps;\n      for (auto k : topk) {\n        int32_t hyp_index = k / vocab_size + start;\n        int32_t new_token = k % vocab_size;\n        Hypothesis new_hyp = prev[hyp_index];\n\n        float context_score = 0;\n        auto context_state = new_hyp.context_state;\n        // blank is hardcoded to 0\n        // also, it treats unk as blank\n        if (new_token != 0 && new_token != unk_id_) {\n          new_hyp.ys.push_back(new_token);\n          new_hyp.timestamps.push_back(t);\n\n          // Store the token log probability (subtract prev log_prob to get\n          // original)\n          float token_log_prob = p_logprob[k] - prev[hyp_index].log_prob;\n          new_hyp.ys_probs.push_back(token_log_prob);\n\n          if (context_graphs[i] != nullptr) {\n            auto context_res = context_graphs[i]->ForwardOneStep(\n                context_state, new_token, false /* non-strict mode */);\n            context_score = std::get<0>(context_res);\n            new_hyp.context_state = std::get<1>(context_res);\n          }\n        }\n\n        new_hyp.log_prob = p_logprob[k] + context_score;\n        hyps.Add(std::move(new_hyp));\n      }  // for (auto k : topk)\n      p_logprob += (end - start) * vocab_size;\n      cur.push_back(std::move(hyps));\n    }  // for (int32_t i = 0; i != n; ++i)\n\n    ++t;\n  }  // for (auto n : packed_encoder_out.batch_sizes)\n\n  for (auto &h : finalized) {\n    cur.push_back(std::move(h));\n  }\n\n  // Finalize context biasing matching..\n  for (int32_t i = 0; i < cur.size(); ++i) {\n    for (auto iter = cur[i].begin(); iter != cur[i].end(); ++iter) {\n      if (context_graphs[i] != nullptr) {\n        auto context_res =\n          
  context_graphs[i]->Finalize(iter->second.context_state);\n        iter->second.log_prob += context_res.first;\n        iter->second.context_state = context_res.second;\n      }\n    }\n  }\n\n  if (lm_) {\n    // use LM for rescoring\n    lm_->ComputeLMScore(lm_scale_, context_size, &cur);\n  }\n\n  std::vector<OfflineTransducerDecoderResult> unsorted_ans(batch_size);\n  for (int32_t i = 0; i != batch_size; ++i) {\n    Hypothesis hyp = cur[i].GetMostProbable(true);\n\n    auto &r = unsorted_ans[packed_encoder_out.sorted_indexes[i]];\n\n    // strip leading blanks\n    r.tokens = {hyp.ys.begin() + context_size, hyp.ys.end()};\n    r.timestamps = std::move(hyp.timestamps);\n    r.ys_log_probs = std::move(hyp.ys_probs);\n  }\n\n  return unsorted_ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-modified-beam-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-lm.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTransducerModifiedBeamSearchDecoder\n    : public OfflineTransducerDecoder {\n public:\n  OfflineTransducerModifiedBeamSearchDecoder(OfflineTransducerModel *model,\n                                             OfflineLM *lm,\n                                             int32_t max_active_paths,\n                                             float lm_scale, int32_t unk_id,\n                                             float blank_penalty)\n      : model_(model),\n        lm_(lm),\n        max_active_paths_(max_active_paths),\n        lm_scale_(lm_scale),\n        unk_id_(unk_id),\n        blank_penalty_(blank_penalty) {}\n\n  std::vector<OfflineTransducerDecoderResult> Decode(\n      Ort::Value encoder_out, Ort::Value encoder_out_length,\n      OfflineStream **ss = nullptr, int32_t n = 0) override;\n\n private:\n  OfflineTransducerModel *model_;  // Not owned\n  OfflineLM *lm_;                  // Not owned; may be nullptr\n\n  int32_t max_active_paths_;\n  float lm_scale_;  // used only when lm_ is not nullptr\n  int32_t unk_id_;\n  float blank_penalty_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.cc\r\n//\r\n// Copyright (c)  2026  (authors: github.com/nefastosaturo, github.com/nullbio)\r\n\r\n#include \"sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.h\"\r\n\r\n#include <algorithm>\r\n#include <deque>\r\n#include <utility>\r\n#include <vector>\r\n\r\n#include \"sherpa-onnx/csrc/context-graph.h\"\r\n#include \"sherpa-onnx/csrc/hypothesis.h\"\r\n#include \"sherpa-onnx/csrc/log.h\"\r\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\r\n#include \"sherpa-onnx/csrc/packed-sequence.h\"\r\n#include \"sherpa-onnx/csrc/slice.h\"\r\n\r\nnamespace sherpa_onnx {\r\n\r\n// Helper structure to track hypothesis with decoder state\r\nstruct NeMoHypothesis {\r\n  std::vector<int32_t> ys;          // token sequence (excluding initial blank)\r\n  std::vector<int32_t> timestamps;  // timestamps for each token\r\n  std::vector<int32_t> durations;   // durations for TDT\r\n  std::vector<float> ys_probs;      // log probability for each token\r\n  float log_prob;                   // accumulated log probability\r\n  std::vector<Ort::Value> decoder_states;  // RNN/LSTM states\r\n  const ContextState *context_state;       // context graph state\r\n  OrtAllocator *allocator;                 // allocator for cloning states\r\n  int32_t frame_offset;  // current frame position for this hypothesis\r\n\r\n  NeMoHypothesis()\r\n      : log_prob(0.0f),\r\n        context_state(nullptr),\r\n        allocator(nullptr),\r\n        frame_offset(0) {}\r\n\r\n  // Copy constructor - needed for hypothesis expansion\r\n  NeMoHypothesis(const NeMoHypothesis &other)\r\n      : ys(other.ys),\r\n        timestamps(other.timestamps),\r\n        durations(other.durations),\r\n        ys_probs(other.ys_probs),\r\n        log_prob(other.log_prob),\r\n        context_state(other.context_state),\r\n        allocator(other.allocator),\r\n        frame_offset(other.frame_offset) {\r\n    // Deep copy of 
decoder states\r\n    decoder_states.reserve(other.decoder_states.size());\r\n    for (const auto &state : other.decoder_states) {\r\n      decoder_states.push_back(Clone(allocator, &state));\r\n    }\r\n  }\r\n\r\n  NeMoHypothesis &operator=(const NeMoHypothesis &other) {\r\n    if (this != &other) {\r\n      ys = other.ys;\r\n      timestamps = other.timestamps;\r\n      durations = other.durations;\r\n      ys_probs = other.ys_probs;\r\n      log_prob = other.log_prob;\r\n      context_state = other.context_state;\r\n      allocator = other.allocator;\r\n      frame_offset = other.frame_offset;\r\n\r\n      decoder_states.clear();\r\n      decoder_states.reserve(other.decoder_states.size());\r\n      for (const auto &state : other.decoder_states) {\r\n        decoder_states.push_back(Clone(allocator, &state));\r\n      }\r\n    }\r\n    return *this;\r\n  }\r\n\r\n  NeMoHypothesis(NeMoHypothesis &&) = default;\r\n  NeMoHypothesis &operator=(NeMoHypothesis &&) = default;\r\n};\r\n\r\nstd::vector<OfflineTransducerDecoderResult>\r\nOfflineTransducerModifiedBeamSearchNeMoDecoder::Decode(\r\n    Ort::Value encoder_out, Ort::Value encoder_out_length,\r\n    OfflineStream **ss /*= nullptr*/, int32_t n /*= 0*/) {\r\n  auto encoder_shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\r\n  int32_t batch_size = static_cast<int32_t>(encoder_shape[0]);\r\n  int32_t num_frames = static_cast<int32_t>(encoder_shape[1]);\r\n  int32_t encoder_dim = static_cast<int32_t>(encoder_shape[2]);\r\n\r\n  if (ss != nullptr) SHERPA_ONNX_CHECK_EQ(batch_size, n);\r\n\r\n  int32_t vocab_size = model_->VocabSize();\r\n  int32_t blank_id = vocab_size - 1;  // NeMo models have blank at the end\r\n\r\n  // For TDT models, we need to know the number of duration bins\r\n  // We'll detect this from the joiner output size on first run\r\n  int32_t num_durations = 0;\r\n\r\n  std::vector<ContextGraphPtr> context_graphs(batch_size, nullptr);\r\n\r\n  auto memory_info =\r\n      
Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\r\n\r\n  OrtAllocator *allocator = model_->Allocator();\r\n\r\n  const float *encoder_data = encoder_out.GetTensorData<float>();\r\n\r\n  // Get per-utterance lengths\r\n  std::vector<int32_t> utterance_lengths(batch_size);\r\n  auto length_type =\r\n      encoder_out_length.GetTensorTypeAndShapeInfo().GetElementType();\r\n  for (int32_t i = 0; i < batch_size; ++i) {\r\n    utterance_lengths[i] =\r\n        (length_type == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32)\r\n            ? encoder_out_length.GetTensorData<int32_t>()[i]\r\n            : static_cast<int32_t>(\r\n                  encoder_out_length.GetTensorData<int64_t>()[i]);\r\n  }\r\n\r\n  std::vector<OfflineTransducerDecoderResult> results(batch_size);\r\n\r\n  // Process each utterance independently (simpler for TDT with variable frame\r\n  // positions)\r\n  for (int32_t b = 0; b < batch_size; ++b) {\r\n    const ContextState *context_state = nullptr;\r\n    if (ss != nullptr) {\r\n      context_graphs[b] = ss[b]->GetContextGraph();\r\n      if (context_graphs[b] != nullptr) {\r\n        context_state = context_graphs[b]->Root();\r\n      }\r\n    }\r\n\r\n    int32_t this_num_frames = utterance_lengths[b];\r\n    const float *this_encoder = encoder_data + b * num_frames * encoder_dim;\r\n\r\n    // Initialize with single hypothesis\r\n    std::vector<NeMoHypothesis> cur_hyps;\r\n    {\r\n      NeMoHypothesis blank_hyp;\r\n      blank_hyp.log_prob = 0.0f;\r\n      blank_hyp.context_state = context_state;\r\n      blank_hyp.allocator = allocator;\r\n      blank_hyp.frame_offset = 0;\r\n      blank_hyp.decoder_states = model_->GetDecoderInitStates(1);\r\n      cur_hyps.push_back(std::move(blank_hyp));\r\n    }\r\n\r\n    // Process until all hypotheses have finished\r\n    while (true) {\r\n      // Find minimum frame offset among active hypotheses\r\n      int32_t min_frame = this_num_frames;\r\n      for (const auto &hyp : cur_hyps) {\r\n    
    if (hyp.frame_offset < min_frame) {\r\n          min_frame = hyp.frame_offset;\r\n        }\r\n      }\r\n\r\n      if (min_frame >= this_num_frames) {\r\n        break;  // All hypotheses have finished\r\n      }\r\n\r\n      // Process hypotheses at the minimum frame\r\n      std::vector<std::pair<float, NeMoHypothesis>> all_candidates;\r\n\r\n      for (auto &hyp : cur_hyps) {\r\n        if (hyp.frame_offset > min_frame) {\r\n          // This hypothesis is ahead, keep it as-is\r\n          all_candidates.emplace_back(hyp.log_prob, std::move(hyp));\r\n          continue;\r\n        }\r\n\r\n        // Get encoder output for this frame\r\n        std::array<int64_t, 3> encoder_3d_shape{1, encoder_dim, 1};\r\n        const float *frame_data = this_encoder + hyp.frame_offset * encoder_dim;\r\n\r\n        Ort::Value encoder_out_frame = Ort::Value::CreateTensor(\r\n            memory_info, const_cast<float *>(frame_data), encoder_dim,\r\n            encoder_3d_shape.data(), encoder_3d_shape.size());\r\n\r\n        // Prepare decoder input: use blank_id as initial token, then last\r\n        // emitted token\r\n        int32_t last_token = hyp.ys.empty() ? 
blank_id : hyp.ys.back();\r\n        std::array<int64_t, 2> decoder_input_shape = {1, 1};\r\n        std::vector<int32_t> decoder_input_data = {last_token};\r\n\r\n        Ort::Value decoder_input = Ort::Value::CreateTensor(\r\n            memory_info, decoder_input_data.data(), 1,\r\n            decoder_input_shape.data(), decoder_input_shape.size());\r\n\r\n        std::array<int64_t, 1> decoder_input_length_shape = {1};\r\n        std::vector<int32_t> decoder_input_length_data = {1};\r\n\r\n        Ort::Value decoder_input_length = Ort::Value::CreateTensor(\r\n            memory_info, decoder_input_length_data.data(), 1,\r\n            decoder_input_length_shape.data(),\r\n            decoder_input_length_shape.size());\r\n\r\n        // Clone decoder states for this expansion\r\n        std::vector<Ort::Value> decoder_states_copy;\r\n        decoder_states_copy.reserve(hyp.decoder_states.size());\r\n        for (const auto &state : hyp.decoder_states) {\r\n          decoder_states_copy.push_back(Clone(allocator, &state));\r\n        }\r\n\r\n        auto decoder_result = model_->RunDecoder(\r\n            std::move(decoder_input), std::move(decoder_input_length),\r\n            std::move(decoder_states_copy));\r\n\r\n        Ort::Value decoder_out = std::move(decoder_result.first);\r\n        std::vector<Ort::Value> next_states = std::move(decoder_result.second);\r\n\r\n        // Run joiner\r\n        Ort::Value logit =\r\n            model_->RunJoiner(View(&encoder_out_frame), View(&decoder_out));\r\n\r\n        auto logit_shape = logit.GetTensorTypeAndShapeInfo().GetShape();\r\n        int32_t output_size = static_cast<int32_t>(logit_shape.back());\r\n\r\n        float *p_logit = logit.GetTensorMutableData<float>();\r\n\r\n        // Detect TDT mode from joiner output size\r\n        if (is_tdt_ && num_durations == 0 && output_size > vocab_size) {\r\n          num_durations = output_size - vocab_size;\r\n        }\r\n\r\n        // Split into token and 
duration logits for TDT\r\n        int32_t token_vocab_size = is_tdt_ ? vocab_size : output_size;\r\n        float *token_logits = p_logit;\r\n        float *duration_logits = is_tdt_ ? (p_logit + vocab_size) : nullptr;\r\n\r\n        // Apply blank penalty\r\n        if (blank_penalty_ > 0.0f) {\r\n          token_logits[blank_id] -= blank_penalty_;\r\n        }\r\n\r\n        // Compute log softmax for tokens only\r\n        LogSoftmax(token_logits, token_vocab_size, 1);\r\n\r\n        // Apply context boosting BEFORE top-k selection so hotword tokens\r\n        // have a chance to be selected even if their base probability is low\r\n        if (context_graphs[b] != nullptr && hyp.context_state != nullptr) {\r\n          for (const auto &pair : hyp.context_state->next) {\r\n            int32_t token_id = pair.first;\r\n            if (token_id >= 0 && token_id < token_vocab_size) {\r\n              token_logits[token_id] += hotwords_score_;\r\n            }\r\n          }\r\n        }\r\n\r\n        auto top_k_tokens =\r\n            TopkIndex(token_logits, token_vocab_size, max_active_paths_);\r\n\r\n        // Determine duration/skip for TDT\r\n        int32_t predicted_skip = 1;  // Default: advance by 1 frame\r\n        float duration_log_prob = 0.0f;\r\n        if (is_tdt_ && duration_logits != nullptr && num_durations > 0) {\r\n          // Apply log softmax to duration logits\r\n          LogSoftmax(duration_logits, num_durations, 1);\r\n\r\n          // Find best duration\r\n          predicted_skip = static_cast<int32_t>(\r\n              std::distance(duration_logits,\r\n                            std::max_element(duration_logits,\r\n                                             duration_logits + num_durations)));\r\n\r\n          // Get the log probability for the selected duration\r\n          duration_log_prob = duration_logits[predicted_skip];\r\n        }\r\n\r\n        // Create candidate hypotheses\r\n        for (int32_t idx : top_k_tokens) 
{\r\n          int32_t token = idx;\r\n          // For TDT: joint probability = P(token) * P(duration)\r\n          // In log space: log P(token, duration) = log P(token) + log\r\n          // P(duration)\r\n          float token_log_prob =\r\n              token_logits[token] + duration_log_prob + hyp.log_prob;\r\n\r\n          NeMoHypothesis new_hyp;\r\n          new_hyp.ys = hyp.ys;\r\n          new_hyp.timestamps = hyp.timestamps;\r\n          new_hyp.durations = hyp.durations;\r\n          new_hyp.ys_probs = hyp.ys_probs;\r\n          new_hyp.context_state = hyp.context_state;\r\n          new_hyp.allocator = allocator;\r\n          new_hyp.log_prob = token_log_prob;\r\n\r\n          float context_score = 0.0f;\r\n\r\n          if (token == blank_id || token == unk_id_) {\r\n            // Blank or unk: keep decoder state, advance frame\r\n            new_hyp.decoder_states.reserve(hyp.decoder_states.size());\r\n            for (const auto &state : hyp.decoder_states) {\r\n              new_hyp.decoder_states.push_back(Clone(allocator, &state));\r\n            }\r\n            // For blank/unk in TDT, always advance by at least 1\r\n            new_hyp.frame_offset =\r\n                hyp.frame_offset + std::max(1, predicted_skip);\r\n          } else {\r\n            // Non-blank: add token, use new decoder state\r\n            new_hyp.ys.push_back(token);\r\n            new_hyp.timestamps.push_back(hyp.frame_offset);\r\n            new_hyp.ys_probs.push_back(token_logits[token]);\r\n            if (is_tdt_) {\r\n              new_hyp.durations.push_back(predicted_skip);\r\n            }\r\n\r\n            new_hyp.decoder_states.reserve(next_states.size());\r\n            for (const auto &state : next_states) {\r\n              new_hyp.decoder_states.push_back(Clone(allocator, &state));\r\n            }\r\n\r\n            // For non-blank in TDT, advance by predicted duration (can be 0 to\r\n            // emit more tokens) For non-TDT, stay on same frame 
to allow more\r\n            // tokens\r\n            if (is_tdt_) {\r\n              new_hyp.frame_offset = hyp.frame_offset + predicted_skip;\r\n            } else {\r\n              new_hyp.frame_offset = hyp.frame_offset;\r\n            }\r\n\r\n            // Update context graph\r\n            if (context_graphs[b] != nullptr) {\r\n              auto context_res = context_graphs[b]->ForwardOneStep(\r\n                  new_hyp.context_state, token, false);\r\n              context_score = std::get<0>(context_res);\r\n              new_hyp.context_state = std::get<1>(context_res);\r\n            }\r\n            new_hyp.log_prob += context_score;\r\n          }\r\n\r\n          all_candidates.emplace_back(new_hyp.log_prob, std::move(new_hyp));\r\n        }\r\n      }\r\n\r\n      // Keep top-k hypotheses\r\n      if (all_candidates.empty()) {\r\n        break;\r\n      }\r\n\r\n      std::partial_sort(\r\n          all_candidates.begin(),\r\n          all_candidates.begin() +\r\n              std::min(max_active_paths_,\r\n                       static_cast<int32_t>(all_candidates.size())),\r\n          all_candidates.end(),\r\n          [](const auto &a, const auto &b) { return a.first > b.first; });\r\n\r\n      int32_t keep = std::min(max_active_paths_,\r\n                              static_cast<int32_t>(all_candidates.size()));\r\n      cur_hyps.clear();\r\n      cur_hyps.reserve(keep);\r\n      for (int32_t k = 0; k < keep; ++k) {\r\n        cur_hyps.push_back(std::move(all_candidates[k].second));\r\n      }\r\n    }\r\n\r\n    // Finalize context biasing\r\n    for (auto &hyp : cur_hyps) {\r\n      if (context_graphs[b] != nullptr) {\r\n        auto context_res = context_graphs[b]->Finalize(hyp.context_state);\r\n        hyp.log_prob += context_res.first;\r\n        hyp.context_state = context_res.second;\r\n      }\r\n    }\r\n\r\n    // Find best hypothesis\r\n    auto best_it =\r\n        std::max_element(cur_hyps.begin(), cur_hyps.end(),\r\n        
                 [](const NeMoHypothesis &a, const NeMoHypothesis &b) {\r\n                           return a.log_prob < b.log_prob;\r\n                         });\r\n\r\n    if (best_it != cur_hyps.end()) {\r\n      // Convert int32_t to int64_t for tokens\r\n      results[b].tokens.assign(best_it->ys.begin(), best_it->ys.end());\r\n      results[b].timestamps = best_it->timestamps;\r\n      results[b].ys_log_probs = best_it->ys_probs;\r\n      // Convert int32_t durations to float\r\n      results[b].durations.reserve(best_it->durations.size());\r\n      for (int32_t d : best_it->durations) {\r\n        results[b].durations.push_back(static_cast<float>(d));\r\n      }\r\n    }\r\n  }\r\n\r\n  return results;\r\n}\r\n\r\n}  // namespace sherpa_onnx\r\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-modified-beam-search-nemo-decoder.h\r\n//\r\n// Copyright (c)  2026  (authors: github.com/nefastosaturo, github.com/nullbio)\r\n\r\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_NEMO_DECODER_H_\r\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_NEMO_DECODER_H_\r\n\r\n#include <vector>\r\n\r\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\r\n#include \"sherpa-onnx/csrc/offline-transducer-nemo-model.h\"\r\n\r\nnamespace sherpa_onnx {\r\n\r\nclass OfflineTransducerModifiedBeamSearchNeMoDecoder\r\n    : public OfflineTransducerDecoder {\r\n public:\r\n  OfflineTransducerModifiedBeamSearchNeMoDecoder(\r\n      OfflineTransducerNeMoModel *model, int32_t max_active_paths,\r\n      int32_t unk_id, float blank_penalty, bool is_tdt,\r\n      float hotwords_score = 0.0f)\r\n      : model_(model),\r\n        max_active_paths_(max_active_paths),\r\n        unk_id_(unk_id),\r\n        blank_penalty_(blank_penalty),\r\n        is_tdt_(is_tdt),\r\n        hotwords_score_(hotwords_score) {}\r\n\r\n  std::vector<OfflineTransducerDecoderResult> Decode(\r\n      Ort::Value encoder_out,\r\n      Ort::Value encoder_out_length,\r\n      OfflineStream **ss = nullptr,\r\n      int32_t n = 0) override;\r\n\r\n private:\r\n  OfflineTransducerNeMoModel *model_;  // Not owned\r\n\r\n  int32_t max_active_paths_;\r\n  int32_t unk_id_;\r\n  float blank_penalty_;\r\n  bool is_tdt_;  // Token-and-Duration Transducer mode\r\n  float hotwords_score_;\r\n};\r\n\r\n}  // namespace sherpa_onnx\r\n\r\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_NEMO_DECODER_H_\r\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-nemo-model.cc",
    "content": "// sherpa-onnx/csrc/offline-transducer-nemo-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-transducer-nemo-model.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTransducerNeMoModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.transducer.encoder_filename);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.decoder_filename);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.joiner_filename);\n      InitJoiner(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.transducer.encoder_filename);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.decoder_filename);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.joiner_filename);\n      InitJoiner(buf.data(), buf.size());\n    }\n  }\n\n 
 std::vector<Ort::Value> RunEncoder(Ort::Value features,\n                                     Ort::Value features_length) {\n    // (B, T, C) -> (B, C, T)\n    features = Transpose12(allocator_, &features);\n\n    std::array<Ort::Value, 2> encoder_inputs = {std::move(features),\n                                                std::move(features_length)};\n\n    auto encoder_out = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n        encoder_inputs.size(), encoder_output_names_ptr_.data(),\n        encoder_output_names_ptr_.size());\n\n    return encoder_out;\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunDecoder(\n      Ort::Value targets, Ort::Value targets_length,\n      std::vector<Ort::Value> states) {\n    std::vector<Ort::Value> decoder_inputs;\n    decoder_inputs.reserve(2 + states.size());\n\n    decoder_inputs.push_back(std::move(targets));\n    decoder_inputs.push_back(std::move(targets_length));\n\n    for (auto &s : states) {\n      decoder_inputs.push_back(std::move(s));\n    }\n\n    auto decoder_out = decoder_sess_->Run(\n        {}, decoder_input_names_ptr_.data(), decoder_inputs.data(),\n        decoder_inputs.size(), decoder_output_names_ptr_.data(),\n        decoder_output_names_ptr_.size());\n\n    std::vector<Ort::Value> states_next;\n    states_next.reserve(states.size());\n\n    // decoder_out[0]: decoder_output\n    // decoder_out[1]: decoder_output_length\n    // decoder_out[2:] states_next\n\n    for (int32_t i = 0; i != states.size(); ++i) {\n      states_next.push_back(std::move(decoder_out[i + 2]));\n    }\n\n    // we discard decoder_out[1]\n    return {std::move(decoder_out[0]), std::move(states_next)};\n  }\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) {\n    std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                              std::move(decoder_out)};\n    auto logit = joiner_sess_->Run({}, 
joiner_input_names_ptr_.data(),\n                                   joiner_input.data(), joiner_input.size(),\n                                   joiner_output_names_ptr_.data(),\n                                   joiner_output_names_ptr_.size());\n\n    return std::move(logit[0]);\n  }\n\n  std::vector<Ort::Value> GetDecoderInitStates(int32_t batch_size) {\n    std::array<int64_t, 3> s0_shape{pred_rnn_layers_, batch_size, pred_hidden_};\n    Ort::Value s0 = Ort::Value::CreateTensor<float>(allocator_, s0_shape.data(),\n                                                    s0_shape.size());\n\n    Fill<float>(&s0, 0);\n\n    std::array<int64_t, 3> s1_shape{pred_rnn_layers_, batch_size, pred_hidden_};\n\n    Ort::Value s1 = Ort::Value::CreateTensor<float>(allocator_, s1_shape.data(),\n                                                    s1_shape.size());\n\n    Fill<float>(&s1, 0);\n\n    std::vector<Ort::Value> states;\n\n    states.reserve(2);\n    states.push_back(std::move(s0));\n    states.push_back(std::move(s1));\n\n    return states;\n  }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n  int32_t VocabSize() const { return vocab_size_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  std::string FeatureNormalizationMethod() const { return normalize_type_; }\n\n  bool IsGigaAM() const { return is_giga_am_; }\n  bool IsTDT() const { return is_tdt_; }\n\n  int32_t FeatureDim() const { return feat_dim_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if 
(config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n\n    // need to increase by 1 since the blank token is not included in computing\n    // vocab_size in NeMo.\n    vocab_size_ += 1;\n\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(normalize_type_,\n                                               \"normalize_type\");\n    SHERPA_ONNX_READ_META_DATA(pred_rnn_layers_, \"pred_rnn_layers\");\n    SHERPA_ONNX_READ_META_DATA(pred_hidden_, \"pred_hidden\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(is_giga_am_, \"is_giga_am\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(feat_dim_, \"feat_dim\", -1);\n\n    if (normalize_type_ == \"NA\") {\n      normalize_type_ = \"\";\n    }\n\n    std::string url;\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(url, \"url\");\n    if (url.find(\"tdt\") != std::string::npos) {\n      is_tdt_ = 1;\n    }\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n  }\n\n  void InitJoiner(void *model_data, size_t model_data_length) {\n    joiner_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                  
&joiner_input_names_ptr_);\n\n    GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                   &joiner_output_names_ptr_);\n\n    auto shape = joiner_sess_->GetOutputTypeInfo(0)\n                     .GetTensorTypeAndShapeInfo()\n                     .GetShape();\n    int32_t output_size = shape.back();\n    if (is_tdt_) {\n      if (vocab_size_ == output_size) {\n        SHERPA_ONNX_LOGE(\"It is not a TDT model!\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"TDT model. vocab_size: %d, num_durations: %d\",\n                         vocab_size_, output_size - vocab_size_);\n      }\n    } else if (vocab_size_ != output_size) {\n      SHERPA_ONNX_LOGE(\"vocab_size: %d != output_size: %d\", vocab_size_,\n                       output_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 8;\n  std::string normalize_type_;\n  int32_t pred_rnn_layers_ = -1;\n  int32_t pred_hidden_ = -1;\n  int32_t is_giga_am_ = 0;\n  int32_t 
is_tdt_ = 0;\n\n  // giga am uses 64\n  // parakeet-tdt-0.6b-v2 uses 128\n  // others use 80\n  int32_t feat_dim_ = -1;  // -1 means to use default values.\n};\n\nOfflineTransducerNeMoModel::OfflineTransducerNeMoModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTransducerNeMoModel::OfflineTransducerNeMoModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTransducerNeMoModel::~OfflineTransducerNeMoModel() = default;\n\nstd::vector<Ort::Value> OfflineTransducerNeMoModel::RunEncoder(\n    Ort::Value features, Ort::Value features_length) const {\n  return impl_->RunEncoder(std::move(features), std::move(features_length));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOfflineTransducerNeMoModel::RunDecoder(Ort::Value targets,\n                                       Ort::Value targets_length,\n                                       std::vector<Ort::Value> states) const {\n  return impl_->RunDecoder(std::move(targets), std::move(targets_length),\n                           std::move(states));\n}\n\nstd::vector<Ort::Value> OfflineTransducerNeMoModel::GetDecoderInitStates(\n    int32_t batch_size) const {\n  return impl_->GetDecoderInitStates(batch_size);\n}\n\nOrt::Value OfflineTransducerNeMoModel::RunJoiner(Ort::Value encoder_out,\n                                                 Ort::Value decoder_out) const {\n  return impl_->RunJoiner(std::move(encoder_out), std::move(decoder_out));\n}\n\nint32_t OfflineTransducerNeMoModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nint32_t OfflineTransducerNeMoModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nOrtAllocator *OfflineTransducerNeMoModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::string OfflineTransducerNeMoModel::FeatureNormalizationMethod() const {\n  return 
impl_->FeatureNormalizationMethod();\n}\n\nbool OfflineTransducerNeMoModel::IsGigaAM() const { return impl_->IsGigaAM(); }\n\nbool OfflineTransducerNeMoModel::IsTDT() const { return impl_->IsTDT(); }\n\nint32_t OfflineTransducerNeMoModel::FeatureDim() const {\n  return impl_->FeatureDim();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTransducerNeMoModel::OfflineTransducerNeMoModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTransducerNeMoModel::OfflineTransducerNeMoModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-transducer-nemo-model.h",
    "content": "// sherpa-onnx/csrc/offline-transducer-nemo-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_NEMO_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_NEMO_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n// see\n// https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/models/hybrid_rnnt_ctc_bpe_models.py#L40\n// Its decoder is stateful, not stateless.\nclass OfflineTransducerNeMoModel {\n public:\n  explicit OfflineTransducerNeMoModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineTransducerNeMoModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineTransducerNeMoModel();\n\n  /** Run the encoder.\n   *\n   * @param features  A tensor of shape (N, T, C). It is changed in-place.\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - encoder_out: A 3-D tensor of shape (N, T', encoder_dim)\n   *  - encoder_out_length: A 1-D tensor of shape (N,) containing number\n   *                        of frames in `encoder_out` before padding.\n   */\n  std::vector<Ort::Value> RunEncoder(Ort::Value features,\n                                     Ort::Value features_length) const;\n\n  /** Run the decoder network.\n   *\n   * @param targets An int32 tensor of shape (batch_size, 1)\n   * @param targets_length An int32 tensor of shape (batch_size,)\n   * @param states The states for the decoder model.\n   * @return Return a pair (the decoder_out_length produced by the model is\n   *         discarded):\n   *           - first is the decoder_out (a float tensor)\n   *           - second is the 
states_next\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunDecoder(\n      Ort::Value targets, Ort::Value targets_length,\n      std::vector<Ort::Value> states) const;\n\n  std::vector<Ort::Value> GetDecoderInitStates(int32_t batch_size) const;\n\n  /** Run the joint network.\n   *\n   * @param encoder_out Output of the encoder network.\n   * @param decoder_out Output of the decoder network.\n   * @return Return a tensor of shape (N, 1, 1, vocab_size) containing logits.\n   */\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) const;\n\n  /** Return the subsampling factor of the model.\n   */\n  int32_t SubsamplingFactor() const;\n\n  int32_t VocabSize() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  // Possible values:\n  // - per_feature\n  // - all_features (not implemented yet)\n  // - fixed_mean (not implemented)\n  // - fixed_std (not implemented)\n  // - or just leave it to empty\n  // See\n  // https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/parts/preprocessing/features.py#L59\n  // for details\n  std::string FeatureNormalizationMethod() const;\n\n  bool IsGigaAM() const;\n\n  // true if it is a Token-and-Duration Transducer model\n  // false otherwise\n  bool IsTDT() const;\n\n  int32_t FeatureDim() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TRANSDUCER_NEMO_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-character-frontend.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-character-frontend.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include <algorithm>\n#include <cctype>\n#include <fstream>\n#include <locale>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-tts-character-frontend.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic std::unordered_map<char32_t, int32_t> ReadTokens(std::istream &is) {\n  std::unordered_map<char32_t, int32_t> token2id;\n\n  std::string line;\n\n  std::string sym;\n  std::u32string s;\n  int32_t id = 0;\n  while (std::getline(is, line)) {\n    std::istringstream iss(line);\n    iss >> sym;\n    if (iss.eof()) {\n      id = atoi(sym.c_str());\n      sym = \" \";\n    } else {\n      iss >> id;\n    }\n\n    // eat the trailing \\r\\n on windows\n    iss >> std::ws;\n    if (!iss.eof()) {\n      SHERPA_ONNX_LOGE(\"Error when reading tokens: %s\", line.c_str());\n      exit(-1);\n    }\n\n    // For models from coqui-ai/TTS, we have saved the IDs of the following\n    // symbols in OfflineTtsVitsModelMetaData, so it is safe to skip them here.\n    if (sym == \"<PAD>\" || sym == \"<EOS>\" || sym == \"<BOS>\" || sym == \"<BLNK>\") {\n      continue;\n    }\n\n    s = Utf8ToUtf32(sym);\n    if (s.size() != 1) {\n      SHERPA_ONNX_LOGE(\"Error when reading tokens at Line %s. size: %d\",\n                       line.c_str(), static_cast<int32_t>(s.size()));\n      exit(-1);\n    }\n\n    char32_t c = s[0];\n\n    if (token2id.count(c)) {\n      SHERPA_ONNX_LOGE(\"Duplicated token %s. Line %s. 
Existing ID: %d\",\n                       sym.c_str(), line.c_str(), token2id.at(c));\n      exit(-1);\n    }\n\n    token2id.insert({c, id});\n  }\n\n  return token2id;\n}\n\nOfflineTtsCharacterFrontend::OfflineTtsCharacterFrontend(\n    const std::string &tokens, const OfflineTtsVitsModelMetaData &meta_data)\n    : meta_data_(meta_data) {\n  std::ifstream is(tokens);\n  token2id_ = ReadTokens(is);\n}\n\ntemplate <typename Manager>\nOfflineTtsCharacterFrontend::OfflineTtsCharacterFrontend(\n    Manager *mgr, const std::string &tokens,\n    const OfflineTtsVitsModelMetaData &meta_data)\n    : meta_data_(meta_data) {\n  auto buf = ReadFile(mgr, tokens);\n  std::istringstream is(std::string(buf.data(), buf.size()));\n  token2id_ = ReadTokens(is);\n}\n\nstd::vector<TokenIDs> OfflineTtsCharacterFrontend::ConvertTextToTokenIds(\n    const std::string &_text, const std::string & /*voice = \"\"*/) const {\n  // see\n  // https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/utils/text/tokenizer.py#L87\n  int32_t use_eos_bos = meta_data_.use_eos_bos;\n  int32_t bos_id = meta_data_.bos_id;\n  int32_t eos_id = meta_data_.eos_id;\n  int32_t blank_id = meta_data_.blank_id;\n  int32_t add_blank = meta_data_.add_blank;\n\n  std::string text(_text.size(), 0);\n  std::transform(_text.begin(), _text.end(), text.begin(),\n                 [](auto c) { return std::tolower(c); });\n\n  std::u32string s = Utf8ToUtf32(text);\n\n  std::vector<TokenIDs> ans;\n\n  std::vector<int64_t> this_sentence;\n  if (add_blank) {\n    if (use_eos_bos) {\n      this_sentence.push_back(bos_id);\n    }\n\n    this_sentence.push_back(blank_id);\n\n    for (char32_t c : s) {\n      if (token2id_.count(c)) {\n        this_sentence.push_back(token2id_.at(c));\n        this_sentence.push_back(blank_id);\n      } else {\n        SHERPA_ONNX_LOGE(\"Skip unknown character. Unicode codepoint: \\\\U+%04x.\",\n                         static_cast<uint32_t>(c));\n      }\n\n      if (c == '.' || c == ':' || c == '?' 
|| c == '!') {\n        // end of a sentence\n        if (use_eos_bos) {\n          this_sentence.push_back(eos_id);\n        }\n\n        ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n\n        // re-initialize this_sentence\n        if (use_eos_bos) {\n          this_sentence.push_back(bos_id);\n        }\n        this_sentence.push_back(blank_id);\n      }\n    }\n\n    if (use_eos_bos) {\n      this_sentence.push_back(eos_id);\n    }\n\n    if (static_cast<int32_t>(this_sentence.size()) > 1 + use_eos_bos) {\n      ans.emplace_back(std::move(this_sentence));\n    }\n  } else {\n    // not adding blank\n    if (use_eos_bos) {\n      this_sentence.push_back(bos_id);\n    }\n\n    for (char32_t c : s) {\n      if (token2id_.count(c)) {\n        this_sentence.push_back(token2id_.at(c));\n      }\n\n      if (c == '.' || c == ':' || c == '?' || c == '!') {\n        // end of a sentence\n        if (use_eos_bos) {\n          this_sentence.push_back(eos_id);\n        }\n\n        ans.emplace_back(std::move(this_sentence));\n        this_sentence = {};\n\n        // re-initialize this_sentence\n        if (use_eos_bos) {\n          this_sentence.push_back(bos_id);\n        }\n      }\n    }\n\n    if (this_sentence.size() > 1) {\n      ans.emplace_back(std::move(this_sentence));\n    }\n  }\n\n  return ans;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsCharacterFrontend::OfflineTtsCharacterFrontend(\n    AAssetManager *mgr, const std::string &tokens,\n    const OfflineTtsVitsModelMetaData &meta_data);\n\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsCharacterFrontend::OfflineTtsCharacterFrontend(\n    NativeResourceManager *mgr, const std::string &tokens,\n    const OfflineTtsVitsModelMetaData &meta_data);\n\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-character-frontend.h",
    "content": "// sherpa-onnx/csrc/offline-tts-character-frontend.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_CHARACTER_FRONTEND_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_CHARACTER_FRONTEND_H_\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsCharacterFrontend : public OfflineTtsFrontend {\n public:\n  OfflineTtsCharacterFrontend(const std::string &tokens,\n                              const OfflineTtsVitsModelMetaData &meta_data);\n\n  template <typename Manager>\n  OfflineTtsCharacterFrontend(Manager *mgr, const std::string &tokens,\n                              const OfflineTtsVitsModelMetaData &meta_data);\n\n  /** Convert a string to token IDs.\n   *\n   * @param text The input text.\n   *             Example 1: \"This is the first sample sentence; this is the\n   *             second one.\" Example 2: \"这是第一句。这是第二句。\"\n   * @param voice Optional. It is for espeak-ng.\n   *\n   * @return Return a vector-of-vector of token IDs. Each subvector contains\n   *         a sentence that can be processed independently.\n   *         If a frontend does not support splitting the text into\n   * sentences, the resulting vector contains only one subvector.\n   */\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text, const std::string &voice = \"\") const override;\n\n private:\n  OfflineTtsVitsModelMetaData meta_data_;\n  std::unordered_map<char32_t, int32_t> token2id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_CHARACTER_FRONTEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-frontend.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-frontend.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n\n#include <sstream>\n#include <string>\n\nnamespace sherpa_onnx {\n\nstd::string TokenIDs::ToString() const {\n  std::ostringstream os;\n  os << \"TokenIDs(\";\n  os << \"tokens=[\";\n  std::string sep;\n  for (auto i : tokens) {\n    os << sep << i;\n    sep = \", \";\n  }\n  os << \"], \";\n\n  os << \"tones=[\";\n  sep = {};\n  for (auto i : tones) {\n    os << sep << i;\n    sep = \", \";\n  }\n  os << \"]\";\n  os << \")\";\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-frontend.h",
    "content": "// sherpa-onnx/csrc/offline-tts-frontend.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_FRONTEND_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_FRONTEND_H_\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstruct TokenIDs {\n  TokenIDs() = default;\n\n  /*implicit*/ TokenIDs(std::vector<int64_t> tokens)  // NOLINT\n      : tokens{std::move(tokens)} {}\n\n  /*implicit*/ TokenIDs(const std::vector<int32_t> &tokens)  // NOLINT\n      : tokens{tokens.begin(), tokens.end()} {}\n\n  TokenIDs(std::vector<int64_t> tokens,  // NOLINT\n           std::vector<int64_t> tones)   // NOLINT\n      : tokens{std::move(tokens)}, tones{std::move(tones)} {}\n\n  std::string ToString() const;\n\n  std::vector<int64_t> tokens;\n\n  // Used only in MeloTTS\n  std::vector<int64_t> tones;\n};\n\nclass OfflineTtsFrontend {\n public:\n  virtual ~OfflineTtsFrontend() = default;\n\n  /** Convert a string to token IDs.\n   *\n   * @param text The input text.\n   *             Example 1: \"This is the first sample sentence; this is the\n   *             second one.\" Example 2: \"这是第一句。这是第二句。\"\n   * @param voice Optional. It is for espeak-ng.\n   *\n   * @return Return a vector-of-vector of token IDs. 
Each subvector contains\n   *         a sentence that can be processed independently.\n   *         If a frontend does not support splitting the text into sentences,\n   *         the resulting vector contains only one subvector.\n   */\n  virtual std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text, const std::string &voice = \"\") const = 0;\n};\n\n// implementation is in ./piper-phonemize-lexicon.cc\nvoid InitEspeak(const std::string &data_dir);\n\n// implementation in ./piper-phonemize-lexicon.cc\nstd::vector<TokenIDs> ConvertTextToTokenIdsKokoroOrKitten(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    int32_t max_token_len, const std::string &text,\n    const std::string &voice = \"\");\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_FRONTEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-impl.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n\n#include <memory>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-tts-kitten-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-matcha-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-pocket-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::vector<int64_t> OfflineTtsImpl::AddBlank(const std::vector<int64_t> &x,\n                                              int32_t blank_id /*= 0*/) const {\n  // we assume the blank ID is 0\n  std::vector<int64_t> buffer(x.size() * 2 + 1, blank_id);\n  int32_t i = 1;\n  for (auto k : x) {\n    buffer[i] = k;\n    i += 2;\n  }\n  return buffer;\n}\n\nstd::unique_ptr<OfflineTtsImpl> OfflineTtsImpl::Create(\n    const OfflineTtsConfig &config) {\n  if (!config.model.vits.model.empty()) {\n    return std::make_unique<OfflineTtsVitsImpl>(config);\n  } else if (!config.model.matcha.acoustic_model.empty()) {\n    return std::make_unique<OfflineTtsMatchaImpl>(config);\n  } else if (!config.model.zipvoice.encoder.empty() &&\n             !config.model.zipvoice.decoder.empty()) {\n    return std::make_unique<OfflineTtsZipvoiceImpl>(config);\n  } else if (!config.model.kokoro.model.empty()) {\n    return std::make_unique<OfflineTtsKokoroImpl>(config);\n  } else if (!config.model.kitten.model.empty()) {\n    return std::make_unique<OfflineTtsKittenImpl>(config);\n  } else if (!config.model.pocket.lm_flow.empty()) {\n    return std::make_unique<OfflineTtsPocketImpl>(config);\n  } 
else if (!config.model.supertonic.tts_json.empty()) {\n    return std::make_unique<OfflineTtsSupertonicImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a tts model.\");\n\n  return {};\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OfflineTtsImpl> OfflineTtsImpl::Create(\n    Manager *mgr, const OfflineTtsConfig &config) {\n  if (!config.model.vits.model.empty()) {\n    return std::make_unique<OfflineTtsVitsImpl>(mgr, config);\n  } else if (!config.model.matcha.acoustic_model.empty()) {\n    return std::make_unique<OfflineTtsMatchaImpl>(mgr, config);\n  } else if (!config.model.zipvoice.encoder.empty() &&\n             !config.model.zipvoice.decoder.empty()) {\n    return std::make_unique<OfflineTtsZipvoiceImpl>(mgr, config);\n  } else if (!config.model.kokoro.model.empty()) {\n    return std::make_unique<OfflineTtsKokoroImpl>(mgr, config);\n  } else if (!config.model.kitten.model.empty()) {\n    return std::make_unique<OfflineTtsKittenImpl>(mgr, config);\n  } else if (!config.model.pocket.lm_flow.empty()) {\n    return std::make_unique<OfflineTtsPocketImpl>(mgr, config);\n  } else if (!config.model.supertonic.tts_json.empty()) {\n    return std::make_unique<OfflineTtsSupertonicImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a tts model.\");\n  return {};\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OfflineTtsImpl> OfflineTtsImpl::Create(\n    AAssetManager *mgr, const OfflineTtsConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OfflineTtsImpl> OfflineTtsImpl::Create(\n    NativeResourceManager *mgr, const OfflineTtsConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-impl.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_IMPL_H_\n\n#include <memory>\n#include <stdexcept>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsImpl {\n public:\n  virtual ~OfflineTtsImpl() = default;\n\n  static std::unique_ptr<OfflineTtsImpl> Create(const OfflineTtsConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OfflineTtsImpl> Create(Manager *mgr,\n                                                const OfflineTtsConfig &config);\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  virtual GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n      GeneratedAudioCallback callback = nullptr) const {\n    SHERPA_ONNX_LOGE(\"Not implemented yet. Only some models support this\");\n    SHERPA_ONNX_LOGE(\"Please use sherpa-onnx > v1.12.30\");\n    return {};\n  }\n\n  virtual GeneratedAudio Generate(\n      const std::string &text, const GenerationConfig &config,\n      GeneratedAudioCallback callback = nullptr) const {\n    SHERPA_ONNX_LOGE(\"Not implemented yet. Only some models support this\");\n    return {};\n  }\n\n  virtual GeneratedAudio Generate(\n      const std::string &text, const std::string &prompt_text,\n      const std::vector<float> &prompt_samples, int32_t sample_rate,\n      float speed = 1.0, int32_t num_step = 4,\n      GeneratedAudioCallback callback = nullptr) const {\n    SHERPA_ONNX_LOGE(\"Not implemented yet. 
Only some models support this\");\n    return {};\n  }\n\n  // Return the sample rate of the generated audio\n  virtual int32_t SampleRate() const = 0;\n\n  // Number of supported speakers.\n  // If it supports only a single speaker, then it returns 0 or 1.\n  virtual int32_t NumSpeakers() const { return 1; }\n\n  std::vector<int64_t> AddBlank(const std::vector<int64_t> &x,\n                                int32_t blank_id = 0) const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_IMPL_H_\n\n#include <iomanip>\n#include <ios>\n#include <memory>\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/lexicon.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model.h\"\n#include \"sherpa-onnx/csrc/piper-phonemize-lexicon.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKittenImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsKittenImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsKittenModel>(config.model)) {\n    InitFrontend();\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"Loading FST archives\");\n      }\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) 
{\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(f));\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }\n      }\n\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n      }\n    }\n  }\n\n  template <typename Manager>\n  OfflineTtsKittenImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsKittenModel>(mgr, config.model)) {\n    InitFrontend(mgr);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", 
f.c_str());\n#endif\n        }\n\n        auto buf = ReadFile(mgr, f);\n\n        std::unique_ptr<std::istream> s(\n            new std::istringstream(std::string(buf.data(), buf.size())));\n\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }  // for (; !reader->Done(); reader->Next())\n      }    // for (const auto &f : files)\n    }      // if (!config.rule_fars.empty())\n  }\n\n  int32_t SampleRate() const override {\n    return model_->GetMetaData().sample_rate;\n  }\n\n  int32_t NumSpeakers() const override {\n    return model_->GetMetaData().num_speakers;\n  }\n\n  // Supported options in GenerationConfig:\n  //   - sid: Speaker ID for multi-speaker models\n  //   - speed: Speech speed factor (default: 1.0)\n  //   - silence_scale: Scale applied to pauses in the generated audio\n  //\n  // Supported extra options in config.extra:\n  //   - None\n  GeneratedAudio Generate(\n      const std::string &_text, const GenerationConfig &gen_config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", gen_config.ToString().c_str());\n    }\n\n    int64_t sid = gen_config.sid;\n    float speed = gen_config.speed;\n    if (speed <= 0) {\n      SHERPA_ONNX_LOGE(\"Speed must be > 0. Given: %f\", speed);\n      return {};\n    }\n\n    const auto &meta_data = model_->GetMetaData();\n    int32_t num_speakers = meta_data.num_speakers;\n\n    if (num_speakers == 0 && sid != 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%{public}d. 
sid is ignored\",\n          static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#endif\n    }\n\n    if (num_speakers != 0 && (sid >= num_speakers || sid < 0)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %{public}d speakers. sid should be in the \"\n          \"range [%{public}d, %{public}d]. Given: %{public}d. Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %d speakers. sid should be in the range \"\n          \"[%d, %d]. Given: %d. Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#endif\n      sid = 0;\n    }\n\n    std::string text = _text;\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Raw text: %{public}s\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Raw text: %s\", text.c_str());\n#endif\n      std::ostringstream os;\n      os << \"In bytes (hex):\\n\";\n      const auto p = reinterpret_cast<const uint8_t *>(text.c_str());\n      for (int32_t i = 0; i != text.size(); ++i) {\n        os << std::setw(2) << std::setfill('0') << std::hex\n           << static_cast<uint32_t>(p[i]) << \" \";\n      }\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    if (!tn_list_.empty()) {\n      for (const auto &tn : tn_list_) {\n        text = tn->Normalize(text);\n        if (config_.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"After normalizing: %{public}s\", text.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"After normalizing: %s\", text.c_str());\n#endif\n        }\n      }\n    }\n\n    std::vector<TokenIDs> token_ids =\n        
frontend_->ConvertTextToTokenIds(text, meta_data.voice);\n\n    if (token_ids.empty() ||\n        (token_ids.size() == 1 && token_ids[0].tokens.empty())) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Failed to convert '%{public}s' to token IDs\",\n                       text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Failed to convert '%s' to token IDs\", text.c_str());\n#endif\n      return {};\n    }\n\n    std::vector<std::vector<int64_t>> x;\n\n    x.reserve(token_ids.size());\n\n    for (auto &i : token_ids) {\n      x.push_back(std::move(i.tokens));\n    }\n\n    int32_t x_size = static_cast<int32_t>(x.size());\n\n    if (config_.max_num_sentences != 1) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"max_num_sentences (%{public}d) != 1 is ignored for Kitten TTS \"\n          \"models\",\n          config_.max_num_sentences);\n#else\n      SHERPA_ONNX_LOGE(\n          \"max_num_sentences (%d) != 1 is ignored for Kitten TTS models\",\n          config_.max_num_sentences);\n#endif\n    }\n\n    // To bound memory use on long input, sentences are processed in batches.\n    // Kitten TTS always uses a batch size of 1; config_.max_num_sentences is\n    // ignored (see the warning above).\n    std::vector<std::vector<int64_t>> batch_x;\n\n    int32_t batch_size = 1;\n    batch_x.reserve(batch_size);\n    int32_t num_batches = x_size / batch_size;\n\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Split it into %{public}d batches. batch size: \"\n          \"%{public}d. Number of sentences: %{public}d\",\n          num_batches, batch_size, x_size);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Split it into %d batches. batch size: %d. 
Number \"\n          \"of sentences: %d\",\n          num_batches, batch_size, x_size);\n#endif\n    }\n\n    GeneratedAudio ans;\n\n    int32_t should_continue = 1;\n\n    int32_t k = 0;\n\n    for (int32_t b = 0; b != num_batches && should_continue; ++b) {\n      batch_x.clear();\n      for (int32_t i = 0; i != batch_size; ++i, ++k) {\n        batch_x.push_back(std::move(x[k]));\n      }\n\n      auto audio = Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        should_continue = callback(audio.samples.data(), audio.samples.size(),\n                                   (b + 1) * 1.0 / num_batches);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    batch_x.clear();\n    while (k < static_cast<int32_t>(x.size()) && should_continue) {\n      batch_x.push_back(std::move(x[k]));\n\n      ++k;\n    }\n\n    if (!batch_x.empty()) {\n      auto audio = Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        callback(audio.samples.data(), audio.samples.size(), 1.0);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    return ans;\n  }\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n      
GeneratedAudioCallback callback = nullptr) const override {\n    GenerationConfig gen_config;\n    gen_config.sid = sid;\n    gen_config.speed = speed;\n    gen_config.silence_scale = config_.silence_scale;\n    return Generate(text, gen_config, std::move(callback));\n  }\n\n private:\n  template <typename Manager>\n  void InitFrontend(Manager *mgr) {\n    const auto &meta_data = model_->GetMetaData();\n    frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n        mgr, config_.model.kitten.tokens, config_.model.kitten.data_dir,\n        meta_data);\n  }\n\n  void InitFrontend() {\n    const auto &meta_data = model_->GetMetaData();\n    frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n        config_.model.kitten.tokens, config_.model.kitten.data_dir, meta_data);\n  }\n\n  GeneratedAudio Process(const std::vector<std::vector<int64_t>> &tokens,\n                         int32_t sid, float speed,\n                         float silence_scale) const {\n    int32_t num_tokens = 0;\n    for (const auto &k : tokens) {\n      num_tokens += k.size();\n    }\n\n    std::vector<int64_t> x;\n    x.reserve(num_tokens);\n    for (const auto &k : tokens) {\n      x.insert(x.end(), k.begin(), k.end());\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, static_cast<int32_t>(x.size())};\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value audio = model_->Run(std::move(x_tensor), sid, speed);\n\n    std::vector<int64_t> audio_shape =\n        audio.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t total = 1;\n    // The output shape may be (1, 1, total) or (1, total) or (total,)\n    for (auto i : audio_shape) {\n      total *= i;\n    }\n\n    const float *p = audio.GetTensorData<float>();\n\n    GeneratedAudio ans;\n    ans.sample_rate = model_->GetMetaData().sample_rate;\n    
ans.samples = std::vector<float>(p, p + total);\n\n    if (silence_scale != 1) {\n      ans = ans.ScaleSilence(silence_scale);\n    }\n\n    return ans;\n  }\n\n private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsKittenModel> model_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> tn_list_;\n  std::unique_ptr<OfflineTtsFrontend> frontend_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsKittenModelConfig::Register(ParseOptions *po) {\n  po->Register(\"kitten-model\", &model, \"Path to kitten model\");\n  po->Register(\"kitten-voices\", &voices,\n               \"Path to voices.bin for kitten models\");\n  po->Register(\"kitten-tokens\", &tokens,\n               \"Path to tokens.txt for kitten models\");\n  po->Register(\"kitten-data-dir\", &data_dir,\n               \"Path to the directory containing dict for espeak-ng.\");\n  po->Register(\"kitten-length-scale\", &length_scale,\n               \"Inverse of speech speed. Larger->Slower; Smaller->faster.\");\n}\n\nbool OfflineTtsKittenModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kitten-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"--kitten-model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (voices.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kitten-voices\");\n    return false;\n  }\n\n  if (!FileExists(voices)) {\n    SHERPA_ONNX_LOGE(\"--kitten-voices: '%s' does not exist\", voices.c_str());\n    return false;\n  }\n\n  if (tokens.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kitten-tokens\");\n    return false;\n  }\n\n  if (!FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\"--kitten-tokens: '%s' does not exist\", tokens.c_str());\n    return false;\n  }\n\n  if (data_dir.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kitten-data-dir\");\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phontab\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phontab' does not 
exist. Please check --kitten-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phonindex\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phonindex' does not exist. Please check --kitten-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phondata\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phondata' does not exist. Please check --kitten-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/intonations\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/intonations' does not exist. Please check --kitten-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (length_scale <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please provide a positive length_scale for --kitten-length-scale. \"\n        \"Given: %.3f\",\n        length_scale);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsKittenModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsKittenModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"voices=\\\"\" << voices << \"\\\", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"data_dir=\\\"\" << data_dir << \"\\\", \";\n  os << \"length_scale=\" << length_scale << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsKittenModelConfig {\n  std::string model;\n  std::string voices;\n  std::string tokens;\n\n  std::string data_dir;\n  // speed = 1 / length_scale\n  float length_scale = 1.0;\n\n  OfflineTtsKittenModelConfig() = default;\n\n  OfflineTtsKittenModelConfig(const std::string &model,\n                              const std::string &voices,\n                              const std::string &tokens,\n                              const std::string &data_dir, float length_scale)\n      : model(model),\n        voices(voices),\n        tokens(tokens),\n        data_dir(data_dir),\n        length_scale(length_scale) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// please refer to\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/kitten-tts/nano_v0_1/add_meta_data.py\nstruct OfflineTtsKittenModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t num_speakers = 0;\n  int32_t version = 1;\n  int32_t has_espeak = 1;\n\n  int32_t max_token_len = 256;\n\n  std::string voice;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKittenModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto model_buf = ReadFile(config.kitten.model);\n    auto voices_buf = ReadFile(config.kitten.voices);\n    Init(model_buf.data(), model_buf.size(), voices_buf.data(),\n         voices_buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto model_buf = ReadFile(mgr, config.kitten.model);\n    auto voices_buf = ReadFile(mgr, config.kitten.voices);\n    Init(model_buf.data(), model_buf.size(), voices_buf.data(),\n         voices_buf.size());\n  }\n\n  const OfflineTtsKittenModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n  Ort::Value Run(Ort::Value x, int32_t sid, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only 
batch_size == 1. Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    int32_t num_speakers = meta_data_.num_speakers;\n    int32_t dim1 = style_dim_[1];\n\n    /*const*/ float *p = styles_.data() + sid * dim1;\n\n    std::array<int64_t, 2> style_embedding_shape = {1, dim1};\n    Ort::Value style_embedding = Ort::Value::CreateTensor(\n        memory_info, p, dim1, style_embedding_shape.data(),\n        style_embedding_shape.size());\n\n    int64_t speed_shape = 1;\n    if (config_.kitten.length_scale != 1 && speed == 1) {\n      speed = 1. / config_.kitten.length_scale;\n    }\n\n    Ort::Value speed_tensor =\n        Ort::Value::CreateTensor(memory_info, &speed, 1, &speed_shape, 1);\n\n    std::array<Ort::Value, 3> inputs = {\n        std::move(x), std::move(style_embedding), std::move(speed_tensor)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length, const char *voices_data,\n            size_t voices_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---kitten model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto 
&s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR(model_type, \"model_type\");\n    if (model_type != \"kitten-tts\") {\n      SHERPA_ONNX_LOGE(\n          \"Please download the kitten tts model from us containing meta data\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.version, \"version\", 1);\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_speakers, \"n_speakers\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.has_espeak, \"has_espeak\");\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.voice, \"voice\",\n                                                \"en-us\");\n    if (meta_data_.has_espeak != 1) {\n      SHERPA_ONNX_LOGE(\"It should require espeak-ng\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      std::vector<std::string> speaker_names;\n      SHERPA_ONNX_READ_META_DATA_VEC_STRING(speaker_names, \"speaker_names\");\n      std::ostringstream os;\n      os << \"\\n\";\n      for (int32_t i = 0; i != speaker_names.size(); ++i) {\n        os << i << \"->\" << speaker_names[i] << \", \";\n      }\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    SHERPA_ONNX_READ_META_DATA_VEC(style_dim_, \"style_dim\");\n    if (style_dim_.size() != 2) {\n      SHERPA_ONNX_LOGE(\"style_dim should be 2-d, given: %d\",\n                       static_cast<int32_t>(style_dim_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (style_dim_[0] != 1) {\n      
SHERPA_ONNX_LOGE(\"style_dim[0] should be 1, given: %d\", style_dim_[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    int32_t actual_num_floats = voices_data_length / sizeof(float);\n    int32_t expected_num_floats =\n        style_dim_[0] * style_dim_[1] * meta_data_.num_speakers;\n\n    if (actual_num_floats != expected_num_floats) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Corrupted --kitten-voices '%{public}s'. Expected #floats: \"\n          \"%{public}d, actual: %{public}d\",\n          config_.kitten.voices.c_str(), expected_num_floats,\n          actual_num_floats);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Corrupted --kitten-voices '%s'. Expected #floats: %d, actual: %d\",\n          config_.kitten.voices.c_str(), expected_num_floats,\n          actual_num_floats);\n#endif\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    styles_ = std::vector<float>(\n        reinterpret_cast<const float *>(voices_data),\n        reinterpret_cast<const float *>(voices_data) + expected_num_floats);\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineTtsKittenModelMetaData meta_data_;\n  std::vector<int32_t> style_dim_;\n\n  // (num_speakers, style_dim_[1])\n  std::vector<float> styles_;\n};\n\nOfflineTtsKittenModel::OfflineTtsKittenModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsKittenModel::OfflineTtsKittenModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsKittenModel::~OfflineTtsKittenModel() = default;\n\nconst OfflineTtsKittenModelMetaData 
&OfflineTtsKittenModel::GetMetaData()\n    const {\n  return impl_->GetMetaData();\n}\n\nOrt::Value OfflineTtsKittenModel::Run(Ort::Value x, int64_t sid /*= 0*/,\n                                      float speed /*= 1.0*/) const {\n  return impl_->Run(std::move(x), sid, speed);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsKittenModel::OfflineTtsKittenModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsKittenModel::OfflineTtsKittenModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kitten-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kitten-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_H_\n\n#include <memory>\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKittenModel {\n public:\n  ~OfflineTtsKittenModel();\n\n  explicit OfflineTtsKittenModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsKittenModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  // @params x An int64 tensor of shape (1, num_tokens)\n  // @return Return a float32 tensor containing the\n  //         samples of shape (num_samples,)\n  Ort::Value Run(Ort::Value x, int64_t sid = 0, float speed = 1.0) const;\n\n  const OfflineTtsKittenModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KITTEN_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_IMPL_H_\n\n#include <iomanip>\n#include <ios>\n#include <memory>\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/kokoro-multi-lang-lexicon.h\"\n#include \"sherpa-onnx/csrc/lexicon.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model.h\"\n#include \"sherpa-onnx/csrc/piper-phonemize-lexicon.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKokoroImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsKokoroImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsKokoroModel>(config.model)) {\n    InitFrontend();\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"Loading FST archives\");\n      }\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n      
tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(f));\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }\n      }\n\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n      }\n    }\n  }\n\n  template <typename Manager>\n  OfflineTtsKokoroImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsKokoroModel>(mgr, config.model)) {\n    InitFrontend(mgr);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: 
%{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n\n        auto buf = ReadFile(mgr, f);\n\n        std::unique_ptr<std::istream> s(\n            new std::istringstream(std::string(buf.data(), buf.size())));\n\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }  // for (; !reader->Done(); reader->Next())\n      }    // for (const auto &f : files)\n    }      // if (!config.rule_fars.empty())\n  }\n\n  int32_t SampleRate() const override {\n    return model_->GetMetaData().sample_rate;\n  }\n\n  int32_t NumSpeakers() const override {\n    return model_->GetMetaData().num_speakers;\n  }\n\n  // Supported options in GenerationConfig:\n  //   - sid: Speaker ID for multi-speaker models\n  //   - speed: Speech speed factor. If left at 1.0, it falls back to the\n  //            default implied by kokoro.length_scale.\n  //   - silence_scale: Scale applied to pauses in the generated audio. If left\n  //                    at 0.2, it falls back to OfflineTtsConfig.silence_scale.\n  //\n  // Supported extra options in config.extra:\n  //   - lang: Language override for Kokoro >= 1.0. 
Defaults to\n  //           kokoro.lang if provided, otherwise meta_data.voice.\n  GeneratedAudio Generate(\n      const std::string &_text, const GenerationConfig &gen_config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", gen_config.ToString().c_str());\n    }\n\n    int64_t sid = gen_config.sid;\n    float speed = gen_config.speed;\n    if (speed <= 0) {\n      SHERPA_ONNX_LOGE(\"Speed must be > 0. Given: %f\", speed);\n      return {};\n    }\n\n    const auto &meta_data = model_->GetMetaData();\n    int32_t num_speakers = meta_data.num_speakers;\n\n    if (num_speakers == 0 && sid != 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%{public}d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#endif\n    }\n\n    if (num_speakers != 0 && (sid >= num_speakers || sid < 0)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %{public}d speakers. sid should be in the \"\n          \"range [%{public}d, %{public}d]. Given: %{public}d. Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %d speakers. sid should be in the range \"\n          \"[%d, %d]. Given: %d. 
Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#endif\n      sid = 0;\n    }\n\n    std::string text = _text;\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Raw text: %{public}s\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Raw text: %s\", text.c_str());\n#endif\n      std::ostringstream os;\n      os << \"In bytes (hex):\\n\";\n      const auto p = reinterpret_cast<const uint8_t *>(text.c_str());\n      for (int32_t i = 0; i != text.size(); ++i) {\n        os << std::setw(2) << std::setfill('0') << std::hex\n           << static_cast<uint32_t>(p[i]) << \" \";\n      }\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    if (!tn_list_.empty()) {\n      for (const auto &tn : tn_list_) {\n        text = tn->Normalize(text);\n        if (config_.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"After normalizing: %{public}s\", text.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"After normalizing: %s\", text.c_str());\n#endif\n        }\n      }\n    }\n\n    std::string lang = gen_config.GetExtraString(\"lang\");\n    if (lang.empty()) {\n      lang = config_.model.kokoro.lang.empty() ? 
meta_data.voice\n                                               : config_.model.kokoro.lang;\n    }\n\n    std::vector<TokenIDs> token_ids = frontend_->ConvertTextToTokenIds(\n        text, lang);\n\n    if (token_ids.empty() ||\n        (token_ids.size() == 1 && token_ids[0].tokens.empty())) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Failed to convert '%{public}s' to token IDs\",\n                       text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Failed to convert '%s' to token IDs\", text.c_str());\n#endif\n      return {};\n    }\n\n    std::vector<std::vector<int64_t>> x;\n\n    x.reserve(token_ids.size());\n\n    for (auto &i : token_ids) {\n      x.push_back(std::move(i.tokens));\n    }\n\n    int32_t x_size = static_cast<int32_t>(x.size());\n\n    if (config_.max_num_sentences != 1) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"max_num_sentences (%{public}d) != 1 is ignored for Kokoro TTS \"\n          \"models\",\n          config_.max_num_sentences);\n#else\n      SHERPA_ONNX_LOGE(\n          \"max_num_sentences (%d) != 1 is ignored for Kokoro TTS models\",\n          config_.max_num_sentences);\n#endif\n    }\n\n    // To avoid OOM on long input, sentences are processed in batches.\n    // For Kokoro the batch size is fixed at 1; config_.max_num_sentences\n    // is ignored (see the warning above).\n    std::vector<std::vector<int64_t>> batch_x;\n\n    int32_t batch_size = 1;\n    batch_x.reserve(config_.max_num_sentences);\n    int32_t num_batches = x_size / batch_size;\n\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Split it into %{public}d batches. batch size: \"\n          \"%{public}d. Number of sentences: %{public}d\",\n          num_batches, batch_size, x_size);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Split it into %d batches. batch size: %d. 
Number \"\n          \"of sentences: %d\",\n          num_batches, batch_size, x_size);\n#endif\n    }\n\n    GeneratedAudio ans;\n\n    int32_t should_continue = 1;\n\n    int32_t k = 0;\n\n    for (int32_t b = 0; b != num_batches && should_continue; ++b) {\n      batch_x.clear();\n      for (int32_t i = 0; i != batch_size; ++i, ++k) {\n        batch_x.push_back(std::move(x[k]));\n      }\n\n      auto audio =\n          Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        should_continue = callback(audio.samples.data(), audio.samples.size(),\n                                   (b + 1) * 1.0 / num_batches);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    batch_x.clear();\n    while (k < static_cast<int32_t>(x.size()) && should_continue) {\n      batch_x.push_back(std::move(x[k]));\n\n      ++k;\n    }\n\n    if (!batch_x.empty()) {\n      auto audio =\n          Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        callback(audio.samples.data(), audio.samples.size(), 1.0);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    return ans;\n  }\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n  
    GeneratedAudioCallback callback = nullptr) const override {\n    GenerationConfig gen_config;\n    gen_config.sid = sid;\n    gen_config.speed = speed;\n    gen_config.silence_scale = config_.silence_scale;\n    if (!config_.model.kokoro.lang.empty()) {\n      gen_config.extra[\"lang\"] = config_.model.kokoro.lang;\n    }\n\n    return Generate(text, gen_config, std::move(callback));\n  }\n\n private:\n  template <typename Manager>\n  void InitFrontend(Manager *mgr) {\n    const auto &meta_data = model_->GetMetaData();\n\n    if (meta_data.version >= 2) {\n      // this is a multi-lingual model, we require that you pass lexicon\n      if (config_.model.kokoro.lexicon.empty() &&\n          config_.model.kokoro.lang.empty()) {\n        SHERPA_ONNX_LOGE(\"Current model version: '%d'\", meta_data.version);\n        SHERPA_ONNX_LOGE(\n            \"You are using a multi-lingual Kokoro model (e.g., Kokoro >= \"\n            \"v1.0). Please pass --kokoro-lexicon or provide --kokoro-lang\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      frontend_ = std::make_unique<KokoroMultiLangLexicon>(\n          mgr, config_.model.kokoro.tokens, config_.model.kokoro.lexicon,\n          config_.model.kokoro.data_dir, meta_data, config_.model.debug);\n\n      return;\n    }\n\n    frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n        mgr, config_.model.kokoro.tokens, config_.model.kokoro.data_dir,\n        meta_data);\n  }\n\n  void InitFrontend() {\n    const auto &meta_data = model_->GetMetaData();\n    if (meta_data.version >= 2) {\n      // this is a multi-lingual model, we require that you pass lexicon\n      if (config_.model.kokoro.lexicon.empty() &&\n          config_.model.kokoro.lang.empty()) {\n        SHERPA_ONNX_LOGE(\"Current model version: '%d'\", meta_data.version);\n        SHERPA_ONNX_LOGE(\n            \"You are using a multi-lingual Kokoro model (e.g., Kokoro >= \"\n            \"v1.0). 
Please pass --kokoro-lexicon or --kokoro-lang\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      frontend_ = std::make_unique<KokoroMultiLangLexicon>(\n          config_.model.kokoro.tokens, config_.model.kokoro.lexicon,\n          config_.model.kokoro.data_dir, meta_data, config_.model.debug);\n\n      return;\n    }\n\n    // this is for kokoro v0.19, which supports only English\n    frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n        config_.model.kokoro.tokens, config_.model.kokoro.data_dir, meta_data);\n  }\n\n  GeneratedAudio Process(const std::vector<std::vector<int64_t>> &tokens,\n                         int32_t sid, float speed,\n                         float silence_scale) const {\n    int32_t num_tokens = 0;\n    for (const auto &k : tokens) {\n      num_tokens += k.size();\n    }\n\n    std::vector<int64_t> x;\n    x.reserve(num_tokens);\n    for (const auto &k : tokens) {\n      x.insert(x.end(), k.begin(), k.end());\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, static_cast<int32_t>(x.size())};\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value audio = model_->Run(std::move(x_tensor), sid, speed);\n\n    std::vector<int64_t> audio_shape =\n        audio.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t total = 1;\n    // The output shape may be (1, 1, total) or (1, total) or (total,)\n    for (auto i : audio_shape) {\n      total *= i;\n    }\n\n    const float *p = audio.GetTensorData<float>();\n\n    GeneratedAudio ans;\n    ans.sample_rate = model_->GetMetaData().sample_rate;\n    ans.samples = std::vector<float>(p, p + total);\n\n    if (silence_scale == 0.2f) {\n      silence_scale = config_.silence_scale;\n    }\n\n    if (silence_scale != 1) {\n      ans = ans.ScaleSilence(silence_scale);\n    }\n\n    return ans;\n  }\n\n 
private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsKokoroModel> model_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> tn_list_;\n  std::unique_ptr<OfflineTtsFrontend> frontend_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsKokoroModelConfig::Register(ParseOptions *po) {\n  po->Register(\"kokoro-model\", &model, \"Path to Kokoro model\");\n  po->Register(\"kokoro-voices\", &voices,\n               \"Path to voices.bin for Kokoro models\");\n  po->Register(\"kokoro-tokens\", &tokens,\n               \"Path to tokens.txt for Kokoro models\");\n  po->Register(\"kokoro-lang\", &lang,\n               \"Used only by kokoro >= 1.0. Example values: \"\n               \"en (English), \"\n               \"es (Spanish), fr (French), hi (hindi), it (Italian), \"\n               \"pt-br (Brazilian Portuguese).\"\n               \"You can leave it empty, in which case you need to provide \"\n               \"--kokoro-lexicon.\");\n  po->Register(\n      \"kokoro-lexicon\", &lexicon,\n      \"Path to lexicon.txt for Kokoro models. Used only for Kokoro >= v1.0\"\n      \"You can pass multiple files, separated by ','. Example: \"\n      \"./lexicon-us-en.txt,./lexicon-zh.txt\");\n  po->Register(\"kokoro-data-dir\", &data_dir,\n               \"Path to the directory containing dict for espeak-ng.\");\n  po->Register(\"kokoro-dict-dir\", &dict_dir,\n               \"Not used. You don't need to provide a value for it\");\n  po->Register(\"kokoro-length-scale\", &length_scale,\n               \"Speech speed. 
Larger->Slower; Smaller->faster.\");\n}\n\nbool OfflineTtsKokoroModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kokoro-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"--kokoro-model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (tokens.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kokoro-tokens\");\n    return false;\n  }\n\n  if (!FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\"--kokoro-tokens: '%s' does not exist\", tokens.c_str());\n    return false;\n  }\n\n  if (!lexicon.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\n            \"lexicon '%s' does not exist. Please re-check --kokoro-lexicon\",\n            f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (data_dir.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --kokoro-data-dir\");\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phontab\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phontab' does not exist. Please check --kokoro-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phonindex\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phonindex' does not exist. Please check --kokoro-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/phondata\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/phondata' does not exist. Please check --kokoro-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!FileExists(data_dir + \"/intonations\")) {\n    SHERPA_ONNX_LOGE(\n        \"'%s/intonations' does not exist. 
Please check --kokoro-data-dir\",\n        data_dir.c_str());\n    return false;\n  }\n\n  if (!dict_dir.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"From sherpa-onnx v1.12.15, you don't need to provide dict_dir or \"\n        \"dictDir for this model. Ignore this value.\");\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsKokoroModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsKokoroModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"voices=\\\"\" << voices << \"\\\", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"lexicon=\\\"\" << lexicon << \"\\\", \";\n  os << \"data_dir=\\\"\" << data_dir << \"\\\", \";\n  os << \"length_scale=\" << length_scale << \", \";\n  os << \"lang=\\\"\" << lang << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsKokoroModelConfig {\n  std::string model;\n  std::string voices;\n  std::string tokens;\n\n  // Note: You can pass multiple files, separated by \",\", to lexicon\n  // Example: lexicon = \"./lexicon-gb-en.txt,./lexicon-zh.txt\";\n  std::string lexicon;\n\n  std::string data_dir;\n\n  std::string dict_dir;\n\n  // speed = 1 / length_scale\n  float length_scale = 1.0;\n\n  // Used only for Kokoro >= 1.0.\n  //\n  // If it is not empty, meta_data.voice is ignored.\n  // Example values: es (Spanish), fr (French), pt (Portuguese)\n  // See https://hf-mirror.com/hexgrad/Kokoro-82M/blob/main/VOICES.md\n  std::string lang;\n\n  OfflineTtsKokoroModelConfig() = default;\n\n  OfflineTtsKokoroModelConfig(const std::string &model,\n                              const std::string &voices,\n                              const std::string &tokens,\n                              const std::string &lexicon,\n                              const std::string &data_dir,\n                              const std::string &dict_dir, float length_scale,\n                              const std::string &lang)\n      : model(model),\n        voices(voices),\n        tokens(tokens),\n        lexicon(lexicon),\n        data_dir(data_dir),\n        dict_dir(dict_dir),\n        length_scale(length_scale),\n        lang(lang) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// please refer to\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/kokoro/v0.19/add_meta_data.py\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/kokoro/v1.0/add_meta_data.py\n// https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/kokoro/v1.1-zh/add_meta_data.py\nstruct OfflineTtsKokoroModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t num_speakers = 0;\n  int32_t version = 1;\n  int32_t has_espeak = 1;\n  int32_t max_token_len = 0;\n\n  std::string voice;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKokoroModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto model_buf = ReadFile(config.kokoro.model);\n    auto voices_buf = ReadFile(config.kokoro.voices);\n    Init(model_buf.data(), model_buf.size(), voices_buf.data(),\n         voices_buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto model_buf = ReadFile(mgr, config.kokoro.model);\n    auto voices_buf = ReadFile(mgr, config.kokoro.voices);\n    Init(model_buf.data(), model_buf.size(), voices_buf.data(),\n         voices_buf.size());\n  }\n\n  const OfflineTtsKokoroModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n  Ort::Value Run(Ort::Value x, int32_t sid, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only 
batch_size == 1. Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    // x is padded with a 0 at the front and at the end\n    int32_t len = static_cast<int32_t>(x_shape[1]) - 2;\n    int32_t num_speakers = meta_data_.num_speakers;\n    int32_t dim0 = style_dim_[0];\n    int32_t dim1 = style_dim_[2];\n    if (len >= dim0) {\n      SHERPA_ONNX_LOGE(\n          \"Too many tokens: %d. It should be less than style_dim[0]: %d\", len,\n          dim0);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    /*const*/ float *p = styles_.data() + sid * dim0 * dim1 + len * dim1;\n\n    std::array<int64_t, 2> style_embedding_shape = {1, dim1};\n    Ort::Value style_embedding = Ort::Value::CreateTensor(\n        memory_info, p, dim1, style_embedding_shape.data(),\n        style_embedding_shape.size());\n\n    int64_t speed_shape = 1;\n    if (config_.kokoro.length_scale != 1 && speed == 1) {\n      speed = 1. / config_.kokoro.length_scale;\n    }\n\n    Ort::Value speed_tensor =\n        Ort::Value::CreateTensor(memory_info, &speed, 1, &speed_shape, 1);\n\n    std::array<Ort::Value, 3> inputs = {\n        std::move(x), std::move(style_embedding), std::move(speed_tensor)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length, const char *voices_data,\n            size_t voices_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---kokoro model---\\n\";\n      PrintModelMetadata(os, 
meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.version, \"version\", 1);\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_speakers, \"n_speakers\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.has_espeak, \"has_espeak\");\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.voice, \"voice\",\n                                                \"en-us\");\n\n    if (config_.debug) {\n      std::vector<std::string> speaker_names;\n      SHERPA_ONNX_READ_META_DATA_VEC_STRING(speaker_names, \"speaker_names\");\n      std::ostringstream os;\n      os << \"\\n\";\n      for (int32_t i = 0; i != speaker_names.size(); ++i) {\n        os << i << \"->\" << speaker_names[i] << \", \";\n      }\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    SHERPA_ONNX_READ_META_DATA_VEC(style_dim_, \"style_dim\");\n    if (style_dim_.size() != 3) {\n      SHERPA_ONNX_LOGE(\"style_dim should be 3-d, given: %d\",\n                       static_cast<int32_t>(style_dim_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (style_dim_[1] != 1) {\n      SHERPA_ONNX_LOGE(\"style_dim[1] should be 1, given: %d\", style_dim_[1]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    int32_t 
actual_num_floats = voices_data_length / sizeof(float);\n    int32_t expected_num_floats =\n        style_dim_[0] * style_dim_[2] * meta_data_.num_speakers;\n\n    if (actual_num_floats != expected_num_floats) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Corrupted --kokoro-voices '%{public}s'. Expected #floats: \"\n          \"%{public}d, actual: %{public}d\",\n          config_.kokoro.voices.c_str(), expected_num_floats,\n          actual_num_floats);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Corrupted --kokoro-voices '%s'. Expected #floats: %d, actual: %d\",\n          config_.kokoro.voices.c_str(), expected_num_floats,\n          actual_num_floats);\n#endif\n\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    styles_ = std::vector<float>(\n        reinterpret_cast<const float *>(voices_data),\n        reinterpret_cast<const float *>(voices_data) + expected_num_floats);\n\n    meta_data_.max_token_len = style_dim_[0];\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineTtsKokoroModelMetaData meta_data_;\n  std::vector<int32_t> style_dim_;\n\n  // (num_speakers, style_dim_[0], style_dim_[2])\n  std::vector<float> styles_;\n};\n\nOfflineTtsKokoroModel::OfflineTtsKokoroModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsKokoroModel::OfflineTtsKokoroModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsKokoroModel::~OfflineTtsKokoroModel() = default;\n\nconst OfflineTtsKokoroModelMetaData &OfflineTtsKokoroModel::GetMetaData()\n    const {\n  return 
impl_->GetMetaData();\n}\n\nOrt::Value OfflineTtsKokoroModel::Run(Ort::Value x, int64_t sid /*= 0*/,\n                                      float speed /*= 1.0*/) const {\n  return impl_->Run(std::move(x), sid, speed);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsKokoroModel::OfflineTtsKokoroModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsKokoroModel::OfflineTtsKokoroModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-kokoro-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-kokoro-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_H_\n\n#include <memory>\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsKokoroModel {\n public:\n  ~OfflineTtsKokoroModel();\n\n  explicit OfflineTtsKokoroModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsKokoroModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  // Return a float32 tensor containing the samples\n  // of shape (batch_size, num_samples)\n  Ort::Value Run(Ort::Value x, int64_t sid = 0, float speed = 1.0) const;\n\n  const OfflineTtsKokoroModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_KOKORO_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/character-lexicon.h\"\n#include \"sherpa-onnx/csrc/lexicon.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/matcha-tts-lexicon.h\"\n#include \"sherpa-onnx/csrc/melo-tts-lexicon.h\"\n#include \"sherpa-onnx/csrc/offline-tts-character-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/piper-phonemize-lexicon.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/vocoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsMatchaImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsMatchaImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsMatchaModel>(config.model)) {\n    const auto &meta_data = model_->GetMetaData();\n    if (meta_data.need_vocoder) {\n      if (config.model.matcha.vocoder.empty()) {\n        SHERPA_ONNX_LOGE(\"Please provide vocoder for this model\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (!FileExists(config.model.matcha.vocoder)) {\n        SHERPA_ONNX_LOGE(\"Please vocoder '%s' does not exist\",\n                         config.model.matcha.vocoder.c_str());\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      vocoder_ = Vocoder::Create(config.model);\n    } else if (!config.model.matcha.vocoder.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You don't 
need to provide vocoder for this model. Ignore it\");\n    }\n\n    InitFrontend();\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"Loading FST archives\");\n      }\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(f));\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }\n      }\n\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n      }\n    }\n\n    if (meta_data.sample_rate == 16000 && meta_data.is_zh_en == 1) {\n      if (!Contains(config.model.matcha.vocoder, \"16\") &&\n          Contains(config.model.matcha.vocoder, \"2\")) {\n        SHERPA_ONNX_LOGE(\n            \"This Chinese+English TTS model requires a 16khz Vocoder.\");\n        SHERPA_ONNX_LOGE(\"You should use 
vocos-16khz-univ.onnx.\");\n        SHERPA_ONNX_LOGE(\n            \"Please re-download a vocoder from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/\"\n            \"vocoder-models.\");\n      }\n    }\n  }\n\n  template <typename Manager>\n  OfflineTtsMatchaImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsMatchaModel>(mgr, config.model)) {\n    const auto &meta_data = model_->GetMetaData();\n    if (meta_data.need_vocoder) {\n      if (config.model.matcha.vocoder.empty()) {\n        SHERPA_ONNX_LOGE(\"Please provide vocoder for this model\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      vocoder_ = Vocoder::Create(mgr, config.model);\n    } else if (!config.model.matcha.vocoder.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You don't need to provide vocoder for this model. Ignore it\");\n    }\n\n    InitFrontend(mgr);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n\n 
       auto buf = ReadFile(mgr, f);\n\n        std::unique_ptr<std::istream> s(\n            new std::istringstream(std::string(buf.data(), buf.size())));\n\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }  // for (; !reader->Done(); reader->Next())\n      }  // for (const auto &f : files)\n    }  // if (!config.rule_fars.empty())\n\n    if (meta_data.sample_rate == 16000 && meta_data.is_zh_en == 1) {\n      if (!Contains(config.model.matcha.vocoder, \"16\") &&\n          Contains(config.model.matcha.vocoder, \"2\")) {\n        SHERPA_ONNX_LOGE(\n            \"This Chinese+English TTS model requires a 16khz Vocoder.\");\n        SHERPA_ONNX_LOGE(\"You should use vocos-16khz-univ.onnx.\");\n        SHERPA_ONNX_LOGE(\n            \"Please re-download a vocoder from \"\n            \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/\"\n            \"vocoder-models.\");\n      }\n    }\n  }\n\n  int32_t SampleRate() const override {\n    return model_->GetMetaData().sample_rate;\n  }\n\n  int32_t NumSpeakers() const override {\n    return model_->GetMetaData().num_speakers;\n  }\n\n  // Supported options in GenerationConfig:\n  //   - sid: Speaker ID for multi-speaker models\n  //   - speed: Speech speed factor (default: 1.0)\n  //   - silence_scale: Scale applied to pauses in the generated audio\n  //\n  // Supported extra options in config.extra:\n  //   - None\n  GeneratedAudio Generate(\n      const std::string &_text, const GenerationConfig &gen_config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", 
gen_config.ToString().c_str());\n    }\n\n    int64_t sid = gen_config.sid;\n    float speed = gen_config.speed;\n    if (speed <= 0) {\n      SHERPA_ONNX_LOGE(\"Speed must be > 0. Given: %f\", speed);\n      return {};\n    }\n\n    const auto &meta_data = model_->GetMetaData();\n    int32_t num_speakers = meta_data.num_speakers;\n\n    if (num_speakers == 0 && sid != 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%{public}d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#endif\n    }\n\n    if (num_speakers != 0 && (sid >= num_speakers || sid < 0)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %{public}d speakers. sid should be in the \"\n          \"range [%{public}d, %{public}d]. Given: %{public}d. Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %d speakers. sid should be in the range \"\n          \"[%d, %d]. Given: %d. 
Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#endif\n      sid = 0;\n    }\n\n    std::string text = _text;\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Raw text: %{public}s\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Raw text: %s\", text.c_str());\n#endif\n    }\n\n    if (!tn_list_.empty()) {\n      for (const auto &tn : tn_list_) {\n        text = tn->Normalize(text);\n        if (config_.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"After normalizing: %{public}s\", text.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"After normalizing: %s\", text.c_str());\n#endif\n        }\n      }\n    }\n\n    std::vector<TokenIDs> token_ids =\n        frontend_->ConvertTextToTokenIds(text, meta_data.voice);\n\n    if (token_ids.empty() ||\n        (token_ids.size() == 1 && token_ids[0].tokens.empty())) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Failed to convert '%{public}s' to token IDs\",\n                       text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Failed to convert '%s' to token IDs\", text.c_str());\n#endif\n      return {};\n    }\n\n    std::vector<std::vector<int64_t>> x;\n\n    x.reserve(token_ids.size());\n\n    for (auto &i : token_ids) {\n      x.push_back(std::move(i.tokens));\n    }\n\n    if (meta_data.add_blank) {\n      for (auto &k : x) {\n        k = AddBlank(k, meta_data.pad_id);\n      }\n    }\n\n    int32_t x_size = static_cast<int32_t>(x.size());\n\n    if (config_.max_num_sentences <= 0 || x_size <= config_.max_num_sentences) {\n      auto ans = Process(x, sid, speed, gen_config.silence_scale);\n      if (callback) {\n        callback(ans.samples.data(), ans.samples.size(), 1.0);\n      }\n      return ans;\n    }\n\n    // the input text is too long, we process sentences within it in batches\n    // to avoid OOM. 
Batch size is config_.max_num_sentences\n    std::vector<std::vector<int64_t>> batch_x;\n\n    int32_t batch_size = config_.max_num_sentences;\n    batch_x.reserve(config_.max_num_sentences);\n    int32_t num_batches = x_size / batch_size;\n\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Text is too long. Split it into %{public}d batches. batch size: \"\n          \"%{public}d. Number of sentences: %{public}d\",\n          num_batches, batch_size, x_size);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Text is too long. Split it into %d batches. batch size: %d. Number \"\n          \"of sentences: %d\",\n          num_batches, batch_size, x_size);\n#endif\n    }\n\n    GeneratedAudio ans;\n\n    int32_t should_continue = 1;\n\n    int32_t k = 0;\n\n    for (int32_t b = 0; b != num_batches && should_continue; ++b) {\n      batch_x.clear();\n      for (int32_t i = 0; i != batch_size; ++i, ++k) {\n        batch_x.push_back(std::move(x[k]));\n      }\n\n      auto audio = Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        should_continue = callback(audio.samples.data(), audio.samples.size(),\n                                   (b + 1) * 1.0 / num_batches);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    batch_x.clear();\n    while (k < static_cast<int32_t>(x.size()) && should_continue) {\n      batch_x.push_back(std::move(x[k]));\n\n      ++k;\n    }\n\n    if (!batch_x.empty()) {\n      auto audio = Process(batch_x, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), 
audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        callback(audio.samples.data(), audio.samples.size(), 1.0);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    return ans;\n  }\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n      GeneratedAudioCallback callback = nullptr) const override {\n    GenerationConfig gen_config;\n    gen_config.sid = sid;\n    gen_config.speed = speed;\n    gen_config.silence_scale = config_.silence_scale;\n    return Generate(text, gen_config, std::move(callback));\n  }\n\n private:\n  template <typename Manager>\n  void InitFrontend(Manager *mgr) {\n    // for piper phonemizer\n    // we require that you copy espeak_ng_data\n    // from assets to disk\n    const auto &meta_data = model_->GetMetaData();\n\n    if (meta_data.is_zh_en) {\n      frontend_ = std::make_unique<MatchaTtsLexicon>(\n          mgr, config_.model.matcha.lexicon, config_.model.matcha.tokens,\n          config_.model.matcha.data_dir, config_.model.debug, false);\n    } else if (meta_data.jieba) {\n      frontend_ = std::make_unique<CharacterLexicon>(\n          mgr, config_.model.matcha.lexicon, config_.model.matcha.tokens,\n          config_.model.debug);\n    } else if (meta_data.has_espeak) {\n      frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n          mgr, config_.model.matcha.tokens, config_.model.matcha.data_dir,\n          meta_data);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported matcha tts model. 
Please ask for help\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void InitFrontend() {\n    const auto &meta_data = model_->GetMetaData();\n\n    if (meta_data.is_zh_en) {\n      frontend_ = std::make_unique<MatchaTtsLexicon>(\n          config_.model.matcha.lexicon, config_.model.matcha.tokens,\n          config_.model.matcha.data_dir, config_.model.debug, false);\n    } else if (meta_data.jieba) {\n      frontend_ = std::make_unique<CharacterLexicon>(\n          config_.model.matcha.lexicon, config_.model.matcha.tokens,\n          config_.model.debug);\n    } else if (meta_data.has_espeak) {\n      frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n          config_.model.matcha.tokens, config_.model.matcha.data_dir,\n          meta_data);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported matcha tts model. Please ask for help\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  GeneratedAudio Process(const std::vector<std::vector<int64_t>> &tokens,\n                         int32_t sid, float speed,\n                         float silence_scale) const {\n    int32_t num_tokens = 0;\n    for (const auto &k : tokens) {\n      num_tokens += k.size();\n    }\n\n    std::vector<int64_t> x;\n    x.reserve(num_tokens);\n    for (const auto &k : tokens) {\n      x.insert(x.end(), k.begin(), k.end());\n    }\n\n    if (config_.model.debug) {\n      std::ostringstream oss;\n      for (int32_t i : x) {\n        oss << i << \", \";\n      }\n      oss << \"\\n\";\n      SHERPA_ONNX_LOGE(\"%s\\n\", oss.str().c_str());\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, static_cast<int32_t>(x.size())};\n    Ort::Value x_tensor = Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    GeneratedAudio ans;\n\n    Ort::Value mel = model_->Run(std::move(x_tensor), sid, speed);\n\n    const auto &meta_data = 
model_->GetMetaData();\n    if (meta_data.need_vocoder) {\n      ans.samples = vocoder_->Run(std::move(mel));\n    } else {\n      std::vector<int64_t> shape = mel.GetTensorTypeAndShapeInfo().GetShape();\n      int64_t num_samples = 1;\n      for (auto s : shape) {\n        num_samples *= s;\n      }\n      ans.samples.resize(num_samples);\n      auto p = mel.GetTensorData<float>();\n      std::copy(p, p + num_samples, ans.samples.data());\n    }\n\n    ans.sample_rate = model_->GetMetaData().sample_rate;\n\n    if (silence_scale != 1) {\n      ans = ans.ScaleSilence(silence_scale);\n    }\n\n    return ans;\n  }\n\n private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsMatchaModel> model_;\n  std::unique_ptr<Vocoder> vocoder_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> tn_list_;\n  std::unique_ptr<OfflineTtsFrontend> frontend_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsMatchaModelConfig::Register(ParseOptions *po) {\n  po->Register(\"matcha-acoustic-model\", &acoustic_model,\n               \"Path to matcha acoustic model\");\n  po->Register(\"matcha-vocoder\", &vocoder, \"Path to matcha vocoder\");\n  po->Register(\n      \"matcha-lexicon\", &lexicon,\n      \"Path to lexicon.txt for Matcha models. You can pass multiple \"\n      \"files separated by comma , e.g., lexicon.txt,lexicon2.txt,lexicon3.txt\");\n  po->Register(\"matcha-tokens\", &tokens,\n               \"Path to tokens.txt for Matcha models\");\n  po->Register(\"matcha-data-dir\", &data_dir,\n               \"Path to the directory containing dict for espeak-ng. If it is \"\n               \"given, --matcha-lexicon is ignored.\");\n  po->Register(\"matcha-dict-dir\", &dict_dir,\n               \"Not used. You don't need to provide a value for it\");\n  po->Register(\"matcha-noise-scale\", &noise_scale,\n               \"noise_scale for Matcha models\");\n  po->Register(\"matcha-length-scale\", &length_scale,\n               \"Speech speed. 
Larger->Slower; Smaller->faster.\");\n}\n\nbool OfflineTtsMatchaModelConfig::Validate() const {\n  if (acoustic_model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --matcha-acoustic-model\");\n    return false;\n  }\n\n  if (!FileExists(acoustic_model)) {\n    SHERPA_ONNX_LOGE(\"--matcha-acoustic-model: '%s' does not exist\",\n                     acoustic_model.c_str());\n    return false;\n  }\n\n  if (tokens.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --matcha-tokens\");\n    return false;\n  }\n\n  if (!FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\"--matcha-tokens: '%s' does not exist\", tokens.c_str());\n    return false;\n  }\n\n  if (!data_dir.empty()) {\n    if (!FileExists(data_dir + \"/phontab\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phontab' does not exist. Please check --matcha-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/phonindex\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phonindex' does not exist. Please check --matcha-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/phondata\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phondata' does not exist. Please check --matcha-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/intonations\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/intonations' does not exist. Please check --matcha-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n  }\n\n  if (!lexicon.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(lexicon, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\n            \"lexicon '%s' does not exist. 
Please re-check --matcha-lexicon\",\n            f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!dict_dir.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"From sherpa-onnx v1.12.15, you don't need to provide dict_dir for \"\n        \"this model. Ignore it\");\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsMatchaModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsMatchaModelConfig(\";\n  os << \"acoustic_model=\\\"\" << acoustic_model << \"\\\", \";\n  os << \"vocoder=\\\"\" << vocoder << \"\\\", \";\n  os << \"lexicon=\\\"\" << lexicon << \"\\\", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"data_dir=\\\"\" << data_dir << \"\\\", \";\n  os << \"noise_scale=\" << noise_scale << \", \";\n  os << \"length_scale=\" << length_scale << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsMatchaModelConfig {\n  std::string acoustic_model;\n  std::string vocoder;\n  std::string lexicon;\n  std::string tokens;\n\n  // If data_dir is given, lexicon is ignored\n  // data_dir is for piper-phonemizer, which uses espeak-ng\n  std::string data_dir;\n\n  // Used for Chinese TTS models using jieba\n  std::string dict_dir;\n\n  float noise_scale = 1;\n  float length_scale = 1;\n\n  OfflineTtsMatchaModelConfig() = default;\n\n  OfflineTtsMatchaModelConfig(const std::string &acoustic_model,\n                              const std::string &vocoder,\n                              const std::string &lexicon,\n                              const std::string &tokens,\n                              const std::string &data_dir,\n                              const std::string &dict_dir,\n                              float noise_scale = 1.0, float length_scale = 1)\n      : acoustic_model(acoustic_model),\n        vocoder(vocoder),\n        lexicon(lexicon),\n        tokens(tokens),\n        data_dir(data_dir),\n        dict_dir(dict_dir),\n        noise_scale(noise_scale),\n        length_scale(length_scale) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-model-meta-data.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// If you are not sure what each field means, please\n// have a look of the Python file in the model directory that\n// you have downloaded.\nstruct OfflineTtsMatchaModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t num_speakers = 0;\n  int32_t version = 1;\n  int32_t jieba = 0;\n  int32_t has_espeak = 0;\n  int32_t use_eos_bos = 0;\n  int32_t pad_id = 0;\n  int32_t add_blank = 1;\n  int32_t is_zh_en = 0;\n  bool need_vocoder = true;\n\n  std::string voice;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsMatchaModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config.matcha.acoustic_model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config.matcha.acoustic_model);\n    Init(buf.data(), buf.size());\n  }\n\n  const OfflineTtsMatchaModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n  Ort::Value Run(Ort::Value x, int64_t sid, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only batch_size == 1. 
Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      exit(-1);\n    }\n\n    int64_t len = x_shape[1];\n    int64_t len_shape = 1;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &len, 1, &len_shape, 1);\n\n    int64_t scale_shape = 1;\n    float noise_scale = config_.matcha.noise_scale;\n    float length_scale = config_.matcha.length_scale;\n\n    if (speed != 1 && speed > 0) {\n      length_scale = 1. / speed;\n    }\n\n    Ort::Value noise_scale_tensor =\n        Ort::Value::CreateTensor(memory_info, &noise_scale, 1, &scale_shape, 1);\n\n    Ort::Value length_scale_tensor = Ort::Value::CreateTensor(\n        memory_info, &length_scale, 1, &scale_shape, 1);\n\n    Ort::Value sid_tensor =\n        Ort::Value::CreateTensor(memory_info, &sid, 1, &scale_shape, 1);\n\n    std::array<float, 2> scales = {noise_scale, length_scale};\n    int64_t scales_shape = 2;\n\n    Ort::Value scales_tensor = Ort::Value::CreateTensor(\n        memory_info, scales.data(), scales.size(), &scales_shape, 1);\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(5);\n    inputs.push_back(std::move(x));\n    inputs.push_back(std::move(x_length));\n    if (input_names_[2] == \"scales\") {\n      // for models from\n      // https://github.com/shivammehta25/Matcha-TTS\n      inputs.push_back(std::move(scales_tensor));\n    } else {\n      // for models from icefall\n      inputs.push_back(std::move(noise_scale_tensor));\n      inputs.push_back(std::move(length_scale_tensor));\n    }\n\n    if (input_names_.size() == 5 && input_names_.back() == \"sid\") {\n      // for models from icefall\n      inputs.push_back(std::move(sid_tensor));\n\n      // Note that multi-speaker TTS models from\n      // https://github.com/shivammehta25/Matcha-TTS\n      // are not supported yet\n    }\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), 
output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---matcha model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.version, \"version\", 1);\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_speakers, \"n_speakers\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.jieba, \"jieba\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.has_espeak, \"has_espeak\",\n                                            0);\n    SHERPA_ONNX_READ_META_DATA(meta_data_.use_eos_bos, \"use_eos_bos\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.pad_id, \"pad_id\");\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.voice, \"voice\",\n                                                \"en-us\");\n\n    if (meta_data_.voice == \"zh 
en-us\") {\n      // for models from\n      // https://modelscope.cn/models/dengcunqin/matcha_tts_zh_en_20251010\n      meta_data_.add_blank = 0;\n      meta_data_.is_zh_en = 1;\n    }\n\n    if (output_names_.front() == \"audio_output\") {\n      meta_data_.need_vocoder = false;\n    }\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineTtsMatchaModelMetaData meta_data_;\n};\n\nOfflineTtsMatchaModel::OfflineTtsMatchaModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsMatchaModel::OfflineTtsMatchaModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsMatchaModel::~OfflineTtsMatchaModel() = default;\n\nconst OfflineTtsMatchaModelMetaData &OfflineTtsMatchaModel::GetMetaData()\n    const {\n  return impl_->GetMetaData();\n}\n\nOrt::Value OfflineTtsMatchaModel::Run(Ort::Value x, int64_t sid /*= 0*/,\n                                      float speed /*= 1.0*/) const {\n  return impl_->Run(std::move(x), sid, speed);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsMatchaModel::OfflineTtsMatchaModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsMatchaModel::OfflineTtsMatchaModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-matcha-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-matcha-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_H_\n\n#include <memory>\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsMatchaModel {\n public:\n  ~OfflineTtsMatchaModel();\n\n  explicit OfflineTtsMatchaModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsMatchaModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  // Return a float32 tensor containing the mel\n  // of shape (batch_size, mel_dim, num_frames)\n  Ort::Value Run(Ort::Value x, int64_t sid = 0, float speed = 1.0) const;\n\n  const OfflineTtsMatchaModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_MATCHA_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsModelConfig::Register(ParseOptions *po) {\n  vits.Register(po);\n  matcha.Register(po);\n  kokoro.Register(po);\n  zipvoice.Register(po);\n  kitten.Register(po);\n  pocket.Register(po);\n  supertonic.Register(po);\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OfflineTtsModelConfig::Validate() const {\n  if (num_threads < 1) {\n    SHERPA_ONNX_LOGE(\"num_threads should be > 0. Given %d\", num_threads);\n    return false;\n  }\n\n  if (!vits.model.empty()) {\n    return vits.Validate();\n  }\n\n  if (!matcha.acoustic_model.empty()) {\n    return matcha.Validate();\n  }\n\n  if (!zipvoice.decoder.empty()) {\n    return zipvoice.Validate();\n  }\n\n  if (!kokoro.model.empty()) {\n    return kokoro.Validate();\n  }\n\n  if (!kitten.model.empty()) {\n    return kitten.Validate();\n  }\n\n  if (!pocket.lm_flow.empty()) {\n    return pocket.Validate();\n  }\n\n  if (!supertonic.tts_json.empty()) {\n    return supertonic.Validate();\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide exactly one tts model.\");\n\n  return false;\n}\n\nstd::string OfflineTtsModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsModelConfig(\";\n  os << \"vits=\" << vits.ToString() << \", \";\n  os << \"matcha=\" << matcha.ToString() << \", \";\n  os << \"kokoro=\" << kokoro.ToString() << \", \";\n  os << \"zipvoice=\" << zipvoice.ToString() << \", \";\n  os << \"kitten=\" << kitten.ToString() << \", \";\n  
os << \"pocket=\" << pocket.ToString() << \", \";\n  os << \"supertonic=\" << supertonic.ToString() << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-pocket-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsModelConfig {\n  OfflineTtsVitsModelConfig vits;\n  OfflineTtsMatchaModelConfig matcha;\n  OfflineTtsKokoroModelConfig kokoro;\n  OfflineTtsZipvoiceModelConfig zipvoice;\n  OfflineTtsKittenModelConfig kitten;\n  OfflineTtsPocketModelConfig pocket;\n  OfflineTtsSupertonicModelConfig supertonic;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OfflineTtsModelConfig() = default;\n\n  OfflineTtsModelConfig(const OfflineTtsVitsModelConfig &vits,\n                        const OfflineTtsMatchaModelConfig &matcha,\n                        const OfflineTtsKokoroModelConfig &kokoro,\n                        const OfflineTtsZipvoiceModelConfig &zipvoice,\n                        const OfflineTtsKittenModelConfig &kitten,\n                        const OfflineTtsPocketModelConfig &pocket,\n                        const OfflineTtsSupertonicModelConfig &supertonic,\n                        int32_t num_threads, bool debug,\n                        const std::string &provider)\n      : vits(vits),\n        matcha(matcha),\n        kokoro(kokoro),\n        zipvoice(zipvoice),\n        kitten(kitten),\n        pocket(pocket),\n        
supertonic(supertonic),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-pocket-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-pocket-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_IMPL_H_\n\n#include <algorithm>\n#include <array>\n#include <chrono>\n#include <cmath>\n#include <cstdint>\n#include <cstring>\n#include <functional>\n#include <iomanip>\n#include <ios>\n#include <limits>\n#include <list>\n#include <memory>\n#include <mutex>\n#include <sstream>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/normal-data-generator.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-pocket-model.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/sentence-piece-tokenizer.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsPocketImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsPocketImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsPocketModel>(config.model)) {\n    InitTokenizer();\n\n    cache_.SetCapacity(config.model.pocket.voice_embedding_cache_capacity);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    
}\n\n    if (!config.rule_fars.empty()) {\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"Loading FST archives\");\n      }\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(f));\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }\n      }\n\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n      }\n    }\n  }\n\n  template <typename Manager>\n  OfflineTtsPocketImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsPocketModel>(mgr, config.model)) {\n    InitTokenizer(mgr);\n    cache_.SetCapacity(config.model.pocket.voice_embedding_cache_capacity);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n\n    
if (!config.rule_fars.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n\n        auto buf = ReadFile(mgr, f);\n\n        std::unique_ptr<std::istream> s(\n            new std::istringstream(std::string(buf.data(), buf.size())));\n\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }  // for (; !reader->Done(); reader->Next())\n      }  // for (const auto &f : files)\n    }  // if (!config.rule_fars.empty())\n  }\n\n  int32_t SampleRate() const override { return 24000; }\n\n  int32_t NumSpeakers() const override { return 1; }\n\n  /**\n   *\n   * Supported extra parameters:\n   *\n   *  - max_frames, int, default 500\n   *  - frames_after_eos, int, default 3\n   *  - temperature, float, default 0.7\n   *  - chunk_size, int, default 15\n   *  - max_reference_audio_len, float, default 10, in seconds\n   *  - max_char_in_sentence, int, default 200\n   *  - min_char_in_sentence, int, default 30\n   *  - seed, int, default -1\n   */\n  GeneratedAudio Generate(\n      const std::string &_text, const GenerationConfig &gen_config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", gen_config.ToString().c_str());\n    }\n\n    std::string text = _text;\n    if (config_.model.debug) 
{\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Raw text: %{public}s\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Raw text: %s\", text.c_str());\n#endif\n      std::ostringstream os;\n      os << \"In bytes (hex):\\n\";\n      const auto p = reinterpret_cast<const uint8_t *>(text.c_str());\n      for (int32_t i = 0; i != text.size(); ++i) {\n        os << std::setw(2) << std::setfill('0') << std::hex\n           << static_cast<uint32_t>(p[i]) << \" \";\n      }\n      os << \"\\n\";\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    if (!tn_list_.empty()) {\n      for (const auto &tn : tn_list_) {\n        text = tn->Normalize(text);\n        if (config_.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"After normalizing: %{public}s\", text.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"After normalizing: %s\", text.c_str());\n#endif\n        }\n      }\n    }\n\n    auto sentences = SplitByPunctuation(text);\n\n    if (sentences.empty()) {\n      return {};\n    }\n\n    int32_t max_char_in_sentence =\n        gen_config.GetExtraInt(\"max_char_in_sentence\", 200);\n\n    int32_t min_char_in_sentence =\n        gen_config.GetExtraInt(\"min_char_in_sentence\", 30);\n\n    sentences = MergeShortSentences(sentences, min_char_in_sentence);\n\n    std::vector<std::string> final_chunks;\n    for (const auto &s : sentences) {\n      auto pieces = SplitLongSentence(s, max_char_in_sentence);\n      final_chunks.insert(final_chunks.end(), pieces.begin(), pieces.end());\n    }\n\n    sentences = std::move(final_chunks);\n\n    Ort::Value voice_embedding = GetVoiceEmbedding(gen_config);\n    if (!voice_embedding) {\n      return {};\n    }\n\n    GeneratedAudio result;\n    result.sample_rate = SampleRate();\n\n    const int32_t total = sentences.size();\n\n    bool should_continue = true;\n\n    for (int32_t i = 0; i < total && should_continue; ++i) {\n      if 
(config_.model.debug) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Processing %{public}d/%{public}d: %{public}s\", i + 1,\n                         total, sentences[i].c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Processing %d/%d: %s\", i + 1, total,\n                         sentences[i].c_str());\n#endif\n      }\n      GeneratedAudioCallback wrapped_cb = nullptr;\n\n      if (callback) {\n        wrapped_cb = [&, i](const float *samples, int32_t n,\n                            float sentence_progress) -> bool {\n          float global_progress = (i + sentence_progress) / total;\n\n          return callback(samples, n, global_progress);\n        };\n      }\n\n      GeneratedAudio cur = GenerateSingleSentence(sentences[i], gen_config,\n                                                  View(&voice_embedding),\n                                                  should_continue, wrapped_cb);\n\n      if (cur.samples.empty()) {\n        continue;\n      }\n\n      result.samples.insert(result.samples.end(), cur.samples.begin(),\n                            cur.samples.end());\n    }\n\n    float silence_scale = gen_config.silence_scale;\n    if (silence_scale != 1) {\n      result = result.ScaleSilence(silence_scale);\n    }\n\n    return result;\n  }\n\n  static size_t ComputeHash(const float *p, size_t n) {\n    size_t hash = 0;\n\n    auto hash_combine = [](size_t &seed, size_t value) {\n      seed ^= value + 0x9e3779b97f4a7c15ull + (seed << 6) + (seed >> 2);\n    };\n\n    hash_combine(hash, n);\n\n    for (size_t i = 0; i < n; ++i) {\n      uint32_t bits;\n      std::memcpy(&bits, &p[i], sizeof(float));\n      hash_combine(hash, bits);\n    }\n\n    return hash;\n  }\n\n  GeneratedAudio GenerateSingleSentence(\n      const std::string &text, const GenerationConfig &gen_config,\n      Ort::Value voice_embedding, bool &should_continue,\n      GeneratedAudioCallback callback = nullptr) const {\n    Ort::Value text_embedding = GetTextEmbedding(text);\n\n    auto 
lm_main_state = model_->GetLmMainInitState();\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    {\n      std::array<int64_t, 3> empty_seq_shape = {1, 0, 32};\n\n      Ort::Value empty_seq_tensor = Ort::Value::CreateTensor<float>(\n          memory_info, nullptr, 0, empty_seq_shape.data(),\n          empty_seq_shape.size());\n\n      // voice conditioning\n      // discard the return result\n      RunLmMain(View(&empty_seq_tensor), std::move(voice_embedding),\n                lm_main_state);\n\n      // text conditioning\n      // discard the return result\n      RunLmMain(std::move(empty_seq_tensor), std::move(text_embedding),\n                lm_main_state);\n    }\n\n    std::vector<float> cur(1 * 1 * 32, std::numeric_limits<float>::quiet_NaN());\n    std::array<int64_t, 3> cur_shape = {1, 1, 32};\n\n    int32_t num_steps = gen_config.num_steps;\n    int32_t max_frames = gen_config.GetExtraInt(\"max_frames\", 500);\n    int32_t frames_after_eos = gen_config.GetExtraInt(\"frames_after_eos\", 3);\n    float temperature = gen_config.GetExtraFloat(\"temperature\", 0.7f);\n    float stddev = std::sqrt(temperature);\n    int32_t seed = gen_config.GetExtraInt(\"seed\", -1);\n\n    NormalDataGenerator normal_gen(0, stddev, seed);\n    std::vector<float> noise(32, 0);\n    std::array<int64_t, 2> noise_shape = {1, 32};\n\n    Ort::Value noise_tensor =\n        Ort::Value::CreateTensor(memory_info, noise.data(), noise.size(),\n                                 noise_shape.data(), noise_shape.size());\n\n    std::array<int64_t, 3> empty_text_shape = {1, 0, 1024};\n\n    Ort::Value empty_text_tensor = Ort::Value::CreateTensor<float>(\n        memory_info, nullptr, 0, empty_text_shape.data(),\n        empty_text_shape.size());\n\n    Ort::Value conditioning{nullptr};\n    Ort::Value eos_logit{nullptr};\n\n    std::vector<float> latent_list;\n    int32_t eos_step = -1;\n    int32_t frame_size = -1;\n    for (int32_t step 
= 0; step < max_frames; ++step) {\n      Ort::Value cur_tensor =\n          Ort::Value::CreateTensor(memory_info, cur.data(), cur.size(),\n                                   cur_shape.data(), cur_shape.size());\n\n      std::tie(conditioning, eos_logit) = RunLmMain(\n          std::move(cur_tensor), View(&empty_text_tensor), lm_main_state);\n      const float *p_logit = eos_logit.GetTensorData<float>();\n\n      if (eos_step < 0 && p_logit[0] > -4) {\n        eos_step = step;\n      }\n\n      // eos_step stays -1 until EOS is detected; step 0 is a valid EOS step.\n      if (eos_step >= 0 && (step >= eos_step + frames_after_eos)) {\n        break;\n      }\n\n      normal_gen.Fill(noise.data(), noise.size());\n\n      Ort::Value latent =\n          RunLmFlow(std::move(conditioning), View(&noise_tensor), num_steps);\n\n      auto n = latent.GetTensorTypeAndShapeInfo().GetShape().back();\n      if (frame_size == -1) {\n        frame_size = n;\n      }\n\n      cur = {latent.GetTensorData<float>(), latent.GetTensorData<float>() + n};\n\n      latent_list.insert(latent_list.end(), latent.GetTensorData<float>(),\n                         latent.GetTensorData<float>() + n);\n    }\n\n    lm_main_state.values.clear();\n\n    auto decoder_state = model_->GetMimiDecoderInitState();\n\n    int32_t chunk_size = gen_config.GetExtraInt(\"chunk_size\", 15);\n\n    int32_t num_chunks = latent_list.size() / frame_size / chunk_size;\n    std::array<int64_t, 3> chunk_shape = {1, chunk_size, frame_size};\n\n    std::vector<float> audio_list;\n\n    int32_t remaining_chunks =\n        (latent_list.size() - num_chunks * chunk_size * frame_size) /\n        frame_size;\n\n    const float *p = latent_list.data();\n    for (int32_t i = 0;\n         (p < latent_list.data() + latent_list.size()) && should_continue;\n         ++i) {\n      int32_t this_chunk_size = chunk_size;\n      if (i >= num_chunks) {\n        this_chunk_size = remaining_chunks;\n      }\n\n      chunk_shape[1] = this_chunk_size;\n\n      Ort::Value chunk_tensor = Ort::Value::CreateTensor(\n         
 memory_info, const_cast<float *>(p), this_chunk_size * frame_size,\n          chunk_shape.data(), chunk_shape.size());\n\n      p += this_chunk_size * frame_size;\n\n      Ort::Value out = RunMimiDecoder(std::move(chunk_tensor), decoder_state);\n\n      auto n = out.GetTensorTypeAndShapeInfo().GetShape().back();\n\n      if (callback) {\n        should_continue =\n            callback(out.GetTensorData<float>(), n,\n                     (i + 1) * 1.0 / (num_chunks + !!remaining_chunks));\n        // Caution(fangjun): out is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n\n      audio_list.insert(audio_list.end(), out.GetTensorData<float>(),\n                        out.GetTensorData<float>() + n);\n    }\n\n    GeneratedAudio ans;\n    ans.sample_rate = SampleRate();\n    ans.samples = std::move(audio_list);\n\n    return ans;\n  }\n\n private:\n  template <typename Manager>\n  void InitTokenizer(Manager *mgr) {\n    tokenizer_ = std::make_unique<SentencePieceTokenizer>(\n        mgr, config_.model.pocket.vocab_json,\n        config_.model.pocket.token_scores_json);\n  }\n\n  void InitTokenizer() {\n    tokenizer_ = std::make_unique<SentencePieceTokenizer>(\n        config_.model.pocket.vocab_json,\n        config_.model.pocket.token_scores_json);\n  }\n\n  Ort::Value GetVoiceEmbedding(const GenerationConfig &gen_config) const {\n    if (gen_config.reference_sample_rate <= 0) {\n      SHERPA_ONNX_LOGE(\"reference_sample_rate %d is invalid.\",\n                       gen_config.reference_sample_rate);\n      return Ort::Value{nullptr};\n    }\n\n    if (gen_config.reference_audio.empty()) {\n      SHERPA_ONNX_LOGE(\"reference audio is empty\");\n      return Ort::Value{nullptr};\n    }\n\n    std::vector<float> reference_audio;\n\n    const float *p_audio;\n    int32_t num_samples;\n    if (gen_config.reference_sample_rate != 
SampleRate()) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\",\n          gen_config.reference_sample_rate, SampleRate());\n\n      float min_freq =\n          std::min<int32_t>(gen_config.reference_sample_rate, SampleRate());\n      float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      auto resampler = std::make_unique<sherpa_onnx::LinearResample>(\n          gen_config.reference_sample_rate, SampleRate(), lowpass_cutoff,\n          lowpass_filter_width);\n\n      resampler->Resample(gen_config.reference_audio.data(),\n                          gen_config.reference_audio.size(), true,\n                          &reference_audio);\n      p_audio = reference_audio.data();\n      num_samples = reference_audio.size();\n    } else {\n      p_audio = gen_config.reference_audio.data();\n      num_samples = gen_config.reference_audio.size();\n    }\n\n    float max_reference_audio_len =\n        gen_config.GetExtraFloat(\"max_reference_audio_len\", 10);\n\n    // in seconds\n\n    int32_t max_len =\n        static_cast<int32_t>(max_reference_audio_len * SampleRate());\n\n    if (num_samples > max_len) {\n      if (config_.model.debug) {\n        SHERPA_ONNX_LOGE(\n            \"max_reference_audio_len is %.3f seconds. Given reference audio of \"\n            \"%.3f seconds. 
Only the first %.3f seconds are used\",\n            max_reference_audio_len, num_samples * 1.0f / SampleRate(),\n            max_reference_audio_len);\n      }\n      num_samples = max_len;\n    }\n\n    // Compute hash of reference audio for cache lookup\n    size_t audio_hash = ComputeHash(p_audio, num_samples);\n\n    auto cached_embedding = cache_.Get(audio_hash);\n    if (cached_embedding) {\n      if (config_.model.debug) {\n        SHERPA_ONNX_LOGE(\"CACHE HIT: voice embedding (hash=%zu)\", audio_hash);\n      }\n      // Create an owned tensor and copy data to avoid use-after-free\n      auto result = Ort::Value::CreateTensor<float>(\n          model_->Allocator(), cached_embedding->second.data(),\n          cached_embedding->second.size());\n      std::copy(cached_embedding->first.begin(), cached_embedding->first.end(),\n                result.GetTensorMutableData<float>());\n      return result;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> shape = {1, 1, num_samples};\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, const_cast<float *>(p_audio),\n                                 num_samples, shape.data(), shape.size());\n\n    Ort::Value result = model_->RunMimiEncoder(std::move(x));\n\n    auto info = result.GetTensorTypeAndShapeInfo();\n    auto result_shape = info.GetShape();\n    size_t total = info.GetElementCount();\n    const float *result_data = result.GetTensorData<float>();\n\n    cache_.Put(audio_hash, std::vector<float>(result_data, result_data + total),\n               std::move(result_shape));\n\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"CACHE MISS: cached embedding (hash=%zu, %zu floats)\",\n                       audio_hash, total);\n    }\n\n    return result;\n  }\n\n  Ort::Value GetTextEmbedding(const std::string &text) const {\n    std::vector<int32_t> token_ids = tokenizer_->EncodeIds(text);\n    if 
(config_.model.debug) {\n      std::ostringstream os;\n      os << \"\\ntoken_ids (len=\" << token_ids.size() << \"): \";\n      for (auto i : token_ids) {\n        os << i << \" \";\n      }\n      os << \"\\n\";\n\n      auto tokens = tokenizer_->EncodeTokens(text);\n      os << \"tokens (len=\" << tokens.size() << \"):\";\n      for (const auto &t : tokens) {\n        os << t << \" \";\n      }\n\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    }\n\n    std::vector<int64_t> token_ids_i64 = {token_ids.begin(), token_ids.end()};\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> shape = {1,\n                                    static_cast<int64_t>(token_ids_i64.size())};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, token_ids_i64.data(),\n                                            token_ids_i64.size(), shape.data(),\n                                            shape.size());\n    return model_->RunTextConditioner(std::move(x));\n  }\n\n  // state is changed in-place\n  std::pair<Ort::Value, Ort::Value> RunLmMain(Ort::Value seq,\n                                              Ort::Value embedding,\n                                              PocketLmMainState &state) const {\n    std::tuple<Ort::Value, Ort::Value, PocketLmMainState> output =\n        model_->RunLmMain(std::move(seq), std::move(embedding),\n                          std::move(state));\n\n    state = std::move(std::get<2>(output));\n\n    return {std::move(std::get<0>(output)), std::move(std::get<1>(output))};\n  }\n\n  Ort::Value RunLmFlow(Ort::Value conditioning, Ort::Value noise,\n                       int32_t num_steps) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    Ort::Value x = Clone(model_->Allocator(), &noise);\n\n    std::array<int64_t, 2> shape = {1, 1};\n\n    float dt = 1.0f / static_cast<float>(num_steps);\n\n    
float s = 0;\n    float t = 0;\n\n    Ort::Value s_tensor = Ort::Value::CreateTensor(memory_info, &s, 1,\n                                                   shape.data(), shape.size());\n\n    Ort::Value t_tensor = Ort::Value::CreateTensor(memory_info, &t, 1,\n                                                   shape.data(), shape.size());\n\n    for (int32_t i = 0; i < num_steps; ++i) {\n      s = static_cast<float>(i) / static_cast<float>(num_steps);\n      t = s + dt;\n\n      Ort::Value out = model_->RunLmFlow(View(&conditioning), View(&s_tensor),\n                                         View(&t_tensor), View(&x));\n\n      auto n = out.GetTensorTypeAndShapeInfo().GetShape().back();\n\n      ScaleAdd(out.GetTensorData<float>(), dt, n,\n               x.GetTensorMutableData<float>());\n    }\n\n    return std::move(x);\n  }\n\n  // state is changed in-place\n  Ort::Value RunMimiDecoder(Ort::Value latent,\n                            PocketMimiDecoderState &state) const {\n    std::pair<Ort::Value, PocketMimiDecoderState> output =\n        model_->RunMimiDecoder(std::move(latent), std::move(state));\n\n    state = std::move(output.second);\n\n    return std::move(output.first);\n  }\n\n private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsPocketModel> model_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> tn_list_;\n  std::unique_ptr<SentencePieceTokenizer> tokenizer_;\n\n  // Shared Thread-Safe LRU Cache for Voice Embeddings\n  struct VoiceEmbeddingCache {\n    using Embedding = std::pair<std::vector<float>, std::vector<int64_t>>;\n    using EmbeddingPtr = std::shared_ptr<Embedding>;\n\n   private:\n    using ListNode = std::pair<size_t, EmbeddingPtr>;\n    using ListIt = std::list<ListNode>::iterator;\n\n    mutable std::mutex mutex_;\n    size_t capacity_;\n\n    // Front = most recently used\n    std::list<ListNode> lru_list_;\n\n    // Key -> iterator into lru_list_\n    std::unordered_map<size_t, ListIt> map_;\n\n   public:\n    
static constexpr size_t kDefaultCapacity = 50;\n\n    explicit VoiceEmbeddingCache(size_t cap = kDefaultCapacity)\n        : capacity_(cap) {}\n\n    EmbeddingPtr Get(size_t key) {\n      std::lock_guard<std::mutex> lock(mutex_);\n\n      auto it = map_.find(key);\n      if (it == map_.end()) {\n        return nullptr;  // cache miss\n      }\n\n      // Move to front (most recently used)\n      if (it->second != lru_list_.begin()) {\n        lru_list_.splice(lru_list_.begin(), lru_list_, it->second);\n      }\n\n      return it->second->second;  // copy shared_ptr\n    }\n\n    void Put(size_t key, std::vector<float> data, std::vector<int64_t> shape) {\n      std::lock_guard<std::mutex> lock(mutex_);\n\n      if (capacity_ == 0) {\n        return;\n      }\n\n      auto it = map_.find(key);\n\n      // If exists, update and move to front\n      if (it != map_.end()) {\n        it->second->second =\n            std::make_shared<Embedding>(std::move(data), std::move(shape));\n\n        if (it->second != lru_list_.begin()) {\n          lru_list_.splice(lru_list_.begin(), lru_list_, it->second);\n        }\n        return;\n      }\n\n      // Evict if full\n      if (lru_list_.size() >= capacity_) {\n        auto &last = lru_list_.back();\n        size_t last_key = last.first;\n\n        map_.erase(last_key);\n        lru_list_.pop_back();  // shared_ptr released here\n      }\n\n      // Insert new at front\n      lru_list_.emplace_front(\n          key, std::make_shared<Embedding>(std::move(data), std::move(shape)));\n\n      map_[key] = lru_list_.begin();\n    }\n\n    void SetCapacity(int32_t cap) {\n      if (cap < 0) {\n        SHERPA_ONNX_LOGE(\n            \"voice_embedding_cache_capacity must be >= 0. 
Given: %d\", cap);\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      std::lock_guard<std::mutex> lock(mutex_);\n      capacity_ = cap;\n\n      while (lru_list_.size() > capacity_) {\n        auto &last = lru_list_.back();\n        size_t last_key = last.first;\n\n        map_.erase(last_key);\n        lru_list_.pop_back();\n      }\n    }\n\n    size_t Size() const {\n      std::lock_guard<std::mutex> lock(mutex_);\n      return lru_list_.size();\n    }\n\n    void Clear() {\n      std::lock_guard<std::mutex> lock(mutex_);\n      map_.clear();\n      lru_list_.clear();\n    }\n  };\n\n  mutable VoiceEmbeddingCache cache_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-pocket-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-pocket-model-config.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-pocket-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsPocketModelConfig::Register(ParseOptions *po) {\n  po->Register(\"pocket-lm-flow\", &lm_flow, \"Path to PocketTTS lm flow model\");\n  po->Register(\"pocket-lm-main\", &lm_main, \"Path to PocketTTS lm main model\");\n  po->Register(\"pocket-encoder\", &encoder, \"Path to PocketTTS encoder model\");\n  po->Register(\"pocket-decoder\", &decoder, \"Path to PocketTTS decoder model\");\n  po->Register(\"pocket-text-conditioner\", &text_conditioner,\n               \"Path to PocketTTS text conditioner model\");\n  po->Register(\"pocket-vocab-json\", &vocab_json,\n               \"Path to PocketTTS vocab.json\");\n  po->Register(\"pocket-token-scores-json\", &token_scores_json,\n               \"Path to PocketTTS token_scores.json\");\n  po->Register(\"pocket-voice-embedding-cache-capacity\",\n               &voice_embedding_cache_capacity,\n               \"Capacity of the voice embedding cache (number of items). \"\n               \"Default: 50. 
0 disables caching.\");\n}\n\nbool OfflineTtsPocketModelConfig::Validate() const {\n  if (lm_flow.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-lm-flow\");\n    return false;\n  }\n\n  if (!FileExists(lm_flow)) {\n    SHERPA_ONNX_LOGE(\"--pocket-lm-flow '%s' does not exist\", lm_flow.c_str());\n    return false;\n  }\n\n  if (lm_main.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-lm-main\");\n    return false;\n  }\n\n  if (!FileExists(lm_main)) {\n    SHERPA_ONNX_LOGE(\"--pocket-lm-main '%s' does not exist\", lm_main.c_str());\n    return false;\n  }\n\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"--pocket-encoder '%s' does not exist\", encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-decoder\");\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"--pocket-decoder '%s' does not exist\", decoder.c_str());\n    return false;\n  }\n\n  if (text_conditioner.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-text-conditioner\");\n    return false;\n  }\n\n  if (!FileExists(text_conditioner)) {\n    SHERPA_ONNX_LOGE(\"--pocket-text-conditioner '%s' does not exist\",\n                     text_conditioner.c_str());\n    return false;\n  }\n\n  if (vocab_json.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-vocab-json\");\n    return false;\n  }\n\n  if (!FileExists(vocab_json)) {\n    SHERPA_ONNX_LOGE(\"--pocket-vocab-json '%s' does not exist\",\n                     vocab_json.c_str());\n    return false;\n  }\n\n  if (token_scores_json.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --pocket-token-scores-json\");\n    return false;\n  }\n\n  if (!FileExists(token_scores_json)) {\n    SHERPA_ONNX_LOGE(\"--pocket-token-scores-json '%s' does not exist\",\n                     token_scores_json.c_str());\n    
return false;\n  }\n\n  if (voice_embedding_cache_capacity < 0) {\n    SHERPA_ONNX_LOGE(\n        \"voice_embedding_cache_capacity must be non-negative. Given: %d\",\n        voice_embedding_cache_capacity);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsPocketModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsPocketModelConfig(\";\n  os << \"lm_flow=\\\"\" << lm_flow << \"\\\", \";\n  os << \"lm_main=\\\"\" << lm_main << \"\\\", \";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"text_conditioner=\\\"\" << text_conditioner << \"\\\", \";\n  os << \"vocab_json=\\\"\" << vocab_json << \"\\\", \";\n  os << \"token_scores_json=\\\"\" << token_scores_json << \"\\\", \";\n  os << \"voice_embedding_cache_capacity=\" << voice_embedding_cache_capacity\n     << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-pocket-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-pocket-model-config.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsPocketModelConfig {\n  std::string lm_flow;\n  std::string lm_main;\n  std::string encoder;\n  std::string decoder;\n  std::string text_conditioner;\n\n  std::string vocab_json;\n  std::string token_scores_json;\n\n  OfflineTtsPocketModelConfig() = default;\n  int32_t voice_embedding_cache_capacity = 50;\n\n  OfflineTtsPocketModelConfig(const std::string &lm_flow,\n                              const std::string &lm_main,\n                              const std::string &encoder,\n                              const std::string &decoder,\n                              const std::string &text_conditioner,\n                              const std::string &vocab_json,\n                              const std::string &token_scores_json,\n                              int32_t voice_embedding_cache_capacity = 50)\n      : lm_flow(lm_flow),\n        lm_main(lm_main),\n        encoder(encoder),\n        decoder(decoder),\n        text_conditioner(text_conditioner),\n        vocab_json(vocab_json),\n        token_scores_json(token_scores_json),\n        voice_embedding_cache_capacity(voice_embedding_cache_capacity) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-pocket-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-pocket-model.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-pocket-model.h\"\n\n#include <memory>\n#include <string>\n#include <tuple>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic Ort::Value CreateZeroTensorLike(Ort::Session &sess, int32_t input_index,\n                                       OrtAllocator *allocator) {\n  auto type_info = sess.GetInputTypeInfo(input_index);\n  auto tensor_info = type_info.GetTensorTypeAndShapeInfo();\n  ONNXTensorElementDataType elem_type = tensor_info.GetElementType();\n  std::vector<int64_t> shape = tensor_info.GetShape();\n\n  // 3. 
Replace dynamic dims (-1) with 1\n  for (auto &d : shape) {\n    if (d < 0) {\n      d = 1;\n    }\n  }\n\n  Ort::Value v{nullptr};\n  switch (elem_type) {\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT:\n      v = Ort::Value::CreateTensor<float>(allocator, shape.data(),\n                                          shape.size());\n      Fill<float>(&v, 0);\n      break;\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL:\n      v = Ort::Value::CreateTensor<bool>(allocator, shape.data(), shape.size());\n      Fill<bool>(&v, 0);\n      break;\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64:\n      v = Ort::Value::CreateTensor<int64_t>(allocator, shape.data(),\n                                            shape.size());\n      Fill<int64_t>(&v, 0);\n      break;\n    default:\n      SHERPA_ONNX_LOGE(\"Unsupported tensor element type: %d\", elem_type);\n      SHERPA_ONNX_EXIT(-1);\n  }\n\n  return v;\n}\n\nclass OfflineTtsPocketModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)) {\n    lm_flow_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.pocket.lm_flow), sess_opts_);\n    InitLmFlow(nullptr, 0);\n\n    lm_main_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.pocket.lm_main), sess_opts_);\n    InitLmMain(nullptr, 0);\n\n    mimi_encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.pocket.encoder), sess_opts_);\n    InitMimiEncoder(nullptr, 0);\n\n    mimi_decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.pocket.decoder), sess_opts_);\n    InitMimiDecoder(nullptr, 0);\n\n    text_conditioner_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.pocket.text_conditioner),\n        sess_opts_);\n    InitTextConditioner(nullptr, 0);\n  }\n\n  template 
<typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)) {\n    {\n      auto buf = ReadFile(mgr, config.pocket.lm_flow);\n      InitLmFlow(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.pocket.lm_main);\n      InitLmMain(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.pocket.encoder);\n      InitMimiEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.pocket.decoder);\n      InitMimiDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.pocket.text_conditioner);\n      InitTextConditioner(buf.data(), buf.size());\n    }\n  }\n\n  PocketLmMainState GetLmMainInitState() {\n    PocketLmMainState s;\n    s.values.reserve(lm_main_init_states_.values.size());\n    for (auto &v : lm_main_init_states_.values) {\n      s.values.push_back(View(&v));\n    }\n    return s;\n  }\n\n  PocketMimiDecoderState GetMimiDecoderInitState() {\n    PocketMimiDecoderState s;\n    s.values.reserve(mimi_decoder_init_states_.values.size());\n    for (auto &v : mimi_decoder_init_states_.values) {\n      s.values.push_back(View(&v));\n    }\n\n    return s;\n  }\n\n  Ort::Value RunMimiEncoder(Ort::Value audio) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(audio));\n\n    auto outputs = mimi_encoder_sess_->Run(\n        {}, mimi_encoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        mimi_encoder_output_names_ptr_.data(),\n        mimi_encoder_output_names_ptr_.size());\n\n    return std::move(outputs[0]);\n  }\n\n  Ort::Value RunTextConditioner(Ort::Value text_tokens) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(text_tokens));\n\n    auto outputs = text_conditioner_sess_->Run(\n        Ort::RunOptions{nullptr}, 
text_conditioner_input_names_ptr_.data(),\n        inputs.data(), inputs.size(), text_conditioner_output_names_ptr_.data(),\n        text_conditioner_output_names_ptr_.size());\n\n    return std::move(outputs[0]);\n  }\n\n  std::tuple<Ort::Value, Ort::Value, PocketLmMainState> RunLmMain(\n      Ort::Value seq, Ort::Value embeddings, PocketLmMainState state) const {\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(2 + state.values.size());\n\n    inputs.push_back(std::move(seq));\n    inputs.push_back(std::move(embeddings));\n\n    for (auto &v : state.values) {\n      inputs.push_back(std::move(v));\n    }\n\n    auto outputs = lm_main_sess_->Run(\n        Ort::RunOptions{nullptr}, lm_main_input_names_ptr_.data(),\n        inputs.data(), inputs.size(), lm_main_output_names_ptr_.data(),\n        lm_main_output_names_ptr_.size());\n\n    PocketLmMainState new_state;\n    new_state.values.reserve(outputs.size() - 2);\n    for (size_t i = 2; i < outputs.size(); ++i) {\n      new_state.values.push_back(std::move(outputs[i]));\n    }\n\n    return {std::move(outputs[0]), std::move(outputs[1]), std::move(new_state)};\n  }\n\n  Ort::Value RunLmFlow(Ort::Value c, Ort::Value s, Ort::Value t,\n                       Ort::Value x) const {\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(4);\n    inputs.push_back(std::move(c));\n    inputs.push_back(std::move(s));\n    inputs.push_back(std::move(t));\n    inputs.push_back(std::move(x));\n\n    auto outputs = lm_flow_sess_->Run(\n        {}, lm_flow_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        lm_flow_output_names_ptr_.data(), lm_flow_output_names_ptr_.size());\n\n    return std::move(outputs[0]);\n  }\n\n  std::pair<Ort::Value, PocketMimiDecoderState> RunMimiDecoder(\n      Ort::Value latent, PocketMimiDecoderState state) const {\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(1 + state.values.size());\n\n    inputs.push_back(std::move(latent));\n    for (auto &v : state.values) {\n   
   inputs.push_back(std::move(v));\n    }\n\n    auto outputs = mimi_decoder_sess_->Run(\n        {}, mimi_decoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        mimi_decoder_output_names_ptr_.data(),\n        mimi_decoder_output_names_ptr_.size());\n\n    PocketMimiDecoderState new_state;\n    new_state.values.reserve(outputs.size() - 1);\n    for (size_t i = 1; i < outputs.size(); ++i) {\n      new_state.values.push_back(std::move(outputs[i]));\n    }\n\n    return {std::move(outputs[0]), std::move(new_state)};\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void InitLmFlow(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      lm_flow_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!lm_flow_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize lm flow session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(lm_flow_sess_.get(), &lm_flow_input_names_,\n                  &lm_flow_input_names_ptr_);\n\n    GetOutputNames(lm_flow_sess_.get(), &lm_flow_output_names_,\n                   &lm_flow_output_names_ptr_);\n  }\n\n  void InitLmMain(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      lm_main_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!lm_main_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize lm main session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(lm_main_sess_.get(), &lm_main_input_names_,\n                  &lm_main_input_names_ptr_);\n\n    GetOutputNames(lm_main_sess_.get(), &lm_main_output_names_,\n                   &lm_main_output_names_ptr_);\n\n    lm_main_init_states_.values.reserve(lm_main_input_names_.size() - 2);\n    for (size_t i = 2; i < 
lm_main_input_names_.size(); ++i) {\n      lm_main_init_states_.values.push_back(\n          CreateZeroTensorLike(*lm_main_sess_, i, allocator_));\n    }\n  }\n\n  void InitMimiEncoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      mimi_encoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!mimi_encoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize mimi encoder session outside \"\n          \"of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(mimi_encoder_sess_.get(), &mimi_encoder_input_names_,\n                  &mimi_encoder_input_names_ptr_);\n\n    GetOutputNames(mimi_encoder_sess_.get(), &mimi_encoder_output_names_,\n                   &mimi_encoder_output_names_ptr_);\n  }\n\n  void InitMimiDecoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      mimi_decoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!mimi_decoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize mimi decoder session outside \"\n          \"of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(mimi_decoder_sess_.get(), &mimi_decoder_input_names_,\n                  &mimi_decoder_input_names_ptr_);\n\n    GetOutputNames(mimi_decoder_sess_.get(), &mimi_decoder_output_names_,\n                   &mimi_decoder_output_names_ptr_);\n\n    // init mimi_decoder_init_states_\n    mimi_decoder_init_states_.values.reserve(mimi_decoder_input_names_.size() -\n                                             1);\n    for (size_t i = 1; i < mimi_decoder_input_names_.size(); ++i) {\n      mimi_decoder_init_states_.values.push_back(\n          CreateZeroTensorLike(*mimi_decoder_sess_, i, allocator_));\n    }\n  }\n\n  void InitTextConditioner(void *model_data, size_t model_data_length) {\n   
 if (model_data) {\n      text_conditioner_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!text_conditioner_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize text conditioner session \"\n          \"outside of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(text_conditioner_sess_.get(), &text_conditioner_input_names_,\n                  &text_conditioner_input_names_ptr_);\n\n    GetOutputNames(text_conditioner_sess_.get(),\n                   &text_conditioner_output_names_,\n                   &text_conditioner_output_names_ptr_);\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> lm_main_sess_;\n  std::unique_ptr<Ort::Session> lm_flow_sess_;\n  std::unique_ptr<Ort::Session> mimi_decoder_sess_;\n  std::unique_ptr<Ort::Session> mimi_encoder_sess_;\n  std::unique_ptr<Ort::Session> text_conditioner_sess_;\n\n  std::vector<std::string> lm_flow_input_names_;\n  std::vector<const char *> lm_flow_input_names_ptr_;\n\n  std::vector<std::string> lm_flow_output_names_;\n  std::vector<const char *> lm_flow_output_names_ptr_;\n\n  std::vector<std::string> lm_main_input_names_;\n  std::vector<const char *> lm_main_input_names_ptr_;\n\n  std::vector<std::string> lm_main_output_names_;\n  std::vector<const char *> lm_main_output_names_ptr_;\n\n  std::vector<std::string> mimi_encoder_input_names_;\n  std::vector<const char *> mimi_encoder_input_names_ptr_;\n\n  std::vector<std::string> mimi_encoder_output_names_;\n  std::vector<const char *> mimi_encoder_output_names_ptr_;\n\n  std::vector<std::string> mimi_decoder_input_names_;\n  std::vector<const char *> mimi_decoder_input_names_ptr_;\n\n  std::vector<std::string> mimi_decoder_output_names_;\n  std::vector<const char *> mimi_decoder_output_names_ptr_;\n\n  
std::vector<std::string> text_conditioner_input_names_;\n  std::vector<const char *> text_conditioner_input_names_ptr_;\n\n  std::vector<std::string> text_conditioner_output_names_;\n  std::vector<const char *> text_conditioner_output_names_ptr_;\n\n  PocketLmMainState lm_main_init_states_;\n  PocketMimiDecoderState mimi_decoder_init_states_;\n};\n\nOfflineTtsPocketModel::OfflineTtsPocketModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsPocketModel::OfflineTtsPocketModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsPocketModel::~OfflineTtsPocketModel() = default;\n\nPocketLmMainState OfflineTtsPocketModel::GetLmMainInitState() const {\n  return impl_->GetLmMainInitState();\n}\n\nPocketMimiDecoderState OfflineTtsPocketModel::GetMimiDecoderInitState() const {\n  return impl_->GetMimiDecoderInitState();\n}\n\nOrt::Value OfflineTtsPocketModel::RunMimiEncoder(Ort::Value audio) const {\n  return impl_->RunMimiEncoder(std::move(audio));\n}\n\nOrt::Value OfflineTtsPocketModel::RunTextConditioner(\n    Ort::Value text_tokens) const {\n  return impl_->RunTextConditioner(std::move(text_tokens));\n}\n\nstd::tuple<Ort::Value, Ort::Value, PocketLmMainState>\nOfflineTtsPocketModel::RunLmMain(Ort::Value seq, Ort::Value embeddings,\n                                 PocketLmMainState state) const {\n  return impl_->RunLmMain(std::move(seq), std::move(embeddings),\n                          std::move(state));\n}\n\nOrt::Value OfflineTtsPocketModel::RunLmFlow(Ort::Value c, Ort::Value s,\n                                            Ort::Value t, Ort::Value x) const {\n  return impl_->RunLmFlow(std::move(c), std::move(s), std::move(t),\n                          std::move(x));\n}\n\nstd::pair<Ort::Value, PocketMimiDecoderState>\nOfflineTtsPocketModel::RunMimiDecoder(Ort::Value latent,\n                                      
PocketMimiDecoderState state) const {\n  return impl_->RunMimiDecoder(std::move(latent), std::move(state));\n}\n\nOrtAllocator *OfflineTtsPocketModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsPocketModel::OfflineTtsPocketModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsPocketModel::OfflineTtsPocketModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-pocket-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-pocket-model.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_H_\n\n#include <memory>\n#include <tuple>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct PocketLmMainState {\n  std::vector<Ort::Value> values;\n};\n\nstruct PocketMimiDecoderState {\n  std::vector<Ort::Value> values;\n};\n\n// Please refer to\n// https://huggingface.co/KevinAHM/pocket-tts-onnx/blob/main/pocket_tts_onnx.py\nclass OfflineTtsPocketModel {\n public:\n  explicit OfflineTtsPocketModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsPocketModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  ~OfflineTtsPocketModel();\n\n  PocketLmMainState GetLmMainInitState() const;\n  PocketMimiDecoderState GetMimiDecoderInitState() const;\n\n  /**\n   * @param audio should be of 24000Hz. 
Its shape is (1, 1, num_samples)\n   * @return a float32 tensor of shape (1, num_frames, 1024)\n   */\n  Ort::Value RunMimiEncoder(Ort::Value audio) const;\n\n  /**\n   * @param text_tokens an int64 tensor of shape (1, num_tokens)\n   * @return a float32 tensor of shape (1, num_tokens, 1024)\n   */\n  Ort::Value RunTextConditioner(Ort::Value text_tokens) const;\n\n  Ort::Value RunLmFlow(Ort::Value c, Ort::Value s, Ort::Value t,\n                       Ort::Value x) const;\n\n  std::tuple<Ort::Value, Ort::Value, PocketLmMainState> RunLmMain(\n      Ort::Value seq, Ort::Value embeddings, PocketLmMainState state) const;\n\n  std::pair<Ort::Value, PocketMimiDecoderState> RunMimiDecoder(\n      Ort::Value latent, PocketMimiDecoderState state) const;\n\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_POCKET_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-impl.cc\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-impl.h\"\n\n#include <algorithm>\n#include <array>\n#include <cinttypes>\n#include <cmath>\n#include <cstdint>\n#include <cstring>\n#include <limits>\n#include <numeric>\n#include <random>\n#include <sstream>\n#include <string>\n#include <string_view>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/normal-data-generator.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\nnamespace {\n\n// Minimum duration (in seconds) to prevent zero-length audio\nconstexpr float kMinDuration = 0.1f;\n\n// Maximum latent length to prevent excessive memory allocation and OOM.\nconstexpr int32_t kMaxLatentLen = 10000;\n\nconstexpr std::array<std::string_view, 5> kSupertonicAvailableLangs = {\n    \"en\", \"ko\", \"es\", \"pt\", \"fr\",\n};\n\nvoid GetLatentMaskFlat(const std::vector<int64_t> &wav_lengths,\n                       int32_t base_chunk_size, int32_t chunk_compress_factor,\n                       std::vector<float> *mask_flat,\n                       std::vector<int64_t> *mask_shape) {\n  const int32_t bsz = static_cast<int32_t>(wav_lengths.size());\n  int32_t wav_chunk_size = base_chunk_size * chunk_compress_factor;\n  std::vector<int64_t> latent_lengths;\n  latent_lengths.reserve(bsz);\n  for (auto len : wav_lengths) {\n    latent_lengths.push_back((len + wav_chunk_size - 1) / wav_chunk_size);\n  
}\n  LengthsToMask(latent_lengths, mask_flat, mask_shape);\n}\n\nSupertonicStyle ParseVoiceStyleFromBinary(const std::vector<char> &buf) {\n  constexpr size_t kHeaderSize = 6 * sizeof(int64_t);\n  constexpr size_t kMaxPayloadBytes = 64 * 1024 * 1024;\n\n  if (buf.size() < kHeaderSize) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid voice style .bin: file too small (got %zu bytes, need %zu \"\n        \"header)\",\n        buf.size(), kHeaderSize);\n    SHERPA_ONNX_EXIT(-1);\n  }\n  int64_t dims[6];\n  std::memcpy(dims, buf.data(), kHeaderSize);\n  for (int i = 0; i < 6; ++i) {\n    if (dims[i] <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid voice style .bin: dims[%d]=%\" PRId64 \" <= 0\", i,\n                       dims[i]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  auto mul3 = [](int64_t a, int64_t b, int64_t c, const char *name) -> size_t {\n    constexpr int64_t kMax = std::numeric_limits<int64_t>::max();\n    if (a <= 0 || b <= 0 || c <= 0 || a > kMax / b) {\n      SHERPA_ONNX_LOGE(\"Invalid voice style .bin: %s dims overflow\", name);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    int64_t ab = a * b;\n    if (ab > kMax / c) {\n      SHERPA_ONNX_LOGE(\"Invalid voice style .bin: %s dims overflow\", name);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    return static_cast<size_t>(ab * c);\n  };\n  size_t ttl_elems = mul3(dims[0], dims[1], dims[2], \"ttl\");\n  size_t dp_elems = mul3(dims[3], dims[4], dims[5], \"dp\");\n\n  size_t ttl_bytes = ttl_elems * sizeof(float);\n  size_t dp_bytes = dp_elems * sizeof(float);\n  if (ttl_bytes / sizeof(float) != ttl_elems ||\n      dp_bytes / sizeof(float) != dp_elems) {\n    SHERPA_ONNX_LOGE(\"Invalid voice style .bin: byte size overflow\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  size_t payload_bytes = ttl_bytes + dp_bytes;\n  if (payload_bytes < ttl_bytes || payload_bytes < dp_bytes) {\n    SHERPA_ONNX_LOGE(\"Invalid voice style .bin: payload size overflow\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (payload_bytes > kMaxPayloadBytes) {\n    
SHERPA_ONNX_LOGE(\n        \"Invalid voice style .bin: payload too large (%zu bytes, max %zu)\",\n        payload_bytes, kMaxPayloadBytes);\n    SHERPA_ONNX_EXIT(-1);\n  }\n  size_t expected_total = kHeaderSize + payload_bytes;\n  if (expected_total < kHeaderSize) {\n    SHERPA_ONNX_LOGE(\"Invalid voice style .bin: total size overflow\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (buf.size() != expected_total) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid voice style .bin: size mismatch (got %zu bytes, expected \"\n        \"exactly %zu)\",\n        buf.size(), expected_total);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  std::vector<int64_t> ttl_shape = {dims[0], dims[1], dims[2]};\n  std::vector<int64_t> dp_shape = {dims[3], dims[4], dims[5]};\n  std::vector<float> ttl_data(ttl_elems);\n  std::memcpy(ttl_data.data(), buf.data() + kHeaderSize, ttl_bytes);\n  std::vector<float> dp_data(dp_elems);\n  std::memcpy(dp_data.data(), buf.data() + kHeaderSize + ttl_bytes, dp_bytes);\n\n  SupertonicStyle style;\n  style.ttl_data = std::move(ttl_data);\n  style.dp_data = std::move(dp_data);\n  style.ttl_shape = std::move(ttl_shape);\n  style.dp_shape = std::move(dp_shape);\n  return style;\n}\n}  // namespace\n\nOfflineTtsSupertonicImpl::OfflineTtsSupertonicImpl(\n    const OfflineTtsConfig &config)\n    : config_(config),\n      model_(std::make_unique<OfflineTtsSupertonicModel>(config.model)),\n      text_processor_(std::make_unique<SupertonicUnicodeProcessor>(\n          config.model.supertonic.unicode_indexer)),\n      memory_info_(\n          Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)) {\n  std::vector<char> buf = ReadFile(config.model.supertonic.voice_style);\n  if (buf.empty()) {\n    SHERPA_ONNX_LOGE(\"Failed to read voice style file: %s\",\n                     config.model.supertonic.voice_style.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  InitVoiceStyle(buf);\n}\n\ntemplate <typename Manager>\nOfflineTtsSupertonicImpl::OfflineTtsSupertonicImpl(\n    
Manager *mgr, const OfflineTtsConfig &config)\n    : config_(config),\n      model_(std::make_unique<OfflineTtsSupertonicModel>(mgr, config.model)),\n      text_processor_(std::make_unique<SupertonicUnicodeProcessor>(\n          mgr, config.model.supertonic.unicode_indexer)),\n      memory_info_(\n          Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)) {\n  std::vector<char> buf = ReadFile(mgr, config.model.supertonic.voice_style);\n  if (buf.empty()) {\n    SHERPA_ONNX_LOGE(\"Failed to read voice style file: %s\",\n                     config.model.supertonic.voice_style.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  InitVoiceStyle(buf);\n}\n\nint32_t OfflineTtsSupertonicImpl::SampleRate() const {\n  return model_->GetSampleRate();\n}\n\nGeneratedAudio OfflineTtsSupertonicImpl::Generate(\n    const std::string &text, int64_t sid, float speed,\n    GeneratedAudioCallback callback) const {\n  GenerationConfig config;\n  config.sid = sid;\n  config.speed = speed;\n  return Generate(text, config, callback);\n}\n\nGeneratedAudio OfflineTtsSupertonicImpl::Generate(\n    const std::string &text, const GenerationConfig &config,\n    GeneratedAudioCallback callback) const {\n  // Supported extra options in config.extra:\n  //   - \"speed\" (float): Speech speed factor (default: 1.05)\n  //   - \"num_steps\" (int): Number of denoising steps (default: 5)\n  //   - \"lang\" (string): Language code, e.g. \"en\", \"ko\" (default: \"en\")\n  //   - sid selects speaker from voice.bin (0 .. NumSpeakers()-1).\n  //   - \"max_len\" (int): Max chunk length. Default: 300 (non-Korean), 120 (ko).\n  //   - \"silence_duration\" (float): Silence in seconds between chunks (default:\n  //   0.3)\n  //   - \"seed\" (int): RNG seed for reproducibility. 
-1 = random (default).\n\n  if (config_.model.debug) {\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n  }\n  int32_t seed = config.GetExtraInt(\"seed\", -1);\n  float speed =\n      config.GetExtraFloat(\"speed\", config.speed > 0 ? config.speed : 1.05f);\n  int32_t num_steps = config.GetExtraInt(\n      \"num_steps\", config.num_steps > 0 ? config.num_steps : 5);\n  if (speed <= 0) {\n    SHERPA_ONNX_LOGE(\"Speed must be > 0. Given: %f\", speed);\n    return {};\n  }\n  if (num_steps <= 0) {\n    SHERPA_ONNX_LOGE(\"Num steps must be > 0. Given: %d\", num_steps);\n    return {};\n  }\n  std::string text_single = Trim(text);\n  if (text_single.empty()) {\n    return {};\n  }\n\n  int64_t sid = config.sid;\n  if (sid >= num_speakers_ || sid < 0) {\n    SHERPA_ONNX_LOGE(\n        \"Model has %d speaker(s). sid must be in [0, %d]. Given sid=%d, \"\n        \"using 0\",\n        num_speakers_, num_speakers_ - 1, static_cast<int32_t>(sid));\n    sid = 0;\n  }\n\n  std::string lang = config.GetExtraString(\"lang\", \"en\");\n  bool lang_ok = std::any_of(kSupertonicAvailableLangs.begin(),\n                             kSupertonicAvailableLangs.end(),\n                             [&](std::string_view s) { return s == lang; });\n  if (!lang_ok) {\n    SHERPA_ONNX_LOGE(\"Invalid language: %s. Available: en, ko, es, pt, fr\",\n                     lang.c_str());\n    return {};\n  }\n\n  float silence_duration = config.GetExtraFloat(\"silence_duration\", 0.3f);\n  size_t max_len =\n      (lang == \"ko\") ? static_cast<size_t>(config.GetExtraInt(\"max_len\", 120))\n                     : static_cast<size_t>(config.GetExtraInt(\"max_len\", 300));\n  if (max_len == 0) {\n    SHERPA_ONNX_LOGE(\"Max length must be > 0. 
Given: %zu\", max_len);\n    return {};\n  }\n  auto text_chunks = ChunkText(text_single, max_len);\n  return ProcessChunksAndConcatenate(text_chunks, lang, sid, num_steps, speed,\n                                     silence_duration, seed, callback);\n}\n\nGeneratedAudio OfflineTtsSupertonicImpl::Process(\n    const std::string &text, const std::string &lang, int64_t sid,\n    int32_t num_steps, float speed, NormalDataGenerator &gen) const {\n  const auto &cfg = model_->GetConfig();\n  StyleSliceView slice = GetStyleSliceForSid(sid);\n  const int32_t bsz = 1;\n\n  std::vector<int64_t> text_ids;\n  std::vector<float> text_mask_flat;\n  std::vector<int64_t> text_mask_shape;\n  text_processor_->Process(text, lang, &text_ids, &text_mask_flat,\n                           &text_mask_shape);\n  if (text_ids.empty() || text_mask_flat.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"Text processing failed: empty text_ids or text_mask. Text: \\\"%s\\\"\",\n        text.c_str());\n    return {};\n  }\n  if (text_mask_shape.size() != 3) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid text_mask_shape size: %zu (expected 3). Text: \\\"%s\\\"\",\n        text_mask_shape.size(), text.c_str());\n    return {};\n  }\n  int64_t text_seq_len = static_cast<int64_t>(text_ids.size());\n  int64_t text_mask_len = text_mask_shape[2];\n  if (text_seq_len != text_mask_len) {\n    SHERPA_ONNX_LOGE(\"Text sequence length mismatch: text_ids=%\" PRId64\n                     \", text_mask=%\" PRId64 \". 
Text: \\\"%s\\\"\",\n                     text_seq_len, text_mask_len, text.c_str());\n    return {};\n  }\n\n  std::vector<int64_t> text_ids_shape = {1, text_seq_len};\n\n  Ort::Value text_ids_tensor = Ort::Value::CreateTensor<int64_t>(\n      memory_info_, text_ids.data(), text_ids.size(), text_ids_shape.data(),\n      text_ids_shape.size());\n  Ort::Value style_dp_tensor = Ort::Value::CreateTensor<float>(\n      memory_info_, const_cast<float *>(slice.dp_data), slice.dp_size,\n      slice.dp_shape.data(), slice.dp_shape.size());\n  Ort::Value text_mask_tensor = Ort::Value::CreateTensor<float>(\n      memory_info_, text_mask_flat.data(), text_mask_flat.size(),\n      text_mask_shape.data(), text_mask_shape.size());\n  Ort::Value dp_output = model_->RunDurationPredictor(\n      std::move(text_ids_tensor), std::move(style_dp_tensor),\n      std::move(text_mask_tensor));\n  auto dp_output_info = dp_output.GetTensorTypeAndShapeInfo();\n  size_t dp_element_count = dp_output_info.GetElementCount();\n  if (dp_element_count != 1) {\n    SHERPA_ONNX_LOGE(\n        \"Duration predictor output size mismatch: expected 1, got %zu. 
Text: \"\n        \"\\\"%s\\\"\",\n        dp_element_count, text.c_str());\n    return {};\n  }\n  auto *dur_data = dp_output.GetTensorMutableData<float>();\n  std::vector<float> duration(dur_data, dur_data + 1);\n  if (speed != 1.0f) {\n    for (auto &dur : duration) {\n      dur /= speed;\n      if (dur < kMinDuration) {\n        dur = kMinDuration;\n      }\n    }\n  }\n\n  Ort::Value text_enc_output = model_->RunTextEncoder(\n      Ort::Value::CreateTensor<int64_t>(memory_info_, text_ids.data(),\n                                        text_ids.size(), text_ids_shape.data(),\n                                        text_ids_shape.size()),\n      Ort::Value::CreateTensor<float>(\n          memory_info_, const_cast<float *>(slice.ttl_data), slice.ttl_size,\n          slice.ttl_shape.data(), slice.ttl_shape.size()),\n      Ort::Value::CreateTensor<float>(\n          memory_info_, text_mask_flat.data(), text_mask_flat.size(),\n          text_mask_shape.data(), text_mask_shape.size()));\n  auto text_emb_info = text_enc_output.GetTensorTypeAndShapeInfo();\n  size_t text_emb_size = text_emb_info.GetElementCount();\n  if (text_emb_size == 0) {\n    SHERPA_ONNX_LOGE(\"Text encoder output is empty. 
Text: \\\"%s\\\"\",\n                     text.c_str());\n    return {};\n  }\n  auto *text_emb_data = text_enc_output.GetTensorMutableData<float>();\n  auto text_emb_shape = text_emb_info.GetShape();\n\n  float wav_len_max =\n      *std::max_element(duration.begin(), duration.end()) * cfg.ae.sample_rate;\n  std::vector<int64_t> wav_lengths;\n  wav_lengths.reserve(bsz);\n  for (float d : duration) {\n    int64_t wav_len = static_cast<int64_t>(d * cfg.ae.sample_rate);\n    if (wav_len < 1) {\n      wav_len = 1;\n    }\n    wav_lengths.push_back(wav_len);\n  }\n  int32_t chunk_size = cfg.ae.base_chunk_size * cfg.ttl.chunk_compress_factor;\n  int32_t latent_len =\n      static_cast<int32_t>((wav_len_max + chunk_size - 1) / chunk_size);\n  if (latent_len > kMaxLatentLen) {\n    SHERPA_ONNX_LOGE(\n        \"Latent length (%d) exceeds maximum (%d), capping to prevent OOM\",\n        latent_len, kMaxLatentLen);\n    latent_len = kMaxLatentLen;\n  }\n\n  int32_t latent_dim = cfg.ttl.latent_dim * cfg.ttl.chunk_compress_factor;\n  size_t latent_total_size = static_cast<size_t>(bsz) *\n                             static_cast<size_t>(latent_dim) *\n                             static_cast<size_t>(latent_len);\n  if (latent_total_size / static_cast<size_t>(bsz) /\n          static_cast<size_t>(latent_dim) !=\n      static_cast<size_t>(latent_len)) {\n    SHERPA_ONNX_LOGE(\n        \"Latent total size overflow: bsz=%d, latent_dim=%d, latent_len=%d. 
\"\n        \"Text: \\\"%s\\\"\",\n        bsz, latent_dim, latent_len, text.c_str());\n    return {};\n  }\n\n  std::vector<float> xt_flat(latent_total_size);\n\n  gen.Fill(xt_flat.data(), xt_flat.size());\n\n  std::vector<float> latent_mask_flat;\n  std::vector<int64_t> latent_mask_shape;\n  GetLatentMaskFlat(wav_lengths, cfg.ae.base_chunk_size,\n                    cfg.ttl.chunk_compress_factor, &latent_mask_flat,\n                    &latent_mask_shape);\n  int64_t latent_mask_len = latent_mask_shape[2];\n  if (latent_mask_len != latent_len) {\n    SHERPA_ONNX_LOGE(\"Latent mask length mismatch: expected %d, got %\" PRId64\n                     \". Text: \\\"%s\\\"\",\n                     latent_len, latent_mask_len, text.c_str());\n    return {};\n  }\n  for (int32_t b = 0; b < bsz; ++b) {\n    const float *mask_batch = latent_mask_flat.data() + b * latent_mask_len;\n    float *xt_batch = xt_flat.data() + b * latent_dim * latent_len;\n    for (int32_t d = 0; d < latent_dim; ++d) {\n      float *xt_dim = xt_batch + d * latent_len;\n      for (int32_t t = 0; t < latent_len; ++t) {\n        xt_dim[t] *= mask_batch[t];\n      }\n    }\n  }\n\n  std::vector<int64_t> latent_shape = {bsz, latent_dim, latent_len};\n  std::vector<float> total_step_vec(bsz, static_cast<float>(num_steps));\n  std::array<int64_t, 1> step_shape = {bsz};\n\n  // Constant inputs: create once outside loop, keep text_enc_output alive.\n  Ort::Value text_emb_const = Ort::Value::CreateTensor<float>(\n      memory_info_, text_emb_data, text_emb_size, text_emb_shape.data(),\n      text_emb_shape.size());\n  Ort::Value style_ttl_const = Ort::Value::CreateTensor<float>(\n      memory_info_, const_cast<float *>(slice.ttl_data), slice.ttl_size,\n      slice.ttl_shape.data(), slice.ttl_shape.size());\n  Ort::Value text_mask_const = Ort::Value::CreateTensor<float>(\n      memory_info_, text_mask_flat.data(), text_mask_flat.size(),\n      text_mask_shape.data(), text_mask_shape.size());\n  Ort::Value 
latent_mask_const = Ort::Value::CreateTensor<float>(\n      memory_info_, latent_mask_flat.data(), latent_mask_flat.size(),\n      latent_mask_shape.data(), latent_mask_shape.size());\n  Ort::Value total_step_const = Ort::Value::CreateTensor<float>(\n      memory_info_, total_step_vec.data(), total_step_vec.size(),\n      step_shape.data(), step_shape.size());\n\n  float current_step = 0.f;\n  for (int32_t step = 0; step < num_steps; step++) {\n    current_step = static_cast<float>(step);\n    Ort::Value noisy_latent_tensor = Ort::Value::CreateTensor<float>(\n        memory_info_, xt_flat.data(), xt_flat.size(), latent_shape.data(),\n        latent_shape.size());\n    Ort::Value current_step_tensor = Ort::Value::CreateTensor<float>(\n        memory_info_, &current_step, 1, step_shape.data(), step_shape.size());\n\n    Ort::Value vector_est_output = model_->RunVectorEstimator(\n        std::move(noisy_latent_tensor), std::move(current_step_tensor),\n        text_emb_const, style_ttl_const, latent_mask_const, text_mask_const,\n        total_step_const);\n    auto vector_est_output_info = vector_est_output.GetTensorTypeAndShapeInfo();\n    size_t denoised_size = vector_est_output_info.GetElementCount();\n    if (denoised_size != latent_total_size) {\n      SHERPA_ONNX_LOGE(\n          \"Denoised latent size mismatch at step %d: expected %zu, got %zu. 
\"\n          \"Text: \\\"%s\\\"\",\n          step, latent_total_size, denoised_size, text.c_str());\n      return {};\n    }\n    auto *denoised_data = vector_est_output.GetTensorMutableData<float>();\n    std::memcpy(xt_flat.data(), denoised_data,\n                latent_total_size * sizeof(float));\n  }\n\n  Ort::Value latent_tensor = Ort::Value::CreateTensor<float>(\n      memory_info_, xt_flat.data(), xt_flat.size(), latent_shape.data(),\n      latent_shape.size());\n  Ort::Value vocoder_output = model_->RunVocoder(std::move(latent_tensor));\n  auto wav_info = vocoder_output.GetTensorTypeAndShapeInfo();\n  auto wav_shape = wav_info.GetShape();\n  size_t wav_size = wav_info.GetElementCount();\n  if (wav_size == 0) {\n    SHERPA_ONNX_LOGE(\"Vocoder output is empty. Text: \\\"%s\\\"\", text.c_str());\n    return {};\n  }\n\n  auto *wav_data = vocoder_output.GetTensorMutableData<float>();\n  if (config_.model.debug) {\n    std::ostringstream os;\n    os << \"Vocoder output shape: [\";\n    for (size_t i = 0; i < wav_shape.size(); ++i) {\n      if (i > 0) os << \", \";\n      os << wav_shape[i];\n    }\n    os << \"], total elements: \" << wav_size << \", bsz: \" << bsz;\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n\n  GeneratedAudio result;\n  if ((wav_shape.size() == 2 && wav_shape[0] == bsz) ||\n      (wav_shape.size() == 3 && wav_shape[0] == bsz && wav_shape[1] == 1)) {\n    int64_t samples_per_batch =\n        (wav_shape.size() == 2) ? 
wav_shape[1] : wav_shape[2];\n    result.samples.reserve(static_cast<size_t>(std::accumulate(\n        wav_lengths.begin(), wav_lengths.end(), static_cast<int64_t>(0))));\n    for (int32_t b = 0; b < bsz; ++b) {\n      int64_t actual_len = wav_lengths[b];\n      if (actual_len > samples_per_batch) {\n        actual_len = samples_per_batch;\n      }\n      const float *batch_wav = wav_data + b * samples_per_batch;\n      result.samples.insert(result.samples.end(), batch_wav,\n                            batch_wav + actual_len);\n    }\n  } else if (wav_shape.size() == 1 ||\n             (wav_shape.size() == 2 && wav_shape[0] == 1)) {\n    result.samples.assign(wav_data, wav_data + wav_size);\n  } else {\n    std::ostringstream os;\n    os << \"Unexpected vocoder output shape: [\";\n    for (size_t i = 0; i < wav_shape.size(); ++i) {\n      if (i > 0) os << \", \";\n      os << wav_shape[i];\n    }\n    os << \"], bsz=\" << bsz << \", using all samples\";\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    result.samples.assign(wav_data, wav_data + wav_size);\n  }\n  if (config_.model.debug && !result.samples.empty()) {\n    float max_abs = 0.f;\n    float min_abs = std::abs(result.samples[0]);\n    for (float x : result.samples) {\n      float ax = std::abs(x);\n      max_abs = std::max(max_abs, ax);\n      min_abs = std::min(min_abs, ax);\n    }\n    SHERPA_ONNX_LOGE(\"Audio samples: %zu, min_abs=%.6f, max_abs=%.6f\",\n                     result.samples.size(), min_abs, max_abs);\n  }\n  result.sample_rate = cfg.ae.sample_rate;\n  return result;\n}\n\nGeneratedAudio OfflineTtsSupertonicImpl::ProcessChunksAndConcatenate(\n    const std::vector<std::string> &text_chunks, const std::string &lang,\n    int64_t sid, int32_t num_steps, float speed, float silence_duration,\n    int32_t seed, GeneratedAudioCallback callback) const {\n  NormalDataGenerator gen(0, 1, seed);\n  GeneratedAudio result;\n  std::vector<std::vector<float>> chunk_samples;\n  
chunk_samples.reserve(text_chunks.size());\n  int32_t num_chunks = static_cast<int32_t>(text_chunks.size());\n  for (int32_t i = 0; i < num_chunks; ++i) {\n    auto chunk_result =\n        Process(text_chunks[i], lang, sid, num_steps, speed, gen);\n    if (chunk_result.samples.empty()) {\n      continue;\n    }\n    if (callback) {\n      float progress =\n          static_cast<float>(i + 1) / static_cast<float>(num_chunks);\n      callback(chunk_result.samples.data(), chunk_result.samples.size(),\n               progress);\n    }\n    chunk_samples.push_back(std::move(chunk_result.samples));\n  }\n\n  if (chunk_samples.empty()) {\n    result.sample_rate = model_->GetSampleRate();\n    return result;\n  }\n\n  int32_t sample_rate = model_->GetSampleRate();\n  size_t silence_len =\n      static_cast<size_t>(silence_duration * static_cast<float>(sample_rate));\n  size_t total = 0;\n  for (const auto &s : chunk_samples) {\n    total += s.size();\n  }\n  if (chunk_samples.size() > 1) {\n    total += (chunk_samples.size() - 1) * silence_len;\n  }\n\n  std::vector<float> wav_cat;\n  wav_cat.reserve(total);\n  for (size_t i = 0; i < chunk_samples.size(); ++i) {\n    if (i > 0) {\n      wav_cat.insert(wav_cat.end(), silence_len, 0.f);\n    }\n    wav_cat.insert(wav_cat.end(), chunk_samples[i].begin(),\n                   chunk_samples[i].end());\n  }\n  result.samples = std::move(wav_cat);\n  result.sample_rate = sample_rate;\n  return result;\n}\n\nvoid OfflineTtsSupertonicImpl::InitVoiceStyle(const std::vector<char> &buf) {\n  SupertonicStyle style = ParseVoiceStyleFromBinary(buf);\n  if (style.ttl_shape.size() != 3 || style.dp_shape.size() != 3) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid voice style: ttl_shape or dp_shape must have 3 dimensions\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  int32_t num_speakers = static_cast<int32_t>(style.ttl_shape[0]);\n  if (num_speakers <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid voice style: num_speakers must be >= 1. 
Given: %d\",\n        num_speakers);\n    SHERPA_ONNX_EXIT(-1);\n  }\n  if (style.ttl_shape[0] != style.dp_shape[0]) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid voice style: ttl_shape[0] != dp_shape[0]. Given: %d != %d\",\n        static_cast<int32_t>(style.ttl_shape[0]),\n        static_cast<int32_t>(style.dp_shape[0]));\n    SHERPA_ONNX_EXIT(-1);\n  }\n  num_speakers_ = num_speakers;\n  full_style_ = std::move(style);\n\n  if (config_.model.debug) {\n    SHERPA_ONNX_LOGE(\"Number of speakers: %d\", num_speakers_);\n  }\n}\n\nOfflineTtsSupertonicImpl::StyleSliceView\nOfflineTtsSupertonicImpl::GetStyleSliceForSid(int64_t sid) const {\n  StyleSliceView out;\n  int32_t s = 0;\n  if (num_speakers_ != 1) {\n    int64_t hi = static_cast<int64_t>(num_speakers_ - 1);\n    int64_t clamped = std::clamp<int64_t>(sid, 0, hi);\n    s = static_cast<int32_t>(clamped);\n  }\n  const SupertonicStyle &full = full_style_;\n  out.ttl_shape = {1, full.ttl_shape[1], full.ttl_shape[2]};\n  out.dp_shape = {1, full.dp_shape[1], full.dp_shape[2]};\n  size_t ttl_slice = static_cast<size_t>(out.ttl_shape[1] * out.ttl_shape[2]);\n  size_t dp_slice = static_cast<size_t>(out.dp_shape[1] * out.dp_shape[2]);\n  out.ttl_size = ttl_slice;\n  out.dp_size = dp_slice;\n  out.ttl_data = full.ttl_data.data() + static_cast<size_t>(s) * ttl_slice;\n  out.dp_data = full.dp_data.data() + static_cast<size_t>(s) * dp_slice;\n  return out;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsSupertonicImpl::OfflineTtsSupertonicImpl(\n    AAssetManager *mgr, const OfflineTtsConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsSupertonicImpl::OfflineTtsSupertonicImpl(\n    NativeResourceManager *mgr, const OfflineTtsConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-impl.h\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_IMPL_H_\n\n#include <array>\n#include <memory>\n#include <random>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/normal-data-generator.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-model.h\"\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsSupertonicImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsSupertonicImpl(const OfflineTtsConfig &config);\n\n  template <typename Manager>\n  OfflineTtsSupertonicImpl(Manager *mgr, const OfflineTtsConfig &config);\n\n  int32_t SampleRate() const override;\n\n  int32_t NumSpeakers() const override { return num_speakers_; }\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n      GeneratedAudioCallback callback = nullptr) const override;\n\n  GeneratedAudio Generate(\n      const std::string &text, const GenerationConfig &config,\n      GeneratedAudioCallback callback = nullptr) const override;\n\n private:\n  GeneratedAudio Process(const std::string &text, const std::string &lang,\n                         int64_t sid, int32_t num_steps, float speed,\n                         NormalDataGenerator &gen) const;\n\n  GeneratedAudio ProcessChunksAndConcatenate(\n      const std::vector<std::string> &text_chunks, const std::string &lang,\n      int64_t sid, int32_t num_steps, float speed, float silence_duration,\n      int32_t seed, 
GeneratedAudioCallback callback) const;\n\n  void InitVoiceStyle(const std::vector<char> &buf);\n\n  struct StyleSliceView {\n    const float *ttl_data;\n    size_t ttl_size;\n    std::array<int64_t, 3> ttl_shape;\n    const float *dp_data;\n    size_t dp_size;\n    std::array<int64_t, 3> dp_shape;\n  };\n  StyleSliceView GetStyleSliceForSid(int64_t sid) const;\n\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsSupertonicModel> model_;\n  std::unique_ptr<SupertonicUnicodeProcessor> text_processor_;\n  int32_t num_speakers_ = 0;\n  SupertonicStyle full_style_;  // shape [num_speakers_, ...]\n  Ort::MemoryInfo memory_info_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-model-config.cc\n//\n// Copyright (c)  2026 zengyw\n\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsSupertonicModelConfig::Register(ParseOptions *po) {\n  po->Register(\"supertonic-duration-predictor\", &duration_predictor,\n               \"Path to duration_predictor.onnx for Supertonic TTS\");\n  po->Register(\"supertonic-text-encoder\", &text_encoder,\n               \"Path to text_encoder.onnx for Supertonic TTS\");\n  po->Register(\"supertonic-vector-estimator\", &vector_estimator,\n               \"Path to vector_estimator.onnx for Supertonic TTS\");\n  po->Register(\"supertonic-vocoder\", &vocoder,\n               \"Path to vocoder.onnx for Supertonic TTS\");\n  po->Register(\"supertonic-tts-json\", &tts_json,\n               \"Path to tts.json for Supertonic TTS\");\n  po->Register(\"supertonic-unicode-indexer\", &unicode_indexer,\n               \"Path to unicode_indexer.bin for Supertonic TTS\");\n  po->Register(\"supertonic-voice-style\", &voice_style,\n               \"Path to Supertonic voice.bin (use sid 0..NumSpeakers()-1 to \"\n               \"select)\");\n}\n\nbool OfflineTtsSupertonicModelConfig::Validate() const {\n  if (duration_predictor.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-duration-predictor\");\n    return false;\n  }\n  if (!FileExists(duration_predictor)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-duration-predictor '%s' does not exist\",\n                     duration_predictor.c_str());\n    return false;\n  }\n\n  if (text_encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-text-encoder\");\n    return false;\n  }\n  if (!FileExists(text_encoder)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-text-encoder '%s' does not exist\",\n                     
text_encoder.c_str());\n    return false;\n  }\n\n  if (vector_estimator.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-vector-estimator\");\n    return false;\n  }\n  if (!FileExists(vector_estimator)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-vector-estimator '%s' does not exist\",\n                     vector_estimator.c_str());\n    return false;\n  }\n\n  if (vocoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-vocoder\");\n    return false;\n  }\n  if (!FileExists(vocoder)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-vocoder '%s' does not exist\",\n                     vocoder.c_str());\n    return false;\n  }\n\n  if (tts_json.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-tts-json\");\n    return false;\n  }\n  if (!FileExists(tts_json)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-tts-json '%s' does not exist\",\n                     tts_json.c_str());\n    return false;\n  }\n\n  if (unicode_indexer.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-unicode-indexer\");\n    return false;\n  }\n  if (!FileExists(unicode_indexer)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-unicode-indexer '%s' does not exist\",\n                     unicode_indexer.c_str());\n    return false;\n  }\n\n  if (voice_style.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --supertonic-voice-style\");\n    return false;\n  }\n  if (!FileExists(voice_style)) {\n    SHERPA_ONNX_LOGE(\"--supertonic-voice-style '%s' does not exist\",\n                     voice_style.c_str());\n    return false;\n  }\n  return true;\n}\n\nstd::string OfflineTtsSupertonicModelConfig::ToString() const {\n  std::ostringstream os;\n  os << \"OfflineTtsSupertonicModelConfig(\";\n  os << \"duration_predictor=\\\"\" << duration_predictor << \"\\\", \";\n  os << \"text_encoder=\\\"\" << text_encoder << \"\\\", \";\n  os << \"vector_estimator=\\\"\" << vector_estimator << \"\\\", \";\n  os << \"vocoder=\\\"\" << vocoder << \"\\\", \";\n  os << \"tts_json=\\\"\" << 
tts_json << \"\\\", \";\n  os << \"unicode_indexer=\\\"\" << unicode_indexer << \"\\\", \";\n  os << \"voice_style=\\\"\" << voice_style << \"\\\")\";\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-model-config.h\n//\n// Copyright (c)  2026 zengyw\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsSupertonicModelConfig {\n  // Individual model file paths\n  std::string duration_predictor;\n  std::string text_encoder;\n  std::string vector_estimator;\n  std::string vocoder;\n\n  // Path to tts.json (TTS config: ae.sample_rate, ae.base_chunk_size, etc.)\n  std::string tts_json;\n\n  // Path to unicode_indexer.bin (raw int32 array)\n  std::string unicode_indexer;\n\n  // Path to voice.bin\n  std::string voice_style;\n\n  OfflineTtsSupertonicModelConfig() = default;\n\n  OfflineTtsSupertonicModelConfig(const std::string &duration_predictor,\n                                  const std::string &text_encoder,\n                                  const std::string &vector_estimator,\n                                  const std::string &vocoder,\n                                  const std::string &tts_json,\n                                  const std::string &unicode_indexer,\n                                  const std::string &voice_style)\n      : duration_predictor(duration_predictor),\n        text_encoder(text_encoder),\n        vector_estimator(vector_estimator),\n        vocoder(vocoder),\n        tts_json(tts_json),\n        unicode_indexer(unicode_indexer),\n        voice_style(voice_style) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-model.cc\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-model.h\"\n\n#include <functional>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"nlohmann/json.hpp\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nusing json = nlohmann::json;\n\nclass OfflineTtsSupertonicModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)) {\n    Init(mgr);\n  }\n\n  const SupertonicConfig &GetConfig() const { return cfg_; }\n  int32_t GetSampleRate() const { return cfg_.ae.sample_rate; }\n\n  Ort::Value RunDurationPredictor(Ort::Value text_ids, Ort::Value style_dp,\n                                  Ort::Value text_mask) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(text_ids));\n    inputs.push_back(std::move(style_dp));\n    inputs.push_back(std::move(text_mask));\n    auto outputs =\n        dp_sess_->Run(Ort::RunOptions{nullptr}, dp_input_names_ptr_.data(),\n                      inputs.data(), inputs.size(), 
dp_output_names_ptr_.data(),\n                      dp_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  Ort::Value RunTextEncoder(Ort::Value text_ids, Ort::Value style_ttl,\n                            Ort::Value text_mask) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(text_ids));\n    inputs.push_back(std::move(style_ttl));\n    inputs.push_back(std::move(text_mask));\n    auto outputs = text_enc_sess_->Run(\n        Ort::RunOptions{nullptr}, text_enc_input_names_ptr_.data(),\n        inputs.data(), inputs.size(), text_enc_output_names_ptr_.data(),\n        text_enc_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  Ort::Value RunVectorEstimator(Ort::Value noisy_latent,\n                                Ort::Value current_step, Ort::Value &text_emb,\n                                Ort::Value &style_ttl, Ort::Value &latent_mask,\n                                Ort::Value &text_mask,\n                                Ort::Value &total_step) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(noisy_latent));\n    inputs.push_back(View(&text_emb));\n    inputs.push_back(View(&style_ttl));\n    inputs.push_back(View(&latent_mask));\n    inputs.push_back(View(&text_mask));\n    inputs.push_back(std::move(current_step));\n    inputs.push_back(View(&total_step));\n    auto outputs = vector_est_sess_->Run(\n        Ort::RunOptions{nullptr}, vector_est_input_names_ptr_.data(),\n        inputs.data(), inputs.size(), vector_est_output_names_ptr_.data(),\n        vector_est_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  Ort::Value RunVocoder(Ort::Value latent) const {\n    std::vector<Ort::Value> inputs;\n    inputs.push_back(std::move(latent));\n    auto outputs = vocoder_sess_->Run(\n        Ort::RunOptions{nullptr}, vocoder_input_names_ptr_.data(),\n        inputs.data(), inputs.size(), vocoder_output_names_ptr_.data(),\n        
vocoder_output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n private:\n  void PrintModelInfo(Ort::Session *sess, const std::string &name) const {\n    if (!config_.debug) {\n      return;\n    }\n    std::vector<std::string> input_names, output_names;\n    std::vector<const char *> input_names_ptr, output_names_ptr;\n    GetInputNames(sess, &input_names, &input_names_ptr);\n    GetOutputNames(sess, &output_names, &output_names_ptr);\n    std::ostringstream os;\n    os << \"----------\" << name << \"----------\\n\";\n    os << \"Input names: \";\n    for (const auto &n : input_names) os << n << \" \";\n    os << \"\\nOutput names: \";\n    for (const auto &n : output_names) os << n << \" \";\n    os << \"\\n\";\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n  }\n\n  void PrintDebugInfo(const std::string &tts_config_path) const {\n    if (!config_.debug) {\n      return;\n    }\n    std::ostringstream os;\n    os << \"---supertonic model---\\n\";\n    os << \"tts_config: \" << tts_config_path << \"\\n\";\n    os << \"sample_rate: \" << cfg_.ae.sample_rate << \"\\n\";\n    os << \"base_chunk_size: \" << cfg_.ae.base_chunk_size << \"\\n\";\n    os << \"chunk_compress_factor: \" << cfg_.ttl.chunk_compress_factor << \"\\n\";\n    os << \"latent_dim: \" << cfg_.ttl.latent_dim << \"\\n\";\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n  }\n\n  void PrintModelInfos() const {\n    if (!config_.debug) {\n      return;\n    }\n    PrintModelInfo(dp_sess_.get(), \"duration_predictor\");\n    PrintModelInfo(text_enc_sess_.get(), \"text_encoder\");\n    PrintModelInfo(vector_est_sess_.get(), \"vector_estimator\");\n    PrintModelInfo(vocoder_sess_.get(), \"vocoder\");\n  }\n\n  void InitDurationPredictor(void *model_data, size_t model_data_length) {\n    if (model_data) {\n   
   dp_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                model_data_length, sess_opts_);\n    } else if (!dp_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize duration predictor session \"\n          \"outside of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    GetInputNames(dp_sess_.get(), &dp_input_names_, &dp_input_names_ptr_);\n    GetOutputNames(dp_sess_.get(), &dp_output_names_, &dp_output_names_ptr_);\n  }\n\n  void InitTextEncoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      text_enc_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!text_enc_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize text encoder session outside \"\n          \"of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    GetInputNames(text_enc_sess_.get(), &text_enc_input_names_,\n                  &text_enc_input_names_ptr_);\n    GetOutputNames(text_enc_sess_.get(), &text_enc_output_names_,\n                   &text_enc_output_names_ptr_);\n  }\n\n  void InitVectorEstimator(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      vector_est_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!vector_est_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize vector estimator session \"\n          \"outside of this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    GetInputNames(vector_est_sess_.get(), &vector_est_input_names_,\n                  &vector_est_input_names_ptr_);\n    GetOutputNames(vector_est_sess_.get(), &vector_est_output_names_,\n                   &vector_est_output_names_ptr_);\n  }\n\n  void InitVocoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      vocoder_sess_ = 
std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!vocoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize vocoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    GetInputNames(vocoder_sess_.get(), &vocoder_input_names_,\n                  &vocoder_input_names_ptr_);\n    GetOutputNames(vocoder_sess_.get(), &vocoder_output_names_,\n                   &vocoder_output_names_ptr_);\n  }\n\n  void LoadModels() {\n    dp_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config_.supertonic.duration_predictor),\n        sess_opts_);\n    InitDurationPredictor(nullptr, 0);\n\n    text_enc_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config_.supertonic.text_encoder),\n        sess_opts_);\n    InitTextEncoder(nullptr, 0);\n\n    vector_est_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config_.supertonic.vector_estimator),\n        sess_opts_);\n    InitVectorEstimator(nullptr, 0);\n\n    vocoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config_.supertonic.vocoder), sess_opts_);\n    InitVocoder(nullptr, 0);\n  }\n\n  template <typename Manager>\n  void LoadOneModel(Manager *mgr, const std::string &path,\n                    const char *model_name,\n                    const std::function<void(void *, size_t)> &init) {\n    auto buf = ReadFile(mgr, path);\n    if (buf.empty()) {\n      SHERPA_ONNX_LOGE(\"Failed to read %s model: %s\", model_name, path.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  void LoadModels(Manager *mgr) {\n    LoadOneModel(\n        mgr, config_.supertonic.duration_predictor, \"duration_predictor\",\n        [this](void *p, size_t len) { InitDurationPredictor(p, len); });\n    LoadOneModel(mgr, 
config_.supertonic.text_encoder, \"text_encoder\",\n                 [this](void *p, size_t len) { InitTextEncoder(p, len); });\n    LoadOneModel(mgr, config_.supertonic.vector_estimator, \"vector_estimator\",\n                 [this](void *p, size_t len) { InitVectorEstimator(p, len); });\n    LoadOneModel(mgr, config_.supertonic.vocoder, \"vocoder\",\n                 [this](void *p, size_t len) { InitVocoder(p, len); });\n  }\n\n  void Init() {\n    std::string tts_config_path =\n        ResolveAbsolutePath(config_.supertonic.tts_json);\n    LoadConfig(tts_config_path);\n    PrintDebugInfo(tts_config_path);\n    LoadModels();\n    PrintModelInfos();\n  }\n\n  template <typename Manager>\n  void Init(Manager *mgr) {\n    std::string tts_config_path =\n        ResolveAbsolutePath(config_.supertonic.tts_json);\n    LoadConfig(mgr, tts_config_path);\n    PrintDebugInfo(tts_config_path);\n    LoadModels(mgr);\n    PrintModelInfos();\n  }\n\n  void ParseConfig(const json &j) {\n    if (j.find(\"ae\") == j.end() || j.find(\"ttl\") == j.end()) {\n      SHERPA_ONNX_LOGE(\"Invalid config file: missing 'ae' or 'ttl' section\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    const auto &ae = j[\"ae\"];\n    const auto &ttl = j[\"ttl\"];\n    auto get_int = [](const json &obj, const char *key,\n                      const char *section) -> int32_t {\n      if (obj.find(key) == obj.end()) {\n        SHERPA_ONNX_LOGE(\"Invalid config: %s.%s missing\", section, key);\n        SHERPA_ONNX_EXIT(-1);\n      }\n      if (!obj[key].is_number_integer()) {\n        SHERPA_ONNX_LOGE(\"Invalid config: %s.%s must be integer\", section, key);\n        SHERPA_ONNX_EXIT(-1);\n      }\n      return obj[key].get<int32_t>();\n    };\n    cfg_.ae.sample_rate = get_int(ae, \"sample_rate\", \"ae\");\n    cfg_.ae.base_chunk_size = get_int(ae, \"base_chunk_size\", \"ae\");\n    cfg_.ttl.chunk_compress_factor =\n        get_int(ttl, \"chunk_compress_factor\", \"ttl\");\n    cfg_.ttl.latent_dim = 
get_int(ttl, \"latent_dim\", \"ttl\");\n    if (cfg_.ae.sample_rate <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid sample_rate: %d\", cfg_.ae.sample_rate);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    if (cfg_.ae.base_chunk_size <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid base_chunk_size: %d\", cfg_.ae.base_chunk_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    if (cfg_.ttl.chunk_compress_factor <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid chunk_compress_factor: %d\",\n                       cfg_.ttl.chunk_compress_factor);\n      SHERPA_ONNX_EXIT(-1);\n    }\n    if (cfg_.ttl.latent_dim <= 0) {\n      SHERPA_ONNX_LOGE(\"Invalid latent_dim: %d\", cfg_.ttl.latent_dim);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  static json LoadJsonFromBuffer(const std::vector<char> &buf) {\n    if (buf.empty()) {\n      SHERPA_ONNX_LOGE(\"Empty json buffer\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    try {\n      return json::parse(buf.begin(), buf.end());\n    } catch (const std::exception &e) {\n      SHERPA_ONNX_LOGE(\"Failed to parse JSON buffer: %s\", e.what());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    return json{};\n  }\n\n  void LoadConfig(const std::string &config_path) {\n    auto buf = ReadFile(config_path);\n    if (buf.empty()) {\n      SHERPA_ONNX_LOGE(\"Failed to read config: %s\", config_path.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    json j = LoadJsonFromBuffer(buf);\n    ParseConfig(j);\n  }\n\n  template <typename Manager>\n  void LoadConfig(Manager *mgr, const std::string &config_path) {\n    auto buf = ReadFile(mgr, config_path);\n    if (buf.empty()) {\n      SHERPA_ONNX_LOGE(\"Failed to read config: %s\", config_path.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    json j = LoadJsonFromBuffer(buf);\n    ParseConfig(j);\n  }\n\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  SupertonicConfig cfg_;\n\n  std::unique_ptr<Ort::Session> dp_sess_;\n  std::unique_ptr<Ort::Session> text_enc_sess_;\n  std::unique_ptr<Ort::Session> 
vector_est_sess_;\n  std::unique_ptr<Ort::Session> vocoder_sess_;\n\n  std::vector<std::string> dp_input_names_;\n  std::vector<const char *> dp_input_names_ptr_;\n  std::vector<std::string> dp_output_names_;\n  std::vector<const char *> dp_output_names_ptr_;\n\n  std::vector<std::string> text_enc_input_names_;\n  std::vector<const char *> text_enc_input_names_ptr_;\n  std::vector<std::string> text_enc_output_names_;\n  std::vector<const char *> text_enc_output_names_ptr_;\n\n  std::vector<std::string> vector_est_input_names_;\n  std::vector<const char *> vector_est_input_names_ptr_;\n  std::vector<std::string> vector_est_output_names_;\n  std::vector<const char *> vector_est_output_names_ptr_;\n\n  std::vector<std::string> vocoder_input_names_;\n  std::vector<const char *> vocoder_input_names_ptr_;\n  std::vector<std::string> vocoder_output_names_;\n  std::vector<const char *> vocoder_output_names_ptr_;\n};\n\nconst SupertonicConfig &OfflineTtsSupertonicModel::GetConfig() const {\n  return impl_->GetConfig();\n}\n\nint32_t OfflineTtsSupertonicModel::GetSampleRate() const {\n  return impl_->GetSampleRate();\n}\n\nOrt::Value OfflineTtsSupertonicModel::RunDurationPredictor(\n    Ort::Value text_ids, Ort::Value style_dp, Ort::Value text_mask) const {\n  return impl_->RunDurationPredictor(std::move(text_ids), std::move(style_dp),\n                                     std::move(text_mask));\n}\n\nOrt::Value OfflineTtsSupertonicModel::RunTextEncoder(\n    Ort::Value text_ids, Ort::Value style_ttl, Ort::Value text_mask) const {\n  return impl_->RunTextEncoder(std::move(text_ids), std::move(style_ttl),\n                               std::move(text_mask));\n}\n\nOrt::Value OfflineTtsSupertonicModel::RunVectorEstimator(\n    Ort::Value noisy_latent, Ort::Value current_step, Ort::Value &text_emb,\n    Ort::Value &style_ttl, Ort::Value &latent_mask, Ort::Value &text_mask,\n    Ort::Value &total_step) const {\n  return impl_->RunVectorEstimator(std::move(noisy_latent),\n       
                            std::move(current_step), text_emb, style_ttl,\n                                   latent_mask, text_mask, total_step);\n}\n\nOrt::Value OfflineTtsSupertonicModel::RunVocoder(Ort::Value latent) const {\n  return impl_->RunVocoder(std::move(latent));\n}\n\nOfflineTtsSupertonicModel::OfflineTtsSupertonicModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsSupertonicModel::OfflineTtsSupertonicModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsSupertonicModel::~OfflineTtsSupertonicModel() = default;\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsSupertonicModel::OfflineTtsSupertonicModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsSupertonicModel::OfflineTtsSupertonicModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-model.h\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct SupertonicConfig {\n  struct AEConfig {\n    int32_t sample_rate;\n    int32_t base_chunk_size;\n  } ae;\n\n  struct TTLConfig {\n    int32_t chunk_compress_factor;\n    int32_t latent_dim;\n  } ttl;\n};\n\nstruct SupertonicStyle {\n  std::vector<float> ttl_data;\n  std::vector<float> dp_data;\n  std::vector<int64_t> ttl_shape;\n  std::vector<int64_t> dp_shape;\n};\n\nclass OfflineTtsSupertonicModel {\n public:\n  ~OfflineTtsSupertonicModel();\n\n  explicit OfflineTtsSupertonicModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsSupertonicModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  const SupertonicConfig &GetConfig() const;\n  int32_t GetSampleRate() const;\n\n  Ort::Value RunDurationPredictor(Ort::Value text_ids, Ort::Value style_dp,\n                                  Ort::Value text_mask) const;\n  Ort::Value RunTextEncoder(Ort::Value text_ids, Ort::Value style_ttl,\n                            Ort::Value text_mask) const;\n\n  Ort::Value RunVectorEstimator(Ort::Value noisy_latent,\n                                Ort::Value current_step, Ort::Value &text_emb,\n                                Ort::Value &style_ttl, Ort::Value &latent_mask,\n                                Ort::Value &text_mask,\n                                Ort::Value &total_step) const;\n  Ort::Value RunVocoder(Ort::Value latent) const;\n\n private:\n  
class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.cc\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.h\"\n\n#include <array>\n#include <cctype>\n#include <cstddef>\n#include <cstdint>\n#include <cstring>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\nnamespace {\n\n// Hangul syllable decomposition constants (Unicode Standard Annex #15)\nstatic constexpr uint32_t kHangulSbase = 0xAC00;  // Start of Hangul syllables\nstatic constexpr uint32_t kHangulLbase = 0x1100;  // Start of Hangul Jamo\nstatic constexpr uint32_t kHangulVbase = 0x1161;  // Start of Hangul vowels\nstatic constexpr uint32_t kHangulTbase = 0x11A7;  // Start of Hangul trailing\nstatic constexpr int32_t kHangulLcount = 19;\nstatic constexpr int32_t kHangulVcount = 21;\nstatic constexpr int32_t kHangulTcount = 28;\nstatic constexpr int32_t kHangulNcount = kHangulVcount * kHangulTcount;  // 588\nstatic constexpr int32_t kHangulScount =\n    kHangulLcount * kHangulNcount;  // 11172  // NOLINT\n\n// Latin NFKD decompositions via switch (no static map allocation).\n// Returns true if codepoint was decomposed, false otherwise.\nstatic bool DecomposeLatin(uint32_t codepoint, std::vector<uint16_t> *out) {\n  auto push2 = [&](uint16_t a, uint16_t b) {\n    out->push_back(a);\n    out->push_back(b);\n  };\n  switch (codepoint) {\n    case 0x00C1:\n      push2(0x0041, 0x0301);\n      return true;\n    case 0x00C9:\n   
   push2(0x0045, 0x0301);\n      return true;\n    case 0x00CD:\n      push2(0x0049, 0x0301);\n      return true;\n    case 0x00D3:\n      push2(0x004F, 0x0301);\n      return true;\n    case 0x00DA:\n      push2(0x0055, 0x0301);\n      return true;\n    case 0x00E1:\n      push2(0x0061, 0x0301);\n      return true;\n    case 0x00E9:\n      push2(0x0065, 0x0301);\n      return true;\n    case 0x00ED:\n      push2(0x0069, 0x0301);\n      return true;\n    case 0x00F3:\n      push2(0x006F, 0x0301);\n      return true;\n    case 0x00FA:\n      push2(0x0075, 0x0301);\n      return true;\n    case 0x00C0:\n      push2(0x0041, 0x0300);\n      return true;\n    case 0x00C8:\n      push2(0x0045, 0x0300);\n      return true;\n    case 0x00CC:\n      push2(0x0049, 0x0300);\n      return true;\n    case 0x00D2:\n      push2(0x004F, 0x0300);\n      return true;\n    case 0x00D9:\n      push2(0x0055, 0x0300);\n      return true;\n    case 0x00E0:\n      push2(0x0061, 0x0300);\n      return true;\n    case 0x00E8:\n      push2(0x0065, 0x0300);\n      return true;\n    case 0x00EC:\n      push2(0x0069, 0x0300);\n      return true;\n    case 0x00F2:\n      push2(0x006F, 0x0300);\n      return true;\n    case 0x00F9:\n      push2(0x0075, 0x0300);\n      return true;\n    case 0x00C2:\n      push2(0x0041, 0x0302);\n      return true;\n    case 0x00CA:\n      push2(0x0045, 0x0302);\n      return true;\n    case 0x00CE:\n      push2(0x0049, 0x0302);\n      return true;\n    case 0x00D4:\n      push2(0x004F, 0x0302);\n      return true;\n    case 0x00DB:\n      push2(0x0055, 0x0302);\n      return true;\n    case 0x00E2:\n      push2(0x0061, 0x0302);\n      return true;\n    case 0x00EA:\n      push2(0x0065, 0x0302);\n      return true;\n    case 0x00EE:\n      push2(0x0069, 0x0302);\n      return true;\n    case 0x00F4:\n      push2(0x006F, 0x0302);\n      return true;\n    case 0x00FB:\n      push2(0x0075, 0x0302);\n      return true;\n    case 0x00C3:\n      push2(0x0041, 0x0303);\n 
     return true;\n    case 0x00D1:\n      push2(0x004E, 0x0303);\n      return true;\n    case 0x00D5:\n      push2(0x004F, 0x0303);\n      return true;\n    case 0x00E3:\n      push2(0x0061, 0x0303);\n      return true;\n    case 0x00F1:\n      push2(0x006E, 0x0303);\n      return true;\n    case 0x00F5:\n      push2(0x006F, 0x0303);\n      return true;\n    case 0x00C4:\n      push2(0x0041, 0x0308);\n      return true;\n    case 0x00CB:\n      push2(0x0045, 0x0308);\n      return true;\n    case 0x00CF:\n      push2(0x0049, 0x0308);\n      return true;\n    case 0x00D6:\n      push2(0x004F, 0x0308);\n      return true;\n    case 0x00DC:\n      push2(0x0055, 0x0308);\n      return true;\n    case 0x00E4:\n      push2(0x0061, 0x0308);\n      return true;\n    case 0x00EB:\n      push2(0x0065, 0x0308);\n      return true;\n    case 0x00EF:\n      push2(0x0069, 0x0308);\n      return true;\n    case 0x00F6:\n      push2(0x006F, 0x0308);\n      return true;\n    case 0x00FC:\n      push2(0x0075, 0x0308);\n      return true;\n    case 0x00C7:\n      push2(0x0043, 0x0327);\n      return true;\n    case 0x00E7:\n      push2(0x0063, 0x0327);\n      return true;\n    default:\n      return false;\n  }\n}\n\nstatic void DecomposeCharacter(uint32_t codepoint,\n                               std::vector<uint16_t> *output) {\n  if (codepoint >= kHangulSbase && codepoint < kHangulSbase + kHangulScount) {\n    uint32_t s_index = codepoint - kHangulSbase;\n    uint32_t l_index = s_index / kHangulNcount;\n    uint32_t v_index = (s_index % kHangulNcount) / kHangulTcount;\n    uint32_t t_index = s_index % kHangulTcount;\n\n    output->push_back(static_cast<uint16_t>(kHangulLbase + l_index));\n    output->push_back(static_cast<uint16_t>(kHangulVbase + v_index));\n    if (t_index > 0) {\n      output->push_back(static_cast<uint16_t>(kHangulTbase + t_index));\n    }\n    return;\n  }\n\n  if (DecomposeLatin(codepoint, output)) return;\n\n  if (codepoint > 0xFFFF) return;\n  
output->push_back(static_cast<uint16_t>(codepoint));\n}\n\n// Decode the last UTF-8 codepoint in s. Returns 0 if s is empty or invalid.\nstatic uint32_t LastCodepointUtf8(const std::string &s) {\n  if (s.empty()) return 0;\n\n  size_t start = s.size() - 1;\n  while (start > 0 && (static_cast<unsigned char>(s[start]) & 0xC0) == 0x80) {\n    --start;\n  }\n\n  unsigned char c = static_cast<unsigned char>(s[start]);\n\n  if ((c & 0x80) == 0) return c;\n\n  if ((c & 0xE0) == 0xC0 && start + 1 < s.size()) {\n    return ((c & 0x1F) << 6) |\n           (static_cast<unsigned char>(s[start + 1]) & 0x3F);\n  }\n\n  if ((c & 0xF0) == 0xE0 && start + 2 < s.size()) {\n    return ((c & 0x0F) << 12) |\n           ((static_cast<unsigned char>(s[start + 1]) & 0x3F) << 6) |\n           (static_cast<unsigned char>(s[start + 2]) & 0x3F);\n  }\n\n  if ((c & 0xF8) == 0xF0 && start + 3 < s.size()) {\n    return ((c & 0x07) << 18) |\n           ((static_cast<unsigned char>(s[start + 1]) & 0x3F) << 12) |\n           ((static_cast<unsigned char>(s[start + 2]) & 0x3F) << 6) |\n           (static_cast<unsigned char>(s[start + 3]) & 0x3F);\n  }\n\n  return 0;\n}\n\nstatic bool IsEndingPunctuationCodepoint(uint32_t cp) {\n  switch (cp) {\n    case 0x2026:  // …\n    case 0x3002:  // 。\n    case 0x300D:  // 」\n    case 0x300F:  // 』\n    case 0x3011:  // 】\n    case 0x3009:  // 〉\n    case 0x300B:  // 》\n    case 0x203A:  // ›\n    case 0x00BB:  // »\n    case 0x201C:  // \"\n    case 0x201D:  // \"\n    case 0x2018:  // '\n    case 0x2019:  // '\n      return true;\n    default:\n      return false;\n  }\n}\n\nstatic void ReplaceString(std::string *text, const std::string &from,\n                          const std::string &to) {\n  size_t pos = 0;\n  while ((pos = text->find(from, pos)) != std::string::npos) {\n    text->replace(pos, from.length(), to);\n    pos += to.length();\n  }\n}\n\n// Load indexer from raw int32_t binary (from generate_indexer_bin.py).\nstatic std::vector<int32_t> 
LoadIndexerFromBinary(const char *data,\n                                                  size_t size) {\n  if (size == 0 || (size % sizeof(int32_t) != 0)) {\n    SHERPA_ONNX_LOGE(\n        \"Invalid unicode indexer .bin size: %zu (must be a multiple of %zu)\",\n        size, sizeof(int32_t));\n    SHERPA_ONNX_EXIT(-1);\n  }\n  size_t count = size / sizeof(int32_t);\n  std::vector<int32_t> out(count);\n  std::memcpy(out.data(), data, size);\n  return out;\n}\n\nstatic std::vector<int32_t> LoadIndexerFromPathImpl(\n    const std::vector<char> &buf, const std::string &path) {\n  if (buf.empty()) {\n    SHERPA_ONNX_LOGE(\"Failed to read unicode indexer: %s\", path.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  return LoadIndexerFromBinary(buf.data(), buf.size());\n}\n\n}  // namespace\n\nSupertonicUnicodeProcessor::SupertonicUnicodeProcessor(\n    const std::string &unicode_indexer_path) {\n  if (!EndsWith(unicode_indexer_path, \".bin\")) {\n    SHERPA_ONNX_LOGE(\"Unicode indexer path must end with .bin. Given: '%s'\",\n                     unicode_indexer_path.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  std::vector<char> buf = ReadFile(unicode_indexer_path);\n  indexer_ = LoadIndexerFromPathImpl(buf, unicode_indexer_path);\n}\n\ntemplate <typename Manager>\nSupertonicUnicodeProcessor::SupertonicUnicodeProcessor(\n    Manager *mgr, const std::string &unicode_indexer_path) {\n  if (!EndsWith(unicode_indexer_path, \".bin\")) {\n    SHERPA_ONNX_LOGE(\"Unicode indexer path must end with .bin. 
Given: '%s'\",\n                     unicode_indexer_path.c_str());\n    SHERPA_ONNX_EXIT(-1);\n  }\n  std::vector<char> buf = ReadFile(mgr, unicode_indexer_path);\n  indexer_ = LoadIndexerFromPathImpl(buf, unicode_indexer_path);\n}\n\nstd::string SupertonicUnicodeProcessor::PreprocessText(\n    const std::string &text, const std::string &lang) const {\n  std::string result = text;\n\n  static constexpr std::array<std::pair<const char *, const char *>, 25>\n      replacements = {{\n          {\"–\", \"-\"},\n          {\"‑\", \"-\"},\n          {\"—\", \"-\"},\n          {\"_\", \" \"},\n          {u8\"\\u201C\", \"\\\"\"},\n          {u8\"\\u201D\", \"\\\"\"},\n          {u8\"\\u2018\", \"'\"},\n          {u8\"\\u2019\", \"'\"},\n          {\"´\", \"'\"},\n          {\"`\", \"'\"},\n          {\"[\", \" \"},\n          {\"]\", \" \"},\n          {\"|\", \" \"},\n          {\"/\", \" \"},\n          {\"#\", \" \"},\n          {\"→\", \" \"},\n          {\"←\", \" \"},\n          {\"♥\", \"\"},\n          {\"☆\", \"\"},\n          {\"♡\", \"\"},\n          {\"©\", \"\"},\n          {\"\\\\\", \"\"},\n          {\"@\", \" at \"},\n          {\"e.g.,\", \"for example, \"},\n          {\"i.e.,\", \"that is, \"},\n      }};\n\n  for (const auto &repl : replacements) {\n    ReplaceString(&result, repl.first, repl.second);\n  }\n\n  // Remove some U+1Fxxx emoji/symbols (4-byte UTF-8 sequences: F0 9F 80-BF\n  // 80-BF). Note: This only removes a subset of emoji (U+1F000-U+1FFFF), not\n  // all emoji. 
Optimized: manual scanning instead of regex.\n  std::string emoji_removed;\n  emoji_removed.reserve(result.size());\n  for (size_t i = 0; i < result.size();) {\n    if (i + 3 < result.size() &&\n        static_cast<unsigned char>(result[i]) == 0xF0 &&\n        static_cast<unsigned char>(result[i + 1]) == 0x9F &&\n        (static_cast<unsigned char>(result[i + 2]) & 0xC0) == 0x80 &&\n        (static_cast<unsigned char>(result[i + 3]) & 0xC0) == 0x80) {\n      i += 4;  // Skip emoji\n    } else {\n      emoji_removed += result[i];\n      ++i;\n    }\n  }\n  result = std::move(emoji_removed);\n\n  // Remove a space that directly precedes punctuation (single pass)\n  std::string punct_fixed;\n  punct_fixed.reserve(result.size());\n  for (size_t i = 0; i < result.size(); ++i) {\n    if (result[i] == ' ' && i + 1 < result.size()) {\n      char next = result[i + 1];\n      if (next == ',' || next == '.' || next == '!' || next == '?' ||\n          next == ';' || next == ':' || next == '\\'') {\n        punct_fixed += next;\n        ++i;  // Space dropped; punctuation already appended\n        continue;\n      }\n    }\n    punct_fixed += result[i];\n  }\n  result = std::move(punct_fixed);\n\n  // Collapse adjacent duplicate quotes (\"\" -> \", '' -> ') while preserving\n  // normal paired quotes. Discard backticks. 
Single-pass O(n) algorithm.\n  std::string quotes_fixed;\n  quotes_fixed.reserve(result.size());\n  for (size_t i = 0; i < result.size(); ++i) {\n    if (result[i] == '`') {\n      // Skip backticks\n      continue;\n    }\n    if (result[i] == '\"' && i + 1 < result.size() && result[i + 1] == '\"') {\n      // Collapse adjacent double quotes: \"\" -> \"\n      quotes_fixed += '\"';\n      ++i;  // Skip the second quote\n    } else if (result[i] == '\\'' && i + 1 < result.size() &&\n               result[i + 1] == '\\'') {\n      // Collapse adjacent single quotes: '' -> '\n      quotes_fixed += '\\'';\n      ++i;  // Skip the second quote\n    } else {\n      quotes_fixed += result[i];\n    }\n  }\n  result = std::move(quotes_fixed);\n\n  // Remove extra spaces (optimized: single pass)\n  std::string spaces_fixed;\n  spaces_fixed.reserve(result.size());\n  bool last_was_space = false;\n  for (char c : result) {\n    if (std::isspace(static_cast<unsigned char>(c))) {\n      if (!last_was_space) {\n        spaces_fixed += ' ';\n        last_was_space = true;\n      }\n    } else {\n      spaces_fixed += c;\n      last_was_space = false;\n    }\n  }\n  result = Trim(spaces_fixed);\n\n  if (!result.empty()) {\n    char last_char = result.back();\n    bool ends_with_punct =\n        (last_char == '.' || last_char == '!' || last_char == '?' 
||\n         last_char == ';' || last_char == ':' || last_char == ',' ||\n         last_char == '\\'' || last_char == '\"' || last_char == ')' ||\n         last_char == ']' || last_char == '}' || last_char == '>');\n    if (!ends_with_punct) {\n      ends_with_punct = IsEndingPunctuationCodepoint(LastCodepointUtf8(result));\n    }\n    if (!ends_with_punct) {\n      result += \".\";\n    }\n  }\n\n  // Wrap text with language tags\n  result = \"<\" + lang + \">\" + result + \"</\" + lang + \">\";\n\n  return result;\n}\n\nstd::vector<uint16_t> SupertonicUnicodeProcessor::TextToUnicodeValues(\n    const std::string &text) const {\n  std::vector<uint16_t> unicode_values;\n  size_t i = 0;\n\n  while (i < text.size()) {\n    uint32_t codepoint = 0;\n    unsigned char c = static_cast<unsigned char>(text[i]);\n\n    if ((c & 0x80) == 0) {\n      codepoint = c;\n      i += 1;\n    } else if ((c & 0xE0) == 0xC0 && i + 1 < text.size()) {\n      codepoint = (c & 0x1F) << 6;\n      codepoint |= (static_cast<unsigned char>(text[i + 1]) & 0x3F);\n      i += 2;\n    } else if ((c & 0xF0) == 0xE0 && i + 2 < text.size()) {\n      codepoint = (c & 0x0F) << 12;\n      codepoint |= (static_cast<unsigned char>(text[i + 1]) & 0x3F) << 6;\n      codepoint |= (static_cast<unsigned char>(text[i + 2]) & 0x3F);\n      i += 3;\n    } else if ((c & 0xF8) == 0xF0 && i + 3 < text.size()) {\n      codepoint = (c & 0x07) << 18;\n      codepoint |= (static_cast<unsigned char>(text[i + 1]) & 0x3F) << 12;\n      codepoint |= (static_cast<unsigned char>(text[i + 2]) & 0x3F) << 6;\n      codepoint |= (static_cast<unsigned char>(text[i + 3]) & 0x3F);\n      i += 4;\n    } else {\n      i += 1;\n      continue;\n    }\n\n    DecomposeCharacter(codepoint, &unicode_values);\n  }\n\n  return unicode_values;\n}\n\nvoid SupertonicUnicodeProcessor::Process(\n    const std::string &text, const std::string &lang,\n    std::vector<int64_t> *text_ids, std::vector<float> *text_mask_flat,\n    std::vector<int64_t> 
*text_mask_shape) const {\n  const std::string processed = PreprocessText(text, lang);\n  const std::vector<uint16_t> unicode_vals = TextToUnicodeValues(processed);\n  const size_t seq_len = unicode_vals.size();\n\n  constexpr int64_t kUnknownId = 0;\n  text_ids->assign(seq_len, kUnknownId);\n  for (size_t i = 0; i < seq_len; ++i) {\n    const size_t u = unicode_vals[i];\n    (*text_ids)[i] = (u < indexer_.size()) ? indexer_[u] : kUnknownId;\n  }\n\n  // Batch size is always 1: mask is all ones, shape [1, 1, seq_len].\n  text_mask_flat->assign(seq_len, 1.0f);\n  text_mask_shape->assign({1, 1, static_cast<int64_t>(seq_len)});\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SupertonicUnicodeProcessor::SupertonicUnicodeProcessor(\n    AAssetManager *mgr, const std::string &unicode_indexer_path);\n#endif\n\n#if __OHOS__\ntemplate SupertonicUnicodeProcessor::SupertonicUnicodeProcessor(\n    NativeResourceManager *mgr, const std::string &unicode_indexer_path);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.h",
    "content": "// sherpa-onnx/csrc/offline-tts-supertonic-unicode-processor.h\n//\n// Copyright (c)  2026 zengyw\n//\n// This file is based on Supertonic TTS\n// (https://github.com/Supertone-Inc/supertonic) which is licensed under MIT\n// License (Copyright (c) 2025 Supertone Inc.)\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_UNICODE_PROCESSOR_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_UNICODE_PROCESSOR_H_\n\n#include <cstdint>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// Unicode text processor for Supertonic TTS\nclass SupertonicUnicodeProcessor {\n public:\n  explicit SupertonicUnicodeProcessor(const std::string &unicode_indexer_path);\n\n  template <typename Manager>\n  SupertonicUnicodeProcessor(Manager *mgr,\n                             const std::string &unicode_indexer_path);\n\n  void Process(const std::string &text, const std::string &lang,\n               std::vector<int64_t> *text_ids,\n               std::vector<float> *text_mask_flat,\n               std::vector<int64_t> *text_mask_shape) const;\n\n private:\n  std::string PreprocessText(const std::string &text,\n                             const std::string &lang) const;\n  std::vector<uint16_t> TextToUnicodeValues(const std::string &text) const;\n\n  std::vector<int32_t> indexer_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_SUPERTONIC_UNICODE_PROCESSOR_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-impl.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/character-lexicon.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/lexicon.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/melo-tts-lexicon.h\"\n#include \"sherpa-onnx/csrc/offline-tts-character-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model.h\"\n#include \"sherpa-onnx/csrc/piper-phonemize-lexicon.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsVitsImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsVitsImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsVitsModel>(config.model)) {\n    InitFrontend();\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"Loading FST archives\");\n      }\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, 
\",\", false, &files);\n\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(f));\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }\n      }\n\n      if (config.model.debug) {\n        SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n      }\n    }\n  }\n\n  template <typename Manager>\n  OfflineTtsVitsImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsVitsModel>(mgr, config.model)) {\n    InitFrontend(mgr);\n\n    if (!config.rule_fsts.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fsts, \",\", false, &files);\n      tn_list_.reserve(files.size());\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"rule fst: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n#endif\n        }\n        auto buf = ReadFile(mgr, f);\n        std::istringstream is(std::string(buf.data(), buf.size()));\n        tn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n      }\n    }\n\n    if (!config.rule_fars.empty()) {\n      std::vector<std::string> files;\n      SplitStringToVector(config.rule_fars, \",\", false, &files);\n      tn_list_.reserve(files.size() + tn_list_.size());\n\n      for (const auto &f : files) {\n        if (config.model.debug) {\n#if __OHOS__\n          
SHERPA_ONNX_LOGE(\"rule far: %{public}s\", f.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n#endif\n        }\n\n        auto buf = ReadFile(mgr, f);\n\n        std::unique_ptr<std::istream> s(\n            new std::istringstream(std::string(buf.data(), buf.size())));\n\n        std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n            fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n        for (; !reader->Done(); reader->Next()) {\n          std::unique_ptr<fst::StdConstFst> r(\n              fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n          tn_list_.push_back(\n              std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n        }  // for (; !reader->Done(); reader->Next())\n      }    // for (const auto &f : files)\n    }      // if (!config.rule_fars.empty())\n  }\n\n  int32_t SampleRate() const override {\n    return model_->GetMetaData().sample_rate;\n  }\n\n  int32_t NumSpeakers() const override {\n    return model_->GetMetaData().num_speakers;\n  }\n\n  // Supported options in GenerationConfig:\n  //   - sid: Speaker ID for multi-speaker models\n  //   - speed: Speech speed factor (default: 1.0)\n  //   - silence_scale: Scale applied to pauses in the generated audio\n  //\n  // Supported extra options in config.extra:\n  //   - None\n  GeneratedAudio Generate(\n      const std::string &_text, const GenerationConfig &gen_config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", gen_config.ToString().c_str());\n    }\n\n    int64_t sid = gen_config.sid;\n    float speed = gen_config.speed;\n    if (speed <= 0) {\n      SHERPA_ONNX_LOGE(\"Speed must be > 0. 
Given: %f\", speed);\n      return {};\n    }\n\n    const auto &meta_data = model_->GetMetaData();\n    int32_t num_speakers = meta_data.num_speakers;\n\n    if (num_speakers == 0 && sid != 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%{public}d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This is a single-speaker model and supports only sid 0. Given sid: \"\n          \"%d. sid is ignored\",\n          static_cast<int32_t>(sid));\n#endif\n    }\n\n    if (num_speakers != 0 && (sid >= num_speakers || sid < 0)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %{public}d speakers. sid should be in the \"\n          \"range [%{public}d, %{public}d]. Given: %{public}d. Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#else\n      SHERPA_ONNX_LOGE(\n          \"This model contains only %d speakers. sid should be in the range \"\n          \"[%d, %d]. Given: %d. 
Use sid=0\",\n          num_speakers, 0, num_speakers - 1, static_cast<int32_t>(sid));\n#endif\n      sid = 0;\n    }\n\n    std::string text = _text;\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Raw text: %{public}s\", text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Raw text: %s\", text.c_str());\n#endif\n    }\n\n    if (!tn_list_.empty()) {\n      for (const auto &tn : tn_list_) {\n        text = tn->Normalize(text);\n        if (config_.model.debug) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"After normalizing: %{public}s\", text.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"After normalizing: %s\", text.c_str());\n#endif\n        }\n      }\n    }\n\n    std::vector<TokenIDs> token_ids =\n        frontend_->ConvertTextToTokenIds(text, meta_data.voice);\n\n    if (token_ids.empty() ||\n        (token_ids.size() == 1 && token_ids[0].tokens.empty())) {\n      SHERPA_ONNX_LOGE(\"Failed to convert %s to token IDs\", text.c_str());\n      return {};\n    }\n\n    std::vector<std::vector<int64_t>> x;\n    std::vector<std::vector<int64_t>> tones;\n\n    x.reserve(token_ids.size());\n\n    for (auto &i : token_ids) {\n      x.push_back(std::move(i.tokens));\n    }\n\n    if (!token_ids[0].tones.empty()) {\n      tones.reserve(token_ids.size());\n      for (auto &i : token_ids) {\n        tones.push_back(std::move(i.tones));\n      }\n    }\n\n    // TODO(fangjun): add blank inside the frontend, not here\n    if (meta_data.add_blank && config_.model.vits.data_dir.empty() &&\n        meta_data.frontend != \"characters\") {\n      for (auto &k : x) {\n        k = AddBlank(k);\n      }\n\n      for (auto &k : tones) {\n        k = AddBlank(k);\n      }\n    }\n\n    int32_t x_size = static_cast<int32_t>(x.size());\n\n    if (config_.max_num_sentences <= 0 || x_size <= config_.max_num_sentences) {\n      auto ans = Process(x, tones, sid, speed, gen_config.silence_scale);\n      if (callback) {\n        callback(ans.samples.data(), 
ans.samples.size(), 1.0);\n      }\n      return ans;\n    }\n\n    // the input text is too long, we process sentences within it in batches\n    // to avoid OOM. Batch size is config_.max_num_sentences\n    std::vector<std::vector<int64_t>> batch_x;\n    std::vector<std::vector<int64_t>> batch_tones;\n\n    int32_t batch_size = config_.max_num_sentences;\n    batch_x.reserve(config_.max_num_sentences);\n    batch_tones.reserve(config_.max_num_sentences);\n    int32_t num_batches = x_size / batch_size;\n\n    if (config_.model.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Text is too long. Split it into %{public}d batches. batch size: \"\n          \"%{public}d. Number of sentences: %{public}d\",\n          num_batches, batch_size, x_size);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Text is too long. Split it into %d batches. batch size: %d. Number \"\n          \"of sentences: %d\",\n          num_batches, batch_size, x_size);\n#endif\n    }\n\n    GeneratedAudio ans;\n\n    int32_t should_continue = 1;\n\n    int32_t k = 0;\n\n    for (int32_t b = 0; b != num_batches && should_continue; ++b) {\n      batch_x.clear();\n      batch_tones.clear();\n      for (int32_t i = 0; i != batch_size; ++i, ++k) {\n        batch_x.push_back(std::move(x[k]));\n\n        if (!tones.empty()) {\n          batch_tones.push_back(std::move(tones[k]));\n        }\n      }\n\n      auto audio = Process(batch_x, batch_tones, sid, speed,\n                           gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        should_continue = callback(audio.samples.data(), audio.samples.size(),\n                                   (b + 1) * 1.0 / num_batches);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        
// the callback returns to avoid segmentation fault.\n      }\n    }\n\n    batch_x.clear();\n    batch_tones.clear();\n    while (k < static_cast<int32_t>(x.size()) && should_continue) {\n      batch_x.push_back(std::move(x[k]));\n      if (!tones.empty()) {\n        batch_tones.push_back(std::move(tones[k]));\n      }\n\n      ++k;\n    }\n\n    if (!batch_x.empty()) {\n      auto audio =\n          Process(batch_x, batch_tones, sid, speed, gen_config.silence_scale);\n      ans.sample_rate = audio.sample_rate;\n      ans.samples.insert(ans.samples.end(), audio.samples.begin(),\n                         audio.samples.end());\n      if (callback) {\n        callback(audio.samples.data(), audio.samples.size(), 1.0);\n        // Caution(fangjun): audio is freed when the callback returns, so users\n        // should copy the data if they want to access the data after\n        // the callback returns to avoid segmentation fault.\n      }\n    }\n\n    return ans;\n  }\n\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(\n      const std::string &text, int64_t sid = 0, float speed = 1.0,\n      GeneratedAudioCallback callback = nullptr) const override {\n    GenerationConfig gen_config;\n    gen_config.sid = sid;\n    gen_config.speed = speed;\n    gen_config.silence_scale = config_.silence_scale;\n    return Generate(text, gen_config, std::move(callback));\n  }\n\n private:\n  template <typename Manager>\n  void InitFrontend(Manager *mgr) {\n    const auto &meta_data = model_->GetMetaData();\n\n    if (meta_data.frontend == \"characters\") {\n      frontend_ = std::make_unique<OfflineTtsCharacterFrontend>(\n          mgr, config_.model.vits.tokens, meta_data);\n    } else if (meta_data.jieba && meta_data.is_melo_tts) {\n      frontend_ = std::make_unique<MeloTtsLexicon>(\n          mgr, config_.model.vits.lexicon, config_.model.vits.tokens,\n          model_->GetMetaData(), config_.model.debug);\n    } else if 
(meta_data.jieba) {\n      frontend_ = std::make_unique<CharacterLexicon>(\n          mgr, config_.model.vits.lexicon, config_.model.vits.tokens,\n          config_.model.debug);\n    } else if (meta_data.is_melo_tts && meta_data.language == \"English\") {\n      frontend_ = std::make_unique<MeloTtsLexicon>(\n          mgr, config_.model.vits.lexicon, config_.model.vits.tokens,\n          model_->GetMetaData(), config_.model.debug);\n    } else if ((meta_data.is_piper || meta_data.is_coqui ||\n                meta_data.is_icefall) &&\n               !config_.model.vits.data_dir.empty()) {\n      frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n          mgr, config_.model.vits.tokens, config_.model.vits.data_dir,\n          meta_data);\n    } else {\n      if (config_.model.vits.lexicon.empty()) {\n        SHERPA_ONNX_LOGE(\n            \"Not a model using characters as modeling unit. Please provide \"\n            \"--vits-lexicon if you leave --vits-data-dir empty\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      frontend_ = std::make_unique<Lexicon>(\n          mgr, config_.model.vits.lexicon, config_.model.vits.tokens,\n          meta_data.punctuations, meta_data.language, config_.model.debug);\n    }\n  }\n\n  void InitFrontend() {\n    const auto &meta_data = model_->GetMetaData();\n\n    if (meta_data.frontend == \"characters\") {\n      frontend_ = std::make_unique<OfflineTtsCharacterFrontend>(\n          config_.model.vits.tokens, meta_data);\n    } else if (meta_data.jieba && meta_data.is_melo_tts) {\n      frontend_ = std::make_unique<MeloTtsLexicon>(\n          config_.model.vits.lexicon, config_.model.vits.tokens,\n          model_->GetMetaData(), config_.model.debug);\n    } else if (meta_data.is_melo_tts && meta_data.language == \"English\") {\n      frontend_ = std::make_unique<MeloTtsLexicon>(\n          config_.model.vits.lexicon, config_.model.vits.tokens,\n          model_->GetMetaData(), config_.model.debug);\n    } else if 
(meta_data.jieba) {\n      frontend_ = std::make_unique<CharacterLexicon>(config_.model.vits.lexicon,\n                                                     config_.model.vits.tokens,\n                                                     config_.model.debug);\n    } else if ((meta_data.is_piper || meta_data.is_coqui ||\n                meta_data.is_icefall) &&\n               !config_.model.vits.data_dir.empty()) {\n      frontend_ = std::make_unique<PiperPhonemizeLexicon>(\n          config_.model.vits.tokens, config_.model.vits.data_dir,\n          model_->GetMetaData());\n    } else {\n      if (config_.model.vits.lexicon.empty()) {\n        SHERPA_ONNX_LOGE(\n            \"Not a model using characters as modeling unit. Please provide \"\n            \"--vits-lexicon if you leave --vits-data-dir empty\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n      frontend_ = std::make_unique<Lexicon>(\n          config_.model.vits.lexicon, config_.model.vits.tokens,\n          meta_data.punctuations, meta_data.language, config_.model.debug);\n    }\n  }\n\n  GeneratedAudio Process(const std::vector<std::vector<int64_t>> &tokens,\n                         const std::vector<std::vector<int64_t>> &tones,\n                         int32_t sid, float speed,\n                         float silence_scale) const {\n    int32_t num_tokens = 0;\n    for (const auto &k : tokens) {\n      num_tokens += k.size();\n    }\n\n    std::vector<int64_t> x;\n    x.reserve(num_tokens);\n    for (const auto &k : tokens) {\n      x.insert(x.end(), k.begin(), k.end());\n    }\n\n    std::vector<int64_t> tone_list;\n    if (!tones.empty()) {\n      tone_list.reserve(num_tokens);\n      for (const auto &k : tones) {\n        tone_list.insert(tone_list.end(), k.begin(), k.end());\n      }\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, static_cast<int32_t>(x.size())};\n    Ort::Value x_tensor = 
Ort::Value::CreateTensor(\n        memory_info, x.data(), x.size(), x_shape.data(), x_shape.size());\n\n    Ort::Value tones_tensor{nullptr};\n    if (!tones.empty()) {\n      tones_tensor = Ort::Value::CreateTensor(memory_info, tone_list.data(),\n                                              tone_list.size(), x_shape.data(),\n                                              x_shape.size());\n    }\n\n    Ort::Value audio{nullptr};\n    if (tones.empty()) {\n      audio = model_->Run(std::move(x_tensor), sid, speed);\n    } else {\n      audio =\n          model_->Run(std::move(x_tensor), std::move(tones_tensor), sid, speed);\n    }\n\n    std::vector<int64_t> audio_shape =\n        audio.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t total = 1;\n    // The output shape may be (1, 1, total) or (1, total) or (total,)\n    for (auto i : audio_shape) {\n      total *= i;\n    }\n\n    const float *p = audio.GetTensorData<float>();\n\n    GeneratedAudio ans;\n    ans.sample_rate = model_->GetMetaData().sample_rate;\n    ans.samples = std::vector<float>(p, p + total);\n\n    if (silence_scale != 1) {\n      ans = ans.ScaleSilence(silence_scale);\n    }\n\n    return ans;\n  }\n\n private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsVitsModel> model_;\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> tn_list_;\n  std::unique_ptr<OfflineTtsFrontend> frontend_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsVitsModelConfig::Register(ParseOptions *po) {\n  po->Register(\"vits-model\", &model, \"Path to VITS model\");\n  po->Register(\"vits-lexicon\", &lexicon, \"Path to lexicon.txt for VITS models\");\n  po->Register(\"vits-tokens\", &tokens, \"Path to tokens.txt for VITS models\");\n  po->Register(\"vits-data-dir\", &data_dir,\n               \"Path to the directory containing dict for espeak-ng. If it is \"\n               \"given, --vits-lexicon is ignored.\");\n  po->Register(\"vits-dict-dir\", &dict_dir,\n               \"Not used. You don't need to provide a value for it\");\n  po->Register(\"vits-noise-scale\", &noise_scale, \"noise_scale for VITS models\");\n  po->Register(\"vits-noise-scale-w\", &noise_scale_w,\n               \"noise_scale_w for VITS models\");\n  po->Register(\"vits-length-scale\", &length_scale,\n               \"Speech speed. Larger->Slower; Smaller->faster.\");\n}\n\nbool OfflineTtsVitsModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --vits-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"--vits-model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (tokens.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --vits-tokens\");\n    return false;\n  }\n\n  if (!FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\"--vits-tokens: '%s' does not exist\", tokens.c_str());\n    return false;\n  }\n\n  if (!data_dir.empty()) {\n    if (!FileExists(data_dir + \"/phontab\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phontab' does not exist. 
Please check --vits-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/phonindex\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phonindex' does not exist. Please check --vits-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/phondata\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/phondata' does not exist. Please check --vits-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n\n    if (!FileExists(data_dir + \"/intonations\")) {\n      SHERPA_ONNX_LOGE(\n          \"'%s/intonations' does not exist. Please check --vits-data-dir\",\n          data_dir.c_str());\n      return false;\n    }\n  }\n\n  if (!dict_dir.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"From sherpa-onnx v1.12.15, you don't need to provide dict_dir for \"\n        \"this model. Ignore it\");\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsVitsModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsVitsModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"lexicon=\\\"\" << lexicon << \"\\\", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"data_dir=\\\"\" << data_dir << \"\\\", \";\n  os << \"noise_scale=\" << noise_scale << \", \";\n  os << \"noise_scale_w=\" << noise_scale_w << \", \";\n  os << \"length_scale=\" << length_scale << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsVitsModelConfig {\n  std::string model;\n  std::string lexicon;\n  std::string tokens;\n\n  // If data_dir is given, lexicon is ignored\n  // data_dir is for piper-phonemize, which uses espeak-ng\n  std::string data_dir;\n\n  // Used for Chinese TTS models using jieba\n  std::string dict_dir;\n\n  float noise_scale = 0.667;\n  float noise_scale_w = 0.8;\n  float length_scale = 1;\n\n  // used only for multi-speaker models, e.g, vctk speech dataset.\n  // Not applicable for single-speaker models, e.g., ljspeech dataset\n\n  OfflineTtsVitsModelConfig() = default;\n\n  OfflineTtsVitsModelConfig(const std::string &model,\n                            const std::string &lexicon,\n                            const std::string &tokens,\n                            const std::string &data_dir,\n                            const std::string &dict_dir,\n                            float noise_scale = 0.667,\n                            float noise_scale_w = 0.8, float length_scale = 1)\n      : model(model),\n        lexicon(lexicon),\n        tokens(tokens),\n        data_dir(data_dir),\n        dict_dir(dict_dir),\n        noise_scale(noise_scale),\n        noise_scale_w(noise_scale_w),\n        length_scale(length_scale) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// If you are not sure what each field means, please\n// have a look of the Python file in the model directory that\n// you have downloaded.\nstruct OfflineTtsVitsModelMetaData {\n  int32_t sample_rate = 0;\n  int32_t add_blank = 0;\n  int32_t num_speakers = 0;\n\n  bool is_piper = false;\n  bool is_coqui = false;\n  bool is_icefall = false;\n  bool is_melo_tts = false;\n\n  // for Chinese TTS models from\n  // https://github.com/Plachtaa/VITS-fast-fine-tuning\n  int32_t jieba = 0;\n\n  // the following options are for models from coqui-ai/TTS\n  int32_t blank_id = 0;\n  int32_t bos_id = 0;\n  int32_t eos_id = 0;\n  int32_t use_eos_bos = 0;\n  int32_t pad_id = 0;\n\n  // for melo tts\n  int32_t speaker_id = 0;\n  int32_t version = 0;\n\n  std::string punctuations;\n  std::string language;\n  std::string voice;\n  std::string frontend;  // characters\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-vits-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsVitsModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config.vits.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config.vits.model);\n    Init(buf.data(), buf.size());\n  }\n\n  Ort::Value Run(Ort::Value x, int64_t sid, float speed) {\n    if (meta_data_.is_piper || meta_data_.is_coqui) {\n      return RunVitsPiperOrCoqui(std::move(x), sid, speed);\n    }\n\n    return RunVits(std::move(x), sid, speed);\n  }\n\n  Ort::Value Run(Ort::Value x, Ort::Value tones, int64_t sid, float speed) {\n    if (meta_data_.num_speakers == 1) {\n      // For MeloTTS, we hardcode sid to the one contained in the meta data\n      sid = meta_data_.speaker_id;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      
SHERPA_ONNX_LOGE(\"Support only batch_size == 1. Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      exit(-1);\n    }\n\n    int64_t len = x_shape[1];\n    int64_t len_shape = 1;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &len, 1, &len_shape, 1);\n\n    int64_t scale_shape = 1;\n    float noise_scale = config_.vits.noise_scale;\n    float length_scale = config_.vits.length_scale;\n    float noise_scale_w = config_.vits.noise_scale_w;\n\n    if (speed != 1 && speed > 0) {\n      length_scale = 1. / speed;\n    }\n\n    Ort::Value noise_scale_tensor =\n        Ort::Value::CreateTensor(memory_info, &noise_scale, 1, &scale_shape, 1);\n\n    Ort::Value length_scale_tensor = Ort::Value::CreateTensor(\n        memory_info, &length_scale, 1, &scale_shape, 1);\n\n    Ort::Value noise_scale_w_tensor = Ort::Value::CreateTensor(\n        memory_info, &noise_scale_w, 1, &scale_shape, 1);\n\n    Ort::Value sid_tensor =\n        Ort::Value::CreateTensor(memory_info, &sid, 1, &scale_shape, 1);\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(7);\n    inputs.push_back(std::move(x));\n    inputs.push_back(std::move(x_length));\n    inputs.push_back(std::move(tones));\n    inputs.push_back(std::move(sid_tensor));\n    inputs.push_back(std::move(noise_scale_tensor));\n    inputs.push_back(std::move(length_scale_tensor));\n    inputs.push_back(std::move(noise_scale_w_tensor));\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n  const OfflineTtsVitsModelMetaData &GetMetaData() const { return meta_data_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, 
&input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---vits model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.add_blank, \"add_blank\",\n                                            0);\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.speaker_id, \"speaker_id\",\n                                            0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.version, \"version\", 0);\n    SHERPA_ONNX_READ_META_DATA(meta_data_.num_speakers, \"n_speakers\");\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.punctuations,\n                                                \"punctuation\", \"\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_data_.language, \"language\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.voice, \"voice\", \"\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.frontend, \"frontend\",\n                                                \"\");\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.jieba, \"jieba\", 0);\n    
SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.blank_id, \"blank_id\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.bos_id, \"bos_id\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.eos_id, \"eos_id\", 0);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.use_eos_bos,\n                                            \"use_eos_bos\", 1);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.pad_id, \"pad_id\", 0);\n\n    std::string comment;\n    SHERPA_ONNX_READ_META_DATA_STR(comment, \"comment\");\n\n    if (comment.find(\"piper\") != std::string::npos) {\n      meta_data_.is_piper = true;\n    }\n\n    if (comment.find(\"coqui\") != std::string::npos) {\n      meta_data_.is_coqui = true;\n    }\n\n    if (comment.find(\"icefall\") != std::string::npos) {\n      meta_data_.is_icefall = true;\n    }\n\n    if (comment.find(\"melo\") != std::string::npos) {\n      meta_data_.is_melo_tts = true;\n      int32_t expected_version = 2;\n      if (meta_data_.version < expected_version) {\n        SHERPA_ONNX_LOGE(\n            \"Please download the latest MeloTTS model and retry. Current \"\n            \"version: %d. Expected version: %d\",\n            meta_data_.version, expected_version);\n        exit(-1);\n      }\n\n      // NOTE(fangjun):\n      // version 0 is the first version\n      // version 2: add jieba=1 to the metadata\n    }\n  }\n\n  Ort::Value RunVitsPiperOrCoqui(Ort::Value x, int64_t sid, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only batch_size == 1. 
Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      exit(-1);\n    }\n\n    int64_t len = x_shape[1];\n    int64_t len_shape = 1;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &len, 1, &len_shape, 1);\n\n    float noise_scale = config_.vits.noise_scale;\n    float length_scale = config_.vits.length_scale;\n    float noise_scale_w = config_.vits.noise_scale_w;\n\n    if (speed != 1 && speed > 0) {\n      length_scale = 1. / speed;\n    }\n    std::array<float, 3> scales = {noise_scale, length_scale, noise_scale_w};\n\n    int64_t scale_shape = 3;\n\n    Ort::Value scales_tensor = Ort::Value::CreateTensor(\n        memory_info, scales.data(), scales.size(), &scale_shape, 1);\n\n    int64_t sid_shape = 1;\n    Ort::Value sid_tensor =\n        Ort::Value::CreateTensor(memory_info, &sid, 1, &sid_shape, 1);\n\n    int64_t lang_id_shape = 1;\n    int64_t lang_id = 0;\n    Ort::Value lang_id_tensor =\n        Ort::Value::CreateTensor(memory_info, &lang_id, 1, &lang_id_shape, 1);\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(5);\n    inputs.push_back(std::move(x));\n    inputs.push_back(std::move(x_length));\n    inputs.push_back(std::move(scales_tensor));\n\n    if (input_names_.size() >= 4 && input_names_[3] == \"sid\") {\n      inputs.push_back(std::move(sid_tensor));\n    }\n\n    if (input_names_.size() >= 5 && input_names_[4] == \"langid\") {\n      inputs.push_back(std::move(lang_id_tensor));\n    }\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n  Ort::Value RunVits(Ort::Value x, int64_t sid, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = x.GetTensorTypeAndShapeInfo().GetShape();\n    if (x_shape[0] != 1) {\n      
SHERPA_ONNX_LOGE(\"Support only batch_size == 1. Given: %d\",\n                       static_cast<int32_t>(x_shape[0]));\n      exit(-1);\n    }\n\n    int64_t len = x_shape[1];\n    int64_t len_shape = 1;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &len, 1, &len_shape, 1);\n\n    int64_t scale_shape = 1;\n    float noise_scale = config_.vits.noise_scale;\n    float length_scale = config_.vits.length_scale;\n    float noise_scale_w = config_.vits.noise_scale_w;\n\n    if (speed != 1 && speed > 0) {\n      length_scale = 1. / speed;\n    }\n\n    Ort::Value noise_scale_tensor =\n        Ort::Value::CreateTensor(memory_info, &noise_scale, 1, &scale_shape, 1);\n\n    Ort::Value length_scale_tensor = Ort::Value::CreateTensor(\n        memory_info, &length_scale, 1, &scale_shape, 1);\n\n    Ort::Value noise_scale_w_tensor = Ort::Value::CreateTensor(\n        memory_info, &noise_scale_w, 1, &scale_shape, 1);\n\n    Ort::Value sid_tensor =\n        Ort::Value::CreateTensor(memory_info, &sid, 1, &scale_shape, 1);\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(6);\n    inputs.push_back(std::move(x));\n    inputs.push_back(std::move(x_length));\n    inputs.push_back(std::move(noise_scale_tensor));\n    inputs.push_back(std::move(length_scale_tensor));\n    inputs.push_back(std::move(noise_scale_w_tensor));\n\n    if (input_names_.size() == 6 &&\n        (input_names_.back() == \"sid\" || input_names_.back() == \"speaker\")) {\n      inputs.push_back(std::move(sid_tensor));\n    }\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    return std::move(out[0]);\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  
std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OfflineTtsVitsModelMetaData meta_data_;\n};\n\nOfflineTtsVitsModel::OfflineTtsVitsModel(const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsVitsModel::OfflineTtsVitsModel(Manager *mgr,\n                                         const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsVitsModel::~OfflineTtsVitsModel() = default;\n\nOrt::Value OfflineTtsVitsModel::Run(Ort::Value x, int64_t sid /*=0*/,\n                                    float speed /*= 1.0*/) {\n  return impl_->Run(std::move(x), sid, speed);\n}\n\nOrt::Value OfflineTtsVitsModel::Run(Ort::Value x, Ort::Value tones,\n                                    int64_t sid /*= 0*/,\n                                    float speed /*= 1.0*/) const {\n  return impl_->Run(std::move(x), std::move(tones), sid, speed);\n}\n\nconst OfflineTtsVitsModelMetaData &OfflineTtsVitsModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsVitsModel::OfflineTtsVitsModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsVitsModel::OfflineTtsVitsModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-vits-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-vits-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_H_\n\n#include <memory>\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsVitsModel {\n public:\n  ~OfflineTtsVitsModel();\n\n  explicit OfflineTtsVitsModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsVitsModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  /** Run the model.\n   *\n   * @param x A int64 tensor of shape (1, num_tokens)\n  // @param sid Speaker ID. Used only for multi-speaker models, e.g., models\n  //            trained using the VCTK dataset. It is not used for\n  //            single-speaker models, e.g., models trained using the ljspeech\n  //            dataset.\n   * @return Return a float32 tensor containing audio samples. You can flatten\n   *         it to a 1-D tensor.\n   */\n  Ort::Value Run(Ort::Value x, int64_t sid = 0, float speed = 1.0);\n\n  // This is for MeloTTS\n  Ort::Value Run(Ort::Value x, Ort::Value tones, int64_t sid = 0,\n                 float speed = 1.0) const;\n\n  const OfflineTtsVitsModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_VITS_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-impl.h",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"kaldi-native-fbank/csrc/mel-computations.h\"\n#include \"kaldi-native-fbank/csrc/stft.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/matcha-tts-lexicon.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/vocoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsZipvoiceImpl : public OfflineTtsImpl {\n public:\n  explicit OfflineTtsZipvoiceImpl(const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsZipvoiceModel>(config.model)),\n        vocoder_(Vocoder::Create(config.model)) {\n    InitFrontend();\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  OfflineTtsZipvoiceImpl(Manager *mgr, const OfflineTtsConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineTtsZipvoiceModel>(mgr, config.model)),\n        vocoder_(Vocoder::Create(mgr, config.model)) {\n    InitFrontend(mgr);\n\n    PostInit();\n  }\n\n  int32_t SampleRate() const override {\n    return model_->GetMetaData().sample_rate;\n  }\n\n  GeneratedAudio Generate(\n      const std::string &text, const GenerationConfig &config,\n      GeneratedAudioCallback callback = nullptr) const override {\n    // Supported extra options in config.extra:\n    //   - \"speed\" 
(float): Speech speed factor (default: 1.0)\n    //   - \"num_steps\" (int): Number of flow-matching steps (default: 4)\n    //   - \"max_char_in_sentence\" (int): Max characters per chunk (default: 200)\n    //   - \"min_char_in_sentence\" (int): Merge shorter chunks until this size\n    //     (default: 30)\n    //   - \"feat_scale\" (float): Prompt mel log scaling factor (default:\n    //     config.model.zipvoice.feat_scale)\n    //   - \"t_shift\" (float): Timestep shift used by the decoder schedule\n    //     (default: config.model.zipvoice.t_shift)\n    //   - \"target_rms\" (float): Prompt RMS normalization target (default:\n    //     config.model.zipvoice.target_rms)\n    //   - \"guidance_scale\" (float): Classifier-free guidance scale for the\n    //     decoder (default: config.model.zipvoice.guidance_scale)\n    if (config_.model.debug) {\n      SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n    }\n\n    if (config.reference_sample_rate <= 0) {\n      SHERPA_ONNX_LOGE(\"reference_sample_rate %d is invalid.\",\n                       config.reference_sample_rate);\n      return {};\n    }\n\n    if (config.reference_audio.empty()) {\n      SHERPA_ONNX_LOGE(\"reference_audio is empty.\");\n      return {};\n    }\n\n    if (config.reference_text.empty()) {\n      SHERPA_ONNX_LOGE(\"reference_text is empty.\");\n      return {};\n    }\n\n    float speed =\n        config.GetExtraFloat(\"speed\", config.speed > 0 ? config.speed : 1.0f);\n    if (speed <= 0) {\n      SHERPA_ONNX_LOGE(\"Speed must be > 0. Given: %f\", speed);\n      return {};\n    }\n\n    int32_t num_steps = config.GetExtraInt(\n        \"num_steps\", config.num_steps > 0 ? config.num_steps : 4);\n    if (num_steps <= 0) {\n      SHERPA_ONNX_LOGE(\"Num steps must be > 0. 
Given: %d\", num_steps);\n      return {};\n    }\n\n    float feat_scale =\n        config.GetExtraFloat(\"feat_scale\", config_.model.zipvoice.feat_scale);\n    if (feat_scale <= 0) {\n      SHERPA_ONNX_LOGE(\"feat_scale must be > 0. Given: %f\", feat_scale);\n      return {};\n    }\n\n    float t_shift =\n        config.GetExtraFloat(\"t_shift\", config_.model.zipvoice.t_shift);\n    if (t_shift < 0) {\n      SHERPA_ONNX_LOGE(\"t_shift must be >= 0. Given: %f\", t_shift);\n      return {};\n    }\n\n    float target_rms =\n        config.GetExtraFloat(\"target_rms\", config_.model.zipvoice.target_rms);\n    if (target_rms <= 0) {\n      SHERPA_ONNX_LOGE(\"target_rms must be > 0. Given: %f\", target_rms);\n      return {};\n    }\n\n    float guidance_scale = config.GetExtraFloat(\n        \"guidance_scale\", config_.model.zipvoice.guidance_scale);\n    if (guidance_scale <= 0) {\n      SHERPA_ONNX_LOGE(\"guidance_scale must be > 0. Given: %f\", guidance_scale);\n      return {};\n    }\n\n    std::vector<TokenIDs> prompt_token_ids =\n        frontend_->ConvertTextToTokenIds(config.reference_text);\n    if (prompt_token_ids.empty() ||\n        (prompt_token_ids.size() == 1 && prompt_token_ids[0].tokens.empty())) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Failed to convert prompt text '%{public}s' to token IDs\",\n          config.reference_text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Failed to convert prompt text '%s' to token IDs\",\n                       config.reference_text.c_str());\n#endif\n      return {};\n    }\n\n    std::vector<int64_t> prompt_tokens;\n    for (const auto &t : prompt_token_ids) {\n      prompt_tokens.insert(prompt_tokens.end(), t.tokens.begin(),\n                           t.tokens.end());\n    }\n\n    std::vector<float> prompt_features = ComputePromptFeatures(\n        config.reference_audio, config.reference_sample_rate, feat_scale,\n        target_rms);\n    if (prompt_features.empty()) {\n      
SHERPA_ONNX_LOGE(\"No frames extracted from the prompt audio\");\n      return {};\n    }\n\n    auto sentences = SplitByPunctuation(text);\n    if (sentences.empty()) {\n      return {};\n    }\n\n    int32_t max_char_in_sentence =\n        config.GetExtraInt(\"max_char_in_sentence\", 200);\n    int32_t min_char_in_sentence =\n        config.GetExtraInt(\"min_char_in_sentence\", 30);\n\n    if (max_char_in_sentence <= 0) {\n      SHERPA_ONNX_LOGE(\"max_char_in_sentence must be > 0. Given: %d\",\n                       max_char_in_sentence);\n      return {};\n    }\n\n    if (min_char_in_sentence <= 0) {\n      SHERPA_ONNX_LOGE(\"min_char_in_sentence must be > 0. Given: %d\",\n                       min_char_in_sentence);\n      return {};\n    }\n\n    sentences = MergeShortSentences(sentences, min_char_in_sentence);\n\n    std::vector<std::string> final_chunks;\n    for (const auto &s : sentences) {\n      auto pieces = SplitLongSentence(s, max_char_in_sentence);\n      final_chunks.insert(final_chunks.end(), pieces.begin(), pieces.end());\n    }\n\n    sentences = std::move(final_chunks);\n    if (sentences.empty()) {\n      return {};\n    }\n\n    GeneratedAudio result;\n    result.sample_rate = SampleRate();\n\n    const int32_t total = static_cast<int32_t>(sentences.size());\n\n    for (int32_t i = 0; i < total; ++i) {\n      if (config_.model.debug) {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Processing %{public}d/%{public}d: %{public}s\", i + 1,\n                         total, sentences[i].c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Processing %d/%d: %s\", i + 1, total,\n                         sentences[i].c_str());\n#endif\n      }\n\n      GeneratedAudio cur = GenerateChunk(\n          sentences[i], prompt_tokens, prompt_features, speed, num_steps,\n          feat_scale, t_shift, guidance_scale);\n\n      if (cur.samples.empty()) {\n        continue;\n      }\n\n      result.samples.insert(result.samples.end(), cur.samples.begin(),\n                 
           cur.samples.end());\n\n      if (callback) {\n        if (!callback(cur.samples.data(),\n                      static_cast<int32_t>(cur.samples.size()),\n                      (i + 1) * 1.0f / total)) {\n          break;\n        }\n      }\n    }\n\n    if (config.silence_scale != 1) {\n      result = result.ScaleSilence(config.silence_scale);\n    }\n\n    return result;\n  }\n\n  GeneratedAudio Generate(\n      const std::string &text, const std::string &prompt_text,\n      const std::vector<float> &prompt_samples, int32_t sample_rate,\n      float speed, int32_t num_steps,\n      GeneratedAudioCallback callback = nullptr) const override {\n    GenerationConfig config;\n    config.speed = speed;\n    config.num_steps = num_steps;\n    config.reference_text = prompt_text;\n    config.reference_audio = prompt_samples;\n    config.reference_sample_rate = sample_rate;\n    return Generate(text, config, std::move(callback));\n  }\n\n private:\n  void PostInit() { InitMelBanks(); }\n\n  void InitMelBanks() {\n    const auto &meta = model_->GetMetaData();\n    int32_t sample_rate = meta.sample_rate;\n    int32_t n_fft = meta.n_fft;\n    int32_t hop_length = meta.hop_length;\n    int32_t win_length = meta.window_length;\n    int32_t num_mels = meta.num_mels;\n\n    knf::FrameExtractionOptions frame_opts;\n    frame_opts.samp_freq = sample_rate;\n    // Use floating-point division so that the values in milliseconds are not\n    // truncated, e.g., 1024 * 1000 / 24000 is 42.67, not 42.\n    frame_opts.frame_length_ms = win_length * 1000.0f / sample_rate;\n    frame_opts.frame_shift_ms = hop_length * 1000.0f / sample_rate;\n    frame_opts.window_type = \"hanning\";\n\n    knf::MelBanksOptions mel_opts;\n    mel_opts.num_bins = num_mels;\n    mel_opts.low_freq = 0;\n    mel_opts.high_freq = sample_rate / 2;\n    mel_opts.is_librosa = true;\n    mel_opts.use_slaney_mel_scale = false;\n    mel_opts.norm = \"\";\n\n    mel_banks_ = std::make_unique<knf::MelBanks>(mel_opts, frame_opts, 1.0f);\n  }\n\n  template <typename Manager>\n  void InitFrontend(Manager *mgr) {\n    frontend_ = 
std::make_unique<MatchaTtsLexicon>(\n        mgr, config_.model.zipvoice.lexicon, config_.model.zipvoice.tokens,\n        config_.model.zipvoice.data_dir, config_.model.debug, true);\n  }\n\n  void InitFrontend() {\n    frontend_ = std::make_unique<MatchaTtsLexicon>(\n        config_.model.zipvoice.lexicon, config_.model.zipvoice.tokens,\n        config_.model.zipvoice.data_dir, config_.model.debug, true);\n  }\n\n  void ComputeMelSpectrogram(const std::vector<float> &_samples,\n                             int32_t sample_rate, float feat_scale,\n                             std::vector<float> *prompt_features) const {\n    const auto &meta = model_->GetMetaData();\n    if (sample_rate != meta.sample_rate) {\n      SHERPA_ONNX_LOGE(\n          \"Creating a resampler:\\n\"\n          \"   in_sample_rate: %d\\n\"\n          \"   output_sample_rate: %d\\n\",\n          sample_rate, static_cast<int32_t>(meta.sample_rate));\n\n      float min_freq = std::min<int32_t>(sample_rate, meta.sample_rate);\n      float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n      int32_t lowpass_filter_width = 6;\n      auto resampler = std::make_unique<LinearResample>(\n          sample_rate, meta.sample_rate, lowpass_cutoff, lowpass_filter_width);\n      std::vector<float> samples;\n      resampler->Resample(_samples.data(), _samples.size(), true, &samples);\n      ComputeMelSpectrogram(samples, feat_scale, prompt_features);\n      return;\n    }\n\n    ComputeMelSpectrogram(_samples, feat_scale, prompt_features);\n  }\n\n  void ComputeMelSpectrogram(const std::vector<float> &samples,\n                             float feat_scale,\n                             std::vector<float> *prompt_features) const {\n    const auto &meta = model_->GetMetaData();\n\n    int32_t n_fft = meta.n_fft;\n    int32_t hop_length = meta.hop_length;\n    int32_t win_length = meta.window_length;\n    int32_t num_mels = meta.num_mels;\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = n_fft;\n    
stft_config.hop_length = hop_length;\n    stft_config.win_length = win_length;\n    stft_config.window_type = \"hann\";\n    stft_config.center = true;\n\n    knf::Stft stft(stft_config);\n    auto stft_result = stft.Compute(samples.data(), samples.size());\n    int32_t num_frames = stft_result.num_frames;\n    int32_t fft_bins = n_fft / 2 + 1;\n\n    prompt_features->resize(num_frames * num_mels);\n    float *p = prompt_features->data();\n\n    std::vector<float> magnitude_spectrum(fft_bins);\n\n    for (int32_t i = 0; i < num_frames; ++i, p += num_mels) {\n      for (int32_t k = 0; k < fft_bins; ++k) {\n        float real = stft_result.real[i * fft_bins + k];\n        float imag = stft_result.imag[i * fft_bins + k];\n        magnitude_spectrum[k] = std::sqrt(real * real + imag * imag);\n      }\n\n      mel_banks_->Compute(magnitude_spectrum.data(), p);\n\n      for (int32_t j = 0; j < num_mels; ++j) {\n        p[j] = std::log(p[j] + 1e-10f) * feat_scale;\n      }\n    }\n  }\n\n  GeneratedAudio GenerateChunk(const std::string &text,\n                               const std::vector<int64_t> &prompt_tokens,\n                               const std::vector<float> &prompt_features,\n                               float speed, int32_t num_steps, float feat_scale,\n                               float t_shift, float guidance_scale) const {\n    std::vector<TokenIDs> text_token_ids =\n        frontend_->ConvertTextToTokenIds(text);\n\n    if (text_token_ids.empty() ||\n        (text_token_ids.size() == 1 && text_token_ids[0].tokens.empty())) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Failed to convert '%{public}s' to token IDs\",\n                       text.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Failed to convert '%s' to token IDs\", text.c_str());\n#endif\n      return {};\n    }\n\n    std::vector<int64_t> tokens;\n    for (const auto &t : text_token_ids) {\n      tokens.insert(tokens.end(), t.tokens.begin(), t.tokens.end());\n    }\n\n    return 
Process(tokens, prompt_tokens, prompt_features, speed, num_steps,\n                   feat_scale, t_shift, guidance_scale);\n  }\n\n  std::vector<float> ComputePromptFeatures(\n      const std::vector<float> &prompt_samples, int32_t sample_rate,\n      float feat_scale, float target_rms) const {\n    std::vector<float> prompt_samples_scaled = prompt_samples;\n    double prompt_rms = 0.0;\n    double sum_sq = 0.0;\n    for (float s : prompt_samples_scaled) {\n      sum_sq += s * s;\n    }\n    prompt_rms = std::sqrt(sum_sq / prompt_samples_scaled.size());\n    if (prompt_rms < target_rms && prompt_rms > 0.0f) {\n      float scale = target_rms / prompt_rms;\n      for (auto &s : prompt_samples_scaled) {\n        s *= scale;\n      }\n    }\n\n    std::vector<float> prompt_features;\n    ComputeMelSpectrogram(prompt_samples_scaled, sample_rate, feat_scale,\n                          &prompt_features);\n\n    return prompt_features;\n  }\n\n  GeneratedAudio Process(const std::vector<int64_t> &tokens,\n                         const std::vector<int64_t> &prompt_tokens,\n                         const std::vector<float> &prompt_features, float speed,\n                         int32_t num_steps, float feat_scale, float t_shift,\n                         float guidance_scale) const {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> tokens_shape = {1,\n                                           static_cast<int64_t>(tokens.size())};\n\n    Ort::Value tokens_tensor = Ort::Value::CreateTensor(\n        memory_info, const_cast<int64_t *>(tokens.data()), tokens.size(),\n        tokens_shape.data(), tokens_shape.size());\n\n    std::array<int64_t, 2> prompt_tokens_shape = {\n        1, static_cast<int64_t>(prompt_tokens.size())};\n\n    Ort::Value prompt_tokens_tensor = Ort::Value::CreateTensor(\n        memory_info, const_cast<int64_t *>(prompt_tokens.data()),\n        prompt_tokens.size(), 
prompt_tokens_shape.data(),\n        prompt_tokens_shape.size());\n\n    int32_t mel_dim = model_->GetMetaData().num_mels;\n\n    int32_t num_frames = prompt_features.size() / mel_dim;\n\n    std::array<int64_t, 3> shape = {1, num_frames, mel_dim};\n    auto prompt_features_tensor = Ort::Value::CreateTensor(\n        memory_info, const_cast<float *>(prompt_features.data()),\n        prompt_features.size(), shape.data(), shape.size());\n\n    Ort::Value mel =\n        model_->Run(std::move(tokens_tensor), std::move(prompt_tokens_tensor),\n                    std::move(prompt_features_tensor), speed, num_steps,\n                    t_shift, guidance_scale);\n\n    // Assume mel_shape = {1, T, C}\n    std::vector<int64_t> mel_shape = mel.GetTensorTypeAndShapeInfo().GetShape();\n    int64_t T = mel_shape[1];\n    int64_t C = mel_shape[2];\n\n    const float *mel_data = mel.GetTensorData<float>();\n\n    float inv_feat_scale = 1 / feat_scale;\n\n    // mel_permuted is (C, T)\n    std::vector<float> mel_permuted = Transpose(mel_data, T, C);\n\n    Scale(mel_permuted.data(), inv_feat_scale, mel_permuted.size(),\n          mel_permuted.data());\n\n    std::array<int64_t, 3> new_shape = {1, C, T};\n    Ort::Value mel_new = Ort::Value::CreateTensor<float>(\n        memory_info, mel_permuted.data(), mel_permuted.size(), new_shape.data(),\n        new_shape.size());\n\n    GeneratedAudio ans;\n    ans.samples = vocoder_->Run(std::move(mel_new));\n    ans.sample_rate = model_->GetMetaData().sample_rate;\n    return ans;\n  }\n\n private:\n  OfflineTtsConfig config_;\n  std::unique_ptr<OfflineTtsZipvoiceModel> model_;\n  std::unique_ptr<Vocoder> vocoder_;\n  std::unique_ptr<OfflineTtsFrontend> frontend_;\n\n  std::unique_ptr<knf::MelBanks> mel_banks_;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineTtsZipvoiceModelConfig::Register(ParseOptions *po) {\n  po->Register(\"zipvoice-tokens\", &tokens,\n               \"Path to tokens.txt for ZipVoice models\");\n  po->Register(\"zipvoice-data-dir\", &data_dir,\n               \"Path to the directory containing dict for espeak-ng.\");\n  po->Register(\"zipvoice-lexicon\", &lexicon, \"Path to lexicon.txt for Chinese\");\n  po->Register(\"zipvoice-encoder\", &encoder, \"Path to zipvoice text model\");\n  po->Register(\"zipvoice-decoder\", &decoder,\n               \"Path to zipvoice flow-matching decoder model\");\n  po->Register(\"zipvoice-vocoder\", &vocoder, \"Path to zipvoice vocoder\");\n  po->Register(\"zipvoice-feat-scale\", &feat_scale,\n               \"Feature scale for ZipVoice (default: 0.1)\");\n  po->Register(\"zipvoice-t-shift\", &t_shift,\n               \"Shift t to smaller ones if t_shift < 1.0 (default: 0.5)\");\n  po->Register(\n      \"zipvoice-target-rms\", &target_rms,\n      \"Target speech normalization rms value for ZipVoice (default: 0.1)\");\n  po->Register(\n      \"zipvoice-guidance-scale\", &guidance_scale,\n      \"The scale of classifier-free guidance during inference for ZipVoice \"\n      \"(default: 1.0)\");\n}\n\nbool OfflineTtsZipvoiceModelConfig::Validate() const {\n  if (tokens.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --zipvoice-tokens\");\n    return false;\n  }\n  if (!FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-tokens: '%s' does not exist\", tokens.c_str());\n    return false;\n  }\n\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --zipvoice-encoder\");\n    return false;\n  }\n  if 
(!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-encoder: '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --zipvoice-decoder\");\n    return false;\n  }\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-decoder: '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  if (vocoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --zipvoice-vocoder\");\n    return false;\n  }\n\n  if (!FileExists(vocoder)) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-vocoder: '%s' does not exist\",\n                     vocoder.c_str());\n    return false;\n  }\n\n  if (!data_dir.empty()) {\n    std::vector<std::string> required_files = {\n        \"phontab\",\n        \"phonindex\",\n        \"phondata\",\n        \"intonations\",\n    };\n    for (const auto &f : required_files) {\n      if (!FileExists(data_dir + \"/\" + f)) {\n        SHERPA_ONNX_LOGE(\n            \"'%s/%s' does not exist. Please check zipvoice-data-dir\",\n            data_dir.c_str(), f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (feat_scale <= 0) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-feat-scale must be positive. Given: %f\",\n                     feat_scale);\n    return false;\n  }\n\n  if (t_shift < 0) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-t-shift must be non-negative. Given: %f\",\n                     t_shift);\n    return false;\n  }\n\n  if (target_rms <= 0) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-target-rms must be positive. Given: %f\",\n                     target_rms);\n    return false;\n  }\n\n  if (guidance_scale <= 0) {\n    SHERPA_ONNX_LOGE(\"--zipvoice-guidance-scale must be positive. 
Given: %f\",\n                     guidance_scale);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineTtsZipvoiceModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsZipvoiceModelConfig(\";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"vocoder=\\\"\" << vocoder << \"\\\", \";\n  os << \"data_dir=\\\"\" << data_dir << \"\\\", \";\n  os << \"lexicon=\\\"\" << lexicon << \"\\\", \";\n  os << \"feat_scale=\" << feat_scale << \", \";\n  os << \"t_shift=\" << t_shift << \", \";\n  os << \"target_rms=\" << target_rms << \", \";\n  os << \"guidance_scale=\" << guidance_scale << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n\n#include <cstdint>\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsZipvoiceModelConfig {\n  std::string tokens;\n  std::string encoder;\n  std::string decoder;\n  std::string vocoder;\n\n  std::string data_dir;\n  std::string lexicon;\n\n  float feat_scale = 0.1;\n  float t_shift = 0.5;\n  float target_rms = 0.1;\n  float guidance_scale = 1.0;\n\n  OfflineTtsZipvoiceModelConfig() = default;\n\n  OfflineTtsZipvoiceModelConfig(\n      const std::string &tokens, const std::string &encoder,\n      const std::string &decoder, const std::string &vocoder,\n      const std::string &data_dir, const std::string &lexicon,\n      float feat_scale = 0.1, float t_shift = 0.5, float target_rms = 0.1,\n      float guidance_scale = 1.0)\n      : tokens(tokens),\n        encoder(encoder),\n        decoder(decoder),\n        vocoder(vocoder),\n        data_dir(data_dir),\n        lexicon(lexicon),\n        feat_scale(feat_scale),\n        t_shift(t_shift),\n        target_rms(target_rms),\n        guidance_scale(guidance_scale) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-model-meta-data.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// If you are not sure what each field means, please\n// have a look of the Python file in the model directory that\n// you have downloaded.\nstruct OfflineTtsZipvoiceModelMetaData {\n  int32_t version = 1;\n  int32_t feat_dim = 100;\n  int32_t sample_rate = 24000;\n  int32_t n_fft = 1024;\n  int32_t hop_length = 256;\n  int32_t window_length = 1024;\n  int32_t num_mels = 100;\n  int32_t use_espeak = 1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-model.cc",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model.h\"\n\n#include <algorithm>\n#include <cstring>\n#include <iostream>\n#include <memory>\n#include <random>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/normal-data-generator.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsZipvoiceModel::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config.zipvoice.encoder);\n    InitEncoder(buf.data(), buf.size());\n\n    buf = ReadFile(config.zipvoice.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config.zipvoice.encoder);\n    InitEncoder(buf.data(), buf.size());\n\n    buf = ReadFile(mgr, config.zipvoice.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  const OfflineTtsZipvoiceModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n  Ort::Value Run(Ort::Value tokens, Ort::Value prompt_tokens,\n                 Ort::Value prompt_features, float speed, int32_t num_steps,\n                 float t_shift, float guidance_scale) {\n    std::vector<int64_t> 
tokens_shape =\n        tokens.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t batch_size = tokens_shape[0];\n\n    std::vector<int64_t> prompt_feat_shape =\n        prompt_features.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t prompt_feat_len = prompt_feat_shape[1];\n\n    Ort::Value text_condition =\n        RunEncoder(std::move(tokens), std::move(prompt_tokens),\n                   View(&prompt_features), speed);\n\n    std::vector<int64_t> text_cond_shape =\n        text_condition.GetTensorTypeAndShapeInfo().GetShape();\n    int64_t num_frames = text_cond_shape[1];\n\n    int64_t feat_dim = meta_data_.feat_dim;\n\n    std::vector<float> x_data(batch_size * num_frames * feat_dim);\n\n    normal_gen_.Fill(x_data.data(), x_data.size());\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> x_shape = {batch_size, num_frames, feat_dim};\n    Ort::Value x = Ort::Value::CreateTensor<float>(\n        memory_info, x_data.data(), x_data.size(), x_shape.data(),\n        x_shape.size());\n\n    // For each batch item, the first prompt_feat_len frames hold the prompt\n    // features; the remaining frames stay zero.\n    std::vector<float> speech_cond_data(batch_size * num_frames * feat_dim);\n    const float *src = prompt_features.GetTensorData<float>();\n    float *dst = speech_cond_data.data();\n    for (int64_t b = 0; b < batch_size; ++b) {\n      std::copy(src + b * prompt_feat_len * feat_dim,\n                src + (b + 1) * prompt_feat_len * feat_dim,\n                dst + b * num_frames * feat_dim);\n    }\n    prompt_features = Ort::Value{nullptr};\n\n    std::vector<int64_t> speech_cond_shape = {batch_size, num_frames, feat_dim};\n\n    Ort::Value speech_condition = Ort::Value::CreateTensor<float>(\n        memory_info, speech_cond_data.data(), speech_cond_data.size(),\n        speech_cond_shape.data(), speech_cond_shape.size());\n\n    std::vector<float> timesteps(num_steps + 1);\n    for (int32_t i = 0; i <= num_steps; ++i) {\n      float t = static_cast<float>(i) / num_steps;\n      timesteps[i] = t_shift * t / (1.0f + (t_shift - 1.0f) * t);\n    }\n\n    int64_t guidance_scale_shape = 1;\n    Ort::Value guidance_scale_tensor = 
Ort::Value::CreateTensor<float>(\n        memory_info, &guidance_scale, 1, &guidance_scale_shape, 1);\n\n    float *x_ptr = x.GetTensorMutableData<float>();\n\n    int64_t N = batch_size * num_frames * feat_dim;\n\n    for (int32_t step = 0; step < num_steps; ++step) {\n      float t = timesteps[step];\n\n      Ort::Value v =\n          RunDecoder(t, View(&x), View(&text_condition),\n                     View(&speech_condition), View(&guidance_scale_tensor));\n\n      float delta_t = timesteps[step + 1] - timesteps[step];\n\n      const float *v_ptr = v.GetTensorData<float>();\n      for (int64_t i = 0; i < N; ++i) {\n        x_ptr[i] += v_ptr[i] * delta_t;\n      }\n    }\n\n    int64_t kept_frames = num_frames - prompt_feat_len;\n\n    std::vector<int64_t> out_shape = {batch_size, kept_frames, feat_dim};\n\n    Ort::Value ans = Ort::Value::CreateTensor<float>(\n        allocator_, out_shape.data(), out_shape.size());\n\n    float *p_out = ans.GetTensorMutableData<float>();\n\n    for (int64_t b = 0; b < batch_size; ++b) {\n      auto begin = x_ptr + (b * num_frames + prompt_feat_len) * feat_dim;\n      auto end = begin + kept_frames * feat_dim;\n      std::copy(begin, end, p_out);\n      p_out += kept_frames * feat_dim;\n    }\n\n    return ans;\n  }\n\n private:\n  void InitEncoder(void *encoder_data, size_t encoder_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, encoder_data, encoder_data_length, sess_opts_);\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_names_ptr_);\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.use_espeak, \"use_espeak\",\n                                            1);\n\n    if (config_.debug) 
{\n      std::ostringstream os;\n\n      os << \"---encoder---\\n\";\n      Ort::ModelMetadata text_meta_data = encoder_sess_->GetModelMetadata();\n      PrintModelMetadata(os, text_meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : encoder_input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : encoder_output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n  }\n\n  void InitDecoder(void *decoder_data, size_t decoder_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, decoder_data, decoder_data_length, sess_opts_);\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    auto meta_data = decoder_sess_->GetModelMetadata();\n\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.version, \"version\", 1);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.feat_dim, \"feat_dim\",\n                                            100);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.sample_rate,\n                                            \"sample_rate\", 24000);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.n_fft, \"n_fft\", 1024);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.hop_length, \"hop_length\",\n                                            256);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.window_length,\n                                            \"window_length\", 1024);\n    
SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_data_.num_mels, \"num_mels\",\n                                            100);\n\n    if (config_.debug) {\n      std::ostringstream os;\n\n      os << \"---decoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : decoder_input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : decoder_output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n  }\n\n  Ort::Value RunEncoder(Ort::Value tokens, Ort::Value prompt_tokens,\n                        Ort::Value prompt_features, float speed) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::vector<int64_t> tokens_shape =\n        tokens.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t batch_size = tokens_shape[0];\n    if (batch_size != 1) {\n      SHERPA_ONNX_LOGE(\"Support only batch_size == 1. 
Given: %d\",\n                       static_cast<int32_t>(batch_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int64_t> prompt_feat_shape =\n        prompt_features.GetTensorTypeAndShapeInfo().GetShape();\n\n    int64_t prompt_feat_len = prompt_feat_shape[1];\n    int64_t prompt_feat_len_shape = 1;\n    Ort::Value prompt_feat_len_tensor = Ort::Value::CreateTensor<int64_t>(\n        memory_info, &prompt_feat_len, 1, &prompt_feat_len_shape, 1);\n\n    int64_t speed_shape = 1;\n    Ort::Value speed_tensor = Ort::Value::CreateTensor<float>(\n        memory_info, &speed, 1, &speed_shape, 1);\n\n    std::vector<Ort::Value> encoder_inputs;\n    encoder_inputs.reserve(4);\n    encoder_inputs.push_back(std::move(tokens));\n    encoder_inputs.push_back(std::move(prompt_tokens));\n    encoder_inputs.push_back(std::move(prompt_feat_len_tensor));\n    encoder_inputs.push_back(std::move(speed_tensor));\n\n    auto encoder_out = encoder_sess_->Run(\n        {}, encoder_names_ptr_.data(), encoder_inputs.data(),\n        encoder_inputs.size(), encoder_output_names_ptr_.data(),\n        encoder_output_names_ptr_.size());\n\n    return std::move(encoder_out[0]);\n  }\n\n  Ort::Value RunDecoder(float t, Ort::Value x, Ort::Value text_condition,\n                        Ort::Value speech_condition,\n                        Ort::Value guidance_scale_tensor) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int64_t t_shape = 1;\n    Ort::Value t_tensor =\n        Ort::Value::CreateTensor<float>(memory_info, &t, 1, &t_shape, 1);\n\n    std::vector<Ort::Value> decoder_inputs;\n    decoder_inputs.reserve(5);\n    decoder_inputs.emplace_back(std::move(t_tensor));\n    decoder_inputs.push_back(std::move(x));\n    decoder_inputs.push_back(std::move(text_condition));\n    decoder_inputs.push_back(std::move(speech_condition));\n    decoder_inputs.push_back(std::move(guidance_scale_tensor));\n\n    auto decoder_out = 
decoder_sess_->Run(\n        {}, decoder_input_names_ptr_.data(), decoder_inputs.data(),\n        decoder_inputs.size(), decoder_output_names_ptr_.data(),\n        decoder_output_names_ptr_.size());\n\n    return std::move(decoder_out[0]);\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  OfflineTtsZipvoiceModelMetaData meta_data_;\n  NormalDataGenerator normal_gen_;\n};\n\nOfflineTtsZipvoiceModel::OfflineTtsZipvoiceModel(\n    const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineTtsZipvoiceModel::OfflineTtsZipvoiceModel(\n    Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineTtsZipvoiceModel::~OfflineTtsZipvoiceModel() = default;\n\nconst OfflineTtsZipvoiceModelMetaData &OfflineTtsZipvoiceModel::GetMetaData()\n    const {\n  return impl_->GetMetaData();\n}\n\nOrt::Value OfflineTtsZipvoiceModel::Run(Ort::Value tokens,\n                                        Ort::Value prompt_tokens,\n                                        Ort::Value prompt_features,\n                                        float speed /*= 1.0*/,\n                                        int32_t num_steps /*= 16*/,\n                                        float t_shift /*= 0.5f*/,\n                                        float guidance_scale /*= 
1.0f*/) const {\n  return impl_->Run(std::move(tokens), std::move(prompt_tokens),\n                    std::move(prompt_features), speed, num_steps, t_shift,\n                    guidance_scale);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTtsZipvoiceModel::OfflineTtsZipvoiceModel(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTtsZipvoiceModel::OfflineTtsZipvoiceModel(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts-zipvoice-model.h",
    "content": "// sherpa-onnx/csrc/offline-tts-zipvoice-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_H_\n\n#include <memory>\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineTtsZipvoiceModel {\n public:\n  ~OfflineTtsZipvoiceModel();\n\n  explicit OfflineTtsZipvoiceModel(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  OfflineTtsZipvoiceModel(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  // Return a float32 tensor containing the mel\n  // of shape (batch_size, mel_dim, num_frames)\n  Ort::Value Run(Ort::Value tokens, Ort::Value prompt_tokens,\n                 Ort::Value prompt_features, float speed, int32_t num_steps,\n                 float t_shift = 0.5f,\n                 float guidance_scale = 1.0f) const;\n\n  const OfflineTtsZipvoiceModelMetaData &GetMetaData() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts.cc",
    "content": "// sherpa-onnx/csrc/offline-tts.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n\n#include <cmath>\n#include <map>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-tts-impl.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nstruct SilenceInterval {\n  int32_t start;\n  int32_t end;\n};\n\nGeneratedAudio GeneratedAudio::ScaleSilence(float scale) const {\n  if (scale == 1) {\n    return *this;\n  }\n  // if the interval is larger than 0.2 second, then we assume it is a pause\n  int32_t threshold = static_cast<int32_t>(sample_rate * 0.2);\n\n  std::vector<SilenceInterval> intervals;\n  int32_t num_samples = static_cast<int32_t>(samples.size());\n\n  int32_t last = -1;\n  int32_t i;\n  for (i = 0; i != num_samples; ++i) {\n    if (fabs(samples[i]) <= 0.01) {\n      if (last == -1) {\n        last = i;\n      }\n      continue;\n    }\n\n    if (last != -1 && i - last < threshold) {\n      last = -1;\n      continue;\n    }\n\n    if (last != -1) {\n      intervals.push_back({last, i});\n      last = -1;\n    }\n  }\n\n  if (last != -1 && num_samples - last > threshold) {\n    intervals.push_back({last, num_samples});\n  }\n\n  if (intervals.empty()) {\n    return *this;\n  }\n\n  GeneratedAudio ans;\n  ans.sample_rate = sample_rate;\n  ans.samples.reserve(samples.size());\n\n  i = 0;\n  for (const auto &interval : intervals) {\n    ans.samples.insert(ans.samples.end(), samples.begin() + i,\n                       samples.begin() + interval.start);\n    i = interval.end;\n    int32_t n = static_cast<int32_t>((interval.end - interval.start) * scale);\n\n    
ans.samples.insert(ans.samples.end(), samples.begin() + interval.start,\n                       samples.begin() + interval.start + n);\n  }\n\n  if (i < num_samples) {\n    ans.samples.insert(ans.samples.end(), samples.begin() + i, samples.end());\n  }\n\n  return ans;\n}\n\nstd::string GenerationConfig::GetExtraString(\n    const std::string &key, const std::string &def /*= \"\"*/) const {\n  auto it = extra.find(key);\n  return it == extra.end() ? def : it->second;\n}\n\nint32_t GenerationConfig::GetExtraInt(const std::string &key,\n                                      int32_t def) const {\n  auto it = extra.find(key);\n  if (it == extra.end()) {\n    return def;\n  }\n\n  return ToIntOrDefault(it->second, def);\n}\n\nfloat GenerationConfig::GetExtraFloat(const std::string &key, float def) const {\n  auto it = extra.find(key);\n  if (it == extra.end()) {\n    return def;\n  }\n\n  return ToFloatOrDefault(it->second, def);\n}\n\nstd::string GenerationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"GenerationConfig(\";\n  os << \"silence_scale=\" << silence_scale;\n  os << \", speed=\" << speed;\n  os << \", sid=\" << sid;\n  os << \", num_steps=\" << num_steps;\n  os << \", reference_audio_len=\" << reference_audio.size();\n  os << \", reference_sample_rate=\" << reference_sample_rate;\n\n  if (!reference_text.empty()) {\n    os << \", reference_text=\\\"\" << reference_text << \"\\\"\";\n  }\n\n  if (!extra.empty()) {\n    os << \", extra={\";\n    std::string sep;\n\n    std::map<std::string, std::string> sorted(extra.begin(), extra.end());\n\n    for (const auto &kv : sorted) {\n      os << sep << kv.first << \": \\\"\" << kv.second << \"\\\"\";\n      sep = \", \";\n    }\n    os << \"}\";\n  }\n\n  os << \")\";\n  return os.str();\n}\n\nvoid OfflineTtsConfig::Register(ParseOptions *po) {\n  model.Register(po);\n\n  po->Register(\"tts-rule-fsts\", &rule_fsts,\n               \"If not empty, it contains a list of rule FST filenames. \"\n       
        \"Multiple filenames are separated by a comma and they are \"\n               \"applied from left to right. An example value: \"\n               \"rule1.fst,rule2.fst,rule3.fst\");\n\n  po->Register(\"tts-rule-fars\", &rule_fars,\n               \"It not empty, it contains a list of rule FST archive filenames.\"\n               \"Multiple filenames are separated by a comma and they are \"\n               \"applied from left to right. An example value: \"\n               \"rule1.far,rule2.far,rule3.far. Note that an *.far can contain \"\n               \"multiple *.fst files\");\n\n  po->Register(\n      \"tts-max-num-sentences\", &max_num_sentences,\n      \"Maximum number of sentences that we process at a time. \"\n      \"This is to avoid OOM for very long input text. \"\n      \"If you set it to -1, then we process all sentences in a single batch.\");\n\n  po->Register(\"tts-silence-scale\", &silence_scale,\n               \"Duration of the pause is scaled by this number. So a smaller \"\n               \"value leads to a shorter pause.\");\n}\n\nbool OfflineTtsConfig::Validate() const {\n  if (!rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fsts, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule fst '%s' does not exist. \", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!rule_fars.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fars, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule far '%s' does not exist. 
\", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (silence_scale < 0.001) {\n    SHERPA_ONNX_LOGE(\"--tts-silence-scale '%.3f' is too small\", silence_scale);\n    return false;\n  }\n\n  return model.Validate();\n}\n\nstd::string OfflineTtsConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineTtsConfig(\";\n  os << \"model=\" << model.ToString() << \", \";\n  os << \"rule_fsts=\\\"\" << rule_fsts << \"\\\", \";\n  os << \"rule_fars=\\\"\" << rule_fars << \"\\\", \";\n  os << \"max_num_sentences=\" << max_num_sentences << \", \";\n  os << \"silence_scale=\" << silence_scale << \")\";\n\n  return os.str();\n}\n\nOfflineTts::OfflineTts(const OfflineTtsConfig &config)\n    : impl_(OfflineTtsImpl::Create(config)) {}\n\ntemplate <typename Manager>\nOfflineTts::OfflineTts(Manager *mgr, const OfflineTtsConfig &config)\n    : impl_(OfflineTtsImpl::Create(mgr, config)) {}\n\nOfflineTts::~OfflineTts() = default;\n\nGeneratedAudio OfflineTts::Generate(\n    const std::string &text, int64_t sid /*=0*/, float speed /*= 1.0*/,\n    GeneratedAudioCallback callback /*= nullptr*/) const {\n#if !defined(_WIN32)\n  return impl_->Generate(text, sid, speed, std::move(callback));\n#else\n  if (IsUtf8(text)) {\n    return impl_->Generate(text, sid, speed, std::move(callback));\n  } else if (IsGB2312(text)) {\n    auto utf8_text = Gb2312ToUtf8(text);\n    static bool printed = false;\n    if (!printed) {\n      SHERPA_ONNX_LOGE(\n          \"Detected GB2312 encoded string! Converting it to UTF8.\");\n      printed = true;\n    }\n    return impl_->Generate(utf8_text, sid, speed, std::move(callback));\n  } else {\n    SHERPA_ONNX_LOGE(\n        \"Non UTF8 encoded string is received. 
You would not get expected \"\n        \"results!\");\n    return impl_->Generate(text, sid, speed, std::move(callback));\n  }\n#endif\n}\n\nGeneratedAudio OfflineTts::Generate(\n    const std::string &text, const std::string &prompt_text,\n    const std::vector<float> &prompt_samples, int32_t sample_rate,\n    float speed /*=1.0*/, int32_t num_steps /*=4*/,\n    GeneratedAudioCallback callback /*=nullptr*/) const {\n#if !defined(_WIN32)\n  return impl_->Generate(text, prompt_text, prompt_samples, sample_rate, speed,\n                         num_steps, std::move(callback));\n#else\n  static bool printed = false;\n  auto utf8_text = text;\n  if (IsGB2312(text)) {\n    utf8_text = Gb2312ToUtf8(text);\n    if (!printed) {\n      SHERPA_ONNX_LOGE(\"Detected GB2312 encoded text! Converting it to UTF8.\");\n      printed = true;\n    }\n  }\n  auto utf8_prompt_text = prompt_text;\n  if (IsGB2312(prompt_text)) {\n    utf8_prompt_text = Gb2312ToUtf8(prompt_text);\n    if (!printed) {\n      SHERPA_ONNX_LOGE(\n          \"Detected GB2312 encoded prompt text! Converting it to UTF8.\");\n      printed = true;\n    }\n  }\n  if (IsUtf8(utf8_text) && IsUtf8(utf8_prompt_text)) {\n    return impl_->Generate(utf8_text, utf8_prompt_text, prompt_samples,\n                           sample_rate, speed, num_steps, std::move(callback));\n  } else {\n    SHERPA_ONNX_LOGE(\n        \"Non UTF8 encoded string is received. 
You would not get expected \"\n        \"results!\");\n    return impl_->Generate(utf8_text, utf8_prompt_text, prompt_samples,\n                           sample_rate, speed, num_steps, std::move(callback));\n  }\n#endif\n}\n\nGeneratedAudio OfflineTts::Generate(\n    const std::string &text, const GenerationConfig &config,\n    GeneratedAudioCallback callback /*= nullptr*/) const {\n#if !defined(_WIN32)\n  return impl_->Generate(text, config, std::move(callback));\n#else\n  if (IsUtf8(text)) {\n    return impl_->Generate(text, config, std::move(callback));\n  } else if (IsGB2312(text)) {\n    auto utf8_text = Gb2312ToUtf8(text);\n    static bool printed = false;\n    if (!printed) {\n      SHERPA_ONNX_LOGE(\n          \"Detected GB2312 encoded string! Converting it to UTF8.\");\n      printed = true;\n    }\n    return impl_->Generate(utf8_text, config, std::move(callback));\n  } else {\n    SHERPA_ONNX_LOGE(\n        \"Non UTF8 encoded string is received. You would not get expected \"\n        \"results!\");\n    return impl_->Generate(text, config, std::move(callback));\n  }\n#endif\n}\n\nint32_t OfflineTts::SampleRate() const { return impl_->SampleRate(); }\n\nint32_t OfflineTts::NumSpeakers() const { return impl_->NumSpeakers(); }\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineTts::OfflineTts(AAssetManager *mgr,\n                                const OfflineTtsConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineTts::OfflineTts(NativeResourceManager *mgr,\n                                const OfflineTtsConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-tts.h",
    "content": "// sherpa-onnx/csrc/offline-tts.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_TTS_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_TTS_H_\n\n#include <cstdint>\n#include <functional>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineTtsConfig {\n  OfflineTtsModelConfig model;\n  // If not empty, it contains a list of rule FST filenames.\n  // Filenames are separated by a comma.\n  // Example value: rule1.fst,rule2,fst,rule3.fst\n  //\n  // If there are multiple rules, they are applied from left to right.\n  std::string rule_fsts;\n\n  // If there are multiple FST archives, they are applied from left to right.\n  std::string rule_fars;\n\n  // Maximum number of sentences that we process at a time.\n  // This is to avoid OOM for very long input text.\n  // If you set it to -1, then we process all sentences in a single batch.\n  int32_t max_num_sentences = 1;\n\n  // A silence interval contains audio samples with value close to 0.\n  //\n  // the duration of the new interval is old_duration * silence_scale.\n  float silence_scale = 0.2;\n\n  OfflineTtsConfig() = default;\n  OfflineTtsConfig(const OfflineTtsModelConfig &model,\n                   const std::string &rule_fsts, const std::string &rule_fars,\n                   int32_t max_num_sentences, float silence_scale)\n      : model(model),\n        rule_fsts(rule_fsts),\n        rule_fars(rule_fars),\n        max_num_sentences(max_num_sentences),\n        silence_scale(silence_scale) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nstruct GeneratedAudio {\n  std::vector<float> samples;\n  int32_t sample_rate;\n\n  // Silence means pause here.\n  // If scale > 1, then it increases the duration of a pause\n  // If scale < 1, then it 
reduces the duration of a pause\n  GeneratedAudio ScaleSilence(float scale) const;\n};\n\nstruct GenerationConfig {\n  float silence_scale = 0.2;\n\n  float speed = 1.0f;  // used only by some models.\n  int32_t sid = 0;     // used only by models that support multiple speakers\n\n  std::vector<float> reference_audio;  // mono, [-1, 1]\n  int32_t reference_sample_rate = 0;   // sample rate of reference_audio\n  std::string reference_text;          // not all models require this\n  int32_t num_steps = 5;               // number of steps in flow matching\n\n  // model specific\n  // Please see the Generate method of each model in ./offline-tts-xx-impl.h\n  // e.g., in ./offline-tts-pocket-impl.h\n  std::unordered_map<std::string, std::string> extra;\n\n  std::string GetExtraString(const std::string &key,\n                             const std::string &def = \"\") const;\n\n  int32_t GetExtraInt(const std::string &key, int32_t def) const;\n\n  float GetExtraFloat(const std::string &key, float def) const;\n\n  std::string ToString() const;\n};\n\nclass OfflineTtsImpl;\n\n// If the callback returns 0, then it stops generating.\n// If the callback returns 1, then it keeps generating.\nusing GeneratedAudioCallback = std::function<int32_t(\n    const float * /*samples*/, int32_t /*n*/, float /*progress*/)>;\n\nclass OfflineTts {\n public:\n  ~OfflineTts();\n  explicit OfflineTts(const OfflineTtsConfig &config);\n\n  template <typename Manager>\n  OfflineTts(Manager *mgr, const OfflineTtsConfig &config);\n\n  // @param text A string containing words separated by spaces\n  // @param sid Speaker ID. Used only for multi-speaker models, e.g., models\n  //            trained using the VCTK dataset. It is not used for\n  //            single-speaker models, e.g., models trained using the ljspeech\n  //            dataset.\n  // @param speed The speed for the generated speech. 
E.g., 2 means 2x faster.\n  // @param callback If not NULL, it is called whenever config.max_num_sentences\n  //                 sentences have been processed. Note that the passed\n  //                 pointer `samples` for the callback might be invalidated\n  //                 after the callback returns, so the caller should not\n  //                 keep a reference to it. The caller can copy the data if\n  //                 he/she wants to access the samples after the callback\n  //                 returns. The callback is called in the current thread.\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(const std::string &text, int64_t sid = 0,\n                          float speed = 1.0,\n                          GeneratedAudioCallback callback = nullptr) const;\n\n  // @param text The string to be synthesized.\n  // @param prompt_text The transcript of `prompt_samples`.\n  // @param prompt_samples The prompt audio samples (mono PCM floats in [-1,1]).\n  // @param sample_rate The sample rate of `prompt_samples` in Hz.\n  // @param speed The speed for the generated speech. E.g., 2 means 2x faster.\n  // @param num_steps The number of flow steps to generate the audio.\n  // @param callback If not NULL, it is called whenever config.max_num_sentences\n  //                 sentences have been processed. Note that the passed\n  //                 pointer `samples` for the callback might be invalidated\n  //                 after the callback returns, so the caller should not\n  //                 keep a reference to it. The caller can copy the data if\n  //                 he/she wants to access the samples after the callback\n  //                 returns. 
The callback is called in the current thread.\n  [[deprecated(\"Use Generate(text, GenerationConfig, callback) instead\")]]\n  GeneratedAudio Generate(const std::string &text,\n                          const std::string &prompt_text,\n                          const std::vector<float> &prompt_samples,\n                          int32_t sample_rate, float speed = 1.0,\n                          int32_t num_steps = 4,\n                          GeneratedAudioCallback callback = nullptr) const;\n\n  GeneratedAudio Generate(const std::string &text,\n                          const GenerationConfig &config,\n                          GeneratedAudioCallback callback = nullptr) const;\n\n  // Return the sample rate of the generated audio\n  int32_t SampleRate() const;\n\n  // Number of supported speakers.\n  // If it supports only a single speaker, then it returns 0 or 1.\n  int32_t NumSpeakers() const;\n\n private:\n  std::unique_ptr<OfflineTtsImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_TTS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-websocket-server-impl.cc",
    "content": "// sherpa-onnx/csrc/offline-websocket-server-impl.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-websocket-server-impl.h\"\n\n#include <algorithm>\n#include <iostream>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineWebsocketDecoderConfig::Register(ParseOptions *po) {\n  recognizer_config.Register(po);\n\n  po->Register(\"max-batch-size\", &max_batch_size,\n               \"Max batch size for decoding.\");\n\n  po->Register(\n      \"max-utterance-length\", &max_utterance_length,\n      \"Max utterance length in seconds. If we receive an utterance \"\n      \"longer than this value, we will reject the connection. \"\n      \"If you have enough memory, you can select a large value for it.\");\n}\n\nvoid OfflineWebsocketDecoderConfig::Validate() const {\n  if (!recognizer_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Error in recognizer config\");\n    exit(-1);\n  }\n\n  if (max_batch_size <= 0) {\n    SHERPA_ONNX_LOGE(\"Expect --max-batch-size > 0. Given: %d\", max_batch_size);\n    exit(-1);\n  }\n\n  if (max_utterance_length <= 0) {\n    SHERPA_ONNX_LOGE(\"Expect --max-utterance-length > 0. 
Given: %f\",\n                     max_utterance_length);\n    exit(-1);\n  }\n}\n\nOfflineWebsocketDecoder::OfflineWebsocketDecoder(OfflineWebsocketServer *server)\n    : config_(server->GetConfig().decoder_config),\n      server_(server),\n      recognizer_(config_.recognizer_config) {}  // NOLINT\n\nvoid OfflineWebsocketDecoder::Push(connection_hdl hdl, ConnectionDataPtr d) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  streams_.push_back({hdl, d});\n}\n\nvoid OfflineWebsocketDecoder::Decode() {\n  std::unique_lock<std::mutex> lock(mutex_);\n  if (streams_.empty()) {\n    return;\n  }\n\n  int32_t size =\n      std::min(static_cast<int32_t>(streams_.size()), config_.max_batch_size);\n  SHERPA_ONNX_LOGE(\"size: %d\", size);\n\n  // We first lock the mutex for streams_, take items from it, and then\n  // unlock the mutex; in doing so we don't need to lock the mutex to\n  // access hdl and connection_data later.\n  std::vector<connection_hdl> handles(size);\n\n  // Store connection_data here to prevent the data from being freed\n  // while we are still using it.\n  std::vector<ConnectionDataPtr> connection_data(size);\n\n  std::vector<const float *> samples(size);\n  std::vector<int32_t> samples_length(size);\n  std::vector<std::unique_ptr<OfflineStream>> ss(size);\n  std::vector<OfflineStream *> p_ss(size);\n\n  for (int32_t i = 0; i != size; ++i) {\n    auto &p = streams_.front();\n    handles[i] = p.first;\n    connection_data[i] = p.second;\n    streams_.pop_front();\n\n    auto sample_rate = connection_data[i]->sample_rate;\n    auto samples =\n        reinterpret_cast<const float *>(&connection_data[i]->data[0]);\n    auto num_samples = connection_data[i]->expected_byte_size / sizeof(float);\n    auto s = recognizer_.CreateStream();\n    s->AcceptWaveform(sample_rate, samples, num_samples);\n\n    ss[i] = std::move(s);\n    p_ss[i] = ss[i].get();\n  }\n\n  lock.unlock();\n\n  // Note: DecodeStreams is thread-safe\n  recognizer_.DecodeStreams(p_ss.data(), 
size);\n\n  for (int32_t i = 0; i != size; ++i) {\n    connection_hdl hdl = handles[i];\n    asio::post(server_->GetConnectionContext(),\n               [this, hdl, result = ss[i]->GetResult()]() {\n                 websocketpp::lib::error_code ec;\n                 server_->GetServer().send(hdl, result.AsJsonString(),\n                                           websocketpp::frame::opcode::text,\n                                           ec);\n                 if (ec) {\n                   server_->GetServer().get_alog().write(\n                       websocketpp::log::alevel::app, ec.message());\n                 }\n               });\n  }\n}\n\nvoid OfflineWebsocketServerConfig::Register(ParseOptions *po) {\n  decoder_config.Register(po);\n  po->Register(\"log-file\", &log_file,\n               \"Path to the log file. Logs are \"\n               \"appended to this file\");\n}\n\nvoid OfflineWebsocketServerConfig::Validate() const {\n  decoder_config.Validate();\n}\n\nOfflineWebsocketServer::OfflineWebsocketServer(\n    asio::io_context &io_conn,  // NOLINT\n    asio::io_context &io_work,  // NOLINT\n    const OfflineWebsocketServerConfig &config)\n    : io_conn_(io_conn),\n      io_work_(io_work),\n      config_(config),\n      log_(config.log_file, std::ios::app),\n      tee_(std::cout, log_),\n      decoder_(this) {\n  SetupLog();\n\n  server_.init_asio(&io_conn_);\n\n  server_.set_open_handler([this](connection_hdl hdl) { OnOpen(hdl); });\n\n  server_.set_close_handler([this](connection_hdl hdl) { OnClose(hdl); });\n\n  server_.set_message_handler(\n      [this](connection_hdl hdl, server::message_ptr msg) {\n        OnMessage(hdl, msg);\n      });\n}\n\nvoid OfflineWebsocketServer::SetupLog() {\n  server_.clear_access_channels(websocketpp::log::alevel::all);\n  server_.set_access_channels(websocketpp::log::alevel::connect);\n  server_.set_access_channels(websocketpp::log::alevel::disconnect);\n\n  // So that it also prints to std::cout and std::cerr\n  
server_.get_alog().set_ostream(&tee_);\n  server_.get_elog().set_ostream(&tee_);\n}\n\nvoid OfflineWebsocketServer::OnOpen(connection_hdl hdl) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  connections_.emplace(hdl, std::make_shared<ConnectionData>());\n\n  SHERPA_ONNX_LOGE(\"Number of active connections: %d\",\n                   static_cast<int32_t>(connections_.size()));\n}\n\nvoid OfflineWebsocketServer::OnClose(connection_hdl hdl) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  connections_.erase(hdl);\n\n  SHERPA_ONNX_LOGE(\"Number of active connections: %d\",\n                   static_cast<int32_t>(connections_.size()));\n}\n\nvoid OfflineWebsocketServer::OnMessage(connection_hdl hdl,\n                                       server::message_ptr msg) {\n  std::unique_lock<std::mutex> lock(mutex_);\n  auto connection_data = connections_.find(hdl)->second;\n  lock.unlock();\n  const std::string &payload = msg->get_payload();\n\n  switch (msg->get_opcode()) {\n    case websocketpp::frame::opcode::text:\n      if (payload == \"Done\") {\n        // The client will not send any more data. 
We can close the\n        // connection now.\n        Close(hdl, websocketpp::close::status::normal, \"Done\");\n      } else {\n        Close(hdl, websocketpp::close::status::normal,\n              std::string(\"Invalid payload: \") + payload);\n      }\n      break;\n\n    case websocketpp::frame::opcode::binary: {\n      auto p = reinterpret_cast<const int8_t *>(payload.data());\n\n      if (connection_data->expected_byte_size == 0) {\n        if (payload.size() < 8) {\n          Close(hdl, websocketpp::close::status::normal,\n                \"Payload is too short\");\n          break;\n        }\n\n        connection_data->sample_rate = *reinterpret_cast<const int32_t *>(p);\n\n        connection_data->expected_byte_size =\n            *reinterpret_cast<const int32_t *>(p + 4);\n\n        int32_t max_byte_size_ = decoder_.GetConfig().max_utterance_length *\n                                 connection_data->sample_rate * sizeof(float);\n        if (connection_data->expected_byte_size > max_byte_size_) {\n          float num_samples =\n              connection_data->expected_byte_size / sizeof(float);\n\n          float duration = num_samples / connection_data->sample_rate;\n\n          std::ostringstream os;\n          os << \"Max utterance length is configured to \"\n             << decoder_.GetConfig().max_utterance_length\n             << \" seconds, received length is \" << duration << \" seconds. 
\"\n             << \"Payload is too large!\";\n          Close(hdl, websocketpp::close::status::message_too_big, os.str());\n          break;\n        }\n\n        connection_data->data.resize(connection_data->expected_byte_size);\n        std::copy(payload.begin() + 8, payload.end(),\n                  connection_data->data.data());\n        connection_data->cur = payload.size() - 8;\n      } else {\n        std::copy(payload.begin(), payload.end(),\n                  connection_data->data.data() + connection_data->cur);\n        connection_data->cur += payload.size();\n      }\n\n      if (connection_data->expected_byte_size == connection_data->cur) {\n        auto d = std::make_shared<ConnectionData>(std::move(*connection_data));\n        // Clear it so that we can handle the next audio file from the client.\n        // The client can send multiple audio files for recognition without\n        // the need to create another connection.\n        connection_data->sample_rate = 0;\n        connection_data->expected_byte_size = 0;\n        connection_data->cur = 0;\n\n        decoder_.Push(hdl, d);\n\n        connection_data->Clear();\n\n        asio::post(io_work_, [this]() { decoder_.Decode(); });\n      }\n      break;\n    }\n\n    default:\n      // Unexpected message, ignore it\n      break;\n  }\n}\n\nvoid OfflineWebsocketServer::Close(connection_hdl hdl,\n                                   websocketpp::close::status::value code,\n                                   const std::string &reason) {\n  auto con = server_.get_con_from_hdl(hdl);\n\n  std::ostringstream os;\n  os << \"Closing \" << con->get_remote_endpoint() << \" with reason: \" << reason\n     << \"\\n\";\n\n  websocketpp::lib::error_code ec;\n  server_.close(hdl, code, reason, ec);\n  if (ec) {\n    os << \"Failed to close\" << con->get_remote_endpoint() << \". 
\"\n       << ec.message() << \"\\n\";\n  }\n  server_.get_alog().write(websocketpp::log::alevel::app, os.str());\n}\n\nvoid OfflineWebsocketServer::Run(uint16_t port) {\n  server_.set_reuse_addr(true);\n  server_.listen(asio::ip::tcp::v4(), port);\n  server_.start_accept();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-websocket-server-impl.h",
    "content": "// sherpa-onnx/csrc/offline-websocket-server-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WEBSOCKET_SERVER_IMPL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WEBSOCKET_SERVER_IMPL_H_\n\n#include <deque>\n#include <fstream>\n#include <map>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/tee-stream.h\"\n#include \"websocketpp/config/asio_no_tls.hpp\"  // TODO(fangjun): support TLS\n#include \"websocketpp/server.hpp\"\n\nusing server = websocketpp::server<websocketpp::config::asio>;\nusing connection_hdl = websocketpp::connection_hdl;\n\nnamespace sherpa_onnx {\n\n/** Communication protocol\n *\n * The client sends a byte stream to the server. The first 4 bytes in little\n * endian indicates the sample rate of the audio data that the client will send.\n * The next 4 bytes in little endian indicates the total samples in bytes the\n * client will send. The remaining bytes represent audio samples. 
Each audio\n * sample is a float occupying 4 bytes and is normalized into the range\n * [-1, 1].\n *\n * The byte stream can be broken into an arbitrary number of messages.\n * We require the first message to be at least 8 bytes long so that\n * we can get `sample_rate` and `expected_byte_size` from the first message.\n */\nstruct ConnectionData {\n  // Sample rate of the audio samples sent by the client\n  int32_t sample_rate = 0;\n\n  // Number of expected bytes sent from the client\n  int32_t expected_byte_size = 0;\n\n  // Number of bytes received so far\n  int32_t cur = 0;\n\n  // It saves the received samples from the client.\n  // We will **reinterpret_cast** it to float.\n  // We expect that data.size() == expected_byte_size\n  std::vector<int8_t> data;\n\n  void Clear() {\n    sample_rate = 0;\n    expected_byte_size = 0;\n    cur = 0;\n    data.clear();\n  }\n};\n\nusing ConnectionDataPtr = std::shared_ptr<ConnectionData>;\n\nstruct OfflineWebsocketDecoderConfig {\n  OfflineRecognizerConfig recognizer_config;\n\n  int32_t max_batch_size = 5;\n\n  float max_utterance_length = 300;  // seconds\n\n  void Register(ParseOptions *po);\n  void Validate() const;\n};\n\nclass OfflineWebsocketServer;\n\nclass OfflineWebsocketDecoder {\n public:\n  /**\n   * @param server **Borrowed** from outside.\n   */\n  explicit OfflineWebsocketDecoder(OfflineWebsocketServer *server);\n\n  /** Insert received data into the queue for decoding.\n   *\n   * @param hdl A handle to the connection. 
We can use it to send the result\n   *            back to the client once it finishes decoding.\n   * @param d  The received data\n   */\n  void Push(connection_hdl hdl, ConnectionDataPtr d);\n\n  /** It is called by one of the worker threads.\n   */\n  void Decode();\n\n  const OfflineWebsocketDecoderConfig &GetConfig() const { return config_; }\n\n private:\n  OfflineWebsocketDecoderConfig config_;\n\n  /** When we have received all the data from the client, we put it into\n   * this queue; the worker threads will get items from this queue for\n   * decoding.\n   *\n   * The number of items to take from this queue is determined by\n   * `--max-batch-size`. If there are not enough items in the queue, we don't\n   * wait; we take whatever we have for decoding.\n   */\n  std::mutex mutex_;\n  std::deque<std::pair<connection_hdl, ConnectionDataPtr>> streams_;\n\n  OfflineWebsocketServer *server_;  // Not owned\n  OfflineRecognizer recognizer_;\n};\n\nstruct OfflineWebsocketServerConfig {\n  OfflineWebsocketDecoderConfig decoder_config;\n  std::string log_file = \"./log.txt\";\n\n  void Register(ParseOptions *po);\n  void Validate() const;\n};\n\nclass OfflineWebsocketServer {\n public:\n  OfflineWebsocketServer(asio::io_context &io_conn,  // NOLINT\n                         asio::io_context &io_work,  // NOLINT\n                         const OfflineWebsocketServerConfig &config);\n\n  asio::io_context &GetConnectionContext() { return io_conn_; }\n  server &GetServer() { return server_; }\n\n  void Run(uint16_t port);\n\n  const OfflineWebsocketServerConfig &GetConfig() const { return config_; }\n\n private:\n  void SetupLog();\n\n  // When a websocket client is connected, it will invoke this method\n  // (Not for HTTP)\n  void OnOpen(connection_hdl hdl);\n\n  // When a websocket client is disconnected, it will invoke this method\n  void OnClose(connection_hdl hdl);\n\n  // When a message is received from a websocket client, this method will\n  // be invoked.\n  //\n  // 
The protocol between the client and the server is as follows:\n  //\n  // (1) The client connects to the server\n  // (2) The client starts to send a binary byte stream to the server.\n  //     The byte stream can be broken into multiple messages or it can\n  //     be put into a single message.\n  //     The first message has to contain at least 8 bytes. The first\n  //     4 bytes in little endian contain an int32_t indicating the\n  //     sampling rate. The next 4 bytes in little endian contain an int32_t\n  //     indicating total number of bytes of samples the client will send.\n  //     We assume each sample is a float containing 4 bytes and has been\n  //     normalized to the range [-1, 1].\n  // (3) When the server receives all the samples from the client, it will\n  //     start to decode them. Once decoded, the server sends a text message\n  //     to the client containing the decoded results\n  // (4) After receiving the decoded results from the server, if the client has\n  //     another audio file to send, it repeats (2) and (3)\n  // (5) If the client has no more audio files to decode, the client sends a\n  //     text message containing \"Done\" to the server and closes the connection\n  // (6) The server receives a text message \"Done\" and closes the connection\n  //\n  // Note:\n  //  (a) All models in icefall use features extracted from audio samples\n  //      normalized to the range [-1, 1]. Please send normalized audio samples\n  //      if you use models from icefall.\n  //  (b) Only sound files with a single channel are supported\n  //  (c) Only audio samples are sent. 
For instance, if we want to decode\n  //      a WAVE file, the RIFF header of the WAVE is not sent.\n  void OnMessage(connection_hdl hdl, server::message_ptr msg);\n\n  // Close a websocket connection with given code and reason\n  void Close(connection_hdl hdl, websocketpp::close::status::value code,\n             const std::string &reason);\n\n private:\n  asio::io_context &io_conn_;\n  asio::io_context &io_work_;\n  server server_;\n\n  std::map<connection_hdl, ConnectionDataPtr, std::owner_less<connection_hdl>>\n      connections_;\n  std::mutex mutex_;\n\n  OfflineWebsocketServerConfig config_;\n\n  std::ofstream log_;\n  TeeStream tee_;\n\n  OfflineWebsocketDecoder decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WEBSOCKET_SERVER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-websocket-server.cc",
    "content": "// sherpa-onnx/csrc/offline-websocket-server.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <vector>\n\n#include \"asio.hpp\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-websocket-server-impl.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nstatic constexpr const char *kUsageMessage = R\"(\nAutomatic speech recognition with sherpa-onnx using websocket.\n\nUsage:\n\n./bin/sherpa-onnx-offline-websocket-server --help\n\n(1) For transducer models\n\n./bin/sherpa-onnx-offline-websocket-server \\\n  --port=6006 \\\n  --num-work-threads=5 \\\n  --tokens=/path/to/tokens.txt \\\n  --encoder=/path/to/encoder.onnx \\\n  --decoder=/path/to/decoder.onnx \\\n  --joiner=/path/to/joiner.onnx \\\n  --log-file=./log.txt \\\n  --max-batch-size=5\n\n(2) For Paraformer\n\n./bin/sherpa-onnx-offline-websocket-server \\\n  --port=6006 \\\n  --num-work-threads=5 \\\n  --tokens=/path/to/tokens.txt \\\n  --paraformer=/path/to/model.onnx \\\n  --log-file=./log.txt \\\n  --max-batch-size=5\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)\";\n\nint32_t main(int32_t argc, char *argv[]) {\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n\n  sherpa_onnx::OfflineWebsocketServerConfig config;\n\n  // the server will listen on this port\n  int32_t port = 6006;\n\n  // size of the thread pool for handling network connections\n  int32_t num_io_threads = 1;\n\n  // size of the thread pool for neural network computation and decoding\n  int32_t num_work_threads = 3;\n\n  po.Register(\"num-io-threads\", &num_io_threads,\n              \"Thread pool size for network connections.\");\n\n  po.Register(\"num-work-threads\", &num_work_threads,\n              \"Thread pool size for for neural network \"\n              \"computation and decoding.\");\n\n  po.Register(\"port\", &port, \"The port on which the server will listen.\");\n\n  
config.Register(&po);\n  po.DisableOption(\"sample-rate\");\n\n  if (argc == 1) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  po.Read(argc, argv);\n\n  if (po.NumArgs() != 0) {\n    SHERPA_ONNX_LOGE(\"Unrecognized positional arguments!\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  config.Validate();\n\n  asio::io_context io_conn;  // for network connections\n  asio::io_context io_work;  // for neural network and decoding\n\n  sherpa_onnx::OfflineWebsocketServer server(io_conn, io_work, config);\n  server.Run(port);\n\n  SHERPA_ONNX_LOGE(\"Started!\");\n  SHERPA_ONNX_LOGE(\"Listening on: %d\", port);\n  SHERPA_ONNX_LOGE(\"Number of work threads: %d\", num_work_threads);\n\n  // give some work to do for the io_work pool\n  auto work_guard = asio::make_work_guard(io_work);\n\n  std::vector<std::thread> io_threads;\n\n  // decrement since the main thread is also used for network communications\n  for (int32_t i = 0; i < num_io_threads - 1; ++i) {\n    io_threads.emplace_back([&io_conn]() { io_conn.run(); });\n  }\n\n  std::vector<std::thread> work_threads;\n  for (int32_t i = 0; i < num_work_threads; ++i) {\n    work_threads.emplace_back([&io_work]() { io_work.run(); });\n  }\n\n  io_conn.run();\n\n  for (auto &t : io_threads) {\n    t.join();\n  }\n\n  for (auto &t : work_threads) {\n    t.join();\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-wenet-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-wenet-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-wenet-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineWenetCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"wenet-ctc-model\", &model,\n      \"Path to model.onnx from WeNet. Please see \"\n      \"https://github.com/k2-fsa/sherpa-onnx/pull/425 for available models\");\n}\n\nbool OfflineWenetCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"WeNet model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineWenetCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineWenetCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-wenet-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-wenet-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineWenetCtcModelConfig {\n  std::string model;\n\n  OfflineWenetCtcModelConfig() = default;\n  explicit OfflineWenetCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-wenet-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-wenet-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-wenet-ctc-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineWenetCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.wenet_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.wenet_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void 
*model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 0;\n};\n\nOfflineWenetCtcModel::OfflineWenetCtcModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineWenetCtcModel::OfflineWenetCtcModel(Manager *mgr,\n                                           const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineWenetCtcModel::~OfflineWenetCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineWenetCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineWenetCtcModel::VocabSize() const { return 
impl_->VocabSize(); }\n\nint32_t OfflineWenetCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nOrtAllocator *OfflineWenetCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineWenetCtcModel::OfflineWenetCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineWenetCtcModel::OfflineWenetCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-wenet-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-wenet-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_H_\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the CTC model from WeNet.\n *\n * See\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/export-onnx.py\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/test-onnx.py\n * https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wenet/run.sh\n *\n */\nclass OfflineWenetCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineWenetCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineWenetCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineWenetCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** SubsamplingFactor of the model\n   *\n   * For Citrinet, the subsampling factor is usually 4.\n   * For Conformer CTC, the subsampling factor is usually 8.\n   */\n  int32_t SubsamplingFactor() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // WeNet CTC models do not support batch size > 1\n  bool SupportBatchProcessing() const override { return false; }\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WENET_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DECODER_H_\n\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-whisper-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineWhisperDecoder {\n public:\n  virtual ~OfflineWhisperDecoder() = default;\n\n  /** Run beam search given the output from the whisper encoder model.\n   *\n   * @param n_layer_cross_k       A 4-D tensor of shape\n   *                              (n_text_layer, N, n_audio_ctx, n_text_state).\n   * @param n_layer_cross_v       A 4-D tensor of shape\n   *                              (n_text_layer, N, n_audio_ctx, n_text_state).\n   *\n   * @return Return a vector of size `N` containing the decoded results.\n   */\n  virtual std::vector<OfflineWhisperDecoderResult> Decode(\n      Ort::Value n_layer_cross_k, Ort::Value n_layer_cross_v,\n      int32_t num_feature_frames) = 0;\n\n  virtual void SetConfig(const OfflineWhisperModelConfig &config) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-dtw.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-dtw.cc\n//\n// Copyright (c)  2026  Posit Software, PBC\n\n#include \"sherpa-onnx/csrc/offline-whisper-dtw.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <cstdio>  // For debug output\n#include <limits>\n#include <numeric>\n#include <vector>\n\n// Set to 1 to enable debug output\n#define DTW_DEBUG 0\n\nnamespace sherpa_onnx {\n\nTokenTimingResult WhisperDTW::ComputeTokenTimings(\n    const float *attention, int32_t n_heads, int32_t n_tokens, int32_t n_frames,\n    int32_t num_audio_frames, int32_t sot_sequence_length,\n    int32_t num_text_tokens,\n    const std::vector<int32_t> &timestamp_token_indices) {\n  TokenTimingResult result;\n\n  if (n_heads <= 0 || n_tokens <= 0 || n_frames <= 0 || num_text_tokens <= 0) {\n    return result;\n  }\n\n#if DTW_DEBUG\n  fprintf(stderr, \"\\n========== DTW TIMING DEBUG ==========\\n\");\n  fprintf(stderr, \"Input: n_heads=%d, n_tokens=%d, n_frames=%d\\n\", n_heads,\n          n_tokens, n_frames);\n  fprintf(stderr,\n          \"num_audio_frames=%d, sot_sequence_length=%d, num_text_tokens=%d\\n\",\n          num_audio_frames, sot_sequence_length, num_text_tokens);\n  fprintf(stderr, \"timestamp_token_indices count: %zu\\n\",\n          timestamp_token_indices.size());\n#endif\n\n  // Clip to actual audio frames (like OpenAI: weights[:, :, :num_frames//2])\n  int32_t clipped_frames = std::min(n_frames, num_audio_frames);\n  if (clipped_frames <= 0) {\n    clipped_frames = n_frames;\n  }\n\n  // Process attention weights per-head, then average (like OpenAI)\n  std::vector<float> processed(n_tokens * clipped_frames, 0.0f);\n  std::vector<float> head_data(n_tokens * clipped_frames);\n\n  for (int32_t h = 0; h < n_heads; ++h) {\n    const float *src = attention + h * n_tokens * n_frames;\n    for (int32_t t = 0; t < n_tokens; ++t) {\n      for (int32_t f = 0; f < clipped_frames; ++f) {\n        head_data[t * clipped_frames + f] = src[t * n_frames + f];\n      }\n    }\n\n 
   ApplySoftmax(head_data.data(), n_tokens, clipped_frames);\n    ApplyZScoreNormalization(head_data.data(), n_tokens, clipped_frames);\n    ApplyMedianFilter(head_data.data(), n_tokens, clipped_frames, 7);\n\n    for (int32_t i = 0; i < n_tokens * clipped_frames; ++i) {\n      processed[i] += head_data[i];\n    }\n  }\n\n  float inv_n_heads = 1.0f / static_cast<float>(n_heads);\n  for (int32_t i = 0; i < n_tokens * clipped_frames; ++i) {\n    processed[i] *= inv_n_heads;\n  }\n\n  // Build a set of timestamp token indices for quick lookup.\n  // The DTW algorithm needs an \"anchor\" token at position sot_sequence_length\n  // to establish the time=0 reference point (like OpenAI's timing.py).\n  //\n  // Two modes, same anchor position:\n  // - enable_segment_timestamps=true: the first timestamp token (e.g. <|0.00|>)\n  //   is at index sot_sequence_length. We keep it as the anchor and filter\n  //   out subsequent timestamp tokens to avoid alignment drift.\n  // - enable_segment_timestamps=false: timestamp_token_indices is empty,\n  //   no filtering occurs. 
But the implementation of enable_segment_timestamps\n  //   being false happens to insert a <no_timestamps> token at index\n  //   sot_sequence_length, so that will serve as the anchor in this case.\n  std::vector<bool> is_timestamp_token(n_tokens, false);\n  bool found_first_timestamp = false;\n  for (int32_t idx : timestamp_token_indices) {\n    if (idx >= 0 && idx < n_tokens) {\n      // Keep the first timestamp token (it's the anchor), filter the rest\n      if (!found_first_timestamp && idx >= sot_sequence_length) {\n        found_first_timestamp = true;\n        // Don't mark as timestamp - keep it in the DTW matrix\n      } else {\n        is_timestamp_token[idx] = true;\n      }\n    }\n  }\n\n  // Skip SOT sequence and filter out timestamp tokens (except first one)\n  // Like OpenAI: we skip sot_sequence_length tokens at the start\n  // Additionally, we now filter out timestamp tokens from the middle\n  int32_t start_token = sot_sequence_length;\n\n  // Build filtered token list (indices into original processed array)\n  // and mapping from filtered index back to original index\n  std::vector<int32_t> filtered_to_original;\n  for (int32_t i = start_token; i < n_tokens; ++i) {\n    if (!is_timestamp_token[i]) {\n      filtered_to_original.push_back(i);\n    }\n  }\n\n  int32_t dtw_tokens = static_cast<int32_t>(filtered_to_original.size());\n\n#if DTW_DEBUG\n  fprintf(\n      stderr,\n      \"DTW tokens after filtering: %d (filtered out %zu timestamp tokens)\\n\",\n      dtw_tokens, timestamp_token_indices.size());\n#endif\n\n  if (dtw_tokens <= 1) {\n    return result;\n  }\n\n  // Extract the filtered portion for DTW and negate\n  std::vector<float> cost_matrix(dtw_tokens * clipped_frames);\n  for (int32_t i = 0; i < dtw_tokens; ++i) {\n    int32_t orig_idx = filtered_to_original[i];\n    for (int32_t j = 0; j < clipped_frames; ++j) {\n      cost_matrix[i * clipped_frames + j] =\n          -processed[orig_idx * clipped_frames + j];\n    }\n  }\n\n  // Run 
DTW\n  DTWResult dtw_result = RunDTW(cost_matrix.data(), dtw_tokens, clipped_frames);\n\n  if (dtw_result.text_indices.empty()) {\n    return result;\n  }\n\n  // Extract jump times (where text_idx changes)\n  // Like OpenAI: jumps = np.pad(np.diff(text_indices), (1, 0),\n  // constant_values=1)\n  //              jump_times = time_indices[jumps] / TOKENS_PER_SECOND\n  std::vector<int32_t> jump_frame_indices;\n  jump_frame_indices.push_back(\n      dtw_result.time_indices[0]);  // First is always a jump\n\n  for (size_t i = 1; i < dtw_result.text_indices.size(); ++i) {\n    if (dtw_result.text_indices[i] != dtw_result.text_indices[i - 1]) {\n      jump_frame_indices.push_back(dtw_result.time_indices[i]);\n    }\n  }\n\n#if DTW_DEBUG\n  fprintf(stderr, \"jump_frame_indices count: %zu\\n\", jump_frame_indices.size());\n  fprintf(stderr, \"jump_times (first 10): \");\n  for (size_t i = 0; i < std::min(size_t(10), jump_frame_indices.size()); ++i) {\n    fprintf(stderr, \"%.2f \", jump_frame_indices[i] * kWhisperSecondsPerToken);\n  }\n  fprintf(stderr, \"\\n\");\n#endif\n\n  // Now extract start_times and durations for text tokens only (not EOT)\n  // Like OpenAI: start_times = jump_times[word_boundaries[:-1]]\n  //              end_times = jump_times[word_boundaries[1:]]\n  // For tokens (each token is one \"word\"): boundaries = [0, 1, 2, ..., N]\n  // So: start_times[i] = jump_times[i], end_times[i] = jump_times[i+1]\n  result.start_times.reserve(num_text_tokens);\n  result.durations.reserve(num_text_tokens);\n\n  for (int32_t i = 0; i < num_text_tokens; ++i) {\n    if (i < static_cast<int32_t>(jump_frame_indices.size())) {\n      float start =\n          static_cast<float>(jump_frame_indices[i]) * kWhisperSecondsPerToken;\n      result.start_times.push_back(start);\n\n      // Duration = end_time - start_time = jump_times[i+1] - jump_times[i]\n      if (i + 1 < static_cast<int32_t>(jump_frame_indices.size())) {\n        float end = 
static_cast<float>(jump_frame_indices[i + 1]) *\n                    kWhisperSecondsPerToken;\n        result.durations.push_back(end - start);\n      } else {\n        // Last token: duration to end of audio\n        float audio_end =\n            static_cast<float>(clipped_frames) * kWhisperSecondsPerToken;\n        result.durations.push_back(std::max(0.0f, audio_end - start));\n      }\n    } else {\n      // Fallback: use last known time\n      float last_time =\n          result.start_times.empty() ? 0.0f : result.start_times.back();\n      result.start_times.push_back(last_time);\n      result.durations.push_back(0.0f);\n    }\n  }\n\n#if DTW_DEBUG\n  fprintf(stderr, \"Result: %zu start_times, %zu durations\\n\",\n          result.start_times.size(), result.durations.size());\n  fprintf(stderr, \"========== END DTW TIMING DEBUG ==========\\n\\n\");\n#endif\n\n  return result;\n}\n\nvoid WhisperDTW::ApplySoftmax(float *data, int32_t n_tokens, int32_t n_frames) {\n  for (int32_t t = 0; t < n_tokens; ++t) {\n    float *row = data + t * n_frames;\n\n    // Find max for numerical stability\n    float max_val = *std::max_element(row, row + n_frames);\n\n    // Compute exp and sum\n    float sum = 0.0f;\n    for (int32_t f = 0; f < n_frames; ++f) {\n      row[f] = std::exp(row[f] - max_val);\n      sum += row[f];\n    }\n\n    // Normalize\n    if (sum > 0.0f) {\n      float inv_sum = 1.0f / sum;\n      for (int32_t f = 0; f < n_frames; ++f) {\n        row[f] *= inv_sum;\n      }\n    }\n  }\n}\n\nvoid WhisperDTW::ApplyZScoreNormalization(float *data, int32_t n_tokens,\n                                          int32_t n_frames) {\n  // Normalize across tokens (dim=-2) for each frame\n  for (int32_t f = 0; f < n_frames; ++f) {\n    // Compute mean\n    float sum = 0.0f;\n    for (int32_t t = 0; t < n_tokens; ++t) {\n      sum += data[t * n_frames + f];\n    }\n    float mean = sum / static_cast<float>(n_tokens);\n\n    // Compute std\n    float sq_sum = 0.0f;\n    
for (int32_t t = 0; t < n_tokens; ++t) {\n      float diff = data[t * n_frames + f] - mean;\n      sq_sum += diff * diff;\n    }\n    float std_dev = std::sqrt(sq_sum / static_cast<float>(n_tokens) + 1e-9f);\n\n    // Normalize\n    float inv_std = 1.0f / std_dev;\n    for (int32_t t = 0; t < n_tokens; ++t) {\n      data[t * n_frames + f] = (data[t * n_frames + f] - mean) * inv_std;\n    }\n  }\n}\n\nvoid WhisperDTW::ApplyMedianFilter(float *data, int32_t n_tokens,\n                                   int32_t n_frames, int32_t width) {\n  if (width <= 1 || n_frames <= 1) {\n    return;\n  }\n\n  int32_t half_width = width / 2;\n  std::vector<float> temp(n_frames);\n  std::vector<float> window(width);\n\n  for (int32_t t = 0; t < n_tokens; ++t) {\n    float *row = data + t * n_frames;\n\n    // Copy original row\n    std::copy(row, row + n_frames, temp.begin());\n\n    for (int32_t f = 0; f < n_frames; ++f) {\n      // Gather window values with reflection padding\n      int32_t w_idx = 0;\n      for (int32_t k = -half_width; k <= half_width && w_idx < width; ++k) {\n        int32_t src_idx = f + k;\n        // Reflect at boundaries\n        if (src_idx < 0) {\n          src_idx = -src_idx;\n        } else if (src_idx >= n_frames) {\n          src_idx = 2 * n_frames - 2 - src_idx;\n        }\n        src_idx = std::max(0, std::min(src_idx, n_frames - 1));\n        window[w_idx++] = temp[src_idx];\n      }\n\n      // Sort and take median\n      std::sort(window.begin(), window.begin() + w_idx);\n      row[f] = window[w_idx / 2];\n    }\n  }\n}\n\nDTWResult WhisperDTW::RunDTW(const float *cost_matrix, int32_t n_tokens,\n                             int32_t n_frames) {\n  // DTW algorithm based on whisper.cpp and OpenAI Whisper\n  // O(N*M) time and space complexity\n\n  DTWResult result;\n\n  if (n_tokens <= 0 || n_frames <= 0) {\n    return result;\n  }\n\n  constexpr float kInf = std::numeric_limits<float>::infinity();\n\n  int32_t N = n_tokens;\n  int32_t M = 
n_frames;\n\n  // Cost and trace matrices (N+1 x M+1)\n  std::vector<float> cost((N + 1) * (M + 1), kInf);\n  std::vector<int32_t> trace((N + 1) * (M + 1), -1);\n\n  auto cost_at = [&](int32_t i, int32_t j) -> float & {\n    return cost[i * (M + 1) + j];\n  };\n  auto trace_at = [&](int32_t i, int32_t j) -> int32_t & {\n    return trace[i * (M + 1) + j];\n  };\n\n  // Initialize\n  cost_at(0, 0) = 0.0f;\n\n  // Fill cost matrix\n  for (int32_t j = 1; j <= M; ++j) {\n    for (int32_t i = 1; i <= N; ++i) {\n      float c0 = cost_at(i - 1, j - 1);  // diagonal\n      float c1 = cost_at(i - 1, j);      // up\n      float c2 = cost_at(i, j - 1);      // left\n\n      float min_cost;\n      int32_t trace_dir;\n\n      if (c0 <= c1 && c0 <= c2) {\n        min_cost = c0;\n        trace_dir = 0;  // diagonal\n      } else if (c1 <= c0 && c1 <= c2) {\n        min_cost = c1;\n        trace_dir = 1;  // up\n      } else {\n        min_cost = c2;\n        trace_dir = 2;  // left\n      }\n\n      // Add current cost\n      cost_at(i, j) = cost_matrix[(i - 1) * M + (j - 1)] + min_cost;\n      trace_at(i, j) = trace_dir;\n    }\n  }\n\n  // Backtrace\n  int32_t i = N;\n  int32_t j = M;\n\n  // Force horizontal movement at row 0 and vertical at column 0\n  for (int32_t jj = 0; jj <= M; ++jj) {\n    trace_at(0, jj) = 2;  // left\n  }\n  for (int32_t ii = 0; ii <= N; ++ii) {\n    trace_at(ii, 0) = 1;  // up\n  }\n\n  std::vector<std::pair<int32_t, int32_t>> path;\n  path.reserve(N + M);\n\n  while (i > 0 || j > 0) {\n    path.push_back({i - 1, j - 1});\n\n    int32_t dir = trace_at(i, j);\n    if (dir == 0) {  // diagonal\n      --i;\n      --j;\n    } else if (dir == 1) {  // up\n      --i;\n    } else {  // left\n      --j;\n    }\n  }\n\n  // Reverse path (we built it backwards)\n  std::reverse(path.begin(), path.end());\n\n  // Extract result\n  result.text_indices.reserve(path.size());\n  result.time_indices.reserve(path.size());\n\n  for (const auto &p : path) {\n    if 
(p.first >= 0 && p.second >= 0) {\n      result.text_indices.push_back(p.first);\n      result.time_indices.push_back(p.second);\n    }\n  }\n\n  return result;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-dtw.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-dtw.h\n//\n// Copyright (c)  2026  Posit Software, PBC\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DTW_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DTW_H_\n\n#include <cstdint>\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// Result of DTW alignment\nstruct DTWResult {\n  std::vector<int32_t> text_indices;  // Token index at each alignment point\n  std::vector<int32_t> time_indices;  // Frame index at each alignment point\n};\n\n// Token timing result from DTW\nstruct TokenTimingResult {\n  std::vector<float> start_times;  // Start time in seconds for each token\n  std::vector<float> durations;    // Duration in seconds for each token\n};\n\n// Class for processing cross-attention weights and computing DTW alignment\n// for token-level timestamps in Whisper.\n//\n// Based on OpenAI Whisper (whisper/timing.py) and whisper.cpp implementations.\nclass WhisperDTW {\n public:\n  // Compute token timings (start times and durations) from raw cross-attention.\n  // This follows OpenAI's approach of extracting both start and end times\n  // directly from DTW jump_times, where:\n  //   start_times[i] = jump_times[i]\n  //   end_times[i] = jump_times[i+1]\n  //   durations[i] = end_times[i] - start_times[i]\n  //\n  // @param attention Raw attention weights from decoder.\n  //                  Shape: (n_heads, n_tokens, n_audio_frames)\n  // @param n_heads Number of alignment heads\n  // @param n_tokens Number of text tokens (including SOT sequence and EOT)\n  // @param n_frames Number of audio frames (full context, e.g., 1500)\n  // @param num_audio_frames Actual audio frames to use (for clipping)\n  // @param sot_sequence_length Number of special tokens at start (to skip)\n  // @param num_text_tokens Number of actual text tokens to return timings for\n  //                        (excluding SOT sequence and EOT)\n  // @param timestamp_token_indices Indices of timestamp tokens to filter out\n  //       
                         (0-based, relative to attention sequence)\n  //\n  // @return TokenTimingResult with start_times and durations for each token\n  TokenTimingResult ComputeTokenTimings(\n      const float *attention, int32_t n_heads, int32_t n_tokens,\n      int32_t n_frames, int32_t num_audio_frames, int32_t sot_sequence_length,\n      int32_t num_text_tokens,\n      const std::vector<int32_t> &timestamp_token_indices = {});\n\n private:\n  // Apply softmax normalization across the last dimension (frames)\n  void ApplySoftmax(float *data, int32_t n_tokens, int32_t n_frames);\n\n  // Apply z-score normalization across tokens (dim=-2)\n  void ApplyZScoreNormalization(float *data, int32_t n_tokens,\n                                int32_t n_frames);\n\n  // Apply median filter across frames with given width\n  void ApplyMedianFilter(float *data, int32_t n_tokens, int32_t n_frames,\n                         int32_t width = 7);\n\n  // Run DTW algorithm on cost matrix\n  //\n  // @param cost_matrix Negated alignment matrix (n_tokens, n_frames)\n  //                    Lower values = better alignment\n  // @param n_tokens Number of rows (text tokens)\n  // @param n_frames Number of columns (audio frames)\n  //\n  // @return DTW alignment path\n  DTWResult RunDTW(const float *cost_matrix, int32_t n_tokens,\n                   int32_t n_frames);\n};\n\n// Time conversion constant: 50 tokens per second (20ms per token/frame)\nconstexpr float kWhisperTokensPerSecond = 50.0f;\nconstexpr float kWhisperSecondsPerToken = 0.02f;\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_DTW_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-timestamp-rules.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineWhisperGreedySearchDecoder::SetConfig(\n    const OfflineWhisperModelConfig &config) {\n  config_ = config;\n}\n\nstd::vector<OfflineWhisperDecoderResult>\nOfflineWhisperGreedySearchDecoder::Decode(Ort::Value cross_k,\n                                          Ort::Value cross_v,\n                                          int32_t num_feature_frames) {\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  // Check if we should collect attention weights for DTW timestamp computation\n  bool collect_attention =\n      config_.enable_token_timestamps && model_->HasAttentionOutput();\n\n  // Warn once if timestamps requested but model doesn't support it\n  static bool warned_no_attention = false;\n  if (config_.enable_token_timestamps && !model_->HasAttentionOutput() &&\n      !warned_no_attention) {\n    warned_no_attention = true;\n    SHERPA_ONNX_LOGE(\n        \"Warning: enable_token_timestamps=true but the decoder model does not \"\n        \"have cross-attention outputs. Timestamps will not be available. 
\"\n        \"To enable timestamps, export the model with attention outputs using: \"\n        \"python scripts/whisper/export-onnx-with-attention.py\");\n  }\n\n  // For multilingual models, initial_tokens contains [sot, language, task]\n  //   - language is English by default\n  //   - task is transcribe by default\n  //\n  // For non-multilingual models, initial_tokens contains [sot]\n  std::vector<int64_t> initial_tokens = model_->GetInitialTokens();\n\n  if (model_->IsMultiLingual()) {\n    if (!config_.language.empty()) {\n      const auto &lang2id = model_->GetLang2ID();\n\n      if (!lang2id.count(config_.language)) {\n        SHERPA_ONNX_LOGE(\"Invalid language: %s\", config_.language.c_str());\n        exit(-1);\n      }\n\n      int32_t lang_id = lang2id.at(config_.language);\n\n      // 0: sot, 1: lang_id, 2: task, 3: no_timestamps\n      initial_tokens[1] = lang_id;\n    } else {\n      int32_t lang_id = model_->DetectLanguage(cross_k, cross_v);\n\n      // 0: sot, 1: lang_id, 2: task, 3: no_timestamps\n      initial_tokens[1] = lang_id;\n    }\n\n    if (config_.task == \"translate\") {\n      initial_tokens[2] = model_->Translate();\n    } else if (config_.task != \"transcribe\") {\n      // initial_tokens[2] is transcribe by default\n      SHERPA_ONNX_LOGE(\n          \"Unsupported task: %s. Valid values are: transcribe, translate.\",\n          config_.task.c_str());\n    }\n  }\n\n  // Add no_timestamps token when NOT using segment timestamp mode.\n  // When enable_segment_timestamps=true, we let the decoder output timestamp\n  // tokens (like <|0.00|>) which serve as alignment anchors.\n  // When enable_token_timestamps=true (DTW mode), we MUST include no_timestamps\n  // because OpenAI's alignment (timing.py) uses it as an anchor token at the\n  // start of the DTW matrix. 
Without it, the first text token is misaligned.\n  if (!config_.enable_segment_timestamps) {\n    initial_tokens.push_back(model_->NoTimeStampsToken());\n  }\n\n  // Track if we're using segment timestamp mode\n  bool enable_segment_timestamps = config_.enable_segment_timestamps;\n\n  // Get token IDs for timestamp rules\n  int32_t timestamp_begin = model_->TimestampBegin();\n  int32_t no_timestamps = model_->NoTimeStampsToken();\n  int32_t eot = model_->EOT();\n\n  // Max initial timestamp: 50 = 1.0 second (each timestamp is 0.02s)\n  constexpr int32_t kMaxInitialTimestampIndex = 50;\n\n  // Maintain running list of all tokens for timestamp rules\n  std::vector<int64_t> all_tokens = initial_tokens;\n  int32_t sample_begin = static_cast<int32_t>(initial_tokens.size());\n\n  int32_t batch_size = 1;\n  std::array<int64_t, 2> token_shape{\n      batch_size, static_cast<int64_t>(initial_tokens.size())};\n\n  Ort::Value tokens = Ort::Value::CreateTensor(\n      memory_info, initial_tokens.data(), initial_tokens.size(),\n      token_shape.data(), token_shape.size());\n\n  std::array<int64_t, 1> offset_shape{1};\n  Ort::Value offset = Ort::Value::CreateTensor<int64_t>(\n      model_->Allocator(), offset_shape.data(), offset_shape.size());\n  *(offset.GetTensorMutableData<int64_t>()) = 0;\n\n  auto self_kv_cache = model_->GetInitialSelfKVCache();\n\n  auto decoder_out = model_->ForwardDecoder(\n      std::move(tokens), std::move(self_kv_cache.first),\n      std::move(self_kv_cache.second), std::move(cross_k), std::move(cross_v),\n      std::move(offset));\n\n  // Note: decoder_out is now a 7-tuple with attention weights as 7th element\n  // Indices: 0=logits, 1=self_k, 2=self_v, 3=cross_k, 4=cross_v, 5=offset,\n  // 6=attention\n  *(std::get<5>(decoder_out).GetTensorMutableData<int64_t>()) =\n      initial_tokens.size();\n\n  auto logits_shape =\n      std::get<0>(decoder_out).GetTensorTypeAndShapeInfo().GetShape();\n  int32_t vocab_size = logits_shape[2];\n\n  int32_t 
n_text_ctx = model_->TextCtx();\n  int32_t max_token_id = 0;\n\n  // Get initial logits\n  {\n    const float *p_logits = std::get<0>(decoder_out).GetTensorData<float>();\n    const float *p_start = p_logits + (logits_shape[1] - 1) * vocab_size;\n\n    if (enable_segment_timestamps) {\n      // Make a copy of logits for applying timestamp rules\n      std::vector<float> logits_copy(p_start, p_start + vocab_size);\n      ApplyTimestampRules(logits_copy.data(), vocab_size, all_tokens,\n                          sample_begin, timestamp_begin, no_timestamps, eot,\n                          kMaxInitialTimestampIndex);\n      max_token_id = MaxElementIndex(logits_copy.data(), vocab_size);\n    } else {\n      max_token_id = MaxElementIndex(p_start, vocab_size);\n    }\n  }\n\n  std::vector<int32_t> predicted_tokens;\n\n  // Storage for accumulated attention weights\n  std::vector<std::vector<float>> all_attention_weights;\n  int32_t attention_n_heads = 0;\n  int32_t attention_n_frames = 0;\n\n  // Track indices of timestamp tokens in the attention sequence\n  // (0-based, relative to the start of all_attention_weights)\n  std::vector<int32_t> timestamp_token_indices;\n\n  // Collect attention from initial tokens if enabled\n  if (collect_attention) {\n    auto &attn = std::get<6>(decoder_out);\n    auto attn_shape = attn.GetTensorTypeAndShapeInfo().GetShape();\n    // Shape: (batch, n_heads, n_tokens, n_audio_ctx)\n    if (attn_shape.size() >= 4 && attn_shape[1] > 0) {\n      attention_n_heads = static_cast<int32_t>(attn_shape[1]);\n      attention_n_frames = static_cast<int32_t>(attn_shape[3]);\n      int32_t n_initial_tokens = static_cast<int32_t>(attn_shape[2]);\n\n      const float *p_attn = attn.GetTensorData<float>();\n      int32_t stride = attention_n_frames;\n\n      // Store attention for each initial token\n      for (int32_t t = 0; t < n_initial_tokens; ++t) {\n        std::vector<float> token_attn(attention_n_heads * attention_n_frames);\n        for 
(int32_t h = 0; h < attention_n_heads; ++h) {\n          const float *src =\n              p_attn + h * n_initial_tokens * stride + t * stride;\n          std::copy(src, src + attention_n_frames,\n                    token_attn.begin() + h * attention_n_frames);\n        }\n        all_attention_weights.push_back(std::move(token_attn));\n      }\n    }\n  }\n\n  // assume at most 6 tokens per second\n  int32_t num_possible_tokens = num_feature_frames / 100.0 * 6;\n  num_possible_tokens = std::min<int32_t>(num_possible_tokens, n_text_ctx / 2);\n\n  for (int32_t i = 0; i < num_possible_tokens; ++i) {\n    if (max_token_id == eot) {\n      break;\n    }\n\n    predicted_tokens.push_back(max_token_id);\n    all_tokens.push_back(max_token_id);\n\n    // Track if this is a timestamp token (for filtering in DTW)\n    if (max_token_id >= timestamp_begin) {\n      // The attention index is: initial_tokens.size() + current predicted index\n      int32_t attn_idx = static_cast<int32_t>(initial_tokens.size()) +\n                         static_cast<int32_t>(predicted_tokens.size()) - 1;\n      timestamp_token_indices.push_back(attn_idx);\n    }\n\n    std::array<int64_t, 2> token_shape{1, 1};\n    Ort::Value tokens = Ort::Value::CreateTensor<int64_t>(\n        model_->Allocator(), token_shape.data(), token_shape.size());\n\n    int64_t *p_tokens = tokens.GetTensorMutableData<int64_t>();\n    p_tokens[0] = max_token_id;\n\n    decoder_out = model_->ForwardDecoder(std::move(tokens),\n                                         std::move(std::get<1>(decoder_out)),\n                                         std::move(std::get<2>(decoder_out)),\n                                         std::move(std::get<3>(decoder_out)),\n                                         std::move(std::get<4>(decoder_out)),\n                                         std::move(std::get<5>(decoder_out)));\n\n    // Collect attention for this token\n    if (collect_attention) {\n      auto &attn = 
std::get<6>(decoder_out);\n      auto attn_shape = attn.GetTensorTypeAndShapeInfo().GetShape();\n      if (attn_shape.size() >= 4 && attn_shape[1] == attention_n_heads) {\n        const float *p_attn = attn.GetTensorData<float>();\n        // Shape: (batch, n_heads, 1, n_audio_ctx) - single token\n        std::vector<float> token_attn(attention_n_heads * attention_n_frames);\n        for (int32_t h = 0; h < attention_n_heads; ++h) {\n          const float *src = p_attn + h * attention_n_frames;\n          std::copy(src, src + attention_n_frames,\n                    token_attn.begin() + h * attention_n_frames);\n        }\n        all_attention_weights.push_back(std::move(token_attn));\n      }\n    }\n\n    int64_t *p_offset =\n        std::get<5>(decoder_out).GetTensorMutableData<int64_t>();\n\n    *p_offset += 1;\n    if (*p_offset >= n_text_ctx - 1) {\n      break;\n    }\n\n    const float *p_logits = std::get<0>(decoder_out).GetTensorData<float>();\n\n    if (enable_segment_timestamps) {\n      // Make a copy of logits for applying timestamp rules\n      std::vector<float> logits_copy(p_logits, p_logits + vocab_size);\n      // After first token, don't apply max_initial_timestamp constraint\n      ApplyTimestampRules(logits_copy.data(), vocab_size, all_tokens,\n                          sample_begin, timestamp_begin, no_timestamps, eot,\n                          -1);  // -1 = no max_initial constraint\n      max_token_id = MaxElementIndex(logits_copy.data(), vocab_size);\n    } else {\n      max_token_id = MaxElementIndex(p_logits, vocab_size);\n    }\n  }\n\n  std::vector<OfflineWhisperDecoderResult> ans(1);\n\n  const auto &id2lang = model_->GetID2Lang();\n  if (id2lang.count(initial_tokens[1])) {\n    ans[0].lang = id2lang.at(initial_tokens[1]);\n  } else {\n    ans[0].lang = \"\";\n  }\n\n  ans[0].tokens = std::move(predicted_tokens);\n\n  // Parse timestamp tokens into segments if using segment timestamp mode\n  if (enable_segment_timestamps) {\n    
ans[0].segments = ParseTimestampTokens(ans[0].tokens, timestamp_begin, eot);\n  }\n\n  // Add accumulated attention weights if available\n  if (collect_attention && !all_attention_weights.empty()) {\n    int32_t n_tokens = static_cast<int32_t>(all_attention_weights.size());\n    ans[0].attention_n_heads = attention_n_heads;\n    ans[0].attention_n_tokens = n_tokens;\n    ans[0].attention_n_frames = attention_n_frames;\n    // Actual audio frames for clipping (encoder downsamples by factor of 2)\n    ans[0].num_audio_frames = num_feature_frames / 2;\n\n    // Flatten to (n_heads, n_tokens, n_frames)\n    ans[0].attention_weights.resize(attention_n_heads * n_tokens *\n                                    attention_n_frames);\n    for (int32_t h = 0; h < attention_n_heads; ++h) {\n      for (int32_t t = 0; t < n_tokens; ++t) {\n        const float *src =\n            all_attention_weights[t].data() + h * attention_n_frames;\n        float *dst = ans[0].attention_weights.data() +\n                     h * n_tokens * attention_n_frames + t * attention_n_frames;\n        std::copy(src, src + attention_n_frames, dst);\n      }\n    }\n\n    // Add timestamp token indices for DTW filtering\n    ans[0].timestamp_token_indices = std::move(timestamp_token_indices);\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-whisper-decoder.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineWhisperGreedySearchDecoder : public OfflineWhisperDecoder {\n public:\n  OfflineWhisperGreedySearchDecoder(const OfflineWhisperModelConfig &config,\n                                    OfflineWhisperModel *model)\n      : config_(config), model_(model) {}\n\n  std::vector<OfflineWhisperDecoderResult> Decode(\n      Ort::Value cross_k, Ort::Value cross_v,\n      int32_t num_feature_frames) override;\n\n  void SetConfig(const OfflineWhisperModelConfig &config) override;\n\n private:\n  OfflineWhisperModelConfig config_;\n  OfflineWhisperModel *model_;  // not owned\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-whisper-model-config.h\"\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineWhisperModelConfig::Register(ParseOptions *po) {\n  po->Register(\"whisper-encoder\", &encoder,\n               \"Path to onnx encoder of whisper, e.g., tiny-encoder.onnx, \"\n               \"medium.en-encoder.onnx.\");\n\n  po->Register(\"whisper-decoder\", &decoder,\n               \"Path to onnx decoder of whisper, e.g., tiny-decoder.onnx, \"\n               \"medium.en-decoder.onnx.\");\n\n  po->Register(\n      \"whisper-language\", &language,\n      \"The spoken language in the input audio file. Example values: \"\n      \"en, de, fr, zh, jp. If it is not given for a multilingual model, we will\"\n      \" infer the language from the input audio file. \"\n      \"Please refer to \"\n      \"https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\"\n      \" for valid values. Note that for non-multilingual models, it supports \"\n      \"only 'en'\");\n\n  po->Register(\"whisper-task\", &task,\n               \"Valid values: transcribe, translate. \"\n               \"Note that for non-multilingual models, it supports \"\n               \"only 'transcribe'\");\n\n  po->Register(\n      \"whisper-tail-paddings\", &tail_paddings,\n      \"Suggested value: 50 for English models. 300 for multilingual models. \"\n      \"Since we have removed the 30-second constraint, we need to add some \"\n      \"tail padding frames \"\n      \"so that whisper can detect the eot token. 
Leave it to -1 to use 1000.\");\n\n  po->Register(\n      \"whisper-enable-token-timestamps\", &enable_token_timestamps,\n      \"If true, use cross-attention weights and DTW to compute token-level \"\n      \"timestamps. Requires ONNX models exported with attention outputs. \"\n      \"Default: false.\");\n\n  po->Register(\n      \"whisper-enable-segment-timestamps\", &enable_segment_timestamps,\n      \"If true, use Whisper's native timestamp token mode to produce \"\n      \"segment-level timestamps. The decoder outputs timestamp tokens like \"\n      \"<|0.00|> interleaved with text, creating segments with start/end times. \"\n      \"Does not require attention outputs. Can be combined with \"\n      \"--whisper-enable-token-timestamps for both segment-level and \"\n      \"token-level \"\n      \"timestamps. Default: false.\");\n}\n\nbool OfflineWhisperModelConfig::Validate() const {\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --whisper-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"whisper encoder file '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --whisper-decoder\");\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"whisper decoder file '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  if (task != \"translate\" && task != \"transcribe\") {\n    SHERPA_ONNX_LOGE(\n        \"--whisper-task supports only translate and transcribe. 
Given: %s\",\n        task.c_str());\n\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineWhisperModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineWhisperModelConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"language=\\\"\" << language << \"\\\", \";\n  os << \"task=\\\"\" << task << \"\\\", \";\n  os << \"tail_paddings=\" << tail_paddings << \", \";\n  os << \"enable_token_timestamps=\"\n     << (enable_token_timestamps ? \"True\" : \"False\") << \", \";\n  os << \"enable_segment_timestamps=\"\n     << (enable_segment_timestamps ? \"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\nbool IsMultilingual(WhisperModelType model_type) {\n  switch (model_type) {\n    case WhisperModelType::TinyEn:\n    case WhisperModelType::BaseEn:\n    case WhisperModelType::SmallEn:\n    case WhisperModelType::MediumEn:\n      return false;  // English-only models\n\n    case WhisperModelType::Tiny:\n    case WhisperModelType::Base:\n    case WhisperModelType::Small:\n    case WhisperModelType::Medium:\n    case WhisperModelType::Large:\n      return true;  // Multilingual models\n  }\n\n  SHERPA_ONNX_LOGE(\"Unsupported model: %s\", ToString(model_type).c_str());\n  SHERPA_ONNX_EXIT(-1);\n  // Safety fallback (should never be hit)\n  return false;\n}\n\nstd::string ToString(WhisperModelType model) {\n  switch (model) {\n    case WhisperModelType::Tiny:\n      return \"tiny\";\n    case WhisperModelType::TinyEn:\n      return \"tiny.en\";\n    case WhisperModelType::Base:\n      return \"base\";\n    case WhisperModelType::BaseEn:\n      return \"base.en\";\n    case WhisperModelType::Small:\n      return \"small\";\n    case WhisperModelType::SmallEn:\n      return \"small.en\";\n    case WhisperModelType::Medium:\n      return \"medium\";\n    case WhisperModelType::MediumEn:\n      return \"medium.en\";\n    case WhisperModelType::Large:\n      return 
\"large\";\n  }\n  return \"unknown\";\n}\n\nWhisperModelType ParseWhisperModelType(const std::string &name) {\n  if (name == \"tiny\") return WhisperModelType::Tiny;\n  if (name == \"tiny.en\") return WhisperModelType::TinyEn;\n  if (name == \"base\") return WhisperModelType::Base;\n  if (name == \"base.en\") return WhisperModelType::BaseEn;\n  if (name == \"small\") return WhisperModelType::Small;\n  if (name == \"small.en\") return WhisperModelType::SmallEn;\n  if (name == \"medium\") return WhisperModelType::Medium;\n  if (name == \"medium.en\") return WhisperModelType::MediumEn;\n  if (name == \"large\") return WhisperModelType::Large;\n\n  SHERPA_ONNX_LOGE(\"Unknown Whisper model: '%s'\", name.c_str());\n  SHERPA_ONNX_EXIT(-1);\n\n  // Unreachable code\n  return WhisperModelType::Tiny;\n}\n\nint32_t GetWhisperLanguageTokenId(const std::string &lang) {\n  static const std::unordered_map<std::string, int32_t> kLangToToken = {\n      {\"hi\", 50276},  {\"cy\", 50297}, {\"oc\", 50328}, {\"so\", 50326},\n      {\"fr\", 50265},  {\"az\", 50304}, {\"eu\", 50310}, {\"ba\", 50355},\n      {\"no\", 50288},  {\"as\", 50350}, {\"nl\", 50271}, {\"bn\", 50302},\n      {\"es\", 50262},  {\"ml\", 50296}, {\"km\", 50323}, {\"mk\", 50308},\n      {\"sq\", 50317},  {\"mt\", 50343}, {\"et\", 50307}, {\"ms\", 50282},\n      {\"tr\", 50268},  {\"bg\", 50292}, {\"ps\", 50340}, {\"br\", 50309},\n      {\"ht\", 50339},  {\"tt\", 50351}, {\"tk\", 50341}, {\"la\", 50294},\n      {\"de\", 50261},  {\"ur\", 50290}, {\"ro\", 50284}, {\"fa\", 50300},\n      {\"uk\", 50280},  {\"mg\", 50349}, {\"lo\", 50336}, {\"sr\", 50303},\n      {\"yo\", 50325},  {\"id\", 50275}, {\"da\", 50285}, {\"pt\", 50267},\n      {\"nn\", 50342},  {\"sn\", 50324}, {\"sa\", 50344}, {\"sd\", 50332},\n      {\"gl\", 50319},  {\"ja\", 50266}, {\"pl\", 50269}, {\"ru\", 50263},\n      {\"ko\", 50264},  {\"ne\", 50313}, {\"kn\", 50306}, {\"zh\", 50260},\n      {\"be\", 50330},  {\"ca\", 50270}, {\"el\", 50281}, {\"it\", 
50274},\n      {\"hu\", 50286},  {\"lt\", 50293}, {\"ta\", 50287}, {\"is\", 50311},\n      {\"jw\", 50356},  {\"fi\", 50277}, {\"bo\", 50347}, {\"sv\", 50273},\n      {\"mi\", 50295},  {\"hr\", 50291}, {\"bs\", 50315}, {\"yi\", 50335},\n      {\"sk\", 50298},  {\"lv\", 50301}, {\"af\", 50327}, {\"vi\", 50278},\n      {\"ha\", 50354},  {\"mn\", 50314}, {\"cs\", 50283}, {\"sl\", 50305},\n      {\"pa\", 50321},  {\"su\", 50357}, {\"ka\", 50329}, {\"ln\", 50353},\n      {\"lb\", 50345},  {\"sw\", 50318}, {\"en\", 50259}, {\"tl\", 50348},\n      {\"hy\", 50312},  {\"te\", 50299}, {\"he\", 50279}, {\"my\", 50346},\n      {\"haw\", 50352}, {\"fo\", 50338}, {\"kk\", 50316}, {\"si\", 50322},\n      {\"tg\", 50331},  {\"th\", 50289}, {\"ar\", 50272}, {\"am\", 50334},\n      {\"mr\", 50320},  {\"uz\", 50337}, {\"gu\", 50333}};\n\n  auto it = kLangToToken.find(lang);\n\n  return (it != kLangToToken.end()) ? it->second : -1;\n}\n\nstd::string GetWhisperLanguageCode(int32_t token_id) {\n  static const std::unordered_map<int32_t, std::string> kTokenToLang = {\n      {50276, \"hi\"},  {50297, \"cy\"}, {50328, \"oc\"}, {50326, \"so\"},\n      {50265, \"fr\"},  {50304, \"az\"}, {50310, \"eu\"}, {50355, \"ba\"},\n      {50288, \"no\"},  {50350, \"as\"}, {50271, \"nl\"}, {50302, \"bn\"},\n      {50262, \"es\"},  {50296, \"ml\"}, {50323, \"km\"}, {50308, \"mk\"},\n      {50317, \"sq\"},  {50343, \"mt\"}, {50307, \"et\"}, {50282, \"ms\"},\n      {50268, \"tr\"},  {50292, \"bg\"}, {50340, \"ps\"}, {50309, \"br\"},\n      {50339, \"ht\"},  {50351, \"tt\"}, {50341, \"tk\"}, {50294, \"la\"},\n      {50261, \"de\"},  {50290, \"ur\"}, {50284, \"ro\"}, {50300, \"fa\"},\n      {50280, \"uk\"},  {50349, \"mg\"}, {50336, \"lo\"}, {50303, \"sr\"},\n      {50325, \"yo\"},  {50275, \"id\"}, {50285, \"da\"}, {50267, \"pt\"},\n      {50342, \"nn\"},  {50324, \"sn\"}, {50344, \"sa\"}, {50332, \"sd\"},\n      {50319, \"gl\"},  {50266, \"ja\"}, {50269, \"pl\"}, {50263, \"ru\"},\n      {50264, \"ko\"},  
{50313, \"ne\"}, {50306, \"kn\"}, {50260, \"zh\"},\n      {50330, \"be\"},  {50270, \"ca\"}, {50281, \"el\"}, {50274, \"it\"},\n      {50286, \"hu\"},  {50293, \"lt\"}, {50287, \"ta\"}, {50311, \"is\"},\n      {50356, \"jw\"},  {50277, \"fi\"}, {50347, \"bo\"}, {50273, \"sv\"},\n      {50295, \"mi\"},  {50291, \"hr\"}, {50315, \"bs\"}, {50335, \"yi\"},\n      {50298, \"sk\"},  {50301, \"lv\"}, {50327, \"af\"}, {50278, \"vi\"},\n      {50354, \"ha\"},  {50314, \"mn\"}, {50283, \"cs\"}, {50305, \"sl\"},\n      {50321, \"pa\"},  {50357, \"su\"}, {50329, \"ka\"}, {50353, \"ln\"},\n      {50345, \"lb\"},  {50318, \"sw\"}, {50259, \"en\"}, {50348, \"tl\"},\n      {50312, \"hy\"},  {50299, \"te\"}, {50279, \"he\"}, {50346, \"my\"},\n      {50352, \"haw\"}, {50338, \"fo\"}, {50316, \"kk\"}, {50322, \"si\"},\n      {50331, \"tg\"},  {50289, \"th\"}, {50272, \"ar\"}, {50334, \"am\"},\n      {50320, \"mr\"},  {50337, \"uz\"}, {50333, \"gu\"}};\n\n  auto it = kTokenToLang.find(token_id);\n  return (it != kTokenToLang.end()) ? 
it->second : std::string{};\n}\n\nconst std::vector<int32_t> &GetAllWhisperLanguageTokenIds() {\n  static const std::vector<int32_t> kLanguageTokenIds = {\n      50276, 50297, 50328, 50326, 50265, 50304, 50310, 50355, 50288, 50350,\n      50271, 50302, 50262, 50296, 50323, 50308, 50317, 50343, 50307, 50282,\n      50268, 50292, 50340, 50309, 50339, 50351, 50341, 50294, 50261, 50290,\n      50284, 50300, 50280, 50349, 50336, 50303, 50325, 50275, 50285, 50267,\n      50342, 50324, 50344, 50332, 50319, 50266, 50269, 50263, 50264, 50313,\n      50306, 50260, 50330, 50270, 50281, 50274, 50286, 50293, 50287, 50311,\n      50356, 50277, 50347, 50273, 50295, 50291, 50315, 50335, 50298, 50301,\n      50327, 50278, 50354, 50314, 50283, 50305, 50321, 50357, 50329, 50353,\n      50345, 50318, 50259, 50348, 50312, 50299, 50279, 50346, 50352, 50338,\n      50316, 50322, 50331, 50289, 50272, 50334, 50320, 50337, 50333};\n\n  return kLanguageTokenIds;\n}\n\nconst std::vector<std::string> &GetAllWhisperLanguageCodes() {\n  static const std::vector<std::string> kLanguageCodes = {\n      \"hi\",  \"cy\", \"oc\", \"so\", \"fr\", \"az\", \"eu\", \"ba\", \"no\", \"as\", \"nl\",\n      \"bn\",  \"es\", \"ml\", \"km\", \"mk\", \"sq\", \"mt\", \"et\", \"ms\", \"tr\", \"bg\",\n      \"ps\",  \"br\", \"ht\", \"tt\", \"tk\", \"la\", \"de\", \"ur\", \"ro\", \"fa\", \"uk\",\n      \"mg\",  \"lo\", \"sr\", \"yo\", \"id\", \"da\", \"pt\", \"nn\", \"sn\", \"sa\", \"sd\",\n      \"gl\",  \"ja\", \"pl\", \"ru\", \"ko\", \"ne\", \"kn\", \"zh\", \"be\", \"ca\", \"el\",\n      \"it\",  \"hu\", \"lt\", \"ta\", \"is\", \"jw\", \"fi\", \"bo\", \"sv\", \"mi\", \"hr\",\n      \"bs\",  \"yi\", \"sk\", \"lv\", \"af\", \"vi\", \"ha\", \"mn\", \"cs\", \"sl\", \"pa\",\n      \"su\",  \"ka\", \"ln\", \"lb\", \"sw\", \"en\", \"tl\", \"hy\", \"te\", \"he\", \"my\",\n      \"haw\", \"fo\", \"kk\", \"si\", \"tg\", \"th\", \"ar\", \"am\", \"mr\", \"uz\", \"gu\"};\n\n  return kLanguageCodes;\n}\n\n}  // namespace 
sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineWhisperModelConfig {\n  std::string encoder;\n  std::string decoder;\n\n  // Available languages can be found at\n  // https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n  //\n  // Note: For non-multilingual models, it supports only \"en\"\n  //\n  // If empty, we will infer it from the input audio file when\n  // the model is multilingual.\n  std::string language;\n\n  // Valid values are transcribe and translate\n  //\n  // Note: For non-multilingual models, it supports only \"transcribe\"\n  std::string task = \"transcribe\";\n\n  // Number of tail padding frames.\n  //\n  // Since we remove the 30-second constraint, we need to add some paddings\n  // at the end.\n  //\n  // Recommended values:\n  //   - 50 for English models\n  //   - 300 for multilingual models\n  int32_t tail_paddings = -1;\n\n  // If true, use cross-attention weights and DTW to compute token-level\n  // timestamps. This requires ONNX models exported with attention outputs.\n  bool enable_token_timestamps = false;\n\n  // If true, use Whisper's native timestamp token mode to produce segment-level\n  // timestamps. The decoder outputs timestamp tokens like <|0.00|> interleaved\n  // with text, creating segments with start/end times. Does not require\n  // attention outputs. 
Can be combined with enable_token_timestamps for both\n  // segment-level and token-level timestamps.\n  bool enable_segment_timestamps = false;\n\n  OfflineWhisperModelConfig() = default;\n  OfflineWhisperModelConfig(const std::string &encoder,\n                            const std::string &decoder,\n                            const std::string &language,\n                            const std::string &task, int32_t tail_paddings,\n                            bool enable_token_timestamps = false,\n                            bool enable_segment_timestamps = false)\n      : encoder(encoder),\n        decoder(decoder),\n        language(language),\n        task(task),\n        tail_paddings(tail_paddings),\n        enable_token_timestamps(enable_token_timestamps),\n        enable_segment_timestamps(enable_segment_timestamps) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n// Represents a segment with start/end timestamps from timestamp tokens\nstruct OfflineWhisperSegment {\n  float start_time = 0.0f;\n  float end_time = 0.0f;\n  std::vector<int32_t> token_ids;  // Text token IDs in this segment\n};\n\nstruct OfflineWhisperDecoderResult {\n  /// The decoded token IDs\n  std::vector<int32_t> tokens;\n  std::string lang;\n\n  /// Cross-attention weights for token-level timestamps (if enabled)\n  /// Shape: (n_heads, n_tokens, n_audio_frames), flattened to 1D\n  /// Empty if timestamps are not enabled or model doesn't support it\n  std::vector<float> attention_weights;\n\n  /// Dimensions of attention weights\n  int32_t attention_n_heads = 0;\n  int32_t attention_n_tokens = 0;\n  int32_t attention_n_frames = 0;\n\n  /// Number of actual audio feature frames (for clipping attention)\n  /// This is num_feature_frames / 2 (due to encoder downsampling)\n  int32_t num_audio_frames = 0;\n\n  /// Indices of timestamp tokens in the attention weights (0-based, relative\n  /// to the start of the attention sequence 
which includes initial tokens).\n  /// Used to filter out timestamp tokens before DTW alignment.\n  std::vector<int32_t> timestamp_token_indices;\n\n  /// Segments with timestamps (when using timestamp token mode)\n  std::vector<OfflineWhisperSegment> segments;\n};\n\n// used by ascend/rknn/qnn/axera, etc.\nenum class WhisperModelType {\n  Tiny,\n  TinyEn,\n  Base,\n  BaseEn,\n  Small,\n  SmallEn,\n  Medium,\n  MediumEn,\n  Large\n};\n\nstd::string ToString(WhisperModelType model);\nbool IsMultilingual(WhisperModelType model_type);\n\nWhisperModelType ParseWhisperModelType(const std::string &name);\nint32_t GetWhisperLanguageTokenId(const std::string &lang);\nstd::string GetWhisperLanguageCode(int32_t token_id);\nconst std::vector<int32_t> &GetAllWhisperLanguageTokenIds();\nconst std::vector<std::string> &GetAllWhisperLanguageCodes();\n\nstruct WhisperModelMultilingualTokens {\n  int32_t sot = 50258;\n  int32_t eot = 50257;\n  int32_t transcribe = 50359;\n  int32_t translate = 50358;\n  int32_t no_timestamps = 50363;\n};\n\nstruct WhisperModelEnglishTokens {\n  int32_t sot = 50257;\n  int32_t eot = 50256;\n  int32_t no_timestamps = 50362;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-model.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n\n#include <algorithm>\n#include <array>\n#include <cmath>\n#include <cstring>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nstatic inline bool IsCudaProvider(const std::string &provider) {\n  return provider == \"cuda\";\n}\n\n}  // namespace\n\nclass OfflineWhisperModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.whisper.encoder), sess_opts_);\n    InitEncoder(nullptr, 0);\n\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.whisper.decoder), sess_opts_);\n    InitDecoder(nullptr, 0);\n\n    InitCudaIOBinding();\n  }\n\n  explicit Impl(const SpokenLanguageIdentificationConfig &config)\n      : lid_config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            
Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.whisper.encoder), sess_opts_);\n    InitEncoder(nullptr, 0);\n\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.whisper.decoder), sess_opts_);\n    InitDecoder(nullptr, 0);\n\n    InitCudaIOBinding();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    {\n      auto buf = ReadFile(mgr, config.whisper.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.whisper.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    InitCudaIOBinding();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const SpokenLanguageIdentificationConfig &config)\n      : lid_config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        cpu_mem_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        is_cpu_provider_(config.provider == \"cpu\" || config.provider.empty()) {\n    {\n      auto buf = ReadFile(mgr, config.whisper.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.whisper.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    InitCudaIOBinding();\n  }\n\n  std::pair<Ort::Value, Ort::Value> ForwardEncoder(Ort::Value features) {\n    std::vector<Ort::Value> 
encoder_out;\n\n    if (use_cuda_iobinding_) {\n      // Encoder outputs are n_layer_cross_k and n_layer_cross_v, which are used\n      // multiple times in decoder steps. Keep them on GPU to avoid\n      // device<->host copies.\n      Ort::IoBinding binding(*encoder_sess_);\n      binding.BindInput(encoder_input_names_ptr_[0], features);\n\n      binding.BindOutput(encoder_output_names_ptr_[0], *cuda_mem_info_);\n      binding.BindOutput(encoder_output_names_ptr_[1], *cuda_mem_info_);\n\n      binding.SynchronizeInputs();\n      encoder_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      encoder_out = binding.GetOutputValues();\n    } else {\n      encoder_out = encoder_sess_->Run(\n          {}, encoder_input_names_ptr_.data(), &features, 1,\n          encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n    }\n\n    return {std::move(encoder_out[0]), std::move(encoder_out[1])};\n  }\n\n  std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n             Ort::Value, Ort::Value>\n  ForwardDecoder(Ort::Value tokens, Ort::Value n_layer_self_k_cache,\n                 Ort::Value n_layer_self_v_cache, Ort::Value n_layer_cross_k,\n                 Ort::Value n_layer_cross_v, Ort::Value offset) {\n    std::array<Ort::Value, 6> decoder_input = {std::move(tokens),\n                                               std::move(n_layer_self_k_cache),\n                                               std::move(n_layer_self_v_cache),\n                                               std::move(n_layer_cross_k),\n                                               std::move(n_layer_cross_v),\n                                               std::move(offset)};\n\n    std::vector<Ort::Value> decoder_out;\n\n    if (use_cuda_iobinding_) {\n      // CPU-side sampling needs logits on CPU, while self KV cache should\n      // remain on GPU to avoid large device<->host copies between decode steps.\n      Ort::IoBinding 
binding(*decoder_sess_);\n      for (size_t i = 0; i < decoder_input.size(); ++i) {\n        binding.BindInput(decoder_input_names_ptr_[i], decoder_input[i]);\n      }\n\n      binding.BindOutput(decoder_output_names_ptr_[0], cpu_mem_info_);\n      binding.BindOutput(decoder_output_names_ptr_[1], *cuda_mem_info_);\n      binding.BindOutput(decoder_output_names_ptr_[2], *cuda_mem_info_);\n      if (has_attention_output_ && decoder_output_names_ptr_.size() > 3) {\n        binding.BindOutput(decoder_output_names_ptr_[3], cpu_mem_info_);\n      }\n\n      binding.SynchronizeInputs();\n      decoder_sess_->Run(Ort::RunOptions{nullptr}, binding);\n      binding.SynchronizeOutputs();\n      decoder_out = binding.GetOutputValues();\n    } else {\n      decoder_out = decoder_sess_->Run(\n          {}, decoder_input_names_ptr_.data(), decoder_input.data(),\n          decoder_input.size(), decoder_output_names_ptr_.data(),\n          decoder_output_names_ptr_.size());\n    }\n\n    // Handle attention output (4th output) if present\n    // For models without attention output, this remains nullptr\n    Ort::Value attention_weights{nullptr};\n    if (has_attention_output_ && decoder_out.size() > 3) {\n      attention_weights = std::move(decoder_out[3]);\n    }\n\n    return std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n                      Ort::Value, Ort::Value, Ort::Value>{\n        std::move(decoder_out[0]),   std::move(decoder_out[1]),\n        std::move(decoder_out[2]),   std::move(decoder_input[3]),\n        std::move(decoder_input[4]), std::move(decoder_input[5]),\n        std::move(attention_weights)};\n  }\n\n  bool HasAttentionOutput() const { return has_attention_output_; }\n\n  int32_t NumAlignmentHeads() const { return n_alignment_heads_; }\n\n  int32_t DetectLanguage(Ort::Value &cross_k,    // NOLINT\n                         Ort::Value &cross_v) {  // NOLINT\n    int64_t token_val = SOT();\n    std::array<int64_t, 2> token_shape{1, 1};\n\n    auto 
memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    Ort::Value tokens = Ort::Value::CreateTensor(\n        memory_info, &token_val, 1, token_shape.data(), token_shape.size());\n\n    auto self_kv_cache = GetInitialSelfKVCache();\n\n    std::array<int64_t, 1> offset_shape{1};\n    Ort::Value offset = Ort::Value::CreateTensor<int64_t>(\n        Allocator(), offset_shape.data(), offset_shape.size());\n    *(offset.GetTensorMutableData<int64_t>()) = 0;\n\n    auto decoder_out =\n        ForwardDecoder(std::move(tokens), std::move(self_kv_cache.first),\n                       std::move(self_kv_cache.second), std::move(cross_k),\n                       std::move(cross_v), std::move(offset));\n\n    cross_k = std::move(std::get<3>(decoder_out));\n    cross_v = std::move(std::get<4>(decoder_out));\n\n    const float *p_logits = std::get<0>(decoder_out).GetTensorData<float>();\n    const auto &all_language_ids = GetAllLanguageIDs();\n\n    int32_t lang_id = all_language_ids[0];\n    float this_logit = p_logits[lang_id];\n\n    for (int32_t i = 1; i != all_language_ids.size(); ++i) {\n      int32_t id = all_language_ids[i];\n      float p = p_logits[id];\n\n      if (p > this_logit) {\n        this_logit = p;\n        lang_id = id;\n      }\n    }\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Detected language: %s\",\n                       GetID2Lang().at(lang_id).c_str());\n    }\n\n    return lang_id;\n  }\n\n  std::pair<Ort::Value, Ort::Value> GetInitialSelfKVCache() {\n    std::array<int64_t, 4> shape{n_text_layer_, 1, n_text_ctx_, n_text_state_};\n\n    Ort::Value n_layer_self_k_cache = Ort::Value::CreateTensor<float>(\n        Allocator(), shape.data(), shape.size());\n\n    Ort::Value n_layer_self_v_cache = Ort::Value::CreateTensor<float>(\n        Allocator(), shape.data(), shape.size());\n\n    auto n = shape[0] * shape[1] * shape[2] * shape[3];\n\n    float *p_k = 
n_layer_self_k_cache.GetTensorMutableData<float>();\n    float *p_v = n_layer_self_v_cache.GetTensorMutableData<float>();\n\n    memset(p_k, 0, sizeof(float) * n);\n    memset(p_v, 0, sizeof(float) * n);\n\n    return {std::move(n_layer_self_k_cache), std::move(n_layer_self_v_cache)};\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  const std::vector<int64_t> &GetInitialTokens() const { return sot_sequence_; }\n\n  const std::vector<int32_t> &GetAllLanguageIDs() const {\n    return all_language_tokens_;\n  }\n\n  const std::unordered_map<std::string, int32_t> &GetLang2ID() const {\n    return lang2id_;\n  }\n\n  const std::unordered_map<int32_t, std::string> &GetID2Lang() const {\n    return id2lang_;\n  }\n\n  int32_t NoTimeStampsToken() const { return no_timestamps_; }\n\n  // First timestamp token (represents 0.00s).\n  // Timestamp tokens are timestamp_begin, timestamp_begin + 1, ...,\n  // timestamp_end. Consecutive tokens are 0.02s (20ms) apart, covering\n  // 0.00s to 30.00s.\n  int32_t TimestampBegin() const { return timestamp_begin_; }\n\n  // Last timestamp token (represents 30.00s)\n  // There are 1501 timestamp tokens total (0.00s to 30.00s at 0.02s intervals)\n  int32_t TimestampEnd() const { return timestamp_begin_ + 1500; }\n\n  int32_t EOT() const { return eot_; }\n\n  int32_t SOT() const { return sot_; }\n\n  int32_t TextCtx() const { return n_text_ctx_; }\n\n  int32_t VocabSize() const { return n_vocab_; }\n\n  int32_t FeatureDim() const { return n_mels_; }\n\n  int32_t Translate() const { return translate_; }\n\n  bool IsMultiLingual() const { return is_multilingual_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      encoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!encoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize encoder session outside of \"\n          
\"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(n_mels_, \"n_mels\");\n    SHERPA_ONNX_READ_META_DATA(n_text_layer_, \"n_text_layer\");\n    SHERPA_ONNX_READ_META_DATA(n_text_ctx_, \"n_text_ctx\");\n    SHERPA_ONNX_READ_META_DATA(n_text_state_, \"n_text_state\");\n    SHERPA_ONNX_READ_META_DATA(n_vocab_, \"n_vocab\");\n    SHERPA_ONNX_READ_META_DATA(sot_, \"sot\");\n    SHERPA_ONNX_READ_META_DATA(eot_, \"eot\");\n    SHERPA_ONNX_READ_META_DATA(blank_, \"blank_id\");\n    SHERPA_ONNX_READ_META_DATA(translate_, \"translate\");\n    SHERPA_ONNX_READ_META_DATA(transcribe_, \"transcribe\");\n    SHERPA_ONNX_READ_META_DATA(is_multilingual_, \"is_multilingual\");\n    SHERPA_ONNX_READ_META_DATA(no_timestamps_, \"no_timestamps\");\n    // timestamp_begin is the first timestamp token (0.00s)\n    // It's typically no_timestamps + 1 in OpenAI Whisper tokenizer\n    timestamp_begin_ = no_timestamps_ + 1;\n    SHERPA_ONNX_READ_META_DATA(no_speech_, \"no_speech\");\n    SHERPA_ONNX_READ_META_DATA_VEC(sot_sequence_, \"sot_sequence\");\n\n    if (is_multilingual_) {\n      SHERPA_ONNX_READ_META_DATA_VEC(all_language_tokens_,\n                                     \"all_language_tokens\");\n      SHERPA_ONNX_READ_META_DATA_VEC_STRING(all_language_codes_,\n              
                              \"all_language_codes\");\n      if (all_language_tokens_.size() != all_language_codes_.size()) {\n        SHERPA_ONNX_LOGE(\"# lang_id: %d != # lang_code: %d\",\n                         static_cast<int32_t>(all_language_tokens_.size()),\n                         static_cast<int32_t>(all_language_codes_.size()));\n        exit(-1);\n      }\n\n      for (int32_t i = 0;\n           i != static_cast<int32_t>(all_language_tokens_.size()); ++i) {\n        lang2id_[all_language_codes_[i]] = all_language_tokens_[i];\n        id2lang_[all_language_tokens_[i]] = all_language_codes_[i];\n      }\n    }\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      decoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!decoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize decoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n\n    // Check if decoder has attention output (4 outputs instead of 3)\n    // Outputs are: logits, self_k_cache, self_v_cache,\n    // [cross_attention_weights]\n    has_attention_output_ = (decoder_output_names_.size() >= 4);\n\n    if (has_attention_output_) {\n      // Try to read n_alignment_heads from encoder metadata\n      Ort::AllocatorWithDefaultOptions allocator;\n      Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n      SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(n_alignment_heads_,\n                                              \"n_alignment_heads\", 0);\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Decoder has attention output with %d alignment heads\",\n              
           n_alignment_heads_);\n      }\n    }\n  }\n\n  void InitCudaIOBinding() {\n    use_cuda_iobinding_ = (!is_cpu_provider_ && IsCudaProvider(GetProvider()));\n    if (use_cuda_iobinding_) {\n      // Use device 0 by default. SessionOptions() in sherpa-onnx usually\n      // configures the CUDA EP device; binding here only affects output memory.\n      cuda_mem_info_ = std::make_unique<Ort::MemoryInfo>(\n          \"Cuda\", OrtDeviceAllocator, 0, OrtMemTypeDefault);\n    }\n  }\n\n  std::string GetProvider() const {\n    if (!config_.provider.empty()) {\n      return config_.provider;\n    }\n    return lid_config_.provider;\n  }\n\n  OfflineModelConfig config_;\n  SpokenLanguageIdentificationConfig lid_config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  Ort::MemoryInfo cpu_mem_info_;\n  std::unique_ptr<Ort::MemoryInfo> cuda_mem_info_;\n  bool use_cuda_iobinding_ = false;\n  bool is_cpu_provider_ = false;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<int32_t> all_language_tokens_;\n  std::vector<std::string> all_language_codes_;\n  std::unordered_map<std::string, int32_t> lang2id_;\n  std::unordered_map<int32_t, std::string> id2lang_;\n\n  // model meta data\n  int32_t n_mels_ = 80;\n  int32_t n_text_layer_ = 0;\n  int32_t n_text_ctx_ = 0;\n  int32_t n_text_state_ = 0;\n  int32_t n_vocab_ = 0;\n  int32_t sot_ = 0;\n  int32_t eot_ = 0;\n  int32_t blank_ = 0;\n  int32_t translate_ = 0;\n  int32_t 
transcribe_ = 0;\n  int32_t no_timestamps_ = 0;\n  int32_t timestamp_begin_ =\n      0;  // First timestamp token, typically no_timestamps_ + 1\n  int32_t no_speech_ = 0;\n  int32_t is_multilingual_ = 0;\n  std::vector<int64_t> sot_sequence_;\n\n  // For cross-attention token-level timestamps\n  bool has_attention_output_ = false;\n  int32_t n_alignment_heads_ = 0;\n};\n\nOfflineWhisperModel::OfflineWhisperModel(const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\nOfflineWhisperModel::OfflineWhisperModel(\n    const SpokenLanguageIdentificationConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineWhisperModel::OfflineWhisperModel(Manager *mgr,\n                                         const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\ntemplate <typename Manager>\nOfflineWhisperModel::OfflineWhisperModel(\n    Manager *mgr, const SpokenLanguageIdentificationConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineWhisperModel::~OfflineWhisperModel() = default;\n\nstd::pair<Ort::Value, Ort::Value> OfflineWhisperModel::ForwardEncoder(\n    Ort::Value features) const {\n  return impl_->ForwardEncoder(std::move(features));\n}\n\nstd::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n           Ort::Value, Ort::Value>\nOfflineWhisperModel::ForwardDecoder(Ort::Value tokens,\n                                    Ort::Value n_layer_self_k_cache,\n                                    Ort::Value n_layer_self_v_cache,\n                                    Ort::Value n_layer_cross_k,\n                                    Ort::Value n_layer_cross_v,\n                                    Ort::Value offset) const {\n  return impl_->ForwardDecoder(\n      std::move(tokens), std::move(n_layer_self_k_cache),\n      std::move(n_layer_self_v_cache), std::move(n_layer_cross_k),\n      std::move(n_layer_cross_v), 
std::move(offset));\n}\n\nint32_t OfflineWhisperModel::DetectLanguage(Ort::Value &cross_k,    // NOLINT\n                                            Ort::Value &cross_v) {  // NOLINT\n  return impl_->DetectLanguage(cross_k, cross_v);\n}\n\nstd::pair<Ort::Value, Ort::Value> OfflineWhisperModel::GetInitialSelfKVCache()\n    const {\n  return impl_->GetInitialSelfKVCache();\n}\n\nOrtAllocator *OfflineWhisperModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nconst std::vector<int64_t> &OfflineWhisperModel::GetInitialTokens() const {\n  return impl_->GetInitialTokens();\n}\n\nconst std::vector<int32_t> &OfflineWhisperModel::GetAllLanguageIDs() const {\n  return impl_->GetAllLanguageIDs();\n}\n\nconst std::unordered_map<std::string, int32_t> &\nOfflineWhisperModel::GetLang2ID() const {\n  return impl_->GetLang2ID();\n}\n\nconst std::unordered_map<int32_t, std::string> &\nOfflineWhisperModel::GetID2Lang() const {\n  return impl_->GetID2Lang();\n}\n\nint32_t OfflineWhisperModel::NoTimeStampsToken() const {\n  return impl_->NoTimeStampsToken();\n}\n\nint32_t OfflineWhisperModel::TimestampBegin() const {\n  return impl_->TimestampBegin();\n}\n\nint32_t OfflineWhisperModel::TimestampEnd() const {\n  return impl_->TimestampEnd();\n}\n\nint32_t OfflineWhisperModel::EOT() const { return impl_->EOT(); }\n\nint32_t OfflineWhisperModel::SOT() const { return impl_->SOT(); }\n\nint32_t OfflineWhisperModel::TextCtx() const { return impl_->TextCtx(); }\n\nint32_t OfflineWhisperModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OfflineWhisperModel::FeatureDim() const { return impl_->FeatureDim(); }\n\nint32_t OfflineWhisperModel::Translate() const { return impl_->Translate(); }\n\nbool OfflineWhisperModel::IsMultiLingual() const {\n  return impl_->IsMultiLingual();\n}\n\nbool OfflineWhisperModel::HasAttentionOutput() const {\n  return impl_->HasAttentionOutput();\n}\n\nint32_t OfflineWhisperModel::NumAlignmentHeads() const {\n  return 
impl_->NumAlignmentHeads();\n}\n\nvoid OfflineWhisperModel::NormalizeFeatures(float *features, int32_t num_frames,\n                                            int32_t feat_dim) {\n  NormalizeWhisperFeatures(features, num_frames, feat_dim);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineWhisperModel::OfflineWhisperModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n\ntemplate OfflineWhisperModel::OfflineWhisperModel(\n    AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineWhisperModel::OfflineWhisperModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n\ntemplate OfflineWhisperModel::OfflineWhisperModel(\n    NativeResourceManager *mgr,\n    const SpokenLanguageIdentificationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-model.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-model.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <tuple>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineWhisperModel {\n public:\n  explicit OfflineWhisperModel(const OfflineModelConfig &config);\n\n  explicit OfflineWhisperModel(\n      const SpokenLanguageIdentificationConfig &config);\n\n  template <typename Manager>\n  OfflineWhisperModel(Manager *mgr, const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineWhisperModel(Manager *mgr,\n                      const SpokenLanguageIdentificationConfig &config);\n\n  ~OfflineWhisperModel();\n\n  /** Run the encoder model.\n   *\n   * @param features  A tensor of shape (N, C, T). 
It is changed in-place.\n   *                  C is 80 and T is 3000.\n   *\n   * @return Return a pair containing:\n   *  - n_layer_cross_k: A 4-D tensor of shape\n   *                     (n_text_layer, N, n_audio_ctx, n_text_state)\n   *  - n_layer_cross_v: A 4-D tensor of shape\n   *                     (n_text_layer, N, n_audio_ctx, n_text_state)\n   */\n  std::pair<Ort::Value, Ort::Value> ForwardEncoder(Ort::Value features) const;\n\n  /** Run the decoder model.\n   *\n   * @param tokens An int64 tensor of shape (N, num_words)\n   * @param n_layer_self_k_cache  A 4-D tensor of shape\n   *                              (n_text_layer, N, n_text_ctx, n_text_state).\n   * @param n_layer_self_v_cache  A 4-D tensor of shape\n   *                              (n_text_layer, N, n_text_ctx, n_text_state).\n   * @param n_layer_cross_k       A 4-D tensor of shape\n   *                              (n_text_layer, N, n_audio_ctx, n_text_state).\n   * @param n_layer_cross_v       A 4-D tensor of shape\n   *                              (n_text_layer, N, n_audio_ctx, n_text_state).\n   * @param offset An int64 tensor of shape (N,)\n   *\n   * @return Return a tuple containing 7 tensors:\n   *\n   *  - logits A 3-D tensor of shape (N, num_words, vocab_size)\n   *  - out_n_layer_self_k_cache Same shape as n_layer_self_k_cache\n   *  - out_n_layer_self_v_cache Same shape as n_layer_self_v_cache\n   *  - out_n_layer_cross_k Same as n_layer_cross_k\n   *  - out_n_layer_cross_v Same as n_layer_cross_v\n   *  - out_offset Same as offset\n   *  - cross_attention_weights (if HasAttentionOutput()) A 4-D tensor of shape\n   *                            (N, n_alignment_heads, n_tokens, n_audio_ctx)\n   *                            A null Ort::Value if the model has no attention\n   *                            output\n   */\n  std::tuple<Ort::Value, Ort::Value, Ort::Value, Ort::Value, Ort::Value,\n             Ort::Value, Ort::Value>\n  ForwardDecoder(Ort::Value tokens, Ort::Value n_layer_self_k_cache,\n                 
Ort::Value n_layer_self_v_cache, Ort::Value n_layer_cross_k,\n                 Ort::Value n_layer_cross_v, Ort::Value offset) const;\n\n  int32_t DetectLanguage(Ort::Value &cross_k,   // NOLINT\n                         Ort::Value &cross_v);  // NOLINT\n\n  /** Return the initial self kv cache in a pair\n   *  - n_layer_self_k_cache A 4-D tensor of shape\n   *                         (n_text_layer, N, n_text_ctx, n_text_state).\n   *  - n_layer_self_v_cache A 4-D tensor of shape\n   *                         (n_text_layer, N, n_text_ctx, n_text_state).\n   */\n  std::pair<Ort::Value, Ort::Value> GetInitialSelfKVCache() const;\n  const std::vector<int64_t> &GetInitialTokens() const;\n  const std::vector<int32_t> &GetAllLanguageIDs() const;\n  const std::unordered_map<std::string, int32_t> &GetLang2ID() const;\n  const std::unordered_map<int32_t, std::string> &GetID2Lang() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  int32_t NoTimeStampsToken() const;\n  int32_t TimestampBegin() const;  // First timestamp token (0.00s)\n  int32_t TimestampEnd() const;    // Last timestamp token (30.00s)\n  int32_t EOT() const;\n  int32_t SOT() const;\n  int32_t TextCtx() const;\n  int32_t VocabSize() const;\n  int32_t FeatureDim() const;\n  int32_t Translate() const;\n  bool IsMultiLingual() const;\n\n  // Check if the decoder model has cross-attention weight outputs\n  bool HasAttentionOutput() const;\n\n  // Get number of alignment heads (0 if no attention output)\n  int32_t NumAlignmentHeads() const;\n\n  static void NormalizeFeatures(float *features, int32_t num_frames,\n                                int32_t feat_dim);\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-timestamp-rules-test.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-timestamp-rules-test.cc\n//\n// Copyright (c)  2026  Posit Software, PBC\n\n#include \"sherpa-onnx/csrc/offline-whisper-timestamp-rules.h\"\n\n#include <cmath>\n#include <limits>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\n// Realistic Whisper token IDs (from multilingual model)\nconstexpr int32_t kTimestampBegin = 50364;  // <|0.00|>\nconstexpr int32_t kEot = 50257;             // <|endoftext|>\nconstexpr int32_t kNoTimestamps = 50363;    // <|notimestamps|>\nconstexpr int32_t kVocabSize = 51865;\nconstexpr int32_t kSampleBegin = 3;  // After [sot, language, task]\n\nconstexpr float kNegInf = -std::numeric_limits<float>::infinity();\n\n// Helper to check if a logit is suppressed (is -inf)\nbool IsSuppressed(float logit) { return std::isinf(logit) && logit < 0; }\n\n// Helper to count non-suppressed logits in a range\nint32_t CountNonSuppressed(const float *logits, int32_t start, int32_t end) {\n  int32_t count = 0;\n  for (int32_t i = start; i < end; ++i) {\n    if (!IsSuppressed(logits[i])) {\n      ++count;\n    }\n  }\n  return count;\n}\n\n// Helper to initialize logits with uniform values\nvoid InitLogits(std::vector<float> *logits, float value = 0.0f) {\n  logits->assign(kVocabSize, value);\n}\n\nclass ApplyTimestampRulesTest : public ::testing::Test {\n protected:\n  std::vector<float> logits_;\n\n  void SetUp() override { InitLogits(&logits_); }\n};\n\n// =============================================================================\n// Rule 1: Always suppress no_timestamps token\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, AlwaysSuppressNoTimestamps) {\n  std::vector<int64_t> tokens = {1, 2, 3};  // SOT sequence only\n  logits_[kNoTimestamps] = 5.0f;            // Give it a high value\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      
kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  EXPECT_TRUE(IsSuppressed(logits_[kNoTimestamps]));\n}\n\n// =============================================================================\n// Rule 5: First sampled token must be a timestamp\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, FirstTokenMustBeTimestamp) {\n  // Only SOT sequence, no sampled tokens yet\n  std::vector<int64_t> tokens = {1, 2, 3};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // All text tokens should be suppressed\n  for (int32_t i = 0; i < kTimestampBegin; ++i) {\n    if (i != kNoTimestamps) {  // no_timestamps is already suppressed\n      EXPECT_TRUE(IsSuppressed(logits_[i]))\n          << \"Text token \" << i << \" should be suppressed on first sample\";\n    }\n  }\n\n  // Timestamps within max_initial_timestamp_index should NOT be suppressed\n  for (int32_t i = kTimestampBegin; i <= kTimestampBegin + 50; ++i) {\n    EXPECT_FALSE(IsSuppressed(logits_[i]))\n        << \"Timestamp \" << i << \" should be allowed on first sample\";\n  }\n\n  // Timestamps beyond max_initial_timestamp_index should be suppressed\n  for (int32_t i = kTimestampBegin + 51; i < kVocabSize; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Timestamp \" << i << \" should be suppressed (beyond max_initial)\";\n  }\n}\n\nTEST_F(ApplyTimestampRulesTest, FirstTokenNoMaxInitialConstraint) {\n  std::vector<int64_t> tokens = {1, 2, 3};\n\n  // Pass -1 for max_initial_timestamp_index to disable the constraint\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, -1);\n\n  // All timestamps should be allowed\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    EXPECT_FALSE(IsSuppressed(logits_[i]))\n        << \"All timestamps should be allowed when 
max_initial is -1\";\n  }\n}\n\n// =============================================================================\n// Rule 3: Timestamp pairing - after opening timestamp, force text\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, AfterFirstTimestampForceText) {\n  // SOT sequence + first timestamp <|0.00|>\n  std::vector<int64_t> tokens = {1, 2, 3, kTimestampBegin};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // All timestamps should be suppressed (force text)\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Timestamp \" << i << \" should be suppressed after opening timestamp\";\n  }\n\n  // Text tokens should NOT be suppressed (except no_timestamps)\n  // Note: EOT is also a \"text\" token in this context\n  int32_t text_allowed = CountNonSuppressed(logits_.data(), 0, kTimestampBegin);\n  EXPECT_GT(text_allowed, 0) << \"Some text tokens should be allowed\";\n}\n\nTEST_F(ApplyTimestampRulesTest, AfterTwoConsecutiveTimestampsForceText) {\n  // Pattern: <|0.00|><|0.00|> - two consecutive timestamps\n  std::vector<int64_t> tokens = {1, 2, 3, kTimestampBegin, kTimestampBegin};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // All timestamps should be suppressed (force text)\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Timestamp \" << i << \" should be suppressed after double timestamp\";\n  }\n}\n\n// =============================================================================\n// Rule 3: After text+timestamp, force timestamp/EOT (suppress text)\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, 
AfterTextThenTimestampForceTimestampOrEot) {\n  // Pattern: <|0.00|> \"hello\" <|2.00|> - segment just closed\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t ts_2_00 = kTimestampBegin + 100;  // 2.00 seconds = 100 * 0.02\n  int32_t text_token = 500;                 // some text token\n\n  std::vector<int64_t> tokens = {1, 2, 3, ts_0_00, text_token, ts_2_00};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Text tokens before EOT should be suppressed\n  for (int32_t i = 0; i < kEot; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Text token \" << i << \" should be suppressed after segment closed\";\n  }\n\n  // EOT should be allowed\n  EXPECT_FALSE(IsSuppressed(logits_[kEot])) << \"EOT should be allowed\";\n\n  // Text tokens after EOT but before timestamp_begin should be suppressed\n  for (int32_t i = kEot + 1; i < kTimestampBegin; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Token \" << i << \" should be suppressed after segment closed\";\n  }\n\n  // Timestamps >= last_ts should be allowed (monotonicity allows same ts)\n  EXPECT_FALSE(IsSuppressed(logits_[ts_2_00]))\n      << \"Same timestamp should be allowed for next segment opening\";\n}\n\n// =============================================================================\n// Rule 4: Monotonicity - timestamps must not decrease\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, MonotonicityPreventsEarlierTimestamps) {\n  // After <|0.00|> \"text\" - we're in text, last timestamp was 0.00\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t text_token = 500;\n\n  std::vector<int64_t> tokens = {1, 2, 3, ts_0_00, text_token};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Timestamps before ts_0_00 + 1 should be suppressed 
(strictly increasing)\n  // Since last token was text, we require strictly increasing\n  EXPECT_TRUE(IsSuppressed(logits_[ts_0_00]))\n      << \"Same timestamp should be suppressed when not closing segment\";\n\n  // Timestamps after should be allowed\n  EXPECT_FALSE(IsSuppressed(logits_[ts_0_00 + 1]))\n      << \"Next timestamp should be allowed\";\n}\n\nTEST_F(ApplyTimestampRulesTest, MonotonicityAllowsSameTimestampAfterClose) {\n  // After <|0.00|> \"text\" <|2.00|> - segment just closed\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t ts_2_00 = kTimestampBegin + 100;\n  int32_t text_token = 500;\n\n  std::vector<int64_t> tokens = {1, 2, 3, ts_0_00, text_token, ts_2_00};\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Same timestamp should be allowed (for opening next segment)\n  EXPECT_FALSE(IsSuppressed(logits_[ts_2_00]))\n      << \"Same timestamp allowed when segment just closed\";\n\n  // Earlier timestamps should still be suppressed\n  EXPECT_TRUE(IsSuppressed(logits_[ts_2_00 - 1]))\n      << \"Earlier timestamps should be suppressed\";\n}\n\n// =============================================================================\n// Rule 6: Probability rule - force timestamp when sum > max text\n// =============================================================================\n\nTEST_F(ApplyTimestampRulesTest, ProbabilityRuleForcesTimestamp) {\n  // Set up: we're in text (last token was not timestamp)\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t text_token = 500;\n\n  std::vector<int64_t> tokens = {1, 2, 3, ts_0_00, text_token};\n\n  // Give timestamps high logits, text tokens low logits\n  for (int32_t i = 0; i < kTimestampBegin; ++i) {\n    logits_[i] = -10.0f;\n  }\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    logits_[i] = 0.0f;  // After logsumexp, this will dominate\n  }\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, 
kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Text tokens should be suppressed due to probability rule\n  for (int32_t i = 0; i < kTimestampBegin; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]))\n        << \"Text token \" << i << \" should be suppressed by probability rule\";\n  }\n}\n\nTEST_F(ApplyTimestampRulesTest, ProbabilityRuleDoesNotApplyWhenTextDominates) {\n  // Set up: we're in text, but text logits are higher\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t text_token = 500;\n\n  std::vector<int64_t> tokens = {1, 2, 3, ts_0_00, text_token};\n\n  // Give text tokens high logits, timestamps low\n  for (int32_t i = 0; i < kTimestampBegin; ++i) {\n    logits_[i] = 0.0f;\n  }\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    logits_[i] = -100.0f;\n  }\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Text tokens should NOT all be suppressed\n  int32_t text_allowed = CountNonSuppressed(logits_.data(), 0, kTimestampBegin);\n  EXPECT_GT(text_allowed, 0)\n      << \"Text tokens should be allowed when they dominate\";\n}\n\nTEST_F(ApplyTimestampRulesTest, ProbabilityRuleSkippedAfterTimestamp) {\n  // After timestamp, probability rule doesn't apply\n  std::vector<int64_t> tokens = {1, 2, 3, kTimestampBegin};\n\n  // Even with high timestamp logits, the pairing rule takes precedence\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    logits_[i] = 100.0f;\n  }\n\n  ApplyTimestampRules(logits_.data(), kVocabSize, tokens, kSampleBegin,\n                      kTimestampBegin, kNoTimestamps, kEot, 50);\n\n  // Timestamps should be suppressed (pairing rule), not text\n  for (int32_t i = kTimestampBegin; i < kVocabSize; ++i) {\n    EXPECT_TRUE(IsSuppressed(logits_[i]));\n  }\n}\n\n// =============================================================================\n// ParseTimestampTokens tests\n// 
=============================================================================\n\nclass ParseTimestampTokensTest : public ::testing::Test {};\n\nTEST_F(ParseTimestampTokensTest, BasicSingleSegment) {\n  // <|0.00|> \"hello\" <|2.00|> EOT\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t ts_2_00 = kTimestampBegin + 100;\n  std::vector<int32_t> tokens = {ts_0_00, 100, 200, 300, ts_2_00, kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 1);\n  EXPECT_FLOAT_EQ(segments[0].start_time, 0.0f);\n  EXPECT_FLOAT_EQ(segments[0].end_time, 2.0f);\n  ASSERT_EQ(segments[0].token_ids.size(), 3);\n  EXPECT_EQ(segments[0].token_ids[0], 100);\n  EXPECT_EQ(segments[0].token_ids[1], 200);\n  EXPECT_EQ(segments[0].token_ids[2], 300);\n}\n\nTEST_F(ParseTimestampTokensTest, MultipleSegments) {\n  // <|0.00|> \"hi\" <|1.00|><|1.00|> \"bye\" <|2.00|> EOT\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t ts_1_00 = kTimestampBegin + 50;\n  int32_t ts_2_00 = kTimestampBegin + 100;\n  std::vector<int32_t> tokens = {ts_0_00, 100,     ts_1_00, ts_1_00,\n                                 200,     ts_2_00, kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 2);\n\n  EXPECT_FLOAT_EQ(segments[0].start_time, 0.0f);\n  EXPECT_FLOAT_EQ(segments[0].end_time, 1.0f);\n  ASSERT_EQ(segments[0].token_ids.size(), 1);\n  EXPECT_EQ(segments[0].token_ids[0], 100);\n\n  EXPECT_FLOAT_EQ(segments[1].start_time, 1.0f);\n  EXPECT_FLOAT_EQ(segments[1].end_time, 2.0f);\n  ASSERT_EQ(segments[1].token_ids.size(), 1);\n  EXPECT_EQ(segments[1].token_ids[0], 200);\n}\n\nTEST_F(ParseTimestampTokensTest, EotClosesOpenSegment) {\n  // <|0.00|> \"hello\" EOT (no closing timestamp)\n  int32_t ts_0_00 = kTimestampBegin;\n  std::vector<int32_t> tokens = {ts_0_00, 100, 200, kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 1);\n  
EXPECT_FLOAT_EQ(segments[0].start_time, 0.0f);\n  // EOT closes the segment without a closing timestamp, so end_time is sentinel\n  EXPECT_FLOAT_EQ(segments[0].end_time, -1.0f);\n  ASSERT_EQ(segments[0].token_ids.size(), 2);\n  EXPECT_EQ(segments[0].token_ids[0], 100);\n  EXPECT_EQ(segments[0].token_ids[1], 200);\n}\n\nTEST_F(ParseTimestampTokensTest, EmptySegmentSkipped) {\n  // <|0.00|><|1.00|><|1.00|> \"text\" <|2.00|> EOT\n  // The first \"segment\" between 0.00 and 1.00 has no text, should be skipped\n  int32_t ts_0_00 = kTimestampBegin;\n  int32_t ts_1_00 = kTimestampBegin + 50;\n  int32_t ts_2_00 = kTimestampBegin + 100;\n  std::vector<int32_t> tokens = {ts_0_00, ts_1_00, ts_1_00, 100, ts_2_00, kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 1);\n  EXPECT_FLOAT_EQ(segments[0].start_time, 1.0f);\n  EXPECT_FLOAT_EQ(segments[0].end_time, 2.0f);\n}\n\nTEST_F(ParseTimestampTokensTest, IncompleteSegmentGetsSentinel) {\n  // <|0.00|> \"hello\" (no closing timestamp, no EOT)\n  int32_t ts_0_00 = kTimestampBegin;\n  std::vector<int32_t> tokens = {ts_0_00, 100, 200};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 1);\n  EXPECT_FLOAT_EQ(segments[0].start_time, 0.0f);\n  EXPECT_FLOAT_EQ(segments[0].end_time, -1.0f);  // Sentinel for incomplete\n  ASSERT_EQ(segments[0].token_ids.size(), 2);\n}\n\nTEST_F(ParseTimestampTokensTest, SentinelConsistencyBetweenEotAndIncomplete) {\n  // Verify that both EOT-closed and incomplete segments use the same sentinel\n  // This ensures consistent handling by downstream code\n\n  // Case 1: EOT-closed segment (no closing timestamp before EOT)\n  int32_t ts_1_00 = kTimestampBegin + 50;\n  std::vector<int32_t> tokens_eot = {ts_1_00, 100, kEot};\n  auto segments_eot = ParseTimestampTokens(tokens_eot, kTimestampBegin, kEot);\n\n  // Case 2: Incomplete segment (tokens end without closing timestamp or EOT)\n  
std::vector<int32_t> tokens_incomplete = {ts_1_00, 100};\n  auto segments_incomplete =\n      ParseTimestampTokens(tokens_incomplete, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments_eot.size(), 1);\n  ASSERT_EQ(segments_incomplete.size(), 1);\n\n  // Both should have the same start_time\n  EXPECT_FLOAT_EQ(segments_eot[0].start_time, 1.0f);\n  EXPECT_FLOAT_EQ(segments_incomplete[0].start_time, 1.0f);\n\n  // Both should use the same sentinel value for end_time\n  EXPECT_FLOAT_EQ(segments_eot[0].end_time, -1.0f);\n  EXPECT_FLOAT_EQ(segments_incomplete[0].end_time, -1.0f);\n  EXPECT_FLOAT_EQ(segments_eot[0].end_time, segments_incomplete[0].end_time)\n      << \"EOT-closed and incomplete segments must use the same sentinel\";\n}\n\nTEST_F(ParseTimestampTokensTest, NoSegmentsFromEmptyInput) {\n  std::vector<int32_t> tokens = {};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  EXPECT_EQ(segments.size(), 0);\n}\n\nTEST_F(ParseTimestampTokensTest, OnlyEot) {\n  std::vector<int32_t> tokens = {kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  EXPECT_EQ(segments.size(), 0);\n}\n\nTEST_F(ParseTimestampTokensTest, TextBeforeFirstTimestampIgnored) {\n  // Text tokens before any timestamp should be ignored\n  int32_t ts_1_00 = kTimestampBegin + 50;\n  int32_t ts_2_00 = kTimestampBegin + 100;\n  std::vector<int32_t> tokens = {100, 200, ts_1_00, 300, ts_2_00, kEot};\n\n  auto segments = ParseTimestampTokens(tokens, kTimestampBegin, kEot);\n\n  ASSERT_EQ(segments.size(), 1);\n  EXPECT_FLOAT_EQ(segments[0].start_time, 1.0f);\n  EXPECT_FLOAT_EQ(segments[0].end_time, 2.0f);\n  ASSERT_EQ(segments[0].token_ids.size(), 1);\n  EXPECT_EQ(segments[0].token_ids[0], 300);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-timestamp-rules.cc",
    "content": "// sherpa-onnx/csrc/offline-whisper-timestamp-rules.cc\n//\n// Copyright (c)  2026  Posit Software, PBC\n\n#include \"sherpa-onnx/csrc/offline-whisper-timestamp-rules.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <limits>\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nconstexpr float kNegInf = -std::numeric_limits<float>::infinity();\n\n// =============================================================================\n// Step 1: State Determination\n// =============================================================================\n\n// Mutually exclusive decoding states\n// The expected token pattern is:\n//   <|0.00|> text text <|6.60|><|6.60|> text text <|12.00|> EOT\nenum class TimestampDecodingState {\n  kStart,           // num_sampled == 0: first token must be timestamp\n  kAfterOpeningTs,  // last=TS, penult=TS: after opening or double TS, force\n                    // text\n  kSegmentClosing,  // last=TS, penult=text: segment just closed, force TS/EOT\n  kInText           // last=text: in text, probability rule may apply\n};\n\n// Raw information extracted from the token sequence\nstruct TokenSequenceInfo {\n  int32_t num_sampled;      // tokens sampled so far (excluding SOT sequence)\n  bool last_was_timestamp;  // was the last token a timestamp?\n  bool penultimate_was_timestamp;  // was the second-to-last token a timestamp?\n  int32_t last_ts;                 // last timestamp token ID (-1 if none)\n};\n\n// Extract information from the token sequence\nTokenSequenceInfo ExtractTokenSequenceInfo(const std::vector<int64_t> &tokens,\n                                           int32_t sample_begin,\n                                           int32_t timestamp_begin) {\n  TokenSequenceInfo info;\n  info.num_sampled = static_cast<int32_t>(tokens.size()) - sample_begin;\n  info.last_was_timestamp =\n      info.num_sampled >= 1 && tokens.back() >= timestamp_begin;\n  // IMPORTANT: 
penultimate defaults to TRUE when len < 2\n  // This matches OpenAI's behavior and ensures text follows the first timestamp\n  info.penultimate_was_timestamp =\n      info.num_sampled < 2 || tokens[tokens.size() - 2] >= timestamp_begin;\n\n  info.last_ts = -1;\n  // Find the last timestamp in the sequence (for monotonicity)\n  for (int32_t i = sample_begin; i < static_cast<int32_t>(tokens.size()); ++i) {\n    if (tokens[i] >= timestamp_begin) {\n      info.last_ts = static_cast<int32_t>(tokens[i]);\n    }\n  }\n\n  return info;\n}\n\n// Map raw token info to a mutually exclusive state\nTimestampDecodingState DetermineDecodingState(const TokenSequenceInfo &info) {\n  if (info.num_sampled == 0) {\n    return TimestampDecodingState::kStart;\n  }\n  if (info.last_was_timestamp && info.penultimate_was_timestamp) {\n    return TimestampDecodingState::kAfterOpeningTs;\n  }\n  if (info.last_was_timestamp && !info.penultimate_was_timestamp) {\n    return TimestampDecodingState::kSegmentClosing;\n  }\n  return TimestampDecodingState::kInText;\n}\n\n// =============================================================================\n// Step 2: Decision Making\n// =============================================================================\n\n// What actions to take based on the current state\nstruct TimestampDecision {\n  bool suppress_text;        // suppress text tokens\n  bool suppress_timestamps;  // suppress timestamp tokens\n  bool suppress_eot;         // suppress EOT token\n  int32_t min_timestamp;     // minimum allowed timestamp (-1 = no constraint)\n  int32_t max_timestamp;     // maximum allowed timestamp (-1 = no constraint)\n  bool check_probability_rule;  // apply probability rule after other\n                                // suppressions\n};\n\n// Map state to actions - each case must set ALL variables\nTimestampDecision DecideTimestampAction(TimestampDecodingState state,\n                                        const TokenSequenceInfo &info,\n                 
                       int32_t timestamp_begin,\n                                        int32_t max_initial_timestamp_index) {\n  // Declare all decision variables - must be set by every case\n  bool suppress_text;\n  bool suppress_timestamps;\n  bool suppress_eot;\n  int32_t max_timestamp;\n  bool check_probability_rule;\n\n  // Compute monotonicity constraint (cross-cutting concern, used by all cases)\n  int32_t min_timestamp = -1;\n  if (info.last_ts >= 0) {\n    if (state == TimestampDecodingState::kSegmentClosing) {\n      // Same timestamp allowed for next segment opening\n      min_timestamp = info.last_ts;\n    } else {\n      // Strictly increasing timestamps\n      min_timestamp = info.last_ts + 1;\n    }\n  }\n\n  switch (state) {\n    case TimestampDecodingState::kStart:\n      // First token must be a timestamp\n      suppress_text = true;\n      suppress_timestamps = false;\n      suppress_eot = true;\n      max_timestamp = (max_initial_timestamp_index >= 0)\n                          ? 
timestamp_begin + max_initial_timestamp_index\n                          : -1;\n      check_probability_rule = false;\n      break;\n\n    case TimestampDecodingState::kAfterOpeningTs:\n      // After opening timestamp (or double timestamp), force text\n      suppress_text = false;\n      suppress_timestamps = true;\n      suppress_eot = false;\n      max_timestamp = -1;\n      check_probability_rule = false;\n      break;\n\n    case TimestampDecodingState::kSegmentClosing:\n      // Segment just closed, force timestamp or EOT\n      suppress_text = true;\n      suppress_timestamps = false;\n      suppress_eot = false;  // EOT allowed to end transcript\n      max_timestamp = -1;\n      check_probability_rule = false;\n      break;\n\n    case TimestampDecodingState::kInText:\n      // In text, probability rule may force timestamp\n      suppress_text = false;\n      suppress_timestamps = false;\n      suppress_eot = false;\n      max_timestamp = -1;\n      check_probability_rule = true;\n      break;\n  }\n\n  return TimestampDecision{suppress_text, suppress_timestamps,\n                           suppress_eot,  min_timestamp,\n                           max_timestamp, check_probability_rule};\n}\n\n// =============================================================================\n// Step 3: Execution\n// =============================================================================\n\n// Apply the suppression decisions to the logits\nvoid ApplyTimestampDecision(float *logits, int32_t vocab_size,\n                            const TimestampDecision &decision,\n                            int32_t timestamp_begin, int32_t eot) {\n  // Suppress text tokens if needed\n  if (decision.suppress_text) {\n    if (decision.suppress_eot) {\n      // Suppress all text tokens including EOT\n      std::fill(logits, logits + timestamp_begin, kNegInf);\n    } else {\n      // Suppress text tokens but preserve EOT\n      std::fill(logits, logits + eot, kNegInf);\n      
std::fill(logits + eot + 1, logits + timestamp_begin, kNegInf);\n    }\n  }\n\n  // Suppress timestamp tokens if needed\n  if (decision.suppress_timestamps) {\n    std::fill(logits + timestamp_begin, logits + vocab_size, kNegInf);\n  }\n\n  // Apply monotonicity constraint (suppress timestamps below minimum)\n  if (decision.min_timestamp >= 0) {\n    std::fill(logits + timestamp_begin, logits + decision.min_timestamp,\n              kNegInf);\n  }\n\n  // Apply max_initial constraint (suppress timestamps above maximum)\n  if (decision.max_timestamp >= 0) {\n    // Clamp to valid range to avoid out-of-bounds access\n    int32_t safe_max = std::min(decision.max_timestamp, vocab_size - 1);\n    if (safe_max + 1 < vocab_size) {\n      std::fill(logits + safe_max + 1, logits + vocab_size, kNegInf);\n    }\n  }\n}\n\n// Apply the probability rule: if timestamp probability > max text probability,\n// force timestamp. This is the \"sum rule\" from OpenAI's implementation.\nvoid ApplyProbabilityRule(float *logits, int32_t vocab_size,\n                          int32_t timestamp_begin) {\n  // Compute logsumexp of timestamp logits\n  float max_ts_logit =\n      *std::max_element(logits + timestamp_begin, logits + vocab_size);\n  if (max_ts_logit == kNegInf) {\n    return;  // All timestamps suppressed, nothing to do\n  }\n\n  float ts_logsum = 0.0f;\n  for (int32_t i = timestamp_begin; i < vocab_size; ++i) {\n    if (logits[i] > kNegInf) {\n      ts_logsum += std::exp(logits[i] - max_ts_logit);\n    }\n  }\n  ts_logsum = max_ts_logit + std::log(ts_logsum);\n\n  // Find max text logit (including EOT - matches OpenAI behavior)\n  float max_text_logit = *std::max_element(logits, logits + timestamp_begin);\n\n  // If timestamp logsumexp > max text logit, force timestamp\n  if (ts_logsum > max_text_logit) {\n    std::fill(logits, logits + timestamp_begin, kNegInf);\n  }\n}\n\n}  // namespace\n\n// =============================================================================\n// 
Public API\n// =============================================================================\n\nvoid ApplyTimestampRules(float *logits, int32_t vocab_size,\n                         const std::vector<int64_t> &tokens,\n                         int32_t sample_begin, int32_t timestamp_begin,\n                         int32_t no_timestamps, int32_t eot,\n                         int32_t max_initial_timestamp_index) {\n  // Validate parameters\n  assert(logits != nullptr && \"logits must not be null\");\n  assert(vocab_size > 0 && \"vocab_size must be positive\");\n  assert(sample_begin >= 0 && \"sample_begin must be non-negative\");\n  assert(sample_begin <= static_cast<int32_t>(tokens.size()) &&\n         \"sample_begin must not exceed tokens size\");\n  assert(timestamp_begin > 0 && \"timestamp_begin must be positive\");\n  assert(timestamp_begin < vocab_size &&\n         \"timestamp_begin must be less than vocab_size\");\n  assert(eot >= 0 && eot < timestamp_begin &&\n         \"eot must be in range [0, timestamp_begin)\");\n  assert(no_timestamps >= 0 && no_timestamps < vocab_size &&\n         \"no_timestamps must be in range [0, vocab_size)\");\n\n  // Always suppress no_timestamps token\n  logits[no_timestamps] = kNegInf;\n\n  // Step 1: Extract token info and determine state\n  TokenSequenceInfo info =\n      ExtractTokenSequenceInfo(tokens, sample_begin, timestamp_begin);\n  TimestampDecodingState state = DetermineDecodingState(info);\n\n  // Step 2: Map state to actions\n  TimestampDecision decision = DecideTimestampAction(\n      state, info, timestamp_begin, max_initial_timestamp_index);\n\n  // Step 3: Execute the decisions\n  ApplyTimestampDecision(logits, vocab_size, decision, timestamp_begin, eot);\n\n  if (decision.check_probability_rule) {\n    ApplyProbabilityRule(logits, vocab_size, timestamp_begin);\n  }\n}\n\nstd::vector<OfflineWhisperSegment> ParseTimestampTokens(\n    const std::vector<int32_t> &tokens, int32_t timestamp_begin, int32_t eot) {\n  
// Validate parameters\n  assert(timestamp_begin > 0 && \"timestamp_begin must be positive\");\n  assert(eot >= 0 && eot < timestamp_begin &&\n         \"eot must be in range [0, timestamp_begin)\");\n\n  std::vector<OfflineWhisperSegment> segments;\n\n  // Each timestamp token represents 0.02 seconds (20ms)\n  constexpr float kSecondsPerTimestamp = 0.02f;\n\n  OfflineWhisperSegment current_segment;\n  bool in_segment = false;\n\n  for (size_t i = 0; i < tokens.size(); ++i) {\n    int32_t token = tokens[i];\n\n    if (token == eot) {\n      // End of transcript - close any open segment\n      if (in_segment && !current_segment.token_ids.empty()) {\n        current_segment.end_time =\n            -1.0f;  // Use sentinel for EOT-closed segment\n        segments.push_back(std::move(current_segment));\n        current_segment = OfflineWhisperSegment();\n      }\n      break;\n    }\n\n    if (token >= timestamp_begin) {\n      // This is a timestamp token\n      float time = (token - timestamp_begin) * kSecondsPerTimestamp;\n\n      if (!in_segment) {\n        // Start of a new segment\n        current_segment.start_time = time;\n        in_segment = true;\n      } else {\n        // End of current segment\n        current_segment.end_time = time;\n        if (!current_segment.token_ids.empty()) {\n          segments.push_back(std::move(current_segment));\n        }\n        // Start new segment at same timestamp\n        current_segment = OfflineWhisperSegment();\n        current_segment.start_time = time;\n      }\n    } else {\n      // Text token - add to current segment\n      if (in_segment) {\n        current_segment.token_ids.push_back(token);\n      }\n    }\n  }\n\n  // Handle any remaining segment without closing timestamp\n  if (in_segment && !current_segment.token_ids.empty()) {\n    // Use a sentinel value to indicate incomplete segment\n    current_segment.end_time = -1.0f;\n    segments.push_back(std::move(current_segment));\n  }\n\n  return 
segments;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-whisper-timestamp-rules.h",
    "content": "// sherpa-onnx/csrc/offline-whisper-timestamp-rules.h\n//\n// Copyright (c)  2026  Posit Software, PBC\n\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_WHISPER_TIMESTAMP_RULES_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_WHISPER_TIMESTAMP_RULES_H_\n\n#include <cstdint>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-whisper-decoder.h\"\n\nnamespace sherpa_onnx {\n\n// Apply OpenAI Whisper's timestamp token rules to logits\n// Reference: whisper/decoding.py ApplyTimestampRules\n//\n// Parameters:\n//   logits: pointer to logits array of size vocab_size (modified in-place)\n//   vocab_size: size of vocabulary\n//   tokens: all tokens decoded so far (including initial SOT sequence)\n//   sample_begin: index in tokens where actual sampling began (after SOT seq)\n//   timestamp_begin: token ID of first timestamp (<|0.00|>)\n//   no_timestamps: token ID of no_timestamps token\n//   eot: token ID of end-of-transcript\n//   max_initial_timestamp_index: limit for first timestamp (e.g., 50 = 1.0s)\nvoid ApplyTimestampRules(float *logits, int32_t vocab_size,\n                         const std::vector<int64_t> &tokens,\n                         int32_t sample_begin, int32_t timestamp_begin,\n                         int32_t no_timestamps, int32_t eot,\n                         int32_t max_initial_timestamp_index);\n\n// Parse timestamp tokens from decoded sequence and create segments\n// Pattern: <|start_time|> text tokens... <|end_time|>\n//\n// Parameters:\n//   tokens: decoded tokens (text + timestamp tokens interleaved)\n//   timestamp_begin: token ID of first timestamp (<|0.00|>)\n//   eot: token ID of end-of-transcript\n//\n// Returns: vector of segments with start/end times and token IDs\nstd::vector<OfflineWhisperSegment> ParseTimestampTokens(\n    const std::vector<int32_t> &tokens, int32_t timestamp_begin, int32_t eot);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_WHISPER_TIMESTAMP_RULES_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineZipformerAudioTaggingModelConfig::Register(ParseOptions *po) {\n  po->Register(\"zipformer-model\", &model,\n               \"Path to zipformer model for audio tagging\");\n}\n\nbool OfflineZipformerAudioTaggingModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --zipformer-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"--zipformer-model: '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OfflineZipformerAudioTaggingModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineZipformerAudioTaggingModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-zipformer-audio-tagging-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OfflineZipformerAudioTaggingModelConfig {\n  std::string model;\n\n  OfflineZipformerAudioTaggingModelConfig() = default;\n\n  explicit OfflineZipformerAudioTaggingModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.cc",
    "content": "// sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerAudioTaggingModel::Impl {\n public:\n  explicit Impl(const AudioTaggingModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.zipformer.model);\n    Init(buf.data(), buf.size());\n  }\n\n#if __ANDROID_API__ >= 9\n  Impl(AAssetManager *mgr, const AudioTaggingModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.zipformer.model);\n    Init(buf.data(), buf.size());\n  }\n#endif\n\n  Ort::Value Forward(Ort::Value features, Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    auto ans =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(ans[0]);\n  }\n\n  int32_t NumEventClasses() const { return num_event_classes_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    
GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n    }\n\n    // get num_event_classes from the output[0].shape,\n    // which is (N, num_event_classes)\n    num_event_classes_ =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[1];\n  }\n\n private:\n  AudioTaggingModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t num_event_classes_ = 0;\n};\n\nOfflineZipformerAudioTaggingModel::OfflineZipformerAudioTaggingModel(\n    const AudioTaggingModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\n#if __ANDROID_API__ >= 9\nOfflineZipformerAudioTaggingModel::OfflineZipformerAudioTaggingModel(\n    AAssetManager *mgr, const AudioTaggingModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n#endif\n\nOfflineZipformerAudioTaggingModel::~OfflineZipformerAudioTaggingModel() =\n    default;\n\nOrt::Value OfflineZipformerAudioTaggingModel::Forward(\n    Ort::Value features, Ort::Value features_length) const {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineZipformerAudioTaggingModel::NumEventClasses() const {\n  return impl_->NumEventClasses();\n}\n\nOrtAllocator *OfflineZipformerAudioTaggingModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.h",
    "content": "// sherpa-onnx/csrc/offline-zipformer-audio-tagging-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_H_\n#include <memory>\n#include <utility>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/audio-tagging-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the zipformer CTC model of the librispeech recipe\n * from icefall.\n *\n * See\n * https://github.com/k2-fsa/icefall/blob/master/egs/audioset/AT/zipformer/export-onnx.py\n */\nclass OfflineZipformerAudioTaggingModel {\n public:\n  explicit OfflineZipformerAudioTaggingModel(\n      const AudioTaggingModelConfig &config);\n\n#if __ANDROID_API__ >= 9\n  OfflineZipformerAudioTaggingModel(AAssetManager *mgr,\n                                    const AudioTaggingModelConfig &config);\n#endif\n\n  ~OfflineZipformerAudioTaggingModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a tensor\n   *  - probs: A 2-D tensor of shape (N, num_event_classes).\n   */\n  Ort::Value Forward(Ort::Value features, Ort::Value features_length) const;\n\n  /** Return the number of event classes of the model\n   */\n  int32_t NumEventClasses() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_AUDIO_TAGGING_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/offline-zipformer-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-zipformer-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OfflineZipformerCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"zipformer-ctc-model\", &model, \"Path to zipformer CTC model\");\n\n  std::string prefix = \"zipformer-ctc\";\n  ParseOptions p(prefix, po);\n\n  qnn_config.Register(&p);\n}\n\nbool OfflineZipformerCtcModelConfig::Validate() const {\n  if (qnn_config.context_binary.empty()) {\n    if (model.empty()) {\n      SHERPA_ONNX_LOGE(\"Please provide a Zipformer CTC model\");\n      return false;\n    }\n\n    if (!FileExists(model)) {\n      SHERPA_ONNX_LOGE(\"Zipformer CTC model '%s' does not exist\",\n                       model.c_str());\n      return false;\n    }\n  }\n\n  if (model.empty() && !qnn_config.context_binary.empty()) {\n    // we require that the context_binary exists\n    if (!FileExists(qnn_config.context_binary)) {\n      SHERPA_ONNX_LOGE(\n          \"Model is empty, but you provide a context binary that does not \"\n          \"exist\");\n      return false;\n    }\n  }\n\n  if (EndsWith(model, \".so\") || EndsWith(model, \".bin\") ||\n      (model.empty() && !qnn_config.context_binary.empty())) {\n    return qnn_config.Validate();\n  }\n\n  return true;\n}\n\nstd::string OfflineZipformerCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OfflineZipformerCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\"\";\n\n  if (!qnn_config.backend_lib.empty()) {\n    os << \", qnn_config=\" << qnn_config.ToString() << \", \";\n  }\n\n  os << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/offline-zipformer-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/qnn-config.h\"\n\nnamespace sherpa_onnx {\n\n// for\n// https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/zipformer/export-onnx-ctc.py\nstruct OfflineZipformerCtcModelConfig {\n  std::string model;\n  QnnConfig qnn_config;\n\n  OfflineZipformerCtcModelConfig() = default;\n\n  explicit OfflineZipformerCtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/offline-zipformer-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-zipformer-ctc-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerCtcModel::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.zipformer_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.zipformer_ctc.model);\n    Init(buf.data(), buf.size());\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n  int32_t SubsamplingFactor() const { return 4; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void Init(void 
*model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    // get vocab size from the output[0].shape, which is (N, T, vocab_size)\n    vocab_size_ =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[2];\n  }\n\n private:\n  OfflineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t vocab_size_ = 0;\n};\n\nOfflineZipformerCtcModel::OfflineZipformerCtcModel(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineZipformerCtcModel::OfflineZipformerCtcModel(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineZipformerCtcModel::~OfflineZipformerCtcModel() = default;\n\nstd::vector<Ort::Value> OfflineZipformerCtcModel::Forward(\n    Ort::Value features, Ort::Value features_length) {\n  return impl_->Forward(std::move(features), std::move(features_length));\n}\n\nint32_t OfflineZipformerCtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nOrtAllocator *OfflineZipformerCtcModel::Allocator() const {\n  
return impl_->Allocator();\n}\n\nint32_t OfflineZipformerCtcModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineZipformerCtcModel::OfflineZipformerCtcModel(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineZipformerCtcModel::OfflineZipformerCtcModel(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/offline-zipformer-ctc-model.h",
    "content": "// sherpa-onnx/csrc/offline-zipformer-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_H_\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-ctc-model.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements the zipformer CTC model of the librispeech recipe\n * from icefall.\n *\n * See\n * https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/zipformer/export-onnx-ctc.py\n */\nclass OfflineZipformerCtcModel : public OfflineCtcModel {\n public:\n  explicit OfflineZipformerCtcModel(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineZipformerCtcModel(Manager *mgr, const OfflineModelConfig &config);\n\n  ~OfflineZipformerCtcModel() override;\n\n  /** Run the forward method of the model.\n   *\n   * @param features  A tensor of shape (N, T, C).\n   * @param features_length  A 1-D tensor of shape (N,) containing number of\n   *                         valid frames in `features` before padding.\n   *                         Its dtype is int64_t.\n   *\n   * @return Return a vector containing:\n   *  - log_probs: A 3-D tensor of shape (N, T', vocab_size).\n   *  - log_probs_length A 1-D tensor of shape (N,). 
Its dtype is int64_t\n   */\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  Ort::Value features_length) override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  int32_t SubsamplingFactor() const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-cnn-bilstm-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/online-cnn-bilstm-model-meta-data.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_META_DATA_H_\n\nnamespace sherpa_onnx {\n\nstruct OnlineCNNBiLSTMModelMetaData {\n  int32_t comma_id = -1;\n  int32_t period_id = -1;\n  int32_t quest_id = -1;\n\n  int32_t upper_id = -1;\n  int32_t cap_id = -1;\n  int32_t mix_case_id = -1;\n\n  int32_t num_cases = -1;\n  int32_t num_punctuations = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-cnn-bilstm-model.cc",
    "content": "// sherpa-onnx/csrc/online-cnn-bilstm-model.cc\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#include \"sherpa-onnx/csrc/online-cnn-bilstm-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineCNNBiLSTMModel::Impl {\n public:\n  explicit Impl(const OnlinePunctuationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(config_.cnn_bilstm);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlinePunctuationModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    auto buf = ReadFile(mgr, config_.cnn_bilstm);\n    Init(buf.data(), buf.size());\n  }\n\n  std::pair<Ort::Value, Ort::Value> Forward(Ort::Value token_ids,\n                                            Ort::Value valid_ids,\n                                            Ort::Value label_lens) {\n    std::array<Ort::Value, 3> inputs = {\n        std::move(token_ids), std::move(valid_ids), std::move(label_lens)};\n\n    auto ans =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    return {std::move(ans[0]), std::move(ans[1])};\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  const OnlineCNNBiLSTMModelMetaData &GetModelMetadata() const {\n    
return meta_data_;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.comma_id, \"COMMA\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.period_id, \"PERIOD\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.quest_id, \"QUESTION\");\n\n    // assert here, because we will use the constant value\n    assert(meta_data_.comma_id == 1);\n    assert(meta_data_.period_id == 2);\n    assert(meta_data_.quest_id == 3);\n\n    SHERPA_ONNX_READ_META_DATA(meta_data_.upper_id, \"UPPER\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.cap_id, \"CAP\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.mix_case_id, \"MIX_CASE\");\n\n    assert(meta_data_.upper_id == 1);\n    assert(meta_data_.cap_id == 2);\n    assert(meta_data_.mix_case_id == 3);\n\n    // output shape is (T', num_cases)\n    meta_data_.num_cases =\n        sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[1];\n    meta_data_.num_punctuations =\n        sess_->GetOutputTypeInfo(1).GetTensorTypeAndShapeInfo().GetShape()[1];\n  }\n\n private:\n  OnlinePunctuationModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  OnlineCNNBiLSTMModelMetaData meta_data_;\n};\n\nOnlineCNNBiLSTMModel::OnlineCNNBiLSTMModel(\n    
const OnlinePunctuationModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineCNNBiLSTMModel::OnlineCNNBiLSTMModel(\n    Manager *mgr, const OnlinePunctuationModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineCNNBiLSTMModel::~OnlineCNNBiLSTMModel() = default;\n\nstd::pair<Ort::Value, Ort::Value> OnlineCNNBiLSTMModel::Forward(\n    Ort::Value token_ids, Ort::Value valid_ids, Ort::Value label_lens) const {\n  return impl_->Forward(std::move(token_ids), std::move(valid_ids),\n                        std::move(label_lens));\n}\n\nOrtAllocator *OnlineCNNBiLSTMModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nconst OnlineCNNBiLSTMModelMetaData &OnlineCNNBiLSTMModel::GetModelMetadata()\n    const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineCNNBiLSTMModel::OnlineCNNBiLSTMModel(\n    AAssetManager *mgr, const OnlinePunctuationModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineCNNBiLSTMModel::OnlineCNNBiLSTMModel(\n    NativeResourceManager *mgr, const OnlinePunctuationModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-cnn-bilstm-model.h",
    "content": "// sherpa-onnx/csrc/online-cnn-bilstm-model.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_H_\n#include <memory>\n#include <utility>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-cnn-bilstm-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/online-punctuation-model-config.h\"\n\nnamespace sherpa_onnx {\n\n/** This class implements\n *  https://github.com/frankyoujian/Edge-Punct-Casing/blob/main/onnx_decode_sentence.py\n */\nclass OnlineCNNBiLSTMModel {\n public:\n  explicit OnlineCNNBiLSTMModel(const OnlinePunctuationModelConfig &config);\n\n  template <typename Manager>\n  OnlineCNNBiLSTMModel(Manager *mgr,\n                       const OnlinePunctuationModelConfig &config);\n\n  ~OnlineCNNBiLSTMModel();\n\n  /** Run the forward method of the model.\n   *\n   * @param token_ids  A tensor of shape (N, T) of dtype int32.\n   * @param valid_ids  A tensor of shape (N, T) of dtype int32.\n   * @param label_lens A tensor of shape (N) of dtype int32.\n   *\n   * @return Return a pair of tensors\n   *  - case_logits:  A 2-D tensor of shape (T', num_cases).\n   *  - punct_logits: A 2-D tensor of shape (T', num_puncts).\n   */\n  std::pair<Ort::Value, Ort::Value> Forward(Ort::Value token_ids,\n                                            Ort::Value valid_ids,\n                                            Ort::Value label_lens) const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  const OnlineCNNBiLSTMModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CNN_BILSTM_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-conformer-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-conformer-transducer-model.cc\n//\n// Copyright (c)  2023 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#include \"sherpa-onnx/csrc/online-conformer-transducer-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nOnlineConformerTransducerModel::OnlineConformerTransducerModel(\n    const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\ntemplate <typename Manager>\nOnlineConformerTransducerModel::OnlineConformerTransducerModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(mgr, config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.decoder);\n    InitDecoder(buf.data(), 
buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\nvoid OnlineConformerTransducerModel::InitEncoder(void *model_data,\n                                                 size_t model_data_length) {\n  encoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                &encoder_input_names_ptr_);\n\n  GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                 &encoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---encoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(num_encoder_layers_, \"num_encoder_layers\");\n  SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n  SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n  SHERPA_ONNX_READ_META_DATA(left_context_, \"left_context\");\n  SHERPA_ONNX_READ_META_DATA(encoder_dim_, \"encoder_dim\");\n  SHERPA_ONNX_READ_META_DATA(pad_length_, \"pad_length\");\n  SHERPA_ONNX_READ_META_DATA(cnn_module_kernel_, \"cnn_module_kernel\");\n}\n\nvoid OnlineConformerTransducerModel::InitDecoder(void *model_data,\n                                                 size_t model_data_length) {\n  decoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                &decoder_input_names_ptr_);\n\n  GetOutputNames(decoder_sess_.get(), 
&decoder_output_names_,\n                 &decoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---decoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n}\n\nvoid OnlineConformerTransducerModel::InitJoiner(void *model_data,\n                                                size_t model_data_length) {\n  joiner_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                model_data_length, sess_opts_);\n\n  GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                &joiner_input_names_ptr_);\n\n  GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                 &joiner_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---joiner---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n}\n\nstd::vector<Ort::Value> OnlineConformerTransducerModel::StackStates(\n    const std::vector<std::vector<Ort::Value>> &states) const {\n  int32_t batch_size = static_cast<int32_t>(states.size());\n\n  std::vector<const Ort::Value *> attn_vec(batch_size);\n  std::vector<const Ort::Value *> conv_vec(batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    assert(states[i].size() == 2);\n    attn_vec[i] = &states[i][0];\n    conv_vec[i] = &states[i][1];\n  }\n\n  auto allocator =\n      const_cast<OnlineConformerTransducerModel *>(this)->allocator_;\n\n  Ort::Value attn = 
Cat(allocator, attn_vec, 2);\n  Ort::Value conv = Cat(allocator, conv_vec, 2);\n\n  std::vector<Ort::Value> ans;\n  ans.reserve(2);\n  ans.push_back(std::move(attn));\n  ans.push_back(std::move(conv));\n\n  return ans;\n}\n\nstd::vector<std::vector<Ort::Value>>\nOnlineConformerTransducerModel::UnStackStates(\n    const std::vector<Ort::Value> &states) const {\n  const int32_t batch_size =\n      states[0].GetTensorTypeAndShapeInfo().GetShape()[2];\n  assert(states.size() == 2);\n\n  std::vector<std::vector<Ort::Value>> ans(batch_size);\n\n  auto allocator =\n      const_cast<OnlineConformerTransducerModel *>(this)->allocator_;\n\n  std::vector<Ort::Value> attn_vec = Unbind(allocator, &states[0], 2);\n  std::vector<Ort::Value> conv_vec = Unbind(allocator, &states[1], 2);\n\n  assert(attn_vec.size() == batch_size);\n  assert(conv_vec.size() == batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    ans[i].push_back(std::move(attn_vec[i]));\n    ans[i].push_back(std::move(conv_vec[i]));\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value> OnlineConformerTransducerModel::GetEncoderInitStates() {\n  // Please see\n  // https://github.com/k2-fsa/icefall/blob/86b0db6eb9c84d9bc90a71d92774fe2a7f73e6ab/egs/librispeech/ASR/pruned_transducer_stateless5/conformer.py#L203\n  // for details\n  constexpr int32_t kBatchSize = 1;\n  std::array<int64_t, 4> h_shape{num_encoder_layers_, left_context_, kBatchSize,\n                                 encoder_dim_};\n  Ort::Value h = Ort::Value::CreateTensor<float>(allocator_, h_shape.data(),\n                                                 h_shape.size());\n\n  Fill<float>(&h, 0);\n\n  std::array<int64_t, 4> c_shape{num_encoder_layers_, cnn_module_kernel_ - 1,\n                                 kBatchSize, encoder_dim_};\n\n  Ort::Value c = Ort::Value::CreateTensor<float>(allocator_, c_shape.data(),\n                                                 c_shape.size());\n\n  Fill<float>(&c, 0);\n\n  std::vector<Ort::Value> states;\n\n 
 states.reserve(2);\n  states.push_back(std::move(h));\n  states.push_back(std::move(c));\n\n  return states;\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineConformerTransducerModel::RunEncoder(Ort::Value features,\n                                           std::vector<Ort::Value> states,\n                                           Ort::Value processed_frames) {\n  std::array<Ort::Value, 4> encoder_inputs = {\n      std::move(features), std::move(states[0]), std::move(states[1]),\n      std::move(processed_frames)};\n\n  auto encoder_out = encoder_sess_->Run(\n      {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n      encoder_inputs.size(), encoder_output_names_ptr_.data(),\n      encoder_output_names_ptr_.size());\n\n  std::vector<Ort::Value> next_states;\n  next_states.reserve(2);\n  next_states.push_back(std::move(encoder_out[1]));\n  next_states.push_back(std::move(encoder_out[2]));\n\n  return {std::move(encoder_out[0]), std::move(next_states)};\n}\n\nOrt::Value OnlineConformerTransducerModel::RunDecoder(\n    Ort::Value decoder_input) {\n  auto decoder_out = decoder_sess_->Run(\n      {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n      decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n  return std::move(decoder_out[0]);\n}\n\nOrt::Value OnlineConformerTransducerModel::RunJoiner(Ort::Value encoder_out,\n                                                     Ort::Value decoder_out) {\n  std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                            std::move(decoder_out)};\n  auto logit =\n      joiner_sess_->Run({}, joiner_input_names_ptr_.data(), joiner_input.data(),\n                        joiner_input.size(), joiner_output_names_ptr_.data(),\n                        joiner_output_names_ptr_.size());\n\n  return std::move(logit[0]);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineConformerTransducerModel::OnlineConformerTransducerModel(\n    AAssetManager 
*mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineConformerTransducerModel::OnlineConformerTransducerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-conformer-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-conformer-transducer-model.h\n//\n// Copyright (c) 2023 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CONFORMER_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CONFORMER_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineConformerTransducerModel : public OnlineTransducerModel {\n public:\n  explicit OnlineConformerTransducerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineConformerTransducerModel(Manager *mgr, const OnlineModelConfig &config);\n\n  std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const override;\n\n  std::vector<Ort::Value> GetEncoderInitStates() override;\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) override;\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) override;\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) override;\n\n  int32_t ContextSize() const override { return context_size_; }\n\n  int32_t ChunkSize() const override { return T_; }\n\n  int32_t ChunkShift() const override { return decode_chunk_len_; }\n\n  int32_t VocabSize() const override { return vocab_size_; }\n  OrtAllocator *Allocator() override { return allocator_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length);\n  void InitDecoder(void *model_data, size_t model_data_length);\n  void InitJoiner(void *model_data, size_t model_data_length);\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions 
sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  OnlineModelConfig config_;\n\n  int32_t num_encoder_layers_ = 0;\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n  int32_t cnn_module_kernel_ = 0;\n  int32_t context_size_ = 0;\n  int32_t left_context_ = 0;\n  // TODO(jingzhaoou): to retrieve from model metadata\n  int32_t right_context_ = 4;\n  int32_t encoder_dim_ = 0;\n  int32_t pad_length_ = 0;\n  int32_t vocab_size_ = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CONFORMER_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-decoder.h",
    "content": "// sherpa-onnx/csrc/online-ctc-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CTC_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CTC_DECODER_H_\n\n#include <cstdint>\n#include <memory>\n#include <vector>\n\n#include \"kaldi-decoder/csrc/faster-decoder.h\"\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass OnlineStream;\n\nstruct OnlineCtcDecoderResult {\n  /// Number of frames after subsampling we have decoded so far\n  int32_t frame_offset = 0;\n\n  /// The decoded token IDs\n  std::vector<int64_t> tokens;\n\n  /// The decoded word IDs\n  /// Note: tokens.size() is usually not equal to words.size()\n  /// words is empty for greedy search decoding.\n  /// It is not empty when an HL graph or an HLG graph is used.\n  std::vector<int32_t> words;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  /// Note: The index is after subsampling\n  ///\n  /// tokens.size() == timestamps.size()\n  std::vector<int32_t> timestamps;\n\n  int32_t num_trailing_blanks = 0;\n};\n\nclass OnlineCtcDecoder {\n public:\n  virtual ~OnlineCtcDecoder() = default;\n\n  /** Run streaming CTC decoding given the output from the encoder model.\n   *\n   * @param log_probs A 3-D tensor of shape\n   *                  (batch_size, num_frames, vocab_size) containing\n   *                  log_probs in row-major order.\n   *\n   * @param results Input & output parameter.\n   */\n  virtual void Decode(const float *log_probs, int32_t batch_size,\n                      int32_t num_frames, int32_t vocab_size,\n                      std::vector<OnlineCtcDecoderResult> *results,\n                      OnlineStream **ss = nullptr, int32_t n = 0) = 0;\n\n  virtual std::unique_ptr<kaldi_decoder::FasterDecoder> CreateFasterDecoder()\n      const {\n    return nullptr;\n  }\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CTC_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-fst-decoder-config.cc",
    "content": "// sherpa-onnx/csrc/online-ctc-fst-decoder-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstd::string OnlineCtcFstDecoderConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineCtcFstDecoderConfig(\";\n  os << \"graph=\\\"\" << graph << \"\\\", \";\n  os << \"max_active=\" << max_active << \")\";\n\n  return os.str();\n}\n\nvoid OnlineCtcFstDecoderConfig::Register(ParseOptions *po) {\n  po->Register(\"ctc-graph\", &graph, \"Path to H.fst, HL.fst, or HLG.fst\");\n\n  po->Register(\"ctc-max-active\", &max_active,\n               \"Decoder max active states.  Larger->slower; more accurate\");\n}\n\nbool OnlineCtcFstDecoderConfig::Validate() const {\n  if (!graph.empty() && !FileExists(graph)) {\n    SHERPA_ONNX_LOGE(\"graph: '%s' does not exist\", graph.c_str());\n    return false;\n  }\n  return true;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-fst-decoder-config.h",
    "content": "// sherpa-onnx/csrc/online-ctc-fst-decoder-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineCtcFstDecoderConfig {\n  // Path to H.fst, HL.fst or HLG.fst\n  std::string graph;\n  int32_t max_active = 3000;\n\n  OnlineCtcFstDecoderConfig() = default;\n\n  OnlineCtcFstDecoderConfig(const std::string &graph, int32_t max_active)\n      : graph(graph), max_active(max_active) {}\n\n  std::string ToString() const;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-fst-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-ctc-fst-decoder.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"fst/fstlib.h\"\n#include \"kaldi-decoder/csrc/decodable-ctc.h\"\n#include \"kaldifst/csrc/fstext-utils.h\"\n#include \"sherpa-onnx/csrc/fst-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\nnamespace sherpa_onnx {\n\nOnlineCtcFstDecoder::OnlineCtcFstDecoder(\n    const OnlineCtcFstDecoderConfig &config, int32_t blank_id)\n    : config_(config), fst_(ReadGraph(config.graph)), blank_id_(blank_id) {\n  options_.max_active = config_.max_active;\n}\n\nstd::unique_ptr<kaldi_decoder::FasterDecoder>\nOnlineCtcFstDecoder::CreateFasterDecoder() const {\n  return std::make_unique<kaldi_decoder::FasterDecoder>(*fst_, options_);\n}\n\nstatic void DecodeOne(const float *log_probs, int32_t num_rows,\n                      int32_t num_cols, OnlineCtcDecoderResult *result,\n                      OnlineStream *s, int32_t blank_id) {\n  int32_t &processed_frames = s->GetFasterDecoderProcessedFrames();\n  kaldi_decoder::DecodableCtc decodable(log_probs, num_rows, num_cols,\n                                        processed_frames);\n\n  kaldi_decoder::FasterDecoder *decoder = s->GetFasterDecoder();\n  if (processed_frames == 0) {\n    decoder->InitDecoding();\n  }\n\n  decoder->AdvanceDecoding(&decodable);\n\n  if (decoder->ReachedFinal()) {\n    fst::VectorFst<fst::LatticeArc> fst_out;\n    bool ok = decoder->GetBestPath(&fst_out);\n    if (ok) {\n      std::vector<int32_t> isymbols_out;\n      std::vector<int32_t> osymbols_out;\n      /*ok =*/fst::GetLinearSymbolSequence(fst_out, &isymbols_out,\n                                           &osymbols_out, nullptr);\n      // TODO(fangjun): handle ok is false\n      std::vector<int64_t> tokens;\n      
tokens.reserve(isymbols_out.size());\n\n      std::vector<int32_t> timestamps;\n      timestamps.reserve(isymbols_out.size());\n\n      int32_t prev_id = -1;\n      int32_t &num_trailing_blanks = result->num_trailing_blanks;\n      int32_t f = 0;  // frame number\n\n      for (auto i : isymbols_out) {\n        i -= 1;\n\n        if (i == blank_id) {\n          num_trailing_blanks += 1;\n        } else {\n          num_trailing_blanks = 0;\n        }\n\n        if (i != blank_id && i != prev_id) {\n          tokens.push_back(i);\n          timestamps.push_back(f);\n        }\n        prev_id = i;\n        f += 1;\n      }\n\n      result->tokens = std::move(tokens);\n      result->words = std::move(osymbols_out);\n      result->timestamps = std::move(timestamps);\n      // no need to set frame_offset\n    }\n  }\n\n  processed_frames += num_rows;\n}\n\nvoid OnlineCtcFstDecoder::Decode(const float *log_probs, int32_t batch_size,\n                                 int32_t num_frames, int32_t vocab_size,\n                                 std::vector<OnlineCtcDecoderResult> *results,\n                                 OnlineStream **ss, int32_t n) {\n  if (batch_size != results->size()) {\n    SHERPA_ONNX_LOGE(\"Size mismatch! log_probs.size(0) %d, results.size(0): %d\",\n                     batch_size, static_cast<int32_t>(results->size()));\n    exit(-1);\n  }\n\n  if (batch_size != n) {\n    SHERPA_ONNX_LOGE(\"Size mismatch! log_probs.size(0) %d, n: %d\", batch_size,\n                     n);\n    exit(-1);\n  }\n\n  const float *p = log_probs;\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    DecodeOne(p + i * num_frames * vocab_size, num_frames, vocab_size,\n              &(*results)[i], ss[i], blank_id_);\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-fst-decoder.h",
    "content": "// sherpa-onnx/csrc/online-ctc-fst-decoder.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_H_\n\n#include <memory>\n#include <vector>\n\n#include \"fst/fst.h\"\n#include \"sherpa-onnx/csrc/online-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineCtcFstDecoder : public OnlineCtcDecoder {\n public:\n  OnlineCtcFstDecoder(const OnlineCtcFstDecoderConfig &config,\n                      int32_t blank_id);\n\n  void Decode(const float *log_probs, int32_t batch_size, int32_t num_frames,\n              int32_t vocab_size, std::vector<OnlineCtcDecoderResult> *results,\n              OnlineStream **ss = nullptr, int32_t n = 0) override;\n\n  std::unique_ptr<kaldi_decoder::FasterDecoder> CreateFasterDecoder()\n      const override;\n\n private:\n  OnlineCtcFstDecoderConfig config_;\n  kaldi_decoder::FasterDecoderOptions options_;\n\n  std::unique_ptr<fst::Fst<fst::StdArc>> fst_;\n  int32_t blank_id_ = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CTC_FST_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-ctc-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-ctc-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineCtcGreedySearchDecoder::Decode(\n    const float *log_probs, int32_t batch_size, int32_t num_frames,\n    int32_t vocab_size, std::vector<OnlineCtcDecoderResult> *results,\n    OnlineStream ** /*ss=nullptr*/, int32_t /*n = 0*/) {\n  if (batch_size != results->size()) {\n    SHERPA_ONNX_LOGE(\"Size mismatch! log_probs.size(0) %d, results.size(0): %d\",\n                     batch_size, static_cast<int32_t>(results->size()));\n    exit(-1);\n  }\n\n  const float *p = log_probs;\n\n  for (int32_t b = 0; b != batch_size; ++b) {\n    auto &r = (*results)[b];\n\n    int32_t prev_id = -1;\n    if (!r.tokens.empty()) {\n      if (r.num_trailing_blanks > 0) {\n        prev_id = blank_id_;\n      } else {\n        prev_id = r.tokens.back();\n      }\n    }\n\n    for (int32_t t = 0; t != num_frames; ++t, p += vocab_size) {\n      int32_t y = static_cast<int32_t>(std::distance(\n          static_cast<const float *>(p),\n          std::max_element(static_cast<const float *>(p),\n                           static_cast<const float *>(p) + vocab_size)));\n\n      if (y == blank_id_) {\n        r.num_trailing_blanks += 1;\n      } else {\n        r.num_trailing_blanks = 0;\n      }\n\n      if (y != blank_id_ && y != prev_id) {\n        r.tokens.push_back(y);\n        r.timestamps.push_back(t + r.frame_offset);\n      }\n\n      prev_id = y;\n    }  // for (int32_t t = 0; t != num_frames; ++t) {\n  }    // for (int32_t b = 0; b != batch_size; ++b)\n\n  // Update frame_offset\n  for (auto &r : *results) {\n    r.frame_offset += num_frames;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/online-ctc-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CTC_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CTC_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-ctc-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineCtcGreedySearchDecoder : public OnlineCtcDecoder {\n public:\n  explicit OnlineCtcGreedySearchDecoder(int32_t blank_id)\n      : blank_id_(blank_id) {}\n\n  void Decode(const float *log_probs, int32_t batch_size, int32_t num_frames,\n              int32_t vocab_size, std::vector<OnlineCtcDecoderResult> *results,\n              OnlineStream **ss = nullptr, int32_t n = 0) override;\n\n private:\n  int32_t blank_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CTC_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/online-ctc-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-nemo-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-t-one-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-wenet-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-zipformer2-ctc-model.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OnlineCtcModel> OnlineCtcModel::Create(\n    const OnlineModelConfig &config) {\n  if (!config.wenet_ctc.model.empty()) {\n    return std::make_unique<OnlineWenetCtcModel>(config);\n  } else if (!config.zipformer2_ctc.model.empty()) {\n    return std::make_unique<OnlineZipformer2CtcModel>(config);\n  } else if (!config.nemo_ctc.model.empty()) {\n    return std::make_unique<OnlineNeMoCtcModel>(config);\n  } else if (!config.t_one_ctc.model.empty()) {\n    return std::make_unique<OnlineToneCtcModel>(config);\n  } else {\n    SHERPA_ONNX_LOGE(\"Please specify a CTC model\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OnlineCtcModel> OnlineCtcModel::Create(\n    Manager *mgr, const OnlineModelConfig &config) {\n  if (!config.wenet_ctc.model.empty()) {\n    return std::make_unique<OnlineWenetCtcModel>(mgr, config);\n  } else if (!config.zipformer2_ctc.model.empty()) {\n    return std::make_unique<OnlineZipformer2CtcModel>(mgr, config);\n  } else if (!config.nemo_ctc.model.empty()) {\n    return std::make_unique<OnlineNeMoCtcModel>(mgr, config);\n  } else if (!config.t_one_ctc.model.empty()) {\n    return std::make_unique<OnlineToneCtcModel>(mgr, config);\n  } 
else {\n    SHERPA_ONNX_LOGE(\"Please specify a CTC model\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OnlineCtcModel> OnlineCtcModel::Create(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OnlineCtcModel> OnlineCtcModel::Create(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ctc-model.h",
    "content": "// sherpa-onnx/csrc/online-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_CTC_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineCtcModel {\n public:\n  virtual ~OnlineCtcModel() = default;\n\n  static std::unique_ptr<OnlineCtcModel> Create(\n      const OnlineModelConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OnlineCtcModel> Create(\n      Manager *mgr, const OnlineModelConfig &config);\n\n  // Return a list of tensors containing the initial states\n  virtual std::vector<Ort::Value> GetInitStates() const = 0;\n\n  /** Stack a list of individual states into a batch.\n   *\n   * It is the inverse operation of `UnStackStates`.\n   *\n   * @param states states[i] contains the state for the i-th utterance.\n   * @return Return a single value representing the batched state.\n   */\n  virtual std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const = 0;\n\n  /** Unstack a batch state into a list of individual states.\n   *\n   * It is the inverse operation of `StackStates`.\n   *\n   * @param states A batched state.\n   * @return ans[i] contains the state for the i-th utterance.\n   */\n  virtual std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const = 0;\n\n  /**\n   *\n   * @param x A 3-D tensor of shape (N, T, C). 
N has to be 1.\n   * @param states  It is from GetInitStates() or returned from this method.\n   *\n   * @return Return a list of tensors\n   *    - ans[0] contains log_probs, of shape (N, T, C)\n   *    - ans[1:] contains next_states\n   */\n  virtual std::vector<Ort::Value> Forward(\n      Ort::Value x, std::vector<Ort::Value> states) const = 0;\n\n  /** Return the vocabulary size of the model\n   */\n  virtual int32_t VocabSize() const = 0;\n\n  /** Return an allocator for allocating memory\n   */\n  virtual OrtAllocator *Allocator() const = 0;\n\n  // The model accepts this number of frames before subsampling as input\n  virtual int32_t ChunkLength() const = 0;\n\n  // Similar to frame_shift in feature extractor, after processing\n  // ChunkLength() frames, we advance by ChunkShift() frames\n  // before we process the next chunk.\n  virtual int32_t ChunkShift() const = 0;\n\n  // Return true if the model supports batch size > 1\n  virtual bool SupportBatchProcessing() const { return true; }\n\n  virtual bool UseWhisperFeature() const { return false; }\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ebranchformer-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-ebranchformer-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n//                2025  Brno University of Technology (author: Karel Vesely)\n\n#include \"sherpa-onnx/csrc/online-ebranchformer-transducer-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <memory>\n#include <numeric>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nOnlineEbranchformerTransducerModel::OnlineEbranchformerTransducerModel(\n    const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      encoder_sess_opts_(GetSessionOptions(config)),\n      decoder_sess_opts_(GetSessionOptions(config, \"decoder\")),\n      joiner_sess_opts_(GetSessionOptions(config, \"joiner\")),\n      config_(config),\n      allocator_{} {\n  {\n    auto buf = ReadFile(config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\ntemplate <typename Manager>\nOnlineEbranchformerTransducerModel::OnlineEbranchformerTransducerModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      
encoder_sess_opts_(GetSessionOptions(config)),\n      decoder_sess_opts_(GetSessionOptions(config, \"decoder\")),\n      joiner_sess_opts_(GetSessionOptions(config, \"joiner\")),\n      allocator_{} {\n  {\n    auto buf = ReadFile(mgr, config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\nvoid OnlineEbranchformerTransducerModel::InitEncoder(void *model_data,\n                                                     size_t model_data_length) {\n  encoder_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, encoder_sess_opts_);\n\n  GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                &encoder_input_names_ptr_);\n\n  GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                 &encoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---encoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n  SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n  SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n\n  SHERPA_ONNX_READ_META_DATA(num_hidden_layers_, \"num_hidden_layers\");\n  SHERPA_ONNX_READ_META_DATA(hidden_size_, \"hidden_size\");\n  SHERPA_ONNX_READ_META_DATA(intermediate_size_, \"intermediate_size\");\n  SHERPA_ONNX_READ_META_DATA(csgu_kernel_size_, \"csgu_kernel_size\");\n  SHERPA_ONNX_READ_META_DATA(merge_conv_kernel_, \"merge_conv_kernel\");\n  SHERPA_ONNX_READ_META_DATA(left_context_len_, 
\"left_context_len\");\n  SHERPA_ONNX_READ_META_DATA(num_heads_, \"num_heads\");\n  SHERPA_ONNX_READ_META_DATA(head_dim_, \"head_dim\");\n\n  if (config_.debug) {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"T: %{public}d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %{public}d\", decode_chunk_len_);\n\n    SHERPA_ONNX_LOGE(\"num_hidden_layers_: %{public}d\", num_hidden_layers_);\n    SHERPA_ONNX_LOGE(\"hidden_size_: %{public}d\", hidden_size_);\n    SHERPA_ONNX_LOGE(\"intermediate_size_: %{public}d\", intermediate_size_);\n    SHERPA_ONNX_LOGE(\"csgu_kernel_size_: %{public}d\", csgu_kernel_size_);\n    SHERPA_ONNX_LOGE(\"merge_conv_kernel_: %{public}d\", merge_conv_kernel_);\n    SHERPA_ONNX_LOGE(\"left_context_len_: %{public}d\", left_context_len_);\n    SHERPA_ONNX_LOGE(\"num_heads_: %{public}d\", num_heads_);\n    SHERPA_ONNX_LOGE(\"head_dim_: %{public}d\", head_dim_);\n#else\n    SHERPA_ONNX_LOGE(\"T: %d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n\n    SHERPA_ONNX_LOGE(\"num_hidden_layers_: %d\", num_hidden_layers_);\n    SHERPA_ONNX_LOGE(\"hidden_size_: %d\", hidden_size_);\n    SHERPA_ONNX_LOGE(\"intermediate_size_: %d\", intermediate_size_);\n    SHERPA_ONNX_LOGE(\"csgu_kernel_size_: %d\", csgu_kernel_size_);\n    SHERPA_ONNX_LOGE(\"merge_conv_kernel_: %d\", merge_conv_kernel_);\n    SHERPA_ONNX_LOGE(\"left_context_len_: %d\", left_context_len_);\n    SHERPA_ONNX_LOGE(\"num_heads_: %d\", num_heads_);\n    SHERPA_ONNX_LOGE(\"head_dim_: %d\", head_dim_);\n#endif\n  }\n}\n\nvoid OnlineEbranchformerTransducerModel::InitDecoder(void *model_data,\n                                                     size_t model_data_length) {\n  decoder_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, decoder_sess_opts_);\n\n  GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                &decoder_input_names_ptr_);\n\n  GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                 
&decoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---decoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n}\n\nvoid OnlineEbranchformerTransducerModel::InitJoiner(void *model_data,\n                                                    size_t model_data_length) {\n  joiner_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, joiner_sess_opts_);\n\n  GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                &joiner_input_names_ptr_);\n\n  GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                 &joiner_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---joiner---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n}\n\nstd::vector<Ort::Value> OnlineEbranchformerTransducerModel::StackStates(\n    const std::vector<std::vector<Ort::Value>> &states) const {\n  int32_t batch_size = static_cast<int32_t>(states.size());\n\n  std::vector<const Ort::Value *> buf(batch_size);\n\n  auto allocator =\n      const_cast<OnlineEbranchformerTransducerModel *>(this)->allocator_;\n\n  std::vector<Ort::Value> ans;\n  int32_t num_states = static_cast<int32_t>(states[0].size());\n  ans.reserve(num_states);\n\n  for (int32_t i = 0; i != num_hidden_layers_; ++i) {\n    {  // cached_key\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][4 * i];\n      }\n      auto v = Cat(allocator, buf, /* axis */ 0);\n      ans.push_back(std::move(v));\n 
   }\n    {  // cached_value\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][4 * i + 1];\n      }\n      auto v = Cat(allocator, buf, 0);\n      ans.push_back(std::move(v));\n    }\n    {  // cached_conv\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][4 * i + 2];\n      }\n      auto v = Cat(allocator, buf, 0);\n      ans.push_back(std::move(v));\n    }\n    {  // cached_conv_fusion\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][4 * i + 3];\n      }\n      auto v = Cat(allocator, buf, 0);\n      ans.push_back(std::move(v));\n    }\n  }\n\n  {  // processed_lens\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_states - 1];\n    }\n    auto v = Cat<int64_t>(allocator, buf, 0);\n    ans.push_back(std::move(v));\n  }\n\n  return ans;\n}\n\nstd::vector<std::vector<Ort::Value>>\nOnlineEbranchformerTransducerModel::UnStackStates(\n    const std::vector<Ort::Value> &states) const {\n  assert(static_cast<int32_t>(states.size()) == num_hidden_layers_ * 4 + 1);\n\n  int32_t batch_size = states[0].GetTensorTypeAndShapeInfo().GetShape()[0];\n\n  auto allocator =\n      const_cast<OnlineEbranchformerTransducerModel *>(this)->allocator_;\n\n  std::vector<std::vector<Ort::Value>> ans;\n  ans.resize(batch_size);\n\n  for (int32_t i = 0; i != num_hidden_layers_; ++i) {\n    {  // cached_key\n      auto v = Unbind(allocator, &states[i * 4], /* axis */ 0);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {  // cached_value\n      auto v = Unbind(allocator, &states[i * 4 + 1], 0);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {  // cached_conv\n      auto v = Unbind(allocator, &states[i * 4 + 2], 0);\n 
     assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {  // cached_conv_fusion\n      auto v = Unbind(allocator, &states[i * 4 + 3], 0);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n  }\n\n  {  // processed_lens\n    auto v = Unbind<int64_t>(allocator, &states.back(), 0);\n    assert(static_cast<int32_t>(v.size()) == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value>\nOnlineEbranchformerTransducerModel::GetEncoderInitStates() {\n  std::vector<Ort::Value> ans;\n\n  ans.reserve(num_hidden_layers_ * 4 + 1);\n\n  int32_t left_context_conv = csgu_kernel_size_ - 1;\n  int32_t channels_conv = intermediate_size_ / 2;\n\n  int32_t left_context_conv_fusion = merge_conv_kernel_ - 1;\n  int32_t channels_conv_fusion = 2 * hidden_size_;\n\n  for (int32_t i = 0; i != num_hidden_layers_; ++i) {\n    {  // cached_key_{i}\n      std::array<int64_t, 4> s{1, num_heads_, left_context_len_, head_dim_};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      ans.push_back(std::move(v));\n    }\n\n    {  // cached_value_{i}\n      std::array<int64_t, 4> s{1, num_heads_, left_context_len_, head_dim_};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      ans.push_back(std::move(v));\n    }\n\n    {  // cached_conv_{i}\n      std::array<int64_t, 3> s{1, channels_conv, left_context_conv};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      ans.push_back(std::move(v));\n    }\n\n    {  // cached_conv_fusion_{i}\n      std::array<int64_t, 3> s{1, channels_conv_fusion,\n      
                         left_context_conv_fusion};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      ans.push_back(std::move(v));\n    }\n  }  // num_hidden_layers_\n\n  {  // processed_lens\n    std::array<int64_t, 1> s{1};\n    auto v = Ort::Value::CreateTensor<int64_t>(allocator_, s.data(), s.size());\n    Fill<int64_t>(&v, 0);\n    ans.push_back(std::move(v));\n  }\n\n  return ans;\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineEbranchformerTransducerModel::RunEncoder(\n    Ort::Value features, std::vector<Ort::Value> states,\n    Ort::Value /* processed_frames */) {\n  std::vector<Ort::Value> encoder_inputs;\n  encoder_inputs.reserve(1 + states.size());\n\n  encoder_inputs.push_back(std::move(features));\n  for (auto &v : states) {\n    encoder_inputs.push_back(std::move(v));\n  }\n\n  auto encoder_out = encoder_sess_->Run(\n      {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n      encoder_inputs.size(), encoder_output_names_ptr_.data(),\n      encoder_output_names_ptr_.size());\n\n  std::vector<Ort::Value> next_states;\n  next_states.reserve(states.size());\n\n  for (int32_t i = 1; i != static_cast<int32_t>(encoder_out.size()); ++i) {\n    next_states.push_back(std::move(encoder_out[i]));\n  }\n  return {std::move(encoder_out[0]), std::move(next_states)};\n}\n\nOrt::Value OnlineEbranchformerTransducerModel::RunDecoder(\n    Ort::Value decoder_input) {\n  auto decoder_out = decoder_sess_->Run(\n      {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n      decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n  return std::move(decoder_out[0]);\n}\n\nOrt::Value OnlineEbranchformerTransducerModel::RunJoiner(\n    Ort::Value encoder_out, Ort::Value decoder_out) {\n  std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                            std::move(decoder_out)};\n  auto logit =\n      joiner_sess_->Run({}, 
joiner_input_names_ptr_.data(), joiner_input.data(),\n                        joiner_input.size(), joiner_output_names_ptr_.data(),\n                        joiner_output_names_ptr_.size());\n\n  return std::move(logit[0]);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineEbranchformerTransducerModel::OnlineEbranchformerTransducerModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineEbranchformerTransducerModel::OnlineEbranchformerTransducerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-ebranchformer-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-ebranchformer-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n//                2025  Brno University of Technology (author: Karel Vesely)\n#ifndef SHERPA_ONNX_CSRC_ONLINE_EBRANCHFORMER_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_EBRANCHFORMER_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineEbranchformerTransducerModel : public OnlineTransducerModel {\n public:\n  explicit OnlineEbranchformerTransducerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineEbranchformerTransducerModel(Manager *mgr,\n                                     const OnlineModelConfig &config);\n\n  std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const override;\n\n  std::vector<Ort::Value> GetEncoderInitStates() override;\n\n  void SetFeatureDim(int32_t feature_dim) override {\n    feature_dim_ = feature_dim;\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) override;\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) override;\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) override;\n\n  int32_t ContextSize() const override { return context_size_; }\n\n  int32_t ChunkSize() const override { return T_; }\n\n  int32_t ChunkShift() const override { return decode_chunk_len_; }\n\n  int32_t VocabSize() const override { return vocab_size_; }\n  OrtAllocator *Allocator() override { return allocator_; }\n\n private:\n  void InitEncoder(void 
*model_data, size_t model_data_length);\n  void InitDecoder(void *model_data, size_t model_data_length);\n  void InitJoiner(void *model_data, size_t model_data_length);\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions encoder_sess_opts_;\n  Ort::SessionOptions decoder_sess_opts_;\n  Ort::SessionOptions joiner_sess_opts_;\n\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  OnlineModelConfig config_;\n\n  int32_t decode_chunk_len_ = 0;\n  int32_t T_ = 0;\n\n  int32_t num_hidden_layers_ = 0;\n  int32_t hidden_size_ = 0;\n  int32_t intermediate_size_ = 0;\n  int32_t csgu_kernel_size_ = 0;\n  int32_t merge_conv_kernel_ = 0;\n  int32_t left_context_len_ = 0;\n  int32_t num_heads_ = 0;\n  int32_t head_dim_ = 0;\n\n  int32_t context_size_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t feature_dim_ = 80;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_EBRANCHFORMER_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lm-config.cc",
    "content": "// sherpa-onnx/csrc/online-lm-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-lm-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineLMConfig::Register(ParseOptions *po) {\n  po->Register(\"lm\", &model, \"Path to LM model.\");\n  po->Register(\"lm-scale\", &scale, \"LM scale.\");\n  po->Register(\"lm-num-threads\", &lm_num_threads,\n               \"Number of threads to run the neural network of LM model\");\n  po->Register(\"lm-provider\", &lm_provider,\n               \"Specify a provider to LM model use: cpu, cuda, coreml\");\n  po->Register(\"lm-shallow-fusion\", &shallow_fusion,\n               \"Boolean whether to use shallow fusion or rescore.\");\n  po->Register(\"lodr-fst\", &lodr_fst, \"Path to LODR FST model.\");\n  po->Register(\"lodr-scale\", &lodr_scale, \"LODR scale.\");\n  po->Register(\"lodr-backoff-id\", &lodr_backoff_id,\n               \"ID of the backoff in the LODR FST. -1 means autodetect\");\n}\n\nbool OnlineLMConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"'%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (!lodr_fst.empty() && !FileExists(lodr_fst)) {\n    SHERPA_ONNX_LOGE(\"'%s' does not exist\", lodr_fst.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineLMConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineLMConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"scale=\" << scale << \", \";\n  os << \"lodr_scale=\" << lodr_scale << \", \";\n  os << \"lodr_fst=\\\"\" << lodr_fst << \"\\\", \";\n  os << \"lodr_backoff_id=\" << lodr_backoff_id << \", \";\n  os << \"shallow_fusion=\" << (shallow_fusion ? \"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lm-config.h",
    "content": "// sherpa-onnx/csrc/online-lm-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_LM_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_LM_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineLMConfig {\n  // path to the onnx model\n  std::string model;\n\n  // LM scale\n  float scale = 0.5;\n  int32_t lm_num_threads = 1;\n  std::string lm_provider = \"cpu\";\n  std::string lodr_fst;\n  float lodr_scale = 0.01;\n  int32_t lodr_backoff_id = -1;  // -1 means not set\n  // enable shallow fusion\n  bool shallow_fusion = true;\n\n  OnlineLMConfig() = default;\n\n  OnlineLMConfig(const std::string &model, float scale, int32_t lm_num_threads,\n                 const std::string &lm_provider, bool shallow_fusion,\n                 const std::string &lodr_fst, float lodr_scale,\n                 int32_t lodr_backoff_id)\n      : model(model),\n        scale(scale),\n        lm_num_threads(lm_num_threads),\n        lm_provider(lm_provider),\n        shallow_fusion(shallow_fusion),\n        lodr_fst(lodr_fst),\n        lodr_scale(lodr_scale),\n        lodr_backoff_id(lodr_backoff_id) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_LM_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lm.cc",
    "content": "// sherpa-onnx/csrc/online-lm.cc\n//\n// Copyright (c)  2023  Pingfeng Luo\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-lm.h\"\n\n#include <algorithm>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-rnn-lm.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OnlineLM> OnlineLM::Create(const OnlineLMConfig &config) {\n  return std::make_unique<OnlineRnnLM>(config);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lm.h",
    "content": "// sherpa-onnx/csrc/online-lm.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_LM_H_\n#define SHERPA_ONNX_CSRC_ONLINE_LM_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/online-lm-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineLM {\n public:\n  virtual ~OnlineLM() = default;\n\n  static std::unique_ptr<OnlineLM> Create(const OnlineLMConfig &config);\n\n  // init states for classic rescore\n  virtual std::vector<Ort::Value> GetInitStates() = 0;\n\n  // init states for shallow fusion\n  virtual std::pair<Ort::Value, std::vector<Ort::Value>> GetInitStatesSF() = 0;\n\n   /** ScoreToken a batch of sentences (shallow fusion).\n   *\n   * @param x A 2-D tensor of shape (N, 1) with data type int64.\n   * @param states It contains the states for the LM model\n   * @return Return a pair containing\n   *          - log_prob of NN LM\n   *          - updated states\n   *\n   */\n  virtual std::pair<Ort::Value, std::vector<Ort::Value>> ScoreToken(\n      Ort::Value x, std::vector<Ort::Value> states) = 0;\n\n  /** This function updates hyp.lm_log_prob of hyps (classic rescore).\n   *\n   * @param scale LM score\n   * @param context_size Context size of the transducer decoder model\n   * @param hyps It is changed in-place.\n   *\n   */\n  virtual void ComputeLMScore(float scale, int32_t context_size,\n                      std::vector<Hypotheses> *hyps) = 0;\n\n  /** This function updates lm_log_prob and nn_lm_scores of hyp (shallow fusion).\n   *\n   * @param scale LM score\n   * @param hyps It is changed in-place.\n   *\n   */\n  virtual void ComputeLMScoreSF(float scale, Hypothesis *hyp) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_LM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lstm-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-lstm-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/online-lstm-transducer-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nOnlineLstmTransducerModel::OnlineLstmTransducerModel(\n    const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\ntemplate <typename Manager>\nOnlineLstmTransducerModel::OnlineLstmTransducerModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(mgr, config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.joiner);\n    
InitJoiner(buf.data(), buf.size());\n  }\n}\n\nvoid OnlineLstmTransducerModel::InitEncoder(void *model_data,\n                                            size_t model_data_length) {\n  encoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                &encoder_input_names_ptr_);\n\n  GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                 &encoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---encoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(num_encoder_layers_, \"num_encoder_layers\");\n  SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n  SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n  SHERPA_ONNX_READ_META_DATA(rnn_hidden_size_, \"rnn_hidden_size\");\n  SHERPA_ONNX_READ_META_DATA(d_model_, \"d_model\");\n}\n\nvoid OnlineLstmTransducerModel::InitDecoder(void *model_data,\n                                            size_t model_data_length) {\n  decoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                &decoder_input_names_ptr_);\n\n  GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                 &decoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---decoder---\\n\";\n    
PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n}\n\nvoid OnlineLstmTransducerModel::InitJoiner(void *model_data,\n                                           size_t model_data_length) {\n  joiner_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                model_data_length, sess_opts_);\n\n  GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                &joiner_input_names_ptr_);\n\n  GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                 &joiner_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---joiner---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n}\n\nstd::vector<Ort::Value> OnlineLstmTransducerModel::StackStates(\n    const std::vector<std::vector<Ort::Value>> &states) const {\n  int32_t batch_size = static_cast<int32_t>(states.size());\n\n  std::vector<const Ort::Value *> h_buf(batch_size);\n  std::vector<const Ort::Value *> c_buf(batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    assert(states[i].size() == 2);\n    h_buf[i] = &states[i][0];\n    c_buf[i] = &states[i][1];\n  }\n  auto allocator = const_cast<OnlineLstmTransducerModel *>(this)->allocator_;\n\n  Ort::Value h = Cat(allocator, h_buf, 1);\n  Ort::Value c = Cat(allocator, c_buf, 1);\n\n  std::vector<Ort::Value> ans;\n  ans.reserve(2);\n  ans.push_back(std::move(h));\n  ans.push_back(std::move(c));\n\n  return ans;\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineLstmTransducerModel::UnStackStates(\n    const std::vector<Ort::Value> &states) const {\n  int32_t batch_size = 
states[0].GetTensorTypeAndShapeInfo().GetShape()[1];\n  assert(states.size() == 2);\n\n  std::vector<std::vector<Ort::Value>> ans(batch_size);\n\n  auto allocator = const_cast<OnlineLstmTransducerModel *>(this)->allocator_;\n\n  std::vector<Ort::Value> h_vec = Unbind(allocator, &states[0], 1);\n  std::vector<Ort::Value> c_vec = Unbind(allocator, &states[1], 1);\n\n  assert(h_vec.size() == batch_size);\n  assert(c_vec.size() == batch_size);\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    ans[i].push_back(std::move(h_vec[i]));\n    ans[i].push_back(std::move(c_vec[i]));\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value> OnlineLstmTransducerModel::GetEncoderInitStates() {\n  // Please see\n  // https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/lstm_transducer_stateless2/export-onnx.py#L185\n  // for details\n  constexpr int32_t kBatchSize = 1;\n  std::array<int64_t, 3> h_shape{num_encoder_layers_, kBatchSize, d_model_};\n  Ort::Value h = Ort::Value::CreateTensor<float>(allocator_, h_shape.data(),\n                                                 h_shape.size());\n\n  Fill<float>(&h, 0);\n\n  std::array<int64_t, 3> c_shape{num_encoder_layers_, kBatchSize,\n                                 rnn_hidden_size_};\n\n  Ort::Value c = Ort::Value::CreateTensor<float>(allocator_, c_shape.data(),\n                                                 c_shape.size());\n\n  Fill<float>(&c, 0);\n\n  std::vector<Ort::Value> states;\n\n  states.reserve(2);\n  states.push_back(std::move(h));\n  states.push_back(std::move(c));\n\n  return states;\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineLstmTransducerModel::RunEncoder(Ort::Value features,\n                                      std::vector<Ort::Value> states,\n                                      Ort::Value /* processed_frames */) {\n  std::array<Ort::Value, 3> encoder_inputs = {\n      std::move(features), std::move(states[0]), std::move(states[1])};\n\n  auto encoder_out = encoder_sess_->Run(\n     
 {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n      encoder_inputs.size(), encoder_output_names_ptr_.data(),\n      encoder_output_names_ptr_.size());\n\n  std::vector<Ort::Value> next_states;\n  next_states.reserve(2);\n  next_states.push_back(std::move(encoder_out[1]));\n  next_states.push_back(std::move(encoder_out[2]));\n\n  return {std::move(encoder_out[0]), std::move(next_states)};\n}\n\nOrt::Value OnlineLstmTransducerModel::RunDecoder(Ort::Value decoder_input) {\n  auto decoder_out = decoder_sess_->Run(\n      {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n      decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n  return std::move(decoder_out[0]);\n}\n\nOrt::Value OnlineLstmTransducerModel::RunJoiner(Ort::Value encoder_out,\n                                                Ort::Value decoder_out) {\n  std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                            std::move(decoder_out)};\n  auto logit =\n      joiner_sess_->Run({}, joiner_input_names_ptr_.data(), joiner_input.data(),\n                        joiner_input.size(), joiner_output_names_ptr_.data(),\n                        joiner_output_names_ptr_.size());\n\n  return std::move(logit[0]);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineLstmTransducerModel::OnlineLstmTransducerModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineLstmTransducerModel::OnlineLstmTransducerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-lstm-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-lstm-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_LSTM_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_LSTM_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineLstmTransducerModel : public OnlineTransducerModel {\n public:\n  explicit OnlineLstmTransducerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineLstmTransducerModel(Manager *mgr, const OnlineModelConfig &config);\n\n  std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const override;\n\n  std::vector<Ort::Value> GetEncoderInitStates() override;\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) override;\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) override;\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) override;\n\n  int32_t ContextSize() const override { return context_size_; }\n\n  int32_t ChunkSize() const override { return T_; }\n\n  int32_t ChunkShift() const override { return decode_chunk_len_; }\n\n  int32_t VocabSize() const override { return vocab_size_; }\n  OrtAllocator *Allocator() override { return allocator_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length);\n  void InitDecoder(void *model_data, size_t model_data_length);\n  void InitJoiner(void *model_data, size_t model_data_length);\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions 
allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  OnlineModelConfig config_;\n\n  int32_t num_encoder_layers_ = 0;\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n  int32_t rnn_hidden_size_ = 0;\n  int32_t d_model_ = 0;\n  int32_t context_size_ = 0;\n  int32_t vocab_size_ = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_LSTM_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineModelConfig::Register(ParseOptions *po) {\n  transducer.Register(po);\n  paraformer.Register(po);\n  wenet_ctc.Register(po);\n  zipformer2_ctc.Register(po);\n  nemo_ctc.Register(po);\n  t_one_ctc.Register(po);\n  provider_config.Register(po);\n\n  po->Register(\"tokens\", &tokens, \"Path to tokens.txt\");\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"warm-up\", &warm_up,\n               \"Number of warm-up to run the onnxruntime\"\n               \"Valid vales are: zipformer2\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"modeling-unit\", &modeling_unit,\n               \"The modeling unit of the model, commonly used units are bpe, \"\n               \"cjkchar, cjkchar+bpe, etc. Currently, it is needed only when \"\n               \"hotwords are provided, we need it to encode the hotwords into \"\n               \"token sequence.\");\n\n  po->Register(\"bpe-vocab\", &bpe_vocab,\n               \"The vocabulary generated by google's sentencepiece program. \"\n               \"It is a file has two columns, one is the token, the other is \"\n               \"the log probability, you can get it from the directory where \"\n               \"your bpe model is generated. Only used when hotwords provided \"\n               \"and the modeling unit is bpe or cjkchar+bpe\");\n\n  po->Register(\"model-type\", &model_type,\n               \"Specify it to reduce model initialization time. 
\"\n               \"Valid values are: conformer, lstm, zipformer, zipformer2, \"\n               \"wenet_ctc, nemo_ctc. \"\n               \"All other values lead to loading the model twice.\");\n}\n\nbool OnlineModelConfig::Validate() const {\n  // For RK NPU, we reinterpret num_threads:\n  //\n  // For RK3588 only\n  // num_threads == 1 -> Select a core randomly\n  // num_threads == 0 -> Use NPU core 0\n  // num_threads == -1 -> Use NPU core 1\n  // num_threads == -2 -> Use NPU core 2\n  // num_threads == -3 -> Use NPU core 0 and core 1\n  // num_threads == -4 -> Use NPU core 0, core 1, and core 2\n  if (provider_config.provider != \"rknn\") {\n    if (num_threads < 1) {\n      SHERPA_ONNX_LOGE(\"num_threads should be > 0. Given %d\", num_threads);\n      return false;\n    }\n    if (!transducer.encoder.empty() && (EndsWith(transducer.encoder, \".rknn\") ||\n                                        EndsWith(transducer.decoder, \".rknn\") ||\n                                        EndsWith(transducer.joiner, \".rknn\"))) {\n      SHERPA_ONNX_LOGE(\n          \"--provider is %s, which is not rknn, but you pass rknn model \"\n          \"filenames. 
encoder: '%s', decoder: '%s', joiner: '%s'\",\n          provider_config.provider.c_str(), transducer.encoder.c_str(),\n          transducer.decoder.c_str(), transducer.joiner.c_str());\n      return false;\n    }\n\n    if (!zipformer2_ctc.model.empty() &&\n        EndsWith(zipformer2_ctc.model, \".rknn\")) {\n      SHERPA_ONNX_LOGE(\n          \"--provider is %s, which is not rknn, but you pass rknn model \"\n          \"filename for zipformer2_ctc: '%s'\",\n          provider_config.provider.c_str(), zipformer2_ctc.model.c_str());\n      return false;\n    }\n  }\n\n  if (provider_config.provider == \"rknn\") {\n    if (!transducer.encoder.empty() && (EndsWith(transducer.encoder, \".onnx\") ||\n                                        EndsWith(transducer.decoder, \".onnx\") ||\n                                        EndsWith(transducer.joiner, \".onnx\"))) {\n      SHERPA_ONNX_LOGE(\n          \"--provider is rknn, but you pass onnx model \"\n          \"filenames. encoder: '%s', decoder: '%s', joiner: '%s'\",\n          transducer.encoder.c_str(), transducer.decoder.c_str(),\n          transducer.joiner.c_str());\n      return false;\n    }\n\n    if (!zipformer2_ctc.model.empty() &&\n        EndsWith(zipformer2_ctc.model, \".onnx\")) {\n      SHERPA_ONNX_LOGE(\n          \"--provider rknn, but you pass onnx model filename for \"\n          \"zipformer2_ctc: '%s'\",\n          zipformer2_ctc.model.c_str());\n      return false;\n    }\n  }\n\n  if (!tokens_buf.empty() && FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\n        \"you can not provide a tokens_buf and a tokens file: '%s', \"\n        \"at the same time, which is confusing\",\n        tokens.c_str());\n    return false;\n  }\n\n  if (tokens_buf.empty() && !FileExists(tokens)) {\n    SHERPA_ONNX_LOGE(\n        \"tokens: '%s' does not exist, you should provide \"\n        \"either a tokens buffer or a tokens file\",\n        tokens.c_str());\n    return false;\n  }\n\n  if (!modeling_unit.empty() &&\n   
   (modeling_unit == \"bpe\" || modeling_unit == \"cjkchar+bpe\")) {\n    if (!FileExists(bpe_vocab)) {\n      SHERPA_ONNX_LOGE(\"bpe_vocab: '%s' does not exist\", bpe_vocab.c_str());\n      return false;\n    }\n  }\n\n  if (!provider_config.Validate()) {\n    return false;\n  }\n\n  if (!paraformer.encoder.empty()) {\n    return paraformer.Validate();\n  }\n\n  if (!wenet_ctc.model.empty()) {\n    return wenet_ctc.Validate();\n  }\n\n  if (!zipformer2_ctc.model.empty()) {\n    return zipformer2_ctc.Validate();\n  }\n\n  if (!nemo_ctc.model.empty()) {\n    return nemo_ctc.Validate();\n  }\n\n  if (!t_one_ctc.model.empty()) {\n    return t_one_ctc.Validate();\n  }\n\n  return transducer.Validate();\n}\n\nstd::string OnlineModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineModelConfig(\";\n  os << \"transducer=\" << transducer.ToString() << \", \";\n  os << \"paraformer=\" << paraformer.ToString() << \", \";\n  os << \"wenet_ctc=\" << wenet_ctc.ToString() << \", \";\n  os << \"zipformer2_ctc=\" << zipformer2_ctc.ToString() << \", \";\n  os << \"nemo_ctc=\" << nemo_ctc.ToString() << \", \";\n  os << \"t_one_ctc=\" << t_one_ctc.ToString() << \", \";\n  os << \"provider_config=\" << provider_config.ToString() << \", \";\n  os << \"tokens=\\\"\" << tokens << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"warm_up=\" << warm_up << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"model_type=\\\"\" << model_type << \"\\\", \";\n  os << \"modeling_unit=\\\"\" << modeling_unit << \"\\\", \";\n  os << \"bpe_vocab=\\\"\" << bpe_vocab << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-model-config.h",
    "content": "// sherpa-onnx/csrc/online-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/online-nemo-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/online-paraformer-model-config.h\"\n#include \"sherpa-onnx/csrc/online-t-one-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/online-wenet-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/online-zipformer2-ctc-model-config.h\"\n#include \"sherpa-onnx/csrc/provider-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineModelConfig {\n  OnlineTransducerModelConfig transducer;\n  OnlineParaformerModelConfig paraformer;\n  OnlineWenetCtcModelConfig wenet_ctc;\n  OnlineZipformer2CtcModelConfig zipformer2_ctc;\n  OnlineNeMoCtcModelConfig nemo_ctc;\n  OnlineToneCtcModelConfig t_one_ctc;\n  ProviderConfig provider_config;\n  std::string tokens;\n  int32_t num_threads = 1;\n  int32_t warm_up = 0;\n  bool debug = false;\n\n  // Valid values:\n  //  - conformer, conformer transducer from icefall\n  //  - lstm, lstm transducer from icefall\n  //  - zipformer, zipformer transducer from icefall\n  //  - zipformer2, zipformer2 transducer or CTC from icefall\n  //  - wenet_ctc, wenet CTC model\n  //  - nemo_ctc, NeMo CTC model\n  //\n  // All other values are invalid and lead to loading the model twice.\n  std::string model_type;\n\n  // Valid values:\n  //  - cjkchar\n  //  - bpe\n  //  - cjkchar+bpe\n  std::string modeling_unit = \"cjkchar\";\n  std::string bpe_vocab;\n\n  /// if tokens_buf is non-empty,\n  /// the tokens will be loaded from the buffer instead of from the\n  /// \"tokens\" file\n  std::string tokens_buf;\n\n  OnlineModelConfig() = default;\n  OnlineModelConfig(const OnlineTransducerModelConfig &transducer,\n                    const OnlineParaformerModelConfig &paraformer,\n             
       const OnlineWenetCtcModelConfig &wenet_ctc,\n                    const OnlineZipformer2CtcModelConfig &zipformer2_ctc,\n                    const OnlineNeMoCtcModelConfig &nemo_ctc,\n                    const OnlineToneCtcModelConfig &t_one_ctc,\n                    const ProviderConfig &provider_config,\n                    const std::string &tokens, int32_t num_threads,\n                    int32_t warm_up, bool debug, const std::string &model_type,\n                    const std::string &modeling_unit,\n                    const std::string &bpe_vocab)\n      : transducer(transducer),\n        paraformer(paraformer),\n        wenet_ctc(wenet_ctc),\n        zipformer2_ctc(zipformer2_ctc),\n        nemo_ctc(nemo_ctc),\n        t_one_ctc(t_one_ctc),\n        provider_config(provider_config),\n        tokens(tokens),\n        num_threads(num_threads),\n        warm_up(warm_up),\n        debug(debug),\n        model_type(model_type),\n        modeling_unit(modeling_unit),\n        bpe_vocab(bpe_vocab) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-nemo-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-nemo-ctc-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-nemo-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineNeMoCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"nemo-ctc-model\", &model,\n               \"Path to CTC model.onnx from NeMo. Please see \"\n               \"https://github.com/k2-fsa/sherpa-onnx/pull/843\");\n}\n\nbool OnlineNeMoCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"NeMo CTC model '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineNeMoCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineNeMoCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-nemo-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/online-nemo-ctc-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineNeMoCtcModelConfig {\n  std::string model;\n\n  OnlineNeMoCtcModelConfig() = default;\n\n  explicit OnlineNeMoCtcModelConfig(const std::string &model) : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-nemo-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/online-nemo-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-nemo-ctc-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineNeMoCtcModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.nemo_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.nemo_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value x,\n                                  std::vector<Ort::Value> states) {\n    Ort::Value &cache_last_channel = states[0];\n    Ort::Value &cache_last_time = states[1];\n    Ort::Value &cache_last_channel_len = states[2];\n\n    int32_t batch_size = x.GetTensorTypeAndShapeInfo().GetShape()[0];\n\n    std::array<int64_t, 1> length_shape{batch_size};\n\n    Ort::Value length = Ort::Value::CreateTensor<int64_t>(\n        allocator_, 
length_shape.data(), length_shape.size());\n\n    int64_t *p_length = length.GetTensorMutableData<int64_t>();\n\n    std::fill(p_length, p_length + batch_size, ChunkLength());\n\n    // (B, T, C) -> (B, C, T)\n    x = Transpose12(allocator_, &x);\n\n    std::array<Ort::Value, 5> inputs = {\n        std::move(x), View(&length), std::move(cache_last_channel),\n        std::move(cache_last_time), std::move(cache_last_channel_len)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    // out[0]: logit\n    // out[1]: logit_length\n    // out[2:]: states_next\n    //\n    // we need to remove out[1]\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(out.size() - 1);\n\n    for (size_t i = 0; i != out.size(); ++i) {\n      if (i == 1) {\n        continue;\n      }\n\n      ans.push_back(std::move(out[i]));\n    }\n\n    return ans;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t ChunkLength() const { return window_size_; }\n\n  int32_t ChunkShift() const { return chunk_shift_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  // Return a vector containing 3 tensors\n  // - cache_last_channel\n  // - cache_last_time\n  // - cache_last_channel_len\n  std::vector<Ort::Value> GetInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(3);\n    ans.push_back(View(&cache_last_channel_));\n    ans.push_back(View(&cache_last_time_));\n    ans.push_back(View(&cache_last_channel_len_));\n\n    return ans;\n  }\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) {\n    int32_t batch_size = static_cast<int32_t>(states.size());\n    if (batch_size == 1) {\n      return std::move(states[0]);\n    }\n\n    std::vector<Ort::Value> ans;\n\n    // stack cache_last_channel\n    std::vector<const Ort::Value *> buf(batch_size);\n\n    // there are 3 states to be stacked\n    for 
(int32_t i = 0; i != 3; ++i) {\n      buf.clear();\n      buf.reserve(batch_size);\n\n      for (int32_t b = 0; b != batch_size; ++b) {\n        assert(states[b].size() == 3);\n        buf.push_back(&states[b][i]);\n      }\n\n      Ort::Value c{nullptr};\n      if (i == 2) {\n        c = Cat<int64_t>(allocator_, buf, 0);\n      } else {\n        c = Cat(allocator_, buf, 0);\n      }\n\n      ans.push_back(std::move(c));\n    }\n\n    return ans;\n  }\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const {\n    assert(states.size() == 3);\n\n    auto allocator = const_cast<Impl *>(this)->allocator_;\n\n    std::vector<std::vector<Ort::Value>> ans;\n\n    auto shape = states[0].GetTensorTypeAndShapeInfo().GetShape();\n    int32_t batch_size = shape[0];\n    ans.resize(batch_size);\n\n    if (batch_size == 1) {\n      ans[0] = std::move(states);\n      return ans;\n    }\n\n    for (int32_t i = 0; i != 3; ++i) {\n      std::vector<Ort::Value> v;\n      if (i == 2) {\n        v = Unbind<int64_t>(allocator, &states[i], 0);\n      } else {\n        v = Unbind(allocator, &states[i], 0);\n      }\n\n      assert(v.size() == batch_size);\n\n      for (int32_t b = 0; b != batch_size; ++b) {\n        ans[b].push_back(std::move(v[b]));\n      }\n    }\n\n    return ans;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", 
os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(window_size_, \"window_size\");\n    SHERPA_ONNX_READ_META_DATA(chunk_shift_, \"chunk_shift\");\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim1_,\n                               \"cache_last_channel_dim1\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim2_,\n                               \"cache_last_channel_dim2\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim3_,\n                               \"cache_last_channel_dim3\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim1_, \"cache_last_time_dim1\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim2_, \"cache_last_time_dim2\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim3_, \"cache_last_time_dim3\");\n\n    // need to increase by 1 since the blank token is not included in computing\n    // vocab_size in NeMo.\n    vocab_size_ += 1;\n\n    InitStates();\n  }\n\n  void InitStates() {\n    std::array<int64_t, 4> cache_last_channel_shape{1, cache_last_channel_dim1_,\n                                                    cache_last_channel_dim2_,\n                                                    cache_last_channel_dim3_};\n\n    cache_last_channel_ = Ort::Value::CreateTensor<float>(\n        allocator_, cache_last_channel_shape.data(),\n        cache_last_channel_shape.size());\n\n    Fill<float>(&cache_last_channel_, 0);\n\n    std::array<int64_t, 4> cache_last_time_shape{\n        1, cache_last_time_dim1_, cache_last_time_dim2_, cache_last_time_dim3_};\n\n    cache_last_time_ = Ort::Value::CreateTensor<float>(\n        allocator_, cache_last_time_shape.data(), cache_last_time_shape.size());\n\n    Fill<float>(&cache_last_time_, 0);\n\n    int64_t shape = 1;\n    cache_last_channel_len_ =\n       
 Ort::Value::CreateTensor<int64_t>(allocator_, &shape, 1);\n\n    cache_last_channel_len_.GetTensorMutableData<int64_t>()[0] = 0;\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t window_size_ = 0;\n  int32_t chunk_shift_ = 0;\n  int32_t subsampling_factor_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t cache_last_channel_dim1_ = 0;\n  int32_t cache_last_channel_dim2_ = 0;\n  int32_t cache_last_channel_dim3_ = 0;\n  int32_t cache_last_time_dim1_ = 0;\n  int32_t cache_last_time_dim2_ = 0;\n  int32_t cache_last_time_dim3_ = 0;\n\n  Ort::Value cache_last_channel_{nullptr};\n  Ort::Value cache_last_time_{nullptr};\n  Ort::Value cache_last_channel_len_{nullptr};\n};\n\nOnlineNeMoCtcModel::OnlineNeMoCtcModel(const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineNeMoCtcModel::OnlineNeMoCtcModel(Manager *mgr,\n                                       const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineNeMoCtcModel::~OnlineNeMoCtcModel() = default;\n\nstd::vector<Ort::Value> OnlineNeMoCtcModel::Forward(\n    Ort::Value x, std::vector<Ort::Value> states) const {\n  return impl_->Forward(std::move(x), std::move(states));\n}\n\nint32_t OnlineNeMoCtcModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OnlineNeMoCtcModel::ChunkLength() const { return impl_->ChunkLength(); }\n\nint32_t OnlineNeMoCtcModel::ChunkShift() const { return impl_->ChunkShift(); }\n\nOrtAllocator *OnlineNeMoCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::vector<Ort::Value> OnlineNeMoCtcModel::GetInitStates() const {\n  return 
impl_->GetInitStates();\n}\n\nstd::vector<Ort::Value> OnlineNeMoCtcModel::StackStates(\n    std::vector<std::vector<Ort::Value>> states) const {\n  return impl_->StackStates(std::move(states));\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineNeMoCtcModel::UnStackStates(\n    std::vector<Ort::Value> states) const {\n  return impl_->UnStackStates(std::move(states));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineNeMoCtcModel::OnlineNeMoCtcModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineNeMoCtcModel::OnlineNeMoCtcModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-nemo-ctc-model.h",
    "content": "// sherpa-onnx/csrc/online-nemo-ctc-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineNeMoCtcModel : public OnlineCtcModel {\n public:\n  explicit OnlineNeMoCtcModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineNeMoCtcModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineNeMoCtcModel() override;\n\n  // A list of 3 tensors:\n  //  - cache_last_channel\n  //  - cache_last_time\n  //  - cache_last_channel_len\n  std::vector<Ort::Value> GetInitStates() const override;\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const override;\n\n  /**\n   *\n   * @param x A 3-D tensor of shape (N, T, C). 
N has to be 1.\n   * @param states  It is from GetInitStates() or returned from this method.\n   *\n   * @return Return a list of tensors\n   *    - ans[0] contains log_probs, of shape (N, T, C)\n   *    - ans[1:] contains next_states\n   */\n  std::vector<Ort::Value> Forward(\n      Ort::Value x, std::vector<Ort::Value> states) const override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // The model accepts this number of frames before subsampling as input\n  int32_t ChunkLength() const override;\n\n  // Similar to frame_shift in feature extractor, after processing\n  // ChunkLength() frames, we advance by ChunkShift() frames\n  // before we process the next chunk.\n  int32_t ChunkShift() const override;\n\n  bool SupportBatchProcessing() const override { return true; }\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_NEMO_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-paraformer-decoder.h",
    "content": "// sherpa-onnx/csrc/online-paraformer-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct OnlineParaformerDecoderResult {\n  /// The decoded token IDs\n  std::vector<int32_t> tokens;\n\n  int32_t last_non_blank_frame_index = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-paraformer-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-paraformer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-paraformer-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineParaformerModelConfig::Register(ParseOptions *po) {\n  po->Register(\"paraformer-encoder\", &encoder,\n               \"Path to encoder.onnx of paraformer.\");\n  po->Register(\"paraformer-decoder\", &decoder,\n               \"Path to decoder.onnx of paraformer.\");\n}\n\nbool OnlineParaformerModelConfig::Validate() const {\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"Paraformer encoder '%s' does not exist\", encoder.c_str());\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"Paraformer decoder '%s' does not exist\", decoder.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineParaformerModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineParaformerModelConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-paraformer-model-config.h",
    "content": "// sherpa-onnx/csrc/online-paraformer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineParaformerModelConfig {\n  std::string encoder;\n  std::string decoder;\n\n  OnlineParaformerModelConfig() = default;\n\n  OnlineParaformerModelConfig(const std::string &encoder,\n                              const std::string &decoder)\n      : encoder(encoder), decoder(decoder) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-paraformer-model.cc",
    "content": "// sherpa-onnx/csrc/online-paraformer-model.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-paraformer-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineParaformerModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.paraformer.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.paraformer.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.paraformer.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.paraformer.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> ForwardEncoder(Ort::Value features,\n                                         Ort::Value features_length) {\n    std::array<Ort::Value, 2> inputs = {std::move(features),\n                                        std::move(features_length)};\n\n    return encoder_sess_->Run(\n        {}, 
encoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n  }\n\n  std::vector<Ort::Value> ForwardDecoder(Ort::Value encoder_out,\n                                         Ort::Value encoder_out_length,\n                                         Ort::Value acoustic_embedding,\n                                         Ort::Value acoustic_embedding_length,\n                                         std::vector<Ort::Value> states) {\n    std::vector<Ort::Value> decoder_inputs;\n    decoder_inputs.reserve(4 + states.size());\n\n    decoder_inputs.push_back(std::move(encoder_out));\n    decoder_inputs.push_back(std::move(encoder_out_length));\n    decoder_inputs.push_back(std::move(acoustic_embedding));\n    decoder_inputs.push_back(std::move(acoustic_embedding_length));\n\n    for (auto &v : states) {\n      decoder_inputs.push_back(std::move(v));\n    }\n\n    return decoder_sess_->Run({}, decoder_input_names_ptr_.data(),\n                              decoder_inputs.data(), decoder_inputs.size(),\n                              decoder_output_names_ptr_.data(),\n                              decoder_output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t LfrWindowSize() const { return lfr_window_size_; }\n\n  int32_t LfrWindowShift() const { return lfr_window_shift_; }\n\n  int32_t EncoderOutputSize() const { return encoder_output_size_; }\n\n  int32_t DecoderKernelSize() const { return decoder_kernel_size_; }\n\n  int32_t DecoderNumBlocks() const { return decoder_num_blocks_; }\n\n  const std::vector<float> &NegativeMean() const { return neg_mean_; }\n\n  const std::vector<float> &InverseStdDev() const { return inv_stddev_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, 
model_data_length, sess_opts_);\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_size_, \"lfr_window_size\");\n    SHERPA_ONNX_READ_META_DATA(lfr_window_shift_, \"lfr_window_shift\");\n    SHERPA_ONNX_READ_META_DATA(encoder_output_size_, \"encoder_output_size\");\n    SHERPA_ONNX_READ_META_DATA(decoder_num_blocks_, \"decoder_num_blocks\");\n    SHERPA_ONNX_READ_META_DATA(decoder_kernel_size_, \"decoder_kernel_size\");\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(neg_mean_, \"neg_mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(inv_stddev_, \"inv_stddev\");\n\n    float scale = std::sqrt(encoder_output_size_);\n    for (auto &f : inv_stddev_) {\n      f *= scale;\n    }\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, model_data, model_data_length, sess_opts_);\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n\n  
std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::unique_ptr<Ort::Session> decoder_sess_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<float> neg_mean_;\n  std::vector<float> inv_stddev_;\n\n  int32_t vocab_size_ = 0;  // initialized in Init\n  int32_t lfr_window_size_ = 0;\n  int32_t lfr_window_shift_ = 0;\n\n  int32_t encoder_output_size_ = 0;\n  int32_t decoder_num_blocks_ = 0;\n  int32_t decoder_kernel_size_ = 0;\n};\n\nOnlineParaformerModel::OnlineParaformerModel(const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineParaformerModel::OnlineParaformerModel(Manager *mgr,\n                                             const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineParaformerModel::~OnlineParaformerModel() = default;\n\nstd::vector<Ort::Value> OnlineParaformerModel::ForwardEncoder(\n    Ort::Value features, Ort::Value features_length) const {\n  return impl_->ForwardEncoder(std::move(features), std::move(features_length));\n}\n\nstd::vector<Ort::Value> OnlineParaformerModel::ForwardDecoder(\n    Ort::Value encoder_out, Ort::Value encoder_out_length,\n    Ort::Value acoustic_embedding, Ort::Value acoustic_embedding_length,\n    std::vector<Ort::Value> states) const {\n  return impl_->ForwardDecoder(\n      std::move(encoder_out), std::move(encoder_out_length),\n      std::move(acoustic_embedding), std::move(acoustic_embedding_length),\n      std::move(states));\n}\n\nint32_t OnlineParaformerModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OnlineParaformerModel::LfrWindowSize() const {\n 
 return impl_->LfrWindowSize();\n}\nint32_t OnlineParaformerModel::LfrWindowShift() const {\n  return impl_->LfrWindowShift();\n}\n\nint32_t OnlineParaformerModel::EncoderOutputSize() const {\n  return impl_->EncoderOutputSize();\n}\n\nint32_t OnlineParaformerModel::DecoderKernelSize() const {\n  return impl_->DecoderKernelSize();\n}\n\nint32_t OnlineParaformerModel::DecoderNumBlocks() const {\n  return impl_->DecoderNumBlocks();\n}\n\nconst std::vector<float> &OnlineParaformerModel::NegativeMean() const {\n  return impl_->NegativeMean();\n}\nconst std::vector<float> &OnlineParaformerModel::InverseStdDev() const {\n  return impl_->InverseStdDev();\n}\n\nOrtAllocator *OnlineParaformerModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineParaformerModel::OnlineParaformerModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineParaformerModel::OnlineParaformerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-paraformer-model.h",
    "content": "// sherpa-onnx/csrc/online-paraformer-model.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineParaformerModel {\n public:\n  explicit OnlineParaformerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineParaformerModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineParaformerModel();\n\n  std::vector<Ort::Value> ForwardEncoder(Ort::Value features,\n                                         Ort::Value features_length) const;\n\n  std::vector<Ort::Value> ForwardDecoder(Ort::Value encoder_out,\n                                         Ort::Value encoder_out_length,\n                                         Ort::Value acoustic_embedding,\n                                         Ort::Value acoustic_embedding_length,\n                                         std::vector<Ort::Value> states) const;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const;\n\n  /** It is lfr_m in config.yaml\n   */\n  int32_t LfrWindowSize() const;\n\n  /** It is lfr_n in config.yaml\n   */\n  int32_t LfrWindowShift() const;\n\n  int32_t EncoderOutputSize() const;\n\n  int32_t DecoderKernelSize() const;\n  int32_t DecoderNumBlocks() const;\n\n  /** Return negative mean for CMVN\n   */\n  const std::vector<float> &NegativeMean() const;\n\n  /** Return inverse stddev for CMVN\n   */\n  const std::vector<float> &InverseStdDev() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PARAFORMER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation-cnn-bilstm-impl.h",
    "content": "// sherpa-onnx/csrc/online-punctuation-cnn-bilstm-impl.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_CNN_BILSTM_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_CNN_BILSTM_IMPL_H_\n\n#include <math.h>\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include <chrono>  // NOLINT\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/online-cnn-bilstm-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/online-cnn-bilstm-model.h\"\n#include \"sherpa-onnx/csrc/online-punctuation-impl.h\"\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\nstatic const int32_t kMaxSeqLen = 200;\n\nclass OnlinePunctuationCNNBiLSTMImpl : public OnlinePunctuationImpl {\n public:\n  explicit OnlinePunctuationCNNBiLSTMImpl(const OnlinePunctuationConfig &config)\n      : config_(config), model_(config.model) {\n    if (!config_.model.bpe_vocab.empty()) {\n      bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(\n          config_.model.bpe_vocab);\n    }\n  }\n\n  template <typename Manager>\n  OnlinePunctuationCNNBiLSTMImpl(Manager *mgr,\n                                 const OnlinePunctuationConfig &config)\n      : config_(config), model_(mgr, config.model) {\n    if (!config_.model.bpe_vocab.empty()) {\n      auto buf = ReadFile(mgr, config_.model.bpe_vocab);\n      std::istringstream iss(std::string(buf.begin(), buf.end()));\n      bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(iss);\n    }\n  }\n\n  std::string AddPunctuationWithCase(const std::string &text) const override {\n    if (text.empty()) {\n      return {};\n    }\n\n    std::vector<int32_t> tokens_list;     // N * kMaxSeqLen\n    
std::vector<int32_t> valids_list;     // N * kMaxSeqLen\n    std::vector<int32_t> label_len_list;  // N\n\n    EncodeSentences(text, tokens_list, valids_list, label_len_list);\n\n    const auto &meta_data = model_.GetModelMetadata();\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t n = label_len_list.size();\n\n    std::array<int64_t, 2> token_ids_shape = {n, kMaxSeqLen};\n    Ort::Value token_ids = Ort::Value::CreateTensor(\n        memory_info, tokens_list.data(), tokens_list.size(),\n        token_ids_shape.data(), token_ids_shape.size());\n\n    std::array<int64_t, 2> valid_ids_shape = {n, kMaxSeqLen};\n    Ort::Value valid_ids = Ort::Value::CreateTensor(\n        memory_info, valids_list.data(), valids_list.size(),\n        valid_ids_shape.data(), valid_ids_shape.size());\n\n    std::array<int64_t, 1> label_len_shape = {n};\n    Ort::Value label_len = Ort::Value::CreateTensor(\n        memory_info, label_len_list.data(), label_len_list.size(),\n        label_len_shape.data(), label_len_shape.size());\n\n    auto pair = model_.Forward(std::move(token_ids), std::move(valid_ids),\n                               std::move(label_len));\n\n    std::vector<int32_t> case_pred;\n    std::vector<int32_t> punct_pred;\n    const float *active_case_logits = pair.first.GetTensorData<float>();\n    const float *active_punct_logits = pair.second.GetTensorData<float>();\n    std::vector<int64_t> case_logits_shape =\n        pair.first.GetTensorTypeAndShapeInfo().GetShape();\n\n    for (int32_t i = 0; i < case_logits_shape[0]; ++i) {\n      const float *p_cur_case = active_case_logits + i * meta_data.num_cases;\n      auto index_case = static_cast<int32_t>(std::distance(\n          p_cur_case,\n          std::max_element(p_cur_case, p_cur_case + meta_data.num_cases)));\n      case_pred.push_back(index_case);\n\n      const float *p_cur_punct =\n          active_punct_logits + i * meta_data.num_punctuations;\n   
   auto index_punct = static_cast<int32_t>(std::distance(\n          p_cur_punct,\n          std::max_element(p_cur_punct,\n                           p_cur_punct + meta_data.num_punctuations)));\n      punct_pred.push_back(index_punct);\n    }\n\n    std::string ans = DecodeSentences(text, case_pred, punct_pred);\n\n    return ans;\n  }\n\n private:\n  void EncodeSentences(const std::string &text,\n                       std::vector<int32_t> &tokens_list,             // NOLINT\n                       std::vector<int32_t> &valids_list,             // NOLINT\n                       std::vector<int32_t> &label_len_list) const {  // NOLINT\n    std::vector<int32_t> tokens;\n    std::vector<int32_t> valids;\n    int32_t label_len = 0;\n\n    tokens.push_back(1);  // hardcode 1 now, 1 - <s>\n    valids.push_back(1);\n\n    std::stringstream ss(text);\n    std::string word;\n    while (ss >> word) {\n      std::vector<int32_t> word_tokens;\n      bpe_encoder_->Encode(word, &word_tokens);\n\n      int32_t seq_len = tokens.size() + word_tokens.size();\n      if (seq_len > kMaxSeqLen - 1) {\n        tokens.push_back(2);  // hardcode 2 now, 2 - </s>\n        valids.push_back(1);\n\n        label_len = std::count(valids.begin(), valids.end(), 1);\n\n        if (tokens.size() < kMaxSeqLen) {\n          tokens.resize(kMaxSeqLen, 0);\n          valids.resize(kMaxSeqLen, 0);\n        }\n\n        assert(tokens.size() == kMaxSeqLen);\n        assert(valids.size() == kMaxSeqLen);\n\n        tokens_list.insert(tokens_list.end(), tokens.begin(), tokens.end());\n        valids_list.insert(valids_list.end(), valids.begin(), valids.end());\n        label_len_list.push_back(label_len);\n\n        std::vector<int32_t>().swap(tokens);\n        std::vector<int32_t>().swap(valids);\n        label_len = 0;\n        tokens.push_back(1);  // hardcode 1 now, 1 - <s>\n        valids.push_back(1);\n      }\n\n      tokens.insert(tokens.end(), word_tokens.begin(), word_tokens.end());\n      
valids.push_back(1);  // only the first sub word is valid\n      int32_t remaining_size = static_cast<int32_t>(word_tokens.size()) - 1;\n      if (remaining_size > 0) {\n        int32_t valids_cur_size = static_cast<int32_t>(valids.size());\n        valids.resize(valids_cur_size + remaining_size, 0);\n      }\n    }\n\n    if (tokens.size() > 0) {\n      tokens.push_back(2);  // hardcode 2 now, 2 - </s>\n      valids.push_back(1);\n\n      label_len = std::count(valids.begin(), valids.end(), 1);\n\n      if (tokens.size() < kMaxSeqLen) {\n        tokens.resize(kMaxSeqLen, 0);\n        valids.resize(kMaxSeqLen, 0);\n      }\n\n      assert(tokens.size() == kMaxSeqLen);\n      assert(valids.size() == kMaxSeqLen);\n\n      tokens_list.insert(tokens_list.end(), tokens.begin(), tokens.end());\n      valids_list.insert(valids_list.end(), valids.begin(), valids.end());\n      label_len_list.push_back(label_len);\n    }\n  }\n\n  std::string DecodeSentences(const std::string &raw_text,\n                              const std::vector<int32_t> &case_pred,\n                              const std::vector<int32_t> &punct_pred) const {\n    std::string result_text;\n    std::istringstream iss(raw_text);\n    std::vector<std::string> words;\n    std::string word;\n\n    while (iss >> word) {\n      words.emplace_back(word);\n    }\n\n    assert(words.size() == case_pred.size());\n    assert(words.size() == punct_pred.size());\n\n    for (int32_t i = 0; i < words.size(); ++i) {\n      std::string prefix = ((i != 0) ? 
\" \" : \"\");\n      result_text += prefix;\n      switch (case_pred[i]) {\n        case 1:  // upper\n        {\n          // Take the byte as unsigned char: std::toupper is undefined for\n          // negative char values (e.g. bytes of multi-byte UTF-8 text).\n          std::transform(words[i].begin(), words[i].end(), words[i].begin(),\n                         [](unsigned char c) {\n                           return static_cast<char>(std::toupper(c));\n                         });\n          result_text += words[i];\n          break;\n        }\n        case 2:  // cap\n        {\n          words[i][0] = static_cast<char>(\n              std::toupper(static_cast<unsigned char>(words[i][0])));\n          result_text += words[i];\n          break;\n        }\n        case 3:  // mix case\n        {\n          // TODO(frankyoujian):\n          // Need to add a map containing supported mix case words so that we\n          // can fetch the predicted word from the map e.g. mcdonald's ->\n          // McDonald's\n          result_text += words[i];\n          break;\n        }\n        default: {\n          result_text += words[i];\n          break;\n        }\n      }\n\n      std::string suffix;\n      switch (punct_pred[i]) {\n        case 1:  // comma\n        {\n          suffix = \",\";\n          break;\n        }\n        case 2:  // period\n        {\n          suffix = \".\";\n          break;\n        }\n        case 3:  // question\n        {\n          suffix = \"?\";\n          break;\n        }\n        default:\n          break;\n      }\n\n      result_text += suffix;\n    }\n\n    return result_text;\n  }\n\n private:\n  OnlinePunctuationConfig config_;\n  OnlineCNNBiLSTMModel model_;\n  std::unique_ptr<ssentencepiece::Ssentencepiece> bpe_encoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_CNN_BILSTM_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation-impl.cc",
    "content": "// sherpa-onnx/csrc/online-punctuation-impl.cc\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#include \"sherpa-onnx/csrc/online-punctuation-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-punctuation-cnn-bilstm-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OnlinePunctuationImpl> OnlinePunctuationImpl::Create(\n    const OnlinePunctuationConfig &config) {\n  if (!config.model.cnn_bilstm.empty() && !config.model.bpe_vocab.empty()) {\n    return std::make_unique<OnlinePunctuationCNNBiLSTMImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"Please specify a punctuation model and bpe vocab! Return a null \"\n      \"pointer\");\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OnlinePunctuationImpl> OnlinePunctuationImpl::Create(\n    Manager *mgr, const OnlinePunctuationConfig &config) {\n  if (!config.model.cnn_bilstm.empty() && !config.model.bpe_vocab.empty()) {\n    return std::make_unique<OnlinePunctuationCNNBiLSTMImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\n      \"Please specify a punctuation model and bpe vocab! Return a null \"\n      \"pointer\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OnlinePunctuationImpl> OnlinePunctuationImpl::Create(\n    AAssetManager *mgr, const OnlinePunctuationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OnlinePunctuationImpl> OnlinePunctuationImpl::Create(\n    NativeResourceManager *mgr, const OnlinePunctuationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation-impl.h",
    "content": "// sherpa-onnx/csrc/online-punctuation-impl.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlinePunctuationImpl {\n public:\n  virtual ~OnlinePunctuationImpl() = default;\n\n  static std::unique_ptr<OnlinePunctuationImpl> Create(\n      const OnlinePunctuationConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OnlinePunctuationImpl> Create(\n      Manager *mgr, const OnlinePunctuationConfig &config);\n\n  virtual std::string AddPunctuationWithCase(const std::string &text) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-punctuation-model-config.cc\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#include \"sherpa-onnx/csrc/online-punctuation-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlinePunctuationModelConfig::Register(ParseOptions *po) {\n  po->Register(\"cnn-bilstm\", &cnn_bilstm,\n               \"Path to the light-weight CNN-BiLSTM model\");\n\n  po->Register(\"bpe-vocab\", &bpe_vocab, \"Path to the bpe vocab file\");\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool OnlinePunctuationModelConfig::Validate() const {\n  if (cnn_bilstm.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --cnn-bilstm\");\n    return false;\n  }\n\n  if (!FileExists(cnn_bilstm)) {\n    SHERPA_ONNX_LOGE(\"--cnn-bilstm '%s' does not exist\", cnn_bilstm.c_str());\n    return false;\n  }\n\n  if (bpe_vocab.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --bpe-vocab\");\n    return false;\n  }\n\n  if (!FileExists(bpe_vocab)) {\n    SHERPA_ONNX_LOGE(\"--bpe-vocab '%s' does not exist\", bpe_vocab.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlinePunctuationModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlinePunctuationModelConfig(\";\n  os << \"cnn_bilstm=\\\"\" << cnn_bilstm << \"\\\", \";\n  os << \"bpe_vocab=\\\"\" << bpe_vocab << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation-model-config.h",
    "content": "// sherpa-onnx/csrc/online-punctuation-model-config.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlinePunctuationModelConfig {\n  std::string cnn_bilstm;\n  std::string bpe_vocab;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  OnlinePunctuationModelConfig() = default;\n\n  OnlinePunctuationModelConfig(const std::string &cnn_bilstm,\n                               const std::string &bpe_vocab,\n                               int32_t num_threads, bool debug,\n                               const std::string &provider)\n      : cnn_bilstm(cnn_bilstm),\n        bpe_vocab(bpe_vocab),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation.cc",
    "content": "// sherpa-onnx/csrc/online-punctuation.cc\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-punctuation-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlinePunctuationConfig::Register(ParseOptions *po) { model.Register(po); }\n\nbool OnlinePunctuationConfig::Validate() const {\n  if (!model.Validate()) {\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlinePunctuationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlinePunctuationConfig(\";\n  os << \"model=\" << model.ToString() << \")\";\n\n  return os.str();\n}\n\nOnlinePunctuation::OnlinePunctuation(const OnlinePunctuationConfig &config)\n    : impl_(OnlinePunctuationImpl::Create(config)) {}\n\ntemplate <typename Manager>\nOnlinePunctuation::OnlinePunctuation(Manager *mgr,\n                                     const OnlinePunctuationConfig &config)\n    : impl_(OnlinePunctuationImpl::Create(mgr, config)) {}\n\nOnlinePunctuation::~OnlinePunctuation() = default;\n\nstd::string OnlinePunctuation::AddPunctuationWithCase(\n    const std::string &text) const {\n  return impl_->AddPunctuationWithCase(text);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlinePunctuation::OnlinePunctuation(\n    AAssetManager *mgr, const OnlinePunctuationConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlinePunctuation::OnlinePunctuation(\n    NativeResourceManager *mgr, const OnlinePunctuationConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-punctuation.h",
    "content": "// sherpa-onnx/csrc/online-punctuation.h\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_H_\n#define SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-punctuation-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlinePunctuationConfig {\n  OnlinePunctuationModelConfig model;\n\n  OnlinePunctuationConfig() = default;\n\n  explicit OnlinePunctuationConfig(const OnlinePunctuationModelConfig &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OnlinePunctuationImpl;\n\nclass OnlinePunctuation {\n public:\n  explicit OnlinePunctuation(const OnlinePunctuationConfig &config);\n\n  template <typename Manager>\n  OnlinePunctuation(Manager *mgr, const OnlinePunctuationConfig &config);\n\n  ~OnlinePunctuation();\n\n  // Add punctuation and casing to the input text and return it.\n  std::string AddPunctuationWithCase(const std::string &text) const;\n\n private:\n  std::unique_ptr<OnlinePunctuationImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_PUNCTUATION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-ctc-impl.h",
    "content": "// sherpa-onnx/csrc/online-recognizer-ctc-impl.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_CTC_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_CTC_IMPL_H_\n\n#include <algorithm>\n#include <cassert>\n#include <ios>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n#include \"sherpa-onnx/csrc/online-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nstatic OnlineRecognizerResult ConvertCtc(const OnlineCtcDecoderResult &src,\n                                  const SymbolTable &sym_table,\n                                  float frame_shift_ms,\n                                  int32_t subsampling_factor, int32_t segment,\n                                  int32_t frames_since_start) {\n  OnlineRecognizerResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.tokens.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    auto sym = sym_table[i];\n\n    text.append(sym);\n\n    if (sym.size() == 1 && (sym[0] < 0x20 || sym[0] > 0x7e)) {\n      // for bpe models with byte_fallback\n      // (but don't rewrite printable characters 0x20..0x7e,\n      //  which collide with standard BPE units)\n      std::ostringstream os;\n      os << \"<0x\" << std::hex << std::uppercase\n         << (static_cast<int32_t>(sym[0]) & 0xff) << \">\";\n      sym = os.str();\n    }\n\n    r.tokens.push_back(std::move(sym));\n  }\n\n  if (sym_table.IsByteBpe()) {\n    text = sym_table.DecodeByteBpe(text);\n  }\n\n  r.text = 
std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. * subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  r.segment = segment;\n  // src is a const reference, so this is a copy; std::move would have no\n  // effect here.\n  r.words = src.words;\n  r.start_time = frames_since_start * frame_shift_ms / 1000.;\n\n  return r;\n}\n\nclass OnlineRecognizerCtcImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerCtcImpl(const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        model_(OnlineCtcModel::Create(config.model_config)),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n    PostInit();\n  }\n\n  template <typename Manager>\n  explicit OnlineRecognizerCtcImpl(Manager *mgr,\n                                   const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        model_(OnlineCtcModel::Create(mgr, config.model_config)),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      sym_ = SymbolTable(mgr, config.model_config.tokens);\n    }\n    PostInit();\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStream>(config_.feat_config);\n    stream->SetStates(model_->GetInitStates());\n    stream->SetFasterDecoder(decoder_->CreateFasterDecoder());\n\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkLength() <\n           s->NumFramesReady();\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    if (n == 1 || 
!model_->SupportBatchProcessing()) {\n      for (int32_t i = 0; i != n; ++i) {\n        DecodeStream(ss[i]);\n      }\n      return;\n    }\n\n    // batch processing\n    int32_t chunk_length = model_->ChunkLength();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feat_dim = ss[0]->FeatureDim();\n\n    std::vector<OnlineCtcDecoderResult> results(n);\n    std::vector<float> features_vec(n * chunk_length * feat_dim);\n    std::vector<std::vector<Ort::Value>> states_vec(n);\n    std::vector<int64_t> all_processed_frames(n);\n\n    for (int32_t i = 0; i != n; ++i) {\n      const auto num_processed_frames = ss[i]->GetNumProcessedFrames();\n      std::vector<float> features =\n          ss[i]->GetFrames(num_processed_frames, chunk_length);\n      if (config_.feat_config.is_whisper) {\n        OfflineWhisperModel::NormalizeFeatures(features.data(), chunk_length,\n                                               feat_dim);\n      }\n\n      // Question: should num_processed_frames include chunk_shift?\n      ss[i]->GetNumProcessedFrames() += chunk_shift;\n\n      std::copy(features.begin(), features.end(),\n                features_vec.data() + i * chunk_length * feat_dim);\n\n      results[i] = std::move(ss[i]->GetCtcResult());\n      states_vec[i] = std::move(ss[i]->GetStates());\n      all_processed_frames[i] = num_processed_frames;\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{n, chunk_length, feat_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, features_vec.data(),\n                                            features_vec.size(), x_shape.data(),\n                                            x_shape.size());\n\n    auto states = model_->StackStates(std::move(states_vec));\n    int32_t num_states = states.size();\n    auto out = model_->Forward(std::move(x), std::move(states));\n    std::vector<Ort::Value> out_states;\n    
out_states.reserve(num_states);\n\n    for (int32_t k = 1; k != num_states + 1; ++k) {\n      out_states.push_back(std::move(out[k]));\n    }\n\n    std::vector<std::vector<Ort::Value>> next_states =\n        model_->UnStackStates(std::move(out_states));\n\n    std::vector<int64_t> log_probs_shape =\n        out[0].GetTensorTypeAndShapeInfo().GetShape();\n    decoder_->Decode(out[0].GetTensorData<float>(), log_probs_shape[0],\n                     log_probs_shape[1], log_probs_shape[2], &results, ss, n);\n\n    for (int32_t k = 0; k != n; ++k) {\n      ss[k]->SetCtcResult(results[k]);\n      ss[k]->SetStates(std::move(next_states[k]));\n    }\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    OnlineCtcDecoderResult decoder_result = s->GetCtcResult();\n\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    if (!config_.model_config.t_one_ctc.model.empty()) {\n      // Each input frame is 300 ms long and produces 10 output frames,\n      // so frame_shift_ms is 300 / 10 = 30 ms.\n      frame_shift_ms = 30;\n      subsampling_factor = 1;\n    }\n\n    auto r =\n        ConvertCtc(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                   s->GetCurrentSegment(), s->GetNumFramesSinceStart());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if (!config_.enable_endpoint) {\n      return false;\n    }\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    float frame_shift_in_seconds = 0.01;\n    int32_t subsampling_factor = 4;\n    if (!config_.model_config.t_one_ctc.model.empty()) {\n      frame_shift_in_seconds = 0.03;\n      subsampling_factor = 1;\n    }\n\n    int32_t trailing_silence_frames =\n        s->GetCtcResult().num_trailing_blanks * 
subsampling_factor;\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    // segment is incremented only when the last\n    // result is not empty\n    const auto &r = s->GetCtcResult();\n    if (!r.tokens.empty()) {\n      s->GetCurrentSegment() += 1;\n    }\n\n    // clear result\n    s->SetCtcResult({});\n\n    // clear states\n    s->SetStates(model_->GetInitStates());\n\n    s->GetFasterDecoderProcessedFrames() = 0;\n\n    // Note: We only update counters. The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n private:\n  void PostInit() {\n    if (!config_.model_config.wenet_ctc.model.empty()) {\n      // WeNet CTC models assume input samples are in the range\n      // [-32768, 32767], so we set normalize_samples to false\n      config_.feat_config.normalize_samples = false;\n    }\n\n    if (!config_.model_config.t_one_ctc.model.empty()) {\n      config_.feat_config.is_t_one = true;\n      config_.feat_config.frame_length_ms = 300;\n      config_.feat_config.frame_shift_ms = 300;\n      config_.feat_config.sampling_rate = 8000;\n    }\n\n    if (model_->UseWhisperFeature()) {\n      config_.feat_config.is_whisper = true;\n    }\n\n    InitDecoder();\n  }\n  void InitDecoder() {\n    if (!sym_.Contains(\"<blk>\") && !sym_.Contains(\"<eps>\") &&\n        !sym_.Contains(\"<blank>\")) {\n      SHERPA_ONNX_LOGE(\n          \"We expect that tokens.txt contains \"\n          \"the symbol <blk> or <eps> or <blank> and its ID.\");\n      exit(-1);\n    }\n\n    int32_t blank_id = 0;\n    if (sym_.Contains(\"<blk>\")) {\n      blank_id = sym_[\"<blk>\"];\n    } else if (sym_.Contains(\"<eps>\")) {\n      // for tdnn models of the yesno recipe from icefall\n      blank_id = sym_[\"<eps>\"];\n    } else if (sym_.Contains(\"<blank>\")) {\n      // for WeNet CTC models\n      blank_id = 
sym_[\"<blank>\"];\n    }\n\n    if (!config_.ctc_fst_decoder_config.graph.empty()) {\n      decoder_ = std::make_unique<OnlineCtcFstDecoder>(\n          config_.ctc_fst_decoder_config, blank_id);\n    } else if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineCtcGreedySearchDecoder>(blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Unsupported decoding method: %s for streaming CTC models\",\n          config_.decoding_method.c_str());\n      exit(-1);\n    }\n  }\n\n  void DecodeStream(OnlineStream *s) const {\n    int32_t chunk_length = model_->ChunkLength();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feat_dim = s->FeatureDim();\n\n    const auto num_processed_frames = s->GetNumProcessedFrames();\n    std::vector<float> frames =\n        s->GetFrames(num_processed_frames, chunk_length);\n\n    if (config_.feat_config.is_whisper) {\n      OfflineWhisperModel::NormalizeFeatures(frames.data(), chunk_length,\n                                             feat_dim);\n    }\n\n    s->GetNumProcessedFrames() += chunk_shift;\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{1, chunk_length, feat_dim};\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, frames.data(), frames.size(),\n                                 x_shape.data(), x_shape.size());\n    auto out = model_->Forward(std::move(x), std::move(s->GetStates()));\n    int32_t num_states = static_cast<int32_t>(out.size()) - 1;\n\n    std::vector<Ort::Value> states;\n    states.reserve(num_states);\n\n    for (int32_t i = 0; i != num_states; ++i) {\n      states.push_back(std::move(out[i + 1]));\n    }\n    s->SetStates(std::move(states));\n\n    std::vector<OnlineCtcDecoderResult> results(1);\n    results[0] = std::move(s->GetCtcResult());\n\n    std::vector<int64_t> log_probs_shape =\n        
out[0].GetTensorTypeAndShapeInfo().GetShape();\n    decoder_->Decode(out[0].GetTensorData<float>(), log_probs_shape[0],\n                     log_probs_shape[1], log_probs_shape[2], &results, &s, 1);\n    s->SetCtcResult(results[0]);\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  std::unique_ptr<OnlineCtcModel> model_;\n  std::unique_ptr<OnlineCtcDecoder> decoder_;\n  SymbolTable sym_;\n  Endpoint endpoint_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_CTC_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-impl.cc",
    "content": "// sherpa-onnx/csrc/online-recognizer-impl.cc\n//\n// Copyright (c)  2023-2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n\n#include <memory>\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"fst/extensions/far/far.h\"\n#include \"kaldifst/csrc/kaldi-fst-io.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-ctc-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-paraformer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-transducer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\n#if SHERPA_ONNX_ENABLE_RKNN\n#include \"sherpa-onnx/csrc/rknn/online-recognizer-ctc-rknn-impl.h\"\n#include \"sherpa-onnx/csrc/rknn/online-recognizer-transducer-rknn-impl.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OnlineRecognizerImpl> OnlineRecognizerImpl::Create(\n    const OnlineRecognizerConfig &config) {\n  if (config.model_config.provider_config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (config.model_config.transducer.encoder.empty() &&\n        config.model_config.zipformer2_ctc.model.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Only Zipformer transducers and CTC models are currently supported \"\n          \"by rknn. Fallback to CPU. 
Make sure you pass an onnx model\");\n    } else if (!config.model_config.transducer.encoder.empty()) {\n      return std::make_unique<OnlineRecognizerTransducerRknnImpl>(config);\n    } else if (!config.model_config.zipformer2_ctc.model.empty()) {\n      return std::make_unique<OnlineRecognizerCtcRknnImpl>(config);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.model_config.transducer.encoder.empty()) {\n    Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n\n    Ort::SessionOptions sess_opts;\n    sess_opts.SetIntraOpNumThreads(1);\n    sess_opts.SetInterOpNumThreads(1);\n\n    auto decoder_model = ReadFile(config.model_config.transducer.decoder);\n    auto sess = std::make_unique<Ort::Session>(env, decoder_model.data(),\n                                               decoder_model.size(), sess_opts);\n\n    size_t node_count = sess->GetOutputCount();\n\n    if (node_count == 1) {\n      return std::make_unique<OnlineRecognizerTransducerImpl>(config);\n    } else {\n      return std::make_unique<OnlineRecognizerTransducerNeMoImpl>(config);\n    }\n  }\n\n  if (!config.model_config.paraformer.encoder.empty()) {\n    return std::make_unique<OnlineRecognizerParaformerImpl>(config);\n  }\n\n  if (!config.model_config.wenet_ctc.model.empty() ||\n      !config.model_config.zipformer2_ctc.model.empty() ||\n      !config.model_config.nemo_ctc.model.empty() ||\n      !config.model_config.t_one_ctc.model.empty()) {\n    return std::make_unique<OnlineRecognizerCtcImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a model\");\n  SHERPA_ONNX_EXIT(-1);\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OnlineRecognizerImpl> OnlineRecognizerImpl::Create(\n    Manager *mgr, const OnlineRecognizerConfig &config) {\n  if (config.model_config.provider_config.provider == \"rknn\") 
{\n#if SHERPA_ONNX_ENABLE_RKNN\n    // Currently, only Zipformer transducers and zipformer2 CTC models are\n    // supported for rknn\n    if (config.model_config.transducer.encoder.empty() &&\n        config.model_config.zipformer2_ctc.model.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Only Zipformer transducers and CTC models are currently supported \"\n          \"by rknn. Fallback to CPU\");\n    } else if (!config.model_config.transducer.encoder.empty()) {\n      return std::make_unique<OnlineRecognizerTransducerRknnImpl>(mgr, config);\n    } else if (!config.model_config.zipformer2_ctc.model.empty()) {\n      return std::make_unique<OnlineRecognizerCtcRknnImpl>(mgr, config);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.model_config.transducer.encoder.empty()) {\n    Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n\n    Ort::SessionOptions sess_opts;\n    sess_opts.SetIntraOpNumThreads(1);\n    sess_opts.SetInterOpNumThreads(1);\n\n    auto decoder_model = ReadFile(mgr, config.model_config.transducer.decoder);\n    auto sess = std::make_unique<Ort::Session>(env, decoder_model.data(),\n                                               decoder_model.size(), sess_opts);\n\n    size_t node_count = sess->GetOutputCount();\n\n    if (node_count == 1) {\n      return std::make_unique<OnlineRecognizerTransducerImpl>(mgr, config);\n    } else {\n      return std::make_unique<OnlineRecognizerTransducerNeMoImpl>(mgr, config);\n    }\n  }\n\n  if (!config.model_config.paraformer.encoder.empty()) {\n    return std::make_unique<OnlineRecognizerParaformerImpl>(mgr, config);\n  }\n\n  if (!config.model_config.wenet_ctc.model.empty() ||\n      !config.model_config.zipformer2_ctc.model.empty() ||\n      !config.model_config.nemo_ctc.model.empty() ||\n      !config.model_config.t_one_ctc.model.empty()) {\n    return 
std::make_unique<OnlineRecognizerCtcImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please specify a model\");\n  SHERPA_ONNX_EXIT(-1);\n  return nullptr;\n}\n\nOnlineRecognizerImpl::OnlineRecognizerImpl(const OnlineRecognizerConfig &config)\n    : config_(config) {\n  if (!config.rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fsts, \",\", false, &files);\n    itn_list_.reserve(files.size());\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n      }\n      itn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(f));\n    }\n  }\n\n  if (!config.rule_fars.empty()) {\n    if (config.model_config.debug) {\n      SHERPA_ONNX_LOGE(\"Loading FST archives\");\n    }\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fars, \",\", false, &files);\n\n    itn_list_.reserve(files.size() + itn_list_.size());\n\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n      }\n      std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n          fst::FarReader<fst::StdArc>::Open(f));\n      for (; !reader->Done(); reader->Next()) {\n        std::unique_ptr<fst::StdConstFst> r(\n            fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n        itn_list_.push_back(\n            std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n      }\n    }\n\n    if (config.model_config.debug) {\n      SHERPA_ONNX_LOGE(\"FST archives loaded!\");\n    }\n  }\n\n  if (!config.hr.lexicon.empty() && !config.hr.rule_fsts.empty()) {\n    auto hr_config = config.hr;\n    hr_config.debug = config.model_config.debug;\n    hr_ = std::make_unique<HomophoneReplacer>(hr_config);\n  }\n}\n\ntemplate <typename Manager>\nOnlineRecognizerImpl::OnlineRecognizerImpl(Manager *mgr,\n                                           const OnlineRecognizerConfig 
&config)\n    : config_(config) {\n  if (!config.rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fsts, \",\", false, &files);\n    itn_list_.reserve(files.size());\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule fst: %s\", f.c_str());\n      }\n      auto buf = ReadFile(mgr, f);\n      std::istringstream is(std::string(buf.data(), buf.size()));\n      itn_list_.push_back(std::make_unique<kaldifst::TextNormalizer>(is));\n    }\n  }\n\n  if (!config.rule_fars.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(config.rule_fars, \",\", false, &files);\n    itn_list_.reserve(files.size() + itn_list_.size());\n\n    for (const auto &f : files) {\n      if (config.model_config.debug) {\n        SHERPA_ONNX_LOGE(\"rule far: %s\", f.c_str());\n      }\n\n      auto buf = ReadFile(mgr, f);\n\n      std::unique_ptr<std::istream> s(\n          new std::istringstream(std::string(buf.data(), buf.size())));\n\n      std::unique_ptr<fst::FarReader<fst::StdArc>> reader(\n          fst::FarReader<fst::StdArc>::Open(std::move(s)));\n\n      for (; !reader->Done(); reader->Next()) {\n        std::unique_ptr<fst::StdConstFst> r(\n            fst::CastOrConvertToConstFst(reader->GetFst()->Copy()));\n\n        itn_list_.push_back(\n            std::make_unique<kaldifst::TextNormalizer>(std::move(r)));\n      }  // for (; !reader->Done(); reader->Next())\n    }  // for (const auto &f : files)\n  }  // if (!config.rule_fars.empty())\n  if (!config.hr.lexicon.empty() && !config.hr.rule_fsts.empty()) {\n    auto hr_config = config.hr;\n    hr_config.debug = config.model_config.debug;\n    hr_ = std::make_unique<HomophoneReplacer>(mgr, hr_config);\n  }\n}\n\nstd::string OnlineRecognizerImpl::ApplyInverseTextNormalization(\n    std::string text) const {\n  text = RemoveInvalidUtf8Sequences(text);\n\n  if (!itn_list_.empty()) {\n    for (const auto &tn : itn_list_) 
{\n      text = tn->Normalize(text);\n    }\n  }\n\n  return text;\n}\n\nstd::string OnlineRecognizerImpl::ApplyHomophoneReplacer(\n    std::string text) const {\n  if (hr_) {\n    text = hr_->Apply(text);\n  }\n\n  return text;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineRecognizerImpl::OnlineRecognizerImpl(\n    AAssetManager *mgr, const OnlineRecognizerConfig &config);\n\ntemplate std::unique_ptr<OnlineRecognizerImpl> OnlineRecognizerImpl::Create(\n    AAssetManager *mgr, const OnlineRecognizerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineRecognizerImpl::OnlineRecognizerImpl(\n    NativeResourceManager *mgr, const OnlineRecognizerConfig &config);\n\ntemplate std::unique_ptr<OnlineRecognizerImpl> OnlineRecognizerImpl::Create(\n    NativeResourceManager *mgr, const OnlineRecognizerConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-impl.h",
    "content": "// sherpa-onnx/csrc/online-recognizer-impl.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_IMPL_H_\n\n#include <cstdint>\n#include <cstdlib>\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"kaldifst/csrc/text-normalizer.h\"\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerImpl(const OnlineRecognizerConfig &config);\n\n  static std::unique_ptr<OnlineRecognizerImpl> Create(\n      const OnlineRecognizerConfig &config);\n\n  template <typename Manager>\n  OnlineRecognizerImpl(Manager *mgr, const OnlineRecognizerConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OnlineRecognizerImpl> Create(\n      Manager *mgr, const OnlineRecognizerConfig &config);\n\n  virtual ~OnlineRecognizerImpl() = default;\n\n  virtual std::unique_ptr<OnlineStream> CreateStream() const = 0;\n\n  virtual std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &hotwords) const {\n    SHERPA_ONNX_LOGE(\"Only transducer models support contextual biasing.\");\n    exit(-1);\n  }\n\n  virtual bool IsReady(OnlineStream *s) const = 0;\n\n  virtual void WarmpUpRecognizer(int32_t warmup, int32_t mbs) const {\n    // TODO: extend warm-up support to other models\n    SHERPA_ONNX_LOGE(\"Only zipformer2 model supports Warm up for now.\");\n    exit(-1);\n  }\n\n  virtual void DecodeStreams(OnlineStream **ss, int32_t n) const = 0;\n\n  virtual OnlineRecognizerResult GetResult(OnlineStream *s) const = 0;\n\n  virtual bool IsEndpoint(OnlineStream *s) const = 0;\n\n  virtual void Reset(OnlineStream *s) const = 0;\n\n  std::string ApplyInverseTextNormalization(std::string text) const;\n  std::string ApplyHomophoneReplacer(std::string text) const;\n\n 
private:\n  OnlineRecognizerConfig config_;\n  // for inverse text normalization. Used only if\n  // config.rule_fsts is not empty or\n  // config.rule_fars is not empty\n  std::vector<std::unique_ptr<kaldifst::TextNormalizer>> itn_list_;\n  std::unique_ptr<HomophoneReplacer> hr_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-paraformer-impl.h",
    "content": "// sherpa-onnx/csrc/online-recognizer-paraformer-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n\n#include <algorithm>\n#include <array>\n#include <cmath>\n#include <cstring>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-lm.h\"\n#include \"sherpa-onnx/csrc/online-paraformer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-paraformer-model.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nstatic OnlineRecognizerResult Convert(const OnlineParaformerDecoderResult &src,\n                                      const SymbolTable &sym_table) {\n  OnlineRecognizerResult r;\n  r.tokens.reserve(src.tokens.size());\n\n  std::string text;\n\n  // When the current token ends with \"@@\" we set mergeable to true\n  bool mergeable = false;\n\n  for (int32_t i = 0; i != src.tokens.size(); ++i) {\n    auto sym = sym_table[src.tokens[i]];\n    r.tokens.push_back(sym);\n\n    // Check the size first so that symbols shorter than two characters\n    // (e.g., \"@\" or \"a@\") are never treated as ending with \"@@\"\n    if (sym.size() < 2 || sym.back() != '@' ||\n        sym[sym.size() - 2] != '@') {\n      // sym does not end with \"@@\"\n      const uint8_t *p = reinterpret_cast<const uint8_t *>(sym.c_str());\n      if (p[0] < 0x80) {\n        // an ascii\n        if (mergeable) {\n          mergeable = false;\n          text.append(sym);\n        } else {\n          text.append(\" \");\n          text.append(sym);\n        }\n      } else {\n        // not an ascii\n        mergeable = false;\n\n        if (i > 0) {\n          const uint8_t p = reinterpret_cast<const uint8_t *>(\n              sym_table[src.tokens[i - 1]].c_str())[0];\n          if (p < 0x80) {\n            // put a space between ascii and non-ascii\n     
       text.append(\" \");\n          }\n        }\n        text.append(sym);\n      }\n    } else {\n      // this sym ends with @@\n      sym = std::string(sym.data(), sym.size() - 2);\n      if (mergeable) {\n        text.append(sym);\n      } else {\n        text.append(\" \");\n        text.append(sym);\n        mergeable = true;\n      }\n    }\n  }\n  r.text = std::move(text);\n\n  return r;\n}\n\n// y[i] += x[i] * scale\nstatic void ScaleAddInPlace(const float *x, int32_t n, float scale, float *y) {\n  for (int32_t i = 0; i != n; ++i) {\n    y[i] += x[i] * scale;\n  }\n}\n\n// y[i] = x[i] * scale\nstatic void Scale(const float *x, int32_t n, float scale, float *y) {\n  for (int32_t i = 0; i != n; ++i) {\n    y[i] = x[i] * scale;\n  }\n}\n\nclass OnlineRecognizerParaformerImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerParaformerImpl(const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        model_(config.model_config),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if (config.decoding_method != \"greedy_search\") {\n      SHERPA_ONNX_LOGE(\n          \"Unsupported decoding method: %s. 
Support only greedy_search at \"\n          \"present\",\n          config.decoding_method.c_str());\n      exit(-1);\n    }\n\n    // Paraformer models assume input samples are in the range\n    // [-32768, 32767], so we set normalize_samples to false\n    config_.feat_config.normalize_samples = false;\n  }\n\n  template <typename Manager>\n  explicit OnlineRecognizerParaformerImpl(Manager *mgr,\n                                          const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        model_(mgr, config.model_config),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      sym_ = SymbolTable(mgr, config.model_config.tokens);\n    }\n    if (config.decoding_method != \"greedy_search\") {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config.decoding_method.c_str());\n      exit(-1);\n    }\n\n    // Paraformer models assume input samples are in the range\n    // [-32768, 32767], so we set normalize_samples to false\n    config_.feat_config.normalize_samples = false;\n  }\n\n  OnlineRecognizerParaformerImpl(const OnlineRecognizerParaformerImpl &) =\n      delete;\n\n  OnlineRecognizerParaformerImpl operator=(\n      const OnlineRecognizerParaformerImpl &) = delete;\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStream>(config_.feat_config);\n\n    OnlineParaformerDecoderResult r;\n    stream->SetParaformerResult(r);\n\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    if (s->GetNumProcessedFrames() + chunk_size_ < s->NumFramesReady()) {\n      return true;\n    }\n    // is_final: accept short chunks (less than chunk_size_ frames)\n    // Users should call SetOption(\"is_final\", \"1\") before the last decode.\n    if 
(s->GetOptionInt(\"is_final\", 0) &&\n        s->GetNumProcessedFrames() < s->NumFramesReady()) {\n      return true;\n    }\n    return false;\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    // TODO(fangjun): Support batch size > 1\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    auto decoder_result = s->GetParaformerResult();\n\n    auto r = Convert(decoder_result, sym_);\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if (!config_.enable_endpoint) {\n      return false;\n    }\n\n    const auto &result = s->GetParaformerResult();\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    // frame shift is 10 milliseconds\n    float frame_shift_in_seconds = 0.01;\n\n    int32_t trailing_silence_frames =\n        num_processed_frames - result.last_non_blank_frame_index;\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    // segment is incremented only when the last result is not empty\n    const auto &r = s->GetParaformerResult();\n    if (!r.tokens.empty()) {\n      s->GetCurrentSegment() += 1;\n    }\n\n    OnlineParaformerDecoderResult empty;\n    s->SetParaformerResult(empty);\n\n    s->GetStates().clear();\n    s->GetParaformerEncoderOutCache().clear();\n    s->GetParaformerAlphaCache().clear();\n\n    // s->GetParaformerFeatCache().clear();\n\n    // Note: We only update counters. 
The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n private:\n  void DecodeStream(OnlineStream *s) const {\n    const auto num_processed_frames = s->GetNumProcessedFrames();\n    int32_t available_frames = s->NumFramesReady() - num_processed_frames;\n    bool is_final = s->GetOptionInt(\"is_final\", 0);\n\n    // For the final short chunk (fewer frames than chunk_size_):\n    // read the remaining frames and pad with zeros to chunk_size_.\n    bool is_short_final = is_final && available_frames < chunk_size_;\n\n    std::vector<float> frames =\n        s->GetFrames(num_processed_frames,\n                     is_short_final ? available_frames : chunk_size_);\n\n    if (is_short_final) {\n      int32_t feat_dim_raw = config_.feat_config.feature_dim;\n      frames.resize(chunk_size_ * feat_dim_raw, 0.0f);\n      // Consume all remaining frames (no overlap needed).\n      s->GetNumProcessedFrames() += available_frames;\n    } else {\n      // Normal: advance by chunk_size_ - 1 to keep 1-frame overlap.\n      s->GetNumProcessedFrames() += chunk_size_ - 1;\n    }\n\n    frames = ApplyLFR(frames);\n    ApplyCMVN(&frames);\n    PositionalEncoding(&frames, num_processed_frames / model_.LfrWindowShift());\n\n    int32_t feat_dim = model_.NegativeMean().size();\n\n    // We have scaled inv_stddev by sqrt(encoder_output_size)\n    // so the following line can be commented out\n    // frames *= encoder_output_size ** 0.5\n\n    // add overlap chunk\n    std::vector<float> &feat_cache = s->GetParaformerFeatCache();\n    if (feat_cache.empty()) {\n      int32_t n = (left_chunk_size_ + right_chunk_size_) * feat_dim;\n      feat_cache.resize(n, 0);\n    }\n\n    frames.insert(frames.begin(), feat_cache.begin(), feat_cache.end());\n    std::copy(frames.end() - feat_cache.size(), frames.end(),\n              feat_cache.begin());\n\n    int32_t num_frames = frames.size() / feat_dim;\n\n    auto memory_info =\n        
Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{1, num_frames, feat_dim};\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, frames.data(), frames.size(),\n                                 x_shape.data(), x_shape.size());\n\n    int64_t x_len_shape = 1;\n    int32_t x_len_val = num_frames;\n\n    Ort::Value x_length =\n        Ort::Value::CreateTensor(memory_info, &x_len_val, 1, &x_len_shape, 1);\n\n    auto encoder_out_vec =\n        model_.ForwardEncoder(std::move(x), std::move(x_length));\n\n    // CIF search\n    auto &encoder_out = encoder_out_vec[0];\n    auto &encoder_out_len = encoder_out_vec[1];\n    auto &alpha = encoder_out_vec[2];\n\n    float *p_alpha = alpha.GetTensorMutableData<float>();\n\n    std::vector<int64_t> alpha_shape =\n        alpha.GetTensorTypeAndShapeInfo().GetShape();\n\n    std::fill(p_alpha, p_alpha + left_chunk_size_, 0);\n    std::fill(p_alpha + alpha_shape[1] - right_chunk_size_,\n              p_alpha + alpha_shape[1], 0);\n\n    const float *p_encoder_out = encoder_out.GetTensorData<float>();\n\n    std::vector<int64_t> encoder_out_shape =\n        encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n\n    std::vector<float> &initial_hidden = s->GetParaformerEncoderOutCache();\n    if (initial_hidden.empty()) {\n      initial_hidden.resize(encoder_out_shape[2]);\n    }\n\n    std::vector<float> &alpha_cache = s->GetParaformerAlphaCache();\n    if (alpha_cache.empty()) {\n      alpha_cache.resize(1);\n    }\n\n    std::vector<float> acoustic_embedding;\n    acoustic_embedding.reserve(encoder_out_shape[1] * encoder_out_shape[2]);\n\n    float threshold = 1.0;\n\n    float integrate = alpha_cache[0];\n\n    for (int32_t i = 0; i != encoder_out_shape[1]; ++i) {\n      float this_alpha = p_alpha[i];\n      if (integrate + this_alpha < threshold) {\n        integrate += this_alpha;\n        ScaleAddInPlace(p_encoder_out + i * encoder_out_shape[2],\n               
         encoder_out_shape[2], this_alpha,\n                        initial_hidden.data());\n        continue;\n      }\n\n      // fire\n      ScaleAddInPlace(p_encoder_out + i * encoder_out_shape[2],\n                      encoder_out_shape[2], threshold - integrate,\n                      initial_hidden.data());\n      acoustic_embedding.insert(acoustic_embedding.end(),\n                                initial_hidden.begin(), initial_hidden.end());\n      integrate += this_alpha - threshold;\n\n      Scale(p_encoder_out + i * encoder_out_shape[2], encoder_out_shape[2],\n            integrate, initial_hidden.data());\n    }\n\n    alpha_cache[0] = integrate;\n\n    if (acoustic_embedding.empty()) {\n      return;\n    }\n\n    auto &states = s->GetStates();\n    if (states.empty()) {\n      states.reserve(model_.DecoderNumBlocks());\n\n      std::array<int64_t, 3> shape{1, model_.EncoderOutputSize(),\n                                   model_.DecoderKernelSize() - 1};\n\n      int32_t num_bytes = sizeof(float) * shape[0] * shape[1] * shape[2];\n\n      for (int32_t i = 0; i != model_.DecoderNumBlocks(); ++i) {\n        Ort::Value this_state = Ort::Value::CreateTensor<float>(\n            model_.Allocator(), shape.data(), shape.size());\n\n        memset(this_state.GetTensorMutableData<float>(), 0, num_bytes);\n\n        states.push_back(std::move(this_state));\n      }\n    }\n\n    int32_t num_tokens = acoustic_embedding.size() / initial_hidden.size();\n    std::array<int64_t, 3> acoustic_embedding_shape{\n        1, num_tokens, static_cast<int32_t>(initial_hidden.size())};\n\n    Ort::Value acoustic_embedding_tensor = Ort::Value::CreateTensor(\n        memory_info, acoustic_embedding.data(), acoustic_embedding.size(),\n        acoustic_embedding_shape.data(), acoustic_embedding_shape.size());\n\n    std::array<int64_t, 1> acoustic_embedding_length_shape{1};\n    Ort::Value acoustic_embedding_length_tensor = Ort::Value::CreateTensor(\n        memory_info, 
&num_tokens, 1, acoustic_embedding_length_shape.data(),\n        acoustic_embedding_length_shape.size());\n\n    auto decoder_out_vec = model_.ForwardDecoder(\n        std::move(encoder_out), std::move(encoder_out_len),\n        std::move(acoustic_embedding_tensor),\n        std::move(acoustic_embedding_length_tensor), std::move(states));\n\n    states.reserve(model_.DecoderNumBlocks());\n    for (int32_t i = 2; i != decoder_out_vec.size(); ++i) {\n      // TODO(fangjun): When we change chunk_size_, we need to\n      // slice decoder_out_vec[i] accordingly.\n      states.push_back(std::move(decoder_out_vec[i]));\n    }\n\n    const auto &sample_ids = decoder_out_vec[1];\n    const int64_t *p_sample_ids = sample_ids.GetTensorData<int64_t>();\n\n    bool non_blank_detected = false;\n\n    auto &result = s->GetParaformerResult();\n\n    for (int32_t i = 0; i != num_tokens; ++i) {\n      int32_t t = p_sample_ids[i];\n      if (t == 0) {\n        continue;\n      }\n\n      non_blank_detected = true;\n      result.tokens.push_back(t);\n    }\n\n    if (non_blank_detected) {\n      result.last_non_blank_frame_index = num_processed_frames;\n    }\n  }\n\n  std::vector<float> ApplyLFR(const std::vector<float> &in) const {\n    int32_t lfr_window_size = model_.LfrWindowSize();\n    int32_t lfr_window_shift = model_.LfrWindowShift();\n    int32_t in_feat_dim = config_.feat_config.feature_dim;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(out_num_frames * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  void 
ApplyCMVN(std::vector<float> *v) const {\n    const std::vector<float> &neg_mean = model_.NegativeMean();\n    const std::vector<float> &inv_stddev = model_.InverseStdDev();\n    int dim = static_cast<int>(neg_mean.size());\n    int num_frames = static_cast<int>(v->size()) / dim;\n\n    Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>\n        mat(v->data(), num_frames, dim);\n\n    Eigen::Map<const Eigen::RowVectorXf> neg_mean_vec(neg_mean.data(), dim);\n    Eigen::Map<const Eigen::RowVectorXf> inv_stddev_vec(inv_stddev.data(), dim);\n\n    mat.array() = (mat.array().rowwise() + neg_mean_vec.array()).rowwise() *\n                  inv_stddev_vec.array();\n  }\n\n  void PositionalEncoding(std::vector<float> *v, int32_t t_offset) const {\n    int32_t lfr_window_size = model_.LfrWindowSize();\n    int32_t in_feat_dim = config_.feat_config.feature_dim;\n\n    int32_t feat_dim = in_feat_dim * lfr_window_size;\n    int32_t T = v->size() / feat_dim;\n\n    // log(10000)/(7*80/2-1) == 0.03301197265941284\n    // 7 is lfr_window_size\n    // 80 is in_feat_dim\n    // 7*80 is feat_dim\n    constexpr float kScale = -0.03301197265941284;\n\n    for (int32_t t = 0; t != T; ++t) {\n      float *p = v->data() + t * feat_dim;\n\n      int32_t offset = t + 1 + t_offset;\n\n      for (int32_t d = 0; d < feat_dim / 2; ++d) {\n        float inv_timescale = offset * std::exp(d * kScale);\n\n        float sin_d = std::sin(inv_timescale);\n        float cos_d = std::cos(inv_timescale);\n\n        p[d] += sin_d;\n        p[d + feat_dim / 2] += cos_d;\n      }\n    }\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  OnlineParaformerModel model_;\n  SymbolTable sym_;\n  Endpoint endpoint_;\n\n  // 0.61 seconds\n  int32_t chunk_size_ = 61;\n  // (61 - 7) / 6 + 1 = 10\n\n  int32_t left_chunk_size_ = 5;\n  int32_t right_chunk_size_ = 3;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_PARAFORMER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-transducer-impl.h",
    "content": "// sherpa-onnx/csrc/online-recognizer-transducer-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n\n#include <algorithm>\n#include <ios>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n#include \"sherpa-onnx/csrc/online-lm.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n#include \"sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\nOnlineRecognizerResult Convert(const OnlineTransducerDecoderResult &src,\n                               const SymbolTable &sym_table,\n                               float frame_shift_ms, int32_t subsampling_factor,\n                               int32_t segment, int32_t frames_since_start) {\n  OnlineRecognizerResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.tokens.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    auto sym = sym_table[i];\n    if (sym == \"<unk>\") {\n      continue;\n    }\n\n    text.append(sym);\n\n    if (sym.size() == 1 && (sym[0] < 0x20 || sym[0] > 0x7e)) {\n      // for bpe models with byte_fallback\n      // (but don't rewrite printable characters 0x20..0x7e,\n      //  which collide with standard BPE units)\n      
std::ostringstream os;\n      os << \"<0x\" << std::hex << std::uppercase\n         << (static_cast<int32_t>(sym[0]) & 0xff) << \">\";\n      sym = os.str();\n    }\n\n    r.tokens.push_back(std::move(sym));\n  }\n\n  if (sym_table.IsByteBpe()) {\n    text = sym_table.DecodeByteBpe(text);\n  }\n\n  r.text = std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. * subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  r.ys_probs = std::move(src.ys_probs);\n  r.lm_probs = std::move(src.lm_probs);\n  r.context_scores = std::move(src.context_scores);\n\n  r.segment = segment;\n  r.start_time = frames_since_start * frame_shift_ms / 1000.;\n\n  return r;\n}\n\nclass OnlineRecognizerTransducerImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerTransducerImpl(const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        model_(OnlineTransducerModel::Create(config.model_config)),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    model_->SetFeatureDim(config.feat_config.feature_dim);\n\n    if (config.decoding_method == \"modified_beam_search\") {\n      if (!config_.model_config.bpe_vocab.empty()) {\n        bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(\n            config_.model_config.bpe_vocab);\n      }\n\n      if (!config_.hotwords_buf.empty()) {\n        InitHotwordsFromBufStr();\n      } else if (!config_.hotwords_file.empty()) {\n        InitHotwords();\n      }\n\n      if (!config_.lm_config.model.empty()) {\n        lm_ = 
OnlineLM::Create(config.lm_config);\n      }\n\n      decoder_ = std::make_unique<OnlineTransducerModifiedBeamSearchDecoder>(\n          model_.get(), lm_.get(), config_.max_active_paths,\n          config_.lm_config.scale, config_.lm_config.shallow_fusion, unk_id_,\n          config_.blank_penalty, config_.temperature_scale);\n\n    } else if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchDecoder>(\n          model_.get(), unk_id_, config_.blank_penalty,\n          config_.temperature_scale);\n\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config.decoding_method.c_str());\n      exit(-1);\n    }\n\n    if (model_->UseWhisperFeature()) {\n      config_.feat_config.is_whisper = true;\n    }\n  }\n\n  template <typename Manager>\n  explicit OnlineRecognizerTransducerImpl(Manager *mgr,\n                                          const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        model_(OnlineTransducerModel::Create(mgr, config.model_config)),\n        sym_(mgr, config.model_config.tokens),\n        endpoint_(config_.endpoint_config) {\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    model_->SetFeatureDim(config.feat_config.feature_dim);\n\n    if (config.decoding_method == \"modified_beam_search\") {\n#if 0\n      // TODO(fangjun): Implement it\n      if (!config_.lm_config.model.empty()) {\n        lm_ = OnlineLM::Create(mgr, config.lm_config);\n      }\n#endif\n\n      if (!config_.model_config.bpe_vocab.empty()) {\n        auto buf = ReadFile(mgr, config_.model_config.bpe_vocab);\n        std::istringstream iss(std::string(buf.begin(), buf.end()));\n        bpe_encoder_ = std::make_unique<ssentencepiece::Ssentencepiece>(iss);\n      }\n\n      if (!config_.hotwords_buf.empty()) {\n        InitHotwordsFromBufStr();\n      } else if 
(!config_.hotwords_file.empty()) {\n        InitHotwords(mgr);\n      }\n\n      decoder_ = std::make_unique<OnlineTransducerModifiedBeamSearchDecoder>(\n          model_.get(), lm_.get(), config_.max_active_paths,\n          config_.lm_config.scale, config_.lm_config.shallow_fusion, unk_id_,\n          config_.blank_penalty, config_.temperature_scale);\n\n    } else if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchDecoder>(\n          model_.get(), unk_id_, config_.blank_penalty,\n          config_.temperature_scale);\n\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config.decoding_method.c_str());\n      exit(-1);\n    }\n\n    if (model_->UseWhisperFeature()) {\n      config_.feat_config.is_whisper = true;\n    }\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream =\n        std::make_unique<OnlineStream>(config_.feat_config, hotwords_graph_);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &hotwords) const override {\n    auto hws = std::regex_replace(hotwords, std::regex(\"/\"), \"\\n\");\n    std::istringstream is(hws);\n    std::vector<std::vector<int32_t>> current;\n    std::vector<float> current_scores;\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, sym_,\n                        bpe_encoder_.get(), &current, &current_scores)) {\n      SHERPA_ONNX_LOGE(\"Encode hotwords failed, skipping, hotwords are : %s\",\n                       hotwords.c_str());\n    }\n\n    int32_t num_default_hws = hotwords_.size();\n    int32_t num_hws = current.size();\n\n    current.insert(current.end(), hotwords_.begin(), hotwords_.end());\n\n    if (!current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            
boost_scores_.end());\n    } else if (!current_scores.empty() && boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_default_hws,\n                            config_.hotwords_score);\n    } else if (current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_hws,\n                            config_.hotwords_score);\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else {\n      // Do nothing.\n    }\n\n    auto context_graph = std::make_shared<ContextGraph>(\n        current, config_.hotwords_score, current_scores);\n    auto stream =\n        std::make_unique<OnlineStream>(config_.feat_config, context_graph);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n\n  // Warm up the engine. warmup: number of warm-up runs; mbs: max batch size\n  void WarmpUpRecognizer(int32_t warmup, int32_t mbs) const override {\n    auto max_batch_size = mbs;\n    if (warmup <= 0 || warmup > 100) {\n      return;\n    }\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t feature_dim = config_.feat_config.feature_dim;\n    std::vector<OnlineTransducerDecoderResult> results(max_batch_size);\n    std::vector<float> features_vec(max_batch_size * chunk_size * feature_dim);\n    std::vector<std::vector<Ort::Value>> states_vec(max_batch_size);\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{max_batch_size, chunk_size, feature_dim};\n\n    for (int32_t i = 0; i != max_batch_size; ++i) {\n      states_vec[i] = model_->GetEncoderInitStates();\n      results[i] = decoder_->GetEmptyResult();\n    }\n\n    for (int32_t i = 0; i != warmup; ++i) {\n      auto states = 
model_->StackStates(states_vec);\n      Ort::Value x = Ort::Value::CreateTensor(memory_info, features_vec.data(),\n                                              features_vec.size(),\n                                              x_shape.data(), x_shape.size());\n      auto x_copy = Clone(model_->Allocator(), &x);\n      auto pair = model_->RunEncoder(std::move(x), std::move(states),\n                                     std::move(x_copy));\n      decoder_->Decode(std::move(pair.first), &results);\n    }\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feature_dim = ss[0]->FeatureDim();\n\n    std::vector<OnlineTransducerDecoderResult> results(n);\n    std::vector<float> features_vec(n * chunk_size * feature_dim);\n    std::vector<std::vector<Ort::Value>> states_vec(n);\n    std::vector<int64_t> all_processed_frames(n);\n    bool has_context_graph = false;\n\n    for (int32_t i = 0; i != n; ++i) {\n      if (!has_context_graph && ss[i]->GetContextGraph()) {\n        has_context_graph = true;\n      }\n\n      const auto num_processed_frames = ss[i]->GetNumProcessedFrames();\n      std::vector<float> features =\n          ss[i]->GetFrames(num_processed_frames, chunk_size);\n\n      if (config_.feat_config.is_whisper) {\n        OfflineWhisperModel::NormalizeFeatures(features.data(), chunk_size,\n                                               feature_dim);\n      }\n\n      // Question: should num_processed_frames include chunk_shift?\n      ss[i]->GetNumProcessedFrames() += chunk_shift;\n\n      std::copy(features.begin(), features.end(),\n                features_vec.data() + i * chunk_size * feature_dim);\n\n      results[i] = std::move(ss[i]->GetResult());\n      states_vec[i] = std::move(ss[i]->GetStates());\n      all_processed_frames[i] = num_processed_frames;\n    }\n\n    auto memory_info =\n        
Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{n, chunk_size, feature_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, features_vec.data(),\n                                            features_vec.size(), x_shape.data(),\n                                            x_shape.size());\n\n    std::array<int64_t, 1> processed_frames_shape{\n        static_cast<int64_t>(all_processed_frames.size())};\n\n    Ort::Value processed_frames = Ort::Value::CreateTensor(\n        memory_info, all_processed_frames.data(), all_processed_frames.size(),\n        processed_frames_shape.data(), processed_frames_shape.size());\n\n    auto states = model_->StackStates(states_vec);\n\n    auto pair = model_->RunEncoder(std::move(x), std::move(states),\n                                   std::move(processed_frames));\n\n    if (has_context_graph) {\n      decoder_->Decode(std::move(pair.first), ss, &results);\n    } else {\n      decoder_->Decode(std::move(pair.first), &results);\n    }\n\n    std::vector<std::vector<Ort::Value>> next_states =\n        model_->UnStackStates(pair.second);\n\n    for (int32_t i = 0; i != n; ++i) {\n      ss[i]->SetResult(results[i]);\n      ss[i]->SetStates(std::move(next_states[i]));\n    }\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    OnlineTransducerDecoderResult decoder_result = s->GetResult();\n    decoder_->StripLeadingBlanks(&decoder_result);\n\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    auto r = Convert(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                     s->GetCurrentSegment(), s->GetNumFramesSinceStart());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if 
(!config_.enable_endpoint) {\n      return false;\n    }\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    // frame shift is 10 milliseconds\n    float frame_shift_in_seconds = 0.01;\n\n    // subsampling factor is 4\n    int32_t trailing_silence_frames = s->GetResult().num_trailing_blanks * 4;\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    int32_t context_size = model_->ContextSize();\n\n    {\n      // segment is incremented only when the last result is not empty,\n      // ends with a non-blank token, and is longer than context_size\n      const auto &r = s->GetResult();\n      if (!r.tokens.empty() && r.tokens.back() != 0 &&\n          static_cast<int32_t>(r.tokens.size()) > context_size) {\n        s->GetCurrentSegment() += 1;\n      }\n    }\n\n    auto r = decoder_->GetEmptyResult();\n    auto last_result = s->GetResult();\n\n    if (static_cast<int32_t>(last_result.tokens.size()) > context_size) {\n      // if last result is not empty, then\n      // truncate all last hyps and save as the 'ys' context for next result\n      // (the encoder state buffers are kept)\n      for (const auto &it : last_result.hyps) {\n        auto h = it.second;\n        r.hyps.Add({std::vector<int64_t>(h.ys.end() - context_size, h.ys.end()),\n                    h.log_prob});\n      }\n\n      r.tokens = std::vector<int64_t>(last_result.tokens.end() - context_size,\n                                      last_result.tokens.end());\n    } else {\n      if (config_.reset_encoder) {\n        // reset encoder states, use blanks as 'ys' context\n        s->SetStates(model_->GetEncoderInitStates());\n      }\n    }\n\n    // but reset all contextual biasing graph states to root\n    if (config_.decoding_method == \"modified_beam_search\" &&\n        nullptr != s->GetContextGraph()) {\n      for (auto it = r.hyps.begin(); it != r.hyps.end(); 
++it) {\n        it->second.context_state = s->GetContextGraph()->Root();\n      }\n    }\n\n    s->SetResult(r);\n\n    // Note: We only update counters. The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n private:\n  void InitHotwords() {\n    // each line in hotwords_file contains space-separated words\n\n    std::ifstream is(config_.hotwords_file);\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Open hotwords file failed: %s\",\n                       config_.hotwords_file.c_str());\n      exit(-1);\n    }\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, sym_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to encode some hotwords, skip them already, see logs above \"\n          \"for details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n  template <typename Manager>\n  void InitHotwords(Manager *mgr) {\n    // each line in hotwords_file contains space-separated words\n\n    auto buf = ReadFile(mgr, config_.hotwords_file);\n\n    std::istringstream is(std::string(buf.begin(), buf.end()));\n\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Open hotwords file failed: %s\",\n                       config_.hotwords_file.c_str());\n      exit(-1);\n    }\n\n    if (!EncodeHotwords(is, config_.model_config.modeling_unit, sym_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to encode some hotwords, skip them already, see logs above \"\n          \"for details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n  void InitHotwordsFromBufStr() {\n    // each line in hotwords_file contains space-separated words\n\n    std::istringstream iss(config_.hotwords_buf);\n    if (!EncodeHotwords(iss, 
config_.model_config.modeling_unit, sym_,\n                        bpe_encoder_.get(), &hotwords_, &boost_scores_)) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to encode some hotwords, skip them already, see logs above \"\n          \"for details.\");\n    }\n    hotwords_graph_ = std::make_shared<ContextGraph>(\n        hotwords_, config_.hotwords_score, boost_scores_);\n  }\n\n  void InitOnlineStream(OnlineStream *stream) const {\n    auto r = decoder_->GetEmptyResult();\n\n    if (config_.decoding_method == \"modified_beam_search\" &&\n        nullptr != stream->GetContextGraph()) {\n      // r.hyps has only one element.\n      for (auto it = r.hyps.begin(); it != r.hyps.end(); ++it) {\n        it->second.context_state = stream->GetContextGraph()->Root();\n      }\n    }\n\n    stream->SetResult(r);\n    stream->SetStates(model_->GetEncoderInitStates());\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  std::vector<std::vector<int32_t>> hotwords_;\n  std::vector<float> boost_scores_;\n  ContextGraphPtr hotwords_graph_;\n  std::unique_ptr<ssentencepiece::Ssentencepiece> bpe_encoder_;\n  std::unique_ptr<OnlineTransducerModel> model_;\n  std::unique_ptr<OnlineLM> lm_;\n  std::unique_ptr<OnlineTransducerDecoder> decoder_;\n  SymbolTable sym_;\n  Endpoint endpoint_;\n  int32_t unk_id_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h",
    "content": "// sherpa-onnx/csrc/online-recognizer-transducer-nemo-impl.h\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n// Copyright (c)  2024  Sangeet Sagar\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n\n#include <algorithm>\n#include <fstream>\n#include <ios>\n#include <memory>\n#include <regex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-nemo-model.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ./online-recognizer-transducer-impl.h\nOnlineRecognizerResult Convert(const OnlineTransducerDecoderResult &src,\n                               const SymbolTable &sym_table,\n                               float frame_shift_ms, int32_t subsampling_factor,\n                               int32_t segment, int32_t frames_since_start);\n\nclass OnlineRecognizerTransducerNeMoImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerTransducerNeMoImpl(\n      const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        endpoint_(config_.endpoint_config),\n        model_(\n            std::make_unique<OnlineTransducerNeMoModel>(config.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      symbol_table_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      symbol_table_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if 
(config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchNeMoDecoder>(\n          model_.get(), config_.blank_penalty);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config.decoding_method.c_str());\n      exit(-1);\n    }\n    PostInit();\n  }\n\n  template <typename Manager>\n  explicit OnlineRecognizerTransducerNeMoImpl(\n      Manager *mgr, const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        endpoint_(config_.endpoint_config),\n        model_(std::make_unique<OnlineTransducerNeMoModel>(\n            mgr, config.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      symbol_table_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      symbol_table_ = SymbolTable(mgr, config.model_config.tokens);\n    }\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchNeMoDecoder>(\n          model_.get(), config_.blank_penalty);\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported decoding method: %s\",\n                       config.decoding_method.c_str());\n      exit(-1);\n    }\n\n    PostInit();\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStream>(config_.feat_config);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = model_->SubsamplingFactor();\n    auto r = Convert(s->GetResult(), symbol_table_, frame_shift_ms,\n                     subsampling_factor, 
s->GetCurrentSegment(),\n                     s->GetNumFramesSinceStart());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if (!config_.enable_endpoint) {\n      return false;\n    }\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    // frame shift is 10 milliseconds\n    float frame_shift_in_seconds = 0.01;\n\n    int32_t trailing_silence_frames =\n        s->GetResult().num_trailing_blanks * model_->SubsamplingFactor();\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    {\n      // segment is incremented only when the last\n      // result is not empty\n      const auto &r = s->GetResult();\n      if (!r.tokens.empty()) {\n        s->GetCurrentSegment() += 1;\n      }\n    }\n\n    s->SetResult({});\n\n    s->SetStates(model_->GetEncoderInitStates());\n\n    s->SetNeMoDecoderStates(model_->GetDecoderInitStates());\n\n    // Note: We only update counters. 
The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feature_dim = ss[0]->FeatureDim();\n\n    std::vector<float> features_vec(n * chunk_size * feature_dim);\n    std::vector<std::vector<Ort::Value>> encoder_states(n);\n\n    for (int32_t i = 0; i != n; ++i) {\n      const auto num_processed_frames = ss[i]->GetNumProcessedFrames();\n      std::vector<float> features =\n          ss[i]->GetFrames(num_processed_frames, chunk_size);\n\n      // Question: should num_processed_frames include chunk_shift?\n      ss[i]->GetNumProcessedFrames() += chunk_shift;\n\n      std::copy(features.begin(), features.end(),\n                features_vec.data() + i * chunk_size * feature_dim);\n\n      encoder_states[i] = std::move(ss[i]->GetStates());\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{n, chunk_size, feature_dim};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, features_vec.data(),\n                                            features_vec.size(), x_shape.data(),\n                                            x_shape.size());\n\n    auto states = model_->StackStates(std::move(encoder_states));\n    int32_t num_states = states.size();  // num_states = 3\n    auto t = model_->RunEncoder(std::move(x), std::move(states));\n    // t[0] encoder_out, float tensor, (batch_size, dim, T)\n    // t[1] next states\n\n    std::vector<Ort::Value> out_states;\n    out_states.reserve(num_states);\n\n    for (int32_t k = 1; k != num_states + 1; ++k) {\n      out_states.push_back(std::move(t[k]));\n    }\n\n    auto unstacked_states = model_->UnStackStates(std::move(out_states));\n    for (int32_t i = 0; i != n; ++i) {\n      ss[i]->SetStates(std::move(unstacked_states[i]));\n    
}\n\n    Ort::Value encoder_out = Transpose12(model_->Allocator(), &t[0]);\n\n    decoder_->Decode(std::move(encoder_out), ss, n);\n  }\n\n  void InitOnlineStream(OnlineStream *stream) const {\n    // set encoder states\n    stream->SetStates(model_->GetEncoderInitStates());\n\n    // set decoder states\n    stream->SetNeMoDecoderStates(model_->GetDecoderInitStates());\n  }\n\n private:\n  void PostInit() {\n    config_.feat_config.feature_dim = model_->FeatureDim();\n\n    config_.feat_config.low_freq = 0;\n    config_.feat_config.high_freq = 8000;\n    config_.feat_config.is_librosa = true;\n    config_.feat_config.remove_dc_offset = false;\n    config_.feat_config.window_type = \"hann\";\n    config_.feat_config.dither = 0;\n    config_.feat_config.nemo_normalize_type =\n        model_->FeatureNormalizationMethod();\n\n    int32_t vocab_size = model_->VocabSize();\n\n    // check the blank ID\n    if (!symbol_table_.Contains(\"<blk>\")) {\n      SHERPA_ONNX_LOGE(\"tokens.txt does not include the blank token <blk>\");\n      exit(-1);\n    }\n\n    if (symbol_table_[\"<blk>\"] != vocab_size - 1) {\n      SHERPA_ONNX_LOGE(\"<blk> is not the last token!\");\n      exit(-1);\n    }\n\n    if (symbol_table_.NumSymbols() != vocab_size) {\n      SHERPA_ONNX_LOGE(\"number of lines in tokens.txt %d != %d (vocab_size)\",\n                       symbol_table_.NumSymbols(), vocab_size);\n      exit(-1);\n    }\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OnlineTransducerNeMoModel> model_;\n  std::unique_ptr<OnlineTransducerGreedySearchNeMoDecoder> decoder_;\n  Endpoint endpoint_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_TRANSDUCER_NEMO_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer.cc",
    "content": "// sherpa-onnx/csrc/online-recognizer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n// Copyright (c)  2023  Pingfeng Luo\n\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <iomanip>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\n/// Helper for `OnlineRecognizerResult::AsJsonString()`\ntemplate <typename T>\nstd::string VecToString(const std::vector<T> &vec, int32_t precision = 6) {\n  std::ostringstream oss;\n  if (precision != 0) {\n    oss << std::fixed << std::setprecision(precision);\n  }\n  oss << \"[\";\n  std::string sep = \"\";\n  for (const auto &item : vec) {\n    oss << sep << item;\n    sep = \", \";\n  }\n  oss << \"]\";\n  return oss.str();\n}\n\n/// Helper for `OnlineRecognizerResult::AsJsonString()`\ntemplate <>  // explicit specialization for T = std::string\nstd::string VecToString<std::string>(const std::vector<std::string> &vec,\n                                     int32_t) {  // ignore 2nd arg\n  std::ostringstream oss;\n  oss << \"[\";\n  std::string sep = \"\";\n  for (const auto &item : vec) {\n    oss << sep << std::quoted(item);\n    sep = \", \";\n  }\n  oss << \"]\";\n  return oss.str();\n}\n\n}  // namespace\n\nstd::string OnlineRecognizerResult::AsJsonString() const {\n  std::ostringstream os;\n  os << \"{ \";\n  os << \"\\\"text\\\": \" << std::quoted(text) << \", \";\n  os << \"\\\"tokens\\\": \" << VecToString(tokens) << \", \";\n  os << \"\\\"timestamps\\\": \" << VecToString(timestamps, 2) << \", \";\n  os << \"\\\"ys_probs\\\": \" << 
VecToString(ys_probs, 6) << \", \";\n  os << \"\\\"lm_probs\\\": \" << VecToString(lm_probs, 6) << \", \";\n  os << \"\\\"context_scores\\\": \" << VecToString(context_scores, 6) << \", \";\n  os << \"\\\"segment\\\": \" << segment << \", \";\n  os << \"\\\"words\\\": \" << VecToString(words, 0) << \", \";\n  os << \"\\\"start_time\\\": \" << std::fixed << std::setprecision(2) << start_time\n     << \", \";\n  os << \"\\\"is_final\\\": \" << (is_final ? \"true\" : \"false\") << \", \";\n  os << \"\\\"is_eof\\\": \" << (is_eof ? \"true\" : \"false\");\n  os << \"}\";\n  return os.str();\n}\n\nvoid OnlineRecognizerConfig::Register(ParseOptions *po) {\n  feat_config.Register(po);\n  model_config.Register(po);\n  endpoint_config.Register(po);\n  lm_config.Register(po);\n  ctc_fst_decoder_config.Register(po);\n  hr.Register(po);\n\n  po->Register(\"enable-endpoint\", &enable_endpoint,\n               \"True to enable endpoint detection. False to disable it.\");\n  po->Register(\"max-active-paths\", &max_active_paths,\n               \"beam size used in modified beam search.\");\n  po->Register(\"blank-penalty\", &blank_penalty,\n               \"The penalty applied on blank symbol during decoding. \"\n               \"Note: It is a positive value. \"\n               \"Increasing value will lead to lower deletion at the cost \"\n               \"of higher insertions. \"\n               \"Currently only applicable for transducer models.\");\n  po->Register(\"hotwords-score\", &hotwords_score,\n               \"The bonus score for each token in context word/phrase. 
\"\n               \"Used only when decoding_method is modified_beam_search\");\n  po->Register(\n      \"hotwords-file\", &hotwords_file,\n      \"The file containing hotwords, one word/phrase per line. For example: \"\n      \"HELLO WORLD \"\n      \"你好世界\");\n  po->Register(\"decoding-method\", &decoding_method,\n               \"Decoding method. \"\n               \"Supported values: greedy_search and modified_beam_search.\");\n  po->Register(\"temperature-scale\", &temperature_scale,\n               \"Temperature scale for confidence computation in decoding.\");\n  po->Register(\n      \"rule-fsts\", &rule_fsts,\n      \"If not empty, it specifies fsts for inverse text normalization. \"\n      \"If there are multiple fsts, they are separated by a comma.\");\n\n  po->Register(\n      \"rule-fars\", &rule_fars,\n      \"If not empty, it specifies fst archives for inverse text normalization. \"\n      \"If there are multiple archives, they are separated by a comma.\");\n\n  po->Register(\"reset-encoder\", &reset_encoder,\n               \"True to reset encoder_state on an endpoint after empty segment.\"\n               \" Done in `Reset()` method, after an endpoint was detected.\");\n}\n\nbool OnlineRecognizerConfig::Validate() const {\n  if (decoding_method == \"modified_beam_search\") {\n    if (max_active_paths <= 0) {\n      SHERPA_ONNX_LOGE(\"max_active_paths must be > 0. Given: %d\",\n                       max_active_paths);\n      return false;\n    }\n  }\n\n  if (decoding_method == \"modified_beam_search\" && !lm_config.model.empty()) {\n    if (!lm_config.Validate()) {\n      return false;\n    }\n  }\n\n  if (!hotwords_file.empty() && decoding_method != \"modified_beam_search\") {\n    SHERPA_ONNX_LOGE(\n        \"Please use --decoding-method=modified_beam_search if you\"\n        \" provide --hotwords-file. 
Given --decoding-method=%s\",\n        decoding_method.c_str());\n    return false;\n  }\n\n  if (!ctc_fst_decoder_config.graph.empty() &&\n      !ctc_fst_decoder_config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors in ctc_fst_decoder_config\");\n    return false;\n  }\n\n  if (!hotwords_file.empty() && !FileExists(hotwords_file)) {\n    SHERPA_ONNX_LOGE(\"--hotwords-file: '%s' does not exist\",\n                     hotwords_file.c_str());\n    return false;\n  }\n\n  if (!rule_fsts.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fsts, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule fst '%s' does not exist. \", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!rule_fars.empty()) {\n    std::vector<std::string> files;\n    SplitStringToVector(rule_fars, \",\", false, &files);\n    for (const auto &f : files) {\n      if (!FileExists(f)) {\n        SHERPA_ONNX_LOGE(\"Rule far '%s' does not exist. \", f.c_str());\n        return false;\n      }\n    }\n  }\n\n  if (!hr.lexicon.empty() && !hr.rule_fsts.empty() && !hr.Validate()) {\n    return false;\n  }\n\n  return model_config.Validate();\n}\n\nstd::string OnlineRecognizerConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineRecognizerConfig(\";\n  os << \"feat_config=\" << feat_config.ToString() << \", \";\n  os << \"model_config=\" << model_config.ToString() << \", \";\n  os << \"lm_config=\" << lm_config.ToString() << \", \";\n  os << \"endpoint_config=\" << endpoint_config.ToString() << \", \";\n  os << \"ctc_fst_decoder_config=\" << ctc_fst_decoder_config.ToString() << \", \";\n  os << \"enable_endpoint=\" << (enable_endpoint ? 
\"True\" : \"False\") << \", \";\n  os << \"max_active_paths=\" << max_active_paths << \", \";\n  os << \"hotwords_score=\" << hotwords_score << \", \";\n  os << \"hotwords_file=\\\"\" << hotwords_file << \"\\\", \";\n  os << \"decoding_method=\\\"\" << decoding_method << \"\\\", \";\n  os << \"blank_penalty=\" << blank_penalty << \", \";\n  os << \"temperature_scale=\" << temperature_scale << \", \";\n  os << \"rule_fsts=\\\"\" << rule_fsts << \"\\\", \";\n  os << \"rule_fars=\\\"\" << rule_fars << \"\\\", \";\n  os << \"reset_encoder=\" << (reset_encoder ? \"True\" : \"False\") << \", \";\n  os << \"hr=\" << hr.ToString() << \")\";\n\n  return os.str();\n}\n\nOnlineRecognizer::OnlineRecognizer(const OnlineRecognizerConfig &config)\n    : impl_(OnlineRecognizerImpl::Create(config)) {}\n\ntemplate <typename Manager>\nOnlineRecognizer::OnlineRecognizer(Manager *mgr,\n                                   const OnlineRecognizerConfig &config)\n    : impl_(OnlineRecognizerImpl::Create(mgr, config)) {}\n\nOnlineRecognizer::~OnlineRecognizer() = default;\n\nstd::unique_ptr<OnlineStream> OnlineRecognizer::CreateStream() const {\n  return impl_->CreateStream();\n}\n\nstd::unique_ptr<OnlineStream> OnlineRecognizer::CreateStream(\n    const std::string &hotwords) const {\n  return impl_->CreateStream(hotwords);\n}\n\nbool OnlineRecognizer::IsReady(OnlineStream *s) const {\n  return impl_->IsReady(s);\n}\n\nvoid OnlineRecognizer::WarmpUpRecognizer(int32_t warmup, int32_t mbs) const {\n  if (warmup > 0) {\n    impl_->WarmpUpRecognizer(warmup, mbs);\n  }\n}\n\nvoid OnlineRecognizer::DecodeStreams(OnlineStream **ss, int32_t n) const {\n  impl_->DecodeStreams(ss, n);\n}\n\nOnlineRecognizerResult OnlineRecognizer::GetResult(OnlineStream *s) const {\n  return impl_->GetResult(s);\n}\n\nbool OnlineRecognizer::IsEndpoint(OnlineStream *s) const {\n  return impl_->IsEndpoint(s);\n}\n\nvoid OnlineRecognizer::Reset(OnlineStream *s) const { impl_->Reset(s); }\n\n#if __ANDROID_API__ >= 
9\ntemplate OnlineRecognizer::OnlineRecognizer(\n    AAssetManager *mgr, const OnlineRecognizerConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineRecognizer::OnlineRecognizer(\n    NativeResourceManager *mgr, const OnlineRecognizerConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-recognizer.h",
    "content": "// sherpa-onnx/csrc/online-recognizer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/endpoint.h\"\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder-config.h\"\n#include \"sherpa-onnx/csrc/online-lm-config.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineRecognizerResult {\n  /// Recognition results.\n  /// For English, it consists of space separated words.\n  /// For Chinese, it consists of Chinese words without spaces.\n  /// Example 1: \"hello world\"\n  /// Example 2: \"你好世界\"\n  std::string text;\n\n  /// Decoded results at the token level.\n  /// For instance, for BPE-based models it consists of a list of BPE tokens.\n  std::vector<std::string> tokens;\n\n  /// timestamps.size() == tokens.size()\n  /// timestamps[i] records the time in seconds when tokens[i] is decoded.\n  std::vector<float> timestamps;\n\n  std::vector<float> ys_probs;  //< log-prob scores from ASR model\n  std::vector<float> lm_probs;  //< log-prob scores from language model\n                                //\n  /// log-domain scores from \"hot-phrase\" contextual boosting\n  std::vector<float> context_scores;\n\n  std::vector<int32_t> words;\n\n  /// ID of this segment\n  /// When an endpoint is detected, it is incremented\n  int32_t segment = 0;\n\n  /// Starting time of this segment.\n  /// When an endpoint is detected, it will change\n  float start_time = 0;\n\n  /// True if the end of this segment is reached, i.e., an endpoint is detected\n  /// used only in 
./online-websocket-server-impl.cc\n  bool is_final = false;\n\n  /// used only in ./online-websocket-server-impl.cc\n  /// If it is true, it means the server has processed all received samples\n  bool is_eof = false;\n\n  /** Return a json string.\n   *\n   * The returned string contains:\n   *   {\n   *     \"text\": \"The recognition result\",\n   *     \"tokens\": [x, x, x],\n   *     \"timestamps\": [x, x, x],\n   *     \"ys_probs\": [x, x, x],\n   *     \"lm_probs\": [x, x, x],\n   *     \"context_scores\": [x, x, x],\n   *     \"segment\": x,\n   *     \"start_time\": x,\n   *     \"is_final\": true|false\n   *     \"is_eof\": true|false\n   *   }\n   */\n  std::string AsJsonString() const;\n};\n\nstruct OnlineRecognizerConfig {\n  FeatureExtractorConfig feat_config;\n  OnlineModelConfig model_config;\n  OnlineLMConfig lm_config;\n  EndpointConfig endpoint_config;\n  OnlineCtcFstDecoderConfig ctc_fst_decoder_config;\n\n  bool enable_endpoint = true;\n\n  std::string decoding_method = \"greedy_search\";\n  // now support modified_beam_search and greedy_search\n\n  // used only for modified_beam_search\n  int32_t max_active_paths = 4;\n\n  /// used only for modified_beam_search\n  std::string hotwords_file;\n  float hotwords_score = 1.5;\n\n  float blank_penalty = 0.0;\n\n  float temperature_scale = 2.0;\n\n  // If there are multiple rules, they are applied from left to right.\n  std::string rule_fsts;\n\n  // If there are multiple FST archives, they are applied from left to right.\n  std::string rule_fars;\n\n  // True to reset encoder_state on an endpoint after empty segment.\n  // Done in `Reset()` method, after an endpoint was detected,\n  // currently only in `OnlineRecognizerTransducerImpl`.\n  bool reset_encoder = false;\n\n  HomophoneReplacerConfig hr;\n\n  /// used only for modified_beam_search, if hotwords_buf is non-empty,\n  /// the hotwords will be loaded from the buffered string instead of from the\n  /// \"hotwords_file\"\n  std::string 
hotwords_buf;\n\n  OnlineRecognizerConfig() = default;\n\n  OnlineRecognizerConfig(\n      const FeatureExtractorConfig &feat_config,\n      const OnlineModelConfig &model_config, const OnlineLMConfig &lm_config,\n      const EndpointConfig &endpoint_config,\n      const OnlineCtcFstDecoderConfig &ctc_fst_decoder_config,\n      bool enable_endpoint, const std::string &decoding_method,\n      int32_t max_active_paths, const std::string &hotwords_file,\n      float hotwords_score, float blank_penalty, float temperature_scale,\n      const std::string &rule_fsts, const std::string &rule_fars,\n      bool reset_encoder, const HomophoneReplacerConfig &hr,\n      const std::string &hotwords_buf = {})\n      : feat_config(feat_config),\n        model_config(model_config),\n        lm_config(lm_config),\n        endpoint_config(endpoint_config),\n        ctc_fst_decoder_config(ctc_fst_decoder_config),\n        enable_endpoint(enable_endpoint),\n        decoding_method(decoding_method),\n        max_active_paths(max_active_paths),\n        hotwords_file(hotwords_file),\n        hotwords_score(hotwords_score),\n        blank_penalty(blank_penalty),\n        temperature_scale(temperature_scale),\n        rule_fsts(rule_fsts),\n        rule_fars(rule_fars),\n        reset_encoder(reset_encoder),\n        hr(hr),\n        hotwords_buf(hotwords_buf) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OnlineRecognizerImpl;\n\nclass OnlineRecognizer {\n public:\n  explicit OnlineRecognizer(const OnlineRecognizerConfig &config);\n\n  template <typename Manager>\n  OnlineRecognizer(Manager *mgr, const OnlineRecognizerConfig &config);\n\n  ~OnlineRecognizer();\n\n  /// Create a stream for decoding.\n  std::unique_ptr<OnlineStream> CreateStream() const;\n\n  /** Create a stream for decoding.\n   *\n   *  @param The hotwords for this string, it might contain several hotwords,\n   *         the hotwords are separated by 
\"/\". In each of the hotwords, there\n   *         are cjkchars or bpes, the bpe/cjkchar are separated by space (\" \").\n   *         For example, hotwords I LOVE YOU and HELLO WORLD, looks like:\n   *\n   *         \"▁I ▁LOVE ▁YOU/▁HE LL O ▁WORLD\"\n   */\n  std::unique_ptr<OnlineStream> CreateStream(const std::string &hotwords) const;\n\n  /**\n   * Return true if the given stream has enough frames for decoding.\n   * Return false otherwise\n   */\n  bool IsReady(OnlineStream *s) const;\n\n  /** Decode a single stream. */\n  void DecodeStream(OnlineStream *s) const {\n    OnlineStream *ss[1] = {s};\n    DecodeStreams(ss, 1);\n  }\n\n  /**\n   * Warmups up onnxruntime sessions by apply optimization and\n   * allocating memory prior\n   *\n   * @param warmup Number of warmups.\n   * @param mbs : max-batch-size Max batch size for the models\n   */\n  void WarmpUpRecognizer(int32_t warmup, int32_t mbs) const;\n\n  /** Decode multiple streams in parallel\n   *\n   * @param ss Pointer array containing streams to be decoded.\n   * @param n Number of streams in `ss`.\n   */\n  void DecodeStreams(OnlineStream **ss, int32_t n) const;\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const;\n\n  // Return true if we detect an endpoint for this stream.\n  // Note: If this function returns true, you usually want to\n  // invoke Reset(s).\n  bool IsEndpoint(OnlineStream *s) const;\n\n  // Clear the state of this stream. If IsEndpoint(s) returns true,\n  // after calling this function, IsEndpoint(s) will return false\n  void Reset(OnlineStream *s) const;\n\n private:\n  std::unique_ptr<OnlineRecognizerImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RECOGNIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-rnn-lm.cc",
    "content": "// sherpa-onnx/csrc/on-rnn-lm.cc\n//\n// Copyright (c)  2023  Pingfeng Luo\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-rnn-lm.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/lodr-fst.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineRnnLM::Impl {\n public:\n  explicit Impl(const OnlineLMConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_{GetSessionOptions(config)},\n        allocator_{} {\n    Init(config);\n  }\n\n  // shallow fusion scoring function\n  void ComputeLMScoreSF(float scale, Hypothesis *hyp) {\n    if (hyp->nn_lm_states.empty()) {\n      auto init_states = GetInitStatesSF();\n      hyp->nn_lm_scores.value = std::move(init_states.first);\n      hyp->nn_lm_states = Convert(std::move(init_states.second));\n      // if LODR enabled, we need to initialize the LODR state\n      if (lodr_fst_ != nullptr) {\n        hyp->lodr_state = std::make_unique<LodrStateCost>(lodr_fst_.get());\n      }\n    }\n\n    // get lm score for cur token given the hyp->ys[:-1] and save to lm_log_prob\n    const float *nn_lm_scores = hyp->nn_lm_scores.value.GetTensorData<float>();\n    hyp->lm_log_prob += nn_lm_scores[hyp->ys.back()] * scale;\n\n    // if LODR enabled, we need to update the LODR state\n    if (lodr_fst_ != nullptr) {\n      auto next_lodr_state = std::make_unique<LodrStateCost>(\n          hyp->lodr_state->ForwardOneStep(hyp->ys.back()));\n      // calculate the score of the latest token\n      auto score = next_lodr_state->Score() - hyp->lodr_state->Score();\n      hyp->lodr_state = std::move(next_lodr_state);\n      // 
apply LODR to hyp score\n      hyp->lm_log_prob += score * config_.lodr_scale;\n    }\n\n    // get lm scores for next tokens given the hyp->ys[:] and save to\n    // nn_lm_scores\n    std::array<int64_t, 2> x_shape{1, 1};\n    Ort::Value x = Ort::Value::CreateTensor<int64_t>(allocator_, x_shape.data(),\n                                                     x_shape.size());\n    *x.GetTensorMutableData<int64_t>() = hyp->ys.back();\n    auto lm_out = ScoreToken(std::move(x), Convert(hyp->nn_lm_states));\n    hyp->nn_lm_scores.value = std::move(lm_out.first);\n    hyp->nn_lm_states = Convert(std::move(lm_out.second));\n  }\n\n  // classic rescore function\n  void ComputeLMScore(float scale, int32_t context_size,\n                      std::vector<Hypotheses> *hyps) {\n    Ort::AllocatorWithDefaultOptions allocator;\n\n    for (auto &hyp : *hyps) {\n      for (auto &h_m : hyp) {\n        auto &h = h_m.second;\n        auto &ys = h.ys;\n        const int32_t token_num_in_chunk =\n            ys.size() - context_size - h.cur_scored_pos - 1;\n\n        if (token_num_in_chunk < 1) {\n          continue;\n        }\n\n        if (h.nn_lm_states.empty()) {\n          h.nn_lm_states = Convert(GetInitStates());\n        }\n\n        if (token_num_in_chunk >= h.lm_rescore_min_chunk) {\n          std::array<int64_t, 2> x_shape{1, token_num_in_chunk};\n\n          Ort::Value x = Ort::Value::CreateTensor<int64_t>(\n              allocator, x_shape.data(), x_shape.size());\n          int64_t *p_x = x.GetTensorMutableData<int64_t>();\n          std::copy(ys.begin() + context_size + h.cur_scored_pos, ys.end() - 1,\n                    p_x);\n\n          // streaming forward by NN LM\n          auto out =\n              ScoreToken(std::move(x), Convert(std::move(h.nn_lm_states)));\n\n          // update NN LM score in hyp\n          const float *p_nll = out.first.GetTensorData<float>();\n          h.lm_log_prob = -scale * (*p_nll);\n\n          // apply LODR to hyp score\n          if 
(lodr_fst_ != nullptr) {\n            // We scale LODR scale with LM scale to replicate Icefall code\n            lodr_fst_->ComputeScore(config_.lodr_scale * scale, &h,\n                                    context_size);\n          }\n\n          // update NN LM states in hyp\n          h.nn_lm_states = Convert(std::move(out.second));\n\n          h.cur_scored_pos += token_num_in_chunk;\n        }\n      }\n    }\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> ScoreToken(\n      Ort::Value x, std::vector<Ort::Value> states) {\n    std::array<Ort::Value, 3> inputs = {std::move(x), std::move(states[0]),\n                                        std::move(states[1])};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<Ort::Value> next_states;\n    next_states.reserve(2);\n    next_states.push_back(std::move(out[1]));\n    next_states.push_back(std::move(out[2]));\n\n    return {std::move(out[0]), std::move(next_states)};\n  }\n\n  // get init states for shallow fusion\n  std::pair<Ort::Value, std::vector<Ort::Value>> GetInitStatesSF() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(init_states_.size());\n    for (auto &s : init_states_) {\n      ans.emplace_back(View(&s));\n    }\n    return {View(&init_scores_.value), std::move(ans)};\n  }\n\n  // get init states for classic rescore\n  std::vector<Ort::Value> GetInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(init_states_.size());\n\n    for (const auto &s : init_states_) {\n      ans.emplace_back(Clone(allocator_, &s));\n    }\n\n    return ans;\n  }\n\n private:\n  void Init(const OnlineLMConfig &config) {\n    auto buf = ReadFile(config_.model);\n\n    sess_ = std::make_unique<Ort::Session>(env_, buf.data(), buf.size(),\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n   
 GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(rnn_num_layers_, \"num_layers\");\n    SHERPA_ONNX_READ_META_DATA(rnn_hidden_size_, \"hidden_size\");\n    SHERPA_ONNX_READ_META_DATA(sos_id_, \"sos_id\");\n\n    ComputeInitStates();\n\n    if (!config_.lodr_fst.empty()) {\n      lodr_fst_ = std::make_unique<LodrFst>(\n          LodrFst(config_.lodr_fst, config_.lodr_backoff_id));\n    }\n  }\n\n  void ComputeInitStates() {\n    constexpr int32_t kBatchSize = 1;\n    std::array<int64_t, 3> h_shape{rnn_num_layers_, kBatchSize,\n                                   rnn_hidden_size_};\n    std::array<int64_t, 3> c_shape{rnn_num_layers_, kBatchSize,\n                                   rnn_hidden_size_};\n    Ort::Value h = Ort::Value::CreateTensor<float>(allocator_, h_shape.data(),\n                                                   h_shape.size());\n    Ort::Value c = Ort::Value::CreateTensor<float>(allocator_, c_shape.data(),\n                                                   c_shape.size());\n    Fill<float>(&h, 0);\n    Fill<float>(&c, 0);\n    std::array<int64_t, 2> x_shape{1, 1};\n    Ort::Value x = Ort::Value::CreateTensor<int64_t>(allocator_, x_shape.data(),\n                                                     x_shape.size());\n    *x.GetTensorMutableData<int64_t>() = sos_id_;\n\n    std::vector<Ort::Value> states;\n    states.push_back(std::move(h));\n    states.push_back(std::move(c));\n    auto pair = ScoreToken(std::move(x), std::move(states));\n\n    init_scores_.value = std::move(pair.first);  // only used during\n                                                 // shallow fusion\n    init_states_ = std::move(pair.second);\n  }\n\n private:\n  OnlineLMConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions 
allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  CopyableOrtValue init_scores_;\n  std::vector<Ort::Value> init_states_;\n\n  int32_t rnn_num_layers_ = 2;\n  int32_t rnn_hidden_size_ = 512;\n  int32_t sos_id_ = 1;\n\n  std::unique_ptr<LodrFst> lodr_fst_;\n};\n\nOnlineRnnLM::OnlineRnnLM(const OnlineLMConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\nOnlineRnnLM::~OnlineRnnLM() = default;\n\n// classic rescore state init\nstd::vector<Ort::Value> OnlineRnnLM::GetInitStates() {\n  return impl_->GetInitStates();\n}\n\n// shallow fusion state init\nstd::pair<Ort::Value, std::vector<Ort::Value>> OnlineRnnLM::GetInitStatesSF() {\n  return impl_->GetInitStatesSF();\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>> OnlineRnnLM::ScoreToken(\n    Ort::Value x, std::vector<Ort::Value> states) {\n  return impl_->ScoreToken(std::move(x), std::move(states));\n}\n\n// classic rescore scores\nvoid OnlineRnnLM::ComputeLMScore(float scale, int32_t context_size,\n                                 std::vector<Hypotheses> *hyps) {\n  return impl_->ComputeLMScore(scale, context_size, hyps);\n}\n\n// shallow fusion scores\nvoid OnlineRnnLM::ComputeLMScoreSF(float scale, Hypothesis *hyp) {\n  return impl_->ComputeLMScoreSF(scale, hyp);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-rnn-lm.h",
    "content": "// sherpa-onnx/csrc/online-rnn-lm.h\n//\n// Copyright (c)  2023  Pingfeng Luo\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_RNN_LM_H_\n#define SHERPA_ONNX_CSRC_ONLINE_RNN_LM_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-lm-config.h\"\n#include \"sherpa-onnx/csrc/online-lm.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineRnnLM : public OnlineLM {\n public:\n  ~OnlineRnnLM() override;\n\n  explicit OnlineRnnLM(const OnlineLMConfig &config);\n\n  // init scores for classic rescore\n  std::vector<Ort::Value> GetInitStates() override;\n\n  // init scores for shallow fusion\n  std::pair<Ort::Value, std::vector<Ort::Value>> GetInitStatesSF() override;\n\n   /** ScoreToken a batch of sentences (shallow fusion).\n   *\n   * @param x A 2-D tensor of shape (N, L) with data type int64.\n   * @param states It contains the states for the LM model\n   * @return Return a pair containing\n   *          - log_prob of NN LM\n   *          - updated states\n   *\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> ScoreToken(\n      Ort::Value x, std::vector<Ort::Value> states) override;\n\n   /** This function updates hyp.lm_lob_prob of hyps (classic rescore).\n   *\n   * @param scale LM score\n   * @param context_size Context size of the transducer decoder model\n   * @param hyps It is changed in-place.\n   *\n   */\n  void ComputeLMScore(float scale, int32_t context_size,\n                              std::vector<Hypotheses> *hyps) override;\n\n   /** This function updates lm_lob_prob and nn_lm_scores of hyp (shallow fusion).\n   *\n   * @param scale LM score\n   * @param hyps It is changed in-place.\n   *\n   */\n  void ComputeLMScoreSF(float scale, Hypothesis *hyp) override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_RNN_LM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser-dpdfnet-impl.h",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser-dpdfnet-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n\n#include <algorithm>\n#include <cstdint>\n#include <memory>\n#include <utility>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-impl.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-stft-impl.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineSpeechDenoiserDpdfNetImpl : public OnlineSpeechDenoiserImpl {\n public:\n  explicit OnlineSpeechDenoiserDpdfNetImpl(\n      const OnlineSpeechDenoiserConfig &config)\n      : model_(config.model),\n        stream_(GetStftConfig(model_.GetMetaData())),\n        state_(model_.GetInitState()) {\n    Init();\n  }\n\n  template <typename Manager>\n  OnlineSpeechDenoiserDpdfNetImpl(Manager *mgr,\n                                  const OnlineSpeechDenoiserConfig &config)\n      : model_(mgr, config.model),\n        stream_(GetStftConfig(model_.GetMetaData())),\n        state_(model_.GetInitState()) {\n    Init();\n  }\n\n  DenoisedAudio Run(const float *samples, int32_t n,\n                    int32_t sample_rate) override {\n    return stream_.Run(samples, n, sample_rate,\n                       [this](float *spec, size_t spec_size, float *enhanced) {\n                         ProcessFrame(spec, spec_size, enhanced);\n                       });\n  }\n\n  DenoisedAudio Flush() override {\n    return stream_.Flush(\n        [this](float *spec, size_t spec_size, float *enhanced) {\n          ProcessFrame(spec, spec_size, enhanced);\n        },\n        [this]() { state_ = model_.GetInitState(); });\n  }\n\n  void Reset() override {\n    stream_.Reset();\n    state_ = model_.GetInitState();\n  }\n\n  int32_t GetSampleRate() const override { return 
stream_.GetSampleRate(); }\n\n  int32_t GetFrameShiftInSamples() const override {\n    return stream_.GetFrameShiftInSamples();\n  }\n\n private:\n  void Init() {\n    const auto &meta = model_.GetMetaData();\n    if (meta.profile != \"dpdfnet_16khz\" &&\n        meta.profile != \"dpdfnet2_48khz_hr\") {\n      SHERPA_ONNX_LOGE(\n          \"Online speech denoiser currently supports only DPDFNet streaming \"\n          \"exports. Given profile: %s\",\n          meta.profile.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.spec_shape.size() != 4 || meta.spec_shape[0] != 1 ||\n        meta.spec_shape[1] != 1 || meta.spec_shape[3] != 2) {\n      SHERPA_ONNX_LOGE(\n          \"Online speech denoiser expects a single-frame DPDFNet ONNX \"\n          \"signature shaped like [1, 1, F, 2].\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  static OnlineSpeechDenoiserStftConfig GetStftConfig(\n      const OfflineSpeechDenoiserDpdfNetModelMetaData &meta) {\n    OnlineSpeechDenoiserStftConfig config;\n    config.sample_rate = meta.sample_rate;\n    config.n_fft = meta.n_fft;\n    config.hop_length = meta.hop_length;\n    config.window_length = meta.window_length;\n    config.window_type = meta.window_type;\n    return config;\n  }\n\n  void ProcessFrame(float *spec, size_t spec_size, float *enhanced) {\n    const auto &meta = model_.GetMetaData();\n    const int32_t expected_size = meta.spec_shape[2] * meta.spec_shape[3];\n    if (spec_size != static_cast<size_t>(expected_size)) {\n      SHERPA_ONNX_LOGE(\"Unexpected DPDFNet spec size. Expected: %d. 
Given: %d\",\n                       expected_size, static_cast<int32_t>(spec_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    Ort::Value spec_tensor = Ort::Value::CreateTensor<float>(\n        stream_.GetMemoryInfo(), spec, spec_size, meta.spec_shape.data(),\n        meta.spec_shape.size());\n\n    auto out = model_.Run(std::move(spec_tensor), std::move(state_));\n    state_ = std::move(out.second);\n\n    const float *enhanced_spec = out.first.GetTensorData<float>();\n    std::copy(enhanced_spec, enhanced_spec + spec_size, enhanced);\n  }\n\n private:\n  OfflineSpeechDenoiserDpdfNetModel model_;\n  OnlineSpeechDenoiserStftImpl stream_;\n  Ort::Value state_{nullptr};\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_DPDFNET_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser-gtcrn-impl.h",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser-gtcrn-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n\n#include <algorithm>\n#include <array>\n#include <cstdint>\n#include <memory>\n#include <utility>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-impl.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-stft-impl.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineSpeechDenoiserGtcrnImpl : public OnlineSpeechDenoiserImpl {\n public:\n  explicit OnlineSpeechDenoiserGtcrnImpl(\n      const OnlineSpeechDenoiserConfig &config)\n      : model_(config.model),\n        stream_(GetStftConfig(model_.GetMetaData())),\n        states_(model_.GetInitStates()) {}\n\n  template <typename Manager>\n  OnlineSpeechDenoiserGtcrnImpl(Manager *mgr,\n                                const OnlineSpeechDenoiserConfig &config)\n      : model_(mgr, config.model),\n        stream_(GetStftConfig(model_.GetMetaData())),\n        states_(model_.GetInitStates()) {}\n\n  DenoisedAudio Run(const float *samples, int32_t n,\n                    int32_t sample_rate) override {\n    return stream_.Run(samples, n, sample_rate,\n                       [this](float *spec, size_t spec_size, float *enhanced) {\n                         ProcessFrame(spec, spec_size, enhanced);\n                       });\n  }\n\n  DenoisedAudio Flush() override {\n    return stream_.Flush(\n        [this](float *spec, size_t spec_size, float *enhanced) {\n          ProcessFrame(spec, spec_size, enhanced);\n        },\n        [this]() { states_ = model_.GetInitStates(); });\n  }\n\n  void Reset() override {\n    stream_.Reset();\n    states_ = model_.GetInitStates();\n  }\n\n  int32_t GetSampleRate() const override { return stream_.GetSampleRate(); }\n\n  int32_t 
GetFrameShiftInSamples() const override {\n    return stream_.GetFrameShiftInSamples();\n  }\n\n private:\n  static OnlineSpeechDenoiserStftConfig GetStftConfig(\n      const OfflineSpeechDenoiserGtcrnModelMetaData &meta) {\n    OnlineSpeechDenoiserStftConfig config;\n    config.sample_rate = meta.sample_rate;\n    config.n_fft = meta.n_fft;\n    config.hop_length = meta.hop_length;\n    config.window_length = meta.window_length;\n    config.window_type = meta.window_type;\n    return config;\n  }\n\n  void ProcessFrame(float *spec, size_t spec_size, float *enhanced) {\n    const int32_t num_bins = stream_.GetNumBins();\n    const size_t expected_size = static_cast<size_t>(num_bins * 2);\n    if (spec_size != expected_size) {\n      SHERPA_ONNX_LOGE(\"Unexpected GTCRN spec size. Expected: %d. Given: %d\",\n                       num_bins * 2, static_cast<int32_t>(spec_size));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::array<int64_t, 4> x_shape{1, num_bins, 1, 2};\n    Ort::Value x_tensor = Ort::Value::CreateTensor<float>(\n        stream_.GetMemoryInfo(), spec, spec_size, x_shape.data(),\n        x_shape.size());\n\n    Ort::Value output{nullptr};\n    std::tie(output, states_) =\n        model_.Run(std::move(x_tensor), std::move(states_));\n\n    const float *enhanced_spec = output.GetTensorData<float>();\n    std::copy(enhanced_spec, enhanced_spec + spec_size, enhanced);\n  }\n\n private:\n  OfflineSpeechDenoiserGtcrnModel model_;\n  OnlineSpeechDenoiserStftImpl stream_;\n  OfflineSpeechDenoiserGtcrnModel::States states_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_GTCRN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser-impl.cc",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser-impl.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-dpdfnet-impl.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser-gtcrn-impl.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<OnlineSpeechDenoiserImpl> OnlineSpeechDenoiserImpl::Create(\n    const OnlineSpeechDenoiserConfig &config) {\n  const bool has_gtcrn = !config.model.gtcrn.model.empty();\n  const bool has_dpdfnet = !config.model.dpdfnet.model.empty();\n\n  if (has_gtcrn) {\n    return std::make_unique<OnlineSpeechDenoiserGtcrnImpl>(config);\n  } else if (has_dpdfnet) {\n    return std::make_unique<OnlineSpeechDenoiserDpdfNetImpl>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide one speech denoising model.\");\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OnlineSpeechDenoiserImpl> OnlineSpeechDenoiserImpl::Create(\n    Manager *mgr, const OnlineSpeechDenoiserConfig &config) {\n  const bool has_gtcrn = !config.model.gtcrn.model.empty();\n  const bool has_dpdfnet = !config.model.dpdfnet.model.empty();\n\n  if (has_gtcrn) {\n    return std::make_unique<OnlineSpeechDenoiserGtcrnImpl>(mgr, config);\n  } else if (has_dpdfnet) {\n    return std::make_unique<OnlineSpeechDenoiserDpdfNetImpl>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide one speech denoising model.\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OnlineSpeechDenoiserImpl>\nOnlineSpeechDenoiserImpl::Create(AAssetManager *mgr,\n                                 const OnlineSpeechDenoiserConfig &config);\n#endif\n\n#if __OHOS__\ntemplate 
std::unique_ptr<OnlineSpeechDenoiserImpl>\nOnlineSpeechDenoiserImpl::Create(NativeResourceManager *mgr,\n                                 const OnlineSpeechDenoiserConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser-impl.h",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_IMPL_H_\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineSpeechDenoiserImpl {\n public:\n  virtual ~OnlineSpeechDenoiserImpl() = default;\n\n  static std::unique_ptr<OnlineSpeechDenoiserImpl> Create(\n      const OnlineSpeechDenoiserConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OnlineSpeechDenoiserImpl> Create(\n      Manager *mgr, const OnlineSpeechDenoiserConfig &config);\n\n  virtual DenoisedAudio Run(const float *samples, int32_t n,\n                            int32_t sample_rate) = 0;\n  virtual DenoisedAudio Flush() = 0;\n  virtual void Reset() = 0;\n  virtual int32_t GetSampleRate() const = 0;\n  virtual int32_t GetFrameShiftInSamples() const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser-stft-impl.h",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser-stft-impl.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_STFT_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_STFT_IMPL_H_\n\n#include <algorithm>\n#include <cmath>\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"kaldi-native-fbank/csrc/feature-window.h\"\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineSpeechDenoiserStftConfig {\n  int32_t sample_rate = 0;\n  int32_t n_fft = 0;\n  int32_t hop_length = 0;\n  int32_t window_length = 0;\n  std::string window_type;\n};\n\ninline std::vector<float> MakeOnlineSpeechDenoiserWindow(\n    const std::string &window_type, int32_t window_length) {\n  if (window_type == \"vorbis\") {\n    return MakeVorbisWindow(window_length);\n  }\n\n  if (window_type == \"hann_sqrt\") {\n    auto window = knf::GetWindow(\"hann\", window_length);\n    for (auto &w : window) {\n      w = std::sqrt(w);\n    }\n    return window;\n  }\n\n  return knf::GetWindow(window_type, window_length);\n}\n\nclass StreamingDft {\n public:\n  explicit StreamingDft(int32_t n_fft)\n      : n_fft_(n_fft),\n        num_bins_(n_fft / 2 + 1),\n        cos_f_(num_bins_ * n_fft_),\n        sin_f_(num_bins_ * n_fft_),\n        cos_i_(n_fft_ * num_bins_),\n        sin_i_(n_fft_ * num_bins_) {\n    constexpr double kPi = 3.14159265358979323846;\n    for (int32_t k = 0; k < num_bins_; ++k) {\n      for (int32_t n = 0; n < n_fft_; ++n) {\n        double angle = 2.0 * kPi * k * n / n_fft_;\n        double c = std::cos(angle);\n        double s = std::sin(angle);\n\n        cos_f_[k * n_fft_ + n] = c;\n        sin_f_[k * n_fft_ + n] = s;\n\n        cos_i_[n * num_bins_ + k] = 
c;\n        sin_i_[n * num_bins_ + k] = s;\n      }\n    }\n  }\n\n  void Forward(const float *input, float *output) const {\n    for (int32_t k = 0; k != num_bins_; ++k) {\n      double real = 0;\n      double imag = 0;\n      const double *p_cos = cos_f_.data() + k * n_fft_;\n      const double *p_sin = sin_f_.data() + k * n_fft_;\n      for (int32_t n = 0; n != n_fft_; ++n) {\n        double v = input[n];\n        real += v * p_cos[n];\n        imag -= v * p_sin[n];\n      }\n      output[2 * k] = static_cast<float>(real);\n      output[2 * k + 1] = static_cast<float>(imag);\n    }\n  }\n\n  void Inverse(const float *input, float *output) const {\n    for (int32_t n = 0; n != n_fft_; ++n) {\n      double sum = input[0];\n      if (n_fft_ % 2 == 0) {\n        sum += input[2 * (num_bins_ - 1)] * ((n & 1) ? -1.0 : 1.0);\n      }\n\n      const double *p_cos = cos_i_.data() + n * num_bins_;\n      const double *p_sin = sin_i_.data() + n * num_bins_;\n      for (int32_t k = 1; k != num_bins_ - 1; ++k) {\n        double real = input[2 * k];\n        double imag = input[2 * k + 1];\n        sum += 2.0 * (real * p_cos[k] - imag * p_sin[k]);\n      }\n\n      output[n] = static_cast<float>(sum / n_fft_);\n    }\n  }\n\n private:\n  int32_t n_fft_ = 0;\n  int32_t num_bins_ = 0;\n  std::vector<double> cos_f_;\n  std::vector<double> sin_f_;\n  std::vector<double> cos_i_;\n  std::vector<double> sin_i_;\n};\n\nclass OnlineSpeechDenoiserStftImpl {\n public:\n  explicit OnlineSpeechDenoiserStftImpl(OnlineSpeechDenoiserStftConfig config)\n      : config_(std::move(config)),\n        fft_(config_.n_fft),\n        memory_info_(\n            Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault)),\n        window_(\n            MakeOnlineSpeechDenoiserWindow(config_.window_type,\n                                           config_.window_length)),\n        analysis_buffer_(config_.window_length),\n        overlap_add_buffer_(config_.window_length),\n        
fft_input_(config_.window_length),\n        fft_output_(2 * (config_.n_fft / 2 + 1)),\n        enhanced_fft_output_(2 * (config_.n_fft / 2 + 1)),\n        ifft_output_(config_.window_length),\n        zero_hop_(config_.hop_length) {}\n\n  template <typename ProcessFrame>\n  DenoisedAudio Run(const float *samples, int32_t n, int32_t sample_rate,\n                    ProcessFrame process_frame) {\n    if (sample_rate <= 0) {\n      SHERPA_ONNX_LOGE(\"Expected sample_rate > 0. Given: %d\", sample_rate);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (n < 0) {\n      SHERPA_ONNX_LOGE(\"Expected n >= 0. Given: %d\", n);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (n == 0) {\n      return {{}, config_.sample_rate};\n    }\n\n    if (input_sample_rate_ == -1) {\n      input_sample_rate_ = sample_rate;\n      CreateResamplerIfNeeded();\n    } else if (sample_rate != input_sample_rate_) {\n      SHERPA_ONNX_LOGE(\n          \"Streaming denoiser expects a fixed input sample rate. Previous: %d. \"\n          \"Current: %d.\",\n          input_sample_rate_, sample_rate);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<float> resampled;\n    if (resampler_) {\n      resampler_->Resample(samples, n, false, &resampled);\n    } else {\n      resampled.assign(samples, samples + n);\n    }\n\n    total_input_samples_ += resampled.size();\n    pending_input_.insert(pending_input_.end(), resampled.begin(),\n                          resampled.end());\n\n    DenoisedAudio ans;\n    ans.sample_rate = config_.sample_rate;\n    ans.samples = ProcessPending(process_frame);\n    total_output_samples_ += ans.samples.size();\n    return ans;\n  }\n\n  template <typename ProcessFrame, typename ResetModelState>\n  DenoisedAudio Flush(ProcessFrame process_frame,\n                      ResetModelState reset_model_state) {\n    DenoisedAudio ans;\n    ans.sample_rate = config_.sample_rate;\n\n    std::vector<float> tail;\n    if (resampler_) {\n      float dummy = 0;\n      
resampler_->Resample(&dummy, 0, true, &tail);\n      total_input_samples_ += tail.size();\n      pending_input_.insert(pending_input_.end(), tail.begin(), tail.end());\n    }\n\n    ans.samples = ProcessPending(process_frame);\n\n    if (!pending_input_.empty()) {\n      std::vector<float> padded(config_.hop_length, 0.0f);\n      std::copy(pending_input_.begin(), pending_input_.end(), padded.begin());\n      ProcessHop(padded.data(), &ans.samples, process_frame);\n      pending_input_.clear();\n    }\n\n    if (started_) {\n      ProcessHop(zero_hop_.data(), &ans.samples, process_frame);\n    }\n\n    int64_t remaining = total_input_samples_ - total_output_samples_;\n    if (remaining < 0) {\n      remaining = 0;\n    }\n\n    if (ans.samples.size() > static_cast<size_t>(remaining)) {\n      ans.samples.resize(static_cast<size_t>(remaining));\n    }\n\n    total_output_samples_ += ans.samples.size();\n    Reset();\n    reset_model_state();\n    return ans;\n  }\n\n  void Reset() {\n    std::fill(analysis_buffer_.begin(), analysis_buffer_.end(), 0.0f);\n    std::fill(overlap_add_buffer_.begin(), overlap_add_buffer_.end(), 0.0f);\n    pending_input_.clear();\n    resampler_.reset();\n    input_sample_rate_ = -1;\n    started_ = false;\n    total_input_samples_ = 0;\n    total_output_samples_ = 0;\n  }\n\n  int32_t GetSampleRate() const { return config_.sample_rate; }\n\n  int32_t GetFrameShiftInSamples() const { return config_.hop_length; }\n\n  const Ort::MemoryInfo &GetMemoryInfo() const { return memory_info_; }\n\n  int32_t GetNumBins() const { return config_.n_fft / 2 + 1; }\n\n private:\n  void CreateResamplerIfNeeded() {\n    if (input_sample_rate_ == config_.sample_rate) {\n      return;\n    }\n\n    SHERPA_ONNX_LOGE(\n        \"Creating a streaming resampler:\\n\"\n        \"   in_sample_rate: %d\\n\"\n        \"   output_sample_rate: %d\\n\",\n        input_sample_rate_, config_.sample_rate);\n\n    float min_freq = std::min<int32_t>(input_sample_rate_, 
config_.sample_rate);\n    float lowpass_cutoff = 0.99f * 0.5f * min_freq;\n    int32_t lowpass_filter_width = 6;\n    resampler_ = std::make_unique<LinearResample>(\n        input_sample_rate_, config_.sample_rate, lowpass_cutoff,\n        lowpass_filter_width);\n  }\n\n  template <typename ProcessFrame>\n  std::vector<float> ProcessPending(ProcessFrame process_frame) {\n    std::vector<float> ans;\n\n    int32_t consumed = 0;\n    while (static_cast<int32_t>(pending_input_.size()) - consumed >=\n           config_.hop_length) {\n      ProcessHop(pending_input_.data() + consumed, &ans, process_frame);\n      consumed += config_.hop_length;\n    }\n\n    if (consumed != 0) {\n      pending_input_.erase(pending_input_.begin(),\n                           pending_input_.begin() + consumed);\n    }\n\n    return ans;\n  }\n\n  template <typename ProcessFrame>\n  void ProcessHop(const float *hop, std::vector<float> *output,\n                  ProcessFrame process_frame) {\n    std::move(analysis_buffer_.begin() + config_.hop_length,\n              analysis_buffer_.end(), analysis_buffer_.begin());\n    std::copy(hop, hop + config_.hop_length,\n              analysis_buffer_.end() - config_.hop_length);\n\n    for (int32_t i = 0; i != config_.window_length; ++i) {\n      fft_input_[i] = analysis_buffer_[i] * window_[i];\n    }\n\n    fft_.Forward(fft_input_.data(), fft_output_.data());\n    process_frame(fft_output_.data(), fft_output_.size(),\n                  enhanced_fft_output_.data());\n    fft_.Inverse(enhanced_fft_output_.data(), ifft_output_.data());\n\n    std::move(overlap_add_buffer_.begin() + config_.hop_length,\n              overlap_add_buffer_.end(), overlap_add_buffer_.begin());\n    std::fill(overlap_add_buffer_.end() - config_.hop_length,\n              overlap_add_buffer_.end(), 0.0f);\n\n    for (int32_t i = 0; i != config_.window_length; ++i) {\n      overlap_add_buffer_[i] += ifft_output_[i] * window_[i];\n    }\n\n    if (!started_) {\n      
started_ = true;\n      return;\n    }\n\n    output->insert(output->end(), overlap_add_buffer_.begin(),\n                   overlap_add_buffer_.begin() + config_.hop_length);\n  }\n\n private:\n  OnlineSpeechDenoiserStftConfig config_;\n  StreamingDft fft_;\n  Ort::MemoryInfo memory_info_;\n\n  std::vector<float> window_;\n  std::vector<float> analysis_buffer_;\n  std::vector<float> overlap_add_buffer_;\n  std::vector<float> pending_input_;\n  std::vector<float> fft_input_;\n  std::vector<float> fft_output_;\n  std::vector<float> enhanced_fft_output_;\n  std::vector<float> ifft_output_;\n  std::vector<float> zero_hop_;\n  std::unique_ptr<LinearResample> resampler_;\n\n  int32_t input_sample_rate_ = -1;\n  bool started_ = false;\n  int64_t total_input_samples_ = 0;\n  int64_t total_output_samples_ = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_STFT_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser.cc",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineSpeechDenoiserConfig::Register(ParseOptions *po) {\n  model.Register(po);\n}\n\nbool OnlineSpeechDenoiserConfig::Validate() const { return model.Validate(); }\n\nstd::string OnlineSpeechDenoiserConfig::ToString() const {\n  std::ostringstream os;\n  os << \"OnlineSpeechDenoiserConfig(\";\n  os << \"model=\" << model.ToString() << \")\";\n  return os.str();\n}\n\ntemplate <typename Manager>\nOnlineSpeechDenoiser::OnlineSpeechDenoiser(\n    Manager *mgr, const OnlineSpeechDenoiserConfig &config)\n    : impl_(OnlineSpeechDenoiserImpl::Create(mgr, config)) {}\n\nOnlineSpeechDenoiser::OnlineSpeechDenoiser(\n    const OnlineSpeechDenoiserConfig &config)\n    : impl_(OnlineSpeechDenoiserImpl::Create(config)) {}\n\nOnlineSpeechDenoiser::~OnlineSpeechDenoiser() = default;\n\nDenoisedAudio OnlineSpeechDenoiser::Run(const float *samples, int32_t n,\n                                        int32_t sample_rate) {\n  return impl_->Run(samples, n, sample_rate);\n}\n\nDenoisedAudio OnlineSpeechDenoiser::Flush() { return impl_->Flush(); }\n\nvoid OnlineSpeechDenoiser::Reset() { impl_->Reset(); }\n\nint32_t OnlineSpeechDenoiser::GetSampleRate() const {\n  return impl_->GetSampleRate();\n}\n\nint32_t OnlineSpeechDenoiser::GetFrameShiftInSamples() const {\n  return impl_->GetFrameShiftInSamples();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineSpeechDenoiser::OnlineSpeechDenoiser(\n    AAssetManager *mgr, const OnlineSpeechDenoiserConfig &config);\n#endif\n\n#if __OHOS__\ntemplate 
OnlineSpeechDenoiser::OnlineSpeechDenoiser(\n    NativeResourceManager *mgr, const OnlineSpeechDenoiserConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-speech-denoiser.h",
    "content": "// sherpa-onnx/csrc/online-speech-denoiser.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_H_\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineSpeechDenoiserImpl;\n\nstruct OnlineSpeechDenoiserConfig {\n  OfflineSpeechDenoiserModelConfig model;\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nclass OnlineSpeechDenoiser {\n public:\n  explicit OnlineSpeechDenoiser(const OnlineSpeechDenoiserConfig &config);\n  ~OnlineSpeechDenoiser();\n\n  template <typename Manager>\n  OnlineSpeechDenoiser(Manager *mgr, const OnlineSpeechDenoiserConfig &config);\n\n  /*\n   * Process one chunk of streaming audio and return the enhanced samples\n   * currently available. Internally this keeps model and overlap-add state\n   * across calls.\n   */\n  DenoisedAudio Run(const float *samples, int32_t n, int32_t sample_rate);\n\n  /*\n   * Flush any buffered audio and reset the denoiser to an empty state so it\n   * can be reused for a new stream.\n   */\n  DenoisedAudio Flush();\n\n  void Reset();\n\n  int32_t GetSampleRate() const;\n  int32_t GetFrameShiftInSamples() const;\n\n private:\n  std::unique_ptr<OnlineSpeechDenoiserImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-stream.cc",
    "content": "// sherpa-onnx/csrc/online-stream.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transducer-keyword-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineStream::Impl {\n public:\n  explicit Impl(const FeatureExtractorConfig &config,\n                ContextGraphPtr context_graph)\n      : feat_extractor_(config), context_graph_(std::move(context_graph)) {}\n\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform, int32_t n) {\n    std::lock_guard<std::mutex> lock(mutex_);\n    feat_extractor_.AcceptWaveform(sampling_rate, waveform, n);\n  }\n\n  void InputFinished() const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    feat_extractor_.InputFinished();\n  }\n\n  int32_t NumFramesReady() const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    return feat_extractor_.NumFramesReady() - start_frame_index_;\n  }\n\n  bool IsLastFrame(int32_t frame) const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    return feat_extractor_.IsLastFrame(frame);\n  }\n\n  std::vector<float> GetFrames(int32_t frame_index, int32_t n) const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    return feat_extractor_.GetFrames(frame_index + start_frame_index_, n);\n  }\n\n  void Reset() {\n    std::lock_guard<std::mutex> lock(mutex_);\n    // we don't reset the feature extractor\n    start_frame_index_ += num_processed_frames_;\n    num_processed_frames_ = 0;\n  }\n\n  int32_t &GetNumProcessedFrames() {\n    std::lock_guard<std::mutex> lock(mutex_);\n    return num_processed_frames_;\n  }\n\n  int32_t GetNumFramesSinceStart() const {\n    std::lock_guard<std::mutex> lock(mutex_);\n    return start_frame_index_;\n  }\n\n  int32_t &GetCurrentSegment() {\n    
std::lock_guard<std::mutex> lock(mutex_);\n    return segment_;\n  }\n\n  void SetResult(const OnlineTransducerDecoderResult &r) { result_ = r; }\n\n  OnlineTransducerDecoderResult &GetResult() { return result_; }\n\n  void SetKeywordResult(const TransducerKeywordResult &r) {\n    keyword_result_ = r;\n  }\n  TransducerKeywordResult &GetKeywordResult(bool remove_duplicates) {\n    if (remove_duplicates) {\n      if (!prev_keyword_result_.timestamps.empty() &&\n          !keyword_result_.timestamps.empty() &&\n          keyword_result_.timestamps[0] <=\n              prev_keyword_result_.timestamps.back()) {\n        return empty_keyword_result_;\n      } else {\n        prev_keyword_result_ = keyword_result_;\n      }\n      return keyword_result_;\n    } else {\n      return keyword_result_;\n    }\n  }\n\n  OnlineCtcDecoderResult &GetCtcResult() { return ctc_result_; }\n\n  void SetCtcResult(const OnlineCtcDecoderResult &r) { ctc_result_ = r; }\n\n  void SetParaformerResult(const OnlineParaformerDecoderResult &r) {\n    paraformer_result_ = r;\n  }\n\n  OnlineParaformerDecoderResult &GetParaformerResult() {\n    return paraformer_result_;\n  }\n\n  int32_t FeatureDim() const { return feat_extractor_.FeatureDim(); }\n\n  void SetStates(std::vector<Ort::Value> states) {\n    states_ = std::move(states);\n  }\n\n  std::vector<Ort::Value> &GetStates() { return states_; }\n\n  void SetNeMoDecoderStates(std::vector<Ort::Value> decoder_states) {\n    decoder_states_ = std::move(decoder_states);\n  }\n\n  std::vector<Ort::Value> &GetNeMoDecoderStates() { return decoder_states_; }\n\n  const ContextGraphPtr &GetContextGraph() const { return context_graph_; }\n\n  std::vector<float> &GetParaformerFeatCache() {\n    return paraformer_feat_cache_;\n  }\n\n  std::vector<float> &GetParaformerEncoderOutCache() {\n    return paraformer_encoder_out_cache_;\n  }\n\n  std::vector<float> &GetParaformerAlphaCache() {\n    return paraformer_alpha_cache_;\n  }\n\n  void SetOption(const 
std::string &key, const std::string &value) {\n    options_[key] = value;\n  }\n\n  bool HasOption(const std::string &key) const {\n    return options_.count(key) != 0;\n  }\n\n  const std::string &GetOption(const std::string &key) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return it->second;\n    }\n    static const std::string kEmpty;\n    return kEmpty;\n  }\n\n  int32_t GetOptionInt(const std::string &key, int32_t default_value) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return ToIntOrDefault(it->second, default_value);\n    }\n    return default_value;\n  }\n\n  float GetOptionFloat(const std::string &key, float default_value) const {\n    auto it = options_.find(key);\n    if (it != options_.end()) {\n      return ToFloatOrDefault(it->second, default_value);\n    }\n    return default_value;\n  }\n\n  void SetFasterDecoder(std::unique_ptr<kaldi_decoder::FasterDecoder> decoder) {\n    faster_decoder_ = std::move(decoder);\n  }\n\n  kaldi_decoder::FasterDecoder *GetFasterDecoder() const {\n    return faster_decoder_.get();\n  }\n\n  int32_t &GetFasterDecoderProcessedFrames() {\n    return faster_decoder_processed_frames_;\n  }\n\n private:\n  FeatureExtractor feat_extractor_;\n  mutable std::mutex mutex_;\n  /// For contextual-biasing\n  ContextGraphPtr context_graph_;\n  int32_t num_processed_frames_ = 0;  // before subsampling\n  int32_t start_frame_index_ = 0;     // never reset\n  int32_t segment_ = 0;\n  OnlineTransducerDecoderResult result_;\n  TransducerKeywordResult prev_keyword_result_;\n  TransducerKeywordResult keyword_result_;\n  TransducerKeywordResult empty_keyword_result_;\n  OnlineCtcDecoderResult ctc_result_;\n  std::vector<Ort::Value> states_;  // states for transducer or ctc models\n  std::vector<Ort::Value> decoder_states_;  // states for nemo transducer models\n  std::vector<float> paraformer_feat_cache_;\n  std::vector<float> paraformer_encoder_out_cache_;\n  
std::vector<float> paraformer_alpha_cache_;\n  OnlineParaformerDecoderResult paraformer_result_;\n  std::unordered_map<std::string, std::string> options_;\n  std::unique_ptr<kaldi_decoder::FasterDecoder> faster_decoder_;\n  int32_t faster_decoder_processed_frames_ = 0;\n};\n\nOnlineStream::OnlineStream(const FeatureExtractorConfig &config /*= {}*/,\n                           ContextGraphPtr context_graph /*= nullptr */)\n    : impl_(std::make_unique<Impl>(config, std::move(context_graph))) {}\n\nOnlineStream::~OnlineStream() = default;\n\nvoid OnlineStream::AcceptWaveform(int32_t sampling_rate, const float *waveform,\n                                  int32_t n) const {\n  impl_->AcceptWaveform(sampling_rate, waveform, n);\n}\n\nvoid OnlineStream::InputFinished() const { impl_->InputFinished(); }\n\nint32_t OnlineStream::NumFramesReady() const { return impl_->NumFramesReady(); }\n\nbool OnlineStream::IsLastFrame(int32_t frame) const {\n  return impl_->IsLastFrame(frame);\n}\n\nstd::vector<float> OnlineStream::GetFrames(int32_t frame_index,\n                                           int32_t n) const {\n  return impl_->GetFrames(frame_index, n);\n}\n\nvoid OnlineStream::Reset() { impl_->Reset(); }\n\nint32_t OnlineStream::FeatureDim() const { return impl_->FeatureDim(); }\n\nint32_t &OnlineStream::GetNumProcessedFrames() {\n  return impl_->GetNumProcessedFrames();\n}\n\nint32_t OnlineStream::GetNumFramesSinceStart() const {\n  return impl_->GetNumFramesSinceStart();\n}\n\nint32_t &OnlineStream::GetCurrentSegment() {\n  return impl_->GetCurrentSegment();\n}\n\nvoid OnlineStream::SetResult(const OnlineTransducerDecoderResult &r) {\n  impl_->SetResult(r);\n}\n\nOnlineTransducerDecoderResult &OnlineStream::GetResult() {\n  return impl_->GetResult();\n}\n\nvoid OnlineStream::SetKeywordResult(const TransducerKeywordResult &r) {\n  impl_->SetKeywordResult(r);\n}\n\nTransducerKeywordResult &OnlineStream::GetKeywordResult(\n    bool remove_duplicates /*=false*/) {\n  return 
impl_->GetKeywordResult(remove_duplicates);\n}\n\nOnlineCtcDecoderResult &OnlineStream::GetCtcResult() {\n  return impl_->GetCtcResult();\n}\n\nvoid OnlineStream::SetCtcResult(const OnlineCtcDecoderResult &r) {\n  impl_->SetCtcResult(r);\n}\n\nvoid OnlineStream::SetParaformerResult(const OnlineParaformerDecoderResult &r) {\n  impl_->SetParaformerResult(r);\n}\n\nOnlineParaformerDecoderResult &OnlineStream::GetParaformerResult() {\n  return impl_->GetParaformerResult();\n}\n\nvoid OnlineStream::SetStates(std::vector<Ort::Value> states) {\n  impl_->SetStates(std::move(states));\n}\n\nstd::vector<Ort::Value> &OnlineStream::GetStates() {\n  return impl_->GetStates();\n}\n\nvoid OnlineStream::SetNeMoDecoderStates(\n    std::vector<Ort::Value> decoder_states) {\n  return impl_->SetNeMoDecoderStates(std::move(decoder_states));\n}\n\nstd::vector<Ort::Value> &OnlineStream::GetNeMoDecoderStates() {\n  return impl_->GetNeMoDecoderStates();\n}\n\nconst ContextGraphPtr &OnlineStream::GetContextGraph() const {\n  return impl_->GetContextGraph();\n}\n\nvoid OnlineStream::SetFasterDecoder(\n    std::unique_ptr<kaldi_decoder::FasterDecoder> decoder) {\n  impl_->SetFasterDecoder(std::move(decoder));\n}\n\nkaldi_decoder::FasterDecoder *OnlineStream::GetFasterDecoder() const {\n  return impl_->GetFasterDecoder();\n}\n\nint32_t &OnlineStream::GetFasterDecoderProcessedFrames() {\n  return impl_->GetFasterDecoderProcessedFrames();\n}\n\nstd::vector<float> &OnlineStream::GetParaformerFeatCache() {\n  return impl_->GetParaformerFeatCache();\n}\n\nstd::vector<float> &OnlineStream::GetParaformerEncoderOutCache() {\n  return impl_->GetParaformerEncoderOutCache();\n}\n\nstd::vector<float> &OnlineStream::GetParaformerAlphaCache() {\n  return impl_->GetParaformerAlphaCache();\n}\n\nvoid OnlineStream::SetOption(const std::string &key,\n                             const std::string &value) {\n  impl_->SetOption(key, value);\n}\n\nbool OnlineStream::HasOption(const std::string &key) const {\n  
return impl_->HasOption(key);\n}\n\nconst std::string &OnlineStream::GetOption(const std::string &key) const {\n  return impl_->GetOption(key);\n}\n\nint32_t OnlineStream::GetOptionInt(const std::string &key,\n                                   int32_t default_value) const {\n  return impl_->GetOptionInt(key, default_value);\n}\n\nfloat OnlineStream::GetOptionFloat(const std::string &key,\n                                   float default_value) const {\n  return impl_->GetOptionFloat(key, default_value);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-stream.h",
    "content": "// sherpa-onnx/csrc/online-stream.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_STREAM_H_\n#define SHERPA_ONNX_CSRC_ONLINE_STREAM_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"kaldi-decoder/csrc/faster-decoder.h\"\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/context-graph.h\"\n#include \"sherpa-onnx/csrc/features.h\"\n#include \"sherpa-onnx/csrc/online-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/online-paraformer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n\nnamespace sherpa_onnx {\n\nstruct TransducerKeywordResult;\nclass OnlineStream {\n public:\n  explicit OnlineStream(const FeatureExtractorConfig &config = {},\n                        ContextGraphPtr context_graph = nullptr);\n\n  virtual ~OnlineStream();\n\n  /**\n     @param sampling_rate The sampling_rate of the input waveform. If it does\n                          not equal to  config.sampling_rate, we will do\n                          resampling inside.\n     @param waveform Pointer to a 1-D array of size n. It must be normalized to\n                     the range [-1, 1].\n     @param n Number of entries in waveform\n   */\n  void AcceptWaveform(int32_t sampling_rate, const float *waveform,\n                      int32_t n) const;\n\n  /**\n   * InputFinished() tells the class you won't be providing any\n   * more waveform.  
This will help flush out the last frame or two\n   * of features, in the case where snip-edges == false; it also\n   * affects the return value of IsLastFrame().\n   */\n  void InputFinished() const;\n\n  int32_t NumFramesReady() const;\n\n  /** Note: IsLastFrame() will only ever return true if you have called\n   * InputFinished() (and this frame is the last frame).\n   */\n  bool IsLastFrame(int32_t frame) const;\n\n  /** Get n frames starting from the given frame index.\n   *\n   * @param frame_index  The starting frame index\n   * @param n  Number of frames to get.\n   * @return Return a 2-D tensor of shape (n, feature_dim).\n   *         which is flattened into a 1-D vector (flattened in row major)\n   */\n  std::vector<float> GetFrames(int32_t frame_index, int32_t n) const;\n\n  void Reset();\n\n  int32_t FeatureDim() const;\n\n  // Return a reference to the number of processed frames so far\n  // before subsampling..\n  // Initially, it is 0. It is always less than NumFramesReady().\n  //\n  // The returned reference is valid as long as this object is alive.\n  int32_t &GetNumProcessedFrames();  // It's reset after calling Reset()\n\n  int32_t GetNumFramesSinceStart() const;\n\n  int32_t &GetCurrentSegment();\n\n  void SetResult(const OnlineTransducerDecoderResult &r);\n  OnlineTransducerDecoderResult &GetResult();\n\n  void SetKeywordResult(const TransducerKeywordResult &r);\n  TransducerKeywordResult &GetKeywordResult(bool remove_duplicates = false);\n\n  void SetCtcResult(const OnlineCtcDecoderResult &r);\n  OnlineCtcDecoderResult &GetCtcResult();\n\n  void SetParaformerResult(const OnlineParaformerDecoderResult &r);\n  OnlineParaformerDecoderResult &GetParaformerResult();\n\n  void SetStates(std::vector<Ort::Value> states);\n  std::vector<Ort::Value> &GetStates();\n\n  void SetNeMoDecoderStates(std::vector<Ort::Value> decoder_states);\n  std::vector<Ort::Value> &GetNeMoDecoderStates();\n\n  /**\n   * Get the context graph corresponding to this stream.\n  
 *\n   * @return Return the context graph for this stream.\n   */\n  const ContextGraphPtr &GetContextGraph() const;\n\n  // for online ctc decoder\n  void SetFasterDecoder(std::unique_ptr<kaldi_decoder::FasterDecoder> decoder);\n  kaldi_decoder::FasterDecoder *GetFasterDecoder() const;\n  int32_t &GetFasterDecoderProcessedFrames();\n\n  // for streaming paraformer\n  std::vector<float> &GetParaformerFeatCache();\n  std::vector<float> &GetParaformerEncoderOutCache();\n  std::vector<float> &GetParaformerAlphaCache();\n\n  // Generic per-stream option mechanism (key-value string pairs).\n  void SetOption(const std::string &key, const std::string &value);\n  bool HasOption(const std::string &key) const;\n\n  // Returns the value for the given key, or an empty string if the key\n  // does not exist. No exception is thrown for missing keys.\n  const std::string &GetOption(const std::string &key) const;\n  int32_t GetOptionInt(const std::string &key,\n                       int32_t default_value = 0) const;\n  float GetOptionFloat(const std::string &key,\n                       float default_value = 0.0f) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_STREAM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-t-one-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-t-one-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-t-one-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineToneCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"t-one-ctc-model\", &model,\n               \"Path to CTC model.onnx from T-one. Please see \"\n               \"https://github.com/k2-fsa/sherpa-onnx/pull/2571\");\n}\n\nbool OnlineToneCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"T-one CTC model '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineToneCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineToneCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-t-one-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/online-t-one-ctc-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineToneCtcModelConfig {\n  std::string model;\n\n  OnlineToneCtcModelConfig() = default;\n\n  explicit OnlineToneCtcModelConfig(const std::string &model) : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-t-one-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/online-t-one-ctc-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-t-one-ctc-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineToneCtcModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.t_one_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.t_one_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value x,\n                                  std::vector<Ort::Value> states) {\n    // shape0 is (batch_size, 1, num_samples)\n    auto shape0 = x.GetTensorTypeAndShapeInfo().GetShape();\n    std::array<int64_t, 3> shape = {shape0[0], shape0[2], shape0[1]};\n    std::vector<int32_t> samples(shape[0] * shape[1] * shape[2]);\n    const float *px = x.GetTensorData<float>();\n\n    for (int32_t i = 0; i < samples.size(); ++i) {\n      float f = px[i];\n      f = f > 1 ? 
1 : f;\n      f = f < -1 ? -1 : f;\n      samples[i] = static_cast<int32_t>(f * 32767);\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    Ort::Value xx =\n        Ort::Value::CreateTensor(memory_info, samples.data(), samples.size(),\n                                 shape.data(), shape.size());\n\n    std::array<Ort::Value, 2> inputs = {std::move(xx), std::move(states[0])};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    // out[0]: log_probs\n    // out[1] next_states\n\n    return out;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t ChunkLength() const { return 1; }\n\n  int32_t ChunkShift() const { return 1; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  // Return a vector containing 1 tensor\n  // - state_\n  std::vector<Ort::Value> GetInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.push_back(View(&state_));\n\n    return ans;\n  }\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) {\n    int32_t batch_size = static_cast<int32_t>(states.size());\n    if (batch_size == 1) {\n      return std::move(states[0]);\n    }\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(1);\n\n    std::vector<const Ort::Value *> buf;\n    buf.reserve(batch_size);\n\n    for (int32_t b = 0; b != batch_size; ++b) {\n      buf.push_back(&states[b][0]);\n    }\n\n    Ort::Value c{nullptr};\n    c = CatFloat16(allocator_, buf, 0);\n\n    ans.push_back(std::move(c));\n\n    return ans;\n  }\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const {\n    auto allocator = const_cast<Impl *>(this)->allocator_;\n\n    std::vector<std::vector<Ort::Value>> ans;\n\n    auto shape = states[0].GetTensorTypeAndShapeInfo().GetShape();\n    int32_t batch_size = shape[0];\n    
ans.resize(batch_size);\n\n    if (batch_size == 1) {\n      ans[0] = std::move(states);\n      return ans;\n    }\n\n    std::vector<Ort::Value> v;\n    v = UnbindFloat16(allocator, &states[0], 0);\n\n    for (int32_t b = 0; b != batch_size; ++b) {\n      ans[b].push_back(std::move(v[b]));\n    }\n\n    return ans;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(frame_length_ms_, \"frame_length_ms\");\n    SHERPA_ONNX_READ_META_DATA(state_dim_, \"state_dim\");\n    SHERPA_ONNX_READ_META_DATA(sample_rate_, \"sample_rate\");\n\n    InitStates();\n\n    vocab_size_ = sess_->GetOutputTypeInfo(0)\n                      .GetTensorTypeAndShapeInfo()\n                      .GetShape()\n                      .back();\n  }\n\n  void InitStates() {\n    std::array<int64_t, 2> state_shape{1, state_dim_};\n\n    state_ = Ort::Value::CreateTensor(allocator_, state_shape.data(),\n                                      state_shape.size(),\n                                      ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n\n    auto p = state_.GetTensorMutableData<uint16_t>();\n    std::fill(p, p + state_dim_, 0);\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions 
allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  // One input frame is 300ms long.\n  // For each input frame, there are 10 output frames,\n  // so each output frame is 30ms.\n  int32_t frame_length_ms_ = 0;\n  int32_t state_dim_ = 0;\n  int32_t sample_rate_ = 0;\n  int32_t vocab_size_ = 0;\n\n  Ort::Value state_{nullptr};\n};\n\nOnlineToneCtcModel::OnlineToneCtcModel(const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineToneCtcModel::OnlineToneCtcModel(Manager *mgr,\n                                       const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineToneCtcModel::~OnlineToneCtcModel() = default;\n\nstd::vector<Ort::Value> OnlineToneCtcModel::Forward(\n    Ort::Value x, std::vector<Ort::Value> states) const {\n  return impl_->Forward(std::move(x), std::move(states));\n}\n\nint32_t OnlineToneCtcModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OnlineToneCtcModel::ChunkLength() const { return impl_->ChunkLength(); }\n\nint32_t OnlineToneCtcModel::ChunkShift() const { return impl_->ChunkShift(); }\n\nOrtAllocator *OnlineToneCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::vector<Ort::Value> OnlineToneCtcModel::GetInitStates() const {\n  return impl_->GetInitStates();\n}\n\nstd::vector<Ort::Value> OnlineToneCtcModel::StackStates(\n    std::vector<std::vector<Ort::Value>> states) const {\n  return impl_->StackStates(std::move(states));\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineToneCtcModel::UnStackStates(\n    std::vector<Ort::Value> states) const {\n  return impl_->UnStackStates(std::move(states));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineToneCtcModel::OnlineToneCtcModel(\n    AAssetManager *mgr, const 
OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineToneCtcModel::OnlineToneCtcModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-t-one-ctc-model.h",
    "content": "// sherpa-onnx/csrc/online-t-one-ctc-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineToneCtcModel : public OnlineCtcModel {\n public:\n  explicit OnlineToneCtcModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineToneCtcModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineToneCtcModel() override;\n\n  // A list of 1 tensor:\n  //   - (batch_size, state_dim)\n  std::vector<Ort::Value> GetInitStates() const override;\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const override;\n\n  /**\n   *\n   * @param x A 3-D tensor of shape (batch_size, 1, num_samples).\n   * @param states  It is from GetInitStates() or returned from this method.\n   *\n   * @return Return a list of tensors\n   *    - ans[0] contains log_probs, of shape (N, T, C)\n   *    - ans[1:] contains next_states\n   */\n  std::vector<Ort::Value> Forward(\n      Ort::Value x, std::vector<Ort::Value> states) const override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // The model accepts this number of frames before subsampling as input\n  int32_t ChunkLength() const override;\n\n  // Similar to frame_shift in feature extractor, after processing\n  // ChunkLength() frames, we advance by ChunkShift() frames\n  // before we process the next chunk.\n  int32_t ChunkShift() const 
override;\n\n  bool SupportBatchProcessing() const override { return true; }\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_T_ONE_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nOnlineTransducerDecoderResult::OnlineTransducerDecoderResult(\n    const OnlineTransducerDecoderResult &other)\n    : OnlineTransducerDecoderResult() {\n  *this = other;\n}\n\nOnlineTransducerDecoderResult &OnlineTransducerDecoderResult::operator=(\n    const OnlineTransducerDecoderResult &other) {\n  if (this == &other) {\n    return *this;\n  }\n\n  tokens = other.tokens;\n  num_trailing_blanks = other.num_trailing_blanks;\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  if (other.decoder_out) {\n    decoder_out = Clone(allocator, &other.decoder_out);\n  }\n\n  hyps = other.hyps;\n\n  frame_offset = other.frame_offset;\n  timestamps = other.timestamps;\n\n  ys_probs = other.ys_probs;\n  lm_probs = other.lm_probs;\n  context_scores = other.context_scores;\n\n  return *this;\n}\n\nOnlineTransducerDecoderResult::OnlineTransducerDecoderResult(\n    OnlineTransducerDecoderResult &&other) noexcept\n    : OnlineTransducerDecoderResult() {\n  *this = std::move(other);\n}\n\nOnlineTransducerDecoderResult &OnlineTransducerDecoderResult::operator=(\n    OnlineTransducerDecoderResult &&other) noexcept {\n  if (this == &other) {\n    return *this;\n  }\n\n  tokens = std::move(other.tokens);\n  num_trailing_blanks = other.num_trailing_blanks;\n  decoder_out = std::move(other.decoder_out);\n  hyps = std::move(other.hyps);\n\n  frame_offset = other.frame_offset;\n  timestamps = std::move(other.timestamps);\n\n  ys_probs = std::move(other.ys_probs);\n  lm_probs = std::move(other.lm_probs);\n  context_scores = std::move(other.context_scores);\n\n  return *this;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-decoder.h",
    "content": "// sherpa-onnx/csrc/online-transducer-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_DECODER_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineTransducerDecoderResult {\n  /// Number of frames after subsampling we have decoded so far\n  int32_t frame_offset = 0;\n\n  /// The decoded token IDs so far\n  std::vector<int64_t> tokens;\n\n  /// number of trailing blank frames decoded so far\n  int32_t num_trailing_blanks = 0;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  std::vector<int32_t> timestamps;\n\n  std::vector<float> ys_probs;\n  std::vector<float> lm_probs;\n  std::vector<float> context_scores;\n\n  // Cache decoder_out for endpointing\n  Ort::Value decoder_out;\n\n  // used only in modified beam_search\n  Hypotheses hyps;\n\n  OnlineTransducerDecoderResult()\n      : tokens{}, num_trailing_blanks(0), decoder_out{nullptr}, hyps{} {}\n\n  OnlineTransducerDecoderResult(const OnlineTransducerDecoderResult &other);\n\n  OnlineTransducerDecoderResult &operator=(\n      const OnlineTransducerDecoderResult &other);\n\n  OnlineTransducerDecoderResult(OnlineTransducerDecoderResult &&other) noexcept;\n\n  OnlineTransducerDecoderResult &operator=(\n      OnlineTransducerDecoderResult &&other) noexcept;\n};\n\nclass OnlineStream;\nclass OnlineTransducerDecoder {\n public:\n  virtual ~OnlineTransducerDecoder() = default;\n\n  /* Return an empty result.\n   *\n   * To simplify the decoding code, we add `context_size` blanks\n   * to the beginning of the decoding result, which will be\n   * stripped by calling `StripLeadingBlanks()`.\n   */\n  virtual OnlineTransducerDecoderResult GetEmptyResult() const = 0;\n\n  /** Strip blanks added by `GetEmptyResult()`.\n 
  *\n   * @param r It is changed in-place.\n   */\n  virtual void StripLeadingBlanks(OnlineTransducerDecoderResult * /*r*/) const {\n  }\n\n  /** Run transducer beam search given the output from the encoder model.\n   *\n   * @param encoder_out A 3-D tensor of shape (N, T, joiner_dim)\n   * @param result  It is modified in-place.\n   *\n   * @note There is no need to pass encoder_out_length here since for the\n   * online decoding case, each utterance has the same number of frames\n   * and there are no paddings.\n   */\n  virtual void Decode(Ort::Value encoder_out,\n                      std::vector<OnlineTransducerDecoderResult> *result) = 0;\n\n  /** Run transducer beam search given the output from the encoder model.\n   *\n   * Note: Currently this interface is for contextual-biasing feature which\n   *       needs a ContextGraph owned by the OnlineStream.\n   *\n   * @param encoder_out A 3-D tensor of shape (N, T, joiner_dim)\n   * @param ss  A list of OnlineStreams.\n   * @param result  It is modified in-place.\n   *\n   * @note There is no need to pass encoder_out_length here since for the\n   * online decoding case, each utterance has the same number of frames\n   * and there are no paddings.\n   */\n  virtual void Decode(Ort::Value /*encoder_out*/, OnlineStream ** /*ss*/,\n                      std::vector<OnlineTransducerDecoderResult> * /*result*/) {\n    SHERPA_ONNX_LOGE(\n        \"This interface is for OnlineTransducerModifiedBeamSearchDecoder.\");\n    exit(-1);\n  }\n\n  // used for endpointing. We need to keep decoder_out after reset\n  virtual void UpdateDecoderOut(OnlineTransducerDecoderResult * /*result*/) {}\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-greedy-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-greedy-search-decoder.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-transducer-greedy-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic void UseCachedDecoderOut(\n    const std::vector<OnlineTransducerDecoderResult> &results,\n    Ort::Value *decoder_out) {\n  std::vector<int64_t> shape =\n      decoder_out->GetTensorTypeAndShapeInfo().GetShape();\n  float *dst = decoder_out->GetTensorMutableData<float>();\n  for (const auto &r : results) {\n    if (r.decoder_out) {\n      const float *src = r.decoder_out.GetTensorData<float>();\n      std::copy(src, src + shape[1], dst);\n    }\n    dst += shape[1];\n  }\n}\n\nstatic void UpdateCachedDecoderOut(\n    OrtAllocator *allocator, const Ort::Value *decoder_out,\n    std::vector<OnlineTransducerDecoderResult> *results) {\n  std::vector<int64_t> shape =\n      decoder_out->GetTensorTypeAndShapeInfo().GetShape();\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n  std::array<int64_t, 2> v_shape{1, shape[1]};\n\n  const float *src = decoder_out->GetTensorData<float>();\n  for (auto &r : *results) {\n    if (!r.decoder_out) {\n      r.decoder_out = Ort::Value::CreateTensor<float>(allocator, v_shape.data(),\n                                                      v_shape.size());\n    }\n\n    float *dst = r.decoder_out.GetTensorMutableData<float>();\n    std::copy(src, src + shape[1], dst);\n    src += shape[1];\n  }\n}\n\nOnlineTransducerDecoderResult\nOnlineTransducerGreedySearchDecoder::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  OnlineTransducerDecoderResult r;\n  r.tokens.resize(context_size, -1);\n  r.tokens.back() = blank_id;\n\n  return r;\n}\n\nvoid 
OnlineTransducerGreedySearchDecoder::StripLeadingBlanks(\n    OnlineTransducerDecoderResult *r) const {\n  int32_t context_size = model_->ContextSize();\n\n  auto start = r->tokens.begin() + context_size;\n  auto end = r->tokens.end();\n\n  r->tokens = std::vector<int64_t>(start, end);\n}\n\nvoid OnlineTransducerGreedySearchDecoder::Decode(\n    Ort::Value encoder_out,\n    std::vector<OnlineTransducerDecoderResult> *result) {\n  std::vector<int64_t> encoder_out_shape =\n      encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n\n  if (encoder_out_shape[0] != static_cast<int32_t>(result->size())) {\n    SHERPA_ONNX_LOGE(\n        \"Size mismatch! encoder_out.size(0) %d, result.size(0): %d\",\n        static_cast<int32_t>(encoder_out_shape[0]),\n        static_cast<int32_t>(result->size()));\n    exit(-1);\n  }\n\n  int32_t batch_size = static_cast<int32_t>(encoder_out_shape[0]);\n  int32_t num_frames = static_cast<int32_t>(encoder_out_shape[1]);\n  int32_t vocab_size = model_->VocabSize();\n\n  Ort::Value decoder_out{nullptr};\n  bool is_batch_decoder_out_cached = true;\n  for (const auto &r : *result) {\n    if (!r.decoder_out) {\n      is_batch_decoder_out_cached = false;\n      break;\n    }\n  }\n\n  if (is_batch_decoder_out_cached) {\n    auto &r = result->front();\n    std::vector<int64_t> decoder_out_shape =\n        r.decoder_out.GetTensorTypeAndShapeInfo().GetShape();\n    decoder_out_shape[0] = batch_size;\n    decoder_out = Ort::Value::CreateTensor<float>(model_->Allocator(),\n                                                  decoder_out_shape.data(),\n                                                  decoder_out_shape.size());\n    UseCachedDecoderOut(*result, &decoder_out);\n  } else {\n    Ort::Value decoder_input = model_->BuildDecoderInput(*result);\n    decoder_out = model_->RunDecoder(std::move(decoder_input));\n  }\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    Ort::Value cur_encoder_out =\n        GetEncoderOutFrame(model_->Allocator(), 
&encoder_out, t);\n    Ort::Value logit =\n        model_->RunJoiner(std::move(cur_encoder_out), View(&decoder_out));\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n\n    bool emitted = false;\n    for (int32_t i = 0; i < batch_size; ++i, p_logit += vocab_size) {\n      auto &r = (*result)[i];\n      if (blank_penalty_ > 0.0) {\n        p_logit[0] -= blank_penalty_;  // assuming blank id is 0\n      }\n\n      auto y = static_cast<int32_t>(std::distance(\n          static_cast<const float *>(p_logit),\n          std::max_element(static_cast<const float *>(p_logit),\n                           static_cast<const float *>(p_logit) + vocab_size)));\n      // blank id is hardcoded to 0\n      // also, it treats unk as blank\n      if (y != 0 && y != unk_id_) {\n        emitted = true;\n        r.tokens.push_back(y);\n        r.timestamps.push_back(t + r.frame_offset);\n        r.num_trailing_blanks = 0;\n      } else {\n        ++r.num_trailing_blanks;\n      }\n\n      // export the per-token log scores\n      if (y != 0 && y != unk_id_) {\n        // apply temperature-scaling\n        for (int32_t n = 0; n < vocab_size; ++n) {\n          p_logit[n] /= temperature_scale_;\n        }\n        LogSoftmax(p_logit, vocab_size);   // renormalize probabilities,\n                                           // save time by doing it only for\n                                           // emitted symbols\n        const float *p_logprob = p_logit;  // rename p_logit as p_logprob,\n                                           // now it contains normalized\n                                           // probability\n        r.ys_probs.push_back(p_logprob[y]);\n      }\n    }\n    if (emitted) {\n      Ort::Value decoder_input = model_->BuildDecoderInput(*result);\n      decoder_out = model_->RunDecoder(std::move(decoder_input));\n    }\n  }\n\n  UpdateCachedDecoderOut(model_->Allocator(), &decoder_out, result);\n\n  // Update frame_offset\n  for (auto &r : *result) {\n   
 r.frame_offset += num_frames;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-greedy-search-decoder.h",
    "content": "// sherpa-onnx/csrc/online-transducer-greedy-search-decoder.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineTransducerGreedySearchDecoder : public OnlineTransducerDecoder {\n public:\n  OnlineTransducerGreedySearchDecoder(OnlineTransducerModel *model,\n                                      int32_t unk_id,\n                                      float blank_penalty,\n                                      float temperature_scale)\n      : model_(model),\n        unk_id_(unk_id),\n        blank_penalty_(blank_penalty),\n        temperature_scale_(temperature_scale) {}\n\n  OnlineTransducerDecoderResult GetEmptyResult() const override;\n\n  void StripLeadingBlanks(OnlineTransducerDecoderResult *r) const override;\n\n  void Decode(Ort::Value encoder_out,\n              std::vector<OnlineTransducerDecoderResult> *result) override;\n\n private:\n  OnlineTransducerModel *model_;  // Not owned\n  int32_t unk_id_;\n  float blank_penalty_;\n  float temperature_scale_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Sangeet Sagar\n\n#include \"sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h\"\n\n#include <algorithm>\n#include <iterator>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic Ort::Value BuildDecoderInput(int32_t token, OrtAllocator *allocator) {\n  std::array<int64_t, 2> shape{1, 1};\n\n  Ort::Value decoder_input =\n      Ort::Value::CreateTensor<int32_t>(allocator, shape.data(), shape.size());\n\n  int32_t *p = decoder_input.GetTensorMutableData<int32_t>();\n\n  p[0] = token;\n\n  return decoder_input;\n}\n\nstatic void DecodeOne(const float *encoder_out, int32_t num_rows,\n                      int32_t num_cols, OnlineTransducerNeMoModel *model,\n                      float blank_penalty, OnlineStream *s) {\n  auto memory_info =\n      Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n  int32_t vocab_size = model->VocabSize();\n  int32_t blank_id = vocab_size - 1;\n\n  auto &r = s->GetResult();\n\n  Ort::Value decoder_out{nullptr};\n\n  auto decoder_input = BuildDecoderInput(\n      r.tokens.empty() ? 
blank_id : r.tokens.back(), model->Allocator());\n\n  std::vector<Ort::Value> &last_decoder_states = s->GetNeMoDecoderStates();\n\n  std::vector<Ort::Value> tmp_decoder_states;\n  tmp_decoder_states.reserve(last_decoder_states.size());\n  for (auto &v : last_decoder_states) {\n    tmp_decoder_states.push_back(View(&v));\n  }\n\n  // decoder_output_pair.second returns the next decoder state\n  std::pair<Ort::Value, std::vector<Ort::Value>> decoder_output_pair =\n      model->RunDecoder(std::move(decoder_input),\n                        std::move(tmp_decoder_states));\n\n  std::array<int64_t, 3> encoder_shape{1, num_cols, 1};\n\n  bool emitted = false;\n\n  for (int32_t t = 0; t != num_rows; ++t) {\n    Ort::Value cur_encoder_out = Ort::Value::CreateTensor(\n        memory_info, const_cast<float *>(encoder_out) + t * num_cols, num_cols,\n        encoder_shape.data(), encoder_shape.size());\n\n    Ort::Value logit = model->RunJoiner(std::move(cur_encoder_out),\n                                        View(&decoder_output_pair.first));\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n    if (blank_penalty > 0) {\n      p_logit[blank_id] -= blank_penalty;\n    }\n\n    auto y = static_cast<int32_t>(std::distance(\n        static_cast<const float *>(p_logit),\n        std::max_element(static_cast<const float *>(p_logit),\n                         static_cast<const float *>(p_logit) + vocab_size)));\n\n    if (y != blank_id) {\n      emitted = true;\n      r.tokens.push_back(y);\n      r.timestamps.push_back(t + r.frame_offset);\n      r.num_trailing_blanks = 0;\n\n      decoder_input = BuildDecoderInput(y, model->Allocator());\n\n      // last decoder state becomes the current state for the first chunk\n      decoder_output_pair = model->RunDecoder(\n          std::move(decoder_input), std::move(decoder_output_pair.second));\n    } else {\n      ++r.num_trailing_blanks;\n    }\n  }\n\n  if (emitted) {\n    
s->SetNeMoDecoderStates(std::move(decoder_output_pair.second));\n  }\n\n  r.frame_offset += num_rows;\n}\n\nvoid OnlineTransducerGreedySearchNeMoDecoder::Decode(Ort::Value encoder_out,\n                                                     OnlineStream **ss,\n                                                     int32_t n) const {\n  auto shape = encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n  int32_t batch_size = static_cast<int32_t>(shape[0]);  // bs = 1\n\n  if (batch_size != n) {\n    SHERPA_ONNX_LOGE(\"Size mismatch! encoder_out.size(0) %d, n: %d\",\n                     static_cast<int32_t>(shape[0]), n);\n    exit(-1);\n  }\n\n  int32_t dim1 = static_cast<int32_t>(shape[1]);  // T\n  int32_t dim2 = static_cast<int32_t>(shape[2]);  // encoder_out_dim\n\n  const float *p = encoder_out.GetTensorData<float>();\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    const float *this_p = p + dim1 * dim2 * i;\n\n    DecodeOne(this_p, dim1, dim2, model_, blank_penalty_, ss[i]);\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h",
    "content": "// sherpa-onnx/csrc/online-transducer-greedy-search-nemo-decoder.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Sangeet Sagar\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-nemo-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineStream;\n\nclass OnlineTransducerGreedySearchNeMoDecoder {\n public:\n  OnlineTransducerGreedySearchNeMoDecoder(OnlineTransducerNeMoModel *model,\n                                          float blank_penalty)\n      : model_(model), blank_penalty_(blank_penalty) {}\n\n  // @param n number of elements in ss\n  void Decode(Ort::Value encoder_out, OnlineStream **ss, int32_t n) const;\n\n private:\n  OnlineTransducerNeMoModel *model_;  // Not owned\n  float blank_penalty_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_GREEDY_SEARCH_NEMO_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineTransducerModelConfig::Register(ParseOptions *po) {\n  po->Register(\"encoder\", &encoder, \"Path to encoder.onnx\");\n  po->Register(\"decoder\", &decoder, \"Path to decoder.onnx\");\n  po->Register(\"joiner\", &joiner, \"Path to joiner.onnx\");\n}\n\nbool OnlineTransducerModelConfig::Validate() const {\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"transducer encoder: '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"transducer decoder: '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  if (!FileExists(joiner)) {\n    SHERPA_ONNX_LOGE(\"joiner: '%s' does not exist\", joiner.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineTransducerModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineTransducerModelConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"joiner=\\\"\" << joiner << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-model-config.h",
    "content": "// sherpa-onnx/csrc/online-transducer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineTransducerModelConfig {\n  std::string encoder;\n  std::string decoder;\n  std::string joiner;\n\n  OnlineTransducerModelConfig() = default;\n  OnlineTransducerModelConfig(const std::string &encoder,\n                              const std::string &decoder,\n                              const std::string &joiner)\n      : encoder(encoder), decoder(decoder), joiner(joiner) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n// Copyright (c)  2023  Pingfeng Luo\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-conformer-transducer-model.h\"\n#include \"sherpa-onnx/csrc/online-ebranchformer-transducer-model.h\"\n#include \"sherpa-onnx/csrc/online-lstm-transducer-model.h\"\n#include \"sherpa-onnx/csrc/online-zipformer-transducer-model.h\"\n#include \"sherpa-onnx/csrc/online-zipformer2-transducer-model.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace {\n\nenum class ModelType : std::uint8_t {\n  kConformer,\n  kEbranchformer,\n  kLstm,\n  kZipformer,\n  kZipformer2,\n  kUnknown,\n};\n\n}  // namespace\n\nnamespace sherpa_onnx {\n\nstatic ModelType GetModelType(char *model_data, size_t model_data_length,\n                              bool debug) {\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  auto sess = std::make_unique<Ort::Session>(env, model_data, model_data_length,\n                                             sess_opts);\n\n  Ort::ModelMetadata meta_data = sess->GetModelMetadata();\n  if (debug) {\n    std::ostringstream os;\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if 
(model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\"\n        \"Please make sure you are using the latest export-onnx.py from icefall \"\n        \"to export your transducer models\");\n    return ModelType::kUnknown;\n  }\n\n  if (model_type == \"conformer\") {\n    return ModelType::kConformer;\n  } else if (model_type == \"ebranchformer\") {\n    return ModelType::kEbranchformer;\n  } else if (model_type == \"lstm\") {\n    return ModelType::kLstm;\n  } else if (model_type == \"zipformer\") {\n    return ModelType::kZipformer;\n  } else if (model_type == \"zipformer2\") {\n    return ModelType::kZipformer2;\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %s\", model_type.c_str());\n    return ModelType::kUnknown;\n  }\n}\n\nstd::unique_ptr<OnlineTransducerModel> OnlineTransducerModel::Create(\n    const OnlineModelConfig &config) {\n  if (!config.model_type.empty()) {\n    const auto &model_type = config.model_type;\n    if (model_type == \"conformer\") {\n      return std::make_unique<OnlineConformerTransducerModel>(config);\n    } else if (model_type == \"ebranchformer\") {\n      return std::make_unique<OnlineEbranchformerTransducerModel>(config);\n    } else if (model_type == \"lstm\") {\n      return std::make_unique<OnlineLstmTransducerModel>(config);\n    } else if (model_type == \"zipformer\") {\n      return std::make_unique<OnlineZipformerTransducerModel>(config);\n    } else if (model_type == \"zipformer2\") {\n      return std::make_unique<OnlineZipformer2TransducerModel>(config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid model_type: %s. 
Trying to load the model to get its type\",\n          model_type.c_str());\n    }\n  }\n  ModelType model_type = ModelType::kUnknown;\n\n  {\n    auto buffer = ReadFile(config.transducer.encoder);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kConformer:\n      return std::make_unique<OnlineConformerTransducerModel>(config);\n    case ModelType::kEbranchformer:\n      return std::make_unique<OnlineEbranchformerTransducerModel>(config);\n    case ModelType::kLstm:\n      return std::make_unique<OnlineLstmTransducerModel>(config);\n    case ModelType::kZipformer:\n      return std::make_unique<OnlineZipformerTransducerModel>(config);\n    case ModelType::kZipformer2:\n      return std::make_unique<OnlineZipformer2TransducerModel>(config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in online transducer!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n\nOrt::Value OnlineTransducerModel::BuildDecoderInput(\n    const std::vector<OnlineTransducerDecoderResult> &results) {\n  int32_t batch_size = static_cast<int32_t>(results.size());\n  int32_t context_size = ContextSize();\n  std::array<int64_t, 2> shape{batch_size, context_size};\n  Ort::Value decoder_input = Ort::Value::CreateTensor<int64_t>(\n      Allocator(), shape.data(), shape.size());\n  int64_t *p = decoder_input.GetTensorMutableData<int64_t>();\n\n  for (const auto &r : results) {\n    const int64_t *begin = r.tokens.data() + r.tokens.size() - context_size;\n    const int64_t *end = r.tokens.data() + r.tokens.size();\n    std::copy(begin, end, p);\n    p += context_size;\n  }\n  return decoder_input;\n}\n\nOrt::Value OnlineTransducerModel::BuildDecoderInput(\n    const std::vector<Hypothesis> &hyps) {\n  int32_t batch_size = static_cast<int32_t>(hyps.size());\n  int32_t context_size = ContextSize();\n  std::array<int64_t, 2> shape{batch_size, context_size};\n  
Ort::Value decoder_input = Ort::Value::CreateTensor<int64_t>(\n      Allocator(), shape.data(), shape.size());\n  int64_t *p = decoder_input.GetTensorMutableData<int64_t>();\n\n  for (const auto &h : hyps) {\n    std::copy(h.ys.end() - context_size, h.ys.end(), p);\n    p += context_size;\n  }\n  return decoder_input;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<OnlineTransducerModel> OnlineTransducerModel::Create(\n    Manager *mgr, const OnlineModelConfig &config) {\n  if (!config.model_type.empty()) {\n    const auto &model_type = config.model_type;\n    if (model_type == \"conformer\") {\n      return std::make_unique<OnlineConformerTransducerModel>(mgr, config);\n    } else if (model_type == \"ebranchformer\") {\n      return std::make_unique<OnlineEbranchformerTransducerModel>(mgr, config);\n    } else if (model_type == \"lstm\") {\n      return std::make_unique<OnlineLstmTransducerModel>(mgr, config);\n    } else if (model_type == \"zipformer\") {\n      return std::make_unique<OnlineZipformerTransducerModel>(mgr, config);\n    } else if (model_type == \"zipformer2\") {\n      return std::make_unique<OnlineZipformer2TransducerModel>(mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid model_type: %s. 
Trying to load the model to get its type\",\n          model_type.c_str());\n    }\n  }\n\n  auto buffer = ReadFile(mgr, config.transducer.encoder);\n  auto model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n\n  switch (model_type) {\n    case ModelType::kConformer:\n      return std::make_unique<OnlineConformerTransducerModel>(mgr, config);\n    case ModelType::kEbranchformer:\n      return std::make_unique<OnlineEbranchformerTransducerModel>(mgr, config);\n    case ModelType::kLstm:\n      return std::make_unique<OnlineLstmTransducerModel>(mgr, config);\n    case ModelType::kZipformer:\n      return std::make_unique<OnlineZipformerTransducerModel>(mgr, config);\n    case ModelType::kZipformer2:\n      return std::make_unique<OnlineZipformer2TransducerModel>(mgr, config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in online transducer!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<OnlineTransducerModel> OnlineTransducerModel::Create(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<OnlineTransducerModel> OnlineTransducerModel::Create(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineTransducerDecoderResult;\n\nclass OnlineTransducerModel {\n public:\n  virtual ~OnlineTransducerModel() = default;\n\n  static std::unique_ptr<OnlineTransducerModel> Create(\n      const OnlineModelConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<OnlineTransducerModel> Create(\n      Manager *mgr, const OnlineModelConfig &config);\n\n  /** Stack a list of individual states into a batch.\n   *\n   * It is the inverse operation of `UnStackStates`.\n   *\n   * @param states states[i] contains the state for the i-th utterance.\n   * @return Return a single value representing the batched state.\n   */\n  virtual std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const = 0;\n\n  /** Unstack a batch state into a list of individual states.\n   *\n   * It is the inverse operation of `StackStates`.\n   *\n   * @param states A batched state.\n   * @return ans[i] contains the state for the i-th utterance.\n   */\n  virtual std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const = 0;\n\n  /** Get the initial encoder states.\n   *\n   * @return Return the initial encoder state.\n   */\n  virtual std::vector<Ort::Value> GetEncoderInitStates() = 0;\n\n  /** Set feature dim.\n   *\n   * This is used in `OnlineZipformer2TransducerModel`,\n   * to pass `feature_dim` for 
`GetEncoderInitStates()`.\n   *\n   * This has to be called before GetEncoderInitStates(), so the `encoder_embed`\n   * init state has the correct `embed_dim` of its output.\n   */\n  virtual void SetFeatureDim(int32_t /*feature_dim*/) {}\n\n  /** Run the encoder.\n   *\n   * @param features  A tensor of shape (N, T, C). It is changed in-place.\n   * @param states  Encoder state of the previous chunk. It is changed in-place.\n   * @param processed_frames  Processed frames before subsampling. It is a 1-D\n   * tensor with data type int64_t.\n   *\n   * @return Return a tuple containing:\n   *           - encoder_out, a tensor of shape (N, T', encoder_out_dim)\n   *           - next_states  Encoder state for the next chunk.\n   */\n  virtual std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) = 0;  // NOLINT\n\n  /** Run the decoder network.\n   *\n   * Caution: We assume there are no recurrent connections in the decoder and\n   *          the decoder is stateless. See\n   * https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless2/decoder.py\n   *          for an example\n   *\n   * @param decoder_input It is usually of shape (N, context_size)\n   * @return Return a tensor of shape (N, decoder_dim).\n   */\n  virtual Ort::Value RunDecoder(Ort::Value decoder_input) = 0;\n\n  /** Run the joint network.\n   *\n   * @param encoder_out Output of the encoder network. A tensor of shape\n   *                    (N, joiner_dim).\n   * @param decoder_out Output of the decoder network. A tensor of shape\n   *                    (N, joiner_dim).\n   * @return Return a tensor of shape (N, vocab_size). 
In icefall, the\n   *         last layer of the joint network is `nn.Linear`,\n   *         not `nn.LogSoftmax`.\n   */\n  virtual Ort::Value RunJoiner(Ort::Value encoder_out,\n                               Ort::Value decoder_out) = 0;\n\n  /** If we are using a stateless decoder and if it contains a\n   *  Conv1D, this function returns the kernel size of the convolution layer.\n   */\n  virtual int32_t ContextSize() const = 0;\n\n  /** We send this number of feature frames to the encoder at a time. */\n  virtual int32_t ChunkSize() const = 0;\n\n  /** Number of input frames to discard after each call to RunEncoder.\n   *\n   * For instance, if we have 30 frames, chunk_size=8, chunk_shift=6.\n   *\n   * In the first call of RunEncoder, we use frames 0~7 since chunk_size is 8.\n   * Then we discard frames 0~5 since chunk_shift is 6.\n   * In the second call of RunEncoder, we use frames 6~13; and then we discard\n   * frames 6~11.\n   * In the third call of RunEncoder, we use frames 12~19; and then we discard\n   * frames 12~17.\n   *\n   * Note: ChunkSize() - ChunkShift() == right context size\n   */\n  virtual int32_t ChunkShift() const = 0;\n\n  virtual int32_t VocabSize() const = 0;\n\n  virtual int32_t SubsamplingFactor() const { return 4; }\n\n  virtual bool UseWhisperFeature() const { return false; }\n\n  virtual OrtAllocator *Allocator() = 0;\n\n  Ort::Value BuildDecoderInput(\n      const std::vector<OnlineTransducerDecoderResult> &results);\n\n  Ort::Value BuildDecoderInput(const std::vector<Hypothesis> &hyps);\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.cc\n//\n// Copyright (c)  2023  Pingfeng Luo\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic void UseCachedDecoderOut(\n    const std::vector<int32_t> &hyps_row_splits,\n    const std::vector<OnlineTransducerDecoderResult> &results,\n    Ort::Value *decoder_out) {\n  std::vector<int64_t> shape =\n      decoder_out->GetTensorTypeAndShapeInfo().GetShape();\n\n  float *dst = decoder_out->GetTensorMutableData<float>();\n\n  int32_t batch_size = static_cast<int32_t>(results.size());\n  for (int32_t i = 0; i != batch_size; ++i) {\n    int32_t num_hyps = hyps_row_splits[i + 1] - hyps_row_splits[i];\n    if (num_hyps > 1 || !results[i].decoder_out) {\n      dst += num_hyps * shape[1];\n      continue;\n    }\n\n    const float *src = results[i].decoder_out.GetTensorData<float>();\n    std::copy(src, src + shape[1], dst);\n    dst += shape[1];\n  }\n}\n\nOnlineTransducerDecoderResult\nOnlineTransducerModifiedBeamSearchDecoder::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  OnlineTransducerDecoderResult r;\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = blank_id;\n\n  Hypotheses blank_hyp({{blanks, 0}});\n  r.hyps = std::move(blank_hyp);\n  r.tokens = std::move(blanks);\n  return r;\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoder::StripLeadingBlanks(\n    OnlineTransducerDecoderResult *r) const {\n  int32_t context_size = model_->ContextSize();\n  auto hyp = r->hyps.GetMostProbable(true);\n\n  std::vector<int64_t> tokens(hyp.ys.begin() + context_size, hyp.ys.end());\n  r->tokens = std::move(tokens);\n  r->timestamps = 
std::move(hyp.timestamps);\n\n  // export per-token scores\n  r->ys_probs = std::move(hyp.ys_probs);\n  r->lm_probs = std::move(hyp.lm_probs);\n  r->context_scores = std::move(hyp.context_scores);\n\n  r->num_trailing_blanks = hyp.num_trailing_blanks;\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoder::Decode(\n    Ort::Value encoder_out,\n    std::vector<OnlineTransducerDecoderResult> *result) {\n  Decode(std::move(encoder_out), nullptr, result);\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoder::Decode(\n    Ort::Value encoder_out, OnlineStream **ss,\n    std::vector<OnlineTransducerDecoderResult> *result) {\n  std::vector<int64_t> encoder_out_shape =\n      encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n\n  if (static_cast<int32_t>(encoder_out_shape[0]) !=\n      static_cast<int32_t>(result->size())) {\n    SHERPA_ONNX_LOGE(\n        \"Size mismatch! encoder_out.size(0) %d, result.size(0): %d\\n\",\n        static_cast<int32_t>(encoder_out_shape[0]),\n        static_cast<int32_t>(result->size()));\n    exit(-1);\n  }\n\n  int32_t batch_size = static_cast<int32_t>(encoder_out_shape[0]);\n\n  int32_t num_frames = static_cast<int32_t>(encoder_out_shape[1]);\n  int32_t vocab_size = model_->VocabSize();\n\n  std::vector<Hypotheses> cur;\n  for (auto &r : *result) {\n    cur.push_back(std::move(r.hyps));\n  }\n  std::vector<Hypothesis> prev;\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    // Due to merging paths with identical token sequences,\n    // not all utterances have \"num_active_paths\" paths.\n    auto hyps_row_splits = GetHypsRowSplits(cur);\n    int32_t num_hyps =\n        hyps_row_splits.back();  // total num hyps for all utterance\n    prev.clear();\n    for (auto &hyps : cur) {\n      for (auto &h : hyps) {\n        prev.push_back(std::move(h.second));\n      }\n    }\n    cur.clear();\n    cur.reserve(batch_size);\n\n    Ort::Value decoder_input = model_->BuildDecoderInput(prev);\n    Ort::Value decoder_out = 
model_->RunDecoder(std::move(decoder_input));\n    if (t == 0) {\n      UseCachedDecoderOut(hyps_row_splits, *result, &decoder_out);\n    }\n\n    Ort::Value cur_encoder_out =\n        GetEncoderOutFrame(model_->Allocator(), &encoder_out, t);\n    cur_encoder_out =\n        Repeat(model_->Allocator(), &cur_encoder_out, hyps_row_splits);\n    Ort::Value logit =\n        model_->RunJoiner(std::move(cur_encoder_out), View(&decoder_out));\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n\n    // copy raw logits, apply temperature-scaling  (for confidences)\n    // Note: temperature scaling is used only for the confidences,\n    //       the decoding algorithm uses the original logits\n    int32_t p_logit_items = vocab_size * num_hyps;\n    std::vector<float> logit_with_temperature(p_logit_items);\n    {\n      std::copy(p_logit, p_logit + p_logit_items,\n                logit_with_temperature.begin());\n      for (float &elem : logit_with_temperature) {\n        elem /= temperature_scale_;\n      }\n      LogSoftmax(logit_with_temperature.data(), vocab_size, num_hyps);\n    }\n\n    if (blank_penalty_ > 0.0) {\n      // assuming blank id is 0\n      SubtractBlank(p_logit, vocab_size, num_hyps, 0, blank_penalty_);\n    }\n    LogSoftmax(p_logit, vocab_size, num_hyps);\n\n    // now p_logit contains log_softmax output, we rename it to p_logprob\n    // to match what it actually contains\n    float *p_logprob = p_logit;\n\n    // add log_prob of each hypothesis to p_logprob before taking top_k\n    for (int32_t i = 0; i != num_hyps; ++i) {\n      float log_prob = prev[i].log_prob;\n      if (lm_ && shallow_fusion_) {\n         log_prob += prev[i].lm_log_prob;\n      }\n\n      for (int32_t k = 0; k != vocab_size; ++k, ++p_logprob) {\n        *p_logprob += log_prob;\n      }\n    }\n    p_logprob = p_logit;  // we changed p_logprob in the above for loop\n\n    for (int32_t b = 0; b != batch_size; ++b) {\n      int32_t frame_offset = 
(*result)[b].frame_offset;\n      int32_t start = hyps_row_splits[b];\n      int32_t end = hyps_row_splits[b + 1];\n      auto topk =\n          TopkIndex(p_logprob, vocab_size * (end - start), max_active_paths_);\n\n      Hypotheses hyps;\n      for (auto k : topk) {\n        int32_t hyp_index = k / vocab_size + start;\n        int32_t new_token = k % vocab_size;\n\n        Hypothesis new_hyp = prev[hyp_index];\n        const float prev_lm_log_prob = new_hyp.lm_log_prob;\n        float context_score = 0;\n        auto context_state = new_hyp.context_state;\n\n        // blank is hardcoded to 0\n        // also, it treats unk as blank\n        if (new_token != 0 && new_token != unk_id_) {\n          new_hyp.ys.push_back(new_token);\n          new_hyp.timestamps.push_back(t + frame_offset);\n          new_hyp.num_trailing_blanks = 0;\n          if (ss != nullptr && ss[b]->GetContextGraph() != nullptr) {\n            auto context_res = ss[b]->GetContextGraph()->ForwardOneStep(\n                context_state, new_token, false /*strict mode*/);\n            context_score = std::get<0>(context_res);\n            new_hyp.context_state = std::get<1>(context_res);\n          }\n          if (lm_ && shallow_fusion_) {\n            lm_->ComputeLMScoreSF(lm_scale_, &new_hyp);\n          }\n        } else {\n          ++new_hyp.num_trailing_blanks;\n        }\n        if (lm_ && shallow_fusion_) {\n           new_hyp.log_prob = p_logprob[k] + context_score -\n                           prev_lm_log_prob;  // log_prob only includes the\n                                              // score of the transducer\n        } else {\n           new_hyp.log_prob = p_logprob[k] + context_score;  // rescore or no LM\n                                                             // previous token\n                                                             // score is ignored\n        }\n\n        // export the per-token log scores\n        if (new_token != 0 && new_token != unk_id_) {\n   
       float y_prob = logit_with_temperature[start * vocab_size + k];\n          new_hyp.ys_probs.push_back(y_prob);\n\n          if (lm_ && shallow_fusion_) {  // export only if\n                                         // LM shallow fusion is used\n            float lm_prob = new_hyp.lm_log_prob - prev_lm_log_prob;\n\n            if (lm_scale_ != 0.0) {\n              lm_prob /= lm_scale_;  // remove lm-scale\n            }\n            new_hyp.lm_probs.push_back(lm_prob);\n          }\n\n          // export only when `ContextGraph` is used\n          if (ss != nullptr && ss[b]->GetContextGraph() != nullptr) {\n            new_hyp.context_scores.push_back(context_score);\n          }\n        }\n\n        hyps.Add(std::move(new_hyp));\n      }  // for (auto k : topk)\n      cur.push_back(std::move(hyps));\n      p_logprob += (end - start) * vocab_size;\n    }  // for (int32_t b = 0; b != batch_size; ++b)\n  }    // for (int32_t t = 0; t != num_frames; ++t)\n\n  // classic lm rescore\n  if (lm_ && !shallow_fusion_) {\n    lm_->ComputeLMScore(lm_scale_, model_->ContextSize(), &cur);\n  }\n\n  for (int32_t b = 0; b != batch_size; ++b) {\n    auto &hyps = cur[b];\n    auto best_hyp = hyps.GetMostProbable(true);\n    auto &r = (*result)[b];\n\n    r.hyps = std::move(hyps);\n    r.tokens = std::move(best_hyp.ys);\n    r.num_trailing_blanks = best_hyp.num_trailing_blanks;\n    r.frame_offset += num_frames;\n  }\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoder::UpdateDecoderOut(\n    OnlineTransducerDecoderResult *result) {\n  if (static_cast<int32_t>(result->tokens.size()) == model_->ContextSize()) {\n    result->decoder_out = Ort::Value{nullptr};\n    return;\n  }\n  Ort::Value decoder_input = model_->BuildDecoderInput({*result});\n  result->decoder_out = model_->RunDecoder(std::move(decoder_input));\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-modified-beam-search-decoder.h",
    "content": "// sherpa-onnx/csrc/online-transducer-modified_beam-search-decoder.h\n//\n// Copyright (c)  2023  Pingfeng Luo\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-lm.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineTransducerModifiedBeamSearchDecoder\n    : public OnlineTransducerDecoder {\n public:\n  OnlineTransducerModifiedBeamSearchDecoder(OnlineTransducerModel *model,\n                                            OnlineLM *lm,\n                                            int32_t max_active_paths,\n                                            float lm_scale,\n                                            bool shallow_fusion,\n                                            int32_t unk_id,\n                                            float blank_penalty,\n                                            float temperature_scale)\n      : model_(model),\n        lm_(lm),\n        max_active_paths_(max_active_paths),\n        lm_scale_(lm_scale),\n        shallow_fusion_(shallow_fusion),\n        unk_id_(unk_id),\n        blank_penalty_(blank_penalty),\n        temperature_scale_(temperature_scale) {}\n\n  OnlineTransducerDecoderResult GetEmptyResult() const override;\n\n  void StripLeadingBlanks(OnlineTransducerDecoderResult *r) const override;\n\n  void Decode(Ort::Value encoder_out,\n              std::vector<OnlineTransducerDecoderResult> *result) override;\n\n  void Decode(Ort::Value encoder_out, OnlineStream **ss,\n              std::vector<OnlineTransducerDecoderResult> *result) override;\n\n  void UpdateDecoderOut(OnlineTransducerDecoderResult *result) override;\n\n private:\n  
OnlineTransducerModel *model_;  // Not owned\n  OnlineLM *lm_;                  // Not owned\n\n  int32_t max_active_paths_;\n  float lm_scale_;  // used only when lm_ is not nullptr\n  bool shallow_fusion_;  // used only when lm_ is not nullptr\n  int32_t unk_id_;\n  float blank_penalty_;\n  float temperature_scale_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-nemo-model.cc",
    "content": "// sherpa-onnx/csrc/online-transducer-nemo-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Sangeet Sagar\n\n#include \"sherpa-onnx/csrc/online-transducer-nemo-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <memory>\n#include <numeric>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineTransducerNeMoModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    encoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.transducer.encoder), sess_opts_);\n    InitEncoder(nullptr, 0);\n\n    decoder_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.transducer.decoder), sess_opts_);\n    InitDecoder(nullptr, 0);\n\n    joiner_sess_ = std::make_unique<Ort::Session>(\n        env_, SHERPA_ONNX_TO_ORT_PATH(config.transducer.joiner), sess_opts_);\n    InitJoiner(nullptr, 0);\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = 
ReadFile(mgr, config.transducer.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.joiner);\n      InitJoiner(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> RunEncoder(Ort::Value features,\n                                     std::vector<Ort::Value> states) {\n    Ort::Value &cache_last_channel = states[0];\n    Ort::Value &cache_last_time = states[1];\n    Ort::Value &cache_last_channel_len = states[2];\n\n    int32_t batch_size = features.GetTensorTypeAndShapeInfo().GetShape()[0];\n\n    std::array<int64_t, 1> length_shape{batch_size};\n\n    Ort::Value length = Ort::Value::CreateTensor<int64_t>(\n        allocator_, length_shape.data(), length_shape.size());\n\n    int64_t *p_length = length.GetTensorMutableData<int64_t>();\n\n    std::fill(p_length, p_length + batch_size, ChunkSize());\n\n    // (B, T, C) -> (B, C, T)\n    features = Transpose12(allocator_, &features);\n\n    std::array<Ort::Value, 5> inputs = {\n        std::move(features), View(&length), std::move(cache_last_channel),\n        std::move(cache_last_time), std::move(cache_last_channel_len)};\n\n    auto out = encoder_sess_->Run(\n        {}, encoder_input_names_ptr_.data(), inputs.data(), inputs.size(),\n        encoder_output_names_ptr_.data(), encoder_output_names_ptr_.size());\n    // out[0]: logit\n    // out[1] logit_length\n    // out[2:] states_next\n    //\n    // we need to remove out[1]\n\n    std::vector<Ort::Value> ans;\n    ans.reserve(out.size() - 1);\n\n    for (int32_t i = 0; i != out.size(); ++i) {\n      if (i == 1) {\n        continue;\n      }\n\n      ans.push_back(std::move(out[i]));\n    }\n\n    return ans;\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunDecoder(\n      Ort::Value targets, std::vector<Ort::Value> states) {\n    Ort::MemoryInfo 
memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);\n\n    auto shape = targets.GetTensorTypeAndShapeInfo().GetShape();\n    int32_t batch_size = static_cast<int32_t>(shape[0]);\n\n    std::vector<int64_t> length_shape = {batch_size};\n    std::vector<int32_t> length_value(batch_size, 1);\n\n    Ort::Value targets_length = Ort::Value::CreateTensor<int32_t>(\n        memory_info, length_value.data(), batch_size, length_shape.data(),\n        length_shape.size());\n\n    std::vector<Ort::Value> decoder_inputs;\n    decoder_inputs.reserve(2 + states.size());\n\n    decoder_inputs.push_back(std::move(targets));\n    decoder_inputs.push_back(std::move(targets_length));\n\n    for (auto &s : states) {\n      decoder_inputs.push_back(std::move(s));\n    }\n\n    auto decoder_out = decoder_sess_->Run(\n        {}, decoder_input_names_ptr_.data(), decoder_inputs.data(),\n        decoder_inputs.size(), decoder_output_names_ptr_.data(),\n        decoder_output_names_ptr_.size());\n\n    std::vector<Ort::Value> states_next;\n    states_next.reserve(states.size());\n\n    // decoder_out[0]: decoder_output\n    // decoder_out[1]: decoder_output_length (discarded)\n    // decoder_out[2:] states_next\n\n    for (int32_t i = 0; i != states.size(); ++i) {\n      states_next.push_back(std::move(decoder_out[i + 2]));\n    }\n\n    // we discard decoder_out[1]\n    return {std::move(decoder_out[0]), std::move(states_next)};\n  }\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) {\n    std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                              std::move(decoder_out)};\n    auto logit = joiner_sess_->Run({}, joiner_input_names_ptr_.data(),\n                                   joiner_input.data(), joiner_input.size(),\n                                   joiner_output_names_ptr_.data(),\n                                   joiner_output_names_ptr_.size());\n\n    return 
std::move(logit[0]);\n  }\n\n  std::vector<Ort::Value> GetDecoderInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(2);\n    ans.push_back(View(&lstm0_));\n    ans.push_back(View(&lstm1_));\n\n    return ans;\n  }\n\n  int32_t ChunkSize() const { return window_size_; }\n\n  int32_t ChunkShift() const { return chunk_shift_; }\n\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n  int32_t FeatureDim() const { return feat_dim_; }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  std::string FeatureNormalizationMethod() const { return normalize_type_; }\n\n  // Return a vector containing 3 tensors\n  // - cache_last_channel\n  // - cache_last_time_\n  // - cache_last_channel_len\n  std::vector<Ort::Value> GetEncoderInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(3);\n    ans.push_back(View(&cache_last_channel_));\n    ans.push_back(View(&cache_last_time_));\n    ans.push_back(View(&cache_last_channel_len_));\n\n    return ans;\n  }\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const {\n    int32_t batch_size = static_cast<int32_t>(states.size());\n    if (batch_size == 1) {\n      return std::move(states[0]);\n    }\n\n    std::vector<Ort::Value> ans;\n\n    auto allocator = const_cast<Impl *>(this)->allocator_;\n\n    // stack cache_last_channel\n    std::vector<const Ort::Value *> buf(batch_size);\n\n    // there are 3 states to be stacked\n    for (int32_t i = 0; i != 3; ++i) {\n      buf.clear();\n      buf.reserve(batch_size);\n\n      for (int32_t b = 0; b != batch_size; ++b) {\n        assert(states[b].size() == 3);\n        buf.push_back(&states[b][i]);\n      }\n\n      Ort::Value c{nullptr};\n      if (i == 2) {\n        c = Cat<int64_t>(allocator, buf, 0);\n      } else {\n        c = Cat(allocator, buf, 0);\n      }\n\n      ans.push_back(std::move(c));\n    }\n\n    return ans;\n  }\n\n  
std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) {\n    assert(states.size() == 3);\n\n    std::vector<std::vector<Ort::Value>> ans;\n\n    auto shape = states[0].GetTensorTypeAndShapeInfo().GetShape();\n    int32_t batch_size = shape[0];\n    ans.resize(batch_size);\n\n    if (batch_size == 1) {\n      ans[0] = std::move(states);\n      return ans;\n    }\n\n    for (int32_t i = 0; i != 3; ++i) {\n      std::vector<Ort::Value> v;\n      if (i == 2) {\n        v = Unbind<int64_t>(allocator_, &states[i], 0);\n      } else {\n        v = Unbind(allocator_, &states[i], 0);\n      }\n\n      assert(v.size() == batch_size);\n\n      for (int32_t b = 0; b != batch_size; ++b) {\n        ans[b].push_back(std::move(v[b]));\n      }\n    }\n\n    return ans;\n  }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      encoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!encoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize encoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                  &encoder_input_names_ptr_);\n\n    GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                   &encoder_output_names_ptr_);\n\n    feat_dim_ = encoder_sess_->GetInputTypeInfo(0)\n                    .GetTensorTypeAndShapeInfo()\n                    .GetShape()[1];\n\n    // get meta data\n    Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---encoder---\\n\";\n      PrintModelMetadata(os, meta_data);\n      os << \"feat_dim: \" << feat_dim_ << \"\\n\";\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", 
os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n\n    // need to increase by 1 since the blank token is not included in computing\n    // vocab_size in NeMo.\n    vocab_size_ += 1;\n\n    SHERPA_ONNX_READ_META_DATA(window_size_, \"window_size\");\n    SHERPA_ONNX_READ_META_DATA(chunk_shift_, \"chunk_shift\");\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(normalize_type_,\n                                               \"normalize_type\");\n    SHERPA_ONNX_READ_META_DATA(pred_rnn_layers_, \"pred_rnn_layers\");\n    SHERPA_ONNX_READ_META_DATA(pred_hidden_, \"pred_hidden\");\n\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim1_,\n                               \"cache_last_channel_dim1\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim2_,\n                               \"cache_last_channel_dim2\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_channel_dim3_,\n                               \"cache_last_channel_dim3\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim1_, \"cache_last_time_dim1\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim2_, \"cache_last_time_dim2\");\n    SHERPA_ONNX_READ_META_DATA(cache_last_time_dim3_, \"cache_last_time_dim3\");\n\n    if (normalize_type_ == \"NA\") {\n      normalize_type_ = \"\";\n    }\n\n    InitEncoderStates();\n  }\n\n  void InitEncoderStates() {\n    std::array<int64_t, 4> cache_last_channel_shape{1, cache_last_channel_dim1_,\n                                                    cache_last_channel_dim2_,\n                                                    cache_last_channel_dim3_};\n\n    cache_last_channel_ = Ort::Value::CreateTensor<float>(\n        allocator_, cache_last_channel_shape.data(),\n        cache_last_channel_shape.size());\n\n    Fill<float>(&cache_last_channel_, 0);\n\n    
std::array<int64_t, 4> cache_last_time_shape{\n        1, cache_last_time_dim1_, cache_last_time_dim2_, cache_last_time_dim3_};\n\n    cache_last_time_ = Ort::Value::CreateTensor<float>(\n        allocator_, cache_last_time_shape.data(), cache_last_time_shape.size());\n\n    Fill<float>(&cache_last_time_, 0);\n\n    int64_t shape = 1;\n    cache_last_channel_len_ =\n        Ort::Value::CreateTensor<int64_t>(allocator_, &shape, 1);\n\n    cache_last_channel_len_.GetTensorMutableData<int64_t>()[0] = 0;\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      decoder_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!decoder_sess_) {\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize decoder session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                  &decoder_input_names_ptr_);\n\n    GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                   &decoder_output_names_ptr_);\n\n    InitDecoderStates();\n  }\n\n  void InitDecoderStates() {\n    int32_t batch_size = 1;\n    std::array<int64_t, 3> s0_shape{pred_rnn_layers_, batch_size, pred_hidden_};\n    lstm0_ = Ort::Value::CreateTensor<float>(allocator_, s0_shape.data(),\n                                             s0_shape.size());\n\n    Fill<float>(&lstm0_, 0);\n\n    std::array<int64_t, 3> s1_shape{pred_rnn_layers_, batch_size, pred_hidden_};\n\n    lstm1_ = Ort::Value::CreateTensor<float>(allocator_, s1_shape.data(),\n                                             s1_shape.size());\n\n    Fill<float>(&lstm1_, 0);\n  }\n\n  void InitJoiner(void *model_data, size_t model_data_length) {\n    if (model_data) {\n      joiner_sess_ = std::make_unique<Ort::Session>(\n          env_, model_data, model_data_length, sess_opts_);\n    } else if (!joiner_sess_) 
{\n      SHERPA_ONNX_LOGE(\n          \"Please pass buffer data or initialize joiner session outside of \"\n          \"this function\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                  &joiner_input_names_ptr_);\n\n    GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                   &joiner_output_names_ptr_);\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  int32_t window_size_ = 0;\n  int32_t chunk_shift_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 8;\n  int32_t feat_dim_ = 80;\n  std::string normalize_type_;\n  int32_t pred_rnn_layers_ = -1;\n  int32_t pred_hidden_ = -1;\n\n  // encoder states\n  int32_t cache_last_channel_dim1_ = 0;\n  int32_t cache_last_channel_dim2_ = 0;\n  int32_t cache_last_channel_dim3_ = 0;\n  int32_t cache_last_time_dim1_ = 0;\n  int32_t cache_last_time_dim2_ = 0;\n  int32_t cache_last_time_dim3_ = 0;\n\n  // init encoder states\n  Ort::Value cache_last_channel_{nullptr};\n  Ort::Value cache_last_time_{nullptr};\n  Ort::Value 
cache_last_channel_len_{nullptr};\n\n  // init decoder states\n  Ort::Value lstm0_{nullptr};\n  Ort::Value lstm1_{nullptr};\n};\n\nOnlineTransducerNeMoModel::OnlineTransducerNeMoModel(\n    const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineTransducerNeMoModel::OnlineTransducerNeMoModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineTransducerNeMoModel::~OnlineTransducerNeMoModel() = default;\n\nstd::vector<Ort::Value> OnlineTransducerNeMoModel::RunEncoder(\n    Ort::Value features, std::vector<Ort::Value> states) const {\n  return impl_->RunEncoder(std::move(features), std::move(states));\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineTransducerNeMoModel::RunDecoder(Ort::Value targets,\n                                      std::vector<Ort::Value> states) const {\n  return impl_->RunDecoder(std::move(targets), std::move(states));\n}\n\nstd::vector<Ort::Value> OnlineTransducerNeMoModel::GetDecoderInitStates()\n    const {\n  return impl_->GetDecoderInitStates();\n}\n\nOrt::Value OnlineTransducerNeMoModel::RunJoiner(Ort::Value encoder_out,\n                                                Ort::Value decoder_out) const {\n  return impl_->RunJoiner(std::move(encoder_out), std::move(decoder_out));\n}\n\nint32_t OnlineTransducerNeMoModel::ChunkSize() const {\n  return impl_->ChunkSize();\n}\n\nint32_t OnlineTransducerNeMoModel::ChunkShift() const {\n  return impl_->ChunkShift();\n}\n\nint32_t OnlineTransducerNeMoModel::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\nint32_t OnlineTransducerNeMoModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nint32_t OnlineTransducerNeMoModel::FeatureDim() const {\n  return impl_->FeatureDim();\n}\n\nOrtAllocator *OnlineTransducerNeMoModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::string 
OnlineTransducerNeMoModel::FeatureNormalizationMethod() const {\n  return impl_->FeatureNormalizationMethod();\n}\n\nstd::vector<Ort::Value> OnlineTransducerNeMoModel::GetEncoderInitStates()\n    const {\n  return impl_->GetEncoderInitStates();\n}\n\nstd::vector<Ort::Value> OnlineTransducerNeMoModel::StackStates(\n    std::vector<std::vector<Ort::Value>> states) const {\n  return impl_->StackStates(std::move(states));\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineTransducerNeMoModel::UnStackStates(\n    std::vector<Ort::Value> states) const {\n  return impl_->UnStackStates(std::move(states));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineTransducerNeMoModel::OnlineTransducerNeMoModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineTransducerNeMoModel::OnlineTransducerNeMoModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-transducer-nemo-model.h",
    "content": "// sherpa-onnx/csrc/online-transducer-nemo-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n// Copyright (c)  2024  Sangeet Sagar\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_NEMO_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_NEMO_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\n// see\n// https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/models/hybrid_rnnt_ctc_bpe_models.py#L40\n// Its decoder is stateful, not stateless.\nclass OnlineTransducerNeMoModel {\n public:\n  explicit OnlineTransducerNeMoModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineTransducerNeMoModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineTransducerNeMoModel();\n  // A list of 3 tensors:\n  //  - cache_last_channel\n  //  - cache_last_time\n  //  - cache_last_channel_len\n  std::vector<Ort::Value> GetEncoderInitStates() const;\n\n  // stack encoder states\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const;\n\n  // unstack encoder states\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const;\n\n  /** Run the encoder.\n   *\n   * @param features  A tensor of shape (N, T, C). 
It is changed in-place.\n   * @param states  It is from GetEncoderInitStates() or returned from this\n   *                method.\n   *\n   * @return Return a vector containing:\n   *           - ans[0]: encoder_out, a tensor of shape (N, encoder_out_dim, T')\n   *           - ans[1:]: contains next states\n   */\n  std::vector<Ort::Value> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states) const;  // NOLINT\n\n  /** Run the decoder network.\n   *\n   * @param targets An int32 tensor of shape (batch_size, 1)\n   * @param states The states for the decoder model.\n   * @return Return a pair:\n   *           - first: the decoder_out (a float tensor)\n   *           - second: the next states\n   */\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunDecoder(\n      Ort::Value targets, std::vector<Ort::Value> states) const;\n\n  std::vector<Ort::Value> GetDecoderInitStates() const;\n\n  /** Run the joint network.\n   *\n   * @param encoder_out Output of the encoder network.\n   * @param decoder_out Output of the decoder network.\n   * @return Return a tensor of shape (N, 1, 1, vocab_size) containing logits.\n   */\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) const;\n\n  /** We send this number of feature frames to the encoder at a time. 
*/\n  int32_t ChunkSize() const;\n\n  /** Number of input frames to discard after each call to RunEncoder.\n   *\n   * For instance, suppose we have 30 frames, chunk_size=8, and chunk_shift=6.\n   *\n   * In the first call of RunEncoder, we use frames 0~7 since chunk_size is 8.\n   * Then we discard frames 0~5 since chunk_shift is 6.\n   * In the second call of RunEncoder, we use frames 6~13; and then we discard\n   * frames 6~11.\n   * In the third call of RunEncoder, we use frames 12~19; and then we discard\n   * frames 12~17.\n   *\n   * Note: ChunkSize() - ChunkShift() == right context size\n   */\n  int32_t ChunkShift() const;\n\n  /** Return the subsampling factor of the model.\n   */\n  int32_t SubsamplingFactor() const;\n\n  int32_t VocabSize() const;\n\n  int32_t FeatureDim() const;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const;\n\n  // Possible values:\n  // - per_feature\n  // - all_features (not implemented yet)\n  // - fixed_mean (not implemented)\n  // - fixed_std (not implemented)\n  // - or just leave it empty\n  // See\n  // https://github.com/NVIDIA/NeMo/blob/main/nemo/collections/asr/parts/preprocessing/features.py#L59\n  // for details\n  std::string FeatureNormalizationMethod() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_TRANSDUCER_NEMO_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-websocket-client.cc",
    "content": "// sherpa/cpp_api/websocket/online-websocket-client.cc\n//\n// Copyright (c)  2022  Xiaomi Corporation\n#include <chrono>  // NOLINT\n#include <fstream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"websocketpp/client.hpp\"\n#include \"websocketpp/config/asio_no_tls_client.hpp\"\n#include \"websocketpp/uri.hpp\"\n\nusing client = websocketpp::client<websocketpp::config::asio_client>;\n\nusing message_ptr = client::message_ptr;\nusing websocketpp::connection_hdl;\n\nstatic constexpr const char *kUsageMessage = R\"(\nAutomatic speech recognition with sherpa-onnx using websocket.\n\nUsage:\n\n./bin/sherpa-onnx-online-websocket-client --help\n\n./bin/sherpa-onnx-online-websocket-client \\\n  --server-ip=127.0.0.1 \\\n  --server-port=6006 \\\n  --samples-per-message=8000 \\\n  --seconds-per-message=0.2 \\\n  /path/to/foo.wav\n\nIt support only wave of with a single channel, 16kHz, 16-bit samples.\n)\";\n\nclass Client {\n public:\n  Client(asio::io_context &io,  // NOLINT\n         const std::string &ip, int16_t port, const std::vector<float> &samples,\n         int32_t samples_per_message, float seconds_per_message)\n      : io_(io),\n        uri_(/*secure*/ false, ip, port, /*resource*/ \"/\"),\n        samples_(samples),\n        samples_per_message_(samples_per_message),\n        seconds_per_message_(seconds_per_message) {\n    c_.clear_access_channels(websocketpp::log::alevel::all);\n    // c_.set_access_channels(websocketpp::log::alevel::connect);\n    // c_.set_access_channels(websocketpp::log::alevel::disconnect);\n\n    c_.init_asio(&io_);\n    c_.set_open_handler([this](connection_hdl hdl) { OnOpen(hdl); });\n    c_.set_close_handler(\n        [](connection_hdl /*hdl*/) { SHERPA_ONNX_LOGE(\"Disconnected\"); });\n    c_.set_message_handler(\n        [this](connection_hdl hdl, message_ptr msg) { OnMessage(hdl, 
msg); });\n\n    Run();\n  }\n\n private:\n  void Run() {\n    websocketpp::lib::error_code ec;\n    client::connection_ptr con = c_.get_connection(uri_.str(), ec);\n    if (ec) {\n      SHERPA_ONNX_LOGE(\"Could not create connection to %s because %s\",\n                       uri_.str().c_str(), ec.message().c_str());\n      exit(EXIT_FAILURE);\n    }\n\n    c_.connect(con);\n  }\n\n  void OnOpen(connection_hdl hdl) {\n    auto start_time = std::chrono::steady_clock::now();\n    asio::post(\n        io_, [this, hdl, start_time]() { this->SendMessage(hdl, start_time); });\n  }\n\n  void OnMessage(connection_hdl hdl, message_ptr msg) {\n    const std::string &payload = msg->get_payload();\n\n    if (payload == \"Done!\") {\n      websocketpp::lib::error_code ec;\n      c_.close(hdl, websocketpp::close::status::normal, \"I'm exiting now\", ec);\n      if (ec) {\n        SHERPA_ONNX_LOGE(\"Failed to close because %s\", ec.message().c_str());\n        exit(EXIT_FAILURE);\n      }\n    } else {\n      SHERPA_ONNX_LOGE(\"%s\", payload.c_str());\n    }\n  }\n\n  void SendMessage(\n      connection_hdl hdl,\n      std::chrono::time_point<std::chrono::steady_clock> start_time) {\n    int32_t num_samples = samples_.size();\n    int32_t num_messages = num_samples / samples_per_message_;\n\n    websocketpp::lib::error_code ec;\n    auto time = std::chrono::steady_clock::now();\n    int elapsed_time_ms =\n        std::chrono::duration_cast<std::chrono::milliseconds>(time - start_time)\n            .count();\n\n    if (elapsed_time_ms <\n        static_cast<int>(seconds_per_message_ * num_sent_messages_ * 1000)) {\n      std::this_thread::sleep_for(std::chrono::milliseconds(int(\n          seconds_per_message_ * num_sent_messages_ * 1000 - elapsed_time_ms)));\n    }\n\n    if (num_sent_messages_ < 1) {\n      SHERPA_ONNX_LOGE(\"Starting to send audio\");\n    }\n\n    if (num_sent_messages_ < num_messages) {\n      c_.send(hdl, samples_.data() + num_sent_messages_ * 
samples_per_message_,\n              samples_per_message_ * sizeof(float),\n              websocketpp::frame::opcode::binary, ec);\n\n      if (ec) {\n        SHERPA_ONNX_LOGE(\"Failed to send audio samples because %s\",\n                         ec.message().c_str());\n        exit(EXIT_FAILURE);\n      }\n\n      ec.clear();\n\n      ++num_sent_messages_;\n    }\n\n    if (num_sent_messages_ == num_messages) {\n      int32_t remaining_samples = num_samples % samples_per_message_;\n      if (remaining_samples) {\n        c_.send(hdl,\n                samples_.data() + num_sent_messages_ * samples_per_message_,\n                remaining_samples * sizeof(float),\n                websocketpp::frame::opcode::binary, ec);\n\n        if (ec) {\n          SHERPA_ONNX_LOGE(\"Failed to send audio samples because %s\",\n                           ec.message().c_str());\n          exit(EXIT_FAILURE);\n        }\n        ec.clear();\n      }\n\n      // To signal that we have sent all the messages\n      c_.send(hdl, \"Done\", websocketpp::frame::opcode::text, ec);\n\n      if (ec) {\n        SHERPA_ONNX_LOGE(\"Failed to send the Done signal because %s\",\n                         ec.message().c_str());\n        exit(EXIT_FAILURE);\n      }\n\n      SHERPA_ONNX_LOGE(\"Sent Done Signal\");\n    } else {\n      asio::post(io_, [this, hdl, start_time]() {\n        this->SendMessage(hdl, start_time);\n      });\n    }\n  }\n\n private:\n  client c_;\n  asio::io_context &io_;\n  websocketpp::uri uri_;\n  std::vector<float> samples_;\n  int32_t samples_per_message_ = 8000;  // 0.5 seconds at 16 kHz\n  float seconds_per_message_ = 0.2;\n  int32_t num_sent_messages_ = 0;\n};\n\nint32_t main(int32_t argc, char *argv[]) {\n  std::string server_ip = \"127.0.0.1\";\n  int32_t server_port = 6006;\n\n  // Sample rate of the input wave. 
No resampling is made.\n  int32_t sample_rate = 16000;\n  int32_t samples_per_message = 8000;\n  float seconds_per_message = 0.2;\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n\n  po.Register(\"server-ip\", &server_ip, \"IP address of the websocket server\");\n  po.Register(\"server-port\", &server_port, \"Port of the websocket server\");\n  po.Register(\"sample-rate\", &sample_rate,\n              \"Sample rate of the input wave. Should be the one expected by \"\n              \"the server\");\n\n  po.Register(\"samples-per-message\", &samples_per_message,\n              \"Send this number of samples per message.\");\n\n  po.Register(\"seconds-per-message\", &seconds_per_message,\n              \"We will simulate that each message takes this number of seconds \"\n              \"to send. If you select a very large value, it will take a long \"\n              \"time to send all the samples\");\n\n  po.Read(argc, argv);\n\n  if (!websocketpp::uri_helper::ipv4_literal(server_ip.begin(),\n                                             server_ip.end())) {\n    SHERPA_ONNX_LOGE(\"Invalid server IP: %s\", server_ip.c_str());\n    return -1;\n  }\n\n  if (server_port <= 0 || server_port > 65535) {\n    SHERPA_ONNX_LOGE(\"Invalid server port: %d\", server_port);\n    return -1;\n  }\n\n  // 0.01 is an arbitrary value. You can change it.\n  if (samples_per_message <= 0.01 * sample_rate) {\n    SHERPA_ONNX_LOGE(\"--samples-per-message is too small: %d\",\n                     samples_per_message);\n    return -1;\n  }\n\n  // 100 is an arbitrary value. 
You can change it.\n  if (samples_per_message >= sample_rate * 100) {\n    SHERPA_ONNX_LOGE(\"--samples-per-message is too large: %d\",\n                     samples_per_message);\n    return -1;\n  }\n\n  if (seconds_per_message < 0) {\n    SHERPA_ONNX_LOGE(\"--seconds-per-message is too small: %.3f\",\n                     seconds_per_message);\n    return -1;\n  }\n\n  // 1 is an arbitrary value.\n  if (seconds_per_message > 1) {\n    SHERPA_ONNX_LOGE(\n        \"--seconds-per-message is too large: %.3f. You will wait a long time \"\n        \"to send all the samples\",\n        seconds_per_message);\n    return -1;\n  }\n\n  if (po.NumArgs() != 1) {\n    po.PrintUsage();\n    return -1;\n  }\n\n  std::string wave_filename = po.GetArg(1);\n\n  bool is_ok = false;\n  int32_t actual_sample_rate = -1;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(wave_filename, &actual_sample_rate, &is_ok);\n\n  if (!is_ok) {\n    SHERPA_ONNX_LOGE(\"Failed to read '%s'\", wave_filename.c_str());\n    return -1;\n  }\n\n  if (actual_sample_rate != sample_rate) {\n    SHERPA_ONNX_LOGE(\"Expected sample rate: %d, given %d\", sample_rate,\n                     actual_sample_rate);\n    return -1;\n  }\n\n  asio::io_context io_conn;  // for network connections\n  Client c(io_conn, server_ip, server_port, samples, samples_per_message,\n           seconds_per_message);\n\n  io_conn.run();  // will exit when the above connection is closed\n\n  SHERPA_ONNX_LOGE(\"Done!\");\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-websocket-server-impl.cc",
    "content": "// sherpa-onnx/csrc/online-websocket-server-impl.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-websocket-server-impl.h\"\n\n#include <iostream>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/log.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineWebsocketDecoderConfig::Register(ParseOptions *po) {\n  recognizer_config.Register(po);\n\n  po->Register(\"loop-interval-ms\", &loop_interval_ms,\n               \"It determines how often the decoder loop runs. \");\n\n  po->Register(\"max-batch-size\", &max_batch_size,\n               \"Max batch size for recognition.\");\n\n  po->Register(\"end-tail-padding\", &end_tail_padding,\n               \"It determines the length of tail_padding at the end of audio.\");\n}\n\nvoid OnlineWebsocketDecoderConfig::Validate() const {\n  recognizer_config.Validate();\n  SHERPA_ONNX_CHECK_GT(loop_interval_ms, 0);\n  SHERPA_ONNX_CHECK_GT(max_batch_size, 0);\n  SHERPA_ONNX_CHECK_GT(end_tail_padding, 0);\n}\n\nvoid OnlineWebsocketServerConfig::Register(sherpa_onnx::ParseOptions *po) {\n  decoder_config.Register(po);\n\n  po->Register(\"log-file\", &log_file,\n               \"Path to the log file. 
Logs are \"\n               \"appended to this file\");\n}\n\nvoid OnlineWebsocketServerConfig::Validate() const {\n  decoder_config.Validate();\n}\n\nOnlineWebsocketDecoder::OnlineWebsocketDecoder(OnlineWebsocketServer *server)\n    : server_(server),\n      config_(server->GetConfig().decoder_config),\n      timer_(server->GetWorkContext()) {\n  recognizer_ = std::make_unique<OnlineRecognizer>(config_.recognizer_config);\n}\n\nstd::shared_ptr<Connection> OnlineWebsocketDecoder::GetOrCreateConnection(\n    connection_hdl hdl) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  auto it = connections_.find(hdl);\n  if (it != connections_.end()) {\n    return it->second;\n  } else {\n    // create a new connection\n    std::shared_ptr<OnlineStream> s = recognizer_->CreateStream();\n    auto c = std::make_shared<Connection>(hdl, s);\n    connections_.insert({hdl, c});\n    return c;\n  }\n}\n\nvoid OnlineWebsocketDecoder::AcceptWaveform(std::shared_ptr<Connection> c) {\n  std::lock_guard<std::mutex> lock(c->mutex);\n  float sample_rate = config_.recognizer_config.feat_config.sampling_rate;\n  while (!c->samples.empty()) {\n    const auto &s = c->samples.front();\n    c->s->AcceptWaveform(sample_rate, s.data(), s.size());\n    c->samples.pop_front();\n  }\n}\n\nvoid OnlineWebsocketDecoder::InputFinished(std::shared_ptr<Connection> c) {\n  std::lock_guard<std::mutex> lock(c->mutex);\n\n  float sample_rate = config_.recognizer_config.feat_config.sampling_rate;\n\n  while (!c->samples.empty()) {\n    const auto &s = c->samples.front();\n    c->s->AcceptWaveform(sample_rate, s.data(), s.size());\n    c->samples.pop_front();\n  }\n\n  std::vector<float> tail_padding(\n      static_cast<int64_t>(config_.end_tail_padding * sample_rate));\n\n  c->s->AcceptWaveform(sample_rate, tail_padding.data(), tail_padding.size());\n\n  c->s->InputFinished();\n  c->eof = true;\n}\n\nvoid OnlineWebsocketDecoder::Warmup() const {\n  
recognizer_->WarmpUpRecognizer(config_.recognizer_config.model_config.warm_up,\n                                 config_.max_batch_size);\n}\n\nvoid OnlineWebsocketDecoder::Run() {\n  timer_.expires_after(std::chrono::milliseconds(config_.loop_interval_ms));\n\n  timer_.async_wait(\n      [this](const asio::error_code &ec) { ProcessConnections(ec); });\n}\n\nvoid OnlineWebsocketDecoder::ProcessConnections(const asio::error_code &ec) {\n  if (ec) {\n    SHERPA_ONNX_LOG(FATAL) << \"The decoder loop is aborted!\";\n  }\n\n  std::lock_guard<std::mutex> lock(mutex_);\n  std::vector<connection_hdl> to_remove;\n  for (auto &p : connections_) {\n    auto hdl = p.first;\n    auto c = p.second;\n\n    // The order of `if` below matters!\n    if (!server_->Contains(hdl)) {\n      // If the connection is disconnected, we stop processing it\n      to_remove.push_back(hdl);\n      continue;\n    }\n\n    if (active_.count(hdl)) {\n      // Another thread is decoding this stream, so skip it\n      continue;\n    }\n\n    if (!recognizer_->IsReady(c->s.get()) && !c->eof) {\n      // this stream has not enough frames to decode, so skip it\n      continue;\n    }\n\n    if (!recognizer_->IsReady(c->s.get()) && c->eof) {\n      // We won't receive samples from the client, so send a Done! 
to client\n\n      asio::post(server_->GetWorkContext(),\n                 [this, hdl = c->hdl]() { server_->Send(hdl, \"Done!\"); });\n\n      to_remove.push_back(hdl);\n      continue;\n    }\n\n    // TODO(fangun): If the connection is timed out, we need to also\n    // add it to `to_remove`\n\n    // this stream has enough frames and is currently not processed by any\n    // threads, so put it into the ready queue\n    ready_connections_.push_back(c);\n\n    // In `Decode()`, it will remove hdl from `active_`\n    active_.insert(c->hdl);\n  }\n\n  for (auto hdl : to_remove) {\n    connections_.erase(hdl);\n  }\n\n  if (!ready_connections_.empty()) {\n    asio::post(server_->GetWorkContext(), [this]() { Decode(); });\n  }\n\n  // Schedule another call\n  timer_.expires_after(std::chrono::milliseconds(config_.loop_interval_ms));\n\n  timer_.async_wait(\n      [this](const asio::error_code &ec) { ProcessConnections(ec); });\n}\n\nvoid OnlineWebsocketDecoder::Decode() {\n  std::unique_lock<std::mutex> lock(mutex_);\n  if (ready_connections_.empty()) {\n    // There are no connections that are ready for decoding,\n    // so we return directly\n    return;\n  }\n\n  std::vector<std::shared_ptr<Connection>> c_vec;\n  std::vector<OnlineStream *> s_vec;\n  while (!ready_connections_.empty() &&\n         static_cast<int32_t>(s_vec.size()) < config_.max_batch_size) {\n    auto c = ready_connections_.front();\n    ready_connections_.pop_front();\n\n    c_vec.push_back(c);\n    s_vec.push_back(c->s.get());\n  }\n\n  if (!ready_connections_.empty()) {\n    // there are too many ready connections but this thread can only handle\n    // max_batch_size connections at a time, so we schedule another call\n    // to Decode() and let other threads to process the ready connections\n    asio::post(server_->GetWorkContext(), [this]() { Decode(); });\n  }\n\n  lock.unlock();\n  recognizer_->DecodeStreams(s_vec.data(), s_vec.size());\n  lock.lock();\n\n  for (auto c : c_vec) {\n    auto 
result = recognizer_->GetResult(c->s.get());\n    if (recognizer_->IsEndpoint(c->s.get())) {\n      result.is_final = true;\n      recognizer_->Reset(c->s.get());\n    }\n\n    if (!recognizer_->IsReady(c->s.get()) && c->eof) {\n      result.is_final = true;\n      result.is_eof = true;\n    }\n\n    asio::post(server_->GetConnectionContext(),\n               [this, hdl = c->hdl, str = result.AsJsonString()]() {\n                 server_->Send(hdl, str);\n               });\n    active_.erase(c->hdl);\n  }\n}\n\nOnlineWebsocketServer::OnlineWebsocketServer(\n    asio::io_context &io_conn, asio::io_context &io_work,\n    const OnlineWebsocketServerConfig &config)\n    : config_(config),\n      io_conn_(io_conn),\n      io_work_(io_work),\n      log_(config.log_file, std::ios::app),\n      tee_(std::cout, log_),\n      decoder_(this) {\n  SetupLog();\n\n  server_.init_asio(&io_conn_);\n\n  server_.set_open_handler([this](connection_hdl hdl) { OnOpen(hdl); });\n\n  server_.set_close_handler([this](connection_hdl hdl) { OnClose(hdl); });\n\n  server_.set_message_handler(\n      [this](connection_hdl hdl, server::message_ptr msg) {\n        OnMessage(hdl, msg);\n      });\n}\n\nvoid OnlineWebsocketServer::Run(uint16_t port) {\n  server_.set_reuse_addr(true);\n  server_.listen(asio::ip::tcp::v4(), port);\n  server_.start_accept();\n  auto recognizer_config = config_.decoder_config.recognizer_config;\n  int32_t warm_up = recognizer_config.model_config.warm_up;\n  const std::string &model_type = recognizer_config.model_config.model_type;\n  if (0 < warm_up && warm_up < 100) {\n    if (model_type == \"zipformer2\") {\n      decoder_.Warmup();\n      SHERPA_ONNX_LOGE(\"Warm up completed : %d times.\", warm_up);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only Zipformer2 has warmup support for now.\");\n      SHERPA_ONNX_LOGE(\"Given: %s\", model_type.c_str());\n      exit(0);\n    }\n  } else if (warm_up == 0) {\n    SHERPA_ONNX_LOGE(\"Starting without warmup!\");\n  } else {\n  
  SHERPA_ONNX_LOGE(\"Invalid warm-up value! Expected 0 <= warm_up < 100\");\n    exit(0);\n  }\n  decoder_.Run();\n}\n\nvoid OnlineWebsocketServer::SetupLog() {\n  server_.clear_access_channels(websocketpp::log::alevel::all);\n  // server_.set_access_channels(websocketpp::log::alevel::connect);\n  // server_.set_access_channels(websocketpp::log::alevel::disconnect);\n\n  // Write both access and error logs to std::cout and the log file\n  server_.get_alog().set_ostream(&tee_);\n  server_.get_elog().set_ostream(&tee_);\n}\n\nvoid OnlineWebsocketServer::Send(connection_hdl hdl, const std::string &text) {\n  websocketpp::lib::error_code ec;\n  if (!Contains(hdl)) {\n    return;\n  }\n\n  server_.send(hdl, text, websocketpp::frame::opcode::text, ec);\n  if (ec) {\n    server_.get_alog().write(websocketpp::log::alevel::app, ec.message());\n  }\n}\n\nvoid OnlineWebsocketServer::OnOpen(connection_hdl hdl) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  connections_.insert(hdl);\n\n  std::ostringstream os;\n  os << \"New connection: \"\n     << server_.get_con_from_hdl(hdl)->get_remote_endpoint() << \". 
\"\n     << \"Number of active connections: \" << connections_.size() << \".\\n\";\n  SHERPA_ONNX_LOG(INFO) << os.str();\n}\n\nvoid OnlineWebsocketServer::OnClose(connection_hdl hdl) {\n  std::lock_guard<std::mutex> lock(mutex_);\n  connections_.erase(hdl);\n\n  SHERPA_ONNX_LOG(INFO) << \"Number of active connections: \"\n                        << connections_.size() << \"\\n\";\n}\n\nbool OnlineWebsocketServer::Contains(connection_hdl hdl) const {\n  std::lock_guard<std::mutex> lock(mutex_);\n  return connections_.count(hdl);\n}\n\nvoid OnlineWebsocketServer::OnMessage(connection_hdl hdl,\n                                      server::message_ptr msg) {\n  auto c = decoder_.GetOrCreateConnection(hdl);\n\n  const std::string &payload = msg->get_payload();\n\n  switch (msg->get_opcode()) {\n    case websocketpp::frame::opcode::text:\n      if (payload == \"Done\") {\n        asio::post(io_work_, [this, c]() { decoder_.InputFinished(c); });\n      }\n      break;\n    case websocketpp::frame::opcode::binary: {\n      auto p = reinterpret_cast<const float *>(payload.data());\n      int32_t num_samples = payload.size() / sizeof(float);\n      std::vector<float> samples(p, p + num_samples);\n\n      {\n        std::lock_guard<std::mutex> lock(c->mutex);\n        c->samples.push_back(std::move(samples));\n      }\n\n      asio::post(io_work_, [this, c]() { decoder_.AcceptWaveform(c); });\n      break;\n    }\n    default:\n      break;\n  }\n}\n\nvoid OnlineWebsocketServer::Close(connection_hdl hdl,\n                                  websocketpp::close::status::value code,\n                                  const std::string &reason) {\n  auto con = server_.get_con_from_hdl(hdl);\n\n  std::ostringstream os;\n  os << \"Closing \" << con->get_remote_endpoint() << \" with reason: \" << reason\n     << \"\\n\";\n\n  websocketpp::lib::error_code ec;\n  server_.close(hdl, code, reason, ec);\n  if (ec) {\n    os << \"Failed to close\" << con->get_remote_endpoint() << \". 
\"\n       << ec.message() << \"\\n\";\n  }\n  server_.get_alog().write(websocketpp::log::alevel::app, os.str());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-websocket-server-impl.h",
    "content": "// sherpa-onnx/csrc/online-websocket-server-impl.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_ONLINE_WEBSOCKET_SERVER_IMPL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_WEBSOCKET_SERVER_IMPL_H_\n\n#include <deque>\n#include <fstream>\n#include <map>\n#include <memory>\n#include <mutex>\n#include <set>\n#include <string>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#include \"asio.hpp\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/tee-stream.h\"\n#include \"websocketpp/config/asio_no_tls.hpp\"  // TODO(fangjun): support TLS\n#include \"websocketpp/server.hpp\"\nusing server = websocketpp::server<websocketpp::config::asio>;\nusing connection_hdl = websocketpp::connection_hdl;\n\nnamespace sherpa_onnx {\n\nstruct Connection {\n  // handle to the connection. We can use it to send messages to the client\n  connection_hdl hdl;\n  std::shared_ptr<OnlineStream> s;\n\n  // set it to true when InputFinished() is called\n  bool eof = false;\n\n  // The last time we received a message from the client\n  // TODO(fangjun): Use it to disconnect from a client if it is inactive\n  // for a specified time.\n  std::chrono::steady_clock::time_point last_active;\n\n  std::mutex mutex;  // protect samples\n\n  // Audio samples received from the client.\n  //\n  // The I/O threads receive audio samples into this queue\n  // and invoke work threads to compute features\n  std::deque<std::vector<float>> samples;\n\n  Connection() = default;\n  Connection(connection_hdl hdl, std::shared_ptr<OnlineStream> s)\n      : hdl(hdl), s(s), last_active(std::chrono::steady_clock::now()) {}\n};\n\nstruct OnlineWebsocketDecoderConfig {\n  OnlineRecognizerConfig recognizer_config;\n\n  // It determines how often the decoder loop runs.\n  int32_t loop_interval_ms = 10;\n\n  int32_t max_batch_size = 
5;\n\n  float end_tail_padding = 0.8;\n\n  void Register(ParseOptions *po);\n  void Validate() const;\n};\n\nclass OnlineWebsocketServer;\n\nclass OnlineWebsocketDecoder {\n public:\n  /**\n   * @param server  Not owned.\n   */\n  explicit OnlineWebsocketDecoder(OnlineWebsocketServer *server);\n\n  std::shared_ptr<Connection> GetOrCreateConnection(connection_hdl hdl);\n\n  // Compute features for a stream given audio samples\n  void AcceptWaveform(std::shared_ptr<Connection> c);\n\n  // signal that there will be no more audio samples for a stream\n  void InputFinished(std::shared_ptr<Connection> c);\n\n  void Warmup() const;\n\n  void Run();\n\n private:\n  void ProcessConnections(const asio::error_code &ec);\n\n  /** It is called by one of the worker threads.\n   */\n  void Decode();\n\n private:\n  OnlineWebsocketServer *server_;  // not owned\n  std::unique_ptr<OnlineRecognizer> recognizer_;\n  OnlineWebsocketDecoderConfig config_;\n  asio::steady_timer timer_;\n\n  // It protects `connections_`, `ready_connections_`, and `active_`\n  std::mutex mutex_;\n\n  std::map<connection_hdl, std::shared_ptr<Connection>,\n           std::owner_less<connection_hdl>>\n      connections_;\n\n  // Whenever a connection has enough feature frames for decoding, we put\n  // it in this queue\n  std::deque<std::shared_ptr<Connection>> ready_connections_;\n\n  // If we are decoding a stream, we put it in the active_ set so that\n  // only one thread can decode a stream at a time.\n  std::set<connection_hdl, std::owner_less<connection_hdl>> active_;\n};\n\nstruct OnlineWebsocketServerConfig {\n  OnlineWebsocketDecoderConfig decoder_config;\n\n  std::string log_file = \"./log.txt\";\n\n  void Register(sherpa_onnx::ParseOptions *po);\n  void Validate() const;\n};\n\nclass OnlineWebsocketServer {\n public:\n  explicit OnlineWebsocketServer(asio::io_context &io_conn,  // NOLINT\n                                 asio::io_context &io_work,  // NOLINT\n                                 const 
OnlineWebsocketServerConfig &config);\n\n  void Run(uint16_t port);\n\n  const OnlineWebsocketServerConfig &GetConfig() const { return config_; }\n  asio::io_context &GetConnectionContext() { return io_conn_; }\n  asio::io_context &GetWorkContext() { return io_work_; }\n  server &GetServer() { return server_; }\n\n  void Send(connection_hdl hdl, const std::string &text);\n\n  bool Contains(connection_hdl hdl) const;\n\n private:\n  void SetupLog();\n\n  // When a websocket client is connected, it will invoke this method\n  // (Not for HTTP)\n  void OnOpen(connection_hdl hdl);\n\n  // When a websocket client is disconnected, it will invoke this method\n  void OnClose(connection_hdl hdl);\n\n  void OnMessage(connection_hdl hdl, server::message_ptr msg);\n\n  // Close a websocket connection with given code and reason\n  void Close(connection_hdl hdl, websocketpp::close::status::value code,\n             const std::string &reason);\n\n private:\n  OnlineWebsocketServerConfig config_;\n  asio::io_context &io_conn_;\n  asio::io_context &io_work_;\n  server server_;\n\n  std::ofstream log_;\n  sherpa_onnx::TeeStream tee_;\n\n  OnlineWebsocketDecoder decoder_;\n\n  mutable std::mutex mutex_;\n\n  std::set<connection_hdl, std::owner_less<connection_hdl>> connections_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_WEBSOCKET_SERVER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-websocket-server.cc",
    "content": "// sherpa-onnx/csrc/online-websocket-server.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <vector>\n\n#include \"asio.hpp\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-websocket-server-impl.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nstatic constexpr const char *kUsageMessage = R\"(\nAutomatic speech recognition with sherpa-onnx using websocket.\n\nUsage:\n\n./bin/sherpa-onnx-online-websocket-server --help\n\n./bin/sherpa-onnx-online-websocket-server \\\n  --port=6006 \\\n  --num-work-threads=5 \\\n  --tokens=/path/to/tokens.txt \\\n  --encoder=/path/to/encoder.onnx \\\n  --decoder=/path/to/decoder.onnx \\\n  --joiner=/path/to/joiner.onnx \\\n  --log-file=./log.txt \\\n  --max-batch-size=5 \\\n  --loop-interval-ms=10\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)\";\n\nint32_t main(int32_t argc, char *argv[]) {\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n\n  sherpa_onnx::OnlineWebsocketServerConfig config;\n\n  // the server will listen on this port\n  int32_t port = 6006;\n\n  // size of the thread pool for handling network connections\n  int32_t num_io_threads = 1;\n\n  // size of the thread pool for neural network computation and decoding\n  int32_t num_work_threads = 3;\n\n  po.Register(\"num-io-threads\", &num_io_threads,\n              \"Thread pool size for network connections.\");\n\n  po.Register(\"num-work-threads\", &num_work_threads,\n              \"Thread pool size for for neural network \"\n              \"computation and decoding.\");\n\n  po.Register(\"port\", &port, \"The port on which the server will listen.\");\n\n  config.Register(&po);\n\n  if (argc == 1) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  po.Read(argc, argv);\n\n  if (po.NumArgs() != 0) {\n    SHERPA_ONNX_LOGE(\"Unrecognized positional arguments!\");\n    po.PrintUsage();\n    
exit(EXIT_FAILURE);\n  }\n\n  config.Validate();\n\n  asio::io_context io_conn;  // for network connections\n  asio::io_context io_work;  // for neural network and decoding\n\n  sherpa_onnx::OnlineWebsocketServer server(io_conn, io_work, config);\n  server.Run(port);\n\n  SHERPA_ONNX_LOGE(\"Started!\");\n  SHERPA_ONNX_LOGE(\"Listening on: %d\", port);\n  SHERPA_ONNX_LOGE(\"Number of work threads: %d\", num_work_threads);\n\n  // keep io_work.run() from returning while there is no pending work\n  auto work_guard = asio::make_work_guard(io_work);\n\n  std::vector<std::thread> io_threads;\n\n  // decrement since the main thread is also used for network communications\n  for (int32_t i = 0; i < num_io_threads - 1; ++i) {\n    io_threads.emplace_back([&io_conn]() { io_conn.run(); });\n  }\n\n  std::vector<std::thread> work_threads;\n  for (int32_t i = 0; i < num_work_threads; ++i) {\n    work_threads.emplace_back([&io_work]() { io_work.run(); });\n  }\n\n  io_conn.run();\n\n  for (auto &t : io_threads) {\n    t.join();\n  }\n\n  for (auto &t : work_threads) {\n    t.join();\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-wenet-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-wenet-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-wenet-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineWenetCtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"wenet-ctc-model\", &model,\n               \"Path to CTC model.onnx from WeNet. Please see \"\n               \"https://github.com/k2-fsa/sherpa-onnx/pull/425\");\n  po->Register(\"wenet-ctc-chunk-size\", &chunk_size,\n               \"Chunk size after subsampling used for decoding.\");\n  po->Register(\"wenet-ctc-num-left-chunks\", &num_left_chunks,\n               \"Number of left chunks after subsampling used for decoding.\");\n}\n\nbool OnlineWenetCtcModelConfig::Validate() const {\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"WeNet CTC model '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (chunk_size <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please specify a positive value for --wenet-ctc-chunk-size. Currently \"\n        \"given: %d\",\n        chunk_size);\n    return false;\n  }\n\n  if (num_left_chunks <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please specify a positive value for --wenet-ctc-num-left-chunks. \"\n        \"Currently given: %d. Note that if you want to use -1, please consider \"\n        \"using a non-streaming model.\",\n        num_left_chunks);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineWenetCtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineWenetCtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"chunk_size=\" << chunk_size << \", \";\n  os << \"num_left_chunks=\" << num_left_chunks << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-wenet-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/online-wenet-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineWenetCtcModelConfig {\n  std::string model;\n\n  // --chunk_size from wenet\n  int32_t chunk_size = 16;\n\n  // --num_left_chunks from wenet\n  int32_t num_left_chunks = 4;\n\n  OnlineWenetCtcModelConfig() = default;\n\n  OnlineWenetCtcModelConfig(const std::string &model, int32_t chunk_size,\n                            int32_t num_left_chunks)\n      : model(model),\n        chunk_size(chunk_size),\n        num_left_chunks(num_left_chunks) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-wenet-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/online-wenet-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-wenet-ctc-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineWenetCtcModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.wenet_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.wenet_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value x,\n                                  std::vector<Ort::Value> states) {\n    Ort::Value &attn_cache = states[0];\n    Ort::Value &conv_cache = states[1];\n    Ort::Value &offset = states[2];\n\n    int32_t chunk_size = config_.wenet_ctc.chunk_size;\n    int32_t left_chunks = config_.wenet_ctc.num_left_chunks;\n    // build attn_mask\n    std::array<int64_t, 3> attn_mask_shape{1, 1,\n                                           required_cache_size_ + chunk_size};\n    Ort::Value attn_mask = Ort::Value::CreateTensor<bool>(\n        allocator_, 
attn_mask_shape.data(), attn_mask_shape.size());\n    bool *p = attn_mask.GetTensorMutableData<bool>();\n    int32_t chunk_idx =\n        offset.GetTensorData<int64_t>()[0] / chunk_size - left_chunks;\n    if (chunk_idx < left_chunks) {\n      std::fill(p, p + required_cache_size_ - chunk_idx * chunk_size, 0);\n      std::fill(p + required_cache_size_ - chunk_idx * chunk_size,\n                p + attn_mask_shape[2], 1);\n    } else {\n      std::fill(p, p + attn_mask_shape[2], 1);\n    }\n\n    std::array<Ort::Value, 6> inputs = {std::move(x),\n                                        View(&offset),\n                                        View(&required_cache_size_tensor_),\n                                        std::move(attn_cache),\n                                        std::move(conv_cache),\n                                        std::move(attn_mask)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    offset.GetTensorMutableData<int64_t>()[0] +=\n        out[0].GetTensorTypeAndShapeInfo().GetShape()[1];\n    out.push_back(std::move(offset));\n\n    return out;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t ChunkLength() const {\n    // When chunk_size is 16, subsampling_factor_ is 4, right_context_ is 6,\n    // the returned value is (16 - 1)*4 + 6 + 1 = 67\n    return (config_.wenet_ctc.chunk_size - 1) * subsampling_factor_ +\n           right_context_ + 1;\n  }\n\n  int32_t ChunkShift() const {\n    return config_.wenet_ctc.chunk_size * subsampling_factor_;\n  }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  // Return a vector containing 3 tensors\n  // - attn_cache\n  // - conv_cache\n  // - offset\n  std::vector<Ort::Value> GetInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(3);\n    ans.push_back(View(&attn_cache_));\n    ans.push_back(View(&conv_cache_));\n\n    int64_t 
offset_shape = 1;\n\n    Ort::Value offset =\n        Ort::Value::CreateTensor<int64_t>(allocator_, &offset_shape, 1);\n\n    offset.GetTensorMutableData<int64_t>()[0] = required_cache_size_;\n\n    ans.push_back(std::move(offset));\n\n    return ans;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(head_, \"head\");\n    SHERPA_ONNX_READ_META_DATA(num_blocks_, \"num_blocks\");\n    SHERPA_ONNX_READ_META_DATA(output_size_, \"output_size\");\n    SHERPA_ONNX_READ_META_DATA(cnn_module_kernel_, \"cnn_module_kernel\");\n    SHERPA_ONNX_READ_META_DATA(right_context_, \"right_context\");\n    SHERPA_ONNX_READ_META_DATA(subsampling_factor_, \"subsampling_factor\");\n    SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n\n    required_cache_size_ =\n        config_.wenet_ctc.chunk_size * config_.wenet_ctc.num_left_chunks;\n\n    InitStates();\n  }\n\n  void InitStates() {\n    std::array<int64_t, 4> attn_cache_shape{\n        num_blocks_, head_, required_cache_size_, output_size_ / head_ * 2};\n    attn_cache_ = Ort::Value::CreateTensor<float>(\n        allocator_, attn_cache_shape.data(), attn_cache_shape.size());\n\n    Fill<float>(&attn_cache_, 0);\n\n    std::array<int64_t, 4> conv_cache_shape{num_blocks_, 1, output_size_,\n   
                                         cnn_module_kernel_ - 1};\n    conv_cache_ = Ort::Value::CreateTensor<float>(\n        allocator_, conv_cache_shape.data(), conv_cache_shape.size());\n\n    Fill<float>(&conv_cache_, 0);\n\n    int64_t shape = 1;\n    required_cache_size_tensor_ =\n        Ort::Value::CreateTensor<int64_t>(allocator_, &shape, 1);\n\n    required_cache_size_tensor_.GetTensorMutableData<int64_t>()[0] =\n        required_cache_size_;\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  int32_t head_ = 0;\n  int32_t num_blocks_ = 0;\n  int32_t output_size_ = 0;\n  int32_t cnn_module_kernel_ = 0;\n  int32_t right_context_ = 0;\n  int32_t subsampling_factor_ = 0;\n  int32_t vocab_size_ = 0;\n\n  int32_t required_cache_size_ = 0;\n\n  Ort::Value attn_cache_{nullptr};\n  Ort::Value conv_cache_{nullptr};\n  Ort::Value required_cache_size_tensor_{nullptr};\n};\n\nOnlineWenetCtcModel::OnlineWenetCtcModel(const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineWenetCtcModel::OnlineWenetCtcModel(Manager *mgr,\n                                         const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineWenetCtcModel::~OnlineWenetCtcModel() = default;\n\nstd::vector<Ort::Value> OnlineWenetCtcModel::Forward(\n    Ort::Value x, std::vector<Ort::Value> states) const {\n  return impl_->Forward(std::move(x), std::move(states));\n}\n\nint32_t OnlineWenetCtcModel::VocabSize() const { return impl_->VocabSize(); }\n\nint32_t OnlineWenetCtcModel::ChunkLength() const {\n  return impl_->ChunkLength();\n}\n\nint32_t 
OnlineWenetCtcModel::ChunkShift() const { return impl_->ChunkShift(); }\n\nOrtAllocator *OnlineWenetCtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::vector<Ort::Value> OnlineWenetCtcModel::GetInitStates() const {\n  return impl_->GetInitStates();\n}\n\nstd::vector<Ort::Value> OnlineWenetCtcModel::StackStates(\n    std::vector<std::vector<Ort::Value>> states) const {\n  if (states.size() != 1) {\n    SHERPA_ONNX_LOGE(\"wenet CTC model supports only batch_size==1. Given: %d\",\n                     static_cast<int32_t>(states.size()));\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  return std::move(states[0]);\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineWenetCtcModel::UnStackStates(\n    std::vector<Ort::Value> states) const {\n  std::vector<std::vector<Ort::Value>> ans(1);\n  ans[0] = std::move(states);\n  return ans;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineWenetCtcModel::OnlineWenetCtcModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineWenetCtcModel::OnlineWenetCtcModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-wenet-ctc-model.h",
    "content": "// sherpa-onnx/csrc/online-wenet-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineWenetCtcModel : public OnlineCtcModel {\n public:\n  explicit OnlineWenetCtcModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineWenetCtcModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineWenetCtcModel() override;\n\n  // A list of 3 tensors:\n  //  - attn_cache\n  //  - conv_cache\n  //  - offset\n  std::vector<Ort::Value> GetInitStates() const override;\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const override;\n\n  /**\n   *\n   * @param x A 3-D tensor of shape (N, T, C). 
N has to be 1.\n   * @param states  It is from GetInitStates() or returned from this method.\n   *\n   * @return Return a list of tensors\n   *    - ans[0] contains log_probs, of shape (N, T, C)\n   *    - ans[1:] contains next_states\n   */\n  std::vector<Ort::Value> Forward(\n      Ort::Value x, std::vector<Ort::Value> states) const override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // The model accepts this number of frames before subsampling as input\n  int32_t ChunkLength() const override;\n\n  // Similar to frame_shift in feature extractor, after processing\n  // ChunkLength() frames, we advance by ChunkShift() frames\n  // before we process the next chunk.\n  int32_t ChunkShift() const override;\n\n  bool SupportBatchProcessing() const override { return false; }\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_WENET_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-zipformer-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-zipformer-transducer-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nOnlineZipformerTransducerModel::OnlineZipformerTransducerModel(\n    const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\ntemplate <typename Manager>\nOnlineZipformerTransducerModel::OnlineZipformerTransducerModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      sess_opts_(GetSessionOptions(config)),\n      allocator_{} {\n  {\n    auto buf = ReadFile(mgr, config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  
{\n    auto buf = ReadFile(mgr, config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\nvoid OnlineZipformerTransducerModel::InitEncoder(void *model_data,\n                                                 size_t model_data_length) {\n  encoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                &encoder_input_names_ptr_);\n\n  GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                 &encoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---encoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA_VEC(encoder_dims_, \"encoder_dims\");\n  SHERPA_ONNX_READ_META_DATA_VEC(attention_dims_, \"attention_dims\");\n  SHERPA_ONNX_READ_META_DATA_VEC(num_encoder_layers_, \"num_encoder_layers\");\n  SHERPA_ONNX_READ_META_DATA_VEC(cnn_module_kernels_, \"cnn_module_kernels\");\n  SHERPA_ONNX_READ_META_DATA_VEC(left_context_len_, \"left_context_len\");\n\n  SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n  SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n\n  if (config_.debug) {\n    auto print = [](const std::vector<int32_t> &v, const char *name) {\n      std::ostringstream os;\n      os << name << \": \";\n      for (auto i : v) {\n        os << i << \" \";\n      }\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    };\n    print(encoder_dims_, \"encoder_dims\");\n    print(attention_dims_, 
\"attention_dims\");\n    print(num_encoder_layers_, \"num_encoder_layers\");\n    print(cnn_module_kernels_, \"cnn_module_kernels\");\n    print(left_context_len_, \"left_context_len\");\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"T: %{public}d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %{public}d\", decode_chunk_len_);\n#else\n    SHERPA_ONNX_LOGE(\"T: %d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n#endif\n  }\n}\n\nvoid OnlineZipformerTransducerModel::InitDecoder(void *model_data,\n                                                 size_t model_data_length) {\n  decoder_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                 model_data_length, sess_opts_);\n\n  GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                &decoder_input_names_ptr_);\n\n  GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                 &decoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---decoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n}\n\nvoid OnlineZipformerTransducerModel::InitJoiner(void *model_data,\n                                                size_t model_data_length) {\n  joiner_sess_ = std::make_unique<Ort::Session>(env_, model_data,\n                                                model_data_length, sess_opts_);\n\n  GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                &joiner_input_names_ptr_);\n\n  GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n    
             &joiner_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---joiner---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n}\n\nstd::vector<Ort::Value> OnlineZipformerTransducerModel::StackStates(\n    const std::vector<std::vector<Ort::Value>> &states) const {\n  int32_t batch_size = static_cast<int32_t>(states.size());\n  int32_t num_encoders = static_cast<int32_t>(num_encoder_layers_.size());\n\n  std::vector<const Ort::Value *> buf(batch_size);\n\n  std::vector<Ort::Value> ans;\n  ans.reserve(states[0].size());\n\n  auto allocator =\n      const_cast<OnlineZipformerTransducerModel *>(this)->allocator_;\n\n  // cached_len\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][i];\n    }\n    auto v = Cat<int64_t>(allocator, buf, 1);  // (num_layers, 1)\n    ans.push_back(std::move(v));\n  }\n\n  // cached_avg\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders + i];\n    }\n    auto v = Cat(allocator, buf, 1);  // (num_layers, 1, encoder_dims)\n    ans.push_back(std::move(v));\n  }\n\n  // cached_key\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders * 2 + i];\n    }\n    // (num_layers, left_context_len, 1, attention_dims)\n    auto v = Cat(allocator, buf, 2);\n    ans.push_back(std::move(v));\n  }\n\n  // cached_val\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders * 3 + i];\n    }\n    // (num_layers, left_context_len, 1, attention_dims/2)\n    auto v = 
Cat(allocator, buf, 2);\n    ans.push_back(std::move(v));\n  }\n\n  // cached_val2\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders * 4 + i];\n    }\n    // (num_layers, left_context_len, 1, attention_dims/2)\n    auto v = Cat(allocator, buf, 2);\n    ans.push_back(std::move(v));\n  }\n\n  // cached_conv1\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders * 5 + i];\n    }\n    // (num_layers, 1, encoder_dims, cnn_module_kernels-1)\n    auto v = Cat(allocator, buf, 1);\n    ans.push_back(std::move(v));\n  }\n\n  // cached_conv2\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_encoders * 6 + i];\n    }\n    // (num_layers, 1, encoder_dims, cnn_module_kernels-1)\n    auto v = Cat(allocator, buf, 1);\n    ans.push_back(std::move(v));\n  }\n\n  return ans;\n}\n\nstd::vector<std::vector<Ort::Value>>\nOnlineZipformerTransducerModel::UnStackStates(\n    const std::vector<Ort::Value> &states) const {\n  assert(states.size() == num_encoder_layers_.size() * 7);\n\n  int32_t batch_size = states[0].GetTensorTypeAndShapeInfo().GetShape()[1];\n  int32_t num_encoders = num_encoder_layers_.size();\n\n  auto allocator =\n      const_cast<OnlineZipformerTransducerModel *>(this)->allocator_;\n\n  std::vector<std::vector<Ort::Value>> ans;\n  ans.resize(batch_size);\n\n  // cached_len\n  for (int32_t i = 0; i != num_encoders; ++i) {\n    auto v = Unbind<int64_t>(allocator, &states[i], 1);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_avg\n  for (int32_t i = num_encoders; i != 2 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 1);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; 
++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_key\n  for (int32_t i = 2 * num_encoders; i != 3 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 2);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_val\n  for (int32_t i = 3 * num_encoders; i != 4 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 2);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_val2\n  for (int32_t i = 4 * num_encoders; i != 5 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 2);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_conv1\n  for (int32_t i = 5 * num_encoders; i != 6 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 1);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  // cached_conv2\n  for (int32_t i = 6 * num_encoders; i != 7 * num_encoders; ++i) {\n    auto v = Unbind(allocator, &states[i], 1);\n    assert(v.size() == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value> OnlineZipformerTransducerModel::GetEncoderInitStates() {\n  // Please see\n  // https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/zipformer.py#L673\n  // for details\n\n  int32_t n = static_cast<int32_t>(encoder_dims_.size());\n  std::vector<Ort::Value> cached_len_vec;\n  std::vector<Ort::Value> cached_avg_vec;\n  std::vector<Ort::Value> cached_key_vec;\n  std::vector<Ort::Value> cached_val_vec;\n  std::vector<Ort::Value> cached_val2_vec;\n  
std::vector<Ort::Value> cached_conv1_vec;\n  std::vector<Ort::Value> cached_conv2_vec;\n\n  cached_len_vec.reserve(n);\n  cached_avg_vec.reserve(n);\n  cached_key_vec.reserve(n);\n  cached_val_vec.reserve(n);\n  cached_val2_vec.reserve(n);\n  cached_conv1_vec.reserve(n);\n  cached_conv2_vec.reserve(n);\n\n  for (int32_t i = 0; i != n; ++i) {\n    {\n      std::array<int64_t, 2> s{num_encoder_layers_[i], 1};\n      auto v =\n          Ort::Value::CreateTensor<int64_t>(allocator_, s.data(), s.size());\n      Fill<int64_t>(&v, 0);\n      cached_len_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 3> s{num_encoder_layers_[i], 1, encoder_dims_[i]};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      cached_avg_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 4> s{num_encoder_layers_[i], left_context_len_[i], 1,\n                               attention_dims_[i]};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      cached_key_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 4> s{num_encoder_layers_[i], left_context_len_[i], 1,\n                               attention_dims_[i] / 2};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      cached_val_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 4> s{num_encoder_layers_[i], left_context_len_[i], 1,\n                               attention_dims_[i] / 2};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      cached_val2_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 4> s{num_encoder_layers_[i], 1, encoder_dims_[i],\n                               cnn_module_kernels_[i] - 1};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      
cached_conv1_vec.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 4> s{num_encoder_layers_[i], 1, encoder_dims_[i],\n                               cnn_module_kernels_[i] - 1};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      cached_conv2_vec.push_back(std::move(v));\n    }\n  }\n\n  std::vector<Ort::Value> ans;\n  ans.reserve(n * 7);\n\n  for (auto &v : cached_len_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_avg_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_key_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_val_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_val2_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_conv1_vec) ans.push_back(std::move(v));\n  for (auto &v : cached_conv2_vec) ans.push_back(std::move(v));\n\n  return ans;\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineZipformerTransducerModel::RunEncoder(Ort::Value features,\n                                           std::vector<Ort::Value> states,\n                                           Ort::Value /* processed_frames */) {\n  std::vector<Ort::Value> encoder_inputs;\n  encoder_inputs.reserve(1 + states.size());\n\n  encoder_inputs.push_back(std::move(features));\n  for (auto &v : states) {\n    encoder_inputs.push_back(std::move(v));\n  }\n\n  auto encoder_out = encoder_sess_->Run(\n      {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n      encoder_inputs.size(), encoder_output_names_ptr_.data(),\n      encoder_output_names_ptr_.size());\n\n  std::vector<Ort::Value> next_states;\n  next_states.reserve(states.size());\n\n  for (int32_t i = 1; i != static_cast<int32_t>(encoder_out.size()); ++i) {\n    next_states.push_back(std::move(encoder_out[i]));\n  }\n\n  return {std::move(encoder_out[0]), std::move(next_states)};\n}\n\nOrt::Value OnlineZipformerTransducerModel::RunDecoder(\n    Ort::Value decoder_input) {\n  auto decoder_out = 
decoder_sess_->Run(\n      {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n      decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n  return std::move(decoder_out[0]);\n}\n\nOrt::Value OnlineZipformerTransducerModel::RunJoiner(Ort::Value encoder_out,\n                                                     Ort::Value decoder_out) {\n  std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                            std::move(decoder_out)};\n  auto logit =\n      joiner_sess_->Run({}, joiner_input_names_ptr_.data(), joiner_input.data(),\n                        joiner_input.size(), joiner_output_names_ptr_.data(),\n                        joiner_output_names_ptr_.size());\n\n  return std::move(logit[0]);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineZipformerTransducerModel::OnlineZipformerTransducerModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineZipformerTransducerModel::OnlineZipformerTransducerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-zipformer-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformerTransducerModel : public OnlineTransducerModel {\n public:\n  explicit OnlineZipformerTransducerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineZipformerTransducerModel(Manager *mgr, const OnlineModelConfig &config);\n\n  std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const override;\n\n  std::vector<Ort::Value> GetEncoderInitStates() override;\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) override;\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) override;\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) override;\n\n  int32_t ContextSize() const override { return context_size_; }\n\n  int32_t ChunkSize() const override { return T_; }\n\n  int32_t ChunkShift() const override { return decode_chunk_len_; }\n\n  int32_t VocabSize() const override { return vocab_size_; }\n  OrtAllocator *Allocator() override { return allocator_; }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length);\n  void InitDecoder(void *model_data, size_t model_data_length);\n  void InitJoiner(void *model_data, size_t model_data_length);\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  
Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  OnlineModelConfig config_;\n\n  std::vector<int32_t> encoder_dims_;\n  std::vector<int32_t> attention_dims_;\n  std::vector<int32_t> num_encoder_layers_;\n  std::vector<int32_t> cnn_module_kernels_;\n  std::vector<int32_t> left_context_len_;\n\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n\n  int32_t context_size_ = 0;\n  int32_t vocab_size_ = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-ctc-model-config.cc",
    "content": "// sherpa-onnx/csrc/online-zipformer2-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-zipformer2-ctc-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid OnlineZipformer2CtcModelConfig::Register(ParseOptions *po) {\n  po->Register(\"zipformer2-ctc-model\", &model,\n               \"Path to CTC model.onnx. See also \"\n               \"https://github.com/k2-fsa/icefall/pull/1413\");\n}\n\nbool OnlineZipformer2CtcModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"--zipformer2-ctc-model is empty!\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"--zipformer2-ctc-model '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string OnlineZipformer2CtcModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"OnlineZipformer2CtcModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-ctc-model-config.h",
    "content": "// sherpa-onnx/csrc/online-zipformer2-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineZipformer2CtcModelConfig {\n  std::string model;\n\n  OnlineZipformer2CtcModelConfig() = default;\n\n  explicit OnlineZipformer2CtcModelConfig(const std::string &model)\n      : model(model) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-ctc-model.cc",
    "content": "// sherpa-onnx/csrc/online-zipformer2-ctc-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-zipformer2-ctc-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <memory>\n#include <numeric>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformer2CtcModel::Impl {\n public:\n  explicit Impl(const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.zipformer2_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.zipformer2_ctc.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  std::vector<Ort::Value> Forward(Ort::Value features,\n                                  std::vector<Ort::Value> states) {\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(1 + states.size());\n\n    inputs.push_back(std::move(features));\n    for (auto &v : states) {\n      inputs.push_back(std::move(v));\n    }\n\n    return sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                      output_names_ptr_.data(), 
output_names_ptr_.size());\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  int32_t ChunkLength() const { return T_; }\n\n  int32_t ChunkShift() const { return decode_chunk_len_; }\n\n  bool UseWhisperFeature() const { return use_whisper_feature_; }\n\n  OrtAllocator *Allocator() { return allocator_; }\n\n  // Return a vector containing 3 tensors\n  // - attn_cache\n  // - conv_cache\n  // - offset\n  std::vector<Ort::Value> GetInitStates() {\n    std::vector<Ort::Value> ans;\n    ans.reserve(initial_states_.size());\n    for (auto &s : initial_states_) {\n      ans.push_back(View(&s));\n    }\n    return ans;\n  }\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) {\n    int32_t batch_size = static_cast<int32_t>(states.size());\n\n    std::vector<const Ort::Value *> buf(batch_size);\n\n    std::vector<Ort::Value> ans;\n    int32_t num_states = static_cast<int32_t>(states[0].size());\n    ans.reserve(num_states);\n\n    for (int32_t i = 0; i != (num_states - 2) / 6; ++i) {\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i];\n        }\n        auto v = Cat(allocator_, buf, 1);\n        ans.push_back(std::move(v));\n      }\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i + 1];\n        }\n        auto v = Cat(allocator_, buf, 1);\n        ans.push_back(std::move(v));\n      }\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i + 2];\n        }\n        auto v = Cat(allocator_, buf, 1);\n        ans.push_back(std::move(v));\n      }\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i + 3];\n        }\n        auto v = Cat(allocator_, buf, 1);\n        ans.push_back(std::move(v));\n      }\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i + 4];\n        }\n        auto v 
= Cat(allocator_, buf, 0);\n        ans.push_back(std::move(v));\n      }\n      {\n        for (int32_t n = 0; n != batch_size; ++n) {\n          buf[n] = &states[n][6 * i + 5];\n        }\n        auto v = Cat(allocator_, buf, 0);\n        ans.push_back(std::move(v));\n      }\n    }\n\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][num_states - 2];\n      }\n      auto v = Cat(allocator_, buf, 0);\n      ans.push_back(std::move(v));\n    }\n\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][num_states - 1];\n      }\n      auto v = Cat<int64_t>(allocator_, buf, 0);\n      ans.push_back(std::move(v));\n    }\n    return ans;\n  }\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) {\n    int32_t m = std::accumulate(num_encoder_layers_.begin(),\n                                num_encoder_layers_.end(), 0);\n    assert(states.size() == m * 6 + 2);\n\n    int32_t batch_size = states[0].GetTensorTypeAndShapeInfo().GetShape()[1];\n\n    std::vector<std::vector<Ort::Value>> ans;\n    ans.resize(batch_size);\n\n    for (int32_t i = 0; i != m; ++i) {\n      {\n        auto v = Unbind(allocator_, &states[i * 6], 1);\n        assert(v.size() == batch_size);\n\n        for (int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n      {\n        auto v = Unbind(allocator_, &states[i * 6 + 1], 1);\n        assert(v.size() == batch_size);\n\n        for (int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n      {\n        auto v = Unbind(allocator_, &states[i * 6 + 2], 1);\n        assert(v.size() == batch_size);\n\n        for (int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n      {\n        auto v = Unbind(allocator_, &states[i * 6 + 3], 1);\n        assert(v.size() == batch_size);\n\n        for 
(int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n      {\n        auto v = Unbind(allocator_, &states[i * 6 + 4], 0);\n        assert(v.size() == batch_size);\n\n        for (int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n      {\n        auto v = Unbind(allocator_, &states[i * 6 + 5], 0);\n        assert(v.size() == batch_size);\n\n        for (int32_t n = 0; n != batch_size; ++n) {\n          ans[n].push_back(std::move(v[n]));\n        }\n      }\n    }\n\n    {\n      auto v = Unbind(allocator_, &states[m * 6], 0);\n      assert(v.size() == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind<int64_t>(allocator_, &states[m * 6 + 1], 0);\n      assert(v.size() == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n\n    return ans;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---zipformer2_ctc---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA_VEC(encoder_dims_, \"encoder_dims\");\n    SHERPA_ONNX_READ_META_DATA_VEC(query_head_dims_, \"query_head_dims\");\n    
SHERPA_ONNX_READ_META_DATA_VEC(value_head_dims_, \"value_head_dims\");\n    SHERPA_ONNX_READ_META_DATA_VEC(num_heads_, \"num_heads\");\n    SHERPA_ONNX_READ_META_DATA_VEC(num_encoder_layers_, \"num_encoder_layers\");\n    SHERPA_ONNX_READ_META_DATA_VEC(cnn_module_kernels_, \"cnn_module_kernels\");\n    SHERPA_ONNX_READ_META_DATA_VEC(left_context_len_, \"left_context_len\");\n\n    SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n    SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n\n    std::string feature_type;\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(feature_type, \"feature\", \"\");\n    if (feature_type == \"whisper\") {\n      use_whisper_feature_ = true;\n    }\n\n    {\n      auto shape =\n          sess_->GetOutputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape();\n      vocab_size_ = shape[2];\n    }\n\n    if (config_.debug) {\n      auto print = [](const std::vector<int32_t> &v, const char *name) {\n        std::ostringstream os;\n        os << name << \": \";\n        for (auto i : v) {\n          os << i << \" \";\n        }\n        SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n      };\n      print(encoder_dims_, \"encoder_dims\");\n      print(query_head_dims_, \"query_head_dims\");\n      print(value_head_dims_, \"value_head_dims\");\n      print(num_heads_, \"num_heads\");\n      print(num_encoder_layers_, \"num_encoder_layers\");\n      print(cnn_module_kernels_, \"cnn_module_kernels\");\n      print(left_context_len_, \"left_context_len\");\n      SHERPA_ONNX_LOGE(\"T: %d\", T_);\n      SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n      SHERPA_ONNX_LOGE(\"vocab_size_: %d\", vocab_size_);\n    }\n\n    InitStates();\n  }\n\n  void InitStates() {\n    int32_t n = static_cast<int32_t>(encoder_dims_.size());\n    int32_t m = std::accumulate(num_encoder_layers_.begin(),\n                                num_encoder_layers_.end(), 0);\n    initial_states_.reserve(m * 6 + 2);\n\n    for (int32_t i = 0; i != n; 
++i) {\n      int32_t num_layers = num_encoder_layers_[i];\n      int32_t key_dim = query_head_dims_[i] * num_heads_[i];\n      int32_t value_dim = value_head_dims_[i] * num_heads_[i];\n      int32_t nonlin_attn_head_dim = 3 * encoder_dims_[i] / 4;\n\n      for (int32_t j = 0; j != num_layers; ++j) {\n        {\n          std::array<int64_t, 3> s{left_context_len_[i], 1, key_dim};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n          Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n\n        {\n          std::array<int64_t, 4> s{1, 1, left_context_len_[i],\n                                   nonlin_attn_head_dim};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n          Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n\n        {\n          std::array<int64_t, 3> s{left_context_len_[i], 1, value_dim};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n          Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n\n        {\n          std::array<int64_t, 3> s{left_context_len_[i], 1, value_dim};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n          Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n\n        {\n          std::array<int64_t, 3> s{1, encoder_dims_[i],\n                                   cnn_module_kernels_[i] / 2};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n          Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n\n        {\n          std::array<int64_t, 3> s{1, encoder_dims_[i],\n                                   cnn_module_kernels_[i] / 2};\n          auto v =\n              Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n  
        Fill(&v, 0);\n          initial_states_.push_back(std::move(v));\n        }\n      }\n    }\n\n    {\n      std::array<int64_t, 4> s{1, 128, 3, 19};\n      auto v = Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n      Fill(&v, 0);\n      initial_states_.push_back(std::move(v));\n    }\n\n    {\n      std::array<int64_t, 1> s{1};\n      auto v =\n          Ort::Value::CreateTensor<int64_t>(allocator_, s.data(), s.size());\n      Fill<int64_t>(&v, 0);\n      initial_states_.push_back(std::move(v));\n    }\n  }\n\n private:\n  OnlineModelConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  std::vector<Ort::Value> initial_states_;\n\n  std::vector<int32_t> encoder_dims_;\n  std::vector<int32_t> query_head_dims_;\n  std::vector<int32_t> value_head_dims_;\n  std::vector<int32_t> num_heads_;\n  std::vector<int32_t> num_encoder_layers_;\n  std::vector<int32_t> cnn_module_kernels_;\n  std::vector<int32_t> left_context_len_;\n\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n  int32_t vocab_size_ = 0;\n\n  // for models from\n  // https://github.com/k2-fsa/icefall/blob/master/egs/multi_zh-hans/ASR/RESULTS.md#streaming-with-ctc-head\n  bool use_whisper_feature_ = false;\n};\n\nOnlineZipformer2CtcModel::OnlineZipformer2CtcModel(\n    const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineZipformer2CtcModel::OnlineZipformer2CtcModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOnlineZipformer2CtcModel::~OnlineZipformer2CtcModel() = default;\n\nstd::vector<Ort::Value> OnlineZipformer2CtcModel::Forward(\n    Ort::Value x, 
std::vector<Ort::Value> states) const {\n  return impl_->Forward(std::move(x), std::move(states));\n}\n\nint32_t OnlineZipformer2CtcModel::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nint32_t OnlineZipformer2CtcModel::ChunkLength() const {\n  return impl_->ChunkLength();\n}\n\nint32_t OnlineZipformer2CtcModel::ChunkShift() const {\n  return impl_->ChunkShift();\n}\n\nbool OnlineZipformer2CtcModel::UseWhisperFeature() const {\n  return impl_->UseWhisperFeature();\n}\n\nOrtAllocator *OnlineZipformer2CtcModel::Allocator() const {\n  return impl_->Allocator();\n}\n\nstd::vector<Ort::Value> OnlineZipformer2CtcModel::GetInitStates() const {\n  return impl_->GetInitStates();\n}\n\nstd::vector<Ort::Value> OnlineZipformer2CtcModel::StackStates(\n    std::vector<std::vector<Ort::Value>> states) const {\n  return impl_->StackStates(std::move(states));\n}\n\nstd::vector<std::vector<Ort::Value>> OnlineZipformer2CtcModel::UnStackStates(\n    std::vector<Ort::Value> states) const {\n  return impl_->UnStackStates(std::move(states));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineZipformer2CtcModel::OnlineZipformer2CtcModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineZipformer2CtcModel::OnlineZipformer2CtcModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-ctc-model.h",
    "content": "// sherpa-onnx/csrc/online-zipformer2-ctc-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-ctc-model.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformer2CtcModel : public OnlineCtcModel {\n public:\n  explicit OnlineZipformer2CtcModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineZipformer2CtcModel(Manager *mgr, const OnlineModelConfig &config);\n\n  ~OnlineZipformer2CtcModel() override;\n\n  // A list of tensors.\n  // See also\n  // https://github.com/k2-fsa/icefall/pull/1413\n  // and\n  // https://github.com/k2-fsa/icefall/pull/1415\n  std::vector<Ort::Value> GetInitStates() const override;\n\n  std::vector<Ort::Value> StackStates(\n      std::vector<std::vector<Ort::Value>> states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      std::vector<Ort::Value> states) const override;\n\n  /**\n   *\n   * @param x A 3-D tensor of shape (N, T, C). 
N has to be 1.\n   * @param states  It is from GetInitStates() or returned from this method.\n   *\n   * @return Return a list of tensors\n   *    - ans[0] contains log_probs, of shape (N, T, C)\n   *    - ans[1:] contains next_states\n   */\n  std::vector<Ort::Value> Forward(\n      Ort::Value x, std::vector<Ort::Value> states) const override;\n\n  /** Return the vocabulary size of the model\n   */\n  int32_t VocabSize() const override;\n\n  /** Return an allocator for allocating memory\n   */\n  OrtAllocator *Allocator() const override;\n\n  // The model accepts this number of frames before subsampling as input\n  int32_t ChunkLength() const override;\n\n  // Similar to frame_shift in feature extractor, after processing\n  // ChunkLength() frames, we advance by ChunkShift() frames\n  // before we process the next chunk.\n  int32_t ChunkShift() const override;\n\n  bool UseWhisperFeature() const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-transducer-model.cc",
    "content": "// sherpa-onnx/csrc/online-zipformer2-transducer-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-zipformer2-transducer-model.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cmath>\n#include <memory>\n#include <numeric>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-transducer-decoder.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/unbind.h\"\n\nnamespace sherpa_onnx {\n\nOnlineZipformer2TransducerModel::OnlineZipformer2TransducerModel(\n    const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      encoder_sess_opts_(GetSessionOptions(config)),\n      decoder_sess_opts_(GetSessionOptions(config, \"decoder\")),\n      joiner_sess_opts_(GetSessionOptions(config, \"joiner\")),\n      config_(config),\n      allocator_{} {\n  {\n    auto buf = ReadFile(config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\ntemplate <typename Manager>\nOnlineZipformer2TransducerModel::OnlineZipformer2TransducerModel(\n    Manager *mgr, const OnlineModelConfig &config)\n    : env_(ORT_LOGGING_LEVEL_ERROR),\n      config_(config),\n      encoder_sess_opts_(GetSessionOptions(config)),\n      decoder_sess_opts_(GetSessionOptions(config, 
\"decoder\")),\n      joiner_sess_opts_(GetSessionOptions(config, \"joiner\")),\n      allocator_{} {\n  {\n    auto buf = ReadFile(mgr, config.transducer.encoder);\n    InitEncoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.decoder);\n    InitDecoder(buf.data(), buf.size());\n  }\n\n  {\n    auto buf = ReadFile(mgr, config.transducer.joiner);\n    InitJoiner(buf.data(), buf.size());\n  }\n}\n\nvoid OnlineZipformer2TransducerModel::InitEncoder(void *model_data,\n                                                  size_t model_data_length) {\n  encoder_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, encoder_sess_opts_);\n\n  GetInputNames(encoder_sess_.get(), &encoder_input_names_,\n                &encoder_input_names_ptr_);\n\n  GetOutputNames(encoder_sess_.get(), &encoder_output_names_,\n                 &encoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = encoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---encoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n  SHERPA_ONNX_READ_META_DATA_VEC(encoder_dims_, \"encoder_dims\");\n  SHERPA_ONNX_READ_META_DATA_VEC(query_head_dims_, \"query_head_dims\");\n  SHERPA_ONNX_READ_META_DATA_VEC(value_head_dims_, \"value_head_dims\");\n  SHERPA_ONNX_READ_META_DATA_VEC(num_heads_, \"num_heads\");\n  SHERPA_ONNX_READ_META_DATA_VEC(num_encoder_layers_, \"num_encoder_layers\");\n  SHERPA_ONNX_READ_META_DATA_VEC(cnn_module_kernels_, \"cnn_module_kernels\");\n  SHERPA_ONNX_READ_META_DATA_VEC(left_context_len_, \"left_context_len\");\n\n  SHERPA_ONNX_READ_META_DATA(T_, \"T\");\n  SHERPA_ONNX_READ_META_DATA(decode_chunk_len_, \"decode_chunk_len\");\n\n  
std::string feature_type;\n  SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(feature_type, \"feature\", \"\");\n  if (feature_type == \"whisper\") {\n    use_whisper_feature_ = true;\n  }\n\n  if (config_.debug) {\n    auto print = [](const std::vector<int32_t> &v, const char *name) {\n      std::ostringstream os;\n      os << name << \": \";\n      for (auto i : v) {\n        os << i << \" \";\n      }\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    };\n    print(encoder_dims_, \"encoder_dims\");\n    print(query_head_dims_, \"query_head_dims\");\n    print(value_head_dims_, \"value_head_dims\");\n    print(num_heads_, \"num_heads\");\n    print(num_encoder_layers_, \"num_encoder_layers\");\n    print(cnn_module_kernels_, \"cnn_module_kernels\");\n    print(left_context_len_, \"left_context_len\");\n\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"T: %{public}d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %{public}d\", decode_chunk_len_);\n#else\n    SHERPA_ONNX_LOGE(\"T: %d\", T_);\n    SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n#endif\n  }\n}\n\nvoid OnlineZipformer2TransducerModel::InitDecoder(void *model_data,\n                                                  size_t model_data_length) {\n  decoder_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, decoder_sess_opts_);\n\n  GetInputNames(decoder_sess_.get(), &decoder_input_names_,\n                &decoder_input_names_ptr_);\n\n  GetOutputNames(decoder_sess_.get(), &decoder_output_names_,\n                 &decoder_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = decoder_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---decoder---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;  // used in the 
macro below\n  SHERPA_ONNX_READ_META_DATA(vocab_size_, \"vocab_size\");\n  SHERPA_ONNX_READ_META_DATA(context_size_, \"context_size\");\n}\n\nvoid OnlineZipformer2TransducerModel::InitJoiner(void *model_data,\n                                                 size_t model_data_length) {\n  joiner_sess_ = std::make_unique<Ort::Session>(\n      env_, model_data, model_data_length, joiner_sess_opts_);\n\n  GetInputNames(joiner_sess_.get(), &joiner_input_names_,\n                &joiner_input_names_ptr_);\n\n  GetOutputNames(joiner_sess_.get(), &joiner_output_names_,\n                 &joiner_output_names_ptr_);\n\n  // get meta data\n  Ort::ModelMetadata meta_data = joiner_sess_->GetModelMetadata();\n  if (config_.debug) {\n    std::ostringstream os;\n    os << \"---joiner---\\n\";\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n}\n\nstd::vector<Ort::Value> OnlineZipformer2TransducerModel::StackStates(\n    const std::vector<std::vector<Ort::Value>> &states) const {\n  int32_t batch_size = static_cast<int32_t>(states.size());\n\n  std::vector<const Ort::Value *> buf(batch_size);\n\n  auto allocator =\n      const_cast<OnlineZipformer2TransducerModel *>(this)->allocator_;\n\n  std::vector<Ort::Value> ans;\n  int32_t num_states = static_cast<int32_t>(states[0].size());\n  ans.reserve(num_states);\n\n  for (int32_t i = 0; i != (num_states - 2) / 6; ++i) {\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i];\n      }\n      auto v = Cat(allocator, buf, 1);\n      ans.push_back(std::move(v));\n    }\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i + 1];\n      }\n      auto v = Cat(allocator, buf, 1);\n      ans.push_back(std::move(v));\n    }\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i + 2];\n      }\n      auto v = Cat(allocator, buf, 1);\n      ans.push_back(std::move(v));\n    }\n    {\n     
 for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i + 3];\n      }\n      auto v = Cat(allocator, buf, 1);\n      ans.push_back(std::move(v));\n    }\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i + 4];\n      }\n      auto v = Cat(allocator, buf, 0);\n      ans.push_back(std::move(v));\n    }\n    {\n      for (int32_t n = 0; n != batch_size; ++n) {\n        buf[n] = &states[n][6 * i + 5];\n      }\n      auto v = Cat(allocator, buf, 0);\n      ans.push_back(std::move(v));\n    }\n  }\n\n  {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_states - 2];\n    }\n    auto v = Cat(allocator, buf, 0);\n    ans.push_back(std::move(v));\n  }\n\n  {\n    for (int32_t n = 0; n != batch_size; ++n) {\n      buf[n] = &states[n][num_states - 1];\n    }\n    auto v = Cat<int64_t>(allocator, buf, 0);\n    ans.push_back(std::move(v));\n  }\n  return ans;\n}\n\nstd::vector<std::vector<Ort::Value>>\nOnlineZipformer2TransducerModel::UnStackStates(\n    const std::vector<Ort::Value> &states) const {\n  int32_t m = std::accumulate(num_encoder_layers_.begin(),\n                              num_encoder_layers_.end(), 0);\n  assert(static_cast<int32_t>(states.size()) == m * 6 + 2);\n\n  int32_t batch_size = states[0].GetTensorTypeAndShapeInfo().GetShape()[1];\n\n  auto allocator =\n      const_cast<OnlineZipformer2TransducerModel *>(this)->allocator_;\n\n  std::vector<std::vector<Ort::Value>> ans;\n  ans.resize(batch_size);\n\n  for (int32_t i = 0; i != m; ++i) {\n    {\n      auto v = Unbind(allocator, &states[i * 6], 1);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind(allocator, &states[i * 6 + 1], 1);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        
ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind(allocator, &states[i * 6 + 2], 1);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind(allocator, &states[i * 6 + 3], 1);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind(allocator, &states[i * 6 + 4], 0);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n    {\n      auto v = Unbind(allocator, &states[i * 6 + 5], 0);\n      assert(static_cast<int32_t>(v.size()) == batch_size);\n\n      for (int32_t n = 0; n != batch_size; ++n) {\n        ans[n].push_back(std::move(v[n]));\n      }\n    }\n  }\n\n  {\n    auto v = Unbind(allocator, &states[m * 6], 0);\n    assert(static_cast<int32_t>(v.size()) == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n  {\n    auto v = Unbind<int64_t>(allocator, &states[m * 6 + 1], 0);\n    assert(static_cast<int32_t>(v.size()) == batch_size);\n\n    for (int32_t n = 0; n != batch_size; ++n) {\n      ans[n].push_back(std::move(v[n]));\n    }\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value>\nOnlineZipformer2TransducerModel::GetEncoderInitStates() {\n  std::vector<Ort::Value> ans;\n  int32_t n = static_cast<int32_t>(encoder_dims_.size());\n  int32_t m = std::accumulate(num_encoder_layers_.begin(),\n                              num_encoder_layers_.end(), 0);\n  ans.reserve(m * 6 + 2);\n\n  for (int32_t i = 0; i != n; ++i) {\n    int32_t num_layers = num_encoder_layers_[i];\n    int32_t key_dim = query_head_dims_[i] * num_heads_[i];\n    int32_t value_dim = 
value_head_dims_[i] * num_heads_[i];\n    int32_t nonlin_attn_head_dim = 3 * encoder_dims_[i] / 4;\n\n    for (int32_t j = 0; j != num_layers; ++j) {\n      {\n        std::array<int64_t, 3> s{left_context_len_[i], 1, key_dim};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n\n      {\n        std::array<int64_t, 4> s{1, 1, left_context_len_[i],\n                                 nonlin_attn_head_dim};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n\n      {\n        std::array<int64_t, 3> s{left_context_len_[i], 1, value_dim};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n\n      {\n        std::array<int64_t, 3> s{left_context_len_[i], 1, value_dim};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n\n      {\n        std::array<int64_t, 3> s{1, encoder_dims_[i],\n                                 cnn_module_kernels_[i] / 2};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n\n      {\n        std::array<int64_t, 3> s{1, encoder_dims_[i],\n                                 cnn_module_kernels_[i] / 2};\n        auto v =\n            Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n        Fill(&v, 0);\n        ans.push_back(std::move(v));\n      }\n    }\n  }\n\n  {\n    SHERPA_ONNX_CHECK_NE(feature_dim_, 0);\n    int32_t embed_dim = (((feature_dim_ - 1) / 2) - 1) / 2;\n    std::array<int64_t, 4> s{1, 128, 3, embed_dim};\n\n    auto v = 
Ort::Value::CreateTensor<float>(allocator_, s.data(), s.size());\n    Fill(&v, 0);\n    ans.push_back(std::move(v));\n  }\n\n  {\n    std::array<int64_t, 1> s{1};\n    auto v = Ort::Value::CreateTensor<int64_t>(allocator_, s.data(), s.size());\n    Fill<int64_t>(&v, 0);\n    ans.push_back(std::move(v));\n  }\n  return ans;\n}\n\nstd::pair<Ort::Value, std::vector<Ort::Value>>\nOnlineZipformer2TransducerModel::RunEncoder(Ort::Value features,\n                                            std::vector<Ort::Value> states,\n                                            Ort::Value /* processed_frames */) {\n  std::vector<Ort::Value> encoder_inputs;\n  encoder_inputs.reserve(1 + states.size());\n\n  encoder_inputs.push_back(std::move(features));\n  for (auto &v : states) {\n    encoder_inputs.push_back(std::move(v));\n  }\n\n  auto encoder_out = encoder_sess_->Run(\n      {}, encoder_input_names_ptr_.data(), encoder_inputs.data(),\n      encoder_inputs.size(), encoder_output_names_ptr_.data(),\n      encoder_output_names_ptr_.size());\n\n  std::vector<Ort::Value> next_states;\n  next_states.reserve(states.size());\n\n  for (int32_t i = 1; i != static_cast<int32_t>(encoder_out.size()); ++i) {\n    next_states.push_back(std::move(encoder_out[i]));\n  }\n  return {std::move(encoder_out[0]), std::move(next_states)};\n}\n\nOrt::Value OnlineZipformer2TransducerModel::RunDecoder(\n    Ort::Value decoder_input) {\n  auto decoder_out = decoder_sess_->Run(\n      {}, decoder_input_names_ptr_.data(), &decoder_input, 1,\n      decoder_output_names_ptr_.data(), decoder_output_names_ptr_.size());\n  return std::move(decoder_out[0]);\n}\n\nOrt::Value OnlineZipformer2TransducerModel::RunJoiner(Ort::Value encoder_out,\n                                                      Ort::Value decoder_out) {\n  std::array<Ort::Value, 2> joiner_input = {std::move(encoder_out),\n                                            std::move(decoder_out)};\n  auto logit =\n      joiner_sess_->Run({}, 
joiner_input_names_ptr_.data(), joiner_input.data(),\n                        joiner_input.size(), joiner_output_names_ptr_.data(),\n                        joiner_output_names_ptr_.size());\n\n  return std::move(logit[0]);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineZipformer2TransducerModel::OnlineZipformer2TransducerModel(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineZipformer2TransducerModel::OnlineZipformer2TransducerModel(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/online-zipformer2-transducer-model.h",
    "content": "// sherpa-onnx/csrc/online-zipformer2-transducer-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_TRANSDUCER_MODEL_H_\n#define SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_TRANSDUCER_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformer2TransducerModel : public OnlineTransducerModel {\n public:\n  explicit OnlineZipformer2TransducerModel(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineZipformer2TransducerModel(Manager *mgr,\n                                  const OnlineModelConfig &config);\n\n  std::vector<Ort::Value> StackStates(\n      const std::vector<std::vector<Ort::Value>> &states) const override;\n\n  std::vector<std::vector<Ort::Value>> UnStackStates(\n      const std::vector<Ort::Value> &states) const override;\n\n  std::vector<Ort::Value> GetEncoderInitStates() override;\n\n  void SetFeatureDim(int32_t feature_dim) override {\n    feature_dim_ = feature_dim;\n  }\n\n  std::pair<Ort::Value, std::vector<Ort::Value>> RunEncoder(\n      Ort::Value features, std::vector<Ort::Value> states,\n      Ort::Value processed_frames) override;\n\n  Ort::Value RunDecoder(Ort::Value decoder_input) override;\n\n  Ort::Value RunJoiner(Ort::Value encoder_out, Ort::Value decoder_out) override;\n\n  int32_t ContextSize() const override { return context_size_; }\n\n  int32_t ChunkSize() const override { return T_; }\n\n  int32_t ChunkShift() const override { return decode_chunk_len_; }\n\n  int32_t VocabSize() const override { return vocab_size_; }\n  OrtAllocator *Allocator() override { return allocator_; }\n\n  bool UseWhisperFeature() const override { return use_whisper_feature_; }\n\n private:\n  void InitEncoder(void *model_data, size_t 
model_data_length);\n  void InitDecoder(void *model_data, size_t model_data_length);\n  void InitJoiner(void *model_data, size_t model_data_length);\n\n private:\n  Ort::Env env_;\n  Ort::SessionOptions encoder_sess_opts_;\n  Ort::SessionOptions decoder_sess_opts_;\n  Ort::SessionOptions joiner_sess_opts_;\n\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> encoder_sess_;\n  std::unique_ptr<Ort::Session> decoder_sess_;\n  std::unique_ptr<Ort::Session> joiner_sess_;\n\n  std::vector<std::string> encoder_input_names_;\n  std::vector<const char *> encoder_input_names_ptr_;\n\n  std::vector<std::string> encoder_output_names_;\n  std::vector<const char *> encoder_output_names_ptr_;\n\n  std::vector<std::string> decoder_input_names_;\n  std::vector<const char *> decoder_input_names_ptr_;\n\n  std::vector<std::string> decoder_output_names_;\n  std::vector<const char *> decoder_output_names_ptr_;\n\n  std::vector<std::string> joiner_input_names_;\n  std::vector<const char *> joiner_input_names_ptr_;\n\n  std::vector<std::string> joiner_output_names_;\n  std::vector<const char *> joiner_output_names_ptr_;\n\n  OnlineModelConfig config_;\n\n  std::vector<int32_t> encoder_dims_;\n  std::vector<int32_t> query_head_dims_;\n  std::vector<int32_t> value_head_dims_;\n  std::vector<int32_t> num_heads_;\n  std::vector<int32_t> num_encoder_layers_;\n  std::vector<int32_t> cnn_module_kernels_;\n  std::vector<int32_t> left_context_len_;\n\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n\n  int32_t context_size_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t feature_dim_ = 80;\n\n  // for models from\n  // https://github.com/k2-fsa/icefall/blob/master/egs/multi_zh-hans/ASR/RESULTS.md#streaming-with-ctc-head\n  bool use_whisper_feature_ = false;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONLINE_ZIPFORMER2_TRANSDUCER_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/onnx-utils.cc",
    "content": "// sherpa-onnx/csrc/onnx-utils.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n// Copyright (c)  2023  Pingfeng Luo\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\n#include <algorithm>\n#include <cstdint>\n#include <cstring>\n#include <fstream>\n#include <functional>\n#include <memory>\n#include <numeric>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstatic std::string GetInputName(Ort::Session *sess, size_t index,\n                                OrtAllocator *allocator) {\n// Note(fangjun): We only tested 1.17.1 and 1.11.0\n// For other versions, we may need to change it\n#if ORT_API_VERSION >= 12\n  auto v = sess->GetInputNameAllocated(index, allocator);\n  return v.get();\n#else\n  auto v = sess->GetInputName(index, allocator);\n  std::string ans = v;\n  allocator->Free(allocator, v);\n  return ans;\n#endif\n}\n\nstatic std::string GetOutputName(Ort::Session *sess, size_t index,\n                                 OrtAllocator *allocator) {\n// Note(fangjun): We only tested 1.17.1 and 1.11.0\n// For other versions, we may need to change it\n#if ORT_API_VERSION >= 12\n  auto v = sess->GetOutputNameAllocated(index, allocator);\n  return v.get();\n#else\n  auto v = sess->GetOutputName(index, allocator);\n  std::string ans = v;\n  allocator->Free(allocator, v);\n  return ans;\n#endif\n}\n\nvoid GetInputNames(Ort::Session *sess, std::vector<std::string> *input_names,\n                   std::vector<const char *> *input_names_ptr) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  size_t node_count = sess->GetInputCount();\n  input_names->resize(node_count);\n  input_names_ptr->resize(node_count);\n  for (size_t i = 0; i != node_count; ++i) {\n    (*input_names)[i] = GetInputName(sess, i, allocator);\n    (*input_names_ptr)[i] = (*input_names)[i].c_str();\n  }\n}\n\nvoid GetOutputNames(Ort::Session 
*sess, std::vector<std::string> *output_names,\n                    std::vector<const char *> *output_names_ptr) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  size_t node_count = sess->GetOutputCount();\n  output_names->resize(node_count);\n  output_names_ptr->resize(node_count);\n  for (size_t i = 0; i != node_count; ++i) {\n    (*output_names)[i] = GetOutputName(sess, i, allocator);\n    (*output_names_ptr)[i] = (*output_names)[i].c_str();\n  }\n}\n\nOrt::Value GetEncoderOutFrame(OrtAllocator *allocator, Ort::Value *encoder_out,\n                              int32_t t) {\n  std::vector<int64_t> encoder_out_shape =\n      encoder_out->GetTensorTypeAndShapeInfo().GetShape();\n\n  auto batch_size = encoder_out_shape[0];\n  auto num_frames = encoder_out_shape[1];\n  assert(t < num_frames);\n\n  auto encoder_out_dim = encoder_out_shape[2];\n\n  auto offset = num_frames * encoder_out_dim;\n\n  std::array<int64_t, 2> shape{batch_size, encoder_out_dim};\n\n  Ort::Value ans =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n\n  float *dst = ans.GetTensorMutableData<float>();\n  const float *src = encoder_out->GetTensorData<float>();\n\n  for (int32_t i = 0; i != batch_size; ++i) {\n    std::copy(src + t * encoder_out_dim, src + (t + 1) * encoder_out_dim, dst);\n    src += offset;\n    dst += encoder_out_dim;\n  }\n  return ans;\n}\n\nvoid PrintModelMetadata(std::ostream &os, const Ort::ModelMetadata &meta_data) {\n  Ort::AllocatorWithDefaultOptions allocator;\n#if ORT_API_VERSION >= 12\n  std::vector<Ort::AllocatedStringPtr> v =\n      meta_data.GetCustomMetadataMapKeysAllocated(allocator);\n  for (const auto &key : v) {\n    auto p = meta_data.LookupCustomMetadataMapAllocated(key.get(), allocator);\n    os << key.get() << \"=\" << p.get() << \"\\n\";\n  }\n#else\n  int64_t num_keys = 0;\n  char **keys = meta_data.GetCustomMetadataMapKeys(allocator, num_keys);\n  for (int32_t i = 0; i < num_keys; ++i) {\n    auto v = 
LookupCustomModelMetaData(meta_data, keys[i], allocator);\n    os << keys[i] << \"=\" << v << \"\\n\";\n    allocator.Free(keys[i]);\n  }\n\n  allocator.Free(keys);\n#endif\n}\n\nOrt::Value Clone(OrtAllocator *allocator, const Ort::Value *v) {\n  auto type_and_shape = v->GetTensorTypeAndShapeInfo();\n  std::vector<int64_t> shape = type_and_shape.GetShape();\n\n  switch (type_and_shape.GetElementType()) {\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32: {\n      Ort::Value ans = Ort::Value::CreateTensor<int32_t>(\n          allocator, shape.data(), shape.size());\n      const int32_t *start = v->GetTensorData<int32_t>();\n      const int32_t *end = start + type_and_shape.GetElementCount();\n      int32_t *dst = ans.GetTensorMutableData<int32_t>();\n      std::copy(start, end, dst);\n      return ans;\n    }\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64: {\n      Ort::Value ans = Ort::Value::CreateTensor<int64_t>(\n          allocator, shape.data(), shape.size());\n      const int64_t *start = v->GetTensorData<int64_t>();\n      const int64_t *end = start + type_and_shape.GetElementCount();\n      int64_t *dst = ans.GetTensorMutableData<int64_t>();\n      std::copy(start, end, dst);\n      return ans;\n    }\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT: {\n      Ort::Value ans = Ort::Value::CreateTensor<float>(allocator, shape.data(),\n                                                       shape.size());\n      const float *start = v->GetTensorData<float>();\n      const float *end = start + type_and_shape.GetElementCount();\n      float *dst = ans.GetTensorMutableData<float>();\n      std::copy(start, end, dst);\n      return ans;\n    }\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16: {\n      Ort::Value ans =\n          Ort::Value::CreateTensor(allocator, shape.data(), shape.size(),\n                                   ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n      const auto *start = v->GetTensorData<uint16_t>();\n      const auto *end = start + 
type_and_shape.GetElementCount();\n      auto *dst = ans.GetTensorMutableData<uint16_t>();\n      std::copy(start, end, dst);\n      return ans;\n    }\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16: {\n      Ort::Value ans = Ort::Value::CreateTensor<uint16_t>(\n          allocator, shape.data(), shape.size());\n      const auto *start = v->GetTensorData<uint16_t>();\n      const auto *end = start + type_and_shape.GetElementCount();\n      auto *dst = ans.GetTensorMutableData<uint16_t>();\n      std::copy(start, end, dst);\n      return ans;\n    }\n\n    default:\n      SHERPA_ONNX_LOGE(\"Unsupported type: %d\\n\",\n                       static_cast<int32_t>(type_and_shape.GetElementType()));\n      SHERPA_ONNX_EXIT(-1);\n      // unreachable code\n      return Ort::Value{nullptr};\n  }\n}\n\nOrt::Value View(Ort::Value *v) {\n  auto type_and_shape = v->GetTensorTypeAndShapeInfo();\n  std::vector<int64_t> shape = type_and_shape.GetShape();\n\n#if ORT_API_VERSION >= 14\n  auto memory_info = v->GetTensorMemoryInfo();\n#else\n  const OrtMemoryInfo *memory_info = nullptr;\n  OrtStatus *status = Ort::GetApi().GetTensorMemoryInfo(*v, &memory_info);\n  if (status != nullptr) {\n    const char *msg = Ort::GetApi().GetErrorMessage(status);\n    Ort::GetApi().ReleaseStatus(status);\n    SHERPA_ONNX_LOGE(\"Failed to get tensor memory info with error: '%s'\", msg);\n    SHERPA_ONNX_EXIT(-1);\n  }\n#endif\n\n  switch (type_and_shape.GetElementType()) {\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32:\n      return Ort::Value::CreateTensor(\n          memory_info, v->GetTensorMutableData<int32_t>(),\n          type_and_shape.GetElementCount(), shape.data(), shape.size());\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64:\n      return Ort::Value::CreateTensor(\n          memory_info, v->GetTensorMutableData<int64_t>(),\n          type_and_shape.GetElementCount(), shape.data(), shape.size());\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT:\n      return Ort::Value::CreateTensor(\n     
     memory_info, v->GetTensorMutableData<float>(),\n          type_and_shape.GetElementCount(), shape.data(), shape.size());\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16:\n      return Ort::Value::CreateTensor(\n          memory_info, v->GetTensorMutableData<uint16_t>(),\n          type_and_shape.GetElementCount() * sizeof(uint16_t), shape.data(),\n          shape.size(), ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT16:\n      return Ort::Value::CreateTensor(\n          memory_info, v->GetTensorMutableData<uint16_t>(),\n          type_and_shape.GetElementCount(), shape.data(), shape.size());\n    case ONNX_TENSOR_ELEMENT_DATA_TYPE_BOOL:\n      return Ort::Value::CreateTensor(\n          memory_info, v->GetTensorMutableData<bool>(),\n          type_and_shape.GetElementCount(), shape.data(), shape.size());\n    default:\n      SHERPA_ONNX_LOGE(\"Unsupported type: %d\\n\",\n                       static_cast<int32_t>(type_and_shape.GetElementType()));\n      SHERPA_ONNX_EXIT(-1);\n      // unreachable code\n      return Ort::Value{nullptr};\n  }\n}\n\nfloat ComputeSum(const Ort::Value *v, int32_t n /*= -1*/) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  auto size = static_cast<int32_t>(\n      std::accumulate(shape.begin(), shape.end(), 1, std::multiplies<>()));\n  if (n != -1 && n < size && n > 0) {\n    size = n;\n  }\n\n  const float *p = v->GetTensorData<float>();\n\n  // Start from 0 so that the result is the plain sum of the elements.\n  return std::accumulate(p, p + size, 0.0f);\n}\n\nfloat ComputeMean(const Ort::Value *v, int32_t n /*= -1*/) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  auto size = static_cast<int32_t>(\n      std::accumulate(shape.begin(), shape.end(), 1, std::multiplies<>()));\n\n  if (n != -1 && n < size && n > 0) {\n    size = n;\n  }\n\n  auto sum = ComputeSum(v, n);\n  return sum / size;\n}\n\nvoid PrintShape(const Ort::Value *v) {\n  std::vector<int64_t> shape = 
v->GetTensorTypeAndShapeInfo().GetShape();\n  std::ostringstream os;\n  for (auto i : shape) {\n    os << i << \", \";\n  }\n  os << \"\\n\";\n  SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n}\n\ntemplate <typename T /*= float*/>\nvoid Print1D(const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  const T *d = v->GetTensorData<T>();\n  std::ostringstream os;\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    os << d[i] << \" \";\n  }\n  os << \"\\n\";\n  SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n}\n\ntemplate void Print1D<int64_t>(const Ort::Value *v);\ntemplate void Print1D<int32_t>(const Ort::Value *v);\ntemplate void Print1D<float>(const Ort::Value *v);\n\ntemplate <typename T /*= float*/>\nvoid Print2D(const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  const T *d = v->GetTensorData<T>();\n\n  std::ostringstream os;\n  for (int32_t r = 0; r != static_cast<int32_t>(shape[0]); ++r) {\n    for (int32_t c = 0; c != static_cast<int32_t>(shape[1]); ++c, ++d) {\n      os << *d << \" \";\n    }\n    os << \"\\n\";\n  }\n  SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n}\n\ntemplate void Print2D<int64_t>(const Ort::Value *v);\ntemplate void Print2D<float>(const Ort::Value *v);\n\nvoid Print3D(const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  const float *d = v->GetTensorData<float>();\n\n  for (int32_t p = 0; p != static_cast<int32_t>(shape[0]); ++p) {\n    SHERPA_ONNX_LOGE(\"---plane %d---\\n\", p);\n    for (int32_t r = 0; r != static_cast<int32_t>(shape[1]); ++r) {\n      for (int32_t c = 0; c != static_cast<int32_t>(shape[2]); ++c, ++d) {\n        SHERPA_ONNX_LOGE(\"%.3f \", *d);\n      }\n      SHERPA_ONNX_LOGE(\"\\n\");\n    }\n  }\n  SHERPA_ONNX_LOGE(\"\\n\");\n}\n\nvoid Print4D(const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  const float *d 
= v->GetTensorData<float>();\n\n  for (int32_t p = 0; p != static_cast<int32_t>(shape[0]); ++p) {\n    SHERPA_ONNX_LOGE(\"---plane %d---\\n\", p);\n    for (int32_t q = 0; q != static_cast<int32_t>(shape[1]); ++q) {\n      SHERPA_ONNX_LOGE(\"---subplane %d---\\n\", q);\n      for (int32_t r = 0; r != static_cast<int32_t>(shape[2]); ++r) {\n        for (int32_t c = 0; c != static_cast<int32_t>(shape[3]); ++c, ++d) {\n          SHERPA_ONNX_LOGE(\"%.3f \", *d);\n        }\n        SHERPA_ONNX_LOGE(\"\\n\");\n      }\n      SHERPA_ONNX_LOGE(\"\\n\");\n    }\n  }\n  SHERPA_ONNX_LOGE(\"\\n\");\n}\n\nOrt::Value Repeat(OrtAllocator *allocator, Ort::Value *cur_encoder_out,\n                  const std::vector<int32_t> &hyps_num_split) {\n  std::vector<int64_t> cur_encoder_out_shape =\n      cur_encoder_out->GetTensorTypeAndShapeInfo().GetShape();\n\n  std::array<int64_t, 2> ans_shape{hyps_num_split.back(),\n                                   cur_encoder_out_shape[1]};\n\n  Ort::Value ans = Ort::Value::CreateTensor<float>(allocator, ans_shape.data(),\n                                                   ans_shape.size());\n\n  const float *src = cur_encoder_out->GetTensorData<float>();\n  float *dst = ans.GetTensorMutableData<float>();\n  int32_t batch_size = static_cast<int32_t>(hyps_num_split.size()) - 1;\n  for (int32_t b = 0; b != batch_size; ++b) {\n    int32_t cur_stream_hyps_num = hyps_num_split[b + 1] - hyps_num_split[b];\n    for (int32_t i = 0; i != cur_stream_hyps_num; ++i) {\n      std::copy(src, src + cur_encoder_out_shape[1], dst);\n      dst += cur_encoder_out_shape[1];\n    }\n    src += cur_encoder_out_shape[1];\n  }\n  return ans;\n}\n\nCopyableOrtValue::CopyableOrtValue(const CopyableOrtValue &other) {\n  *this = other;\n}\n\nCopyableOrtValue &CopyableOrtValue::operator=(const CopyableOrtValue &other) {\n  if (this == &other) {\n    return *this;\n  }\n  if (other.value) {\n    Ort::AllocatorWithDefaultOptions allocator;\n    value = Clone(allocator, 
&other.value);\n  }\n  return *this;\n}\n\nCopyableOrtValue::CopyableOrtValue(CopyableOrtValue &&other) noexcept {\n  *this = std::move(other);\n}\n\nCopyableOrtValue &CopyableOrtValue::operator=(\n    CopyableOrtValue &&other) noexcept {\n  if (this == &other) {\n    return *this;\n  }\n  value = std::move(other.value);\n  return *this;\n}\n\nstd::vector<CopyableOrtValue> Convert(std::vector<Ort::Value> values) {\n  std::vector<CopyableOrtValue> ans;\n  ans.reserve(values.size());\n\n  for (auto &v : values) {\n    ans.emplace_back(std::move(v));\n  }\n\n  return ans;\n}\n\nstd::vector<Ort::Value> Convert(std::vector<CopyableOrtValue> values) {\n  std::vector<Ort::Value> ans;\n  ans.reserve(values.size());\n\n  for (auto &v : values) {\n    ans.emplace_back(std::move(v.value));\n  }\n\n  return ans;\n}\n\nstd::string LookupCustomModelMetaData(const Ort::ModelMetadata &meta_data,\n                                      const char *key,\n                                      OrtAllocator *allocator) {\n// Note(fangjun): We only tested 1.17.1 and 1.11.0\n// For other versions, we may need to change it\n#if ORT_API_VERSION >= 12\n  auto v = meta_data.LookupCustomMetadataMapAllocated(key, allocator);\n  return v ? v.get() : \"\";\n#else\n  auto v = meta_data.LookupCustomMetadataMap(key, allocator);\n  std::string ans = v ? v : \"\";\n  allocator->Free(allocator, v);\n  return ans;\n#endif\n}\n\n// Convert IEEE 754 half-precision (16-bit) float to single-precision (32-bit)\n// float. 
Handles special cases: zero, subnormal, normal, infinity, and NaN.\nfloat HalfBitsToFloat(uint16_t h) {\n  const uint32_t sign = (static_cast<uint32_t>(h & 0x8000u)) << 16;\n  const uint32_t exp = (h & 0x7C00u) >> 10;\n  const uint32_t mant = (h & 0x03FFu);\n  uint32_t fbits = 0;\n  if (exp == 0) {\n    if (mant == 0) {\n      fbits = sign;\n    } else {\n      uint32_t m = mant;\n      uint32_t e = 127 - 15 + 1;\n      while ((m & 0x0400u) == 0) {\n        m <<= 1;\n        --e;\n      }\n      m &= 0x03FFu;\n      fbits = sign | (e << 23) | (m << 13);\n    }\n  } else if (exp == 31) {\n    fbits = sign | 0x7F800000u | (mant << 13);\n  } else {\n    const uint32_t e = exp + (127 - 15);\n    fbits = sign | (e << 23) | (mant << 13);\n  }\n  float out;\n  std::memcpy(&out, &fbits, sizeof(out));\n  return out;\n}\n\n// Convert IEEE 754 single-precision (32-bit) float to half-precision (16-bit)\n// float. Handles overflow (clamped to infinity), underflow (clamped to zero),\n// and normal values with proper rounding.\nuint16_t FloatToHalfBits(float f) {\n  uint32_t x;\n  std::memcpy(&x, &f, sizeof(x));\n  const uint32_t sign = (x >> 16) & 0x8000u;\n  const int32_t exp = static_cast<int32_t>((x >> 23) & 0xFFu);\n  const uint32_t mant = x & 0x007FFFFFu;\n  if (exp == 255) {\n    if (mant == 0) return static_cast<uint16_t>(sign | 0x7C00u);\n    return static_cast<uint16_t>(sign | 0x7C00u | (mant ? 
0x1u : 0));\n  }\n  int32_t new_exp = exp - 127 + 15;\n  if (new_exp >= 31) {\n    return static_cast<uint16_t>(sign | 0x7C00u);\n  } else if (new_exp <= 0) {\n    if (new_exp < -10) {\n      return static_cast<uint16_t>(sign);\n    }\n    uint32_t m = mant | 0x00800000u;\n    int32_t shift = 14 - new_exp;\n    uint32_t half_m = m >> shift;\n    if ((m >> (shift - 1)) & 1u) {\n      half_m += 1;\n    }\n    // Rounding may carry into bit 10. Keep the carry: 0x0400 is exactly the\n    // smallest normal half, whereas masking it off would return zero.\n    return static_cast<uint16_t>(sign | half_m);\n  } else {\n    uint16_t half_exp = static_cast<uint16_t>(new_exp << 10);\n    uint32_t half_m = mant >> 13;\n    if (mant & 0x00001000u) {\n      half_m += 1;\n      if (half_m == 0x0400u) {\n        half_m = 0;\n        half_exp = static_cast<uint16_t>((new_exp + 1) << 10);\n        if ((half_exp >> 10) >= 31) {\n          return static_cast<uint16_t>(sign | 0x7C00u);\n        }\n      }\n    }\n    return static_cast<uint16_t>(sign | half_exp | (half_m & 0x03FFu));\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/onnx-utils.h",
    "content": "// sherpa-onnx/csrc/onnx-utils.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n// Copyright (c)  2023  Pingfeng Luo\n#ifndef SHERPA_ONNX_CSRC_ONNX_UTILS_H_\n#define SHERPA_ONNX_CSRC_ONNX_UTILS_H_\n\n#ifdef _MSC_VER\n// For ToWide() below\n#include <codecvt>\n#include <locale>\n#endif\n\n#include <algorithm>\n#include <cassert>\n#include <ostream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/**\n * Get the input names of a model.\n *\n * @param sess An onnxruntime session.\n * @param input_names. On return, it contains the input names of the model.\n * @param input_names_ptr. On return, input_names_ptr[i] contains\n *                         input_names[i].c_str()\n */\nvoid GetInputNames(Ort::Session *sess, std::vector<std::string> *input_names,\n                   std::vector<const char *> *input_names_ptr);\n\n/**\n * Get the output names of a model.\n *\n * @param sess An onnxruntime session.\n * @param output_names. On return, it contains the output names of the model.\n * @param output_names_ptr. 
On return, output_names_ptr[i] contains\n *                         output_names[i].c_str()\n */\nvoid GetOutputNames(Ort::Session *sess, std::vector<std::string> *output_names,\n                    std::vector<const char *> *output_names_ptr);\n\n/**\n * Get the output frame of Encoder\n *\n * @param allocator allocator of onnxruntime\n * @param encoder_out encoder out tensor\n * @param t frame_index\n *\n */\nOrt::Value GetEncoderOutFrame(OrtAllocator *allocator, Ort::Value *encoder_out,\n                              int32_t t);\n\nstd::string LookupCustomModelMetaData(const Ort::ModelMetadata &meta_data,\n                                      const char *key, OrtAllocator *allocator);\n\nvoid PrintModelMetadata(std::ostream &os,\n                        const Ort::ModelMetadata &meta_data);  // NOLINT\n\n// Return a deep copy of v\nOrt::Value Clone(OrtAllocator *allocator, const Ort::Value *v);\n\n// Return a shallow copy\nOrt::Value View(Ort::Value *v);\n\nfloat ComputeSum(const Ort::Value *v, int32_t n = -1);\nfloat ComputeMean(const Ort::Value *v, int32_t n = -1);\n\n// Print a 1-D tensor to stderr\ntemplate <typename T = float>\nvoid Print1D(const Ort::Value *v);\n\n// Print a 2-D tensor to stderr\ntemplate <typename T = float>\nvoid Print2D(const Ort::Value *v);\n\n// Print a 3-D tensor to stderr\nvoid Print3D(const Ort::Value *v);\n\n// Print a 4-D tensor to stderr\nvoid Print4D(const Ort::Value *v);\n\nvoid PrintShape(const Ort::Value *v);\n\ntemplate <typename T = float>\nvoid Fill(Ort::Value *tensor, T value) {\n  auto n = tensor->GetTypeInfo().GetTensorTypeAndShapeInfo().GetElementCount();\n  auto p = tensor->GetTensorMutableData<T>();\n  std::fill(p, p + n, value);\n}\n\n// TODO(fangjun): Document it\nOrt::Value Repeat(OrtAllocator *allocator, Ort::Value *cur_encoder_out,\n                  const std::vector<int32_t> &hyps_num_split);\n\nstruct CopyableOrtValue {\n  Ort::Value value{nullptr};\n\n  CopyableOrtValue() = default;\n\n  /*explicit*/ 
CopyableOrtValue(Ort::Value v)  // NOLINT\n      : value(std::move(v)) {}\n\n  CopyableOrtValue(const CopyableOrtValue &other);\n\n  CopyableOrtValue &operator=(const CopyableOrtValue &other);\n\n  CopyableOrtValue(CopyableOrtValue &&other) noexcept;\n\n  CopyableOrtValue &operator=(CopyableOrtValue &&other) noexcept;\n};\n\nstd::vector<CopyableOrtValue> Convert(std::vector<Ort::Value> values);\n\nstd::vector<Ort::Value> Convert(std::vector<CopyableOrtValue> values);\n\nfloat HalfBitsToFloat(uint16_t h);\n\nuint16_t FloatToHalfBits(float f);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_ONNX_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/packed-sequence-test.cc",
    "content": "// sherpa-onnx/csrc/packed-sequence-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/packed-sequence.h\"\n\n#include <cstdio>\n#include <numeric>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(PackedSequence, Case1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{5, 5, 4};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  std::iota(p, p + shape[0] * shape[1] * shape[2], 0);\n\n  Ort::Value length =\n      Ort::Value::CreateTensor<int64_t>(allocator, shape.data(), 1);\n  int64_t *p_length = length.GetTensorMutableData<int64_t>();\n  p_length[0] = 1;\n  p_length[1] = 2;\n  p_length[2] = 3;\n  p_length[3] = 5;\n  p_length[4] = 2;\n\n  auto packed_seq = PackPaddedSequence(allocator, &v, &length);\n  fprintf(stderr, \"sorted indexes: \");\n  for (auto i : packed_seq.sorted_indexes) {\n    fprintf(stderr, \"%d \", static_cast<int32_t>(i));\n  }\n  fprintf(stderr, \"\\n\");\n  // output index:   0 1 2 3 4\n  // sorted indexes: 3 2 1 4 0\n  // length:         5 3 2 2 1\n  Print3D(&v);\n  Print2D(&packed_seq.data);\n  fprintf(stderr, \"batch sizes per time step: \");\n  for (auto i : packed_seq.batch_sizes) {\n    fprintf(stderr, \"%d \", static_cast<int32_t>(i));\n  }\n  fprintf(stderr, \"\\n\");\n\n  // TODO(fangjun): Check that the return value is correct\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/packed-sequence.cc",
    "content": "// sherpa-onnx/csrc/packed-sequence.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/packed-sequence.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <numeric>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/slice.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nstatic Ort::Value IndexSelect(OrtAllocator *allocator, const Ort::Value *value,\n                              const std::vector<int32_t> &sorted_indexes) {\n  auto shape = value->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape.size() == 3);\n  std::array<int64_t, 3> ans_shape{static_cast<int64_t>(sorted_indexes.size()),\n                                   shape[1], shape[2]};\n\n  Ort::Value ans = Ort::Value::CreateTensor<float>(allocator, ans_shape.data(),\n                                                   ans_shape.size());\n  float *dst = ans.GetTensorMutableData<float>();\n  const float *src = value->GetTensorData<float>();\n\n  for (auto i : sorted_indexes) {\n    const float *start = src + i * shape[1] * shape[2];\n    std::copy(start, start + shape[1] * shape[2], dst);\n    dst += shape[1] * shape[2];\n  }\n  return ans;\n}\n\nPackedSequence PackPaddedSequence(OrtAllocator *allocator,\n                                  const Ort::Value *value, Ort::Value *length) {\n  std::vector<int64_t> v_shape = value->GetTensorTypeAndShapeInfo().GetShape();\n  std::vector<int64_t> l_shape = length->GetTensorTypeAndShapeInfo().GetShape();\n\n  assert(v_shape.size() == 3);\n  assert(l_shape.size() == 1);\n  assert(v_shape[0] == l_shape[0]);\n\n  std::vector<int32_t> indexes(v_shape[0]);\n  std::iota(indexes.begin(), indexes.end(), 0);\n\n  const int64_t *p_length = length->GetTensorData<int64_t>();\n  // sort in descending order\n  std::sort(indexes.begin(), indexes.end(), [p_length](int32_t i, int32_t j) {\n    return p_length[i] > p_length[j];\n  });\n\n  int32_t n = 
static_cast<int32_t>(v_shape[0]);\n\n  int64_t max_T = p_length[indexes[0]];\n\n  auto sum_T = std::accumulate(p_length, p_length + n, static_cast<int64_t>(0));\n\n  std::array<int64_t, 2> data_shape{sum_T, v_shape[2]};\n\n  Ort::Value data = Ort::Value::CreateTensor<float>(\n      allocator, data_shape.data(), data_shape.size());\n  float *dst = data.GetTensorMutableData<float>();\n\n  Ort::Value tensor = IndexSelect(allocator, value, indexes);\n  tensor = Transpose01(allocator, &tensor);\n\n  // batch size at each time step\n  std::vector<int32_t> batch_sizes;\n  batch_sizes.reserve(max_T);\n\n  int64_t prev_l = 0;\n  for (int32_t i = 0; i != n; ++i) {\n    auto cur_l = p_length[indexes[n - 1 - i]];\n    assert(cur_l >= prev_l);\n    if (cur_l == prev_l) {\n      continue;\n    }\n\n    auto cur_batch_size = n - i;\n\n    Ort::Value cur_batch =\n        Slice(allocator, &tensor, prev_l, cur_l, 0, cur_batch_size);\n    auto count = cur_batch.GetTensorTypeAndShapeInfo().GetElementCount();\n    const float *src = cur_batch.GetTensorData<float>();\n    std::copy(src, src + count, dst);\n    dst += count;\n\n    for (int32_t j = prev_l; j < cur_l; ++j) {\n      batch_sizes.push_back(cur_batch_size);\n    }\n\n    prev_l = cur_l;\n  }\n\n  PackedSequence packed_seq;\n  packed_seq.sorted_indexes = std::move(indexes);\n  packed_seq.data = std::move(data);\n  packed_seq.batch_sizes = std::move(batch_sizes);\n\n  return packed_seq;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/packed-sequence.h",
    "content": "// sherpa-onnx/csrc/packed-sequence.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_PACKED_SEQUENCE_H_\n#define SHERPA_ONNX_CSRC_PACKED_SEQUENCE_H_\n\n#include <array>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nstruct PackedSequence {\n  std::vector<int32_t> sorted_indexes;\n  std::vector<int32_t> batch_sizes;\n\n  // data is a 2-D tensor of shape (sum(batch_sizes), channels)\n  Ort::Value data{nullptr};\n\n  // Return a shallow copy of data[start:start+size, :]\n  Ort::Value Get(int32_t start, int32_t size) {\n    auto shape = data.GetTensorTypeAndShapeInfo().GetShape();\n\n    std::array<int64_t, 2> ans_shape{size, shape[1]};\n\n    float *p = data.GetTensorMutableData<float>();\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    // a shallow copy\n    return Ort::Value::CreateTensor(memory_info, p + start * shape[1],\n                                    size * shape[1], ans_shape.data(),\n                                    ans_shape.size());\n  }\n};\n\n/** Similar to torch.nn.utils.rnn.pack_padded_sequence but it supports only\n * batch_first=true.\n *\n * @param allocator\n * @param value  A 3-D tensor of shape (B, T, C). Its dtype is float.\n * @param length A 1-D tensor of shape (B,). Its dtype is int64_t. Each\n *               element in it specifies the valid length of the corresponding\n *               entry in value before padding.\n */\nPackedSequence PackPaddedSequence(OrtAllocator *allocator,\n                                  const Ort::Value *value, Ort::Value *length);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PACKED_SEQUENCE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/pad-sequence-test.cc",
    "content": "// sherpa-onnx/csrc/pad-sequence-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n\n#include <numeric>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(PadSequence, ThreeTensors) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 2> shape1{3, 5};\n  Ort::Value v1 =\n      Ort::Value::CreateTensor<float>(allocator, shape1.data(), shape1.size());\n  float *p1 = v1.GetTensorMutableData<float>();\n  std::iota(p1, p1 + shape1[0] * shape1[1], 0);\n\n  std::array<int64_t, 2> shape2{4, 5};\n  Ort::Value v2 =\n      Ort::Value::CreateTensor<float>(allocator, shape2.data(), shape2.size());\n  float *p2 = v2.GetTensorMutableData<float>();\n  std::iota(p2, p2 + shape2[0] * shape2[1], 0);\n\n  std::array<int64_t, 2> shape3{2, 5};\n  Ort::Value v3 =\n      Ort::Value::CreateTensor<float>(allocator, shape3.data(), shape3.size());\n  float *p3 = v3.GetTensorMutableData<float>();\n  std::iota(p3, p3 + shape3[0] * shape3[1], 0);\n\n  auto ans = PadSequence(allocator, {&v1, &v2, &v3}, -1);\n\n  Print2D(&v1);\n  Print2D(&v2);\n  Print2D(&v3);\n  Print3D(&ans);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/pad-sequence.cc",
    "content": "// sherpa-onnx/csrc/pad-sequence.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/pad-sequence.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nOrt::Value PadSequence(OrtAllocator *allocator,\n                       const std::vector<const Ort::Value *> &values,\n                       float padding_value) {\n  int32_t batch_size = static_cast<int32_t>(values.size());\n\n  std::vector<int64_t> shape0 =\n      values[0]->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape0.size() == 2);\n\n  auto feature_dim = shape0[1];\n  auto max_T = shape0[0];\n\n  for (int32_t i = 1; i != batch_size; ++i) {\n    auto shape = values[i]->GetTensorTypeAndShapeInfo().GetShape();\n\n    assert(shape.size() == 2);\n    assert(shape[1] == feature_dim);\n\n    max_T = std::max(max_T, shape[0]);\n  }\n  std::array<int64_t, 3> ans_shape{batch_size, max_T, feature_dim};\n\n  Ort::Value ans = Ort::Value::CreateTensor<float>(allocator, ans_shape.data(),\n                                                   ans_shape.size());\n  float *dst = ans.GetTensorMutableData<float>();\n  std::fill(dst, dst + batch_size * max_T * feature_dim, padding_value);\n\n  for (const auto *v : values) {\n    const float *src = v->GetTensorData<float>();\n    auto shape = v->GetTensorTypeAndShapeInfo().GetShape();\n    std::copy(src, src + shape[0] * shape[1], dst);\n    dst += max_T * feature_dim;\n  }\n\n  return ans;\n\n  // TODO(fangjun): Check that the returned value is correct.\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/pad-sequence.h",
    "content": "// sherpa-onnx/csrc/pad-sequence.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_PAD_SEQUENCE_H_\n#define SHERPA_ONNX_CSRC_PAD_SEQUENCE_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/** Similar to torch.nn.utils.rnn.pad_sequence but it supports only\n * batch_first=true.\n *\n * @param allocator\n * @param values A list of 2-D tensors. Each tensor's second dimension\n *               must be the same and the data type of each tensor should\n *               be float.\n * @param padding_value Value used for padding. For log-fbank, you usually use\n *                      -23.025850929940457f as the padding value.\n *\n * @return Return a 3-D tensor of shape (B, max_T, C).\n */\nOrt::Value PadSequence(OrtAllocator *allocator,\n                       const std::vector<const Ort::Value *> &values,\n                       float padding_value);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PAD_SEQUENCE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/parse-options.cc",
    "content": "// sherpa-onnx/csrc/parse-options.cc\n/**\n * Copyright 2009-2011  Karel Vesely;  Microsoft Corporation;\n *                      Saarland University (Author: Arnab Ghoshal);\n * Copyright 2012-2013  Johns Hopkins University (Author: Daniel Povey);\n *                      Frantisek Skala;  Arnab Ghoshal\n * Copyright 2013       Tanel Alumae\n */\n\n// This file is copied and modified from kaldi/src/util/parse-options.cu\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\n#include <algorithm>\n#include <array>\n#include <cctype>\n#include <cstring>\n#include <fstream>\n#include <iomanip>\n#include <string>\n\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nParseOptions::ParseOptions(const std::string &prefix, ParseOptions *po)\n    : print_args_(false), help_(false), usage_(\"\"), argc_(0), argv_(nullptr) {\n  if (po != nullptr && po->other_parser_ != nullptr) {\n    // we get here if this constructor is used twice, recursively.\n    other_parser_ = po->other_parser_;\n  } else {\n    other_parser_ = po;\n  }\n  if (po != nullptr && !po->prefix_.empty()) {\n    prefix_ = po->prefix_ + std::string(\".\") + prefix;\n  } else {\n    prefix_ = prefix;\n  }\n}\n\nvoid ParseOptions::Register(const std::string &name, bool *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, int32_t *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, int64_t *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, uint32_t *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, float 
*ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, double *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\nvoid ParseOptions::Register(const std::string &name, std::string *ptr,\n                            const std::string &doc) {\n  RegisterTmpl(name, ptr, doc);\n}\n\n// old-style, used for registering application-specific parameters\ntemplate <typename T>\nvoid ParseOptions::RegisterTmpl(const std::string &name, T *ptr,\n                                const std::string &doc) {\n  if (other_parser_ == nullptr) {\n    this->RegisterCommon(name, ptr, doc, false);\n  } else {\n    SHERPA_ONNX_CHECK(prefix_ != \"\")\n        << \"prefix: \" << prefix_ << \"\\n\"\n        << \"Cannot use empty prefix when registering with prefix.\";\n    std::string new_name = prefix_ + '.' + name;  // name becomes prefix.name\n    other_parser_->Register(new_name, ptr, doc);\n  }\n}\n\n// does the common part of the job of registering a parameter\ntemplate <typename T>\nvoid ParseOptions::RegisterCommon(const std::string &name, T *ptr,\n                                  const std::string &doc, bool is_standard) {\n  SHERPA_ONNX_CHECK(ptr != nullptr);\n  std::string idx = name;\n  NormalizeArgName(&idx);\n  if (doc_map_.find(idx) != doc_map_.end()) {\n    SHERPA_ONNX_LOGE(\"Registering option twice, ignoring second time: %s\",\n                     name.c_str());\n  } else {\n    this->RegisterSpecific(name, idx, ptr, doc, is_standard);\n  }\n}\n\n// used to register standard parameters (those that are present in all of the\n// applications)\ntemplate <typename T>\nvoid ParseOptions::RegisterStandard(const std::string &name, T *ptr,\n                                    const std::string &doc) {\n  this->RegisterCommon(name, ptr, doc, true);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                          
          const std::string &idx, bool *b,\n                                    const std::string &doc, bool is_standard) {\n  bool_map_[idx] = b;\n  doc_map_[idx] =\n      DocInfo(name, doc + \" (bool, default = \" + ((*b) ? \"true)\" : \"false)\"),\n              is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, int32_t *i,\n                                    const std::string &doc, bool is_standard) {\n  int_map_[idx] = i;\n  std::ostringstream ss;\n  ss << doc << \" (int, default = \" << *i << \")\";\n  doc_map_[idx] = DocInfo(name, ss.str(), is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, int64_t *i,\n                                    const std::string &doc, bool is_standard) {\n  int64_map_[idx] = i;\n  std::ostringstream ss;\n  ss << doc << \" (int64, default = \" << *i << \")\";\n  doc_map_[idx] = DocInfo(name, ss.str(), is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, uint32_t *u,\n                                    const std::string &doc, bool is_standard) {\n  uint_map_[idx] = u;\n  std::ostringstream ss;\n  ss << doc << \" (uint, default = \" << *u << \")\";\n  doc_map_[idx] = DocInfo(name, ss.str(), is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, float *f,\n                                    const std::string &doc, bool is_standard) {\n  float_map_[idx] = f;\n  std::ostringstream ss;\n  ss << doc << \" (float, default = \" << *f << \")\";\n  doc_map_[idx] = DocInfo(name, ss.str(), is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, double *f,\n                                    const 
std::string &doc, bool is_standard) {\n  double_map_[idx] = f;\n  std::ostringstream ss;\n  ss << doc << \" (double, default = \" << *f << \")\";\n  doc_map_[idx] = DocInfo(name, ss.str(), is_standard);\n}\n\nvoid ParseOptions::RegisterSpecific(const std::string &name,\n                                    const std::string &idx, std::string *s,\n                                    const std::string &doc, bool is_standard) {\n  string_map_[idx] = s;\n  doc_map_[idx] =\n      DocInfo(name, doc + \" (string, default = \\\"\" + *s + \"\\\")\", is_standard);\n}\n\nvoid ParseOptions::DisableOption(const std::string &name) {\n  if (argv_ != nullptr) {\n    SHERPA_ONNX_LOGE(\"DisableOption must not be called after calling Read().\");\n    exit(-1);\n  }\n  if (doc_map_.erase(name) == 0) {\n    SHERPA_ONNX_LOGE(\"Option %s was not registered so cannot be disabled: \",\n                     name.c_str());\n    exit(-1);\n  }\n  bool_map_.erase(name);\n  int_map_.erase(name);\n  int64_map_.erase(name);\n  uint_map_.erase(name);\n  float_map_.erase(name);\n  double_map_.erase(name);\n  string_map_.erase(name);\n}\n\nint32_t ParseOptions::NumArgs() const { return positional_args_.size(); }\n\nstd::string ParseOptions::GetArg(int32_t i) const {\n  if (i < 1 || i > static_cast<int32_t>(positional_args_.size())) {\n    SHERPA_ONNX_LOGE(\"ParseOptions::GetArg, invalid index %d\", i);\n    exit(-1);\n  }\n\n  return positional_args_[i - 1];\n}\n\n// We currently do not support any other options.\nenum ShellType : std::uint8_t { kBash = 0 };\n\n// This can be changed in the code if it ever does need to be changed (as it's\n// unlikely that one compilation of this tool-set would use both shells).\nstatic ShellType kShellType = kBash;\n\n// Returns true if we need to escape a string before putting it into\n// a shell (mainly thinking of bash shell, but should work for others)\n// This is for the convenience of the user so command-lines that are\n// printed out by ParseOptions::Read 
(with --print-args=true) are\n// paste-able into the shell and will run. If you use a different type of\n// shell, it might be necessary to change this function.\n// But it's mostly a cosmetic issue as it basically affects how\n// the program echoes its command-line arguments to the screen.\nstatic bool MustBeQuoted(const std::string &str, ShellType st) {\n  // Only Bash is supported (for the moment).\n  SHERPA_ONNX_CHECK_EQ(st, kBash) << \"Invalid shell type.\";\n\n  const char *c = str.c_str();\n  if (*c == '\\0') {\n    return true;  // Must quote empty string\n  } else {\n    std::array<const char *, 2> ok_chars{};\n\n    // These seem not to be interpreted as long as there are no other \"bad\"\n    // characters involved (e.g. \",\" would be interpreted as part of something\n    // like a{b,c}, but not on its own).\n    ok_chars[kBash] = \"[]~#^_-+=:.,/\";\n\n    // Just want to make sure that a space character doesn't get automatically\n    // inserted here via an automated style-checking script, like it did before.\n    SHERPA_ONNX_CHECK(!strchr(ok_chars[kBash], ' '));\n\n    for (; *c != '\\0'; ++c) {\n      // For non-alphanumeric characters we have a list of characters which\n      // are OK. All others are forbidden (this is easier since the shell\n      // interprets most non-alphanumeric characters).\n      // The cast avoids undefined behavior when *c is a negative char\n      // (e.g. a byte of a UTF-8 sequence).\n      if (!isalnum(static_cast<unsigned char>(*c))) {\n        const char *d = nullptr;\n        for (d = ok_chars[st]; *d != '\\0'; ++d) {\n          if (*c == *d) break;\n        }\n        // If not alphanumeric or one of the \"ok_chars\", it must be escaped.\n        if (*d == '\\0') return true;\n      }\n    }\n    return false;  // The string was OK. 
No quoting or escaping.\n  }\n}\n\n// Returns a quoted and escaped version of \"str\"\n// which has previously been determined to need escaping.\n// Our aim is to print out the command line in such a way that if it's\n// pasted into a shell of ShellType \"st\" (only bash for now), it\n// will get passed to the program in the same way.\nstatic std::string QuoteAndEscape(const std::string &str, ShellType /*st*/) {\n  // For now we use the following rules:\n  // In the normal case, we quote with single-quote \"'\", and to escape\n  // a single-quote we use the string: '\\'' (interpreted as closing the\n  // single-quote, putting an escaped single-quote from the shell, and\n  // then reopening the single quote).\n  char quote_char = '\\'';\n  const char *escape_str = \"'\\\\''\";  // e.g. echo 'a'\\''b' returns a'b\n\n  // If the string contains single-quotes that would need escaping this\n  // way, and we determine that the string could be safely double-quoted\n  // without requiring any escaping, then we double-quote the string.\n  // This is the case if the characters \"`$\\ do not appear in the string.\n  // e.g. see http://www.redhat.com/mirrors/LDP/LDP/abs/html/quotingvar.html\n  const char *c_str = str.c_str();\n  if (strchr(c_str, '\\'') && !strpbrk(c_str, \"\\\"`$\\\\\")) {\n    quote_char = '\"';\n    escape_str = \"\\\\\\\"\";  // should never be accessed.\n  }\n\n  std::array<char, 2> buf{};\n  buf[1] = '\\0';\n\n  buf[0] = quote_char;\n  std::string ans = buf.data();\n  const char *c = str.c_str();\n  for (; *c != '\\0'; ++c) {\n    if (*c == quote_char) {\n      ans += escape_str;\n    } else {\n      buf[0] = *c;\n      ans += buf.data();\n    }\n  }\n  buf[0] = quote_char;\n  ans += buf.data();\n  return ans;\n}\n\n// static function\nstd::string ParseOptions::Escape(const std::string &str) {\n  return MustBeQuoted(str, kShellType) ? 
QuoteAndEscape(str, kShellType) : str;\n}\n\nint32_t ParseOptions::Read(int32_t argc, const char *const *argv) {\n  argc_ = argc;\n  argv_ = argv;\n  std::string key, value;\n  int32_t i = 0;\n\n  // first pass: look for config parameter, look for priority\n  for (i = 1; i < argc; ++i) {\n    if (std::strncmp(argv[i], \"--\", 2) == 0) {\n      if (std::strcmp(argv[i], \"--\") == 0) {\n        // a lone \"--\" marks the end of named options\n        break;\n      }\n      bool has_equal_sign = false;\n      SplitLongArg(argv[i], &key, &value, &has_equal_sign);\n      NormalizeArgName(&key);\n      Trim(&value);\n      if (key == \"config\") {\n        ReadConfigFile(value);\n      } else if (key == \"help\") {\n        PrintUsage();\n        exit(0);\n      }\n    }\n  }\n\n  bool double_dash_seen = false;\n  // second pass: add the command line options\n  for (i = 1; i < argc; ++i) {\n    if (std::strncmp(argv[i], \"--\", 2) == 0) {\n      if (std::strcmp(argv[i], \"--\") == 0) {\n        // A lone \"--\" marks the end of named options.\n        // Skip that option and break the processing of named options\n        i += 1;\n        double_dash_seen = true;\n        break;\n      }\n      bool has_equal_sign = false;\n      SplitLongArg(argv[i], &key, &value, &has_equal_sign);\n      NormalizeArgName(&key);\n      Trim(&value);\n      if (!SetOption(key, value, has_equal_sign)) {\n        PrintUsage(true);\n        SHERPA_ONNX_LOGE(\"Invalid option %s\", argv[i]);\n        exit(-1);\n      }\n    } else {\n      break;\n    }\n  }\n\n  // process remaining arguments as positional\n  for (; i < argc; ++i) {\n    if ((std::strcmp(argv[i], \"--\") == 0) && !double_dash_seen) {\n      double_dash_seen = true;\n    } else {\n      positional_args_.emplace_back(argv[i]);\n    }\n  }\n\n  // if the user did not suppress this with --print-args = false....\n  if (print_args_) {\n    std::ostringstream strm;\n    for (int32_t j = 0; j < argc; ++j) strm << Escape(argv[j]) << 
\" \";\n    strm << '\\n';\n    SHERPA_ONNX_LOGE(\"%s\", strm.str().c_str());\n  }\n  return i;\n}\n\nvoid ParseOptions::PrintUsage(bool print_command_line /*=false*/) const {\n  std::ostringstream os;\n  os << '\\n' << usage_ << '\\n';\n  // first we print application-specific options\n  bool app_specific_header_printed = false;\n  for (const auto &it : doc_map_) {\n    if (it.second.is_standard_ == false) {  // application-specific option\n      if (app_specific_header_printed == false) {  // header was not yet printed\n        os << \"Options:\" << '\\n';\n        app_specific_header_printed = true;\n      }\n      os << \"  --\" << std::setw(25) << std::left << it.second.name_ << \" : \"\n         << it.second.use_msg_ << '\\n';\n    }\n  }\n  if (app_specific_header_printed == true) {\n    os << '\\n';\n  }\n\n  // then the standard options\n  os << \"Standard options:\" << '\\n';\n  for (const auto &it : doc_map_) {\n    if (it.second.is_standard_ == true) {  // we have standard option\n      os << \"  --\" << std::setw(25) << std::left << it.second.name_ << \" : \"\n         << it.second.use_msg_ << '\\n';\n    }\n  }\n  os << '\\n';\n  if (print_command_line) {\n    std::ostringstream strm;\n    strm << \"Command line was: \";\n    for (int32_t j = 0; j < argc_; ++j) strm << Escape(argv_[j]) << \" \";\n    strm << '\\n';\n    os << strm.str();\n  }\n\n  SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n}\n\nvoid ParseOptions::PrintConfig(std::ostream &os) const {\n  os << '\\n' << \"[[ Configuration of UI-Registered options ]]\" << '\\n';\n  std::string key;\n  for (const auto &it : doc_map_) {\n    key = it.first;\n    os << it.second.name_ << \" = \";\n    if (bool_map_.end() != bool_map_.find(key)) {\n      os << (*bool_map_.at(key) ? 
\"true\" : \"false\");\n    } else if (int_map_.end() != int_map_.find(key)) {\n      os << (*int_map_.at(key));\n    } else if (int64_map_.end() != int64_map_.find(key)) {\n      os << (*int64_map_.at(key));\n    } else if (uint_map_.end() != uint_map_.find(key)) {\n      os << (*uint_map_.at(key));\n    } else if (float_map_.end() != float_map_.find(key)) {\n      os << (*float_map_.at(key));\n    } else if (double_map_.end() != double_map_.find(key)) {\n      os << (*double_map_.at(key));\n    } else if (string_map_.end() != string_map_.find(key)) {\n      os << \"'\" << *string_map_.at(key) << \"'\";\n    } else {\n      SHERPA_ONNX_LOGE(\"PrintConfig: unrecognized option %s [code error]\",\n                       key.c_str());\n      exit(-1);\n    }\n    os << '\\n';\n  }\n  os << '\\n';\n}\n\nvoid ParseOptions::ReadConfigFile(const std::string &filename) {\n  std::ifstream is(filename.c_str(), std::ifstream::in);\n  if (!is.good()) {\n    SHERPA_ONNX_LOGE(\"Cannot open config file: %s\", filename.c_str());\n    exit(-1);\n  }\n\n  std::string line, key, value;\n  int32_t line_number = 0;\n  while (std::getline(is, line)) {\n    ++line_number;\n    // trim out the comments\n    size_t pos = line.find_first_of('#');\n    if (pos != std::string::npos) {\n      line.erase(pos);\n    }\n    // skip empty lines\n    Trim(&line);\n    if (line.empty()) continue;\n\n    if (line.substr(0, 2) != \"--\") {\n      SHERPA_ONNX_LOGE(\n          \"Reading config file %s: line %d does not look like a line \"\n          \"from a sherpa-onnx command-line program's config file: should \"\n          \"be of the form --x=y.  
Note: config files intended to \"\n          \"be sourced by shell scripts lack the '--'.\",\n          filename.c_str(), line_number);\n      exit(-1);\n    }\n\n    // parse option\n    bool has_equal_sign = false;\n    SplitLongArg(line, &key, &value, &has_equal_sign);\n    NormalizeArgName(&key);\n    Trim(&value);\n    if (!SetOption(key, value, has_equal_sign)) {\n      PrintUsage(true);\n      SHERPA_ONNX_LOGE(\"Invalid option %s in config file %s: line %d\",\n                       line.c_str(), filename.c_str(), line_number);\n      exit(-1);\n    }\n  }\n}\n\nvoid ParseOptions::SplitLongArg(const std::string &in, std::string *key,\n                                std::string *value,\n                                bool *has_equal_sign) const {\n  SHERPA_ONNX_CHECK(in.substr(0, 2) == \"--\") << in;  // precondition.\n  size_t pos = in.find_first_of('=', 0);\n  if (pos == std::string::npos) {  // we allow --option for bools\n    // defaults to empty.  We handle this differently in different cases.\n    *key = in.substr(2, in.size() - 2);  // 2 because starts with --.\n    *value = \"\";\n    *has_equal_sign = false;\n  } else if (pos == 2) {  // we also don't allow empty keys: --=value\n    PrintUsage(true);\n    SHERPA_ONNX_LOGE(\"Invalid option (no key): %s\", in.c_str());\n    exit(-1);\n  } else {                         // normal case: --option=value\n    *key = in.substr(2, pos - 2);  // 2 because starts with --.\n    *value = in.substr(pos + 1);\n    *has_equal_sign = true;\n  }\n}\n\nvoid ParseOptions::NormalizeArgName(std::string *str) const {\n  std::string out;\n  std::string::iterator it;\n\n  for (it = str->begin(); it != str->end(); ++it) {\n    if (*it == '_') {\n      out += '-';  // convert _ to -\n    } else {\n      out += std::tolower(*it);\n    }\n  }\n  *str = out;\n\n  SHERPA_ONNX_CHECK_GT(str->length(), 0);\n}\n\nvoid ParseOptions::Trim(std::string *str) const {\n  const char *white_chars = \" \\t\\n\\r\\f\\v\";\n\n  
std::string::size_type pos = str->find_last_not_of(white_chars);\n  if (pos != std::string::npos) {\n    str->erase(pos + 1);\n    pos = str->find_first_not_of(white_chars);\n    if (pos != std::string::npos) str->erase(0, pos);\n  } else {\n    str->erase(str->begin(), str->end());\n  }\n}\n\nbool ParseOptions::SetOption(const std::string &key, const std::string &value,\n                             bool has_equal_sign) {\n  if (bool_map_.end() != bool_map_.find(key)) {\n    if (has_equal_sign && value.empty()) {\n      SHERPA_ONNX_LOGE(\"Invalid option --%s=\", key.c_str());\n      exit(-1);\n    }\n    *(bool_map_[key]) = ToBool(value);\n  } else if (int_map_.end() != int_map_.find(key)) {\n    *(int_map_[key]) = ToInt(value);\n  } else if (int64_map_.end() != int64_map_.find(key)) {\n    *(int64_map_[key]) = ToInt64(value);\n  } else if (uint_map_.end() != uint_map_.find(key)) {\n    *(uint_map_[key]) = ToUint(value);\n  } else if (float_map_.end() != float_map_.find(key)) {\n    *(float_map_[key]) = ToFloat(value);\n  } else if (double_map_.end() != double_map_.find(key)) {\n    *(double_map_[key]) = ToDouble(value);\n  } else if (string_map_.end() != string_map_.find(key)) {\n    if (!has_equal_sign) {\n      SHERPA_ONNX_LOGE(\"Invalid option --%s (option format is --x=y).\",\n                       key.c_str());\n      exit(-1);\n    }\n    *(string_map_[key]) = value;\n  } else {\n    return false;\n  }\n  return true;\n}\n\nbool ParseOptions::ToBool(std::string str) const {\n  std::transform(str.begin(), str.end(), str.begin(), ::tolower);\n\n  // allow \"\" as a valid option for \"true\", so that --x is the same as --x=true\n  if (str == \"true\" || str == \"t\" || str == \"1\" || str.empty()) {\n    return true;\n  }\n  if (str == \"false\" || str == \"f\" || str == \"0\") {\n    return false;\n  }\n  // if it is neither true nor false:\n  PrintUsage(true);\n  SHERPA_ONNX_LOGE(\n      \"Invalid format for boolean argument [expected true or false]: 
%s\",\n      str.c_str());\n  exit(-1);\n  return false;  // never reached\n}\n\nint32_t ParseOptions::ToInt(const std::string &str) const {\n  int32_t ret = 0;\n  if (!ConvertStringToInteger(str, &ret)) {\n    SHERPA_ONNX_LOGE(\"Invalid integer option \\\"%s\\\"\", str.c_str());\n    exit(-1);\n  }\n  return ret;\n}\n\nint64_t ParseOptions::ToInt64(const std::string &str) const {\n  int64_t ret = 0;\n  if (!ConvertStringToInteger(str, &ret)) {\n    SHERPA_ONNX_LOGE(\"Invalid integer int64 option \\\"%s\\\"\", str.c_str());\n    exit(-1);\n  }\n  return ret;\n}\n\nuint32_t ParseOptions::ToUint(const std::string &str) const {\n  uint32_t ret = 0;\n  if (!ConvertStringToInteger(str, &ret)) {\n    SHERPA_ONNX_LOGE(\"Invalid integer option \\\"%s\\\"\", str.c_str());\n    exit(-1);\n  }\n  return ret;\n}\n\nfloat ParseOptions::ToFloat(const std::string &str) const {\n  float ret = 0;\n  if (!ConvertStringToReal(str, &ret)) {\n    SHERPA_ONNX_LOGE(\"Invalid floating-point option \\\"%s\\\"\", str.c_str());\n    exit(-1);\n  }\n  return ret;\n}\n\ndouble ParseOptions::ToDouble(const std::string &str) const {\n  double ret = 0;\n  if (!ConvertStringToReal(str, &ret)) {\n    SHERPA_ONNX_LOGE(\"Invalid floating-point option \\\"%s\\\"\", str.c_str());\n    exit(-1);\n  }\n  return ret;\n}\n\n// instantiate templates\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, bool *ptr,\n                                         const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, int32_t *ptr,\n                                         const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, int64_t *ptr,\n                                         const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, uint32_t *ptr,\n                                         const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, float *ptr,\n 
                                        const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name, double *ptr,\n                                         const std::string &doc);\ntemplate void ParseOptions::RegisterTmpl(const std::string &name,\n                                         std::string *ptr,\n                                         const std::string &doc);\n\ntemplate void ParseOptions::RegisterStandard(const std::string &name, bool *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             int32_t *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             int64_t *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             uint32_t *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             float *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             double *ptr,\n                                             const std::string &doc);\ntemplate void ParseOptions::RegisterStandard(const std::string &name,\n                                             std::string *ptr,\n                                             const std::string &doc);\n\ntemplate void ParseOptions::RegisterCommon(const std::string &name, bool *ptr,\n                                           const std::string &doc,\n                                           bool is_standard);\ntemplate 
void ParseOptions::RegisterCommon(const std::string &name,\n                                           int32_t *ptr, const std::string &doc,\n                                           bool is_standard);\ntemplate void ParseOptions::RegisterCommon(const std::string &name,\n                                           int64_t *ptr, const std::string &doc,\n                                           bool is_standard);\ntemplate void ParseOptions::RegisterCommon(const std::string &name,\n                                           uint32_t *ptr,\n                                           const std::string &doc,\n                                           bool is_standard);\ntemplate void ParseOptions::RegisterCommon(const std::string &name, float *ptr,\n                                           const std::string &doc,\n                                           bool is_standard);\ntemplate void ParseOptions::RegisterCommon(const std::string &name, double *ptr,\n                                           const std::string &doc,\n                                           bool is_standard);\ntemplate void ParseOptions::RegisterCommon(const std::string &name,\n                                           std::string *ptr,\n                                           const std::string &doc,\n                                           bool is_standard);\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/parse-options.h",
    "content": "// sherpa-onnx/csrc/parse-options.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n//\n// This file is copied and modified from kaldi/src/util/parse-options.h\n\n#ifndef SHERPA_ONNX_CSRC_PARSE_OPTIONS_H_\n#define SHERPA_ONNX_CSRC_PARSE_OPTIONS_H_\n\n#include <cstdint>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass ParseOptions {\n public:\n  explicit ParseOptions(const char *usage)\n      : print_args_(true),\n        help_(false),\n        usage_(usage),\n        argc_(0),\n        argv_(nullptr),\n        prefix_(\"\"),\n        other_parser_(nullptr) {\n#if !defined(_MSC_VER) && !defined(__CYGWIN__)\n    // This is just a convenient place to set the stderr to line\n    // buffering mode, since it's called at program start.\n    // This helps ensure different programs' output is not mixed up.\n    setlinebuf(stderr);\n#endif\n    RegisterStandard(\"config\", &config_,\n                     \"Configuration file to read (this \"\n                     \"option may be repeated)\");\n    RegisterStandard(\"print-args\", &print_args_,\n                     \"Print the command line arguments (to stderr)\");\n    RegisterStandard(\"help\", &help_, \"Print out usage message\");\n  }\n\n  /**\n    This is a constructor for the special case where some options are\n    registered with a prefix to avoid conflicts.  The object thus created will\n    only be used temporarily to register an options class with the original\n    options parser (which is passed as the *other pointer) using the given\n    prefix.  It should not be used for any other purpose, and the prefix must\n    not be the empty string.  
It seems to be the least bad way of implementing\n    options with prefixes at this point.\n    Example of usage is:\n     ParseOptions po;  // original ParseOptions object\n     ParseOptions po_mfcc(\"mfcc\", &po); // object with prefix.\n     MfccOptions mfcc_opts;\n     mfcc_opts.Register(&po_mfcc);\n    The options will now get registered as, e.g., --mfcc.frame-shift=10.0\n    instead of just --frame-shift=10.0\n   */\n  ParseOptions(const std::string &prefix, ParseOptions *other);\n\n  ParseOptions(const ParseOptions &) = delete;\n  ParseOptions &operator=(const ParseOptions &) = delete;\n  ~ParseOptions() = default;\n\n  void Register(const std::string &name, bool *ptr, const std::string &doc);\n  void Register(const std::string &name, int32_t *ptr, const std::string &doc);\n  void Register(const std::string &name, int64_t *ptr, const std::string &doc);\n  void Register(const std::string &name, uint32_t *ptr, const std::string &doc);\n  void Register(const std::string &name, float *ptr, const std::string &doc);\n  void Register(const std::string &name, double *ptr, const std::string &doc);\n  void Register(const std::string &name, std::string *ptr,\n                const std::string &doc);\n\n  /// If called after registering an option and before calling\n  /// Read(), disables that option from being used.  Will crash\n  /// at runtime if that option had not been registered.\n  void DisableOption(const std::string &name);\n\n  /// This one is used for registering standard parameters of all the programs\n  template <typename T>\n  void RegisterStandard(const std::string &name, T *ptr,\n                        const std::string &doc);\n\n  /**\n    Parses the command line options and fills the ParseOptions-registered\n    variables. 
This must be called after all the variables have been registered.\n\n    Initially the variables have their default values,\n    then the config file values are applied,\n    and finally the values given on the command line.\n    Returns the first position in argv that was not used.\n    [typically not useful: use NumArgs() and GetArg().]\n   */\n  int Read(int argc, const char *const *argv);\n\n  /// Prints the usage documentation [provided in the constructor].\n  void PrintUsage(bool print_command_line = false) const;\n\n  /// Prints the actual configuration of all the registered variables\n  void PrintConfig(std::ostream &os) const;\n\n  /// Reads the options values from a config file.  Must be called after\n  /// registering all options.  This is usually used internally after the\n  /// standard --config option is used, but it may also be called from a\n  /// program.\n  void ReadConfigFile(const std::string &filename);\n\n  /// Number of positional parameters (cf. argc-1).\n  int NumArgs() const;\n\n  /// Returns one of the positional parameters; 1-based indexing for argc/argv\n  /// compatibility. Will crash if param is not >=1 and <=NumArgs().\n  ///\n  /// Note: Index is 1 based.\n  std::string GetArg(int param) const;\n\n  std::string GetOptArg(int param) const {\n    return (param <= NumArgs() ? GetArg(param) : \"\");\n  }\n\n  /// The following function will return a possibly quoted and escaped\n  /// version of \"str\", according to the current shell.  Currently\n  /// this is just hardwired to bash.  
It's useful for debug output.\n  static std::string Escape(const std::string &str);\n\n private:\n  /// Template to register various variable types,\n  /// used for program-specific parameters\n  template <typename T>\n  void RegisterTmpl(const std::string &name, T *ptr, const std::string &doc);\n\n  // Following functions do just the datatype-specific part of the job\n  /// Register boolean variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        bool *b, const std::string &doc, bool is_standard);\n  /// Register int32_t variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        int32_t *i, const std::string &doc, bool is_standard);\n  /// Register int64_t variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        int64_t *i, const std::string &doc, bool is_standard);\n  /// Register unsigned  int32_t variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        uint32_t *u, const std::string &doc, bool is_standard);\n  /// Register float variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        float *f, const std::string &doc, bool is_standard);\n  /// Register double variable [useful as we change BaseFloat type].\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        double *f, const std::string &doc, bool is_standard);\n  /// Register string variable\n  void RegisterSpecific(const std::string &name, const std::string &idx,\n                        std::string *s, const std::string &doc,\n                        bool is_standard);\n\n  /// Does the actual job for both kinds of parameters\n  /// Does the common part of the job for all datatypes,\n  /// then calls RegisterSpecific\n  template <typename T>\n  void RegisterCommon(const std::string &name, T *ptr, const std::string 
&doc,\n                      bool is_standard);\n\n  /// Set option with name \"key\" to \"value\"; will crash if can't do it.\n  /// \"has_equal_sign\" is used to allow --x for a boolean option x,\n  /// and --y=, for a string option y.\n  bool SetOption(const std::string &key, const std::string &value,\n                 bool has_equal_sign);\n\n  bool ToBool(std::string str) const;\n  int32_t ToInt(const std::string &str) const;\n  int64_t ToInt64(const std::string &str) const;\n  uint32_t ToUint(const std::string &str) const;\n  float ToFloat(const std::string &str) const;\n  double ToDouble(const std::string &str) const;\n\n  // maps for option variables\n  std::unordered_map<std::string, bool *> bool_map_;\n  std::unordered_map<std::string, int32_t *> int_map_;\n  std::unordered_map<std::string, int64_t *> int64_map_;\n  std::unordered_map<std::string, uint32_t *> uint_map_;\n  std::unordered_map<std::string, float *> float_map_;\n  std::unordered_map<std::string, double *> double_map_;\n  std::unordered_map<std::string, std::string *> string_map_;\n\n  /**\n     Structure for options' documentation\n   */\n  struct DocInfo {\n    DocInfo() = default;\n    DocInfo(const std::string &name, const std::string &usemsg)\n        : name_(name), use_msg_(usemsg), is_standard_(false) {}\n    DocInfo(const std::string &name, const std::string &usemsg,\n            bool is_standard)\n        : name_(name), use_msg_(usemsg), is_standard_(is_standard) {}\n\n    std::string name_;\n    std::string use_msg_;\n    bool is_standard_;\n  };\n  using DocMapType = std::unordered_map<std::string, DocInfo>;\n  DocMapType doc_map_;  ///< map for the documentation\n\n  bool print_args_;     ///< variable for the implicit --print-args parameter\n  bool help_;           ///< variable for the implicit --help parameter\n  std::string config_;  ///< variable for the implicit --config parameter\n  std::vector<std::string> positional_args_;\n  const char *usage_;\n  int argc_;\n  const 
char *const *argv_;\n\n  /// These members are not normally used. They are only used when the object\n  /// is constructed with a prefix\n  std::string prefix_;\n  ParseOptions *other_parser_;\n\n protected:\n  /// SplitLongArg parses an argument of the form --a=b, --a=, or --a,\n  /// and sets \"has_equal_sign\" to true if an equals-sign was parsed.\n  /// This is needed in order to correctly allow --x for a boolean option\n  /// x, and --y= for a string option y, and to disallow --x= and --y.\n  void SplitLongArg(const std::string &in, std::string *key, std::string *value,\n                    bool *has_equal_sign) const;\n\n  void NormalizeArgName(std::string *str) const;\n\n  /// Removes leading and trailing whitespace from a string\n  void Trim(std::string *str) const;\n};\n\n/// This template is provided for convenience in reading config classes from\n/// files; this is not the standard way to read configuration options, but may\n/// occasionally be needed.  This function assumes the config has a function\n/// \"void Register(ParseOptions *opts)\" which it can call to register its\n/// options with the ParseOptions object.\ntemplate <class C>\nvoid ReadConfigFromFile(const std::string &config_filename, C *c) {\n  // ParseOptions stores the raw usage pointer, so the string must outlive po.\n  const std::string usage = \"Parsing config from '\" + config_filename + \"'\";\n  ParseOptions po(usage.c_str());\n  c->Register(&po);\n  po.ReadConfigFile(config_filename);\n}\n\n/// This variant of the template ReadConfigFromFile is for when you need to\n/// read two config classes from the same file.\ntemplate <class C1, class C2>\nvoid ReadConfigsFromFile(const std::string &conf, C1 *c1, C2 *c2) {\n  // ParseOptions stores the raw usage pointer, so the string must outlive po.\n  const std::string usage = \"Parsing config from '\" + conf + \"'\";\n  ParseOptions po(usage.c_str());\n  c1->Register(&po);\n  c2->Register(&po);\n  po.ReadConfigFile(conf);\n}\n\n}  // namespace sherpa_onnx\n\n#endif  // 
SHERPA_ONNX_CSRC_PARSE_OPTIONS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/phrase-matcher.cc",
    "content": "// sherpa-onnx/csrc/phrase-matcher.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/phrase-matcher.h\"\n\n#include <algorithm>\n#include <sstream>\n#include <string>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\nclass PhraseMatcher::Impl {\n public:\n  Impl(const std::unordered_set<std::string> *lexicon,\n       const std::vector<std::string> &words, bool debug,\n       int32_t max_search_len)\n      : lexicon_(lexicon), max_search_len_(max_search_len), debug_(debug) {\n    if (max_search_len_ < 1) {\n      max_search_len_ = 1;\n    }\n    if (debug_) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"max_search_len %{public}d\", max_search_len_);\n#else\n      SHERPA_ONNX_LOGE(\"max_search_len %d\", max_search_len_);\n#endif\n    }\n\n    Build(words);\n\n    if (debug_) {\n      std::ostringstream os;\n      std::string sep;\n      os << \"After phrase matching: \";\n      for (const auto &p : phrases_) {\n        os << sep << p;\n        sep = \"_\";\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n  }\n\n  auto begin() const { return phrases_.begin(); }\n\n  auto end() const { return phrases_.end(); }\n\n private:\n  void Build(const std::vector<std::string> &words) {\n    int32_t num_words = static_cast<int32_t>(words.size());\n    for (int32_t i = 0; i < num_words;) {\n      int32_t start = i;\n\n      std::string w;\n\n      if (!IsAlphaOrPunct(words[i].front())) {\n        int32_t end = std::min(i + max_search_len_ - 1, num_words - 1);\n\n        while (end > start) {\n          auto this_word = GetWord(words, start, end);\n          if (IsAlphaOrPunct(this_word.back())) {\n            --end;\n            continue;\n          }\n\n          if (debug_) {\n#if __OHOS__\n            
SHERPA_ONNX_LOGE(\"%{public}d-%{public}d: %{public}s\", start, end,\n                             this_word.c_str());\n#else\n            SHERPA_ONNX_LOGE(\"%d-%d: %s\", start, end, this_word.c_str());\n#endif\n          }\n          if (lexicon_->count(this_word)) {\n            i = end + 1;\n            w = std::move(this_word);\n            if (debug_) {\n#if __OHOS__\n              SHERPA_ONNX_LOGE(\"matched %{public}d-%{public}d: %{public}s\",\n                               start, end, w.c_str());\n#else\n              SHERPA_ONNX_LOGE(\"matched %d-%d: %s\", start, end, w.c_str());\n#endif\n            }\n            break;\n          }\n\n          end -= 1;\n        }\n      }\n\n      if (w.empty()) {\n        w = words[i];\n\n        if (debug_) {\n#if __OHOS__\n          SHERPA_ONNX_LOGE(\"single word %{public}d-%{public}d: %{public}s\", i,\n                           i, w.c_str());\n#else\n          SHERPA_ONNX_LOGE(\"single word %d-%d: %s\", i, i, w.c_str());\n#endif\n        }\n\n        i += 1;\n      }\n\n      phrases_.push_back(std::move(w));\n    }\n  }\n\n private:\n  std::vector<std::string> phrases_;\n  const std::unordered_set<std::string> *lexicon_;\n  int32_t max_search_len_;\n  bool debug_;\n};\n\nPhraseMatcher::PhraseMatcher(const std::unordered_set<std::string> *lexicon,\n                             const std::vector<std::string> &words,\n                             bool debug /*= false*/,\n                             int32_t max_search_len /*= 10*/)\n    : impl_(std::make_unique<Impl>(lexicon, words, debug, max_search_len)) {}\n\nPhraseMatcher::~PhraseMatcher() = default;\n\nstd::vector<std::string>::const_iterator PhraseMatcher::begin() const {\n  return impl_->begin();\n}\nstd::vector<std::string>::const_iterator PhraseMatcher::end() const {\n  return impl_->end();\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/phrase-matcher.h",
    "content": "// sherpa-onnx/csrc/phrase-matcher.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_PHRASE_MATCHER_H_\n#define SHERPA_ONNX_CSRC_PHRASE_MATCHER_H_\n\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <unordered_set>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass PhraseMatcher {\n public:\n  PhraseMatcher(const std::unordered_set<std::string>\n                    *lexicon,  // Not owned by this instance. The passed lexicon\n                               // should live longer than this instance\n                const std::vector<std::string> &words, bool debug = false,\n                int32_t max_search_len = 10);\n  ~PhraseMatcher();\n\n  std::vector<std::string>::const_iterator begin() const;\n  std::vector<std::string>::const_iterator end() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PHRASE_MATCHER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/piper-phonemize-lexicon.cc",
    "content": "// sherpa-onnx/csrc/piper-phonemize-lexicon.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/piper-phonemize-lexicon.h\"\n\n#include <fstream>\n#include <locale>\n#include <map>\n#include <mutex>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"espeak-ng/speak_lib.h\"\n#include \"phoneme_ids.hpp\"  // NOLINT\n#include \"phonemize.hpp\"    // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\n// Encode a single char32_t to UTF-8 string. For debugging only\nstatic std::string ToString(char32_t cp) {\n  std::string result;\n\n  if (cp <= 0x7F) {\n    result += static_cast<char>(cp);\n  } else if (cp <= 0x7FF) {\n    result += static_cast<char>(0xC0 | ((cp >> 6) & 0x1F));\n    result += static_cast<char>(0x80 | (cp & 0x3F));\n  } else if (cp <= 0xFFFF) {\n    result += static_cast<char>(0xE0 | ((cp >> 12) & 0x0F));\n    result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));\n    result += static_cast<char>(0x80 | (cp & 0x3F));\n  } else if (cp <= 0x10FFFF) {\n    result += static_cast<char>(0xF0 | ((cp >> 18) & 0x07));\n    result += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));\n    result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));\n    result += static_cast<char>(0x80 | (cp & 0x3F));\n  } else {\n    SHERPA_ONNX_LOGE(\"Invalid Unicode code point: %d\",\n                     static_cast<int32_t>(cp));\n  }\n\n  return result;\n}\n\nvoid CallPhonemizeEspeak(const std::string &text,\n                         piper::eSpeakPhonemeConfig &config,  // NOLINT\n                         std::vector<std::vector<piper::Phoneme>> *phonemes) {\n  static 
std::mutex espeak_mutex;\n\n  // Serialize calls into piper::phonemize_eSpeak so that only one thread\n  // runs it at a time\n  std::lock_guard<std::mutex> lock(espeak_mutex);\n\n  piper::phonemize_eSpeak(text, config, *phonemes);\n}\n\nstatic std::unordered_map<char32_t, int32_t> ReadTokens(std::istream &is) {\n  std::unordered_map<char32_t, int32_t> token2id;\n\n  std::string line;\n\n  std::string sym;\n  std::u32string s;\n  int32_t id = 0;\n  while (std::getline(is, line)) {\n    std::istringstream iss(line);\n    iss >> sym;\n    if (iss.eof()) {\n      // Only one field was read: the token is a space and the field is its ID\n      id = atoi(sym.c_str());\n      sym = \" \";\n    } else {\n      iss >> id;\n    }\n\n    // eat the trailing \\r\\n on windows\n    iss >> std::ws;\n    if (!iss.eof()) {\n      SHERPA_ONNX_LOGE(\"Error when reading tokens: %s\", line.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    s = Utf8ToUtf32(sym);\n    if (s.size() != 1) {\n      // for tokens.txt from coqui-ai/TTS, the last token is <BLNK>\n      if (s.size() == 6 && s[0] == '<' && s[1] == 'B' && s[2] == 'L' &&\n          s[3] == 'N' && s[4] == 'K' && s[5] == '>') {\n        continue;\n      }\n\n      SHERPA_ONNX_LOGE(\"Error when reading tokens at Line %s. size: %d\",\n                       line.c_str(), static_cast<int32_t>(s.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    char32_t c = s[0];\n\n    if (token2id.count(c)) {\n      SHERPA_ONNX_LOGE(\"Duplicated token %s. Line %s. 
Existing ID: %d\",\n                       sym.c_str(), line.c_str(), token2id.at(c));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    token2id.insert({c, id});\n  }\n\n  return token2id;\n}\n\n// see the function \"phonemes_to_ids\" from\n// https://github.com/rhasspy/piper/blob/master/notebooks/piper_inference_(ONNX).ipynb\nstatic std::vector<int64_t> PiperPhonemesToIdsVits(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    const std::vector<piper::Phoneme> &phonemes) {\n  // see\n  // https://github.com/rhasspy/piper-phonemize/blob/master/src/phoneme_ids.hpp#L17\n  int32_t pad = token2id.at(U'_');\n  int32_t bos = token2id.at(U'^');\n  int32_t eos = token2id.at(U'$');\n\n  std::vector<int64_t> ans;\n  ans.reserve(phonemes.size());\n\n  ans.push_back(bos);\n  for (auto p : phonemes) {\n    if (token2id.count(p)) {\n      ans.push_back(token2id.at(p));\n      ans.push_back(pad);\n    } else {\n      SHERPA_ONNX_LOGE(\"Skip unknown phonemes. Unicode codepoint: \\\\U+%04x.\",\n                       static_cast<uint32_t>(p));\n    }\n  }\n  ans.push_back(eos);\n\n  return ans;\n}\n\nstatic std::vector<std::vector<int64_t>> PiperPhonemesToIdsMatcha(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    const std::vector<piper::Phoneme> &phonemes, bool use_eos_bos,\n    int32_t max_token_len = 400) {\n  // We set max_token_len to 400 here to fix\n  // https://github.com/k2-fsa/sherpa-onnx/issues/2666\n  std::vector<std::vector<int64_t>> ans;\n  std::vector<int64_t> current;\n\n  int32_t bos = token2id.at(U'^');\n  int32_t eos = token2id.at(U'$');\n\n  if (use_eos_bos) {\n    current.push_back(bos);\n  }\n\n  for (auto p : phonemes) {\n    if (token2id.count(p)) {\n      current.push_back(token2id.at(p));\n    } else {\n      SHERPA_ONNX_LOGE(\"Skip unknown phonemes. 
Unicode codepoint: \\\\U+%04x.\",\n                       static_cast<uint32_t>(p));\n    }\n\n    // Flush the current chunk once it grows past max_token_len\n    if (current.size() > max_token_len + 1) {\n      if (use_eos_bos) {\n        current.push_back(eos);\n      }\n\n      ans.push_back(std::move(current));\n      current.clear();  // reset the moved-from vector before reusing it\n\n      if (use_eos_bos) {\n        current.push_back(bos);\n      }\n    }\n  }  // for (auto p : phonemes)\n\n  if (!current.empty()) {\n    if (use_eos_bos) {\n      if (current.size() > 1) {\n        current.push_back(eos);\n\n        ans.push_back(std::move(current));\n      }\n    } else {\n      ans.push_back(std::move(current));\n    }\n  }\n\n  return ans;\n}\n\nstatic std::vector<std::vector<int64_t>> PiperPhonemesToIdsKokoroOrKitten(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    const std::vector<piper::Phoneme> &phonemes, int32_t max_len) {\n  std::vector<std::vector<int64_t>> ans;\n\n  std::vector<int64_t> current;\n  current.reserve(phonemes.size());\n\n  current.push_back(0);\n\n  for (auto p : phonemes) {\n    // SHERPA_ONNX_LOGE(\"%d %s\", static_cast<int32_t>(p), ToString(p).c_str());\n    if (token2id.count(p)) {\n      if (current.size() > max_len - 1) {\n        current.push_back(0);\n        ans.push_back(std::move(current));\n        current.clear();  // reset the moved-from vector before reusing it\n\n        current.reserve(phonemes.size());\n        current.push_back(0);\n      }\n\n      current.push_back(token2id.at(p));\n      if (p == '.') {\n        current.push_back(token2id.at(' '));\n      }\n    } else {\n      SHERPA_ONNX_LOGE(\"Skip unknown phonemes. 
Unicode codepoint: \\\\U+%04x.\",\n                       static_cast<uint32_t>(p));\n    }\n  }\n\n  current.push_back(0);\n  ans.push_back(std::move(current));\n  return ans;\n}\n\nstatic std::vector<int64_t> CoquiPhonemesToIds(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    const std::vector<piper::Phoneme> &phonemes,\n    const OfflineTtsVitsModelMetaData &vits_meta_data) {\n  // see\n  // https://github.com/coqui-ai/TTS/blob/dev/TTS/tts/utils/text/tokenizer.py#L87\n  int32_t use_eos_bos = vits_meta_data.use_eos_bos;\n  int32_t bos_id = vits_meta_data.bos_id;\n  int32_t eos_id = vits_meta_data.eos_id;\n  int32_t blank_id = vits_meta_data.blank_id;\n  int32_t add_blank = vits_meta_data.add_blank;\n  int32_t comma_id = token2id.at(',');\n\n  std::vector<int64_t> ans;\n  if (add_blank) {\n    ans.reserve(phonemes.size() * 2 + 3);\n  } else {\n    ans.reserve(phonemes.size() + 2);\n  }\n\n  if (use_eos_bos) {\n    ans.push_back(bos_id);\n  }\n\n  if (add_blank) {\n    ans.push_back(blank_id);\n\n    for (auto p : phonemes) {\n      if (token2id.count(p)) {\n        ans.push_back(token2id.at(p));\n        ans.push_back(blank_id);\n      } else {\n        SHERPA_ONNX_LOGE(\"Skip unknown phonemes. Unicode codepoint: \\\\U+%04x.\",\n                         static_cast<uint32_t>(p));\n      }\n    }\n  } else {\n    // not adding blank\n    for (auto p : phonemes) {\n      if (token2id.count(p)) {\n        ans.push_back(token2id.at(p));\n      } else {\n        SHERPA_ONNX_LOGE(\"Skip unknown phonemes. 
Unicode codepoint: \\\\U+%04x.\",\n                         static_cast<uint32_t>(p));\n      }\n    }\n  }\n\n  // add a comma at the end of a sentence so that we can have a longer pause.\n  ans.push_back(comma_id);\n\n  if (use_eos_bos) {\n    ans.push_back(eos_id);\n  }\n\n  return ans;\n}\n\nvoid InitEspeak(const std::string &data_dir) {\n  static std::once_flag init_flag;\n  std::call_once(init_flag, [data_dir]() {\n#if __ANDROID_API__ >= 9 || defined(__OHOS__)\n    if (data_dir[0] != '/') {\n      SHERPA_ONNX_LOGE(\n          \"You need to follow our examples to copy the espeak-ng-data \"\n          \"directory from the assets folder to an external storage directory.\");\n\n      SHERPA_ONNX_LOGE(\n          \"Hint: Please see\\n\"\n          \"https://github.com/k2-fsa/sherpa-onnx/blob/master/android/\"\n          \"SherpaOnnxTtsEngine/app/src/main/java/com/k2fsa/sherpa/onnx/tts/\"\n          \"engine/TtsEngine.kt#L188\\n\"\n          \"The function copyDataDir()\\n\");\n    }\n#endif\n\n    int32_t result =\n        espeak_Initialize(AUDIO_OUTPUT_SYNCHRONOUS, 0, data_dir.c_str(), 0);\n    if (result != 22050) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to initialize espeak-ng with data dir: %s. 
Return code is: \"\n          \"%d\",\n          data_dir.c_str(), result);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  });\n}\n\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsVitsModelMetaData &vits_meta_data)\n    : vits_meta_data_(vits_meta_data) {\n  {\n    std::ifstream is(tokens);\n    token2id_ = ReadTokens(is);\n  }\n\n  InitEspeak(data_dir);\n}\n\ntemplate <typename Manager>\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    Manager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsVitsModelMetaData &vits_meta_data)\n    : vits_meta_data_(vits_meta_data) {\n  {\n    auto buf = ReadFile(mgr, tokens);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    token2id_ = ReadTokens(is);\n  }\n\n  // We should copy the directory of espeak-ng-data from the asset to\n  // some internal or external storage and then pass the directory to\n  // data_dir.\n  InitEspeak(data_dir);\n}\n\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsMatchaModelMetaData &matcha_meta_data)\n    : matcha_meta_data_(matcha_meta_data), is_matcha_(true) {\n  {\n    std::ifstream is(tokens);\n    token2id_ = ReadTokens(is);\n  }\n\n  InitEspeak(data_dir);\n}\n\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKokoroModelMetaData &kokoro_meta_data)\n    : kokoro_meta_data_(kokoro_meta_data), is_kokoro_(true) {\n  {\n    std::ifstream is(tokens);\n    token2id_ = ReadTokens(is);\n  }\n\n  InitEspeak(data_dir);\n}\n\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKittenModelMetaData &kitten_meta_data)\n    : kitten_meta_data_(kitten_meta_data), is_kitten_(true) {\n  {\n    std::ifstream is(tokens);\n    token2id_ = ReadTokens(is);\n  
}\n\n  InitEspeak(data_dir);\n}\n\ntemplate <typename Manager>\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    Manager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsMatchaModelMetaData &matcha_meta_data)\n    : matcha_meta_data_(matcha_meta_data), is_matcha_(true) {\n  {\n    auto buf = ReadFile(mgr, tokens);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    token2id_ = ReadTokens(is);\n  }\n\n  // We should copy the directory of espeak-ng-data from the asset to\n  // some internal or external storage and then pass the directory to\n  // data_dir.\n  InitEspeak(data_dir);\n}\n\ntemplate <typename Manager>\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    Manager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKokoroModelMetaData &kokoro_meta_data)\n    : kokoro_meta_data_(kokoro_meta_data), is_kokoro_(true) {\n  {\n    auto buf = ReadFile(mgr, tokens);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    token2id_ = ReadTokens(is);\n  }\n\n  // We should copy the directory of espeak-ng-data from the asset to\n  // some internal or external storage and then pass the directory to\n  // data_dir.\n  InitEspeak(data_dir);\n}\n\ntemplate <typename Manager>\nPiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    Manager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKittenModelMetaData &kitten_meta_data)\n    : kitten_meta_data_(kitten_meta_data), is_kitten_(true) {\n  {\n    auto buf = ReadFile(mgr, tokens);\n    std::istringstream is(std::string(buf.data(), buf.size()));\n    token2id_ = ReadTokens(is);\n  }\n\n  // We should copy the directory of espeak-ng-data from the asset to\n  // some internal or external storage and then pass the directory to\n  // data_dir.\n  InitEspeak(data_dir);\n}\n\nstd::vector<TokenIDs> PiperPhonemizeLexicon::ConvertTextToTokenIds(\n    const std::string &text, const std::string &voice /*= 
\"\"*/) const {\n  if (is_matcha_) {\n    return ConvertTextToTokenIdsMatcha(text, voice);\n  } else if (is_kokoro_) {\n    return ConvertTextToTokenIdsKokoroOrKitten(\n        token2id_, kokoro_meta_data_.max_token_len, text, voice);\n  } else if (is_kitten_) {\n    return ConvertTextToTokenIdsKokoroOrKitten(\n        token2id_, kitten_meta_data_.max_token_len, text, voice);\n  } else {\n    return ConvertTextToTokenIdsVits(text, voice);\n  }\n}\n\nstd::vector<TokenIDs> PiperPhonemizeLexicon::ConvertTextToTokenIdsMatcha(\n    const std::string &text, const std::string &voice /*= \"\"*/) const {\n  piper::eSpeakPhonemeConfig config;\n\n  // ./bin/espeak-ng-bin --path  ./install/share/espeak-ng-data/ --voices\n  // to list available voices\n  config.voice = voice;  // e.g., voice is en-us\n\n  std::vector<std::vector<piper::Phoneme>> phonemes;\n\n  CallPhonemizeEspeak(text, config, &phonemes);\n\n  std::vector<TokenIDs> ans;\n\n  for (const auto &p : phonemes) {\n    auto phoneme_ids =\n        PiperPhonemesToIdsMatcha(token2id_, p, matcha_meta_data_.use_eos_bos);\n\n    for (auto &ids : phoneme_ids) {\n      ans.emplace_back(std::move(ids));\n    }\n  }\n\n  return ans;\n}\n\nstd::vector<TokenIDs> ConvertTextToTokenIdsKokoroOrKitten(\n    const std::unordered_map<char32_t, int32_t> &token2id,\n    int32_t max_token_len, const std::string &text,\n    const std::string &voice /*= \"\"*/) {\n  piper::eSpeakPhonemeConfig config;\n\n  // ./bin/espeak-ng-bin --path  ./install/share/espeak-ng-data/ --voices\n  // to list available voices\n  config.voice = voice;  // e.g., voice is en-us\n\n  std::vector<std::vector<piper::Phoneme>> phonemes;\n\n  CallPhonemizeEspeak(text, config, &phonemes);\n\n  std::vector<TokenIDs> ans;\n\n  for (const auto &p : phonemes) {\n    auto phoneme_ids =\n        PiperPhonemesToIdsKokoroOrKitten(token2id, p, max_token_len);\n\n    for (auto &ids : phoneme_ids) {\n      ans.emplace_back(std::move(ids));\n    }\n  }\n\n  return 
ans;\n}\n\nstd::vector<TokenIDs> PiperPhonemizeLexicon::ConvertTextToTokenIdsVits(\n    const std::string &text, const std::string &voice /*= \"\"*/) const {\n  piper::eSpeakPhonemeConfig config;\n\n  // ./bin/espeak-ng-bin --path  ./install/share/espeak-ng-data/ --voices\n  // to list available voices\n  config.voice = voice;  // e.g., voice is en-us\n\n  std::vector<std::vector<piper::Phoneme>> phonemes;\n\n  CallPhonemizeEspeak(text, config, &phonemes);\n\n  std::vector<TokenIDs> ans;\n\n  std::vector<int64_t> phoneme_ids;\n\n  if (vits_meta_data_.is_piper || vits_meta_data_.is_icefall) {\n    for (const auto &p : phonemes) {\n      phoneme_ids = PiperPhonemesToIdsVits(token2id_, p);\n      ans.emplace_back(std::move(phoneme_ids));\n    }\n  } else if (vits_meta_data_.is_coqui) {\n    for (const auto &p : phonemes) {\n      phoneme_ids = CoquiPhonemesToIds(token2id_, p, vits_meta_data_);\n      ans.emplace_back(std::move(phoneme_ids));\n    }\n\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported model\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  return ans;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    AAssetManager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsVitsModelMetaData &vits_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    AAssetManager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsMatchaModelMetaData &matcha_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    AAssetManager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKokoroModelMetaData &kokoro_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    AAssetManager *mgr, const std::string &tokens, const std::string &data_dir,\n    const OfflineTtsKittenModelMetaData &kitten_meta_data);\n#endif\n\n#if __OHOS__\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    
NativeResourceManager *mgr, const std::string &tokens,\n    const std::string &data_dir,\n    const OfflineTtsVitsModelMetaData &vits_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    NativeResourceManager *mgr, const std::string &tokens,\n    const std::string &data_dir,\n    const OfflineTtsMatchaModelMetaData &matcha_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    NativeResourceManager *mgr, const std::string &tokens,\n    const std::string &data_dir,\n    const OfflineTtsKokoroModelMetaData &kokoro_meta_data);\n\ntemplate PiperPhonemizeLexicon::PiperPhonemizeLexicon(\n    NativeResourceManager *mgr, const std::string &tokens,\n    const std::string &data_dir,\n    const OfflineTtsKittenModelMetaData &kitten_meta_data);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/piper-phonemize-lexicon.h",
    "content": "// sherpa-onnx/csrc/piper-phonemize-lexicon.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_PIPER_PHONEMIZE_LEXICON_H_\n#define SHERPA_ONNX_CSRC_PIPER_PHONEMIZE_LEXICON_H_\n\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts-frontend.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass PiperPhonemizeLexicon : public OfflineTtsFrontend {\n public:\n  PiperPhonemizeLexicon(const std::string &tokens, const std::string &data_dir,\n                        const OfflineTtsVitsModelMetaData &vits_meta_data);\n\n  PiperPhonemizeLexicon(const std::string &tokens, const std::string &data_dir,\n                        const OfflineTtsMatchaModelMetaData &matcha_meta_data);\n\n  PiperPhonemizeLexicon(const std::string &tokens, const std::string &data_dir,\n                        const OfflineTtsKokoroModelMetaData &kokoro_meta_data);\n\n  PiperPhonemizeLexicon(const std::string &tokens, const std::string &data_dir,\n                        const OfflineTtsKittenModelMetaData &kitten_meta_data);\n\n  template <typename Manager>\n  PiperPhonemizeLexicon(Manager *mgr, const std::string &tokens,\n                        const std::string &data_dir,\n                        const OfflineTtsVitsModelMetaData &vits_meta_data);\n\n  template <typename Manager>\n  PiperPhonemizeLexicon(Manager *mgr, const std::string &tokens,\n                        const std::string &data_dir,\n                        const OfflineTtsMatchaModelMetaData &matcha_meta_data);\n\n  template <typename Manager>\n  PiperPhonemizeLexicon(Manager *mgr, const std::string &tokens,\n                        const std::string &data_dir,\n                   
     const OfflineTtsKokoroModelMetaData &kokoro_meta_data);\n\n  template <typename Manager>\n  PiperPhonemizeLexicon(Manager *mgr, const std::string &tokens,\n                        const std::string &data_dir,\n                        const OfflineTtsKittenModelMetaData &kitten_meta_data);\n\n  std::vector<TokenIDs> ConvertTextToTokenIds(\n      const std::string &text, const std::string &voice = \"\") const override;\n\n private:\n  std::vector<TokenIDs> ConvertTextToTokenIdsVits(\n      const std::string &text, const std::string &voice = \"\") const;\n\n  std::vector<TokenIDs> ConvertTextToTokenIdsMatcha(\n      const std::string &text, const std::string &voice = \"\") const;\n\n private:\n  // map unicode codepoint to an integer ID\n  std::unordered_map<char32_t, int32_t> token2id_;\n  OfflineTtsVitsModelMetaData vits_meta_data_;\n  OfflineTtsMatchaModelMetaData matcha_meta_data_;\n  OfflineTtsKokoroModelMetaData kokoro_meta_data_;\n  OfflineTtsKittenModelMetaData kitten_meta_data_;\n  bool is_matcha_ = false;\n  bool is_kokoro_ = false;\n  bool is_kitten_ = false;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PIPER_PHONEMIZE_LEXICON_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/piper-phonemize-test.cc",
    "content": "// sherpa-onnx/csrc/piper-phonemize-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include <iostream>\n#include <map>\n#include <string>\n#include <vector>\n\n#include \"espeak-ng/speak_lib.h\"\n#include \"gtest/gtest.h\"\n#include \"phoneme_ids.hpp\"  // NOLINT\n#include \"phonemize.hpp\"    // NOLINT\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nTEST(PiperPhonemize, Case1) {\n  std::string data_dir = \"./install/share/espeak-ng-data\";\n  if (!FileExists(data_dir + \"/en_dict\")) {\n    SHERPA_ONNX_LOGE(\"%s/en_dict does not exist. Skipping test\",\n                     data_dir.c_str());\n    return;\n  }\n\n  if (!FileExists(data_dir + \"/phontab\")) {\n    SHERPA_ONNX_LOGE(\"%s/phontab does not exist. Skipping test\",\n                     data_dir.c_str());\n    return;\n  }\n\n  if (!FileExists(data_dir + \"/phonindex\")) {\n    SHERPA_ONNX_LOGE(\"%s/phonindex does not exist. Skipping test\",\n                     data_dir.c_str());\n    return;\n  }\n\n  if (!FileExists(data_dir + \"/phondata\")) {\n    SHERPA_ONNX_LOGE(\"%s/phondata does not exist. Skipping test\",\n                     data_dir.c_str());\n    return;\n  }\n\n  if (!FileExists(data_dir + \"/intonations\")) {\n    SHERPA_ONNX_LOGE(\"%s/intonations does not exist. 
Skipping test\",\n                     data_dir.c_str());\n    return;\n  }\n  int32_t result =\n      espeak_Initialize(AUDIO_OUTPUT_SYNCHRONOUS, 0, data_dir.c_str(), 0);\n  EXPECT_EQ(result, 22050);\n\n  piper::eSpeakPhonemeConfig config;\n\n  // ./bin/espeak-ng-bin --path  ./install/share/espeak-ng-data/ --voices\n  // to list available voices\n  config.voice = \"en-us\";\n\n  std::vector<std::vector<piper::Phoneme>> phonemes;\n  std::string text = \"how are you doing?\";\n  piper::phonemize_eSpeak(text, config, phonemes);\n\n  for (int32_t p : phonemes[0]) {\n    std::cout << p << \" \";\n  }\n  std::cout << \"\\n\";\n\n  std::vector<piper::PhonemeId> phoneme_ids;\n  std::map<piper::Phoneme, std::size_t> missing_phonemes;\n\n  {\n    piper::PhonemeIdConfig config;\n    phonemes_to_ids(phonemes[0], config, phoneme_ids, missing_phonemes);\n  }\n\n  for (int32_t p : phoneme_ids) {\n    std::cout << p << \" \";\n  }\n  std::cout << \"\\n\";\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/provider-config.cc",
    "content": "// sherpa-onnx/csrc/provider-config.cc\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela)\n\n#include \"sherpa-onnx/csrc/provider-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid CudaConfig::Register(ParseOptions *po) {\n  po->Register(\"cuda-cudnn-conv-algo-search\", &cudnn_conv_algo_search,\n               \"CuDNN convolution algorithm search\");\n}\n\nbool CudaConfig::Validate() const {\n  if (cudnn_conv_algo_search < 1 || cudnn_conv_algo_search > 3) {\n    SHERPA_ONNX_LOGE(\n        \"cudnn_conv_algo_search: '%d' is not a valid option. \"\n        \"Valid range: [1, 3]. See the ONNX Runtime docs\",\n        cudnn_conv_algo_search);\n    return false;\n  }\n  return true;\n}\n\nstd::string CudaConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"CudaConfig(\";\n  os << \"cudnn_conv_algo_search=\" << cudnn_conv_algo_search << \")\";\n\n  return os.str();\n}\n\nvoid TensorrtConfig::Register(ParseOptions *po) {\n  po->Register(\"trt-max-workspace-size\", &trt_max_workspace_size,\n               \"Set TensorRT EP GPU memory usage limit.\");\n  po->Register(\"trt-max-partition-iterations\", &trt_max_partition_iterations,\n               \"Limit partitioning iterations for model conversion.\");\n  po->Register(\"trt-min-subgraph-size\", &trt_min_subgraph_size,\n               \"Set minimum size for subgraphs in partitioning.\");\n  po->Register(\"trt-fp16-enable\", &trt_fp16_enable,\n               \"Enable FP16 precision for faster performance.\");\n  po->Register(\"trt-detailed-build-log\", &trt_detailed_build_log,\n               \"Enable detailed logging of build steps.\");\n  po->Register(\"trt-engine-cache-enable\", &trt_engine_cache_enable,\n               \"Enable caching of TensorRT engines.\");\n  po->Register(\"trt-timing-cache-enable\", &trt_timing_cache_enable,\n               \"Enable use of timing cache to 
speed up builds.\");\n  po->Register(\"trt-engine-cache-path\", &trt_engine_cache_path,\n               \"Set path to store cached TensorRT engines.\");\n  po->Register(\"trt-timing-cache-path\", &trt_timing_cache_path,\n               \"Set path for storing timing cache.\");\n  po->Register(\"trt-dump-subgraphs\", &trt_dump_subgraphs,\n               \"Dump optimized subgraphs for debugging.\");\n}\n\nbool TensorrtConfig::Validate() const {\n  if (trt_max_workspace_size < 0) {\n    std::ostringstream os;\n    os << \"trt_max_workspace_size: \" << trt_max_workspace_size\n       << \" is not valid.\";\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n    return false;\n  }\n  if (trt_max_partition_iterations < 0) {\n    SHERPA_ONNX_LOGE(\"trt_max_partition_iterations: %d is not valid.\",\n                     trt_max_partition_iterations);\n    return false;\n  }\n  if (trt_min_subgraph_size < 0) {\n    SHERPA_ONNX_LOGE(\"trt_min_subgraph_size: %d is not valid.\",\n                     trt_min_subgraph_size);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string TensorrtConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"TensorrtConfig(\";\n  os << \"trt_max_workspace_size=\" << trt_max_workspace_size << \", \";\n  os << \"trt_max_partition_iterations=\" << trt_max_partition_iterations << \", \";\n  os << \"trt_min_subgraph_size=\" << trt_min_subgraph_size << \", \";\n  os << \"trt_fp16_enable=\\\"\" << (trt_fp16_enable ? \"True\" : \"False\") << \"\\\", \";\n  os << \"trt_detailed_build_log=\\\"\"\n     << (trt_detailed_build_log ? \"True\" : \"False\") << \"\\\", \";\n  os << \"trt_engine_cache_enable=\\\"\"\n     << (trt_engine_cache_enable ? \"True\" : \"False\") << \"\\\", \";\n  os << \"trt_engine_cache_path=\\\"\" << trt_engine_cache_path.c_str() << \"\\\", \";\n  os << \"trt_timing_cache_enable=\\\"\"\n     << (trt_timing_cache_enable ? 
\"True\" : \"False\") << \"\\\", \";\n  os << \"trt_timing_cache_path=\\\"\" << trt_timing_cache_path.c_str() << \"\\\", \";\n  os << \"trt_dump_subgraphs=\\\"\" << (trt_dump_subgraphs ? \"True\" : \"False\")\n     << \"\\\")\";\n  return os.str();\n}\n\nvoid ProviderConfig::Register(ParseOptions *po) {\n  cuda_config.Register(po);\n  trt_config.Register(po);\n\n  po->Register(\"device\", &device,\n               \"GPU device index for the CUDA and TensorRT EPs\");\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml, xnnpack, \"\n               \"nnapi, trt, directml, spacemit\");\n}\n\nbool ProviderConfig::Validate() const {\n  if (device < 0) {\n    SHERPA_ONNX_LOGE(\"device: '%d' is invalid.\", device);\n    return false;\n  }\n\n  if (provider == \"cuda\" && !cuda_config.Validate()) {\n    return false;\n  }\n\n  if (provider == \"trt\" && !trt_config.Validate()) {\n    return false;\n  }\n\n  return true;\n}\n\nstd::string ProviderConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"ProviderConfig(\";\n  os << \"device=\" << device << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\", \";\n  os << \"cuda_config=\" << cuda_config.ToString() << \", \";\n  os << \"trt_config=\" << trt_config.ToString() << \")\";\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/provider-config.h",
    "content": "// sherpa-onnx/csrc/provider-config.h\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela)\n\n#ifndef SHERPA_ONNX_CSRC_PROVIDER_CONFIG_H_\n#define SHERPA_ONNX_CSRC_PROVIDER_CONFIG_H_\n\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct CudaConfig {\n  int32_t cudnn_conv_algo_search = OrtCudnnConvAlgoSearchHeuristic;\n\n  CudaConfig() = default;\n  explicit CudaConfig(int32_t cudnn_conv_algo_search)\n      : cudnn_conv_algo_search(cudnn_conv_algo_search) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nstruct TensorrtConfig {\n  int64_t trt_max_workspace_size = 2147483647;\n  int32_t trt_max_partition_iterations = 10;\n  int32_t trt_min_subgraph_size = 5;\n  bool trt_fp16_enable = true;\n  bool trt_detailed_build_log = false;\n  bool trt_engine_cache_enable = true;\n  bool trt_timing_cache_enable = true;\n  std::string trt_engine_cache_path = \".\";\n  std::string trt_timing_cache_path = \".\";\n  bool trt_dump_subgraphs = false;\n\n  TensorrtConfig() = default;\n  TensorrtConfig(int64_t trt_max_workspace_size,\n                 int32_t trt_max_partition_iterations,\n                 int32_t trt_min_subgraph_size, bool trt_fp16_enable,\n                 bool trt_detailed_build_log, bool trt_engine_cache_enable,\n                 bool trt_timing_cache_enable,\n                 const std::string &trt_engine_cache_path,\n                 const std::string &trt_timing_cache_path,\n                 bool trt_dump_subgraphs)\n      : trt_max_workspace_size(trt_max_workspace_size),\n        trt_max_partition_iterations(trt_max_partition_iterations),\n        trt_min_subgraph_size(trt_min_subgraph_size),\n        trt_fp16_enable(trt_fp16_enable),\n        trt_detailed_build_log(trt_detailed_build_log),\n        
trt_engine_cache_enable(trt_engine_cache_enable),\n        trt_timing_cache_enable(trt_timing_cache_enable),\n        trt_engine_cache_path(trt_engine_cache_path),\n        trt_timing_cache_path(trt_timing_cache_path),\n        trt_dump_subgraphs(trt_dump_subgraphs) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\nstruct ProviderConfig {\n  TensorrtConfig trt_config;\n  CudaConfig cuda_config;\n  std::string provider = \"cpu\";\n  int32_t device = 0;\n  // device only used for cuda and trt\n\n  ProviderConfig() = default;\n  ProviderConfig(const std::string &provider, int32_t device)\n      : provider(provider), device(device) {}\n  ProviderConfig(const TensorrtConfig &trt_config,\n                 const CudaConfig &cuda_config, const std::string &provider,\n                 int32_t device)\n      : trt_config(trt_config),\n        cuda_config(cuda_config),\n        provider(provider),\n        device(device) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PROVIDER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/provider.cc",
    "content": "// sherpa-onnx/csrc/provider.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/provider.h\"\n\n#include <algorithm>\n#include <cctype>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nProvider StringToProvider(std::string s) {\n  std::transform(s.cbegin(), s.cend(), s.begin(),\n                 [](unsigned char c) { return std::tolower(c); });\n  if (s == \"cpu\") {\n    return Provider::kCPU;\n  } else if (s == \"cuda\") {\n    return Provider::kCUDA;\n  } else if (s == \"coreml\") {\n    return Provider::kCoreML;\n  } else if (s == \"xnnpack\") {\n    return Provider::kXnnpack;\n  } else if (s == \"nnapi\") {\n    return Provider::kNNAPI;\n  } else if (s == \"trt\") {\n    return Provider::kTRT;\n  } else if (s == \"directml\") {\n    return Provider::kDirectML;\n  } else if (s == \"spacemit\") {\n    return Provider::kSpacemiT;\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported string: %s. Fallback to cpu\", s.c_str());\n    return Provider::kCPU;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/provider.h",
    "content": "// sherpa-onnx/csrc/provider.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_PROVIDER_H_\n#define SHERPA_ONNX_CSRC_PROVIDER_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/provider-config.h\"\nnamespace sherpa_onnx {\n\n// Please refer to\n// https://github.com/microsoft/onnxruntime/blob/main/java/src/main/java/ai/onnxruntime/OrtProvider.java\n// for a list of available providers\nenum class Provider {\n  kCPU = 0,       // CPUExecutionProvider\n  kCUDA = 1,      // CUDAExecutionProvider\n  kCoreML = 2,    // CoreMLExecutionProvider\n  kXnnpack = 3,   // XnnpackExecutionProvider\n  kNNAPI = 4,     // NnapiExecutionProvider\n  kTRT = 5,       // TensorRTExecutionProvider\n  kDirectML = 6,  // DmlExecutionProvider\n  kSpacemiT = 7,  // SpacemiTExecutionProvider\n};\n\n/**\n * Convert a string to an enum.\n *\n * @param s We will convert it to lowercase before comparing.\n * @return Return an instance of Provider.\n */\nProvider StringToProvider(std::string s);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_PROVIDER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/macros.h",
    "content": "// sherpa-onnx/csrc/qnn/macros.h\n//\n// Copyright      2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_QNN_MACROS_H_\n#define SHERPA_ONNX_CSRC_QNN_MACROS_H_\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\n#define SHERPA_ONNX_QNN_CHECK(ret, msg, ...)                             \\\n  do {                                                                   \\\n    if (ret != QNN_SUCCESS) {                                            \\\n      SHERPA_ONNX_LOGE(\"Return code is: %d\", static_cast<int32_t>(ret)); \\\n      SHERPA_ONNX_LOGE(msg, ##__VA_ARGS__);                              \\\n      SHERPA_ONNX_EXIT(-1);                                              \\\n    }                                                                    \\\n  } while (0)\n\n#endif  // SHERPA_ONNX_CSRC_QNN_MACROS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.cc",
    "content": "// sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-backend.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-model.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelQnn::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(config_.paraformer.model, \",\", true, &filenames);\n    if (!filenames.empty()) {\n      if (filenames.size() != 3) {\n        SHERPA_ONNX_LOGE(\"Invalid Paraformer QNN model '%s'\",\n                         config_.paraformer.model.c_str());\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    std::vector<std::string> binary_filenames;\n    SplitStringToVector(config_.paraformer.qnn_config.context_binary, \",\", true,\n                        &binary_filenames);\n    if (!binary_filenames.empty()) {\n      if (binary_filenames.size() != 3) {\n        SHERPA_ONNX_LOGE(\n            \"There should be 3 files for Paraformer context binary. Actual: \"\n            \"%d. 
'%s'\",\n            static_cast<int32_t>(binary_filenames.size()),\n            config_.paraformer.qnn_config.context_binary.c_str());\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    if (filenames.empty() && binary_filenames.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You need to provide either a model or a context binary for \"\n          \"Paraformer with QNN\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    bool ok = InitEncoder(filenames.empty() ? \"\" : filenames[0],\n                          binary_filenames.empty() ? \"\" : binary_filenames[0]);\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to init encoder with lib file '%s', context binary: '%s'\",\n          filenames.empty() ? \"\" : filenames[0].c_str(),\n          binary_filenames.empty() ? \"\" : binary_filenames[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    ok = InitPredictor(filenames.empty() ? \"\" : filenames[1],\n                       binary_filenames.empty() ? \"\" : binary_filenames[1]);\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to init predictor with lib file '%s', context binary: '%s'\",\n          filenames.empty() ? \"\" : filenames[1].c_str(),\n          binary_filenames.empty() ? \"\" : binary_filenames[1].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    ok = InitDecoder(filenames.empty() ? \"\" : filenames[2],\n                     binary_filenames.empty() ? \"\" : binary_filenames[2]);\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to init decoder with lib file '%s', context binary: '%s'\",\n          filenames.empty() ? \"\" : filenames[2].c_str(),\n          binary_filenames.empty() ? 
\"\" : binary_filenames[2].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) {\n    SHERPA_ONNX_LOGE(\n        \"Please copy all files from assets to SD card and set assetManager to \"\n        \"null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  std::vector<float> Run(std::vector<float> features) {\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    std::vector<float> encoder_out = RunEncoder(std::move(features));\n    std::vector<float> transposed_encoder_out =\n        Transpose(encoder_out.data(), encoder_out_dim1_, encoder_out_dim2_);\n\n    std::vector<float> alphas = RunPredictor(transposed_encoder_out);\n\n    std::vector<float> acoustic_embedding =\n        ComputeAcousticEmbedding(encoder_out, alphas, encoder_out_dim2_);\n\n    int32_t num_tokens = acoustic_embedding.size() / encoder_out_dim2_;\n\n    acoustic_embedding.resize(encoder_out.size());\n\n    std::vector<float> transposed_acoustic_embedding = Transpose(\n        acoustic_embedding.data(), encoder_out_dim1_, encoder_out_dim2_);\n\n    std::vector<float> decoder_out = RunDecoder(\n        transposed_encoder_out, transposed_acoustic_embedding, num_tokens);\n\n    decoder_out = Transpose(decoder_out.data(), vocab_size_, encoder_out_dim1_);\n    decoder_out.resize(num_tokens * vocab_size_);\n    return decoder_out;\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n private:\n  std::vector<float> RunEncoder(std::vector<float> features) const {\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    encoder_model_->SetInputTensorData(\"x\", features.data(), features.size());\n    encoder_model_->Run();\n    return encoder_model_->GetOutputTensorData(\"encoder_out\");\n  }\n\n  std::vector<float> RunPredictor(\n      const std::vector<float> &transposed_encoder_out) const {\n    predictor_model_->SetInputTensorData(\"encoder_out\",\n                    
                     transposed_encoder_out.data(),\n                                         transposed_encoder_out.size());\n    predictor_model_->Run();\n    return predictor_model_->GetOutputTensorData(\"alphas\");\n  }\n\n  std::vector<float> RunDecoder(\n      const std::vector<float> &transposed_encoder_out,\n      const std::vector<float> &transposed_acoustic_embedding,\n      int32_t num_tokens) const {\n    std::vector<int32_t> mask(encoder_out_dim1_, 1);\n    std::fill(mask.begin() + num_tokens, mask.end(), 0);\n\n    decoder_model_->SetInputTensorData(\"encoder_out\",\n                                       transposed_encoder_out.data(),\n                                       transposed_encoder_out.size());\n\n    decoder_model_->SetInputTensorData(\"acoustic_embedding\",\n                                       transposed_acoustic_embedding.data(),\n                                       transposed_acoustic_embedding.size());\n\n    decoder_model_->SetInputTensorData(\"mask\", mask.data(), mask.size());\n\n    decoder_model_->Run();\n\n    return decoder_model_->GetOutputTensorData(\"decoder_out\");\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = 7;\n    int32_t lfr_window_shift = 6;\n    int32_t in_feat_dim = 80;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > num_input_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, num_input_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. 
Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = num_input_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(num_input_frames_ * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  bool InitEncoder(const std::string &lib_filename,\n                   const std::string &context_binary) {\n    encoder_backend_ = std::make_unique<QnnBackend>(\n        config_.paraformer.qnn_config.backend_lib, config_.debug);\n\n    if (context_binary.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from encoder model lib '%s' since context binary is not \"\n            \"given.\",\n            lib_filename.c_str());\n      }\n\n      InitEncoderFromModelLib(lib_filename);\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Skip generating encoder context binary since you don't provide a \"\n            \"path to save it\");\n      }\n    } else if (!FileExists(context_binary)) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init encoder from model lib '%s' since context binary '%s' does \"\n            \"not exist\",\n            lib_filename.c_str(), context_binary.c_str());\n      }\n\n      InitEncoderFromModelLib(lib_filename);\n\n      CreateContextBinary(encoder_model_.get(), context_binary);\n    } else {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Init from encoder context binary '%s'\",\n                         context_binary.c_str());\n      }\n      InitEncoderFromContextBinary(context_binary);\n    }\n\n    PostInitEncoder();\n\n    return true;\n  }\n\n  bool InitPredictor(const std::string &lib_filename,\n                     
const std::string &context_binary) {\n    predictor_backend_ = std::make_unique<QnnBackend>(\n        config_.paraformer.qnn_config.backend_lib, config_.debug);\n\n    if (context_binary.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from predictor model lib '%s' since context binary is not \"\n            \"given.\",\n            lib_filename.c_str());\n      }\n\n      InitPredictorFromModelLib(lib_filename);\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Skip generating predictor context binary since you don't provide \"\n            \"a path to save it\");\n      }\n    } else if (!FileExists(context_binary)) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init predictor from model lib '%s' since context binary '%s' does \"\n            \"not exist\",\n            lib_filename.c_str(), context_binary.c_str());\n      }\n\n      InitPredictorFromModelLib(lib_filename);\n      CreateContextBinary(predictor_model_.get(), context_binary);\n    } else {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Init from predictor context binary '%s'\",\n                         context_binary.c_str());\n      }\n      InitPredictorFromContextBinary(context_binary);\n    }\n\n    PostInitPredictor();\n\n    return true;\n  }\n\n  bool InitDecoder(const std::string &lib_filename,\n                   const std::string &context_binary) {\n    decoder_backend_ = std::make_unique<QnnBackend>(\n        config_.paraformer.qnn_config.backend_lib, config_.debug);\n\n    if (context_binary.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from decoder model lib since context binary is not given\");\n      }\n\n      InitDecoderFromModelLib(lib_filename);\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Skip generating decoder context binary since you don't provide \"\n            \"a path to save it\");\n      }\n    } else if 
(!FileExists(context_binary)) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init decoder from model lib since context binary '%s' does not \"\n            \"exist\",\n            context_binary.c_str());\n      }\n\n      InitDecoderFromModelLib(lib_filename);\n      CreateContextBinary(decoder_model_.get(), context_binary);\n    } else {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Init from decoder context binary '%s'\",\n                         context_binary.c_str());\n      }\n      InitDecoderFromContextBinary(context_binary);\n    }\n\n    PostInitDecoder();\n\n    return true;\n  }\n\n  void InitEncoderFromModelLib(const std::string &lib_filename) {\n    encoder_backend_->InitContext();\n    encoder_model_ = std::make_unique<QnnModel>(\n        lib_filename, encoder_backend_.get(), config_.debug);\n  }\n\n  void InitPredictorFromModelLib(const std::string &lib_filename) {\n    predictor_backend_->InitContext();\n    predictor_model_ = std::make_unique<QnnModel>(\n        lib_filename, predictor_backend_.get(), config_.debug);\n  }\n\n  void InitDecoderFromModelLib(const std::string &lib_filename) {\n    decoder_backend_->InitContext();\n    decoder_model_ = std::make_unique<QnnModel>(\n        lib_filename, decoder_backend_.get(), config_.debug);\n  }\n\n  void CreateContextBinary(QnnModel *model, const std::string &context_binary) {\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Creating context binary '%s'.\", context_binary.c_str());\n    }\n\n    bool ok = model->SaveBinaryContext(context_binary);\n\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to save context binary to '%s'\",\n                       context_binary.c_str());\n    }\n\n    if (config_.debug && ok) {\n      SHERPA_ONNX_LOGE(\"Saved context binary to '%s'.\", context_binary.c_str());\n      SHERPA_ONNX_LOGE(\n          \"It should be super fast the next time you init the system.\");\n      SHERPA_ONNX_LOGE(\"Remember to also provide 
libQnnSystem.so.\");\n    }\n  }\n\n  void InitEncoderFromContextBinary(const std::string &context_binary) {\n    if (config_.paraformer.qnn_config.system_lib.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You should provide --paraformer.qnn-system-lib if you also provide \"\n          \"context binary\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    encoder_model_ = std::make_unique<QnnModel>(\n        context_binary, config_.paraformer.qnn_config.system_lib,\n        encoder_backend_.get(), BinaryContextTag{}, config_.debug);\n  }\n\n  void InitPredictorFromContextBinary(const std::string &context_binary) {\n    if (config_.paraformer.qnn_config.system_lib.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You should provide --paraformer.qnn-system-lib if you also provide \"\n          \"context binary\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    predictor_model_ = std::make_unique<QnnModel>(\n        context_binary, config_.paraformer.qnn_config.system_lib,\n        predictor_backend_.get(), BinaryContextTag{}, config_.debug);\n  }\n\n  void InitDecoderFromContextBinary(const std::string &context_binary) {\n    if (config_.paraformer.qnn_config.system_lib.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"You should provide --paraformer.qnn-system-lib if you also provide \"\n          \"context binary\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    decoder_model_ = std::make_unique<QnnModel>(\n        context_binary, config_.paraformer.qnn_config.system_lib,\n        decoder_backend_.get(), BinaryContextTag{}, config_.debug);\n  }\n\n  void PostInitEncoder() { CheckEncoderModel(); }\n\n  void PostInitPredictor() { CheckPredictorModel(); }\n\n  void PostInitDecoder() { CheckDecoderModel(); }\n\n  void CheckEncoderModel() {\n    const auto &input_tensor_names = encoder_model_->InputTensorNames();\n    if (input_tensor_names.size() != 1) {\n      SHERPA_ONNX_LOGE(\"Expect 1 input tensor. 
Actual %d\",\n                       static_cast<int32_t>(input_tensor_names.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[0] != \"x\") {\n      SHERPA_ONNX_LOGE(\"The 1st input should be x, actual '%s'\",\n                       input_tensor_names[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> x_shape =\n        encoder_model_->TensorShape(input_tensor_names[0]);\n    if (x_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"The 1st input should be 3-d, actual '%d'\",\n                       static_cast<int32_t>(x_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"The x.shape[0] should be 1, actual '%d'\", x_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    num_input_frames_ = x_shape[1];\n    feat_dim_ = x_shape[2];\n\n    if (!encoder_model_->HasTensor(\"encoder_out\")) {\n      SHERPA_ONNX_LOGE(\"Model does not have output node 'encoder_out'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> encoder_out_shape =\n        encoder_model_->TensorShape(\"encoder_out\");\n\n    encoder_out_dim1_ = encoder_out_shape[1];\n    encoder_out_dim2_ = encoder_out_shape[2];\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"num_input_frames: %d\", num_input_frames_);\n      SHERPA_ONNX_LOGE(\"feat_dim: %d\", feat_dim_);\n      SHERPA_ONNX_LOGE(\"encoder_out_dim1: %d\", encoder_out_dim1_);\n      SHERPA_ONNX_LOGE(\"encoder_out_dim2: %d\", encoder_out_dim2_);\n    }\n  }\n\n  void CheckPredictorModel() {\n    const auto &input_tensor_names = predictor_model_->InputTensorNames();\n    if (input_tensor_names.size() != 1) {\n      SHERPA_ONNX_LOGE(\"Expect 1 input tensor. 
Actual %d\",\n                       static_cast<int32_t>(input_tensor_names.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[0] != \"encoder_out\") {\n      SHERPA_ONNX_LOGE(\"The 1st input should be encoder_out, actual '%s'\",\n                       input_tensor_names[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> x_shape =\n        predictor_model_->TensorShape(input_tensor_names[0]);\n    if (x_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"The 1st input should be 3-d, actual '%d'\",\n                       static_cast<int32_t>(x_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"The x.shape[0] should be 1, actual '%d'\", x_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[1] != encoder_out_dim2_) {\n      SHERPA_ONNX_LOGE(\n          \"The input dim 1 of the predictor should be %d, given: %d\",\n          encoder_out_dim2_, x_shape[1]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[2] != encoder_out_dim1_) {\n      SHERPA_ONNX_LOGE(\n          \"The input dim 2 of the predictor should be %d, given: %d\",\n          encoder_out_dim1_, x_shape[2]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (!predictor_model_->HasTensor(\"alphas\")) {\n      SHERPA_ONNX_LOGE(\"Model does not have output node 'alphas'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> alphas_shape = predictor_model_->TensorShape(\"alphas\");\n    if (alphas_shape.size() != 2) {\n      SHERPA_ONNX_LOGE(\"alphas should be 2-d, given: %d\",\n                       static_cast<int32_t>(alphas_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (alphas_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"We support only batch size 1 for alphas. Given: %d\",\n                       alphas_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (alphas_shape[1] != encoder_out_dim1_) {\n      SHERPA_ONNX_LOGE(\"Expected output dim %d for alphas. 
Given: %d\",\n                       encoder_out_dim1_, alphas_shape[1]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void CheckDecoderModel() {\n    const auto &input_tensor_names = decoder_model_->InputTensorNames();\n    if (input_tensor_names.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Expect 3 input tensors. Actual %d\",\n                       static_cast<int32_t>(input_tensor_names.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[0] != \"encoder_out\") {\n      SHERPA_ONNX_LOGE(\"The 1st input should be encoder_out, actual '%s'\",\n                       input_tensor_names[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[1] != \"acoustic_embedding\") {\n      SHERPA_ONNX_LOGE(\n          \"The 2nd input should be acoustic_embedding, actual '%s'\",\n          input_tensor_names[1].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[2] != \"mask\") {\n      SHERPA_ONNX_LOGE(\"The 3rd input should be mask, actual '%s'\",\n                       input_tensor_names[2].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (!decoder_model_->HasTensor(\"decoder_out\")) {\n      SHERPA_ONNX_LOGE(\"Model does not have output node 'decoder_out'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> decoder_out_shape =\n        decoder_model_->TensorShape(\"decoder_out\");\n    if (decoder_out_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"decoder_out should be 3-d, given: %d\",\n                       static_cast<int32_t>(decoder_out_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (decoder_out_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"We support only batch size 1 for decoder. Given: %d\",\n                       decoder_out_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (decoder_out_shape[2] != encoder_out_dim1_) {\n      SHERPA_ONNX_LOGE(\"Expected output dim %d for decoder_out. 
Given: %d\",\n                       encoder_out_dim1_, decoder_out_shape[2]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    vocab_size_ = decoder_out_shape[1];\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n    }\n  }\n\n private:\n  std::mutex mutex_;\n  OfflineModelConfig config_;\n\n  std::unique_ptr<QnnBackend> encoder_backend_;\n  std::unique_ptr<QnnModel> encoder_model_;\n\n  std::unique_ptr<QnnBackend> predictor_backend_;\n  std::unique_ptr<QnnModel> predictor_model_;\n\n  std::unique_ptr<QnnBackend> decoder_backend_;\n  std::unique_ptr<QnnModel> decoder_model_;\n\n  int32_t num_input_frames_ = 0;\n  int32_t feat_dim_ = 0;\n\n  int32_t encoder_out_dim1_ = 0;\n  int32_t encoder_out_dim2_ = 0;\n  int32_t vocab_size_ = 0;\n};\n\nOfflineParaformerModelQnn::~OfflineParaformerModelQnn() = default;\n\nOfflineParaformerModelQnn::OfflineParaformerModelQnn(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineParaformerModelQnn::OfflineParaformerModelQnn(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<float> OfflineParaformerModelQnn::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineParaformerModelQnn::VocabSize() const {\n  return impl_->VocabSize();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineParaformerModelQnn::OfflineParaformerModelQnn(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineParaformerModelQnn::OfflineParaformerModelQnn(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.h",
    "content": "// sherpa-onnx/csrc/qnn/offline-paraformer-model-qnn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_OFFLINE_PARAFORMER_MODEL_QNN_H_\n#define SHERPA_ONNX_CSRC_QNN_OFFLINE_PARAFORMER_MODEL_QNN_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelQnn {\n public:\n  ~OfflineParaformerModelQnn();\n\n  explicit OfflineParaformerModelQnn(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineParaformerModelQnn(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features) const;\n\n  int32_t VocabSize() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_OFFLINE_PARAFORMER_MODEL_QNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-recognizer-zipformer-ctc-qnn-impl.h",
    "content": "// sherpa-onnx/csrc/qnn/offline-recognizer-zipformer-ctc-qnn-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_QNN_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_QNN_IMPL_H_\n#define SHERPA_ONNX_CSRC_QNN_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_QNN_IMPL_H_\n\n#include <ios>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.h\"\n#include \"sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ../offline-recognizer-ctc-impl.h\nOfflineRecognitionResult Convert(const OfflineCtcDecoderResult &src,\n                                 const SymbolTable &sym_table,\n                                 int32_t frame_shift_ms,\n                                 int32_t subsampling_factor);\n\nclass OfflineRecognizerZipformerCtcQnnImpl : public OfflineRecognizerImpl {\n public:\n  explicit OfflineRecognizerZipformerCtcQnnImpl(\n      const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(config),\n        config_(config),\n        symbol_table_(config_.model_config.tokens),\n        model_(std::make_unique<OfflineZipformerCtcModelQnn>(\n            config.model_config)) {\n    Init();\n  }\n\n  template <typename Manager>\n  OfflineRecognizerZipformerCtcQnnImpl(Manager *mgr,\n                                       const OfflineRecognizerConfig &config)\n      : OfflineRecognizerImpl(mgr, config),\n        config_(config),\n        symbol_table_(mgr, config_.model_config.tokens),\n        model_(std::make_unique<OfflineZipformerCtcModelQnn>(\n            mgr, config.model_config)) {\n    Init();\n  }\n\n  void Init() {\n    if 
(config_.decoding_method == \"greedy_search\") {\n      if (!symbol_table_.Contains(\"<blk>\") &&\n          !symbol_table_.Contains(\"<eps>\") &&\n          !symbol_table_.Contains(\"<blank>\") &&\n          config_.model_config.omnilingual.model.empty()) {\n        // for omnilingual asr, its blank id is 0\n        SHERPA_ONNX_LOGE(\n            \"We expect that tokens.txt contains \"\n            \"the symbol <blk> or <eps> or <blank> and its ID.\");\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      int32_t blank_id = 0;\n      if (symbol_table_.Contains(\"<blk>\")) {\n        blank_id = symbol_table_[\"<blk>\"];\n      } else if (symbol_table_.Contains(\"<eps>\")) {\n        // for tdnn models of the yesno recipe from icefall\n        blank_id = symbol_table_[\"<eps>\"];\n      } else if (symbol_table_.Contains(\"<blank>\")) {\n        // for Wenet CTC models\n        blank_id = symbol_table_[\"<blank>\"];\n      }\n\n      decoder_ = std::make_unique<OfflineCtcGreedySearchDecoderRknn>(blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only greedy_search is supported at present. 
Given %s\",\n                       config_.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(config_.feat_config);\n  }\n\n  void DecodeStreams(OfflineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(ss[i]);\n    }\n  }\n\n  OfflineRecognizerConfig GetConfig() const override { return config_; }\n\n private:\n  // Decode a single stream.\n  // Some models do not support batch size > 1, e.g., WeNet CTC models.\n  void DecodeStream(OfflineStream *s) const {\n    std::vector<float> f = s->GetFrames();\n\n    int32_t vocab_size = model_->VocabSize();\n\n    std::vector<float> log_probs = model_->Run(std::move(f));\n    int32_t num_out_frames = log_probs.size() / vocab_size;\n\n    auto result =\n        decoder_->Decode(log_probs.data(), num_out_frames, vocab_size);\n\n    int32_t frame_shift_ms = 10;\n\n    auto r = Convert(result, symbol_table_, frame_shift_ms,\n                     model_->SubsamplingFactor());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    s->SetResult(r);\n  }\n\n private:\n  OfflineRecognizerConfig config_;\n  SymbolTable symbol_table_;\n  std::unique_ptr<OfflineZipformerCtcModelQnn> model_;\n  std::unique_ptr<OfflineCtcGreedySearchDecoderRknn> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_OFFLINE_RECOGNIZER_ZIPFORMER_CTC_QNN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.cc",
    "content": "// sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-backend.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelQnn::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    backend_ = std::make_unique<QnnBackend>(\n        config.sense_voice.qnn_config.backend_lib, config_.debug);\n\n    const auto &context_binary = config_.sense_voice.qnn_config.context_binary;\n\n    if (context_binary.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from model lib since context binary is not given\");\n      }\n\n      InitFromModelLib();\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Skip generating context binary since you don't provide a path to \"\n            \"save it\");\n      }\n\n    } else if (!FileExists(context_binary)) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from model lib since context binary '%s' does not exist\",\n            context_binary.c_str());\n      }\n\n      InitFromModelLib();\n\n      CreateContextBinary();\n    } else {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Init from context binary '%s'\",\n                         context_binary.c_str());\n      }\n      InitFromContextBinary();\n    }\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const 
OfflineModelConfig &config) : config_(config) {\n    SHERPA_ONNX_LOGE(\n        \"Please copy all files from assets to SD card and set assetManager to \"\n        \"null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) {\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    int32_t num_frames = features.size() / feat_dim_;\n\n    model_->SetInputTensorData(\"x\", features.data(), features.size());\n\n    std::array<int32_t, 4> prompt = {language, 1, 2, text_norm};\n    model_->SetInputTensorData(\"prompt\", prompt.data(), prompt.size());\n\n    model_->Run();\n\n    return model_->GetOutputTensorData(\"logits\");\n  }\n\n private:\n  void InitFromModelLib() {\n    backend_->InitContext();\n\n    model_ = std::make_unique<QnnModel>(config_.sense_voice.model,\n                                        backend_.get(), config_.debug);\n  }\n\n  void InitFromContextBinary() {\n    model_ = std::make_unique<QnnModel>(\n        config_.sense_voice.qnn_config.context_binary,\n        config_.sense_voice.qnn_config.system_lib, backend_.get(),\n        BinaryContextTag{}, config_.debug);\n  }\n\n  void CreateContextBinary() {\n    const auto &context_binary = config_.sense_voice.qnn_config.context_binary;\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Creating context binary '%s'.\", context_binary.c_str());\n    }\n\n    bool ok = model_->SaveBinaryContext(context_binary);\n\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to save context binary to '%s'\",\n                       context_binary.c_str());\n    }\n\n    if (config_.debug && ok) {\n      SHERPA_ONNX_LOGE(\"Saved context binary to '%s'.\", context_binary.c_str());\n      SHERPA_ONNX_LOGE(\n          
\"It should be super fast the next time you init the system.\");\n      SHERPA_ONNX_LOGE(\"Remember to also provide libQnnSystem.so.\");\n    }\n  }\n\n  void PostInit() { CheckModel(); }\n\n  void CheckModel() {\n    const auto &input_tensor_names = model_->InputTensorNames();\n    if (input_tensor_names.size() != 2) {\n      SHERPA_ONNX_LOGE(\"Expect two input tensors. Actual %d\",\n                       static_cast<int32_t>(input_tensor_names.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[0] != \"x\") {\n      SHERPA_ONNX_LOGE(\"The 1st input should be x, actual '%s'\",\n                       input_tensor_names[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[1] != \"prompt\") {\n      SHERPA_ONNX_LOGE(\"The 2nd input should be prompt, actual '%s'\",\n                       input_tensor_names[1].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> x_shape = model_->TensorShape(input_tensor_names[0]);\n    if (x_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"The 1st input should be 3-d, actual '%d'\",\n                       static_cast<int32_t>(x_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"The x.shape[0] should be 1, actual '%d'\", x_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[2] != feat_dim_) {\n      SHERPA_ONNX_LOGE(\"The x.shape[2] should be %d, actual '%d'\", feat_dim_,\n                       x_shape[2]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> prompt_shape =\n        model_->TensorShape(input_tensor_names[1]);\n\n    if (prompt_shape.size() != 1) {\n      SHERPA_ONNX_LOGE(\"The 2nd input should be 1-d, actual '%d'\",\n                       static_cast<int32_t>(prompt_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (prompt_shape[0] != 4) {\n      SHERPA_ONNX_LOGE(\"The prompt.shape[0] should be 4, actual '%d'\",\n                       
prompt_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (!model_->HasTensor(\"logits\")) {\n      SHERPA_ONNX_LOGE(\"Model does not have output node 'logits'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    expected_num_frames_ = x_shape[1];\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = meta_data_.window_size;\n    int32_t lfr_window_shift = meta_data_.window_shift;\n    int32_t in_feat_dim = 80;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > expected_num_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, expected_num_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = expected_num_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    // if out_num_frames < expected_num_frames_, it uses 0 padding\n    std::vector<float> out(expected_num_frames_ * out_feat_dim, 0);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n private:\n  std::mutex mutex_;\n\n  OfflineModelConfig config_;\n  OfflineSenseVoiceModelMetaData meta_data_;\n\n  std::unique_ptr<QnnBackend> backend_;\n  std::unique_ptr<QnnModel> model_;\n\n  int32_t expected_num_frames_ = 0;\n  int32_t feat_dim_ = 560;\n};\n\nOfflineSenseVoiceModelQnn::OfflineSenseVoiceModelQnn(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) 
{}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModelQnn::OfflineSenseVoiceModelQnn(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineSenseVoiceModelQnn::~OfflineSenseVoiceModelQnn() = default;\n\nstd::vector<float> OfflineSenseVoiceModelQnn::Run(std::vector<float> features,\n                                                  int32_t language,\n                                                  int32_t text_norm) const {\n  return impl_->Run(std::move(features), language, text_norm);\n}\n\nconst OfflineSenseVoiceModelMetaData &\nOfflineSenseVoiceModelQnn::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModelQnn::OfflineSenseVoiceModelQnn(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModelQnn::OfflineSenseVoiceModelQnn(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.h",
    "content": "// sherpa-onnx/csrc/qnn/offline-sense-voice-model-qnn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_OFFLINE_SENSE_VOICE_MODEL_QNN_H_\n#define SHERPA_ONNX_CSRC_QNN_OFFLINE_SENSE_VOICE_MODEL_QNN_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelQnn {\n public:\n  ~OfflineSenseVoiceModelQnn();\n\n  explicit OfflineSenseVoiceModelQnn(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModelQnn(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @param language\n   * @param text_norm\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_OFFLINE_SENSE_VOICE_MODEL_QNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.cc",
    "content": "// sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <mutex>  // NOLINT\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-backend.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-model.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerCtcModelQnn::Impl {\n public:\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    backend_ = std::make_unique<QnnBackend>(\n        config.zipformer_ctc.qnn_config.backend_lib, config_.debug);\n\n    const auto &context_binary =\n        config_.zipformer_ctc.qnn_config.context_binary;\n\n    if (context_binary.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from model lib since context binary is not given\");\n      }\n\n      InitFromModelLib();\n\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Skip generating context binary since you don't provide a path to \"\n            \"save it\");\n      }\n    } else if (!FileExists(context_binary)) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\n            \"Init from model lib since context binary '%s' does not exist\",\n            context_binary.c_str());\n      }\n\n      InitFromModelLib();\n\n      CreateContextBinary();\n    } else {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"Init from context binary '%s'\",\n                         context_binary.c_str());\n      }\n      InitFromContextBinary();\n    }\n\n    PostInit();\n  }\n\n  template <typename 
Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    SHERPA_ONNX_LOGE(\n        \"Please copy all files from assets to SD card and set assetManager to \"\n        \"null\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  std::vector<float> Run(std::vector<float> features) {\n    int32_t num_frames = features.size() / feat_dim_;\n\n    if (num_frames != max_num_frames_) {\n      if (num_frames > max_num_frames_) {\n        SHERPA_ONNX_LOGE(\n            \"Number of input frames %d is too large. Truncate it to %d frames.\",\n            num_frames, max_num_frames_);\n\n        SHERPA_ONNX_LOGE(\n            \"Recognition result may be truncated/incomplete. Please select a \"\n            \"model accepting longer audios.\");\n      }\n\n      features.resize(max_num_frames_ * feat_dim_);\n\n      num_frames = max_num_frames_;\n    }\n\n    std::lock_guard<std::mutex> lock(mutex_);\n\n    model_->SetInputTensorData(\"x\", features.data(), features.size());\n\n    model_->Run();\n\n    return model_->GetOutputTensorData(\"log_probs\");\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n  int32_t SubsamplingFactor() const { return subsampling_factor_; }\n\n private:\n  void InitFromModelLib() {\n    backend_->InitContext();\n\n    model_ = std::make_unique<QnnModel>(config_.zipformer_ctc.model,\n                                        backend_.get(), config_.debug);\n  }\n\n  void InitFromContextBinary() {\n    model_ = std::make_unique<QnnModel>(\n        config_.zipformer_ctc.qnn_config.context_binary,\n        config_.zipformer_ctc.qnn_config.system_lib, backend_.get(),\n        BinaryContextTag{}, config_.debug);\n  }\n\n  void CreateContextBinary() {\n    const auto &context_binary =\n        config_.zipformer_ctc.qnn_config.context_binary;\n\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"Creating context binary '%s'.\", context_binary.c_str());\n    }\n\n    bool ok = model_->SaveBinaryContext(context_binary);\n\n    if 
(!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to save context binary to '%s'\",\n                       context_binary.c_str());\n    }\n\n    if (config_.debug && ok) {\n      SHERPA_ONNX_LOGE(\"Saved context binary to '%s'.\", context_binary.c_str());\n      SHERPA_ONNX_LOGE(\n          \"It should be super fast the next time you init the system.\");\n      SHERPA_ONNX_LOGE(\"Remember to also provide libQnnSystem.so.\");\n    }\n  }\n\n  void PostInit() { CheckModel(); }\n\n  void CheckModel() {\n    const auto &input_tensor_names = model_->InputTensorNames();\n    if (input_tensor_names.size() != 1) {\n      SHERPA_ONNX_LOGE(\"Expect 1 input tensor. Actual %d\",\n                       static_cast<int32_t>(input_tensor_names.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (input_tensor_names[0] != \"x\") {\n      SHERPA_ONNX_LOGE(\"The 1st input should be x, actual '%s'\",\n                       input_tensor_names[0].c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int32_t> x_shape = model_->TensorShape(input_tensor_names[0]);\n    if (x_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"The 1st input should be 3-d, actual '%d'\",\n                       static_cast<int32_t>(x_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (x_shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"The x.shape[0] should be 1, actual '%d'\", x_shape[0]);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    max_num_frames_ = x_shape[1];\n    feat_dim_ = x_shape[2];\n\n    if (!model_->HasTensor(\"log_probs\")) {\n      SHERPA_ONNX_LOGE(\"Model does not have output node 'log_probs'\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    auto out_shape = model_->TensorShape(\"log_probs\");\n    vocab_size_ = out_shape[2];\n\n    subsampling_factor_ = max_num_frames_ / out_shape[1];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"max_num_frames: %d\", max_num_frames_);\n      SHERPA_ONNX_LOGE(\"feat_dim: %d\", feat_dim_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n  
    SHERPA_ONNX_LOGE(\"subsampling_factor: %d\", subsampling_factor_);\n    }\n  }\n\n private:\n  std::mutex mutex_;\n\n  OfflineModelConfig config_;\n\n  std::unique_ptr<QnnBackend> backend_;\n  std::unique_ptr<QnnModel> model_;\n\n  int32_t max_num_frames_ = 0;\n  int32_t feat_dim_ = 0;\n  int32_t vocab_size_ = 0;\n  int32_t subsampling_factor_ = 1;\n};\n\nOfflineZipformerCtcModelQnn::OfflineZipformerCtcModelQnn(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineZipformerCtcModelQnn::OfflineZipformerCtcModelQnn(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nOfflineZipformerCtcModelQnn::~OfflineZipformerCtcModelQnn() = default;\n\nstd::vector<float> OfflineZipformerCtcModelQnn::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineZipformerCtcModelQnn::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nint32_t OfflineZipformerCtcModelQnn::SubsamplingFactor() const {\n  return impl_->SubsamplingFactor();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineZipformerCtcModelQnn::OfflineZipformerCtcModelQnn(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineZipformerCtcModelQnn::OfflineZipformerCtcModelQnn(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.h",
    "content": "// sherpa-onnx/csrc/qnn/offline-zipformer-ctc-model-qnn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_OFFLINE_ZIPFORMER_CTC_MODEL_QNN_H_\n#define SHERPA_ONNX_CSRC_QNN_OFFLINE_ZIPFORMER_CTC_MODEL_QNN_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineZipformerCtcModelQnn {\n public:\n  ~OfflineZipformerCtcModelQnn();\n\n  explicit OfflineZipformerCtcModelQnn(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineZipformerCtcModelQnn(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features) const;\n\n  int32_t VocabSize() const;\n  int32_t SubsamplingFactor() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_OFFLINE_ZIPFORMER_CTC_MODEL_QNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/qnn-backend.cc",
    "content": "// sherpa-onnx/csrc/qnn/qnn-backend.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn/qnn-backend.h\"\n\n#include <dlfcn.h>\n#include <stdio.h>\n\n#include <cstdint>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"QnnInterface.h\"\n#include \"System/QnnSystemInterface.h\"\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n#include \"sherpa-onnx/csrc/qnn/utils.h\"\n\nnamespace sherpa_onnx {\n\nclass QnnBackend::Impl {\n public:\n  explicit Impl(const std::string &backend_lib, bool debug) : debug_(debug) {\n    bool ok = InitQnnInterface(backend_lib);\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to init qnn interface from '%s'\",\n                       backend_lib.c_str());\n      return;\n    }\n\n    InitLog();\n    InitBackend();\n    InitDevice();\n\n    is_initialized_ = true;\n  }\n\n  ~Impl() {\n    if (context_handle_) {\n      auto ret = qnn_interface_.contextFree(context_handle_, nullptr);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call contextFree\");\n    }\n\n    if (device_handle_) {\n      auto ret = qnn_interface_.deviceFree(device_handle_);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call deviceFree\");\n    }\n\n    if (backend_handle_) {\n      auto ret = qnn_interface_.backendFree(backend_handle_);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call backendFree\");\n    }\n\n    if (log_handle_) {\n      auto ret = qnn_interface_.logFree(log_handle_);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call logFree\");\n    }\n  }\n\n  void InitContext() {\n    if (context_handle_) {\n      SHERPA_ONNX_LOGE(\"context handle is already initialized\");\n      return;\n    }\n\n    auto ret = qnn_interface_.contextCreate(backend_handle_, device_handle_,\n                                            context_config_, &context_handle_);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call contextCreate\");\n  }\n\n  void InitContext(Qnn_ContextHandle_t t) { 
context_handle_ = t; }\n\n  Qnn_LogHandle_t LogHandle() const { return log_handle_; }\n\n  Qnn_BackendHandle_t BackendHandle() const { return backend_handle_; }\n\n  Qnn_DeviceHandle_t DeviceHandle() const { return device_handle_; }\n\n  Qnn_ContextHandle_t ContextHandle() const { return context_handle_; }\n\n  QNN_INTERFACE_VER_TYPE QnnInterface() const { return qnn_interface_; }\n\n  QnnLog_Level_t LogLevel() const { return log_level_; }\n\n  bool IsInitialized() const { return is_initialized_; }\n\n private:\n  bool InitQnnInterface(const std::string &backend_lib) {\n    backend_lib_handle_ = std::unique_ptr<void, decltype(&dlclose)>(\n        dlopen(backend_lib.c_str(), RTLD_NOW | RTLD_LOCAL), &dlclose);\n    if (!backend_lib_handle_) {\n      SHERPA_ONNX_LOGE(\"Failed to dlopen '%s'. Error is: '%s'\",\n                       backend_lib.c_str(), dlerror());\n      return false;\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"loaded %s\", backend_lib.c_str());\n    }\n\n    const char *symbol = \"QnnInterface_getProviders\";\n    auto get_interface_providers =\n        reinterpret_cast<QnnInterfaceGetProvidersFnType>(\n            dlsym(backend_lib_handle_.get(), symbol));\n    if (!get_interface_providers) {\n      SHERPA_ONNX_LOGE(\"Failed to dlsym for '%s'. 
Error is: '%s'\", symbol,\n                       dlerror());\n      return false;\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"Got %s\", symbol);\n    }\n\n    const QnnInterface_t **interface_providers = nullptr;\n    uint32_t num_providers = 0;\n\n    auto ret = get_interface_providers(&interface_providers, &num_providers);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call get_interface_providers\");\n\n    if (!interface_providers) {\n      SHERPA_ONNX_LOGE(\"interface_providers is nullptr\");\n      return false;\n    }\n\n    if (num_providers == 0) {\n      SHERPA_ONNX_LOGE(\"Number of providers is 0\");\n      return false;\n    }\n\n    bool found_valid_interface = false;\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"QNN_API_VERSION_MAJOR: %d\", QNN_API_VERSION_MAJOR);\n      SHERPA_ONNX_LOGE(\"QNN_API_VERSION_MINOR: %d\", QNN_API_VERSION_MINOR);\n      SHERPA_ONNX_LOGE(\"QNN_API_VERSION_PATCH: %d\", QNN_API_VERSION_PATCH);\n    }\n\n    for (size_t idx = 0; idx < num_providers; ++idx) {\n      auto p = interface_providers[idx];\n\n      if (debug_) {\n        std::ostringstream os;\n        os << \"---\" << idx << \"----\\n\";\n        os << \"backendId: \" << p->backendId << \"\\n\";\n        os << \"coreApiVersion.major: \" << p->apiVersion.coreApiVersion.major\n           << \"\\n\";\n        os << \"coreApiVersion.minor: \" << p->apiVersion.coreApiVersion.minor\n           << \"\\n\";\n        os << \"coreApiVersion.patch: \" << p->apiVersion.coreApiVersion.patch\n           << \"\\n\";\n\n        os << \"backendApiVersion.major: \"\n           << p->apiVersion.backendApiVersion.major << \"\\n\";\n        os << \"backendApiVersion.minor: \"\n           << p->apiVersion.backendApiVersion.minor << \"\\n\";\n        os << \"backendApiVersion.patch: \"\n           << p->apiVersion.backendApiVersion.patch << \"\\n\";\n        SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n      }\n\n      qnn_interface_ = p->QNN_INTERFACE_VER_NAME;\n      
found_valid_interface = true;\n      break;\n    }\n\n    if (!found_valid_interface) {\n      SHERPA_ONNX_LOGE(\"Failed to find valid interface\");\n      return false;\n    }\n\n    if (debug_) {\n      const char *build_id = nullptr;\n      ret = qnn_interface_.backendGetBuildId(&build_id);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call backendGetBuildId()\");\n\n      SHERPA_ONNX_LOGE(\"backend build ID: %s\", build_id);\n    }\n\n    return true;\n  }\n\n  void InitLog() {\n    auto ret = qnn_interface_.logCreate(LogCallback, log_level_, &log_handle_);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call logCreate\");\n  }\n\n  void InitBackend() {\n    auto ret = qnn_interface_.backendCreate(log_handle_, backend_config_,\n                                            &backend_handle_);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call backendCreate\");\n  }\n\n  void InitDevice() {\n    auto ret =\n        qnn_interface_.deviceCreate(log_handle_, nullptr, &device_handle_);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call deviceCreate\");\n  }\n\n private:\n  bool debug_ = true;\n  std::unique_ptr<void, decltype(&dlclose)> backend_lib_handle_{nullptr,\n                                                                &dlclose};\n\n  QNN_INTERFACE_VER_TYPE qnn_interface_;\n\n  QnnLog_Level_t log_level_ = QNN_LOG_LEVEL_WARN;\n  // QnnLog_Level_t log_level_ = QNN_LOG_LEVEL_INFO;\n  // QnnLog_Level_t log_level_ = QNN_LOG_LEVEL_VERBOSE;\n\n  Qnn_LogHandle_t log_handle_ = nullptr;\n\n  const QnnBackend_Config_t **backend_config_ = nullptr;\n  Qnn_BackendHandle_t backend_handle_ = nullptr;\n\n  Qnn_DeviceHandle_t device_handle_ = nullptr;\n\n  Qnn_ContextHandle_t context_handle_ = nullptr;\n  const QnnContext_Config_t **context_config_ = nullptr;\n  bool is_initialized_ = false;\n};\n\nQnnBackend::~QnnBackend() = default;\n\nQnnBackend::QnnBackend(const std::string &backend_lib, bool debug)\n    : impl_(std::make_unique<Impl>(backend_lib, debug)) {}\n\nvoid 
QnnBackend::InitContext() const { impl_->InitContext(); }\n\nvoid QnnBackend::InitContext(Qnn_ContextHandle_t context_handle) const {\n  impl_->InitContext(context_handle);\n}\n\nQnn_LogHandle_t QnnBackend::LogHandle() const { return impl_->LogHandle(); }\n\nQnn_BackendHandle_t QnnBackend::BackendHandle() const {\n  return impl_->BackendHandle();\n}\n\nQnn_DeviceHandle_t QnnBackend::DeviceHandle() const {\n  return impl_->DeviceHandle();\n}\n\nQnn_ContextHandle_t QnnBackend::ContextHandle() const {\n  return impl_->ContextHandle();\n}\n\nQNN_INTERFACE_VER_TYPE QnnBackend::QnnInterface() const {\n  return impl_->QnnInterface();\n}\n\nQnnLog_Level_t QnnBackend::LogLevel() const { return impl_->LogLevel(); }\n\nbool QnnBackend::IsInitialized() const { return impl_->IsInitialized(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/qnn-backend.h",
    "content": "// sherpa-onnx/csrc/qnn/qnn-backend.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_QNN_BACKEND_H_\n#define SHERPA_ONNX_CSRC_QNN_QNN_BACKEND_H_\n\n#include <memory>\n#include <string>\n\n#include \"QnnInterface.h\"\n\nnamespace sherpa_onnx {\n\nclass QnnBackend {\n public:\n  explicit QnnBackend(const std::string &backend_lib, bool debug);\n  ~QnnBackend();\n\n  void InitContext() const;\n  void InitContext(Qnn_ContextHandle_t context_handle) const;\n  Qnn_LogHandle_t LogHandle() const;\n  Qnn_BackendHandle_t BackendHandle() const;\n  Qnn_DeviceHandle_t DeviceHandle() const;\n  Qnn_ContextHandle_t ContextHandle() const;\n  QNN_INTERFACE_VER_TYPE QnnInterface() const;\n  QnnLog_Level_t LogLevel() const;\n  bool IsInitialized() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_QNN_BACKEND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/qnn-model.cc",
    "content": "// sherpa-onnx/csrc/qnn/qnn-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn/qnn-model.h\"\n\n#include <dlfcn.h>\n\n#include <fstream>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n#include \"sherpa-onnx/csrc/qnn/qnn-backend.h\"\n#include \"sherpa-onnx/csrc/qnn/utils.h\"\n\nnamespace sherpa_onnx {\n\nclass QnnModel::Impl {\n public:\n  Impl(const std::string &model_so, const QnnBackend *backend, bool debug)\n      : debug_(debug), backend_(backend) {\n    bool ok = InitModel(model_so);\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to load '%s'\", model_so.c_str());\n      return;\n    }\n\n    ok = InitSymbols();\n    if (!ok) {\n      SHERPA_ONNX_LOGE(\"Failed to get model symbols from '%s'\",\n                       model_so.c_str());\n      return;\n    }\n\n    InitGraph();\n\n    PostInit();\n  }\n\n  Impl(const std::string &binary_context_file, const std::string &system_lib,\n       const QnnBackend *backend, BinaryContextTag, bool debug)\n      : debug_(debug), backend_(backend) {\n    bool ok = LoadSystemLib(binary_context_file, system_lib);\n    if (!ok) {\n      return;\n    }\n\n    PostInit();\n  }\n\n  bool LoadSystemLib(const std::string &binary_context_file,\n                     const std::string &system_lib) {\n    system_lib_handle_ = std::unique_ptr<void, decltype(&dlclose)>(\n        dlopen(system_lib.c_str(), RTLD_NOW | RTLD_LOCAL), &dlclose);\n    if (!system_lib_handle_) {\n      SHERPA_ONNX_LOGE(\"Failed to dlopen '%s'. 
Error is: '%s'\",\n                       system_lib.c_str(), dlerror());\n      return false;\n    }\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"loaded %s\", system_lib.c_str());\n    }\n\n    auto get_system_interface_providers =\n        reinterpret_cast<QnnSystemInterfaceGetProvidersFnType>(\n            dlsym(system_lib_handle_.get(), \"QnnSystemInterface_getProviders\"));\n\n    if (!get_system_interface_providers) {\n      SHERPA_ONNX_LOGE(\"Failed to get QnnSystemInterface_getProviders\");\n      return false;\n    }\n\n    const QnnSystemInterface_t **system_interface_providers = nullptr;\n    uint32_t num_providers = 0;\n    if (get_system_interface_providers(&system_interface_providers,\n                                       &num_providers) != QNN_SUCCESS) {\n      SHERPA_ONNX_LOGE(\"Failed to get system interface providers.\");\n      return false;\n    }\n\n    if (!system_interface_providers) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to get system interface providers: null \"\n          \"interface providers received.\");\n      return false;\n    }\n\n    if (!num_providers) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to get interface providers: 0 interface providers.\");\n      return false;\n    }\n\n    for (uint32_t i = 0; i < num_providers; ++i) {\n      if (debug_) {\n        SHERPA_ONNX_LOGE(\"QNN_SYSTEM_API_VERSION_MAJOR: %d\",\n                         static_cast<int32_t>(QNN_SYSTEM_API_VERSION_MAJOR));\n        SHERPA_ONNX_LOGE(\"QNN_SYSTEM_API_VERSION_MINOR: %d\",\n                         static_cast<int32_t>(QNN_SYSTEM_API_VERSION_MINOR));\n        SHERPA_ONNX_LOGE(\n            \"systemApiVersion.major: %d\",\n            static_cast<int32_t>(\n                system_interface_providers[i]->systemApiVersion.major));\n        SHERPA_ONNX_LOGE(\n            \"systemApiVersion.minor: %d\",\n            static_cast<int32_t>(\n                system_interface_providers[i]->systemApiVersion.minor));\n      }\n\n      
qnn_system_interface_ =\n          system_interface_providers[i]->QNN_SYSTEM_INTERFACE_VER_NAME;\n    }\n\n    // read file into a buffer\n    std::vector<uint8_t> buffer = ReadFile<uint8_t>(binary_context_file);\n\n    QnnSystemContext_Handle_t sys_ctx_handle = nullptr;\n    if (qnn_system_interface_.systemContextCreate(&sys_ctx_handle) !=\n        QNN_SUCCESS) {\n      SHERPA_ONNX_LOGE(\"Could not create system handle.\");\n      return false;\n    }\n\n    const QnnSystemContext_BinaryInfo_t *binary_info = nullptr;\n    Qnn_ContextBinarySize_t binary_info_size = 0;\n\n    auto ret = qnn_system_interface_.systemContextGetBinaryInfo(\n        sys_ctx_handle, static_cast<void *>(buffer.data()), buffer.size(),\n        &binary_info, &binary_info_size);\n    if (ret != QNN_SUCCESS) {\n      SHERPA_ONNX_LOGE(\n          \"Failed to get context binary info from '%s'. ret code is %d\",\n          binary_context_file.c_str(), static_cast<int32_t>(ret));\n\n      qnn_system_interface_.systemContextFree(sys_ctx_handle);\n      return false;\n    }\n\n    const GraphConfigInfo **graph_configs_info = nullptr;\n\n    uint32_t graph_configs_info_count = 0;\n    GraphInfo **graphs_info = nullptr;\n    uint32_t graphs_count = 0;\n\n    if (!CopyMetadataToGraphsInfo(binary_info, graphs_info, graphs_count)) {\n      SHERPA_ONNX_LOGE(\"Failed to call CopyMetadataToGraphsInfo\");\n\n      qnn_system_interface_.systemContextFree(sys_ctx_handle);\n      return false;\n    }\n\n    qnn_system_interface_.systemContextFree(sys_ctx_handle);\n\n    auto free_graphs_info = [&graphs_info, &graphs_count] {\n      for (uint32_t i = 0; i < graphs_count; ++i) {\n        for (uint32_t k = 0; k < graphs_info[i]->num_input_tensors; ++k) {\n          FreeTensor(&graphs_info[i]->input_tensors[k]);\n        }\n\n        for (uint32_t k = 0; k < graphs_info[i]->num_output_tensors; ++k) {\n          FreeTensor(&graphs_info[i]->output_tensors[k]);\n        }\n\n        
free(graphs_info[i]->input_tensors);\n        free(graphs_info[i]->output_tensors);\n\n        free(graphs_info[i]->graph_name);\n      }\n\n      free(graphs_info[0]);\n      free(graphs_info);\n    };\n\n    if (graphs_count > 1) {\n      SHERPA_ONNX_LOGE(\"Only the first graph is used\");\n    }\n\n    Qnn_ContextHandle_t context_handle = nullptr;\n\n    if (backend_->QnnInterface().contextCreateFromBinary(\n            backend_->BackendHandle(), backend_->DeviceHandle(),\n            context_config_, static_cast<void *>(buffer.data()), buffer.size(),\n            &context_handle, nullptr) != QNN_SUCCESS) {\n      free_graphs_info();\n      SHERPA_ONNX_LOGE(\"Could not create context from binary.\");\n      return false;\n    }\n\n    backend_->InitContext(context_handle);\n\n    if (backend_->QnnInterface().graphRetrieve(\n            context_handle, (*graphs_info)[0].graph_name,\n            &((*graphs_info)[0].graph)) != QNN_SUCCESS) {\n      free_graphs_info();\n      SHERPA_ONNX_LOGE(\"Unable to retrieve graph handle for graph %d\", 0);\n      return false;\n    }\n\n    graph_handle_ = (*graphs_info)[0].graph;\n\n    InitInputTensors((*graphs_info)[0]);\n    InitOutputTensors((*graphs_info)[0]);\n\n    free_graphs_info();\n\n    return true;\n  }\n\n  ~Impl() = default;\n\n  bool SaveBinaryContext(const std::string &filename) {\n    auto qnn_interface = backend_->QnnInterface();\n\n    if (!qnn_interface.contextGetBinarySize ||\n        !qnn_interface.contextGetBinary) {\n      SHERPA_ONNX_LOGE(\n          \"contextGetBinarySizeFnHandle or \"\n          \"contextGetBinaryFnHandle is nullptr.\");\n      return false;\n    }\n\n    uint64_t required_buffer_size{0};\n    auto ret = qnn_interface.contextGetBinarySize(backend_->ContextHandle(),\n                                                  &required_buffer_size);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call contextGetBinarySize\");\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"context binary size: 
%.3f MB\",\n                       static_cast<float>(required_buffer_size) / 1024 / 1024);\n    }\n    std::vector<uint8_t> saveBuffer(required_buffer_size);\n    uint64_t writtenBufferSize{0};\n\n    ret = qnn_interface.contextGetBinary(\n        backend_->ContextHandle(), reinterpret_cast<void *>(saveBuffer.data()),\n        required_buffer_size, &writtenBufferSize);\n\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call contextGetBinary\");\n\n    if (required_buffer_size < writtenBufferSize) {\n      SHERPA_ONNX_LOGE(\n          \"Illegal written buffer size %d bytes. Cannot exceed \"\n          \"allocated memory of %d bytes\",\n          static_cast<int32_t>(writtenBufferSize),\n          static_cast<int32_t>(required_buffer_size));\n      return false;\n    }\n    std::ofstream ofs(filename, std::ios::binary | std::ios::trunc);\n    if (!ofs) {\n      SHERPA_ONNX_LOGE(\"Failed to create '%s'\", filename.c_str());\n      return false;\n    }\n\n    ofs.write(reinterpret_cast<const char *>(saveBuffer.data()),\n              saveBuffer.size());\n\n    if (!ofs) {\n      SHERPA_ONNX_LOGE(\"Failed to write '%s'\", filename.c_str());\n      return false;\n    }\n\n    return true;\n  }\n\n  const std::vector<std::string> &InputTensorNames() const {\n    return input_tensor_names_;\n  }\n\n  const std::vector<std::string> &OutputTensorNames() const {\n    return output_tensor_names_;\n  }\n\n  std::vector<int32_t> TensorShape(const std::string &name) const {\n    std::vector<int32_t> shape;\n\n    if (!HasTensor(name)) {\n      SHERPA_ONNX_LOGE(\"No such tensor '%s'\", name.c_str());\n      return shape;\n    }\n\n    auto t = name2tensor_.at(name);\n\n    shape = {t->v1.dimensions, t->v1.dimensions + t->v1.rank};\n\n    return shape;\n  }\n\n  int32_t TensorSizeInBytes(const std::string &name) const {\n    if (!HasTensor(name)) {\n      return 0;\n    }\n\n    return name2tensor_.at(name)->v1.clientBuf.dataSize;\n  }\n\n  bool HasTensor(const std::string &name) const 
{\n    return name2tensor_.count(name);\n  }\n\n  bool SetInputTensorData(const std::string &name, const float *p, int32_t n) {\n    if (!HasTensor(name)) {\n      SHERPA_ONNX_LOGE(\"No such tensor '%s'\", name.c_str());\n      return false;\n    }\n\n    auto t = name2tensor_.at(name);\n    if (t->v1.dataType != QNN_DATATYPE_UFIXED_POINT_16) {\n      SHERPA_ONNX_LOGE(\n          \"tensor '%s' should be of type \"\n          \"QNN_DATATYPE_UFIXED_POINT_16, but it is %s\",\n          name.c_str(), TensorDataTypeToString(t->v1.dataType).c_str());\n      return false;\n    }\n\n    if (t->v1.quantizeParams.quantizationEncoding !=\n        QNN_QUANTIZATION_ENCODING_SCALE_OFFSET) {\n      SHERPA_ONNX_LOGE(\n          \"tensor '%s' should be quantized with \"\n          \"QNN_QUANTIZATION_ENCODING_SCALE_OFFSET, but it is %s\",\n          name.c_str(),\n          QuantizationEncodingToString(\n              t->v1.quantizeParams.quantizationEncoding)\n              .c_str());\n      return false;\n    }\n\n    if (n * sizeof(uint16_t) != t->v1.clientBuf.dataSize) {\n      SHERPA_ONNX_LOGE(\"tensor '%s' expects %d bytes, but you provide %d bytes\",\n                       name.c_str(),\n                       static_cast<int32_t>(t->v1.clientBuf.dataSize),\n                       static_cast<int32_t>(n * sizeof(uint16_t)));\n      return false;\n    }\n\n    FillData(t, p, n);\n\n    return true;\n  }\n\n  bool SetInputTensorData(const std::string &name, const int32_t *p,\n                          int32_t n) {\n    if (!HasTensor(name)) {\n      SHERPA_ONNX_LOGE(\"No such tensor '%s'\", name.c_str());\n      return false;\n    }\n\n    auto t = name2tensor_.at(name);\n    if (t->v1.dataType != QNN_DATATYPE_INT_32) {\n      SHERPA_ONNX_LOGE(\n          \"tensor '%s' should be of type \"\n          \"QNN_DATATYPE_INT_32, but it is %s\",\n          name.c_str(), TensorDataTypeToString(t->v1.dataType).c_str());\n      return false;\n    }\n\n    if (n * sizeof(int32_t) != 
t->v1.clientBuf.dataSize) {\n      SHERPA_ONNX_LOGE(\"tensor '%s' expects %d bytes, but you provide %d bytes\",\n                       name.c_str(),\n                       static_cast<int32_t>(t->v1.clientBuf.dataSize),\n                       static_cast<int32_t>(n * sizeof(int32_t)));\n      return false;\n    }\n\n    FillData(t, p, n);\n\n    return true;\n  }\n\n  std::vector<float> GetOutputTensorData(const std::string &name) {\n    if (!HasTensor(name)) {\n      SHERPA_ONNX_LOGE(\"No such tensor '%s'\", name.c_str());\n      return {};\n    }\n\n    auto t = name2tensor_.at(name);\n    if (t->v1.dataType != QNN_DATATYPE_UFIXED_POINT_16) {\n      SHERPA_ONNX_LOGE(\n          \"tensor '%s' should be of type \"\n          \"QNN_DATATYPE_UFIXED_POINT_16, but it is %s\",\n          name.c_str(), TensorDataTypeToString(t->v1.dataType).c_str());\n      return {};\n    }\n\n    if (t->v1.quantizeParams.quantizationEncoding !=\n        QNN_QUANTIZATION_ENCODING_SCALE_OFFSET) {\n      SHERPA_ONNX_LOGE(\n          \"tensor '%s' should be quantized with \"\n          \"QNN_QUANTIZATION_ENCODING_SCALE_OFFSET, but it is %s\",\n          name.c_str(),\n          QuantizationEncodingToString(\n              t->v1.quantizeParams.quantizationEncoding)\n              .c_str());\n      return {};\n    }\n\n    int32_t n = t->v1.clientBuf.dataSize / sizeof(uint16_t);\n    std::vector<float> ans(n);\n\n    GetData(t, ans.data(), n);\n\n    return ans;\n  }\n\n  bool Run() {\n    std::vector<Qnn_Tensor_t> input_tensors_raw;\n    std::vector<Qnn_Tensor_t> output_tensors_raw;\n\n    input_tensors_raw.reserve(input_tensors_.size());\n    output_tensors_raw.reserve(output_tensors_.size());\n\n    for (const auto &p : input_tensors_) {\n      input_tensors_raw.push_back(*p);\n    }\n\n    for (const auto &p : output_tensors_) {\n      output_tensors_raw.push_back(*p);\n    }\n\n    auto ret = backend_->QnnInterface().graphExecute(\n        graph_handle_, input_tensors_raw.data(), 
input_tensors_raw.size(),\n        output_tensors_raw.data(), output_tensors_raw.size(), nullptr, nullptr);\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to run graphExecute\");\n\n    return true;\n  }\n\n  bool IsInitialized() const { return is_initialized_; }\n\n private:\n  void PostInit() {\n    AllocateBuffer();\n    SetupPointers();\n\n    is_initialized_ = true;\n  }\n\n  bool InitModel(const std::string &model_so) {\n    model_lib_handle_ = std::unique_ptr<void, decltype(&dlclose)>(\n        dlopen(model_so.c_str(), RTLD_NOW | RTLD_LOCAL), &dlclose);\n    if (!model_lib_handle_) {\n      SHERPA_ONNX_LOGE(\"Failed to dlopen '%s'. Error is: '%s'\",\n                       model_so.c_str(), dlerror());\n      return false;\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"loaded %s\", model_so.c_str());\n    }\n\n    return true;\n  }\n\n  bool InitSymbols() {\n    const char *symbol = \"QnnModel_composeGraphs\";\n\n    compose_graphs_fn_handle_ = reinterpret_cast<ComposeGraphsFnHandleType>(\n        dlsym(model_lib_handle_.get(), symbol));\n    if (!compose_graphs_fn_handle_) {\n      SHERPA_ONNX_LOGE(\"Failed to dlsym for '%s'. Error is: '%s'\", symbol,\n                       dlerror());\n      return false;\n    }\n\n    symbol = \"QnnModel_freeGraphsInfo\";\n    free_graph_info_fn_handle_ = reinterpret_cast<FreeGraphInfoFnHandleType>(\n        dlsym(model_lib_handle_.get(), symbol));\n    if (!free_graph_info_fn_handle_) {\n      SHERPA_ONNX_LOGE(\"Failed to dlsym for '%s'. 
Error is: '%s'\", symbol,\n                       dlerror());\n      return false;\n    }\n    return true;\n  }\n\n  void InitGraph() {\n    const GraphConfigInfo **graph_configs_info = nullptr;\n\n    uint32_t graph_configs_info_count = 0;\n    GraphInfo **graphs_info = nullptr;\n    uint32_t graphs_count = 0;\n\n    auto ret = compose_graphs_fn_handle_(\n        backend_->BackendHandle(), backend_->QnnInterface(),\n        backend_->ContextHandle(), graph_configs_info, graph_configs_info_count,\n        &graphs_info, &graphs_count, debug_, LogCallback, backend_->LogLevel());\n    SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call compose_graphs_fn_handle_\");\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"graphs_count: %d\", (int32_t)graphs_count);\n    }\n\n    for (uint32_t i = 0; i < graphs_count; ++i) {\n      if (debug_) {\n        SHERPA_ONNX_LOGE(\n            \"Finalizing graph %d/%d: '%s'\", static_cast<int32_t>(i),\n            static_cast<int32_t>(graphs_count), (*graphs_info)[i].graph_name);\n      }\n      ret = backend_->QnnInterface().graphFinalize((*graphs_info)[i].graph,\n                                                   nullptr, nullptr);\n      SHERPA_ONNX_QNN_CHECK(ret, \"Failed to call graph_finalize\");\n    }\n\n    if (graphs_count > 1) {\n      SHERPA_ONNX_LOGE(\"We only use the first graph: %s\",\n                       (*graphs_info)[0].graph_name);\n    }\n\n    InitInputTensors((*graphs_info)[0]);\n    InitOutputTensors((*graphs_info)[0]);\n\n    graph_handle_ = (*graphs_info)[0].graph;\n  }\n\n  void InitInputTensors(GraphInfo graph) {\n    input_tensors_.reserve(graph.num_input_tensors);\n    input_tensor_names_.reserve(graph.num_input_tensors);\n\n    for (uint32_t i = 0; i < graph.num_input_tensors; ++i) {\n      auto p = TensorPtr(new Qnn_Tensor_t(QNN_TENSOR_INIT), &FreeTensor);\n\n      CopyTensorInfo(graph.input_tensors[i], *p);\n\n      if (debug_) {\n        SHERPA_ONNX_LOGE(\"input %d\", (int)i);\n        PrintTensor(p->v2);\n    
  }\n\n      std::string name = p->v1.name;\n      name2tensor_[name] = p.get();\n      input_tensor_names_.push_back(std::move(name));\n\n      input_tensors_.push_back(std::move(p));\n    }\n  }\n\n  void InitOutputTensors(GraphInfo graph) {\n    output_tensors_.reserve(graph.num_output_tensors);\n    output_tensor_names_.reserve(graph.num_output_tensors);\n    for (uint32_t i = 0; i < graph.num_output_tensors; ++i) {\n      auto p = TensorPtr(new Qnn_Tensor_t(QNN_TENSOR_INIT), &FreeTensor);\n\n      CopyTensorInfo(graph.output_tensors[i], *p);\n\n      if (debug_ && (i + 3 > graph.num_output_tensors)) {\n        SHERPA_ONNX_LOGE(\"output %d\", (int)i);\n\n        PrintTensor(p->v2);\n      }\n\n      std::string name = p->v1.name;\n      name2tensor_[name] = p.get();\n      output_tensor_names_.push_back(std::move(name));\n\n      output_tensors_.push_back(std::move(p));\n    }\n  }\n\n  void AllocateBuffer() {\n    uint32_t n = 0;\n    for (const auto &p : name2tensor_) {\n      n += p.second->v1.clientBuf.dataSize;\n    }\n\n    if (debug_) {\n      SHERPA_ONNX_LOGE(\"Allocate %d bytes, or %.3f MB\", static_cast<int32_t>(n),\n                       static_cast<float>(n) / 1024 / 1024);\n    }\n\n    buffer_.resize(n);\n  }\n\n  void SetupPointers() {\n    uint8_t *p = buffer_.data();\n    uint32_t n = 0;\n    for (auto &t : input_tensors_) {\n      t->v1.clientBuf.data = p;\n      p += t->v1.clientBuf.dataSize;\n    }\n\n    for (auto &t : output_tensors_) {\n      t->v1.clientBuf.data = p;\n      p += t->v1.clientBuf.dataSize;\n    }\n\n    if (debug_) {\n      if (p == buffer_.data() + buffer_.size()) {\n        SHERPA_ONNX_LOGE(\"Setup pointers successfully.\");\n      } else {\n        SHERPA_ONNX_LOGE(\"Bad things happened in setting up pointers.\");\n      }\n    }\n  }\n\n private:\n  bool debug_ = true;\n  std::unique_ptr<void, decltype(&dlclose)> model_lib_handle_{nullptr,\n                                                              &dlclose};\n\n  
std::unique_ptr<void, decltype(&dlclose)> system_lib_handle_{nullptr,\n                                                               &dlclose};\n\n  QNN_SYSTEM_INTERFACE_VER_TYPE qnn_system_interface_;\n\n  ComposeGraphsFnHandleType compose_graphs_fn_handle_ = nullptr;\n  FreeGraphInfoFnHandleType free_graph_info_fn_handle_ = nullptr;\n\n  std::vector<TensorPtr> input_tensors_;\n  std::vector<TensorPtr> output_tensors_;\n\n  std::vector<std::string> input_tensor_names_;\n  std::vector<std::string> output_tensor_names_;\n\n  std::unordered_map<std::string, Qnn_Tensor_t *> name2tensor_;\n\n  std::vector<uint8_t> buffer_;\n  const QnnBackend *backend_ = nullptr;\n\n  Qnn_GraphHandle_t graph_handle_ = nullptr;\n\n  const QnnContext_Config_t **context_config_ = nullptr;\n  bool is_initialized_ = false;\n};\n\nQnnModel::~QnnModel() = default;\n\nQnnModel::QnnModel(const std::string &model_so, const QnnBackend *backend,\n                   bool debug)\n    : impl_(std::make_unique<Impl>(model_so, backend, debug)) {}\n\nQnnModel::QnnModel(const std::string &binary_context_file,\n                   const std::string &system_lib, const QnnBackend *backend,\n                   BinaryContextTag tag, bool debug)\n    : impl_(std::make_unique<Impl>(binary_context_file, system_lib, backend,\n                                   tag, debug)) {}  // NOLINT\n\nbool QnnModel::SaveBinaryContext(const std::string &filename) const {\n  return impl_->SaveBinaryContext(filename);\n}\n\nconst std::vector<std::string> &QnnModel::InputTensorNames() const {\n  return impl_->InputTensorNames();\n}\n\nconst std::vector<std::string> &QnnModel::OutputTensorNames() const {\n  return impl_->OutputTensorNames();\n}\n\nstd::vector<int32_t> QnnModel::TensorShape(const std::string &name) const {\n  return impl_->TensorShape(name);\n}\n\nint32_t QnnModel::TensorSizeInBytes(const std::string &name) const {\n  return impl_->TensorSizeInBytes(name);\n}\n\nbool QnnModel::HasTensor(const std::string &name) 
const {\n  return impl_->HasTensor(name);\n}\n\nbool QnnModel::SetInputTensorData(const std::string &name, const float *p,\n                                  int32_t n) const {\n  return impl_->SetInputTensorData(name, p, n);\n}\n\nbool QnnModel::SetInputTensorData(const std::string &name, const int32_t *p,\n                                  int32_t n) const {\n  return impl_->SetInputTensorData(name, p, n);\n}\n\nstd::vector<float> QnnModel::GetOutputTensorData(\n    const std::string &name) const {\n  return impl_->GetOutputTensorData(name);\n}\n\nbool QnnModel::Run() const { return impl_->Run(); }\n\nbool QnnModel::IsInitialized() const { return impl_->IsInitialized(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/qnn-model.h",
    "content": "// sherpa-onnx/csrc/qnn/qnn-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_QNN_MODEL_H_\n#define SHERPA_ONNX_CSRC_QNN_QNN_MODEL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"QnnInterface.h\"\n\nnamespace sherpa_onnx {\n\nclass QnnBackend;\n\nstruct BinaryContextTag {};\n\nclass QnnModel {\n public:\n  QnnModel(const std::string &model_so, const QnnBackend *backend, bool debug);\n  QnnModel(const std::string &binary_context_file,\n           const std::string &system_lib, const QnnBackend *backend,\n           BinaryContextTag tag, bool debug);\n  ~QnnModel();\n\n  bool SaveBinaryContext(const std::string &filename) const;\n\n  const std::vector<std::string> &InputTensorNames() const;\n  const std::vector<std::string> &OutputTensorNames() const;\n\n  std::vector<int32_t> TensorShape(const std::string &name) const;\n  int32_t TensorSizeInBytes(const std::string &name) const;\n\n  bool HasTensor(const std::string &name) const;\n\n  bool SetInputTensorData(const std::string &name, const float *p,\n                          int32_t n) const;\n\n  bool SetInputTensorData(const std::string &name, const int32_t *p,\n                          int32_t n) const;\n\n  std::vector<float> GetOutputTensorData(const std::string &name) const;\n\n  bool Run() const;\n  bool IsInitialized() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_QNN_QNN_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/utils.cc",
    "content": "// sherpa-onnx/csrc/qnn/utils.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/qnn/utils.h\"\n\n#include <math.h>\n#include <stdio.h>\n\n#include <algorithm>\n#include <functional>\n#include <numeric>\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/qnn/macros.h\"\n\n#define SHERPA_ONNX_TO_STRING(s) \\\n  case s:                        \\\n    return #s\n\nstd::string TensorTypeToString(Qnn_TensorType_t t) {\n  switch (t) {\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_APP_WRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_APP_READ);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_APP_READWRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_NATIVE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_STATIC);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_NULL);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UPDATEABLE_STATIC);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UPDATEABLE_NATIVE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UPDATEABLE_APP_WRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UPDATEABLE_APP_READ);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UPDATEABLE_APP_READWRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_OPTIONAL_APP_WRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_OPTIONAL_APP_READ);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_OPTIONAL_APP_READWRITE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSOR_TYPE_UNDEFINED);\n  }\n  return \"Unknown\";\n}\n\nstd::string QuantizationEncodingToString(Qnn_QuantizationEncoding_t q) {\n  switch (q) {\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_SCALE_OFFSET);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_AXIS_SCALE_OFFSET);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_BW_SCALE_OFFSET);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_BW_AXIS_SCALE_OFFSET);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_BLOCK);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_BLOCKWISE_EXPANSION);\n    
SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_VECTOR);\n    SHERPA_ONNX_TO_STRING(QNN_QUANTIZATION_ENCODING_UNDEFINED);\n  }\n  return \"Unknown\";\n}\n\nstd::string TensorDataTypeToString(Qnn_DataType_t t) {\n  switch (t) {\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_INT_8);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_INT_16);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_INT_32);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_INT_64);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UINT_8);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UINT_16);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UINT_32);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UINT_64);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_FLOAT_16);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_FLOAT_32);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_FLOAT_64);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_SFIXED_POINT_4);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_SFIXED_POINT_8);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_SFIXED_POINT_16);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_SFIXED_POINT_32);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UFIXED_POINT_4);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UFIXED_POINT_8);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UFIXED_POINT_16);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UFIXED_POINT_32);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_BOOL_8);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_STRING);\n    SHERPA_ONNX_TO_STRING(QNN_DATATYPE_UNDEFINED);\n  }\n  return \"unknown\";\n}\n\nstd::string TensorMemTypeToString(Qnn_TensorMemType_t t) {\n  switch (t) {\n    SHERPA_ONNX_TO_STRING(QNN_TENSORMEMTYPE_RAW);\n    SHERPA_ONNX_TO_STRING(QNN_TENSORMEMTYPE_MEMHANDLE);\n    SHERPA_ONNX_TO_STRING(QNN_TENSORMEMTYPE_RETRIEVE_RAW);\n    SHERPA_ONNX_TO_STRING(QNN_TENSORMEMTYPE_UNDEFINED);\n  }\n  return \"Unknown\";\n}\n\n#undef SHERPA_ONNX_TO_STRING\n\n// quantized = float / scale - offset;\nvoid FillData(Qnn_Tensor_t *t, const float *data, int32_t n) {\n  float scale = t->v1.quantizeParams.scaleOffsetEncoding.scale;\n  int32_t offset = 
t->v1.quantizeParams.scaleOffsetEncoding.offset;\n\n  size_t bit_width = 16;\n  double true_bit_width_max = pow(2, bit_width) - 1;\n  double encoding_min = offset * scale;\n  double encoding_max = (true_bit_width_max + offset) * scale;\n  double encoding_range = encoding_max - encoding_min;\n\n  uint16_t *out = reinterpret_cast<uint16_t *>(t->v1.clientBuf.data);\n\n  for (int32_t i = 0; i < n; ++i) {\n    int32_t quantized_value =\n        round(true_bit_width_max * (data[i] - encoding_min) / encoding_range);\n\n    if (quantized_value < 0) {\n      quantized_value = 0;\n    } else if (quantized_value > static_cast<int32_t>(true_bit_width_max)) {\n      quantized_value = static_cast<int32_t>(true_bit_width_max);\n    }\n    out[i] = static_cast<uint16_t>(quantized_value);\n  }\n}\n\nvoid FillData(Qnn_Tensor_t *t, const int32_t *data, int32_t n) {\n  int32_t *out = reinterpret_cast<int32_t *>(t->v1.clientBuf.data);\n  std::copy(data, data + n, out);\n}\n\nvoid GetData(const Qnn_Tensor_t *t, float *data, int32_t n) {\n  double scale = t->v1.quantizeParams.scaleOffsetEncoding.scale;\n  double offset = t->v1.quantizeParams.scaleOffsetEncoding.offset;\n\n  const uint16_t *p = reinterpret_cast<const uint16_t *>(t->v1.clientBuf.data);\n  for (int32_t i = 0; i < n; ++i) {\n    double quantizedValue = static_cast<double>(p[i]);\n    data[i] = (quantizedValue + offset) * scale;\n  }\n}\n\nstatic void FreeTensorV1(Qnn_Tensor_t *t) {\n  free(const_cast<char *>(t->v1.name));\n\n  delete[] t->v1.dimensions;\n}\n\nstatic void FreeTensorV2(Qnn_Tensor_t *t) {\n  free(const_cast<char *>(t->v2.name));\n\n  delete[] t->v2.dimensions;\n  delete[] t->v2.isDynamicDimensions;\n}\n\nvoid FreeTensor(Qnn_Tensor_t *t) {\n  if (t->version == QNN_TENSOR_VERSION_1) {\n    FreeTensorV1(t);\n  } else if (t->version == QNN_TENSOR_VERSION_2) {\n    FreeTensorV2(t);\n  } else {\n    SHERPA_ONNX_LOGE(\"Unknown tensor version: %d\", t->version);\n  }\n}\n\nuint32_t GetSizeInBytes(const uint32_t 
*dimensions, uint32_t n,\n                        Qnn_DataType_t type) {\n  if (n == 0) {\n    return 0;\n  }\n\n  auto count = std::accumulate(dimensions, dimensions + n, 1,\n                               std::multiplies<uint32_t>());\n\n  uint32_t b = 1;\n  switch (type) {\n    case QNN_DATATYPE_INT_8:\n      b = 1;\n      break;\n    case QNN_DATATYPE_INT_16:\n      b = 2;\n      break;\n    case QNN_DATATYPE_INT_32:\n      b = 4;\n      break;\n    case QNN_DATATYPE_INT_64:\n      b = 8;\n      break;\n    case QNN_DATATYPE_UINT_8:\n      b = 1;\n      break;\n    case QNN_DATATYPE_UINT_16:\n      b = 2;\n      break;\n    case QNN_DATATYPE_UINT_32:\n      b = 4;\n      break;\n    case QNN_DATATYPE_UINT_64:\n      b = 8;\n      break;\n    case QNN_DATATYPE_FLOAT_16:\n      b = 2;\n      break;\n    case QNN_DATATYPE_FLOAT_32:\n      b = 4;\n      break;\n    case QNN_DATATYPE_FLOAT_64:\n      b = 8;\n      break;\n    case QNN_DATATYPE_SFIXED_POINT_8:\n      b = 1;\n      break;\n    case QNN_DATATYPE_SFIXED_POINT_16:\n      b = 2;\n      break;\n    case QNN_DATATYPE_SFIXED_POINT_32:\n      b = 4;\n      break;\n    case QNN_DATATYPE_UFIXED_POINT_8:\n      b = 1;\n      break;\n    case QNN_DATATYPE_UFIXED_POINT_16:\n      b = 2;\n      break;\n    case QNN_DATATYPE_UFIXED_POINT_32:\n      b = 4;\n      break;\n    case QNN_DATATYPE_BOOL_8:\n      b = 1;\n      break;\n    default:\n      SHERPA_ONNX_LOGE(\"Unsupported data type: %s\",\n                       TensorDataTypeToString(type).c_str());\n      break;\n  }\n\n  return count * b;\n}\n\ntemplate <typename T>\nvoid CopyDimensions(const T *src, uint32_t n, T **dst) {\n  if (!src || n == 0) {\n    *dst = nullptr;\n    return;\n  }\n\n  *dst = new T[n];\n  std::copy(src, src + n, *dst);\n}\n\nstatic void CopyQuantizeParams(const Qnn_QuantizeParams_t &src,\n                               Qnn_QuantizeParams_t &dst) {  // NOLINT\n  dst.encodingDefinition = src.encodingDefinition;\n  
dst.quantizationEncoding = src.quantizationEncoding;\n\n  switch (src.quantizationEncoding) {\n    case QNN_QUANTIZATION_ENCODING_SCALE_OFFSET:\n      dst.scaleOffsetEncoding = src.scaleOffsetEncoding;\n      break;\n    case QNN_QUANTIZATION_ENCODING_UNDEFINED:\n      // do nothing in this case\n      break;\n    default:\n      SHERPA_ONNX_LOGE(\n          \"Unsupported quantizationEncoding: %s\",\n          QuantizationEncodingToString(src.quantizationEncoding).c_str());\n  }\n}\n\nstatic void CopyTensorInfoV1(const Qnn_Tensor_t &src,\n                             Qnn_Tensor_t &dst) {  // NOLINT\n  dst.version = src.version;\n  dst.v1.id = src.v1.id;\n  if (src.v1.name) {\n    dst.v1.name = strdup(src.v1.name);\n  } else {\n    dst.v1.name = strdup(\"\");\n  }\n\n  dst.v1.type = src.v1.type;\n  dst.v1.dataFormat = src.v1.dataFormat;\n  dst.v1.dataType = src.v1.dataType;\n\n  CopyQuantizeParams(src.v1.quantizeParams, dst.v1.quantizeParams);\n\n  dst.v1.rank = src.v1.rank;\n\n  CopyDimensions(src.v1.dimensions, src.v1.rank, &dst.v1.dimensions);\n\n  dst.v1.memType = src.v1.memType;\n  if (dst.v1.memType != QNN_TENSORMEMTYPE_RAW) {\n    SHERPA_ONNX_LOGE(\"Unsupported mem type: %s\",\n                     TensorMemTypeToString(dst.v1.memType).c_str());\n  } else {\n    dst.v1.clientBuf.data = nullptr;\n    dst.v1.clientBuf.dataSize =\n        GetSizeInBytes(dst.v1.dimensions, dst.v1.rank, dst.v1.dataType);\n  }\n}\n\nstatic void CopyTensorInfoV2(const Qnn_Tensor_t &src,\n                             Qnn_Tensor_t &dst) {  // NOLINT\n  dst.version = src.version;\n  dst.v2.id = src.v2.id;\n  if (src.v2.name) {\n    dst.v2.name = strdup(src.v2.name);\n  } else {\n    dst.v2.name = strdup(\"\");\n  }\n\n  dst.v2.type = src.v2.type;\n  dst.v2.dataFormat = src.v2.dataFormat;\n  dst.v2.dataType = src.v2.dataType;\n\n  CopyQuantizeParams(src.v2.quantizeParams, dst.v2.quantizeParams);\n\n  dst.v2.rank = src.v2.rank;\n\n  CopyDimensions(src.v2.dimensions, src.v2.rank, 
&dst.v2.dimensions);\n\n  dst.v2.memType = src.v2.memType;\n  if (dst.v2.memType != QNN_TENSORMEMTYPE_RAW) {\n    SHERPA_ONNX_LOGE(\"Unsupported mem type: %s\",\n                     TensorMemTypeToString(dst.v2.memType).c_str());\n  } else {\n    dst.v2.clientBuf.data = nullptr;\n    dst.v2.clientBuf.dataSize =\n        GetSizeInBytes(dst.v2.dimensions, dst.v2.rank, dst.v2.dataType);\n  }\n\n  CopyDimensions(src.v2.isDynamicDimensions, src.v2.rank,\n                 &dst.v2.isDynamicDimensions);\n\n  dst.v2.sparseParams.type = src.v2.sparseParams.type;\n  dst.v2.sparseParams.hybridCoo.numSpecifiedElements =\n      src.v2.sparseParams.hybridCoo.numSpecifiedElements;\n  dst.v2.sparseParams.hybridCoo.numSparseDimensions =\n      src.v2.sparseParams.hybridCoo.numSparseDimensions;\n  dst.v2.isProduced = src.v2.isProduced;\n}\n\nvoid CopyTensorInfo(const Qnn_Tensor_t &src, Qnn_Tensor_t &dst) {  // NOLINT\n  if (src.version == QNN_TENSOR_VERSION_1) {\n    CopyTensorInfoV1(src, dst);\n  } else if (src.version == QNN_TENSOR_VERSION_2) {\n    CopyTensorInfoV2(src, dst);\n  } else {\n    SHERPA_ONNX_LOGE(\"Unknown tensor version: %d\", src.version);\n  }\n}\n\nvoid LogCallback(const char *fmt, QnnLog_Level_t level, uint64_t timestamp,\n                 va_list args) {\n  std::string s;\n  switch (level) {\n    case QNN_LOG_LEVEL_ERROR:\n      s = \"ERROR\";\n      break;\n    case QNN_LOG_LEVEL_WARN:\n      s = \"WARN\";\n      break;\n    case QNN_LOG_LEVEL_INFO:\n      s = \"INFO\";\n      break;\n    case QNN_LOG_LEVEL_DEBUG:\n      s = \"DEBUG\";\n      break;\n    case QNN_LOG_LEVEL_VERBOSE:\n      s = \"VERBOSE\";\n      break;\n    case QNN_LOG_LEVEL_MAX:\n      s = \"UNKNOWN\";\n      break;\n  }\n\n  double ms = timestamp / 1000000.0;\n  fprintf(stdout, \"%8.1fms [%-7s] \", ms, s.c_str());\n  vfprintf(stdout, fmt, args);\n}\n\nvoid PrintTensor(Qnn_TensorV2_t t) {\n  std::ostringstream os;\n  os << \"  id: \" << t.id << \"\\n\";\n  os << \"  name: \" << t.name << 
\"\\n\";\n  os << \"  type: \" << TensorTypeToString(t.type) << \"\\n\";\n  os << \"  data format: \" << t.dataFormat << \"\\n\";\n  os << \"  data type: \" << TensorDataTypeToString(t.dataType) << \"\\n\";\n  os << \"  quantize info: \\n\";\n  auto qp = t.quantizeParams;\n  os << \"    encodingDefinition: \" << std::hex << \"0x\" << qp.encodingDefinition\n     << std::dec << \"\\n\";\n  os << \"    quantizationEncoding: \"\n     << QuantizationEncodingToString(qp.quantizationEncoding) << \"\\n\";\n  if (qp.quantizationEncoding == QNN_QUANTIZATION_ENCODING_SCALE_OFFSET) {\n    Qnn_ScaleOffset_t s = qp.scaleOffsetEncoding;\n    os << \"     scale: \" << s.scale << \"\\n\";\n    os << \"     offset: \" << s.offset << \"\\n\";\n  }\n  os << \"  rank: \" << t.rank << \"\\n\";\n  os << \"  dimensions: \";\n  for (int32_t i = 0; i < t.rank; ++i) {\n    os << t.dimensions[i] << \", \";\n    if (i + 1 == t.rank) {\n      os << \"\\n\";\n    }\n  }\n  os << \"  memType: \" << TensorMemTypeToString(t.memType) << \"\\n\";\n  if (t.memType == QNN_TENSORMEMTYPE_RAW) {\n    os << \" memType raw data size: \" << t.clientBuf.dataSize << \"\\n\";\n  }\n  os << \"  isDynamicDimensions: \"\n     << ((t.isDynamicDimensions != nullptr) ? 
\"True\" : \"False\") << \"\\n\";\n  os << \"  isProduced: \" << static_cast<int32_t>(t.isProduced) << \"\\n\";\n\n  SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n}\n\nstatic bool CopyGraphsInfoV3(const QnnSystemContext_GraphInfoV3_t *src,\n                             GraphInfo *dst) {\n  if (src->graphName) {\n    dst->graph_name = strdup(src->graphName);\n  } else {\n    dst->graph_name = strdup(\"\");\n  }\n\n  dst->input_tensors = nullptr;\n  dst->num_input_tensors = 0;\n\n  if (src->graphInputs) {\n    dst->input_tensors = reinterpret_cast<Qnn_Tensor_t *>(\n        calloc(src->numGraphInputs, sizeof(Qnn_Tensor_t)));\n\n    for (uint32_t i = 0; i < src->numGraphInputs; ++i) {\n      dst->input_tensors[i] = QNN_TENSOR_INIT;\n\n      CopyTensorInfo(src->graphInputs[i], dst->input_tensors[i]);\n    }\n\n    dst->num_input_tensors = src->numGraphInputs;\n  }\n\n  dst->output_tensors = nullptr;\n  dst->num_output_tensors = 0;\n\n  if (src->graphOutputs) {\n    dst->output_tensors = reinterpret_cast<Qnn_Tensor_t *>(\n        calloc(src->numGraphOutputs, sizeof(Qnn_Tensor_t)));\n\n    for (uint32_t i = 0; i < src->numGraphOutputs; ++i) {\n      dst->output_tensors[i] = QNN_TENSOR_INIT;\n\n      CopyTensorInfo(src->graphOutputs[i], dst->output_tensors[i]);\n    }\n\n    dst->num_output_tensors = src->numGraphOutputs;\n  }\n\n  return true;\n}\n\nstatic bool CopyGraphsInfo(const QnnSystemContext_GraphInfo_t *graphs_input,\n                           uint32_t num_graphs,\n                           GraphInfo **&graphs_info) {  // NOLINT\n  if (num_graphs == 0) {\n    SHERPA_ONNX_LOGE(\"empty graphs\");\n    graphs_info = nullptr;\n    return false;\n  }\n\n  SHERPA_ONNX_LOGE(\"version: %d\", (int)graphs_input[0].version);\n\n  // remember to free graphs_info\n  graphs_info =\n      reinterpret_cast<GraphInfo **>(calloc(num_graphs, sizeof(GraphInfo *)));\n\n  GraphInfo *graph_info_arr =\n      reinterpret_cast<GraphInfo *>(calloc(num_graphs, sizeof(GraphInfo)));\n\n  if 
(!graphs_info || !graph_info_arr) {\n    SHERPA_ONNX_LOGE(\"Failure to allocate memory for *graphInfo\");\n    return false;\n  }\n\n  for (uint32_t i = 0; i < num_graphs; ++i) {\n    switch (graphs_input[i].version) {\n      case QNN_SYSTEM_CONTEXT_GRAPH_INFO_VERSION_1:\n        SHERPA_ONNX_LOGE(\"Unsupported version: %d\",\n                         static_cast<int32_t>(graphs_input[i].version));\n        return false;\n\n      case QNN_SYSTEM_CONTEXT_GRAPH_INFO_VERSION_2:\n        SHERPA_ONNX_LOGE(\"Unsupported version: %d\",\n                         static_cast<int32_t>(graphs_input[i].version));\n        return false;\n\n      case QNN_SYSTEM_CONTEXT_GRAPH_INFO_VERSION_3: {\n        bool ok =\n            CopyGraphsInfoV3(&graphs_input[i].graphInfoV3, &graph_info_arr[i]);\n        if (!ok) {\n          SHERPA_ONNX_LOGE(\"Failed to copy graphs info v3\");\n        }\n        graphs_info[i] = graph_info_arr + i;\n\n        break;\n      }\n\n      default:\n        SHERPA_ONNX_LOGE(\"Unsupported version: %d\",\n                         static_cast<int32_t>(graphs_input[i].version));\n        return false;\n    }\n  }\n\n  return true;\n}\n\nbool CopyMetadataToGraphsInfo(const QnnSystemContext_BinaryInfo_t *binary_info,\n                              GraphInfo **&graphs_info,  // NOLINT\n                              uint32_t &graphs_count) {  // NOLINT\n  graphs_count = 0;\n\n  switch (binary_info->version) {\n    case QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_1: {\n      SHERPA_ONNX_LOGE(\"Unsupported binary context version: %d\",\n                       binary_info->version);\n      return false;\n    }\n    case QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_2: {\n      SHERPA_ONNX_LOGE(\"Unsupported binary context version: %d\",\n                       binary_info->version);\n      return false;\n    }\n    case QNN_SYSTEM_CONTEXT_BINARY_INFO_VERSION_3: {\n      bool ok = CopyGraphsInfo(binary_info->contextBinaryInfoV3.graphs,\n                               
binary_info->contextBinaryInfoV3.numGraphs,\n                               graphs_info);\n\n      if (!ok) {\n        SHERPA_ONNX_LOGE(\"Failed while copying graphs Info v3.\");\n        return false;\n      }\n      graphs_count = binary_info->contextBinaryInfoV3.numGraphs;\n      return true;\n    }\n    default: {\n      SHERPA_ONNX_LOGE(\"Unsupported binary context version: %d\",\n                       binary_info->version);\n      return false;\n    }\n  }\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn/utils.h",
    "content": "// sherpa-onnx/csrc/qnn/utils.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_QNN_UTILS_H_\n#define SHERPA_ONNX_CSRC_QNN_UTILS_H_\n#include <stdio.h>\n\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"QnnInterface.h\"\n#include \"System/QnnSystemInterface.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\ntemplate <typename T>\nstd::vector<T> ReadFile(const std::string &filename) {\n  FILE *fp = fopen(filename.c_str(), \"rb\");\n  if (!fp) {\n    SHERPA_ONNX_LOGE(\"Failed to open '%s'\", filename.c_str());\n    return {};\n  }\n\n  fseek(fp, 0, SEEK_END);\n  int32_t n = ftell(fp);\n  fseek(fp, 0, SEEK_SET);\n\n  std::vector<T> ans(n / sizeof(T));\n  fread(ans.data(), sizeof(T), ans.size(), fp);\n  fclose(fp);\n\n  return ans;\n}\n\nvoid PrintTensor(Qnn_TensorV2_t t);\n\n// float -> uint16_t\nvoid FillData(Qnn_Tensor_t *t, const float *data, int32_t n);\n\n// int32_t -> int32_t\nvoid FillData(Qnn_Tensor_t *t, const int32_t *data, int32_t n);\n\n// uint16_t -> float\nvoid GetData(const Qnn_Tensor_t *t, float *data, int32_t n);\n\nvoid FreeTensor(Qnn_Tensor_t *t);\n\nusing TensorPtr = std::unique_ptr<Qnn_Tensor_t, decltype(&FreeTensor)>;\n\nvoid CopyTensorInfo(const Qnn_Tensor_t &src, Qnn_Tensor_t &dst);  // NOLINT\n\nstd::string QuantizationEncodingToString(Qnn_QuantizationEncoding_t q);\n\nstd::string TensorDataTypeToString(Qnn_DataType_t t);\n\nusing QnnInterfaceGetProvidersFnType = Qnn_ErrorHandle_t (*)(\n    const QnnInterface_t ***provider_list, uint32_t *num_providers);\n\nusing QnnSystemInterfaceGetProvidersFnType = Qnn_ErrorHandle_t (*)(\n    const QnnSystemInterface_t ***provider_list, uint32_t *num_providers);\n\nstruct GraphInfo {\n  Qnn_GraphHandle_t graph;\n  char *graph_name;\n  Qnn_Tensor_t *input_tensors;\n  uint32_t num_input_tensors;\n  Qnn_Tensor_t *output_tensors;\n  uint32_t num_output_tensors;\n};\n\nstruct GraphConfigInfo {\n  char *graph_name;\n  const 
QnnGraph_Config_t **graph_configs;\n};\n\nusing ComposeGraphsFnHandleType = Qnn_ErrorHandle_t (*)(\n    Qnn_BackendHandle_t backend_handle, QNN_INTERFACE_VER_TYPE interface,\n    Qnn_ContextHandle_t context_handle,\n    const GraphConfigInfo **graphs_config_info,\n    const uint32_t num_graphs_config_info, GraphInfo ***graphs_info,\n    uint32_t *num_graphs_info, bool debug, QnnLog_Callback_t logCallback,\n    QnnLog_Level_t max_log_level);\n\nusing FreeGraphInfoFnHandleType =\n    Qnn_ErrorHandle_t (*)(GraphInfo ***, uint32_t num_graphs_info);\n\nvoid LogCallback(const char *fmt, QnnLog_Level_t level, uint64_t timestamp,\n                 va_list args);\n\nbool CopyMetadataToGraphsInfo(const QnnSystemContext_BinaryInfo_t *binary_info,\n                              GraphInfo **&graphs_info,  // NOLINT\n                              uint32_t &graphs_count);   // NOLINT\n#endif  // SHERPA_ONNX_CSRC_QNN_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn-config.cc",
    "content": "// sherpa-onnx/csrc/qnn-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/qnn-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid QnnConfig::Register(ParseOptions *po) {\n  po->Register(\"qnn-backend-lib\", &backend_lib,\n               \"Path to libQnnHtp.so \"\n               \"Used only when provider is qnn.\"\n               \"Leave it empty if you don't use qnn\");\n\n  po->Register(\n      \"qnn-context-binary\", &context_binary,\n      \"Path to model.bin. Used only when provider is qnn.\"\n      \"If it exists, libmodel.so is ignored.\"\n      \"If it does not exist, Context binary is saved to this path so that \"\n      \"it is loaded the next time you run it. You can leave it empty if you \"\n      \"don't use qnn\");\n\n  po->Register(\"qnn-system-lib\", &system_lib,\n               \"Required and used only when --qnn-context-binary is not empty \"\n               \"and exists. 
You can leave it empty if you don't use qnn.\");\n}\n\nbool QnnConfig::Validate() const {\n  if (backend_lib.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide path to libQnnHtp.so if you use qnn\");\n    return false;\n  }\n\n  // we don't check whether backend_lib and system_lib exist or not since\n  // dlopen() will find them by searching predefined paths\n\n  if (!context_binary.empty() && FileExists(context_binary)) {\n    if (system_lib.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Please provide --qnn-system-lib when you provide \"\n          \"--qnn-context-binary\");\n      return false;\n    }\n  }\n\n  return true;\n}\n\nstd::string QnnConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"QnnConfig(\";\n  os << \"backend_lib=\\\"\" << backend_lib << \"\\\", \";\n  os << \"context_binary=\\\"\" << context_binary << \"\\\", \";\n  os << \"system_lib=\\\"\" << system_lib << \"\\\")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/qnn-config.h",
    "content": "// sherpa-onnx/csrc/qnn-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_QNN_CONFIG_H_\n#define SHERPA_ONNX_CSRC_QNN_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct QnnConfig {\n  // Path to the backend library, e.g.,\n  // /some/path/to/libQnnHtp.so\n  std::string backend_lib;\n\n  // If it exists, you need to also provide system_lib.\n  // In this case, the model lib, i.e., libmodel.so, is ignored\n  //\n  // If it does not exist and if the user want to save the context binary,\n  // it will save it to this path.\n  std::string context_binary;\n\n  // Required and used only when context_binary exists\n  // Example value: /some/path/to/libQnnSystem.so\n  std::string system_lib;\n\n  std::string ToString() const;\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_QNN_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/regex-lang-test.cc",
    "content": "// sherpa-onnx/csrc/regex-lang-test.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include <iostream>\n#include <regex>  // NOLINT\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/text-utils.cc\"\n\nnamespace sherpa_onnx {\n\nstatic void TestLang(const std::string &expr, const std::string &text,\n                     const std::vector<std::string> &expected) {\n  auto ws = ToWideString(text);\n  std::wstring wexpr = ToWideString(expr);\n  std::wregex we(wexpr);\n\n  auto begin = std::wsregex_iterator(ws.begin(), ws.end(), we);\n  auto end = std::wsregex_iterator();\n  int32_t k = 0;\n  for (std::wsregex_iterator i = begin; i != end; ++i) {\n    std::wsmatch match = *i;\n    std::wstring match_str = match.str();\n    auto ms = ToString(match_str);\n    std::cout << ms << \"\\n\";\n    EXPECT_EQ(ms, expected[k]);\n    k++;\n  }\n  EXPECT_EQ(k, expected.size());\n}\n\nTEST(German, Case1) {\n  std::cout << \"----------Test German----------\";\n  // see https://character-table.netlify.app/german/\n  std::string expr =\n      \"([\\\\u0020-\\\\u005f\\\\u0061-\"\n      \"\\\\u007d\\\\u00a0\\\\u00a7\\\\u00a9\\\\u00ab\\\\u00bb\\\\u00c4\\\\u00d6\\\\u00dc\\\\u00df\\\\\"\n      \"u00e4\\\\u00f6\\\\u00fc\\\\u2010-\\\\u2011\\\\u2013-\"\n      \"\\\\u2014\\\\u2018\\\\u201a\\\\u201c\\\\u201e\\\\u2026\\\\u2030\\\\u20ac]+)\";\n\n  std::string text =\n      \"开始Übeltäter übergibt Ärzten 中间öfters äußerst ätzende Öle结束3€\";\n\n  std::vector<std::string> expected = {\"Übeltäter übergibt Ärzten \",\n                                       \"öfters äußerst ätzende Öle\", \"3€\"};\n\n  TestLang(expr, text, expected);\n}\n\nTEST(French, Case1) {\n  std::string expr =\n      \"([\\\\u0020-\\\\u005f\\\\u0061-\"\n      \"\\\\u007a\\\\u007c\\\\u00a0\\\\u00a7\\\\u00a9\\\\u00ab\\\\u00b2-\"\n      \"\\\\u00b3\\\\u00bb\\\\u00c0\\\\u00c2\\\\u00c6-\\\\u00cb\\\\u00ce-\"\n      
\"\\\\u00cf\\\\u00d4\\\\u00d9\\\\u00db-\\\\u00dc\\\\u00e0\\\\u00e2\\\\u00e6-\"\n      \"\\\\u00eb\\\\u00ee-\\\\u00ef\\\\u00f4\\\\u00f9\\\\u00fb-\\\\u00fc\\\\u00ff\\\\u0152-\"\n      \"\\\\u0153\\\\u0178\\\\u02b3\\\\u02e2\\\\u1d48-\\\\u1d49\\\\u2010-\\\\u2011\\\\u2013-\"\n      \"\\\\u2014\\\\u2019\\\\u201c-\\\\u201d\\\\u2020-\\\\u2021\\\\u2026\\\\u202f-\"\n      \"\\\\u2030\\\\u20ac\\\\u2212]+)\";\n  std::string text =\n      \"L'été, 一avec son ciel bleuâtre, 二est un moment où, 三Noël, maçon\";\n  std::vector<std::string> expected = {\n      \"L'été, \",\n      \"avec son ciel bleuâtre, \",\n      \"est un moment où, \",\n      \"Noël, maçon\",\n  };\n  TestLang(expr, text, expected);\n}\n\nTEST(English, Case1) {\n  // https://character-table.netlify.app/english/\n  std::string expr =\n      \"([\\\\u0020-\\\\u005f\\\\u0061-\\\\u007a\\\\u007c\\\\u00a0\\\\u00a7\\\\u00a9\\\\u2010-\"\n      \"\\\\u2011\\\\u2013-\\\\u2014\\\\u2018-\\\\u2019\\\\u201c-\\\\u201d\\\\u2020-\"\n      \"\\\\u2021\\\\u2026\\\\u2030\\\\u2032-\\\\u2033\\\\u20ac]+)\";\n  std::string text = \"一how are you doing? 二Thank you!\";\n\n  std::vector<std::string> expected = {\n      \"how are you doing? \",\n      \"Thank you!\",\n  };\n  TestLang(expr, text, expected);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/resample.cc",
    "content": "/**\n * Copyright     2013  Pegah Ghahremani\n *               2014  IMSL, PKU-HKUST (author: Wei Shi)\n *               2014  Yanqing Sun, Junjie Wang\n *               2014  Johns Hopkins University (author: Daniel Povey)\n * Copyright     2023  Xiaomi Corporation (authors: Fangjun Kuang)\n *\n * See LICENSE for clarification regarding multiple authors\n *\n * Licensed under the Apache License, Version 2.0 (the \"License\");\n * you may not use this file except in compliance with the License.\n * You may obtain a copy of the License at\n *\n *     http://www.apache.org/licenses/LICENSE-2.0\n *\n * Unless required by applicable law or agreed to in writing, software\n * distributed under the License is distributed on an \"AS IS\" BASIS,\n * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n * See the License for the specific language governing permissions and\n * limitations under the License.\n */\n// this file is copied and modified from\n// kaldi/src/feat/resample.cc\n\n#include \"sherpa-onnx/csrc/resample.h\"\n\n#include <cassert>\n#include <cmath>\n#include <cstdio>\n#include <cstdlib>\n#include <type_traits>\n#include <vector>\n\n#ifndef M_2PI\n#define M_2PI 6.283185307179586476925286766559005\n#endif\n\n#ifndef M_PI\n#define M_PI 3.1415926535897932384626433832795\n#endif\n\nnamespace sherpa_onnx {\n\ntemplate <class I>\nstatic I Gcd(I m, I n) {\n  // this function is copied from kaldi/src/base/kaldi-math.h\n  if (m == 0 || n == 0) {\n    if (m == 0 && n == 0) {  // gcd not defined, as all integers are divisors.\n      fprintf(stderr, \"Undefined GCD since m = 0, n = 0.\\n\");\n      exit(-1);\n    }\n    return (m == 0 ? (n > 0 ? n : -n) : (m > 0 ? m : -m));\n    // return absolute value of whichever is nonzero\n  }\n  // could use compile-time assertion\n  // but involves messing with complex template stuff.\n  static_assert(std::is_integral_v<I>);\n  while (true) {\n    m %= n;\n    if (m == 0) return (n > 0 ? 
n : -n);\n    n %= m;\n    if (n == 0) return (m > 0 ? m : -m);\n  }\n}\n\n/// Returns the least common multiple of two integers.  Will\n/// crash unless the inputs are positive.\ntemplate <class I>\nstatic I Lcm(I m, I n) {\n  // This function is copied from kaldi/src/base/kaldi-math.h\n  assert(m > 0 && n > 0);\n  I gcd = Gcd(m, n);\n  return gcd * (m / gcd) * (n / gcd);\n}\n\nstatic float DotProduct(const float *a, const float *b, int32_t n) {\n  float sum = 0;\n  for (int32_t i = 0; i != n; ++i) {\n    sum += a[i] * b[i];\n  }\n  return sum;\n}\n\nLinearResample::LinearResample(int32_t samp_rate_in_hz,\n                               int32_t samp_rate_out_hz, float filter_cutoff_hz,\n                               int32_t num_zeros)\n    : samp_rate_in_(samp_rate_in_hz),\n      samp_rate_out_(samp_rate_out_hz),\n      filter_cutoff_(filter_cutoff_hz),\n      num_zeros_(num_zeros) {\n  assert(samp_rate_in_hz > 0.0 && samp_rate_out_hz > 0.0 &&\n         filter_cutoff_hz > 0.0 && filter_cutoff_hz * 2 <= samp_rate_in_hz &&\n         filter_cutoff_hz * 2 <= samp_rate_out_hz && num_zeros > 0);\n\n  // base_freq is the frequency of the repeating unit, which is the gcd\n  // of the input frequencies.\n  int32_t base_freq = Gcd(samp_rate_in_, samp_rate_out_);\n  input_samples_in_unit_ = samp_rate_in_ / base_freq;\n  output_samples_in_unit_ = samp_rate_out_ / base_freq;\n\n  SetIndexesAndWeights();\n  Reset();\n}\n\nvoid LinearResample::SetIndexesAndWeights() {\n  first_index_.resize(output_samples_in_unit_);\n  weights_.resize(output_samples_in_unit_);\n\n  double window_width = num_zeros_ / (2.0 * filter_cutoff_);\n\n  for (int32_t i = 0; i < output_samples_in_unit_; i++) {\n    double output_t = i / static_cast<double>(samp_rate_out_);\n    double min_t = output_t - window_width, max_t = output_t + window_width;\n    // we do ceil on the min and floor on the max, because if we did it\n    // the other way around we would unnecessarily include indexes just\n    // 
outside the window, with zero coefficients.  It's possible\n    // if the arguments to the ceil and floor expressions are integers\n    // (e.g. if filter_cutoff_ has an exact ratio with the sample rates),\n    // that we unnecessarily include something with a zero coefficient,\n    // but this is only a slight efficiency issue.\n    int32_t min_input_index = ceil(min_t * samp_rate_in_),\n            max_input_index = floor(max_t * samp_rate_in_),\n            num_indices = max_input_index - min_input_index + 1;\n    first_index_[i] = min_input_index;\n    weights_[i].resize(num_indices);\n    for (int32_t j = 0; j < num_indices; j++) {\n      int32_t input_index = min_input_index + j;\n      double input_t = input_index / static_cast<double>(samp_rate_in_),\n             delta_t = input_t - output_t;\n      // sign of delta_t doesn't matter.\n      weights_[i][j] = FilterFunc(delta_t) / samp_rate_in_;\n    }\n  }\n}\n\n/** Here, t is a time in seconds representing an offset from\n    the center of the windowed filter function, and FilterFunction(t)\n    returns the windowed filter function, described\n    in the header as h(t) = f(t)g(t), evaluated at t.\n*/\nfloat LinearResample::FilterFunc(float t) const {\n  float window = 0,  // raised-cosine (Hanning) window of width\n                     // num_zeros_/2*filter_cutoff_\n      filter = 0;    // sinc filter function\n  if (std::fabs(t) < num_zeros_ / (2.0 * filter_cutoff_))\n    window = 0.5 * (1 + cos(M_2PI * filter_cutoff_ / num_zeros_ * t));\n  else\n    window = 0.0;  // outside support of window function\n  if (t != 0)\n    filter = sin(M_2PI * filter_cutoff_ * t) / (M_PI * t);\n  else\n    filter = 2 * filter_cutoff_;  // limit of the function at t = 0\n  return filter * window;\n}\n\nvoid LinearResample::Reset() {\n  input_sample_offset_ = 0;\n  output_sample_offset_ = 0;\n  input_remainder_.resize(0);\n}\n\nvoid LinearResample::Resample(const float *input, int32_t input_dim, bool flush,\n                
              std::vector<float> *output) {\n  int64_t tot_input_samp = input_sample_offset_ + input_dim,\n          tot_output_samp = GetNumOutputSamples(tot_input_samp, flush);\n\n  assert(tot_output_samp >= output_sample_offset_);\n\n  output->resize(tot_output_samp - output_sample_offset_);\n\n  // samp_out is the index into the total output signal, not just the part\n  // of it we are producing here.\n  for (int64_t samp_out = output_sample_offset_; samp_out < tot_output_samp;\n       samp_out++) {\n    int64_t first_samp_in = 0;\n    int32_t samp_out_wrapped = 0;\n    GetIndexes(samp_out, &first_samp_in, &samp_out_wrapped);\n    const std::vector<float> &weights = weights_[samp_out_wrapped];\n    // first_input_index is the first index into \"input\" that we have a weight\n    // for.\n    int32_t first_input_index =\n        static_cast<int32_t>(first_samp_in - input_sample_offset_);\n    float this_output = 0;\n    if (first_input_index >= 0 &&\n        first_input_index + static_cast<int32_t>(weights.size()) <= input_dim) {\n      this_output =\n          DotProduct(input + first_input_index, weights.data(), weights.size());\n    } else {  // Handle edge cases.\n      this_output = 0.0;\n      for (int32_t i = 0; i < static_cast<int32_t>(weights.size()); i++) {\n        float weight = weights[i];\n        int32_t input_index = first_input_index + i;\n        if (input_index < 0 &&\n            static_cast<int32_t>(input_remainder_.size()) + input_index >= 0) {\n          this_output +=\n              weight * input_remainder_[input_remainder_.size() + input_index];\n        } else if (input_index >= 0 && input_index < input_dim) {\n          this_output += weight * input[input_index];\n        } else if (input_index >= input_dim) {\n          // We're past the end of the input and are adding zero; should only\n          // happen if the user specified flush == true, or else we would not\n          // be trying to output this sample.\n          
assert(flush);\n        }\n      }\n    }\n    int32_t output_index =\n        static_cast<int32_t>(samp_out - output_sample_offset_);\n    (*output)[output_index] = this_output;\n  }\n\n  if (flush) {\n    Reset();  // Reset the internal state.\n  } else {\n    SetRemainder(input, input_dim);\n    input_sample_offset_ = tot_input_samp;\n    output_sample_offset_ = tot_output_samp;\n  }\n}\n\nint64_t LinearResample::GetNumOutputSamples(int64_t input_num_samp,\n                                            bool flush) const {\n  // For exact computation, we measure time in \"ticks\" of 1.0 / tick_freq,\n  // where tick_freq is the least common multiple of samp_rate_in_ and\n  // samp_rate_out_.\n  int32_t tick_freq = Lcm(samp_rate_in_, samp_rate_out_);\n  int32_t ticks_per_input_period = tick_freq / samp_rate_in_;\n\n  // work out the number of ticks in the time interval\n  // [ 0, input_num_samp/samp_rate_in_ ).\n  int64_t interval_length_in_ticks = input_num_samp * ticks_per_input_period;\n  if (!flush) {\n    float window_width = num_zeros_ / (2.0 * filter_cutoff_);\n    // To count the window-width in ticks we take the floor.  
This\n    // is because since we're looking for the largest integer num-out-samp\n    // that fits in the interval, which is open on the right, a reduction\n    // in interval length of less than a tick will never make a difference.\n    // For example, the largest integer in the interval [ 0, 2 ) and the\n    // largest integer in the interval [ 0, 2 - 0.9 ) are the same (both one).\n    // So when we're subtracting the window-width we can ignore the fractional\n    // part.\n    int32_t window_width_ticks = std::floor(window_width * tick_freq);\n    // The time-period of the output that we can sample gets reduced\n    // by the window-width (which is actually the distance from the\n    // center to the edge of the windowing function) if we're not\n    // \"flushing the output\".\n    interval_length_in_ticks -= window_width_ticks;\n  }\n  if (interval_length_in_ticks <= 0) return 0;\n\n  int32_t ticks_per_output_period = tick_freq / samp_rate_out_;\n  // Get the last output-sample in the closed interval, i.e. replacing [ ) with\n  // [ ].  Note: integer division rounds down.  See\n  // http://en.wikipedia.org/wiki/Interval_(mathematics) for an explanation of\n  // the notation.\n  int64_t last_output_samp = interval_length_in_ticks / ticks_per_output_period;\n  // We need the last output-sample in the open interval, so if it takes us to\n  // the end of the interval exactly, subtract one.\n  if (last_output_samp * ticks_per_output_period == interval_length_in_ticks)\n    last_output_samp--;\n\n  // First output-sample index is zero, so the number of output samples\n  // is the last output-sample plus one.\n  int64_t num_output_samp = last_output_samp + 1;\n  return num_output_samp;\n}\n\n// inline\nvoid LinearResample::GetIndexes(int64_t samp_out, int64_t *first_samp_in,\n                                int32_t *samp_out_wrapped) const {\n  // A unit is the smallest nonzero amount of time that is an exact\n  // multiple of the input and output sample periods.  
The unit index\n  // is the answer to \"which numbered unit we are in\".\n  int64_t unit_index = samp_out / output_samples_in_unit_;\n  // samp_out_wrapped is equal to samp_out % output_samples_in_unit_\n  *samp_out_wrapped =\n      static_cast<int32_t>(samp_out - unit_index * output_samples_in_unit_);\n  *first_samp_in =\n      first_index_[*samp_out_wrapped] + unit_index * input_samples_in_unit_;\n}\n\nvoid LinearResample::SetRemainder(const float *input, int32_t input_dim) {\n  std::vector<float> old_remainder(input_remainder_);\n  // max_remainder_needed is the width of the filter from side to side,\n  // measured in input samples.  you might think it should be half that,\n  // but you have to consider that you might be wanting to output samples\n  // that are \"in the past\" relative to the beginning of the latest\n  // input... anyway, storing more remainder than needed is not harmful.\n  int32_t max_remainder_needed =\n      std::ceil(samp_rate_in_ * num_zeros_ / filter_cutoff_);\n  input_remainder_.resize(max_remainder_needed);\n  for (int32_t index = -static_cast<int32_t>(input_remainder_.size());\n       index < 0; index++) {\n    // we interpret \"index\" as an offset from the end of \"input\" and\n    // from the end of input_remainder_.\n    int32_t input_index = index + input_dim;\n    if (input_index >= 0) {\n      input_remainder_[index + static_cast<int32_t>(input_remainder_.size())] =\n          input[input_index];\n    } else if (input_index + static_cast<int32_t>(old_remainder.size()) >= 0) {\n      input_remainder_[index + static_cast<int32_t>(input_remainder_.size())] =\n          old_remainder[input_index +\n                        static_cast<int32_t>(old_remainder.size())];\n      // else leave it at zero.\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/resample.h",
    "content": "/**\n * Copyright     2013  Pegah Ghahremani\n *               2014  IMSL, PKU-HKUST (author: Wei Shi)\n *               2014  Yanqing Sun, Junjie Wang\n *               2014  Johns Hopkins University (author: Daniel Povey)\n * Copyright     2023  Xiaomi Corporation (authors: Fangjun Kuang)\n *\n * See LICENSE for clarification regarding multiple authors\n *\n * Licensed under the Apache License, Version 2.0 (the \"License\");\n * you may not use this file except in compliance with the License.\n * You may obtain a copy of the License at\n *\n *     http://www.apache.org/licenses/LICENSE-2.0\n *\n * Unless required by applicable law or agreed to in writing, software\n * distributed under the License is distributed on an \"AS IS\" BASIS,\n * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n * See the License for the specific language governing permissions and\n * limitations under the License.\n */\n// this file is copied and modified from\n// kaldi/src/feat/resample.h\n#ifndef SHERPA_ONNX_CSRC_RESAMPLE_H_\n#define SHERPA_ONNX_CSRC_RESAMPLE_H_\n\n#include <cstdint>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n/*\n   We require that the input and output sampling rate be specified as\n   integers, as this is an easy way to specify that their ratio be rational.\n*/\n\nclass LinearResample {\n public:\n  /// Constructor.  We make the input and output sample rates integers, because\n  /// we are going to need to find a common divisor.  This should just remind\n  /// you that they need to be integers.  The filter cutoff needs to be less\n  /// than samp_rate_in_hz/2 and less than samp_rate_out_hz/2.  
num_zeros\n  /// controls the sharpness of the filter, more == sharper but less efficient.\n  /// We suggest around 4 to 10 for normal use.\n  LinearResample(int32_t samp_rate_in_hz, int32_t samp_rate_out_hz,\n                 float filter_cutoff_hz, int32_t num_zeros);\n\n  /// Calling the function Reset() resets the state of the object prior to\n  /// processing a new signal; it is only necessary if you have called\n  /// Resample(x, x_size, false, y) for some signal, leading to a remainder of\n  /// the signal being stored, but then abandoned processing the signal before\n  /// calling Resample(x, x_size, true, y) for the last piece.  Calling it\n  /// unnecessarily between signals will not do any harm.\n  void Reset();\n\n  /// This function does the resampling.  If you call it with flush == true and\n  /// you have never called it with flush == false, it just resamples the input\n  /// signal (it resizes the output to a suitable number of samples).\n  ///\n  /// You can also use this function to process a signal a piece at a time.\n  /// Suppose you break it into piece1, piece2, ... pieceN.  
You can call\n  /// \\code{.cc}\n  /// Resample(piece1, piece1_size, false, &output1);\n  /// Resample(piece2, piece2_size, false, &output2);\n  /// Resample(piece3, piece3_size, true, &output3);\n  /// \\endcode\n  /// If you call it with flush == false, it won't output the last few samples\n  /// but will remember them, so that if you later give it a second piece of\n  /// the input signal it can process it correctly.\n  /// If your most recent call to the object was with flush == false, it will\n  /// have internal state; you can remove this by calling Reset().\n  /// Empty input is acceptable.\n  void Resample(const float *input, int32_t input_dim, bool flush,\n                std::vector<float> *output);\n\n  //// Return the input and output sampling rates (for checks, for example)\n  int32_t GetInputSamplingRate() const { return samp_rate_in_; }\n  int32_t GetOutputSamplingRate() const { return samp_rate_out_; }\n\n private:\n  void SetIndexesAndWeights();\n\n  float FilterFunc(float) const;\n\n  /// This function outputs the number of output samples we will output\n  /// for a signal with \"input_num_samp\" input samples.  If flush == true,\n  /// we return the largest n such that\n  /// (n/samp_rate_out_) is in the interval [ 0, input_num_samp/samp_rate_in_ ),\n  /// and note that the interval is half-open.  
If flush == false,\n  /// define window_width as num_zeros / (2.0 * filter_cutoff_);\n  /// we return the largest n such that (n/samp_rate_out_) is in the interval\n  /// [ 0, input_num_samp/samp_rate_in_ - window_width ).\n  int64_t GetNumOutputSamples(int64_t input_num_samp, bool flush) const;\n\n  /// Given an output-sample index, this function outputs to *first_samp_in the\n  /// first input-sample index that we have a weight on (may be negative),\n  /// and to *samp_out_wrapped the index into weights_ where we can get the\n  /// corresponding weights on the input.\n  inline void GetIndexes(int64_t samp_out, int64_t *first_samp_in,\n                         int32_t *samp_out_wrapped) const;\n\n  void SetRemainder(const float *input, int32_t input_dim);\n\n private:\n  // The following variables are provided by the user.\n  int32_t samp_rate_in_;\n  int32_t samp_rate_out_;\n  float filter_cutoff_;\n  int32_t num_zeros_;\n\n  int32_t input_samples_in_unit_;  ///< The number of input samples in the\n                                   ///< smallest repeating unit: num_samp_in_ =\n                                   ///< samp_rate_in_hz / Gcd(samp_rate_in_hz,\n                                   ///< samp_rate_out_hz)\n\n  int32_t output_samples_in_unit_;  ///< The number of output samples in the\n                                    ///< smallest repeating unit: num_samp_out_\n                                    ///< = samp_rate_out_hz /\n                                    ///< Gcd(samp_rate_in_hz, samp_rate_out_hz)\n\n  /// The first input-sample index that we sum over, for this output-sample\n  /// index.  May be negative; any truncation at the beginning is handled\n  /// separately.  
This is just for the first few output samples, but we can\n  /// extrapolate the correct input-sample index for arbitrary output samples.\n  std::vector<int32_t> first_index_;\n\n  /// Weights on the input samples, for this output-sample index.\n  std::vector<std::vector<float>> weights_;\n\n  // the following variables keep track of where we are in a particular signal,\n  // if it is being provided over multiple calls to Resample().\n\n  int64_t input_sample_offset_ = 0;   ///< The number of input samples we have\n                                      ///< already received for this signal\n                                      ///< (including anything in remainder_)\n  int64_t output_sample_offset_ = 0;  ///< The number of samples we have already\n                                      ///< output for this signal.\n  std::vector<float> input_remainder_;  ///< A small trailing part of the\n                                        ///< previously seen input signal.\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RESAMPLE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.cc\n//\n// Copyright      2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\"\n\n#include <condition_variable>\n#include <mutex>\n#include <queue>\n\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n\nnamespace sherpa_onnx {\n\nclass ContextBlockingQueueRknn::Impl {\n public:\n  Impl(rknn_context context, int32_t num_threads, int32_t capacity) {\n    for (int32_t i = 0; i < capacity; ++i) {\n      rknn_context bak = 0;\n      auto ret = rknn_dup_context(&context, &bak);\n      SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to duplicate context\");\n\n      SetCoreMask(bak, num_threads);\n      queue_.push(bak);\n    }\n  }\n  rknn_context Take() {\n    std::unique_lock<std::mutex> lock(mutex_);\n\n    cv_.wait(lock, [&] { return stopped_ || !queue_.empty(); });\n\n    if (stopped_ && queue_.empty()) {\n      return 0;\n    }\n\n    rknn_context ctx = queue_.front();\n    queue_.pop();\n    return ctx;\n  }\n\n  void Put(rknn_context ctx) {\n    {\n      std::lock_guard<std::mutex> lock(mutex_);\n      if (stopped_) {\n        rknn_destroy(ctx);\n        return;\n      }\n      queue_.push(ctx);\n    }\n    cv_.notify_one();\n  }\n\n  ~Impl() {\n    {\n      std::lock_guard<std::mutex> lock(mutex_);\n      stopped_ = true;\n    }\n    cv_.notify_all();\n    Cleanup();\n  }\n\n private:\n  void Cleanup() {\n    while (!queue_.empty()) {\n      rknn_destroy(queue_.front());\n      queue_.pop();\n    }\n  }\n\n  std::queue<rknn_context> queue_;\n  std::mutex mutex_;\n  std::condition_variable cv_;\n  bool stopped_ = false;\n};\n\nContextBlockingQueueRknn::ContextBlockingQueueRknn(rknn_context context,\n                                                   int32_t num_threads,\n                                                   int32_t capacity /*= 10*/)\n    : impl_(std::make_unique<Impl>(context, num_threads, capacity)) 
{}\n\nContextBlockingQueueRknn::~ContextBlockingQueueRknn() = default;\n\nrknn_context ContextBlockingQueueRknn::Take() { return impl_->Take(); }\n\nvoid ContextBlockingQueueRknn::Put(rknn_context context) {\n  impl_->Put(context);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\n//\n// Copyright      2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_CONTEXT_BLOCKING_QUEUE_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_CONTEXT_BLOCKING_QUEUE_RKNN_H_\n\n#include <memory>\n\n#include \"rknn_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nclass ContextBlockingQueueRknn {\n public:\n  ContextBlockingQueueRknn(rknn_context context, int32_t num_threads,\n                           int32_t capacity = 10);\n  ~ContextBlockingQueueRknn();\n\n  rknn_context Take();\n  void Put(rknn_context context);\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_CONTEXT_BLOCKING_QUEUE_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/keyword-spotter-transducer-rknn-impl.h",
    "content": "// sherpa-onnx/csrc/rknn/keyword-spotter-transducer-rknn-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_KEYWORD_SPOTTER_TRANSDUCER_RKNN_IMPL_H_\n#define SHERPA_ONNX_CSRC_RKNN_KEYWORD_SPOTTER_TRANSDUCER_RKNN_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <regex>  // NOLINT\n#include <string>\n#include <sstream>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter-impl.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/online-stream-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/transducer-keyword-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n\nnamespace sherpa_onnx {\n\nKeywordResult Convert(const TransducerKeywordResult &src,\n                      const SymbolTable &sym_table, float frame_shift_ms,\n                      int32_t subsampling_factor, int32_t frames_since_start);\n\nclass KeywordSpotterTransducerRknnImpl : public KeywordSpotterImpl {\n public:\n  explicit KeywordSpotterTransducerRknnImpl(const KeywordSpotterConfig &config)\n      : config_(config),\n        model_(std::make_unique<OnlineZipformerTransducerModelRknn>(\n            config.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    if (config.keywords_buf.empty()) {\n      InitKeywords();\n    } else {\n      InitKeywordsFromBufStr();\n    }\n\n    decoder_ = std::make_unique<TransducerKeywordDecoderRknn>(\n        
model_.get(), config_.max_active_paths, config_.num_trailing_blanks,\n        unk_id_);\n  }\n\n  template <typename Manager>\n  KeywordSpotterTransducerRknnImpl(Manager *mgr,\n                                   const KeywordSpotterConfig &config)\n      : config_(config),\n        model_(std::make_unique<OnlineZipformerTransducerModelRknn>(\n            mgr, config.model_config)),\n        sym_(mgr, config.model_config.tokens) {\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    InitKeywords(mgr);\n\n    decoder_ = std::make_unique<TransducerKeywordDecoderRknn>(\n        model_.get(), config_.max_active_paths, config_.num_trailing_blanks,\n        unk_id_);\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStreamRknn>(config_.feat_config,\n                                                     keywords_graph_);\n\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &keywords) const override {\n    auto kws = std::regex_replace(keywords, std::regex(\"/\"), \"\\n\");\n    std::istringstream is(kws);\n\n    std::vector<std::vector<int32_t>> current_ids;\n    std::vector<std::string> current_kws;\n    std::vector<float> current_scores;\n    std::vector<float> current_thresholds;\n\n    if (!EncodeKeywords(is, sym_, &current_ids, &current_kws, &current_scores,\n                        &current_thresholds)) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Encode keywords %{public}s failed.\", keywords.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Encode keywords %s failed.\", keywords.c_str());\n#endif\n      return nullptr;\n    }\n\n    int32_t num_kws = current_ids.size();\n    int32_t num_default_kws = keywords_id_.size();\n\n    current_ids.insert(current_ids.end(), keywords_id_.begin(),\n                       keywords_id_.end());\n\n    if (!current_kws.empty() && !keywords_.empty()) {\n      
current_kws.insert(current_kws.end(), keywords_.begin(), keywords_.end());\n    } else if (!current_kws.empty() && keywords_.empty()) {\n      current_kws.insert(current_kws.end(), num_default_kws, std::string());\n    } else if (current_kws.empty() && !keywords_.empty()) {\n      current_kws.insert(current_kws.end(), num_kws, std::string());\n      current_kws.insert(current_kws.end(), keywords_.begin(), keywords_.end());\n    } else {\n      // Do nothing.\n    }\n\n    if (!current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else if (!current_scores.empty() && boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_default_kws,\n                            config_.keywords_score);\n    } else if (current_scores.empty() && !boost_scores_.empty()) {\n      current_scores.insert(current_scores.end(), num_kws,\n                            config_.keywords_score);\n      current_scores.insert(current_scores.end(), boost_scores_.begin(),\n                            boost_scores_.end());\n    } else {\n      // Do nothing.\n    }\n\n    if (!current_thresholds.empty() && !thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), thresholds_.begin(),\n                                thresholds_.end());\n    } else if (!current_thresholds.empty() && thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), num_default_kws,\n                                config_.keywords_threshold);\n    } else if (current_thresholds.empty() && !thresholds_.empty()) {\n      current_thresholds.insert(current_thresholds.end(), num_kws,\n                                config_.keywords_threshold);\n      current_thresholds.insert(current_thresholds.end(), thresholds_.begin(),\n                                thresholds_.end());\n    } else {\n      // Do nothing.\n    }\n\n    auto 
keywords_graph = std::make_shared<ContextGraph>(\n        current_ids, config_.keywords_score, config_.keywords_threshold,\n        current_scores, current_kws, current_thresholds);\n\n    auto stream =\n        std::make_unique<OnlineStreamRknn>(config_.feat_config, keywords_graph);\n    InitOnlineStream(stream.get());\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n\n  void Reset(OnlineStream *s) const override {\n    InitOnlineStream(reinterpret_cast<OnlineStreamRknn *>(s));\n  }\n\n  void DecodeStream(OnlineStreamRknn *s) const {\n    auto r = s->GetKeywordResult(true);\n    int32_t num_trailing_blanks = r.num_trailing_blanks;\n    // assume subsampling_factor is 4\n    // assume frameshift is 0.01 second\n    float trailing_silence = num_trailing_blanks * 4 * 0.01;\n\n    // it resets automatically after detecting 1.5 seconds of silence\n    float threshold = 1.5;\n    if (trailing_silence > threshold) {\n      Reset(s);\n    }\n\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feature_dim = s->FeatureDim();\n\n    const auto num_processed_frames = s->GetNumProcessedFrames();\n\n    std::vector<float> features =\n        s->GetFrames(num_processed_frames, chunk_size);\n    s->GetNumProcessedFrames() += chunk_shift;\n\n    auto &states = s->GetZipformerEncoderStates();\n\n    auto p = model_->RunEncoder(features, std::move(states));\n\n    states = std::move(p.second);\n\n    decoder_->Decode(std::move(p.first), s);\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      DecodeStream(reinterpret_cast<OnlineStreamRknn *>(ss[i]));\n    }\n  }\n\n  KeywordResult GetResult(OnlineStream *s) const override {\n    TransducerKeywordResult decoder_result = s->GetKeywordResult(true);\n\n    // TODO(fangjun): Remember 
to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    return Convert(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                   s->GetNumFramesSinceStart());\n  }\n\n private:\n  void InitKeywords(std::istream &is) {\n    if (!EncodeKeywords(is, sym_, &keywords_id_, &keywords_, &boost_scores_,\n                        &thresholds_)) {\n      SHERPA_ONNX_LOGE(\"Encode keywords failed.\");\n      exit(-1);\n    }\n    keywords_graph_ = std::make_shared<ContextGraph>(\n        keywords_id_, config_.keywords_score, config_.keywords_threshold,\n        boost_scores_, keywords_, thresholds_);\n  }\n\n  void InitKeywords() {\n#ifdef SHERPA_ONNX_ENABLE_WASM_KWS\n    // Due to the limitations of the wasm file system,\n    // the keyword_file variable is directly parsed as a string of keywords\n    // if WASM KWS on\n    std::istringstream is(config_.keywords_file);\n    InitKeywords(is);\n#else\n    // each line in keywords_file contains space-separated words\n    std::ifstream is(config_.keywords_file);\n    if (!is) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: %{public}s\",\n                       config_.keywords_file.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: %s\",\n                       config_.keywords_file.c_str());\n#endif\n      exit(-1);\n    }\n    InitKeywords(is);\n#endif\n  }\n\n  template <typename Manager>\n  void InitKeywords(Manager *mgr) {\n    // each line in keywords_file contains space-separated words\n\n    auto buf = ReadFile(mgr, config_.keywords_file);\n\n    std::istringstream is(std::string(buf.data(), buf.size()));\n\n    if (!is) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: %{public}s\",\n                       config_.keywords_file.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Open keywords file failed: %s\",\n                       config_.keywords_file.c_str());\n#endif\n      exit(-1);\n    
}\n    InitKeywords(is);\n  }\n\n  void InitKeywordsFromBufStr() {\n    // The content of keywords_buf is expected to match that of keywords_file.\n    std::istringstream is(config_.keywords_buf);\n    InitKeywords(is);\n  }\n\n  void InitOnlineStream(OnlineStreamRknn *stream) const {\n    auto r = decoder_->GetEmptyResult();\n    SHERPA_ONNX_CHECK_EQ(r.hyps.Size(), 1);\n\n    SHERPA_ONNX_CHECK(stream->GetContextGraph() != nullptr);\n    r.hyps.begin()->second.context_state = stream->GetContextGraph()->Root();\n\n    stream->SetKeywordResult(r);\n    stream->SetZipformerEncoderStates(model_->GetEncoderInitStates());\n  }\n\n private:\n  KeywordSpotterConfig config_;\n  std::vector<std::vector<int32_t>> keywords_id_;\n  std::vector<float> boost_scores_;\n  std::vector<float> thresholds_;\n  std::vector<std::string> keywords_;\n  ContextGraphPtr keywords_graph_;\n  std::unique_ptr<OnlineZipformerTransducerModelRknn> model_;\n\n  std::unique_ptr<TransducerKeywordDecoderRknn> decoder_;\n  SymbolTable sym_;\n  int32_t unk_id_ = -1;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_KEYWORD_SPOTTER_TRANSDUCER_RKNN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/macros.h",
    "content": "// sherpa-onnx/csrc/rknn/macros.h\n//\n// Copyright      2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_MACROS_H_\n#define SHERPA_ONNX_CSRC_RKNN_MACROS_H_\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\n#define SHERPA_ONNX_RKNN_CHECK(ret, msg, ...)      \\\n  do {                                             \\\n    if (ret != RKNN_SUCC) {                        \\\n      SHERPA_ONNX_LOGE(\"Return code is: %d\", ret); \\\n      SHERPA_ONNX_LOGE(msg, ##__VA_ARGS__);        \\\n      SHERPA_ONNX_EXIT(-1);                        \\\n    }                                              \\\n  } while (0)\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_MACROS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nOfflineCtcDecoderResult OfflineCtcGreedySearchDecoderRknn::Decode(\n    const float *logits, int32_t num_frames, int32_t vocab_size) {\n  OfflineCtcDecoderResult ans;\n\n  int64_t prev_id = -1;\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    auto y = static_cast<int64_t>(std::distance(\n        static_cast<const float *>(logits),\n        std::max_element(static_cast<const float *>(logits),\n                         static_cast<const float *>(logits) + vocab_size)));\n    logits += vocab_size;\n\n    if (y != blank_id_ && y != prev_id) {\n      ans.tokens.push_back(y);\n      ans.timestamps.push_back(t);\n    }\n    prev_id = y;\n  }  // for (int32_t t = 0; ...)\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/offline-ctc-greedy-search-decoder-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_OFFLINE_CTC_GREEDY_SEARCH_DECODER_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_OFFLINE_CTC_GREEDY_SEARCH_DECODER_RKNN_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-ctc-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineCtcGreedySearchDecoderRknn {\n public:\n  explicit OfflineCtcGreedySearchDecoderRknn(int32_t blank_id)\n      : blank_id_(blank_id) {}\n\n  OfflineCtcDecoderResult Decode(const float *logits, int32_t num_frames,\n                                 int32_t vocab_size);\n\n private:\n  int32_t blank_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_OFFLINE_CTC_GREEDY_SEARCH_DECODER_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n#include \"sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelRknn::Impl {\n public:\n  ~Impl() {\n    auto ret = rknn_destroy(encoder_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the encoder context\");\n    }\n\n    ret = rknn_destroy(predictor_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the predictor context\");\n    }\n\n    ret = rknn_destroy(decoder_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the decoder context\");\n    }\n  }\n\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(config_.paraformer.model, \",\", false, &filenames);\n    if (filenames.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Invalid Paraformer RK NPU model '%s'\",\n                       config_.paraformer.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      auto buf = ReadFile(filenames[0]);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(filenames[1]);\n      InitPredictor(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(filenames[2]);\n      InitDecoder(buf.data(), 
buf.size());\n    }\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    std::vector<std::string> filenames;\n    SplitStringToVector(config_.paraformer.model, \",\", false, &filenames);\n    if (filenames.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Invalid Paraformer RK NPU model '%s'\",\n                       config_.paraformer.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    {\n      auto buf = ReadFile(mgr, filenames[0]);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, filenames[1]);\n      InitPredictor(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, filenames[2]);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    PostInit();\n  }\n\n  std::vector<float> Run(std::vector<float> features) {\n    std::vector<float> encoder_out = RunEncoder(features);\n    if (encoder_out.empty()) {\n      return {};\n    }\n\n    std::vector<float> alphas = RunPredictor(encoder_out);\n\n    std::vector<float> acoustic_embedding =\n        ComputeAcousticEmbedding(encoder_out, alphas, encoder_out_dim_);\n    if (acoustic_embedding.empty()) {\n      if (config_.debug) {\n        SHERPA_ONNX_LOGE(\"No speech found in the input audio\");\n      }\n\n      return {};\n    }\n\n    int32_t num_tokens = acoustic_embedding.size() / encoder_out_dim_;\n\n    acoustic_embedding.resize(encoder_out.size());\n\n    return RunDecoder(std::move(encoder_out), std::move(acoustic_embedding),\n                      num_tokens);\n  }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n private:\n  std::vector<float> RunEncoder(std::vector<float> features) {\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    std::vector<rknn_input> inputs(encoder_input_attrs_.size());\n\n    inputs[0].index = encoder_input_attrs_[0].index;\n    inputs[0].type = RKNN_TENSOR_FLOAT32;\n    
inputs[0].fmt = encoder_input_attrs_[0].fmt;\n    inputs[0].buf = reinterpret_cast<void *>(features.data());\n    inputs[0].size = features.size() * sizeof(float);\n\n    std::vector<float> out(encoder_output_attrs_[0].n_elems);\n\n    std::vector<rknn_output> outputs(encoder_output_attrs_.size());\n    outputs[0].index = encoder_output_attrs_[0].index;\n    outputs[0].is_prealloc = 1;\n    outputs[0].want_float = 1;\n    outputs[0].size = out.size() * sizeof(float);\n    outputs[0].buf = reinterpret_cast<void *>(out.data());\n\n    rknn_context ctx = encoder_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set encoder inputs\");\n\n    ret = rknn_run(ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the encoder model\");\n\n    ret = rknn_outputs_get(ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get encoder output\");\n\n    encoder_ctx_queue_->Put(ctx);\n\n    return out;\n  }\n\n  std::vector<float> RunPredictor(const std::vector<float> &encoder_out) {\n    std::vector<rknn_input> inputs(predictor_input_attrs_.size());\n\n    inputs[0].index = predictor_input_attrs_[0].index;\n    inputs[0].type = RKNN_TENSOR_FLOAT32;\n    inputs[0].fmt = predictor_input_attrs_[0].fmt;\n    inputs[0].buf =\n        reinterpret_cast<void *>(const_cast<float *>(encoder_out.data()));\n    inputs[0].size = encoder_out.size() * sizeof(float);\n\n    std::vector<float> out(predictor_output_attrs_[0].n_elems);\n\n    std::vector<rknn_output> outputs(predictor_output_attrs_.size());\n    outputs[0].index = predictor_output_attrs_[0].index;\n    outputs[0].is_prealloc = 1;\n    outputs[0].want_float = 1;\n    outputs[0].size = out.size() * sizeof(float);\n    outputs[0].buf = reinterpret_cast<void *>(out.data());\n\n    rknn_context ctx = predictor_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(ctx, inputs.size(), inputs.data());\n    
SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set predictor inputs\");\n\n    ret = rknn_run(ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the predictor model\");\n\n    ret = rknn_outputs_get(ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get predictor output\");\n\n    predictor_ctx_queue_->Put(ctx);\n\n    return out;\n  }\n\n  std::vector<float> RunDecoder(std::vector<float> encoder_out,\n                                std::vector<float> acoustic_embedding,\n                                int32_t num_tokens) {\n    int32_t num_frames = encoder_out.size() / encoder_out_dim_;\n\n    std::vector<rknn_input> inputs(decoder_input_attrs_.size());\n\n    inputs[0].index = decoder_input_attrs_[0].index;\n    inputs[0].type = RKNN_TENSOR_FLOAT32;\n    inputs[0].fmt = decoder_input_attrs_[0].fmt;\n    inputs[0].buf = reinterpret_cast<void *>(encoder_out.data());\n    inputs[0].size = encoder_out.size() * sizeof(float);\n\n    inputs[1].index = decoder_input_attrs_[1].index;\n    inputs[1].type = RKNN_TENSOR_FLOAT32;\n    inputs[1].fmt = decoder_input_attrs_[1].fmt;\n    inputs[1].buf = reinterpret_cast<void *>(acoustic_embedding.data());\n    inputs[1].size = acoustic_embedding.size() * sizeof(float);\n\n    std::vector<float> mask(num_frames, 1);\n    std::fill(mask.begin() + num_tokens, mask.end(), 0);\n\n    inputs[2].index = decoder_input_attrs_[2].index;\n    inputs[2].type = RKNN_TENSOR_FLOAT32;\n    inputs[2].fmt = decoder_input_attrs_[2].fmt;\n    inputs[2].buf = reinterpret_cast<void *>(mask.data());\n    inputs[2].size = mask.size() * sizeof(float);\n\n    std::vector<float> out(decoder_output_attrs_[0].n_elems);\n\n    std::vector<rknn_output> outputs(decoder_output_attrs_.size());\n    outputs[0].index = decoder_output_attrs_[0].index;\n    outputs[0].is_prealloc = 1;\n    outputs[0].want_float = 1;\n    outputs[0].size = out.size() * sizeof(float);\n    outputs[0].buf = reinterpret_cast<void 
*>(out.data());\n\n    rknn_context ctx = decoder_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set decoder inputs\");\n\n    ret = rknn_run(ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the decoder model\");\n\n    ret = rknn_outputs_get(ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get decoder output\");\n\n    decoder_ctx_queue_->Put(ctx);\n\n    return out;\n  }\n\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &encoder_ctx_);\n\n    InitInputOutputAttrs(encoder_ctx_, config_.debug, &encoder_input_attrs_,\n                         &encoder_output_attrs_);\n\n    num_input_frames_ = encoder_input_attrs_[0].dims[1];\n    encoder_out_dim_ = encoder_output_attrs_[0].dims[2];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"num_input_frames_: %d\", num_input_frames_);\n      SHERPA_ONNX_LOGE(\"encoder_out_dim:: %d\", encoder_out_dim_);\n    }\n  }\n\n  void InitPredictor(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &predictor_ctx_);\n\n    InitInputOutputAttrs(predictor_ctx_, config_.debug, &predictor_input_attrs_,\n                         &predictor_output_attrs_);\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &decoder_ctx_);\n\n    InitInputOutputAttrs(decoder_ctx_, config_.debug, &decoder_input_attrs_,\n                         &decoder_output_attrs_);\n    vocab_size_ = decoder_output_attrs_[0].dims[2];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n    }\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = 7;\n    int32_t lfr_window_shift = 6;\n    int32_t in_feat_dim = 80;\n\n    int32_t 
in_num_frames = in.size() / in_feat_dim;\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n        (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > num_input_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, num_input_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = num_input_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(num_input_frames_ * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  void PostInit() {\n    if (config_.num_threads > 1) {\n      config_.num_threads = 1;\n    }\n\n    encoder_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        encoder_ctx_, config_.num_threads);\n\n    predictor_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        predictor_ctx_, config_.num_threads);\n\n    decoder_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        decoder_ctx_, config_.num_threads);\n  }\n\n private:\n  OfflineModelConfig config_;\n\n  rknn_context encoder_ctx_ = 0;\n  rknn_context predictor_ctx_ = 0;\n  rknn_context decoder_ctx_ = 0;\n\n  std::unique_ptr<ContextBlockingQueueRknn> encoder_ctx_queue_;\n  std::unique_ptr<ContextBlockingQueueRknn> predictor_ctx_queue_;\n  std::unique_ptr<ContextBlockingQueueRknn> decoder_ctx_queue_;\n\n  std::vector<rknn_tensor_attr> encoder_input_attrs_;\n  std::vector<rknn_tensor_attr> encoder_output_attrs_;\n\n  std::vector<rknn_tensor_attr> 
predictor_input_attrs_;\n  std::vector<rknn_tensor_attr> predictor_output_attrs_;\n\n  std::vector<rknn_tensor_attr> decoder_input_attrs_;\n  std::vector<rknn_tensor_attr> decoder_output_attrs_;\n\n  int32_t vocab_size_ = 0;\n  int32_t num_input_frames_ = -1;\n  int32_t encoder_out_dim_ = -1;\n};\n\nOfflineParaformerModelRknn::~OfflineParaformerModelRknn() = default;\n\nOfflineParaformerModelRknn::OfflineParaformerModelRknn(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineParaformerModelRknn::OfflineParaformerModelRknn(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<float> OfflineParaformerModelRknn::Run(\n    std::vector<float> features) const {\n  return impl_->Run(std::move(features));\n}\n\nint32_t OfflineParaformerModelRknn::VocabSize() const {\n  return impl_->VocabSize();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineParaformerModelRknn::OfflineParaformerModelRknn(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineParaformerModelRknn::OfflineParaformerModelRknn(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/offline-paraformer-model-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_OFFLINE_PARAFORMER_MODEL_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_OFFLINE_PARAFORMER_MODEL_RKNN_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineParaformerModelRknn {\n public:\n  ~OfflineParaformerModelRknn();\n\n  explicit OfflineParaformerModelRknn(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineParaformerModelRknn(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features) const;\n\n  int32_t VocabSize() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_OFFLINE_PARAFORMER_MODEL_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.h\"\n\n#include <algorithm>\n#include <array>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelRknn::Impl {\n public:\n  ~Impl() {\n    auto ret = rknn_destroy(ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the context\");\n    }\n  }\n\n  explicit Impl(const OfflineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OfflineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(mgr, config_.sense_voice.model);\n    Init(buf.data(), buf.size());\n\n    PostInit();\n  }\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const {\n    return meta_data_;\n  }\n\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) {\n    features = ApplyLFR(std::move(features));\n    if (features.empty()) {\n      return {};\n    }\n\n    std::vector<rknn_input> inputs(input_attrs_.size());\n\n    std::array<int32_t, 4> prompt{language, 1, 2, text_norm};\n\n    inputs[0].index = input_attrs_[0].index;\n    inputs[0].type = RKNN_TENSOR_FLOAT32;\n    inputs[0].fmt = input_attrs_[0].fmt;\n    inputs[0].buf = reinterpret_cast<void *>(features.data());\n    inputs[0].size = features.size() 
* sizeof(float);\n\n    inputs[1].index = input_attrs_[1].index;\n    inputs[1].type = RKNN_TENSOR_INT32;\n    inputs[1].fmt = input_attrs_[1].fmt;\n    inputs[1].buf = reinterpret_cast<void *>(prompt.data());\n    inputs[1].size = prompt.size() * sizeof(int32_t);\n\n    std::vector<float> out(output_attrs_[0].n_elems);\n\n    std::vector<rknn_output> outputs(output_attrs_.size());\n    outputs[0].index = output_attrs_[0].index;\n    outputs[0].is_prealloc = 1;\n    outputs[0].want_float = 1;\n    outputs[0].size = out.size() * sizeof(float);\n    outputs[0].buf = reinterpret_cast<void *>(out.data());\n\n    rknn_context ctx = ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set inputs\");\n\n    ret = rknn_run(ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the model\");\n\n    ret = rknn_outputs_get(ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get model output\");\n\n    ctx_queue_->Put(ctx);\n\n    return out;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &ctx_);\n\n    InitInputOutputAttrs(ctx_, config_.debug, &input_attrs_, &output_attrs_);\n\n    rknn_custom_string custom_string = GetCustomString(ctx_, config_.debug);\n\n    auto meta = Parse(custom_string, config_.debug);\n\n#define SHERPA_ONNX_RKNN_READ_META_DATA_INT(dst, src_key)                     \\\n  do {                                                                        \\\n    if (!meta.count(#src_key)) {                                              \\\n      SHERPA_ONNX_LOGE(\"'%s' does not exist in the custom_string\", #src_key); \\\n      SHERPA_ONNX_EXIT(-1);                                                   \\\n    }                                                                         \\\n                                                              
                \\\n    dst = atoi(meta.at(#src_key).c_str());                                    \\\n  } while (0)\n\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.with_itn_id, with_itn);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.without_itn_id, without_itn);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.window_size,\n                                        lfr_window_size);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.window_shift,\n                                        lfr_window_shift);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.vocab_size, vocab_size);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(meta_data_.normalize_samples,\n                                        normalize_samples);\n\n    int32_t lang_auto = 0;\n    int32_t lang_zh = 0;\n    int32_t lang_en = 0;\n    int32_t lang_ja = 0;\n    int32_t lang_ko = 0;\n    int32_t lang_yue = 0;\n\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_auto, lang_auto);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_zh, lang_zh);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_en, lang_en);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_ja, lang_ja);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_ko, lang_ko);\n    SHERPA_ONNX_RKNN_READ_META_DATA_INT(lang_yue, lang_yue);\n\n    meta_data_.lang2id = {\n        {\"auto\", lang_auto}, {\"zh\", lang_zh}, {\"en\", lang_en},\n        {\"ja\", lang_ja},     {\"ko\", lang_ko}, {\"yue\", lang_yue},\n    };\n\n    // for rknn models, neg_mean and inv_stddev are stored inside the model\n\n#undef SHERPA_ONNX_RKNN_READ_META_DATA_INT\n\n    num_input_frames_ = input_attrs_[0].dims[1];\n  }\n\n  std::vector<float> ApplyLFR(std::vector<float> in) const {\n    int32_t lfr_window_size = meta_data_.window_size;\n    int32_t lfr_window_shift = meta_data_.window_shift;\n    int32_t in_feat_dim = 80;\n\n    int32_t in_num_frames = in.size() / in_feat_dim;\n\n    if (in_num_frames < lfr_window_size) {\n      return {};\n    }\n\n    int32_t out_num_frames =\n       
 (in_num_frames - lfr_window_size) / lfr_window_shift + 1;\n\n    if (out_num_frames > num_input_frames_) {\n      SHERPA_ONNX_LOGE(\n          \"Number of input frames %d is too large. Truncate it to %d frames.\",\n          out_num_frames, num_input_frames_);\n\n      SHERPA_ONNX_LOGE(\n          \"Recognition result may be truncated/incomplete. Please select a \"\n          \"model accepting longer audios.\");\n\n      out_num_frames = num_input_frames_;\n    }\n\n    int32_t out_feat_dim = in_feat_dim * lfr_window_size;\n\n    std::vector<float> out(num_input_frames_ * out_feat_dim);\n\n    const float *p_in = in.data();\n    float *p_out = out.data();\n\n    for (int32_t i = 0; i != out_num_frames; ++i) {\n      std::copy(p_in, p_in + out_feat_dim, p_out);\n\n      p_out += out_feat_dim;\n      p_in += lfr_window_shift * in_feat_dim;\n    }\n\n    return out;\n  }\n\n  void PostInit() {\n    ctx_queue_ =\n        std::make_unique<ContextBlockingQueueRknn>(ctx_, config_.num_threads);\n  }\n\n private:\n  OfflineModelConfig config_;\n\n  rknn_context ctx_ = 0;\n  std::unique_ptr<ContextBlockingQueueRknn> ctx_queue_;\n\n  std::vector<rknn_tensor_attr> input_attrs_;\n  std::vector<rknn_tensor_attr> output_attrs_;\n\n  OfflineSenseVoiceModelMetaData meta_data_;\n  int32_t num_input_frames_ = -1;\n};\n\nOfflineSenseVoiceModelRknn::~OfflineSenseVoiceModelRknn() = default;\n\nOfflineSenseVoiceModelRknn::OfflineSenseVoiceModelRknn(\n    const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOfflineSenseVoiceModelRknn::OfflineSenseVoiceModelRknn(\n    Manager *mgr, const OfflineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<float> OfflineSenseVoiceModelRknn::Run(std::vector<float> features,\n                                                   int32_t language,\n                                                   int32_t text_norm) const {\n  return 
impl_->Run(std::move(features), language, text_norm);\n}\n\nconst OfflineSenseVoiceModelMetaData &\nOfflineSenseVoiceModelRknn::GetModelMetadata() const {\n  return impl_->GetModelMetadata();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OfflineSenseVoiceModelRknn::OfflineSenseVoiceModelRknn(\n    AAssetManager *mgr, const OfflineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OfflineSenseVoiceModelRknn::OfflineSenseVoiceModelRknn(\n    NativeResourceManager *mgr, const OfflineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/offline-sense-voice-model-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_OFFLINE_SENSE_VOICE_MODEL_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_OFFLINE_SENSE_VOICE_MODEL_RKNN_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass OfflineSenseVoiceModelRknn {\n public:\n  ~OfflineSenseVoiceModelRknn();\n\n  explicit OfflineSenseVoiceModelRknn(const OfflineModelConfig &config);\n\n  template <typename Manager>\n  OfflineSenseVoiceModelRknn(Manager *mgr, const OfflineModelConfig &config);\n\n  /**\n   * @param features A tensor of shape (num_frames, feature_dim)\n   *                 before applying LFR.\n   * @param language\n   * @param text_norm\n   * @returns Return a tensor of shape (num_output_frames, vocab_size)\n   */\n  std::vector<float> Run(std::vector<float> features, int32_t language,\n                         int32_t text_norm) const;\n\n  const OfflineSenseVoiceModelMetaData &GetModelMetadata() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_OFFLINE_SENSE_VOICE_MODEL_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-recognizer-ctc-rknn-impl.h",
    "content": "// sherpa-onnx/csrc/rknn/online-recognizer-ctc-rknn-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_CTC_RKNN_IMPL_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_CTC_RKNN_IMPL_H_\n\n#include <algorithm>\n#include <ios>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-ctc-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder.h\"\n#include \"sherpa-onnx/csrc/online-ctc-greedy-search-decoder.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/rknn/online-stream-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\n// defined in ../online-recognizer-ctc-impl.h\nOnlineRecognizerResult ConvertCtc(const OnlineCtcDecoderResult &src,\n                                  const SymbolTable &sym_table,\n                                  float frame_shift_ms,\n                                  int32_t subsampling_factor, int32_t segment,\n                                  int32_t frames_since_start);\n\nclass OnlineRecognizerCtcRknnImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerCtcRknnImpl(const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        model_(\n            std::make_unique<OnlineZipformerCtcModelRknn>(config.model_config)),\n        endpoint_(config_.endpoint_config) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    InitDecoder();\n  }\n\n  template <typename 
Manager>\n  explicit OnlineRecognizerCtcRknnImpl(Manager *mgr,\n                                       const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        model_(std::make_unique<OnlineZipformerCtcModelRknn>(\n            mgr, config_.model_config)),\n        sym_(mgr, config_.model_config.tokens),\n        endpoint_(config_.endpoint_config) {\n    InitDecoder();\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStreamRknn>(config_.feat_config);\n    stream->SetZipformerEncoderStates(model_->GetInitStates());\n    stream->SetFasterDecoder(decoder_->CreateFasterDecoder());\n    return stream;\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i != n; ++i) {\n      DecodeStream(reinterpret_cast<OnlineStreamRknn *>(ss[i]));\n    }\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    OnlineCtcDecoderResult decoder_result = s->GetCtcResult();\n\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    auto r =\n        ConvertCtc(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                   s->GetCurrentSegment(), s->GetNumFramesSinceStart());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if (!config_.enable_endpoint) {\n      return false;\n    }\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    // frame shift is 10 milliseconds\n    float frame_shift_in_seconds = 0.01;\n\n    // subsampling factor is 4\n    int32_t trailing_silence_frames 
= s->GetCtcResult().num_trailing_blanks * 4;\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    // segment is incremented only when the last\n    // result is not empty\n    const auto &r = s->GetCtcResult();\n    if (!r.tokens.empty()) {\n      s->GetCurrentSegment() += 1;\n    }\n\n    // clear result\n    s->SetCtcResult({});\n\n    // clear states\n    reinterpret_cast<OnlineStreamRknn *>(s)->SetZipformerEncoderStates(\n        model_->GetInitStates());\n\n    s->GetFasterDecoderProcessedFrames() = 0;\n\n    // Note: We only update counters. The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n private:\n  void InitDecoder() {\n    if (!sym_.Contains(\"<blk>\") && !sym_.Contains(\"<eps>\") &&\n        !sym_.Contains(\"<blank>\")) {\n      SHERPA_ONNX_LOGE(\n          \"We expect that tokens.txt contains \"\n          \"the symbol <blk> or <eps> or <blank> and its ID.\");\n      exit(-1);\n    }\n\n    int32_t blank_id = 0;\n    if (sym_.Contains(\"<blk>\")) {\n      blank_id = sym_[\"<blk>\"];\n    } else if (sym_.Contains(\"<eps>\")) {\n      // for tdnn models of the yesno recipe from icefall\n      blank_id = sym_[\"<eps>\"];\n    } else if (sym_.Contains(\"<blank>\")) {\n      // for WeNet CTC models\n      blank_id = sym_[\"<blank>\"];\n    }\n\n    if (!config_.ctc_fst_decoder_config.graph.empty()) {\n      decoder_ = std::make_unique<OnlineCtcFstDecoder>(\n          config_.ctc_fst_decoder_config, blank_id);\n    } else if (config_.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineCtcGreedySearchDecoder>(blank_id);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Unsupported decoding method: %s for streaming CTC models\",\n          config_.decoding_method.c_str());\n      exit(-1);\n    }\n  }\n\n  void DecodeStream(OnlineStreamRknn *s) const {\n 
   int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feat_dim = s->FeatureDim();\n\n    const auto num_processed_frames = s->GetNumProcessedFrames();\n    std::vector<float> features =\n        s->GetFrames(num_processed_frames, chunk_size);\n    s->GetNumProcessedFrames() += chunk_shift;\n\n    auto &states = s->GetZipformerEncoderStates();\n    auto p = model_->Run(features, std::move(states));\n    states = std::move(p.second);\n\n    std::vector<OnlineCtcDecoderResult> results(1);\n    results[0] = std::move(s->GetCtcResult());\n\n    auto attr = model_->GetOutAttr();\n\n    decoder_->Decode(p.first.data(), attr.dims[0], attr.dims[1], attr.dims[2],\n                     &results, reinterpret_cast<OnlineStream **>(&s), 1);\n    s->SetCtcResult(results[0]);\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  std::unique_ptr<OnlineZipformerCtcModelRknn> model_;\n  std::unique_ptr<OnlineCtcDecoder> decoder_;\n  SymbolTable sym_;\n  Endpoint endpoint_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_CTC_RKNN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-recognizer-transducer-rknn-impl.h",
    "content": "// sherpa-onnx/csrc/rknn/online-recognizer-transducer-rknn-impl.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_TRANSDUCER_RKNN_IMPL_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_TRANSDUCER_RKNN_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/online-recognizer-impl.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/rknn/online-stream-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\nnamespace sherpa_onnx {\n\nOnlineRecognizerResult Convert(const OnlineTransducerDecoderResultRknn &src,\n                               const SymbolTable &sym_table,\n                               float frame_shift_ms, int32_t subsampling_factor,\n                               int32_t segment, int32_t frames_since_start) {\n  OnlineRecognizerResult r;\n  r.tokens.reserve(src.tokens.size());\n  r.timestamps.reserve(src.tokens.size());\n\n  std::string text;\n  for (auto i : src.tokens) {\n    auto sym = sym_table[i];\n\n    text.append(sym);\n\n    if (sym.size() == 1 && (sym[0] < 0x20 || sym[0] > 0x7e)) {\n      // for bpe models with byte_fallback\n      // (but don't rewrite printable characters 0x20..0x7e,\n      //  which collide with standard BPE units)\n      std::ostringstream os;\n      os << \"<0x\" << std::hex << std::uppercase\n         << (static_cast<int32_t>(sym[0]) & 0xff) << \">\";\n      sym = os.str();\n    }\n\n    r.tokens.push_back(std::move(sym));\n  }\n\n  if (sym_table.IsByteBpe()) {\n 
   text = sym_table.DecodeByteBpe(text);\n  }\n\n  r.text = std::move(text);\n\n  float frame_shift_s = frame_shift_ms / 1000. * subsampling_factor;\n  for (auto t : src.timestamps) {\n    float time = frame_shift_s * t;\n    r.timestamps.push_back(time);\n  }\n\n  r.segment = segment;\n  r.start_time = frames_since_start * frame_shift_ms / 1000.;\n\n  return r;\n}\n\nclass OnlineRecognizerTransducerRknnImpl : public OnlineRecognizerImpl {\n public:\n  explicit OnlineRecognizerTransducerRknnImpl(\n      const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(config),\n        config_(config),\n        endpoint_(config_.endpoint_config),\n        model_(std::make_unique<OnlineZipformerTransducerModelRknn>(\n            config.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(config.model_config.tokens, true);\n    }\n\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchDecoderRknn>(\n          model_.get(), unk_id_);\n    } else if (config.decoding_method == \"modified_beam_search\") {\n      decoder_ =\n          std::make_unique<OnlineTransducerModifiedBeamSearchDecoderRknn>(\n              model_.get(), config.max_active_paths, unk_id_);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid decoding method: '%s'. 
Support only greedy_search and \"\n          \"modified_beam_search.\",\n          config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  template <typename Manager>\n  explicit OnlineRecognizerTransducerRknnImpl(\n      Manager *mgr, const OnlineRecognizerConfig &config)\n      : OnlineRecognizerImpl(mgr, config),\n        config_(config),\n        endpoint_(config_.endpoint_config),\n        model_(std::make_unique<OnlineZipformerTransducerModelRknn>(\n            mgr, config_.model_config)) {\n    if (!config.model_config.tokens_buf.empty()) {\n      sym_ = SymbolTable(config.model_config.tokens_buf, false);\n    } else {\n      /// assuming tokens_buf and tokens are guaranteed not being both empty\n      sym_ = SymbolTable(mgr, config.model_config.tokens);\n    }\n\n    if (sym_.Contains(\"<unk>\")) {\n      unk_id_ = sym_[\"<unk>\"];\n    }\n\n    if (config.decoding_method == \"greedy_search\") {\n      decoder_ = std::make_unique<OnlineTransducerGreedySearchDecoderRknn>(\n          model_.get(), unk_id_);\n    } else if (config.decoding_method == \"modified_beam_search\") {\n      decoder_ =\n          std::make_unique<OnlineTransducerModifiedBeamSearchDecoderRknn>(\n              model_.get(), config.max_active_paths, unk_id_);\n    } else {\n      SHERPA_ONNX_LOGE(\n          \"Invalid decoding method: '%s'. 
Support only greedy_search and \"\n          \"modified_beam_search.\",\n          config.decoding_method.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    auto stream = std::make_unique<OnlineStreamRknn>(config_.feat_config);\n    auto r = decoder_->GetEmptyResult();\n    stream->SetZipformerResult(std::move(r));\n    stream->SetZipformerEncoderStates(model_->GetEncoderInitStates());\n    return stream;\n  }\n\n  std::unique_ptr<OnlineStream> CreateStream(\n      const std::string &hotwords) const override {\n    SHERPA_ONNX_LOGE(\"Hotwords for RKNN is not supported now.\");\n    return CreateStream();\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() + model_->ChunkSize() <\n           s->NumFramesReady();\n  }\n\n  // Warmping up engine with wp: warm_up count and max-batch-size\n\n  void DecodeStreams(OnlineStream **ss, int32_t n) const override {\n    for (int32_t i = 0; i < n; ++i) {\n      DecodeStream(reinterpret_cast<OnlineStreamRknn *>(ss[i]));\n    }\n  }\n\n  OnlineRecognizerResult GetResult(OnlineStream *s) const override {\n    OnlineTransducerDecoderResultRknn decoder_result =\n        reinterpret_cast<OnlineStreamRknn *>(s)->GetZipformerResult();\n    decoder_->StripLeadingBlanks(&decoder_result);\n    // TODO(fangjun): Remember to change these constants if needed\n    int32_t frame_shift_ms = 10;\n    int32_t subsampling_factor = 4;\n    auto r = Convert(decoder_result, sym_, frame_shift_ms, subsampling_factor,\n                     s->GetCurrentSegment(), s->GetNumFramesSinceStart());\n    r.text = ApplyInverseTextNormalization(std::move(r.text));\n    r.text = ApplyHomophoneReplacer(std::move(r.text));\n    return r;\n  }\n\n  bool IsEndpoint(OnlineStream *s) const override {\n    if (!config_.enable_endpoint) {\n      return false;\n    }\n\n    int32_t num_processed_frames = s->GetNumProcessedFrames();\n\n    // frame shift is 
10 milliseconds\n    float frame_shift_in_seconds = 0.01;\n\n    // subsampling factor is 4\n    int32_t trailing_silence_frames = reinterpret_cast<OnlineStreamRknn *>(s)\n                                          ->GetZipformerResult()\n                                          .num_trailing_blanks *\n                                      4;\n\n    return endpoint_.IsEndpoint(num_processed_frames, trailing_silence_frames,\n                                frame_shift_in_seconds);\n  }\n\n  void Reset(OnlineStream *s) const override {\n    int32_t context_size = model_->ContextSize();\n\n    {\n      // segment is incremented only when the last\n      // result is not empty, contains non-blanks and longer than context_size)\n      const auto &r =\n          reinterpret_cast<OnlineStreamRknn *>(s)->GetZipformerResult();\n      if (!r.tokens.empty() && r.tokens.back() != 0 &&\n          r.tokens.size() > context_size) {\n        s->GetCurrentSegment() += 1;\n      }\n    }\n\n    // reset encoder states\n    // reinterpret_cast<OnlineStreamRknn*>(s)->SetZipformerEncoderStates(model_->GetEncoderInitStates());\n    auto r = decoder_->GetEmptyResult();\n    auto last_result =\n        reinterpret_cast<OnlineStreamRknn *>(s)->GetZipformerResult();\n\n    // if last result is not empty, then\n    // preserve last tokens as the context for next result\n    if (static_cast<int32_t>(last_result.tokens.size()) > context_size) {\n      r.tokens = {last_result.tokens.end() - context_size,\n                  last_result.tokens.end()};\n    }\n    reinterpret_cast<OnlineStreamRknn *>(s)->SetZipformerResult(std::move(r));\n\n    // Note: We only update counters. 
The underlying audio samples\n    // are not discarded.\n    s->Reset();\n  }\n\n private:\n  void DecodeStream(OnlineStreamRknn *s) const {\n    int32_t chunk_size = model_->ChunkSize();\n    int32_t chunk_shift = model_->ChunkShift();\n\n    int32_t feature_dim = s->FeatureDim();\n\n    const auto num_processed_frames = s->GetNumProcessedFrames();\n\n    std::vector<float> features =\n        s->GetFrames(num_processed_frames, chunk_size);\n    s->GetNumProcessedFrames() += chunk_shift;\n\n    auto &states = s->GetZipformerEncoderStates();\n\n    auto p = model_->RunEncoder(features, std::move(states));\n    states = std::move(p.second);\n\n    auto &r = s->GetZipformerResult();\n    decoder_->Decode(std::move(p.first), &r);\n  }\n\n private:\n  OnlineRecognizerConfig config_;\n  SymbolTable sym_;\n  Endpoint endpoint_;\n  int32_t unk_id_ = -1;\n  std::unique_ptr<OnlineZipformerTransducerModelRknn> model_;\n  std::unique_ptr<OnlineTransducerDecoderRknn> decoder_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_RECOGNIZER_TRANSDUCER_RKNN_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-stream-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/online-stream-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/online-stream-rknn.h\"\n\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass OnlineStreamRknn::Impl {\n public:\n  void SetZipformerEncoderStates(std::vector<std::vector<uint8_t>> states) {\n    states_ = std::move(states);\n  }\n\n  std::vector<std::vector<uint8_t>> &GetZipformerEncoderStates() {\n    return states_;\n  }\n\n  void SetZipformerResult(OnlineTransducerDecoderResultRknn r) {\n    result_ = std::move(r);\n  }\n\n  OnlineTransducerDecoderResultRknn &GetZipformerResult() { return result_; }\n\n private:\n  std::vector<std::vector<uint8_t>> states_;\n  OnlineTransducerDecoderResultRknn result_;\n};\n\nOnlineStreamRknn::OnlineStreamRknn(\n    const FeatureExtractorConfig &config /*= {}*/,\n    ContextGraphPtr context_graph /*= nullptr*/)\n    : OnlineStream(config, context_graph), impl_(std::make_unique<Impl>()) {}\n\nOnlineStreamRknn::~OnlineStreamRknn() = default;\n\nvoid OnlineStreamRknn::SetZipformerEncoderStates(\n    std::vector<std::vector<uint8_t>> states) const {\n  impl_->SetZipformerEncoderStates(std::move(states));\n}\n\nstd::vector<std::vector<uint8_t>> &OnlineStreamRknn::GetZipformerEncoderStates()\n    const {\n  return impl_->GetZipformerEncoderStates();\n}\n\nvoid OnlineStreamRknn::SetZipformerResult(\n    OnlineTransducerDecoderResultRknn r) const {\n  impl_->SetZipformerResult(std::move(r));\n}\n\nOnlineTransducerDecoderResultRknn &OnlineStreamRknn::GetZipformerResult()\n    const {\n  return impl_->GetZipformerResult();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-stream-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-stream-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_STREAM_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_STREAM_RKNN_H_\n#include <memory>\n#include <vector>\n\n#include \"rknn_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineStreamRknn : public OnlineStream {\n public:\n  explicit OnlineStreamRknn(const FeatureExtractorConfig &config = {},\n                            ContextGraphPtr context_graph = nullptr);\n\n  ~OnlineStreamRknn();\n\n  void SetZipformerEncoderStates(\n      std::vector<std::vector<uint8_t>> states) const;\n\n  std::vector<std::vector<uint8_t>> &GetZipformerEncoderStates() const;\n\n  void SetZipformerResult(OnlineTransducerDecoderResultRknn r) const;\n\n  OnlineTransducerDecoderResultRknn &GetZipformerResult() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_STREAM_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_DECODER_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_DECODER_RKNN_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstruct OnlineTransducerDecoderResultRknn {\n  /// Number of frames after subsampling we have decoded so far\n  int32_t frame_offset = 0;\n\n  /// The decoded token IDs so far\n  std::vector<int64_t> tokens;\n\n  /// number of trailing blank frames decoded so far\n  int32_t num_trailing_blanks = 0;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  std::vector<int32_t> timestamps;\n\n  // used only by greedy_search\n  std::vector<float> previous_decoder_out;\n\n  // used only in modified beam_search\n  Hypotheses hyps;\n\n  // used only by modified_beam_search\n  std::vector<std::vector<float>> previous_decoder_out2;\n};\n\nclass OnlineTransducerDecoderRknn {\n public:\n  virtual ~OnlineTransducerDecoderRknn() = default;\n\n  /* Return an empty result.\n   *\n   * To simplify the decoding code, we add `context_size` blanks\n   * to the beginning of the decoding result, which will be\n   * stripped by calling `StripPrecedingBlanks()`.\n   */\n  virtual OnlineTransducerDecoderResultRknn GetEmptyResult() const = 0;\n\n  /** Strip blanks added by `GetEmptyResult()`.\n   *\n   * @param r It is changed in-place.\n   */\n  virtual void StripLeadingBlanks(\n      OnlineTransducerDecoderResultRknn * /*r*/) const {}\n\n  virtual void Decode(std::vector<float> encoder_out,\n                      OnlineTransducerDecoderResultRknn *result) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_DECODER_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nOnlineTransducerDecoderResultRknn\nOnlineTransducerGreedySearchDecoderRknn::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  OnlineTransducerDecoderResultRknn r;\n  r.tokens.resize(context_size, -1);\n  r.tokens.back() = blank_id;\n\n  return r;\n}\n\nvoid OnlineTransducerGreedySearchDecoderRknn::StripLeadingBlanks(\n    OnlineTransducerDecoderResultRknn *r) const {\n  int32_t context_size = model_->ContextSize();\n\n  auto start = r->tokens.begin() + context_size;\n  auto end = r->tokens.end();\n\n  r->tokens = std::vector<int64_t>(start, end);\n}\n\nvoid OnlineTransducerGreedySearchDecoderRknn::Decode(\n    std::vector<float> encoder_out,\n    OnlineTransducerDecoderResultRknn *result) const {\n  auto &r = result[0];\n  auto attr = model_->GetEncoderOutAttr();\n  int32_t num_frames = attr.dims[1];\n  int32_t encoder_out_dim = attr.dims[2];\n\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n\n  std::vector<int64_t> decoder_input;\n  std::vector<float> decoder_out;\n\n  if (r.previous_decoder_out.empty()) {\n    decoder_input = {r.tokens.begin() + (r.tokens.size() - context_size),\n                     r.tokens.end()};\n    decoder_out = model_->RunDecoder(std::move(decoder_input));\n\n  } else {\n    decoder_out = std::move(r.previous_decoder_out);\n  }\n\n  const float *p_encoder_out = encoder_out.data();\n  for (int32_t t = 0; t != num_frames; ++t) {\n    auto logit = model_->RunJoiner(p_encoder_out, decoder_out.data());\n    p_encoder_out += encoder_out_dim;\n\n    bool emitted = false;\n    if 
(blank_penalty_ > 0.0) {\n      logit[0] -= blank_penalty_;  // assuming blank id is 0\n    }\n\n    auto y = static_cast<int32_t>(std::distance(\n        logit.data(),\n        std::max_element(logit.data(), logit.data() + vocab_size)));\n    // blank id is hardcoded to 0\n    // also, it treats unk as blank\n    if (y != 0 && y != unk_id_) {\n      emitted = true;\n      r.tokens.push_back(y);\n      r.timestamps.push_back(t + r.frame_offset);\n      r.num_trailing_blanks = 0;\n    } else {\n      ++r.num_trailing_blanks;\n    }\n\n    if (emitted) {\n      decoder_input = {r.tokens.begin() + (r.tokens.size() - context_size),\n                       r.tokens.end()};\n      decoder_out = model_->RunDecoder(std::move(decoder_input));\n    }\n  }\n\n  r.frame_offset += num_frames;\n  r.previous_decoder_out = std::move(decoder_out);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-transducer-greedy-search-decoder-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_RKNN_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineTransducerGreedySearchDecoderRknn\n    : public OnlineTransducerDecoderRknn {\n public:\n  explicit OnlineTransducerGreedySearchDecoderRknn(\n      OnlineZipformerTransducerModelRknn *model, int32_t unk_id = 2,\n      float blank_penalty = 0.0)\n      : model_(model), unk_id_(unk_id), blank_penalty_(blank_penalty) {}\n\n  OnlineTransducerDecoderResultRknn GetEmptyResult() const override;\n\n  void StripLeadingBlanks(OnlineTransducerDecoderResultRknn *r) const override;\n\n  void Decode(std::vector<float> encoder_out,\n              OnlineTransducerDecoderResultRknn *result) const override;\n\n private:\n  OnlineZipformerTransducerModelRknn *model_;  // Not owned\n  int32_t unk_id_;\n  float blank_penalty_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_GREEDY_SEARCH_DECODER_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.h\"\n\n#include <algorithm>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/hypothesis.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/math.h\"\n\nnamespace sherpa_onnx {\n\nOnlineTransducerDecoderResultRknn\nOnlineTransducerModifiedBeamSearchDecoderRknn::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  OnlineTransducerDecoderResultRknn r;\n\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = blank_id;\n\n  Hypotheses blank_hyp({{blanks, 0}});\n  r.hyps = std::move(blank_hyp);\n  r.tokens = std::move(blanks);\n\n  return r;\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoderRknn::StripLeadingBlanks(\n    OnlineTransducerDecoderResultRknn *r) const {\n  int32_t context_size = model_->ContextSize();\n  auto hyp = r->hyps.GetMostProbable(true);\n\n  std::vector<int64_t> tokens(hyp.ys.begin() + context_size, hyp.ys.end());\n  r->tokens = std::move(tokens);\n  r->timestamps = std::move(hyp.timestamps);\n\n  r->num_trailing_blanks = hyp.num_trailing_blanks;\n}\n\nstd::vector<std::vector<float>> GetDecoderOut(\n    OnlineZipformerTransducerModelRknn *model, const Hypotheses &hyp_vec) {\n  std::vector<std::vector<float>> ans;\n  ans.reserve(hyp_vec.Size());\n\n  int32_t context_size = model->ContextSize();\n  for (const auto &p : hyp_vec) {\n    const auto &hyp = p.second;\n    auto start = hyp.ys.begin() + (hyp.ys.size() - context_size);\n    auto end = hyp.ys.end();\n    auto tokens = std::vector<int64_t>(start, end);\n    auto decoder_out = model->RunDecoder(std::move(tokens));\n\n    ans.push_back(std::move(decoder_out));\n  }\n\n  return ans;\n}\n\nstd::vector<std::vector<float>> GetJoinerOutLogSoftmax(\n    
OnlineZipformerTransducerModelRknn *model, const float *p_encoder_out,\n    const std::vector<std::vector<float>> &decoder_out) {\n  std::vector<std::vector<float>> ans;\n  ans.reserve(decoder_out.size());\n\n  for (const auto &d : decoder_out) {\n    auto joiner_out = model->RunJoiner(p_encoder_out, d.data());\n\n    LogSoftmax(joiner_out.data(), joiner_out.size());\n\n    ans.push_back(std::move(joiner_out));\n  }\n  return ans;\n}\n\nvoid OnlineTransducerModifiedBeamSearchDecoderRknn::Decode(\n    std::vector<float> encoder_out,\n    OnlineTransducerDecoderResultRknn *result) const {\n  auto &r = result[0];\n  auto attr = model_->GetEncoderOutAttr();\n  int32_t num_frames = attr.dims[1];\n  int32_t encoder_out_dim = attr.dims[2];\n\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n\n  Hypotheses cur = std::move(result->hyps);\n  std::vector<Hypothesis> prev;\n\n  auto decoder_out = std::move(result->previous_decoder_out2);\n  if (decoder_out.empty()) {\n    decoder_out = GetDecoderOut(model_, cur);\n  }\n\n  const float *p_encoder_out = encoder_out.data();\n\n  int32_t frame_offset = result->frame_offset;\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    prev = cur.Vec();\n    cur.Clear();\n\n    auto log_probs = GetJoinerOutLogSoftmax(model_, p_encoder_out, decoder_out);\n    p_encoder_out += encoder_out_dim;\n\n    for (int32_t i = 0; i != prev.size(); ++i) {\n      auto log_prob = prev[i].log_prob;\n      for (auto &p : log_probs[i]) {\n        p += log_prob;\n      }\n    }\n\n    auto topk = TopkIndex(log_probs, max_active_paths_);\n    for (auto k : topk) {\n      int32_t hyp_index = k / vocab_size;\n      int32_t new_token = k % vocab_size;\n\n      Hypothesis new_hyp = prev[hyp_index];\n      new_hyp.log_prob = log_probs[hyp_index][new_token];\n\n      // blank is hardcoded to 0\n      // also, it treats unk as blank\n      if (new_token != 0 && new_token != unk_id_) {\n        
new_hyp.ys.push_back(new_token);\n        new_hyp.timestamps.push_back(t + frame_offset);\n        new_hyp.num_trailing_blanks = 0;\n\n      } else {\n        ++new_hyp.num_trailing_blanks;\n      }\n      cur.Add(std::move(new_hyp));\n    }\n\n    decoder_out = GetDecoderOut(model_, cur);\n  }\n\n  result->hyps = std::move(cur);\n  result->frame_offset += num_frames;\n  result->previous_decoder_out2 = std::move(decoder_out);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-transducer-modified-beam-search-decoder-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_RKNN_H_\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/rknn/online-transducer-decoder-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineTransducerModifiedBeamSearchDecoderRknn\n    : public OnlineTransducerDecoderRknn {\n public:\n  explicit OnlineTransducerModifiedBeamSearchDecoderRknn(\n      OnlineZipformerTransducerModelRknn *model, int32_t max_active_paths,\n      int32_t unk_id = 2, float blank_penalty = 0.0)\n      : model_(model),\n        max_active_paths_(max_active_paths),\n        unk_id_(unk_id),\n        blank_penalty_(blank_penalty) {}\n\n  OnlineTransducerDecoderResultRknn GetEmptyResult() const override;\n\n  void StripLeadingBlanks(OnlineTransducerDecoderResultRknn *r) const override;\n\n  void Decode(std::vector<float> encoder_out,\n              OnlineTransducerDecoderResultRknn *result) const override;\n\n private:\n  OnlineZipformerTransducerModelRknn *model_;  // Not owned\n  int32_t max_active_paths_;\n  int32_t unk_id_;\n  float blank_penalty_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_TRANSDUCER_MODIFIED_BEAM_SEARCH_DECODER_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformerCtcModelRknn::Impl {\n public:\n  ~Impl() {\n    auto ret = rknn_destroy(ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the context\");\n    }\n  }\n\n  explicit Impl(const OnlineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(config.zipformer2_ctc.model);\n    Init(buf.data(), buf.size());\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config) : config_(config) {\n    auto buf = ReadFile(mgr, config.zipformer2_ctc.model);\n    Init(buf.data(), buf.size());\n\n    PostInit();\n  }\n\n  std::vector<std::vector<uint8_t>> GetInitStates() const {\n    // input_attrs_[0] is for the feature\n    // input_attrs_[1:] is for states\n    // so we use -1 here\n    std::vector<std::vector<uint8_t>> states(input_attrs_.size() - 1);\n\n    int32_t i = -1;\n    for (auto &attr : input_attrs_) {\n      i += 1;\n      if (i == 0) {\n        // skip processing the attr for features.\n        continue;\n      }\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        states[i - 1].resize(attr.n_elems * sizeof(float));\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        states[i - 
1].resize(attr.n_elems * sizeof(int64_t));\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type: %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    return states;\n  }\n\n  std::pair<std::vector<float>, std::vector<std::vector<uint8_t>>> Run(\n      std::vector<float> features, std::vector<std::vector<uint8_t>> states) {\n    std::vector<rknn_input> inputs(input_attrs_.size());\n\n    for (int32_t i = 0; i < static_cast<int32_t>(inputs.size()); ++i) {\n      auto &input = inputs[i];\n      auto &attr = input_attrs_[i];\n      input.index = attr.index;\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        input.type = RKNN_TENSOR_FLOAT32;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        input.type = RKNN_TENSOR_INT64;\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      input.fmt = attr.fmt;\n      if (i == 0) {\n        input.buf = reinterpret_cast<void *>(features.data());\n        input.size = features.size() * sizeof(float);\n      } else {\n        input.buf = reinterpret_cast<void *>(states[i - 1].data());\n        input.size = states[i - 1].size();\n      }\n    }\n\n    std::vector<float> out(output_attrs_[0].n_elems);\n\n    // Note(fangjun): We can reuse the memory from input argument `states`\n    // auto next_states = GetInitStates();\n    auto &next_states = states;\n\n    std::vector<rknn_output> outputs(output_attrs_.size());\n    for (int32_t i = 0; i < outputs.size(); ++i) {\n      auto &output = outputs[i];\n      auto &attr = output_attrs_[i];\n      output.index = attr.index;\n      output.is_prealloc = 1;\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        output.want_float = 1;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        output.want_float = 0;\n      } else {\n        
SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (i == 0) {\n        output.size = out.size() * sizeof(float);\n        output.buf = reinterpret_cast<void *>(out.data());\n      } else {\n        output.size = next_states[i - 1].size();\n        output.buf = reinterpret_cast<void *>(next_states[i - 1].data());\n      }\n    }\n\n    rknn_context ctx = ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set inputs\");\n\n    ret = rknn_run(ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the model\");\n\n    ret = rknn_outputs_get(ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get model output\");\n\n    for (int32_t i = 0; i < next_states.size(); ++i) {\n      const auto &attr = input_attrs_[i + 1];\n      if (attr.n_dims == 4) {\n        // TODO(fangjun): The transpose is copied from\n        // https://github.com/airockchip/rknn_model_zoo/blob/main/examples/zipformer/cpp/process.cc#L22\n        // I don't understand why we need to do that.\n        std::vector<uint8_t> dst(next_states[i].size());\n        int32_t n = attr.dims[0];\n        int32_t h = attr.dims[1];\n        int32_t w = attr.dims[2];\n        int32_t c = attr.dims[3];\n        ConvertNCHWtoNHWC(\n            reinterpret_cast<const float *>(next_states[i].data()), n, c, h, w,\n            reinterpret_cast<float *>(dst.data()));\n        next_states[i] = std::move(dst);\n      }\n    }\n\n    ctx_queue_->Put(ctx);\n\n    return {std::move(out), std::move(next_states)};\n  }\n\n  int32_t ChunkSize() const { return T_; }\n\n  int32_t ChunkShift() const { return decode_chunk_len_; }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  rknn_tensor_attr GetOutAttr() const { return output_attrs_[0]; }\n\n private:\n  void Init(void 
*model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &ctx_);\n\n    InitInputOutputAttrs(ctx_, config_.debug, &input_attrs_, &output_attrs_);\n\n    rknn_custom_string custom_string = GetCustomString(ctx_, config_.debug);\n\n    auto meta = Parse(custom_string, config_.debug);\n\n    if (meta.count(\"T\")) {\n      T_ = atoi(meta.at(\"T\").c_str());\n    }\n\n    if (meta.count(\"decode_chunk_len\")) {\n      decode_chunk_len_ = atoi(meta.at(\"decode_chunk_len\").c_str());\n    }\n\n    vocab_size_ = output_attrs_[0].dims[2];\n\n    if (config_.debug) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"T: %{public}d\", T_);\n      SHERPA_ONNX_LOGE(\"decode_chunk_len_: %{public}d\", decode_chunk_len_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %{public}d\", vocab_size_);\n#else\n      SHERPA_ONNX_LOGE(\"T: %d\", T_);\n      SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n#endif\n    }\n\n    if (T_ == 0) {\n      SHERPA_ONNX_LOGE(\n          \"Invalid T. Please use the script from icefall to export your model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (decode_chunk_len_ == 0) {\n      SHERPA_ONNX_LOGE(\n          \"Invalid decode_chunk_len. 
Please use the script from icefall to \"\n          \"export your model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  void PostInit() {\n    ctx_queue_ =\n        std::make_unique<ContextBlockingQueueRknn>(ctx_, config_.num_threads);\n  }\n\n private:\n  OnlineModelConfig config_;\n  rknn_context ctx_ = 0;\n  std::unique_ptr<ContextBlockingQueueRknn> ctx_queue_;\n\n  std::vector<rknn_tensor_attr> input_attrs_;\n  std::vector<rknn_tensor_attr> output_attrs_;\n\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n  int32_t vocab_size_ = 0;\n};\n\nOnlineZipformerCtcModelRknn::~OnlineZipformerCtcModelRknn() = default;\n\nOnlineZipformerCtcModelRknn::OnlineZipformerCtcModelRknn(\n    const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineZipformerCtcModelRknn::OnlineZipformerCtcModelRknn(\n    Manager *mgr, const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<std::vector<uint8_t>> OnlineZipformerCtcModelRknn::GetInitStates()\n    const {\n  return impl_->GetInitStates();\n}\n\nstd::pair<std::vector<float>, std::vector<std::vector<uint8_t>>>\nOnlineZipformerCtcModelRknn::Run(\n    std::vector<float> features,\n    std::vector<std::vector<uint8_t>> states) const {\n  return impl_->Run(std::move(features), std::move(states));\n}\n\nint32_t OnlineZipformerCtcModelRknn::ChunkSize() const {\n  return impl_->ChunkSize();\n}\n\nint32_t OnlineZipformerCtcModelRknn::ChunkShift() const {\n  return impl_->ChunkShift();\n}\n\nint32_t OnlineZipformerCtcModelRknn::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nrknn_tensor_attr OnlineZipformerCtcModelRknn::GetOutAttr() const {\n  return impl_->GetOutAttr();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineZipformerCtcModelRknn::OnlineZipformerCtcModelRknn(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineZipformerCtcModelRknn::OnlineZipformerCtcModelRknn(\n    
NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-zipformer-ctc-model-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_CTC_MODEL_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_CTC_MODEL_RKNN_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"rknn_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformerCtcModelRknn {\n public:\n  ~OnlineZipformerCtcModelRknn();\n\n  explicit OnlineZipformerCtcModelRknn(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineZipformerCtcModelRknn(Manager *mgr, const OnlineModelConfig &config);\n\n  std::vector<std::vector<uint8_t>> GetInitStates() const;\n\n  std::pair<std::vector<float>, std::vector<std::vector<uint8_t>>> Run(\n      std::vector<float> features,\n      std::vector<std::vector<uint8_t>> states) const;\n\n  int32_t ChunkSize() const;\n\n  int32_t ChunkShift() const;\n\n  int32_t VocabSize() const;\n\n  rknn_tensor_attr GetOutAttr() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_CTC_MODEL_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/rknn/context-blocking-queue-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass OnlineZipformerTransducerModelRknn::Impl {\n public:\n  ~Impl() {\n    auto ret = rknn_destroy(encoder_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the encoder context\");\n    }\n\n    ret = rknn_destroy(decoder_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the decoder context\");\n    }\n\n    ret = rknn_destroy(joiner_ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the joiner context\");\n    }\n  }\n\n  explicit Impl(const OnlineModelConfig &config) : config_(config) {\n    {\n      auto buf = ReadFile(config.transducer.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(config.transducer.joiner);\n      InitJoiner(buf.data(), buf.size());\n    }\n\n    PostInit();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const OnlineModelConfig &config) : config_(config) {\n    {\n      auto buf = ReadFile(mgr, config.transducer.encoder);\n      InitEncoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = 
ReadFile(mgr, config.transducer.decoder);\n      InitDecoder(buf.data(), buf.size());\n    }\n\n    {\n      auto buf = ReadFile(mgr, config.transducer.joiner);\n      InitJoiner(buf.data(), buf.size());\n    }\n\n    PostInit();\n  }\n\n  std::vector<std::vector<uint8_t>> GetEncoderInitStates() const {\n    // encoder_input_attrs_[0] is for the feature\n    // encoder_input_attrs_[1:] is for states\n    // so we use -1 here\n    std::vector<std::vector<uint8_t>> states(encoder_input_attrs_.size() - 1);\n\n    int32_t i = -1;\n    for (auto &attr : encoder_input_attrs_) {\n      i += 1;\n      if (i == 0) {\n        // skip processing the attr for features.\n        continue;\n      }\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        states[i - 1].resize(attr.n_elems * sizeof(float));\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        states[i - 1].resize(attr.n_elems * sizeof(int64_t));\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type: %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n    }\n\n    return states;\n  }\n\n  std::pair<std::vector<float>, std::vector<std::vector<uint8_t>>> RunEncoder(\n      std::vector<float> features, std::vector<std::vector<uint8_t>> states) {\n    std::vector<rknn_input> inputs(encoder_input_attrs_.size());\n\n    for (int32_t i = 0; i < static_cast<int32_t>(inputs.size()); ++i) {\n      auto &input = inputs[i];\n      auto &attr = encoder_input_attrs_[i];\n      input.index = attr.index;\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        input.type = RKNN_TENSOR_FLOAT32;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        input.type = RKNN_TENSOR_INT64;\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      input.fmt = attr.fmt;\n      if (i == 0) {\n        input.buf = 
reinterpret_cast<void *>(features.data());\n        input.size = features.size() * sizeof(float);\n      } else {\n        input.buf = reinterpret_cast<void *>(states[i - 1].data());\n        input.size = states[i - 1].size();\n      }\n    }\n\n    std::vector<float> encoder_out(encoder_output_attrs_[0].n_elems);\n\n    // Note(fangjun): We can reuse the memory from input argument `states`\n    // auto next_states = GetEncoderInitStates();\n    auto &next_states = states;\n\n    std::vector<rknn_output> outputs(encoder_output_attrs_.size());\n    for (int32_t i = 0; i < outputs.size(); ++i) {\n      auto &output = outputs[i];\n      auto &attr = encoder_output_attrs_[i];\n      output.index = attr.index;\n      output.is_prealloc = 1;\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        output.want_float = 1;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        output.want_float = 0;\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (i == 0) {\n        output.size = encoder_out.size() * sizeof(float);\n        output.buf = reinterpret_cast<void *>(encoder_out.data());\n      } else {\n        output.size = next_states[i - 1].size();\n        output.buf = reinterpret_cast<void *>(next_states[i - 1].data());\n      }\n    }\n\n    rknn_context encoder_ctx = encoder_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(encoder_ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set encoder inputs\");\n\n    ret = rknn_run(encoder_ctx, nullptr);\n\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run encoder\");\n\n    ret =\n        rknn_outputs_get(encoder_ctx, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get encoder output\");\n\n    for (int32_t i = 0; i < next_states.size(); ++i) {\n      const auto &attr = encoder_input_attrs_[i + 1];\n      
if (attr.n_dims == 4) {\n        // TODO(fangjun): The ConvertNCHWtoNHWC is copied from\n        // https://github.com/airockchip/rknn_model_zoo/blob/main/examples/zipformer/cpp/process.cc#L22\n        // I don't understand why we need to do that.\n        std::vector<uint8_t> dst(next_states[i].size());\n        int32_t n = attr.dims[0];\n        int32_t h = attr.dims[1];\n        int32_t w = attr.dims[2];\n        int32_t c = attr.dims[3];\n        ConvertNCHWtoNHWC(\n            reinterpret_cast<const float *>(next_states[i].data()), n, c, h, w,\n            reinterpret_cast<float *>(dst.data()));\n        next_states[i] = std::move(dst);\n      }\n    }\n\n    encoder_ctx_queue_->Put(encoder_ctx);\n\n    return {std::move(encoder_out), std::move(next_states)};\n  }\n\n  std::vector<float> RunDecoder(std::vector<int64_t> decoder_input) {\n    auto &attr = decoder_input_attrs_[0];\n    rknn_input input;\n\n    input.index = 0;\n    input.type = RKNN_TENSOR_INT64;\n    input.fmt = attr.fmt;\n    input.buf = decoder_input.data();\n    input.size = decoder_input.size() * sizeof(int64_t);\n\n    std::vector<float> decoder_out(decoder_output_attrs_[0].n_elems);\n    rknn_output output;\n    output.index = decoder_output_attrs_[0].index;\n    output.is_prealloc = 1;\n    output.want_float = 1;\n    output.size = decoder_out.size() * sizeof(float);\n    output.buf = decoder_out.data();\n\n    rknn_context decoder_ctx = decoder_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(decoder_ctx, 1, &input);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set decoder inputs\");\n\n    ret = rknn_run(decoder_ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run decoder\");\n\n    ret = rknn_outputs_get(decoder_ctx, 1, &output, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get decoder output\");\n\n    decoder_ctx_queue_->Put(decoder_ctx);\n\n    return decoder_out;\n  }\n\n  std::vector<float> RunJoiner(const float *encoder_out,\n                              
 const float *decoder_out) {\n    std::vector<rknn_input> inputs(2);\n    inputs[0].index = 0;\n    inputs[0].type = RKNN_TENSOR_FLOAT32;\n    inputs[0].fmt = joiner_input_attrs_[0].fmt;\n    inputs[0].buf = const_cast<float *>(encoder_out);\n    inputs[0].size = joiner_input_attrs_[0].n_elems * sizeof(float);\n\n    inputs[1].index = 1;\n    inputs[1].type = RKNN_TENSOR_FLOAT32;\n    inputs[1].fmt = joiner_input_attrs_[1].fmt;\n    inputs[1].buf = const_cast<float *>(decoder_out);\n    inputs[1].size = joiner_input_attrs_[1].n_elems * sizeof(float);\n\n    std::vector<float> joiner_out(joiner_output_attrs_[0].n_elems);\n    rknn_output output;\n    output.index = joiner_output_attrs_[0].index;\n    output.is_prealloc = 1;\n    output.want_float = 1;\n    output.size = joiner_out.size() * sizeof(float);\n    output.buf = joiner_out.data();\n\n    rknn_context joiner_ctx = joiner_ctx_queue_->Take();\n\n    auto ret = rknn_inputs_set(joiner_ctx, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set joiner inputs\");\n\n    ret = rknn_run(joiner_ctx, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run joiner\");\n\n    ret = rknn_outputs_get(joiner_ctx, 1, &output, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get joiner output\");\n\n    joiner_ctx_queue_->Put(joiner_ctx);\n\n    return joiner_out;\n  }\n\n  int32_t ContextSize() const { return context_size_; }\n\n  int32_t ChunkSize() const { return T_; }\n\n  int32_t ChunkShift() const { return decode_chunk_len_; }\n\n  int32_t VocabSize() const { return vocab_size_; }\n\n  rknn_tensor_attr GetEncoderOutAttr() const {\n    return encoder_output_attrs_[0];\n  }\n\n private:\n  void InitEncoder(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &encoder_ctx_);\n\n    InitInputOutputAttrs(encoder_ctx_, config_.debug, &encoder_input_attrs_,\n                         &encoder_output_attrs_);\n\n    rknn_custom_string 
custom_string =\n        GetCustomString(encoder_ctx_, config_.debug);\n\n    auto meta = Parse(custom_string, config_.debug);\n\n    if (meta.count(\"encoder_dims\")) {\n      SplitStringToIntegers(meta.at(\"encoder_dims\"), \",\", false,\n                            &encoder_dims_);\n    }\n\n    if (meta.count(\"attention_dims\")) {\n      SplitStringToIntegers(meta.at(\"attention_dims\"), \",\", false,\n                            &attention_dims_);\n    }\n\n    if (meta.count(\"num_encoder_layers\")) {\n      SplitStringToIntegers(meta.at(\"num_encoder_layers\"), \",\", false,\n                            &num_encoder_layers_);\n    }\n\n    if (meta.count(\"cnn_module_kernels\")) {\n      SplitStringToIntegers(meta.at(\"cnn_module_kernels\"), \",\", false,\n                            &cnn_module_kernels_);\n    }\n\n    if (meta.count(\"left_context_len\")) {\n      SplitStringToIntegers(meta.at(\"left_context_len\"), \",\", false,\n                            &left_context_len_);\n    }\n\n    if (meta.count(\"T\")) {\n      T_ = atoi(meta.at(\"T\").c_str());\n    }\n\n    if (meta.count(\"decode_chunk_len\")) {\n      decode_chunk_len_ = atoi(meta.at(\"decode_chunk_len\").c_str());\n    }\n\n    if (meta.count(\"context_size\")) {\n      context_size_ = atoi(meta.at(\"context_size\").c_str());\n    }\n\n    if (config_.debug) {\n      auto print = [](const std::vector<int32_t> &v, const char *name) {\n        std::ostringstream os;\n        os << name << \": \";\n        for (auto i : v) {\n          os << i << \" \";\n        }\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n        SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n      };\n      print(encoder_dims_, \"encoder_dims\");\n      print(attention_dims_, \"attention_dims\");\n      print(num_encoder_layers_, \"num_encoder_layers\");\n      print(cnn_module_kernels_, \"cnn_module_kernels\");\n      print(left_context_len_, \"left_context_len\");\n#if 
__OHOS__\n      SHERPA_ONNX_LOGE(\"T: %{public}d\", T_);\n      SHERPA_ONNX_LOGE(\"decode_chunk_len_: %{public}d\", decode_chunk_len_);\n#else\n      SHERPA_ONNX_LOGE(\"T: %d\", T_);\n      SHERPA_ONNX_LOGE(\"decode_chunk_len_: %d\", decode_chunk_len_);\n#endif\n    }\n  }\n\n  void InitDecoder(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &decoder_ctx_);\n\n    InitInputOutputAttrs(decoder_ctx_, config_.debug, &decoder_input_attrs_,\n                         &decoder_output_attrs_);\n\n    if (decoder_input_attrs_[0].type != RKNN_TENSOR_INT64) {\n      SHERPA_ONNX_LOGE(\"Expect int64 for decoder input. Given: %d, %s\",\n                       decoder_input_attrs_[0].type,\n                       get_type_string(decoder_input_attrs_[0].type));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    context_size_ = decoder_input_attrs_[0].dims[1];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"context_size: %d\", context_size_);\n    }\n  }\n\n  void InitJoiner(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &joiner_ctx_);\n\n    InitInputOutputAttrs(joiner_ctx_, config_.debug, &joiner_input_attrs_,\n                         &joiner_output_attrs_);\n\n    vocab_size_ = joiner_output_attrs_[0].dims[1];\n    if (config_.debug) {\n      SHERPA_ONNX_LOGE(\"vocab_size: %d\", vocab_size_);\n    }\n  }\n\n  void PostInit() {\n    encoder_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        encoder_ctx_, config_.num_threads);\n    decoder_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        decoder_ctx_, config_.num_threads);\n    joiner_ctx_queue_ = std::make_unique<ContextBlockingQueueRknn>(\n        joiner_ctx_, config_.num_threads);\n  }\n\n private:\n  OnlineModelConfig config_;\n  rknn_context encoder_ctx_ = 0;\n  rknn_context decoder_ctx_ = 0;\n  rknn_context joiner_ctx_ = 0;\n\n  std::unique_ptr<ContextBlockingQueueRknn> 
encoder_ctx_queue_;\n  std::unique_ptr<ContextBlockingQueueRknn> decoder_ctx_queue_;\n  std::unique_ptr<ContextBlockingQueueRknn> joiner_ctx_queue_;\n\n  std::vector<rknn_tensor_attr> encoder_input_attrs_;\n  std::vector<rknn_tensor_attr> encoder_output_attrs_;\n\n  std::vector<rknn_tensor_attr> decoder_input_attrs_;\n  std::vector<rknn_tensor_attr> decoder_output_attrs_;\n\n  std::vector<rknn_tensor_attr> joiner_input_attrs_;\n  std::vector<rknn_tensor_attr> joiner_output_attrs_;\n\n  std::vector<int32_t> encoder_dims_;\n  std::vector<int32_t> attention_dims_;\n  std::vector<int32_t> num_encoder_layers_;\n  std::vector<int32_t> cnn_module_kernels_;\n  std::vector<int32_t> left_context_len_;\n\n  int32_t T_ = 0;\n  int32_t decode_chunk_len_ = 0;\n\n  int32_t context_size_ = 2;\n  int32_t vocab_size_ = 0;\n};\n\nOnlineZipformerTransducerModelRknn::~OnlineZipformerTransducerModelRknn() =\n    default;  // NOLINT\n\nOnlineZipformerTransducerModelRknn::OnlineZipformerTransducerModelRknn(\n    const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nOnlineZipformerTransducerModelRknn::OnlineZipformerTransducerModelRknn(\n    Manager *mgr, const OnlineModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nstd::vector<std::vector<uint8_t>>\nOnlineZipformerTransducerModelRknn::GetEncoderInitStates() const {\n  return impl_->GetEncoderInitStates();\n}\n\nstd::pair<std::vector<float>, std::vector<std::vector<uint8_t>>>\nOnlineZipformerTransducerModelRknn::RunEncoder(\n    std::vector<float> features,\n    std::vector<std::vector<uint8_t>> states) const {\n  return impl_->RunEncoder(std::move(features), std::move(states));\n}\n\nstd::vector<float> OnlineZipformerTransducerModelRknn::RunDecoder(\n    std::vector<int64_t> decoder_input) const {\n  return impl_->RunDecoder(std::move(decoder_input));\n}\n\nstd::vector<float> OnlineZipformerTransducerModelRknn::RunJoiner(\n    const float *encoder_out, 
const float *decoder_out) const {\n  return impl_->RunJoiner(encoder_out, decoder_out);\n}\n\nint32_t OnlineZipformerTransducerModelRknn::ContextSize() const {\n  return impl_->ContextSize();\n}\n\nint32_t OnlineZipformerTransducerModelRknn::ChunkSize() const {\n  return impl_->ChunkSize();\n}\n\nint32_t OnlineZipformerTransducerModelRknn::ChunkShift() const {\n  return impl_->ChunkShift();\n}\n\nint32_t OnlineZipformerTransducerModelRknn::VocabSize() const {\n  return impl_->VocabSize();\n}\n\nrknn_tensor_attr OnlineZipformerTransducerModelRknn::GetEncoderOutAttr() const {\n  return impl_->GetEncoderOutAttr();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate OnlineZipformerTransducerModelRknn::OnlineZipformerTransducerModelRknn(\n    AAssetManager *mgr, const OnlineModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate OnlineZipformerTransducerModelRknn::OnlineZipformerTransducerModelRknn(\n    NativeResourceManager *mgr, const OnlineModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_RKNN_H_\n\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"rknn_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\n// this is for zipformer v1 and v2, i.e., the folder\n// pruned_transducer_statelss7_streaming\n// and\n// zipformer\n// from icefall\nclass OnlineZipformerTransducerModelRknn {\n public:\n  ~OnlineZipformerTransducerModelRknn();\n\n  explicit OnlineZipformerTransducerModelRknn(const OnlineModelConfig &config);\n\n  template <typename Manager>\n  OnlineZipformerTransducerModelRknn(Manager *mgr,\n                                     const OnlineModelConfig &config);\n\n  std::vector<std::vector<uint8_t>> GetEncoderInitStates() const;\n\n  std::pair<std::vector<float>, std::vector<std::vector<uint8_t>>> RunEncoder(\n      std::vector<float> features,\n      std::vector<std::vector<uint8_t>> states) const;\n\n  std::vector<float> RunDecoder(std::vector<int64_t> decoder_input) const;\n\n  std::vector<float> RunJoiner(const float *encoder_out,\n                               const float *decoder_out) const;\n\n  int32_t ContextSize() const;\n\n  int32_t ChunkSize() const;\n\n  int32_t ChunkShift() const;\n\n  int32_t VocabSize() const;\n\n  rknn_tensor_attr GetEncoderOutAttr() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_ONLINE_ZIPFORMER_TRANSDUCER_MODEL_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/silero-vad-model-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/silero-vad-model-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/silero-vad-model-rknn.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass SileroVadModelRknn::Impl {\n public:\n  ~Impl() {\n    auto ret = rknn_destroy(ctx_);\n    if (ret != RKNN_SUCC) {\n      SHERPA_ONNX_LOGE(\"Failed to destroy the context\");\n    }\n  }\n\n  explicit Impl(const VadModelConfig &config)\n      : config_(config), sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(config.silero_vad.model);\n    Init(buf.data(), buf.size());\n\n    SetCoreMask(ctx_, config_.num_threads);\n\n    if (sample_rate_ != 16000) {\n      SHERPA_ONNX_LOGE(\"Expected sample rate 16000. Given: %d\",\n                       config.sample_rate);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    min_silence_samples_ =\n        sample_rate_ * config_.silero_vad.min_silence_duration;\n\n    min_speech_samples_ = sample_rate_ * config_.silero_vad.min_speech_duration;\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const VadModelConfig &config)\n      : config_(config), sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(mgr, config.silero_vad.model);\n    Init(buf.data(), buf.size());\n\n    SetCoreMask(ctx_, config_.num_threads);\n\n    if (sample_rate_ != 16000) {\n      SHERPA_ONNX_LOGE(\"Expected sample rate 16000. 
Given: %d\",\n                       config.sample_rate);\n      exit(-1);\n    }\n\n    min_silence_samples_ =\n        sample_rate_ * config_.silero_vad.min_silence_duration;\n\n    min_speech_samples_ = sample_rate_ * config_.silero_vad.min_speech_duration;\n  }\n\n  void Reset() {\n    for (auto &s : states_) {\n      std::fill(s.begin(), s.end(), 0);\n    }\n\n    triggered_ = false;\n    current_sample_ = 0;\n    temp_start_ = 0;\n    temp_end_ = 0;\n  }\n\n  bool IsSpeech(const float *samples, int32_t n) {\n    if (n != WindowSize()) {\n      SHERPA_ONNX_LOGE(\"n: %d != window_size: %d\", n, WindowSize());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    float prob = Run(samples, n);\n\n    float threshold = config_.silero_vad.threshold;\n\n    current_sample_ += config_.silero_vad.window_size;\n\n    if (prob > threshold && temp_end_ != 0) {\n      temp_end_ = 0;\n    }\n\n    if (prob > threshold && temp_start_ == 0) {\n      // start speaking, but we require that it must satisfy\n      // min_speech_duration\n      temp_start_ = current_sample_;\n      return false;\n    }\n\n    if (prob > threshold && temp_start_ != 0 && !triggered_) {\n      if (current_sample_ - temp_start_ < min_speech_samples_) {\n        return false;\n      }\n\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && !triggered_) {\n      // silence\n      temp_start_ = 0;\n      temp_end_ = 0;\n      return false;\n    }\n\n    if ((prob > threshold - 0.15) && triggered_) {\n      // speaking\n      return true;\n    }\n\n    if ((prob > threshold) && !triggered_) {\n      // start speaking\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && triggered_) {\n      // stop to speak\n      if (temp_end_ == 0) {\n        temp_end_ = current_sample_;\n      }\n\n      if (current_sample_ - temp_end_ < min_silence_samples_) {\n        // continue speaking\n        return true;\n      }\n      // stopped speaking\n      
temp_start_ = 0;\n      temp_end_ = 0;\n      triggered_ = false;\n      return false;\n    }\n\n    return false;\n  }\n\n  int32_t WindowShift() const { return config_.silero_vad.window_size; }\n\n  int32_t WindowSize() const {\n    return config_.silero_vad.window_size + window_overlap_;\n  }\n\n  int32_t MinSilenceDurationSamples() const { return min_silence_samples_; }\n\n  int32_t MinSpeechDurationSamples() const { return min_speech_samples_; }\n\n  void SetMinSilenceDuration(float s) {\n    min_silence_samples_ = sample_rate_ * s;\n  }\n\n  void SetThreshold(float threshold) {\n    config_.silero_vad.threshold = threshold;\n  }\n\n  float Run(const float *samples, int32_t n) {\n    std::vector<rknn_input> inputs(input_attrs_.size());\n\n    for (int32_t i = 0; i < static_cast<int32_t>(inputs.size()); ++i) {\n      auto &input = inputs[i];\n      auto &attr = input_attrs_[i];\n      input.index = attr.index;\n\n      if (attr.type == RKNN_TENSOR_FLOAT16) {\n        input.type = RKNN_TENSOR_FLOAT32;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        input.type = RKNN_TENSOR_INT64;\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      input.fmt = attr.fmt;\n      if (i == 0) {\n        input.buf = reinterpret_cast<void *>(const_cast<float *>(samples));\n        input.size = n * sizeof(float);\n      } else {\n        input.buf = reinterpret_cast<void *>(states_[i - 1].data());\n        input.size = states_[i - 1].size() * sizeof(float);\n      }\n    }\n\n    std::vector<float> out(output_attrs_[0].n_elems);\n\n    auto &next_states = states_;\n\n    std::vector<rknn_output> outputs(output_attrs_.size());\n\n    for (int32_t i = 0; i < outputs.size(); ++i) {\n      auto &output = outputs[i];\n      auto &attr = output_attrs_[i];\n      output.index = attr.index;\n      output.is_prealloc = 1;\n\n      if (attr.type 
== RKNN_TENSOR_FLOAT16) {\n        output.want_float = 1;\n      } else if (attr.type == RKNN_TENSOR_INT64) {\n        output.want_float = 0;\n      } else {\n        SHERPA_ONNX_LOGE(\"Unsupported tensor type %d, %s\", attr.type,\n                         get_type_string(attr.type));\n        SHERPA_ONNX_EXIT(-1);\n      }\n\n      if (i == 0) {\n        output.size = out.size() * sizeof(float);\n        output.buf = reinterpret_cast<void *>(out.data());\n      } else {\n        output.size = next_states[i - 1].size() * sizeof(float);\n        output.buf = reinterpret_cast<void *>(next_states[i - 1].data());\n      }\n    }\n\n    auto ret = rknn_inputs_set(ctx_, inputs.size(), inputs.data());\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to set inputs\");\n\n    ret = rknn_run(ctx_, nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to run the model\");\n\n    ret = rknn_outputs_get(ctx_, outputs.size(), outputs.data(), nullptr);\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get model output\");\n\n    return out[0];\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    InitContext(model_data, model_data_length, config_.debug, &ctx_);\n\n    InitInputOutputAttrs(ctx_, config_.debug, &input_attrs_, &output_attrs_);\n\n    rknn_custom_string custom_string = GetCustomString(ctx_, config_.debug);\n\n    auto meta = Parse(custom_string, config_.debug);\n\n    if (config_.silero_vad.window_size != 512) {\n      SHERPA_ONNX_LOGE(\"we require window_size to be 512. 
Given: %d\",\n                       config_.silero_vad.window_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.debug) {\n      for (const auto &p : meta) {\n        SHERPA_ONNX_LOGE(\"%s: %s\", p.first.c_str(), p.second.c_str());\n      }\n    }\n\n    if (meta.count(\"model_type\") == 0) {\n      SHERPA_ONNX_LOGE(\"No model type found in '%s'\",\n                       config_.silero_vad.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.at(\"model_type\") != \"silero-vad-v4\") {\n      SHERPA_ONNX_LOGE(\"Expect model type silero-vad-v4 in '%s', given: '%s'\",\n                       config_.silero_vad.model.c_str(),\n                       meta.at(\"model_type\").c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.count(\"sample_rate\") == 0) {\n      SHERPA_ONNX_LOGE(\"No sample_rate found in '%s'\",\n                       config_.silero_vad.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.at(\"sample_rate\") != \"16000\") {\n      SHERPA_ONNX_LOGE(\"Expect sample rate 16000 in '%s', given: '%s'\",\n                       config_.silero_vad.model.c_str(),\n                       meta.at(\"sample_rate\").c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.count(\"version\") == 0) {\n      SHERPA_ONNX_LOGE(\"No version found in '%s'\",\n                       config_.silero_vad.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.at(\"version\") != \"4\") {\n      SHERPA_ONNX_LOGE(\"Expect version 4 in '%s', given: '%s'\",\n                       config_.silero_vad.model.c_str(),\n                       meta.at(\"version\").c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.count(\"h_shape\") == 0) {\n      SHERPA_ONNX_LOGE(\"No h_shape found in '%s'\",\n                       config_.silero_vad.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (meta.count(\"c_shape\") == 0) {\n      SHERPA_ONNX_LOGE(\"No c_shape found in '%s'\",\n                       
config_.silero_vad.model.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    std::vector<int64_t> h_shape;\n    std::vector<int64_t> c_shape;\n\n    SplitStringToIntegers(meta.at(\"h_shape\"), \",\", false, &h_shape);\n    SplitStringToIntegers(meta.at(\"c_shape\"), \",\", false, &c_shape);\n    if (h_shape.size() != 3 || c_shape.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Incorrect shape for h (%d) or c (%d)\",\n                       static_cast<int32_t>(h_shape.size()),\n                       static_cast<int32_t>(c_shape.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    states_.resize(2);\n    states_[0].resize(h_shape[0] * h_shape[1] * h_shape[2]);\n    states_[1].resize(c_shape[0] * c_shape[1] * c_shape[2]);\n\n    Reset();\n  }\n\n private:\n  VadModelConfig config_;\n  rknn_context ctx_ = 0;\n\n  std::vector<rknn_tensor_attr> input_attrs_;\n  std::vector<rknn_tensor_attr> output_attrs_;\n\n  std::vector<std::vector<float>> states_;\n\n  int64_t sample_rate_;\n  int32_t min_silence_samples_;\n  int32_t min_speech_samples_;\n\n  bool triggered_ = false;\n  int32_t current_sample_ = 0;\n  int32_t temp_start_ = 0;\n  int32_t temp_end_ = 0;\n\n  int32_t window_overlap_ = 0;\n};\n\nSileroVadModelRknn::SileroVadModelRknn(const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nSileroVadModelRknn::SileroVadModelRknn(Manager *mgr,\n                                       const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nSileroVadModelRknn::~SileroVadModelRknn() = default;\n\nvoid SileroVadModelRknn::Reset() { return impl_->Reset(); }\n\nbool SileroVadModelRknn::IsSpeech(const float *samples, int32_t n) {\n  return impl_->IsSpeech(samples, n);\n}\n\nint32_t SileroVadModelRknn::WindowSize() const { return impl_->WindowSize(); }\n\nint32_t SileroVadModelRknn::WindowShift() const { return impl_->WindowShift(); }\n\nint32_t SileroVadModelRknn::MinSilenceDurationSamples() const {\n  
return impl_->MinSilenceDurationSamples();\n}\n\nint32_t SileroVadModelRknn::MinSpeechDurationSamples() const {\n  return impl_->MinSpeechDurationSamples();\n}\n\nvoid SileroVadModelRknn::SetMinSilenceDuration(float s) {\n  impl_->SetMinSilenceDuration(s);\n}\n\nvoid SileroVadModelRknn::SetThreshold(float threshold) {\n  impl_->SetThreshold(threshold);\n}\n\nfloat SileroVadModelRknn::Compute(const float *samples, int32_t n) {\n  return impl_->Run(samples, n);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SileroVadModelRknn::SileroVadModelRknn(AAssetManager *mgr,\n                                                const VadModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate SileroVadModelRknn::SileroVadModelRknn(NativeResourceManager *mgr,\n                                                const VadModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/silero-vad-model-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/silero-vad-model-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_RKNN_SILERO_VAD_MODEL_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_SILERO_VAD_MODEL_RKNN_H_\n\n#include <memory>\n\n#include \"rknn_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\nnamespace sherpa_onnx {\n\nclass SileroVadModelRknn : public VadModel {\n public:\n  explicit SileroVadModelRknn(const VadModelConfig &config);\n\n  template <typename Manager>\n  SileroVadModelRknn(Manager *mgr, const VadModelConfig &config);\n\n  ~SileroVadModelRknn() override;\n\n  // reset the internal model states\n  void Reset() override;\n\n  /**\n   * @param samples Pointer to a 1-d array containing audio samples.\n   *                Each sample should be normalized to the range [-1, 1].\n   * @param n Number of samples.\n   *\n   * @return Return true if speech is detected. Return false otherwise.\n   */\n  bool IsSpeech(const float *samples, int32_t n) override;\n  float Compute(const float *samples, int32_t n) override;\n\n  // For silero vad V4, it is WindowShift().\n  int32_t WindowSize() const override;\n\n  // 512\n  int32_t WindowShift() const override;\n\n  int32_t MinSilenceDurationSamples() const override;\n  int32_t MinSpeechDurationSamples() const override;\n\n  void SetMinSilenceDuration(float s) override;\n  void SetThreshold(float threshold) override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_SILERO_VAD_MODEL_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/transducer-keyword-decoder-rknn.cc",
    "content": "// sherpa-onnx/csrc/rknn/transducer-keywords-decoder-rknn.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/transducer-keyword-decoder-rknn.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <cstring>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/log.h\"\n\nnamespace sherpa_onnx {\n\nTransducerKeywordResult TransducerKeywordDecoderRknn::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  TransducerKeywordResult r;\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = blank_id;\n\n  Hypotheses blank_hyp({{blanks, 0}});\n  r.hyps = std::move(blank_hyp);\n  return r;\n}\n\nstd::vector<std::vector<float>> GetDecoderOut(\n    OnlineZipformerTransducerModelRknn *model, const Hypotheses &hyp_vec);\n\nstd::vector<std::vector<float>> GetJoinerOutLogSoftmax(\n    OnlineZipformerTransducerModelRknn *model, const float *p_encoder_out,\n    const std::vector<std::vector<float>> &decoder_out);\n\nvoid TransducerKeywordDecoderRknn::Decode(std::vector<float> encoder_out,\n                                          OnlineStreamRknn *s) {\n  auto attr = model_->GetEncoderOutAttr();\n  int32_t num_frames = attr.dims[1];\n  int32_t encoder_out_dim = attr.dims[2];\n\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = 0;  // blank_id is hardcoded to 0\n\n  auto r = s->GetKeywordResult();\n\n  Hypotheses cur = std::move(r.hyps);\n  std::vector<Hypothesis> prev;\n\n  auto decoder_out = GetDecoderOut(model_, cur);\n\n  const float *p_encoder_out = encoder_out.data();\n\n  int32_t frame_offset = r.frame_offset;\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    prev = cur.Vec();\n    cur.Clear();\n\n    auto log_probs = GetJoinerOutLogSoftmax(model_, p_encoder_out, decoder_out);\n\n    auto log_probs_old = log_probs;\n\n 
   p_encoder_out += encoder_out_dim;\n\n    for (int32_t i = 0; i != prev.size(); ++i) {\n      auto log_prob = prev[i].log_prob;\n      for (auto &p : log_probs[i]) {\n        p += log_prob;\n      }\n    }\n\n    auto topk = TopkIndex(log_probs, max_active_paths_);\n\n    Hypotheses hyps;\n\n    for (auto k : topk) {\n      int32_t hyp_index = k / vocab_size;\n      int32_t new_token = k % vocab_size;\n\n      Hypothesis new_hyp = prev[hyp_index];\n      float context_score = 0;\n      auto context_state = new_hyp.context_state;\n\n      // blank is hardcoded to 0\n      // also, it treats unk as blank\n      if (new_token != 0 && new_token != unk_id_) {\n        new_hyp.ys.push_back(new_token);\n        new_hyp.timestamps.push_back(t + frame_offset);\n        new_hyp.ys_probs.push_back(exp(log_probs_old[hyp_index][new_token]));\n\n        new_hyp.num_trailing_blanks = 0;\n        auto context_res =\n            s->GetContextGraph()->ForwardOneStep(context_state, new_token);\n        context_score = std::get<0>(context_res);\n        new_hyp.context_state = std::get<1>(context_res);\n        // Start matching from the start state, forget the decoder history.\n        if (new_hyp.context_state->token == -1) {\n          new_hyp.ys = blanks;\n          new_hyp.timestamps.clear();\n          new_hyp.ys_probs.clear();\n        }\n      } else {\n        ++new_hyp.num_trailing_blanks;\n      }\n      new_hyp.log_prob = log_probs[hyp_index][new_token] + context_score;\n      hyps.Add(std::move(new_hyp));\n    }  // for (auto k : topk)\n\n    auto best_hyp = hyps.GetMostProbable(false);\n\n    auto status = s->GetContextGraph()->IsMatched(best_hyp.context_state);\n    bool matched = std::get<0>(status);\n    const ContextState *matched_state = std::get<1>(status);\n\n    if (matched) {\n      float ys_prob = 0.0;\n      for (int32_t i = 0; i < matched_state->level; ++i) {\n        ys_prob += best_hyp.ys_probs[i];\n      }\n      ys_prob /= matched_state->level;\n      
if (best_hyp.num_trailing_blanks > num_trailing_blanks_ &&\n          ys_prob >= matched_state->ac_threshold) {\n        r.tokens = {best_hyp.ys.end() - matched_state->level,\n                    best_hyp.ys.end()};\n        r.timestamps = {best_hyp.timestamps.end() - matched_state->level,\n                        best_hyp.timestamps.end()};\n        r.keyword = matched_state->phrase;\n\n        hyps = Hypotheses({{blanks, 0, s->GetContextGraph()->Root()}});\n      }\n    }\n\n    cur = std::move(hyps);\n    decoder_out = GetDecoderOut(model_, cur);\n  }\n\n  auto best_hyp = cur.GetMostProbable(false);\n  r.hyps = std::move(cur);\n  r.frame_offset += num_frames;\n  r.num_trailing_blanks = best_hyp.num_trailing_blanks;\n\n  s->SetKeywordResult(r);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/transducer-keyword-decoder-rknn.h",
    "content": "// sherpa-onnx/csrc/rknn/transducer-keywords-decoder-rknn.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_TRANSDUCER_KEYWORD_DECODER_RKNN_H_\n#define SHERPA_ONNX_CSRC_RKNN_TRANSDUCER_KEYWORD_DECODER_RKNN_H_\n\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/rknn/online-stream-rknn.h\"\n#include \"sherpa-onnx/csrc/rknn/online-zipformer-transducer-model-rknn.h\"\n#include \"sherpa-onnx/csrc/transducer-keyword-decoder.h\"\n\nnamespace sherpa_onnx {\n\nclass TransducerKeywordDecoderRknn {\n public:\n  TransducerKeywordDecoderRknn(OnlineZipformerTransducerModelRknn *model,\n                               int32_t max_active_paths,\n                               int32_t num_trailing_blanks, int32_t unk_id)\n      : model_(model),\n        max_active_paths_(max_active_paths),\n        num_trailing_blanks_(num_trailing_blanks),\n        unk_id_(unk_id) {}\n\n  TransducerKeywordResult GetEmptyResult() const;\n\n  void Decode(std::vector<float> encoder_out, OnlineStreamRknn *s);\n\n private:\n  OnlineZipformerTransducerModelRknn *model_;  // Not owned\n\n  int32_t max_active_paths_;\n  int32_t num_trailing_blanks_;\n  int32_t unk_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_TRANSDUCER_KEYWORD_DECODER_RKNN_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/utils.cc",
    "content": "// sherpa-onnx/csrc/utils.cc\n//\n// Copyright      2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/rknn/utils.h\"\n\n#include <string.h>\n\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/rknn/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid ConvertNCHWtoNHWC(const float *src, int32_t n, int32_t channel,\n                       int32_t height, int32_t width, float *dst) {\n  for (int32_t i = 0; i < n; ++i) {\n    for (int32_t h = 0; h < height; ++h) {\n      for (int32_t w = 0; w < width; ++w) {\n        for (int32_t c = 0; c < channel; ++c) {\n          // dst[h, w, c] = src[c, h, w]\n          dst[i * height * width * channel + h * width * channel + w * channel +\n              c] = src[i * height * width * channel + c * height * width +\n                       h * width + w];\n        }\n      }\n    }\n  }\n}\n\nstd::string ToString(const rknn_tensor_attr &attr) {\n  std::ostringstream os;\n  os << \"{\";\n  os << attr.index;\n  os << \", name: \" << attr.name;\n  os << \", shape: (\";\n  std::string sep;\n  for (int32_t i = 0; i < static_cast<int32_t>(attr.n_dims); ++i) {\n    os << sep << attr.dims[i];\n    sep = \",\";\n  }\n  os << \")\";\n  os << \", n_elems: \" << attr.n_elems;\n  os << \", size: \" << attr.size;\n  os << \", fmt: \" << get_format_string(attr.fmt);\n  os << \", type: \" << get_type_string(attr.type);\n  os << \", pass_through: \" << (attr.pass_through ? 
\"true\" : \"false\");\n  os << \"}\";\n  return os.str();\n}\n\nstd::unordered_map<std::string, std::string> Parse(\n    const rknn_custom_string &custom_string, bool debug /*= false*/) {\n  std::unordered_map<std::string, std::string> ans;\n  std::vector<std::string> fields;\n  SplitStringToVector(custom_string.string, \";\", false, &fields);\n\n  std::vector<std::string> tmp;\n  for (const auto &f : fields) {\n    SplitStringToVector(f, \"=\", false, &tmp);\n    if (tmp.size() != 2) {\n      SHERPA_ONNX_LOGE(\"Invalid custom string %s for %s\", custom_string.string,\n                       f.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n    ans[std::move(tmp[0])] = std::move(tmp[1]);\n  }\n\n  if (debug) {\n    for (const auto &p : ans) {\n      SHERPA_ONNX_LOGE(\"%s: %s\", p.first.c_str(), p.second.c_str());\n    }\n  }\n\n  return ans;\n}\n\nvoid InitContext(void *model_data, size_t model_data_length, bool debug,\n                 rknn_context *ctx) {\n  auto ret = rknn_init(ctx, model_data, model_data_length, 0, nullptr);\n  SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to init rknn\");\n\n  if (debug) {\n    rknn_sdk_version v;\n    ret = rknn_query(*ctx, RKNN_QUERY_SDK_VERSION, &v, sizeof(v));\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get rknn sdk version\");\n\n    SHERPA_ONNX_LOGE(\"sdk api version: %s, driver version: %s\", v.api_version,\n                     v.drv_version);\n  }\n}\n\nvoid InitInputOutputAttrs(rknn_context ctx, bool debug,\n                          std::vector<rknn_tensor_attr> *input_attrs,\n                          std::vector<rknn_tensor_attr> *output_attrs) {\n  rknn_input_output_num io_num;\n  auto ret = rknn_query(ctx, RKNN_QUERY_IN_OUT_NUM, &io_num, sizeof(io_num));\n  SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get I/O information for the model\");\n\n  if (debug) {\n    SHERPA_ONNX_LOGE(\"model: %d inputs, %d outputs\",\n                     static_cast<int32_t>(io_num.n_input),\n                     
static_cast<int32_t>(io_num.n_output));\n  }\n\n  input_attrs->resize(io_num.n_input);\n  output_attrs->resize(io_num.n_output);\n\n  int32_t i = 0;\n  for (auto &attr : *input_attrs) {\n    memset(&attr, 0, sizeof(attr));\n    attr.index = i;\n    ret = rknn_query(ctx, RKNN_QUERY_INPUT_ATTR, &attr, sizeof(attr));\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get attr for model input %d\", i);\n    i += 1;\n  }\n\n  if (debug) {\n    std::ostringstream os;\n    std::string sep;\n    for (auto &attr : *input_attrs) {\n      os << sep << ToString(attr);\n      sep = \"\\n\";\n    }\n    SHERPA_ONNX_LOGE(\"\\n----------Model inputs info----------\\n%s\",\n                     os.str().c_str());\n  }\n\n  i = 0;\n  for (auto &attr : *output_attrs) {\n    memset(&attr, 0, sizeof(attr));\n    attr.index = i;\n    ret = rknn_query(ctx, RKNN_QUERY_OUTPUT_ATTR, &attr, sizeof(attr));\n    SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to get attr for model output %d\", i);\n    i += 1;\n  }\n\n  if (debug) {\n    std::ostringstream os;\n    std::string sep;\n    for (auto &attr : *output_attrs) {\n      os << sep << ToString(attr);\n      sep = \"\\n\";\n    }\n    SHERPA_ONNX_LOGE(\"\\n----------Model outputs info----------\\n%s\",\n                     os.str().c_str());\n  }\n}\n\nrknn_custom_string GetCustomString(rknn_context ctx, bool debug) {\n  rknn_custom_string custom_string;\n  auto ret = rknn_query(ctx, RKNN_QUERY_CUSTOM_STRING, &custom_string,\n                        sizeof(custom_string));\n  SHERPA_ONNX_RKNN_CHECK(ret, \"Failed to read custom string from the model\");\n  if (debug) {\n    SHERPA_ONNX_LOGE(\"customs string: %s\", custom_string.string);\n  }\n  return custom_string;\n}\n\nvoid SetCoreMask(rknn_context ctx, int32_t num_threads) {\n  int32_t ret = RKNN_SUCC;\n  switch (num_threads) {\n    case 1:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_AUTO);\n      break;\n    case 0:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_0);\n      break;\n    case 
-1:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_1);\n      break;\n    case -2:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_2);\n      break;\n    case -3:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_0_1);\n      break;\n    case -4:\n      ret = rknn_set_core_mask(ctx, RKNN_NPU_CORE_0_1_2);\n      break;\n    default:\n      SHERPA_ONNX_LOGE(\n          \"Valid num_threads for rk npu is 1 (auto), 0 (core 0), -1 (core \"\n          \"1), -2 (core 2), -3 (core 0_1), -4 (core 0_1_2). Given: %d\",\n          num_threads);\n      break;\n  }\n  if (ret != RKNN_SUCC) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to select npu core to run the model (You can ignore it if \"\n        \"you are not using RK3588).\");\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/rknn/utils.h",
    "content": "// sherpa-onnx/csrc/rknn/utils.h\n//\n// Copyright      2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_RKNN_UTILS_H_\n#define SHERPA_ONNX_CSRC_RKNN_UTILS_H_\n\n#include <cstddef>\n#include <cstdint>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#include \"rknn_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\nvoid ConvertNCHWtoNHWC(const float *src, int32_t n, int32_t channel,\n                       int32_t height, int32_t width, float *dst);\n\nstd::string ToString(const rknn_tensor_attr &attr);\n\nstd::unordered_map<std::string, std::string> Parse(\n    const rknn_custom_string &custom_string, bool debug = false);\n\nvoid InitContext(void *model_data, size_t model_data_length, bool debug,\n                 rknn_context *ctx);\n\nvoid InitInputOutputAttrs(rknn_context ctx, bool debug,\n                          std::vector<rknn_tensor_attr> *input_attrs,\n                          std::vector<rknn_tensor_attr> *output_attrs);\n\nrknn_custom_string GetCustomString(rknn_context ctx, bool debug);\n\nvoid SetCoreMask(rknn_context ctx, int32_t num_threads);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_RKNN_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/sentence-piece-tokenizer-test.cc",
    "content": "// sherpa-onnx/csrc/sentence-piece-tokenizer-test.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/sentence-piece-tokenizer.h\"\n\n#include <fstream>\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nstatic const char dir[] = \"/tmp/sherpa-onnx-test-data\";\n\nTEST(SpTokenizer, TestEncode) {\n  auto vocab_json = std::string(dir) + \"/vocab.json\";\n  auto token_scores_json = std::string(dir) + \"/token_scores.json\";\n\n  if (!std::ifstream(vocab_json).good() ||\n      !std::ifstream(token_scores_json).good()) {\n    SHERPA_ONNX_LOGE(\n        \"No test data found, skipping TestEncode(). \"\n        \"You can download the test data from: \"\n        \"https://huggingface.co/csukuangfj/sherpa-onnx-test-data/tree/main \"\n        \"and put it inside \"\n        \"/tmp/sherpa-onnx-test-data\");\n    return;\n  }\n\n  auto sp = SentencePieceTokenizer(vocab_json, token_scores_json);\n  std::string text =\n      \"How are you doing today? Fantastic! How about you? I am OK.\";\n  std::vector<std::string> expected_tokens = {\n      \"▁How\", \"▁are\", \"▁you\",   \"▁doing\", \"▁today\", \"?\",\n      \"▁F\",   \"an\",   \"tastic\", \"!\",      \"▁How\",   \"▁about\",\n      \"▁you\", \"?\",    \"▁I\",     \"▁am\",    \"▁OK\",    \".\"};\n\n  std::vector<std::string> tokens = sp.EncodeTokens(text);\n  EXPECT_EQ(tokens, expected_tokens);\n\n  std::vector<int32_t> expected_ids = {668, 304, 270,  473, 630,  292,\n                                       496, 456, 2264, 682, 668,  315,\n                                       270, 292, 268,  686, 1183, 263};\n\n  std::vector<int32_t> token_ids = sp.EncodeIds(text);\n  EXPECT_EQ(token_ids, expected_ids);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/sentence-piece-tokenizer.cc",
    "content": "// sherpa-onnx/csrc/sentence-piece-tokenizer.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/sentence-piece-tokenizer.h\"\n\n#include <cstdio>\n#include <fstream>\n#include <limits>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"nlohmann/json.hpp\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nusing json = nlohmann::json;\nstatic constexpr float kNegInf = -1e30f;\n\nstatic json LoadJson(const std::string &filename) {\n  if (filename.empty()) {\n    SHERPA_ONNX_LOGE(\"Empty json filename\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  AssertFileExists(filename);\n\n  std::ifstream is(filename);\n  json j;\n  is >> j;\n  return j;\n}\n\nstatic json LoadJson(const std::vector<char> &buf) {\n  if (buf.empty()) {\n    SHERPA_ONNX_LOGE(\"Empty json buffer\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  return json::parse(buf.begin(), buf.end());\n}\n\nclass SentencePieceTokenizer::Impl {\n public:\n  Impl(const std::string &vocab_json, const std::string &token_scores_json) {\n    Init(LoadJson(vocab_json), LoadJson(token_scores_json));\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const std::string &vocab_json,\n       const std::string &token_scores_json) {\n    Init(LoadJson(ReadFile(mgr, vocab_json)),\n         LoadJson(ReadFile(mgr, token_scores_json)));\n  }\n\n  std::vector<int32_t> EncodeIds(const std::string &text) const {\n    std::vector<int32_t> ids;\n    EncodeInternal(text, &ids, nullptr);\n    return ids;\n  }\n\n  std::vector<std::string> EncodeTokens(const std::string &text) const {\n    std::vector<std::string> tokens;\n    EncodeInternal(text, nullptr, &tokens);\n    return tokens;\n  }\n\n private:\n  void Init(const json &vocab, 
const json &scores) {\n    InitVocabJson(vocab);\n    InitTokenScores(scores);\n\n    for (int i = 0; i < 256; ++i) {\n      byte_token_id_[i] = -1;\n      byte_token_score_[i] = kNegInf;\n    }\n\n    InitTrie();\n  }\n\n  void InitVocabJson(const std::string &filename) {\n    InitVocabJson(LoadJson(filename));\n  }\n\n  void InitVocabJson(const std::vector<char> &buf) {\n    InitVocabJson(LoadJson(buf));\n  }\n\n  void InitVocabJson(const json &j) {\n    token2id_.reserve(j.size());\n    id2token_.resize(j.size());\n\n    for (const auto &item : j.items()) {\n      token2id_[item.key()] = item.value();\n      id2token_[item.value()] = item.key();\n    }\n  }\n\n  void InitTokenScores(const std::string &filename) {\n    InitTokenScores(LoadJson(filename));\n  }\n\n  void InitTokenScores(const std::vector<char> &buf) {\n    InitTokenScores(LoadJson(buf));\n  }\n\n  void InitTokenScores(const json &j) {\n    token2score_.reserve(j.size());\n\n    for (const auto &item : j.items()) {\n      token2score_[item.key()] = item.value();\n    }\n  }\n\n  void InitTrie() {\n    trie_.reserve(token2id_.size() * 2);\n    trie_.push_back(TrieNode());  // root\n\n    for (const auto &kv : token2id_) {\n      const std::string &tok = kv.first;\n      int32_t id = kv.second;\n\n      int32_t node = 0;\n      for (unsigned char c : tok) {\n        auto it = trie_[node].next.find(c);\n        if (it == trie_[node].next.end()) {\n          int32_t new_node = trie_.size();\n          trie_[node].next[c] = new_node;\n          trie_.push_back(TrieNode());\n          node = new_node;\n        } else {\n          node = it->second;\n        }\n      }\n\n      trie_[node].token_id = id;\n      trie_[node].score = token2score_[tok];\n    }\n\n    // -------------------------\n    // Byte fallback\n    // -------------------------\n    for (int32_t i = 0; i < 256; ++i) {\n      char buf[8];\n      std::snprintf(buf, sizeof(buf), \"<0x%02X>\", i);\n      std::string tok(buf);\n\n      auto 
it = token2id_.find(tok);\n      if (it == token2id_.end()) {\n        SHERPA_ONNX_LOGE(\"Missing byte token: '%s'\", tok.c_str());\n        continue;\n      }\n\n      byte_token_id_[i] = it->second;\n      byte_token_score_[i] = token2score_[tok];\n    }\n  }\n\n  void EncodeInternal(const std::string &input, std::vector<int32_t> *ids,\n                      std::vector<std::string> *tokens) const {\n    // SentencePiece whitespace handling\n    std::string text;\n    text.reserve(input.size() + 8);\n\n    for (char c : input) {\n      if (c == ' ')\n        text.append(\"\\xE2\\x96\\x81\");  // ▁\n      else\n        text.push_back(c);\n    }\n\n    if (text.rfind(\"\\xE2\\x96\\x81\", 0) == std::string::npos) {\n      text.insert(0, \"\\xE2\\x96\\x81\");\n    }\n\n    const int32_t n = static_cast<int32_t>(text.size());\n    std::vector<float> dp(n + 1, kNegInf);\n    std::vector<int32_t> back(n + 1, -1);\n    std::vector<int32_t> back_id(n + 1, -1);\n\n    dp[n] = 0.0f;\n\n    // DP\n    for (int32_t i = n - 1; i >= 0; --i) {\n      int32_t node = 0;\n      for (int32_t j = i; j < n; ++j) {\n        unsigned char c = static_cast<unsigned char>(text[j]);\n        auto it = trie_[node].next.find(c);\n        if (it == trie_[node].next.end()) break;\n        node = it->second;\n\n        if (trie_[node].token_id >= 0) {\n          float score = trie_[node].score + dp[j + 1];\n          if (score > dp[i]) {\n            dp[i] = score;\n            back[i] = j + 1;\n            back_id[i] = trie_[node].token_id;\n          }\n        }\n      }\n\n      // byte fallback\n      if (back[i] < 0) {\n        unsigned char b = static_cast<unsigned char>(text[i]);\n        dp[i] = byte_token_score_[b] + dp[i + 1];\n        back[i] = i + 1;\n        back_id[i] = byte_token_id_[b];\n      }\n    }\n\n    // reconstruct\n    for (int32_t i = 0; i < n;) {\n      int32_t j = back[i];\n      int32_t id = back_id[i];\n      if (j <= i || id < 0) break;\n\n      if (ids != 
nullptr) {\n        ids->push_back(id);\n      }\n\n      if (tokens != nullptr) {\n        tokens->push_back(id2token_[id]);\n      }\n\n      i = j;\n    }\n  }\n\n private:\n  struct TrieNode {\n    std::unordered_map<unsigned char, int32_t> next;\n    int32_t token_id = -1;\n    float score = 0.0f;\n  };\n\n  std::vector<TrieNode> trie_;  // immutable after build\n  std::vector<std::string> id2token_;\n  std::unordered_map<std::string, int32_t> token2id_;\n  std::unordered_map<std::string, float> token2score_;\n\n  // <0xNN> byte fallback\n  int32_t byte_token_id_[256];\n  float byte_token_score_[256];\n};\n\nSentencePieceTokenizer::SentencePieceTokenizer(\n    const std::string &vocab_json, const std::string &token_scores_json)\n    : impl_(std::make_unique<Impl>(vocab_json, token_scores_json)) {}\n\ntemplate <typename Manager>\nSentencePieceTokenizer::SentencePieceTokenizer(\n    Manager *mgr, const std::string &vocab_json,\n    const std::string &token_scores_json)\n    : impl_(std::make_unique<Impl>(mgr, vocab_json, token_scores_json)) {}\n\nSentencePieceTokenizer::~SentencePieceTokenizer() = default;\n\nstd::vector<int32_t> SentencePieceTokenizer::EncodeIds(\n    const std::string &text) const {\n  return impl_->EncodeIds(text);\n}\n\nstd::vector<std::string> SentencePieceTokenizer::EncodeTokens(\n    const std::string &text) const {\n  return impl_->EncodeTokens(text);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SentencePieceTokenizer::SentencePieceTokenizer(\n    AAssetManager *mgr, const std::string &vocab_json,\n    const std::string &token_scores_json);\n#endif\n\n#if __OHOS__\ntemplate SentencePieceTokenizer::SentencePieceTokenizer(\n    NativeResourceManager *mgr, const std::string &vocab_json,\n    const std::string &token_scores_json);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/sentence-piece-tokenizer.h",
    "content": "// sherpa-onnx/csrc/sentence-piece-tokenizer.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n#define SHERPA_ONNX_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass SentencePieceTokenizer {\n public:\n  SentencePieceTokenizer(const std::string &vocab_json,\n                         const std::string &token_scores_json);\n\n  template <typename Manager>\n  SentencePieceTokenizer(Manager *mgr, const std::string &vocab_json,\n                         const std::string &token_scores_json);\n\n  ~SentencePieceTokenizer();\n\n  std::vector<int32_t> EncodeIds(const std::string &text) const;\n  std::vector<std::string> EncodeTokens(const std::string &text) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/session.cc",
    "content": "// sherpa-onnx/csrc/session.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/session.h\"\n\n#include <algorithm>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/provider.h\"\n#if defined(__APPLE__) && (ORT_API_VERSION >= 15) && \\\n    !defined(SHERPA_ONNX_DISABLE_COREML)\n#include \"coreml_provider_factory.h\"  // NOLINT\n#endif\n\n#if __ANDROID_API__ >= 27\n#include \"nnapi_provider_factory.h\"  // NOLINT\n#endif\n\n#if defined(_WIN32) && SHERPA_ONNX_ENABLE_DIRECTML == 1\n#include \"dml_provider_factory.h\"  // NOLINT\n#endif\n\n#if defined(SHERPA_ONNX_ENABLE_SPACEMIT)\n#include \"spacemit_ort_env.h\"  // NOLINT\n#endif\n\nnamespace sherpa_onnx {\n\nstatic void OrtStatusFailure(OrtStatus *status, const char *s) {\n  const auto &api = Ort::GetApi();\n  const char *msg = api.GetErrorMessage(status);\n  SHERPA_ONNX_LOGE(\n      \"Failed to enable TensorRT: %s. \"\n      \"Available providers: %s. 
Fallback to cuda\",\n      msg, s);\n  api.ReleaseStatus(status);\n}\n\nOrt::SessionOptions GetSessionOptionsImpl(\n    int32_t num_threads, const std::string &provider_str,\n    const ProviderConfig *provider_config /*= nullptr*/) {\n  Provider p = StringToProvider(provider_str);\n\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(num_threads);\n\n  sess_opts.SetInterOpNumThreads(num_threads);\n\n  std::vector<std::string> available_providers = Ort::GetAvailableProviders();\n  std::ostringstream os;\n  for (const auto &ep : available_providers) {\n    os << ep << \", \";\n  }\n\n  // Other possible options\n  // sess_opts.SetGraphOptimizationLevel(ORT_ENABLE_EXTENDED);\n  // sess_opts.SetLogSeverityLevel(ORT_LOGGING_LEVEL_VERBOSE);\n  // sess_opts.EnableProfiling(\"profile\");\n\n  // If you want to speed up initialization, please uncomment the following line\n  // sess_opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_DISABLE_ALL);\n\n  switch (p) {\n    case Provider::kCPU:\n      break;  // nothing to do for the CPU provider\n    case Provider::kXnnpack: {\n#if ORT_API_VERSION >= 12\n      if (std::find(available_providers.begin(), available_providers.end(),\n                    \"XnnpackExecutionProvider\") != available_providers.end()) {\n        sess_opts.AppendExecutionProvider(\"XNNPACK\");\n      } else {\n        SHERPA_ONNX_LOGE(\"Available providers: %s. Fallback to cpu!\",\n                         os.str().c_str());\n      }\n#else\n      SHERPA_ONNX_LOGE(\n          \"Does not support xnnpack for onnxruntime: %d. 
Fallback to cpu!\",\n          static_cast<int32_t>(ORT_API_VERSION));\n#endif\n      break;\n    }\n    case Provider::kTRT: {\n      if (provider_config == nullptr) {\n        SHERPA_ONNX_LOGE(\n            \"TensorRT is supported for online models only. \"\n            \"Support must be extended for offline and other models.\");\n        exit(1);\n      }\n      auto trt_config = provider_config->trt_config;\n      struct TrtPairs {\n        const char *op_keys;\n        const char *op_values;\n      };\n\n      auto device_id = std::to_string(provider_config->device);\n      auto trt_max_workspace_size =\n          std::to_string(trt_config.trt_max_workspace_size);\n      auto trt_max_partition_iterations =\n          std::to_string(trt_config.trt_max_partition_iterations);\n      auto trt_min_subgraph_size =\n          std::to_string(trt_config.trt_min_subgraph_size);\n      auto trt_fp16_enable = std::to_string(trt_config.trt_fp16_enable);\n      auto trt_detailed_build_log =\n          std::to_string(trt_config.trt_detailed_build_log);\n      auto trt_engine_cache_enable =\n          std::to_string(trt_config.trt_engine_cache_enable);\n      auto trt_timing_cache_enable =\n          std::to_string(trt_config.trt_timing_cache_enable);\n      auto trt_dump_subgraphs = std::to_string(trt_config.trt_dump_subgraphs);\n      std::vector<TrtPairs> trt_options = {\n          {\"device_id\", device_id.c_str()},\n          {\"trt_max_workspace_size\", trt_max_workspace_size.c_str()},\n          {\"trt_max_partition_iterations\",\n           trt_max_partition_iterations.c_str()},\n          {\"trt_min_subgraph_size\", trt_min_subgraph_size.c_str()},\n          {\"trt_fp16_enable\", trt_fp16_enable.c_str()},\n          {\"trt_detailed_build_log\", trt_detailed_build_log.c_str()},\n          {\"trt_engine_cache_enable\", trt_engine_cache_enable.c_str()},\n          {\"trt_engine_cache_path\", trt_config.trt_engine_cache_path.c_str()},\n          {\"trt_timing_cache_enable\", 
trt_timing_cache_enable.c_str()},\n          {\"trt_timing_cache_path\", trt_config.trt_timing_cache_path.c_str()},\n          {\"trt_dump_subgraphs\", trt_dump_subgraphs.c_str()}};\n      // ToDo : Trt configs\n      // \"trt_int8_enable\"\n      // \"trt_int8_use_native_calibration_table\"\n\n      std::vector<const char *> option_keys, option_values;\n      for (const TrtPairs &pair : trt_options) {\n        option_keys.emplace_back(pair.op_keys);\n        option_values.emplace_back(pair.op_values);\n      }\n\n      std::vector<std::string> available_providers =\n          Ort::GetAvailableProviders();\n      if (std::find(available_providers.begin(), available_providers.end(),\n                    \"TensorrtExecutionProvider\") != available_providers.end()) {\n        const auto &api = Ort::GetApi();\n\n        OrtTensorRTProviderOptionsV2 *tensorrt_options = nullptr;\n        OrtStatus *statusC =\n            api.CreateTensorRTProviderOptions(&tensorrt_options);\n        OrtStatus *statusU = api.UpdateTensorRTProviderOptions(\n            tensorrt_options, option_keys.data(), option_values.data(),\n            option_keys.size());\n        sess_opts.AppendExecutionProvider_TensorRT_V2(*tensorrt_options);\n\n        if (statusC) {\n          OrtStatusFailure(statusC, os.str().c_str());\n        }\n        if (statusU) {\n          OrtStatusFailure(statusU, os.str().c_str());\n        }\n\n        api.ReleaseTensorRTProviderOptions(tensorrt_options);\n      }\n      // break; is omitted here intentionally so that\n      // if TRT not available, CUDA will be used\n    }\n    case Provider::kCUDA: {\n      if (std::find(available_providers.begin(), available_providers.end(),\n                    \"CUDAExecutionProvider\") != available_providers.end()) {\n        // The CUDA provider is available, proceed with setting the options\n        OrtCUDAProviderOptions options;\n\n        if (provider_config != nullptr) {\n          options.device_id = 
provider_config->device;\n          options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearch(\n              provider_config->cuda_config.cudnn_conv_algo_search);\n        } else {\n          options.device_id = 0;\n          // Default OrtCudnnConvAlgoSearchExhaustive is extremely slow\n          options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchHeuristic;\n          // set more options on need\n        }\n        sess_opts.AppendExecutionProvider_CUDA(options);\n      } else {\n        SHERPA_ONNX_LOGE(\n            \"Please compile with -DSHERPA_ONNX_ENABLE_GPU=ON. Available \"\n            \"providers: %s. Fallback to cpu!\",\n            os.str().c_str());\n      }\n      break;\n    }\n    case Provider::kDirectML: {\n#if defined(_WIN32) && SHERPA_ONNX_ENABLE_DIRECTML == 1\n      sess_opts.DisableMemPattern();\n      sess_opts.SetExecutionMode(ORT_SEQUENTIAL);\n      int32_t device_id = 0;\n      OrtStatus *status =\n          OrtSessionOptionsAppendExecutionProvider_DML(sess_opts, device_id);\n      if (status) {\n        const auto &api = Ort::GetApi();\n        const char *msg = api.GetErrorMessage(status);\n        SHERPA_ONNX_LOGE(\"Failed to enable DirectML: %s. Fallback to cpu\", msg);\n        api.ReleaseStatus(status);\n      }\n#else\n      SHERPA_ONNX_LOGE(\"DirectML is for Windows only. Fallback to cpu!\");\n#endif\n      break;\n    }\n    case Provider::kCoreML: {\n#if defined(__APPLE__) && (ORT_API_VERSION >= 15) && \\\n    !defined(SHERPA_ONNX_DISABLE_COREML)\n      uint32_t coreml_flags = 0;\n      (void)OrtSessionOptionsAppendExecutionProvider_CoreML(sess_opts,\n                                                            coreml_flags);\n#else\n      SHERPA_ONNX_LOGE(\n          \"CoreML is for Apple only since onnxruntime>=1.15. 
Fallback to cpu!\");\n#endif\n      break;\n    }\n    case Provider::kNNAPI: {\n#if __ANDROID_API__ >= 27\n      SHERPA_ONNX_LOGE(\"Current API level %d \", (int32_t)__ANDROID_API__);\n\n      // Please see\n      // https://onnxruntime.ai/docs/execution-providers/NNAPI-ExecutionProvider.html#usage\n      // to enable different flags\n      uint32_t nnapi_flags = 0;\n      // nnapi_flags |= NNAPI_FLAG_USE_FP16;\n      // nnapi_flags |= NNAPI_FLAG_CPU_DISABLED;\n      OrtStatus *status = OrtSessionOptionsAppendExecutionProvider_Nnapi(\n          sess_opts, nnapi_flags);\n\n      if (status) {\n        const auto &api = Ort::GetApi();\n        const char *msg = api.GetErrorMessage(status);\n        SHERPA_ONNX_LOGE(\n            \"Failed to enable NNAPI: %s. Available providers: %s. Fallback to \"\n            \"cpu\",\n            msg, os.str().c_str());\n        api.ReleaseStatus(status);\n      } else {\n        SHERPA_ONNX_LOGE(\"Use nnapi\");\n      }\n#elif defined(__ANDROID_API__)\n      SHERPA_ONNX_LOGE(\n          \"Android NNAPI requires API level >= 27. Current API level %d \"\n          \"Fallback to cpu!\",\n          (int32_t)__ANDROID_API__);\n#else\n      SHERPA_ONNX_LOGE(\"NNAPI is for Android only. 
Fallback to cpu\");\n#endif\n      break;\n    }\n    case Provider::kSpacemiT: {\n#if defined(SHERPA_ONNX_ENABLE_SPACEMIT)\n      SHERPA_ONNX_LOGE(\"Use SpacemiT Execution Provider\");\n      // when using SpacemiT Execution Provider, set intra_op_num_threads and\n      // inter_op_num_threads to 1 can improve performance.\n      // all ops run on ep, no need to create multiple threads in onnxruntime.\n      // ep will create SPACEMIT_EP_INTRA_THREAD_NUM threads as intra threads.\n      std::unordered_map<std::string, std::string> provider_options;\n      SHERPA_ONNX_LOGE(\"Set IntraOpNumThreads to 1\");\n      sess_opts.SetIntraOpNumThreads(1);\n      SHERPA_ONNX_LOGE(\"Set InterOpNumThreads to 1\");\n      sess_opts.SetInterOpNumThreads(1);\n      SHERPA_ONNX_LOGE(\"Set SPACEMIT_EP_INTRA_THREAD_NUM to %d\", num_threads);\n      provider_options.insert(std::make_pair(\"SPACEMIT_EP_INTRA_THREAD_NUM\",\n                                             std::to_string(num_threads)));\n      OrtStatus *sts =\n          Ort::SessionOptionsSpaceMITEnvInit(sess_opts, provider_options);\n      if (sts) {\n        const auto &api = Ort::GetApi();\n        const char *msg = api.GetErrorMessage(sts);\n        SHERPA_ONNX_LOGE(\n            \"Failed to enable SpacemiT Execution Provider: %s. Fallback to cpu\",\n            msg);\n        api.ReleaseStatus(sts);\n      }\n#else\n      SHERPA_ONNX_LOGE(\n          \"SpacemiT Execution Provider is for SpacemiT AI-CPUs only. 
Fallback \"\n          \"to cpu!\");\n#endif\n      break;\n    }\n  }\n  return sess_opts;\n}\n\nOrt::SessionOptions GetSessionOptions(const OnlineModelConfig &config) {\n  return GetSessionOptionsImpl(config.num_threads,\n                               config.provider_config.provider,\n                               &config.provider_config);\n}\n\nOrt::SessionOptions GetSessionOptions(const OnlineModelConfig &config,\n                                      const std::string &model_type) {\n  /*\n    Transducer models : Only encoder will run with tensorrt,\n                        decoder and joiner will run with cuda\n  */\n  if (config.provider_config.provider == \"trt\" &&\n      (model_type == \"decoder\" || model_type == \"joiner\")) {\n    return GetSessionOptionsImpl(config.num_threads, \"cuda\",\n                                 &config.provider_config);\n  }\n  return GetSessionOptionsImpl(config.num_threads,\n                               config.provider_config.provider,\n                               &config.provider_config);\n}\n\nOrt::SessionOptions GetSessionOptions(const OfflineLMConfig &config) {\n  return GetSessionOptionsImpl(config.lm_num_threads, config.lm_provider);\n}\n\nOrt::SessionOptions GetSessionOptions(const OnlineLMConfig &config) {\n  return GetSessionOptionsImpl(config.lm_num_threads, config.lm_provider);\n}\n\nOrt::SessionOptions GetSessionOptions(int32_t num_threads,\n                                      const std::string &provider_str) {\n  return GetSessionOptionsImpl(num_threads, provider_str);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/session.h",
    "content": "// sherpa-onnx/csrc/session.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SESSION_H_\n#define SHERPA_ONNX_CSRC_SESSION_H_\n\n#include <string>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n#include \"sherpa-onnx/csrc/online-lm-config.h\"\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n\nnamespace sherpa_onnx {\n\nOrt::SessionOptions GetSessionOptionsImpl(\n    int32_t num_threads, const std::string &provider_str,\n    const ProviderConfig *provider_config = nullptr);\n\nOrt::SessionOptions GetSessionOptions(const OfflineLMConfig &config);\nOrt::SessionOptions GetSessionOptions(const OnlineLMConfig &config);\n\nOrt::SessionOptions GetSessionOptions(const OnlineModelConfig &config);\n\nOrt::SessionOptions GetSessionOptions(const OnlineModelConfig &config,\n                                      const std::string &model_type);\n\nOrt::SessionOptions GetSessionOptions(int32_t num_threads,\n                                      const std::string &provider_str);\n\ntemplate <typename T>\nOrt::SessionOptions GetSessionOptions(const T &config) {\n  return GetSessionOptionsImpl(config.num_threads, config.provider);\n}\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SESSION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-display.h",
    "content": "// sherpa-onnx/csrc/sherpa-display.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#pragma once\n\n#include <stdlib.h>\n\n#include <cstdint>\n#include <cstdio>\n#include <ctime>\n#include <iomanip>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\nnamespace sherpa_onnx {\n\nclass SherpaDisplay {\n public:\n  void UpdateText(const std::string &text) { current_text_ = text; }\n\n  void FinalizeCurrentSentence() {\n    if (!current_text_.empty() &&\n        (current_text_[0] != ' ' || current_text_.size() > 1)) {\n      sentences_.push_back({GetCurrentDateTime(), std::move(current_text_)});\n      // A moved-from std::string is valid but unspecified; reset it explicitly.\n      current_text_.clear();\n    }\n  }\n\n  void Display() const {\n    if (!sentences_.empty() || !current_text_.empty()) {\n      ClearScreen();\n    }\n\n    printf(\"=== Speech Recognition with Next-gen Kaldi ===\\n\");\n    printf(\"------------------------------\\n\");\n    if (!sentences_.empty()) {\n      int32_t i = 1;\n      for (const auto &p : sentences_) {\n        printf(\"[%s] %d. %s\\n\", p.first.c_str(), i, p.second.c_str());\n        i += 1;\n      }\n\n      printf(\"------------------------------\\n\");\n    }\n\n    if (!current_text_.empty()) {\n      printf(\"Recognizing: %s\\n\", current_text_.c_str());\n    }\n  }\n\n private:\n  static void ClearScreen() {\n#ifdef _MSC_VER\n    auto ret = system(\"cls\");\n#else\n    auto ret = system(\"clear\");\n#endif\n    (void)ret;\n  }\n\n  static std::string GetCurrentDateTime() {\n    std::ostringstream os;\n    auto t = std::time(nullptr);\n    auto tm = std::localtime(&t);\n    os << std::put_time(tm, \"%Y-%m-%d %H:%M:%S\");\n    return os.str();\n  }\n\n private:\n  std::vector<std::pair<std::string, std::string>> sentences_;\n  std::string current_text_;\n};\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-alsa-offline-audio-tagging.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-alsa-offline-audio-tagging.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>  // NOLINT\n#include <mutex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <thread>  // NOLINT\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kDecoding,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"Press Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"Start recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"Stop recording. 
Decoding ...\");\n        state = State::kDecoding;\n        break;\n      case State::kDecoding:\n        break;\n    }\n  }\n}\n\nstatic void Record(const char *device_name, int32_t expected_sample_rate) {\n  sherpa_onnx::Alsa alsa(device_name);\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n  while (!stop) {\n    const std::vector<float> &s = alsa.Read(chunk);\n    std::lock_guard<std::mutex> lock(samples_mutex);\n    samples.insert(samples.end(), s.begin(), s.end());\n  }\n}\n\nstatic void Handler(int32_t sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nAudio tagging from microphone (Linux only).\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n\n./bin/sherpa-onnx-alsa-offline-audio-tagging \\\n  --zipformer-model=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.onnx \\\n  --labels=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv \\\n    device_name\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\nfor a list of pre-trained models to download.\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::AudioTaggingConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"Creating audio tagger ...\");\n  sherpa_onnx::AudioTagging tagger(config);\n  SHERPA_ONNX_LOGE(\"Audio tagger created!\");\n\n  std::string device_name = po.GetArg(1);\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  int32_t sample_rate = 16000;  // fixed to 16000Hz for all models from icefall\n\n  std::thread t2(Record, device_name.c_str(), sample_rate);\n  using namespace std::chrono_literals;  // NOLINT\n  std::this_thread::sleep_for(100ms);    // sleep for 100ms\n  std::thread t(DetectKeyPress);\n\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kDecoding: {\n        std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          buf = std::move(samples);\n        }\n        SHERPA_ONNX_LOGE(\"Computing...\");\n        auto s = tagger.CreateStream();\n        s->AcceptWaveform(sample_rate, buf.data(), buf.size());\n        auto results = tagger.Compute(s.get());\n        SHERPA_ONNX_LOGE(\"Result is:\");\n\n        int32_t i = 0;\n        std::ostringstream os;\n        for (const auto &event : 
results) {\n          os << i << \": \" << event.ToString() << \"\\n\";\n          i += 1;\n        }\n\n        SHERPA_ONNX_LOGE(\"\\n%s\\n\", os.str().c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"Press Enter to start\");\n        break;\n      }\n    }\n\n    std::this_thread::sleep_for(20ms);  // sleep for 20ms\n  }\n  t.join();\n  t2.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-alsa-offline-speaker-identification.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-alsa-offline-speaker-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <fstream>\n#include <mutex>  // NOLINT\n#include <sstream>\n#include <string>\n#include <thread>  // NOLINT\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kComputing,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"\\nPress Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"\\nStart recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"\\nStop recording. 
Computing ...\");\n        state = State::kComputing;\n        break;\n      case State::kComputing:\n        break;\n    }\n  }\n}\n\nstatic void Record(const char *device_name, int32_t expected_sample_rate) {\n  sherpa_onnx::Alsa alsa(device_name);\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n  while (!stop) {\n    const std::vector<float> &s = alsa.Read(chunk);\n    std::lock_guard<std::mutex> lock(samples_mutex);\n    samples.insert(samples.end(), s.begin(), s.end());\n  }\n}\n\nstatic void Handler(int32_t sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nstatic std::vector<std::vector<float>> ComputeEmbeddings(\n    const std::vector<std::string> &filenames,\n    sherpa_onnx::SpeakerEmbeddingExtractor *extractor) {\n  std::vector<std::vector<float>> embedding_list;\n  embedding_list.reserve(filenames.size());\n\n  for (const auto &f : filenames) {\n    int32_t sampling_rate = -1;\n\n    bool is_ok = false;\n    const std::vector<float> samples =\n        sherpa_onnx::ReadWave(f, &sampling_rate, &is_ok);\n\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", f.c_str());\n      exit(-1);\n    }\n\n    auto s = extractor->CreateStream();\n    s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n    s->InputFinished();\n    auto embedding = extractor->Compute(s.get());\n    embedding_list.push_back(embedding);\n  }\n  return embedding_list;\n}\n\nstatic std::unordered_map<std::string, std::vector<std::string>>\nReadSpeakerFile(const std::string &filename) {\n  std::unordered_map<std::string, std::vector<std::string>> ans;\n\n  std::ifstream is(filename);\n  if (!is) {\n    // Report the failure and exit with a non-zero status\n    fprintf(stderr, \"Failed to open '%s'\\n\", filename.c_str());\n    exit(-1);\n  }\n\n  std::string line;\n  
std::string name;\n  std::string path;\n\n  while (std::getline(is, line)) {\n    std::istringstream iss(line);\n    name.clear();\n    path.clear();\n\n    iss >> name >> path;\n    if (!iss || !iss.eof() || name.empty() || path.empty()) {\n      fprintf(stderr, \"Invalid line: %s\\n\", line.c_str());\n      exit(-1);\n    }\n    ans[name].push_back(path);\n  }\n\n  return ans;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use non-streaming speaker identification.\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Run it !\n\n  ./bin/sherpa-onnx-alsa-offline-speaker-identification \\\n    --model=/path/to/your-model.onnx \\\n    --speaker-file=/path/to/speaker.txt \\\n    device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n  plughw:3,0\nas the device_name.\n\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  float threshold = 0.5;\n  std::string speaker_file;\n\n  po.Register(\"threshold\", &threshold,\n              \"Threshold for comparing embedding scores.\");\n\n  po.Register(\"speaker-file\", &speaker_file, \"Path to speaker.txt\");\n\n  sherpa_onnx::SpeakerEmbeddingExtractorConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config! Please use --help to view the usage.\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"\\nCreating extractor ...\");\n  sherpa_onnx::SpeakerEmbeddingExtractor extractor(config);\n  SHERPA_ONNX_LOGE(\"\\nextractor created!\");\n\n  sherpa_onnx::SpeakerEmbeddingManager manager(extractor.Dim());\n\n  auto name2files = ReadSpeakerFile(speaker_file);\n  for (const auto &p : name2files) {\n    SHERPA_ONNX_LOGE(\"\\nProcessing speaker %s\", p.first.c_str());\n    auto embedding_list = ComputeEmbeddings(p.second, &extractor);\n    manager.Add(p.first, embedding_list);\n  }\n\n  std::string device_name = po.GetArg(1);\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n  int32_t sample_rate = 16000;\n\n  std::thread t(DetectKeyPress);\n  std::thread t2(Record, device_name.c_str(), sample_rate);\n\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kComputing: {\n        
std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          buf = std::move(samples);\n        }\n\n        auto s = extractor.CreateStream();\n        s->AcceptWaveform(sample_rate, buf.data(), buf.size());\n        s->InputFinished();\n        auto embedding = extractor.Compute(s.get());\n        auto name = manager.Search(embedding.data(), threshold);\n\n        if (name.empty()) {\n          name = \"--Unknown--\";\n        }\n\n        SHERPA_ONNX_LOGE(\"\\nDone!\\nDetected speaker is: %s\", name.c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"\\nPress Enter to start\");\n        break;\n      }\n    }\n\n    using namespace std::chrono_literals;  // NOLINT\n    std::this_thread::sleep_for(20ms);     // sleep for 20ms\n  }\n\n  t.join();\n  t2.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-alsa-offline.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-alsa-offline.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <cctype>  // std::tolower\n#include <chrono>\n#include <mutex>\n#include <string>\n#include <thread>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kDecoding,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"Press Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"Start recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"Stop recording. 
Decoding ...\");\n        state = State::kDecoding;\n        break;\n      case State::kDecoding:\n        break;\n    }\n  }\n}\n\nstatic void Record(const char *device_name, int32_t expected_sample_rate) {\n  sherpa_onnx::Alsa alsa(device_name);\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n  while (!stop) {\n    const std::vector<float> &s = alsa.Read(chunk);\n    std::lock_guard<std::mutex> lock(samples_mutex);\n    samples.insert(samples.end(), s.begin(), s.end());\n  }\n}\n\nstatic void Handler(int32_t sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program uses non-streaming models with microphone for speech recognition.\nUsage:\n\n(1) Transducer from icefall\n\n  ./bin/sherpa-onnx-alsa-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    device_name\n\n(2) Paraformer from FunASR\n\n  ./bin/sherpa-onnx-alsa-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1 \\\n    device_name\n\n(3) Whisper models\n\n  ./bin/sherpa-onnx-alsa-offline \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1 \\\n    device_name\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n\nThe device 
name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineRecognizerConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"Creating recognizer ...\");\n  sherpa_onnx::OfflineRecognizer recognizer(config);\n  SHERPA_ONNX_LOGE(\"Recognizer created!\");\n\n  std::string device_name = po.GetArg(1);\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  int32_t sample_rate = config.feat_config.sampling_rate;\n\n  std::thread t(DetectKeyPress);\n  std::thread t2(Record, device_name.c_str(), sample_rate);\n\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kDecoding: {\n        std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          buf = std::move(samples);\n        }\n\n        auto s = recognizer.CreateStream();\n        s->AcceptWaveform(sample_rate, buf.data(), buf.size());\n        recognizer.DecodeStream(s.get());\n        SHERPA_ONNX_LOGE(\"Decoding Done! 
Result is:\");\n        SHERPA_ONNX_LOGE(\"%s\", s->GetResult().text.c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"Press Enter to start\");\n        break;\n      }\n    }\n\n    using namespace std::chrono_literals;  // NOLINT\n    std::this_thread::sleep_for(20ms);     // sleep for 20ms\n  }\n  t.join();\n  t2.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-alsa.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-alsa.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <cctype>  // std::tolower\n#include <cstdint>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/display.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nbool stop = false;\n\nstatic void Handler(int sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Exiting...\\n\");\n}\n\nint main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nUsage:\n  ./bin/sherpa-onnx-alsa \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    device_name\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlineRecognizerConfig config;\n\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n  sherpa_onnx::OnlineRecognizer recognizer(config);\n\n  int32_t expected_sample_rate = config.feat_config.sampling_rate;\n\n  std::string device_name = po.GetArg(1);\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  fprintf(stderr, \"Started! 
Please speak\\n\");\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n\n  std::string last_text;\n\n  auto stream = recognizer.CreateStream();\n\n  sherpa_onnx::Display display;\n\n  int32_t segment_index = 0;\n  while (!stop) {\n    const std::vector<float> &samples = alsa.Read(chunk);\n\n    stream->AcceptWaveform(expected_sample_rate, samples.data(),\n                           samples.size());\n\n    while (recognizer.IsReady(stream.get())) {\n      recognizer.DecodeStream(stream.get());\n    }\n\n    auto text = recognizer.GetResult(stream.get()).text;\n\n    bool is_endpoint = recognizer.IsEndpoint(stream.get());\n\n    if (is_endpoint && !config.model_config.paraformer.encoder.empty()) {\n      // For streaming paraformer models, since it has a large right chunk size\n      // we need to pad it on endpointing so that the last character\n      // can be recognized\n      std::vector<float> tail_paddings(\n          static_cast<int>(1.0 * expected_sample_rate));\n      stream->AcceptWaveform(expected_sample_rate, tail_paddings.data(),\n                             tail_paddings.size());\n      while (recognizer.IsReady(stream.get())) {\n        recognizer.DecodeStream(stream.get());\n      }\n      text = recognizer.GetResult(stream.get()).text;\n    }\n\n    if (!text.empty() && last_text != text) {\n      last_text = text;\n\n      std::transform(text.begin(), text.end(), text.begin(),\n                     [](auto c) { return std::tolower(c); });\n\n      display.Print(segment_index, text);\n      fflush(stderr);\n    }\n\n    if (is_endpoint) {\n      if (!text.empty()) {\n        ++segment_index;\n      }\n\n      recognizer.Reset(stream.get());\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-keyword-spotter-alsa.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-keyword-spotter-alsa.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <cstdint>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/display.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nbool stop = false;\n\nstatic void Handler(int sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Exiting...\\n\");\n}\n\nint main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nUsage:\n  ./bin/sherpa-onnx-keyword-spotter-alsa \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=2 \\\n    --keywords-file=keywords.txt \\\n    device_name\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\nfor a list of pre-trained models to download.\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::KeywordSpotterConfig config;\n\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n  sherpa_onnx::KeywordSpotter spotter(config);\n\n  int32_t expected_sample_rate = config.feat_config.sampling_rate;\n\n  std::string device_name = po.GetArg(1);\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  if (alsa.GetExpectedSampleRate() != expected_sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            expected_sample_rate);\n    exit(-1);\n  }\n\n  int32_t chunk = 0.1 * alsa.GetActualSampleRate();\n\n  std::string last_text;\n\n  auto stream = spotter.CreateStream();\n\n  sherpa_onnx::Display display;\n\n  int32_t keyword_index = 0;\n  while (!stop) {\n    const std::vector<float> &samples = alsa.Read(chunk);\n\n    stream->AcceptWaveform(expected_sample_rate, samples.data(),\n                           samples.size());\n\n    while (spotter.IsReady(stream.get())) {\n      spotter.DecodeStream(stream.get());\n\n      const auto r = spotter.GetResult(stream.get());\n      if (!r.keyword.empty()) {\n        display.Print(keyword_index, r.AsJsonString());\n        fflush(stderr);\n        keyword_index++;\n\n        spotter.Reset(stream.get());\n      }\n    }\n  
}\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-keyword-spotter-microphone.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-keyword-spotter-microphone.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/display.h\"\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nbool stop = false;\nfloat mic_sample_rate = 16000;\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void *user_data) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(user_data);\n\n  stream->AcceptWaveform(mic_sample_rate,\n                         reinterpret_cast<const float *>(input_buffer),\n                         frames_per_buffer);\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program uses streaming models with microphone for keyword spotting.\nUsage:\n\n  ./bin/sherpa-onnx-keyword-spotter-microphone \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=1 \\\n    --keywords-file=keywords.txt\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::KeywordSpotterConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  sherpa_onnx::KeywordSpotter spotter(config);\n  auto s = spotter.CreateStream();\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr, \" ./bin/sherpa-onnx-keyword-spotter-alsa \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n\n  mic.PrintDevices(device_index);\n\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the override first so that the log shows the rate actually used\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if 
(!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      s.get())) {\n    fprintf(stderr, \"portaudio error: %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  int32_t keyword_index = 0;\n  sherpa_onnx::Display display;\n  while (!stop) {\n    while (spotter.IsReady(s.get())) {\n      spotter.DecodeStream(s.get());\n\n      const auto r = spotter.GetResult(s.get());\n      if (!r.keyword.empty()) {\n        display.Print(keyword_index, r.AsJsonString());\n        fflush(stderr);\n        keyword_index++;\n\n        spotter.Reset(s.get());\n      }\n    }\n\n    Pa_Sleep(20);  // sleep for 20ms\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-keyword-spotter.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-keyword-spotter.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <chrono>\n#include <iomanip>\n#include <iostream>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\ntypedef struct {\n  std::unique_ptr<sherpa_onnx::OnlineStream> online_stream;\n  std::string filename;\n} Stream;\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nUsage:\n\n(1) Streaming transducer\n\n  ./bin/sherpa-onnx-keyword-spotter \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=2 \\\n    --keywords-file=keywords.txt \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\nNote: It supports decoding multiple files in batches\n\nDefault value for num_threads is 2.\nValid values for provider: cpu (default), cuda, coreml.\nfoo.wav should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::KeywordSpotterConfig config;\n\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() < 1) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  sherpa_onnx::KeywordSpotter keyword_spotter(config);\n\n  if (po.NumArgs() == 1) {\n    const std::string wav_filename = po.GetArg(1);\n\n    
int32_t sampling_rate = -1;\n\n    bool is_ok = false;\n    const std::vector<float> samples =\n        sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n      return -1;\n    }\n\n    auto begin = std::chrono::steady_clock::now();\n\n    auto s = keyword_spotter.CreateStream();\n    s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n    std::vector<float> tail_paddings(static_cast<int>(0.8 * sampling_rate));\n    // Note: We can call AcceptWaveform() multiple times.\n    s->AcceptWaveform(sampling_rate, tail_paddings.data(),\n                      tail_paddings.size());\n\n    s->InputFinished();\n\n    while (keyword_spotter.IsReady(s.get())) {\n      keyword_spotter.DecodeStream(s.get());\n\n      auto r = keyword_spotter.GetResult(s.get());\n      if (!r.keyword.empty()) {\n        keyword_spotter.Reset(s.get());\n\n        fprintf(stderr, \"%s\\n\", wav_filename.c_str());\n        fprintf(stdout, \"%s\\n\", r.AsJsonString().c_str());\n        fprintf(stderr, \"\\n\");\n      }\n    }\n\n    auto end = std::chrono::steady_clock::now();\n\n    float duration = samples.size() / static_cast<float>(sampling_rate);\n\n    float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n            .count() /\n        1000.;\n    float rtf = elapsed_seconds / duration;\n    fprintf(stderr, \"Number of threads: %d\\n\", config.model_config.num_threads);\n    fprintf(stderr, \"Audio duration: %.3f s\\n\", duration);\n    fprintf(stderr, \"Elapsed seconds: %.3f\\n\", elapsed_seconds);\n    fprintf(stderr, \"RTF = %.3f/%.3f = %.3f\\n\", elapsed_seconds, duration, rtf);\n\n  } else {\n    std::vector<Stream> ss;\n\n    for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n      const std::string wav_filename = po.GetArg(i);\n      int32_t sampling_rate = -1;\n\n      bool is_ok = false;\n      const std::vector<float> 
samples =\n          sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n\n      if (!is_ok) {\n        fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n        return -1;\n      }\n\n      auto s = keyword_spotter.CreateStream();\n      s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n      std::vector<float> tail_paddings(static_cast<int>(0.8 * sampling_rate));\n      // Note: We can call AcceptWaveform() multiple times.\n      s->AcceptWaveform(sampling_rate, tail_paddings.data(),\n                        tail_paddings.size());\n\n      // Call InputFinished() to indicate that no audio samples are available\n      s->InputFinished();\n      ss.push_back({std::move(s), wav_filename});\n    }\n\n    std::vector<sherpa_onnx::OnlineStream *> ready_streams;\n    for (;;) {\n      ready_streams.clear();\n      for (auto &s : ss) {\n        const auto p_ss = s.online_stream.get();\n        if (keyword_spotter.IsReady(p_ss)) {\n          ready_streams.push_back(p_ss);\n        }\n        std::ostringstream os;\n        const auto r = keyword_spotter.GetResult(p_ss);\n        if (!r.keyword.empty()) {\n          os << s.filename << \"\\n\";\n          fprintf(stderr, \"%s\", os.str().c_str());\n          fprintf(stdout, \"%s\\n\", r.AsJsonString().c_str());\n          fprintf(stderr, \"\\n\");\n        }\n      }\n\n      if (ready_streams.empty()) {\n        break;\n      }\n      keyword_spotter.DecodeStreams(ready_streams.data(), ready_streams.size());\n    }\n  }\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-microphone-offline-audio-tagging.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-microphone-offline-audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <cctype>  // std::tolower\n#include <mutex>\n#include <thread>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kDecoding,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"Press Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"Start recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"Stop recording. Decoding ...\");\n        state = State::kDecoding;\n        break;\n      case State::kDecoding:\n        break;\n    }\n  }\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(samples_mutex);\n\n  auto p = reinterpret_cast<const float *>(input_buffer);\n  samples.insert(samples.end(), p, p + frames_per_buffer);\n\n  return stop ? 
paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nAudio tagging from microphone.\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n\n./bin/sherpa-onnx-microphone-offline-audio-tagging \\\n  --zipformer-model=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.onnx \\\n  --labels=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\nfor more models.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::AudioTaggingConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    fprintf(stderr, \"\\nThis program does not support positional arguments\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"Creating audio tagger ...\");\n  sherpa_onnx::AudioTagging tagger(config);\n  SHERPA_ONNX_LOGE(\"Audio tagger created!\");\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr, \" ./bin/sherpa-onnx-alsa-offline-audio-tagging \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if 
(pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n\n  mic.PrintDevices(device_index);\n  float mic_sample_rate = 16000;\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the override first so that the message reports the new rate\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr /* user_data */)) {\n    fprintf(stderr, \"portaudio error: %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  std::thread t(DetectKeyPress);\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kDecoding: {\n        std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          buf = std::move(samples);\n        }\n\n        SHERPA_ONNX_LOGE(\"Computing...\");\n        auto s = tagger.CreateStream();\n        s->AcceptWaveform(mic_sample_rate, buf.data(), buf.size());\n        auto results = tagger.Compute(s.get());\n\n        SHERPA_ONNX_LOGE(\"Result is:\");\n\n        int32_t i = 0;\n        std::ostringstream os;\n        for (const auto &event : results) {\n          os << i << \": \" << event.ToString() << \"\\n\";\n          i += 1;\n        }\n\n        SHERPA_ONNX_LOGE(\"\\n%s\\n\", os.str().c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"Press Enter to start\");\n        break;\n      }\n    }\n\n    Pa_Sleep(20);  // sleep for 20ms\n  }\n  t.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-microphone-offline-speaker-identification.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-microphone-offline-speaker-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <fstream>\n#include <mutex>\n#include <sstream>\n#include <string>\n#include <thread>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kComputing,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"\\nPress Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"\\nStart recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"\\nStop recording. 
Computing ...\");\n        state = State::kComputing;\n        break;\n      case State::kComputing:\n        break;\n    }\n  }\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void *user_data) {\n  std::lock_guard<std::mutex> lock(samples_mutex);\n\n  auto p = reinterpret_cast<const float *>(input_buffer);\n  samples.insert(samples.end(), p, p + frames_per_buffer);\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nstatic std::vector<std::vector<float>> ComputeEmbeddings(\n    const std::vector<std::string> &filenames,\n    sherpa_onnx::SpeakerEmbeddingExtractor *extractor) {\n  std::vector<std::vector<float>> embedding_list;\n  embedding_list.reserve(filenames.size());\n\n  for (const auto &f : filenames) {\n    int32_t sampling_rate = -1;\n\n    bool is_ok = false;\n    const std::vector<float> samples =\n        sherpa_onnx::ReadWave(f, &sampling_rate, &is_ok);\n\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", f.c_str());\n      exit(-1);\n    }\n\n    auto s = extractor->CreateStream();\n    s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n    s->InputFinished();\n    auto embedding = extractor->Compute(s.get());\n    embedding_list.push_back(embedding);\n  }\n  return embedding_list;\n}\n\nstatic std::unordered_map<std::string, std::vector<std::string>>\nReadSpeakerFile(const std::string &filename) {\n  std::unordered_map<std::string, std::vector<std::string>> ans;\n\n  std::ifstream is(filename);\n  if (!is) {\n    fprintf(stderr, \"Failed to open %s\", filename.c_str());\n    
exit(-1);\n  }\n\n  std::string line;\n  std::string name;\n  std::string path;\n\n  while (std::getline(is, line)) {\n    std::istringstream iss(line);\n    name.clear();\n    path.clear();\n\n    iss >> name >> path;\n    if (!iss || !iss.eof() || name.empty() || path.empty()) {\n      fprintf(stderr, \"Invalid line: %s\\n\", line.c_str());\n      exit(-1);\n    }\n    ans[name].push_back(path);\n  }\n\n  return ans;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use non-streaming speaker identification.\nUsage:\n\n(1) Prepare a text file containing speaker related files.\n\nEach line in the text file contains two columns. The first column is the\nspeaker name, while the second column contains the wave file of the speaker.\n\nIf the text file contains multiple wave files for the same speaker, then the\nembeddings of these files are averaged.\n\nAn example text file is given below:\n\n    foo /path/to/a.wav\n    bar /path/to/b.wav\n    foo /path/to/c.wav\n    foobar /path/to/d.wav\n\nEach wave file should contain only a single channel; the sample format\nshould be int16_t; the sample rate can be arbitrary.\n\n(2) Download a model for computing speaker embeddings\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a model. 
An example is given below:\n\n    wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\n\nNote that `zh` means Chinese, while `en` means English.\n\n(3) Run it !\n\n  ./bin/sherpa-onnx-microphone-offline-speaker-identification \\\n    --model=/path/to/your-model.onnx \\\n    --speaker-file=/path/to/speaker.txt\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  float threshold = 0.5;\n  std::string speaker_file;\n\n  po.Register(\"threshold\", &threshold,\n              \"Threshold for comparing embedding scores.\");\n\n  po.Register(\"speaker-file\", &speaker_file, \"Path to speaker.txt\");\n\n  sherpa_onnx::SpeakerEmbeddingExtractorConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    fprintf(stderr,\n            \"This program does not support any positional arguments.\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config! 
Please use --help to view the usage.\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"\\nCreating extractor ...\");\n  sherpa_onnx::SpeakerEmbeddingExtractor extractor(config);\n  SHERPA_ONNX_LOGE(\"\\nextractor created!\");\n\n  sherpa_onnx::SpeakerEmbeddingManager manager(extractor.Dim());\n\n  auto name2files = ReadSpeakerFile(speaker_file);\n  for (const auto &p : name2files) {\n    SHERPA_ONNX_LOGE(\"\\nProcessing speaker %s\", p.first.c_str());\n    auto embedding_list = ComputeEmbeddings(p.second, &extractor);\n    manager.Add(p.first, embedding_list);\n  }\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr,\n            \" ./bin/sherpa-onnx-alsa-offline-speaker-identification \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the override first so that the message reports the new rate\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr /* user_data */)) {\n    fprintf(stderr, \"portaudio error: %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  std::thread t(DetectKeyPress);\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kComputing: {\n        std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n    
      buf = std::move(samples);\n        }\n\n        auto s = extractor.CreateStream();\n        s->AcceptWaveform(mic_sample_rate, buf.data(), buf.size());\n        s->InputFinished();\n        auto embedding = extractor.Compute(s.get());\n        auto name = manager.Search(embedding.data(), threshold);\n\n        if (name.empty()) {\n          name = \"--Unknown--\";\n        }\n\n        SHERPA_ONNX_LOGE(\"\\nDone!\\nDetected speaker is: %s\", name.c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"\\nPress Enter to start\");\n        break;\n      }\n    }\n\n    Pa_Sleep(20);  // sleep for 20ms\n  }\n  t.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-microphone-offline.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-microphone-offline.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <cctype>  // std::tolower\n#include <mutex>\n#include <thread>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n\nenum class State {\n  kIdle,\n  kRecording,\n  kDecoding,\n};\n\nState state = State::kIdle;\n\n// true to stop the program and exit\nbool stop = false;\n\nstd::vector<float> samples;\nstd::mutex samples_mutex;\n\nstatic void DetectKeyPress() {\n  SHERPA_ONNX_LOGE(\"Press Enter to start\");\n  int32_t key;\n  while (!stop && (key = getchar())) {\n    if (key != 0x0a) {\n      continue;\n    }\n\n    switch (state) {\n      case State::kIdle:\n        SHERPA_ONNX_LOGE(\"Start recording. Press Enter to stop recording\");\n        state = State::kRecording;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          samples.clear();\n        }\n        break;\n      case State::kRecording:\n        SHERPA_ONNX_LOGE(\"Stop recording. Decoding ...\");\n        state = State::kDecoding;\n        break;\n      case State::kDecoding:\n        break;\n    }\n  }\n}\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(samples_mutex);\n\n  auto p = reinterpret_cast<const float *>(input_buffer);\n  samples.insert(samples.end(), p, p + frames_per_buffer);\n\n  return stop ? 
paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Press Enter to exit\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program uses non-streaming models with microphone for speech recognition.\nUsage:\n\n(1) Transducer from icefall\n\n  ./bin/sherpa-onnx-microphone-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search\n\n(2) Paraformer from FunASR\n\n  ./bin/sherpa-onnx-microphone-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1\n\n(3) Whisper models\n\n  ./bin/sherpa-onnx-microphone-offline \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineRecognizerConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  SHERPA_ONNX_LOGE(\"Creating recognizer ...\");\n  sherpa_onnx::OfflineRecognizer recognizer(config);\n  SHERPA_ONNX_LOGE(\"Recognizer created!\");\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input 
device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr, \" ./bin/sherpa-onnx-alsa-offline \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the override first so that the message reports the new rate\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr /* user_data */)) {\n    fprintf(stderr, \"portaudio error: %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  std::thread t(DetectKeyPress);\n  while (!stop) {\n    switch (state) {\n      case State::kIdle:\n        break;\n      case State::kRecording:\n        break;\n      case State::kDecoding: {\n        std::vector<float> buf;\n        {\n          std::lock_guard<std::mutex> lock(samples_mutex);\n          buf = std::move(samples);\n        }\n\n        auto s = recognizer.CreateStream();\n        s->AcceptWaveform(mic_sample_rate, buf.data(), buf.size());\n        recognizer.DecodeStream(s.get());\n        SHERPA_ONNX_LOGE(\"Decoding Done! Result is:\");\n        SHERPA_ONNX_LOGE(\"%s\", s->GetResult().text.c_str());\n\n        state = State::kIdle;\n        SHERPA_ONNX_LOGE(\"Press Enter to start\");\n        break;\n      }\n    }\n\n    Pa_Sleep(20);  // sleep for 20ms\n  }\n  t.join();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-microphone.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-microphone.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <clocale>\n#include <cwctype>\n#include <string>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/display.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n\nbool stop = false;\nfloat mic_sample_rate = 16000;\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void *user_data) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(user_data);\n\n  stream->AcceptWaveform(mic_sample_rate,\n                         reinterpret_cast<const float *>(input_buffer),\n                         frames_per_buffer);\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nstatic std::string tolowerUnicode(const std::string &input_str) {\n  // Use system locale\n  std::setlocale(LC_ALL, \"\");\n\n  // From char string to wchar string\n  std::wstring input_wstr(input_str.size() + 1, '\\0');\n  std::mbstowcs(&input_wstr[0], input_str.c_str(), input_str.size());\n  std::wstring lowercase_wstr;\n\n  for (wchar_t wc : input_wstr) {\n    if (std::iswupper(wc)) {\n      lowercase_wstr += std::towlower(wc);\n    } else {\n      lowercase_wstr += wc;\n    }\n  }\n\n  // Back to char string\n  std::string lowercase_str(input_str.size() + 1, '\\0');\n  std::wcstombs(&lowercase_str[0], lowercase_wstr.c_str(),\n                lowercase_wstr.size());\n\n  return lowercase_str;\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program uses streaming models with microphone for speech recognition.\nUsage:\n\n  ./bin/sherpa-onnx-microphone \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlineRecognizerConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  sherpa_onnx::OnlineRecognizer recognizer(config);\n  auto s = recognizer.CreateStream();\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default 
input device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr, \" ./bin/sherpa-onnx-alsa \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n\n  mic.PrintDevices(device_index);\n\n  // Update the global mic_sample_rate (no local copy) so that RecordCallback\n  // sees the same rate that is passed to OpenDevice\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      s.get())) {\n    fprintf(stderr, \"portaudio error: %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  std::string last_text;\n  int32_t segment_index = 0;\n  sherpa_onnx::Display display(30);\n  while (!stop) {\n    while (recognizer.IsReady(s.get())) {\n      recognizer.DecodeStream(s.get());\n    }\n\n    auto text = recognizer.GetResult(s.get()).text;\n    bool is_endpoint = recognizer.IsEndpoint(s.get());\n\n    if (is_endpoint && !config.model_config.paraformer.encoder.empty()) {\n      // For streaming paraformer models, since it has a large right chunk size\n      // we need to pad it on endpointing so that the last character\n      // can be recognized\n      std::vector<float> tail_paddings(static_cast<int>(1.0 * mic_sample_rate));\n      s->AcceptWaveform(mic_sample_rate, tail_paddings.data(),\n                        tail_paddings.size());\n      while (recognizer.IsReady(s.get())) {\n        recognizer.DecodeStream(s.get());\n      }\n      text = recognizer.GetResult(s.get()).text;\n    }\n\n    if (!text.empty() && last_text != text) {\n      last_text = text;\n      display.Print(segment_index, tolowerUnicode(text));\n      fflush(stderr);\n    }\n\n    if 
(is_endpoint) {\n      if (!text.empty()) {\n        ++segment_index;\n      }\n\n      recognizer.Reset(s.get());\n    }\n\n    Pa_Sleep(20);  // sleep for 20ms\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-audio-tagging.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nint32_t main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nAudio tagging from a file.\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\ntar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\nrm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2\n\n./bin/sherpa-onnx-offline-audio-tagging \\\n  --zipformer-model=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.onnx \\\n  --labels=./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv \\\n  sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/0.wav\n\nInput wave files should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\nfor more models.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::AudioTaggingConfig config;\n  config.Register(&po);\n  po.Read(argc, argv);\n\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"\\nError: Please provide 1 wave file\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  sherpa_onnx::AudioTagging tagger(config);\n  std::string wav_filename = po.GetArg(1);\n\n  int32_t sampling_rate = -1;\n\n  bool is_ok = false;\n  const std::vector<float> samples =\n      sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n\n  if (!is_ok) {\n    
fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n    return -1;\n  }\n\n  const float duration = samples.size() / static_cast<float>(sampling_rate);\n\n  fprintf(stderr, \"Start to compute\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  auto stream = tagger.CreateStream();\n\n  stream->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n  auto results = tagger.Compute(stream.get());\n  const auto end = std::chrono::steady_clock::now();\n  fprintf(stderr, \"Done\\n\");\n\n  int32_t i = 0;\n\n  for (const auto &event : results) {\n    fprintf(stderr, \"%d: \", i);\n    fprintf(stdout, \"%s\\n\", event.ToString().c_str());\n    i += 1;\n  }\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Wave duration: %.3f\\n\", duration);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-denoiser.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-denoiser.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include <stdio.h>\n\n#include <chrono>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nNon-streaming speech denoising with sherpa-onnx.\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\nto download models.\n\nUsage:\n\n(1) Use gtcrn models\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n./bin/sherpa-onnx-offline-denoiser \\\n  --speech-denoiser-gtcrn-model=gtcrn_simple.onnx \\\n  --input-wav=input.wav \\\n  --output-wav=output_16k.wav\n\n(2) Use DPDFNet models at 16 kHz or 48 kHz\n\n# Download DPDFNet models from either:\n#   https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n#   https://huggingface.co/Ceva-IP/DPDFNet\n\n./bin/sherpa-onnx-offline-denoiser \\\n  --speech-denoiser-dpdfnet-model=dpdfnet4.onnx \\\n  --input-wav=input.wav \\\n  --output-wav=output_16k.wav\n\n# You can also use other 16 kHz DPDFNet models such as:\n#   dpdfnet_baseline.onnx\n#   dpdfnet2.onnx\n#   dpdfnet8.onnx\n\n./bin/sherpa-onnx-offline-denoiser \\\n  --speech-denoiser-dpdfnet-model=dpdfnet2_48khz_hr.onnx \\\n  --input-wav=input.wav \\\n  --output-wav=output_48k.wav\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineSpeechDenoiserConfig config;\n  std::string input_wave;\n  std::string output_wave;\n\n  config.Register(&po);\n  po.Register(\"input-wav\", &input_wave, \"Path to input wav.\");\n  po.Register(\"output-wav\", &output_wave, \"Path to output wav\");\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    fprintf(stderr, \"Please don't give positional arguments\\n\");\n    
po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (input_wave.empty()) {\n    fprintf(stderr, \"Please provide --input-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (output_wave.empty()) {\n    fprintf(stderr, \"Please provide --output-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  sherpa_onnx::OfflineSpeechDenoiser denoiser(config);\n  int32_t sampling_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(input_wave, &sampling_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", input_wave.c_str());\n    return -1;\n  }\n\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n  auto result = denoiser.Run(samples.data(), samples.size(), sampling_rate);\n  const auto end = std::chrono::steady_clock::now();\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"Done\\n\");\n  is_ok = sherpa_onnx::WriteWave(output_wave, result.sample_rate,\n                                 result.samples.data(), result.samples.size());\n  if (is_ok) {\n    fprintf(stderr, \"Saved to %s\\n\", output_wave.c_str());\n  } else {\n    fprintf(stderr, \"Failed to save to %s\\n\", output_wave.c_str());\n  }\n\n  float duration = samples.size() / static_cast<float>(sampling_rate);\n  fprintf(stderr, \"num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-language-identification.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-language-identification.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <chrono>  // NOLINT\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nSpoken language identification with sherpa-onnx.\n\nUsage:\n\n(1) Use a whisper multilingual model\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\ntar xvf sherpa-onnx-whisper-tiny.tar.bz2\nrm sherpa-onnx-whisper-tiny.tar.bz2\n\nWe only use the int8.onnx models below.\n\n./bin/sherpa-onnx-offline-spoken-language-identification \\\n  --whisper-encoder=sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx \\\n  --whisper-decoder=sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx \\\n  --num-threads=1 \\\n  /path/to/foo.wav\n\nfoo.wav should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\nYou can find test waves for different languages at\nhttps://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/tree/main/test_wavs\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html\nNote that only whisper multilingual models are supported. 
For instance,\n\"tiny\" is supported but \"tiny.en\" is not.\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::SpokenLanguageIdentificationConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Error: Please provide 1 wave file.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating spoken language identifier ...\\n\");\n  sherpa_onnx::SpokenLanguageIdentification slid(config);\n\n  fprintf(stderr, \"Started\\n\");\n  const std::string wav_filename = po.GetArg(1);\n\n  int32_t sampling_rate = -1;\n  bool is_ok = false;\n  const std::vector<float> samples =\n      sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n    return -1;\n  }\n  float duration = samples.size() / static_cast<float>(sampling_rate);\n\n  const auto begin = std::chrono::steady_clock::now();\n\n  auto s = slid.CreateStream();\n  s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n  auto language = slid.Compute(s.get());\n\n  const auto end = std::chrono::steady_clock::now();\n\n  fprintf(stderr, \"Done!\\n\\n\");\n  fprintf(stderr, \"%s\\n\", wav_filename.c_str());\n  fprintf(stderr, \"Detected language: \");\n  fprintf(stdout, \"%s\\n\", language.c_str());\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"num threads: %d\\n\", config.num_threads);\n\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          
elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-parallel.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-parallel.cc\n//\n// Copyright (c)  2022-2023  cuidc\n\n#include <stdio.h>\n\n#include <atomic>\n#include <chrono>\n#include <fstream>\n#include <mutex>\n#include <string>\n#include <thread>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nstd::atomic<int> wav_index(0);\nstd::mutex mtx;\n\nstd::vector<std::vector<std::string>> SplitToBatches(\n    const std::vector<std::string> &input, int32_t batch_size) {\n  std::vector<std::vector<std::string>> outputs;\n  auto itr = input.cbegin();\n  int32_t process_num = 0;\n\n  while (process_num + batch_size <= static_cast<int32_t>(input.size())) {\n    auto chunk_end = itr + batch_size;\n    outputs.emplace_back(itr, chunk_end);\n    itr = chunk_end;\n    process_num += batch_size;\n  }\n  if (itr != input.cend()) {\n    outputs.emplace_back(itr, input.cend());\n  }\n  return outputs;\n}\n\nstd::vector<std::string> LoadScpFile(const std::string &wav_scp_path) {\n  std::vector<std::string> wav_paths;\n  std::ifstream in(wav_scp_path);\n  if (!in.is_open()) {\n    fprintf(stderr, \"Failed to open file: %s.\\n\", wav_scp_path.c_str());\n    return wav_paths;\n  }\n  std::string line, column1, column2;\n  while (std::getline(in, line)) {\n    std::istringstream iss(line);\n    iss >> column1 >> column2;\n    wav_paths.emplace_back(std::move(column2));\n  }\n\n  return wav_paths;\n}\n\nvoid AsrInference(const std::vector<std::vector<std::string>> &chunk_wav_paths,\n                  sherpa_onnx::OfflineRecognizer *recognizer,\n                  float *total_length, float *total_time) {\n  std::vector<std::unique_ptr<sherpa_onnx::OfflineStream>> ss;\n  std::vector<sherpa_onnx::OfflineStream *> ss_pointers;\n  float duration = 0.0f;\n  float elapsed_seconds_batch = 0.0f;\n\n  // warm up\n  for (const auto &wav_filename : 
chunk_wav_paths[0]) {\n    int32_t sampling_rate = -1;\n    bool is_ok = false;\n    const std::vector<float> samples =\n        sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n      continue;\n    }\n    duration += samples.size() / static_cast<float>(sampling_rate);\n    auto s = recognizer->CreateStream();\n    s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n    ss.push_back(std::move(s));\n    ss_pointers.push_back(ss.back().get());\n  }\n  recognizer->DecodeStreams(ss_pointers.data(), ss_pointers.size());\n  ss_pointers.clear();\n  ss.clear();\n\n  while (true) {\n    int chunk = wav_index.fetch_add(1);\n    if (chunk >= static_cast<int32_t>(chunk_wav_paths.size())) {\n      break;\n    }\n    const auto &wav_paths = chunk_wav_paths[chunk];\n    const auto begin = std::chrono::steady_clock::now();\n    for (const auto &wav_filename : wav_paths) {\n      int32_t sampling_rate = -1;\n      bool is_ok = false;\n      const std::vector<float> samples =\n          sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n      if (!is_ok) {\n        fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n        continue;\n      }\n      duration += samples.size() / static_cast<float>(sampling_rate);\n      auto s = recognizer->CreateStream();\n      s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n      ss.push_back(std::move(s));\n      ss_pointers.push_back(ss.back().get());\n    }\n    recognizer->DecodeStreams(ss_pointers.data(), ss_pointers.size());\n    const auto end = std::chrono::steady_clock::now();\n    float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n            .count() /\n        1000.;\n    elapsed_seconds_batch += elapsed_seconds;\n    int i = 0;\n    for (const auto &wav_filename : wav_paths) {\n      fprintf(stderr, \"%s\\n\", 
wav_filename.c_str());\n      fprintf(stdout, \"%s\\n\", ss[i]->GetResult().AsJsonString().c_str());\n      fprintf(stderr, \"----\\n\");\n      i = i + 1;\n    }\n    ss_pointers.clear();\n    ss.clear();\n  }\n\n  {\n    std::lock_guard<std::mutex> guard(mtx);\n    *total_length += duration;\n    if (*total_time < elapsed_seconds_batch) {\n      *total_time = elapsed_seconds_batch;\n    }\n  }\n}\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nSpeech recognition using non-streaming models with sherpa-onnx.\n\nUsage:\n\n(1) Transducer from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\n\n  ./bin/sherpa-onnx-offline-parallel \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    --batch-size=8 \\\n    --nj=1 \\\n    --wav-scp=wav.scp\n\n  ./bin/sherpa-onnx-offline-parallel \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    --batch-size=1 \\\n    --nj=8 \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(2) Paraformer from FunASR\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\n\n  ./bin/sherpa-onnx-offline-parallel \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(3) Whisper models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n\n  ./bin/sherpa-onnx-offline-parallel \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    
--whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1 \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(4) NeMo CTC models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\n\n  ./bin/sherpa-onnx-offline-parallel \\\n    --tokens=./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt \\\n    --nemo-ctc-model=./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    --debug=false \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\n\n(5) TDNN CTC model for the yesno recipe from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/yesno/index.html\n      //\n  ./bin/sherpa-onnx-offline-parallel \\\n    --sample-rate=8000 \\\n    --feat-dim=23 \\\n    --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n    --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n    ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n    ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav\n\nNote: It supports decoding multiple files in batches\n\nfoo.wav should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n  std::string wav_scp = \"\";  // file path, kaldi style wav list.\n  int32_t nj = 1;            // thread number\n  int32_t batch_size = 1;    // number of wav files processed at once.\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineRecognizerConfig config;\n  config.Register(&po);\n  po.Register(\"wav-scp\", &wav_scp,\n              \"a 
file including wav-id and wav-path, kaldi style wav list. \"\n              \"Default: empty. When it is not empty, wav files given as \"\n              \"positional arguments are ignored.\");\n  po.Register(\"nj\", &nj, \"number of decoding threads, default=1\");\n  po.Register(\"batch-size\", &batch_size,\n              \"number of wav files processed at once during the decoding \"\n              \"process. default=1\");\n\n  po.Read(argc, argv);\n  if (po.NumArgs() < 1 && wav_scp.empty()) {\n    fprintf(stderr, \"Error: Please provide at least 1 wave file.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n  std::this_thread::sleep_for(std::chrono::seconds(10));  // sleep 10s\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n  sherpa_onnx::OfflineRecognizer recognizer(config);\n  const auto end = std::chrono::steady_clock::now();\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  fprintf(stderr,\n          \"Started nj: %d, batch_size: %d, wav_path: %s. 
recognizer init time: \"\n          \"%.6f\\n\",\n          nj, batch_size, wav_scp.c_str(), elapsed_seconds);\n  std::this_thread::sleep_for(std::chrono::seconds(10));  // sleep 10s\n  std::vector<std::string> wav_paths;\n  if (!wav_scp.empty()) {\n    wav_paths = LoadScpFile(wav_scp);\n  } else {\n    for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n      wav_paths.emplace_back(po.GetArg(i));\n    }\n  }\n  if (wav_paths.empty()) {\n    fprintf(stderr, \"No wav files to decode.\\n\");\n    return -1;\n  }\n  std::vector<std::thread> threads;\n  std::vector<std::vector<std::string>> batch_wav_paths =\n      SplitToBatches(wav_paths, batch_size);\n  float total_length = 0.0f;\n  float total_time = 0.0f;\n  for (int i = 0; i < nj; i++) {\n    threads.emplace_back(std::thread(AsrInference, batch_wav_paths, &recognizer,\n                                     &total_length, &total_time));\n  }\n\n  for (auto &thread : threads) {\n    thread.join();\n  }\n\n  fprintf(stderr, \"num threads: %d\\n\", config.model_config.num_threads);\n  fprintf(stderr, \"decoding method: %s\\n\", config.decoding_method.c_str());\n  if (config.decoding_method == \"modified_beam_search\") {\n    fprintf(stderr, \"max active paths: %d\\n\", config.max_active_paths);\n  }\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", total_time);\n  float rtf = total_time / total_length;\n  fprintf(stderr, \"Real time factor (RTF): %.6f / %.6f = %.4f\\n\", total_time,\n          total_length, rtf);\n  fprintf(stderr, \"SPEEDUP: %.4f\\n\", 1.0 / rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-punctuation.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-punctuation.cc\n//\n// Copyright (c)  2022-2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <chrono>\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nAdd punctuations to the input text.\n\nThe input text can contain both Chinese and English words.\n\nUsage:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\ntar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nrm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n\n./bin/sherpa-onnx-offline-punctuation \\\n  --ct-transformer=./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\n  \"你好吗how are you Fantasitic 谢谢我很好你怎么样呢\"\n\nThe output text should look like below:\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflinePunctuationConfig config;\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr,\n            \"Error: Please provide only 1 position argument containing the \"\n            \"input text.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating OfflinePunctuation ...\\n\");\n  sherpa_onnx::OfflinePunctuation punct(config);\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::string text = po.GetArg(1);\n  std::string text_with_punct = punct.AddPunctuation(text);\n  fprintf(stderr, \"Done\\n\");\n  const auto end = std::chrono::steady_clock::now();\n\n  float elapsed_seconds =\n      
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"Num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Input text: %s\\n\", text.c_str());\n  fprintf(stderr, \"Output text: \");\n  fprintf(stdout, \"%s\\n\", text_with_punct.c_str());\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-source-separation.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-source-separation.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include <stdio.h>\n\n#include <chrono>  // NOLINT\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nNon-streaming source separation with sherpa-onnx.\n\nPlease visit\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/source-separation-models\nto download models.\n\nUsage:\n\n(1) Use spleeter models\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/sherpa-onnx-spleeter-2stems-fp16.tar.bz2\ntar xvf sherpa-onnx-spleeter-2stems-fp16.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/audio_example.wav\n\n./bin/sherpa-onnx-offline-source-separation \\\n  --spleeter-vocals=sherpa-onnx-spleeter-2stems-fp16/vocals.fp16.onnx \\\n  --spleeter-accompaniment=sherpa-onnx-spleeter-2stems-fp16/accompaniment.fp16.onnx \\\n  --input-wav=audio_example.wav \\\n  --output-vocals-wav=output_vocals.wav \\\n  --output-accompaniment-wav=output_accompaniment.wav\n\n(2) Use UVR models\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/UVR_MDXNET_1_9703.onnx\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/source-separation-models/audio_example.wav\n\n./bin/sherpa-onnx-offline-source-separation \\\n  --uvr-model=./UVR_MDXNET_1_9703.onnx \\\n  --input-wav=audio_example.wav \\\n  --output-vocals-wav=output_vocals.wav \\\n  --output-accompaniment-wav=output_accompaniment.wav\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineSourceSeparationConfig config;\n\n  std::string input_wave;\n  std::string output_vocals_wave;\n  std::string output_accompaniment_wave;\n\n  config.Register(&po);\n  
po.Register(\"input-wav\", &input_wave, \"Path to input wav.\");\n  po.Register(\"output-vocals-wav\", &output_vocals_wave,\n              \"Path to output vocals wav\");\n  po.Register(\"output-accompaniment-wav\", &output_accompaniment_wave,\n              \"Path to output accompaniment wav\");\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    fprintf(stderr, \"Please don't give positional arguments\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (input_wave.empty()) {\n    fprintf(stderr, \"Please provide --input-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (output_vocals_wave.empty()) {\n    fprintf(stderr, \"Please provide --output-vocals-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (output_accompaniment_wave.empty()) {\n    fprintf(stderr, \"Please provide --output-accompaniment-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  bool is_ok = false;\n  sherpa_onnx::OfflineSourceSeparationInput input;\n  input.samples.data =\n      sherpa_onnx::ReadWaveMultiChannel(input_wave, &input.sample_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", input_wave.c_str());\n    return -1;\n  }\n\n  fprintf(stderr, \"Started\\n\");\n\n  sherpa_onnx::OfflineSourceSeparation sp(config);\n\n  const auto begin = std::chrono::steady_clock::now();\n  auto output = sp.Process(input);\n  const auto end = std::chrono::steady_clock::now();\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  is_ok = sherpa_onnx::WriteWave(\n      output_vocals_wave, output.sample_rate, output.stems[0].data[0].data(),\n      output.stems[0].data[1].data(), output.stems[0].data[0].size());\n\n  if (!is_ok) {\n    
fprintf(stderr, \"Failed to write to '%s'\\n\", output_vocals_wave.c_str());\n    exit(EXIT_FAILURE);\n  }\n\n  is_ok = sherpa_onnx::WriteWave(output_accompaniment_wave, output.sample_rate,\n                                 output.stems[1].data[0].data(),\n                                 output.stems[1].data[1].data(),\n                                 output.stems[1].data[0].size());\n\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to write to '%s'\\n\",\n            output_accompaniment_wave.c_str());\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"Done\\n\");\n  fprintf(stderr, \"Saved to '%s' and '%s'\\n\",\n          output_vocals_wave.c_str(), output_accompaniment_wave.c_str());\n\n  float duration =\n      input.samples.data[0].size() / static_cast<float>(input.sample_rate);\n  fprintf(stderr, \"num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-speaker-diarization.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <cstdio>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nstatic int32_t ProgressCallback(int32_t processed_chunks, int32_t num_chunks,\n                                void *) {\n  float progress = 100.0 * processed_chunks / num_chunks;\n  fprintf(stderr, \"progress %.2f%%\\n\", progress);\n\n  // the return value is currently ignored\n  return 0;\n}\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nOffline/Non-streaming speaker diarization with sherpa-onnx\nUsage example:\n\nStep 1: Download a speaker segmentation model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n\nStep 2: Download a speaker embedding extractor model\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nfor a list of available models. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\n\nStep 3. Download test wave files\n\nPlease visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nfor a list of available test wave files. The following is an example\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\n\nStep 4. Build sherpa-onnx\n\nStep 5. 
Run it\n\n  ./bin/sherpa-onnx-offline-speaker-diarization \\\n    --clustering.num-clusters=4 \\\n    --segmentation.pyannote-model=./sherpa-onnx-pyannote-segmentation-3-0/model.onnx \\\n    --embedding.model=./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx \\\n    ./0-four-speakers-zh.wav\n\nSince we know that there are four speakers in the test wave file, we use\n--clustering.num-clusters=4 in the above example.\n\nIf we don't know the number of speakers in the given wave file, we can use\nthe argument --clustering.cluster-threshold. The following is an example:\n\n  ./bin/sherpa-onnx-offline-speaker-diarization \\\n    --clustering.cluster-threshold=0.90 \\\n    --segmentation.pyannote-model=./sherpa-onnx-pyannote-segmentation-3-0/model.onnx \\\n    --embedding.model=./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx \\\n    ./0-four-speakers-zh.wav\n\nA larger threshold leads to fewer clusters, i.e., fewer speakers;\na smaller threshold leads to more clusters, i.e., more speakers.\n  )usage\";\n  sherpa_onnx::OfflineSpeakerDiarizationConfig config;\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  config.Register(&po);\n  po.Read(argc, argv);\n\n  std::cout << config.ToString() << \"\\n\";\n\n  if (!config.Validate()) {\n    po.PrintUsage();\n    std::cerr << \"Errors in config!\\n\";\n    return -1;\n  }\n\n  if (po.NumArgs() != 1) {\n    std::cerr << \"Error: Please provide exactly 1 wave file.\\n\\n\";\n    po.PrintUsage();\n    return -1;\n  }\n\n  sherpa_onnx::OfflineSpeakerDiarization sd(config);\n\n  std::cout << \"Started\\n\";\n  const auto begin = std::chrono::steady_clock::now();\n  const std::string wav_filename = po.GetArg(1);\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n  const std::vector<float> samples =\n      sherpa_onnx::ReadWave(wav_filename, &sample_rate, &is_ok);\n  if (!is_ok) {\n    std::cerr << \"Failed to read \" << wav_filename.c_str() << \"\\n\";\n    return -1;\n  }\n\n  if (sample_rate != sd.SampleRate()) {\n  
  std::cerr << \"Expect sample rate \" << sd.SampleRate()\n              << \". Given: \" << sample_rate << \"\\n\";\n    return -1;\n  }\n\n  float duration = samples.size() / static_cast<float>(sample_rate);\n\n  auto result =\n      sd.Process(samples.data(), samples.size(), ProgressCallback, nullptr)\n          .SortByStartTime();\n\n  for (const auto &r : result) {\n    std::cout << r.ToString() << \"\\n\";\n  }\n\n  const auto end = std::chrono::steady_clock::now();\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"Duration : %.3f s\\n\", duration);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-tts-play-alsa.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-tts-play-alsa.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n// see https://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html\n// https://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m___h_w___params.html\n// https://www.alsa-project.org/alsa-doc/alsa-lib/group___p_c_m.html\n\n#include <signal.h>\n\n#include <algorithm>\n#include <chrono>              // NOLINT\n#include <condition_variable>  // NOLINT\n#include <cstdio>\n#include <fstream>\n#include <mutex>  // NOLINT\n#include <queue>\n#include <string>\n#include <thread>  // NOLINT\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa-play.h\"\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nstatic std::condition_variable g_cv;\nstatic std::mutex g_cv_m;\n\nstruct Buffer {\n  std::queue<std::vector<float>> samples;\n  std::mutex mutex;\n};\n\nstatic Buffer g_buffer;\n\nstatic bool g_stopped = false;\nstatic bool g_killed = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  if (g_killed) {\n    exit(0);\n  }\n\n  g_killed = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting\\n\");\n}\n\nstatic int32_t AudioGeneratedCallback(const float *s, int32_t n,\n                                      float /*progress*/) {\n  if (n > 0) {\n    std::lock_guard<std::mutex> lock(g_buffer.mutex);\n    g_buffer.samples.push({s, s + n});\n    g_cv.notify_all();\n  }\n\n  if (g_killed) {\n    return 0;  // stop generating\n  }\n\n  // continue generating\n  return 1;\n}\n\nstatic void StartPlayback(const std::string &device_name, int32_t sample_rate) {\n  sherpa_onnx::AlsaPlay alsa(device_name.c_str(), sample_rate);\n\n  std::unique_lock<std::mutex> lock(g_cv_m);\n  while (!g_killed && !g_stopped) {\n    while (!g_buffer.samples.empty()) {\n      auto &p = g_buffer.samples.front();\n      alsa.Play(p);\n      g_buffer.samples.pop();\n    }\n\n    g_cv.wait(lock);\n  }\n\n  if (g_killed) {\n    return;\n  }\n\n  if (g_stopped) {\n    while (!g_buffer.samples.empty()) {\n      auto &p = g_buffer.samples.front();\n      alsa.Play(p);\n      g_buffer.samples.pop();\n    }\n  }\n\n  alsa.Drain();\n}\n\nint main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nOffline text-to-speech with sherpa-onnx.\n\nIt plays the generated audio as the model is processing.\n\nNote that it is alsa so it works only on **Linux**. For instance, you can\nuse it on Raspberry Pi.\n\nUsage examples:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play-alsa \\\n --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n --output-filename=./generated.wav \\\n \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nPocket TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play-alsa \\\n --pocket-lm-flow=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx \\\n --pocket-lm-main=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx \\\n --pocket-encoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx \\\n --pocket-decoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx \\\n --pocket-text-conditioner=./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx \\\n --pocket-vocab-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json \\\n --pocket-token-scores-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json \\\n --reference-audio=./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n \"Hello from Pocket TTS\"\n\nSupertonic TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play-alsa \\\n --supertonic-duration-predictor=./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx \\\n --supertonic-text-encoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx \\\n --supertonic-vector-estimator=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx \\\n --supertonic-vocoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx \\\n --supertonic-tts-json=./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json \\\n --supertonic-unicode-indexer=./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin \\\n --supertonic-voice-style=./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin \\\n 
--lang=en \\\n \"Hello from Supertonic TTS\"\n\nZipVoice TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n./bin/sherpa-onnx-offline-tts-play-alsa \\\n --zipvoice-encoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx \\\n --zipvoice-decoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx \\\n --zipvoice-data-dir=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data \\\n --zipvoice-lexicon=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt \\\n --zipvoice-tokens=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt \\\n --zipvoice-vocoder=./vocos_24khz.onnx \\\n --reference-audio=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n --reference-text=\"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n --num-steps=4 \\\n \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\nIt will optionally save audio to --output-filename and play it while generating.\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/index.html\nor details.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  std::string device_name = \"default\";\n  std::string output_filename = \"./generated.wav\";\n  int32_t sid = 0;\n\n  std::string reference_audio;\n  po.Register(\n      \"reference-audio\", &reference_audio,\n      \"Path to reference audio. Required by Pocket TTS and ZipVoice TTS.\");\n\n  std::string reference_text;\n  po.Register(\n      \"reference-text\", &reference_text,\n      \"Reference text for the reference audio. 
Required by ZipVoice TTS.\");\n\n  sherpa_onnx::GenerationConfig gen_config;\n  std::string lang;\n\n  po.Register(\"output-filename\", &output_filename,\n              \"Path to save the generated audio\");\n\n  po.Register(\n      \"num-steps\", &gen_config.num_steps,\n      \"Used by some models, e.g., Pocket TTS and ZipVoice. Number of flow \"\n      \"matching steps.\");\n\n  po.Register(\"device-name\", &device_name,\n              \"Name of the device to play the generated audio\");\n\n  po.Register(\"lang\", &lang,\n              \"Language for text: en, ko, es, pt, fr. Used only by \"\n              \"Supertonic TTS.\");\n\n  po.Register(\"sid\", &sid,\n              \"Speaker ID. Used only for multi-speaker models, e.g., models \"\n              \"trained using the VCTK dataset. Not used for single-speaker \"\n              \"models, e.g., models trained using the LJSpeech dataset\");\n\n  po.Register(\"speed\", &gen_config.speed,\n              \"Speech speed. Larger=faster. Used by Supertonic, VITS, etc. \"\n              \"(float, default = 1.0)\");\n\n  sherpa_onnx::OfflineTtsConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n\n  if (po.NumArgs() == 0) {\n    fprintf(stderr, \"Error: Please provide the text to generate audio.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (po.NumArgs() > 1) {\n    fprintf(stderr,\n            \"Error: Accept only one positional argument. 
Please use single \"\n            \"quotes to wrap your text\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  if (config.max_num_sentences != 1) {\n    fprintf(stderr, \"Setting config.max_num_sentences to 1\\n\");\n    config.max_num_sentences = 1;\n  }\n\n  fprintf(stderr, \"Loading the model\\n\");\n  sherpa_onnx::OfflineTts tts(config);\n\n  fprintf(stderr, \"Start the playback thread\\n\");\n  std::thread playback_thread(StartPlayback, device_name, tts.SampleRate());\n\n  fprintf(stderr, \"Generating ...\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  sherpa_onnx::GeneratedAudio audio;\n\n  bool is_pocket_tts = !config.model.pocket.lm_flow.empty();\n  bool is_supertonic_tts = !config.model.supertonic.tts_json.empty();\n  bool is_zipvoice_tts = !config.model.zipvoice.encoder.empty() &&\n                         !config.model.zipvoice.decoder.empty();\n\n  gen_config.sid = sid;\n\n  if (is_supertonic_tts && !lang.empty()) {\n    gen_config.extra[\"lang\"] = lang;\n  }\n\n  if (is_pocket_tts || is_zipvoice_tts) {\n    if (reference_audio.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-audio for this TTS model\\n\");\n      exit(EXIT_FAILURE);\n    }\n\n    int32_t sample_rate;\n    bool is_ok = false;\n    auto samples =\n        sherpa_onnx::ReadWave(reference_audio, &sample_rate, &is_ok);\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", reference_audio.c_str());\n      exit(EXIT_FAILURE);\n    }\n\n    gen_config.reference_audio = std::move(samples);\n    gen_config.reference_sample_rate = sample_rate;\n  }\n\n  if (is_zipvoice_tts) {\n    if (reference_text.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-text for ZipVoice TTS\\n\");\n      exit(EXIT_FAILURE);\n    }\n    gen_config.reference_text = reference_text;\n  }\n\n  audio = 
tts.Generate(po.GetArg(1), gen_config, AudioGeneratedCallback);\n\n  const auto end = std::chrono::steady_clock::now();\n  g_stopped = true;\n  g_cv.notify_all();\n  fprintf(stderr, \"Generating done!\\n\");\n  if (audio.samples.empty()) {\n    fprintf(\n        stderr,\n        \"Error in generating audio. Please read previous error messages.\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = audio.samples.size() / static_cast<float>(audio.sample_rate);\n\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Audio duration: %.3f s\\n\", duration);\n  fprintf(stderr, \"Real-time factor (RTF): %.3f/%.3f = %.3f\\n\", elapsed_seconds,\n          duration, rtf);\n\n  bool ok = sherpa_onnx::WriteWave(output_filename, audio.sample_rate,\n                                   audio.samples.data(), audio.samples.size());\n  if (!ok) {\n    fprintf(stderr, \"Failed to write wave to %s\\n\", output_filename.c_str());\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"The text is: %s. Speaker ID: %d\\n\\n\", po.GetArg(1).c_str(),\n          sid);\n  fprintf(stderr, \"\\n**** Saved to %s successfully! ****\\n\",\n          output_filename.c_str());\n\n  fprintf(stderr, \"\\n\");\n  fprintf(\n      stderr,\n      \"Wait for the playback to finish. You can safely press ctrl + C to stop \"\n      \"the playback.\\n\");\n  playback_thread.join();\n\n  fprintf(stderr, \"Done!\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-tts-play.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-tts-play.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include <signal.h>\n\n#include <algorithm>\n#include <chrono>\n#include <condition_variable>\n#include <cstdio>\n#include <fstream>\n#include <mutex>\n#include <queue>\n#include <string>\n#include <thread>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nstatic std::condition_variable g_cv;\nstatic std::mutex g_cv_m;\n\nstruct Samples {\n  std::vector<float> data;\n  int32_t consumed = 0;\n};\n\nstruct Buffer {\n  std::queue<Samples> samples;\n  std::mutex mutex;\n};\n\nstatic Buffer g_buffer;\n\nstatic bool g_started = false;\nstatic bool g_stopped = false;\nstatic bool g_killed = false;\n\nstatic void Handler(int32_t /*sig*/) {\n  if (g_killed) {\n    exit(0);\n  }\n\n  g_killed = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting\\n\");\n}\n\nstatic int32_t AudioGeneratedCallback(const float *s, int32_t n,\n                                      float /*progress*/) {\n  if (n > 0) {\n    Samples samples;\n    samples.data = std::vector<float>{s, s + n};\n\n    std::lock_guard<std::mutex> lock(g_buffer.mutex);\n    g_buffer.samples.push(std::move(samples));\n    g_started = true;\n  }\n  if (g_killed) {\n    return 0;  // stop generating\n  }\n\n  // continue generating\n  return 1;\n}\n\nstatic int PlayCallback(const void * /*in*/, void *out,\n                        unsigned long n,  // NOLINT\n                        const PaStreamCallbackTimeInfo * /*time_info*/,\n                        PaStreamCallbackFlags /*status_flags*/,\n                        void * /*user_data*/) {\n  if (g_killed) {\n    return paComplete;\n  }\n\n  float *pout = reinterpret_cast<float *>(out);\n  std::lock_guard<std::mutex> lock(g_buffer.mutex);\n\n  if (g_buffer.samples.empty()) {\n    if (g_stopped) {\n      // no more data is available and we have processed all of the samples\n      return paComplete;\n    }\n\n    // The current sentence is so long, though very unlikely, that\n    // the model has not finished processing it yet.\n    std::fill_n(pout, n, 0);\n\n    return paContinue;\n  }\n\n  int32_t k = 0;\n  for (; k < static_cast<int32_t>(n) && !g_buffer.samples.empty();) {\n    int32_t this_block = n - k;\n\n    auto &p = g_buffer.samples.front();\n\n    int32_t remaining = p.data.size() - p.consumed;\n\n    if (this_block <= remaining) {\n      std::copy(p.data.begin() + p.consumed,\n                p.data.begin() + p.consumed + this_block, pout + k);\n      p.consumed += this_block;\n\n      k = n;\n\n      if (p.consumed == static_cast<int32_t>(p.data.size())) {\n        g_buffer.samples.pop();\n      }\n      break;\n    }\n\n    std::copy(p.data.begin() + p.consumed, p.data.end(), pout + k);\n    k += p.data.size() - p.consumed;\n    g_buffer.samples.pop();\n  }\n\n  if (k < 
static_cast<int32_t>(n)) {\n    std::fill_n(pout + k, n - k, 0);\n  }\n\n  if (g_stopped && g_buffer.samples.empty()) {\n    return paComplete;\n  }\n\n  return paContinue;\n}\n\nstatic void PlayCallbackFinished(void * /*userData*/) { g_cv.notify_all(); }\n\nstatic void StartPlayback(int32_t sample_rate) {\n  int32_t frames_per_buffer = 1024;\n  PaStreamParameters outputParameters;\n  PaStream *stream;\n  PaError err;\n\n  outputParameters.device =\n      Pa_GetDefaultOutputDevice(); /* default output device */\n\n  outputParameters.channelCount = 1;         /* mono output */\n  outputParameters.sampleFormat = paFloat32; /* 32 bit floating point output */\n  outputParameters.suggestedLatency =\n      Pa_GetDeviceInfo(outputParameters.device)->defaultLowOutputLatency;\n  outputParameters.hostApiSpecificStreamInfo = nullptr;\n\n  err = Pa_OpenStream(&stream, nullptr, /* no input */\n                      &outputParameters, sample_rate, frames_per_buffer,\n                      paClipOff,  // we won't output out of range samples so\n                                  //   don't bother clipping them\n                      PlayCallback, nullptr);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  err = Pa_SetStreamFinishedCallback(stream, &PlayCallbackFinished);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  err = Pa_StartStream(stream);\n  if (err != paNoError) {\n    fprintf(stderr, \"%d portaudio error: %s\\n\", __LINE__, Pa_GetErrorText(err));\n    return;\n  }\n\n  std::unique_lock<std::mutex> lock(g_cv_m);\n  while (!g_killed && !g_stopped &&\n         (!g_started || (g_started && !g_buffer.samples.empty()))) {\n    g_cv.wait(lock);\n  }\n\n  err = Pa_StopStream(stream);\n  if (err != paNoError) {\n    return;\n  }\n\n  err = Pa_CloseStream(stream);\n  if (err != paNoError) {\n    return;\n  
}\n}\n\nint main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nOffline text-to-speech with sherpa-onnx.\n\nIt plays the generated audio as the model is processing.\n\nUsage examples:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play \\\n --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n --output-filename=./generated.wav \\\n  \"Today as always, men fall into two groups: slaves and free men. Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nPocket TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play \\\n --pocket-lm-flow=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx \\\n --pocket-lm-main=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx \\\n --pocket-encoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx \\\n --pocket-decoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx \\\n --pocket-text-conditioner=./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx \\\n --pocket-vocab-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json \\\n --pocket-token-scores-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json \\\n --reference-audio=./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n \"Hello from Pocket TTS\"\n\nSupertonic TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf 
sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n./bin/sherpa-onnx-offline-tts-play \\\n --supertonic-duration-predictor=./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx \\\n --supertonic-text-encoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx \\\n --supertonic-vector-estimator=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx \\\n --supertonic-vocoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx \\\n --supertonic-tts-json=./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json \\\n --supertonic-unicode-indexer=./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin \\\n --supertonic-voice-style=./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin \\\n --lang=en \\\n \"Hello from Supertonic TTS\"\n\nZipVoice TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n./bin/sherpa-onnx-offline-tts-play \\\n --zipvoice-encoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx \\\n --zipvoice-decoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx \\\n --zipvoice-data-dir=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data \\\n --zipvoice-lexicon=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt \\\n --zipvoice-tokens=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt \\\n --zipvoice-vocoder=./vocos_24khz.onnx \\\n --reference-audio=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n --reference-text=\"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n --num-steps=4 \\\n \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 
热爱, 就是全心投入并享受其中.\"\n\nIt will optionally save audio to --output-filename and play it while generating.\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/index.html\nfor details.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  std::string output_filename = \"./generated.wav\";\n  int32_t sid = 0;\n\n  std::string reference_audio;\n  po.Register(\n      \"reference-audio\", &reference_audio,\n      \"Path to reference audio. Required by Pocket TTS and ZipVoice TTS.\");\n\n  std::string reference_text;\n  po.Register(\n      \"reference-text\", &reference_text,\n      \"Reference text for the reference audio. Required by ZipVoice TTS.\");\n\n  sherpa_onnx::GenerationConfig gen_config;\n  std::string lang;\n\n  po.Register(\"output-filename\", &output_filename,\n              \"Path to save the generated audio\");\n\n  po.Register(\n      \"num-steps\", &gen_config.num_steps,\n      \"Used by some models, e.g., Pocket TTS and ZipVoice. Number of flow \"\n      \"matching steps.\");\n\n  po.Register(\"lang\", &lang,\n              \"Language for text: en, ko, es, pt, fr. Used only by \"\n              \"Supertonic TTS.\");\n\n  po.Register(\"sid\", &sid,\n              \"Speaker ID. Used only for multi-speaker models, e.g., models \"\n              \"trained using the VCTK dataset. Not used for single-speaker \"\n              \"models, e.g., models trained using the LJSpeech dataset\");\n\n  po.Register(\"speed\", &gen_config.speed,\n              \"Speech speed. Larger=faster. Used by Supertonic, VITS, etc. 
\"\n              \"(float, default = 1.0)\");\n\n  sherpa_onnx::OfflineTtsConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n\n  if (po.NumArgs() == 0) {\n    fprintf(stderr, \"Error: Please provide the text to generate audio.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (po.NumArgs() > 1) {\n    fprintf(stderr,\n            \"Error: Accept only one positional argument. Please use single \"\n            \"quotes to wrap your text\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  sherpa_onnx::Microphone mic;\n\n  PaDeviceIndex num_devices = Pa_GetDeviceCount();\n  fprintf(stderr, \"Num devices: %d\\n\", num_devices);\n\n  PaStreamParameters param;\n\n  param.device = Pa_GetDefaultOutputDevice();\n  if (param.device == paNoDevice) {\n    fprintf(stderr, \"No default output device found\\n\");\n    exit(EXIT_FAILURE);\n  }\n  fprintf(stderr, \"Use default device: %d\\n\", param.device);\n\n  const PaDeviceInfo *info = Pa_GetDeviceInfo(param.device);\n  fprintf(stderr, \"  Name: %s\\n\", info->name);\n  fprintf(stderr, \"  Max output channels: %d\\n\", info->maxOutputChannels);\n\n  if (config.max_num_sentences != 1) {\n    fprintf(stderr, \"Setting config.max_num_sentences to 1\\n\");\n    config.max_num_sentences = 1;\n  }\n\n  fprintf(stderr, \"Loading the model\\n\");\n  sherpa_onnx::OfflineTts tts(config);\n\n  fprintf(stderr, \"Start the playback thread\\n\");\n  std::thread playback_thread(StartPlayback, tts.SampleRate());\n\n  fprintf(stderr, \"Generating ...\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  sherpa_onnx::GeneratedAudio audio;\n\n  bool is_pocket_tts = !config.model.pocket.lm_flow.empty();\n  bool is_supertonic_tts = !config.model.supertonic.tts_json.empty();\n  bool is_zipvoice_tts = !config.model.zipvoice.encoder.empty() &&\n                         
!config.model.zipvoice.decoder.empty();\n\n  gen_config.sid = sid;\n\n  if (is_supertonic_tts && !lang.empty()) {\n    gen_config.extra[\"lang\"] = lang;\n  }\n\n  if (is_pocket_tts || is_zipvoice_tts) {\n    if (reference_audio.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-audio for this TTS model\\n\");\n      exit(EXIT_FAILURE);\n    }\n\n    int32_t sample_rate;\n    bool is_ok = false;\n    auto samples =\n        sherpa_onnx::ReadWave(reference_audio, &sample_rate, &is_ok);\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", reference_audio.c_str());\n      exit(EXIT_FAILURE);\n    }\n\n    gen_config.reference_audio = std::move(samples);\n    gen_config.reference_sample_rate = sample_rate;\n  }\n\n  if (is_zipvoice_tts) {\n    if (reference_text.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-text for ZipVoice TTS\\n\");\n      exit(EXIT_FAILURE);\n    }\n    gen_config.reference_text = reference_text;\n  }\n\n  audio = tts.Generate(po.GetArg(1), gen_config, AudioGeneratedCallback);\n\n  const auto end = std::chrono::steady_clock::now();\n  g_stopped = true;\n  fprintf(stderr, \"Generating done!\\n\");\n  if (audio.samples.empty()) {\n    fprintf(\n        stderr,\n        \"Error in generating audio. 
Please read previous error messages.\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = audio.samples.size() / static_cast<float>(audio.sample_rate);\n\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Audio duration: %.3f s\\n\", duration);\n  fprintf(stderr, \"Real-time factor (RTF): %.3f/%.3f = %.3f\\n\", elapsed_seconds,\n          duration, rtf);\n\n  bool ok = sherpa_onnx::WriteWave(output_filename, audio.sample_rate,\n                                   audio.samples.data(), audio.samples.size());\n  if (!ok) {\n    fprintf(stderr, \"Failed to write wave to %s\\n\", output_filename.c_str());\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"The text is: %s. Speaker ID: %d\\n\\n\", po.GetArg(1).c_str(),\n          sid);\n  fprintf(stderr, \"\\n**** Saved to %s successfully! ****\\n\",\n          output_filename.c_str());\n\n  fprintf(stderr, \"\\n\");\n  fprintf(\n      stderr,\n      \"Wait for the playback to finish. You can safely press ctrl + C to stop \"\n      \"the playback.\\n\");\n  playback_thread.join();\n\n  fprintf(stderr, \"Done!\\n\");\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline-tts.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline-tts.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include <chrono>  // NOLINT\n#include <cstdio>\n#include <fstream>\n#include <string>\n#include <utility>\n\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nstatic int32_t AudioCallback(const float * /*samples*/, int32_t n,\n                             float progress) {\n  printf(\"sample=%d, progress=%f\\n\", n, progress);\n  return 1;\n}\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nOffline/Non-streaming text-to-speech with sherpa-onnx\n\nUsage examples:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\ntar xf vits-piper-en_US-amy-low.tar.bz2\n\n./bin/sherpa-onnx-offline-tts \\\n --vits-model=./vits-piper-en_US-amy-low/en_US-amy-low.onnx \\\n --vits-tokens=./vits-piper-en_US-amy-low/tokens.txt \\\n --vits-data-dir=./vits-piper-en_US-amy-low/espeak-ng-data \\\n --output-filename=./generated.wav \\\n  \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.\"\n\nPocket TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\n./bin/sherpa-onnx-offline-tts \\\n --pocket-lm-flow=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx \\\n --pocket-lm-main=./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx \\\n --pocket-encoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx \\\n --pocket-decoder=./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx \\\n --pocket-text-conditioner=./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx \\\n --pocket-vocab-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json \\\n --pocket-token-scores-json=./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json \\\n --reference-audio=./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav \\\n --output-filename=./generated-pocket.wav \\\n \"Hello from Pocket TTS\"\n\nSupertonic TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\ntar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n\n./bin/sherpa-onnx-offline-tts \\\n --supertonic-duration-predictor=./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx \\\n --supertonic-text-encoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx \\\n --supertonic-vector-estimator=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx \\\n --supertonic-vocoder=./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx \\\n --supertonic-tts-json=./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json \\\n --supertonic-unicode-indexer=./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin \\\n 
--supertonic-voice-style=./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin \\\n --lang=en \\\n --output-filename=./generated-supertonic.wav \\\n \"Hello from Supertonic TTS\"\n\nZipVoice TTS:\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n\n./bin/sherpa-onnx-offline-tts \\\n --zipvoice-encoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx \\\n --zipvoice-decoder=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx \\\n --zipvoice-data-dir=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data \\\n --zipvoice-lexicon=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt \\\n --zipvoice-tokens=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt \\\n --zipvoice-vocoder=./vocos_24khz.onnx \\\n --reference-audio=./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav \\\n --reference-text=\"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\" \\\n --num-steps=4 \\\n --output-filename=./generated-zipvoice.wav \\\n \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\nIt will generate a file specified by --output-filename.\n\nYou can find more models at\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\n\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/tts/index.html\nfor details.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  std::string output_filename = \"./generated.wav\";\n  int32_t sid = 0;\n\n  std::string reference_audio;\n  po.Register(\n      \"reference-audio\", &reference_audio,\n      \"Path to reference audio. Required by Pocket TTS and ZipVoice TTS.\");\n\n  std::string reference_text;\n  po.Register(\n      \"reference-text\", &reference_text,\n      \"Reference text for the reference audio. 
Required by ZipVoice TTS.\");\n\n  sherpa_onnx::GenerationConfig gen_config;\n\n  std::string lang;\n\n  po.Register(\n      \"num-steps\", &gen_config.num_steps,\n      \"Used by some models, e.g., Pocket TTS and ZipVoice. Number of flow \"\n      \"matching steps.\");\n\n  po.Register(\"output-filename\", &output_filename,\n              \"Path to save the generated audio\");\n\n  po.Register(\"lang\", &lang,\n              \"Language for text: en, ko, es, pt, fr. Used only by \"\n              \"Supertonic TTS.\");\n\n  po.Register(\"sid\", &sid,\n              \"Speaker ID. Used only for multi-speaker models, e.g., models \"\n              \"trained using the VCTK dataset. Not used for single-speaker \"\n              \"models, e.g., models trained using the LJSpeech dataset\");\n\n  po.Register(\"speed\", &gen_config.speed,\n              \"Speech speed. Larger=faster. Used by Supertonic, VITS, etc. \"\n              \"(float, default = 1.0)\");\n\n  sherpa_onnx::OfflineTtsConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n\n  if (po.NumArgs() == 0) {\n    fprintf(stderr, \"Error: Please provide the text to generate audio.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (po.NumArgs() > 1) {\n    fprintf(stderr,\n            \"Error: Accept only one positional argument. 
Please use single \"\n            \"quotes to wrap your text.\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (config.model.debug) {\n    fprintf(stderr, \"%s\\n\", config.model.ToString().c_str());\n  }\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  sherpa_onnx::OfflineTts tts(config);\n\n  const auto begin = std::chrono::steady_clock::now();\n  sherpa_onnx::GeneratedAudio audio;\n\n  bool is_pocket_tts = !config.model.pocket.lm_flow.empty();\n  bool is_supertonic_tts = !config.model.supertonic.tts_json.empty();\n  bool is_zipvoice_tts = !config.model.zipvoice.encoder.empty() &&\n                         !config.model.zipvoice.decoder.empty();\n\n  gen_config.sid = sid;\n\n  if (is_supertonic_tts && !lang.empty()) {\n    gen_config.extra[\"lang\"] = lang;\n  }\n\n  if (is_pocket_tts || is_zipvoice_tts) {\n    if (reference_audio.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-audio for this TTS model\\n\");\n      exit(EXIT_FAILURE);\n    }\n\n    int32_t sample_rate;\n    bool is_ok = false;\n    auto samples =\n        sherpa_onnx::ReadWave(reference_audio, &sample_rate, &is_ok);\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", reference_audio.c_str());\n      exit(EXIT_FAILURE);\n    }\n\n    gen_config.reference_audio = std::move(samples);\n    gen_config.reference_sample_rate = sample_rate;\n  }\n\n  if (is_zipvoice_tts) {\n    if (reference_text.empty()) {\n      fprintf(stderr,\n              \"You need to provide --reference-text for ZipVoice TTS\\n\");\n      exit(EXIT_FAILURE);\n    }\n    gen_config.reference_text = reference_text;\n  }\n\n  audio = tts.Generate(po.GetArg(1), gen_config, AudioCallback);\n\n  const auto end = std::chrono::steady_clock::now();\n\n  if (audio.samples.empty()) {\n    fprintf(\n        stderr,\n        \"Error in generating audio. 
Please read previous error messages.\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n  float duration = audio.samples.size() / static_cast<float>(audio.sample_rate);\n\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Number of threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Audio duration: %.3f s\\n\", duration);\n  fprintf(stderr, \"Real-time factor (RTF): %.3f/%.3f = %.3f\\n\", elapsed_seconds,\n          duration, rtf);\n\n  bool ok = sherpa_onnx::WriteWave(output_filename, audio.sample_rate,\n                                   audio.samples.data(), audio.samples.size());\n  if (!ok) {\n    fprintf(stderr, \"Failed to write wave to %s\\n\", output_filename.c_str());\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"The text is: %s. Speaker ID: %d\\n\", po.GetArg(1).c_str(),\n          sid);\n  fprintf(stderr, \"Saved to %s successfully!\\n\", output_filename.c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-offline.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-offline.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <chrono>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nSpeech recognition using non-streaming models with sherpa-onnx.\n\nUsage:\n\n(1) Transducer from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\n\n  ./bin/sherpa-onnx-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n\n(2) Paraformer from FunASR\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\n\n  ./bin/sherpa-onnx-offline \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(3) Moonshine models\n\nSee https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\n\n  ./bin/sherpa-onnx-offline \\\n    --moonshine-preprocessor=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/preprocess.onnx \\\n    --moonshine-encoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/encode.int8.onnx \\\n    --moonshine-uncached-decoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/uncached_decode.int8.onnx \\\n    --moonshine-cached-decoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/cached_decode.int8.onnx \\\n    --tokens=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/tokens.txt \\\n    --num-threads=1 \\\n    /path/to/foo.wav [bar.wav 
foobar.wav ...]\n\n(4) Whisper models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n\n  ./bin/sherpa-onnx-offline \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1 \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(5) NeMo CTC models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\n\n  ./bin/sherpa-onnx-offline \\\n    --tokens=./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt \\\n    --nemo-ctc-model=./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    --debug=false \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/1.wav \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/8k.wav\n\n(6) TDNN CTC model for the yesno recipe from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/yesno/index.html\n\n  ./bin/sherpa-onnx-offline \\\n    --sample-rate=8000 \\\n    --feat-dim=23 \\\n    --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n    --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n    ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav \\\n    ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_1_0_0_0_1_0.wav\n\n(7) FunASR-nano models\n\nSee https://github.com/FunAudioLLM/Fun-ASR-Nano-2512\n\n  ./bin/sherpa-onnx-offline \\\n    --funasr-nano-encoder-adaptor=/path/to/encoder_adaptor.onnx \\\n    --funasr-nano-llm=/path/to/llm.onnx \\\n    --funasr-nano-tokenizer=/path/to/Qwen3-0.6B \\\n    --funasr-nano-embedding=/path/to/embedding.onnx \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\nNote: It supports decoding multiple files in batches\n\nfoo.wav should be a 
single-channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineRecognizerConfig config;\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() < 1) {\n    fprintf(stderr, \"Error: Please provide at least 1 wave file.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  const auto begin_init = std::chrono::steady_clock::now();\n\n  sherpa_onnx::OfflineRecognizer recognizer(config);\n\n  const auto end_init = std::chrono::steady_clock::now();\n  float elapsed_seconds_init =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end_init -\n                                                            begin_init)\n          .count() /\n      1000.;\n  fprintf(stderr, \"recognizer created in %.3f s\\n\", elapsed_seconds_init);\n\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::vector<std::unique_ptr<sherpa_onnx::OfflineStream>> ss;\n  std::vector<sherpa_onnx::OfflineStream *> ss_pointers;\n  float duration = 0;\n  for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n    std::string wav_filename = po.GetArg(i);\n    int32_t sampling_rate = -1;\n    bool is_ok = false;\n    std::vector<float> samples =\n        sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n      return -1;\n    }\n    duration += samples.size() / static_cast<float>(sampling_rate);\n\n    auto s = recognizer.CreateStream();\n    
s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n    ss.push_back(std::move(s));\n    ss_pointers.push_back(ss.back().get());\n  }\n\n  recognizer.DecodeStreams(ss_pointers.data(), ss_pointers.size());\n\n  const auto end = std::chrono::steady_clock::now();\n\n  fprintf(stderr, \"Done!\\n\\n\");\n  for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n    fprintf(stderr, \"%s\\n\", po.GetArg(i).c_str());\n    fprintf(stdout, \"%s\\n\", ss[i - 1]->GetResult().AsJsonString().c_str());\n    fprintf(stderr, \"----\\n\");\n  }\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"num threads: %d\\n\", config.model_config.num_threads);\n  fprintf(stderr, \"decoding method: %s\\n\", config.decoding_method.c_str());\n  if (config.decoding_method == \"modified_beam_search\") {\n    fprintf(stderr, \"max active paths: %d\\n\", config.max_active_paths);\n  }\n\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-online-denoiser.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-online-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <algorithm>\n#include <chrono>\n#include <cstdint>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nStreaming speech denoising with sherpa-onnx.\n\nPlease download GTCRN and sample files from:\n\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n\nDPDFNet models are available from either:\n\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\nhttps://huggingface.co/Ceva-IP/DPDFNet\n\nCurrently this binary supports:\n  gtcrn_simple.onnx\n  dpdfnet_baseline.onnx\n  dpdfnet2.onnx\n  dpdfnet4.onnx\n  dpdfnet8.onnx\n  dpdfnet2_48khz_hr.onnx\n\nUsage:\n\n./bin/sherpa-onnx-online-denoiser \\\n  --speech-denoiser-gtcrn-model=gtcrn_simple.onnx \\\n  --chunk-duration-ms=16 \\\n  --input-wav=input.wav \\\n  --output-wav=output_16k.wav\n\n./bin/sherpa-onnx-online-denoiser \\\n  --speech-denoiser-dpdfnet-model=dpdfnet4.onnx \\\n  --chunk-duration-ms=10 \\\n  --input-wav=input.wav \\\n  --output-wav=output_16k.wav\n\n./bin/sherpa-onnx-online-denoiser \\\n  --speech-denoiser-dpdfnet-model=dpdfnet2_48khz_hr.onnx \\\n  --chunk-duration-ms=10 \\\n  --input-wav=input.wav \\\n  --output-wav=output_48k.wav\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlineSpeechDenoiserConfig config;\n  std::string input_wave;\n  std::string output_wave;\n  int32_t chunk_duration_ms = 10;\n\n  config.Register(&po);\n  po.Register(\"input-wav\", &input_wave, \"Path to input wav.\");\n  po.Register(\"output-wav\", &output_wave, \"Path to output wav.\");\n  po.Register(\"chunk-duration-ms\", &chunk_duration_ms,\n              \"Streaming chunk duration in 
milliseconds.\");\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    fprintf(stderr, \"Please don't give positional arguments\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  if (input_wave.empty()) {\n    fprintf(stderr, \"Please provide --input-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (output_wave.empty()) {\n    fprintf(stderr, \"Please provide --output-wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  if (chunk_duration_ms <= 0) {\n    fprintf(stderr, \"Please provide --chunk-duration-ms > 0\\n\");\n    return -1;\n  }\n\n  int32_t sampling_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(input_wave, &sampling_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", input_wave.c_str());\n    return -1;\n  }\n\n  int32_t chunk_size = sampling_rate * chunk_duration_ms / 1000;\n  if (chunk_size <= 0) {\n    fprintf(stderr,\n            \"The selected chunk duration is too small for sample rate %d\\n\",\n            sampling_rate);\n    return -1;\n  }\n\n  sherpa_onnx::OnlineSpeechDenoiser denoiser(config);\n\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::vector<float> enhanced;\n  enhanced.reserve(samples.size());\n\n  for (size_t i = 0; i < samples.size(); i += chunk_size) {\n    size_t num_samples =\n        std::min(static_cast<size_t>(chunk_size), samples.size() - i);\n    auto chunk = denoiser.Run(samples.data() + i, num_samples, sampling_rate);\n    enhanced.insert(enhanced.end(), chunk.samples.begin(), chunk.samples.end());\n  }\n\n  auto tail = denoiser.Flush();\n  enhanced.insert(enhanced.end(), tail.samples.begin(), tail.samples.end());\n\n  const auto end = std::chrono::steady_clock::now();\n\n  float 
elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"Done\\n\");\n  is_ok = sherpa_onnx::WriteWave(output_wave, denoiser.GetSampleRate(),\n                                 enhanced.data(), enhanced.size());\n  if (is_ok) {\n    fprintf(stderr, \"Saved to %s\\n\", output_wave.c_str());\n  } else {\n    fprintf(stderr, \"Failed to save to %s\\n\", output_wave.c_str());\n  }\n\n  float duration = samples.size() / static_cast<float>(sampling_rate);\n  fprintf(stderr, \"num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"chunk duration: %d ms\\n\", chunk_duration_ms);\n  fprintf(stderr, \"frame shift: %d samples @ %d Hz\\n\",\n          denoiser.GetFrameShiftInSamples(), denoiser.GetSampleRate());\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-online-punctuation.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-online-punctuation.cc\n//\n// Copyright (c) 2024 Jian You (jianyou@cisco.com, Cisco Systems)\n\n#include <stdio.h>\n\n#include <chrono>\n#include <iostream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nAdd punctuations to the input text.\n\nThe input text can contain English words.\n\nUsage:\n\nPlease download the model from:\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n\n./bin/Release/sherpa-onnx-online-punctuation \\\n  --cnn-bilstm=/path/to/model.onnx \\\n  --bpe-vocab=/path/to/bpe.vocab \\\n  \"how are you i am fine thank you\"\n\nThe output text should look like below:\n  \"How are you? I am fine. Thank you.\"\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlinePunctuationConfig config;\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr,\n            \"Error: Please provide only 1 positional argument containing the \"\n            \"input text.\\n\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating OnlinePunctuation ...\\n\");\n  sherpa_onnx::OnlinePunctuation punct(config);\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::string text = po.GetArg(1);\n\n  std::string text_with_punct_case = punct.AddPunctuationWithCase(text);\n\n  const auto end = std::chrono::steady_clock::now();\n  fprintf(stderr, \"Done\\n\");\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  
fprintf(stderr, \"Num threads: %d\\n\", config.model.num_threads);\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  fprintf(stderr, \"Input text: %s\\n\", text.c_str());\n  fprintf(stderr, \"Output text: \");\n  fprintf(stdout, \"%s\\n\", text_with_punct_case.c_str());\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-alsa-offline-asr.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-alsa-offline-asr.cc\n//\n// Copyright (c)  2022-2025  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <memory>\n#include <mutex>  // NOLINT\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n\nbool stop = false;\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use a streaming VAD with non-streaming ASR in\nsherpa-onnx.\n\nPlease download silero_vad.onnx from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nFor instance, use\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nPlease refer to ./sherpa-onnx-microphone-offline.cc\nto download models for offline ASR.\n\n(1) Transducer from icefall\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    device_name\n\n(2) Paraformer from FunASR\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    device_name\n\n(3) Whisper models\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    
--whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    device_name\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig vad_config;\n\n  sherpa_onnx::OfflineRecognizerConfig asr_config;\n\n  vad_config.Register(&po);\n  asr_config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", vad_config.ToString().c_str());\n  fprintf(stderr, \"%s\\n\", asr_config.ToString().c_str());\n\n  if (!vad_config.Validate()) {\n    fprintf(stderr, \"Errors in vad_config!\\n\");\n    return -1;\n  }\n\n  if (!asr_config.Validate()) {\n    fprintf(stderr, \"Errors in asr_config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  sherpa_onnx::OfflineRecognizer recognizer(asr_config);\n  fprintf(stderr, \"Recognizer created!\\n\");\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(vad_config);\n\n  std::string device_name = po.GetArg(1);\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  int32_t sample_rate = 16000;\n\n  if (alsa.GetExpectedSampleRate() != sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            sample_rate);\n    
exit(-1);\n  }\n\n  fprintf(stderr, \"Started. Please speak\\n\");\n\n  int32_t window_size = vad_config.silero_vad.window_size;\n  int32_t index = 0;\n\n  while (!stop) {\n    const std::vector<float> &samples = alsa.Read(window_size);\n    vad->AcceptWaveform(samples.data(), samples.size());\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      auto s = recognizer.CreateStream();\n      s->AcceptWaveform(sample_rate, segment.samples.data(),\n                        segment.samples.size());\n      recognizer.DecodeStream(s.get());\n      const auto &result = s->GetResult();\n      if (!result.text.empty()) {\n        fprintf(stderr, \"%2d: %s\\n\", index, result.text.c_str());\n        ++index;\n      }\n      vad->Pop();\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-alsa.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-alsa.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <iomanip>\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nbool stop = false;\nstatic void Handler(int32_t sig) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use VAD in sherpa-onnx.\n\n  ./bin/sherpa-onnx-vad-alsa \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    device_name\n\nPlease download silero_vad.onnx from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nFor instance, use\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nThe device name specifies which microphone to use in case there are several\non your system. You can use\n\n  arecord -l\n\nto find all available microphones on your computer. 
For instance, if it outputs\n\n**** List of CAPTURE Hardware Devices ****\ncard 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]\n  Subdevices: 1/1\n  Subdevice #0: subdevice #0\n\nand if you want to select card 3 and device 0 on that card, please use:\n\n  plughw:3,0\n\nas the device_name.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Please provide only 1 argument: the device name\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  std::string device_name = po.GetArg(1);\n  sherpa_onnx::Alsa alsa(device_name.c_str());\n  fprintf(stderr, \"Use recording device: %s\\n\", device_name.c_str());\n\n  int32_t sample_rate = 16000;\n\n  if (alsa.GetExpectedSampleRate() != sample_rate) {\n    fprintf(stderr, \"sample rate: %d != %d\\n\", alsa.GetExpectedSampleRate(),\n            sample_rate);\n    exit(-1);\n  }\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(config);\n\n  fprintf(stderr, \"Started. 
Please speak\\n\");\n\n  int32_t window_size = config.silero_vad.window_size;\n  bool printed = false;\n\n  int32_t k = 0;\n  while (!stop) {\n    const std::vector<float> &samples = alsa.Read(window_size);\n\n    vad->AcceptWaveform(samples.data(), samples.size());\n\n    if (vad->IsSpeechDetected() && !printed) {\n      printed = true;\n      fprintf(stderr, \"\\nDetected speech!\\n\");\n    }\n    if (!vad->IsSpeechDetected()) {\n      printed = false;\n    }\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      float duration = segment.samples.size() / static_cast<float>(sample_rate);\n\n      fprintf(stderr, \"Duration: %.3f seconds\\n\", duration);\n\n      std::ostringstream os;\n      os << \"seg-\" << k << \"-\" << std::fixed << std::setprecision(3) << duration\n         << \"s.wav\";\n      k += 1;\n      sherpa_onnx::WriteWave(os.str(), sample_rate, segment.samples.data(),\n                             segment.samples.size());\n      fprintf(stderr, \"Saved to %s\\n\", os.str().c_str());\n      fprintf(stderr, \"----------\\n\");\n\n      vad->Pop();\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-microphone-offline-asr.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-microphone-offline-asr.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <memory>\n#include <mutex>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n\nbool stop = false;\nstd::mutex mutex;\nsherpa_onnx::CircularBuffer buffer(16000 * 60);\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  buffer.Push(reinterpret_cast<const float *>(input_buffer), frames_per_buffer);\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use a streaming VAD with non-streaming ASR in\nsherpa-onnx.\n\nPlease download silero_vad.onnx from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nFor instance, use\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nPlease refer to ./sherpa-onnx-microphone-offline.cc\nto download models for offline ASR.\n\n(1) Transducer from icefall\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx\n\n(2) Paraformer from FunASR\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1\n\n(3) Whisper models\n\n  ./bin/sherpa-onnx-vad-microphone-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig vad_config;\n\n  sherpa_onnx::OfflineRecognizerConfig asr_config;\n\n  vad_config.Register(&po);\n  asr_config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", vad_config.ToString().c_str());\n  fprintf(stderr, \"%s\\n\", asr_config.ToString().c_str());\n\n  if (!vad_config.Validate()) {\n    fprintf(stderr, \"Errors in vad_config!\\n\");\n    return 
-1;\n  }\n\n  if (!asr_config.Validate()) {\n    fprintf(stderr, \"Errors in asr_config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  sherpa_onnx::OfflineRecognizer recognizer(asr_config);\n  fprintf(stderr, \"Recognizer created!\\n\");\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input device found\\n\");\n    fprintf(stderr,\n            \"  If you are using Linux, please try \"\n            \"./build/bin/sherpa-onnx-vad-alsa-offline-asr\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the value first so the message reports the rate actually used.\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    fprintf(stderr, \"Failed to open device %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  float sample_rate = 16000;\n  std::unique_ptr<sherpa_onnx::LinearResample> resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = std::make_unique<sherpa_onnx::LinearResample>(\n        mic_sample_rate, sample_rate, lowpass_cutoff, lowpass_filter_width);\n  }\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(vad_config);\n\n  fprintf(stderr, \"Started. 
Please speak\\n\");\n\n  int32_t window_size = vad_config.silero_vad.window_size;\n  int32_t index = 0;\n\n  while (!stop) {\n    {\n      std::lock_guard<std::mutex> lock(mutex);\n\n      while (buffer.Size() >= window_size) {\n        std::vector<float> samples = buffer.Get(buffer.Head(), window_size);\n        buffer.Pop(window_size);\n\n        if (resampler) {\n          std::vector<float> tmp;\n          resampler->Resample(samples.data(), samples.size(), true, &tmp);\n          samples = std::move(tmp);\n        }\n\n        vad->AcceptWaveform(samples.data(), samples.size());\n      }\n    }\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      auto s = recognizer.CreateStream();\n      s->AcceptWaveform(sample_rate, segment.samples.data(),\n                        segment.samples.size());\n      recognizer.DecodeStream(s.get());\n      const auto &result = s->GetResult();\n      if (!result.text.empty()) {\n        fprintf(stderr, \"%2d: %s\\n\", index, result.text.c_str());\n        ++index;\n      }\n      vad->Pop();\n    }\n\n    Pa_Sleep(100);  // sleep for 100ms\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-microphone-simulated-streaming-asr.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-microphone-simulated-streaming-asr.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <chrono>\n#include <condition_variable>\n#include <memory>\n#include <mutex>\n#include <queue>\n#include <string>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/sherpa-display.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nstd::queue<std::vector<float>> samples_queue;\nstd::condition_variable condition_variable;\nstd::mutex mutex;\nbool stop = false;\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  samples_queue.emplace(\n      reinterpret_cast<const float *>(input_buffer),\n      reinterpret_cast<const float *>(input_buffer) + frames_per_buffer);\n  condition_variable.notify_one();\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  condition_variable.notify_one();\n  fprintf(stdout, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use a streaming VAD with non-streaming ASR in\nsherpa-onnx for real-time speech recognition.\n\n(1) SenseVoice\n\ncd /path/to/sherpa-onnx/build\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\ntar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n./bin/sherpa-onnx-vad-microphone-simulated-streaming-asr \\\n  --silero-vad-model=./silero_vad.onnx \\\n  --sense-voice-model=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/model.int8.onnx \\\n  --tokens=./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17/tokens.txt\n\n(2) Parakeet TDT 0.6b v2\n\ncd /path/to/sherpa-onnx/build\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\ntar xvf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n./bin/sherpa-onnx-vad-microphone-simulated-streaming-asr \\\n  --silero-vad-model=./silero_vad.onnx \\\n  --encoder=./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/encoder.int8.onnx \\\n  --decoder=./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/decoder.int8.onnx \\\n  --joiner=./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/joiner.int8.onnx \\\n  --tokens=./sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/tokens.txt\n\n(3) Please refer to our doc for more non-streaming ASR models,\ne.g., zipformer, paraformer, whisper, etc.\n\nPlease first use ./bin/sherpa-onnx-offline to test the RTF of the model.\nA model with RTF < 0.2 should work with this program.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig vad_config;\n\n  
sherpa_onnx::OfflineRecognizerConfig asr_config;\n\n  vad_config.Register(&po);\n  asr_config.Register(&po);\n\n  int32_t user_device_index = -1;  // -1 means to use default value\n  int32_t user_sample_rate = -1;   // -1 means to use default value\n\n  po.Register(\"mic-device-index\", &user_device_index,\n              \"If provided, we use it to replace the default device index. \"\n              \"You can use sherpa-onnx-pa-devs to list available devices\");\n\n  po.Register(\"mic-sample-rate\", &user_sample_rate,\n              \"If provided, we use it to replace the default sample rate. \"\n              \"You can use sherpa-onnx-pa-devs to list the sample rates of \"\n              \"available devices\");\n\n  if (argc == 1) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stdout, \"%s\\n\", vad_config.ToString().c_str());\n  fprintf(stdout, \"%s\\n\", asr_config.ToString().c_str());\n\n  if (!vad_config.Validate()) {\n    fprintf(stdout, \"Errors in vad_config!\\n\");\n    return -1;\n  }\n\n  if (!asr_config.Validate()) {\n    fprintf(stdout, \"Errors in asr_config!\\n\");\n    return -1;\n  }\n\n  fprintf(stdout, \"Creating recognizer ...\\n\");\n  sherpa_onnx::OfflineRecognizer recognizer(asr_config);\n  fprintf(stdout, \"Recognizer created!\\n\");\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stdout, \"No default input device found\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  if (user_device_index >= 0) {\n    fprintf(stdout, \"Use specified device: %d\\n\", user_device_index);\n    device_index = user_device_index;\n  } else {\n    fprintf(stdout, \"Use default device: %d\\n\", device_index);\n  }\n\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  if (user_sample_rate > 0) {\n    fprintf(stdout, \"Use sample rate %d for 
mic\\n\", user_sample_rate);\n    mic_sample_rate = user_sample_rate;\n  }\n\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    fprintf(stdout, \"Failed to open device %d\\n\", device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  float sample_rate = 16000;\n  std::unique_ptr<sherpa_onnx::LinearResample> resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = std::make_unique<sherpa_onnx::LinearResample>(\n        mic_sample_rate, sample_rate, lowpass_cutoff, lowpass_filter_width);\n  }\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(vad_config);\n\n  int32_t window_size = vad_config.silero_vad.window_size;\n\n  int32_t offset = 0;\n  bool speech_started = false;\n  std::vector<float> buffer;\n\n  auto started_time = std::chrono::steady_clock::now();\n  sherpa_onnx::SherpaDisplay display;\n\n  fprintf(stdout, \"Started. 
Please speak\\n\");\n  std::vector<float> resampled;\n\n  while (!stop) {\n    {\n      std::unique_lock<std::mutex> lock(mutex);\n      while (samples_queue.empty() && !stop) {\n        condition_variable.wait(lock);\n      }\n\n      if (stop) {\n        break;\n      }\n\n      const auto &s = samples_queue.front();\n      if (!resampler) {\n        buffer.insert(buffer.end(), s.begin(), s.end());\n      } else {\n        resampler->Resample(s.data(), s.size(), false, &resampled);\n        buffer.insert(buffer.end(), resampled.begin(), resampled.end());\n      }\n\n      samples_queue.pop();\n    }\n\n    for (; offset + window_size < buffer.size(); offset += window_size) {\n      vad->AcceptWaveform(buffer.data() + offset, window_size);\n      if (!speech_started && vad->IsSpeechDetected()) {\n        speech_started = true;\n        started_time = std::chrono::steady_clock::now();\n      }\n    }\n\n    if (!speech_started) {\n      if (buffer.size() > 10 * window_size) {\n        offset -= buffer.size() - 10 * window_size;\n        buffer = {buffer.end() - 10 * window_size, buffer.end()};\n      }\n    }\n\n    auto current_time = std::chrono::steady_clock::now();\n    const float elapsed_seconds =\n        std::chrono::duration_cast<std::chrono::milliseconds>(current_time -\n                                                              started_time)\n            .count() /\n        1000.;\n\n    if (speech_started && elapsed_seconds > 0.2) {\n      auto s = recognizer.CreateStream();\n      s->AcceptWaveform(sample_rate, buffer.data(), buffer.size());\n      recognizer.DecodeStream(s.get());\n      const auto &result = s->GetResult();\n      display.UpdateText(result.text);\n      display.Display();\n\n      started_time = std::chrono::steady_clock::now();\n    }\n\n    while (!vad->Empty()) {\n      // when stopping speak, this while loop is executed\n\n      vad->Pop();\n\n      display.FinalizeCurrentSentence();\n      display.Display();\n\n      
buffer.clear();\n      offset = 0;\n      speech_started = false;\n    }\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-microphone.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-microphone.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <signal.h>\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <memory>\n#include <mutex>\n#include <utility>\n#include <vector>\n\n#include \"portaudio.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/microphone.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nbool stop = false;\nstd::mutex mutex;\nsherpa_onnx::CircularBuffer buffer(16000 * 60);\n\nstatic int32_t RecordCallback(const void *input_buffer,\n                              void * /*output_buffer*/,\n                              unsigned long frames_per_buffer,  // NOLINT\n                              const PaStreamCallbackTimeInfo * /*time_info*/,\n                              PaStreamCallbackFlags /*status_flags*/,\n                              void * /*user_data*/) {\n  std::lock_guard<std::mutex> lock(mutex);\n  buffer.Push(reinterpret_cast<const float *>(input_buffer), frames_per_buffer);\n\n  return stop ? paComplete : paContinue;\n}\n\nstatic void Handler(int32_t /*sig*/) {\n  stop = true;\n  fprintf(stderr, \"\\nCaught Ctrl + C. 
Exiting...\\n\");\n}\n\nint32_t main(int32_t argc, char *argv[]) {\n  signal(SIGINT, Handler);\n\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use VAD in sherpa-onnx.\n\n  ./bin/sherpa-onnx-vad-microphone \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --vad-provider=cpu \\\n    --vad-num-threads=1\n\nPlease download silero_vad.onnx from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nFor instance, use\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 0) {\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  sherpa_onnx::Microphone mic;\n\n  int32_t device_index = Pa_GetDefaultInputDevice();\n  if (device_index == paNoDevice) {\n    fprintf(stderr, \"No default input device found\\n\");\n    fprintf(stderr, \"If you are using Linux, please switch to \\n\");\n    fprintf(stderr, \" ./bin/sherpa-onnx-vad-alsa \\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  const char *pDeviceIndex = std::getenv(\"SHERPA_ONNX_MIC_DEVICE\");\n  if (pDeviceIndex) {\n    fprintf(stderr, \"Use specified device: %s\\n\", pDeviceIndex);\n    device_index = atoi(pDeviceIndex);\n  }\n  mic.PrintDevices(device_index);\n\n  float mic_sample_rate = 16000;\n  const char *pSampleRateStr = std::getenv(\"SHERPA_ONNX_MIC_SAMPLE_RATE\");\n  if (pSampleRateStr) {\n    // Parse the override first so that the message reports the rate in use\n    mic_sample_rate = atof(pSampleRateStr);\n    fprintf(stderr, \"Use sample rate %f for mic\\n\", mic_sample_rate);\n  }\n  if (!mic.OpenDevice(device_index, mic_sample_rate, 1, RecordCallback,\n                      nullptr)) {\n    fprintf(stderr, \"Failed to open microphone device %d\\n\", 
device_index);\n    exit(EXIT_FAILURE);\n  }\n\n  float sample_rate = 16000;\n  std::unique_ptr<sherpa_onnx::LinearResample> resampler;\n  if (mic_sample_rate != sample_rate) {\n    float min_freq = std::min(mic_sample_rate, sample_rate);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    resampler = std::make_unique<sherpa_onnx::LinearResample>(\n        mic_sample_rate, sample_rate, lowpass_cutoff, lowpass_filter_width);\n  }\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(config);\n\n  int32_t window_size = config.silero_vad.window_size;\n  bool printed = false;\n\n  int32_t k = 0;\n  while (!stop) {\n    {\n      std::lock_guard<std::mutex> lock(mutex);\n\n      while (buffer.Size() >= window_size) {\n        std::vector<float> samples = buffer.Get(buffer.Head(), window_size);\n        buffer.Pop(window_size);\n\n        if (resampler) {\n          std::vector<float> tmp;\n          resampler->Resample(samples.data(), samples.size(), true, &tmp);\n          samples = std::move(tmp);\n        }\n\n        vad->AcceptWaveform(samples.data(), samples.size());\n\n        if (vad->IsSpeechDetected() && !printed) {\n          printed = true;\n          fprintf(stderr, \"\\nDetected speech!\\n\");\n        }\n        if (!vad->IsSpeechDetected()) {\n          printed = false;\n        }\n\n        while (!vad->Empty()) {\n          const auto &segment = vad->Front();\n          float duration = segment.samples.size() / sample_rate;\n          fprintf(stderr, \"Duration: %.3f seconds\\n\", duration);\n\n          char filename[128];\n          snprintf(filename, sizeof(filename), \"seg-%d-%.3fs.wav\", k, duration);\n          k += 1;\n          sherpa_onnx::WriteWave(filename, sample_rate, segment.samples.data(),\n                                 segment.samples.size());\n          fprintf(stderr, \"Saved to %s\\n\", filename);\n          fprintf(stderr, \"----------\\n\");\n\n          vad->Pop();\n    
    }\n      }\n    }\n    Pa_Sleep(100);  // sleep for 100ms\n  }\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-with-offline-asr.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-with-offline-asr.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <algorithm>\n#include <chrono>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nSpeech recognition using VAD + non-streaming models with sherpa-onnx.\n\nUsage:\n\nNote you can download silero_vad.onnx using\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(0) FireRedAsr\n\nSee https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/pretrained.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --tokens=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt \\\n    --fire-red-asr-encoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx \\\n    --fire-red-asr-decoder=./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx \\\n    --num-threads=1 \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    /path/to/foo.wav\n\n(1) Transducer from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav\n\n\n(2) Paraformer from FunASR\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    
--silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --paraformer=/path/to/model.onnx \\\n    --num-threads=1 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav\n\n(3) Moonshine models\n\nSee https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --moonshine-preprocessor=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/preprocess.onnx \\\n    --moonshine-encoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/encode.int8.onnx \\\n    --moonshine-uncached-decoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/uncached_decode.int8.onnx \\\n    --moonshine-cached-decoder=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/cached_decode.int8.onnx \\\n    --tokens=/Users/fangjun/open-source/sherpa-onnx/scripts/moonshine/tokens.txt \\\n    --num-threads=1 \\\n    /path/to/foo.wav\n\n(4) Whisper models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \\\n    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \\\n    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \\\n    --num-threads=1 \\\n    /path/to/foo.wav\n\n(5) NeMo CTC models\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=./sherpa-onnx-nemo-ctc-en-conformer-medium/tokens.txt \\\n    --nemo-ctc-model=./sherpa-onnx-nemo-ctc-en-conformer-medium/model.onnx \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    --debug=false \\\n    ./sherpa-onnx-nemo-ctc-en-conformer-medium/test_wavs/0.wav\n\n(6) TDNN CTC model for the yesno 
recipe from icefall\n\nSee https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/yesno/index.html\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --sample-rate=8000 \\\n    --feat-dim=23 \\\n    --tokens=./sherpa-onnx-tdnn-yesno/tokens.txt \\\n    --tdnn-model=./sherpa-onnx-tdnn-yesno/model-epoch-14-avg-2.onnx \\\n    ./sherpa-onnx-tdnn-yesno/test_wavs/0_0_0_1_0_0_0_1.wav\n\n(7) FunASR-nano models\n\nSee https://github.com/FunAudioLLM/Fun-ASR-Nano-2512\n\n  ./bin/sherpa-onnx-vad-with-offline-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --funasr-nano-encoder-adaptor=/path/to/encoder_adaptor.onnx \\\n    --funasr-nano-llm=/path/to/llm.onnx \\\n    --funasr-nano-tokenizer=/path/to/Qwen3-0.6B \\\n    --funasr-nano-embedding=/path/to/embedding.onnx \\\n    [--funasr-nano-user-prompt=\"Transcription:\"] \\\n    [--funasr-nano-max-new-tokens=512] \\\n    [--funasr-nano-temperature=1e-6] \\\n    [--funasr-nano-top-p=0.8] \\\n    --num-threads=4 \\\n    /path/to/foo.wav\n\nThe input should be a single-channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OfflineRecognizerConfig asr_config;\n  asr_config.Register(&po);\n\n  sherpa_onnx::VadModelConfig vad_config;\n  vad_config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Error: Please provide exactly 1 wave file. 
Given: %d\\n\\n\",\n            po.NumArgs());\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", vad_config.ToString().c_str());\n  fprintf(stderr, \"%s\\n\", asr_config.ToString().c_str());\n\n  if (!vad_config.Validate()) {\n    fprintf(stderr, \"Errors in vad_config!\\n\");\n    return -1;\n  }\n\n  if (!asr_config.Validate()) {\n    fprintf(stderr, \"Errors in ASR config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  sherpa_onnx::OfflineRecognizer recognizer(asr_config);\n  fprintf(stderr, \"Recognizer created!\\n\");\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(vad_config);\n\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::string wave_filename = po.GetArg(1);\n  fprintf(stderr, \"Reading: %s\\n\", wave_filename.c_str());\n  int32_t sampling_rate = -1;\n  bool is_ok = false;\n  auto samples = sherpa_onnx::ReadWave(wave_filename, &sampling_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", wave_filename.c_str());\n    return -1;\n  }\n\n  if (sampling_rate != 16000) {\n    fprintf(stderr, \"Resampling from %d Hz to 16000 Hz\\n\", sampling_rate);\n    float min_freq = std::min<int32_t>(sampling_rate, 16000);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    auto resampler = std::make_unique<sherpa_onnx::LinearResample>(\n        sampling_rate, 16000, lowpass_cutoff, lowpass_filter_width);\n    std::vector<float> out_samples;\n    resampler->Resample(samples.data(), samples.size(), true, &out_samples);\n    samples = std::move(out_samples);\n    fprintf(stderr, \"Resampling done\\n\");\n  }\n\n  fprintf(stderr, \"Started!\\n\");\n  int32_t window_size = vad_config.silero_vad.window_size;\n  int32_t i = 0;\n  while (i < samples.size()) {\n    if (i + window_size <= samples.size()) {\n      vad->AcceptWaveform(samples.data() + i, window_size);\n 
   } else {\n      vad->Flush();\n    }\n\n    i += window_size;\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      float duration = segment.samples.size() / 16000.;\n      float start_time = segment.start / 16000.;\n      float end_time = start_time + duration;\n      if (duration < 0.1) {\n        vad->Pop();\n        continue;\n      }\n\n      auto s = recognizer.CreateStream();\n      s->AcceptWaveform(16000, segment.samples.data(), segment.samples.size());\n      recognizer.DecodeStream(s.get());\n      const auto &result = s->GetResult();\n      if (!result.text.empty()) {\n        fprintf(stderr, \"%.3f -- %.3f: \", start_time, end_time);\n        fprintf(stdout, \"%s\\n\", result.text.c_str());\n      }\n      vad->Pop();\n    }\n  }\n\n  const auto end = std::chrono::steady_clock::now();\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"num threads: %d\\n\", asr_config.model_config.num_threads);\n  fprintf(stderr, \"decoding method: %s\\n\", asr_config.decoding_method.c_str());\n  if (asr_config.decoding_method == \"modified_beam_search\") {\n    fprintf(stderr, \"max active paths: %d\\n\", asr_config.max_active_paths);\n  }\n\n  float duration = samples.size() / 16000.;\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad-with-online-asr.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad-with-online-asr.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n// Copyright (c)  2025  Pingfeng Luo\n//\n// This file demonstrates how to use vad in streaming speech recognition\n//\n\n#include <stdio.h>\n\n#include <algorithm>\n#include <chrono>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/resample.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\nint32_t main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nSpeech recognition using VAD + streaming models with sherpa-onnx-vad-with-online-asr.\nThis is useful when testing long audio.\n\nUsage:\n\nNote you can download silero_vad.onnx using\n\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\n(1) Streaming transducer\n\n  ./bin/sherpa-onnx-vad-with-online-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    /path/to/long_duration.wav\n\n(2) Streaming zipformer2 CTC\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n\n  ./bin/sherpa-onnx-vad-with-online-asr \\\n    --debug=1 \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --zipformer2-ctc-model=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n    
--tokens=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt \\\n    ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav\n\n(3) Streaming paraformer\n\n  wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n  ./bin/sherpa-onnx-vad-with-online-asr \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    --tokens=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n    --paraformer-encoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.onnx \\\n    --paraformer-decoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.onnx \\\n    /path/to/long_duration.wav\n\n\nThe input wav should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlineRecognizerConfig asr_config;\n  asr_config.Register(&po);\n\n  sherpa_onnx::VadModelConfig vad_config;\n  vad_config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() != 1) {\n    fprintf(stderr, \"Error: Please provide exactly 1 wave file. 
Given: %d\\n\\n\",\n            po.NumArgs());\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", vad_config.ToString().c_str());\n  fprintf(stderr, \"%s\\n\", asr_config.ToString().c_str());\n\n  if (!vad_config.Validate()) {\n    fprintf(stderr, \"Errors in vad_config!\\n\");\n    return -1;\n  }\n\n  if (!asr_config.Validate()) {\n    fprintf(stderr, \"Errors in ASR config!\\n\");\n    return -1;\n  }\n\n  fprintf(stderr, \"Creating recognizer ...\\n\");\n  sherpa_onnx::OnlineRecognizer recognizer(asr_config);\n  fprintf(stderr, \"Recognizer created!\\n\");\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(vad_config);\n\n  fprintf(stderr, \"Started\\n\");\n  const auto begin = std::chrono::steady_clock::now();\n\n  std::string wave_filename = po.GetArg(1);\n  fprintf(stderr, \"Reading: %s\\n\", wave_filename.c_str());\n  int32_t sampling_rate = -1;\n  bool is_ok = false;\n  auto samples = sherpa_onnx::ReadWave(wave_filename, &sampling_rate, &is_ok);\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", wave_filename.c_str());\n    return -1;\n  }\n\n  if (sampling_rate != 16000) {\n    fprintf(stderr, \"Resampling from %d Hz to 16000 Hz\\n\", sampling_rate);\n    float min_freq = std::min(sampling_rate, 16000);\n    float lowpass_cutoff = 0.99 * 0.5 * min_freq;\n\n    int32_t lowpass_filter_width = 6;\n    auto resampler = std::make_unique<sherpa_onnx::LinearResample>(\n        sampling_rate, 16000, lowpass_cutoff, lowpass_filter_width);\n    std::vector<float> out_samples;\n    resampler->Resample(samples.data(), samples.size(), true, &out_samples);\n    samples = std::move(out_samples);\n    fprintf(stderr, \"Resampling done\\n\");\n  }\n  const float tail_padding_len = 1.28;  // related to model chunk-size\n  std::vector<float> tail_paddings(static_cast<int>(tail_padding_len * 16000));\n\n  fprintf(stderr, \"Started!\\n\");\n  int32_t window_size = vad_config.ten_vad.model.empty()\n              
              ? vad_config.silero_vad.window_size\n                            : vad_config.ten_vad.window_size;\n  int32_t offset = 0;\n  int32_t segment_id = 0;\n  bool speech_started = false;\n  while (offset < samples.size()) {\n    if (offset + window_size <= samples.size()) {\n      vad->AcceptWaveform(samples.data() + offset, window_size);\n    } else {\n      vad->Flush();\n    }\n    offset += window_size;\n    if (vad->IsSpeechDetected() && !speech_started) {\n      // new voice activity\n      speech_started = true;\n      segment_id++;\n    } else if (!vad->IsSpeechDetected() && speech_started) {\n      // end voice activity\n      speech_started = false;\n    }\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      float duration = segment.samples.size() / 16000.;\n      float start_time = segment.start / 16000.;\n      float end_time = start_time + duration;\n      auto s = recognizer.CreateStream();\n      s->AcceptWaveform(16000, segment.samples.data(), segment.samples.size());\n      s->AcceptWaveform(16000, tail_paddings.data(), tail_paddings.size());\n      s->InputFinished();\n      while (recognizer.IsReady(s.get())) {\n        recognizer.DecodeStream(s.get());\n      }\n      auto text = recognizer.GetResult(s.get()).text;\n      if (!text.empty()) {\n        fprintf(stderr, \"vad segment(%d:%.3f-%.3f) results: %s\\n\", segment_id,\n                start_time, end_time, text.c_str());\n      }\n      vad->Pop();\n    }\n  }\n\n  const auto end = std::chrono::steady_clock::now();\n\n  float elapsed_seconds =\n      std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n          .count() /\n      1000.;\n\n  fprintf(stderr, \"num threads: %d\\n\", asr_config.model_config.num_threads);\n  fprintf(stderr, \"decoding method: %s\\n\", asr_config.decoding_method.c_str());\n  if (asr_config.decoding_method == \"modified_beam_search\") {\n    fprintf(stderr, \"max active paths: %d\\n\", 
asr_config.max_active_paths);\n  }\n\n  float duration = samples.size() / 16000.;\n  fprintf(stderr, \"Elapsed seconds: %.3f s\\n\", elapsed_seconds);\n  float rtf = elapsed_seconds / duration;\n  fprintf(stderr, \"Real time factor (RTF): %.3f / %.3f = %.3f\\n\",\n          elapsed_seconds, duration, rtf);\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-vad.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-vad.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include <stdio.h>\n#include <stdlib.h>\n\n#include <algorithm>\n#include <iomanip>\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nint32_t main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nThis program shows how to use VAD in sherpa-onnx\nto remove silences from a file.\n\n  ./bin/sherpa-onnx-vad \\\n    --silero-vad-model=/path/to/silero_vad.onnx \\\n    /path/to/input.wav\n    /path/to/output.wav\n\nPlease download silero_vad.onnx from\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\nFor instance, use\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n\ninput.wav should be 16kHz.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::VadModelConfig config;\n\n  config.Register(&po);\n  po.Read(argc, argv);\n  if (po.NumArgs() != 2) {\n    fprintf(\n        stderr,\n        \"Please provide only 2 argument2: the input wav and the output wav\\n\");\n    po.PrintUsage();\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  std::string wav_filename = po.GetArg(1);\n  int32_t sampling_rate = -1;\n\n  bool is_ok = false;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n\n  if (!is_ok) {\n    fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n    return -1;\n  }\n\n  if (sampling_rate != 16000) {\n    fprintf(stderr, \"Support only 16000Hz. 
Given: %d\\n\", sampling_rate);\n    return -1;\n  }\n\n  auto vad = std::make_unique<sherpa_onnx::VoiceActivityDetector>(config);\n\n  int32_t window_size = config.silero_vad.window_size;\n\n  int32_t i = 0;\n  bool is_eof = false;\n\n  std::vector<float> samples_without_silence;\n\n  while (!is_eof) {\n    // Feed every complete window, including one that ends exactly at the\n    // last sample; flush once no complete window remains.\n    if (i + window_size <= samples.size()) {\n      vad->AcceptWaveform(samples.data() + i, window_size);\n      i += window_size;\n    } else {\n      vad->Flush();\n      is_eof = true;\n    }\n\n    while (!vad->Empty()) {\n      const auto &segment = vad->Front();\n      float start_time = segment.start / static_cast<float>(sampling_rate);\n      float end_time = start_time + segment.samples.size() /\n                                        static_cast<float>(sampling_rate);\n\n      fprintf(stderr, \"%.3f -- %.3f\\n\", start_time, end_time);\n      samples_without_silence.insert(samples_without_silence.end(),\n                                     segment.samples.begin(),\n                                     segment.samples.end());\n      vad->Pop();\n    }\n  }\n\n  sherpa_onnx::WriteWave(po.GetArg(2), sampling_rate,\n                         samples_without_silence.data(),\n                         samples_without_silence.size());\n\n  fprintf(stderr, \"Saved to %s\\n\", po.GetArg(2).c_str());\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx-version.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx-version.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <cstdint>\n\n#include \"sherpa-onnx/csrc/version.h\"\n\nint32_t main() {\n  printf(\"sherpa-onnx version : %s\\n\", sherpa_onnx::GetVersionStr());\n  printf(\"sherpa-onnx Git SHA1: %s\\n\", sherpa_onnx::GetGitSha1());\n  printf(\"sherpa-onnx Git date: %s\\n\", sherpa_onnx::GetGitDate());\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/sherpa-onnx.cc",
    "content": "// sherpa-onnx/csrc/sherpa-onnx.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include <stdio.h>\n\n#include <chrono>\n#include <iomanip>\n#include <iostream>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"sherpa-onnx/csrc/timer.h\"\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\ntypedef struct {\n  std::unique_ptr<sherpa_onnx::OnlineStream> online_stream;\n  float duration;\n  float elapsed_seconds;\n} Stream;\n\nint main(int32_t argc, char *argv[]) {\n  const char *kUsageMessage = R\"usage(\nUsage:\n\n(1) Streaming transducer\n\n  ./bin/sherpa-onnx \\\n    --tokens=/path/to/tokens.txt \\\n    --encoder=/path/to/encoder.onnx \\\n    --decoder=/path/to/decoder.onnx \\\n    --joiner=/path/to/joiner.onnx \\\n    --provider=cpu \\\n    --num-threads=2 \\\n    --decoding-method=greedy_search \\\n    /path/to/foo.wav [bar.wav foobar.wav ...]\n\n(2) Streaming zipformer2 CTC\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n\n  ./bin/sherpa-onnx \\\n    --debug=1 \\\n    --zipformer2-ctc-model=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx \\\n    --tokens=./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt \\\n    ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav \\\n    ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000001.wav \\\n    ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000002.wav\n\n(3) Streaming paraformer\n\n  wget 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n  tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\n  ./bin/sherpa-onnx \\\n    --tokens=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt \\\n    --paraformer-encoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.onnx \\\n    --paraformer-decoder=./sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.onnx \\\n    ./sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav\n\nNote: It supports decoding multiple files in batches\n\nDefault value for num_threads is 2.\nValid values for decoding_method: greedy_search (default), modified_beam_search.\nValid values for provider: cpu (default), cuda, coreml.\nfoo.wav should be of single channel, 16-bit PCM encoded wave file; its\nsampling rate can be arbitrary and does not need to be 16kHz.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models to download.\n)usage\";\n\n  sherpa_onnx::ParseOptions po(kUsageMessage);\n  sherpa_onnx::OnlineRecognizerConfig config;\n\n  config.Register(&po);\n\n  po.Read(argc, argv);\n  if (po.NumArgs() < 1) {\n    po.PrintUsage();\n    fprintf(stderr, \"Error! 
Please provide at least 1 wav file\\n\");\n    exit(EXIT_FAILURE);\n  }\n\n  fprintf(stderr, \"%s\\n\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    fprintf(stderr, \"Errors in config!\\n\");\n    return -1;\n  }\n\n  printf(\"Start to create recognizer\\n\");\n  sherpa_onnx::Timer timer;\n  sherpa_onnx::OnlineRecognizer recognizer(config);\n  printf(\"Recognizer created in %.5f s\\n\", timer.Elapsed());\n\n  std::vector<Stream> ss;\n\n  const auto begin = std::chrono::steady_clock::now();\n  std::vector<float> durations;\n\n  for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n    const std::string wav_filename = po.GetArg(i);\n    int32_t sampling_rate = -1;\n\n    bool is_ok = false;\n    const std::vector<float> samples =\n        sherpa_onnx::ReadWave(wav_filename, &sampling_rate, &is_ok);\n\n    if (!is_ok) {\n      fprintf(stderr, \"Failed to read '%s'\\n\", wav_filename.c_str());\n      return -1;\n    }\n\n    const float duration = samples.size() / static_cast<float>(sampling_rate);\n\n    auto s = recognizer.CreateStream();\n\n    // std::vector<float> left_paddings(static_cast<int>(0.3 * sampling_rate));\n    // s->AcceptWaveform(sampling_rate, left_paddings.data(),\n    //                   left_paddings.size());\n\n    s->AcceptWaveform(sampling_rate, samples.data(), samples.size());\n\n    std::vector<float> tail_paddings(static_cast<int>(0.8 * sampling_rate));\n    // Note: We can call AcceptWaveform() multiple times.\n    s->AcceptWaveform(sampling_rate, tail_paddings.data(),\n                      tail_paddings.size());\n\n    // Call InputFinished() to indicate that no more audio samples are available\n    s->InputFinished();\n    ss.push_back({std::move(s), duration, 0});\n  }\n\n  std::vector<sherpa_onnx::OnlineStream *> ready_streams;\n  for (;;) {\n    ready_streams.clear();\n    for (auto &s : ss) {\n      const auto p_ss = s.online_stream.get();\n      if (recognizer.IsReady(p_ss)) {\n        ready_streams.push_back(p_ss);\n      } 
else if (s.elapsed_seconds == 0) {\n        const auto end = std::chrono::steady_clock::now();\n        const float elapsed_seconds =\n            std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)\n                .count() /\n            1000.;\n        s.elapsed_seconds = elapsed_seconds;\n      }\n    }\n\n    if (ready_streams.empty()) {\n      break;\n    }\n\n    recognizer.DecodeStreams(ready_streams.data(), ready_streams.size());\n  }\n\n  std::ostringstream os;\n  for (int32_t i = 1; i <= po.NumArgs(); ++i) {\n    const auto &s = ss[i - 1];\n    const float rtf = s.elapsed_seconds / s.duration;\n\n    os << po.GetArg(i) << \"\\n\";\n    os << \"Number of threads: \" << config.model_config.num_threads << \", \"\n       << std::setprecision(2) << \"Elapsed seconds: \" << s.elapsed_seconds\n       << \", Audio duration (s): \" << s.duration\n       << \", Real time factor (RTF) = \" << s.elapsed_seconds << \"/\"\n       << s.duration << \" = \" << rtf << \"\\n\";\n    const auto r = recognizer.GetResult(s.online_stream.get());\n    os << r.text << \"\\n\";\n    os << r.AsJsonString() << \"\\n\\n\";\n  }\n\n  std::cerr << os.str();\n\n  return 0;\n}\n"
  },
  {
    "path": "sherpa-onnx/csrc/silero-vad-model-config.cc",
    "content": "// sherpa-onnx/csrc/silero-vad-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/silero-vad-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid SileroVadModelConfig::Register(ParseOptions *po) {\n  po->Register(\"silero-vad-model\", &model, \"Path to silero VAD ONNX model.\");\n\n  po->Register(\"silero-vad-threshold\", &threshold,\n               \"Speech threshold. Silero VAD outputs speech probabilities for \"\n               \"each audio chunk, probabilities ABOVE this value are \"\n               \"considered as SPEECH. It is better to tune this parameter for \"\n               \"each dataset separately, but lazy \"\n               \"0.5 is pretty good for most datasets.\");\n\n  po->Register(\n      \"silero-vad-min-silence-duration\", &min_silence_duration,\n      \"In seconds.  In the end of each speech chunk wait for \"\n      \"--silero-vad-min-silence-duration seconds before separating it\");\n\n  po->Register(\"silero-vad-min-speech-duration\", &min_speech_duration,\n               \"In seconds.  In the end of each silence chunk wait for \"\n               \"--silero-vad-min-speech-duration seconds before separating it\");\n\n  po->Register(\n      \"silero-vad-max-speech-duration\", &max_speech_duration,\n      \"In seconds. If a speech segment is longer than this value, then we \"\n      \"increase the threshold to 0.9. After finishing detecting the segment, \"\n      \"the threshold value is reset to its original value.\");\n\n  po->Register(\n      \"silero-vad-window-size\", &window_size,\n      \"In samples. Audio chunks of --silero-vad-window-size samples are fed \"\n      \"to the silero VAD model. WARNING! Silero VAD models were trained using \"\n      \"512, 1024, 1536 samples for 16000 sample rate and 256, 512, 768 samples \"\n      \"for 8000 sample rate. 
Values other than these may affect model \"\n      \"performance!\");\n\n  po->Register(\"silero-vad-neg-threshold\", &neg_threshold,\n               \"Negative threshold (noise threshold). If < 0, defaults to \"\n               \"(threshold - 0.15) with lower bound 0.01.\");\n}\n\nbool SileroVadModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --silero-vad-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"Silero vad model file '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  if (threshold < 0.01) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --silero-vad-threshold. Given: %f\",\n        threshold);\n    return false;\n  }\n\n  if (threshold >= 1) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a smaller value for --silero-vad-threshold. Given: %f\",\n        threshold);\n    return false;\n  }\n\n  if (min_silence_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --silero-vad-min-silence-duration. \"\n        \"Given: \"\n        \"%f\",\n        min_silence_duration);\n    return false;\n  }\n\n  if (min_speech_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --silero-vad-min-speech-duration. \"\n        \"Given: \"\n        \"%f\",\n        min_speech_duration);\n    return false;\n  }\n\n  if (max_speech_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --silero-vad-max-speech-duration. 
\"\n        \"Given: \"\n        \"%f\",\n        max_speech_duration);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string SileroVadModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"SileroVadModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"threshold=\" << threshold << \", \";\n  os << \"min_silence_duration=\" << min_silence_duration << \", \";\n  os << \"min_speech_duration=\" << min_speech_duration << \", \";\n  os << \"max_speech_duration=\" << max_speech_duration << \", \";\n  os << \"window_size=\" << window_size << \", \";\n  os << \"neg_threshold=\" << neg_threshold << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/silero-vad-model-config.h",
    "content": "// sherpa-onnx/csrc/silero-vad-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct SileroVadModelConfig {\n  std::string model;\n\n  // threshold to classify a segment as speech\n  //\n  // If the predicted probability of a segment is larger than this\n  // value, then it is classified as speech.\n  float threshold = 0.5;\n\n  float min_silence_duration = 0.5;  // in seconds\n\n  float min_speech_duration = 0.25;  // in seconds\n\n  // 512, 1024, 1536 samples for 16000 Hz\n  int32_t window_size = 512;  // in samples\n\n  // If a speech segment is longer than this value, then we increase\n  // the threshold to 0.9. After finishing detecting the segment,\n  // the threshold value is reset to its original value.\n  float max_speech_duration = 20;  // in seconds\n\n  // Negative (exit) threshold for transitioning from speech → silence.\n  // If left as a negative value, the default Silero rule applies:\n  //     neg_threshold = max(threshold - 0.15f, 0.01f)\n  // This prevents the exit threshold from becoming negative when\n  // threshold < 0.15.\n  float neg_threshold = -1;\n\n  SileroVadModelConfig() = default;\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/silero-vad-model.cc",
    "content": "// sherpa-onnx/csrc/silero-vad-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/silero-vad-model.h\"\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nclass SileroVadModel::Impl {\n public:\n  explicit Impl(const VadModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(config.silero_vad.model);\n    Init(buf.data(), buf.size());\n\n    if (sample_rate_ != 16000) {\n      SHERPA_ONNX_LOGE(\"Expected sample rate 16000. Given: %d\",\n                       config.sample_rate);\n      exit(-1);\n    }\n\n    min_silence_samples_ =\n        sample_rate_ * config_.silero_vad.min_silence_duration;\n\n    min_speech_samples_ = sample_rate_ * config_.silero_vad.min_speech_duration;\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const VadModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(mgr, config.silero_vad.model);\n    Init(buf.data(), buf.size());\n\n    if (sample_rate_ != 16000) {\n      SHERPA_ONNX_LOGE(\"Expected sample rate 16000. 
Given: %d\",\n                       config.sample_rate);\n      exit(-1);\n    }\n\n    min_silence_samples_ =\n        sample_rate_ * config_.silero_vad.min_silence_duration;\n\n    min_speech_samples_ = sample_rate_ * config_.silero_vad.min_speech_duration;\n  }\n\n  float Run(const float *samples, int32_t n) {\n    if (is_v5_) {\n      return RunV5(samples, n);\n    } else {\n      return RunV4(samples, n);\n    }\n  }\n\n  void Reset() {\n    if (is_v5_) {\n      ResetV5();\n    } else {\n      ResetV4();\n    }\n\n    triggered_ = false;\n    current_sample_ = 0;\n    temp_start_ = 0;\n    temp_end_ = 0;\n  }\n\n  bool IsSpeech(const float *samples, int32_t n) {\n    if (n != WindowSize()) {\n      SHERPA_ONNX_LOGE(\"n: %d != window_size: %d\", n, WindowSize());\n      exit(-1);\n    }\n\n    float prob = Run(samples, n);\n\n    float threshold = config_.silero_vad.threshold;\n\n    current_sample_ += config_.silero_vad.window_size;\n\n    if (prob > threshold && temp_end_ != 0) {\n      temp_end_ = 0;\n    }\n\n    if (prob > threshold && temp_start_ == 0) {\n      // start speaking, but we require that it must satisfy\n      // min_speech_duration\n      temp_start_ = current_sample_;\n      return false;\n    }\n\n    if (prob > threshold && temp_start_ != 0 && !triggered_) {\n      if (current_sample_ - temp_start_ < min_speech_samples_) {\n        return false;\n      }\n\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && !triggered_) {\n      // silence\n      temp_start_ = 0;\n      temp_end_ = 0;\n      return false;\n    }\n\n    float neg_threshold;\n    if (config_.silero_vad.neg_threshold < 0) {\n        neg_threshold = std::max(threshold - 0.15f, 0.01f);\n    } else {\n        neg_threshold = std::max(config_.silero_vad.neg_threshold, 0.01f);\n    }\n    if ((prob > neg_threshold) && triggered_) {\n      // speaking\n      return true;\n    }\n\n    if ((prob > threshold) && !triggered_) {\n      // start 
speaking\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && triggered_) {\n      // stop to speak\n      if (temp_end_ == 0) {\n        temp_end_ = current_sample_;\n      }\n\n      if (current_sample_ - temp_end_ < min_silence_samples_) {\n        // continue speaking\n        return true;\n      }\n      // stopped speaking\n      temp_start_ = 0;\n      temp_end_ = 0;\n      triggered_ = false;\n      return false;\n    }\n\n    return false;\n  }\n\n  int32_t WindowShift() const { return config_.silero_vad.window_size; }\n\n  int32_t WindowSize() const {\n    return config_.silero_vad.window_size + window_overlap_;\n  }\n\n  int32_t MinSilenceDurationSamples() const { return min_silence_samples_; }\n\n  int32_t MinSpeechDurationSamples() const { return min_speech_samples_; }\n\n  void SetMinSilenceDuration(float s) {\n    min_silence_samples_ = sample_rate_ * s;\n  }\n\n  void SetThreshold(float threshold) {\n    config_.silero_vad.threshold = threshold;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    if ((input_names_.size() == 4 && output_names_.size() == 3) ||\n        IsExportedByK2Fsa()) {\n      is_v5_ = false;\n    } else if (input_names_.size() == 3 && output_names_.size() == 2) {\n      is_v5_ = true;\n\n      // 64 for 16kHz\n      // 32 for 8kHz\n      window_overlap_ = 64;\n\n      if (config_.silero_vad.window_size != 512) {\n        SHERPA_ONNX_LOGE(\n            \"For silero_vad  v5, we require window_size to be 512 for 16kHz\");\n        exit(-1);\n      }\n    } else {\n      SHERPA_ONNX_LOGE(\"Unsupported silero vad model\");\n      exit(-1);\n    }\n\n    Check();\n\n    Reset();\n  }\n\n  void 
ResetV5() {\n    // 2 - number of LSTM layer\n    // 1 - batch size\n    // 128 - hidden dim\n    std::array<int64_t, 3> shape{2, 1, 128};\n\n    Ort::Value s =\n        Ort::Value::CreateTensor<float>(allocator_, shape.data(), shape.size());\n\n    Fill<float>(&s, 0);\n    states_.clear();\n    states_.push_back(std::move(s));\n  }\n\n  void ResetV4() {\n    // 2 - number of LSTM layer\n    // 1 - batch size\n    // 64 - hidden dim\n    std::array<int64_t, 3> shape{2, 1, 64};\n\n    Ort::Value h =\n        Ort::Value::CreateTensor<float>(allocator_, shape.data(), shape.size());\n\n    Ort::Value c =\n        Ort::Value::CreateTensor<float>(allocator_, shape.data(), shape.size());\n\n    Fill<float>(&h, 0);\n    Fill<float>(&c, 0);\n\n    states_.clear();\n\n    states_.reserve(2);\n    states_.push_back(std::move(h));\n    states_.push_back(std::move(c));\n  }\n\n  void Check() const {\n    if (is_v5_) {\n      CheckV5();\n    } else {\n      CheckV4();\n    }\n  }\n\n  bool IsExportedByK2Fsa() const {\n    if (input_names_.size() == 3 && input_names_[0] == \"x\" &&\n        input_names_[1] == \"h\" && input_names_[2] == \"c\" &&\n        output_names_.size() == 3 && output_names_[0] == \"prob\" &&\n        output_names_[1] == \"new_h\" && output_names_[2] == \"new_c\") {\n      // this version is exported and maintained by us (k2-fsa)\n      return true;\n    }\n\n    return false;\n  }\n\n  void CheckV4() const {\n    if (IsExportedByK2Fsa()) {\n      return;\n    }\n\n    if (input_names_.size() != 4) {\n      SHERPA_ONNX_LOGE(\"Expect 4 inputs. Given: %d\",\n                       static_cast<int32_t>(input_names_.size()));\n      exit(-1);\n    }\n\n    if (input_names_[0] != \"input\") {\n      SHERPA_ONNX_LOGE(\"Input[0]: %s. Expected: input\",\n                       input_names_[0].c_str());\n      exit(-1);\n    }\n\n    if (input_names_[1] != \"sr\") {\n      SHERPA_ONNX_LOGE(\"Input[1]: %s. 
Expected: sr\", input_names_[1].c_str());\n      exit(-1);\n    }\n\n    if (input_names_[2] != \"h\") {\n      SHERPA_ONNX_LOGE(\"Input[2]: %s. Expected: h\", input_names_[2].c_str());\n      exit(-1);\n    }\n\n    if (input_names_[3] != \"c\") {\n      SHERPA_ONNX_LOGE(\"Input[3]: %s. Expected: c\", input_names_[3].c_str());\n      exit(-1);\n    }\n\n    // Now for outputs\n    if (output_names_.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Expect 3 outputs. Given: %d\",\n                       static_cast<int32_t>(output_names_.size()));\n      exit(-1);\n    }\n\n    if (output_names_[0] != \"output\") {\n      SHERPA_ONNX_LOGE(\"Output[0]: %s. Expected: output\",\n                       output_names_[0].c_str());\n      exit(-1);\n    }\n\n    if (output_names_[1] != \"hn\") {\n      SHERPA_ONNX_LOGE(\"Output[1]: %s. Expected: sr\", output_names_[1].c_str());\n      exit(-1);\n    }\n\n    if (output_names_[2] != \"cn\") {\n      SHERPA_ONNX_LOGE(\"Output[2]: %s. Expected: sr\", output_names_[2].c_str());\n      exit(-1);\n    }\n  }\n\n  void CheckV5() const {\n    if (input_names_.size() != 3) {\n      SHERPA_ONNX_LOGE(\"Expect 3 inputs. Given: %d\",\n                       static_cast<int32_t>(input_names_.size()));\n      exit(-1);\n    }\n\n    if (input_names_[0] != \"input\") {\n      SHERPA_ONNX_LOGE(\"Input[0]: %s. Expected: input\",\n                       input_names_[0].c_str());\n      exit(-1);\n    }\n\n    if (input_names_[1] != \"state\") {\n      SHERPA_ONNX_LOGE(\"Input[1]: %s. Expected: state\",\n                       input_names_[1].c_str());\n      exit(-1);\n    }\n\n    if (input_names_[2] != \"sr\") {\n      SHERPA_ONNX_LOGE(\"Input[2]: %s. Expected: sr\", input_names_[2].c_str());\n      exit(-1);\n    }\n\n    // Now for outputs\n    if (output_names_.size() != 2) {\n      SHERPA_ONNX_LOGE(\"Expect 2 outputs. 
Given: %d\",\n                       static_cast<int32_t>(output_names_.size()));\n      exit(-1);\n    }\n\n    if (output_names_[0] != \"output\") {\n      SHERPA_ONNX_LOGE(\"Output[0]: %s. Expected: output\",\n                       output_names_[0].c_str());\n      exit(-1);\n    }\n\n    if (output_names_[1] != \"stateN\") {\n      SHERPA_ONNX_LOGE(\"Output[1]: %s. Expected: stateN\",\n                       output_names_[1].c_str());\n      exit(-1);\n    }\n  }\n\n  float RunV5(const float *samples, int32_t n) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, n};\n\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, const_cast<float *>(samples), n,\n                                 x_shape.data(), x_shape.size());\n\n    int64_t sr_shape = 1;\n    Ort::Value sr =\n        Ort::Value::CreateTensor(memory_info, &sample_rate_, 1, &sr_shape, 1);\n\n    std::array<Ort::Value, 3> inputs = {std::move(x), std::move(states_[0]),\n                                        std::move(sr)};\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    states_[0] = std::move(out[1]);\n\n    float prob = out[0].GetTensorData<float>()[0];\n    return prob;\n  }\n\n  float RunV4(const float *samples, int32_t n) {\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 2> x_shape = {1, n};\n\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, const_cast<float *>(samples), n,\n                                 x_shape.data(), x_shape.size());\n\n    int64_t sr_shape = 1;\n    Ort::Value sr =\n        Ort::Value::CreateTensor(memory_info, &sample_rate_, 1, &sr_shape, 1);\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(input_names_.size());\n\n    
inputs.push_back(std::move(x));\n    if (input_names_.size() == 4) {\n      inputs.push_back(std::move(sr));\n    }\n    inputs.push_back(std::move(states_[0]));\n    inputs.push_back(std::move(states_[1]));\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    states_[0] = std::move(out[1]);\n    states_[1] = std::move(out[2]);\n\n    float prob = out[0].GetTensorData<float>()[0];\n    return prob;\n  }\n\n private:\n  VadModelConfig config_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  std::vector<Ort::Value> states_;\n  int64_t sample_rate_;\n  int32_t min_silence_samples_;\n  int32_t min_speech_samples_;\n\n  bool triggered_ = false;\n  int32_t current_sample_ = 0;\n  int32_t temp_start_ = 0;\n  int32_t temp_end_ = 0;\n\n  int32_t window_overlap_ = 0;\n\n  bool is_v5_ = false;\n};\n\nSileroVadModel::SileroVadModel(const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nSileroVadModel::SileroVadModel(Manager *mgr, const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nSileroVadModel::~SileroVadModel() = default;\n\nvoid SileroVadModel::Reset() { return impl_->Reset(); }\n\nbool SileroVadModel::IsSpeech(const float *samples, int32_t n) {\n  return impl_->IsSpeech(samples, n);\n}\n\nint32_t SileroVadModel::WindowSize() const { return impl_->WindowSize(); }\n\nint32_t SileroVadModel::WindowShift() const { return impl_->WindowShift(); }\n\nint32_t SileroVadModel::MinSilenceDurationSamples() const {\n  return impl_->MinSilenceDurationSamples();\n}\n\nint32_t 
SileroVadModel::MinSpeechDurationSamples() const {\n  return impl_->MinSpeechDurationSamples();\n}\n\nvoid SileroVadModel::SetMinSilenceDuration(float s) {\n  impl_->SetMinSilenceDuration(s);\n}\n\nvoid SileroVadModel::SetThreshold(float threshold) {\n  impl_->SetThreshold(threshold);\n}\n\nfloat SileroVadModel::Compute(const float *samples, int32_t n) {\n  return impl_->Run(samples, n);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SileroVadModel::SileroVadModel(AAssetManager *mgr,\n                                        const VadModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate SileroVadModel::SileroVadModel(NativeResourceManager *mgr,\n                                        const VadModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/silero-vad-model.h",
    "content": "// sherpa-onnx/csrc/silero-vad-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_H_\n#define SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_H_\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\nnamespace sherpa_onnx {\n\nclass SileroVadModel : public VadModel {\n public:\n  explicit SileroVadModel(const VadModelConfig &config);\n\n  template <typename Manager>\n  SileroVadModel(Manager *mgr, const VadModelConfig &config);\n\n  ~SileroVadModel() override;\n\n  // reset the internal model states\n  void Reset() override;\n\n  /**\n   * @param samples Pointer to a 1-d array containing audio samples.\n   *                Each sample should be normalized to the range [-1, 1].\n   * @param n Number of samples.\n   *\n   * @return Return true if speech is detected. Return false otherwise.\n   */\n  bool IsSpeech(const float *samples, int32_t n) override;\n\n  float Compute(const float *samples, int32_t n) override;\n\n  // For silero vad V4, it is WindowShift().\n  // For silero vad V5, it is WindowShift()+64 for 16kHz and\n  //                          WindowShift()+32 for 8kHz\n  int32_t WindowSize() const override;\n\n  // 512\n  int32_t WindowShift() const override;\n\n  int32_t MinSilenceDurationSamples() const override;\n  int32_t MinSpeechDurationSamples() const override;\n\n  void SetMinSilenceDuration(float s) override;\n  void SetThreshold(float threshold) override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SILERO_VAD_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/slice-test.cc",
    "content": "// sherpa-onnx/csrc/slice-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/slice.h\"\n\n#include <numeric>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Slice, Slice3D) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{5, 5, 4};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  std::iota(p, p + shape[0] * shape[1] * shape[2], 0);\n\n  auto v1 = Slice(allocator, &v, 2, 4, 0, 2);\n  auto v2 = Slice(allocator, &v, 1, 3, 1, 3);\n\n  Print3D(&v);\n  Print3D(&v1);\n  Print3D(&v2);\n\n  // TODO(fangjun): Check that the results are correct\n}\n\nTEST(Slice, Slice2D) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 2> shape{5, 8};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  std::iota(p, p + shape[0] * shape[1], 0);\n\n  auto v1 = Slice(allocator, &v, 1, 3);\n  auto v2 = Slice(allocator, &v, 0, 2);\n\n  Print2D(&v);\n  Print2D(&v1);\n  Print2D(&v2);\n\n  // TODO(fangjun): Check that the results are correct\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/slice.cc",
    "content": "// sherpa-onnx/csrc/slice.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/slice.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <vector>\n\nnamespace sherpa_onnx {\n\ntemplate <typename T /*=float*/>\nOrt::Value Slice(OrtAllocator *allocator, const Ort::Value *v,\n                 int32_t dim0_start, int32_t dim0_end, int32_t dim1_start,\n                 int32_t dim1_end) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape.size() == 3);\n\n  assert(0 <= dim0_start);\n  assert(dim0_start < dim0_end);\n  assert(dim0_end <= shape[0]);\n\n  assert(0 <= dim1_start);\n  assert(dim1_start < dim1_end);\n  assert(dim1_end <= shape[1]);\n\n  std::array<int64_t, 3> ans_shape{dim0_end - dim0_start, dim1_end - dim1_start,\n                                   shape[2]};\n\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n  T *dst = ans.GetTensorMutableData<T>();\n  for (int32_t i = dim0_start; i != dim0_end; ++i) {\n    const T *src = v->GetTensorData<T>() + i * shape[1] * shape[2];\n    const T *start = src + dim1_start * shape[2];\n    const T *end = src + dim1_end * shape[2];\n\n    std::copy(start, end, dst);\n    dst += ans_shape[1] * ans_shape[2];\n  }\n\n  return ans;\n}\n\ntemplate <typename T /*= float*/>\nOrt::Value Slice(OrtAllocator *allocator, const Ort::Value *v,\n                 int32_t dim0_start, int32_t dim0_end) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape.size() == 2);\n\n  assert(0 <= dim0_start);\n  assert(dim0_start < dim0_end);\n  assert(dim0_end <= shape[0]);\n\n  const T *src = v->GetTensorData<T>();\n\n  std::array<int64_t, 2> ans_shape{dim0_end - dim0_start, shape[1]};\n\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               
ans_shape.size());\n  const T *start = v->GetTensorData<T>() + dim0_start * shape[1];\n  const T *end = v->GetTensorData<T>() + dim0_end * shape[1];\n  T *dst = ans.GetTensorMutableData<T>();\n  std::copy(start, end, dst);\n\n  return ans;\n}\n\ntemplate Ort::Value Slice<float>(OrtAllocator *allocator, const Ort::Value *v,\n                                 int32_t dim0_start, int32_t dim0_end,\n                                 int32_t dim1_start, int32_t dim1_end);\n\ntemplate Ort::Value Slice<float>(OrtAllocator *allocator, const Ort::Value *v,\n                                 int32_t dim0_start, int32_t dim0_end);\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/slice.h",
    "content": "// sherpa-onnx/csrc/slice.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SLICE_H_\n#define SHERPA_ONNX_CSRC_SLICE_H_\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/** Get a deep copy by slicing a 3-D tensor v.\n *\n * It returns v[dim0_start:dim0_end, dim1_start:dim1_end, :]\n *\n * @param allocator\n * @param v A 3-D tensor. Its data type is T.\n * @param dim0_start  Start index of the first dimension..\n * @param dim0_end    End index of the first dimension..\n * @param dim1_start Start index of the second dimension.\n * @param dim1_end  End index of the second dimension.\n *\n * @return Return a 3-D tensor of shape\n *         (dim0_end-dim0_start, dim1_end-dim1_start, v.shape[2])\n */\ntemplate <typename T = float>\nOrt::Value Slice(OrtAllocator *allocator, const Ort::Value *v,\n                 int32_t dim0_start, int32_t dim0_end, int32_t dim1_start,\n                 int32_t dim1_end);\n\n/** Get a deep copy by slicing a 2-D tensor v.\n *\n * It returns v[dim0_start:dim0_end, :]\n *\n * @param allocator\n * @param v A 2-D tensor. Its data type is T.\n * @param dim0_start  Start index of the first dimension..\n * @param dim0_end    End index of the first dimension..\n *\n * @return Return a 2-D tensor of shape\n *         (dim0_end-dim0_start, v.shape[1])\n */\ntemplate <typename T = float>\nOrt::Value Slice(OrtAllocator *allocator, const Ort::Value *v,\n                 int32_t dim0_start, int32_t dim0_end);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SLICE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-general-impl.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-general-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_GENERAL_IMPL_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_GENERAL_IMPL_H_\n#include <algorithm>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-impl.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-model.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorGeneralImpl\n    : public SpeakerEmbeddingExtractorImpl {\n public:\n  explicit SpeakerEmbeddingExtractorGeneralImpl(\n      const SpeakerEmbeddingExtractorConfig &config)\n      : model_(config) {}\n\n  template <typename Manager>\n  SpeakerEmbeddingExtractorGeneralImpl(\n      Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n      : model_(mgr, config) {}\n\n  int32_t Dim() const override { return model_.GetMetaData().output_dim; }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    FeatureExtractorConfig feat_config;\n    const auto &meta_data = model_.GetMetaData();\n    feat_config.sampling_rate = meta_data.sample_rate;\n    feat_config.normalize_samples = meta_data.normalize_samples;\n\n    return std::make_unique<OnlineStream>(feat_config);\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() < s->NumFramesReady();\n  }\n\n  std::vector<float> Compute(OnlineStream *s) const override {\n    int32_t num_frames = s->NumFramesReady() - s->GetNumProcessedFrames();\n    if (num_frames <= 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Please make sure IsReady(s) returns true. num_frames: %{public}d\",\n          num_frames);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Please make sure IsReady(s) returns true. 
num_frames: %d\",\n          num_frames);\n#endif\n      return {};\n    }\n\n    std::vector<float> features =\n        s->GetFrames(s->GetNumProcessedFrames(), num_frames);\n\n    s->GetNumProcessedFrames() += num_frames;\n\n    int32_t feat_dim = features.size() / num_frames;\n\n    const auto &meta_data = model_.GetMetaData();\n    if (!meta_data.feature_normalize_type.empty()) {\n      if (meta_data.feature_normalize_type == \"global-mean\") {\n        SubtractGlobalMean(features.data(), num_frames, feat_dim);\n      } else {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Unsupported feature_normalize_type: %{public}s\",\n                         meta_data.feature_normalize_type.c_str());\n#else\n        SHERPA_ONNX_LOGE(\"Unsupported feature_normalize_type: %s\",\n                         meta_data.feature_normalize_type.c_str());\n#endif\n        exit(-1);\n      }\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{1, num_frames, feat_dim};\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, features.data(), features.size(),\n                                 x_shape.data(), x_shape.size());\n    Ort::Value embedding = model_.Compute(std::move(x));\n    std::vector<int64_t> embedding_shape =\n        embedding.GetTensorTypeAndShapeInfo().GetShape();\n\n    std::vector<float> ans(embedding_shape[1]);\n    std::copy(embedding.GetTensorData<float>(),\n              embedding.GetTensorData<float>() + ans.size(), ans.begin());\n\n    return ans;\n  }\n\n private:\n  void SubtractGlobalMean(float *p, int32_t num_frames,\n                          int32_t feat_dim) const {\n    auto m = Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>(\n        p, num_frames, feat_dim);\n\n    m = m.rowwise() - m.colwise().mean();\n  }\n\n private:\n  SpeakerEmbeddingExtractorModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  
// SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_GENERAL_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-impl.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-impl.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-general-impl.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-nemo-impl.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nenum class ModelType : std::uint8_t {\n  kWeSpeaker,\n  k3dSpeaker,\n  kNeMo,\n  kUnknown,\n};\n\n}  // namespace\n\nstatic ModelType GetModelType(char *model_data, size_t model_data_length,\n                              bool debug) {\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  auto sess = std::make_unique<Ort::Session>(env, model_data, model_data_length,\n                                             sess_opts);\n\n  Ort::ModelMetadata meta_data = sess->GetModelMetadata();\n  if (debug) {\n    std::ostringstream os;\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"framework\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\"\n        \"Please make sure you have added metadata to the model.\\n\\n\"\n        \"For instance, you can use\\n\"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/wespeaker/\"\n        \"add_meta_data.py\"\n       
 \"to add metadata to models from WeSpeaker\\n\");\n    return ModelType::kUnknown;\n  }\n\n  if (model_type == \"wespeaker\") {\n    return ModelType::kWeSpeaker;\n  } else if (model_type == \"3d-speaker\") {\n    return ModelType::k3dSpeaker;\n  } else if (model_type == \"nemo\") {\n    return ModelType::kNeMo;\n  } else {\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %{public}s\", model_type.c_str());\n#else\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %s\", model_type.c_str());\n#endif\n    return ModelType::kUnknown;\n  }\n}\n\nstd::unique_ptr<SpeakerEmbeddingExtractorImpl>\nSpeakerEmbeddingExtractorImpl::Create(\n    const SpeakerEmbeddingExtractorConfig &config) {\n  ModelType model_type = ModelType::kUnknown;\n\n  {\n    auto buffer = ReadFile(config.model);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kWeSpeaker:\n      // fall through\n    case ModelType::k3dSpeaker:\n      return std::make_unique<SpeakerEmbeddingExtractorGeneralImpl>(config);\n    case ModelType::kNeMo:\n      return std::make_unique<SpeakerEmbeddingExtractorNeMoImpl>(config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type for speaker embedding extractor!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<SpeakerEmbeddingExtractorImpl>\nSpeakerEmbeddingExtractorImpl::Create(\n    Manager *mgr, const SpeakerEmbeddingExtractorConfig &config) {\n  ModelType model_type = ModelType::kUnknown;\n\n  {\n    auto buffer = ReadFile(mgr, config.model);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kWeSpeaker:\n      // fall through\n    case ModelType::k3dSpeaker:\n      return std::make_unique<SpeakerEmbeddingExtractorGeneralImpl>(mgr,\n                                                                    
config);\n    case ModelType::kNeMo:\n      return std::make_unique<SpeakerEmbeddingExtractorNeMoImpl>(mgr, config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\n          \"Unknown model type for speaker embedding extractor!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<SpeakerEmbeddingExtractorImpl>\nSpeakerEmbeddingExtractorImpl::Create(\n    AAssetManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<SpeakerEmbeddingExtractorImpl>\nSpeakerEmbeddingExtractorImpl::Create(\n    NativeResourceManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-impl.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_IMPL_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_IMPL_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorImpl {\n public:\n  virtual ~SpeakerEmbeddingExtractorImpl() = default;\n\n  static std::unique_ptr<SpeakerEmbeddingExtractorImpl> Create(\n      const SpeakerEmbeddingExtractorConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<SpeakerEmbeddingExtractorImpl> Create(\n      Manager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n\n  virtual int32_t Dim() const = 0;\n\n  virtual std::unique_ptr<OnlineStream> CreateStream() const = 0;\n\n  virtual bool IsReady(OnlineStream *s) const = 0;\n\n  virtual std::vector<float> Compute(OnlineStream *s) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-model-meta-data.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\nstruct SpeakerEmbeddingExtractorModelMetaData {\n  int32_t output_dim = 0;\n  int32_t sample_rate = 0;\n\n  // for wespeaker models, it is 0;\n  // for 3d-speaker models, it is 1\n  int32_t normalize_samples = 1;\n\n  // Chinese, English, etc.\n  std::string language;\n\n  // for 3d-speaker, it is global-mean\n  std::string feature_normalize_type;\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-model.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorModel::Impl {\n public:\n  explicit Impl(const SpeakerEmbeddingExtractorConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  Ort::Value Compute(Ort::Value x) const {\n    std::array<Ort::Value, 1> inputs = {std::move(x)};\n\n    auto outputs =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n    return std::move(outputs[0]);\n  }\n\n  const SpeakerEmbeddingExtractorModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n  
                                         sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.output_dim, \"output_dim\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.normalize_samples,\n                               \"normalize_samples\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_data_.language, \"language\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(\n        meta_data_.feature_normalize_type, \"feature_normalize_type\", \"\");\n\n    std::string framework;\n    SHERPA_ONNX_READ_META_DATA_STR(framework, \"framework\");\n    if (framework != \"wespeaker\" && framework != \"3d-speaker\") {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Expect a wespeaker or a 3d-speaker model, given: %{public}s\",\n          framework.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Expect a wespeaker or a 3d-speaker model, given: %s\",\n                       framework.c_str());\n#endif\n      exit(-1);\n    }\n  }\n\n private:\n  SpeakerEmbeddingExtractorConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  SpeakerEmbeddingExtractorModelMetaData 
meta_data_;\n};\n\nSpeakerEmbeddingExtractorModel::SpeakerEmbeddingExtractorModel(\n    const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nSpeakerEmbeddingExtractorModel::SpeakerEmbeddingExtractorModel(\n    Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nSpeakerEmbeddingExtractorModel::~SpeakerEmbeddingExtractorModel() = default;\n\nconst SpeakerEmbeddingExtractorModelMetaData &\nSpeakerEmbeddingExtractorModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\nOrt::Value SpeakerEmbeddingExtractorModel::Compute(Ort::Value x) const {\n  return impl_->Compute(std::move(x));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SpeakerEmbeddingExtractorModel::SpeakerEmbeddingExtractorModel(\n    AAssetManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n#if __OHOS__\ntemplate SpeakerEmbeddingExtractorModel::SpeakerEmbeddingExtractorModel(\n    NativeResourceManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-model.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_H_\n\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorModel {\n public:\n  explicit SpeakerEmbeddingExtractorModel(\n      const SpeakerEmbeddingExtractorConfig &config);\n\n  template <typename Manager>\n  SpeakerEmbeddingExtractorModel(Manager *mgr,\n                                 const SpeakerEmbeddingExtractorConfig &config);\n\n  ~SpeakerEmbeddingExtractorModel();\n\n  const SpeakerEmbeddingExtractorModelMetaData &GetMetaData() const;\n\n  /**\n   * @param x A float32 tensor of shape (N, T, C)\n   * @return A float32 tensor of shape (N, C)\n   */\n  Ort::Value Compute(Ort::Value x) const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-nemo-impl.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-nemo-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_IMPL_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_IMPL_H_\n#include <algorithm>\n#include <memory>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-impl.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorNeMoImpl : public SpeakerEmbeddingExtractorImpl {\n public:\n  explicit SpeakerEmbeddingExtractorNeMoImpl(\n      const SpeakerEmbeddingExtractorConfig &config)\n      : model_(config) {}\n\n  template <typename Manager>\n  SpeakerEmbeddingExtractorNeMoImpl(\n      Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n      : model_(mgr, config) {}\n\n  int32_t Dim() const override { return model_.GetMetaData().output_dim; }\n\n  std::unique_ptr<OnlineStream> CreateStream() const override {\n    FeatureExtractorConfig feat_config;\n    const auto &meta_data = model_.GetMetaData();\n    feat_config.sampling_rate = meta_data.sample_rate;\n    feat_config.feature_dim = meta_data.feat_dim;\n    feat_config.normalize_samples = true;\n    feat_config.snip_edges = true;\n    feat_config.frame_shift_ms = meta_data.window_stride_ms;\n    feat_config.frame_length_ms = meta_data.window_size_ms;\n    feat_config.low_freq = 0;\n    feat_config.is_librosa = true;\n    feat_config.remove_dc_offset = false;\n    feat_config.window_type = meta_data.window_type;\n\n    return std::make_unique<OnlineStream>(feat_config);\n  }\n\n  bool IsReady(OnlineStream *s) const override {\n    return s->GetNumProcessedFrames() < s->NumFramesReady();\n  }\n\n  std::vector<float> Compute(OnlineStream *s) const override {\n    int32_t num_frames = s->NumFramesReady() - 
s->GetNumProcessedFrames();\n    if (num_frames <= 0) {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\n          \"Please make sure IsReady(s) returns true. num_frames: %{public}d\",\n          num_frames);\n#else\n      SHERPA_ONNX_LOGE(\n          \"Please make sure IsReady(s) returns true. num_frames: %d\",\n          num_frames);\n#endif\n      return {};\n    }\n\n    std::vector<float> features =\n        s->GetFrames(s->GetNumProcessedFrames(), num_frames);\n\n    s->GetNumProcessedFrames() += num_frames;\n\n    int32_t feat_dim = features.size() / num_frames;\n\n    const auto &meta_data = model_.GetMetaData();\n    if (!meta_data.feature_normalize_type.empty()) {\n      if (meta_data.feature_normalize_type == \"per_feature\") {\n        NormalizePerFeature(features.data(), num_frames, feat_dim);\n      } else {\n#if __OHOS__\n        SHERPA_ONNX_LOGE(\"Unsupported feature_normalize_type: %{public}s\",\n                         meta_data.feature_normalize_type.c_str());\n#else\n\n        SHERPA_ONNX_LOGE(\"Unsupported feature_normalize_type: %s\",\n                         meta_data.feature_normalize_type.c_str());\n#endif\n        exit(-1);\n      }\n    }\n\n    if (num_frames % 16 != 0) {\n      int32_t pad = 16 - num_frames % 16;\n      features.resize((num_frames + pad) * feat_dim);\n    }\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape{1, num_frames, feat_dim};\n    Ort::Value x =\n        Ort::Value::CreateTensor(memory_info, features.data(), features.size(),\n                                 x_shape.data(), x_shape.size());\n\n    x = Transpose12(model_.Allocator(), &x);\n\n    int64_t x_lens = num_frames;\n    std::array<int64_t, 1> x_lens_shape{1};\n    Ort::Value x_lens_tensor = Ort::Value::CreateTensor(\n        memory_info, &x_lens, 1, x_lens_shape.data(), x_lens_shape.size());\n\n    Ort::Value embedding =\n        model_.Compute(std::move(x), 
std::move(x_lens_tensor));\n    std::vector<int64_t> embedding_shape =\n        embedding.GetTensorTypeAndShapeInfo().GetShape();\n\n    std::vector<float> ans(embedding_shape[1]);\n    std::copy(embedding.GetTensorData<float>(),\n              embedding.GetTensorData<float>() + ans.size(), ans.begin());\n\n    return ans;\n  }\n\n private:\n  void NormalizePerFeature(float *p, int32_t num_frames,\n                           int32_t feat_dim) const {\n    auto m = Eigen::Map<\n        Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>>(\n        p, num_frames, feat_dim);\n\n    auto EX = m.colwise().mean();\n    auto EX2 = m.array().pow(2).colwise().sum() / num_frames;\n    auto variance = (EX2 - EX.array().pow(2)).max(1e-5f);\n\n    auto stddev = variance.array().sqrt();\n\n    m = (m.rowwise() - EX).array().rowwise() / (stddev.array() + 1e-5f);\n  }\n\n private:\n  SpeakerEmbeddingExtractorNeMoModel model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model-meta-data.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model-meta-data.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_META_DATA_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_META_DATA_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\nstruct SpeakerEmbeddingExtractorNeMoModelMetaData {\n  int32_t output_dim = 0;\n  int32_t feat_dim = 80;\n  int32_t sample_rate = 0;\n  int32_t window_size_ms = 25;\n  int32_t window_stride_ms = 25;\n\n  // Chinese, English, etc.\n  std::string language;\n\n  // for NeMo models, the only supported value is per_feature\n  std::string feature_normalize_type;\n  std::string window_type = \"hann\";\n};\n\n}  // namespace sherpa_onnx\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_META_DATA_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model-meta-data.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorNeMoModel::Impl {\n public:\n  explicit Impl(const SpeakerEmbeddingExtractorConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(config.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{} {\n    {\n      auto buf = ReadFile(mgr, config.model);\n      Init(buf.data(), buf.size());\n    }\n  }\n\n  Ort::Value Compute(Ort::Value x, Ort::Value x_lens) const {\n    std::array<Ort::Value, 2> inputs = {std::move(x), std::move(x_lens)};\n\n    // output_names_ptr_[0] is logits\n    // output_names_ptr_[1] is embeddings\n    // so we use output_names_ptr_.data() + 1 here to extract only the\n    // embeddings\n    auto outputs = sess_->Run({}, input_names_ptr_.data(), inputs.data(),\n                              inputs.size(), output_names_ptr_.data() + 1, 1);\n    return std::move(outputs[0]);\n  }\n\n  OrtAllocator *Allocator() { return 
allocator_; }\n\n  const SpeakerEmbeddingExtractorNeMoModelMetaData &GetMetaData() const {\n    return meta_data_;\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA(meta_data_.output_dim, \"output_dim\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.feat_dim, \"feat_dim\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.sample_rate, \"sample_rate\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.window_size_ms, \"window_size_ms\");\n    SHERPA_ONNX_READ_META_DATA(meta_data_.window_stride_ms, \"window_stride_ms\");\n    SHERPA_ONNX_READ_META_DATA_STR(meta_data_.language, \"language\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(\n        meta_data_.feature_normalize_type, \"feature_normalize_type\", \"\");\n\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_data_.window_type,\n                                                \"window_type\", \"povey\");\n\n    std::string framework;\n    SHERPA_ONNX_READ_META_DATA_STR(framework, \"framework\");\n    if (framework != \"nemo\") {\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"Expect a NeMo model, given: %{public}s\",\n                       framework.c_str());\n#else\n      SHERPA_ONNX_LOGE(\"Expect a NeMo model, given: %s\", framework.c_str());\n#endif\n      exit(-1);\n    }\n  }\n\n 
private:\n  SpeakerEmbeddingExtractorConfig config_;\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  SpeakerEmbeddingExtractorNeMoModelMetaData meta_data_;\n};\n\nSpeakerEmbeddingExtractorNeMoModel::SpeakerEmbeddingExtractorNeMoModel(\n    const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nSpeakerEmbeddingExtractorNeMoModel::SpeakerEmbeddingExtractorNeMoModel(\n    Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nSpeakerEmbeddingExtractorNeMoModel::~SpeakerEmbeddingExtractorNeMoModel() =\n    default;\n\nconst SpeakerEmbeddingExtractorNeMoModelMetaData &\nSpeakerEmbeddingExtractorNeMoModel::GetMetaData() const {\n  return impl_->GetMetaData();\n}\n\nOrt::Value SpeakerEmbeddingExtractorNeMoModel::Compute(\n    Ort::Value x, Ort::Value x_lens) const {\n  return impl_->Compute(std::move(x), std::move(x_lens));\n}\n\nOrtAllocator *SpeakerEmbeddingExtractorNeMoModel::Allocator() const {\n  return impl_->Allocator();\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SpeakerEmbeddingExtractorNeMoModel::SpeakerEmbeddingExtractorNeMoModel(\n    AAssetManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n#if __OHOS__\ntemplate SpeakerEmbeddingExtractorNeMoModel::SpeakerEmbeddingExtractorNeMoModel(\n    NativeResourceManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_H_\n\n#include <memory>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-nemo-model-meta-data.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingExtractorNeMoModel {\n public:\n  explicit SpeakerEmbeddingExtractorNeMoModel(\n      const SpeakerEmbeddingExtractorConfig &config);\n\n  template <typename Manager>\n  SpeakerEmbeddingExtractorNeMoModel(\n      Manager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n\n  ~SpeakerEmbeddingExtractorNeMoModel();\n\n  const SpeakerEmbeddingExtractorNeMoModelMetaData &GetMetaData() const;\n\n  /**\n   * @param x A float32 tensor of shape (N, C, T)\n   * @param x_len An int64 tensor of shape (N,)\n   * @return A float32 tensor of shape (N, C)\n   */\n  Ort::Value Compute(Ort::Value x, Ort::Value x_len) const;\n\n  OrtAllocator *Allocator() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_NEMO_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid SpeakerEmbeddingExtractorConfig::Register(ParseOptions *po) {\n  po->Register(\"model\", &model, \"Path to the speaker embedding model.\");\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool SpeakerEmbeddingExtractorConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide a speaker embedding extractor model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"speaker embedding extractor model: '%s' does not exist\",\n                     model.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string SpeakerEmbeddingExtractorConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"SpeakerEmbeddingExtractorConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? 
\"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\nSpeakerEmbeddingExtractor::SpeakerEmbeddingExtractor(\n    const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(SpeakerEmbeddingExtractorImpl::Create(config)) {}\n\ntemplate <typename Manager>\nSpeakerEmbeddingExtractor::SpeakerEmbeddingExtractor(\n    Manager *mgr, const SpeakerEmbeddingExtractorConfig &config)\n    : impl_(SpeakerEmbeddingExtractorImpl::Create(mgr, config)) {}\n\nSpeakerEmbeddingExtractor::~SpeakerEmbeddingExtractor() = default;\n\nint32_t SpeakerEmbeddingExtractor::Dim() const { return impl_->Dim(); }\n\nstd::unique_ptr<OnlineStream> SpeakerEmbeddingExtractor::CreateStream() const {\n  return impl_->CreateStream();\n}\n\nbool SpeakerEmbeddingExtractor::IsReady(OnlineStream *s) const {\n  return impl_->IsReady(s);\n}\n\nstd::vector<float> SpeakerEmbeddingExtractor::Compute(OnlineStream *s) const {\n  return impl_->Compute(s);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SpeakerEmbeddingExtractor::SpeakerEmbeddingExtractor(\n    AAssetManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n#if __OHOS__\ntemplate SpeakerEmbeddingExtractor::SpeakerEmbeddingExtractor(\n    NativeResourceManager *mgr, const SpeakerEmbeddingExtractorConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-extractor.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-extractor.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct SpeakerEmbeddingExtractorConfig {\n  std::string model;\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  SpeakerEmbeddingExtractorConfig() = default;\n  SpeakerEmbeddingExtractorConfig(const std::string &model, int32_t num_threads,\n                                  bool debug, const std::string &provider)\n      : model(model),\n        num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n  std::string ToString() const;\n};\n\nclass SpeakerEmbeddingExtractorImpl;\n\nclass SpeakerEmbeddingExtractor {\n public:\n  explicit SpeakerEmbeddingExtractor(\n      const SpeakerEmbeddingExtractorConfig &config);\n\n  template <typename Manager>\n  SpeakerEmbeddingExtractor(Manager *mgr,\n                            const SpeakerEmbeddingExtractorConfig &config);\n\n  ~SpeakerEmbeddingExtractor();\n\n  // Return the dimension of the embedding\n  int32_t Dim() const;\n\n  // Create a stream to accept audio samples and compute features\n  std::unique_ptr<OnlineStream> CreateStream() const;\n\n  // Return true if there are feature frames in OnlineStream that\n  // can be used to compute embeddings.\n  bool IsReady(OnlineStream *s) const;\n\n  // Compute the speaker embedding from the available unprocessed features\n  // of the given stream\n  //\n  // You have to ensure IsReady(s) returns true before you call this method.\n  std::vector<float> Compute(OnlineStream *s) const;\n\n private:\n  
std::unique_ptr<SpeakerEmbeddingExtractorImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-manager-test.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-manager-test.cc\n//\n// Copyright (c) 2024 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\nTEST(SpeakerEmbeddingManager, AddAndRemove) {\n  int32_t dim = 2;\n  SpeakerEmbeddingManager manager(dim);\n  std::vector<float> v = {0.1, 0.1};\n  bool status = manager.Add(\"first\", v.data());\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 1);\n\n  // duplicate\n  status = manager.Add(\"first\", v.data());\n  ASSERT_FALSE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 1);\n\n  // non-duplicate\n  v = {0.1, 0.9};\n  status = manager.Add(\"second\", v.data());\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 2);\n\n  // do not exist\n  status = manager.Remove(\"third\");\n  ASSERT_FALSE(status);\n\n  status = manager.Remove(\"first\");\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 1);\n\n  v = {0.1, 0.1};\n  status = manager.Add(\"first\", v.data());\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 2);\n\n  status = manager.Remove(\"first\");\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 1);\n\n  status = manager.Remove(\"second\");\n  ASSERT_TRUE(status);\n  ASSERT_EQ(manager.NumSpeakers(), 0);\n}\n\nTEST(SpeakerEmbeddingManager, Search) {\n  int32_t dim = 2;\n  SpeakerEmbeddingManager manager(dim);\n  std::vector<float> v1 = {0.1, 0.1};\n  std::vector<float> v2 = {0.1, 0.9};\n  std::vector<float> v3 = {0.9, 0.1};\n  bool status = manager.Add(\"first\", v1.data());\n  ASSERT_TRUE(status);\n\n  status = manager.Add(\"second\", v2.data());\n  ASSERT_TRUE(status);\n\n  status = manager.Add(\"third\", v3.data());\n  ASSERT_TRUE(status);\n\n  ASSERT_EQ(manager.NumSpeakers(), 3);\n\n  std::vector<float> v = {15, 16};\n  float threshold = 0.9;\n\n  std::string name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, 
\"first\");\n\n  v = {2, 17};\n  name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, \"second\");\n\n  v = {17, 2};\n  name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, \"third\");\n\n  threshold = 0.9;\n  v = {15, 16};\n  status = manager.Remove(\"first\");\n  ASSERT_TRUE(status);\n  name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, \"\");\n\n  v = {17, 2};\n  status = manager.Remove(\"third\");\n  ASSERT_TRUE(status);\n  name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, \"\");\n\n  v = {2, 17};\n  status = manager.Remove(\"second\");\n  ASSERT_TRUE(status);\n  name = manager.Search(v.data(), threshold);\n  EXPECT_EQ(name, \"\");\n\n  ASSERT_EQ(manager.NumSpeakers(), 0);\n}\n\nTEST(SpeakerEmbeddingManager, Verify) {\n  int32_t dim = 2;\n  SpeakerEmbeddingManager manager(dim);\n  std::vector<float> v1 = {0.1, 0.1};\n  std::vector<float> v2 = {0.1, 0.9};\n  std::vector<float> v3 = {0.9, 0.1};\n  bool status = manager.Add(\"first\", v1.data());\n  ASSERT_TRUE(status);\n\n  status = manager.Add(\"second\", v2.data());\n  ASSERT_TRUE(status);\n\n  status = manager.Add(\"third\", v3.data());\n  ASSERT_TRUE(status);\n\n  std::vector<float> v = {15, 16};\n  float threshold = 0.9;\n\n  status = manager.Verify(\"first\", v.data(), threshold);\n  ASSERT_TRUE(status);\n\n  v = {2, 17};\n  status = manager.Verify(\"first\", v.data(), threshold);\n  ASSERT_FALSE(status);\n\n  status = manager.Verify(\"second\", v.data(), threshold);\n  ASSERT_TRUE(status);\n\n  v = {17, 2};\n  status = manager.Verify(\"first\", v.data(), threshold);\n  ASSERT_FALSE(status);\n\n  status = manager.Verify(\"second\", v.data(), threshold);\n  ASSERT_FALSE(status);\n\n  status = manager.Verify(\"third\", v.data(), threshold);\n  ASSERT_TRUE(status);\n\n  status = manager.Verify(\"fourth\", v.data(), threshold);\n  ASSERT_FALSE(status);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-manager.cc",
    "content": "// sherpa-onnx/csrc/speaker-embedding-manager.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n\n#include <algorithm>\n#include <string>\n#include <unordered_map>\n#include <utility>\n#include <vector>\n\n#include \"Eigen/Dense\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nusing FloatMatrix = Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic,\n                                  Eigen::RowMajor>;  // NOLINT\n\nclass SpeakerEmbeddingManager::Impl {\n public:\n  explicit Impl(int32_t dim) : dim_(dim) {}\n\n  bool Add(const std::string &name, const float *p) {\n    if (name2row_.count(name)) {\n      // a speaker with the same name already exists\n      return false;\n    }\n\n    embedding_matrix_.conservativeResize(embedding_matrix_.rows() + 1, dim_);\n\n    std::copy(p, p + dim_, &embedding_matrix_.bottomRows(1)(0, 0));\n\n    embedding_matrix_.bottomRows(1).normalize();  // inplace\n\n    name2row_[name] = embedding_matrix_.rows() - 1;\n    row2name_[embedding_matrix_.rows() - 1] = name;\n\n    return true;\n  }\n\n  bool Add(const std::string &name,\n           const std::vector<std::vector<float>> &embedding_list) {\n    if (name2row_.count(name)) {\n      // a speaker with the same name already exists\n      return false;\n    }\n\n    if (embedding_list.empty()) {\n      SHERPA_ONNX_LOGE(\"Empty list of embeddings\");\n      return false;\n    }\n\n    for (const auto &x : embedding_list) {\n      if (static_cast<int32_t>(x.size()) != dim_) {\n        SHERPA_ONNX_LOGE(\"Given dim: %d, expected dim: %d\",\n                         static_cast<int32_t>(x.size()), dim_);\n        return false;\n      }\n    }\n\n    // compute the average\n    Eigen::RowVectorXf v = Eigen::Map<Eigen::RowVectorXf>(\n        const_cast<float *>(embedding_list[0].data()), dim_);\n    int32_t i = -1;\n    for (const auto &x : embedding_list) {\n      ++i;\n      if (i == 0) {\n   
     continue;\n      }\n      v += Eigen::Map<Eigen::RowVectorXf>(const_cast<float *>(x.data()), dim_);\n    }\n\n    // no need to compute the mean since we are going to normalize it anyway\n    // v /= embedding_list.size();\n\n    v.normalize();\n\n    embedding_matrix_.conservativeResize(embedding_matrix_.rows() + 1, dim_);\n    embedding_matrix_.bottomRows(1) = v;\n\n    name2row_[name] = embedding_matrix_.rows() - 1;\n    row2name_[embedding_matrix_.rows() - 1] = name;\n\n    return true;\n  }\n\n  bool Remove(const std::string &name) {\n    if (!name2row_.count(name)) {\n      return false;\n    }\n\n    int32_t row_idx = name2row_.at(name);\n\n    int32_t num_rows = embedding_matrix_.rows();\n\n    if (row_idx < num_rows - 1) {\n      embedding_matrix_.block(row_idx, 0, num_rows - 1 - row_idx, dim_) =\n          embedding_matrix_.bottomRows(num_rows - 1 - row_idx);\n    }\n\n    embedding_matrix_.conservativeResize(num_rows - 1, dim_);\n    for (auto &p : name2row_) {\n      if (p.second > row_idx) {\n        p.second -= 1;\n        row2name_[p.second] = p.first;\n      }\n    }\n\n    name2row_.erase(name);\n    row2name_.erase(num_rows - 1);\n\n    return true;\n  }\n\n  std::string Search(const float *p, float threshold) {\n    if (embedding_matrix_.rows() == 0) {\n      return {};\n    }\n\n    Eigen::VectorXf v =\n        Eigen::Map<Eigen::VectorXf>(const_cast<float *>(p), dim_);\n    v.normalize();\n\n    Eigen::VectorXf scores = embedding_matrix_ * v;\n\n    Eigen::VectorXf::Index max_index = 0;\n    float max_score = scores.maxCoeff(&max_index);\n    if (max_score < threshold) {\n      return {};\n    }\n\n    return row2name_.at(max_index);\n  }\n\n  std::vector<SpeakerMatch> GetBestMatches(const float *p, float threshold,\n                                           int32_t n) {\n    std::vector<SpeakerMatch> matches;\n\n    if (embedding_matrix_.rows() == 0) {\n      return matches;\n    }\n\n    Eigen::VectorXf v =\n        
Eigen::Map<Eigen::VectorXf>(const_cast<float *>(p), dim_);\n    v.normalize();\n\n    Eigen::VectorXf scores = embedding_matrix_ * v;\n\n    std::vector<std::pair<float, int>> score_indices;\n    for (int i = 0; i < scores.size(); ++i) {\n      if (scores[i] >= threshold) {\n        score_indices.emplace_back(scores[i], i);\n      }\n    }\n\n    std::sort(score_indices.rbegin(), score_indices.rend(),\n              [](const auto &a, const auto &b) { return a.first < b.first; });\n\n    matches.reserve(score_indices.size());\n    for (int i = 0; i < std::min(n, static_cast<int32_t>(score_indices.size()));\n         ++i) {\n      const auto &pair = score_indices[i];\n      matches.push_back({row2name_.at(pair.second), pair.first});\n    }\n\n    return matches;\n  }\n\n  bool Verify(const std::string &name, const float *p, float threshold) {\n    if (!name2row_.count(name)) {\n      return false;\n    }\n\n    int32_t row_idx = name2row_.at(name);\n\n    Eigen::VectorXf v =\n        Eigen::Map<Eigen::VectorXf>(const_cast<float *>(p), dim_);\n    v.normalize();\n\n    float score = embedding_matrix_.row(row_idx) * v;\n\n    if (score < threshold) {\n      return false;\n    }\n\n    return true;\n  }\n\n  float Score(const std::string &name, const float *p) {\n    if (!name2row_.count(name)) {\n      // Setting a default value if the name is not found\n      return -2.0;\n    }\n\n    int32_t row_idx = name2row_.at(name);\n\n    Eigen::VectorXf v =\n        Eigen::Map<Eigen::VectorXf>(const_cast<float *>(p), dim_);\n    v.normalize();\n\n    float score = embedding_matrix_.row(row_idx) * v;\n\n    return score;\n  }\n\n  bool Contains(const std::string &name) const {\n    return name2row_.count(name) > 0;\n  }\n\n  int32_t NumSpeakers() const { return embedding_matrix_.rows(); }\n\n  int32_t Dim() const { return dim_; }\n\n  std::vector<std::string> GetAllSpeakers() const {\n    std::vector<std::string> all_speakers;\n    all_speakers.reserve(name2row_.size());\n    
for (const auto &p : name2row_) {\n      all_speakers.push_back(p.first);\n    }\n\n    std::sort(all_speakers.begin(), all_speakers.end());\n    return all_speakers;\n  }\n\n private:\n  int32_t dim_;\n  FloatMatrix embedding_matrix_;\n  std::unordered_map<std::string, int32_t> name2row_;\n  std::unordered_map<int32_t, std::string> row2name_;\n};\n\nSpeakerEmbeddingManager::SpeakerEmbeddingManager(int32_t dim)\n    : impl_(std::make_unique<Impl>(dim)) {}\n\nSpeakerEmbeddingManager::~SpeakerEmbeddingManager() = default;\n\nbool SpeakerEmbeddingManager::Add(const std::string &name,\n                                  const float *p) const {\n  return impl_->Add(name, p);\n}\n\nbool SpeakerEmbeddingManager::Add(\n    const std::string &name,\n    const std::vector<std::vector<float>> &embedding_list) const {\n  return impl_->Add(name, embedding_list);\n}\n\nbool SpeakerEmbeddingManager::Remove(const std::string &name) const {\n  return impl_->Remove(name);\n}\n\nstd::string SpeakerEmbeddingManager::Search(const float *p,\n                                            float threshold) const {\n  return impl_->Search(p, threshold);\n}\n\nstd::vector<SpeakerMatch> SpeakerEmbeddingManager::GetBestMatches(\n    const float *p, float threshold, int32_t n) const {\n  return impl_->GetBestMatches(p, threshold, n);\n}\n\nbool SpeakerEmbeddingManager::Verify(const std::string &name, const float *p,\n                                     float threshold) const {\n  return impl_->Verify(name, p, threshold);\n}\n\nfloat SpeakerEmbeddingManager::Score(const std::string &name,\n                                     const float *p) const {\n  return impl_->Score(name, p);\n}\n\nint32_t SpeakerEmbeddingManager::NumSpeakers() const {\n  return impl_->NumSpeakers();\n}\n\nint32_t SpeakerEmbeddingManager::Dim() const { return impl_->Dim(); }\n\nbool SpeakerEmbeddingManager::Contains(const std::string &name) const {\n  return impl_->Contains(name);\n}\n\nstd::vector<std::string> 
SpeakerEmbeddingManager::GetAllSpeakers() const {\n  return impl_->GetAllSpeakers();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/speaker-embedding-manager.h",
    "content": "// sherpa-onnx/csrc/speaker-embedding-manager.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n#define SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n\n#include <cstdint>\n#include <memory>\n#include <string>\n#include <vector>\n\nstruct SpeakerMatch {\n  const std::string name;\n  float score;\n};\n\nnamespace sherpa_onnx {\n\nclass SpeakerEmbeddingManager {\n public:\n  // @param dim Embedding dimension.\n  explicit SpeakerEmbeddingManager(int32_t dim);\n  ~SpeakerEmbeddingManager();\n\n  /* Add the embedding and name of a speaker to the manager.\n   *\n   * @param name Name of the speaker\n   * @param p Pointer to the embedding. Its length is `dim`.\n   * @return Return true if added successfully. Return false if it failed.\n   *         At present, the only reason for a failure is that there is already\n   *         a speaker with the same `name`.\n   */\n  bool Add(const std::string &name, const float *p) const;\n\n  /** Add a list of embeddings of a speaker.\n   *\n   * @param name Name of the speaker\n   * @param embedding_list A list of embeddings. Each entry should be of size\n   *                       `dim`. The normalized average of the list is the\n   *                       final embedding.\n   * @return Return true if added successfully. Return false if a speaker\n   *         with the same `name` already exists, if `embedding_list` is\n   *         empty, or if any entry is not of size `dim`.\n   */\n  bool Add(const std::string &name,\n           const std::vector<std::vector<float>> &embedding_list) const;\n\n  /* Remove a speaker by its name.\n   *\n   * @param name Name of the speaker to remove.\n   * @return Return true if it is removed successfully. 
Return false\n   *         if there is no such speaker.\n   */\n  bool Remove(const std::string &name) const;\n\n  /** It is for speaker identification.\n   *\n   * It computes the cosine similarity between the given embedding and all\n   * stored embeddings and finds the one with the largest score, provided\n   * that score is above or equal to the threshold. It returns the speaker\n   * name for the embedding if found; otherwise, it returns an empty string.\n   *\n   * @param p The input embedding.\n   * @param threshold A value between 0 and 1.\n   * @return The name of the speaker if found; otherwise, an empty string.\n   */\n  std::string Search(const float *p, float threshold) const;\n\n  /**\n   * It is for speaker identification.\n   *\n   * It computes the cosine similarity between a given embedding and all\n   * other embeddings and finds the embeddings that have the largest scores\n   * and the scores are above or equal to the threshold. Returns a vector of\n   * SpeakerMatch structures containing the speaker names and scores for the\n   * embeddings if found; otherwise, returns an empty vector.\n   *\n   * @param p A pointer to the input embedding.\n   * @param threshold A value between 0 and 1.\n   * @param n The number of top matches to return.\n   * @return A vector of SpeakerMatch structures. If matches are found, the\n   *         vector contains the names and scores of the speakers. Otherwise,\n   *         it returns an empty vector.\n   */\n  std::vector<SpeakerMatch> GetBestMatches(const float *p, float threshold,\n                                           int32_t n) const;\n\n  /* Check whether the input embedding matches the embedding of the input\n   * speaker.\n   *\n   * It is for speaker verification.\n   *\n   * @param name The target speaker name.\n   * @param p The input embedding to check.\n   * @param threshold A value between 0 and 1.\n   * @return Return true if it matches. 
Otherwise, it returns false.\n   */\n  bool Verify(const std::string &name, const float *p, float threshold) const;\n\n  float Score(const std::string &name, const float *p) const;\n\n  // Return true if the given speaker already exists; return false otherwise.\n  bool Contains(const std::string &name) const;\n\n  int32_t NumSpeakers() const;\n\n  int32_t Dim() const;\n\n  // Return a list of speaker names\n  std::vector<std::string> GetAllSpeakers() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/spoken-language-identification-impl.cc",
    "content": "// sherpa-onnx/csrc/spoken-language-identification-impl.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/spoken-language-identification-impl.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification-whisper-impl.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nenum class ModelType : std::uint8_t {\n  kWhisper,\n  kUnknown,\n};\n\n}\n\nstatic ModelType GetModelType(char *model_data, size_t model_data_length,\n                              bool debug) {\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n  Ort::SessionOptions sess_opts;\n\n  auto sess = std::make_unique<Ort::Session>(env, model_data, model_data_length,\n                                             sess_opts);\n\n  Ort::ModelMetadata meta_data = sess->GetModelMetadata();\n  if (debug) {\n    std::ostringstream os;\n    PrintModelMetadata(os, meta_data);\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\"\n        \"Please make sure you have added metadata to the model.\\n\\n\"\n        \"For instance, you can use\\n\"\n        \"https://github.com/k2-fsa/sherpa-onnx/blob/master/scripts/whisper/\"\n        \"export-onnx.py \"\n        \"to add metadata to models from whisper\\n\");\n    return ModelType::kUnknown;\n  }\n\n  if (model_type.find(\"whisper\") == 0) {\n    return ModelType::kWhisper;\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %s\", model_type.c_str());\n    return ModelType::kUnknown;\n  
}\n}\n\nstd::unique_ptr<SpokenLanguageIdentificationImpl>\nSpokenLanguageIdentificationImpl::Create(\n    const SpokenLanguageIdentificationConfig &config) {\n  ModelType model_type = ModelType::kUnknown;\n  {\n    if (config.whisper.encoder.empty()) {\n      SHERPA_ONNX_LOGE(\"Only whisper models are supported at present\");\n      exit(-1);\n    }\n    auto buffer = ReadFile(config.whisper.encoder);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kWhisper:\n      return std::make_unique<SpokenLanguageIdentificationWhisperImpl>(config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\n          \"Unknown model type for spoken language identification!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\nstd::unique_ptr<SpokenLanguageIdentificationImpl>\nSpokenLanguageIdentificationImpl::Create(\n    AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config) {\n  ModelType model_type = ModelType::kUnknown;\n  {\n    if (config.whisper.encoder.empty()) {\n      SHERPA_ONNX_LOGE(\"Only whisper models are supported at present\");\n      exit(-1);\n    }\n    auto buffer = ReadFile(mgr, config.whisper.encoder);\n\n    model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n  }\n\n  switch (model_type) {\n    case ModelType::kWhisper:\n      return std::make_unique<SpokenLanguageIdentificationWhisperImpl>(mgr,\n                                                                       config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\n          \"Unknown model type for spoken language identification!\");\n      return nullptr;\n  }\n\n  // unreachable code\n  return nullptr;\n}\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/spoken-language-identification-impl.h",
    "content": "// sherpa-onnx/csrc/spoken-language-identification-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_IMPL_H_\n#define SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_IMPL_H_\n\n#include <memory>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n\nnamespace sherpa_onnx {\n\nclass SpokenLanguageIdentificationImpl {\n public:\n  virtual ~SpokenLanguageIdentificationImpl() = default;\n\n  static std::unique_ptr<SpokenLanguageIdentificationImpl> Create(\n      const SpokenLanguageIdentificationConfig &config);\n\n#if __ANDROID_API__ >= 9\n  static std::unique_ptr<SpokenLanguageIdentificationImpl> Create(\n      AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config);\n#endif\n\n  virtual std::unique_ptr<OfflineStream> CreateStream() const = 0;\n\n  virtual std::string Compute(OfflineStream *s) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/spoken-language-identification-whisper-impl.h",
    "content": "// sherpa-onnx/csrc/spoken-language-identification-whisper-impl.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_WHISPER_IMPL_H_\n#define SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_WHISPER_IMPL_H_\n\n#include <algorithm>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-whisper-model.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification-impl.h\"\n#include \"sherpa-onnx/csrc/transpose.h\"\n\nnamespace sherpa_onnx {\n\nclass SpokenLanguageIdentificationWhisperImpl\n    : public SpokenLanguageIdentificationImpl {\n public:\n  explicit SpokenLanguageIdentificationWhisperImpl(\n      const SpokenLanguageIdentificationConfig &config)\n      : config_(config), model_(std::make_unique<OfflineWhisperModel>(config)) {\n    Check();\n  }\n\n#if __ANDROID_API__ >= 9\n  SpokenLanguageIdentificationWhisperImpl(\n      AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config)\n      : config_(config),\n        model_(std::make_unique<OfflineWhisperModel>(mgr, config)) {\n    Check();\n  }\n#endif\n\n  std::unique_ptr<OfflineStream> CreateStream() const override {\n    return std::make_unique<OfflineStream>(WhisperTag{});\n  }\n\n  std::string Compute(OfflineStream *s) const override {\n    int32_t max_num_frames = 3000;\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    int32_t feat_dim = s->FeatureDim();\n    std::vector<float> f = s->GetFrames();\n    int32_t num_frames = f.size() / feat_dim;\n\n    // we use 50 here so that there will be some zero tail paddings\n    if (num_frames >= max_num_frames - 50) {\n      SHERPA_ONNX_LOGE(\n          \"Only waves less than 30 seconds are supported. 
We process only the \"\n          \"first 30 seconds and discard the remaining data\");\n      num_frames = max_num_frames - 50;\n    }\n\n    model_->NormalizeFeatures(f.data(), num_frames, feat_dim);\n\n    // Note that 1000 is an empirical value.\n    // You can replace 1000 with other values, say, 100.\n    //\n    // Since we have removed the 30-second constraint, we need\n    // tail_padding_frames so that whisper is able to detect the eot token.\n    int32_t tail_padding_frames = 1000;\n\n    if (config_.whisper.tail_paddings > 0) {\n      tail_padding_frames = config_.whisper.tail_paddings;\n    }\n\n    int32_t actual_frames =\n        std::min(num_frames + tail_padding_frames, max_num_frames);\n\n    std::array<int64_t, 3> shape{1, actual_frames, feat_dim};\n\n    Ort::Value mel = Ort::Value::CreateTensor<float>(\n        model_->Allocator(), shape.data(), shape.size());\n\n    float *p_mel = mel.GetTensorMutableData<float>();\n    std::copy(f.data(), f.data() + num_frames * feat_dim, p_mel);\n\n    std::fill_n(p_mel + num_frames * feat_dim,\n                (actual_frames - num_frames) * feat_dim, 0);\n\n    mel = Transpose12(model_->Allocator(), &mel);\n\n    try {\n      auto cross_kv = model_->ForwardEncoder(std::move(mel));\n      int32_t lang_id = model_->DetectLanguage(cross_kv.first, cross_kv.second);\n      const auto &id2lang = model_->GetID2Lang();\n      if (id2lang.count(lang_id)) {\n        return id2lang.at(lang_id);\n      } else {\n        SHERPA_ONNX_LOGE(\"Unknown language ID: %d. Return an empty string.\",\n                         lang_id);\n        return \"\";\n      }\n    } catch (const Ort::Exception &ex) {\n      SHERPA_ONNX_LOGE(\n          \"\\n\\nCaught exception:\\n\\n%s\\n\\nReturn an empty result. Number of \"\n          \"input frames: %d, Current tail \"\n          \"paddings: %d. 
If you see a lot of such exceptions, please consider \"\n          \"using a larger --whisper-tail-paddings\",\n          ex.what(), num_frames, tail_padding_frames);\n      return \"\";\n    }\n  }\n\n private:\n  void Check() const {\n    if (!model_->IsMultiLingual()) {\n      SHERPA_ONNX_LOGE(\n          \"Only whisper multilingual models can be used for spoken language \"\n          \"identification. Given: %s,%s\",\n          config_.whisper.encoder.c_str(), config_.whisper.decoder.c_str());\n      exit(-1);\n    }\n  }\n\n private:\n  SpokenLanguageIdentificationConfig config_;\n  std::unique_ptr<OfflineWhisperModel> model_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_WHISPER_IMPL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/spoken-language-identification.cc",
    "content": "// sherpa-onnx/csrc/spoken-language-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n\n#include <memory>\n#include <sstream>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/spoken-language-identification-impl.h\"\n\nnamespace sherpa_onnx {\n\nvoid SpokenLanguageIdentificationWhisperConfig::Register(ParseOptions *po) {\n  po->Register(\n      \"whisper-encoder\", &encoder,\n      \"Path to the encoder of a whisper multilingual model. Support only \"\n      \"tiny, base, small, medium, large.\");\n\n  po->Register(\n      \"whisper-decoder\", &decoder,\n      \"Path to the decoder of a whisper multilingual model. Support only \"\n      \"tiny, base, small, medium, large.\");\n\n  po->Register(\n      \"whisper-tail-paddings\", &tail_paddings,\n      \"Suggested value: 300 for multilingual models. \"\n      \"Since we have removed the 30-second constraint, we need to add some \"\n      \"tail padding frames \"\n      \"so that whisper can detect the eot token. 
Leave it to -1 to use 1000\");\n}\n\nbool SpokenLanguageIdentificationWhisperConfig::Validate() const {\n  if (encoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --whisper-encoder\");\n    return false;\n  }\n\n  if (!FileExists(encoder)) {\n    SHERPA_ONNX_LOGE(\"whisper encoder file '%s' does not exist\",\n                     encoder.c_str());\n    return false;\n  }\n\n  if (decoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --whisper-decoder\");\n    return false;\n  }\n\n  if (!FileExists(decoder)) {\n    SHERPA_ONNX_LOGE(\"whisper decoder file '%s' does not exist\",\n                     decoder.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nstd::string SpokenLanguageIdentificationWhisperConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"SpokenLanguageIdentificationWhisperConfig(\";\n  os << \"encoder=\\\"\" << encoder << \"\\\", \";\n  os << \"decoder=\\\"\" << decoder << \"\\\", \";\n  os << \"tail_paddings=\" << tail_paddings << \")\";\n\n  return os.str();\n}\n\nvoid SpokenLanguageIdentificationConfig::Register(ParseOptions *po) {\n  whisper.Register(po);\n\n  po->Register(\"num-threads\", &num_threads,\n               \"Number of threads to run the neural network\");\n\n  po->Register(\"debug\", &debug,\n               \"true to print model information while loading it.\");\n\n  po->Register(\"provider\", &provider,\n               \"Specify a provider to use: cpu, cuda, coreml\");\n}\n\nbool SpokenLanguageIdentificationConfig::Validate() const {\n  if (!whisper.Validate()) {\n    return false;\n  }\n\n  return true;\n}\n\nstd::string SpokenLanguageIdentificationConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"SpokenLanguageIdentificationConfig(\";\n  os << \"whisper=\" << whisper.ToString() << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"debug=\" << (debug ? 
\"True\" : \"False\") << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\")\";\n\n  return os.str();\n}\n\nSpokenLanguageIdentification::SpokenLanguageIdentification(\n    const SpokenLanguageIdentificationConfig &config)\n    : impl_(SpokenLanguageIdentificationImpl::Create(config)) {}\n\n#if __ANDROID_API__ >= 9\nSpokenLanguageIdentification::SpokenLanguageIdentification(\n    AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config)\n    : impl_(SpokenLanguageIdentificationImpl::Create(mgr, config)) {}\n#endif\n\nSpokenLanguageIdentification::~SpokenLanguageIdentification() = default;\n\nstd::unique_ptr<OfflineStream> SpokenLanguageIdentification::CreateStream()\n    const {\n  return impl_->CreateStream();\n}\n\nstd::string SpokenLanguageIdentification::Compute(OfflineStream *s) const {\n  return impl_->Compute(s);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/spoken-language-identification.h",
    "content": "// sherpa-onnx/csrc/spoken-language-identification.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n#define SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n\n#include <memory>\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct SpokenLanguageIdentificationWhisperConfig {\n  // Requires a multi-lingual whisper model.\n  // That is, it supports only tiny, base, small, medium, large.\n  // Note: It does NOT support tiny.en, base.en, small.en, medium.en\n  std::string encoder;\n  std::string decoder;\n\n  // Number of tail padding frames.\n  //\n  // Since we remove the 30-second constraint, we need to add some paddings\n  // at the end.\n  //\n  // Recommended values:\n  //   - 50 for English models\n  //   - 300 for multilingual models\n  int32_t tail_paddings = -1;\n\n  SpokenLanguageIdentificationWhisperConfig() = default;\n\n  SpokenLanguageIdentificationWhisperConfig(const std::string &encoder,\n                                            const std::string &decoder,\n                                            int32_t tail_paddings)\n      : encoder(encoder), decoder(decoder), tail_paddings(tail_paddings) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n  std::string ToString() const;\n};\n\nstruct SpokenLanguageIdentificationConfig {\n  SpokenLanguageIdentificationWhisperConfig whisper;\n\n  int32_t num_threads = 1;\n  bool debug = false;\n  std::string provider = \"cpu\";\n\n  SpokenLanguageIdentificationConfig() = default;\n\n  SpokenLanguageIdentificationConfig(\n      const SpokenLanguageIdentificationWhisperConfig &whisper,\n      int32_t num_threads, bool debug, const std::string &provider)\n      : whisper(whisper),\n        
num_threads(num_threads),\n        debug(debug),\n        provider(provider) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n  std::string ToString() const;\n};\n\nclass SpokenLanguageIdentificationImpl;\n\nclass SpokenLanguageIdentification {\n public:\n  explicit SpokenLanguageIdentification(\n      const SpokenLanguageIdentificationConfig &config);\n\n#if __ANDROID_API__ >= 9\n  SpokenLanguageIdentification(\n      AAssetManager *mgr, const SpokenLanguageIdentificationConfig &config);\n#endif\n\n  ~SpokenLanguageIdentification();\n\n  // Create a stream to accept audio samples and compute features\n  std::unique_ptr<OfflineStream> CreateStream() const;\n\n  // Return a string containing the language, e.g., en, zh, de,\n  // etc.\n  // Note: en is for English, zh is for Chinese, de is for German, etc.\n  std::string Compute(OfflineStream *s) const;\n\n private:\n  std::unique_ptr<SpokenLanguageIdentificationImpl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/stack-test.cc",
    "content": "// sherpa-onnx/csrc/stack-test.cc\n//\n// Copyright (c) 2023 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#include \"sherpa-onnx/csrc/stack.h\"\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Stack, Test1DTensors) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 1> a_shape{3};\n  std::array<int64_t, 1> b_shape{3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, {&a, &b}, 0);\n\n  Print1D(&a);\n  Print1D(&b);\n  Print2D(&ans);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0]]);\n  }\n}\n\nTEST(Stack, Test2DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 2> a_shape{2, 3};\n  std::array<int64_t, 2> b_shape{2, 3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * 
a_shape[1]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, {&a, &b}, 0);\n\n  Print2D(&a);\n  Print2D(&b);\n  Print3D(&ans);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0] * a_shape[1]]);\n  }\n}\n\nTEST(Stack, Test2DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 2> a_shape{4, 3};\n  std::array<int64_t, 2> b_shape{4, 3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0; i != static_cast<int32_t>(b_shape[0] * b_shape[1]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, {&a, &b}, 1);\n\n  Print2D(&a);\n  Print2D(&b);\n  Print3D(&ans);\n\n  const float *pans = ans.GetTensorData<float>();\n\n  for (int32_t r = 0; r != static_cast<int32_t>(a_shape[0]); ++r) {\n    for (int32_t i = 0; i != static_cast<int32_t>(a_shape[1]);\n         ++i, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t i = 0; i != static_cast<int32_t>(b_shape[1]);\n         ++i, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n}\n\nTEST(Stack, Test3DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 3, 2};\n  std::array<int64_t, 3> 
b_shape{2, 3, 2};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, {&a, &b}, 0);\n\n  const float *pans = ans.GetTensorData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    EXPECT_EQ(pa[i], pans[i]);\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    EXPECT_EQ(pb[i], pans[i + a_shape[0] * a_shape[1] * a_shape[2]]);\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print4D(&ans);\n}\n\nTEST(Stack, Test3DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 2, 3};\n  std::array<int64_t, 3> b_shape{2, 2, 3};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, 
{&a, &b}, 1);\n\n  const float *pans = ans.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0]); ++i) {\n    for (int32_t k = 0; k != static_cast<int32_t>(a_shape[1] * a_shape[2]);\n         ++k, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t k = 0; k != static_cast<int32_t>(b_shape[1] * b_shape[2]);\n         ++k, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print4D(&ans);\n}\n\nTEST(Stack, Test3DTensorsDim2) {\n  Ort::AllocatorWithDefaultOptions allocator;\n\n  std::array<int64_t, 3> a_shape{2, 3, 4};\n  std::array<int64_t, 3> b_shape{2, 3, 4};\n\n  Ort::Value a = Ort::Value::CreateTensor<float>(allocator, a_shape.data(),\n                                                 a_shape.size());\n\n  Ort::Value b = Ort::Value::CreateTensor<float>(allocator, b_shape.data(),\n                                                 b_shape.size());\n\n  float *pa = a.GetTensorMutableData<float>();\n  float *pb = b.GetTensorMutableData<float>();\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(a_shape[0] * a_shape[1] * a_shape[2]); ++i) {\n    pa[i] = i;\n  }\n  for (int32_t i = 0;\n       i != static_cast<int32_t>(b_shape[0] * b_shape[1] * b_shape[2]); ++i) {\n    pb[i] = i + 10;\n  }\n\n  Ort::Value ans = Stack(allocator, {&a, &b}, 2);\n\n  const float *pans = ans.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a_shape[0] * a_shape[1]); ++i) {\n    for (int32_t k = 0; k != static_cast<int32_t>(a_shape[2]);\n         ++k, ++pa, ++pans) {\n      EXPECT_EQ(*pa, *pans);\n    }\n\n    for (int32_t k = 0; k != static_cast<int32_t>(b_shape[2]);\n         ++k, ++pb, ++pans) {\n      EXPECT_EQ(*pb, *pans);\n    }\n  }\n\n  Print3D(&a);\n  Print3D(&b);\n  Print4D(&ans);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/stack.cc",
    "content": "// sherpa-onnx/csrc/stack.cc\n//\n// Copyright (c) 2023 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#include \"sherpa-onnx/csrc/stack.h\"\n\n#include <algorithm>\n#include <functional>\n#include <numeric>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic bool Compare(const std::vector<int64_t> &a,\n                    const std::vector<int64_t> &b) {\n  if (a.size() != b.size()) return false;\n\n  for (int32_t i = 0; i != static_cast<int32_t>(a.size()); ++i) {\n    if (a[i] != b[i]) return false;\n  }\n\n  return true;\n}\n\nstatic void PrintShape(const std::vector<int64_t> &a) {\n  for (auto i : a) {\n    SHERPA_ONNX_LOGE(\"%d \", static_cast<int32_t>(i));\n  }\n  SHERPA_ONNX_LOGE(\"\\n\");\n}\n\ntemplate <typename T /*=float*/>\nOrt::Value Stack(OrtAllocator *allocator,\n                 const std::vector<const Ort::Value *> &values, int32_t dim) {\n  std::vector<int64_t> v0_shape =\n      values[0]->GetTensorTypeAndShapeInfo().GetShape();\n\n  for (int32_t i = 1; i != static_cast<int32_t>(values.size()); ++i) {\n    auto s = values[i]->GetTensorTypeAndShapeInfo().GetShape();\n    bool ret = Compare(v0_shape, s);\n    if (!ret) {\n      SHERPA_ONNX_LOGE(\"Incorrect shape in Stack !\\n\");\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor 0: \");\n      PrintShape(v0_shape);\n\n      SHERPA_ONNX_LOGE(\"Shape for tensor %d: \", i);\n      PrintShape(s);\n\n      exit(-1);\n    }\n  }\n\n  std::vector<int64_t> ans_shape;\n  ans_shape.reserve(v0_shape.size() + 1);\n  ans_shape.insert(ans_shape.end(), v0_shape.data(), v0_shape.data() + dim);\n  ans_shape.push_back(values.size());\n  ans_shape.insert(ans_shape.end(), v0_shape.data() + dim,\n                   v0_shape.data() + v0_shape.size());\n\n  auto leading_size = static_cast<int32_t>(std::accumulate(\n      v0_shape.begin(), v0_shape.begin() + dim, 1, std::multiplies<int64_t>()));\n\n  auto 
trailing_size = static_cast<int32_t>(std::accumulate(\n      v0_shape.begin() + dim, v0_shape.end(), 1, std::multiplies<int64_t>()));\n\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n  T *dst = ans.GetTensorMutableData<T>();\n\n  for (int32_t i = 0; i != leading_size; ++i) {\n    for (auto value : values) {\n      const T *src = value->GetTensorData<T>();\n      src += i * trailing_size;\n\n      std::copy(src, src + trailing_size, dst);\n      dst += trailing_size;\n    }\n  }\n\n  return ans;\n}\n\ntemplate Ort::Value Stack<float>(OrtAllocator *allocator,\n                                 const std::vector<const Ort::Value *> &values,\n                                 int32_t dim);\n\ntemplate Ort::Value Stack<int64_t>(\n    OrtAllocator *allocator, const std::vector<const Ort::Value *> &values,\n    int32_t dim);\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/stack.h",
    "content": "// sherpa-onnx/csrc/stack.h\n//\n// Copyright (c) 2023 Jingzhao Ou (jingzhao.ou@gmail.com)\n\n#ifndef SHERPA_ONNX_CSRC_STACK_H_\n#define SHERPA_ONNX_CSRC_STACK_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/** Stack a list of tensors along a new dim.\n *\n * @param allocator Allocator to allocate space for the returned tensor\n * @param values  Pointers to the tensors to stack. All tensors must have\n *                exactly the same shape.\n * @param dim  The position of the new dim in the shape of the returned\n *             tensor. The size of the new dim is values.size().\n *\n * @return Return the stacked tensor. It has one more dim than the inputs.\n */\ntemplate <typename T = float>\nOrt::Value Stack(OrtAllocator *allocator,\n                 const std::vector<const Ort::Value *> &values, int32_t dim);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_STACK_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/symbol-table.cc",
    "content": "// sherpa-onnx/csrc/symbol-table.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cctype>\n#include <fstream>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <utility>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/base64-decode.h\"\n#include \"sherpa-onnx/csrc/bbpe.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/lexicon.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n// copied from\n// https://stackoverflow.com/questions/216823/how-to-trim-a-stdstring\nconst char *ws = \" \\t\\n\\r\\f\\v\";\n\n// trim from end of string (right)\ninline void TrimRight(std::string *s, const char *t = ws) {\n  s->erase(s->find_last_not_of(t) + 1);\n}\n\n// trim from beginning of string (left)\ninline void TrimLeft(std::string *s, const char *t = ws) {\n  s->erase(0, s->find_first_not_of(t));\n}\n\n// trim from both ends of string (right then left)\ninline void Trim(std::string *s, const char *t = ws) {\n  TrimRight(s, t);\n  TrimLeft(s, t);\n}\n\nbool IsByteBPE(const char *s, int32_t n) {\n  const uint8_t *p = reinterpret_cast<const uint8_t *>(s);\n  if (n >= 3 && p[0] == 0xe2 && p[1] == 0x96 && p[2] == 0x81) {\n    return IsByteBPE(s + 3, n - 3);\n  }\n\n  for (int32_t i = 0; i != n; ++i) {\n    if (p[i] > 0xc6) {\n      return false;\n    }\n  }\n\n  return true;\n}\n\nbool IsByteBPE(const std::unordered_map<std::string, int32_t> &sym2id) {\n  uint8_t max_v = 0;\n  for (const auto &p : sym2id) {\n    const auto &s = p.first;\n    if (!IsByteBPE(s.c_str(), s.size())) {\n      return false;\n    }\n\n    uint8_t m = 0;\n    if (s.size() >= 3) {\n      const uint8_t *p = reinterpret_cast<const uint8_t 
*>(s.c_str());\n\n      if (p[0] == 0xe2 && p[1] == 0x96 && p[2] == 0x81) {\n        if (s.size() > 3) {\n          m = *std::max_element(\n              reinterpret_cast<const uint8_t *>(s.data()) + 3,\n              reinterpret_cast<const uint8_t *>(s.data()) + s.size());\n        } else {\n          m = 0;\n        }\n      } else {\n        m = *std::max_element(\n            reinterpret_cast<const uint8_t *>(s.data()),\n            reinterpret_cast<const uint8_t *>(s.data()) + s.size());\n      }\n    } else {\n      m = *std::max_element(\n          reinterpret_cast<const uint8_t *>(s.data()),\n          reinterpret_cast<const uint8_t *>(s.data()) + s.size());\n    }\n\n    max_v = (m > max_v) ? m : max_v;\n  }\n\n  return static_cast<uint8_t>(max_v) == 0xc6;\n}\n\n}  // namespace\n\nstd::unordered_map<std::string, int32_t> ReadTokens(\n    std::istream &is,\n    std::unordered_map<int32_t, std::string> *id2token /*= nullptr*/) {\n  std::unordered_map<std::string, int32_t> token2id;\n\n  std::string line;\n\n  std::string sym;\n  int32_t id = -1;\n  while (std::getline(is, line)) {\n    Trim(&line);\n    std::istringstream iss(line);\n    iss >> sym;\n    if (iss.eof()) {\n      id = atoi(sym.c_str());\n      sym = \" \";\n    } else {\n      iss >> id;\n    }\n\n    // eat the trailing \\r\\n on windows\n    iss >> std::ws;\n    if (!iss.eof()) {\n      SHERPA_ONNX_LOGE(\"Error: %s\", line.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n#if 0\n    if (token2id.count(sym)) {\n      SHERPA_ONNX_LOGE(\"Duplicated token %s. Line %s. 
Existing ID: %d\",\n                       sym.c_str(), line.c_str(), token2id.at(sym));\n      SHERPA_ONNX_EXIT(-1);\n    }\n#endif\n    if (id2token) {\n      id2token->insert({id, sym});\n    }\n\n    token2id.insert({std::move(sym), id});\n  }\n\n  return token2id;\n}\n\nSymbolTable::SymbolTable(const std::string &filename, bool is_file) {\n  if (is_file) {\n    std::ifstream is(filename);\n    Init(is);\n  } else {\n    std::istringstream iss(filename);\n    Init(iss);\n  }\n}\n\ntemplate <typename Manager>\nSymbolTable::SymbolTable(Manager *mgr, const std::string &filename) {\n  auto buf = ReadFile(mgr, filename);\n\n  std::istringstream is(std::string(buf.data(), buf.size()));\n  Init(is);\n}\n\nvoid SymbolTable::Init(std::istream &is) {\n  sym2id_ = ReadTokens(is, &id2sym_);\n  is_bbpe_ = IsByteBPE(sym2id_);\n\n  if (sym2id_.count(\"<0x00>\") && sym2id_.count(\"<0xFF>\") &&\n      ((sym2id_.at(\"<0xFF>\") - sym2id_.at(\"<0x00>\")) == 255)) {\n    is_bpe_with_byte_fallback_ = true;\n    id_for_0x00_ = sym2id_.at(\"<0x00>\");\n  }\n}\n\nstd::string SymbolTable::ToString() const {\n  std::ostringstream os;\n  char sep = ' ';\n  for (const auto &p : sym2id_) {\n    os << p.first << sep << p.second << \"\\n\";\n  }\n  return os.str();\n}\n\nconst std::string SymbolTable::operator[](int32_t id) const {\n  std::string sym = id2sym_.at(id);\n  if (sym.size() >= 3 && !is_bbpe_) {\n    // For BPE-based models, we replace ▁ with a space\n    // Unicode 9601, hex 0x2581, utf8 0xe29681\n    const uint8_t *p = reinterpret_cast<const uint8_t *>(sym.c_str());\n    if (p[0] == 0xe2 && p[1] == 0x96 && p[2] == 0x81) {\n      sym = sym.replace(0, 3, \" \");\n    }\n  }\n\n  // for BPE with byte_fallback\n  // id 0 is blank, id 1 is sos/eos, id 2 is unk\n  //\n  // Note: For moonshine models, 0 is <unk>, 1, is <s>, 2 is</s>\n  if (is_bpe_with_byte_fallback_ && sym.size() == 6 && sym[0] == '<' &&\n      sym[1] == '0' && sym[2] == 'x' && sym[5] == '>') {\n    std::ostringstream 
os;\n    os << std::hex << std::uppercase << (id - id_for_0x00_);\n\n    if (std::string(sym.data() + 3, sym.data() + 5) == os.str()) {\n      uint8_t i = id - id_for_0x00_;\n      sym = std::string(&i, &i + 1);\n    }\n  }\n  return sym;\n}\n\nint32_t SymbolTable::operator[](const std::string &sym) const {\n  return sym2id_.at(sym);\n}\n\nbool SymbolTable::Contains(int32_t id) const { return id2sym_.count(id) != 0; }\n\nbool SymbolTable::Contains(const std::string &sym) const {\n  return sym2id_.count(sym) != 0;\n}\n\nstd::ostream &operator<<(std::ostream &os, const SymbolTable &symbol_table) {\n  return os << symbol_table.ToString();\n}\n\nvoid SymbolTable::ApplyBase64Decode() {\n  sym2id_.clear();\n  for (auto &p : id2sym_) {\n    if (p.second == \" \") {\n      // for FunASR nano models, there is an empty string in the tokens.txt,\n      // which is converted to \" \" while reading it in sherpa-onnx. We convert\n      // it back to \"\" here\n      p.second = \"\";\n    } else {\n      p.second = Base64Decode(p.second);\n    }\n    sym2id_[p.second] = p.first;\n  }\n}\n\nstd::string SymbolTable::DecodeByteBpe(const std::string &text) const {\n  if (!is_bbpe_) {\n    return text;\n  }\n  auto v = SplitUtf8(text);\n\n  const auto &bbpe_table = GetByteBpeTable();\n  std::string ans;\n  for (const auto &s : v) {\n    if (s == \"▁\") {\n      if (!ans.empty() && ans.back() != ' ' && std::isprint(ans.back())) {\n        ans.push_back(' ');\n      }\n    } else if (bbpe_table.count(s)) {\n      ans.push_back(bbpe_table.at(s));\n    } else if (std::isprint(s[0])) {\n      ans.append(s);\n    } else {\n      // Should not happen\n      SHERPA_ONNX_LOGE(\"Skip OOV: %s from %s\", s.c_str(), text.c_str());\n    }\n  }\n\n  // TODO(fangjun): Filter invalid utf-8 sequences\n  return ans;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate SymbolTable::SymbolTable(AAssetManager *mgr,\n                                  const std::string &filename);\n#endif\n\n#if __OHOS__\ntemplate 
SymbolTable::SymbolTable(NativeResourceManager *mgr,\n                                  const std::string &filename);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/symbol-table.h",
    "content": "// sherpa-onnx/csrc/symbol-table.h\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_SYMBOL_TABLE_H_\n#define SHERPA_ONNX_CSRC_SYMBOL_TABLE_H_\n\n#include <istream>\n#include <string>\n#include <unordered_map>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n// The same token can be mapped to different integer IDs, so\n// we need an id2token argument here.\nstd::unordered_map<std::string, int32_t> ReadTokens(\n    std::istream &is,\n    std::unordered_map<int32_t, std::string> *id2token = nullptr);\n\nstd::vector<int32_t> ConvertTokensToIds(\n    const std::unordered_map<std::string, int32_t> &token2id,\n    const std::vector<std::string> &tokens);\n\n/// It manages mapping between symbols and integer IDs.\nclass SymbolTable {\n public:\n  SymbolTable() = default;\n  /// Construct a symbol table from a file or from a buffered string.\n  /// Each line in the file contains two fields:\n  ///\n  ///    sym ID\n  ///\n  /// Fields are separated by space(s).\n  explicit SymbolTable(const std::string &filename, bool is_file = true);\n\n  template <typename Manager>\n  SymbolTable(Manager *mgr, const std::string &filename);\n\n  /// Return a string representation of this symbol table\n  std::string ToString() const;\n\n  /// Return the symbol corresponding to the given ID.\n  const std::string operator[](int32_t id) const;\n  /// Return the ID corresponding to the given symbol.\n  int32_t operator[](const std::string &sym) const;\n\n  /// Return true if there is a symbol with the given ID.\n  bool Contains(int32_t id) const;\n\n  /// Return true if there is a given symbol in the symbol table.\n  bool Contains(const std::string &sym) const;\n\n  // for tokens.txt from Whisper\n  void ApplyBase64Decode();\n\n  int32_t NumSymbols() const { return id2sym_.size(); }\n\n  std::string DecodeByteBpe(const std::string &text) const;\n\n  bool IsByteBpe() const { return is_bbpe_; }\n\n private:\n  void Init(std::istream &is);\n\n 
private:\n  std::unordered_map<std::string, int32_t> sym2id_;\n  std::unordered_map<int32_t, std::string> id2sym_;\n\n  // see https://github.com/k2-fsa/sherpa-onnx/issues/2524\n  bool is_bpe_with_byte_fallback_ = false;\n\n  // used only when is_bpe_with_byte_fallback_ is true. It is the ID\n  // of <0x00> in tokens.txt\n  int32_t id_for_0x00_ = 0;\n\n  // true for byte BPE. false for non byte BPE.\n  bool is_bbpe_ = false;\n};\n\nstd::ostream &operator<<(std::ostream &os, const SymbolTable &symbol_table);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_SYMBOL_TABLE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/tee-stream.h",
    "content": "// Code in this file is copied and modified from\n// https://wordaligned.org/articles/cpp-streambufs\n\n#ifndef SHERPA_ONNX_CSRC_TEE_STREAM_H_\n#define SHERPA_ONNX_CSRC_TEE_STREAM_H_\n#include <ostream>\n#include <streambuf>\n#include <string>\n\nnamespace sherpa_onnx {\n\ntemplate <typename char_type, typename traits = std::char_traits<char_type>>\nclass basic_teebuf : public std::basic_streambuf<char_type, traits> {\n public:\n  using int_type = typename traits::int_type;\n\n  basic_teebuf(std::basic_streambuf<char_type, traits> *sb1,\n               std::basic_streambuf<char_type, traits> *sb2)\n      : sb1(sb1), sb2(sb2) {}\n\n private:\n  int sync() override {\n    int const r1 = sb1->pubsync();\n    int const r2 = sb2->pubsync();\n    return r1 == 0 && r2 == 0 ? 0 : -1;\n  }\n\n  int_type overflow(int_type c) override {\n    int_type const eof = traits::eof();\n\n    if (traits::eq_int_type(c, eof)) {\n      return traits::not_eof(c);\n    } else {\n      char_type const ch = traits::to_char_type(c);\n      int_type const r1 = sb1->sputc(ch);\n      int_type const r2 = sb2->sputc(ch);\n\n      return traits::eq_int_type(r1, eof) || traits::eq_int_type(r2, eof) ? eof\n                                                                          : c;\n    }\n  }\n\n private:\n  std::basic_streambuf<char_type, traits> *sb1;\n  std::basic_streambuf<char_type, traits> *sb2;\n};\n\nusing teebuf = basic_teebuf<char>;\n\nclass TeeStream : public std::ostream {\n public:\n  TeeStream(std::ostream &o1, std::ostream &o2)\n      : std::ostream(&tbuf), tbuf(o1.rdbuf(), o2.rdbuf()) {}\n\n private:\n  teebuf tbuf;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TEE_STREAM_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ten-vad-model-config.cc",
    "content": "// sherpa-onnx/csrc/ten-vad-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/ten-vad-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nvoid TenVadModelConfig::Register(ParseOptions *po) {\n  po->Register(\"ten-vad-model\", &model, \"Path to TEN VAD ONNX model.\");\n\n  po->Register(\"ten-vad-threshold\", &threshold,\n               \"Speech threshold. TEN VAD outputs a speech probability for \"\n               \"each audio chunk. Probabilities ABOVE this value are \"\n               \"considered as SPEECH. It is better to tune this parameter for \"\n               \"each dataset separately, but the default value \"\n               \"0.5 is a reasonable choice for most datasets.\");\n\n  po->Register(\"ten-vad-min-silence-duration\", &min_silence_duration,\n               \"In seconds. At the end of each speech chunk, wait for \"\n               \"--ten-vad-min-silence-duration seconds before separating it\");\n\n  po->Register(\"ten-vad-min-speech-duration\", &min_speech_duration,\n               \"In seconds. Speech must last for at least \"\n               \"--ten-vad-min-speech-duration seconds before it is treated \"\n               \"as the start of a speech segment\");\n\n  po->Register(\n      \"ten-vad-max-speech-duration\", &max_speech_duration,\n      \"In seconds. If a speech segment is longer than this value, then we \"\n      \"increase the threshold to 0.9. After finishing detecting the segment, \"\n      \"the threshold value is reset to its original value.\");\n\n  po->Register(\n      \"ten-vad-window-size\", &window_size,\n      \"In samples. Audio chunks of --ten-vad-window-size samples are fed \"\n      \"to the TEN VAD model. WARNING! 
Please use 160 or 256 \");\n}\n\nbool TenVadModelConfig::Validate() const {\n  if (model.empty()) {\n    SHERPA_ONNX_LOGE(\"Please provide --ten-vad-model\");\n    return false;\n  }\n\n  if (!FileExists(model)) {\n    SHERPA_ONNX_LOGE(\"TEN vad model file '%s' does not exist\", model.c_str());\n    return false;\n  }\n\n  if (threshold < 0.01) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --ten-vad-threshold. Given: %f\",\n        threshold);\n    return false;\n  }\n\n  if (threshold >= 1) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a smaller value for --ten-vad-threshold. Given: %f\",\n        threshold);\n    return false;\n  }\n\n  if (min_silence_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --ten-vad-min-silence-duration. \"\n        \"Given: \"\n        \"%f\",\n        min_silence_duration);\n    return false;\n  }\n\n  if (min_speech_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --ten-vad-min-speech-duration. \"\n        \"Given: \"\n        \"%f\",\n        min_speech_duration);\n    return false;\n  }\n\n  if (max_speech_duration <= 0) {\n    SHERPA_ONNX_LOGE(\n        \"Please use a larger value for --ten-vad-max-speech-duration. \"\n        \"Given: \"\n        \"%f\",\n        max_speech_duration);\n    return false;\n  }\n\n  return true;\n}\n\nstd::string TenVadModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"TenVadModelConfig(\";\n  os << \"model=\\\"\" << model << \"\\\", \";\n  os << \"threshold=\" << threshold << \", \";\n  os << \"min_silence_duration=\" << min_silence_duration << \", \";\n  os << \"min_speech_duration=\" << min_speech_duration << \", \";\n  os << \"max_speech_duration=\" << max_speech_duration << \", \";\n  os << \"window_size=\" << window_size << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ten-vad-model-config.h",
    "content": "// sherpa-onnx/csrc/ten-vad-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_TEN_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_TEN_VAD_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n\nnamespace sherpa_onnx {\n\nstruct TenVadModelConfig {\n  std::string model;\n\n  // threshold to classify a segment as speech\n  //\n  // If the predicted probability of a segment is larger than this\n  // value, then it is classified as speech.\n  float threshold = 0.5;\n\n  float min_silence_duration = 0.5;  // in seconds\n\n  float min_speech_duration = 0.25;  // in seconds\n\n  // 160 or 256\n  int32_t window_size = 256;  // in samples\n\n  // If a speech segment is longer than this value, then we increase\n  // the threshold to 0.9. After finishing detecting the segment,\n  // the threshold value is reset to its original value.\n  float max_speech_duration = 20;  // in seconds\n\n  TenVadModelConfig() = default;\n\n  void Register(ParseOptions *po);\n\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TEN_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/ten-vad-model.cc",
    "content": "// sherpa-onnx/csrc/ten-vad-model.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/ten-vad-model.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <cstring>\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"Eigen/Dense\"\n#include \"kaldi-native-fbank/csrc/mel-computations.h\"\n#include \"kaldi-native-fbank/csrc/rfft.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nclass TenVadModel::Impl {\n public:\n  explicit Impl(const VadModelConfig &config)\n      : config_(config),\n        rfft_(1024),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(config.ten_vad.model);\n    Init(buf.data(), buf.size());\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const VadModelConfig &config)\n      : config_(config),\n        rfft_(1024),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config)),\n        allocator_{},\n        sample_rate_(config.sample_rate) {\n    auto buf = ReadFile(mgr, config.ten_vad.model);\n    Init(buf.data(), buf.size());\n  }\n\n  float Run(const float *samples, int32_t n) {\n    ComputeFeatures(samples, n);\n\n    auto memory_info =\n        Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeDefault);\n\n    std::array<int64_t, 3> x_shape = {1, 3, 41};\n\n    Ort::Value x = Ort::Value::CreateTensor(memory_info, last_features_.data(),\n                                            last_features_.size(),\n            
                                x_shape.data(), x_shape.size());\n\n    std::vector<Ort::Value> inputs;\n    inputs.reserve(input_names_.size());\n\n    inputs.push_back(std::move(x));\n    for (auto &s : states_) {\n      inputs.push_back(std::move(s));\n    }\n\n    auto out =\n        sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),\n                   output_names_ptr_.data(), output_names_ptr_.size());\n\n    for (int32_t i = 1; i != static_cast<int32_t>(output_names_.size()); ++i) {\n      states_[i - 1] = std::move(out[i]);\n    }\n\n    float prob = out[0].GetTensorData<float>()[0];\n\n    return prob;\n  }\n  void Reset() {\n    triggered_ = false;\n    current_sample_ = 0;\n    temp_start_ = 0;\n    temp_end_ = 0;\n\n    last_sample_ = 0;\n\n    last_features_.resize(3 * 41);\n    std::fill(last_features_.begin(), last_features_.end(), 0.0f);\n    tmp_samples_.resize(1024);\n\n    ResetStates();\n  }\n\n  bool IsSpeech(const float *samples, int32_t n) {\n    if (n != WindowSize()) {\n      SHERPA_ONNX_LOGE(\"n: %d != window_size: %d\", n, WindowSize());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    float prob = Run(samples, n);\n\n    float threshold = config_.ten_vad.threshold;\n\n    current_sample_ += config_.ten_vad.window_size;\n\n    if (prob > threshold && temp_end_ != 0) {\n      temp_end_ = 0;\n    }\n\n    if (prob > threshold && temp_start_ == 0) {\n      // start speaking, but we require that it must satisfy\n      // min_speech_duration\n      temp_start_ = current_sample_;\n      return false;\n    }\n\n    if (prob > threshold && temp_start_ != 0 && !triggered_) {\n      if (current_sample_ - temp_start_ < min_speech_samples_) {\n        return false;\n      }\n\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && !triggered_) {\n      // silence\n      temp_start_ = 0;\n      temp_end_ = 0;\n      return false;\n    }\n\n    if ((prob > threshold - 0.15) && triggered_) {\n      // 
speaking\n      return true;\n    }\n\n    if ((prob > threshold) && !triggered_) {\n      // start speaking\n      triggered_ = true;\n\n      return true;\n    }\n\n    if ((prob < threshold) && triggered_) {\n      // speech may be ending; remember where the silence started\n      if (temp_end_ == 0) {\n        temp_end_ = current_sample_;\n      }\n\n      if (current_sample_ - temp_end_ < min_silence_samples_) {\n        // continue speaking\n        return true;\n      }\n      // stopped speaking\n      temp_start_ = 0;\n      temp_end_ = 0;\n      triggered_ = false;\n      return false;\n    }\n\n    return false;\n  }\n\n  int32_t WindowShift() const { return config_.ten_vad.window_size; }\n\n  int32_t WindowSize() const { return config_.ten_vad.window_size; }\n\n  int32_t MinSilenceDurationSamples() const { return min_silence_samples_; }\n\n  int32_t MinSpeechDurationSamples() const { return min_speech_samples_; }\n\n  void SetMinSilenceDuration(float s) {\n    min_silence_samples_ = sample_rate_ * s;\n  }\n\n  void SetThreshold(float threshold) { config_.ten_vad.threshold = threshold; }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    if (sample_rate_ != 16000) {\n      SHERPA_ONNX_LOGE(\"Expected sample rate 16000. 
Given: %d\",\n                       config_.sample_rate);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (config_.ten_vad.window_size > 768) {\n      SHERPA_ONNX_LOGE(\"Window size %d for ten-vad is too large\",\n                       config_.ten_vad.window_size);\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    min_silence_samples_ = sample_rate_ * config_.ten_vad.min_silence_duration;\n\n    min_speech_samples_ = sample_rate_ * config_.ten_vad.min_speech_duration;\n\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    InitMelBanks();\n\n    Check();\n\n    Reset();\n  }\n\n  void ResetStates() {\n    std::array<int64_t, 2> shape{1, 64};\n\n    states_.clear();\n    states_.reserve(4);\n    for (int32_t i = 0; i != 4; ++i) {\n      Ort::Value s = Ort::Value::CreateTensor<float>(allocator_, shape.data(),\n                                                     shape.size());\n\n      Fill<float>(&s, 0);\n      states_.push_back(std::move(s));\n    }\n  }\n\n  void InitMelBanks() {\n    knf::FrameExtractionOptions frame_opts;\n\n    // 16 kHz, so num_fft is 16000*64/1000 = 1024\n    frame_opts.frame_length_ms = 64;\n\n    knf::MelBanksOptions mel_opts;\n    mel_opts.is_librosa = true;\n    mel_opts.norm = \"\";\n    mel_opts.use_slaney_mel_scale = true;\n    mel_opts.floor_to_int_bin = true;\n    mel_opts.low_freq = 0;\n    mel_opts.high_freq = 8000;\n    mel_opts.num_bins = 40;\n\n    mel_banks_ = std::make_unique<knf::MelBanks>(mel_opts, frame_opts, 1.0f);\n\n    features_.resize(41);\n  }\n\n  void Check() {\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---ten-vad---\\n\";\n      PrintModelMetadata(os, meta_data);\n#if __OHOS__\n      
SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n\n    std::string model_type;\n    SHERPA_ONNX_READ_META_DATA_STR_ALLOW_EMPTY(model_type, \"model_type\");\n\n    if (model_type.empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Please download ten-vad.onnx or ten-vad.int8.onnx from\\n\"\n          \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\"\n          \"\\nWe have added meta data to the original ten-vad.onnx from\\n\"\n          \"https://github.com/TEN-framework/ten-vad\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (model_type != \"ten-vad\") {\n      SHERPA_ONNX_LOGE(\"Expect model type 'ten-vad', given '%s'\",\n                       model_type.c_str());\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(mean_, \"mean\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(inv_stddev_, \"inv_stddev\");\n    SHERPA_ONNX_READ_META_DATA_VEC_FLOAT(window_, \"window\");\n\n    if (mean_.size() != 41) {\n      SHERPA_ONNX_LOGE(\n          \"Incorrect size of the mean vector. Given %d, expected 41\",\n          static_cast<int32_t>(mean_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (inv_stddev_.size() != 41) {\n      SHERPA_ONNX_LOGE(\n          \"Incorrect size of the inv_stddev vector. Given %d, expected 41\",\n          static_cast<int32_t>(inv_stddev_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    if (window_.size() != 768) {\n      SHERPA_ONNX_LOGE(\n          \"Incorrect size of the window vector. 
Given %d, expected 768\",\n          static_cast<int32_t>(window_.size()));\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n  static void Scale(const float *samples, int32_t n, float *out) {\n    Eigen::Map<const Eigen::ArrayXf> input(samples, n);\n    Eigen::Map<Eigen::ArrayXf> output(out, n);\n    constexpr float kScale = 32768.0f;\n    output = input * kScale;\n  }\n\n  void Preemphasis(const float *samples, int32_t n, float *out) {\n    float t = samples[n - 1];\n\n    for (int32_t i = n - 1; i > 0; --i) {\n      out[i] = samples[i] - 0.97 * samples[i - 1];\n    }\n\n    out[0] = samples[0] - 0.97 * last_sample_;\n\n    last_sample_ = t;\n  }\n\n  static void ApplyWindow(const float *samples, const float *window, int32_t n,\n                          float *out) {\n    Eigen::Map<const Eigen::ArrayXf> samp_vec(samples, n);\n    Eigen::Map<const Eigen::ArrayXf> win_vec(window, n);\n    Eigen::Map<Eigen::ArrayXf> out_vec(out, n);\n    out_vec = samp_vec * win_vec;\n  }\n\n  static void ComputePowerSpectrum(const float *fft_bins, int32_t n,\n                                   float *out) {\n    out[0] = fft_bins[0] * fft_bins[0];\n    out[n - 1] = fft_bins[1] * fft_bins[1];\n\n    for (int32_t i = 1; i < n / 2; ++i) {\n      float real = fft_bins[2 * i];\n      float imag = fft_bins[2 * i + 1];\n      out[i] = real * real + imag * imag;\n    }\n  }\n\n  static void LogMel(const float *in, int32_t n, float *out) {\n    Eigen::Map<const Eigen::ArrayXf> input(in, n);\n    Eigen::Map<Eigen::ArrayXf> output(out, n);\n    // 20.79441541679836 is log(32768*32768)\n    constexpr float kLogScale = 20.79441541679836f;\n    output = (input + 1e-10f).log() - kLogScale;\n  }\n\n  void ApplyNormalization(const float *in, float *out) const {\n    int32_t dim = static_cast<int32_t>(mean_.size());\n\n    Eigen::Map<const Eigen::ArrayXf> input(in, dim);\n    Eigen::Map<Eigen::ArrayXf> output(out, dim);\n    Eigen::Map<const Eigen::ArrayXf> mean_vec(mean_.data(), dim);\n    
Eigen::Map<const Eigen::ArrayXf> inv_stddev_vec(inv_stddev_.data(), dim);\n    output = (input - mean_vec) * inv_stddev_vec;\n  }\n\n  void ComputeFeatures(const float *samples, int32_t n) {\n    std::fill(tmp_samples_.begin() + n, tmp_samples_.end(), 0.0f);\n\n    Scale(samples, n, tmp_samples_.data());\n\n    Preemphasis(tmp_samples_.data(), n, tmp_samples_.data());\n    ApplyWindow(tmp_samples_.data(), window_.data(), n, tmp_samples_.data());\n\n    rfft_.Compute(tmp_samples_.data());\n    auto &power_spectrum = tmp_samples_;\n    ComputePowerSpectrum(tmp_samples_.data(), tmp_samples_.size(),\n                         power_spectrum.data());\n\n    // note only the first half of power_spectrum is used inside Compute()\n    mel_banks_->Compute(power_spectrum.data(), features_.data());\n    LogMel(features_.data(), static_cast<int32_t>(features_.size()) - 1,\n           features_.data());\n\n    // Note(fangjun): The ten-vad model expects a pitch feature, but we set it\n    // to 0 as a simplification. 
This may reduce performance as noted\n    // in the PR #2377\n    features_.back() = 0;\n\n    ApplyNormalization(features_.data(), features_.data());\n\n    std::memmove(last_features_.data(),\n                 last_features_.data() + features_.size(),\n                 2 * features_.size() * sizeof(float));\n    std::copy(features_.begin(), features_.end(),\n              last_features_.begin() + 2 * features_.size());\n  }\n\n private:\n  VadModelConfig config_;\n  knf::Rfft rfft_;\n  std::unique_ptr<knf::MelBanks> mel_banks_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n\n  std::vector<Ort::Value> states_;\n  int64_t sample_rate_;\n  int32_t min_silence_samples_;\n  int32_t min_speech_samples_;\n\n  bool triggered_ = false;\n  int32_t current_sample_ = 0;\n  int32_t temp_start_ = 0;\n  int32_t temp_end_ = 0;\n\n  float last_sample_ = 0;\n\n  std::vector<float> mean_;\n  std::vector<float> inv_stddev_;\n  std::vector<float> window_;\n\n  std::vector<float> features_;\n  std::vector<float> last_features_;  // (3, 41), row major\n  std::vector<float> tmp_samples_;    // (1024,)\n};\n\nTenVadModel::TenVadModel(const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nTenVadModel::TenVadModel(Manager *mgr, const VadModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nTenVadModel::~TenVadModel() = default;\n\nvoid TenVadModel::Reset() { return impl_->Reset(); }\n\nbool TenVadModel::IsSpeech(const float *samples, int32_t n) {\n  return impl_->IsSpeech(samples, n);\n}\n\nint32_t TenVadModel::WindowSize() const { return impl_->WindowSize(); }\n\nint32_t TenVadModel::WindowShift() const { return impl_->WindowShift(); 
}\n\nint32_t TenVadModel::MinSilenceDurationSamples() const {\n  return impl_->MinSilenceDurationSamples();\n}\n\nint32_t TenVadModel::MinSpeechDurationSamples() const {\n  return impl_->MinSpeechDurationSamples();\n}\n\nvoid TenVadModel::SetMinSilenceDuration(float s) {\n  impl_->SetMinSilenceDuration(s);\n}\n\nvoid TenVadModel::SetThreshold(float threshold) {\n  impl_->SetThreshold(threshold);\n}\n\nfloat TenVadModel::Compute(const float *samples, int32_t n) {\n  return impl_->Run(samples, n);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate TenVadModel::TenVadModel(AAssetManager *mgr,\n                                  const VadModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate TenVadModel::TenVadModel(NativeResourceManager *mgr,\n                                  const VadModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/ten-vad-model.h",
    "content": "// sherpa-onnx/csrc/ten-vad-model.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_TEN_VAD_MODEL_H_\n#define SHERPA_ONNX_CSRC_TEN_VAD_MODEL_H_\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\nnamespace sherpa_onnx {\n\nclass TenVadModel : public VadModel {\n public:\n  explicit TenVadModel(const VadModelConfig &config);\n\n  template <typename Manager>\n  TenVadModel(Manager *mgr, const VadModelConfig &config);\n\n  ~TenVadModel() override;\n\n  // reset the internal model states\n  void Reset() override;\n\n  /**\n   * @param samples Pointer to a 1-d array containing audio samples.\n   *                Each sample should be normalized to the range [-1, 1].\n   * @param n Number of samples.\n   *\n   * @return Return true if speech is detected. Return false otherwise.\n   */\n  bool IsSpeech(const float *samples, int32_t n) override;\n\n  float Compute(const float *samples, int32_t n) override;\n\n  // 256 or 160\n  int32_t WindowSize() const override;\n\n  // 256 or 128\n  int32_t WindowShift() const override;\n\n  int32_t MinSilenceDurationSamples() const override;\n  int32_t MinSpeechDurationSamples() const override;\n\n  void SetMinSilenceDuration(float s) override;\n  void SetThreshold(float threshold) override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TEN_VAD_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/text-utils-test.cc",
    "content": "// sherpa-onnx/csrc/text-utils-test.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\n#include <cstdio>\n#include <cstring>\n#include <iostream>\n#include <regex>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\nTEST(ToLowerCase, WideString) {\n  std::string text =\n      \"Hallo! Übeltäter übergibt Ärzten öfters äußerst ätzende Öle 3€\";\n  auto t = ToLowerCase(text);\n  std::cout << text << \"\\n\";\n  std::cout << t << \"\\n\";\n}\n\nTEST(RemoveInvalidUtf8Sequences, Case1) {\n  std::vector<uint8_t> v = {\n      0xe4, 0xbb, 0x8a,                                  // 今\n      0xe5, 0xa4, 0xa9,                                  // 天\n      'i',  's',  ' ',  'M', 'o', 'd', 'a', 'y',  ',',   // is Monday,\n      ' ',  'w',  'i',  'e', ' ', 'h', 'e', 'i',  0xc3,  // wie heißen Size\n      0x9f, 'e',  'n',  ' ', 'S', 'i', 'e', 0xf0, 0x9d, 0x84, 0x81};\n\n  std::vector<uint8_t> v0 = v;\n  v0[1] = 0xc0;  // make the first 3 bytes an invalid utf8 character\n  std::string s0{v0.begin(), v0.end()};\n  EXPECT_EQ(s0.size(), v0.size());\n\n  auto s = RemoveInvalidUtf8Sequences(s0);  // should remove 今\n\n  v0 = v;\n  // v0[23] == 0xc3\n  // v0[24] == 0x9f\n\n  v0[23] = 0xc1;\n\n  s0 = {v0.begin(), v0.end()};\n  s = RemoveInvalidUtf8Sequences(s0);  // should remove ß\n\n  EXPECT_EQ(s.size() + 2, v.size());\n\n  v0 = v;\n  // v0[31] = 0xf0;\n  // v0[32] = 0x9d;\n  // v0[33] = 0x84;\n  // v0[34] = 0x81;\n  v0[31] = 0xf5;\n\n  s0 = {v0.begin(), v0.end()};\n  s = RemoveInvalidUtf8Sequences(s0);\n\n  EXPECT_EQ(s.size() + 4, v.size());\n}\n\n// Tests for sanitizeUtf8\nTEST(RemoveInvalidUtf8Sequences, ValidUtf8StringPassesUnchanged) {\n  std::string input = \"Valid UTF-8 🌍\";\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), input);\n}\n\nTEST(RemoveInvalidUtf8Sequences, SingleInvalidByteReplaced) {\n  std::string input = \"Invalid \\xFF UTF-8\";\n  
std::string expected = \"Invalid  UTF-8\";\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, TruncatedUtf8SequenceReplaced) {\n  std::string input = \"Broken \\xE2\\x82\";  // Incomplete UTF-8 sequence\n  std::string expected = \"Broken \";\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, MultipleInvalidBytes) {\n  std::string input = \"Test \\xC0\\xC0\\xF8\\xA0\";  // Multiple invalid sequences\n  std::string expected = \"Test \";\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, BreakingCase_SpaceFollowedByInvalidByte) {\n  std::string input = \"\\x20\\xC4\";  // Space followed by an invalid byte\n  std::string expected = \" \";      // 0xC4 removed\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, ValidUtf8WithEdgeCaseCharacters) {\n  std::string input = \"Edge 🏆💯\";\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), input);\n}\n\nTEST(RemoveInvalidUtf8Sequences, MixedValidAndInvalidBytes) {\n  std::string input = \"Mix \\xE2\\x82\\xAC \\xF0\\x9F\\x98\\x81 \\xFF\";\n  std::string expected = \"Mix € 😁 \";  // Invalid bytes removed\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, SpaceFollowedByInvalidByte) {\n  std::string input = \"\\x20\\xC4\";  // Space (0x20) followed by invalid (0xC4)\n  std::string expected = \" \";      // Space remains, 0xC4 is removed\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, RemoveTruncatedC4) {\n  std::string input = \"Hello \\xc4 world\";  // Invalid `0xC4`\n  std::string expected = \"Hello  world\";   // `0xC4` should be removed\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, SpaceFollowedByInvalidByte_Breaking) {\n  std::string input = \"\\x20\\xc4\";  // Space followed by invalid `0xc4`\n  
std::string expected = \" \";      // `0xc4` should be removed, space remains\n  EXPECT_EQ(RemoveInvalidUtf8Sequences(input), expected);\n}\n\nTEST(RemoveInvalidUtf8Sequences, DebugSpaceFollowedByInvalidByte) {\n  std::string input = \"\\x20\\xc4\";  // Space followed by invalid `0xc4`\n  std::string output = RemoveInvalidUtf8Sequences(input);\n\n  std::cout << \"Processed string: \";\n  for (unsigned char c : output) {\n    printf(\"\\\\x%02x \", c);\n  }\n  std::cout << std::endl;\n\n  EXPECT_EQ(output, \" \");  // Expect `0xc4` to be removed, leaving only space\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/text-utils.cc",
    "content": "// sherpa-onnx/csrc/text-utils.cc\n//\n// Copyright 2009-2011  Saarland University;  Microsoft Corporation\n// Copyright      2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <cctype>\n#include <charconv>\n#include <cinttypes>\n#include <climits>\n#include <cstdint>\n#include <cstdlib>\n#include <cwctype>\n#include <limits>\n#include <locale>\n#include <sstream>\n#include <string>\n#include <unordered_map>\n#include <unordered_set>\n#include <utility>\n#include <vector>\n\n#if defined(_WIN32)\n#include <Windows.h>\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\n// This file is copied/modified from\n// https://github.com/kaldi-asr/kaldi/blob/master/src/util/text-utils.cc\n\nnamespace sherpa_onnx {\n\n// copied from kaldi/src/util/text-util.cc\ntemplate <class T>\nclass NumberIstream {\n public:\n  explicit NumberIstream(std::istream &i) : in_(i) {}\n\n  NumberIstream &operator>>(T &x) {\n    if (!in_.good()) return *this;\n    in_ >> x;\n    if (!in_.fail() && RemainderIsOnlySpaces()) return *this;\n    return ParseOnFail(&x);\n  }\n\n private:\n  std::istream &in_;\n\n  bool RemainderIsOnlySpaces() {\n    if (in_.tellg() != std::istream::pos_type(-1)) {\n      std::string rem;\n      in_ >> rem;\n\n      if (rem.find_first_not_of(' ') != std::string::npos) {\n        // there is not only spaces\n        return false;\n      }\n    }\n\n    in_.clear();\n    return true;\n  }\n\n  NumberIstream &ParseOnFail(T *x) {\n    std::string str;\n    in_.clear();\n    in_.seekg(0);\n    // If the stream is broken even before trying\n    // to read from it or if there are many tokens,\n    // it's pointless to try.\n    if (!(in_ >> str) || !RemainderIsOnlySpaces()) {\n      in_.setstate(std::ios_base::failbit);\n      return *this;\n    }\n\n    std::unordered_map<std::string, T> inf_nan_map;\n    // we'll keep just uppercase values.\n    inf_nan_map[\"INF\"] = 
std::numeric_limits<T>::infinity();\n    inf_nan_map[\"+INF\"] = std::numeric_limits<T>::infinity();\n    inf_nan_map[\"-INF\"] = -std::numeric_limits<T>::infinity();\n    inf_nan_map[\"INFINITY\"] = std::numeric_limits<T>::infinity();\n    inf_nan_map[\"+INFINITY\"] = std::numeric_limits<T>::infinity();\n    inf_nan_map[\"-INFINITY\"] = -std::numeric_limits<T>::infinity();\n    inf_nan_map[\"NAN\"] = std::numeric_limits<T>::quiet_NaN();\n    inf_nan_map[\"+NAN\"] = std::numeric_limits<T>::quiet_NaN();\n    inf_nan_map[\"-NAN\"] = -std::numeric_limits<T>::quiet_NaN();\n    // MSVC\n    inf_nan_map[\"1.#INF\"] = std::numeric_limits<T>::infinity();\n    inf_nan_map[\"-1.#INF\"] = -std::numeric_limits<T>::infinity();\n    inf_nan_map[\"1.#QNAN\"] = std::numeric_limits<T>::quiet_NaN();\n    inf_nan_map[\"-1.#QNAN\"] = -std::numeric_limits<T>::quiet_NaN();\n\n    std::transform(str.begin(), str.end(), str.begin(), ::toupper);\n\n    if (inf_nan_map.find(str) != inf_nan_map.end()) {\n      *x = inf_nan_map[str];\n    } else {\n      in_.setstate(std::ios_base::failbit);\n    }\n\n    return *this;\n  }\n};\n\n/// ConvertStringToReal converts a string into either float or double\n/// and returns false if there was any kind of problem (i.e. 
the string\n/// was not a floating point number or contained extra non-whitespace junk).\n/// Be careful- this function will successfully read inf's or nan's.\ntemplate <typename T>\nbool ConvertStringToReal(const std::string &str, T *out) {\n  std::istringstream iss(str);\n\n  NumberIstream<T> i(iss);\n\n  i >> *out;\n\n  if (iss.fail()) {\n    // Number conversion failed.\n    return false;\n  }\n\n  return true;\n}\n\ntemplate bool ConvertStringToReal<float>(const std::string &str, float *out);\n\ntemplate bool ConvertStringToReal<double>(const std::string &str, double *out);\n\nvoid SplitStringToVector(const std::string &full, const char *delim,\n                         bool omit_empty_strings,\n                         std::vector<std::string> *out) {\n  size_t start = 0, found = 0, end = full.size();\n  out->clear();\n  while (found != std::string::npos) {\n    found = full.find_first_of(delim, start);\n    // start != end condition is for when the delimiter is at the end\n    if (!omit_empty_strings || (found != start && start != end))\n      out->push_back(full.substr(start, found - start));\n    start = found + 1;\n  }\n}\n\nstd::string Trim(const std::string &str) {\n  size_t start = 0;\n  while (start < str.size() &&\n         std::isspace(static_cast<unsigned char>(str[start]))) {\n    start++;\n  }\n  size_t end = str.size();\n  while (end > start &&\n         std::isspace(static_cast<unsigned char>(str[end - 1]))) {\n    end--;\n  }\n  return str.substr(start, end - start);\n}\n\nstd::vector<std::string> SplitStringAndTrim(const std::string &str,\n                                            char delim) {\n  std::vector<std::string> result;\n  std::string delim_str(1, delim);\n  SplitStringToVector(str, delim_str.c_str(), true, &result);\n  // Trim whitespace from each part\n  for (auto &part : result) {\n    part = Trim(part);\n  }\n  // Remove empty strings after trimming\n  result.erase(std::remove_if(result.begin(), result.end(),\n                 
             [](const std::string &s) { return s.empty(); }),\n               result.end());\n  return result;\n}\n\ntemplate <class F>\nbool SplitStringToFloats(const std::string &full, const char *delim,\n                         bool omit_empty_strings,  // typically false\n                         std::vector<F> *out) {\n  assert(out != nullptr);\n  if (*(full.c_str()) == '\\0') {\n    out->clear();\n    return true;\n  }\n  std::vector<std::string> split;\n  SplitStringToVector(full, delim, omit_empty_strings, &split);\n  out->resize(split.size());\n  for (size_t i = 0; i < split.size(); ++i) {\n    // return false if any field is not a valid number\n    F f = 0;\n    if (!ConvertStringToReal(split[i], &f)) return false;\n    (*out)[i] = f;\n  }\n  return true;\n}\n\n// Instantiate the template above for float and double.\ntemplate bool SplitStringToFloats(const std::string &full, const char *delim,\n                                  bool omit_empty_strings,\n                                  std::vector<float> *out);\ntemplate bool SplitStringToFloats(const std::string &full, const char *delim,\n                                  bool omit_empty_strings,\n                                  std::vector<double> *out);\n\n// Cast to unsigned char: passing a negative char to std::ispunct is undefined.\nstatic bool IsPunct(char c) {\n  return c != '\\'' && std::ispunct(static_cast<unsigned char>(c));\n}\n\nstatic bool IsGermanUmlaut(const std::string &word) {\n  // ä 0xC3 0xA4\n  // ö 0xC3 0xB6\n  // ü 0xC3 0xBC\n  // Ä 0xC3 0x84\n  // Ö 0xC3 0x96\n  // Ü 0xC3 0x9C\n  // ß 0xC3 0x9F\n\n  if (word.size() != 2 || static_cast<uint8_t>(word[0]) != 0xc3) {\n    return false;\n  }\n\n  auto c = static_cast<uint8_t>(word[1]);\n  if (c == 0xa4 || c == 0xb6 || c == 0xbc || c == 0x84 || c == 0x96 ||\n      c == 0x9c || c == 0x9f) {\n    return true;\n  }\n\n  return false;\n}\n\n// see https://www.tandem.net/blog/spanish-accents\n// https://www.compart.com/en/unicode/U+00DC\nstatic bool IsSpanishDiacritic(const std::string &word) {\n  // á 0xC3 0xA1\n  // é 0xC3 0xA9\n  // í 0xC3 0xAD\n  // ó 0xC3 0xB3\n  // ú 
0xC3 0xBA\n  // ü 0xC3 0xBC\n  // ñ 0xC3 0xB1\n  //\n  // uppercase\n  //\n  // Á 0xC3 0x81\n  // É 0xC3 0x89\n  // Í 0xC3 0x8D\n  // Ó 0xC3 0x93\n  // Ú 0xC3 0x9A\n  // Ü 0xC3 0x9C\n  // Ñ 0xC3 0x91\n\n  if (word.size() != 2 || static_cast<uint8_t>(word[0]) != 0xc3) {\n    return false;\n  }\n\n  auto c = static_cast<uint8_t>(word[1]);\n  if (c == 0xa1 || c == 0xa9 || c == 0xad || c == 0xb3 || c == 0xba ||\n      c == 0xbc || c == 0xb1 || c == 0x81 || c == 0x89 || c == 0x8d ||\n      c == 0x93 || c == 0x9a || c == 0x9c || c == 0x91) {\n    return true;\n  }\n\n  return false;\n}\n\n// see https://www.busuu.com/en/french/accent-marks\nstatic bool IsFrenchDiacritic(const std::string &word) {\n  // acute accent\n  // é 0xC3 0xA9\n  //\n  // grave accent\n  // à 0xC3 0xA0\n  // è 0xC3 0xA8\n  // ù 0xC3 0xB9\n  //\n  // cedilla\n  // ç 0xC3 0xA7\n  //\n  // circumflex\n  // â 0xC3 0xA2\n  // ê 0xC3 0xAA\n  // î 0xC3 0xAE\n  // ô 0xC3 0xB4\n  // û 0xC3 0xBB\n  //\n  // trema\n  // ë 0xC3 0xAB\n  // ï 0xC3 0xAF\n  // ü 0xC3 0xBC\n  //\n  // É 0xC3 0x89\n  //\n  // À 0xC3 0x80\n  // È 0xC3 0x88\n  // Ù 0xC3 0x99\n  // Ç 0xC3 0x87\n  // Â 0xC3 0x82\n  // Ê 0xC3 0x8A\n  // Î 0xC3 0x8E\n  // Ô 0xC3 0x94\n  // Û 0xC3 0x9B\n  // Ë 0xC3 0x8B\n  // Ï 0xC3 0x8F\n  // Ü 0xC3 0x9C\n\n  if (word.size() != 2 || static_cast<uint8_t>(word[0]) != 0xc3) {\n    return false;\n  }\n\n  auto c = static_cast<uint8_t>(word[1]);\n  if (c == 0xa9 || c == 0xa0 || c == 0xa8 || c == 0xb9 || c == 0xa7 ||\n      c == 0xa2 || c == 0xaa || c == 0xae || c == 0xb4 || c == 0xbb ||\n      c == 0xab || c == 0xaf || c == 0xbc || c == 0x89 || c == 0x80 ||\n      c == 0x88 || c == 0x99 || c == 0x87 || c == 0x82 || c == 0x8a ||\n      c == 0x8e || c == 0x94 || c == 0x9b || c == 0x8b || c == 0x8f ||\n      c == 0x9c) {\n    return true;\n  }\n  return false;\n}\n\nstatic bool IsSpecial(const std::string &w) {\n  bool ans = IsGermanUmlaut(w) || IsSpanishDiacritic(w) || IsFrenchDiacritic(w);\n\n  // for french 
d’impossible\n  // ’ 0xE2 0x80 0x99\n  bool ans2 = false;\n  if (w.size() == 3) {\n    auto c0 = static_cast<uint8_t>(w[0]);\n    auto c1 = static_cast<uint8_t>(w[1]);\n    auto c2 = static_cast<uint8_t>(w[2]);\n    if (c0 == 0xe2 && c1 == 0x80 && c2 == 0x99) {\n      ans2 = true;\n    }\n  }\n\n  return ans || ans2;\n}\n\nstatic std::vector<std::string> MergeCharactersIntoWords(\n    const std::vector<std::string> &words) {\n  std::vector<std::string> ans;\n\n  int32_t n = static_cast<int32_t>(words.size());\n  int32_t i = 0;\n  int32_t prev = -1;\n\n  while (i < n) {\n    const auto &w = words[i];\n    if (w.size() >= 3 || (w.size() == 2 && !IsSpecial(w)) ||\n        (w.size() == 1 &&\n         (IsPunct(w[0]) || std::isspace(static_cast<uint8_t>(w[0]))))) {\n      if (prev != -1) {\n        std::string t;\n        for (; prev < i; ++prev) {\n          t.append(words[prev]);\n        }\n        prev = -1;\n        ans.push_back(std::move(t));\n      }\n\n      if (!std::isspace(static_cast<uint8_t>(w[0]))) {\n        ans.push_back(w);\n      }\n      ++i;\n      continue;\n    }\n\n    // e.g., öffnen\n    if (w.size() == 1 || (w.size() == 2 && IsSpecial(w))) {\n      if (prev == -1) {\n        prev = i;\n      }\n      ++i;\n      continue;\n    }\n\n    SHERPA_ONNX_LOGE(\"Ignore %s\", w.c_str());\n    ++i;\n  }\n\n  if (prev != -1) {\n    std::string t;\n    for (; prev < i; ++prev) {\n      t.append(words[prev]);\n    }\n    ans.push_back(std::move(t));\n  }\n\n  return ans;\n}\n\nstd::vector<std::string> SplitUtf8(const std::string &text) {\n  const uint8_t *begin = reinterpret_cast<const uint8_t *>(text.c_str());\n  const uint8_t *end = begin + text.size();\n\n  // Note that English words are split into single characters.\n  // We need to invoke MergeCharactersIntoWords() to merge them\n  std::vector<std::string> ans;\n\n  auto start = begin;\n  while (start < end) {\n    uint8_t c = *start;\n    uint8_t i = 0x80;\n    int32_t num_bytes = 0;\n\n    // see\n   
 // https://en.wikipedia.org/wiki/UTF-8\n    for (; c & i; i >>= 1) {\n      ++num_bytes;\n    }\n\n    if (num_bytes == 0) {\n      // this is an ascii\n      ans.emplace_back(reinterpret_cast<const char *>(start), 1);\n      ++start;\n    } else if (2 <= num_bytes && num_bytes <= 4) {\n      ans.emplace_back(reinterpret_cast<const char *>(start), num_bytes);\n      start += num_bytes;\n    } else {\n      SHERPA_ONNX_LOGE(\"Invalid byte at position: %d\",\n                       static_cast<int32_t>(start - begin));\n      // skip this byte\n      ++start;\n    }\n  }\n\n  return MergeCharactersIntoWords(ans);\n}\n\nstd::string ToLowerCase(const std::string &s) {\n  return ToString(ToLowerCase(ToWideString(s)));\n}\n\nvoid ToLowerCase(std::string *in_out) {\n  std::transform(in_out->begin(), in_out->end(), in_out->begin(),\n                 [](unsigned char c) { return std::tolower(c); });\n}\n\nstd::wstring ToLowerCase(const std::wstring &s) {\n  std::wstring ans(s.size(), 0);\n  std::transform(s.begin(), s.end(), ans.begin(), [](wchar_t c) -> wchar_t {\n    switch (c) {\n      // French\n      case L'À':\n        return L'à';\n      case L'Â':\n        return L'â';\n      case L'Æ':\n        return L'æ';\n      case L'Ç':\n        return L'ç';\n      case L'È':\n        return L'è';\n      case L'É':\n        return L'é';\n      case L'Ë':\n        return L'ë';\n      case L'Î':\n        return L'î';\n      case L'Ï':\n        return L'ï';\n      case L'Ô':\n        return L'ô';\n      case L'Ù':\n        return L'ù';\n      case L'Û':\n        return L'û';\n      case L'Ü':\n        return L'ü';\n\n      // others\n      case L'Á':\n        return L'á';\n      case L'Í':\n        return L'í';\n      case L'Ó':\n        return L'ó';\n      case L'Ú':\n        return L'ú';\n      case L'Ñ':\n        return L'ñ';\n      case L'Ì':\n        return L'ì';\n      case L'Ò':\n        return L'ò';\n      case L'Ä':\n        return L'ä';\n      case L'Ö':\n        
return L'ö';\n        // TODO(fangjun): Add more\n\n      default:\n        return std::towlower(c);\n    }\n  });\n  return ans;\n}\n\nstatic inline bool InRange(uint8_t x, uint8_t low, uint8_t high) {\n  return low <= x && x <= high;\n}\n\n/*\nPlease see\nhttps://stackoverflow.com/questions/6555015/check-for-invalid-utf8\n\n\nTable 3-7. Well-Formed UTF-8 Byte Sequences\n\nCode Points        First Byte Second Byte Third Byte Fourth Byte\nU+0000..U+007F     00..7F\nU+0080..U+07FF     C2..DF     80..BF\nU+0800..U+0FFF     E0         A0..BF      80..BF\nU+1000..U+CFFF     E1..EC     80..BF      80..BF\nU+D000..U+D7FF     ED         80..9F      80..BF\nU+E000..U+FFFF     EE..EF     80..BF      80..BF\nU+10000..U+3FFFF   F0         90..BF      80..BF     80..BF\nU+40000..U+FFFFF   F1..F3     80..BF      80..BF     80..BF\nU+100000..U+10FFFF F4         80..8F      80..BF     80..BF\n */\nstd::string RemoveInvalidUtf8Sequences(const std::string &text,\n                                       bool show_debug_msg /*= false*/) {\n  int32_t n = static_cast<int32_t>(text.size());\n\n  std::string ans;\n  ans.reserve(n);\n\n  int32_t i = 0;\n  const uint8_t *p = reinterpret_cast<const uint8_t *>(text.data());\n  while (i < n) {\n    if (p[i] <= 0x7f) {\n      ans.append(text, i, 1);\n      i += 1;\n      continue;\n    }\n\n    if (InRange(p[i], 0xc2, 0xdf) && i + 1 < n &&\n        InRange(p[i + 1], 0x80, 0xbf)) {\n      ans.append(text, i, 2);\n      i += 2;\n      continue;\n    }\n\n    if (p[i] == 0xe0 && i + 2 < n && InRange(p[i + 1], 0xa0, 0xbf) &&\n        InRange(p[i + 2], 0x80, 0xbf)) {\n      ans.append(text, i, 3);\n      i += 3;\n      continue;\n    }\n\n    if (InRange(p[i], 0xe1, 0xec) && i + 2 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf)) {\n      ans.append(text, i, 3);\n      i += 3;\n      continue;\n    }\n\n    if (p[i] == 0xed && i + 2 < n && InRange(p[i + 1], 0x80, 0x9f) &&\n        InRange(p[i + 2], 0x80, 0xbf)) {\n      
ans.append(text, i, 3);\n      i += 3;\n      continue;\n    }\n\n    if (InRange(p[i], 0xee, 0xef) && i + 2 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf)) {\n      ans.append(text, i, 3);\n      i += 3;\n      continue;\n    }\n\n    if (p[i] == 0xf0 && i + 3 < n && InRange(p[i + 1], 0x90, 0xbf) &&\n        InRange(p[i + 2], 0x80, 0xbf) && InRange(p[i + 3], 0x80, 0xbf)) {\n      ans.append(text, i, 4);\n      i += 4;\n      continue;\n    }\n\n    if (InRange(p[i], 0xf1, 0xf3) && i + 3 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf) &&\n        InRange(p[i + 3], 0x80, 0xbf)) {\n      ans.append(text, i, 4);\n      i += 4;\n      continue;\n    }\n\n    if (p[i] == 0xf4 && i + 3 < n && InRange(p[i + 1], 0x80, 0x8f) &&\n        InRange(p[i + 2], 0x80, 0xbf) && InRange(p[i + 3], 0x80, 0xbf)) {\n      ans.append(text, i, 4);\n      i += 4;\n      continue;\n    }\n\n    if (show_debug_msg) {\n      SHERPA_ONNX_LOGE(\"Ignore invalid utf8 sequence at pos: %d, value: %02x\",\n                       i, p[i]);\n    }\n\n    i += 1;\n  }\n\n  return ans;\n}\n\nbool IsUtf8(const std::string &text) {\n  int32_t n = static_cast<int32_t>(text.size());\n  int32_t i = 0;\n  const uint8_t *p = reinterpret_cast<const uint8_t *>(text.data());\n  while (i < n) {\n    if (p[i] <= 0x7f) {\n      i += 1;\n      continue;\n    }\n\n    if (InRange(p[i], 0xc2, 0xdf) && i + 1 < n &&\n        InRange(p[i + 1], 0x80, 0xbf)) {\n      i += 2;\n      continue;\n    }\n\n    if (p[i] == 0xe0 && i + 2 < n && InRange(p[i + 1], 0xa0, 0xbf) &&\n        InRange(p[i + 2], 0x80, 0xbf)) {\n      i += 3;\n      continue;\n    }\n\n    if (InRange(p[i], 0xe1, 0xec) && i + 2 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf)) {\n      i += 3;\n      continue;\n    }\n\n    if (p[i] == 0xed && i + 2 < n && InRange(p[i + 1], 0x80, 0x9f) &&\n        InRange(p[i + 2], 0x80, 0xbf)) {\n      i += 3;\n      continue;\n    
}\n\n    if (InRange(p[i], 0xee, 0xef) && i + 2 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf)) {\n      i += 3;\n      continue;\n    }\n\n    if (p[i] == 0xf0 && i + 3 < n && InRange(p[i + 1], 0x90, 0xbf) &&\n        InRange(p[i + 2], 0x80, 0xbf) && InRange(p[i + 3], 0x80, 0xbf)) {\n      i += 4;\n      continue;\n    }\n\n    if (InRange(p[i], 0xf1, 0xf3) && i + 3 < n &&\n        InRange(p[i + 1], 0x80, 0xbf) && InRange(p[i + 2], 0x80, 0xbf) &&\n        InRange(p[i + 3], 0x80, 0xbf)) {\n      i += 4;\n      continue;\n    }\n\n    if (p[i] == 0xf4 && i + 3 < n && InRange(p[i + 1], 0x80, 0x8f) &&\n        InRange(p[i + 2], 0x80, 0xbf) && InRange(p[i + 3], 0x80, 0xbf)) {\n      i += 4;\n      continue;\n    }\n\n    return false;\n  }\n\n  return true;\n}\n\nbool IsGB2312(const std::string &text) {\n  int32_t n = static_cast<int32_t>(text.size());\n  int32_t i = 0;\n  const uint8_t *p = reinterpret_cast<const uint8_t *>(text.data());\n  while (i < n) {\n    if (p[i] <= 0x7f) {\n      i += 1;\n      continue;\n    }\n\n    if (InRange(p[i], 0xa1, 0xf7) && i + 1 < n &&\n        InRange(p[i + 1], 0xa1, 0xfe)) {\n      i += 2;\n      continue;\n    }\n\n    return false;\n  }\n\n  return true;\n}\n\n#if defined(_WIN32)\nstd::string Gb2312ToUtf8(const std::string &text) {\n  // https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-multibytetowidechar\n  // 936 is from\n  // https://learn.microsoft.com/en-us/windows/win32/intl/code-page-identifiers\n  // GB2312 -> 936\n  int32_t num_wchars =\n      MultiByteToWideChar(936, 0, text.c_str(), text.size(), nullptr, 0);\n  SHERPA_ONNX_LOGE(\"num of wchars: %d\", num_wchars);\n  if (num_wchars == 0) {\n    return {};\n  }\n\n  std::wstring wstr;\n  wstr.resize(num_wchars);\n  MultiByteToWideChar(936, 0, text.c_str(), text.size(), wstr.data(),\n                      num_wchars);\n  // 
https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte\n  // Pass the explicit length (not -1) so that the converted string does not\n  // include a terminating NUL character.\n  int32_t num_chars = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), num_wchars,\n                                          nullptr, 0, nullptr, nullptr);\n  if (num_chars == 0) {\n    return {};\n  }\n\n  std::string ans(num_chars, 0);\n  WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), num_wchars, ans.data(),\n                      num_chars, nullptr, nullptr);\n\n  return ans;\n}\n#endif\n\nstd::wstring ToWideString(const std::string &s) {\n  std::u32string u32 = Utf8ToUtf32(s);\n  std::wstring result;\n  result.reserve(u32.size());\n\n#if WCHAR_MAX > 0xFFFF\n  // wchar_t is 32-bit (Linux, macOS) — direct copy\n  for (char32_t cp : u32) {\n    result.push_back(static_cast<wchar_t>(cp));\n  }\n#else\n  // wchar_t is 16-bit (Windows) — encode surrogate pairs\n  for (char32_t cp : u32) {\n    if (cp <= 0xFFFF) {\n      result.push_back(static_cast<wchar_t>(cp));\n    } else {\n      cp -= 0x10000;\n      result.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));\n      result.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));\n    }\n  }\n#endif\n\n  return result;\n}\n\nstd::string ToString(const std::wstring &s) {\n  std::u32string u32;\n  u32.reserve(s.size());\n\n#if WCHAR_MAX > 0xFFFF\n  // wchar_t is 32-bit — direct copy\n  for (wchar_t wc : s) {\n    u32.push_back(static_cast<char32_t>(wc));\n  }\n#else\n  // wchar_t is 16-bit — decode surrogate pairs\n  for (size_t i = 0; i < s.size(); ++i) {\n    auto wc = static_cast<uint16_t>(s[i]);\n    if (wc >= 0xD800 && wc <= 0xDBFF) {\n      // High surrogate — look for matching low surrogate\n      if (i + 1 < s.size()) {\n        auto wc2 = static_cast<uint16_t>(s[i + 1]);\n        if (wc2 >= 0xDC00 && wc2 <= 0xDFFF) {\n          char32_t cp = 0x10000 + ((static_cast<char32_t>(wc - 0xD800) << 10) |\n                                   (wc2 - 0xDC00));\n          u32.push_back(cp);\n          ++i;\n          continue;\n        
}\n      }\n      // Unpaired high surrogate\n      u32.push_back(0xFFFD);\n    } else if (wc >= 0xDC00 && wc <= 0xDFFF) {\n      // Lone low surrogate\n      u32.push_back(0xFFFD);\n    } else {\n      u32.push_back(static_cast<char32_t>(wc));\n    }\n  }\n#endif\n\n  return Utf32ToUtf8(u32);\n}\n\nbool EndsWith(const std::string &haystack, const std::string &needle) {\n  if (needle.size() > haystack.size()) {\n    return false;\n  }\n\n  return std::equal(needle.rbegin(), needle.rend(), haystack.rbegin());\n}\n\nbool Contains(const std::string &haystack, const std::string &needle) {\n  if (needle.size() > haystack.size()) {\n    return false;\n  }\n\n  return haystack.find(needle) != std::string::npos;\n}\n\nstd::vector<std::string> SplitString(const std::string &s, int32_t chunk_size) {\n  std::vector<std::string> ans;\n  if (chunk_size < 1 || chunk_size > s.size()) {\n    ans.push_back(s);\n  } else {\n    int32_t n = static_cast<int32_t>(s.size());\n    int32_t i = 0;\n    while (i < n) {\n      int32_t end = std::min(i + chunk_size, n);\n      ans.push_back(s.substr(i, end - i));\n      i = end;\n    }\n  }\n  return ans;\n}\n\nstd::string Join(const std::vector<std::string> &ss, const std::string &delim) {\n  std::ostringstream oss;\n  if (!ss.empty()) {\n    oss << ss[0];\n    for (size_t i = 1; i < ss.size(); ++i) {\n      oss << delim << ss[i];\n    }\n  }\n  return oss.str();\n}\n\nstd::string Utf32ToUtf8(char32_t cp) {\n  // Clamp surrogates and out-of-range codepoints to U+FFFD\n  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {\n    cp = 0xFFFD;\n  }\n\n  std::string out;\n\n  if (cp <= 0x7F) {\n    out.push_back(static_cast<char>(cp));\n  } else if (cp <= 0x7FF) {\n    out.push_back(static_cast<char>(0xC0 | (cp >> 6)));\n    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n  } else if (cp <= 0xFFFF) {\n    out.push_back(static_cast<char>(0xE0 | (cp >> 12)));\n    out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));\n    
out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n  } else {\n    out.push_back(static_cast<char>(0xF0 | (cp >> 18)));\n    out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));\n    out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));\n    out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n  }\n\n  return out;\n}\n\nstd::u32string Utf8ToUtf32(const std::string &str) {\n  std::u32string out;\n  out.reserve(str.size());\n\n  const auto *p = reinterpret_cast<const uint8_t *>(str.data());\n  const auto *end = p + str.size();\n\n  // RFC 3629 / Unicode Table 3-7 validation with U+FFFD replacement\n  // for maximal subpart of ill-formed subsequence (Unicode 3.9)\n  while (p < end) {\n    uint8_t b0 = *p;\n    if (b0 <= 0x7F) {\n      // ASCII\n      out.push_back(static_cast<char32_t>(b0));\n      ++p;\n    } else if (InRange(b0, 0xC2, 0xDF)) {\n      // 2-byte: U+0080..U+07FF (C2..DF starts at C2 to reject overlongs)\n      if (p + 1 < end && InRange(p[1], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x1F) << 6) | (p[1] & 0x3F);\n        out.push_back(cp);\n        p += 2;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (b0 == 0xE0) {\n      // 3-byte: U+0800..U+0FFF — second byte must be A0..BF (reject overlongs)\n      if (p + 2 < end && InRange(p[1], 0xA0, 0xBF) &&\n          InRange(p[2], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x0F) << 12) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 6) | (p[2] & 0x3F);\n        out.push_back(cp);\n        p += 3;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (InRange(b0, 0xE1, 0xEC)) {\n      // 3-byte: U+1000..U+CFFF\n      if (p + 2 < end && InRange(p[1], 0x80, 0xBF) &&\n          InRange(p[2], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x0F) << 12) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 6) | (p[2] & 0x3F);\n       
 out.push_back(cp);\n        p += 3;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (b0 == 0xED) {\n      // 3-byte: U+D000..U+D7FF — second byte must be 80..9F (reject surrogates)\n      if (p + 2 < end && InRange(p[1], 0x80, 0x9F) &&\n          InRange(p[2], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x0F) << 12) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 6) | (p[2] & 0x3F);\n        out.push_back(cp);\n        p += 3;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (InRange(b0, 0xEE, 0xEF)) {\n      // 3-byte: U+E000..U+FFFF\n      if (p + 2 < end && InRange(p[1], 0x80, 0xBF) &&\n          InRange(p[2], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x0F) << 12) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 6) | (p[2] & 0x3F);\n        out.push_back(cp);\n        p += 3;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (b0 == 0xF0) {\n      // 4-byte: U+10000..U+3FFFF — second byte must be 90..BF (reject\n      // overlongs)\n      if (p + 3 < end && InRange(p[1], 0x90, 0xBF) &&\n          InRange(p[2], 0x80, 0xBF) && InRange(p[3], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x07) << 18) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 12) |\n                      (static_cast<char32_t>(p[2] & 0x3F) << 6) | (p[3] & 0x3F);\n        out.push_back(cp);\n        p += 4;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (InRange(b0, 0xF1, 0xF3)) {\n      // 4-byte: U+40000..U+FFFFF\n      if (p + 3 < end && InRange(p[1], 0x80, 0xBF) &&\n          InRange(p[2], 0x80, 0xBF) && InRange(p[3], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x07) << 18) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 12) |\n                      (static_cast<char32_t>(p[2] & 0x3F) << 6) | 
(p[3] & 0x3F);\n        out.push_back(cp);\n        p += 4;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else if (b0 == 0xF4) {\n      // 4-byte: U+100000..U+10FFFF — second byte must be 80..8F (reject >\n      // U+10FFFF)\n      if (p + 3 < end && InRange(p[1], 0x80, 0x8F) &&\n          InRange(p[2], 0x80, 0xBF) && InRange(p[3], 0x80, 0xBF)) {\n        char32_t cp = (static_cast<char32_t>(b0 & 0x07) << 18) |\n                      (static_cast<char32_t>(p[1] & 0x3F) << 12) |\n                      (static_cast<char32_t>(p[2] & 0x3F) << 6) | (p[3] & 0x3F);\n        out.push_back(cp);\n        p += 4;\n      } else {\n        out.push_back(0xFFFD);\n        ++p;\n      }\n    } else {\n      // Invalid lead byte (C0, C1, F5..FF, or bare continuation 80..BF)\n      out.push_back(0xFFFD);\n      ++p;\n    }\n  }\n\n  return out;\n}\n\nstd::string Utf32ToUtf8(const std::u32string &str) {\n  std::string out;\n  out.reserve(str.size() * 2);  // rough estimate\n\n  for (char32_t cp : str) {\n    // Clamp surrogates and out-of-range codepoints to U+FFFD\n    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {\n      cp = 0xFFFD;\n    }\n\n    if (cp <= 0x7F) {\n      out.push_back(static_cast<char>(cp));\n    } else if (cp <= 0x7FF) {\n      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));\n      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n    } else if (cp <= 0xFFFF) {\n      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));\n      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));\n      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n    } else {\n      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));\n      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));\n      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));\n      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));\n    }\n  }\n\n  return out;\n}\n\n// Helper: Convert ASCII chars in a std::string to uppercase 
(leaves non-ASCII\n// unchanged)\nstd::string ToUpperAscii(const std::string &str) {\n  std::string out = str;\n  for (char &c : out) {\n    unsigned char uc = static_cast<unsigned char>(c);\n    if (uc >= 'a' && uc <= 'z') {\n      c = static_cast<char>(uc - 'a' + 'A');\n    }\n  }\n  return out;\n}\n\n// Helper: Convert ASCII chars in a std::string to lowercase (leaves non-ASCII\n// unchanged)\nstd::string ToLowerAscii(const std::string &str) {\n  std::string out = str;\n  for (char &c : out) {\n    unsigned char uc = static_cast<unsigned char>(c);\n    if (uc >= 'A' && uc <= 'Z') {\n      c = static_cast<char>(uc - 'A' + 'a');\n    }\n  }\n  return out;\n}\n\n// Detect if a codepoint is a CJK character\nbool IsCJK(char32_t cp) {\n  return (cp >= 0x1100 && cp <= 0x11FF) || (cp >= 0x2E80 && cp <= 0xA4CF) ||\n         (cp >= 0xA840 && cp <= 0xD7AF) || (cp >= 0xF900 && cp <= 0xFAFF) ||\n         (cp >= 0xFE30 && cp <= 0xFE4F) || (cp >= 0xFF65 && cp <= 0xFFDC) ||\n         (cp >= 0x20000 && cp <= 0x2FFFF);\n}\n\nbool ContainsCJK(const std::string &text) {\n  std::u32string utf32_text = Utf8ToUtf32(text);\n  return ContainsCJK(utf32_text);\n}\n\nbool ContainsCJK(const std::u32string &text) {\n  for (char32_t cp : text) {\n    if (IsCJK(cp)) {\n      return true;\n    }\n  }\n  return false;\n}\n\nstd::string GetWord(const std::vector<std::string> &words, int32_t start,\n                    int32_t end) {\n  std::string ans;\n\n  int32_t ws = words.size();\n\n  if (start >= ws || end >= ws || start < 0 || end < 0) {\n    return ans;\n  }\n\n  for (int32_t i = start; i <= end; ++i) {\n    ans += words[i];\n  }\n\n  return ans;\n}\n\nbool IsAlphaOrPunct(int ch) { return std::isalpha(ch) || std::ispunct(ch); }\n\nbool IsPunct(const std::string &s) {\n  static const std::unordered_set<std::string> puncts = {\n      \",\",  \".\",  \"!\",  \"?\", \":\", \"\\\"\", \"'\", \"，\",\n      \"。\", \"！\", \"？\", \"“\", \"”\", \"‘\",  \"’\",\n  };\n  return 
puncts.count(s);\n}\n\nint32_t ToIntOrDefault(const std::string &s, int32_t default_value) {\n  if (s.empty()) return default_value;\n\n  std::string str = s;\n\n  // Remove surrounding quotes if present\n  if (str.size() >= 2 && str.front() == '\"' && str.back() == '\"') {\n    str = str.substr(1, str.size() - 2);\n  }\n\n  int32_t value = default_value;\n  auto [ptr, ec] = std::from_chars(str.data(), str.data() + str.size(), value);\n\n  // Check for conversion errors or trailing characters\n  if (ec != std::errc() || ptr != str.data() + str.size()) {\n    return default_value;\n  }\n\n  return value;\n}\n\nfloat ToFloatOrDefault(const std::string &s, float default_value) {\n  if (s.empty()) return default_value;\n\n  std::string str = s;\n\n  // Remove surrounding quotes if present\n  if (str.size() >= 2 && str.front() == '\"' && str.back() == '\"') {\n    str = str.substr(1, str.size() - 2);\n  }\n\n  char *end = nullptr;\n  errno = 0;\n  float value = std::strtof(str.c_str(), &end);\n\n  // No conversion or out of range\n  if (end == str.c_str() || errno == ERANGE) {\n    return default_value;\n  }\n\n  // Reject trailing garbage\n  if (*end != '\\0') {\n    return default_value;\n  }\n\n  return value;\n}\n\nvoid LengthsToMask(const std::vector<int64_t> &lengths,\n                   std::vector<float> *mask_flat,\n                   std::vector<int64_t> *mask_shape) {\n  if (lengths.empty()) {\n    mask_flat->clear();\n    mask_shape->assign({0, 1, 0});\n    return;\n  }\n\n  const int bsz = static_cast<int>(lengths.size());\n  const int64_t max_len = *std::max_element(lengths.begin(), lengths.end());\n  if (max_len < 0) {\n    SHERPA_ONNX_LOGE(\"LengthsToMask: max_len (%\" PRId64 \") < 0\", max_len);\n    SHERPA_ONNX_EXIT(-1);\n  }\n\n  mask_shape->assign({static_cast<int64_t>(lengths.size()), 1, max_len});\n\n  size_t total_size = static_cast<size_t>(bsz) * static_cast<size_t>(max_len);\n  mask_flat->assign(total_size, 0.0f);\n  for (int b = 0; b < bsz; 
++b) {\n    int64_t len = lengths[b];\n    float *batch_mask = mask_flat->data() + b * max_len;\n    std::fill_n(batch_mask, len, 1.0f);\n  }\n}\n\nstd::vector<std::string> SplitByBlankLines(const std::string &text) {\n  std::vector<std::string> paragraphs;\n  std::string cur;\n\n  auto flush = [&]() {\n    std::string s = Trim(cur);\n    if (!s.empty()) {\n      paragraphs.emplace_back(std::move(s));\n    }\n    cur.clear();\n  };\n\n  size_t start = 0;\n  const size_t n = text.size();\n\n  while (start <= n) {\n    size_t end = text.find('\\n', start);\n    if (end == std::string::npos) end = n;\n\n    std::string line = text.substr(start, end - start);\n    line = Trim(line);\n    if (line.empty()) {\n      flush();\n    } else {\n      if (!cur.empty()) cur.push_back(' ');\n      cur += line;\n    }\n\n    if (end == n) break;\n    start = end + 1;\n  }\n  flush();\n  if (paragraphs.empty()) {\n    std::string s = Trim(text);\n    if (!s.empty()) paragraphs.emplace_back(std::move(s));\n  }\n  return paragraphs;\n}\n\nnamespace {\n\nbool IsSentenceBoundary(char32_t c) {\n  return c == U'.' || c == U'!' || c == U'?' 
|| c == U'。' || c == U'！' ||\n         c == U'？';\n}\n\nbool IsChunkBoundary(char32_t c) {\n  return IsSentenceBoundary(c) || c == U',' || c == U';' || c == U':' ||\n         c == U'，' || c == U'；' || c == U'：';\n}\n\nbool IsSpace(char32_t c) {\n  return c == U' ' || c == U'\\t' || c == U'\\n' || c == U'\\r' || c == U'\\f' ||\n         c == U'\\v';\n}\n\nsize_t CountCodepoints(const std::string &s) { return Utf8ToUtf32(s).size(); }\n\nbool NeedSpaceBetween(const std::string &left, const std::string &right) {\n  if (left.empty() || right.empty()) {\n    return false;\n  }\n\n  auto left_u32 = Utf8ToUtf32(left);\n  auto right_u32 = Utf8ToUtf32(right);\n  if (left_u32.empty() || right_u32.empty()) {\n    return false;\n  }\n\n  char32_t last = left_u32.back();\n  char32_t first = right_u32.front();\n\n  if (IsSpace(last) || IsSpace(first)) {\n    return false;\n  }\n\n  if (IsCJK(last) || IsCJK(first) || IsChunkBoundary(last) ||\n      IsChunkBoundary(first)) {\n    return false;\n  }\n\n  return true;\n}\n\n}  // namespace\n\nstd::vector<std::string> SplitByPunctuation(const std::string &text) {\n  std::vector<std::string> sentences;\n  std::u32string cur;\n  auto flush = [&]() {\n    std::string s = Trim(Utf32ToUtf8(cur));\n    if (!s.empty()) sentences.emplace_back(std::move(s));\n    cur.clear();\n  };\n  for (char32_t c : Utf8ToUtf32(text)) {\n    cur.push_back(c);\n    if (IsSentenceBoundary(c)) {\n      flush();\n    }\n  }\n  flush();\n  return sentences;\n}\n\nstd::vector<std::string> MergeShortSentences(\n    const std::vector<std::string> &sentences, size_t min_chars) {\n  std::vector<std::string> merged;\n  std::string buffer;\n\n  for (const auto &s : sentences) {\n    std::string piece = Trim(s);\n    if (piece.empty()) {\n      continue;\n    }\n\n    if (!buffer.empty() && NeedSpaceBetween(buffer, piece)) {\n      buffer += \" \";\n    }\n    buffer += piece;\n\n    if (CountCodepoints(buffer) >= min_chars) {\n      merged.push_back(Trim(buffer));\n    
  buffer.clear();\n    }\n  }\n\n  if (!buffer.empty()) {\n    merged.push_back(Trim(buffer));\n  }\n\n  return merged;\n}\n\nstd::vector<std::string> SplitLongSentence(const std::string &sentence,\n                                           size_t max_chars) {\n  std::vector<std::string> chunks;\n  if (max_chars == 0) return chunks;\n  std::string s = Trim(sentence);\n  if (s.empty()) return chunks;\n\n  std::u32string u32 = Utf8ToUtf32(s);\n  size_t start = 0;\n  const size_t len = u32.size();\n  while (start < len) {\n    size_t end = std::min(start + max_chars, len);\n    if (end >= len) {\n      std::string piece = Trim(Utf32ToUtf8(u32.substr(start)));\n      if (!piece.empty()) {\n        chunks.emplace_back(std::move(piece));\n      }\n      break;\n    }\n\n    size_t split_pos = end;\n    bool found = false;\n    for (size_t i = end; i > start; --i) {\n      char32_t c = u32[i - 1];\n      if (IsSpace(c)) {\n        split_pos = i - 1;\n        found = true;\n        break;\n      }\n\n      if (IsChunkBoundary(c)) {\n        split_pos = i;\n        found = true;\n        break;\n      }\n    }\n\n    if (!found || split_pos <= start) {\n      split_pos = end;\n    }\n\n    std::string piece =\n        Trim(Utf32ToUtf8(u32.substr(start, split_pos - start)));\n    if (!piece.empty()) {\n      chunks.emplace_back(std::move(piece));\n    }\n\n    start = split_pos;\n    while (start < len && IsSpace(u32[start])) {\n      ++start;\n    }\n  }\n  return chunks;\n}\n\nstd::vector<std::string> ChunkText(const std::string &text, size_t max_len) {\n  std::vector<std::string> chunks;\n  if (max_len == 0) return chunks;\n\n  std::string text_single = Trim(text);\n  if (text_single.empty()) return chunks;\n\n  std::string cur;\n\n  auto flush = [&]() {\n    std::string s = Trim(cur);\n    if (!s.empty()) chunks.emplace_back(std::move(s));\n    cur.clear();\n  };\n\n  auto paragraphs = SplitByBlankLines(text_single);\n  for (const auto &para : paragraphs) {\n    auto 
sentences = SplitByPunctuation(para);\n    for (const auto &sent : sentences) {\n      auto pieces = SplitLongSentence(sent, max_len);\n      for (auto &p : pieces) {\n        if (p.empty()) continue;\n\n        if (cur.empty()) {\n          cur = std::move(p);\n          continue;\n        }\n\n        if (cur.size() + 1 + p.size() <= max_len) {\n          cur.push_back(' ');\n          cur += p;\n        } else {\n          flush();\n          cur = std::move(p);\n        }\n      }\n    }\n  }\n\n  flush();\n  if (chunks.empty()) chunks.emplace_back(std::move(text_single));\n  return chunks;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/text-utils.h",
    "content": "// sherpa-onnx/csrc/text-utils.h\n//\n// Copyright 2009-2011  Saarland University;  Microsoft Corporation\n// Copyright      2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_TEXT_UTILS_H_\n#define SHERPA_ONNX_CSRC_TEXT_UTILS_H_\n#include <errno.h>\n#include <stdlib.h>\n\n#include <limits>\n#include <string>\n#include <type_traits>\n#include <vector>\n\n#ifdef _MSC_VER\n#define SHERPA_ONNX_STRTOLL(cur_cstr, end_cstr) \\\n  _strtoi64(cur_cstr, end_cstr, 10);\n#else\n#define SHERPA_ONNX_STRTOLL(cur_cstr, end_cstr) strtoll(cur_cstr, end_cstr, 10);\n#endif\n\n// This file is copied/modified from\n// https://github.com/kaldi-asr/kaldi/blob/master/src/util/text-utils.h\n\nnamespace sherpa_onnx {\n\n/// Converts a string into an integer via strtoll and returns false if there was\n/// any kind of problem (i.e. the string was not an integer or contained extra\n/// non-whitespace junk, or the integer was too large to fit into the type it is\n/// being converted into).  Only sets *out if everything was OK and it returns\n/// true.\ntemplate <class Int>\nbool ConvertStringToInteger(const std::string &str, Int *out) {\n  // copied from kaldi/src/util/text-util.h\n  static_assert(std::is_integral<Int>::value, \"\");\n  const char *this_str = str.c_str();\n  char *end = nullptr;\n  errno = 0;\n  int64_t i = SHERPA_ONNX_STRTOLL(this_str, &end);\n  if (end != this_str) {\n    while (isspace(*end)) ++end;\n  }\n  if (end == this_str || *end != '\\0' || errno != 0) return false;\n  Int iInt = static_cast<Int>(i);\n  if (static_cast<int64_t>(iInt) != i ||\n      (i < 0 && !std::numeric_limits<Int>::is_signed)) {\n    return false;\n  }\n  *out = iInt;\n  return true;\n}\n\n/// Split a string using any of the single character delimiters.\n/// If omit_empty_strings == true, the output will contain any\n/// nonempty strings after splitting on any of the\n/// characters in the delimiter.  
If omit_empty_strings == false,\n/// the output will contain n+1 strings if there are n characters\n/// in the set \"delim\" within the input string.  In this case\n/// the empty string is split to a single empty string.\nvoid SplitStringToVector(const std::string &full, const char *delim,\n                         bool omit_empty_strings,\n                         std::vector<std::string> *out);\n\n/// Trim leading and trailing whitespace from a string.\nstd::string Trim(const std::string &str);\n\n/// Split a string by a single character delimiter, trim whitespace from each\n/// part, and remove empty strings. This is a convenience wrapper around\n/// SplitStringToVector with trimming and filtering.\nstd::vector<std::string> SplitStringAndTrim(const std::string &str, char delim);\n\n/**\n  \\brief Split a string (e.g. 1:2:3) into a vector of integers.\n\n  \\param [in]  delim  String containing a list of characters, any of which\n                      is allowed as a delimiter.\n  \\param [in] omit_empty_strings If true, empty strings between delimiters are\n                      allowed and will not produce an output integer; if false,\n                      instances of characters in 'delim' that are consecutive or\n                      at the start or end of the string would be an error.\n                      You'll normally want this to be true if 'delim' consists\n                      of spaces, and false otherwise.\n  \\param [out] out   The output list of integers.\n*/\ntemplate <class I>\nbool SplitStringToIntegers(const std::string &full, const char *delim,\n                           bool omit_empty_strings,  // typically false [but\n                                                     // should probably be true\n                                                     // if \"delim\" is spaces].\n                           std::vector<I> *out) {\n  static_assert(std::is_integral<I>::value, \"\");\n  if (*(full.c_str()) == '\\0') {\n    out->clear();\n   
 return true;\n  }\n  std::vector<std::string> split;\n  SplitStringToVector(full, delim, omit_empty_strings, &split);\n  out->resize(split.size());\n  for (size_t i = 0; i < split.size(); i++) {\n    const char *this_str = split[i].c_str();\n    char *end = NULL;\n    int64_t j = 0;\n    j = SHERPA_ONNX_STRTOLL(this_str, &end);\n    if (end == this_str || *end != '\\0') {\n      out->clear();\n      return false;\n    } else {\n      I jI = static_cast<I>(j);\n      if (static_cast<int64_t>(jI) != j) {\n        // output type cannot fit this integer.\n        out->clear();\n        return false;\n      }\n      (*out)[i] = jI;\n    }\n  }\n  return true;\n}\n\n// This is defined for F = float and double.\ntemplate <class F>\nbool SplitStringToFloats(const std::string &full, const char *delim,\n                         bool omit_empty_strings,  // typically false\n                         std::vector<F> *out);\n\n// This is defined for F = float and double.\ntemplate <typename T>\nbool ConvertStringToReal(const std::string &str, T *out);\n\nstd::vector<std::string> SplitUtf8(const std::string &text);\n\nstd::string ToLowerCase(const std::string &s);\nvoid ToLowerCase(std::string *in_out);\n\nstd::wstring ToLowerCase(const std::wstring &s);\n\nstd::string RemoveInvalidUtf8Sequences(const std::string &text,\n                                       bool show_debug_msg = false);\n\n// Return true if text contains valid utf8 sequence.\n// Return false otherwise\nbool IsUtf8(const std::string &text);\n\n// Return true if text contains valid gb2312 encoded sequence\n// Return false otherwise\nbool IsGB2312(const std::string &text);\n\n#if defined(_WIN32)\nstd::string Gb2312ToUtf8(const std::string &text);\n#endif\n\nstd::wstring ToWideString(const std::string &s);\n\nstd::string ToString(const std::wstring &s);\n\nbool EndsWith(const std::string &haystack, const std::string &needle);\n\nbool Contains(const std::string &haystack, const std::string 
&needle);\n\nstd::vector<std::string> SplitString(const std::string &s, int32_t chunk_size);\n\nstd::string Join(const std::vector<std::string> &ss,\n                 const std::string &delim = \"\");\n\n// Converts a UTF-8 std::string to a UTF-32 std::u32string\nstd::u32string Utf8ToUtf32(const std::string &str);\n\n// Converts a UTF-32 std::u32string to a UTF-8 std::string\nstd::string Utf32ToUtf8(const std::u32string &str);\n\n// Converts a single UTF-32 codepoint to a UTF-8 std::string\nstd::string Utf32ToUtf8(char32_t cp);\n\n// Helper: Convert ASCII chars in a std::string to uppercase (leaves non-ASCII\n// unchanged)\nstd::string ToUpperAscii(const std::string &str);\n\n// Helper: Convert ASCII chars in a std::string to lowercase (leaves non-ASCII\n// unchanged)\nstd::string ToLowerAscii(const std::string &str);\n\nbool IsAlphaOrPunct(int ch);\n\n// Detect if a codepoint is a CJK character\nbool IsCJK(char32_t cp);\n\nbool ContainsCJK(const std::string &text);\n\nbool ContainsCJK(const std::u32string &text);\n\nbool StringToBool(const std::string &s);\n\n// end is inclusive\nstd::string GetWord(const std::vector<std::string> &words, int32_t start,\n                    int32_t end);\n\nbool IsPunct(const std::string &s);\n\n#if defined(_WIN32)\n#define SHERPA_ONNX_TO_ORT_PATH(s) (ToWideString(s).c_str())\n#else\n#define SHERPA_ONNX_TO_ORT_PATH(s) ((s).c_str())\n#endif\n\nint32_t ToIntOrDefault(const std::string &s, int32_t default_value);\n\nfloat ToFloatOrDefault(const std::string &s, float default_value);\n\n// Convert lengths to flat mask + shape. 
Outputs [batch, 1, max_len] format\n// where mask[b][0][i] = 1.0 if i < lengths[b], else 0.0.\nvoid LengthsToMask(const std::vector<int64_t> &lengths,\n                   std::vector<float> *mask_flat,\n                   std::vector<int64_t> *mask_shape);\n\n// TTS text chunking helpers.\nstd::vector<std::string> SplitByBlankLines(const std::string &text);\nstd::vector<std::string> SplitByPunctuation(const std::string &text);\nstd::vector<std::string> MergeShortSentences(\n    const std::vector<std::string> &sentences, size_t min_chars);\nstd::vector<std::string> SplitLongSentence(const std::string &sentence,\n                                           size_t max_chars);\nstd::vector<std::string> ChunkText(const std::string &text, size_t max_len);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TEXT_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/text2token-test.cc",
    "content": "// sherpa-onnx/csrc/text2token-test.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <fstream>\n#include <memory>\n#include <sstream>\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/utils.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\n// Please refer to\n// https://github.com/pkufool/sherpa-test-data\n// to download test data for testing\nstatic const char dir[] = \"/tmp/sherpa-test-data\";\n\nTEST(TEXT2TOKEN, TEST_cjkchar) {\n  std::ostringstream oss;\n  oss << dir << \"/text2token/tokens_cn.txt\";\n\n  std::string tokens = oss.str();\n\n  if (!std::ifstream(tokens).good()) {\n    SHERPA_ONNX_LOGE(\n        \"No test data found, skipping TEST_cjkchar().\"\n        \"You can download the test data by: \"\n        \"git clone https://github.com/pkufool/sherpa-test-data.git \"\n        \"/tmp/sherpa-test-data\");\n    return;\n  }\n\n  auto sym_table = SymbolTable(tokens);\n\n  std::string text =\n      \"世界人民大团结\\n中国 V S 美国\\n\\n\";  // Test blank lines also\n\n  std::istringstream iss(text);\n\n  std::vector<std::vector<int32_t>> ids;\n  std::vector<float> scores;\n\n  auto r = EncodeHotwords(iss, \"cjkchar\", sym_table, nullptr, &ids, &scores);\n\n  std::vector<std::vector<int32_t>> expected_ids(\n      {{379, 380, 72, 874, 93, 1251, 489}, {262, 147, 3423, 2476, 21, 147}});\n  EXPECT_EQ(ids, expected_ids);\n\n  EXPECT_EQ(scores.size(), 0);\n}\n\nTEST(TEXT2TOKEN, TEST_bpe) {\n  std::ostringstream oss;\n  oss << dir << \"/text2token/tokens_en.txt\";\n  std::string tokens = oss.str();\n  oss.clear();\n  oss.str(\"\");\n  oss << dir << \"/text2token/bpe_en.vocab\";\n  std::string bpe = oss.str();\n  if (!std::ifstream(tokens).good() || !std::ifstream(bpe).good()) {\n    SHERPA_ONNX_LOGE(\n        \"No test data found, skipping TEST_bpe().\"\n        \"You can download the test data by: \"\n        \"git clone 
https://github.com/pkufool/sherpa-test-data.git \"\n        \"/tmp/sherpa-test-data\");\n    return;\n  }\n\n  auto sym_table = SymbolTable(tokens);\n  auto bpe_processor = std::make_unique<ssentencepiece::Ssentencepiece>(bpe);\n\n  std::string text = \"HELLO WORLD\\nI LOVE YOU :2.0\";\n\n  std::istringstream iss(text);\n\n  std::vector<std::vector<int32_t>> ids;\n  std::vector<float> scores;\n\n  auto r =\n      EncodeHotwords(iss, \"bpe\", sym_table, bpe_processor.get(), &ids, &scores);\n\n  std::vector<std::vector<int32_t>> expected_ids(\n      {{22, 58, 24, 425}, {19, 370, 47}});\n  EXPECT_EQ(ids, expected_ids);\n\n  std::vector<float> expected_scores({0, 2.0});\n  EXPECT_EQ(scores, expected_scores);\n}\n\nTEST(TEXT2TOKEN, TEST_cjkchar_bpe) {\n  std::ostringstream oss;\n  oss << dir << \"/text2token/tokens_mix.txt\";\n  std::string tokens = oss.str();\n  oss.clear();\n  oss.str(\"\");\n  oss << dir << \"/text2token/bpe_mix.vocab\";\n  std::string bpe = oss.str();\n  if (!std::ifstream(tokens).good() || !std::ifstream(bpe).good()) {\n    SHERPA_ONNX_LOGE(\n        \"No test data found, skipping TEST_cjkchar_bpe().\"\n        \"You can download the test data by: \"\n        \"git clone https://github.com/pkufool/sherpa-test-data.git \"\n        \"/tmp/sherpa-test-data\");\n    return;\n  }\n\n  auto sym_table = SymbolTable(tokens);\n  auto bpe_processor = std::make_unique<ssentencepiece::Ssentencepiece>(bpe);\n\n  std::string text = \"世界人民 GOES TOGETHER :1.5\\n中国 GOES WITH 美国 :0.5\";\n\n  std::istringstream iss(text);\n\n  std::vector<std::vector<int32_t>> ids;\n  std::vector<float> scores;\n\n  auto r = EncodeHotwords(iss, \"cjkchar+bpe\", sym_table, bpe_processor.get(),\n                          &ids, &scores);\n\n  std::vector<std::vector<int32_t>> expected_ids(\n      {{1368, 1392, 557, 680, 275, 178, 475},\n       {685, 736, 275, 178, 179, 921, 736}});\n  EXPECT_EQ(ids, expected_ids);\n\n  std::vector<float> expected_scores({1.5, 0.5});\n  EXPECT_EQ(scores, 
expected_scores);\n}\n\nTEST(TEXT2TOKEN, TEST_bbpe) {\n  std::ostringstream oss;\n  oss << dir << \"/text2token/tokens_bbpe.txt\";\n  std::string tokens = oss.str();\n  oss.clear();\n  oss.str(\"\");\n  oss << dir << \"/text2token/bbpe.vocab\";\n  std::string bpe = oss.str();\n  if (!std::ifstream(tokens).good() || !std::ifstream(bpe).good()) {\n    SHERPA_ONNX_LOGE(\n        \"No test data found, skipping TEST_bbpe().\"\n        \"You can download the test data by: \"\n        \"git clone https://github.com/pkufool/sherpa-test-data.git \"\n        \"/tmp/sherpa-test-data\");\n    return;\n  }\n\n  auto sym_table = SymbolTable(tokens);\n  auto bpe_processor = std::make_unique<ssentencepiece::Ssentencepiece>(bpe);\n\n  std::string text = \"频繁 :1.0\\n李鞑靼\";\n\n  std::istringstream iss(text);\n\n  std::vector<std::vector<int32_t>> ids;\n  std::vector<float> scores;\n\n  auto r =\n      EncodeHotwords(iss, \"bpe\", sym_table, bpe_processor.get(), &ids, &scores);\n\n  std::vector<std::vector<int32_t>> expected_ids(\n      {{259, 1118, 234, 188, 132}, {259, 1585, 236, 161, 148, 236, 160, 191}});\n  EXPECT_EQ(ids, expected_ids);\n\n  std::vector<float> expected_scores({1.0, 0});\n  EXPECT_EQ(scores, expected_scores);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/timer.cc",
    "content": "// sherpa-onnx/csrc/timer.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/timer.h\"\n\n#include <chrono>\n#include <memory>\n\nnamespace sherpa_onnx {\n\n// modified from https://github.com/kaldi-asr/kaldi/blob/master/src/base/timer.h\nclass Timer::Impl {\n public:\n  Impl() { Reset(); }\n\n  using high_resolution_clock = std::chrono::high_resolution_clock;\n\n  void Reset() { begin_ = high_resolution_clock::now(); }\n\n  // Return time in seconds\n  double Elapsed() {\n    auto end = high_resolution_clock::now();\n    auto diff =\n        std::chrono::duration_cast<std::chrono::microseconds>(end - begin_);\n    return diff.count() / 1000000.0;\n  }\n\n private:\n  high_resolution_clock::time_point begin_;\n};\n\nTimer::Timer() : impl_(std::make_unique<Impl>()) {}\n\nTimer::~Timer() = default;\n\nvoid Timer::Reset() const { impl_->Reset(); }\n\n// Return time in seconds\ndouble Timer::Elapsed() const { return impl_->Elapsed(); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/timer.h",
    "content": "// sherpa-onnx/csrc/timer.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_TIMER_H_\n#define SHERPA_ONNX_CSRC_TIMER_H_\n\n#include <memory>\n\nnamespace sherpa_onnx {\n\nclass Timer {\n public:\n  Timer();\n  ~Timer();\n\n  void Reset() const;\n\n  // Return time in seconds\n  double Elapsed() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TIMER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/transducer-keyword-decoder.cc",
    "content": "// sherpa-onnx/csrc/transducer-keywords-decoder.cc\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/transducer-keyword-decoder.h\"\n\n#include <algorithm>\n#include <cmath>\n#include <cstring>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTransducerKeywordResult TransducerKeywordDecoder::GetEmptyResult() const {\n  int32_t context_size = model_->ContextSize();\n  int32_t blank_id = 0;  // always 0\n  TransducerKeywordResult r;\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = blank_id;\n\n  Hypotheses blank_hyp({{blanks, 0}});\n  r.hyps = std::move(blank_hyp);\n  return r;\n}\n\nvoid TransducerKeywordDecoder::Decode(\n    Ort::Value encoder_out, OnlineStream **ss,\n    std::vector<TransducerKeywordResult> *result) {\n  std::vector<int64_t> encoder_out_shape =\n      encoder_out.GetTensorTypeAndShapeInfo().GetShape();\n\n  if (encoder_out_shape[0] != result->size()) {\n    SHERPA_ONNX_LOGE(\n        \"Size mismatch! 
encoder_out.size(0) %d, result.size(0): %d\\n\",\n        static_cast<int32_t>(encoder_out_shape[0]),\n        static_cast<int32_t>(result->size()));\n    exit(-1);\n  }\n\n  int32_t batch_size = static_cast<int32_t>(encoder_out_shape[0]);\n\n  int32_t num_frames = static_cast<int32_t>(encoder_out_shape[1]);\n  int32_t vocab_size = model_->VocabSize();\n  int32_t context_size = model_->ContextSize();\n  std::vector<int64_t> blanks(context_size, -1);\n  blanks.back() = 0;  // blank_id is hardcoded to 0\n\n  std::vector<Hypotheses> cur;\n  for (auto &r : *result) {\n    cur.push_back(std::move(r.hyps));\n  }\n  std::vector<Hypothesis> prev;\n\n  for (int32_t t = 0; t != num_frames; ++t) {\n    // Due to merging paths with identical token sequences,\n    // not all utterances have \"num_active_paths\" paths.\n    auto hyps_row_splits = GetHypsRowSplits(cur);\n    int32_t num_hyps =\n        hyps_row_splits.back();  // total num hyps for all utterance\n    prev.clear();\n    for (auto &hyps : cur) {\n      for (auto &h : hyps) {\n        prev.push_back(std::move(h.second));\n      }\n    }\n    cur.clear();\n    cur.reserve(batch_size);\n\n    Ort::Value decoder_input = model_->BuildDecoderInput(prev);\n    Ort::Value decoder_out = model_->RunDecoder(std::move(decoder_input));\n\n    Ort::Value cur_encoder_out =\n        GetEncoderOutFrame(model_->Allocator(), &encoder_out, t);\n    cur_encoder_out =\n        Repeat(model_->Allocator(), &cur_encoder_out, hyps_row_splits);\n    Ort::Value logit =\n        model_->RunJoiner(std::move(cur_encoder_out), View(&decoder_out));\n\n    float *p_logit = logit.GetTensorMutableData<float>();\n    LogSoftmax(p_logit, vocab_size, num_hyps);\n\n    // The acoustic logprobs for current frame\n    std::vector<float> logprobs(vocab_size * num_hyps);\n    std::memcpy(logprobs.data(), p_logit,\n                sizeof(float) * vocab_size * num_hyps);\n\n    // now p_logit contains log_softmax output, we rename it to p_logprob\n    // to 
match what it actually contains\n    float *p_logprob = p_logit;\n\n    // add log_prob of each hypothesis to p_logprob before taking top_k\n    for (int32_t i = 0; i != num_hyps; ++i) {\n      float log_prob = prev[i].log_prob;\n      for (int32_t k = 0; k != vocab_size; ++k, ++p_logprob) {\n        *p_logprob += log_prob;\n      }\n    }\n    p_logprob = p_logit;  // we changed p_logprob in the above for loop\n\n    for (int32_t b = 0; b != batch_size; ++b) {\n      int32_t frame_offset = (*result)[b].frame_offset;\n      int32_t start = hyps_row_splits[b];\n      int32_t end = hyps_row_splits[b + 1];\n      auto topk =\n          TopkIndex(p_logprob, vocab_size * (end - start), max_active_paths_);\n\n      Hypotheses hyps;\n      for (auto k : topk) {\n        int32_t hyp_index = k / vocab_size + start;\n        int32_t new_token = k % vocab_size;\n\n        Hypothesis new_hyp = prev[hyp_index];\n        float context_score = 0;\n        auto context_state = new_hyp.context_state;\n\n        // blank is hardcoded to 0\n        // also, it treats unk as blank\n        if (new_token != 0 && new_token != unk_id_) {\n          new_hyp.ys.push_back(new_token);\n          new_hyp.timestamps.push_back(t + frame_offset);\n          new_hyp.ys_probs.push_back(\n              exp(logprobs[hyp_index * vocab_size + new_token]));\n\n          new_hyp.num_trailing_blanks = 0;\n          auto context_res = ss[b]->GetContextGraph()->ForwardOneStep(\n              context_state, new_token);\n          context_score = std::get<0>(context_res);\n          new_hyp.context_state = std::get<1>(context_res);\n          // Start matching from the start state, forget the decoder history.\n          if (new_hyp.context_state->token == -1) {\n            new_hyp.ys = blanks;\n            new_hyp.timestamps.clear();\n            new_hyp.ys_probs.clear();\n          }\n        } else {\n          ++new_hyp.num_trailing_blanks;\n        }\n        new_hyp.log_prob = p_logprob[k] + 
context_score;\n        hyps.Add(std::move(new_hyp));\n      }  // for (auto k : topk)\n\n      auto best_hyp = hyps.GetMostProbable(false);\n\n      auto status = ss[b]->GetContextGraph()->IsMatched(best_hyp.context_state);\n      bool matched = std::get<0>(status);\n      const ContextState *matched_state = std::get<1>(status);\n\n      if (matched) {\n        float ys_prob = 0.0;\n        for (int32_t i = 0; i < matched_state->level; ++i) {\n          ys_prob += best_hyp.ys_probs[i];\n        }\n        ys_prob /= matched_state->level;\n        if (best_hyp.num_trailing_blanks > num_trailing_blanks_ &&\n            ys_prob >= matched_state->ac_threshold) {\n          auto &r = (*result)[b];\n          r.tokens = {best_hyp.ys.end() - matched_state->level,\n                      best_hyp.ys.end()};\n          r.timestamps = {best_hyp.timestamps.end() - matched_state->level,\n                          best_hyp.timestamps.end()};\n          r.keyword = matched_state->phrase;\n\n          hyps = Hypotheses({{blanks, 0, ss[b]->GetContextGraph()->Root()}});\n        }\n      }\n      cur.push_back(std::move(hyps));\n      p_logprob += (end - start) * vocab_size;\n    }  // for (int32_t b = 0; b != batch_size; ++b)\n  }\n\n  for (int32_t b = 0; b != batch_size; ++b) {\n    auto &hyps = cur[b];\n    auto best_hyp = hyps.GetMostProbable(false);\n    auto &r = (*result)[b];\n    r.hyps = std::move(hyps);\n    r.num_trailing_blanks = best_hyp.num_trailing_blanks;\n    r.frame_offset += num_frames;\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/transducer-keyword-decoder.h",
    "content": "// sherpa-onnx/csrc/transducer-keywords-decoder.h\n//\n// Copyright (c)  2023-2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_TRANSDUCER_KEYWORD_DECODER_H_\n#define SHERPA_ONNX_CSRC_TRANSDUCER_KEYWORD_DECODER_H_\n\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-stream.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model.h\"\n\nnamespace sherpa_onnx {\n\nstruct TransducerKeywordResult {\n  /// Number of frames after subsampling we have decoded so far\n  int32_t frame_offset = 0;\n\n  /// The decoded token IDs for keywords\n  std::vector<int64_t> tokens;\n\n  /// The triggered keyword\n  std::string keyword;\n\n  /// number of trailing blank frames decoded so far\n  int32_t num_trailing_blanks = 0;\n\n  /// timestamps[i] contains the output frame index where tokens[i] is decoded.\n  std::vector<int32_t> timestamps;\n\n  // used only in modified beam_search\n  Hypotheses hyps;\n};\n\nclass TransducerKeywordDecoder {\n public:\n  TransducerKeywordDecoder(OnlineTransducerModel *model,\n                           int32_t max_active_paths,\n                           int32_t num_trailing_blanks, int32_t unk_id)\n      : model_(model),\n        max_active_paths_(max_active_paths),\n        num_trailing_blanks_(num_trailing_blanks),\n        unk_id_(unk_id) {}\n\n  TransducerKeywordResult GetEmptyResult() const;\n\n  void Decode(Ort::Value encoder_out, OnlineStream **ss,\n              std::vector<TransducerKeywordResult> *result);\n\n private:\n  OnlineTransducerModel *model_;  // Not owned\n\n  int32_t max_active_paths_;\n  int32_t num_trailing_blanks_;\n  int32_t unk_id_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TRANSDUCER_KEYWORD_DECODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/transpose-test.cc",
    "content": "// sherpa-onnx/csrc/transpose-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/transpose.h\"\n\n#include <numeric>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Transpose, Tranpose01) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{3, 2, 5};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  std::iota(p, p + shape[0] * shape[1] * shape[2], 0);\n\n  auto ans = Transpose01(allocator, &v);\n  auto v2 = Transpose01(allocator, &ans);\n\n  Print3D(&v);\n  Print3D(&ans);\n  Print3D(&v2);\n\n  const float *q = v2.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1] * shape[2]);\n       ++i) {\n    EXPECT_EQ(p[i], q[i]);\n  }\n}\n\nTEST(Transpose, Tranpose12) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{3, 2, 5};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  std::iota(p, p + shape[0] * shape[1] * shape[2], 0);\n\n  auto ans = Transpose12(allocator, &v);\n  auto v2 = Transpose12(allocator, &ans);\n\n  Print3D(&v);\n  Print3D(&ans);\n  Print3D(&v2);\n\n  const float *q = v2.GetTensorData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1] * shape[2]);\n       ++i) {\n    EXPECT_EQ(p[i], q[i]);\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/transpose.cc",
    "content": "// sherpa-onnx/csrc/transpose.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/transpose.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <vector>\n\nnamespace sherpa_onnx {\n\ntemplate <typename T /*=float*/>\nOrt::Value Transpose01(OrtAllocator *allocator, const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape.size() == 3);\n\n  std::array<int64_t, 3> ans_shape{shape[1], shape[0], shape[2]};\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n\n  T *dst = ans.GetTensorMutableData<T>();\n  auto plane_offset = shape[1] * shape[2];\n\n  for (int64_t i = 0; i != ans_shape[0]; ++i) {\n    const T *src = v->GetTensorData<T>() + i * shape[2];\n    for (int64_t k = 0; k != ans_shape[1]; ++k) {\n      std::copy(src, src + shape[2], dst);\n      src += plane_offset;\n      dst += shape[2];\n    }\n  }\n\n  return ans;\n}\n\ntemplate <typename T /*= float*/>\nOrt::Value Transpose12(OrtAllocator *allocator, const Ort::Value *v) {\n  std::vector<int64_t> shape = v->GetTensorTypeAndShapeInfo().GetShape();\n  assert(shape.size() == 3);\n\n  std::array<int64_t, 3> ans_shape{shape[0], shape[2], shape[1]};\n  Ort::Value ans = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n  T *dst = ans.GetTensorMutableData<T>();\n  auto row_stride = shape[2];\n  for (int64_t b = 0; b != ans_shape[0]; ++b) {\n    const T *src = v->GetTensorData<T>() + b * shape[1] * shape[2];\n    for (int64_t i = 0; i != ans_shape[1]; ++i) {\n      for (int64_t k = 0; k != ans_shape[2]; ++k, ++dst) {\n        *dst = (src + k * row_stride)[i];\n      }\n    }\n  }\n\n  return ans;\n}\n\ntemplate Ort::Value Transpose01<float>(OrtAllocator *allocator,\n                                       const Ort::Value *v);\n\ntemplate 
Ort::Value Transpose12<float>(OrtAllocator *allocator,\n                                       const Ort::Value *v);\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/transpose.h",
    "content": "// sherpa-onnx/csrc/transpose.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_TRANSPOSE_H_\n#define SHERPA_ONNX_CSRC_TRANSPOSE_H_\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n/** Transpose a 3-D tensor from shape (B, T, C) to (T, B, C).\n *\n * @param allocator\n * @param v A 3-D tensor of shape (B, T, C). Its data type is type.\n *\n * @return Return a 3-D tensor of shape (T, B, C). Its data type is type.\n */\ntemplate <typename type = float>\nOrt::Value Transpose01(OrtAllocator *allocator, const Ort::Value *v);\n\n/** Transpose a 3-D tensor from shape (B, T, C) to (B, C, T).\n *\n * @param allocator\n * @param v A 3-D tensor of shape (B, T, C). Its data type is type.\n *\n * @return Return a 3-D tensor of shape (B, C, T). Its data type is type.\n */\ntemplate <typename type = float>\nOrt::Value Transpose12(OrtAllocator *allocator, const Ort::Value *v);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_TRANSPOSE_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/unbind-test.cc",
    "content": "// sherpa-onnx/csrc/unbind-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/unbind.h\"\n\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/cat.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(Ubind, Test1DTensors) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 1> shape{3};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    p[i] = i;\n  }\n  auto ans = Unbind(allocator, &v, 0);\n  EXPECT_EQ(ans.size(), shape[0]);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    EXPECT_EQ(ans[i].GetTensorData<float>()[0], p[i]);\n  }\n  Print1D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    Print1D(&ans[i]);\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 0);\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\nTEST(Ubind, Test2DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 2> shape{3, 2};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1]); ++i) {\n    p[i] = i;\n  }\n  auto ans = Unbind(allocator, &v, 0);\n\n  Print2D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    Print2D(&ans[i]);\n  }\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    const float *pans = ans[i].GetTensorData<float>();\n    for (int32_t k = 0; k != 
static_cast<int32_t>(shape[1]); ++k, ++p) {\n      EXPECT_EQ(*p, pans[k]);\n    }\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 0);\n  Print2D(&v2);\n\n  p = v.GetTensorMutableData<float>();\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0] * shape[1]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\nTEST(Ubind, Test2DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 2> shape{3, 2};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1]); ++i) {\n    p[i] = i;\n  }\n  auto ans = Unbind(allocator, &v, 1);\n\n  Print2D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[1]); ++i) {\n    Print2D(&ans[i]);\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 1);\n  Print2D(&v2);\n\n  p = v.GetTensorMutableData<float>();\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0] * shape[1]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\nTEST(Ubind, Test3DTensorsDim0) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{3, 2, 5};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1] * shape[2]);\n       ++i) {\n    p[i] = i;\n  }\n  auto ans = Unbind(allocator, &v, 0);\n\n  Print3D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0]); ++i) {\n    Print3D(&ans[i]);\n  }\n\n  for (int32_t i = 0; i != 
static_cast<int32_t>(shape[0]); ++i) {\n    const float *pans = ans[i].GetTensorData<float>();\n    for (int32_t k = 0; k != static_cast<int32_t>(shape[1] * shape[2]);\n         ++k, ++p) {\n      EXPECT_EQ(*p, pans[k]);\n    }\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 0);\n  Print3D(&v2);\n\n  p = v.GetTensorMutableData<float>();\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0] * shape[1] * shape[2]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\nTEST(Ubind, Test3DTensorsDim1) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{3, 2, 5};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1] * shape[2]);\n       ++i) {\n    p[i] = i;\n  }\n  auto ans = Unbind(allocator, &v, 1);\n\n  Print3D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[1]); ++i) {\n    Print3D(&ans[i]);\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 1);\n  Print3D(&v2);\n\n  p = v.GetTensorMutableData<float>();\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0] * shape[1] * shape[2]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\nTEST(Ubind, Test3DTensorsDim2) {\n  Ort::AllocatorWithDefaultOptions allocator;\n  std::array<int64_t, 3> shape{3, 2, 5};\n  Ort::Value v =\n      Ort::Value::CreateTensor<float>(allocator, shape.data(), shape.size());\n  float *p = v.GetTensorMutableData<float>();\n\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[0] * shape[1] * shape[2]);\n       ++i) {\n    p[i] = i;\n  
}\n  auto ans = Unbind(allocator, &v, 2);\n\n  Print3D(&v);\n  for (int32_t i = 0; i != static_cast<int32_t>(shape[2]); ++i) {\n    Print3D(&ans[i]);\n  }\n\n  // For Cat\n  std::vector<const Ort::Value *> vec(ans.size());\n  for (int32_t i = 0; i != static_cast<int32_t>(vec.size()); ++i) {\n    vec[i] = &ans[i];\n  }\n  Ort::Value v2 = Cat(allocator, vec, 2);\n  Print3D(&v2);\n\n  p = v.GetTensorMutableData<float>();\n  const float *p2 = v2.GetTensorData<float>();\n  for (int32_t i = 0; i != shape[0] * shape[1] * shape[2]; ++i) {\n    EXPECT_EQ(p[i], p2[i]);\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/unbind.cc",
    "content": "// sherpa-onnx/csrc/unbind.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/unbind.h\"\n\n#include <algorithm>\n#include <cassert>\n#include <functional>\n#include <numeric>\n#include <utility>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n\nnamespace sherpa_onnx {\n\ntemplate <typename T /*= float*/>\nstd::vector<Ort::Value> Unbind(OrtAllocator *allocator, const Ort::Value *value,\n                               int32_t dim) {\n  std::vector<int64_t> shape = value->GetTensorTypeAndShapeInfo().GetShape();\n  assert(dim >= 0);\n  assert(dim < static_cast<int32_t>(shape.size()));\n  int32_t n = static_cast<int32_t>(shape[dim]);\n  if (n == 1) {\n    std::vector<Ort::Value> ans;\n    ans.push_back(Clone(allocator, value));\n    return ans;\n  }\n\n  std::vector<int64_t> ans_shape = shape;\n  ans_shape[dim] = 1;  // // Unlike torch, we keep the dim to 1\n\n  // allocator tensors\n  std::vector<Ort::Value> ans;\n  ans.reserve(n);\n  for (int32_t i = 0; i != n; ++i) {\n    Ort::Value t = Ort::Value::CreateTensor<T>(allocator, ans_shape.data(),\n                                               ans_shape.size());\n    ans.push_back(std::move(t));\n  }\n\n  auto leading_size = static_cast<int32_t>(std::accumulate(\n      shape.begin(), shape.begin() + dim, 1, std::multiplies<int64_t>()));\n\n  auto trailing_size = static_cast<int32_t>(std::accumulate(\n      shape.begin() + dim + 1, shape.end(), 1, std::multiplies<int64_t>()));\n\n  const T *src = value->GetTensorData<T>();\n\n  for (int32_t i = 0; i != leading_size; ++i) {\n    for (int32_t k = 0; k != n; ++k) {\n      T *dst = ans[k].GetTensorMutableData<T>() + i * trailing_size;\n      std::copy(src, src + trailing_size, dst);\n      src += trailing_size;\n    }\n  }\n\n  return ans;\n}\n\ntemplate std::vector<Ort::Value> Unbind<float>(OrtAllocator *allocator,\n                                               
const Ort::Value *value,\n                                               int32_t dim);\n\ntemplate std::vector<Ort::Value> Unbind<int64_t>(OrtAllocator *allocator,\n                                                 const Ort::Value *value,\n                                                 int32_t dim);\n\nstd::vector<Ort::Value> UnbindFloat16(OrtAllocator *allocator,\n                                      const Ort::Value *value, int32_t dim) {\n  std::vector<int64_t> shape = value->GetTensorTypeAndShapeInfo().GetShape();\n  assert(dim >= 0);\n  assert(dim < static_cast<int32_t>(shape.size()));\n  int32_t n = static_cast<int32_t>(shape[dim]);\n  if (n == 1) {\n    std::vector<Ort::Value> ans;\n    ans.push_back(Clone(allocator, value));\n    return ans;\n  }\n\n  std::vector<int64_t> ans_shape = shape;\n  ans_shape[dim] = 1;  // // Unlike torch, we keep the dim to 1\n\n  // allocator tensors\n  std::vector<Ort::Value> ans;\n  ans.reserve(n);\n  for (int32_t i = 0; i != n; ++i) {\n    Ort::Value t =\n        Ort::Value::CreateTensor(allocator, ans_shape.data(), ans_shape.size(),\n                                 ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16);\n    ans.push_back(std::move(t));\n  }\n\n  auto leading_size = static_cast<int32_t>(std::accumulate(\n      shape.begin(), shape.begin() + dim, 1, std::multiplies<int64_t>()));\n\n  auto trailing_size = static_cast<int32_t>(std::accumulate(\n      shape.begin() + dim + 1, shape.end(), 1, std::multiplies<int64_t>()));\n\n  using T = uint16_t;\n  const T *src = value->GetTensorData<T>();\n\n  for (int32_t i = 0; i != leading_size; ++i) {\n    for (int32_t k = 0; k != n; ++k) {\n      T *dst = ans[k].GetTensorMutableData<T>() + i * trailing_size;\n      std::copy(src, src + trailing_size, dst);\n      src += trailing_size;\n    }\n  }\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/unbind.h",
    "content": "// sherpa-onnx/csrc/unbind.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_UNBIND_H_\n#define SHERPA_ONNX_CSRC_UNBIND_H_\n\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n\nnamespace sherpa_onnx {\n\n/** It is similar to torch.unbind() but we keep the unbind dim to 1 in\n * the output\n *\n * @param allocator Allocator to allocate space for the returned tensor\n * @param value  The tensor to unbind\n * @param dim  The dim along which to unbind the tensor\n *\n * @return Return a list of tensors\n */\ntemplate <typename T = float>\nstd::vector<Ort::Value> Unbind(OrtAllocator *allocator, const Ort::Value *value,\n                               int32_t dim);\n\nstd::vector<Ort::Value> UnbindFloat16(OrtAllocator *allocator,\n                                      const Ort::Value *value, int32_t dim);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_UNBIND_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/utfcpp-test.cc",
    "content": "// sherpa-onnx/csrc/utfcpp-test.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include <cctype>\n#include <iostream>\n#include <string>\n#include <vector>\n\n#include \"gtest/gtest.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nTEST(UTF8, Case1) {\n  std::string hello = \"你好, 早上好！世界.  hello!。Hallo! how are you?\";\n  std::vector<std::string> ss = SplitUtf8(hello);\n  for (const auto &s : ss) {\n    std::cout << s << \"\\n\";\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/utils.cc",
    "content": "// sherpa-onnx/csrc/utils.cc\n//\n// Copyright      2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/utils.h\"\n\n#include <cassert>\n#include <iostream>\n#include <sstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/bbpe.h\"\n#include \"sherpa-onnx/csrc/log.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nstatic bool EncodeBase(const std::vector<std::string> &lines,\n                       const SymbolTable &symbol_table,\n                       std::vector<std::vector<int32_t>> *ids,\n                       std::vector<std::string> *phrases,\n                       std::vector<float> *scores,\n                       std::vector<float> *thresholds) {\n  ids->clear();\n\n  std::vector<int32_t> tmp_ids;\n  std::vector<float> tmp_scores;\n  std::vector<float> tmp_thresholds;\n  std::vector<std::string> tmp_phrases;\n\n  std::string word;\n  bool has_scores = false;\n  bool has_thresholds = false;\n  bool has_phrases = false;\n  bool has_oov = false;\n\n  for (const auto &line : lines) {\n    float score = 0;\n    float threshold = 0;\n    std::string phrase = \"\";\n\n    std::istringstream iss(line);\n    while (iss >> word) {\n      if (symbol_table.Contains(word)) {\n        int32_t id = symbol_table[word];\n        tmp_ids.push_back(id);\n      } else {\n        switch (word[0]) {\n          case ':':  // boosting score for current keyword\n            score = std::stof(word.substr(1));\n            has_scores = true;\n            break;\n          case '#':  // triggering threshold (probability) for current keyword\n            threshold = std::stof(word.substr(1));\n            has_thresholds = true;\n            break;\n          case '@':  // the original keyword string\n            phrase = word.substr(1);\n            has_phrases = true;\n            break;\n          default:\n            SHERPA_ONNX_LOGE(\n              
  \"Cannot find ID for token %s at line: %s. (Hint: Check the \"\n                \"tokens.txt see if %s in it)\",\n                word.c_str(), line.c_str(), word.c_str());\n            has_oov = true;\n            break;\n        }\n      }\n    }\n    ids->push_back(std::move(tmp_ids));\n    tmp_ids = {};\n    tmp_scores.push_back(score);\n    tmp_phrases.push_back(phrase);\n    tmp_thresholds.push_back(threshold);\n  }\n  if (scores != nullptr) {\n    if (has_scores) {\n      scores->swap(tmp_scores);\n    } else {\n      scores->clear();\n    }\n  }\n  if (phrases != nullptr) {\n    if (has_phrases) {\n      *phrases = std::move(tmp_phrases);\n    } else {\n      phrases->clear();\n    }\n  }\n  if (thresholds != nullptr) {\n    if (has_thresholds) {\n      thresholds->swap(tmp_thresholds);\n    } else {\n      thresholds->clear();\n    }\n  }\n  return !has_oov;\n}\n\nbool EncodeHotwords(std::istream &is, const std::string &modeling_unit,\n                    const SymbolTable &symbol_table,\n                    const ssentencepiece::Ssentencepiece *bpe_encoder,\n                    std::vector<std::vector<int32_t>> *hotwords,\n                    std::vector<float> *boost_scores) {\n  std::vector<std::string> lines;\n  std::string line;\n  std::string word;\n\n  while (std::getline(is, line)) {\n    std::string score;\n    std::string phrase;\n\n    std::ostringstream oss;\n    std::istringstream iss(line);\n    while (iss >> word) {\n      switch (word[0]) {\n        case ':':  // boosting score for current keyword\n          score = word;\n          break;\n        default:\n          if (!score.empty()) {\n            SHERPA_ONNX_LOGE(\n                \"Boosting score should be put after the words/phrase, given \"\n                \"%s.\",\n                line.c_str());\n            return false;\n          }\n          oss << \" \" << word;\n          break;\n      }\n    }\n    phrase = oss.str();\n    if (phrase.empty()) {\n      continue;\n    } 
else {\n      phrase = phrase.substr(1);\n    }\n    std::istringstream piss(phrase);\n    oss.clear();\n    oss.str(\"\");\n    while (piss >> word) {\n      if (modeling_unit == \"cjkchar\") {\n        for (const auto &w : SplitUtf8(word)) {\n          oss << \" \" << w;\n        }\n      } else if (modeling_unit == \"bpe\") {\n        std::vector<std::string> bpes;\n        bpe_encoder->Encode(word, &bpes);\n        for (const auto &bpe : bpes) {\n          oss << \" \" << bpe;\n        }\n      } else if (modeling_unit == \"bbpe\") {\n        std::vector<std::string> bpes;\n\n        const auto &id2token = GetByteBpeTableId2Token();\n        std::string tokens;\n        for (size_t i = 0; i < word.length(); ++i) {\n          uint8_t byte = static_cast<uint8_t>(word[i]);\n          tokens += id2token.at(byte);\n          if ((i + 1) % 3 == 0 && (i + 1) < word.length()) {\n            tokens += \" \";\n          }\n        }\n\n        bpe_encoder->Encode(tokens, &bpes);\n        for (const auto &bpe : bpes) {\n          oss << \" \" << bpe;\n        }\n      } else {\n        if (modeling_unit != \"cjkchar+bpe\") {\n          SHERPA_ONNX_LOGE(\n              \"modeling_unit should be one of bpe, bbpe, cjkchar or \"\n              \"cjkchar+bpe, given %s\",\n              modeling_unit.c_str());\n          exit(-1);\n        }\n        for (const auto &w : SplitUtf8(word)) {\n          if (isalpha(w[0])) {\n            std::vector<std::string> bpes;\n            bpe_encoder->Encode(w, &bpes);\n            for (const auto &bpe : bpes) {\n              oss << \" \" << bpe;\n            }\n          } else {\n            oss << \" \" << w;\n          }\n        }\n      }\n    }\n    std::string encoded_phrase = oss.str().substr(1);\n    oss.clear();\n    oss.str(\"\");\n    oss << encoded_phrase;\n    if (!score.empty()) {\n      oss << \" \" << score;\n    }\n    lines.push_back(oss.str());\n  }\n  return EncodeBase(lines, symbol_table, hotwords, 
nullptr, boost_scores,\n                    nullptr);\n}\n\nbool EncodeKeywords(std::istream &is, const SymbolTable &symbol_table,\n                    std::vector<std::vector<int32_t>> *keywords_id,\n                    std::vector<std::string> *keywords,\n                    std::vector<float> *boost_scores,\n                    std::vector<float> *threshold) {\n  std::vector<std::string> lines;\n  std::string line;\n  while (std::getline(is, line)) {\n    lines.push_back(line);\n  }\n  return EncodeBase(lines, symbol_table, keywords_id, keywords, boost_scores,\n                    threshold);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/utils.h",
    "content": "// sherpa-onnx/csrc/utils.h\n//\n// Copyright      2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_UTILS_H_\n#define SHERPA_ONNX_CSRC_UTILS_H_\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/symbol-table.h\"\n#include \"ssentencepiece/csrc/ssentencepiece.h\"\n\nnamespace sherpa_onnx {\n\n/* Encode the hotwords in an input stream into token ids.\n *\n * @param is The input stream, it contains several lines, one hotword for each\n *           line. A line may end with a boosting score (starting with :).\n * @param modeling_unit  The modeling unit used to tokenize each hotword. One\n *                       of cjkchar, bpe, bbpe, or cjkchar+bpe.\n * @param symbol_table  The tokens table mapping symbols to ids. All the symbols\n *                      in the stream should be in the symbol_table, if not this\n *                      function returns false.\n * @param bpe_encoder  The BPE encoder. Used when modeling_unit is bpe, bbpe,\n *                     or cjkchar+bpe.\n *\n * @param hotwords_id  The encoded ids to be written to.\n * @param boost_scores  The boosting score for each hotword to be written to.\n *\n * @return  If all the symbols from ``is`` are in the symbol_table, returns true\n *          otherwise returns false.\n */\nbool EncodeHotwords(std::istream &is, const std::string &modeling_unit,\n                    const SymbolTable &symbol_table,\n                    const ssentencepiece::Ssentencepiece *bpe_encoder,\n                    std::vector<std::vector<int32_t>> *hotwords_id,\n                    std::vector<float> *boost_scores);\n\n/* Encode the keywords in an input stream into token ids.\n *\n * @param is The input stream, it contains several lines, one keyword for each\n *           line. For each keyword, the tokens (cjkchar or bpe) are separated\n *           by spaces. A line might also contain a boosting score (starting\n *           with :), a triggering threshold (starting with #) and the keyword\n *           string (starting with @).\n * @param symbol_table  The tokens table mapping symbols to ids. 
All the symbols\n *                      in the stream should be in the symbol_table, if not this\n *                      function returns false.\n *\n * @param keywords_id The encoded ids to be written to.\n * @param keywords The original keyword string to be written to.\n * @param boost_scores  The boosting score for each keyword to be written to.\n * @param threshold  The triggering threshold for each keyword to be written to.\n *\n * @return  If all the symbols from ``is`` are in the symbol_table, returns true\n *          otherwise returns false.\n */\nbool EncodeKeywords(std::istream &is, const SymbolTable &symbol_table,\n                    std::vector<std::vector<int32_t>> *keywords_id,\n                    std::vector<std::string> *keywords,\n                    std::vector<float> *boost_scores,\n                    std::vector<float> *threshold);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_UTILS_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/vad-model-config.cc",
    "content": "// sherpa-onnx/csrc/vad-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/vad-model-config.h\"\n\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n\nnamespace sherpa_onnx {\n\nvoid VadModelConfig::Register(ParseOptions *po) {\n  silero_vad.Register(po);\n  ten_vad.Register(po);\n\n  po->Register(\"vad-sample-rate\", &sample_rate,\n               \"Sample rate expected by the VAD model\");\n\n  po->Register(\"vad-num-threads\", &num_threads,\n               \"Number of threads to run the VAD model\");\n\n  po->Register(\"vad-provider\", &provider,\n               \"Specify a provider to run the VAD model. Supported values: \"\n               \"cpu, cuda, coreml\");\n\n  po->Register(\"vad-debug\", &debug,\n               \"true to display debug information when loading vad models\");\n}\n\nbool VadModelConfig::Validate() const {\n  if (provider != \"rknn\") {\n    if (!silero_vad.model.empty() && EndsWith(silero_vad.model, \".rknn\")) {\n      SHERPA_ONNX_LOGE(\n          \"--vad-provider is %s, which is not rknn, but you pass an rknn \"\n          \"model '%s'\",\n          provider.c_str(), silero_vad.model.c_str());\n      return false;\n    }\n  }\n\n  if (provider == \"rknn\") {\n    if (!silero_vad.model.empty() && EndsWith(silero_vad.model, \".onnx\")) {\n      SHERPA_ONNX_LOGE(\n          \"--vad-provider is rknn, but you pass an onnx model '%s'\",\n          silero_vad.model.c_str());\n      return false;\n    }\n  }\n\n  if (!silero_vad.model.empty()) {\n    return silero_vad.Validate();\n  }\n\n  if (!ten_vad.model.empty()) {\n    return ten_vad.Validate();\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide one VAD model.\");\n\n  return false;\n}\n\nstd::string VadModelConfig::ToString() const {\n  std::ostringstream os;\n\n  os << \"VadModelConfig(\";\n  os << \"silero_vad=\" << silero_vad.ToString() << \", \";\n  os << \"ten_vad=\" 
<< ten_vad.ToString() << \", \";\n  os << \"sample_rate=\" << sample_rate << \", \";\n  os << \"num_threads=\" << num_threads << \", \";\n  os << \"provider=\\\"\" << provider << \"\\\", \";\n  os << \"debug=\" << (debug ? \"True\" : \"False\") << \")\";\n\n  return os.str();\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/vad-model-config.h",
    "content": "// sherpa-onnx/csrc/vad-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_CSRC_VAD_MODEL_CONFIG_H_\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/parse-options.h\"\n#include \"sherpa-onnx/csrc/silero-vad-model-config.h\"\n#include \"sherpa-onnx/csrc/ten-vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct VadModelConfig {\n  SileroVadModelConfig silero_vad;\n  TenVadModelConfig ten_vad;\n\n  int32_t sample_rate = 16000;\n  int32_t num_threads = 1;\n  std::string provider = \"cpu\";\n\n  // true to show debug information when loading models\n  bool debug = false;\n\n  VadModelConfig() = default;\n\n  VadModelConfig(const SileroVadModelConfig &silero_vad,\n                 const TenVadModelConfig &ten_vad, int32_t sample_rate,\n                 int32_t num_threads, const std::string &provider, bool debug)\n      : silero_vad(silero_vad),\n        ten_vad(ten_vad),\n        sample_rate(sample_rate),\n        num_threads(num_threads),\n        provider(provider),\n        debug(debug) {}\n\n  void Register(ParseOptions *po);\n  bool Validate() const;\n\n  std::string ToString() const;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/vad-model.cc",
    "content": "// sherpa-onnx/csrc/vad-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\n#include <memory>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_RKNN\n#include \"sherpa-onnx/csrc/rknn/silero-vad-model-rknn.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/silero-vad-model.h\"\n#include \"sherpa-onnx/csrc/ten-vad-model.h\"\n\nnamespace sherpa_onnx {\n\nstd::unique_ptr<VadModel> VadModel::Create(const VadModelConfig &config) {\n  if (config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.silero_vad.model.empty()) {\n      return std::make_unique<SileroVadModelRknn>(config);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only silero-vad is supported for RKNN at present\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n\n  if (!config.silero_vad.model.empty()) {\n    return std::make_unique<SileroVadModel>(config);\n  }\n\n  if (!config.ten_vad.model.empty()) {\n    return std::make_unique<TenVadModel>(config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a vad model\");\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<VadModel> VadModel::Create(Manager *mgr,\n                                           const VadModelConfig &config) {\n  if (config.provider == \"rknn\") {\n#if SHERPA_ONNX_ENABLE_RKNN\n    if (!config.silero_vad.model.empty()) {\n      return std::make_unique<SileroVadModelRknn>(mgr, config);\n    } else {\n      SHERPA_ONNX_LOGE(\"Only silero-vad is supported for RKNN at present\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n#else\n    SHERPA_ONNX_LOGE(\n        \"Please rebuild 
sherpa-onnx with -DSHERPA_ONNX_ENABLE_RKNN=ON if you \"\n        \"want to use rknn.\");\n    SHERPA_ONNX_EXIT(-1);\n    return nullptr;\n#endif\n  }\n  if (!config.silero_vad.model.empty()) {\n    return std::make_unique<SileroVadModel>(mgr, config);\n  }\n\n  if (!config.ten_vad.model.empty()) {\n    return std::make_unique<TenVadModel>(mgr, config);\n  }\n\n  SHERPA_ONNX_LOGE(\"Please provide a vad model\");\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<VadModel> VadModel::Create(\n    AAssetManager *mgr, const VadModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<VadModel> VadModel::Create(\n    NativeResourceManager *mgr, const VadModelConfig &config);\n#endif\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/vad-model.h",
    "content": "// sherpa-onnx/csrc/vad-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_VAD_MODEL_H_\n#define SHERPA_ONNX_CSRC_VAD_MODEL_H_\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass VadModel {\n public:\n  virtual ~VadModel() = default;\n\n  static std::unique_ptr<VadModel> Create(const VadModelConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<VadModel> Create(Manager *mgr,\n                                          const VadModelConfig &config);\n\n  // reset the internal model states\n  virtual void Reset() = 0;\n\n  /**\n   * @param samples Pointer to a 1-d array containing audio samples.\n   *                Each sample should be normalized to the range [-1, 1].\n   * @param n Number of samples. Should be equal to WindowSize()\n   *\n   * @return Return true if speech is detected. Return false otherwise.\n   */\n  virtual bool IsSpeech(const float *samples, int32_t n) = 0;\n\n  virtual float Compute(const float *samples, int32_t n) = 0;\n\n  virtual int32_t WindowSize() const = 0;\n\n  virtual int32_t WindowShift() const = 0;\n\n  virtual int32_t MinSilenceDurationSamples() const = 0;\n  virtual int32_t MinSpeechDurationSamples() const = 0;\n  virtual void SetMinSilenceDuration(float s) = 0;\n  virtual void SetThreshold(float threshold) = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VAD_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/version.cc",
    "content": "// sherpa-onnx/csrc/version.cc\n//\n// Copyright      2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/version.h\"\n\nnamespace sherpa_onnx {\n\nconst char *GetGitDate() {\n  static const char *date = \"Fri Mar 20 19:09:44 2026\";\n  return date;\n}\n\nconst char *GetGitSha1() {\n  static const char *sha1 = \"6ff3ce76\";\n  return sha1;\n}\n\nconst char *GetVersionStr() {\n  static const char *version = \"1.12.31\";\n  return version;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/version.h",
    "content": "// sherpa-onnx/csrc/version.h\n//\n// Copyright      2025  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_VERSION_H_\n#define SHERPA_ONNX_CSRC_VERSION_H_\n\nnamespace sherpa_onnx {\n\n// Please don't free the returned pointer.\n// Please don't modify the memory pointed by the returned pointer.\n//\n// The memory pointed by the returned pointer is statically allocated.\nconst char *GetVersionStr();\n\n// Please don't free the returned pointer.\n// Please don't modify the memory pointed by the returned pointer.\n//\n// The memory pointed by the returned pointer is statically allocated.\nconst char *GetGitSha1();\n\n// Please don't free the returned pointer.\n// Please don't modify the memory pointed by the returned pointer.\n//\n// The memory pointed by the returned pointer is statically allocated.\nconst char *GetGitDate();\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VERSION_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/vocoder.cc",
    "content": "// sherpa-onnx/csrc/vocoder.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/vocoder.h\"\n\n#include <memory>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/hifigan-vocoder.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/vocos-vocoder.h\"\n\nnamespace sherpa_onnx {\n\nnamespace {\n\nenum class ModelType : std::uint8_t {\n  kHifigan,\n  kVocoos,\n  kUnknown,\n};\n\n}  // namespace\n\nstatic ModelType GetModelType(char *model_data, size_t model_data_length,\n                              bool debug) {\n  Ort::Env env(ORT_LOGGING_LEVEL_ERROR);\n  Ort::SessionOptions sess_opts;\n  sess_opts.SetIntraOpNumThreads(1);\n  sess_opts.SetInterOpNumThreads(1);\n\n  auto sess = std::make_unique<Ort::Session>(env, model_data, model_data_length,\n                                             sess_opts);\n\n  Ort::ModelMetadata meta_data = sess->GetModelMetadata();\n  if (debug) {\n    std::ostringstream os;\n    PrintModelMetadata(os, meta_data);\n#if __OHOS__\n    SHERPA_ONNX_LOGE(\"%{public}s\", os.str().c_str());\n#else\n    SHERPA_ONNX_LOGE(\"%s\", os.str().c_str());\n#endif\n  }\n\n  Ort::AllocatorWithDefaultOptions allocator;\n  auto model_type =\n      LookupCustomModelMetaData(meta_data, \"model_type\", allocator);\n  if (model_type.empty()) {\n    SHERPA_ONNX_LOGE(\n        \"No model_type in the metadata!\\n\"\n        \"Please make sure you are using the vocoder from \"\n        \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/vocoder-models\");\n    return ModelType::kUnknown;\n  }\n\n  if (model_type == \"hifigan\") {\n    return ModelType::kHifigan;\n  } else if (model_type == \"vocos\" || model_type == \"matcha-tts 
vocos\") {\n    return ModelType::kVocoos;\n  } else {\n    SHERPA_ONNX_LOGE(\"Unsupported model_type: %s\", model_type.c_str());\n    return ModelType::kUnknown;\n  }\n}\n\nstd::unique_ptr<Vocoder> Vocoder::Create(const OfflineTtsModelConfig &config) {\n  std::vector<char> buffer;\n  if (!config.matcha.vocoder.empty()) {\n    buffer = ReadFile(config.matcha.vocoder);\n  } else if (!config.zipvoice.vocoder.empty()) {\n    buffer = ReadFile(config.zipvoice.vocoder);\n  } else {\n    SHERPA_ONNX_LOGE(\"No vocoder model provided in the config!\");\n    SHERPA_ONNX_EXIT(-1);\n  }\n  auto model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n\n  switch (model_type) {\n    case ModelType::kHifigan:\n      return std::make_unique<HifiganVocoder>(\n          config.num_threads, config.provider, config.matcha.vocoder);\n    case ModelType::kVocoos:\n      return std::make_unique<VocosVocoder>(config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in vocoder!\");\n      return nullptr;\n  }\n\n  return nullptr;\n}\n\ntemplate <typename Manager>\nstd::unique_ptr<Vocoder> Vocoder::Create(Manager *mgr,\n                                         const OfflineTtsModelConfig &config) {\n  std::vector<char> buffer;\n  if (!config.matcha.vocoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Using matcha vocoder: %s\", config.matcha.vocoder.c_str());\n    buffer = ReadFile(mgr, config.matcha.vocoder);\n  } else if (!config.zipvoice.vocoder.empty()) {\n    SHERPA_ONNX_LOGE(\"Using zipvoice vocoder: %s\",\n                     config.zipvoice.vocoder.c_str());\n    buffer = ReadFile(mgr, config.zipvoice.vocoder);\n  } else {\n    SHERPA_ONNX_LOGE(\"No vocoder model provided in the config!\");\n    return nullptr;\n  }\n\n  auto model_type = GetModelType(buffer.data(), buffer.size(), config.debug);\n\n  switch (model_type) {\n    case ModelType::kHifigan:\n      return std::make_unique<HifiganVocoder>(\n          mgr, config.num_threads, 
config.provider, config.matcha.vocoder);\n    case ModelType::kVocoos:\n      return std::make_unique<VocosVocoder>(mgr, config);\n    case ModelType::kUnknown:\n      SHERPA_ONNX_LOGE(\"Unknown model type in vocoder!\");\n      return nullptr;\n  }\n\n  return nullptr;\n}\n\n#if __ANDROID_API__ >= 9\ntemplate std::unique_ptr<Vocoder> Vocoder::Create(\n    AAssetManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate std::unique_ptr<Vocoder> Vocoder::Create(\n    NativeResourceManager *mgr, const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/vocoder.h",
    "content": "// sherpa-onnx/csrc/vocoder.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_VOCODER_H_\n#define SHERPA_ONNX_CSRC_VOCODER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nclass Vocoder {\n public:\n  virtual ~Vocoder() = default;\n\n  static std::unique_ptr<Vocoder> Create(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  static std::unique_ptr<Vocoder> Create(Manager *mgr,\n                                         const OfflineTtsModelConfig &config);\n\n  /** @param mel A float32 tensor of shape (batch_size, feat_dim, num_frames).\n   *  @return Return a float32 vector containing audio samples.\n   */\n  virtual std::vector<float> Run(Ort::Value mel) const = 0;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VOCODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/vocos-vocoder.cc",
    "content": "// sherpa-onnx/csrc/vocos-vocoder.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/vocos-vocoder.h\"\n\n#include <memory>\n#include <string>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"kaldi-native-fbank/csrc/istft.h\"\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/session.h\"\n\nnamespace sherpa_onnx {\n\nstruct VocosModelMetaData {\n  int32_t n_fft;\n  int32_t hop_length;\n  int32_t win_length;\n  int32_t center;\n  int32_t normalized;\n  std::string window_type;\n  std::string pad_mode;\n};\n\nclass VocosVocoder::Impl {\n public:\n  explicit Impl(const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config.num_threads, config.provider)),\n        allocator_{} {\n    std::vector<char> buffer;\n    if (!config.matcha.vocoder.empty()) {\n      buffer = ReadFile(config.matcha.vocoder);\n    } else if (!config.zipvoice.vocoder.empty()) {\n      buffer = ReadFile(config.zipvoice.vocoder);\n    } else {\n      SHERPA_ONNX_LOGE(\"No vocoder model provided in the config!\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    Init(buffer.data(), buffer.size());\n  }\n\n  template <typename Manager>\n  explicit Impl(Manager *mgr, const OfflineTtsModelConfig &config)\n      : config_(config),\n        env_(ORT_LOGGING_LEVEL_ERROR),\n        sess_opts_(GetSessionOptions(config.num_threads, config.provider)),\n        allocator_{} {\n    std::vector<char> buffer;\n    if (!config.matcha.vocoder.empty()) {\n      buffer = ReadFile(mgr, config.matcha.vocoder);\n    } else if (!config.zipvoice.vocoder.empty()) {\n      buffer = ReadFile(mgr, 
config.zipvoice.vocoder);\n    } else {\n      SHERPA_ONNX_LOGE(\"No vocoder model provided in the config!\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n    Init(buffer.data(), buffer.size());\n  }\n\n  std::vector<float> Run(Ort::Value mel) const {\n    auto out = sess_->Run({}, input_names_ptr_.data(), &mel, 1,\n                          output_names_ptr_.data(), output_names_ptr_.size());\n\n    std::vector<int64_t> shape = out[0].GetTensorTypeAndShapeInfo().GetShape();\n\n    if (shape[0] != 1) {\n      SHERPA_ONNX_LOGE(\"Support only batch size 1, given: %d\",\n                       static_cast<int32_t>(shape[0]));\n      SHERPA_ONNX_EXIT(-1);\n    }\n\n    knf::StftResult stft_result;\n    stft_result.num_frames = shape[2];\n    stft_result.real.resize(shape[1] * shape[2]);\n    stft_result.imag.resize(shape[1] * shape[2]);\n\n    // stft_result.real: (num_frames, n_fft/2+1), flattened in row major\n\n    // mag.shape: (batch_size, n_fft/2+1, num_frames)\n    const float *p_mag = out[0].GetTensorData<float>();\n    const float *p_x = out[1].GetTensorData<float>();\n    const float *p_y = out[2].GetTensorData<float>();\n\n    for (int32_t frame_index = 0; frame_index < static_cast<int32_t>(shape[2]);\n         ++frame_index) {\n      for (int32_t bin = 0; bin < static_cast<int32_t>(shape[1]); ++bin) {\n        stft_result.real[frame_index * shape[1] + bin] =\n            p_mag[bin * shape[2] + frame_index] *\n            p_x[bin * shape[2] + frame_index];\n        stft_result.imag[frame_index * shape[1] + bin] =\n            p_mag[bin * shape[2] + frame_index] *\n            p_y[bin * shape[2] + frame_index];\n      }\n    }\n\n    knf::StftConfig stft_config;\n    stft_config.n_fft = meta_.n_fft;\n    stft_config.hop_length = meta_.hop_length;\n    stft_config.win_length = meta_.win_length;\n    stft_config.normalized = meta_.normalized;\n    stft_config.center = meta_.center;\n    stft_config.window_type = meta_.window_type;\n    stft_config.pad_mode = 
meta_.pad_mode;\n\n    knf::IStft istft(stft_config);\n    return istft.Compute(stft_result);\n  }\n\n private:\n  void Init(void *model_data, size_t model_data_length) {\n    sess_ = std::make_unique<Ort::Session>(env_, model_data, model_data_length,\n                                           sess_opts_);\n\n    GetInputNames(sess_.get(), &input_names_, &input_names_ptr_);\n\n    GetOutputNames(sess_.get(), &output_names_, &output_names_ptr_);\n\n    // get meta data\n    Ort::ModelMetadata meta_data = sess_->GetModelMetadata();\n    if (config_.debug) {\n      std::ostringstream os;\n      os << \"---Vocos model---\\n\";\n      PrintModelMetadata(os, meta_data);\n\n      os << \"----------input names----------\\n\";\n      int32_t i = 0;\n      for (const auto &s : input_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n      os << \"----------output names----------\\n\";\n      i = 0;\n      for (const auto &s : output_names_) {\n        os << i << \" \" << s << \"\\n\";\n        ++i;\n      }\n\n#if __OHOS__\n      SHERPA_ONNX_LOGE(\"%{public}s\\n\", os.str().c_str());\n#else\n      SHERPA_ONNX_LOGE(\"%s\\n\", os.str().c_str());\n#endif\n    }\n\n    Ort::AllocatorWithDefaultOptions allocator;  // used in the macro below\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.n_fft, \"n_fft\", 1024);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.hop_length, \"hop_length\",\n                                            256);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.win_length, \"win_length\",\n                                            1024);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.center, \"center\", 1);\n    SHERPA_ONNX_READ_META_DATA_WITH_DEFAULT(meta_.normalized, \"normalized\", 0);\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_.window_type,\n                                                \"window_type\", \"hann\");\n    SHERPA_ONNX_READ_META_DATA_STR_WITH_DEFAULT(meta_.pad_mode, \"pad_mode\",\n       
                                         \"reflect\");\n  }\n\n private:\n  OfflineTtsModelConfig config_;\n  VocosModelMetaData meta_;\n\n  Ort::Env env_;\n  Ort::SessionOptions sess_opts_;\n  Ort::AllocatorWithDefaultOptions allocator_;\n\n  std::unique_ptr<Ort::Session> sess_;\n\n  std::vector<std::string> input_names_;\n  std::vector<const char *> input_names_ptr_;\n\n  std::vector<std::string> output_names_;\n  std::vector<const char *> output_names_ptr_;\n};\n\nVocosVocoder::VocosVocoder(const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(config)) {}\n\ntemplate <typename Manager>\nVocosVocoder::VocosVocoder(Manager *mgr, const OfflineTtsModelConfig &config)\n    : impl_(std::make_unique<Impl>(mgr, config)) {}\n\nVocosVocoder::~VocosVocoder() = default;\n\nstd::vector<float> VocosVocoder::Run(Ort::Value mel) const {\n  return impl_->Run(std::move(mel));\n}\n\n#if __ANDROID_API__ >= 9\ntemplate VocosVocoder::VocosVocoder(AAssetManager *mgr,\n                                    const OfflineTtsModelConfig &config);\n#endif\n\n#if __OHOS__\ntemplate VocosVocoder::VocosVocoder(NativeResourceManager *mgr,\n                                    const OfflineTtsModelConfig &config);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/vocos-vocoder.h",
    "content": "// sherpa-onnx/csrc/vocos-vocoder.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_VOCOS_VOCODER_H_\n#define SHERPA_ONNX_CSRC_VOCOS_VOCODER_H_\n\n#include <memory>\n#include <string>\n#include <vector>\n\n#include \"onnxruntime_cxx_api.h\"  // NOLINT\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n#include \"sherpa-onnx/csrc/vocoder.h\"\n\nnamespace sherpa_onnx {\n\nclass VocosVocoder : public Vocoder {\n public:\n  ~VocosVocoder() override;\n\n  explicit VocosVocoder(const OfflineTtsModelConfig &config);\n\n  template <typename Manager>\n  VocosVocoder(Manager *mgr, const OfflineTtsModelConfig &config);\n\n  /** @param mel A float32 tensor of shape (batch_size, feat_dim, num_frames).\n   *  @return Return a float32 tensor of shape (batch_size, num_samples).\n   */\n  std::vector<float> Run(Ort::Value mel) const override;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VOCOS_VOCODER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/voice-activity-detector.cc",
    "content": "// sherpa-onnx/csrc/voice-activity-detector.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n\n#include <algorithm>\n#include <memory>\n#include <queue>\n#include <utility>\n#include <vector>\n\n#if __ANDROID_API__ >= 9\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if __OHOS__\n#include \"rawfile/raw_file_manager.h\"\n#endif\n\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\nnamespace sherpa_onnx {\n\nclass VoiceActivityDetector::Impl {\n public:\n  explicit Impl(const VadModelConfig &config, float buffer_size_in_seconds = 60)\n      : model_(VadModel::Create(config)),\n        config_(config),\n        buffer_(buffer_size_in_seconds * config.sample_rate) {\n    Init();\n  }\n\n  template <typename Manager>\n  Impl(Manager *mgr, const VadModelConfig &config,\n       float buffer_size_in_seconds = 60)\n      : model_(VadModel::Create(mgr, config)),\n        config_(config),\n        buffer_(buffer_size_in_seconds * config.sample_rate) {\n    Init();\n  }\n\n  float Compute(const float *samples, int32_t n) {\n    return model_->Compute(samples, n);\n  }\n\n  void AcceptWaveform(const float *samples, int32_t n) {\n    if (buffer_.Size() > max_utterance_length_) {\n      model_->SetMinSilenceDuration(new_min_silence_duration_s_);\n      model_->SetThreshold(new_threshold_);\n    } else {\n      if (!config_.silero_vad.model.empty()) {\n        model_->SetMinSilenceDuration(config_.silero_vad.min_silence_duration);\n        model_->SetThreshold(config_.silero_vad.threshold);\n      } else if (!config_.ten_vad.model.empty()) {\n        model_->SetMinSilenceDuration(config_.ten_vad.min_silence_duration);\n        model_->SetThreshold(config_.ten_vad.threshold);\n      } else {\n        SHERPA_ONNX_LOGE(\"Unknown vad model\");\n        SHERPA_ONNX_EXIT(-1);\n      
}\n    }\n\n    int32_t window_size = model_->WindowSize();\n    int32_t window_shift = model_->WindowShift();\n\n    // note n is usually window_size and there is no need to use\n    // an extra buffer here\n    last_.insert(last_.end(), samples, samples + n);\n\n    if (last_.size() < window_size) {\n      return;\n    }\n\n    // Note: For v4, window_shift == window_size\n    int32_t k =\n        (static_cast<int32_t>(last_.size()) - window_size) / window_shift + 1;\n    const float *p = last_.data();\n    bool is_speech = false;\n\n    for (int32_t i = 0; i < k; ++i, p += window_shift) {\n      buffer_.Push(p, window_shift);\n      // NOTE(fangjun): Please don't use a very large n.\n      bool this_window_is_speech = model_->IsSpeech(p, window_size);\n      is_speech = is_speech || this_window_is_speech;\n    }\n\n    last_ = std::vector<float>(\n        p, static_cast<const float *>(last_.data()) + last_.size());\n\n    if (is_speech) {\n      if (start_ == -1) {\n        // beginning of speech\n        start_ = std::max(buffer_.Tail() - 2 * model_->WindowSize() -\n                              model_->MinSpeechDurationSamples(),\n                          buffer_.Head());\n        cur_segment_.start = start_;\n      }\n      int32_t num_samples = buffer_.Tail() - start_ - 1;\n      cur_segment_.samples = buffer_.Get(start_, num_samples);\n    } else {\n      // non-speech\n\n      cur_segment_.start = -1;\n      cur_segment_.samples.clear();\n\n      if (start_ != -1 && buffer_.Size()) {\n        // end of speech, save the speech segment\n        int32_t end = buffer_.Tail() - model_->MinSilenceDurationSamples();\n\n        std::vector<float> s = buffer_.Get(start_, end - start_);\n        SpeechSegment segment;\n\n        segment.start = start_;\n        segment.samples = std::move(s);\n\n        segments_.push(std::move(segment));\n\n        buffer_.Pop(end - buffer_.Head());\n      }\n\n      if (start_ == -1) {\n        int32_t end = buffer_.Tail() - 2 * 
model_->WindowSize() -\n                      model_->MinSpeechDurationSamples();\n        int32_t n = std::max(0, end - buffer_.Head());\n        if (n > 0) {\n          buffer_.Pop(n);\n        }\n      }\n\n      start_ = -1;\n    }\n  }\n\n  bool Empty() const { return segments_.empty(); }\n\n  void Pop() { segments_.pop(); }\n\n  void Clear() { std::queue<SpeechSegment>().swap(segments_); }\n\n  const SpeechSegment &Front() const {\n    static SpeechSegment tmp;\n\n    if (Empty()) {\n      SHERPA_ONNX_LOGE(\n          \"Make sure you call this method only when Empty() returns false; \"\n          \"Return an empty segment\");\n      return tmp;\n    }\n\n    return segments_.front();\n  }\n\n  void Reset() {\n    std::queue<SpeechSegment>().swap(segments_);\n\n    model_->Reset();\n    buffer_.Reset();\n    last_.clear();\n\n    start_ = -1;\n\n    cur_segment_.start = -1;\n    cur_segment_.samples.clear();\n  }\n\n  void Flush() {\n    if (start_ == -1 || buffer_.Size() == 0) {\n      return;\n    }\n\n    int32_t end = buffer_.Tail();\n    if (end <= start_) {\n      return;\n    }\n\n    std::vector<float> s = buffer_.Get(start_, end - start_);\n\n    SpeechSegment segment;\n\n    segment.start = start_;\n    segment.samples = std::move(s);\n\n    segments_.push(std::move(segment));\n\n    buffer_.Pop(end - buffer_.Head());\n    start_ = -1;\n\n    cur_segment_.start = -1;\n    cur_segment_.samples.clear();\n  }\n\n  bool IsSpeechDetected() const { return start_ != -1; }\n\n  SpeechSegment CurrentSpeechSegment() const { return cur_segment_; }\n\n  const VadModelConfig &GetConfig() const { return config_; }\n\n private:\n  void Init() {\n    if (!config_.silero_vad.model.empty()) {\n      max_utterance_length_ =\n          config_.sample_rate * config_.silero_vad.max_speech_duration;\n    } else if (!config_.ten_vad.model.empty()) {\n      max_utterance_length_ =\n          config_.sample_rate * config_.ten_vad.max_speech_duration;\n    } else {\n      
SHERPA_ONNX_LOGE(\"Unsupported VAD model\");\n      SHERPA_ONNX_EXIT(-1);\n    }\n  }\n\n private:\n  std::queue<SpeechSegment> segments_;\n\n  // it is empty if no speech is detected\n  SpeechSegment cur_segment_;\n\n  std::unique_ptr<VadModel> model_;\n  VadModelConfig config_;\n  CircularBuffer buffer_;\n  std::vector<float> last_;\n\n  int max_utterance_length_ = -1;  // in samples\n  float new_min_silence_duration_s_ = 0.1;\n  float new_threshold_ = 0.90;\n\n  int32_t start_ = -1;\n};\n\nVoiceActivityDetector::VoiceActivityDetector(\n    const VadModelConfig &config, float buffer_size_in_seconds /*= 60*/)\n    : impl_(std::make_unique<Impl>(config, buffer_size_in_seconds)) {}\n\ntemplate <typename Manager>\nVoiceActivityDetector::VoiceActivityDetector(\n    Manager *mgr, const VadModelConfig &config,\n    float buffer_size_in_seconds /*= 60*/)\n    : impl_(std::make_unique<Impl>(mgr, config, buffer_size_in_seconds)) {}\n\nVoiceActivityDetector::~VoiceActivityDetector() = default;\n\nvoid VoiceActivityDetector::AcceptWaveform(const float *samples, int32_t n) {\n  impl_->AcceptWaveform(samples, n);\n}\n\nbool VoiceActivityDetector::Empty() const { return impl_->Empty(); }\n\nvoid VoiceActivityDetector::Pop() { impl_->Pop(); }\n\nvoid VoiceActivityDetector::Clear() { impl_->Clear(); }\n\nconst SpeechSegment &VoiceActivityDetector::Front() const {\n  return impl_->Front();\n}\n\nvoid VoiceActivityDetector::Reset() const { impl_->Reset(); }\n\nvoid VoiceActivityDetector::Flush() const { impl_->Flush(); }\n\nbool VoiceActivityDetector::IsSpeechDetected() const {\n  return impl_->IsSpeechDetected();\n}\n\nSpeechSegment VoiceActivityDetector::CurrentSpeechSegment() const {\n  return impl_->CurrentSpeechSegment();\n}\n\nconst VadModelConfig &VoiceActivityDetector::GetConfig() const {\n  return impl_->GetConfig();\n}\n\nfloat VoiceActivityDetector::Compute(const float *samples, int32_t n) {\n  return impl_->Compute(samples, n);\n}\n\n#if __ANDROID_API__ >= 9\ntemplate 
VoiceActivityDetector::VoiceActivityDetector(\n    AAssetManager *mgr, const VadModelConfig &config,\n    float buffer_size_in_seconds = 60);\n#endif\n\n#if __OHOS__\ntemplate VoiceActivityDetector::VoiceActivityDetector(\n    NativeResourceManager *mgr, const VadModelConfig &config,\n    float buffer_size_in_seconds = 60);\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/voice-activity-detector.h",
    "content": "// sherpa-onnx/csrc/voice-activity-detector.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SHERPA_ONNX_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n#define SHERPA_ONNX_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstruct SpeechSegment {\n  int32_t start;  // in samples\n  std::vector<float> samples;\n};\n\nclass VoiceActivityDetector {\n public:\n  explicit VoiceActivityDetector(const VadModelConfig &config,\n                                 float buffer_size_in_seconds = 60);\n\n  template <typename Manager>\n  VoiceActivityDetector(Manager *mgr, const VadModelConfig &config,\n                        float buffer_size_in_seconds = 60);\n\n  ~VoiceActivityDetector();\n\n  void AcceptWaveform(const float *samples, int32_t n);\n  float Compute(const float *samples, int32_t n);\n\n  bool Empty() const;\n  void Pop();\n  void Clear();\n\n  // It is an error to call Front() if Empty() returns true.\n  //\n  // The returned reference is valid until the next call to any\n  // methods of VoiceActivityDetector.\n  const SpeechSegment &Front() const;\n\n  bool IsSpeechDetected() const;\n\n  // It is empty if IsSpeechDetected() returns false\n  SpeechSegment CurrentSpeechSegment() const;\n\n  void Reset() const;\n\n  // At the end of the utterance, you can invoke this method so that\n  // the last speech segment can be detected.\n  void Flush() const;\n\n  const VadModelConfig &GetConfig() const;\n\n private:\n  class Impl;\n  std::unique_ptr<Impl> impl_;\n};\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/wave-reader-test.cc",
    "content": "// sherpa-onnx/csrc/wave-reader-test.cc\n//\n// Copyright (c)  2025  Posit Software, PBC\n\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\n#include <cstdio>\n#include <fstream>\n#include <string>\n#include <vector>\n\n#if defined(_WIN32)\n#include <windows.h>\n#else\n#include <unistd.h>\n#endif\n\n#include \"gtest/gtest.h\"\n\nnamespace sherpa_onnx {\n\n// RAII helper class for managing temporary test files\nclass TempFile {\n public:\n  TempFile() : TempFile(\"\") {}\n\n  explicit TempFile(const std::string &suffix) {\n#if defined(_WIN32)\n    char temp_path[MAX_PATH];\n    char temp_file[MAX_PATH];\n    GetTempPathA(MAX_PATH, temp_path);\n    GetTempFileNameA(temp_path, \"sot\", 0, temp_file);\n    path_ = temp_file;\n    if (!suffix.empty()) {\n      path_ += suffix;\n      std::remove(temp_file);  // Remove the file without suffix\n    }\n#else\n    char temp_template[] = \"/tmp/sherpa_onnx_test_XXXXXX\";\n    int fd = mkstemp(temp_template);\n    if (fd != -1) {\n      close(fd);\n      path_ = temp_template;\n      if (!suffix.empty()) {\n        path_ += suffix;\n        std::remove(temp_template);  // Remove the file without suffix\n      }\n    }\n#endif\n  }\n\n  ~TempFile() {\n    if (!path_.empty()) {\n      std::remove(path_.c_str());\n    }\n  }\n\n  const char *path() const { return path_.c_str(); }\n\n private:\n  std::string path_;\n};\n\nTEST(WaveReader, TestNonWavFile) {\n  // Create a temporary file with non-WAV content (e.g., webm-like header)\n  TempFile temp_file(\".webm\");\n\n  {\n    std::ofstream out(temp_file.path(), std::ios::binary);\n    // Write some content that doesn't start with RIFF\n    // (webm files typically start with EBML header: 0x1a45dfa3)\n    const unsigned char webm_header[] = {\n        0x1a, 0x45, 0xdf, 0xa3,  // EBML header signature (NOT RIFF)\n        0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1f, 0x42, 0x86, 0x81, 0x01,\n        // Add some more bytes to make it look like a real file\n        
0x42, 0xf7, 0x81, 0x01, 0x42, 0xf2, 0x81, 0x04, 'w', 'e', 'b', 'm'};\n    out.write(reinterpret_cast<const char *>(webm_header), sizeof(webm_header));\n  }\n\n  // Test C++ API - should not segfault\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples = ReadWave(temp_file.path(), &sample_rate, &is_ok);\n\n  EXPECT_FALSE(is_ok);\n  EXPECT_TRUE(samples.empty());\n  EXPECT_EQ(sample_rate, -1);\n}\n\nTEST(WaveReader, TestNonExistentFile) {\n  // Generate a unique path but don't create the file\n  TempFile temp_file(\".wav\");\n\n  // Test C++ API - should not segfault\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples = ReadWave(temp_file.path(), &sample_rate, &is_ok);\n\n  EXPECT_FALSE(is_ok);\n  EXPECT_TRUE(samples.empty());\n  EXPECT_EQ(sample_rate, -1);\n}\n\nTEST(WaveReader, TestTruncatedWaveFile) {\n  // Create a temporary file with truncated WAV header\n  TempFile temp_file(\".wav\");\n\n  {\n    std::ofstream out(temp_file.path(), std::ios::binary);\n    // Write only partial WAV header (less than 44 bytes required)\n    const unsigned char partial_wav[] = {\n        'R',  'I',  'F',\n        'F',  // chunk_id\n        0x00, 0x00, 0x00,\n        0x00,  // chunk_size\n        'W',  'A',  'V',\n        'E'  // format\n             // Missing the rest of the header\n    };\n    out.write(reinterpret_cast<const char *>(partial_wav), sizeof(partial_wav));\n  }\n\n  // Test C++ API - should not segfault\n  int32_t sample_rate = -1;\n  bool is_ok = false;\n  std::vector<float> samples = ReadWave(temp_file.path(), &sample_rate, &is_ok);\n\n  EXPECT_FALSE(is_ok);\n  EXPECT_TRUE(samples.empty());\n  EXPECT_EQ(sample_rate, -1);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/wave-reader.cc",
    "content": "// sherpa-onnx/csrc/wave-reader.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\n#include <cassert>\n#include <cstdint>\n#include <fstream>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\nnamespace {\n// see http://soundfile.sapp.org/doc/WaveFormat/\n//\n// Note: We assume little endian here\n// TODO(fangjun): Support big endian\nstruct WaveHeader {\n  // See\n  // https://en.wikipedia.org/wiki/WAV#Metadata\n  // and\n  // https://www.robotplanet.dk/audio/wav_meta_data/riff_mci.pdf\n  void SeekToDataChunk(std::istream &is) {\n    //                              a t a d\n    while (is && subchunk2_id != 0x61746164) {\n      // const char *p = reinterpret_cast<const char *>(&subchunk2_id);\n      // printf(\"Skip chunk (%x): %c%c%c%c of size: %d\\n\", subchunk2_id, p[0],\n      //        p[1], p[2], p[3], subchunk2_size);\n      is.seekg(subchunk2_size, std::istream::cur);\n      is.read(reinterpret_cast<char *>(&subchunk2_id), sizeof(int32_t));\n      is.read(reinterpret_cast<char *>(&subchunk2_size), sizeof(int32_t));\n    }\n  }\n\n  int32_t chunk_id;\n  int32_t chunk_size;\n  int32_t format;\n  int32_t subchunk1_id;\n  int32_t subchunk1_size;\n  int16_t audio_format;\n  int16_t num_channels;\n  int32_t sample_rate;\n  int32_t byte_rate;\n  int16_t block_align;\n  int16_t bits_per_sample;\n  int32_t subchunk2_id;    // a tag of this chunk\n  int32_t subchunk2_size;  // size of subchunk2\n};\nstatic_assert(sizeof(WaveHeader) == 44);\n\n/*\nsox int16-1-channel-zh.wav -b 8 int8-1-channel-zh.wav\n\nsox int16-1-channel-zh.wav -c 2 int16-2-channel-zh.wav\n\nwe use audacity to generate int32-1-channel-zh.wav and float32-1-channel-zh.wav\nbecause sox uses WAVE_FORMAT_EXTENSIBLE, which is not easy to support\nin sherpa-onnx.\n */\n\n// Read a wave file of mono-channel.\n// Return its 
samples normalized to the range [-1, 1).\nstd::vector<std::vector<float>> ReadWaveImpl(std::istream &is,\n                                             int32_t *sampling_rate,\n                                             bool *is_ok) {\n  WaveHeader header{};\n  is.read(reinterpret_cast<char *>(&header.chunk_id), sizeof(header.chunk_id));\n\n  //                        F F I R\n  if (header.chunk_id != 0x46464952) {\n    SHERPA_ONNX_LOGE(\"Expected chunk_id RIFF. Given: 0x%08x\\n\",\n                     header.chunk_id);\n    *is_ok = false;\n    return {};\n  }\n\n  is.read(reinterpret_cast<char *>(&header.chunk_size),\n          sizeof(header.chunk_size));\n\n  is.read(reinterpret_cast<char *>(&header.format), sizeof(header.format));\n\n  //                      E V A W\n  if (header.format != 0x45564157) {\n    SHERPA_ONNX_LOGE(\"Expected format WAVE. Given: 0x%08x\\n\", header.format);\n    *is_ok = false;\n    return {};\n  }\n\n  is.read(reinterpret_cast<char *>(&header.subchunk1_id),\n          sizeof(header.subchunk1_id));\n\n  is.read(reinterpret_cast<char *>(&header.subchunk1_size),\n          sizeof(header.subchunk1_size));\n\n  if (header.subchunk1_id == 0x4b4e554a) {\n    // skip junk padding\n    is.seekg(header.subchunk1_size, std::istream::cur);\n\n    is.read(reinterpret_cast<char *>(&header.subchunk1_id),\n            sizeof(header.subchunk1_id));\n\n    is.read(reinterpret_cast<char *>(&header.subchunk1_size),\n            sizeof(header.subchunk1_size));\n  }\n\n  if (header.subchunk1_id != 0x20746d66) {\n    SHERPA_ONNX_LOGE(\"Expected subchunk1_id 0x20746d66. Given: 0x%08x\\n\",\n                     header.subchunk1_id);\n    *is_ok = false;\n    return {};\n  }\n\n  // NAudio uses 18\n  // See https://github.com/naudio/NAudio/issues/1132\n  if (header.subchunk1_size != 16 &&\n      header.subchunk1_size != 18) {  // 16 for PCM\n    SHERPA_ONNX_LOGE(\"Expected subchunk1_size 16. 
Given: %d\\n\",\n                     header.subchunk1_size);\n    *is_ok = false;\n    return {};\n  }\n\n  is.read(reinterpret_cast<char *>(&header.audio_format),\n          sizeof(header.audio_format));\n\n  if (header.audio_format != 1 && header.audio_format != 3) {\n    // 1 for integer PCM\n    // 3 for floating point PCM\n    // see https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html\n    // and https://github.com/microsoft/DirectXTK/wiki/Wave-Formats\n    SHERPA_ONNX_LOGE(\"Expected audio_format 1. Given: %d\\n\",\n                     header.audio_format);\n\n    if (header.audio_format == static_cast<int16_t>(0xfffe)) {\n      SHERPA_ONNX_LOGE(\"We don't support WAVE_FORMAT_EXTENSIBLE files.\");\n    }\n\n    *is_ok = false;\n    return {};\n  }\n\n  is.read(reinterpret_cast<char *>(&header.num_channels),\n          sizeof(header.num_channels));\n\n  is.read(reinterpret_cast<char *>(&header.sample_rate),\n          sizeof(header.sample_rate));\n\n  is.read(reinterpret_cast<char *>(&header.byte_rate),\n          sizeof(header.byte_rate));\n\n  is.read(reinterpret_cast<char *>(&header.block_align),\n          sizeof(header.block_align));\n\n  is.read(reinterpret_cast<char *>(&header.bits_per_sample),\n          sizeof(header.bits_per_sample));\n\n  if (header.byte_rate !=\n      (header.sample_rate * header.num_channels * header.bits_per_sample / 8)) {\n    SHERPA_ONNX_LOGE(\"Incorrect byte rate: %d. Expected: %d\", header.byte_rate,\n                     (header.sample_rate * header.num_channels *\n                      header.bits_per_sample / 8));\n    *is_ok = false;\n    return {};\n  }\n\n  if (header.block_align !=\n      (header.num_channels * header.bits_per_sample / 8)) {\n    SHERPA_ONNX_LOGE(\"Incorrect block align: %d. 
Expected: %d\\n\",\n                     header.block_align,\n                     (header.num_channels * header.bits_per_sample / 8));\n    *is_ok = false;\n    return {};\n  }\n\n  if (header.bits_per_sample != 8 && header.bits_per_sample != 16 &&\n      header.bits_per_sample != 32) {\n    SHERPA_ONNX_LOGE(\"Expected bits_per_sample 8, 16 or 32. Given: %d\\n\",\n                     header.bits_per_sample);\n    *is_ok = false;\n    return {};\n  }\n\n  if (header.subchunk1_size == 18) {\n    // this is for NAudio. It puts extra bytes after bits_per_sample\n    // See\n    // https://github.com/naudio/NAudio/blob/master/NAudio.Core/Wave/WaveFormats/WaveFormat.cs#L223\n\n    int16_t extra_size = -1;\n    is.read(reinterpret_cast<char *>(&extra_size), sizeof(int16_t));\n    if (extra_size != 0) {\n      SHERPA_ONNX_LOGE(\n          \"Extra size should be 0 for wave from NAudio. Current extra size \"\n          \"%d\\n\",\n          extra_size);\n      *is_ok = false;\n      return {};\n    }\n  }\n\n  is.read(reinterpret_cast<char *>(&header.subchunk2_id),\n          sizeof(header.subchunk2_id));\n\n  is.read(reinterpret_cast<char *>(&header.subchunk2_size),\n          sizeof(header.subchunk2_size));\n\n  header.SeekToDataChunk(is);\n  if (!is) {\n    *is_ok = false;\n    return {};\n  }\n\n  *sampling_rate = header.sample_rate;\n\n  std::vector<std::vector<float>> ans(header.num_channels);\n\n  if (header.bits_per_sample == 16 && header.audio_format == 1) {\n    // header.subchunk2_size contains the number of bytes in the data.\n    // As we assume each sample contains two bytes, so it is divided by 2 here\n    std::vector<int16_t> samples(header.subchunk2_size / 2);\n\n    is.read(reinterpret_cast<char *>(samples.data()),\n            samples.size() * sizeof(int16_t));\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Failed to read %d bytes\", header.subchunk2_size);\n      *is_ok = false;\n      return {};\n    }\n\n    for (auto &v : ans) {\n      
v.resize(samples.size() / header.num_channels);\n    }\n\n    // samples are interleaved\n    for (int32_t i = 0, k = 0; i < static_cast<int32_t>(samples.size());\n         i += header.num_channels, ++k) {\n      for (int32_t c = 0; c != header.num_channels; ++c) {\n        ans[c][k] = samples[i + c] / 32768.;\n      }\n    }\n  } else if (header.bits_per_sample == 8 && header.audio_format == 1) {\n    // number of samples == number of bytes for 8-bit encoded samples\n    //\n    // Note that 8-bit encoded samples are unsigned!\n    std::vector<uint8_t> samples(header.subchunk2_size);\n\n    is.read(reinterpret_cast<char *>(samples.data()), header.subchunk2_size);\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Failed to read %d bytes\", header.subchunk2_size);\n      *is_ok = false;\n      return {};\n    }\n\n    for (auto &v : ans) {\n      v.resize(samples.size() / header.num_channels);\n    }\n\n    // samples are interleaved\n    for (int32_t i = 0, k = 0; i < static_cast<int32_t>(samples.size());\n         i += header.num_channels, ++k) {\n      for (int32_t c = 0; c != header.num_channels; ++c) {\n        // Note(fangjun): We want to normalize each sample into the range\n        // [-1, 1). Since each original sample is in the range [0, 255],\n        // dividing it by 128 converts it to the range [0, 2); so after\n        // subtracting 1, we get the range [-1, 1).\n        //\n        ans[c][k] = samples[i + c] / 128. 
- 1;\n      }\n    }\n  } else if (header.bits_per_sample == 32 && header.audio_format == 1) {\n    // 32 here is for int32\n    //\n    // header.subchunk2_size contains the number of bytes in the data.\n    // Each sample takes 4 bytes, so it is divided by 4 here\n    std::vector<int32_t> samples(header.subchunk2_size / 4);\n\n    // Read only as many bytes as the vector holds, in case subchunk2_size\n    // is not a multiple of 4\n    is.read(reinterpret_cast<char *>(samples.data()),\n            samples.size() * sizeof(int32_t));\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Failed to read %d bytes\", header.subchunk2_size);\n      *is_ok = false;\n      return {};\n    }\n\n    for (auto &v : ans) {\n      v.resize(samples.size() / header.num_channels);\n    }\n\n    // samples are interleaved\n    for (int32_t i = 0, k = 0; i < static_cast<int32_t>(samples.size());\n         i += header.num_channels, ++k) {\n      for (int32_t c = 0; c != header.num_channels; ++c) {\n        // Divide by 2^31. Note that (1 << 31) overflows a 32-bit int, so we\n        // use a float literal here.\n        ans[c][k] = static_cast<float>(samples[i + c]) / 2147483648.0f;\n      }\n    }\n  } else if (header.bits_per_sample == 32 && header.audio_format == 3) {\n    // 32 here is for float32\n    //\n    // header.subchunk2_size contains the number of bytes in the data.\n    // Each sample takes 4 bytes, so it is divided by 4 here\n    std::vector<float> samples(header.subchunk2_size / 4);\n\n    // Read only as many bytes as the vector holds, in case subchunk2_size\n    // is not a multiple of 4\n    is.read(reinterpret_cast<char *>(samples.data()),\n            samples.size() * sizeof(float));\n    if (!is) {\n      SHERPA_ONNX_LOGE(\"Failed to read %d bytes\", header.subchunk2_size);\n      *is_ok = false;\n      return {};\n    }\n\n    for (auto &v : ans) {\n      v.resize(samples.size() / header.num_channels);\n    }\n\n    // samples are interleaved\n    for (int32_t i = 0, k = 0; i < static_cast<int32_t>(samples.size());\n         i += header.num_channels, ++k) {\n      for (int32_t c = 0; c != header.num_channels; ++c) {\n        ans[c][k] = samples[i + c];\n      }\n    }\n  } else {\n    SHERPA_ONNX_LOGE(\n        \"Unsupported %d bits per sample and audio format: %d. 
Supported values \"\n        \"are: 8, 16, 32.\",\n        header.bits_per_sample, header.audio_format);\n    *is_ok = false;\n    return {};\n  }\n\n  *is_ok = true;\n  return ans;\n}\n\n}  // namespace\n\nstd::vector<float> ReadWave(const std::string &filename, int32_t *sampling_rate,\n                            bool *is_ok) {\n  *is_ok = false;\n  if (filename.empty()) {\n    SHERPA_ONNX_LOGE(\"Filename is empty\");\n    return {};\n  }\n\n  if (!FileExists(filename)) {\n    SHERPA_ONNX_LOGE(\"Filename '%s' does not exist\", filename.c_str());\n    return {};\n  }\n\n  std::ifstream is(filename, std::ifstream::binary);\n  return ReadWave(is, sampling_rate, is_ok);\n}\n\nstd::vector<float> ReadWave(std::istream &is, int32_t *sampling_rate,\n                            bool *is_ok) {\n  auto samples = ReadWaveImpl(is, sampling_rate, is_ok);\n\n  if (!*is_ok || samples.empty()) {\n    return {};\n  }\n\n  if (samples.size() > 1) {\n    SHERPA_ONNX_LOGE(\n        \"Warning: %d channels are found. We only use the first channel.\\n\",\n        static_cast<int32_t>(samples.size()));\n  }\n\n  return samples[0];\n}\n\nstd::vector<std::vector<float>> ReadWaveMultiChannel(std::istream &is,\n                                                     int32_t *sampling_rate,\n                                                     bool *is_ok) {\n  auto samples = ReadWaveImpl(is, sampling_rate, is_ok);\n  return samples;\n}\n\nstd::vector<std::vector<float>> ReadWaveMultiChannel(\n    const std::string &filename, int32_t *sampling_rate, bool *is_ok) {\n  std::ifstream is(filename, std::ifstream::binary);\n  return ReadWaveMultiChannel(is, sampling_rate, is_ok);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/wave-reader.h",
    "content": "// sherpa-onnx/csrc/wave-reader.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_WAVE_READER_H_\n#define SHERPA_ONNX_CSRC_WAVE_READER_H_\n\n#include <istream>\n#include <string>\n#include <vector>\n\nnamespace sherpa_onnx {\n\n/** Read a wave file with expected sample rate.\n\n    @param filename Path to a wave file. It MUST be single channel, 16-bit\n                    PCM encoded.\n    @param sampling_rate  On return, it contains the sampling rate of the file.\n    @param is_ok On return it is true if the reading succeeded; false otherwise.\n\n    @return Return wave samples normalized to the range [-1, 1).\n */\nstd::vector<float> ReadWave(const std::string &filename, int32_t *sampling_rate,\n                            bool *is_ok);\n\nstd::vector<float> ReadWave(std::istream &is, int32_t *sampling_rate,\n                            bool *is_ok);\n\nstd::vector<std::vector<float>> ReadWaveMultiChannel(std::istream &is,\n                                                     int32_t *sampling_rate,\n                                                     bool *is_ok);\n\nstd::vector<std::vector<float>> ReadWaveMultiChannel(\n    const std::string &filename, int32_t *sampling_rate, bool *is_ok);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_WAVE_READER_H_\n"
  },
  {
    "path": "sherpa-onnx/csrc/wave-writer.cc",
    "content": "// sherpa-onnx/csrc/wave-writer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\n#include <algorithm>\n#include <cstring>\n#include <fstream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\nnamespace {\n\n// see http://soundfile.sapp.org/doc/WaveFormat/\n//\n// Note: We assume little endian here\n// TODO(fangjun): Support big endian\nstruct WaveHeader {\n  int32_t chunk_id;\n  int32_t chunk_size;\n  int32_t format;\n  int32_t subchunk1_id;\n  int32_t subchunk1_size;\n  int16_t audio_format;\n  int16_t num_channels;\n  int32_t sample_rate;\n  int32_t byte_rate;\n  int16_t block_align;\n  int16_t bits_per_sample;\n  int32_t subchunk2_id;    // a tag of this chunk\n  int32_t subchunk2_size;  // size of subchunk2\n};\n\n}  // namespace\n\nint64_t WaveFileSize(int32_t n_samples, int32_t num_channels /*= 1*/) {\n  return sizeof(WaveHeader) + n_samples * sizeof(int16_t) * num_channels;\n}\n\nvoid WriteWave(char *buffer, int32_t sampling_rate, const float *samples,\n               int32_t n) {\n  WriteWave(buffer, sampling_rate, samples, nullptr, n);\n}\n\nbool WriteWave(const std::string &filename, int32_t sampling_rate,\n               const float *samples, int32_t n) {\n  return WriteWave(filename, sampling_rate, samples, nullptr, n);\n}\n\nbool WriteWave(const std::string &filename, int32_t sampling_rate,\n               const float *samples_ch0, const float *samples_ch1, int32_t n) {\n  std::string buffer;\n  buffer.resize(WaveFileSize(n, samples_ch1 == nullptr ? 
1 : 2));\n\n  WriteWave(buffer.data(), sampling_rate, samples_ch0, samples_ch1, n);\n\n  std::ofstream os(filename, std::ios::binary);\n  if (!os) {\n    SHERPA_ONNX_LOGE(\"Failed to create '%s'\", filename.c_str());\n    return false;\n  }\n\n  os << buffer;\n  if (!os) {\n    SHERPA_ONNX_LOGE(\"Write '%s' failed\", filename.c_str());\n    return false;\n  }\n\n  return true;\n}\n\nvoid WriteWave(char *buffer, int32_t sampling_rate, const float *samples_ch0,\n               const float *samples_ch1, int32_t n) {\n  WaveHeader header{};\n  header.chunk_id = 0x46464952;      // FFIR\n  header.format = 0x45564157;        // EVAW\n  header.subchunk1_id = 0x20746d66;  // \"fmt \"\n  header.subchunk1_size = 16;        // 16 for PCM\n  header.audio_format = 1;           // PCM =1\n\n  int32_t num_channels = samples_ch1 == nullptr ? 1 : 2;\n  int32_t bits_per_sample = 16;  // int16_t\n\n  header.num_channels = num_channels;\n  header.sample_rate = sampling_rate;\n  header.byte_rate = sampling_rate * num_channels * bits_per_sample / 8;\n  header.block_align = num_channels * bits_per_sample / 8;\n  header.bits_per_sample = bits_per_sample;\n  header.subchunk2_id = 0x61746164;  // atad\n  header.subchunk2_size = n * num_channels * bits_per_sample / 8;\n\n  header.chunk_size = 36 + header.subchunk2_size;\n\n  std::vector<int16_t> samples_int16_ch0(n);\n  for (int32_t i = 0; i != n; ++i) {\n    samples_int16_ch0[i] = std::min<int32_t>(samples_ch0[i] * 32767, 32767);\n  }\n\n  std::vector<int16_t> samples_int16_ch1;\n  if (samples_ch1) {\n    samples_int16_ch1.resize(n);\n    for (int32_t i = 0; i != n; ++i) {\n      samples_int16_ch1[i] = std::min<int32_t>(samples_ch1[i] * 32767, 32767);\n    }\n  }\n\n  memcpy(buffer, &header, sizeof(WaveHeader));\n\n  if (samples_ch1 == nullptr) {\n    memcpy(buffer + sizeof(WaveHeader), samples_int16_ch0.data(),\n           n * sizeof(int16_t));\n  } else {\n    auto p = reinterpret_cast<int16_t *>(buffer + sizeof(WaveHeader));\n\n    for 
(int32_t i = 0; i != n; ++i) {\n      p[2 * i] = samples_int16_ch0[i];\n      p[2 * i + 1] = samples_int16_ch1[i];\n    }\n  }\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/csrc/wave-writer.h",
    "content": "// sherpa-onnx/csrc/wave-writer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_CSRC_WAVE_WRITER_H_\n#define SHERPA_ONNX_CSRC_WAVE_WRITER_H_\n\n#include <cstdint>\n#include <string>\n\nnamespace sherpa_onnx {\n\n// Write a single channel wave file.\n// Note that the input samples are in the range [-1, 1]. It will be multiplied\n// by 32767 and saved in int16_t format in the wave file.\n//\n// @param filename Path to save the samples.\n// @param sampling_rate Sample rate of the samples.\n// @param samples Pointer to the samples\n// @param n Number of samples\n// @return Return true if the write succeeds; return false otherwise.\nbool WriteWave(const std::string &filename, int32_t sampling_rate,\n               const float *samples, int32_t n);\n\nvoid WriteWave(char *buffer, int32_t sampling_rate, const float *samples,\n               int32_t n);\n\nbool WriteWave(const std::string &filename, int32_t sampling_rate,\n               const float *samples_ch0, const float *samples_ch1, int32_t n);\n\nvoid WriteWave(char *buffer, int32_t sampling_rate, const float *samples_ch0,\n               const float *samples_ch1, int32_t n);\n\nint64_t WaveFileSize(int32_t n_samples, int32_t num_channels = 1);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_CSRC_WAVE_WRITER_H_\n"
  },
  {
    "path": "sherpa-onnx/java-api/.build.txt",
    "content": "[win.env]\nset JAVA_HOME=D:\\java\\jdk1.8.0_121\n\n[win.build]\nmvn clean install -DskipTests -Dgpg.skip=true\n\n[linux.env]\nexport JAVA_HOME=/usr/java/jdk1.8.0_121\n\n[linux.build]\nmvn clean install -DskipTests -Dgpg.skip=true\n\n[mac.env]\nexport JAVA_HOME=~/java/jdk1.8.0_121\n\n[mac.build]\nmvn clean install -DskipTests -Dgpg.skip=true"
  },
  {
    "path": "sherpa-onnx/java-api/.gitignore",
    "content": "### Eclipse template\n*.pydevproject\n.metadata\n.gradle*\nclasses/\nbin/\ntmp/\n*.tmp\n*.bak\n*.swp\n*~.nib\nlocal.properties\n.settings/\n.loadpath\nrebel.xml\n\n# Eclipse Core\n.project\n\ngeneratedsources\n\n# External tool builders\n.externalToolBuilders/\n\n# Locally stored \"Eclipse launch configurations\"\n*.launch\n\n# CDT-specific\n.cproject\n\n# JDT-specific (Eclipse Java Development Tools)\n.classpath\n\n# PDT-specific\n.buildpath\n\n# sbteclipse plugin\n.target\n\n# TeXlipse plugin\n.texlipse\n\n\n\n### JetBrains template\n# Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm\n\n*.iml\n.flattened-pom.xml\n## Directory-based project format:\n.idea/\n# if you remove the above rule, at least ignore the following:\n\n# User-specific stuff:\n# .idea/workspace.xml\n# .idea/tasks.xml\n# .idea/dictionaries\n\n# Sensitive or high-churn files:\n# .idea/dataSources.ids\n# .idea/dataSources.xml\n# .idea/sqlDataSources.xml\n# .idea/dynamic.xml\n# .idea/uiDesigner.xml\n\n# Gradle:\n# .idea/gradle.xml\n# .idea/libraries\n\n# Mongo Explorer plugin:\n# .idea/mongoSettings.xml\n\n## File-based project format:\n*.ipr\n*.iws\n\n## Plugin-specific files:\n\n# IntelliJ\n/out/\n\n# mpeltonen/sbt-idea plugin\n.idea_modules/\n\n# JIRA plugin\natlassian-ide-plugin.xml\n\n# Crashlytics plugin (for Android Studio and IntelliJ)\ncom_crashlytics_export_strings.xml\ncrashlytics.properties\ncrashlytics-build.properties\n\nbuild/\n\n# Ignore Gradle GUI config\ngradle-app.setting\n\n# Avoid ignoring Gradle wrapper jar file (.jar files are usually ignored)\n!gradle-wrapper.jar\n\ndb\n\n### Java template\n*.class\n\n# Mobile Tools for Java (J2ME)\n.mtj.tmp/\n\n# Package Files #\n#*.jar\n\n# virtual machine crash logs, see http://www.java.com/en/download/help/error_hotspot.xml\nhs_err_pid*\n\n\n### Leiningen 
template\nclasses/\ntarget/\nlogs/\ncheckouts/\n.lein-deps-sum\n.lein-repl-history\n.lein-plugins/\n.lein-failures\n.nrepl-port\n\nquerydsl/\n\n.DS_Store\n\n*.exe\n*.out\n\n*.log\nnode_modules/\ndist/\ndist.zip\npackage-lock.json\n*.lock\nlocal.properties\n.cxx\n.externalNativeBuild\n/captures\n/build\n__pycache__/\n*.pyc\n\n\ncmake-build-debug/\ncmake-build-debug-mingw/\nvenv/\n.vs/\nDebug/\nvcpkg_installed/\n.env\n.next/\napp.zip\nsecrets.txt\nsrc.zip"
  },
  {
    "path": "sherpa-onnx/java-api/MANIFEST.MF",
    "content": "Manifest-Version: 1.0\n"
  },
  {
    "path": "sherpa-onnx/java-api/Makefile",
    "content": "# Copyright 2024 Xiaomi Corporation\n\n# all .class and .jar files are put inside out_dir\nout_dir := build\nout_jar := $(out_dir)/sherpa-onnx.jar\n\npackage_dir := com/k2fsa/sherpa/onnx\n\njava_files := LibraryUtils.java\njava_files += LibraryLoader.java\n\njava_files += VersionInfo.java\njava_files += WaveData.java\njava_files += WaveReader.java\njava_files += WaveWriter.java\njava_files += EndpointRule.java\njava_files += EndpointConfig.java\njava_files += FeatureConfig.java\njava_files += QnnConfig.java\njava_files += HomophoneReplacerConfig.java\njava_files += OnlineLMConfig.java\njava_files += OnlineParaformerModelConfig.java\njava_files += OnlineZipformer2CtcModelConfig.java\njava_files += OnlineToneCtcModelConfig.java\njava_files += OnlineNeMoCtcModelConfig.java\njava_files += OnlineTransducerModelConfig.java\njava_files += OnlineModelConfig.java\njava_files += OnlineCtcFstDecoderConfig.java\njava_files += OnlineStream.java\njava_files += OnlineRecognizerConfig.java\njava_files += OnlineRecognizerResult.java\njava_files += OnlineRecognizer.java\n\njava_files += OfflineTransducerModelConfig.java\njava_files += OfflineParaformerModelConfig.java\njava_files += OfflineWhisperModelConfig.java\njava_files += OfflineFireRedAsrModelConfig.java\njava_files += OfflineMoonshineModelConfig.java\njava_files += OfflineNemoEncDecCtcModelConfig.java\njava_files += OfflineZipformerCtcModelConfig.java\njava_files += OfflineWenetCtcModelConfig.java\njava_files += OfflineOmnilingualAsrCtcModelConfig.java\njava_files += OfflineMedAsrCtcModelConfig.java\njava_files += OfflineFireRedAsrCtcModelConfig.java\njava_files += OfflineFunAsrNanoModelConfig.java\njava_files += OfflineCanaryModelConfig.java\njava_files += OfflineSenseVoiceModelConfig.java\njava_files += OfflineDolphinModelConfig.java\njava_files += OfflineModelConfig.java\njava_files += OfflineRecognizerConfig.java\njava_files += OfflineRecognizerResult.java\njava_files += OfflineStream.java\njava_files += 
OfflineRecognizer.java\n\njava_files += GenerationConfig.java\njava_files += OfflineTtsKittenModelConfig.java\njava_files += OfflineTtsPocketModelConfig.java\njava_files += OfflineTtsSupertonicModelConfig.java\njava_files += OfflineTtsKokoroModelConfig.java\njava_files += OfflineTtsZipVoiceModelConfig.java\njava_files += OfflineTtsMatchaModelConfig.java\njava_files += OfflineTtsVitsModelConfig.java\njava_files += OfflineTtsModelConfig.java\njava_files += OfflineTtsConfig.java\njava_files += GeneratedAudio.java\njava_files += OfflineTtsCallback.java\njava_files += OfflineTts.java\n\njava_files += SpokenLanguageIdentificationWhisperConfig.java\njava_files += SpokenLanguageIdentificationConfig.java\njava_files += SpokenLanguageIdentification.java\n\njava_files += OfflinePunctuationModelConfig.java\njava_files += OfflinePunctuationConfig.java\njava_files += OfflinePunctuation.java\n\njava_files += OnlinePunctuationModelConfig.java\njava_files += OnlinePunctuationConfig.java\njava_files += OnlinePunctuation.java\n\njava_files += OfflineZipformerAudioTaggingModelConfig.java\njava_files += AudioTaggingModelConfig.java\njava_files += AudioTaggingConfig.java\njava_files += AudioEvent.java\njava_files += AudioTagging.java\n\njava_files += SpeakerEmbeddingExtractorConfig.java\njava_files += SpeakerEmbeddingExtractor.java\njava_files += SpeakerEmbeddingManager.java\n\njava_files += TenVadModelConfig.java\njava_files += SileroVadModelConfig.java\njava_files += VadModelConfig.java\njava_files += SpeechSegment.java\njava_files += Vad.java\n\njava_files += KeywordSpotterConfig.java\njava_files += KeywordSpotterResult.java\njava_files += KeywordSpotter.java\n\njava_files += OfflineSpeakerSegmentationPyannoteModelConfig.java\njava_files += OfflineSpeakerSegmentationModelConfig.java\njava_files += FastClusteringConfig.java\njava_files += OfflineSpeakerDiarizationConfig.java\njava_files += OfflineSpeakerDiarizationSegment.java\njava_files += 
OfflineSpeakerDiarizationCallback.java\njava_files += OfflineSpeakerDiarization.java\n\njava_files += OfflineSpeechDenoiserGtcrnModelConfig.java\njava_files += OfflineSpeechDenoiserDpdfNetModelConfig.java\njava_files += OfflineSpeechDenoiserModelConfig.java\njava_files += OfflineSpeechDenoiserConfig.java\njava_files += DenoisedAudio.java\njava_files += OfflineSpeechDenoiser.java\njava_files += OnlineSpeechDenoiserConfig.java\njava_files += OnlineSpeechDenoiser.java\n\nclass_files := $(java_files:%.java=%.class)\n\njava_files := $(addprefix src/main/java/$(package_dir)/,$(java_files))\nclass_files := $(addprefix $(out_dir)/$(package_dir)/,$(class_files))\n\n$(info -- java files $(java_files))\n$(info --)\n$(info -- class files $(class_files))\n\n.PHONY: all clean native\n\nall: $(out_jar)\n\n# macos x86_64 -> osx-x64\n# macos arm64 -> osx-aarch64\n# linux x86_64 -> linux-x64\n# linux arm64 -> linux-aarch64\n# windows x86_64 -> win-x64\n# windows arm64 -> win-aarch64\n# windows x86 -> win-x86\nnative:\n\tjar cfvm ./sherpa-onnx-native.jar MANIFEST.MF -C ./resources .\n\n$(out_jar): $(class_files)\n\t# jar --create --verbose --file $(out_jar) -C $(out_dir) ./\n\t# jar cvf $(out_jar) -C $(out_dir) ./\n\tjar cfvm $@ MANIFEST.MF -C $(out_dir) .\n\nclean:\n\t$(RM) -rfv $(out_dir)\n\n$(class_files): $(out_dir)/$(package_dir)/%.class: src/main/java/$(package_dir)/%.java\n\tmkdir -p $(out_dir)\n\tjavac --release 8 -Xlint:-options -d $(out_dir) -cp $(out_dir) $<\n"
  },
  {
    "path": "sherpa-onnx/java-api/pom.xml",
    "content": "<project xmlns=\"http://maven.apache.org/POM/4.0.0\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd\">\n  <modelVersion>4.0.0</modelVersion>\n  <groupId>com.litongjava</groupId>\n  <artifactId>sherpa-onnx-java-api</artifactId>\n  <version>1.0.1</version>\n  <packaging>jar</packaging>\n  <name>sherpa-onnx-java-api</name>\n  <description>sherpa-onnx-java-api</description>\n  <url>https://github.com/k2-fsa/sherpa-onnx/tree/master/sherpa-onnx/java-api</url>\n  <properties>\n    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>\n    <java.version>1.8</java.version>\n    <maven.compiler.source>${java.version}</maven.compiler.source>\n    <maven.compiler.target>${java.version}</maven.compiler.target>\n  </properties>\n\n  <licenses>\n    <license>\n      <name>The Apache Software License, Version 2.0</name>\n      <url>http://apache.org/licenses/LICENSE-2.0.txt</url>\n    </license>\n  </licenses>\n\n  <developers>\n    <developer>\n      <id>litongjava</id>\n      <name>Tong Li</name>\n      <email>litongjava001@gmail.com</email>\n      <url>https://github.com/litongjava</url>\n    </developer>\n  </developers>\n\n  <scm>\n    <connection>scm:git:git@github.com:k2-fsa/sherpa-onnx.git</connection>\n    <developerConnection>scm:git:git@github.com:k2-fsa/sherpa-onnx.git</developerConnection>\n    <url>git@github.com:k2-fsa/sherpa-onnx.git</url>\n  </scm>\n\n  <build>\n    <plugins>\n      <!-- Source -->\n      <plugin>\n        <groupId>org.apache.maven.plugins</groupId>\n        <artifactId>maven-source-plugin</artifactId>\n        <version>2.2.1</version>\n        <executions>\n          <execution>\n            <phase>package</phase>\n            <goals>\n              <goal>jar-no-fork</goal>\n            </goals>\n          </execution>\n        </executions>\n      </plugin>\n      <!-- Javadoc -->\n      <plugin>\n        
<groupId>org.apache.maven.plugins</groupId>\n        <artifactId>maven-javadoc-plugin</artifactId>\n        <version>2.9.1</version>\n        <configuration>\n          <!-- Suppress Javadoc doclint checks -->\n          <additionalparam>-Xdoclint:none</additionalparam>\n        </configuration>\n        <executions>\n          <execution>\n            <phase>package</phase>\n            <goals>\n              <goal>jar</goal>\n            </goals>\n          </execution>\n        </executions>\n      </plugin>\n      <!-- GPG mvn clean deploy -Dgpg.passphrase=YourPassphrase -->\n      <plugin>\n        <groupId>org.apache.maven.plugins</groupId>\n        <artifactId>maven-gpg-plugin</artifactId>\n        <version>1.5</version>\n        <executions>\n          <execution>\n            <id>sign-artifacts</id>\n            <phase>verify</phase>\n            <goals>\n              <goal>sign</goal>\n            </goals>\n          </execution>\n        </executions>\n      </plugin>\n      <plugin>\n        <groupId>org.sonatype.central</groupId>\n        <artifactId>central-publishing-maven-plugin</artifactId>\n        <version>0.7.0</version>\n        <extensions>true</extensions>\n        <configuration>\n          <publishingServerId>central</publishingServerId>\n        </configuration>\n      </plugin>\n    </plugins>\n  </build>\n</project>"
  },
  {
    "path": "sherpa-onnx/java-api/readme.md",
    "content": "# User Guide\n\n*Applicable to Windows / macOS / Linux (using Windows as an example for dynamic library loading)*\n\n## 1. Prerequisites\n\n* Java 1.8+ environment\n* Download and prepare the following:\n\n  * Sherpa-ONNX Java API (Maven dependency)\n  * Kokoro TTS model files (including `model.onnx`, etc.)\n\n---\n\n## 2. Add Maven Dependency\n\nIn your `pom.xml`, add:\n\n```xml\n<dependency>\n  <groupId>com.litongjava</groupId>\n  <artifactId>sherpa-onnx-java-api</artifactId>\n  <version>1.0.1</version>\n</dependency>\n```\n\n---\n\n## 3. Obtain and Configure Native Dynamic Libraries (JNI)\n\n### 3.1 Install ONNX Runtime\n\n#### Windows 10\n\nStarting from Windows 10 v1809 and all versions of Windows 11, the system comes with built-in ONNX Runtime as part of Windows ML (WinRT API), exposed through Windows.AI.MachineLearning.dll. You can directly use WinML to load and run ONNX models without additional downloads or installations.\n[run-onnx-models](https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/run-onnx-models)\n\n#### Linux\n\nSherpa-ONNX does **not** bundle ONNX Runtime. To install it manually:\n\n1. Download the Linux x64 binary from Microsoft’s GitHub Releases:\n\n   ```bash\n   wget https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-linux-x64-1.23.2.tgz\n   tar -xzf onnxruntime-linux-x64-1.23.2.tgz\n   ```\n\n2. Copy and symlink the library into a system directory:\n\n   ```bash\n   sudo cp onnxruntime-linux-x64-1.23.2/lib/libonnxruntime.so* /usr/local/lib/\n   sudo ln -sf /usr/local/lib/libonnxruntime.so.1.23.2 /usr/local/lib/libonnxruntime.so\n   ```\n\n3. Update the shared-library cache and verify:\n\n   ```bash\n   sudo ldconfig\n   ldconfig -p | grep onnxruntime\n   ```\n\n#### macOS\n\nSherpa-ONNX also requires you to install ONNX Runtime on macOS:\n\n1. 
Download the macOS ARM64 binary:\n\n   ```bash\n   wget https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-osx-arm64-1.23.2.tgz\n   tar -xzf onnxruntime-osx-arm64-1.23.2.tgz\n   ```\n\n2. Copy the dylib into `/usr/local/lib`:\n\n   ```bash\n   sudo cp onnxruntime-osx-arm64-1.23.2/lib/libonnxruntime.1.23.2.dylib /usr/local/lib/\n   ```\n\n3. Add `/usr/local/lib` to `dyld`’s search path:\n\n   ```bash\n   export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH\n   ```\n\n4. Verify with `otool`:\n\n   ```bash\n   otool -L /Users/ping/lib/darwin_arm64/libsherpa-onnx-jni.dylib\n   ```\n\n---\n\n### 3.2 Common Errors & Troubleshooting\n\n**Error Example:**\n\n```text\nException in thread \"main\" java.lang.UnsatisfiedLinkError: no sherpa-onnx-jni in java.library.path: ...\n```\n\nThis means the JVM couldn’t locate the native library in `java.library.path`.\n\n**Troubleshooting steps:**\n\n1. Ensure you downloaded the build matching your OS and architecture (e.g. win-x64 vs. arm64).\n\n2. Test with an absolute path:\n\n   ```bash\n   java -Djava.library.path=C:\\full\\path\\to\\jni -jar your-app.jar\n   ```\n\n3. Print or inspect `java.library.path` at runtime (e.g. `System.out.println(System.getProperty(\"java.library.path\"));`).\n\n4. **Do not** hack the internal `sys_paths` via reflection (it may throw `NoSuchFieldException`). Use `-Djava.library.path` instead.\n\n---\n\n## 4. 
Download & Prepare the Kokoro Model\n\nFetch the model package from the official release (example: Kokoro v0.19 English):\n\n```\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n```\n\n```bash\n# Download (manually or via script)\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n\n# Extract\ntar -xjf kokoro-en-v0_19.tar.bz2\n\n# Inspect\nls -lh kokoro-en-v0_19/\n```\n\nYou should see:\n\n```\nLICENSE\nREADME.md\nespeak-ng-data/    # speech data directory\nmodel.onnx         # TTS model\ntokens.txt         # token mapping\nvoices.bin         # voice embeddings\n```\n\nMake sure your Java code points to these files (using either relative or absolute paths).\n\n---\n\n## 5. Test Code (Java Example)\n\n```java\npackage com.litongjava.linux.tts;\n\nimport com.k2fsa.sherpa.onnx.GeneratedAudio;\nimport com.k2fsa.sherpa.onnx.OfflineTts;\nimport com.k2fsa.sherpa.onnx.OfflineTtsConfig;\nimport com.k2fsa.sherpa.onnx.OfflineTtsKokoroModelConfig;\nimport com.k2fsa.sherpa.onnx.OfflineTtsModelConfig;\n\npublic class NonStreamingTtsKokoroEn {\n  public static void main(String[] args) {\n    String model   = \"./kokoro-en-v0_19/model.onnx\";\n    String voices  = \"./kokoro-en-v0_19/voices.bin\";\n    String tokens  = \"./kokoro-en-v0_19/tokens.txt\";\n    String dataDir = \"./kokoro-en-v0_19/espeak-ng-data\";\n    String text    = \"Today as always, men fall into two groups: slaves and free men. 
Whoever does not have\"\n                   + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n                   + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsKokoroModelConfig kokoroConfig = OfflineTtsKokoroModelConfig.builder()\n        .setModel(model)\n        .setVoices(voices)\n        .setTokens(tokens)\n        .setDataDir(dataDir)\n        .build();\n\n    OfflineTtsModelConfig modelConfig = OfflineTtsModelConfig.builder()\n        .setKokoro(kokoroConfig)\n        .setNumThreads(2)\n        .setDebug(true)\n        .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder()\n        .setModel(modelConfig)\n        .build();\n\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid   = 0;\n    float speed = 1.0f;\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generate(text, sid, speed);\n    long stop  = System.currentTimeMillis();\n\n    float elapsed   = (stop - start) / 1000.0f;\n    float duration  = audio.getSamples().length / (float) audio.getSampleRate();\n    float rtf       = elapsed / duration;\n\n    String outFile = \"tts-kokoro-en.wav\";\n    audio.save(outFile);\n\n    System.out.printf(\"-- elapsed           : %.3f seconds%n\", elapsed);\n    System.out.printf(\"-- audio duration    : %.3f seconds%n\", duration);\n    System.out.printf(\"-- real-time factor  : %.3f%n\", rtf);\n    System.out.printf(\"-- text              : %s%n\", text);\n    System.out.printf(\"-- Saved to          : %s%n\", outFile);\n\n    tts.release();\n  }\n}\n```\n\n### Output Explanation\n\nAfter successful execution, you should see something like:\n\n```\n-- elapsed           : 6.739 seconds\n-- audio duration    : 6.739 seconds\n-- real-time factor  : 0.563\n-- text              : ...\n-- Saved to          : tts-kokoro-en.wav\n```\n\nA file named `tts-kokoro-en.wav` will appear in the current directory—play it with any audio player to verify.\n"
  },
  {
    "path": "sherpa-onnx/java-api/readme.zh.md",
    "content": "# 使用指南\n\n*适用于 Windows / macOS / Linux（以 Windows 为例说明动态库加载）*\n\n## 1. 前提条件\n\n* Java 1.8+ 环境\n* 下载并准备好以下内容：\n  * Sherpa-ONNX Java API（Maven 依赖）\n  * Kokoro TTS 模型文件（包含 `model.onnx` 等）\n\n---\n\n## 2. 添加 Maven 依赖\n\n在你的 `pom.xml` 中添加如下依赖：\n\n```xml\n<dependency>\n  <groupId>com.litongjava</groupId>\n  <artifactId>sherpa-onnx-java-api</artifactId>\n  <version>1.0.1</version>\n</dependency>\n```\n\n---\n\n## 3. 获取并配置本地动态链接库（JNI）\n\n### 3.1 安装 ONNX Runtime\n\n#### 1. Windows 11\n\nStarting from Windows 10 v1809 and all versions of Windows 11, the system comes with built-in ONNX Runtime as part of Windows ML (WinRT API), exposed through Windows.AI.MachineLearning.dll. You can directly use WinML to load and run ONNX models without additional downloads or installations.\n(run-onnx-models)[https://learn.microsoft.com/en-us/windows/ai/new-windows-ml/run-onnx-models]\n\n#### 2. Linux\n\nSherpa-ONNX 并不包含 ONNX Runtime，需要手动下载并配置：\n\n1. 从微软官方 GitHub Releases 下载 Linux 64 位二进制包：\n\n   ```bash\n   wget https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-linux-x64-1.23.2.tgz\n   tar -xzf onnxruntime-linux-x64-1.23.2.tgz\n   ```\n2. 将解压后的 `libonnxruntime.so` 文件复制到系统库目录，并创建软链接：\n\n   ```bash\n   sudo cp onnxruntime-linux-x64-1.23.2/lib/libonnxruntime.so* /usr/local/lib/\n   sudo ln -sf /usr/local/lib/libonnxruntime.so.1.23.2 /usr/local/lib/libonnxruntime.so\n   ```\n3. 更新共享库缓存并验证安装：\n\n   ```bash\n   sudo ldconfig\n   ldconfig -p | grep onnxruntime\n   ```\n\n#### 3. macOS\n\nSherpa-ONNX 同样不包含 ONNX Runtime，需要从官方获取并配置：\n\n1. 下载 macOS ARM64 版本二进制包：\n\n   ```bash\n   wget https://github.com/microsoft/onnxruntime/releases/download/v1.23.2/onnxruntime-osx-arm64-1.23.2.tgz\n   tar -xzf onnxruntime-osx-arm64-1.23.2.tgz\n   ```\n2. 将 `libonnxruntime.1.23.2.dylib` 复制到 `/usr/local/lib`：\n\n   ```bash\n   sudo cp onnxruntime-osx-arm64-1.23.2/lib/libonnxruntime.1.23.2.dylib /usr/local/lib/\n   ```\n3. 
将 `/usr/local/lib` 添加到 `dyld` 的搜索路径：\n\n   ```bash\n   export DYLD_LIBRARY_PATH=/usr/local/lib:$DYLD_LIBRARY_PATH\n   ```\n4. 使用 `otool` 验证：\n\n   ```bash\n   otool -L /Users/ping/lib/darwin_arm64/libsherpa-onnx-jni.dylib\n   ```\n---\n\n\n### 3.2 常见错误与排查\n\n**错误示例：**\n\n```text\nException in thread \"main\" java.lang.UnsatisfiedLinkError: no sherpa-onnx-jni in java.library.path: ...\n```\n\n说明 JVM 没有在 `java.library.path` 中找到本地库。\n\n排查步骤：\n\n1. 确认下载的是与你操作系统与架构匹配的版本（如 win-x64 vs arm64 等）。\n2. 用绝对路径测试：将 `.dll` 放在某个目录并运行：\n\n   ```sh\n   java -Djava.library.path=C:\\full\\path\\to\\jni -jar your-app.jar\n   ```\n3. 打印或检查 `java.library.path` 内容（示例代码里可输出 `System.getProperty(\"java.library.path\")`）。\n4. 避免通过反射修改 `sys_paths`（不要尝试 hack `java.library.path` 的内部字段，容易引发 `NoSuchFieldException: sys_paths`，建议直接用 `-Djava.library.path`）。\n\n---\n\n## 4. 下载并准备 Kokoro 模型\n\n从官方 release 获取模型包（以英文 Kokoro v0.19 为例）：\n```\nhttps://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n```\n\n```sh\n# 下载（手工或脚本）\n# 例如从 GitHub releases:\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n\n# 解压\ntar -xjf kokoro-en-v0_19.tar.bz2\n\n# 查看结构\nls -lh kokoro-en-v0_19/\n```\n\n该目录结构示例（解压后应包含）：\n\n```\nLICENSE\nREADME.md\nespeak-ng-data/        # 语音数据目录\nmodel.onnx            # TTS 模型\ntokens.txt           # token 映射\nvoices.bin           # voice embedding\n```\n\n确保这些路径在你的 Java 程序中指向正确的位置（相对或绝对皆可）。\n\n---\n\n## 5. 
测试代码（Java 示例）\n\n```java\npackage com.litongjava.linux.tts;\n\nimport com.k2fsa.sherpa.onnx.GeneratedAudio;\nimport com.k2fsa.sherpa.onnx.OfflineTts;\nimport com.k2fsa.sherpa.onnx.OfflineTtsConfig;\nimport com.k2fsa.sherpa.onnx.OfflineTtsKokoroModelConfig;\nimport com.k2fsa.sherpa.onnx.OfflineTtsModelConfig;\n\npublic class NonStreamingTtsKokoroEn {\n  public static void main(String[] args) {\n    String model = \"./kokoro-en-v0_19/model.onnx\";\n    String voices = \"./kokoro-en-v0_19/voices.bin\";\n    String tokens = \"./kokoro-en-v0_19/tokens.txt\";\n    String dataDir = \"./kokoro-en-v0_19/espeak-ng-data\";\n    String text = \"Today as always, men fall into two groups: slaves and free men. Whoever does not have\"\n        + \" two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a\"\n        + \" businessman, an official, or a scholar.\";\n\n    OfflineTtsKokoroModelConfig kokoroModelConfig = OfflineTtsKokoroModelConfig.builder()\n        .setModel(model)\n        .setVoices(voices)\n        .setTokens(tokens)\n        .setDataDir(dataDir)\n        .build();\n\n    OfflineTtsModelConfig modelConfig = OfflineTtsModelConfig.builder()\n        .setKokoro(kokoroModelConfig)\n        .setNumThreads(2)\n        .setDebug(true)\n        .build();\n\n    OfflineTtsConfig config = OfflineTtsConfig.builder()\n        .setModel(modelConfig)\n        .build();\n\n    OfflineTts tts = new OfflineTts(config);\n\n    int sid = 0;\n    float speed = 1.0f;\n    long start = System.currentTimeMillis();\n    GeneratedAudio audio = tts.generate(text, sid, speed);\n    long stop = System.currentTimeMillis();\n\n    float timeElapsedSeconds = (stop - start) / 1000.0f;\n    float audioDuration = audio.getSamples().length / (float) audio.getSampleRate();\n    float real_time_factor = timeElapsedSeconds / audioDuration;\n\n    String waveFilename = \"tts-kokoro-en.wav\";\n    audio.save(waveFilename);\n    System.out.printf(\"-- elapsed : %.3f seconds\\n\", 
timeElapsedSeconds);\n    System.out.printf(\"-- audio duration: %.3f seconds\\n\", audioDuration);\n    System.out.printf(\"-- real-time factor (RTF): %.3f\\n\", real_time_factor);\n    System.out.printf(\"-- text: %s\\n\", text);\n    System.out.printf(\"-- Saved to %s\\n\", waveFilename);\n\n    tts.release();\n  }\n}\n```\n\n### 输出说明\n\n成功执行后会输出类似：\n\n```\n-- elapsed : 6.739 seconds\n-- audio duration: 6.739 seconds\n-- real-time factor (RTF): 0.563\n-- text: ...\n-- Saved to tts-kokoro-en.wav\n```\n\n并在当前目录生成 `tts-kokoro-en.wav`，可以用任意音频播放器播放验证。\n\n---\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/AudioEvent.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class AudioEvent {\n    private String name = \"\";\n    private int index = 0;\n    private float prob = 0;\n\n    public AudioEvent(String name, int index, float prob) {\n        this.name = name;\n        this.index = index;\n        this.prob = prob;\n    }\n\n    public String getName() {\n        return name;\n    }\n\n    public int getIndex() {\n        return index;\n    }\n\n    public float getProb() {\n        return prob;\n    }\n\n    @Override\n    public String toString() {\n        return String.format(\"AudioEven(name=%s, index=%d, prob=%.3f)\\n\", name, index, prob);\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/AudioTagging.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class AudioTagging {\n    private long ptr = 0;\n\n    public AudioTagging(AudioTaggingConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid AudioTaggingConfig: failed to create native AudioTagging\");\n        }\n    }\n\n    public OfflineStream createStream() {\n        long p = createStream(ptr);\n        return new OfflineStream(p);\n    }\n\n    public AudioEvent[] compute(OfflineStream stream) {\n        return compute(stream, -1);\n\n    }\n\n    public AudioEvent[] compute(OfflineStream stream, int topK) {\n        return compute(ptr, stream.getPtr(), topK);\n    }\n\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // You'd better call it manually if it is not used anymore\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(AudioTaggingConfig config);\n\n    private native long createStream(long ptr);\n\n    private native AudioEvent[] compute(long ptr, long streamPtr, int topK);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/AudioTaggingConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class AudioTaggingConfig {\n    private final AudioTaggingModelConfig model;\n    private final String labels;\n    private final int topK;\n\n    private AudioTaggingConfig(Builder builder) {\n        this.model = builder.model;\n        this.labels = builder.labels;\n        this.topK = builder.topK;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private AudioTaggingModelConfig model = AudioTaggingModelConfig.builder().build();\n        private String labels = \"\";\n        private int topK = 5;\n\n        public AudioTaggingConfig build() {\n            return new AudioTaggingConfig(this);\n        }\n\n        public Builder setModel(AudioTaggingModelConfig model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setLabels(String labels) {\n            this.labels = labels;\n            return this;\n        }\n\n        public Builder setTopK(int topK) {\n            this.topK = topK;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/AudioTaggingModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class AudioTaggingModelConfig {\n    private final OfflineZipformerAudioTaggingModelConfig zipformer;\n    private final String ced;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private AudioTaggingModelConfig(Builder builder) {\n        this.zipformer = builder.zipformer;\n        this.ced = builder.ced;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private OfflineZipformerAudioTaggingModelConfig zipformer = OfflineZipformerAudioTaggingModelConfig.builder().build();\n        private String ced = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public AudioTaggingModelConfig build() {\n            return new AudioTaggingModelConfig(this);\n        }\n\n        public Builder setZipformer(OfflineZipformerAudioTaggingModelConfig zipformer) {\n            this.zipformer = zipformer;\n            return this;\n        }\n\n        public Builder setCED(String ced) {\n            this.ced = ced;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/DenoisedAudio.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class DenoisedAudio {\n    private final float[] samples;\n    private final int sampleRate;\n\n    public DenoisedAudio(float[] samples, int sampleRate) {\n        LibraryLoader.maybeLoad();\n        this.samples = samples;\n        this.sampleRate = sampleRate;\n    }\n\n    public int getSampleRate() {\n        return sampleRate;\n    }\n\n    public float[] getSamples() {\n        return samples;\n    }\n\n    // return true if saved successfully.\n    public boolean save(String filename) {\n        return saveImpl(filename, samples, sampleRate);\n    }\n\n    private native boolean saveImpl(String filename, float[] samples, int sampleRate);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/EndpointConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class EndpointConfig {\n\n    private final EndpointRule rule1;\n    private final EndpointRule rule2;\n    private final EndpointRule rule3;\n\n    private EndpointConfig(Builder builder) {\n        this.rule1 = builder.rule1;\n        this.rule2 = builder.rule2;\n        this.rule3 = builder.rule3;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public EndpointRule getRule1() {\n        return rule1;\n    }\n\n    public EndpointRule getRule2() {\n        return rule2;\n    }\n\n    public EndpointRule getRule3() {\n        return rule3;\n    }\n\n    public static class Builder {\n\n        private EndpointRule rule1 = EndpointRule.builder().\n                setMustContainNonSilence(false).\n                setMinTrailingSilence(2.4f).\n                setMinUtteranceLength(0).\n                build();\n        private EndpointRule rule2 = EndpointRule.builder().\n                setMustContainNonSilence(true).\n                setMinTrailingSilence(1.4f).\n                setMinUtteranceLength(0).\n                build();\n        private EndpointRule rule3 = EndpointRule.builder().\n                setMustContainNonSilence(false).\n                setMinTrailingSilence(0.0f).\n                setMinUtteranceLength(20.0f).\n                build();\n\n        public EndpointConfig build() {\n            return new EndpointConfig(this);\n        }\n\n        public Builder setRule1(EndpointRule rule) {\n            this.rule1 = rule;\n            return this;\n        }\n\n        public Builder setRule2(EndpointRule rule) {\n            this.rule2 = rule;\n            return this;\n        }\n\n        public Builder setRule3(EndpointRule rule) {\n            this.rule3 = rule;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/EndpointRule.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class EndpointRule {\n\n    private final boolean mustContainNonSilence;\n    private final float minTrailingSilence;\n    private final float minUtteranceLength;\n\n    private EndpointRule(Builder builder) {\n        this.mustContainNonSilence = builder.mustContainNonSilence;\n        this.minTrailingSilence = builder.minTrailingSilence;\n        this.minUtteranceLength = builder.minUtteranceLength;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public float getMinTrailingSilence() {\n        return minTrailingSilence;\n    }\n\n    public float getMinUtteranceLength() {\n        return minUtteranceLength;\n    }\n\n    public boolean getMustContainNonSilence() {\n        return mustContainNonSilence;\n    }\n\n    public static class Builder {\n        private boolean mustContainNonSilence = false;\n        private float minTrailingSilence = 0;\n        private float minUtteranceLength = 0;\n\n        public EndpointRule build() {\n            return new EndpointRule(this);\n        }\n\n        public Builder setMustContainNonSilence(boolean mustContainNonSilence) {\n            this.mustContainNonSilence = mustContainNonSilence;\n            return this;\n        }\n\n        public Builder setMinTrailingSilence(float minTrailingSilence) {\n            this.minTrailingSilence = minTrailingSilence;\n            return this;\n        }\n\n        public Builder setMinUtteranceLength(float minUtteranceLength) {\n            this.minUtteranceLength = minUtteranceLength;\n            return this;\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/FastClusteringConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class FastClusteringConfig {\n    private final int numClusters;\n    private final float threshold;\n\n    private FastClusteringConfig(Builder builder) {\n        this.numClusters = builder.numClusters;\n        this.threshold = builder.threshold;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public int getNumClusters() {\n        return numClusters;\n    }\n\n    public float getThreshold() {\n        return threshold;\n    }\n\n    public static class Builder {\n        private int numClusters = -1;\n        private float threshold = 0.5f;\n\n        public FastClusteringConfig build() {\n            return new FastClusteringConfig(this);\n        }\n\n        public Builder setNumClusters(int numClusters) {\n            this.numClusters = numClusters;\n            return this;\n        }\n\n        public Builder setThreshold(float threshold) {\n            this.threshold = threshold;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/FeatureConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class FeatureConfig {\n    private final int sampleRate;\n    private final int featureDim;\n    private final float dither;\n\n    private FeatureConfig(Builder builder) {\n        this.sampleRate = builder.sampleRate;\n        this.featureDim = builder.featureDim;\n        this.dither = builder.dither;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public int getSampleRate() {\n        return sampleRate;\n    }\n\n    public int getFeatureDim() {\n        return featureDim;\n    }\n\n    public float getDither() {\n        return dither;\n    }\n\n    public static class Builder {\n        private int sampleRate = 16000;\n        private int featureDim = 80;\n        private float dither = 0.0f;\n\n        public FeatureConfig build() {\n            return new FeatureConfig(this);\n        }\n\n        public Builder setSampleRate(int sampleRate) {\n            this.sampleRate = sampleRate;\n            return this;\n        }\n\n        public Builder setFeatureDim(int featureDim) {\n            this.featureDim = featureDim;\n            return this;\n        }\n\n        public Builder setDither(float dither) {\n            this.dither = dither;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/GeneratedAudio.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class GeneratedAudio {\n    private final float[] samples;\n    private final int sampleRate;\n\n    public GeneratedAudio(float[] samples, int sampleRate) {\n        LibraryLoader.maybeLoad();\n        this.samples = samples;\n        this.sampleRate = sampleRate;\n    }\n\n    public int getSampleRate() {\n        return sampleRate;\n    }\n\n    public float[] getSamples() {\n        return samples;\n    }\n\n    // return true if saved successfully.\n    public boolean save(String filename) {\n        return saveImpl(filename, samples, sampleRate);\n    }\n\n    private native boolean saveImpl(String filename, float[] samples, int sampleRate);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/GenerationConfig.java",
    "content": "// Copyright 2026 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\nimport java.util.Map;\n\n/**\n * Configuration for generating audio.\n * Mirrors Kotlin GenerationConfig.\n */\npublic class GenerationConfig {\n\n    private float silenceScale = 0.2f;\n    private float speed = 1.0f;\n    private int sid = 0;\n\n    /** Reference audio samples (mono, [-1, 1]). */\n    private float[] referenceAudio = null;\n\n    /** Sample rate of reference audio */\n    private int referenceSampleRate = 0;\n\n    /** Optional reference text */\n    private String referenceText = null;\n\n    /** Number of steps in flow matching */\n    private int numSteps = 5;\n\n    /** Extra model-specific key-value pairs. Can be null. */\n    private Map<String, String> extra = null;\n\n    /** Default constructor */\n    public GenerationConfig() {\n    }\n\n    /** Getters */\n    public float getSilenceScale() {\n        return silenceScale;\n    }\n\n    public float getSpeed() {\n        return speed;\n    }\n\n    public int getSid() {\n        return sid;\n    }\n\n    public float[] getReferenceAudio() {\n        return referenceAudio;\n    }\n\n    public int getReferenceSampleRate() {\n        return referenceSampleRate;\n    }\n\n    public String getReferenceText() {\n        return referenceText;\n    }\n\n    public int getNumSteps() {\n        return numSteps;\n    }\n\n    public Map<String, String> getExtra() {\n        return extra;\n    }\n\n    /** Setters */\n    public void setSilenceScale(float silenceScale) {\n        this.silenceScale = silenceScale;\n    }\n\n    public void setSpeed(float speed) {\n        this.speed = speed;\n    }\n\n    public void setSid(int sid) {\n        this.sid = sid;\n    }\n\n    public void setReferenceAudio(float[] referenceAudio) {\n        this.referenceAudio = referenceAudio;\n    }\n\n    public void setReferenceSampleRate(int referenceSampleRate) {\n        this.referenceSampleRate = referenceSampleRate;\n    
}\n\n    public void setReferenceText(String referenceText) {\n        this.referenceText = referenceText;\n    }\n\n    public void setNumSteps(int numSteps) {\n        this.numSteps = numSteps;\n    }\n\n    public void setExtra(Map<String, String> extra) {\n        this.extra = extra;\n    }\n\n    @Override\n    public String toString() {\n        return \"GenerationConfig{\" +\n                \"silenceScale=\" + silenceScale +\n                \", speed=\" + speed +\n                \", sid=\" + sid +\n                \", referenceAudioLength=\" + (referenceAudio != null ? referenceAudio.length : 0) +\n                \", referenceSampleRate=\" + referenceSampleRate +\n                \", referenceText='\" + referenceText + '\\'' +\n                \", numSteps=\" + numSteps +\n                \", extra=\" + extra +\n                '}';\n    }\n}\n\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/HomophoneReplacerConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class HomophoneReplacerConfig {\n    private final String dictDir;  // unused\n    private final String lexicon;\n    private final String ruleFsts;\n\n    private HomophoneReplacerConfig(Builder builder) {\n        this.dictDir = builder.dictDir;\n        this.lexicon = builder.lexicon;\n        this.ruleFsts = builder.ruleFsts;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getDictDir() {\n        return dictDir;\n    }\n\n    public String getLexicon() {\n        return lexicon;\n    }\n\n    public String getRuleFsts() {\n        return ruleFsts;\n    }\n\n    public static class Builder {\n        private String dictDir = \"\";\n        private String lexicon = \"\";\n        private String ruleFsts = \"\";\n\n        public HomophoneReplacerConfig build() {\n            return new HomophoneReplacerConfig(this);\n        }\n\n        public Builder setDictDir(String dictDir) {\n            this.dictDir = dictDir;\n            return this;\n        }\n\n        public Builder setLexicon(String lexicon) {\n            this.lexicon = lexicon;\n            return this;\n        }\n\n        public Builder setRuleFsts(String ruleFsts) {\n            this.ruleFsts = ruleFsts;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/KeywordSpotter.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class KeywordSpotter {\n    private long ptr = 0;\n\n    public KeywordSpotter(KeywordSpotterConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid KeywordSpotterConfig: failed to create native KeywordSpotter\");\n        }\n    }\n\n    public OnlineStream createStream(String keywords) {\n        long p = createStream(ptr, keywords);\n        return new OnlineStream(p);\n    }\n\n    public OnlineStream createStream() {\n        long p = createStream(ptr, \"\");\n        return new OnlineStream(p);\n    }\n\n    public void decode(OnlineStream s) {\n        decode(ptr, s.getPtr());\n    }\n\n    public void reset(OnlineStream s) {\n        reset(ptr, s.getPtr());\n    }\n\n    public boolean isReady(OnlineStream s) {\n        return isReady(ptr, s.getPtr());\n    }\n\n    public KeywordSpotterResult getResult(OnlineStream s) {\n        return getResult(ptr, s.getPtr());\n    }\n\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // Call this explicitly once the spotter is no longer needed; it frees the native object.\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native long newFromFile(KeywordSpotterConfig config);\n\n    private native void delete(long ptr);\n\n    private native long createStream(long ptr, String keywords);\n\n    private native void decode(long ptr, long streamPtr);\n\n    private native void reset(long ptr, long streamPtr);\n\n    private native boolean isReady(long ptr, long streamPtr);\n\n    private native KeywordSpotterResult getResult(long ptr, long streamPtr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/KeywordSpotterConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class KeywordSpotterConfig {\n    private final FeatureConfig featConfig;\n    private final OnlineModelConfig modelConfig;\n\n    private final int maxActivePaths;\n    private final String keywordsFile;\n    private final float keywordsScore;\n    private final float keywordsThreshold;\n    private final int numTrailingBlanks;\n\n    private KeywordSpotterConfig(Builder builder) {\n        this.featConfig = builder.featConfig;\n        this.modelConfig = builder.modelConfig;\n        this.maxActivePaths = builder.maxActivePaths;\n        this.keywordsFile = builder.keywordsFile;\n        this.keywordsScore = builder.keywordsScore;\n        this.keywordsThreshold = builder.keywordsThreshold;\n        this.numTrailingBlanks = builder.numTrailingBlanks;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private FeatureConfig featConfig = FeatureConfig.builder().build();\n        private OnlineModelConfig modelConfig = OnlineModelConfig.builder().build();\n        private int maxActivePaths = 4;\n        private String keywordsFile = \"keywords.txt\";\n        private float keywordsScore = 1.5f;\n        private float keywordsThreshold = 0.25f;\n        private int numTrailingBlanks = 2;\n\n        public KeywordSpotterConfig build() {\n            return new KeywordSpotterConfig(this);\n        }\n\n        public Builder setFeatureConfig(FeatureConfig featConfig) {\n            this.featConfig = featConfig;\n            return this;\n        }\n\n        public Builder setOnlineModelConfig(OnlineModelConfig modelConfig) {\n            this.modelConfig = modelConfig;\n            return this;\n        }\n\n        public Builder setMaxActivePaths(int maxActivePaths) {\n            this.maxActivePaths = maxActivePaths;\n            return this;\n        }\n\n        public Builder 
setKeywordsFile(String keywordsFile) {\n            this.keywordsFile = keywordsFile;\n            return this;\n        }\n\n        public Builder setKeywordsScore(float keywordsScore) {\n            this.keywordsScore = keywordsScore;\n            return this;\n        }\n\n        public Builder setKeywordsThreshold(float keywordsThreshold) {\n            this.keywordsThreshold = keywordsThreshold;\n            return this;\n        }\n\n        public Builder setNumTrailingBlanks(int numTrailingBlanks) {\n            this.numTrailingBlanks = numTrailingBlanks;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/KeywordSpotterResult.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\nimport java.util.Arrays;\n\npublic class KeywordSpotterResult {\n    private final String keyword;\n    private final String[] tokens;\n    private final float[] timestamps;\n\n    public KeywordSpotterResult(String keyword, String[] tokens, float[] timestamps) {\n        this.keyword = keyword;\n        this.tokens = tokens;\n        this.timestamps = timestamps;\n    }\n\n    public String getKeyword() {\n        return keyword;\n    }\n\n    public String[] getTokens() {\n        return tokens;\n    }\n\n    public float[] getTimestamps() {\n        return timestamps;\n    }\n\n    @Override\n    public String toString() {\n        StringBuilder sb = new StringBuilder();\n        sb.append(\"Keyword: \").append(keyword).append(\"\\n\");\n        sb.append(\"Tokens: \").append(Arrays.toString(tokens)).append(\"\\n\");\n        sb.append(\"Timestamps: \").append(Arrays.toString(timestamps));\n        return sb.toString();\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/LibraryLoader.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class LibraryLoader {\n    private static volatile boolean autoLoadEnabled = true;\n    private static volatile boolean isLoaded = false;\n\n    static synchronized void loadLibrary() {\n        if (!isLoaded) {\n            LibraryUtils.load();\n            isLoaded = true;\n        }\n    }\n\n    public static void setAutoLoadEnabled(boolean enabled) {\n        autoLoadEnabled = enabled;\n    }\n\n    static void maybeLoad() {\n        if (autoLoadEnabled) {\n            loadLibrary();\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/LibraryUtils.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\nimport java.io.File;\nimport java.io.IOException;\nimport java.io.InputStream;\nimport java.nio.file.Files;\nimport java.nio.file.Path;\nimport java.nio.file.StandardCopyOption;\nimport java.util.Locale;\nimport java.util.Objects;\n\n/*\n# We support the following loading methods\n\n## Method 1 Specify the property sherpa_onnx.native.path\n\nWe assume the path contains the libraries sherpa-onnx-jni and onnxruntime.\n\njava \\\n -Dsherpa_onnx.native.path=/Users/fangjun/sherpa-onnx/build/install/lib \\\n -cp /Users/fangjun/sherpa-onnx/sherpa-onnx/java-api/build/sherpa-onnx.jar\n xxx.java\n\n## Method 2 Specify the native jar library\n\njava \\\n -cp /Users/fangjun/sherpa-onnx/sherpa-onnx/java-api/build/sherpa-onnx.jar:/path/to/sherpa-onnx-osx-x64.jar\n xxx.java\n\nNote that you need to replace  : in -cp with ; on windows.\n\n## Method 3 Specify the property java.library.path\n\nWe assume the path contains the libraries sherpa-onnx-jni and onnxruntime.\n\njava \\\n -Djava.library.path=/Users/fangjun/sherpa-onnx/build/install/lib \\\n -cp /Users/fangjun/sherpa-onnx/sherpa-onnx/java-api/build/sherpa-onnx.jar\n xxx.java\n\n */\n\npublic class LibraryUtils {\n    // System property to override native library path\n    private static final String NATIVE_PATH_PROP = \"sherpa_onnx.native.path\";\n    private static final String LIB_NAME = \"sherpa-onnx-jni\";\n\n    private static boolean debug = false;\n\n    private static String detectedOS;\n\n    public static void enableDebug() {\n        debug = true;\n    }\n\n    public static void disableDebug() {\n        debug = false;\n    }\n\n    public static void load() {\n        // 1. Try to load from external directory specified by -Dsherpa_onnx.native.path if provided\n        if (loadFromSherpaOnnxNativePath()) {\n            return;\n        }\n\n        // 2. 
Load from resources contained in a jar file\n        if (!isAndroid()) {\n            try {\n                if (loadFromResourceInJar()) {\n                    return;\n                }\n            } catch (IOException e) {\n                // pass\n            }\n        }\n\n        // 3. fallback to -Djava.library.path\n        // java -Djava.library.path=C:\\mylibs;D:\\otherlibs -cp sherpa-onnx.jar xxx.java\n        //\n        // It throws if it cannot load the lib sherpa-onnx-jni\n        System.loadLibrary(LIB_NAME);\n    }\n\n    // You specify -Dsherpa_onnx.native.path=/path/to/some/dir\n    // where /path/to/some/dir contains the sherpa-onnx-jni and onnxruntime libs\n    private static boolean loadFromSherpaOnnxNativePath() {\n        String libFileName = System.mapLibraryName(LIB_NAME);\n        String nativePath = System.getProperty(NATIVE_PATH_PROP);\n\n        if (nativePath != null) {\n            File nativeDir = new File(nativePath);\n            File libInDir = new File(nativeDir, libFileName);\n            if (nativeDir.isDirectory() && libInDir.exists()) {\n                if (debug) {\n                    System.out.printf(\"Loading from: %s\\n\", libInDir.getAbsolutePath());\n                }\n\n                System.load(libInDir.getAbsolutePath());\n                return true;\n            }\n        }\n\n        if (debug) {\n            // Reached when the property is unset, or when it is set but the library is not in that directory\n            System.out.println(\"Not loading from \" + NATIVE_PATH_PROP + \": property is unset or the library was not found\");\n        }\n\n        return false;\n    }\n\n    private static boolean loadFromResourceInJar() throws IOException {\n        String libFileName = System.mapLibraryName(LIB_NAME);\n        String sherpaOnnxJniPath = \"sherpa-onnx/native/\" + getOsArch() + '/' + libFileName;\n\n        Path tempDirectory = null;\n        try {\n            if (!resourceExists(sherpaOnnxJniPath)) {\n                if (debug) {\n                    System.out.printf(\"%s does not exist\\n\", sherpaOnnxJniPath);\n                }\n\n                return false;\n           
 }\n\n            tempDirectory = Files.createTempDirectory(\"sherpa-onnx-java\");\n\n            if (Objects.equals(detectedOS, \"osx\")) {\n                // for macos, we need to first load libonnxruntime.1.23.2.dylib\n                String onnxruntimePath = \"sherpa-onnx/native/\" + getOsArch() + '/' + \"libonnxruntime.1.23.2.dylib\";\n                if (!resourceExists(onnxruntimePath)) {\n                    if (debug) {\n                        System.out.printf(\"%s does not exist\\n\", onnxruntimePath);\n                    }\n\n                    return false;\n                }\n\n                File tempFile = tempDirectory.resolve(\"libonnxruntime.1.23.2.dylib\").toFile();\n                extractResource(onnxruntimePath, tempFile);\n                System.load(tempFile.getAbsolutePath());\n            } else {\n                String onnxLibFileName = System.mapLibraryName(\"onnxruntime\");\n                String onnxruntimePath = \"sherpa-onnx/native/\" + getOsArch() + '/' + onnxLibFileName;\n                if (!resourceExists(onnxruntimePath)) {\n                    if (debug) {\n                        System.out.printf(\"%s does not exist\\n\", onnxruntimePath);\n                    }\n\n                    return false;\n                }\n\n                File tempFile = tempDirectory.resolve(onnxLibFileName).toFile();\n                extractResource(onnxruntimePath, tempFile);\n                System.load(tempFile.getAbsolutePath());\n            }\n\n            File tempFile = tempDirectory.resolve(libFileName).toFile();\n            extractResource(sherpaOnnxJniPath, tempFile);\n            System.load(tempFile.getAbsolutePath());\n        } finally {\n            if (tempDirectory != null) {\n                cleanUpTempDir(tempDirectory.toFile());\n            }\n        }\n\n        return true;\n    }\n\n    // this method is copied and modified from\n    // 
https://github.com/microsoft/onnxruntime/blob/main/java/src/main/java/ai/onnxruntime/OnnxRuntime.java#L118\n    private static String getOsArch() {\n        String os = System.getProperty(\"os.name\", \"generic\").toLowerCase(Locale.ENGLISH);\n        if (os.contains(\"mac\") || os.contains(\"darwin\")) {\n            detectedOS = \"osx\";\n        } else if (os.contains(\"win\")) {\n            detectedOS = \"win\";\n        } else if (os.contains(\"nux\")) {\n            detectedOS = \"linux\";\n        } else {\n            throw new IllegalStateException(\"Unsupported os:\" + os);\n        }\n\n        String detectedArch;\n        String arch = System.getProperty(\"os.arch\", \"generic\")\n                .toLowerCase(Locale.ENGLISH);\n        if (arch.startsWith(\"amd64\") || arch.startsWith(\"x86_64\")) {\n            detectedArch = \"x64\";\n        } else if (arch.startsWith(\"x86\")) {\n            // 32-bit x86 is not supported by the Java API\n            detectedArch = \"x86\";\n        } else if (arch.startsWith(\"aarch64\") || arch.startsWith(\"arm64\")) {\n            detectedArch = \"aarch64\";\n        } else if (arch.startsWith(\"arm\")) {\n            detectedArch = \"arm\"; // armv8l architecture\n        } else {\n            throw new IllegalStateException(\"Unsupported arch:\" + arch);\n        }\n\n        return detectedOS + '-' + detectedArch;\n    }\n\n    private static void extractResource(String resourcePath, File destination) {\n        if (debug) {\n            System.out.printf(\"Copying from resource path %s to %s\\n\", resourcePath, destination.toPath());\n        }\n\n        try (InputStream in = LibraryUtils.class.getClassLoader().getResourceAsStream(resourcePath)) {\n            if (in == null) {\n                throw new RuntimeException(\"Resource not found: \" + resourcePath);\n            }\n            Files.copy(in, destination.toPath(), StandardCopyOption.REPLACE_EXISTING);\n        } catch (IOException e) {\n            throw 
new RuntimeException(\"Failed to extract resource \" + resourcePath + \" to \" + destination.getAbsolutePath(), e);\n        }\n    }\n\n    // From ChatGPT:\n    // Class.getResourceAsStream(String path) behaves differently than ClassLoader\n    //  - No leading slash → relative to the package of LibraryUtils\n    //  - Leading slash → absolute path relative to classpath root\n    //\n    // ClassLoader.getResourceAsStream always uses absolute paths relative to classpath root,\n    // no leading slash needed\n\n    private static boolean resourceExists(String path) {\n        return LibraryUtils.class.getClassLoader().getResource(path) != null;\n    }\n\n    private static void cleanUpTempDir(File dir) {\n        if (!dir.exists()) return;\n\n        File[] files = dir.listFiles();\n        if (files != null) {\n            for (File f : files) {\n                f.deleteOnExit(); // schedule each .so for deletion\n            }\n        }\n        dir.deleteOnExit(); // schedule the directory itself\n    }\n\n    static boolean isAndroid() {\n        String vmName = System.getProperty(\"java.vm.name\", \"\").toLowerCase(Locale.ROOT);\n        String specVendor = System.getProperty(\"java.specification.vendor\", \"\");\n        return vmName.contains(\"dalvik\") || vmName.contains(\"art\") ||\n               specVendor.equals(\"The Android Project\");\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineCanaryModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineCanaryModelConfig {\n    private final String encoder;\n    private final String decoder;\n    private final String srcLang;\n    private final String tgtLang;\n    private final boolean usePnc;\n\n    private OfflineCanaryModelConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.srcLang = builder.srcLang;\n        this.tgtLang = builder.tgtLang;\n        this.usePnc = builder.usePnc;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getSrcLang() {\n        return srcLang;\n    }\n\n    public String getTgtLang() {\n        return tgtLang;\n    }\n\n    public boolean isUsePnc() {\n        return usePnc;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String srcLang = \"en\";\n        private String tgtLang = \"en\";\n        private boolean usePnc = true;\n\n        public OfflineCanaryModelConfig build() {\n            return new OfflineCanaryModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setSrcLang(String srcLang) {\n            this.srcLang = srcLang;\n            return this;\n        }\n\n        public Builder setTgtLang(String tgtLang) {\n            this.tgtLang = tgtLang;\n            return this;\n        }\n\n        public Builder setUsePnc(boolean usePnc) {\n            this.usePnc = usePnc;\n            return this;\n        }\n    
}\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineDolphinModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineDolphinModelConfig {\n    private final String model;\n\n    private OfflineDolphinModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineDolphinModelConfig build() {\n            return new OfflineDolphinModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineFireRedAsrCtcModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineFireRedAsrCtcModelConfig {\n    private final String model;\n\n    private OfflineFireRedAsrCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineFireRedAsrCtcModelConfig build() {\n            return new OfflineFireRedAsrCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineFireRedAsrModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineFireRedAsrModelConfig {\n    private final String encoder;\n    private final String decoder;\n\n    private OfflineFireRedAsrModelConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n\n        public OfflineFireRedAsrModelConfig build() {\n            return new OfflineFireRedAsrModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineFunAsrNanoModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineFunAsrNanoModelConfig {\n    private final String encoderAdaptor;\n    private final String llm;\n    private final String embedding;\n    private final String tokenizer;\n    private final String systemPrompt;\n    private final String userPrompt;\n    private final int maxNewTokens;\n    private final float temperature;\n    private final float topP;\n    private final int seed;\n    private final String language;\n    private final boolean itn;\n    private final String hotwords;\n\n    private OfflineFunAsrNanoModelConfig(Builder builder) {\n        this.encoderAdaptor = builder.encoderAdaptor;\n        this.llm = builder.llm;\n        this.embedding = builder.embedding;\n        this.tokenizer = builder.tokenizer;\n        this.systemPrompt = builder.systemPrompt;\n        this.userPrompt = builder.userPrompt;\n        this.maxNewTokens = builder.maxNewTokens;\n        this.temperature = builder.temperature;\n        this.topP = builder.topP;\n        this.seed = builder.seed;\n        this.language = builder.language;\n        this.itn = builder.itn;\n        this.hotwords = builder.hotwords;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoderAdaptor() {\n        return encoderAdaptor;\n    }\n\n    public String getLLM() {\n        return llm;\n    }\n\n    public String getEmbedding() {\n        return embedding;\n    }\n\n    public String getTokenizer() {\n        return tokenizer;\n    }\n\n    public String getSystemPrompt() {\n        return systemPrompt;\n    }\n\n    public String getUserPrompt() {\n        return userPrompt;\n    }\n\n    public String getLanguage() {\n        return language;\n    }\n\n    public boolean getItn() {\n        return itn;\n    }\n\n    public String getHotwords() {\n        return hotwords;\n    }\n\n    public int getMaxNewTokens() {\n        return maxNewTokens;\n    }\n\n    public 
float getTemperature() {\n        return temperature;\n    }\n\n    public float getTopP() {\n        return topP;\n    }\n\n    public int getSeed() {\n        return seed;\n    }\n\n    public static class Builder {\n        private String encoderAdaptor = \"\";\n        private String llm = \"\";\n        private String embedding = \"\";\n        private String tokenizer = \"\";\n        private String systemPrompt = \"You are a helpful assistant.\";\n        private String userPrompt = \"语音转写：\";\n        private int maxNewTokens = 512;\n        private float temperature = 1e-6f;\n        private float topP = 0.8f;\n        private int seed = 42;\n        private String language = \"\";\n        private boolean itn = true;\n        private String hotwords = \"\";\n\n        public OfflineFunAsrNanoModelConfig build() {\n            return new OfflineFunAsrNanoModelConfig(this);\n        }\n\n        public Builder setEncoderAdaptor(String encoderAdaptor) {\n            this.encoderAdaptor = encoderAdaptor;\n            return this;\n        }\n\n        public Builder setLLM(String llm) {\n            this.llm = llm;\n            return this;\n        }\n\n        public Builder setEmbedding(String embedding) {\n            this.embedding = embedding;\n            return this;\n        }\n\n        public Builder setTokenizer(String tokenizer) {\n            this.tokenizer = tokenizer;\n            return this;\n        }\n\n        public Builder setSystemPrompt(String systemPrompt) {\n            this.systemPrompt = systemPrompt;\n            return this;\n        }\n\n        public Builder setUserPrompt(String userPrompt) {\n            this.userPrompt = userPrompt;\n            return this;\n        }\n\n        public Builder setLanguage(String language) {\n            this.language = language;\n            return this;\n        }\n\n        public Builder setItn(boolean itn) {\n            this.itn = itn;\n            return this;\n        }\n\n        
public Builder setHotwords(String hotwords) {\n            this.hotwords = hotwords;\n            return this;\n        }\n\n        public Builder setMaxNewTokens(int maxNewTokens) {\n            this.maxNewTokens = maxNewTokens;\n            return this;\n        }\n\n        public Builder setTemperature(float temperature) {\n            this.temperature = temperature;\n            return this;\n        }\n\n        public Builder setTopP(float topP) {\n            this.topP = topP;\n            return this;\n        }\n\n        public Builder setSeed(int seed) {\n            this.seed = seed;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineMedAsrCtcModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineMedAsrCtcModelConfig {\n    private final String model;\n\n    private OfflineMedAsrCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineMedAsrCtcModelConfig build() {\n            return new OfflineMedAsrCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineModelConfig {\n    private final OfflineTransducerModelConfig transducer;\n    private final OfflineParaformerModelConfig paraformer;\n    private final OfflineWhisperModelConfig whisper;\n    private final OfflineFireRedAsrModelConfig fireRedAsr;\n    private final OfflineMoonshineModelConfig moonshine;\n    private final OfflineNemoEncDecCtcModelConfig nemo;\n    private final OfflineSenseVoiceModelConfig senseVoice;\n    private final OfflineDolphinModelConfig dolphin;\n    private final OfflineZipformerCtcModelConfig zipformerCtc;\n    private final OfflineWenetCtcModelConfig wenetCtc;\n    private final OfflineOmnilingualAsrCtcModelConfig omnilingual;\n    private final OfflineMedAsrCtcModelConfig medasr;\n    private final OfflineFireRedAsrCtcModelConfig fireRedAsrCtc;\n    private final OfflineFunAsrNanoModelConfig funasrNano;\n    private final OfflineCanaryModelConfig canary;\n    private final String teleSpeech;\n    private final String tokens;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private final String modelType;\n    private final String modelingUnit;\n    private final String bpeVocab;\n\n    private OfflineModelConfig(Builder builder) {\n        this.transducer = builder.transducer;\n        this.paraformer = builder.paraformer;\n        this.whisper = builder.whisper;\n        this.fireRedAsr = builder.fireRedAsr;\n        this.moonshine = builder.moonshine;\n        this.nemo = builder.nemo;\n        this.zipformerCtc = builder.zipformerCtc;\n        this.canary = builder.canary;\n        this.wenetCtc = builder.wenetCtc;\n        this.omnilingual = builder.omnilingual;\n        this.medasr = builder.medasr;\n        this.fireRedAsrCtc = builder.fireRedAsrCtc;\n        this.funasrNano = builder.funasrNano;\n        this.senseVoice = builder.senseVoice;\n        
this.dolphin = builder.dolphin;\n        this.teleSpeech = builder.teleSpeech;\n        this.tokens = builder.tokens;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n        this.modelType = builder.modelType;\n        this.modelingUnit = builder.modelingUnit;\n        this.bpeVocab = builder.bpeVocab;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflineParaformerModelConfig getParaformer() {\n        return paraformer;\n    }\n\n    public OfflineTransducerModelConfig getTransducer() {\n        return transducer;\n    }\n\n    public OfflineWhisperModelConfig getWhisper() {\n        return whisper;\n    }\n\n    public OfflineMoonshineModelConfig getMoonshine() {\n        return moonshine;\n    }\n\n    public OfflineSenseVoiceModelConfig getSenseVoice() {\n        return senseVoice;\n    }\n\n    public OfflineDolphinModelConfig getDolphin() {\n        return dolphin;\n    }\n\n    public OfflineNemoEncDecCtcModelConfig getNemo() {\n        return nemo;\n    }\n\n    public OfflineZipformerCtcModelConfig getZipformerCtc() {\n        return zipformerCtc;\n    }\n\n    public OfflineWenetCtcModelConfig getWenetCtc() {\n        return wenetCtc;\n    }\n\n    public OfflineOmnilingualAsrCtcModelConfig getOmnilingual() {\n        return omnilingual;\n    }\n\n    public OfflineMedAsrCtcModelConfig getMedAsr() {\n        return medasr;\n    }\n\n    public OfflineFireRedAsrCtcModelConfig getFireRedAsrCtc() {\n        return fireRedAsrCtc;\n    }\n\n    public OfflineFireRedAsrModelConfig getFireRedAsr() {\n        return fireRedAsr;\n    }\n\n    public OfflineFunAsrNanoModelConfig getFunAsrNano() {\n        return funasrNano;\n    }\n\n    public OfflineCanaryModelConfig getCanary() {\n        return canary;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public int getNumThreads() {\n        return 
numThreads;\n    }\n\n    public boolean getDebug() {\n        return debug;\n    }\n\n    public String getProvider() {\n        return provider;\n    }\n\n    public String getModelType() {\n        return modelType;\n    }\n\n    public String getModelingUnit() {\n        return modelingUnit;\n    }\n\n    public String getBpeVocab() {\n        return bpeVocab;\n    }\n\n    public String getTeleSpeech() {\n        return teleSpeech;\n    }\n\n    public static class Builder {\n        private OfflineParaformerModelConfig paraformer = OfflineParaformerModelConfig.builder().build();\n        private OfflineTransducerModelConfig transducer = OfflineTransducerModelConfig.builder().build();\n        private OfflineWhisperModelConfig whisper = OfflineWhisperModelConfig.builder().build();\n        private OfflineFireRedAsrModelConfig fireRedAsr = OfflineFireRedAsrModelConfig.builder().build();\n        private OfflineMoonshineModelConfig moonshine = OfflineMoonshineModelConfig.builder().build();\n        private OfflineNemoEncDecCtcModelConfig nemo = OfflineNemoEncDecCtcModelConfig.builder().build();\n        private OfflineSenseVoiceModelConfig senseVoice = OfflineSenseVoiceModelConfig.builder().build();\n        private OfflineDolphinModelConfig dolphin = OfflineDolphinModelConfig.builder().build();\n        private OfflineZipformerCtcModelConfig zipformerCtc = OfflineZipformerCtcModelConfig.builder().build();\n        private OfflineWenetCtcModelConfig wenetCtc = OfflineWenetCtcModelConfig.builder().build();\n        private OfflineOmnilingualAsrCtcModelConfig omnilingual = OfflineOmnilingualAsrCtcModelConfig.builder().build();\n        private OfflineMedAsrCtcModelConfig medasr = OfflineMedAsrCtcModelConfig.builder().build();\n        private OfflineFireRedAsrCtcModelConfig fireRedAsrCtc = OfflineFireRedAsrCtcModelConfig.builder().build();\n        private OfflineFunAsrNanoModelConfig funasrNano = OfflineFunAsrNanoModelConfig.builder().build();\n        private 
OfflineCanaryModelConfig canary = OfflineCanaryModelConfig.builder().build();\n        private String teleSpeech = \"\";\n        private String tokens = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n        private String modelType = \"\";\n        private String modelingUnit = \"cjkchar\";\n        private String bpeVocab = \"\";\n\n        public OfflineModelConfig build() {\n            return new OfflineModelConfig(this);\n        }\n\n        public Builder setTransducer(OfflineTransducerModelConfig transducer) {\n            this.transducer = transducer;\n            return this;\n        }\n\n        public Builder setDolphin(OfflineDolphinModelConfig dolphin) {\n            this.dolphin = dolphin;\n            return this;\n        }\n\n        public Builder setParaformer(OfflineParaformerModelConfig paraformer) {\n            this.paraformer = paraformer;\n            return this;\n        }\n\n        public Builder setNemo(OfflineNemoEncDecCtcModelConfig nemo) {\n            this.nemo = nemo;\n            return this;\n        }\n\n        public Builder setZipformerCtc(OfflineZipformerCtcModelConfig zipformerCtc) {\n            this.zipformerCtc = zipformerCtc;\n            return this;\n        }\n\n        public Builder setWenetCtc(OfflineWenetCtcModelConfig wenetCtc) {\n            this.wenetCtc = wenetCtc;\n            return this;\n        }\n\n        public Builder setOmnilingual(OfflineOmnilingualAsrCtcModelConfig omnilingual) {\n            this.omnilingual = omnilingual;\n            return this;\n        }\n\n        public Builder setMedAsr(OfflineMedAsrCtcModelConfig medasr) {\n            this.medasr = medasr;\n            return this;\n        }\n\n        public Builder setFireRedAsrCtc(OfflineFireRedAsrCtcModelConfig fireRedAsrCtc) {\n            this.fireRedAsrCtc = fireRedAsrCtc;\n            return this;\n        }\n\n        public Builder 
setFunAsrNano(OfflineFunAsrNanoModelConfig funasrNano) {\n            this.funasrNano = funasrNano;\n            return this;\n        }\n\n        public Builder setCanary(OfflineCanaryModelConfig canary) {\n            this.canary = canary;\n            return this;\n        }\n\n        public Builder setTeleSpeech(String teleSpeech) {\n            this.teleSpeech = teleSpeech;\n            return this;\n        }\n\n        public Builder setWhisper(OfflineWhisperModelConfig whisper) {\n            this.whisper = whisper;\n            return this;\n        }\n\n        public Builder setFireRedAsr(OfflineFireRedAsrModelConfig fireRedAsr) {\n            this.fireRedAsr = fireRedAsr;\n            return this;\n        }\n\n        public Builder setSenseVoice(OfflineSenseVoiceModelConfig senseVoice) {\n            this.senseVoice = senseVoice;\n            return this;\n        }\n\n        public Builder setMoonshine(OfflineMoonshineModelConfig moonshine) {\n            this.moonshine = moonshine;\n            return this;\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n\n        public Builder setModelType(String modelType) {\n            this.modelType = modelType;\n            return this;\n        }\n\n        public Builder setModelingUnit(String modelingUnit) {\n            this.modelingUnit = modelingUnit;\n            return this;\n        }\n\n        public Builder setBpeVocab(String bpeVocab) {\n            this.bpeVocab = bpeVocab;\n            return this;\n        }\n    
}\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineMoonshineModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineMoonshineModelConfig {\n    private final String preprocessor;\n    private final String encoder;\n    private final String uncachedDecoder;\n    private final String cachedDecoder;\n    private final String mergedDecoder;\n\n    private OfflineMoonshineModelConfig(Builder builder) {\n        this.preprocessor = builder.preprocessor;\n        this.encoder = builder.encoder;\n        this.uncachedDecoder = builder.uncachedDecoder;\n        this.cachedDecoder = builder.cachedDecoder;\n        this.mergedDecoder = builder.mergedDecoder;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getPreprocessor() {\n        return preprocessor;\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getUncachedDecoder() {\n        return uncachedDecoder;\n    }\n\n    public String getCachedDecoder() {\n        return cachedDecoder;\n    }\n\n    public String getMergedDecoder() {\n        return mergedDecoder;\n    }\n\n    public static class Builder {\n        private String preprocessor = \"\";\n        private String encoder = \"\";\n        private String uncachedDecoder = \"\";\n        private String cachedDecoder = \"\";\n        private String mergedDecoder = \"\";\n\n        public OfflineMoonshineModelConfig build() {\n            return new OfflineMoonshineModelConfig(this);\n        }\n\n        public Builder setPreprocessor(String preprocessor) {\n            this.preprocessor = preprocessor;\n            return this;\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setUncachedDecoder(String uncachedDecoder) {\n            this.uncachedDecoder = uncachedDecoder;\n            return this;\n        }\n\n        public Builder 
setCachedDecoder(String cachedDecoder) {\n            this.cachedDecoder = cachedDecoder;\n            return this;\n        }\n\n        public Builder setMergedDecoder(String mergedDecoder) {\n            this.mergedDecoder = mergedDecoder;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineNemoEncDecCtcModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineNemoEncDecCtcModelConfig {\n    private final String model;\n\n    private OfflineNemoEncDecCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineNemoEncDecCtcModelConfig build() {\n            return new OfflineNemoEncDecCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineOmnilingualAsrCtcModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineOmnilingualAsrCtcModelConfig {\n    private final String model;\n\n    private OfflineOmnilingualAsrCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineOmnilingualAsrCtcModelConfig build() {\n            return new OfflineOmnilingualAsrCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineParaformerModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineParaformerModelConfig {\n    private final String model;\n    private final QnnConfig qnnConfig;\n\n    private OfflineParaformerModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.qnnConfig = builder.qnnConfig;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public QnnConfig getQnnConfig() {\n        return qnnConfig;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private QnnConfig qnnConfig = QnnConfig.builder().build();\n\n        public OfflineParaformerModelConfig build() {\n            return new OfflineParaformerModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setQnnConfig(QnnConfig qnnConfig) {\n            this.qnnConfig = qnnConfig;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflinePunctuation.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflinePunctuation {\n    private long ptr = 0;\n\n    public OfflinePunctuation(OfflinePunctuationConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OfflinePunctuationConfig: failed to create native OfflinePunctuation\");\n        }\n    }\n\n    public String addPunctuation(String text) {\n        return addPunctuation(ptr, text);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // Call release() explicitly once this object is no longer needed; finalize() is only a fallback.\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(OfflinePunctuationConfig config);\n\n    private native String addPunctuation(long ptr, String text);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflinePunctuationConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflinePunctuationConfig {\n    private final OfflinePunctuationModelConfig model;\n\n    private OfflinePunctuationConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflinePunctuationModelConfig getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private OfflinePunctuationModelConfig model = OfflinePunctuationModelConfig.builder().build();\n\n        public OfflinePunctuationConfig build() {\n            return new OfflinePunctuationConfig(this);\n        }\n\n        public Builder setModel(OfflinePunctuationModelConfig model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflinePunctuationModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflinePunctuationModelConfig {\n    private final String ctTransformer;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private OfflinePunctuationModelConfig(Builder builder) {\n        this.ctTransformer = builder.ctTransformer;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getCtTransformer() {\n        return ctTransformer;\n    }\n\n    public static class Builder {\n        private String ctTransformer = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public OfflinePunctuationModelConfig build() {\n            return new OfflinePunctuationModelConfig(this);\n        }\n\n        public Builder setCtTransformer(String ctTransformer) {\n            this.ctTransformer = ctTransformer;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineRecognizer.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineRecognizer {\n    private long ptr = 0;\n    private final OfflineRecognizerConfig config;\n\n    public OfflineRecognizer(OfflineRecognizerConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OfflineRecognizerConfig: failed to create native OfflineRecognizer\");\n        }\n\n        this.config = config;\n    }\n\n    public void setConfig(OfflineRecognizerConfig config) {\n        setConfig(ptr, config);\n        // Note: this.config is not updated, so getConfig() still returns the config passed to the constructor.\n    }\n\n    public OfflineRecognizerConfig getConfig() {\n        return config;\n    }\n\n    public void decode(OfflineStream s) {\n        decode(ptr, s.getPtr());\n    }\n\n    public void decode(OfflineStream[] ss) {\n        long[] streamPtrs = new long[ss.length];\n        for (int i = 0; i < ss.length; ++i) {\n            streamPtrs[i] = ss[i].getPtr();\n        }\n        decodeStreams(ptr, streamPtrs);\n    }\n\n    public OfflineStream createStream() {\n        long p = createStream(ptr);\n        return new OfflineStream(p);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // Call release() explicitly once this object is no longer needed; finalize() is only a fallback.\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    public OfflineRecognizerResult getResult(OfflineStream s) {\n        return getResult(s.getPtr());\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(OfflineRecognizerConfig config);\n\n    private native long createStream(long ptr);\n\n    private native void decode(long ptr, long streamPtr);\n\n    private native void setConfig(long ptr, OfflineRecognizerConfig config);\n\n    private native void decodeStreams(long ptr, long[] streamPtrs);\n\n    private native OfflineRecognizerResult getResult(long streamPtr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineRecognizerConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineRecognizerConfig {\n    private final FeatureConfig featConfig;\n    private final OfflineModelConfig modelConfig;\n    private final HomophoneReplacerConfig hr;\n    private final String decodingMethod;\n    private final int maxActivePaths;\n    private final String hotwordsFile;\n    private final float hotwordsScore;\n    private final String ruleFsts;\n    private final String ruleFars;\n    private final float blankPenalty;\n\n    private OfflineRecognizerConfig(Builder builder) {\n        this.featConfig = builder.featConfig;\n        this.modelConfig = builder.modelConfig;\n        this.hr = builder.hr;\n        this.decodingMethod = builder.decodingMethod;\n        this.maxActivePaths = builder.maxActivePaths;\n        this.hotwordsFile = builder.hotwordsFile;\n        this.hotwordsScore = builder.hotwordsScore;\n        this.ruleFsts = builder.ruleFsts;\n        this.ruleFars = builder.ruleFars;\n        this.blankPenalty = builder.blankPenalty;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflineModelConfig getModelConfig() {\n        return modelConfig;\n    }\n\n    public static class Builder {\n        private FeatureConfig featConfig = FeatureConfig.builder().build();\n        private OfflineModelConfig modelConfig = OfflineModelConfig.builder().build();\n        private HomophoneReplacerConfig hr = HomophoneReplacerConfig.builder().build();\n        private String decodingMethod = \"greedy_search\";\n        private int maxActivePaths = 4;\n        private String hotwordsFile = \"\";\n        private float hotwordsScore = 1.5f;\n        private String ruleFsts = \"\";\n        private String ruleFars = \"\";\n        private float blankPenalty = 0.0f;\n\n        public OfflineRecognizerConfig build() {\n            return new OfflineRecognizerConfig(this);\n        }\n\n        public 
Builder setFeatureConfig(FeatureConfig featConfig) {\n            this.featConfig = featConfig;\n            return this;\n        }\n\n        public Builder setOfflineModelConfig(OfflineModelConfig modelConfig) {\n            this.modelConfig = modelConfig;\n            return this;\n        }\n\n        public Builder setHr(HomophoneReplacerConfig hr) {\n            this.hr = hr;\n            return this;\n        }\n\n        public Builder setDecodingMethod(String decodingMethod) {\n            this.decodingMethod = decodingMethod;\n            return this;\n        }\n\n        public Builder setMaxActivePaths(int maxActivePaths) {\n            this.maxActivePaths = maxActivePaths;\n            return this;\n        }\n\n        public Builder setHotwordsFile(String hotwordsFile) {\n            this.hotwordsFile = hotwordsFile;\n            return this;\n        }\n\n        public Builder setHotwordsScore(float hotwordsScore) {\n            this.hotwordsScore = hotwordsScore;\n            return this;\n        }\n\n        public Builder setRuleFsts(String ruleFsts) {\n            this.ruleFsts = ruleFsts;\n            return this;\n        }\n\n        public Builder setRuleFars(String ruleFars) {\n            this.ruleFars = ruleFars;\n            return this;\n        }\n\n        public Builder setBlankPenalty(float blankPenalty) {\n            this.blankPenalty = blankPenalty;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineRecognizerResult.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineRecognizerResult {\n    private final String text;\n    private final String[] tokens;\n    private final float[] timestamps;\n    private final String lang;\n    private final String emotion;\n    private final String event;\n    private final float[] durations;\n\n    public OfflineRecognizerResult(String text, String[] tokens, float[] timestamps, String lang, String emotion, String event, float[] durations) {\n        this.text = text;\n        this.tokens = tokens;\n        this.timestamps = timestamps;\n        this.lang = lang;\n        this.emotion = emotion;\n        this.event = event;\n        this.durations = durations;\n    }\n\n    public String getText() {\n        return text;\n    }\n\n    public String[] getTokens() {\n        return tokens;\n    }\n\n    public float[] getTimestamps() {\n        return timestamps;\n    }\n\n    public String getLang() {\n        return lang;\n    }\n\n    public String getEmotion() {\n        return emotion;\n    }\n\n    public String getEvent() {\n        return event;\n    }\n\n    public float[] getDurations() {\n        return durations;\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSenseVoiceModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSenseVoiceModelConfig {\n    private final String model;\n    private final String language;\n    private final boolean useInverseTextNormalization;\n    private final QnnConfig qnnConfig;\n\n    private OfflineSenseVoiceModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.language = builder.language;\n        this.useInverseTextNormalization = builder.useInverseTextNormalization;\n        this.qnnConfig = builder.qnnConfig;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public String getLanguage() {\n        return language;\n    }\n\n    public boolean getUseInverseTextNormalization() {\n        return useInverseTextNormalization;\n    }\n\n    public QnnConfig getQnnConfig() {\n        return qnnConfig;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private String language = \"\";\n        private boolean useInverseTextNormalization = true;\n        private QnnConfig qnnConfig = QnnConfig.builder().build();\n\n        public OfflineSenseVoiceModelConfig build() {\n            return new OfflineSenseVoiceModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setLanguage(String language) {\n            this.language = language;\n            return this;\n        }\n\n        public Builder setInverseTextNormalization(boolean useInverseTextNormalization) {\n            this.useInverseTextNormalization = useInverseTextNormalization;\n            return this;\n        }\n\n        public Builder setQnnConfig(QnnConfig qnnConfig) {\n            this.qnnConfig = qnnConfig;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerDiarization.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeakerDiarization {\n    private long ptr = 0;\n\n    public OfflineSpeakerDiarization(OfflineSpeakerDiarizationConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OfflineSpeakerDiarizationConfig: failed to create native OfflineSpeakerDiarization\");\n        }\n    }\n\n    public int getSampleRate() {\n        return getSampleRate(ptr);\n    }\n\n    // Only config.clustering is used. All other fields are ignored\n    public void setConfig(OfflineSpeakerDiarizationConfig config) {\n        setConfig(ptr, config);\n    }\n\n    public OfflineSpeakerDiarizationSegment[] process(float[] samples) {\n        return process(ptr, samples);\n    }\n\n    public OfflineSpeakerDiarizationSegment[] processWithCallback(float[] samples, OfflineSpeakerDiarizationCallback callback) {\n        return processWithCallback(ptr, samples, callback, 0);\n    }\n\n    public OfflineSpeakerDiarizationSegment[] processWithCallback(float[] samples, OfflineSpeakerDiarizationCallback callback, long arg) {\n        return processWithCallback(ptr, samples, callback, arg);\n    }\n\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // You'd better call it manually if it is not used anymore\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native int getSampleRate(long ptr);\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(OfflineSpeakerDiarizationConfig config);\n\n    private native void setConfig(long ptr, OfflineSpeakerDiarizationConfig config);\n\n    private native OfflineSpeakerDiarizationSegment[] process(long ptr, float[] samples);\n\n    private native 
OfflineSpeakerDiarizationSegment[] processWithCallback(long ptr, float[] samples, OfflineSpeakerDiarizationCallback callback, long arg);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerDiarizationCallback.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\n@FunctionalInterface\npublic interface OfflineSpeakerDiarizationCallback {\n    Integer invoke(int numProcessedChunks, int numTotalCunks, long arg);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerDiarizationConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeakerDiarizationConfig {\n    private final OfflineSpeakerSegmentationModelConfig segmentation;\n    private final SpeakerEmbeddingExtractorConfig embedding;\n    private final FastClusteringConfig clustering;\n    private final float minDurationOn;\n    private final float minDurationOff;\n\n    private OfflineSpeakerDiarizationConfig(Builder builder) {\n        this.segmentation = builder.segmentation;\n        this.embedding = builder.embedding;\n        this.clustering = builder.clustering;\n        this.minDurationOff = builder.minDurationOff;\n        this.minDurationOn = builder.minDurationOn;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflineSpeakerSegmentationModelConfig getSegmentation() {\n        return segmentation;\n    }\n\n    public SpeakerEmbeddingExtractorConfig getEmbedding() {\n        return embedding;\n    }\n\n    public FastClusteringConfig getClustering() {\n        return clustering;\n    }\n\n    public float getMinDurationOff() {\n        return minDurationOff;\n    }\n\n    public float getMinDurationOn() {\n        return minDurationOn;\n    }\n\n    public static class Builder {\n        private OfflineSpeakerSegmentationModelConfig segmentation = OfflineSpeakerSegmentationModelConfig.builder().build();\n        private SpeakerEmbeddingExtractorConfig embedding = SpeakerEmbeddingExtractorConfig.builder().build();\n        private FastClusteringConfig clustering = FastClusteringConfig.builder().build();\n        private float minDurationOn = 0.2f;\n        private float minDurationOff = 0.5f;\n\n        public OfflineSpeakerDiarizationConfig build() {\n            return new OfflineSpeakerDiarizationConfig(this);\n        }\n\n        public Builder setSegmentation(OfflineSpeakerSegmentationModelConfig segmentation) {\n            this.segmentation = segmentation;\n            return this;\n        }\n\n     
   public Builder setEmbedding(SpeakerEmbeddingExtractorConfig embedding) {\n            this.embedding = embedding;\n            return this;\n        }\n\n        public Builder setClustering(FastClusteringConfig clustering) {\n            this.clustering = clustering;\n            return this;\n        }\n\n        public Builder setMinDurationOff(float minDurationOff) {\n            this.minDurationOff = minDurationOff;\n            return this;\n        }\n\n        public Builder setMinDurationOn(float minDurationOn) {\n            this.minDurationOn = minDurationOn;\n            return this;\n        }\n    }\n\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerDiarizationSegment.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeakerDiarizationSegment {\n    private final float start;\n    private final float end;\n    private final int speaker;\n\n    public OfflineSpeakerDiarizationSegment(float start, float end, int speaker) {\n        this.start = start;\n        this.end = end;\n        this.speaker = speaker;\n    }\n\n    public float getStart() {\n        return start;\n    }\n\n    public float getEnd() {\n        return end;\n    }\n\n    public int getSpeaker() {\n        return speaker;\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerSegmentationModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeakerSegmentationModelConfig {\n    private final OfflineSpeakerSegmentationPyannoteModelConfig pyannote;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private OfflineSpeakerSegmentationModelConfig(Builder builder) {\n        this.pyannote = builder.pyannote;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private OfflineSpeakerSegmentationPyannoteModelConfig pyannote = OfflineSpeakerSegmentationPyannoteModelConfig.builder().build();\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public OfflineSpeakerSegmentationModelConfig build() {\n            return new OfflineSpeakerSegmentationModelConfig(this);\n        }\n\n        public Builder setPyannote(OfflineSpeakerSegmentationPyannoteModelConfig pyannote) {\n            this.pyannote = pyannote;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeakerSegmentationPyannoteModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeakerSegmentationPyannoteModelConfig {\n    private final String model;\n\n    private OfflineSpeakerSegmentationPyannoteModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineSpeakerSegmentationPyannoteModelConfig build() {\n            return new OfflineSpeakerSegmentationPyannoteModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeechDenoiser.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeechDenoiser {\n    private long ptr = 0;\n\n    public OfflineSpeechDenoiser(OfflineSpeechDenoiserConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OfflineSpeechDenoiserConfig: failed to create native OfflineSpeechDenoiser\");\n        }\n    }\n\n    public int getSampleRate() {\n        return getSampleRate(ptr);\n    }\n\n    public DenoisedAudio run(float[] samples, int sampleRate) {\n        return run(ptr, samples, sampleRate);\n    }\n\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native int getSampleRate(long ptr);\n\n    private native DenoisedAudio run(long ptr, float[] samples, int sampleRate);\n\n    private native long newFromFile(OfflineSpeechDenoiserConfig config);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeechDenoiserConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeechDenoiserConfig {\n    private final OfflineSpeechDenoiserModelConfig model;\n\n    private OfflineSpeechDenoiserConfig(OfflineSpeechDenoiserConfig.Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private OfflineSpeechDenoiserModelConfig model = OfflineSpeechDenoiserModelConfig.builder().build();\n\n        public OfflineSpeechDenoiserConfig build() {\n            return new OfflineSpeechDenoiserConfig(this);\n        }\n\n        public Builder setModel(OfflineSpeechDenoiserModelConfig model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeechDenoiserDpdfNetModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeechDenoiserDpdfNetModelConfig {\n    private final String model;\n\n    private OfflineSpeechDenoiserDpdfNetModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineSpeechDenoiserDpdfNetModelConfig build() {\n            return new OfflineSpeechDenoiserDpdfNetModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeechDenoiserGtcrnModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeechDenoiserGtcrnModelConfig {\n    private final String model;\n\n    private OfflineSpeechDenoiserGtcrnModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineSpeechDenoiserGtcrnModelConfig build() {\n            return new OfflineSpeechDenoiserGtcrnModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineSpeechDenoiserModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineSpeechDenoiserModelConfig {\n    private final OfflineSpeechDenoiserGtcrnModelConfig gtcrn;\n    private final OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private OfflineSpeechDenoiserModelConfig(Builder builder) {\n        this.gtcrn = builder.gtcrn;\n        this.dpdfnet = builder.dpdfnet;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private OfflineSpeechDenoiserGtcrnModelConfig gtcrn = OfflineSpeechDenoiserGtcrnModelConfig.builder().build();\n        private OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet = OfflineSpeechDenoiserDpdfNetModelConfig.builder().build();\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public OfflineSpeechDenoiserModelConfig build() {\n            return new OfflineSpeechDenoiserModelConfig(this);\n        }\n\n        public Builder setGtcrn(OfflineSpeechDenoiserGtcrnModelConfig gtcrn) {\n            this.gtcrn = gtcrn;\n            return this;\n        }\n\n        public Builder setDpdfnet(OfflineSpeechDenoiserDpdfNetModelConfig dpdfnet) {\n            this.dpdfnet = dpdfnet;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineStream.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineStream {\n    private long ptr = 0;\n\n    public OfflineStream() {\n        LibraryLoader.maybeLoad();\n        this.ptr = 0;\n    }\n\n    public OfflineStream(long ptr) {\n        this.ptr = ptr;\n    }\n\n    public long getPtr() {\n        return ptr;\n    }\n\n    public void setPtr(long ptr) {\n        this.ptr = ptr;\n    }\n\n    public void acceptWaveform(float[] samples, int sampleRate) {\n        acceptWaveform(this.ptr, samples, sampleRate);\n    }\n\n    public void setOption(String key, String value) {\n        setOption(this.ptr, key, value);\n    }\n\n    public String getOption(String key) {\n        return getOption(this.ptr, key);\n    }\n\n    public boolean hasOption(String key) {\n        return hasOption(this.ptr, key);\n    }\n\n    public void release() {\n        // stream object must be release after used\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n        super.finalize();\n    }\n\n    private native void acceptWaveform(long ptr, float[] samples, int sampleRate);\n\n    private native void setOption(long ptr, String key, String value);\n\n    private native String getOption(long ptr, String key);\n\n    private native boolean hasOption(long ptr, String key);\n\n    private native void delete(long ptr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTransducerModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTransducerModelConfig {\n    private final String encoder;\n    private final String decoder;\n    private final String joiner;\n\n    private OfflineTransducerModelConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.joiner = builder.joiner;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getJoiner() {\n        return joiner;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String joiner = \"\";\n\n        public OfflineTransducerModelConfig build() {\n            return new OfflineTransducerModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setJoiner(String joiner) {\n            this.joiner = joiner;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTts.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\nimport java.util.function.Consumer;\n\npublic class OfflineTts {\n    private long ptr = 0;\n\n    public OfflineTts(OfflineTtsConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OfflineTtsConfig: failed to create native OfflineTts\");\n        }\n    }\n\n    /** Returns the sample rate of the TTS engine. */\n    public int getSampleRate() {\n        return getSampleRate(ptr);\n    }\n\n    public int getNumSpeakers() {\n        return getNumSpeakers(ptr);\n    }\n\n    /** Generates audio for the given text using the default speaker (sid=0) and speed=1.0. */\n    public GeneratedAudio generate(String text) {\n        return generate(text, 0, 1.0f);\n    }\n\n    /** Generates audio for the given text using a specific speaker ID. */\n    public GeneratedAudio generate(String text, int sid) {\n        return generate(text, sid, 1.0f);\n    }\n\n    /** Generates audio for the given text using a specific speaker ID and speed multiplier. 
*/\n    public GeneratedAudio generate(String text, int sid, float speed) {\n        return generateImpl(ptr, text, sid, speed);\n    }\n\n    public GeneratedAudio generateWithCallback(String text, OfflineTtsCallback callback) {\n        return generateWithCallback(text, 0, 1.0f, callback);\n    }\n\n    public GeneratedAudio generateWithCallback(\n        String text,\n        Consumer<float[]> consumer\n    ) {\n        return generateWithCallback(text, 0, 1.0f, consumer);\n    }\n\n    public GeneratedAudio generateWithCallback(String text, int sid, OfflineTtsCallback callback) {\n        return generateWithCallback(text, sid, 1.0f, callback);\n    }\n\n    public GeneratedAudio generateWithCallback(\n        String text,\n        int sid,\n        Consumer<float[]> consumer\n    ) {\n\n        return generateWithCallback(text, sid, 1.0f, consumer);\n    }\n\n    public GeneratedAudio generateWithCallback(String text, int sid, float speed, OfflineTtsCallback callback) {\n        return generateWithCallbackImpl(ptr, text, sid, speed, callback);\n    }\n\n    public GeneratedAudio generateWithCallback(\n            String text,\n            int sid,\n            float speed,\n            Consumer<float[]> consumer\n    ) {\n        OfflineTtsCallback cb = samples -> {\n            consumer.accept(samples);\n            return 1;\n        };\n        return generateWithCallback(text, sid, speed, cb);\n    }\n\n    /**\n     * Generate audio using a GenerationConfig and a callback.\n     *\n     * @param text The text to synthesize.\n     * @param config The generation configuration.\n     * @param callback Callback to receive intermediate audio chunks.\n     * @return GeneratedAudio with samples and sample rate.\n     */\n    public GeneratedAudio generateWithConfigAndCallback(\n            String text,\n            GenerationConfig config,\n            OfflineTtsCallback callback\n    ) {\n        return generateWithConfigImpl(ptr, text, config, callback);\n    
}\n\n\n    public GeneratedAudio generateWithConfigAndCallback(\n            String text,\n            GenerationConfig config,\n            Consumer<float[]> consumer\n    ) {\n        OfflineTtsCallback cb = samples -> {\n            consumer.accept(samples);\n            return 1;\n        };\n        return generateWithConfigAndCallback(text, config, cb);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native int getSampleRate(long ptr);\n\n    private native int getNumSpeakers(long ptr);\n\n    private native GeneratedAudio generateImpl(long ptr, String text, int sid, float speed);\n\n    private native GeneratedAudio generateWithCallbackImpl(long ptr, String text, int sid, float speed, OfflineTtsCallback callback);\n\n    private native GeneratedAudio generateWithConfigImpl(\n            long ptr,\n            String text,\n            GenerationConfig config,\n            OfflineTtsCallback callback\n    );\n\n    private native long newFromFile(OfflineTtsConfig config);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsCallback.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\n@FunctionalInterface\npublic interface OfflineTtsCallback {\n    /**\n     * @param samples audio chunk\n     * @return 1 to continue, 0 to stop\n     */\n    Integer invoke(float[] samples);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsConfig {\n    private final OfflineTtsModelConfig model;\n    private final String ruleFsts;\n    private final String ruleFars;\n    private final int maxNumSentences;\n    private final float silenceScale;\n\n    private OfflineTtsConfig(Builder builder) {\n        this.model = builder.model;\n        this.ruleFsts = builder.ruleFsts;\n        this.ruleFars = builder.ruleFars;\n        this.maxNumSentences = builder.maxNumSentences;\n        this.silenceScale = builder.silenceScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflineTtsModelConfig getModel() {\n        return model;\n    }\n\n    public String getRuleFsts() {\n        return ruleFsts;\n    }\n\n    public String getRuleFars() {\n        return ruleFars;\n    }\n\n    public int getMaxNumSentences() {\n        return maxNumSentences;\n    }\n\n    public float getSilenceScale() {\n        return silenceScale;\n    }\n\n    public static class Builder {\n        private OfflineTtsModelConfig model = OfflineTtsModelConfig.builder().build();\n        private String ruleFsts = \"\";\n        private String ruleFars = \"\";\n        private int maxNumSentences = 1;\n        private float silenceScale = 0.2f;\n\n        public OfflineTtsConfig build() {\n            return new OfflineTtsConfig(this);\n        }\n\n        public Builder setModel(OfflineTtsModelConfig model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setRuleFsts(String ruleFsts) {\n            this.ruleFsts = ruleFsts;\n            return this;\n        }\n\n        public Builder setRuleFars(String ruleFars) {\n            this.ruleFars = ruleFars;\n            return this;\n        }\n\n        public Builder setMaxNumSentences(int maxNumSentences) {\n            this.maxNumSentences = maxNumSentences;\n            
return this;\n        }\n\n        public Builder setSilenceScale(float silenceScale) {\n            this.silenceScale = silenceScale;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsKittenModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsKittenModelConfig {\n    private final String model;\n    private final String voices;\n    private final String tokens;\n    private final String dataDir;\n    private final float lengthScale;\n\n    private OfflineTtsKittenModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.voices = builder.voices;\n        this.tokens = builder.tokens;\n        this.dataDir = builder.dataDir;\n        this.lengthScale = builder.lengthScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public String getVoices() {\n        return voices;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public String getDataDir() {\n        return dataDir;\n    }\n\n    public float getLengthScale() {\n        return lengthScale;\n    }\n\n\n    public static class Builder {\n        private String model = \"\";\n        private String voices = \"\";\n        private String tokens = \"\";\n        private String dataDir = \"\";\n        private float lengthScale = 1.0f;\n\n        public OfflineTtsKittenModelConfig build() {\n            return new OfflineTtsKittenModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setVoices(String voices) {\n            this.voices = voices;\n            return this;\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setDataDir(String dataDir) {\n            this.dataDir = dataDir;\n            return this;\n        }\n\n        public Builder setLengthScale(float lengthScale) {\n            this.lengthScale = lengthScale;\n            return this;\n     
   }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsKokoroModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsKokoroModelConfig {\n    private final String model;\n    private final String voices;\n    private final String tokens;\n    private final String lexicon;\n    private final String lang;\n    private final String dataDir;\n    private final String dictDir;  // unused\n    private final float lengthScale;\n\n    private OfflineTtsKokoroModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.voices = builder.voices;\n        this.tokens = builder.tokens;\n        this.lexicon = builder.lexicon;\n        this.lang = builder.lang;\n        this.dataDir = builder.dataDir;\n        this.dictDir = builder.dictDir;\n        this.lengthScale = builder.lengthScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public String getVoices() {\n        return voices;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public String getDataDir() {\n        return dataDir;\n    }\n\n    public float getLengthScale() {\n        return lengthScale;\n    }\n\n\n    public static class Builder {\n        private String model = \"\";\n        private String voices = \"\";\n        private String tokens = \"\";\n        private String lexicon = \"\";\n        private String lang = \"\";\n        private String dataDir = \"\";\n        private String dictDir = \"\";\n        private float lengthScale = 1.0f;\n\n        public OfflineTtsKokoroModelConfig build() {\n            return new OfflineTtsKokoroModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setVoices(String voices) {\n            this.voices = voices;\n            return this;\n        }\n\n        public Builder setTokens(String 
tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setLexicon(String lexicon) {\n            this.lexicon = lexicon;\n            return this;\n        }\n\n        public Builder setLang(String lang) {\n            this.lang = lang;\n            return this;\n        }\n\n        public Builder setDataDir(String dataDir) {\n            this.dataDir = dataDir;\n            return this;\n        }\n\n        public Builder setDictDir(String dictDir) {\n            this.dictDir = dictDir;\n            return this;\n        }\n\n        public Builder setLengthScale(float lengthScale) {\n            this.lengthScale = lengthScale;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsMatchaModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsMatchaModelConfig {\n    private final String acousticModel;\n    private final String vocoder;\n    private final String lexicon;\n    private final String tokens;\n    private final String dataDir;\n    private final String dictDir;  // unused\n    private final float noiseScale;\n    private final float lengthScale;\n\n    private OfflineTtsMatchaModelConfig(Builder builder) {\n        this.acousticModel = builder.acousticModel;\n        this.vocoder = builder.vocoder;\n        this.lexicon = builder.lexicon;\n        this.tokens = builder.tokens;\n        this.dataDir = builder.dataDir;\n        this.dictDir = builder.dictDir;\n        this.noiseScale = builder.noiseScale;\n        this.lengthScale = builder.lengthScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getAcousticModel() {\n        return acousticModel;\n    }\n\n    public String getVocoder() {\n        return vocoder;\n    }\n\n    public String getLexicon() {\n        return lexicon;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public String getDataDir() {\n        return dataDir;\n    }\n\n    public String getDictDir() {\n        return dictDir;\n    }\n\n    public float getLengthScale() {\n        return lengthScale;\n    }\n\n    public float getNoiseScale() {\n        return noiseScale;\n    }\n\n    public static class Builder {\n        private String acousticModel = \"\";\n        private String vocoder = \"\";\n        private String lexicon = \"\";\n        private String tokens = \"\";\n        private String dataDir = \"\";\n        private String dictDir = \"\";\n        private float noiseScale = 1.0f;\n        private float lengthScale = 1.0f;\n\n        public OfflineTtsMatchaModelConfig build() {\n            return new OfflineTtsMatchaModelConfig(this);\n        }\n\n    
    public Builder setAcousticModel(String acousticModel) {\n            this.acousticModel = acousticModel;\n            return this;\n        }\n\n        public Builder setVocoder(String vocoder) {\n            this.vocoder = vocoder;\n            return this;\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setLexicon(String lexicon) {\n            this.lexicon = lexicon;\n            return this;\n        }\n\n        public Builder setDataDir(String dataDir) {\n            this.dataDir = dataDir;\n            return this;\n        }\n\n        public Builder setDictDir(String dictDir) {\n            this.dictDir = dictDir;\n            return this;\n        }\n\n        public Builder setNoiseScale(float noiseScale) {\n            this.noiseScale = noiseScale;\n            return this;\n        }\n\n        public Builder setLengthScale(float lengthScale) {\n            this.lengthScale = lengthScale;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsModelConfig {\n    private final OfflineTtsVitsModelConfig vits;\n    private final OfflineTtsMatchaModelConfig matcha;\n    private final OfflineTtsKokoroModelConfig kokoro;\n    private final OfflineTtsZipVoiceModelConfig zipvoice;\n    private final OfflineTtsKittenModelConfig kitten;\n    private final OfflineTtsPocketModelConfig pocket;\n    private final OfflineTtsSupertonicModelConfig supertonic;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private OfflineTtsModelConfig(Builder builder) {\n        this.vits = builder.vits;\n        this.matcha = builder.matcha;\n        this.kokoro = builder.kokoro;\n        this.zipvoice = builder.zipvoice;\n        this.kitten = builder.kitten;\n        this.pocket = builder.pocket;\n        this.supertonic = builder.supertonic;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OfflineTtsVitsModelConfig getVits() {\n        return vits;\n    }\n\n    public OfflineTtsMatchaModelConfig getMatcha() {\n        return matcha;\n    }\n\n    public OfflineTtsKokoroModelConfig getKokoro() {\n        return kokoro;\n    }\n\n    public OfflineTtsZipVoiceModelConfig getZipvoice() {\n        return zipvoice;\n    }\n\n    public OfflineTtsKittenModelConfig getKitten() {\n        return kitten;\n    }\n\n    public OfflineTtsPocketModelConfig getPocket() {\n        return pocket;\n    }\n\n    public OfflineTtsSupertonicModelConfig getSupertonic() {\n        return supertonic;\n    }\n\n    public static class Builder {\n        private OfflineTtsVitsModelConfig vits = OfflineTtsVitsModelConfig.builder().build();\n        private OfflineTtsMatchaModelConfig matcha = 
OfflineTtsMatchaModelConfig.builder().build();\n        private OfflineTtsKokoroModelConfig kokoro = OfflineTtsKokoroModelConfig.builder().build();\n        private OfflineTtsZipVoiceModelConfig zipvoice = OfflineTtsZipVoiceModelConfig.builder().build();\n        private OfflineTtsKittenModelConfig kitten = OfflineTtsKittenModelConfig.builder().build();\n        private OfflineTtsPocketModelConfig pocket = OfflineTtsPocketModelConfig.builder().build();\n        private OfflineTtsSupertonicModelConfig supertonic = OfflineTtsSupertonicModelConfig.builder().build();\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public OfflineTtsModelConfig build() {\n            return new OfflineTtsModelConfig(this);\n        }\n\n        public Builder setVits(OfflineTtsVitsModelConfig vits) {\n            this.vits = vits;\n            return this;\n        }\n\n        public Builder setMatcha(OfflineTtsMatchaModelConfig matcha) {\n            this.matcha = matcha;\n            return this;\n        }\n\n        public Builder setKokoro(OfflineTtsKokoroModelConfig kokoro) {\n            this.kokoro = kokoro;\n            return this;\n        }\n\n        public Builder setZipvoice(OfflineTtsZipVoiceModelConfig zipvoice) {\n            this.zipvoice = zipvoice;\n            return this;\n        }\n\n        public Builder setKitten(OfflineTtsKittenModelConfig kitten) {\n            this.kitten = kitten;\n            return this;\n        }\n\n        public Builder setPocket(OfflineTtsPocketModelConfig pocket) {\n            this.pocket = pocket;\n            return this;\n        }\n\n        public Builder setSupertonic(OfflineTtsSupertonicModelConfig supertonic) {\n            this.supertonic = supertonic;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        
public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsPocketModelConfig.java",
    "content": "// Copyright 2026 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsPocketModelConfig {\n    private final String lmFlow;\n    private final String lmMain;\n    private final String encoder;\n    private final String decoder;\n    private final String textConditioner;\n    private final String vocabJson;\n    private final String tokenScoresJson;\n    private final int voiceEmbeddingCacheCapacity;\n\n    private OfflineTtsPocketModelConfig(Builder builder) {\n        this.lmFlow = builder.lmFlow;\n        this.lmMain = builder.lmMain;\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.textConditioner = builder.textConditioner;\n        this.vocabJson = builder.vocabJson;\n        this.tokenScoresJson = builder.tokenScoresJson;\n        this.voiceEmbeddingCacheCapacity = builder.voiceEmbeddingCacheCapacity;\n    }\n\n    public String getLmFlow() {\n        return lmFlow;\n    }\n\n    public String getLmMain() {\n        return lmMain;\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getTextConditioner() {\n        return textConditioner;\n    }\n\n    public String getVocabJson() {\n        return vocabJson;\n    }\n\n    public String getTokenScoresJson() {\n        return tokenScoresJson;\n    }\n\n    public int getVoiceEmbeddingCacheCapacity() {\n        return voiceEmbeddingCacheCapacity;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private String lmFlow = \"\";\n        private String lmMain = \"\";\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String textConditioner = \"\";\n        private String vocabJson = \"\";\n        private String tokenScoresJson = \"\";\n        private int voiceEmbeddingCacheCapacity = 50;\n\n    
    public OfflineTtsPocketModelConfig build() {\n            return new OfflineTtsPocketModelConfig(this);\n        }\n\n        public Builder setLmFlow(String lmFlow) {\n            this.lmFlow = lmFlow;\n            return this;\n        }\n\n        public Builder setLmMain(String lmMain) {\n            this.lmMain = lmMain;\n            return this;\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setTextConditioner(String textConditioner) {\n            this.textConditioner = textConditioner;\n            return this;\n        }\n\n        public Builder setVocabJson(String vocabJson) {\n            this.vocabJson = vocabJson;\n            return this;\n        }\n\n        public Builder setTokenScoresJson(String tokenScoresJson) {\n            this.tokenScoresJson = tokenScoresJson;\n            return this;\n        }\n\n        public Builder setVoiceEmbeddingCacheCapacity(int voiceEmbeddingCacheCapacity) {\n            this.voiceEmbeddingCacheCapacity = voiceEmbeddingCacheCapacity;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsSupertonicModelConfig.java",
    "content": "// Copyright 2026 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsSupertonicModelConfig {\n    private final String durationPredictor;\n    private final String textEncoder;\n    private final String vectorEstimator;\n    private final String vocoder;\n    private final String ttsJson;\n    private final String unicodeIndexer;\n    private final String voiceStyle;\n\n    private OfflineTtsSupertonicModelConfig(Builder builder) {\n        this.durationPredictor = builder.durationPredictor;\n        this.textEncoder = builder.textEncoder;\n        this.vectorEstimator = builder.vectorEstimator;\n        this.vocoder = builder.vocoder;\n        this.ttsJson = builder.ttsJson;\n        this.unicodeIndexer = builder.unicodeIndexer;\n        this.voiceStyle = builder.voiceStyle;\n    }\n\n    public String getDurationPredictor() {\n        return durationPredictor;\n    }\n\n    public String getTextEncoder() {\n        return textEncoder;\n    }\n\n    public String getVectorEstimator() {\n        return vectorEstimator;\n    }\n\n    public String getVocoder() {\n        return vocoder;\n    }\n\n    public String getTtsJson() {\n        return ttsJson;\n    }\n\n    public String getUnicodeIndexer() {\n        return unicodeIndexer;\n    }\n\n    public String getVoiceStyle() {\n        return voiceStyle;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private String durationPredictor = \"\";\n        private String textEncoder = \"\";\n        private String vectorEstimator = \"\";\n        private String vocoder = \"\";\n        private String ttsJson = \"\";\n        private String unicodeIndexer = \"\";\n        private String voiceStyle = \"\";\n\n        public OfflineTtsSupertonicModelConfig build() {\n            return new OfflineTtsSupertonicModelConfig(this);\n        }\n\n        public Builder setDurationPredictor(String 
durationPredictor) {\n            this.durationPredictor = durationPredictor;\n            return this;\n        }\n\n        public Builder setTextEncoder(String textEncoder) {\n            this.textEncoder = textEncoder;\n            return this;\n        }\n\n        public Builder setVectorEstimator(String vectorEstimator) {\n            this.vectorEstimator = vectorEstimator;\n            return this;\n        }\n\n        public Builder setVocoder(String vocoder) {\n            this.vocoder = vocoder;\n            return this;\n        }\n\n        public Builder setTtsJson(String ttsJson) {\n            this.ttsJson = ttsJson;\n            return this;\n        }\n\n        public Builder setUnicodeIndexer(String unicodeIndexer) {\n            this.unicodeIndexer = unicodeIndexer;\n            return this;\n        }\n\n        public Builder setVoiceStyle(String voiceStyle) {\n            this.voiceStyle = voiceStyle;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsVitsModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsVitsModelConfig {\n    private final String model;\n    private final String lexicon;\n    private final String tokens;\n    private final String dataDir;\n    private final String dictDir;  // unused\n    private final float noiseScale;\n    private final float noiseScaleW;\n    private final float lengthScale;\n\n    private OfflineTtsVitsModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.lexicon = builder.lexicon;\n        this.tokens = builder.tokens;\n        this.dataDir = builder.dataDir;\n        this.dictDir = builder.dictDir;\n        this.noiseScale = builder.noiseScale;\n        this.noiseScaleW = builder.noiseScaleW;\n        this.lengthScale = builder.lengthScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public String getLexicon() {\n        return lexicon;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public String getDataDir() {\n        return dataDir;\n    }\n\n    public String getDictDir() {\n        return dictDir;\n    }\n\n    public float getLengthScale() {\n        return lengthScale;\n    }\n\n    public float getNoiseScale() {\n        return noiseScale;\n    }\n\n    public float getNoiseScaleW() {\n        return noiseScaleW;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private String lexicon = \"\";\n        private String tokens = \"\";\n        private String dataDir = \"\";\n        private String dictDir = \"\";\n        private float noiseScale = 0.667f;\n        private float noiseScaleW = 0.8f;\n        private float lengthScale = 1.0f;\n\n        public OfflineTtsVitsModelConfig build() {\n            return new OfflineTtsVitsModelConfig(this);\n        }\n\n        public Builder 
setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setLexicon(String lexicon) {\n            this.lexicon = lexicon;\n            return this;\n        }\n\n        public Builder setDataDir(String dataDir) {\n            this.dataDir = dataDir;\n            return this;\n        }\n\n        public Builder setDictDir(String dictDir) {\n            this.dictDir = dictDir;\n            return this;\n        }\n\n        public Builder setNoiseScale(float noiseScale) {\n            this.noiseScale = noiseScale;\n            return this;\n        }\n\n        public Builder setNoiseScaleW(float noiseScaleW) {\n            this.noiseScaleW = noiseScaleW;\n            return this;\n        }\n\n        public Builder setLengthScale(float lengthScale) {\n            this.lengthScale = lengthScale;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineTtsZipVoiceModelConfig.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineTtsZipVoiceModelConfig {\n    private final String tokens;\n    private final String encoder;\n    private final String decoder;\n    private final String vocoder;\n    private final String dataDir;\n    private final String lexicon;\n    private final float featScale;\n    private final float tShift;\n    private final float targetRms;\n    private final float guidanceScale;\n\n    private OfflineTtsZipVoiceModelConfig(Builder builder) {\n        this.tokens = builder.tokens;\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.vocoder = builder.vocoder;\n        this.dataDir = builder.dataDir;\n        this.lexicon = builder.lexicon;\n        this.featScale = builder.featScale;\n        this.tShift = builder.tShift;\n        this.targetRms = builder.targetRms;\n        this.guidanceScale = builder.guidanceScale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getVocoder() {\n        return vocoder;\n    }\n\n    public String getDataDir() {\n        return dataDir;\n    }\n\n    public String getLexicon() {\n        return lexicon;\n    }\n\n    public float getFeatScale() {\n        return featScale;\n    }\n\n    public float getTShift() {\n        return tShift;\n    }\n\n    public float getTargetRms() {\n        return targetRms;\n    }\n\n    public float getGuidanceScale() {\n        return guidanceScale;\n    }\n\n    public static class Builder {\n        private String tokens = \"\";\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String vocoder = \"\";\n        private String dataDir = \"\";\n      
  private String lexicon = \"\";\n        private float featScale = 0.1f;\n        private float tShift = 0.5f;\n        private float targetRms = 0.1f;\n        private float guidanceScale = 1.0f;\n\n        public OfflineTtsZipVoiceModelConfig build() {\n            return new OfflineTtsZipVoiceModelConfig(this);\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setVocoder(String vocoder) {\n            this.vocoder = vocoder;\n            return this;\n        }\n\n        public Builder setDataDir(String dataDir) {\n            this.dataDir = dataDir;\n            return this;\n        }\n\n        public Builder setLexicon(String lexicon) {\n            this.lexicon = lexicon;\n            return this;\n        }\n\n        public Builder setFeatScale(float featScale) {\n            this.featScale = featScale;\n            return this;\n        }\n\n        public Builder setTShift(float tShift) {\n            this.tShift = tShift;\n            return this;\n        }\n\n        public Builder setTargetRms(float targetRms) {\n            this.targetRms = targetRms;\n            return this;\n        }\n\n        public Builder setGuidanceScale(float guidanceScale) {\n            this.guidanceScale = guidanceScale;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineWenetCtcModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OfflineWenetCtcModelConfig {\n    private final String model;\n\n    private OfflineWenetCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineWenetCtcModelConfig build() {\n            return new OfflineWenetCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineWhisperModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineWhisperModelConfig {\n    private final String encoder;\n    private final String decoder;\n    private final String language;\n    private final String task;\n    private final int tailPaddings;\n    private final boolean enableTokenTimestamps;\n    private final boolean enableSegmentTimestamps;\n\n    private OfflineWhisperModelConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.language = builder.language;\n        this.task = builder.task;\n        this.tailPaddings = builder.tailPaddings;\n        this.enableTokenTimestamps = builder.enableTokenTimestamps;\n        this.enableSegmentTimestamps = builder.enableSegmentTimestamps;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getLanguage() {\n        return language;\n    }\n\n    public String getTask() {\n        return task;\n    }\n\n    public int getTailPaddings() {\n        return tailPaddings;\n    }\n\n    public boolean getEnableTokenTimestamps() {\n        return enableTokenTimestamps;\n    }\n\n    public boolean getEnableSegmentTimestamps() {\n        return enableSegmentTimestamps;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String language = \"en\"; // used only with multilingual models\n        private String task = \"transcribe\"; // used only with multilingual models\n\n        private int tailPaddings = 1000; // number of frames to pad\n        private boolean enableTokenTimestamps = false;\n        private boolean enableSegmentTimestamps = false;\n\n        public OfflineWhisperModelConfig build() {\n            return new 
OfflineWhisperModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setLanguage(String language) {\n            this.language = language;\n            return this;\n        }\n\n        public Builder setTask(String task) {\n            this.task = task;\n            return this;\n        }\n\n        public Builder setTailPaddings(int tailPaddings) {\n            this.tailPaddings = tailPaddings;\n            return this;\n        }\n\n        public Builder setEnableTokenTimestamps(boolean enableTokenTimestamps) {\n            this.enableTokenTimestamps = enableTokenTimestamps;\n            return this;\n        }\n\n        public Builder setEnableSegmentTimestamps(boolean enableSegmentTimestamps) {\n            this.enableSegmentTimestamps = enableSegmentTimestamps;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineZipformerAudioTaggingModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineZipformerAudioTaggingModelConfig {\n    private final String model;\n\n    private OfflineZipformerAudioTaggingModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OfflineZipformerAudioTaggingModelConfig build() {\n            return new OfflineZipformerAudioTaggingModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OfflineZipformerCtcModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OfflineZipformerCtcModelConfig {\n    private final String model;\n    private final QnnConfig qnnConfig;\n\n    private OfflineZipformerCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.qnnConfig = builder.qnnConfig;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public QnnConfig getQnnConfig() {\n        return qnnConfig;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private QnnConfig qnnConfig = QnnConfig.builder().build();\n\n        public OfflineZipformerCtcModelConfig build() {\n            return new OfflineZipformerCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setQnnConfig(QnnConfig qnnConfig) {\n            this.qnnConfig = qnnConfig;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineCtcFstDecoderConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineCtcFstDecoderConfig {\n    private final String graph;\n    private final int maxActive;\n\n    private OnlineCtcFstDecoderConfig(Builder builder) {\n        this.graph = builder.graph;\n        this.maxActive = builder.maxActive;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getGraph() {\n        return graph;\n    }\n\n    public float getMaxActive() {\n        return maxActive;\n    }\n\n    public static class Builder {\n        private String graph = \"\";\n        private int maxActive = 3000;\n\n        public OnlineCtcFstDecoderConfig build() {\n            return new OnlineCtcFstDecoderConfig(this);\n        }\n\n        public Builder setGraph(String graph) {\n            this.graph = graph;\n            return this;\n        }\n\n        public Builder setMaxActive(int maxActive) {\n            this.maxActive = maxActive;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineLMConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineLMConfig {\n\n    private final String model;\n    private final float scale;\n\n    private OnlineLMConfig(Builder builder) {\n        this.model = builder.model;\n        this.scale = builder.scale;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public float getScale() {\n        return scale;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private float scale = 1.0f;\n\n        public OnlineLMConfig build() {\n            return new OnlineLMConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setScale(float scale) {\n            this.scale = scale;\n            return this;\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineModelConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineModelConfig {\n    private final OnlineTransducerModelConfig transducer;\n    private final OnlineParaformerModelConfig paraformer;\n    private final OnlineZipformer2CtcModelConfig zipformer2Ctc;\n    private final OnlineNeMoCtcModelConfig neMoCtc;\n    private final OnlineToneCtcModelConfig toneCtc;\n    private final String tokens;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n    private final String modelType;\n    private final String modelingUnit;\n    private final String bpeVocab;\n\n    private OnlineModelConfig(Builder builder) {\n        this.transducer = builder.transducer;\n        this.paraformer = builder.paraformer;\n        this.zipformer2Ctc = builder.zipformer2Ctc;\n        this.neMoCtc = builder.neMoCtc;\n        this.toneCtc = builder.toneCtc;\n        this.tokens = builder.tokens;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n        this.modelType = builder.modelType;\n        this.modelingUnit = builder.modelingUnit;\n        this.bpeVocab = builder.bpeVocab;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OnlineParaformerModelConfig getParaformer() {\n        return paraformer;\n    }\n\n    public OnlineTransducerModelConfig getTransducer() {\n        return transducer;\n    }\n\n    public OnlineZipformer2CtcModelConfig getZipformer2Ctc() {\n        return zipformer2Ctc;\n    }\n\n    public OnlineNeMoCtcModelConfig getNeMoCtc() {\n        return neMoCtc;\n    }\n\n    public OnlineToneCtcModelConfig getToneCtc() {\n        return toneCtc;\n    }\n\n    public String getTokens() {\n        return tokens;\n    }\n\n    public int getNumThreads() {\n        return numThreads;\n    }\n\n    public boolean 
getDebug() {\n        return debug;\n    }\n\n    public String getProvider() {\n        return provider;\n    }\n\n    public String getModelType() {\n        return modelType;\n    }\n\n    public String getModelingUnit() {\n        return modelingUnit;\n    }\n\n    public String getBpeVocab() {\n        return bpeVocab;\n    }\n\n    public static class Builder {\n        private OnlineParaformerModelConfig paraformer = OnlineParaformerModelConfig.builder().build();\n        private OnlineTransducerModelConfig transducer = OnlineTransducerModelConfig.builder().build();\n        private OnlineZipformer2CtcModelConfig zipformer2Ctc = OnlineZipformer2CtcModelConfig.builder().build();\n        private OnlineNeMoCtcModelConfig neMoCtc = OnlineNeMoCtcModelConfig.builder().build();\n        private OnlineToneCtcModelConfig toneCtc = OnlineToneCtcModelConfig.builder().build();\n        private String tokens = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n        private String modelType = \"\";\n        private String modelingUnit = \"cjkchar\";\n        private String bpeVocab = \"\";\n\n        public OnlineModelConfig build() {\n            return new OnlineModelConfig(this);\n        }\n\n        public Builder setTransducer(OnlineTransducerModelConfig transducer) {\n            this.transducer = transducer;\n            return this;\n        }\n\n        public Builder setParaformer(OnlineParaformerModelConfig paraformer) {\n            this.paraformer = paraformer;\n            return this;\n        }\n\n        public Builder setZipformer2Ctc(OnlineZipformer2CtcModelConfig zipformer2Ctc) {\n            this.zipformer2Ctc = zipformer2Ctc;\n            return this;\n        }\n\n        public Builder setNeMoCtc(OnlineNeMoCtcModelConfig neMoCtc) {\n            this.neMoCtc = neMoCtc;\n            return this;\n        }\n\n        public Builder setToneCtc(OnlineToneCtcModelConfig 
toneCtc) {\n            this.toneCtc = toneCtc;\n            return this;\n        }\n\n        public Builder setTokens(String tokens) {\n            this.tokens = tokens;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n\n        public Builder setModelType(String modelType) {\n            this.modelType = modelType;\n            return this;\n        }\n\n        public Builder setModelingUnit(String modelingUnit) {\n            this.modelingUnit = modelingUnit;\n            return this;\n        }\n\n        public Builder setBpeVocab(String bpeVocab) {\n            this.bpeVocab = bpeVocab;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineNeMoCtcModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineNeMoCtcModelConfig {\n    private final String model;\n\n    private OnlineNeMoCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OnlineNeMoCtcModelConfig build() {\n            return new OnlineNeMoCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineParaformerModelConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineParaformerModelConfig {\n    private final String encoder;\n    private final String decoder;\n\n    private OnlineParaformerModelConfig(Builder builder) {\n      this.encoder = builder.encoder;\n      this.decoder = builder.decoder;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n\n        public OnlineParaformerModelConfig build() {\n            return new OnlineParaformerModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlinePunctuation.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlinePunctuation {\n    private long ptr = 0;\n\n    public OnlinePunctuation(OnlinePunctuationConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n    }\n\n    public String addPunctuation(String text) {\n        return addPunctuation(ptr, text);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // You'd better call it manually if it is not used anymore\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(OnlinePunctuationConfig config);\n\n    private native String addPunctuation(long ptr, String text);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlinePunctuationConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlinePunctuationConfig {\n    private final OnlinePunctuationModelConfig model;\n\n    private OnlinePunctuationConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OnlinePunctuationModelConfig getModel() {\n        return model;\n    }\n\n\n    public static class Builder {\n        private OnlinePunctuationModelConfig model = OnlinePunctuationModelConfig.builder().build();\n\n        public OnlinePunctuationConfig build() {\n            return new OnlinePunctuationConfig(this);\n        }\n\n        public Builder setModel(OnlinePunctuationModelConfig model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlinePunctuationModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlinePunctuationModelConfig {\n    private final String cnnBilstm;\n    private final String bpeVocab;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private OnlinePunctuationModelConfig(Builder builder) {\n        this.cnnBilstm = builder.cnnBilstm;\n        this.bpeVocab = builder.bpeVocab;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getCnnBilstm() {\n        return cnnBilstm;\n    }\n\n    public String getBpeVocab() {\n        return bpeVocab;\n    }\n\n    public static class Builder {\n        private String cnnBilstm = \"\";\n        private String bpeVocab = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public OnlinePunctuationModelConfig build() {\n            return new OnlinePunctuationModelConfig(this);\n        }\n\n        public Builder setCnnBilstm(String cnnBilstm) {\n            this.cnnBilstm = cnnBilstm;\n            return this;\n        }\n\n        public Builder setBpeVocab(String bpeVocab) {\n            this.bpeVocab = bpeVocab;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineRecognizer.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineRecognizer {\n    private long ptr = 0;\n\n    public OnlineRecognizer(OnlineRecognizerConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OnlineRecognizerConfig: failed to create native OnlineRecognizer\");\n        }\n    }\n\n    public void decode(OnlineStream s) {\n        decode(ptr, s.getPtr());\n    }\n\n    public void decode(OnlineStream[] ss) {\n        if (ss == null || ss.length == 0) {\n          throw new IllegalArgumentException(\"Stream array must be non-empty\");\n        }\n        long[] streamPtrs = new long[ss.length];\n        for (int i = 0; i < ss.length; ++i) {\n            streamPtrs[i] = ss[i].getPtr();\n        }\n        decodeStreams(ptr, streamPtrs);\n    }\n\n    public boolean isReady(OnlineStream s) {\n        return isReady(ptr, s.getPtr());\n    }\n\n    public boolean isEndpoint(OnlineStream s) {\n        return isEndpoint(ptr, s.getPtr());\n    }\n\n    public void reset(OnlineStream s) {\n        reset(ptr, s.getPtr());\n    }\n\n    public OnlineStream createStream() {\n        long p = createStream(ptr, \"\");\n        return new OnlineStream(p);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // You'd better call it manually if it is not used anymore\n    protected void close()  {\n      if (this.ptr == 0) {\n        return;\n      }\n      delete(this.ptr);\n      this.ptr = 0;\n    }\n    \n    public void release() {\n      this.close();\n    }\n\n    public OnlineRecognizerResult getResult(OnlineStream s) {\n        return getResult(ptr, s.getPtr());\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(OnlineRecognizerConfig config);\n\n    private native long 
createStream(long ptr, String hotwords);\n\n    private native void reset(long ptr, long streamPtr);\n\n    private native void decode(long ptr, long streamPtr);\n\n    private native void decodeStreams(long ptr, long[] streamPtrs);\n\n    private native boolean isEndpoint(long ptr, long streamPtr);\n\n    private native boolean isReady(long ptr, long streamPtr);\n\n    private native OnlineRecognizerResult getResult(long ptr, long streamPtr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineRecognizerConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineRecognizerConfig {\n    private final FeatureConfig featConfig;\n    private final OnlineModelConfig modelConfig;\n    private final OnlineLMConfig lmConfig;\n\n    private final OnlineCtcFstDecoderConfig ctcFstDecoderConfig;\n    private final EndpointConfig endpointConfig;\n    private final HomophoneReplacerConfig hr;\n    private final boolean enableEndpoint;\n    private final String decodingMethod;\n    private final int maxActivePaths;\n    private final String hotwordsFile;\n    private final float hotwordsScore;\n    private final String ruleFsts;\n    private final String ruleFars;\n    private final float blankPenalty;\n\n    private OnlineRecognizerConfig(Builder builder) {\n        this.featConfig = builder.featConfig;\n        this.modelConfig = builder.modelConfig;\n        this.lmConfig = builder.lmConfig;\n        this.ctcFstDecoderConfig = builder.ctcFstDecoderConfig;\n        this.endpointConfig = builder.endpointConfig;\n        this.hr = builder.hr;\n        this.enableEndpoint = builder.enableEndpoint;\n        this.decodingMethod = builder.decodingMethod;\n        this.maxActivePaths = builder.maxActivePaths;\n        this.hotwordsFile = builder.hotwordsFile;\n        this.hotwordsScore = builder.hotwordsScore;\n        this.ruleFsts = builder.ruleFsts;\n        this.ruleFars = builder.ruleFars;\n        this.blankPenalty = builder.blankPenalty;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public OnlineModelConfig getModelConfig() {\n        return modelConfig;\n    }\n\n    public static class Builder {\n        private FeatureConfig featConfig = FeatureConfig.builder().build();\n        private OnlineModelConfig modelConfig = OnlineModelConfig.builder().build();\n        private OnlineLMConfig lmConfig = OnlineLMConfig.builder().build();\n        
private OnlineCtcFstDecoderConfig ctcFstDecoderConfig = OnlineCtcFstDecoderConfig.builder().build();\n        private EndpointConfig endpointConfig = EndpointConfig.builder().build();\n        private HomophoneReplacerConfig hr = HomophoneReplacerConfig.builder().build();\n        private boolean enableEndpoint = true;\n        private String decodingMethod = \"greedy_search\";\n        private int maxActivePaths = 4;\n        private String hotwordsFile = \"\";\n        private float hotwordsScore = 1.5f;\n        private String ruleFsts = \"\";\n        private String ruleFars = \"\";\n        private float blankPenalty = 0.0f;\n\n        public OnlineRecognizerConfig build() {\n          return new OnlineRecognizerConfig(this);\n        }\n\n        public Builder setFeatureConfig(FeatureConfig featConfig) {\n            this.featConfig = featConfig;\n            return this;\n        }\n\n        public Builder setOnlineModelConfig(OnlineModelConfig modelConfig) {\n            this.modelConfig = modelConfig;\n            return this;\n        }\n\n        public Builder setOnlineLMConfig(OnlineLMConfig lmConfig) {\n            this.lmConfig = lmConfig;\n            return this;\n        }\n\n        public Builder setCtcFstDecoderConfig(OnlineCtcFstDecoderConfig ctcFstDecoderConfig) {\n            this.ctcFstDecoderConfig = ctcFstDecoderConfig;\n            return this;\n        }\n\n        public Builder setEndpointConfig(EndpointConfig endpointConfig) {\n            this.endpointConfig = endpointConfig;\n            return this;\n        }\n\n        public Builder setHr(HomophoneReplacerConfig hr) {\n            this.hr = hr;\n            return this;\n        }\n\n        public Builder setEnableEndpoint(boolean enableEndpoint) {\n            this.enableEndpoint = enableEndpoint;\n            return this;\n        }\n\n        public Builder setDecodingMethod(String decodingMethod) {\n            this.decodingMethod = decodingMethod;\n            return 
this;\n        }\n\n        public Builder setMaxActivePaths(int maxActivePaths) {\n            this.maxActivePaths = maxActivePaths;\n            return this;\n        }\n\n        public Builder setHotwordsFile(String hotwordsFile) {\n            this.hotwordsFile = hotwordsFile;\n            return this;\n        }\n\n        public Builder setHotwordsScore(float hotwordsScore) {\n            this.hotwordsScore = hotwordsScore;\n            return this;\n        }\n\n        public Builder setRuleFsts(String ruleFsts) {\n            this.ruleFsts = ruleFsts;\n            return this;\n        }\n\n        public Builder setRuleFars(String ruleFars) {\n            this.ruleFars = ruleFars;\n            return this;\n        }\n\n        public Builder setBlankPenalty(float blankPenalty) {\n            this.blankPenalty = blankPenalty;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineRecognizerResult.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineRecognizerResult {\n    private final String text;\n    private final String[] tokens;\n    private final float[] timestamps;\n    private final float[] ysProbs;\n\n    public OnlineRecognizerResult(String text, String[] tokens, float[] timestamps, float[] ysProbs) {\n        this.text = text;\n        this.tokens = tokens;\n        this.timestamps = timestamps;\n        this.ysProbs = ysProbs;\n    }\n\n    public String getText() {\n        return text;\n    }\n\n    public String[] getTokens() {\n        return tokens;\n    }\n\n    public float[] getTimestamps() {\n        return timestamps;\n    }\n\n    public float[] getYsProbs() {\n        return ysProbs;\n    }\n\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineSpeechDenoiser.java",
    "content": "// Copyright 2026 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineSpeechDenoiser {\n    private long ptr = 0;\n\n    public OnlineSpeechDenoiser(OnlineSpeechDenoiserConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid OnlineSpeechDenoiserConfig: failed to create native OnlineSpeechDenoiser\");\n        }\n    }\n\n    public int getSampleRate() {\n        return getSampleRate(ptr);\n    }\n\n    public int getFrameShiftInSamples() {\n        return getFrameShiftInSamples(ptr);\n    }\n\n    public DenoisedAudio run(float[] samples, int sampleRate) {\n        return run(ptr, samples, sampleRate);\n    }\n\n    public DenoisedAudio flush() {\n        return flush(ptr);\n    }\n\n    public void reset() {\n        reset(ptr);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native int getSampleRate(long ptr);\n\n    private native int getFrameShiftInSamples(long ptr);\n\n    private native DenoisedAudio run(long ptr, float[] samples, int sampleRate);\n\n    private native DenoisedAudio flush(long ptr);\n\n    private native void reset(long ptr);\n\n    private native long newFromFile(OnlineSpeechDenoiserConfig config);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineSpeechDenoiserConfig.java",
    "content": "// Copyright 2026 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineSpeechDenoiserConfig {\n    private final OfflineSpeechDenoiserModelConfig model;\n\n    private OnlineSpeechDenoiserConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private OfflineSpeechDenoiserModelConfig model = OfflineSpeechDenoiserModelConfig.builder().build();\n\n        public OnlineSpeechDenoiserConfig build() {\n            return new OnlineSpeechDenoiserConfig(this);\n        }\n\n        public Builder setModel(OfflineSpeechDenoiserModelConfig model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineStream.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineStream {\n    private long ptr = 0;\n\n    public OnlineStream() {\n        LibraryLoader.maybeLoad();\n        this.ptr = 0;\n    }\n\n    public OnlineStream(long ptr) {\n        this.ptr = ptr;\n    }\n\n    public long getPtr() {\n        return ptr;\n    }\n\n    public void setPtr(long ptr) {\n        this.ptr = ptr;\n    }\n\n    public void acceptWaveform(float[] samples, int sampleRate) {\n        acceptWaveform(this.ptr, samples, sampleRate);\n    }\n\n    public void inputFinished() {\n        inputFinished(this.ptr);\n    }\n\n    public void setOption(String key, String value) {\n        setOption(this.ptr, key, value);\n    }\n\n    public String getOption(String key) {\n        return getOption(this.ptr, key);\n    }\n\n    public boolean hasOption(String key) {\n        return hasOption(this.ptr, key);\n    }\n\n    public void release() {\n        close();\n    }\n    \n    public void close() {\n      // stream object must be release after used\n      if (this.ptr == 0) {\n          return;\n      }\n      delete(this.ptr);\n      this.ptr = 0;\n    }\n    \n    @Override\n    protected void finalize() throws Throwable {\n        close();\n        super.finalize();\n    }\n\n    private native void acceptWaveform(long ptr, float[] samples, int sampleRate);\n\n    private native void inputFinished(long ptr);\n\n    private native void setOption(long ptr, String key, String value);\n\n    private native String getOption(long ptr, String key);\n\n    private native boolean hasOption(long ptr, String key);\n\n    private native void delete(long ptr);\n}"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineToneCtcModelConfig.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class OnlineToneCtcModelConfig {\n    private final String model;\n\n    private OnlineToneCtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OnlineToneCtcModelConfig build() {\n            return new OnlineToneCtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineTransducerModelConfig.java",
    "content": "// Copyright 2022-2023 by zhaoming\n// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineTransducerModelConfig {\n    private final String encoder;\n    private final String decoder;\n    private final String joiner;\n\n    private OnlineTransducerModelConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.joiner = builder.joiner;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public String getJoiner() {\n        return joiner;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private String joiner = \"\";\n\n        public OnlineTransducerModelConfig build() {\n          return new OnlineTransducerModelConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setJoiner(String joiner) {\n            this.joiner = joiner;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/OnlineZipformer2CtcModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class OnlineZipformer2CtcModelConfig {\n    private final String model;\n\n    private OnlineZipformer2CtcModelConfig(Builder builder) {\n        this.model = builder.model;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n\n        public OnlineZipformer2CtcModelConfig build() {\n            return new OnlineZipformer2CtcModelConfig(this);\n        }\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/QnnConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class QnnConfig {\n    private final String backendLib;\n    private final String contextBinary;\n    private final String systemLib;\n\n    private QnnConfig(Builder builder) {\n        this.backendLib = builder.backendLib;\n        this.contextBinary = builder.contextBinary;\n        this.systemLib = builder.systemLib;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getBackendLib() {\n        return backendLib;\n    }\n\n    public String getContextBinary() {\n        return contextBinary;\n    }\n\n    public String getSystemLib() {\n        return systemLib;\n    }\n\n    public static class Builder {\n        private String backendLib = \"\";\n        private String contextBinary = \"\";\n        private String systemLib = \"\";\n\n        public QnnConfig build() {\n            return new QnnConfig(this);\n        }\n\n        public Builder setBackendLib(String backendLib) {\n            this.backendLib = backendLib;\n            return this;\n        }\n\n        public Builder setContextBinary(String contextBinary) {\n            this.contextBinary = contextBinary;\n            return this;\n        }\n\n        public Builder setSystemLib(String systemLib) {\n            this.systemLib = systemLib;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SileroVadModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SileroVadModelConfig {\n    private final String model;\n    private final float threshold;\n    private final float minSilenceDuration;\n    private final float minSpeechDuration;\n    private final int windowSize;\n    private final float maxSpeechDuration;\n\n    private SileroVadModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.threshold = builder.threshold;\n        this.minSilenceDuration = builder.minSilenceDuration;\n        this.minSpeechDuration = builder.minSpeechDuration;\n        this.windowSize = builder.windowSize;\n        this.maxSpeechDuration = builder.maxSpeechDuration;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public float getThreshold() {\n        return threshold;\n    }\n\n    public float getMinSilenceDuration() {\n        return minSilenceDuration;\n    }\n\n    public float getMinSpeechDuration() {\n        return minSpeechDuration;\n    }\n\n    public int getWindowSize() {\n        return windowSize;\n    }\n\n    public float getMaxSpeechDuration() {\n        return maxSpeechDuration;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private float threshold = 0.5f;\n        private float minSilenceDuration = 0.25f;\n        private float minSpeechDuration = 0.5f;\n        private int windowSize = 512;\n        private float maxSpeechDuration = 5.0f;\n\n        public SileroVadModelConfig build() {\n            return new SileroVadModelConfig(this);\n        }\n\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setThreshold(float threshold) {\n            this.threshold = threshold;\n            return this;\n        }\n\n        public Builder 
setMinSilenceDuration(float minSilenceDuration) {\n            this.minSilenceDuration = minSilenceDuration;\n            return this;\n        }\n\n        public Builder setMinSpeechDuration(float minSpeechDuration) {\n            this.minSpeechDuration = minSpeechDuration;\n            return this;\n        }\n\n        public Builder setWindowSize(int windowSize) {\n            this.windowSize = windowSize;\n            return this;\n        }\n\n        public Builder setMaxSpeechDuration(float maxSpeechDuration) {\n            this.maxSpeechDuration = maxSpeechDuration;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpeakerEmbeddingExtractor.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SpeakerEmbeddingExtractor {\n    private long ptr = 0;\n\n    public SpeakerEmbeddingExtractor(SpeakerEmbeddingExtractorConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid SpeakerEmbeddingExtractorConfig: failed to create native SpeakerEmbeddingExtractor\");\n        }\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    public OnlineStream createStream() {\n        long p = createStream(ptr);\n        return new OnlineStream(p);\n    }\n\n    public boolean isReady(OnlineStream s) {\n        return isReady(ptr, s.getPtr());\n    }\n\n    public float[] compute(OnlineStream s) {\n        return compute(ptr, s.getPtr());\n    }\n\n    public int getDim() {\n        return dim(ptr);\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(SpeakerEmbeddingExtractorConfig config);\n\n    private native long createStream(long ptr);\n\n    private native boolean isReady(long ptr, long streamPtr);\n\n    private native float[] compute(long ptr, long streamPtr);\n\n    private native int dim(long ptr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpeakerEmbeddingExtractorConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SpeakerEmbeddingExtractorConfig {\n    private final String model;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private SpeakerEmbeddingExtractorConfig(Builder builder) {\n        this.model = builder.model;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public SpeakerEmbeddingExtractorConfig build() {\n            return new SpeakerEmbeddingExtractorConfig(this);\n        }\n\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpeakerEmbeddingManager.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SpeakerEmbeddingManager {\n    private long ptr = 0;\n\n    public SpeakerEmbeddingManager(int dim) {\n        LibraryLoader.maybeLoad();\n        ptr = create(dim);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    public boolean add(String name, float[] embedding) {\n        return add(ptr, name, embedding);\n    }\n\n    public boolean add(String name, float[][] embedding) {\n        return addList(ptr, name, embedding);\n    }\n\n    public boolean remove(String name) {\n        return remove(ptr, name);\n    }\n\n    public String search(float[] embedding, float threshold) {\n        return search(ptr, embedding, threshold);\n    }\n\n    public boolean verify(String name, float[] embedding, float threshold) {\n        return verify(ptr, name, embedding, threshold);\n    }\n\n    public boolean contains(String name) {\n        return contains(ptr, name);\n    }\n\n    public int getNumSpeakers() {\n        return numSpeakers(ptr);\n    }\n\n    public String[] getAllSpeakerNames() {\n        return allSpeakerNames(ptr);\n    }\n\n    private native long create(int dim);\n\n    private native void delete(long ptr);\n\n    private native boolean add(long ptr, String name, float[] embedding);\n\n    private native boolean addList(long ptr, String name, float[][] embedding);\n\n    private native boolean remove(long ptr, String name);\n\n    private native String search(long ptr, float[] embedding, float threshold);\n\n    private native boolean verify(long ptr, String name, float[] embedding, float threshold);\n\n    private native boolean contains(long ptr, String name);\n\n    private native int numSpeakers(long ptr);\n\n    private native String[] 
allSpeakerNames(long ptr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpeechSegment.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class SpeechSegment {\n\n    private final int start;\n    private final float[] samples;\n\n    public SpeechSegment(int start, float[] samples) {\n        this.start = start;\n        this.samples = samples;\n    }\n\n    public int getStart() {\n        return start;\n    }\n\n    public float[] getSamples() {\n        return samples;\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpokenLanguageIdentification.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\nimport java.util.HashMap;\nimport java.util.Locale;\nimport java.util.Map;\n\npublic class SpokenLanguageIdentification {\n    private final Map<String, String> localeMap;\n    private long ptr = 0;\n\n    public SpokenLanguageIdentification(SpokenLanguageIdentificationConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid SpokenLanguageIdentificationConfig: failed to create native SpokenLanguageIdentification\");\n        }\n\n        String[] languages = Locale.getISOLanguages();\n        localeMap = new HashMap<String, String>(languages.length);\n        for (String language : languages) {\n            Locale locale = new Locale(language);\n            localeMap.put(language, locale.getDisplayName());\n        }\n    }\n\n    public String compute(OfflineStream stream) {\n        String lang = compute(ptr, stream.getPtr());\n        return localeMap.getOrDefault(lang, lang);\n    }\n\n    public OfflineStream createStream() {\n        long p = createStream(ptr);\n        return new OfflineStream(p);\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    // You'd better call it manually if it is not used anymore\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(SpokenLanguageIdentificationConfig config);\n\n    private native long createStream(long ptr);\n\n    private native String compute(long ptr, long streamPtr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpokenLanguageIdentificationConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SpokenLanguageIdentificationConfig {\n    private final SpokenLanguageIdentificationWhisperConfig whisper;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private SpokenLanguageIdentificationConfig(Builder builder) {\n        this.whisper = builder.whisper;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public SpokenLanguageIdentificationWhisperConfig getWhisper() {\n        return whisper;\n    }\n\n    public static class Builder {\n        private SpokenLanguageIdentificationWhisperConfig whisper = SpokenLanguageIdentificationWhisperConfig.builder().build();\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public SpokenLanguageIdentificationConfig build() {\n            return new SpokenLanguageIdentificationConfig(this);\n        }\n\n        public Builder setWhisper(SpokenLanguageIdentificationWhisperConfig whisper) {\n            this.whisper = whisper;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/SpokenLanguageIdentificationWhisperConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class SpokenLanguageIdentificationWhisperConfig {\n    private final String encoder;\n    private final String decoder;\n    private final int tailPaddings;\n\n    private SpokenLanguageIdentificationWhisperConfig(Builder builder) {\n        this.encoder = builder.encoder;\n        this.decoder = builder.decoder;\n        this.tailPaddings = builder.tailPaddings;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getEncoder() {\n        return encoder;\n    }\n\n    public String getDecoder() {\n        return decoder;\n    }\n\n    public int getTailPaddings() {\n        return tailPaddings;\n    }\n\n    public static class Builder {\n        private String encoder = \"\";\n        private String decoder = \"\";\n        private int tailPaddings = 1000; // number of frames to pad\n\n        public SpokenLanguageIdentificationWhisperConfig build() {\n            return new SpokenLanguageIdentificationWhisperConfig(this);\n        }\n\n        public Builder setEncoder(String encoder) {\n            this.encoder = encoder;\n            return this;\n        }\n\n        public Builder setDecoder(String decoder) {\n            this.decoder = decoder;\n            return this;\n        }\n\n        public Builder setTailPaddings(int tailPaddings) {\n            this.tailPaddings = tailPaddings;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/TenVadModelConfig.java",
    "content": "// Copyright 2025 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class TenVadModelConfig {\n    private final String model;\n    private final float threshold;\n    private final float minSilenceDuration;\n    private final float minSpeechDuration;\n    private final int windowSize;\n    private final float maxSpeechDuration;\n\n    private TenVadModelConfig(Builder builder) {\n        this.model = builder.model;\n        this.threshold = builder.threshold;\n        this.minSilenceDuration = builder.minSilenceDuration;\n        this.minSpeechDuration = builder.minSpeechDuration;\n        this.windowSize = builder.windowSize;\n        this.maxSpeechDuration = builder.maxSpeechDuration;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public String getModel() {\n        return model;\n    }\n\n    public float getThreshold() {\n        return threshold;\n    }\n\n    public float getMinSilenceDuration() {\n        return minSilenceDuration;\n    }\n\n    public float getMinSpeechDuration() {\n        return minSpeechDuration;\n    }\n\n    public int getWindowSize() {\n        return windowSize;\n    }\n\n    public float getMaxSpeechDuration() {\n        return maxSpeechDuration;\n    }\n\n    public static class Builder {\n        private String model = \"\";\n        private float threshold = 0.5f;\n        private float minSilenceDuration = 0.25f;\n        private float minSpeechDuration = 0.25f;\n        private int windowSize = 256;\n        private float maxSpeechDuration = 5.0f;\n\n        public TenVadModelConfig build() {\n            return new TenVadModelConfig(this);\n        }\n\n\n        public Builder setModel(String model) {\n            this.model = model;\n            return this;\n        }\n\n        public Builder setThreshold(float threshold) {\n            this.threshold = threshold;\n            return this;\n        }\n\n        public Builder 
setMinSilenceDuration(float minSilenceDuration) {\n            this.minSilenceDuration = minSilenceDuration;\n            return this;\n        }\n\n        public Builder setMinSpeechDuration(float minSpeechDuration) {\n            this.minSpeechDuration = minSpeechDuration;\n            return this;\n        }\n\n        public Builder setWindowSize(int windowSize) {\n            this.windowSize = windowSize;\n            return this;\n        }\n\n        public Builder setMaxSpeechDuration(float maxSpeechDuration) {\n            this.maxSpeechDuration = maxSpeechDuration;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/Vad.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class Vad {\n    private long ptr = 0;\n\n    public Vad(VadModelConfig config) {\n        LibraryLoader.maybeLoad();\n        ptr = newFromFile(config);\n        if (ptr == 0) {\n            throw new IllegalArgumentException(\"Invalid VadModelConfig: failed to create native Vad\");\n        }\n    }\n\n    @Override\n    protected void finalize() throws Throwable {\n        release();\n    }\n\n    public void release() {\n        if (this.ptr == 0) {\n            return;\n        }\n        delete(this.ptr);\n        this.ptr = 0;\n    }\n\n    public void acceptWaveform(float[] samples) {\n        acceptWaveform(this.ptr, samples);\n    }\n\n    public float compute(float[] samples) {\n        return compute(this.ptr, samples);\n    }\n\n    public boolean empty() {\n        return empty(this.ptr);\n    }\n\n    public void pop() {\n        pop(this.ptr);\n    }\n\n    public void clear() {\n        clear(this.ptr);\n    }\n\n    public void reset() {\n        reset(this.ptr);\n    }\n\n    public void flush() {\n        flush(this.ptr);\n    }\n\n    public SpeechSegment front() {\n        return front(this.ptr);\n    }\n\n    public boolean isSpeechDetected() {\n        return isSpeechDetected(this.ptr);\n    }\n\n    private native void delete(long ptr);\n\n    private native long newFromFile(VadModelConfig config);\n\n    private native void acceptWaveform(long ptr, float[] samples);\n\n    private native float compute(long ptr, float[] samples);\n\n    private native boolean empty(long ptr);\n\n    private native void pop(long ptr);\n\n    private native void clear(long ptr);\n\n    private native SpeechSegment front(long ptr);\n\n    private native boolean isSpeechDetected(long ptr);\n\n    private native void reset(long ptr);\n\n    private native void flush(long ptr);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/VadModelConfig.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class VadModelConfig {\n    private final SileroVadModelConfig sileroVadModelConfig;\n    private final TenVadModelConfig tenVadModelConfig;\n    private final int sampleRate;\n    private final int numThreads;\n    private final boolean debug;\n    private final String provider;\n\n    private VadModelConfig(Builder builder) {\n        this.sileroVadModelConfig = builder.sileroVadModelConfig;\n        this.tenVadModelConfig = builder.tenVadModelConfig;\n        this.sampleRate = builder.sampleRate;\n        this.numThreads = builder.numThreads;\n        this.debug = builder.debug;\n        this.provider = builder.provider;\n    }\n\n    public static Builder builder() {\n        return new Builder();\n    }\n\n    public SileroVadModelConfig getSileroVadModelConfig() {\n        return sileroVadModelConfig;\n    }\n\n    public TenVadModelConfig getTenVadModelConfig() {\n        return tenVadModelConfig;\n    }\n\n    public int getSampleRate() {\n        return sampleRate;\n    }\n\n    public int getNumThreads() {\n        return numThreads;\n    }\n\n    public String getProvider() {\n        return provider;\n    }\n\n    public boolean getDebug() {\n        return debug;\n    }\n\n    public static class Builder {\n        private SileroVadModelConfig sileroVadModelConfig = new SileroVadModelConfig.Builder().build();\n        private TenVadModelConfig tenVadModelConfig = new TenVadModelConfig.Builder().build();\n        private int sampleRate = 16000;\n        private int numThreads = 1;\n        private boolean debug = true;\n        private String provider = \"cpu\";\n\n        public VadModelConfig build() {\n            return new VadModelConfig(this);\n        }\n\n        public Builder setSileroVadModelConfig(SileroVadModelConfig sileroVadModelConfig) {\n            this.sileroVadModelConfig = sileroVadModelConfig;\n            return this;\n        }\n\n    
    public Builder setTenVadModelConfig(TenVadModelConfig tenVadModelConfig) {\n            this.tenVadModelConfig = tenVadModelConfig;\n            return this;\n        }\n\n        public Builder setSampleRate(int sampleRate) {\n            this.sampleRate = sampleRate;\n            return this;\n        }\n\n        public Builder setNumThreads(int numThreads) {\n            this.numThreads = numThreads;\n            return this;\n        }\n\n        public Builder setDebug(boolean debug) {\n            this.debug = debug;\n            return this;\n        }\n\n        public Builder setProvider(String provider) {\n            this.provider = provider;\n            return this;\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/VersionInfo.java",
    "content": "package com.k2fsa.sherpa.onnx;\n\npublic class VersionInfo {\n\n    public static String getVersion() {\n        LibraryLoader.maybeLoad();\n        return getVersionStr2();\n    }\n\n    public static String getGitSha1() {\n        LibraryLoader.maybeLoad();\n        return getGitSha12();\n    }\n\n    public static String getGitDate() {\n        LibraryLoader.maybeLoad();\n        return getGitDate2();\n    }\n\n    private static native String getVersionStr2();\n    private static native String getGitSha12();\n    private static native String getGitDate2();\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/WaveData.java",
    "content": "// Copyright (c) 2026 Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx;\n\nimport java.util.Arrays;\n\npublic class WaveData {\n    private final float[] samples;\n    private final int sampleRate;\n\n    public WaveData(float[] samples, int sampleRate) {\n        this.samples = samples;\n        this.sampleRate = sampleRate;\n    }\n\n    public float[] getSamples() {\n        return samples;\n    }\n\n    public int getSampleRate() {\n        return sampleRate;\n    }\n\n    @Override\n    public boolean equals(Object obj) {\n        if (this == obj) return true;\n        if (obj == null || getClass() != obj.getClass()) return false;\n        WaveData other = (WaveData) obj;\n        return sampleRate == other.sampleRate && Arrays.equals(samples, other.samples);\n    }\n\n    @Override\n    public int hashCode() {\n        int result = Arrays.hashCode(samples);\n        result = 31 * result + sampleRate;\n        return result;\n    }\n}\n\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/WaveReader.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class WaveReader {\n    private WaveData data;\n\n    // It supports only single channel, 16-bit wave file.\n    // It will exit the program if the given file has a wrong format\n    public WaveReader(String filename) {\n        LibraryLoader.maybeLoad();\n        this.data = readWaveFromFile(filename);\n    }\n\n    public int getSampleRate() {\n        return this.data.getSampleRate();\n    }\n\n    public float[] getSamples() {\n        return this.data.getSamples();\n    }\n\n    private native WaveData readWaveFromFile(String filename);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/java/com/k2fsa/sherpa/onnx/WaveWriter.java",
    "content": "// Copyright 2024 Xiaomi Corporation\n\npackage com.k2fsa.sherpa.onnx;\n\npublic class WaveWriter {\n    public WaveWriter() {\n    }\n\n    public static boolean write(String filename, float[] samples, int sampleRate) {\n        WaveWriter w = new WaveWriter();\n        return w.writeWaveToFile(filename, samples, sampleRate);\n    }\n\n    private native boolean writeWaveToFile(String filename, float[] samples, int sampleRate);\n}\n"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/resources/.gitignore",
    "content": "lib/\nnative/"
  },
  {
    "path": "sherpa-onnx/java-api/src/main/resources/readme.md",
    "content": "please downlaod file and put in folder\n[donwload link](https://huggingface.co/csukuangfj2/sherpa-onnx-libs/tree/main/jni)\n\n- sherpa-onnx-v1.12.7-linux-aarch64-jni.tar.bz2\n- sherpa-onnx-v1.12.7-linux-x64-jni.tar.bz2\n- sherpa-onnx-v1.12.7-osx-arm64-jni.tar.bz2\n- sherpa-onnx-v1.12.7-osx-x86_64-jni.tar.bz2\n- sherpa-onnx-v1.12.7-win-x64-jni.tar.bz2\n\n\n- linux_arm64\n- linux_x64\n- darwin_arm64\n- darwin_x64\n- windows_x64\n\n\nadd to src/main/resources\n\n```\n.\n├── native\n│   ├── linux-aarch64\n│   │   ├── libsherpa-onnx-jni.so\n│   ├── linux-x64\n│   │   ├── libsherpa-onnx-jni.so\n│   ├── osx-aarch64\n│   │   ├── libsherpa-onnx-jni.dylib\n│   ├── osx-x64\n│   │   ├── libsherpa-onnx-jni.dylib\n│   ├── win-x64\n│   │   ├── sherpa-onnx-jni.dll\n```\n\n"
  },
  {
    "path": "sherpa-onnx/jni/CMakeLists.txt",
    "content": "include_directories(${PROJECT_SOURCE_DIR})\n\nif(NOT DEFINED ANDROID_ABI)\n  if(NOT DEFINED ENV{JAVA_HOME})\n    message(FATAL_ERROR \"Please set the environment variable JAVA_HOME\")\n  endif()\n  include_directories($ENV{JAVA_HOME}/include)\n  include_directories($ENV{JAVA_HOME}/include/linux)\n  include_directories($ENV{JAVA_HOME}/include/darwin)\n  include_directories($ENV{JAVA_HOME}/include/win32)\nendif()\n\nset(sources\n  audio-tagging.cc\n  common.cc\n  jni.cc\n  keyword-spotter.cc\n  offline-punctuation.cc\n  offline-recognizer.cc\n  offline-speech-denoiser.cc\n  offline-stream.cc\n  online-speech-denoiser.cc\n  online-punctuation.cc\n  online-recognizer.cc\n  online-stream.cc\n  speaker-embedding-extractor.cc\n  speaker-embedding-manager.cc\n  speech-denoiser.cc\n  spoken-language-identification.cc\n  version.cc\n  voice-activity-detector.cc\n  wave-reader.cc\n  wave-writer.cc\n)\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  list(APPEND sources\n    offline-tts.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  list(APPEND sources\n    offline-speaker-diarization.cc\n  )\nendif()\n\nadd_library(sherpa-onnx-jni SHARED ${sources})\n\ntarget_compile_definitions(sherpa-onnx-jni PRIVATE SHERPA_ONNX_BUILD_SHARED_LIBS=1)\ntarget_compile_definitions(sherpa-onnx-jni PRIVATE SHERPA_ONNX_BUILD_MAIN_LIB=1)\n\nif(ANDROID OR (UNIX AND NOT APPLE))\n  set_target_properties(sherpa-onnx-jni PROPERTIES\n    LINK_FLAGS \"-Wl,--version-script=${CMAKE_CURRENT_SOURCE_DIR}/sherpa-onnx-symbols.lds\"\n  )\nelseif(APPLE)\n  set_target_properties(sherpa-onnx-jni PROPERTIES\n    LINK_FLAGS \"-Wl,-exported_symbols_list,${CMAKE_CURRENT_SOURCE_DIR}/sherpa-onnx-symbols.exp\"\n  )\nendif()\n\ntarget_link_libraries(sherpa-onnx-jni sherpa-onnx-core)\ninstall(TARGETS sherpa-onnx-jni DESTINATION lib)\n"
  },
  {
    "path": "sherpa-onnx/jni/audio-tagging.cc",
    "content": "// sherpa-onnx/jni/audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic AudioTaggingConfig GetAudioTaggingConfig(JNIEnv *env, jobject config,\n                                                bool *ok) {\n  AudioTaggingConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n\n  jfieldID fid = env->GetFieldID(\n      cls, \"model\", \"Lcom/k2fsa/sherpa/onnx/AudioTaggingModelConfig;\");\n  jobject model = env->GetObjectField(config, fid);\n  jclass model_cls = env->GetObjectClass(model);\n\n  fid = env->GetFieldID(\n      model_cls, \"zipformer\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineZipformerAudioTaggingModelConfig;\");\n  jobject zipformer = env->GetObjectField(model, fid);\n  jclass zipformer_cls = env->GetObjectClass(zipformer);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipformer.model, model, zipformer_cls,\n                              zipformer);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.ced, ced, model_cls, model);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model.num_threads, numThreads, model_cls, model);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model.debug, debug, model_cls, model);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.provider, provider, model_cls, model);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.labels, labels, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.top_k, topK, cls, config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_AudioTagging_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  
bool ok = false;\n  auto config = sherpa_onnx::GetAudioTaggingConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"audio tagging newFromAsset config:\\n%s\",\n                   config.ToString().c_str());\n\n  auto tagger = new sherpa_onnx::AudioTagging(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)tagger;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_AudioTagging_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n\n  auto config = sherpa_onnx::GetAudioTaggingConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"audio tagging newFromFile config:\\n%s\",\n                   config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto tagger = new sherpa_onnx::AudioTagging(config);\n\n  return (jlong)tagger;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_AudioTagging_delete(\n    JNIEnv *env, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::AudioTagging *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_AudioTagging_createStream(\n    JNIEnv *env, jobject /*obj*/, jlong ptr) {\n  auto tagger = reinterpret_cast<sherpa_onnx::AudioTagging *>(ptr);\n  std::unique_ptr<sherpa_onnx::OfflineStream> s = tagger->CreateStream();\n\n  // The user is responsible to free the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OfflineStream_delete() from\n  // ./offline-stream.cc\n  sherpa_onnx::OfflineStream *p = s.release();\n  return (jlong)p;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobjectArray JNICALL Java_com_k2fsa_sherpa_onnx_AudioTagging_compute(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlong 
streamPtr, jint top_k) {\n  auto tagger = reinterpret_cast<sherpa_onnx::AudioTagging *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(streamPtr);\n  std::vector<sherpa_onnx::AudioEvent> events = tagger->Compute(stream, top_k);\n\n  // Find the AudioEvent class\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/AudioEvent\");\n  if (cls == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to find class com/k2fsa/sherpa/onnx/AudioEvent\");\n    return nullptr;\n  }\n\n  // Get the constructor: AudioEvent(String name, int index, float prob)\n  jmethodID ctor = env->GetMethodID(cls, \"<init>\", \"(Ljava/lang/String;IF)V\");\n  if (ctor == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to get AudioEvent constructor\");\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  // Create a jobjectArray of AudioEvent\n  jobjectArray obj_arr = env->NewObjectArray(events.size(), cls, nullptr);\n\n  for (size_t i = 0; i < events.size(); ++i) {\n    const auto &e = events[i];\n\n    jstring name = env->NewStringUTF(e.name.c_str());\n    jobject event_obj = env->NewObject(cls, ctor, name, e.index, e.prob);\n\n    env->SetObjectArrayElement(obj_arr, i, event_obj);\n\n    env->DeleteLocalRef(name);\n    env->DeleteLocalRef(event_obj);\n  }\n\n  env->DeleteLocalRef(cls);\n\n  return obj_arr;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/common.cc",
    "content": "// sherpa-onnx/jni/common.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/jni/common.h\"\n\n#include <stdlib.h>\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\n/* For qnn to load libQnnHtpVxxSkel.so, e.g., libQnnHtpV81Skel.so file\n\nhttps://workbench.aihub.qualcomm.com/docs/hub/faq.html#why-am-i-seeing-error-1008-when-trying-to-use-htp\n */\n#if defined(_WIN32)\nvoid PrependAdspLibraryPath(const std::string &new_path) {\n  SHERPA_ONNX_LOGE(\"This function is not for Windows. Ignore it\");\n}\n#else\nvoid PrependAdspLibraryPath(const std::string &new_path) {\n  const char *old_path = getenv(\"ADSP_LIBRARY_PATH\");\n  std::string updated_path;\n\n  if (old_path && !std::string(old_path).empty()) {\n    // Caution(fangjun):\n    // 1. Must use ; here, not :\n    // 2. Must use prepend, not append\n    updated_path = new_path + \";\" + std::string(old_path);\n  } else {\n    updated_path = new_path;  // no old path\n  }\n\n  if (setenv(\"ADSP_LIBRARY_PATH\", updated_path.c_str(), 1) != 0) {\n    SHERPA_ONNX_LOGE(\"Failed to set ADSP_LIBRARY_PATH to '%s'\",\n                     updated_path.c_str());\n  } else {\n    SHERPA_ONNX_LOGE(\"Successfully set ADSP_LIBRARY_PATH to '%s'\",\n                     updated_path.c_str());\n  }\n  /*\nYou will see something like the following:\n\nSuccessfully set ADSP_LIBRARY_PATH to\n'/data/app/~~pHS2-9SwVjl9ma3cIKtj-g==/com.k2fsa.sherpa.onnx.simulate.streaming.asr-ejCDb8LodsnyK5cr3SvGjA==/lib/arm64;/odm/lib/rfsa/adsp;/vendor/lib/rfsa/adsp/;/system/lib/rfsa/adsp;/system/vendor/lib/rfsa/adsp;/dsp'\n\n   */\n}\n#endif\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/jni/common.h",
    "content": "// sherpa-onnx/jni/common.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_JNI_COMMON_H_\n#define SHERPA_ONNX_JNI_COMMON_H_\n\n#include <string>\n\n#if __ANDROID_API__ >= 9\n#include <sstream>\n\n#include \"android/asset_manager.h\"\n#include \"android/asset_manager_jni.h\"\n#endif\n\n#if defined(_WIN32)\n#if defined(SHERPA_ONNX_BUILD_SHARED_LIBS)\n#define SHERPA_ONNX_EXPORT __declspec(dllexport)\n#define SHERPA_ONNX_IMPORT __declspec(dllimport)\n#else\n#define SHERPA_ONNX_EXPORT\n#define SHERPA_ONNX_IMPORT\n#endif\n#else  // WIN32\n#define SHERPA_ONNX_EXPORT __attribute__((visibility(\"default\")))\n\n#define SHERPA_ONNX_IMPORT SHERPA_ONNX_EXPORT\n#endif  // WIN32\n\n#if defined(SHERPA_ONNX_BUILD_MAIN_LIB)\n#define SHERPA_ONNX_API SHERPA_ONNX_EXPORT\n#else\n#define SHERPA_ONNX_API SHERPA_ONNX_IMPORT\n#endif\n\n// If you use ndk, you can find \"jni.h\" inside\n// android-ndk/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/include\n#include \"jni.h\"  // NOLINT\n\n#define SHERPA_ONNX_EXTERN_C extern \"C\" SHERPA_ONNX_API\n\n#define SHERPA_ONNX_JNI_READ_STRING(cpp_field, kotlin_field, cls, config)     \\\n  do {                                                                        \\\n    jfieldID fid = env->GetFieldID(cls, #kotlin_field, \"Ljava/lang/String;\"); \\\n    if (fid == nullptr || env->ExceptionCheck()) {                            \\\n      SHERPA_ONNX_LOGE(\"Failed to get field ID for '%s'\", #kotlin_field);     \\\n      SHERPA_ONNX_LOGE(                                                       \\\n          \"Please check that your kotlin code matches the library file \"      \\\n          \"libsherpa-onnx-jni.so . 
If you are not sure, always use the \"      \\\n          \"LATEST code and the latest library\");                              \\\n      if (env->ExceptionCheck()) {                                            \\\n        env->ExceptionDescribe();                                             \\\n        env->ExceptionClear();                                                \\\n      }                                                                       \\\n      jclass exClass = env->FindClass(\"java/lang/RuntimeException\");          \\\n      if (exClass) {                                                          \\\n        env->ThrowNew(exClass, \"Failed to get field ID for \" #kotlin_field);  \\\n        env->DeleteLocalRef(exClass);                                         \\\n      }                                                                       \\\n      return ans;                                                             \\\n    }                                                                         \\\n    jstring s = (jstring)env->GetObjectField(config, fid);                    \\\n    if (s != nullptr) {                                                       \\\n      const char *p = env->GetStringUTFChars(s, nullptr);                     \\\n      cpp_field = p;                                                          \\\n      env->ReleaseStringUTFChars(s, p);                                       \\\n      env->DeleteLocalRef(s);                                                 \\\n    }                                                                         \\\n  } while (0)\n\n#define SHERPA_ONNX_JNI_READ_FLOAT(cpp_field, kotlin_field, cls, config)     \\\n  do {                                                                       \\\n    jfieldID fid = env->GetFieldID(cls, #kotlin_field, \"F\");                 \\\n    if (fid == nullptr || env->ExceptionCheck()) {                           \\\n      SHERPA_ONNX_LOGE(\"Failed to get field 
ID for '%s'\", #kotlin_field);    \\\n      SHERPA_ONNX_LOGE(                                                      \\\n          \"Please check that your kotlin code matches the library file \"     \\\n          \"libsherpa-onnx-jni.so . If you are not sure, always use the \"     \\\n          \"LATEST code and the latest library\");                             \\\n      if (env->ExceptionCheck()) {                                           \\\n        env->ExceptionDescribe();                                            \\\n        env->ExceptionClear();                                               \\\n      }                                                                      \\\n      jclass exClass = env->FindClass(\"java/lang/RuntimeException\");         \\\n      if (exClass) {                                                         \\\n        env->ThrowNew(exClass, \"Failed to get field ID for \" #kotlin_field); \\\n        env->DeleteLocalRef(exClass);                                        \\\n      }                                                                      \\\n      return ans;                                                            \\\n    }                                                                        \\\n    cpp_field = env->GetFloatField(config, fid);                             \\\n  } while (0)\n\n#define SHERPA_ONNX_JNI_READ_INT(cpp_field, kotlin_field, cls, config)       \\\n  do {                                                                       \\\n    jfieldID fid = env->GetFieldID(cls, #kotlin_field, \"I\");                 \\\n    if (fid == nullptr || env->ExceptionCheck()) {                           \\\n      SHERPA_ONNX_LOGE(\"Failed to get field ID for '%s'\", #kotlin_field);    \\\n      SHERPA_ONNX_LOGE(                                                      \\\n          \"Please check that your kotlin code matches the library file \"     \\\n          \"libsherpa-onnx-jni.so . 
If you are not sure, always use the \"     \\\n          \"LATEST code and the latest library\");                             \\\n      if (env->ExceptionCheck()) {                                           \\\n        env->ExceptionDescribe();                                            \\\n        env->ExceptionClear();                                               \\\n      }                                                                      \\\n      jclass exClass = env->FindClass(\"java/lang/RuntimeException\");         \\\n      if (exClass) {                                                         \\\n        env->ThrowNew(exClass, \"Failed to get field ID for \" #kotlin_field); \\\n        env->DeleteLocalRef(exClass);                                        \\\n      }                                                                      \\\n      return ans;                                                            \\\n    }                                                                        \\\n    cpp_field = env->GetIntField(config, fid);                               \\\n  } while (0)\n\n#define SHERPA_ONNX_JNI_READ_BOOL(cpp_field, kotlin_field, cls, config)      \\\n  do {                                                                       \\\n    jfieldID fid = env->GetFieldID(cls, #kotlin_field, \"Z\");                 \\\n    if (fid == nullptr || env->ExceptionCheck()) {                           \\\n      SHERPA_ONNX_LOGE(\"Failed to get field ID for '%s'\", #kotlin_field);    \\\n      SHERPA_ONNX_LOGE(                                                      \\\n          \"Please check that your kotlin code matches the library file \"     \\\n          \"libsherpa-onnx-jni.so . 
If you are not sure, always use the \"     \\\n          \"LATEST code and the latest library\");                             \\\n      if (env->ExceptionCheck()) {                                           \\\n        env->ExceptionDescribe();                                            \\\n        env->ExceptionClear();                                               \\\n      }                                                                      \\\n      jclass exClass = env->FindClass(\"java/lang/RuntimeException\");         \\\n      if (exClass) {                                                         \\\n        env->ThrowNew(exClass, \"Failed to get field ID for \" #kotlin_field); \\\n        env->DeleteLocalRef(exClass);                                        \\\n      }                                                                      \\\n      return ans;                                                            \\\n    }                                                                        \\\n    cpp_field = env->GetBooleanField(config, fid);                           \\\n  } while (0)\n\n// defined in jni.cc\njobject NewInteger(JNIEnv *env, int32_t value);\njobject NewFloat(JNIEnv *env, float value);\n\n// Template function for non-void return types\ntemplate <typename Func, typename ReturnType>\nReturnType SafeJNI(JNIEnv *env, const char *functionName, Func func,\n                   ReturnType defaultValue) {\n  try {\n    return func();\n  } catch (const std::exception &e) {\n    jclass exClass = env->FindClass(\"java/lang/RuntimeException\");\n    if (exClass != nullptr) {\n      std::string errorMessage = std::string(functionName) + \": \" + e.what();\n      env->ThrowNew(exClass, errorMessage.c_str());\n      env->DeleteLocalRef(exClass);\n    }\n  } catch (...) 
{\n    jclass exClass = env->FindClass(\"java/lang/RuntimeException\");\n    if (exClass != nullptr) {\n      std::string errorMessage = std::string(functionName) +\n                                 \": Native exception: caught unknown exception\";\n      env->ThrowNew(exClass, errorMessage.c_str());\n      env->DeleteLocalRef(exClass);\n    }\n  }\n  return defaultValue;\n}\n\n// Specialization for void return type\ntemplate <typename Func>\nvoid SafeJNI(JNIEnv *env, const char *functionName, Func func) {\n  try {\n    func();\n  } catch (const std::exception &e) {\n    jclass exClass = env->FindClass(\"java/lang/RuntimeException\");\n    if (exClass != nullptr) {\n      std::string errorMessage = std::string(functionName) + \": \" + e.what();\n      env->ThrowNew(exClass, errorMessage.c_str());\n      env->DeleteLocalRef(exClass);\n    }\n  } catch (...) {\n    jclass exClass = env->FindClass(\"java/lang/RuntimeException\");\n    if (exClass != nullptr) {\n      std::string errorMessage = std::string(functionName) +\n                                 \": Native exception: caught unknown exception\";\n      env->ThrowNew(exClass, errorMessage.c_str());\n      env->DeleteLocalRef(exClass);\n    }\n  }\n}\n\n// Helper function to validate JNI pointers\ninline bool ValidatePointer(JNIEnv *env, jlong ptr, const char *functionName,\n                            const char *message) {\n  if (ptr == 0) {\n    jclass exClass = env->FindClass(\"java/lang/NullPointerException\");\n    if (exClass != nullptr) {\n      std::string errorMessage = std::string(functionName) + \": \" + message;\n      env->ThrowNew(exClass, errorMessage.c_str());\n      env->DeleteLocalRef(exClass);\n    }\n    return false;\n  }\n  return true;\n}\n\nnamespace sherpa_onnx {\nvoid PrependAdspLibraryPath(const std::string &new_path);\n}\n\n#endif  // SHERPA_ONNX_JNI_COMMON_H_\n"
  },
  {
    "path": "sherpa-onnx/jni/generate.sh",
    "content": "#!/usr/bin/env bash\nset -ex\n\nnm -g ../../build/lib/libsherpa-onnx-jni.dylib | awk '$2==\"T\" && $3 ~ /^_Java_com_k2fsa/ {print $3}' | sort  > ./sherpa-onnx-symbols.exp\n\n"
  },
  {
    "path": "sherpa-onnx/jni/jni.cc",
    "content": "// sherpa-onnx/jni/jni.cc\n//\n// Copyright (c)  2022-2023  Xiaomi Corporation\n//                2022       Pingfeng Luo\n//                2023       Zhaoming\n\n#include <fstream>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/onnx-utils.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\n// see\n// https://stackoverflow.com/questions/29043872/android-jni-return-multiple-variables\njobject NewInteger(JNIEnv *env, int32_t value) {\n  jclass cls = env->FindClass(\"java/lang/Integer\");\n  jmethodID constructor = env->GetMethodID(cls, \"<init>\", \"(I)V\");\n  jobject obj = env->NewObject(cls, constructor, value);\n  env->DeleteLocalRef(cls);\n  return obj;\n}\n\njobject NewFloat(JNIEnv *env, float value) {\n  jclass cls = env->FindClass(\"java/lang/Float\");\n  jmethodID constructor = env->GetMethodID(cls, \"<init>\", \"(F)V\");\n  jobject obj = env->NewObject(cls, constructor, value);\n  env->DeleteLocalRef(cls);\n  return obj;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/keyword-spotter.cc",
    "content": "// sherpa-onnx/jni/keyword-spotter.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nOnlineModelConfig GetOnlineModelConfig(JNIEnv *env, jclass model_config_cls,\n                                       jobject model_config, bool *ok);\n\nstatic KeywordSpotterConfig GetKwsConfig(JNIEnv *env, jobject config,\n                                         bool *ok) {\n  KeywordSpotterConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  // https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/types.html\n  // https://courses.cs.washington.edu/courses/cse341/99wi/java/tutorial/native1.1/implementing/field.html\n\n  //---------- decoding ----------\n  SHERPA_ONNX_JNI_READ_INT(ans.max_active_paths, maxActivePaths, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.keywords_file, keywordsFile, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.keywords_score, keywordsScore, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.keywords_threshold, keywordsThreshold, cls,\n                             config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_trailing_blanks, numTrailingBlanks, cls,\n                           config);\n\n  //---------- feat config ----------\n  fid = env->GetFieldID(cls, \"featConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/FeatureConfig;\");\n  jobject feat_config = env->GetObjectField(config, fid);\n  jclass feat_config_cls = env->GetObjectClass(feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.sampling_rate, sampleRate,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.feature_dim, featureDim,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.feat_config.dither, dither, feat_config_cls,\n                   
          feat_config);\n\n  //---------- model config ----------\n  fid = env->GetFieldID(cls, \"modelConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineModelConfig;\");\n  jobject model_config = env->GetObjectField(config, fid);\n  jclass model_config_cls = env->GetObjectClass(model_config);\n  ans.model_config =\n      GetOnlineModelConfig(env, model_config_cls, model_config, ok);\n\n  if (!*ok) {\n    return ans;\n  }\n\n  // *ok = false;\n  // If there are more fields, remember to set *ok to false\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config = sherpa_onnx::GetKwsConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto kws = new sherpa_onnx::KeywordSpotter(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)kws;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetKwsConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto kws = new sherpa_onnx::KeywordSpotter(config);\n\n  return (jlong)kws;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL 
Java_com_k2fsa_sherpa_onnx_KeywordSpotter_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_decode(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto kws = reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  kws->DecodeStream(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_reset(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto kws = reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  kws->Reset(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_createStream(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring keywords) {\n  auto kws = reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n\n  const char *p = env->GetStringUTFChars(keywords, nullptr);\n  std::unique_ptr<sherpa_onnx::OnlineStream> stream;\n\n  if (strlen(p) == 0) {\n    stream = kws->CreateStream();\n  } else {\n    stream = kws->CreateStream(p);\n  }\n\n  env->ReleaseStringUTFChars(keywords, p);\n\n  // The user is responsible to free the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OfflineStream_delete() from\n  // ./offline-stream.cc\n  sherpa_onnx::OnlineStream *ans = stream.release();\n  return (jlong)ans;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_isReady(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto kws = reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  return 
kws->IsReady(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_KeywordSpotter_getResult(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto kws = reinterpret_cast<sherpa_onnx::KeywordSpotter *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  sherpa_onnx::KeywordResult result = kws->GetResult(stream);\n\n  jstring j_keyword = env->NewStringUTF(result.keyword.c_str());\n\n  // Convert tokens (std::vector<std::string> -> String[])\n  jclass string_cls = env->FindClass(\"java/lang/String\");\n  if (string_cls == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to find class java/lang/String\");\n    env->DeleteLocalRef(j_keyword);\n    return nullptr;\n  }\n\n  jobjectArray j_tokens =\n      env->NewObjectArray(result.tokens.size(), string_cls, nullptr);\n\n  for (size_t i = 0; i < result.tokens.size(); ++i) {\n    jstring t = env->NewStringUTF(result.tokens[i].c_str());\n    env->SetObjectArrayElement(j_tokens, i, t);\n    env->DeleteLocalRef(t);\n  }\n\n  // Convert timestamps (std::vector<float> -> float[])\n  jfloatArray j_timestamps = env->NewFloatArray(result.timestamps.size());\n  env->SetFloatArrayRegion(j_timestamps, 0, result.timestamps.size(),\n                           result.timestamps.data());\n\n  // Find KeywordSpotterResult class\n  jclass result_cls =\n      env->FindClass(\"com/k2fsa/sherpa/onnx/KeywordSpotterResult\");\n\n  if (result_cls == nullptr) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to find class com/k2fsa/sherpa/onnx/KeywordSpotterResult\");\n    env->DeleteLocalRef(j_keyword);\n    env->DeleteLocalRef(j_tokens);\n    env->DeleteLocalRef(j_timestamps);\n    env->DeleteLocalRef(string_cls);\n    return nullptr;\n  }\n\n  jmethodID ctor = env->GetMethodID(\n      result_cls, \"<init>\", \"(Ljava/lang/String;[Ljava/lang/String;[F)V\");\n\n  if (ctor == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to get KeywordSpotterResult constructor\");\n    
env->DeleteLocalRef(j_keyword);\n    env->DeleteLocalRef(j_tokens);\n    env->DeleteLocalRef(j_timestamps);\n    env->DeleteLocalRef(result_cls);\n    env->DeleteLocalRef(string_cls);\n    return nullptr;\n  }\n\n  // Create the KeywordSpotterResult object\n  jobject result_obj =\n      env->NewObject(result_cls, ctor, j_keyword, j_tokens, j_timestamps);\n\n  env->DeleteLocalRef(j_keyword);\n  env->DeleteLocalRef(j_tokens);\n  env->DeleteLocalRef(j_timestamps);\n  env->DeleteLocalRef(result_cls);\n  env->DeleteLocalRef(string_cls);\n\n  return result_obj;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-punctuation.cc",
    "content": "// sherpa-onnx/jni/offline-punctuation.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic OfflinePunctuationConfig GetOfflinePunctuationConfig(JNIEnv *env,\n                                                            jobject config,\n                                                            bool *ok) {\n  OfflinePunctuationConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  fid = env->GetFieldID(\n      cls, \"model\", \"Lcom/k2fsa/sherpa/onnx/OfflinePunctuationModelConfig;\");\n  jobject model_config = env->GetObjectField(config, fid);\n  jclass model_config_cls = env->GetObjectClass(model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.ct_transformer, ctTransformer,\n                              model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model.num_threads, numThreads, model_config_cls,\n                           model_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model.debug, debug, model_config_cls,\n                            model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.provider, provider, model_config_cls,\n                              model_config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflinePunctuation_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflinePunctuationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    
return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto model = new sherpa_onnx::OfflinePunctuation(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflinePunctuation_newFromFile(JNIEnv *env,\n                                                          jobject /*obj*/,\n                                                          jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflinePunctuationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto model = new sherpa_onnx::OfflinePunctuation(config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflinePunctuation_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflinePunctuation *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflinePunctuation_addPunctuation(JNIEnv *env,\n                                                             jobject /*obj*/,\n                                                             jlong ptr,\n                                                             jstring text) {\n  auto punct = reinterpret_cast<const sherpa_onnx::OfflinePunctuation *>(ptr);\n\n  const char *ptext = env->GetStringUTFChars(text, nullptr);\n\n  std::string result = punct->AddPunctuation(ptext);\n\n  env->ReleaseStringUTFChars(text, ptext);\n\n  return env->NewStringUTF(result.c_str());\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-recognizer.cc",
    "content": "// sherpa-onnx/jni/offline-recognizer.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n\n#include <stdlib.h>\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic OfflineRecognizerConfig GetOfflineConfig(JNIEnv *env, jobject config,\n                                                bool *ok) {\n  OfflineRecognizerConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.decoding_method, decodingMethod, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.max_active_paths, maxActivePaths, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hotwords_file, hotwordsFile, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.hotwords_score, hotwordsScore, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fsts, ruleFsts, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fars, ruleFars, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.blank_penalty, blankPenalty, cls, config);\n\n  fid = env->GetFieldID(cls, \"featConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/FeatureConfig;\");\n  jobject feat_config = env->GetObjectField(config, fid);\n  jclass feat_config_cls = env->GetObjectClass(feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.sampling_rate, sampleRate,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.feature_dim, featureDim,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.feat_config.dither, dither, feat_config_cls,\n                             feat_config);\n\n  fid = env->GetFieldID(cls, \"modelConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineModelConfig;\");\n  jobject model_config = env->GetObjectField(config, fid);\n  jclass model_config_cls = 
env->GetObjectClass(model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.tokens, tokens, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model_config.num_threads, numThreads,\n                           model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.debug, debug, model_config_cls,\n                            model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.provider, provider,\n                              model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.model_type, modelType,\n                              model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.modeling_unit, modelingUnit,\n                              model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.bpe_vocab, bpeVocab,\n                              model_config_cls, model_config);\n\n  fid = env->GetFieldID(model_config_cls, \"transducer\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTransducerModelConfig;\");\n  jobject transducer_config = env->GetObjectField(model_config, fid);\n  jclass transducer_config_cls = env->GetObjectClass(transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.transducer.encoder_filename,\n                              encoder, transducer_config_cls,\n                              transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.transducer.decoder_filename,\n                              decoder, transducer_config_cls,\n                              transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.transducer.joiner_filename,\n                              joiner, transducer_config_cls, transducer_config);\n\n  fid = env->GetFieldID(model_config_cls, \"paraformer\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineParaformerModelConfig;\");\n  jobject paraformer_config = 
env->GetObjectField(model_config, fid);\n  jclass paraformer_config_cls = env->GetObjectClass(paraformer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.paraformer.model, model,\n                              paraformer_config_cls, paraformer_config);\n\n  fid = env->GetFieldID(paraformer_config_cls, \"qnnConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/QnnConfig;\");\n  jobject qnn_config = env->GetObjectField(paraformer_config, fid);\n  jclass qnn_config_cls = env->GetObjectClass(qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.paraformer.qnn_config.backend_lib, backendLib,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.paraformer.qnn_config.context_binary, contextBinary,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.paraformer.qnn_config.system_lib,\n                              systemLib, qnn_config_cls, qnn_config);\n\n  fid = env->GetFieldID(model_config_cls, \"whisper\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineWhisperModelConfig;\");\n  jobject whisper_config = env->GetObjectField(model_config, fid);\n  jclass whisper_config_cls = env->GetObjectClass(whisper_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.whisper.encoder, encoder,\n                              whisper_config_cls, whisper_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.whisper.decoder, decoder,\n                              whisper_config_cls, whisper_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.whisper.language, language,\n                              whisper_config_cls, whisper_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.whisper.task, task,\n                              whisper_config_cls, whisper_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model_config.whisper.tail_paddings, tailPaddings,\n                           whisper_config_cls, whisper_config);\n\n  
SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.whisper.enable_token_timestamps,\n                            enableTokenTimestamps, whisper_config_cls,\n                            whisper_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.whisper.enable_segment_timestamps,\n                            enableSegmentTimestamps, whisper_config_cls,\n                            whisper_config);\n\n  fid = env->GetFieldID(model_config_cls, \"fireRedAsr\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineFireRedAsrModelConfig;\");\n  jobject fire_red_asr_config = env->GetObjectField(model_config, fid);\n  jclass fire_red_asr_config_cls = env->GetObjectClass(fire_red_asr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.fire_red_asr.encoder, encoder,\n                              fire_red_asr_config_cls, fire_red_asr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.fire_red_asr.decoder, decoder,\n                              fire_red_asr_config_cls, fire_red_asr_config);\n\n  // moonshine\n  fid = env->GetFieldID(model_config_cls, \"moonshine\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineMoonshineModelConfig;\");\n  jobject moonshine_config = env->GetObjectField(model_config, fid);\n  jclass moonshine_config_cls = env->GetObjectClass(moonshine_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.moonshine.preprocessor,\n                              preprocessor, moonshine_config_cls,\n                              moonshine_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.moonshine.encoder, encoder,\n                              moonshine_config_cls, moonshine_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.moonshine.uncached_decoder,\n                              uncachedDecoder, moonshine_config_cls,\n                              moonshine_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.moonshine.cached_decoder,\n                              cachedDecoder, moonshine_config_cls,\n    
                          moonshine_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.moonshine.merged_decoder,\n                              mergedDecoder, moonshine_config_cls,\n                              moonshine_config);\n\n  fid = env->GetFieldID(model_config_cls, \"senseVoice\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineSenseVoiceModelConfig;\");\n  jobject sense_voice_config = env->GetObjectField(model_config, fid);\n  jclass sense_voice_config_cls = env->GetObjectClass(sense_voice_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.sense_voice.model, model,\n                              sense_voice_config_cls, sense_voice_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.sense_voice.language, language,\n                              sense_voice_config_cls, sense_voice_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.sense_voice.use_itn,\n                            useInverseTextNormalization, sense_voice_config_cls,\n                            sense_voice_config);\n\n  fid = env->GetFieldID(sense_voice_config_cls, \"qnnConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/QnnConfig;\");\n  qnn_config = env->GetObjectField(sense_voice_config, fid);\n  qnn_config_cls = env->GetObjectClass(qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.sense_voice.qnn_config.backend_lib, backendLib,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.sense_voice.qnn_config.context_binary, contextBinary,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.sense_voice.qnn_config.system_lib, systemLib,\n      qnn_config_cls, qnn_config);\n\n  // nemo\n  fid = env->GetFieldID(\n      model_config_cls, \"nemo\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineNemoEncDecCtcModelConfig;\");\n  jobject nemo_config = env->GetObjectField(model_config, fid);\n  jclass nemo_config_cls = env->GetObjectClass(nemo_config);\n\n  
SHERPA_ONNX_JNI_READ_STRING(ans.model_config.nemo_ctc.model, model,\n                              nemo_config_cls, nemo_config);\n\n  // zipformer ctc\n  fid =\n      env->GetFieldID(model_config_cls, \"zipformerCtc\",\n                      \"Lcom/k2fsa/sherpa/onnx/OfflineZipformerCtcModelConfig;\");\n  jobject zipformer_ctc_config = env->GetObjectField(model_config, fid);\n  jclass zipformer_ctc_config_cls = env->GetObjectClass(zipformer_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.zipformer_ctc.model, model,\n                              zipformer_ctc_config_cls, zipformer_ctc_config);\n\n  fid = env->GetFieldID(zipformer_ctc_config_cls, \"qnnConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/QnnConfig;\");\n\n  qnn_config = env->GetObjectField(zipformer_ctc_config, fid);\n  qnn_config_cls = env->GetObjectClass(qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.zipformer_ctc.qnn_config.backend_lib, backendLib,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.zipformer_ctc.qnn_config.context_binary, contextBinary,\n      qnn_config_cls, qnn_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(\n      ans.model_config.zipformer_ctc.qnn_config.system_lib, systemLib,\n      qnn_config_cls, qnn_config);\n\n  // wenet ctc\n  fid = env->GetFieldID(model_config_cls, \"wenetCtc\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineWenetCtcModelConfig;\");\n  jobject wenet_ctc_config = env->GetObjectField(model_config, fid);\n  jclass wenet_ctc_config_cls = env->GetObjectClass(wenet_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.wenet_ctc.model, model,\n                              wenet_ctc_config_cls, wenet_ctc_config);\n\n  // omnilingual asr ctc\n  fid = env->GetFieldID(\n      model_config_cls, \"omnilingual\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineOmnilingualAsrCtcModelConfig;\");\n  jobject omnilingual_ctc_config = env->GetObjectField(model_config, fid);\n  
jclass omnilingual_ctc_config_cls =\n      env->GetObjectClass(omnilingual_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.omnilingual.model, model,\n                              omnilingual_ctc_config_cls,\n                              omnilingual_ctc_config);\n\n  // medasr ctc\n  fid = env->GetFieldID(model_config_cls, \"medasr\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineMedAsrCtcModelConfig;\");\n  jobject medasr_ctc_config = env->GetObjectField(model_config, fid);\n  jclass medasr_ctc_config_cls = env->GetObjectClass(medasr_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.medasr.model, model,\n                              medasr_ctc_config_cls, medasr_ctc_config);\n\n  // FunASR Nano\n  fid = env->GetFieldID(model_config_cls, \"funasrNano\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineFunAsrNanoModelConfig;\");\n  jobject funasr_nano_config = env->GetObjectField(model_config, fid);\n  jclass funasr_nano_config_cls = env->GetObjectClass(funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.encoder_adaptor,\n                              encoderAdaptor, funasr_nano_config_cls,\n                              funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.llm, llm,\n                              funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.embedding, embedding,\n                              funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.tokenizer, tokenizer,\n                              funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.system_prompt,\n                              systemPrompt, funasr_nano_config_cls,\n                              funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.user_prompt,\n                   
           userPrompt, funasr_nano_config_cls,\n                              funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.language, language,\n                              funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.funasr_nano.itn, itn,\n                            funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.funasr_nano.hotwords, hotwords,\n                              funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model_config.funasr_nano.max_new_tokens,\n                           maxNewTokens, funasr_nano_config_cls,\n                           funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model_config.funasr_nano.temperature,\n                             temperature, funasr_nano_config_cls,\n                             funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model_config.funasr_nano.top_p, topP,\n                             funasr_nano_config_cls, funasr_nano_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model_config.funasr_nano.seed, seed,\n                           funasr_nano_config_cls, funasr_nano_config);\n\n  // fire red asr ctc\n  fid = env->GetFieldID(\n      model_config_cls, \"fireRedAsrCtc\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineFireRedAsrCtcModelConfig;\");\n  jobject fire_red_asr_ctc_config = env->GetObjectField(model_config, fid);\n  jclass fire_red_asr_ctc_config_cls =\n      env->GetObjectClass(fire_red_asr_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.fire_red_asr_ctc.model, model,\n                              fire_red_asr_ctc_config_cls,\n                              fire_red_asr_ctc_config);\n\n  // canary\n  fid = env->GetFieldID(model_config_cls, \"canary\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineCanaryModelConfig;\");\n  jobject canary_config = env->GetObjectField(model_config, fid);\n  jclass 
canary_config_cls = env->GetObjectClass(canary_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.canary.encoder, encoder,\n                              canary_config_cls, canary_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.canary.decoder, decoder,\n                              canary_config_cls, canary_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.canary.src_lang, srcLang,\n                              canary_config_cls, canary_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.canary.tgt_lang, tgtLang,\n                              canary_config_cls, canary_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model_config.canary.use_pnc, usePnc,\n                            canary_config_cls, canary_config);\n\n  fid = env->GetFieldID(model_config_cls, \"dolphin\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineDolphinModelConfig;\");\n  jobject dolphin_config = env->GetObjectField(model_config, fid);\n  jclass dolphin_config_cls = env->GetObjectClass(dolphin_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.dolphin.model, model,\n                              dolphin_config_cls, dolphin_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_config.telespeech_ctc, teleSpeech,\n                              model_config_cls, model_config);\n\n  // homophone replacer config\n  fid = env->GetFieldID(cls, \"hr\",\n                        \"Lcom/k2fsa/sherpa/onnx/HomophoneReplacerConfig;\");\n  jobject hr_config = env->GetObjectField(config, fid);\n  jclass hr_config_cls = env->GetObjectClass(hr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hr.lexicon, lexicon, hr_config_cls,\n                              hr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hr.rule_fsts, ruleFsts, hr_config_cls,\n                              hr_config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_newFromAsset(JNIEnv 
*env,\n                                                          jobject /*obj*/,\n                                                          jobject asset_manager,\n                                                          jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflineConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  if (config.model_config.debug) {\n#if __ANDROID_API__\n    // logcat truncates long strings, so we split the string into chunks\n    auto str_vec = sherpa_onnx::SplitString(config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n#endif\n  }\n\n  auto model = new sherpa_onnx::OfflineRecognizer(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_newFromFile(JNIEnv *env,\n                                                         jobject /*obj*/,\n                                                         jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflineConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  if (config.model_config.debug) {\n#if __ANDROID_API__\n    auto str_vec = sherpa_onnx::SplitString(config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n#endif\n  }\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    
return 0;\n  }\n\n  auto model = new sherpa_onnx::OfflineRecognizer(config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_setConfig(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflineConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return;\n  }\n\n  if (config.model_config.debug) {\n    SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n  }\n\n  auto recognizer = reinterpret_cast<sherpa_onnx::OfflineRecognizer *>(ptr);\n  recognizer->SetConfig(config);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflineRecognizer *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_createStream(JNIEnv * /*env*/,\n                                                          jobject /*obj*/,\n                                                          jlong ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OfflineRecognizer *>(ptr);\n  std::unique_ptr<sherpa_onnx::OfflineStream> s = recognizer->CreateStream();\n\n  // The user is responsible for freeing the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OfflineStream_delete() from\n  // ./offline-stream.cc\n  sherpa_onnx::OfflineStream *p = s.release();\n  return (jlong)p;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_decode(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  SafeJNI(env, \"OfflineRecognizer_decode\", [&] {\n    if (!ValidatePointer(env, ptr, \"OfflineRecognizer_decode\",\n                         \"OfflineRecognizer pointer is null.\") ||\n        !ValidatePointer(env, stream_ptr, 
\"OfflineRecognizer_decode\",\n                         \"OfflineStream pointer is null.\")) {\n      return;\n    }\n\n    auto recognizer = reinterpret_cast<sherpa_onnx::OfflineRecognizer *>(ptr);\n    auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(stream_ptr);\n    recognizer->DecodeStream(stream);\n  });\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_decodeStreams(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlongArray stream_ptrs) {\n  SafeJNI(env, \"OfflineRecognizer_decode_streams\", [&] {\n    if (!ValidatePointer(env, ptr, \"OfflineRecognizer_decode_streams\",\n                         \"OfflineRecognizer pointer is null.\")) {\n      return;\n    }\n\n    auto recognizer = reinterpret_cast<sherpa_onnx::OfflineRecognizer *>(ptr);\n\n    jlong *p = env->GetLongArrayElements(stream_ptrs, nullptr);\n    jsize n = env->GetArrayLength(stream_ptrs);\n\n    auto ss = reinterpret_cast<sherpa_onnx::OfflineStream **>(p);\n    recognizer->DecodeStreams(ss, n);\n\n    env->ReleaseLongArrayElements(stream_ptrs, p, JNI_ABORT);\n  });\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_getResult(JNIEnv *env,\n                                                       jobject /*obj*/,\n                                                       jlong streamPtr) {\n  auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(streamPtr);\n  sherpa_onnx::OfflineRecognitionResult result = stream->GetResult();\n\n  // 2. 
Find the Java class and constructor\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/OfflineRecognizerResult\");\n  if (cls == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to find class OfflineRecognizerResult\");\n    return nullptr;\n  }\n  jmethodID ctor =\n      env->GetMethodID(cls, \"<init>\",\n                       \"(Ljava/lang/String;[Ljava/lang/String;[FLjava/lang/\"\n                       \"String;Ljava/lang/String;Ljava/lang/String;[F)V\");\n  jstring jtext = env->NewStringUTF(result.text.c_str());\n\n  jclass string_cls = env->FindClass(\"java/lang/String\");\n  jobjectArray jtokens = env->NewObjectArray(\n      result.tokens.size(), string_cls, nullptr);\n  env->DeleteLocalRef(string_cls);\n\n  for (size_t i = 0; i < result.tokens.size(); ++i) {\n    jstring token_str = env->NewStringUTF(result.tokens[i].c_str());\n    env->SetObjectArrayElement(jtokens, i, token_str);\n    env->DeleteLocalRef(token_str);\n  }\n\n  jfloatArray jtimestamps = env->NewFloatArray(result.timestamps.size());\n  env->SetFloatArrayRegion(jtimestamps, 0, result.timestamps.size(),\n                           result.timestamps.data());\n\n  jstring jlang = env->NewStringUTF(result.lang.c_str());\n  jstring jemotion = env->NewStringUTF(result.emotion.c_str());\n  jstring jevent = env->NewStringUTF(result.event.c_str());\n\n  jfloatArray jdurations = env->NewFloatArray(result.durations.size());\n  env->SetFloatArrayRegion(jdurations, 0, result.durations.size(),\n                           result.durations.data());\n\n  jobject jresult = env->NewObject(cls, ctor, jtext, jtokens, jtimestamps,\n                                   jlang, jemotion, jevent, jdurations);\n\n  env->DeleteLocalRef(jtext);\n  env->DeleteLocalRef(jtokens);\n  env->DeleteLocalRef(jtimestamps);\n  env->DeleteLocalRef(jlang);\n  env->DeleteLocalRef(jemotion);\n  env->DeleteLocalRef(jevent);\n  env->DeleteLocalRef(jdurations);\n  env->DeleteLocalRef(cls);\n\n  return jresult;  // returned object is 
safe\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineRecognizer_prependAdspLibraryPath(\n    JNIEnv *env, jclass /*cls*/, jstring new_path) {\n  const char *p = env->GetStringUTFChars(new_path, nullptr);\n  sherpa_onnx::PrependAdspLibraryPath(p);\n\n  env->ReleaseStringUTFChars(new_path, p);\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-speaker-diarization.cc",
    "content": "// sherpa-onnx/jni/offline-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic OfflineSpeakerDiarizationConfig GetOfflineSpeakerDiarizationConfig(\n    JNIEnv *env, jobject config, bool *ok) {\n  OfflineSpeakerDiarizationConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  //---------- segmentation ----------\n  fid = env->GetFieldID(\n      cls, \"segmentation\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineSpeakerSegmentationModelConfig;\");\n  jobject segmentation_config = env->GetObjectField(config, fid);\n  jclass segmentation_config_cls = env->GetObjectClass(segmentation_config);\n\n  fid = env->GetFieldID(\n      segmentation_config_cls, \"pyannote\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineSpeakerSegmentationPyannoteModelConfig;\");\n  jobject pyannote_config = env->GetObjectField(segmentation_config, fid);\n  jclass pyannote_config_cls = env->GetObjectClass(pyannote_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.segmentation.pyannote.model, model,\n                              pyannote_config_cls, pyannote_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.segmentation.num_threads, numThreads,\n                           segmentation_config_cls, segmentation_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.segmentation.debug, debug,\n                            segmentation_config_cls, segmentation_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.segmentation.provider, provider,\n                              segmentation_config_cls, segmentation_config);\n\n  //---------- embedding ----------\n  fid = env->GetFieldID(\n      cls, \"embedding\",\n      \"Lcom/k2fsa/sherpa/onnx/SpeakerEmbeddingExtractorConfig;\");\n  jobject embedding_config = env->GetObjectField(config, fid);\n  jclass embedding_config_cls = 
env->GetObjectClass(embedding_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.embedding.model, model, embedding_config_cls,\n                              embedding_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.embedding.num_threads, numThreads,\n                           embedding_config_cls, embedding_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.embedding.debug, debug, embedding_config_cls,\n                            embedding_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.embedding.provider, provider,\n                              embedding_config_cls, embedding_config);\n\n  fid = env->GetFieldID(cls, \"clustering\",\n                        \"Lcom/k2fsa/sherpa/onnx/FastClusteringConfig;\");\n  jobject clustering_config = env->GetObjectField(config, fid);\n  jclass clustering_config_cls = env->GetObjectClass(clustering_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.clustering.num_clusters, numClusters,\n                           clustering_config_cls, clustering_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.clustering.threshold, threshold,\n                             clustering_config_cls, clustering_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.min_duration_on, minDurationOn, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.min_duration_off, minDurationOff, cls, config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetOfflineSpeakerDiarizationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  
SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto sd = new sherpa_onnx::OfflineSpeakerDiarization(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)sd;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetOfflineSpeakerDiarizationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto sd = new sherpa_onnx::OfflineSpeakerDiarization(config);\n\n  return (jlong)sd;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_setConfig(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jobject _config) {\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetOfflineSpeakerDiarizationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto sd = reinterpret_cast<sherpa_onnx::OfflineSpeakerDiarization *>(ptr);\n  sd->SetConfig(config);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_delete(JNIEnv * /*env*/,\n                                                            jobject /*obj*/,\n                                                            jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflineSpeakerDiarization *>(ptr);\n}\n\nstatic jobjectArray ProcessImpl(\n    JNIEnv *env,\n    const std::vector<sherpa_onnx::OfflineSpeakerDiarizationSegment>\n        &segments) {\n  jclass cls =\n      
env->FindClass(\"com/k2fsa/sherpa/onnx/OfflineSpeakerDiarizationSegment\");\n  if (cls == nullptr) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to find class OfflineSpeakerDiarizationSegment\");\n    return nullptr;\n  }\n\n  jobjectArray obj_arr =\n      (jobjectArray)env->NewObjectArray(segments.size(), cls, nullptr);\n\n  jmethodID constructor = env->GetMethodID(cls, \"<init>\", \"(FFI)V\");\n  if (constructor == nullptr) {\n    SHERPA_ONNX_LOGE(\n        \"Failed to get OfflineSpeakerDiarizationSegment constructor\");\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  for (int32_t i = 0; i != static_cast<int32_t>(segments.size()); ++i) {\n    const auto &s = segments[i];\n    jobject segment =\n        env->NewObject(cls, constructor, s.Start(), s.End(), s.Speaker());\n    env->SetObjectArrayElement(obj_arr, i, segment);\n    env->DeleteLocalRef(segment);\n  }\n\n  env->DeleteLocalRef(cls);\n  return obj_arr;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobjectArray JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_process(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples) {\n  auto sd = reinterpret_cast<sherpa_onnx::OfflineSpeakerDiarization *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  auto segments = sd->Process(p, n).SortByStartTime();\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return ProcessImpl(env, segments);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobjectArray JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_processWithCallback(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples,\n    jobject callback, jlong arg) {\n  std::function<int32_t(int32_t, int32_t, void *)> callback_wrapper =\n      [env, callback](int32_t num_processed_chunks, int32_t num_total_chunks,\n                      void *data) -> int {\n    jclass cls = env->GetObjectClass(callback);\n\n    jmethodID mid = env->GetMethodID(cls, 
\"invoke\", \"(IIJ)Ljava/lang/Integer;\");\n    if (mid == nullptr) {\n      SHERPA_ONNX_LOGE(\"Failed to get the callback. Ignore it.\");\n      env->DeleteLocalRef(cls);\n      return 0;\n    }\n    env->DeleteLocalRef(cls);\n\n    jobject ret = env->CallObjectMethod(callback, mid, num_processed_chunks,\n                                        num_total_chunks, (jlong)data);\n    if (ret == nullptr) {\n      return 0;\n    }\n\n    jclass jklass = env->GetObjectClass(ret);\n    jmethodID int_value_mid = env->GetMethodID(jklass, \"intValue\", \"()I\");\n    int32_t result = env->CallIntMethod(ret, int_value_mid);\n    env->DeleteLocalRef(jklass);\n    env->DeleteLocalRef(ret);\n    return result;\n  };\n\n  auto sd = reinterpret_cast<sherpa_onnx::OfflineSpeakerDiarization *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  auto segments =\n      sd->Process(p, n, callback_wrapper, reinterpret_cast<void *>(arg))\n          .SortByStartTime();\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return ProcessImpl(env, segments);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_getSampleRate(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OfflineSpeakerDiarization *>(ptr)\n      ->SampleRate();\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-speech-denoiser.cc",
    "content": "// sherpa-onnx/jni/offline-speech-denoiser.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n#include \"sherpa-onnx/jni/common.h\"\n#include \"sherpa-onnx/jni/speech-denoiser.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflineSpeechDenoiserConfig(env, _config, &ok);\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto speech_denoiser = new sherpa_onnx::OfflineSpeechDenoiser(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)speech_denoiser;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_newFromFile(JNIEnv *env,\n                                                             jobject /*obj*/,\n                                                             jobject _config) {\n  return SafeJNI(\n      env, \"OfflineSpeechDenoiser_newFromFile\",\n      [&]() -> jlong {\n        bool ok = false;\n        auto config =\n            sherpa_onnx::GetOfflineSpeechDenoiserConfig(env, _config, &ok);\n\n        if (!ok) {\n          SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n          return 0;\n        }\n\n        SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n        if (!config.Validate()) {\n          SHERPA_ONNX_LOGE(\"Errors found in config!\");\n          return 0;\n  
      }\n\n        auto speech_denoiser = new sherpa_onnx::OfflineSpeechDenoiser(config);\n        return reinterpret_cast<jlong>(speech_denoiser);\n      },\n      (jlong)0);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflineSpeechDenoiser *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_getSampleRate(JNIEnv * /*env*/,\n                                                               jobject /*obj*/,\n                                                               jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OfflineSpeechDenoiser *>(ptr)\n      ->GetSampleRate();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_run(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples,\n    jint sample_rate) {\n  auto speech_denoiser =\n      reinterpret_cast<sherpa_onnx::OfflineSpeechDenoiser *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  auto denoised = speech_denoiser->Run(p, n, sample_rate);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return sherpa_onnx::NewDenoisedAudio(env, denoised);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_DenoisedAudio_saveImpl(\n    JNIEnv *env, jobject /*obj*/, jstring filename, jfloatArray samples,\n    jint sample_rate) {\n  const char *p_filename = env->GetStringUTFChars(filename, nullptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n\n  bool ok = sherpa_onnx::WriteWave(p_filename, sample_rate, p, n);\n\n  env->ReleaseStringUTFChars(filename, p_filename);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return ok;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-stream.cc",
    "content": "// sherpa-onnx/jni/offline-stream.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n\n#include \"sherpa-onnx/jni/common.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineStream_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflineStream *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineStream_acceptWaveform(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples,\n    jint sample_rate) {\n  auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  stream->AcceptWaveform(sample_rate, p, n);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OfflineStream_setOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key, jstring value) {\n  auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  const char *p_value = env->GetStringUTFChars(value, nullptr);\n  stream->SetOption(p_key, p_value);\n  env->ReleaseStringUTFChars(key, p_key);\n  env->ReleaseStringUTFChars(value, p_value);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL Java_com_k2fsa_sherpa_onnx_OfflineStream_getOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key) {\n  auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  const std::string &value = stream->GetOption(p_key);\n  env->ReleaseStringUTFChars(key, p_key);\n  return env->NewStringUTF(value.c_str());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_OfflineStream_hasOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key) {\n  
auto stream = reinterpret_cast<sherpa_onnx::OfflineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  jboolean result = stream->HasOption(p_key);\n  env->ReleaseStringUTFChars(key, p_key);\n  return result;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/offline-tts.cc",
    "content": "// sherpa-onnx/jni/offline-tts.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\n// ------------------ JNI Config Helpers ------------------\n\nstatic GenerationConfig GetGenerationConfig(JNIEnv *env, jobject config_obj) {\n  GenerationConfig ans;\n\n  if (!config_obj) {\n    SHERPA_ONNX_LOGE(\"GenerationConfig is null\");\n    return ans;\n  }\n\n  jclass cls = env->GetObjectClass(config_obj);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silence_scale, silenceScale, cls, config_obj);\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.speed, speed, cls, config_obj);\n  SHERPA_ONNX_JNI_READ_INT(ans.sid, sid, cls, config_obj);\n\n  // referenceAudio\n  jfieldID fid = env->GetFieldID(cls, \"referenceAudio\", \"[F\");\n  if (fid != nullptr) {\n    jfloatArray arr = (jfloatArray)env->GetObjectField(config_obj, fid);\n    if (arr != nullptr) {\n      jsize len = env->GetArrayLength(arr);\n      jfloat *elems = env->GetFloatArrayElements(arr, nullptr);\n      ans.reference_audio.assign(elems, elems + len);\n      env->ReleaseFloatArrayElements(arr, elems, JNI_ABORT);\n      env->DeleteLocalRef(arr);\n    }\n  }\n\n  SHERPA_ONNX_JNI_READ_INT(ans.reference_sample_rate, referenceSampleRate, cls,\n                           config_obj);\n\n  // referenceText\n  SHERPA_ONNX_JNI_READ_STRING(ans.reference_text, referenceText, cls,\n                              config_obj);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_steps, numSteps, cls, config_obj);\n\n  // extra Map<String, String>\n  fid = env->GetFieldID(cls, \"extra\", \"Ljava/util/Map;\");\n  if (fid != nullptr) {\n    jobject map_obj = env->GetObjectField(config_obj, fid);\n    if (map_obj != nullptr) {\n      jclass map_cls = 
env->GetObjectClass(map_obj);\n      jmethodID entrySet =\n          env->GetMethodID(map_cls, \"entrySet\", \"()Ljava/util/Set;\");\n      jobject entry_set = env->CallObjectMethod(map_obj, entrySet);\n\n      jclass set_cls = env->GetObjectClass(entry_set);\n      jmethodID iteratorMid =\n          env->GetMethodID(set_cls, \"iterator\", \"()Ljava/util/Iterator;\");\n      jobject iterator = env->CallObjectMethod(entry_set, iteratorMid);\n\n      jclass iter_cls = env->GetObjectClass(iterator);\n      jmethodID hasNextMid = env->GetMethodID(iter_cls, \"hasNext\", \"()Z\");\n      jmethodID nextMid =\n          env->GetMethodID(iter_cls, \"next\", \"()Ljava/lang/Object;\");\n\n      jclass entry_cls = env->FindClass(\"java/util/Map$Entry\");\n      jmethodID getKeyMid =\n          env->GetMethodID(entry_cls, \"getKey\", \"()Ljava/lang/Object;\");\n      jmethodID getValueMid =\n          env->GetMethodID(entry_cls, \"getValue\", \"()Ljava/lang/Object;\");\n\n      while (env->CallBooleanMethod(iterator, hasNextMid)) {\n        jobject entry = env->CallObjectMethod(iterator, nextMid);\n        if (!entry) {\n          continue;\n        }\n\n        jstring key = (jstring)env->CallObjectMethod(entry, getKeyMid);\n        jstring value = (jstring)env->CallObjectMethod(entry, getValueMid);\n\n        if (key != nullptr && value != nullptr) {\n          const char *keyChars = env->GetStringUTFChars(key, nullptr);\n          const char *valueChars = env->GetStringUTFChars(value, nullptr);\n          ans.extra[std::string(keyChars)] = std::string(valueChars);\n\n          env->ReleaseStringUTFChars(key, keyChars);\n          env->ReleaseStringUTFChars(value, valueChars);\n        }\n\n        env->DeleteLocalRef(key);\n        env->DeleteLocalRef(value);\n        env->DeleteLocalRef(entry);\n      }\n\n      env->DeleteLocalRef(entry_set);\n      env->DeleteLocalRef(iterator);\n      env->DeleteLocalRef(entry_cls);\n      env->DeleteLocalRef(iter_cls);\n      
env->DeleteLocalRef(set_cls);\n      env->DeleteLocalRef(map_cls);\n      env->DeleteLocalRef(map_obj);\n    }\n  }\n\n  env->DeleteLocalRef(cls);\n  return ans;\n}\n\nstatic OfflineTtsConfig GetOfflineTtsConfig(JNIEnv *env, jobject config,\n                                            bool *ok) {\n  OfflineTtsConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  fid = env->GetFieldID(cls, \"model\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsModelConfig;\");\n  jobject model = env->GetObjectField(config, fid);\n  jclass model_config_cls = env->GetObjectClass(model);\n\n  fid = env->GetFieldID(model_config_cls, \"vits\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsVitsModelConfig;\");\n  jobject vits = env->GetObjectField(model, fid);\n  jclass vits_cls = env->GetObjectClass(vits);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.vits.model, model, vits_cls, vits);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.vits.lexicon, lexicon, vits_cls, vits);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.vits.tokens, tokens, vits_cls, vits);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.vits.data_dir, dataDir, vits_cls, vits);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.vits.noise_scale, noiseScale, vits_cls,\n                             vits);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.vits.noise_scale_w, noiseScaleW,\n                             vits_cls, vits);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.vits.length_scale, lengthScale, vits_cls,\n                             vits);\n\n  // matcha\n  fid = env->GetFieldID(model_config_cls, \"matcha\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsMatchaModelConfig;\");\n  jobject matcha = env->GetObjectField(model, fid);\n  jclass matcha_cls = env->GetObjectClass(matcha);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.matcha.acoustic_model, acousticModel,\n                              matcha_cls, matcha);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.matcha.vocoder, vocoder, 
matcha_cls,\n                              matcha);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.matcha.lexicon, lexicon, matcha_cls,\n                              matcha);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.matcha.tokens, tokens, matcha_cls,\n                              matcha);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.matcha.data_dir, dataDir, matcha_cls,\n                              matcha);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.matcha.noise_scale, noiseScale,\n                             matcha_cls, matcha);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.matcha.length_scale, lengthScale,\n                             matcha_cls, matcha);\n\n  fid = env->GetFieldID(model_config_cls, \"kokoro\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsKokoroModelConfig;\");\n  jobject kokoro = env->GetObjectField(model, fid);\n  jclass kokoro_cls = env->GetObjectClass(kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.model, model, kokoro_cls,\n                              kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.voices, voices, kokoro_cls,\n                              kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.tokens, tokens, kokoro_cls,\n                              kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.lexicon, lexicon, kokoro_cls,\n                              kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.lang, lang, kokoro_cls, kokoro);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kokoro.data_dir, dataDir, kokoro_cls,\n                              kokoro);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.kokoro.length_scale, lengthScale,\n                             kokoro_cls, kokoro);\n\n  // zipvoice\n  fid = env->GetFieldID(\n      model_config_cls, \"zipvoice\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineTtsZipVoiceModelConfig;\");\n  jobject zipvoice = env->GetObjectField(model, fid);\n  jclass zipvoice_cls = env->GetObjectClass(zipvoice);\n\n  
SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.tokens, tokens, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.encoder, encoder, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.decoder, decoder, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.vocoder, vocoder, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.data_dir, dataDir, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.zipvoice.lexicon, lexicon, zipvoice_cls,\n                              zipvoice);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.zipvoice.feat_scale, featScale,\n                             zipvoice_cls, zipvoice);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.zipvoice.t_shift, tShift, zipvoice_cls,\n                             zipvoice);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.zipvoice.target_rms, targetRms,\n                             zipvoice_cls, zipvoice);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.zipvoice.guidance_scale, guidanceScale,\n                             zipvoice_cls, zipvoice);\n\n  // kitten\n  fid = env->GetFieldID(model_config_cls, \"kitten\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsKittenModelConfig;\");\n  jobject kitten = env->GetObjectField(model, fid);\n  jclass kitten_cls = env->GetObjectClass(kitten);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kitten.model, model, kitten_cls,\n                              kitten);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kitten.voices, voices, kitten_cls,\n                              kitten);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kitten.tokens, tokens, kitten_cls,\n                              kitten);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.kitten.data_dir, dataDir, kitten_cls,\n                              
kitten);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.model.kitten.length_scale, lengthScale,\n                             kitten_cls, kitten);\n\n  // pocket\n  fid = env->GetFieldID(model_config_cls, \"pocket\",\n                        \"Lcom/k2fsa/sherpa/onnx/OfflineTtsPocketModelConfig;\");\n  jobject pocket = env->GetObjectField(model, fid);\n  jclass pocket_cls = env->GetObjectClass(pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.lm_flow, lmFlow, pocket_cls,\n                              pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.lm_main, lmMain, pocket_cls,\n                              pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.encoder, encoder, pocket_cls,\n                              pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.decoder, decoder, pocket_cls,\n                              pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.text_conditioner,\n                              textConditioner, pocket_cls, pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.vocab_json, vocabJson,\n                              pocket_cls, pocket);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.pocket.token_scores_json,\n                              tokenScoresJson, pocket_cls, pocket);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model.pocket.voice_embedding_cache_capacity,\n                           voiceEmbeddingCacheCapacity, pocket_cls, pocket);\n\n  // supertonic\n  fid = env->GetFieldID(\n      model_config_cls, \"supertonic\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineTtsSupertonicModelConfig;\");\n  jobject supertonic = env->GetObjectField(model, fid);\n  jclass supertonic_cls = env->GetObjectClass(supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.duration_predictor,\n                              durationPredictor, supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.text_encoder, textEncoder,\n                              supertonic_cls, supertonic);\n\n  
SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.vector_estimator,\n                              vectorEstimator, supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.vocoder, vocoder,\n                              supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.tts_json, ttsJson,\n                              supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.unicode_indexer,\n                              unicodeIndexer, supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.supertonic.voice_style, voiceStyle,\n                              supertonic_cls, supertonic);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model.num_threads, numThreads, model_config_cls,\n                           model);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model.debug, debug, model_config_cls, model);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.provider, provider, model_config_cls,\n                              model);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fsts, ruleFsts, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fars, ruleFars, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.max_num_sentences, maxNumSentences, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silence_scale, silenceScale, cls, config);\n\n  env->DeleteLocalRef(model);\n  env->DeleteLocalRef(vits);\n  env->DeleteLocalRef(vits_cls);\n  env->DeleteLocalRef(matcha);\n  env->DeleteLocalRef(matcha_cls);\n  env->DeleteLocalRef(kokoro);\n  env->DeleteLocalRef(kokoro_cls);\n  env->DeleteLocalRef(zipvoice);\n  env->DeleteLocalRef(zipvoice_cls);\n  env->DeleteLocalRef(kitten);\n  env->DeleteLocalRef(kitten_cls);\n  env->DeleteLocalRef(pocket);\n  env->DeleteLocalRef(pocket_cls);\n  env->DeleteLocalRef(supertonic);\n  env->DeleteLocalRef(supertonic_cls);\n  env->DeleteLocalRef(model_config_cls);\n  env->DeleteLocalRef(cls);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\n// Convert audio 
samples and sample rate to a Java GeneratedAudio object\nstatic jobject CreateAudioObject(JNIEnv *env, const std::vector<float> &samples,\n                                 int32_t sample_rate) {\n  // Step 1: Create a jfloatArray for samples\n  jfloatArray samples_arr = env->NewFloatArray(samples.size());\n  env->SetFloatArrayRegion(samples_arr, 0, samples.size(), samples.data());\n\n  // Step 2: Find the GeneratedAudio class\n  jclass gen_audio_cls = env->FindClass(\"com/k2fsa/sherpa/onnx/GeneratedAudio\");\n  if (!gen_audio_cls) {\n    env->DeleteLocalRef(samples_arr);\n    return nullptr;\n  }\n\n  // Step 3: Get the constructor: GeneratedAudio(float[] samples, int\n  // sampleRate)\n  jmethodID ctor = env->GetMethodID(gen_audio_cls, \"<init>\", \"([FI)V\");\n  if (!ctor) {\n    env->DeleteLocalRef(samples_arr);\n    env->DeleteLocalRef(gen_audio_cls);\n    return nullptr;\n  }\n\n  // Step 4: Create the object\n  jobject gen_audio_obj =\n      env->NewObject(gen_audio_cls, ctor, samples_arr, sample_rate);\n\n  // Step 5: Clean up local refs\n  env->DeleteLocalRef(samples_arr);\n  env->DeleteLocalRef(gen_audio_cls);\n\n  return gen_audio_obj;\n}\n\nstatic int32_t CallCallback(JNIEnv *env, jobject callback,\n                            jfloatArray samples_arr) {\n  if (!callback) return 1;\n\n  jclass cls = env->GetObjectClass(callback);\n  if (env->ExceptionCheck()) {\n    env->DeleteLocalRef(cls);\n    return 1;\n  }\n\n  jmethodID invoke_mid =\n      env->GetMethodID(cls, \"invoke\", \"([F)Ljava/lang/Integer;\");\n  if (env->ExceptionCheck() || !invoke_mid) {\n    env->DeleteLocalRef(cls);\n    return 1;\n  }\n\n  jobject result = env->CallObjectMethod(callback, invoke_mid, samples_arr);\n  if (env->ExceptionCheck() || !result) {\n    env->DeleteLocalRef(cls);\n    return 1;\n  }\n\n  jclass integer_cls = env->GetObjectClass(result);\n  jmethodID int_val_mid = env->GetMethodID(integer_cls, \"intValue\", \"()I\");\n  jint ret = env->CallIntMethod(result, 
int_val_mid);\n\n  env->DeleteLocalRef(integer_cls);\n  env->DeleteLocalRef(result);\n  env->DeleteLocalRef(cls);\n\n  return ret;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config = sherpa_onnx::GetOfflineTtsConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  if (config.model.debug) {\n#if __ANDROID_API__\n    auto str_vec = sherpa_onnx::SplitString(config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n#endif\n  }\n\n  auto tts = new sherpa_onnx::OfflineTts(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return reinterpret_cast<jlong>(tts);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  return SafeJNI(\n      env, \"OfflineTts_newFromFile\",\n      [&]() -> jlong {\n        bool ok = false;\n        auto config = sherpa_onnx::GetOfflineTtsConfig(env, _config, &ok);\n\n        if (!ok) {\n          SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n          return 0;\n        }\n\n        SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n        if (!config.Validate()) {\n          SHERPA_ONNX_LOGE(\"Errors found in config!\");\n          return 0;\n        }\n\n        auto tts = new sherpa_onnx::OfflineTts(config);\n        return reinterpret_cast<jlong>(tts);\n      },\n      (jlong)0);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void 
JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_getSampleRate(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr)->SampleRate();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_getNumSpeakers(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr)->NumSpeakers();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_OfflineTts_generateImpl(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring text, jint sid,\n    jfloat speed) {\n  const char *p_text = env->GetStringUTFChars(text, nullptr);\n\n  auto audio = reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr)->Generate(\n      p_text, sid, speed);\n\n  env->ReleaseStringUTFChars(text, p_text);\n\n  return CreateAudioObject(env, audio.samples, audio.sample_rate);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineTts_generateWithCallbackImpl(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring text, jint sid,\n    jfloat speed, jobject callback) {\n  const char *p_text = env->GetStringUTFChars(text, nullptr);\n\n  auto tts = reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr);\n\n  sherpa_onnx::GeneratedAudio audio;\n\n  if (callback) {\n    std::function<int32_t(const float *, int32_t, float)> callback_wrapper =\n        [env, callback](const float *samples, int32_t n, float) -> int32_t {\n      jfloatArray samples_arr = env->NewFloatArray(n);\n      env->SetFloatArrayRegion(samples_arr, 0, n, samples);\n      int32_t ret = CallCallback(env, callback, samples_arr);\n      env->DeleteLocalRef(samples_arr);\n      return ret;\n    };\n\n    audio = tts->Generate(p_text, sid, speed, 
callback_wrapper);\n  } else {\n    audio = tts->Generate(p_text, sid, speed, nullptr);\n  }\n\n  env->ReleaseStringUTFChars(text, p_text);\n\n  return CreateAudioObject(env, audio.samples, audio.sample_rate);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_OfflineTts_generateWithConfigImpl(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring text, jobject _gen_config,\n    jobject callback) {\n  const char *p_text = env->GetStringUTFChars(text, nullptr);\n  auto gen_config = sherpa_onnx::GetGenerationConfig(env, _gen_config);\n  auto tts = reinterpret_cast<sherpa_onnx::OfflineTts *>(ptr);\n\n  sherpa_onnx::GeneratedAudio audio;\n\n  if (callback) {\n    std::function<int32_t(const float *, int32_t, float)> callback_wrapper =\n        [env, callback](const float *samples, int32_t n, float) -> int32_t {\n      jfloatArray samples_arr = env->NewFloatArray(n);\n      env->SetFloatArrayRegion(samples_arr, 0, n, samples);\n      int32_t ret = CallCallback(env, callback, samples_arr);\n      env->DeleteLocalRef(samples_arr);\n      return ret;\n    };\n\n    audio = tts->Generate(p_text, gen_config, callback_wrapper);\n  } else {\n    audio = tts->Generate(p_text, gen_config, nullptr);\n  }\n\n  env->ReleaseStringUTFChars(text, p_text);\n\n  return CreateAudioObject(env, audio.samples, audio.sample_rate);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_GeneratedAudio_saveImpl(\n    JNIEnv *env, jobject /*obj*/, jstring filename, jfloatArray samples,\n    jint sample_rate) {\n  const char *p_filename = env->GetStringUTFChars(filename, nullptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n\n  bool ok = sherpa_onnx::WriteWave(p_filename, sample_rate, p, n);\n\n  env->ReleaseStringUTFChars(filename, p_filename);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return ok;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/online-punctuation.cc",
    "content": "// sherpa-onnx/jni/online-punctuation.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic OnlinePunctuationConfig GetOnlinePunctuationConfig(JNIEnv *env,\n                                                          jobject config,\n                                                          bool *ok) {\n  OnlinePunctuationConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  fid = env->GetFieldID(cls, \"model\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlinePunctuationModelConfig;\");\n  jobject model_config = env->GetObjectField(config, fid);\n  jclass model_config_cls = env->GetObjectClass(model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.cnn_bilstm, cnnBilstm, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.bpe_vocab, bpeVocab, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.model.num_threads, numThreads, model_config_cls,\n                           model_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.model.debug, debug, model_config_cls,\n                            model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model.provider, provider, model_config_cls,\n                              model_config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlinePunctuation_newFromAsset(JNIEnv *env,\n                                                          jobject /*obj*/,\n                                                          jobject asset_manager,\n                                                          jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, 
asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config = sherpa_onnx::GetOnlinePunctuationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto model = new sherpa_onnx::OnlinePunctuation(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlinePunctuation_newFromFile(JNIEnv *env,\n                                                         jobject /*obj*/,\n                                                         jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetOnlinePunctuationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto model = new sherpa_onnx::OnlinePunctuation(config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlinePunctuation_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OnlinePunctuation *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlinePunctuation_addPunctuation(JNIEnv *env,\n                                                            jobject /*obj*/,\n                                                            jlong ptr,\n                                                            jstring text) {\n  auto punct = reinterpret_cast<const sherpa_onnx::OnlinePunctuation *>(ptr);\n\n  const char *ptext = env->GetStringUTFChars(text, nullptr);\n\n  
std::string result = punct->AddPunctuationWithCase(ptext);\n\n  env->ReleaseStringUTFChars(text, ptext);\n\n  return env->NewStringUTF(result.c_str());\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/online-recognizer.cc",
    "content": "// sherpa-onnx/jni/online-recognizer.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n\n#include <memory>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/csrc/text-utils.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nOnlineModelConfig GetOnlineModelConfig(JNIEnv *env, jclass model_config_cls,\n                                       jobject model_config, bool *ok) {\n  OnlineModelConfig ans;\n\n  auto fid =\n      env->GetFieldID(model_config_cls, \"transducer\",\n                      \"Lcom/k2fsa/sherpa/onnx/OnlineTransducerModelConfig;\");\n  jobject transducer_config = env->GetObjectField(model_config, fid);\n  jclass transducer_config_cls = env->GetObjectClass(transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.transducer.encoder, encoder,\n                              transducer_config_cls, transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.transducer.decoder, decoder,\n                              transducer_config_cls, transducer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.transducer.joiner, joiner,\n                              transducer_config_cls, transducer_config);\n\n  fid = env->GetFieldID(model_config_cls, \"paraformer\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineParaformerModelConfig;\");\n  jobject paraformer_config = env->GetObjectField(model_config, fid);\n  jclass paraformer_config_cls = env->GetObjectClass(paraformer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.paraformer.encoder, encoder,\n                              paraformer_config_cls, paraformer_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.paraformer.decoder, decoder,\n                              paraformer_config_cls, paraformer_config);\n\n  fid =\n      env->GetFieldID(model_config_cls, \"zipformer2Ctc\",\n                      \"Lcom/k2fsa/sherpa/onnx/OnlineZipformer2CtcModelConfig;\");\n  jobject zipformer2_ctc_config = 
env->GetObjectField(model_config, fid);\n  jclass zipformer2_ctc_config_cls = env->GetObjectClass(zipformer2_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.zipformer2_ctc.model, model,\n                              zipformer2_ctc_config_cls, zipformer2_ctc_config);\n\n  fid = env->GetFieldID(model_config_cls, \"neMoCtc\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineNeMoCtcModelConfig;\");\n  jobject nemo_ctc_config = env->GetObjectField(model_config, fid);\n  jclass nemo_ctc_config_cls = env->GetObjectClass(nemo_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.nemo_ctc.model, model, nemo_ctc_config_cls,\n                              nemo_ctc_config);\n\n  fid = env->GetFieldID(model_config_cls, \"toneCtc\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineToneCtcModelConfig;\");\n  jobject t_one_ctc_config = env->GetObjectField(model_config, fid);\n  jclass t_one_ctc_config_cls = env->GetObjectClass(t_one_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.t_one_ctc.model, model, t_one_ctc_config_cls,\n                              t_one_ctc_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.tokens, tokens, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_threads, numThreads, model_config_cls,\n                           model_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.debug, debug, model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.provider_config.provider, provider,\n                              model_config_cls, model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model_type, modelType, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.modeling_unit, modelingUnit, model_config_cls,\n                              model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.bpe_vocab, bpeVocab, model_config_cls,\n                              model_config);\n\n  *ok = true;\n  return ans;\n}\n\nstatic OnlineRecognizerConfig 
GetConfig(JNIEnv *env, jobject config, bool *ok) {\n  OnlineRecognizerConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  // https://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/types.html\n  // https://courses.cs.washington.edu/courses/cse341/99wi/java/tutorial/native1.1/implementing/field.html\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.decoding_method, decodingMethod, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.max_active_paths, maxActivePaths, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hotwords_file, hotwordsFile, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.hotwords_score, hotwordsScore, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fsts, ruleFsts, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.rule_fars, ruleFars, cls, config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.blank_penalty, blankPenalty, cls, config);\n\n  fid = env->GetFieldID(cls, \"featConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/FeatureConfig;\");\n  jobject feat_config = env->GetObjectField(config, fid);\n  jclass feat_config_cls = env->GetObjectClass(feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.sampling_rate, sampleRate,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.feat_config.feature_dim, featureDim,\n                           feat_config_cls, feat_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.feat_config.dither, dither, feat_config_cls,\n                             feat_config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.enable_endpoint, enableEndpoint, cls, config);\n\n  fid = env->GetFieldID(cls, \"endpointConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/EndpointConfig;\");\n  jobject endpoint_config = env->GetObjectField(config, fid);\n  jclass endpoint_config_cls = env->GetObjectClass(endpoint_config);\n\n  fid = env->GetFieldID(endpoint_config_cls, \"rule1\",\n                        \"Lcom/k2fsa/sherpa/onnx/EndpointRule;\");\n  jobject rule1 = 
env->GetObjectField(endpoint_config, fid);\n  jclass rule_class = env->GetObjectClass(rule1);\n\n  fid = env->GetFieldID(endpoint_config_cls, \"rule2\",\n                        \"Lcom/k2fsa/sherpa/onnx/EndpointRule;\");\n  jobject rule2 = env->GetObjectField(endpoint_config, fid);\n\n  fid = env->GetFieldID(endpoint_config_cls, \"rule3\",\n                        \"Lcom/k2fsa/sherpa/onnx/EndpointRule;\");\n  jobject rule3 = env->GetObjectField(endpoint_config, fid);\n\n  fid = env->GetFieldID(rule_class, \"mustContainNonSilence\", \"Z\");\n  ans.endpoint_config.rule1.must_contain_nonsilence =\n      env->GetBooleanField(rule1, fid);\n  ans.endpoint_config.rule2.must_contain_nonsilence =\n      env->GetBooleanField(rule2, fid);\n  ans.endpoint_config.rule3.must_contain_nonsilence =\n      env->GetBooleanField(rule3, fid);\n\n  fid = env->GetFieldID(rule_class, \"minTrailingSilence\", \"F\");\n  ans.endpoint_config.rule1.min_trailing_silence =\n      env->GetFloatField(rule1, fid);\n  ans.endpoint_config.rule2.min_trailing_silence =\n      env->GetFloatField(rule2, fid);\n  ans.endpoint_config.rule3.min_trailing_silence =\n      env->GetFloatField(rule3, fid);\n\n  fid = env->GetFieldID(rule_class, \"minUtteranceLength\", \"F\");\n  ans.endpoint_config.rule1.min_utterance_length =\n      env->GetFloatField(rule1, fid);\n  ans.endpoint_config.rule2.min_utterance_length =\n      env->GetFloatField(rule2, fid);\n  ans.endpoint_config.rule3.min_utterance_length =\n      env->GetFloatField(rule3, fid);\n\n  //---------- model config ----------\n  fid = env->GetFieldID(cls, \"modelConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineModelConfig;\");\n  jobject model_config = env->GetObjectField(config, fid);\n  jclass model_config_cls = env->GetObjectClass(model_config);\n\n  ans.model_config =\n      GetOnlineModelConfig(env, model_config_cls, model_config, ok);\n\n  if (!*ok) {\n    return ans;\n  }\n\n  *ok = false;\n\n  //---------- rnn lm model config 
----------\n  fid = env->GetFieldID(cls, \"lmConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineLMConfig;\");\n  jobject lm_model_config = env->GetObjectField(config, fid);\n  jclass lm_model_config_cls = env->GetObjectClass(lm_model_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.lm_config.model, model, lm_model_config_cls,\n                              lm_model_config);\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.lm_config.scale, scale, lm_model_config_cls,\n                             lm_model_config);\n\n  fid = env->GetFieldID(cls, \"ctcFstDecoderConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/OnlineCtcFstDecoderConfig;\");\n\n  jobject fst_decoder_config = env->GetObjectField(config, fid);\n  jclass fst_decoder_config_cls = env->GetObjectClass(fst_decoder_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.ctc_fst_decoder_config.graph, graph,\n                              fst_decoder_config_cls, fst_decoder_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.ctc_fst_decoder_config.max_active, maxActive,\n                           fst_decoder_config_cls, fst_decoder_config);\n\n  fid = env->GetFieldID(cls, \"hr\",\n                        \"Lcom/k2fsa/sherpa/onnx/HomophoneReplacerConfig;\");\n  jobject hr_config = env->GetObjectField(config, fid);\n  jclass hr_config_cls = env->GetObjectClass(hr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hr.lexicon, lexicon, hr_config_cls,\n                              hr_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.hr.rule_fsts, ruleFsts, hr_config_cls,\n                              hr_config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineRecognizer_newFromAsset(JNIEnv *env,\n                                                         jobject /*obj*/,\n                                                         jobject asset_manager,\n                                                         jobject _config) {\n#if 
__ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config = sherpa_onnx::GetConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  if (config.model_config.debug) {\n#if __ANDROID_API__\n    auto str_vec = sherpa_onnx::SplitString(config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n#endif\n  }\n\n  auto recognizer = new sherpa_onnx::OnlineRecognizer(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)recognizer;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  if (config.model_config.debug) {\n#if __ANDROID_API__\n    auto str_vec = sherpa_onnx::SplitString(config.ToString(), 128);\n    for (const auto &s : str_vec) {\n      SHERPA_ONNX_LOGE(\"%s\", s.c_str());\n    }\n#else\n    SHERPA_ONNX_LOGE(\"%s\", config.ToString().c_str());\n#endif\n  }\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto recognizer = new sherpa_onnx::OnlineRecognizer(config);\n\n  return (jlong)recognizer;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_reset(\n    
JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n  recognizer->Reset(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_isReady(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  return recognizer->IsReady(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineRecognizer_isEndpoint(JNIEnv * /*env*/,\n                                                       jobject /*obj*/,\n                                                       jlong ptr,\n                                                       jlong stream_ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  return recognizer->IsEndpoint(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_decode(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  recognizer->DecodeStream(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineRecognizer_decodeStreams(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlongArray stream_ptrs) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n\n  jlong *p = env->GetLongArrayElements(stream_ptrs, nullptr);\n  jsize n = env->GetArrayLength(stream_ptrs);\n\n  auto ss = reinterpret_cast<sherpa_onnx::OnlineStream **>(p);\n\n  
recognizer->DecodeStreams(ss, n);\n\n  env->ReleaseLongArrayElements(stream_ptrs, p, JNI_ABORT);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineRecognizer_createStream(JNIEnv *env,\n                                                         jobject /*obj*/,\n                                                         jlong ptr,\n                                                         jstring hotwords) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n\n  const char *p = env->GetStringUTFChars(hotwords, nullptr);\n  std::unique_ptr<sherpa_onnx::OnlineStream> stream;\n\n  if (strlen(p) == 0) {\n    stream = recognizer->CreateStream();\n  } else {\n    stream = recognizer->CreateStream(p);\n  }\n\n  env->ReleaseStringUTFChars(hotwords, p);\n\n  // The user is responsible to free the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OnlineStream_delete() from\n  // ./online-stream.cc\n  sherpa_onnx::OnlineStream *ans = stream.release();\n  return (jlong)ans;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_getResult(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jlong stream_ptr) {\n  auto recognizer = reinterpret_cast<sherpa_onnx::OnlineRecognizer *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  sherpa_onnx::OnlineRecognizerResult result = recognizer->GetResult(stream);\n\n  // Find the OnlineRecognizerResult class\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/OnlineRecognizerResult\");\n  if (cls == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to find class OnlineRecognizerResult\");\n    return nullptr;\n  }\n\n  // Find the constructor: (String, String[], float[], float[])V\n  jmethodID ctor = env->GetMethodID(\n      cls, \"<init>\", \"(Ljava/lang/String;[Ljava/lang/String;[F[F)V\");\n\n  // text\n  jstring text = env->NewStringUTF(result.text.c_str());\n\n  // tokens\n  jclass string_cls = 
env->FindClass(\"java/lang/String\");\n  jobjectArray tokens =\n      env->NewObjectArray(result.tokens.size(), string_cls, nullptr);\n  env->DeleteLocalRef(string_cls);\n  for (size_t i = 0; i < result.tokens.size(); ++i) {\n    jstring token_str = env->NewStringUTF(result.tokens[i].c_str());\n    env->SetObjectArrayElement(tokens, i, token_str);\n    env->DeleteLocalRef(token_str);\n  }\n\n  // timestamps\n  jfloatArray timestamps = env->NewFloatArray(result.timestamps.size());\n  env->SetFloatArrayRegion(timestamps, 0, result.timestamps.size(),\n                           result.timestamps.data());\n\n  // ys_probs\n  jfloatArray ys_probs = env->NewFloatArray(result.ys_probs.size());\n  env->SetFloatArrayRegion(ys_probs, 0, result.ys_probs.size(),\n                           result.ys_probs.data());\n\n  // Construct and return OnlineRecognizerResult\n  jobject obj = env->NewObject(cls, ctor, text, tokens, timestamps, ys_probs);\n\n  // Delete local references\n  env->DeleteLocalRef(text);\n  env->DeleteLocalRef(tokens);\n  env->DeleteLocalRef(timestamps);\n  env->DeleteLocalRef(ys_probs);\n  env->DeleteLocalRef(cls);\n\n  return obj;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/online-speech-denoiser.cc",
    "content": "// sherpa-onnx/jni/online-speech-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n#include \"sherpa-onnx/jni/speech-denoiser.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config = sherpa_onnx::GetOnlineSpeechDenoiserConfig(env, _config, &ok);\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto speech_denoiser = new sherpa_onnx::OnlineSpeechDenoiser(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return reinterpret_cast<jlong>(speech_denoiser);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_newFromFile(JNIEnv *env,\n                                                            jobject /*obj*/,\n                                                            jobject _config) {\n  return SafeJNI(\n      env, \"OnlineSpeechDenoiser_newFromFile\",\n      [&]() -> jlong {\n        bool ok = false;\n        auto config =\n            sherpa_onnx::GetOnlineSpeechDenoiserConfig(env, _config, &ok);\n\n        if (!ok) {\n          SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n          return 0;\n        }\n\n        SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n        if (!config.Validate()) {\n          SHERPA_ONNX_LOGE(\"Errors found in config!\");\n          return 0;\n        }\n\n        auto 
speech_denoiser = new sherpa_onnx::OnlineSpeechDenoiser(config);\n        return reinterpret_cast<jlong>(speech_denoiser);\n      },\n      static_cast<jlong>(0));\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_getSampleRate(JNIEnv * /*env*/,\n                                                              jobject /*obj*/,\n                                                              jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr)\n      ->GetSampleRate();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL\nJava_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_getFrameShiftInSamples(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  return reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr)\n      ->GetFrameShiftInSamples();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_run(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples,\n    jint sample_rate) {\n  auto speech_denoiser =\n      reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  auto denoised = speech_denoiser->Run(p, n, sample_rate);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n  return sherpa_onnx::NewDenoisedAudio(env, denoised);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_flush(\n    JNIEnv *env, jobject /*obj*/, jlong ptr) {\n  auto speech_denoiser =\n      reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr);\n  auto denoised = speech_denoiser->Flush();\n  return sherpa_onnx::NewDenoisedAudio(env, 
denoised);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_reset(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  reinterpret_cast<sherpa_onnx::OnlineSpeechDenoiser *>(ptr)->Reset();\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/online-stream.cc",
    "content": "// sherpa-onnx/jni/online-stream.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\n#include \"sherpa-onnx/jni/common.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_delete(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_acceptWaveform(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples,\n    jint sample_rate) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n  stream->AcceptWaveform(sample_rate, p, n);\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_inputFinished(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n  stream->InputFinished();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_setOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key, jstring value) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  const char *p_value = env->GetStringUTFChars(value, nullptr);\n  stream->SetOption(p_key, p_value);\n  env->ReleaseStringUTFChars(key, p_key);\n  env->ReleaseStringUTFChars(value, p_value);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_getOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  const std::string &value = stream->GetOption(p_key);\n  
env->ReleaseStringUTFChars(key, p_key);\n  return env->NewStringUTF(value.c_str());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_OnlineStream_hasOption(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring key) {\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(ptr);\n  const char *p_key = env->GetStringUTFChars(key, nullptr);\n  jboolean result = stream->HasOption(p_key);\n  env->ReleaseStringUTFChars(key, p_key);\n  return result;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/sherpa-onnx-symbols.exp",
    "content": "_Java_com_k2fsa_sherpa_onnx_AudioTagging_compute\n_Java_com_k2fsa_sherpa_onnx_AudioTagging_createStream\n_Java_com_k2fsa_sherpa_onnx_AudioTagging_delete\n_Java_com_k2fsa_sherpa_onnx_AudioTagging_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_AudioTagging_newFromFile\n_Java_com_k2fsa_sherpa_onnx_DenoisedAudio_saveImpl\n_Java_com_k2fsa_sherpa_onnx_GeneratedAudio_saveImpl\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_createStream\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_decode\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_delete\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_getResult\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_isReady\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_newFromFile\n_Java_com_k2fsa_sherpa_onnx_KeywordSpotter_reset\n_Java_com_k2fsa_sherpa_onnx_OfflinePunctuation_addPunctuation\n_Java_com_k2fsa_sherpa_onnx_OfflinePunctuation_delete\n_Java_com_k2fsa_sherpa_onnx_OfflinePunctuation_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OfflinePunctuation_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_createStream\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_decode\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_decodeStreams\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_delete\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_getResult\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_prependAdspLibraryPath\n_Java_com_k2fsa_sherpa_onnx_OfflineRecognizer_setConfig\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_delete\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_getSampleRate\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_process\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeakerDiarization_processWithCallback\n_Java_com_k2f
sa_sherpa_onnx_OfflineSpeakerDiarization_setConfig\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_delete\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_getSampleRate\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OfflineSpeechDenoiser_run\n_Java_com_k2fsa_sherpa_onnx_OfflineStream_acceptWaveform\n_Java_com_k2fsa_sherpa_onnx_OfflineStream_delete\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_delete\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_generateImpl\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_generateWithCallbackImpl\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_generateWithConfigImpl\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_getNumSpeakers\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_getSampleRate\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OfflineTts_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OnlinePunctuation_addPunctuation\n_Java_com_k2fsa_sherpa_onnx_OnlinePunctuation_delete\n_Java_com_k2fsa_sherpa_onnx_OnlinePunctuation_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OnlinePunctuation_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_delete\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_flush\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_getFrameShiftInSamples\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_getSampleRate\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_reset\n_Java_com_k2fsa_sherpa_onnx_OnlineSpeechDenoiser_run\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_createStream\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_decode\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_decodeStreams\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_delete\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_getResult\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_isEndpoint\n_Java_com_k2fsa_sherpa_onnx_OnlineR
ecognizer_isReady\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_newFromFile\n_Java_com_k2fsa_sherpa_onnx_OnlineRecognizer_reset\n_Java_com_k2fsa_sherpa_onnx_OnlineStream_acceptWaveform\n_Java_com_k2fsa_sherpa_onnx_OnlineStream_delete\n_Java_com_k2fsa_sherpa_onnx_OnlineStream_inputFinished\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_compute\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_createStream\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_delete\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_dim\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_isReady\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_newFromFile\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_add\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_addList\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_allSpeakerNames\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_contains\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_create\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_delete\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_numSpeakers\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_remove\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_search\n_Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_verify\n_Java_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_compute\n_Java_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_createStream\n_Java_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_delete\n_Java_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_newFromFile\n_Java_com_k2fsa_sherpa_onnx_Vad_acceptWaveform\n_Java_com_k2fsa_sherpa_onnx_Vad_clear\n_Java_com_k2fsa_sherpa_onnx_Vad_compute\n_Java_com_k2fsa_sherpa_onnx_Vad_delete\n_Java_com_k2fsa_sherpa_onnx_Vad_empty\n_Java_com_k2fsa_sherpa_onnx_Vad_flus
h\n_Java_com_k2fsa_sherpa_onnx_Vad_front\n_Java_com_k2fsa_sherpa_onnx_Vad_isSpeechDetected\n_Java_com_k2fsa_sherpa_onnx_Vad_newFromAsset\n_Java_com_k2fsa_sherpa_onnx_Vad_newFromFile\n_Java_com_k2fsa_sherpa_onnx_Vad_pop\n_Java_com_k2fsa_sherpa_onnx_Vad_reset\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getGitDate2\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getGitSha12\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getVersionStr2\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_getGitDate2\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_getGitSha12\n_Java_com_k2fsa_sherpa_onnx_VersionInfo_getVersionStr2\n_Java_com_k2fsa_sherpa_onnx_WaveReader_00024Companion_readWaveFromAsset\n_Java_com_k2fsa_sherpa_onnx_WaveReader_00024Companion_readWaveFromFile\n_Java_com_k2fsa_sherpa_onnx_WaveReader_readWaveFromFile\n_Java_com_k2fsa_sherpa_onnx_WaveWriter_writeWaveToFile\n"
  },
  {
    "path": "sherpa-onnx/jni/sherpa-onnx-symbols.lds",
    "content": "{\n  global:\n    Java_com_k2fsa_sherpa_onnx*;\n  local:\n    *;\n};\n"
  },
  {
    "path": "sherpa-onnx/jni/speaker-embedding-extractor.cc",
    "content": "// sherpa-onnx/jni/speaker-embedding-extractor.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic SpeakerEmbeddingExtractorConfig GetSpeakerEmbeddingExtractorConfig(\n    JNIEnv *env, jobject config, bool *ok) {\n  SpeakerEmbeddingExtractorConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.model, model, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_threads, numThreads, cls, config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.debug, debug, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.provider, provider, cls, config);\n\n  *ok = true;\n\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetSpeakerEmbeddingExtractorConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"new config:\\n%s\", config.ToString().c_str());\n\n  auto extractor = new sherpa_onnx::SpeakerEmbeddingExtractor(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return (jlong)extractor;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetSpeakerEmbeddingExtractorConfig(env, _config, &ok);\n\n  if (!ok) {\n    
SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"newFromFile config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto extractor = new sherpa_onnx::SpeakerEmbeddingExtractor(config);\n\n  return (jlong)extractor;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_delete(JNIEnv * /*env*/,\n                                                            jobject /*obj*/,\n                                                            jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::SpeakerEmbeddingExtractor *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_createStream(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  std::unique_ptr<sherpa_onnx::OnlineStream> s =\n      reinterpret_cast<sherpa_onnx::SpeakerEmbeddingExtractor *>(ptr)\n          ->CreateStream();\n\n  // The user is responsible to free the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OnlineStream_delete() from\n  // ./online-stream.cc\n  sherpa_onnx::OnlineStream *p = s.release();\n  return (jlong)p;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_isReady(JNIEnv * /*env*/,\n                                                             jobject /*obj*/,\n                                                             jlong ptr,\n                                                             jlong stream_ptr) {\n  auto extractor =\n      reinterpret_cast<sherpa_onnx::SpeakerEmbeddingExtractor *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n  return extractor->IsReady(stream);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jfloatArray JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_compute(JNIEnv *env,\n                 
                                            jobject /*obj*/,\n                                                             jlong ptr,\n                                                             jlong stream_ptr) {\n  auto extractor =\n      reinterpret_cast<sherpa_onnx::SpeakerEmbeddingExtractor *>(ptr);\n  auto stream = reinterpret_cast<sherpa_onnx::OnlineStream *>(stream_ptr);\n\n  std::vector<float> embedding = extractor->Compute(stream);\n  jfloatArray embedding_arr = env->NewFloatArray(embedding.size());\n  env->SetFloatArrayRegion(embedding_arr, 0, embedding.size(),\n                           embedding.data());\n  return embedding_arr;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL Java_com_k2fsa_sherpa_onnx_SpeakerEmbeddingExtractor_dim(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  auto extractor =\n      reinterpret_cast<sherpa_onnx::SpeakerEmbeddingExtractor *>(ptr);\n  return extractor->Dim();\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/speaker-embedding-manager.cc",
    "content": "// sherpa-onnx/jni/speaker-embedding-manager.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_create(JNIEnv *env,\n                                                          jobject /*obj*/,\n                                                          jint dim) {\n  auto p = new sherpa_onnx::SpeakerEmbeddingManager(dim);\n  return (jlong)p;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_delete(JNIEnv * /*env*/,\n                                                          jobject /*obj*/,\n                                                          jlong ptr) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n  delete manager;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_add(JNIEnv *env,\n                                                       jobject /*obj*/,\n                                                       jlong ptr, jstring name,\n                                                       jfloatArray embedding) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(embedding, nullptr);\n  jsize n = env->GetArrayLength(embedding);\n\n  if (n != manager->Dim()) {\n    SHERPA_ONNX_LOGE(\"Expected dim %d, given %d\", manager->Dim(),\n                     static_cast<int32_t>(n));\n    env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n    jclass iae = env->FindClass(\"java/lang/IllegalArgumentException\");\n    env->ThrowNew(iae, \"Embedding dimension mismatch\");\n    env->DeleteLocalRef(iae);\n    return false;\n  }\n\n  const char *p_name = 
env->GetStringUTFChars(name, nullptr);\n\n  jboolean ok = manager->Add(p_name, p);\n  env->ReleaseStringUTFChars(name, p_name);\n  env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n\n  return ok;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_addList(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring name,\n    jobjectArray embedding_arr) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  int num_embeddings = env->GetArrayLength(embedding_arr);\n  if (num_embeddings == 0) {\n    return false;\n  }\n\n  std::vector<std::vector<float>> embedding_list;\n  embedding_list.reserve(num_embeddings);\n  for (int32_t i = 0; i != num_embeddings; ++i) {\n    jfloatArray embedding =\n        (jfloatArray)env->GetObjectArrayElement(embedding_arr, i);\n\n    jfloat *p = env->GetFloatArrayElements(embedding, nullptr);\n    jsize n = env->GetArrayLength(embedding);\n\n    if (n != manager->Dim()) {\n      SHERPA_ONNX_LOGE(\"i: %d. 
Expected dim %d, given %d\", i, manager->Dim(),\n                       static_cast<int32_t>(n));\n      env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n      env->DeleteLocalRef(embedding);\n      jclass iae = env->FindClass(\"java/lang/IllegalArgumentException\");\n      env->ThrowNew(iae, \"Embedding dimension mismatch\");\n      env->DeleteLocalRef(iae);\n      return false;\n    }\n\n    embedding_list.push_back({p, p + n});\n    env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n    env->DeleteLocalRef(embedding);\n  }\n\n  const char *p_name = env->GetStringUTFChars(name, nullptr);\n\n  jboolean ok = manager->Add(p_name, embedding_list);\n\n  env->ReleaseStringUTFChars(name, p_name);\n\n  return ok;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_remove(JNIEnv *env,\n                                                          jobject /*obj*/,\n                                                          jlong ptr,\n                                                          jstring name) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  const char *p_name = env->GetStringUTFChars(name, nullptr);\n\n  jboolean ok = manager->Remove(p_name);\n\n  env->ReleaseStringUTFChars(name, p_name);\n\n  return ok;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_search(JNIEnv *env,\n                                                          jobject /*obj*/,\n                                                          jlong ptr,\n                                                          jfloatArray embedding,\n                                                          jfloat threshold) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(embedding, nullptr);\n  jsize n = env->GetArrayLength(embedding);\n\n  if (n != manager->Dim()) {\n    
SHERPA_ONNX_LOGE(\"Expected dim %d, given %d\", manager->Dim(),\n                     static_cast<int32_t>(n));\n    env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n    jclass iae = env->FindClass(\"java/lang/IllegalArgumentException\");\n    env->ThrowNew(iae, \"Embedding dimension mismatch\");\n    env->DeleteLocalRef(iae);\n    return env->NewStringUTF(\"\");\n  }\n\n  std::string name = manager->Search(p, threshold);\n\n  env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n\n  return env->NewStringUTF(name.c_str());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_verify(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jstring name,\n    jfloatArray embedding, jfloat threshold) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  jfloat *p = env->GetFloatArrayElements(embedding, nullptr);\n  jsize n = env->GetArrayLength(embedding);\n\n  if (n != manager->Dim()) {\n    SHERPA_ONNX_LOGE(\"Expected dim %d, given %d\", manager->Dim(),\n                     static_cast<int32_t>(n));\n    env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n    jclass iae = env->FindClass(\"java/lang/IllegalArgumentException\");\n    env->ThrowNew(iae, \"Embedding dimension mismatch\");\n    env->DeleteLocalRef(iae);\n    return false;\n  }\n\n  const char *p_name = env->GetStringUTFChars(name, nullptr);\n\n  jboolean ok = manager->Verify(p_name, p, threshold);\n\n  env->ReleaseFloatArrayElements(embedding, p, JNI_ABORT);\n\n  env->ReleaseStringUTFChars(name, p_name);\n\n  return ok;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_contains(JNIEnv *env,\n                                                            jobject /*obj*/,\n                                                            jlong ptr,\n                                                            jstring name) {\n  auto manager = 
reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n\n  const char *p_name = env->GetStringUTFChars(name, nullptr);\n\n  jboolean ok = manager->Contains(p_name);\n\n  env->ReleaseStringUTFChars(name, p_name);\n\n  return ok;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jint JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_numSpeakers(JNIEnv * /*env*/,\n                                                               jobject /*obj*/,\n                                                               jlong ptr) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n  return manager->NumSpeakers();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobjectArray JNICALL\nJava_com_k2fsa_sherpa_onnx_SpeakerEmbeddingManager_allSpeakerNames(\n    JNIEnv *env, jobject /*obj*/, jlong ptr) {\n  auto manager = reinterpret_cast<sherpa_onnx::SpeakerEmbeddingManager *>(ptr);\n  std::vector<std::string> all_speakers = manager->GetAllSpeakers();\n\n  jclass string_cls = env->FindClass(\"java/lang/String\");\n  jobjectArray obj_arr = (jobjectArray)env->NewObjectArray(\n      all_speakers.size(), string_cls, nullptr);\n  env->DeleteLocalRef(string_cls);\n\n  int32_t i = 0;\n  for (auto &s : all_speakers) {\n    jstring js = env->NewStringUTF(s.c_str());\n    env->SetObjectArrayElement(obj_arr, i, js);\n    env->DeleteLocalRef(js);\n    ++i;\n  }\n\n  return obj_arr;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/speech-denoiser.cc",
    "content": "// sherpa-onnx/jni/speech-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/jni/speech-denoiser.h\"\n\n#include \"sherpa-onnx/csrc/macros.h\"\n\nnamespace sherpa_onnx {\n\nOfflineSpeechDenoiserModelConfig GetOfflineSpeechDenoiserModelConfig(\n    JNIEnv *env, jobject model, bool *ok) {\n  OfflineSpeechDenoiserModelConfig ans;\n\n  jclass model_config_cls = env->GetObjectClass(model);\n  jfieldID fid;\n\n  fid = env->GetFieldID(\n      model_config_cls, \"gtcrn\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineSpeechDenoiserGtcrnModelConfig;\");\n  jobject gtcrn = env->GetObjectField(model, fid);\n  jclass gtcrn_cls = env->GetObjectClass(gtcrn);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.gtcrn.model, model, gtcrn_cls, gtcrn);\n\n  fid = env->GetFieldID(\n      model_config_cls, \"dpdfnet\",\n      \"Lcom/k2fsa/sherpa/onnx/OfflineSpeechDenoiserDpdfNetModelConfig;\");\n  jobject dpdfnet = env->GetObjectField(model, fid);\n  jclass dpdfnet_cls = env->GetObjectClass(dpdfnet);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.dpdfnet.model, model, dpdfnet_cls, dpdfnet);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_threads, numThreads, model_config_cls,\n                           model);\n  SHERPA_ONNX_JNI_READ_BOOL(ans.debug, debug, model_config_cls, model);\n  SHERPA_ONNX_JNI_READ_STRING(ans.provider, provider, model_config_cls, model);\n\n  *ok = true;\n  return ans;\n}\n\nOfflineSpeechDenoiserConfig GetOfflineSpeechDenoiserConfig(JNIEnv *env,\n                                                           jobject config,\n                                                           bool *ok) {\n  OfflineSpeechDenoiserConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid = env->GetFieldID(\n      cls, \"model\", \"Lcom/k2fsa/sherpa/onnx/OfflineSpeechDenoiserModelConfig;\");\n  jobject model = env->GetObjectField(config, fid);\n\n  ans.model = GetOfflineSpeechDenoiserModelConfig(env, model, ok);\n  return 
ans;\n}\n\nOnlineSpeechDenoiserConfig GetOnlineSpeechDenoiserConfig(JNIEnv *env,\n                                                         jobject config,\n                                                         bool *ok) {\n  OnlineSpeechDenoiserConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid = env->GetFieldID(\n      cls, \"model\", \"Lcom/k2fsa/sherpa/onnx/OfflineSpeechDenoiserModelConfig;\");\n  jobject model = env->GetObjectField(config, fid);\n\n  ans.model = GetOfflineSpeechDenoiserModelConfig(env, model, ok);\n  return ans;\n}\n\njobject NewDenoisedAudio(JNIEnv *env, const DenoisedAudio &denoised) {\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/DenoisedAudio\");\n  if (cls == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to get class for DenoisedAudio\");\n    return nullptr;\n  }\n\n  jmethodID constructor = env->GetMethodID(cls, \"<init>\", \"([FI)V\");\n  if (constructor == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to get constructor for DenoisedAudio\");\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  jfloatArray samples_arr = env->NewFloatArray(denoised.samples.size());\n  env->SetFloatArrayRegion(samples_arr, 0, denoised.samples.size(),\n                           denoised.samples.data());\n\n  jobject obj =\n      env->NewObject(cls, constructor, samples_arr, denoised.sample_rate);\n  env->DeleteLocalRef(cls);\n  env->DeleteLocalRef(samples_arr);\n  return obj;\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/jni/speech-denoiser.h",
    "content": "// sherpa-onnx/jni/speech-denoiser.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_JNI_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_JNI_SPEECH_DENOISER_H_\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nOfflineSpeechDenoiserModelConfig GetOfflineSpeechDenoiserModelConfig(\n    JNIEnv *env, jobject model, bool *ok);\n\nOfflineSpeechDenoiserConfig GetOfflineSpeechDenoiserConfig(\n    JNIEnv *env, jobject config, bool *ok);\n\nOnlineSpeechDenoiserConfig GetOnlineSpeechDenoiserConfig(\n    JNIEnv *env, jobject config, bool *ok);\n\njobject NewDenoisedAudio(JNIEnv *env, const DenoisedAudio &denoised);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_JNI_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "sherpa-onnx/jni/spoken-language-identification.cc",
    "content": "// sherpa-onnx/jni/spoken-language-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic SpokenLanguageIdentificationConfig GetSpokenLanguageIdentificationConfig(\n    JNIEnv *env, jobject config, bool *ok) {\n  SpokenLanguageIdentificationConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid = env->GetFieldID(\n      cls, \"whisper\",\n      \"Lcom/k2fsa/sherpa/onnx/SpokenLanguageIdentificationWhisperConfig;\");\n\n  jobject whisper = env->GetObjectField(config, fid);\n  jclass whisper_cls = env->GetObjectClass(whisper);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.whisper.encoder, encoder, whisper_cls,\n                              whisper);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.whisper.decoder, decoder, whisper_cls,\n                              whisper);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.whisper.tail_paddings, tailPaddings, whisper_cls,\n                           whisper);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_threads, numThreads, cls, config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.debug, debug, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.provider, provider, cls, config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetSpokenLanguageIdentificationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read 
the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"spoken language identification newFromAsset config:\\n%s\",\n                   config.ToString().c_str());\n\n  auto slid = new sherpa_onnx::SpokenLanguageIdentification(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n  SHERPA_ONNX_LOGE(\"slid %p\", slid);\n\n  return (jlong)slid;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config =\n      sherpa_onnx::GetSpokenLanguageIdentificationConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"SpokenLanguageIdentification newFromFile config:\\n%s\",\n                   config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto slid = new sherpa_onnx::SpokenLanguageIdentification(config);\n\n  return (jlong)slid;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL\nJava_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_delete(JNIEnv * /*env*/,\n                                                               jobject /*obj*/,\n                                                               jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::SpokenLanguageIdentification *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL\nJava_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_createStream(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  auto slid =\n      reinterpret_cast<sherpa_onnx::SpokenLanguageIdentification *>(ptr);\n  std::unique_ptr<sherpa_onnx::OfflineStream> s = slid->CreateStream();\n\n  // The caller is responsible for freeing the returned pointer.\n  //\n  // See Java_com_k2fsa_sherpa_onnx_OfflineStream_delete() in\n  // ./offline-stream.cc\n  sherpa_onnx::OfflineStream 
*p = s.release();\n  return (jlong)p;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_SpokenLanguageIdentification_compute(JNIEnv *env,\n                                                                jobject /*obj*/,\n                                                                jlong ptr,\n                                                                jlong s_ptr) {\n  sherpa_onnx::SpokenLanguageIdentification *slid =\n      reinterpret_cast<sherpa_onnx::SpokenLanguageIdentification *>(ptr);\n  sherpa_onnx::OfflineStream *s =\n      reinterpret_cast<sherpa_onnx::OfflineStream *>(s_ptr);\n  std::string lang = slid->Compute(s);\n  return env->NewStringUTF(lang.c_str());\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/version.cc",
    "content": "// sherpa-onnx/jni/version.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/version.h\"\n\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getVersionStr2(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetVersionStr());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getGitSha12(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetGitSha1());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL\nJava_com_k2fsa_sherpa_onnx_VersionInfo_00024Companion_getGitDate2(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetGitDate());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL Java_com_k2fsa_sherpa_onnx_VersionInfo_getVersionStr2(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetVersionStr());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL Java_com_k2fsa_sherpa_onnx_VersionInfo_getGitSha12(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetGitSha1());\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jstring JNICALL Java_com_k2fsa_sherpa_onnx_VersionInfo_getGitDate2(\n    JNIEnv *env, jclass /*cls*/) {\n  return env->NewStringUTF(GetGitDate());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/jni/voice-activity-detector.cc",
    "content": "// sherpa-onnx/csrc/voice-activity-detector.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nnamespace sherpa_onnx {\n\nstatic VadModelConfig GetVadModelConfig(JNIEnv *env, jobject config, bool *ok) {\n  VadModelConfig ans;\n\n  jclass cls = env->GetObjectClass(config);\n  jfieldID fid;\n\n  // silero_vad\n  fid = env->GetFieldID(cls, \"sileroVadModelConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/SileroVadModelConfig;\");\n  jobject silero_vad_config = env->GetObjectField(config, fid);\n  jclass silero_vad_config_cls = env->GetObjectClass(silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.silero_vad.model, model,\n                              silero_vad_config_cls, silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silero_vad.threshold, threshold,\n                             silero_vad_config_cls, silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silero_vad.min_silence_duration,\n                             minSilenceDuration, silero_vad_config_cls,\n                             silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silero_vad.min_speech_duration,\n                             minSpeechDuration, silero_vad_config_cls,\n                             silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.silero_vad.window_size, windowSize,\n                           silero_vad_config_cls, silero_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.silero_vad.max_speech_duration,\n                             maxSpeechDuration, silero_vad_config_cls,\n                             silero_vad_config);\n\n  fid = env->GetFieldID(cls, \"tenVadModelConfig\",\n                        \"Lcom/k2fsa/sherpa/onnx/TenVadModelConfig;\");\n  jobject ten_vad_config = env->GetObjectField(config, fid);\n  jclass ten_vad_config_cls = env->GetObjectClass(ten_vad_config);\n\n  
SHERPA_ONNX_JNI_READ_STRING(ans.ten_vad.model, model, ten_vad_config_cls,\n                              ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.ten_vad.threshold, threshold,\n                             ten_vad_config_cls, ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.ten_vad.min_silence_duration,\n                             minSilenceDuration, ten_vad_config_cls,\n                             ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.ten_vad.min_speech_duration, minSpeechDuration,\n                             ten_vad_config_cls, ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.ten_vad.window_size, windowSize,\n                           ten_vad_config_cls, ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_FLOAT(ans.ten_vad.max_speech_duration, maxSpeechDuration,\n                             ten_vad_config_cls, ten_vad_config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.sample_rate, sampleRate, cls, config);\n\n  SHERPA_ONNX_JNI_READ_INT(ans.num_threads, numThreads, cls, config);\n\n  SHERPA_ONNX_JNI_READ_STRING(ans.provider, provider, cls, config);\n\n  SHERPA_ONNX_JNI_READ_BOOL(ans.debug, debug, cls, config);\n\n  *ok = true;\n  return ans;\n}\n\n}  // namespace sherpa_onnx\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_Vad_newFromAsset(\n    JNIEnv *env, jobject /*obj*/, jobject asset_manager, jobject _config) {\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    return 0;\n  }\n#endif\n\n  bool ok = false;\n  auto config = sherpa_onnx::GetVadModelConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  auto model = new sherpa_onnx::VoiceActivityDetector(\n#if __ANDROID_API__ >= 9\n      mgr,\n#endif\n      config);\n\n  return 
(jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jlong JNICALL Java_com_k2fsa_sherpa_onnx_Vad_newFromFile(\n    JNIEnv *env, jobject /*obj*/, jobject _config) {\n  bool ok = false;\n  auto config = sherpa_onnx::GetVadModelConfig(env, _config, &ok);\n\n  if (!ok) {\n    SHERPA_ONNX_LOGE(\"Please read the error message carefully\");\n    return 0;\n  }\n\n  SHERPA_ONNX_LOGE(\"config:\\n%s\", config.ToString().c_str());\n\n  if (!config.Validate()) {\n    SHERPA_ONNX_LOGE(\"Errors found in config!\");\n    return 0;\n  }\n\n  auto model = new sherpa_onnx::VoiceActivityDetector(config);\n\n  return (jlong)model;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_delete(JNIEnv * /*env*/,\n                                                             jobject /*obj*/,\n                                                             jlong ptr) {\n  delete reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_acceptWaveform(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples) {\n  SafeJNI(env, \"Vad_acceptWaveform\", [&] {\n    if (!ValidatePointer(env, ptr, \"Vad_acceptWaveform\",\n                         \"VoiceActivityDetector pointer is null.\")) {\n      return;\n    }\n\n    auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n    jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n    jsize n = env->GetArrayLength(samples);\n\n    model->AcceptWaveform(p, n);\n\n    env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n  });\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_Vad_empty(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  return model->Empty();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_pop(JNIEnv * /*env*/,\n                                 
                         jobject /*obj*/,\n                                                          jlong ptr) {\n  auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  model->Pop();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_clear(JNIEnv * /*env*/,\n                                                            jobject /*obj*/,\n                                                            jlong ptr) {\n  auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  model->Clear();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL Java_com_k2fsa_sherpa_onnx_Vad_front(JNIEnv *env,\n                                                               jobject /*obj*/,\n                                                               jlong ptr) {\n  auto vad = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  if (!vad) {\n    return nullptr;\n  }\n\n  const auto &front = vad->Front();\n\n  jfloatArray samples_arr =\n      env->NewFloatArray(static_cast<jsize>(front.samples.size()));\n\n  if (!samples_arr) {\n    SHERPA_ONNX_LOGE(\"Failed to allocate\");\n    return nullptr;\n  }\n\n  env->SetFloatArrayRegion(samples_arr, 0,\n                           static_cast<jsize>(front.samples.size()),\n                           front.samples.data());\n\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/SpeechSegment\");\n  if (!cls) {\n    SHERPA_ONNX_LOGE(\"Failed to find com/k2fsa/sherpa/onnx/SpeechSegment\");\n\n    env->DeleteLocalRef(samples_arr);\n    return nullptr;\n  }\n\n  jmethodID ctor = env->GetMethodID(cls, \"<init>\", \"(I[F)V\");\n  if (!ctor) {\n    SHERPA_ONNX_LOGE(\"failed to get constructor\");\n\n    env->DeleteLocalRef(samples_arr);\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  jobject speechSegment =\n      env->NewObject(cls, ctor, static_cast<jint>(front.start), samples_arr);\n\n  env->DeleteLocalRef(samples_arr);\n  env->DeleteLocalRef(cls);\n\n  
return speechSegment;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jboolean JNICALL Java_com_k2fsa_sherpa_onnx_Vad_isSpeechDetected(\n    JNIEnv * /*env*/, jobject /*obj*/, jlong ptr) {\n  auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  return model->IsSpeechDetected();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_reset(JNIEnv *env,\n                                                            jobject /*obj*/,\n                                                            jlong ptr) {\n  SafeJNI(env, \"Vad_reset\", [&] {\n    if (!ValidatePointer(env, ptr, \"Vad_reset\",\n                         \"VoiceActivityDetector pointer is null.\")) {\n      return;\n    }\n\n    auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n    model->Reset();\n  });\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT void JNICALL Java_com_k2fsa_sherpa_onnx_Vad_flush(JNIEnv * /*env*/,\n                                                            jobject /*obj*/,\n                                                            jlong ptr) {\n  auto model = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n  model->Flush();\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jfloat JNICALL Java_com_k2fsa_sherpa_onnx_Vad_compute(\n    JNIEnv *env, jobject /*obj*/, jlong ptr, jfloatArray samples) {\n  return SafeJNI(\n      env, \"Vad_compute\",\n      [&]() -> jfloat {\n        if (!ValidatePointer(env, ptr, \"Vad_compute\",\n                             \"VoiceActivityDetector pointer is null.\")) {\n          return -1.0f;\n        }\n        auto vad = reinterpret_cast<sherpa_onnx::VoiceActivityDetector *>(ptr);\n        jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n        jsize n = env->GetArrayLength(samples);\n\n        float score = vad->Compute(p, n);\n\n        env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n\n        return static_cast<jfloat>(score);\n      },\n      -1.0f);\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/wave-reader.cc",
    "content": "// sherpa-onnx/jni/wave-reader.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/wave-reader.h\"\n\n#include <fstream>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/file-utils.h\"\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/jni/common.h\"\n\nstatic jobject ReadWaveImpl(JNIEnv *env, std::istream &is,\n                            const char *p_filename) {\n  bool is_ok = false;\n  int32_t sampling_rate = -1;\n  std::vector<float> samples =\n      sherpa_onnx::ReadWave(is, &sampling_rate, &is_ok);\n\n  if (!is_ok) {\n    SHERPA_ONNX_LOGE(\"Failed to read '%s'\", p_filename);\n    jclass exception_class = env->FindClass(\"java/lang/Exception\");\n    env->ThrowNew(exception_class, \"Failed to read wave file.\");\n    env->DeleteLocalRef(exception_class);\n    return nullptr;\n  }\n\n  jfloatArray samples_arr = env->NewFloatArray(samples.size());\n  if (samples_arr == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to allocate samples array\");\n    return nullptr;\n  }\n\n  env->SetFloatArrayRegion(samples_arr, 0, samples.size(), samples.data());\n\n  // Find WaveData class\n  jclass cls = env->FindClass(\"com/k2fsa/sherpa/onnx/WaveData\");\n  if (cls == nullptr) {\n    env->DeleteLocalRef(samples_arr);\n    SHERPA_ONNX_LOGE(\"Failed to find class com/k2fsa/sherpa/onnx/WaveData\");\n    return nullptr;\n  }\n\n  // Get constructor: WaveData(float[] samples, int sampleRate)\n  jmethodID ctor = env->GetMethodID(cls, \"<init>\", \"([FI)V\");\n  if (ctor == nullptr) {\n    SHERPA_ONNX_LOGE(\"Failed to get WaveData constructor\");\n\n    env->DeleteLocalRef(samples_arr);\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  // Create WaveData object\n  jobject obj = env->NewObject(cls, ctor, samples_arr, sampling_rate);\n  if (obj == nullptr) {\n    env->DeleteLocalRef(samples_arr);\n    env->DeleteLocalRef(cls);\n    return nullptr;\n  }\n\n  // Clean up local refs\n  
env->DeleteLocalRef(samples_arr);\n  env->DeleteLocalRef(cls);\n\n  return obj;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_WaveReader_00024Companion_readWaveFromFile(\n    JNIEnv *env, jclass /*cls*/, jstring filename) {\n  const char *p_filename = env->GetStringUTFChars(filename, nullptr);\n  std::ifstream is(p_filename, std::ios::binary);\n\n  auto obj = ReadWaveImpl(env, is, p_filename);\n\n  env->ReleaseStringUTFChars(filename, p_filename);\n\n  return obj;\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_WaveReader_readWaveFromFile(JNIEnv *env,\n                                                       jclass /*obj*/,\n                                                       jstring filename) {\n  return Java_com_k2fsa_sherpa_onnx_WaveReader_00024Companion_readWaveFromFile(\n      env, nullptr, filename);\n}\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT jobject JNICALL\nJava_com_k2fsa_sherpa_onnx_WaveReader_00024Companion_readWaveFromAsset(\n    JNIEnv *env, jclass /*cls*/, jobject asset_manager, jstring filename) {\n  const char *p_filename = env->GetStringUTFChars(filename, nullptr);\n#if __ANDROID_API__ >= 9\n  AAssetManager *mgr = AAssetManager_fromJava(env, asset_manager);\n  if (!mgr) {\n    SHERPA_ONNX_LOGE(\"Failed to get asset manager: %p\", mgr);\n    env->ReleaseStringUTFChars(filename, p_filename);\n    jclass re = env->FindClass(\"java/lang/RuntimeException\");\n    env->ThrowNew(re, \"Failed to get asset manager\");\n    env->DeleteLocalRef(re);\n    return nullptr;\n  }\n  std::vector<char> buffer = sherpa_onnx::ReadFile(mgr, p_filename);\n\n  std::istringstream is(std::string(buffer.data(), buffer.size()));\n#else\n  std::ifstream is(p_filename, std::ios::binary);\n#endif\n\n  auto obj = ReadWaveImpl(env, is, p_filename);\n\n  env->ReleaseStringUTFChars(filename, p_filename);\n\n  return obj;\n}\n"
  },
  {
    "path": "sherpa-onnx/jni/wave-writer.cc",
    "content": "// sherpa-onnx/jni/wave-writer.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\n#include \"sherpa-onnx/jni/common.h\"\n\nSHERPA_ONNX_EXTERN_C\nJNIEXPORT bool JNICALL Java_com_k2fsa_sherpa_onnx_WaveWriter_writeWaveToFile(\n    JNIEnv *env, jclass /*obj*/, jstring filename, jfloatArray samples,\n    jint sample_rate) {\n  jfloat *p = env->GetFloatArrayElements(samples, nullptr);\n  jsize n = env->GetArrayLength(samples);\n\n  const char *p_filename = env->GetStringUTFChars(filename, nullptr);\n\n  bool ok = sherpa_onnx::WriteWave(p_filename, sample_rate, p, n);\n\n  env->ReleaseFloatArrayElements(samples, p, JNI_ABORT);\n  env->ReleaseStringUTFChars(filename, p_filename);\n\n  return ok;\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/AudioTagging.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflineZipformerAudioTaggingModelConfig(\n    var model: String = \"\",\n)\n\ndata class AudioTaggingModelConfig(\n    var zipformer: OfflineZipformerAudioTaggingModelConfig = OfflineZipformerAudioTaggingModelConfig(),\n    var ced: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\ndata class AudioTaggingConfig(\n    var model: AudioTaggingModelConfig = AudioTaggingModelConfig(),\n    var labels: String = \"\",\n    var topK: Int = 5,\n)\n\ndata class AudioEvent(\n    val name: String,\n    val index: Int,\n    val prob: Float,\n)\n\nclass AudioTagging(\n    assetManager: AssetManager? = null,\n    config: AudioTaggingConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(): OfflineStream {\n        val p = createStream(ptr)\n        return OfflineStream(p)\n    }\n\n    @Suppress(\"UNCHECKED_CAST\")\n    fun compute(stream: OfflineStream, topK: Int = -1): Array<AudioEvent> {\n        return compute(ptr, stream.ptr, topK)\n    }\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: AudioTaggingConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: AudioTaggingConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n\n    private external fun createStream(ptr: Long): Long\n\n    private external fun compute(ptr: Long, streamPtr: Long, topK: Int): Array<AudioEvent>\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n// 
please refer to\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models\n// to download more models\n//\n// See also\n// https://k2-fsa.github.io/sherpa/onnx/audio-tagging/\nfun getAudioTaggingConfig(type: Int, numThreads: Int = 1): AudioTaggingConfig? {\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-zipformer-small-audio-tagging-2024-04-15\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n                    zipformer = OfflineZipformerAudioTaggingModelConfig(model = \"$modelDir/model.int8.onnx\"),\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n\n        1 -> {\n            val modelDir = \"sherpa-onnx-zipformer-audio-tagging-2024-04-09\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n                    zipformer = OfflineZipformerAudioTaggingModelConfig(model = \"$modelDir/model.int8.onnx\"),\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n\n        2 -> {\n            val modelDir = \"sherpa-onnx-ced-tiny-audio-tagging-2024-04-19\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n                    ced = \"$modelDir/model.int8.onnx\",\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n\n        3 -> {\n            val modelDir = \"sherpa-onnx-ced-mini-audio-tagging-2024-04-19\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n           
         ced = \"$modelDir/model.int8.onnx\",\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n\n        4 -> {\n            val modelDir = \"sherpa-onnx-ced-small-audio-tagging-2024-04-19\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n                    ced = \"$modelDir/model.int8.onnx\",\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n\n        5 -> {\n            val modelDir = \"sherpa-onnx-ced-base-audio-tagging-2024-04-19\"\n            return AudioTaggingConfig(\n                model = AudioTaggingModelConfig(\n                    ced = \"$modelDir/model.int8.onnx\",\n                    numThreads = numThreads,\n                    debug = true,\n                ),\n                labels = \"$modelDir/class_labels_indices.csv\",\n                topK = 3,\n            )\n        }\n    }\n\n    return null\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/DenoisedAudio.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nclass DenoisedAudio(\n    val samples: FloatArray,\n    val sampleRate: Int,\n) {\n    fun save(filename: String) =\n        saveImpl(filename = filename, samples = samples, sampleRate = sampleRate)\n\n    private external fun saveImpl(\n        filename: String,\n        samples: FloatArray,\n        sampleRate: Int\n    ): Boolean\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/FeatureConfig.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\ndata class FeatureConfig(\n    var sampleRate: Int = 16000,\n    var featureDim: Int = 80,\n    var dither: Float = 0.0f\n)\n\nfun getFeatureConfig(sampleRate: Int, featureDim: Int): FeatureConfig {\n    return FeatureConfig(sampleRate = sampleRate, featureDim = featureDim)\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/HomophoneReplacerConfig.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\ndata class HomophoneReplacerConfig(\n    var dictDir: String = \"\", // unused\n    var lexicon: String = \"\",\n    var ruleFsts: String = \"\",\n)\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/KeywordSpotter.kt",
    "content": "// Copyright (c)  2024  Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class KeywordSpotterConfig(\n    var featConfig: FeatureConfig = FeatureConfig(),\n    var modelConfig: OnlineModelConfig = OnlineModelConfig(),\n    var maxActivePaths: Int = 4,\n    var keywordsFile: String = \"keywords.txt\",\n    var keywordsScore: Float = 1.5f,\n    var keywordsThreshold: Float = 0.25f,\n    var numTrailingBlanks: Int = 2,\n)\n\ndata class KeywordSpotterResult(\n    val keyword: String,\n    val tokens: Array<String>,\n    val timestamps: FloatArray,\n    // TODO(fangjun): Add more fields\n) {\n    override fun toString(): String {\n        val tokensStr = tokens.joinToString(\", \")\n        val timestampsStr = timestamps.joinToString(\", \") { \"%.2f\".format(it) }\n        return \"Keyword: $keyword\\nTokens: [$tokensStr]\\nTimestamps: [$timestampsStr]\"\n    }\n}\n\nclass KeywordSpotter(\n    assetManager: AssetManager? = null,\n    val config: KeywordSpotterConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(keywords: String = \"\"): OnlineStream {\n        val p = createStream(ptr, keywords)\n        return OnlineStream(p)\n    }\n\n    fun decode(stream: OnlineStream) = decode(ptr, stream.ptr)\n    fun reset(stream: OnlineStream) = reset(ptr, stream.ptr)\n    fun isReady(stream: OnlineStream) = isReady(ptr, stream.ptr)\n    fun getResult(stream: OnlineStream): KeywordSpotterResult {\n        return getResult(ptr, stream.ptr)\n    }\n\n    private external fun delete(ptr: Long)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n      
  config: KeywordSpotterConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: KeywordSpotterConfig,\n    ): Long\n\n    private external fun createStream(ptr: Long, keywords: String): Long\n    private external fun isReady(ptr: Long, streamPtr: Long): Boolean\n    private external fun decode(ptr: Long, streamPtr: Long)\n    private external fun reset(ptr: Long, streamPtr: Long)\n    private external fun getResult(ptr: Long, streamPtr: Long): KeywordSpotterResult\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n/*\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\nfor a list of pre-trained models.\n\nWe only add a few here. Please change the following code\nto add your own. (It should be straightforward to add a new model\nby following the code)\n\n@param type\n0 - sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01 (Chinese)\n    https://www.modelscope.cn/models/pkufool/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/summary\n\n1 - sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01 (English)\n    https://www.modelscope.cn/models/pkufool/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/summary\n\n */\nfun getKwsModelConfig(type: Int): OnlineModelConfig? 
{\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        1 -> {\n            val modelDir = \"sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n    }\n    return null\n}\n\n/*\n * Get the default keywords file for each model.\n * Caution: The types and modelDir must match those in the getKwsModelConfig\n * function above.\n */\nfun getKeywordsFile(type: Int): String {\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\"\n            return \"$modelDir/keywords.txt\"\n        }\n\n        1 -> {\n            val modelDir = \"sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01\"\n            return \"$modelDir/keywords.txt\"\n        }\n\n    }\n    return \"\"\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OfflinePunctuation.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflinePunctuationModelConfig(\n    var ctTransformer: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\n\ndata class OfflinePunctuationConfig(\n    var model: OfflinePunctuationModelConfig,\n)\n\nclass OfflinePunctuation(\n    assetManager: AssetManager? = null,\n    config: OfflinePunctuationConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun addPunctuation(text: String) = addPunctuation(ptr, text)\n\n    private external fun delete(ptr: Long)\n\n    private external fun addPunctuation(ptr: Long, text: String): String\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OfflinePunctuationConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OfflinePunctuationConfig,\n    ): Long\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OfflineRecognizer.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflineRecognizerResult(\n    val text: String,\n    val tokens: Array<String>,\n    val timestamps: FloatArray,\n    val lang: String,\n    val emotion: String,\n    val event: String,\n\n    // valid only for TDT models\n    val durations: FloatArray,\n)\n\ndata class OfflineTransducerModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var joiner: String = \"\",\n)\n\ndata class OfflineParaformerModelConfig(\n    var model: String = \"\",\n    var qnnConfig: QnnConfig = QnnConfig(),\n)\n\ndata class OfflineNemoEncDecCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineDolphinModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineZipformerCtcModelConfig(\n    var model: String = \"\",\n    var qnnConfig: QnnConfig = QnnConfig(),\n)\n\ndata class OfflineWenetCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineOmnilingualAsrCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineMedAsrCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineFireRedAsrCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineFunAsrNanoModelConfig(\n    var encoderAdaptor: String = \"\",\n    var llm: String = \"\",\n    var embedding: String = \"\",\n    var tokenizer: String = \"\",\n    var systemPrompt: String = \"You are a helpful assistant.\",\n    var userPrompt: String = \"语音转写：\",\n    var maxNewTokens: Int = 512,\n    var temperature: Float = 1e-6f,\n    var topP: Float = 0.8f,\n    var seed: Int = 42,\n    var language: String = \"\",\n    var itn: Boolean = true,\n    var hotwords: String = \"\",\n)\n\ndata class OfflineWhisperModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var language: String = \"en\", // Used with multilingual model\n    var task: String = \"transcribe\", // transcribe or translate\n    var 
tailPaddings: Int = 1000, // Padding added at the end of the samples\n    var enableTokenTimestamps: Boolean = false,\n    var enableSegmentTimestamps: Boolean = false,\n)\n\ndata class OfflineCanaryModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var srcLang: String = \"en\",\n    var tgtLang: String = \"en\",\n    var usePnc: Boolean = true,\n)\n\ndata class OfflineFireRedAsrModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n)\n\n// For moonshine v1, you need four models.\n// For moonshine v2, you need two models.\n// - v1: preprocessor, encoder, uncachedDecoder, cachedDecoder\n// - v2: encoder, mergedDecoder\ndata class OfflineMoonshineModelConfig(\n    var preprocessor: String = \"\",\n    var encoder: String = \"\",\n    var uncachedDecoder: String = \"\",\n    var cachedDecoder: String = \"\",\n    var mergedDecoder: String = \"\",\n)\n\ndata class OfflineSenseVoiceModelConfig(\n    var model: String = \"\",\n    var language: String = \"\",\n    var useInverseTextNormalization: Boolean = true,\n    var qnnConfig: QnnConfig = QnnConfig(),\n)\n\ndata class OfflineModelConfig(\n    var transducer: OfflineTransducerModelConfig = OfflineTransducerModelConfig(),\n    var paraformer: OfflineParaformerModelConfig = OfflineParaformerModelConfig(),\n    var whisper: OfflineWhisperModelConfig = OfflineWhisperModelConfig(),\n    var fireRedAsr: OfflineFireRedAsrModelConfig = OfflineFireRedAsrModelConfig(),\n    var moonshine: OfflineMoonshineModelConfig = OfflineMoonshineModelConfig(),\n    var nemo: OfflineNemoEncDecCtcModelConfig = OfflineNemoEncDecCtcModelConfig(),\n    var senseVoice: OfflineSenseVoiceModelConfig = OfflineSenseVoiceModelConfig(),\n    var dolphin: OfflineDolphinModelConfig = OfflineDolphinModelConfig(),\n    var zipformerCtc: OfflineZipformerCtcModelConfig = OfflineZipformerCtcModelConfig(),\n    var wenetCtc: OfflineWenetCtcModelConfig = OfflineWenetCtcModelConfig(),\n    var 
omnilingual: OfflineOmnilingualAsrCtcModelConfig = OfflineOmnilingualAsrCtcModelConfig(),\n    var medasr: OfflineMedAsrCtcModelConfig = OfflineMedAsrCtcModelConfig(),\n    var funasrNano: OfflineFunAsrNanoModelConfig = OfflineFunAsrNanoModelConfig(),\n    var fireRedAsrCtc: OfflineFireRedAsrCtcModelConfig = OfflineFireRedAsrCtcModelConfig(),\n    var canary: OfflineCanaryModelConfig = OfflineCanaryModelConfig(),\n    var teleSpeech: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n    var modelType: String = \"\",\n    var tokens: String = \"\",\n    var modelingUnit: String = \"\",\n    var bpeVocab: String = \"\",\n)\n\ndata class OfflineRecognizerConfig(\n    var featConfig: FeatureConfig = FeatureConfig(),\n    var modelConfig: OfflineModelConfig = OfflineModelConfig(),\n    // var lmConfig: OfflineLMConfig(), // TODO(fangjun): enable it\n    var hr: HomophoneReplacerConfig = HomophoneReplacerConfig(),\n    var decodingMethod: String = \"greedy_search\",\n    var maxActivePaths: Int = 4,\n    var hotwordsFile: String = \"\",\n    var hotwordsScore: Float = 1.5f,\n    var ruleFsts: String = \"\",\n    var ruleFars: String = \"\",\n    var blankPenalty: Float = 0.0f,\n)\n\nclass OfflineRecognizer(\n    assetManager: AssetManager? 
= null,\n    val config: OfflineRecognizerConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(): OfflineStream {\n        val p = createStream(ptr)\n        return OfflineStream(p)\n    }\n\n    fun getResult(stream: OfflineStream): OfflineRecognizerResult {\n        return getResult(stream.ptr)\n    }\n\n    fun decode(stream: OfflineStream) = decode(ptr, stream.ptr)\n\n    fun setConfig(config: OfflineRecognizerConfig) = setConfig(ptr, config)\n\n    private external fun delete(ptr: Long)\n\n    private external fun createStream(ptr: Long): Long\n\n    private external fun setConfig(ptr: Long, config: OfflineRecognizerConfig)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OfflineRecognizerConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OfflineRecognizerConfig,\n    ): Long\n\n    private external fun decode(ptr: Long, streamPtr: Long)\n\n    private external fun getResult(streamPtr: Long): OfflineRecognizerResult\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n\n        @JvmStatic\n        external fun prependAdspLibraryPath(newPath: String) // for qnn\n    }\n}\n\n/*\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models.\n\nWe only add a few here. Please change the following code\nto add your own. 
(It should be straightforward to add a new model\nby following the code)\n\n@param type\n\n0 - csukuangfj/sherpa-onnx-paraformer-zh-2023-09-14 (Chinese)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2023-09-14-chinese\n    int8\n\n1 - icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04 (English)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#icefall-asr-multidataset-pruned-transducer-stateless7-2023-05-04-english\n    encoder int8, decoder/joiner float32\n\n2 - sherpa-onnx-whisper-tiny.en\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html#tiny-en\n    encoder int8, decoder int8\n\n3 - sherpa-onnx-whisper-base.en\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html#tiny-en\n    encoder int8, decoder int8\n\n4 - pkufool/icefall-asr-zipformer-wenetspeech-20230615 (Chinese)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#pkufool-icefall-asr-zipformer-wenetspeech-20230615-chinese\n    encoder/joiner int8, decoder fp32\n\n */\nfun getOfflineModelConfig(type: Int): OfflineModelConfig? 
{\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-paraformer-zh-2023-09-14\"\n            return OfflineModelConfig(\n                paraformer = OfflineParaformerModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"paraformer\",\n            )\n        }\n\n        1 -> {\n            val modelDir = \"icefall-asr-multidataset-pruned_transducer_stateless7-2023-05-04\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-30-avg-4.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-30-avg-4.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-30-avg-4.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        2 -> {\n            val modelDir = \"sherpa-onnx-whisper-tiny.en\"\n            return OfflineModelConfig(\n                whisper = OfflineWhisperModelConfig(\n                    encoder = \"$modelDir/tiny.en-encoder.int8.onnx\",\n                    decoder = \"$modelDir/tiny.en-decoder.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tiny.en-tokens.txt\",\n                modelType = \"whisper\",\n            )\n        }\n\n        3 -> {\n            val modelDir = \"sherpa-onnx-whisper-base.en\"\n            return OfflineModelConfig(\n                whisper = OfflineWhisperModelConfig(\n                    encoder = \"$modelDir/base.en-encoder.int8.onnx\",\n                    decoder = \"$modelDir/base.en-decoder.int8.onnx\",\n                ),\n                tokens = \"$modelDir/base.en-tokens.txt\",\n                modelType = \"whisper\",\n            )\n        }\n\n\n        4 -> {\n            val modelDir = 
\"icefall-asr-zipformer-wenetspeech-20230615\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-12-avg-4.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-12-avg-4.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-12-avg-4.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        5 -> {\n            val modelDir = \"sherpa-onnx-zipformer-multi-zh-hans-2023-9-2\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-20-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-20-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-20-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        6 -> {\n            val modelDir = \"sherpa-onnx-nemo-ctc-en-citrinet-512\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        7 -> {\n            val modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-be-de-en-es-fr-hr-it-pl-ru-uk-20k\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        8 -> {\n            val modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-en-24500\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n       
             model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9 -> {\n            val modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-en-de-es-fr-14288\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        10 -> {\n            val modelDir = \"sherpa-onnx-nemo-fast-conformer-ctc-es-1424\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        11 -> {\n            val modelDir = \"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04\"\n            return OfflineModelConfig(\n                teleSpeech = \"$modelDir/model.int8.onnx\",\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"telespeech_ctc\",\n            )\n        }\n\n        12 -> {\n            val modelDir = \"sherpa-onnx-zipformer-thai-2024-06-20\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-12-avg-5.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-12-avg-5.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-12-avg-5.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        13 -> {\n            val modelDir = \"sherpa-onnx-zipformer-korean-2024-06-24\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = 
\"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        14 -> {\n            val modelDir = \"sherpa-onnx-paraformer-zh-small-2024-03-09\"\n            return OfflineModelConfig(\n                paraformer = OfflineParaformerModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"paraformer\",\n            )\n        }\n\n        15 -> {\n            val modelDir = \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2024-07-17\"\n            return OfflineModelConfig(\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        16 -> {\n            val modelDir = \"sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        17 -> {\n            val modelDir = \"sherpa-onnx-zipformer-ru-2024-09-18\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n            
        joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        18 -> {\n            val modelDir = \"sherpa-onnx-small-zipformer-ru-2024-09-18\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        19 -> {\n            val modelDir = \"sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        20 -> {\n            val modelDir = \"sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        21 -> {\n            val modelDir = \"sherpa-onnx-moonshine-tiny-en-int8\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    preprocessor = \"$modelDir/preprocess.onnx\",\n                    encoder = \"$modelDir/encode.int8.onnx\",\n                    uncachedDecoder = \"$modelDir/uncached_decode.int8.onnx\",\n    
                cachedDecoder = \"$modelDir/cached_decode.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        22 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-en-int8\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    preprocessor = \"$modelDir/preprocess.onnx\",\n                    encoder = \"$modelDir/encode.int8.onnx\",\n                    uncachedDecoder = \"$modelDir/uncached_decode.int8.onnx\",\n                    cachedDecoder = \"$modelDir/cached_decode.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        23 -> {\n            val modelDir = \"sherpa-onnx-zipformer-zh-en-2023-11-22\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-34-avg-19.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-34-avg-19.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-34-avg-19.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        24 -> {\n            val modelDir = \"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\"\n            return OfflineModelConfig(\n                fireRedAsr = OfflineFireRedAsrModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        25 -> {\n            val modelDir = \"sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\"\n            return OfflineModelConfig(\n                dolphin = OfflineDolphinModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n          
      ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        26 -> {\n            val modelDir = \"sherpa-onnx-zipformer-vi-int8-2025-04-20\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-12-avg-8.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-12-avg-8.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-12-avg-8.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        27 -> {\n            val modelDir = \"sherpa-onnx-nemo-ctc-giga-am-v2-russian-2025-04-19\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        28 -> {\n            val modelDir = \"sherpa-onnx-nemo-transducer-giga-am-v2-russian-2025-04-19\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        29 -> {\n            val modelDir = \"sherpa-onnx-zipformer-ru-int8-2025-04-20\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n    
            modelType = \"transducer\",\n            )\n        }\n\n        30 -> {\n            val modelDir = \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        31 -> {\n            val modelDir = \"sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\"\n            return OfflineModelConfig(\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        32 -> {\n            val modelDir = \"sherpa-onnx-nemo-canary-180m-flash-en-es-de-fr-int8\"\n            return OfflineModelConfig(\n                canary = OfflineCanaryModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    srcLang = \"en\",\n                    tgtLang = \"en\",\n                    usePnc = true,\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        33 -> {\n            val modelDir = \"sherpa-onnx-nemo-parakeet_tdt_ctc_110m-en-36000-int8\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        34 -> {\n            val modelDir = \"sherpa-onnx-nemo-parakeet-tdt_ctc-0.6b-ja-35000-int8\"\n            return OfflineModelConfig(\n     
           nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        35 -> {\n            val modelDir = \"sherpa-onnx-nemo-transducer-stt_pt_fastconformer_hybrid_large_pc-int8\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        36 -> {\n            val modelDir = \"sherpa-onnx-nemo-stt_pt_fastconformer_hybrid_large_pc-int8\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        37 -> {\n            val modelDir = \"sherpa-onnx-nemo-transducer-stt_de_fastconformer_hybrid_large_pc-int8\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        38 -> {\n            val modelDir = \"sherpa-onnx-nemo-stt_de_fastconformer_hybrid_large_pc-int8\"\n            return OfflineModelConfig(\n                nemo = OfflineNemoEncDecCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                
tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        39 -> {\n            val modelDir = \"sherpa-onnx-zipformer-ctc-small-zh-int8-2025-07-16\"\n            return OfflineModelConfig(\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        40 -> {\n            val modelDir = \"sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"nemo_transducer\",\n            )\n        }\n\n        41 -> {\n            val modelDir = \"sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09\"\n            return OfflineModelConfig(\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        42 -> {\n            val modelDir =\n                \"sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10\"\n            return OfflineModelConfig(\n                wenetCtc = OfflineWenetCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        43 -> {\n            val modelDir = \"sherpa-onnx-paraformer-zh-int8-2025-10-07\"\n            return OfflineModelConfig(\n                paraformer = OfflineParaformerModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n   
             tokens = \"$modelDir/tokens.txt\",\n                modelType = \"paraformer\",\n            )\n        }\n\n        44 -> {\n            val modelDir = \"sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12\"\n            return OfflineModelConfig(\n                omnilingual = OfflineOmnilingualAsrCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        45 -> {\n            val modelDir = \"sherpa-onnx-medasr-ctc-en-int8-2025-12-25\"\n            return OfflineModelConfig(\n                medasr = OfflineMedAsrCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        46 -> {\n            val modelDir = \"sherpa-onnx-funasr-nano-int8-2025-12-30\"\n            return OfflineModelConfig(\n                funasrNano = OfflineFunAsrNanoModelConfig(\n                    encoderAdaptor = \"$modelDir/encoder_adaptor.int8.onnx\",\n                    llm = \"$modelDir/llm.int8.onnx\",\n                    embedding = \"$modelDir/embedding.int8.onnx\",\n                    tokenizer = \"$modelDir/Qwen3-0.6B\",\n                ),\n                tokens = \"\",\n            )\n        }\n\n        47 -> {\n            val modelDir = \"sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-int8-2026-02-03\"\n            return OfflineModelConfig(\n                wenetCtc = OfflineWenetCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        48 -> {\n            val modelDir = \"sherpa-onnx-wenetspeech-wu-u2pp-conformer-ctc-zh-2026-02-03\"\n            return OfflineModelConfig(\n                wenetCtc = OfflineWenetCtcModelConfig(\n                    model = 
\"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        49 -> {\n            val modelDir = \"sherpa-onnx-zipformer-vi-30M-int8-2026-02-09\"\n            return OfflineModelConfig(\n                transducer = OfflineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"transducer\",\n            )\n        }\n\n        50 -> {\n            val modelDir = \"sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\"\n            return OfflineModelConfig(\n                fireRedAsrCtc = OfflineFireRedAsrCtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        51 -> {\n            val modelDir = \"sherpa-onnx-moonshine-tiny-ko-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        52 -> {\n            val modelDir = \"sherpa-onnx-moonshine-tiny-ja-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        53 -> {\n            val modelDir = \"sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\"\n    
        return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        54 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-zh-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        55 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-vi-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        56 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-uk-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        57 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-ja-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n         
       ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        58 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-es-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        59 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-en-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        60 -> {\n            val modelDir = \"sherpa-onnx-moonshine-base-ar-quantized-2026-02-27\"\n            return OfflineModelConfig(\n                moonshine = OfflineMoonshineModelConfig(\n                    encoder = \"$modelDir/encoder_model.ort\",\n                    mergedDecoder = \"$modelDir/decoder_model_merged.ort\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9000 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-5-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        // Please copy libQnnHtp.so and libQnnSystem.so to jniLibs/arm64-v8a by yourself\n                        //\n                        // model.bin is created in the first run 
and is used from the second run\n                        // to speed up the initialization\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9001 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-8-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9002 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-10-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9003 -> {\n            val modelDir =\n                
\"sherpa-onnx-qnn-13-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9004 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-15-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9005 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-18-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n   
         )\n        }\n\n        9006 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-20-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9007 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-23-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9008 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-25-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n           
         ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9009 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-28-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9010 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-30-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9011 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-5-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = 
\"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9012 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-8-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9013 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-10-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9014 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-13-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = 
\"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9015 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-15-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9016 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-18-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9017 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-20-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                
zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9018 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-23-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9019 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-25-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9020 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-28-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return 
OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9021 -> {\n            val modelDir =\n                \"sherpa-onnx-qnn-30-seconds-zipformer-ctc-zh-2025-07-03-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                zipformerCtc = OfflineZipformerCtcModelConfig(\n                    model = \"$modelDir/libmodel.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        9022 -> {\n            // for Xiaomi 17 Pro\n            val modelDir =\n                \"sherpa-onnx-qnn-SM8850-binary-10-seconds-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                senseVoice = OfflineSenseVoiceModelConfig(\n                    qnnConfig = QnnConfig(\n                        // Please copy libQnnHtp.so and libQnnSystem.so to jniLibs/arm64-v8a by yourself\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/model.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n    
        )\n        }\n\n        9023 -> {\n            val modelDir = \"sherpa-onnx-qnn-5-seconds-paraformer-zh-2023-03-28-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                paraformer = OfflineParaformerModelConfig(\n                    model = \"$modelDir/libencoder.so,$modelDir/libpredictor.so,$modelDir/libdecoder.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        // The following three *.bin files are generated during the first run\n                        // and are used to replace the corresponding *.so files in later runs\n                        contextBinary = \"$modelDir/encoder.bin,$modelDir/predictor.bin,$modelDir/decoder.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n        9024 -> {\n            val modelDir = \"sherpa-onnx-qnn-5-seconds-paraformer-zh-2025-10-07-int8-android-aarch64\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                paraformer = OfflineParaformerModelConfig(\n                    model = \"$modelDir/libencoder.so,$modelDir/libpredictor.so,$modelDir/libdecoder.so\",\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        // The following three *.bin files are generated during the first run\n                        // and are used to replace the corresponding *.so files in later runs\n                        contextBinary = \"$modelDir/encoder.bin,$modelDir/predictor.bin,$modelDir/decoder.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n\n      
  9025 -> {\n            // for Xiaomi 17 Pro\n            val modelDir = \"sherpa-onnx-qnn-SM8850-binary-5-seconds-paraformer-zh-2023-03-28-int8\"\n            return OfflineModelConfig(\n                provider = \"qnn\",\n                paraformer = OfflineParaformerModelConfig(\n                    qnnConfig = QnnConfig(\n                        backendLib = \"libQnnHtp.so\",\n                        systemLib = \"libQnnSystem.so\",\n                        contextBinary = \"$modelDir/encoder.bin,$modelDir/predictor.bin,$modelDir/decoder.bin\",\n                    ),\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                debug = true,\n            )\n        }\n    }\n    return null\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OfflineSpeakerDiarization.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflineSpeakerSegmentationPyannoteModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineSpeakerSegmentationModelConfig(\n    var pyannote: OfflineSpeakerSegmentationPyannoteModelConfig = OfflineSpeakerSegmentationPyannoteModelConfig(),\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\ndata class FastClusteringConfig(\n    var numClusters: Int = -1,\n    var threshold: Float = 0.5f,\n)\n\ndata class OfflineSpeakerDiarizationConfig(\n    var segmentation: OfflineSpeakerSegmentationModelConfig = OfflineSpeakerSegmentationModelConfig(),\n    var embedding: SpeakerEmbeddingExtractorConfig = SpeakerEmbeddingExtractorConfig(),\n    var clustering: FastClusteringConfig = FastClusteringConfig(),\n    var minDurationOn: Float = 0.2f,\n    var minDurationOff: Float = 0.5f,\n)\n\ndata class OfflineSpeakerDiarizationSegment(\n    val start: Float, // in seconds\n    val end: Float, // in seconds\n    val speaker: Int, // ID of the speaker; count from 0\n)\n\nclass OfflineSpeakerDiarization(\n    assetManager: AssetManager? = null,\n    val config: OfflineSpeakerDiarizationConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    // Only config.clustering is used. 
All other fields in config\n    // are ignored\n    fun setConfig(config: OfflineSpeakerDiarizationConfig) = setConfig(ptr, config)\n\n    fun sampleRate() = getSampleRate(ptr)\n\n    fun process(samples: FloatArray) = process(ptr, samples)\n\n    fun processWithCallback(\n        samples: FloatArray,\n        callback: (numProcessedChunks: Int, numTotalChunks: Int, arg: Long) -> Int,\n        arg: Long = 0,\n    ) = processWithCallback(ptr, samples, callback, arg)\n\n    private external fun delete(ptr: Long)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OfflineSpeakerDiarizationConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OfflineSpeakerDiarizationConfig,\n    ): Long\n\n    private external fun setConfig(ptr: Long, config: OfflineSpeakerDiarizationConfig)\n\n    private external fun getSampleRate(ptr: Long): Int\n\n    private external fun process(\n        ptr: Long,\n        samples: FloatArray\n    ): Array<OfflineSpeakerDiarizationSegment>\n\n    private external fun processWithCallback(\n        ptr: Long,\n        samples: FloatArray,\n        callback: (numProcessedChunks: Int, numTotalChunks: Int, arg: Long) -> Int,\n        arg: Long,\n    ): Array<OfflineSpeakerDiarizationSegment>\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OfflineSpeechDenoiser.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflineSpeechDenoiserGtcrnModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineSpeechDenoiserDpdfNetModelConfig(\n    var model: String = \"\",\n)\n\ndata class OfflineSpeechDenoiserModelConfig(\n    var gtcrn: OfflineSpeechDenoiserGtcrnModelConfig = OfflineSpeechDenoiserGtcrnModelConfig(),\n    var dpdfnet: OfflineSpeechDenoiserDpdfNetModelConfig = OfflineSpeechDenoiserDpdfNetModelConfig(),\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\ndata class OfflineSpeechDenoiserConfig(\n    var model: OfflineSpeechDenoiserModelConfig = OfflineSpeechDenoiserModelConfig(),\n)\n\nclass OfflineSpeechDenoiser(\n    assetManager: AssetManager? = null,\n    config: OfflineSpeechDenoiserConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun run(samples: FloatArray, sampleRate: Int) = run(ptr, samples, sampleRate)\n\n    val sampleRate\n      get() = getSampleRate(ptr)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OfflineSpeechDenoiserConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OfflineSpeechDenoiserConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n\n    private external fun run(ptr: Long, samples: FloatArray, sampleRate: Int): DenoisedAudio\n\n    private external fun getSampleRate(ptr: Long): Int\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OfflineStream.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nclass OfflineStream(var ptr: Long) {\n    fun acceptWaveform(samples: FloatArray, sampleRate: Int) =\n        acceptWaveform(ptr, samples, sampleRate)\n\n    fun setOption(key: String, value: String) = setOption(ptr, key, value)\n\n    fun getOption(key: String): String = getOption(ptr, key)\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun use(block: (OfflineStream) -> Unit) {\n        try {\n            block(this)\n        } finally {\n            release()\n        }\n    }\n\n    private external fun acceptWaveform(ptr: Long, samples: FloatArray, sampleRate: Int)\n    private external fun setOption(ptr: Long, key: String, value: String)\n    private external fun getOption(ptr: Long, key: String): String\n    private external fun delete(ptr: Long)\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OnlinePunctuation.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OnlinePunctuationModelConfig(\n    var cnnBilstm: String = \"\",\n    var bpeVocab: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\n\ndata class OnlinePunctuationConfig(\n    var model: OnlinePunctuationModelConfig,\n)\n\nclass OnlinePunctuation(\n    assetManager: AssetManager? = null,\n    config: OnlinePunctuationConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun addPunctuation(text: String) = addPunctuation(ptr, text)\n\n    private external fun delete(ptr: Long)\n\n    private external fun addPunctuation(ptr: Long, text: String): String\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OnlinePunctuationConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OnlinePunctuationConfig,\n    ): Long\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OnlineRecognizer.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class EndpointRule(\n    var mustContainNonSilence: Boolean,\n    var minTrailingSilence: Float,\n    var minUtteranceLength: Float,\n)\n\ndata class EndpointConfig(\n    var rule1: EndpointRule = EndpointRule(false, 2.4f, 0.0f),\n    var rule2: EndpointRule = EndpointRule(true, 1.4f, 0.0f),\n    var rule3: EndpointRule = EndpointRule(false, 0.0f, 20.0f)\n)\n\ndata class OnlineTransducerModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var joiner: String = \"\",\n)\n\ndata class OnlineParaformerModelConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n)\n\ndata class OnlineZipformer2CtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OnlineNeMoCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OnlineToneCtcModelConfig(\n    var model: String = \"\",\n)\n\ndata class OnlineModelConfig(\n    var transducer: OnlineTransducerModelConfig = OnlineTransducerModelConfig(),\n    var paraformer: OnlineParaformerModelConfig = OnlineParaformerModelConfig(),\n    var zipformer2Ctc: OnlineZipformer2CtcModelConfig = OnlineZipformer2CtcModelConfig(),\n    var neMoCtc: OnlineNeMoCtcModelConfig = OnlineNeMoCtcModelConfig(),\n    var toneCtc: OnlineToneCtcModelConfig = OnlineToneCtcModelConfig(),\n    var tokens: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n    var modelType: String = \"\",\n    var modelingUnit: String = \"\",\n    var bpeVocab: String = \"\",\n)\n\ndata class OnlineLMConfig(\n    var model: String = \"\",\n    var scale: Float = 0.5f,\n)\n\ndata class OnlineCtcFstDecoderConfig(\n    var graph: String = \"\",\n    var maxActive: Int = 3000,\n)\n\ndata class OnlineRecognizerConfig(\n    var featConfig: FeatureConfig = FeatureConfig(),\n    var modelConfig: OnlineModelConfig = OnlineModelConfig(),\n    var lmConfig: 
OnlineLMConfig = OnlineLMConfig(),\n    var ctcFstDecoderConfig: OnlineCtcFstDecoderConfig = OnlineCtcFstDecoderConfig(),\n    var hr: HomophoneReplacerConfig = HomophoneReplacerConfig(),\n    var endpointConfig: EndpointConfig = EndpointConfig(),\n    var enableEndpoint: Boolean = true,\n    var decodingMethod: String = \"greedy_search\",\n    var maxActivePaths: Int = 4,\n    var hotwordsFile: String = \"\",\n    var hotwordsScore: Float = 1.5f,\n    var ruleFsts: String = \"\",\n    var ruleFars: String = \"\",\n    var blankPenalty: Float = 0.0f,\n)\n\ndata class OnlineRecognizerResult(\n    val text: String,\n    val tokens: Array<String>,\n    val timestamps: FloatArray,\n    val ysProbs: FloatArray,\n    // TODO(fangjun): Add more fields\n)\n\nclass OnlineRecognizer(\n    assetManager: AssetManager? = null,\n    val config: OnlineRecognizerConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(hotwords: String = \"\"): OnlineStream {\n        val p = createStream(ptr, hotwords)\n        return OnlineStream(p)\n    }\n\n    fun reset(stream: OnlineStream) = reset(ptr, stream.ptr)\n    fun decode(stream: OnlineStream) = decode(ptr, stream.ptr)\n    fun isEndpoint(stream: OnlineStream) = isEndpoint(ptr, stream.ptr)\n    fun isReady(stream: OnlineStream) = isReady(ptr, stream.ptr)\n    fun getResult(stream: OnlineStream): OnlineRecognizerResult {\n        return getResult(ptr, stream.ptr)\n    }\n\n    private external fun delete(ptr: Long)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OnlineRecognizerConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: 
OnlineRecognizerConfig,\n    ): Long\n\n    private external fun createStream(ptr: Long, hotwords: String): Long\n    private external fun reset(ptr: Long, streamPtr: Long)\n    private external fun decode(ptr: Long, streamPtr: Long)\n    private external fun isEndpoint(ptr: Long, streamPtr: Long): Boolean\n    private external fun isReady(ptr: Long, streamPtr: Long): Boolean\n    private external fun getResult(ptr: Long, streamPtr: Long): OnlineRecognizerResult\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n\n/*\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models.\n\nWe only add a few here. Please change the following code\nto add your own. (It should be straightforward to add a new model\nby following the code)\n\n@param type\n0 - sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n\n1 - csukuangfj/sherpa-onnx-lstm-zh-2023-02-20 (Chinese)\n\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/lstm-transducer-models.html#csukuangfj-sherpa-onnx-lstm-zh-2023-02-20-chinese\n\n2 - csukuangfj/sherpa-onnx-lstm-en-2023-02-17 (English)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/lstm-transducer-models.html#csukuangfj-sherpa-onnx-lstm-en-2023-02-17-english\n\n3,4 - pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615\n    https://huggingface.co/pkufool/icefall-asr-zipformer-streaming-wenetspeech-20230615\n    3 - int8 encoder\n    4 - float32 encoder\n\n5 - csukuangfj/sherpa-onnx-streaming-paraformer-bilingual-zh-en\n    https://huggingface.co/csukuangfj/sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\n6 - sherpa-onnx-streaming-zipformer-en-2023-06-26\n    
https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-en-2023-06-26\n\n7 - shaojieli/sherpa-onnx-streaming-zipformer-fr-2023-04-14 (French)\n    https://huggingface.co/shaojieli/sherpa-onnx-streaming-zipformer-fr-2023-04-14\n\n8 - csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n    https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\n    encoder int8, decoder/joiner float32\n\n */\nfun getModelConfig(type: Int): OnlineModelConfig? {\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        1 -> {\n            val modelDir = \"sherpa-onnx-lstm-zh-2023-02-20\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-11-avg-1.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-11-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-11-avg-1.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"lstm\",\n            )\n        }\n\n        2 -> {\n            val modelDir = \"sherpa-onnx-lstm-en-2023-02-17\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n      
              joiner = \"$modelDir/joiner-epoch-99-avg-1.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"lstm\",\n            )\n        }\n\n        3 -> {\n            val modelDir = \"icefall-asr-zipformer-streaming-wenetspeech-20230615\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/exp/encoder-epoch-12-avg-4-chunk-16-left-128.int8.onnx\",\n                    decoder = \"$modelDir/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx\",\n                    joiner = \"$modelDir/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx\",\n                ),\n                tokens = \"$modelDir/data/lang_char/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        4 -> {\n            val modelDir = \"icefall-asr-zipformer-streaming-wenetspeech-20230615\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/exp/encoder-epoch-12-avg-4-chunk-16-left-128.onnx\",\n                    decoder = \"$modelDir/exp/decoder-epoch-12-avg-4-chunk-16-left-128.onnx\",\n                    joiner = \"$modelDir/exp/joiner-epoch-12-avg-4-chunk-16-left-128.onnx\",\n                ),\n                tokens = \"$modelDir/data/lang_char/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        5 -> {\n            val modelDir = \"sherpa-onnx-streaming-paraformer-bilingual-zh-en\"\n            return OnlineModelConfig(\n                paraformer = OnlineParaformerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"paraformer\",\n            )\n        }\n\n        6 -> {\n            val 
modelDir = \"sherpa-onnx-streaming-zipformer-en-2023-06-26\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1-chunk-16-left-128.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1-chunk-16-left-128.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1-chunk-16-left-128.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        7 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-fr-2023-04-14\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-29-avg-9-with-averaged-model.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-29-avg-9-with-averaged-model.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-29-avg-9-with-averaged-model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        8 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        9 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    
encoder = \"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        10 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        11 -> {\n            val modelDir = \"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-80ms\"\n            return OnlineModelConfig(\n                neMoCtc = OnlineNeMoCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        12 -> {\n            val modelDir = \"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-480ms\"\n            return OnlineModelConfig(\n                neMoCtc = OnlineNeMoCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        13 -> {\n            val modelDir = \"sherpa-onnx-nemo-streaming-fast-conformer-ctc-en-1040ms\"\n            return OnlineModelConfig(\n                neMoCtc = OnlineNeMoCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        14 -> {\n     
       val modelDir = \"sherpa-onnx-streaming-zipformer-korean-2024-06-16\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder-epoch-99-avg-1.int8.onnx\",\n                    decoder = \"$modelDir/decoder-epoch-99-avg-1.onnx\",\n                    joiner = \"$modelDir/joiner-epoch-99-avg-1.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n            )\n        }\n\n        15 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-small-ctc-zh-int8-2025-04-01\"\n            return OnlineModelConfig(\n                zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        16 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-small-ctc-zh-2025-04-01\"\n            return OnlineModelConfig(\n                zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        17 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-ctc-zh-int8-2025-06-30\"\n            return OnlineModelConfig(\n                zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                    model = \"$modelDir/model.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        18 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-ctc-zh-2025-06-30\"\n            return OnlineModelConfig(\n                zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n               
 modelType = \"zipformer2\",\n            )\n        }\n\n        19 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-ctc-zh-fp16-2025-06-30\"\n            return OnlineModelConfig(\n                zipformer2Ctc = OnlineZipformer2CtcModelConfig(\n                    model = \"$modelDir/model.fp16.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        20 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-zh-int8-2025-06-30\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        21 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-en-kroko-2025-08-06\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        22 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-es-kroko-2025-08-06\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = 
\"zipformer2\",\n            )\n        }\n\n        23 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-fr-kroko-2025-08-06\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        24 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-de-kroko-2025-08-06\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        25 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-small-ru-vosk-int8-2025-08-16\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        26 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-small-ru-vosk-2025-08-16\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = 
\"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        27 -> {\n            val modelDir = \"sherpa-onnx-streaming-t-one-russian-2025-09-08\"\n            return OnlineModelConfig(\n                toneCtc = OnlineToneCtcModelConfig(\n                    model = \"$modelDir/model.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        28 -> {\n            val modelDir = \"sherpa-onnx-nemotron-speech-streaming-en-0.6b-int8-2026-01-14\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.int8.onnx\",\n                    decoder = \"$modelDir/decoder.int8.onnx\",\n                    joiner = \"$modelDir/joiner.int8.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n            )\n        }\n\n        29 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-bn-vosk-2026-02-09\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.onnx\",\n                    decoder = \"$modelDir/decoder.onnx\",\n                    joiner = \"$modelDir/joiner.onnx\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer2\",\n            )\n        }\n\n        1000 -> {\n            val modelDir = \"sherpa-onnx-rk3588-streaming-zipformer-bilingual-zh-en-2023-02-20\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.rknn\",\n                    decoder = \"$modelDir/decoder.rknn\",\n                    joiner = \"$modelDir/joiner.rknn\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n     
           modelType = \"zipformer\",\n                provider = \"rknn\",\n            )\n        }\n\n        1001 -> {\n            val modelDir = \"sherpa-onnx-rk3588-streaming-zipformer-small-bilingual-zh-en-2023-02-16\"\n            return OnlineModelConfig(\n                transducer = OnlineTransducerModelConfig(\n                    encoder = \"$modelDir/encoder.rknn\",\n                    decoder = \"$modelDir/decoder.rknn\",\n                    joiner = \"$modelDir/joiner.rknn\",\n                ),\n                tokens = \"$modelDir/tokens.txt\",\n                modelType = \"zipformer\",\n                provider = \"rknn\",\n            )\n        }\n    }\n    return null\n}\n\n/*\nPlease see\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nfor a list of pre-trained models.\n\nWe only add a few here. Please change the following code\nto add your own LM. (Training a new neural-network LM should be straightforward\nif you follow the code at https://github.com/k2-fsa/icefall/blob/master/icefall/rnn_lm/train.py)\n\n@param type\n0 - sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 (Bilingual, Chinese + English)\n    https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\n */\nfun getOnlineLMConfig(type: Int): OnlineLMConfig {\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20\"\n            return OnlineLMConfig(\n                model = \"$modelDir/with-state-epoch-99-avg-1.int8.onnx\",\n                scale = 0.5f,\n            )\n        }\n    }\n    return OnlineLMConfig()\n}\n\nfun getEndpointConfig(): EndpointConfig {\n    return EndpointConfig(\n        rule1 = EndpointRule(false, 2.4f, 0.0f),\n        rule2 = EndpointRule(true, 1.4f, 0.0f),\n        rule3 = EndpointRule(false, 0.0f, 20.0f)\n    )\n}\n\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OnlineSpeechDenoiser.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OnlineSpeechDenoiserConfig(\n    var model: OfflineSpeechDenoiserModelConfig = OfflineSpeechDenoiserModelConfig(),\n)\n\nclass OnlineSpeechDenoiser(\n    assetManager: AssetManager? = null,\n    config: OnlineSpeechDenoiserConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun run(samples: FloatArray, sampleRate: Int) = run(ptr, samples, sampleRate)\n\n    fun flush() = flush(ptr)\n\n    fun reset() = reset(ptr)\n\n    val sampleRate\n      get() = getSampleRate(ptr)\n\n    val frameShiftInSamples\n      get() = getFrameShiftInSamples(ptr)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OnlineSpeechDenoiserConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OnlineSpeechDenoiserConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n\n    private external fun run(ptr: Long, samples: FloatArray, sampleRate: Int): DenoisedAudio\n\n    private external fun flush(ptr: Long): DenoisedAudio\n\n    private external fun reset(ptr: Long)\n\n    private external fun getSampleRate(ptr: Long): Int\n\n    private external fun getFrameShiftInSamples(ptr: Long): Int\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/OnlineStream.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nclass OnlineStream(var ptr: Long = 0) {\n    fun acceptWaveform(samples: FloatArray, sampleRate: Int) =\n        acceptWaveform(ptr, samples, sampleRate)\n\n    fun inputFinished() = inputFinished(ptr)\n\n    fun setOption(key: String, value: String) = setOption(ptr, key, value)\n\n    fun getOption(key: String): String = getOption(ptr, key)\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun use(block: (OnlineStream) -> Unit) {\n        try {\n            block(this)\n        } finally {\n            release()\n        }\n    }\n\n    private external fun acceptWaveform(ptr: Long, samples: FloatArray, sampleRate: Int)\n    private external fun inputFinished(ptr: Long)\n    private external fun setOption(ptr: Long, key: String, value: String)\n    private external fun getOption(ptr: Long, key: String): String\n    private external fun delete(ptr: Long)\n\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/QnnConfig.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\ndata class QnnConfig(\n    var backendLib: String = \"\",\n    var contextBinary: String = \"\",\n    var systemLib: String = \"\",\n)\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/Speaker.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\nimport android.util.Log\n\nclass SpeakerEmbeddingExtractor(\n    assetManager: AssetManager? = null,\n    config: SpeakerEmbeddingExtractorConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(): OnlineStream {\n        val p = createStream(ptr)\n        return OnlineStream(p)\n    }\n\n    fun isReady(stream: OnlineStream) = isReady(ptr, stream.ptr)\n    fun compute(stream: OnlineStream) = compute(ptr, stream.ptr)\n    fun dim() = dim(ptr)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: SpeakerEmbeddingExtractorConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: SpeakerEmbeddingExtractorConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n\n    private external fun createStream(ptr: Long): Long\n\n    private external fun isReady(ptr: Long, streamPtr: Long): Boolean\n\n    private external fun compute(ptr: Long, streamPtr: Long): FloatArray\n\n    private external fun dim(ptr: Long): Int\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\nclass SpeakerEmbeddingManager(val dim: Int) {\n    private var ptr: Long\n\n    init {\n        ptr = create(dim)\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n    fun add(name: String, embedding: FloatArray) = add(ptr, name, embedding)\n    fun add(name: String, embedding: Array<FloatArray>) = addList(ptr, name, embedding)\n    fun 
remove(name: String) = remove(ptr, name)\n    fun search(embedding: FloatArray, threshold: Float) = search(ptr, embedding, threshold)\n    fun verify(name: String, embedding: FloatArray, threshold: Float) =\n        verify(ptr, name, embedding, threshold)\n\n    fun contains(name: String) = contains(ptr, name)\n    fun numSpeakers() = numSpeakers(ptr)\n\n    fun allSpeakerNames() = allSpeakerNames(ptr)\n\n    private external fun create(dim: Int): Long\n    private external fun delete(ptr: Long): Unit\n    private external fun add(ptr: Long, name: String, embedding: FloatArray): Boolean\n    private external fun addList(ptr: Long, name: String, embedding: Array<FloatArray>): Boolean\n    private external fun remove(ptr: Long, name: String): Boolean\n    private external fun search(ptr: Long, embedding: FloatArray, threshold: Float): String\n    private external fun verify(\n        ptr: Long,\n        name: String,\n        embedding: FloatArray,\n        threshold: Float\n    ): Boolean\n\n    private external fun contains(ptr: Long, name: String): Boolean\n    private external fun numSpeakers(ptr: Long): Int\n\n    private external fun allSpeakerNames(ptr: Long): Array<String>\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n// Please download the model file from\n// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n// and put it inside the assets directory.\n//\n// Please don't put it in a subdirectory of assets\nprivate val modelName = \"3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n\nobject SpeakerRecognition {\n    var _extractor: SpeakerEmbeddingExtractor? = null\n    var _manager: SpeakerEmbeddingManager? 
= null\n\n    val extractor: SpeakerEmbeddingExtractor\n        get() {\n            return _extractor!!\n        }\n\n    val manager: SpeakerEmbeddingManager\n        get() {\n            return _manager!!\n        }\n\n    fun initExtractor(assetManager: AssetManager? = null) {\n        synchronized(this) {\n            if (_extractor != null) {\n                return\n            }\n            Log.i(\"sherpa-onnx\", \"Initializing speaker embedding extractor\")\n\n            _extractor = SpeakerEmbeddingExtractor(\n                assetManager = assetManager,\n                config = SpeakerEmbeddingExtractorConfig(\n                    model = modelName,\n                    numThreads = 2,\n                    debug = false,\n                    provider = \"cpu\",\n                )\n            )\n\n            _manager = SpeakerEmbeddingManager(dim = _extractor!!.dim())\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/SpeakerEmbeddingExtractorConfig.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\ndata class SpeakerEmbeddingExtractorConfig(\n    val model: String = \"\",\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/SpokenLanguageIdentification.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class SpokenLanguageIdentificationWhisperConfig(\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var tailPaddings: Int = -1,\n)\n\ndata class SpokenLanguageIdentificationConfig(\n    var whisper: SpokenLanguageIdentificationWhisperConfig = SpokenLanguageIdentificationWhisperConfig(),\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\nclass SpokenLanguageIdentification(\n    assetManager: AssetManager? = null,\n    config: SpokenLanguageIdentificationConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun createStream(): OfflineStream {\n        val p = createStream(ptr)\n        return OfflineStream(p)\n    }\n\n    fun compute(stream: OfflineStream) = compute(ptr, stream.ptr)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: SpokenLanguageIdentificationConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: SpokenLanguageIdentificationConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n\n    private external fun createStream(ptr: Long): Long\n\n    private external fun compute(ptr: Long, streamPtr: Long): String\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/spolken-language-identification/pretrained_models.html#whisper\n// to download more models\nfun getSpokenLanguageIdentificationConfig(\n    type: Int,\n    numThreads: Int = 1\n): 
SpokenLanguageIdentificationConfig? {\n    when (type) {\n        0 -> {\n            val modelDir = \"sherpa-onnx-whisper-tiny\"\n            return SpokenLanguageIdentificationConfig(\n                whisper = SpokenLanguageIdentificationWhisperConfig(\n                    encoder = \"$modelDir/tiny-encoder.int8.onnx\",\n                    decoder = \"$modelDir/tiny-decoder.int8.onnx\",\n                ),\n                numThreads = numThreads,\n                debug = true,\n            )\n        }\n\n        1 -> {\n            val modelDir = \"sherpa-onnx-whisper-base\"\n            return SpokenLanguageIdentificationConfig(\n                whisper = SpokenLanguageIdentificationWhisperConfig(\n                    encoder = \"$modelDir/base-encoder.int8.onnx\",\n                    decoder = \"$modelDir/base-decoder.int8.onnx\",\n                ),\n                numThreads = numThreads,\n                debug = true,\n            )\n        }\n    }\n    return null\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/Tts.kt",
    "content": "// Copyright (c)  2023  Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class OfflineTtsVitsModelConfig(\n    var model: String = \"\",\n    var lexicon: String = \"\",\n    var tokens: String = \"\",\n    var dataDir: String = \"\",\n    var dictDir: String = \"\", // unused\n    var noiseScale: Float = 0.667f,\n    var noiseScaleW: Float = 0.8f,\n    var lengthScale: Float = 1.0f,\n)\n\ndata class OfflineTtsMatchaModelConfig(\n    var acousticModel: String = \"\",\n    var vocoder: String = \"\",\n    var lexicon: String = \"\",\n    var tokens: String = \"\",\n    var dataDir: String = \"\",\n    var dictDir: String = \"\", // unused\n    var noiseScale: Float = 1.0f,\n    var lengthScale: Float = 1.0f,\n)\n\ndata class OfflineTtsKokoroModelConfig(\n    var model: String = \"\",\n    var voices: String = \"\",\n    var tokens: String = \"\",\n    var dataDir: String = \"\",\n    var lexicon: String = \"\",\n    var lang: String = \"\",\n    var dictDir: String = \"\", // unused\n    var lengthScale: Float = 1.0f,\n)\n\ndata class OfflineTtsZipVoiceModelConfig(\n    var tokens: String = \"\",\n    var encoder: String = \"\",\n    var decoder: String = \"\",\n    var vocoder: String = \"\",\n    var dataDir: String = \"\",\n    var lexicon: String = \"\",\n    var featScale: Float = 0.1f,\n    var tShift: Float = 0.5f,\n    var targetRms: Float = 0.1f,\n    var guidanceScale: Float = 1.0f,\n)\n\ndata class OfflineTtsKittenModelConfig(\n    var model: String = \"\",\n    var voices: String = \"\",\n    var tokens: String = \"\",\n    var dataDir: String = \"\",\n    var lengthScale: Float = 1.0f,\n)\n\n/**\n * Configuration for Pocket TTS models.\n *\n * See https://k2-fsa.github.io/sherpa/onnx/tts/pocket/index.html for details.\n *\n * @property lmFlow Path to the LM flow model (.onnx)\n * @property lmMain Path to the LM main model (.onnx)\n * @property encoder Path to the encoder model (.onnx)\n 
* @property decoder Path to the decoder model (.onnx)\n * @property textConditioner Path to the text conditioner model (.onnx)\n * @property vocabJson Path to vocabulary JSON file\n * @property tokenScoresJson Path to token scores JSON file\n */\ndata class OfflineTtsPocketModelConfig(\n  var lmFlow: String = \"\",\n  var lmMain: String = \"\",\n  var encoder: String = \"\",\n  var decoder: String = \"\",\n  var textConditioner: String = \"\",\n  var vocabJson: String = \"\",\n  var tokenScoresJson: String = \"\",\n  var voiceEmbeddingCacheCapacity: Int = 50,\n)\n\ndata class OfflineTtsSupertonicModelConfig(\n  var durationPredictor: String = \"\",\n  var textEncoder: String = \"\",\n  var vectorEstimator: String = \"\",\n  var vocoder: String = \"\",\n  var ttsJson: String = \"\",\n  var unicodeIndexer: String = \"\",\n  var voiceStyle: String = \"\",\n)\n\ndata class OfflineTtsModelConfig(\n    var vits: OfflineTtsVitsModelConfig = OfflineTtsVitsModelConfig(),\n    var matcha: OfflineTtsMatchaModelConfig = OfflineTtsMatchaModelConfig(),\n    var kokoro: OfflineTtsKokoroModelConfig = OfflineTtsKokoroModelConfig(),\n    var zipvoice: OfflineTtsZipVoiceModelConfig = OfflineTtsZipVoiceModelConfig(),\n    var kitten: OfflineTtsKittenModelConfig = OfflineTtsKittenModelConfig(),\n    var pocket: OfflineTtsPocketModelConfig = OfflineTtsPocketModelConfig(),\n    var supertonic: OfflineTtsSupertonicModelConfig = OfflineTtsSupertonicModelConfig(),\n\n    var numThreads: Int = 1,\n    var debug: Boolean = false,\n    var provider: String = \"cpu\",\n)\n\ndata class OfflineTtsConfig(\n    var model: OfflineTtsModelConfig = OfflineTtsModelConfig(),\n    var ruleFsts: String = \"\",\n    var ruleFars: String = \"\",\n    var maxNumSentences: Int = 1,\n    var silenceScale: Float = 0.2f,\n)\n\nclass GeneratedAudio(\n    val samples: FloatArray,\n    val sampleRate: Int,\n) {\n    fun save(filename: String) =\n        saveImpl(filename = filename, samples = samples, sampleRate = 
sampleRate)\n\n    private external fun saveImpl(\n        filename: String,\n        samples: FloatArray,\n        sampleRate: Int\n    ): Boolean\n}\n\ndata class GenerationConfig(\n    var silenceScale: Float = 0.2f,\n    var speed: Float = 1.0f,\n    var sid: Int = 0,\n    var referenceAudio: FloatArray? = null,\n    var referenceSampleRate: Int = 0,\n    var referenceText: String? = null,\n    var numSteps: Int = 5,\n    var extra: Map<String, String>? = null\n)\n\nclass OfflineTts(\n    assetManager: AssetManager? = null,\n    var config: OfflineTtsConfig,\n) {\n    private var ptr: Long\n\n    init {\n        ptr = if (assetManager != null) {\n            newFromAsset(assetManager, config)\n        } else {\n            newFromFile(config)\n        }\n    }\n\n    fun sampleRate() = getSampleRate(ptr)\n\n    fun numSpeakers() = getNumSpeakers(ptr)\n\n    fun generate(\n        text: String,\n        sid: Int = 0,\n        speed: Float = 1.0f\n    ): GeneratedAudio {\n        return generateImpl(ptr, text = text, sid = sid, speed = speed)\n    }\n\n    fun generateWithCallback(\n        text: String,\n        sid: Int = 0,\n        speed: Float = 1.0f,\n        callback: (samples: FloatArray) -> Int\n    ): GeneratedAudio {\n        return generateWithCallbackImpl(\n            ptr,\n            text = text,\n            sid = sid,\n            speed = speed,\n            callback = callback\n        )\n    }\n\n    fun generateWithConfig(\n      text: String,\n      config: GenerationConfig\n    ): GeneratedAudio {\n        return generateWithConfigImpl(ptr, text, config, null)\n    }\n\n    fun generateWithConfigAndCallback(\n        text: String,\n        config: GenerationConfig,\n        callback: (samples: FloatArray) -> Int\n    ): GeneratedAudio {\n        return generateWithConfigImpl(ptr, text, config, callback)\n    }\n\n    fun allocate(assetManager: AssetManager? 
= null) {\n        if (ptr == 0L) {\n            ptr = if (assetManager != null) {\n                newFromAsset(assetManager, config)\n            } else {\n                newFromFile(config)\n            }\n        }\n    }\n\n    fun free() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: OfflineTtsConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: OfflineTtsConfig,\n    ): Long\n\n    private external fun delete(ptr: Long)\n    private external fun getSampleRate(ptr: Long): Int\n    private external fun getNumSpeakers(ptr: Long): Int\n\n    // The returned array has two entries:\n    //  - the first entry is an 1-D float array containing audio samples.\n    //    Each sample is normalized to the range [-1, 1]\n    //  - the second entry is the sample rate\n    private external fun generateImpl(\n        ptr: Long,\n        text: String,\n        sid: Int = 0,\n        speed: Float = 1.0f\n    ): GeneratedAudio\n\n    private external fun generateWithCallbackImpl(\n        ptr: Long,\n        text: String,\n        sid: Int = 0,\n        speed: Float = 1.0f,\n        callback: (samples: FloatArray) -> Int\n    ): GeneratedAudio\n\n\n    private external fun generateWithConfigImpl(\n        ptr: Long,\n        text: String,\n        config: GenerationConfig,\n        callback: ((samples: FloatArray) -> Int)?\n    ): GeneratedAudio\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n// please refer to\n// https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/index.html\n// to download models\nfun getOfflineTtsConfig(\n    modelDir: String,\n    modelName: String, // for 
VITS\n    acousticModelName: String, // for Matcha\n    vocoder: String, // for Matcha\n    voices: String, // for Kokoro or kitten\n    lexicon: String,\n    dataDir: String,\n    dictDir: String, // unused\n    ruleFsts: String,\n    ruleFars: String,\n    numThreads: Int? = null,\n    isKitten: Boolean = false\n): OfflineTtsConfig {\n    // For Matcha TTS, please set\n    // acousticModelName, vocoder\n\n    // For Kokoro TTS, please set\n    // modelName, voices\n\n    // For Kitten TTS, please set\n    // modelName, voices, isKitten\n\n    // For VITS, please set\n    // modelName\n\n    val numberOfThreads = if (numThreads != null) {\n        numThreads\n    } else if (voices.isNotEmpty()) {\n        // for Kokoro and Kitten TTS models, we use more threads\n        4\n    } else {\n        2\n    }\n\n    if (modelName.isEmpty() && acousticModelName.isEmpty()) {\n        throw IllegalArgumentException(\"Please specify a TTS model\")\n    }\n\n    if (modelName.isNotEmpty() && acousticModelName.isNotEmpty()) {\n        throw IllegalArgumentException(\"Please specify either a VITS or a Matcha model, but not both\")\n    }\n\n    if (acousticModelName.isNotEmpty() && vocoder.isEmpty()) {\n        throw IllegalArgumentException(\"Please provide vocoder for Matcha TTS\")\n    }\n\n    val vits = if (modelName.isNotEmpty() && voices.isEmpty()) {\n        OfflineTtsVitsModelConfig(\n            model = \"$modelDir/$modelName\",\n            lexicon = \"$modelDir/$lexicon\",\n            tokens = \"$modelDir/tokens.txt\",\n            dataDir = dataDir,\n        )\n    } else {\n        OfflineTtsVitsModelConfig()\n    }\n\n    val matcha = if (acousticModelName.isNotEmpty()) {\n        OfflineTtsMatchaModelConfig(\n            acousticModel = \"$modelDir/$acousticModelName\",\n            vocoder = vocoder,\n            lexicon = \"$modelDir/$lexicon\",\n            tokens = \"$modelDir/tokens.txt\",\n            dataDir = dataDir,\n        )\n    } else {\n        
OfflineTtsMatchaModelConfig()\n    }\n\n    val kokoro = if (voices.isNotEmpty() && !isKitten) {\n        OfflineTtsKokoroModelConfig(\n            model = \"$modelDir/$modelName\",\n            voices = \"$modelDir/$voices\",\n            tokens = \"$modelDir/tokens.txt\",\n            dataDir = dataDir,\n            lexicon = when {\n                lexicon == \"\" -> lexicon\n                \",\" in lexicon -> lexicon\n                else -> \"$modelDir/$lexicon\"\n            },\n        )\n    } else {\n        OfflineTtsKokoroModelConfig()\n    }\n\n    val kitten = if (isKitten) {\n        OfflineTtsKittenModelConfig(\n            model = \"$modelDir/$modelName\",\n            voices = \"$modelDir/$voices\",\n            tokens = \"$modelDir/tokens.txt\",\n            dataDir = dataDir,\n        )\n    } else {\n        OfflineTtsKittenModelConfig()\n    }\n\n    return OfflineTtsConfig(\n        model = OfflineTtsModelConfig(\n            vits = vits,\n            matcha = matcha,\n            kokoro = kokoro,\n            kitten = kitten,\n            numThreads = numberOfThreads,\n            debug = true,\n            provider = \"cpu\",\n        ),\n        ruleFsts = ruleFsts,\n        ruleFars = ruleFars,\n    )\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/Vad.kt",
    "content": "// Copyright (c)  2023  Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class SileroVadModelConfig(\n    var model: String = \"\",\n    var threshold: Float = 0.5F,\n    var minSilenceDuration: Float = 0.25F,\n    var minSpeechDuration: Float = 0.25F,\n    var windowSize: Int = 512,\n    var maxSpeechDuration: Float = 5.0F,\n)\n\ndata class TenVadModelConfig(\n    var model: String = \"\",\n    var threshold: Float = 0.5F,\n    var minSilenceDuration: Float = 0.25F,\n    var minSpeechDuration: Float = 0.25F,\n    var windowSize: Int = 256,\n    var maxSpeechDuration: Float = 5.0F,\n)\n\ndata class VadModelConfig(\n    var sileroVadModelConfig: SileroVadModelConfig = SileroVadModelConfig(),\n    var tenVadModelConfig: TenVadModelConfig = TenVadModelConfig(),\n    var sampleRate: Int = 16000,\n    var numThreads: Int = 1,\n    var provider: String = \"cpu\",\n    var debug: Boolean = false,\n)\n\nclass SpeechSegment(val start: Int, val samples: FloatArray)\n\nclass Vad(\n    assetManager: AssetManager? 
= null,\n    var config: VadModelConfig,\n) {\n    private var ptr: Long\n\n    init {\n        if (assetManager != null) {\n            ptr = newFromAsset(assetManager, config)\n        } else {\n            ptr = newFromFile(config)\n        }\n    }\n\n    protected fun finalize() {\n        if (ptr != 0L) {\n            delete(ptr)\n            ptr = 0\n        }\n    }\n\n    fun release() = finalize()\n\n    fun compute(samples: FloatArray): Float = compute(ptr, samples)\n\n\n    fun acceptWaveform(samples: FloatArray) = acceptWaveform(ptr, samples)\n\n    fun empty(): Boolean = empty(ptr)\n    fun pop() = pop(ptr)\n\n    fun front(): SpeechSegment {\n        return front(ptr)\n    }\n\n    fun clear() = clear(ptr)\n\n    fun isSpeechDetected(): Boolean = isSpeechDetected(ptr)\n\n    fun reset() = reset(ptr)\n\n    fun flush() = flush(ptr)\n\n    private external fun delete(ptr: Long)\n\n    private external fun newFromAsset(\n        assetManager: AssetManager,\n        config: VadModelConfig,\n    ): Long\n\n    private external fun newFromFile(\n        config: VadModelConfig,\n    ): Long\n\n    private external fun acceptWaveform(ptr: Long, samples: FloatArray)\n    private external fun compute(ptr: Long, samples: FloatArray): Float\n\n    private external fun empty(ptr: Long): Boolean\n    private external fun pop(ptr: Long)\n    private external fun clear(ptr: Long)\n    private external fun front(ptr: Long): SpeechSegment\n    private external fun isSpeechDetected(ptr: Long): Boolean\n    private external fun reset(ptr: Long)\n    private external fun flush(ptr: Long)\n\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n\n// Please visit\n// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n// to download silero_vad.onnx\n// and put it inside the assets/\n// directory\n//\n// For ten-vad, please use\n// 
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\n//\nfun getVadModelConfig(type: Int): VadModelConfig? {\n    when (type) {\n        0 -> {\n            return VadModelConfig(\n                sileroVadModelConfig = SileroVadModelConfig(\n                    model = \"silero_vad.onnx\",\n                    threshold = 0.5F,\n                    minSilenceDuration = 0.25F,\n                    minSpeechDuration = 0.25F,\n                    windowSize = 512,\n                ),\n                sampleRate = 16000,\n                numThreads = 1,\n                provider = \"cpu\",\n            )\n        }\n\n        1 -> {\n            return VadModelConfig(\n                tenVadModelConfig = TenVadModelConfig(\n                    model = \"ten-vad.onnx\",\n                    threshold = 0.5F,\n                    minSilenceDuration = 0.25F,\n                    minSpeechDuration = 0.25F,\n                    windowSize = 256,\n                ),\n                sampleRate = 16000,\n                numThreads = 1,\n                provider = \"cpu\",\n            )\n        }\n    }\n    return null\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/VersionInfo.kt",
    "content": "package com.k2fsa.sherpa.onnx\n\nclass VersionInfo {\n    companion object {\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n\n        val version: String\n            get() = getVersionStr2()\n\n        val gitSha1: String\n            get() = getGitSha12()\n\n        val gitDate: String\n            get() = getGitDate2()\n\n        external fun getVersionStr2(): String\n        external fun getGitSha12(): String\n        external fun getGitDate2(): String\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/kotlin-api/WaveReader.kt",
    "content": "// Copyright (c)  2023  Xiaomi Corporation\npackage com.k2fsa.sherpa.onnx\n\nimport android.content.res.AssetManager\n\ndata class WaveData(\n    val samples: FloatArray,\n    val sampleRate: Int,\n) {\n    override fun equals(other: Any?): Boolean {\n        if (this === other) return true\n        if (javaClass != other?.javaClass) return false\n\n        other as WaveData\n\n        if (!samples.contentEquals(other.samples)) return false\n        if (sampleRate != other.sampleRate) return false\n\n        return true\n    }\n\n    override fun hashCode(): Int {\n        var result = samples.contentHashCode()\n        result = 31 * result + sampleRate\n        return result\n    }\n}\n\nclass WaveReader {\n    companion object {\n\n        fun readWave(\n            assetManager: AssetManager,\n            filename: String,\n        ): WaveData {\n            return readWaveFromAsset(assetManager, filename)\n        }\n\n        fun readWave(\n            filename: String,\n        ): WaveData {\n            return readWaveFromFile(filename)\n        }\n\n        // Read a mono wave file asset\n        external fun readWaveFromAsset(\n            assetManager: AssetManager,\n            filename: String,\n        ): WaveData\n\n        // Read a mono wave file from disk\n        external fun readWaveFromFile(\n            filename: String,\n        ): WaveData\n\n        init {\n            System.loadLibrary(\"sherpa-onnx-jni\")\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/pascal-api/README.md",
    "content": "# Introduction\n\nThis directory contains APIs for [Object Pascal](https://en.wikipedia.org/wiki/Object_Pascal).\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/pascal-api-examples\nfor usages.\n\n[portaudio.pas](./portaudio.pas)\nis copied from\nhttps://github.com/UltraStar-Deluxe/USDX/blob/master/src/lib/portaudio/portaudio.pas\n"
  },
  {
    "path": "sherpa-onnx/pascal-api/portaudio.pas",
    "content": "{\nThis file is copied from\nhttps://github.com/UltraStar-Deluxe/USDX/blob/master/src/lib/portaudio/portaudio.pas\n}\n{*\n * $Id: portaudio.h,v 1.7 2007/08/16 20:45:34 richardash1981 Exp $\n * PortAudio Portable Real-Time Audio Library\n * PortAudio API Header File\n * Latest version available at: http://www.portaudio.com/\n *\n * Copyright (c) 1999-2002 Ross Bencina and Phil Burk\n *                                                 \n * Permission is hereby granted, free of charge, to any person obtaining\n * a copy of this software and associated documentation files\n * (the \"Software\"), to deal in the Software without restriction,\n * including without limitation the rights to use, copy, modify, merge,\n * publish, distribute, sublicense, and/or sell copies of the Software,\n * and to permit persons to whom the Software is furnished to do so,\n * subject to the following conditions:\n *\n * The above copyright notice and this permission notice shall be\n * included in all copies or substantial portions of the Software.\n *\n * THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND,\n * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF\n * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.\n * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR\n * ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF\n * CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION\n * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n *}\n\n{*\n * The text above constitutes the entire PortAudio license; however, \n * the PortAudio community also makes the following non-binding requests:\n *\n * Any person wishing to distribute modifications to the Software is\n * requested to send the modifications to the original developer so that\n * they can be incorporated into the canonical version. 
It is also \n * requested that these non-binding requests be included along with the \n * license above.\n *}\n\n{** @file\n @brief The PortAudio API.\n*}\n\nunit portaudio;\n\n{$IFDEF FPC}\n  {$PACKENUM 4}    (* use 4-byte enums *)\n  {$PACKRECORDS C} (* C/C++-compatible record packing *)\n  {$MODE DELPHI }\n{$ELSE}\n  {$MINENUMSIZE 4} (* use 4-byte enums *)\n{$ENDIF}\n\ninterface\n\nuses\n  ctypes;\n\nconst\n{$IF Defined(MSWINDOWS)}\n  LibName = 'portaudio_x86.dll';\n{$ELSEIF Defined(UNIX)}\n  LibName = 'portaudio';\n  {$LINKLIB portaudio}\n{$IFEND}\n\n{** Retrieve the release number of the currently running PortAudio build,\n eg 1900.\n*}\nfunction Pa_GetVersion(): cint; cdecl; external LibName;\n\n\n{** Retrieve a textual description of the current PortAudio build,\n eg \"PortAudio V19-devel 13 October 2002\".\n*}\nfunction Pa_GetVersionText(): PChar; cdecl; external LibName;\n\n\n{** Error codes returned by PortAudio functions.\n Note that with the exception of paNoError, all PaErrorCodes are negative.\n*}\n\ntype TPaError = cint;\ntype TPaErrorCode = {enum}cint; const\n{enum_begin PaErrorCode}\n    paNoError = 0;\n\n    paNotInitialized = -10000;\n    paUnanticipatedHostError                = (paNotInitialized+ 1);\n    paInvalidChannelCount                   = (paNotInitialized+ 2);\n    paInvalidSampleRate                     = (paNotInitialized+ 3);\n    paInvalidDevice                         = (paNotInitialized+ 4);\n    paInvalidFlag                           = (paNotInitialized+ 5);\n    paSampleFormatNotSupported              = (paNotInitialized+ 6);\n    paBadIODeviceCombination                = (paNotInitialized+ 7);\n    paInsufficientMemory                    = (paNotInitialized+ 8);\n    paBufferTooBig                          = (paNotInitialized+ 9);\n    paBufferTooSmall                        = (paNotInitialized+10);\n    paNullCallback                          = (paNotInitialized+11);\n    paBadStreamPtr                          = 
(paNotInitialized+12);\n    paTimedOut                              = (paNotInitialized+13);\n    paInternalError                         = (paNotInitialized+14);\n    paDeviceUnavailable                     = (paNotInitialized+15);\n    paIncompatibleHostApiSpecificStreamInfo = (paNotInitialized+16);\n    paStreamIsStopped                       = (paNotInitialized+17);\n    paStreamIsNotStopped                    = (paNotInitialized+18);\n    paInputOverflowed                       = (paNotInitialized+19);\n    paOutputUnderflowed                     = (paNotInitialized+20);\n    paHostApiNotFound                       = (paNotInitialized+21); // The notes below are from the \n    paInvalidHostApi                        = (paNotInitialized+22); // original file portaudio.h\n    paCanNotReadFromACallbackStream         = (paNotInitialized+23); {**< @todo review error code name *}\n    paCanNotWriteToACallbackStream          = (paNotInitialized+24); {**< @todo review error code name *}\n    paCanNotReadFromAnOutputOnlyStream      = (paNotInitialized+25); {**< @todo review error code name *}\n    paCanNotWriteToAnInputOnlyStream        = (paNotInitialized+26); {**< @todo review error code name *}\n    paIncompatibleStreamHostApi             = (paNotInitialized+27);\n    paBadBufferPtr                          = (paNotInitialized+28);\n{enum_end PaErrorCode}\n\n\n{** Translate the supplied PortAudio error code into a human readable\n message.\n*}\nfunction Pa_GetErrorText( errorCode: TPaError ): PChar; cdecl; external LibName;\n\n\n{** Library initialization function - call this before using PortAudio.\n This function initialises internal data structures and prepares underlying\n host APIs for use.  
With the exception of Pa_GetVersion(), Pa_GetVersionText(),\n and Pa_GetErrorText(), this function MUST be called before using any other\n PortAudio API functions.\n\n If Pa_Initialize() is called multiple times, each successful\n call must be matched with a corresponding call to Pa_Terminate(). \n Pairs of calls to Pa_Initialize()/Pa_Terminate() may overlap, and are not \n required to be fully nested.\n\n Note that if Pa_Initialize() returns an error code, Pa_Terminate() should\n NOT be called.\n\n @return paNoError if successful, otherwise an error code indicating the cause\n of failure.\n\n @see Pa_Terminate\n*}\nfunction Pa_Initialize(): TPaError; cdecl; external LibName;\n\n\n{** Library termination function - call this when finished using PortAudio.\n This function deallocates all resources allocated by PortAudio since it was\n initialized by a call to Pa_Initialize(). In cases where Pa_Initialize() has\n been called multiple times, each call must be matched with a corresponding call\n to Pa_Terminate(). The final matching call to Pa_Terminate() will automatically\n close any PortAudio streams that are still open.\n\n Pa_Terminate() MUST be called before exiting a program which uses PortAudio.\n Failure to do so may result in serious resource leaks, such as audio devices\n not being available until the next reboot.\n\n @return paNoError if successful, otherwise an error code indicating the cause\n of failure.\n \n @see Pa_Initialize\n*}\nfunction Pa_Terminate(): TPaError; cdecl; external LibName;\n\n\n\n{** The type used to refer to audio devices. 
Values of this type usually\n range from 0 to (Pa_GetDeviceCount()-1), and may also take on the paNoDevice\n and paUseHostApiSpecificDeviceSpecification values.\n\n @see Pa_GetDeviceCount, paNoDevice, paUseHostApiSpecificDeviceSpecification\n*}\ntype TPaDeviceIndex = cint;\n\n\n{** A special PaDeviceIndex value indicating that no device is available,\n or should be used.\n\n @see PaDeviceIndex\n*}\nconst paNoDevice = TPaDeviceIndex(-1);\n\n\n{** A special PaDeviceIndex value indicating that the device(s) to be used\n are specified in the host-API-specific stream info structure.\n\n @see PaDeviceIndex\n*}\nconst paUseHostApiSpecificDeviceSpecification = TPaDeviceIndex(-2);\n\n\n{* Host API enumeration mechanism *}\n\n{** The type used to enumerate host APIs at runtime. Values of this type\n range from 0 to (Pa_GetHostApiCount()-1).\n\n @see Pa_GetHostApiCount\n*}\ntype TPaHostApiIndex = cint;\n\n{** Retrieve the number of available host APIs. Even if a host API is\n available it may have no devices available.\n\n @return A non-negative value indicating the number of available host APIs\n or, a PaErrorCode (which are always negative) if PortAudio is not initialized\n or an error is encountered.\n\n @see PaHostApiIndex\n*}\nfunction Pa_GetHostApiCount(): TPaHostApiIndex; cdecl; external LibName;\n\n\n{** Retrieve the index of the default host API. The default host API will be\n the lowest common denominator host API on the current platform and is\n unlikely to provide the best performance.\n\n @return A non-negative value ranging from 0 to (Pa_GetHostApiCount()-1)\n indicating the default host API index or, a PaErrorCode (which are always\n negative) if PortAudio is not initialized or an error is encountered.\n*}\nfunction Pa_GetDefaultHostApi(): TPaHostApiIndex; cdecl; external LibName;\n\n\n{** Unchanging unique identifiers for each supported host API. This type\n is used in the PaHostApiInfo structure. 
The values are guaranteed to be\n unique and to never change, thus allowing code to be written that\n conditionally uses host API specific extensions.\n\n New type ids will be allocated when support for a host API reaches\n \"public alpha\" status, prior to that developers should use the\n paInDevelopment type id.\n\n @see PaHostApiInfo\n*}\ntype TPaHostApiTypeId = {enum}cint; const\n{enum_begin PaHostApiTypeId}\n    paInDevelopment=0; {* use while developing support for a new host API *}\n    paDirectSound=1;\n    paMME=2;\n    paASIO=3;\n    paSoundManager=4;\n    paCoreAudio=5;\n    paOSS=7;\n    paALSA=8;\n    paAL=9;\n    paBeOS=10;\n    paWDMKS=11;\n    paJACK=12;\n    paWASAPI=13;\n    paAudioScienceHPI=14;\n{enum_end PaHostApiTypeId}\n\n{** A structure containing information about a particular host API. *}\n\ntype\n  PPaHostApiInfo = ^TPaHostApiInfo;\n  TPaHostApiInfo = record\n      {** this is struct version 1 *}\n      structVersion: cint;\n      {** The well known unique identifier of this host API @see PaHostApiTypeId *}\n      _type: TPaHostApiTypeId;\n      {** A textual description of the host API for display on user interfaces. *}\n      name: PChar;\n\n      {**  The number of devices belonging to this host API. This field may be\n       used in conjunction with Pa_HostApiDeviceIndexToDeviceIndex() to enumerate\n       all devices for this host API.\n       @see Pa_HostApiDeviceIndexToDeviceIndex\n      *}\n      deviceCount: cint;\n\n      {** The default input device for this host API. The value will be a\n       device index ranging from 0 to (Pa_GetDeviceCount()-1), or paNoDevice\n       if no default input device is available.\n      *}\n      defaultInputDevice: TPaDeviceIndex;\n\n      {** The default output device for this host API. 
The value will be a\n       device index ranging from 0 to (Pa_GetDeviceCount()-1), or paNoDevice\n       if no default output device is available.\n      *}\n      defaultOutputDevice: TPaDeviceIndex;\n  end;\n\n\n{** Retrieve a pointer to a structure containing information about a specific\n host Api.\n\n @param hostApi A valid host API index ranging from 0 to (Pa_GetHostApiCount()-1)\n\n @return A pointer to an immutable PaHostApiInfo structure describing\n a specific host API. If the hostApi parameter is out of range or an error\n is encountered, the function returns NULL.\n\n The returned structure is owned by the PortAudio implementation and must not\n be manipulated or freed. The pointer is only guaranteed to be valid between\n calls to Pa_Initialize() and Pa_Terminate().\n*}\nfunction Pa_GetHostApiInfo( hostApi: TPaHostApiIndex ): PPaHostApiInfo; cdecl; external LibName;\n\n\n{** Convert a static host API unique identifier, into a runtime\n host API index.\n\n @param type A unique host API identifier belonging to the PaHostApiTypeId\n enumeration.\n\n @return A valid PaHostApiIndex ranging from 0 to (Pa_GetHostApiCount()-1) or,\n a PaErrorCode (which are always negative) if PortAudio is not initialized\n or an error is encountered.\n \n The paHostApiNotFound error code indicates that the host API specified by the\n type parameter is not available.\n\n @see PaHostApiTypeId\n*}\nfunction Pa_HostApiTypeIdToHostApiIndex( _type: TPaHostApiTypeId ): TPaHostApiIndex; cdecl; external LibName;\n\n\n{** Convert a host-API-specific device index to standard PortAudio device index.\n This function may be used in conjunction with the deviceCount field of\n PaHostApiInfo to enumerate all devices for the specified host API.\n\n @param hostApi A valid host API index ranging from 0 to (Pa_GetHostApiCount()-1)\n\n @param hostApiDeviceIndex A valid per-host device index in the range\n 0 to (Pa_GetHostApiInfo(hostApi)->deviceCount-1)\n\n @return A non-negative PaDeviceIndex 
ranging from 0 to (Pa_GetDeviceCount()-1)\n or, a PaErrorCode (which are always negative) if PortAudio is not initialized\n or an error is encountered.\n\n A paInvalidHostApi error code indicates that the host API index specified by\n the hostApi parameter is out of range.\n\n A paInvalidDevice error code indicates that the hostApiDeviceIndex parameter\n is out of range.\n \n @see PaHostApiInfo\n*}\nfunction Pa_HostApiDeviceIndexToDeviceIndex( hostApi: TPaHostApiIndex;\n        hostApiDeviceIndex: cint ): TPaDeviceIndex; cdecl; external LibName;\n\n\n\n{** Structure used to return information about a host error condition.\n*}\ntype\n  PPaHostErrorInfo = ^TPaHostErrorInfo;\n  TPaHostErrorInfo = record\n      hostApiType: TPaHostApiTypeId;    {**< the host API which returned the error code *}\n      errorCode: clong;                 {**< the error code returned *}\n      errorText: PChar;                 {**< a textual description of the error if available, otherwise a zero-length string *}\n  end;\n\n\n{** Return information about the last host error encountered. The error\n information returned by Pa_GetLastHostErrorInfo() will never be modified\n asynchronously by errors occurring in other PortAudio owned threads\n (such as the thread that manages the stream callback.)\n\n This function is provided as a last resort, primarily to enhance debugging\n by providing clients with access to all available error information.\n\n @return A pointer to an immutable structure containing information about\n the host error. The values in this structure will only be valid if a\n PortAudio function has previously returned the paUnanticipatedHostError\n error code.\n*}\nfunction Pa_GetLastHostErrorInfo(): PPaHostErrorInfo; cdecl; external LibName;\n\n\n\n{* Device enumeration and capabilities *}\n\n{** Retrieve the number of available devices. 
The number of available devices\n may be zero.\n\n @return A non-negative value indicating the number of available devices or,\n a PaErrorCode (which are always negative) if PortAudio is not initialized\n or an error is encountered.\n*}\nfunction Pa_GetDeviceCount(): TPaDeviceIndex; cdecl; external LibName;\n\n\n{** Retrieve the index of the default input device. The result can be\n used in the inputDevice parameter to Pa_OpenStream().\n\n @return The default input device index for the default host API, or paNoDevice\n if no default input device is available or an error was encountered.\n*}\nfunction Pa_GetDefaultInputDevice(): TPaDeviceIndex; cdecl; external LibName;\n\n\n{** Retrieve the index of the default output device. The result can be\n used in the outputDevice parameter to Pa_OpenStream().\n\n @return The default output device index for the default host API, or paNoDevice\n if no default output device is available or an error was encountered.\n\n @note\n On the PC, the user can specify a default device by\n setting an environment variable. For example, to use device #1.\n<pre>\n set PA_RECOMMENDED_OUTPUT_DEVICE=1\n</pre>\n The user should first determine the available device ids by using\n the supplied application \"pa_devs\".\n*}\nfunction Pa_GetDefaultOutputDevice(): TPaDeviceIndex; cdecl; external LibName;\n\n\n{** The type used to represent monotonic time in seconds that can be used\n for synchronisation. The type is used for the outTime argument to the\n PaStreamCallback and as the result of Pa_GetStreamTime().\n     \n @see PaStreamCallback, Pa_GetStreamTime\n*}\ntype TPaTime = cdouble;\n\n\n{** A type used to specify one or more sample formats. 
Each value indicates\n a possible format for sound data passed to and from the stream callback,\n Pa_ReadStream and Pa_WriteStream.\n\n The standard formats paFloat32, paInt16, paInt32, paInt24, paInt8\n and paUInt8 are usually implemented by all implementations.\n\n The floating point representation (paFloat32) uses +1.0 and -1.0 as the\n maximum and minimum respectively.\n\n paUInt8 is an unsigned 8 bit format where 128 is considered \"ground\"\n\n The paNonInterleaved flag indicates that a multichannel buffer is passed\n as a set of non-interleaved pointers.\n\n @see Pa_OpenStream, Pa_OpenDefaultStream, PaDeviceInfo\n @see paFloat32, paInt16, paInt32, paInt24, paInt8\n @see paUInt8, paCustomFormat, paNonInterleaved\n*}\ntype TPaSampleFormat = culong;\nconst\n  paFloat32        = TPaSampleFormat($00000001); {**< @see PaSampleFormat *}\n  paInt32          = TPaSampleFormat($00000002); {**< @see PaSampleFormat *}\n  paInt24          = TPaSampleFormat($00000004); {**< Packed 24 bit format. @see PaSampleFormat *}\n  paInt16          = TPaSampleFormat($00000008); {**< @see PaSampleFormat *}\n  paInt8           = TPaSampleFormat($00000010); {**< @see PaSampleFormat *}\n  paUInt8          = TPaSampleFormat($00000020); {**< @see PaSampleFormat *}\n  paCustomFormat   = TPaSampleFormat($00010000); {**< @see PaSampleFormat *}\n  paNonInterleaved = TPaSampleFormat($80000000);\n\n{** A structure providing information and capabilities of PortAudio devices.\n Devices may support input, output or both input and output.\n*}\ntype\n  PPaDeviceInfo = ^TPaDeviceInfo;\n  TPaDeviceInfo = record\n      structVersion: cint;  {* this is struct version 2 *}\n      name: PChar;\n      hostApi: TPaHostApiIndex; {* note this is a host API index, not a type id*}\n\n      maxInputChannels: cint;\n      maxOutputChannels: cint;\n\n      {* Default latency values for interactive performance. 
*}\n      defaultLowInputLatency: TPaTime;\n      defaultLowOutputLatency: TPaTime;\n      {* Default latency values for robust non-interactive applications (eg. playing sound files). *}\n      defaultHighInputLatency: TPaTime;\n      defaultHighOutputLatency: TPaTime;\n\n      defaultSampleRate: cdouble;\n  end;\n\n\n{** Retrieve a pointer to a PaDeviceInfo structure containing information\n about the specified device.\n @return A pointer to an immutable PaDeviceInfo structure. If the device\n parameter is out of range the function returns NULL.\n\n @param device A valid device index in the range 0 to (Pa_GetDeviceCount()-1)\n\n @note PortAudio manages the memory referenced by the returned pointer,\n the client must not manipulate or free the memory. The pointer is only\n guaranteed to be valid between calls to Pa_Initialize() and Pa_Terminate().\n\n @see PaDeviceInfo, PaDeviceIndex\n*}\nfunction Pa_GetDeviceInfo( device: TPaDeviceIndex ): PPaDeviceInfo; cdecl; external LibName;\n\n\n{** Parameters for one direction (input or output) of a stream.\n*}\ntype\n  PPaStreamParameters = ^TPaStreamParameters;\n  TPaStreamParameters = record\n      {** A valid device index in the range 0 to (Pa_GetDeviceCount()-1)\n       specifying the device to be used or the special constant\n       paUseHostApiSpecificDeviceSpecification which indicates that the actual\n       device(s) to use are specified in hostApiSpecificStreamInfo.\n       This field must not be set to paNoDevice.\n      *}\n      device: TPaDeviceIndex;\n\n      {** The number of channels of sound to be delivered to the\n       stream callback or accessed by Pa_ReadStream() or Pa_WriteStream().\n       It can range from 1 to the value of maxInputChannels in the\n       PaDeviceInfo record for the device specified by the device parameter.\n      *}\n      channelCount: cint;\n\n      {** The sample format of the buffer provided to the stream callback,\n       Pa_ReadStream() or Pa_WriteStream(). 
It may be any of the formats described\n       by the PaSampleFormat enumeration.\n      *}\n      sampleFormat: TPaSampleFormat;\n\n      {** The desired latency in seconds. Where practical, implementations should\n       configure their latency based on these parameters, otherwise they may\n       choose the closest viable latency instead. Unless the suggested latency\n       is greater than the absolute upper limit for the device implementations\n       should round the suggestedLatency up to the next practical value - ie to\n       provide an equal or higher latency than suggestedLatency wherever possible.\n       Actual latency values for an open stream may be retrieved using the\n       inputLatency and outputLatency fields of the PaStreamInfo structure\n       returned by Pa_GetStreamInfo().\n       @see default*Latency in PaDeviceInfo, *Latency in PaStreamInfo\n      *}\n      suggestedLatency: TPaTime;\n\n      {** An optional pointer to a host api specific data structure\n       containing additional information for device setup and/or stream processing.\n       hostApiSpecificStreamInfo is never required for correct operation,\n       if not used it should be set to NULL.\n      *}\n      hostApiSpecificStreamInfo: Pointer;\n  end;\n\n\n{** Return code for Pa_IsFormatSupported indicating success. *}\nconst paFormatIsSupported = (0);\n\n{** Determine whether it would be possible to open a stream with the specified\n parameters.\n\n @param inputParameters A structure that describes the input parameters used to\n open a stream. The suggestedLatency field is ignored. See PaStreamParameters\n for a description of these parameters. inputParameters must be NULL for\n output-only streams.\n\n @param outputParameters A structure that describes the output parameters used\n to open a stream. The suggestedLatency field is ignored. See PaStreamParameters\n for a description of these parameters. 
outputParameters must be NULL for\n input-only streams.\n\n @param sampleRate The required sampleRate. For full-duplex streams it is the\n sample rate for both input and output\n\n @return Returns 0 if the format is supported, and an error code indicating why\n the format is not supported otherwise. The constant paFormatIsSupported is\n provided to compare with the return value for success.\n\n @see paFormatIsSupported, PaStreamParameters\n*}\nfunction Pa_IsFormatSupported( inputParameters: PPaStreamParameters;\n                              outputParameters: PPaStreamParameters;\n                              sampleRate: cdouble ): TPaError; cdecl; external LibName;\n\n\n\n{* Streaming types and functions *}\n\n\n{**\n A single PaStream can provide multiple channels of real-time\n streaming audio input and output to a client application. A stream\n provides access to audio hardware represented by one or more\n PaDevices. Depending on the underlying Host API, it may be possible \n to open multiple streams using the same device, however this behavior \n is implementation defined. Portable applications should assume that \n a PaDevice may be simultaneously used by at most one PaStream.\n\n Pointers to PaStream objects are passed between PortAudio functions that\n operate on streams.\n\n @see Pa_OpenStream, Pa_OpenDefaultStream, Pa_CloseStream,\n Pa_StartStream, Pa_StopStream, Pa_AbortStream, Pa_IsStreamActive,\n Pa_GetStreamTime, Pa_GetStreamCpuLoad\n\n*}\ntype\n  PPaStream = Pointer;\n\n{** Can be passed as the framesPerBuffer parameter to Pa_OpenStream()\n or Pa_OpenDefaultStream() to indicate that the stream callback will\n accept buffers of any size.\n*}\nconst paFramesPerBufferUnspecified = (0);\n\n\n{** Flags used to control the behavior of a stream. They are passed as\n parameters to Pa_OpenStream or Pa_OpenDefaultStream. 
Multiple flags may be\n ORed together.\n\n @see Pa_OpenStream, Pa_OpenDefaultStream\n @see paNoFlag, paClipOff, paDitherOff, paNeverDropInput,\n  paPrimeOutputBuffersUsingStreamCallback, paPlatformSpecificFlags\n*}\ntype TPaStreamFlags = culong;\n\n{** @see PaStreamFlags *}\nconst   paNoFlag          = TPaStreamFlags(0);\n\n{** Disable default clipping of out of range samples.\n @see PaStreamFlags\n*}\nconst   paClipOff         = TPaStreamFlags($00000001);\n\n{** Disable default dithering.\n @see PaStreamFlags\n*}\nconst   paDitherOff       = TPaStreamFlags($00000002);\n\n{** Flag requests that where possible a full duplex stream will not discard\n overflowed input samples without calling the stream callback. This flag is\n only valid for full duplex callback streams and only when used in combination\n with the paFramesPerBufferUnspecified (0) framesPerBuffer parameter. Using\n this flag incorrectly results in a paInvalidFlag error being returned from\n Pa_OpenStream and Pa_OpenDefaultStream.\n\n @see PaStreamFlags, paFramesPerBufferUnspecified\n*}\nconst   paNeverDropInput  = TPaStreamFlags($00000004);\n\n{** Call the stream callback to fill initial output buffers, rather than the\n default behavior of priming the buffers with zeros (silence). 
This flag has\n no effect for input-only and blocking read/write streams.\n \n @see PaStreamFlags\n*}\nconst   paPrimeOutputBuffersUsingStreamCallback = TPaStreamFlags($00000008);\n\n{** A mask specifying the platform specific bits.\n @see PaStreamFlags\n*}\nconst   paPlatformSpecificFlags = TPaStreamFlags($FFFF0000);\n\n{**\n Timing information for the buffers passed to the stream callback.\n*}\ntype\n  PPaStreamCallbackTimeInfo = ^TPaStreamCallbackTimeInfo;\n  TPaStreamCallbackTimeInfo = record\n      inputBufferAdcTime: TPaTime;\n      currentTime: TPaTime;\n      outputBufferDacTime: TPaTime;\n  end;\n\n\n{**\n Flag bit constants for the statusFlags to PaStreamCallback.\n\n @see paInputUnderflow, paInputOverflow, paOutputUnderflow, paOutputOverflow,\n paPrimingOutput\n*}\ntype TPaStreamCallbackFlags = culong;\n\n{** In a stream opened with paFramesPerBufferUnspecified, indicates that\n input data is all silence (zeros) because no real data is available. In a\n stream opened without paFramesPerBufferUnspecified, it indicates that one or\n more zero samples have been inserted into the input buffer to compensate\n for an input underflow.\n @see PaStreamCallbackFlags\n*}\nconst paInputUnderflow   = TPaStreamCallbackFlags($00000001);\n\n{** In a stream opened with paFramesPerBufferUnspecified, indicates that data\n prior to the first sample of the input buffer was discarded due to an\n overflow, possibly because the stream callback is using too much CPU time.\n Otherwise indicates that data prior to one or more samples in the\n input buffer was discarded.\n @see PaStreamCallbackFlags\n*}\nconst paInputOverflow    = TPaStreamCallbackFlags($00000002);\n\n{** Indicates that output data (or a gap) was inserted, possibly because the\n stream callback is using too much CPU time.\n @see PaStreamCallbackFlags\n*}\nconst paOutputUnderflow  = TPaStreamCallbackFlags($00000004);\n\n{** Indicates that output data will be discarded because no room is available.\n @see 
PaStreamCallbackFlags\n*}\nconst paOutputOverflow   = TPaStreamCallbackFlags($00000008);\n\n{** Some or all of the output data will be used to prime the stream, input\n data may be zero.\n @see PaStreamCallbackFlags\n*}\nconst paPrimingOutput    = TPaStreamCallbackFlags($00000010);\n\n{**\n Allowable return values for the PaStreamCallback.\n @see PaStreamCallback\n*}\ntype TPaStreamCallbackResult = {enum}cint; const\n{enum_begin PaStreamCallbackResult}\n    paContinue=0;\n    paComplete=1;\n    paAbort=2;\n{enum_end PaStreamCallbackResult}\n\n{**\n Functions of type PaStreamCallback are implemented by PortAudio clients.\n They consume, process or generate audio in response to requests from an\n active PortAudio stream.\n     \n @param input and @param output are arrays of interleaved samples,\n the format, packing and number of channels used by the buffers are\n determined by parameters to Pa_OpenStream().\n     \n @param frameCount The number of sample frames to be processed by\n the stream callback.\n\n @param timeInfo The time in seconds when the first sample of the input\n buffer was received at the audio input, the time in seconds when the first\n sample of the output buffer will begin being played at the audio output, and\n the time in seconds when the stream callback was called.\n See also Pa_GetStreamTime()\n\n @param statusFlags Flags indicating whether input and/or output buffers\n have been inserted or will be dropped to overcome underflow or overflow\n conditions.\n\n @param userData The value of a user supplied pointer passed to\n Pa_OpenStream() intended for storing synthesis data etc.\n\n @return\n The stream callback should return one of the values in the\n PaStreamCallbackResult enumeration. To ensure that the callback continues\n to be called, it should return paContinue (0). Either paComplete or paAbort\n can be returned to finish stream processing, after either of these values is\n returned the callback will not be called again. 
If paAbort is returned the\n stream will finish as soon as possible. If paComplete is returned, the stream\n will continue until all buffers generated by the callback have been played.\n This may be useful in applications such as soundfile players where a specific\n duration of output is required. However, it is not necessary to utilise this\n mechanism as Pa_StopStream(), Pa_AbortStream() or Pa_CloseStream() can also\n be used to stop the stream. The callback must always fill the entire output\n buffer irrespective of its return value.\n\n @see Pa_OpenStream, Pa_OpenDefaultStream\n\n @note With the exception of Pa_GetStreamCpuLoad() it is not permissible to call\n PortAudio API functions from within the stream callback.\n*}\ntype\n  PPaStreamCallback = ^TPaStreamCallback;\n  TPaStreamCallback = function(\n      input: Pointer; output: Pointer;\n      frameCount: culong;\n      timeInfo: PPaStreamCallbackTimeInfo;\n      statusFlags: TPaStreamCallbackFlags;\n      userData: Pointer ): cint; cdecl;\n\n\n{** Opens a stream for either input, output or both.\n     \n @param stream The address of a PaStream pointer which will receive\n a pointer to the newly opened stream.\n     \n @param inputParameters A structure that describes the input parameters used by\n the opened stream. See PaStreamParameters for a description of these parameters.\n inputParameters must be NULL for output-only streams.\n\n @param outputParameters A structure that describes the output parameters used by\n the opened stream. See PaStreamParameters for a description of these parameters.\n outputParameters must be NULL for input-only streams.\n \n @param sampleRate The desired sampleRate. 
For full-duplex streams it is the\n sample rate for both input and output\n\n @param framesPerBuffer The number of frames passed to the stream callback\n function, or the preferred block granularity for a blocking read/write stream.\n The special value paFramesPerBufferUnspecified (0) may be used to request that\n the stream callback will receive an optimal (and possibly varying) number of\n frames based on host requirements and the requested latency settings.\n Note: With some host APIs, the use of non-zero framesPerBuffer for a callback\n stream may introduce an additional layer of buffering which could introduce\n additional latency. PortAudio guarantees that the additional latency\n will be kept to the theoretical minimum however, it is strongly recommended\n that a non-zero framesPerBuffer value only be used when your algorithm\n requires a fixed number of frames per stream callback.\n \n @param streamFlags Flags which modify the behaviour of the streaming process.\n This parameter may contain a combination of flags ORed together. Some flags may\n only be relevant to certain buffer formats.\n     \n @param streamCallback A pointer to a client supplied function that is responsible\n for processing and filling input and output buffers. If this parameter is NULL\n the stream will be opened in 'blocking read/write' mode. In blocking mode,\n the client can receive sample data using Pa_ReadStream and write sample data\n using Pa_WriteStream, the number of samples that may be read or written\n without blocking is returned by Pa_GetStreamReadAvailable and\n Pa_GetStreamWriteAvailable respectively.\n\n @param userData A client supplied pointer which is passed to the stream callback\n function. It could for example, contain a pointer to instance data necessary\n for processing the audio buffers. 
This parameter is ignored if streamCallback\n is NULL.\n     \n @return\n Upon success Pa_OpenStream() returns paNoError and places a pointer to a\n valid PaStream in the stream argument. The stream is inactive (stopped).\n If a call to Pa_OpenStream() fails, a non-zero error code is returned (see\n PaError for possible error codes) and the value of stream is invalid.\n\n @see PaStreamParameters, PaStreamCallback, Pa_ReadStream, Pa_WriteStream,\n Pa_GetStreamReadAvailable, Pa_GetStreamWriteAvailable\n*}\nfunction Pa_OpenStream( var stream: PPaStream;\n                       inputParameters: PPaStreamParameters;\n                       outputParameters: PPaStreamParameters;\n                       sampleRate: cdouble;\n                       framesPerBuffer: culong;\n                       streamFlags: TPaStreamFlags;\n                       streamCallback: PPaStreamCallback;\n                       userData: Pointer ): TPaError; cdecl; external LibName;\n\n\n{** A simplified version of Pa_OpenStream() that opens the default input\n and/or output devices.\n\n @param stream The address of a PaStream pointer which will receive\n a pointer to the newly opened stream.\n \n @param numInputChannels  The number of channels of sound that will be supplied\n to the stream callback or returned by Pa_ReadStream. It can range from 1 to\n the value of maxInputChannels in the PaDeviceInfo record for the default input\n device. If 0 the stream is opened as an output-only stream.\n\n @param numOutputChannels The number of channels of sound to be delivered to the\n stream callback or passed to Pa_WriteStream. 
It can range from 1 to the value\n of maxOutputChannels in the PaDeviceInfo record for the default output device.\n If 0 the stream is opened as an input-only stream.\n\n @param sampleFormat The sample format of both the input and output buffers\n provided to the callback or passed to and from Pa_ReadStream and Pa_WriteStream.\n sampleFormat may be any of the formats described by the PaSampleFormat\n enumeration.\n \n @param sampleRate Same as Pa_OpenStream parameter of the same name.\n @param framesPerBuffer Same as Pa_OpenStream parameter of the same name.\n @param streamCallback Same as Pa_OpenStream parameter of the same name.\n @param userData Same as Pa_OpenStream parameter of the same name.\n\n @return As for Pa_OpenStream\n\n @see Pa_OpenStream, PaStreamCallback\n*}\nfunction Pa_OpenDefaultStream( var stream: PPaStream;\n                              numInputChannels: cint;\n                              numOutputChannels: cint;\n                              sampleFormat: TPaSampleFormat;\n                              sampleRate: cdouble;\n                              framesPerBuffer: culong;\n                              streamCallback: PPaStreamCallback;\n                              userData: Pointer ): TPaError; cdecl; external LibName;\n\n\n{** Closes an audio stream. If the audio stream is active it\n discards any pending buffers as if Pa_AbortStream() had been called.\n*}\nfunction Pa_CloseStream( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n{** Functions of type PaStreamFinishedCallback are implemented by PortAudio \n clients. They can be registered with a stream using the Pa_SetStreamFinishedCallback\n function. Once registered they are called when the stream becomes inactive\n (ie once a call to Pa_StopStream() will not block).\n A stream will become inactive after the stream callback returns non-zero,\n or when Pa_StopStream or Pa_AbortStream is called. 
For a stream providing audio\n output, if the stream callback returns paComplete, or Pa_StopStream is called,\n the stream finished callback will not be called until all generated sample data\n has been played.\n \n @param userData The userData parameter supplied to Pa_OpenStream()\n\n @see Pa_SetStreamFinishedCallback\n*}\ntype\n  PPaStreamFinishedCallback = ^TPaStreamFinishedCallback;\n  TPaStreamFinishedCallback = procedure( userData: Pointer ); cdecl;\n\n\n{** Register a stream finished callback function which will be called when the \n stream becomes inactive. See the description of PaStreamFinishedCallback for \n further details about when the callback will be called.\n\n @param stream a pointer to a PaStream that is in the stopped state - if the\n stream is not stopped, the stream's finished callback will remain unchanged \n and an error code will be returned.\n\n @param streamFinishedCallback a pointer to a function with the same signature\n as PaStreamFinishedCallback, that will be called when the stream becomes\n inactive. Passing NULL for this parameter will un-register a previously\n registered stream finished callback function.\n\n @return on success returns paNoError, otherwise an error code indicating the cause\n of the error.\n\n @see PaStreamFinishedCallback\n*}\nfunction Pa_SetStreamFinishedCallback( stream: PPaStream;\n                streamFinishedCallback: PPaStreamFinishedCallback ): TPaError; cdecl; external LibName;\n\n\n{** Commences audio processing.\n*}\nfunction Pa_StartStream( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n{** Terminates audio processing. 
It waits until all pending\n audio buffers have been played before it returns.\n*}\nfunction Pa_StopStream( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n{** Terminates audio processing immediately without waiting for pending\n buffers to complete.\n*}\nfunction Pa_AbortStream( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n{** Determine whether the stream is stopped.\n A stream is considered to be stopped prior to a successful call to\n Pa_StartStream and after a successful call to Pa_StopStream or Pa_AbortStream.\n If a stream callback returns a value other than paContinue the stream is NOT\n considered to be stopped.\n\n @return Returns one (1) when the stream is stopped, zero (0) when\n the stream is running or, a PaErrorCode (which are always negative) if\n PortAudio is not initialized or an error is encountered.\n\n @see Pa_StopStream, Pa_AbortStream, Pa_IsStreamActive\n*}\nfunction Pa_IsStreamStopped( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n{** Determine whether the stream is active.\n A stream is active after a successful call to Pa_StartStream(), until it\n becomes inactive either as a result of a call to Pa_StopStream() or\n Pa_AbortStream(), or as a result of a return value other than paContinue from\n the stream callback. 
In the latter case, the stream is considered inactive\n after the last buffer has finished playing.\n\n @return Returns one (1) when the stream is active (ie playing or recording\n audio), zero (0) when not playing or, a PaErrorCode (which are always negative)\n if PortAudio is not initialized or an error is encountered.\n\n @see Pa_StopStream, Pa_AbortStream, Pa_IsStreamStopped\n*}\nfunction Pa_IsStreamActive( stream: PPaStream ): TPaError; cdecl; external LibName;\n\n\n\n{** A structure containing unchanging information about an open stream.\n @see Pa_GetStreamInfo\n*}\ntype\n  PPaStreamInfo = ^TPaStreamInfo;\n  TPaStreamInfo = record\n      {** this is struct version 1 *}\n      structVersion: cint;\n\n      {** The input latency of the stream in seconds. This value provides the most\n       accurate estimate of input latency available to the implementation. It may\n       differ significantly from the suggestedLatency value passed to Pa_OpenStream().\n       The value of this field will be zero (0.) for output-only streams.\n       @see PaTime\n      *}\n      inputLatency: TPaTime;\n\n      {** The output latency of the stream in seconds. This value provides the most\n       accurate estimate of output latency available to the implementation. It may\n       differ significantly from the suggestedLatency value passed to Pa_OpenStream().\n       The value of this field will be zero (0.) for input-only streams.\n       @see PaTime\n      *}\n      outputLatency: TPaTime;\n\n      {** The sample rate of the stream in Hertz (samples per second). In cases\n       where the hardware sample rate is inaccurate and PortAudio is aware of it,\n       the value of this field may be different from the sampleRate parameter\n       passed to Pa_OpenStream(). 
If information about the actual hardware sample\n       rate is not available, this field will have the same value as the sampleRate\n       parameter passed to Pa_OpenStream().\n      *}\n      sampleRate: cdouble;\n  end;\n\n\n{** Retrieve a pointer to a PaStreamInfo structure containing information\n about the specified stream.\n @return A pointer to an immutable PaStreamInfo structure. If the stream\n parameter is invalid, or an error is encountered, the function returns NULL.\n\n @param stream A pointer to an open stream previously created with Pa_OpenStream.\n\n @note PortAudio manages the memory referenced by the returned pointer,\n the client must not manipulate or free the memory. The pointer is only\n guaranteed to be valid until the specified stream is closed.\n\n @see PaStreamInfo\n*}\nfunction Pa_GetStreamInfo( stream: PPaStream ): PPaStreamInfo; cdecl; external LibName;\n\n\n{** Determine the current time for the stream according to the same clock used\n to generate buffer timestamps. This time may be used for synchronising other\n events to the audio stream, for example synchronizing audio to MIDI.\n                                        \n @return The stream's current time in seconds, or 0 if an error occurred.\n\n @see PaTime, PaStreamCallback\n*}\nfunction Pa_GetStreamTime( stream: PPaStream ): TPaTime; cdecl; external LibName;\n\n\n{** Retrieve CPU usage information for the specified stream.\n The \"CPU Load\" is a fraction of total CPU time consumed by a callback stream's\n audio processing routines including, but not limited to the client supplied\n stream callback. This function does not work with blocking read/write streams.\n\n This function may be called from the stream callback function or the\n application.\n     \n @return\n A floating point value, typically between 0.0 and 1.0, where 1.0 indicates\n that the stream callback is consuming the maximum number of CPU cycles possible\n to maintain real-time operation. 
A value of 0.5 would imply that PortAudio and\n the stream callback were consuming roughly 50% of the available CPU time. The\n return value may exceed 1.0. A value of 0.0 will always be returned for a\n blocking read/write stream, or if an error occurs.\n*}\nfunction Pa_GetStreamCpuLoad( stream: PPaStream ): cdouble; cdecl; external LibName;\n\n\n{** Read samples from an input stream. The function doesn't return until\n the entire buffer has been filled - this may involve waiting for the operating\n system to supply the data.\n\n @param stream A pointer to an open stream previously created with Pa_OpenStream.\n \n @param buffer A pointer to a buffer of sample frames. The buffer contains\n samples in the format specified by the inputParameters->sampleFormat field\n used to open the stream, and the number of channels specified by\n inputParameters->numChannels. If non-interleaved samples were requested,\n buffer is a pointer to the first element of an array of non-interleaved\n buffer pointers, one for each channel.\n\n @param frames The number of frames to be read into buffer. This parameter\n is not constrained to a specific range, however high performance applications\n will want to match this parameter to the framesPerBuffer parameter used\n when opening the stream.\n\n @return On success PaNoError will be returned, or PaInputOverflowed if input\n data was discarded by PortAudio after the previous call and before this call.\n*}\nfunction Pa_ReadStream( stream: PPaStream;\n                       buffer: Pointer;\n                       frames: culong ): TPaError; cdecl; external LibName;\n\n\n{** Write samples to an output stream. This function doesn't return until the\n entire buffer has been consumed - this may involve waiting for the operating\n system to consume the data.\n\n @param stream A pointer to an open stream previously created with Pa_OpenStream.\n\n @param buffer A pointer to a buffer of sample frames. 
The buffer contains\n samples in the format specified by the outputParameters->sampleFormat field\n used to open the stream, and the number of channels specified by\n outputParameters->numChannels. If non-interleaved samples were requested,\n buffer is a pointer to the first element of an array of non-interleaved\n buffer pointers, one for each channel.\n\n @param frames The number of frames to be written from buffer. This parameter\n is not constrained to a specific range, however high performance applications\n will want to match this parameter to the framesPerBuffer parameter used\n when opening the stream.\n\n @return On success PaNoError will be returned, or paOutputUnderflowed if\n additional output data was inserted after the previous call and before this\n call.\n*}\nfunction Pa_WriteStream( stream: PPaStream;\n                        buffer: Pointer;\n                        frames: culong ): TPaError; cdecl; external LibName;\n\n\n{** Retrieve the number of frames that can be read from the stream without\n waiting.\n\n @return Returns a non-negative value representing the maximum number of frames\n that can be read from the stream without blocking or busy waiting or, a\n PaErrorCode (which are always negative) if PortAudio is not initialized or an\n error is encountered.\n*}\nfunction Pa_GetStreamReadAvailable( stream: PPaStream ): cslong; cdecl; external LibName;\n\n\n{** Retrieve the number of frames that can be written to the stream without\n waiting.\n\n @return Returns a non-negative value representing the maximum number of frames\n that can be written to the stream without blocking or busy waiting or, a\n PaErrorCode (which are always negative) if PortAudio is not initialized or an\n error is encountered.\n*}\nfunction Pa_GetStreamWriteAvailable( stream: PPaStream ): cslong; cdecl; external LibName;\n\n\n{** Retrieve the host type handling an open stream.\n\n @return Returns a non-negative value representing the host API type\n handling an open 
stream or, a PaErrorCode (which are always negative)\n if PortAudio is not initialized or an error is encountered.\n*}\nfunction Pa_GetStreamHostApiType( stream: PPaStream ): TPaHostApiTypeId; cdecl; external LibName;\n\n\n{* Miscellaneous utilities *}\n\n\n{** Retrieve the size of a given sample format in bytes.\n\n @return The size in bytes of a single sample in the specified format,\n or paSampleFormatNotSupported if the format is not supported.\n*}\nfunction Pa_GetSampleSize( format: TPaSampleFormat ): TPaError; cdecl; external LibName;\n\n\n{** Put the caller to sleep for at least 'msec' milliseconds. This function is\n provided only as a convenience for authors of portable code (such as the tests\n and examples in the PortAudio distribution.)\n\n The function may sleep longer than requested so don't rely on this for accurate\n musical timing.\n*}\nprocedure Pa_Sleep( msec: clong ); cdecl; external LibName;\n\nimplementation\n\nend.\n"
  },
  {
    "path": "sherpa-onnx/pascal-api/sherpa_onnx.pas",
    "content": "{ Copyright (c)  2024  Xiaomi Corporation\n\nPlease see\nhttps://github.com/k2-fsa/sherpa-onnx/tree/master/pascal-api-examples\nfor how to use APIs in this file.\n}\n\nunit sherpa_onnx;\n\n{$IFDEF FPC}\n  {$mode objfpc}\n  {$modeSwitch advancedRecords} { to support records with methods }\n{$ENDIF}\n\n{$LongStrings ON}\n\ninterface\nuses\n  ctypes;\n\ntype\n  TSherpaOnnxSamplesArray = array of Single;\n\n  TSherpaOnnxLinearResampler = class\n  private\n    Handle: Pointer;\n    InputSampleRate: Integer;\n    OutputSampleRate: Integer;\n  public\n    constructor Create(SampleRateIn: Integer; SampleRateOut: Integer);\n    destructor Destroy; override;\n\n    function Resample(Samples: pcfloat;\n      N: Integer; Flush: Boolean): TSherpaOnnxSamplesArray; overload;\n\n    function Resample(const Samples: array of Single;\n      Flush: Boolean): TSherpaOnnxSamplesArray; overload;\n\n    procedure Reset;\n\n    property GetInputSampleRate: Integer Read InputSampleRate;\n    property GetOutputSampleRate: Integer Read OutputSampleRate;\n  end;\n\n  TSherpaOnnxGeneratedAudioCallbackWithArg = function(\n      Samples: pcfloat; N: cint32;\n      Arg: Pointer): cint32; cdecl;\n\n  TSherpaOnnxGeneratedAudioProgressCallbackWithArg = function(\n      Samples: pcfloat; N: cint32; P: cfloat;\n      Arg: Pointer): cint32; cdecl;\n\n  TSherpaOnnxOfflineTtsVitsModelConfig = record\n    Model: AnsiString;\n    Lexicon: AnsiString;\n    Tokens: AnsiString;\n    DataDir: AnsiString;\n    NoiseScale: Single;\n    NoiseScaleW: Single;\n    LengthScale: Single;\n    DictDir: AnsiString;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsVitsModelConfig);\n  end;\n\n  TSherpaOnnxGenerationConfig = record\n    SilenceScale: Single;\n    Speed: Single;\n    Sid: Integer;\n    ReferenceAudio: array of Single;\n    ReferenceAudioLen: Integer;\n    ReferenceSampleRate: Integer;\n    ReferenceText: 
AnsiString;\n    NumSteps: Integer;\n    Extra: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxGenerationConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsMatchaModelConfig = record\n    AcousticModel: AnsiString;\n    Vocoder: AnsiString;\n    Lexicon: AnsiString;\n    Tokens: AnsiString;\n    DataDir: AnsiString;\n    NoiseScale: Single;\n    LengthScale: Single;\n    DictDir: AnsiString;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsMatchaModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsKokoroModelConfig = record\n    Model: AnsiString;\n    Voices: AnsiString;\n    Tokens: AnsiString;\n    DataDir: AnsiString;\n    LengthScale: Single;\n    DictDir: AnsiString;\n    Lexicon: AnsiString;\n    Lang: AnsiString;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsKokoroModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsKittenModelConfig = record\n    Model: AnsiString;\n    Voices: AnsiString;\n    Tokens: AnsiString;\n    DataDir: AnsiString;\n    LengthScale: Single;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsKittenModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsZipVoiceModelConfig = record\n    Tokens: AnsiString;\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    Vocoder: AnsiString;\n    DataDir: AnsiString;\n    Lexicon: AnsiString;\n    FeatScale: Single;\n    Tshift: Single;\n    TargetRms: Single;\n    GuidanceScale: Single;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsZipVoiceModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsPocketModelConfig = record\n    LmFlow: AnsiString;\n    LmMain: AnsiString;\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    TextConditioner: AnsiString;\n  
  VocabJson: AnsiString;\n    TokenScoresJson: AnsiString;\n    VoiceEmbeddingCacheCapacity: Integer;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsPocketModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsSupertonicModelConfig = record\n    DurationPredictor: AnsiString;\n    TextEncoder: AnsiString;\n    VectorEstimator: AnsiString;\n    Vocoder: AnsiString;\n    TtsJson: AnsiString;\n    UnicodeIndexer: AnsiString;\n    VoiceStyle: AnsiString;\n\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineTtsModelConfig = record\n    Vits: TSherpaOnnxOfflineTtsVitsModelConfig;\n    NumThreads: Integer;\n    Debug: Boolean;\n    Provider: AnsiString;\n    Matcha: TSherpaOnnxOfflineTtsMatchaModelConfig;\n    Kokoro: TSherpaOnnxOfflineTtsKokoroModelConfig;\n    Kitten: TSherpaOnnxOfflineTtsKittenModelConfig;\n    ZipVoice: TSherpaOnnxOfflineTtsZipVoiceModelConfig;\n    Pocket: TSherpaOnnxOfflineTtsPocketModelConfig;\n    Supertonic: TSherpaOnnxOfflineTtsSupertonicModelConfig;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsModelConfig);\n  end;\n\n  TSherpaOnnxOfflineTtsConfig = record\n    Model: TSherpaOnnxOfflineTtsModelConfig;\n    RuleFsts: AnsiString;\n    MaxNumSentences: Integer;\n    RuleFars: AnsiString;\n    SilenceScale: Single;\n\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsConfig);\n  end;\n\n  TSherpaOnnxGeneratedAudio = record\n    Samples: array of Single;\n    SampleRate: Integer;\n  end;\n\n  TSherpaOnnxOfflineTts = class\n  private\n   Handle: Pointer;\n   SampleRate: Integer;\n   NumSpeakers: Integer;\n   _Config: TSherpaOnnxOfflineTtsConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOfflineTtsConfig);\n    destructor Destroy; override;\n\n    function Generate(Text: AnsiString; 
SpeakerId: Integer;\n      Speed: Single): TSherpaOnnxGeneratedAudio; overload;\n\n    function Generate(Text: AnsiString; SpeakerId: Integer;\n      Speed: Single;\n      Callback: TSherpaOnnxGeneratedAudioCallbackWithArg;\n      Arg: Pointer\n      ): TSherpaOnnxGeneratedAudio; overload;\n\n    function Generate(Text: AnsiString;\n      GenerationConfig: TSherpaOnnxGenerationConfig;\n      Callback: TSherpaOnnxGeneratedAudioProgressCallbackWithArg;\n      Arg: Pointer\n      ): TSherpaOnnxGeneratedAudio; overload;\n\n    property GetHandle: Pointer Read Handle;\n    property GetSampleRate: Integer Read SampleRate;\n    property GetNumSpeakers: Integer Read NumSpeakers;\n  end;\n\n  TSherpaOnnxWave = record\n    Samples: array of Single; { normalized to the range [-1, 1] }\n    SampleRate: Integer;\n  end;\n\n  TSherpaOnnxOnlineTransducerModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    Joiner: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineParaformerModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineZipformer2CtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineNemoCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineToneCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineModelConfig = record\n    Transducer: TSherpaOnnxOnlineTransducerModelConfig;\n    Paraformer: TSherpaOnnxOnlineParaformerModelConfig;\n    Zipformer2Ctc: TSherpaOnnxOnlineZipformer2CtcModelConfig;\n    Tokens: AnsiString;\n    NumThreads: Integer;\n    Provider: AnsiString;\n    Debug: Boolean;\n    ModelType: AnsiString;\n    ModelingUnit: AnsiString;\n    BpeVocab: AnsiString;\n    TokensBuf: AnsiString;\n    TokensBufSize: Integer;\n    NemoCtc: 
TSherpaOnnxOnlineNemoCtcModelConfig;\n    ToneCtc: TSherpaOnnxOnlineToneCtcModelConfig;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineModelConfig);\n  end;\n\n  TSherpaOnnxFeatureConfig = record\n    SampleRate: Integer;\n    FeatureDim: Integer;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxFeatureConfig);\n  end;\n\n  TSherpaOnnxOnlineCtcFstDecoderConfig = record\n    Graph: AnsiString;\n    MaxActive: Integer;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineCtcFstDecoderConfig);\n  end;\n\n  TSherpaOnnxHomophoneReplacerConfig = record\n    DictDir: AnsiString;\n    Lexicon: AnsiString;\n    RuleFsts: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineRecognizerConfig = record\n    FeatConfig: TSherpaOnnxFeatureConfig;\n    ModelConfig: TSherpaOnnxOnlineModelConfig;\n    DecodingMethod: AnsiString;\n    MaxActivePaths: Integer;\n    EnableEndpoint: Boolean;\n    Rule1MinTrailingSilence: Single;\n    Rule2MinTrailingSilence: Single;\n    Rule3MinUtteranceLength: Single;\n    HotwordsFile: AnsiString;\n    HotwordsScore: Single;\n    CtcFstDecoderConfig: TSherpaOnnxOnlineCtcFstDecoderConfig;\n    RuleFsts: AnsiString;\n    RuleFars: AnsiString;\n    BlankPenalty: Single;\n    HotwordsBuf: AnsiString;\n    HotwordsBufSize: Integer;\n    Hr: TSherpaOnnxHomophoneReplacerConfig;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineRecognizerConfig);\n  end;\n\n  TSherpaOnnxOnlineRecognizerResult = record\n    Text: AnsiString;\n    Tokens: array of AnsiString;\n    Timestamps: array of Single;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineStream = class\n  private\n   Handle: Pointer;\n  public\n    constructor Create(P: 
Pointer);\n    destructor Destroy; override;\n    procedure AcceptWaveform(const Samples: array of Single; SampleRate: Integer);\n    procedure InputFinished;\n    property GetHandle: Pointer Read Handle;\n  end;\n\n  TSherpaOnnxOnlineRecognizer = class\n  private\n   Handle: Pointer;\n   _Config: TSherpaOnnxOnlineRecognizerConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOnlineRecognizerConfig);\n    destructor Destroy; override;\n\n    function CreateStream: TSherpaOnnxOnlineStream; overload;\n    function CreateStream(Hotwords: AnsiString): TSherpaOnnxOnlineStream; overload;\n    function IsReady(Stream: TSherpaOnnxOnlineStream): Boolean;\n    procedure Decode(Stream: TSherpaOnnxOnlineStream);\n    procedure Reset(Stream: TSherpaOnnxOnlineStream);\n    function IsEndpoint(Stream: TSherpaOnnxOnlineStream): Boolean;\n    function GetResult(Stream: TSherpaOnnxOnlineStream): TSherpaOnnxOnlineRecognizerResult;\n    property Config: TSherpaOnnxOnlineRecognizerConfig Read _Config;\n    property GetHandle: Pointer Read Handle;\n  end;\n\n  TSherpaOnnxOfflineTransducerModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    Joiner: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineParaformerModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineNemoEncDecCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineDolphinModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineZipformerCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineWenetCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineOmnilingualAsrCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  
TSherpaOnnxOfflineMedAsrCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineFireRedAsrCtcModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineFunAsrNanoModelConfig = record\n    EncoderAdaptor: AnsiString;\n    LLM: AnsiString;\n    Embedding: AnsiString;\n    Tokenizer: AnsiString;\n    SystemPrompt: AnsiString;\n    UserPrompt: AnsiString;\n    MaxNewTokens: Integer;\n    Temperature: Single;\n    TopP: Single;\n    Seed: Integer;\n    Language: AnsiString;\n    UseItn: Boolean;\n    Hotwords: AnsiString;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineFunAsrNanoModelConfig);\n  end;\n\n  TSherpaOnnxOfflineWhisperModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    Language: AnsiString;\n    Task: AnsiString;\n    TailPaddings: Integer;\n    EnableTokenTimestamps: Boolean;\n    EnableSegmentTimestamps: Boolean;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineWhisperModelConfig);\n  end;\n\n  TSherpaOnnxOfflineCanaryModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    SrcLang: AnsiString;\n    TgtLang: AnsiString;\n    UsePnc: Boolean;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineCanaryModelConfig);\n  end;\n\n  TSherpaOnnxOfflineMoonshineModelConfig = record\n    Preprocessor: AnsiString;\n    Encoder: AnsiString;\n    UncachedDecoder: AnsiString;\n    CachedDecoder: AnsiString;\n    MergedDecoder: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineFireRedAsrModelConfig = record\n    Encoder: AnsiString;\n    Decoder: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineTdnnModelConfig = record\n    Model: 
AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineLMConfig = record\n    Model: AnsiString;\n    Scale: Single;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineLMConfig);\n  end;\n\n  TSherpaOnnxOfflineSenseVoiceModelConfig = record\n    Model: AnsiString;\n    Language: AnsiString;\n    UseItn: Boolean;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSenseVoiceModelConfig);\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineModelConfig = record\n    Transducer: TSherpaOnnxOfflineTransducerModelConfig;\n    Paraformer: TSherpaOnnxOfflineParaformerModelConfig;\n    NeMoCtc: TSherpaOnnxOfflineNemoEncDecCtcModelConfig;\n    Whisper: TSherpaOnnxOfflineWhisperModelConfig;\n    Tdnn: TSherpaOnnxOfflineTdnnModelConfig;\n    Tokens: AnsiString;\n    NumThreads: Integer;\n    Debug: Boolean;\n    Provider: AnsiString;\n    ModelType: AnsiString;\n    ModelingUnit: AnsiString;\n    BpeVocab: AnsiString;\n    TeleSpeechCtc: AnsiString;\n    SenseVoice: TSherpaOnnxOfflineSenseVoiceModelConfig;\n    Moonshine: TSherpaOnnxOfflineMoonshineModelConfig;\n    FireRedAsr: TSherpaOnnxOfflineFireRedAsrModelConfig;\n    Dolphin: TSherpaOnnxOfflineDolphinModelConfig;\n    ZipformerCtc: TSherpaOnnxOfflineZipformerCtcModelConfig;\n    Canary: TSherpaOnnxOfflineCanaryModelConfig;\n    WenetCtc: TSherpaOnnxOfflineWenetCtcModelConfig;\n    Omnilingual: TSherpaOnnxOfflineOmnilingualAsrCtcModelConfig;\n    MedAsr: TSherpaOnnxOfflineMedAsrCtcModelConfig;\n    FunAsrNano: TSherpaOnnxOfflineFunAsrNanoModelConfig;\n    FireRedAsrCtc: TSherpaOnnxOfflineFireRedAsrCtcModelConfig;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineModelConfig);\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineRecognizerConfig = record\n    FeatConfig: TSherpaOnnxFeatureConfig;\n    ModelConfig: 
TSherpaOnnxOfflineModelConfig;\n    LMConfig: TSherpaOnnxOfflineLMConfig;\n    DecodingMethod: AnsiString;\n    MaxActivePaths: Integer;\n    HotwordsFile: AnsiString;\n    HotwordsScore: Single;\n    RuleFsts: AnsiString;\n    RuleFars: AnsiString;\n    BlankPenalty: Single;\n    Hr: TSherpaOnnxHomophoneReplacerConfig;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineRecognizerConfig);\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineRecognizerResult = record\n    Text: AnsiString;\n    Tokens: array of AnsiString;\n    Timestamps: array of Single;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineStream = class\n  private\n   Handle: Pointer;\n  public\n    constructor Create(P: Pointer);\n    destructor Destroy; override;\n    procedure AcceptWaveform(const Samples: array of Single; SampleRate: Integer);\n    property GetHandle: Pointer Read Handle;\n  end;\n\n  TSherpaOnnxOfflineRecognizer = class\n  private\n   Handle: Pointer;\n   _Config: TSherpaOnnxOfflineRecognizerConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOfflineRecognizerConfig);\n    destructor Destroy; override;\n    function CreateStream: TSherpaOnnxOfflineStream;\n    procedure Decode(Stream: TSherpaOnnxOfflineStream);\n    procedure SetConfig(Config: TSherpaOnnxOfflineRecognizerConfig);\n    function GetResult(Stream: TSherpaOnnxOfflineStream): TSherpaOnnxOfflineRecognizerResult;\n    property Config: TSherpaOnnxOfflineRecognizerConfig Read _Config;\n    property GetHandle: Pointer Read Handle;\n  end;\n\n  TSherpaOnnxSileroVadModelConfig = record\n    Model: AnsiString;\n    Threshold: Single;\n    MinSilenceDuration: Single;\n    MinSpeechDuration: Single;\n    WindowSize: Integer;\n    MaxSpeechDuration: Single;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxSileroVadModelConfig);\n  end;\n\n  TSherpaOnnxTenVadModelConfig = record\n    
Model: AnsiString;\n    Threshold: Single;\n    MinSilenceDuration: Single;\n    MinSpeechDuration: Single;\n    WindowSize: Integer;\n    MaxSpeechDuration: Single;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxTenVadModelConfig);\n  end;\n\n  TSherpaOnnxVadModelConfig = record\n    SileroVad: TSherpaOnnxSileroVadModelConfig;\n    SampleRate: Integer;\n    NumThreads: Integer;\n    Provider: AnsiString;\n    Debug: Boolean;\n    TenVad: TSherpaOnnxTenVadModelConfig;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxVadModelConfig);\n  end;\n\n\n  TSherpaOnnxCircularBuffer = class\n  private\n    Handle: Pointer;\n  public\n    constructor Create(Capacity: Integer);\n    destructor Destroy; override;\n    procedure Push(Samples: array of Single); overload;\n    procedure Push(Samples: pcfloat; N: Integer); overload;\n    function Get(StartIndex: Integer; N: Integer): TSherpaOnnxSamplesArray;\n    procedure Pop(N: Integer);\n    procedure Reset;\n    function Size: Integer;\n    function Head: Integer;\n    property GetHandle: Pointer Read Handle;\n  end;\n\n  TSherpaOnnxSpeechSegment = record\n    Samples: array of Single;\n    Start: Integer;\n  end;\n\n  TSherpaOnnxVoiceActivityDetector = class\n  private\n    Handle: Pointer;\n    _Config: TSherpaOnnxVadModelConfig;\n  public\n    constructor Create(Config: TSherpaOnnxVadModelConfig; BufferSizeInSeconds: Single);\n    destructor Destroy; override;\n    procedure AcceptWaveform(const Samples: array of Single); overload;\n    procedure AcceptWaveform(const Samples: array of Single; Offset: Integer; N: Integer); overload;\n    function IsEmpty: Boolean;\n    function IsDetected: Boolean;\n    procedure Pop;\n    procedure Clear;\n    function Front: TSherpaOnnxSpeechSegment;\n    procedure Reset;\n    procedure Flush;\n    property Config: TSherpaOnnxVadModelConfig Read 
_Config;\n    property GetHandle: Pointer Read Handle;\n  end;\n\n\n  TSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineSpeakerSegmentationModelConfig = record\n    Pyannote: TSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig;\n    NumThreads: Integer;\n    Debug: Boolean;\n    Provider: AnsiString;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeakerSegmentationModelConfig);\n  end;\n\n  TSherpaOnnxFastClusteringConfig = record\n    NumClusters: Integer;\n    Threshold: Single;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxFastClusteringConfig);\n  end;\n\n  TSherpaOnnxSpeakerEmbeddingExtractorConfig = record\n    Model: AnsiString;\n    NumThreads: Integer;\n    Debug: Boolean;\n    Provider: AnsiString;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxSpeakerEmbeddingExtractorConfig);\n  end;\n\n  TSherpaOnnxOfflineSpeakerDiarizationConfig = record\n    Segmentation: TSherpaOnnxOfflineSpeakerSegmentationModelConfig;\n    Embedding: TSherpaOnnxSpeakerEmbeddingExtractorConfig;\n    Clustering: TSherpaOnnxFastClusteringConfig;\n    MinDurationOn: Single;\n    MinDurationOff: Single;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeakerDiarizationConfig);\n  end;\n\n  TSherpaOnnxOfflineSpeakerDiarizationSegment = record\n    Start: Single;\n    Stop: Single;\n    Speaker: Integer;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineSpeakerDiarizationSegmentArray = array of TSherpaOnnxOfflineSpeakerDiarizationSegment;\n\n  PSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg = ^TSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg;\n\n  
TSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg = function(\n      NumProcessChunks: cint32;\n      NumTotalChunks: cint32): cint32; cdecl;\n\n  TSherpaOnnxOfflineSpeakerDiarization = class\n  private\n    Handle: Pointer;\n    SampleRate: Integer;\n    _Config: TSherpaOnnxOfflineSpeakerDiarizationConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOfflineSpeakerDiarizationConfig);\n    destructor Destroy; override;\n    procedure SetConfig(Config: TSherpaOnnxOfflineSpeakerDiarizationConfig);\n    function Process(const Samples: array of Single): TSherpaOnnxOfflineSpeakerDiarizationSegmentArray; overload;\n    function Process(const Samples: array of Single; Callback: PSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg): TSherpaOnnxOfflineSpeakerDiarizationSegmentArray; overload;\n    property GetHandle: Pointer Read Handle;\n    property GetSampleRate: Integer Read SampleRate;\n  end;\n\n  TSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig = record\n    Model: AnsiString;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOfflineSpeechDenoiserModelConfig = record\n    Gtcrn: TSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig;\n    DpdfNet: TSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig;\n    NumThreads: Integer;\n    Debug: Boolean;\n    Provider: AnsiString;\n    function ToString: AnsiString;\n    class operator Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeechDenoiserModelConfig);\n  end;\n\n  TSherpaOnnxOfflineSpeechDenoiserConfig = record\n    Model: TSherpaOnnxOfflineSpeechDenoiserModelConfig;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxOnlineSpeechDenoiserConfig = record\n    Model: TSherpaOnnxOfflineSpeechDenoiserModelConfig;\n    function ToString: AnsiString;\n  end;\n\n  TSherpaOnnxDenoisedAudio = record\n    Samples: array of Single;\n    
SampleRate: Integer;\n  end;\n\n  TSherpaOnnxOfflineSpeechDenoiser = class\n  private\n   Handle: Pointer;\n   SampleRate: Integer;\n   _Config: TSherpaOnnxOfflineSpeechDenoiserConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOfflineSpeechDenoiserConfig);\n    destructor Destroy; override;\n\n    function Run(const Samples: array of Single; InputSampleRate: Integer): TSherpaOnnxDenoisedAudio;\n\n    property GetHandle: Pointer Read Handle;\n    property GetSampleRate: Integer Read SampleRate;\n  end;\n\n  TSherpaOnnxOnlineSpeechDenoiser = class\n  private\n   Handle: Pointer;\n   SampleRate: Integer;\n   FrameShiftInSamples: Integer;\n   _Config: TSherpaOnnxOnlineSpeechDenoiserConfig;\n  public\n    constructor Create(Config: TSherpaOnnxOnlineSpeechDenoiserConfig);\n    destructor Destroy; override;\n\n    function Run(const Samples: array of Single; InputSampleRate: Integer): TSherpaOnnxDenoisedAudio;\n    function Flush: TSherpaOnnxDenoisedAudio;\n    procedure Reset;\n\n    property GetHandle: Pointer Read Handle;\n    property GetSampleRate: Integer Read SampleRate;\n    property GetFrameShiftInSamples: Integer Read FrameShiftInSamples;\n  end;\n\n  { It supports reading a single channel wave with 16-bit encoded samples.\n    Samples are normalized to the range [-1, 1].\n  }\n  function SherpaOnnxReadWave(Filename: AnsiString): TSherpaOnnxWave;\n\n  function SherpaOnnxWriteWave(Filename: AnsiString;\n    const Samples: array of Single; SampleRate: Integer): Boolean;\n\n  function SherpaOnnxGetVersionStr(): AnsiString;\n  function SherpaOnnxGetGitSha1(): AnsiString;\n  function SherpaOnnxGetGitDate(): AnsiString;\n\nimplementation\n\nuses\n  Math,\n  fpjson,\n    { See\n      - https://wiki.freepascal.org/fcl-json\n      - https://www.freepascal.org/daily/doc/fcl/fpjson/getjson.html\n    }\n  jsonparser,\n  SysUtils;\n\nconst\n  {\n  See\n   - https://www.freepascal.org/docs-html/prog/progap7.html\n   - 
https://downloads.freepascal.org/fpc/docs-pdf/\n   - https://downloads.freepascal.org/fpc/docs-pdf/CinFreePascal.pdf\n  }\n\n  {$if defined(WINDOWS)}\n   { For windows, we always use dynamic link. See\n     https://forum.lazarus.freepascal.org/index.php/topic,15712.msg84781.html#msg84781\n     We need to rebuild the static lib for windows using Mingw or cygwin\n   }\n     SherpaOnnxLibName = 'sherpa-onnx-c-api.dll';\n  {$elseif not defined(SHERPA_ONNX_USE_SHARED_LIBS)}\n     {static link for linux and macos}\n     {$linklib sherpa-onnx-c-api}\n     {$linklib sherpa-onnx-core}\n     {$linklib kaldi-decoder-core}\n     {$linklib sherpa-onnx-kaldifst-core}\n     {$linklib sherpa-onnx-fstfar}\n     {$linklib sherpa-onnx-fst}\n     {$linklib kissfft-float}\n     {$linklib kaldi-native-fbank-core}\n     {$linklib piper_phonemize}\n     {$linklib espeak-ng}\n     {$linklib ucd}\n     {$linklib onnxruntime}\n     {$linklib ssentencepiece_core}\n\n     {$ifdef LINUX}\n       {$linklib m}\n       {$LINKLIB stdc++}\n       {$LINKLIB gcc_s}\n     {$endif}\n\n     {$ifdef DARWIN}\n       {$linklib c++}\n     {$endif}\n     SherpaOnnxLibName = '';\n  {$else}\n     {dynamic link for linux and macos}\n     SherpaOnnxLibName = 'sherpa-onnx-c-api';\n     {$linklib sherpa-onnx-c-api}\n  {$endif}\n\ntype\n  SherpaOnnxWave = record\n    Samples: pcfloat;\n    SampleRate: cint32;\n    NumSamples: cint32;\n  end;\n\n  PSherpaOnnxWave = ^SherpaOnnxWave;\n\n  SherpaOnnxOnlineTransducerModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    Joiner: PAnsiChar;\n  end;\n  SherpaOnnxOnlineParaformerModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n  end;\n  SherpaOnnxOnlineZipformer2CtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOnlineNemoCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOnlineToneCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOnlineModelConfig= record\n    Transducer: 
SherpaOnnxOnlineTransducerModelConfig;\n    Paraformer: SherpaOnnxOnlineParaformerModelConfig;\n    Zipformer2Ctc: SherpaOnnxOnlineZipformer2CtcModelConfig;\n    Tokens: PAnsiChar;\n    NumThreads: cint32;\n    Provider: PAnsiChar;\n    Debug: cint32;\n    ModelType: PAnsiChar;\n    ModelingUnit: PAnsiChar;\n    BpeVocab: PAnsiChar;\n    TokensBuf: PAnsiChar;\n    TokensBufSize: cint32;\n    NemoCtc: SherpaOnnxOnlineNemoCtcModelConfig;\n    ToneCtc: SherpaOnnxOnlineToneCtcModelConfig;\n  end;\n  SherpaOnnxFeatureConfig = record\n    SampleRate: cint32;\n    FeatureDim: cint32;\n  end;\n  SherpaOnnxOnlineCtcFstDecoderConfig = record\n    Graph: PAnsiChar;\n    MaxActive: cint32;\n  end;\n\n  SherpaOnnxHomophoneReplacerConfig = record\n    DictDir: PAnsiChar;\n    Lexicon: PAnsiChar;\n    RuleFsts: PAnsiChar;\n  end;\n\n  SherpaOnnxOnlineRecognizerConfig = record\n    FeatConfig: SherpaOnnxFeatureConfig;\n    ModelConfig: SherpaOnnxOnlineModelConfig;\n    DecodingMethod: PAnsiChar;\n    MaxActivePaths: cint32;\n    EnableEndpoint: cint32;\n    Rule1MinTrailingSilence: cfloat;\n    Rule2MinTrailingSilence: cfloat;\n    Rule3MinUtteranceLength: cfloat;\n    HotwordsFile: PAnsiChar;\n    HotwordsScore: cfloat;\n    CtcFstDecoderConfig: SherpaOnnxOnlineCtcFstDecoderConfig;\n    RuleFsts: PAnsiChar;\n    RuleFars: PAnsiChar;\n    BlankPenalty: cfloat;\n    HotwordsBuf: PAnsiChar;\n    HotwordsBufSize: cint32;\n    Hr: SherpaOnnxHomophoneReplacerConfig;\n  end;\n\n  PSherpaOnnxOnlineRecognizerConfig = ^SherpaOnnxOnlineRecognizerConfig;\n\n  SherpaOnnxOfflineTransducerModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    Joiner: PAnsiChar;\n  end;\n  SherpaOnnxOfflineParaformerModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineNemoEncDecCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineDolphinModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineZipformerCtcModelConfig = record\n    Model: 
PAnsiChar;\n  end;\n  SherpaOnnxOfflineWenetCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineOmnilingualAsrCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineMedAsrCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineFunAsrNanoModelConfig = record\n    EncoderAdaptor: PAnsiChar;\n    LLM: PAnsiChar;\n    Embedding: PAnsiChar;\n    Tokenizer: PAnsiChar;\n    SystemPrompt: PAnsiChar;\n    UserPrompt: PAnsiChar;\n    MaxNewTokens: cint32;\n    Temperature: cfloat;\n    TopP: cfloat;\n    Seed: cint32;\n    Language: PAnsiChar;\n    UseItn: cint32;\n    Hotwords: PAnsiChar;\n  end;\n  SherpaOnnxOfflineFireRedAsrCtcModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineWhisperModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    Language: PAnsiChar;\n    Task: PAnsiChar;\n    TailPaddings: cint32;\n    EnableTokenTimestamps: cint32;\n    EnableSegmentTimestamps: cint32;\n  end;\n  SherpaOnnxOfflineCanaryModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    SrcLang: PAnsiChar;\n    TgtLang: PAnsiChar;\n    UsePnc: cint32;\n  end;\n  SherpaOnnxOfflineFireRedAsrModelConfig = record\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n  end;\n  SherpaOnnxOfflineMoonshineModelConfig = record\n    Preprocessor: PAnsiChar;\n    Encoder: PAnsiChar;\n    UncachedDecoder: PAnsiChar;\n    CachedDecoder: PAnsiChar;\n    MergedDecoder: PAnsiChar;\n  end;\n  SherpaOnnxOfflineTdnnModelConfig = record\n    Model: PAnsiChar;\n  end;\n  SherpaOnnxOfflineLMConfig = record\n    Model: PAnsiChar;\n    Scale: cfloat;\n  end;\n  SherpaOnnxOfflineSenseVoiceModelConfig = record\n    Model: PAnsiChar;\n    Language: PAnsiChar;\n    UseItn: cint32;\n  end;\n  SherpaOnnxOfflineModelConfig = record\n    Transducer: SherpaOnnxOfflineTransducerModelConfig;\n    Paraformer: SherpaOnnxOfflineParaformerModelConfig;\n    NeMoCtc: SherpaOnnxOfflineNemoEncDecCtcModelConfig;\n   
 Whisper: SherpaOnnxOfflineWhisperModelConfig;\n    Tdnn: SherpaOnnxOfflineTdnnModelConfig;\n    Tokens: PAnsiChar;\n    NumThreads: cint32;\n    Debug: cint32;\n    Provider: PAnsiChar;\n    ModelType: PAnsiChar;\n    ModelingUnit: PAnsiChar;\n    BpeVocab: PAnsiChar;\n    TeleSpeechCtc: PAnsiChar;\n    SenseVoice:  SherpaOnnxOfflineSenseVoiceModelConfig;\n    Moonshine: SherpaOnnxOfflineMoonshineModelConfig;\n    FireRedAsr: SherpaOnnxOfflineFireRedAsrModelConfig;\n    Dolphin: SherpaOnnxOfflineDolphinModelConfig;\n    ZipformerCtc: SherpaOnnxOfflineZipformerCtcModelConfig;\n    Canary: SherpaOnnxOfflineCanaryModelConfig;\n    WenetCtc: SherpaOnnxOfflineWenetCtcModelConfig;\n    Omnilingual: SherpaOnnxOfflineOmnilingualAsrCtcModelConfig;\n    MedAsr: SherpaOnnxOfflineMedAsrCtcModelConfig;\n    FunAsrNano: SherpaOnnxOfflineFunAsrNanoModelConfig;\n    FireRedAsrCtc: SherpaOnnxOfflineFireRedAsrCtcModelConfig;\n  end;\n\n  SherpaOnnxOfflineRecognizerConfig = record\n    FeatConfig: SherpaOnnxFeatureConfig;\n    ModelConfig: SherpaOnnxOfflineModelConfig;\n    LMConfig: SherpaOnnxOfflineLMConfig;\n    DecodingMethod: PAnsiChar;\n    MaxActivePaths: cint32;\n    HotwordsFile: PAnsiChar;\n    HotwordsScore: cfloat;\n    RuleFsts: PAnsiChar;\n    RuleFars: PAnsiChar;\n    BlankPenalty: cfloat;\n    Hr: SherpaOnnxHomophoneReplacerConfig;\n  end;\n\n  PSherpaOnnxOfflineRecognizerConfig = ^SherpaOnnxOfflineRecognizerConfig;\n\n  SherpaOnnxSileroVadModelConfig = record\n    Model: PAnsiChar;\n    Threshold: cfloat;\n    MinSilenceDuration: cfloat;\n    MinSpeechDuration: cfloat;\n    WindowSize: cint32;\n    MaxSpeechDuration: cfloat;\n  end;\n\n  SherpaOnnxTenVadModelConfig = record\n    Model: PAnsiChar;\n    Threshold: cfloat;\n    MinSilenceDuration: cfloat;\n    MinSpeechDuration: cfloat;\n    WindowSize: cint32;\n    MaxSpeechDuration: cfloat;\n  end;\n\n  SherpaOnnxVadModelConfig = record\n    SileroVad: SherpaOnnxSileroVadModelConfig;\n    SampleRate: cint32;\n    
NumThreads: cint32;\n    Provider: PAnsiChar;\n    Debug: cint32;\n    TenVad: SherpaOnnxTenVadModelConfig;\n  end;\n  PSherpaOnnxVadModelConfig = ^SherpaOnnxVadModelConfig;\n\n  SherpaOnnxSpeechSegment = record\n    Start: cint32;\n    Samples: pcfloat;\n    N: cint32;\n  end;\n\n  PSherpaOnnxSpeechSegment = ^SherpaOnnxSpeechSegment;\n\n  SherpaOnnxOfflineTtsVitsModelConfig = record\n    Model: PAnsiChar;\n    Lexicon: PAnsiChar;\n    Tokens: PAnsiChar;\n    DataDir: PAnsiChar;\n    NoiseScale: cfloat;\n    NoiseScaleW: cfloat;\n    LengthScale: cfloat;\n    DictDir: PAnsiChar;\n  end;\n\n  PSherpaOnnxGenerationConfig = ^SherpaOnnxGenerationConfig;\n\n  SherpaOnnxGenerationConfig = record\n    SilenceScale: cfloat;\n    Speed: cfloat;\n    Sid: cint32;\n    ReferenceAudio: pcfloat;\n    ReferenceAudioLen: cint32;\n    ReferenceSampleRate: cint32;\n    ReferenceText: PAnsiChar;\n    NumSteps: cint32;\n    Extra: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineTtsMatchaModelConfig = record\n    AcousticModel: PAnsiChar;\n    Vocoder: PAnsiChar;\n    Lexicon: PAnsiChar;\n    Tokens: PAnsiChar;\n    DataDir: PAnsiChar;\n    NoiseScale: cfloat;\n    LengthScale: cfloat;\n    DictDir: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineTtsKokoroModelConfig = record\n    Model: PAnsiChar;\n    Voices: PAnsiChar;\n    Tokens: PAnsiChar;\n    DataDir: PAnsiChar;\n    LengthScale: cfloat;\n    DictDir: PAnsiChar;\n    Lexicon: PAnsiChar;\n    Lang: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineTtsKittenModelConfig = record\n    Model: PAnsiChar;\n    Voices: PAnsiChar;\n    Tokens: PAnsiChar;\n    DataDir: PAnsiChar;\n    LengthScale: cfloat;\n  end;\n\n  SherpaOnnxOfflineTtsZipVoiceModelConfig = record\n    Tokens: PAnsiChar;\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    Vocoder: PAnsiChar;\n    DataDir: PAnsiChar;\n    Lexicon: PAnsiChar;\n    FeatScale: cfloat;\n    Tshift: cfloat;\n    TargetRms: cfloat;\n    GuidanceScale: cfloat;\n  end;\n\n  SherpaOnnxOfflineTtsPocketModelConfig = 
record\n    LmFlow: PAnsiChar;\n    LmMain: PAnsiChar;\n    Encoder: PAnsiChar;\n    Decoder: PAnsiChar;\n    TextConditioner: PAnsiChar;\n    VocabJson: PAnsiChar;\n    TokenScoresJson: PAnsiChar;\n    VoiceEmbeddingCacheCapacity: cint32;\n  end;\n\n  SherpaOnnxOfflineTtsSupertonicModelConfig = record\n    DurationPredictor: PAnsiChar;\n    TextEncoder: PAnsiChar;\n    VectorEstimator: PAnsiChar;\n    Vocoder: PAnsiChar;\n    TtsJson: PAnsiChar;\n    UnicodeIndexer: PAnsiChar;\n    VoiceStyle: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineTtsModelConfig = record\n    Vits: SherpaOnnxOfflineTtsVitsModelConfig;\n    NumThreads: cint32;\n    Debug: cint32;\n    Provider: PAnsiChar;\n    Matcha: SherpaOnnxOfflineTtsMatchaModelConfig;\n    Kokoro: SherpaOnnxOfflineTtsKokoroModelConfig;\n    Kitten: SherpaOnnxOfflineTtsKittenModelConfig;\n    ZipVoice: SherpaOnnxOfflineTtsZipVoiceModelConfig;\n    Pocket: SherpaOnnxOfflineTtsPocketModelConfig;\n    Supertonic: SherpaOnnxOfflineTtsSupertonicModelConfig;\n  end;\n\n  SherpaOnnxOfflineTtsConfig = record\n    Model: SherpaOnnxOfflineTtsModelConfig;\n    RuleFsts: PAnsiChar;\n    MaxNumSentences: cint32;\n    RuleFars: PAnsiChar;\n    SilenceScale: cfloat;\n  end;\n\n  PSherpaOnnxOfflineTtsConfig = ^SherpaOnnxOfflineTtsConfig;\n\n  SherpaOnnxGeneratedAudio = record\n    Samples: pcfloat;\n    N: cint32;\n    SampleRate: cint32;\n  end;\n\n  PSherpaOnnxGeneratedAudio = ^SherpaOnnxGeneratedAudio;\n\n  SherpaOnnxResampleOut = record\n    Samples: pcfloat;\n    N: cint32;\n  end;\n\n  PSherpaOnnxResampleOut = ^SherpaOnnxResampleOut;\n\n  SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineSpeakerSegmentationModelConfig = record\n    Pyannote: SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig;\n    NumThreads: cint32;\n    Debug: cint32;\n    Provider: PAnsiChar;\n  end;\n\n  SherpaOnnxFastClusteringConfig = record\n    NumClusters: cint32;\n    Threshold: 
cfloat;\n  end;\n\n  SherpaOnnxSpeakerEmbeddingExtractorConfig = record\n    Model: PAnsiChar;\n    NumThreads: cint32;\n    Debug: cint32;\n    Provider: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineSpeakerDiarizationConfig = record\n    Segmentation: SherpaOnnxOfflineSpeakerSegmentationModelConfig;\n    Embedding: SherpaOnnxSpeakerEmbeddingExtractorConfig;\n    Clustering: SherpaOnnxFastClusteringConfig;\n    MinDurationOn: cfloat;\n    MinDurationOff: cfloat;\n  end;\n\n  SherpaOnnxOfflineSpeakerDiarizationSegment = record\n    Start: cfloat;\n    Stop: cfloat;\n    Speaker: cint32;\n  end;\n\n  PSherpaOnnxOfflineSpeakerDiarizationSegment = ^SherpaOnnxOfflineSpeakerDiarizationSegment;\n\n  PSherpaOnnxOfflineSpeakerDiarizationConfig = ^SherpaOnnxOfflineSpeakerDiarizationConfig;\n\n  SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig = record\n    Model: PAnsiChar;\n  end;\n\n  SherpaOnnxOfflineSpeechDenoiserModelConfig = record\n    Gtcrn: SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig;\n    NumThreads: cint32;\n    Debug: cint32;\n    Provider: PAnsiChar;\n    DpdfNet: SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig;\n  end;\n\n  SherpaOnnxOfflineSpeechDenoiserConfig = record\n    Model: SherpaOnnxOfflineSpeechDenoiserModelConfig;\n  end;\n\n  PSherpaOnnxOfflineSpeechDenoiserConfig = ^SherpaOnnxOfflineSpeechDenoiserConfig;\n\n  SherpaOnnxOnlineSpeechDenoiserConfig = record\n    Model: SherpaOnnxOfflineSpeechDenoiserModelConfig;\n  end;\n\n  PSherpaOnnxOnlineSpeechDenoiserConfig = ^SherpaOnnxOnlineSpeechDenoiserConfig;\n\n  SherpaOnnxDenoisedAudio = record\n    Samples: pcfloat;\n    N: cint32;\n    SampleRate: cint32;\n  end;\n\n  PSherpaOnnxDenoisedAudio = ^SherpaOnnxDenoisedAudio;\n\nfunction SherpaOnnxCreateLinearResampler(SampleRateInHz: cint32;\n  SampleRateOutHz: cint32;\n  FilterCutoffHz: cfloat;\n  NumZeros: cint32): Pointer; cdecl;\n  external 
SherpaOnnxLibName;\n\nfunction SherpaOnnxGetVersionStrWrapper(): PAnsiChar; cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxGetVersionStr';\n\nfunction SherpaOnnxGetGitSha1Wrapper(): PAnsiChar; cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxGetGitSha1';\n\nfunction SherpaOnnxGetGitDateWrapper(): PAnsiChar; cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxGetGitDate';\n\nfunction SherpaOnnxGetVersionStr(): AnsiString;\nbegin\n  Result := SherpaOnnxGetVersionStrWrapper();\nend;\n\nfunction SherpaOnnxGetGitSha1(): AnsiString;\nbegin\n  Result := SherpaOnnxGetGitSha1Wrapper();\nend;\n\nfunction SherpaOnnxGetGitDate(): AnsiString;\nbegin\n  Result := SherpaOnnxGetGitDateWrapper();\nend;\n\nprocedure SherpaOnnxDestroyLinearResampler(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxLinearResamplerResample(P: Pointer;\n  Samples: pcfloat;\n  N: Integer;\n  Flush: Integer): PSherpaOnnxResampleOut; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxLinearResamplerResampleFree(P: PSherpaOnnxResampleOut); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxLinearResamplerReset(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOfflineSpeechDenoiser(Config: PSherpaOnnxOfflineSpeechDenoiserConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineSpeechDenoiser(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeechDenoiserGetSampleRate(P: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeechDenoiserRun(P: Pointer;\n  Samples: pcfloat; N: cint32;SampleRate: cint32):PSherpaOnnxDenoisedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOnlineSpeechDenoiser(Config: PSherpaOnnxOnlineSpeechDenoiserConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOnlineSpeechDenoiser(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction 
SherpaOnnxOnlineSpeechDenoiserGetSampleRate(P: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(P: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOnlineSpeechDenoiserRun(P: Pointer;\n  Samples: pcfloat; N: cint32; SampleRate: cint32): PSherpaOnnxDenoisedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOnlineSpeechDenoiserFlush(P: Pointer): PSherpaOnnxDenoisedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOnlineSpeechDenoiserReset(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyDenoisedAudio(Audio: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOfflineSpeakerDiarization(Config: PSherpaOnnxOfflineSpeakerDiarizationConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineSpeakerDiarization(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(P: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOfflineSpeakerDiarizationSetConfig(P: Pointer; Config: PSherpaOnnxOfflineSpeakerDiarizationConfig); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(P: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(P: Pointer): PSherpaOnnxOfflineSpeakerDiarizationSegment; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOfflineSpeakerDiarizationDestroySegment(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeakerDiarizationProcess(P: Pointer; Samples: pcfloat; N: cint32): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg(P: Pointer;\n  Samples: pcfloat; N: cint32;  Callback: 
PSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOfflineSpeakerDiarizationDestroyResult(P: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOfflineTts(Config: PSherpaOnnxOfflineTtsConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineTts(Tts: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineTtsSampleRate(Tts: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineTtsNumSpeakers(Tts: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineTtsGenerate(Tts: Pointer;\n  Text: PAnsiChar; Sid: cint32; Speed: cfloat): PSherpaOnnxGeneratedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(Tts: Pointer;\n  Text: PAnsiChar; Sid: cint32; Speed: cfloat;\n  Callback: TSherpaOnnxGeneratedAudioCallbackWithArg;\n  Arg: Pointer): PSherpaOnnxGeneratedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOfflineTtsGenerateWithConfig(Tts: Pointer;\n  Text: PAnsiChar; config: PSherpaOnnxGenerationConfig;\n  Callback: TSherpaOnnxGeneratedAudioProgressCallbackWithArg;\n  Arg: Pointer): PSherpaOnnxGeneratedAudio; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineTtsGeneratedAudio(Audio: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateVoiceActivityDetector(Config: PSherpaOnnxVadModelConfig;\n  BufferSizeInSeconds: cfloat): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyVoiceActivityDetector(Vad: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxVoiceActivityDetectorAcceptWaveform(Vad: Pointer;\n  Samples: pcfloat; N: cint32); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxVoiceActivityDetectorEmpty(Vad: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction 
SherpaOnnxVoiceActivityDetectorDetected(Vad: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxVoiceActivityDetectorPop(Vad: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxVoiceActivityDetectorClear(Vad: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxVoiceActivityDetectorFront(Vad: Pointer): PSherpaOnnxSpeechSegment; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroySpeechSegment(P: PSherpaOnnxSpeechSegment); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxVoiceActivityDetectorReset(Vad: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxVoiceActivityDetectorFlush(Vad: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateCircularBuffer(Capacity: cint32): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyCircularBuffer(Buffer: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxCircularBufferPush(Buffer: Pointer; Samples: pcfloat; N: cint32); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCircularBufferGet(Buffer: Pointer; StartIndex: cint32; N: cint32): pcfloat; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxCircularBufferFree(P: pcfloat); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxCircularBufferPop(Buffer: Pointer; N: cint32); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCircularBufferSize(Buffer: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCircularBufferHead(Buffer: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxCircularBufferReset(Buffer: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOnlineRecognizer(Config: PSherpaOnnxOnlineRecognizerConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOnlineRecognizer(Recognizer: Pointer); cdecl;\n  external 
SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOnlineStream(Recognizer: Pointer): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOnlineStreamWithHotwords(Recognizer: Pointer; Hotwords: PAnsiChar): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOnlineStream(Recognizer: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOnlineStreamAcceptWaveform(Stream: Pointer;\n  SampleRate: cint32; Samples: pcfloat; N: cint32 ); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOnlineStreamInputFinished(Stream: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxIsOnlineStreamReady(Recognizer: Pointer; Stream: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDecodeOnlineStream(Recognizer: Pointer; Stream: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOnlineStreamReset(Recognizer: Pointer; Stream: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxOnlineStreamIsEndpoint(Recognizer: Pointer; Stream: Pointer): cint32; cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxGetOnlineStreamResultAsJson(Recognizer: Pointer; Stream: Pointer): PAnsiChar; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOnlineStreamResultJson(PJson: PAnsiChar); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOfflineRecognizer(Config: PSherpaOnnxOfflineRecognizerConfig): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineRecognizer(Recognizer: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxCreateOfflineStream(Recognizer: Pointer): Pointer; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineStream(Stream: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxAcceptWaveformOffline(Stream: Pointer;\n  SampleRate: cint32; Samples: pcfloat; N: cint32); cdecl;\n  external 
SherpaOnnxLibName;\n\nprocedure SherpaOnnxDecodeOfflineStream(Recognizer: Pointer; Stream: Pointer); cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxOfflineRecognizerSetConfig(Recognizer: Pointer; Config: PSherpaOnnxOfflineRecognizerConfig); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxGetOfflineStreamResultAsJson(Stream: Pointer): PAnsiChar; cdecl;\n  external SherpaOnnxLibName;\n\nprocedure SherpaOnnxDestroyOfflineStreamResultJson(Json: PAnsiChar); cdecl;\n  external SherpaOnnxLibName;\n\nfunction SherpaOnnxReadWaveWrapper(Filename: PAnsiChar): PSherpaOnnxWave; cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxReadWave';\n\nfunction SherpaOnnxWriteWaveWrapper(Samples: pcfloat; N: cint32;\n  SampleRate: cint32; Filename: PAnsiChar): cint32; cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxWriteWave';\n\nprocedure SherpaOnnxFreeWaveWrapper(P: PSherpaOnnxWave); cdecl;\n  external SherpaOnnxLibName name 'SherpaOnnxFreeWave';\n\nfunction SherpaOnnxWriteWave(Filename: AnsiString;\n    const Samples: array of Single; SampleRate: Integer): Boolean;\nbegin\n  Result := SherpaOnnxWriteWaveWrapper(pcfloat(Samples), Length(Samples),\n    SampleRate, PAnsiChar(Filename)) = 1;\nend;\n\nfunction SherpaOnnxReadWave(Filename: AnsiString): TSherpaOnnxWave;\nvar\n  PWave: PSherpaOnnxWave;\nbegin\n  Result.Samples := nil;\n  Result.SampleRate := 0;\n\n  PWave := SherpaOnnxReadWaveWrapper(PAnsiChar(Filename));\n\n  if PWave = nil then\n    Exit;\n\n  Result.SampleRate := PWave^.SampleRate;\n  SetLength(Result.Samples, PWave^.NumSamples);\n\n  if PWave^.NumSamples > 0 then\n    Move(PWave^.Samples[0], Result.Samples[0], PWave^.NumSamples * SizeOf(Single));\n\n  SherpaOnnxFreeWaveWrapper(PWave);\nend;\n\nfunction TSherpaOnnxOnlineTransducerModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineTransducerModelConfig(Encoder := %s, Decoder := %s, Joiner := %s)',\n  [Self.Encoder, Self.Decoder, Self.Joiner]);\nend;\n\nfunction 
TSherpaOnnxOnlineParaformerModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineParaformerModelConfig(Encoder := %s, Decoder := %s)',\n  [Self.Encoder, Self.Decoder]);\nend;\n\nfunction TSherpaOnnxOnlineZipformer2CtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineZipformer2CtcModelConfig(Model := %s)',\n  [Self.Model]);\nend;\n\nfunction TSherpaOnnxOnlineNemoCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineNemoCtcModelConfig(Model := %s)',\n  [Self.Model]);\nend;\n\nfunction TSherpaOnnxOnlineToneCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineToneCtcModelConfig(Model := %s)',\n  [Self.Model]);\nend;\n\nfunction TSherpaOnnxOnlineModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineModelConfig(Transducer := %s, ' +\n    'Paraformer := %s, ' +\n    'Zipformer2Ctc := %s, ' +\n    'Tokens := %s, ' +\n    'NumThreads := %d, ' +\n    'Provider := %s, ' +\n    'Debug := %s, ' +\n    'ModelType := %s, ' +\n    'ModelingUnit := %s, ' +\n    'BpeVocab := %s, ' +\n    'NemoCtc := %s, ' +\n    'ToneCtc := %s)',\n  [Self.Transducer.ToString, Self.Paraformer.ToString,\n   Self.Zipformer2Ctc.ToString, Self.Tokens,\n   Self.NumThreads, Self.Provider, Self.Debug.ToString,\n   Self.ModelType, Self.ModelingUnit, Self.BpeVocab,\n   Self.NemoCtc.ToString, Self.ToneCtc.ToString\n  ]);\nend;\n\nfunction TSherpaOnnxFeatureConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxFeatureConfig(SampleRate := %d, FeatureDim := %d)',\n    [Self.SampleRate, Self.FeatureDim]);\nend;\n\nfunction TSherpaOnnxOnlineCtcFstDecoderConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineCtcFstDecoderConfig(Graph := %s, MaxActive := %d)',\n  [Self.Graph, Self.MaxActive]);\nend;\n\nfunction TSherpaOnnxHomophoneReplacerConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxHomophoneReplacerConfig(Lexicon := %s, RuleFsts 
:= %s)',\n  [Self.Lexicon, Self.RuleFsts]);\nend;\n\nfunction TSherpaOnnxOnlineRecognizerConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineRecognizerConfig(FeatConfig := %s, ' +\n    'ModelConfig := %s, ' +\n    'DecodingMethod := %s, ' +\n    'MaxActivePaths := %d, ' +\n    'EnableEndpoint := %s, ' +\n    'Rule1MinTrailingSilence := %.1f, ' +\n    'Rule2MinTrailingSilence := %.1f, ' +\n    'Rule3MinUtteranceLength := %.1f, ' +\n    'HotwordsFile := %s, ' +\n    'HotwordsScore := %.1f, ' +\n    'CtcFstDecoderConfig := %s, ' +\n    'RuleFsts := %s, ' +\n    'RuleFars := %s, ' +\n    'BlankPenalty := %.1f, ' +\n    'Hr := %s' +\n    ')'\n    ,\n    [Self.FeatConfig.ToString, Self.ModelConfig.ToString,\n     Self.DecodingMethod, Self.MaxActivePaths, Self.EnableEndpoint.ToString,\n     Self.Rule1MinTrailingSilence, Self.Rule2MinTrailingSilence,\n     Self.Rule3MinUtteranceLength, Self.HotwordsFile, Self.HotwordsScore,\n     Self.CtcFstDecoderConfig.ToString, Self.RuleFsts, Self.RuleFars,\n     Self.BlankPenalty, Self.Hr.ToString\n    ]);\nend;\n\nfunction TSherpaOnnxOnlineRecognizerResult.ToString: AnsiString;\nvar\n  TokensStr: AnsiString;\n  S: AnsiString;\n  TimestampStr: AnsiString;\n  T: Single;\n  Sep: AnsiString;\nbegin\n  TokensStr := '[';\n  Sep := '';\n  for S in Self.Tokens do\n  begin\n    TokensStr := TokensStr + Sep + S;\n    Sep := ', ';\n  end;\n  TokensStr := TokensStr + ']';\n\n  TimestampStr := '[';\n  Sep := '';\n  for T in Self.Timestamps do\n  begin\n    TimestampStr := TimestampStr + Sep + Format('%.2f', [T]);\n    Sep := ', ';\n  end;\n  TimestampStr := TimestampStr + ']';\n\n  Result := Format('TSherpaOnnxOnlineRecognizerResult(Text := %s, ' +\n    'Tokens := %s, ' +\n    'Timestamps := %s' +\n    ')',\n    [Self.Text, TokensStr, TimestampStr]);\nend;\n\nconstructor TSherpaOnnxOnlineRecognizer.Create(Config: TSherpaOnnxOnlineRecognizerConfig);\nvar\n  C: SherpaOnnxOnlineRecognizerConfig;\nbegin\n  C := 
Default(SherpaOnnxOnlineRecognizerConfig);\n  C.FeatConfig.SampleRate := Config.FeatConfig.SampleRate;\n  C.FeatConfig.FeatureDim := Config.FeatConfig.FeatureDim;\n\n  C.ModelConfig.Transducer.Encoder := PAnsiChar(Config.ModelConfig.Transducer.Encoder);\n  C.ModelConfig.Transducer.Decoder := PAnsiChar(Config.ModelConfig.Transducer.Decoder);\n  C.ModelConfig.Transducer.Joiner := PAnsiChar(Config.ModelConfig.Transducer.Joiner);\n\n  C.ModelConfig.Paraformer.Encoder := PAnsiChar(Config.ModelConfig.Paraformer.Encoder);\n  C.ModelConfig.Paraformer.Decoder := PAnsiChar(Config.ModelConfig.Paraformer.Decoder);\n\n  C.ModelConfig.Zipformer2Ctc.Model := PAnsiChar(Config.ModelConfig.Zipformer2Ctc.Model);\n  C.ModelConfig.NemoCtc.Model := PAnsiChar(Config.ModelConfig.NemoCtc.Model);\n  C.ModelConfig.ToneCtc.Model := PAnsiChar(Config.ModelConfig.ToneCtc.Model);\n\n  C.ModelConfig.Tokens := PAnsiChar(Config.ModelConfig.Tokens);\n  C.ModelConfig.NumThreads := Config.ModelConfig.NumThreads;\n  C.ModelConfig.Provider := PAnsiChar(Config.ModelConfig.Provider);\n  C.ModelConfig.Debug := Ord(Config.ModelConfig.Debug);\n  C.ModelConfig.ModelType := PAnsiChar(Config.ModelConfig.ModelType);\n  C.ModelConfig.ModelingUnit := PAnsiChar(Config.ModelConfig.ModelingUnit);\n  C.ModelConfig.BpeVocab := PAnsiChar(Config.ModelConfig.BpeVocab);\n\n  C.DecodingMethod := PAnsiChar(Config.DecodingMethod);\n  C.MaxActivePaths := Config.MaxActivePaths;\n  C.EnableEndpoint := Ord(Config.EnableEndpoint);\n  C.Rule1MinTrailingSilence := Config.Rule1MinTrailingSilence;\n  C.Rule2MinTrailingSilence := Config.Rule2MinTrailingSilence;\n  C.Rule3MinUtteranceLength := Config.Rule3MinUtteranceLength;\n  C.HotwordsFile := PAnsiChar(Config.HotwordsFile);\n  C.HotwordsScore := Config.HotwordsScore;\n  C.CtcFstDecoderConfig.Graph := PAnsiChar(Config.CtcFstDecoderConfig.Graph);\n  C.CtcFstDecoderConfig.MaxActive := Config.CtcFstDecoderConfig.MaxActive;\n  C.RuleFsts := PAnsiChar(Config.RuleFsts);\n  C.RuleFars := 
PAnsiChar(Config.RuleFars);\n  C.BlankPenalty := Config.BlankPenalty;\n  C.Hr.Lexicon := PAnsiChar(Config.Hr.Lexicon);\n  C.Hr.RuleFsts := PAnsiChar(Config.Hr.RuleFsts);\n\n  Self.Handle := SherpaOnnxCreateOnlineRecognizer(@C);\n  Self._Config := Config;\nend;\n\ndestructor TSherpaOnnxOnlineRecognizer.Destroy;\nbegin\n  SherpaOnnxDestroyOnlineRecognizer(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction TSherpaOnnxOnlineRecognizer.CreateStream: TSherpaOnnxOnlineStream;\nvar\n  Stream: Pointer;\nbegin\n  Stream := SherpaOnnxCreateOnlineStream(Self.Handle);\n  Result := TSherpaOnnxOnlineStream.Create(Stream);\nend;\n\nfunction TSherpaOnnxOnlineRecognizer.CreateStream(Hotwords: AnsiString): TSherpaOnnxOnlineStream;\nvar\n  Stream: Pointer;\nbegin\n  Stream := SherpaOnnxCreateOnlineStreamWithHotwords(Self.Handle, PAnsiChar(Hotwords));\n  Result := TSherpaOnnxOnlineStream.Create(Stream);\nend;\n\nfunction TSherpaOnnxOnlineRecognizer.IsReady(Stream: TSherpaOnnxOnlineStream): Boolean;\nbegin\n  Result := SherpaOnnxIsOnlineStreamReady(Self.Handle, Stream.Handle) = 1;\nend;\n\nprocedure TSherpaOnnxOnlineRecognizer.Decode(Stream: TSherpaOnnxOnlineStream);\nbegin\n  SherpaOnnxDecodeOnlineStream(Self.Handle, Stream.Handle);\nend;\n\nprocedure TSherpaOnnxOnlineRecognizer.Reset(Stream: TSherpaOnnxOnlineStream);\nbegin\n  SherpaOnnxOnlineStreamReset(Self.Handle, Stream.Handle);\nend;\n\nfunction TSherpaOnnxOnlineRecognizer.IsEndpoint(Stream: TSherpaOnnxOnlineStream): Boolean;\nbegin\n  Result := SherpaOnnxOnlineStreamIsEndpoint(Self.Handle, Stream.Handle) = 1;\nend;\n\nfunction TSherpaOnnxOnlineRecognizer.GetResult(Stream: TSherpaOnnxOnlineStream): TSherpaOnnxOnlineRecognizerResult;\nvar\n  pJson: PAnsiChar;\n  JsonData: TJSONData;\n  JsonObject : TJSONObject;\n  JsonEnum: TJSONEnum;\n  I: Integer;\nbegin\n  pJson := SherpaOnnxGetOnlineStreamResultAsJson(Self.Handle, Stream.Handle);\n\n  {\n   - https://www.freepascal.org/daily/doc/fcl/fpjson/getjson.html\n   - 
https://www.freepascal.org/daily/doc/fcl/fpjson/tjsondata.html\n   - https://www.freepascal.org/daily/doc/fcl/fpjson/tjsonobject.html\n   - https://www.freepascal.org/daily/doc/fcl/fpjson/tjsonenum.html\n  }\n\n  JsonData := GetJSON(AnsiString(pJson), False);\n\n  JsonObject := JsonData as TJSONObject;\n\n  Result.Text := JsonObject.Strings['text'];\n\n  SetLength(Result.Tokens, JsonObject.Arrays['tokens'].Count);\n\n  I := 0;\n  for JsonEnum in JsonObject.Arrays['tokens'] do\n  begin\n    Result.Tokens[I] := JsonEnum.Value.AsString;\n    Inc(I);\n  end;\n\n  SetLength(Result.Timestamps, JsonObject.Arrays['timestamps'].Count);\n  I := 0;\n  for JsonEnum in JsonObject.Arrays['timestamps'] do\n  begin\n    Result.Timestamps[I] := JsonEnum.Value.AsFloat;\n    Inc(I);\n  end;\n\n  { The object returned by GetJSON is owned by the caller; free it to avoid a leak. }\n  JsonData.Free;\n\n  SherpaOnnxDestroyOnlineStreamResultJson(pJson);\nend;\n\n\nconstructor TSherpaOnnxOnlineStream.Create(P: Pointer);\nbegin\n  Self.Handle := P;\nend;\n\ndestructor TSherpaOnnxOnlineStream.Destroy;\nbegin\n  SherpaOnnxDestroyOnlineStream(Self.Handle);\n  Self.Handle := nil;\nend;\n\nprocedure TSherpaOnnxOnlineStream.AcceptWaveform(const Samples: array of Single; SampleRate: Integer);\nbegin\n  SherpaOnnxOnlineStreamAcceptWaveform(Self.Handle, SampleRate,\n    pcfloat(Samples), Length(Samples));\nend;\n\nprocedure TSherpaOnnxOnlineStream.InputFinished;\nbegin\n  SherpaOnnxOnlineStreamInputFinished(Self.Handle);\nend;\n\nfunction TSherpaOnnxOfflineTransducerModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTransducerModelConfig(' +\n    'Encoder := %s, ' +\n    'Decoder := %s, ' +\n    'Joiner := %s' +\n    ')',\n    [Self.Encoder, Self.Decoder, Self.Joiner]);\nend;\n\nfunction TSherpaOnnxOfflineParaformerModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineParaformerModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineNemoEncDecCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := 
Format('TSherpaOnnxOfflineNemoEncDecCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineDolphinModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineDolphinModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineZipformerCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineZipformerCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineWenetCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineWenetCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineOmnilingualAsrCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineOmnilingualAsrCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineMedAsrCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineMedAsrCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineFireRedAsrCtcModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineFireRedAsrCtcModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineFunAsrNanoModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineFunAsrNanoModelConfig(' +\n    'EncoderAdaptor := %s' +\n    ', LLM := %s' +\n    ', Embedding := %s' +\n    ', Tokenizer := %s' +\n    ', SystemPrompt := %s' +\n    ', UserPrompt := %s' +\n    ', MaxNewTokens := %d' +\n    ', Temperature := %.3f' +\n    ', TopP := %.3f' +\n    ', Seed := %d' +\n    ', Language := %s' +\n    ', UseItn := %s' +\n    ', Hotwords := %s' +\n    ')',\n    [Self.EncoderAdaptor, Self.LLM, Self.Embedding, Self.Tokenizer,\n     Self.SystemPrompt, Self.UserPrompt, Self.MaxNewTokens, Self.Temperature,\n     Self.TopP, Self.Seed, Self.Language, Self.UseItn.ToString, Self.Hotwords]);\nend;\n\nfunction TSherpaOnnxOfflineWhisperModelConfig.ToString: 
AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineWhisperModelConfig(' +\n    'Encoder := %s, ' +\n    'Decoder := %s, ' +\n    'Language := %s, ' +\n    'Task := %s, ' +\n    'TailPaddings := %d, ' +\n    'EnableTokenTimestamps := %s, ' +\n    'EnableSegmentTimestamps := %s' +\n    ')',\n    [Self.Encoder, Self.Decoder, Self.Language, Self.Task, Self.TailPaddings,\n     Self.EnableTokenTimestamps.ToString,\n     Self.EnableSegmentTimestamps.ToString]);\nend;\n\nfunction TSherpaOnnxOfflineCanaryModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineCanaryModelConfig(' +\n    'Encoder := %s, ' +\n    'Decoder := %s, ' +\n    'SrcLang := %s, ' +\n    'TgtLang := %s, ' +\n    'UsePnc := %s' +\n    ')',\n    [Self.Encoder, Self.Decoder, Self.SrcLang,\n     Self.TgtLang, Self.UsePnc.ToString]);\nend;\n\nfunction TSherpaOnnxOfflineFireRedAsrModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineFireRedAsrModelConfig(' +\n    'Encoder := %s, ' +\n    'Decoder := %s)',\n    [Self.Encoder, Self.Decoder]);\nend;\n\nfunction TSherpaOnnxOfflineMoonshineModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineMoonshineModelConfig(' +\n    'Preprocessor := %s, ' +\n    'Encoder := %s, ' +\n    'UncachedDecoder := %s, ' +\n    'CachedDecoder := %s, ' +\n    'MergedDecoder := %s)',\n    [Self.Preprocessor, Self.Encoder, Self.UncachedDecoder, Self.CachedDecoder,\n     Self.MergedDecoder]);\nend;\n\nfunction TSherpaOnnxOfflineTdnnModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTdnnModelConfig(Model := %s)',\n    [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineLMConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineLMConfig(' +\n    'Model := %s, ' +\n    'Scale := %.1f' +\n    ')',\n    [Self.Model, Self.Scale]);\nend;\n\nfunction TSherpaOnnxOfflineSenseVoiceModelConfig.ToString: AnsiString;\nbegin\n  Result := 
Format('TSherpaOnnxOfflineSenseVoiceModelConfig(' +\n    'Model := %s, ' +\n    'Language := %s, ' +\n    'UseItn := %s' +\n    ')',\n    [Self.Model, Self.Language, Self.UseItn.ToString]);\nend;\n\nfunction TSherpaOnnxOfflineModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineModelConfig(' +\n    'Transducer := %s, ' +\n    'Paraformer := %s, ' +\n    'NeMoCtc := %s, ' +\n    'Whisper := %s, ' +\n    'Tdnn := %s, ' +\n    'Tokens := %s, ' +\n    'NumThreads := %d, ' +\n    'Debug := %s, ' +\n    'Provider := %s, ' +\n    'ModelType := %s, ' +\n    'ModelingUnit := %s, ' +\n    'BpeVocab := %s, ' +\n    'TeleSpeechCtc := %s, ' +\n    'SenseVoice := %s, ' +\n    'Moonshine := %s, ' +\n    'FireRedAsr := %s, ' +\n    'Dolphin := %s, ' +\n    'ZipformerCtc := %s, ' +\n    'Canary := %s, ' +\n    'WenetCtc := %s, ' +\n    'Omnilingual := %s' +\n    ', MedAsr := %s' +\n    ', FunAsrNano := %s' +\n    ', FireRedAsrCtc := %s' +\n    ')',\n    [Self.Transducer.ToString, Self.Paraformer.ToString,\n     Self.NeMoCtc.ToString, Self.Whisper.ToString, Self.Tdnn.ToString,\n     Self.Tokens, Self.NumThreads, Self.Debug.ToString, Self.Provider,\n     Self.ModelType, Self.ModelingUnit, Self.BpeVocab,\n     Self.TeleSpeechCtc, Self.SenseVoice.ToString, Self.Moonshine.ToString,\n     Self.FireRedAsr.ToString, Self.Dolphin.ToString,\n     Self.ZipformerCtc.ToString, Self.Canary.ToString, Self.WenetCtc.ToString,\n     Self.Omnilingual.ToString, Self.MedAsr.ToString,\n     Self.FunAsrNano.ToString, Self.FireRedAsrCtc.ToString\n     ]);\nend;\n\nfunction TSherpaOnnxOfflineRecognizerConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineRecognizerConfig(' +\n    'FeatConfig := %s, ' +\n    'ModelConfig := %s, ' +\n    'LMConfig := %s, ' +\n    'DecodingMethod := %s, ' +\n    'MaxActivePaths := %d, ' +\n    'HotwordsFile := %s, ' +\n    'HotwordsScore := %.1f, ' +\n    'RuleFsts := %s, ' +\n    'RuleFars := %s, ' +\n    'BlankPenalty := %.1f, 
' +\n    'Hr := %s' +\n    ')',\n    [Self.FeatConfig.ToString, Self.ModelConfig.ToString,\n     Self.LMConfig.ToString, Self.DecodingMethod, Self.MaxActivePaths,\n     Self.HotwordsFile, Self.HotwordsScore, Self.RuleFsts, Self.RuleFars,\n     Self.BlankPenalty, Self.Hr.ToString\n     ]);\nend;\n\nfunction ConvertOfflineRecognizerConfig(Config: TSherpaOnnxOfflineRecognizerConfig): SherpaOnnxOfflineRecognizerConfig;\nvar\n  C: SherpaOnnxOfflineRecognizerConfig;\nbegin\n  C := Default(SherpaOnnxOfflineRecognizerConfig);\n  C.FeatConfig.SampleRate := Config.FeatConfig.SampleRate;\n  C.FeatConfig.FeatureDim := Config.FeatConfig.FeatureDim;\n\n  C.ModelConfig.Transducer.Encoder := PAnsiChar(Config.ModelConfig.Transducer.Encoder);\n  C.ModelConfig.Transducer.Decoder := PAnsiChar(Config.ModelConfig.Transducer.Decoder);\n  C.ModelConfig.Transducer.Joiner := PAnsiChar(Config.ModelConfig.Transducer.Joiner);\n\n  C.ModelConfig.Paraformer.Model := PAnsiChar(Config.ModelConfig.Paraformer.Model);\n  C.ModelConfig.NeMoCtc.Model := PAnsiChar(Config.ModelConfig.NeMoCtc.Model);\n\n  C.ModelConfig.Whisper.Encoder := PAnsiChar(Config.ModelConfig.Whisper.Encoder);\n  C.ModelConfig.Whisper.Decoder := PAnsiChar(Config.ModelConfig.Whisper.Decoder);\n  C.ModelConfig.Whisper.Language := PAnsiChar(Config.ModelConfig.Whisper.Language);\n  C.ModelConfig.Whisper.Task := PAnsiChar(Config.ModelConfig.Whisper.Task);\n  C.ModelConfig.Whisper.TailPaddings := Config.ModelConfig.Whisper.TailPaddings;\n  C.ModelConfig.Whisper.EnableTokenTimestamps := Ord(Config.ModelConfig.Whisper.EnableTokenTimestamps);\n  C.ModelConfig.Whisper.EnableSegmentTimestamps := Ord(Config.ModelConfig.Whisper.EnableSegmentTimestamps);\n\n  C.ModelConfig.Tdnn.Model := PAnsiChar(Config.ModelConfig.Tdnn.Model);\n\n  C.ModelConfig.Tokens := PAnsiChar(Config.ModelConfig.Tokens);\n  C.ModelConfig.NumThreads := Config.ModelConfig.NumThreads;\n  C.ModelConfig.Debug := Ord(Config.ModelConfig.Debug);\n  C.ModelConfig.Provider := 
PAnsiChar(Config.ModelConfig.Provider);\n  C.ModelConfig.ModelType := PAnsiChar(Config.ModelConfig.ModelType);\n  C.ModelConfig.ModelingUnit := PAnsiChar(Config.ModelConfig.ModelingUnit);\n  C.ModelConfig.BpeVocab := PAnsiChar(Config.ModelConfig.BpeVocab);\n  C.ModelConfig.TeleSpeechCtc := PAnsiChar(Config.ModelConfig.TeleSpeechCtc);\n\n  C.ModelConfig.SenseVoice.Model := PAnsiChar(Config.ModelConfig.SenseVoice.Model);\n  C.ModelConfig.SenseVoice.Language := PAnsiChar(Config.ModelConfig.SenseVoice.Language);\n  C.ModelConfig.SenseVoice.UseItn := Ord(Config.ModelConfig.SenseVoice.UseItn);\n\n  C.ModelConfig.Moonshine.Preprocessor := PAnsiChar(Config.ModelConfig.Moonshine.Preprocessor);\n  C.ModelConfig.Moonshine.Encoder := PAnsiChar(Config.ModelConfig.Moonshine.Encoder);\n  C.ModelConfig.Moonshine.UncachedDecoder := PAnsiChar(Config.ModelConfig.Moonshine.UncachedDecoder);\n  C.ModelConfig.Moonshine.CachedDecoder := PAnsiChar(Config.ModelConfig.Moonshine.CachedDecoder);\n  C.ModelConfig.Moonshine.MergedDecoder := PAnsiChar(Config.ModelConfig.Moonshine.MergedDecoder);\n\n  C.ModelConfig.FireRedAsr.Encoder := PAnsiChar(Config.ModelConfig.FireRedAsr.Encoder);\n  C.ModelConfig.FireRedAsr.Decoder := PAnsiChar(Config.ModelConfig.FireRedAsr.Decoder);\n\n  C.ModelConfig.Dolphin.Model := PAnsiChar(Config.ModelConfig.Dolphin.Model);\n  C.ModelConfig.ZipformerCtc.Model := PAnsiChar(Config.ModelConfig.ZipformerCtc.Model);\n\n  C.ModelConfig.Canary.Encoder := PAnsiChar(Config.ModelConfig.Canary.Encoder);\n  C.ModelConfig.Canary.Decoder := PAnsiChar(Config.ModelConfig.Canary.Decoder);\n  C.ModelConfig.Canary.SrcLang := PAnsiChar(Config.ModelConfig.Canary.SrcLang);\n  C.ModelConfig.Canary.TgtLang := PAnsiChar(Config.ModelConfig.Canary.TgtLang);\n  C.ModelConfig.Canary.UsePnc := Ord(Config.ModelConfig.Canary.UsePnc);\n\n  C.ModelConfig.WenetCtc.Model := PAnsiChar(Config.ModelConfig.WenetCtc.Model);\n  C.ModelConfig.Omnilingual.Model := 
PAnsiChar(Config.ModelConfig.Omnilingual.Model);\n  C.ModelConfig.MedAsr.Model := PAnsiChar(Config.ModelConfig.MedAsr.Model);\n\n  C.ModelConfig.FunAsrNano.EncoderAdaptor := PAnsiChar(Config.ModelConfig.FunAsrNano.EncoderAdaptor);\n  C.ModelConfig.FunAsrNano.LLM := PAnsiChar(Config.ModelConfig.FunAsrNano.LLM);\n  C.ModelConfig.FunAsrNano.Embedding := PAnsiChar(Config.ModelConfig.FunAsrNano.Embedding);\n  C.ModelConfig.FunAsrNano.Tokenizer := PAnsiChar(Config.ModelConfig.FunAsrNano.Tokenizer);\n  C.ModelConfig.FunAsrNano.SystemPrompt := PAnsiChar(Config.ModelConfig.FunAsrNano.SystemPrompt);\n  C.ModelConfig.FunAsrNano.UserPrompt := PAnsiChar(Config.ModelConfig.FunAsrNano.UserPrompt);\n  C.ModelConfig.FunAsrNano.MaxNewTokens := Config.ModelConfig.FunAsrNano.MaxNewTokens;\n  C.ModelConfig.FunAsrNano.Temperature := Config.ModelConfig.FunAsrNano.Temperature;\n  C.ModelConfig.FunAsrNano.TopP := Config.ModelConfig.FunAsrNano.TopP;\n  C.ModelConfig.FunAsrNano.Seed := Config.ModelConfig.FunAsrNano.Seed;\n  C.ModelConfig.FunAsrNano.Language := PAnsiChar(Config.ModelConfig.FunAsrNano.Language);\n  C.ModelConfig.FunAsrNano.UseItn := Ord(Config.ModelConfig.FunAsrNano.UseItn);\n  C.ModelConfig.FunAsrNano.Hotwords := PAnsiChar(Config.ModelConfig.FunAsrNano.Hotwords);\n\n  C.ModelConfig.FireRedAsrCtc.Model := PAnsiChar(Config.ModelConfig.FireRedAsrCtc.Model);\n\n  C.LMConfig.Model := PAnsiChar(Config.LMConfig.Model);\n  C.LMConfig.Scale := Config.LMConfig.Scale;\n\n  C.DecodingMethod := PAnsiChar(Config.DecodingMethod);\n  C.MaxActivePaths := Config.MaxActivePaths;\n  C.HotwordsFile := PAnsiChar(Config.HotwordsFile);\n  C.HotwordsScore := Config.HotwordsScore;\n  C.RuleFsts := PAnsiChar(Config.RuleFsts);\n  C.RuleFars := PAnsiChar(Config.RuleFars);\n  C.BlankPenalty := Config.BlankPenalty;\n\n  C.Hr.Lexicon := PAnsiChar(Config.Hr.Lexicon);\n  C.Hr.RuleFsts := PAnsiChar(Config.Hr.RuleFsts);\n\n  Result := C;\nend;\n\nconstructor TSherpaOnnxOfflineRecognizer.Create(Config: 
TSherpaOnnxOfflineRecognizerConfig);\nvar\n  C: SherpaOnnxOfflineRecognizerConfig;\nbegin\n  C := ConvertOfflineRecognizerConfig(Config);\n  Self.Handle := SherpaOnnxCreateOfflineRecognizer(@C);\n  Self._Config := Config;\nend;\n\nprocedure TSherpaOnnxOfflineRecognizer.SetConfig(Config: TSherpaOnnxOfflineRecognizerConfig);\nvar\n  C: SherpaOnnxOfflineRecognizerConfig;\nbegin\n  C := ConvertOfflineRecognizerConfig(Config);\n  SherpaOnnxOfflineRecognizerSetConfig(Self.Handle, @C);\n  { We don't update Self._Config }\nend;\n\ndestructor TSherpaOnnxOfflineRecognizer.Destroy;\nbegin\n  SherpaOnnxDestroyOfflineRecognizer(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction TSherpaOnnxOfflineRecognizer.CreateStream: TSherpaOnnxOfflineStream;\nvar\n  Stream: Pointer;\nbegin\n  Stream := SherpaOnnxCreateOfflineStream(Self.Handle);\n  Result := TSherpaOnnxOfflineStream.Create(Stream);\nend;\n\nprocedure TSherpaOnnxOfflineRecognizer.Decode(Stream: TSherpaOnnxOfflineStream);\nbegin\n  SherpaOnnxDecodeOfflineStream(Self.Handle, Stream.Handle);\nend;\n\nfunction TSherpaOnnxOfflineRecognizer.GetResult(Stream: TSherpaOnnxOfflineStream): TSherpaOnnxOfflineRecognizerResult;\nvar\n  pJson: PAnsiChar;\n  JsonData: TJSONData;\n  JsonObject : TJSONObject;\n  JsonEnum: TJSONEnum;\n  I: Integer;\nbegin\n  pJson := SherpaOnnxGetOfflineStreamResultAsJson(Stream.Handle);\n\n  JsonData := GetJSON(AnsiString(pJson), False);\n\n  JsonObject := JsonData as TJSONObject;\n\n  Result.Text := JsonObject.Strings['text'];\n\n  SetLength(Result.Tokens, JsonObject.Arrays['tokens'].Count);\n\n  I := 0;\n  for JsonEnum in JsonObject.Arrays['tokens'] do\n  begin\n    Result.Tokens[I] := JsonEnum.Value.AsString;\n    Inc(I);\n  end;\n\n  SetLength(Result.Timestamps, JsonObject.Arrays['timestamps'].Count);\n  I := 0;\n  for JsonEnum in JsonObject.Arrays['timestamps'] do\n  begin\n    Result.Timestamps[I] := JsonEnum.Value.AsFloat;\n    Inc(I);\n  end;\n\n  
JsonData.Free; { GetJSON returns an object owned by the caller }\n\n  SherpaOnnxDestroyOfflineStreamResultJson(pJson);\nend;\n\nconstructor TSherpaOnnxOfflineStream.Create(P: Pointer);\nbegin\n  Self.Handle := P;\nend;\n\ndestructor TSherpaOnnxOfflineStream.Destroy;\nbegin\n  SherpaOnnxDestroyOfflineStream(Self.Handle);\n  Self.Handle := nil;\nend;\n\nprocedure TSherpaOnnxOfflineStream.AcceptWaveform(const Samples: array of Single; SampleRate: Integer);\nbegin\n  SherpaOnnxAcceptWaveformOffline(Self.Handle, SampleRate, pcfloat(Samples),\n    Length(Samples));\nend;\n\nfunction TSherpaOnnxOfflineRecognizerResult.ToString: AnsiString;\nvar\n  TokensStr: AnsiString;\n  S: AnsiString;\n  TimestampStr: AnsiString;\n  T: Single;\n  Sep: AnsiString;\nbegin\n  TokensStr := '[';\n  Sep := '';\n  for S in Self.Tokens do\n  begin\n    TokensStr := TokensStr + Sep + S;\n    Sep := ', ';\n  end;\n  TokensStr := TokensStr + ']';\n\n  TimestampStr := '[';\n  Sep := '';\n  for T in Self.Timestamps do\n  begin\n    TimestampStr := TimestampStr + Sep + Format('%.2f', [T]);\n    Sep := ', ';\n  end;\n  TimestampStr := TimestampStr + ']';\n\n  Result := Format('TSherpaOnnxOfflineRecognizerResult(Text := %s, ' +\n    'Tokens := %s, ' +\n    'Timestamps := %s' +\n    ')',\n    [Self.Text, TokensStr, TimestampStr]);\nend;\n\nfunction TSherpaOnnxSileroVadModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxSileroVadModelConfig(' +\n    'Model := %s, ' +\n    'Threshold := %.2f, ' +\n    'MinSilenceDuration := %.2f, ' +\n    'MinSpeechDuration := %.2f, ' +\n    'WindowSize := %d, ' +\n    'MaxSpeechDuration := %.2f' +\n    ')',\n    [Self.Model, Self.Threshold, Self.MinSilenceDuration,\n     Self.MinSpeechDuration, Self.WindowSize, Self.MaxSpeechDuration\n    ]);\nend;\n\nfunction TSherpaOnnxTenVadModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxTenVadModelConfig(' +\n    'Model := %s, ' +\n    'Threshold := %.2f, ' +\n    'MinSilenceDuration := %.2f, ' +\n    'MinSpeechDuration := %.2f, ' +\n    'WindowSize := %d, 
' +\n    'MaxSpeechDuration := %.2f' +\n    ')',\n    [Self.Model, Self.Threshold, Self.MinSilenceDuration,\n     Self.MinSpeechDuration, Self.WindowSize, Self.MaxSpeechDuration\n    ]);\nend;\n\nclass operator TSherpaOnnxOfflineFunAsrNanoModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineFunAsrNanoModelConfig);\nbegin\n  Dest.MaxNewTokens := 512;\n  Dest.Temperature := 1e-6;\n  Dest.TopP := 0.8;\n  Dest.Seed := 42;\n  Dest.UseItn := False;\nend;\n\nclass operator TSherpaOnnxSileroVadModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxSileroVadModelConfig);\nbegin\n  Dest.Threshold := 0.5;\n  Dest.MinSilenceDuration := 0.5;\n  Dest.MinSpeechDuration := 0.25;\n  Dest.WindowSize := 512;\n  Dest.MaxSpeechDuration := 5.0;\nend;\n\nclass operator TSherpaOnnxTenVadModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxTenVadModelConfig);\nbegin\n  Dest.Threshold := 0.5;\n  Dest.MinSilenceDuration := 0.5;\n  Dest.MinSpeechDuration := 0.25;\n  Dest.WindowSize := 256;\n  Dest.MaxSpeechDuration := 5.0;\nend;\n\nfunction TSherpaOnnxVadModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxVadModelConfig(' +\n    'SileroVad := %s, ' +\n    'SampleRate := %d, ' +\n    'NumThreads := %d, ' +\n    'Provider := %s, ' +\n    'Debug := %s, ' +\n    'TenVad := %s' +\n    ')',\n    [Self.SileroVad.ToString, Self.SampleRate, Self.NumThreads, Self.Provider,\n     Self.Debug.ToString, Self.TenVad.ToString\n    ]);\nend;\n\nclass operator TSherpaOnnxVadModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxVadModelConfig);\nbegin\n  Dest.SampleRate := 16000;\n  Dest.NumThreads := 1;\n  Dest.Provider := 'cpu';\n  Dest.Debug := False;\nend;\n\nclass operator TSherpaOnnxFeatureConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxFeatureConfig);\nbegin\n  Dest.SampleRate := 16000;\n  Dest.FeatureDim := 80;\nend;\n\nclass operator 
TSherpaOnnxOnlineCtcFstDecoderConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineCtcFstDecoderConfig);\nbegin\n  Dest.MaxActive := 3000;\nend;\n\nclass operator TSherpaOnnxOnlineRecognizerConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineRecognizerConfig);\nbegin\n  Dest.DecodingMethod := 'greedy_search';\n  Dest.EnableEndpoint := False;\n  Dest.Rule1MinTrailingSilence := 2.4;\n  Dest.Rule2MinTrailingSilence := 1.2;\n  Dest.Rule3MinUtteranceLength := 20;\n  Dest.HotwordsScore := 1.5;\n  Dest.BlankPenalty := 0;\nend;\n\nclass operator TSherpaOnnxOnlineModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOnlineModelConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Provider := 'cpu';\n  Dest.Debug := False;\nend;\n\nclass operator TSherpaOnnxOfflineWhisperModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineWhisperModelConfig);\nbegin\n  Dest.Task := 'transcribe';\n  Dest.TailPaddings := -1;\n  Dest.EnableTokenTimestamps := False;\n  Dest.EnableSegmentTimestamps := False;\nend;\n\nclass operator TSherpaOnnxOfflineCanaryModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineCanaryModelConfig);\nbegin\n  Dest.SrcLang := 'en';\n  Dest.TgtLang := 'en';\n  Dest.UsePnc := True;\nend;\n\nclass operator TSherpaOnnxOfflineLMConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineLMConfig);\nbegin\n  Dest.Scale := 1.0;\nend;\n\nclass operator TSherpaOnnxOfflineSenseVoiceModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSenseVoiceModelConfig);\nbegin\n  Dest.UseItn := True;\nend;\n\nclass operator TSherpaOnnxOfflineModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineModelConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Debug := False;\n  Dest.Provider := 'cpu';\nend;\n\nclass operator TSherpaOnnxOfflineRecognizerConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: 
TSherpaOnnxOfflineRecognizerConfig);\nbegin\n  Dest.DecodingMethod := 'greedy_search';\n  Dest.MaxActivePaths := 4;\n  Dest.HotwordsScore := 1.5;\n  Dest.BlankPenalty := 0;\nend;\n\nconstructor TSherpaOnnxCircularBuffer.Create(Capacity: Integer);\nbegin\n  Self.Handle := SherpaOnnxCreateCircularBuffer(Capacity);\nend;\n\ndestructor TSherpaOnnxCircularBuffer.Destroy;\nbegin\n  SherpaOnnxDestroyCircularBuffer(Self.Handle);\n  Self.Handle := nil;\nend;\n\nprocedure TSherpaOnnxCircularBuffer.Push(Samples: array of Single);\nbegin\n  SherpaOnnxCircularBufferPush(Self.Handle, pcfloat(Samples), Length(Samples));\nend;\n\nprocedure TSherpaOnnxCircularBuffer.Push(Samples: pcfloat; N: Integer);\nbegin\n  SherpaOnnxCircularBufferPush(Self.Handle, Samples, N);\nend;\n\nfunction TSherpaOnnxCircularBuffer.Get(StartIndex: Integer; N: Integer): TSherpaOnnxSamplesArray;\nvar\n  P: pcfloat;\nbegin\n  Result := nil;\n\n  if N <= 0 then\n    Exit;\n\n  P := SherpaOnnxCircularBufferGet(Self.Handle, StartIndex, N);\n  if P = nil then\n    Exit;\n\n  SetLength(Result, N);\n\n  Move(P[0], Result[0], N * SizeOf(Single));\n\n  SherpaOnnxCircularBufferFree(P);\nend;\n\nprocedure TSherpaOnnxCircularBuffer.Pop(N: Integer);\nbegin\n  SherpaOnnxCircularBufferPop(Self.Handle, N);\nend;\n\nprocedure TSherpaOnnxCircularBuffer.Reset;\nbegin\n  SherpaOnnxCircularBufferReset(Self.Handle);\nend;\n\nfunction TSherpaOnnxCircularBuffer.Size: Integer;\nbegin\n  Result := SherpaOnnxCircularBufferSize(Self.Handle);\nend;\n\nfunction TSherpaOnnxCircularBuffer.Head: Integer;\nbegin\n  Result := SherpaOnnxCircularBufferHead(Self.Handle);\nend;\n\nconstructor TSherpaOnnxVoiceActivityDetector.Create(Config: TSherpaOnnxVadModelConfig; BufferSizeInSeconds: Single);\nvar\n  C: SherpaOnnxVadModelConfig ;\nbegin\n  C := Default(SherpaOnnxVadModelConfig);\n  Self._Config := Config;\n\n  C.SileroVad.Model := PAnsiChar(Config.SileroVad.Model);\n  C.SileroVad.Threshold := Config.SileroVad.Threshold;\n  
C.SileroVad.MinSilenceDuration := Config.SileroVad.MinSilenceDuration;\n  C.SileroVad.MinSpeechDuration := Config.SileroVad.MinSpeechDuration;\n  C.SileroVad.WindowSize := Config.SileroVad.WindowSize;\n  C.SileroVad.MaxSpeechDuration := Config.SileroVad.MaxSpeechDuration;\n\n  C.TenVad.Model := PAnsiChar(Config.TenVad.Model);\n  C.TenVad.Threshold := Config.TenVad.Threshold;\n  C.TenVad.MinSilenceDuration := Config.TenVad.MinSilenceDuration;\n  C.TenVad.MinSpeechDuration := Config.TenVad.MinSpeechDuration;\n  C.TenVad.WindowSize := Config.TenVad.WindowSize;\n  C.TenVad.MaxSpeechDuration := Config.TenVad.MaxSpeechDuration;\n\n  C.SampleRate := Config.SampleRate;\n  C.NumThreads := Config.NumThreads;\n  C.Provider := PAnsiChar(Config.Provider);\n  C.Debug := Ord(Config.Debug);\n\n  Self.Handle := SherpaOnnxCreateVoiceActivityDetector(@C, BufferSizeInSeconds);\nend;\n\ndestructor TSherpaOnnxVoiceActivityDetector.Destroy;\nbegin\n  SherpaOnnxDestroyVoiceActivityDetector(Self.Handle);\n  Self.Handle := nil;\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.AcceptWaveform(const Samples: array of Single);\nbegin\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform(Self.Handle, pcfloat(Samples), Length(Samples));\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.AcceptWaveform(const Samples: array of Single; Offset: Integer; N: Integer);\nbegin\n  { Reject negative values as well as ranges that run past the end of Samples. }\n  if (Offset < 0) or (N < 0) or (Offset + N > Length(Samples)) then\n    begin\n      WriteLn(Format('Invalid arguments. 
Array length: %d, Offset: %d, N: %d',\n        [Length(Samples), Offset, N]\n      ));\n      Exit;\n    end;\n\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform(Self.Handle,\n    pcfloat(Samples) + Offset, N);\nend;\n\nfunction TSherpaOnnxVoiceActivityDetector.IsEmpty: Boolean;\nbegin\n  Result := SherpaOnnxVoiceActivityDetectorEmpty(Self.Handle) = 1;\nend;\n\nfunction TSherpaOnnxVoiceActivityDetector.IsDetected: Boolean;\nbegin\n  Result := SherpaOnnxVoiceActivityDetectorDetected(Self.Handle) = 1;\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.Pop;\nbegin\n  SherpaOnnxVoiceActivityDetectorPop(Self.Handle);\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.Clear;\nbegin\n  SherpaOnnxVoiceActivityDetectorClear(Self.Handle);\nend;\n\nfunction TSherpaOnnxVoiceActivityDetector.Front: TSherpaOnnxSpeechSegment;\nvar\n  P: PSherpaOnnxSpeechSegment;\nbegin\n  Result := Default(TSherpaOnnxSpeechSegment);\n\n  P := SherpaOnnxVoiceActivityDetectorFront(Self.Handle);\n  if P = nil then\n    Exit;\n\n  Result.Start := P^.Start;\n  Result.Samples := nil;\n  SetLength(Result.Samples, P^.N);\n\n  if P^.N > 0 then\n    Move(P^.Samples[0], Result.Samples[0], P^.N * SizeOf(Single));\n\n  SherpaOnnxDestroySpeechSegment(P);\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.Reset;\nbegin\n  SherpaOnnxVoiceActivityDetectorReset(Self.Handle);\nend;\n\nprocedure TSherpaOnnxVoiceActivityDetector.Flush;\nbegin\n  SherpaOnnxVoiceActivityDetectorFlush(Self.Handle);\nend;\n\nfunction TSherpaOnnxOfflineTtsVitsModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsVitsModelConfig(' +\n    'Model := %s, ' +\n    'Lexicon := %s, ' +\n    'Tokens := %s, ' +\n    'DataDir := %s, ' +\n    'NoiseScale := %.2f, ' +\n    'NoiseScaleW := %.2f, ' +\n    'LengthScale := %.2f' +\n    ')',\n    [Self.Model, Self.Lexicon, Self.Tokens, Self.DataDir, Self.NoiseScale,\n     Self.NoiseScaleW, Self.LengthScale\n    ]);\nend;\n\nclass operator 
TSherpaOnnxOfflineTtsVitsModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsVitsModelConfig);\nbegin\n  Dest.NoiseScale := 0.667;\n  Dest.NoiseScaleW := 0.8;\n  Dest.LengthScale := 1.0;\nend;\n\nclass operator TSherpaOnnxGenerationConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxGenerationConfig);\nbegin\n  Dest.SilenceScale := 0.2;\n  Dest.Speed := 1.0;\n  Dest.Sid := 0;\n  Dest.ReferenceAudioLen := 0;\n  Dest.ReferenceSampleRate := 0;\n  Dest.NumSteps := 5;\nend;\n\nfunction TSherpaOnnxOfflineTtsMatchaModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsMatchaModelConfig(' +\n    'AcousticModel := %s, ' +\n    'Vocoder := %s, ' +\n    'Lexicon := %s, ' +\n    'Tokens := %s, ' +\n    'DataDir := %s, ' +\n    'NoiseScale := %.2f, ' +\n    'LengthScale := %.2f' +\n    ')',\n    [Self.AcousticModel, Self.Vocoder, Self.Lexicon, Self.Tokens,\n     Self.DataDir, Self.NoiseScale, Self.LengthScale\n    ]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsMatchaModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsMatchaModelConfig);\nbegin\n  Dest.NoiseScale := 0.667;\n  Dest.LengthScale := 1.0;\nend;\n\nfunction TSherpaOnnxOfflineTtsKokoroModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsKokoroModelConfig(' +\n    'Model := %s, ' +\n    'Voices := %s, ' +\n    'Tokens := %s, ' +\n    'DataDir := %s, ' +\n    'LengthScale := %.2f, ' +\n    'Lexicon := %s, ' +\n    'Lang := %s' +\n    ')',\n    [Self.Model, Self.Voices, Self.Tokens, Self.DataDir, Self.LengthScale,\n     Self.Lexicon, Self.Lang]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsKokoroModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsKokoroModelConfig);\nbegin\n  Dest.LengthScale := 1.0;\nend;\n\nfunction TSherpaOnnxOfflineTtsKittenModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsKittenModelConfig(' +\n   
 'Model := %s, ' +\n    'Voices := %s, ' +\n    'Tokens := %s, ' +\n    'DataDir := %s, ' +\n    'LengthScale := %.2f' +\n    ')',\n    [Self.Model, Self.Voices, Self.Tokens, Self.DataDir, Self.LengthScale]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsKittenModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsKittenModelConfig);\nbegin\n  Dest.LengthScale := 1.0;\nend;\n\nfunction TSherpaOnnxOfflineTtsZipVoiceModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsZipVoiceModelConfig(' +\n    'Tokens := %s, ' +\n    'Encoder := %s, ' +\n    'Decoder := %s, ' +\n    'Vocoder := %s, ' +\n    'DataDir := %s, ' +\n    'Lexicon := %s, ' +\n    'FeatScale := %.2f, ' +\n    'Tshift := %.2f, ' +\n    'TargetRms := %.2f, ' +\n    'GuidanceScale := %.2f' +\n    ')',\n    [Self.Tokens, Self.Encoder, Self.Decoder, Self.Vocoder,\n     Self.DataDir, Self.Lexicon, Self.FeatScale, Self.Tshift,\n     Self.TargetRms, Self.GuidanceScale]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsZipVoiceModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsZipVoiceModelConfig);\nbegin\n  Dest.FeatScale := 0.1;\n  Dest.Tshift := 0.5;\n  Dest.TargetRms := 0.1;\n  Dest.GuidanceScale := 1.0;\nend;\n\nclass operator TSherpaOnnxOfflineTtsPocketModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsPocketModelConfig);\nbegin\n  Dest.VoiceEmbeddingCacheCapacity := 50;\nend;\n\nfunction TSherpaOnnxOfflineTtsPocketModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsPocketModelConfig(' +\n    'LmFlow := %s, ' +\n    'LmMain := %s, ' +\n    'Encoder := %s, ' +\n    'Decoder := %s, ' +\n    'TextConditioner := %s, ' +\n    'VocabJson := %s, ' +\n    'TokenScoresJson := %s, ' +\n    'VoiceEmbeddingCacheCapacity := %d' +\n    ')',\n    [Self.LmFlow, Self.LmMain, Self.Encoder, Self.Decoder, Self.TextConditioner,\n     Self.VocabJson, Self.TokenScoresJson, 
Self.VoiceEmbeddingCacheCapacity]);\nend;\n\nfunction TSherpaOnnxOfflineTtsSupertonicModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsSupertonicModelConfig(' +\n    'DurationPredictor := %s, ' +\n    'TextEncoder := %s, ' +\n    'VectorEstimator := %s, ' +\n    'Vocoder := %s, ' +\n    'TtsJson := %s, ' +\n    'UnicodeIndexer := %s, ' +\n    'VoiceStyle := %s' +\n    ')',\n    [Self.DurationPredictor, Self.TextEncoder, Self.VectorEstimator, Self.Vocoder,\n     Self.TtsJson, Self.UnicodeIndexer, Self.VoiceStyle]);\nend;\n\nfunction TSherpaOnnxOfflineTtsModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsModelConfig(' +\n    'Vits := %s, ' +\n    'NumThreads := %d, ' +\n    'Debug := %s, ' +\n    'Provider := %s, ' +\n    'Matcha := %s, ' +\n    'Kokoro := %s, ' +\n    'Kitten := %s, ' +\n    'ZipVoice := %s, ' +\n    'Pocket := %s, ' +\n    'Supertonic := %s' +\n    ')',\n    [Self.Vits.ToString, Self.NumThreads, Self.Debug.ToString, Self.Provider,\n     Self.Matcha.ToString, Self.Kokoro.ToString, Self.Kitten.ToString,\n     Self.ZipVoice.ToString, Self.Pocket.ToString, Self.Supertonic.ToString\n    ]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsModelConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Debug := False;\n  Dest.Provider := 'cpu';\nend;\n\nfunction TSherpaOnnxOfflineTtsConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineTtsConfig(' +\n    'Model := %s, ' +\n    'RuleFsts := %s, ' +\n    'MaxNumSentences := %d, ' +\n    'RuleFars := %s, ' +\n    'SilenceScale := %f' +\n    ')',\n    [Self.Model.ToString, Self.RuleFsts, Self.MaxNumSentences, Self.RuleFars,\n     Self.SilenceScale]);\nend;\n\nclass operator TSherpaOnnxOfflineTtsConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineTtsConfig);\nbegin\n  Dest.MaxNumSentences := 1;\n  Dest.SilenceScale := 
0.2;\nend;\n\nconstructor TSherpaOnnxOfflineTts.Create(Config: TSherpaOnnxOfflineTtsConfig);\nvar\n  C: SherpaOnnxOfflineTtsConfig;\nbegin\n  C := Default(SherpaOnnxOfflineTtsConfig);\n  Self._Config := Config;\n\n  C.Model.Vits.Model := PAnsiChar(Config.Model.Vits.Model);\n  C.Model.Vits.Lexicon := PAnsiChar(Config.Model.Vits.Lexicon);\n  C.Model.Vits.Tokens := PAnsiChar(Config.Model.Vits.Tokens);\n  C.Model.Vits.DataDir := PAnsiChar(Config.Model.Vits.DataDir);\n  C.Model.Vits.NoiseScale := Config.Model.Vits.NoiseScale;\n  C.Model.Vits.NoiseScaleW := Config.Model.Vits.NoiseScaleW;\n  C.Model.Vits.LengthScale := Config.Model.Vits.LengthScale;\n\n  C.Model.Matcha.AcousticModel := PAnsiChar(Config.Model.Matcha.AcousticModel);\n  C.Model.Matcha.Vocoder := PAnsiChar(Config.Model.Matcha.Vocoder);\n  C.Model.Matcha.Lexicon := PAnsiChar(Config.Model.Matcha.Lexicon);\n  C.Model.Matcha.Tokens := PAnsiChar(Config.Model.Matcha.Tokens);\n  C.Model.Matcha.DataDir := PAnsiChar(Config.Model.Matcha.DataDir);\n  C.Model.Matcha.NoiseScale := Config.Model.Matcha.NoiseScale;\n  C.Model.Matcha.LengthScale := Config.Model.Matcha.LengthScale;\n\n  C.Model.Kokoro.Model := PAnsiChar(Config.Model.Kokoro.Model);\n  C.Model.Kokoro.Voices := PAnsiChar(Config.Model.Kokoro.Voices);\n  C.Model.Kokoro.Tokens := PAnsiChar(Config.Model.Kokoro.Tokens);\n  C.Model.Kokoro.DataDir := PAnsiChar(Config.Model.Kokoro.DataDir);\n  C.Model.Kokoro.LengthScale := Config.Model.Kokoro.LengthScale;\n  C.Model.Kokoro.Lexicon := PAnsiChar(Config.Model.Kokoro.Lexicon);\n  C.Model.Kokoro.Lang := PAnsiChar(Config.Model.Kokoro.Lang);\n\n  C.Model.Kitten.Model := PAnsiChar(Config.Model.Kitten.Model);\n  C.Model.Kitten.Voices := PAnsiChar(Config.Model.Kitten.Voices);\n  C.Model.Kitten.Tokens := PAnsiChar(Config.Model.Kitten.Tokens);\n  C.Model.Kitten.DataDir := PAnsiChar(Config.Model.Kitten.DataDir);\n  C.Model.Kitten.LengthScale := Config.Model.Kitten.LengthScale;\n\n  C.Model.ZipVoice.Tokens := 
PAnsiChar(Config.Model.ZipVoice.Tokens);\n  C.Model.ZipVoice.Encoder := PAnsiChar(Config.Model.ZipVoice.Encoder);\n  C.Model.ZipVoice.Decoder := PAnsiChar(Config.Model.ZipVoice.Decoder);\n  C.Model.ZipVoice.Vocoder := PAnsiChar(Config.Model.ZipVoice.Vocoder);\n  C.Model.ZipVoice.DataDir := PAnsiChar(Config.Model.ZipVoice.DataDir);\n  C.Model.ZipVoice.Lexicon := PAnsiChar(Config.Model.ZipVoice.Lexicon);\n  C.Model.ZipVoice.FeatScale := Config.Model.ZipVoice.FeatScale;\n  C.Model.ZipVoice.Tshift := Config.Model.ZipVoice.Tshift;\n  C.Model.ZipVoice.TargetRms := Config.Model.ZipVoice.TargetRms;\n  C.Model.ZipVoice.GuidanceScale := Config.Model.ZipVoice.GuidanceScale;\n\n  C.Model.Pocket.LmFlow := PAnsiChar(Config.Model.Pocket.LmFlow);\n  C.Model.Pocket.LmMain := PAnsiChar(Config.Model.Pocket.LmMain);\n  C.Model.Pocket.Encoder := PAnsiChar(Config.Model.Pocket.Encoder);\n  C.Model.Pocket.Decoder := PAnsiChar(Config.Model.Pocket.Decoder);\n  C.Model.Pocket.TextConditioner := PAnsiChar(Config.Model.Pocket.TextConditioner);\n  C.Model.Pocket.VocabJson := PAnsiChar(Config.Model.Pocket.VocabJson);\n  C.Model.Pocket.TokenScoresJson := PAnsiChar(Config.Model.Pocket.TokenScoresJson);\n  C.Model.Pocket.VoiceEmbeddingCacheCapacity := Config.Model.Pocket.VoiceEmbeddingCacheCapacity;\n\n  C.Model.Supertonic.DurationPredictor := PAnsiChar(Config.Model.Supertonic.DurationPredictor);\n  C.Model.Supertonic.TextEncoder := PAnsiChar(Config.Model.Supertonic.TextEncoder);\n  C.Model.Supertonic.VectorEstimator := PAnsiChar(Config.Model.Supertonic.VectorEstimator);\n  C.Model.Supertonic.Vocoder := PAnsiChar(Config.Model.Supertonic.Vocoder);\n  C.Model.Supertonic.TtsJson := PAnsiChar(Config.Model.Supertonic.TtsJson);\n  C.Model.Supertonic.UnicodeIndexer := PAnsiChar(Config.Model.Supertonic.UnicodeIndexer);\n  C.Model.Supertonic.VoiceStyle := PAnsiChar(Config.Model.Supertonic.VoiceStyle);\n\n  C.Model.NumThreads := Config.Model.NumThreads;\n  C.Model.Provider := 
PAnsiChar(Config.Model.Provider);\n  C.Model.Debug := Ord(Config.Model.Debug);\n\n  C.RuleFsts := PAnsiChar(Config.RuleFsts);\n  C.MaxNumSentences := Config.MaxNumSentences;\n  C.RuleFars := PAnsiChar(Config.RuleFars);\n  C.SilenceScale := Config.SilenceScale;\n\n  Self.Handle := SherpaOnnxCreateOfflineTts(@C);\n\n  Self.SampleRate := SherpaOnnxOfflineTtsSampleRate(Self.Handle);\n  Self.NumSpeakers := SherpaOnnxOfflineTtsNumSpeakers(Self.Handle);\nend;\n\ndestructor TSherpaOnnxOfflineTts.Destroy;\nbegin\n  SherpaOnnxDestroyOfflineTts(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction ExtractGeneratedAudio(Audio: PSherpaOnnxGeneratedAudio): TSherpaOnnxGeneratedAudio;\nbegin\n  Result := Default(TSherpaOnnxGeneratedAudio);\n\n  if Audio = nil then\n    Exit;\n\n  SetLength(Result.Samples, Audio^.N);\n  Result.SampleRate := Audio^.SampleRate;\n\n  if Audio^.N > 0 then\n    Move(Audio^.Samples[0], Result.Samples[0], Audio^.N * SizeOf(Single));\n\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio(Audio);\nend;\n\nfunction TSherpaOnnxOfflineTts.Generate(Text: AnsiString; SpeakerId: Integer;\n  Speed: Single): TSherpaOnnxGeneratedAudio;\nvar\n  Audio: PSherpaOnnxGeneratedAudio;\nbegin\n  Audio := SherpaOnnxOfflineTtsGenerate(Self.Handle, PAnsiChar(Text), SpeakerId, Speed);\n  Result := ExtractGeneratedAudio(Audio);\nend;\n\nfunction TSherpaOnnxOfflineTts.Generate(Text: AnsiString; SpeakerId: Integer;\n  Speed: Single;\n  Callback: TSherpaOnnxGeneratedAudioCallbackWithArg;\n  Arg: Pointer\n  ): TSherpaOnnxGeneratedAudio;\nvar\n  Audio: PSherpaOnnxGeneratedAudio;\nbegin\n  Audio := SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(Self.Handle, PAnsiChar(Text),\n    SpeakerId, Speed, Callback, Arg);\n  Result := ExtractGeneratedAudio(Audio);\nend;\n\nfunction TSherpaOnnxOfflineTts.Generate(Text: AnsiString;\n  GenerationConfig: TSherpaOnnxGenerationConfig;\n  Callback: TSherpaOnnxGeneratedAudioProgressCallbackWithArg;\n  Arg: Pointer\n  ): TSherpaOnnxGeneratedAudio;\nvar\n  
Audio: PSherpaOnnxGeneratedAudio;\n  C: SherpaOnnxGenerationConfig;\n  ReferenceAudio: TSherpaOnnxSamplesArray;\n  CReferenceAudio: pcfloat;\n  ReferenceText: AnsiString;\n  Extra: AnsiString;\nbegin\n  C := Default(SherpaOnnxGenerationConfig);\n  C.SilenceScale := GenerationConfig.SilenceScale;\n  C.Speed := GenerationConfig.Speed;\n  C.Sid := GenerationConfig.Sid;\n  ReferenceAudio := GenerationConfig.ReferenceAudio;\n  CReferenceAudio := nil;\n  C.ReferenceAudio := nil;\n  C.ReferenceAudioLen := Length(ReferenceAudio);\n  if C.ReferenceAudioLen > 0 then\n    begin\n      GetMem(CReferenceAudio, C.ReferenceAudioLen * SizeOf(Single));\n      Move(ReferenceAudio[0], CReferenceAudio[0], C.ReferenceAudioLen * SizeOf(Single));\n      C.ReferenceAudio := CReferenceAudio;\n    end;\n  C.ReferenceSampleRate:= GenerationConfig.ReferenceSampleRate;\n  ReferenceText := GenerationConfig.ReferenceText;\n  C.ReferenceText := PAnsiChar(ReferenceText);\n  C.NumSteps := GenerationConfig.NumSteps;\n  Extra := GenerationConfig.Extra;\n  C.Extra := PAnsiChar(Extra);\n\n  Audio := nil;\n  try\n    Audio := SherpaOnnxOfflineTtsGenerateWithConfig(Self.Handle, PAnsiChar(Text),\n      @C, Callback, Arg);\n  finally\n    if CReferenceAudio <> nil then\n      FreeMem(CReferenceAudio);\n  end;\n\n  Result := ExtractGeneratedAudio(Audio);\nend;\n\nconstructor TSherpaOnnxLinearResampler.Create(SampleRateIn: Integer; SampleRateOut: Integer);\nvar\n  MinFreq: Single;\n  LowpassCutoff: Single;\n  LowpassFilterWidth: Integer = 6;\nbegin\n  if SampleRateIn > SampleRateOut then\n    MinFreq := SampleRateOut\n  else\n    MinFreq := SampleRateIn;\n\n  LowpassCutoff := 0.99 * 0.5 * MinFreq;\n\n  Self.Handle := SherpaOnnxCreateLinearResampler(SampleRateIn,\n    SampleRateOut, LowpassCutoff, LowpassFilterWidth);\n  Self.InputSampleRate := SampleRateIn;\n  Self.OutputSampleRate := SampleRateOut;\nend;\n\ndestructor TSherpaOnnxLinearResampler.Destroy;\nbegin\n  
SherpaOnnxDestroyLinearResampler(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction TSherpaOnnxLinearResampler.Resample(Samples: pcfloat;\n  N: Integer; Flush: Boolean): TSherpaOnnxSamplesArray;\nvar\n  P: PSherpaOnnxResampleOut;\nbegin\n  Result := Default(TSherpaOnnxSamplesArray);\n  P := SherpaOnnxLinearResamplerResample(Self.Handle, Samples, N, Ord(Flush));\n  if P = nil then\n    Exit;\n\n  SetLength(Result, P^.N);\n\n  if P^.N > 0 then\n    Move(P^.Samples[0], Result[0], P^.N * SizeOf(Single));\n\n  SherpaOnnxLinearResamplerResampleFree(P);\nend;\n\nfunction TSherpaOnnxLinearResampler.Resample(const Samples: array of Single; Flush: Boolean): TSherpaOnnxSamplesArray;\nbegin\n  Result := Self.Resample(pcfloat(Samples), Length(Samples), Flush);\nend;\n\nprocedure TSherpaOnnxLinearResampler.Reset;\nbegin\n  SherpaOnnxLinearResamplerReset(Self.Handle);\nend;\n\nfunction TSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(' +\n    'Model := %s)',[Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineSpeakerSegmentationModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(' +\n    'Pyannote := %s, ' +\n    'NumThreads := %d, ' +\n    'Debug := %s, ' +\n    'Provider := %s)',\n    [Self.Pyannote.ToString, Self.NumThreads,\n     Self.Debug.ToString, Self.Provider]);\nend;\n\nclass operator TSherpaOnnxOfflineSpeakerSegmentationModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeakerSegmentationModelConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Debug := False;\n  Dest.Provider := 'cpu';\nend;\n\nfunction TSherpaOnnxFastClusteringConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxFastClusteringConfig(' +\n    'NumClusters := %d, Threshold := %.3f)',\n    [Self.NumClusters, Self.Threshold]);\nend;\n\nclass operator 
TSherpaOnnxFastClusteringConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxFastClusteringConfig);\nbegin\n  Dest.NumClusters := -1;\n  Dest.Threshold := 0.5;\nend;\n\nfunction TSherpaOnnxSpeakerEmbeddingExtractorConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxSpeakerEmbeddingExtractorConfig(' +\n    'Model := %s, '+\n    'NumThreads := %d, '+\n    'Debug := %s, '+\n    'Provider := %s)',\n    [Self.Model, Self.NumThreads, Self.Debug.ToString, Self.Provider]);\nend;\n\nclass operator TSherpaOnnxSpeakerEmbeddingExtractorConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxSpeakerEmbeddingExtractorConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Debug := False;\n  Dest.Provider := 'cpu';\nend;\n\nfunction TSherpaOnnxOfflineSpeakerDiarizationConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeakerDiarizationConfig(' +\n    'Segmentation := %s, '+\n    'Embedding := %s, '+\n    'Clustering := %s, '+\n    'MinDurationOn := %.3f, '+\n    'MinDurationOff := %.3f)',\n    [Self.Segmentation.ToString, Self.Embedding.ToString,\n     Self.Clustering.ToString, Self.MinDurationOn, Self.MinDurationOff]);\nend;\n\nclass operator TSherpaOnnxOfflineSpeakerDiarizationConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeakerDiarizationConfig);\nbegin\n  Dest.MinDurationOn := 0.2;\n  Dest.MinDurationOff := 0.5;\nend;\n\nfunction TSherpaOnnxOfflineSpeakerDiarizationSegment.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeakerDiarizationSegment(' +\n    'Start := %.3f, '+\n    'Stop := %.3f, '+\n    'Speaker := %d)',\n    [Self.Start, Self.Stop, Self.Speaker]);\nend;\n\nconstructor TSherpaOnnxOfflineSpeakerDiarization.Create(Config: TSherpaOnnxOfflineSpeakerDiarizationConfig);\nvar\n  C: SherpaOnnxOfflineSpeakerDiarizationConfig;\nbegin\n  C := Default(SherpaOnnxOfflineSpeakerDiarizationConfig);\n  C.Segmentation.Pyannote.Model := 
PAnsiChar(Config.Segmentation.Pyannote.Model);\n  C.Segmentation.NumThreads := Config.Segmentation.NumThreads;\n  C.Segmentation.Debug := Ord(Config.Segmentation.Debug);\n  C.Segmentation.Provider := PAnsiChar(Config.Segmentation.Provider);\n\n  C.Embedding.Model := PAnsiChar(Config.Embedding.Model);\n  C.Embedding.NumThreads := Config.Embedding.NumThreads;\n  C.Embedding.Debug := Ord(Config.Embedding.Debug);\n  C.Embedding.Provider := PAnsiChar(Config.Embedding.Provider);\n\n  C.Clustering.NumClusters := Config.Clustering.NumClusters;\n  C.Clustering.Threshold := Config.Clustering.Threshold;\n\n  C.MinDurationOn := Config.MinDurationOn;\n  C.MinDurationOff := Config.MinDurationOff;\n\n  Self.Handle := SherpaOnnxCreateOfflineSpeakerDiarization(@C);\n  Self._Config := Config;\n  Self.SampleRate :=  0;\n\n  if Self.Handle <> nil then\n    begin\n      Self.SampleRate := SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(Self.Handle);\n    end;\nend;\n\ndestructor TSherpaOnnxOfflineSpeakerDiarization.Destroy;\nbegin\n  SherpaOnnxDestroyOfflineSpeakerDiarization(Self.Handle);\n  Self.Handle := nil;\nend;\n\nprocedure TSherpaOnnxOfflineSpeakerDiarization.SetConfig(Config: TSherpaOnnxOfflineSpeakerDiarizationConfig);\nvar\n  C: SherpaOnnxOfflineSpeakerDiarizationConfig;\nbegin\n  C := Default(SherpaOnnxOfflineSpeakerDiarizationConfig);\n\n  C.Clustering.NumClusters := Config.Clustering.NumClusters;\n  C.Clustering.Threshold := Config.Clustering.Threshold;\n\n  SherpaOnnxOfflineSpeakerDiarizationSetConfig(Self.Handle, @C);\nend;\n\nfunction TSherpaOnnxOfflineSpeakerDiarization.Process(const Samples: array of Single): TSherpaOnnxOfflineSpeakerDiarizationSegmentArray;\nvar\n  R: Pointer;\n  NumSegments: Integer;\n  I: Integer;\n  Segments: PSherpaOnnxOfflineSpeakerDiarizationSegment;\nbegin\n  Result := nil;\n\n  R := SherpaOnnxOfflineSpeakerDiarizationProcess(Self.Handle, pcfloat(Samples), Length(Samples));\n  if R = nil then\n    begin\n      Exit\n    end;\n  NumSegments 
:= SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(R);\n\n  Segments := SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(R);\n\n  SetLength(Result, NumSegments);\n  for I := Low(Result) to High(Result) do\n    begin\n      Result[I].Start := Segments[I].Start;\n      Result[I].Stop := Segments[I].Stop;\n      Result[I].Speaker := Segments[I].Speaker;\n    end;\n\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment(Segments);\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult(R);\nend;\n\nfunction TSherpaOnnxOfflineSpeakerDiarization.Process(const Samples: array of Single;\n  callback: PSherpaOnnxOfflineSpeakerDiarizationProgressCallbackNoArg): TSherpaOnnxOfflineSpeakerDiarizationSegmentArray;\nvar\n  R: Pointer;\n  NumSegments: Integer;\n  I: Integer;\n  Segments: PSherpaOnnxOfflineSpeakerDiarizationSegment;\nbegin\n  Result := nil;\n\n  R := SherpaOnnxOfflineSpeakerDiarizationProcessWithCallbackNoArg(Self.Handle, pcfloat(Samples), Length(Samples), callback);\n  if R = nil then\n    begin\n      Exit\n    end;\n  NumSegments := SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(R);\n\n  Segments := SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(R);\n\n  SetLength(Result, NumSegments);\n  for I := Low(Result) to High(Result) do\n    begin\n      Result[I].Start := Segments[I].Start;\n      Result[I].Stop := Segments[I].Stop;\n      Result[I].Speaker := Segments[I].Speaker;\n    end;\n\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment(Segments);\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult(R);\nend;\n\nfunction TSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(' +\n    'Model := %s)', [Self.Model]);\nend;\n\nfunction TSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(' +\n    'Model := %s)', [Self.Model]);\nend;\n\nfunction 
TSherpaOnnxOfflineSpeechDenoiserModelConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeechDenoiserModelConfig(' +\n    'Gtcrn := %s, '+\n    'DpdfNet := %s, '+\n    'NumThreads := %d, '+\n    'Debug := %s, '+\n    'Provider := %s)',\n    [Self.Gtcrn.ToString, Self.DpdfNet.ToString, Self.NumThreads, Self.Debug.ToString, Self.Provider]);\nend;\n\nclass operator TSherpaOnnxOfflineSpeechDenoiserModelConfig.Initialize({$IFDEF FPC}var{$ELSE}out{$ENDIF} Dest: TSherpaOnnxOfflineSpeechDenoiserModelConfig);\nbegin\n  Dest.NumThreads := 1;\n  Dest.Debug := False;\n  Dest.Provider := 'cpu';\nend;\n\nfunction TSherpaOnnxOfflineSpeechDenoiserConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOfflineSpeechDenoiserConfig(' +\n    'Model := %s)', [Self.Model.ToString]);\nend;\n\nfunction TSherpaOnnxOnlineSpeechDenoiserConfig.ToString: AnsiString;\nbegin\n  Result := Format('TSherpaOnnxOnlineSpeechDenoiserConfig(' +\n    'Model := %s)', [Self.Model.ToString]);\nend;\n\nfunction ExtractDenoisedAudio(Audio: PSherpaOnnxDenoisedAudio): TSherpaOnnxDenoisedAudio;\nbegin\n  Result := Default(TSherpaOnnxDenoisedAudio);\n\n  if Audio = nil then\n    Exit;\n\n  SetLength(Result.Samples, Audio^.N);\n  Result.SampleRate := Audio^.SampleRate;\n\n  if Audio^.N > 0 then\n    Move(Audio^.Samples[0], Result.Samples[0], Audio^.N * SizeOf(Single));\n\n  SherpaOnnxDestroyDenoisedAudio(Audio);\nend;\n\nconstructor TSherpaOnnxOfflineSpeechDenoiser.Create(Config: TSherpaOnnxOfflineSpeechDenoiserConfig);\nvar\n  C: SherpaOnnxOfflineSpeechDenoiserConfig;\nbegin\n  C := Default(SherpaOnnxOfflineSpeechDenoiserConfig);\n  C.Model.Gtcrn.Model := PAnsiChar(Config.Model.Gtcrn.Model);\n  C.Model.DpdfNet.Model := PAnsiChar(Config.Model.DpdfNet.Model);\n  C.Model.NumThreads := Config.Model.NumThreads;\n  C.Model.Debug := Ord(Config.Model.Debug);\n  C.Model.Provider := PAnsiChar(Config.Model.Provider);\n\n  Self.Handle := SherpaOnnxCreateOfflineSpeechDenoiser(@C);\n  
Self._Config := Config;\n  Self.SampleRate :=  0;\n\n  if Self.Handle <> nil then\n    begin\n      Self.SampleRate := SherpaOnnxOfflineSpeechDenoiserGetSampleRate(Self.Handle);\n    end;\nend;\n\ndestructor TSherpaOnnxOfflineSpeechDenoiser.Destroy;\nbegin\n  SherpaOnnxDestroyOfflineSpeechDenoiser(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction TSherpaOnnxOfflineSpeechDenoiser.Run(const Samples: array of Single; InputSampleRate: Integer): TSherpaOnnxDenoisedAudio;\nvar\n  Audio: PSherpaOnnxDenoisedAudio;\nbegin\n  Audio := SherpaOnnxOfflineSpeechDenoiserRun(Self.Handle, pcfloat(Samples), Length(Samples), InputSampleRate);\n  Result := ExtractDenoisedAudio(Audio);\nend;\n\nconstructor TSherpaOnnxOnlineSpeechDenoiser.Create(Config: TSherpaOnnxOnlineSpeechDenoiserConfig);\nvar\n  C: SherpaOnnxOnlineSpeechDenoiserConfig;\nbegin\n  C := Default(SherpaOnnxOnlineSpeechDenoiserConfig);\n  C.Model.Gtcrn.Model := PAnsiChar(Config.Model.Gtcrn.Model);\n  C.Model.DpdfNet.Model := PAnsiChar(Config.Model.DpdfNet.Model);\n  C.Model.NumThreads := Config.Model.NumThreads;\n  C.Model.Debug := Ord(Config.Model.Debug);\n  C.Model.Provider := PAnsiChar(Config.Model.Provider);\n\n  Self.Handle := SherpaOnnxCreateOnlineSpeechDenoiser(@C);\n  Self._Config := Config;\n  Self.SampleRate := 0;\n  Self.FrameShiftInSamples := 0;\n\n  if Self.Handle <> nil then\n    begin\n      Self.SampleRate := SherpaOnnxOnlineSpeechDenoiserGetSampleRate(Self.Handle);\n      Self.FrameShiftInSamples := SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(Self.Handle);\n    end;\nend;\n\ndestructor TSherpaOnnxOnlineSpeechDenoiser.Destroy;\nbegin\n  SherpaOnnxDestroyOnlineSpeechDenoiser(Self.Handle);\n  Self.Handle := nil;\nend;\n\nfunction TSherpaOnnxOnlineSpeechDenoiser.Run(const Samples: array of Single; InputSampleRate: Integer): TSherpaOnnxDenoisedAudio;\nvar\n  Audio: PSherpaOnnxDenoisedAudio;\nbegin\n  Audio := SherpaOnnxOnlineSpeechDenoiserRun(Self.Handle, pcfloat(Samples), Length(Samples), 
InputSampleRate);\n  Result := ExtractDenoisedAudio(Audio);\nend;\n\nfunction TSherpaOnnxOnlineSpeechDenoiser.Flush: TSherpaOnnxDenoisedAudio;\nvar\n  Audio: PSherpaOnnxDenoisedAudio;\nbegin\n  Audio := SherpaOnnxOnlineSpeechDenoiserFlush(Self.Handle);\n  Result := ExtractDenoisedAudio(Audio);\nend;\n\nprocedure TSherpaOnnxOnlineSpeechDenoiser.Reset;\nbegin\n  SherpaOnnxOnlineSpeechDenoiserReset(Self.Handle);\nend;\n\ninitialization\n  { Match the C API's default behavior. PocketTTS can raise FP overflow flags\n    during native inference on some platforms, and Free Pascal would otherwise\n    surface them as EOverflow.\n    See also https://github.com/k2-fsa/sherpa-onnx/pull/3351\n  }\n  SetExceptionMask([exInvalidOp, exDenormalized, exZeroDivide, exOverflow,\n    exUnderflow, exPrecision]);\n\nend.\n"
  },
  {
    "path": "sherpa-onnx/python/CMakeLists.txt",
    "content": "add_subdirectory(csrc)\n\nif(SHERPA_ONNX_ENABLE_TESTS)\n  add_subdirectory(tests)\nendif()\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/CMakeLists.txt",
    "content": "include_directories(${PROJECT_SOURCE_DIR})\n\nset(srcs\n  audio-tagging.cc\n  circular-buffer.cc\n  cuda-config.cc\n  display.cc\n  endpoint.cc\n  features.cc\n  homophone-replacer.cc\n  keyword-spotter.cc\n  offline-canary-model-config.cc\n  offline-ctc-fst-decoder-config.cc\n  offline-dolphin-model-config.cc\n  offline-fire-red-asr-ctc-model-config.cc\n  offline-fire-red-asr-model-config.cc\n  offline-funasr-nano-model-config.cc\n  offline-lm-config.cc\n  offline-medasr-ctc-model-config.cc\n  offline-model-config.cc\n  offline-moonshine-model-config.cc\n  offline-nemo-enc-dec-ctc-model-config.cc\n  offline-omnilingual-asr-ctc-model-config.cc\n  offline-paraformer-model-config.cc\n  offline-punctuation.cc\n  offline-recognizer.cc\n  offline-sense-voice-model-config.cc\n  offline-source-separation-model-config.cc\n  offline-source-separation-spleeter-model-config.cc\n  offline-source-separation-uvr-model-config.cc\n  offline-source-separation.cc\n  offline-speech-denoiser-dpdfnet-model-config.cc\n  offline-speech-denoiser-gtcrn-model-config.cc\n  offline-speech-denoiser-model-config.cc\n  offline-speech-denoiser.cc\n  offline-stream.cc\n  offline-tdnn-model-config.cc\n  offline-transducer-model-config.cc\n  offline-wenet-ctc-model-config.cc\n  offline-whisper-model-config.cc\n  offline-zipformer-ctc-model-config.cc\n  online-ctc-fst-decoder-config.cc\n  online-lm-config.cc\n  online-model-config.cc\n  online-nemo-ctc-model-config.cc\n  online-paraformer-model-config.cc\n  online-punctuation.cc\n  online-recognizer.cc\n  online-speech-denoiser.cc\n  online-stream.cc\n  online-t-one-ctc-model-config.cc\n  online-transducer-model-config.cc\n  online-wenet-ctc-model-config.cc\n  online-zipformer2-ctc-model-config.cc\n  provider-config.cc\n  sherpa-onnx.cc\n  silero-vad-model-config.cc\n  speaker-embedding-extractor.cc\n  speaker-embedding-manager.cc\n  spoken-language-identification.cc\n  ten-vad-model-config.cc\n  tensorrt-config.cc\n  
vad-model-config.cc\n  vad-model.cc\n  version.cc\n  voice-activity-detector.cc\n  wave-writer.cc\n)\nif(SHERPA_ONNX_HAS_ALSA)\n  list(APPEND srcs ${PROJECT_SOURCE_DIR}/sherpa-onnx/csrc/alsa.cc alsa.cc)\nelse()\n  list(APPEND srcs faked-alsa.cc)\nendif()\n\nif(SHERPA_ONNX_ENABLE_TTS)\n  list(APPEND srcs\n    offline-tts-kitten-model-config.cc\n    offline-tts-kokoro-model-config.cc\n    offline-tts-matcha-model-config.cc\n    offline-tts-model-config.cc\n    offline-tts-pocket-model-config.cc\n    offline-tts-supertonic-model-config.cc\n    offline-tts-vits-model-config.cc\n    offline-tts-zipvoice-model-config.cc\n    offline-tts.cc\n    sentence-piece-tokenizer.cc\n  )\nendif()\n\nif(SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION)\n  list(APPEND srcs\n    fast-clustering.cc\n    offline-speaker-diarization-result.cc\n    offline-speaker-diarization.cc\n  )\nendif()\n\npybind11_add_module(_sherpa_onnx ${srcs})\n\nif(APPLE)\n  # distutils was removed in Python 3.12; sysconfig is the stdlib replacement.\n  execute_process(\n    COMMAND \"${PYTHON_EXECUTABLE}\" -c \"import sysconfig; print(sysconfig.get_path('purelib'))\"\n    OUTPUT_STRIP_TRAILING_WHITESPACE\n    OUTPUT_VARIABLE PYTHON_SITE_PACKAGE_DIR\n  )\n  message(STATUS \"PYTHON_SITE_PACKAGE_DIR: ${PYTHON_SITE_PACKAGE_DIR}\")\n  if(PYTHON_SITE_PACKAGE_DIR STREQUAL \"\")\n    message(WARNING \"PYTHON_SITE_PACKAGE_DIR is empty!\")\n  else()\n    target_link_libraries(_sherpa_onnx PRIVATE \"-Wl,-rpath,${PYTHON_SITE_PACKAGE_DIR}\")\n  endif()\nendif()\n\nif(NOT WIN32)\n  target_link_libraries(_sherpa_onnx PRIVATE \"-Wl,-rpath,${SHERPA_ONNX_RPATH_ORIGIN}/sherpa_onnx/lib\")\nendif()\n\ntarget_link_libraries(_sherpa_onnx PRIVATE sherpa-onnx-core)\n\nif(SHERPA_ONNX_HAS_ALSA)\n  if(DEFINED ENV{SHERPA_ONNX_ALSA_LIB_DIR})\n    target_link_libraries(_sherpa_onnx PRIVATE -L$ENV{SHERPA_ONNX_ALSA_LIB_DIR} -lasound)\n  else()\n    target_link_libraries(_sherpa_onnx PRIVATE asound)\n  endif()\nendif()\n\ninstall(TARGETS _sherpa_onnx DESTINATION lib)\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/alsa.cc",
    "content": "// sherpa-onnx/python/csrc/alsa.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/alsa.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/alsa.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindAlsa(py::module *m) {\n  using PyClass = Alsa;\n  py::class_<PyClass>(*m, \"Alsa\")\n      .def(py::init<const char *>(), py::arg(\"device_name\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"read\",\n          [](PyClass &self, int32_t num_samples) -> std::vector<float> {\n            return self.Read(num_samples);\n          },\n          py::arg(\"num_samples\"), py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"expected_sample_rate\",\n                             &PyClass::GetExpectedSampleRate)\n      .def_property_readonly(\"actual_sample_rate\",\n                             &PyClass::GetActualSampleRate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/alsa.h",
    "content": "// sherpa-onnx/python/csrc/alsa.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ALSA_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ALSA_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindAlsa(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ALSA_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/audio-tagging.cc",
    "content": "// sherpa-onnx/python/csrc/audio-tagging.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/audio-tagging.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/audio-tagging.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflineZipformerAudioTaggingModelConfig(py::module *m) {\n  using PyClass = OfflineZipformerAudioTaggingModelConfig;\n  py::class_<PyClass>(*m, \"OfflineZipformerAudioTaggingModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindAudioTaggingModelConfig(py::module *m) {\n  PybindOfflineZipformerAudioTaggingModelConfig(m);\n\n  using PyClass = AudioTaggingModelConfig;\n\n  py::class_<PyClass>(*m, \"AudioTaggingModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineZipformerAudioTaggingModelConfig &,\n                    const std::string &, int32_t, bool, const std::string &>(),\n           py::arg(\"zipformer\") = OfflineZipformerAudioTaggingModelConfig{},\n           py::arg(\"ced\") = \"\", py::arg(\"num_threads\") = 1,\n           py::arg(\"debug\") = false, py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"zipformer\", &PyClass::zipformer)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindAudioTaggingConfig(py::module *m) {\n  PybindAudioTaggingModelConfig(m);\n\n  using PyClass = AudioTaggingConfig;\n\n  py::class_<PyClass>(*m, \"AudioTaggingConfig\")\n      .def(py::init<>())\n      .def(py::init<const AudioTaggingModelConfig &, const std::string &,\n                    int32_t>(),\n           py::arg(\"model\"), 
py::arg(\"labels\"), py::arg(\"top_k\") = 5)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"labels\", &PyClass::labels)\n      .def_readwrite(\"top_k\", &PyClass::top_k)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindAudioEvent(py::module *m) {\n  using PyClass = AudioEvent;\n\n  py::class_<PyClass>(*m, \"AudioEvent\")\n      .def_property_readonly(\n          \"name\", [](const PyClass &self) -> std::string { return self.name; })\n      .def_property_readonly(\n          \"index\", [](const PyClass &self) -> int32_t { return self.index; })\n      .def_property_readonly(\n          \"prob\", [](const PyClass &self) -> float { return self.prob; })\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindAudioTagging(py::module *m) {\n  PybindAudioTaggingConfig(m);\n  PybindAudioEvent(m);\n\n  using PyClass = AudioTagging;\n\n  py::class_<PyClass>(*m, \"AudioTagging\")\n      .def(py::init<const AudioTaggingConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"create_stream\", &PyClass::CreateStream,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"compute\", &PyClass::Compute, py::arg(\"s\"), py::arg(\"top_k\") = -1,\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/audio-tagging.h",
    "content": "// sherpa-onnx/python/csrc/audio-tagging.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_AUDIO_TAGGING_H_\n#define SHERPA_ONNX_PYTHON_CSRC_AUDIO_TAGGING_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindAudioTagging(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_AUDIO_TAGGING_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/circular-buffer.cc",
    "content": "// sherpa-onnx/python/csrc/circular-buffer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/circular-buffer.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/circular-buffer.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindCircularBuffer(py::module *m) {\n  using PyClass = CircularBuffer;\n  py::class_<PyClass>(*m, \"CircularBuffer\")\n      .def(py::init<int32_t>(), py::arg(\"capacity\"))\n      .def(\n          \"push\",\n          [](PyClass &self, const std::vector<float> &samples) {\n            self.Push(samples.data(), samples.size());\n          },\n          py::arg(\"samples\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"get\", &PyClass::Get, py::arg(\"start_index\"), py::arg(\"n\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"pop\", &PyClass::Pop, py::arg(\"n\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"reset\", &PyClass::Reset, py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"size\", &PyClass::Size)\n      .def_property_readonly(\"head\", &PyClass::Head)\n      .def_property_readonly(\"tail\", &PyClass::Tail);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/circular-buffer.h",
    "content": "// sherpa-onnx/python/csrc/circular-buffer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_CIRCULAR_BUFFER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_CIRCULAR_BUFFER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindCircularBuffer(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_CIRCULAR_BUFFER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/cuda-config.cc",
    "content": "// sherpa-onnx/python/csrc/cuda-config.cc\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n#include \"sherpa-onnx/python/csrc/cuda-config.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/provider-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindCudaConfig(py::module *m) {\n  using PyClass = CudaConfig;\n  py::class_<PyClass>(*m, \"CudaConfig\")\n      .def(py::init<>())\n      .def(py::init<int32_t>(),\n           py::arg(\"cudnn_conv_algo_search\") = 1)\n      .def_readwrite(\"cudnn_conv_algo_search\", &PyClass::cudnn_conv_algo_search)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/cuda-config.h",
    "content": "// sherpa-onnx/python/csrc/cuda-config.h\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_CUDA_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_CUDA_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindCudaConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_CUDA_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/display.cc",
    "content": "// sherpa-onnx/python/csrc/display.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/display.h\"\n\n#include \"sherpa-onnx/csrc/display.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindDisplay(py::module *m) {\n  using PyClass = Display;\n  py::class_<PyClass>(*m, \"Display\")\n      .def(py::init<int32_t>(), py::arg(\"max_word_per_line\") = 60)\n      .def(\"print\", &PyClass::Print, py::arg(\"idx\"), py::arg(\"s\"));\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/display.h",
    "content": "// sherpa-onnx/python/csrc/display.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_DISPLAY_H_\n#define SHERPA_ONNX_PYTHON_CSRC_DISPLAY_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindDisplay(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_DISPLAY_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/endpoint.cc",
    "content": "// sherpa-onnx/python/csrc/endpoint.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/endpoint.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/endpoint.h\"\n\nnamespace sherpa_onnx {\n\nstatic constexpr const char *kEndpointRuleInitDoc = R\"doc(\nConstructor for EndpointRule.\n\nArgs:\n  must_contain_nonsilence:\n    If True, for this endpointing rule to apply there must be nonsilence in the\n    best-path traceback. For decoding, a non-blank token is considered as\n    non-silence.\n  min_trailing_silence:\n    This endpointing rule requires duration of trailing silence (in seconds)\n    to be ``>=`` this value.\n  min_utterance_length:\n    This endpointing rule requires utterance-length (in seconds) to\n    be ``>=`` this value.\n)doc\";\n\nstatic constexpr const char *kEndpointConfigInitDoc = R\"doc(\nAn endpoint is detected as soon as any rule in EndpointConfig is activated.\n\nArgs:\n  rule1:\n    By default, it times out after 2.4 seconds of silence, even if\n    we decoded nothing.\n  rule2:\n    By default, it times out after 1.2 seconds of silence after decoding\n    something.\n  rule3:\n    By default, it times out after the utterance is 20 seconds long, regardless of\n    anything else.\n)doc\";\n\nstatic void PybindEndpointRule(py::module *m) {\n  using PyClass = EndpointRule;\n  py::class_<PyClass>(*m, \"EndpointRule\")\n      .def(py::init<bool, float, float>(), py::arg(\"must_contain_nonsilence\"),\n           py::arg(\"min_trailing_silence\"), py::arg(\"min_utterance_length\"),\n           kEndpointRuleInitDoc)\n      .def(\"__str__\", &PyClass::ToString)\n      .def_readwrite(\"must_contain_nonsilence\",\n                     &PyClass::must_contain_nonsilence)\n      .def_readwrite(\"min_trailing_silence\", &PyClass::min_trailing_silence)\n      .def_readwrite(\"min_utterance_length\", &PyClass::min_utterance_length);\n}\n\nstatic void 
PybindEndpointConfig(py::module *m) {\n  using PyClass = EndpointConfig;\n  py::class_<PyClass>(*m, \"EndpointConfig\")\n      .def(\n          py::init(\n              [](float rule1_min_trailing_silence,\n                 float rule2_min_trailing_silence,\n                 float rule3_min_utterance_length) -> std::unique_ptr<PyClass> {\n                EndpointRule rule1(false, rule1_min_trailing_silence, 0);\n                EndpointRule rule2(true, rule2_min_trailing_silence, 0);\n                EndpointRule rule3(false, 0, rule3_min_utterance_length);\n\n                return std::make_unique<EndpointConfig>(rule1, rule2, rule3);\n              }),\n          py::arg(\"rule1_min_trailing_silence\"),\n          py::arg(\"rule2_min_trailing_silence\"),\n          py::arg(\"rule3_min_utterance_length\"))\n      .def(py::init([](const EndpointRule &rule1, const EndpointRule &rule2,\n                       const EndpointRule &rule3) -> std::unique_ptr<PyClass> {\n             auto ans = std::make_unique<PyClass>();\n             ans->rule1 = rule1;\n             ans->rule2 = rule2;\n             ans->rule3 = rule3;\n             return ans;\n           }),\n           py::arg(\"rule1\") = EndpointRule(false, 2.4, 0),\n           py::arg(\"rule2\") = EndpointRule(true, 1.2, 0),\n           py::arg(\"rule3\") = EndpointRule(false, 0, 20),\n           kEndpointConfigInitDoc)\n      .def(\"__str__\",\n           [](const PyClass &self) -> std::string { return self.ToString(); })\n      .def_readwrite(\"rule1\", &PyClass::rule1)\n      .def_readwrite(\"rule2\", &PyClass::rule2)\n      .def_readwrite(\"rule3\", &PyClass::rule3);\n}\n\nvoid PybindEndpoint(py::module *m) {\n  PybindEndpointRule(m);\n  PybindEndpointConfig(m);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/endpoint.h",
    "content": "// sherpa-onnx/python/csrc/endpoint.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ENDPOINT_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ENDPOINT_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindEndpoint(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ENDPOINT_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/faked-alsa.cc",
    "content": "// sherpa-onnx/python/csrc/faked-alsa.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include <cstdlib>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/macros.h\"\n#include \"sherpa-onnx/python/csrc/alsa.h\"\n\nnamespace sherpa_onnx {\n\nclass FakedAlsa {\n public:\n  explicit FakedAlsa(const char *) {\n    SHERPA_ONNX_LOGE(\"This function is for Linux only.\");\n#if (SHERPA_ONNX_ENABLE_ALSA == 0) && (defined(__unix__) || defined(__unix))\n    SHERPA_ONNX_LOGE(R\"doc(\nsherpa-onnx is compiled without alsa support. To enable that, please run\n  (1) sudo apt-get install alsa-utils libasound2-dev\n  (2) rebuild sherpa-onnx\n)doc\");\n#endif\n    exit(-1);\n  }\n\n  std::vector<float> Read(int32_t) const { return {}; }\n  int32_t GetExpectedSampleRate() const { return -1; }\n  int32_t GetActualSampleRate() const { return -1; }\n};\n\nvoid PybindAlsa(py::module *m) {\n  using PyClass = FakedAlsa;\n  py::class_<PyClass>(*m, \"Alsa\")\n      .def(py::init<const char *>(), py::arg(\"device_name\"))\n      .def(\n          \"read\",\n          [](PyClass &self, int32_t num_samples) -> std::vector<float> {\n            return self.Read(num_samples);\n          },\n          py::arg(\"num_samples\"), py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"expected_sample_rate\",\n                             &PyClass::GetExpectedSampleRate)\n      .def_property_readonly(\"actual_sample_rate\",\n                             &PyClass::GetActualSampleRate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/fast-clustering.cc",
    "content": "// sherpa-onnx/python/csrc/fast-clustering.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/fast-clustering.h\"\n\n#include <sstream>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/fast-clustering.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindFastClusteringConfig(py::module *m) {\n  using PyClass = FastClusteringConfig;\n  py::class_<PyClass>(*m, \"FastClusteringConfig\")\n      .def(py::init<int32_t, float>(), py::arg(\"num_clusters\") = -1,\n           py::arg(\"threshold\") = 0.5)\n      .def_readwrite(\"num_clusters\", &PyClass::num_clusters)\n      .def_readwrite(\"threshold\", &PyClass::threshold)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\nvoid PybindFastClustering(py::module *m) {\n  PybindFastClusteringConfig(m);\n\n  using PyClass = FastClustering;\n  py::class_<PyClass>(*m, \"FastClustering\")\n      .def(py::init<const FastClusteringConfig &>(), py::arg(\"config\"))\n      .def(\n          \"__call__\",\n          [](const PyClass &self,\n             py::array_t<float> features) -> std::vector<int32_t> {\n            if (!(features.flags() & py::array::c_style)) {\n              throw py::value_error(\n                  \"input features should be contiguous. Please use \"\n                  \"np.ascontiguousarray(features)\");\n            }\n\n            int num_dim = features.ndim();\n            if (num_dim != 2) {\n              std::ostringstream os;\n              os << \"Expect an array of 2 dimensions. 
Given dim: \" << num_dim\n                 << \"\\n\";\n              throw py::value_error(os.str());\n            }\n\n            int32_t num_rows = features.shape(0);\n            int32_t num_cols = features.shape(1);\n            float *p = features.mutable_data();\n            py::gil_scoped_release release;\n            return self.Cluster(p, num_rows, num_cols);\n          },\n          py::arg(\"features\"));\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/fast-clustering.h",
    "content": "// sherpa-onnx/python/csrc/fast-clustering.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_FAST_CLUSTERING_H_\n#define SHERPA_ONNX_PYTHON_CSRC_FAST_CLUSTERING_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindFastClustering(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_FAST_CLUSTERING_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/features.cc",
    "content": "// sherpa-onnx/python/csrc/features.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/features.h\"\n\n#include \"sherpa-onnx/csrc/features.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindFeatureExtractorConfig(py::module *m) {\n  using PyClass = FeatureExtractorConfig;\n  py::class_<PyClass>(*m, \"FeatureExtractorConfig\")\n      .def(py::init<int32_t, int32_t, float, float, float, bool, bool>(),\n           py::arg(\"sampling_rate\") = 16000,\n           py::arg(\"feature_dim\") = 80,\n           py::arg(\"low_freq\") = 20.0f,\n           py::arg(\"high_freq\") = -400.0f,\n           py::arg(\"dither\") = 0.0f,\n           py::arg(\"normalize_samples\") = true,\n           py::arg(\"snip_edges\") = false)\n      .def_readwrite(\"sampling_rate\", &PyClass::sampling_rate)\n      .def_readwrite(\"feature_dim\", &PyClass::feature_dim)\n      .def_readwrite(\"low_freq\", &PyClass::low_freq)\n      .def_readwrite(\"high_freq\", &PyClass::high_freq)\n      .def_readwrite(\"dither\", &PyClass::dither)\n      .def_readwrite(\"normalize_samples\", &PyClass::normalize_samples)\n      .def_readwrite(\"snip_edges\", &PyClass::snip_edges)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindFeatures(py::module *m) { PybindFeatureExtractorConfig(m); }\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/features.h",
    "content": "// sherpa-onnx/python/csrc/features.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_FEATURES_H_\n#define SHERPA_ONNX_PYTHON_CSRC_FEATURES_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindFeatures(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_FEATURES_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/homophone-replacer.cc",
    "content": "// sherpa-onnx/python/csrc/homophone-replacer.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/homophone-replacer.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/homophone-replacer.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindHomophoneReplacer(py::module *m) {\n  using PyClass = HomophoneReplacerConfig;\n  py::class_<PyClass>(*m, \"HomophoneReplacerConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, bool>(),\n           py::arg(\"dict_dir\") = \"\", py::arg(\"lexicon\") = \"\",\n           py::arg(\"rule_fsts\") = \"\", py::arg(\"debug\") = false)\n      .def_readwrite(\"dict_dir\", &PyClass::dict_dir)\n      .def_readwrite(\"lexicon\", &PyClass::lexicon)\n      .def_readwrite(\"rule_fsts\", &PyClass::rule_fsts)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/homophone-replacer.h",
    "content": "// sherpa-onnx/python/csrc/homophone-replacer.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_HOMOPHONE_REPLACER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_HOMOPHONE_REPLACER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindHomophoneReplacer(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_HOMOPHONE_REPLACER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/keyword-spotter.cc",
    "content": "// sherpa-onnx/python/csrc/keyword-spotter.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/keyword-spotter.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/keyword-spotter.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindKeywordResult(py::module *m) {\n  using PyClass = KeywordResult;\n  py::class_<PyClass>(*m, \"KeywordResult\")\n      .def_property_readonly(\n          \"keyword\",\n          [](PyClass &self) -> py::str {\n            return py::str(PyUnicode_DecodeUTF8(self.keyword.c_str(),\n                                                self.keyword.size(), \"ignore\"));\n          })\n      .def_property_readonly(\n          \"tokens\",\n          [](PyClass &self) -> std::vector<std::string> { return self.tokens; })\n      .def_property_readonly(\n          \"timestamps\",\n          [](PyClass &self) -> std::vector<float> { return self.timestamps; });\n}\n\nstatic void PybindKeywordSpotterConfig(py::module *m) {\n  using PyClass = KeywordSpotterConfig;\n  py::class_<PyClass>(*m, \"KeywordSpotterConfig\")\n      .def(py::init<const FeatureExtractorConfig &, const OnlineModelConfig &,\n                    int32_t, int32_t, float, float, const std::string &>(),\n           py::arg(\"feat_config\"), py::arg(\"model_config\"),\n           py::arg(\"max_active_paths\") = 4, py::arg(\"num_trailing_blanks\") = 1,\n           py::arg(\"keywords_score\") = 1.0,\n           py::arg(\"keywords_threshold\") = 0.25, py::arg(\"keywords_file\") = \"\")\n      .def_readwrite(\"feat_config\", &PyClass::feat_config)\n      .def_readwrite(\"model_config\", &PyClass::model_config)\n      .def_readwrite(\"max_active_paths\", &PyClass::max_active_paths)\n      .def_readwrite(\"num_trailing_blanks\", &PyClass::num_trailing_blanks)\n      .def_readwrite(\"keywords_score\", &PyClass::keywords_score)\n      .def_readwrite(\"keywords_threshold\", &PyClass::keywords_threshold)\n      
.def_readwrite(\"keywords_file\", &PyClass::keywords_file)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindKeywordSpotter(py::module *m) {\n  PybindKeywordResult(m);\n  PybindKeywordSpotterConfig(m);\n\n  using PyClass = KeywordSpotter;\n  py::class_<PyClass>(*m, \"KeywordSpotter\")\n      .def(py::init<const KeywordSpotterConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](const PyClass &self) { return self.CreateStream(); },\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](PyClass &self, const std::string &keywords) {\n            return self.CreateStream(keywords);\n          },\n          py::arg(\"keywords\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"is_ready\", &PyClass::IsReady,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"reset\", &PyClass::Reset, py::call_guard<py::gil_scoped_release>())\n      .def(\"decode_stream\", &PyClass::DecodeStream,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"decode_streams\",\n          [](PyClass &self, std::vector<OnlineStream *> ss) {\n            self.DecodeStreams(ss.data(), ss.size());\n          },\n          py::call_guard<py::gil_scoped_release>())\n      .def(\"get_result\", &PyClass::GetResult,\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/keyword-spotter.h",
    "content": "// sherpa-onnx/python/csrc/keyword-spotter.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_KEYWORD_SPOTTER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_KEYWORD_SPOTTER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindKeywordSpotter(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_KEYWORD_SPOTTER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-canary-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-canary-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-canary-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-canary-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineCanaryModelConfig(py::module *m) {\n  using PyClass = OfflineCanaryModelConfig;\n  py::class_<PyClass>(*m, \"OfflineCanaryModelConfig\")\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &, bool>(),\n           py::arg(\"encoder\") = \"\", py::arg(\"decoder\") = \"\",\n           py::arg(\"src_lang\") = \"\", py::arg(\"tgt_lang\") = \"\",\n           py::arg(\"use_pnc\") = true)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"src_lang\", &PyClass::src_lang)\n      .def_readwrite(\"tgt_lang\", &PyClass::tgt_lang)\n      .def_readwrite(\"use_pnc\", &PyClass::use_pnc)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-canary-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-canary-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineCanaryModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CANARY_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-ctc-fst-decoder-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineCtcFstDecoderConfig(py::module *m) {\n  using PyClass = OfflineCtcFstDecoderConfig;\n  py::class_<PyClass>(*m, \"OfflineCtcFstDecoderConfig\")\n      .def(py::init<const std::string &, int32_t>(), py::arg(\"graph\") = \"\",\n           py::arg(\"max_active\") = 3000)\n      .def_readwrite(\"graph\", &PyClass::graph)\n      .def_readwrite(\"max_active\", &PyClass::max_active)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineCtcFstDecoderConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_CTC_FST_DECODER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-dolphin-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-dolphin-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-dolphin-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-dolphin-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineDolphinModelConfig(py::module *m) {\n  using PyClass = OfflineDolphinModelConfig;\n  py::class_<PyClass>(*m, \"OfflineDolphinModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-dolphin-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-dolphin-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineDolphinModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_DOLPHIN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFireRedAsrCtcModelConfig(py::module *m) {\n  using PyClass = OfflineFireRedAsrCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineFireRedAsrCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFireRedAsrCtcModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-fire-red-asr-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFireRedAsrModelConfig(py::module *m) {\n  using PyClass = OfflineFireRedAsrModelConfig;\n  py::class_<PyClass>(*m, \"OfflineFireRedAsrModelConfig\")\n      .def(py::init<const std::string &, const std::string &>(),\n           py::arg(\"encoder\"), py::arg(\"decoder\"))\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFireRedAsrModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FIRE_RED_ASR_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-funasr-nano-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-funasr-nano-model-config.cc\n//\n// Copyright (c)  2025  zengyw\n\n#include \"sherpa-onnx/csrc/offline-funasr-nano-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/python/csrc/offline-funasr-nano-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFunASRNanoModelConfig(py::module *m) {\n  using PyClass = OfflineFunASRNanoModelConfig;\n  py::class_<PyClass>(*m, \"OfflineFunASRNanoModelConfig\")\n      .def(py::init<>())\n      .def_readwrite(\"encoder_adaptor\", &PyClass::encoder_adaptor)\n      .def_readwrite(\"llm\", &PyClass::llm)\n      .def_readwrite(\"embedding\", &PyClass::embedding)\n      .def_readwrite(\"tokenizer\", &PyClass::tokenizer)\n      .def_readwrite(\"system_prompt\", &PyClass::system_prompt)\n      .def_readwrite(\"user_prompt\", &PyClass::user_prompt)\n      .def_readwrite(\"max_new_tokens\", &PyClass::max_new_tokens)\n      .def_readwrite(\"temperature\", &PyClass::temperature)\n      .def_readwrite(\"top_p\", &PyClass::top_p)\n      .def_readwrite(\"seed\", &PyClass::seed)\n      .def_readwrite(\"language\", &PyClass::language)\n      .def_readwrite(\"itn\", &PyClass::itn)\n      .def_readwrite(\"hotwords\", &PyClass::hotwords)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-funasr-nano-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-funasr-nano-model-config.h\n//\n// Copyright (c)  2025  zengyw\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineFunASRNanoModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_FUNASR_NANO_MODEL_CONFIG_H_\n\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-lm-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-lm-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-lm-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-lm-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineLMConfig(py::module *m) {\n  using PyClass = OfflineLMConfig;\n  py::class_<PyClass>(*m, \"OfflineLMConfig\")\n      .def(py::init<const std::string &, float, int32_t, const std::string &,\n                    const std::string &, float, int32_t>(),\n           py::arg(\"model\"), py::arg(\"scale\") = 0.5f,\n           py::arg(\"lm_num_threads\") = 1, py::arg(\"lm_provider\") = \"cpu\",\n           py::arg(\"lodr_fst\") = \"\", py::arg(\"lodr_scale\") = 0.0f,\n           py::arg(\"lodr_backoff_id\") = -1)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"scale\", &PyClass::scale)\n      .def_readwrite(\"lm_provider\", &PyClass::lm_provider)\n      .def_readwrite(\"lm_num_threads\", &PyClass::lm_num_threads)\n      .def_readwrite(\"lodr_fst\", &PyClass::lodr_fst)\n      .def_readwrite(\"lodr_scale\", &PyClass::lodr_scale)\n      .def_readwrite(\"lodr_backoff_id\", &PyClass::lodr_backoff_id)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-lm-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-lm-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_LM_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_LM_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineLMConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_LM_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-medasr-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineMedAsrCtcModelConfig(py::module *m) {\n  using PyClass = OfflineMedAsrCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineMedAsrCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineMedAsrCtcModelConfig(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MEDASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-model-config.cc\n//\n// Copyright (c)  2023 by manyeyes\n\n#include \"sherpa-onnx/python/csrc/offline-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-canary-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-dolphin-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-fire-red-asr-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-fire-red-asr-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-funasr-nano-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-medasr-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-moonshine-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-paraformer-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-sense-voice-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tdnn-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-transducer-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-whisper-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineModelConfig(py::module *m) {\n  PybindOfflineTransducerModelConfig(m);\n  PybindOfflineParaformerModelConfig(m);\n  PybindOfflineNemoEncDecCtcModelConfig(m);\n  PybindOfflineWhisperModelConfig(m);\n  PybindOfflineFireRedAsrModelConfig(m);\n  PybindOfflineTdnnModelConfig(m);\n  PybindOfflineZipformerCtcModelConfig(m);\n  PybindOfflineWenetCtcModelConfig(m);\n  PybindOfflineSenseVoiceModelConfig(m);\n  PybindOfflineMoonshineModelConfig(m);\n  PybindOfflineDolphinModelConfig(m);\n  PybindOfflineCanaryModelConfig(m);\n  
PybindOfflineOmnilingualAsrCtcModelConfig(m);\n  PybindOfflineFunASRNanoModelConfig(m);\n  PybindOfflineMedAsrCtcModelConfig(m);\n  PybindOfflineFireRedAsrCtcModelConfig(m);\n\n  using PyClass = OfflineModelConfig;\n  py::class_<PyClass>(*m, \"OfflineModelConfig\")\n      .def(py::init<const OfflineTransducerModelConfig &,\n                    const OfflineParaformerModelConfig &,\n                    const OfflineNemoEncDecCtcModelConfig &,\n                    const OfflineWhisperModelConfig &,\n                    const OfflineFireRedAsrModelConfig &,\n                    const OfflineTdnnModelConfig &,\n                    const OfflineZipformerCtcModelConfig &,\n                    const OfflineWenetCtcModelConfig &,\n                    const OfflineSenseVoiceModelConfig &,\n                    const OfflineMoonshineModelConfig &,\n                    const OfflineDolphinModelConfig &,\n                    const OfflineCanaryModelConfig &,\n                    const OfflineOmnilingualAsrCtcModelConfig &,\n                    const OfflineFunASRNanoModelConfig &,\n                    const OfflineMedAsrCtcModelConfig &,\n                    const OfflineFireRedAsrCtcModelConfig &,\n                    const std::string &, const std::string &, int32_t, bool,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &>(),\n           py::arg(\"transducer\") = OfflineTransducerModelConfig(),\n           py::arg(\"paraformer\") = OfflineParaformerModelConfig(),\n           py::arg(\"nemo_ctc\") = OfflineNemoEncDecCtcModelConfig(),\n           py::arg(\"whisper\") = OfflineWhisperModelConfig(),\n           py::arg(\"fire_red_asr\") = OfflineFireRedAsrModelConfig(),\n           py::arg(\"tdnn\") = OfflineTdnnModelConfig(),\n           py::arg(\"zipformer_ctc\") = OfflineZipformerCtcModelConfig(),\n           py::arg(\"wenet_ctc\") = OfflineWenetCtcModelConfig(),\n           py::arg(\"sense_voice\") = 
OfflineSenseVoiceModelConfig(),\n           py::arg(\"moonshine\") = OfflineMoonshineModelConfig(),\n           py::arg(\"dolphin\") = OfflineDolphinModelConfig(),\n           py::arg(\"canary\") = OfflineCanaryModelConfig(),\n           py::arg(\"omnilingual\") = OfflineOmnilingualAsrCtcModelConfig(),\n           py::arg(\"funasr_nano\") = OfflineFunASRNanoModelConfig(),\n           py::arg(\"medasr\") = OfflineMedAsrCtcModelConfig(),\n           py::arg(\"fire_red_asr_ctc\") = OfflineFireRedAsrCtcModelConfig(),\n           py::arg(\"telespeech_ctc\") = \"\", py::arg(\"tokens\") = \"\",\n           py::arg(\"num_threads\") = 1, py::arg(\"debug\") = false,\n           py::arg(\"provider\") = \"cpu\", py::arg(\"model_type\") = \"\",\n           py::arg(\"modeling_unit\") = \"cjkchar\", py::arg(\"bpe_vocab\") = \"\")\n      .def_readwrite(\"transducer\", &PyClass::transducer)\n      .def_readwrite(\"paraformer\", &PyClass::paraformer)\n      .def_readwrite(\"nemo_ctc\", &PyClass::nemo_ctc)\n      .def_readwrite(\"whisper\", &PyClass::whisper)\n      .def_readwrite(\"fire_red_asr\", &PyClass::fire_red_asr)\n      .def_readwrite(\"tdnn\", &PyClass::tdnn)\n      .def_readwrite(\"zipformer_ctc\", &PyClass::zipformer_ctc)\n      .def_readwrite(\"wenet_ctc\", &PyClass::wenet_ctc)\n      .def_readwrite(\"sense_voice\", &PyClass::sense_voice)\n      .def_readwrite(\"moonshine\", &PyClass::moonshine)\n      .def_readwrite(\"dolphin\", &PyClass::dolphin)\n      .def_readwrite(\"canary\", &PyClass::canary)\n      .def_readwrite(\"omnilingual\", &PyClass::omnilingual)\n      .def_readwrite(\"funasr_nano\", &PyClass::funasr_nano)\n      .def_readwrite(\"medasr\", &PyClass::medasr)\n      .def_readwrite(\"fire_red_asr_ctc\", &PyClass::fire_red_asr_ctc)\n      .def_readwrite(\"telespeech_ctc\", &PyClass::telespeech_ctc)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", 
&PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def_readwrite(\"model_type\", &PyClass::model_type)\n      .def_readwrite(\"modeling_unit\", &PyClass::modeling_unit)\n      .def_readwrite(\"bpe_vocab\", &PyClass::bpe_vocab)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-model-config.h\n//\n// Copyright (c)  2023 by manyeyes\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-moonshine-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-moonshine-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-moonshine-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-moonshine-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineMoonshineModelConfig(py::module *m) {\n  using PyClass = OfflineMoonshineModelConfig;\n  py::class_<PyClass>(*m, \"OfflineMoonshineModelConfig\")\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &>(),\n           py::arg(\"preprocessor\") = \"\", py::arg(\"encoder\") = \"\",\n           py::arg(\"uncached_decoder\") = \"\", py::arg(\"cached_decoder\") = \"\",\n           py::arg(\"merged_decoder\") = \"\")\n      .def_readwrite(\"preprocessor\", &PyClass::preprocessor)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"uncached_decoder\", &PyClass::uncached_decoder)\n      .def_readwrite(\"cached_decoder\", &PyClass::cached_decoder)\n      .def_readwrite(\"merged_decoder\", &PyClass::merged_decoder)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-moonshine-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-moonshine-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineMoonshineModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_MOONSHINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-nemo-enc-dec-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineNemoEncDecCtcModelConfig(py::module *m) {\n  using PyClass = OfflineNemoEncDecCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineNemoEncDecCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-nemo-enc-dec-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineNemoEncDecCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_NEMO_ENC_DEC_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-omnilingual-asr-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineOmnilingualAsrCtcModelConfig(py::module *m) {\n  using PyClass = OfflineOmnilingualAsrCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineOmnilingualAsrCtcModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-omnilingual-asr-ctc-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineOmnilingualAsrCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_OMNILINGUAL_ASR_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-paraformer-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-paraformer-model-config.cc\n//\n// Copyright (c)  2023 by manyeyes\n\n#include \"sherpa-onnx/python/csrc/offline-paraformer-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-paraformer-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineParaformerModelConfig(py::module *m) {\n  using PyClass = OfflineParaformerModelConfig;\n  py::class_<PyClass>(*m, \"OfflineParaformerModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-paraformer-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-paraformer-model-config.h\n//\n// Copyright (c)  2023 by manyeyes\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineParaformerModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PARAFORMER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-punctuation.cc",
    "content": "// sherpa-onnx/python/csrc/offline-punctuation.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-punctuation.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-punctuation.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflinePunctuationModelConfig(py::module *m) {\n  using PyClass = OfflinePunctuationModelConfig;\n  py::class_<PyClass>(*m, \"OfflinePunctuationModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, int32_t, bool, const std::string &>(),\n           py::arg(\"ct_transformer\"), py::arg(\"num_threads\") = 1,\n           py::arg(\"debug\") = false, py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"ct_transformer\", &PyClass::ct_transformer)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindOfflinePunctuationConfig(py::module *m) {\n  PybindOfflinePunctuationModelConfig(m);\n  using PyClass = OfflinePunctuationConfig;\n\n  py::class_<PyClass>(*m, \"OfflinePunctuationConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflinePunctuationModelConfig &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOfflinePunctuation(py::module *m) {\n  PybindOfflinePunctuationConfig(m);\n  using PyClass = OfflinePunctuation;\n\n  py::class_<PyClass>(*m, \"OfflinePunctuation\")\n      .def(py::init<const OfflinePunctuationConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"add_punctuation\", &PyClass::AddPunctuation, py::arg(\"text\"),\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-punctuation.h",
    "content": "// sherpa-onnx/python/csrc/offline-punctuation.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PUNCTUATION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PUNCTUATION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflinePunctuation(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_PUNCTUATION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-recognizer.cc",
    "content": "// sherpa-onnx/python/csrc/offline-recognizer.cc\n//\n// Copyright (c)  2023 by manyeyes\n\n#include \"sherpa-onnx/python/csrc/offline-recognizer.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-recognizer.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflineRecognizerConfig(py::module *m) {\n  using PyClass = OfflineRecognizerConfig;\n  py::class_<PyClass>(*m, \"OfflineRecognizerConfig\")\n      .def(py::init<const FeatureExtractorConfig &, const OfflineModelConfig &,\n                    const OfflineLMConfig &, const OfflineCtcFstDecoderConfig &,\n                    const std::string &, int32_t, const std::string &, float,\n                    float, const std::string &, const std::string &,\n                    const HomophoneReplacerConfig &>(),\n           py::arg(\"feat_config\") = FeatureExtractorConfig(),\n           py::arg(\"model_config\") = OfflineModelConfig(),\n           py::arg(\"lm_config\") = OfflineLMConfig(),\n           py::arg(\"ctc_fst_decoder_config\") = OfflineCtcFstDecoderConfig(),\n           py::arg(\"decoding_method\") = \"greedy_search\",\n           py::arg(\"max_active_paths\") = 4, py::arg(\"hotwords_file\") = \"\",\n           py::arg(\"hotwords_score\") = 1.5, py::arg(\"blank_penalty\") = 0.0,\n           py::arg(\"rule_fsts\") = \"\", py::arg(\"rule_fars\") = \"\",\n           py::arg(\"hr\") = HomophoneReplacerConfig{})\n      .def_readwrite(\"feat_config\", &PyClass::feat_config)\n      .def_readwrite(\"model_config\", &PyClass::model_config)\n      .def_readwrite(\"lm_config\", &PyClass::lm_config)\n      .def_readwrite(\"ctc_fst_decoder_config\", &PyClass::ctc_fst_decoder_config)\n      .def_readwrite(\"decoding_method\", &PyClass::decoding_method)\n      .def_readwrite(\"max_active_paths\", &PyClass::max_active_paths)\n      .def_readwrite(\"hotwords_file\", &PyClass::hotwords_file)\n      .def_readwrite(\"hotwords_score\", &PyClass::hotwords_score)\n      
.def_readwrite(\"blank_penalty\", &PyClass::blank_penalty)\n      .def_readwrite(\"rule_fsts\", &PyClass::rule_fsts)\n      .def_readwrite(\"rule_fars\", &PyClass::rule_fars)\n      .def_readwrite(\"hr\", &PyClass::hr)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOfflineRecognizer(py::module *m) {\n  PybindOfflineRecognizerConfig(m);\n\n  using PyClass = OfflineRecognizer;\n  py::class_<PyClass>(*m, \"OfflineRecognizer\")\n      .def(py::init<const OfflineRecognizerConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](const PyClass &self) { return self.CreateStream(); },\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](PyClass &self, const std::string &hotwords) {\n            return self.CreateStream(hotwords);\n          },\n          py::arg(\"hotwords\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"decode_stream\", &PyClass::DecodeStream, py::arg(\"s\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"set_config\", &PyClass::SetConfig, py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"decode_streams\",\n          [](const PyClass &self, std::vector<OfflineStream *> ss) {\n            self.DecodeStreams(ss.data(), ss.size());\n          },\n          py::arg(\"ss\"), py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-recognizer.h",
    "content": "// sherpa-onnx/python/csrc/offline-recognizer.h\n//\n// Copyright (c)  2023 by manyeyes\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_RECOGNIZER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_RECOGNIZER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineRecognizer(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_RECOGNIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-sense-voice-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-sense-voice-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-sense-voice-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-sense-voice-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSenseVoiceModelConfig(py::module *m) {\n  using PyClass = OfflineSenseVoiceModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSenseVoiceModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &, bool>(),\n           py::arg(\"model\"), py::arg(\"language\"), py::arg(\"use_itn\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"language\", &PyClass::language)\n      .def_readwrite(\"use_itn\", &PyClass::use_itn)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-sense-voice-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-sense-voice-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSenseVoiceModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SENSE_VOICE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-source-separation-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationModelConfig(py::module *m) {\n  PybindOfflineSourceSeparationSpleeterModelConfig(m);\n  PybindOfflineSourceSeparationUvrModelConfig(m);\n\n  using PyClass = OfflineSourceSeparationModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparationModelConfig\")\n      .def(py::init<const OfflineSourceSeparationSpleeterModelConfig &,\n                    const OfflineSourceSeparationUvrModelConfig &, int32_t,\n                    bool, const std::string &>(),\n           py::arg(\"spleeter\") = OfflineSourceSeparationSpleeterModelConfig{},\n           py::arg(\"uvr\") = OfflineSourceSeparationUvrModelConfig{},\n           py::arg(\"num_threads\") = 1, py::arg(\"debug\") = false,\n           py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"spleeter\", &PyClass::spleeter)\n      .def_readwrite(\"uvr\", &PyClass::uvr)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-spleeter-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationSpleeterModelConfig(py::module *m) {\n  using PyClass = OfflineSourceSeparationSpleeterModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparationSpleeterModelConfig\")\n      .def(py::init<const std::string &, const std::string &>(),\n           py::arg(\"vocals\") = \"\", py::arg(\"accompaniment\") = \"\")\n      .def_readwrite(\"vocals\", &PyClass::vocals)\n      .def_readwrite(\"accompaniment\", &PyClass::accompaniment)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-spleeter-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationSpleeterModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_SPLEETER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-source-separation-uvr-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationUvrModelConfig(py::module *m) {\n  using PyClass = OfflineSourceSeparationUvrModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparationUvrModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\") = \"\")\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation-uvr-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparationUvrModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_UVR_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation.cc",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-source-separation.h\"\n\n#include <algorithm>\n#include <sstream>\n#include <string>\n\n#include \"sherpa-onnx/python/csrc/offline-source-separation-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-source-separation.h\"\n\n#define C_CONTIGUOUS py::detail::npy_api::constants::NPY_ARRAY_C_CONTIGUOUS_\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflineSourceSeparationConfig(py::module *m) {\n  PybindOfflineSourceSeparationModelConfig(m);\n\n  using PyClass = OfflineSourceSeparationConfig;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparationConfig\")\n      .def(py::init<const OfflineSourceSeparationModelConfig &>(),\n           py::arg(\"model\") = OfflineSourceSeparationModelConfig{})\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindMultiChannelSamples(py::module *m) {\n  using PyClass = MultiChannelSamples;\n\n  py::class_<PyClass>(*m, \"MultiChannelSamples\")\n      .def_property_readonly(\"data\", [](PyClass &self) -> py::object {\n        // if data is not empty, return a float array of\n        // shape (num_channels, num_samples)\n        int32_t num_channels = self.data.size();\n        if (num_channels == 0) {\n          return py::none();\n        }\n\n        int32_t num_samples = self.data[0].size();\n        if (num_samples == 0) {\n          return py::none();\n        }\n\n        py::array_t<float> ans({num_channels, num_samples});\n\n        py::buffer_info buf = ans.request();\n        auto p = static_cast<float *>(buf.ptr);\n\n        for (int32_t i = 0; i != num_channels; ++i) {\n          std::copy(self.data[i].begin(), self.data[i].end(),\n                    p + i * num_samples);\n        }\n\n        return ans;\n      });\n}\n\nstatic void 
PybindOfflineSourceSeparationOutput(py::module *m) {\n  using PyClass = OfflineSourceSeparationOutput;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparationOutput\")\n      .def_property_readonly(\n          \"sample_rate\", [](const PyClass &self) { return self.sample_rate; })\n      .def_property_readonly(\"stems\",\n                             [](const PyClass &self) { return self.stems; });\n}\n\nvoid PybindOfflineSourceSeparation(py::module *m) {\n  PybindOfflineSourceSeparationConfig(m);\n  PybindOfflineSourceSeparationOutput(m);\n\n  PybindMultiChannelSamples(m);\n\n  using PyClass = OfflineSourceSeparation;\n  py::class_<PyClass>(*m, \"OfflineSourceSeparation\")\n      .def(py::init<const OfflineSourceSeparationConfig &>(),\n           py::arg(\"config\") = OfflineSourceSeparationConfig{})\n      .def(\n          \"process\",\n          [](const PyClass &self, int32_t sample_rate,\n             const py::array_t<float> &samples) {\n            if (!(samples.flags() & py::array::c_style)) {\n              throw py::value_error(\n                  \"input samples should be contiguous. Please use \"\n                  \"np.ascontiguousarray(samples)\");\n            }\n\n            int num_dim = samples.ndim();\n            if (num_dim != 2) {\n              std::ostringstream os;\n              os << \"Expect an array of 2 dimensions [num_channels x \"\n                    \"num_samples]. \"\n                    \"Given dim: \"\n                 << num_dim << \"\\n\";\n              throw py::value_error(os.str());\n            }\n\n            // if num_samples is less than 10, it is very likely the user\n            // has swapped num_channels and num_samples.\n            if (samples.shape(1) < 10) {\n              std::ostringstream os;\n              os << \"Expect an array of 2 dimensions [num_channels x \"\n                    \"num_samples]. 
\"\n                    \"Given [\"\n                 << samples.shape(0) << \" x \" << samples.shape(1) << \"]\"\n                 << \"\\n\";\n              throw py::value_error(os.str());\n            }\n\n            int32_t num_channels = samples.shape(0);\n            int32_t num_samples = samples.shape(1);\n            const float *p = samples.data();\n\n            OfflineSourceSeparationInput input;\n\n            input.samples.data.resize(num_channels);\n            input.sample_rate = sample_rate;\n\n            for (int32_t i = 0; i != num_channels; ++i) {\n              input.samples.data[i] = {p + i * num_samples,\n                                       p + (i + 1) * num_samples};\n            }\n\n            pybind11::gil_scoped_release release;\n\n            return self.Process(input);\n          },\n          py::arg(\"sample_rate\"), py::arg(\"samples\"),\n          \"samples is of shape (num_channels, num_samples) with dtype \"\n          \"np.float32\");\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-source-separation.h",
    "content": "// sherpa-onnx/python/csrc/offline-source-separation.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSourceSeparation(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SOURCE_SEPARATION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speaker-diarization-result.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speaker-diarization-result.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-speaker-diarization-result.h\"\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization-result.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflineSpeakerDiarizationSegment(py::module *m) {\n  using PyClass = OfflineSpeakerDiarizationSegment;\n  py::class_<PyClass>(*m, \"OfflineSpeakerDiarizationSegment\")\n      .def_property_readonly(\"start\", &PyClass::Start)\n      .def_property_readonly(\"end\", &PyClass::End)\n      .def_property_readonly(\"duration\", &PyClass::Duration)\n      .def_property_readonly(\"speaker\", &PyClass::Speaker)\n      .def_property(\"text\", &PyClass::Text, &PyClass::SetText)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOfflineSpeakerDiarizationResult(py::module *m) {\n  PybindOfflineSpeakerDiarizationSegment(m);\n  using PyClass = OfflineSpeakerDiarizationResult;\n  py::class_<PyClass>(*m, \"OfflineSpeakerDiarizationResult\")\n      .def_property_readonly(\"num_speakers\", &PyClass::NumSpeakers)\n      .def_property_readonly(\"num_segments\", &PyClass::NumSegments)\n      .def(\"sort_by_start_time\", &PyClass::SortByStartTime)\n      .def(\"sort_by_speaker\", &PyClass::SortBySpeaker);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speaker-diarization-result.h",
    "content": "// sherpa-onnx/python/csrc/offline-speaker-diarization-result.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeakerDiarizationResult(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_RESULT_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speaker-diarization.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-speaker-diarization.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-speaker-diarization.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-model-config.h\"\n#include \"sherpa-onnx/csrc/offline-speaker-segmentation-pyannote-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOfflineSpeakerSegmentationPyannoteModelConfig(py::module *m) {\n  using PyClass = OfflineSpeakerSegmentationPyannoteModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeakerSegmentationPyannoteModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\nstatic void PybindOfflineSpeakerSegmentationModelConfig(py::module *m) {\n  PybindOfflineSpeakerSegmentationPyannoteModelConfig(m);\n\n  using PyClass = OfflineSpeakerSegmentationModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeakerSegmentationModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineSpeakerSegmentationPyannoteModelConfig &,\n                    int32_t, bool, const std::string &>(),\n           py::arg(\"pyannote\"), py::arg(\"num_threads\") = 1,\n           py::arg(\"debug\") = false, py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"pyannote\", &PyClass::pyannote)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\nstatic void PybindOfflineSpeakerDiarizationConfig(py::module *m) {\n  PybindOfflineSpeakerSegmentationModelConfig(m);\n\n  using PyClass = 
OfflineSpeakerDiarizationConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeakerDiarizationConfig\")\n      .def(py::init<const OfflineSpeakerSegmentationModelConfig &,\n                    const SpeakerEmbeddingExtractorConfig &,\n                    const FastClusteringConfig &, float, float>(),\n           py::arg(\"segmentation\"), py::arg(\"embedding\"), py::arg(\"clustering\"),\n           py::arg(\"min_duration_on\") = 0.3, py::arg(\"min_duration_off\") = 0.5)\n      .def_readwrite(\"segmentation\", &PyClass::segmentation)\n      .def_readwrite(\"embedding\", &PyClass::embedding)\n      .def_readwrite(\"clustering\", &PyClass::clustering)\n      .def_readwrite(\"min_duration_on\", &PyClass::min_duration_on)\n      .def_readwrite(\"min_duration_off\", &PyClass::min_duration_off)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\nvoid PybindOfflineSpeakerDiarization(py::module *m) {\n  PybindOfflineSpeakerDiarizationConfig(m);\n\n  using PyClass = OfflineSpeakerDiarization;\n  py::class_<PyClass>(*m, \"OfflineSpeakerDiarization\")\n      .def(py::init<const OfflineSpeakerDiarizationConfig &>(),\n           py::arg(\"config\"))\n      .def_property_readonly(\"sample_rate\", &PyClass::SampleRate)\n      .def(\"set_config\", &PyClass::SetConfig, py::arg(\"config\"))\n      .def(\n          \"process\",\n          [](const PyClass &self, const std::vector<float> &samples,\n             std::function<int32_t(int32_t, int32_t)> callback) {\n            if (!callback) {\n              return self.Process(samples.data(), samples.size());\n            }\n\n            std::function<int32_t(int32_t, int32_t, void *)> callback_wrapper =\n                [callback](int32_t processed_chunks, int32_t num_chunks,\n                           void *) -> int32_t {\n              callback(processed_chunks, num_chunks);\n              return 0;\n            };\n\n            return self.Process(samples.data(), samples.size(),\n            
                    callback_wrapper);\n          },\n          py::arg(\"samples\"), py::arg(\"callback\") = py::none());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speaker-diarization.h",
    "content": "// sherpa-onnx/python/csrc/offline-speaker-diarization.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeakerDiarization(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEAKER_DIARIZATION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.cc\n//\n// Copyright (c)  2026  Ceva Inc\n\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-dpdfnet-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserDpdfNetModelConfig(py::module *m) {\n  using PyClass = OfflineSpeechDenoiserDpdfNetModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeechDenoiserDpdfNetModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\") = \"\")\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.h\n//\n// Copyright (c)  2026  Ceva Inc\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserDpdfNetModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_DPDFNET_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-gtcrn-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserGtcrnModelConfig(py::module *m) {\n  using PyClass = OfflineSpeechDenoiserGtcrnModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeechDenoiserGtcrnModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\") = \"\")\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserGtcrnModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_GTCRN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-dpdfnet-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-gtcrn-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserModelConfig(py::module *m) {\n  PybindOfflineSpeechDenoiserDpdfNetModelConfig(m);\n  PybindOfflineSpeechDenoiserGtcrnModelConfig(m);\n\n  using PyClass = OfflineSpeechDenoiserModelConfig;\n  py::class_<PyClass>(*m, \"OfflineSpeechDenoiserModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineSpeechDenoiserGtcrnModelConfig &,\n                    const OfflineSpeechDenoiserDpdfNetModelConfig &, int32_t,\n                    bool, const std::string &>(),\n           py::arg(\"gtcrn\") = OfflineSpeechDenoiserGtcrnModelConfig{},\n           py::arg(\"dpdfnet\") = OfflineSpeechDenoiserDpdfNetModelConfig{},\n           py::arg(\"num_threads\") = 1, py::arg(\"debug\") = false,\n           py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"gtcrn\", &PyClass::gtcrn)\n      .def_readwrite(\"dpdfnet\", &PyClass::dpdfnet)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser.cc",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineSpeechDenoiserConfig(py::module *m) {\n  PybindOfflineSpeechDenoiserModelConfig(m);\n\n  using PyClass = OfflineSpeechDenoiserConfig;\n\n  py::class_<PyClass>(*m, \"OfflineSpeechDenoiserConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineSpeechDenoiserModelConfig &>(),\n           py::arg(\"model\") = OfflineSpeechDenoiserModelConfig{})\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindDenoisedAudio(py::module *m) {\n  using PyClass = DenoisedAudio;\n  py::class_<PyClass>(*m, \"DenoisedAudio\")\n      .def_property_readonly(\n          \"sample_rate\", [](const PyClass &self) { return self.sample_rate; })\n      .def_property_readonly(\"samples\",\n                             [](const PyClass &self) { return self.samples; });\n}\n\nvoid PybindOfflineSpeechDenoiser(py::module *m) {\n  PybindOfflineSpeechDenoiserConfig(m);\n  PybindDenoisedAudio(m);\n  using PyClass = OfflineSpeechDenoiser;\n  py::class_<PyClass>(*m, \"OfflineSpeechDenoiser\")\n      .def(py::init<const OfflineSpeechDenoiserConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"__call__\",\n          [](const PyClass &self, const std::vector<float> &samples,\n             int32_t sample_rate) {\n            return self.Run(samples.data(), samples.size(), sample_rate);\n          },\n          py::arg(\"samples\"), py::arg(\"sample_rate\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"run\",\n          
[](const PyClass &self, const std::vector<float> &samples,\n             int32_t sample_rate) {\n            return self.Run(samples.data(), samples.size(), sample_rate);\n          },\n          py::arg(\"samples\"), py::arg(\"sample_rate\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"sample_rate\", &PyClass::GetSampleRate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-speech-denoiser.h",
    "content": "// sherpa-onnx/python/csrc/offline-speech-denoiser.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindDenoisedAudio(py::module *m);\n\nvoid PybindOfflineSpeechDenoiser(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-stream.cc",
    "content": "// sherpa-onnx/python/csrc/offline-stream.cc\n//\n// Copyright (c)  2023 by manyeyes\n\n#include \"sherpa-onnx/python/csrc/offline-stream.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-stream.h\"\n\nnamespace sherpa_onnx {\n\nconstexpr const char *kAcceptWaveformUsage = R\"(\nProcess audio samples.\n\nArgs:\n  sample_rate:\n    Sample rate of the input samples. If it is different from the one\n    expected by the model, we will do resampling inside.\n  waveform:\n    A 1-D float32 tensor containing audio samples. It must be normalized\n    to the range [-1, 1].\n)\";\n\nstatic void PybindOfflineRecognitionResult(py::module *m) {  // NOLINT\n  using PyClass = OfflineRecognitionResult;\n  py::class_<PyClass>(*m, \"OfflineRecognitionResult\")\n      .def(\"__str__\", &PyClass::AsJsonString)\n      .def_property_readonly(\n          \"text\",\n          [](const PyClass &self) -> py::str {\n            return py::str(PyUnicode_DecodeUTF8(self.text.c_str(),\n                                                self.text.size(), \"ignore\"));\n          })\n      .def_property_readonly(\"lang\",\n         [](const PyClass &self) { return self.lang; })\n      .def_property_readonly(\"emotion\",\n        [](const PyClass &self) { return self.emotion; })\n      .def_property_readonly(\"event\",\n        [](const PyClass &self) { return self.event; })\n      .def_property_readonly(\"tokens\",\n        [](const PyClass &self) { return self.tokens; })\n      .def_property_readonly(\"words\",\n        [](const PyClass &self) { return self.words; })\n      .def_property_readonly(\"timestamps\",\n        [](const PyClass &self) { return self.timestamps; })\n      .def_property_readonly(\"durations\",\n        [](const PyClass &self) { return self.durations; })\n      .def_property_readonly(\"ys_log_probs\",\n        [](const PyClass &self) { return self.ys_log_probs; })\n      .def_property_readonly(\"segment_timestamps\",\n        [](const PyClass 
&self) { return self.segment_timestamps; })\n      .def_property_readonly(\"segment_durations\",\n        [](const PyClass &self) { return self.segment_durations; })\n      .def_property_readonly(\"segment_texts\",\n        [](const PyClass &self) { return self.segment_texts; });\n}\n\nvoid PybindOfflineStream(py::module *m) {\n  PybindOfflineRecognitionResult(m);\n\n  using PyClass = OfflineStream;\n  py::class_<PyClass>(*m, \"OfflineStream\")\n      .def(\n          \"accept_waveform\",\n          [](PyClass &self, float sample_rate,\n             const std::vector<float> &waveform) {\n            self.AcceptWaveform(sample_rate, waveform.data(), waveform.size());\n          },\n          py::arg(\"sample_rate\"), py::arg(\"waveform\"), kAcceptWaveformUsage,\n          py::call_guard<py::gil_scoped_release>())\n      .def(\"set_option\", &PyClass::SetOption, py::arg(\"key\"),\n           py::arg(\"value\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"has_option\", &PyClass::HasOption, py::arg(\"key\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"get_option\", &PyClass::GetOption, py::arg(\"key\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"result\", &PyClass::GetResult);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-stream.h",
    "content": "// sherpa-onnx/python/csrc/offline-stream.h\n//\n// Copyright (c)  2023 by manyeyes\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_STREAM_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_STREAM_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineStream(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_STREAM_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tdnn-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tdnn-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-tdnn-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-tdnn-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTdnnModelConfig(py::module *m) {\n  using PyClass = OfflineTdnnModelConfig;\n  py::class_<PyClass>(*m, \"OfflineTdnnModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tdnn-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tdnn-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTdnnModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TDNN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-transducer-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-transducer-model-config.cc\n//\n// Copyright (c)  2023 by manyeyes\n\n#include \"sherpa-onnx/python/csrc/offline-transducer-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-transducer-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTransducerModelConfig(py::module *m) {\n  using PyClass = OfflineTransducerModelConfig;\n  py::class_<PyClass>(*m, \"OfflineTransducerModelConfig\")\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &>(),\n           py::arg(\"encoder_filename\"), py::arg(\"decoder_filename\"),\n           py::arg(\"joiner_filename\"))\n      .def_readwrite(\"encoder_filename\", &PyClass::encoder_filename)\n      .def_readwrite(\"decoder_filename\", &PyClass::decoder_filename)\n      .def_readwrite(\"joiner_filename\", &PyClass::joiner_filename)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-transducer-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-transducer-model-config.h\n//\n// Copyright (c)  2023 by manyeyes\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTransducerModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TRANSDUCER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-kitten-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-kitten-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-kitten-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-kitten-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsKittenModelConfig(py::module *m) {\n  using PyClass = OfflineTtsKittenModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsKittenModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &, float>(),\n           py::arg(\"model\"), py::arg(\"voices\"), py::arg(\"tokens\"),\n           py::arg(\"data_dir\"), py::arg(\"length_scale\") = 1.0)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"voices\", &PyClass::voices)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"data_dir\", &PyClass::data_dir)\n      .def_readwrite(\"length_scale\", &PyClass::length_scale)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-kitten-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-kitten-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsKittenModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KITTEN_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-kokoro-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsKokoroModelConfig(py::module *m) {\n  using PyClass = OfflineTtsKokoroModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsKokoroModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &, float,\n                    const std::string &>(),\n           py::arg(\"model\"), py::arg(\"voices\"), py::arg(\"tokens\"),\n           py::arg(\"lexicon\") = \"\", py::arg(\"data_dir\"),\n           py::arg(\"dict_dir\") = \"\", py::arg(\"length_scale\") = 1.0,\n           py::arg(\"lang\") = \"\")\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"voices\", &PyClass::voices)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"lexicon\", &PyClass::lexicon)\n      .def_readwrite(\"data_dir\", &PyClass::data_dir)\n      .def_readwrite(\"dict_dir\", &PyClass::dict_dir)\n      .def_readwrite(\"length_scale\", &PyClass::length_scale)\n      .def_readwrite(\"lang\", &PyClass::lang)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsKokoroModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_KOKORO_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-matcha-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-matcha-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-matcha-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-matcha-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsMatchaModelConfig(py::module *m) {\n  using PyClass = OfflineTtsMatchaModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsMatchaModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &, float, float>(),\n           py::arg(\"acoustic_model\"), py::arg(\"vocoder\"),\n           py::arg(\"lexicon\") = \"\", py::arg(\"tokens\"), py::arg(\"data_dir\") = \"\",\n           py::arg(\"dict_dir\") = \"\", py::arg(\"noise_scale\") = 1.0,\n           py::arg(\"length_scale\") = 1.0)\n      .def_readwrite(\"acoustic_model\", &PyClass::acoustic_model)\n      .def_readwrite(\"vocoder\", &PyClass::vocoder)\n      .def_readwrite(\"lexicon\", &PyClass::lexicon)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"data_dir\", &PyClass::data_dir)\n      .def_readwrite(\"dict_dir\", &PyClass::dict_dir)\n      .def_readwrite(\"noise_scale\", &PyClass::noise_scale)\n      .def_readwrite(\"length_scale\", &PyClass::length_scale)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-matcha-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-matcha-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsMatchaModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MATCHA_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-kitten-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-kokoro-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-matcha-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-pocket-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-vits-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsModelConfig(py::module *m) {\n  PybindOfflineTtsVitsModelConfig(m);\n  PybindOfflineTtsMatchaModelConfig(m);\n  PybindOfflineTtsKokoroModelConfig(m);\n  PybindOfflineTtsZipvoiceModelConfig(m);\n  PybindOfflineTtsKittenModelConfig(m);\n  PybindOfflineTtsPocketModelConfig(m);\n  PybindOfflineTtsSupertonicModelConfig(m);\n\n  using PyClass = OfflineTtsModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineTtsVitsModelConfig &,\n                    const OfflineTtsMatchaModelConfig &,\n                    const OfflineTtsKokoroModelConfig &,\n                    const OfflineTtsZipvoiceModelConfig &,\n                    const OfflineTtsKittenModelConfig &,\n                    const OfflineTtsPocketModelConfig &,\n                    const OfflineTtsSupertonicModelConfig &, int32_t, bool,\n                    const std::string &>(),\n           py::arg(\"vits\") = OfflineTtsVitsModelConfig{},\n           py::arg(\"matcha\") = OfflineTtsMatchaModelConfig{},\n           py::arg(\"kokoro\") = OfflineTtsKokoroModelConfig{},\n           py::arg(\"zipvoice\") = 
OfflineTtsZipvoiceModelConfig{},\n           py::arg(\"kitten\") = OfflineTtsKittenModelConfig{},\n           py::arg(\"pocket\") = OfflineTtsPocketModelConfig{},\n           py::arg(\"supertonic\") = OfflineTtsSupertonicModelConfig{},\n           py::arg(\"num_threads\") = 1, py::arg(\"debug\") = false,\n           py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"vits\", &PyClass::vits)\n      .def_readwrite(\"matcha\", &PyClass::matcha)\n      .def_readwrite(\"kokoro\", &PyClass::kokoro)\n      .def_readwrite(\"zipvoice\", &PyClass::zipvoice)\n      .def_readwrite(\"kitten\", &PyClass::kitten)\n      .def_readwrite(\"pocket\", &PyClass::pocket)\n      .def_readwrite(\"supertonic\", &PyClass::supertonic)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-pocket-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-pocket-model-config.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-pocket-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-pocket-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsPocketModelConfig(py::module *m) {\n  using PyClass = OfflineTtsPocketModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsPocketModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, int32_t>(),\n           py::arg(\"lm_flow\"), py::arg(\"lm_main\"), py::arg(\"encoder\"),\n           py::arg(\"decoder\"), py::arg(\"text_conditioner\"),\n           py::arg(\"vocab_json\"), py::arg(\"token_scores_json\"),\n           py::arg(\"voice_embedding_cache_capacity\") = 50)\n      .def_readwrite(\"lm_flow\", &PyClass::lm_flow)\n      .def_readwrite(\"lm_main\", &PyClass::lm_main)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"text_conditioner\", &PyClass::text_conditioner)\n      .def_readwrite(\"vocab_json\", &PyClass::vocab_json)\n      .def_readwrite(\"token_scores_json\", &PyClass::token_scores_json)\n      .def_readwrite(\"voice_embedding_cache_capacity\",\n                     &PyClass::voice_embedding_cache_capacity)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-pocket-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-pocket-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsPocketModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_POCKET_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.cc\n//\n// Copyright (c)  2026 zengyw\n\n#include \"sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-supertonic-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsSupertonicModelConfig(py::module *m) {\n  using PyClass = OfflineTtsSupertonicModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsSupertonicModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &>(),\n           py::arg(\"duration_predictor\"), py::arg(\"text_encoder\"),\n           py::arg(\"vector_estimator\"), py::arg(\"vocoder\"), py::arg(\"tts_json\"),\n           py::arg(\"unicode_indexer\"), py::arg(\"voice_style\"))\n      .def_readwrite(\"duration_predictor\", &PyClass::duration_predictor)\n      .def_readwrite(\"text_encoder\", &PyClass::text_encoder)\n      .def_readwrite(\"vector_estimator\", &PyClass::vector_estimator)\n      .def_readwrite(\"vocoder\", &PyClass::vocoder)\n      .def_readwrite(\"tts_json\", &PyClass::tts_json)\n      .def_readwrite(\"unicode_indexer\", &PyClass::unicode_indexer)\n      .def_readwrite(\"voice_style\", &PyClass::voice_style)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-supertonic-model-config.h\n//\n// Copyright (c)  2026 zengyw\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsSupertonicModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_SUPERTONIC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-vits-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-vits-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-vits-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-vits-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsVitsModelConfig(py::module *m) {\n  using PyClass = OfflineTtsVitsModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsVitsModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, float, float, float>(),\n           py::arg(\"model\"), py::arg(\"lexicon\") = \"\", py::arg(\"tokens\"),\n           py::arg(\"data_dir\") = \"\", py::arg(\"dict_dir\") = \"\",\n           py::arg(\"noise_scale\") = 0.667, py::arg(\"noise_scale_w\") = 0.8,\n           py::arg(\"length_scale\") = 1.0)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"lexicon\", &PyClass::lexicon)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"data_dir\", &PyClass::data_dir)\n      .def_readwrite(\"dict_dir\", &PyClass::dict_dir)\n      .def_readwrite(\"noise_scale\", &PyClass::noise_scale)\n      .def_readwrite(\"noise_scale_w\", &PyClass::noise_scale_w)\n      .def_readwrite(\"length_scale\", &PyClass::length_scale)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-vits-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-vits-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsVitsModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_VITS_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-tts-zipvoice-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsZipvoiceModelConfig(py::module *m) {\n  using PyClass = OfflineTtsZipvoiceModelConfig;\n\n  py::class_<PyClass>(*m, \"OfflineTtsZipvoiceModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &,\n                    const std::string &, const std::string &, float, float,\n                    float, float>(),\n           py::arg(\"tokens\"), py::arg(\"encoder\"), py::arg(\"decoder\"),\n           py::arg(\"vocoder\"), py::arg(\"data_dir\") = \"\",\n           py::arg(\"lexicon\") = \"\", py::arg(\"feat_scale\") = 0.1,\n           py::arg(\"t_shift\") = 0.5, py::arg(\"target_rms\") = 0.1,\n           py::arg(\"guidance_scale\") = 1.0)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"vocoder\", &PyClass::vocoder)\n      .def_readwrite(\"data_dir\", &PyClass::data_dir)\n      .def_readwrite(\"lexicon\", &PyClass::lexicon)\n      .def_readwrite(\"feat_scale\", &PyClass::feat_scale)\n      .def_readwrite(\"t_shift\", &PyClass::t_shift)\n      .def_readwrite(\"target_rms\", &PyClass::target_rms)\n      .def_readwrite(\"guidance_scale\", &PyClass::guidance_scale)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts-zipvoice-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTtsZipvoiceModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_ZIPVOICE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts.cc",
    "content": "// sherpa-onnx/python/csrc/offline-tts.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#include \"sherpa-onnx/python/csrc/offline-tts.h\"\n\n#include <algorithm>\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/offline-tts.h\"\n#include \"sherpa-onnx/python/csrc/offline-tts-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindGeneratedAudio(py::module *m) {\n  using PyClass = GeneratedAudio;\n  py::class_<PyClass>(*m, \"GeneratedAudio\")\n      .def(py::init<>())\n      .def_readwrite(\"samples\", &PyClass::samples)\n      .def_readwrite(\"sample_rate\", &PyClass::sample_rate)\n      .def(\"__str__\", [](PyClass &self) {\n        std::ostringstream os;\n        os << \"GeneratedAudio(sample_rate=\" << self.sample_rate << \", \";\n        os << \"num_samples=\" << self.samples.size() << \")\";\n        return os.str();\n      });\n}\n\nstatic void PybindGenerationConfig(py::module *m) {\n  using PyClass = GenerationConfig;\n\n  py::class_<PyClass>(*m, \"GenerationConfig\")\n      .def(py::init<>())\n      .def_readwrite(\"silence_scale\", &PyClass::silence_scale)\n      .def_readwrite(\"speed\", &PyClass::speed)\n      .def_readwrite(\"sid\", &PyClass::sid)\n      .def_readwrite(\"reference_audio\", &PyClass::reference_audio)\n      .def_readwrite(\"reference_sample_rate\", &PyClass::reference_sample_rate)\n      .def_readwrite(\"reference_text\", &PyClass::reference_text)\n      .def_readwrite(\"num_steps\", &PyClass::num_steps)\n      .def_readwrite(\"extra\", &PyClass::extra)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindOfflineTtsConfig(py::module *m) {\n  PybindOfflineTtsModelConfig(m);\n\n  using PyClass = OfflineTtsConfig;\n  py::class_<PyClass>(*m, \"OfflineTtsConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineTtsModelConfig &, const std::string &,\n                    const std::string &, int32_t, float>(),\n           py::arg(\"model\"), py::arg(\"rule_fsts\") = 
\"\",\n           py::arg(\"rule_fars\") = \"\", py::arg(\"max_num_sentences\") = 1,\n           py::arg(\"silence_scale\") = 0.2)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"rule_fsts\", &PyClass::rule_fsts)\n      .def_readwrite(\"rule_fars\", &PyClass::rule_fars)\n      .def_readwrite(\"max_num_sentences\", &PyClass::max_num_sentences)\n      .def_readwrite(\"silence_scale\", &PyClass::silence_scale)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOfflineTts(py::module *m) {\n  PybindOfflineTtsConfig(m);\n  PybindGeneratedAudio(m);\n  PybindGenerationConfig(m);\n\n  using PyClass = OfflineTts;\n  py::class_<PyClass>(*m, \"OfflineTts\")\n      .def(py::init<const OfflineTtsConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"sample_rate\", &PyClass::SampleRate)\n      .def_property_readonly(\"num_speakers\", &PyClass::NumSpeakers)\n      .def(\n          \"generate\",\n          [](const PyClass &self, const std::string &text, int64_t sid,\n             float speed,\n             std::function<int32_t(py::array_t<float>, float)> callback)\n              -> GeneratedAudio {\n            if (!callback) {\n              return self.Generate(text, sid, speed);\n            }\n\n            std::function<int32_t(const float *, int32_t, float)>\n                callback_wrapper = [callback](const float *samples, int32_t n,\n                                              float progress) {\n                  // CAUTION(fangjun): we have to copy samples since it is\n                  // freed once the call back returns.\n\n                  pybind11::gil_scoped_acquire acquire;\n\n                  pybind11::array_t<float> array(n);\n                  py::buffer_info buf = array.request();\n                  auto p = static_cast<float *>(buf.ptr);\n                  std::copy(samples, samples + n, p);\n                
  return callback(array, progress);\n                };\n\n            return self.Generate(text, sid, speed, callback_wrapper);\n          },\n          py::arg(\"text\"), py::arg(\"sid\") = 0, py::arg(\"speed\") = 1.0,\n          py::arg(\"callback\") = py::none(),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"generate\",\n          [](const PyClass &self, const std::string &text,\n             const GenerationConfig &config,\n             std::function<int32_t(py::array_t<float>, float)> callback)\n              -> GeneratedAudio {\n            if (!callback) {\n              return self.Generate(text, config);\n            }\n\n            std::function<int32_t(const float *, int32_t, float)>\n                callback_wrapper = [callback](const float *samples, int32_t n,\n                                              float progress) {\n                  py::gil_scoped_acquire acquire;\n\n                  py::array_t<float> array(n);\n                  auto buf = array.request();\n                  auto *p = static_cast<float *>(buf.ptr);\n                  std::copy(samples, samples + n, p);\n\n                  return callback(array, progress);\n                };\n\n            return self.Generate(text, config, callback_wrapper);\n          },\n          py::arg(\"text\"), py::arg(\"config\"), py::arg(\"callback\") = py::none(),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"generate\",\n          [](const PyClass &self, const std::string &text,\n             const std::string &prompt_text,\n             const std::vector<float> &prompt_samples, int32_t sample_rate,\n             float speed, int32_t num_steps,\n             std::function<int32_t(py::array_t<float>, float)> callback)\n              -> GeneratedAudio {\n            GenerationConfig config;\n            config.reference_audio = prompt_samples;\n            config.reference_sample_rate = sample_rate;\n            
config.reference_text = prompt_text;\n            config.speed = speed;\n            config.num_steps = num_steps;\n\n            if (!callback) {\n              return self.Generate(text, config);\n            }\n\n            std::function<int32_t(const float *, int32_t, float)>\n                callback_wrapper = [callback](const float *samples, int32_t n,\n                                              float progress) {\n                  // CAUTION(fangjun): we have to copy samples since it is\n                  // freed once the call back returns.\n\n                  pybind11::gil_scoped_acquire acquire;\n\n                  pybind11::array_t<float> array(n);\n                  py::buffer_info buf = array.request();\n                  auto p = static_cast<float *>(buf.ptr);\n                  std::copy(samples, samples + n, p);\n                  return callback(array, progress);\n                };\n\n            return self.Generate(text, config, callback_wrapper);\n          },\n          py::arg(\"text\"), py::arg(\"prompt_text\"), py::arg(\"prompt_samples\"),\n          py::arg(\"sample_rate\"), py::arg(\"speed\") = 1.0,\n          py::arg(\"num_steps\") = 4, py::arg(\"callback\") = py::none(),\n          py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-tts.h",
    "content": "// sherpa-onnx/python/csrc/offline-tts.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineTts(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_TTS_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-wenet-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineWenetCtcModelConfig(py::module *m) {\n  using PyClass = OfflineWenetCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineWenetCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-wenet-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineWenetCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WENET_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-whisper-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-whisper-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/offline-whisper-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/python/csrc/offline-whisper-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineWhisperModelConfig(py::module *m) {\n  using PyClass = OfflineWhisperModelConfig;\n  py::class_<PyClass>(*m, \"OfflineWhisperModelConfig\")\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &, const std::string &, int32_t, bool,\n                    bool>(),\n           py::arg(\"encoder\"), py::arg(\"decoder\"), py::arg(\"language\"),\n           py::arg(\"task\"), py::arg(\"tail_paddings\") = -1,\n           py::arg(\"enable_token_timestamps\") = false,\n           py::arg(\"enable_segment_timestamps\") = false)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"language\", &PyClass::language)\n      .def_readwrite(\"task\", &PyClass::task)\n      .def_readwrite(\"tail_paddings\", &PyClass::tail_paddings)\n      .def_readwrite(\"enable_token_timestamps\",\n                     &PyClass::enable_token_timestamps)\n      .def_readwrite(\"enable_segment_timestamps\",\n                     &PyClass::enable_segment_timestamps)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-whisper-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-whisper-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineWhisperModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_WHISPER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/offline-zipformer-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineZipformerCtcModelConfig(py::module *m) {\n  using PyClass = OfflineZipformerCtcModelConfig;\n  py::class_<PyClass>(*m, \"OfflineZipformerCtcModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/offline-zipformer-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOfflineZipformerCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_OFFLINE_ZIPFORMER_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/online-ctc-fst-decoder-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineCtcFstDecoderConfig(py::module *m) {\n  using PyClass = OnlineCtcFstDecoderConfig;\n  py::class_<PyClass>(*m, \"OnlineCtcFstDecoderConfig\")\n      .def(py::init<const std::string &, int32_t>(), py::arg(\"graph\") = \"\",\n           py::arg(\"max_active\") = 3000)\n      .def_readwrite(\"graph\", &PyClass::graph)\n      .def_readwrite(\"max_active\", &PyClass::max_active)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.h",
    "content": "// sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineCtcFstDecoderConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_CTC_FST_DECODER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-lm-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-lm-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-lm-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx//csrc/online-lm-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineLMConfig(py::module *m) {\n  using PyClass = OnlineLMConfig;\n  py::class_<PyClass>(*m, \"OnlineLMConfig\")\n      .def(py::init<const std::string &, float, int32_t,\n           const std::string &, bool, const std::string &,\n           float, int>(),\n           py::arg(\"model\") = \"\", py::arg(\"scale\") = 0.5f,\n           py::arg(\"lm_num_threads\") = 1, py::arg(\"lm_provider\") = \"cpu\",\n           py::arg(\"shallow_fusion\") = true, py::arg(\"lodr_fst\") = \"\",\n           py::arg(\"lodr_scale\") = 0.0f, py::arg(\"lodr_backoff_id\") = -1)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"scale\", &PyClass::scale)\n      .def_readwrite(\"lm_provider\", &PyClass::lm_provider)\n      .def_readwrite(\"lm_num_threads\", &PyClass::lm_num_threads)\n      .def_readwrite(\"shallow_fusion\", &PyClass::shallow_fusion)\n      .def_readwrite(\"lodr_fst\", &PyClass::lodr_fst)\n      .def_readwrite(\"lodr_scale\", &PyClass::lodr_scale)\n      .def_readwrite(\"lodr_backoff_id\", &PyClass::lodr_backoff_id)\n\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-lm-config.h",
    "content": "// sherpa-onnx/python/csrc/online-lm-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_LM_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_LM_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineLMConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_LM_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-model-config.h\"\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n#include \"sherpa-onnx/csrc/provider-config.h\"\n#include \"sherpa-onnx/python/csrc/online-nemo-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-paraformer-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-t-one-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-transducer-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-wenet-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.h\"\n#include \"sherpa-onnx/python/csrc/provider-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineModelConfig(py::module *m) {\n  PybindOnlineTransducerModelConfig(m);\n  PybindOnlineParaformerModelConfig(m);\n  PybindOnlineWenetCtcModelConfig(m);\n  PybindOnlineZipformer2CtcModelConfig(m);\n  PybindOnlineNeMoCtcModelConfig(m);\n  PybindOnlineToneCtcModelConfig(m);\n  PybindProviderConfig(m);\n\n  using PyClass = OnlineModelConfig;\n  py::class_<PyClass>(*m, \"OnlineModelConfig\")\n      .def(py::init<const OnlineTransducerModelConfig &,\n                    const OnlineParaformerModelConfig &,\n                    const OnlineWenetCtcModelConfig &,\n                    const OnlineZipformer2CtcModelConfig &,\n                    const OnlineNeMoCtcModelConfig &,\n                    const OnlineToneCtcModelConfig &, const ProviderConfig &,\n                    const std::string &, int32_t, int32_t, bool,\n                    const std::string &, const std::string &,\n                    const std::string &>(),\n           py::arg(\"transducer\") = OnlineTransducerModelConfig(),\n           py::arg(\"paraformer\") = OnlineParaformerModelConfig(),\n           
py::arg(\"wenet_ctc\") = OnlineWenetCtcModelConfig(),\n           py::arg(\"zipformer2_ctc\") = OnlineZipformer2CtcModelConfig(),\n           py::arg(\"nemo_ctc\") = OnlineNeMoCtcModelConfig(),\n           py::arg(\"t_one_ctc\") = OnlineToneCtcModelConfig(),\n           py::arg(\"provider_config\") = ProviderConfig(), py::arg(\"tokens\"),\n           py::arg(\"num_threads\"), py::arg(\"warm_up\") = 0,\n           py::arg(\"debug\") = false, py::arg(\"model_type\") = \"\",\n           py::arg(\"modeling_unit\") = \"\", py::arg(\"bpe_vocab\") = \"\")\n      .def_readwrite(\"transducer\", &PyClass::transducer)\n      .def_readwrite(\"paraformer\", &PyClass::paraformer)\n      .def_readwrite(\"wenet_ctc\", &PyClass::wenet_ctc)\n      .def_readwrite(\"zipformer2_ctc\", &PyClass::zipformer2_ctc)\n      .def_readwrite(\"nemo_ctc\", &PyClass::nemo_ctc)\n      .def_readwrite(\"t_one_ctc\", &PyClass::t_one_ctc)\n      .def_readwrite(\"provider_config\", &PyClass::provider_config)\n      .def_readwrite(\"tokens\", &PyClass::tokens)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"warm_up\", &PyClass::warm_up)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"model_type\", &PyClass::model_type)\n      .def_readwrite(\"modeling_unit\", &PyClass::modeling_unit)\n      .def_readwrite(\"bpe_vocab\", &PyClass::bpe_vocab)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-nemo-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-nemo-ctc-model-config.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-nemo-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-nemo-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineNeMoCtcModelConfig(py::module *m) {\n  using PyClass = OnlineNeMoCtcModelConfig;\n  py::class_<PyClass>(*m, \"OnlineNeMoCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-nemo-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-nemo-ctc-model-config.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineNeMoCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_NEMO_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-paraformer-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-paraformer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-paraformer-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-paraformer-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineParaformerModelConfig(py::module *m) {\n  using PyClass = OnlineParaformerModelConfig;\n  py::class_<PyClass>(*m, \"OnlineParaformerModelConfig\")\n      .def(py::init<const std::string &, const std::string &>(),\n           py::arg(\"encoder\"), py::arg(\"decoder\"))\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-paraformer-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-paraformer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineParaformerModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_PARAFORMER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-punctuation.cc",
    "content": "// sherpa-onnx/python/csrc/online-punctuation.cc\n//\n// Copyright (c) 2024\n\n#include \"sherpa-onnx/python/csrc/online-punctuation.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/online-punctuation.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOnlinePunctuationModelConfig(py::module *m) {\n  using PyClass = OnlinePunctuationModelConfig;\n  py::class_<PyClass>(*m, \"OnlinePunctuationModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &, int32_t, bool,\n                    const std::string &>(),\n           py::arg(\"cnn_bilstm\"), py::arg(\"bpe_vocab\"),\n           py::arg(\"num_threads\") = 1, py::arg(\"debug\") = false,\n           py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"cnn_bilstm\", &PyClass::cnn_bilstm)\n      .def_readwrite(\"bpe_vocab\", &PyClass::bpe_vocab)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindOnlinePunctuationConfig(py::module *m) {\n  PybindOnlinePunctuationModelConfig(m);\n  using PyClass = OnlinePunctuationConfig;\n\n  py::class_<PyClass>(*m, \"OnlinePunctuationConfig\")\n      .def(py::init<>())\n      .def(py::init<const OnlinePunctuationModelConfig &>(),\n           py::arg(\"model_config\"))\n      .def_readwrite(\"model_config\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOnlinePunctuation(py::module *m) {\n  PybindOnlinePunctuationConfig(m);\n  using PyClass = OnlinePunctuation;\n\n  py::class_<PyClass>(*m, \"OnlinePunctuation\")\n      .def(py::init<const OnlinePunctuationConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"add_punctuation_with_case\", 
&PyClass::AddPunctuationWithCase,\n           py::arg(\"text\"), py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-punctuation.h",
    "content": "// sherpa-onnx/python/csrc/online-punctuation.h\n//\n// Copyright (c) 2024\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_PUNCTUATION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_PUNCTUATION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlinePunctuation(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_PUNCTUATION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-recognizer.cc",
    "content": "// sherpa-onnx/python/csrc/online-recongizer.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-recognizer.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-recognizer.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOnlineRecognizerResult(py::module *m) {\n  using PyClass = OnlineRecognizerResult;\n  py::class_<PyClass>(*m, \"OnlineRecognizerResult\")\n      .def_property_readonly(\n          \"text\",\n          [](PyClass &self) -> py::str {\n            return py::str(PyUnicode_DecodeUTF8(self.text.c_str(),\n                                                self.text.size(), \"ignore\"));\n          })\n      .def_property_readonly(\n          \"tokens\",\n          [](PyClass &self) -> std::vector<std::string> { return self.tokens; })\n      .def_property_readonly(\n          \"start_time\", [](PyClass &self) -> float { return self.start_time; })\n      .def_property_readonly(\n          \"timestamps\",\n          [](PyClass &self) -> std::vector<float> { return self.timestamps; })\n      .def_property_readonly(\n          \"ys_probs\",\n          [](PyClass &self) -> std::vector<float> { return self.ys_probs; })\n      .def_property_readonly(\n          \"lm_probs\",\n          [](PyClass &self) -> std::vector<float> { return self.lm_probs; })\n      .def_property_readonly(\"context_scores\",\n                             [](PyClass &self) -> std::vector<float> {\n                               return self.context_scores;\n                             })\n      .def_property_readonly(\n          \"segment\", [](PyClass &self) -> int32_t { return self.segment; })\n      .def_property_readonly(\n          \"words\",\n          [](PyClass &self) -> std::vector<int32_t> { return self.words; })\n      .def_property_readonly(\n          \"is_final\", [](PyClass &self) -> bool { return self.is_final; })\n      .def(\"__str__\", &PyClass::AsJsonString,\n           
py::call_guard<py::gil_scoped_release>())\n      .def(\"as_json_string\", &PyClass::AsJsonString,\n           py::call_guard<py::gil_scoped_release>());\n}\n\nstatic void PybindOnlineRecognizerConfig(py::module *m) {\n  using PyClass = OnlineRecognizerConfig;\n  py::class_<PyClass>(*m, \"OnlineRecognizerConfig\")\n      .def(py::init<const FeatureExtractorConfig &, const OnlineModelConfig &,\n                    const OnlineLMConfig &, const EndpointConfig &,\n                    const OnlineCtcFstDecoderConfig &, bool,\n                    const std::string &, int32_t, const std::string &, float,\n                    float, float, const std::string &, const std::string &,\n                    bool, const HomophoneReplacerConfig &>(),\n           py::arg(\"feat_config\"), py::arg(\"model_config\"),\n           py::arg(\"lm_config\") = OnlineLMConfig(),\n           py::arg(\"endpoint_config\") = EndpointConfig(),\n           py::arg(\"ctc_fst_decoder_config\") = OnlineCtcFstDecoderConfig(),\n           py::arg(\"enable_endpoint\"), py::arg(\"decoding_method\"),\n           py::arg(\"max_active_paths\") = 4, py::arg(\"hotwords_file\") = \"\",\n           py::arg(\"hotwords_score\") = 0, py::arg(\"blank_penalty\") = 0.0,\n           py::arg(\"temperature_scale\") = 2.0, py::arg(\"rule_fsts\") = \"\",\n           py::arg(\"rule_fars\") = \"\", py::arg(\"reset_encoder\") = false,\n           py::arg(\"hr\") = HomophoneReplacerConfig{})\n      .def_readwrite(\"feat_config\", &PyClass::feat_config)\n      .def_readwrite(\"model_config\", &PyClass::model_config)\n      .def_readwrite(\"lm_config\", &PyClass::lm_config)\n      .def_readwrite(\"endpoint_config\", &PyClass::endpoint_config)\n      .def_readwrite(\"ctc_fst_decoder_config\", &PyClass::ctc_fst_decoder_config)\n      .def_readwrite(\"enable_endpoint\", &PyClass::enable_endpoint)\n      .def_readwrite(\"decoding_method\", &PyClass::decoding_method)\n      .def_readwrite(\"max_active_paths\", 
&PyClass::max_active_paths)\n      .def_readwrite(\"hotwords_file\", &PyClass::hotwords_file)\n      .def_readwrite(\"hotwords_score\", &PyClass::hotwords_score)\n      .def_readwrite(\"blank_penalty\", &PyClass::blank_penalty)\n      .def_readwrite(\"temperature_scale\", &PyClass::temperature_scale)\n      .def_readwrite(\"rule_fsts\", &PyClass::rule_fsts)\n      .def_readwrite(\"rule_fars\", &PyClass::rule_fars)\n      .def_readwrite(\"reset_encoder\", &PyClass::reset_encoder)\n      .def_readwrite(\"hr\", &PyClass::hr)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOnlineRecognizer(py::module *m) {\n  PybindOnlineRecognizerResult(m);\n  PybindOnlineRecognizerConfig(m);\n\n  using PyClass = OnlineRecognizer;\n  py::class_<PyClass>(*m, \"OnlineRecognizer\")\n      .def(py::init<const OnlineRecognizerConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](const PyClass &self) { return self.CreateStream(); },\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"create_stream\",\n          [](PyClass &self, const std::string &hotwords) {\n            return self.CreateStream(hotwords);\n          },\n          py::arg(\"hotwords\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"is_ready\", &PyClass::IsReady,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"decode_stream\", &PyClass::DecodeStream, py::arg(\"s\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"decode_streams\",\n          [](PyClass &self, std::vector<OnlineStream *> ss) {\n            self.DecodeStreams(ss.data(), ss.size());\n          },\n          py::arg(\"ss\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"get_result\", &PyClass::GetResult, py::arg(\"s\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"is_endpoint\", &PyClass::IsEndpoint, py::arg(\"s\"),\n           
py::call_guard<py::gil_scoped_release>())\n      .def(\"reset\", &PyClass::Reset, py::arg(\"s\"),\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-recognizer.h",
    "content": "// sherpa-onnx/python/csrc/online-recognizer.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_RECOGNIZER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_RECOGNIZER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineRecognizer(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_RECOGNIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-speech-denoiser.cc",
    "content": "// sherpa-onnx/python/csrc/online-speech-denoiser.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-speech-denoiser.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-speech-denoiser.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser-model-config.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindOnlineSpeechDenoiserConfig(py::module *m) {\n  using PyClass = OnlineSpeechDenoiserConfig;\n\n  py::class_<PyClass>(*m, \"OnlineSpeechDenoiserConfig\")\n      .def(py::init<>())\n      .def(py::init<const OfflineSpeechDenoiserModelConfig &>(),\n           py::arg(\"model\") = OfflineSpeechDenoiserModelConfig{})\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindOnlineSpeechDenoiser(py::module *m) {\n  PybindOnlineSpeechDenoiserConfig(m);\n\n  using PyClass = OnlineSpeechDenoiser;\n  py::class_<PyClass>(*m, \"OnlineSpeechDenoiser\")\n      .def(py::init<const OnlineSpeechDenoiserConfig &>(), py::arg(\"config\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"__call__\",\n          [](PyClass &self, const std::vector<float> &samples,\n             int32_t sample_rate) {\n            return self.Run(samples.data(), samples.size(), sample_rate);\n          },\n          py::arg(\"samples\"), py::arg(\"sample_rate\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"run\",\n          [](PyClass &self, const std::vector<float> &samples,\n             int32_t sample_rate) {\n            return self.Run(samples.data(), samples.size(), sample_rate);\n          },\n          py::arg(\"samples\"), py::arg(\"sample_rate\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\"flush\", &PyClass::Flush, py::call_guard<py::gil_scoped_release>())\n      
.def(\"reset\", &PyClass::Reset, py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"sample_rate\", &PyClass::GetSampleRate)\n      .def_property_readonly(\"frame_shift_in_samples\",\n                             &PyClass::GetFrameShiftInSamples);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-speech-denoiser.h",
    "content": "// sherpa-onnx/python/csrc/online-speech-denoiser.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_SPEECH_DENOISER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_SPEECH_DENOISER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineSpeechDenoiser(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_SPEECH_DENOISER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-stream.cc",
    "content": "// sherpa-onnx/python/csrc/online-stream.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-stream.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-stream.h\"\n\nnamespace sherpa_onnx {\n\nconstexpr const char *kAcceptWaveformUsage = R\"(\nProcess audio samples.\n\nArgs:\n  sample_rate:\n    Sample rate of the input samples. If it is different from the one\n    expected by the model, we will do resampling inside.\n  waveform:\n    A 1-D float32 tensor containing audio samples. It must be normalized\n    to the range [-1, 1].\n)\";\n\n\nconstexpr const char *kGetFramesUsage = R\"(\nGet n frames starting from the given frame index.\n(hint: intended for debugging, for comparing FBANK features across pipelines)\n\nArgs:\n  frame_index:\n    The starting frame index\n  n:\n    Number of frames to get.\nReturn:\n  Return a 2-D tensor of shape (n, feature_dim).\n  which is flattened into a 1-D vector (flattened in row major).\n  Unflatten in python with:\n    `features = np.reshape(arr, (n, feature_dim))`\n)\";\n\nvoid PybindOnlineStream(py::module *m) {\n  using PyClass = OnlineStream;\n  py::class_<PyClass>(*m, \"OnlineStream\")\n      .def(\n          \"accept_waveform\",\n          [](PyClass &self, float sample_rate,\n             const std::vector<float> &waveform) {\n            self.AcceptWaveform(sample_rate, waveform.data(), waveform.size());\n          },\n          py::arg(\"sample_rate\"), py::arg(\"waveform\"), kAcceptWaveformUsage,\n          py::call_guard<py::gil_scoped_release>())\n      .def(\"input_finished\", &PyClass::InputFinished,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"set_option\", &PyClass::SetOption, py::arg(\"key\"),\n           py::arg(\"value\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"has_option\", &PyClass::HasOption, py::arg(\"key\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"get_option\", 
&PyClass::GetOption, py::arg(\"key\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"get_frames\", &PyClass::GetFrames,\n           py::arg(\"frame_index\"), py::arg(\"n\"), kGetFramesUsage,\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-stream.h",
    "content": "// sherpa-onnx/python/csrc/online-stream.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_STREAM_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_STREAM_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineStream(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_STREAM_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-t-one-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-t-one-ctc-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-t-one-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-t-one-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineToneCtcModelConfig(py::module *m) {\n  using PyClass = OnlineToneCtcModelConfig;\n  py::class_<PyClass>(*m, \"OnlineToneCtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-t-one-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-t-one-ctc-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineToneCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_T_ONE_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-transducer-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-transducer-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/csrc/online-transducer-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/python/csrc/online-transducer-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineTransducerModelConfig(py::module *m) {\n  using PyClass = OnlineTransducerModelConfig;\n  py::class_<PyClass>(*m, \"OnlineTransducerModelConfig\")\n      .def(py::init<const std::string &, const std::string &,\n                    const std::string &>(),\n           py::arg(\"encoder\"), py::arg(\"decoder\"), py::arg(\"joiner\"))\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"joiner\", &PyClass::joiner)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-transducer-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-transducer-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineTransducerModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_TRANSDUCER_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-wenet-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-wenet-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-wenet-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-wenet-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineWenetCtcModelConfig(py::module *m) {\n  using PyClass = OnlineWenetCtcModelConfig;\n  py::class_<PyClass>(*m, \"OnlineWenetCtcModelConfig\")\n      .def(py::init<const std::string &, int32_t, int32_t>(), py::arg(\"model\"),\n           py::arg(\"chunk_size\") = 16, py::arg(\"num_left_chunks\") = 4)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"chunk_size\", &PyClass::chunk_size)\n      .def_readwrite(\"num_left_chunks\", &PyClass::num_left_chunks)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-wenet-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-wenet-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineWenetCtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_WENET_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/online-zipformer2-ctc-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineZipformer2CtcModelConfig(py::module *m) {\n  using PyClass = OnlineZipformer2CtcModelConfig;\n  py::class_<PyClass>(*m, \"OnlineZipformer2CtcModelConfig\")\n      .def(py::init<const std::string &>(), py::arg(\"model\"))\n      .def_readwrite(\"model\", &PyClass::model)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.h",
    "content": "// sherpa-onnx/python/csrc/online-zipformer2-ctc-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindOnlineZipformer2CtcModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_ONLINE_ZIPFORMER2_CTC_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/provider-config.cc",
    "content": "// sherpa-onnx/python/csrc/provider-config.cc\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n\n#include \"sherpa-onnx/python/csrc/provider-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/provider-config.h\"\n#include \"sherpa-onnx/python/csrc/cuda-config.h\"\n#include \"sherpa-onnx/python/csrc/tensorrt-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindProviderConfig(py::module *m) {\n  PybindCudaConfig(m);\n  PybindTensorrtConfig(m);\n\n  using PyClass = ProviderConfig;\n  py::class_<PyClass>(*m, \"ProviderConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, int32_t>(),\n           py::arg(\"provider\") = \"cpu\",\n           py::arg(\"device\") = 0)\n      .def(py::init<const TensorrtConfig &, const CudaConfig &,\n          const std::string &, int32_t>(),\n           py::arg(\"trt_config\") = TensorrtConfig{},\n           py::arg(\"cuda_config\") = CudaConfig{},\n           py::arg(\"provider\") = \"cpu\",\n           py::arg(\"device\") = 0)\n      .def_readwrite(\"trt_config\", &PyClass::trt_config)\n      .def_readwrite(\"cuda_config\", &PyClass::cuda_config)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def_readwrite(\"device\", &PyClass::device)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/provider-config.h",
    "content": "// sherpa-onnx/python/csrc/provider-config.h\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_PROVIDER_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_PROVIDER_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindProviderConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_PROVIDER_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/sentence-piece-tokenizer.cc",
    "content": "// sherpa-onnx/python/csrc/sentence-piece-tokenizer.cc\n//\n// Copyright (c)  2026  Xiaomi Corporation\n#include \"sherpa-onnx/python/csrc/sentence-piece-tokenizer.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/sentence-piece-tokenizer.h\"\n\nnamespace sherpa_onnx {\nvoid PybindSentencePieceTokenizer(py::module *m) {\n  using PyClass = SentencePieceTokenizer;\n  py::class_<PyClass>(*m, \"SentencePieceTokenizer\")\n      .def(py::init<const std::string &, const std::string &>(),\n           py::arg(\"vocab_json\"), py::arg(\"token_scores_json\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"encode\",\n          [](const PyClass &self, const std::string &text,\n             py::object out_type) -> py::object {\n            auto builtins = py::module::import(\"builtins\");\n            py::object int_type = builtins.attr(\"int\");\n            py::object str_type = builtins.attr(\"str\");\n\n            if (out_type.is_none() || out_type.equal(str_type)) {\n              std::vector<std::string> tokens;\n              {\n                py::gil_scoped_release release;\n                tokens = self.EncodeTokens(text);\n              }\n              return py::cast(tokens);\n            } else if (out_type.equal(int_type)) {\n              std::vector<int32_t> ids;\n              {\n                py::gil_scoped_release release;\n                ids = self.EncodeIds(text);\n              }\n              return py::cast(ids);\n            } else {\n              throw std::runtime_error(\n                  \"Invalid out_type. Must be int, str, or None.\");\n            }\n          },\n          py::arg(\"text\"), py::arg(\"out_type\") = py::none(),\n          \"Encode text. out_type can be int, str, or None. Default to str\");\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/sentence-piece-tokenizer.h",
    "content": "// sherpa-onnx/python/csrc/sentence-piece-tokenizer.h\n//\n// Copyright (c)  2026  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSentencePieceTokenizer(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SENTENCE_PIECE_TOKENIZER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/sherpa-onnx.cc",
    "content": "// sherpa-onnx/python/csrc/sherpa-onnx.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\n#include \"sherpa-onnx/python/csrc/alsa.h\"\n#include \"sherpa-onnx/python/csrc/audio-tagging.h\"\n#include \"sherpa-onnx/python/csrc/circular-buffer.h\"\n#include \"sherpa-onnx/python/csrc/display.h\"\n#include \"sherpa-onnx/python/csrc/endpoint.h\"\n#include \"sherpa-onnx/python/csrc/features.h\"\n#include \"sherpa-onnx/python/csrc/homophone-replacer.h\"\n#include \"sherpa-onnx/python/csrc/keyword-spotter.h\"\n#include \"sherpa-onnx/python/csrc/offline-ctc-fst-decoder-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-lm-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-model-config.h\"\n#include \"sherpa-onnx/python/csrc/offline-punctuation.h\"\n#include \"sherpa-onnx/python/csrc/offline-recognizer.h\"\n#include \"sherpa-onnx/python/csrc/offline-source-separation.h\"\n#include \"sherpa-onnx/python/csrc/offline-speech-denoiser.h\"\n#include \"sherpa-onnx/python/csrc/offline-stream.h\"\n#include \"sherpa-onnx/python/csrc/online-ctc-fst-decoder-config.h\"\n#include \"sherpa-onnx/python/csrc/online-lm-config.h\"\n#include \"sherpa-onnx/python/csrc/online-model-config.h\"\n#include \"sherpa-onnx/python/csrc/online-punctuation.h\"\n#include \"sherpa-onnx/python/csrc/online-recognizer.h\"\n#include \"sherpa-onnx/python/csrc/online-speech-denoiser.h\"\n#include \"sherpa-onnx/python/csrc/online-stream.h\"\n#include \"sherpa-onnx/python/csrc/speaker-embedding-extractor.h\"\n#include \"sherpa-onnx/python/csrc/speaker-embedding-manager.h\"\n#include \"sherpa-onnx/python/csrc/spoken-language-identification.h\"\n#include \"sherpa-onnx/python/csrc/vad-model-config.h\"\n#include \"sherpa-onnx/python/csrc/vad-model.h\"\n#include \"sherpa-onnx/python/csrc/version.h\"\n#include \"sherpa-onnx/python/csrc/voice-activity-detector.h\"\n#include \"sherpa-onnx/python/csrc/wave-writer.h\"\n\n#if SHERPA_ONNX_ENABLE_TTS 
== 1\n#include \"sherpa-onnx/python/csrc/offline-tts.h\"\n#include \"sherpa-onnx/python/csrc/sentence-piece-tokenizer.h\"\n#endif\n\n#if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\n#include \"sherpa-onnx/python/csrc/fast-clustering.h\"\n#include \"sherpa-onnx/python/csrc/offline-speaker-diarization-result.h\"\n#include \"sherpa-onnx/python/csrc/offline-speaker-diarization.h\"\n#endif\n\nnamespace sherpa_onnx {\n\nPYBIND11_MODULE(_sherpa_onnx, m) {\n  m.doc() = \"pybind11 binding of sherpa-onnx\";\n\n  PybindWaveWriter(&m);\n  PybindAudioTagging(&m);\n  PybindOfflinePunctuation(&m);\n  PybindOnlinePunctuation(&m);\n  PybindHomophoneReplacer(&m);\n\n  PybindFeatures(&m);\n  PybindOnlineCtcFstDecoderConfig(&m);\n  PybindOnlineModelConfig(&m);\n  PybindOnlineLMConfig(&m);\n  PybindOnlineStream(&m);\n  PybindEndpoint(&m);\n  PybindOnlineRecognizer(&m);\n  PybindKeywordSpotter(&m);\n  PybindDisplay(&m);\n\n  PybindOfflineStream(&m);\n  PybindOfflineLMConfig(&m);\n  PybindOfflineModelConfig(&m);\n  PybindOfflineCtcFstDecoderConfig(&m);\n  PybindOfflineRecognizer(&m);\n\n  PybindVadModelConfig(&m);\n  PybindVadModel(&m);\n  PybindCircularBuffer(&m);\n  PybindVoiceActivityDetector(&m);\n\n#if SHERPA_ONNX_ENABLE_TTS == 1\n  PybindOfflineTts(&m);\n  PybindSentencePieceTokenizer(&m);\n#else\n  /* Define \"empty\" TTS symbols */\n  m.attr(\"OfflineTtsKittenModelConfig\") = py::none();\n  m.attr(\"OfflineTtsPocketModelConfig\") = py::none();\n  m.attr(\"OfflineTtsKokoroModelConfig\") = py::none();\n  m.attr(\"OfflineTtsMatchaModelConfig\") = py::none();\n  m.attr(\"OfflineTtsModelConfig\") = py::none();\n  m.attr(\"OfflineTtsVitsModelConfig\") = py::none();\n  m.attr(\"OfflineTtsZipvoiceModelConfig\") = py::none();\n  m.attr(\"GeneratedAudio\") = py::none();\n  m.attr(\"OfflineTtsConfig\") = py::none();\n  m.attr(\"OfflineTts\") = py::none();\n  m.attr(\"SentencePieceTokenizer\") = py::none();\n#endif\n\n  PybindSpeakerEmbeddingExtractor(&m);\n  
PybindSpeakerEmbeddingManager(&m);\n  PybindSpokenLanguageIdentification(&m);\n\n#if SHERPA_ONNX_ENABLE_SPEAKER_DIARIZATION == 1\n  PybindFastClustering(&m);\n  PybindOfflineSpeakerDiarizationResult(&m);\n  PybindOfflineSpeakerDiarization(&m);\n#else\n  /* Define \"empty\" diarization symbols */\n  m.attr(\"FastClusteringConfig\") = py::none();\n  m.attr(\"FastClustering\") = py::none();\n  m.attr(\"OfflineSpeakerDiarizationSegment\") = py::none();\n  m.attr(\"OfflineSpeakerDiarizationResult\") = py::none();\n  m.attr(\"OfflineSpeakerSegmentationPyannoteModelConfig\") = py::none();\n  m.attr(\"OfflineSpeakerSegmentationModelConfig\") = py::none();\n  m.attr(\"OfflineSpeakerDiarizationConfig\") = py::none();\n  m.attr(\"OfflineSpeakerDiarization\") = py::none();\n#endif\n\n  PybindAlsa(&m);\n  PybindOfflineSpeechDenoiser(&m);\n  PybindOnlineSpeechDenoiser(&m);\n  PybindOfflineSourceSeparation(&m);\n  PybindVersion(&m);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/sherpa-onnx.h",
    "content": "// sherpa-onnx/python/csrc/sherpa-onnx.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SHERPA_ONNX_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SHERPA_ONNX_H_\n\n#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION\n\n#include \"pybind11/functional.h\"\n#include \"pybind11/numpy.h\"\n#include \"pybind11/pybind11.h\"\n#include \"pybind11/stl.h\"\n\nnamespace py = pybind11;\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SHERPA_ONNX_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/silero-vad-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/silero-vad-model-config.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/silero-vad-model-config.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/silero-vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSileroVadModelConfig(py::module *m) {\n  using PyClass = SileroVadModelConfig;\n  py::class_<PyClass>(*m, \"SileroVadModelConfig\")\n      .def(py::init<>())\n      .def(py::init([](const std::string &model, float threshold,\n                       float min_silence_duration, float min_speech_duration,\n                       int32_t window_size,\n                       float max_speech_duration) -> std::unique_ptr<PyClass> {\n             auto ans = std::make_unique<PyClass>();\n\n             ans->model = model;\n             ans->threshold = threshold;\n             ans->min_silence_duration = min_silence_duration;\n             ans->min_speech_duration = min_speech_duration;\n             ans->window_size = window_size;\n             ans->max_speech_duration = max_speech_duration;\n\n             return ans;\n           }),\n           py::arg(\"model\"), py::arg(\"threshold\") = 0.5,\n           py::arg(\"min_silence_duration\") = 0.5,\n           py::arg(\"min_speech_duration\") = 0.25, py::arg(\"window_size\") = 512,\n           py::arg(\"max_speech_duration\") = 20)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"threshold\", &PyClass::threshold)\n      .def_readwrite(\"min_silence_duration\", &PyClass::min_silence_duration)\n      .def_readwrite(\"min_speech_duration\", &PyClass::min_speech_duration)\n      .def_readwrite(\"window_size\", &PyClass::window_size)\n      .def_readwrite(\"max_speech_duration\", &PyClass::max_speech_duration)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/silero-vad-model-config.h",
    "content": "// sherpa-onnx/python/csrc/silero-vad-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSileroVadModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SILERO_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/speaker-embedding-extractor.cc",
    "content": "// sherpa-onnx/python/csrc/speaker-embedding-extractor.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/speaker-embedding-extractor.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/speaker-embedding-extractor.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindSpeakerEmbeddingExtractorConfig(py::module *m) {\n  using PyClass = SpeakerEmbeddingExtractorConfig;\n  py::class_<PyClass>(*m, \"SpeakerEmbeddingExtractorConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, int32_t, bool, const std::string &>(),\n           py::arg(\"model\"), py::arg(\"num_threads\") = 1,\n           py::arg(\"debug\") = false, py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindSpeakerEmbeddingExtractor(py::module *m) {\n  PybindSpeakerEmbeddingExtractorConfig(m);\n\n  using PyClass = SpeakerEmbeddingExtractor;\n  py::class_<PyClass>(*m, \"SpeakerEmbeddingExtractor\")\n      .def(py::init<const SpeakerEmbeddingExtractorConfig &>(),\n           py::arg(\"config\"), py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"dim\", &PyClass::Dim)\n      .def(\"create_stream\", &PyClass::CreateStream,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"compute\", &PyClass::Compute,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"is_ready\", &PyClass::IsReady,\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/speaker-embedding-extractor.h",
    "content": "// sherpa-onnx/python/csrc/speaker-embedding-extractor.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSpeakerEmbeddingExtractor(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_EXTRACTOR_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/speaker-embedding-manager.cc",
    "content": "// sherpa-onnx/python/csrc/speaker-embedding-manager.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/speaker-embedding-manager.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/speaker-embedding-manager.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSpeakerEmbeddingManager(py::module *m) {\n  using PyClass = SpeakerEmbeddingManager;\n  py::class_<PyClass>(*m, \"SpeakerEmbeddingManager\")\n      .def(py::init<int32_t>(), py::arg(\"dim\"),\n           py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"num_speakers\", &PyClass::NumSpeakers)\n      .def_property_readonly(\"dim\", &PyClass::Dim)\n      .def_property_readonly(\"all_speakers\", &PyClass::GetAllSpeakers)\n      .def(\n          \"__contains__\",\n          [](const PyClass &self, const std::string &name) -> bool {\n            return self.Contains(name);\n          },\n          py::arg(\"name\"), py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"add\",\n          [](const PyClass &self, const std::string &name,\n             const std::vector<float> &v) -> bool {\n            return self.Add(name, v.data());\n          },\n          py::arg(\"name\"), py::arg(\"v\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"add\",\n          [](const PyClass &self, const std::string &name,\n             const std::vector<std::vector<float>> &embedding_list) -> bool {\n            return self.Add(name, embedding_list);\n          },\n          py::arg(\"name\"), py::arg(\"embedding_list\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"remove\",\n          [](const PyClass &self, const std::string &name) -> bool {\n            return self.Remove(name);\n          },\n          py::arg(\"name\"), py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"search\",\n          [](const PyClass &self, const std::vector<float> &v, 
float threshold)\n              -> std::string { return self.Search(v.data(), threshold); },\n          py::arg(\"v\"), py::arg(\"threshold\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"verify\",\n          [](const PyClass &self, const std::string &name,\n             const std::vector<float> &v, float threshold) -> bool {\n            return self.Verify(name, v.data(), threshold);\n          },\n          py::arg(\"name\"), py::arg(\"v\"), py::arg(\"threshold\"),\n          py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"score\",\n          [](const PyClass &self, const std::string &name,\n             const std::vector<float> &v) -> float {\n            return self.Score(name, v.data());\n          },\n          py::arg(\"name\"), py::arg(\"v\"),\n          py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/speaker-embedding-manager.h",
    "content": "// sherpa-onnx/python/csrc/speaker-embedding-manager.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSpeakerEmbeddingManager(py::module *m);\n\n}  // namespace sherpa_onnx\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SPEAKER_EMBEDDING_MANAGER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/spoken-language-identification.cc",
    "content": "// sherpa-onnx/python/csrc/spoken-language-identification.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/spoken-language-identification.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/spoken-language-identification.h\"\n\nnamespace sherpa_onnx {\n\nstatic void PybindSpokenLanguageIdentificationWhisperConfig(py::module *m) {\n  using PyClass = SpokenLanguageIdentificationWhisperConfig;\n\n  py::class_<PyClass>(*m, \"SpokenLanguageIdentificationWhisperConfig\")\n      .def(py::init<>())\n      .def(py::init<const std::string &, const std::string &, int32_t>(),\n           py::arg(\"encoder\"), py::arg(\"decoder\"),\n           py::arg(\"tail_paddings\") = -1)\n      .def_readwrite(\"encoder\", &PyClass::encoder)\n      .def_readwrite(\"decoder\", &PyClass::decoder)\n      .def_readwrite(\"tail_paddings\", &PyClass::tail_paddings)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nstatic void PybindSpokenLanguageIdentificationConfig(py::module *m) {\n  PybindSpokenLanguageIdentificationWhisperConfig(m);\n\n  using PyClass = SpokenLanguageIdentificationConfig;\n\n  py::class_<PyClass>(*m, \"SpokenLanguageIdentificationConfig\")\n      .def(py::init<>())\n      .def(py::init<const SpokenLanguageIdentificationWhisperConfig &, int32_t,\n                    bool, const std::string &>(),\n           py::arg(\"whisper\"), py::arg(\"num_threads\") = 1,\n           py::arg(\"debug\") = false, py::arg(\"provider\") = \"cpu\")\n      .def_readwrite(\"whisper\", &PyClass::whisper)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def(\"validate\", &PyClass::Validate)\n      .def(\"__str__\", &PyClass::ToString);\n}\n\nvoid PybindSpokenLanguageIdentification(py::module *m) {\n  PybindSpokenLanguageIdentificationConfig(m);\n\n  using PyClass = 
SpokenLanguageIdentification;\n  py::class_<PyClass>(*m, \"SpokenLanguageIdentification\")\n      .def(py::init<const SpokenLanguageIdentificationConfig &>(),\n           py::arg(\"config\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"create_stream\", &PyClass::CreateStream,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"compute\", &PyClass::Compute, py::arg(\"s\"),\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/spoken-language-identification.h",
    "content": "// sherpa-onnx/python/csrc/spoken-language-identification.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSpokenLanguageIdentification(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_SPOKEN_LANGUAGE_IDENTIFICATION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/ten-vad-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/ten-vad-model-config.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/ten-vad-model-config.h\"\n\n#include <memory>\n#include <string>\n\n#include \"sherpa-onnx/csrc/ten-vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindTenVadModelConfig(py::module *m) {\n  using PyClass = TenVadModelConfig;\n  py::class_<PyClass>(*m, \"TenVadModelConfig\")\n      .def(py::init<>())\n      .def(py::init([](const std::string &model, float threshold,\n                       float min_silence_duration, float min_speech_duration,\n                       int32_t window_size,\n                       float max_speech_duration) -> std::unique_ptr<PyClass> {\n             auto ans = std::make_unique<PyClass>();\n\n             ans->model = model;\n             ans->threshold = threshold;\n             ans->min_silence_duration = min_silence_duration;\n             ans->min_speech_duration = min_speech_duration;\n             ans->window_size = window_size;\n             ans->max_speech_duration = max_speech_duration;\n\n             return ans;\n           }),\n           py::arg(\"model\"), py::arg(\"threshold\") = 0.5,\n           py::arg(\"min_silence_duration\") = 0.5,\n           py::arg(\"min_speech_duration\") = 0.25, py::arg(\"window_size\") = 256,\n           py::arg(\"max_speech_duration\") = 20)\n      .def_readwrite(\"model\", &PyClass::model)\n      .def_readwrite(\"threshold\", &PyClass::threshold)\n      .def_readwrite(\"min_silence_duration\", &PyClass::min_silence_duration)\n      .def_readwrite(\"min_speech_duration\", &PyClass::min_speech_duration)\n      .def_readwrite(\"window_size\", &PyClass::window_size)\n      .def_readwrite(\"max_speech_duration\", &PyClass::max_speech_duration)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/ten-vad-model-config.h",
    "content": "// sherpa-onnx/python/csrc/ten-vad-model-config.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_TEN_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_TEN_VAD_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindTenVadModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_TEN_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/tensorrt-config.cc",
    "content": "// sherpa-onnx/python/csrc/tensorrt-config.cc\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n#include \"sherpa-onnx/python/csrc/tensorrt-config.h\"\n\n#include <string>\n#include <memory>\n#include \"sherpa-onnx/csrc/provider-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindTensorrtConfig(py::module *m) {\n  using PyClass = TensorrtConfig;\n  py::class_<PyClass>(*m, \"TensorrtConfig\")\n        .def(py::init<>())\n        .def(py::init([](int64_t trt_max_workspace_size,\n                      int32_t trt_max_partition_iterations,\n                      int32_t trt_min_subgraph_size,\n                      bool trt_fp16_enable,\n                      bool trt_detailed_build_log,\n                      bool trt_engine_cache_enable,\n                      bool trt_timing_cache_enable,\n                      const std::string &trt_engine_cache_path,\n                      const std::string &trt_timing_cache_path,\n                      bool trt_dump_subgraphs) -> std::unique_ptr<PyClass> {\n            auto ans = std::make_unique<PyClass>();\n\n            ans->trt_max_workspace_size = trt_max_workspace_size;\n            ans->trt_max_partition_iterations = trt_max_partition_iterations;\n            ans->trt_min_subgraph_size = trt_min_subgraph_size;\n            ans->trt_fp16_enable = trt_fp16_enable;\n            ans->trt_detailed_build_log = trt_detailed_build_log;\n            ans->trt_engine_cache_enable = trt_engine_cache_enable;\n            ans->trt_timing_cache_enable = trt_timing_cache_enable;\n            ans->trt_engine_cache_path = trt_engine_cache_path;\n            ans->trt_timing_cache_path = trt_timing_cache_path;\n            ans->trt_dump_subgraphs = trt_dump_subgraphs;\n\n            return ans;\n          }),\n           py::arg(\"trt_max_workspace_size\") = 2147483647,\n           py::arg(\"trt_max_partition_iterations\") = 10,\n           py::arg(\"trt_min_subgraph_size\") = 5,\n           
py::arg(\"trt_fp16_enable\") = true,\n           py::arg(\"trt_detailed_build_log\") = false,\n           py::arg(\"trt_engine_cache_enable\") = true,\n           py::arg(\"trt_timing_cache_enable\") = true,\n           py::arg(\"trt_engine_cache_path\") = \".\",\n           py::arg(\"trt_timing_cache_path\") = \".\",\n           py::arg(\"trt_dump_subgraphs\") = false)\n\n      .def_readwrite(\"trt_max_workspace_size\",\n          &PyClass::trt_max_workspace_size)\n      .def_readwrite(\"trt_max_partition_iterations\",\n          &PyClass::trt_max_partition_iterations)\n      .def_readwrite(\"trt_min_subgraph_size\", &PyClass::trt_min_subgraph_size)\n      .def_readwrite(\"trt_fp16_enable\", &PyClass::trt_fp16_enable)\n      .def_readwrite(\"trt_detailed_build_log\",\n          &PyClass::trt_detailed_build_log)\n      .def_readwrite(\"trt_engine_cache_enable\",\n          &PyClass::trt_engine_cache_enable)\n      .def_readwrite(\"trt_timing_cache_enable\",\n          &PyClass::trt_timing_cache_enable)\n      .def_readwrite(\"trt_engine_cache_path\", &PyClass::trt_engine_cache_path)\n      .def_readwrite(\"trt_timing_cache_path\", &PyClass::trt_timing_cache_path)\n      .def_readwrite(\"trt_dump_subgraphs\", &PyClass::trt_dump_subgraphs)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/tensorrt-config.h",
    "content": "// sherpa-onnx/python/csrc/tensorrt-config.h\n//\n// Copyright (c)  2024  Uniphore (Author: Manickavela A)\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_TENSORRT_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_TENSORRT_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindTensorrtConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_TENSORRT_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/vad-model-config.cc",
    "content": "// sherpa-onnx/python/csrc/vad-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/vad-model-config.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/vad-model-config.h\"\n#include \"sherpa-onnx/python/csrc/silero-vad-model-config.h\"\n#include \"sherpa-onnx/python/csrc/ten-vad-model-config.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVadModelConfig(py::module *m) {\n  PybindSileroVadModelConfig(m);\n  PybindTenVadModelConfig(m);\n\n  using PyClass = VadModelConfig;\n  py::class_<PyClass>(*m, \"VadModelConfig\")\n      .def(py::init<>())\n      .def(py::init<const SileroVadModelConfig &, const TenVadModelConfig &,\n                    int32_t, int32_t, const std::string &, bool>(),\n           py::arg(\"silero_vad\") = SileroVadModelConfig{},\n           py::arg(\"ten_vad\") = TenVadModelConfig{},\n           py::arg(\"sample_rate\") = 16000, py::arg(\"num_threads\") = 1,\n           py::arg(\"provider\") = \"cpu\", py::arg(\"debug\") = false)\n      .def_readwrite(\"silero_vad\", &PyClass::silero_vad)\n      .def_readwrite(\"ten_vad\", &PyClass::ten_vad)\n      .def_readwrite(\"sample_rate\", &PyClass::sample_rate)\n      .def_readwrite(\"num_threads\", &PyClass::num_threads)\n      .def_readwrite(\"provider\", &PyClass::provider)\n      .def_readwrite(\"debug\", &PyClass::debug)\n      .def(\"__str__\", &PyClass::ToString)\n      .def(\"validate\", &PyClass::Validate);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/vad-model-config.h",
    "content": "// sherpa-onnx/python/csrc/vad-model-config.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_CONFIG_H_\n#define SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_CONFIG_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVadModelConfig(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_CONFIG_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/vad-model.cc",
    "content": "// sherpa-onnx/python/csrc/vad-model.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/vad-model.h\"\n\n#include <memory>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/vad-model.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVadModel(py::module *m) {\n  using PyClass = VadModel;\n  py::class_<PyClass>(*m, \"VadModel\")\n      .def_static(\"create\",\n                  (std::unique_ptr<VadModel>(*)(const VadModelConfig &))(\n                      &PyClass::Create),\n                  py::arg(\"config\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"reset\", &PyClass::Reset, py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"is_speech\",\n          [](PyClass &self, const std::vector<float> &samples) -> bool {\n            return self.IsSpeech(samples.data(), samples.size());\n          },\n          py::arg(\"samples\"), py::call_guard<py::gil_scoped_release>())\n      .def(\"window_size\", &PyClass::WindowSize,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"min_silence_duration_samples\", &PyClass::MinSilenceDurationSamples,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"min_speech_duration_samples\", &PyClass::MinSpeechDurationSamples,\n           py::call_guard<py::gil_scoped_release>());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/vad-model.h",
    "content": "// sherpa-onnx/python/csrc/vad-model.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_H_\n#define SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVadModel(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_VAD_MODEL_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/version.cc",
    "content": "// sherpa-onnx/python/csrc/version.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/version.h\"\n\n#include <string>\n\n#include \"sherpa-onnx/csrc/version.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVersion(py::module *m) {\n  m->attr(\"version\") = std::string(GetVersionStr());\n\n  m->attr(\"git_sha1\") = std::string(GetGitSha1());\n\n  m->attr(\"git_date\") = std::string(GetGitDate());\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/version.h",
    "content": "// sherpa-onnx/python/csrc/version.h\n//\n// Copyright (c)  2025  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_VERSION_H_\n#define SHERPA_ONNX_PYTHON_CSRC_VERSION_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVersion(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_VERSION_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/voice-activity-detector.cc",
    "content": "// sherpa-onnx/python/csrc/voice-activity-detector.cc\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/voice-activity-detector.h\"\n\n#include <vector>\n\n#include \"sherpa-onnx/csrc/voice-activity-detector.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindSpeechSegment(py::module *m) {\n  using PyClass = SpeechSegment;\n  py::class_<PyClass>(*m, \"SpeechSegment\")\n      .def_property_readonly(\"start\",\n                             [](const PyClass &self) { return self.start; })\n      .def_property_readonly(\"samples\",\n                             [](const PyClass &self) { return self.samples; });\n}\n\nvoid PybindVoiceActivityDetector(py::module *m) {\n  PybindSpeechSegment(m);\n  using PyClass = VoiceActivityDetector;\n  py::class_<PyClass>(*m, \"VoiceActivityDetector\",\n                      R\"(\n1. It is an error to call the front property when the method empty() returns True\n2. The property front returns a reference, which is valid until the next call of any\n   methods of this class\n3. When speech is detected, the method is_speech_detected() return True, you can\n   use the property current_segment to get the speech samples since\n   is_speech_detected() returns true\n4. 
When is_speech_detected() is changed from True to False, the method\n   empty() returns False.\n      )\")\n      .def(py::init<const VadModelConfig &, float>(), py::arg(\"config\"),\n           py::arg(\"buffer_size_in_seconds\") = 60,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\n          \"accept_waveform\",\n          [](PyClass &self, const std::vector<float> &samples) {\n            self.AcceptWaveform(samples.data(), samples.size());\n          },\n          py::arg(\"samples\"), py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"config\", &PyClass::GetConfig)\n      .def(\"empty\", &PyClass::Empty, py::call_guard<py::gil_scoped_release>())\n      .def(\"pop\", &PyClass::Pop, py::call_guard<py::gil_scoped_release>())\n      .def(\"is_speech_detected\", &PyClass::IsSpeechDetected,\n           py::call_guard<py::gil_scoped_release>())\n      .def(\"reset\", &PyClass::Reset, py::call_guard<py::gil_scoped_release>())\n      .def(\"flush\", &PyClass::Flush, py::call_guard<py::gil_scoped_release>())\n      .def_property_readonly(\"front\", &PyClass::Front)\n      .def_property_readonly(\"current_segment\", &PyClass::CurrentSpeechSegment);\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/voice-activity-detector.h",
    "content": "// sherpa-onnx/python/csrc/voice-activity-detector.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n#define SHERPA_ONNX_PYTHON_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindVoiceActivityDetector(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_VOICE_ACTIVITY_DETECTOR_H_\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/wave-writer.cc",
    "content": "// sherpa-onnx/python/csrc/wave-writer.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#include \"sherpa-onnx/python/csrc/wave-writer.h\"\n\n#include <string>\n#include <vector>\n\n#include \"sherpa-onnx/csrc/wave-writer.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindWaveWriter(py::module *m) {\n  m->def(\n      \"write_wave\",\n      [](const std::string &filename, const std::vector<float> &samples,\n         int32_t sample_rate) -> bool {\n        bool ok =\n            WriteWave(filename, sample_rate, samples.data(), samples.size());\n\n        return ok;\n      },\n      py::arg(\"filename\"), py::arg(\"samples\"), py::arg(\"sample_rate\"));\n}\n\n}  // namespace sherpa_onnx\n"
  },
  {
    "path": "sherpa-onnx/python/csrc/wave-writer.h",
    "content": "// sherpa-onnx/python/csrc/wave-writer.h\n//\n// Copyright (c)  2024  Xiaomi Corporation\n\n#ifndef SHERPA_ONNX_PYTHON_CSRC_WAVE_WRITER_H_\n#define SHERPA_ONNX_PYTHON_CSRC_WAVE_WRITER_H_\n\n#include \"sherpa-onnx/python/csrc/sherpa-onnx.h\"\n\nnamespace sherpa_onnx {\n\nvoid PybindWaveWriter(py::module *m);\n\n}\n\n#endif  // SHERPA_ONNX_PYTHON_CSRC_WAVE_WRITER_H_\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/__init__.py",
    "content": "from sherpa_onnx.lib._sherpa_onnx import (\n    Alsa,\n    AudioEvent,\n    AudioTagging,\n    AudioTaggingConfig,\n    AudioTaggingModelConfig,\n    CircularBuffer,\n    DenoisedAudio,\n    FastClustering,\n    FastClusteringConfig,\n    FeatureExtractorConfig,\n    GenerationConfig,\n    HomophoneReplacerConfig,\n    OfflineCanaryModelConfig,\n    OfflineCtcFstDecoderConfig,\n    OfflineDolphinModelConfig,\n    OfflineFireRedAsrModelConfig,\n    OfflineFunASRNanoModelConfig,\n    OfflineLMConfig,\n    OfflineModelConfig,\n    OfflineMoonshineModelConfig,\n    OfflineNemoEncDecCtcModelConfig,\n    OfflineParaformerModelConfig,\n    OfflinePunctuation,\n    OfflinePunctuationConfig,\n    OfflinePunctuationModelConfig,\n    OfflineRecognizerConfig,\n    OfflineSenseVoiceModelConfig,\n    OfflineSourceSeparation,\n    OfflineSourceSeparationConfig,\n    OfflineSourceSeparationModelConfig,\n    OfflineSourceSeparationSpleeterModelConfig,\n    OfflineSourceSeparationUvrModelConfig,\n    OfflineSpeakerDiarization,\n    OfflineSpeakerDiarizationConfig,\n    OfflineSpeakerDiarizationResult,\n    OfflineSpeakerDiarizationSegment,\n    OfflineSpeakerSegmentationModelConfig,\n    OfflineSpeakerSegmentationPyannoteModelConfig,\n    OfflineSpeechDenoiser,\n    OfflineSpeechDenoiserConfig,\n    OfflineSpeechDenoiserDpdfNetModelConfig,\n    OfflineSpeechDenoiserGtcrnModelConfig,\n    OfflineSpeechDenoiserModelConfig,\n    OfflineStream,\n    OfflineTdnnModelConfig,\n    OfflineTransducerModelConfig,\n    OfflineTts,\n    OfflineTtsConfig,\n    OfflineTtsKittenModelConfig,\n    OfflineTtsKokoroModelConfig,\n    OfflineTtsMatchaModelConfig,\n    OfflineTtsModelConfig,\n    OfflineTtsPocketModelConfig,\n    OfflineTtsSupertonicModelConfig,\n    OfflineTtsVitsModelConfig,\n    OfflineTtsZipvoiceModelConfig,\n    OfflineWenetCtcModelConfig,\n    OfflineWhisperModelConfig,\n    OfflineZipformerAudioTaggingModelConfig,\n    OfflineZipformerCtcModelConfig,\n    
OnlinePunctuation,\n    OnlinePunctuationConfig,\n    OnlinePunctuationModelConfig,\n    OnlineSpeechDenoiser,\n    OnlineSpeechDenoiserConfig,\n    OnlineStream,\n    SentencePieceTokenizer,\n    SileroVadModelConfig,\n    SpeakerEmbeddingExtractor,\n    SpeakerEmbeddingExtractorConfig,\n    SpeakerEmbeddingManager,\n    SpeechSegment,\n    SpokenLanguageIdentification,\n    SpokenLanguageIdentificationConfig,\n    SpokenLanguageIdentificationWhisperConfig,\n    TenVadModelConfig,\n    VadModel,\n    VadModelConfig,\n    VoiceActivityDetector,\n    git_date,\n    git_sha1,\n    version,\n    write_wave,\n)\n\nfrom .display import Display\nfrom .keyword_spotter import KeywordSpotter\nfrom .offline_recognizer import OfflineRecognizer\nfrom .online_recognizer import OnlineRecognizer\nfrom .utils import text2token\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/cli.py",
    "content": "# Copyright (c)  2023  Xiaomi Corporation\n\nimport logging\n\ntry:\n    import click\nexcept ImportError:\n    print(\"Please run\")\n    print(\"  pip install click\")\n    print(\"before you continue\")\n    raise\n\nfrom pathlib import Path\nfrom sherpa_onnx import text2token\n\n\n@click.group()\ndef cli():\n    \"\"\"\n    The shell entry point to sherpa-onnx.\n    \"\"\"\n    logging.basicConfig(\n        format=\"%(asctime)s %(levelname)s [%(filename)s:%(lineno)d] %(message)s\",\n        level=logging.INFO,\n    )\n\n\n@cli.command(name=\"text2token\")\n@click.argument(\"input\", type=click.Path(exists=True, dir_okay=False))\n@click.argument(\"output\", type=click.Path())\n@click.option(\n    \"--tokens\",\n    type=str,\n    required=True,\n    help=\"The path to tokens.txt.\",\n)\n@click.option(\n    \"--tokens-type\",\n    type=click.Choice(\n        [\n            \"cjkchar\",\n            \"bpe\",\n            \"cjkchar+bpe\",\n            \"fpinyin\",\n            \"ppinyin\",\n            \"phone+ppinyin\",\n        ],\n        case_sensitive=True,\n    ),\n    required=True,\n    help=\"\"\"The type of modeling units, should be cjkchar, bpe, cjkchar+bpe, fpinyin, ppinyin or phone+ppinyin.\n    fpinyin means full pinyin, each cjkchar has a pinyin(with tone).\n    ppinyin means partial pinyin, it splits pinyin into initial and final,\n    phone means English phonemes in CMU dictionary format.\n    \"\"\",\n)\n@click.option(\n    \"--bpe-model\",\n    type=str,\n    help=\"The path to bpe.model. Only required when tokens-type is bpe or cjkchar+bpe.\",\n)\n@click.option(\n    \"--lexicon\",\n    type=str,\n    help=\"The path to lexicon.txt. 
Only required when tokens-type is phone+ppinyin.\",\n)\ndef encode_text(\n    input: Path,\n    output: Path,\n    tokens: Path,\n    tokens_type: str,\n    bpe_model: Path,\n    lexicon: Path,\n):\n    \"\"\"\n    Encode the texts given by the INPUT to tokens and write the results to the OUTPUT.\n    Each line in the texts contains the original phrase, it might also contain some\n    extra items, for example, the boosting score (starting with :), the triggering\n    threshold (starting with #, only used in keyword spotting task) and the original\n    phrase (starting with @). Note: the extra items will be kept same in the output.\n\n    example input 1 (tokens_type = ppinyin):\n\n    小爱同学 :2.0 #0.6 @小爱同学\n    你好问问 :3.5 @你好问问\n    小艺小艺 #0.6 @小艺小艺\n\n    example output 1:\n\n    x iǎo ài t óng x ué :2.0 #0.6 @小爱同学\n    n ǐ h ǎo w èn w èn :3.5 @你好问问\n    x iǎo y ì x iǎo y ì #0.6 @小艺小艺\n\n    example input 2 (tokens_type = bpe):\n\n    HELLO WORLD :1.5 #0.4\n    HI GOOGLE :2.0 #0.8\n    HEY SIRI #0.35\n\n    example output 2:\n\n    ▁HE LL O ▁WORLD :1.5 #0.4\n    ▁HI ▁GO O G LE :2.0 #0.8\n    ▁HE Y ▁S I RI #0.35\n    \"\"\"\n    texts = []\n    # extra information like boosting score (start with :), triggering threshold (start with #)\n    # original keyword (start with @)\n    extra_info = []\n    with open(input, \"r\", encoding=\"utf8\") as f:\n        for line in f:\n            extra = []\n            text = []\n            toks = line.strip().split()\n            for tok in toks:\n                if tok[0] == \":\" or tok[0] == \"#\" or tok[0] == \"@\":\n                    extra.append(tok)\n                else:\n                    text.append(tok)\n            texts.append(\" \".join(text))\n            extra_info.append(extra)\n\n    encoded_texts = text2token(\n        texts,\n        tokens=tokens,\n        tokens_type=tokens_type,\n        bpe_model=bpe_model,\n        lexicon=lexicon,\n    )\n    with open(output, \"w\", encoding=\"utf8\") as f:\n       
 for i, txt in enumerate(encoded_texts):\n            txt += extra_info[i]\n            f.write(\" \".join(txt) + \"\\n\")\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/display.py",
    "content": "# Copyright (c)  2025  Xiaomi Corporation\nimport os\nfrom time import localtime, strftime\n\n\ndef get_current_time():\n    return strftime(\"%Y-%m-%d %H:%M:%S\", localtime())\n\n\ndef clear_console():\n    os.system(\"cls\" if os.name == \"nt\" else \"clear\")\n\n\nclass Display:\n    def __init__(self):\n        self.sentences = []\n        self.currentText = \"\"\n\n    def update_text(self, text):\n        self.currentText = text\n\n    def finalize_current_sentence(self):\n        if self.currentText.strip():\n            self.sentences.append((get_current_time(), self.currentText))\n\n        self.currentText = \"\"\n\n    def display(self):\n        clear_console()\n        print(\"=== Speech Recognition with Next-gen Kaldi ===\")\n        print(\"Time:\", get_current_time())\n        print(\"-\" * 30)\n\n        # display history sentences\n        if self.sentences:\n            for i, (when, text) in enumerate(self.sentences):\n                print(f\"[{when}] {i + 1}. {text}\")\n            print(\"-\" * 30)\n\n        if self.currentText.strip():\n            print(\"Recognizing:\", self.currentText)\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/keyword_spotter.py",
    "content": "# Copyright (c)  2023  Xiaomi Corporation\n\nfrom pathlib import Path\nfrom typing import List, Optional\n\nfrom sherpa_onnx.lib._sherpa_onnx import (\n    FeatureExtractorConfig,\n    KeywordSpotterConfig,\n    OnlineModelConfig,\n    OnlineTransducerModelConfig,\n    OnlineStream,\n    ProviderConfig,\n)\n\nfrom sherpa_onnx.lib._sherpa_onnx import KeywordSpotter as _KeywordSpotter\n\n\ndef _assert_file_exists(f: str):\n    assert Path(f).is_file(), f\"{f} does not exist\"\n\n\nclass KeywordSpotter(object):\n    \"\"\"A class for keyword spotting.\n\n    Please refer to the following files for usages\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/keyword-spotter.py\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/keyword-spotter-from-microphone.py\n    \"\"\"\n\n    def __init__(\n        self,\n        tokens: str,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n        keywords_file: str,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        max_active_paths: int = 4,\n        keywords_score: float = 1.0,\n        keywords_threshold: float = 0.25,\n        num_trailing_blanks: int = 1,\n        provider: str = \"cpu\",\n        device: int = 0,\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          encoder:\n            Path to ``encoder.onnx``.\n          decoder:\n            Path to ``decoder.onnx``.\n          joiner:\n            Path to ``joiner.onnx``.\n          keywords_file:\n            The file containing keywords, one word/phrase per line, and for each\n            phrase the bpe/cjkchar/pinyin are separated by a space.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          max_active_paths:\n            Use only when decoding_method is modified_beam_search. It specifies\n            the maximum number of active paths during beam search.\n          keywords_score:\n            The boosting score of each token for keywords. The larger the easier to\n            survive beam search.\n          keywords_threshold:\n            The trigger threshold (i.e. probability) of the keyword. The larger the\n            harder to trigger.\n          num_trailing_blanks:\n            The number of trailing blanks a keyword should be followed. Setting\n            to a larger value (e.g. 8) when your keywords has overlapping tokens\n            between each other.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        _assert_file_exists(tokens)\n        _assert_file_exists(encoder)\n        _assert_file_exists(decoder)\n        _assert_file_exists(joiner)\n\n        assert num_threads > 0, num_threads\n\n        transducer_config = OnlineTransducerModelConfig(\n            encoder=encoder,\n            decoder=decoder,\n            joiner=joiner,\n        )\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            transducer=transducer_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        keywords_spotter_config = KeywordSpotterConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            max_active_paths=max_active_paths,\n            num_trailing_blanks=num_trailing_blanks,\n            keywords_score=keywords_score,\n            keywords_threshold=keywords_threshold,\n            keywords_file=keywords_file,\n        )\n        self.keyword_spotter = _KeywordSpotter(keywords_spotter_config)\n\n    def reset_stream(self, s: OnlineStream):\n        self.keyword_spotter.reset(s)\n\n    def create_stream(self, keywords: Optional[str] = None):\n        if keywords is None:\n            return self.keyword_spotter.create_stream()\n        else:\n            return self.keyword_spotter.create_stream(keywords)\n\n    def decode_stream(self, s: OnlineStream):\n        self.keyword_spotter.decode_stream(s)\n\n    def decode_streams(self, ss: List[OnlineStream]):\n        self.keyword_spotter.decode_streams(ss)\n\n    def is_ready(self, s: OnlineStream) -> bool:\n        return 
self.keyword_spotter.is_ready(s)\n\n    def get_result(self, s: OnlineStream) -> str:\n        return self.keyword_spotter.get_result(s).keyword.strip()\n\n    def tokens(self, s: OnlineStream) -> List[str]:\n        return self.keyword_spotter.get_result(s).tokens\n\n    def timestamps(self, s: OnlineStream) -> List[float]:\n        return self.keyword_spotter.get_result(s).timestamps\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/offline_recognizer.py",
    "content": "# Copyright (c)  2023 by manyeyes\n# Copyright (c)  2023  Xiaomi Corporation\nfrom pathlib import Path\nfrom typing import List, Optional\n\nfrom sherpa_onnx.lib._sherpa_onnx import (\n    FeatureExtractorConfig,\n    HomophoneReplacerConfig,\n    OfflineCanaryModelConfig,\n    OfflineFunASRNanoModelConfig,\n    OfflineOmnilingualAsrCtcModelConfig,\n    OfflineMedAsrCtcModelConfig,\n    OfflineFireRedAsrCtcModelConfig,\n    OfflineCtcFstDecoderConfig,\n    OfflineDolphinModelConfig,\n    OfflineFireRedAsrModelConfig,\n    OfflineLMConfig,\n    OfflineModelConfig,\n    OfflineMoonshineModelConfig,\n    OfflineNemoEncDecCtcModelConfig,\n    OfflineParaformerModelConfig,\n)\nfrom sherpa_onnx.lib._sherpa_onnx import OfflineRecognizer as _Recognizer\nfrom sherpa_onnx.lib._sherpa_onnx import (\n    OfflineRecognizerConfig,\n    OfflineSenseVoiceModelConfig,\n    OfflineStream,\n    OfflineTdnnModelConfig,\n    OfflineTransducerModelConfig,\n    OfflineWenetCtcModelConfig,\n    OfflineWhisperModelConfig,\n    OfflineZipformerCtcModelConfig,\n)\n\n\ndef _assert_file_exists(f: str):\n    assert Path(f).is_file(), f\"{f} does not exist\"\n\n\nclass OfflineRecognizer(object):\n    \"\"\"A class for offline speech recognition.\n\n    Please refer to the following files for usages\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/python/tests/test_offline_recognizer.py\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/offline-decode-files.py\n    \"\"\"\n\n    @classmethod\n    def from_transducer(\n        cls,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        dither: float = 0.0,\n        decoding_method: str = \"greedy_search\",\n        max_active_paths: int = 4,\n        hotwords_file: str = \"\",\n        hotwords_score: float = 1.5,\n        blank_penalty: float = 
0.0,\n        modeling_unit: str = \"cjkchar\",\n        bpe_vocab: str = \"\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        model_type: str = \"transducer\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        lm: str = \"\",\n        lm_scale: float = 0.1,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n        lodr_fst: str = \"\",\n        lodr_scale: float = 0.0,\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          encoder:\n            Path to ``encoder.onnx``.\n          decoder:\n            Path to ``decoder.onnx``.\n          joiner:\n            Path to ``joiner.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          dither:\n            Dithering constant (0.0 means no dither).\n            By default the audio samples are in range [-1,+1],\n            so dithering constant 0.00003 is a good value,\n            equivalent to the default 1.0 from kaldi\n          decoding_method:\n            Valid values: greedy_search, modified_beam_search.\n          max_active_paths:\n            Maximum number of active paths to keep. 
Used only when\n            decoding_method is modified_beam_search.\n          hotwords_file:\n            The file containing hotwords, one word/phrase per line. Within each\n            phrase, the bpe/cjkchar tokens are separated by a space.\n          hotwords_score:\n            The hotword score of each token for biasing a word/phrase. Used only if\n            hotwords_file is given with modified_beam_search as the decoding method.\n          blank_penalty:\n            The penalty applied to the blank symbol during decoding.\n          modeling_unit:\n            The modeling unit of the model. Commonly used units are bpe, cjkchar,\n            cjkchar+bpe, etc. Currently, it is needed only when hotwords are\n            provided, where it is used to encode the hotwords into token sequences.\n          bpe_vocab:\n            The vocabulary generated by Google's sentencepiece program.\n            It is a file with two columns: the token and its log probability.\n            You can get it from the directory where your bpe model is generated.\n            Used only when hotwords are provided and the modeling unit is bpe or\n            cjkchar+bpe.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          lodr_fst:\n            Path to the LODR FST file in binary format. If empty, LODR is disabled.\n          lodr_scale:\n            Scale factor for LODR rescoring. 
Only used when lodr_fst is provided.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            transducer=OfflineTransducerModelConfig(\n                encoder_filename=encoder,\n                decoder_filename=decoder,\n                joiner_filename=joiner,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            modeling_unit=modeling_unit,\n            bpe_vocab=bpe_vocab,\n            model_type=model_type,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n            dither=dither,\n        )\n\n        if len(hotwords_file) > 0 and decoding_method != \"modified_beam_search\":\n            raise ValueError(\n                \"Please use --decoding-method=modified_beam_search when using \"\n                f\"--hotwords-file. Currently given: {decoding_method}\"\n            )\n\n        if lm and decoding_method != \"modified_beam_search\":\n            raise ValueError(\n                \"Please use --decoding-method=modified_beam_search when using \"\n                f\"--lm. 
Currently given: {decoding_method}\"\n            )\n\n        lm_config = OfflineLMConfig(\n            model=lm,\n            scale=lm_scale,\n            lm_num_threads=num_threads,\n            lm_provider=provider,\n            lodr_fst=lodr_fst,\n            lodr_scale=lodr_scale,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            lm_config=lm_config,\n            decoding_method=decoding_method,\n            max_active_paths=max_active_paths,\n            hotwords_file=hotwords_file,\n            hotwords_score=hotwords_score,\n            blank_penalty=blank_penalty,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_sense_voice(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        language: str = \"\",\n        use_itn: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models>`_\n        to download pre-trained models.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          model:\n            Path to ``model.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          language:\n            If not empty, then valid values are: auto, zh, en, ja, ko, yue\n          use_itn:\n            True to enable inverse text normalization; False to disable it.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            sense_voice=OfflineSenseVoiceModelConfig(\n                model=model,\n                language=language,\n                use_itn=use_itn,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n  
          hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_funasr_nano(\n        cls,\n        encoder_adaptor: str,\n        llm: str,\n        embedding: str,\n        tokenizer: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        system_prompt: str = \"You are a helpful assistant.\",\n        user_prompt: str = \"语音转写:\",\n        max_new_tokens: int = 512,\n        temperature: float = 1e-6,\n        top_p: float = 0.8,\n        seed: int = 42,\n        language: str = \"\",\n        itn: bool = True,\n        hotwords: str = \"\",\n    ):\n        \"\"\"\n        Create an offline recognizer for FunASR-nano models.\n\n        Args:\n          encoder_adaptor:\n            Path to ``encoder_adaptor.onnx``.\n          llm:\n            Path to ``llm.onnx`` (KV cache model).\n          embedding:\n            Path to ``embedding.onnx``.\n          tokenizer:\n            Path to tokenizer directory (e.g., Qwen3-0.6B).\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda.\n          system_prompt:\n            System prompt for FunASR-nano.\n          user_prompt:\n            User prompt template for FunASR-nano.\n          max_new_tokens:\n            Maximum number of new tokens to generate.\n          temperature:\n            Sampling temperature.\n          top_p:\n            Top-p (nucleus) sampling threshold.\n          seed:\n            Random seed.\n          language:\n            Language for transcription (empty string means None).\n          itn:\n            Whether to apply inverse text normalization (default: True).\n          hotwords:\n            Hotwords (comma-separated, e.g., \"Sherpa,FunASR\").\n        \"\"\"\n        self = cls.__new__(cls)\n        # Create OfflineFunASRNanoModelConfig and set attributes\n        funasr_nano_config = OfflineFunASRNanoModelConfig()\n        funasr_nano_config.encoder_adaptor = encoder_adaptor\n        funasr_nano_config.llm = llm\n        funasr_nano_config.embedding = embedding\n        funasr_nano_config.tokenizer = tokenizer\n        funasr_nano_config.system_prompt = system_prompt\n        funasr_nano_config.user_prompt = user_prompt\n        funasr_nano_config.max_new_tokens = max_new_tokens\n        funasr_nano_config.temperature = temperature\n        funasr_nano_config.top_p = top_p\n        funasr_nano_config.seed = seed\n        funasr_nano_config.language = language\n        funasr_nano_config.itn = itn\n        funasr_nano_config.hotwords = hotwords\n\n        model_config = OfflineModelConfig(\n            funasr_nano=funasr_nano_config,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            
decoding_method=decoding_method,\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_paraformer(\n        cls,\n        paraformer: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          paraformer:\n            Path to ``model.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            paraformer=OfflineParaformerModelConfig(model=paraformer),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            model_type=\"paraformer\",\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_telespeech_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 40,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        
`<https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model. It is\n            ignored and is hard-coded in C++.\n          feature_dim:\n            Dimension of the feature used to train the model. It is ignored\n            and is hard-coded in C++ to 40.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            telespeech_ctc=model,\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            
hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir, lexicon=hr_lexicon, rule_fsts=hr_rule_fsts\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_dolphin_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/dolphin/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx`` or ``model.int8.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            dolphin=OfflineDolphinModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_fire_red_asr_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            The only supported decoding method is greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            fire_red_asr_ctc=OfflineFireRedAsrCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            model_config=model_config,\n            decoding_method=decoding_method,\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_medasr_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/medasr/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            The only supported decoding method is greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            medasr=OfflineMedAsrCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            model_config=model_config,\n            decoding_method=decoding_method,\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_omnilingual_asr_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/omnilingual-asr/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            The only supported decoding method is greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            omnilingual=OfflineOmnilingualAsrCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            model_config=model_config,\n            decoding_method=decoding_method,\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_zipformer_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            zipformer_ctc=OfflineZipformerCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_nemo_ctc(\n 
       cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            model_type=\"nemo_ctc\",\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_nemo_canary(\n        cls,\n        encoder: str,\n        decoder: str,\n        tokens: str,\n        src_lang: str = \"en\",\n        tgt_lang: str = \"en\",\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 128,  # not used\n        decoding_method: str = \"greedy_search\",  # not used\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        
hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/nemo/index.html>`_\n        to download pre-trained models for different languages.\n\n        Args:\n          encoder:\n            Path to ``encoder.onnx`` or ``encoder.int8.onnx``.\n          decoder:\n            Path to ``decoder.onnx`` or ``decoder.int8.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          src_lang:\n            The language of the input audio. Valid values are: en, es, de, fr.\n            If you leave it empty, it uses en internally.\n          tgt_lang:\n            The language of the output text. Valid values are: en, es, de, fr.\n            If you leave it empty, it uses en internally.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model. Not used\n          feature_dim:\n            Dimension of the feature used to train the model. Not used\n          decoding_method:\n            Valid values are greedy_search. Not used\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            canary=OfflineCanaryModelConfig(\n                encoder=encoder,\n                decoder=decoder,\n                src_lang=src_lang,\n                tgt_lang=tgt_lang,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_whisper(\n        cls,\n        encoder: str,\n        decoder: str,\n        tokens: str,\n        language: str = \"en\",\n        task: str = \"transcribe\",\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        tail_paddings: int = -1,\n        enable_token_timestamps: bool = False,\n        enable_segment_timestamps: bool = False,\n        
rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html>`_\n        to download pre-trained models for different kinds of whisper models,\n        e.g., tiny, tiny.en, base, base.en, etc.\n\n        Args:\n          encoder:\n            Path to the encoder model, e.g., tiny-encoder.onnx,\n            tiny-encoder.int8.onnx, tiny-encoder.ort, etc.\n          decoder:\n            Path to the decoder model, e.g., tiny-decoder.onnx,\n            tiny-decoder.int8.onnx, tiny-decoder.ort, etc.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          language:\n            The spoken language in the audio file. Example values: en, de, zh,\n            ja, fr. See https://github.com/openai/whisper/blob/main/whisper/tokenizer.py#L10\n            for all possible values. Note that for non-multilingual models, the\n            only valid value is 'en'.\n          task:\n            Valid values are: transcribe, translate. Note that for\n            non-multilingual models, the only valid value is 'transcribe'.\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            Valid values: greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          enable_token_timestamps:\n            True to enable token-level timestamps using cross-attention alignment\n            and DTW. 
Requires ONNX models exported with attention outputs.\n            When enabled, result.timestamps will contain token-level start times.\n            Defaults to False.\n          enable_segment_timestamps:\n            True to enable segment-level timestamps using Whisper's native\n            timestamp token mode. The decoder outputs timestamp tokens like\n            <|0.00|> to mark segment boundaries. Does not require attention\n            outputs. Can be combined with enable_token_timestamps for both\n            segment and token-level timestamps. Defaults to False.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            whisper=OfflineWhisperModelConfig(\n                encoder=encoder,\n                decoder=decoder,\n                language=language,\n                task=task,\n                tail_paddings=tail_paddings,\n                enable_token_timestamps=enable_token_timestamps,\n                enable_segment_timestamps=enable_segment_timestamps,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            model_type=\"whisper\",\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=16000,\n            feature_dim=80,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n         
       dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_fire_red_asr(\n        cls,\n        encoder: str,\n        decoder: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/fire_red_asr/index.html>`_\n        to download pre-trained models for different kinds of FireRedAsr models,\n        e.g., xs, large, etc.\n\n        Args:\n          encoder:\n            Path to the encoder model.\n          decoder:\n            Path to the decoder model.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            Valid values: greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            fire_red_asr=OfflineFireRedAsrModelConfig(\n                encoder=encoder,\n                decoder=decoder,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=16000,\n            feature_dim=80,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_moonshine(\n        cls,\n        preprocessor: str,\n        encoder: str,\n        uncached_decoder: str,\n        cached_decoder: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        
`<https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html>`_\n        to download pre-trained models for different kinds of moonshine models,\n        e.g., tiny, base, etc.\n\n        Args:\n          preprocessor:\n            Path to the preprocessor model, e.g., preprocess.onnx.\n          encoder:\n            Path to the encoder model, e.g., encode.int8.onnx.\n          uncached_decoder:\n            Path to the uncached decoder model, e.g., uncached_decode.int8.onnx.\n          cached_decoder:\n            Path to the cached decoder model, e.g., cached_decode.int8.onnx.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            Valid values: greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            moonshine=OfflineMoonshineModelConfig(\n                preprocessor=preprocessor,\n                encoder=encoder,\n                uncached_decoder=uncached_decoder,\n                cached_decoder=cached_decoder,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        unused_feat_config = FeatureExtractorConfig(\n            sampling_rate=16000,\n            feature_dim=80,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            model_config=model_config,\n            feat_config=unused_feat_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_moonshine_v2(\n        cls,\n        encoder: str,\n        decoder: str,\n        tokens: str,\n        num_threads: int = 1,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n  
  ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html>`_\n        to download pre-trained models for different kinds of moonshine v2 models,\n        e.g., tiny-en, base-zh, etc.\n\n        Args:\n          encoder:\n            Path to the encoder model, e.g., encoder_model.ort\n          decoder:\n            Path to the merged decoder model, e.g., decoder_model_merged.ort,\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          decoding_method:\n            Valid values: greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            moonshine=OfflineMoonshineModelConfig(\n                encoder=encoder,\n                merged_decoder=decoder,\n            ),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n        )\n\n        unused_feat_config = FeatureExtractorConfig(\n            sampling_rate=16000,\n            feature_dim=80,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            model_config=model_config,\n            feat_config=unused_feat_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n   
         rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_tdnn_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 8000,\n        feature_dim: int = 23,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/yesno/index.html>`_\n        to download pre-trained models.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            tdnn=OfflineTdnnModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            model_type=\"tdnn\",\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_wenet_ctc(\n        cls,\n        model: str,\n        tokens: str,\n        num_threads: int = 1,\n        sample_rate: int = 16000,\n        feature_dim: int = 80,\n        decoding_method: str = \"greedy_search\",\n        debug: bool = False,\n        provider: str = \"cpu\",\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        
`<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          model:\n            Path to ``model.onnx``.\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          decoding_method:\n            Valid values are greedy_search.\n          debug:\n            True to show debug messages.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n        \"\"\"\n        self = cls.__new__(cls)\n        model_config = OfflineModelConfig(\n            wenet_ctc=OfflineWenetCtcModelConfig(model=model),\n            tokens=tokens,\n            num_threads=num_threads,\n            debug=debug,\n            provider=provider,\n            model_type=\"wenet_ctc\",\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        recognizer_config = OfflineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n         
   hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    def create_stream(self, hotwords: Optional[str] = None):\n        if hotwords is None:\n            return self.recognizer.create_stream()\n        else:\n            return self.recognizer.create_stream(hotwords)\n\n    def decode_stream(self, s: OfflineStream):\n        self.recognizer.decode_stream(s)\n\n    def decode_streams(self, ss: List[OfflineStream]):\n        self.recognizer.decode_streams(ss)\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/online_recognizer.py",
    "content": "# Copyright (c)  2023  Xiaomi Corporation\nfrom pathlib import Path\nfrom typing import List, Optional\n\nfrom sherpa_onnx.lib._sherpa_onnx import (\n    CudaConfig,\n    EndpointConfig,\n    FeatureExtractorConfig,\n    HomophoneReplacerConfig,\n    OnlineCtcFstDecoderConfig,\n    OnlineLMConfig,\n    OnlineModelConfig,\n    OnlineNeMoCtcModelConfig,\n    OnlineParaformerModelConfig,\n)\nfrom sherpa_onnx.lib._sherpa_onnx import OnlineRecognizer as _Recognizer\nfrom sherpa_onnx.lib._sherpa_onnx import (\n    OnlineRecognizerConfig,\n    OnlineRecognizerResult,\n    OnlineStream,\n    OnlineToneCtcModelConfig,\n    OnlineTransducerModelConfig,\n    OnlineWenetCtcModelConfig,\n    OnlineZipformer2CtcModelConfig,\n    ProviderConfig,\n    TensorrtConfig,\n)\n\n\ndef _assert_file_exists(f: str):\n    assert Path(f).is_file(), f\"{f} does not exist\"\n\n\nclass OnlineRecognizer(object):\n    \"\"\"A class for streaming speech recognition.\n\n    Please refer to the following files for usages\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/sherpa-onnx/python/tests/test_online_recognizer.py\n     - https://github.com/k2-fsa/sherpa-onnx/blob/master/python-api-examples/online-decode-files.py\n    \"\"\"\n\n    @classmethod\n    def from_transducer(\n        cls,\n        tokens: str,\n        encoder: str,\n        decoder: str,\n        joiner: str,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        low_freq: float = 20.0,\n        high_freq: float = -400.0,\n        dither: float = 0.0,\n        normalize_samples: bool = True,\n        snip_edges: bool = False,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        max_active_paths: int = 4,\n        hotwords_score: float = 1.5,\n        
blank_penalty: float = 0.0,\n        hotwords_file: str = \"\",\n        model_type: str = \"\",\n        modeling_unit: str = \"cjkchar\",\n        bpe_vocab: str = \"\",\n        lm: str = \"\",\n        lm_scale: float = 0.1,\n        lm_shallow_fusion: bool = True,\n        temperature_scale: float = 2.0,\n        reset_encoder: bool = False,\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        provider: str = \"cpu\",\n        device: int = 0,\n        cudnn_conv_algo_search: int = 1,\n        trt_max_workspace_size: int = 2147483647,\n        trt_max_partition_iterations: int = 10,\n        trt_min_subgraph_size: int = 5,\n        trt_fp16_enable: bool = True,\n        trt_detailed_build_log: bool = False,\n        trt_engine_cache_enable: bool = True,\n        trt_timing_cache_enable: bool = True,\n        trt_engine_cache_path: str = \"\",\n        trt_timing_cache_path: str = \"\",\n        trt_dump_subgraphs: bool = False,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n        lodr_fst: str = \"\",\n        lodr_scale: float = 0.0,\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          encoder:\n            Path to ``encoder.onnx``.\n          decoder:\n            Path to ``decoder.onnx``.\n          joiner:\n            Path to ``joiner.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          low_freq:\n            Low cutoff frequency for mel bins in feature extraction.\n          high_freq:\n            High cutoff frequency for mel bins in feature extraction\n            (if <= 0, offset from Nyquist)\n          dither:\n            Dithering constant (0.0 means no dither).\n            By default the audio samples are in range [-1,+1],\n            so dithering constant 0.00003 is a good value,\n            equivalent to the default 1.0 from kaldi\n          normalize_samples:\n            True for +/- 1.0 range of audio samples (default, zipformer feats),\n            False for +/- 32k samples (ebranchformer features).\n          snip_edges:\n            handling of end of audio signal in kaldi feature extraction.\n            If true, end effects will be handled by outputting only frames that\n            completely fit in the file, and the number of frames depends on the\n            frame-length.  If false, the number of frames depends only on the\n            frame-shift, and we reflect the data at the ends.\n          enable_endpoint_detection:\n            True to enable endpoint detection. False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. 
If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            Valid values are greedy_search, modified_beam_search.\n          max_active_paths:\n            Used only when decoding_method is modified_beam_search. It specifies\n            the maximum number of active paths during beam search.\n          blank_penalty:\n            The penalty applied to the blank symbol during decoding.\n          hotwords_file:\n            The file containing hotwords, one word/phrase per line. Within each\n            phrase, the bpe/cjkchar tokens are separated by a space.\n          hotwords_score:\n            The hotword score of each token for biasing word/phrase. Used only if\n            hotwords_file is given with modified_beam_search as decoding method.\n          temperature_scale:\n            Temperature scaling for output symbol confidence estimation.\n            It affects only confidence values; decoding uses the original\n            logits without temperature.\n          reset_encoder:\n            True to reset `encoder_state` on an endpoint after empty segment.\n            Done in `Reset()` method, after an endpoint was detected,\n            currently only in `OnlineRecognizerTransducerImpl`.\n          model_type:\n            Online transducer model type. Valid values are: conformer, lstm,\n            zipformer, zipformer2. 
All other values lead to loading the model twice.\n          modeling_unit:\n            The modeling unit of the model, commonly used units are bpe, cjkchar,\n            cjkchar+bpe, etc. Currently, it is needed only when hotwords are\n            provided, we need it to encode the hotwords into token sequence.\n          bpe_vocab:\n            The vocabulary generated by google's sentencepiece program.\n            It is a file with two columns: the token and its log\n            probability. You can get it from the directory where\n            your bpe model is generated. Only used when hotwords are provided\n            and the modeling unit is bpe or cjkchar+bpe.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          device:\n            onnxruntime cuda device index.\n          cudnn_conv_algo_search:\n            onnxruntime CuDNN convolution search algorithm selection. CUDA EP\n          trt_max_workspace_size:\n            Set TensorRT EP GPU memory usage limit. TensorRT EP\n          trt_max_partition_iterations:\n            Limit partitioning iterations for model conversion. TensorRT EP\n          trt_min_subgraph_size:\n            Set minimum size for subgraphs in partitioning. TensorRT EP\n          trt_fp16_enable:\n            Enable FP16 precision for faster performance. TensorRT EP\n          trt_detailed_build_log:\n            Enable detailed logging of build steps. TensorRT EP\n          trt_engine_cache_enable:\n            Enable caching of TensorRT engines. 
TensorRT EP\n          trt_timing_cache_enable: bool = True,\n            \"Enable use of timing cache to speed up builds.\" TensorRT EP\n          trt_engine_cache_path: str =\"\",\n            \"Set path to store cached TensorRT engines.\" TensorRT EP\n          trt_timing_cache_path: str =\"\",\n            \"Set path for storing timing cache.\" TensorRT EP\n          trt_dump_subgraphs: bool = False,\n            \"Dump optimized subgraphs for debugging.\" TensorRT EP\n          lodr_fst:\n            Path to the LODR FST file in binary format. If empty, LODR is disabled.\n          lodr_scale:\n            Scale factor for LODR rescoring. Only used when lodr_fst is provided.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(encoder)\n        _assert_file_exists(decoder)\n        _assert_file_exists(joiner)\n\n        assert num_threads > 0, num_threads\n\n        transducer_config = OnlineTransducerModelConfig(\n            encoder=encoder,\n            decoder=decoder,\n            joiner=joiner,\n        )\n\n        cuda_config = CudaConfig(\n            cudnn_conv_algo_search=cudnn_conv_algo_search,\n        )\n\n        trt_config = TensorrtConfig(\n            trt_max_workspace_size=trt_max_workspace_size,\n            trt_max_partition_iterations=trt_max_partition_iterations,\n            trt_min_subgraph_size=trt_min_subgraph_size,\n            trt_fp16_enable=trt_fp16_enable,\n            trt_detailed_build_log=trt_detailed_build_log,\n            trt_engine_cache_enable=trt_engine_cache_enable,\n            trt_timing_cache_enable=trt_timing_cache_enable,\n            trt_engine_cache_path=trt_engine_cache_path,\n            trt_timing_cache_path=trt_timing_cache_path,\n            trt_dump_subgraphs=trt_dump_subgraphs,\n        )\n\n        provider_config = ProviderConfig(\n            trt_config=trt_config,\n            cuda_config=cuda_config,\n            provider=provider,\n       
     device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            transducer=transducer_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n            model_type=model_type,\n            modeling_unit=modeling_unit,\n            bpe_vocab=bpe_vocab,\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            normalize_samples=normalize_samples,\n            snip_edges=snip_edges,\n            feature_dim=feature_dim,\n            low_freq=low_freq,\n            high_freq=high_freq,\n            dither=dither,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        if len(hotwords_file) > 0 and decoding_method != \"modified_beam_search\":\n            raise ValueError(\n                \"Please use --decoding-method=modified_beam_search when using \"\n                f\"--hotwords-file. Currently given: {decoding_method}\"\n            )\n\n        if lm and decoding_method != \"modified_beam_search\":\n            raise ValueError(\n                \"Please use --decoding-method=modified_beam_search when using \"\n                f\"--lm. 
Currently given: {decoding_method}\"\n            )\n\n        lm_config = OnlineLMConfig(\n            model=lm,\n            scale=lm_scale,\n            shallow_fusion=lm_shallow_fusion,\n            lodr_fst=lodr_fst,\n            lodr_scale=lodr_scale,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            lm_config=lm_config,\n            endpoint_config=endpoint_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            max_active_paths=max_active_paths,\n            hotwords_score=hotwords_score,\n            hotwords_file=hotwords_file,\n            blank_penalty=blank_penalty,\n            temperature_scale=temperature_scale,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            reset_encoder=reset_encoder,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_paraformer(\n        cls,\n        tokens: str,\n        encoder: str,\n        decoder: str,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        provider: str = \"cpu\",\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        device: int = 0,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        
Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          encoder:\n            Path to ``encoder.onnx``.\n          decoder:\n            Path to ``decoder.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          enable_endpoint_detection:\n            True to enable endpoint detection. False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            The only valid value is greedy_search.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(encoder)\n        _assert_file_exists(decoder)\n\n        assert num_threads > 0, num_threads\n\n        paraformer_config = OnlineParaformerModelConfig(\n            encoder=encoder,\n            decoder=decoder,\n        )\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            paraformer=paraformer_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n            model_type=\"paraformer\",\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            endpoint_config=endpoint_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                
dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_zipformer2_ctc(\n        cls,\n        tokens: str,\n        model: str,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        ctc_graph: str = \"\",\n        ctc_max_active: int = 3000,\n        provider: str = \"cpu\",\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        device: int = 0,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          model:\n            Path to ``model.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          enable_endpoint_detection:\n            True to enable endpoint detection. 
False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            The only valid value is greedy_search.\n          ctc_graph:\n            If not empty, decoding_method is ignored. It contains the path to\n            H.fst, HL.fst, or HLG.fst\n          ctc_max_active:\n            Used only when ctc_graph is not empty. It specifies the maximum\n            active paths at a time.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(model)\n\n        assert num_threads > 0, num_threads\n\n        zipformer2_ctc_config = OnlineZipformer2CtcModelConfig(model=model)\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            zipformer2_ctc=zipformer2_ctc_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        ctc_fst_decoder_config = OnlineCtcFstDecoderConfig(\n            graph=ctc_graph,\n            max_active=ctc_max_active,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            endpoint_config=endpoint_config,\n            ctc_fst_decoder_config=ctc_fst_decoder_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            
rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_t_one_ctc(\n        cls,\n        tokens: str,\n        model: str,\n        num_threads: int = 2,\n        sample_rate: float = 8000,\n        feature_dim: int = 80,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        provider: str = \"cpu\",\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        device: int = 0,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models>`_\n        to download pre-trained models.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          model:\n            Path to ``model.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          enable_endpoint_detection:\n            True to enable endpoint detection. False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. 
If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            The only valid value is greedy_search.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          debug:\n            True to show meta data in the model.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(model)\n\n        assert num_threads > 0, num_threads\n\n        t_one_ctc_config = OnlineToneCtcModelConfig(\n            model=model,\n        )\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            t_one_ctc=t_one_ctc_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n      
      sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            endpoint_config=endpoint_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_nemo_ctc(\n        cls,\n        tokens: str,\n        model: str,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        provider: str = \"cpu\",\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        device: int = 0,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models>`_\n        to download pre-trained models.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. 
Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          model:\n            Path to ``model.onnx``.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          enable_endpoint_detection:\n            True to enable endpoint detection. False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            The only valid value is greedy_search.\n          provider:\n            onnxruntime execution providers. 
Valid values are: cpu, cuda, coreml.\n          debug:\n            True to show meta data in the model.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(model)\n\n        assert num_threads > 0, num_threads\n\n        nemo_ctc_config = OnlineNeMoCtcModelConfig(\n            model=model,\n        )\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            nemo_ctc=nemo_ctc_config,\n            tokens=tokens,\n            num_threads=num_threads,\n            provider_config=provider_config,\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            endpoint_config=endpoint_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n   
             rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    @classmethod\n    def from_wenet_ctc(\n        cls,\n        tokens: str,\n        model: str,\n        chunk_size: int = 16,\n        num_left_chunks: int = 4,\n        num_threads: int = 2,\n        sample_rate: float = 16000,\n        feature_dim: int = 80,\n        enable_endpoint_detection: bool = False,\n        rule1_min_trailing_silence: float = 2.4,\n        rule2_min_trailing_silence: float = 1.2,\n        rule3_min_utterance_length: float = 20.0,\n        decoding_method: str = \"greedy_search\",\n        provider: str = \"cpu\",\n        debug: bool = False,\n        rule_fsts: str = \"\",\n        rule_fars: str = \"\",\n        device: int = 0,\n        hr_dict_dir: str = \"\",\n        hr_rule_fsts: str = \"\",\n        hr_lexicon: str = \"\",\n    ):\n        \"\"\"\n        Please refer to\n        `<https://k2-fsa.github.io/sherpa/onnx/pretrained_models/wenet/index.html>`_\n        to download pre-trained models for different languages, e.g., Chinese,\n        English, etc.\n\n        Args:\n          tokens:\n            Path to ``tokens.txt``. Each line in ``tokens.txt`` contains two\n            columns::\n\n                symbol integer_id\n\n          model:\n            Path to ``model.onnx``.\n          chunk_size:\n            The --chunk-size parameter from WeNet.\n          num_left_chunks:\n            The --num-left-chunks parameter from WeNet.\n          num_threads:\n            Number of threads for neural network computation.\n          sample_rate:\n            Sample rate of the training data used to train the model.\n          feature_dim:\n            Dimension of the feature used to train the model.\n          enable_endpoint_detection:\n            True to enable endpoint detection. 
False to disable endpoint\n            detection.\n          rule1_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If the duration\n            of trailing silence in seconds is larger than this value, we assume\n            an endpoint is detected.\n          rule2_min_trailing_silence:\n            Used only when enable_endpoint_detection is True. If we have decoded\n            something that is nonsilence and if the duration of trailing silence\n            in seconds is larger than this value, we assume an endpoint is\n            detected.\n          rule3_min_utterance_length:\n            Used only when enable_endpoint_detection is True. If the utterance\n            length in seconds is larger than this value, we assume an endpoint\n            is detected.\n          decoding_method:\n            The only valid value is greedy_search.\n          provider:\n            onnxruntime execution providers. Valid values are: cpu, cuda, coreml.\n          rule_fsts:\n            If not empty, it specifies fsts for inverse text normalization.\n            If there are multiple fsts, they are separated by a comma.\n          rule_fars:\n            If not empty, it specifies fst archives for inverse text normalization.\n            If there are multiple archives, they are separated by a comma.\n          device:\n            onnxruntime cuda device index.\n        \"\"\"\n        self = cls.__new__(cls)\n        _assert_file_exists(tokens)\n        _assert_file_exists(model)\n\n        assert num_threads > 0, num_threads\n\n        wenet_ctc_config = OnlineWenetCtcModelConfig(\n            model=model,\n            chunk_size=chunk_size,\n            num_left_chunks=num_left_chunks,\n        )\n\n        provider_config = ProviderConfig(\n            provider=provider,\n            device=device,\n        )\n\n        model_config = OnlineModelConfig(\n            wenet_ctc=wenet_ctc_config,\n            tokens=tokens,\n         
   num_threads=num_threads,\n            provider_config=provider_config,\n            debug=debug,\n        )\n\n        feat_config = FeatureExtractorConfig(\n            sampling_rate=sample_rate,\n            feature_dim=feature_dim,\n        )\n\n        endpoint_config = EndpointConfig(\n            rule1_min_trailing_silence=rule1_min_trailing_silence,\n            rule2_min_trailing_silence=rule2_min_trailing_silence,\n            rule3_min_utterance_length=rule3_min_utterance_length,\n        )\n\n        recognizer_config = OnlineRecognizerConfig(\n            feat_config=feat_config,\n            model_config=model_config,\n            endpoint_config=endpoint_config,\n            enable_endpoint=enable_endpoint_detection,\n            decoding_method=decoding_method,\n            rule_fsts=rule_fsts,\n            rule_fars=rule_fars,\n            hr=HomophoneReplacerConfig(\n                dict_dir=hr_dict_dir,\n                lexicon=hr_lexicon,\n                rule_fsts=hr_rule_fsts,\n            ),\n        )\n\n        self.recognizer = _Recognizer(recognizer_config)\n        self.config = recognizer_config\n        return self\n\n    def create_stream(self, hotwords: Optional[str] = None):\n        if hotwords is None:\n            return self.recognizer.create_stream()\n        else:\n            return self.recognizer.create_stream(hotwords)\n\n    def decode_stream(self, s: OnlineStream):\n        self.recognizer.decode_stream(s)\n\n    def decode_streams(self, ss: List[OnlineStream]):\n        self.recognizer.decode_streams(ss)\n\n    def is_ready(self, s: OnlineStream) -> bool:\n        return self.recognizer.is_ready(s)\n\n    def get_result_all(self, s: OnlineStream) -> OnlineRecognizerResult:\n        return self.recognizer.get_result(s)\n\n    def get_result(self, s: OnlineStream) -> str:\n        return self.recognizer.get_result(s).text.strip()\n\n    def get_result_as_json_string(self, s: OnlineStream) -> str:\n        return 
self.recognizer.get_result(s).as_json_string()\n\n    def tokens(self, s: OnlineStream) -> List[str]:\n        return self.recognizer.get_result(s).tokens\n\n    def timestamps(self, s: OnlineStream) -> List[float]:\n        return self.recognizer.get_result(s).timestamps\n\n    def start_time(self, s: OnlineStream) -> float:\n        return self.recognizer.get_result(s).start_time\n\n    def ys_probs(self, s: OnlineStream) -> List[float]:\n        return self.recognizer.get_result(s).ys_probs\n\n    def lm_probs(self, s: OnlineStream) -> List[float]:\n        return self.recognizer.get_result(s).lm_probs\n\n    def context_scores(self, s: OnlineStream) -> List[float]:\n        return self.recognizer.get_result(s).context_scores\n\n    def is_endpoint(self, s: OnlineStream) -> bool:\n        return self.recognizer.is_endpoint(s)\n\n    def reset(self, s: OnlineStream) -> bool:\n        return self.recognizer.reset(s)\n"
  },
  {
    "path": "sherpa-onnx/python/sherpa_onnx/utils.py",
    "content": "# Copyright (c)  2023  Xiaomi Corporation\nimport re\n\nfrom pathlib import Path\nfrom typing import List, Optional, Union\n\n\ndef text2token(\n    texts: List[str],\n    tokens: str,\n    tokens_type: str = \"cjkchar\",\n    bpe_model: Optional[str] = None,\n    lexicon: Optional[str] = None,\n    output_ids: bool = False,\n) -> List[List[Union[str, int]]]:\n    \"\"\"\n    Encode the given texts (a list of string) to a list of a list of tokens.\n\n    Args:\n      texts:\n        The given contexts list (a list of string).\n      tokens:\n        The path of the tokens.txt.\n      tokens_type:\n        The valid values are cjkchar, bpe, cjkchar+bpe, fpinyin, ppinyin, phone+ppinyin.\n        fpinyin means full pinyin, each cjkchar has a pinyin(with tone).\n        ppinyin means partial pinyin, it splits pinyin into initial and final,\n        phone means English phonemes in CMU dictionary format.\n      bpe_model:\n        The path of the bpe model. Only required when tokens_type is bpe or\n        cjkchar+bpe.\n      lexicon:\n        The path of the lexicon.txt. 
Only required when tokens_type is phone+ppinyin.\n      output_ids:\n        True to output token ids; otherwise, output tokens.\n    Returns:\n      The encoded texts: a list of lists of token ids if output_ids\n      is True, otherwise a list of lists of tokens.\n    \"\"\"\n    try:\n        import sentencepiece as spm\n    except ImportError:\n        print(\"Please run\")\n        print(\"  pip install sentencepiece\")\n        print(\"before you continue\")\n        raise\n\n    try:\n        from pypinyin import pinyin\n        from pypinyin.contrib.tone_convert import to_initials, to_finals_tone\n    except ImportError:\n        print(\"Please run\")\n        print(\"  pip install pypinyin\")\n        print(\"before you continue\")\n        raise\n\n    assert Path(tokens).is_file(), f\"File does not exist: {tokens}\"\n    tokens_table = {}\n    with open(tokens, \"r\", encoding=\"utf-8\") as f:\n        for line in f:\n            toks = line.strip().split()\n            assert len(toks) == 2, len(toks)\n            assert toks[0] not in tokens_table, f\"Duplicate token: {toks} \"\n            tokens_table[toks[0]] = int(toks[1])\n\n    if \"bpe\" in tokens_type:\n        # bpe_model defaults to None, so check it before building a Path.\n        assert (\n            bpe_model and Path(bpe_model).is_file()\n        ), f\"File does not exist: {bpe_model}\"\n        sp = spm.SentencePieceProcessor()\n        sp.load(bpe_model)\n\n    phone_table = {}\n    if tokens_type == \"phone+ppinyin\":\n        assert (\n            lexicon and Path(lexicon).is_file()\n        ), f\"File does not exist: {lexicon}\"\n        with open(lexicon, \"r\", encoding=\"utf-8\") as f:\n            for line in f:\n                toks = line.strip().split()\n                assert len(toks) >= 2, len(toks)\n                word = toks[0]\n                phones = toks[1:]\n                phone_table[word] = phones\n\n    texts_list: List[List[str]] = []\n\n    def to_pinyin(txt: str, out_type: str) -> List[str]:\n        assert out_type in [\"ppinyin\", \"fpinyin\"], f\"given {out_type}\"\n        py 
= [x[0] for x in pinyin(txt)]\n        if \"ppinyin\" == out_type:\n            res = []\n            for x in py:\n                initial = to_initials(x, strict=False)\n                final = to_finals_tone(x, strict=False)\n                if initial == \"\" and final == \"\":\n                    res.append(x)\n                else:\n                    if initial:\n                        res.append(initial)\n                    if final:\n                        res.append(final)\n            return res\n        else:\n            return py\n\n    if tokens_type == \"cjkchar\":\n        texts_list = [list(\"\".join(text.split())) for text in texts]\n    elif tokens_type == \"bpe\":\n        texts_list = sp.encode(texts, out_type=str)\n    elif tokens_type == \"ppinyin\" or tokens_type == \"fpinyin\":\n        for txt in texts:\n            texts_list.append(to_pinyin(txt, tokens_type))\n    elif tokens_type == \"phone+ppinyin\":\n        # CJK(China Japan Korea) unicode range is [U+4E00, U+9FFF], ref:\n        # https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)\n        pattern = re.compile(r\"^[\\u4e00-\\u9fff]+$\")\n        for text in texts:\n            words = text.strip().split()\n            text_list = []\n            skip_text = False\n            for w in words:\n                if w in phone_table:\n                    text_list += phone_table[w]\n                else:\n                    if pattern.fullmatch(w) is None:\n                        print(\n                            f\"Word {w} not in lexicon and it is not a CJK character, \"\n                            f\"skipping text: {text}.\"\n                        )\n                        skip_text = True\n                        break\n                    else:\n                        text_list += to_pinyin(w, \"ppinyin\")\n            if not skip_text:\n                texts_list.append(text_list)\n    else:\n        assert (\n            tokens_type == 
\"cjkchar+bpe\"\n        ), f\"Supported tokens_type are cjkchar, bpe, cjkchar+bpe, ppinyin, fpinyin, phone+ppinyin given {tokens_type}\"\n\n        # CJK (Chinese, Japanese, Korean) unicode range is [U+4E00, U+9FFF], ref:\n        # https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)\n        pattern = re.compile(r\"([\\u4e00-\\u9fff])\")\n        for text in texts:\n            # Example:\n            #   text      = \"你好 IT'S OKAY 的\"\n            #   mix_chars = [\"你\", \"好\", \" IT'S OKAY \", \"的\"]\n            chars = pattern.split(text)\n            mix_chars = [w for w in chars if len(w.strip()) > 0]\n            text_list = []\n            for ch_or_w in mix_chars:\n                # ch_or_w is a single CJK character (e.g., \"你\"), keep it as is.\n                if pattern.fullmatch(ch_or_w) is not None:\n                    text_list.append(ch_or_w)\n                # ch_or_w contains non-CJK characters (e.g., \" IT'S OKAY \"),\n                # encode ch_or_w using bpe_model.\n                else:\n                    text_list += sp.encode_as_pieces(ch_or_w)\n            texts_list.append(text_list)\n\n    result: List[List[Union[int, str]]] = []\n    for text in texts_list:\n        text_list = []\n        contain_oov = False\n        for txt in text:\n            if txt in tokens_table:\n                text_list.append(tokens_table[txt] if output_ids else txt)\n            else:\n                print(\n                    f\"Can't find token {txt} in token table, check your \"\n                    f\"tokens.txt see if {txt} in it. skipping text : {text}.\"\n                )\n                contain_oov = True\n                break\n        if contain_oov:\n            continue\n        else:\n            result.append(text_list)\n    return result\n"
  },
  {
    "path": "sherpa-onnx/python/tests/CMakeLists.txt",
    "content": "function(sherpa_onnx_add_py_test source)\n  get_filename_component(name ${source} NAME_WE)\n  set(name \"${name}_py\")\n\n  add_test(NAME ${name}\n    COMMAND\n      \"${PYTHON_EXECUTABLE}\"\n      \"${CMAKE_CURRENT_SOURCE_DIR}/${source}\"\n    WORKING_DIRECTORY\n      ${CMAKE_CURRENT_SOURCE_DIR}\n  )\n\n  get_filename_component(sherpa_onnx_path ${CMAKE_CURRENT_LIST_DIR} DIRECTORY)\n\n  set_property(TEST ${name}\n    PROPERTY ENVIRONMENT \"PYTHONPATH=${sherpa_onnx_path}:$<TARGET_FILE_DIR:_sherpa_onnx>:$ENV{PYTHONPATH}\"\n  )\nendfunction()\n\n# please sort the files in alphabetic order\nset(py_test_files\n  test_fast_clustering.py\n  test_feature_extractor_config.py\n  test_keyword_spotter.py\n  test_offline_recognizer.py\n  test_online_recognizer.py\n  test_online_transducer_model_config.py\n  test_speaker_recognition.py\n  test_text2token.py\n)\n\nforeach(source IN LISTS py_test_files)\n  sherpa_onnx_add_py_test(${source})\nendforeach()\n\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_fast_clustering.py",
    "content": "# sherpa-onnx/python/tests/test_fast_clustering.py\n#\n# Copyright (c)  2024  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_fast_clustering_py\nimport unittest\n\nimport sherpa_onnx\nimport numpy as np\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport soundfile as sf\n\n\ndef load_audio(filename: str) -> np.ndarray:\n    data, sample_rate = sf.read(\n        filename,\n        always_2d=True,\n        dtype=\"float32\",\n    )\n    data = data[:, 0]  # use only the first channel\n    samples = np.ascontiguousarray(data)\n    assert sample_rate == 16000, f\"Expect sample_rate 16000. Given: {sample_rate}\"\n    return samples\n\n\nclass TestFastClustering(unittest.TestCase):\n    def test_construct_by_num_clusters(self):\n        config = sherpa_onnx.FastClusteringConfig(num_clusters=4)\n        assert config.validate() is True\n\n        print(config)\n\n        clustering = sherpa_onnx.FastClustering(config)\n        features = np.array(\n            [\n                [0.2, 0.3],  # cluster 0\n                [0.3, -0.4],  # cluster 1\n                [-0.1, -0.2],  # cluster 2\n                [-0.3, -0.5],  # cluster 2\n                [0.1, -0.2],  # cluster 1\n                [0.1, 0.2],  # cluster 0\n                [-0.8, 1.9],  # cluster 3\n                [-0.4, -0.6],  # cluster 2\n                [-0.7, 0.9],  # cluster 3\n            ]\n        )\n        labels = clustering(features)\n        assert isinstance(labels, list)\n        assert len(labels) == features.shape[0]\n\n        expected = [0, 1, 2, 2, 1, 0, 3, 2, 3]\n        assert labels == expected, (labels, expected)\n\n    def test_construct_by_threshold(self):\n        config = sherpa_onnx.FastClusteringConfig(threshold=0.2)\n        assert config.validate() is True\n\n        print(config)\n\n        clustering = sherpa_onnx.FastClustering(config)\n        features = np.array(\n            [\n                [0.2, 0.3],  # 
cluster 0\n                [0.3, -0.4],  # cluster 1\n                [-0.1, -0.2],  # cluster 2\n                [-0.3, -0.5],  # cluster 2\n                [0.1, -0.2],  # cluster 1\n                [0.1, 0.2],  # cluster 0\n                [-0.8, 1.9],  # cluster 3\n                [-0.4, -0.6],  # cluster 2\n                [-0.7, 0.9],  # cluster 3\n            ]\n        )\n        labels = clustering(features)\n        assert isinstance(labels, list)\n        assert len(labels) == features.shape[0]\n\n        expected = [0, 1, 2, 2, 1, 0, 3, 2, 3]\n        assert labels == expected, (labels, expected)\n\n    def test_cluster_speaker_embeddings(self):\n        d = Path(\"/tmp/test-cluster\")\n\n        # Please download the onnx file from\n        # https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n        model_file = d / \"3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n\n        if not model_file.exists():\n            print(f\"skip test since {model_file} does not exist\")\n            return\n\n        # Please download the test wave files from\n        # https://github.com/csukuangfj/sr-data\n        wave_dir = d / \"sr-data\"\n        if not wave_dir.is_dir():\n            print(f\"skip test since {wave_dir} does not exist\")\n            return\n\n        wave_files = [\n            \"enroll/fangjun-sr-1.wav\",  # cluster 0\n            \"enroll/fangjun-sr-2.wav\",  # cluster 0\n            \"enroll/fangjun-sr-3.wav\",  # cluster 0\n            \"enroll/leijun-sr-1.wav\",  # cluster 1\n            \"enroll/leijun-sr-2.wav\",  # cluster 1\n            \"enroll/liudehua-sr-1.wav\",  # cluster 2\n            \"enroll/liudehua-sr-2.wav\",  # cluster 2\n            \"test/fangjun-test-sr-1.wav\",  # cluster 0\n            \"test/fangjun-test-sr-2.wav\",  # cluster 0\n            \"test/leijun-test-sr-1.wav\",  # cluster 1\n            \"test/leijun-test-sr-2.wav\",  # cluster 1\n            
\"test/leijun-test-sr-3.wav\",  # cluster 1\n            \"test/liudehua-test-sr-1.wav\",  # cluster 2\n            \"test/liudehua-test-sr-2.wav\",  # cluster 2\n        ]\n        for w in wave_files:\n            f = d / \"sr-data\" / w\n            if not f.is_file():\n                print(f\"skip testing since {f} does not exist\")\n                return\n\n        extractor_config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n            model=str(model_file),\n            num_threads=1,\n            debug=0,\n        )\n        if not extractor_config.validate():\n            raise ValueError(f\"Invalid extractor config. {extractor_config}\")\n\n        extractor = sherpa_onnx.SpeakerEmbeddingExtractor(extractor_config)\n\n        features = []\n\n        for w in wave_files:\n            f = d / \"sr-data\" / w\n            audio = load_audio(str(f))\n            stream = extractor.create_stream()\n            stream.accept_waveform(sample_rate=16000, waveform=audio)\n            stream.input_finished()\n\n            assert extractor.is_ready(stream)\n            embedding = extractor.compute(stream)\n            embedding = np.array(embedding)\n            features.append(embedding)\n        features = np.array(features)\n\n        config = sherpa_onnx.FastClusteringConfig(num_clusters=3)\n        #  config = sherpa_onnx.FastClusteringConfig(threshold=0.5)\n        clustering = sherpa_onnx.FastClustering(config)\n        labels = clustering(features)\n\n        expected = [0, 0, 0, 1, 1, 2, 2]\n        expected += [0, 0, 1, 1, 1, 2, 2]\n\n        assert labels == expected, (labels, expected)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_feature_extractor_config.py",
    "content": "# sherpa-onnx/python/tests/test_feature_extractor_config.py\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_feature_extractor_config_py\n\nimport unittest\n\nimport _sherpa_onnx\n\n\nclass TestFeatureExtractorConfig(unittest.TestCase):\n    def test_default_constructor(self):\n        config = _sherpa_onnx.FeatureExtractorConfig()\n        assert config.sampling_rate == 16000, config.sampling_rate\n        assert config.feature_dim == 80, config.feature_dim\n        print(config)\n\n    def test_constructor(self):\n        config = _sherpa_onnx.FeatureExtractorConfig(sampling_rate=8000, feature_dim=40)\n        assert config.sampling_rate == 8000, config.sampling_rate\n        assert config.feature_dim == 40, config.feature_dim\n        print(config)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_keyword_spotter.py",
    "content": "# sherpa-onnx/python/tests/test_keyword_spotter.py\n#\n# Copyright (c)  2024  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_keyword_spotter_py\n\nimport unittest\nimport wave\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\nd = \"/tmp/onnx-models\"\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/kws/pretrained_models/index.html\n# to download pre-trained models for testing\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\nclass TestKeywordSpotter(unittest.TestCase):\n    def test_zipformer_transducer_en(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                joiner = 
f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                decoder = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                joiner = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\"\n\n            tokens = (\n                f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/tokens.txt\"\n            )\n            keywords_file = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\"\n            wave0 = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/test_wavs/0.wav\"\n            wave1 = f\"{d}/sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01/test_wavs/1.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_zipformer_transducer_en()\")\n                return\n            keyword_spotter = sherpa_onnx.KeywordSpotter(\n                encoder=encoder,\n                decoder=decoder,\n                joiner=joiner,\n                tokens=tokens,\n                num_threads=1,\n                keywords_file=keywords_file,\n                provider=\"cpu\",\n            )\n            streams = []\n            waves = [wave0, wave1]\n            for wave in waves:\n                s = keyword_spotter.create_stream()\n                samples, sample_rate = read_wave(wave)\n                s.accept_waveform(sample_rate, samples)\n\n                tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                s.accept_waveform(sample_rate, tail_paddings)\n                s.input_finished()\n                streams.append(s)\n\n            results = [\"\"] * len(streams)\n            while True:\n           
     ready_list = []\n                for i, s in enumerate(streams):\n                    if keyword_spotter.is_ready(s):\n                        ready_list.append(s)\n                    r = keyword_spotter.get_result(s)\n                    if r:\n                        print(f\"{r} is detected.\")\n                        results[i] += f\"{r}/\"\n\n                        keyword_spotter.reset_stream(s)\n\n                if len(ready_list) == 0:\n                    break\n                keyword_spotter.decode_streams(ready_list)\n            for wave_filename, result in zip(waves, results):\n                print(f\"{wave_filename}\\n{result[0:-1]}\")\n                print(\"-\" * 10)\n\n    def test_zipformer_transducer_cn(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                joiner = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                decoder = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n                joiner = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\"\n\n            tokens = (\n                f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\"\n            )\n            keywords_file = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\"\n            wave0 = 
f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\"\n            wave1 = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/4.wav\"\n            wave2 = f\"{d}/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/5.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_zipformer_transducer_cn()\")\n                return\n            keyword_spotter = sherpa_onnx.KeywordSpotter(\n                encoder=encoder,\n                decoder=decoder,\n                joiner=joiner,\n                tokens=tokens,\n                num_threads=1,\n                keywords_file=keywords_file,\n                provider=\"cpu\",\n            )\n            streams = []\n            waves = [wave0, wave1, wave2]\n            for wave in waves:\n                s = keyword_spotter.create_stream()\n                samples, sample_rate = read_wave(wave)\n                s.accept_waveform(sample_rate, samples)\n\n                tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                s.accept_waveform(sample_rate, tail_paddings)\n                s.input_finished()\n                streams.append(s)\n\n            results = [\"\"] * len(streams)\n            while True:\n                ready_list = []\n                for i, s in enumerate(streams):\n                    if keyword_spotter.is_ready(s):\n                        ready_list.append(s)\n                    r = keyword_spotter.get_result(s)\n                    if r:\n                        print(f\"{r} is detected.\")\n                        results[i] += f\"{r}/\"\n\n                        keyword_spotter.reset_stream(s)\n\n                if len(ready_list) == 0:\n                    break\n                keyword_spotter.decode_streams(ready_list)\n            for wave_filename, result in zip(waves, results):\n                print(f\"{wave_filename}\\n{result[0:-1]}\")\n                
print(\"-\" * 10)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_offline_recognizer.py",
    "content": "# sherpa-onnx/python/tests/test_offline_recognizer.py\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_offline_recognizer_py\n\nimport unittest\nimport wave\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\nd = \"/tmp/icefall-models\"\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html\n# and\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html\n# to download pre-trained models for testing\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\nclass TestOfflineRecognizer(unittest.TestCase):\n    def test_transducer_single_file(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx\"\n                joiner = 
f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx\"\n                decoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx\"\n\n            tokens = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_transducer_single_file()\")\n                return\n\n            recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n                encoder=encoder,\n                decoder=decoder,\n                joiner=joiner,\n                tokens=tokens,\n                num_threads=1,\n                provider=\"cpu\",\n            )\n\n            s = recognizer.create_stream()\n            samples, sample_rate = read_wave(wave0)\n            s.accept_waveform(sample_rate, samples)\n            recognizer.decode_stream(s)\n            print(s.result.text)\n\n    def test_transducer_multiple_files(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/encoder-epoch-99-avg-1.onnx\"\n                decoder = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/joiner-epoch-99-avg-1.onnx\"\n\n            tokens = 
f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/test_wavs/0.wav\"\n            wave1 = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/test_wavs/1.wav\"\n            wave2 = f\"{d}/sherpa-onnx-zipformer-en-2023-04-01/test_wavs/8k.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_transducer_multiple_files()\")\n                return\n\n            recognizer = sherpa_onnx.OfflineRecognizer.from_transducer(\n                encoder=encoder,\n                decoder=decoder,\n                joiner=joiner,\n                tokens=tokens,\n                num_threads=1,\n                provider=\"cpu\",\n            )\n\n            s0 = recognizer.create_stream()\n            samples0, sample_rate0 = read_wave(wave0)\n            s0.accept_waveform(sample_rate0, samples0)\n\n            s1 = recognizer.create_stream()\n            samples1, sample_rate1 = read_wave(wave1)\n            s1.accept_waveform(sample_rate1, samples1)\n\n            s2 = recognizer.create_stream()\n            samples2, sample_rate2 = read_wave(wave2)\n            s2.accept_waveform(sample_rate2, samples2)\n\n            recognizer.decode_streams([s0, s1, s2])\n            print(s0.result.text)\n            print(s1.result.text)\n            print(s2.result.text)\n\n    def test_paraformer_single_file(self):\n        model = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\"\n\n        tokens = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\"\n        wave0 = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav\"\n\n        if not Path(model).is_file():\n            print(\"skipping test_paraformer_single_file()\")\n            return\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=model,\n            tokens=tokens,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n\n        s = 
recognizer.create_stream()\n        samples, sample_rate = read_wave(wave0)\n        s.accept_waveform(sample_rate, samples)\n        recognizer.decode_stream(s)\n        print(s.result.text)\n\n    def test_paraformer_multiple_files(self):\n        model = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\"\n\n        tokens = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\"\n        wave0 = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/0.wav\"\n        wave1 = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/1.wav\"\n        wave2 = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/2.wav\"\n        wave3 = f\"{d}/sherpa-onnx-paraformer-zh-2023-09-14/test_wavs/8k.wav\"\n\n        if not Path(model).is_file():\n            print(\"skipping test_paraformer_multiple_files()\")\n            return\n\n        recognizer = sherpa_onnx.OfflineRecognizer.from_paraformer(\n            paraformer=model,\n            tokens=tokens,\n            num_threads=1,\n            provider=\"cpu\",\n        )\n\n        s0 = recognizer.create_stream()\n        samples0, sample_rate0 = read_wave(wave0)\n        s0.accept_waveform(sample_rate0, samples0)\n\n        s1 = recognizer.create_stream()\n        samples1, sample_rate1 = read_wave(wave1)\n        s1.accept_waveform(sample_rate1, samples1)\n\n        s2 = recognizer.create_stream()\n        samples2, sample_rate2 = read_wave(wave2)\n        s2.accept_waveform(sample_rate2, samples2)\n\n        s3 = recognizer.create_stream()\n        samples3, sample_rate3 = read_wave(wave3)\n        s3.accept_waveform(sample_rate3, samples3)\n\n        recognizer.decode_streams([s0, s1, s2, s3])\n        print(s0.result.text)\n        print(s1.result.text)\n        print(s2.result.text)\n        print(s3.result.text)\n\n    def test_nemo_ctc_single_file(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                model = 
f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/model.int8.onnx\"\n            else:\n                model = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/model.onnx\"\n\n            tokens = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/0.wav\"\n\n            if not Path(model).is_file():\n                print(\"skipping test_nemo_ctc_single_file()\")\n                return\n\n            recognizer = sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n                model=model,\n                tokens=tokens,\n                num_threads=1,\n                provider=\"cpu\",\n            )\n\n            s = recognizer.create_stream()\n            samples, sample_rate = read_wave(wave0)\n            s.accept_waveform(sample_rate, samples)\n            recognizer.decode_stream(s)\n            print(s.result.text)\n\n    def test_nemo_ctc_multiple_files(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                model = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/model.int8.onnx\"\n            else:\n                model = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/model.onnx\"\n\n            tokens = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/0.wav\"\n            wave1 = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/1.wav\"\n            wave2 = f\"{d}/sherpa-onnx-nemo-ctc-en-citrinet-512/test_wavs/8k.wav\"\n\n            if not Path(model).is_file():\n                print(\"skipping test_nemo_ctc_multiple_files()\")\n                return\n\n            recognizer = sherpa_onnx.OfflineRecognizer.from_nemo_ctc(\n                model=model,\n                tokens=tokens,\n                num_threads=1,\n                provider=\"cpu\",\n            )\n\n            s0 = recognizer.create_stream()\n            samples0, sample_rate0 = read_wave(wave0)\n      
      s0.accept_waveform(sample_rate0, samples0)\n\n            s1 = recognizer.create_stream()\n            samples1, sample_rate1 = read_wave(wave1)\n            s1.accept_waveform(sample_rate1, samples1)\n\n            s2 = recognizer.create_stream()\n            samples2, sample_rate2 = read_wave(wave2)\n            s2.accept_waveform(sample_rate2, samples2)\n\n            recognizer.decode_streams([s0, s1, s2])\n            print(s0.result.text)\n            print(s1.result.text)\n            print(s2.result.text)\n\n    def _test_wenet_ctc(self):\n        models = [\n            \"sherpa-onnx-zh-wenet-aishell\",\n            \"sherpa-onnx-zh-wenet-aishell2\",\n            \"sherpa-onnx-zh-wenet-wenetspeech\",\n            \"sherpa-onnx-zh-wenet-multi-cn\",\n            \"sherpa-onnx-en-wenet-librispeech\",\n            \"sherpa-onnx-en-wenet-gigaspeech\",\n        ]\n        for m in models:\n            for use_int8 in [True, False]:\n                name = \"model.int8.onnx\" if use_int8 else \"model.onnx\"\n                model = f\"{d}/{m}/{name}\"\n                tokens = f\"{d}/{m}/tokens.txt\"\n\n                wave0 = f\"{d}/{m}/test_wavs/0.wav\"\n                wave1 = f\"{d}/{m}/test_wavs/1.wav\"\n                wave2 = f\"{d}/{m}/test_wavs/8k.wav\"\n\n                if not Path(model).is_file():\n                    print(\"skipping test_wenet_ctc()\")\n                    return\n\n                recognizer = sherpa_onnx.OfflineRecognizer.from_wenet_ctc(\n                    model=model,\n                    tokens=tokens,\n                    num_threads=1,\n                    provider=\"cpu\",\n                )\n\n                s0 = recognizer.create_stream()\n                samples0, sample_rate0 = read_wave(wave0)\n                s0.accept_waveform(sample_rate0, samples0)\n\n                s1 = recognizer.create_stream()\n                samples1, sample_rate1 = read_wave(wave1)\n                s1.accept_waveform(sample_rate1, 
samples1)\n\n                s2 = recognizer.create_stream()\n                samples2, sample_rate2 = read_wave(wave2)\n                s2.accept_waveform(sample_rate2, samples2)\n\n                recognizer.decode_streams([s0, s1, s2])\n                print(s0.result.text)\n                print(s1.result.text)\n                print(s2.result.text)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_online_recognizer.py",
    "content": "# sherpa-onnx/python/tests/test_online_recognizer.py\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_online_recognizer_py\n\nimport unittest\nimport wave\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\nd = \"/tmp/icefall-models\"\n# Please refer to\n# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\n# to download pre-trained models for testing\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\nclass TestOnlineRecognizer(unittest.TestCase):\n    def test_transducer_single_file(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n                joiner = 
f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx\"\n                decoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\"\n\n            tokens = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_transducer_single_file()\")\n                return\n\n            for decoding_method in [\"greedy_search\", \"modified_beam_search\"]:\n                recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n                    encoder=encoder,\n                    decoder=decoder,\n                    joiner=joiner,\n                    tokens=tokens,\n                    num_threads=1,\n                    decoding_method=decoding_method,\n                    provider=\"cpu\",\n                )\n                s = recognizer.create_stream()\n                samples, sample_rate = read_wave(wave0)\n                s.accept_waveform(sample_rate, samples)\n\n                tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                s.accept_waveform(sample_rate, tail_paddings)\n\n                s.input_finished()\n                while recognizer.is_ready(s):\n                    recognizer.decode_stream(s)\n                print(recognizer.get_result(s))\n\n    def test_transducer_multiple_files(self):\n        for use_int8 in [True, False]:\n            if use_int8:\n                encoder = 
f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\"\n                decoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\"\n            else:\n                encoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx\"\n                decoder = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n                joiner = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\"\n\n            tokens = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\"\n            wave0 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav\"\n            wave1 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav\"\n            wave2 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/2.wav\"\n            wave3 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/3.wav\"\n            wave4 = f\"{d}/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/8k.wav\"\n\n            if not Path(encoder).is_file():\n                print(\"skipping test_transducer_multiple_files()\")\n                return\n\n            for decoding_method in [\"greedy_search\", \"modified_beam_search\"]:\n                recognizer = sherpa_onnx.OnlineRecognizer.from_transducer(\n                    encoder=encoder,\n                    decoder=decoder,\n                    joiner=joiner,\n                    tokens=tokens,\n                    num_threads=1,\n                    decoding_method=decoding_method,\n                    provider=\"cpu\",\n                )\n                streams = []\n      
          waves = [wave0, wave1, wave2, wave3, wave4]\n                for wave in waves:\n                    s = recognizer.create_stream()\n                    samples, sample_rate = read_wave(wave)\n                    s.accept_waveform(sample_rate, samples)\n\n                    tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                    s.accept_waveform(sample_rate, tail_paddings)\n                    s.input_finished()\n                    streams.append(s)\n\n                while True:\n                    ready_list = []\n                    for s in streams:\n                        if recognizer.is_ready(s):\n                            ready_list.append(s)\n                    if len(ready_list) == 0:\n                        break\n                    recognizer.decode_streams(ready_list)\n                results = [recognizer.get_result(s) for s in streams]\n                for wave_filename, result in zip(waves, results):\n                    print(f\"{wave_filename}\\n{result}\")\n                    print(\"-\" * 10)\n\n    def test_zipformer2_ctc(self):\n        m = \"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13\"\n        for use_int8 in [True, False]:\n            name = (\n                \"ctc-epoch-20-avg-1-chunk-16-left-128.int8.onnx\"\n                if use_int8\n                else \"ctc-epoch-20-avg-1-chunk-16-left-128.onnx\"\n            )\n            model = f\"{d}/{m}/{name}\"\n            tokens = f\"{d}/{m}/tokens.txt\"\n            wave0 = f\"{d}/{m}/test_wavs/DEV_T0000000000.wav\"\n            wave1 = f\"{d}/{m}/test_wavs/DEV_T0000000001.wav\"\n            wave2 = f\"{d}/{m}/test_wavs/DEV_T0000000002.wav\"\n            if not Path(model).is_file():\n                print(\"skipping test_zipformer2_ctc()\")\n                return\n            print(f\"testing {model}\")\n\n            recognizer = sherpa_onnx.OnlineRecognizer.from_zipformer2_ctc(\n                model=model,\n      
          tokens=tokens,\n                num_threads=1,\n                provider=\"cpu\",\n            )\n\n            streams = []\n            waves = [wave0, wave1, wave2]\n            for wave in waves:\n                s = recognizer.create_stream()\n                samples, sample_rate = read_wave(wave)\n                s.accept_waveform(sample_rate, samples)\n\n                tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                s.accept_waveform(sample_rate, tail_paddings)\n                s.input_finished()\n                streams.append(s)\n\n            while True:\n                ready_list = []\n                for s in streams:\n                    if recognizer.is_ready(s):\n                        ready_list.append(s)\n                if len(ready_list) == 0:\n                    break\n                recognizer.decode_streams(ready_list)\n\n            results = [recognizer.get_result(s) for s in streams]\n            for wave_filename, result in zip(waves, results):\n                print(f\"{wave_filename}\\n{result}\")\n                print(\"-\" * 10)\n\n    def test_wenet_ctc(self):\n        models = [\n            \"sherpa-onnx-zh-wenet-aishell\",\n            \"sherpa-onnx-zh-wenet-aishell2\",\n            \"sherpa-onnx-zh-wenet-wenetspeech\",\n            \"sherpa-onnx-zh-wenet-multi-cn\",\n            \"sherpa-onnx-en-wenet-librispeech\",\n            \"sherpa-onnx-en-wenet-gigaspeech\",\n        ]\n        for m in models:\n            for use_int8 in [True, False]:\n                name = (\n                    \"model-streaming.int8.onnx\" if use_int8 else \"model-streaming.onnx\"\n                )\n                model = f\"{d}/{m}/{name}\"\n                tokens = f\"{d}/{m}/tokens.txt\"\n\n                wave0 = f\"{d}/{m}/test_wavs/0.wav\"\n                wave1 = f\"{d}/{m}/test_wavs/1.wav\"\n                wave2 = f\"{d}/{m}/test_wavs/8k.wav\"\n\n                if not 
Path(model).is_file():\n                    print(\"skipping test_wenet_ctc()\")\n                    return\n\n                recognizer = sherpa_onnx.OnlineRecognizer.from_wenet_ctc(\n                    model=model,\n                    tokens=tokens,\n                    num_threads=1,\n                    provider=\"cpu\",\n                )\n\n                streams = []\n                waves = [wave0, wave1, wave2]\n                for wave in waves:\n                    s = recognizer.create_stream()\n                    samples, sample_rate = read_wave(wave)\n                    s.accept_waveform(sample_rate, samples)\n\n                    tail_paddings = np.zeros(int(0.2 * sample_rate), dtype=np.float32)\n                    s.accept_waveform(sample_rate, tail_paddings)\n                    s.input_finished()\n                    streams.append(s)\n\n                while True:\n                    ready_list = []\n                    for s in streams:\n                        if recognizer.is_ready(s):\n                            ready_list.append(s)\n                    if len(ready_list) == 0:\n                        break\n                    recognizer.decode_streams(ready_list)\n\n                results = [recognizer.get_result(s) for s in streams]\n                for wave_filename, result in zip(waves, results):\n                    print(f\"{wave_filename}\\n{result}\")\n                    print(\"-\" * 10)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_online_transducer_model_config.py",
    "content": "# sherpa-onnx/python/tests/test_online_transducer_model_config.py\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_online_transducer_model_config_py\n\nimport unittest\n\nimport _sherpa_onnx\n\n\nclass TestOnlineTransducerModelConfig(unittest.TestCase):\n    def test_constructor(self):\n        config = _sherpa_onnx.OnlineTransducerModelConfig(\n            encoder=\"encoder.onnx\",\n            decoder=\"decoder.onnx\",\n            joiner=\"joiner.onnx\",\n        )\n        assert config.encoder == \"encoder.onnx\", config.encoder\n        assert config.decoder == \"decoder.onnx\", config.decoder\n        assert config.joiner == \"joiner.onnx\", config.joiner\n        print(config)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_speaker_recognition.py",
    "content": "# sherpa-onnx/python/tests/test_speaker_recognition.py\n#\n# Copyright (c)  2024  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_speaker_recognition_py\n\nimport unittest\nimport wave\nfrom collections import defaultdict\nfrom pathlib import Path\nfrom typing import Tuple\n\nimport numpy as np\nimport sherpa_onnx\n\nd = \"/tmp/sr-models\"\n\n\ndef read_wave(wave_filename: str) -> Tuple[np.ndarray, int]:\n    \"\"\"\n    Args:\n      wave_filename:\n        Path to a wave file. It should be single channel and each sample should\n        be 16-bit. Its sample rate does not need to be 16kHz.\n    Returns:\n      Return a tuple containing:\n       - A 1-D array of dtype np.float32 containing the samples, which are\n       normalized to the range [-1, 1].\n       - sample rate of the wave file\n    \"\"\"\n\n    with wave.open(wave_filename) as f:\n        assert f.getnchannels() == 1, f.getnchannels()\n        assert f.getsampwidth() == 2, f.getsampwidth()  # it is in bytes\n        num_samples = f.getnframes()\n        samples = f.readframes(num_samples)\n        samples_int16 = np.frombuffer(samples, dtype=np.int16)\n        samples_float32 = samples_int16.astype(np.float32)\n\n        samples_float32 = samples_float32 / 32768\n        return samples_float32, f.getframerate()\n\n\ndef load_speaker_embedding_model(model_filename):\n    config = sherpa_onnx.SpeakerEmbeddingExtractorConfig(\n        model=model_filename,\n        num_threads=1,\n        debug=True,\n        provider=\"cpu\",\n    )\n    if not config.validate():\n        raise ValueError(f\"Invalid config. 
{config}\")\n    extractor = sherpa_onnx.SpeakerEmbeddingExtractor(config)\n    return extractor\n\n\ndef test_zh_models(model_filename: str, threshold: float = 0.5):\n    model_filename = str(model_filename)\n    if \"en\" in model_filename:\n        print(f\"skip {model_filename}\")\n        return\n    extractor = load_speaker_embedding_model(model_filename)\n    filenames = [\n        \"leijun-sr-1\",\n        \"leijun-sr-2\",\n        \"fangjun-sr-1\",\n        \"fangjun-sr-2\",\n        \"fangjun-sr-3\",\n    ]\n    tmp = defaultdict(list)\n    for filename in filenames:\n        print(filename)\n        name = filename.split(\"-\", maxsplit=1)[0]\n        data, sample_rate = read_wave(f\"/tmp/sr-models/sr-data/enroll/{filename}.wav\")\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=data)\n        stream.input_finished()\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        tmp[name].append(embedding)\n\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n    for name, embedding_list in tmp.items():\n        print(name, len(embedding_list))\n        embedding = sum(embedding_list) / len(embedding_list)\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    filenames = [\n        \"leijun-test-sr-1\",\n        \"leijun-test-sr-2\",\n        \"leijun-test-sr-3\",\n        \"fangjun-test-sr-1\",\n        \"fangjun-test-sr-2\",\n    ]\n    for filename in filenames:\n        name = filename.split(\"-\", maxsplit=1)[0]\n        data, sample_rate = read_wave(f\"/tmp/sr-models/sr-data/test/{filename}.wav\")\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=data)\n        stream.input_finished()\n        assert extractor.is_ready(stream)\n        embedding 
= extractor.compute(stream)\n        embedding = np.array(embedding)\n        status = manager.verify(name, embedding, threshold=threshold)\n        if not status:\n            raise RuntimeError(f\"Failed to verify {name} with wave {filename}.wav\")\n\n        ans = manager.search(embedding, threshold=threshold)\n        assert ans == name, (name, ans)\n\n\ndef test_en_and_zh_models(model_filename: str, threshold: float = 0.5):\n    model_filename = str(model_filename)\n    extractor = load_speaker_embedding_model(model_filename)\n    manager = sherpa_onnx.SpeakerEmbeddingManager(extractor.dim)\n\n    filenames = [\n        \"speaker1_a_cn_16k\",\n        \"speaker2_a_cn_16k\",\n        \"speaker1_a_en_16k\",\n        \"speaker2_a_en_16k\",\n    ]\n    is_en = \"en\" in model_filename\n    for filename in filenames:\n        if is_en and \"cn\" in filename:\n            continue\n\n        if not is_en and \"en\" in filename:\n            continue\n\n        name = filename.rsplit(\"_\", maxsplit=1)[0]\n        data, sample_rate = read_wave(\n            f\"/tmp/sr-models/sr-data/test/3d-speaker/{filename}.wav\"\n        )\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=data)\n        stream.input_finished()\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n\n        status = manager.add(name, embedding)\n        if not status:\n            raise RuntimeError(f\"Failed to register speaker {name}\")\n\n    filenames = [\n        \"speaker1_b_cn_16k\",\n        \"speaker1_b_en_16k\",\n    ]\n    for filename in filenames:\n        if is_en and \"cn\" in filename:\n            continue\n\n        if not is_en and \"en\" in filename:\n            continue\n        print(filename)\n        name = filename.rsplit(\"_\", maxsplit=1)[0]\n        name = name.replace(\"b_cn\", \"a_cn\")\n        name = name.replace(\"b_en\", 
\"a_en\")\n        print(name)\n\n        data, sample_rate = read_wave(\n            f\"/tmp/sr-models/sr-data/test/3d-speaker/{filename}.wav\"\n        )\n        stream = extractor.create_stream()\n        stream.accept_waveform(sample_rate=sample_rate, waveform=data)\n        stream.input_finished()\n        assert extractor.is_ready(stream)\n        embedding = extractor.compute(stream)\n        embedding = np.array(embedding)\n        status = manager.verify(name, embedding, threshold=threshold)\n        if not status:\n            raise RuntimeError(\n                f\"Failed to verify {name} with wave {filename}.wav. model: {model_filename}\"\n            )\n\n        ans = manager.search(embedding, threshold=threshold)\n        assert ans == name, (name, ans)\n\n\nclass TestSpeakerRecognition(unittest.TestCase):\n    def test_wespeaker_models(self):\n        model_dir = Path(d) / \"wespeaker\"\n        if not model_dir.is_dir():\n            print(f\"{model_dir} does not exist - skip it\")\n            return\n        for filename in model_dir.glob(\"*.onnx\"):\n            print(filename)\n            threshold = 0.5\n\n            test_zh_models(filename, threshold)\n\n            if \"wespeaker_en_voxceleb_CAM++_LM.onnx\" in str(filename):\n                threshold = 0.3\n            test_en_and_zh_models(filename, threshold)\n\n    def _test_3dpeaker_models(self):\n        model_dir = Path(d) / \"3dspeaker\"\n        if not model_dir.is_dir():\n            print(f\"{model_dir} does not exist - skip it\")\n            return\n        for filename in model_dir.glob(\"*.onnx\"):\n            print(filename)\n            test_en_and_zh_models(filename)\n\n    def test_nemo_models(self):\n        model_dir = Path(d) / \"nemo\"\n        if not model_dir.is_dir():\n            print(f\"{model_dir} does not exist - skip it\")\n            return\n        for filename in model_dir.glob(\"*.onnx\"):\n            print(filename)\n            
test_en_and_zh_models(filename)\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/python/tests/test_text2token.py",
    "content": "# sherpa-onnx/python/tests/test_text2token.py\n#\n# Copyright (c)  2023  Xiaomi Corporation\n#\n# To run this single test, use\n#\n#  ctest --verbose -R  test_text2token_py\n\nimport unittest\nfrom pathlib import Path\n\nimport sherpa_onnx\n\nd = \"/tmp/sherpa-test-data\"\n# Please refer to\n# https://github.com/pkufool/sherpa-test-data\n# to download test data for testing\n\n\nclass TestText2Token(unittest.TestCase):\n    def test_bpe(self):\n        tokens = f\"{d}/text2token/tokens_en.txt\"\n        bpe_model = f\"{d}/text2token/bpe_en.model\"\n\n        if not Path(tokens).is_file() or not Path(bpe_model).is_file():\n            print(\n                f\"No test data found, skipping test_bpe().\\n\"\n                f\"You can download the test data by: \\n\"\n                f\"git clone https://github.com/pkufool/sherpa-test-data.git /tmp/sherpa-test-data\"\n            )\n            return\n\n        texts = [\"HELLO WORLD\", \"I LOVE YOU\"]\n        encoded_texts = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"bpe\",\n            bpe_model=bpe_model,\n        )\n        assert encoded_texts == [\n            [\"▁HE\", \"LL\", \"O\", \"▁WORLD\"],\n            [\"▁I\", \"▁LOVE\", \"▁YOU\"],\n        ], encoded_texts\n\n        encoded_ids = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"bpe\",\n            bpe_model=bpe_model,\n            output_ids=True,\n        )\n        assert encoded_ids == [[22, 58, 24, 425], [19, 370, 47]], encoded_ids\n\n    def test_cjkchar(self):\n        tokens = f\"{d}/text2token/tokens_cn.txt\"\n\n        if not Path(tokens).is_file():\n            print(\n                f\"No test data found, skipping test_cjkchar().\\n\"\n                f\"You can download the test data by: \\n\"\n                f\"git clone https://github.com/pkufool/sherpa-test-data.git /tmp/sherpa-test-data\"\n            )\n        
    return\n\n        texts = [\"世界人民大团结\", \"中国 VS 美国\"]\n        encoded_texts = sherpa_onnx.text2token(\n            texts, tokens=tokens, tokens_type=\"cjkchar\"\n        )\n        assert encoded_texts == [\n            [\"世\", \"界\", \"人\", \"民\", \"大\", \"团\", \"结\"],\n            [\"中\", \"国\", \"V\", \"S\", \"美\", \"国\"],\n        ], encoded_texts\n        encoded_ids = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"cjkchar\",\n            output_ids=True,\n        )\n        assert encoded_ids == [\n            [379, 380, 72, 874, 93, 1251, 489],\n            [262, 147, 3423, 2476, 21, 147],\n        ], encoded_ids\n\n    def test_cjkchar_bpe(self):\n        tokens = f\"{d}/text2token/tokens_mix.txt\"\n        bpe_model = f\"{d}/text2token/bpe_mix.model\"\n\n        if not Path(tokens).is_file() or not Path(bpe_model).is_file():\n            print(\n                f\"No test data found, skipping test_cjkchar_bpe().\\n\"\n                f\"You can download the test data by: \\n\"\n                f\"git clone https://github.com/pkufool/sherpa-test-data.git /tmp/sherpa-test-data\"\n            )\n            return\n\n        texts = [\"世界人民 GOES TOGETHER\", \"中国 GOES WITH 美国\"]\n        encoded_texts = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"cjkchar+bpe\",\n            bpe_model=bpe_model,\n        )\n        assert encoded_texts == [\n            [\"世\", \"界\", \"人\", \"民\", \"▁GO\", \"ES\", \"▁TOGETHER\"],\n            [\"中\", \"国\", \"▁GO\", \"ES\", \"▁WITH\", \"美\", \"国\"],\n        ], encoded_texts\n        encoded_ids = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"cjkchar+bpe\",\n            bpe_model=bpe_model,\n            output_ids=True,\n        )\n        assert encoded_ids == [\n            [1368, 1392, 557, 680, 275, 178, 475],\n            [685, 736, 275, 178, 179, 921, 736],\n 
       ], encoded_ids\n\n    def test_phone_ppinyin(self):\n        tokens = f\"{d}/text2token/tokens_phone_ppinyin.txt\"\n        lexicon = f\"{d}/text2token/en.phone\"\n\n        if not Path(tokens).is_file() or not Path(lexicon).is_file():\n            print(\n                f\"No test data found, skipping test_phone_ppinyin().\\n\"\n                f\"You can download the test data by: \\n\"\n                f\"git clone https://github.com/pkufool/sherpa-test-data.git /tmp/sherpa-test-data\"\n            )\n            return\n\n        texts = [\"世界人民 GOES TOGETHER\", \"中国 GOES WITH 美国\"]\n        encoded_texts = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"phone+ppinyin\",\n            lexicon=lexicon,\n        )\n        assert encoded_texts == [\n            [\n                \"sh\",\n                \"ì\",\n                \"j\",\n                \"iè\",\n                \"r\",\n                \"én\",\n                \"m\",\n                \"ín\",\n                \"G\",\n                \"OW1\",\n                \"Z\",\n                \"T\",\n                \"AH0\",\n                \"G\",\n                \"EH1\",\n                \"DH\",\n                \"ER0\",\n            ],\n            [\n                \"zh\",\n                \"ōng\",\n                \"g\",\n                \"uó\",\n                \"G\",\n                \"OW1\",\n                \"Z\",\n                \"W\",\n                \"IH1\",\n                \"DH\",\n                \"m\",\n                \"ěi\",\n                \"g\",\n                \"uó\",\n            ],\n        ], encoded_texts\n\n        encoded_ids = sherpa_onnx.text2token(\n            texts,\n            tokens=tokens,\n            tokens_type=\"phone+ppinyin\",\n            lexicon=lexicon,\n            output_ids=True,\n        )\n        assert encoded_ids == [\n            [\n                139,\n                203,\n                
127,\n                107,\n                137,\n                200,\n                130,\n                207,\n                35,\n                50,\n                70,\n                59,\n                9,\n                35,\n                26,\n                24,\n                28,\n            ],\n            [182, 241, 87, 163, 35, 50, 70, 68, 38, 24, 130, 231, 87, 163],\n        ], encoded_ids\n\n\nif __name__ == \"__main__\":\n    unittest.main()\n"
  },
  {
    "path": "sherpa-onnx/rust/.gitignore",
    "content": "notes.md\ntarget\n"
  },
  {
    "path": "sherpa-onnx/rust/.rustfmt.toml",
    "content": "# Put each method in a chain on its own line\nchain_width = 0\n\n# Optional: make sure calls break vertically\nfn_call_width = 60\n\n# Optional: control general line width\nmax_width = 100\n"
  },
  {
    "path": "sherpa-onnx/rust/Cargo.toml",
    "content": "[workspace]\nresolver = \"2\"\nmembers = [\"sherpa-onnx\",\"sherpa-onnx-sys\",]\n"
  },
  {
    "path": "sherpa-onnx/rust/check.sh",
    "content": "#!/usr/bin/env bash\nset -euo pipefail\n\necho \"=== Building sherpa-onnx ===\"\ncargo build -p sherpa-onnx\n\necho \"=== Checking code with cargo check ===\"\ncargo check -p sherpa-onnx\n\necho \"=== Running clippy for lints ===\"\ncargo clippy -p sherpa-onnx -- -D warnings\n\necho \"=== Running tests ===\"\ncargo test -p sherpa-onnx\n\necho \"All checks passed for sherpa-onnx ✅\"\n"
  },
  {
    "path": "sherpa-onnx/rust/publish.sh",
    "content": "#!/usr/bin/env bash\n\npushd sherpa-onnx-sys\n\ncp -v ../../../README.md ./\ncp -v ../../../LICENSE ./\n\npopd\n\npushd sherpa-onnx\n\ncp -v ../../../README.md ./\ncp -v ../../../LICENSE ./\n\npopd\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/Cargo.toml",
    "content": "[package]\nname = \"sherpa-onnx\"\nversion = \"1.12.31\"\nedition = \"2021\"\ndescription = \"Safe Rust wrapper for sherpa-onnx speech recognition toolkit\"\nlicense = \"Apache-2.0\"\nrepository = \"https://github.com/k2-fsa/sherpa-onnx\"\ndocumentation = \"https://docs.rs/sherpa-onnx\"\nreadme = \"README.md\"  # make sure this is inside the crate folder\n\nkeywords = [\"speech\", \"speech-to-text\", \"stt\", \"onnx\", \"asr\"]\ncategories = [\"api-bindings\", \"multimedia::audio\"]\n\n# Explicitly list files to include in crates.io\ninclude = [\n    \"src/**\",\n    \"Cargo.toml\",\n    \"README.md\",\n    \"LICENSE*\",\n]\n\n[dependencies]\nsherpa-onnx-sys = { path = \"../sherpa-onnx-sys\", version = \"1.12.31\" }\nserde = { version = \"1.0\", features = [\"derive\"] }\nserde_json = \"1.0\"\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/audio_tagging.rs",
    "content": "//! Offline audio tagging.\n//!\n//! This API classifies complete audio clips and returns the most likely events.\n//! See:\n//!\n//! - `rust-api-examples/examples/audio_tagging_zipformer.rs`\n//! - `rust-api-examples/examples/audio_tagging_ced.rs`\n\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\n\n#[derive(Clone, Debug, Default)]\n/// Zipformer audio tagging model path.\npub struct OfflineZipformerAudioTaggingModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineZipformerAudioTaggingModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineZipformerAudioTaggingModelConfig {\n        sys::OfflineZipformerAudioTaggingModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Model-level configuration for audio tagging.\n///\n/// Configure either `zipformer` or `ced` for a concrete model package.\npub struct AudioTaggingModelConfig {\n    pub zipformer: OfflineZipformerAudioTaggingModelConfig,\n    pub ced: Option<String>,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for AudioTaggingModelConfig {\n    fn default() -> Self {\n        Self {\n            zipformer: Default::default(),\n            ced: None,\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl AudioTaggingModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::AudioTaggingModelConfig {\n        sys::AudioTaggingModelConfig {\n            zipformer: self\n                .zipformer\n                .to_sys(cstrings),\n            ced: to_c_ptr(&self.ced, cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Top-level configuration for 
[`AudioTagging`].\npub struct AudioTaggingConfig {\n    pub model: AudioTaggingModelConfig,\n    pub labels: Option<String>,\n    pub top_k: i32,\n}\n\nimpl Default for AudioTaggingConfig {\n    fn default() -> Self {\n        Self {\n            model: Default::default(),\n            labels: None,\n            top_k: 5,\n        }\n    }\n}\n\nimpl AudioTaggingConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::AudioTaggingConfig {\n        sys::AudioTaggingConfig {\n            model: self\n                .model\n                .to_sys(cstrings),\n            labels: to_c_ptr(&self.labels, cstrings),\n            top_k: self.top_k,\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// One predicted audio event.\npub struct AudioEvent {\n    pub name: String,\n    pub index: i32,\n    pub prob: f32,\n}\n\n/// Offline audio tagger.\npub struct AudioTagging {\n    ptr: *const sys::AudioTagging,\n}\n\nunsafe impl Send for AudioTagging {}\n\nimpl AudioTagging {\n    /// Create a tagger from `config`.\n    pub fn create(config: &AudioTaggingConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateAudioTagging(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Create a stream that accepts one complete clip.\n    pub fn create_stream(&self) -> AudioTaggingOfflineStream {\n        let ptr = unsafe { sys::SherpaOnnxAudioTaggingCreateOfflineStream(self.ptr) };\n        AudioTaggingOfflineStream { ptr }\n    }\n\n    /// Compute the top `top_k` events for the provided stream.\n    pub fn compute(&self, stream: &AudioTaggingOfflineStream, top_k: i32) -> Vec<AudioEvent> {\n        unsafe {\n            let p = sys::SherpaOnnxAudioTaggingCompute(self.ptr, stream.ptr, top_k);\n            if p.is_null() {\n                return Vec::new();\n            }\n\n   
         let mut ans = Vec::new();\n            let mut cur = p;\n            while !(*cur).is_null() {\n                let event = &*(*cur);\n                let name = if event\n                    .name\n                    .is_null()\n                {\n                    String::new()\n                } else {\n                    CStr::from_ptr(event.name)\n                        .to_string_lossy()\n                        .into_owned()\n                };\n                ans.push(AudioEvent {\n                    name,\n                    index: event.index,\n                    prob: event.prob,\n                });\n                cur = cur.add(1);\n            }\n\n            sys::SherpaOnnxAudioTaggingFreeResults(p);\n            ans\n        }\n    }\n}\n\nimpl Drop for AudioTagging {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroyAudioTagging(self.ptr);\n            }\n        }\n    }\n}\n\n/// Input stream for offline audio tagging.\npub struct AudioTaggingOfflineStream {\n    ptr: *const sys::OfflineStream,\n}\n\nimpl AudioTaggingOfflineStream {\n    /// Append waveform samples to the clip.\n    pub fn accept_waveform(&self, sample_rate: i32, samples: &[f32]) {\n        unsafe {\n            sys::SherpaOnnxAcceptWaveformOffline(\n                self.ptr,\n                sample_rate,\n                samples.as_ptr(),\n                samples.len() as i32,\n            )\n        }\n    }\n}\n\nimpl Drop for AudioTaggingOfflineStream {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyOfflineStream(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/display.rs",
    "content": "//! Small terminal display helper for streaming ASR demos.\n\nuse std::time::{Duration, Instant};\n\n/// Stores finalized sentences and the current partial hypothesis for terminal UIs.\n#[derive(Debug)]\npub struct DisplayManager {\n    sentences: Vec<String>,\n    current_text: String,\n    last_render: Instant,\n}\n\nimpl DisplayManager {\n    /// Create an empty display manager.\n    pub fn new() -> Self {\n        Self {\n            sentences: Vec::new(),\n            current_text: String::new(),\n            last_render: Instant::now(),\n        }\n    }\n\n    /// Replace the current partial text shown in the display.\n    pub fn update_text(&mut self, text: &str) {\n        self.current_text = text.to_string();\n    }\n\n    /// Move the current partial text into the finalized sentence list.\n    pub fn finalize_sentence(&mut self) {\n        let trimmed = self\n            .current_text\n            .trim();\n        if !trimmed.is_empty() {\n            self.sentences\n                .push(trimmed.to_string());\n        }\n        self.current_text\n            .clear();\n    }\n\n    /// Render the current state to stdout.\n    ///\n    /// Rendering is throttled slightly to reduce terminal flicker.\n    pub fn render(&mut self) {\n        // Throttle rendering to reduce flicker (200ms)\n        if self\n            .last_render\n            .elapsed()\n            < Duration::from_millis(200)\n        {\n            return;\n        }\n        self.last_render = Instant::now();\n\n        // Clear screen (ANSI escape)\n        print!(\"\\x1B[2J\\x1B[1;1H\");\n        println!(\"=== Speech Recognition with Next-gen Kaldi ===\");\n        println!(\"-----------------------------------------------\");\n\n        for (i, s) in self\n            .sentences\n            .iter()\n            .enumerate()\n        {\n            println!(\"{}: {}\", i + 1, s);\n        }\n\n        if !self\n            .current_text\n            .is_empty()\n  
      {\n            println!(\"-----------------------------------------------\");\n            println!(\"Recognizing: {}\", self.current_text);\n        }\n    }\n\n    /// Return `true` if at least one sentence has been finalized.\n    pub fn has_sentences(&self) -> bool {\n        !self\n            .sentences\n            .is_empty()\n    }\n\n    /// Borrow the current partial text.\n    pub fn current_text(&self) -> &str {\n        &self.current_text\n    }\n}\n\nimpl Default for DisplayManager {\n    fn default() -> Self {\n        Self::new()\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/kws.rs",
    "content": "//! Streaming keyword spotting.\n//!\n//! This module detects predefined or per-stream override keywords from an\n//! online ASR model. See\n//! [`rust-api-examples/examples/keyword_spotter.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/keyword_spotter.rs)\n//! for a complete example.\n//!\n//! # Example\n//!\n//! ```no_run\n//! use sherpa_onnx::{KeywordSpotter, KeywordSpotterConfig, Wave};\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//! let mut config = KeywordSpotterConfig::default();\n//! config.model_config.transducer.encoder = Some(\"./kws/encoder.onnx\".into());\n//! config.model_config.transducer.decoder = Some(\"./kws/decoder.onnx\".into());\n//! config.model_config.transducer.joiner = Some(\"./kws/joiner.onnx\".into());\n//! config.model_config.tokens = Some(\"./kws/tokens.txt\".into());\n//! config.keywords_file = Some(\"./keywords.txt\".into());\n//!\n//! let kws = KeywordSpotter::create(&config).expect(\"create keyword spotter\");\n//! let stream = kws.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! stream.input_finished();\n//!\n//! while kws.is_ready(&stream) {\n//!     kws.decode(&stream);\n//! }\n//!\n//! if let Some(result) = kws.get_result(&stream) {\n//!     println!(\"{}\", result.keyword);\n//! }\n//! 
```\n\nuse crate::online_asr::{OnlineModelConfig, OnlineStream};\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{c_char, CStr, CString};\nuse std::slice;\n\nfn c_ptr_to_string(ptr: *const c_char) -> String {\n    if ptr.is_null() {\n        String::new()\n    } else {\n        unsafe {\n            CStr::from_ptr(ptr)\n                .to_string_lossy()\n                .into_owned()\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Configuration for [`KeywordSpotter`].\npub struct KeywordSpotterConfig {\n    pub feat_config: sys::FeatureConfig,\n    pub model_config: OnlineModelConfig,\n    pub max_active_paths: i32,\n    pub num_trailing_blanks: i32,\n    pub keywords_score: f32,\n    pub keywords_threshold: f32,\n    pub keywords_file: Option<String>,\n    pub keywords_buf: Option<String>,\n}\n\nimpl Default for KeywordSpotterConfig {\n    fn default() -> Self {\n        Self {\n            feat_config: sys::FeatureConfig {\n                sample_rate: 16000,\n                feature_dim: 80,\n            },\n            model_config: Default::default(),\n            max_active_paths: 4,\n            num_trailing_blanks: 1,\n            keywords_score: 1.0,\n            keywords_threshold: 0.25,\n            keywords_file: None,\n            keywords_buf: None,\n        }\n    }\n}\n\nimpl KeywordSpotterConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::KeywordSpotterConfig {\n        sys::KeywordSpotterConfig {\n            feat_config: self.feat_config,\n            model_config: self\n                .model_config\n                .to_sys(cstrings),\n            max_active_paths: self.max_active_paths,\n            num_trailing_blanks: self.num_trailing_blanks,\n            keywords_score: self.keywords_score,\n            keywords_threshold: self.keywords_threshold,\n            keywords_file: to_c_ptr(&self.keywords_file, cstrings),\n            keywords_buf: to_c_ptr(&self.keywords_buf, cstrings),\n            
keywords_buf_size: self\n                .keywords_buf\n                .as_ref()\n                .map_or(0, |s| s.len() as i32),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Decoded keyword spotting result for one stream.\npub struct KeywordResult {\n    pub keyword: String,\n    pub tokens: String,\n    pub tokens_arr: Vec<String>,\n    pub timestamps: Vec<f32>,\n    pub start_time: f32,\n    pub json: String,\n}\n\n/// Streaming keyword spotter.\npub struct KeywordSpotter {\n    ptr: *const sys::KeywordSpotter,\n}\n\nunsafe impl Send for KeywordSpotter {}\n\nimpl KeywordSpotter {\n    /// Create a keyword spotter from [`KeywordSpotterConfig`].\n    pub fn create(config: &KeywordSpotterConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateKeywordSpotter(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Create a stream that uses the keywords configured in [`KeywordSpotterConfig`].\n    pub fn create_stream(&self) -> OnlineStream {\n        let ptr = unsafe { sys::SherpaOnnxCreateKeywordStream(self.ptr) };\n        OnlineStream { ptr }\n    }\n\n    /// Create a stream that uses `keywords` instead of the configured keyword list.\n    ///\n    /// # Panics\n    ///\n    /// Panics if `keywords` contains an interior NUL byte.\n    pub fn create_stream_with_keywords(&self, keywords: &str) -> OnlineStream {\n        let keywords =\n            CString::new(keywords).expect(\"keywords must not contain interior NUL bytes\");\n        let ptr =\n            unsafe { sys::SherpaOnnxCreateKeywordStreamWithKeywords(self.ptr, keywords.as_ptr()) };\n        OnlineStream { ptr }\n    }\n\n    /// Return `true` if `stream` has enough audio for another decode step.\n    pub fn is_ready(&self, stream: &OnlineStream) -> bool {\n        unsafe { sys::SherpaOnnxIsKeywordStreamReady(self.ptr, stream.ptr) != 0 }\n    }\n\n    /// Decode one incremental step for `stream`.\n    pub fn decode(&self, stream: &OnlineStream) {\n      
  unsafe { sys::SherpaOnnxDecodeKeywordStream(self.ptr, stream.ptr) }\n    }\n\n    /// Decode multiple streams in one batch.\n    pub fn decode_multiple_streams(&self, streams: &[&OnlineStream]) {\n        let ptrs: Vec<*const sys::OnlineStream> = streams\n            .iter()\n            .map(|s| s.ptr)\n            .collect();\n        unsafe {\n            sys::SherpaOnnxDecodeMultipleKeywordStreams(self.ptr, ptrs.as_ptr(), ptrs.len() as i32)\n        }\n    }\n\n    /// Reset the detector state for `stream`.\n    pub fn reset(&self, stream: &OnlineStream) {\n        unsafe { sys::SherpaOnnxResetKeywordStream(self.ptr, stream.ptr) }\n    }\n\n    /// Get the structured keyword spotting result for `stream`.\n    pub fn get_result(&self, stream: &OnlineStream) -> Option<KeywordResult> {\n        unsafe {\n            let p = sys::SherpaOnnxGetKeywordResult(self.ptr, stream.ptr);\n            if p.is_null() {\n                return None;\n            }\n\n            let result = &*p;\n            let tokens_arr = if result\n                .tokens_arr\n                .is_null()\n                || result.count <= 0\n            {\n                Vec::new()\n            } else {\n                slice::from_raw_parts(result.tokens_arr, result.count as usize)\n                    .iter()\n                    .map(|item| c_ptr_to_string(*item))\n                    .collect()\n            };\n\n            let timestamps = if result\n                .timestamps\n                .is_null()\n                || result.count <= 0\n            {\n                Vec::new()\n            } else {\n                slice::from_raw_parts(result.timestamps, result.count as usize).to_vec()\n            };\n\n            let ans = KeywordResult {\n                keyword: c_ptr_to_string(result.keyword),\n                tokens: c_ptr_to_string(result.tokens),\n                tokens_arr,\n                timestamps,\n                start_time: result.start_time,\n           
     json: c_ptr_to_string(result.json),\n            };\n\n            sys::SherpaOnnxDestroyKeywordResult(p);\n            Some(ans)\n        }\n    }\n\n    /// Get the result for `stream` as a JSON string.\n    pub fn get_result_as_json(&self, stream: &OnlineStream) -> Option<String> {\n        unsafe {\n            let p = sys::SherpaOnnxGetKeywordResultAsJson(self.ptr, stream.ptr);\n            if p.is_null() {\n                return None;\n            }\n\n            let ans = CStr::from_ptr(p)\n                .to_string_lossy()\n                .into_owned();\n            sys::SherpaOnnxFreeKeywordResultJson(p);\n            Some(ans)\n        }\n    }\n}\n\nimpl Drop for KeywordSpotter {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroyKeywordSpotter(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/lib.rs",
    "content": "//! Safe Rust bindings for the public sherpa-onnx inference APIs.\n//!\n//! This crate wraps the sherpa-onnx C API with RAII-owned Rust types and\n//! idiomatic configuration structs. The main feature families are:\n//!\n//! - offline ASR through [`OfflineRecognizer`]\n//! - streaming ASR through [`OnlineRecognizer`]\n//! - offline text-to-speech through [`OfflineTts`]\n//! - voice activity detection through [`VoiceActivityDetector`]\n//! - speaker embeddings and diarization\n//! - online and offline punctuation\n//! - offline and streaming speech denoising\n//! - audio tagging\n//! - keyword spotting through [`KeywordSpotter`]\n//! - spoken language identification\n//! - WAV I/O helpers through [`Wave`] and [`write()`]\n//!\n//! # How the Rust API is organized\n//!\n//! Most APIs follow the same pattern:\n//!\n//! 1. Start with a `*Config` value and fill the fields for exactly one model\n//!    family.\n//! 2. Call `create()` to construct the runtime object.\n//! 3. Create a stream if the API is stream-based.\n//! 4. Feed audio or text, then fetch results with the provided accessor methods.\n//!\n//! All runtime wrappers automatically free their underlying C resources on drop.\n//!\n//! # Examples\n//!\n//! The repository contains end-to-end Rust examples under\n//! [`rust-api-examples/examples/`](https://github.com/k2-fsa/sherpa-onnx/tree/master/rust-api-examples/examples).\n//! Good entry points are:\n//!\n//! - [`sense_voice.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/sense_voice.rs)\n//! - [`nemo_parakeet.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/nemo_parakeet.rs)\n//! - [`streaming_zipformer.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/streaming_zipformer.rs)\n//! - [`pocket_tts.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/pocket_tts.rs)\n//! - [`silero_vad_remove_silence.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/silero_vad_remove_silence.rs)\n//! 
- [`online_punctuation.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/online_punctuation.rs)\n//! - [`offline_punctuation.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/offline_punctuation.rs)\n//! - [`keyword_spotter.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/keyword_spotter.rs)\n//! - [`spoken_language_identification.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/spoken_language_identification.rs)\n//! - [`offline_speaker_diarization.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/offline_speaker_diarization.rs)\n//! - [`speaker_embedding_manager.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/speaker_embedding_manager.rs)\n//!\n//! # Offline recognition example\n//!\n//! ```no_run\n//! use sherpa_onnx::{\n//!     OfflineRecognizer, OfflineRecognizerConfig, OfflineSenseVoiceModelConfig, Wave,\n//! };\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//!\n//! let mut config = OfflineRecognizerConfig::default();\n//! config.model_config.sense_voice = OfflineSenseVoiceModelConfig {\n//!     model: Some(\n//!         \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx\".into(),\n//!     ),\n//!     language: Some(\"auto\".into()),\n//!     use_itn: true,\n//! };\n//! config.model_config.tokens = Some(\n//!     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt\".into(),\n//! );\n//!\n//! let recognizer = OfflineRecognizer::create(&config).expect(\"create recognizer\");\n//! let stream = recognizer.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! recognizer.decode(&stream);\n//!\n//! let result = stream.get_result().expect(\"result\");\n//! println!(\"{}\", result.text);\n//! ```\n//!\n//! # Streaming recognition example\n//!\n//! ```no_run\n//! 
use sherpa_onnx::{OnlineRecognizer, OnlineRecognizerConfig, Wave};\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//!\n//! let mut config = OnlineRecognizerConfig::default();\n//! config.model_config.transducer.encoder = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\".into(),\n//! );\n//! config.model_config.transducer.decoder = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\".into(),\n//! );\n//! config.model_config.transducer.joiner = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\".into(),\n//! );\n//! config.model_config.tokens = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\".into(),\n//! );\n//! config.enable_endpoint = true;\n//! config.decoding_method = Some(\"greedy_search\".into());\n//!\n//! let recognizer = OnlineRecognizer::create(&config).expect(\"create recognizer\");\n//! let stream = recognizer.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! stream.input_finished();\n//! while recognizer.is_ready(&stream) {\n//!     recognizer.decode(&stream);\n//! }\n//! ```\n//!\n//! # TTS example\n//!\n//! ```no_run\n//! use sherpa_onnx::{OfflineTts, OfflineTtsConfig, OfflineTtsModelConfig, OfflineTtsPocketModelConfig};\n//!\n//! let config = OfflineTtsConfig {\n//!     model: OfflineTtsModelConfig {\n//!         pocket: OfflineTtsPocketModelConfig {\n//!             lm_flow: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\".into()),\n//!             lm_main: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\".into()),\n//!             encoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\".into()),\n//!             decoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\".into()),\n//!             
text_conditioner: Some(\n//!                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\".into(),\n//!             ),\n//!             vocab_json: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\".into()),\n//!             token_scores_json: Some(\n//!                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\".into(),\n//!             ),\n//!             ..Default::default()\n//!         },\n//!         ..Default::default()\n//!     },\n//!     ..Default::default()\n//! };\n//!\n//! let tts = OfflineTts::create(&config).expect(\"create tts\");\n//! println!(\"{}\", tts.sample_rate());\n//! ```\nmod audio_tagging;\nmod display;\nmod kws;\nmod offline_asr;\nmod offline_punctuation;\nmod offline_speaker_diarization;\nmod offline_speech_denoiser;\nmod online_asr;\nmod online_punctuation;\nmod online_speech_denoiser;\nmod speaker_embedding;\nmod spoken_language_identification;\nmod tts;\nmod utils;\nmod vad;\nmod wave;\n\npub use audio_tagging::*;\npub use display::*;\npub use kws::*;\npub use offline_asr::*;\npub use offline_punctuation::*;\npub use offline_speaker_diarization::*;\npub use offline_speech_denoiser::*;\npub use online_asr::*;\npub use online_punctuation::*;\npub use online_speech_denoiser::*;\npub use speaker_embedding::*;\npub use spoken_language_identification::*;\npub use tts::*;\npub use utils::*;\npub use vad::*;\npub use wave::*;\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/offline_asr.rs",
    "content": "//! Offline speech recognition.\n//!\n//! The Rust wrapper exposes the same model families as the native C API. In\n//! typical use, configure exactly one model family inside [`OfflineModelConfig`]\n//! and then create an [`OfflineRecognizer`].\n//!\n//! Repository examples:\n//!\n//! - `rust-api-examples/examples/sense_voice.rs`\n//! - `rust-api-examples/examples/nemo_parakeet.rs`\n//! - `rust-api-examples/examples/moonshine_v2.rs`\n//! - `rust-api-examples/examples/fire_red_asr_ctc.rs`\n//!\n//! ```no_run\n//! use sherpa_onnx::{\n//!     OfflineRecognizer, OfflineRecognizerConfig, OfflineSenseVoiceModelConfig, Wave,\n//! };\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//! let mut config = OfflineRecognizerConfig::default();\n//! config.model_config.sense_voice = OfflineSenseVoiceModelConfig {\n//!     model: Some(\n//!         \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/model.int8.onnx\".into(),\n//!     ),\n//!     language: Some(\"auto\".into()),\n//!     use_itn: true,\n//! };\n//! config.model_config.tokens = Some(\n//!     \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17-int8/tokens.txt\".into(),\n//! );\n//!\n//! let recognizer = OfflineRecognizer::create(&config).expect(\"create recognizer\");\n//! let stream = recognizer.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! recognizer.decode(&stream);\n//! println!(\"{}\", stream.get_result().expect(\"result\").text);\n//! 
```\n\nuse crate::utils::to_c_ptr;\nuse serde::Deserialize;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\n\n#[derive(Clone, Debug, Default)]\n/// Offline transducer model configuration.\n///\n/// This is used for transducer-style models such as the Parakeet example in\n/// `rust-api-examples/examples/nemo_parakeet.rs`.\npub struct OfflineTransducerModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub joiner: Option<String>,\n}\n\nimpl OfflineTransducerModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTransducerModelConfig {\n        sys::OfflineTransducerModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            joiner: to_c_ptr(&self.joiner, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Paraformer model configuration.\npub struct OfflineParaformerModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineParaformerModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineParaformerModelConfig {\n        sys::OfflineParaformerModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline NeMo CTC model configuration.\npub struct OfflineNemoEncDecCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineNemoEncDecCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineNemoEncDecCtcModelConfig {\n        sys::OfflineNemoEncDecCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Whisper model configuration.\npub struct OfflineWhisperModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub language: Option<String>,\n    pub task: Option<String>,\n    pub tail_paddings: i32,\n    pub enable_token_timestamps: bool,\n    
pub enable_segment_timestamps: bool,\n}\n\nimpl OfflineWhisperModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineWhisperModelConfig {\n        sys::OfflineWhisperModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            language: to_c_ptr(&self.language, cstrings),\n            task: to_c_ptr(&self.task, cstrings),\n            tail_paddings: self.tail_paddings,\n            enable_token_timestamps: self.enable_token_timestamps as i32,\n            enable_segment_timestamps: self.enable_segment_timestamps as i32,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Canary model configuration.\npub struct OfflineCanaryModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub src_lang: Option<String>,\n    pub tgt_lang: Option<String>,\n    pub use_pnc: bool,\n}\n\nimpl OfflineCanaryModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineCanaryModelConfig {\n        sys::OfflineCanaryModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            src_lang: to_c_ptr(&self.src_lang, cstrings),\n            tgt_lang: to_c_ptr(&self.tgt_lang, cstrings),\n            use_pnc: self.use_pnc as i32,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline FireRed ASR attention-based encoder-decoder (AED) model configuration.\npub struct OfflineFireRedAsrModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n}\n\nimpl OfflineFireRedAsrModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineFireRedAsrModelConfig {\n        sys::OfflineFireRedAsrModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Moonshine model configuration.\n///\n/// For Moonshine v1, you need 4 models:\n///  - preprocessor, encoder, uncached_decoder, 
cached_decoder\n///\n/// For Moonshine v2, you need 2 models:\n///  - encoder, merged_decoder\npub struct OfflineMoonshineModelConfig {\n    pub preprocessor: Option<String>,\n    pub encoder: Option<String>,\n    pub uncached_decoder: Option<String>,\n    pub cached_decoder: Option<String>,\n    pub merged_decoder: Option<String>,\n}\n\nimpl OfflineMoonshineModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineMoonshineModelConfig {\n        sys::OfflineMoonshineModelConfig {\n            preprocessor: to_c_ptr(&self.preprocessor, cstrings),\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            uncached_decoder: to_c_ptr(&self.uncached_decoder, cstrings),\n            cached_decoder: to_c_ptr(&self.cached_decoder, cstrings),\n            merged_decoder: to_c_ptr(&self.merged_decoder, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline TDNN model configuration.\npub struct OfflineTdnnModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineTdnnModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTdnnModelConfig {\n        sys::OfflineTdnnModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Optional external language model configuration for offline ASR.\npub struct OfflineLMConfig {\n    pub model: Option<String>,\n    pub scale: f32,\n}\n\nimpl Default for OfflineLMConfig {\n    fn default() -> Self {\n        Self {\n            model: None,\n            scale: 1.0,\n        }\n    }\n}\n\nimpl OfflineLMConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineLMConfig {\n        sys::OfflineLMConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            scale: self.scale,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline SenseVoice model configuration.\npub struct 
OfflineSenseVoiceModelConfig {\n    pub model: Option<String>,\n    pub language: Option<String>,\n    pub use_itn: bool,\n}\n\nimpl OfflineSenseVoiceModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineSenseVoiceModelConfig {\n        sys::OfflineSenseVoiceModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            language: to_c_ptr(&self.language, cstrings),\n            use_itn: self.use_itn as i32,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Dolphin model configuration.\npub struct OfflineDolphinModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineDolphinModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineDolphinModelConfig {\n        sys::OfflineDolphinModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline Zipformer CTC model configuration.\npub struct OfflineZipformerCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineZipformerCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineZipformerCtcModelConfig {\n        sys::OfflineZipformerCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline WeNet CTC model configuration.\npub struct OfflineWenetCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineWenetCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineWenetCtcModelConfig {\n        sys::OfflineWenetCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline omnilingual CTC model configuration.\npub struct OfflineOmnilingualAsrCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineOmnilingualAsrCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineOmnilingualAsrCtcModelConfig {\n  
      sys::OfflineOmnilingualAsrCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline MedASR CTC model configuration.\npub struct OfflineMedAsrCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineMedAsrCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineMedAsrCtcModelConfig {\n        sys::OfflineMedAsrCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Offline FireRed ASR CTC model configuration.\npub struct OfflineFireRedAsrCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineFireRedAsrCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineFireRedAsrCtcModelConfig {\n        sys::OfflineFireRedAsrCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Offline FunASR Nano model configuration.\npub struct OfflineFunASRNanoModelConfig {\n    pub encoder_adaptor: Option<String>,\n    pub llm: Option<String>,\n    pub embedding: Option<String>,\n    pub tokenizer: Option<String>,\n    pub system_prompt: Option<String>,\n    pub user_prompt: Option<String>,\n    pub max_new_tokens: i32,\n    pub temperature: f32,\n    pub top_p: f32,\n    pub seed: i32,\n    pub language: Option<String>,\n    pub itn: i32,\n    pub hotwords: Option<String>,\n}\nimpl Default for OfflineFunASRNanoModelConfig {\n    fn default() -> Self {\n        Self {\n            encoder_adaptor: None,\n            llm: None,\n            embedding: None,\n            tokenizer: None,\n            system_prompt: None,\n            user_prompt: None,\n            max_new_tokens: 0,\n            temperature: 1.0,\n            top_p: 1.0,\n            seed: 0,\n            language: None,\n            itn: 0,\n            hotwords: None,\n        }\n    }\n}\nimpl 
OfflineFunASRNanoModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineFunASRNanoModelConfig {\n        sys::OfflineFunASRNanoModelConfig {\n            encoder_adaptor: to_c_ptr(&self.encoder_adaptor, cstrings),\n            llm: to_c_ptr(&self.llm, cstrings),\n            embedding: to_c_ptr(&self.embedding, cstrings),\n            tokenizer: to_c_ptr(&self.tokenizer, cstrings),\n            system_prompt: to_c_ptr(&self.system_prompt, cstrings),\n            user_prompt: to_c_ptr(&self.user_prompt, cstrings),\n            max_new_tokens: self.max_new_tokens,\n            temperature: self.temperature,\n            top_p: self.top_p,\n            seed: self.seed,\n            language: to_c_ptr(&self.language, cstrings),\n            itn: self.itn,\n            hotwords: to_c_ptr(&self.hotwords, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Aggregate model configuration for offline recognition.\n///\n/// Configure exactly one model family for typical use. 
Shared options such as\n/// `tokens`, `provider`, and `num_threads` live here as well.\npub struct OfflineModelConfig {\n    pub transducer: OfflineTransducerModelConfig,\n    pub paraformer: OfflineParaformerModelConfig,\n    pub nemo_ctc: OfflineNemoEncDecCtcModelConfig,\n    pub whisper: OfflineWhisperModelConfig,\n    pub tdnn: OfflineTdnnModelConfig,\n    pub sense_voice: OfflineSenseVoiceModelConfig,\n    pub moonshine: OfflineMoonshineModelConfig,\n    pub fire_red_asr: OfflineFireRedAsrModelConfig,\n    pub dolphin: OfflineDolphinModelConfig,\n    pub zipformer_ctc: OfflineZipformerCtcModelConfig,\n    pub canary: OfflineCanaryModelConfig,\n    pub wenet_ctc: OfflineWenetCtcModelConfig,\n    pub omnilingual: OfflineOmnilingualAsrCtcModelConfig,\n    pub medasr: OfflineMedAsrCtcModelConfig,\n    pub funasr_nano: OfflineFunASRNanoModelConfig,\n    pub fire_red_asr_ctc: OfflineFireRedAsrCtcModelConfig,\n\n    pub tokens: Option<String>,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n    pub model_type: Option<String>,\n    pub modeling_unit: Option<String>,\n    pub bpe_vocab: Option<String>,\n    pub telespeech_ctc: Option<String>,\n}\n\nimpl OfflineModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineModelConfig {\n        sys::OfflineModelConfig {\n            transducer: self\n                .transducer\n                .to_sys(cstrings),\n            paraformer: self\n                .paraformer\n                .to_sys(cstrings),\n            nemo_ctc: self\n                .nemo_ctc\n                .to_sys(cstrings),\n            whisper: self\n                .whisper\n                .to_sys(cstrings),\n            tdnn: self\n                .tdnn\n                .to_sys(cstrings),\n            sense_voice: self\n                .sense_voice\n                .to_sys(cstrings),\n            canary: self\n                .canary\n                .to_sys(cstrings),\n            
fire_red_asr: self\n                .fire_red_asr\n                .to_sys(cstrings),\n            dolphin: self\n                .dolphin\n                .to_sys(cstrings),\n            moonshine: self\n                .moonshine\n                .to_sys(cstrings),\n            zipformer_ctc: self\n                .zipformer_ctc\n                .to_sys(cstrings),\n            wenet_ctc: self\n                .wenet_ctc\n                .to_sys(cstrings),\n            omnilingual: self\n                .omnilingual\n                .to_sys(cstrings),\n            medasr: self\n                .medasr\n                .to_sys(cstrings),\n            funasr_nano: self\n                .funasr_nano\n                .to_sys(cstrings),\n            fire_red_asr_ctc: self\n                .fire_red_asr_ctc\n                .to_sys(cstrings),\n\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n            model_type: to_c_ptr(&self.model_type, cstrings),\n            modeling_unit: to_c_ptr(&self.modeling_unit, cstrings),\n            bpe_vocab: to_c_ptr(&self.bpe_vocab, cstrings),\n            telespeech_ctc: to_c_ptr(&self.telespeech_ctc, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Top-level configuration for [`OfflineRecognizer`].\n///\n/// Use [`Default`] as a starting point, then fill the fields for the model you\n/// want to run.\npub struct OfflineRecognizerConfig {\n    pub feat_config: sys::FeatureConfig,\n    pub model_config: OfflineModelConfig,\n    pub lm_config: OfflineLMConfig,\n    pub decoding_method: Option<String>,\n    pub max_active_paths: i32,\n    pub hotwords_file: Option<String>,\n    pub hotwords_score: f32,\n    pub rule_fsts: Option<String>,\n    pub rule_fars: Option<String>,\n    pub blank_penalty: f32,\n    pub hr: super::online_asr::HomophoneReplacerConfig,\n}\n\nimpl 
OfflineRecognizerConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineRecognizerConfig {\n        sys::OfflineRecognizerConfig {\n            feat_config: self.feat_config,\n            model_config: self\n                .model_config\n                .to_sys(cstrings),\n            lm_config: self\n                .lm_config\n                .to_sys(cstrings),\n            decoding_method: to_c_ptr(&self.decoding_method, cstrings),\n            max_active_paths: self.max_active_paths,\n            hotwords_file: to_c_ptr(&self.hotwords_file, cstrings),\n            hotwords_score: self.hotwords_score,\n            rule_fsts: to_c_ptr(&self.rule_fsts, cstrings),\n            rule_fars: to_c_ptr(&self.rule_fars, cstrings),\n            blank_penalty: self.blank_penalty,\n            hr: self\n                .hr\n                .to_sys(cstrings),\n        }\n    }\n}\n\nimpl Default for OfflineRecognizerConfig {\n    fn default() -> Self {\n        Self {\n            feat_config: sys::FeatureConfig {\n                sample_rate: 16000,\n                feature_dim: 80,\n            },\n\n            model_config: OfflineModelConfig::default(),\n            lm_config: OfflineLMConfig::default(),\n            decoding_method: None,\n            max_active_paths: 4, // a reasonable default\n            hotwords_file: None,\n            hotwords_score: 0.0,\n            rule_fsts: None,\n            rule_fars: None,\n            blank_penalty: 0.0,\n            hr: super::online_asr::HomophoneReplacerConfig::default(),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Deserialize)]\n/// Recognition result returned by [`OfflineStream::get_result`].\npub struct OfflineRecognizerResult {\n    pub text: String,\n    pub tokens: Vec<String>,\n    pub timestamps: Option<Vec<f32>>,\n    pub durations: Option<Vec<f32>>,\n}\n\n/// Offline speech recognizer.\n///\n/// ```no_run\n/// use sherpa_onnx::{\n///     OfflineRecognizer, OfflineRecognizerConfig, 
OfflineTransducerModelConfig, Wave,\n/// };\n///\n/// let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n/// let mut config = OfflineRecognizerConfig::default();\n/// config.model_config.transducer = OfflineTransducerModelConfig {\n///     encoder: Some(\"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/encoder.int8.onnx\".into()),\n///     decoder: Some(\"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/decoder.int8.onnx\".into()),\n///     joiner: Some(\"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/joiner.int8.onnx\".into()),\n/// };\n/// config.model_config.tokens =\n///     Some(\"./sherpa-onnx-nemo-parakeet-tdt-0.6b-v3-int8/tokens.txt\".into());\n/// config.model_config.model_type = Some(\"nemo_transducer\".into());\n///\n/// let recognizer = OfflineRecognizer::create(&config).expect(\"create recognizer\");\n/// let stream = recognizer.create_stream();\n/// stream.accept_waveform(wave.sample_rate(), wave.samples());\n/// recognizer.decode(&stream);\n/// let result = stream.get_result().expect(\"result\");\n/// println!(\"{}\", result.text);\n/// ```\npub struct OfflineRecognizer {\n    ptr: *const sys::OfflineRecognizer,\n}\n\nimpl OfflineRecognizer {\n    /// Create a recognizer from `config`.\n    pub fn create(config: &OfflineRecognizerConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflineRecognizer(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Create an empty offline stream.\n    pub fn create_stream(&self) -> OfflineStream {\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflineStream(self.ptr) };\n        OfflineStream { ptr }\n    }\n\n    /// Create a stream with per-stream hotwords.\n    pub fn create_stream_with_hotwords(&self, hotwords: &str) -> OfflineStream {\n        let c = CString::new(hotwords).unwrap();\n        let ptr = 
unsafe { sys::SherpaOnnxCreateOfflineStreamWithHotwords(self.ptr, c.as_ptr()) };\n        OfflineStream { ptr }\n    }\n\n    /// Decode one stream.\n    pub fn decode(&self, stream: &OfflineStream) {\n        unsafe { sys::SherpaOnnxDecodeOfflineStream(self.ptr, stream.ptr) }\n    }\n\n    /// Decode multiple streams in one batch call.\n    pub fn decode_multiple_streams(&self, streams: &[&OfflineStream]) {\n        let ptrs: Vec<*const sys::OfflineStream> = streams\n            .iter()\n            .map(|s| s.ptr)\n            .collect();\n        unsafe {\n            sys::SherpaOnnxDecodeMultipleOfflineStreams(self.ptr, ptrs.as_ptr(), ptrs.len() as i32)\n        }\n    }\n}\n\nimpl Drop for OfflineRecognizer {\n    fn drop(&mut self) {\n        unsafe {\n            sys::SherpaOnnxDestroyOfflineRecognizer(self.ptr);\n        }\n    }\n}\n\n/// Input stream used by [`OfflineRecognizer`].\npub struct OfflineStream {\n    pub(crate) ptr: *const sys::OfflineStream,\n}\n\nimpl OfflineStream {\n    /// Append samples to the stream.\n    pub fn accept_waveform(&self, sample_rate: i32, samples: &[f32]) {\n        unsafe {\n            sys::SherpaOnnxAcceptWaveformOffline(\n                self.ptr,\n                sample_rate,\n                samples.as_ptr(),\n                samples.len() as i32,\n            )\n        }\n    }\n\n    /// Fetch the current recognition result.\n    pub fn get_result(&self) -> Option<OfflineRecognizerResult> {\n        unsafe {\n            let cstr = sys::SherpaOnnxGetOfflineStreamResultAsJson(self.ptr);\n            if cstr.is_null() {\n                return None;\n            }\n            let s = CStr::from_ptr(cstr)\n                .to_string_lossy()\n                .into_owned();\n            sys::SherpaOnnxDestroyOfflineStreamResultJson(cstr);\n            serde_json::from_str(&s).ok()\n        }\n    }\n\n    pub fn set_option(&self, key: &str, value: &str) {\n        let key = CString::new(key).unwrap();\n        let 
value = CString::new(value).unwrap();\n        unsafe { sys::SherpaOnnxOfflineStreamSetOption(self.ptr, key.as_ptr(), value.as_ptr()) }\n    }\n\n    /// Return the value stored for `key`, or an empty string if the native\n    /// call returns a null pointer.\n    ///\n    /// Panics if `key` contains an interior NUL byte.\n    pub fn get_option(&self, key: &str) -> String {\n        let key = CString::new(key).unwrap();\n        unsafe {\n            let p = sys::SherpaOnnxOfflineStreamGetOption(self.ptr, key.as_ptr());\n            if p.is_null() {\n                String::new()\n            } else {\n                CStr::from_ptr(p)\n                    .to_string_lossy()\n                    .into_owned()\n            }\n        }\n    }\n\n    /// Return `true` if the native API reports a non-zero result for `key`.\n    ///\n    /// Panics if `key` contains an interior NUL byte.\n    pub fn has_option(&self, key: &str) -> bool {\n        let key = CString::new(key).unwrap();\n        unsafe { sys::SherpaOnnxOfflineStreamHasOption(self.ptr, key.as_ptr()) != 0 }\n    }\n}\n\nimpl Drop for OfflineStream {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyOfflineStream(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/offline_punctuation.rs",
    "content": "//! Offline punctuation restoration.\n//!\n//! Use this module when you already have a complete text string and want a\n//! one-shot punctuation pass. See\n//! [`rust-api-examples/examples/offline_punctuation.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/offline_punctuation.rs)\n//! for a complete example.\n//!\n//! # Example\n//!\n//! ```no_run\n//! use sherpa_onnx::{OfflinePunctuation, OfflinePunctuationConfig};\n//!\n//! let mut config = OfflinePunctuationConfig::default();\n//! config.model.ct_transformer = Some(\"./sherpa-onnx-offline-punctuation/model.onnx\".into());\n//!\n//! let punct = OfflinePunctuation::create(&config).expect(\"create punctuator\");\n//! let text = punct\n//!     .add_punctuation(\"today is a good day how are you\")\n//!     .expect(\"punctuate\");\n//! println!(\"{text}\");\n//! ```\n\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\n\n#[derive(Clone, Debug)]\n/// Model configuration for offline punctuation restoration.\npub struct OfflinePunctuationModelConfig {\n    pub ct_transformer: Option<String>,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for OfflinePunctuationModelConfig {\n    fn default() -> Self {\n        Self {\n            ct_transformer: None,\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl OfflinePunctuationModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflinePunctuationModelConfig {\n        sys::OfflinePunctuationModelConfig {\n            ct_transformer: to_c_ptr(&self.ct_transformer, cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Top-level configuration for [`OfflinePunctuation`].\npub 
struct OfflinePunctuationConfig {\n    pub model: OfflinePunctuationModelConfig,\n}\n\nimpl OfflinePunctuationConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflinePunctuationConfig {\n        sys::OfflinePunctuationConfig {\n            model: self\n                .model\n                .to_sys(cstrings),\n        }\n    }\n}\n\n/// Offline punctuation restorer.\npub struct OfflinePunctuation {\n    ptr: *const sys::OfflinePunctuation,\n}\n\nunsafe impl Send for OfflinePunctuation {}\n\nimpl OfflinePunctuation {\n    /// Create an offline punctuator from [`OfflinePunctuationConfig`].\n    pub fn create(config: &OfflinePunctuationConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflinePunctuation(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Add punctuation to `text`.\n    pub fn add_punctuation(&self, text: &str) -> Option<String> {\n        let text = CString::new(text).ok()?;\n\n        unsafe {\n            let p = sys::SherpaOfflinePunctuationAddPunct(self.ptr, text.as_ptr());\n            if p.is_null() {\n                return None;\n            }\n\n            let ans = CStr::from_ptr(p)\n                .to_string_lossy()\n                .into_owned();\n            sys::SherpaOfflinePunctuationFreeText(p);\n            Some(ans)\n        }\n    }\n}\n\nimpl Drop for OfflinePunctuation {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroyOfflinePunctuation(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/offline_speaker_diarization.rs",
    "content": "//! Offline speaker diarization.\n//!\n//! This combines segmentation, speaker embedding extraction, and clustering.\n//! See `rust-api-examples/examples/offline_speaker_diarization.rs`.\n\nuse crate::{speaker_embedding::SpeakerEmbeddingExtractorConfig, utils::to_c_ptr};\nuse sherpa_onnx_sys as sys;\nuse std::ffi::CString;\nuse std::slice;\n\n#[derive(Clone, Debug, Default)]\n/// Pyannote segmentation model path.\npub struct OfflineSpeakerSegmentationPyannoteModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineSpeakerSegmentationPyannoteModelConfig {\n    fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::OfflineSpeakerSegmentationPyannoteModelConfig {\n        sys::OfflineSpeakerSegmentationPyannoteModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Segmentation model configuration for diarization.\npub struct OfflineSpeakerSegmentationModelConfig {\n    pub pyannote: OfflineSpeakerSegmentationPyannoteModelConfig,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for OfflineSpeakerSegmentationModelConfig {\n    fn default() -> Self {\n        Self {\n            pyannote: Default::default(),\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl OfflineSpeakerSegmentationModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineSpeakerSegmentationModelConfig {\n        sys::OfflineSpeakerSegmentationModelConfig {\n            pyannote: self\n                .pyannote\n                .to_sys(cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Fast clustering options used after segmentation and embedding extraction.\npub struct 
FastClusteringConfig {\n    pub num_clusters: i32,\n    pub threshold: f32,\n}\n\nimpl Default for FastClusteringConfig {\n    fn default() -> Self {\n        Self {\n            num_clusters: -1,\n            threshold: 0.5,\n        }\n    }\n}\n\nimpl FastClusteringConfig {\n    fn to_sys(&self) -> sys::FastClusteringConfig {\n        sys::FastClusteringConfig {\n            num_clusters: self.num_clusters,\n            threshold: self.threshold,\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Top-level configuration for [`OfflineSpeakerDiarization`].\npub struct OfflineSpeakerDiarizationConfig {\n    pub segmentation: OfflineSpeakerSegmentationModelConfig,\n    pub embedding: SpeakerEmbeddingExtractorConfig,\n    pub clustering: FastClusteringConfig,\n    pub min_duration_on: f32,\n    pub min_duration_off: f32,\n}\n\nimpl Default for OfflineSpeakerDiarizationConfig {\n    fn default() -> Self {\n        Self {\n            segmentation: Default::default(),\n            embedding: Default::default(),\n            clustering: Default::default(),\n            min_duration_on: 0.3,\n            min_duration_off: 0.5,\n        }\n    }\n}\n\nimpl OfflineSpeakerDiarizationConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineSpeakerDiarizationConfig {\n        sys::OfflineSpeakerDiarizationConfig {\n            segmentation: self\n                .segmentation\n                .to_sys(cstrings),\n            embedding: self\n                .embedding\n                .to_sys(cstrings),\n            clustering: self\n                .clustering\n                .to_sys(),\n            min_duration_on: self.min_duration_on,\n            min_duration_off: self.min_duration_off,\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// One diarization segment labeled with a speaker index.\npub struct OfflineSpeakerDiarizationSegment {\n    pub start: f32,\n    pub end: f32,\n    pub speaker: i32,\n}\n\n/// Offline speaker diarizer.\npub struct 
OfflineSpeakerDiarization {\n    ptr: *const sys::OfflineSpeakerDiarization,\n}\n\nunsafe impl Send for OfflineSpeakerDiarization {}\n\nimpl OfflineSpeakerDiarization {\n    /// Create a diarizer from `config`.\n    pub fn create(config: &OfflineSpeakerDiarizationConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflineSpeakerDiarization(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Return the sample rate expected by the segmentation model.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(self.ptr) }\n    }\n\n    /// Replace the current configuration.\n    pub fn set_config(&self, config: &OfflineSpeakerDiarizationConfig) {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        unsafe { sys::SherpaOnnxOfflineSpeakerDiarizationSetConfig(self.ptr, &sys_config) }\n    }\n\n    /// Process a complete waveform and return a diarization result.\n    pub fn process(&self, samples: &[f32]) -> Option<OfflineSpeakerDiarizationResult> {\n        let ptr = unsafe {\n            sys::SherpaOnnxOfflineSpeakerDiarizationProcess(\n                self.ptr,\n                samples.as_ptr(),\n                samples.len() as i32,\n            )\n        };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(OfflineSpeakerDiarizationResult { ptr })\n        }\n    }\n}\n\nimpl Drop for OfflineSpeakerDiarization {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroyOfflineSpeakerDiarization(self.ptr);\n            }\n        }\n    }\n}\n\n/// Result object returned by [`OfflineSpeakerDiarization::process`].\npub 
struct OfflineSpeakerDiarizationResult {\n    ptr: *const sys::OfflineSpeakerDiarizationResult,\n}\n\nimpl OfflineSpeakerDiarizationResult {\n    /// Return the number of speakers estimated for the recording.\n    pub fn num_speakers(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers(self.ptr) }\n    }\n\n    /// Return the number of diarization segments.\n    pub fn num_segments(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(self.ptr) }\n    }\n\n    /// Return all segments sorted by start time.\n    pub fn sort_by_start_time(&self) -> Vec<OfflineSpeakerDiarizationSegment> {\n        let n = self.num_segments();\n        if n <= 0 {\n            return Vec::new();\n        }\n\n        unsafe {\n            let p = sys::SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(self.ptr);\n            if p.is_null() {\n                return Vec::new();\n            }\n\n            let segments = slice::from_raw_parts(p, n as usize)\n                .iter()\n                .map(|s| OfflineSpeakerDiarizationSegment {\n                    start: s.start,\n                    end: s.end,\n                    speaker: s.speaker,\n                })\n                .collect::<Vec<_>>();\n            sys::SherpaOnnxOfflineSpeakerDiarizationDestroySegment(p);\n            segments\n        }\n    }\n}\n\nimpl Drop for OfflineSpeakerDiarizationResult {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxOfflineSpeakerDiarizationDestroyResult(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/offline_speech_denoiser.rs",
    "content": "//! Offline speech denoising.\n//!\n//! Supported model families mirror the native API and currently include GTCRN\n//! and DPDFNet. See the repository examples:\n//!\n//! - `rust-api-examples/examples/offline_speech_enhancement_gtcrn.rs`\n//! - `rust-api-examples/examples/offline_speech_enhancement_dpdfnet.rs`\n\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::CString;\nuse std::ptr;\nuse std::slice;\n\n#[derive(Clone, Debug, Default)]\n/// GTCRN model path for offline denoising.\npub struct OfflineSpeechDenoiserGtcrnModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineSpeechDenoiserGtcrnModelConfig {\n    pub(crate) fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::OfflineSpeechDenoiserGtcrnModelConfig {\n        sys::OfflineSpeechDenoiserGtcrnModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// DPDFNet model path for offline denoising.\npub struct OfflineSpeechDenoiserDpdfNetModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OfflineSpeechDenoiserDpdfNetModelConfig {\n    pub(crate) fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::OfflineSpeechDenoiserDpdfNetModelConfig {\n        sys::OfflineSpeechDenoiserDpdfNetModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Aggregate model configuration for [`OfflineSpeechDenoiser`].\n///\n/// Configure exactly one model family in normal use.\npub struct OfflineSpeechDenoiserModelConfig {\n    pub gtcrn: OfflineSpeechDenoiserGtcrnModelConfig,\n    pub dpdfnet: OfflineSpeechDenoiserDpdfNetModelConfig,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for OfflineSpeechDenoiserModelConfig {\n    fn default() -> Self {\n        Self {\n            gtcrn: Default::default(),\n            dpdfnet: Default::default(),\n  
          num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl OfflineSpeechDenoiserModelConfig {\n    pub(crate) fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::OfflineSpeechDenoiserModelConfig {\n        sys::OfflineSpeechDenoiserModelConfig {\n            gtcrn: self\n                .gtcrn\n                .to_sys(cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n            dpdfnet: self\n                .dpdfnet\n                .to_sys(cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Denoised samples returned from an offline or online denoiser.\npub struct DenoisedAudio {\n    pub samples: Vec<f32>,\n    pub sample_rate: i32,\n}\n\nimpl DenoisedAudio {\n    pub(crate) fn from_ptr(ptr: *const sys::DenoisedAudio) -> Self {\n        if ptr.is_null() {\n            return Self::default();\n        }\n\n        unsafe {\n            let n = (*ptr)\n                .n\n                .max(0) as usize;\n            let samples = if (*ptr)\n                .samples\n                .is_null()\n                || n == 0\n            {\n                vec![]\n            } else {\n                slice::from_raw_parts((*ptr).samples, n).to_vec()\n            };\n            let sample_rate = (*ptr).sample_rate;\n            sys::SherpaOnnxDestroyDenoisedAudio(ptr);\n            Self {\n                samples,\n                sample_rate,\n            }\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Top-level configuration for [`OfflineSpeechDenoiser`].\npub struct OfflineSpeechDenoiserConfig {\n    pub model: OfflineSpeechDenoiserModelConfig,\n}\n\nimpl OfflineSpeechDenoiserConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineSpeechDenoiserConfig {\n        sys::OfflineSpeechDenoiserConfig {\n           
 model: self\n                .model\n                .to_sys(cstrings),\n        }\n    }\n}\n\n/// Offline speech denoiser.\npub struct OfflineSpeechDenoiser {\n    ptr: *const sys::OfflineSpeechDenoiser,\n}\n\nimpl OfflineSpeechDenoiser {\n    /// Create a denoiser from `config`.\n    pub fn create(config: &OfflineSpeechDenoiserConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflineSpeechDenoiser(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Denoise one chunk or a complete waveform.\n    pub fn run(&self, samples: &[f32], sample_rate: i32) -> DenoisedAudio {\n        let samples_ptr = if samples.is_empty() {\n            ptr::null()\n        } else {\n            samples.as_ptr()\n        };\n        let ptr = unsafe {\n            sys::SherpaOnnxOfflineSpeechDenoiserRun(\n                self.ptr,\n                samples_ptr,\n                samples.len() as i32,\n                sample_rate,\n            )\n        };\n        DenoisedAudio::from_ptr(ptr)\n    }\n\n    /// Return the model sample rate expected by this denoiser.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOfflineSpeechDenoiserGetSampleRate(self.ptr) }\n    }\n}\n\nimpl Drop for OfflineSpeechDenoiser {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyOfflineSpeechDenoiser(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/online_asr.rs",
    "content": "//! Streaming speech recognition.\n//!\n//! Configure exactly one model family inside [`OnlineModelConfig`], create an\n//! [`OnlineRecognizer`], then feed waveform chunks into an [`OnlineStream`].\n//!\n//! See:\n//!\n//! - `rust-api-examples/examples/streaming_zipformer.rs`\n//! - `rust-api-examples/examples/streaming_zipformer_microphone.rs`\n//!\n//! ```no_run\n//! use sherpa_onnx::{OnlineRecognizer, OnlineRecognizerConfig, Wave};\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//! let mut config = OnlineRecognizerConfig::default();\n//! config.model_config.transducer.encoder = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx\".into(),\n//! );\n//! config.model_config.transducer.decoder = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\".into(),\n//! );\n//! config.model_config.transducer.joiner = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx\".into(),\n//! );\n//! config.model_config.tokens = Some(\n//!     \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\".into(),\n//! );\n//! config.enable_endpoint = true;\n//! config.decoding_method = Some(\"greedy_search\".into());\n//!\n//! let recognizer = OnlineRecognizer::create(&config).expect(\"create recognizer\");\n//! let stream = recognizer.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! stream.input_finished();\n//!\n//! while recognizer.is_ready(&stream) {\n//!     recognizer.decode(&stream);\n//! }\n//! 
```\n\nuse crate::utils::to_c_ptr;\nuse serde::Deserialize;\nuse std::ffi::{CStr, CString};\nuse std::ptr;\n\nuse sherpa_onnx_sys as sys;\n\n#[derive(Clone, Debug, Default)]\n/// Online transducer model configuration.\npub struct OnlineTransducerModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub joiner: Option<String>,\n}\n\nimpl OnlineTransducerModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineTransducerModelConfig {\n        sys::OnlineTransducerModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            joiner: to_c_ptr(&self.joiner, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Online Paraformer model configuration.\npub struct OnlineParaformerModelConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n}\n\nimpl OnlineParaformerModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineParaformerModelConfig {\n        sys::OnlineParaformerModelConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Online Zipformer2 CTC model configuration.\npub struct OnlineZipformer2CtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OnlineZipformer2CtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineZipformer2CtcModelConfig {\n        sys::OnlineZipformer2CtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Online NeMo CTC model configuration.\npub struct OnlineNemoCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OnlineNemoCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineNemoCtcModelConfig {\n        sys::OnlineNemoCtcModelConfig {\n            model: to_c_ptr(&self.model, 
cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Online Tone CTC model configuration.\npub struct OnlineToneCtcModelConfig {\n    pub model: Option<String>,\n}\n\nimpl OnlineToneCtcModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineToneCtcModelConfig {\n        sys::OnlineToneCtcModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Aggregate model configuration for streaming recognition.\n///\n/// Configure exactly one model family for typical use.\npub struct OnlineModelConfig {\n    pub transducer: OnlineTransducerModelConfig,\n    pub paraformer: OnlineParaformerModelConfig,\n    pub zipformer2_ctc: OnlineZipformer2CtcModelConfig,\n    pub nemo_ctc: OnlineNemoCtcModelConfig,\n    pub t_one_ctc: OnlineToneCtcModelConfig,\n\n    pub tokens: Option<String>,\n    pub num_threads: i32,\n    pub provider: Option<String>,\n    pub debug: bool,\n\n    pub model_type: Option<String>,\n    pub modeling_unit: Option<String>, // cjkchar | bpe | cjkchar+bpe\n    pub bpe_vocab: Option<String>,\n\n    /// Optional in-memory tokens\n    pub tokens_buf: Option<Vec<u8>>,\n}\n\nimpl Default for OnlineModelConfig {\n    fn default() -> Self {\n        Self {\n            transducer: Default::default(),\n            paraformer: Default::default(),\n            zipformer2_ctc: Default::default(),\n            nemo_ctc: Default::default(),\n            t_one_ctc: Default::default(),\n\n            tokens: None,\n            num_threads: 1,\n            provider: Some(\"cpu\".to_string()),\n            debug: false,\n\n            model_type: None,\n            modeling_unit: None,\n            bpe_vocab: None,\n            tokens_buf: None,\n        }\n    }\n}\n\nimpl OnlineModelConfig {\n    pub(crate) fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineModelConfig {\n        sys::OnlineModelConfig {\n            transducer: self\n                
.transducer\n                .to_sys(cstrings),\n            paraformer: self\n                .paraformer\n                .to_sys(cstrings),\n            zipformer2_ctc: self\n                .zipformer2_ctc\n                .to_sys(cstrings),\n            nemo_ctc: self\n                .nemo_ctc\n                .to_sys(cstrings),\n            t_one_ctc: self\n                .t_one_ctc\n                .to_sys(cstrings),\n\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            num_threads: self.num_threads,\n            provider: to_c_ptr(&self.provider, cstrings),\n            debug: self.debug as i32,\n\n            model_type: to_c_ptr(&self.model_type, cstrings),\n            modeling_unit: to_c_ptr(&self.modeling_unit, cstrings),\n            bpe_vocab: to_c_ptr(&self.bpe_vocab, cstrings),\n\n            tokens_buf: self\n                .tokens_buf\n                .as_ref()\n                .map_or(ptr::null(), |buf| buf.as_ptr() as *const _),\n            tokens_buf_size: self\n                .tokens_buf\n                .as_ref()\n                .map_or(0, |buf| buf.len() as i32),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// FST decoder options for CTC models.\npub struct OnlineCtcFstDecoderConfig {\n    pub graph: Option<String>,\n    pub max_active: i32,\n}\n\nimpl Default for OnlineCtcFstDecoderConfig {\n    fn default() -> Self {\n        Self {\n            graph: None,\n            max_active: 4,\n        }\n    }\n}\n\nimpl OnlineCtcFstDecoderConfig {\n    /// Convert to sys struct using `to_c_ptr()`\n    pub(crate) fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineCtcFstDecoderConfig {\n        sys::OnlineCtcFstDecoderConfig {\n            graph: to_c_ptr(&self.graph, cstrings),\n            max_active: self.max_active,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Optional homophone replacement resources.\npub struct HomophoneReplacerConfig {\n    pub lexicon: Option<String>,\n    pub 
rule_fsts: Option<String>,\n}\n\nimpl HomophoneReplacerConfig {\n    pub(crate) fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::HomophoneReplacerConfig {\n        sys::HomophoneReplacerConfig {\n            dict_dir: ptr::null(), // not used any more internally\n            lexicon: to_c_ptr(&self.lexicon, cstrings),\n            rule_fsts: to_c_ptr(&self.rule_fsts, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Top-level configuration for [`OnlineRecognizer`].\npub struct OnlineRecognizerConfig {\n    pub feat_config: sys::FeatureConfig,\n    pub model_config: OnlineModelConfig,\n\n    /// Decoding method: greedy_search | modified_beam_search\n    pub decoding_method: Option<String>,\n\n    /// Used only when decoding_method is modified_beam_search\n    pub max_active_paths: i32,\n\n    /// Endpoint detection\n    pub enable_endpoint: bool,\n\n    pub rule1_min_trailing_silence: f32,\n    pub rule2_min_trailing_silence: f32,\n    pub rule3_min_utterance_length: f32,\n\n    pub hotwords_file: Option<String>,\n    pub hotwords_score: f32,\n\n    pub ctc_fst_decoder_config: OnlineCtcFstDecoderConfig,\n\n    pub rule_fsts: Option<String>,\n    pub rule_fars: Option<String>,\n\n    pub blank_penalty: f32,\n\n    pub hotwords_buf: Option<Vec<u8>>,\n\n    pub hr: HomophoneReplacerConfig,\n}\n\nimpl Default for OnlineRecognizerConfig {\n    fn default() -> Self {\n        Self {\n            feat_config: sys::FeatureConfig {\n                sample_rate: 16000,\n                feature_dim: 80,\n            },\n            model_config: Default::default(),\n            decoding_method: None,\n            max_active_paths: 0,\n            enable_endpoint: false,\n            rule1_min_trailing_silence: 0.0,\n            rule2_min_trailing_silence: 0.0,\n            rule3_min_utterance_length: 0.0,\n            hotwords_file: None,\n            hotwords_score: 0.0,\n            ctc_fst_decoder_config: Default::default(),\n            rule_fsts: 
None,\n            rule_fars: None,\n            blank_penalty: 0.0,\n            hotwords_buf: None,\n            hr: Default::default(),\n        }\n    }\n}\n\nimpl OnlineRecognizerConfig {\n    /// Convert to sys struct for FFI call\n    pub(crate) fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineRecognizerConfig {\n        sys::OnlineRecognizerConfig {\n            feat_config: self.feat_config,\n            model_config: self\n                .model_config\n                .to_sys(cstrings),\n            decoding_method: to_c_ptr(&self.decoding_method, cstrings),\n            max_active_paths: self.max_active_paths,\n            enable_endpoint: self.enable_endpoint as i32,\n            rule1_min_trailing_silence: self.rule1_min_trailing_silence,\n            rule2_min_trailing_silence: self.rule2_min_trailing_silence,\n            rule3_min_utterance_length: self.rule3_min_utterance_length,\n            hotwords_file: to_c_ptr(&self.hotwords_file, cstrings),\n            hotwords_score: self.hotwords_score,\n            ctc_fst_decoder_config: self\n                .ctc_fst_decoder_config\n                .to_sys(cstrings),\n            rule_fsts: to_c_ptr(&self.rule_fsts, cstrings),\n            rule_fars: to_c_ptr(&self.rule_fars, cstrings),\n            blank_penalty: self.blank_penalty,\n            hotwords_buf: self\n                .hotwords_buf\n                .as_ref()\n                .map_or(ptr::null(), |buf| buf.as_ptr() as *const _),\n            hotwords_buf_size: self\n                .hotwords_buf\n                .as_ref()\n                .map_or(0, |buf| buf.len() as i32),\n            hr: self\n                .hr\n                .to_sys(cstrings),\n        }\n    }\n}\n\n/// Streaming speech recognizer.\npub struct OnlineRecognizer {\n    ptr: *const sys::OnlineRecognizer,\n}\n\nimpl OnlineRecognizer {\n    /// Create a recognizer from `config`.\n    pub fn create(config: &OnlineRecognizerConfig) -> Option<Self> {\n        
let mut cstrings = Vec::new();\n\n        let sys_config = config.to_sys(&mut cstrings);\n\n        let ptr = unsafe { sys::SherpaOnnxCreateOnlineRecognizer(&sys_config) };\n\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Create an empty online stream.\n    pub fn create_stream(&self) -> OnlineStream {\n        let ptr = unsafe { sys::SherpaOnnxCreateOnlineStream(self.ptr) };\n        OnlineStream { ptr }\n    }\n\n    /// Create a stream with per-stream hotwords.\n    pub fn create_stream_with_hotwords(&self, hotwords: &str) -> OnlineStream {\n        let c = CString::new(hotwords).unwrap();\n        let ptr = unsafe { sys::SherpaOnnxCreateOnlineStreamWithHotwords(self.ptr, c.as_ptr()) };\n        OnlineStream { ptr }\n    }\n\n    /// Decode one step for `stream`.\n    pub fn decode(&self, stream: &OnlineStream) {\n        unsafe { sys::SherpaOnnxDecodeOnlineStream(self.ptr, stream.ptr) }\n    }\n\n    /// Decode multiple streams in one batch call.\n    pub fn decode_multiple_streams(&self, streams: &[&OnlineStream]) {\n        let ptrs: Vec<*const sys::OnlineStream> = streams\n            .iter()\n            .map(|s| s.ptr)\n            .collect();\n        unsafe {\n            sys::SherpaOnnxDecodeMultipleOnlineStreams(self.ptr, ptrs.as_ptr(), ptrs.len() as i32)\n        }\n    }\n\n    /// Reset stream state after an endpoint or utterance boundary.\n    pub fn reset(&self, stream: &OnlineStream) {\n        unsafe { sys::SherpaOnnxOnlineStreamReset(self.ptr, stream.ptr) }\n    }\n\n    /// Return `true` if endpointing rules say the current utterance has ended.\n    pub fn is_endpoint(&self, stream: &OnlineStream) -> bool {\n        unsafe { sys::SherpaOnnxOnlineStreamIsEndpoint(self.ptr, stream.ptr) != 0 }\n    }\n\n    /// Return `true` if the recognizer has enough audio to run another step.\n    pub fn is_ready(&self, stream: &OnlineStream) -> bool {\n        unsafe { 
sys::SherpaOnnxIsOnlineStreamReady(self.ptr, stream.ptr) != 0 }\n    }\n\n    /// Fetch the current recognition hypothesis.\n    pub fn get_result(&self, stream: &OnlineStream) -> Option<RecognizerResult> {\n        unsafe {\n            let cstr = sys::SherpaOnnxGetOnlineStreamResultAsJson(self.ptr, stream.ptr);\n            if cstr.is_null() {\n                return None;\n            }\n            let s = CStr::from_ptr(cstr)\n                .to_string_lossy()\n                .into_owned();\n            sys::SherpaOnnxDestroyOnlineStreamResultJson(cstr);\n            serde_json::from_str(&s).ok()\n        }\n    }\n}\n\n#[derive(Clone, Debug, Deserialize)]\n/// Streaming ASR result returned by [`OnlineRecognizer::get_result`].\npub struct RecognizerResult {\n    pub text: String,\n    pub tokens: Vec<String>,\n    pub timestamps: Option<Vec<f32>>,\n    pub segment: Option<i32>,\n    pub start_time: Option<f32>,\n    pub is_final: bool,\n}\n\nimpl Drop for OnlineRecognizer {\n    fn drop(&mut self) {\n        unsafe {\n            sys::SherpaOnnxDestroyOnlineRecognizer(self.ptr);\n        }\n    }\n}\n\n/// Input stream used by [`OnlineRecognizer`].\npub struct OnlineStream {\n    pub(crate) ptr: *const sys::OnlineStream,\n}\n\nimpl OnlineStream {\n    /// Append one chunk of waveform samples.\n    pub fn accept_waveform(&self, sample_rate: i32, samples: &[f32]) {\n        unsafe {\n            sys::SherpaOnnxOnlineStreamAcceptWaveform(\n                self.ptr,\n                sample_rate,\n                samples.as_ptr(),\n                samples.len() as i32,\n            )\n        }\n    }\n\n    /// Mark the end of input so the recognizer can flush trailing context.\n    pub fn input_finished(&self) {\n        unsafe { sys::SherpaOnnxOnlineStreamInputFinished(self.ptr) }\n    }\n\n    pub fn set_option(&self, key: &str, value: &str) {\n        let key = CString::new(key).unwrap();\n        let value = CString::new(value).unwrap();\n        unsafe { 
sys::SherpaOnnxOnlineStreamSetOption(self.ptr, key.as_ptr(), value.as_ptr()) }\n    }\n\n    /// Return the value stored for `key`, or an empty string if the native call returns null.\n    ///\n    /// Panics if `key` contains an interior NUL byte.\n    pub fn get_option(&self, key: &str) -> String {\n        let key = CString::new(key).unwrap();\n        unsafe {\n            let p = sys::SherpaOnnxOnlineStreamGetOption(self.ptr, key.as_ptr());\n            if p.is_null() {\n                String::new()\n            } else {\n                CStr::from_ptr(p)\n                    .to_string_lossy()\n                    .into_owned()\n            }\n        }\n    }\n\n    /// Return `true` if the native stream reports an option for `key`.\n    ///\n    /// Panics if `key` contains an interior NUL byte.\n    pub fn has_option(&self, key: &str) -> bool {\n        let key = CString::new(key).unwrap();\n        unsafe { sys::SherpaOnnxOnlineStreamHasOption(self.ptr, key.as_ptr()) != 0 }\n    }\n}\n\nimpl Drop for OnlineStream {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyOnlineStream(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/online_punctuation.rs",
    "content": "//! Online punctuation restoration.\n//!\n//! This module wraps the punctuation model used in\n//! `rust-api-examples/examples/online_punctuation.rs`.\n//!\n//! ```no_run\n//! use sherpa_onnx::{OnlinePunctuation, OnlinePunctuationConfig, OnlinePunctuationModelConfig};\n//!\n//! let config = OnlinePunctuationConfig {\n//!     model: OnlinePunctuationModelConfig {\n//!         cnn_bilstm: Some(\"./sherpa-onnx-online-punct-en/cnn_bilstm.onnx\".into()),\n//!         bpe_vocab: Some(\"./sherpa-onnx-online-punct-en/bpe.vocab\".into()),\n//!         ..Default::default()\n//!     },\n//! };\n//!\n//! let punct = OnlinePunctuation::create(&config).expect(\"create punctuation\");\n//! let text = punct\n//!     .add_punctuation(\"how are you i am fine thank you\")\n//!     .expect(\"punctuate\");\n//! println!(\"{text}\");\n//! ```\n\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\n\n#[derive(Clone, Debug)]\n/// Model-level options for online punctuation restoration.\npub struct OnlinePunctuationModelConfig {\n    pub cnn_bilstm: Option<String>,\n    pub bpe_vocab: Option<String>,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for OnlinePunctuationModelConfig {\n    fn default() -> Self {\n        Self {\n            cnn_bilstm: None,\n            bpe_vocab: None,\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl OnlinePunctuationModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlinePunctuationModelConfig {\n        sys::OnlinePunctuationModelConfig {\n            cnn_bilstm: to_c_ptr(&self.cnn_bilstm, cstrings),\n            bpe_vocab: to_c_ptr(&self.bpe_vocab, cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, 
Debug, Default)]\n/// Top-level configuration for [`OnlinePunctuation`].\npub struct OnlinePunctuationConfig {\n    pub model: OnlinePunctuationModelConfig,\n}\n\nimpl OnlinePunctuationConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlinePunctuationConfig {\n        sys::OnlinePunctuationConfig {\n            model: self\n                .model\n                .to_sys(cstrings),\n        }\n    }\n}\n\n/// Online punctuation restorer.\n///\n/// Feed plain text fragments to [`OnlinePunctuation::add_punctuation`] and get\n/// punctuated text back.\npub struct OnlinePunctuation {\n    ptr: *const sys::OnlinePunctuation,\n}\n\nunsafe impl Send for OnlinePunctuation {}\n\nimpl OnlinePunctuation {\n    pub fn create(config: &OnlinePunctuationConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n\n        let ptr = unsafe { sys::SherpaOnnxCreateOnlinePunctuation(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Add punctuation to a text fragment.\n    ///\n    /// Returns `None` if the input cannot be converted to a C string or the\n    /// native punctuator fails.\n    pub fn add_punctuation(&self, text: &str) -> Option<String> {\n        let text = CString::new(text).ok()?;\n\n        unsafe {\n            let p = sys::SherpaOnnxOnlinePunctuationAddPunct(self.ptr, text.as_ptr());\n            if p.is_null() {\n                return None;\n            }\n\n            let ans = CStr::from_ptr(p)\n                .to_string_lossy()\n                .into_owned();\n            sys::SherpaOnnxOnlinePunctuationFreeText(p);\n            Some(ans)\n        }\n    }\n}\n\nimpl Drop for OnlinePunctuation {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                
sys::SherpaOnnxDestroyOnlinePunctuation(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/online_speech_denoiser.rs",
    "content": "//! Streaming speech denoising.\n//!\n//! This API is intended for chunked audio. Call [`OnlineSpeechDenoiser::run`]\n//! on consecutive chunks, then [`OnlineSpeechDenoiser::flush`] after the final\n//! chunk to drain any buffered state.\n\nuse crate::offline_speech_denoiser::{DenoisedAudio, OfflineSpeechDenoiserModelConfig};\nuse sherpa_onnx_sys as sys;\nuse std::ffi::CString;\nuse std::ptr;\n\n#[derive(Clone, Debug, Default)]\n/// Top-level configuration for [`OnlineSpeechDenoiser`].\npub struct OnlineSpeechDenoiserConfig {\n    pub model: OfflineSpeechDenoiserModelConfig,\n}\n\nimpl OnlineSpeechDenoiserConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OnlineSpeechDenoiserConfig {\n        sys::OnlineSpeechDenoiserConfig {\n            model: self\n                .model\n                .to_sys(cstrings),\n        }\n    }\n}\n\n/// Streaming speech denoiser.\npub struct OnlineSpeechDenoiser {\n    ptr: *const sys::OnlineSpeechDenoiser,\n}\n\nimpl OnlineSpeechDenoiser {\n    /// Create a denoiser from `config`.\n    pub fn create(config: &OnlineSpeechDenoiserConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOnlineSpeechDenoiser(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Denoise one input chunk.\n    pub fn run(&self, samples: &[f32], sample_rate: i32) -> DenoisedAudio {\n        let samples_ptr = if samples.is_empty() {\n            ptr::null()\n        } else {\n            samples.as_ptr()\n        };\n        let ptr = unsafe {\n            sys::SherpaOnnxOnlineSpeechDenoiserRun(\n                self.ptr,\n                samples_ptr,\n                samples.len() as i32,\n                sample_rate,\n            )\n        };\n        DenoisedAudio::from_ptr(ptr)\n    }\n\n    /// Flush any internally 
buffered samples after the final chunk.\n    pub fn flush(&self) -> DenoisedAudio {\n        let ptr = unsafe { sys::SherpaOnnxOnlineSpeechDenoiserFlush(self.ptr) };\n        DenoisedAudio::from_ptr(ptr)\n    }\n\n    /// Reset the streaming state.\n    pub fn reset(&self) {\n        unsafe { sys::SherpaOnnxOnlineSpeechDenoiserReset(self.ptr) }\n    }\n\n    /// Return the model sample rate expected by this denoiser.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOnlineSpeechDenoiserGetSampleRate(self.ptr) }\n    }\n\n    /// Return the preferred input frame shift, in samples.\n    pub fn frame_shift_in_samples(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(self.ptr) }\n    }\n}\n\nimpl Drop for OnlineSpeechDenoiser {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyOnlineSpeechDenoiser(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/speaker_embedding.rs",
    "content": "//! Speaker embedding extraction and speaker search utilities.\n//!\n//! See:\n//!\n//! - `rust-api-examples/examples/speaker_embedding_extractor.rs`\n//! - `rust-api-examples/examples/speaker_embedding_manager.rs`\n//! - `rust-api-examples/examples/speaker_embedding_cosine_similarity.rs`\n\nuse crate::{online_asr::OnlineStream, utils::to_c_ptr};\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\nuse std::ptr;\nuse std::slice;\n\n#[derive(Clone, Debug)]\n/// Configuration for [`SpeakerEmbeddingExtractor`].\npub struct SpeakerEmbeddingExtractorConfig {\n    pub model: Option<String>,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for SpeakerEmbeddingExtractorConfig {\n    fn default() -> Self {\n        Self {\n            model: None,\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl SpeakerEmbeddingExtractorConfig {\n    pub(crate) fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::SpeakerEmbeddingExtractorConfig {\n        sys::SpeakerEmbeddingExtractorConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// One speaker search result returned by [`SpeakerEmbeddingManager::get_best_matches`].\npub struct SpeakerEmbeddingMatch {\n    pub score: f32,\n    pub name: String,\n}\n\n/// Embedding extractor that consumes audio through an [`OnlineStream`].\npub struct SpeakerEmbeddingExtractor {\n    ptr: *const sys::SpeakerEmbeddingExtractor,\n    dim: i32,\n}\n\nunsafe impl Send for SpeakerEmbeddingExtractor {}\n\nimpl SpeakerEmbeddingExtractor {\n    /// Create an extractor from `config`.\n    pub fn create(config: &SpeakerEmbeddingExtractorConfig) -> Option<Self> {\n        let 
mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateSpeakerEmbeddingExtractor(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            let dim = unsafe { sys::SherpaOnnxSpeakerEmbeddingExtractorDim(ptr) };\n            Some(Self { ptr, dim })\n        }\n    }\n\n    /// Return the embedding dimension.\n    pub fn dim(&self) -> i32 {\n        self.dim\n    }\n\n    /// Create an audio stream that can be filled with waveform chunks.\n    pub fn create_stream(&self) -> Option<OnlineStream> {\n        let ptr = unsafe { sys::SherpaOnnxSpeakerEmbeddingExtractorCreateStream(self.ptr) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(OnlineStream { ptr })\n        }\n    }\n\n    /// Return `true` if enough audio has been accumulated to compute an embedding.\n    pub fn is_ready(&self, stream: &OnlineStream) -> bool {\n        unsafe { sys::SherpaOnnxSpeakerEmbeddingExtractorIsReady(self.ptr, stream.ptr) == 1 }\n    }\n\n    /// Compute the embedding for `stream`.\n    pub fn compute(&self, stream: &OnlineStream) -> Option<Vec<f32>> {\n        let p = unsafe {\n            sys::SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(self.ptr, stream.ptr)\n        };\n        if p.is_null() {\n            None\n        } else {\n            let ans = unsafe { slice::from_raw_parts(p, self.dim as usize) }.to_vec();\n            unsafe { sys::SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(p) };\n            Some(ans)\n        }\n    }\n}\n\nimpl Drop for SpeakerEmbeddingExtractor {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroySpeakerEmbeddingExtractor(self.ptr);\n            }\n        }\n    }\n}\n\n/// In-memory index of named speaker embeddings.\npub struct SpeakerEmbeddingManager {\n    ptr: *const 
sys::SpeakerEmbeddingManager,\n    dim: i32,\n}\n\nunsafe impl Send for SpeakerEmbeddingManager {}\n\nimpl SpeakerEmbeddingManager {\n    /// Create a manager for embeddings with the given dimension.\n    pub fn create(dim: i32) -> Option<Self> {\n        let ptr = unsafe { sys::SherpaOnnxCreateSpeakerEmbeddingManager(dim) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr, dim })\n        }\n    }\n\n    /// Return the embedding dimension expected by the manager.\n    pub fn dim(&self) -> i32 {\n        self.dim\n    }\n\n    /// Add one embedding for `name`.\n    pub fn add(&self, name: &str, embedding: &[f32]) -> bool {\n        if embedding.len() != self.dim as usize {\n            return false;\n        }\n\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        unsafe {\n            sys::SherpaOnnxSpeakerEmbeddingManagerAdd(self.ptr, c_name.as_ptr(), embedding.as_ptr())\n                == 1\n        }\n    }\n\n    /// Add multiple embeddings for `name`.\n    pub fn add_list(&self, name: &str, embeddings: &[Vec<f32>]) -> bool {\n        if embeddings.is_empty()\n            || embeddings\n                .iter()\n                .any(|v| v.len() != self.dim as usize)\n        {\n            return false;\n        }\n\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        let mut ptrs: Vec<*const f32> = embeddings\n            .iter()\n            .map(|v| v.as_ptr())\n            .collect();\n        ptrs.push(ptr::null());\n\n        unsafe {\n            sys::SherpaOnnxSpeakerEmbeddingManagerAddList(self.ptr, c_name.as_ptr(), ptrs.as_ptr())\n                == 1\n        }\n    }\n\n    /// Add multiple embeddings laid out as a flattened slice.\n    pub fn add_list_flattened(&self, name: &str, embeddings: &[f32]) -> bool {\n        if embeddings.is_empty() 
|| embeddings.len() % self.dim as usize != 0 {\n            return false;\n        }\n\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        let n = (embeddings.len() / self.dim as usize) as i32;\n        unsafe {\n            sys::SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(\n                self.ptr,\n                c_name.as_ptr(),\n                embeddings.as_ptr(),\n                n,\n            ) == 1\n        }\n    }\n\n    /// Remove all embeddings stored under `name`.\n    pub fn remove(&self, name: &str) -> bool {\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        unsafe { sys::SherpaOnnxSpeakerEmbeddingManagerRemove(self.ptr, c_name.as_ptr()) == 1 }\n    }\n\n    /// Search for the best matching speaker name above `threshold`.\n    pub fn search(&self, embedding: &[f32], threshold: f32) -> Option<String> {\n        if embedding.len() != self.dim as usize {\n            return None;\n        }\n\n        unsafe {\n            let p = sys::SherpaOnnxSpeakerEmbeddingManagerSearch(\n                self.ptr,\n                embedding.as_ptr(),\n                threshold,\n            );\n            if p.is_null() {\n                None\n            } else {\n                let ans = CStr::from_ptr(p)\n                    .to_string_lossy()\n                    .into_owned();\n                sys::SherpaOnnxSpeakerEmbeddingManagerFreeSearch(p);\n                Some(ans)\n            }\n        }\n    }\n\n    /// Return up to `n` best matches above `threshold`.\n    pub fn get_best_matches(\n        &self,\n        embedding: &[f32],\n        threshold: f32,\n        n: i32,\n    ) -> Vec<SpeakerEmbeddingMatch> {\n        if embedding.len() != self.dim as usize {\n            return Vec::new();\n        }\n\n        unsafe {\n            let r = 
sys::SherpaOnnxSpeakerEmbeddingManagerGetBestMatches(\n                self.ptr,\n                embedding.as_ptr(),\n                threshold,\n                n,\n            );\n            if r.is_null() {\n                return Vec::new();\n            }\n\n            let result = &*r;\n            let matches = slice::from_raw_parts(result.matches, result.count as usize)\n                .iter()\n                .map(|m| SpeakerEmbeddingMatch {\n                    score: m.score,\n                    name: if m\n                        .name\n                        .is_null()\n                    {\n                        String::new()\n                    } else {\n                        CStr::from_ptr(m.name)\n                            .to_string_lossy()\n                            .into_owned()\n                    },\n                })\n                .collect::<Vec<_>>();\n            sys::SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches(r);\n            matches\n        }\n    }\n\n    pub fn verify(&self, name: &str, embedding: &[f32], threshold: f32) -> bool {\n        if embedding.len() != self.dim as usize {\n            return false;\n        }\n\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        unsafe {\n            sys::SherpaOnnxSpeakerEmbeddingManagerVerify(\n                self.ptr,\n                c_name.as_ptr(),\n                embedding.as_ptr(),\n                threshold,\n            ) == 1\n        }\n    }\n\n    pub fn contains(&self, name: &str) -> bool {\n        let c_name = match CString::new(name) {\n            Ok(v) => v,\n            Err(_) => return false,\n        };\n\n        unsafe { sys::SherpaOnnxSpeakerEmbeddingManagerContains(self.ptr, c_name.as_ptr()) == 1 }\n    }\n\n    pub fn num_speakers(&self) -> i32 {\n        unsafe { sys::SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(self.ptr) }\n    }\n\n    pub fn 
get_all_speakers(&self) -> Vec<String> {\n        unsafe {\n            let names = sys::SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(self.ptr);\n            if names.is_null() {\n                return Vec::new();\n            }\n\n            let mut ans = Vec::new();\n            let mut p = names;\n            while !(*p).is_null() {\n                ans.push(\n                    CStr::from_ptr(*p)\n                        .to_string_lossy()\n                        .into_owned(),\n                );\n                p = p.add(1);\n            }\n            sys::SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(names);\n            ans\n        }\n    }\n}\n\nimpl Drop for SpeakerEmbeddingManager {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroySpeakerEmbeddingManager(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/spoken_language_identification.rs",
    "content": "//! Spoken language identification.\n//!\n//! This module identifies the language spoken in an audio clip using the\n//! Whisper-based language ID API. See\n//! [`rust-api-examples/examples/spoken_language_identification.rs`](https://github.com/k2-fsa/sherpa-onnx/blob/master/rust-api-examples/examples/spoken_language_identification.rs)\n//! for a complete example.\n//!\n//! # Example\n//!\n//! ```no_run\n//! use sherpa_onnx::{\n//!     SpokenLanguageIdentification, SpokenLanguageIdentificationConfig,\n//!     SpokenLanguageIdentificationWhisperConfig, Wave,\n//! };\n//!\n//! let wave = Wave::read(\"./test.wav\").expect(\"read wave\");\n//! let config = SpokenLanguageIdentificationConfig {\n//!     whisper: SpokenLanguageIdentificationWhisperConfig {\n//!         encoder: Some(\"./sherpa-onnx-whisper-tiny/encoder.int8.onnx\".into()),\n//!         decoder: Some(\"./sherpa-onnx-whisper-tiny/decoder.int8.onnx\".into()),\n//!         tail_paddings: 0,\n//!     },\n//!     ..Default::default()\n//! };\n//!\n//! let slid = SpokenLanguageIdentification::create(&config).expect(\"create\");\n//! let stream = slid.create_stream();\n//! stream.accept_waveform(wave.sample_rate(), wave.samples());\n//! let result = slid.compute(&stream).expect(\"compute\");\n//! println!(\"{}\", result.lang);\n//! 
```\n\nuse crate::offline_asr::OfflineStream;\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\n\n#[derive(Clone, Debug, Default)]\n/// Whisper model configuration for spoken language identification.\npub struct SpokenLanguageIdentificationWhisperConfig {\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub tail_paddings: i32,\n}\n\nimpl SpokenLanguageIdentificationWhisperConfig {\n    fn to_sys(\n        &self,\n        cstrings: &mut Vec<CString>,\n    ) -> sys::SpokenLanguageIdentificationWhisperConfig {\n        sys::SpokenLanguageIdentificationWhisperConfig {\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            tail_paddings: self.tail_paddings,\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Top-level configuration for [`SpokenLanguageIdentification`].\npub struct SpokenLanguageIdentificationConfig {\n    pub whisper: SpokenLanguageIdentificationWhisperConfig,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl Default for SpokenLanguageIdentificationConfig {\n    fn default() -> Self {\n        Self {\n            whisper: Default::default(),\n            num_threads: 1,\n            debug: false,\n            provider: Some(\"cpu\".to_string()),\n        }\n    }\n}\n\nimpl SpokenLanguageIdentificationConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::SpokenLanguageIdentificationConfig {\n        sys::SpokenLanguageIdentificationConfig {\n            whisper: self\n                .whisper\n                .to_sys(cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Result returned by [`SpokenLanguageIdentification::compute`].\npub struct SpokenLanguageIdentificationResult {\n    pub lang: String,\n}\n\n/// 
Spoken language identifier.\npub struct SpokenLanguageIdentification {\n    ptr: *const sys::SpokenLanguageIdentification,\n}\n\nunsafe impl Send for SpokenLanguageIdentification {}\n\nimpl SpokenLanguageIdentification {\n    /// Create a language identifier from [`SpokenLanguageIdentificationConfig`].\n    pub fn create(config: &SpokenLanguageIdentificationConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateSpokenLanguageIdentification(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Create an offline stream for one audio clip.\n    pub fn create_stream(&self) -> OfflineStream {\n        let ptr =\n            unsafe { sys::SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(self.ptr) };\n        OfflineStream { ptr }\n    }\n\n    /// Compute the spoken language for `stream`.\n    pub fn compute(&self, stream: &OfflineStream) -> Option<SpokenLanguageIdentificationResult> {\n        unsafe {\n            let p = sys::SherpaOnnxSpokenLanguageIdentificationCompute(self.ptr, stream.ptr);\n            if p.is_null() {\n                return None;\n            }\n\n            let ans = SpokenLanguageIdentificationResult {\n                lang: if (*p)\n                    .lang\n                    .is_null()\n                {\n                    String::new()\n                } else {\n                    CStr::from_ptr((*p).lang)\n                        .to_string_lossy()\n                        .into_owned()\n                },\n            };\n\n            sys::SherpaOnnxDestroySpokenLanguageIdentificationResult(p);\n            Some(ans)\n        }\n    }\n}\n\nimpl Drop for SpokenLanguageIdentification {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n         
       sys::SherpaOnnxDestroySpokenLanguageIdentification(self.ptr);\n            }\n        }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/tts.rs",
    "content": "//! Offline text-to-speech.\n//!\n//! Supported model families include VITS, Matcha, Kokoro, Kitten, ZipVoice,\n//! Pocket TTS, and Supertonic. See the repository examples:\n//!\n//! - `rust-api-examples/examples/pocket_tts.rs`\n//! - `rust-api-examples/examples/kokoro_tts_en.rs`\n//! - `rust-api-examples/examples/kokoro_tts_zh_en.rs`\n//! - `rust-api-examples/examples/matcha_tts_en.rs`\n//! - `rust-api-examples/examples/matcha_tts_zh.rs`\n//! - `rust-api-examples/examples/zipvoice_tts.rs`\n//! - `rust-api-examples/examples/supertonic_tts.rs`\n//!\n//! # Example\n//!\n//! ```no_run\n//! use sherpa_onnx::{\n//!     GenerationConfig, OfflineTts, OfflineTtsConfig, OfflineTtsModelConfig,\n//!     OfflineTtsPocketModelConfig, Wave,\n//! };\n//!\n//! let config = OfflineTtsConfig {\n//!     model: OfflineTtsModelConfig {\n//!         pocket: OfflineTtsPocketModelConfig {\n//!             lm_flow: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\".into()),\n//!             lm_main: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\".into()),\n//!             encoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\".into()),\n//!             decoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\".into()),\n//!             text_conditioner: Some(\n//!                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\".into(),\n//!             ),\n//!             vocab_json: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\".into()),\n//!             token_scores_json: Some(\n//!                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\".into(),\n//!             ),\n//!             ..Default::default()\n//!         },\n//!         ..Default::default()\n//!     },\n//!     ..Default::default()\n//! };\n//!\n//! let tts = OfflineTts::create(&config).expect(\"create tts\");\n//! 
let reference = Wave::read(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\")\n//!     .expect(\"read reference\");\n//! let generation_config = GenerationConfig {\n//!     reference_audio: Some(reference.samples().to_vec()),\n//!     reference_sample_rate: reference.sample_rate(),\n//!     ..Default::default()\n//! };\n//! let audio = tts\n//!     .generate_with_config(\"Hello from sherpa-onnx\", &generation_config, None)\n//!     .expect(\"generate\");\n//! println!(\"{}\", audio.sample_rate());\n//! ```\n\nuse crate::utils::to_c_ptr;\nuse sherpa_onnx_sys as sys;\nuse std::collections::HashMap;\nuse std::ffi::CString;\nuse std::os::raw::c_void;\nuse std::ptr;\nuse std::slice;\n\ntype ProgressCallback = dyn FnMut(&[f32], f32) -> bool;\ntype BoxedProgressCallback = Box<ProgressCallback>;\n\n// --- Model config structs ---\n\n#[derive(Clone, Debug)]\n/// VITS model configuration.\npub struct OfflineTtsVitsModelConfig {\n    pub model: Option<String>,\n    pub lexicon: Option<String>,\n    pub tokens: Option<String>,\n    pub data_dir: Option<String>,\n    pub noise_scale: f32,\n    pub noise_scale_w: f32,\n    pub length_scale: f32,\n    pub dict_dir: Option<String>,\n}\n\nimpl Default for OfflineTtsVitsModelConfig {\n    fn default() -> Self {\n        Self {\n            model: None,\n            lexicon: None,\n            tokens: None,\n            data_dir: None,\n            noise_scale: 0.667,\n            noise_scale_w: 0.8,\n            length_scale: 1.0,\n            dict_dir: None,\n        }\n    }\n}\n\nimpl OfflineTtsVitsModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsVitsModelConfig {\n        sys::OfflineTtsVitsModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            lexicon: to_c_ptr(&self.lexicon, cstrings),\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            data_dir: to_c_ptr(&self.data_dir, cstrings),\n            noise_scale: self.noise_scale,\n            
noise_scale_w: self.noise_scale_w,\n            length_scale: self.length_scale,\n            dict_dir: to_c_ptr(&self.dict_dir, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Matcha model configuration.\npub struct OfflineTtsMatchaModelConfig {\n    pub acoustic_model: Option<String>,\n    pub vocoder: Option<String>,\n    pub lexicon: Option<String>,\n    pub tokens: Option<String>,\n    pub data_dir: Option<String>,\n    pub noise_scale: f32,\n    pub length_scale: f32,\n    pub dict_dir: Option<String>,\n}\n\nimpl Default for OfflineTtsMatchaModelConfig {\n    fn default() -> Self {\n        Self {\n            acoustic_model: None,\n            vocoder: None,\n            lexicon: None,\n            tokens: None,\n            data_dir: None,\n            noise_scale: 0.667,\n            length_scale: 1.0,\n            dict_dir: None,\n        }\n    }\n}\n\nimpl OfflineTtsMatchaModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsMatchaModelConfig {\n        sys::OfflineTtsMatchaModelConfig {\n            acoustic_model: to_c_ptr(&self.acoustic_model, cstrings),\n            vocoder: to_c_ptr(&self.vocoder, cstrings),\n            lexicon: to_c_ptr(&self.lexicon, cstrings),\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            data_dir: to_c_ptr(&self.data_dir, cstrings),\n            noise_scale: self.noise_scale,\n            length_scale: self.length_scale,\n            dict_dir: to_c_ptr(&self.dict_dir, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Kokoro model configuration.\npub struct OfflineTtsKokoroModelConfig {\n    pub model: Option<String>,\n    pub voices: Option<String>,\n    pub tokens: Option<String>,\n    pub data_dir: Option<String>,\n    pub length_scale: f32,\n    pub dict_dir: Option<String>,\n    pub lexicon: Option<String>,\n    pub lang: Option<String>,\n}\n\nimpl Default for OfflineTtsKokoroModelConfig {\n    fn default() -> Self {\n        Self {\n          
  model: None,\n            voices: None,\n            tokens: None,\n            data_dir: None,\n            length_scale: 1.0,\n            dict_dir: None,\n            lexicon: None,\n            lang: None,\n        }\n    }\n}\n\nimpl OfflineTtsKokoroModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsKokoroModelConfig {\n        sys::OfflineTtsKokoroModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            voices: to_c_ptr(&self.voices, cstrings),\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            data_dir: to_c_ptr(&self.data_dir, cstrings),\n            length_scale: self.length_scale,\n            dict_dir: to_c_ptr(&self.dict_dir, cstrings),\n            lexicon: to_c_ptr(&self.lexicon, cstrings),\n            lang: to_c_ptr(&self.lang, cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// Kitten model configuration.\npub struct OfflineTtsKittenModelConfig {\n    pub model: Option<String>,\n    pub voices: Option<String>,\n    pub tokens: Option<String>,\n    pub data_dir: Option<String>,\n    pub length_scale: f32,\n}\n\nimpl Default for OfflineTtsKittenModelConfig {\n    fn default() -> Self {\n        Self {\n            model: None,\n            voices: None,\n            tokens: None,\n            data_dir: None,\n            length_scale: 1.0,\n        }\n    }\n}\n\nimpl OfflineTtsKittenModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsKittenModelConfig {\n        sys::OfflineTtsKittenModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            voices: to_c_ptr(&self.voices, cstrings),\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            data_dir: to_c_ptr(&self.data_dir, cstrings),\n            length_scale: self.length_scale,\n        }\n    }\n}\n\n#[derive(Clone, Debug)]\n/// ZipVoice model configuration.\npub struct OfflineTtsZipvoiceModelConfig {\n    pub tokens: Option<String>,\n    pub encoder: 
Option<String>,\n    pub decoder: Option<String>,\n    pub vocoder: Option<String>,\n    pub data_dir: Option<String>,\n    pub lexicon: Option<String>,\n    pub feat_scale: f32,\n    pub t_shift: f32,\n    pub target_rms: f32,\n    pub guidance_scale: f32,\n}\n\nimpl Default for OfflineTtsZipvoiceModelConfig {\n    fn default() -> Self {\n        Self {\n            tokens: None,\n            encoder: None,\n            decoder: None,\n            vocoder: None,\n            data_dir: None,\n            lexicon: None,\n            feat_scale: 0.0,\n            t_shift: 0.0,\n            target_rms: 0.0,\n            guidance_scale: 0.0,\n        }\n    }\n}\n\nimpl OfflineTtsZipvoiceModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsZipvoiceModelConfig {\n        sys::OfflineTtsZipvoiceModelConfig {\n            tokens: to_c_ptr(&self.tokens, cstrings),\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            vocoder: to_c_ptr(&self.vocoder, cstrings),\n            data_dir: to_c_ptr(&self.data_dir, cstrings),\n            lexicon: to_c_ptr(&self.lexicon, cstrings),\n            feat_scale: self.feat_scale,\n            t_shift: self.t_shift,\n            target_rms: self.target_rms,\n            guidance_scale: self.guidance_scale,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Pocket TTS model configuration.\npub struct OfflineTtsPocketModelConfig {\n    pub lm_flow: Option<String>,\n    pub lm_main: Option<String>,\n    pub encoder: Option<String>,\n    pub decoder: Option<String>,\n    pub text_conditioner: Option<String>,\n    pub vocab_json: Option<String>,\n    pub token_scores_json: Option<String>,\n    pub voice_embedding_cache_capacity: i32,\n}\n\nimpl OfflineTtsPocketModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsPocketModelConfig {\n        sys::OfflineTtsPocketModelConfig {\n            lm_flow: 
to_c_ptr(&self.lm_flow, cstrings),\n            lm_main: to_c_ptr(&self.lm_main, cstrings),\n            encoder: to_c_ptr(&self.encoder, cstrings),\n            decoder: to_c_ptr(&self.decoder, cstrings),\n            text_conditioner: to_c_ptr(&self.text_conditioner, cstrings),\n            vocab_json: to_c_ptr(&self.vocab_json, cstrings),\n            token_scores_json: to_c_ptr(&self.token_scores_json, cstrings),\n            voice_embedding_cache_capacity: self.voice_embedding_cache_capacity,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Supertonic model configuration.\npub struct OfflineTtsSupertonicModelConfig {\n    pub duration_predictor: Option<String>,\n    pub text_encoder: Option<String>,\n    pub vector_estimator: Option<String>,\n    pub vocoder: Option<String>,\n    pub tts_json: Option<String>,\n    pub unicode_indexer: Option<String>,\n    pub voice_style: Option<String>,\n}\n\nimpl OfflineTtsSupertonicModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsSupertonicModelConfig {\n        sys::OfflineTtsSupertonicModelConfig {\n            duration_predictor: to_c_ptr(&self.duration_predictor, cstrings),\n            text_encoder: to_c_ptr(&self.text_encoder, cstrings),\n            vector_estimator: to_c_ptr(&self.vector_estimator, cstrings),\n            vocoder: to_c_ptr(&self.vocoder, cstrings),\n            tts_json: to_c_ptr(&self.tts_json, cstrings),\n            unicode_indexer: to_c_ptr(&self.unicode_indexer, cstrings),\n            voice_style: to_c_ptr(&self.voice_style, cstrings),\n        }\n    }\n}\n\n// --- Aggregate config structs ---\n\n#[derive(Clone, Debug, Default)]\n/// Aggregate model configuration for [`OfflineTts`].\n///\n/// Configure exactly one model family for typical use.\npub struct OfflineTtsModelConfig {\n    pub vits: OfflineTtsVitsModelConfig,\n    pub matcha: OfflineTtsMatchaModelConfig,\n    pub kokoro: OfflineTtsKokoroModelConfig,\n    pub kitten: 
OfflineTtsKittenModelConfig,\n    pub zipvoice: OfflineTtsZipvoiceModelConfig,\n    pub pocket: OfflineTtsPocketModelConfig,\n    pub supertonic: OfflineTtsSupertonicModelConfig,\n    pub num_threads: i32,\n    pub debug: bool,\n    pub provider: Option<String>,\n}\n\nimpl OfflineTtsModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsModelConfig {\n        sys::OfflineTtsModelConfig {\n            vits: self\n                .vits\n                .to_sys(cstrings),\n            num_threads: self.num_threads,\n            debug: self.debug as i32,\n            provider: to_c_ptr(&self.provider, cstrings),\n            matcha: self\n                .matcha\n                .to_sys(cstrings),\n            kokoro: self\n                .kokoro\n                .to_sys(cstrings),\n            kitten: self\n                .kitten\n                .to_sys(cstrings),\n            zipvoice: self\n                .zipvoice\n                .to_sys(cstrings),\n            pocket: self\n                .pocket\n                .to_sys(cstrings),\n            supertonic: self\n                .supertonic\n                .to_sys(cstrings),\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Top-level configuration for [`OfflineTts`].\npub struct OfflineTtsConfig {\n    pub model: OfflineTtsModelConfig,\n    pub rule_fsts: Option<String>,\n    pub max_num_sentences: i32,\n    pub rule_fars: Option<String>,\n    pub silence_scale: f32,\n}\n\nimpl OfflineTtsConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::OfflineTtsConfig {\n        sys::OfflineTtsConfig {\n            model: self\n                .model\n                .to_sys(cstrings),\n            rule_fsts: to_c_ptr(&self.rule_fsts, cstrings),\n            max_num_sentences: self.max_num_sentences,\n            rule_fars: to_c_ptr(&self.rule_fars, cstrings),\n            silence_scale: self.silence_scale,\n        }\n    }\n}\n\n// --- Generation config 
---\n\n#[derive(Clone, Debug)]\n/// Per-request generation options for [`OfflineTts::generate_with_config`].\npub struct GenerationConfig {\n    pub silence_scale: f32,\n    pub speed: f32,\n    pub sid: i32,\n    pub reference_audio: Option<Vec<f32>>,\n    pub reference_sample_rate: i32,\n    pub reference_text: Option<String>,\n    pub num_steps: i32,\n    pub extra: Option<HashMap<String, serde_json::Value>>,\n}\n\nimpl Default for GenerationConfig {\n    fn default() -> Self {\n        Self {\n            silence_scale: 0.2,\n            speed: 1.0,\n            sid: 0,\n            reference_audio: None,\n            reference_sample_rate: 0,\n            reference_text: None,\n            num_steps: 5,\n            extra: None,\n        }\n    }\n}\n\n// --- Generated audio ---\n\n/// Generated audio returned by [`OfflineTts::generate_with_config`].\npub struct GeneratedAudio {\n    ptr: *const sys::SherpaOnnxGeneratedAudio,\n}\n\nimpl GeneratedAudio {\n    /// Borrow generated samples.\n    pub fn samples(&self) -> &[f32] {\n        unsafe {\n            let p = &*self.ptr;\n            if p.samples\n                .is_null()\n                || p.n <= 0\n            {\n                &[]\n            } else {\n                slice::from_raw_parts(p.samples, p.n as usize)\n            }\n        }\n    }\n\n    /// Return the output sample rate in Hz.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { (*self.ptr).sample_rate }\n    }\n\n    /// Save generated audio to a WAV file.\n    pub fn save(&self, filename: &str) -> bool {\n        crate::wave::write(filename, self.samples(), self.sample_rate())\n    }\n}\n\nimpl Drop for GeneratedAudio {\n    fn drop(&mut self) {\n        unsafe {\n            if !self\n                .ptr\n                .is_null()\n            {\n                sys::SherpaOnnxDestroyOfflineTtsGeneratedAudio(self.ptr);\n            }\n        }\n    }\n}\n\n// --- Offline TTS ---\n\n/// Offline TTS engine.\n///\n/// 
```no_run\n/// use sherpa_onnx::{\n///     OfflineTts, OfflineTtsConfig, OfflineTtsModelConfig, OfflineTtsPocketModelConfig,\n/// };\n///\n/// let config = OfflineTtsConfig {\n///     model: OfflineTtsModelConfig {\n///         pocket: OfflineTtsPocketModelConfig {\n///             lm_flow: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\".into()),\n///             lm_main: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\".into()),\n///             encoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\".into()),\n///             decoder: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\".into()),\n///             text_conditioner: Some(\n///                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\".into(),\n///             ),\n///             vocab_json: Some(\"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\".into()),\n///             token_scores_json: Some(\n///                 \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\".into(),\n///             ),\n///             ..Default::default()\n///         },\n///         ..Default::default()\n///     },\n///     ..Default::default()\n/// };\n///\n/// let tts = OfflineTts::create(&config).expect(\"create tts\");\n/// println!(\"{}\", tts.sample_rate());\n/// ```\npub struct OfflineTts {\n    ptr: *const sys::SherpaOnnxOfflineTts,\n}\n\nunsafe impl Send for OfflineTts {}\n\nimpl OfflineTts {\n    /// Create a TTS engine from `config`.\n    pub fn create(config: &OfflineTtsConfig) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n        let ptr = unsafe { sys::SherpaOnnxCreateOfflineTts(&sys_config) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Return the output sample rate in Hz.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { 
sys::SherpaOnnxOfflineTtsSampleRate(self.ptr) }\n    }\n\n    /// Return the number of built-in speakers reported by the model.\n    pub fn num_speakers(&self) -> i32 {\n        unsafe { sys::SherpaOnnxOfflineTtsNumSpeakers(self.ptr) }\n    }\n\n    /// Generate audio for `text`.\n    ///\n    /// The optional callback receives the samples generated so far together\n    /// with a progress value in `[0, 1]`. Return `true` to continue and\n    /// `false` to stop early.\n    pub fn generate_with_config<F>(\n        &self,\n        text: &str,\n        config: &GenerationConfig,\n        callback: Option<F>,\n    ) -> Option<GeneratedAudio>\n    where\n        F: FnMut(&[f32], f32) -> bool + 'static,\n    {\n        let mut cstrings = Vec::new();\n\n        let c_text = CString::new(text).unwrap();\n\n        // Build extra JSON string\n        let extra_json = match &config.extra {\n            Some(map) => serde_json::to_string(map).unwrap_or_else(|_| \"{}\".to_string()),\n            None => \"{}\".to_string(),\n        };\n        let c_extra = CString::new(extra_json).unwrap();\n        let c_ref_text = to_c_ptr(&config.reference_text, &mut cstrings);\n\n        let (ref_ptr, ref_len) = match &config.reference_audio {\n            Some(samples) => (samples.as_ptr(), samples.len() as i32),\n            None => (ptr::null(), 0),\n        };\n\n        let sys_gen_config = sys::SherpaOnnxGenerationConfig {\n            silence_scale: config.silence_scale,\n            speed: config.speed,\n            sid: config.sid,\n            reference_audio: ref_ptr,\n            reference_audio_len: ref_len,\n            reference_sample_rate: config.reference_sample_rate,\n            reference_text: c_ref_text,\n            num_steps: config.num_steps,\n            extra: c_extra.as_ptr(),\n        };\n\n        let (c_callback, c_arg): (\n            sys::SherpaOnnxGeneratedAudioProgressCallbackWithArg,\n            *mut c_void,\n        ) = if let Some(cb) = callback {\n 
           let boxed: Box<BoxedProgressCallback> = Box::new(Box::new(cb));\n            let raw = Box::into_raw(boxed);\n            (Some(progress_callback_trampoline), raw as *mut c_void)\n        } else {\n            (None, ptr::null_mut())\n        };\n\n        let audio_ptr = unsafe {\n            sys::SherpaOnnxOfflineTtsGenerateWithConfig(\n                self.ptr,\n                c_text.as_ptr(),\n                &sys_gen_config,\n                c_callback,\n                c_arg,\n            )\n        };\n\n        // Clean up the boxed callback if we allocated one\n        if !c_arg.is_null() {\n            unsafe {\n                let _ = Box::from_raw(c_arg as *mut BoxedProgressCallback);\n            }\n        }\n\n        if audio_ptr.is_null() {\n            None\n        } else {\n            Some(GeneratedAudio { ptr: audio_ptr })\n        }\n    }\n}\n\nimpl Drop for OfflineTts {\n    fn drop(&mut self) {\n        unsafe {\n            sys::SherpaOnnxDestroyOfflineTts(self.ptr);\n        }\n    }\n}\n\nunsafe extern \"C\" fn progress_callback_trampoline(\n    samples: *const f32,\n    n: i32,\n    progress: f32,\n    arg: *mut c_void,\n) -> i32 {\n    let cb = &mut *(arg as *mut BoxedProgressCallback);\n    let data = if samples.is_null() || n <= 0 {\n        &[]\n    } else {\n        slice::from_raw_parts(samples, n as usize)\n    };\n    if cb(data, progress) {\n        1\n    } else {\n        0\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/utils.rs",
    "content": "use sherpa_onnx_sys as sys;\nuse std::ffi::{CStr, CString};\nuse std::os::raw::c_char;\nuse std::ptr;\n\n/// Safely convert a C string pointer to a `'static` Rust string slice.\n///\n/// If the pointer is null, an empty string is returned.\n/// If the C string is not valid UTF-8, a lossy UTF-8 conversion is used\n/// and the resulting string is leaked to obtain a `'static` lifetime.\nfn c_str_to_static_str(ptr: *const c_char) -> &'static str {\n    assert!(!ptr.is_null(), \"C string pointer is null\");\n\n    unsafe {\n        CStr::from_ptr(ptr)\n            .to_str()\n            .unwrap()\n    }\n}\n\n/// Return the sherpa-onnx version string compiled into the native library.\npub fn version() -> &'static str {\n    let ptr = unsafe { sys::SherpaOnnxGetVersionStr() };\n    c_str_to_static_str(ptr)\n}\n\n/// Return the Git SHA1 of the native library build.\npub fn git_sha1() -> &'static str {\n    let ptr = unsafe { sys::SherpaOnnxGetGitSha1() };\n    c_str_to_static_str(ptr)\n}\n\n/// Return the Git date of the native library build.\npub fn git_date() -> &'static str {\n    let ptr = unsafe { sys::SherpaOnnxGetGitDate() };\n    c_str_to_static_str(ptr)\n}\n\n/// Return `true` if `filename` exists according to the native helper.\npub fn file_exists(filename: &str) -> bool {\n    let cstr = match CString::new(filename) {\n        Ok(cstr) => cstr,\n        Err(_) => {\n            // Invalid input (e.g., contains interior NUL); treat as non-existent.\n            return false;\n        }\n    };\n\n    unsafe { sys::SherpaOnnxFileExists(cstr.as_ptr()) != 0 }\n}\n\npub(crate) fn to_c_ptr(opt: &Option<String>, storage: &mut Vec<CString>) -> *const c_char {\n    if let Some(s) = opt {\n        let c = CString::new(s.as_str()).unwrap();\n        let ptr = c.as_ptr();\n        storage.push(c);\n        ptr\n    } else {\n        ptr::null()\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/vad.rs",
    "content": "//! Voice activity detection and buffering helpers.\n//!\n//! See `rust-api-examples/examples/silero_vad_remove_silence.rs` for a complete\n//! example that removes non-speech segments from a WAV file.\n\nuse crate::utils::to_c_ptr;\nuse std::ffi::CString;\nuse std::slice;\n\nuse sherpa_onnx_sys as sys;\n\n#[derive(Clone, Debug, Default)]\n/// Silero VAD configuration.\npub struct SileroVadModelConfig {\n    pub model: Option<String>,\n    pub threshold: f32,\n    pub min_silence_duration: f32,\n    pub min_speech_duration: f32,\n    pub window_size: i32,\n    pub max_speech_duration: f32,\n}\n\nimpl SileroVadModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::SileroVadModelConfig {\n        sys::SileroVadModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            threshold: self.threshold,\n            min_silence_duration: self.min_silence_duration,\n            min_speech_duration: self.min_speech_duration,\n            window_size: self.window_size,\n            max_speech_duration: self.max_speech_duration,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Ten VAD configuration.\npub struct TenVadModelConfig {\n    pub model: Option<String>,\n    pub threshold: f32,\n    pub min_silence_duration: f32,\n    pub min_speech_duration: f32,\n    pub window_size: i32,\n    pub max_speech_duration: f32,\n}\n\nimpl TenVadModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::TenVadModelConfig {\n        sys::TenVadModelConfig {\n            model: to_c_ptr(&self.model, cstrings),\n            threshold: self.threshold,\n            min_silence_duration: self.min_silence_duration,\n            min_speech_duration: self.min_speech_duration,\n            window_size: self.window_size,\n            max_speech_duration: self.max_speech_duration,\n        }\n    }\n}\n\n#[derive(Clone, Debug, Default)]\n/// Top-level model configuration for [`VoiceActivityDetector`].\n///\n/// Configure 
exactly one model family for typical use.\npub struct VadModelConfig {\n    pub silero_vad: SileroVadModelConfig,\n    pub ten_vad: TenVadModelConfig,\n    pub sample_rate: i32,\n    pub num_threads: i32,\n    pub provider: Option<String>,\n    pub debug: bool,\n}\n\nimpl VadModelConfig {\n    fn to_sys(&self, cstrings: &mut Vec<CString>) -> sys::VadModelConfig {\n        sys::VadModelConfig {\n            silero_vad: self\n                .silero_vad\n                .to_sys(cstrings),\n            ten_vad: self\n                .ten_vad\n                .to_sys(cstrings),\n            sample_rate: self.sample_rate,\n            num_threads: self.num_threads,\n            provider: to_c_ptr(&self.provider, cstrings),\n            debug: self.debug as i32,\n        }\n    }\n}\n\n/// Circular sample buffer used by some VAD workflows.\npub struct CircularBuffer {\n    ptr: *const sys::CircularBuffer,\n}\n\nimpl CircularBuffer {\n    /// Create a new buffer with capacity measured in samples.\n    pub fn new(capacity: i32) -> Option<Self> {\n        let ptr = unsafe { sys::SherpaOnnxCreateCircularBuffer(capacity) };\n        if ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Append samples to the tail of the buffer.\n    pub fn push(&self, samples: &[f32]) {\n        unsafe {\n            sys::SherpaOnnxCircularBufferPush(self.ptr, samples.as_ptr(), samples.len() as i32)\n        }\n    }\n\n    /// Copy `n` samples starting at `start_index`.\n    pub fn get(&self, start_index: i32, n: i32) -> Vec<f32> {\n        unsafe {\n            let p = sys::SherpaOnnxCircularBufferGet(self.ptr, start_index, n);\n            if p.is_null() {\n                return vec![];\n            }\n            let slice = slice::from_raw_parts(p, n as usize);\n            let result = slice.to_vec();\n            sys::SherpaOnnxCircularBufferFree(p);\n            result\n        }\n    }\n\n    /// Drop `n` samples from the 
head of the buffer.\n    pub fn pop(&self, n: i32) {\n        unsafe { sys::SherpaOnnxCircularBufferPop(self.ptr, n) }\n    }\n\n    /// Return the number of samples currently stored.\n    pub fn size(&self) -> i32 {\n        unsafe { sys::SherpaOnnxCircularBufferSize(self.ptr) }\n    }\n\n    /// Return the logical head position.\n    pub fn head(&self) -> i32 {\n        unsafe { sys::SherpaOnnxCircularBufferHead(self.ptr) }\n    }\n\n    /// Clear the buffer.\n    pub fn reset(&self) {\n        unsafe { sys::SherpaOnnxCircularBufferReset(self.ptr) }\n    }\n}\n\nimpl Drop for CircularBuffer {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyCircularBuffer(self.ptr) }\n    }\n}\n\n/// One detected speech segment.\npub struct SpeechSegment {\n    ptr: *const sys::SpeechSegment,\n}\n\nimpl SpeechSegment {\n    /// Start index, in samples, relative to the input seen so far.\n    pub fn start(&self) -> i32 {\n        unsafe { (*self.ptr).start }\n    }\n\n    /// Borrow the segment samples.\n    pub fn samples(&self) -> &[f32] {\n        unsafe { slice::from_raw_parts((*self.ptr).samples, (*self.ptr).n as usize) }\n    }\n\n    /// Return the number of samples in the segment.\n    pub fn n(&self) -> i32 {\n        unsafe { (*self.ptr).n }\n    }\n}\n\nimpl Drop for SpeechSegment {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroySpeechSegment(self.ptr) }\n    }\n}\n\n/// Voice activity detector that emits speech segments.\npub struct VoiceActivityDetector {\n    ptr: *const sys::VoiceActivityDetector,\n}\n\nimpl VoiceActivityDetector {\n    /// Create a detector and an internal result buffer.\n    pub fn create(config: &VadModelConfig, buffer_size_in_seconds: f32) -> Option<Self> {\n        let mut cstrings = Vec::new();\n        let sys_config = config.to_sys(&mut cstrings);\n\n        let ptr = unsafe {\n            sys::SherpaOnnxCreateVoiceActivityDetector(&sys_config, buffer_size_in_seconds)\n        };\n\n        if 
ptr.is_null() {\n            None\n        } else {\n            Some(Self { ptr })\n        }\n    }\n\n    /// Feed waveform samples to the detector.\n    pub fn accept_waveform(&self, samples: &[f32]) {\n        unsafe {\n            sys::SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n                self.ptr,\n                samples.as_ptr(),\n                samples.len() as i32,\n            )\n        }\n    }\n\n    /// Return `true` if there are no queued speech segments.\n    pub fn is_empty(&self) -> bool {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorEmpty(self.ptr) != 0 }\n    }\n\n    /// Return `true` if speech is currently being detected.\n    pub fn detected(&self) -> bool {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorDetected(self.ptr) != 0 }\n    }\n\n    /// Drop the front speech segment, if any.\n    pub fn pop(&self) {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorPop(self.ptr) }\n    }\n\n    /// Remove all queued segments.\n    pub fn clear(&self) {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorClear(self.ptr) }\n    }\n\n    /// Borrow the front speech segment, if available.\n    pub fn front(&self) -> Option<SpeechSegment> {\n        if self.is_empty() {\n            return None;\n        }\n\n        unsafe {\n            let ptr = sys::SherpaOnnxVoiceActivityDetectorFront(self.ptr);\n            if ptr.is_null() {\n                None\n            } else {\n                Some(SpeechSegment { ptr })\n            }\n        }\n    }\n\n    /// Reset the detector state.\n    pub fn reset(&self) {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorReset(self.ptr) }\n    }\n\n    /// Flush any buffered trailing speech into the output queue.\n    pub fn flush(&self) {\n        unsafe { sys::SherpaOnnxVoiceActivityDetectorFlush(self.ptr) }\n    }\n}\n\nimpl Drop for VoiceActivityDetector {\n    fn drop(&mut self) {\n        unsafe { sys::SherpaOnnxDestroyVoiceActivityDetector(self.ptr) }\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx/src/wave.rs",
    "content": "//! WAV file helpers used by the Rust wrappers and examples.\n\nuse std::ffi::CString;\nuse std::slice;\n\nuse sherpa_onnx_sys as sys;\n\n#[derive(Debug)]\n/// A WAV file loaded through sherpa-onnx.\n///\n/// Samples are exposed as normalized `f32` PCM values. Use [`Wave::read`] to\n/// load a file and [`Wave::write`] or [`write()`] to save audio.\npub struct Wave {\n    inner: *const sys::SherpaOnnxWave,\n}\n\nimpl Wave {\n    /// Read a mono WAV file from disk.\n    ///\n    /// Returns `None` if the file cannot be opened or decoded.\n    pub fn read(filename: &str) -> Option<Self> {\n        let c_filename = CString::new(filename).unwrap();\n        let wave_ptr = unsafe { sys::SherpaOnnxReadWave(c_filename.as_ptr()) };\n        if wave_ptr.is_null() {\n            None\n        } else {\n            Some(Self { inner: wave_ptr })\n        }\n    }\n\n    /// Write this waveform to a WAV file.\n    pub fn write(&self, filename: &str) -> bool {\n        let c_filename = CString::new(filename).unwrap();\n        unsafe {\n            sys::SherpaOnnxWriteWave(\n                (*self.inner).samples,\n                (*self.inner).num_samples,\n                (*self.inner).sample_rate,\n                c_filename.as_ptr(),\n            ) == 1\n        }\n    }\n\n    /// Return the sample rate in Hz.\n    pub fn sample_rate(&self) -> i32 {\n        unsafe { (*self.inner).sample_rate }\n    }\n\n    /// Return the number of samples in the waveform.\n    pub fn num_samples(&self) -> i32 {\n        unsafe { (*self.inner).num_samples }\n    }\n\n    /// Return the normalized PCM samples.\n    pub fn samples(&self) -> &[f32] {\n        unsafe {\n            let ptr = (*self.inner).samples;\n            let len = (*self.inner).num_samples as usize;\n\n            if ptr.is_null() || len == 0 {\n                &[]\n            } else {\n                slice::from_raw_parts(ptr, len)\n            }\n        }\n    }\n}\n\nimpl Drop for Wave {\n    fn 
drop(&mut self) {\n        unsafe {\n            if !self\n                .inner\n                .is_null()\n            {\n                sys::SherpaOnnxFreeWave(self.inner);\n            }\n        }\n    }\n}\n\n/// Write normalized PCM samples to a WAV file.\n///\n/// This is convenient when an API returns a plain `Vec<f32>` and you do not\n/// need to build a [`Wave`] first.\npub fn write(filename: &str, samples: &[f32], sample_rate: i32) -> bool {\n    let c_filename = CString::new(filename).unwrap();\n    unsafe {\n        sys::SherpaOnnxWriteWave(\n            samples.as_ptr(),\n            samples.len() as i32,\n            sample_rate,\n            c_filename.as_ptr(),\n        ) == 1\n    }\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/Cargo.toml",
    "content": "[package]\nname = \"sherpa-onnx-sys\"\nversion = \"1.12.31\"\nedition = \"2021\"\ndescription = \"Raw FFI bindings to the sherpa-onnx C API\"\nlicense = \"Apache-2.0\"\nrepository = \"https://github.com/k2-fsa/sherpa-onnx\"\nreadme = \"README.md\"\nlinks = \"sherpa-onnx\"\n\nkeywords = [\"ffi\", \"speech\", \"sherpa-onnx\", \"bindings\"]\ncategories = [\"external-ffi-bindings\"]\n\ninclude = [\n    \"src/**\",\n    \"build.rs\",\n    \"Cargo.toml\",\n    \"README.md\",\n    \"LICENSE*\",\n]\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/build.rs",
    "content": "use std::env;\n\nfn main() {\n    // Try to get library directory from environment variable\n    let lib_dir = env::var(\"SHERPA_ONNX_LIB_DIR\").ok();\n\n    match &lib_dir {\n        Some(path) => {\n            println!(\"cargo:warning=SHERPA_ONNX_LIB_DIR={}\", path);\n\n            // Tell Rust/Cargo where to find the libraries at build time\n            println!(\"cargo:rustc-link-search=native={}\", path);\n\n            // Add rpath for Linux/macOS\n            if cfg!(any(target_os = \"linux\", target_os = \"macos\")) {\n                println!(\"cargo:rustc-link-arg=-Wl,-rpath,{}\", path);\n            }\n        }\n        None => {\n            println!(\"cargo:warning=SHERPA_ONNX_LIB_DIR not set. You may need to set it to the folder containing libsherpa-onnx-c-api and libonnxruntime.\");\n        }\n    }\n\n    // Link the dynamic libraries regardless (cargo will fail later if not found)\n    println!(\"cargo:rustc-link-lib=dylib=sherpa-onnx-c-api\");\n    println!(\"cargo:rustc-link-lib=dylib=onnxruntime\");\n\n    // Rebuild if the env variable changes\n    println!(\"cargo:rerun-if-env-changed=SHERPA_ONNX_LIB_DIR\");\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/audio_tagging.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineZipformerAudioTaggingModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct AudioTaggingModelConfig {\n    pub zipformer: OfflineZipformerAudioTaggingModelConfig,\n    pub ced: *const c_char,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct AudioTaggingConfig {\n    pub model: AudioTaggingModelConfig,\n    pub labels: *const c_char,\n    pub top_k: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct AudioEvent {\n    pub name: *const c_char,\n    pub index: i32,\n    pub prob: c_float,\n}\n\n#[repr(C)]\npub struct AudioTagging {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateAudioTagging(config: *const AudioTaggingConfig) -> *const AudioTagging;\n\n    pub fn SherpaOnnxDestroyAudioTagging(tagger: *const AudioTagging);\n\n    pub fn SherpaOnnxAudioTaggingCreateOfflineStream(\n        tagger: *const AudioTagging,\n    ) -> *const crate::offline_asr::OfflineStream;\n\n    pub fn SherpaOnnxAudioTaggingCompute(\n        tagger: *const AudioTagging,\n        s: *const crate::offline_asr::OfflineStream,\n        top_k: i32,\n    ) -> *const *const AudioEvent;\n\n    pub fn SherpaOnnxAudioTaggingFreeResults(p: *const *const AudioEvent);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/kws.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct KeywordResult {\n    pub keyword: *const c_char,\n    pub tokens: *const c_char,\n    pub tokens_arr: *const *const c_char,\n    pub count: i32,\n    pub timestamps: *mut c_float,\n    pub start_time: c_float,\n    pub json: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct KeywordSpotterConfig {\n    pub feat_config: super::online_asr::FeatureConfig,\n    pub model_config: super::online_asr::OnlineModelConfig,\n    pub max_active_paths: i32,\n    pub num_trailing_blanks: i32,\n    pub keywords_score: c_float,\n    pub keywords_threshold: c_float,\n    pub keywords_file: *const c_char,\n    pub keywords_buf: *const c_char,\n    pub keywords_buf_size: i32,\n}\n\n#[repr(C)]\npub struct KeywordSpotter {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateKeywordSpotter(\n        config: *const KeywordSpotterConfig,\n    ) -> *const KeywordSpotter;\n\n    pub fn SherpaOnnxDestroyKeywordSpotter(spotter: *const KeywordSpotter);\n\n    pub fn SherpaOnnxCreateKeywordStream(\n        spotter: *const KeywordSpotter,\n    ) -> *const super::online_asr::OnlineStream;\n\n    pub fn SherpaOnnxCreateKeywordStreamWithKeywords(\n        spotter: *const KeywordSpotter,\n        keywords: *const c_char,\n    ) -> *const super::online_asr::OnlineStream;\n\n    pub fn SherpaOnnxIsKeywordStreamReady(\n        spotter: *const KeywordSpotter,\n        stream: *const super::online_asr::OnlineStream,\n    ) -> i32;\n\n    pub fn SherpaOnnxDecodeKeywordStream(\n        spotter: *const KeywordSpotter,\n        stream: *const super::online_asr::OnlineStream,\n    );\n\n    pub fn SherpaOnnxResetKeywordStream(\n        spotter: *const KeywordSpotter,\n        stream: *const super::online_asr::OnlineStream,\n    );\n\n    pub fn 
SherpaOnnxDecodeMultipleKeywordStreams(\n        spotter: *const KeywordSpotter,\n        streams: *const *const super::online_asr::OnlineStream,\n        n: i32,\n    );\n\n    pub fn SherpaOnnxGetKeywordResult(\n        spotter: *const KeywordSpotter,\n        stream: *const super::online_asr::OnlineStream,\n    ) -> *const KeywordResult;\n\n    pub fn SherpaOnnxDestroyKeywordResult(r: *const KeywordResult);\n\n    pub fn SherpaOnnxGetKeywordResultAsJson(\n        spotter: *const KeywordSpotter,\n        stream: *const super::online_asr::OnlineStream,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxFreeKeywordResultJson(s: *const c_char);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/lib.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::c_char;\n\nextern \"C\" {\n    pub fn SherpaOnnxGetVersionStr() -> *const c_char;\n    pub fn SherpaOnnxGetGitSha1() -> *const c_char;\n    pub fn SherpaOnnxGetGitDate() -> *const c_char;\n    pub fn SherpaOnnxFileExists(filename: *const c_char) -> i32;\n}\n\npub mod audio_tagging;\npub mod kws;\npub mod offline_asr;\npub mod offline_punctuation;\npub mod offline_speaker_diarization;\npub mod online_asr;\npub mod online_punctuation;\npub mod speaker_embedding;\npub mod speech_denoiser;\npub mod spoken_language_identification;\npub mod tts;\npub mod vad;\npub mod wave;\n\npub use audio_tagging::*;\npub use kws::*;\npub use offline_asr::*;\npub use offline_punctuation::*;\npub use offline_speaker_diarization::*;\npub use online_asr::*;\npub use online_punctuation::*;\npub use speaker_embedding::*;\npub use speech_denoiser::*;\npub use spoken_language_identification::*;\npub use tts::*;\npub use vad::*;\npub use wave::*;\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/offline_asr.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTransducerModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub joiner: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineParaformerModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineNemoEncDecCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineWhisperModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub language: *const c_char,\n    pub task: *const c_char,\n    pub tail_paddings: i32,\n    pub enable_token_timestamps: i32,\n    pub enable_segment_timestamps: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineCanaryModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub src_lang: *const c_char,\n    pub tgt_lang: *const c_char,\n    pub use_pnc: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineFireRedAsrModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineMoonshineModelConfig {\n    pub preprocessor: *const c_char,\n    pub encoder: *const c_char,\n    pub uncached_decoder: *const c_char,\n    pub cached_decoder: *const c_char,\n    pub merged_decoder: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTdnnModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineLMConfig {\n    pub model: *const c_char,\n    pub scale: c_float,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSenseVoiceModelConfig {\n    pub model: *const c_char,\n    pub language: *const 
c_char,\n    pub use_itn: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineDolphinModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineZipformerCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineWenetCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineOmnilingualAsrCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineFunASRNanoModelConfig {\n    pub encoder_adaptor: *const c_char,\n    pub llm: *const c_char,\n    pub embedding: *const c_char,\n    pub tokenizer: *const c_char,\n    pub system_prompt: *const c_char,\n    pub user_prompt: *const c_char,\n    pub max_new_tokens: i32,\n    pub temperature: c_float,\n    pub top_p: c_float,\n    pub seed: i32,\n    pub language: *const c_char,\n    pub itn: i32,\n    pub hotwords: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineMedAsrCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineFireRedAsrCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineModelConfig {\n    pub transducer: OfflineTransducerModelConfig,\n    pub paraformer: OfflineParaformerModelConfig,\n    pub nemo_ctc: OfflineNemoEncDecCtcModelConfig,\n    pub whisper: OfflineWhisperModelConfig,\n    pub tdnn: OfflineTdnnModelConfig,\n\n    pub tokens: *const c_char,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n    pub model_type: *const c_char,\n    pub modeling_unit: *const c_char,\n    pub bpe_vocab: *const c_char,\n    pub telespeech_ctc: *const c_char,\n\n    pub sense_voice: OfflineSenseVoiceModelConfig,\n    pub moonshine: OfflineMoonshineModelConfig,\n    pub fire_red_asr: 
OfflineFireRedAsrModelConfig,\n    pub dolphin: OfflineDolphinModelConfig,\n    pub zipformer_ctc: OfflineZipformerCtcModelConfig,\n    pub canary: OfflineCanaryModelConfig,\n    pub wenet_ctc: OfflineWenetCtcModelConfig,\n    pub omnilingual: OfflineOmnilingualAsrCtcModelConfig,\n    pub medasr: OfflineMedAsrCtcModelConfig,\n    pub funasr_nano: OfflineFunASRNanoModelConfig,\n    pub fire_red_asr_ctc: OfflineFireRedAsrCtcModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineRecognizerConfig {\n    pub feat_config: super::online_asr::FeatureConfig,\n    pub model_config: OfflineModelConfig,\n    pub lm_config: OfflineLMConfig,\n\n    pub decoding_method: *const c_char,\n    pub max_active_paths: i32,\n    pub hotwords_file: *const c_char,\n    pub hotwords_score: c_float,\n    pub rule_fsts: *const c_char,\n    pub rule_fars: *const c_char,\n    pub blank_penalty: c_float,\n    pub hr: super::online_asr::HomophoneReplacerConfig,\n}\n\n#[repr(C)]\npub struct OfflineRecognizer {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct OfflineStream {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOfflineRecognizer(\n        config: *const OfflineRecognizerConfig,\n    ) -> *const OfflineRecognizer;\n\n    pub fn SherpaOnnxDestroyOfflineRecognizer(recognizer: *const OfflineRecognizer);\n\n    pub fn SherpaOnnxCreateOfflineStream(\n        recognizer: *const OfflineRecognizer,\n    ) -> *const OfflineStream;\n\n    pub fn SherpaOnnxCreateOfflineStreamWithHotwords(\n        recognizer: *const OfflineRecognizer,\n        hotwords: *const c_char,\n    ) -> *const OfflineStream;\n\n    pub fn SherpaOnnxDestroyOfflineStream(stream: *const OfflineStream);\n\n    pub fn SherpaOnnxAcceptWaveformOffline(\n        stream: *const OfflineStream,\n        sample_rate: i32,\n        samples: *const f32,\n        n: i32,\n    );\n\n    pub fn SherpaOnnxOfflineStreamSetOption(\n        stream: *const OfflineStream,\n        key: *const 
c_char,\n        value: *const c_char,\n    );\n\n    pub fn SherpaOnnxOfflineStreamGetOption(\n        stream: *const OfflineStream,\n        key: *const c_char,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxOfflineStreamHasOption(\n        stream: *const OfflineStream,\n        key: *const c_char,\n    ) -> i32;\n\n    pub fn SherpaOnnxDecodeOfflineStream(\n        recognizer: *const OfflineRecognizer,\n        stream: *const OfflineStream,\n    );\n\n    pub fn SherpaOnnxDecodeMultipleOfflineStreams(\n        recognizer: *const OfflineRecognizer,\n        streams: *const *const OfflineStream,\n        n: i32,\n    );\n\n    pub fn SherpaOnnxGetOfflineStreamResultAsJson(stream: *const OfflineStream) -> *const c_char;\n\n    pub fn SherpaOnnxDestroyOfflineStreamResultJson(s: *const c_char);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/offline_punctuation.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::c_char;\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflinePunctuationModelConfig {\n    pub ct_transformer: *const c_char,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflinePunctuationConfig {\n    pub model: OfflinePunctuationModelConfig,\n}\n\n#[repr(C)]\npub struct OfflinePunctuation {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOfflinePunctuation(\n        config: *const OfflinePunctuationConfig,\n    ) -> *const OfflinePunctuation;\n\n    pub fn SherpaOnnxDestroyOfflinePunctuation(punct: *const OfflinePunctuation);\n\n    pub fn SherpaOfflinePunctuationAddPunct(\n        punct: *const OfflinePunctuation,\n        text: *const c_char,\n    ) -> *const c_char;\n\n    pub fn SherpaOfflinePunctuationFreeText(text: *const c_char);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/offline_speaker_diarization.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeakerSegmentationPyannoteModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeakerSegmentationModelConfig {\n    pub pyannote: OfflineSpeakerSegmentationPyannoteModelConfig,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct FastClusteringConfig {\n    pub num_clusters: i32,\n    pub threshold: c_float,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeakerDiarizationConfig {\n    pub segmentation: OfflineSpeakerSegmentationModelConfig,\n    pub embedding: crate::speaker_embedding::SpeakerEmbeddingExtractorConfig,\n    pub clustering: FastClusteringConfig,\n    pub min_duration_on: c_float,\n    pub min_duration_off: c_float,\n}\n\n#[repr(C)]\npub struct OfflineSpeakerDiarization {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct OfflineSpeakerDiarizationResult {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeakerDiarizationSegment {\n    pub start: c_float,\n    pub end: c_float,\n    pub speaker: i32,\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOfflineSpeakerDiarization(\n        config: *const OfflineSpeakerDiarizationConfig,\n    ) -> *const OfflineSpeakerDiarization;\n\n    pub fn SherpaOnnxDestroyOfflineSpeakerDiarization(sd: *const OfflineSpeakerDiarization);\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(\n        sd: *const OfflineSpeakerDiarization,\n    ) -> i32;\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationSetConfig(\n        sd: *const OfflineSpeakerDiarization,\n        config: *const OfflineSpeakerDiarizationConfig,\n    );\n\n    pub fn 
SherpaOnnxOfflineSpeakerDiarizationResultGetNumSpeakers(\n        r: *const OfflineSpeakerDiarizationResult,\n    ) -> i32;\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(\n        r: *const OfflineSpeakerDiarizationResult,\n    ) -> i32;\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(\n        r: *const OfflineSpeakerDiarizationResult,\n    ) -> *const OfflineSpeakerDiarizationSegment;\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationDestroySegment(\n        s: *const OfflineSpeakerDiarizationSegment,\n    );\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationProcess(\n        sd: *const OfflineSpeakerDiarization,\n        samples: *const c_float,\n        n: i32,\n    ) -> *const OfflineSpeakerDiarizationResult;\n\n    pub fn SherpaOnnxOfflineSpeakerDiarizationDestroyResult(\n        r: *const OfflineSpeakerDiarizationResult,\n    );\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/online_asr.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineTransducerModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub joiner: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineParaformerModelConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineZipformer2CtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineNemoCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineToneCtcModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineModelConfig {\n    pub transducer: OnlineTransducerModelConfig,\n    pub paraformer: OnlineParaformerModelConfig,\n    pub zipformer2_ctc: OnlineZipformer2CtcModelConfig,\n\n    pub tokens: *const c_char,\n    pub num_threads: i32,\n    pub provider: *const c_char,\n    pub debug: i32,\n\n    pub model_type: *const c_char,\n\n    // cjkchar | bpe | cjkchar+bpe\n    pub modeling_unit: *const c_char,\n\n    pub bpe_vocab: *const c_char,\n\n    pub tokens_buf: *const u8,\n    pub tokens_buf_size: i32,\n\n    pub nemo_ctc: OnlineNemoCtcModelConfig,\n    pub t_one_ctc: OnlineToneCtcModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct FeatureConfig {\n    pub sample_rate: i32,\n    pub feature_dim: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineCtcFstDecoderConfig {\n    pub graph: *const c_char,\n    pub max_active: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct HomophoneReplacerConfig {\n    pub dict_dir: *const c_char,\n    pub lexicon: *const c_char,\n    pub rule_fsts: *const 
c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineRecognizerConfig {\n    pub feat_config: FeatureConfig,\n    pub model_config: OnlineModelConfig,\n\n    // greedy_search | modified_beam_search\n    pub decoding_method: *const c_char,\n\n    pub max_active_paths: i32,\n\n    pub enable_endpoint: i32,\n\n    pub rule1_min_trailing_silence: c_float,\n    pub rule2_min_trailing_silence: c_float,\n    pub rule3_min_utterance_length: c_float,\n\n    pub hotwords_file: *const c_char,\n    pub hotwords_score: c_float,\n\n    pub ctc_fst_decoder_config: OnlineCtcFstDecoderConfig,\n\n    pub rule_fsts: *const c_char,\n    pub rule_fars: *const c_char,\n\n    pub blank_penalty: c_float,\n\n    pub hotwords_buf: *const u8,\n    pub hotwords_buf_size: i32,\n\n    pub hr: HomophoneReplacerConfig,\n}\n\n#[repr(C)]\npub struct OnlineRecognizer {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct OnlineStream {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOnlineRecognizer(\n        config: *const OnlineRecognizerConfig,\n    ) -> *const OnlineRecognizer;\n\n    pub fn SherpaOnnxDestroyOnlineRecognizer(recognizer: *const OnlineRecognizer);\n\n    pub fn SherpaOnnxCreateOnlineStream(recognizer: *const OnlineRecognizer)\n        -> *const OnlineStream;\n\n    pub fn SherpaOnnxCreateOnlineStreamWithHotwords(\n        recognizer: *const OnlineRecognizer,\n        hotwords: *const c_char,\n    ) -> *const OnlineStream;\n\n    pub fn SherpaOnnxDestroyOnlineStream(stream: *const OnlineStream);\n\n    pub fn SherpaOnnxOnlineStreamAcceptWaveform(\n        stream: *const OnlineStream,\n        sample_rate: i32,\n        samples: *const f32,\n        n: i32,\n    );\n\n    pub fn SherpaOnnxIsOnlineStreamReady(\n        recognizer: *const OnlineRecognizer,\n        stream: *const OnlineStream,\n    ) -> i32;\n\n    pub fn SherpaOnnxDecodeOnlineStream(\n        recognizer: *const OnlineRecognizer,\n        stream: *const OnlineStream,\n    
);\n\n    pub fn SherpaOnnxDecodeMultipleOnlineStreams(\n        recognizer: *const OnlineRecognizer,\n        streams: *const *const OnlineStream,\n        n: i32,\n    );\n\n    pub fn SherpaOnnxGetOnlineStreamResultAsJson(\n        recognizer: *const OnlineRecognizer,\n        stream: *const OnlineStream,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxDestroyOnlineStreamResultJson(s: *const c_char);\n\n    pub fn SherpaOnnxOnlineStreamReset(\n        recognizer: *const OnlineRecognizer,\n        stream: *const OnlineStream,\n    );\n\n    pub fn SherpaOnnxOnlineStreamInputFinished(stream: *const OnlineStream);\n\n    pub fn SherpaOnnxOnlineStreamSetOption(\n        stream: *const OnlineStream,\n        key: *const c_char,\n        value: *const c_char,\n    );\n\n    pub fn SherpaOnnxOnlineStreamGetOption(\n        stream: *const OnlineStream,\n        key: *const c_char,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxOnlineStreamHasOption(\n        stream: *const OnlineStream,\n        key: *const c_char,\n    ) -> i32;\n\n    pub fn SherpaOnnxOnlineStreamIsEndpoint(\n        recognizer: *const OnlineRecognizer,\n        stream: *const OnlineStream,\n    ) -> i32;\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/online_punctuation.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::c_char;\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlinePunctuationModelConfig {\n    pub cnn_bilstm: *const c_char,\n    pub bpe_vocab: *const c_char,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlinePunctuationConfig {\n    pub model: OnlinePunctuationModelConfig,\n}\n\n#[repr(C)]\npub struct OnlinePunctuation {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOnlinePunctuation(\n        config: *const OnlinePunctuationConfig,\n    ) -> *const OnlinePunctuation;\n\n    pub fn SherpaOnnxDestroyOnlinePunctuation(punctuation: *const OnlinePunctuation);\n\n    pub fn SherpaOnnxOnlinePunctuationAddPunct(\n        punctuation: *const OnlinePunctuation,\n        text: *const c_char,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxOnlinePunctuationFreeText(text: *const c_char);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/speaker_embedding.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpeakerEmbeddingExtractorConfig {\n    pub model: *const c_char,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\npub struct SpeakerEmbeddingExtractor {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct SpeakerEmbeddingManager {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpeakerEmbeddingManagerSpeakerMatch {\n    pub score: c_float,\n    pub name: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpeakerEmbeddingManagerBestMatchesResult {\n    pub matches: *const SpeakerEmbeddingManagerSpeakerMatch,\n    pub count: i32,\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateSpeakerEmbeddingExtractor(\n        config: *const SpeakerEmbeddingExtractorConfig,\n    ) -> *const SpeakerEmbeddingExtractor;\n\n    pub fn SherpaOnnxDestroySpeakerEmbeddingExtractor(p: *const SpeakerEmbeddingExtractor);\n\n    pub fn SherpaOnnxSpeakerEmbeddingExtractorDim(p: *const SpeakerEmbeddingExtractor) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingExtractorCreateStream(\n        p: *const SpeakerEmbeddingExtractor,\n    ) -> *const crate::online_asr::OnlineStream;\n\n    pub fn SherpaOnnxSpeakerEmbeddingExtractorIsReady(\n        p: *const SpeakerEmbeddingExtractor,\n        s: *const crate::online_asr::OnlineStream,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(\n        p: *const SpeakerEmbeddingExtractor,\n        s: *const crate::online_asr::OnlineStream,\n    ) -> *const c_float;\n\n    pub fn SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(v: *const c_float);\n\n    pub fn SherpaOnnxCreateSpeakerEmbeddingManager(dim: i32) -> *const SpeakerEmbeddingManager;\n\n    pub fn SherpaOnnxDestroySpeakerEmbeddingManager(p: *const 
SpeakerEmbeddingManager);\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerAdd(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n        v: *const c_float,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerAddList(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n        v: *const *const c_float,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerAddListFlattened(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n        v: *const c_float,\n        n: i32,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerRemove(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerSearch(\n        p: *const SpeakerEmbeddingManager,\n        v: *const c_float,\n        threshold: c_float,\n    ) -> *const c_char;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerFreeSearch(name: *const c_char);\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerGetBestMatches(\n        p: *const SpeakerEmbeddingManager,\n        v: *const c_float,\n        threshold: c_float,\n        n: i32,\n    ) -> *const SpeakerEmbeddingManagerBestMatchesResult;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches(\n        r: *const SpeakerEmbeddingManagerBestMatchesResult,\n    );\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerVerify(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n        v: *const c_float,\n        threshold: c_float,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerContains(\n        p: *const SpeakerEmbeddingManager,\n        name: *const c_char,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerNumSpeakers(\n        p: *const SpeakerEmbeddingManager,\n    ) -> i32;\n\n    pub fn SherpaOnnxSpeakerEmbeddingManagerGetAllSpeakers(\n        p: *const SpeakerEmbeddingManager,\n    ) -> *const *const c_char;\n\n    pub fn 
SherpaOnnxSpeakerEmbeddingManagerFreeAllSpeakers(names: *const *const c_char);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/speech_denoiser.rs",
    "content": "use std::os::raw::c_char;\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeechDenoiserGtcrnModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeechDenoiserDpdfNetModelConfig {\n    pub model: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeechDenoiserModelConfig {\n    pub gtcrn: OfflineSpeechDenoiserGtcrnModelConfig,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n    pub dpdfnet: OfflineSpeechDenoiserDpdfNetModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineSpeechDenoiserConfig {\n    pub model: OfflineSpeechDenoiserModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OnlineSpeechDenoiserConfig {\n    pub model: OfflineSpeechDenoiserModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct DenoisedAudio {\n    pub samples: *const f32,\n    pub n: i32,\n    pub sample_rate: i32,\n}\n\n#[repr(C)]\npub struct OfflineSpeechDenoiser {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct OnlineSpeechDenoiser {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOfflineSpeechDenoiser(\n        config: *const OfflineSpeechDenoiserConfig,\n    ) -> *const OfflineSpeechDenoiser;\n    pub fn SherpaOnnxDestroyOfflineSpeechDenoiser(p: *const OfflineSpeechDenoiser);\n    pub fn SherpaOnnxOfflineSpeechDenoiserGetSampleRate(p: *const OfflineSpeechDenoiser) -> i32;\n    pub fn SherpaOnnxOfflineSpeechDenoiserRun(\n        p: *const OfflineSpeechDenoiser,\n        samples: *const f32,\n        n: i32,\n        sample_rate: i32,\n    ) -> *const DenoisedAudio;\n\n    pub fn SherpaOnnxCreateOnlineSpeechDenoiser(\n        config: *const OnlineSpeechDenoiserConfig,\n    ) -> *const OnlineSpeechDenoiser;\n    pub fn SherpaOnnxDestroyOnlineSpeechDenoiser(p: *const OnlineSpeechDenoiser);\n    pub fn 
SherpaOnnxOnlineSpeechDenoiserGetSampleRate(p: *const OnlineSpeechDenoiser) -> i32;\n    pub fn SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(\n        p: *const OnlineSpeechDenoiser,\n    ) -> i32;\n    pub fn SherpaOnnxOnlineSpeechDenoiserRun(\n        p: *const OnlineSpeechDenoiser,\n        samples: *const f32,\n        n: i32,\n        sample_rate: i32,\n    ) -> *const DenoisedAudio;\n    pub fn SherpaOnnxOnlineSpeechDenoiserFlush(\n        p: *const OnlineSpeechDenoiser,\n    ) -> *const DenoisedAudio;\n    pub fn SherpaOnnxOnlineSpeechDenoiserReset(p: *const OnlineSpeechDenoiser);\n\n    pub fn SherpaOnnxDestroyDenoisedAudio(audio: *const DenoisedAudio);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/spoken_language_identification.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::c_char;\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpokenLanguageIdentificationWhisperConfig {\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub tail_paddings: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpokenLanguageIdentificationConfig {\n    pub whisper: SpokenLanguageIdentificationWhisperConfig,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n}\n\n#[repr(C)]\npub struct SpokenLanguageIdentification {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SpokenLanguageIdentificationResult {\n    pub lang: *const c_char,\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateSpokenLanguageIdentification(\n        config: *const SpokenLanguageIdentificationConfig,\n    ) -> *const SpokenLanguageIdentification;\n\n    pub fn SherpaOnnxDestroySpokenLanguageIdentification(\n        slid: *const SpokenLanguageIdentification,\n    );\n\n    pub fn SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(\n        slid: *const SpokenLanguageIdentification,\n    ) -> *const super::offline_asr::OfflineStream;\n\n    pub fn SherpaOnnxSpokenLanguageIdentificationCompute(\n        slid: *const SpokenLanguageIdentification,\n        stream: *const super::offline_asr::OfflineStream,\n    ) -> *const SpokenLanguageIdentificationResult;\n\n    pub fn SherpaOnnxDestroySpokenLanguageIdentificationResult(\n        r: *const SpokenLanguageIdentificationResult,\n    );\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/tts.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::{c_char, c_float, c_void};\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsVitsModelConfig {\n    pub model: *const c_char,\n    pub lexicon: *const c_char,\n    pub tokens: *const c_char,\n    pub data_dir: *const c_char,\n    pub noise_scale: c_float,\n    pub noise_scale_w: c_float,\n    pub length_scale: c_float,\n    pub dict_dir: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsMatchaModelConfig {\n    pub acoustic_model: *const c_char,\n    pub vocoder: *const c_char,\n    pub lexicon: *const c_char,\n    pub tokens: *const c_char,\n    pub data_dir: *const c_char,\n    pub noise_scale: c_float,\n    pub length_scale: c_float,\n    pub dict_dir: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsKokoroModelConfig {\n    pub model: *const c_char,\n    pub voices: *const c_char,\n    pub tokens: *const c_char,\n    pub data_dir: *const c_char,\n    pub length_scale: c_float,\n    pub dict_dir: *const c_char,\n    pub lexicon: *const c_char,\n    pub lang: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsKittenModelConfig {\n    pub model: *const c_char,\n    pub voices: *const c_char,\n    pub tokens: *const c_char,\n    pub data_dir: *const c_char,\n    pub length_scale: c_float,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsZipvoiceModelConfig {\n    pub tokens: *const c_char,\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub vocoder: *const c_char,\n    pub data_dir: *const c_char,\n    pub lexicon: *const c_char,\n    pub feat_scale: c_float,\n    pub t_shift: c_float,\n    pub target_rms: c_float,\n    pub guidance_scale: c_float,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsPocketModelConfig {\n    pub lm_flow: *const c_char,\n    
pub lm_main: *const c_char,\n    pub encoder: *const c_char,\n    pub decoder: *const c_char,\n    pub text_conditioner: *const c_char,\n    pub vocab_json: *const c_char,\n    pub token_scores_json: *const c_char,\n    pub voice_embedding_cache_capacity: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsSupertonicModelConfig {\n    pub duration_predictor: *const c_char,\n    pub text_encoder: *const c_char,\n    pub vector_estimator: *const c_char,\n    pub vocoder: *const c_char,\n    pub tts_json: *const c_char,\n    pub unicode_indexer: *const c_char,\n    pub voice_style: *const c_char,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsModelConfig {\n    pub vits: OfflineTtsVitsModelConfig,\n    pub num_threads: i32,\n    pub debug: i32,\n    pub provider: *const c_char,\n    pub matcha: OfflineTtsMatchaModelConfig,\n    pub kokoro: OfflineTtsKokoroModelConfig,\n    pub kitten: OfflineTtsKittenModelConfig,\n    pub zipvoice: OfflineTtsZipvoiceModelConfig,\n    pub pocket: OfflineTtsPocketModelConfig,\n    pub supertonic: OfflineTtsSupertonicModelConfig,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct OfflineTtsConfig {\n    pub model: OfflineTtsModelConfig,\n    pub rule_fsts: *const c_char,\n    pub max_num_sentences: i32,\n    pub rule_fars: *const c_char,\n    pub silence_scale: c_float,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SherpaOnnxGeneratedAudio {\n    pub samples: *const f32,\n    pub n: i32,\n    pub sample_rate: i32,\n}\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SherpaOnnxGenerationConfig {\n    pub silence_scale: c_float,\n    pub speed: c_float,\n    pub sid: i32,\n    pub reference_audio: *const f32,\n    pub reference_audio_len: i32,\n    pub reference_sample_rate: i32,\n    pub reference_text: *const c_char,\n    pub num_steps: i32,\n    pub extra: *const c_char,\n}\n\npub type SherpaOnnxGeneratedAudioProgressCallbackWithArg =\n    Option<unsafe extern 
\"C\" fn(samples: *const f32, n: i32, progress: c_float, arg: *mut c_void) -> i32>;\n\n#[repr(C)]\npub struct SherpaOnnxOfflineTts {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateOfflineTts(\n        config: *const OfflineTtsConfig,\n    ) -> *const SherpaOnnxOfflineTts;\n\n    pub fn SherpaOnnxDestroyOfflineTts(tts: *const SherpaOnnxOfflineTts);\n\n    pub fn SherpaOnnxOfflineTtsSampleRate(tts: *const SherpaOnnxOfflineTts) -> i32;\n\n    pub fn SherpaOnnxOfflineTtsNumSpeakers(tts: *const SherpaOnnxOfflineTts) -> i32;\n\n    pub fn SherpaOnnxOfflineTtsGenerateWithConfig(\n        tts: *const SherpaOnnxOfflineTts,\n        text: *const c_char,\n        config: *const SherpaOnnxGenerationConfig,\n        callback: SherpaOnnxGeneratedAudioProgressCallbackWithArg,\n        arg: *mut c_void,\n    ) -> *const SherpaOnnxGeneratedAudio;\n\n    pub fn SherpaOnnxDestroyOfflineTtsGeneratedAudio(p: *const SherpaOnnxGeneratedAudio);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/vad.rs",
    "content": "use std::os::raw::{c_char, c_float};\n\n#[repr(C)]\npub struct SileroVadModelConfig {\n    pub model: *const c_char,\n    pub threshold: c_float,\n    pub min_silence_duration: c_float,\n    pub min_speech_duration: c_float,\n    pub window_size: i32,\n    pub max_speech_duration: c_float,\n}\n\n#[repr(C)]\npub struct TenVadModelConfig {\n    pub model: *const c_char,\n    pub threshold: c_float,\n    pub min_silence_duration: c_float,\n    pub min_speech_duration: c_float,\n    pub window_size: i32,\n    pub max_speech_duration: c_float,\n}\n\n#[repr(C)]\npub struct VadModelConfig {\n    pub silero_vad: SileroVadModelConfig,\n    pub sample_rate: i32,\n    pub num_threads: i32,\n    pub provider: *const c_char,\n    pub debug: i32,\n    pub ten_vad: TenVadModelConfig,\n}\n\n#[repr(C)]\npub struct CircularBuffer {\n    _private: [u8; 0],\n}\n\n#[repr(C)]\npub struct SpeechSegment {\n    pub start: i32,\n    pub samples: *mut f32,\n    pub n: i32,\n}\n\n#[repr(C)]\npub struct VoiceActivityDetector {\n    _private: [u8; 0],\n}\n\nextern \"C\" {\n    pub fn SherpaOnnxCreateCircularBuffer(capacity: i32) -> *const CircularBuffer;\n    pub fn SherpaOnnxDestroyCircularBuffer(buffer: *const CircularBuffer);\n    pub fn SherpaOnnxCircularBufferPush(buffer: *const CircularBuffer, p: *const f32, n: i32);\n    pub fn SherpaOnnxCircularBufferGet(\n        buffer: *const CircularBuffer,\n        start_index: i32,\n        n: i32,\n    ) -> *const f32;\n    pub fn SherpaOnnxCircularBufferFree(p: *const f32);\n    pub fn SherpaOnnxCircularBufferPop(buffer: *const CircularBuffer, n: i32);\n    pub fn SherpaOnnxCircularBufferSize(buffer: *const CircularBuffer) -> i32;\n    pub fn SherpaOnnxCircularBufferHead(buffer: *const CircularBuffer) -> i32;\n    pub fn SherpaOnnxCircularBufferReset(buffer: *const CircularBuffer);\n\n    pub fn SherpaOnnxCreateVoiceActivityDetector(\n        config: *const VadModelConfig,\n        buffer_size_in_seconds: c_float,\n    ) -> 
*const VoiceActivityDetector;\n    pub fn SherpaOnnxDestroyVoiceActivityDetector(p: *const VoiceActivityDetector);\n    pub fn SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n        p: *const VoiceActivityDetector,\n        samples: *const f32,\n        n: i32,\n    );\n    pub fn SherpaOnnxVoiceActivityDetectorEmpty(p: *const VoiceActivityDetector) -> i32;\n    pub fn SherpaOnnxVoiceActivityDetectorDetected(p: *const VoiceActivityDetector) -> i32;\n    pub fn SherpaOnnxVoiceActivityDetectorPop(p: *const VoiceActivityDetector);\n    pub fn SherpaOnnxVoiceActivityDetectorClear(p: *const VoiceActivityDetector);\n    pub fn SherpaOnnxVoiceActivityDetectorFront(\n        p: *const VoiceActivityDetector,\n    ) -> *const SpeechSegment;\n    pub fn SherpaOnnxDestroySpeechSegment(p: *const SpeechSegment);\n    pub fn SherpaOnnxVoiceActivityDetectorReset(p: *const VoiceActivityDetector);\n    pub fn SherpaOnnxVoiceActivityDetectorFlush(p: *const VoiceActivityDetector);\n}\n"
  },
  {
    "path": "sherpa-onnx/rust/sherpa-onnx-sys/src/wave.rs",
    "content": "#![allow(non_camel_case_types)]\n#![allow(non_snake_case)]\n#![allow(non_upper_case_globals)]\n\nuse std::os::raw::c_char;\n\n#[repr(C)]\n#[derive(Debug, Copy, Clone)]\npub struct SherpaOnnxWave {\n    /// Samples normalized to [-1, 1]\n    pub samples: *const f32,\n    pub sample_rate: i32,\n    pub num_samples: i32,\n}\n\nextern \"C\" {\n    /// Read a WAV file. Returns NULL on error.\n    pub fn SherpaOnnxReadWave(filename: *const c_char) -> *const SherpaOnnxWave;\n\n    /// Free memory allocated by SherpaOnnxReadWave\n    pub fn SherpaOnnxFreeWave(wave: *const SherpaOnnxWave);\n\n    /// Write a WAV file. Returns 1 on success, 0 on failure.\n    pub fn SherpaOnnxWriteWave(\n        samples: *const f32,\n        n: i32,\n        sample_rate: i32,\n        filename: *const c_char,\n    ) -> i32;\n}\n"
  },
  {
    "path": "swift-api-examples/.gitignore",
    "content": "decode-file\ndecode-file-non-streaming\ngenerate-subtitles\ngenerate-subtitles-ten-vad\nspoken-language-identification\ntts-vits\nvits-vctk\nsherpa-onnx-paraformer-zh-2023-09-14\n!*.sh\n*.bak\nstreaming-hlg-decode-file\nkeyword-spotting-from-file\nadd-punctuations\ntts-matcha-zh\ntts-matcha-en\ntts-kokoro-en\ntts-kokoro-zh-en\nspeech-enhancement-gtcrn\nspeech-enhancement-dpdfnet\nonline-speech-enhancement-gtcrn\nonline-speech-enhancement-dpdfnet\ndecode-file-sense-voice-with-hr\ntest-version\nzipformer-ctc-asr\nwenet-ctc-asr\ndolphin-ctc-asr\ntts-kitten-en\ntts-pocket-en\ncompute-speaker-embeddings\ndecode-file-t-one-streaming\nomnilingual-asr-ctc\nmedasr-ctc\nfunasr-nano\nfire-red-asr-ctc\nmoonshine-v2-asr\ntts-supertonic-en\ntts-zipvoice\n"
  },
  {
    "path": "swift-api-examples/SherpaOnnx-Bridging-Header.h",
    "content": "// swift-api-examples/SherpaOnnx-Bridging-Header.h\n//\n// Copyright (c)  2023  Xiaomi Corporation\n#ifndef SWIFT_API_EXAMPLES_SHERPAONNX_BRIDGING_HEADER_H_\n#define SWIFT_API_EXAMPLES_SHERPAONNX_BRIDGING_HEADER_H_\n\n#import \"sherpa-onnx/c-api/c-api.h\"\n\n#endif  // SWIFT_API_EXAMPLES_SHERPAONNX_BRIDGING_HEADER_H_\n"
  },
  {
    "path": "swift-api-examples/SherpaOnnx.swift",
    "content": "/// swift-api-examples/SherpaOnnx.swift\n/// Copyright (c)  2023  Xiaomi Corporation\n\nimport Foundation  // For NSString\n\n/// Convert a String from swift to a `const char*` so that we can pass it to\n/// the C language.\n///\n/// - Parameters:\n///   - s: The String to convert.\n/// - Returns: A pointer that can be passed to C as `const char*`\n\nfunc toCPointer(_ s: String) -> UnsafePointer<Int8>! {\n  let cs = (s as NSString).utf8String\n  return UnsafePointer<Int8>(cs)\n}\n\n/// Return an instance of SherpaOnnxOnlineTransducerModelConfig.\n///\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html\n/// to download the required `.onnx` files.\n///\n/// - Parameters:\n///   - encoder: Path to encoder.onnx\n///   - decoder: Path to decoder.onnx\n///   - joiner: Path to joiner.onnx\n///\n/// - Returns: Return an instance of SherpaOnnxOnlineTransducerModelConfig\nfunc sherpaOnnxOnlineTransducerModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\",\n  joiner: String = \"\"\n) -> SherpaOnnxOnlineTransducerModelConfig {\n  return SherpaOnnxOnlineTransducerModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    joiner: toCPointer(joiner)\n  )\n}\n\n/// Return an instance of SherpaOnnxOnlineParaformerModelConfig.\n///\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html\n/// to download the required `.onnx` files.\n///\n/// - Parameters:\n///   - encoder: Path to encoder.onnx\n///   - decoder: Path to decoder.onnx\n///\n/// - Returns: Return an instance of SherpaOnnxOnlineParaformerModelConfig\nfunc sherpaOnnxOnlineParaformerModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\"\n) -> SherpaOnnxOnlineParaformerModelConfig {\n  return SherpaOnnxOnlineParaformerModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder)\n  )\n}\n\nfunc 
sherpaOnnxOnlineZipformer2CtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOnlineZipformer2CtcModelConfig {\n  return SherpaOnnxOnlineZipformer2CtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOnlineNemoCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOnlineNemoCtcModelConfig {\n  return SherpaOnnxOnlineNemoCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOnlineToneCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOnlineToneCtcModelConfig {\n  return SherpaOnnxOnlineToneCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\n/// Return an instance of SherpaOnnxOnlineModelConfig.\n///\n/// Please refer to\n/// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\n/// to download the required `.onnx` files.\n///\n/// - Parameters:\n///   - tokens: Path to tokens.txt\n///   - numThreads:  Number of threads to use for neural network computation.\n///\n/// - Returns: Return an instance of SherpaOnnxOnlineTransducerModelConfig\nfunc sherpaOnnxOnlineModelConfig(\n  tokens: String,\n  transducer: SherpaOnnxOnlineTransducerModelConfig = sherpaOnnxOnlineTransducerModelConfig(),\n  paraformer: SherpaOnnxOnlineParaformerModelConfig = sherpaOnnxOnlineParaformerModelConfig(),\n  zipformer2Ctc: SherpaOnnxOnlineZipformer2CtcModelConfig =\n    sherpaOnnxOnlineZipformer2CtcModelConfig(),\n  numThreads: Int = 1,\n  provider: String = \"cpu\",\n  debug: Int = 0,\n  modelType: String = \"\",\n  modelingUnit: String = \"cjkchar\",\n  bpeVocab: String = \"\",\n  tokensBuf: String = \"\",\n  tokensBufSize: Int = 0,\n  nemoCtc: SherpaOnnxOnlineNemoCtcModelConfig = sherpaOnnxOnlineNemoCtcModelConfig(),\n  toneCtc: SherpaOnnxOnlineToneCtcModelConfig = sherpaOnnxOnlineToneCtcModelConfig()\n) -> SherpaOnnxOnlineModelConfig {\n  return SherpaOnnxOnlineModelConfig(\n    transducer: transducer,\n    paraformer: paraformer,\n    zipformer2_ctc: zipformer2Ctc,\n    tokens: toCPointer(tokens),\n    num_threads: 
Int32(numThreads),\n    provider: toCPointer(provider),\n    debug: Int32(debug),\n    model_type: toCPointer(modelType),\n    modeling_unit: toCPointer(modelingUnit),\n    bpe_vocab: toCPointer(bpeVocab),\n    tokens_buf: toCPointer(tokensBuf),\n    tokens_buf_size: Int32(tokensBufSize),\n    nemo_ctc: nemoCtc,\n    t_one_ctc: toneCtc\n  )\n}\n\nfunc sherpaOnnxFeatureConfig(\n  sampleRate: Int = 16000,\n  featureDim: Int = 80\n) -> SherpaOnnxFeatureConfig {\n  return SherpaOnnxFeatureConfig(\n    sample_rate: Int32(sampleRate),\n    feature_dim: Int32(featureDim))\n}\n\nfunc sherpaOnnxOnlineCtcFstDecoderConfig(\n  graph: String = \"\",\n  maxActive: Int = 3000\n) -> SherpaOnnxOnlineCtcFstDecoderConfig {\n  return SherpaOnnxOnlineCtcFstDecoderConfig(\n    graph: toCPointer(graph),\n    max_active: Int32(maxActive))\n}\n\nfunc sherpaOnnxHomophoneReplacerConfig(\n  dictDir: String = \"\",\n  lexicon: String = \"\",\n  ruleFsts: String = \"\"\n) -> SherpaOnnxHomophoneReplacerConfig {\n  return SherpaOnnxHomophoneReplacerConfig(\n    dict_dir: toCPointer(dictDir),\n    lexicon: toCPointer(lexicon),\n    rule_fsts: toCPointer(ruleFsts))\n}\n\nfunc sherpaOnnxOnlineRecognizerConfig(\n  featConfig: SherpaOnnxFeatureConfig,\n  modelConfig: SherpaOnnxOnlineModelConfig,\n  enableEndpoint: Bool = false,\n  rule1MinTrailingSilence: Float = 2.4,\n  rule2MinTrailingSilence: Float = 1.2,\n  rule3MinUtteranceLength: Float = 30,\n  decodingMethod: String = \"greedy_search\",\n  maxActivePaths: Int = 4,\n  hotwordsFile: String = \"\",\n  hotwordsScore: Float = 1.5,\n  ctcFstDecoderConfig: SherpaOnnxOnlineCtcFstDecoderConfig = sherpaOnnxOnlineCtcFstDecoderConfig(),\n  ruleFsts: String = \"\",\n  ruleFars: String = \"\",\n  blankPenalty: Float = 0.0,\n  hotwordsBuf: String = \"\",\n  hotwordsBufSize: Int = 0,\n  hr: SherpaOnnxHomophoneReplacerConfig = sherpaOnnxHomophoneReplacerConfig()\n) -> SherpaOnnxOnlineRecognizerConfig {\n  return SherpaOnnxOnlineRecognizerConfig(\n    
feat_config: featConfig,\n    model_config: modelConfig,\n    decoding_method: toCPointer(decodingMethod),\n    max_active_paths: Int32(maxActivePaths),\n    enable_endpoint: enableEndpoint ? 1 : 0,\n    rule1_min_trailing_silence: rule1MinTrailingSilence,\n    rule2_min_trailing_silence: rule2MinTrailingSilence,\n    rule3_min_utterance_length: rule3MinUtteranceLength,\n    hotwords_file: toCPointer(hotwordsFile),\n    hotwords_score: hotwordsScore,\n    ctc_fst_decoder_config: ctcFstDecoderConfig,\n    rule_fsts: toCPointer(ruleFsts),\n    rule_fars: toCPointer(ruleFars),\n    blank_penalty: blankPenalty,\n    hotwords_buf: toCPointer(hotwordsBuf),\n    hotwords_buf_size: Int32(hotwordsBufSize),\n    hr: hr\n  )\n}\n\n/// Wrapper for recognition result.\n///\n/// Usage:\n///\n///  let result = recognizer.getResult()\n///  print(\"text: \\(result.text)\")\n///\nclass SherpaOnnxOnlineRecongitionResult {\n  /// A pointer to the underlying counterpart in C\n  private let result: UnsafePointer<SherpaOnnxOnlineRecognizerResult>\n\n  private lazy var _text: String = {\n    guard let cstr = result.pointee.text else { return \"\" }\n    return String(cString: cstr)\n  }()\n\n  private lazy var _tokens: [String] = {\n    guard let tokensPointer = result.pointee.tokens_arr else { return [] }\n    return (0..<count).compactMap { index in\n      guard let ptr = tokensPointer[index] else { return nil }\n      return String(cString: ptr)\n    }\n  }()\n\n  private lazy var _timestamps: [Float] = {\n    guard let timestampsPointer = result.pointee.timestamps else { return [] }\n    return (0..<count).map { index in timestampsPointer[index] }\n  }()\n\n  init(result: UnsafePointer<SherpaOnnxOnlineRecognizerResult>) {\n    self.result = result\n  }\n\n  deinit {\n    SherpaOnnxDestroyOnlineRecognizerResult(result)\n  }\n\n  /// Return the actual recognition result.\n  /// For English models, it contains words separated by spaces.\n  /// For Chinese models, it contains Chinese 
words.\n  var text: String { _text }\n\n  var count: Int { Int(result.pointee.count) }\n\n  var tokens: [String] { _tokens }\n\n  var timestamps: [Float] { _timestamps }\n}\n\nclass SherpaOnnxRecognizer {\n  /// A pointer to the underlying counterpart in C\n  private let recognizer: OpaquePointer\n  private var stream: OpaquePointer\n  private let lock = NSLock()  // for thread-safe stream replacement\n\n  /// Constructor taking a model config\n  init(\n    config: UnsafePointer<SherpaOnnxOnlineRecognizerConfig>\n  ) {\n    self.recognizer = SherpaOnnxCreateOnlineRecognizer(config)\n    self.stream = SherpaOnnxCreateOnlineStream(recognizer)\n  }\n\n  deinit {\n    SherpaOnnxDestroyOnlineStream(stream)\n    SherpaOnnxDestroyOnlineRecognizer(recognizer)\n  }\n\n  /// Decode wave samples.\n  ///\n  /// - Parameters:\n  ///   - samples: Audio samples normalized to the range [-1, 1]\n  ///   - sampleRate: Sample rate of the input audio samples. Must match\n  ///                 the one expected by the model.\n  func acceptWaveform(samples: [Float], sampleRate: Int = 16_000) {\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, Int32(sampleRate), samples, Int32(samples.count))\n  }\n\n  func isReady() -> Bool {\n    return SherpaOnnxIsOnlineStreamReady(recognizer, stream) != 0\n  }\n\n  /// If there are enough feature frames, it invokes the neural\n  /// network computation and decoding. 
Otherwise, it is a no-op.\n  func decode() {\n    SherpaOnnxDecodeOnlineStream(recognizer, stream)\n  }\n\n  /// Get the decoding results so far\n  func getResult() -> SherpaOnnxOnlineRecongitionResult {\n    guard let result = SherpaOnnxGetOnlineStreamResult(recognizer, stream) else {\n      fatalError(\"SherpaOnnxGetOnlineStreamResult returned nil\")\n    }\n    return SherpaOnnxOnlineRecongitionResult(result: result)\n  }\n\n  /// Reset the recognizer, which clears the neural network model state\n  /// and the state for decoding.\n  /// If hotwords is nil or an empty string, it just resets the decoding stream.\n  /// If hotwords is not empty, it will create a new decoding stream with\n  /// the given hotwords appended to the default hotwords.\n  func reset(hotwords: String? = nil) {\n    guard let words = hotwords, !words.isEmpty else {\n      SherpaOnnxOnlineStreamReset(recognizer, stream)\n      return\n    }\n\n    words.withCString { cString in\n      guard let newStream = SherpaOnnxCreateOnlineStreamWithHotwords(recognizer, cString) else {\n        fatalError(\"SherpaOnnxCreateOnlineStreamWithHotwords returned nil\")\n      }\n      lock.lock()\n      // hold the lock while releasing and replacing the stream\n      SherpaOnnxDestroyOnlineStream(stream)\n      stream = newStream\n      lock.unlock()\n    }\n  }\n\n  /// Signal that no more audio samples will be available.\n  /// After this call, you cannot call acceptWaveform() any more.\n  func inputFinished() {\n    SherpaOnnxOnlineStreamInputFinished(stream)\n  }\n\n  /// Return true if an endpoint has been detected.\n  func isEndpoint() -> Bool {\n    return SherpaOnnxOnlineStreamIsEndpoint(recognizer, stream) != 0\n  }\n}\n\n// For offline APIs\n\nfunc sherpaOnnxOfflineTransducerModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\",\n  joiner: String = \"\"\n) -> SherpaOnnxOfflineTransducerModelConfig {\n  return SherpaOnnxOfflineTransducerModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: 
toCPointer(decoder),\n    joiner: toCPointer(joiner)\n  )\n}\n\nfunc sherpaOnnxOfflineParaformerModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineParaformerModelConfig {\n  return SherpaOnnxOfflineParaformerModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineZipformerCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineZipformerCtcModelConfig {\n  return SherpaOnnxOfflineZipformerCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineWenetCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineWenetCtcModelConfig {\n  return SherpaOnnxOfflineWenetCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineOmnilingualAsrCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineOmnilingualAsrCtcModelConfig {\n  return SherpaOnnxOfflineOmnilingualAsrCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineMedAsrCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineMedAsrCtcModelConfig {\n  return SherpaOnnxOfflineMedAsrCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineFireRedAsrCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineFireRedAsrCtcModelConfig {\n  return SherpaOnnxOfflineFireRedAsrCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineNemoEncDecCtcModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineNemoEncDecCtcModelConfig {\n  return SherpaOnnxOfflineNemoEncDecCtcModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineDolphinModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineDolphinModelConfig {\n  return SherpaOnnxOfflineDolphinModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineWhisperModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\",\n  language: String = \"\",\n  task: String = \"transcribe\",\n  tailPaddings: Int = -1,\n  enableTokenTimestamps: Bool = false,\n  enableSegmentTimestamps: 
Bool = false\n) -> SherpaOnnxOfflineWhisperModelConfig {\n  return SherpaOnnxOfflineWhisperModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    language: toCPointer(language),\n    task: toCPointer(task),\n    tail_paddings: Int32(tailPaddings),\n    enable_token_timestamps: enableTokenTimestamps ? 1 : 0,\n    enable_segment_timestamps: enableSegmentTimestamps ? 1 : 0\n  )\n}\n\nfunc sherpaOnnxOfflineCanaryModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\",\n  srcLang: String = \"en\",\n  tgtLang: String = \"en\",\n  usePnc: Bool = true\n) -> SherpaOnnxOfflineCanaryModelConfig {\n  return SherpaOnnxOfflineCanaryModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    src_lang: toCPointer(srcLang),\n    tgt_lang: toCPointer(tgtLang),\n    use_pnc: usePnc ? 1 : 0\n  )\n}\n\nfunc sherpaOnnxOfflineFireRedAsrModelConfig(\n  encoder: String = \"\",\n  decoder: String = \"\"\n) -> SherpaOnnxOfflineFireRedAsrModelConfig {\n  return SherpaOnnxOfflineFireRedAsrModelConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder)\n  )\n}\n\n// there are two versions of Moonshine\n// For v1, you need four models: preprocessor, encoder, uncachedDecoder, cachedDecoder\n// For v2, you need two models: encoder, mergedDecoder\nfunc sherpaOnnxOfflineMoonshineModelConfig(\n  preprocessor: String = \"\",\n  encoder: String = \"\",\n  uncachedDecoder: String = \"\",\n  cachedDecoder: String = \"\",\n  mergedDecoder: String = \"\"\n) -> SherpaOnnxOfflineMoonshineModelConfig {\n  return SherpaOnnxOfflineMoonshineModelConfig(\n    preprocessor: toCPointer(preprocessor),\n    encoder: toCPointer(encoder),\n    uncached_decoder: toCPointer(uncachedDecoder),\n    cached_decoder: toCPointer(cachedDecoder),\n    merged_decoder: toCPointer(mergedDecoder)\n  )\n}\n\nfunc sherpaOnnxOfflineTdnnModelConfig(\n  model: String = \"\"\n) -> SherpaOnnxOfflineTdnnModelConfig {\n  return 
SherpaOnnxOfflineTdnnModelConfig(\n    model: toCPointer(model)\n  )\n}\n\nfunc sherpaOnnxOfflineSenseVoiceModelConfig(\n  model: String = \"\",\n  language: String = \"\",\n  useInverseTextNormalization: Bool = false\n) -> SherpaOnnxOfflineSenseVoiceModelConfig {\n  return SherpaOnnxOfflineSenseVoiceModelConfig(\n    model: toCPointer(model),\n    language: toCPointer(language),\n    use_itn: useInverseTextNormalization ? 1 : 0\n  )\n}\n\nfunc sherpaOnnxOfflineLMConfig(\n  model: String = \"\",\n  scale: Float = 1.0\n) -> SherpaOnnxOfflineLMConfig {\n  return SherpaOnnxOfflineLMConfig(\n    model: toCPointer(model),\n    scale: scale\n  )\n}\n\nfunc sherpaOnnxOfflineFunASRNanoModelConfig(\n  encoderAdaptor: String = \"\",\n  llm: String = \"\",\n  embedding: String = \"\",\n  tokenizer: String = \"\",\n  systemPrompt: String = \"You are a helpful assistant.\",\n  userPrompt: String = \"语音转写：\",\n  maxNewTokens: Int = 512,\n  temperature: Float = 1e-6,\n  topP: Float = 0.8,\n  seed: Int = 42,\n  language: String = \"\",\n  itn: Bool = true,\n  hotwords: String = \"\"\n) -> SherpaOnnxOfflineFunASRNanoModelConfig {\n  return SherpaOnnxOfflineFunASRNanoModelConfig(\n    encoder_adaptor: toCPointer(encoderAdaptor),\n    llm: toCPointer(llm),\n    embedding: toCPointer(embedding),\n    tokenizer: toCPointer(tokenizer),\n    system_prompt: toCPointer(systemPrompt),\n    user_prompt: toCPointer(userPrompt),\n    max_new_tokens: Int32(maxNewTokens),\n    temperature: temperature,\n    top_p: topP,\n    seed: Int32(seed),\n    language: toCPointer(language),\n    itn: itn ? 
1 : 0,\n    hotwords: toCPointer(hotwords)\n  )\n}\n\nfunc sherpaOnnxOfflineModelConfig(\n  tokens: String,\n  transducer: SherpaOnnxOfflineTransducerModelConfig = sherpaOnnxOfflineTransducerModelConfig(),\n  paraformer: SherpaOnnxOfflineParaformerModelConfig = sherpaOnnxOfflineParaformerModelConfig(),\n  nemoCtc: SherpaOnnxOfflineNemoEncDecCtcModelConfig = sherpaOnnxOfflineNemoEncDecCtcModelConfig(),\n  whisper: SherpaOnnxOfflineWhisperModelConfig = sherpaOnnxOfflineWhisperModelConfig(),\n  tdnn: SherpaOnnxOfflineTdnnModelConfig = sherpaOnnxOfflineTdnnModelConfig(),\n  numThreads: Int = 1,\n  provider: String = \"cpu\",\n  debug: Int = 0,\n  modelType: String = \"\",\n  modelingUnit: String = \"cjkchar\",\n  bpeVocab: String = \"\",\n  teleSpeechCtc: String = \"\",\n  senseVoice: SherpaOnnxOfflineSenseVoiceModelConfig = sherpaOnnxOfflineSenseVoiceModelConfig(),\n  moonshine: SherpaOnnxOfflineMoonshineModelConfig = sherpaOnnxOfflineMoonshineModelConfig(),\n  fireRedAsr: SherpaOnnxOfflineFireRedAsrModelConfig = sherpaOnnxOfflineFireRedAsrModelConfig(),\n  dolphin: SherpaOnnxOfflineDolphinModelConfig = sherpaOnnxOfflineDolphinModelConfig(),\n  zipformerCtc: SherpaOnnxOfflineZipformerCtcModelConfig =\n    sherpaOnnxOfflineZipformerCtcModelConfig(),\n  canary: SherpaOnnxOfflineCanaryModelConfig = sherpaOnnxOfflineCanaryModelConfig(),\n  wenetCtc: SherpaOnnxOfflineWenetCtcModelConfig =\n    sherpaOnnxOfflineWenetCtcModelConfig(),\n  omnilingual: SherpaOnnxOfflineOmnilingualAsrCtcModelConfig =\n    sherpaOnnxOfflineOmnilingualAsrCtcModelConfig(),\n  medasr: SherpaOnnxOfflineMedAsrCtcModelConfig =\n    sherpaOnnxOfflineMedAsrCtcModelConfig(),\n  funasrNano: SherpaOnnxOfflineFunASRNanoModelConfig =\n    sherpaOnnxOfflineFunASRNanoModelConfig(),\n  fireRedAsrCtc: SherpaOnnxOfflineFireRedAsrCtcModelConfig =\n    sherpaOnnxOfflineFireRedAsrCtcModelConfig()\n) -> SherpaOnnxOfflineModelConfig {\n  return SherpaOnnxOfflineModelConfig(\n    transducer: transducer,\n    
paraformer: paraformer,\n    nemo_ctc: nemoCtc,\n    whisper: whisper,\n    tdnn: tdnn,\n    tokens: toCPointer(tokens),\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider),\n    model_type: toCPointer(modelType),\n    modeling_unit: toCPointer(modelingUnit),\n    bpe_vocab: toCPointer(bpeVocab),\n    telespeech_ctc: toCPointer(teleSpeechCtc),\n    sense_voice: senseVoice,\n    moonshine: moonshine,\n    fire_red_asr: fireRedAsr,\n    dolphin: dolphin,\n    zipformer_ctc: zipformerCtc,\n    canary: canary,\n    wenet_ctc: wenetCtc,\n    omnilingual: omnilingual,\n    medasr: medasr,\n    funasr_nano: funasrNano,\n    fire_red_asr_ctc: fireRedAsrCtc\n  )\n}\n\nfunc sherpaOnnxOfflineRecognizerConfig(\n  featConfig: SherpaOnnxFeatureConfig,\n  modelConfig: SherpaOnnxOfflineModelConfig,\n  lmConfig: SherpaOnnxOfflineLMConfig = sherpaOnnxOfflineLMConfig(),\n  decodingMethod: String = \"greedy_search\",\n  maxActivePaths: Int = 4,\n  hotwordsFile: String = \"\",\n  hotwordsScore: Float = 1.5,\n  ruleFsts: String = \"\",\n  ruleFars: String = \"\",\n  blankPenalty: Float = 0.0,\n  hr: SherpaOnnxHomophoneReplacerConfig = sherpaOnnxHomophoneReplacerConfig()\n) -> SherpaOnnxOfflineRecognizerConfig {\n  return SherpaOnnxOfflineRecognizerConfig(\n    feat_config: featConfig,\n    model_config: modelConfig,\n    lm_config: lmConfig,\n    decoding_method: toCPointer(decodingMethod),\n    max_active_paths: Int32(maxActivePaths),\n    hotwords_file: toCPointer(hotwordsFile),\n    hotwords_score: hotwordsScore,\n    rule_fsts: toCPointer(ruleFsts),\n    rule_fars: toCPointer(ruleFars),\n    blank_penalty: blankPenalty,\n    hr: hr\n  )\n}\n\nclass SherpaOnnxOfflineRecongitionResult {\n  /// A pointer to the underlying counterpart in C\n  let result: UnsafePointer<SherpaOnnxOfflineRecognizerResult>\n\n  private lazy var _text: String = {\n    guard let cstr = result.pointee.text else { return \"\" }\n    return String(cString: cstr)\n  
}()\n\n  private lazy var _timestamps: [Float] = {\n    guard let p = result.pointee.timestamps else { return [] }\n    return (0..<result.pointee.count).map { p[Int($0)] }\n  }()\n\n  private lazy var _durations: [Float] = {\n    guard let p = result.pointee.durations else { return [] }\n    return (0..<result.pointee.count).map { p[Int($0)] }\n  }()\n\n  private lazy var _lang: String = {\n    guard let cstr = result.pointee.lang else { return \"\" }\n    return String(cString: cstr)\n  }()\n\n  private lazy var _emotion: String = {\n    guard let cstr = result.pointee.emotion else { return \"\" }\n    return String(cString: cstr)\n  }()\n\n  private lazy var _event: String = {\n    guard let cstr = result.pointee.event else { return \"\" }\n    return String(cString: cstr)\n  }()\n\n  private lazy var _segmentTimestamps: [Float] = {\n    guard let p = result.pointee.segment_timestamps else { return [] }\n    return (0..<result.pointee.segment_count).map { p[Int($0)] }\n  }()\n\n  private lazy var _segmentDurations: [Float] = {\n    guard let p = result.pointee.segment_durations else { return [] }\n    return (0..<result.pointee.segment_count).map { p[Int($0)] }\n  }()\n\n  private lazy var _segmentTexts: [String] = {\n    guard let arr = result.pointee.segment_texts_arr else { return [] }\n    return (0..<result.pointee.segment_count).compactMap { idx -> String? in\n      guard let ptr = arr[Int(idx)] else { return nil }\n      return String(cString: ptr)\n    }\n  }()\n\n  /// Return the actual recognition result.\n  /// For English models, it contains words separated by spaces.\n  /// For Chinese models, it contains Chinese words.\n  var text: String { _text }\n  var count: Int { Int(result.pointee.count) }\n  var timestamps: [Float] { _timestamps }\n\n  // Non-empty for TDT models. 
Empty for all other non-TDT models\n  var durations: [Float] { _durations }\n\n  // For SenseVoice models, it can be zh, en, ja, yue, ko\n  // where zh is for Chinese\n  // en is for English\n  // ja is for Japanese\n  // yue is for Cantonese\n  // ko is for Korean\n  var lang: String { _lang }\n\n  // for SenseVoice models\n  var emotion: String { _emotion }\n\n  // for SenseVoice models\n  var event: String { _event }\n\n  // Segment-level timestamps (for Whisper with segment timestamps enabled)\n  var segmentCount: Int { Int(result.pointee.segment_count) }\n  var segmentTimestamps: [Float] { _segmentTimestamps }\n  var segmentDurations: [Float] { _segmentDurations }\n  var segmentTexts: [String] { _segmentTexts }\n\n  init(result: UnsafePointer<SherpaOnnxOfflineRecognizerResult>) {\n    self.result = result\n  }\n\n  deinit {\n    SherpaOnnxDestroyOfflineRecognizerResult(result)\n  }\n}\n\nclass SherpaOnnxOfflineRecognizer {\n  /// A pointer to the underlying counterpart in C\n  private let recognizer: OpaquePointer\n\n  init(\n    config: UnsafePointer<SherpaOnnxOfflineRecognizerConfig>\n  ) {\n    guard let ptr = SherpaOnnxCreateOfflineRecognizer(config) else {\n      fatalError(\"Failed to create SherpaOnnxOfflineRecognizer\")\n    }\n    self.recognizer = ptr\n  }\n\n  deinit {\n    SherpaOnnxDestroyOfflineRecognizer(recognizer)\n  }\n\n  /// Decode wave samples.\n  ///\n  /// - Parameters:\n  ///   - samples: Audio samples normalized to the range [-1, 1]\n  ///   - sampleRate: Sample rate of the input audio samples. 
Must match\n  ///                 the one expected by the model.\n  func decode(samples: [Float], sampleRate: Int = 16_000) -> SherpaOnnxOfflineRecongitionResult {\n    guard let stream = SherpaOnnxCreateOfflineStream(recognizer) else {\n      fatalError(\"Failed to create offline stream\")\n    }\n\n    defer { SherpaOnnxDestroyOfflineStream(stream) }\n\n    SherpaOnnxAcceptWaveformOffline(stream, Int32(sampleRate), samples, Int32(samples.count))\n\n    SherpaOnnxDecodeOfflineStream(recognizer, stream)\n\n    guard let resultPtr = SherpaOnnxGetOfflineStreamResult(stream) else {\n      fatalError(\"Failed to get offline recognition result\")\n    }\n\n    return SherpaOnnxOfflineRecongitionResult(result: resultPtr)\n  }\n\n  func setConfig(config: UnsafePointer<SherpaOnnxOfflineRecognizerConfig>) {\n    SherpaOnnxOfflineRecognizerSetConfig(recognizer, config)\n  }\n}\n\nfunc sherpaOnnxSileroVadModelConfig(\n  model: String = \"\",\n  threshold: Float = 0.5,\n  minSilenceDuration: Float = 0.25,\n  minSpeechDuration: Float = 0.5,\n  windowSize: Int = 512,\n  maxSpeechDuration: Float = 5.0\n) -> SherpaOnnxSileroVadModelConfig {\n  return SherpaOnnxSileroVadModelConfig(\n    model: toCPointer(model),\n    threshold: threshold,\n    min_silence_duration: minSilenceDuration,\n    min_speech_duration: minSpeechDuration,\n    window_size: Int32(windowSize),\n    max_speech_duration: maxSpeechDuration\n  )\n}\n\nfunc sherpaOnnxTenVadModelConfig(\n  model: String = \"\",\n  threshold: Float = 0.5,\n  minSilenceDuration: Float = 0.25,\n  minSpeechDuration: Float = 0.5,\n  windowSize: Int = 256,\n  maxSpeechDuration: Float = 5.0\n) -> SherpaOnnxTenVadModelConfig {\n  return SherpaOnnxTenVadModelConfig(\n    model: toCPointer(model),\n    threshold: threshold,\n    min_silence_duration: minSilenceDuration,\n    min_speech_duration: minSpeechDuration,\n    window_size: Int32(windowSize),\n    max_speech_duration: maxSpeechDuration\n  )\n}\n\nfunc sherpaOnnxVadModelConfig(\n  
sileroVad: SherpaOnnxSileroVadModelConfig = sherpaOnnxSileroVadModelConfig(),\n  sampleRate: Int32 = 16000,\n  numThreads: Int = 1,\n  provider: String = \"cpu\",\n  debug: Int = 0,\n  tenVad: SherpaOnnxTenVadModelConfig = sherpaOnnxTenVadModelConfig()\n) -> SherpaOnnxVadModelConfig {\n  return SherpaOnnxVadModelConfig(\n    silero_vad: sileroVad,\n    sample_rate: sampleRate,\n    num_threads: Int32(numThreads),\n    provider: toCPointer(provider),\n    debug: Int32(debug),\n    ten_vad: tenVad\n  )\n}\n\nclass SherpaOnnxCircularBufferWrapper {\n  private let buffer: OpaquePointer\n\n  init(capacity: Int) {\n    guard let ptr = SherpaOnnxCreateCircularBuffer(Int32(capacity)) else {\n      fatalError(\"Failed to create SherpaOnnxCircularBuffer\")\n    }\n    self.buffer = ptr\n  }\n\n  deinit {\n    SherpaOnnxDestroyCircularBuffer(buffer)\n  }\n\n  func push(samples: [Float]) {\n    guard !samples.isEmpty else { return }\n    SherpaOnnxCircularBufferPush(buffer, samples, Int32(samples.count))\n  }\n\n  func get(startIndex: Int, n: Int) -> [Float] {\n    guard startIndex >= 0 else { return [] }\n    guard n > 0 else { return [] }\n\n    guard let ptr = SherpaOnnxCircularBufferGet(buffer, Int32(startIndex), Int32(n)) else {\n      return []\n    }\n    defer { SherpaOnnxCircularBufferFree(ptr) }\n\n    return Array(UnsafeBufferPointer(start: ptr, count: n))\n  }\n\n  func pop(n: Int) {\n    guard n > 0 else { return }\n    SherpaOnnxCircularBufferPop(buffer, Int32(n))\n  }\n\n  func size() -> Int {\n    return Int(SherpaOnnxCircularBufferSize(buffer))\n  }\n\n  func reset() {\n    SherpaOnnxCircularBufferReset(buffer)\n  }\n}\n\nclass SherpaOnnxSpeechSegmentWrapper {\n  private let p: UnsafePointer<SherpaOnnxSpeechSegment>\n\n  init(p: UnsafePointer<SherpaOnnxSpeechSegment>) {\n    self.p = p\n  }\n\n  deinit {\n    SherpaOnnxDestroySpeechSegment(p)\n  }\n\n  var start: Int {\n    Int(p.pointee.start)\n  }\n\n  var n: Int {\n    Int(p.pointee.n)\n  }\n\n  lazy var 
samples: [Float] = {\n    Array(UnsafeBufferPointer(start: p.pointee.samples, count: n))\n  }()\n}\n\nclass SherpaOnnxVoiceActivityDetectorWrapper {\n  /// A pointer to the underlying counterpart in C\n  private let vad: OpaquePointer\n\n  init(config: UnsafePointer<SherpaOnnxVadModelConfig>, buffer_size_in_seconds: Float) {\n    guard let vad = SherpaOnnxCreateVoiceActivityDetector(config, buffer_size_in_seconds) else {\n      fatalError(\"SherpaOnnxCreateVoiceActivityDetector returned nil\")\n    }\n    self.vad = vad\n  }\n\n  deinit {\n    SherpaOnnxDestroyVoiceActivityDetector(vad)\n  }\n\n  func acceptWaveform(samples: [Float]) {\n    SherpaOnnxVoiceActivityDetectorAcceptWaveform(vad, samples, Int32(samples.count))\n  }\n\n  func isEmpty() -> Bool {\n    return SherpaOnnxVoiceActivityDetectorEmpty(vad) == 1\n  }\n\n  func isSpeechDetected() -> Bool {\n    return SherpaOnnxVoiceActivityDetectorDetected(vad) == 1\n  }\n\n  func pop() {\n    SherpaOnnxVoiceActivityDetectorPop(vad)\n  }\n\n  func clear() {\n    SherpaOnnxVoiceActivityDetectorClear(vad)\n  }\n\n  func front() -> SherpaOnnxSpeechSegmentWrapper {\n    guard let p = SherpaOnnxVoiceActivityDetectorFront(vad) else {\n      fatalError(\"SherpaOnnxVoiceActivityDetectorFront returned nil\")\n    }\n    return SherpaOnnxSpeechSegmentWrapper(p: p)\n  }\n\n  func reset() {\n    SherpaOnnxVoiceActivityDetectorReset(vad)\n  }\n\n  func flush() {\n    SherpaOnnxVoiceActivityDetectorFlush(vad)\n  }\n}\n\n// offline tts\nfunc sherpaOnnxOfflineTtsVitsModelConfig(\n  model: String = \"\",\n  lexicon: String = \"\",\n  tokens: String = \"\",\n  dataDir: String = \"\",\n  noiseScale: Float = 0.667,\n  noiseScaleW: Float = 0.8,\n  lengthScale: Float = 1.0,\n  dictDir: String = \"\"\n) -> SherpaOnnxOfflineTtsVitsModelConfig {\n  return SherpaOnnxOfflineTtsVitsModelConfig(\n    model: toCPointer(model),\n    lexicon: toCPointer(lexicon),\n    tokens: toCPointer(tokens),\n    data_dir: toCPointer(dataDir),\n    
noise_scale: noiseScale,\n    noise_scale_w: noiseScaleW,\n    length_scale: lengthScale,\n    dict_dir: toCPointer(dictDir)\n  )\n}\n\nfunc sherpaOnnxOfflineTtsMatchaModelConfig(\n  acousticModel: String = \"\",\n  vocoder: String = \"\",\n  lexicon: String = \"\",\n  tokens: String = \"\",\n  dataDir: String = \"\",\n  noiseScale: Float = 0.667,\n  lengthScale: Float = 1.0,\n  dictDir: String = \"\"\n) -> SherpaOnnxOfflineTtsMatchaModelConfig {\n  return SherpaOnnxOfflineTtsMatchaModelConfig(\n    acoustic_model: toCPointer(acousticModel),\n    vocoder: toCPointer(vocoder),\n    lexicon: toCPointer(lexicon),\n    tokens: toCPointer(tokens),\n    data_dir: toCPointer(dataDir),\n    noise_scale: noiseScale,\n    length_scale: lengthScale,\n    dict_dir: toCPointer(dictDir)\n  )\n}\n\nfunc sherpaOnnxOfflineTtsKokoroModelConfig(\n  model: String = \"\",\n  voices: String = \"\",\n  tokens: String = \"\",\n  dataDir: String = \"\",\n  lengthScale: Float = 1.0,\n  dictDir: String = \"\",\n  lexicon: String = \"\",\n  lang: String = \"\"\n) -> SherpaOnnxOfflineTtsKokoroModelConfig {\n  return SherpaOnnxOfflineTtsKokoroModelConfig(\n    model: toCPointer(model),\n    voices: toCPointer(voices),\n    tokens: toCPointer(tokens),\n    data_dir: toCPointer(dataDir),\n    length_scale: lengthScale,\n    dict_dir: toCPointer(dictDir),\n    lexicon: toCPointer(lexicon),\n    lang: toCPointer(lang)\n  )\n}\n\nfunc sherpaOnnxOfflineTtsKittenModelConfig(\n  model: String = \"\",\n  voices: String = \"\",\n  tokens: String = \"\",\n  dataDir: String = \"\",\n  lengthScale: Float = 1.0\n) -> SherpaOnnxOfflineTtsKittenModelConfig {\n  return SherpaOnnxOfflineTtsKittenModelConfig(\n    model: toCPointer(model),\n    voices: toCPointer(voices),\n    tokens: toCPointer(tokens),\n    data_dir: toCPointer(dataDir),\n    length_scale: lengthScale\n  )\n}\n\nfunc sherpaOnnxOfflineTtsZipvoiceModelConfig(\n  tokens: String = \"\",\n  encoder: String = \"\",\n  decoder: String = \"\",\n  
vocoder: String = \"\",\n  dataDir: String = \"\",\n  lexicon: String = \"\",\n  featScale: Float = 0.1,\n  tShift: Float = 0.5,\n  targetRms: Float = 0.1,\n  guidanceScale: Float = 1.0\n) -> SherpaOnnxOfflineTtsZipvoiceModelConfig {\n  return SherpaOnnxOfflineTtsZipvoiceModelConfig(\n    tokens: toCPointer(tokens),\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    vocoder: toCPointer(vocoder),\n    data_dir: toCPointer(dataDir),\n    lexicon: toCPointer(lexicon),\n    feat_scale: featScale,\n    t_shift: tShift,\n    target_rms: targetRms,\n    guidance_scale: guidanceScale\n  )\n}\n\nfunc sherpaOnnxOfflineTtsPocketModelConfig(\n  lmFlow: String = \"\",\n  lmMain: String = \"\",\n  encoder: String = \"\",\n  decoder: String = \"\",\n  textConditioner: String = \"\",\n  vocabJson: String = \"\",\n  tokenScoresJson: String = \"\",\n  voiceEmbeddingCacheCapacity: Int = 50\n) -> SherpaOnnxOfflineTtsPocketModelConfig {\n  return SherpaOnnxOfflineTtsPocketModelConfig(\n    lm_flow: toCPointer(lmFlow),\n    lm_main: toCPointer(lmMain),\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    text_conditioner: toCPointer(textConditioner),\n    vocab_json: toCPointer(vocabJson),\n    token_scores_json: toCPointer(tokenScoresJson),\n    voice_embedding_cache_capacity: Int32(voiceEmbeddingCacheCapacity)\n  )\n}\n\nfunc sherpaOnnxOfflineTtsSupertonicModelConfig(\n  durationPredictor: String = \"\",\n  textEncoder: String = \"\",\n  vectorEstimator: String = \"\",\n  vocoder: String = \"\",\n  ttsJson: String = \"\",\n  unicodeIndexer: String = \"\",\n  voiceStyle: String = \"\"\n) -> SherpaOnnxOfflineTtsSupertonicModelConfig {\n  return SherpaOnnxOfflineTtsSupertonicModelConfig(\n    duration_predictor: toCPointer(durationPredictor),\n    text_encoder: toCPointer(textEncoder),\n    vector_estimator: toCPointer(vectorEstimator),\n    vocoder: toCPointer(vocoder),\n    tts_json: toCPointer(ttsJson),\n    unicode_indexer: 
toCPointer(unicodeIndexer),\n    voice_style: toCPointer(voiceStyle)\n  )\n}\n\nfunc sherpaOnnxOfflineTtsModelConfig(\n  vits: SherpaOnnxOfflineTtsVitsModelConfig = sherpaOnnxOfflineTtsVitsModelConfig(),\n  matcha: SherpaOnnxOfflineTtsMatchaModelConfig = sherpaOnnxOfflineTtsMatchaModelConfig(),\n  kokoro: SherpaOnnxOfflineTtsKokoroModelConfig = sherpaOnnxOfflineTtsKokoroModelConfig(),\n  numThreads: Int = 1,\n  debug: Int = 0,\n  provider: String = \"cpu\",\n  kitten: SherpaOnnxOfflineTtsKittenModelConfig = sherpaOnnxOfflineTtsKittenModelConfig(),\n  zipvoice: SherpaOnnxOfflineTtsZipvoiceModelConfig = sherpaOnnxOfflineTtsZipvoiceModelConfig(),\n  pocket: SherpaOnnxOfflineTtsPocketModelConfig = sherpaOnnxOfflineTtsPocketModelConfig(),\n  supertonic: SherpaOnnxOfflineTtsSupertonicModelConfig = sherpaOnnxOfflineTtsSupertonicModelConfig()\n) -> SherpaOnnxOfflineTtsModelConfig {\n  return SherpaOnnxOfflineTtsModelConfig(\n    vits: vits,\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider),\n    matcha: matcha,\n    kokoro: kokoro,\n    kitten: kitten,\n    zipvoice: zipvoice,\n    pocket: pocket,\n    supertonic: supertonic\n  )\n}\n\nfunc sherpaOnnxOfflineTtsConfig(\n  model: SherpaOnnxOfflineTtsModelConfig,\n  ruleFsts: String = \"\",\n  ruleFars: String = \"\",\n  maxNumSentences: Int = 1,\n  silenceScale: Float = 0.2\n) -> SherpaOnnxOfflineTtsConfig {\n  return SherpaOnnxOfflineTtsConfig(\n    model: model,\n    rule_fsts: toCPointer(ruleFsts),\n    max_num_sentences: Int32(maxNumSentences),\n    rule_fars: toCPointer(ruleFars),\n    silence_scale: silenceScale\n  )\n}\n\nclass SherpaOnnxWaveWrapper {\n  let wave: UnsafePointer<SherpaOnnxWave>!\n\n  class func readWave(filename: String) -> SherpaOnnxWaveWrapper {\n    let wave = SherpaOnnxReadWave(toCPointer(filename))\n    return SherpaOnnxWaveWrapper(wave: wave)\n  }\n\n  init(wave: UnsafePointer<SherpaOnnxWave>!) 
{\n    self.wave = wave\n  }\n\n  deinit {\n    if let wave {\n      SherpaOnnxFreeWave(wave)\n    }\n  }\n\n  var numSamples: Int {\n    return Int(wave.pointee.num_samples)\n  }\n\n  var sampleRate: Int {\n    return Int(wave.pointee.sample_rate)\n  }\n\n  var samples: [Float] {\n    if numSamples == 0 {\n      return []\n    } else {\n      return [Float](UnsafeBufferPointer(start: wave.pointee.samples, count: numSamples))\n    }\n  }\n}\n\nclass SherpaOnnxGeneratedAudioWrapper {\n  /// A pointer to the underlying counterpart in C\n  let audio: UnsafePointer<SherpaOnnxGeneratedAudio>!\n\n  init(audio: UnsafePointer<SherpaOnnxGeneratedAudio>!) {\n    self.audio = audio\n  }\n\n  deinit {\n    if let audio {\n      SherpaOnnxDestroyOfflineTtsGeneratedAudio(audio)\n    }\n  }\n\n  var n: Int32 {\n    return audio.pointee.n\n  }\n\n  var sampleRate: Int32 {\n    return audio.pointee.sample_rate\n  }\n\n  var samples: [Float] {\n    if let p = audio.pointee.samples {\n      return [Float](UnsafeBufferPointer(start: p, count: Int(n)))\n    } else {\n      return []\n    }\n  }\n\n  func save(filename: String) -> Int32 {\n    return SherpaOnnxWriteWave(audio.pointee.samples, n, sampleRate, toCPointer(filename))\n  }\n}\n\ntypealias TtsCallbackWithArg = (\n  @convention(c) (\n    UnsafePointer<Float>?,  // const float* samples\n    Int32,  // int32_t n\n    UnsafeMutableRawPointer?  
// void *arg\n  ) -> Int32\n)?\n\ntypealias TtsProgressCallbackWithArg =\n  @convention(c) (\n    UnsafePointer<Float>?, Int32, Float, UnsafeMutableRawPointer?\n  ) -> Int32\n\nstruct SherpaOnnxGenerationConfigSwift {\n  var silenceScale: Float = 0.2\n  var speed: Float = 1.0\n  var sid: Int = 0\n  var referenceAudio: [Float] = []\n  var referenceSampleRate: Int = 16000\n  var referenceText: String = \"\"\n  var numSteps: Int = 1\n  var extra: [String: Any] = [:]  // Any can be String, Int, Float, Double\n\n  /// Convert the extra dictionary into a JSON string\n  func extraJsonString() -> String {\n    var jsonCompatible: [String: Any] = [:]\n\n    for (key, value) in extra {\n      switch value {\n      case let v as String:\n        jsonCompatible[key] = v\n      case let v as Int:\n        jsonCompatible[key] = v\n      case let v as Float:\n        jsonCompatible[key] = v\n      case let v as Double:\n        jsonCompatible[key] = v\n      default:\n        // ignore unsupported types\n        print(\"Warning: unsupported type for key '\\(key)' in extra\")\n      }\n    }\n\n    guard let data = try? 
JSONSerialization.data(withJSONObject: jsonCompatible, options: []),\n      let json = String(data: data, encoding: .utf8)\n    else {\n      return \"{}\"\n    }\n\n    return json\n  }\n}\nfinal class SherpaOnnxGenerationConfigC {\n  /// The underlying C struct\n  var cConfig: SherpaOnnxGenerationConfig\n\n  /// Storage for reference audio so the pointer stays valid during the C call\n  private let referenceAudioStorage: [Float]\n\n  /// Extra JSON string for C API\n  let extraJson: String\n\n  init(_ swiftConfig: SherpaOnnxGenerationConfigSwift) {\n    let referenceAudio = swiftConfig.referenceAudio\n\n    let extraJson = swiftConfig.extraJsonString()\n    self.extraJson = extraJson\n\n    self.referenceAudioStorage = referenceAudio\n\n    self.cConfig = self.referenceAudioStorage.withUnsafeBufferPointer { buffer in\n      SherpaOnnxGenerationConfig(\n        silence_scale: swiftConfig.silenceScale,\n        speed: swiftConfig.speed,\n        sid: Int32(swiftConfig.sid),\n        reference_audio: buffer.count > 0 ? buffer.baseAddress : nil,\n        reference_audio_len: Int32(buffer.count),\n        reference_sample_rate: Int32(swiftConfig.referenceSampleRate),\n        reference_text: toCPointer(swiftConfig.referenceText),\n        num_steps: Int32(swiftConfig.numSteps),\n        extra: toCPointer(extraJson)\n      )\n    }\n  }\n}\n\nclass SherpaOnnxOfflineTtsWrapper {\n  /// A pointer to the underlying counterpart in C\n  let tts: OpaquePointer!\n\n  /// Constructor taking a model config\n  init(\n    config: UnsafePointer<SherpaOnnxOfflineTtsConfig>!\n  ) {\n    tts = SherpaOnnxCreateOfflineTts(config)\n  }\n\n  deinit {\n    if let tts {\n      SherpaOnnxDestroyOfflineTts(tts)\n    }\n  }\n\n  func generate(text: String, sid: Int = 0, speed: Float = 1.0) -> SherpaOnnxGeneratedAudioWrapper {\n    let audio: UnsafePointer<SherpaOnnxGeneratedAudio>? 
= SherpaOnnxOfflineTtsGenerate(\n      tts, toCPointer(text), Int32(sid), speed)\n\n    return SherpaOnnxGeneratedAudioWrapper(audio: audio)\n  }\n\n  func generateWithCallbackWithArg(\n    text: String, callback: TtsCallbackWithArg, arg: UnsafeMutableRawPointer, sid: Int = 0,\n    speed: Float = 1.0\n  ) -> SherpaOnnxGeneratedAudioWrapper {\n    let audio: UnsafePointer<SherpaOnnxGeneratedAudio>? =\n      SherpaOnnxOfflineTtsGenerateWithCallbackWithArg(\n        tts, toCPointer(text), Int32(sid), speed, callback, arg)\n\n    return SherpaOnnxGeneratedAudioWrapper(audio: audio)\n  }\n\n  func generateWithConfig(\n    text: String,\n    config: SherpaOnnxGenerationConfigSwift,\n    callback: TtsProgressCallbackWithArg?,\n    arg: UnsafeMutableRawPointer?\n  ) -> SherpaOnnxGeneratedAudioWrapper {\n    let bridge = SherpaOnnxGenerationConfigC(config)\n\n    let audio: UnsafePointer<SherpaOnnxGeneratedAudio>? =\n      withUnsafePointer(to: &bridge.cConfig) { configPtr in\n        SherpaOnnxOfflineTtsGenerateWithConfig(\n          tts,\n          toCPointer(text),\n          configPtr,\n          callback,\n          arg\n        )\n      }\n\n    return SherpaOnnxGeneratedAudioWrapper(audio: audio)\n  }\n\n}\n\n// spoken language identification\n\nfunc sherpaOnnxSpokenLanguageIdentificationWhisperConfig(\n  encoder: String,\n  decoder: String,\n  tailPaddings: Int = -1\n) -> SherpaOnnxSpokenLanguageIdentificationWhisperConfig {\n  return SherpaOnnxSpokenLanguageIdentificationWhisperConfig(\n    encoder: toCPointer(encoder),\n    decoder: toCPointer(decoder),\n    tail_paddings: Int32(tailPaddings))\n}\n\nfunc sherpaOnnxSpokenLanguageIdentificationConfig(\n  whisper: SherpaOnnxSpokenLanguageIdentificationWhisperConfig,\n  numThreads: Int = 1,\n  debug: Int = 0,\n  provider: String = \"cpu\"\n) -> SherpaOnnxSpokenLanguageIdentificationConfig {\n  return SherpaOnnxSpokenLanguageIdentificationConfig(\n    whisper: whisper,\n    num_threads: Int32(numThreads),\n    debug: 
Int32(debug),\n    provider: toCPointer(provider))\n}\n\nclass SherpaOnnxSpokenLanguageIdentificationResultWrapper {\n  /// A pointer to the underlying counterpart in C\n  let result: UnsafePointer<SherpaOnnxSpokenLanguageIdentificationResult>!\n\n  /// Return the detected language.\n  /// en for English\n  /// zh for Chinese\n  /// es for Spanish\n  /// de for German\n  /// etc.\n  var lang: String {\n    return String(cString: result.pointee.lang)\n  }\n\n  init(result: UnsafePointer<SherpaOnnxSpokenLanguageIdentificationResult>!) {\n    self.result = result\n  }\n\n  deinit {\n    if let result {\n      SherpaOnnxDestroySpokenLanguageIdentificationResult(result)\n    }\n  }\n}\n\nclass SherpaOnnxSpokenLanguageIdentificationWrapper {\n  /// A pointer to the underlying counterpart in C\n  let slid: OpaquePointer!\n\n  init(\n    config: UnsafePointer<SherpaOnnxSpokenLanguageIdentificationConfig>!\n  ) {\n    slid = SherpaOnnxCreateSpokenLanguageIdentification(config)\n  }\n\n  deinit {\n    if let slid {\n      SherpaOnnxDestroySpokenLanguageIdentification(slid)\n    }\n  }\n\n  func decode(samples: [Float], sampleRate: Int = 16000)\n    -> SherpaOnnxSpokenLanguageIdentificationResultWrapper\n  {\n    let stream: OpaquePointer! = SherpaOnnxSpokenLanguageIdentificationCreateOfflineStream(slid)\n    SherpaOnnxAcceptWaveformOffline(stream, Int32(sampleRate), samples, Int32(samples.count))\n\n    let result: UnsafePointer<SherpaOnnxSpokenLanguageIdentificationResult>? 
=\n      SherpaOnnxSpokenLanguageIdentificationCompute(\n        slid,\n        stream)\n\n    SherpaOnnxDestroyOfflineStream(stream)\n    return SherpaOnnxSpokenLanguageIdentificationResultWrapper(result: result)\n  }\n}\n\n// keyword spotting\n\nclass SherpaOnnxKeywordResultWrapper {\n  /// A pointer to the underlying counterpart in C\n  let result: UnsafePointer<SherpaOnnxKeywordResult>!\n\n  var keyword: String {\n    return String(cString: result.pointee.keyword)\n  }\n\n  var count: Int32 {\n    return result.pointee.count\n  }\n\n  var tokens: [String] {\n    if let tokensPointer = result.pointee.tokens_arr {\n      var tokens: [String] = []\n      for index in 0..<count {\n        if let tokenPointer = tokensPointer[Int(index)] {\n          let token = String(cString: tokenPointer)\n          tokens.append(token)\n        }\n      }\n      return tokens\n    } else {\n      let tokens: [String] = []\n      return tokens\n    }\n  }\n\n  init(result: UnsafePointer<SherpaOnnxKeywordResult>!) 
{\n    self.result = result\n  }\n\n  deinit {\n    if let result {\n      SherpaOnnxDestroyKeywordResult(result)\n    }\n  }\n}\n\nfunc sherpaOnnxKeywordSpotterConfig(\n  featConfig: SherpaOnnxFeatureConfig,\n  modelConfig: SherpaOnnxOnlineModelConfig,\n  keywordsFile: String,\n  maxActivePaths: Int = 4,\n  numTrailingBlanks: Int = 1,\n  keywordsScore: Float = 1.0,\n  keywordsThreshold: Float = 0.25,\n  keywordsBuf: String = \"\",\n  keywordsBufSize: Int = 0\n) -> SherpaOnnxKeywordSpotterConfig {\n  return SherpaOnnxKeywordSpotterConfig(\n    feat_config: featConfig,\n    model_config: modelConfig,\n    max_active_paths: Int32(maxActivePaths),\n    num_trailing_blanks: Int32(numTrailingBlanks),\n    keywords_score: keywordsScore,\n    keywords_threshold: keywordsThreshold,\n    keywords_file: toCPointer(keywordsFile),\n    keywords_buf: toCPointer(keywordsBuf),\n    keywords_buf_size: Int32(keywordsBufSize)\n  )\n}\n\nclass SherpaOnnxKeywordSpotterWrapper {\n  /// A pointer to the underlying counterpart in C\n  let spotter: OpaquePointer!\n  var stream: OpaquePointer!\n\n  init(\n    config: UnsafePointer<SherpaOnnxKeywordSpotterConfig>!\n  ) {\n    spotter = SherpaOnnxCreateKeywordSpotter(config)\n    stream = SherpaOnnxCreateKeywordStream(spotter)\n  }\n\n  deinit {\n    if let stream {\n      SherpaOnnxDestroyOnlineStream(stream)\n    }\n\n    if let spotter {\n      SherpaOnnxDestroyKeywordSpotter(spotter)\n    }\n  }\n\n  func acceptWaveform(samples: [Float], sampleRate: Int = 16000) {\n    SherpaOnnxOnlineStreamAcceptWaveform(stream, Int32(sampleRate), samples, Int32(samples.count))\n  }\n\n  func isReady() -> Bool {\n    return SherpaOnnxIsKeywordStreamReady(spotter, stream) == 1 ? 
true : false\n  }\n\n  func decode() {\n    SherpaOnnxDecodeKeywordStream(spotter, stream)\n  }\n\n  func reset() {\n    SherpaOnnxResetKeywordStream(spotter, stream)\n  }\n\n  func getResult() -> SherpaOnnxKeywordResultWrapper {\n    let result: UnsafePointer<SherpaOnnxKeywordResult>? = SherpaOnnxGetKeywordResult(\n      spotter, stream)\n    return SherpaOnnxKeywordResultWrapper(result: result)\n  }\n\n  /// Signal that no more audio samples would be available.\n  /// After this call, you cannot call acceptWaveform() any more.\n  func inputFinished() {\n    SherpaOnnxOnlineStreamInputFinished(stream)\n  }\n}\n\n// Punctuation\n\nfunc sherpaOnnxOfflinePunctuationModelConfig(\n  ctTransformer: String,\n  numThreads: Int = 1,\n  debug: Int = 0,\n  provider: String = \"cpu\"\n) -> SherpaOnnxOfflinePunctuationModelConfig {\n  return SherpaOnnxOfflinePunctuationModelConfig(\n    ct_transformer: toCPointer(ctTransformer),\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider)\n  )\n}\n\nfunc sherpaOnnxOfflinePunctuationConfig(\n  model: SherpaOnnxOfflinePunctuationModelConfig\n) -> SherpaOnnxOfflinePunctuationConfig {\n  return SherpaOnnxOfflinePunctuationConfig(\n    model: model\n  )\n}\n\nclass SherpaOnnxOfflinePunctuationWrapper {\n  /// A pointer to the underlying counterpart in C\n  let ptr: OpaquePointer!\n\n  /// Constructor taking a model config\n  init(\n    config: UnsafePointer<SherpaOnnxOfflinePunctuationConfig>!\n  ) {\n    ptr = SherpaOnnxCreateOfflinePunctuation(config)\n  }\n\n  deinit {\n    if let ptr {\n      SherpaOnnxDestroyOfflinePunctuation(ptr)\n    }\n  }\n\n  func addPunct(text: String) -> String {\n    let cText = SherpaOfflinePunctuationAddPunct(ptr, toCPointer(text))\n    let ans = String(cString: cText!)\n    SherpaOfflinePunctuationFreeText(cText)\n    return ans\n  }\n}\n\nfunc sherpaOnnxOnlinePunctuationModelConfig(\n  cnnBiLstm: String,\n  bpeVocab: String,\n  numThreads: Int = 1,\n  debug: 
Int = 0,\n  provider: String = \"cpu\"\n) -> SherpaOnnxOnlinePunctuationModelConfig {\n  return SherpaOnnxOnlinePunctuationModelConfig(\n    cnn_bilstm: toCPointer(cnnBiLstm),\n    bpe_vocab: toCPointer(bpeVocab),\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider))\n}\n\nfunc sherpaOnnxOnlinePunctuationConfig(\n  model: SherpaOnnxOnlinePunctuationModelConfig\n) -> SherpaOnnxOnlinePunctuationConfig {\n  return SherpaOnnxOnlinePunctuationConfig(model: model)\n}\n\nclass SherpaOnnxOnlinePunctuationWrapper {\n  /// A pointer to the underlying counterpart in C\n  let ptr: OpaquePointer!\n\n  /// Constructor taking a model config\n  init(\n    config: UnsafePointer<SherpaOnnxOnlinePunctuationConfig>!\n  ) {\n    ptr = SherpaOnnxCreateOnlinePunctuation(config)\n  }\n\n  deinit {\n    if let ptr {\n      SherpaOnnxDestroyOnlinePunctuation(ptr)\n    }\n  }\n\n  func addPunct(text: String) -> String {\n    let cText = SherpaOnnxOnlinePunctuationAddPunct(ptr, toCPointer(text))\n    let ans = String(cString: cText!)\n    SherpaOnnxOnlinePunctuationFreeText(cText)\n    return ans\n  }\n}\n\nfunc sherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(model: String)\n  -> SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig\n{\n  return SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(model: toCPointer(model))\n}\n\nfunc sherpaOnnxOfflineSpeakerSegmentationModelConfig(\n  pyannote: SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig,\n  numThreads: Int = 1,\n  debug: Int = 0,\n  provider: String = \"cpu\"\n) -> SherpaOnnxOfflineSpeakerSegmentationModelConfig {\n  return SherpaOnnxOfflineSpeakerSegmentationModelConfig(\n    pyannote: pyannote,\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider)\n  )\n}\n\nfunc sherpaOnnxFastClusteringConfig(numClusters: Int = -1, threshold: Float = 0.5)\n  -> SherpaOnnxFastClusteringConfig\n{\n  return 
SherpaOnnxFastClusteringConfig(num_clusters: Int32(numClusters), threshold: threshold)\n}\n\nfunc sherpaOnnxSpeakerEmbeddingExtractorConfig(\n  model: String,\n  numThreads: Int = 1,\n  debug: Int = 0,\n  provider: String = \"cpu\"\n) -> SherpaOnnxSpeakerEmbeddingExtractorConfig {\n  return SherpaOnnxSpeakerEmbeddingExtractorConfig(\n    model: toCPointer(model),\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider)\n  )\n}\n\nfunc sherpaOnnxOfflineSpeakerDiarizationConfig(\n  segmentation: SherpaOnnxOfflineSpeakerSegmentationModelConfig,\n  embedding: SherpaOnnxSpeakerEmbeddingExtractorConfig,\n  clustering: SherpaOnnxFastClusteringConfig,\n  minDurationOn: Float = 0.3,\n  minDurationOff: Float = 0.5\n) -> SherpaOnnxOfflineSpeakerDiarizationConfig {\n  return SherpaOnnxOfflineSpeakerDiarizationConfig(\n    segmentation: segmentation,\n    embedding: embedding,\n    clustering: clustering,\n    min_duration_on: minDurationOn,\n    min_duration_off: minDurationOff\n  )\n}\n\nstruct SherpaOnnxOfflineSpeakerDiarizationSegmentWrapper {\n  var start: Float = 0\n  var end: Float = 0\n  var speaker: Int = 0\n}\n\nclass SherpaOnnxOfflineSpeakerDiarizationWrapper {\n  /// A pointer to the underlying counterpart in C\n  let impl: OpaquePointer!\n\n  init(\n    config: UnsafePointer<SherpaOnnxOfflineSpeakerDiarizationConfig>!\n  ) {\n    impl = SherpaOnnxCreateOfflineSpeakerDiarization(config)\n  }\n\n  deinit {\n    if let impl {\n      SherpaOnnxDestroyOfflineSpeakerDiarization(impl)\n    }\n  }\n\n  var sampleRate: Int {\n    return Int(SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(impl))\n  }\n\n  // only config.clustering is used. All other fields are ignored\n  func setConfig(config: UnsafePointer<SherpaOnnxOfflineSpeakerDiarizationConfig>!) 
{\n    SherpaOnnxOfflineSpeakerDiarizationSetConfig(impl, config)\n  }\n\n  func process(samples: [Float]) -> [SherpaOnnxOfflineSpeakerDiarizationSegmentWrapper] {\n    let result = SherpaOnnxOfflineSpeakerDiarizationProcess(\n      impl, samples, Int32(samples.count))\n\n    if result == nil {\n      return []\n    }\n\n    let numSegments = Int(SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(result))\n\n    let p: UnsafePointer<SherpaOnnxOfflineSpeakerDiarizationSegment>? =\n      SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(result)\n\n    if p == nil {\n      return []\n    }\n\n    var ans: [SherpaOnnxOfflineSpeakerDiarizationSegmentWrapper] = []\n    for i in 0..<numSegments {\n      ans.append(\n        SherpaOnnxOfflineSpeakerDiarizationSegmentWrapper(\n          start: p![i].start, end: p![i].end, speaker: Int(p![i].speaker)))\n    }\n\n    SherpaOnnxOfflineSpeakerDiarizationDestroySegment(p)\n    SherpaOnnxOfflineSpeakerDiarizationDestroyResult(result)\n\n    return ans\n  }\n}\n\nclass SherpaOnnxOnlineStreamWrapper {\n  /// A pointer to the underlying counterpart in C\n  let impl: OpaquePointer!\n  init(impl: OpaquePointer!) 
{\n    self.impl = impl\n  }\n\n  deinit {\n    if let impl {\n      SherpaOnnxDestroyOnlineStream(impl)\n    }\n  }\n\n  func acceptWaveform(samples: [Float], sampleRate: Int = 16000) {\n    SherpaOnnxOnlineStreamAcceptWaveform(impl, Int32(sampleRate), samples, Int32(samples.count))\n  }\n\n  func inputFinished() {\n    SherpaOnnxOnlineStreamInputFinished(impl)\n  }\n}\n\nclass SherpaOnnxSpeakerEmbeddingExtractorWrapper {\n  /// A pointer to the underlying counterpart in C\n  let impl: OpaquePointer!\n\n  init(\n    config: UnsafePointer<SherpaOnnxSpeakerEmbeddingExtractorConfig>!\n  ) {\n    impl = SherpaOnnxCreateSpeakerEmbeddingExtractor(config)\n  }\n\n  deinit {\n    if let impl {\n      SherpaOnnxDestroySpeakerEmbeddingExtractor(impl)\n    }\n  }\n\n  var dim: Int {\n    return Int(SherpaOnnxSpeakerEmbeddingExtractorDim(impl))\n  }\n\n  func createStream() -> SherpaOnnxOnlineStreamWrapper {\n    let newStream = SherpaOnnxSpeakerEmbeddingExtractorCreateStream(impl)\n    return SherpaOnnxOnlineStreamWrapper(impl: newStream)\n  }\n\n  func isReady(stream: SherpaOnnxOnlineStreamWrapper) -> Bool {\n    return SherpaOnnxSpeakerEmbeddingExtractorIsReady(impl, stream.impl) == 1 ? 
true : false\n  }\n\n  func compute(stream: SherpaOnnxOnlineStreamWrapper) -> [Float] {\n    if !isReady(stream: stream) {\n      return []\n    }\n\n    let p = SherpaOnnxSpeakerEmbeddingExtractorComputeEmbedding(impl, stream.impl)\n\n    defer {\n      SherpaOnnxSpeakerEmbeddingExtractorDestroyEmbedding(p)\n    }\n\n    return [Float](UnsafeBufferPointer(start: p, count: dim))\n  }\n}\n\nfunc sherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(model: String = \"\")\n  -> SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig\n{\n  return SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(model: toCPointer(model))\n}\n\nfunc sherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(model: String = \"\")\n  -> SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig\n{\n  return SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(model: toCPointer(model))\n}\n\nfunc sherpaOnnxOfflineSpeechDenoiserModelConfig(\n  gtcrn: SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig =\n    sherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(),\n  dpdfnet: SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig =\n    sherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(),\n  numThreads: Int = 1,\n  provider: String = \"cpu\",\n  debug: Int = 0\n) -> SherpaOnnxOfflineSpeechDenoiserModelConfig {\n  return SherpaOnnxOfflineSpeechDenoiserModelConfig(\n    gtcrn: gtcrn,\n    num_threads: Int32(numThreads),\n    debug: Int32(debug),\n    provider: toCPointer(provider),\n    dpdfnet: dpdfnet\n  )\n}\n\nfunc sherpaOnnxOfflineSpeechDenoiserConfig(\n  model: SherpaOnnxOfflineSpeechDenoiserModelConfig =\n    sherpaOnnxOfflineSpeechDenoiserModelConfig()\n) -> SherpaOnnxOfflineSpeechDenoiserConfig {\n  return SherpaOnnxOfflineSpeechDenoiserConfig(\n    model: model)\n}\n\nclass SherpaOnnxDenoisedAudioWrapper {\n  /// A pointer to the underlying counterpart in C\n  let audio: UnsafePointer<SherpaOnnxDenoisedAudio>!\n\n  init(audio: UnsafePointer<SherpaOnnxDenoisedAudio>!) 
{\n    self.audio = audio\n  }\n\n  deinit {\n    if let audio {\n      SherpaOnnxDestroyDenoisedAudio(audio)\n    }\n  }\n\n  var n: Int32 {\n    guard let audio else {\n      return 0\n    }\n    return audio.pointee.n\n  }\n\n  var sampleRate: Int32 {\n    guard let audio else {\n      return 0\n    }\n    return audio.pointee.sample_rate\n  }\n\n  var samples: [Float] {\n    guard let audio else {\n      return []\n    }\n\n    if let p = audio.pointee.samples {\n      var samples: [Float] = []\n      for index in 0..<n {\n        samples.append(p[Int(index)])\n      }\n      return samples\n    } else {\n      let samples: [Float] = []\n      return samples\n    }\n  }\n\n  func save(filename: String) -> Int32 {\n    guard let audio else {\n      return 0\n    }\n    return SherpaOnnxWriteWave(audio.pointee.samples, n, sampleRate, toCPointer(filename))\n  }\n}\n\nclass SherpaOnnxOfflineSpeechDenoiserWrapper {\n  /// A pointer to the underlying counterpart in C\n  let impl: OpaquePointer!\n\n  /// Constructor taking a model config\n  init(\n    config: UnsafePointer<SherpaOnnxOfflineSpeechDenoiserConfig>!\n  ) {\n    impl = SherpaOnnxCreateOfflineSpeechDenoiser(config)\n  }\n\n  deinit {\n    if let impl {\n      SherpaOnnxDestroyOfflineSpeechDenoiser(impl)\n    }\n  }\n\n  func run(samples: [Float], sampleRate: Int) -> SherpaOnnxDenoisedAudioWrapper {\n    let audio: UnsafePointer<SherpaOnnxDenoisedAudio>? 
= SherpaOnnxOfflineSpeechDenoiserRun(\n      impl, samples, Int32(samples.count), Int32(sampleRate))\n\n    return SherpaOnnxDenoisedAudioWrapper(audio: audio)\n  }\n\n  var sampleRate: Int {\n    return Int(SherpaOnnxOfflineSpeechDenoiserGetSampleRate(impl))\n  }\n}\n\nfunc sherpaOnnxOnlineSpeechDenoiserConfig(\n  model: SherpaOnnxOfflineSpeechDenoiserModelConfig =\n    sherpaOnnxOfflineSpeechDenoiserModelConfig()\n) -> SherpaOnnxOnlineSpeechDenoiserConfig {\n  return SherpaOnnxOnlineSpeechDenoiserConfig(model: model)\n}\n\nclass SherpaOnnxOnlineSpeechDenoiserWrapper {\n  let impl: OpaquePointer!\n\n  init(\n    config: UnsafePointer<SherpaOnnxOnlineSpeechDenoiserConfig>!\n  ) {\n    impl = SherpaOnnxCreateOnlineSpeechDenoiser(config)\n  }\n\n  deinit {\n    if let impl {\n      SherpaOnnxDestroyOnlineSpeechDenoiser(impl)\n    }\n  }\n\n  func run(samples: [Float], sampleRate: Int) -> SherpaOnnxDenoisedAudioWrapper {\n    let audio: UnsafePointer<SherpaOnnxDenoisedAudio>? = SherpaOnnxOnlineSpeechDenoiserRun(\n      impl, samples, Int32(samples.count), Int32(sampleRate))\n    return SherpaOnnxDenoisedAudioWrapper(audio: audio)\n  }\n\n  func flush() -> SherpaOnnxDenoisedAudioWrapper {\n    let audio: UnsafePointer<SherpaOnnxDenoisedAudio>? = SherpaOnnxOnlineSpeechDenoiserFlush(impl)\n    return SherpaOnnxDenoisedAudioWrapper(audio: audio)\n  }\n\n  func reset() {\n    SherpaOnnxOnlineSpeechDenoiserReset(impl)\n  }\n\n  var sampleRate: Int {\n    return Int(SherpaOnnxOnlineSpeechDenoiserGetSampleRate(impl))\n  }\n\n  var frameShiftInSamples: Int {\n    return Int(SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(impl))\n  }\n}\n\nfunc getSherpaOnnxVersion() -> String {\n  return String(cString: SherpaOnnxGetVersionStr())\n}\n\nfunc getSherpaOnnxGitSha1() -> String {\n  return String(cString: SherpaOnnxGetGitSha1())\n}\n\nfunc getSherpaOnnxGitDate() -> String {\n  return String(cString: SherpaOnnxGetGitDate())\n}\n"
  },
  {
    "path": "swift-api-examples/add-punctuation-online.swift",
    "content": "func run() {\n    let model = \"./sherpa-onnx-online-punct-en-2024-08-06/model.onnx\"\n    let bpe = \"./sherpa-onnx-online-punct-en-2024-08-06/bpe.vocab\"\n    \n    // Create model config\n    let modelConfig = sherpaOnnxOnlinePunctuationModelConfig(\n        cnnBiLstm: model,\n        bpeVocab: bpe\n    )\n    \n    // Create punctuation config\n    var config = sherpaOnnxOnlinePunctuationConfig(model: modelConfig)\n    \n    // Create punctuation instance\n    let punct = SherpaOnnxOnlinePunctuationWrapper(config: &config)\n    \n    // Test texts\n    let textList = [\n        \"how are you i am fine thank you\",\n        \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\"\n    ]\n    \n    // Process each text\n  for i in 0..<textList.count {\n    let t = punct.addPunct(text: textList[i])\n    print(\"\\nresult is:\\n\\(t)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/add-punctuations.swift",
    "content": "func run() {\n  let model = \"./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx\"\n  let modelConfig = sherpaOnnxOfflinePunctuationModelConfig(\n    ctTransformer: model,\n    numThreads: 1,\n    debug: 1,\n    provider: \"cpu\"\n  )\n  var config = sherpaOnnxOfflinePunctuationConfig(model: modelConfig)\n\n  let punct = SherpaOnnxOfflinePunctuationWrapper(config: &config)\n\n  let textList = [\n    \"这是一个测试你好吗How are you我很好thank you are you ok谢谢你\",\n    \"我们都是木头人不会说话不会动\",\n    \"The African blogosphere is rapidly expanding bringing more voices online in the form of commentaries opinions analyses rants and poetry\",\n  ]\n\n  for i in 0..<textList.count {\n    let t = punct.addPunct(text: textList[i])\n    print(\"\\nresult is:\\n\\(t)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/compute-speaker-embeddings.swift",
    "content": "/// swift-api-examples/compute-speaker-embeddings.swift\n/// Copyright (c)  2025  Xiaomi Corporation\n/*\nPlease download test files used in this script from\n\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\n*/\nimport Foundation\n\nfunc cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float {\n  precondition(a.count == b.count, \"Vectors must have the same length\")\n\n  var dot: Float = 0\n  var sumA: Float = 0\n  var sumB: Float = 0\n\n  for i in 0..<a.count {\n    let x = a[i]\n    let y = b[i]\n    dot += x * y\n    sumA += x * x\n    sumB += y * y\n  }\n\n  let magA = sqrt(sumA)\n  let magB = sqrt(sumB)\n\n  guard magA > 0 && magB > 0 else { return 0 }\n  return dot / (magA * magB)\n}\n\nfunc computeEmbedding(extractor: SherpaOnnxSpeakerEmbeddingExtractorWrapper, waveFilename: String)\n  -> [Float]\n{\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: waveFilename)\n  let stream = extractor.createStream()\n  stream.acceptWaveform(samples: audio.samples, sampleRate: audio.sampleRate)\n  stream.inputFinished()\n  return extractor.compute(stream: stream)\n}\n\nfunc run() {\n  let model = \"./wespeaker_zh_cnceleb_resnet34.onnx\"\n  var config = sherpaOnnxSpeakerEmbeddingExtractorConfig(model: model)\n  let extractor = SherpaOnnxSpeakerEmbeddingExtractorWrapper(config: &config)\n  let embedding1 = computeEmbedding(extractor: extractor, waveFilename: \"./fangjun-sr-1.wav\")\n  let embedding2 = computeEmbedding(extractor: extractor, waveFilename: \"./fangjun-sr-2.wav\")\n  let embedding3 = computeEmbedding(extractor: extractor, waveFilename: \"./leijun-sr-1.wav\")\n\n  let score12 = cosineSimilarity(embedding1, embedding2)\n  let score13 = cosineSimilarity(embedding1, embedding3)\n  let score23 = cosineSimilarity(embedding2, embedding3)\n\n  print(\"Score between spk1 and spk2: \\(score12)\")\n  print(\"Score between spk1 and spk3: \\(score13)\")\n  print(\"Score between spk2 and spk3: 
\\(score23)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/decode-file-non-streaming.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  var recognizer: SherpaOnnxOfflineRecognizer\n  var modelConfig: SherpaOnnxOfflineModelConfig\n  var modelType = \"whisper\"\n  // modelType = \"paraformer\"\n  // modelType = \"sense_voice\"\n  // modelType = \"moonshine\"\n\n  if modelType == \"whisper\" {\n    let encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\"\n    let decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\"\n    let tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\"\n\n    let whisperConfig = sherpaOnnxOfflineWhisperModelConfig(\n      encoder: encoder,\n      decoder: decoder\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      whisper: whisperConfig,\n      debug: 0,\n      modelType: \"whisper\"\n    )\n  } else if modelType == \"paraformer\" {\n    let model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\"\n    let tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\"\n    let paraformerConfig = sherpaOnnxOfflineParaformerModelConfig(\n      model: model\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      paraformer: paraformerConfig,\n      debug: 0,\n      modelType: \"paraformer\"\n    )\n  } else if modelType == \"sense_voice\" {\n    let model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\"\n    let tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\"\n    let senseVoiceConfig = sherpaOnnxOfflineSenseVoiceModelConfig(\n      model: model,\n      useInverseTextNormalization: true\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      debug: 0,\n      senseVoice: senseVoiceConfig\n    
)\n  } else if modelType == \"moonshine\" {\n    let preprocessor = \"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx\"\n    let encoder = \"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx\"\n    let uncachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx\"\n    let cachedDecoder = \"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx\"\n    let tokens = \"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt\"\n    let moonshine = sherpaOnnxOfflineMoonshineModelConfig(\n      preprocessor: preprocessor,\n      encoder: encoder,\n      uncachedDecoder: uncachedDecoder,\n      cachedDecoder: cachedDecoder\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      debug: 0,\n      moonshine: moonshine\n    )\n  } else {\n    print(\"Please specify a supported modelType \\(modelType)\")\n    return\n  }\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  var filePath = \"./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav\"\n  if modelType == \"sense_voice\" {\n    filePath = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/zh.wav\"\n  } else if modelType == \"moonshine\" {\n    filePath = \"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav\"\n  }\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! 
= audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/decode-file-sense-voice-with-hr.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  var recognizer: SherpaOnnxOfflineRecognizer\n  let model = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx\"\n  let tokens = \"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt\"\n  let senseVoiceConfig = sherpaOnnxOfflineSenseVoiceModelConfig(\n    model: model,\n    useInverseTextNormalization: true\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    senseVoice: senseVoiceConfig\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n\n  let hrConfig = sherpaOnnxHomophoneReplacerConfig(\n    lexicon: \"./lexicon.txt\",\n    ruleFsts: \"./replace.fst\"\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig,\n    hr: hrConfig\n  )\n\n  recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./test-hr.wav\"\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! 
= audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/decode-file-t-one-streaming.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let filePath = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/0.wav\"\n  let model =\n    \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/model.onnx\"\n  let tokens = \"./sherpa-onnx-streaming-t-one-russian-2025-09-08/tokens.txt\"\n\n  let toneCtcConfig = sherpaOnnxOnlineToneCtcModelConfig(\n    model: model)\n\n  let modelConfig = sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    toneCtc: toneCtcConfig\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 8000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOnlineRecognizerConfig(\n    featConfig: featConfig,  // not used\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxRecognizer(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 8000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! 
= audioFileBuffer?.array()\n\n  let leftPadding = [Float](repeating: 0.0, count: 2400)\n  recognizer.acceptWaveform(samples: leftPadding, sampleRate: Int(audioFormat.sampleRate))\n\n  recognizer.acceptWaveform(samples: array, sampleRate: Int(audioFormat.sampleRate))\n\n  let tailPadding = [Float](repeating: 0.0, count: 4800)\n  recognizer.acceptWaveform(samples: tailPadding, sampleRate: Int(audioFormat.sampleRate))\n\n  recognizer.inputFinished()\n  while recognizer.isReady() {\n    recognizer.decode()\n  }\n\n  let result = recognizer.getResult()\n  print(\"\\nresult is:\\n\\(result.text)\")\n  print(\"\\nresult is:\\n\\(result.timestamps)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/decode-file.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  var modelConfig: SherpaOnnxOnlineModelConfig\n  var modelType = \"zipformer2-ctc\"\n  var filePath: String\n\n  modelType = \"transducer\"\n\n  if modelType == \"transducer\" {\n    filePath = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav\"\n    let encoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx\"\n    let decoder =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx\"\n    let joiner =\n      \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx\"\n    let tokens = \"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt\"\n\n    let transducerConfig = sherpaOnnxOnlineTransducerModelConfig(\n      encoder: encoder,\n      decoder: decoder,\n      joiner: joiner\n    )\n\n    modelConfig = sherpaOnnxOnlineModelConfig(\n      tokens: tokens,\n      transducer: transducerConfig\n    )\n  } else {\n    filePath =\n      \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/DEV_T0000000000.wav\"\n    let model =\n      \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/ctc-epoch-20-avg-1-chunk-16-left-128.onnx\"\n    let tokens = \"./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt\"\n    let zipfomer2CtcModelConfig = sherpaOnnxOnlineZipformer2CtcModelConfig(\n      model: model\n    )\n\n    modelConfig = sherpaOnnxOnlineModelConfig(\n      tokens: tokens,\n      zipformer2Ctc: zipfomer2CtcModelConfig\n    )\n  }\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = 
sherpaOnnxOnlineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxRecognizer(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  recognizer.acceptWaveform(samples: array)\n\n  let tailPadding = [Float](repeating: 0.0, count: 3200)\n  recognizer.acceptWaveform(samples: tailPadding)\n\n  recognizer.inputFinished()\n  while recognizer.isReady() {\n    recognizer.decode()\n  }\n\n  let result = recognizer.getResult()\n  print(\"\\nresult is:\\n\\(result.text)\")\n  print(\"\\nresult is:\\n\\(result.timestamps)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/dolphin-ctc-asr.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx\"\n  let tokens = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/tokens.txt\"\n\n  let dolphin = sherpaOnnxOfflineDolphinModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    dolphin: dolphin\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/test_wavs/0.wav\"\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/fire-red-asr-ctc.swift",
    "content": "func run() {\n  let model =\n    \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx\"\n  let tokens =\n    \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/tokens.txt\"\n\n  let fireRedAsrCtc = sherpaOnnxOfflineFireRedAsrCtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 1,\n    fireRedAsrCtc: fireRedAsrCtc\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig()\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/test_wavs/1.wav\"\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: filePath)\n\n  let result = recognizer.decode(samples: audio.samples, sampleRate: audio.sampleRate)\n  print(\"decode done\")\n\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/fire-red-asr.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let encoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx\"\n  let decoder = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx\"\n  let tokens = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt\"\n\n  let fireRedAsr = sherpaOnnxOfflineFireRedAsrModelConfig(\n    encoder: encoder,\n    decoder: decoder\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    fireRedAsr: fireRedAsr\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav\"\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/funasr-nano.swift",
    "content": "func run() {\n  let encoderAdaptor =\n    \"./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx\"\n  let llm =\n    \"./sherpa-onnx-funasr-nano-int8-2025-12-30/llm.int8.onnx\"\n  let embedding =\n    \"./sherpa-onnx-funasr-nano-int8-2025-12-30/embedding.int8.onnx\"\n  let tokenizer =\n    \"./sherpa-onnx-funasr-nano-int8-2025-12-30/Qwen3-0.6B\"\n\n  let funasrNano = sherpaOnnxOfflineFunASRNanoModelConfig(\n    encoderAdaptor: encoderAdaptor,\n    llm: llm,\n    embedding: embedding,\n    tokenizer: tokenizer\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: \"\",\n    debug: 1,\n    funasrNano: funasrNano\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig()\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-funasr-nano-int8-2025-12-30/test_wavs/lyrics.wav\"\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: filePath)\n\n  let result = recognizer.decode(samples: audio.samples, sampleRate: audio.sampleRate)\n  print(\"decode done\")\n\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if !result.timestamps.isEmpty {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n\n"
  },
  {
    "path": "swift-api-examples/generate-subtitles.swift",
    "content": "/*\nThis file shows how to use Swift API to generate subtitles.\n\nYou can use the files from\nhttps://huggingface.co/csukuangfj/vad/tree/main\nfor testing.\n\nFor instance, to generate subtitles for Obama.mov, please first\nuse\n\nffmpeg -i ./Obama.mov -acodec pcm_s16le -ac 1 -ar 16000 Obama.wav\n\nto extract the audio part from the video.\n\nThis file supports only processing WAV sound files, so you have to first\nextract audios from videos.\n\nPlease see\n./run-generate-subtitles.sh\nfor usages.\n*/\n\nimport AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nextension TimeInterval {\n  var hourMinuteSecondMS: String {\n    String(format: \"%d:%02d:%02d,%03d\", hour, minute, second, millisecond)\n  }\n\n  var hour: Int {\n    Int((self / 3600).truncatingRemainder(dividingBy: 3600))\n  }\n  var minute: Int {\n    Int((self / 60).truncatingRemainder(dividingBy: 60))\n  }\n  var second: Int {\n    Int(truncatingRemainder(dividingBy: 60))\n  }\n  var millisecond: Int {\n    Int((self * 1000).truncatingRemainder(dividingBy: 1000))\n  }\n}\n\nextension String {\n  var fileURL: URL {\n    return URL(fileURLWithPath: self)\n  }\n  var pathExtension: String {\n    return fileURL.pathExtension\n  }\n  var lastPathComponent: String {\n    return fileURL.lastPathComponent\n  }\n  var stringByDeletingPathExtension: String {\n    return fileURL.deletingPathExtension().path\n  }\n}\n\nclass SpeechSegment: CustomStringConvertible {\n\n  let start: Float\n  let end: Float\n  let text: String\n\n  init(start: Float, duration: Float, text: String) {\n    self.start = start\n    self.end = start + duration\n    self.text = text\n  }\n  public var description: String {\n    var s: String\n    s = TimeInterval(self.start).hourMinuteSecondMS\n    s += \" --> \"\n    s += 
TimeInterval(self.end).hourMinuteSecondMS\n    s += \"\\n\"\n    s += self.text\n\n    return s\n  }\n}\n\nfunc run() {\n  var recognizer: SherpaOnnxOfflineRecognizer\n  var modelConfig: SherpaOnnxOfflineModelConfig\n  let modelType = \"whisper\"\n  // modelType = \"paraformer\"\n  let filePath = \"/Users/fangjun/Desktop/Obama.wav\"  // English\n  // filePath = \"/Users/fangjun/Desktop/lei-jun.wav\"  // Chinese\n  // please go to https://huggingface.co/csukuangfj/vad\n  // to download the above two files\n\n  if modelType == \"whisper\" {\n    // for English\n    let encoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx\"\n    let decoder = \"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx\"\n    let tokens = \"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt\"\n\n    let whisperConfig = sherpaOnnxOfflineWhisperModelConfig(\n      encoder: encoder,\n      decoder: decoder\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      whisper: whisperConfig,\n      debug: 0,\n      modelType: \"whisper\"\n    )\n  } else if modelType == \"paraformer\" {\n    // for Chinese\n    let model = \"./sherpa-onnx-paraformer-zh-2023-09-14/model.int8.onnx\"\n    let tokens = \"./sherpa-onnx-paraformer-zh-2023-09-14/tokens.txt\"\n    let paraformerConfig = sherpaOnnxOfflineParaformerModelConfig(\n      model: model\n    )\n\n    modelConfig = sherpaOnnxOfflineModelConfig(\n      tokens: tokens,\n      paraformer: paraformerConfig,\n      debug: 0,\n      modelType: \"paraformer\"\n    )\n  } else {\n    print(\"Please specify a supported modelType \\(modelType)\")\n    return\n  }\n\n  let sampleRate = 16000\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: sampleRate,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let audioFile = try! 
AVAudioFile(forReading: filePath.fileURL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == Double(sampleRate))\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  var sileroVadConfig = sherpaOnnxSileroVadModelConfig()\n  var tenVadConfig = sherpaOnnxTenVadModelConfig()\n\n  var windowSize = 0\n\n  if FileManager.default.fileExists(atPath: \"./silero_vad.onnx\") {\n    sileroVadConfig = sherpaOnnxSileroVadModelConfig(\n      model: \"./silero_vad.onnx\",\n      threshold: 0.25,\n      windowSize: 512\n    )\n    windowSize = 512\n    print(\"Use silero-vad\")\n  } else if FileManager.default.fileExists(atPath: \"./ten-vad.onnx\") {\n    tenVadConfig = sherpaOnnxTenVadModelConfig(\n      model: \"./ten-vad.onnx\",\n      threshold: 0.25,\n      windowSize: 256\n    )\n    windowSize = 256\n    print(\"Use ten-vad\")\n  } else {\n    print(\"Please provide ./silero_vad.onnx or ./ten-vad.onnx\")\n    return\n  }\n\n  var vadModelConfig = sherpaOnnxVadModelConfig(\n    sileroVad: sileroVadConfig, tenVad: tenVadConfig)\n\n  let vad = SherpaOnnxVoiceActivityDetectorWrapper(\n    config: &vadModelConfig, buffer_size_in_seconds: 120)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! 
= audioFileBuffer?.array()\n\n  var segments: [SpeechSegment] = []\n\n  // Feed the audio to the VAD one window at a time\n  for offset in stride(from: 0, to: array.count, by: windowSize) {\n    let end = min(offset + windowSize, array.count)\n    vad.acceptWaveform(samples: [Float](array[offset..<end]))\n  }\n\n  // Flush the VAD, then decode each detected speech segment\n  vad.flush()\n  while !vad.isEmpty() {\n    let s = vad.front()\n    vad.pop()\n    let result = recognizer.decode(samples: s.samples)\n\n    segments.append(\n      SpeechSegment(\n        start: Float(s.start) / Float(sampleRate),\n        duration: Float(s.samples.count) / Float(sampleRate),\n        text: result.text))\n\n    print(segments.last!)\n  }\n\n  let srt: String = zip(segments.indices, segments).map { (index, element) in\n    return \"\\(index+1)\\n\\(element)\"\n  }.joined(separator: \"\\n\\n\")\n\n  let srtFilename: String = filePath.stringByDeletingPathExtension + \".srt\"\n  do {\n    try srt.write(to: srtFilename.fileURL, atomically: true, encoding: .utf8)\n    // Report success only if the write did not throw\n    print(\"Saved to \\(srtFilename)\")\n  } catch {\n    print(\"Error writing: \\(error.localizedDescription)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/keyword-spotting-from-file.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let filePath = \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/3.wav\"\n  let encoder =\n    \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n  let decoder =\n    \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\"\n  let joiner =\n    \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx\"\n  let tokens =\n    \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt\"\n  let keywordsFile =\n    \"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/test_wavs/test_keywords.txt\"\n  let transducerConfig = sherpaOnnxOnlineTransducerModelConfig(\n    encoder: encoder,\n    decoder: decoder,\n    joiner: joiner\n  )\n\n  let modelConfig = sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    transducer: transducerConfig\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxKeywordSpotterConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig,\n    keywordsFile: keywordsFile\n  )\n\n  let spotter = SherpaOnnxKeywordSpotterWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! 
AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  spotter.acceptWaveform(samples: array)\n\n  // Append 0.2 s of silence (3200 samples at 16 kHz) as tail padding\n  let tailPadding = [Float](repeating: 0.0, count: 3200)\n  spotter.acceptWaveform(samples: tailPadding)\n\n  spotter.inputFinished()\n  while spotter.isReady() {\n    spotter.decode()\n    let keyword = spotter.getResult().keyword\n    if !keyword.isEmpty {\n      // Remember to call reset() right after detecting a keyword\n      spotter.reset()\n\n      print(\"Detected: \\(keyword)\")\n    }\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/medasr-ctc.swift",
    "content": "func run() {\n  let model =\n    \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/model.int8.onnx\"\n  let tokens =\n    \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt\"\n\n  let medasr = sherpaOnnxOfflineMedAsrCtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 1,\n    medasr: medasr\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig()\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/test_wavs/0.wav\"\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: filePath)\n\n  let result = recognizer.decode(samples: audio.samples, sampleRate: audio.sampleRate)\n  print(\"decode done\")\n\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/moonshine-v2-asr.swift",
    "content": "func run() {\n  let encoder =\n    \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort\"\n  let decoder =\n    \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/decoder_model_merged.ort\"\n  let tokens =\n    \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/tokens.txt\"\n\n  let moonshine = sherpaOnnxOfflineMoonshineModelConfig(\n    encoder: encoder,\n    mergedDecoder: decoder\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 1,\n    moonshine: moonshine\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig()\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/test_wavs/0.wav\"\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: filePath)\n\n  let result = recognizer.decode(samples: audio.samples, sampleRate: audio.sampleRate)\n  print(\"decode done\")\n\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/omnilingual-asr-ctc.swift",
    "content": "func run() {\n  let model =\n    \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/model.int8.onnx\"\n  let tokens =\n    \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt\"\n\n  let omnilingual = sherpaOnnxOfflineOmnilingualAsrCtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    omnilingual: omnilingual\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig()\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/test_wavs/en.wav\"\n  let audio = SherpaOnnxWaveWrapper.readWave(filename: filePath)\n\n  let result = recognizer.decode(samples: audio.samples, sampleRate: audio.sampleRate)\n\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/online-speech-enhancement-dpdfnet.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./dpdfnet_baseline.onnx\"\n  // Please refer to\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  // to download files used in this script\n  var config = sherpaOnnxOnlineSpeechDenoiserConfig(\n    model: sherpaOnnxOfflineSpeechDenoiserModelConfig(\n      dpdfnet: sherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(model: model))\n  )\n\n  let sd = SherpaOnnxOnlineSpeechDenoiserWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: \"./inp_16k.wav\")\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let samples: [Float]! 
= audioFileBuffer?.array()\n\n  var enhanced: [Float] = []\n  let frameShift = sd.frameShiftInSamples\n\n  // Feed the audio to the denoiser one frame shift at a time\n  var start = 0\n  while start < samples.count {\n    let end = min(start + frameShift, samples.count)\n    let audio = sd.run(samples: Array(samples[start..<end]), sampleRate: Int(audioFormat.sampleRate))\n    enhanced.append(contentsOf: audio.samples)\n    start = end\n  }\n\n  // Collect any samples still buffered inside the denoiser\n  enhanced.append(contentsOf: sd.flush().samples)\n\n  let filename = \"enhanced-online-dpdfnet.wav\"\n  _ = enhanced.withUnsafeBufferPointer { p in\n    SherpaOnnxWriteWave(\n      p.baseAddress,\n      Int32(enhanced.count),\n      Int32(sd.sampleRate),\n      toCPointer(filename))\n  }\n  print(\"\\nSaved to: \\(filename)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/online-speech-enhancement-gtcrn.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./gtcrn_simple.onnx\"\n  // Please refer to\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  // to download files used in this script\n  var config = sherpaOnnxOnlineSpeechDenoiserConfig(\n    model: sherpaOnnxOfflineSpeechDenoiserModelConfig(\n      gtcrn: sherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(model: model))\n  )\n\n  let sd = SherpaOnnxOnlineSpeechDenoiserWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: \"./inp_16k.wav\")\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let samples: [Float]! 
= audioFileBuffer?.array()\n\n  var enhanced: [Float] = []\n  let frameShift = sd.frameShiftInSamples\n\n  // Feed the audio to the denoiser one frame shift at a time\n  var start = 0\n  while start < samples.count {\n    let end = min(start + frameShift, samples.count)\n    let audio = sd.run(samples: Array(samples[start..<end]), sampleRate: Int(audioFormat.sampleRate))\n    enhanced.append(contentsOf: audio.samples)\n    start = end\n  }\n\n  // Collect any samples still buffered inside the denoiser\n  enhanced.append(contentsOf: sd.flush().samples)\n\n  let filename = \"enhanced-online-gtcrn.wav\"\n  _ = enhanced.withUnsafeBufferPointer { p in\n    SherpaOnnxWriteWave(\n      p.baseAddress,\n      Int32(enhanced.count),\n      Int32(sd.sampleRate),\n      toCPointer(filename))\n  }\n  print(\"\\nSaved to: \\(filename)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/run-add-punctuations-online.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# Download and extract the online punctuation model if it does not exist\nif [ ! -d ./sherpa-onnx-online-punct-en-2024-08-06 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  tar xvf sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\n  rm sherpa-onnx-online-punct-en-2024-08-06.tar.bz2\nfi\n\nif [ ! -e ./add-punctuation-online ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./add-punctuation-online.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o ./add-punctuation-online\n\n  strip ./add-punctuation-online\nelse\n  echo \"./add-punctuation-online exists - skip building\"\nfi\n\n# Set library path and run the executable\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./add-punctuation-online\n"
  },
  {
    "path": "swift-api-examples/run-add-punctuations.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\n  rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2\nfi\n\nif [ ! -e ./add-punctuations ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./add-punctuations.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o ./add-punctuations\n\n  strip ./add-punctuations\nelse\n  echo \"./add-punctuations exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./add-punctuations\n"
  },
  {
    "path": "swift-api-examples/run-compute-speaker-embeddings.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./wespeaker_zh_cnceleb_resnet34.onnx ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/wespeaker_zh_cnceleb_resnet34.onnx\nfi\n\nif [ ! -f ./fangjun-sr-1.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/fangjun-sr-1.wav\nfi\n\nif [ ! -f ./fangjun-sr-2.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/fangjun-sr-2.wav\nfi\n\nif [ ! -f ./leijun-sr-1.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/leijun-sr-1.wav\nfi\n\nif [ ! -e ./compute-speaker-embeddings ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./compute-speaker-embeddings.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o compute-speaker-embeddings\n\n  strip compute-speaker-embeddings\nelse\n  echo \"./compute-speaker-embeddings exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./compute-speaker-embeddings\n"
  },
  {
    "path": "swift-api-examples/run-decode-file-non-streaming.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-whisper-tiny.en ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\"\n  echo \"\"\n  echo \"for help\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\nfi\n\nif [ ! -e ./decode-file-non-streaming ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./decode-file-non-streaming.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o decode-file-non-streaming\n\n  strip decode-file-non-streaming\nelse\n  echo \"./decode-file-non-streaming exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./decode-file-non-streaming\n"
  },
  {
    "path": "swift-api-examples/run-decode-file-sense-voice-with-hr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\n  rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2\nfi\n\nif [ ! -d dict ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/dict.tar.bz2\n  tar xf dict.tar.bz2\n  rm -rf dict.tar.bz2\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/replace.fst\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/test-hr.wav\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/hr-files/lexicon.txt\nfi\n\nif [ ! -e ./decode-file-sense-voice-with-hr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./decode-file-sense-voice-with-hr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o decode-file-sense-voice-with-hr\n\n  strip decode-file-sense-voice-with-hr\nelse\n  echo \"./decode-file-sense-voice-with-hr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./decode-file-sense-voice-with-hr\n"
  },
  {
    "path": "swift-api-examples/run-decode-file-t-one-streaming.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-streaming-t-one-russian-2025-09-08 ]; then\n  echo \"Downloading the pre-trained model for testing.\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  tar xvf sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\n  rm sherpa-onnx-streaming-t-one-russian-2025-09-08.tar.bz2\nfi\n\nif [ ! -e ./decode-file-t-one-streaming ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./decode-file-t-one-streaming.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o decode-file-t-one-streaming\n\n  strip decode-file-t-one-streaming\nelse\n  echo \"./decode-file-t-one-streaming exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./decode-file-t-one-streaming\n"
  },
  {
    "path": "swift-api-examples/run-decode-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english\"\n  echo \"\"\n  echo \"for help\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nfi\n\nif [ ! -d ./sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13 ]; then\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2\nfi\n\nif [ ! -e ./decode-file ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./decode-file.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o decode-file\n\n  strip decode-file\nelse\n  echo \"./decode-file exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./decode-file\n"
  },
  {
    "path": "swift-api-examples/run-dolphin-ctc-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02/model.int8.onnx ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/dolphin/index.html\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  tar xvf sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  rm sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02.tar.bz2\n  ls -lh sherpa-onnx-dolphin-base-ctc-multi-lang-int8-2025-04-02\nfi\n\nif [ ! -e ./dolphin-ctc-asr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./dolphin-ctc-asr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o dolphin-ctc-asr\n\n  strip dolphin-ctc-asr\nelse\n  echo \"./dolphin-ctc-asr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./dolphin-ctc-asr\n"
  },
  {
    "path": "swift-api-examples/run-fire-red-asr-ctc.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25/model.int8.onnx ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/FireRedAsr/index.html\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n  rm sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25.tar.bz2\n\n  ls -lh sherpa-onnx-fire-red-asr2-ctc-zh_en-int8-2026-02-25\nfi\n\nif [ ! -e ./fire-red-asr-ctc ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./fire-red-asr-ctc.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o fire-red-asr-ctc\n\n  strip fire-red-asr-ctc\nelse\n  echo \"./fire-red-asr-ctc exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./fire-red-asr-ctc\n"
  },
  {
    "path": "swift-api-examples/run-fire-red-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/FireRedAsr/index.html\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2\n  ls -lh sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16\nfi\n\nif [ ! -e ./fire-red-asr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./fire-red-asr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o fire-red-asr\n\n  strip fire-red-asr\nelse\n  echo \"./fire-red-asr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./fire-red-asr\n"
  },
  {
    "path": "swift-api-examples/run-funasr-nano-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-funasr-nano-int8-2025-12-30/encoder_adaptor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\n  rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2\nfi\n\nif [ ! -e ./funasr-nano ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./funasr-nano.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o funasr-nano\n\n  strip funasr-nano\nelse\n  echo \"./funasr-nano exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./funasr-nano\n\n"
  },
  {
    "path": "swift-api-examples/run-generate-subtitles-ten-vad.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-whisper-tiny.en ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\"\n  echo \"\"\n  echo \"for help\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\n  ls -lh sherpa-onnx-whisper-tiny.en\nfi\nif [ ! -f ./ten-vad.onnx ]; then\n  echo \"downloading ten-vad\"\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nfi\n\nif [ ! -e ./generate-subtitles-ten-vad ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./generate-subtitles.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o generate-subtitles-ten-vad\n\n  strip generate-subtitles-ten-vad\nelse\n  echo \"./generate-subtitles-ten-vad exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./generate-subtitles-ten-vad\n"
  },
  {
    "path": "swift-api-examples/run-generate-subtitles.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-whisper-tiny.en ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html\"\n  echo \"\"\n  echo \"for help\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2\n  rm sherpa-onnx-whisper-tiny.en.tar.bz2\n  ls -lh sherpa-onnx-whisper-tiny.en\nfi\nif [ ! -f ./silero_vad.onnx ]; then\n  echo \"downloading silero_vad\"\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nfi\n\nif [ ! -e ./generate-subtitles ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./generate-subtitles.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o generate-subtitles\n\n  strip generate-subtitles\nelse\n  echo \"./generate-subtitles exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./generate-subtitles\n"
  },
  {
    "path": "swift-api-examples/run-keyword-spotting-from-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01 ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  tar xf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n  rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nfi\n\nif [ ! -e ./keyword-spotting-from-file ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./keyword-spotting-from-file.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o keyword-spotting-from-file\n\n  strip keyword-spotting-from-file\nelse\n  echo \"./keyword-spotting-from-file exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./keyword-spotting-from-file\n"
  },
  {
    "path": "swift-api-examples/run-medasr-ctc-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-medasr-ctc-en-int8-2025-12-25/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  tar xvf sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\n  rm sherpa-onnx-medasr-ctc-en-int8-2025-12-25.tar.bz2\nfi\n\nif [ ! -e ./medasr-ctc ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./medasr-ctc.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o medasr-ctc\n\n  strip medasr-ctc\nelse\n  echo \"./medasr-ctc exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./medasr-ctc\n"
  },
  {
    "path": "swift-api-examples/run-moonshine-v2-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27/encoder_model.ort ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/moonshine/index.html\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  tar xvf sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  rm sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27.tar.bz2\n  ls -lh sherpa-onnx-moonshine-tiny-en-quantized-2026-02-27\nfi\n\nif [ ! -e ./moonshine-v2-asr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./moonshine-v2-asr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o moonshine-v2-asr\n\n  strip moonshine-v2-asr\nelse\n  echo \"./moonshine-v2-asr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./moonshine-v2-asr\n"
  },
  {
    "path": "swift-api-examples/run-omnilingual-asr-ctc-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12/tokens.txt ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  tar xvf sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\n  rm sherpa-onnx-omnilingual-asr-1600-languages-300M-ctc-int8-2025-11-12.tar.bz2\nfi\n\nif [ ! -e ./omnilingual-asr-ctc ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./omnilingual-asr-ctc.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o omnilingual-asr-ctc\n\n  strip omnilingual-asr-ctc\nelse\n  echo \"./omnilingual-asr-ctc exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./omnilingual-asr-ctc\n"
  },
  {
    "path": "swift-api-examples/run-online-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\nif [ ! -e ./online-speech-enhancement-dpdfnet ]; then\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./online-speech-enhancement-dpdfnet.swift ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o online-speech-enhancement-dpdfnet\n\n  strip online-speech-enhancement-dpdfnet\nelse\n  echo \"./online-speech-enhancement-dpdfnet exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./online-speech-enhancement-dpdfnet\n"
  },
  {
    "path": "swift-api-examples/run-online-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\nif [ ! -e ./online-speech-enhancement-gtcrn ]; then\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./online-speech-enhancement-gtcrn.swift ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o online-speech-enhancement-gtcrn\n\n  strip online-speech-enhancement-gtcrn\nelse\n  echo \"./online-speech-enhancement-gtcrn exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./online-speech-enhancement-gtcrn\n"
  },
  {
    "path": "swift-api-examples/run-speaker-diarization.sh",
    "content": "#!/usr/bin/env bash\n\nif [ ! -f ./sherpa-onnx-pyannote-segmentation-3-0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\n  rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nfi\n\nif [ ! -f ./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nfi\n\nif [ ! -f ./0-four-speakers-zh.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav\nfi\n\nif [ ! -e ./speaker-diarization ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./speaker-diarization.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o speaker-diarization\n\n  strip speaker-diarization\nelse\n  echo \"./speaker-diarization exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./speaker-diarization\n"
  },
  {
    "path": "swift-api-examples/run-speech-enhancement-dpdfnet.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./dpdfnet_baseline.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/dpdfnet_baseline.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\nif [ ! -e ./speech-enhancement-dpdfnet ]; then\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./speech-enhancement-dpdfnet.swift ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o speech-enhancement-dpdfnet\n\n  strip speech-enhancement-dpdfnet\nelse\n  echo \"./speech-enhancement-dpdfnet exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./speech-enhancement-dpdfnet\n"
  },
  {
    "path": "swift-api-examples/run-speech-enhancement-gtcrn.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./gtcrn_simple.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\nfi\n\nif [ ! -f ./inp_16k.wav ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav\nfi\n\nif [ ! -e ./speech-enhancement-gtcrn ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./speech-enhancement-gtcrn.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o speech-enhancement-gtcrn\n\n  strip speech-enhancement-gtcrn\nelse\n  echo \"./speech-enhancement-gtcrn  exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./speech-enhancement-gtcrn\n"
  },
  {
    "path": "swift-api-examples/run-spoken-language-identification.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./sherpa-onnx-whisper-tiny ]; then\n  echo \"Download a pre-trained model for testing.\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2\n  tar xvf sherpa-onnx-whisper-tiny.tar.bz2\n  rm sherpa-onnx-whisper-tiny.tar.bz2\nfi\n\nif [ ! -e ./spoken-language-identification ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./spoken-language-identification.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o spoken-language-identification\n\n  strip spoken-language-identification\nelse\n  echo \"./spoken-language-identification exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./spoken-language-identification\n"
  },
  {
    "path": "swift-api-examples/run-streaming-hlg-decode-file.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst ]; then\n  echo \"Downloading the pre-trained model for testing.\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\n  rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2\nfi\n\nif [ ! -e ./streaming-hlg-decode-file ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./streaming-hlg-decode-file.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o streaming-hlg-decode-file\n\n  strip ./streaming-hlg-decode-file\nelse\n  echo \"./streaming-hlg-decode-file exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./streaming-hlg-decode-file\n"
  },
  {
    "path": "swift-api-examples/run-test-version.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -e ./test-version ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./test-version.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o ./test-version\n\n  strip ./test-version\nelse\n  echo \"./test-version exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./test-version\n"
  },
  {
    "path": "swift-api-examples/run-tts-kitten-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kitten.html\n# to download more models\nif [ ! -f ./kitten-nano-en-v0_1-fp16/model.fp16.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kitten-nano-en-v0_1-fp16.tar.bz2\n  tar xf kitten-nano-en-v0_1-fp16.tar.bz2\n  rm kitten-nano-en-v0_1-fp16.tar.bz2\nfi\n\nif [ ! -e ./tts-kitten-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-kitten-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-kitten-en\n\n  strip tts-kitten-en\nelse\n  echo \"./tts-kitten-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-kitten-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-kokoro-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-en-v0_19/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2\n  tar xf kokoro-en-v0_19.tar.bz2\n  rm kokoro-en-v0_19.tar.bz2\nfi\n\nif [ ! -e ./tts-kokoro-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-kokoro-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-kokoro-en\n\n  strip tts-kokoro-en\nelse\n  echo \"./tts-kokoro-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-kokoro-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-kokoro-zh-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/kokoro.html\n# to download more models\nif [ ! -f ./kokoro-multi-lang-v1_0/model.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2\n  tar xf kokoro-multi-lang-v1_0.tar.bz2\n  rm kokoro-multi-lang-v1_0.tar.bz2\nfi\n\nif [ ! -e ./tts-kokoro-zh-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-kokoro-zh-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-kokoro-zh-en\n\n  strip tts-kokoro-zh-en\nelse\n  echo \"./tts-kokoro-zh-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-kokoro-zh-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-matcha-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# matcha.html#matcha-icefall-en-us-ljspeech-american-english-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-en_US-ljspeech/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2\n  tar xf matcha-icefall-en_US-ljspeech.tar.bz2\n  rm matcha-icefall-en_US-ljspeech.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nif [ ! -e ./tts-matcha-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-matcha-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-matcha-en\n\n  strip tts-matcha-en\nelse\n  echo \"./tts-matcha-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-matcha-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-matcha-zh.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pretrained_models/matcha.html#matcha-icefall-zh-baker-chinese-1-female-speaker\n# to download more models\nif [ ! -f ./matcha-icefall-zh-baker/model-steps-3.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2\n  tar xvf matcha-icefall-zh-baker.tar.bz2\n  rm matcha-icefall-zh-baker.tar.bz2\nfi\n\nif [ ! -f ./vocos-22khz-univ.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos-22khz-univ.onnx\nfi\n\nif [ ! -e ./tts-matcha-zh ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-matcha-zh.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-matcha-zh\n\n  strip tts-matcha-zh\nelse\n  echo \"./tts-matcha-zh exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-matcha-zh\n"
  },
  {
    "path": "swift-api-examples/run-tts-pocket-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\n# to download more models\nif [ ! -f ./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  tar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n  rm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nfi\n\nif [ ! -e ./tts-pocket-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-pocket-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-pocket-en\n\n  strip tts-pocket-en\nelse\n  echo \"./tts-pocket-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-pocket-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-supertonic-en.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/supertonic.html\n# to download more models\nif [ ! -f ./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  tar xf sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\n  rm sherpa-onnx-supertonic-tts-int8-2026-03-06.tar.bz2\nfi\n\nif [ ! -e ./tts-supertonic-en ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-supertonic-en.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-supertonic-en\n\n  strip tts-supertonic-en\nelse\n  echo \"./tts-supertonic-en exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-supertonic-en\n"
  },
  {
    "path": "swift-api-examples/run-tts-vits.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -d ./vits-piper-en_US-amy-low ]; then\n  echo \"Download a pre-trained model for testing.\"\n\n  wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-amy-low.tar.bz2\n  tar xf vits-piper-en_US-amy-low.tar.bz2\n  rm vits-piper-en_US-amy-low.tar.bz2\nfi\n\nif [ ! -e ./tts-vits ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-vits.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-vits\n\n  strip tts-vits\nelse\n  echo \"./tts-vits exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-vits\n"
  },
  {
    "path": "swift-api-examples/run-tts-zipvoice.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\n# please visit\n# https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n# to download more models\nif [ ! -f ./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  tar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n  rm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nfi\n\nif [ ! -f ./vocos_24khz.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\nfi\n\nif [ ! -e ./tts-zipvoice ]; then\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./tts-zipvoice.swift ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o tts-zipvoice\n\n  strip tts-zipvoice\nelse\n  echo \"./tts-zipvoice exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./tts-zipvoice\n"
  },
  {
    "path": "swift-api-examples/run-wenet-ctc-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx ]; then\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  tar xvf sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\n  rm sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10.tar.bz2\nfi\n\nif [ ! -e ./wenet-ctc-asr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./wenet-ctc-asr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o wenet-ctc-asr\n\n  strip wenet-ctc-asr\nelse\n  echo \"./wenet-ctc-asr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./wenet-ctc-asr\n"
  },
  {
    "path": "swift-api-examples/run-zipformer-ctc-asr.sh",
    "content": "#!/usr/bin/env bash\n\nset -ex\n\nif [ ! -d ../build-swift-macos ]; then\n  echo \"Please run ../build-swift-macos.sh first!\"\n  exit 1\nfi\n\nif [ ! -f ./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx ]; then\n  echo \"Please download the pre-trained model for testing.\"\n  echo \"You can refer to\"\n  echo \"\"\n  echo \"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/icefall/zipformer.html#sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03-chinese\"\n  echo \"\"\n  echo \"for help\"\n\n  curl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n\n  tar xvf sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  rm sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03.tar.bz2\n  ls -lh sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03\nfi\n\nif [ ! -e ./zipformer-ctc-asr ]; then\n  # Note: We use -lc++ to link against libc++ instead of libstdc++\n  swiftc \\\n    -lc++ \\\n    -I ../build-swift-macos/install/include \\\n    -import-objc-header ./SherpaOnnx-Bridging-Header.h \\\n    ./zipformer-ctc-asr.swift  ./SherpaOnnx.swift \\\n    -L ../build-swift-macos/install/lib/ \\\n    -l sherpa-onnx \\\n    -l onnxruntime \\\n    -o zipformer-ctc-asr\n\n  strip zipformer-ctc-asr\nelse\n  echo \"./zipformer-ctc-asr exists - skip building\"\nfi\n\nexport DYLD_LIBRARY_PATH=$PWD/../build-swift-macos/install/lib:$DYLD_LIBRARY_PATH\n./zipformer-ctc-asr\n"
  },
  {
    "path": "swift-api-examples/speaker-diarization.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let segmentationModel = \"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx\"\n  let embeddingExtractorModel = \"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\"\n  let waveFilename = \"./0-four-speakers-zh.wav\"\n\n  // There are 4 speakers in ./0-four-speakers-zh.wav, so we use 4 here\n  let numSpeakers = 4\n  var config = sherpaOnnxOfflineSpeakerDiarizationConfig(\n    segmentation: sherpaOnnxOfflineSpeakerSegmentationModelConfig(\n      pyannote: sherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(model: segmentationModel)),\n    embedding: sherpaOnnxSpeakerEmbeddingExtractorConfig(model: embeddingExtractorModel),\n    clustering: sherpaOnnxFastClusteringConfig(numClusters: numSpeakers)\n  )\n\n  let sd = SherpaOnnxOfflineSpeakerDiarizationWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: waveFilename)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(Int(audioFormat.sampleRate) == sd.sampleRate)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  print(\"Started!\")\n  let segments = sd.process(samples: array)\n  for i in 0..<segments.count {\n    print(\"\\(segments[i].start) -- \\(segments[i].end) speaker_\\(segments[i].speaker)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/speech-enhancement-dpdfnet.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./dpdfnet_baseline.onnx\"\n  // Please refer to\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  // to download files used in this script\n  // Use dpdfnet_baseline.onnx, dpdfnet2.onnx, dpdfnet4.onnx, or dpdfnet8.onnx\n  // for 16 kHz downstream ASR or speech recognition.\n  // Use dpdfnet2_48khz_hr.onnx for 48 kHz enhancement output.\n  var config = sherpaOnnxOfflineSpeechDenoiserConfig(\n    model: sherpaOnnxOfflineSpeechDenoiserModelConfig(\n      dpdfnet: sherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(model: model))\n  )\n\n  let sd = SherpaOnnxOfflineSpeechDenoiserWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: \"./inp_16k.wav\")\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let audio = sd.run(samples: array, sampleRate: Int(audioFormat.sampleRate))\n\n  let filename = \"enhanced.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/speech-enhancement-gtcrn.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./gtcrn_simple.onnx\"\n  // Please refer to\n  // https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\n  // to download files used in this script\n  var config = sherpaOnnxOfflineSpeechDenoiserConfig(\n    model: sherpaOnnxOfflineSpeechDenoiserModelConfig(\n      gtcrn: sherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(model: model))\n  )\n\n  let sd = SherpaOnnxOfflineSpeechDenoiserWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: \"./inp_16k.wav\")\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let audio = sd.run(samples: array, sampleRate: Int(audioFormat.sampleRate))\n\n  let filename = \"enhanced.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/spoken-language-identification.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let encoder = \"./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx\"\n  let decoder = \"./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx\"\n\n  let whisperConfig = sherpaOnnxSpokenLanguageIdentificationWhisperConfig(\n    encoder: encoder,\n    decoder: decoder\n  )\n\n  var config = sherpaOnnxSpokenLanguageIdentificationConfig(\n    whisper: whisperConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: \"cpu\"\n  )\n  let filePath = \"./sherpa-onnx-whisper-tiny/test_wavs/0.wav\"\n\n  let slid = SherpaOnnxSpokenLanguageIdentificationWrapper(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.sampleRate == 16000)\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let result = slid.decode(samples: array)\n\n  print(\"\\nDetectedllanguage is:\\n\\(result.lang)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/streaming-hlg-decode-file.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let filePath =\n    \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav\"\n  let model =\n    \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx\"\n  let tokens = \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt\"\n  let zipfomer2CtcModelConfig = sherpaOnnxOnlineZipformer2CtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOnlineModelConfig(\n    tokens: tokens,\n    zipformer2Ctc: zipfomer2CtcModelConfig\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n\n  let ctcFstDecoderConfig = sherpaOnnxOnlineCtcFstDecoderConfig(\n    graph: \"./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst\",\n    maxActive: 3000\n  )\n\n  var config = sherpaOnnxOnlineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig,\n    ctcFstDecoderConfig: ctcFstDecoderConfig\n  )\n\n  let recognizer = SherpaOnnxRecognizer(config: &config)\n\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! 
= audioFileBuffer?.array()\n  recognizer.acceptWaveform(samples: array, sampleRate: Int(audioFormat.sampleRate))\n\n  let tailPadding = [Float](repeating: 0.0, count: 3200)\n  recognizer.acceptWaveform(samples: tailPadding, sampleRate: Int(audioFormat.sampleRate))\n\n  recognizer.inputFinished()\n  while recognizer.isReady() {\n    recognizer.decode()\n  }\n\n  let result = recognizer.getResult()\n  print(\"\\nresult is:\\n\\(result.text)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/test-version.swift",
    "content": "func run() {\n  let version = getSherpaOnnxVersion()\n  let gitSha1 = getSherpaOnnxGitSha1()\n  let gitDate = getSherpaOnnxGitDate()\n  print(\"sherpa-onnx version: \\(version)\")\n  print(\"sherpa-onnx gitSha1: \\(gitSha1)\")\n  print(\"sherpa-onnx gitDate: \\(gitDate)\")\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-kitten-en.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let model = \"./kitten-nano-en-v0_1-fp16/model.fp16.onnx\"\n  let voices = \"./kitten-nano-en-v0_1-fp16/voices.bin\"\n  let tokens = \"./kitten-nano-en-v0_1-fp16/tokens.txt\"\n  let dataDir = \"./kitten-nano-en-v0_1-fp16/espeak-ng-data\"\n  let kitten = sherpaOnnxOfflineTtsKittenModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(debug: 0, kitten: kitten)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text =\n    \"Friends fell out often because life was changing so fast. 
The easiest thing in the world was to lose touch with someone.\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 0\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-kitten-en.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-kokoro-en.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let model = \"./kokoro-en-v0_19/model.onnx\"\n  let voices = \"./kokoro-en-v0_19/voices.bin\"\n  let tokens = \"./kokoro-en-v0_19/tokens.txt\"\n  let dataDir = \"./kokoro-en-v0_19/espeak-ng-data\"\n  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro, debug: 0)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text =\n    \"Friends fell out often because life was changing so fast. 
The easiest thing in the world was to lose touch with someone.\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 0\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-kokoro-en.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-kokoro-zh-en.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let model = \"./kokoro-multi-lang-v1_0/model.onnx\"\n  let voices = \"./kokoro-multi-lang-v1_0/voices.bin\"\n  let tokens = \"./kokoro-multi-lang-v1_0/tokens.txt\"\n  let dataDir = \"./kokoro-multi-lang-v1_0/espeak-ng-data\"\n  let lexicon = \"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/lexicon-zh.txt\"\n  let kokoro = sherpaOnnxOfflineTtsKokoroModelConfig(\n    model: model,\n    voices: voices,\n    tokens: tokens,\n    dataDir: dataDir,\n    lexicon: lexicon\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(kokoro: kokoro, debug: 0)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text =\n    \"中英文语音合成测试。This is generated by next generation Kaldi using Kokoro without Misaki. 
你觉得中英文说的如何呢？\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 0\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-kokoro-zh-en.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-matcha-en.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let acousticModel = \"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx\"\n  let vocoder = \"./vocos-22khz-univ.onnx\"\n  let tokens = \"./matcha-icefall-en_US-ljspeech/tokens.txt\"\n  let dataDir = \"./matcha-icefall-en_US-ljspeech/espeak-ng-data\"\n  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(\n    acousticModel: acousticModel,\n    vocoder: vocoder,\n    tokens: tokens,\n    dataDir: dataDir\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha, debug: 0)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text =\n    \"Friends fell out often because life was changing so fast. 
The easiest thing in the world was to lose touch with someone.\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 0\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-matcha-en.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-matcha-zh.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let acousticModel = \"./matcha-icefall-zh-baker/model-steps-3.onnx\"\n  let vocoder = \"./vocos-22khz-univ.onnx\"\n  let lexicon = \"./matcha-icefall-zh-baker/lexicon.txt\"\n  let tokens = \"./matcha-icefall-zh-baker/tokens.txt\"\n  let ruleFsts =\n    \"./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst\"\n  let matcha = sherpaOnnxOfflineTtsMatchaModelConfig(\n    acousticModel: acousticModel,\n    vocoder: vocoder,\n    lexicon: lexicon,\n    tokens: tokens\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(matcha: matcha, debug: 0)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig, ruleFsts: ruleFsts)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text = \"某某银行的副行长和一些行政领导表示，他们去过长江和长白山; 经济不断增长。2024年12月31号，拨打110或者18920240511。123456块钱。\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 0\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-matcha-zh.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 
{\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-pocket-en.swift",
    "content": "import Foundation\n\nclass PocketTtsProgressHandler {\n  func progress(samples: [Float], progress: Float) {\n    print(String(format: \"Received %d samples, Progress: %.2f%%\", samples.count, progress * 100))\n  }\n}\n\nfunc runPocketTtsDemo() {\n  let pocket = sherpaOnnxOfflineTtsPocketModelConfig(\n    lmFlow: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx\",\n    lmMain: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx\",\n    encoder: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx\",\n    decoder: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx\",\n    textConditioner: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx\",\n    vocabJson: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json\",\n    tokenScoresJson: \"./sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json\"\n  )\n\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(numThreads: 2, pocket: pocket)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n  ttsConfig.model.debug = 1\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let referenceAudioFile = \"./sherpa-onnx-pocket-tts-int8-2026-01-26/test_wavs/bria.wav\"\n  let referenceWave = SherpaOnnxWaveWrapper.readWave(filename: referenceAudioFile)\n\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.speed = 1.0\n  genConfig.referenceAudio = referenceWave.samples\n  genConfig.referenceSampleRate = referenceWave.sampleRate\n  genConfig.extra = [\"max_reference_audio_len\": 15.0]\n\n  let text = \"\"\"\n    Today as always, men fall into two groups: slaves and free men. Whoever \\\n    does not have two-thirds of his day for himself, is a slave, whatever \\\n    he may be: a statesman, a businessman, an official, or a scholar. \\\n    Friends fell out often because life was changing so fast. 
\\\n    The easiest thing in the world was to lose touch with someone.\n    \"\"\"\n\n  func generateAndSave(\n    outputFile: String, callback: TtsProgressCallbackWithArg? = nil,\n    arg: UnsafeMutableRawPointer? = nil\n  ) {\n    let audio = tts.generateWithConfig(\n      text: text,\n      config: genConfig,\n      callback: callback,\n      arg: arg\n    )\n\n    if audio.save(filename: outputFile) == 1 {\n      print(\"Saved to: \\(outputFile)\")\n    } else {\n      print(\"Failed to save to \\(outputFile)\")\n    }\n  }\n\n  // -------------------------\n  // Option 1: with callback\n  // -------------------------\n  let useCallback = true\n  if useCallback {\n    let progressHandler = PocketTtsProgressHandler()\n    let arg = Unmanaged.passUnretained(progressHandler).toOpaque()\n\n    let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n      let handler = Unmanaged<PocketTtsProgressHandler>.fromOpaque(arg!).takeUnretainedValue()\n\n      let buffer: [Float] =\n        samples != nil ? Array(UnsafeBufferPointer(start: samples, count: Int(n))) : []\n      handler.progress(samples: buffer, progress: progress)\n      return 1  // continue generating\n    }\n\n    generateAndSave(outputFile: \"generated-pocket-callback.wav\", callback: callback, arg: arg)\n  } else {\n    // -------------------------\n    // Option 2: direct generation\n    // -------------------------\n    generateAndSave(outputFile: \"generated-pocket-direct.wav\")\n  }\n}\n\n// -------------------------\n// Run demo\n// -------------------------\n@main\nstruct App {\n  static func main() {\n    runPocketTtsDemo()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-supertonic-en.swift",
    "content": "import Foundation\n\nclass SupertonicTtsProgressHandler {\n  func progress(samples: [Float], progress: Float) {\n    print(String(format: \"Received %d samples, Progress: %.2f%%\", samples.count, progress * 100))\n  }\n}\n\nfunc runSupertonicTtsDemo() {\n  let supertonic = sherpaOnnxOfflineTtsSupertonicModelConfig(\n    durationPredictor: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/duration_predictor.int8.onnx\",\n    textEncoder: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/text_encoder.int8.onnx\",\n    vectorEstimator: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vector_estimator.int8.onnx\",\n    vocoder: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/vocoder.int8.onnx\",\n    ttsJson: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/tts.json\",\n    unicodeIndexer: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/unicode_indexer.bin\",\n    voiceStyle: \"./sherpa-onnx-supertonic-tts-int8-2026-03-06/voice.bin\"\n  )\n\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(numThreads: 2, supertonic: supertonic)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n  ttsConfig.model.debug = 1\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 6\n  genConfig.numSteps = 5\n  genConfig.speed = 1.25\n  genConfig.extra = [\"lang\": \"en\"]\n\n  let text =\n    \"Today as always, men fall into two groups: slaves and free men. Whoever \"\n    + \"does not have two-thirds of his day for himself, is a slave, whatever \"\n    + \"he may be: a statesman, a businessman, an official, or a scholar.\"\n\n  func generateAndSave(\n    outputFile: String, callback: TtsProgressCallbackWithArg? = nil,\n    arg: UnsafeMutableRawPointer? 
= nil\n  ) {\n    let audio = tts.generateWithConfig(\n      text: text,\n      config: genConfig,\n      callback: callback,\n      arg: arg\n    )\n\n    if audio.save(filename: outputFile) == 1 {\n      print(\"Saved to: \\(outputFile)\")\n    } else {\n      print(\"Failed to save to \\(outputFile)\")\n    }\n  }\n\n  // -------------------------\n  // Option 1: with callback\n  // -------------------------\n  let useCallback = true\n  if useCallback {\n    let progressHandler = SupertonicTtsProgressHandler()\n    let arg = Unmanaged.passUnretained(progressHandler).toOpaque()\n\n    let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n      let handler = Unmanaged<SupertonicTtsProgressHandler>.fromOpaque(arg!).takeUnretainedValue()\n\n      let buffer: [Float] =\n        samples != nil ? Array(UnsafeBufferPointer(start: samples, count: Int(n))) : []\n      handler.progress(samples: buffer, progress: progress)\n      return 1  // continue generating\n    }\n\n    generateAndSave(outputFile: \"generated-supertonic-callback.wav\", callback: callback, arg: arg)\n  } else {\n    // -------------------------\n    // Option 2: direct generation\n    // -------------------------\n    generateAndSave(outputFile: \"generated-supertonic-direct.wav\")\n  }\n}\n\n// -------------------------\n// Run demo\n// -------------------------\n@main\nstruct App {\n  static func main() {\n    runSupertonicTtsDemo()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-vits.swift",
    "content": "class MyClass {\n  func playSamples(samples: [Float]) {\n    print(\"Play \\(samples.count) samples\")\n  }\n}\n\nfunc run() {\n  let model = \"./vits-piper-en_US-amy-low/en_US-amy-low.onnx\"\n  let tokens = \"./vits-piper-en_US-amy-low/tokens.txt\"\n  let dataDir = \"./vits-piper-en_US-amy-low/espeak-ng-data\"\n  let vits = sherpaOnnxOfflineTtsVitsModelConfig(\n    model: model,\n    lexicon: \"\",\n    tokens: tokens,\n    dataDir: dataDir\n  )\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(vits: vits)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n\n  let myClass = MyClass()\n\n  // We use Unretained here so myClass must be kept alive as the callback is invoked\n  //\n  // See also\n  // https://medium.com/codex/swift-c-callback-interoperability-6d57da6c8ee6\n  let arg = Unmanaged<MyClass>.passUnretained(myClass).toOpaque()\n\n  let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n    let o = Unmanaged<MyClass>.fromOpaque(arg!).takeUnretainedValue()\n    var savedSamples: [Float] = []\n    for index in 0..<n {\n      savedSamples.append(samples![Int(index)])\n    }\n\n    o.playSamples(samples: savedSamples)\n\n    // return 1 so that it continues generating\n    return 1\n  }\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let text =\n    \"“Today as always, men fall into two groups: slaves and free men. 
Whoever does not have two-thirds of his day for himself, is a slave, whatever he may be: a statesman, a businessman, an official, or a scholar.”\"\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.sid = 99\n  genConfig.speed = 1.0\n  genConfig.silenceScale = 0.2\n\n  let audio = tts.generateWithConfig(\n    text: text, config: genConfig, callback: callback, arg: arg)\n  let filename = \"test-vits-en.wav\"\n  let ok = audio.save(filename: filename)\n  if ok == 1 {\n    print(\"\\nSaved to:\\(filename)\")\n  } else {\n    print(\"Failed to save to \\(filename)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/tts-zipvoice.swift",
    "content": "import Foundation\n\nclass ZipVoiceTtsProgressHandler {\n  func progress(samples: [Float], progress: Float) {\n    print(String(format: \"Received %d samples, Progress: %.2f%%\", samples.count, progress * 100))\n  }\n}\n\nfunc runZipVoiceTtsDemo() {\n  let zipvoice = sherpaOnnxOfflineTtsZipvoiceModelConfig(\n    tokens: \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt\",\n    encoder: \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx\",\n    decoder: \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx\",\n    vocoder: \"./vocos_24khz.onnx\",\n    dataDir: \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data\",\n    lexicon: \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt\"\n  )\n\n  let modelConfig = sherpaOnnxOfflineTtsModelConfig(numThreads: 2, zipvoice: zipvoice)\n  var ttsConfig = sherpaOnnxOfflineTtsConfig(model: modelConfig)\n  ttsConfig.model.debug = 1\n\n  let tts = SherpaOnnxOfflineTtsWrapper(config: &ttsConfig)\n\n  let referenceAudioFile = \"./sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/test_wavs/leijun-1.wav\"\n  let referenceWave = SherpaOnnxWaveWrapper.readWave(filename: referenceAudioFile)\n\n  var genConfig = SherpaOnnxGenerationConfigSwift()\n  genConfig.speed = 1.0\n  genConfig.referenceAudio = referenceWave.samples\n  genConfig.referenceSampleRate = referenceWave.sampleRate\n  genConfig.referenceText = \"那还是三十六年前, 一九八七年. 我呢考上了武汉大学的计算机系.\"\n  genConfig.numSteps = 4\n  genConfig.extra = [\"min_char_in_sentence\": \"10\"]\n\n  let text = \"小米的价值观是真诚, 热爱. 真诚，就是不欺人也不自欺. 热爱, 就是全心投入并享受其中.\"\n\n  func generateAndSave(\n    outputFile: String, callback: TtsProgressCallbackWithArg? = nil,\n    arg: UnsafeMutableRawPointer? 
= nil\n  ) {\n    let audio = tts.generateWithConfig(\n      text: text,\n      config: genConfig,\n      callback: callback,\n      arg: arg\n    )\n\n    if audio.save(filename: outputFile) == 1 {\n      print(\"Saved to: \\(outputFile)\")\n    } else {\n      print(\"Failed to save to \\(outputFile)\")\n    }\n  }\n\n  let useCallback = true\n  if useCallback {\n    let progressHandler = ZipVoiceTtsProgressHandler()\n    let arg = Unmanaged.passUnretained(progressHandler).toOpaque()\n\n    let callback: TtsProgressCallbackWithArg = { samples, n, progress, arg in\n      let handler = Unmanaged<ZipVoiceTtsProgressHandler>.fromOpaque(arg!).takeUnretainedValue()\n\n      let buffer: [Float] =\n        samples != nil ? Array(UnsafeBufferPointer(start: samples, count: Int(n))) : []\n      handler.progress(samples: buffer, progress: progress)\n      return 1\n    }\n\n    generateAndSave(outputFile: \"generated-zipvoice-callback.wav\", callback: callback, arg: arg)\n  } else {\n    generateAndSave(outputFile: \"generated-zipvoice-direct.wav\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    runZipVoiceTtsDemo()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/wenet-ctc-asr.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model =\n    \"./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/model.int8.onnx\"\n  let tokens =\n    \"./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/tokens.txt\"\n\n  let wenetCtc = sherpaOnnxOfflineWenetCtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    wenetCtc: wenetCtc\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath =\n    \"./sherpa-onnx-wenetspeech-yue-u2pp-conformer-ctc-zh-en-cantonese-int8-2025-09-10/test_wavs/yue-0.wav\"\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "swift-api-examples/zipformer-ctc-asr.swift",
    "content": "import AVFoundation\n\nextension AudioBuffer {\n  func array() -> [Float] {\n    return Array(UnsafeBufferPointer(self))\n  }\n}\n\nextension AVAudioPCMBuffer {\n  func array() -> [Float] {\n    return self.audioBufferList.pointee.mBuffers.array()\n  }\n}\n\nfunc run() {\n  let model = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/model.int8.onnx\"\n  let tokens = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/tokens.txt\"\n\n  let zipformerCtc = sherpaOnnxOfflineZipformerCtcModelConfig(\n    model: model\n  )\n\n  let modelConfig = sherpaOnnxOfflineModelConfig(\n    tokens: tokens,\n    debug: 0,\n    zipformerCtc: zipformerCtc\n  )\n\n  let featConfig = sherpaOnnxFeatureConfig(\n    sampleRate: 16000,\n    featureDim: 80\n  )\n  var config = sherpaOnnxOfflineRecognizerConfig(\n    featConfig: featConfig,\n    modelConfig: modelConfig\n  )\n\n  let recognizer = SherpaOnnxOfflineRecognizer(config: &config)\n\n  let filePath = \"./sherpa-onnx-zipformer-ctc-zh-int8-2025-07-03/test_wavs/0.wav\"\n  let fileURL: NSURL = NSURL(fileURLWithPath: filePath)\n  let audioFile = try! AVAudioFile(forReading: fileURL as URL)\n\n  let audioFormat = audioFile.processingFormat\n  assert(audioFormat.channelCount == 1)\n  assert(audioFormat.commonFormat == AVAudioCommonFormat.pcmFormatFloat32)\n\n  let audioFrameCount = UInt32(audioFile.length)\n  let audioFileBuffer = AVAudioPCMBuffer(pcmFormat: audioFormat, frameCapacity: audioFrameCount)\n\n  try! audioFile.read(into: audioFileBuffer!)\n  let array: [Float]! = audioFileBuffer?.array()\n  let result = recognizer.decode(samples: array, sampleRate: Int(audioFormat.sampleRate))\n  print(\"\\nresult is:\\n\\(result.text)\")\n  if result.timestamps.count != 0 {\n    print(\"\\ntimestamps is:\\n\\(result.timestamps)\")\n  }\n\n}\n\n@main\nstruct App {\n  static func main() {\n    run()\n  }\n}\n"
  },
  {
    "path": "toolchains/aarch64-linux-gnu.toolchain.cmake",
    "content": "# Copied from https://github.com/Tencent/ncnn/blob/master/toolchains/aarch64-linux-gnu.toolchain.cmake\n\nset(CMAKE_SYSTEM_NAME Linux)\nset(CMAKE_SYSTEM_PROCESSOR aarch64)\n\nset(CMAKE_C_COMPILER \"aarch64-linux-gnu-gcc\")\nset(CMAKE_CXX_COMPILER \"aarch64-linux-gnu-g++\")\n\nset(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\nset(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\nset(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\n\nset(CMAKE_C_FLAGS \"-march=armv8-a\")\nset(CMAKE_CXX_FLAGS \"-march=armv8-a\")\n\n# cache flags\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS}\" CACHE STRING \"c flags\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS}\" CACHE STRING \"c++ flags\")\n"
  },
  {
    "path": "toolchains/arm-linux-gnueabihf.toolchain.cmake",
    "content": "# Copied from https://github.com/Tencent/ncnn/blob/master/toolchains/arm-linux-gnueabihf.toolchain.cmake\nset(CMAKE_SYSTEM_NAME Linux)\nset(CMAKE_SYSTEM_PROCESSOR arm)\n\nset(CMAKE_C_COMPILER \"arm-linux-gnueabihf-gcc\")\nset(CMAKE_CXX_COMPILER \"arm-linux-gnueabihf-g++\")\n\nset(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\nset(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\nset(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\n\nset(CMAKE_C_FLAGS \"-march=armv7-a -mfloat-abi=hard -mfpu=neon\")\nset(CMAKE_CXX_FLAGS \"-march=armv7-a -mfloat-abi=hard -mfpu=neon\")\n\n# cache flags\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS}\" CACHE STRING \"c flags\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS}\" CACHE STRING \"c++ flags\")\n"
  },
  {
    "path": "toolchains/ios.toolchain.cmake",
    "content": "# This file is part of the ios-cmake project. It was retrieved from\n# https://github.com/leetal/ios-cmake.git, which is a fork of\n# https://github.com/gerstrong/ios-cmake.git, which is a fork of\n# https://github.com/cristeab/ios-cmake.git, which is a fork of\n# https://code.google.com/p/ios-cmake/. Which in turn is based off of\n# the Platform/Darwin.cmake and Platform/UnixPaths.cmake files which\n# are included with CMake 2.8.4\n#\n# The ios-cmake project is licensed under the new BSD license.\n#\n# Copyright (c) 2014, Bogdan Cristea and LTE Engineering Software,\n# Kitware, Inc., Insight Software Consortium.  All rights reserved.\n# Redistribution and use in source and binary forms, with or without\n# modification, are permitted provided that the following conditions\n# are met:\n# 1. Redistributions of source code must retain the above copyright\n# notice, this list of conditions and the following disclaimer.\n#\n# 2. Redistributions in binary form must reproduce the above copyright\n# notice, this list of conditions and the following disclaimer in the\n# documentation and/or other materials provided with the distribution.\n#\n# 3. Neither the name of the copyright holder nor the names of its\n# contributors may be used to endorse or promote products derived from\n# this software without specific prior written permission.\n#\n# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS\n# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE\n# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,\n# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,\n# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;\n# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT\n# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN\n# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE\n# POSSIBILITY OF SUCH DAMAGE.\n#\n# This file is based off of the Platform/Darwin.cmake and\n# Platform/UnixPaths.cmake files which are included with CMake 2.8.4\n# It has been altered for iOS development.\n#\n# Updated by Alex Stewart (alexs.mac@gmail.com)\n#\n# *****************************************************************************\n#      Now maintained by Alexander Widerberg (widerbergaren [at] gmail.com)\n#                      under the BSD-3-Clause license\n#                   https://github.com/leetal/ios-cmake\n# *****************************************************************************\n#\n#                           INFORMATION / HELP\n#\n# The following options control the behaviour of this toolchain:\n#\n# PLATFORM: (default \"OS64\")\n#    OS = Build for iPhoneOS.\n#    OS64 = Build for arm64 iphoneOS.\n#    OS64COMBINED = Build for arm64 x86_64 iphoneOS. Combined into FAT STATIC lib (supported on 3.14+ of CMake with \"-G Xcode\" argument ONLY)\n#    SIMULATOR = Build for x86 i386 iphoneOS Simulator.\n#    SIMULATOR64 = Build for x86_64 iphoneOS Simulator.\n#    SIMULATORARM64 = Build for arm64 iphoneOS Simulator.\n#    TVOS = Build for arm64 tvOS.\n#    TVOSCOMBINED = Build for arm64 x86_64 tvOS. 
Combined into FAT STATIC lib (supported on 3.14+ of CMake with \"-G Xcode\" argument ONLY)\n#    SIMULATOR_TVOS = Build for x86_64 tvOS Simulator.\n#    WATCHOS = Build for armv7k arm64_32 for watchOS.\n#    WATCHOSCOMBINED = Build for armv7k arm64_32 x86_64 watchOS. Combined into FAT STATIC lib (supported on 3.14+ of CMake with \"-G Xcode\" argument ONLY)\n#    SIMULATOR_WATCHOS = Build for x86_64 for watchOS Simulator.\n#    MAC = Build for x86_64 macOS.\n#    MAC_ARM64 = Build for Apple Silicon macOS.\n#    MAC_CATALYST = Build for x86_64 macOS with Catalyst support (iOS toolchain on macOS).\n#                   Note: The build argument \"MACOSX_DEPLOYMENT_TARGET\" can be used to control min-version of macOS\n#    MAC_CATALYST_ARM64 = Build for Apple Silicon macOS with Catalyst support (iOS toolchain on macOS).\n#                         Note: The build argument \"MACOSX_DEPLOYMENT_TARGET\" can be used to control min-version of macOS\n#\n# CMAKE_OSX_SYSROOT: Path to the SDK to use.  By default this is\n#    automatically determined from PLATFORM and xcodebuild, but\n#    can also be manually specified (although this should not be required).\n#\n# CMAKE_DEVELOPER_ROOT: Path to the Developer directory for the platform\n#    being compiled for.  By default this is automatically determined from\n#    CMAKE_OSX_SYSROOT, but can also be manually specified (although this should\n#    not be required).\n#\n# DEPLOYMENT_TARGET: Minimum SDK version to target. Default 4.0 on watchOS, 10.13 on macOS (x86_64), 11.0 on macOS (Apple Silicon), 13.0 on Mac Catalyst and 11.0 on tvOS+iOS\n#\n# ENABLE_BITCODE: (1|0) Enables or disables bitcode support. Default 1 (true)\n#\n# ENABLE_ARC: (1|0) Enables or disables ARC support. Default 1 (true, ARC enabled by default)\n#\n# ENABLE_VISIBILITY: (1|0) Enables or disables symbol visibility support. Default 0 (false, visibility hidden by default)\n#\n# ENABLE_STRICT_TRY_COMPILE: (1|0) Enables or disables strict try_compile() on all Check* directives (will run linker\n#    to actually check if linking is possible). 
Default 0 (false, will set CMAKE_TRY_COMPILE_TARGET_TYPE to STATIC_LIBRARY)\n#\n# ARCHS: (armv7 armv7s armv7k arm64 arm64_32 i386 x86_64) If specified, will override the default architectures for the given PLATFORM\n#    OS = armv7 armv7s arm64 (if applicable)\n#    OS64 = arm64 (if applicable)\n#    SIMULATOR = i386\n#    SIMULATOR64 = x86_64\n#    SIMULATORARM64 = arm64\n#    TVOS = arm64\n#    SIMULATOR_TVOS = x86_64 (i386 has since long been deprecated)\n#    WATCHOS = armv7k arm64_32 (if applicable)\n#    SIMULATOR_WATCHOS = x86_64 (i386 has since long been deprecated)\n#    MAC = x86_64\n#    MAC_ARM64 = arm64\n#    MAC_CATALYST = x86_64\n#    MAC_CATALYST_ARM64 = arm64\n#\n# This toolchain defines the following properties (available via get_property()) for use externally:\n#\n# PLATFORM: The currently targeted platform.\n# XCODE_VERSION: Version number (not including Build version) of Xcode detected.\n# SDK_VERSION: Version of SDK being used.\n# OSX_ARCHITECTURES: Architectures being compiled for (generated from PLATFORM).\n# APPLE_TARGET_TRIPLE: Used by autoconf build systems. NOTE: If \"ARCHS\" are overridden, this will *NOT* be set!\n#\n# This toolchain defines the following macros for use externally:\n#\n# set_xcode_property (TARGET XCODE_PROPERTY XCODE_VALUE XCODE_VARIANT)\n#   A convenience macro for setting xcode specific properties on targets.\n#   Available variants are: All, Release, RelWithDebInfo, Debug, MinSizeRel\n#   example: set_xcode_property (myioslib IPHONEOS_DEPLOYMENT_TARGET \"3.1\" \"all\").\n#\n# find_host_package (PROGRAM ARGS)\n#   A macro used to find executable programs on the host system, not within the\n#   environment. 
Thanks to the android-cmake project for providing the\n#   command.\n#\n\ncmake_minimum_required(VERSION 3.8.0)\n\n# CMake invokes the toolchain file twice during the first build, but only once during subsequent rebuilds.\nif(IOS_TOOLCHAIN_HAS_RUN)\n  return()\nendif(IOS_TOOLCHAIN_HAS_RUN)\nset(IOS_TOOLCHAIN_HAS_RUN true)\n\n###############################################################################\n#                                  OPTIONS                                    #\n###############################################################################\n\noption(DROP_32_BIT \"Drops the 32-bit targets universally.\" YES)\n\n###############################################################################\n#                                END OPTIONS                                  #\n###############################################################################\n\n# List of supported platform values\nlist(APPEND _supported_platforms\n        \"OS\" \"OS64\" \"OS64COMBINED\" \"SIMULATOR\" \"SIMULATOR64\" \"SIMULATORARM64\"\n        \"TVOS\" \"TVOSCOMBINED\" \"SIMULATOR_TVOS\"\n        \"WATCHOS\" \"WATCHOSCOMBINED\" \"SIMULATOR_WATCHOS\"\n        \"MAC\" \"MAC_ARM64\"\n        \"MAC_CATALYST\" \"MAC_CATALYST_ARM64\")\n\n# Cache what generator is used\nset(USED_CMAKE_GENERATOR \"${CMAKE_GENERATOR}\")\n\n# Check if using a CMake version capable of building combined FAT builds (simulator and target slices combined in one static lib)\nif(${CMAKE_VERSION} VERSION_GREATER_EQUAL \"3.14\")\n  set(MODERN_CMAKE YES)\nendif()\n\n# Get the Xcode version being used.\n# Problem: CMake runs toolchain files multiple times, but can't read cache variables on some runs.\n# Workaround: On first run (in which cache variables are always accessible), set an intermediary environment variable.\n#\n# NOTE: This pattern is used in many places in this toolchain to speed up checks of all sorts\nif(DEFINED XCODE_VERSION_INT)\n  # Environment variables are always preserved.\n  
set(ENV{_XCODE_VERSION_INT} \"${XCODE_VERSION_INT}\")\nelseif(DEFINED ENV{_XCODE_VERSION_INT})\n  set(XCODE_VERSION_INT \"$ENV{_XCODE_VERSION_INT}\")\nelseif(NOT DEFINED XCODE_VERSION_INT)\n  find_program(XCODEBUILD_EXECUTABLE xcodebuild)\n  if(NOT XCODEBUILD_EXECUTABLE)\n    message(FATAL_ERROR \"xcodebuild not found. Please install either the standalone commandline tools or Xcode.\")\n  endif()\n  execute_process(COMMAND ${XCODEBUILD_EXECUTABLE} -version\n          OUTPUT_VARIABLE XCODE_VERSION_INT\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\n  string(REGEX MATCH \"Xcode [0-9\\\\.]+\" XCODE_VERSION_INT \"${XCODE_VERSION_INT}\")\n  string(REGEX REPLACE \"Xcode ([0-9\\\\.]+)\" \"\\\\1\" XCODE_VERSION_INT \"${XCODE_VERSION_INT}\")\n  set(XCODE_VERSION_INT \"${XCODE_VERSION_INT}\" CACHE INTERNAL \"\")\nendif()\n\n# Assuming that xcode 12.0 is installed you most probably have ios sdk 14.0 or later installed (tested on Big Sur)\n# if you don't set a deployment target it will be set the way you only get 64-bit builds\nif(NOT DEFINED DEPLOYMENT_TARGET AND XCODE_VERSION_INT VERSION_GREATER 12.0)\n  # Temporarily fix the arm64 issues in CMake install-combined by excluding arm64 for simulator builds (needed for Apple Silicon...)\n  set(CMAKE_XCODE_ATTRIBUTE_EXCLUDED_ARCHS[sdk=iphonesimulator*] \"arm64\")\nendif()\n\n# Check if the platform variable is set\nif(DEFINED PLATFORM)\n  # Environment variables are always preserved.\n  set(ENV{_PLATFORM} \"${PLATFORM}\")\nelseif(DEFINED ENV{_PLATFORM})\n  set(PLATFORM \"$ENV{_PLATFORM}\")\nelseif(NOT DEFINED PLATFORM)\n  message(FATAL_ERROR \"PLATFORM argument not set. 
Bailing configure since I don't know what target you want to build for!\")\nendif ()\n\n# Safeguard that the platform value is set and is one of the supported values\nlist(FIND _supported_platforms ${PLATFORM} contains_PLATFORM)\nif(\"${contains_PLATFORM}\" EQUAL \"-1\")\n  string(REPLACE \";\"  \"\\n * \" _supported_platforms_formatted \"${_supported_platforms}\")\n  message(FATAL_ERROR \" Invalid PLATFORM specified! Current value: ${PLATFORM}.\\n\"\n          \" Supported PLATFORM values: \\n * ${_supported_platforms_formatted}\")\nendif()\n\n# Check if Apple Silicon is supported\nif(PLATFORM MATCHES \"^(MAC_ARM64)$|^(MAC_CATALYST_ARM64)$\" AND ${CMAKE_VERSION} VERSION_LESS \"3.19.5\")\n  message(FATAL_ERROR \"Apple Silicon builds requires a minimum of CMake 3.19.5\")\nendif()\n\n# Touch toolchain variable to suppress \"unused variable\" warning.\n# This happens if CMake is invoked with the same command line the second time.\nif(CMAKE_TOOLCHAIN_FILE)\nendif()\n\n# Fix for PThread library not in path\nset(CMAKE_THREAD_LIBS_INIT \"-lpthread\")\nset(CMAKE_HAVE_THREADS_LIBRARY 1)\nset(CMAKE_USE_WIN32_THREADS_INIT 0)\nset(CMAKE_USE_PTHREADS_INIT 1)\n\n# Specify minimum version of deployment target.\nif(NOT DEFINED DEPLOYMENT_TARGET)\n  if (PLATFORM MATCHES \"WATCHOS\")\n    # Unless specified, SDK version 4.0 is used by default as minimum target version (watchOS).\n    set(DEPLOYMENT_TARGET \"4.0\")\n  elseif(PLATFORM STREQUAL \"MAC\")\n    # Unless specified, SDK version 10.13 (High sierra) is used by default as minimum target version (macos).\n    set(DEPLOYMENT_TARGET \"10.13\")\n  elseif(PLATFORM STREQUAL \"MAC_ARM64\")\n    # Unless specified, SDK version 11.0 (Big Sur) is used by default as minimum target version (macos on arm).\n    set(DEPLOYMENT_TARGET \"11.0\")\n  elseif(PLATFORM STREQUAL \"MAC_CATALYST\" OR PLATFORM STREQUAL \"MAC_CATALYST_ARM64\")\n    # Unless specified, SDK version 13.0 is used by default as minimum target version (mac catalyst minimum 
requirement).\n    set(DEPLOYMENT_TARGET \"13.0\")\n  else()\n    # Unless specified, SDK version 11.0 is used by default as minimum target version (iOS, tvOS).\n    set(DEPLOYMENT_TARGET \"11.0\")\n  endif()\n  message(STATUS \"[DEFAULTS] Using the default min-version since DEPLOYMENT_TARGET not provided!\")\nelseif(DEFINED DEPLOYMENT_TARGET AND PLATFORM STREQUAL \"MAC_CATALYST\" AND ${DEPLOYMENT_TARGET} VERSION_LESS \"13.0\")\n  message(FATAL_ERROR \"Mac Catalyst builds requires a minimum deployment target of 13.0!\")\nendif()\n\n# Store the DEPLOYMENT_TARGET in the cache\nset(DEPLOYMENT_TARGET \"${DEPLOYMENT_TARGET}\" CACHE INTERNAL \"\")\n\n# Handle the case where we are targeting iOS and a version above 10.3.4 (32-bit support dropped officially)\nif(PLATFORM STREQUAL \"OS\" AND DEPLOYMENT_TARGET VERSION_GREATER_EQUAL 10.3.4)\n  set(PLATFORM \"OS64\")\n  message(STATUS \"Targeting minimum SDK version ${DEPLOYMENT_TARGET}. Dropping 32-bit support.\")\nelseif(PLATFORM STREQUAL \"SIMULATOR\" AND DEPLOYMENT_TARGET VERSION_GREATER_EQUAL 10.3.4)\n  set(PLATFORM \"SIMULATOR64\")\n  message(STATUS \"Targeting minimum SDK version ${DEPLOYMENT_TARGET}. 
Dropping 32-bit support.\")\nendif()\n\nset(PLATFORM_INT \"${PLATFORM}\")\n\nif(DEFINED ARCHS)\n  string(REPLACE \";\" \"-\" ARCHS_SPLIT \"${ARCHS}\")\nendif()\n\n# Determine the platform name and architectures for use in xcodebuild commands\n# from the specified PLATFORM_INT name.\nif(PLATFORM_INT STREQUAL \"OS\")\n  set(SDK_NAME iphoneos)\n  if(NOT ARCHS)\n    set(ARCHS armv7 armv7s arm64)\n    set(APPLE_TARGET_TRIPLE_INT arm-apple-ios)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"OS64\")\n  set(SDK_NAME iphoneos)\n  if(NOT ARCHS)\n    if (XCODE_VERSION_INT VERSION_GREATER 10.0)\n      set(ARCHS arm64) # Add arm64e when Apple have fixed the integration issues with it, libarclite_iphoneos.a is currently missing bitcode markers for example\n    else()\n      set(ARCHS arm64)\n    endif()\n    set(APPLE_TARGET_TRIPLE_INT aarch64-apple-ios)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"OS64COMBINED\")\n  set(SDK_NAME iphoneos)\n  if(MODERN_CMAKE)\n    if(NOT ARCHS)\n      if (XCODE_VERSION_INT VERSION_GREATER 10.0)\n        set(ARCHS arm64 x86_64) # Add arm64e when Apple have fixed the integration issues with it, libarclite_iphoneos.a is currently missing bitcode markers for example\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=iphoneos*] \"arm64\")\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=iphonesimulator*] \"x86_64\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=iphoneos*] \"arm64\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=iphonesimulator*] \"x86_64\")\n      else()\n        set(ARCHS arm64 x86_64)\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=iphoneos*] \"arm64\")\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=iphonesimulator*] \"x86_64\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=iphoneos*] \"arm64\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=iphonesimulator*] \"x86_64\")\n      endif()\n      set(APPLE_TARGET_TRIPLE_INT aarch64-x86_64-apple-ios)\n    else()\n      
set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios)\n    endif()\n  else()\n    message(FATAL_ERROR \"Please make sure that you are running CMake 3.14+ to make the OS64COMBINED setting work\")\n  endif()\nelseif(PLATFORM_INT STREQUAL \"SIMULATOR\")\n  set(SDK_NAME iphonesimulator)\n  if(NOT ARCHS)\n    set(ARCHS i386)\n    set(APPLE_TARGET_TRIPLE_INT i386-apple-ios)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios)\n  endif()\n  message(DEPRECATION \"SIMULATOR IS DEPRECATED. Consider using SIMULATOR64 instead.\")\nelseif(PLATFORM_INT STREQUAL \"SIMULATOR64\")\n  set(SDK_NAME iphonesimulator)\n  if(NOT ARCHS)\n    set(ARCHS x86_64)\n    set(APPLE_TARGET_TRIPLE_INT x86_64-apple-ios)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"SIMULATORARM64\")\n  set(SDK_NAME iphonesimulator)\n  if(NOT ARCHS)\n    set(ARCHS arm64)\n    set(APPLE_TARGET_TRIPLE_INT aarch64-apple-ios)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"TVOS\")\n  set(SDK_NAME appletvos)\n  if(NOT ARCHS)\n    set(ARCHS arm64)\n    set(APPLE_TARGET_TRIPLE_INT aarch64-apple-tvos)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-tvos)\n  endif()\nelseif (PLATFORM_INT STREQUAL \"TVOSCOMBINED\")\n  set(SDK_NAME appletvos)\n  if(MODERN_CMAKE)\n    if(NOT ARCHS)\n      set(ARCHS arm64 x86_64)\n      set(APPLE_TARGET_TRIPLE_INT aarch64-x86_64-apple-tvos)\n      set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=appletvos*] \"arm64\")\n      set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=appletvsimulator*] \"x86_64\")\n      set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=appletvos*] \"arm64\")\n      set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=appletvsimulator*] \"x86_64\")\n    else()\n      set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-tvos)\n    endif()\n  else()\n    message(FATAL_ERROR \"Please make sure that you are running CMake 3.14+ to make the TVOSCOMBINED setting 
work\")\n  endif()\nelseif(PLATFORM_INT STREQUAL \"SIMULATOR_TVOS\")\n  set(SDK_NAME appletvsimulator)\n  if(NOT ARCHS)\n    set(ARCHS x86_64)\n    set(APPLE_TARGET_TRIPLE_INT x86_64-apple-tvos)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-tvos)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"WATCHOS\")\n  set(SDK_NAME watchos)\n  if(NOT ARCHS)\n    if (XCODE_VERSION_INT VERSION_GREATER 10.0)\n      set(ARCHS armv7k arm64_32)\n      set(APPLE_TARGET_TRIPLE_INT aarch64_32-apple-watchos)\n    else()\n      set(ARCHS armv7k)\n      set(APPLE_TARGET_TRIPLE_INT arm-apple-watchos)\n    endif()\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-watchos)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"WATCHOSCOMBINED\")\n  set(SDK_NAME watchos)\n  if(MODERN_CMAKE)\n    if(NOT ARCHS)\n      if (XCODE_VERSION_INT VERSION_GREATER 10.0)\n        set(ARCHS armv7k arm64_32 i386)\n        set(APPLE_TARGET_TRIPLE_INT aarch64_32-i386-apple-watchos)\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=watchos*] \"armv7k arm64_32\")\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=watchsimulator*] \"i386\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=watchos*] \"armv7k arm64_32\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=watchsimulator*] \"i386\")\n      else()\n        set(ARCHS armv7k i386)\n        set(APPLE_TARGET_TRIPLE_INT arm-i386-apple-watchos)\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=watchos*] \"armv7k\")\n        set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=watchsimulator*] \"i386\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=watchos*] \"armv7k\")\n        set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=watchsimulator*] \"i386\")\n      endif()\n    else()\n      set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-watchos)\n    endif()\n  else()\n    message(FATAL_ERROR \"Please make sure that you are running CMake 3.14+ to make the WATCHOSCOMBINED setting work\")\n  endif()\nelseif(PLATFORM_INT STREQUAL \"SIMULATOR_WATCHOS\")\n  set(SDK_NAME 
watchsimulator)\n  if(NOT ARCHS)\n    set(ARCHS i386)\n    set(APPLE_TARGET_TRIPLE_INT i386-apple-watchos)\n  else()\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-watchos)\n  endif()\nelseif(PLATFORM_INT STREQUAL \"MAC\" OR PLATFORM_INT STREQUAL \"MAC_CATALYST\")\n  set(SDK_NAME macosx)\n  if(NOT ARCHS)\n    set(ARCHS x86_64)\n  endif()\n  string(REPLACE \";\" \"-\" ARCHS_SPLIT \"${ARCHS}\")\n  if(PLATFORM_INT STREQUAL \"MAC\")\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-macosx)\n  elseif(PLATFORM_INT STREQUAL \"MAC_CATALYST\")\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios${DEPLOYMENT_TARGET}-macabi)\n  endif()\nelseif(PLATFORM_INT MATCHES \"^(MAC_ARM64)$|^(MAC_CATALYST_ARM64)$\")\n  set(SDK_NAME macosx)\n  if(NOT ARCHS)\n    set(ARCHS arm64)\n  endif()\n  string(REPLACE \";\" \"-\" ARCHS_SPLIT \"${ARCHS}\")\n  if(PLATFORM_INT STREQUAL \"MAC_ARM64\")\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-macosx)\n  elseif(PLATFORM_INT STREQUAL \"MAC_CATALYST_ARM64\")\n    set(APPLE_TARGET_TRIPLE_INT ${ARCHS_SPLIT}-apple-ios${DEPLOYMENT_TARGET}-macabi)\n  endif()\nelse()\n  message(FATAL_ERROR \"Invalid PLATFORM: ${PLATFORM_INT}\")\nendif()\n\nif(MODERN_CMAKE AND PLATFORM_INT MATCHES \".*COMBINED\" AND NOT CMAKE_GENERATOR MATCHES \"Xcode\")\n  message(FATAL_ERROR \"The COMBINED options only work with Xcode generator, -G Xcode\")\nendif()\n\nif(CMAKE_GENERATOR MATCHES \"Xcode\" AND PLATFORM_INT MATCHES \"MAC_CATALYST_.*\")\n  set(CMAKE_XCODE_ATTRIBUTE_CLANG_CXX_LIBRARY \"libc++\")\n  set(CMAKE_XCODE_ATTRIBUTE_SUPPORTED_PLATFORMS \"macosx\")\n  set(CMAKE_XCODE_EFFECTIVE_PLATFORMS \"-maccatalyst\")\n  if(NOT DEFINED MACOSX_DEPLOYMENT_TARGET)\n    set(CMAKE_XCODE_ATTRIBUTE_MACOSX_DEPLOYMENT_TARGET \"10.15\")\n  else()\n    set(CMAKE_XCODE_ATTRIBUTE_MACOSX_DEPLOYMENT_TARGET \"${MACOSX_DEPLOYMENT_TARGET}\")\n  endif()\nelseif(CMAKE_GENERATOR MATCHES \"Xcode\")\n  set(CMAKE_XCODE_ATTRIBUTE_IPHONEOS_DEPLOYMENT_TARGET \"${DEPLOYMENT_TARGET}\")\n  
if(NOT PLATFORM_INT MATCHES \".*COMBINED\")\n    set(CMAKE_XCODE_ATTRIBUTE_ARCHS[sdk=${SDK_NAME}*] \"${ARCHS}\")\n    set(CMAKE_XCODE_ATTRIBUTE_VALID_ARCHS[sdk=${SDK_NAME}*] \"${ARCHS}\")\n  endif()\nendif()\n\n# If user did not specify the SDK root to use, then query xcodebuild for it.\nif(DEFINED CMAKE_OSX_SYSROOT_INT)\n  # Environment variables are always preserved.\n  set(ENV{_CMAKE_OSX_SYSROOT_INT} \"${CMAKE_OSX_SYSROOT_INT}\")\nelseif(DEFINED ENV{_CMAKE_OSX_SYSROOT_INT})\n  set(CMAKE_OSX_SYSROOT_INT \"$ENV{_CMAKE_OSX_SYSROOT_INT}\")\nelseif(NOT DEFINED CMAKE_OSX_SYSROOT_INT)\n  execute_process(COMMAND ${XCODEBUILD_EXECUTABLE} -version -sdk ${SDK_NAME} Path\n          OUTPUT_VARIABLE CMAKE_OSX_SYSROOT_INT\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\nendif()\n\nif (NOT DEFINED CMAKE_OSX_SYSROOT_INT AND NOT DEFINED CMAKE_OSX_SYSROOT)\n  message(SEND_ERROR \"Please make sure that Xcode is installed and that the toolchain\"\n          \"is pointing to the correct path. Please run:\"\n          \"sudo xcode-select -s /Applications/Xcode.app/Contents/Developer\"\n          \"and see if that fixes the problem for you.\")\n  message(FATAL_ERROR \"Invalid CMAKE_OSX_SYSROOT: ${CMAKE_OSX_SYSROOT} \"\n          \"does not exist.\")\nelseif(DEFINED CMAKE_OSX_SYSROOT_INT)\n  set(CMAKE_OSX_SYSROOT_INT \"${CMAKE_OSX_SYSROOT_INT}\" CACHE INTERNAL \"\")\n  # Specify the location or name of the platform SDK to be used in CMAKE_OSX_SYSROOT.\n  set(CMAKE_OSX_SYSROOT \"${CMAKE_OSX_SYSROOT_INT}\" CACHE INTERNAL \"\")\nendif()\n\n# Use bitcode or not\nif(NOT DEFINED ENABLE_BITCODE AND NOT ARCHS MATCHES \"((^|;|, )(i386|x86_64))+\")\n  # Unless specified, enable bitcode support by default\n  message(STATUS \"[DEFAULTS] Enabling bitcode support by default. ENABLE_BITCODE not provided!\")\n  set(ENABLE_BITCODE TRUE)\nelseif(NOT DEFINED ENABLE_BITCODE)\n  message(STATUS \"[DEFAULTS] Disabling bitcode support by default on simulators. 
ENABLE_BITCODE not provided for override!\")\n  set(ENABLE_BITCODE FALSE)\nendif()\nset(ENABLE_BITCODE_INT ${ENABLE_BITCODE} CACHE BOOL\n        \"Whether or not to enable bitcode\" FORCE)\n# Use ARC or not\nif(NOT DEFINED ENABLE_ARC)\n  # Unless specified, enable ARC support by default\n  set(ENABLE_ARC TRUE)\n  message(STATUS \"[DEFAULTS] Enabling ARC support by default. ENABLE_ARC not provided!\")\nendif()\nset(ENABLE_ARC_INT ${ENABLE_ARC} CACHE BOOL \"Whether or not to enable ARC\" FORCE)\n# Use hidden visibility or not\nif(NOT DEFINED ENABLE_VISIBILITY)\n  # Unless specified, disable symbols visibility by default\n  set(ENABLE_VISIBILITY FALSE)\n  message(STATUS \"[DEFAULTS] Hiding symbols visibility by default. ENABLE_VISIBILITY not provided!\")\nendif()\nset(ENABLE_VISIBILITY_INT ${ENABLE_VISIBILITY} CACHE BOOL \"Whether or not to hide symbols from the dynamic linker (-fvisibility=hidden)\" FORCE)\n# Set strict compiler checks or not\nif(NOT DEFINED ENABLE_STRICT_TRY_COMPILE)\n  # Unless specified, disable strict try_compile()\n  set(ENABLE_STRICT_TRY_COMPILE FALSE)\n  message(STATUS \"[DEFAULTS] Using NON-strict compiler checks by default. ENABLE_STRICT_TRY_COMPILE not provided!\")\nendif()\nset(ENABLE_STRICT_TRY_COMPILE_INT ${ENABLE_STRICT_TRY_COMPILE} CACHE BOOL\n        \"Whether or not to use strict compiler checks\" FORCE)\n\n# Get the SDK version information.\nif(DEFINED SDK_VERSION)\n  # Environment variables are always preserved.\n  set(ENV{_SDK_VERSION} \"${SDK_VERSION}\")\nelseif(DEFINED ENV{_SDK_VERSION})\n  set(SDK_VERSION \"$ENV{_SDK_VERSION}\")\nelseif(NOT DEFINED SDK_VERSION)\n  execute_process(COMMAND ${XCODEBUILD_EXECUTABLE} -sdk ${CMAKE_OSX_SYSROOT_INT} -version SDKVersion\n          OUTPUT_VARIABLE SDK_VERSION\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\nendif()\n\n# Find the Developer root for the specific iOS platform being compiled for\n# from CMAKE_OSX_SYSROOT.  
Should be ../../ from SDK specified in\n# CMAKE_OSX_SYSROOT. There does not appear to be a direct way to obtain\n# this information from xcrun or xcodebuild.\nif (NOT DEFINED CMAKE_DEVELOPER_ROOT AND NOT CMAKE_GENERATOR MATCHES \"Xcode\")\n  get_filename_component(PLATFORM_SDK_DIR ${CMAKE_OSX_SYSROOT_INT} PATH)\n  get_filename_component(CMAKE_DEVELOPER_ROOT ${PLATFORM_SDK_DIR} PATH)\n  if (NOT EXISTS \"${CMAKE_DEVELOPER_ROOT}\")\n    message(FATAL_ERROR \"Invalid CMAKE_DEVELOPER_ROOT: ${CMAKE_DEVELOPER_ROOT} does not exist.\")\n  endif()\nendif()\n\n# Find the C & C++ compilers for the specified SDK.\nif(DEFINED CMAKE_C_COMPILER)\n  # Environment variables are always preserved.\n  set(ENV{_CMAKE_C_COMPILER} \"${CMAKE_C_COMPILER}\")\nelseif(DEFINED ENV{_CMAKE_C_COMPILER})\n  set(CMAKE_C_COMPILER \"$ENV{_CMAKE_C_COMPILER}\")\nelseif(NOT DEFINED CMAKE_C_COMPILER)\n  execute_process(COMMAND xcrun -sdk ${CMAKE_OSX_SYSROOT_INT} -find clang\n          OUTPUT_VARIABLE CMAKE_C_COMPILER\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\nendif()\nif(DEFINED CMAKE_CXX_COMPILER)\n  # Environment variables are always preserved.\n  set(ENV{_CMAKE_CXX_COMPILER} \"${CMAKE_CXX_COMPILER}\")\nelseif(DEFINED ENV{_CMAKE_CXX_COMPILER})\n  set(CMAKE_CXX_COMPILER \"$ENV{_CMAKE_CXX_COMPILER}\")\nelseif(NOT DEFINED CMAKE_CXX_COMPILER)\n  execute_process(COMMAND xcrun -sdk ${CMAKE_OSX_SYSROOT_INT} -find clang++\n          OUTPUT_VARIABLE CMAKE_CXX_COMPILER\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\nendif()\n# Find (Apple's) libtool.\nif(DEFINED BUILD_LIBTOOL)\n  # Environment variables are always preserved.\n  set(ENV{_BUILD_LIBTOOL} \"${BUILD_LIBTOOL}\")\nelseif(DEFINED ENV{_BUILD_LIBTOOL})\n  set(BUILD_LIBTOOL \"$ENV{_BUILD_LIBTOOL}\")\nelseif(NOT DEFINED BUILD_LIBTOOL)\n  execute_process(COMMAND xcrun -sdk ${CMAKE_OSX_SYSROOT_INT} -find libtool\n          OUTPUT_VARIABLE BUILD_LIBTOOL\n          ERROR_QUIET\n          
OUTPUT_STRIP_TRAILING_WHITESPACE)\nendif()\n# Find the toolchain's provided install_name_tool if none is found on the host\nif(DEFINED CMAKE_INSTALL_NAME_TOOL)\n  # Environment variables are always preserved.\n  set(ENV{_CMAKE_INSTALL_NAME_TOOL} \"${CMAKE_INSTALL_NAME_TOOL}\")\nelseif(DEFINED ENV{_CMAKE_INSTALL_NAME_TOOL})\n  set(CMAKE_INSTALL_NAME_TOOL \"$ENV{_CMAKE_INSTALL_NAME_TOOL}\")\nelseif(NOT DEFINED CMAKE_INSTALL_NAME_TOOL)\n  execute_process(COMMAND xcrun -sdk ${CMAKE_OSX_SYSROOT_INT} -find install_name_tool\n          OUTPUT_VARIABLE CMAKE_INSTALL_NAME_TOOL_INT\n          ERROR_QUIET\n          OUTPUT_STRIP_TRAILING_WHITESPACE)\n  set(CMAKE_INSTALL_NAME_TOOL ${CMAKE_INSTALL_NAME_TOOL_INT} CACHE INTERNAL \"\")\nendif()\n\n# Configure libtool to be used instead of ar + ranlib to build static libraries.\n# This is required on Xcode 7+, but should also work on previous versions of\n# Xcode.\nget_property(languages GLOBAL PROPERTY ENABLED_LANGUAGES)\nforeach(lang ${languages})\n  set(CMAKE_${lang}_CREATE_STATIC_LIBRARY \"${BUILD_LIBTOOL} -static -o <TARGET> <LINK_FLAGS> <OBJECTS> \" CACHE INTERNAL \"\")\nendforeach()\n\n# CMake 3.14+ support building for iOS, watchOS and tvOS out of the box.\nif(MODERN_CMAKE)\n  if(SDK_NAME MATCHES \"iphone\")\n    set(CMAKE_SYSTEM_NAME iOS)\n  elseif(SDK_NAME MATCHES \"macosx\")\n    set(CMAKE_SYSTEM_NAME Darwin)\n  elseif(SDK_NAME MATCHES \"appletv\")\n    set(CMAKE_SYSTEM_NAME tvOS)\n  elseif(SDK_NAME MATCHES \"watch\")\n    set(CMAKE_SYSTEM_NAME watchOS)\n  endif()\n  # Provide flags for a combined FAT library build on newer CMake versions\n  if(PLATFORM_INT MATCHES \".*COMBINED\")\n    set(CMAKE_XCODE_ATTRIBUTE_ONLY_ACTIVE_ARCH \"NO\")\n    set(CMAKE_IOS_INSTALL_COMBINED YES)\n    message(STATUS \"Will combine built (static) artifacts into FAT lib...\")\n  endif()\nelseif(NOT DEFINED CMAKE_SYSTEM_NAME AND ${CMAKE_VERSION} VERSION_GREATER_EQUAL \"3.10\")\n  # Legacy code path prior to CMake 3.14 or fallback if no 
CMAKE_SYSTEM_NAME specified\n  set(CMAKE_SYSTEM_NAME iOS)\nelseif(NOT DEFINED CMAKE_SYSTEM_NAME)\n  # Legacy code path prior to CMake 3.14 or fallback if no CMAKE_SYSTEM_NAME specified\n  set(CMAKE_SYSTEM_NAME Darwin)\nendif()\n# Standard settings.\nset(CMAKE_SYSTEM_VERSION ${SDK_VERSION} CACHE INTERNAL \"\")\nset(UNIX TRUE CACHE BOOL \"\")\nset(APPLE TRUE CACHE BOOL \"\")\nif(PLATFORM STREQUAL \"MAC\" OR PLATFORM STREQUAL \"MAC_ARM64\")\n  set(IOS FALSE CACHE BOOL \"\")\n  set(MACOS TRUE CACHE BOOL \"\")\nelseif(PLATFORM STREQUAL \"MAC_CATALYST\" OR PLATFORM STREQUAL \"MAC_CATALYST_ARM64\")\n  set(IOS TRUE CACHE BOOL \"\")\n  set(MACOS TRUE CACHE BOOL \"\")\nelse()\n  set(IOS TRUE CACHE BOOL \"\")\nendif()\nset(CMAKE_AR ar CACHE FILEPATH \"\" FORCE)\nset(CMAKE_RANLIB ranlib CACHE FILEPATH \"\" FORCE)\nset(CMAKE_STRIP strip CACHE FILEPATH \"\" FORCE)\n# Set the architectures for which to build.\nset(CMAKE_OSX_ARCHITECTURES ${ARCHS} CACHE INTERNAL \"\")\n# Change the type of target generated for try_compile() so it'll work when cross-compiling, weak compiler checks\nif(NOT ENABLE_STRICT_TRY_COMPILE_INT)\n  set(CMAKE_TRY_COMPILE_TARGET_TYPE STATIC_LIBRARY)\nendif()\n# All iOS/Darwin specific settings - some may be redundant.\nset(CMAKE_MACOSX_BUNDLE YES)\nset(CMAKE_XCODE_ATTRIBUTE_CODE_SIGNING_REQUIRED \"NO\")\nset(CMAKE_SHARED_LIBRARY_PREFIX \"lib\")\nset(CMAKE_SHARED_LIBRARY_SUFFIX \".dylib\")\nset(CMAKE_SHARED_MODULE_PREFIX \"lib\")\nset(CMAKE_SHARED_MODULE_SUFFIX \".so\")\nset(CMAKE_C_COMPILER_ABI ELF)\nset(CMAKE_CXX_COMPILER_ABI ELF)\nset(CMAKE_C_HAS_ISYSROOT 1)\nset(CMAKE_CXX_HAS_ISYSROOT 1)\nset(CMAKE_MODULE_EXISTS 1)\nset(CMAKE_DL_LIBS \"\")\nset(CMAKE_C_OSX_COMPATIBILITY_VERSION_FLAG \"-compatibility_version \")\nset(CMAKE_C_OSX_CURRENT_VERSION_FLAG \"-current_version \")\nset(CMAKE_CXX_OSX_COMPATIBILITY_VERSION_FLAG \"${CMAKE_C_OSX_COMPATIBILITY_VERSION_FLAG}\")\nset(CMAKE_CXX_OSX_CURRENT_VERSION_FLAG \"${CMAKE_C_OSX_CURRENT_VERSION_FLAG}\")\n\nif(ARCHS 
MATCHES \"((^|;|, )(arm64|arm64e|x86_64))+\")\n  set(CMAKE_C_SIZEOF_DATA_PTR 8)\n  set(CMAKE_CXX_SIZEOF_DATA_PTR 8)\n  if(ARCHS MATCHES \"((^|;|, )(arm64|arm64e))+\")\n    set(CMAKE_SYSTEM_PROCESSOR \"aarch64\")\n  else()\n    set(CMAKE_SYSTEM_PROCESSOR \"x86_64\")\n  endif()\nelse()\n  set(CMAKE_C_SIZEOF_DATA_PTR 4)\n  set(CMAKE_CXX_SIZEOF_DATA_PTR 4)\n  set(CMAKE_SYSTEM_PROCESSOR \"arm\")\nendif()\n\n# Note that only Xcode 7+ supports the newer more specific:\n# -m${SDK_NAME}-version-min flags, older versions of Xcode use:\n# -m(ios/ios-simulator)-version-min instead.\nif(${CMAKE_VERSION} VERSION_LESS \"3.11\")\n  if(PLATFORM_INT STREQUAL \"OS\" OR PLATFORM_INT STREQUAL \"OS64\")\n    if(XCODE_VERSION_INT VERSION_LESS 7.0)\n      set(SDK_NAME_VERSION_FLAGS\n              \"-mios-version-min=${DEPLOYMENT_TARGET}\")\n    else()\n      # Xcode 7.0+ uses flags we can build directly from SDK_NAME.\n      set(SDK_NAME_VERSION_FLAGS\n              \"-m${SDK_NAME}-version-min=${DEPLOYMENT_TARGET}\")\n    endif()\n  elseif(PLATFORM_INT STREQUAL \"TVOS\")\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mtvos-version-min=${DEPLOYMENT_TARGET}\")\n  elseif(PLATFORM_INT STREQUAL \"SIMULATOR_TVOS\")\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mtvos-simulator-version-min=${DEPLOYMENT_TARGET}\")\n  elseif(PLATFORM_INT STREQUAL \"WATCHOS\")\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mwatchos-version-min=${DEPLOYMENT_TARGET}\")\n  elseif(PLATFORM_INT STREQUAL \"SIMULATOR_WATCHOS\")\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mwatchos-simulator-version-min=${DEPLOYMENT_TARGET}\")\n  elseif(PLATFORM_INT STREQUAL \"MAC\")\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mmacosx-version-min=${DEPLOYMENT_TARGET}\")\n  else()\n    # SIMULATOR or SIMULATOR64 both use -mios-simulator-version-min.\n    set(SDK_NAME_VERSION_FLAGS\n            \"-mios-simulator-version-min=${DEPLOYMENT_TARGET}\")\n  endif()\nelseif(NOT PLATFORM_INT STREQUAL \"MAC_CATALYST\")\n  # Newer versions of 
CMake sets the version min flags correctly, skip this for Mac Catalyst targets\n  set(CMAKE_OSX_DEPLOYMENT_TARGET ${DEPLOYMENT_TARGET})\nendif()\n\nif(DEFINED APPLE_TARGET_TRIPLE_INT)\n  set(APPLE_TARGET_TRIPLE ${APPLE_TARGET_TRIPLE_INT} CACHE INTERNAL \"\")\nendif()\n\nif(PLATFORM_INT STREQUAL \"MAC_CATALYST\")\n  set(C_TARGET_FLAGS \"-target ${APPLE_TARGET_TRIPLE_INT} -isystem ${CMAKE_OSX_SYSROOT_INT}/System/iOSSupport/usr/include\")\nendif()\n\nif(ENABLE_BITCODE_INT)\n  set(BITCODE \"-fembed-bitcode\")\n  set(CMAKE_XCODE_ATTRIBUTE_BITCODE_GENERATION_MODE \"bitcode\")\n  set(CMAKE_XCODE_ATTRIBUTE_ENABLE_BITCODE \"YES\")\nelse()\n  set(BITCODE \"\")\n  set(CMAKE_XCODE_ATTRIBUTE_ENABLE_BITCODE \"NO\")\nendif()\n\nif(ENABLE_ARC_INT)\n  set(FOBJC_ARC \"-fobjc-arc\")\n  set(CMAKE_XCODE_ATTRIBUTE_CLANG_ENABLE_OBJC_ARC \"YES\")\nelse()\n  set(FOBJC_ARC \"-fno-objc-arc\")\n  set(CMAKE_XCODE_ATTRIBUTE_CLANG_ENABLE_OBJC_ARC \"NO\")\nendif()\n\nif(NOT ENABLE_VISIBILITY_INT)\n  foreach(lang ${languages})\n    set(CMAKE_${lang}_VISIBILITY_PRESET \"hidden\" CACHE INTERNAL \"\")\n  endforeach()\n  set(CMAKE_XCODE_ATTRIBUTE_GCC_SYMBOLS_PRIVATE_EXTERN \"YES\")\n  set(VISIBILITY \"-fvisibility=hidden -fvisibility-inlines-hidden\")\nelse()\n  foreach(lang ${languages})\n    set(CMAKE_${lang}_VISIBILITY_PRESET \"default\" CACHE INTERNAL \"\")\n  endforeach()\n  set(CMAKE_XCODE_ATTRIBUTE_GCC_SYMBOLS_PRIVATE_EXTERN \"NO\")\n  set(VISIBILITY \"-fvisibility=default\")\nendif()\n\n#Check if Xcode generator is used, since that will handle these flags automagically\nif(CMAKE_GENERATOR MATCHES \"Xcode\")\n  message(STATUS \"Not setting any manual command-line buildflags, since Xcode is selected as generator.\")\nelse()\n  # Hidden visibility is required for C++ on iOS.\n  set(CMAKE_C_FLAGS \"${C_TARGET_FLAGS} ${SDK_NAME_VERSION_FLAGS} ${BITCODE} -fobjc-abi-version=2 ${FOBJC_ARC} ${CMAKE_C_FLAGS}\")\n  set(CMAKE_CXX_FLAGS \"${C_TARGET_FLAGS} ${SDK_NAME_VERSION_FLAGS} ${BITCODE} ${VISIBILITY} 
-fobjc-abi-version=2 ${FOBJC_ARC} ${CMAKE_CXX_FLAGS}\")\n  set(CMAKE_CXX_FLAGS_DEBUG \"${CMAKE_CXX_FLAGS} -O0 -g ${CMAKE_CXX_FLAGS_DEBUG}\")\n  set(CMAKE_CXX_FLAGS_MINSIZEREL \"${CMAKE_CXX_FLAGS} -DNDEBUG -Os -ffast-math ${CMAKE_CXX_FLAGS_MINSIZEREL}\")\n  set(CMAKE_CXX_FLAGS_RELWITHDEBINFO \"${CMAKE_CXX_FLAGS} -DNDEBUG -O2 -g -ffast-math ${CMAKE_CXX_FLAGS_RELWITHDEBINFO}\")\n  set(CMAKE_CXX_FLAGS_RELEASE \"${CMAKE_CXX_FLAGS} -DNDEBUG -O3 -ffast-math ${CMAKE_CXX_FLAGS_RELEASE}\")\n  set(CMAKE_C_LINK_FLAGS \"${C_TARGET_FLAGS} ${SDK_NAME_VERSION_FLAGS} -Wl,-search_paths_first ${CMAKE_C_LINK_FLAGS}\")\n  set(CMAKE_CXX_LINK_FLAGS \"${C_TARGET_FLAGS} ${SDK_NAME_VERSION_FLAGS}  -Wl,-search_paths_first ${CMAKE_CXX_LINK_FLAGS}\")\n  set(CMAKE_ASM_FLAGS \"${CMAKE_C_FLAGS} -x assembler-with-cpp -arch ${CMAKE_OSX_ARCHITECTURES}\")\nendif()\n\n## Print status messages to inform of the current state\nmessage(STATUS \"Configuring ${SDK_NAME} build for platform: ${PLATFORM_INT}, architecture(s): ${ARCHS}\")\nmessage(STATUS \"Using SDK: ${CMAKE_OSX_SYSROOT_INT}\")\nmessage(STATUS \"Using C compiler: ${CMAKE_C_COMPILER}\")\nmessage(STATUS \"Using CXX compiler: ${CMAKE_CXX_COMPILER}\")\nmessage(STATUS \"Using libtool: ${BUILD_LIBTOOL}\")\nmessage(STATUS \"Using install name tool: ${CMAKE_INSTALL_NAME_TOOL}\")\nif(DEFINED APPLE_TARGET_TRIPLE)\n  message(STATUS \"Autoconf target triple: ${APPLE_TARGET_TRIPLE}\")\nendif()\nmessage(STATUS \"Using minimum deployment version: ${DEPLOYMENT_TARGET}\"\n        \" (SDK version: ${SDK_VERSION})\")\nif(MODERN_CMAKE)\n  message(STATUS \"Merging integrated CMake 3.14+ iOS,tvOS,watchOS,macOS toolchain(s) with this toolchain!\")\nendif()\nif(CMAKE_GENERATOR MATCHES \"Xcode\")\n  message(STATUS \"Using Xcode version: ${XCODE_VERSION_INT}\")\nendif()\nmessage(STATUS \"CMake version: ${CMAKE_VERSION}\")\nif(DEFINED SDK_NAME_VERSION_FLAGS)\n  message(STATUS \"Using version flags: ${SDK_NAME_VERSION_FLAGS}\")\nendif()\nmessage(STATUS \"Using a data_ptr 
size of: ${CMAKE_CXX_SIZEOF_DATA_PTR}\")\nif(ENABLE_BITCODE_INT)\n  message(STATUS \"Bitcode: Enabled\")\nelse()\n  message(STATUS \"Bitcode: Disabled\")\nendif()\n\nif(ENABLE_ARC_INT)\n  message(STATUS \"ARC: Enabled\")\nelse()\n  message(STATUS \"ARC: Disabled\")\nendif()\n\nif(ENABLE_VISIBILITY_INT)\n  message(STATUS \"Hiding symbols: Disabled\")\nelse()\n  message(STATUS \"Hiding symbols: Enabled\")\nendif()\n\n# Set global properties\nset_property(GLOBAL PROPERTY PLATFORM \"${PLATFORM}\")\nset_property(GLOBAL PROPERTY APPLE_TARGET_TRIPLE \"${APPLE_TARGET_TRIPLE_INT}\")\nset_property(GLOBAL PROPERTY SDK_VERSION \"${SDK_VERSION}\")\nset_property(GLOBAL PROPERTY XCODE_VERSION \"${XCODE_VERSION_INT}\")\nset_property(GLOBAL PROPERTY OSX_ARCHITECTURES \"${CMAKE_OSX_ARCHITECTURES}\")\n\n# Export configurable variables for the try_compile() command.\nset(CMAKE_TRY_COMPILE_PLATFORM_VARIABLES\n        PLATFORM\n        XCODE_VERSION_INT\n        SDK_VERSION\n        DEPLOYMENT_TARGET\n        CMAKE_DEVELOPER_ROOT\n        CMAKE_OSX_SYSROOT_INT\n        ENABLE_BITCODE\n        ENABLE_ARC\n        CMAKE_C_COMPILER\n        CMAKE_CXX_COMPILER\n        BUILD_LIBTOOL\n        CMAKE_INSTALL_NAME_TOOL\n        CMAKE_C_FLAGS\n        CMAKE_CXX_FLAGS\n        CMAKE_CXX_FLAGS_DEBUG\n        CMAKE_CXX_FLAGS_MINSIZEREL\n        CMAKE_CXX_FLAGS_RELWITHDEBINFO\n        CMAKE_CXX_FLAGS_RELEASE\n        CMAKE_C_LINK_FLAGS\n        CMAKE_CXX_LINK_FLAGS\n        CMAKE_ASM_FLAGS\n        )\n\nset(CMAKE_PLATFORM_HAS_INSTALLNAME 1)\nset(CMAKE_SHARED_LINKER_FLAGS \"-rpath @executable_path/Frameworks -rpath @loader_path/Frameworks\")\nset(CMAKE_SHARED_LIBRARY_CREATE_C_FLAGS \"-dynamiclib -Wl,-headerpad_max_install_names\")\nset(CMAKE_SHARED_MODULE_CREATE_C_FLAGS \"-bundle -Wl,-headerpad_max_install_names\")\nset(CMAKE_SHARED_MODULE_LOADER_C_FLAG \"-Wl,-bundle_loader,\")\nset(CMAKE_SHARED_MODULE_LOADER_CXX_FLAG \"-Wl,-bundle_loader,\")\nset(CMAKE_FIND_LIBRARY_SUFFIXES \".tbd\" \".dylib\" 
\".so\" \".a\")\nset(CMAKE_SHARED_LIBRARY_SONAME_C_FLAG \"-install_name\")\n\n# Set the find root to the SDK developer roots.\n# Note: CMAKE_FIND_ROOT_PATH is only useful when cross-compiling. Thus, do not set on macOS builds.\nif(NOT PLATFORM_INT STREQUAL \"MAC\" AND NOT PLATFORM_INT STREQUAL \"MAC_ARM64\")\n  list(APPEND CMAKE_FIND_ROOT_PATH \"${CMAKE_OSX_SYSROOT_INT}\" CACHE INTERNAL \"\")\n  set(CMAKE_IGNORE_PATH \"/System/Library/Frameworks;/usr/local/lib\" CACHE INTERNAL \"\")\nendif()\n\n# Default to searching for frameworks first.\nset(CMAKE_FIND_FRAMEWORK FIRST)\n\n# Set up the default search directories for frameworks.\nif(PLATFORM_INT MATCHES \"MAC_CATALYST.*\")\n  set(CMAKE_FRAMEWORK_PATH\n          ${CMAKE_DEVELOPER_ROOT}/Library/PrivateFrameworks\n          ${CMAKE_OSX_SYSROOT_INT}/System/Library/Frameworks\n          ${CMAKE_OSX_SYSROOT_INT}/System/iOSSupport/System/Library/Frameworks\n          ${CMAKE_FRAMEWORK_PATH} CACHE INTERNAL \"\")\nelse()\n  set(CMAKE_FRAMEWORK_PATH\n          ${CMAKE_DEVELOPER_ROOT}/Library/PrivateFrameworks\n          ${CMAKE_OSX_SYSROOT_INT}/System/Library/Frameworks\n          ${CMAKE_FRAMEWORK_PATH} CACHE INTERNAL \"\")\nendif()\n\n# By default, search both the specified iOS SDK and the remainder of the host filesystem.\nif(NOT CMAKE_FIND_ROOT_PATH_MODE_PROGRAM)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM BOTH CACHE INTERNAL \"\")\nendif()\nif(NOT CMAKE_FIND_ROOT_PATH_MODE_LIBRARY)\n  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY BOTH CACHE INTERNAL \"\")\nendif()\nif(NOT CMAKE_FIND_ROOT_PATH_MODE_INCLUDE)\n  set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE BOTH CACHE INTERNAL \"\")\nendif()\nif(NOT CMAKE_FIND_ROOT_PATH_MODE_PACKAGE)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE BOTH CACHE INTERNAL \"\")\nendif()\n\n#\n# Some helper-macros below to simplify and beautify the CMakeFile\n#\n\n# This little macro lets you set any Xcode specific property.\nmacro(set_xcode_property TARGET XCODE_PROPERTY XCODE_VALUE XCODE_RELVERSION)\n  
set(XCODE_RELVERSION_I \"${XCODE_RELVERSION}\")\n  if(XCODE_RELVERSION_I STREQUAL \"All\")\n    set_property(TARGET ${TARGET} PROPERTY XCODE_ATTRIBUTE_${XCODE_PROPERTY} \"${XCODE_VALUE}\")\n  else()\n    set_property(TARGET ${TARGET} PROPERTY XCODE_ATTRIBUTE_${XCODE_PROPERTY}[variant=${XCODE_RELVERSION_I}] \"${XCODE_VALUE}\")\n  endif()\nendmacro(set_xcode_property)\n\n# This macro lets you find executable programs on the host system.\nmacro(find_host_package)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\n  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY NEVER)\n  set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE NEVER)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE NEVER)\n  set(_TOOLCHAIN_IOS ${IOS})\n  set(IOS FALSE)\n  find_package(${ARGN})\n  set(IOS ${_TOOLCHAIN_IOS})\n  set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM BOTH)\n  set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY BOTH)\n  set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE BOTH)\n  set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE BOTH)\nendmacro(find_host_package)\n"
  },
  {
    "path": "toolchains/riscv64-linux-gnu-spacemit.toolchain.cmake",
    "content": "set(CMAKE_SYSTEM_NAME Linux)\nset(CMAKE_SYSTEM_PROCESSOR riscv64)\nset(CMAKE_SYSTEM_VERSION 1)\n\nif(CMAKE_HOST_SYSTEM_PROCESSOR MATCHES \"^(riscv)\")\n    message(STATUS \"HOST SYSTEM ${CMAKE_HOST_SYSTEM_PROCESSOR}\")\nelse()\n    set(GNU_MACHINE riscv64-unknown-linux-gnu CACHE STRING \"GNU compiler triple\")\n    if(DEFINED ENV{RISCV_ROOT_PATH})\n        file(TO_CMAKE_PATH $ENV{RISCV_ROOT_PATH} RISCV_ROOT_PATH)\n    else()\n        message(FATAL_ERROR \"RISCV_ROOT_PATH env must be defined\")\n    endif()\n\n    set(RISCV_ROOT_PATH ${RISCV_ROOT_PATH} CACHE STRING \"root path to riscv toolchain\")\n    set(CMAKE_C_COMPILER ${RISCV_ROOT_PATH}/bin/riscv64-unknown-linux-gnu-gcc)\n    set(CMAKE_CXX_COMPILER ${RISCV_ROOT_PATH}/bin/riscv64-unknown-linux-gnu-g++)\n    set(CMAKE_STRIP ${RISCV_ROOT_PATH}/bin/riscv64-unknown-linux-gnu-strip)\n    set(CMAKE_FIND_ROOT_PATH \"${RISCV_ROOT_PATH}/sysroot\")\n    set(CMAKE_SYSROOT \"${RISCV_ROOT_PATH}/sysroot\")\nendif()\n\nset(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\nset(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\nset(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\nset(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)\nset(CMAKE_C_FLAGS \"-march=rv64gcv_zfh_zvfh_zba_zicbop_zihintpause -mabi=lp64d -ftree-vectorize ${CMAKE_C_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"-march=rv64gcv_zfh_zvfh_zba_zicbop_zihintpause -mabi=lp64d  -ftree-vectorize ${CXX_FLAGS}\")\nset(CMAKE_EXE_LINKER_FLAGS \"${CMAKE_EXE_LINKER_FLAGS} -latomic -lrt -lpthread\")\nset(CMAKE_EXE_LINKER_FLAGS \"${CMAKE_EXE_LINKER_FLAGS} --sysroot=${CMAKE_SYSROOT}\")\n"
  },
  {
    "path": "toolchains/riscv64-linux-gnu.toolchain.cmake",
    "content": "# Copied from https://github.com/Tencent/ncnn/blob/master/toolchains/riscv64-linux-gnu.toolchain.cmake\nset(CMAKE_SYSTEM_NAME Linux)\nset(CMAKE_SYSTEM_PROCESSOR riscv64)\n\nset(CMAKE_C_COMPILER \"riscv64-unknown-linux-gnu-gcc\")\nset(CMAKE_CXX_COMPILER \"riscv64-unknown-linux-gnu-g++\")\n\nset(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)\nset(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)\nset(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)\n\nset(CMAKE_C_FLAGS \"-march=rv64gc\")\nset(CMAKE_CXX_FLAGS \"-march=rv64gc\")\n\n# cache flags\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS}\" CACHE STRING \"c flags\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS}\" CACHE STRING \"c++ flags\")\n"
  },
  {
    "path": "wasm/CMakeLists.txt",
    "content": "if(SHERPA_ONNX_ENABLE_WASM_TTS)\n  add_subdirectory(tts)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_ASR)\n  add_subdirectory(asr)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_KWS)\n  add_subdirectory(kws)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_VAD)\n  add_subdirectory(vad)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_VAD_ASR)\n  add_subdirectory(vad-asr)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_SPEECH_ENHANCEMENT)\n  add_subdirectory(speech-enhancement)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_SPEAKER_DIARIZATION)\n  add_subdirectory(speaker-diarization)\nendif()\n\nif(SHERPA_ONNX_ENABLE_WASM_NODEJS)\n  add_subdirectory(nodejs)\nendif()\n"
  },
  {
    "path": "wasm/asr/.gitignore",
    "content": "*.bak\n"
  },
  {
    "path": "wasm/asr/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-asr.sh to build for wasm ASR\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/encoder.onnx\")\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  MyPrint\n  # online ASR\n  SherpaOnnxCreateOnlineRecognizer\n  SherpaOnnxCreateOnlineStream\n  SherpaOnnxDecodeOnlineStream\n  SherpaOnnxDestroyOnlineRecognizer\n  SherpaOnnxDestroyOnlineRecognizerResult\n  SherpaOnnxDestroyOnlineStream\n  SherpaOnnxDestroyOnlineStreamResultJson\n  SherpaOnnxGetOfflineStreamResultAsJson\n  SherpaOnnxGetOnlineStreamResult\n  SherpaOnnxGetOnlineStreamResultAsJson\n  SherpaOnnxIsOnlineStreamReady\n  SherpaOnnxOnlineStreamAcceptWaveform\n  SherpaOnnxOnlineStreamGetOption\n  SherpaOnnxOnlineStreamInputFinished\n  SherpaOnnxOnlineStreamIsEndpoint\n  SherpaOnnxOnlineStreamReset\n  SherpaOnnxOnlineStreamSetOption\n  SherpaOnnxOfflineStreamGetOption\n  SherpaOnnxOfflineStreamSetOption\n  #\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-asr sherpa-onnx-wasm-main-asr.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-asr sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-asr DESTINATION bin/wasm/asr)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-asr>/sherpa-onnx-wasm-main-asr.js\"\n    \"index.html\"\n    \"sherpa-onnx-asr.js\"\n    \"app-asr.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-asr>/sherpa-onnx-wasm-main-asr.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-asr>/sherpa-onnx-wasm-main-asr.data\"\n  DESTINATION\n    bin/wasm/asr\n)\n"
  },
  {
    "path": "wasm/asr/app-asr.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nconst startBtn = document.getElementById('startBtn');\nconst stopBtn = document.getElementById('stopBtn');\nconst clearBtn = document.getElementById('clearBtn');\nconst soundClips = document.getElementById('sound-clips');\n\nlet textArea = document.getElementById('results');\n\nlet lastResult = '';\nlet resultList = [];\n\nclearBtn.onclick = function() {\n  resultList = [];\n  textArea.value = getDisplayResult();\n  textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n};\n\nfunction getDisplayResult() {\n  let i = 0;\n  let ans = '';\n  for (let s in resultList) {\n    if (resultList[s] == '') {\n      continue;\n    }\n\n    ans += '' + i + ': ' + resultList[s] + '\\n';\n    i += 1;\n  }\n\n  if (lastResult.length > 0) {\n    ans += '' + i + ': ' + lastResult + '\\n';\n  }\n  return ans;\n}\n\nModule = {};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.locateFile = function(path, scriptDirectory = '') {\n  console.log(`path: ${path}, scriptDirectory: ${scriptDirectory}`);\n  return scriptDirectory + path;\n};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.setStatus = function(status) {\n  console.log(`status ${status}`);\n  const statusElement = document.getElementById('status');\n  if (status == 'Running...') {\n    status = 'Model downloaded. Initializing recognizer...'\n  }\n\n  const downloadMatch = status.match(/Downloading data... \\((\\d+)\\/(\\d+)\\)/);\n  if (downloadMatch) {\n    const downloaded = BigInt(downloadMatch[1]);\n    const total = BigInt(downloadMatch[2]);\n    const percent =\n        total === 0 ? 
0.00 : Number((downloaded * 10000n) / total) / 100;\n    const downloadedMB = Number(downloaded) / (1024 * 1024);\n    const totalMB = Number(total) / (1024 * 1024);\n    status = `Downloading data... ${percent.toFixed(2)}% (${downloadedMB.toFixed(2)} MB/${\n        totalMB.toFixed(2)} MB)`;\n    console.log(`here ${status}`)\n  }\n\n  statusElement.textContent = status;\n  if (status === '') {\n    statusElement.style.display = 'none';\n    // statusElement.parentNode.removeChild(statusElement);\n\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.remove('loading');\n    });\n  } else {\n    statusElement.style.display = 'block';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.add('loading');\n    });\n  }\n};\n\nModule.onRuntimeInitialized = function() {\n  console.log('inited!');\n\n  startBtn.disabled = false;\n\n  recognizer = createOnlineRecognizer(Module);\n  console.log('recognizer is created!', recognizer);\n};\n\nlet audioCtx;\nlet mediaStream;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nlet recognizer = null;\nlet recognizer_stream = null;\n\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext({sampleRate: 16000});\n    }\n    console.log(audioCtx);\n    recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n   
 console.log('media stream', mediaStream);\n\n    // https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 4096;\n    var numberOfInputChannels = 1;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log('recorder', recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0))\n      samples = downsampleBuffer(samples, expectedSampleRate);\n\n      if (recognizer_stream == null) {\n        recognizer_stream = recognizer.createStream();\n      }\n\n      recognizer_stream.acceptWaveform(expectedSampleRate, samples);\n      while (recognizer.isReady(recognizer_stream)) {\n        recognizer.decode(recognizer_stream);\n      }\n\n      let isEndpoint = recognizer.isEndpoint(recognizer_stream);\n\n      let result = recognizer.getResult(recognizer_stream).text;\n\n      if (recognizer.config.modelConfig.paraformer.encoder != '') {\n        let tailPaddings = new Float32Array(expectedSampleRate);\n        recognizer_stream.acceptWaveform(expectedSampleRate, tailPaddings);\n        while (recognizer.isReady(recognizer_stream)) {\n          recognizer.decode(recognizer_stream);\n        }\n        result = recognizer.getResult(recognizer_stream).text;\n      }\n\n      if (result.length > 0 && lastResult != result) {\n        lastResult = result;\n      }\n\n      if (isEndpoint) {\n        if (lastResult.length > 0) {\n          resultList.push(lastResult);\n          lastResult = '';\n        }\n        recognizer.reset(recognizer_stream);\n      }\n\n      textArea.value = 
getDisplayResult();\n      textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n\n      let buf = new Int16Array(samples.length);\n      for (var i = 0; i < samples.length; ++i) {\n        let s = samples[i];\n        if (s >= 1)\n          s = 1;\n        else if (s <= -1)\n          s = -1;\n\n        samples[i] = s;\n        buf[i] = s * 32767;\n      }\n\n      leftchannel.push(buf);\n      recordingLength += bufferSize;\n    };\n\n    startBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n\n      stopBtn.disabled = false;\n      startBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      console.log('recorder stopped');\n\n      // stopBtn recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n\n      startBtn.style.background = '';\n      startBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      startBtn.disabled = false;\n\n      var clipName = new Date().toISOString();\n\n      const clipContainer = document.createElement('article');\n      const clipLabel = document.createElement('p');\n      const audio = document.createElement('audio');\n      const deleteButton = document.createElement('button');\n      clipContainer.classList.add('clip');\n      audio.setAttribute('controls', '');\n      deleteButton.textContent = 'Delete';\n      deleteButton.className = 'delete';\n\n      clipLabel.textContent = clipName;\n\n      clipContainer.appendChild(audio);\n\n      clipContainer.appendChild(clipLabel);\n      clipContainer.appendChild(deleteButton);\n      soundClips.appendChild(clipContainer);\n\n      audio.controls = true;\n      let samples = flatten(leftchannel);\n      const blob = toWav(samples);\n\n      leftchannel = [];\n      const audioURL = window.URL.createObjectURL(blob);\n      audio.src = audioURL;\n      
console.log('recorder stopped');\n\n      deleteButton.onclick = function(e) {\n        let evtTgt = e.target;\n        evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n      };\n\n      clipLabel.onclick = function() {\n        const existingName = clipLabel.textContent;\n        const newClipName = prompt('Enter a new name for your sound clip?');\n        if (newClipName === null) {\n          clipLabel.textContent = existingName;\n        } else {\n          clipLabel.textContent = newClipName;\n        }\n      };\n    };\n  };\n\n  let onError = function(err) {\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, 
true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = Math.round(buffer.length / sampleRateRatio);\n  var result = new Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "wasm/asr/assets/.gitignore",
    "content": ""
  },
  {
    "path": "wasm/asr/assets/README.md",
    "content": "# Introduction\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models\nor\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nto download a model.\n\n# Streaming ASR\n\n## Transducer\n```bash\ncd sherpa-onnx/wasm/asr/assets\n\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\ntar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\nrm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2\n\n# Note it is not an error that we rename encoder.int8.onnx to encoder.onnx\n\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.int8.onnx encoder.onnx\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx decoder.onnx\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.int8.onnx joiner.onnx\nmv sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt ./\nrm -rf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/\n\ncd ../../..\n\n./build-wasm-simd-asr.sh\n```\n\nYou should have the following files in `assets` before you can run\n`build-wasm-simd-asr.sh`\n\n```\nassets fangjun$ tree -L 1\n.\n├── README.md\n├── decoder.onnx\n├── encoder.onnx\n├── joiner.onnx\n└── tokens.txt\n\n0 directories, 5 files\n```\n\n## Paraformer\n\n```\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\ntar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\nrm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2\n\nmv sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx encoder.onnx\nmv sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx decoder.onnx\nmv sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt ./\n\nrm -rf 
sherpa-onnx-streaming-paraformer-bilingual-zh-en\n\ncd ../\n\nsed -i.bak s/\"type = 0\"/\"type = 1\"/g ./sherpa-onnx-asr.js\nsed -i.bak s/Zipformer/Paraformer/g ./index.html\n\ncd ../..\n\n./build-wasm-simd-asr.sh\n```\n\nYou should have the following files in `assets` before you can run\n`build-wasm-simd-asr.sh`\n\n```\nassets fangjun$ tree -L 1\n.\n├── README.md\n├── decoder.onnx\n├── encoder.onnx\n└── tokens.txt\n\n0 directories, 4 files\n```\n\nYou can find example build scripts at:\n\n  - Streaming Zipformer (English + Chinese): https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-zh-en-asr-zipformer.yaml\n  - Streaming Zipformer (English): https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-en-asr-zipformer.yaml\n  - Streaming Paraformer (English + Chinese): https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-zh-en-asr-paraformer.yaml\n  - Streaming Paraformer (English + Chinese + Cantonese): https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-zh-cantonese-en-asr-paraformer.yaml\n"
  },
  {
    "path": "wasm/asr/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for ASR</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n    .loading {\n      display: none !important;\n    }\n  </style>\n</head>\n\n<body style=\"font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;\">\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    ASR Demo with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a><br/>\n    (with Zipformer)\n  </h1>\n\n  <div style=\"width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;\">\n    <div id=\"status\">Loading...</div>\n\n    <div id=\"singleAudioContent\" class=\"tab-content loading\">\n      <div style=\"display: flex; gap: 1.5rem;\">\n        <div style=\"flex: 1; display: flex; flex-direction: row; align-items: center; gap: 1rem;\">\n          <button id=\"startBtn\" disabled>Start</button>\n          <button id=\"stopBtn\" disabled>Stop</button>\n          <button id=\"clearBtn\">Clear</button>\n        </div>\n      </div>\n\n      <div style=\"flex: 1; display: flex; flex-direction: column; gap: 1rem;\">\n          <div style=\"font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; color: #6c757d;\">Transcript</div>\n          <textarea id=\"results\" rows=\"10\" placeholder=\"Output will appear here...\" readonly style=\"flex: 1; padding: 0.75rem; font-size: 1rem; border: 1px solid #ced4da; border-radius: 8px; resize: none; background-color: #f8f9fa;\"></textarea>\n      </div>\n    </div>\n\n    <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n    </section>\n\n  </div>\n\n  <!-- Footer 
Section -->\n  <div style=\"width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;\">\n    <h3>Description</h3>\n    <ul>\n      <li>Everything is <strong>open-sourced.</strong> <a href=\"https://github.com/k2-fsa/sherpa-onnx\">code</a></li>\n      <li>If you have any issues, please either <a href=\"https://github.com/k2-fsa/sherpa-onnx/issues\">file a ticket</a> or contact us via</li>\n        <ul>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#wechat\">WeChat group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#qq\">QQ group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b\">Bilibili</a></li>\n        </ul>\n    </ul>\n    <h3>About This Demo</h3>\n    <ul>\n      <li><strong>Private and Secure:</strong> All processing is done locally on your device (CPU) within your browser with a single thread. No server is involved, ensuring privacy and security. You can disconnect from the Internet once this page is loaded.</li>\n      <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving system resources available for webLLM analysis.</li>\n    </ul>\n    <h3>Latest Update</h3>\n    <ul>\n      <li>Update UI.</li>\n      <li>First working version.</li>\n    </ul>\n\n    <h3>Acknowledgement</h3>\n    <ul>\n      <li>We refer to <a href=\"https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a> for the UI part.</li>\n    </ul>\n  </div>\n\n  <script src=\"sherpa-onnx-asr.js\"></script>\n  <script src=\"app-asr.js\"></script>\n  <script src=\"sherpa-onnx-wasm-main-asr.js\"></script>\n</body>\n"
  },
  {
    "path": "wasm/asr/sherpa-onnx-asr.js",
    "content": "function freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('config' in config) {\n    freeConfig(config.config, Module)\n  }\n\n  if ('transducer' in config) {\n    freeConfig(config.transducer, Module)\n  }\n\n  if ('paraformer' in config) {\n    freeConfig(config.paraformer, Module)\n  }\n\n  if ('zipformer2Ctc' in config) {\n    freeConfig(config.zipformer2Ctc, Module)\n  }\n\n  if ('feat' in config) {\n    freeConfig(config.feat, Module)\n  }\n\n  if ('model' in config) {\n    freeConfig(config.model, Module)\n  }\n\n  if ('nemoCtc' in config) {\n    freeConfig(config.nemoCtc, Module)\n  }\n\n  if ('toneCtc' in config) {\n    freeConfig(config.toneCtc, Module)\n  }\n\n  if ('whisper' in config) {\n    freeConfig(config.whisper, Module)\n  }\n\n  if ('fireRedAsr' in config) {\n    freeConfig(config.fireRedAsr, Module)\n  }\n\n  if ('dolphin' in config) {\n    freeConfig(config.dolphin, Module)\n  }\n\n  if ('zipformerCtc' in config) {\n    freeConfig(config.zipformerCtc, Module)\n  }\n\n  if ('wenetCtc' in config) {\n    freeConfig(config.wenetCtc, Module)\n  }\n\n  if ('omnilingual' in config) {\n    freeConfig(config.omnilingual, Module)\n  }\n\n  if ('medasr' in config) {\n    freeConfig(config.medasr, Module)\n  }\n\n  if ('fireRedAsrCtc' in config) {\n    freeConfig(config.fireRedAsrCtc, Module)\n  }\n\n  if ('funasrNano' in config) {\n    freeConfig(config.funasrNano, Module)\n  }\n\n  if ('moonshine' in config) {\n    freeConfig(config.moonshine, Module)\n  }\n\n  if ('tdnn' in config) {\n    freeConfig(config.tdnn, Module)\n  }\n\n  if ('senseVoice' in config) {\n    freeConfig(config.senseVoice, Module)\n  }\n\n  if ('canary' in config) {\n    freeConfig(config.canary, Module)\n  }\n\n  if ('lm' in config) {\n    freeConfig(config.lm, Module)\n  }\n\n  if ('ctcFstDecoder' in config) {\n    freeConfig(config.ctcFstDecoder, Module)\n  }\n\n  if ('hr' in config) {\n    
freeConfig(config.hr, Module)\n  }\n\n  Module._free(config.ptr);\n}\n\n// The user should free the returned pointers\nfunction initSherpaOnnxOnlineTransducerModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const joinerLen = Module.lengthBytesUTF8(config.joiner || '') + 1;\n\n  const n = encoderLen + decoderLen + joinerLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 3 * 4;  // 3 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.joiner || '', buffer + offset, joinerLen);\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOnlineParaformerModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n\n  const n = encoderLen + decoderLen;\n  const buffer = Module._malloc(n);\n\n  const len = 2 * 4;  // 2 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction 
initSherpaOnnxOnlineZipformer2CtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOnlineNemoCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOnlineToneCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOnlineModelConfig(config, Module) {\n  if (!('transducer' in config)) {\n    config.transducer = {\n      encoder: '',\n      decoder: '',\n      joiner: '',\n    };\n  }\n\n  if (!('paraformer' in config)) {\n    config.paraformer = {\n      encoder: '',\n      decoder: '',\n    };\n  }\n\n  if (!('zipformer2Ctc' in config)) {\n    config.zipformer2Ctc = {\n      model: '',\n    };\n  }\n\n  if (!('nemoCtc' in config)) {\n    config.nemoCtc = {\n      model: '',\n    };\n  }\n\n  if (!('toneCtc' in config)) {\n    config.toneCtc = {\n      model: '',\n    };\n  }\n\n  if (!('tokensBuf' in config)) {\n    config.tokensBuf = '';\n  }\n\n  if (!('tokensBufSize' in config)) {\n    config.tokensBufSize = 0;\n  }\n\n  const 
transducer =\n      initSherpaOnnxOnlineTransducerModelConfig(config.transducer, Module);\n\n  const paraformer =\n      initSherpaOnnxOnlineParaformerModelConfig(config.paraformer, Module);\n\n  const zipformer2Ctc = initSherpaOnnxOnlineZipformer2CtcModelConfig(\n      config.zipformer2Ctc, Module);\n\n  const nemoCtc =\n      initSherpaOnnxOnlineNemoCtcModelConfig(config.nemoCtc, Module);\n\n  const toneCtc =\n      initSherpaOnnxOnlineToneCtcModelConfig(config.toneCtc, Module);\n\n  const len = transducer.len + paraformer.len + zipformer2Ctc.len + 9 * 4 +\n      nemoCtc.len + toneCtc.len;\n\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(transducer.ptr, transducer.len, ptr + offset);\n  offset += transducer.len;\n\n  Module._CopyHeap(paraformer.ptr, paraformer.len, ptr + offset);\n  offset += paraformer.len;\n\n  Module._CopyHeap(zipformer2Ctc.ptr, zipformer2Ctc.len, ptr + offset);\n  offset += zipformer2Ctc.len;\n\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const modelTypeLen = Module.lengthBytesUTF8(config.modelType || '') + 1;\n  const modelingUnitLen = Module.lengthBytesUTF8(config.modelingUnit || '') + 1;\n  const bpeVocabLen = Module.lengthBytesUTF8(config.bpeVocab || '') + 1;\n  const tokensBufLen = Module.lengthBytesUTF8(config.tokensBuf || '') + 1;\n\n  const bufferLen = tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n      bpeVocabLen + tokensBufLen;\n  const buffer = Module._malloc(bufferLen);\n\n  offset = 0;\n  Module.stringToUTF8(config.tokens || '', buffer, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.provider || 'cpu', buffer + offset, providerLen);\n  offset += providerLen;\n\n  Module.stringToUTF8(config.modelType || '', buffer + offset, modelTypeLen);\n  offset += modelTypeLen;\n\n  Module.stringToUTF8(\n      config.modelingUnit || '', buffer + offset, modelingUnitLen);\n  
offset += modelingUnitLen;\n\n  Module.stringToUTF8(config.bpeVocab || '', buffer + offset, bpeVocabLen);\n  offset += bpeVocabLen;\n\n  Module.stringToUTF8(config.tokensBuf || '', buffer + offset, tokensBufLen);\n  offset += tokensBufLen;\n\n  offset = transducer.len + paraformer.len + zipformer2Ctc.len;\n  Module.setValue(ptr + offset, buffer, 'i8*');  // tokens\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + tokensLen, 'i8*');  // provider\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug ?? 1, 'i32');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen, 'i8*');  // modelType\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen + modelTypeLen,\n      'i8*');  // modelingUnit\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen,\n      'i8*');  // bpeVocab\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n          bpeVocabLen,\n      'i8*');  // tokens_buf\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.tokensBufSize || 0, 'i32');\n  offset += 4;\n\n  Module._CopyHeap(nemoCtc.ptr, nemoCtc.len, ptr + offset);\n  offset += nemoCtc.len;\n\n  Module._CopyHeap(toneCtc.ptr, toneCtc.len, ptr + offset);\n  offset += toneCtc.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    transducer: transducer,\n    paraformer: paraformer,\n    zipformer2Ctc: zipformer2Ctc,\n    nemoCtc: nemoCtc,\n    toneCtc: toneCtc,\n  };\n}\n\nfunction initSherpaOnnxFeatureConfig(config, Module) {\n  const len = 2 * 4;  // 2 pointers\n  const ptr = Module._malloc(len);\n\n  Module.setValue(ptr, config.sampleRate || 16000, 'i32');\n  Module.setValue(ptr + 4, config.featureDim || 80, 'i32');\n  return {ptr: ptr, len: 
len};\n}\n\nfunction initSherpaOnnxHomophoneReplacerConfig(config, Module) {\n  const len = 3 * 4;\n  const ptr = Module._malloc(len);\n\n  const dictDir = '';\n\n  const dictDirLen = Module.lengthBytesUTF8(dictDir) + 1;\n  const lexiconLen = Module.lengthBytesUTF8(config.lexicon || '') + 1;\n  const ruleFstsLen = Module.lengthBytesUTF8(config.ruleFsts || '') + 1;\n\n  const bufferLen = dictDirLen + lexiconLen + ruleFstsLen;\n\n  const buffer = Module._malloc(bufferLen);\n  let offset = 0;\n  Module.stringToUTF8(dictDir, buffer + offset, dictDirLen);\n  offset += dictDirLen;\n\n  Module.stringToUTF8(config.lexicon || '', buffer + offset, lexiconLen);\n  offset += lexiconLen;\n\n  Module.stringToUTF8(config.ruleFsts || '', buffer + offset, ruleFstsLen);\n  offset += ruleFstsLen;\n\n  Module.setValue(ptr, buffer, 'i8*');\n  Module.setValue(ptr + 4, buffer + dictDirLen, 'i8*');\n  Module.setValue(ptr + 8, buffer + dictDirLen + lexiconLen, 'i8*');\n\n  return {ptr: ptr, len: len, buffer: buffer};\n}\n\nfunction initSherpaOnnxOnlineCtcFstDecoderConfig(config, Module) {\n  const len = 2 * 4;\n  const ptr = Module._malloc(len);\n\n  const graphLen = Module.lengthBytesUTF8(config.graph || '') + 1;\n  const buffer = Module._malloc(graphLen);\n  Module.stringToUTF8(config.graph || '', buffer, graphLen);\n\n  Module.setValue(ptr, buffer, 'i8*');\n  Module.setValue(ptr + 4, config.maxActive || 3000, 'i32');\n  return {ptr: ptr, len: len, buffer: buffer};\n}\n\nfunction initSherpaOnnxOnlineRecognizerConfig(config, Module) {\n  if (!('featConfig' in config)) {\n    config.featConfig = {\n      sampleRate: 16000,\n      featureDim: 80,\n    };\n  }\n\n  if (!('ctcFstDecoderConfig' in config)) {\n    config.ctcFstDecoderConfig = {\n      graph: '',\n      maxActive: 3000,\n    };\n  }\n\n  if (!('hotwordsBuf' in config)) {\n    config.hotwordsBuf = '';\n  }\n\n  if (!('hotwordsBufSize' in config)) {\n    config.hotwordsBufSize = 0;\n  }\n\n  if (!('hr' in config)) {\n    config.hr = {\n  
    lexicon: '',\n      ruleFsts: '',\n    };\n  }\n\n  const feat = initSherpaOnnxFeatureConfig(config.featConfig, Module);\n  const model = initSherpaOnnxOnlineModelConfig(config.modelConfig, Module);\n  const ctcFstDecoder = initSherpaOnnxOnlineCtcFstDecoderConfig(\n      config.ctcFstDecoderConfig, Module)\n  const hr = initSherpaOnnxHomophoneReplacerConfig(config.hr, Module);\n\n  const len = feat.len + model.len + 8 * 4 + ctcFstDecoder.len + 5 * 4 + hr.len;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(feat.ptr, feat.len, ptr + offset);\n  offset += feat.len;\n\n  Module._CopyHeap(model.ptr, model.len, ptr + offset);\n  offset += model.len;\n\n  const decodingMethodLen =\n      Module.lengthBytesUTF8(config.decodingMethod || 'greedy_search') + 1;\n  const hotwordsFileLen = Module.lengthBytesUTF8(config.hotwordsFile || '') + 1;\n  const ruleFstsFileLen = Module.lengthBytesUTF8(config.ruleFsts || '') + 1;\n  const ruleFarsFileLen = Module.lengthBytesUTF8(config.ruleFars || '') + 1;\n  const hotwordsBufLen = Module.lengthBytesUTF8(config.hotwordsBuf || '') + 1;\n  const bufferLen = decodingMethodLen + hotwordsFileLen + ruleFstsFileLen +\n      ruleFarsFileLen + hotwordsBufLen;\n  const buffer = Module._malloc(bufferLen);\n\n  offset = 0;\n  Module.stringToUTF8(\n      config.decodingMethod || 'greedy_search', buffer, decodingMethodLen);\n  offset += decodingMethodLen;\n\n  Module.stringToUTF8(\n      config.hotwordsFile || '', buffer + offset, hotwordsFileLen);\n  offset += hotwordsFileLen;\n\n  Module.stringToUTF8(config.ruleFsts || '', buffer + offset, ruleFstsFileLen);\n  offset += ruleFstsFileLen;\n\n  Module.stringToUTF8(config.ruleFars || '', buffer + offset, ruleFarsFileLen);\n  offset += ruleFarsFileLen;\n\n  Module.stringToUTF8(\n      config.hotwordsBuf || '', buffer + offset, hotwordsBufLen);\n  offset += hotwordsBufLen;\n\n  offset = feat.len + model.len;\n  Module.setValue(ptr + offset, buffer, 'i8*');  // decoding 
method\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.maxActivePaths || 4, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.enableEndpoint || 0, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.rule1MinTrailingSilence || 2.4, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.rule2MinTrailingSilence || 1.2, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.rule3MinUtteranceLength || 20, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + decodingMethodLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.hotwordsScore || 1.5, 'float');\n  offset += 4;\n\n  Module._CopyHeap(ctcFstDecoder.ptr, ctcFstDecoder.len, ptr + offset);\n  offset += ctcFstDecoder.len;\n\n  Module.setValue(\n      ptr + offset, buffer + decodingMethodLen + hotwordsFileLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + decodingMethodLen + hotwordsFileLen + ruleFstsFileLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.blankPenalty || 0, 'float');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + decodingMethodLen + hotwordsFileLen + ruleFstsFileLen +\n          ruleFarsFileLen,\n      'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.hotwordsBufSize || 0, 'i32');\n  offset += 4;\n\n  Module._CopyHeap(hr.ptr, hr.len, ptr + offset);\n  offset += hr.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    feat: feat,\n    model: model,\n    ctcFstDecoder: ctcFstDecoder,\n    hr: hr,\n  };\n}\n\nfunction createOnlineRecognizer(Module, myConfig) {\n  const onlineTransducerModelConfig = {\n    encoder: '',\n    decoder: '',\n    joiner: '',\n  };\n\n  const onlineParaformerModelConfig = {\n    encoder: '',\n    decoder: '',\n  };\n\n  const onlineZipformer2CtcModelConfig = {\n    model: '',\n  };\n\n  const onlineNemoCtcModelConfig = {\n    model: '',\n  };\n\n  const 
onlineToneCtcModelConfig = {\n    model: '',\n  };\n\n  let type = 0;\n\n  switch (type) {\n    case 0:\n      // transducer\n      onlineTransducerModelConfig.encoder = './encoder.onnx';\n      onlineTransducerModelConfig.decoder = './decoder.onnx';\n      onlineTransducerModelConfig.joiner = './joiner.onnx';\n      break;\n    case 1:\n      // paraformer\n      onlineParaformerModelConfig.encoder = './encoder.onnx';\n      onlineParaformerModelConfig.decoder = './decoder.onnx';\n      break;\n    case 2:\n      // zipformer2Ctc\n      onlineZipformer2CtcModelConfig.model = './encoder.onnx';\n      break;\n    case 3:\n      // nemoCtc\n      onlineNemoCtcModelConfig.model = './nemo-ctc.onnx';\n      break;\n    case 4:\n      // toneCtc\n      onlineToneCtcModelConfig.model = './tone-ctc.onnx';\n      break;\n  }\n\n\n  const onlineModelConfig = {\n    transducer: onlineTransducerModelConfig,\n    paraformer: onlineParaformerModelConfig,\n    zipformer2Ctc: onlineZipformer2CtcModelConfig,\n    nemoCtc: onlineNemoCtcModelConfig,\n    toneCtc: onlineToneCtcModelConfig,\n    tokens: './tokens.txt',\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n    modelType: '',\n    modelingUnit: 'cjkchar',\n    bpeVocab: '',\n  };\n\n  const featureConfig = {\n    sampleRate: 16000,  // it is ignored when toneCtc is used\n    featureDim: 80,     // it is ignored when toneCtc is used\n  };\n\n  let recognizerConfig = {\n    featConfig: featureConfig,\n    modelConfig: onlineModelConfig,\n    decodingMethod: 'greedy_search',\n    maxActivePaths: 4,\n    enableEndpoint: 1,\n    rule1MinTrailingSilence: 2.4,\n    rule2MinTrailingSilence: 1.2,\n    rule3MinUtteranceLength: 20,\n    hotwordsFile: '',\n    hotwordsScore: 1.5,\n    ctcFstDecoderConfig: {\n      graph: '',\n      maxActive: 3000,\n    },\n    ruleFsts: '',\n    ruleFars: '',\n  };\n  if (myConfig) {\n    recognizerConfig = myConfig;\n  }\n\n  return new OnlineRecognizer(recognizerConfig, 
Module);\n}\n\nfunction initSherpaOnnxOfflineTransducerModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const joinerLen = Module.lengthBytesUTF8(config.joiner || '') + 1;\n\n  const n = encoderLen + decoderLen + joinerLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 3 * 4;  // 3 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.joiner || '', buffer + offset, joinerLen);\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineParaformerModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineNemoEncDecCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineDolphinModelConfig(config, Module) {\n  const n = 
Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineZipformerCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineWenetCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineOmnilingualAsrCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineMedAsrCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n   
 len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineFireRedAsrCtcModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineFunAsrNanoModelConfig(config, Module) {\n  const encoderAdaptorLen =\n      Module.lengthBytesUTF8(config.encoderAdaptor || '') + 1;\n  const llmLen = Module.lengthBytesUTF8(config.llm || '') + 1;\n  const embeddingLen = Module.lengthBytesUTF8(config.embedding || '') + 1;\n  const tokenizerLen = Module.lengthBytesUTF8(config.tokenizer || '') + 1;\n  const systemPromptLen =\n      Module.lengthBytesUTF8(\n          config.systemPrompt || 'You are a helpful assistant.') +\n      1;\n  const userPromptLen =\n      Module.lengthBytesUTF8(config.userPrompt || '语音转写：') + 1;\n  const languageLen = Module.lengthBytesUTF8(config.language || '') + 1;\n  const hotwordsLen = Module.lengthBytesUTF8(config.hotwords || '') + 1;\n\n  const n = encoderAdaptorLen + llmLen + embeddingLen + tokenizerLen +\n      systemPromptLen + userPromptLen + languageLen + hotwordsLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 13 * 4;  // 8 pointers + 3 int + 2 float\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(\n      config.encoderAdaptor || '', buffer + offset, encoderAdaptorLen);\n  offset += encoderAdaptorLen;\n\n  Module.stringToUTF8(config.llm || '', buffer + offset, llmLen);\n  offset += llmLen;\n\n  Module.stringToUTF8(config.embedding || '', buffer + offset, embeddingLen);\n  offset += embeddingLen;\n\n  Module.stringToUTF8(config.tokenizer || '', buffer + offset, tokenizerLen);\n  offset += tokenizerLen;\n\n  Module.stringToUTF8(\n      config.systemPrompt || 'You are a 
helpful assistant.', buffer + offset,\n      systemPromptLen);\n  offset += systemPromptLen;\n\n  Module.stringToUTF8(\n      config.userPrompt || '语音转写：', buffer + offset, userPromptLen);\n  offset += userPromptLen;\n\n  Module.stringToUTF8(config.language || '', buffer + offset, languageLen);\n  offset += languageLen;\n\n  Module.stringToUTF8(config.hotwords || '', buffer + offset, hotwordsLen);\n  offset += hotwordsLen;\n\n  offset = 0;\n  Module.setValue(ptr + 0 * 4, buffer + offset, 'i8*');\n  offset += encoderAdaptorLen;\n\n  Module.setValue(ptr + 1 * 4, buffer + offset, 'i8*');\n  offset += llmLen;\n\n  Module.setValue(ptr + 2 * 4, buffer + offset, 'i8*');\n  offset += embeddingLen;\n\n  Module.setValue(ptr + 3 * 4, buffer + offset, 'i8*');\n  offset += tokenizerLen;\n\n  Module.setValue(ptr + 4 * 4, buffer + offset, 'i8*');\n  offset += systemPromptLen;\n\n  Module.setValue(ptr + 5 * 4, buffer + offset, 'i8*');\n  offset += userPromptLen;\n\n  Module.setValue(ptr + 6 * 4, config.maxNewTokens || 512, 'i32');\n  Module.setValue(ptr + 7 * 4, config.temperature || 1e-6, 'float');\n  Module.setValue(ptr + 8 * 4, config.topP || 0.8, 'float');\n  Module.setValue(ptr + 9 * 4, config.seed || 42, 'i32');\n  Module.setValue(ptr + 10 * 4, buffer + offset, 'i8*');\n  offset += languageLen;\n  Module.setValue(ptr + 11 * 4, config.itn || 0, 'i32');\n  Module.setValue(ptr + 12 * 4, buffer + offset, 'i8*');\n  offset += hotwordsLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineWhisperModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const languageLen = Module.lengthBytesUTF8(config.language || '') + 1;\n  const taskLen = Module.lengthBytesUTF8(config.task || '') + 1;\n\n  const n = encoderLen + decoderLen + languageLen + taskLen;\n  const buffer = Module._malloc(n);\n\n  const len = 7 * 
4;  // 4 pointers + 3 int32\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.language || '', buffer + offset, languageLen);\n  offset += languageLen;\n\n  Module.stringToUTF8(config.task || '', buffer + offset, taskLen);\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += languageLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += taskLen;\n\n  Module.setValue(ptr + 16, config.tailPaddings || 2000, 'i32');\n  Module.setValue(ptr + 20, config.enableTokenTimestamps || 0, 'i32');\n  Module.setValue(ptr + 24, config.enableSegmentTimestamps || 0, 'i32');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineCanaryModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const srcLangLen = Module.lengthBytesUTF8(config.srcLang || '') + 1;\n  const tgtLangLen = Module.lengthBytesUTF8(config.tgtLang || '') + 1;\n\n  const n = encoderLen + decoderLen + srcLangLen + tgtLangLen;\n  const buffer = Module._malloc(n);\n\n  const len = 5 * 4;  // 4 pointers + 1 int32\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.srcLang || '', buffer + offset, srcLangLen);\n  offset += srcLangLen;\n\n  Module.stringToUTF8(config.tgtLang || '', 
buffer + offset, tgtLangLen);\n  offset += tgtLangLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += srcLangLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += tgtLangLen;\n\n  Module.setValue(ptr + 16, config.usePnc ?? 1, 'i32');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineMoonshineModelConfig(config, Module) {\n  const preprocessorLen = Module.lengthBytesUTF8(config.preprocessor || '') + 1;\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const uncachedDecoderLen =\n      Module.lengthBytesUTF8(config.uncachedDecoder || '') + 1;\n  const cachedDecoderLen =\n      Module.lengthBytesUTF8(config.cachedDecoder || '') + 1;\n  const mergedDecoderLen =\n      Module.lengthBytesUTF8(config.mergedDecoder || '') + 1;\n\n  const n = preprocessorLen + encoderLen + uncachedDecoderLen +\n      cachedDecoderLen + mergedDecoderLen;\n  const buffer = Module._malloc(n);\n\n  const len = 5 * 4;  // 5 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(\n      config.preprocessor || '', buffer + offset, preprocessorLen);\n  offset += preprocessorLen;\n\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(\n      config.uncachedDecoder || '', buffer + offset, uncachedDecoderLen);\n  offset += uncachedDecoderLen;\n\n  Module.stringToUTF8(\n      config.cachedDecoder || '', buffer + offset, cachedDecoderLen);\n  offset += cachedDecoderLen;\n\n  Module.stringToUTF8(\n      config.mergedDecoder || '', buffer + offset, mergedDecoderLen);\n  offset += mergedDecoderLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += preprocessorLen;\n\n  
Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += uncachedDecoderLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += cachedDecoderLen;\n\n  Module.setValue(ptr + 16, buffer + offset, 'i8*');\n  offset += mergedDecoderLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineFireRedAsrModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n\n  const n = encoderLen + decoderLen;\n  const buffer = Module._malloc(n);\n\n  const len = 2 * 4;  // 2 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTdnnModelConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;  // 1 pointer\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineSenseVoiceModelConfig(config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const languageLen = Module.lengthBytesUTF8(config.language || '') + 1;\n\n  // useItn is a integer with 4 bytes\n  const n = modelLen + languageLen;\n  const buffer = Module._malloc(n);\n\n  
const len = 3 * 4;  // 2 pointers + 1 int\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  Module.stringToUTF8(config.language || '', buffer + offset, languageLen);\n  offset += languageLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += modelLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += languageLen;\n\n  Module.setValue(ptr + 8, config.useInverseTextNormalization ?? 0, 'i32');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineLMConfig(config, Module) {\n  const n = Module.lengthBytesUTF8(config.model || '') + 1;\n  const buffer = Module._malloc(n);\n\n  const len = 2 * 4;\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, n);\n  Module.setValue(ptr, buffer, 'i8*');\n  Module.setValue(ptr + 4, config.scale || 1, 'float');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineModelConfig(config, Module) {\n  if (!('transducer' in config)) {\n    config.transducer = {\n      encoder: '',\n      decoder: '',\n      joiner: '',\n    };\n  }\n\n  if (!('paraformer' in config)) {\n    config.paraformer = {\n      model: '',\n    };\n  }\n\n  if (!('nemoCtc' in config)) {\n    config.nemoCtc = {\n      model: '',\n    };\n  }\n\n  if (!('dolphin' in config)) {\n    config.dolphin = {\n      model: '',\n    };\n  }\n\n  if (!('zipformerCtc' in config)) {\n    config.zipformerCtc = {\n      model: '',\n    };\n  }\n\n  if (!('wenetCtc' in config)) {\n    config.wenetCtc = {\n      model: '',\n    };\n  }\n\n  if (!('omnilingual' in config)) {\n    config.omnilingual = {\n      model: '',\n    };\n  }\n\n  if (!('medasr' in config)) {\n    config.medasr = {\n      model: '',\n    };\n  }\n\n  if (!('fireRedAsrCtc' in config)) {\n    config.fireRedAsrCtc = {\n      
model: '',\n    };\n  }\n\n  if (!('funasrNano' in config)) {\n    config.funasrNano = {\n      encoderAdaptor: '',\n      llm: '',\n      embedding: '',\n      tokenizer: '',\n      systemPrompt: 'You are a helpful assistant.',\n      userPrompt: '语音转写：',\n      maxNewTokens: 512,\n      temperature: 1e-6,\n      topP: 0.8,\n      seed: 42,\n      language: '',\n      itn: 0,\n      hotwords: '',\n    };\n  }\n\n  if (!('whisper' in config)) {\n    config.whisper = {\n      encoder: '',\n      decoder: '',\n      language: '',\n      task: '',\n      tailPaddings: -1,\n      enableTokenTimestamps: 0,\n      enableSegmentTimestamps: 0,\n    };\n  }\n\n  if (!('moonshine' in config)) {\n    config.moonshine = {\n      preprocessor: '',\n      encoder: '',\n      uncachedDecoder: '',\n      cachedDecoder: '',\n      mergedDecoder: '',\n    };\n  }\n\n  if (!('fireRedAsr' in config)) {\n    config.fireRedAsr = {\n      encoder: '',\n      decoder: '',\n    };\n  }\n\n  if (!('tdnn' in config)) {\n    config.tdnn = {\n      model: '',\n    };\n  }\n\n  if (!('senseVoice' in config)) {\n    config.senseVoice = {\n      model: '',\n      language: '',\n      useInverseTextNormalization: 0,\n    };\n  }\n\n  if (!('canary' in config)) {\n    config.canary = {\n      encoder: '',\n      decoder: '',\n      srcLang: '',\n      tgtLang: '',\n      usePnc: 1,\n    };\n  }\n\n  const transducer =\n      initSherpaOnnxOfflineTransducerModelConfig(config.transducer, Module);\n\n  const paraformer =\n      initSherpaOnnxOfflineParaformerModelConfig(config.paraformer, Module);\n\n  const nemoCtc =\n      initSherpaOnnxOfflineNemoEncDecCtcModelConfig(config.nemoCtc, Module);\n\n  const whisper =\n      initSherpaOnnxOfflineWhisperModelConfig(config.whisper, Module);\n\n  const tdnn = initSherpaOnnxOfflineTdnnModelConfig(config.tdnn, Module);\n\n  const senseVoice =\n      initSherpaOnnxOfflineSenseVoiceModelConfig(config.senseVoice, Module);\n\n  const moonshine =\n      
initSherpaOnnxOfflineMoonshineModelConfig(config.moonshine, Module);\n\n  const fireRedAsr =\n      initSherpaOnnxOfflineFireRedAsrModelConfig(config.fireRedAsr, Module);\n\n  const dolphin =\n      initSherpaOnnxOfflineDolphinModelConfig(config.dolphin, Module);\n\n  const zipformerCtc =\n      initSherpaOnnxOfflineZipformerCtcModelConfig(config.zipformerCtc, Module);\n\n  const canary = initSherpaOnnxOfflineCanaryModelConfig(config.canary, Module);\n\n  const wenetCtc =\n      initSherpaOnnxOfflineWenetCtcModelConfig(config.wenetCtc, Module);\n\n  const omnilingual = initSherpaOnnxOfflineOmnilingualAsrCtcModelConfig(\n      config.omnilingual, Module);\n\n  const medasr =\n      initSherpaOnnxOfflineMedAsrCtcModelConfig(config.medasr, Module);\n\n  const funasrNano =\n      initSherpaOnnxOfflineFunAsrNanoModelConfig(config.funasrNano, Module);\n\n  const fireRedAsrCtc = initSherpaOnnxOfflineFireRedAsrCtcModelConfig(\n      config.fireRedAsrCtc, Module);\n\n  const len = transducer.len + paraformer.len + nemoCtc.len + whisper.len +\n      tdnn.len + 8 * 4 + senseVoice.len + moonshine.len + fireRedAsr.len +\n      dolphin.len + zipformerCtc.len + canary.len + wenetCtc.len +\n      omnilingual.len + medasr.len + funasrNano.len + fireRedAsrCtc.len;\n\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(transducer.ptr, transducer.len, ptr + offset);\n  offset += transducer.len;\n\n  Module._CopyHeap(paraformer.ptr, paraformer.len, ptr + offset);\n  offset += paraformer.len;\n\n  Module._CopyHeap(nemoCtc.ptr, nemoCtc.len, ptr + offset);\n  offset += nemoCtc.len;\n\n  Module._CopyHeap(whisper.ptr, whisper.len, ptr + offset);\n  offset += whisper.len;\n\n  Module._CopyHeap(tdnn.ptr, tdnn.len, ptr + offset);\n  offset += tdnn.len;\n\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const modelTypeLen = Module.lengthBytesUTF8(config.modelType || '') + 
1;\n  const modelingUnitLen = Module.lengthBytesUTF8(config.modelingUnit || '') + 1;\n  const bpeVocabLen = Module.lengthBytesUTF8(config.bpeVocab || '') + 1;\n  const teleSpeechCtcLen =\n      Module.lengthBytesUTF8(config.teleSpeechCtc || '') + 1;\n\n  const bufferLen = tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n      bpeVocabLen + teleSpeechCtcLen;\n\n  const buffer = Module._malloc(bufferLen);\n\n  offset = 0;\n  Module.stringToUTF8(config.tokens || '', buffer, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.provider || 'cpu', buffer + offset, providerLen);\n  offset += providerLen;\n\n  Module.stringToUTF8(config.modelType || '', buffer + offset, modelTypeLen);\n  offset += modelTypeLen;\n\n  Module.stringToUTF8(\n      config.modelingUnit || '', buffer + offset, modelingUnitLen);\n  offset += modelingUnitLen;\n\n  Module.stringToUTF8(config.bpeVocab || '', buffer + offset, bpeVocabLen);\n  offset += bpeVocabLen;\n\n  Module.stringToUTF8(\n      config.teleSpeechCtc || '', buffer + offset, teleSpeechCtcLen);\n  offset += teleSpeechCtcLen;\n\n  offset =\n      transducer.len + paraformer.len + nemoCtc.len + whisper.len + tdnn.len;\n  Module.setValue(ptr + offset, buffer, 'i8*');  // tokens\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug ?? 
1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + tokensLen, 'i8*');  // provider\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen, 'i8*');  // modelType\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen + modelTypeLen,\n      'i8*');  // modelingUnit\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen,\n      'i8*');  // bpeVocab\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n          bpeVocabLen,\n      'i8*');  // teleSpeechCtc\n  offset += 4;\n\n  Module._CopyHeap(senseVoice.ptr, senseVoice.len, ptr + offset);\n  offset += senseVoice.len;\n\n  Module._CopyHeap(moonshine.ptr, moonshine.len, ptr + offset);\n  offset += moonshine.len;\n\n  Module._CopyHeap(fireRedAsr.ptr, fireRedAsr.len, ptr + offset);\n  offset += fireRedAsr.len;\n\n  Module._CopyHeap(dolphin.ptr, dolphin.len, ptr + offset);\n  offset += dolphin.len;\n\n  Module._CopyHeap(zipformerCtc.ptr, zipformerCtc.len, ptr + offset);\n  offset += zipformerCtc.len;\n\n  Module._CopyHeap(canary.ptr, canary.len, ptr + offset);\n  offset += canary.len;\n\n  Module._CopyHeap(wenetCtc.ptr, wenetCtc.len, ptr + offset);\n  offset += wenetCtc.len;\n\n  Module._CopyHeap(omnilingual.ptr, omnilingual.len, ptr + offset);\n  offset += omnilingual.len;\n\n  Module._CopyHeap(medasr.ptr, medasr.len, ptr + offset);\n  offset += medasr.len;\n\n  Module._CopyHeap(funasrNano.ptr, funasrNano.len, ptr + offset);\n  offset += funasrNano.len;\n\n  Module._CopyHeap(fireRedAsrCtc.ptr, fireRedAsrCtc.len, ptr + offset);\n  offset += fireRedAsrCtc.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    transducer: transducer,\n    paraformer: paraformer,\n    nemoCtc: nemoCtc,\n    whisper: whisper,\n    tdnn: tdnn,\n    senseVoice: senseVoice,\n    
moonshine: moonshine,\n    fireRedAsr: fireRedAsr,\n    dolphin: dolphin,\n    zipformerCtc: zipformerCtc,\n    canary: canary,\n    wenetCtc: wenetCtc,\n    omnilingual: omnilingual,\n    medasr: medasr,\n    funasrNano: funasrNano,\n    fireRedAsrCtc: fireRedAsrCtc\n  };\n}\n\nfunction initSherpaOnnxOfflineRecognizerConfig(config, Module) {\n  if (!('featConfig' in config)) {\n    config.featConfig = {\n      sampleRate: 16000,\n      featureDim: 80,\n    };\n  }\n\n  if (!('lmConfig' in config)) {\n    config.lmConfig = {\n      model: '',\n      scale: 1.0,\n    };\n  }\n\n  if (!('hr' in config)) {\n    config.hr = {\n      lexicon: '',\n      ruleFsts: '',\n    };\n  }\n\n  const feat = initSherpaOnnxFeatureConfig(config.featConfig, Module);\n  const model = initSherpaOnnxOfflineModelConfig(config.modelConfig, Module);\n  const lm = initSherpaOnnxOfflineLMConfig(config.lmConfig, Module);\n  const hr = initSherpaOnnxHomophoneReplacerConfig(config.hr, Module);\n\n  const len = feat.len + model.len + lm.len + 7 * 4 + hr.len;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(feat.ptr, feat.len, ptr + offset);\n  offset += feat.len;\n\n  Module._CopyHeap(model.ptr, model.len, ptr + offset);\n  offset += model.len;\n\n  Module._CopyHeap(lm.ptr, lm.len, ptr + offset);\n  offset += lm.len;\n\n  const decodingMethodLen =\n      Module.lengthBytesUTF8(config.decodingMethod || 'greedy_search') + 1;\n  const hotwordsFileLen = Module.lengthBytesUTF8(config.hotwordsFile || '') + 1;\n  const ruleFstsLen = Module.lengthBytesUTF8(config.ruleFsts || '') + 1;\n  const ruleFarsLen = Module.lengthBytesUTF8(config.ruleFars || '') + 1;\n  const bufferLen =\n      decodingMethodLen + hotwordsFileLen + ruleFstsLen + ruleFarsLen;\n  const buffer = Module._malloc(bufferLen);\n\n  offset = 0;\n  Module.stringToUTF8(\n      config.decodingMethod || 'greedy_search', buffer, decodingMethodLen);\n  offset += decodingMethodLen;\n\n  Module.stringToUTF8(\n      
config.hotwordsFile || '', buffer + offset, hotwordsFileLen);\n  offset += hotwordsFileLen;\n\n  Module.stringToUTF8(config.ruleFsts || '', buffer + offset, ruleFstsLen);\n  offset += ruleFstsLen;\n\n  Module.stringToUTF8(config.ruleFars || '', buffer + offset, ruleFarsLen);\n  offset += ruleFarsLen;\n\n  offset = feat.len + model.len + lm.len;\n\n  Module.setValue(ptr + offset, buffer, 'i8*');  // decoding method\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.maxActivePaths || 4, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + decodingMethodLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.hotwordsScore || 1.5, 'float');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + decodingMethodLen + hotwordsFileLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + decodingMethodLen + hotwordsFileLen + ruleFstsLen,\n      'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.blankPenalty || 0, 'float');\n  offset += 4;\n\n  Module._CopyHeap(hr.ptr, hr.len, ptr + offset);\n  offset += hr.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    feat: feat,\n    model: model,\n    lm: lm,\n    hr: hr,\n  };\n}\n\nclass OfflineStream {\n  constructor(handle, Module) {\n    this.handle = handle;\n    this.Module = Module;\n  }\n\n  free() {\n    if (this.handle) {\n      this.Module._SherpaOnnxDestroyOfflineStream(this.handle);\n      this.handle = null;\n    }\n  }\n\n  /**\n   * @param sampleRate {Number}\n   * @param samples {Float32Array} Containing samples in the range [-1, 1]\n   */\n  acceptWaveform(sampleRate, samples) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n    this.Module._SherpaOnnxAcceptWaveformOffline(\n        this.handle, sampleRate, pointer, samples.length);\n    this.Module._free(pointer);\n  }\n\n  /**\n   
* @param key {String} The option name\n   * @param value {String} The option value\n   */\n  setOption(key, value) {\n    const keyLen = this.Module.lengthBytesUTF8(key) + 1;\n    const valueLen = this.Module.lengthBytesUTF8(value) + 1;\n    const pKey = this.Module._malloc(keyLen);\n    const pValue = this.Module._malloc(valueLen);\n    this.Module.stringToUTF8(key, pKey, keyLen);\n    this.Module.stringToUTF8(value, pValue, valueLen);\n    this.Module._SherpaOnnxOfflineStreamSetOption(this.handle, pKey, pValue);\n    this.Module._free(pKey);\n    this.Module._free(pValue);\n  }\n\n  /**\n   * @param key {String} The option name\n   * @returns {String} The option value, or empty string if not set\n   */\n  getOption(key) {\n    const keyLen = this.Module.lengthBytesUTF8(key) + 1;\n    const pKey = this.Module._malloc(keyLen);\n    this.Module.stringToUTF8(key, pKey, keyLen);\n    const pValue = this.Module._SherpaOnnxOfflineStreamGetOption(this.handle, pKey);\n    const value = this.Module.UTF8ToString(pValue);\n    this.Module._free(pKey);\n    return value;\n  }\n};\n\nclass OfflineRecognizer {\n  constructor(configObj, Module) {\n    this.config = configObj;\n    const config = initSherpaOnnxOfflineRecognizerConfig(configObj, Module);\n    const handle = Module._SherpaOnnxCreateOfflineRecognizer(config.ptr);\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.Module = Module;\n  }\n\n  setConfig(configObj) {\n    const config =\n        initSherpaOnnxOfflineRecognizerConfig(configObj, this.Module);\n    this.Module._SherpaOnnxOfflineRecognizerSetConfig(this.handle, config.ptr);\n    freeConfig(config, this.Module);\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyOfflineRecognizer(this.handle);\n    this.handle = 0\n  }\n\n  createStream() {\n    const handle = this.Module._SherpaOnnxCreateOfflineStream(this.handle);\n    return new OfflineStream(handle, this.Module);\n  }\n\n  decode(stream) {\n    
this.Module._SherpaOnnxDecodeOfflineStream(this.handle, stream.handle);\n  }\n\n  getResult(stream) {\n    const r =\n        this.Module._SherpaOnnxGetOfflineStreamResultAsJson(stream.handle);\n    const jsonStr = this.Module.UTF8ToString(r);\n    const ans = JSON.parse(jsonStr);\n    this.Module._SherpaOnnxDestroyOfflineStreamResultJson(r);\n\n    return ans;\n  }\n};\n\nclass OnlineStream {\n  constructor(handle, Module) {\n    this.handle = handle;\n    this.pointer = null;  // buffer\n    this.n = 0;           // buffer size\n    this.Module = Module;\n  }\n\n  free() {\n    if (this.handle) {\n      this.Module._SherpaOnnxDestroyOnlineStream(this.handle);\n      this.handle = null;\n      this.Module._free(this.pointer);\n      this.pointer = null;\n      this.n = 0;\n    }\n  }\n\n  /**\n   * @param sampleRate {Number}\n   * @param samples {Float32Array} Containing samples in the range [-1, 1]\n   */\n  acceptWaveform(sampleRate, samples) {\n    if (this.n < samples.length) {\n      this.Module._free(this.pointer);\n      this.pointer =\n          this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n      this.n = samples.length\n    }\n\n    this.Module.HEAPF32.set(samples, this.pointer / samples.BYTES_PER_ELEMENT);\n    this.Module._SherpaOnnxOnlineStreamAcceptWaveform(\n        this.handle, sampleRate, this.pointer, samples.length);\n  }\n\n  inputFinished() {\n    this.Module._SherpaOnnxOnlineStreamInputFinished(this.handle);\n  }\n\n  /**\n   * @param key {String} The option name\n   * @param value {String} The option value\n   */\n  setOption(key, value) {\n    const keyLen = this.Module.lengthBytesUTF8(key) + 1;\n    const valueLen = this.Module.lengthBytesUTF8(value) + 1;\n    const pKey = this.Module._malloc(keyLen);\n    const pValue = this.Module._malloc(valueLen);\n    this.Module.stringToUTF8(key, pKey, keyLen);\n    this.Module.stringToUTF8(value, pValue, valueLen);\n    this.Module._SherpaOnnxOnlineStreamSetOption(this.handle, 
pKey, pValue);\n    this.Module._free(pKey);\n    this.Module._free(pValue);\n  }\n\n  /**\n   * @param key {String} The option name\n   * @returns {String} The option value, or empty string if not set\n   */\n  getOption(key) {\n    const keyLen = this.Module.lengthBytesUTF8(key) + 1;\n    const pKey = this.Module._malloc(keyLen);\n    this.Module.stringToUTF8(key, pKey, keyLen);\n    const pValue = this.Module._SherpaOnnxOnlineStreamGetOption(this.handle, pKey);\n    const value = this.Module.UTF8ToString(pValue);\n    this.Module._free(pKey);\n    return value;\n  }\n};\n\nclass OnlineRecognizer {\n  constructor(configObj, Module) {\n    this.config = configObj;\n    const config = initSherpaOnnxOnlineRecognizerConfig(configObj, Module)\n    const handle = Module._SherpaOnnxCreateOnlineRecognizer(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.Module = Module;\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyOnlineRecognizer(this.handle);\n    this.handle = 0\n  }\n\n  createStream() {\n    const handle = this.Module._SherpaOnnxCreateOnlineStream(this.handle);\n    return new OnlineStream(handle, this.Module);\n  }\n\n  isReady(stream) {\n    return this.Module._SherpaOnnxIsOnlineStreamReady(\n               this.handle, stream.handle) == 1;\n  }\n\n  decode(stream) {\n    this.Module._SherpaOnnxDecodeOnlineStream(this.handle, stream.handle);\n  }\n\n  isEndpoint(stream) {\n    return this.Module._SherpaOnnxOnlineStreamIsEndpoint(\n               this.handle, stream.handle) == 1;\n  }\n\n  reset(stream) {\n    this.Module._SherpaOnnxOnlineStreamReset(this.handle, stream.handle);\n  }\n\n  getResult(stream) {\n    const r = this.Module._SherpaOnnxGetOnlineStreamResultAsJson(\n        this.handle, stream.handle);\n    const jsonStr = this.Module.UTF8ToString(r);\n    const ans = JSON.parse(jsonStr);\n    this.Module._SherpaOnnxDestroyOnlineStreamResultJson(r);\n\n    return ans;\n  }\n}\n\nif (typeof process == 'object' 
&& typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createOnlineRecognizer,\n    OfflineRecognizer,\n  };\n}\n"
  },
  {
    "path": "wasm/asr/sherpa-onnx-wasm-main-asr.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-asr.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOnlineTransducerModelConfig) == 3 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineParaformerModelConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineZipformer2CtcModelConfig) == 1 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineNemoCtcModelConfig) == 1 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineModelConfig) ==\n                  sizeof(SherpaOnnxOnlineTransducerModelConfig) +\n                      sizeof(SherpaOnnxOnlineParaformerModelConfig) +\n                      sizeof(SherpaOnnxOnlineZipformer2CtcModelConfig) + 9 * 4 +\n                      sizeof(SherpaOnnxOnlineNemoCtcModelConfig) +\n                      sizeof(SherpaOnnxOnlineToneCtcModelConfig),\n              \"\");\nstatic_assert(sizeof(SherpaOnnxFeatureConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineCtcFstDecoderConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineRecognizerConfig) ==\n                  sizeof(SherpaOnnxFeatureConfig) +\n                      sizeof(SherpaOnnxOnlineModelConfig) + 8 * 4 +\n                      sizeof(SherpaOnnxOnlineCtcFstDecoderConfig) + 5 * 4 +\n                      sizeof(SherpaOnnxHomophoneReplacerConfig),\n              \"\");\n\nvoid MyPrint(SherpaOnnxOnlineRecognizerConfig *config) {\n  auto model_config = &config->model_config;\n  auto feat = &config->feat_config;\n  auto transducer_model_config = &model_config->transducer;\n  auto paraformer_model_config = &model_config->paraformer;\n  auto ctc_model_config = &model_config->zipformer2_ctc;\n  auto nemo_ctc = &model_config->nemo_ctc;\n  auto t_one_ctc = &model_config->t_one_ctc;\n\n  
fprintf(stdout, \"----------online transducer model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", transducer_model_config->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", transducer_model_config->decoder);\n  fprintf(stdout, \"joiner: %s\\n\", transducer_model_config->joiner);\n\n  fprintf(stdout, \"----------online paraformer model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", paraformer_model_config->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", paraformer_model_config->decoder);\n\n  fprintf(stdout, \"----------online zipformer2 ctc model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", ctc_model_config->model);\n\n  fprintf(stdout, \"----------online nemo ctc model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", nemo_ctc->model);\n\n  fprintf(stdout, \"----------online t-one ctc model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", t_one_ctc->model);\n\n  fprintf(stdout, \"tokens: %s\\n\", model_config->tokens);\n  fprintf(stdout, \"num_threads: %d\\n\", model_config->num_threads);\n  fprintf(stdout, \"provider: %s\\n\", model_config->provider);\n  fprintf(stdout, \"debug: %d\\n\", model_config->debug);\n  fprintf(stdout, \"model type: %s\\n\", model_config->model_type);\n  fprintf(stdout, \"modeling unit: %s\\n\", model_config->modeling_unit);\n  fprintf(stdout, \"bpe vocab: %s\\n\", model_config->bpe_vocab);\n  fprintf(stdout, \"tokens_buf: %s\\n\",\n          model_config->tokens_buf ? 
model_config->tokens_buf : \"\");\n  fprintf(stdout, \"tokens_buf_size: %d\\n\", model_config->tokens_buf_size);\n\n  fprintf(stdout, \"----------feat config----------\\n\");\n  fprintf(stdout, \"sample rate: %d\\n\", feat->sample_rate);\n  fprintf(stdout, \"feat dim: %d\\n\", feat->feature_dim);\n\n  fprintf(stdout, \"----------recognizer config----------\\n\");\n  fprintf(stdout, \"decoding method: %s\\n\", config->decoding_method);\n  fprintf(stdout, \"max active paths: %d\\n\", config->max_active_paths);\n  fprintf(stdout, \"enable_endpoint: %d\\n\", config->enable_endpoint);\n  fprintf(stdout, \"rule1_min_trailing_silence: %.2f\\n\",\n          config->rule1_min_trailing_silence);\n  fprintf(stdout, \"rule2_min_trailing_silence: %.2f\\n\",\n          config->rule2_min_trailing_silence);\n  fprintf(stdout, \"rule3_min_utterance_length: %.2f\\n\",\n          config->rule3_min_utterance_length);\n  fprintf(stdout, \"hotwords_file: %s\\n\", config->hotwords_file);\n  fprintf(stdout, \"hotwords_score: %.2f\\n\", config->hotwords_score);\n  fprintf(stdout, \"rule_fsts: %s\\n\", config->rule_fsts);\n  fprintf(stdout, \"rule_fars: %s\\n\", config->rule_fars);\n  fprintf(stdout, \"blank_penalty: %f\\n\", config->blank_penalty);\n\n  fprintf(stdout, \"----------ctc fst decoder config----------\\n\");\n  fprintf(stdout, \"graph: %s\\n\", config->ctc_fst_decoder_config.graph);\n  fprintf(stdout, \"max_active: %d\\n\",\n          config->ctc_fst_decoder_config.max_active);\n\n  fprintf(stdout, \"----------hr config----------\\n\");\n  fprintf(stdout, \"dict_dir: %s\\n\", config->hr.dict_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", config->hr.lexicon);\n  fprintf(stdout, \"rule_fsts: %s\\n\", config->hr.rule_fsts);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/kws/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n    message(FATAL_ERROR \"Please use ./build-wasm-simd-kws.sh to build for wasm KWS\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/decoder-epoch-12-avg-2-chunk-16-left-64.onnx\")\n    message(WARNING \"${CMAKE_CURRENT_SOURCE_DIR}/assets/decoder-epoch-12-avg-2-chunk-16-left-64.onnx does not exist\")\n    message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  SherpaOnnxCreateKeywordSpotter\n  SherpaOnnxCreateKeywordStream\n  SherpaOnnxDecodeKeywordStream\n  SherpaOnnxDestroyKeywordResult\n  SherpaOnnxDestroyKeywordSpotter\n  SherpaOnnxGetKeywordResult\n  SherpaOnnxIsKeywordStreamReady\n  SherpaOnnxOnlineStreamAcceptWaveform\n  SherpaOnnxOnlineStreamGetOption\n  SherpaOnnxOnlineStreamInputFinished\n  SherpaOnnxOnlineStreamSetOption\n  SherpaOnnxResetKeywordStream\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n    list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\n\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \"-s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \")\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nadd_executable(sherpa-onnx-wasm-kws-main sherpa-onnx-wasm-main-kws.cc)\ntarget_link_libraries(sherpa-onnx-wasm-kws-main sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-kws-main DESTINATION bin/wasm)\n\ninstall(\n        FILES\n        \"sherpa-onnx-kws.js\"\n        \"app.js\"\n        \"index.html\"\n        \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-kws-main>/sherpa-onnx-wasm-kws-main.js\"\n        \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-kws-main>/sherpa-onnx-wasm-kws-main.wasm\"\n        \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-kws-main>/sherpa-onnx-wasm-kws-main.data\"\n        DESTINATION\n        bin/wasm\n)\n"
  },
  {
    "path": "wasm/kws/app.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nconst startBtn = document.getElementById('startBtn');\nconst stopBtn = document.getElementById('stopBtn');\nconst clearBtn = document.getElementById('clearBtn');\nconst hint = document.getElementById('hint');\nconst soundClips = document.getElementById('sound-clips');\n\nlet textArea = document.getElementById('results');\n\nlet lastResult = '';\nlet resultList = [];\n\nclearBtn.onclick = function() {\n  resultList = [];\n  textArea.value = getDisplayResult();\n  textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n};\n\nfunction getDisplayResult() {\n  let i = 0;\n  let ans = '';\n  for (let s in resultList) {\n    if (resultList[s] == '') {\n      continue;\n    }\n\n    ans += '' + i + ': ' + resultList[s] + '\\n';\n    i += 1;\n  }\n\n  return ans;\n}\n\n\nModule = {};\nModule.onRuntimeInitialized = function() {\n  console.log('inited!');\n  hint.innerText = 'Model loaded! 
Please click start';\n\n  startBtn.disabled = false;\n\n  recognizer = createKws(Module);\n  console.log('recognizer is created!', recognizer);\n};\n\nlet audioCtx;\nlet mediaStream;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nlet recognizer = null;\nlet recognizer_stream = null;\n\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext({sampleRate: 16000});\n    }\n    console.log(audioCtx);\n    recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n    console.log('media stream', mediaStream);\n\n    // https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 4096;\n    var numberOfInputChannels = 1;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log('recorder', recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0))\n      samples = downsampleBuffer(samples, expectedSampleRate);\n\n      if (recognizer_stream == null) {\n        recognizer_stream = 
recognizer.createStream();\n      }\n\n      recognizer_stream.acceptWaveform(expectedSampleRate, samples);\n      while (recognizer.isReady(recognizer_stream)) {\n        recognizer.decode(recognizer_stream);\n\n        let result = recognizer.getResult(recognizer_stream);\n\n        if (result.keyword.length > 0) {\n          console.log(result)\n          lastResult = result;\n          resultList.push(JSON.stringify(result));\n\n          // remember to reset the stream right after detecting a keyword\n          recognizer.reset(recognizer_stream);\n        }\n      }\n\n\n      textArea.value = getDisplayResult();\n      textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n\n      let buf = new Int16Array(samples.length);\n      for (var i = 0; i < samples.length; ++i) {\n        let s = samples[i];\n        if (s >= 1)\n          s = 1;\n        else if (s <= -1)\n          s = -1;\n\n        samples[i] = s;\n        buf[i] = s * 32767;\n      }\n\n      leftchannel.push(buf);\n      recordingLength += bufferSize;\n    };\n\n    startBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n\n      stopBtn.disabled = false;\n      startBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      console.log('recorder stopped');\n\n      // stopBtn recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n\n      startBtn.style.background = '';\n      startBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      startBtn.disabled = false;\n\n      var clipName = new Date().toISOString();\n\n      const clipContainer = document.createElement('article');\n      const clipLabel = document.createElement('p');\n      const audio = document.createElement('audio');\n      const deleteButton = document.createElement('button');\n      
clipContainer.classList.add('clip');\n      audio.setAttribute('controls', '');\n      deleteButton.textContent = 'Delete';\n      deleteButton.className = 'delete';\n\n      clipLabel.textContent = clipName;\n\n      clipContainer.appendChild(audio);\n\n      clipContainer.appendChild(clipLabel);\n      clipContainer.appendChild(deleteButton);\n      soundClips.appendChild(clipContainer);\n\n      audio.controls = true;\n      let samples = flatten(leftchannel);\n      const blob = toWav(samples);\n\n      leftchannel = [];\n      const audioURL = window.URL.createObjectURL(blob);\n      audio.src = audioURL;\n      console.log('recorder stopped');\n\n      deleteButton.onclick = function(e) {\n        let evtTgt = e.target;\n        evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n      };\n\n      clipLabel.onclick = function() {\n        const existingName = clipLabel.textContent;\n        const newClipName = prompt('Enter a new name for your sound clip?');\n        if (newClipName === null) {\n          clipLabel.textContent = existingName;\n        } else {\n          clipLabel.textContent = newClipName;\n        }\n      };\n    };\n  };\n\n  let onError = function(err) {\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// 
https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint32(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = Math.round(buffer.length / sampleRateRatio);\n  var result = new Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = 
Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "wasm/kws/assets/README.md",
    "content": "# Introduction\n\nPlease refer to\nhttps://www.modelscope.cn/models/pkufool/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/summary\nto download a model.\n\n# Kws\n\nThe following example downloads a model and moves the required files into `assets`:\n\n```bash\ncd sherpa-onnx/wasm/kws/assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\ntar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\nrm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2\n\nmv sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/encoder-epoch-12-avg-2-chunk-16-left-64.onnx ./\nmv sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/decoder-epoch-12-avg-2-chunk-16-left-64.onnx ./\nmv sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/joiner-epoch-12-avg-2-chunk-16-left-64.onnx ./\nmv sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01/tokens.txt ./\nrm -rf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01\n```\n\nYou should have the following files in `assets` before you can run\n`build-wasm-simd-kws.sh`:\n\n```bash\nfangjuns-MacBook-Pro:assets fangjun$ pwd\n/Users/fangjun/open-source/sherpa-onnx/wasm/kws/assets\n\nfangjuns-MacBook-Pro:assets fangjun$ ls -lh\ntotal 25616\n-rw-r--r--  1 fangjun  staff   692B Oct 29 16:53 README.md\n-rw-r--r--  1 fangjun  staff   660K Aug 14 15:21 decoder-epoch-12-avg-2-chunk-16-left-64.onnx\n-rw-r--r--  1 fangjun  staff    12M Aug 14 15:21 encoder-epoch-12-avg-2-chunk-16-left-64.onnx\n-rw-r--r--  1 fangjun  staff   247K Aug 14 15:21 joiner-epoch-12-avg-2-chunk-16-left-64.onnx\n-rw-r--r--  1 fangjun  staff   1.6K Aug 14 15:08 tokens.txt\n```\n\n**Hint**: Remember to remove extra files from `assets`. For instance, remove\nthe file `sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01.tar.bz2`.\n"
  },
  {
    "path": "wasm/kws/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for kws</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n  </style>\n</head>\n\n<body>\n  <h1>\n    WebAssembly<br/>\n    Kws Demo with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a>\n  </h1>\n  <div>\n    <span id=\"hint\">Loading model ... ...</span>\n    <br/>\n    <br/>\n    <button id=\"startBtn\" disabled>Start</button>\n    <button id=\"stopBtn\" disabled>Stop</button>\n    <button id=\"clearBtn\">Clear</button>\n    <br/>\n    <br/>\n    <textarea id=\"results\" rows=\"10\" readonly></textarea>\n  </div>\n\n  <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n  </section>\n\n  <script src=\"sherpa-onnx-kws.js\"></script>\n  <script src=\"app.js\"></script>\n  <script src=\"sherpa-onnx-wasm-kws-main.js\"></script>\n</body>\n</html>\n"
  },
  {
    "path": "wasm/kws/sherpa-onnx-kws.js",
    "content": "\n\nfunction freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('transducer' in config) {\n    freeConfig(config.transducer, Module);\n  }\n\n  if ('featConfig' in config) {\n    freeConfig(config.featConfig, Module);\n  }\n\n  if ('modelConfig' in config) {\n    freeConfig(config.modelConfig, Module);\n  }\n\n  if ('keywordsBuffer' in config) {\n    Module._free(config.keywordsBuffer);\n  }\n\n  Module._free(config.ptr);\n}\n\n\nfunction initSherpaOnnxOnlineTransducerModelConfig(config, Module) {\n  const encoderLen = Module.lengthBytesUTF8(config.encoder) + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder) + 1;\n  const joinerLen = Module.lengthBytesUTF8(config.joiner) + 1;\n\n  const n = encoderLen + decoderLen + joinerLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 3 * 4;  // 3 pointers\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.encoder, buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder, buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.joiner, buffer + offset, joinerLen);\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\n// The user should free the returned pointers\nfunction initModelConfig(config, Module) {\n  if (!('tokensBuf' in config)) {\n    config.tokensBuf = '';\n  }\n\n  if (!('tokensBufSize' in config)) {\n    config.tokensBufSize = 0;\n  }\n\n  const transducer =\n      initSherpaOnnxOnlineTransducerModelConfig(config.transducer, Module);\n  const paraformer_len = 2 * 4\n  const zipfomer2_ctc_len = 1 * 4\n  const nemo_ctc_len = 1 * 4\n  const t_one_ctc_len = 1 * 4\n\n  const len = 
transducer.len + paraformer_len + zipfomer2_ctc_len + 9 * 4 +\n      nemo_ctc_len + t_one_ctc_len;\n\n  const ptr = Module._malloc(len);\n  Module.HEAPU8.fill(0, ptr, ptr + len);\n\n  let offset = 0;\n  Module._CopyHeap(transducer.ptr, transducer.len, ptr + offset);\n\n  const tokensLen = Module.lengthBytesUTF8(config.tokens) + 1;\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const modelTypeLen = Module.lengthBytesUTF8(config.modelType || '') + 1;\n  const modelingUnitLen = Module.lengthBytesUTF8(config.modelingUnit || '') + 1;\n  const bpeVocabLen = Module.lengthBytesUTF8(config.bpeVocab || '') + 1;\n  const tokensBufLen = Module.lengthBytesUTF8(config.tokensBuf || '') + 1;\n  const bufferLen = tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n      bpeVocabLen + tokensBufLen;\n  const buffer = Module._malloc(bufferLen);\n\n  offset = 0;\n  Module.stringToUTF8(config.tokens, buffer, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.provider || 'cpu', buffer + offset, providerLen);\n  offset += providerLen;\n\n  Module.stringToUTF8(config.modelType || '', buffer + offset, modelTypeLen);\n  offset += modelTypeLen;\n\n  Module.stringToUTF8(\n      config.modelingUnit || '', buffer + offset, modelingUnitLen);\n  offset += modelingUnitLen;\n\n  Module.stringToUTF8(config.bpeVocab || '', buffer + offset, bpeVocabLen);\n  offset += bpeVocabLen;\n\n  Module.stringToUTF8(config.tokensBuf || '', buffer + offset, tokensBufLen);\n  offset += tokensBufLen;\n\n  offset = transducer.len + paraformer_len + zipfomer2_ctc_len;\n  Module.setValue(ptr + offset, buffer, 'i8*');  // tokens\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + tokensLen, 'i8*');  // provider\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug, 'i32');\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen, 
'i8*');  // modelType\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset, buffer + tokensLen + providerLen + modelTypeLen,\n      'i8*');  // modelingUnit\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen,\n      'i8*');  // bpeVocab\n  offset += 4;\n\n  Module.setValue(\n      ptr + offset,\n      buffer + tokensLen + providerLen + modelTypeLen + modelingUnitLen +\n          bpeVocabLen,\n      'i8*');  // tokens_buf\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.tokensBufSize || 0, 'i32');\n  offset += 4;\n  // skip nemo_ctc and t_one_ctc\n\n  return {buffer: buffer, ptr: ptr, len: len, transducer: transducer};\n}\n\nfunction initFeatureExtractorConfig(config, Module) {\n  let ptr = Module._malloc(4 * 2);\n  Module.setValue(ptr, config.samplingRate || 16000, 'i32');\n  Module.setValue(ptr + 4, config.featureDim || 80, 'i32');\n  return {\n    ptr: ptr,\n    len: 8,\n  };\n}\n\nfunction initKwsConfig(config, Module) {\n  if (!('featConfig' in config)) {\n    // the key must be samplingRate; initFeatureExtractorConfig() reads it\n    config.featConfig = {\n      samplingRate: 16000,\n      featureDim: 80,\n    };\n  }\n\n  if (!('keywordsBuf' in config)) {\n    config.keywordsBuf = '';\n  }\n\n  if (!('keywordsBufSize' in config)) {\n    config.keywordsBufSize = 0;\n  }\n\n  let featConfig = initFeatureExtractorConfig(config.featConfig, Module);\n\n  let modelConfig = initModelConfig(config.modelConfig, Module);\n  let numBytes = featConfig.len + modelConfig.len + 4 * 7;\n\n  let ptr = Module._malloc(numBytes);\n  let offset = 0;\n  Module._CopyHeap(featConfig.ptr, featConfig.len, ptr + offset);\n  offset += featConfig.len;\n\n  Module._CopyHeap(modelConfig.ptr, modelConfig.len, ptr + offset);\n  offset += modelConfig.len;\n\n  Module.setValue(ptr + offset, config.maxActivePaths || 4, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numTrailingBlanks || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, 
config.keywordsScore || 1.0, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.keywordsThreshold || 0.25, 'float');\n  offset += 4;\n\n  let keywordsLen = Module.lengthBytesUTF8(config.keywords) + 1;\n  let keywordsBufLen = Module.lengthBytesUTF8(config.keywordsBuf) + 1;\n\n  let keywordsBuffer = Module._malloc(keywordsLen + keywordsBufLen);\n  Module.stringToUTF8(config.keywords, keywordsBuffer, keywordsLen);\n  Module.stringToUTF8(\n      config.keywordsBuf, keywordsBuffer + keywordsLen, keywordsBufLen);\n\n  Module.setValue(ptr + offset, keywordsBuffer, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, keywordsBuffer + keywordsLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.keywordsBufSize, 'i32');\n  offset += 4;\n\n  return {\n    ptr: ptr,\n    len: numBytes,\n    featConfig: featConfig,\n    modelConfig: modelConfig,\n    keywordsBuffer: keywordsBuffer\n  };\n}\n\nclass Stream {\n  constructor(handle, Module) {\n    this.handle = handle;\n    this.pointer = null;\n    this.n = 0;\n    this.Module = Module;\n  }\n\n  free() {\n    if (this.handle) {\n      this.Module._SherpaOnnxDestroyOnlineStream(this.handle);\n      this.handle = null;\n      this.Module._free(this.pointer);\n      this.pointer = null;\n      this.n = 0;\n    }\n  }\n\n  /**\n   * @param sampleRate {Number}\n   * @param samples {Float32Array} Containing samples in the range [-1, 1]\n   */\n  acceptWaveform(sampleRate, samples) {\n    if (this.n < samples.length) {\n      this.Module._free(this.pointer);\n      this.pointer =\n          this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n      this.n = samples.length\n    }\n\n    this.Module.HEAPF32.set(samples, this.pointer / samples.BYTES_PER_ELEMENT);\n    this.Module._SherpaOnnxOnlineStreamAcceptWaveform(\n        this.handle, sampleRate, this.pointer, samples.length);\n  }\n\n  inputFinished() {\n    this.Module._SherpaOnnxOnlineStreamInputFinished(this.handle);\n  
}\n};\n\nclass Kws {\n  constructor(configObj, Module) {\n    this.config = configObj;\n    let config = initKwsConfig(configObj, Module)\n    let handle = Module._SherpaOnnxCreateKeywordSpotter(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.Module = Module;\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyKeywordSpotter(this.handle);\n    this.handle = 0\n  }\n\n  createStream() {\n    let handle = this.Module._SherpaOnnxCreateKeywordStream(this.handle);\n    return new Stream(handle, this.Module);\n  }\n\n  isReady(stream) {\n    return this.Module._SherpaOnnxIsKeywordStreamReady(\n               this.handle, stream.handle) == 1;\n  }\n\n  decode(stream) {\n    this.Module._SherpaOnnxDecodeKeywordStream(this.handle, stream.handle);\n  }\n\n  reset(stream) {\n    this.Module._SherpaOnnxResetKeywordStream(this.handle, stream.handle);\n  }\n\n  getResult(stream) {\n    let r = this.Module._SherpaOnnxGetKeywordResult(this.handle, stream.handle);\n    let jsonPtr = this.Module.getValue(r + 24, 'i8*');\n    let json = this.Module.UTF8ToString(jsonPtr);\n    this.Module._SherpaOnnxDestroyKeywordResult(r);\n    return JSON.parse(json);\n  }\n}\n\nfunction createKws(Module, myConfig) {\n  let transducerConfig = {\n    encoder: './encoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n    decoder: './decoder-epoch-12-avg-2-chunk-16-left-64.onnx',\n    joiner: './joiner-epoch-12-avg-2-chunk-16-left-64.onnx',\n  };\n  let modelConfig = {\n    transducer: transducerConfig,\n    tokens: './tokens.txt',\n    provider: 'cpu',\n    modelType: '',\n    numThreads: 1,\n    debug: 1,\n    modelingUnit: 'cjkchar',\n    bpeVocab: '',\n  };\n\n  let featConfig = {\n    samplingRate: 16000,\n    featureDim: 80,\n  };\n\n  let configObj = {\n    featConfig: featConfig,\n    modelConfig: modelConfig,\n    maxActivePaths: 4,\n    numTrailingBlanks: 1,\n    keywordsScore: 1.0,\n    keywordsThreshold: 0.25,\n    keywords: 'x iǎo ài t óng x ué @小爱同学\\n' 
+\n        'j ūn g ē n iú b ī @军哥牛逼'\n  };\n\n  if (myConfig) {\n    configObj = myConfig;\n  }\n  return new Kws(configObj, Module);\n}\n\nif (typeof process == 'object' && typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createKws,\n  };\n}\n"
  },
  {
    "path": "wasm/kws/sherpa-onnx-wasm-main-kws.cc",
    "content": "// wasm/kws/sherpa-onnx-wasm-main-kws.cc\n//\n// Copyright (c)  2024  lovemefan\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOnlineTransducerModelConfig) == 3 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineParaformerModelConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineZipformer2CtcModelConfig) == 1 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOnlineModelConfig) ==\n                  sizeof(SherpaOnnxOnlineTransducerModelConfig) +\n                      sizeof(SherpaOnnxOnlineParaformerModelConfig) +\n                      sizeof(SherpaOnnxOnlineZipformer2CtcModelConfig) + 9 * 4 +\n                      sizeof(SherpaOnnxOnlineNemoCtcModelConfig) +\n                      sizeof(SherpaOnnxOnlineToneCtcModelConfig),\n              \"\");\nstatic_assert(sizeof(SherpaOnnxFeatureConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxKeywordSpotterConfig) ==\n                  sizeof(SherpaOnnxFeatureConfig) +\n                      sizeof(SherpaOnnxOnlineModelConfig) + 7 * 4,\n              \"\");\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/nodejs/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-nodejs.sh to build for wasm NodeJS\")\nendif()\n\nset(exported_functions\n  #tts\n  PrintOfflineTtsConfig\n  SherpaOnnxCreateOfflineTts\n  SherpaOnnxDestroyOfflineTts\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio\n  SherpaOnnxOfflineTtsGenerate\n  SherpaOnnxOfflineTtsGenerateWithCallback\n  SherpaOnnxOfflineTtsGenerateWithConfig\n  SherpaOnnxOfflineTtsNumSpeakers\n  SherpaOnnxOfflineTtsSampleRate\n  SherpaOnnxWriteWave\n  # streaming asr\n  SherpaOnnxCreateOnlineRecognizer\n  SherpaOnnxCreateOnlineStream\n  SherpaOnnxDecodeOnlineStream\n  SherpaOnnxDestroyOnlineRecognizer\n  SherpaOnnxDestroyOnlineRecognizerResult\n  SherpaOnnxDestroyOnlineStream\n  SherpaOnnxDestroyOnlineStreamResultJson\n  SherpaOnnxGetOnlineStreamResult\n  SherpaOnnxGetOnlineStreamResultAsJson\n  SherpaOnnxIsOnlineStreamReady\n  SherpaOnnxOnlineStreamAcceptWaveform\n  SherpaOnnxOnlineStreamGetOption\n  SherpaOnnxOnlineStreamInputFinished\n  SherpaOnnxOnlineStreamIsEndpoint\n  SherpaOnnxOnlineStreamReset\n  SherpaOnnxOnlineStreamSetOption\n  # non-streaming ASR\n  PrintOfflineRecognizerConfig\n  SherpaOnnxAcceptWaveformOffline\n  SherpaOnnxCreateOfflineRecognizer\n  SherpaOnnxCreateOfflineStream\n  SherpaOnnxDecodeMultipleOfflineStreams\n  SherpaOnnxDecodeOfflineStream\n  SherpaOnnxDestroyOfflineRecognizer\n  SherpaOnnxDestroyOfflineRecognizerResult\n  SherpaOnnxDestroyOfflineStream\n  SherpaOnnxDestroyOfflineStreamResultJson\n  SherpaOnnxGetOfflineStreamResult\n  SherpaOnnxGetOfflineStreamResultAsJson\n  SherpaOnnxOfflineStreamGetOption\n  SherpaOnnxOfflineStreamSetOption\n  SherpaOnnxOfflineRecognizerSetConfig\n  # online kws\n  SherpaOnnxCreateKeywordSpotter\n  SherpaOnnxCreateKeywordStream\n  SherpaOnnxDecodeKeywordStream\n  SherpaOnnxDestroyKeywordResult\n  SherpaOnnxDestroyKeywordSpotter\n  SherpaOnnxGetKeywordResult\n  SherpaOnnxIsKeywordStreamReady\n  
SherpaOnnxResetKeywordStream\n  # VAD\n  SherpaOnnxCreateCircularBuffer\n  SherpaOnnxDestroyCircularBuffer\n  SherpaOnnxCircularBufferPush\n  SherpaOnnxCircularBufferGet\n  SherpaOnnxCircularBufferFree\n  SherpaOnnxCircularBufferPop\n  SherpaOnnxCircularBufferSize\n  SherpaOnnxCircularBufferHead\n  SherpaOnnxCircularBufferReset\n  SherpaOnnxCreateVoiceActivityDetector\n  SherpaOnnxDestroyVoiceActivityDetector\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform\n  SherpaOnnxVoiceActivityDetectorEmpty\n  SherpaOnnxVoiceActivityDetectorDetected\n  SherpaOnnxVoiceActivityDetectorPop\n  SherpaOnnxVoiceActivityDetectorClear\n  SherpaOnnxVoiceActivityDetectorFront\n  SherpaOnnxDestroySpeechSegment\n  SherpaOnnxVoiceActivityDetectorReset\n  SherpaOnnxVoiceActivityDetectorFlush\n  # Speaker diarization\n  SherpaOnnxCreateOfflineSpeakerDiarization\n  SherpaOnnxDestroyOfflineSpeakerDiarization\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment\n  SherpaOnnxOfflineSpeakerDiarizationGetSampleRate\n  SherpaOnnxOfflineSpeakerDiarizationProcess\n  SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback\n  SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments\n  SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime\n  SherpaOnnxOfflineSpeakerDiarizationSetConfig\n  #\n  SherpaOnnxFileExists\n  SherpaOnnxReadWave\n  SherpaOnnxReadWaveFromBinaryData\n  SherpaOnnxFreeWave\n  SherpaOnnxWriteWave\n  # speech enhancement\n  SherpaOnnxCreateOfflineSpeechDenoiser\n  SherpaOnnxCreateOnlineSpeechDenoiser\n  SherpaOnnxDestroyDenoisedAudio\n  SherpaOnnxDestroyOfflineSpeechDenoiser\n  SherpaOnnxDestroyOnlineSpeechDenoiser\n  SherpaOnnxOfflineSpeechDenoiserGetSampleRate\n  SherpaOnnxOfflineSpeechDenoiserRun\n  SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples\n  SherpaOnnxOnlineSpeechDenoiserGetSampleRate\n  SherpaOnnxOnlineSpeechDenoiserRun\n  SherpaOnnxOnlineSpeechDenoiserFlush\n  SherpaOnnxOnlineSpeechDenoiserReset\n  # version\n  
SherpaOnnxGetGitDate\n  SherpaOnnxGetGitSha1\n  SherpaOnnxGetVersionStr\n)\n\n\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sALLOW_TABLE_GROWTH \")\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \" -sNODERAWFS=1 \")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64','addFunction','removeFunction'] \")\n\nstring(APPEND MY_FLAGS \" -sMODULARIZE=1 -sWASM_ASYNC_COMPILATION=0 \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXE_LINKER_FLAGS \"${CMAKE_EXE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nadd_executable(sherpa-onnx-wasm-nodejs sherpa-onnx-wasm-nodejs.cc)\ntarget_link_libraries(sherpa-onnx-wasm-nodejs sherpa-onnx-core sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-nodejs DESTINATION bin/wasm/nodejs)\n\ninstall(\n  FILES\n  ${CMAKE_SOURCE_DIR}/wasm/asr/sherpa-onnx-asr.js\n  ${CMAKE_SOURCE_DIR}/wasm/tts/sherpa-onnx-tts.js\n  ${CMAKE_SOURCE_DIR}/wasm/kws/sherpa-onnx-kws.js\n  ${CMAKE_SOURCE_DIR}/wasm/vad/sherpa-onnx-vad.js\n  ${CMAKE_SOURCE_DIR}/wasm/speaker-diarization/sherpa-onnx-speaker-diarization.js\n  ${CMAKE_SOURCE_DIR}/wasm/speech-enhancement/sherpa-onnx-speech-enhancement.js\n  ${CMAKE_SOURCE_DIR}/wasm/nodejs/sherpa-onnx-wave.js\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-nodejs>/sherpa-onnx-wasm-nodejs.js\"\n    
\"$<TARGET_FILE_DIR:sherpa-onnx-wasm-nodejs>/sherpa-onnx-wasm-nodejs.wasm\"\n  DESTINATION\n    bin/wasm/nodejs\n)\n"
  },
  {
    "path": "wasm/nodejs/sherpa-onnx-wasm-nodejs.cc",
    "content": "// wasm/nodejs/sherpa-onnx-wasm-nodejs.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOfflineTransducerModelConfig) == 3 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineParaformerModelConfig) == 4, \"\");\n\nstatic_assert(sizeof(SherpaOnnxOfflineZipformerCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineWenetCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineOmnilingualAsrCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineMedAsrCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineFireRedAsrCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineFunASRNanoModelConfig) == 13 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineDolphinModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineNemoEncDecCtcModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineWhisperModelConfig) == 7 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineFireRedAsrModelConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineMoonshineModelConfig) == 5 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTdnnModelConfig) == 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineSenseVoiceModelConfig) == 3 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineCanaryModelConfig) == 5 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineLMConfig) == 2 * 4, \"\");\n\nstatic_assert(sizeof(SherpaOnnxOfflineModelConfig) ==\n                  sizeof(SherpaOnnxOfflineTransducerModelConfig) +\n                      sizeof(SherpaOnnxOfflineParaformerModelConfig) +\n                      sizeof(SherpaOnnxOfflineNemoEncDecCtcModelConfig) +\n                      sizeof(SherpaOnnxOfflineWhisperModelConfig) +\n                      sizeof(SherpaOnnxOfflineTdnnModelConfig) + 8 * 4 +\n                      
sizeof(SherpaOnnxOfflineSenseVoiceModelConfig) +\n                      sizeof(SherpaOnnxOfflineMoonshineModelConfig) +\n                      sizeof(SherpaOnnxOfflineFireRedAsrModelConfig) +\n                      sizeof(SherpaOnnxOfflineDolphinModelConfig) +\n                      sizeof(SherpaOnnxOfflineZipformerCtcModelConfig) +\n                      sizeof(SherpaOnnxOfflineCanaryModelConfig) +\n                      sizeof(SherpaOnnxOfflineWenetCtcModelConfig) +\n                      sizeof(SherpaOnnxOfflineOmnilingualAsrCtcModelConfig) +\n                      sizeof(SherpaOnnxOfflineMedAsrCtcModelConfig) +\n                      sizeof(SherpaOnnxOfflineFunASRNanoModelConfig) +\n                      sizeof(SherpaOnnxOfflineFireRedAsrCtcModelConfig),\n\n              \"\");\nstatic_assert(sizeof(SherpaOnnxFeatureConfig) == 2 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineRecognizerConfig) ==\n                  sizeof(SherpaOnnxFeatureConfig) +\n                      sizeof(SherpaOnnxOfflineLMConfig) +\n                      sizeof(SherpaOnnxOfflineModelConfig) + 7 * 4 +\n                      sizeof(SherpaOnnxHomophoneReplacerConfig),\n              \"\");\n\nvoid PrintOfflineTtsConfig(SherpaOnnxOfflineTtsConfig *tts_config) {\n  auto tts_model_config = &tts_config->model;\n  auto vits_model_config = &tts_model_config->vits;\n  auto matcha_model_config = &tts_model_config->matcha;\n  auto kokoro = &tts_model_config->kokoro;\n  auto kitten = &tts_model_config->kitten;\n  auto zipvoice = &tts_model_config->zipvoice;\n  auto pocket = &tts_model_config->pocket;\n  auto supertonic = &tts_model_config->supertonic;\n\n  fprintf(stdout, \"----------vits model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", vits_model_config->model);\n  fprintf(stdout, \"lexicon: %s\\n\", vits_model_config->lexicon);\n  fprintf(stdout, \"tokens: %s\\n\", vits_model_config->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", vits_model_config->data_dir);\n  
fprintf(stdout, \"noise scale: %.3f\\n\", vits_model_config->noise_scale);\n  fprintf(stdout, \"noise scale w: %.3f\\n\", vits_model_config->noise_scale_w);\n  fprintf(stdout, \"length scale: %.3f\\n\", vits_model_config->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", vits_model_config->dict_dir);\n\n  fprintf(stdout, \"----------matcha model config----------\\n\");\n  fprintf(stdout, \"acoustic_model: %s\\n\", matcha_model_config->acoustic_model);\n  fprintf(stdout, \"vocoder: %s\\n\", matcha_model_config->vocoder);\n  fprintf(stdout, \"lexicon: %s\\n\", matcha_model_config->lexicon);\n  fprintf(stdout, \"tokens: %s\\n\", matcha_model_config->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", matcha_model_config->data_dir);\n  fprintf(stdout, \"noise scale: %.3f\\n\", matcha_model_config->noise_scale);\n  fprintf(stdout, \"length scale: %.3f\\n\", matcha_model_config->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", matcha_model_config->dict_dir);\n\n  fprintf(stdout, \"----------kokoro model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", kokoro->model);\n  fprintf(stdout, \"voices: %s\\n\", kokoro->voices);\n  fprintf(stdout, \"tokens: %s\\n\", kokoro->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", kokoro->data_dir);\n  fprintf(stdout, \"length scale: %.3f\\n\", kokoro->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", kokoro->dict_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", kokoro->lexicon);\n  fprintf(stdout, \"lang: %s\\n\", kokoro->lang);\n\n  fprintf(stdout, \"----------kitten model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", kitten->model);\n  fprintf(stdout, \"voices: %s\\n\", kitten->voices);\n  fprintf(stdout, \"tokens: %s\\n\", kitten->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", kitten->data_dir);\n  fprintf(stdout, \"length scale: %.3f\\n\", kitten->length_scale);\n\n  fprintf(stdout, \"----------zipvoice model config----------\\n\");\n  fprintf(stdout, \"tokens: %s\\n\", zipvoice->tokens);\n  
fprintf(stdout, \"encoder: %s\\n\", zipvoice->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", zipvoice->decoder);\n  fprintf(stdout, \"vocoder: %s\\n\", zipvoice->vocoder);\n  fprintf(stdout, \"data_dir: %s\\n\", zipvoice->data_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", zipvoice->lexicon);\n  fprintf(stdout, \"feat scale: %.3f\\n\", zipvoice->feat_scale);\n  fprintf(stdout, \"t_shift: %.3f\\n\", zipvoice->t_shift);\n  fprintf(stdout, \"target_rms: %.3f\\n\", zipvoice->target_rms);\n  fprintf(stdout, \"guidance_scale: %.3f\\n\", zipvoice->guidance_scale);\n\n  fprintf(stdout, \"----------pocketTTS model config----------\\n\");\n  fprintf(stdout, \"lm_flow: %s\\n\", pocket->lm_flow);\n  fprintf(stdout, \"lm_main: %s\\n\", pocket->lm_main);\n  fprintf(stdout, \"encoder: %s\\n\", pocket->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", pocket->decoder);\n  fprintf(stdout, \"text_conditioner: %s\\n\", pocket->text_conditioner);\n  fprintf(stdout, \"vocab_json: %s\\n\", pocket->vocab_json);\n  fprintf(stdout, \"token_scores_json: %s\\n\", pocket->token_scores_json);\n  fprintf(stdout, \"voice_embedding_cache_capacity: %d\\n\",\n          pocket->voice_embedding_cache_capacity);\n\n  fprintf(stdout, \"----------supertonic model config----------\\n\");\n  fprintf(stdout, \"duration_predictor: %s\\n\", supertonic->duration_predictor);\n  fprintf(stdout, \"text_encoder: %s\\n\", supertonic->text_encoder);\n  fprintf(stdout, \"vector_estimator: %s\\n\", supertonic->vector_estimator);\n  fprintf(stdout, \"vocoder: %s\\n\", supertonic->vocoder);\n  fprintf(stdout, \"tts_json: %s\\n\", supertonic->tts_json);\n  fprintf(stdout, \"unicode_indexer: %s\\n\", supertonic->unicode_indexer);\n  fprintf(stdout, \"voice_style: %s\\n\", supertonic->voice_style);\n\n  fprintf(stdout, \"----------tts model config----------\\n\");\n  fprintf(stdout, \"num threads: %d\\n\", tts_model_config->num_threads);\n  fprintf(stdout, \"debug: %d\\n\", tts_model_config->debug);\n  fprintf(stdout, 
\"provider: %s\\n\", tts_model_config->provider);\n\n  fprintf(stdout, \"----------tts config----------\\n\");\n  fprintf(stdout, \"rule_fsts: %s\\n\", tts_config->rule_fsts);\n  fprintf(stdout, \"rule_fars: %s\\n\", tts_config->rule_fars);\n  fprintf(stdout, \"max num sentences: %d\\n\", tts_config->max_num_sentences);\n  fprintf(stdout, \"silence scale: %.3f\\n\", tts_config->silence_scale);\n}\n\nvoid PrintOfflineRecognizerConfig(SherpaOnnxOfflineRecognizerConfig *config) {\n  auto model_config = &config->model_config;\n  auto feat = &config->feat_config;\n  auto transducer = &model_config->transducer;\n  auto paraformer = &model_config->paraformer;\n  auto nemo_ctc = &model_config->nemo_ctc;\n  auto whisper = &model_config->whisper;\n  auto tdnn = &model_config->tdnn;\n  auto sense_voice = &model_config->sense_voice;\n  auto moonshine = &model_config->moonshine;\n  auto fire_red_asr = &model_config->fire_red_asr;\n  auto dolphin = &model_config->dolphin;\n  auto zipformer_ctc = &model_config->zipformer_ctc;\n  auto canary = &model_config->canary;\n  auto wenet_ctc = &model_config->wenet_ctc;\n  auto omnilingual = &model_config->omnilingual;\n  auto medasr = &model_config->medasr;\n  auto funasr_nano = &model_config->funasr_nano;\n  auto fire_red_asr_ctc = &model_config->fire_red_asr_ctc;\n\n  fprintf(stdout, \"----------offline transducer model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", transducer->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", transducer->decoder);\n  fprintf(stdout, \"joiner: %s\\n\", transducer->joiner);\n\n  fprintf(stdout, \"----------offline paraformer model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", paraformer->model);\n\n  fprintf(stdout, \"----------offline nemo_ctc model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", nemo_ctc->model);\n\n  fprintf(stdout, \"----------offline whisper model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", whisper->encoder);\n  
fprintf(stdout, \"decoder: %s\\n\", whisper->decoder);\n  fprintf(stdout, \"language: %s\\n\", whisper->language);\n  fprintf(stdout, \"task: %s\\n\", whisper->task);\n  fprintf(stdout, \"tail_paddings: %d\\n\", whisper->tail_paddings);\n\n  fprintf(stdout, \"----------offline tdnn model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", tdnn->model);\n\n  fprintf(stdout, \"----------offline sense_voice model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", sense_voice->model);\n  fprintf(stdout, \"language: %s\\n\", sense_voice->language);\n  fprintf(stdout, \"use_itn: %d\\n\", sense_voice->use_itn);\n\n  fprintf(stdout, \"----------offline moonshine model config----------\\n\");\n  fprintf(stdout, \"preprocessor: %s\\n\", moonshine->preprocessor);\n  fprintf(stdout, \"encoder: %s\\n\", moonshine->encoder);\n  fprintf(stdout, \"uncached_decoder: %s\\n\", moonshine->uncached_decoder);\n  fprintf(stdout, \"cached_decoder: %s\\n\", moonshine->cached_decoder);\n  fprintf(stdout, \"merged_decoder: %s\\n\", moonshine->merged_decoder);\n\n  fprintf(stdout, \"----------offline FireRedAsr model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", fire_red_asr->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", fire_red_asr->decoder);\n\n  fprintf(stdout, \"----------offline Dolphin model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", dolphin->model);\n\n  fprintf(stdout, \"----------offline zipformer ctc model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", zipformer_ctc->model);\n\n  fprintf(stdout, \"----------offline NeMo Canary model config----------\\n\");\n  fprintf(stdout, \"encoder: %s\\n\", canary->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", canary->decoder);\n  fprintf(stdout, \"src_lang: %s\\n\", canary->src_lang);\n  fprintf(stdout, \"tgt_lang: %s\\n\", canary->tgt_lang);\n  fprintf(stdout, \"use_pnc: %d\\n\", canary->use_pnc);\n\n  fprintf(stdout, \"----------offline wenet ctc model 
config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", wenet_ctc->model);\n\n  fprintf(stdout, \"----------offline Omnilingual ASR model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", omnilingual->model);\n\n  fprintf(stdout, \"----------offline MedASR model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", medasr->model);\n\n  fprintf(stdout, \"----------offline FunASR Nano config----------\\n\");\n  fprintf(stdout, \"encoder_adaptor: %s\\n\", funasr_nano->encoder_adaptor);\n  fprintf(stdout, \"llm: %s\\n\", funasr_nano->llm);\n  fprintf(stdout, \"embedding: %s\\n\", funasr_nano->embedding);\n  fprintf(stdout, \"tokenizer: %s\\n\", funasr_nano->tokenizer);\n  fprintf(stdout, \"system_prompt: %s\\n\", funasr_nano->system_prompt);\n  fprintf(stdout, \"user_prompt: %s\\n\", funasr_nano->user_prompt);\n  fprintf(stdout, \"max_new_tokens: %d\\n\", funasr_nano->max_new_tokens);\n  fprintf(stdout, \"temperature: %f\\n\", funasr_nano->temperature);\n  fprintf(stdout, \"top_p: %f\\n\", funasr_nano->top_p);\n  fprintf(stdout, \"seed: %d\\n\", funasr_nano->seed);\n\n  fprintf(stdout, \"----------offline FireRedASR CTC model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", fire_red_asr_ctc->model);\n\n  fprintf(stdout, \"tokens: %s\\n\", model_config->tokens);\n  fprintf(stdout, \"num_threads: %d\\n\", model_config->num_threads);\n  fprintf(stdout, \"provider: %s\\n\", model_config->provider);\n  fprintf(stdout, \"debug: %d\\n\", model_config->debug);\n  fprintf(stdout, \"model type: %s\\n\", model_config->model_type);\n  fprintf(stdout, \"modeling unit: %s\\n\", model_config->modeling_unit);\n  fprintf(stdout, \"bpe vocab: %s\\n\", model_config->bpe_vocab);\n  fprintf(stdout, \"telespeech_ctc: %s\\n\", model_config->telespeech_ctc);\n\n  fprintf(stdout, \"----------feat config----------\\n\");\n  fprintf(stdout, \"sample rate: %d\\n\", feat->sample_rate);\n  fprintf(stdout, \"feat dim: %d\\n\", feat->feature_dim);\n\n  
fprintf(stdout, \"----------recognizer config----------\\n\");\n  fprintf(stdout, \"decoding method: %s\\n\", config->decoding_method);\n  fprintf(stdout, \"max active paths: %d\\n\", config->max_active_paths);\n  fprintf(stdout, \"hotwords_file: %s\\n\", config->hotwords_file);\n  fprintf(stdout, \"hotwords_score: %.2f\\n\", config->hotwords_score);\n  fprintf(stdout, \"rule_fsts: %s\\n\", config->rule_fsts);\n  fprintf(stdout, \"rule_fars: %s\\n\", config->rule_fars);\n  fprintf(stdout, \"blank_penalty: %f\\n\", config->blank_penalty);\n  fprintf(stdout, \"----------hr config----------\\n\");\n  fprintf(stdout, \"dict_dir: %s\\n\", config->hr.dict_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", config->hr.lexicon);\n  fprintf(stdout, \"rule_fsts: %s\\n\", config->hr.rule_fsts);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/nodejs/sherpa-onnx-wave.js",
    "content": "// return an object\n// {\n//   samples: a float32 array\n//   sampleRate: an integer\n// }\n// or null if the file cannot be read\nfunction readWave(filename, Module) {\n  const filenameLen = Module.lengthBytesUTF8(filename) + 1;\n  const pFilename = Module._malloc(filenameLen);\n  Module.stringToUTF8(filename, pFilename, filenameLen);\n\n  const w = Module._SherpaOnnxReadWave(pFilename);\n  Module._free(pFilename);\n\n  if (w == 0) {\n    console.log(`Failed to read wave from ${filename}`);\n    return null;\n  }\n\n  const samplesPtr = Module.HEAP32[w / 4] / 4;\n  const sampleRate = Module.HEAP32[w / 4 + 1];\n  const numSamples = Module.HEAP32[w / 4 + 2];\n\n  const samples = new Float32Array(numSamples);\n  for (let i = 0; i < numSamples; i++) {\n    samples[i] = Module.HEAPF32[samplesPtr + i];\n  }\n\n  Module._SherpaOnnxFreeWave(w);\n\n  return {samples: samples, sampleRate: sampleRate};\n}\n\nfunction readWaveFromBinaryData(uint8Array, Module) {\n  const numBytes = uint8Array.length * uint8Array.BYTES_PER_ELEMENT;\n  const pointer = Module._malloc(numBytes);\n\n  const dataOnHeap = new Uint8Array(Module.HEAPU8.buffer, pointer, numBytes);\n  dataOnHeap.set(uint8Array);\n\n  const w =\n      Module._SherpaOnnxReadWaveFromBinaryData(dataOnHeap.byteOffset, numBytes);\n\n  // The input buffer is no longer needed, whether or not parsing succeeded\n  Module._free(pointer);\n\n  if (w == 0) {\n    console.log('Failed to read wave from binary data');\n    return null;\n  }\n\n  const samplesPtr = Module.HEAP32[w / 4] / 4;\n  const sampleRate = Module.HEAP32[w / 4 + 1];\n  const numSamples = Module.HEAP32[w / 4 + 2];\n\n  const samples = new Float32Array(numSamples);\n  for (let i = 0; i < numSamples; i++) {\n    samples[i] = Module.HEAPF32[samplesPtr + i];\n  }\n\n  Module._SherpaOnnxFreeWave(w);\n\n  return {samples: samples, sampleRate: sampleRate};\n}\n\n// data is an object\n// {\n//   samples: a float32 array\n//   sampleRate: an integer\n// }\nfunction writeWave(filename, data, Module) {\n  const pSamples =\n      Module._malloc(data.samples.length * data.samples.BYTES_PER_ELEMENT);\n  Module.HEAPF32.set(data.samples, pSamples / 
data.samples.BYTES_PER_ELEMENT);\n\n  const filenameLen = Module.lengthBytesUTF8(filename) + 1;\n  const pFilename = Module._malloc(filenameLen);\n  Module.stringToUTF8(filename, pFilename, filenameLen);\n\n  Module._SherpaOnnxWriteWave(\n      pSamples, data.samples.length, data.sampleRate, pFilename);\n\n  Module._free(pFilename);\n  Module._free(pSamples);\n}\n\nif (typeof process == 'object' && typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    readWave,\n    writeWave,\n    readWaveFromBinaryData,\n  };\n}\n"
  },
  {
    "path": "wasm/speaker-diarization/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-speaker-diarization.sh to build for WASM for speaker diarization\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/segmentation.onnx\" OR NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/embedding.onnx\")\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  MyPrint\n  SherpaOnnxCreateOfflineSpeakerDiarization\n  SherpaOnnxDestroyOfflineSpeakerDiarization\n  SherpaOnnxOfflineSpeakerDiarizationDestroyResult\n  SherpaOnnxOfflineSpeakerDiarizationDestroySegment\n  SherpaOnnxOfflineSpeakerDiarizationGetSampleRate\n  SherpaOnnxOfflineSpeakerDiarizationProcess\n  SherpaOnnxOfflineSpeakerDiarizationProcessWithCallback\n  SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments\n  SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime\n  SherpaOnnxOfflineSpeakerDiarizationSetConfig\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-speaker-diarization sherpa-onnx-wasm-main-speaker-diarization.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-speaker-diarization sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-speaker-diarization DESTINATION bin/wasm/speaker-diarization)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speaker-diarization>/sherpa-onnx-wasm-main-speaker-diarization.js\"\n    \"index.html\"\n    \"sherpa-onnx-speaker-diarization.js\"\n    \"app-speaker-diarization.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speaker-diarization>/sherpa-onnx-wasm-main-speaker-diarization.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speaker-diarization>/sherpa-onnx-wasm-main-speaker-diarization.data\"\n  DESTINATION\n    bin/wasm/speaker-diarization\n)\n"
  },
  {
    "path": "wasm/speaker-diarization/app-speaker-diarization.js",
    "content": "const startBtn = document.getElementById('startBtn');\nconst hint = document.getElementById('hint');\nconst numClustersInput = document.getElementById('numClustersInputID');\nconst thresholdInput = document.getElementById('thresholdInputID');\nconst textArea = document.getElementById('text');\n\nconst fileSelectCtrl = document.getElementById('file');\n\nlet sd = null;\nlet float32Samples = null;\n\nModule = {};\nModule.onRuntimeInitialized = function() {\n  console.log('Model files downloaded!');\n\n  console.log('Initializing speaker diarization ......');\n  sd = createOfflineSpeakerDiarization(Module)\n  console.log('sampleRate', sd.sampleRate);\n\n  hint.innerText =\n      'Initialized! Please select a wave file and click the Start button.';\n\n  fileSelectCtrl.disabled = false;\n};\n\nfunction onFileChange() {\n  var files = document.getElementById('file').files;\n\n  if (files.length == 0) {\n    console.log('No file selected');\n    float32Samples = null;\n    startBtn.disabled = true;\n    return;\n  }\n  textArea.value = '';\n\n  console.log('files: ' + files);\n\n  const file = files[0];\n  console.log(file);\n  console.log('file.name ' + file.name);\n  console.log('file.type ' + file.type);\n  console.log('file.size ' + file.size);\n\n  let audioCtx = new AudioContext({sampleRate: sd.sampleRate});\n\n  let reader = new FileReader();\n  reader.onload = function() {\n    console.log('reading file!');\n    audioCtx.decodeAudioData(reader.result, decodedDone);\n  };\n\n  function decodedDone(decoded) {\n    let typedArray = new Float32Array(decoded.length);\n    float32Samples = decoded.getChannelData(0);\n\n    startBtn.disabled = false;\n  }\n\n  reader.readAsArrayBuffer(file);\n}\n\nstartBtn.onclick = function() {\n  textArea.value = '';\n  if (float32Samples == null) {\n    alert('Empty audio samples!');\n\n    startBtn.disabled = true;\n    return;\n  }\n\n  let numClusters = numClustersInput.value;\n  if (numClusters.trim().length == 0) 
{\n    alert(\n        'Please provide numClusters. Use -1 if you are not sure how many speakers there are');\n    return;\n  }\n\n  // Accept an optional leading '-' so the default value -1 passes\n  if (!numClusters.trim().match(/^-?\\d+$/)) {\n    alert(`Number of clusters ${\n        numClusters} is not an integer.\\nPlease enter an integer`);\n    return;\n  }\n  numClusters = parseInt(numClusters, 10);\n  if (numClusters < -1) {\n    alert(`Number of clusters should be >= -1`);\n    return;\n  }\n\n  let threshold = 0.5;\n  if (numClusters <= 0) {\n    threshold = thresholdInput.value;\n    if (threshold.trim().length == 0) {\n      alert('Please provide a threshold.');\n      return;\n    }\n\n    threshold = parseFloat(threshold);\n    if (isNaN(threshold) || threshold < 0) {\n      alert(`Please enter a non-negative threshold`);\n      return;\n    }\n  }\n\n  let config = sd.config;\n  config.clustering = {numClusters: numClusters, threshold: threshold};\n  sd.setConfig(config);\n  let segments = sd.process(float32Samples);\n  if (segments == null || segments.length == 0) {\n    textArea.value = 'No speakers detected';\n    return;\n  }\n\n  let s = '';\n  let sep = '';\n\n  for (const seg of segments) {\n    // clang-format off\n    s += sep + `${seg.start.toFixed(2)} -- ${seg.end.toFixed(2)} speaker_${seg.speaker}`;\n    // clang-format on\n    sep = '\\n';\n  }\n  textArea.value = s;\n};\n"
  },
  {
    "path": "wasm/speaker-diarization/assets/README.md",
    "content": "# Introduction\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models\nto download a speaker segmentation model\nand\nrefer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models\nto download a speaker embedding extraction model.\n\nRemember to rename the downloaded files.\n\nThe following is an example.\n\n```bash\ncd wasm/speaker-diarization/assets/\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ntar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\nrm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2\ncp sherpa-onnx-pyannote-segmentation-3-0/model.onnx ./segmentation.onnx\nrm -rf sherpa-onnx-pyannote-segmentation-3-0\n\ncurl -SL -O https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx\nmv 3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx ./embedding.onnx\n```\n"
  },
  {
    "path": "wasm/speaker-diarization/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for Speaker Diarization</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n  </style>\n</head>\n\n<body>\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    Speaker Diarization <br> with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a>\n  </h1>\n  <div>\n    <span id=\"hint\">Loading model ... ...</span>\n    <br/>\n    <br/>\n    <label for=\"file\">Choose a wav file:</label>\n    <input type=\"file\" id=\"file\" accept=\".wav\" onchange=\"onFileChange()\" disabled />\n    <br/>\n    <br/>\n    <label for=\"numClustersInputID\" id=\"numClustersID\">Number of speakers: </label>\n    <input type=\"text\" id=\"numClustersInputID\" name=\"numClusters\" value=\"-1\" />\n    <br/>\n    <br/>\n    <label for=\"thresholdInputID\" id=\"thresholdID\">Clustering threshold: </label>\n    <input type=\"text\" id=\"thresholdInputID\" name=\"clusteringThreshold\" value=\"0.5\" />\n    <br/>\n    <br/>\n\n    <textarea id=\"text\" rows=\"10\" placeholder=\"If you know the actual number of speakers in the input wave file, please provide it via Number of speakers. Otherwise, please leave Number of speakers at -1 and provide Clustering threshold instead. A larger threshold leads to fewer clusters, i.e., fewer speakers; a smaller threshold leads to more clusters, i.e., more speakers.\"></textarea>\n    <br/>\n    <br/>\n    <button id=\"startBtn\" disabled>Start</button>\n  </div>\n\n  <script src=\"app-speaker-diarization.js\"></script>\n  <script src=\"sherpa-onnx-speaker-diarization.js\"></script>\n  <script src=\"sherpa-onnx-wasm-main-speaker-diarization.js\"></script>\n</body>\n\n</html>\n"
  },
  {
    "path": "wasm/speaker-diarization/sherpa-onnx-speaker-diarization.js",
    "content": "\nfunction freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('config' in config) {\n    freeConfig(config.config, Module)\n  }\n\n  if ('segmentation' in config) {\n    freeConfig(config.segmentation, Module)\n  }\n\n  if ('embedding' in config) {\n    freeConfig(config.embedding, Module)\n  }\n\n  if ('clustering' in config) {\n    freeConfig(config.clustering, Module)\n  }\n\n  Module._free(config.ptr);\n}\n\nfunction initSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(\n    config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const n = modelLen;\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineSpeakerSegmentationModelConfig(config, Module) {\n  if (!('pyannote' in config)) {\n    config.pyannote = {\n      model: '',\n    };\n  }\n\n  const pyannote = initSherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig(\n      config.pyannote, Module);\n\n  const len = pyannote.len + 3 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(pyannote.ptr, pyannote.len, ptr + offset);\n  offset += pyannote.len;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug || 0, 'i32');\n  offset += 4;\n\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const buffer = Module._malloc(providerLen);\n  Module.stringToUTF8(config.provider || 'cpu', buffer, providerLen);\n  Module.setValue(ptr + offset, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    config: pyannote,\n  
};\n}\n\nfunction initSherpaOnnxSpeakerEmbeddingExtractorConfig(config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const n = modelLen + providerLen;\n  const buffer = Module._malloc(n);\n\n  const len = 4 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  Module.stringToUTF8(config.provider || 'cpu', buffer + offset, providerLen);\n  offset += providerLen;\n\n  offset = 0\n  Module.setValue(ptr + offset, buffer, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug || 0, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + modelLen, 'i8*');\n  offset += 4;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxFastClusteringConfig(config, Module) {\n  const len = 2 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.setValue(ptr + offset, config.numClusters || -1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.threshold || 0.5, 'float');\n  offset += 4;\n\n  return {\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineSpeakerDiarizationConfig(config, Module) {\n  if (!('segmentation' in config)) {\n    config.segmentation = {\n      pyannote: {model: ''},\n      numThreads: 1,\n      debug: 0,\n      provider: 'cpu',\n    };\n  }\n\n  if (!('embedding' in config)) {\n    config.embedding = {\n      model: '',\n      numThreads: 1,\n      debug: 0,\n      provider: 'cpu',\n    };\n  }\n\n  if (!('clustering' in config)) {\n    config.clustering = {\n      numClusters: -1,\n      threshold: 0.5,\n    };\n  }\n\n  const segmentation = initSherpaOnnxOfflineSpeakerSegmentationModelConfig(\n      config.segmentation, Module);\n\n  
const embedding =\n      initSherpaOnnxSpeakerEmbeddingExtractorConfig(config.embedding, Module);\n\n  const clustering =\n      initSherpaOnnxFastClusteringConfig(config.clustering, Module);\n\n  const len = segmentation.len + embedding.len + clustering.len + 2 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(segmentation.ptr, segmentation.len, ptr + offset);\n  offset += segmentation.len;\n\n  Module._CopyHeap(embedding.ptr, embedding.len, ptr + offset);\n  offset += embedding.len;\n\n  Module._CopyHeap(clustering.ptr, clustering.len, ptr + offset);\n  offset += clustering.len;\n\n  Module.setValue(ptr + offset, config.minDurationOn || 0.2, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.minDurationOff || 0.5, 'float');\n  offset += 4;\n\n  return {\n    ptr: ptr,\n    len: len,\n    segmentation: segmentation,\n    embedding: embedding,\n    clustering: clustering,\n  };\n}\n\nclass OfflineSpeakerDiarization {\n  constructor(configObj, Module) {\n    const config =\n        initSherpaOnnxOfflineSpeakerDiarizationConfig(configObj, Module);\n    // Module._MyPrint(config.ptr);\n\n    const handle =\n        Module._SherpaOnnxCreateOfflineSpeakerDiarization(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.sampleRate =\n        Module._SherpaOnnxOfflineSpeakerDiarizationGetSampleRate(this.handle);\n    this.Module = Module;\n\n    this.config = configObj;\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyOfflineSpeakerDiarization(this.handle);\n    this.handle = 0;\n  }\n\n  setConfig(configObj) {\n    if (!('clustering' in configObj)) {\n      return;\n    }\n\n    const config =\n        initSherpaOnnxOfflineSpeakerDiarizationConfig(configObj, this.Module);\n\n    this.Module._SherpaOnnxOfflineSpeakerDiarizationSetConfig(\n        this.handle, config.ptr);\n\n    freeConfig(config, this.Module);\n\n    this.config.clustering = configObj.clustering;\n  }\n\n  
process(samples) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n\n    let r = this.Module._SherpaOnnxOfflineSpeakerDiarizationProcess(\n        this.handle, pointer, samples.length);\n    this.Module._free(pointer);\n\n    let numSegments =\n        this.Module._SherpaOnnxOfflineSpeakerDiarizationResultGetNumSegments(r);\n\n    let segments =\n        this.Module._SherpaOnnxOfflineSpeakerDiarizationResultSortByStartTime(\n            r);\n\n    let ans = [];\n\n    let sizeOfSegment = 3 * 4;\n    for (let i = 0; i < numSegments; ++i) {\n      let p = segments + i * sizeOfSegment\n\n      let start = this.Module.HEAPF32[p / 4 + 0];\n      let end = this.Module.HEAPF32[p / 4 + 1];\n      let speaker = this.Module.HEAP32[p / 4 + 2];\n\n      ans.push({start: start, end: end, speaker: speaker});\n    }\n\n    this.Module._SherpaOnnxOfflineSpeakerDiarizationDestroySegment(segments);\n    this.Module._SherpaOnnxOfflineSpeakerDiarizationDestroyResult(r);\n\n    return ans;\n  }\n}\n\nfunction createOfflineSpeakerDiarization(Module, myConfig) {\n  let config = {\n    segmentation: {\n      pyannote: {model: './segmentation.onnx'},\n      debug: 1,\n    },\n    embedding: {\n      model: './embedding.onnx',\n      debug: 1,\n    },\n    clustering: {numClusters: -1, threshold: 0.5},\n    minDurationOn: 0.3,\n    minDurationOff: 0.5,\n  };\n\n  if (myConfig) {\n    config = myConfig;\n  }\n\n  return new OfflineSpeakerDiarization(config, Module);\n}\n\nif (typeof process == 'object' && typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createOfflineSpeakerDiarization,\n  };\n}\n"
  },
  {
    "path": "wasm/speaker-diarization/sherpa-onnx-wasm-main-speaker-diarization.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-speaker-diarization.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig) ==\n                  1 * 4,\n              \"\");\n\nstatic_assert(\n    sizeof(SherpaOnnxOfflineSpeakerSegmentationModelConfig) ==\n        sizeof(SherpaOnnxOfflineSpeakerSegmentationPyannoteModelConfig) + 3 * 4,\n    \"\");\n\nstatic_assert(sizeof(SherpaOnnxFastClusteringConfig) == 2 * 4, \"\");\n\nstatic_assert(sizeof(SherpaOnnxSpeakerEmbeddingExtractorConfig) == 4 * 4, \"\");\n\nstatic_assert(sizeof(SherpaOnnxOfflineSpeakerDiarizationConfig) ==\n                  sizeof(SherpaOnnxOfflineSpeakerSegmentationModelConfig) +\n                      sizeof(SherpaOnnxSpeakerEmbeddingExtractorConfig) +\n                      sizeof(SherpaOnnxFastClusteringConfig) + 2 * 4,\n              \"\");\n\nvoid MyPrint(const SherpaOnnxOfflineSpeakerDiarizationConfig *sd_config) {\n  const auto &segmentation = sd_config->segmentation;\n  const auto &embedding = sd_config->embedding;\n  const auto &clustering = sd_config->clustering;\n\n  fprintf(stdout, \"----------segmentation config----------\\n\");\n  fprintf(stdout, \"pyannote model: %s\\n\", segmentation.pyannote.model);\n  fprintf(stdout, \"num threads: %d\\n\", segmentation.num_threads);\n  fprintf(stdout, \"debug: %d\\n\", segmentation.debug);\n  fprintf(stdout, \"provider: %s\\n\", segmentation.provider);\n\n  fprintf(stdout, \"----------embedding config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", embedding.model);\n  fprintf(stdout, \"num threads: %d\\n\", embedding.num_threads);\n  fprintf(stdout, \"debug: %d\\n\", embedding.debug);\n  fprintf(stdout, \"provider: %s\\n\", 
embedding.provider);\n\n  fprintf(stdout, \"----------clustering config----------\\n\");\n  fprintf(stdout, \"num_clusters: %d\\n\", clustering.num_clusters);\n  fprintf(stdout, \"threshold: %.3f\\n\", clustering.threshold);\n\n  fprintf(stdout, \"min_duration_on: %.3f\\n\", sd_config->min_duration_on);\n  fprintf(stdout, \"min_duration_off: %.3f\\n\", sd_config->min_duration_off);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/speech-enhancement/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-speech-enhancement.sh to build for wasm speech enhancement\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/gtcrn.onnx\")\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  MyPrint\n  SherpaOnnxCreateOfflineSpeechDenoiser\n  SherpaOnnxCreateOnlineSpeechDenoiser\n  SherpaOnnxDestroyDenoisedAudio\n  SherpaOnnxDestroyOfflineSpeechDenoiser\n  SherpaOnnxDestroyOnlineSpeechDenoiser\n  SherpaOnnxFreeWave\n  SherpaOnnxOfflineSpeechDenoiserGetSampleRate\n  SherpaOnnxOfflineSpeechDenoiserRun\n  SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples\n  SherpaOnnxOnlineSpeechDenoiserGetSampleRate\n  SherpaOnnxOnlineSpeechDenoiserRun\n  SherpaOnnxOnlineSpeechDenoiserFlush\n  SherpaOnnxOnlineSpeechDenoiserReset\n  SherpaOnnxReadWave\n  SherpaOnnxReadWaveFromBinaryData\n  SherpaOnnxWriteWave\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=128MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-speech-enhancement sherpa-onnx-wasm-main-speech-enhancement.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-speech-enhancement sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-speech-enhancement DESTINATION bin/wasm/speech-enhancement)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speech-enhancement>/sherpa-onnx-wasm-main-speech-enhancement.js\"\n    \"index.html\"\n    \"sherpa-onnx-speech-enhancement.js\"\n    \"../nodejs/sherpa-onnx-wave.js\"\n    \"app-speech-enhancement.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speech-enhancement>/sherpa-onnx-wasm-main-speech-enhancement.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-speech-enhancement>/sherpa-onnx-wasm-main-speech-enhancement.data\"\n  DESTINATION\n    bin/wasm/speech-enhancement\n)\n"
  },
  {
    "path": "wasm/speech-enhancement/app-speech-enhancement.js",
    "content": "\nconst fileInput = document.getElementById('fileInput');\n\nlet speech_denoiser = null;\nconst inAudioPlayback = document.getElementById('inAudioPlayback');\nconst outAudioPlayback = document.getElementById('outAudioPlayback');\n\nModule = {};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.locateFile = function(path, scriptDirectory = '') {\n  console.log(`path: ${path}, scriptDirectory: ${scriptDirectory}`);\n  return scriptDirectory + path;\n};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.setStatus = function(status) {\n  console.log(`status ${status}`);\n  const statusElement = document.getElementById('status');\n  statusElement.textContent = status;\n  if (status === '') {\n    statusElement.style.display = 'none';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.remove('loading');\n    });\n  } else {\n    statusElement.style.display = 'block';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.add('loading');\n    });\n  }\n};\n\nModule.onRuntimeInitialized = function() {\n  console.log('Model files downloaded!');\n\n  console.log('Initializing speech denoiser ......');\n  speech_denoiser = createOfflineSpeechDenoiser(Module)\n};\n\nasync function process(wave) {\n  let denoised = speech_denoiser.run(wave.samples, wave.sampleRate);\n  console.log(denoised);\n\n  let int16Samples = new Int16Array(denoised.samples.length);\n  for (var i = 0; i < denoised.samples.length; ++i) {\n    let s = denoised.samples[i];\n    if (s >= 1)\n      s = 1;\n    else if (s <= -1)\n      s = -1;\n\n    int16Samples[i] = s * 32767;\n  }\n\n  let blob = toWav(int16Samples, denoised.sampleRate);\n  const objectUrl = URL.createObjectURL(blob);\n  console.log(objectUrl);\n\n  outAudioPlayback.src = objectUrl;\n  outAudioPlayback.controls = true;\n  
outAudioPlayback.style.display = 'block';\n}\n\nfileInput.addEventListener('change', function(event) {\n  if (!event.target.files || !event.target.files[0]) {\n    console.log('No file selected.');\n    return;\n  }\n\n  const file = event.target.files[0];\n  console.log('Selected file:', file.name, file.type, file.size, 'bytes');\n  const reader = new FileReader();\n  reader.onload = function(ev) {\n    console.log('FileReader onload called.');\n    const arrayBuffer = ev.target.result;\n    console.log('ArrayBuffer length:', arrayBuffer.byteLength);\n\n    const uint8Array = new Uint8Array(arrayBuffer);\n    const wave = readWaveFromBinaryData(uint8Array, Module);\n    if (wave == null) {\n      alert(\n          `${file.name} is not a valid .wav file. Please select a *.wav file`);\n      return;\n    }\n\n\n    var url = URL.createObjectURL(file);\n    console.log(`url: ${url}`);\n    inAudioPlayback.src = url;\n    inAudioPlayback.style.display = 'block';\n\n    process(wave)\n    console.log('process done')\n  };\n  reader.onerror = function(err) {\n    console.error('FileReader error:', err);\n  };\n  console.log('Starting FileReader.readAsArrayBuffer...');\n  reader.readAsArrayBuffer(file);\n});\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples, sampleRate) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint32(20, 1, 
true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, sampleRate, true);          // sampleRate\n  view.setUint32(28, sampleRate * 2, true);      // byteRate\n  view.setUint16(32, 2, true);                   // blockAlign\n  view.setUint16(34, 16, true);                  // bitsPerSample\n  view.setUint32(36, 0x61746164, true);          // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);  // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n"
  },
  {
    "path": "wasm/speech-enhancement/assets/README.md",
    "content": "# Introduction\n\n## Huggingface space\n\nYou can visit https://huggingface.co/spaces/k2-fsa/wasm-speech-enhancement-gtcrn\nto try it in your browser without building or installing anything.\n\nYou can also visit\nhttps://modelscope.cn/studios/csukuangfj/wasm-speech-enhancement-gtcrn\n\n## Usage\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models\nto download a model.\n\nThe following is an example:\n\n```bash\ncd sherpa-onnx/wasm/speech-enhancement/assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx\n\nmv gtcrn_simple.onnx gtcrn.onnx\n```\n\nYou should have the following files in `assets` before you can run\n`build-wasm-simd-speech-enhancement.sh`\n\n```\n(py38) fangjuns-MacBook-Pro:assets fangjun$ tree .\n.\n├── README.md\n└── gtcrn.onnx\n\n0 directories, 2 files\n(py38) fangjuns-MacBook-Pro:assets fangjun$ ls -lh\ntotal 1056\n-rw-r--r--  1 fangjun  staff   466B Mar 12 16:13 README.md\n-rw-r--r--  1 fangjun  staff   523K Mar 12 16:14 gtcrn.onnx\n```\n"
  },
  {
    "path": "wasm/speech-enhancement/index.html",
    "content": "<html lang=\"en\">\n\n<!--\nThe UI code is modified from\nhttps://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\n-->\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for speech enhancement</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n    .loading {\n      display: none !important;\n    }\n  </style>\n</head>\n\n<body>\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    Speech Enhancement with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a><br/>\n    using <a href=\"https://github.com/Xiaobin-Rong/gtcrn\">GTCRN</a>\n  </h1>\n\n  <div id=\"status\">Loading...</div>\n\n  <div id=\"singleAudioContent\" class=\"tab-content loading\">\n    <div style=\"display: flex; gap: 1.5rem;\">\n      <!-- Input Section -->\n      <div style=\"flex: 1; display: flex; flex-direction: column; gap: 1rem;\">\n        <div style=\"font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; display: flex; align-items: center; gap: 0.5rem; color: #6c757d;\">\n          <span style=\"line-height: 1;\">🎵</span> Input\n        </div>\n\n        <!-- Drag and Drop / File Upload -->\n        <div id=\"dropzone\" style=\"border: 2px dashed #ced4da; border-radius: 8px; padding: 2rem; text-align: center; color: #6c757d; cursor: pointer; background-color: #f8f9fa; transition: background-color 0.3s, border-color 0.3s; position: relative;\">\n          <input type=\"file\" id=\"fileInput\" accept=\".wav\" style=\"position: absolute; top: 0; left: 0; opacity: 0; width: 100%; height: 100%; cursor: pointer;\" />\n          <p style=\"margin: 0;\">Drop Audio Here (*.wav)<br>- or -<br>Click to Upload</p>\n        </div>\n        <audio id=\"inAudioPlayback\" controls style=\"display: none; margin-top: 1rem; width: 100%;\"></audio>\n      </div>\n    
</div>\n\n    <div style=\"display: flex; gap: 1.5rem;\">\n      <!-- Output Section -->\n      <div style=\"flex: 1; display: flex; flex-direction: column; gap: 1rem;\">\n        <div style=\"font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; display: flex; align-items: center; gap: 0.5rem; color: #6c757d;\">\n        <span style=\"line-height: 1;\">🎵</span> Output\n      </div>\n        <audio id=\"outAudioPlayback\" controls style=\"display: none; margin-top: 1rem; width: 100%;\"></audio>\n    </div>\n  </div>\n\n  <!-- Footer Section -->\n  <div style=\"width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;\">\n    <h3>Description</h3>\n    <ul>\n      <li>Everything is <strong>open-sourced.</strong> <a href=\"https://github.com/k2-fsa/sherpa-onnx\">code</a></li>\n      <li>The model is from <a href=\"https://github.com/Xiaobin-Rong/gtcrn\">GTCRN</a></li>\n      <li>Please upload .wav files</li>\n        <ul>\n          <li>You can download noisy test wave files from <a href=\"https://htmlpreview.github.io/?https://github.com/Xiaobin-Rong/gtcrn_demo/blob/main/index.html\">https://htmlpreview.github.io/?https://github.com/Xiaobin-Rong/gtcrn_demo/blob/main/index.html</a></li>\n        </ul>\n      <li>If you have any issues, please either <a href=\"https://github.com/k2-fsa/sherpa-onnx/issues\">file a ticket</a> or contact us via</li>\n        <ul>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#wechat\">WeChat group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#qq\">QQ group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b\">Bilibili</a></li>\n        </ul>\n    </ul>\n    <h3>About This Demo</h3>\n    <ul>\n      <li><strong>Private and Secure:</strong> 
All processing is done locally on your device (CPU) within your browser with a single thread. No server is involved, ensuring privacy and security. You can disconnect from the Internet once this page is loaded.</li>\n      <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving system resources available for webLLM analysis.</li>\n    </ul>\n    <h3>Latest Update</h3>\n    <ul>\n      <li>First working version.</li>\n    </ul>\n\n    <h3>Acknowledgement</h3>\n    <ul>\n      <li>We refer to <a href=\"https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a> for the UI part.</li>\n    </ul>\n  </div>\n\n  <script src=\"app-speech-enhancement.js\"></script>\n  <script src=\"sherpa-onnx-wave.js\"></script>\n  <script src=\"sherpa-onnx-speech-enhancement.js\"></script>\n  <script src=\"sherpa-onnx-wasm-main-speech-enhancement.js\"></script>\n</body>\n"
  },
  {
    "path": "wasm/speech-enhancement/sherpa-onnx-speech-enhancement.js",
    "content": "function freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('config' in config) {\n    freeConfig(config.config, Module)\n  }\n\n  if ('gtcrn' in config) {\n    freeConfig(config.gtcrn, Module)\n  }\n\n  if ('dpdfnet' in config) {\n    freeConfig(config.dpdfnet, Module)\n  }\n\n  Module._free(config.ptr);\n}\n\nfunction initSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(config, Module) {\n  if (!('model' in config)) {\n    config.model = '';\n  }\n\n  const modelLen = Module.lengthBytesUTF8(config.model) + 1;\n\n  const n = modelLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 1 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model, buffer + offset, modelLen);\n  offset += modelLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += modelLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(config, Module) {\n  if (!('model' in config)) {\n    config.model = '';\n  }\n\n  const modelLen = Module.lengthBytesUTF8(config.model) + 1;\n  const n = modelLen;\n  const buffer = Module._malloc(n);\n  const len = 1 * 4;\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model, buffer, modelLen);\n  Module.setValue(ptr, buffer, 'i8*');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineSpeechDenoiserModelConfig(config, Module) {\n  if (!('gtcrn' in config)) {\n    config.gtcrn = {model: ''};\n  }\n\n  if (!('dpdfnet' in config)) {\n    config.dpdfnet = {model: ''};\n  }\n\n  const gtcrn =\n      initSherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig(config.gtcrn, Module);\n  const dpdfnet =\n      initSherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig(\n          config.dpdfnet, Module);\n\n  const len = gtcrn.len + 3 * 4 + dpdfnet.len;\n  const ptr = 
Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(gtcrn.ptr, gtcrn.len, ptr + offset);\n  offset += gtcrn.len;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug || 0, 'i32');\n  offset += 4;\n\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const buffer = Module._malloc(providerLen);\n  Module.stringToUTF8(config.provider || 'cpu', buffer, providerLen);\n  Module.setValue(ptr + offset, buffer, 'i8*');\n  offset += 4;\n\n  Module._CopyHeap(dpdfnet.ptr, dpdfnet.len, ptr + offset);\n  offset += dpdfnet.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    gtcrn: gtcrn,\n    dpdfnet: dpdfnet,\n  };\n}\n\nfunction initSherpaOnnxOfflineSpeechDenoiserConfig(config, Module) {\n  if (!('model' in config)) {\n    config.model = {\n      gtcrn: {model: ''},\n      dpdfnet: {model: ''},\n      provider: 'cpu',\n      debug: 1,\n      numThreads: 1,\n    };\n  }\n\n  const modelConfig =\n      initSherpaOnnxOfflineSpeechDenoiserModelConfig(config.model, Module);\n  const len = modelConfig.len;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(modelConfig.ptr, modelConfig.len, ptr + offset);\n  offset += modelConfig.len;\n\n  return {\n    ptr: ptr,\n    len: len,\n    config: modelConfig,\n  };\n}\n\nfunction copyDenoisedAudio(handle, Module) {\n  const numSamples = Module.HEAP32[handle / 4 + 1];\n  const denoisedSampleRate = Module.HEAP32[handle / 4 + 2];\n  const samplesPtr = Module.HEAP32[handle / 4] / 4;\n  const denoisedSamples = new Float32Array(numSamples);\n  for (let i = 0; i < numSamples; i++) {\n    denoisedSamples[i] = Module.HEAPF32[samplesPtr + i];\n  }\n\n  Module._SherpaOnnxDestroyDenoisedAudio(handle);\n  return {samples: denoisedSamples, sampleRate: denoisedSampleRate};\n}\n\nclass SpeechDenoiserBase {\n  constructor(Module) {\n    this.Module = Module;\n  }\n\n  save(filename, audio) {\n   
 const samples = audio.samples;\n    const sampleRate = audio.sampleRate;\n    const ptr = this.Module._malloc(samples.length * 4);\n    for (let i = 0; i < samples.length; i++) {\n      this.Module.HEAPF32[ptr / 4 + i] = samples[i];\n    }\n\n    const filenameLen = this.Module.lengthBytesUTF8(filename) + 1;\n    const buffer = this.Module._malloc(filenameLen);\n    this.Module.stringToUTF8(filename, buffer, filenameLen);\n    this.Module._SherpaOnnxWriteWave(ptr, samples.length, sampleRate, buffer);\n    this.Module._free(buffer);\n    this.Module._free(ptr);\n  }\n}\n\nclass OfflineSpeechDenoiser extends SpeechDenoiserBase {\n  constructor(configObj, Module) {\n    super(Module);\n    const config = initSherpaOnnxOfflineSpeechDenoiserConfig(configObj, Module);\n    const handle = Module._SherpaOnnxCreateOfflineSpeechDenoiser(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.sampleRate =\n        Module._SherpaOnnxOfflineSpeechDenoiserGetSampleRate(this.handle);\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyOfflineSpeechDenoiser(this.handle);\n    this.handle = 0;\n  }\n\n  run(samples, sampleRate) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n    const h = this.Module._SherpaOnnxOfflineSpeechDenoiserRun(\n        this.handle, pointer, samples.length, sampleRate);\n    this.Module._free(pointer);\n\n    return copyDenoisedAudio(h, this.Module);\n  }\n}\n\nclass OnlineSpeechDenoiser extends SpeechDenoiserBase {\n  constructor(configObj, Module) {\n    super(Module);\n    const config = initSherpaOnnxOfflineSpeechDenoiserConfig(configObj, Module);\n    const handle = Module._SherpaOnnxCreateOnlineSpeechDenoiser(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.sampleRate =\n        Module._SherpaOnnxOnlineSpeechDenoiserGetSampleRate(this.handle);\n    
this.frameShiftInSamples =\n        Module._SherpaOnnxOnlineSpeechDenoiserGetFrameShiftInSamples(\n            this.handle);\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyOnlineSpeechDenoiser(this.handle);\n    this.handle = 0;\n  }\n\n  run(samples, sampleRate) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n    const h = this.Module._SherpaOnnxOnlineSpeechDenoiserRun(\n        this.handle, pointer, samples.length, sampleRate);\n    this.Module._free(pointer);\n\n    return copyDenoisedAudio(h, this.Module);\n  }\n\n  flush() {\n    const h = this.Module._SherpaOnnxOnlineSpeechDenoiserFlush(this.handle);\n    return copyDenoisedAudio(h, this.Module);\n  }\n\n  reset() {\n    this.Module._SherpaOnnxOnlineSpeechDenoiserReset(this.handle);\n  }\n}\n\nfunction createOfflineSpeechDenoiser(Module, myConfig) {\n  let config = {\n    model: {\n      gtcrn: {model: './gtcrn.onnx'},\n      debug: 0,\n    },\n  };\n\n  if (myConfig) {\n    config = myConfig;\n  }\n\n  return new OfflineSpeechDenoiser(config, Module);\n}\n\nfunction createOnlineSpeechDenoiser(Module, myConfig) {\n  let config = {\n    model: {\n      gtcrn: {model: './gtcrn.onnx'},\n      debug: 0,\n    },\n  };\n\n  if (myConfig) {\n    config = myConfig;\n  }\n\n  return new OnlineSpeechDenoiser(config, Module);\n}\n\nif (typeof process == 'object' && typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createOfflineSpeechDenoiser,\n    createOnlineSpeechDenoiser,\n  };\n}\n"
  },
  {
    "path": "wasm/speech-enhancement/sherpa-onnx-wasm-main-speech-enhancement.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-speech-enhancement.cc\n//\n// Copyright (c)  2025  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig) == 1 * 4,\n              \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig) ==\n                  1 * 4,\n              \"\");\nstatic_assert(\n    sizeof(SherpaOnnxOfflineSpeechDenoiserModelConfig) ==\n        sizeof(SherpaOnnxOfflineSpeechDenoiserGtcrnModelConfig) +\n            sizeof(SherpaOnnxOfflineSpeechDenoiserDpdfNetModelConfig) + 3 * 4,\n    \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineSpeechDenoiserConfig) ==\n                  sizeof(SherpaOnnxOfflineSpeechDenoiserModelConfig),\n              \"\");\n\nvoid MyPrint(SherpaOnnxOfflineSpeechDenoiserConfig *config) {\n  auto model = &config->model;\n  auto gtcrn = &model->gtcrn;\n  auto dpdfnet = &model->dpdfnet;\n  fprintf(stdout, \"----------offline speech denoiser model config----------\\n\");\n  fprintf(stdout, \"gtcrn: %s\\n\", gtcrn->model);\n  fprintf(stdout, \"dpdfnet: %s\\n\", dpdfnet->model);\n  fprintf(stdout, \"num threads: %d\\n\", model->num_threads);\n  fprintf(stdout, \"debug: %d\\n\", model->debug);\n  fprintf(stdout, \"provider: %s\\n\", model->provider);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/tts/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-tts.sh to build for wasm TTS\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/tokens.txt\" AND\n   NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/lm_flow.int8.onnx\")\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  MyPrint\n  SherpaOnnxCreateOfflineTts\n  SherpaOnnxDestroyOfflineTts\n  SherpaOnnxDestroyOfflineTtsGeneratedAudio\n  SherpaOnnxOfflineTtsGenerate\n  SherpaOnnxOfflineTtsGenerateWithCallback\n  SherpaOnnxOfflineTtsGenerateWithConfig\n  SherpaOnnxOfflineTtsNumSpeakers\n  SherpaOnnxOfflineTtsSampleRate\n  SherpaOnnxWriteWave\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sALLOW_TABLE_GROWTH \")\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64','addFunction','removeFunction'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-tts sherpa-onnx-wasm-main-tts.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-tts sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-tts DESTINATION bin/wasm/tts)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-tts>/sherpa-onnx-wasm-main-tts.js\"\n    \"index.html\"\n    \"sherpa-onnx-tts.js\"\n    \"sherpa-onnx-tts.worker.js\"\n    \"app-tts.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-tts>/sherpa-onnx-wasm-main-tts.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-tts>/sherpa-onnx-wasm-main-tts.data\"\n  DESTINATION\n    bin/wasm/tts\n)\n"
  },
  {
    "path": "wasm/tts/app-tts.js",
    "content": "const generateBtn = document.getElementById('generateBtn');\nconst speakerIdLabel = document.getElementById('speakerIdLabel');\nconst speakerIdInput = document.getElementById('speakerId');\nconst speakerIdSection = document.getElementById('speakerIdSection');\nconst referenceAudioSection = document.getElementById('referenceAudioSection');\nconst referenceTextSection = document.getElementById('referenceTextSection');\nconst referenceAudioInput = document.getElementById('referenceAudio');\nconst referenceTextInput = document.getElementById('referenceText');\nconst speedInput = document.getElementById('speed');\nconst speedValue = document.getElementById('speedValue');\nconst textArea = document.getElementById('text');\nconst soundClips = document.getElementById('sound-clips');\nconst statusElement = document.getElementById('status');\nconst generationStatusElement = document.getElementById('generationStatus');\n\nspeedValue.innerHTML = speedInput.value;\n\nlet index = 0;\n\nlet audioCtx = null;\nconst worker = new Worker(\"sherpa-onnx-tts.worker.js\");\nlet ttsInstanceInfo = {\n  modelType: null,\n  numSpeakers: 0,\n  isReady: false,\n};\nworker.onmessage = (e) => {\n  if (e.data.type === \"sherpa-onnx-tts-progress\") {\n    Module.setStatus(e.data.status);\n    return;\n  }\n  if (e.data.type === \"sherpa-onnx-tts-generation-progress\") {\n    const percent = Math.max(0, Math.min(100, (e.data.progress || 0) * 100));\n    setGenerationStatus(`Generating audio... 
${percent.toFixed(2)}%`);\n    return;\n  }\n  if (e.data.type === \"sherpa-onnx-tts-ready\") {\n    ttsInstanceInfo.modelType = e.data.modelType;\n    ttsInstanceInfo.numSpeakers = e.data.numSpeakers;\n    ttsInstanceInfo.isReady = true;\n    generateBtn.disabled = false;\n    speakerIdLabel.innerHTML = `Speaker ID (0 - ${e.data.numSpeakers - 1}):`;\n    updateUiForModelType();\n    Module.setStatus('');\n    return;\n  }\n  if (e.data.type === \"error\") {\n    generateBtn.disabled = false;\n    if (ttsInstanceInfo.isReady) {\n      setGenerationStatus(e.data.message);\n    } else {\n      Module.setStatus(e.data.message);\n    }\n    return;\n  }\n  if (e.data.type === \"sherpa-onnx-tts-result\") {\n    let audio = e.data;\n    generateBtn.disabled = false;\n    setGenerationStatus('');\n\n    console.log(audio.samples.length, audio.sampleRate);\n\n    if (!audioCtx) {\n      audioCtx = new AudioContext({ sampleRate: audio.sampleRate });\n    }\n\n    const buffer = audioCtx.createBuffer(\n      1,\n      audio.samples.length,\n      audio.sampleRate,\n    );\n\n    buffer.getChannelData(0).set(audio.samples); // .set() is much faster than a for loop\n    const source = audioCtx.createBufferSource();\n    source.buffer = buffer;\n    source.connect(audioCtx.destination);\n    source.start();\n\n    createAudioTag(audio);\n  }\n};\n\nModule = {};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.setStatus = function(status) {\n  console.log(`status ${status}`);\n  if (status == 'Running...') {\n    status = 'Model downloaded. Initializing text to speech model...'\n  }\n\n  const downloadMatch = status.match(/Downloading data... \\((\\d+)\\/(\\d+)\\)/);\n  if (downloadMatch) {\n    const downloaded = BigInt(downloadMatch[1]);\n    const total = BigInt(downloadMatch[2]);\n    const percent =\n        total === 0n ? 
0.00 : Number((downloaded * 10000n) / total) / 100;\n    const downloadedMB = Number(downloaded) / (1024 * 1024);\n    const totalMB = Number(total) / (1024 * 1024);\n    status = `Downloading data... ${percent.toFixed(2)}% (${downloadedMB.toFixed(2)} MB/${\n        totalMB.toFixed(2)} MB)`;\n    console.log(`here ${status}`)\n  }\n\n  statusElement.textContent = status;\n  if (status === '') {\n    statusElement.style.display = 'none';\n    // statusElement.parentNode.removeChild(statusElement);\n\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.remove('loading');\n    });\n  } else {\n    statusElement.style.display = 'block';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.add('loading');\n    });\n  }\n};\nspeedInput.oninput = function() {\n  speedValue.innerHTML = this.value;\n};\n\nfunction updateUiForModelType() {\n  const isZipVoice = ttsInstanceInfo.modelType === 4;\n  const isPocketTts = ttsInstanceInfo.modelType === 5;\n  const useGenerationConfig = isZipVoice || isPocketTts;\n  speakerIdSection.classList.toggle('hidden', useGenerationConfig);\n  referenceAudioSection.classList.toggle('hidden', !useGenerationConfig);\n  referenceTextSection.classList.toggle('hidden', !isZipVoice);\n}\n\nfunction setGenerationStatus(status) {\n  if (!generationStatusElement) {\n    return;\n  }\n\n  generationStatusElement.textContent = status;\n  generationStatusElement.style.display = status ? 
'block' : 'none';\n}\n\nfunction getMonoSamples(audioBuffer) {\n  if (audioBuffer.numberOfChannels === 1) {\n    return new Float32Array(audioBuffer.getChannelData(0));\n  }\n\n  const samples = new Float32Array(audioBuffer.length);\n  for (let c = 0; c < audioBuffer.numberOfChannels; ++c) {\n    const channel = audioBuffer.getChannelData(c);\n    for (let i = 0; i < channel.length; ++i) {\n      samples[i] += channel[i];\n    }\n  }\n\n  for (let i = 0; i < samples.length; ++i) {\n    samples[i] /= audioBuffer.numberOfChannels;\n  }\n\n  return samples;\n}\n\nasync function readReferenceAudio(file) {\n  const arrayBuffer = await file.arrayBuffer();\n  const ctx = new AudioContext();\n  try {\n    const audioBuffer = await ctx.decodeAudioData(arrayBuffer.slice(0));\n    return {\n      samples: getMonoSamples(audioBuffer),\n      sampleRate: audioBuffer.sampleRate,\n    };\n  } finally {\n    await ctx.close();\n  }\n}\n\nfunction isWaveFile(file) {\n  const name = file.name || '';\n  return name.toLowerCase().endsWith('.wav');\n}\n\nfunction sanitizeFilename(name) {\n  return name.replace(/[^a-zA-Z0-9._-]+/g, '-');\n}\n\nfunction downloadBlob(blob, filename) {\n  const url = window.URL.createObjectURL(blob);\n  const link = document.createElement('a');\n  link.href = url;\n  link.download = filename;\n  document.body.appendChild(link);\n  link.click();\n  document.body.removeChild(link);\n  window.URL.revokeObjectURL(url);\n}\n\ngenerateBtn.onclick = async function() {\n  const isZipVoice = ttsInstanceInfo.modelType === 4;\n  const isPocketTts = ttsInstanceInfo.modelType === 5;\n  const useGenerationConfig = isZipVoice || isPocketTts;\n\n  let speakerId = speakerIdInput.value;\n  if (!useGenerationConfig) {\n    if (speakerId.trim().length == 0) {\n      alert('Please input a speakerId');\n      return;\n    }\n\n    if (!speakerId.match(/^\\d+$/)) {\n      alert(`Input speakerID ${\n          speakerId} is not a number.\\nPlease enter a number between 0 and ${\n  
        ttsInstanceInfo.numSpeakers - 1}`);\n      return;\n    }\n    speakerId = parseInt(speakerId, 10);\n    if (speakerId > ttsInstanceInfo.numSpeakers - 1) {\n      alert(`Please enter a number between 0 and ${ttsInstanceInfo.numSpeakers - 1}`);\n      return;\n    }\n  }\n\n  let text = textArea.value.trim();\n  if (text.length == 0) {\n    alert('Please input a non-blank text');\n    return;\n  }\n\n  console.log('speakerId', speakerId);\n  console.log('speed', speedInput.value);\n  console.log('text', text);\n\n  if (useGenerationConfig) {\n    if (!referenceAudioInput.files || referenceAudioInput.files.length === 0) {\n      alert('Please select a reference audio file');\n      return;\n    }\n\n    const referenceFile = referenceAudioInput.files[0];\n    if (!isWaveFile(referenceFile)) {\n      alert('Please select a .wav reference audio file');\n      return;\n    }\n\n    const referenceAudio = await readReferenceAudio(referenceFile);\n    const genConfig = {\n      speed: parseFloat(speedInput.value),\n      referenceAudio: referenceAudio.samples,\n      referenceSampleRate: referenceAudio.sampleRate,\n      numSteps: isPocketTts ? 
5 : 4,\n    };\n\n    if (isZipVoice) {\n      const referenceText = referenceTextInput.value.trim();\n      if (referenceText.length === 0) {\n        alert('Please input the transcript of the reference audio');\n        return;\n      }\n\n      genConfig.referenceText = referenceText;\n      genConfig.extra = {\n        min_char_in_sentence: 10,\n      };\n    }\n\n    generateBtn.disabled = true;\n    setGenerationStatus('Generating audio...');\n\n    worker.postMessage({\n      text,\n      genConfig,\n      type: \"generateWithConfig\",\n    }, [genConfig.referenceAudio.buffer]);\n    return;\n  }\n\n  worker.postMessage({\n    text,\n    sid: speakerId,\n    speed: parseFloat(speedInput.value),\n    type: \"generate\",\n  });\n};\n\nfunction createAudioTag(generateAudio) {\n  const blob = toWav(generateAudio.samples, generateAudio.sampleRate);\n\n  const text = textArea.value.trim().substring(0, 100);\n  const clipName = `${index} ${text} ...`;\n  const filename = `${sanitizeFilename(clipName)}.wav`;\n  index += 1;\n\n  const clipContainer = document.createElement('article');\n  const clipLabel = document.createElement('p');\n  const audio = document.createElement('audio');\n  const saveButton = document.createElement('button');\n  const deleteButton = document.createElement('button');\n  clipContainer.classList.add('clip');\n  audio.setAttribute('controls', '');\n  saveButton.textContent = 'Save';\n  saveButton.className = 'save';\n  deleteButton.textContent = 'Delete';\n  deleteButton.className = 'delete';\n\n  clipLabel.textContent = clipName;\n\n  clipContainer.appendChild(audio);\n\n  clipContainer.appendChild(clipLabel);\n  clipContainer.appendChild(saveButton);\n  clipContainer.appendChild(deleteButton);\n  soundClips.appendChild(clipContainer);\n\n  audio.controls = true;\n\n  const audioURL = window.URL.createObjectURL(blob);\n  audio.src = audioURL;\n\n  saveButton.onclick = function() {\n    downloadBlob(blob, filename);\n  };\n\n  
deleteButton.onclick = function(e) {\n    let evtTgt = e.target;\n    evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n  };\n\n  clipLabel.onclick = function() {\n    const existingName = clipLabel.textContent;\n    const newClipName = prompt('Enter a new name for your sound clip:');\n    if (newClipName === null) {\n      clipLabel.textContent = existingName;\n    } else {\n      clipLabel.textContent = newClipName;\n    }\n  };\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(floatSamples, sampleRate) {\n  // Convert float samples in [-1, 1] to 16-bit signed PCM\n  const samples = new Int16Array(floatSamples.length);\n  for (let i = 0; i < samples.length; ++i) {\n    let s = floatSamples[i];\n    if (s >= 1)\n      s = 1;\n    else if (s <= -1)\n      s = -1;\n\n    samples[i] = s * 32767;\n  }\n\n  // 44-byte WAVE header followed by the 16-bit samples\n  const buf = new ArrayBuffer(44 + samples.length * 2);\n  const view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  // All fields are written little-endian, so the 4-character tags below\n  // appear with their bytes reversed.\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, sampleRate, true);          // sampleRate\n  view.setUint32(28, sampleRate * 2, true);      // byteRate\n  view.setUint16(32, 2, true);                   // blockAlign\n  view.setUint16(34, 16, true);                  // bitsPerSample\n  //                    a t a d\n  view.setUint32(36, 0x61746164, true);          // subchunk2ID\n  view.setUint32(40, samples.length * 2, true);  // subchunk2Size\n\n  let offset = 
44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n"
  },
  {
    "path": "wasm/tts/assets/.gitignore",
    "content": "*.onnx\n*.txt\nespeak-ng-data\n\n"
  },
  {
    "path": "wasm/tts/assets/README.md",
    "content": "# Introduction\n\nPlease refer to\nhttps://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models\nto download a model.\n\nThe following is an example:\n```bash\ncd sherpa-onnx/wasm/tts/assets\n\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-en_US-libritts_r-medium.tar.bz2\ntar xf vits-piper-en_US-libritts_r-medium.tar.bz2\nrm vits-piper-en_US-libritts_r-medium.tar.bz2\nmv vits-piper-en_US-libritts_r-medium/en_US-libritts_r-medium.onnx ./model.onnx\nmv vits-piper-en_US-libritts_r-medium/tokens.txt ./\nmv vits-piper-en_US-libritts_r-medium/espeak-ng-data ./\nrm -rf vits-piper-en_US-libritts_r-medium\n```\n\nZipVoice example:\n\n```bash\ncd sherpa-onnx/wasm/tts/assets\n\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\ntar xf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\nrm sherpa-onnx-zipvoice-distill-int8-zh-en-emilia.tar.bz2\n\nmv sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/encoder.int8.onnx ./\nmv sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/decoder.int8.onnx ./\nmv sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/tokens.txt ./\nmv sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/lexicon.txt ./\nmv sherpa-onnx-zipvoice-distill-int8-zh-en-emilia/espeak-ng-data ./\nrm -rf sherpa-onnx-zipvoice-distill-int8-zh-en-emilia\n\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/vocos_24khz.onnx\n```\n\nPocketTTS example:\n\n```bash\ncd sherpa-onnx/wasm/tts/assets\n\nwget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\ntar xf sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\nrm sherpa-onnx-pocket-tts-int8-2026-01-26.tar.bz2\n\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/lm_flow.int8.onnx ./\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/lm_main.int8.onnx ./\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/encoder.onnx ./\nmv 
sherpa-onnx-pocket-tts-int8-2026-01-26/decoder.int8.onnx ./\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/text_conditioner.onnx ./\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/vocab.json ./\nmv sherpa-onnx-pocket-tts-int8-2026-01-26/token_scores.json ./\nrm -rf sherpa-onnx-pocket-tts-int8-2026-01-26\n```\n\nFor the first (VITS) example above, you should have the following files in\n`assets` before you can run `build-wasm-simd-tts.sh`:\n\n```\nassets fangjun$ tree -L 1\n.\n├── README.md\n├── espeak-ng-data\n├── model.onnx\n└── tokens.txt\n\n1 directory, 3 files\n```\n\nYou can find example build scripts at:\n\n  - English TTS: https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-en-tts.yaml\n  - German TTS: https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-de-tts.yaml\n"
  },
  {
    "path": "wasm/tts/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for Text-to-speech</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n    .loading {\n      display: none !important;\n    }\n    .hidden {\n      display: none !important;\n    }\n  </style>\n</head>\n\n<body style=\"font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;\">\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    Text-to-speech Demo with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a>\n  </h1>\n\n  <div style=\"width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;\">\n    <div id=\"status\">Loading...</div>\n\n    <div id=\"singleAudioContent\" class=\"tab-content loading\">\n      <div id=\"speakerIdSection\">\n        <label for=\"speakerId\" id=\"speakerIdLabel\">Speaker ID: </label>\n        <input type=\"text\" id=\"speakerId\" name=\"speakerId\" value=\"0\" />\n        <br/>\n        <br/>\n      </div>\n      <div id=\"referenceAudioSection\" class=\"hidden\">\n        <label for=\"referenceAudio\">Reference audio (.wav): </label>\n        <input type=\"file\" id=\"referenceAudio\" name=\"referenceAudio\" accept=\".wav,audio/wav\" />\n        <div style=\"font-size: 0.9rem; color: #6c757d;\">Only `.wav` files are supported.</div>\n        <br/>\n        <br/>\n      </div>\n      <div id=\"referenceTextSection\" class=\"hidden\">\n        <label for=\"referenceText\">Reference transcript (must match the reference audio): </label>\n        <br/>\n        <textarea id=\"referenceText\" rows=\"3\" placeholder=\"Please enter the transcript of the reference audio exactly\"></textarea>\n        
<br/>\n        <br/>\n      </div>\n      <label for=\"speed\" id=\"speedLabel\">Speed: </label>\n      <input type=\"range\" id=\"speed\" name=\"speed\" min=\"0.4\" max=\"3.5\" step=\"0.1\" value=\"1.0\" />\n      <span id=\"speedValue\"></span>\n      <br/>\n      <br/>\n      <textarea id=\"text\" rows=\"10\" placeholder=\"Please enter your text here and click the Generate button\"></textarea>\n      <br/>\n      <br/>\n      <button id=\"generateBtn\" disabled>Generate</button>\n      <div id=\"generationStatus\" style=\"display: none; margin-top: 0.75rem; font-size: 0.95rem; color: #6c757d;\"></div>\n    </div>\n\n    <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n    </section>\n  </div>\n\n  <!-- Footer Section -->\n  <div style=\"width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;\">\n    <h3>Description</h3>\n    <ul>\n      <li>Everything is <strong>open-sourced.</strong> <a href=\"https://github.com/k2-fsa/sherpa-onnx\">code</a></li>\n      <li>If you have any issues, please either <a href=\"https://github.com/k2-fsa/sherpa-onnx/issues\">file a ticket</a> or contact us via</li>\n        <ul>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#wechat\">WeChat group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#qq\">QQ group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b\">Bilibili</a></li>\n        </ul>\n    </ul>\n    <h3>About This Demo</h3>\n    <ul>\n      <li><strong>Private and Secure:</strong> All processing is done locally on your device (CPU) within your browser with a single thread. No server is involved, ensuring privacy and security. 
You can disconnect from the Internet once this page is loaded.</li>\n      <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving system resources available for webLLM analysis.</li>\n    </ul>\n    <h3>Latest Update</h3>\n    <ul>\n      <li>Update UI.</li>\n      <li>First working version.</li>\n    </ul>\n\n    <h3>Acknowledgement</h3>\n    <ul>\n      <li>We refer to <a href=\"https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a> for the UI part.</li>\n    </ul>\n  </div>\n\n\n  <script src=\"app-tts.js\"></script>\n</body>\n</html>\n"
  },
  {
    "path": "wasm/tts/sherpa-onnx-tts.js",
    "content": "\nfunction freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('config' in config) {\n    freeConfig(config.config, Module)\n  }\n\n  if ('matcha' in config) {\n    freeConfig(config.matcha, Module)\n  }\n\n  if ('kokoro' in config) {\n    freeConfig(config.kokoro, Module)\n  }\n\n  if ('kitten' in config) {\n    freeConfig(config.kitten, Module)\n  }\n\n  if ('zipvoice' in config) {\n    freeConfig(config.zipvoice, Module)\n  }\n\n  if ('pocket' in config) {\n    freeConfig(config.pocket, Module)\n  }\n\n  if ('supertonic' in config) {\n    freeConfig(config.supertonic, Module)\n  }\n\n  if (config.ptr) {\n    Module._free(config.ptr);\n  }\n}\n\n// The user should free the returned pointers\nfunction initSherpaOnnxOfflineTtsVitsModelConfig(config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const lexiconLen = Module.lengthBytesUTF8(config.lexicon || '') + 1;\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const dataDirLen = Module.lengthBytesUTF8(config.dataDir || '') + 1;\n  const dictDir = ''\n  const dictDirLen = Module.lengthBytesUTF8(dictDir) + 1;\n\n  const n = modelLen + lexiconLen + tokensLen + dataDirLen + dictDirLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 8 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  Module.stringToUTF8(config.lexicon || '', buffer + offset, lexiconLen);\n  offset += lexiconLen;\n\n  Module.stringToUTF8(config.tokens || '', buffer + offset, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.dataDir || '', buffer + offset, dataDirLen);\n  offset += dataDirLen;\n\n  Module.stringToUTF8(dictDir, buffer + offset, dictDirLen);\n  offset += dictDirLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += modelLen;\n\n  Module.setValue(ptr 
+ 4, buffer + offset, 'i8*');\n  offset += lexiconLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += tokensLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += dataDirLen;\n\n  Module.setValue(ptr + 16, config.noiseScale || 0.667, 'float');\n  Module.setValue(ptr + 20, config.noiseScaleW || 0.8, 'float');\n  Module.setValue(ptr + 24, config.lengthScale || 1.0, 'float');\n  Module.setValue(ptr + 28, buffer + offset, 'i8*');\n  offset += dictDirLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsMatchaModelConfig(config, Module) {\n  // Fall back to '' so that a missing field does not throw here; this matches\n  // the stringToUTF8() calls below.\n  const acousticModelLen =\n      Module.lengthBytesUTF8(config.acousticModel || '') + 1;\n  const vocoderLen = Module.lengthBytesUTF8(config.vocoder || '') + 1;\n  const lexiconLen = Module.lengthBytesUTF8(config.lexicon || '') + 1;\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const dataDirLen = Module.lengthBytesUTF8(config.dataDir || '') + 1;\n\n  const dictDir = '';\n  const dictDirLen = Module.lengthBytesUTF8(dictDir) + 1;\n\n  const n = acousticModelLen + vocoderLen + lexiconLen + tokensLen +\n      dataDirLen + dictDirLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 8 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(\n      config.acousticModel || '', buffer + offset, acousticModelLen);\n  offset += acousticModelLen;\n\n  Module.stringToUTF8(config.vocoder || '', buffer + offset, vocoderLen);\n  offset += vocoderLen;\n\n  Module.stringToUTF8(config.lexicon || '', buffer + offset, lexiconLen);\n  offset += lexiconLen;\n\n  Module.stringToUTF8(config.tokens || '', buffer + offset, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.dataDir || '', buffer + offset, dataDirLen);\n  offset += dataDirLen;\n\n  Module.stringToUTF8(dictDir, buffer + offset, dictDirLen);\n  offset += dictDirLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  
offset += acousticModelLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += vocoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += lexiconLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += tokensLen;\n\n  Module.setValue(ptr + 16, buffer + offset, 'i8*');\n  offset += dataDirLen;\n\n  Module.setValue(ptr + 20, config.noiseScale || 0.667, 'float');\n  Module.setValue(ptr + 24, config.lengthScale || 1.0, 'float');\n  Module.setValue(ptr + 28, buffer + offset, 'i8*');\n  offset += dictDirLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsKokoroModelConfig(config, Module) {\n  // Fall back to '' so that a missing field does not throw here; this matches\n  // the stringToUTF8() calls below.\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const voicesLen = Module.lengthBytesUTF8(config.voices || '') + 1;\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const dataDirLen = Module.lengthBytesUTF8(config.dataDir || '') + 1;\n  const dictDir = '';\n  const dictDirLen = Module.lengthBytesUTF8(dictDir) + 1;\n  const lexiconLen = Module.lengthBytesUTF8(config.lexicon || '') + 1;\n  const langLen = Module.lengthBytesUTF8(config.lang || '') + 1;\n\n  const n = modelLen + voicesLen + tokensLen + dataDirLen + dictDirLen +\n      lexiconLen + langLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 8 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  Module.stringToUTF8(config.voices || '', buffer + offset, voicesLen);\n  offset += voicesLen;\n\n  Module.stringToUTF8(config.tokens || '', buffer + offset, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.dataDir || '', buffer + offset, dataDirLen);\n  offset += dataDirLen;\n\n  Module.stringToUTF8(dictDir, buffer + offset, dictDirLen);\n  offset += dictDirLen;\n\n  Module.stringToUTF8(config.lexicon || '', buffer + offset, lexiconLen);\n  offset += 
lexiconLen;\n\n  Module.stringToUTF8(config.lang || '', buffer + offset, langLen);\n  offset += langLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += modelLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += voicesLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += tokensLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += dataDirLen;\n\n  Module.setValue(ptr + 16, config.lengthScale || 1.0, 'float');\n\n  Module.setValue(ptr + 20, buffer + offset, 'i8*');\n  offset += dictDirLen;\n\n  Module.setValue(ptr + 24, buffer + offset, 'i8*');\n  offset += lexiconLen;\n\n  Module.setValue(ptr + 28, buffer + offset, 'i8*');\n  offset += langLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsKittenModelConfig(config, Module) {\n  // Fall back to '' so that a missing field does not throw here; this matches\n  // the stringToUTF8() calls below.\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n  const voicesLen = Module.lengthBytesUTF8(config.voices || '') + 1;\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const dataDirLen = Module.lengthBytesUTF8(config.dataDir || '') + 1;\n\n  const n = modelLen + voicesLen + tokensLen + dataDirLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 5 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.model || '', buffer + offset, modelLen);\n  offset += modelLen;\n\n  Module.stringToUTF8(config.voices || '', buffer + offset, voicesLen);\n  offset += voicesLen;\n\n  Module.stringToUTF8(config.tokens || '', buffer + offset, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.dataDir || '', buffer + offset, dataDirLen);\n  offset += dataDirLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += modelLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += voicesLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += tokensLen;\n\n  
Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += dataDirLen;\n\n  Module.setValue(ptr + 16, config.lengthScale || 1.0, 'float');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsZipVoiceModelConfig(config, Module) {\n  const tokensLen = Module.lengthBytesUTF8(config.tokens || '') + 1;\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const vocoderLen = Module.lengthBytesUTF8(config.vocoder || '') + 1;\n  const dataDirLen = Module.lengthBytesUTF8(config.dataDir || '') + 1;\n  const lexiconLen = Module.lengthBytesUTF8(config.lexicon || '') + 1;\n\n  const n = tokensLen + encoderLen + decoderLen + vocoderLen + dataDirLen +\n      lexiconLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 10 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.tokens || '', buffer + offset, tokensLen);\n  offset += tokensLen;\n\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(config.vocoder || '', buffer + offset, vocoderLen);\n  offset += vocoderLen;\n\n  Module.stringToUTF8(config.dataDir || '', buffer + offset, dataDirLen);\n  offset += dataDirLen;\n\n  Module.stringToUTF8(config.lexicon || '', buffer + offset, lexiconLen);\n  offset += lexiconLen;\n\n  offset = 0;\n  Module.setValue(ptr, buffer + offset, 'i8*');\n  offset += tokensLen;\n\n  Module.setValue(ptr + 4, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 8, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 12, buffer + offset, 'i8*');\n  offset += vocoderLen;\n\n  Module.setValue(ptr + 16, buffer + offset, 'i8*');\n  offset += dataDirLen;\n\n  Module.setValue(ptr + 20, buffer + 
offset, 'i8*');\n  offset += lexiconLen;\n\n  Module.setValue(ptr + 24, config.featScale || 0.1, 'float');\n  Module.setValue(ptr + 28, config.tShift || 0.5, 'float');\n  Module.setValue(ptr + 32, config.targetRMS || 0.1, 'float');\n  Module.setValue(ptr + 36, config.guidanceScale || 1.0, 'float');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsPocketModelConfig(config, Module) {\n  const lmFlowLen = Module.lengthBytesUTF8(config.lmFlow || '') + 1;\n  const lmMainLen = Module.lengthBytesUTF8(config.lmMain || '') + 1;\n  const encoderLen = Module.lengthBytesUTF8(config.encoder || '') + 1;\n  const decoderLen = Module.lengthBytesUTF8(config.decoder || '') + 1;\n  const textConditionerLen =\n      Module.lengthBytesUTF8(config.textConditioner || '') + 1;\n  const vocabJsonLen = Module.lengthBytesUTF8(config.vocabJson || '') + 1;\n  const tokenScoresJsonLen =\n      Module.lengthBytesUTF8(config.tokenScoresJson || '') + 1;\n\n\n  const n = lmFlowLen + lmMainLen + encoderLen + decoderLen +\n      textConditionerLen + vocabJsonLen + tokenScoresJsonLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 8 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(config.lmFlow || '', buffer + offset, lmFlowLen);\n  offset += lmFlowLen;\n\n  Module.stringToUTF8(config.lmMain || '', buffer + offset, lmMainLen);\n  offset += lmMainLen;\n\n  Module.stringToUTF8(config.encoder || '', buffer + offset, encoderLen);\n  offset += encoderLen;\n\n  Module.stringToUTF8(config.decoder || '', buffer + offset, decoderLen);\n  offset += decoderLen;\n\n  Module.stringToUTF8(\n      config.textConditioner || '', buffer + offset, textConditionerLen);\n  offset += textConditionerLen;\n\n  Module.stringToUTF8(config.vocabJson || '', buffer + offset, vocabJsonLen);\n  offset += vocabJsonLen;\n\n  Module.stringToUTF8(\n      config.tokenScoresJson || '', buffer + offset, tokenScoresJsonLen);\n  offset 
+= tokenScoresJsonLen;\n\n  offset = 0;\n  Module.setValue(ptr + 0 * 4, buffer + offset, 'i8*');\n  offset += lmFlowLen;\n\n  Module.setValue(ptr + 1 * 4, buffer + offset, 'i8*');\n  offset += lmMainLen;\n\n  Module.setValue(ptr + 2 * 4, buffer + offset, 'i8*');\n  offset += encoderLen;\n\n  Module.setValue(ptr + 3 * 4, buffer + offset, 'i8*');\n  offset += decoderLen;\n\n  Module.setValue(ptr + 4 * 4, buffer + offset, 'i8*');\n  offset += textConditionerLen;\n\n  Module.setValue(ptr + 5 * 4, buffer + offset, 'i8*');\n  offset += vocabJsonLen;\n\n  Module.setValue(ptr + 6 * 4, buffer + offset, 'i8*');\n  offset += tokenScoresJsonLen;\n\n  Module.setValue(\n      ptr + 7 * 4,\n      config.voiceEmbeddingCacheCapacity !== undefined ?\n          config.voiceEmbeddingCacheCapacity :\n          50,\n      'i32');\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsSupertonicModelConfig(config, Module) {\n  const durationPredictorLen =\n      Module.lengthBytesUTF8(config.durationPredictor || '') + 1;\n  const textEncoderLen = Module.lengthBytesUTF8(config.textEncoder || '') + 1;\n  const vectorEstimatorLen =\n      Module.lengthBytesUTF8(config.vectorEstimator || '') + 1;\n  const vocoderLen = Module.lengthBytesUTF8(config.vocoder || '') + 1;\n  const ttsJsonLen = Module.lengthBytesUTF8(config.ttsJson || '') + 1;\n  const unicodeIndexerLen =\n      Module.lengthBytesUTF8(config.unicodeIndexer || '') + 1;\n  const voiceStyleLen = Module.lengthBytesUTF8(config.voiceStyle || '') + 1;\n\n  const n = durationPredictorLen + textEncoderLen + vectorEstimatorLen +\n      vocoderLen + ttsJsonLen + unicodeIndexerLen + voiceStyleLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 7 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module.stringToUTF8(\n      config.durationPredictor || '', buffer + offset, durationPredictorLen);\n  offset += durationPredictorLen;\n\n  Module.stringToUTF8(\n      
config.textEncoder || '', buffer + offset, textEncoderLen);\n  offset += textEncoderLen;\n\n  Module.stringToUTF8(\n      config.vectorEstimator || '', buffer + offset, vectorEstimatorLen);\n  offset += vectorEstimatorLen;\n\n  Module.stringToUTF8(config.vocoder || '', buffer + offset, vocoderLen);\n  offset += vocoderLen;\n\n  Module.stringToUTF8(config.ttsJson || '', buffer + offset, ttsJsonLen);\n  offset += ttsJsonLen;\n\n  Module.stringToUTF8(\n      config.unicodeIndexer || '', buffer + offset, unicodeIndexerLen);\n  offset += unicodeIndexerLen;\n\n  Module.stringToUTF8(config.voiceStyle || '', buffer + offset, voiceStyleLen);\n  offset += voiceStyleLen;\n\n  offset = 0;\n  Module.setValue(ptr + 0 * 4, buffer + offset, 'i8*');\n  offset += durationPredictorLen;\n\n  Module.setValue(ptr + 1 * 4, buffer + offset, 'i8*');\n  offset += textEncoderLen;\n\n  Module.setValue(ptr + 2 * 4, buffer + offset, 'i8*');\n  offset += vectorEstimatorLen;\n\n  Module.setValue(ptr + 3 * 4, buffer + offset, 'i8*');\n  offset += vocoderLen;\n\n  Module.setValue(ptr + 4 * 4, buffer + offset, 'i8*');\n  offset += ttsJsonLen;\n\n  Module.setValue(ptr + 5 * 4, buffer + offset, 'i8*');\n  offset += unicodeIndexerLen;\n\n  Module.setValue(ptr + 6 * 4, buffer + offset, 'i8*');\n  offset += voiceStyleLen;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsModelConfig(config, Module) {\n  if (!('offlineTtsVitsModelConfig' in config)) {\n    config.offlineTtsVitsModelConfig = {\n      model: '',\n      lexicon: '',\n      tokens: '',\n      noiseScale: 0.667,\n      noiseScaleW: 0.8,\n      lengthScale: 1.0,\n      dataDir: '',\n    };\n  }\n\n  if (!('offlineTtsMatchaModelConfig' in config)) {\n    config.offlineTtsMatchaModelConfig = {\n      acousticModel: '',\n      vocoder: '',\n      lexicon: '',\n      tokens: '',\n      noiseScale: 0.667,\n      lengthScale: 1.0,\n      dataDir: '',\n    };\n  }\n\n  if 
(!('offlineTtsKokoroModelConfig' in config)) {\n    config.offlineTtsKokoroModelConfig = {\n      model: '',\n      voices: '',\n      tokens: '',\n      lengthScale: 1.0,\n      dataDir: '',\n      lexicon: '',\n      lang: '',\n    };\n  }\n\n  if (!('offlineTtsKittenModelConfig' in config)) {\n    config.offlineTtsKittenModelConfig = {\n      model: '',\n      voices: '',\n      tokens: '',\n      lengthScale: 1.0,\n    };\n  }\n\n  if (!('offlineTtsZipVoiceModelConfig' in config)) {\n    config.offlineTtsZipVoiceModelConfig = {\n      tokens: '',\n      encoder: '',\n      decoder: '',\n      vocoder: '',\n      dataDir: '',\n      lexicon: '',\n      featScale: 0.1,\n      tShift: 0.5,\n      targetRMS: 0.1,\n      guidanceScale: 1.0,\n    };\n  }\n\n  if (!('offlineTtsPocketModelConfig' in config)) {\n    config.offlineTtsPocketModelConfig = {\n      lmFlow: '',\n      lmMain: '',\n      encoder: '',\n      decoder: '',\n      textConditioner: '',\n      vocabJson: '',\n      tokenScoresJson: '',\n      voiceEmbeddingCacheCapacity: 50,\n    };\n  }\n\n  if (!('offlineTtsSupertonicModelConfig' in config)) {\n    config.offlineTtsSupertonicModelConfig = {\n      durationPredictor: '',\n      textEncoder: '',\n      vectorEstimator: '',\n      vocoder: '',\n      ttsJson: '',\n      unicodeIndexer: '',\n      voiceStyle: '',\n    };\n  }\n\n  const vitsModelConfig = initSherpaOnnxOfflineTtsVitsModelConfig(\n      config.offlineTtsVitsModelConfig, Module);\n\n  const matchaModelConfig = initSherpaOnnxOfflineTtsMatchaModelConfig(\n      config.offlineTtsMatchaModelConfig, Module);\n\n  const kokoroModelConfig = initSherpaOnnxOfflineTtsKokoroModelConfig(\n      config.offlineTtsKokoroModelConfig, Module);\n\n  const kittenModelConfig = initSherpaOnnxOfflineTtsKittenModelConfig(\n      config.offlineTtsKittenModelConfig, Module);\n\n  const zipVoiceModelConfig = initSherpaOnnxOfflineTtsZipVoiceModelConfig(\n      config.offlineTtsZipVoiceModelConfig, Module);\n\n  
const pocketModelConfig = initSherpaOnnxOfflineTtsPocketModelConfig(\n      config.offlineTtsPocketModelConfig, Module);\n\n  const supertonicModelConfig = initSherpaOnnxOfflineTtsSupertonicModelConfig(\n      config.offlineTtsSupertonicModelConfig, Module);\n\n  const len = vitsModelConfig.len + matchaModelConfig.len +\n      kokoroModelConfig.len + kittenModelConfig.len + zipVoiceModelConfig.len +\n      pocketModelConfig.len + supertonicModelConfig.len + 3 * 4;\n\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(vitsModelConfig.ptr, vitsModelConfig.len, ptr + offset);\n  offset += vitsModelConfig.len;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug || 0, 'i32');\n  offset += 4;\n\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const buffer = Module._malloc(providerLen);\n  Module.stringToUTF8(config.provider || 'cpu', buffer, providerLen);\n  Module.setValue(ptr + offset, buffer, 'i8*');\n  offset += 4;\n\n  Module._CopyHeap(matchaModelConfig.ptr, matchaModelConfig.len, ptr + offset);\n  offset += matchaModelConfig.len;\n\n  Module._CopyHeap(kokoroModelConfig.ptr, kokoroModelConfig.len, ptr + offset);\n  offset += kokoroModelConfig.len;\n\n  Module._CopyHeap(kittenModelConfig.ptr, kittenModelConfig.len, ptr + offset);\n  offset += kittenModelConfig.len;\n\n  Module._CopyHeap(\n      zipVoiceModelConfig.ptr, zipVoiceModelConfig.len, ptr + offset);\n  offset += zipVoiceModelConfig.len;\n\n  Module._CopyHeap(pocketModelConfig.ptr, pocketModelConfig.len, ptr + offset);\n  offset += pocketModelConfig.len;\n\n  Module._CopyHeap(\n      supertonicModelConfig.ptr, supertonicModelConfig.len, ptr + offset);\n  offset += supertonicModelConfig.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    config: vitsModelConfig,\n    matcha: matchaModelConfig,\n    kokoro: kokoroModelConfig,\n    kitten: 
kittenModelConfig,\n    zipvoice: zipVoiceModelConfig,\n    pocket: pocketModelConfig,\n    supertonic: supertonicModelConfig,\n  };\n}\n\nfunction initSherpaOnnxOfflineTtsConfig(config, Module) {\n  const modelConfig =\n      initSherpaOnnxOfflineTtsModelConfig(config.offlineTtsModelConfig, Module);\n  const len = modelConfig.len + 4 * 4;\n  const ptr = Module._malloc(len);\n\n  let offset = 0;\n  Module._CopyHeap(modelConfig.ptr, modelConfig.len, ptr + offset);\n  offset += modelConfig.len;\n\n  const ruleFstsLen = Module.lengthBytesUTF8(config.ruleFsts || '') + 1;\n  const ruleFarsLen = Module.lengthBytesUTF8(config.ruleFars || '') + 1;\n\n  const buffer = Module._malloc(ruleFstsLen + ruleFarsLen);\n  Module.stringToUTF8(config.ruleFsts || '', buffer, ruleFstsLen);\n  Module.stringToUTF8(config.ruleFars || '', buffer + ruleFstsLen, ruleFarsLen);\n\n  Module.setValue(ptr + offset, buffer, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.maxNumSentences || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer + ruleFstsLen, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.silenceScale || 0.2, 'float');\n  offset += 4;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    config: modelConfig,\n  };\n}\n\n/*\nconst genConfig = {\n  silenceScale: 0.2,\n  speed: 1.0,\n  sid: 1,\n  referenceAudio: myFloat32Array, // optional\n  referenceSampleRate: 16000, // used if referenceAudio is required\n  referenceText: \"Hello world\", // optional\n  numSteps: 5, // optional\n  extra: { bar: \"ok\", foo: 0.8, foobar: 10}\n};\n\n */\n\n// Allocate a SherpaOnnxGenerationConfig in WASM\nfunction initSherpaOnnxGenerationConfig(config, Module) {\n  const len = 9 * 4;\n  const ptr = Module._malloc(len);\n\n  // float silence_scale\n  Module.setValue(ptr + 0 * 4, config.silenceScale || 0.2, 'float');\n\n  // float speed\n  Module.setValue(ptr + 1 * 4, config.speed || 1.0, 'float');\n\n  // int32_t sid\n  
Module.setValue(ptr + 2 * 4, config.sid || 0, 'i32');\n\n  // const float* reference_audio\n  let referenceAudioPtr = 0;\n  if (config.referenceAudio && config.referenceAudio.length > 0) {\n    referenceAudioPtr = Module._malloc(config.referenceAudio.length * 4);\n    Module.HEAPF32.set(config.referenceAudio, referenceAudioPtr / 4);\n  }\n  Module.setValue(ptr + 3 * 4, referenceAudioPtr, 'i8*');\n\n  // int32_t reference_audio_len\n  Module.setValue(\n      ptr + 4 * 4, config.referenceAudio ? config.referenceAudio.length : 0,\n      'i32');\n\n  // int32_t reference_sample_rate\n  Module.setValue(ptr + 5 * 4, config.referenceSampleRate || 0, 'i32');\n\n  // const char* reference_text\n  let referenceTextPtr = 0;\n  if (config.referenceText) {\n    const textLen = Module.lengthBytesUTF8(config.referenceText) + 1;\n    referenceTextPtr = Module._malloc(textLen);\n    Module.stringToUTF8(config.referenceText, referenceTextPtr, textLen);\n  }\n  Module.setValue(ptr + 6 * 4, referenceTextPtr, 'i8*');\n\n  // int32_t num_steps\n  Module.setValue(ptr + 7 * 4, config.numSteps || 5, 'i32');\n\n  // const char* extra (JSON string)\n  let extraPtr = 0;\n  let extraStr = null;\n\n  if (config.extra) {\n    if (typeof config.extra === 'object') {\n      extraStr = JSON.stringify(config.extra);\n    } else if (typeof config.extra === 'string') {\n      extraStr = config.extra;\n    }\n  }\n\n  if (extraStr !== null) {\n    const extraLen = Module.lengthBytesUTF8(extraStr) + 1;\n    extraPtr = Module._malloc(extraLen);\n    Module.stringToUTF8(extraStr, extraPtr, extraLen);\n  }\n\n  Module.setValue(ptr + 8 * 4, extraPtr, 'i8*');\n\n  return {\n    ptr,\n    referenceAudioPtr,\n    referenceTextPtr,\n    extraPtr,\n  };\n}\n\n\n// Free the memory allocated for a SherpaOnnxGenerationConfig\nfunction freeSherpaOnnxGenerationConfig(cfg, Module) {\n  if (!cfg) return;\n\n  if (cfg.referenceAudioPtr) Module._free(cfg.referenceAudioPtr);\n  if (cfg.referenceTextPtr) 
Module._free(cfg.referenceTextPtr);\n  if (cfg.extraPtr) Module._free(cfg.extraPtr);\n  if (cfg.ptr) Module._free(cfg.ptr);\n}\n\n\nclass OfflineTts {\n  constructor(configObj, Module) {\n    const config = initSherpaOnnxOfflineTtsConfig(configObj, Module)\n    const handle = Module._SherpaOnnxCreateOfflineTts(config.ptr);\n\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.sampleRate = Module._SherpaOnnxOfflineTtsSampleRate(this.handle);\n    this.numSpeakers = Module._SherpaOnnxOfflineTtsNumSpeakers(this.handle);\n    this.Module = Module\n  }\n\n  free() {\n    if (!this.handle) return;\n\n    this.Module._SherpaOnnxDestroyOfflineTts(this.handle);\n    this.handle = 0\n  }\n\n  // {\n  //   text: \"hello\",\n  //   sid: 1,\n  //   speed: 1.0\n  // }\n  generate(config) {\n    if (!this.handle) {\n      throw new Error('OfflineTts has been freed');\n    }\n\n    if (!config || !config.text) {\n      throw new Error('config.text is required');\n    }\n\n    const textLen = this.Module.lengthBytesUTF8(config.text) + 1;\n    const textPtr = this.Module._malloc(textLen);\n    this.Module.stringToUTF8(config.text, textPtr, textLen);\n\n    const h = this.Module._SherpaOnnxOfflineTtsGenerate(\n        this.handle, textPtr, config.sid ?? 0, config.speed ?? 
1.0);\n\n    this.Module._free(textPtr);\n\n    if (!h) {\n      throw new Error('TTS generation failed');\n    }\n\n    const base = h / 4;\n\n    const samplesPtr = this.Module.HEAPU32[base];\n    const numSamples = this.Module.HEAP32[base + 1];\n    const sampleRate = this.Module.HEAP32[base + 2];\n\n    const heapSamples = this.Module.HEAPF32.subarray(\n        samplesPtr / 4, samplesPtr / 4 + numSamples);\n\n    const samples = new Float32Array(heapSamples);\n\n    this.Module._SherpaOnnxDestroyOfflineTtsGeneratedAudio(h);\n    return {samples: samples, sampleRate: sampleRate};\n  }\n\n  generateWithConfig(text, genConfig) {\n    if (!this.handle) {\n      throw new Error('OfflineTts has been freed');\n    }\n\n    const cfgWasm = initSherpaOnnxGenerationConfig(genConfig, this.Module);\n\n    const textLen = this.Module.lengthBytesUTF8(text) + 1;\n    const textPtr = this.Module._malloc(textLen);\n    this.Module.stringToUTF8(text, textPtr, textLen);\n\n    let callbackPtr = 0;\n    if (genConfig.callback) {\n      callbackPtr = this.Module.addFunction((samplesPtr, n, progress, arg) => {\n        const heapSamples =\n            this.Module.HEAPF32.subarray(samplesPtr / 4, samplesPtr / 4 + n);\n        const samples = new Float32Array(heapSamples);\n        return genConfig.callback(samples, n, progress, arg);\n      }, 'iiifi');\n    }\n\n    let audioPtr = 0;\n    try {\n      audioPtr = this.Module._SherpaOnnxOfflineTtsGenerateWithConfig(\n          this.handle, textPtr, cfgWasm.ptr, callbackPtr, 0);\n    } finally {\n      this.Module._free(textPtr);\n      freeSherpaOnnxGenerationConfig(cfgWasm, this.Module);\n      if (callbackPtr) {\n        this.Module.removeFunction(callbackPtr);\n      }\n    }\n\n    if (!audioPtr) {\n      throw new Error('Failed to generate audio');\n    }\n\n    const base = audioPtr / 4;\n\n    const samplesPtr = this.Module.HEAPU32[base];     // float* samples\n    const numSamples = this.Module.HEAP32[base + 1];  // int32 
num_samples\n    const sampleRate = this.Module.HEAP32[base + 2];  // int32 sample_rate\n\n    const heapSamples = this.Module.HEAPF32.subarray(\n        samplesPtr / 4, samplesPtr / 4 + numSamples);\n    const samples = new Float32Array(heapSamples);\n\n    this.Module._SherpaOnnxDestroyOfflineTtsGeneratedAudio(audioPtr);\n\n    return {samples, sampleRate};\n  }\n\n  save(filename, audio) {\n    const samples = audio.samples;\n    const sampleRate = audio.sampleRate;\n    const ptr = this.Module._malloc(samples.length * 4);\n\n    this.Module.HEAPF32.set(samples, ptr / 4);\n\n    const filenameLen = this.Module.lengthBytesUTF8(filename) + 1;\n    const buffer = this.Module._malloc(filenameLen);\n    this.Module.stringToUTF8(filename, buffer, filenameLen);\n    this.Module._SherpaOnnxWriteWave(ptr, samples.length, sampleRate, buffer);\n    this.Module._free(buffer);\n    this.Module._free(ptr);\n  }\n}\n\nlet modelType = 0;\n\nfunction getDefaultOfflineTtsModelType() {\n  return modelType;\n}\n\nfunction createOfflineTts(Module, myConfig) {\n  const vits = {\n    model: '',\n    lexicon: '',\n    tokens: '',\n    dataDir: '',\n    noiseScale: 0.667,\n    noiseScaleW: 0.8,\n    lengthScale: 1.0,\n  };\n\n  const matcha = {\n    acousticModel: '',\n    vocoder: '',\n    lexicon: '',\n    tokens: '',\n    dataDir: '',\n    noiseScale: 0.667,\n    lengthScale: 1.0,\n  };\n\n  const offlineTtsKokoroModelConfig = {\n    model: '',\n    voices: '',\n    tokens: '',\n    dataDir: '',\n    lengthScale: 1.0,\n    lexicon: '',\n    lang: '',\n  };\n\n  const offlineTtsKittenModelConfig = {\n    model: '',\n    voices: '',\n    tokens: '',\n    dataDir: '',\n    lengthScale: 1.0,\n  };\n\n  const offlineTtsZipVoiceModelConfig = {\n    tokens: '',\n    encoder: '',\n    decoder: '',\n    vocoder: '',\n    dataDir: '',\n    lexicon: '',\n    featScale: 0.1,\n    tShift: 0.5,\n    targetRMS: 0.1,\n    guidanceScale: 1.0,\n  };\n\n  const offlineTtsPocketModelConfig = {\n    
lmFlow: '',\n    lmMain: '',\n    encoder: '',\n    decoder: '',\n    textConditioner: '',\n    vocabJson: '',\n    tokenScoresJson: '',\n    voiceEmbeddingCacheCapacity: 50,\n  };\n\n  let ruleFsts = '';\n\n  switch (modelType) {\n    case 0:\n      // vits\n      vits.model = './model.onnx';\n      vits.tokens = './tokens.txt';\n      vits.dataDir = './espeak-ng-data';\n      break;\n    case 1:\n      // matcha zh-en\n      // https://k2-fsa.github.io/sherpa/onnx/tts/all/Chinese-English/matcha-icefall-zh-en.html\n      matcha.acousticModel = './model-steps-3.onnx';\n      matcha.vocoder = './vocos-16khz-univ.onnx';\n      matcha.lexicon = './lexicon.txt';\n      matcha.tokens = './tokens.txt';\n      matcha.dataDir = './espeak-ng-data';\n      ruleFsts = './phone-zh.fst,./date-zh.fst,./number-zh.fst';\n      break;\n    case 2:\n      // matcha zh\n      // https://k2-fsa.github.io/sherpa/onnx/tts/all/Chinese/matcha-icefall-zh-baker.html\n      matcha.acousticModel = './model-steps-3.onnx';\n      matcha.vocoder = './vocos-22khz-univ.onnx';\n      matcha.lexicon = './lexicon.txt';\n      matcha.tokens = './tokens.txt';\n      ruleFsts = './phone.fst,./date.fst,./number.fst';\n      break;\n    case 3:\n      // matcha en\n      // https://k2-fsa.github.io/sherpa/onnx/tts/all/English/matcha-icefall-en_US-ljspeech.html\n      matcha.acousticModel = './model-steps-3.onnx';\n      matcha.vocoder = './vocos-22khz-univ.onnx';\n      matcha.tokens = './tokens.txt';\n      matcha.dataDir = './espeak-ng-data';\n      break;\n    case 4:\n      // zipvoice zh-en\n      // https://k2-fsa.github.io/sherpa/onnx/tts/zipvoice.html\n      offlineTtsZipVoiceModelConfig.tokens = './tokens.txt';\n      offlineTtsZipVoiceModelConfig.encoder = './encoder.int8.onnx';\n      offlineTtsZipVoiceModelConfig.decoder = './decoder.int8.onnx';\n      offlineTtsZipVoiceModelConfig.vocoder = './vocos_24khz.onnx';\n      offlineTtsZipVoiceModelConfig.dataDir = './espeak-ng-data';\n      
offlineTtsZipVoiceModelConfig.lexicon = './lexicon.txt';\n      break;\n    case 5:\n      // pocket tts\n      // https://k2-fsa.github.io/sherpa/onnx/tts/pocket.html\n      offlineTtsPocketModelConfig.lmFlow = './lm_flow.int8.onnx';\n      offlineTtsPocketModelConfig.lmMain = './lm_main.int8.onnx';\n      offlineTtsPocketModelConfig.encoder = './encoder.onnx';\n      offlineTtsPocketModelConfig.decoder = './decoder.int8.onnx';\n      offlineTtsPocketModelConfig.textConditioner = './text_conditioner.onnx';\n      offlineTtsPocketModelConfig.vocabJson = './vocab.json';\n      offlineTtsPocketModelConfig.tokenScoresJson = './token_scores.json';\n      break;\n  }\n\n  const offlineTtsModelConfig = {\n    offlineTtsVitsModelConfig: vits,\n    offlineTtsMatchaModelConfig: matcha,\n    offlineTtsKokoroModelConfig: offlineTtsKokoroModelConfig,\n    offlineTtsKittenModelConfig: offlineTtsKittenModelConfig,\n    offlineTtsZipVoiceModelConfig: offlineTtsZipVoiceModelConfig,\n    offlineTtsPocketModelConfig: offlineTtsPocketModelConfig,\n    numThreads: 1,\n    debug: 1,\n    provider: 'cpu',\n  };\n\n  let offlineTtsConfig = {\n    offlineTtsModelConfig: offlineTtsModelConfig,\n    ruleFsts: ruleFsts,\n    ruleFars: '',\n    maxNumSentences: 1,\n  }\n\n  if (myConfig) {\n    offlineTtsConfig = myConfig;\n  }\n\n  return new OfflineTts(offlineTtsConfig, Module);\n}\n\nif (typeof process == 'object' && typeof process.versions == 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createOfflineTts,\n    getDefaultOfflineTtsModelType,\n  };\n}\n"
  },
  {
    "path": "wasm/tts/sherpa-onnx-tts.worker.js",
    "content": "let tts = null;\nself.Module = {\n  // https://emscripten.org/docs/api_reference/module.html#Module.locateFile\n  locateFile: function (path, scriptDirectory = \"\") {\n    return scriptDirectory + path;\n  },\n  // https://emscripten.org/docs/api_reference/module.html#Module.locateFile\n  setStatus: function (status) {\n    self.postMessage({ type: \"sherpa-onnx-tts-progress\", status });\n  },\n  onRuntimeInitialized: function () {\n    console.log(\"Model files downloaded!\");\n    console.log(\"Initializing tts ......\");\n    try {\n      tts = createOfflineTts(self.Module);\n      self.postMessage({\n        type: \"sherpa-onnx-tts-ready\",\n        modelType: getDefaultOfflineTtsModelType(),\n        numSpeakers: tts.numSpeakers,\n      });\n    } catch (e) {\n      self.postMessage({\n        type: \"error\",\n        message: \"TTS Initialization failed: \" + e.message,\n      });\n    }\n  },\n};\nimportScripts(\"sherpa-onnx-wasm-main-tts.js\");\nimportScripts(\"sherpa-onnx-tts.js\");\n\nfunction getErrorMessage(err) {\n  if (err instanceof Error) {\n    if (err.stack) {\n      return `${err.message}\\n${err.stack}`;\n    }\n    return err.message;\n  }\n\n  return `${err}`;\n}\n\nself.onmessage = async (e) => {\n  const { type, text, sid, speed, genConfig } = e.data;\n  if (type === \"generate\") {\n    if (!tts) {\n      return;\n    }\n    try {\n      const audio = tts.generate({\n        text: text,\n        sid: sid || 0,\n        speed: speed || 1.0,\n      });\n      const samples = audio.samples;\n      const sampleRate = tts.sampleRate;\n      self.postMessage(\n        {\n          type: \"sherpa-onnx-tts-result\",\n          samples: samples,\n          sampleRate: sampleRate,\n        },\n        [samples.buffer],\n      );\n    } catch (err) {\n      self.postMessage({\n        type: \"error\",\n        message: \"Generation failed: \" + getErrorMessage(err),\n      });\n    }\n  } else if (type === \"generateWithConfig\") 
{\n    if (!tts) {\n      return;\n    }\n    try {\n      const config = Object.assign({}, genConfig || {});\n      config.callback = (samples, n, progress) => {\n        self.postMessage({\n          type: \"sherpa-onnx-tts-generation-progress\",\n          progress: progress,\n        });\n        return 1;\n      };\n\n      const audio = tts.generateWithConfig(text, config);\n      const samples = audio.samples;\n      const sampleRate = audio.sampleRate;\n      self.postMessage(\n          {\n            type: \"sherpa-onnx-tts-result\",\n            samples: samples,\n            sampleRate: sampleRate,\n          },\n          [samples.buffer],\n      );\n    } catch (err) {\n      self.postMessage({\n        type: \"error\",\n        message: \"Generation failed: \" + getErrorMessage(err),\n      });\n    }\n  }\n};\n"
  },
  {
    "path": "wasm/tts/sherpa-onnx-wasm-main-tts.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-tts.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxOfflineTtsVitsModelConfig) == 8 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsMatchaModelConfig) == 8 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsKokoroModelConfig) == 8 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsKittenModelConfig) == 5 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsZipvoiceModelConfig) == 10 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsPocketModelConfig) == 8 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsSupertonicModelConfig) == 7 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxOfflineTtsModelConfig) ==\n                  sizeof(SherpaOnnxOfflineTtsVitsModelConfig) +\n                      sizeof(SherpaOnnxOfflineTtsMatchaModelConfig) +\n                      sizeof(SherpaOnnxOfflineTtsKokoroModelConfig) + 3 * 4 +\n                      sizeof(SherpaOnnxOfflineTtsKittenModelConfig) +\n                      sizeof(SherpaOnnxOfflineTtsZipvoiceModelConfig) +\n                      sizeof(SherpaOnnxOfflineTtsPocketModelConfig) +\n                      sizeof(SherpaOnnxOfflineTtsSupertonicModelConfig),\n              \"\");\n\nstatic_assert(sizeof(SherpaOnnxOfflineTtsConfig) ==\n                  sizeof(SherpaOnnxOfflineTtsModelConfig) + 4 * 4,\n              \"\");\n\nstatic_assert(sizeof(SherpaOnnxGenerationConfig) == 9 * 4, \"\");\n\nvoid MyPrint(SherpaOnnxOfflineTtsConfig *tts_config) {\n  auto tts_model_config = &tts_config->model;\n  auto vits_model_config = &tts_model_config->vits;\n  auto matcha_model_config = &tts_model_config->matcha;\n  auto kokoro = &tts_model_config->kokoro;\n  auto kitten = &tts_model_config->kitten;\n  auto 
zipvoice = &tts_model_config->zipvoice;\n  auto pocket = &tts_model_config->pocket;\n  fprintf(stdout, \"----------vits model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", vits_model_config->model);\n  fprintf(stdout, \"lexicon: %s\\n\", vits_model_config->lexicon);\n  fprintf(stdout, \"tokens: %s\\n\", vits_model_config->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", vits_model_config->data_dir);\n  fprintf(stdout, \"noise scale: %.3f\\n\", vits_model_config->noise_scale);\n  fprintf(stdout, \"noise scale w: %.3f\\n\", vits_model_config->noise_scale_w);\n  fprintf(stdout, \"length scale: %.3f\\n\", vits_model_config->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", vits_model_config->dict_dir);\n\n  fprintf(stdout, \"----------matcha model config----------\\n\");\n  fprintf(stdout, \"acoustic_model: %s\\n\", matcha_model_config->acoustic_model);\n  fprintf(stdout, \"vocoder: %s\\n\", matcha_model_config->vocoder);\n  fprintf(stdout, \"lexicon: %s\\n\", matcha_model_config->lexicon);\n  fprintf(stdout, \"tokens: %s\\n\", matcha_model_config->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", matcha_model_config->data_dir);\n  fprintf(stdout, \"noise scale: %.3f\\n\", matcha_model_config->noise_scale);\n  fprintf(stdout, \"length scale: %.3f\\n\", matcha_model_config->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", matcha_model_config->dict_dir);\n\n  fprintf(stdout, \"----------kokoro model config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", kokoro->model);\n  fprintf(stdout, \"voices: %s\\n\", kokoro->voices);\n  fprintf(stdout, \"tokens: %s\\n\", kokoro->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", kokoro->data_dir);\n  fprintf(stdout, \"length scale: %.3f\\n\", kokoro->length_scale);\n  fprintf(stdout, \"dict_dir: %s\\n\", kokoro->dict_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", kokoro->lexicon);\n  fprintf(stdout, \"lang: %s\\n\", kokoro->lang);\n\n  fprintf(stdout, \"----------kitten model config----------\\n\");\n  
fprintf(stdout, \"model: %s\\n\", kitten->model);\n  fprintf(stdout, \"voices: %s\\n\", kitten->voices);\n  fprintf(stdout, \"tokens: %s\\n\", kitten->tokens);\n  fprintf(stdout, \"data_dir: %s\\n\", kitten->data_dir);\n  fprintf(stdout, \"length scale: %.3f\\n\", kitten->length_scale);\n\n  fprintf(stdout, \"----------zipvoice model config----------\\n\");\n  fprintf(stdout, \"tokens: %s\\n\", zipvoice->tokens);\n  fprintf(stdout, \"encoder: %s\\n\", zipvoice->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", zipvoice->decoder);\n  fprintf(stdout, \"vocoder: %s\\n\", zipvoice->vocoder);\n  fprintf(stdout, \"data_dir: %s\\n\", zipvoice->data_dir);\n  fprintf(stdout, \"lexicon: %s\\n\", zipvoice->lexicon);\n  fprintf(stdout, \"feat scale: %.3f\\n\", zipvoice->feat_scale);\n  fprintf(stdout, \"t_shift: %.3f\\n\", zipvoice->t_shift);\n  fprintf(stdout, \"target_rms: %.3f\\n\", zipvoice->target_rms);\n  fprintf(stdout, \"guidance_scale: %.3f\\n\", zipvoice->guidance_scale);\n\n  fprintf(stdout, \"----------pocketTTS model config----------\\n\");\n  fprintf(stdout, \"lm_flow: %s\\n\", pocket->lm_flow);\n  fprintf(stdout, \"lm_main: %s\\n\", pocket->lm_main);\n  fprintf(stdout, \"encoder: %s\\n\", pocket->encoder);\n  fprintf(stdout, \"decoder: %s\\n\", pocket->decoder);\n  fprintf(stdout, \"text_conditioner: %s\\n\", pocket->text_conditioner);\n  fprintf(stdout, \"vocab_json: %s\\n\", pocket->vocab_json);\n  fprintf(stdout, \"token_scores_json: %s\\n\", pocket->token_scores_json);\n  fprintf(stdout, \"voice_embedding_cache_capacity: %d\\n\",\n          pocket->voice_embedding_cache_capacity);\n\n  auto supertonic = &tts_model_config->supertonic;\n  fprintf(stdout, \"----------supertonic model config----------\\n\");\n  fprintf(stdout, \"duration_predictor: %s\\n\", supertonic->duration_predictor);\n  fprintf(stdout, \"text_encoder: %s\\n\", supertonic->text_encoder);\n  fprintf(stdout, \"vector_estimator: %s\\n\", supertonic->vector_estimator);\n  fprintf(stdout, 
\"vocoder: %s\\n\", supertonic->vocoder);\n  fprintf(stdout, \"tts_json: %s\\n\", supertonic->tts_json);\n  fprintf(stdout, \"unicode_indexer: %s\\n\", supertonic->unicode_indexer);\n  fprintf(stdout, \"voice_style: %s\\n\", supertonic->voice_style);\n\n  fprintf(stdout, \"----------tts model config----------\\n\");\n  fprintf(stdout, \"num threads: %d\\n\", tts_model_config->num_threads);\n  fprintf(stdout, \"debug: %d\\n\", tts_model_config->debug);\n  fprintf(stdout, \"provider: %s\\n\", tts_model_config->provider);\n\n  fprintf(stdout, \"----------tts config----------\\n\");\n  fprintf(stdout, \"rule_fsts: %s\\n\", tts_config->rule_fsts);\n  fprintf(stdout, \"rule_fars: %s\\n\", tts_config->rule_fars);\n  fprintf(stdout, \"max num sentences: %d\\n\", tts_config->max_num_sentences);\n  fprintf(stdout, \"silence scale: %.3f\\n\", tts_config->silence_scale);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/vad/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-vad.sh to build for wasm VAD\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/silero_vad.onnx\" AND NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/ten-vad.onnx\" )\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  MyPrint\n  # VAD\n  SherpaOnnxCreateCircularBuffer\n  SherpaOnnxDestroyCircularBuffer\n  SherpaOnnxCircularBufferPush\n  SherpaOnnxCircularBufferGet\n  SherpaOnnxCircularBufferFree\n  SherpaOnnxCircularBufferPop\n  SherpaOnnxCircularBufferSize\n  SherpaOnnxCircularBufferHead\n  SherpaOnnxCircularBufferReset\n  SherpaOnnxCreateVoiceActivityDetector\n  SherpaOnnxDestroyVoiceActivityDetector\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform\n  SherpaOnnxVoiceActivityDetectorEmpty\n  SherpaOnnxVoiceActivityDetectorDetected\n  SherpaOnnxVoiceActivityDetectorPop\n  SherpaOnnxVoiceActivityDetectorClear\n  SherpaOnnxVoiceActivityDetectorFront\n  SherpaOnnxDestroySpeechSegment\n  SherpaOnnxVoiceActivityDetectorReset\n  SherpaOnnxVoiceActivityDetectorFlush\n  #\n  SherpaOnnxFileExists\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=64MB -s ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. 
\")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-vad sherpa-onnx-wasm-main-vad.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-vad sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-vad DESTINATION bin/wasm/vad)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad>/sherpa-onnx-wasm-main-vad.js\"\n    \"index.html\"\n    \"sherpa-onnx-vad.js\"\n    \"app-vad.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad>/sherpa-onnx-wasm-main-vad.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad>/sherpa-onnx-wasm-main-vad.data\"\n  DESTINATION\n    bin/wasm/vad\n)\n"
  },
  {
    "path": "wasm/vad/app-vad.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nconst startBtn = document.getElementById('startBtn');\nconst stopBtn = document.getElementById('stopBtn');\nconst clearBtn = document.getElementById('clearBtn');\nconst soundClips = document.getElementById('sound-clips');\n\nlet textArea = document.getElementById('results');\n\nlet lastResult = '';\nlet resultList = [];\n\nclearBtn.onclick = function() {\n  resultList = [];\n  textArea.value = getDisplayResult();\n  textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n};\n\nfunction getDisplayResult() {\n  let i = 0;\n  let ans = '';\n  for (let s in resultList) {\n    if (resultList[s] == '') {\n      continue;\n    }\n\n    if (resultList[s] == 'Speech detected') {\n      ans += '' + i + ': ' + resultList[s];\n      i += 1;\n    } else {\n      ans += ', ' + resultList[s] + '\\n';\n    }\n  }\n\n  if (lastResult.length > 0) {\n    ans += '' + i + ': ' + lastResult + '\\n';\n  }\n  return ans;\n}\n\n\nModule = {};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.locateFile = function(path, scriptDirectory = '') {\n  console.log(`path: ${path}, scriptDirectory: ${scriptDirectory}`);\n  return scriptDirectory + path;\n};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.setStatus = function(status) {\n  console.log(`status ${status}`);\n  const statusElement = document.getElementById('status');\n  if (status == 'Running...') {\n    status = 'Model downloaded. Initializing vad...'\n  }\n\n  const downloadMatch = status.match(/Downloading data... \\((\\d+)\\/(\\d+)\\)/);\n  if (downloadMatch) {\n    const downloaded = BigInt(downloadMatch[1]);\n    const total = BigInt(downloadMatch[2]);\n    const percent =\n        total === 0 ? 
0.00 : Number((downloaded * 10000n) / total) / 100;\n    const downloadedMB = Number(downloaded) / (1024 * 1024);\n    const totalMB = Number(total) / (1024 * 1024);\n    status = `Downloading data... ${percent.toFixed(2)}% (${downloadedMB.toFixed(2)} MB/${\n        totalMB.toFixed(2)} MB)`;\n    console.log(`here ${status}`)\n  }\n\n  statusElement.textContent = status;\n  if (status === '') {\n    statusElement.style.display = 'none';\n    // statusElement.parentNode.removeChild(statusElement);\n\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.remove('loading');\n    });\n  } else {\n    statusElement.style.display = 'block';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.add('loading');\n    });\n  }\n};\n\nModule.onRuntimeInitialized = function() {\n  console.log('inited!');\n\n  startBtn.disabled = false;\n\n  initVad();\n  console.log('vad is created!', vad);\n\n  buffer = new CircularBuffer(30 * 16000, Module);\n  console.log('CircularBuffer is created!', buffer);\n};\n\nfunction fileExists(filename) {\n  const filenameLen = Module.lengthBytesUTF8(filename) + 1;\n  const buffer = Module._malloc(filenameLen);\n  Module.stringToUTF8(filename, buffer, filenameLen);\n\n  let exists = Module._SherpaOnnxFileExists(buffer);\n\n  Module._free(buffer);\n\n  return exists;\n}\n\nfunction initVad() {\n  const sileroVad = {\n    model: '',\n    threshold: 0.50,\n    minSilenceDuration: 0.50,\n    minSpeechDuration: 0.25,\n    maxSpeechDuration: 20,\n    windowSize: 512,\n  };\n\n  const tenVad = {\n    model: '',\n    threshold: 0.50,\n    minSilenceDuration: 0.50,\n    minSpeechDuration: 0.25,\n    maxSpeechDuration: 20,\n    windowSize: 256,\n  };\n\n  let config = {\n    sileroVad: sileroVad,\n    tenVad: tenVad,\n    sampleRate: 16000,\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n    bufferSizeInSeconds: 30,\n  };\n\n  
if (fileExists('silero_vad.onnx') == 1) {\n    config.sileroVad.model = 'silero_vad.onnx'\n  } else if (fileExists('ten-vad.onnx') == 1) {\n    config.tenVad.model = 'ten-vad.onnx'\n  }\n\n  vad = createVad(Module, config);\n}\n\nlet audioCtx;\nlet mediaStream;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nlet vad = null;\nlet buffer = null;\nlet printed = false;\n\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext({sampleRate: expectedSampleRate});\n    }\n    console.log(audioCtx);\n    recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n    console.log('media stream', mediaStream);\n\n    // https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 4096;\n    var numberOfInputChannels = 1;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log('recorder', recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0))\n      samples = 
downsampleBuffer(samples, expectedSampleRate);\n      buffer.push(samples);\n      while (buffer.size() > vad.config.sileroVad.windowSize) {\n        const s = buffer.get(buffer.head(), vad.config.sileroVad.windowSize);\n        vad.acceptWaveform(s);\n        buffer.pop(vad.config.sileroVad.windowSize);\n\n        if (vad.isDetected() && !printed) {\n          printed = true;\n          lastResult = 'Speech detected';\n        }\n\n        if (!vad.isDetected()) {\n          printed = false;\n          if (lastResult != '') {\n            resultList.push(lastResult);\n          }\n          lastResult = '';\n        }\n\n        while (!vad.isEmpty()) {\n          const segment = vad.front();\n          const duration = segment.samples.length / expectedSampleRate;\n          const durationStr = `Duration: ${duration.toFixed(3)} seconds`;\n          resultList.push(durationStr);\n          vad.pop();\n\n          // now save the segment to a wav file\n          let buf = new Int16Array(segment.samples.length);\n          for (var i = 0; i < segment.samples.length; ++i) {\n            let s = segment.samples[i];\n            if (s >= 1)\n              s = 1;\n            else if (s <= -1)\n              s = -1;\n\n            buf[i] = s * 32767;\n          }\n\n          let clipName = new Date().toISOString() + '--' + durationStr;\n\n          const clipContainer = document.createElement('article');\n          const clipLabel = document.createElement('p');\n          const audio = document.createElement('audio');\n          const deleteButton = document.createElement('button');\n\n          clipContainer.classList.add('clip');\n          audio.setAttribute('controls', '');\n          deleteButton.textContent = 'Delete';\n          deleteButton.className = 'delete';\n\n          clipLabel.textContent = clipName;\n\n          clipContainer.appendChild(audio);\n\n          clipContainer.appendChild(clipLabel);\n          clipContainer.appendChild(deleteButton);\n      
    soundClips.appendChild(clipContainer);\n\n          audio.controls = true;\n          const blob = toWav(buf);\n\n          leftchannel = [];\n          const audioURL = window.URL.createObjectURL(blob);\n          audio.src = audioURL;\n\n          deleteButton.onclick = function(e) {\n            let evtTgt = e.target;\n            evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n          };\n\n          clipLabel.onclick = function() {\n            const existingName = clipLabel.textContent;\n            const newClipName = prompt('Enter a new name for your sound clip?');\n            if (newClipName === null) {\n              clipLabel.textContent = existingName;\n            } else {\n              clipLabel.textContent = newClipName;\n            }\n          };\n        }\n      }\n\n      textArea.value = getDisplayResult();\n      textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n    };\n\n    startBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n\n      stopBtn.disabled = false;\n      startBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      vad.reset();\n      buffer.reset();\n      console.log('recorder stopped');\n\n      // stopBtn recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n\n      startBtn.style.background = '';\n      startBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      startBtn.disabled = false;\n    };\n  };\n\n  let onError = function(err) {\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\n\n// this function is copied/modified from\n// 
https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //                   E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat (16-bit), 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// 
https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = Math.round(buffer.length / sampleRateRatio);\n  var result = new Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "wasm/vad/assets/README.md",
    "content": "# Introduction\n\n## Use silero-vad\n\nPlease download\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nand put `silero_vad.onnx` into the current directory, i.e., `wasm/vad/assets`.\n\nYou can find example build script at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-silero-vad.yaml\n\n```\ncd /path/to/sherpa-onnx/wasm/vad/assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\n```\n\n## Use ten-vad\n\nPlease download\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\nand put `ten-vad.onnx` into the current directory, i.e., `wasm/vad/assets`.\n\nYou can find example build script at\nhttps://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-ten-vad.yaml\n\n```\ncd /path/to/sherpa-onnx/wasm/vad/assets\nwget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/ten-vad.onnx\ncd ..\nsed -i.bak \"s|.*(with <a .*|    (with <a href=\"https://github.com/TEN-framework/ten-vad\">ten-vad</a>)|\" ./index.html\n\n```\n"
  },
  {
    "path": "wasm/vad/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for VAD</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n    .loading {\n      display: none !important;\n    }\n  </style>\n</head>\n\n<body style=\"font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;\">\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    VAD Demo using <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a><br/>\n    (with <a href=\"https://github.com/snakers4/silero-vad\">silero-vad</a>)\n  </h1>\n\n  <div style=\"width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;\">\n    <div id=\"status\">Loading...</div>\n\n    <div id=\"singleAudioContent\" class=\"tab-content loading\">\n      <div style=\"display: flex; gap: 1.5rem;\">\n        <div style=\"flex: 1; display: flex; flex-direction: row; align-items: center; gap: 1rem;\">\n          <button id=\"startBtn\" disabled>Start</button>\n          <button id=\"stopBtn\" disabled>Stop</button>\n          <button id=\"clearBtn\">Clear</button>\n        </div>\n      </div>\n\n      <div style=\"flex: 1; display: flex; flex-direction: column; gap: 1rem;\">\n          <textarea id=\"results\" rows=\"10\" placeholder=\"Please click start and speak. 
Output will appear here...\" readonly style=\"flex: 1; padding: 0.75rem; font-size: 1rem; border: 1px solid #ced4da; border-radius: 8px; resize: none; background-color: #f8f9fa;\"></textarea>\n      </div>\n\n      <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n      </section>\n  </div>\n\n  <!-- Footer Section -->\n  <div style=\"width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;\">\n    <h3>Description</h3>\n    <ul>\n      <li>Everything is <strong>open-sourced.</strong> <a href=\"https://github.com/k2-fsa/sherpa-onnx\">code</a></li>\n      <li>If you have any issues, please either <a href=\"https://github.com/k2-fsa/sherpa-onnx/issues\">file a ticket</a> or contact us via</li>\n        <ul>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#wechat\">WeChat group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#qq\">QQ group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b\">Bilibili</a></li>\n        </ul>\n    </ul>\n    <h3>About This Demo</h3>\n    <ul>\n      <li><strong>Private and Secure:</strong> All processing is done locally on your device (CPU) within your browser with a single thread. No server is involved, ensuring privacy and security. 
You can disconnect from the Internet once this page is loaded.</li>\n      <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving system resources available for webLLM analysis.</li>\n    </ul>\n    <h3>Latest Update</h3>\n    <ul>\n      <li>Update UI.</li>\n      <li>First working version.</li>\n    </ul>\n\n    <h3>Acknowledgement</h3>\n    <ul>\n      <li>We refer to <a href=\"https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a> for the UI part.</li>\n    </ul>\n  </div>\n\n  <script src=\"sherpa-onnx-vad.js\"></script>\n  <script src=\"app-vad.js\"></script>\n  <script src=\"sherpa-onnx-wasm-main-vad.js\"></script>\n</body>\n"
  },
  {
    "path": "wasm/vad/sherpa-onnx-vad.js",
    "content": "function freeConfig(config, Module) {\n  if ('buffer' in config) {\n    Module._free(config.buffer);\n  }\n\n  if ('sileroVad' in config) {\n    freeConfig(config.sileroVad, Module)\n  }\n\n  if ('tenVad' in config) {\n    freeConfig(config.tenVad, Module)\n  }\n\n\n  Module._free(config.ptr);\n}\n\n// The user should free the returned pointers\nfunction initSherpaOnnxSileroVadModelConfig(config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const n = modelLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 6 * 4;\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, modelLen);\n\n  let offset = 0;\n  Module.setValue(ptr, buffer, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.threshold || 0.5, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.minSilenceDuration || 0.5, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.minSpeechDuration || 0.25, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.windowSize || 512, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.maxSpeechDuration || 20, 'float');\n  offset += 4;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxTenVadModelConfig(config, Module) {\n  const modelLen = Module.lengthBytesUTF8(config.model || '') + 1;\n\n  const n = modelLen;\n\n  const buffer = Module._malloc(n);\n\n  const len = 6 * 4;\n  const ptr = Module._malloc(len);\n\n  Module.stringToUTF8(config.model || '', buffer, modelLen);\n\n  let offset = 0;\n  Module.setValue(ptr, buffer, 'i8*');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.threshold || 0.5, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.minSilenceDuration || 0.5, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.minSpeechDuration || 0.25, 'float');\n  offset += 4;\n\n  Module.setValue(ptr + offset, 
config.windowSize || 256, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.maxSpeechDuration || 20, 'float');\n  offset += 4;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n  };\n}\n\nfunction initSherpaOnnxVadModelConfig(config, Module) {\n  if (!('sileroVad' in config)) {\n    config.sileroVad = {\n      model: '',\n      threshold: 0.50,\n      minSilenceDuration: 0.50,\n      minSpeechDuration: 0.25,\n      windowSize: 512,\n      maxSpeechDuration: 20,\n    };\n  }\n\n  if (!('tenVad' in config)) {\n    config.tenVad = {\n      model: '',\n      threshold: 0.50,\n      minSilenceDuration: 0.50,\n      minSpeechDuration: 0.25,\n      windowSize: 256,\n      maxSpeechDuration: 20,\n    };\n  }\n\n  const sileroVad =\n      initSherpaOnnxSileroVadModelConfig(config.sileroVad, Module);\n\n  const tenVad = initSherpaOnnxTenVadModelConfig(config.tenVad, Module);\n\n  const len = sileroVad.len + 4 * 4 + tenVad.len;\n  const ptr = Module._malloc(len);\n\n  const providerLen = Module.lengthBytesUTF8(config.provider || 'cpu') + 1;\n  const buffer = Module._malloc(providerLen);\n  Module.stringToUTF8(config.provider || 'cpu', buffer, providerLen);\n\n  let offset = 0;\n  Module._CopyHeap(sileroVad.ptr, sileroVad.len, ptr + offset);\n  offset += sileroVad.len;\n\n  Module.setValue(ptr + offset, config.sampleRate || 16000, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.numThreads || 1, 'i32');\n  offset += 4;\n\n  Module.setValue(ptr + offset, buffer, 'i8*');  // provider\n  offset += 4;\n\n  Module.setValue(ptr + offset, config.debug || 0, 'i32');\n  offset += 4;\n\n  Module._CopyHeap(tenVad.ptr, tenVad.len, ptr + offset);\n  offset += tenVad.len;\n\n  return {\n    buffer: buffer,\n    ptr: ptr,\n    len: len,\n    sileroVad: sileroVad,\n    tenVad: tenVad\n  };\n}\n\nfunction createVad(Module, myConfig) {\n  const sileroVad = {\n    model: './silero_vad.onnx',\n    threshold: 0.50,\n    minSilenceDuration: 0.50,\n   
 minSpeechDuration: 0.25,\n    maxSpeechDuration: 20,\n    windowSize: 512,\n  };\n\n  const tenVad = {\n    model: '',\n    threshold: 0.50,\n    minSilenceDuration: 0.50,\n    minSpeechDuration: 0.25,\n    maxSpeechDuration: 20,\n    windowSize: 256,\n  };\n\n  let config = {\n    sileroVad: sileroVad,\n    tenVad: tenVad,\n    sampleRate: 16000,\n    numThreads: 1,\n    provider: 'cpu',\n    debug: 1,\n    bufferSizeInSeconds: 30,\n  };\n\n  if (myConfig) {\n    config = myConfig;\n  }\n\n  return new Vad(config, Module);\n}\n\n\nclass CircularBuffer {\n  constructor(capacity, Module) {\n    this.handle = Module._SherpaOnnxCreateCircularBuffer(capacity);\n    this.Module = Module;\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyCircularBuffer(this.handle);\n    this.handle = 0\n  }\n\n  /**\n   * @param samples {Float32Array}\n   */\n  push(samples) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n    this.Module._SherpaOnnxCircularBufferPush(\n        this.handle, pointer, samples.length);\n    this.Module._free(pointer);\n  }\n\n  get(startIndex, n) {\n    const p =\n        this.Module._SherpaOnnxCircularBufferGet(this.handle, startIndex, n);\n\n    const samplesPtr = p / 4;\n    const samples = new Float32Array(n);\n    for (let i = 0; i < n; i++) {\n      samples[i] = this.Module.HEAPF32[samplesPtr + i];\n    }\n\n    this.Module._SherpaOnnxCircularBufferFree(p);\n\n    return samples;\n  }\n\n  pop(n) {\n    this.Module._SherpaOnnxCircularBufferPop(this.handle, n);\n  }\n\n  size() {\n    return this.Module._SherpaOnnxCircularBufferSize(this.handle);\n  }\n\n  head() {\n    return this.Module._SherpaOnnxCircularBufferHead(this.handle);\n  }\n\n  reset() {\n    this.Module._SherpaOnnxCircularBufferReset(this.handle);\n  }\n}\n\nclass Vad {\n  constructor(configObj, Module) {\n    this.config = configObj;\n    const config = 
initSherpaOnnxVadModelConfig(configObj, Module);\n    const handle = Module._SherpaOnnxCreateVoiceActivityDetector(\n        config.ptr, configObj.bufferSizeInSeconds || 30);\n    freeConfig(config, Module);\n\n    this.handle = handle;\n    this.Module = Module;\n  }\n\n  free() {\n    this.Module._SherpaOnnxDestroyVoiceActivityDetector(this.handle);\n    this.handle = 0\n  }\n\n  // samples is a float32 array\n  acceptWaveform(samples) {\n    const pointer =\n        this.Module._malloc(samples.length * samples.BYTES_PER_ELEMENT);\n    this.Module.HEAPF32.set(samples, pointer / samples.BYTES_PER_ELEMENT);\n    this.Module._SherpaOnnxVoiceActivityDetectorAcceptWaveform(\n        this.handle, pointer, samples.length);\n    this.Module._free(pointer);\n  }\n\n  isEmpty() {\n    return this.Module._SherpaOnnxVoiceActivityDetectorEmpty(this.handle) == 1;\n  }\n\n  isDetected() {\n    return this.Module._SherpaOnnxVoiceActivityDetectorDetected(this.handle) ==\n        1;\n  }\n\n  pop() {\n    this.Module._SherpaOnnxVoiceActivityDetectorPop(this.handle);\n  }\n\n  clear() {\n    this.Module._SherpaOnnxVoiceActivityDetectorClear(this.handle);\n  }\n\n  /*\n{\n  samples: a 1-d float32 array,\n  start: an int32\n}\n   */\n  front() {\n    const h = this.Module._SherpaOnnxVoiceActivityDetectorFront(this.handle);\n\n    const start = this.Module.HEAP32[h / 4];\n    const samplesPtr = this.Module.HEAP32[h / 4 + 1] / 4;\n    const numSamples = this.Module.HEAP32[h / 4 + 2];\n\n    const samples = new Float32Array(numSamples);\n    for (let i = 0; i < numSamples; i++) {\n      samples[i] = this.Module.HEAPF32[samplesPtr + i];\n    }\n\n    this.Module._SherpaOnnxDestroySpeechSegment(h);\n    return {samples: samples, start: start};\n  }\n\n  reset() {\n    this.Module._SherpaOnnxVoiceActivityDetectorReset(this.handle);\n  }\n\n  flush() {\n    this.Module._SherpaOnnxVoiceActivityDetectorFlush(this.handle);\n  }\n};\n\nif (typeof process == 'object' && typeof process.versions 
== 'object' &&\n    typeof process.versions.node == 'string') {\n  module.exports = {\n    createVad,\n    CircularBuffer,\n  };\n}\n"
  },
  {
    "path": "wasm/vad/sherpa-onnx-wasm-main-vad.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-vad.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nstatic_assert(sizeof(SherpaOnnxSileroVadModelConfig) == 6 * 4, \"\");\nstatic_assert(sizeof(SherpaOnnxTenVadModelConfig) == 6 * 4, \"\");\n\nstatic_assert(sizeof(SherpaOnnxVadModelConfig) ==\n                  sizeof(SherpaOnnxSileroVadModelConfig) + 4 * 4 +\n                      sizeof(SherpaOnnxTenVadModelConfig),\n              \"\");\nvoid MyPrint(SherpaOnnxVadModelConfig *config) {\n  auto silero_vad = &config->silero_vad;\n  auto ten_vad = &config->ten_vad;\n\n  fprintf(stdout, \"----------silero_vad config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", silero_vad->model);\n  fprintf(stdout, \"threshold: %.3f\\n\", silero_vad->threshold);\n  fprintf(stdout, \"min_silence_duration: %.3f\\n\",\n          silero_vad->min_silence_duration);\n  fprintf(stdout, \"min_speech_duration: %.3f\\n\",\n          silero_vad->min_speech_duration);\n  fprintf(stdout, \"window_size: %d\\n\", silero_vad->window_size);\n  fprintf(stdout, \"max_speech_duration: %.3f\\n\",\n          silero_vad->max_speech_duration);\n\n  fprintf(stdout, \"----------ten_vad config----------\\n\");\n  fprintf(stdout, \"model: %s\\n\", ten_vad->model);\n  fprintf(stdout, \"threshold: %.3f\\n\", ten_vad->threshold);\n  fprintf(stdout, \"min_silence_duration: %.3f\\n\",\n          ten_vad->min_silence_duration);\n  fprintf(stdout, \"min_speech_duration: %.3f\\n\", ten_vad->min_speech_duration);\n  fprintf(stdout, \"window_size: %d\\n\", ten_vad->window_size);\n  fprintf(stdout, \"max_speech_duration: %.3f\\n\", ten_vad->max_speech_duration);\n\n  fprintf(stdout, \"----------config----------\\n\");\n\n  fprintf(stdout, \"sample_rate: %d\\n\", 
config->sample_rate);\n  fprintf(stdout, \"num_threads: %d\\n\", config->num_threads);\n\n  fprintf(stdout, \"provider: %s\\n\", config->provider);\n  fprintf(stdout, \"debug: %d\\n\", config->debug);\n}\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  },
  {
    "path": "wasm/vad-asr/CMakeLists.txt",
    "content": "if(NOT $ENV{SHERPA_ONNX_IS_USING_BUILD_WASM_SH})\n  message(FATAL_ERROR \"Please use ./build-wasm-simd-vad.sh to build for wasm VAD\")\nendif()\n\nif(NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/silero_vad.onnx\" OR NOT EXISTS \"${CMAKE_CURRENT_SOURCE_DIR}/assets/tokens.txt\")\n  message(FATAL_ERROR \"Please read ${CMAKE_CURRENT_SOURCE_DIR}/assets/README.md before you continue\")\nendif()\n\nset(exported_functions\n  # VAD\n  SherpaOnnxCreateCircularBuffer\n  SherpaOnnxDestroyCircularBuffer\n  SherpaOnnxCircularBufferPush\n  SherpaOnnxCircularBufferGet\n  SherpaOnnxCircularBufferFree\n  SherpaOnnxCircularBufferPop\n  SherpaOnnxCircularBufferSize\n  SherpaOnnxCircularBufferHead\n  SherpaOnnxCircularBufferReset\n  SherpaOnnxCreateVoiceActivityDetector\n  SherpaOnnxDestroyVoiceActivityDetector\n  SherpaOnnxVoiceActivityDetectorAcceptWaveform\n  SherpaOnnxVoiceActivityDetectorEmpty\n  SherpaOnnxVoiceActivityDetectorDetected\n  SherpaOnnxVoiceActivityDetectorPop\n  SherpaOnnxVoiceActivityDetectorClear\n  SherpaOnnxVoiceActivityDetectorFront\n  SherpaOnnxDestroySpeechSegment\n  SherpaOnnxVoiceActivityDetectorReset\n  SherpaOnnxVoiceActivityDetectorFlush\n  # non-streaming ASR\n  SherpaOnnxAcceptWaveformOffline\n  SherpaOnnxCreateOfflineRecognizer\n  SherpaOnnxCreateOfflineStream\n  SherpaOnnxDecodeMultipleOfflineStreams\n  SherpaOnnxDecodeOfflineStream\n  SherpaOnnxDestroyOfflineRecognizer\n  SherpaOnnxDestroyOfflineRecognizerResult\n  SherpaOnnxDestroyOfflineStream\n  SherpaOnnxDestroyOfflineStreamResultJson\n  SherpaOnnxGetOfflineStreamResult\n  SherpaOnnxGetOfflineStreamResultAsJson\n  #\n  SherpaOnnxFileExists\n)\nset(mangled_exported_functions)\nforeach(x IN LISTS exported_functions)\n  list(APPEND mangled_exported_functions \"_${x}\")\nendforeach()\nlist(JOIN mangled_exported_functions \",\" all_exported_functions)\n\ninclude_directories(${CMAKE_SOURCE_DIR})\nset(MY_FLAGS \" -s FORCE_FILESYSTEM=1 -s INITIAL_MEMORY=512MB -s 
ALLOW_MEMORY_GROWTH=1\")\nstring(APPEND MY_FLAGS \" -sSTACK_SIZE=10485760 \") # 10MB\nstring(APPEND MY_FLAGS \" -sEXPORTED_FUNCTIONS=[_CopyHeap,_malloc,_free,${all_exported_functions}] \")\nstring(APPEND MY_FLAGS \"--preload-file ${CMAKE_CURRENT_SOURCE_DIR}/assets@. \")\nstring(APPEND MY_FLAGS \" -sEXPORTED_RUNTIME_METHODS=['ccall','stringToUTF8','setValue','getValue','lengthBytesUTF8','UTF8ToString','HEAPU8','HEAP16','HEAP32','HEAPU32','HEAPF32','HEAPF64'] \")\n\nmessage(STATUS \"MY_FLAGS: ${MY_FLAGS}\")\n\nset(CMAKE_C_FLAGS \"${CMAKE_C_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_CXX_FLAGS \"${CMAKE_CXX_FLAGS} ${MY_FLAGS}\")\nset(CMAKE_EXECUTABLE_LINKER_FLAGS \"${CMAKE_EXECUTABLE_LINKER_FLAGS} ${MY_FLAGS}\")\n\nif (NOT CMAKE_EXECUTABLE_SUFFIX STREQUAL \".js\")\n  message(FATAL_ERROR \"The default suffix for building executables should be .js!\")\nendif()\n# set(CMAKE_EXECUTABLE_SUFFIX \".html\")\n\nadd_executable(sherpa-onnx-wasm-main-vad-asr sherpa-onnx-wasm-main-vad-asr.cc)\ntarget_link_libraries(sherpa-onnx-wasm-main-vad-asr sherpa-onnx-c-api)\ninstall(TARGETS sherpa-onnx-wasm-main-vad-asr DESTINATION bin/wasm/vad-asr)\n\ninstall(\n  FILES\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad-asr>/sherpa-onnx-wasm-main-vad-asr.js\"\n    \"index.html\"\n    \"app-vad-asr.js\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad-asr>/sherpa-onnx-wasm-main-vad-asr.wasm\"\n    \"$<TARGET_FILE_DIR:sherpa-onnx-wasm-main-vad-asr>/sherpa-onnx-wasm-main-vad-asr.data\"\n  DESTINATION\n    bin/wasm/vad-asr\n)\n"
  },
  {
    "path": "wasm/vad-asr/app-vad-asr.js",
    "content": "// This file copies and modifies code\n// from https://mdn.github.io/web-dictaphone/scripts/app.js\n// and https://gist.github.com/meziantou/edb7217fddfbb70e899e\n\nconst startBtn = document.getElementById('startBtn');\nconst stopBtn = document.getElementById('stopBtn');\nconst clearBtn = document.getElementById('clearBtn');\nconst soundClips = document.getElementById('sound-clips');\n\nlet textArea = document.getElementById('results');\n\nlet lastResult = '';\nlet resultList = [];\n\nclearBtn.onclick = function() {\n  resultList = [];\n  textArea.value = getDisplayResult();\n  textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n};\n\nfunction getDisplayResult() {\n  let i = 0;\n  let ans = '';\n  for (let s in resultList) {\n    if (resultList[s] == '') {\n      continue;\n    }\n\n    if (resultList[s] == 'Speech detected') {\n      ans += '' + i + ': ' + resultList[s];\n      i += 1;\n    } else {\n      ans += ', ' + resultList[s] + '\\n';\n    }\n  }\n\n  if (lastResult.length > 0) {\n    ans += '' + i + ': ' + lastResult + '\\n';\n  }\n  return ans;\n}\n\nModule = {};\n\nlet audioCtx;\nlet mediaStream;\n\nlet expectedSampleRate = 16000;\nlet recordSampleRate;  // the sampleRate of the microphone\nlet recorder = null;   // the microphone\nlet leftchannel = [];  // TODO: Use a single channel\n\nlet recordingLength = 0;  // number of samples so far\n\nlet vad = null;\nlet buffer = null;\nlet recognizer = null;\nlet printed = false;\n\nfunction fileExists(filename) {\n  const filenameLen = Module.lengthBytesUTF8(filename) + 1;\n  const buffer = Module._malloc(filenameLen);\n  Module.stringToUTF8(filename, buffer, filenameLen);\n\n  let exists = Module._SherpaOnnxFileExists(buffer);\n\n  Module._free(buffer);\n\n  return exists;\n}\n\nfunction initOfflineRecognizer() {\n  let config = {\n    modelConfig: {\n      debug: 1,\n      tokens: './tokens.txt',\n    },\n  };\n  if (fileExists('sense-voice.onnx') == 1) {\n    
config.modelConfig.senseVoice = {\n      model: './sense-voice.onnx',\n      useInverseTextNormalization: 1,\n    };\n  } else if (fileExists('whisper-encoder.onnx')) {\n    config.modelConfig.whisper = {\n      encoder: './whisper-encoder.onnx',\n      decoder: './whisper-decoder.onnx',\n    };\n  } else if (fileExists('transducer-encoder.onnx')) {\n    config.modelConfig.transducer = {\n      encoder: './transducer-encoder.onnx',\n      decoder: './transducer-decoder.onnx',\n      joiner: './transducer-joiner.onnx',\n    };\n    config.modelConfig.modelType = 'transducer';\n  } else if (fileExists('nemo-transducer-encoder.onnx')) {\n    config.modelConfig.transducer = {\n      encoder: './nemo-transducer-encoder.onnx',\n      decoder: './nemo-transducer-decoder.onnx',\n      joiner: './nemo-transducer-joiner.onnx',\n    };\n    config.modelConfig.modelType = 'nemo_transducer';\n  } else if (fileExists('paraformer.onnx')) {\n    config.modelConfig.paraformer = {\n      model: './paraformer.onnx',\n    };\n  } else if (fileExists('telespeech.onnx')) {\n    config.modelConfig.telespeechCtc = './telespeech.onnx';\n  } else if (fileExists('moonshine-preprocessor.onnx')) {\n    // moonshine v1\n    config.modelConfig.moonshine = {\n      preprocessor: './moonshine-preprocessor.onnx',\n      encoder: './moonshine-encoder.onnx',\n      uncachedDecoder: './moonshine-uncached-decoder.onnx',\n      cachedDecoder: './moonshine-cached-decoder.onnx'\n    };\n  } else if (fileExists('moonshine-merged-decoder.ort')) {\n    // moonshine v2\n    config.modelConfig.moonshine = {\n      encoder: './moonshine-encoder.ort',\n      mergedDecoder: './moonshine-merged-decoder.ort'\n    };\n  } else if (fileExists('dolphin.onnx')) {\n    config.modelConfig.dolphin = {model: './dolphin.onnx'};\n  } else if (fileExists('zipformer-ctc.onnx')) {\n    // you need to rename model.int8.onnx from zipformer CTC to\n    // zipformer-ctc.onnx\n    config.modelConfig.zipformerCtc = {model: 
'./zipformer-ctc.onnx'};\n  } else {\n    console.log('Please specify a model.');\n    alert('Please specify a model.');\n  }\n\n  recognizer = new OfflineRecognizer(config, Module);\n}\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.locateFile = function(path, scriptDirectory = '') {\n  console.log(`path: ${path}, scriptDirectory: ${scriptDirectory}`);\n  return scriptDirectory + path;\n};\n\n// https://emscripten.org/docs/api_reference/module.html#Module.locateFile\nModule.setStatus = function(status) {\n  console.log(`status ${status}`);\n  const statusElement = document.getElementById('status');\n  if (status == 'Running...') {\n    status = 'Model downloaded. Initializing recognizer...'\n  }\n\n  const downloadMatch = status.match(/Downloading data... \\((\\d+)\\/(\\d+)\\)/);\n  if (downloadMatch) {\n    const downloaded = BigInt(downloadMatch[1]);\n    const total = BigInt(downloadMatch[2]);\n    const percent =\n        total === 0 ? 0.00 : Number((downloaded * 10000n) / total) / 100;\n    const downloadedMB = Number(downloaded) / (1024 * 1024);\n    const totalMB = Number(total) / (1024 * 1024);\n    status = `Downloading data... 
${percent.toFixed(2)}% (${downloadedMB.toFixed(2)} MB/${\n        totalMB.toFixed(2)} MB)`;\n    console.log(`here ${status}`)\n  }\n\n  statusElement.textContent = status;\n  if (status === '') {\n    statusElement.style.display = 'none';\n    // statusElement.parentNode.removeChild(statusElement);\n\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.remove('loading');\n    });\n  } else {\n    statusElement.style.display = 'block';\n    document.querySelectorAll('.tab-content').forEach((tabContentElement) => {\n      tabContentElement.classList.add('loading');\n    });\n  }\n};\n\nModule.onRuntimeInitialized = function() {\n  console.log('inited!');\n\n  startBtn.disabled = false;\n\n  vad = createVad(Module);\n  console.log('vad is created!', vad);\n\n  buffer = new CircularBuffer(30 * 16000, Module);\n  console.log('CircularBuffer is created!', buffer);\n\n  initOfflineRecognizer();\n};\n\nif (navigator.mediaDevices.getUserMedia) {\n  console.log('getUserMedia supported.');\n\n  // see https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia\n  const constraints = {audio: true};\n\n  let onSuccess = function(stream) {\n    if (!audioCtx) {\n      audioCtx = new AudioContext({sampleRate: expectedSampleRate});\n    }\n    console.log(audioCtx);\n    recordSampleRate = audioCtx.sampleRate;\n    console.log('sample rate ' + recordSampleRate);\n\n    // creates an audio node from the microphone incoming stream\n    mediaStream = audioCtx.createMediaStreamSource(stream);\n    console.log('media stream', mediaStream);\n\n    // https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createScriptProcessor\n    // bufferSize: the onaudioprocess event is called when the buffer is full\n    var bufferSize = 4096;\n    var numberOfInputChannels = 1;\n    var numberOfOutputChannels = 2;\n    if (audioCtx.createScriptProcessor) {\n      recorder = audioCtx.createScriptProcessor(\n          
bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    } else {\n      recorder = audioCtx.createJavaScriptNode(\n          bufferSize, numberOfInputChannels, numberOfOutputChannels);\n    }\n    console.log('recorder', recorder);\n\n    recorder.onaudioprocess = function(e) {\n      let samples = new Float32Array(e.inputBuffer.getChannelData(0))\n      samples = downsampleBuffer(samples, expectedSampleRate);\n      buffer.push(samples);\n      while (buffer.size() > vad.config.sileroVad.windowSize) {\n        const s = buffer.get(buffer.head(), vad.config.sileroVad.windowSize);\n        vad.acceptWaveform(s);\n        buffer.pop(vad.config.sileroVad.windowSize);\n\n        if (vad.isDetected() && !printed) {\n          printed = true;\n          lastResult = 'Speech detected';\n        }\n\n        if (!vad.isDetected()) {\n          printed = false;\n          if (lastResult != '') {\n            resultList.push(lastResult);\n          }\n          lastResult = '';\n        }\n\n        while (!vad.isEmpty()) {\n          const segment = vad.front();\n          const duration = segment.samples.length / expectedSampleRate;\n          let durationStr = `Duration: ${duration.toFixed(3)} seconds`;\n          vad.pop();\n\n          // non-streaming asr\n          const stream = recognizer.createStream();\n          stream.acceptWaveform(expectedSampleRate, segment.samples);\n          recognizer.decode(stream);\n          let recognitionResult = recognizer.getResult(stream);\n          console.log(recognitionResult);\n          let text = recognitionResult.text;\n          stream.free();\n          console.log(text);\n\n          if (text != '') {\n            durationStr += `. 
Result: ${text}`;\n          }\n\n          resultList.push(durationStr);\n\n          // now save the segment to a wav file\n          let buf = new Int16Array(segment.samples.length);\n          for (var i = 0; i < segment.samples.length; ++i) {\n            let s = segment.samples[i];\n            if (s >= 1)\n              s = 1;\n            else if (s <= -1)\n              s = -1;\n\n            buf[i] = s * 32767;\n          }\n\n          let clipName = new Date().toISOString() + '--' + durationStr;\n\n          const clipContainer = document.createElement('article');\n          const clipLabel = document.createElement('p');\n          const audio = document.createElement('audio');\n          const deleteButton = document.createElement('button');\n\n          clipContainer.classList.add('clip');\n          audio.setAttribute('controls', '');\n          deleteButton.textContent = 'Delete';\n          deleteButton.className = 'delete';\n\n          clipLabel.textContent = clipName;\n\n          clipContainer.appendChild(audio);\n\n          clipContainer.appendChild(clipLabel);\n          clipContainer.appendChild(deleteButton);\n          soundClips.appendChild(clipContainer);\n\n          audio.controls = true;\n          const blob = toWav(buf);\n\n          leftchannel = [];\n          const audioURL = window.URL.createObjectURL(blob);\n          audio.src = audioURL;\n\n          deleteButton.onclick = function(e) {\n            let evtTgt = e.target;\n            evtTgt.parentNode.parentNode.removeChild(evtTgt.parentNode);\n          };\n\n          clipLabel.onclick = function() {\n            const existingName = clipLabel.textContent;\n            const newClipName = prompt('Enter a new name for your sound clip?');\n            if (newClipName === null) {\n              clipLabel.textContent = existingName;\n            } else {\n              clipLabel.textContent = newClipName;\n            }\n          };\n        }\n      }\n\n      
textArea.value = getDisplayResult();\n      textArea.scrollTop = textArea.scrollHeight;  // auto scroll\n    };\n\n    startBtn.onclick = function() {\n      mediaStream.connect(recorder);\n      recorder.connect(audioCtx.destination);\n\n      console.log('recorder started');\n\n      stopBtn.disabled = false;\n      startBtn.disabled = true;\n    };\n\n    stopBtn.onclick = function() {\n      vad.reset();\n      buffer.reset();\n      console.log('recorder stopped');\n\n      // stopBtn recording\n      recorder.disconnect(audioCtx.destination);\n      mediaStream.disconnect(recorder);\n\n      startBtn.style.background = '';\n      startBtn.style.color = '';\n      // mediaRecorder.requestData();\n\n      stopBtn.disabled = true;\n      startBtn.disabled = false;\n    };\n  };\n\n  let onError = function(err) {\n    console.log('The following error occurred: ' + err);\n  };\n\n  navigator.mediaDevices.getUserMedia(constraints).then(onSuccess, onError);\n} else {\n  console.log('getUserMedia not supported on your browser!');\n  alert('getUserMedia not supported on your browser!');\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction flatten(listOfSamples) {\n  let n = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    n += listOfSamples[i].length;\n  }\n  let ans = new Int16Array(n);\n\n  let offset = 0;\n  for (let i = 0; i < listOfSamples.length; ++i) {\n    ans.set(listOfSamples[i], offset);\n    offset += listOfSamples[i].length;\n  }\n  return ans;\n}\n\n// this function is copied/modified from\n// https://gist.github.com/meziantou/edb7217fddfbb70e899e\nfunction toWav(samples) {\n  let buf = new ArrayBuffer(44 + samples.length * 2);\n  var view = new DataView(buf);\n\n  // http://soundfile.sapp.org/doc/WaveFormat/\n  //                   F F I R\n  view.setUint32(0, 0x46464952, true);               // chunkID\n  view.setUint32(4, 36 + samples.length * 2, true);  // chunkSize\n  //       
            E V A W\n  view.setUint32(8, 0x45564157, true);  // format\n                                        //\n  //                      t m f\n  view.setUint32(12, 0x20746d66, true);          // subchunk1ID\n  view.setUint32(16, 16, true);                  // subchunk1Size, 16 for PCM\n  view.setUint16(20, 1, true);                   // audioFormat, 1 for PCM\n  view.setUint16(22, 1, true);                   // numChannels: 1 channel\n  view.setUint32(24, expectedSampleRate, true);  // sampleRate\n  view.setUint32(28, expectedSampleRate * 2, true);  // byteRate\n  view.setUint16(32, 2, true);                       // blockAlign\n  view.setUint16(34, 16, true);                      // bitsPerSample\n  view.setUint32(36, 0x61746164, true);              // Subchunk2ID\n  view.setUint32(40, samples.length * 2, true);      // subchunk2Size\n\n  let offset = 44;\n  for (let i = 0; i < samples.length; ++i) {\n    view.setInt16(offset, samples[i], true);\n    offset += 2;\n  }\n\n  return new Blob([view], {type: 'audio/wav'});\n}\n\n// this function is copied from\n// https://github.com/awslabs/aws-lex-browser-audio-capture/blob/master/lib/worker.js#L46\nfunction downsampleBuffer(buffer, exportSampleRate) {\n  if (exportSampleRate === recordSampleRate) {\n    return buffer;\n  }\n  var sampleRateRatio = recordSampleRate / exportSampleRate;\n  var newLength = Math.round(buffer.length / sampleRateRatio);\n  var result = new Float32Array(newLength);\n  var offsetResult = 0;\n  var offsetBuffer = 0;\n  while (offsetResult < result.length) {\n    var nextOffsetBuffer = Math.round((offsetResult + 1) * sampleRateRatio);\n    var accum = 0, count = 0;\n    for (var i = offsetBuffer; i < nextOffsetBuffer && i < buffer.length; i++) {\n      accum += buffer[i];\n      count++;\n    }\n    result[offsetResult] = accum / count;\n    offsetResult++;\n    offsetBuffer = nextOffsetBuffer;\n  }\n  return result;\n};\n"
  },
  {
    "path": "wasm/vad-asr/assets/README.md",
    "content": "# Introduction\n\n## Download VAD models\n\nPlease download\nhttps://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx\nand put `silero_vad.onnx` into the current directory, i.e., `wasm/vad/assets`.\n\n## Download non-streaming ASR models\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html\nto download a non-streaming ASR model, i.e., an offline ASR model.\n\nAfter downloading, you should rename the model files.\n\nPlease refer to\nhttps://k2-fsa.github.io/sherpa/onnx/lazarus/generate-subtitles.html#download-a-speech-recognition-model\nfor how to rename.\n\nThe renamed file shoud put in current folder( ```/<repo>/wasm/vad-asr/assets```)\nExample after download sense-voice model\n```\ntree ~/work/github/sherpa-onnx/wasm/vad-asr/assets\n/home/gxy/work/github/sherpa-onnx/wasm/vad-asr/assets\n├── README.md\n├── sense-voice.onnx\n├── silero_vad.onnx\n└── tokens.txt\n\n1 directory, 4 files\n```\n\n\nYou can find example build scripts at the following address:\n\n  https://github.com/k2-fsa/sherpa-onnx/blob/master/.github/workflows/wasm-simd-hf-space-vad-asr.yaml\n"
  },
  {
    "path": "wasm/vad-asr/index.html",
    "content": "<html lang=\"en\">\n\n<head>\n  <meta charset=\"utf-8\">\n  <meta name=\"viewport\" content=\"width=device-width\" />\n  <title>Next-gen Kaldi WebAssembly with sherpa-onnx for VAD + ASR</title>\n  <style>\n    h1,div {\n      text-align: center;\n    }\n    textarea {\n      width:100%;\n    }\n    .loading {\n      display: none !important;\n    }\n  </style>\n</head>\n\n<body style=\"font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;\">\n  <h1>\n    Next-gen Kaldi + WebAssembly<br/>\n    VAD+ASR Demo with <a href=\"https://github.com/k2-fsa/sherpa-onnx\">sherpa-onnx</a><br/>\n    (with Zipformer)\n  </h1>\n\n  <div style=\"width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;\">\n    <div id=\"status\">Loading...</div>\n\n    <div id=\"singleAudioContent\" class=\"tab-content loading\">\n      <div style=\"display: flex; gap: 1.5rem;\">\n        <div style=\"flex: 1; display: flex; flex-direction: row; align-items: center; gap: 1rem;\">\n          <button id=\"startBtn\" disabled>Start</button>\n          <button id=\"stopBtn\" disabled>Stop</button>\n          <button id=\"clearBtn\">Clear</button>\n        </div>\n      </div>\n\n      <div style=\"flex: 1; display: flex; flex-direction: column; gap: 1rem;\">\n          <div style=\"font-size: 1rem; font-weight: bold; padding: 0.5rem 1rem; background-color: #f8f9fa; border-radius: 8px; color: #6c757d;\">Transcript</div>\n          <textarea id=\"results\" rows=\"10\" placeholder=\"Output will appear here...\" readonly style=\"flex: 1; padding: 0.75rem; font-size: 1rem; border: 1px solid #ced4da; border-radius: 8px; resize: none; background-color: #f8f9fa;\"></textarea>\n      </div>\n\n      <section flex=\"1\" overflow=\"auto\" id=\"sound-clips\">\n      </section>\n  </div>\n\n  <!-- Footer 
Section -->\n  <div style=\"width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;\">\n    <h3>Description</h3>\n    <ul>\n      <li>Everything is <strong>open-sourced.</strong> <a href=\"https://github.com/k2-fsa/sherpa-onnx\">code</a></li>\n      <li>If you have any issues, please either <a href=\"https://github.com/k2-fsa/sherpa-onnx/issues\">file a ticket</a> or contact us via</li>\n        <ul>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#wechat\">WeChat group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#qq\">QQ group</a></li>\n          <li><a href=\"https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b\">Bilibili</a></li>\n        </ul>\n    </ul>\n    <h3>About This Demo</h3>\n    <ul>\n      <li><strong>Private and Secure:</strong> All processing is done locally on your device (CPU) within your browser with a single thread. No server is involved, ensuring privacy and security. You can disconnect from the Internet once this page is loaded.</li>\n      <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving system resources available for webLLM analysis.</li>\n    </ul>\n    <h3>Latest Update</h3>\n    <ul>\n      <li>Update UI.</li>\n      <li>First working version.</li>\n    </ul>\n\n    <h3>Acknowledgement</h3>\n    <ul>\n      <li>We refer to <a href=\"https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm\">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a> for the UI part.</li>\n    </ul>\n  </div>\n\n  <script src=\"sherpa-onnx-asr.js\"></script>\n  <script src=\"sherpa-onnx-vad.js\"></script>\n  <script src=\"app-vad-asr.js\"></script>\n  <script src=\"sherpa-onnx-wasm-main-vad-asr.js\"></script>\n</body>\n"
  },
  {
    "path": "wasm/vad-asr/sherpa-onnx-wasm-main-vad-asr.cc",
    "content": "// wasm/sherpa-onnx-wasm-main-vad-asr.cc\n//\n// Copyright (c)  2024  Xiaomi Corporation\n#include <stdio.h>\n\n#include <algorithm>\n#include <memory>\n\n#include \"sherpa-onnx/c-api/c-api.h\"\n\n// see also\n// https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html\n\nextern \"C\" {\n\nvoid CopyHeap(const char *src, int32_t num_bytes, char *dst) {\n  std::copy(src, src + num_bytes, dst);\n}\n}\n"
  }
]